Posts

Paper on Gray Failures

A great paper annotated by The Morning Paper, this time on the subject of gray failures. https://blog.acolyer.org/2017/06/15/gray-failure-the-achilles-heel-of-cloud-scale-systems/ There are a handful of interesting takeaways from this one: It is important that the monitoring on a system aligns with its clients' definition of failure. The cycle of failure is inevitable unless proper root causes are identified. Some personal observations: A potential observability gap is the difference between proximate and root cause. For example, a service may fail because a proxy server returns an error. The error may be due to a request to an upstream server timing out. The cause of that could be increased latency on the upstream server. The root cause of that could be that the working set of the server no longer fully fits in memory, so the server is swapping, which adds latency. The server itself is not failing any requests, but the proxy is throwing the results away be…
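To make the proxy example concrete, here is a minimal Python sketch of the two viewpoints. It assumes the proxy discards any response that takes longer than its timeout; the `Request` type, the latencies, and the one-second timeout are all made up for illustration and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Request:
    upstream_latency_s: float  # how long the upstream server took to answer
    upstream_ok: bool          # whether the upstream server itself reported success


def server_view_failure_rate(requests):
    """Failure rate as the upstream server sees it: only its own errors count."""
    failed = sum(1 for r in requests if not r.upstream_ok)
    return failed / len(requests)


def client_view_failure_rate(requests, proxy_timeout_s):
    """Failure rate as the client sees it: upstream errors plus proxy timeouts."""
    failed = sum(
        1
        for r in requests
        if not r.upstream_ok or r.upstream_latency_s > proxy_timeout_s
    )
    return failed / len(requests)


# Hypothetical traffic: every request succeeds upstream, but two are slow
# (for example, because the server is swapping).
PROXY_TIMEOUT_S = 1.0
requests = [
    Request(0.05, True),
    Request(0.08, True),
    Request(1.5, True),
    Request(2.0, True),
]

server_rate = server_view_failure_rate(requests)
client_rate = client_view_failure_rate(requests, PROXY_TIMEOUT_S)
print(f"server sees {server_rate:.0%} failures, client sees {client_rate:.0%}")
```

Monitoring only the server-side number would report a healthy system here, while half of the client requests fail. Alerting on the client-side number is one way to align monitoring with the clients' definition of failure.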
Recent posts

Operational Metrics and Alerts for Distributed Software Systems

This post will be about operational metrics and alerts for distributed software systems. What do I mean by that? I mean the metrics and alerts that allow operations personnel to detect the failure of a distributed software system and help them quickly diagnose what is wrong.

Metrics

The metrics are measurements of characteristics of the system collected at regular(ish) intervals and stored somewhere for processing: rendering into graphs, triggering alert notifications, etc. Metrics can be divided into three categories: input metrics, output metrics, and process metrics. Input metrics are measures of the inputs to the system, for example, the number of user requests and counts of particular characteristics of the requests: where they are from, how large the request data is, and which features appear in the request (for example, which resources/items/products are being asked for). Output metrics are measures of the output of the system. Examples of these would include orders successfully…
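As a rough illustration of the three categories, here is a small Python sketch of a registry that tags each counter with its category. The class names and metric names are hypothetical, and since the excerpt above does not define process metrics, the `queue_depth` example is only my assumption of what one might look like.

```python
from collections import Counter
from enum import Enum


class MetricCategory(Enum):
    INPUT = "input"
    OUTPUT = "output"
    PROCESS = "process"


class MetricRegistry:
    """Keeps simple counters, each tagged with one of the three categories."""

    def __init__(self):
        self._categories = {}
        self._counts = Counter()

    def register(self, name, category):
        self._categories[name] = category

    def increment(self, name, amount=1):
        if name not in self._categories:
            raise KeyError(f"unregistered metric: {name}")
        self._counts[name] += amount

    def snapshot(self, category):
        """Return the current value of every metric in the given category."""
        return {
            name: self._counts[name]
            for name, cat in self._categories.items()
            if cat is category
        }


registry = MetricRegistry()
registry.register("user_requests", MetricCategory.INPUT)
registry.register("orders_completed", MetricCategory.OUTPUT)
registry.register("queue_depth", MetricCategory.PROCESS)  # assumed example

registry.increment("user_requests", 3)
registry.increment("orders_completed", 2)

print(registry.snapshot(MetricCategory.INPUT))
print(registry.snapshot(MetricCategory.OUTPUT))
```

Tagging metrics this way makes it easy to put inputs and outputs side by side on a dashboard, so a drop in outputs without a matching drop in inputs stands out.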

With Great Power Comes Great Responsibility

Forth is used as a bootloader for SPARC-based machines. One feature of the SPARC-based machines made by Sun Microsystems was the ability to drop back to the bootloader's Forth interpreter by pressing the Stop-A key combination at the console. This suspended the operating system and gave the user an ok prompt to work at. Typically this was used to kick off a kernel debugger or to kick errant SCSI hardware back into line. In effect the OpenBoot PROM (OBP), as the Forth-based bootloader was branded, was a very lightweight hypervisor. A consequence of this was that while working at the ok prompt, the user wasn't subject to the privilege system of Solaris. People at the console could use this to gain root privileges. The method worked as follows: Find the address in memory of the proc structure of a shell that the user has open, i.e., where the shell's process resides in memory. Press Stop-A to drop to OBP. Write 0 to the cr_uid field of the process's cred structure. The location…

Learning Forth

One of my side projects for this year is to learn the programming language Forth. Some people might consider this an odd language to learn. It is not a popular language. There are no hot startups using it (that I know of). It doesn't even show up in the top 100 languages in the TIOBE Index. However, I am convinced learning it is worthwhile. Some of my reasons for this are: Forth is probably the most successful and widely deployed language that nobody has heard of. It is the language used to develop OpenFirmware. This boot loader is installed on the laptops of the One Laptop Per Child Project, on PowerPC-based Apple Mac computers, and on SPARC-based computers from Sun Microsystems. It has also been used to develop control software for the National Radio Astronomy Observatory, which is where it was developed. While not as widely used as C/C++, Forth is used a lot in embedded applications and has been ported to most microcontrollers. For example, the Forth, Inc. website h…

TIL: ARM Has Java Bytecode Execution in Hardware

I recently purchased a Raspberry Pi. While poking around in /proc I discovered that java is one of the features of the ARM processor in the Pi. It turns out that some ARM models have Java bytecode instructions implemented in hardware.
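If you want to check this programmatically, here is a minimal Python sketch. It assumes the feature list in question is the `Features` line of `/proc/cpuinfo`; the function name and the sample text are illustrative, not output captured from my Pi.

```python
def cpu_features(cpuinfo_text):
    """Return the set of flags on the first 'Features' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


# Illustrative sample in the style of an ARM /proc/cpuinfo (not real output).
SAMPLE_CPUINFO = """\
Processor   : ARMv6-compatible processor rev 7 (v6l)
Features    : swp half thumb fastmult vfp edsp java tls
"""

features = cpu_features(SAMPLE_CPUINFO)
print("java" in features)
```

On a Linux machine you could pass in the real file contents instead, for example `cpu_features(open("/proc/cpuinfo").read())`.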

On Writing Well

I like doing things that have a body of theory behind them. For example, as a martial art, I prefer taijiquan to kickboxing because it has deeper theory. So when I started blogging, I went looking for its theoretical foundations. I found them in the principles of good nonfiction writing. I bought two books: The Elements of Style, by William Strunk Jr. and E. B. White, and On Writing Well, by William Zinsser. On Writing Well is unusual for a writing guide. Firstly, it is a good read. It is actually hard to put down. Advice on writing is given clearly and simply. That is part of it. However, Mr. Zinsser illustrates his points with personal anecdotes, and this is what makes the book so interesting. In the chapter entitled A Writer's Decisions, for example, he uses an account of a trip he took to Timbuktu. He walks us through the article he wrote, paragraph by paragraph, explaining what he wrote and what he was thinking at the time. Between the travel piece and its explanation, you get an idea o…

First Post

Hello and welcome to the inane ramblings of an Irish software developer. The title of the blog comes from Lewis Carroll's Through the Looking-Glass. In the book, Alice goes running with the Red Queen, but they don't seem to make any progress. Alice remarks on this, saying, "Well, in our country, you'd generally get to somewhere else - if you ran very fast for a long time, as we've been doing." The Red Queen replies, "A slow sort of country! Now, here, you see, it takes all the running you can do, to keep in the same place." The Red Queen Effect is quite applicable to the software industry, and as I probably will be talking quite a bit about the software industry, I thought it would be a good name for a blog. I have a few objectives for my new blog. By writing here, I hope to learn how to write well. That is, I hope to learn how to write clearly and concisely, and be interesting at the same time. I also hope that this blog will become a good professiona…