Chaos Engineering for Humans
Today I am going to take a little detour from my usual topics of engineering and leadership to talk about you and me. Maybe it’s not too much of a detour, because ultimately leadership starts with us. Will we show up today and, more importantly, will we show up every day and do the hard work?
Join me as I meander through computer engineering, the endurance athlete mindset, and neuroscience and pull them into a challenge for the year.
Chaos Engineering
I remember the first time I heard about Netflix’s Chaos Monkey. Netflix streaming had launched fairly recently, and the Netflix engineering blog was putting out a lot of unique content at the time. I’m sure chaos engineering had been lurking around, but Netflix brought it into the mainstream. Here is how they described it in an early post:
“One of the first systems our engineers built in AWS is called the Chaos Monkey. The Chaos Monkey’s job is to randomly kill instances and services within our architecture. If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most — in the event of an unexpected outage.”
Chaos Monkey was just what it sounds like: a piece of software that would randomly turn off servers in production. Netflix later added sibling tools that injected other kinds of trouble, such as extra latency. This kind of chaos caused a lot of issues at first, but over time it allowed Netflix to build extreme resiliency into the system.
Rather than wait for disaster to strike they found a way to simulate disaster so that they could be confident that their software would work even if everything seemed to be going wrong. If an actual outage happened in the cloud it was just another regular day for Netflix.
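To make the idea concrete, here is a toy sketch in Python. It is not Netflix's code and it touches no real cloud API. The Instance class and the chaos_monkey, serve and run_experiment functions are all names I made up for illustration, and the "servers" are just objects in memory. The point is only to show the shape of the loop: break something at random, then check that requests still get served.

```python
import random


class Instance:
    """A toy stand-in for a server (hypothetical, not a real cloud API)."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


def chaos_monkey(instances, rng):
    """Randomly terminate one live instance, always leaving at least one up."""
    live = [inst for inst in instances if inst.alive]
    if len(live) <= 1:
        return None  # refuse to cause a total outage in this toy model
    victim = rng.choice(live)
    victim.alive = False
    return victim


def serve(instances, request):
    """Try each instance in turn, failing over past the dead ones."""
    for inst in instances:
        try:
            return inst.handle(request)
        except ConnectionError:
            continue
    raise RuntimeError("total outage: no instance could serve the request")


def run_experiment(n_instances=4, rounds=3, seed=0):
    """Kill one instance per round and confirm a request still succeeds."""
    rng = random.Random(seed)
    instances = [Instance(f"node-{i}") for i in range(n_instances)]
    results = []
    for r in range(rounds):
        chaos_monkey(instances, rng)
        results.append(serve(instances, f"request-{r}"))
    return instances, results


if __name__ == "__main__":
    instances, results = run_experiment()
    for line in results:
        print(line)
    print("still alive:", [inst.name for inst in instances if inst.alive])
```

If serve ever raised the total-outage error, that would be the experiment doing its job: it found a weakness on an ordinary day instead of during a real disaster.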
I have always been intrigued by chaos engineering. It was a level of crazy that had some logic to it. It was a place that few wanted to go but the ones that did considered it worth it. Nobody wants to have outages or failures but Netflix decided that the only way to avoid them is to have them every day.
I had only heard about chaos engineering in the context of software. But should it apply to just software?
Chaos for Humans
What about you? What about me? Are there other uses for the concepts of chaos?
I began to think about stress and how most people are one stressful event away from disaster. They have only a little margin for error in their lives and, as with software, every day has the potential to cause massive failure, be it physical, mental, emotional or spiritual.
Is there a way to do something about that? Is there a way to chaos engineer ourselves?
I started building a philosophy around this concept. I called it Forged Humans, and it eventually became the seed idea for Forged Managers. Forged Humans was about embracing stress to make you stronger. Human systems have an antifragile nature to them, meaning the more stress you put on them, the more resilient they get. I began to dive deep into stress and its effects across four domains: physical, mental, emotional and spiritual.
I can tell you for certain that the philosophy of embracing stress works. The problem is that the philosophy sucks. Literally. It is hard to always be pressure testing yourself. So I have fumbled my way through life, trying to embrace stress more but, in my weakness, retreating to comfort.
An Unusual Conversation
I hadn’t thought very deeply about the philosophy of stress recently, but I picked the topic back up after listening to a podcast conversation between Andrew Huberman and David Goggins. This conversation hit me right when my mind was ready for a new idea.
David Goggins has taken the concepts in Forged Humans further than anyone I have heard of. His philosophy is “embrace the suck” and “stay hard”. His story is rough and full of pain, but he has embraced it. Because of that he has done a lot of interesting things in his life, and he has a deep understanding of himself.
Huberman is a neuroscientist who, through his podcast, has laid out a lot of the same things I had been thinking about stress, including the benefits of stress that most people miss. He provided a lot of the scientific evidence that backed the realizations I had already been making.
Their conversation is a hard listen if you are not used to Goggins or strong language. However, if you are able to decipher it, Goggins lays out his simple yet extremely difficult philosophy.
“Do things you hate”
Not things you hate from a moral standpoint, but things that are good for you that you hate to do. Goggins is a renowned ultrarunner but, it turns out, he hates running. Every day he gets up and does it anyway. He chooses to do the thing he hates most, and he credits it for his strong mind and his ability to endure.
It turns out that there is scientific backing for his philosophy. A part of the brain called the anterior midcingulate cortex appears to be responsible for willpower, and when we do something we don’t want to do, it grows. The bigger it grows, the more willpower we have for other areas of life.
Huberman points out the following:
“When people do something they don't want to do, like add three hours of exercise per week or resist eating something while dieting, this brain area grows.”
Have you ever noticed that after working out it is easier to say no to the donut? I always attributed that to logic: I didn’t want to undo what I had just worked hard for. There is some logic there, but perhaps there are other, more deeply built-in systems at work in these decisions. Maybe my midcingulate cortex just grew a tiny bit.
The Curse of Abundance
Why put yourself through “the suck” and start to chaos engineer yourself? Do you really need to increase your willpower by growing your midcingulate cortex? Naval, another great thinker on these topics, introduced me to the curse of abundance. It goes like this:
"Most of modern life, all our diseases, are diseases of abundance, not diseases of scarcity… In an age of abundance, pursuing pleasure for its own sake creates addiction.
The modern struggle is really about individuals—disconnected from their tribe, religion and cultural networks—who are trying to stand up to all these addictions that have been weaponized: alcohol, drugs, pornography, processed foods, news media, Internet, social media and video games."
We live in unprecedented times. People are no longer constrained by access. Every addiction is available 24/7 for almost no cost. The job market is tough and our communities are fractured. To put it another way, you don’t have a choice. That is the paradox of life: hard times will find you. Do you want to be ready for them when they do?
A Call to Action
2024 has to be about action.
The world is full of self-help books and full of broken people. The idea is that reading the book will somehow help with your problems. But reading a book won’t tell you where you are going to break, and it won’t build your willpower.
In the same vein, chaos engineering isn’t about reading architecture or data flow diagrams to find issues. It assumes that you can’t learn the weaknesses of the system that way. You have to take action. You have to stress the system to see where it breaks.
We can’t improve through learning alone. We have to do the things that we don’t want to do to find the insights we need to improve.
If you liked this, consider subscribing to my Substack. I publish an article about software engineering and software engineering management every Tuesday. If you want even more, consider joining my paid content where I am writing a course in real time about making the jump from developer to manager.