Why do companies, institutions, or even whole societies not do what is best for themselves? Why is it so hard to change?
I recently came across an interesting piece by software engineer and blogger Dan Luu, in which he discusses the “normalization of deviance.” The phrase itself seems like a contradiction in terms: if something has been normalized, it’s no longer deviant, is it? What’s really being described is deviation from the optimum–the best or most efficient course of action. Luu illustrates the concept through a series of personal anecdotes, and I’ll offer a few of my own.
I once worked for a company that was started right around the year I was born. What began as a handful of people in a garage eventually grew to almost 500 people in three states. By that measure, the company was a success. It had found a market, it had grown, it had made profits. But it was starting to run into problems. The software on which they had built their little empire was growing long in the tooth. Newer, sexier technologies were overtaking it in the marketplace. The initial plan was simple enough: take all the central features of the existing product and build a new product with more current tools that implements those same features. It could then be put to market as a shiny new package to compete with everyone else’s offerings.
By the time I came along, the product had been in development for several years and had just reached beta. A number of customers–some of them quite large–had been running it for a few years, but it was known to have a lot of issues. Part of the problem was that the design was conceived mostly by developers rather than analysts familiar with the industry the software was meant to serve. This meant that the software was replete with features that other developers found elegant and powerful, but it was an absolute nightmare for users. The graphical interface was intended to be cutting edge–customers could use the included tools to build their own user interfaces and workflows. They could even use a custom scripting language to define their own program behavior. This was an extremely powerful enterprise application. The problem was that it was so complex it took a long time to learn–and this was in a high-turnover industry.
The problems didn’t end there. Upgrades, for instance, were a nightmare. As the application’s database grew, upgrades took longer and longer as each upgrade would have one or more conversion programs which updated the internal data structures, preparing them for whatever new features were included in the latest version. By the time the developers realized the potential impact of this problem–with upgrades taking up to several hours to complete–everyone was at a loss for how to address it. Eventually, it reached a crisis point as one customer had a database so large the upgrade couldn’t finish within an 8-hour window. To have it take longer would mean cutting into the business hours of the customer–in other words, the customer would be losing money. This was unacceptable. The solution? One heroic developer spent weeks carving out bits and pieces of the upgrade process so that it could be run in phases, performing the database updates a little bit at a time. This allowed the entire upgrade to be phased over a period of days. The problem was, it was only for this customer and for this specific upgrade. It wouldn’t work for anyone else or even for a future version–it was custom all the way.
At this point, you might be wondering if someone eventually rewrote the entire upgrade system, perhaps to let the application handle the upgrades intelligently–maybe to even perform the database conversions while the overall application is running, so as not to impact users. This did not happen. Instead, the project continued to spin out of control and bleed money. After years of refusing to face reality, management decided that having an infinitely customizable product, while technically impressive, made for an application that was too hard to use and thus too hard to sell. They finally sent down the mandate that a set of best practices should be installed by default, with the customization options tucked away so as not to intimidate users. This is probably what should have been done all along.
But it was not enough to save the overall product, which had been in development for over a decade, and was now using outdated technology. It remained unstable, upgrades were still a roll of the dice (they crashed frequently), and it was still too hard to use. This product was so bad it almost destroyed the company–indeed, it led directly to a large reduction-in-force that landed me in the unemployment line.
The thing is, it was obvious to me as a new employee what an unwieldy mess this product was. Early on, I was shown a flowchart of all its modules and how it functioned. It was a tangled web of little boxes with arcane names that stretched across a wall. If anyone told me they understood all of it, I would have assumed they were lying. When I saw the product in action, I was shocked by the complexity of even simple tasks. I was accustomed to web forms which might have a handful of fields. Most screens in this application had a few dozen, so many that they were crammed together in a visually unappealing manner. The development tools were mostly custom, and weren’t terrible, but definitely lagged behind the rest of the industry in terms of best practices. By the time I arrived, they were taking baby steps toward real source control, and by the time I left it was an integral piece of the development infrastructure. That’s progress.
But that progress just came too slowly, no matter the obvious merits of change. I have written before about institutional inertia and organizational dysfunction. Even when it seems everybody knows that processes are flawed, nobody works to change them. I remember that coworkers often apologized or made excuses for how complex and difficult everything was. They knew they had problems, but seemed unable to solve them.
What ultimately worked, albeit not quickly enough to save the company from crisis, was management buy-in and employee accountability. What kills me is that they had processes. They were very formal, very well-defined, just outdated, unresponsive, and weird. Instead of refining those processes to better align with industry standards, they undertook desperate experiments that threw out everything they knew and tried to replace it with something else, be it Extreme Programming (XP), Scrum, Agile, Six Sigma, or the Rational Unified Process. All of these initiatives failed because they demanded that experienced developers essentially start from scratch with whole new ways of thinking. People simply don’t work that way.
Intuitively, we understand this outside the world of software, too. Imagine a stretch of road where the speed limit is 40 miles an hour. Now imagine it is suddenly reduced to 30. First, most drivers were probably speeding through at 45 or higher to begin with. Second, most aren’t going to notice immediately that it was reduced–they will probably keep speeding at 45 or more. We’re creatures of habit.
But what if police put up an LED sign that clearly announces the new speed limit right at the beginning of the 30 MPH section? This gets the attention of drivers because it’s new and unusual–you never saw that sign before. A lot more drivers will comply, but probably not all.
Station a patrol car in that section, though, and watch compliance shoot way up. Now that there are consequences to ignoring the law, people will behave.
As Luu and the analysis he got his inspiration from indicate, telling people to do things differently doesn’t work if they can simply ignore it. Individuals are more concerned with their own needs and what’s most convenient for them than what’s best for the whole, and by and large we will prefer to do what we’ve always done even if it sucks, because it’s easier and more comfortable than changing. That said, most people also want to be good, so while being told once to change your ways may not have an impact, seeing a relevant reminder every time you have a choice between the bad old behavior and the good new behavior actually does help. But what helps more is enforcement: someone looking over your shoulder to make sure you’re doing it the right way. This doesn’t have to be a person, either. It can be a technical process which doesn’t let you proceed unless you’ve done everything you’re supposed to, the way you’re supposed to.
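To make the idea of a technical gate concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the step names, the `deploy` function, and the `ChecklistIncomplete` exception are my own illustrations, not anything from Luu's piece or from the company I described. The point is only that the process refuses to continue until every required step has been recorded as done.

```python
class ChecklistIncomplete(Exception):
    """Raised when a required step has not been recorded as done."""


# Hypothetical step names, chosen purely for illustration.
REQUIRED_STEPS = ["tests_passed", "code_reviewed", "upgrade_rehearsed"]


def require_steps(required, completed):
    """Refuse to continue if any required step is missing."""
    missing = [step for step in required if step not in completed]
    if missing:
        raise ChecklistIncomplete("cannot proceed; missing: " + ", ".join(missing))


def deploy(completed):
    """Stand-in for a release action that is gated on the checklist."""
    require_steps(REQUIRED_STEPS, completed)
    return "deployed"
```

Calling `deploy({"tests_passed"})` raises `ChecklistIncomplete` and names the skipped steps, so the shortcut is not silently available. Nobody has to remember to enforce anything.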
Where this kind of enforcement ends up being most important is in the trivial, tedious activities which don’t appear to produce any immediate value, but which introduce unnecessary risk. The PMC paper cites handwashing by healthcare workers as an obvious example. It’s an easy thing to do, but also an easy thing to avoid or forget–but not doing it also presents grievous potential risks to patients. If a bad practice has no ill effects 99 times out of 100, but has a catastrophic effect that hundredth time, we’re only motivated to change up until the bad memory fades, and then it’s right back to the bad practice. But if, with enforcement, we can ensure that the good practice is followed 99 out of 100 times, we’ve now reduced the risk tremendously: what was once a 1% risk is now a 0.01% risk. This is clearly a huge improvement, but unlikely to be noticeable to those who must use the good practice. Our memories contribute to this, too, because we don’t remember all the bad things that could have happened, only those that did happen, and a long run without a disaster tends to make us complacent. This is why vigilance must be eternal.
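The arithmetic behind that 1% to 0.01% drop can be spelled out in a few lines of Python. This is a sketch under two simplifying assumptions of mine: the good practice itself carries no risk, and the bad practice goes wrong 1 time in 100 regardless of who does it. The function name is made up for illustration.

```python
def residual_risk(p_catastrophe_given_bad, compliance):
    """Chance of a catastrophe on any one occasion.

    Assumes (for illustration) that only the bad practice is risky,
    so risk = P(bad practice is used) * P(catastrophe | bad practice).
    """
    return (1 - compliance) * p_catastrophe_given_bad


risk_before = residual_risk(0.01, compliance=0.0)   # bad practice every time
risk_after = residual_risk(0.01, compliance=0.99)   # good practice 99 times in 100

print(f"before: {risk_before:.2%}, after: {risk_after:.2%}")
```

The two 1-in-100 figures multiply, which is why the risk falls by a factor of a hundred and not merely by one percentage point.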
Getting people to comply with a new practice, though, no matter how good it is, remains a difficult problem. But the PMC paper offers what I think are some good guidelines and expectations in its conclusion, namely:
1. Pay attention to weak signals -- That is, notice when someone (or something) is making noise that something is amiss, even if it would be easy to ignore.
2. Resist the urge to be unreasonably optimistic -- When presented with choices that carry different risk levels, assume the worst, or at least account for it. Don't break the rules (or let others do so) just because you think the outcome will be OK.
3. Teach employees how to conduct emotionally uncomfortable conversations -- Cultures that are excessively mean _or_ polite lead to crucial conversations not happening, and problems festering. Communicating through conflict is a skill, and it can be taught.
4. System operators need to feel safe in speaking up -- Some companies fire people who rock the boat. Organizations need to have clear policies for reporting problems, and clear paths to remediating them.
5. Realize that oversight and monitoring for rule compliance are never-ending -- Like I said: eternal vigilance.
In the end, deviant practices persist because they are allowed to, because the consequences do not cause sufficient pain to the deviants. They can be changed, but only with buy-in and ultimately continued, consistent enforcement. A culture of risk mitigation, once established, can still wither as an organization churns or grows, so keeping good practices in place is a never-ending process.
(As an aside: this is my anniversary post! I’ve updated this blog every day for a full year now. Go me.)
Photo by La case photo de Got