Introduction
Software Architecture
What is it?
Definition
Ask 10 different developers what software architecture is and you'll get 12 different answers. It's one of those concepts we all seem to have an intuitive feel for, like recognizing a catchy tune or spotting a well-designed user interface. Yet when challenged to define it concretely, we often find ourselves stumbling over words or locked in heated debate.
Thought Exercise
Before reading on, ask yourself: what is software architecture? Don't worry, there are no wrong answers... just varying degrees of rightness.
Now, let's see how your answer stacks up against some common definitions:
- The highest level components and their interactions
- Shared understanding of the system
- Set of architectural design decisions, guidelines, principles or rules
- Structure combined with architecture characteristics (e.g. scalability, security, etc.)
- The significant design decisions that shape a system, where significance is measured by cost of change
- Things that people perceive as hard to change
- Decisions that need to be made early on
- The cloud infrastructure
- The technology stack
- Architecture is the stuff you can't google
- Architecture is a hypothesis that needs to be proven by implementation and measurement
If your definition aligns with any of these, congratulations! You're in good company. If it doesn't, don't worry, because here's the kicker: none of these definitions are wrong, but none are entirely right either. The truth is, software architecture is subjective, relative, and ever-changing. It's like trying to nail jelly to a wall - just when you think you've got it pinned down, it slips away and reshapes itself.
Everything is Relative
"Everything is relative to the observer."
Just as the perception of time and space can change based on the observer's frame of reference, so can the perception of what constitutes "architecture" in software. What's architectural in one context might be mere implementation detail in another.
- The use of Python might be considered an architectural decision in a data analysis project but a far less important one for an e-commerce website.
- A concert ticketing system might need an architecture that can handle massive traffic spikes when Taylor Swift holds a concert and massive dips when only indie bands are performing, while a nuclear power plant monitoring system, with its consistent traffic, might not.
- An API framework might be significant within the context of a microservice but not in the context of how service workflows are structured.
- Your mission-critical service might be architecturally significant within your team but only a blip on the radar of an enterprise architect.
- Your super important algorithm might be another person's "just another component".
- A frontend developer might view the UI framework as a critical architectural choice, while a backend developer might consider it irrelevant to the system's architecture.
- In a small application, the choice of a database might be an architectural decision. In a large enterprise system, it could be just another implementation detail within a broader data management strategy.
You get the point. Just as forest firefighters switch between thinking about individual trees and entire forests depending on what part of their job they are doing, we change what we consider architecturally significant based on our scope. The key is to understand that architecture in software is context-dependent and can vary based on perspective, project, domain, size, complexity, age, technology, and time of implementation.
Evolutionary Architecture
"The only thing that is constant is change"
This architectural relativity extends to time as well. What's considered architecture today might become an implementation detail tomorrow, and what once seemed like an architectural pipe dream may now be the industry standard.
Consider the evolution of microservices architecture. Imagine pitching this idea in the early 2000s: "I propose we run our services in isolated environments, each with its own database, minimal shared code and technical diversity. Oh, and let's have 50 of them!"
Such a suggestion would have been met with incredulity, if not outright mockery. Why? Because the technological and architectural landscape was vastly different:
- Relatively low computing power
- No containers, and virtual machines were still in their infancy
- There were no reliable cloud providers
- Everything had to be run manually on-premise
- Limited DevOps and automation practices
- Few open-source initiatives
- Expensive software licenses
- Expensive hardware / servers
Running microservices in such an environment would have been financially and technically infeasible for most organizations. Fast forward to today, and microservices are a viable and popular option for even the smallest startups.
The architecture of our applications will always change. Maybe it's new tech, or maybe it's new feature requests, new engineering practices, new trends, bugs, scaling issues, or budget constraints. Whatever the case, change is inevitable. Change will lead to more change, creating a dynamic ecosystem where adaptability and equilibrium become crucial.
The Microbial Arms Race
We see this intricate dance of evolution where one change begets another all the time in nature. The ongoing battle between us and bacteria exemplifies this perfectly. When penicillin was first widely used in the 1940s, it seemed like a miracle cure, decimating bacterial infections that were once fatal. However, nature abhors a vacuum, and bacteria quickly evolved resistance mechanisms, rendering penicillin less effective.
This prompted scientists to develop methicillin in 1959, a new antibiotic designed to overcome penicillin resistance. Yet, within just two years, methicillin-resistant Staphylococcus aureus (MRSA) emerged, continuing the evolutionary arms race. The medical community then turned to vancomycin in the 1970s, considering it a "last resort" antibiotic. For decades, it held the line against resistant infections, but even this stronghold fell as vancomycin-resistant enterococci appeared in the 1980s.
As bacteria continued to adapt, so did we. The 1980s and 1990s saw the introduction of new classes of antibiotics like carbapenems and later, linezolid. Each new drug brought hope, only to be met with newfound bacterial resistance within months or years. This dance continues in the 21st century with antibiotics like daptomycin and ceftaroline, each facing the same inevitable challenge of bacterial adaptation.
This ongoing cycle of innovation and resistance mirrors the dynamic ecosystem of software development. Just as bacteria evolve to survive in an antibiotic-rich environment, our software systems must evolve to thrive in an ever-changing technological landscape. As we look to the future, both in medicine and technology, we must recognize that this dance of evolution will continue and that the only constant is change itself.
Okay, so changes are inevitable: our applications, their architecture, and what is considered architecturally significant will change. But changes are also the thing that breaks our applications. So what can we do about it? Well, for starters, we can build changeability into the architecture itself, and even into its very definition.
Instead of trying to avoid change, we should embrace it. We know it's going to happen, so let's be good at it. Our architecture should be more like a Lego building and less like a Jenga tower: we should be able to add, remove, and modify pieces without the whole thing crumbling down.
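To make the Lego analogy a little more concrete, here's a minimal sketch in Python (every name here is invented for illustration, not taken from any real system). The point is the seam: when a component depends on an abstraction instead of a concrete class, pieces can be clicked on, pulled off, or swapped without the rest of the structure noticing.

```python
from typing import Protocol


class NotificationSender(Protocol):
    """The 'stud' other bricks click onto: any way of notifying a customer."""

    def send(self, recipient: str, message: str) -> None: ...


class EmailSender:
    def send(self, recipient: str, message: str) -> None:
        print(f"Emailing {recipient}: {message}")


class SmsSender:
    def send(self, recipient: str, message: str) -> None:
        print(f"Texting {recipient}: {message}")


class OrderService:
    # Depends on the abstraction, not on any concrete sender, so senders
    # can be added, removed, or replaced without touching this class.
    def __init__(self, sender: NotificationSender) -> None:
        self._sender = sender

    def place_order(self, customer: str) -> None:
        # ...persist the order, charge the card, etc...
        self._sender.send(customer, "Your order has been placed.")


# Swapping a piece doesn't topple the tower:
OrderService(EmailSender()).place_order("ada@example.com")
OrderService(SmsSender()).place_order("+15551234567")
```

Adding a push-notification sender tomorrow would mean clicking on one more brick, not rebuilding the tower.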
The CrowdStrike Incident
In 2024, the world witnessed a profound example of how changes, intended to improve our systems, can lead to catastrophic failures if not managed properly.
On July 19th, the cybersecurity company CrowdStrike released an update to its Falcon Sensor security software. What began as a routine update quickly became one of the largest outages in IT history. A tiny modification in a configuration file led to widespread disaster, with approximately 8.5 million Windows systems crashing and becoming unable to restart. The impact was felt globally, disrupting businesses, governments, and daily life across various sectors.
The incident exposed the fragility of modern IT infrastructure and the dire consequences of failing to manage changes effectively. Airlines grounded flights, hospitals postponed surgeries, and financial institutions faced operational hurdles, all due to a single faulty update. The economic repercussions were staggering, with estimated financial damages exceeding $10 billion.
This serves as a powerful reminder of the importance of building changeability into our architecture. By designing systems that can handle modifications gracefully, we can mitigate the risks associated with inevitable changes. The CrowdStrike incident underscores the necessity of an evolutionary approach to architecture, where adaptability is a core principle, ensuring that updates and changes strengthen rather than compromise our applications.
In later chapters, we will look at many architecture styles, patterns, and practices that help us build evolvability into our applications, so we can manage change instead of trying to prevent it.
Definition Flaws
Now that we've danced around various definitions and looked at the relativistic and evolutionary nature of software architecture, you are hopefully starting to see the problems.
Thought Exercise
Read the common definitions again and try to think about what potential flaws they might have.
Let's look at some of the common definitions and analyze why they might be considered controversial or wrong.
The highest level components and their interactions
The idea that architecture is about "the highest level components and their interactions" sounds great in theory. However, in practice, it's about as clear as a riddle wrapped in a mystery. What exactly constitutes a "high-level" component? Are we talking about microservices? Or a grouping of microservices? But what if our system doesn't use microservices? Perhaps it's our design patterns? Our classes? Or even our functions?
This definition stumbles into the trap of relativity that we discussed earlier. What's considered "high-level" can vary wildly depending on the perspective and context.
Shared understanding of the system
If architecture is purely based on shared understanding, what happens when that understanding differs? Does the architecture change every time a new developer joins the team with a different perspective?
As usual, this one also stumbles into the trap of relativity. What constitutes a "shared understanding" can differ depending on the team's size, experience level, background, organization, and context.
Things that people perceive as hard to change
Firstly, what is hard to change is subjective: it can differ from one person to another, or even for the same person over time.
Secondly, we've established that this isn't the mindset we want to foster, nor is it how we want our architecture to be structured. Our goal is to embrace change, not shy away from it. Ideally, our architecture should be evolvable and therefore shouldn't be inherently "hard to change" (although this is easier said than done).
Decisions that need to be made early on
This definition assumes a level of foresight that simply doesn't exist in the ever-evolving world of software development. It harkens back to the waterfall days, where we believed we could plan everything upfront and execute flawlessly.
We will never be able to predict the future; our architecture will change, and we want it to be adaptable. Defining architecture as "decisions that need to be made early on" flies in the face of this reality.
Set of architectural design decisions, guidelines, principles or rules
To be frank, I don't hate this definition. It does a decent job of capturing the essence of architecture without falling into the pitfalls of relativity or architectural evolution. While what constitutes a design decision can certainly vary across contexts and evolve over time, this definition doesn't impose such constraints. It simply states that architecture is a set of architectural decisions... whatever those may be.
However, this brings us to the definition's main flaw: it's self-referential. The critical question remains unanswered: what exactly makes a design decision "architectural"? According to this definition, a decision is architectural if it's... well, architectural. Not exactly helpful, is it?
Defining the Undefinable
After exploring various definitions and their flaws, we find ourselves at an impasse. How do we define something that is relative and constantly changing? This leads us to my personal favorite definition:
"Architecture is about the important stuff, whatever that is"
At first glance, this definition might seem frustratingly vague - akin to saying "the treasure is buried under the tree, wherever that is." However, it's precisely this vagueness that makes it so flexible. It means that deciding what is architecture is simply deciding what is important.
- If the interactions between microservices A and B are important, then they are part of the architecture
- If a certain design decision is not important, then it is not part of the architecture
- If the choice to use MongoDB as a data source doesn't matter, then it's not part of the architecture
Yes, what is important is of course still subjective, but at least this definition doesn't hide from that fact.
Alas, the definition "Architecture is about the important stuff, whatever that is" is also highly debated, so maybe there should be no definition, and software architecture should simply remain the inexpressible, intuitive feeling we have in our heads.
Why do we do it?
" 'Do you ever get anywhere?' he asked with a mocking laugh. 'Yes,' replied the Tortoise. "
Big ball of mud
Picture this: you've just joined a new company, excited to dive into their codebase and start making meaningful contributions. But as you crack open the repository, your enthusiasm quickly turns to despair.
What greets you is a monolithic behemoth, a tangled web of spaghetti code that would make even the most seasoned pasta chef weep. Classes and functions snake their way across hundreds of lines, riddled with duplication and inconsistencies. Documentation, when present, strays far from the actual implementation. Finding anything is a maddening ordeal, the database schema resembles the haphazard work of a drunken octopus, and the naming conventions feel like an afterthought—if they were ever considered at all.
You soon discover that no change is trivial: making even the tiniest change requires updating code in 17 different places, only for you to find out after deploying to production that you've broken the entire system, because you actually needed to update the code in 19 places, not 17.
Why do we take shortcuts?
This type of system is, unfortunately, all too common. But why? As we've established, we know that our applications will need to change. So why do so many programmers seem to have the self-sabotaging desire to shoot themselves, and their fellow team members, in both feet by taking shortcuts?
Well, the answer is simple: there are deadlines to be met, markets to conquer, bosses to please and, frankly, we're lazy. We often justify our blatant disregard for our system by telling ourselves things like "We'll clean it up later" or "this solution is only temporary", but these are lies.
Deadlines will always be looming, and by taking shortcuts now, we only sabotage our future ability to meet them, locking us into a downward spiral of quick fixes and hacks until the system collapses under its own weight.
Recreation of Martin Fowler's diagram
Take a look at this completely fabricated graph that illustrates the point. At first, taking shortcuts seems to pay off: we deliver faster, and our stakeholders are thrilled. But as our system's complexity grows, those shortcuts start catching up with us. Eventually, we hit a wall where making changes takes longer and longer until they become virtually impossible, and the stakeholders who were previously singing our praises are now humming a very different tune, all because we sacrificed internal quality early on.
Thought Exercise
Reflect on times you've taken shortcuts, consequently sacrificing the quality of the system.
Has the system become harder to maintain due to these shortcuts?
Did you tell yourself excuses like "I'll fix it later", and if so, did you ever fix it?
Defend your code
From many stakeholders' perspective, all changes and feature requests have roughly the same size and scope, no matter when in the project's timeline they occur. But we know the truth: as our system grows in complexity, so does the scope of each request.
For this reason, we need to fight for the architecture. We need to make our stakeholders aware of this problem and help them understand that if we don't put in the work to refactor, clean up, and improve the internal quality of our system, we'll face increasing difficulties implementing new features, leading to longer development times, higher costs, and very likely more bugs and system instability.
Stakeholders want what is best for the product, and despite some jokes developers tell at the expense of product managers, they aren't all idiots. From their perspective, what is best for the product is developing new features and getting to market quickly, but we know that this isn't always what is best if it means completely sacrificing internal quality. It is therefore our job, as stakeholders of the product ourselves, to defend our code and help stakeholders with different perspectives see things from our view.
Don't overengineer
"Be Moderate In Everything Including Moderation"
As we've discussed, architecture and internal quality are necessary for an adaptable system, but there is such a thing as too much of a good thing.
We don't want to design and implement our entire architecture up front. Instead, we want to follow the KISS principle (Keep It Simple, Stupid) and only implement what we need when we need it. There are two primary reasons for this:
- We don't want to spend time implementing something we may never need
- We don't want to increase the complexity of our system with architecture we don't truly need
Again: our architecture, just like our features, should be adaptable and changeable. We should design it with changeability built in, so we can keep it simple and dial up the complexity and infrastructure as needed.
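As a small illustration of what "simple now, changeable later" can look like in code, here's a hypothetical Python sketch (the store and its methods are invented for the example):

```python
class InMemoryUserStore:
    """Day one: a dict is all we need. No cluster, no ORM, no ops burden."""

    def __init__(self) -> None:
        self._users: dict[str, str] = {}

    def save(self, user_id: str, name: str) -> None:
        self._users[user_id] = name

    def find(self, user_id: str) -> str | None:
        return self._users.get(user_id)


def register_user(store, user_id: str, name: str) -> None:
    # Callers depend only on save()/find(), not on how storage works.
    # If we ever outgrow a dict, a database-backed store exposing the
    # same two methods can slot in here without any caller changing.
    store.save(user_id, name)


store = InMemoryUserStore()
register_user(store, "42", "Grace")
print(store.find("42"))  # Grace
```

The seam costs almost nothing today, and the distributed cache, the ORM, and the read replicas can wait until the day we actually need them.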
Other factors
The primary factor behind why we do it is change: we don't want our software to be like a house of cards that collapses at the slightest touch, nor do we want it to be like a house made of steel and concrete that is close to impossible to change once constructed. What we want is a Lego building, something modular that is easy to modify and expand without us needing to worry too much about breaking the current structure.
But there are aspects of architecture other than change that can be important, like DevOps, automation, cost, data, scalability, security, etc. All things we will explore in later chapters.
Everything is a trade-off
"Everything in software architecture is a trade-off"
In software architecture, as in life, there's rarely a perfect solution and every decision comes with its own set of pros and cons, benefits and drawbacks. It's easy to get caught up in the excitement of designing a system that perfectly aligns with all our ideals: scalable, secure, maintainable, efficient, and adaptable. However, the reality is that no architecture can optimize for every possible concern simultaneously. Instead, we must learn to make conscious decisions that balance these competing priorities, understanding that improving one area often comes at the expense of another.
Making architectural decisions is less about finding the "right" answer and more about choosing the trade-offs that best fit your specific context. Remember: architecture is about the important stuff, whatever that is.
For example:
- Consistency vs. Availability: Maybe it's more important that the data you read from a database is always consistent rather than always available, so you decide to have a single database instance shared between your service instances (see the toy sketch after this list).
- Scalability vs. Development Speed: Building for a million users and 100 development teams from day one sounds great, until you realize you've spent a year optimizing your event-driven microservice architecture for load and team members you don't have yet.
- Security vs. Usability: Maybe it's important for your system to require a 64-character password changed daily, but your users might revolt.
- Performance vs. Maintainability: You might want to write that algorithm in assembly for blazing speed, but good luck maintaining it six months from now.
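To ground the first of these trade-offs, here's a deliberately toy Python sketch of the consistency-versus-availability choice (no real database involved; every name is made up for illustration):

```python
import random


class Node:
    """A toy storage node: it may be down, and it may lag behind the primary."""

    def __init__(self) -> None:
        self.data: dict[str, str] = {}
        self.alive = True


class ToyCluster:
    def __init__(self, replica_count: int = 2) -> None:
        self.primary = Node()
        self.replicas = [Node() for _ in range(replica_count)]

    def write(self, key: str, value: str) -> None:
        # Writes land on the primary; replication is asynchronous, so
        # replicas serve stale data until replicate() runs.
        self.primary.data[key] = value

    def replicate(self) -> None:
        for replica in self.replicas:
            replica.data = dict(self.primary.data)

    def read_consistent(self, key: str) -> str | None:
        # Consistency first: only the primary is trusted. If it's down,
        # we fail rather than risk returning a stale answer.
        if not self.primary.alive:
            raise RuntimeError("primary down: refusing a possibly stale read")
        return self.primary.data.get(key)

    def read_available(self, key: str) -> str | None:
        # Availability first: any live node will do, staleness accepted.
        live = [n for n in (self.primary, *self.replicas) if n.alive]
        return random.choice(live).data.get(key) if live else None


cluster = ToyCluster()
cluster.write("tickets:remaining", "42")  # written, but not yet replicated
cluster.primary.alive = False             # the primary crashes
print(cluster.read_available("tickets:remaining"))  # answers, but stale: None
# cluster.read_consistent("tickets:remaining")      # would raise instead
```

Neither read strategy is wrong; they're different trade-offs, and the right one depends on whether a stale answer or no answer hurts more in your context.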