16 Dec 2021 - 4 min read

The Ideal Site Reliability Engineer to Developer Ratio

Learn the advantages of a shift from a typical 1:10 SRE-to-developer ratio to a 1:50 ratio using self-service DevOps and cloud orchestration tools, like Zeet.

Jack Dwyer

Product
Platform Engineering + DevOps

The KPI for scaling your engineering team

Imagine that an application engineer with a standard background and minimal infrastructure experience wants to deploy and maintain an analytics stack on AWS. They could need weeks to become familiar with complex cloud concepts and the AWS CLI before they can successfully deploy and configure the system in production.

Software engineering is a broad field, so companies employ specialists to ensure that developers aren't wasting time battling problems outside their areas of expertise. One such specialist, the Site Reliability Engineer (SRE), is responsible for deploying and maintaining a company's software in production and for ensuring that any automation in the DevOps team's stack is functioning.

Hiring, training, and retaining developers requires significant time and money, so organizations need to balance the investment in SREs against other technical staffing needs. As a result, companies of all sizes are increasingly turning to solutions that abstract away production complexities and give every engineer on the team a seamless developer experience out of the box.

Defining SRE vs DevOps

Before we get into the golden ratio, we need to define SRE vs DevOps in the context of most cloud-native software development teams. Across the software development life cycle, Agile and non-Agile methodologies alike generally use some version of the following framework to separate responsibilities and skill sets across team members.

  • Developers write code
  • DevOps and operations teams get code to dev, staging, and production
  • SRE maintains the infrastructure and pipelines and prevents outages and downtime

These roles aren’t always filled by different people. Developers build applications, but they can also deploy those applications and manage infrastructure. Continuous integration and continuous delivery tools like Jenkins (or Zeet!) have made it possible for a developer to own more of the stack.

As organizations scale, however, software delivery usually gets broken into discrete roles.

The SRE role makes sure things like observability and metrics are in check at the IT infrastructure and pipeline level across cloud platforms. Scalability, failure rate, and things like CD pipelines are in their purview. SREs don’t necessarily do DevOps work, but DevOps teams depend on them: reliable DevOps tooling and uptime, not to mention SLAs and service level objectives, all require the SRE team to do a good job. Think of SREs as the System Administrators (SysAdmins) of the new world.

DevOps teams (sometimes called DevSecOps when there is a big security focus) come in to leverage that IT infrastructure by setting up automation tools on top of it. These tools include things like CD pipelines, whether from GitHub or other change management products, which get new features into the hands of end users. DevOps engineers ensure that things go smoothly, no matter the deployment frequency the development team needs, and they are usually the ones bridging application developers and end users. On a Kubernetes team, the DevOps engineer may worry about getting code onto the correct clusters, while the SRE team is in charge of ensuring the Kubernetes clusters are healthy.

Developers write code. There are times when DevOps engineers write code too, but in general, developers write the code for the product. They don’t concern themselves with service level agreements; they worry about getting new features into the hands of users.

Abstract Complexity, Improve Developer Experience

While it's important for both developers and infrastructure teams to have some ownership of operations, developers don't want to be involved in the practical work of deploying, managing, and provisioning complex infrastructure. Time spent putting out fires in production is time developers could otherwise have spent shipping features.

Thus, it's necessary to have a solution that abstracts away these hassles and empowers developers with self-service infrastructure.

The SRE-to-Developer Ratio

In their well-known book, Site Reliability Engineering, a team of SREs and technical writers lays out Google's philosophies and practices regarding, you guessed it, site reliability engineering. One famous concept in the book is the "SRE-to-developer ratio." Because SREs have specialized skills that add leverage to other developers' work but are a scarce resource, companies maintain an SRE-to-developer ratio of about 1:10, where one SRE team commonly works with multiple developer teams in their product area.

While the industry-standard 1:10 ratio seems like a good starting place, it fails at both extremes of organization size. A seed-stage startup likely has fewer than ten engineers, so hiring even one dedicated SRE would exceed the ratio and strain limited resources. Instead, many small companies get by with tasking some or all of their application engineers with managing their own infrastructure, which slows product development. At the opposite end of the spectrum, a rapidly scaling company that needs to add 300 application engineers would require 30 additional SREs to account for the growth.
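
The arithmetic behind these two extremes can be sketched in a few lines of Python. This is only an illustration, not a staffing model: the function name `sres_needed` and the choice to round up to whole people are assumptions made for the example.

```python
import math


def sres_needed(developers: int, devs_per_sre: int = 10) -> int:
    """Return the number of SREs implied by a 1:devs_per_sre ratio.

    Rounds up, on the assumption that you cannot hire a fraction of an SRE.
    """
    if developers < 0 or devs_per_sre <= 0:
        raise ValueError("developers must be >= 0 and devs_per_sre must be > 0")
    return math.ceil(developers / devs_per_sre)


# A scaling company adding 300 application engineers at 1:10.
print(sres_needed(300))  # 30

# A seed-stage startup with 8 engineers still rounds up to one full SRE,
# which is more than the 1:10 ratio calls for.
print(sres_needed(8))  # 1
```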

How to improve your SRE-to-Developer Ratio

Organizations of all sizes are rethinking their approach to scaling both teams and infrastructure. Investments in internal tooling, whether built or bought, automate repeated tasks and abstract complexity, which allows each SRE to support more developers.

With a self-service DevOps solution, it's possible to approach a 1:50 SRE-to-developer ratio. Individual developers are empowered with the tools they need to manage their own infrastructure day-to-day with minimal effort, while SREs focus on the toughest and most pressing problems the infrastructure presents. By providing this additional leverage, self-service DevOps doesn't just scale infrastructure, it scales people.
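
To get a feel for what that shift means in headcount, here is a rough comparison of the two ratios for the 300-developer example used earlier in this article. The helper names `sre_headcount` and `compare_ratios` are hypothetical, and the sketch assumes the ratio applies uniformly across the organization, which real teams rarely achieve.

```python
def sre_headcount(developers: int, devs_per_sre: int) -> int:
    """Whole SREs needed for a 1:devs_per_sre ratio (ceiling division)."""
    return -(-developers // devs_per_sre)


def compare_ratios(developers: int, baseline: int = 10, self_service: int = 50) -> dict:
    """Compare SRE headcount at a baseline ratio and a self-service ratio."""
    before = sre_headcount(developers, baseline)
    after = sre_headcount(developers, self_service)
    return {
        "baseline_sres": before,
        "self_service_sres": after,
        "difference": before - after,
    }


result = compare_ratios(300)
print(result)
# {'baseline_sres': 30, 'self_service_sres': 6, 'difference': 24}
```

Under these assumptions, the same 300 developers would be supported by 6 SREs instead of 30.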

Traditionally, this has required building infrastructure in-house, or even creating a dedicated Platform or Internal Tools team. This is an incredibly expensive effort, and many organizations are turning away from custom-built tools that require dedicated headcount to maintain and improve. Instead, cloud orchestration tools like Zeet are making it possible to achieve a 1:50 SRE-to-developer ratio without having to build the tools yourself.

What's your organization's SRE:Developer Ratio?
