DevOps and Data: Lessons Teams Can Learn About Managing Databases

According to the Bureau of Labor Statistics, the outlook for jobs around managing data architecture and databases looks pretty good: The number of professionals with roles around managing data is due to grow by eight percent from 2022 to 2032. However, while the number of roles that work around data is going up, the position of database administrator (DBA) alone is actually dropping. Instead, the DBA specialist role has been replaced by the DevOps world’s site reliability engineer (SRE). 

The SRE role was developed at Google, where engineers applied their skills in website management and operations to other areas of technology. SREs have a much wider job scope than DBAs: they cover the same kinds of tasks, such as operational management, availability, system redundancy, and security, but for the whole IT infrastructure rather than just the databases that are in place. However, the tasks performed by DBAs have not diminished, and there are nuances that DBAs may understand that other employees don’t.

What Issues Do DBAs Confront? 

While the majority of SREs will be able to manage database instances and keep them running, they will not have the depth of experience and knowledge that someone who has concentrated on database theory and management will have. When something out of the ordinary happens, SREs will have to understand what is going on and whether they can fix the problem, or whether they might need to call in an expert.

A good example of this is writing and tuning Structured Query Language, or SQL. SQL may sit at the top of the jobs ranking in IEEE Spectrum’s 2023 list of programming languages, but it has a fussy syntax that relatively few dedicate themselves to mastering. Many developers are not familiar with how to write effective and efficient SQL queries, so they may end up with poorly performing requests that take longer to return results. Alternatively, developers often turn to Object-Relational Mappers (ORMs) to handle their SQL requests for them. While ORMs might make the situation simpler for developers, they can produce the same poorly performing, badly designed queries as hand-written SQL, with the added burden of updating and managing the ORM itself. These problems are often seen alongside a penchant for long-running transactions that stifle performance.
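One common form of the ORM problem described above is the "N+1" pattern, where code loads a list of rows and then issues one extra query per row. The following is a minimal sketch, not taken from any particular ORM: it uses Python's built-in sqlite3 module as a stand-in database, and the table names, the `CountingConnection` wrapper, and both functions are hypothetical names invented for illustration.

```python
import sqlite3


class CountingConnection:
    """Hypothetical wrapper that counts how many queries are sent."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(i, f"customer-{i}") for i in range(1, 51)])
conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                 [(1 + i % 50, 10.0) for i in range(200)])
conn.commit()


def totals_n_plus_one(db):
    # One query for the customer list, then one more query per customer.
    totals = {}
    for customer_id, name in db.query("SELECT id, name FROM customers"):
        rows = db.query("SELECT total FROM orders WHERE customer_id = ?",
                        (customer_id,))
        totals[name] = sum(total for (total,) in rows)
    return totals


def totals_joined(db):
    # A single query: the database does the grouping and summing.
    rows = db.query("""
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
    """)
    return dict(rows)


slow_db = CountingConnection(conn)
slow_totals = totals_n_plus_one(slow_db)
fast_db = CountingConnection(conn)
fast_totals = totals_joined(fast_db)
print(slow_db.queries, "queries versus", fast_db.queries)
```

Both functions return the same totals, but the first sends one query per customer plus one for the list. Against a networked database each of those is a round trip, which is why this pattern tends to look like "just how things are" until someone counts the queries.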

For DBAs, spotting these issues and correcting them was part of the full-time job. However, for SREs who are not familiar with database performance, these slow transactions can be accepted as “just how things are” rather than a symptom of something being wrong. Alternatively, developers can try throwing more resources at the problem by buying larger machines or cloud instances to run on.

Alongside query design, DBAs were also responsible for setting up data indexes on their databases. Indexing data is a Harry Potter-ish dark art to many, who either over-index or under-index, leading to poor performance. In the past, DBAs used to look for redundant indexes that were no longer used or popular queries that had not been indexed, and then correct the database for better performance. 
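Checking whether a popular query is actually served by an index usually starts with the query plan. The sketch below assumes SQLite (via Python's sqlite3 module) purely as a convenient stand-in; MySQL and PostgreSQL have their own EXPLAIN output formats. The `orders` table, the `idx_orders_customer` index, and the `plan` helper are hypothetical names used for illustration.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                 [(i % 100, 10.0) for i in range(1000)])
conn.commit()

query = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index, the only option is to read every row in the table.
before = plan(conn, query, (42,))

# After indexing the filtered column, the planner can search the index.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(conn, query, (42,))

print("before:", before)
print("after: ", after)
```

The first plan reports a scan of the whole table, while the second names the new index. The same check run in reverse, looking for indexes that no plan ever mentions, is one way to find candidates for removal.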

Lastly, DBAs would be responsible for running maintenance statements such as ANALYZE TABLE themselves and tracking performance over time. This would keep statistics for the optimizer current and flag any areas where changes or additions had affected performance levels. Without this insight, SREs can leave indexes in place that are no longer needed at best, or that hurt performance at worst.
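To make the idea of optimizer statistics concrete, here is a small sketch that again assumes SQLite as a stand-in. SQLite's statement is plain `ANALYZE` rather than MySQL's `ANALYZE TABLE`, and it stores its results in a table called `sqlite_stat1`; other databases keep equivalent statistics elsewhere. The table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                 [(i % 100, 10.0) for i in range(1000)])
conn.commit()

# Gather statistics about the indexes so the planner has current numbers.
conn.execute("ANALYZE")

# sqlite_stat1 holds one row per analyzed index. The stat column is a string
# of integers: the row count first, then the average number of rows that
# share each value of the indexed column(s).
stats = dict(conn.execute(
    "SELECT idx, stat FROM sqlite_stat1 WHERE tbl = 'orders'"
).fetchall())

for index_name, stat in stats.items():
    print(index_name, "->", stat)
```

If the data changes substantially and the statistics are never refreshed, the planner keeps making decisions based on the old numbers, which is the situation the regular ANALYZE routine was meant to prevent.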

Planning Ahead

There are also lessons that need to be re-learned on the operational side of databases. For example, there is an old saying that a DBA is only ever as good as their last backup. While you may hope that you never need to recover data after an issue, having a working and fully tested backup for any critical data is essential. In the world of databases, many SREs now rely on their cloud provider to run this for them. However, is this enough?

While you can point to what your Service Level Agreement states around a backup and restore process, this may not accurately portray how quickly you can get back up and running after a problem. Firstly, this SLA is dependent on your backup being good and your being able to fully recover from it. Until you have actually loaded up a backup and started using that data again, you can’t be sure that you have fully protected your operations. Secondly, your SLA time may be very different from the amount of time that you can afford to be offline and not processing. It used to be the duty of the DBA to spot data loss and return the database to a good state. While a cloud service provider can tell you what their SLA is for your data, they are not necessarily going to provide everything that you need to meet your internal service requirements.
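A restore test does not need to be elaborate to be useful. The sketch below restores a backup into a scratch database, runs an integrity check, compares row counts, and times the whole process against a recovery target. It assumes SQLite and Python's `Connection.backup` method as a stand-in for whatever backup tooling is really in use; the `verify_restore` function, the `orders` table, and the five-second target are hypothetical.

```python
import sqlite3
import time


def verify_restore(backup, expected_counts, rto_seconds):
    """Restore a backup into a scratch database and check that it is usable."""
    started = time.perf_counter()
    scratch = sqlite3.connect(":memory:")
    backup.backup(scratch)  # copy the backup's contents into the scratch db
    elapsed = time.perf_counter() - started

    integrity = scratch.execute("PRAGMA integrity_check").fetchone()[0]
    counts = {
        table: scratch.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        for table in expected_counts
    }
    scratch.close()
    return {
        "integrity_ok": integrity == "ok",
        "counts_match": counts == expected_counts,
        "within_rto": elapsed <= rto_seconds,
    }


# A stand-in "production" database with some committed data.
production = sqlite3.connect(":memory:")
production.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
production.executemany("INSERT INTO orders (total) VALUES (?)",
                       [(10.0,) for _ in range(500)])
production.commit()

# Take the backup, then prove it can be restored and used.
backup = sqlite3.connect(":memory:")
production.backup(backup)

report = verify_restore(backup, expected_counts={"orders": 500}, rto_seconds=5.0)
print(report)
```

The point of timing the restore is that the measured figure, not the provider's SLA, is what should be compared with how long the business can afford to be offline.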

Similarly, structuring database tables requires a lot of knowledge about how data is managed. While developers might understand how databases can be used to store, sort, and return data that they need for their applications, there are some nuances involved in how to properly align tables to get the most out of that data over time. Alongside this, having a proper understanding of the relational model helps you understand that a lack of compartmentalization of data into separate tables causes poor performance. There are also database-specific tricks that you pick up as part of managing these instances directly. For example, many do not know that MySQL now wants you to have primary keys on every table, or that PostgreSQL may need to pad rows if columns do not fit nicely on an eight-byte boundary.
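The PostgreSQL padding point can be worked through as simple arithmetic. The sketch below is a simplified model, not PostgreSQL's actual storage code: it covers only a few fixed-width types, ignores the tuple header, null bitmap, and variable-length columns, and assumes the eight-byte maximum alignment typical of 64-bit builds. The `row_data_size` function and the column orderings are hypothetical.

```python
# (size, alignment) in bytes for a few fixed-width PostgreSQL types.
TYPES = {
    "boolean": (1, 1),
    "smallint": (2, 2),
    "integer": (4, 4),
    "bigint": (8, 8),
}


def align(offset, boundary):
    """Round offset up to the next multiple of boundary."""
    return (offset + boundary - 1) // boundary * boundary


def row_data_size(columns):
    """Bytes used by a row's column data, including alignment padding."""
    offset = 0
    for column_type in columns:
        size, alignment = TYPES[column_type]
        # Each column starts at the next offset that suits its alignment.
        offset = align(offset, alignment) + size
    # The row as a whole is padded out to an eight-byte boundary.
    return align(offset, 8)


mixed_order = ["boolean", "bigint", "smallint", "bigint", "integer"]
wide_first = ["bigint", "bigint", "integer", "smallint", "boolean"]

print(row_data_size(mixed_order))  # 40
print(row_data_size(wide_first))   # 24
```

In this model the same five columns take 40 bytes in one order and 24 in the other, because placing the widest columns first leaves no gaps to fill. Over a table with many millions of rows, that difference shows up in storage and cache usage.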

The Future for Data Management

Data is increasingly important within companies. It provides the basis for serving customers efficiently, but it is also at the heart of new business services or deep analytics projects that are used in new products. Without this data, those products – and the revenue they bring in – either don’t exist or fail to deliver the value they were designed to provide.

At the same time, the skills around data management are leaking out of organizations, subsumed in wider roles or handed over to service providers. When everything is working, this is fine. But when a problem strikes, you will need that in-depth knowledge to solve the issue and make sure it does not happen again. Without that knowledge, you can also end up spending more than you need to on maintaining the data that you keep and carrying out work around it.

While the role of the DBA has been taken over by newer positions around DevOps and SRE, the tasks associated with that role are still with us. For SRE and DevOps professionals, knowing your data theory can be the difference between spending cash on more infrastructure and saving money through better performance.