Explaining the difference between self-managing databases and managed database services.
At CrystalDB, we are commonly asked how a managed database service compares to a self-managing database. This is a great question because both terms describe something similar: offloading database management. The difference lies in how—and how well—they do this.
A managed database service—sometimes also called a database as a service (DBaaS)—is a database run by a cloud provider to relieve customers of administrative responsibilities such as installation, routine software patching, backups, and replacing failed systems. These tasks can be scripted to run automatically, and skilled operators oversee the service as a whole, managing thousands or even millions of databases. Examples of operator responsibilities include overall capacity planning, addressing infrastructure problems as they occur, and deciding how to respond to reports of new security vulnerabilities.
In addition to running standard open source databases, some cloud providers have also modified or extended these databases to improve their performance, reliability, or scalability. Examples include Amazon Aurora, Google AlloyDB, and Azure Cosmos DB for PostgreSQL (previously Citus). While this class of managed database services is in some ways akin to proprietary database software such as Oracle or SQL Server, including with respect to lock-in considerations, they can offer improved performance or reliability.
Using a cloud database service, you can set up a development database environment in under an hour, whereas manual configuration might require many hours. A production configuration could be even more time consuming, taking weeks to design, deploy, and test. Deploying a production database still requires planning even with a managed database service, but the process is significantly simplified.
The key limitation of managed database services arises because cloud providers only take care of tasks that can be done the same way regardless of customer or workload: the tasks that are straightforward to automate at scale. With limited exceptions, managed database services are not able to help with the vast array of application-specific tuning required to run a production database well.
A self-managing database uses AI and automation to eliminate many of the tasks that a skilled database administrator (DBA) would do to keep a system running reliably, efficiently, and securely. You might run a self-managing database on a public cloud, on hardware in your own datacenter, on customer premises, or in an embedded application.
A database like PostgreSQL has over 300 configurable parameters, and a self-managing database like CrystalDB can figure out how to set them based on just a handful of high-level objectives. Self-managing databases can also determine the best hardware specifications to meet performance and availability needs. They can make indexing decisions and can even advise on application-level tuning recommendations.
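To make the idea of deriving low-level settings from a few high-level inputs concrete, here is a minimal sketch. It is not CrystalDB's method: the `Objective` class, the `suggest_settings` function, and the `work_mem` formula are hypothetical and for illustration only. The setting names (`shared_buffers`, `effective_cache_size`, `work_mem`) are real PostgreSQL parameters, and the fractions of memory used are common rule-of-thumb starting points rather than tuned values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objective:
    """Hypothetical high-level inputs; not CrystalDB's actual interface."""
    total_ram_mb: int
    max_connections: int
    workload: str  # "oltp" or "analytics"


def suggest_settings(obj: Objective) -> dict[str, str]:
    """Map an objective to a few PostgreSQL settings using simple heuristics."""
    if obj.workload not in ("oltp", "analytics"):
        raise ValueError(f"unknown workload: {obj.workload!r}")

    # Common starting point: roughly a quarter of RAM for shared_buffers.
    shared_buffers_mb = obj.total_ram_mb // 4

    # Planner hint: how much data the OS and PostgreSQL caches may hold together.
    effective_cache_size_mb = (obj.total_ram_mb * 3) // 4

    # Assumed heuristic: analytics workloads get a larger per-operation
    # sort/hash budget than OLTP workloads, with a 4 MB floor.
    share = 4 if obj.workload == "analytics" else 1
    work_mem_mb = max(4, ((obj.total_ram_mb // 4) * share) // (obj.max_connections * 4))

    return {
        "max_connections": str(obj.max_connections),
        "shared_buffers": f"{shared_buffers_mb}MB",
        "effective_cache_size": f"{effective_cache_size_mb}MB",
        "work_mem": f"{work_mem_mb}MB",
    }


if __name__ == "__main__":
    for name, value in suggest_settings(Objective(16384, 100, "oltp")).items():
        print(f"{name} = {value}")
```

A real self-managing system would consider far more parameters and would adjust them based on observed workload behavior; the sketch only shows the shape of the mapping from a handful of inputs to concrete settings.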
Manual database configuration can be error-prone, but self-managing databases can use AI models trained on thousands of databases to apply the optimal configuration in any given situation. They also incorporate guardrails carefully engineered to ensure that databases run reliably and efficiently, regardless of where they are deployed. Self-managing databases also allow your team to spend time on their most valuable initiatives rather than on tuning, troubleshooting, or mundane maintenance.
Self-managing databases invite comparison to self-driving vehicles. At CrystalDB, we view this connection as both technical and metaphorical. We adapt the mechanisms and engineering methodologies that allow self-driving vehicles to navigate complex urban environments safely, and we apply them to keep databases running efficiently and reliably. The parallels between self-managing databases and self-driving vehicles can also help us think about the difference between self-managing databases and managed services.
In this framing, using a managed database service is like leasing a new car that comes with roadside assistance and a dealer network to take care of all your routine maintenance. Getting from point A to point B still requires driving skill, knowledge of the city, patience, and energy. You are spared the effort of crawling under the vehicle to change the oil, and you will not be found crouching by the side of the road changing a flat tire, but the day-to-day work of driving still falls on you.
A self-managing database is like a self-driving robotaxi, which takes instructions in the form “get me from point A to point B” and then computes a route and does the driving. The database allows a user to define a high-level objective—such as “optimize for cost while maintaining 99.99% availability”—and then figures out everything that needs to happen to attain that objective.
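As a rough illustration of what an objective like this implies, the sketch below picks the cheapest deployment that meets an availability target. Everything in it is an assumption made for illustration: the `Plan` class, the helper names, the prices, and the per-node availability figures are hypothetical, and the model treats node failures as independent, which real systems only approximate. It is not a description of how CrystalDB evaluates objectives.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Plan:
    """Hypothetical deployment option with made-up numbers."""
    name: str
    node_availability: float  # assumed availability of a single node
    nodes: int                # redundant nodes; any one of them can serve
    monthly_cost: float


def allowed_downtime_minutes_per_year(target: float) -> float:
    """Downtime budget implied by an availability target (365-day year)."""
    return (1.0 - target) * 365 * 24 * 60


def plan_availability(plan: Plan) -> float:
    """Availability when the plan is up if at least one node is up,
    assuming independent node failures (a simplification)."""
    return 1.0 - (1.0 - plan.node_availability) ** plan.nodes


def cheapest_plan(plans: Sequence[Plan], target: float) -> Optional[Plan]:
    """Return the lowest-cost plan that meets the target, or None if none does."""
    feasible = [p for p in plans if plan_availability(p) >= target]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p.monthly_cost)


if __name__ == "__main__":
    candidates = [
        Plan("single-node", 0.999, 1, 100.0),
        Plan("two-node", 0.999, 2, 210.0),
        Plan("three-node", 0.999, 3, 320.0),
    ]
    choice = cheapest_plan(candidates, target=0.9999)
    print(choice.name if choice else "no plan meets the target")
```

Under these assumptions, a 99.99% target leaves a budget of a little under an hour of downtime per year, which a single node at 99.9% cannot meet but a redundant pair can.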
You can have your pick when it comes to combining self-managing databases and managed database services: Choose one, the other, or both. In a traditional on-prem context, a database can be augmented with self-managing capabilities to automate administrative tasks such as tuning or indexing. In a cloud context, self-managing database technology can form a layer on top of a managed database service. The combination of a managed database service (DBaaS) with self-management is sometimes called a self-managing DBaaS.
This combination can be viewed as the next phase in an evolution. To see this, consider what it takes to run a database in the cloud without the help of a managed service. Using the cloud’s infrastructure as a service (IaaS) model allows you to offload the ownership and maintenance of physical hardware, networks, and storage. To run a database using IaaS, you start with an operating system and build up from there. The managed database service provides a layer on top of IaaS, allowing you to offload additional concerns, namely those that are generic across workloads and customers. Adding self-management as another layer completes the administrative offload and can even help with application performance insights.
Area | Infrastructure as a Service (IaaS) | Database as a Service (DBaaS) | DBaaS with Self-Managing Capabilities (Self-Managing DBaaS) |
---|---|---|---|
Application-level SQL tuning | Customer Responsibility | Customer Responsibility | Shared Responsibility |
Workload-optimized configuration: indexing, connection settings, memory settings, vacuum tuning, and cost optimization | Customer Responsibility | Customer Responsibility | Provider Responsibility |
Generic configuration | Customer Responsibility | Provider Responsibility | Provider Responsibility |
Cloud infrastructure | Provider Responsibility | Provider Responsibility | Provider Responsibility |
Self-managing databases are distinct from managed database services. While both provide some form of automation, self-managing databases address concerns much closer to the application, concerns that are generally distinct for every workload. A self-managing database can run in the cloud, but it does not need to, while managed database services are tied closely to the cloud, usually to a specific cloud provider. The automation that managed database services provide is valuable yet largely generic. However, you can have both: a self-managing DBaaS. Combining the two maximizes the level of automation and allows you to offload the greatest amount of undifferentiated work.