
Blog Post

We need a new DevOps Database tool

Mark Wojciechowicz • May 23, 2021

Database deployments have had two dominant models over the past decade or more: a declarative model and a migration script approach. The major tools for the declarative model are SSDT (SQL Server Data Tools) and Biml by Varigence (I have no experience with the latter). The rest of the market is made up of the migration script model, with the leading tools being ReadyRoll and Liquibase. Both approaches have their own challenges. In the migration script paradigm:

  • The definition of the code is not the source of truth; that is buried in the accumulation of migration scripts, so viewing object-level differences in source control is impossible.
  • Scripts must be applied in the specific order in which they were checked in, which makes merge conflicts a common nuisance.
  • It's easy to write a bad script, and removing or undoing that change can be painful.
  • If there's schema drift in the target database, things can get messy quickly, either leaving the database in an unknown state or causing the deployment to fail altogether.
  • What do you do with functions and procs? Since differences are not compared during deployment, this code must be redeployed every time in case there are changes.
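To make the ordering and source-of-truth problems concrete, here is a minimal sketch of how a migration-script runner works, using an in-memory SQLite database. The script contents and the `schema_version` tracking table are hypothetical illustrations, not the format of any particular tool:

```python
import sqlite3

# Hypothetical migration scripts, keyed by version; in a real tool these
# would live as files in source control and accumulate over time.
MIGRATIONS = {
    1: "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE customer ADD COLUMN email TEXT",
}

def migrate(conn):
    # The runner records which scripts have been applied; the current schema
    # is only the accumulated side effect of every script run so far.
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    applied = {v for (v,) in conn.execute("SELECT version FROM schema_version")}
    for version in sorted(MIGRATIONS):  # order matters; scripts cannot be reordered
        if version not in applied:
            conn.execute(MIGRATIONS[version])
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))

conn = sqlite3.connect(":memory:")
migrate(conn)
cols = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
print(cols)  # ['id', 'name', 'email']
```

Note that nothing in source control states "customer has an email column"; you only learn that by replaying the scripts, which is exactly the source-of-truth complaint above.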


In the model-based approach:

  • External dependencies (database references) are awful to deal with on anything but a greenfield project. For example, database1 has a proc that updates data in a table in database2, and database2 has a view that selects from tables in both database1 and database2. Which database do you deploy first? The answer is neither: you cannot have circular references like this, so you need to fix your code.
  • The memory footprint of a large database project (and all of its referenced databases) can bring Visual Studio to its knees. It can take a long time to open the project and resolve all the references.
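A sketch of why circular database references block deployment: ordering databases by their references is a topological sort, and a cycle leaves no valid order. The dependency graph below is a hypothetical model of the two-database example above:

```python
# Hypothetical dependency graph: each database lists the databases it
# references. This mirrors the example above: database1 and database2
# each reference the other.
deps = {
    "database1": {"database2"},   # proc writes to a table in database2
    "database2": {"database1"},   # view reads from database1
}

def deploy_order(deps):
    """Topologically sort databases; raise if references are circular."""
    order, resolved = [], set()
    pending = dict(deps)
    while pending:
        # A database is deployable once everything it references is deployed.
        ready = [db for db, refs in pending.items() if refs <= resolved]
        if not ready:
            raise ValueError(f"circular reference among: {sorted(pending)}")
        for db in sorted(ready):
            order.append(db)
            resolved.add(db)
            del pending[db]
    return order

try:
    deploy_order(deps)
except ValueError as e:
    print(e)  # circular reference among: ['database1', 'database2']
```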


In both approaches, the team needs a lot of discipline. They need to adopt the new process and only apply changes this way, or whatever they deploy out of band will be overwritten. I happen to prefer the model-based approach because it addresses the schema drift issue, drawing the development team back into the practice of using the tool. SSDT is a fantastic tool with a million great features, including tracking object renames with automation to help facilitate them. Besides the two issues noted above, another big problem is that it only works with SQL Server. So what do we do about other relational databases, such as MySQL, Postgres, or Snowflake?


I think we need another tool. I think it's infeasible to try to do everything that SSDT does, but at the very least it should fill the model-based gap with these features in mind:

  • Provide a model-based approach for any popular relational database that follows ANSI standards
  • Schema compare between the project and a database, or between two live databases
  • Reverse engineer existing databases
  • Define the project as a collection of databases. This removes the issue of circular database references, because both databases can be deployed at the same time. It won't address real circular references; that's still on the developer.
  • Emphasize a small memory footprint, which will be even more challenging given the point noted above.
  • Default to best practices, such as using an additive approach, e.g. do not delete objects that are missing from the project, and ignore column order in table deployments.
  • Provide the other essentials:
      • Dependency tracking
      • Rename operations
      • Rollback deployment scripts on error
      • Pre- and post-deployment scripts
      • One-time migration scripts; a hybrid model allows for flexibility when you need it
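The schema compare and additive-default points above can be sketched as a simple model diff. The table-model format and names here are hypothetical, just to show the idea of comparing a project model against a live target without generating drops:

```python
# Hypothetical in-memory models: each maps table name -> list of
# (column, type) pairs, as a reverse-engineering step might produce.
project = {
    "customer": [("id", "INT"), ("name", "VARCHAR(100)"), ("email", "VARCHAR(255)")],
}
target = {
    "customer": [("id", "INT"), ("name", "VARCHAR(50)")],
    "audit_log": [("id", "INT")],   # drift: exists only in the target
}

def schema_compare(project, target):
    """Yield differences between the project model and a live database."""
    for table in sorted(project.keys() | target.keys()):
        if table not in target:
            yield ("create table", table)
        elif table not in project:
            # Additive default: report the drift, do not generate a DROP.
            yield ("not in project (ignored)", table)
        else:
            want, have = dict(project[table]), dict(target[table])
            for col in sorted(want.keys() | have.keys()):
                if col not in have:
                    yield ("add column", f"{table}.{col} {want[col]}")
                elif col not in want:
                    yield ("not in project (ignored)", f"{table}.{col}")
                elif want[col] != have[col]:
                    yield ("alter column", f"{table}.{col} -> {want[col]}")

for action, obj in schema_compare(project, target):
    print(action, obj)
```

Because the comparison is column-by-column against a model rather than a replay of scripts, the same logic handles drift detection, and deploying procs and functions only when they actually differ.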
