Greetings, Jazz Guitarists,

I've briefly talked about this with Markus, and he mentioned that the subject was already brought up by Tyson Clugg, but I think it deserves a proper discussion here.
I'm typing this from the comfort of the Django: Under the Hood sprints, so please excuse the poor grammar and the somewhat chaotic explanations that follow. I'm very tired and English is not my mother tongue. This is not a DEP but merely a stream of consciousness I'd love to get some feedback on.

Here are some of the problems we face when dealing with migrations:

1. Dependency resolution, which turns the migration dependency graph into an ordered list, happens every time you try to create or execute a migration. If you have several hundred migrations it becomes quite slow. I'm talking multiple minutes kind of slow. As you can imagine, working with multiple branches or perfecting your migrations quickly becomes a tedious task.

2. Dependency resolution is only stable as long as the migration set is frozen. Sometimes introducing a new migration is enough to break existing migrations by causing them to execute in a slightly different order. We often have to backtrack, edit existing migrations, and enforce a strict resolution order by introducing arbitrary dependencies.

3. Removing an app from a project is a nightmare. You can't migrate to the zero state unless the app is still there. There is no way to add "revert all migrations for app X" to the migration graph; it's something you need to run manually. There is no clean way to remove an app that was ever referenced in a relation. We were forced to do all kinds of hacks to get around this. Sometimes it's necessary to create an empty eggshell app with the same name and copy all migrations there, then add the necessary data migrations, and finally add migrations that remove all the models, indices, procedures etc. Sometimes people just leave a dead application in INSTALLED_APPS to avoid having to deal with this.

4. Squashing migrations is wonky at best.
If you create a model in one migration, alter one of its fields in another, and then finally drop the model sometime later, the squashed migration will have Django try to execute the alter first and complain about the table not being there. Also, the only reason we need to squash migrations is to prevent problem 1 above from becoming exponentially worse. If migrations were only as slow as the underlying SQL commands, we'd likely never squash them.

5. There's no simple way to roll back all the migrations introduced after a particular point in time, which would be very useful when working with multiple feature branches. In my current project, dropping the database means having to reimport over 200 MB of data snapshots. Switching branches requires me to look at branch diffs to determine which migrations to revert.

6. Conflict detection and resolution (makemigrations --merge) is a make-believe solution. It just trains people to execute the command without investigating whether their migration history still makes sense.

Some of these I need to dig deeper into and probably file proper tickets for. For example, I have an idea on how to fix 4 but it would make 1 even slower.

I spent some time taking a good long look at what other ORMs are doing. The graph-based dependency solving approach is rather uncommon. Most systems treat migrations as part of the project rather than of the packages it uses.

Possible solution (or "how I'd build it today if there were no existing code in Django core"):

a. Make migrations part of the project and not of individual apps. This takes care of problem 3 above.

b. Prefix individual migration files with a UTC timestamp (20161105151023_add_foo) to provide a strict sorting order. This removes the need for dependency solving and takes care of 1 and 2. With those gone, 4 becomes more or less obsolete, as squashing migrations would be pointless.

c. Have reusable apps provide migration templates that Django then copies to my project when "makemigrations" is run.

d. Maintain a separate directory for each database connection.

e. Execute all migrations in alphabetical order (which means by timestamp first). When an unapplied migration is followed by an applied one, ask whether to attempt to just apply it or whether the user wants to first unapply the migrations that came after it. To me this would work better than 6.

f. Migrating to a timestamp solves 5.

Of course we do have migration support in core and it's not compatible with most of the above list. Any ideas?

I think serializing the dependency solver state and reusing it between runs could be pretty low-hanging fruit (like "npm shrinkwrap" or yarn's lock file).
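
To make the idea of serializing the dependency solver state a bit more concrete, here is a rough Python sketch. It is not based on Django's actual migration loader: the graph is assumed to be a plain dict mapping a migration name to the names it depends on, and graph_fingerprint, resolve_order, load_or_resolve and the lock file layout are all made up for illustration. The only point is to show the "resolve once, reuse until the graph changes" shape of a lock file.

```python
import hashlib
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from pathlib import Path


def graph_fingerprint(graph):
    """Hash names and dependencies so any added, removed, or re-wired
    migration produces a different fingerprint."""
    canonical = json.dumps(
        {name: sorted(deps) for name, deps in graph.items()},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def resolve_order(graph):
    """Stand-in for the expensive step: turn the dependency graph
    into an ordered list with dependencies first."""
    return list(TopologicalSorter(graph).static_order())


def load_or_resolve(graph, lock_path):
    """Return (order, reused). The stored order is reused only when the
    fingerprint in the lock file matches the current graph."""
    fingerprint = graph_fingerprint(graph)
    lock_file = Path(lock_path)
    if lock_file.exists():
        lock = json.loads(lock_file.read_text())
        if lock.get("fingerprint") == fingerprint:
            return lock["order"], True
    order = resolve_order(graph)
    lock_file.write_text(
        json.dumps({"fingerprint": fingerprint, "order": order}, indent=2)
    )
    return order, False
```

Calling load_or_resolve(graph, "migrations.lock") twice with an unchanged graph would resolve on the first call and read the stored order on the second. Note that this only helps if collecting the names and dependencies is cheap compared to the resolution itself, which I have not measured, and that it does nothing for problem 2: as soon as the graph changes, the order is resolved from scratch again.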