Ruiqin - Re: backwards compatibility - I'm not sure, but my guess is that the major versions include breaking schema changes, so the upgraded DB would not also be backwards compatible with a 1.8.x cluster.
Matt - Here's the offline mode support in Airflow and the Alembic docs.

- https://github.com/apache/incubator-airflow/blob/f4f8027cbf61ce2ed6a9989facf6c99dffb12f66/airflow/migrations/env.py#L49-L66
- https://alembic.zzzcomputing.com/en/latest/offline.html

I haven't tested the two performance-wise but I would think online with nothing else going would be comparable.

*Taylor Edmiston*
Blog <https://blog.tedmiston.com/> | LinkedIn <https://www.linkedin.com/in/tedmiston/> | Stack Overflow <https://stackoverflow.com/users/149428/taylor-edmiston> | Developer Story <https://stackoverflow.com/story/taylor>

On Tue, Sep 25, 2018 at 11:00 PM, Matt Davis <jiffyc...@gmail.com> wrote:

> Good point about mentioning the database specifics, thanks. It's a Postgres
> 9.6.6 DB running in AWS RDS in an db.r3.large instance (2 vCPUs, 15 GB of
> RAM).
>
> Not sure what you mean by online/offline, but we timed the migrations in a
> test run against a database with nothing else going on at the time.
>
> - Matt
>
> On Tue, Sep 25, 2018 at 7:54 PM Ruiqin Yang <yrql...@gmail.com> wrote:
>
> > Thank you Taylor, the db-cleanup DAG is very nice! Got a question for you,
> > should we expect the DB migration to be backward compatible, i.e. would
> > 1.8.x cluster run fine with upgraded DB?
> >
> > Thank you!
> > Kevin Y
> >
> > On Tue, Sep 25, 2018 at 6:14 PM Taylor Edmiston <tedmis...@gmail.com> wrote:
> >
> > > I haven't done 1.8.x to 1.10.x in one go, but multiple hours seems long for
> > > running a handful of Alembic migrations on 10M rows. It might be worth
> > > noting if you're using MySQL or Postgres and how your db is hosted... I
> > > wonder if there's a bottleneck at play here.
> > >
> > > Also, are you running the migrations in online or offline mode?
> > >
> > > You may see a performance improvement if you collapse all migrations into
> > > one then apply that (https://stackoverflow.com/a/34492022/149428).
> > >
> > > I prefer to keep all of my metadata in place personally, but the db-cleanup
> > > DAG in https://github.com/teamclairvoyant/airflow-maintenance-dags has been
> > > brought up before.
> > >
> > > T
> > >
> > > *Taylor Edmiston*
> > > Blog <https://blog.tedmiston.com/> | LinkedIn
> > > <https://www.linkedin.com/in/tedmiston/> | Stack Overflow
> > > <https://stackoverflow.com/users/149428/taylor-edmiston> | Developer Story
> > > <https://stackoverflow.com/story/taylor>
> > >
> > > On Tue, Sep 25, 2018 at 8:30 PM, Sid Anand <san...@apache.org> wrote:
> > >
> > > > I checked with our Ops guy and he mentioned that when he upgraded from
> > > > 1.8.x to 1.9.x, it took a few seconds. We had 3M rows in the task_instance
> > > > table and run MySQL 5.7.
> > > >
> > > > -s
> > > >
> > > > On Tue, Sep 25, 2018 at 4:54 PM Matt Davis <jiffyc...@gmail.com> wrote:
> > > >
> > > > > Hi folks,
> > > > >
> > > > > Here at Clover we're excitedly migrating to Airflow 1.10 (thanks for
> > > > > everyone's hard work on that!). We're finding that it's taking about 2
> > > > > hours to apply all the migrations to go from Airflow 1.8 to 1.10, largely
> > > > > driven by the 10 million rows in our task_instance table. That got us
> > > > > wondering what kind of maintenance people do on their Airflow metadata
> > > > > databases. Do folks mostly put up with long migrations and generally longer
> > > > > queries, or are y'all doing periodic cleanups of your metadata DB to keep
> > > > > it fairly light?
> > > > >
> > > > > Thanks,
> > > > > Matt Davis
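
For anyone weighing the periodic-cleanup option raised in this thread, here is a rough sketch of the general idea: delete old task_instance rows in small batches so that no single transaction gets huge. This is not the db-cleanup DAG from airflow-maintenance-dags and not anything Airflow ships. The function name purge_old_task_instances, the 90-day retention, the batch size, and the use of SQLite with ISO-8601 text dates are all assumptions made only to keep the example self-contained; a real Postgres or MySQL metadata DB would need its own SQL and a backup first.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_old_task_instances(conn, max_age_days=90, batch_size=1000, now=None):
    """Delete task_instance rows older than max_age_days, one batch at a time.

    Hypothetical helper: assumes a table named task_instance with an
    execution_date column stored as ISO-8601 text (SQLite stand-in).
    Returns the total number of rows deleted.
    """
    now = now or datetime.now()
    cutoff = (now - timedelta(days=max_age_days)).isoformat()
    total = 0
    while True:
        # Delete at most batch_size rows per statement so each
        # transaction stays short.
        cur = conn.execute(
            "DELETE FROM task_instance WHERE rowid IN ("
            "SELECT rowid FROM task_instance "
            "WHERE execution_date < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

Rows exactly at the cutoff are kept, since the comparison is strict. Whether trimming history like this is acceptable depends on how much of the metadata you rely on, which is the trade-off Taylor mentions above.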