[ https://issues.apache.org/jira/browse/AIRFLOW-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384225#comment-16384225 ]
Joy Gao commented on AIRFLOW-1642: ---------------------------------- This one fell off my radar, I do have a PR out for it [https://github.com/apache/incubator-airflow/pull/2632] but never got merged :( > An Alembic script not using scoped session causing deadlock > ----------------------------------------------------------- > > Key: AIRFLOW-1642 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1642 > Project: Apache Airflow > Issue Type: Bug > Reporter: Joy Gao > Priority: Minor > > The bug I'm about to describe is a more of an obscure edge case, however I > think it's something still worth fixing. > After upgrading to airflow 1.9, while running `airflow resetdb` on my local > machine (with mysql), I encountered a deadlock on the final alembic revision > _d2ae31099d61 Increase text size for MySQL (not relevant for other DBs' text > types)_. > The deadlock turned out to be caused by another earlier session that was > created and left open in revision _cc1e65623dc7 add max tries column to task > instance_. Notably the code below: > {code} > sessionmaker = sa.orm.sessionmaker() > session = sessionmaker(bind=connection) > dagbag = DagBag(settings.DAGS_FOLDER) > {code} > The session created here was not a `scoped_session`, so when the DAGs were > being parsed in line 3 above, one of the DAG files makes a direct call to the > class method `Variable.get()` to acquire an env variable, which makes a db > query to the `variable` table, but raised a KeyError as the env variable was > non-existent, thus holding the lock to the `variable` table as a result of > that exception. > Later on, the latter alembic script `_cc1e65623dc7` needs to alter the > `Variable` table. Instead of creating its own Session object, it attempts to > reuse the same one as above. And because of the exception, it waits > indefinitely to acquire the lock on that table. > So the DAG file itself could have avoided the KeyError by providing a default > value when calling Variable.get(). However I think it would be a good idea to > avoid using unscoped sessions in general, as an exception could potentially > occur in the future elsewhere. The easiest fix is replacing *session = > sessionmaker(bind=connection)* with *session = settings.Session()*, which is > scoped. However, making a change on a migration script is going to make folks > anxious. > If anyone have any thoughts on this, let me know! Thanks :) -- This message was sent by Atlassian JIRA (v7.6.3#76005)