[ https://issues.apache.org/jira/browse/AIRFLOW-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005910#comment-17005910 ]
zorro.shi commented on AIRFLOW-2516: ------------------------------------ Hi [~higrys] I had a similar issue happening at 1.10.6. The following is a deadlock log that may help you ------------------------ LATEST DETECTED DEADLOCK ------------------------ 2019-12-31 01:00:35 0x7fa024527700 *** (1) TRANSACTION: TRANSACTION 12953238, ACTIVE 0 sec starting index read mysql tables in use 3, locked 3 LOCK WAIT 16 lock struct(s), heap size 1136, 26 row lock(s) MySQL thread id 4441025, OS thread handle 140324389902080, query id 132956673 iZwz9ds6obifnanjsnb2gqZ 172.18.38.223 airflow Sending data UPDATE task_instance, (SELECT task_instance.try_number AS try_number, task_instance.task_id AS task_id, task_instance.dag_id AS dag_id, task_instance.execution_date AS execution_date, task_instance.start_date AS start_date, task_instance.end_date AS end_date, task_instance.duration AS duration, task_instance.state AS state, task_instance.max_tries AS max_tries, task_instance.hostname AS hostname, task_instance.unixname AS unixname, task_instance.job_id AS job_id, task_instance.pool AS pool, task_instance.queue AS queue, task_instance.priority_weight AS priority_weight, task_instance.operator AS operator, task_instance.queued_dttm AS queued_dttm, task_instance.pid AS pid, task_instance.executor_config AS executor_config FROM task_instance LEFT OUTER JOIN dag_run ON task_instance.dag_id = dag_run.dag_id AND task_instance.execution_date = dag_run.execution_date WHERE task_in *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 122 page no 39 n bits 160 index PRIMARY of table `airflow`.`task_instance` trx id 12953238 lock_mode X locks rec but not gap waiting Record lock, heap no 88 PHYSICAL RECORD: n_fields 21; compact format; info bits 0 0: len 24; hex 776d735f6e6b635f7072696e74696e675f686973746f7279; asc wms_nkc_printing_history;; 1: len 19; hex 776d735f617263686976652e776d735f6e6b63; asc wms_archive.wms_nkc;; 2: len 7; hex 5e08db90000000; asc ^ ;; 3: len 6; hex 000000c5a694; asc ;; 4: len 7; hex 3b000001870928; asc ; (;; 5: len 7; hex 5e0a2d22050d7b; asc ^ -" {;; 6: len 7; hex 5e0a2d22050d88; asc ^ -" ;; 7: SQL NULL; 8: len 6; hex 717565756564; asc queued;; 9: len 4; hex 80000000; asc ;; 10: len 0; hex ; asc ;; 11: len 4; hex 726f6f74; asc root;; 12: SQL NULL; 13: len 12; hex 64656661756c745f706f6f6c; asc default_pool;; 14: len 7; hex 64656661756c74; asc default;; 15: len 4; hex 80000002; asc ;; 16: SQL NULL; 17: len 7; hex 5e0a2d33061021; asc ^ -3 !;; 18: SQL NULL; 19: len 4; hex 80000000; asc ;; 20: len 5; hex 80057d942e; asc } .;; *** (2) TRANSACTION: TRANSACTION 12953236, ACTIVE 0 sec updating or deleting, thread declared inside InnoDB 4999 mysql tables in use 1, locked 1 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1 MySQL thread id 4445271, OS thread handle 140325780879104, query id 132956674 iZwz9ds6obifnanjsnb2gqZ 172.18.38.223 airflow updating UPDATE task_instance SET state='queued', queued_dttm='2019-12-30 17:00:35.397345' WHERE task_instance.task_id = 'wms_nkc_printing_history' AND task_instance.dag_id = 'wms_archive.wms_nkc' AND task_instance.execution_date = '2019-12-29 17:00:00' *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 122 page no 39 n bits 160 index PRIMARY of table `airflow`.`task_instance` trx id 12953236 lock_mode X locks rec but not gap Record lock, heap no 88 PHYSICAL RECORD: n_fields 21; compact format; info bits 0 0: len 24; hex 776d735f6e6b635f7072696e74696e675f686973746f7279; asc wms_nkc_printing_history;; 1: len 19; hex 776d735f617263686976652e776d735f6e6b63; asc wms_archive.wms_nkc;; 2: len 7; hex 5e08db90000000; asc ^ ;; 3: len 6; hex 000000c5a694; asc ;; 4: len 7; hex 3b000001870928; asc ; (;; 5: len 7; hex 5e0a2d22050d7b; asc ^ -" {;; 6: len 7; hex 5e0a2d22050d88; asc ^ -" ;; 7: SQL NULL; 8: len 6; hex 717565756564; asc queued;; 9: len 4; hex 80000000; asc ;; 10: len 0; hex ; asc ;; 11: len 4; hex 726f6f74; asc root;; 12: SQL NULL; 13: len 12; hex 64656661756c745f706f6f6c; asc default_pool;; 14: len 7; hex 64656661756c74; asc default;; 15: len 4; hex 80000002; asc ;; 16: SQL NULL; 17: len 7; hex 5e0a2d33061021; asc ^ -3 !;; 18: SQL NULL; 19: len 4; hex 80000000; asc ;; 20: len 5; hex 80057d942e; asc } .;; *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 122 page no 17 n bits 320 index ti_dag_state of table `airflow`.`task_instance` trx id 12953236 lock_mode X locks rec but not gap waiting Record lock, heap no 235 PHYSICAL RECORD: n_fields 4; compact format; info bits 0 0: len 19; hex 776d735f617263686976652e776d735f6e6b63; asc wms_archive.wms_nkc;; 1: len 9; hex 7363686564756c6564; asc scheduled;; 2: len 24; hex 776d735f6e6b635f7072696e74696e675f686973746f7279; asc wms_nkc_printing_history;; 3: len 7; hex 5e08db90000000; asc ^ ;; *** WE ROLL BACK TRANSACTION (2) ---------------------- Airflow exception log ---------------------- Traceback (most recent call last): File "/usr/local/python38/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 930, in _run_raw_task result = task_copy.execute(context=context) File "/usr/local/python38/lib/python3.8/site-packages/airflow/operators/subdag_operator.py", line 100, in execute self.subdag.run( File "/usr/local/python38/lib/python3.8/site-packages/airflow/models/dag.py", line 1284, in run job.run() File "/usr/local/python38/lib/python3.8/site-packages/airflow/jobs/base_job.py", line 222, in run self._execute() File "/usr/local/python38/lib/python3.8/site-packages/airflow/utils/db.py", line 74, in wrapper return func(*args, **kwargs) File "/usr/local/python38/lib/python3.8/site-packages/airflow/jobs/backfill_job.py", line 764, in _execute self._execute_for_run_dates(run_dates=dates_to_process, File "/usr/local/python38/lib/python3.8/site-packages/airflow/utils/db.py", line 70, in wrapper return func(*args, **kwargs) File "/usr/local/python38/lib/python3.8/site-packages/airflow/jobs/backfill_job.py", line 694, in _execute_for_run_dates processed_dag_run_dates = self._process_backfill_task_instances( File "/usr/local/python38/lib/python3.8/site-packages/airflow/utils/db.py", line 70, in wrapper return func(*args, **kwargs) File "/usr/local/python38/lib/python3.8/site-packages/airflow/jobs/backfill_job.py", line 586, in _process_backfill_task_instances _per_task_process(task, key, ti) File "/usr/local/python38/lib/python3.8/site-packages/airflow/utils/db.py", line 74, in wrapper return func(*args, **kwargs) File "/usr/local/python38/lib/python3.8/site-packages/airflow/jobs/backfill_job.py", line 508, in _per_task_process session.commit() File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 1027, in commit self.transaction.commit() File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 494, in commit self._prepare_impl() File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 473, in _prepare_impl self.session.flush() File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2470, in flush self._flush(objects) File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2608, in _flush transaction.rollback(_capture_exception=True) File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__ compat.reraise(exc_type, exc_value, exc_tb) File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 153, in reraise raise value File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2568, in _flush flush_context.execute() File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute rec.execute(self) File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 586, in execute persistence.save_obj( File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 230, in save_obj _emit_update_statements( File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 995, in _emit_update_statements c = cached_connections[connection].execute( File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 982, in execute return meth(self, multiparams, params) File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection return connection._execute_clauseelement(self, multiparams, params) File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1095, in _execute_clauseelement ret = self._execute_context( File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1249, in _execute_context self._handle_dbapi_exception( File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception util.raise_from_cause(sqlalchemy_exception, exc_info) File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause reraise(type(exception), exception, tb=exc_tb, cause=cause) File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 152, in reraise raise value.with_traceback(tb) File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context self.dialect.do_execute( File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute cursor.execute(statement, parameters) File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py", line 255, in execute self.errorhandler(self, exc, value) File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/connections.py", line 50, in defaulterrorhandler raise errorvalue File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py", line 252, in execute res = self._query(query) File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py", line 378, in _query db.query(q) File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/connections.py", line 280, in query _mysql.connection.query(self, query) sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') [SQL: UPDATE task_instance SET state=%s, queued_dttm=%s WHERE task_instance.task_id = %s AND task_instance.dag_id = %s AND task_instance.execution_date = %s] [parameters: ('queued', datetime.datetime(2019, 12, 30, 17, 0, 35, 397345, tzinfo=<Timezone [UTC]>), 'wms_nkc_printing_history', 'wms_archive.wms_nkc', datetime.datetime(2019, 12, 29, 17, 0, tzinfo=<Timezone [UTC]>))] > Deadlock found when trying to update task_instance table > -------------------------------------------------------- > > Key: AIRFLOW-2516 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2516 > Project: Apache Airflow > Issue Type: Bug > Components: DagRun > Affects Versions: 1.8.0 > Reporter: Jeff Liu > Priority: Major > Attachments: Screenshot 2019-12-30 at 10.42.52.png, > image-2019-12-30-10-48-41-313.png, image-2019-12-30-10-58-02-610.png, jobs.py > > > > > {code:java} > [2018-05-23 17:59:57,218] {base_task_runner.py:98} INFO - Subtask: > [2018-05-23 17:59:57,217] {base_executor.py:49} INFO - Adding to queue: > airflow run production_wipeout_wipe_manager.Carat Carat_20180227 > 2018-05-23T17:41:18.815809 --local -sd DAGS_FOLDER/wipeout/wipeout.py > [2018-05-23 17:59:57,231] {base_task_runner.py:98} INFO - Subtask: Traceback > (most recent call last): > [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/bin/airflow", line 27, in <module> > [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: > args.func(args) > [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/airflow/bin/cli.py", line 392, in run > [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: > pool=args.pool, > [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in > wrapper > [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: result = > func(*args, **kwargs) > [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1532, in > _run_raw_task > [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: > self.handle_failure(e, test_mode, context) > [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1641, in > handle_failure > [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: > session.merge(self) > [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line > 1920, in merge > [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: > _resolve_conflict_map=_resolve_conflict_map) > [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line > 1974, in _merge > [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: merged = > self.query(mapper.class_).get(key[1]) > [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 882, > in get > [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: ident, > loading.load_on_pk_identity) > [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 952, > in _get_impl > [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: return > db_load_fn(self, primary_key_identity) > [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", line 247, > in load_on_pk_i > dentity > [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: return > q.one() > [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2884, > in one > [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: ret = > self.one_or_none() > [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2854, > in one_or_none > [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: ret = > list(self) > [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2925, > in __iter__ > [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: return > self._execute_and_instances(context) > [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2946, > in _execute_and_instances > [2018-05-23 17:59:57,240] {base_task_runner.py:98} INFO - Subtask: > close_with_result=True) > [2018-05-23 17:59:57,240] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2955, > in _get_bind_ar > s > [2018-05-23 17:59:57,240] {base_task_runner.py:98} INFO - Subtask: **kw > [2018-05-23 17:59:57,240] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2937, > in _connection_f > rom_session > [2018-05-23 17:59:57,241] {base_task_runner.py:98} INFO - Subtask: conn = > self.session.connection(**kw) > [2018-05-23 17:59:57,241] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line > 1035, in connection > [2018-05-23 17:59:57,241] {base_task_runner.py:98} INFO - Subtask: > execution_options=execution_options) > [2018-05-23 17:59:57,241] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line > 1040, in _connection > _for_bind > [2018-05-23 17:59:57,242] {base_task_runner.py:98} INFO - Subtask: engine, > execution_options) > [2018-05-23 17:59:57,242] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 388, > in _connection_ > for_bind > [2018-05-23 17:59:57,242] {base_task_runner.py:98} INFO - Subtask: > self._assert_active() > [2018-05-23 17:59:57,243] {base_task_runner.py:98} INFO - Subtask: File > "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 276, > in _assert_acti > ve > [2018-05-23 17:59:57,243] {base_task_runner.py:98} INFO - Subtask: % > self._rollback_exception > [2018-05-23 17:59:57,244] {base_task_runner.py:98} INFO - Subtask: > sqlalchemy.exc.InvalidRequestError: This Session's transaction has been > rolled back due to a previou > s exception during flush. To begin a new transaction with this Session, first > issue Session.rollback(). Original exception was: > (_mysql_exceptions.OperationalError) (1 > 213, 'Deadlock found when trying to get lock; try restarting transaction') > [SQL: u'UPDATE task_instance SET state=%s WHERE task_instance.task_id = %s > AND task_instance > .dag_id = %s AND task_instance.execution_date = %s'] [parameters: (u'queued', > 'Carat_20180227', 'production_wipeout_wipe_manager.Carat', > datetime.datetime(2018, 5, 23, > 17, 41, 18, 815809))] (Background on this error at: > http://sqlalche.me/e/e3q8){code} -- This message was sent by Atlassian Jira (v8.3.4#803005)