stablum edited a comment on issue #19957:
URL: https://github.com/apache/airflow/issues/19957#issuecomment-1079965417


   Unfortunately I'm still experiencing this bug with Airflow 2.2.4:
   
   ```
   [2022-03-27 16:14:12,804] {scheduler_job.py:433} INFO - Setting the 
following tasks to queued state:
        
   [2022-03-27 16:14:12,805] {scheduler_job.py:527} INFO - Executor reports 
execution of download_and_preprocess_sets.download_1544 
run_id=manual__2022-03-24T13:43:29.617461+00:00 exited with status success for 
try_number 2
   [2022-03-27 16:14:12,809] {scheduler_job.py:570} INFO - TaskInstance 
Finished: dag_id=download_and_preprocess_sets, task_id=download_1544, 
run_id=manual__2022-03-24T13:43:29.617461+00:00, run_start_date=2022-03-27 
16:13:00.591762+00:00, run_end_date=2022-03-27 16:13:37.437128+00:00, 
run_duration=36.845366, state=success, executor_state=success, try_number=2, 
max_tries=2, job_id=10383, pool=default_pool, queue=default, 
priority_weight=59, operator=PythonOperator
   [2022-03-27 16:14:12,829] {scheduler_job.py:1137} INFO - Resetting orphaned 
tasks for active dag runs
   [2022-03-27 16:14:12,830] {scheduler_job.py:1160} INFO - Marked 1 
SchedulerJob instances as failed
   [2022-03-27 16:14:12,835] {scheduler_job.py:1201} INFO - Reset the following 
1 orphaned TaskInstances:
        <TaskInstance: download_and_preprocess_sets.download_1546 
manual__2022-03-24T13:43:29.617461+00:00 [running]>
   [2022-03-27 16:14:45,853] {scheduler_job.py:667} ERROR - Exception when 
executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1276, in _execute_context
       self.dialect.do_execute(
     File 
"/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/default.py", line 
608, in do_execute
       cursor.execute(statement, parameters)
   psycopg2.errors.DeadlockDetected: deadlock detected
   DETAIL:  Process 309686 waits for ShareLock on transaction 12660839; blocked 
by process 314699.
   Process 314699 waits for ShareLock on transaction 12660838; blocked by 
process 309686.
   HINT:  See server log for query details.
   CONTEXT:  while updating tuple (4928,21) in relation "task_instance"
   
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File 
"/usr/local/lib/python3.9/dist-packages/airflow/jobs/scheduler_job.py", line 
651, in _execute
       self._run_scheduler_loop()
     File 
"/usr/local/lib/python3.9/dist-packages/airflow/jobs/scheduler_job.py", line 
732, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File 
"/usr/local/lib/python3.9/dist-packages/airflow/jobs/scheduler_job.py", line 
815, in _do_scheduling
       callback_to_run = self._schedule_dag_run(dag_run, session)
     File 
"/usr/local/lib/python3.9/dist-packages/airflow/jobs/scheduler_job.py", line 
1072, in _schedule_dag_run
       dag_run.schedule_tis(schedulable_tis, session)
     File "/usr/local/lib/python3.9/dist-packages/airflow/utils/session.py", 
line 67, in wrapper
       return func(*args, **kwargs)
     File "/usr/local/lib/python3.9/dist-packages/airflow/models/dagrun.py", 
line 901, in schedule_tis
       session.query(TI)
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/query.py", 
line 4063, in update
       update_op.exec_()
     File 
"/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/persistence.py", line 
1697, in exec_
       self._do_exec()
     File 
"/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/persistence.py", line 
1895, in _do_exec
       self._execute_stmt(update_stmt)
     File 
"/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/persistence.py", line 
1702, in _execute_stmt
       self.result = self.query._execute_crud(stmt, self.mapper)
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/query.py", 
line 3568, in _execute_crud
       return conn.execute(stmt, self._params)
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1011, in execute
       return meth(self, multiparams, params)
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/sql/elements.py", 
line 298, in _execute_on_connection
       return connection._execute_clauseelement(self, multiparams, params)
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1124, in _execute_clauseelement
       ret = self._execute_context(
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1316, in _execute_context
       self._handle_dbapi_exception(
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1510, in _handle_dbapi_exception
       util.raise_(
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/util/compat.py", 
line 182, in raise_
       raise exception
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1276, in _execute_context
       self.dialect.do_execute(
     File 
"/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/default.py", line 
608, in do_execute
       cursor.execute(statement, parameters)
   sqlalchemy.exc.OperationalError: (psycopg2.errors.DeadlockDetected) deadlock 
detected
   DETAIL:  Process 309686 waits for ShareLock on transaction 12660839; blocked 
by process 314699.
   Process 314699 waits for ShareLock on transaction 12660838; blocked by 
process 309686.
   HINT:  See server log for query details.
   CONTEXT:  while updating tuple (4928,21) in relation "task_instance"
   
   [SQL: UPDATE task_instance SET state=%(state)s WHERE task_instance.dag_id = 
%(dag_id_1)s AND task_instance.run_id = %(run_id_1)s AND task_instance.task_id 
IN (%(task_id_1)s)]
   [parameters: {'state': <TaskInstanceState.SCHEDULED: 'scheduled'>, 
'dag_id_1': 'download_and_preprocess_sets', 'run_id_1': 
'manual__2022-03-24T13:43:29.617461+00:00', 'task_id_1': 'download_1546'}]
   (Background on this error at: http://sqlalche.me/e/13/e3q8)
   [2022-03-27 16:14:45,858] {local_executor.py:388} INFO - Shutting down 
LocalExecutor; waiting for running tasks to finish.  Signal again if you don't 
want to wait.
   [2022-03-27 16:14:46,970] {process_utils.py:120} INFO - Sending 
Signals.SIGTERM to group 309683. PIDs of all processes in the group: [309683]
   [2022-03-27 16:14:46,970] {process_utils.py:75} INFO - Sending the signal 
Signals.SIGTERM to group 309683
   [2022-03-27 16:14:47,022] {process_utils.py:70} INFO - Process 
psutil.Process(pid=309683, status='terminated', exitcode=0, started='16:08:59') 
(309683) terminated with exit code 0
   [2022-03-27 16:14:47,022] {scheduler_job.py:678} INFO - Exited execute loop
   Traceback (most recent call last):
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1276, in _execute_context
       self.dialect.do_execute(
     File 
"/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/default.py", line 
608, in do_execute
       cursor.execute(statement, parameters)
   psycopg2.errors.DeadlockDetected: deadlock detected
   DETAIL:  Process 309686 waits for ShareLock on transaction 12660839; blocked 
by process 314699.
   Process 314699 waits for ShareLock on transaction 12660838; blocked by 
process 309686.
   HINT:  See server log for query details.
   CONTEXT:  while updating tuple (4928,21) in relation "task_instance"
   
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/usr/local/bin/airflow", line 8, in <module>
       sys.exit(main())
     File "/usr/local/lib/python3.9/dist-packages/airflow/__main__.py", line 
48, in main
       args.func(args)
     File "/usr/local/lib/python3.9/dist-packages/airflow/cli/cli_parser.py", 
line 48, in command
       return func(*args, **kwargs)
     File "/usr/local/lib/python3.9/dist-packages/airflow/utils/cli.py", line 
92, in wrapper
       return f(*args, **kwargs)
     File 
"/usr/local/lib/python3.9/dist-packages/airflow/cli/commands/scheduler_command.py",
 line 75, in scheduler
       _run_scheduler_job(args=args)
     File 
"/usr/local/lib/python3.9/dist-packages/airflow/cli/commands/scheduler_command.py",
 line 46, in _run_scheduler_job
       job.run()
     File "/usr/local/lib/python3.9/dist-packages/airflow/jobs/base_job.py", 
line 246, in run
       self._execute()
     File 
"/usr/local/lib/python3.9/dist-packages/airflow/jobs/scheduler_job.py", line 
651, in _execute
       self._run_scheduler_loop()
     File 
"/usr/local/lib/python3.9/dist-packages/airflow/jobs/scheduler_job.py", line 
732, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File 
"/usr/local/lib/python3.9/dist-packages/airflow/jobs/scheduler_job.py", line 
815, in _do_scheduling
       callback_to_run = self._schedule_dag_run(dag_run, session)
     File 
"/usr/local/lib/python3.9/dist-packages/airflow/jobs/scheduler_job.py", line 
1072, in _schedule_dag_run
       dag_run.schedule_tis(schedulable_tis, session)
     File "/usr/local/lib/python3.9/dist-packages/airflow/utils/session.py", 
line 67, in wrapper
       return func(*args, **kwargs)
     File "/usr/local/lib/python3.9/dist-packages/airflow/models/dagrun.py", 
line 901, in schedule_tis
       session.query(TI)
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/query.py", 
line 4063, in update
       update_op.exec_()
     File 
"/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/persistence.py", line 
1697, in exec_
       self._do_exec()
     File 
"/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/persistence.py", line 
1895, in _do_exec
       self._execute_stmt(update_stmt)
     File 
"/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/persistence.py", line 
1702, in _execute_stmt
       self.result = self.query._execute_crud(stmt, self.mapper)
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/query.py", 
line 3568, in _execute_crud
       return conn.execute(stmt, self._params)
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1011, in execute
       return meth(self, multiparams, params)
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/sql/elements.py", 
line 298, in _execute_on_connection
       return connection._execute_clauseelement(self, multiparams, params)
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1124, in _execute_clauseelement
       ret = self._execute_context(
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1316, in _execute_context
       self._handle_dbapi_exception(
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1510, in _handle_dbapi_exception
       util.raise_(
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/util/compat.py", 
line 182, in raise_
       raise exception
     File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py", 
line 1276, in _execute_context
       self.dialect.do_execute(
     File 
"/usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/default.py", line 
608, in do_execute
       cursor.execute(statement, parameters)
   sqlalchemy.exc.OperationalError: (psycopg2.errors.DeadlockDetected) deadlock 
detected
   DETAIL:  Process 309686 waits for ShareLock on transaction 12660839; blocked 
by process 314699.
   Process 314699 waits for ShareLock on transaction 12660838; blocked by 
process 309686.
   HINT:  See server log for query details.
   CONTEXT:  while updating tuple (4928,21) in relation "task_instance"
   
   [SQL: UPDATE task_instance SET state=%(state)s WHERE task_instance.dag_id = 
%(dag_id_1)s AND task_instance.run_id = %(run_id_1)s AND task_instance.task_id 
IN (%(task_id_1)s)]
   [parameters: {'state': <TaskInstanceState.SCHEDULED: 'scheduled'>, 
'dag_id_1': 'download_and_preprocess_sets', 'run_id_1': 
'manual__2022-03-24T13:43:29.617461+00:00', 'task_id_1': 'download_1546'}]
   (Background on this error at: http://sqlalche.me/e/13/e3q8)
   ```
   I also set `concurrency=1` and `max_active_tasks=1`, but the bug keeps 
reappearing
   
   This is my PostgreSQL log:
   ```
   2022-03-27 15:37:25.192 UTC [273303] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 15:37:25.201 UTC [300223] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 15:37:25.203 UTC [298660] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 15:37:25.205 UTC [297039] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 15:37:25.209 UTC [301829] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 15:37:26.193 UTC [276786] airflow_user@airflow_db FATAL:  
terminating connection due to administrator command
   2022-03-27 15:39:22.382 UTC [304375] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 15:40:31.194 UTC [305340] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 15:41:38.554 UTC [306673] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 15:42:51.378 UTC [307629] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 15:43:23.546 UTC [303336] airflow_user@airflow_db ERROR:  
deadlock detected
   2022-03-27 15:43:23.546 UTC [303336] airflow_user@airflow_db DETAIL:  
Process 303336 waits for ShareLock on transaction 12660394; blocked by process 
309021.
        Process 309021 waits for ShareLock on transaction 12660383; blocked by 
process 303336.
        Process 303336: UPDATE task_instance SET state='scheduled' WHERE 
task_instance.dag_id = 'download_and_preprocess_sets' AND task_instance.run_id 
= 'manual__2022-03-24T13:43:29.617461+00:00' AND task_instance.task_id IN 
('parse_1381', 'download_1385')
        Process 309021: SELECT task_instance.try_number AS 
task_instance_try_number, task_instance.task_id AS task_instance_task_id, 
task_instance.dag_id AS task_instance_dag_id, task_instance.run_id AS 
task_instance_run_id, task_instance.start_date AS task_instance_start_date, 
task_instance.end_date AS task_instance_end_date, task_instance.duration AS 
task_instance_duration, task_instance.state AS task_instance_state, 
task_instance.max_tries AS task_instance_max_tries, task_instance.hostname AS 
task_instance_hostname, task_instance.unixname AS task_instance_unixname, 
task_instance.job_id AS task_instance_job_id, task_instance.pool AS 
task_instance_pool, task_instance.pool_slots AS task_instance_pool_slots, 
task_instance.queue AS task_instance_queue, task_instance.priority_weight AS 
task_instance_priority_weight, task_instance.operator AS 
task_instance_operator, task_instance.queued_dttm AS task_instance_queued_dttm, 
task_instance.queued_by_job_id AS task_instance_queued_by_job_id, task_
 instance.pid AS task_instance_pid, task_insta
   2022-03-27 15:43:23.546 UTC [303336] airflow_user@airflow_db HINT:  See 
server log for query details.
   2022-03-27 15:43:23.546 UTC [303336] airflow_user@airflow_db CONTEXT:  while 
updating tuple (4903,44) in relation "task_instance"
   2022-03-27 15:43:23.546 UTC [303336] airflow_user@airflow_db STATEMENT:  
UPDATE task_instance SET state='scheduled' WHERE task_instance.dag_id = 
'download_and_preprocess_sets' AND task_instance.run_id = 
'manual__2022-03-24T13:43:29.617461+00:00' AND task_instance.task_id IN 
('parse_1381', 'download_1385')
   2022-03-27 15:43:32.520 UTC [308612] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 15:43:33.596 UTC [303337] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 16:10:41.435 UTC [310498] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 16:11:54.228 UTC [311405] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 16:12:26.266 UTC [312643] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 16:13:37.542 UTC [312957] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 16:14:45.851 UTC [309686] airflow_user@airflow_db ERROR:  
deadlock detected
   2022-03-27 16:14:45.851 UTC [309686] airflow_user@airflow_db DETAIL:  
Process 309686 waits for ShareLock on transaction 12660839; blocked by process 
314699.
        Process 314699 waits for ShareLock on transaction 12660838; blocked by 
process 309686.
        Process 309686: UPDATE task_instance SET state='scheduled' WHERE 
task_instance.dag_id = 'download_and_preprocess_sets' AND task_instance.run_id 
= 'manual__2022-03-24T13:43:29.617461+00:00' AND task_instance.task_id IN 
('download_1546')
        Process 314699: SELECT task_instance.try_number AS 
task_instance_try_number, task_instance.task_id AS task_instance_task_id, 
task_instance.dag_id AS task_instance_dag_id, task_instance.run_id AS 
task_instance_run_id, task_instance.start_date AS task_instance_start_date, 
task_instance.end_date AS task_instance_end_date, task_instance.duration AS 
task_instance_duration, task_instance.state AS task_instance_state, 
task_instance.max_tries AS task_instance_max_tries, task_instance.hostname AS 
task_instance_hostname, task_instance.unixname AS task_instance_unixname, 
task_instance.job_id AS task_instance_job_id, task_instance.pool AS 
task_instance_pool, task_instance.pool_slots AS task_instance_pool_slots, 
task_instance.queue AS task_instance_queue, task_instance.priority_weight AS 
task_instance_priority_weight, task_instance.operator AS 
task_instance_operator, task_instance.queued_dttm AS task_instance_queued_dttm, 
task_instance.queued_by_job_id AS task_instance_queued_by_job_id, task_
 instance.pid AS task_instance_pid, task_insta
   2022-03-27 16:14:45.851 UTC [309686] airflow_user@airflow_db HINT:  See 
server log for query details.
   2022-03-27 16:14:45.851 UTC [309686] airflow_user@airflow_db CONTEXT:  while 
updating tuple (4928,21) in relation "task_instance"
   2022-03-27 16:14:45.851 UTC [309686] airflow_user@airflow_db STATEMENT:  
UPDATE task_instance SET state='scheduled' WHERE task_instance.dag_id = 
'download_and_preprocess_sets' AND task_instance.run_id = 
'manual__2022-03-24T13:43:29.617461+00:00' AND task_instance.task_id IN 
('download_1546')
   2022-03-27 16:14:45.958 UTC [314289] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   2022-03-27 16:14:47.017 UTC [309687] airflow_user@airflow_db LOG:  could not 
receive data from client: Connection reset by peer
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to