GitHub user rcrchawla created a discussion: Airflow task failed but spark kube
app is running
### Body
An Airflow task was marked as failed while the corresponding Spark Kubernetes
app was still running. The Spark app is long-running, typically around 1-2
hours. Many tasks run concurrently at that time, and the failure usually
happens between 02:30 and 03:45 UTC.
Q) What is causing the issue?
A) The Airflow task fails while the Spark Kubernetes app is still running.
Airflow version: **3.0.4**
Setup config:
- 2 API servers
- 2 workers
- 1 DAG processor
- 2 schedulers
Deployment: Helm chart deployment on Azure Kubernetes
Please see the logs below.
Worker logs:
-------------------------------------
2026-03-10 02:33:56.191330 [info ] Task
execute_workload[8cbabf91-009f-44a6-86d1-bef109c70341] succeeded in
2715.019189195242s: None [celery.app.trace]
2026-03-10 02:39:57.112078 [info ] Task finished
[supervisor] duration=1723.7576029417105 exit_code=0 final_state=success
2026-03-10 02:39:57.128929 [info ] Task
execute_workload[9b3f27ec-09b5-424e-8d5c-412e541f51e8] succeeded in
1723.8186896019615s: None [celery.app.trace]
2026-03-10 02:40:50.688403 [info ] Task finished
[supervisor] duration=744.0669570546597 exit_code=0 final_state=success
2026-03-10 02:40:50.705538 [info ] Task
execute_workload[b08ac31a-2ee7-4029-b897-753157b18475] succeeded in
744.139388079755s: None [celery.app.trace]
2026-03-10 02:42:11.649891 [info ] Task finished
[supervisor] duration=756.7588595808484 exit_code=0 final_state=success
2026-03-10 02:42:11.666368 [info ] Task
execute_workload[0351c271-194e-4e58-87e4-a9c224351ab1] succeeded in
756.8229349320754s: None [celery.app.trace]
2026-03-10 02:43:37.239128 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 1st time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:38.119304 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 1st time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:38.640468 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 1st time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:39.247588 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 1st time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:39.425843 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 1st time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:39.618220 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 1st time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:40.002999 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 1st time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:40.582177 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 1st time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:41.186771 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 1st time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:41.510710 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 1st time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:42.658853 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 2nd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:43.171303 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 2nd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:43.826966 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 2nd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:44.330891 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 2nd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:44.874859 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 2nd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:44.922591 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 2nd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:45.866775 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 2nd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:46.194974 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 2nd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:46.482845 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 2nd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:46.750792 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 2nd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:48.198838 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 3rd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:48.462121 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 3rd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:49.749467 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 3rd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:50.029438 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 3rd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:50.834835 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 3rd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:51.334847 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 3rd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:51.431052 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 3rd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:51.537615 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 3rd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:52.567197 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 3rd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:52.967177 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 3rd time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:53.615078 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 4th time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:54.513959 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 4th time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:56.442819 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 4th time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:57.527549 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 4th time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:57.765172 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 4th time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:57.982839 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 4th time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:58.099625 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 4th time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:58.534632 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 4th time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:59.007106 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 4th time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:59.947380 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 4th time calling it.
[airflow.sdk.api.client]
2026-03-10 02:44:02.200313 [warning ] Failed to send heartbeat. Will be
retried [supervisor] failed_heartbeats=1 max_retries=3
ti_id=UUID('019cd54c-28b0-7e18-9a7b-71ba469bf545')
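To make the retry burst in the worker log easier to read, the warnings can be tallied by attempt number. The snippet below is a minimal sketch, not part of Airflow: `count_retry_attempts` is a hypothetical helper name, and the regular expression assumes the exact message format shown in the pasted log, including its line wrapping.

```python
import re
from collections import Counter

# Matches the retry warnings from the worker log above.
# \s+ tolerates the line breaks introduced when the log was pasted.
RETRY_RE = re.compile(
    r"\[warning\s*\]\s+Starting call to\s+"
    r"'airflow\.sdk\.api\.client\.Client\.request',\s+"
    r"this is the (\d+)(?:st|nd|rd|th) time"
)


def count_retry_attempts(log_text: str) -> Counter:
    """Count how many warnings were logged for each attempt number.

    Hypothetical helper for illustration only; it is not an Airflow API.
    """
    return Counter(int(m.group(1)) for m in RETRY_RE.finditer(log_text))


# Two entries copied from the worker log above, used as sample input.
sample = """\
2026-03-10 02:43:37.239128 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 1st time calling it.
[airflow.sdk.api.client]
2026-03-10 02:43:42.658853 [warning ] Starting call to
'airflow.sdk.api.client.Client.request', this is the 2nd time calling it.
[airflow.sdk.api.client]
"""

if __name__ == "__main__":
    for attempt, count in sorted(count_retry_attempts(sample).items()):
        print(f"attempt {attempt}: {count} warning(s)")
```

In the worker excerpt above, there are ten warnings for each of attempts 1 through 4 between 02:43:37 and 02:43:59, followed by the failed heartbeat at 02:44:02. That pattern would be consistent with about ten requests being retried in parallel, though the log alone does not confirm it.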
API server logs:
----------------------
2026-03-10 02:45:23 [debug ] Retrieved current task state
current_hostname=airflow-worker-1.airflow-worker.de-services.svc.cluster.local
current_pid=152133 state=running ti_id=019cd542-0d3e-7467-9f7a-4dfc2d7f0017
2026-03-10 02:45:23 [debug ] Retrieved current task state
current_hostname=airflow-worker-1.airflow-worker.de-services.svc.cluster.local
current_pid=155023 state=running ti_id=019cd518-d7c9-7e7e-bde2-efc6322e36a3
2026-03-10 02:45:23 [debug ] Retrieved current task state
current_hostname=airflow-worker-0.airflow-worker.de-services.svc.cluster.local
current_pid=81402 state=running ti_id=019cd578-f8c1-7125-9906-ef64229dbba5
2026-03-10 02:45:23 [debug ] Retrieved current task state
current_hostname=airflow-worker-1.airflow-worker.de-services.svc.cluster.local
current_pid=157917 state=running ti_id=019cd54c-28ad-7db0-b0f8-d64ed0916d78
2026-03-10 02:45:23 [debug ] Retrieved current task state
current_hostname=airflow-worker-0.airflow-worker.de-services.svc.cluster.local
current_pid=86154 state=running ti_id=019cd542-0d45-75e4-95d5-a2c461e3e559
2026-03-10 02:45:23 [debug ] Heartbeat updated state=running
ti_id=019cd542-0d3e-7467-9f7a-4dfc2d7f0017
INFO: 10.10.12.52:40870 - "GET /api/v2/version HTTP/1.1" 200 OK
INFO: 10.10.12.52:40880 - "GET /api/v2/version HTTP/1.1" 200 OK
2026-03-10 02:45:23 [debug ] Processing heartbeat
hostname=airflow-worker-1.airflow-worker.de-services.svc.cluster.local
pid=151395 ti_id=019cd542-0d47-7d93-a021-0cc2c9de7344
2026-03-10 02:45:23 [debug ] Refreshed token issued to Task
[airflow.api_fastapi.execution_api.deps] refresh_when_less_than=120
valid_left=73
2026-03-10 02:45:23 [debug ] Refreshed token issued to Task
[airflow.api_fastapi.execution_api.deps] refresh_when_less_than=120
valid_left=73
2026-03-10 02:45:23 [debug ] Heartbeat updated state=running
ti_id=019cd526-91bc-7461-8be3-aa7574c5f60b
2026-03-10 02:45:23 [debug ] Processing heartbeat
hostname=airflow-worker-1.airflow-worker.de-services.svc.cluster.local
pid=155023 ti_id=019cd518-d7c9-7e7e-bde2-efc6322e36a3
[2026-03-10T02:45:23.575+0000] {exceptions.py:77} ERROR - Error with id 9zBmdizJ
File
"/home/airflow/.local/lib/python3.12/site-packages/starlette/_exception_handler.py",
line 42, in wrapped_app
await app(scope, receive, sender)
File
"/home/airflow/.local/lib/python3.12/site-packages/starlette/routing.py", line
75, in app
response = await f(request)
^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/fastapi/routing.py",
line 302, in app
raw_response = await run_endpoint_function(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/fastapi/routing.py",
line 213, in run_endpoint_function
return await dependant.call(**values)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/cadwyn/structure/versions.py",
line 474, in decorator
response = await self._convert_endpoint_response_to_version(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/cadwyn/structure/versions.py",
line 520, in _convert_endpoint_response_to_version
response_or_response_body: Union[FastapiResponse, object] = await
run_in_threadpool(
^^^^^^^^^^^^^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/starlette/concurrency.py",
line 38, in run_in_threadpool
return await anyio.to_thread.run_sync(func)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/anyio/to_thread.py",
line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/anyio/_backends/_asyncio.py",
line 2476, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/anyio/_backends/_asyncio.py",
line 967, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/cadwyn/schema_generation.py",
line 515, in __call__
return self._original_callable(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/airflow/api_fastapi/execution_api/routes/xcoms.py",
line 419, in set_xcom
session.flush()
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/orm/session.py",
line 3449, in flush
self._flush(objects)
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/orm/session.py",
line 3588, in _flush
with util.safe_reraise():
^^^^^^^^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/util/langhelpers.py",
line 70, in __exit__
compat.raise_(
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/util/compat.py",
line 211, in raise_
raise exception
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/orm/session.py",
line 3549, in _flush
flush_context.execute()
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/orm/unitofwork.py",
line 456, in execute
rec.execute(self)
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/orm/unitofwork.py",
line 630, in execute
util.preloaded.orm_persistence.save_obj(
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/orm/persistence.py",
line 245, in save_obj
_emit_insert_statements(
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/orm/persistence.py",
line 1097, in _emit_insert_statements
c = connection._execute_20(
^^^^^^^^^^^^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/base.py",
line 1710, in _execute_20
return meth(self, args_10style, kwargs_10style, execution_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/sql/elements.py",
line 334, in _execute_on_connection
return connection._execute_clauseelement(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/base.py",
line 1577, in _execute_clauseelement
ret = self._execute_context(
^^^^^^^^^^^^^^^^^^^^^^
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/base.py",
line 1953, in _execute_context
self._handle_dbapi_exception(
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/base.py",
line 2134, in _handle_dbapi_exception
util.raise_(
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/util/compat.py",
line 211, in raise_
raise exception
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/base.py",
line 1910, in _execute_context
self.dialect.do_execute(
File
"/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/default.py",
line 736, in do_execute
cursor.execute(statement, parameters)
File "/home/airflow/.local/lib/python3.12/site-packages/MySQLdb/cursors.py",
line 179, in execute
res = self._query(mogrified_query)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/MySQLdb/cursors.py",
line 330, in _query
db.query(q)
File
"/home/airflow/.local/lib/python3.12/site-packages/MySQLdb/connections.py",
line 280, in query
_mysql.connection.query(self, query)
2026-03-10 02:45:23 [debug ] Heartbeat updated state=running
ti_id=019cd518-d7c9-7e7e-bde2-efc6322e36a3
2026-03-10 02:45:23 [debug ] Heartbeat updated state=running
ti_id=019cd54c-28ad-7db0-b0f8-d64ed0916d78
2026-03-10 02:45:23 [debug ] Retrieved current task state
current_hostname=airflow-worker-1.airflow-worker.de-services.svc.cluster.local
current_pid=65618 state=running ti_id=019cd526-91bc-7461-8be3-aa7574c5f60b
2026-03-10 02:45:23 [debug ] Heartbeat updated state=running
ti_id=019cd526-91bc-7461-8be3-aa7574c5f60b
2026-03-10 02:45:23 [debug ] Retrieved current task state
current_hostname=airflow-worker-1.airflow-worker.de-services.svc.cluster.local
current_pid=151858 state=running ti_id=019cd542-0d49-744c-aa72-a33d5ac4249d
2026-03-10 02:45:23 [debug ] Heartbeat updated state=running
ti_id=019cd542-0d45-75e4-95d5-a2c461e3e559
2026-03-10 02:45:23 [debug ] Heartbeat updated state=running
ti_id=019cd542-0d49-744c-aa72-a33d5ac4249d
2026-03-10 02:45:23 [debug ] Retrieved current task state
current_hostname=airflow-worker-1.airflow-worker.de-services.svc.cluster.local
current_pid=152133 state=running ti_id=019cd542-0d3e-7467-9f7a-4dfc2d7f0017
2026-03-10 02:45:23 [debug ] Retrieved current task state
current_hostname=airflow-worker-1.airflow-worker.de-services.svc.cluster.local
current_pid=157917 state=running ti_id=019cd54c-28ad-7db0-b0f8-d64ed0916d78
2026-03-10 02:45:23 [debug ] Heartbeat updated state=running
ti_id=019cd542-0d3e-7467-9f7a-4dfc2d7f0017
What do you think should happen instead?
The Airflow task should run to completion without failing.
### Committer
- [x] I acknowledge that I am a maintainer/committer of the Apache Airflow
project.
GitHub link: https://github.com/apache/airflow/discussions/63298