davidjfrickert opened a new issue, #64826:
URL: https://github.com/apache/airflow/issues/64826

   ### Under which category would you file this issue?
   
   Airflow Core
   
   ### Apache Airflow version
   
   3.1.7
   
   ### What happened and how to reproduce it?
   
   Not sure how to reproduce, but out of nowhere all DAGs started failing. We use `KubernetesPodOperator` exclusively.
   All task pods were failing with the same error:
   
   ```
     2026-04-07T06:27:23.301527Z [info     ] setup plugin alembic.autogenerate.schemas [alembic.runtime.plugins] loc=plugins.py:37
     2026-04-07T06:27:23.301648Z [info     ] setup plugin alembic.autogenerate.tables [alembic.runtime.plugins] loc=plugins.py:37
     2026-04-07T06:27:23.301748Z [info     ] setup plugin alembic.autogenerate.types [alembic.runtime.plugins] loc=plugins.py:37
     2026-04-07T06:27:23.301813Z [info     ] setup plugin alembic.autogenerate.constraints [alembic.runtime.plugins] loc=plugins.py:37
     2026-04-07T06:27:23.301869Z [info     ] setup plugin alembic.autogenerate.defaults [alembic.runtime.plugins] loc=plugins.py:37
     2026-04-07T06:27:23.301923Z [info     ] setup plugin alembic.autogenerate.comments [alembic.runtime.plugins] loc=plugins.py:37
     {"timestamp":"2026-04-07T06:27:23.411801Z","level":"info","event":"Executing workload","workload":"ExecuteTask(token='<REDACTED_JWT>', ti=TaskInstance(id=UUID('<REDACTED_UUID>'), dag_version_id=UUID('<REDACTED_UUID>'), task_id='start_healthcheck', dag_id='<REDACTED_DAG_ID>', run_id='scheduled__2026-04-07T06:27:12.595187+00:00', try_number=1, map_index=-1, pool_slots=1, queue='default', priority_weight=4, executor_config=None, parent_context_carrier={}, context_carrier={}), dag_rel_path=PurePosixPath('local_dags/<REDACTED_DAG_ID>.py'), bundle_info=BundleInfo(name='dags-folder', version=None), log_path='dag_id=<REDACTED_DAG_ID>/run_id=scheduled__2026-04-07T06:27:12.595187+00:00/task_id=start_healthcheck/attempt=1.log', type='ExecuteTask')","logger":"__main__","filename":"execute_workload.py","lineno":56}
     {"timestamp":"2026-04-07T06:27:23.412014Z","level":"info","event":"Connecting to server:","server":"http://<REDACTED_HOST>/execution/","logger":"__main__","filename":"execute_workload.py","lineno":64}
     {"timestamp":"2026-04-07T06:27:23.430813Z","level":"info","event":"Secrets backends loaded for worker","count":1,"backend_classes":["EnvironmentVariablesBackend"],"logger":"supervisor","filename":"supervisor.py","lineno":1975}
     {"timestamp":"2026-04-07T06:27:25.698677Z","level":"info","event":"Process exited","pid":14,"exit_code":-9,"signal_sent":"SIGKILL","logger":"supervisor","filename":"supervisor.py","lineno":710}
     Traceback (most recent call last):
       File "<frozen runpy>", line 198, in _run_module_as_main
       File "<frozen runpy>", line 88, in _run_code
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/execute_workload.py", line 125, in <module>
         main()
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/execute_workload.py", line 121, in main
         execute_workload(workload)
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/execute_workload.py", line 66, in execute_workload
         supervise(
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/supervisor.py", line 1984, in supervise
         process = ActivitySubprocess.start(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/supervisor.py", line 955, in start
         proc._on_child_started(ti=what, dag_rel_path=dag_rel_path, bundle_info=bundle_info)
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/supervisor.py", line 966, in _on_child_started
         ti_context = self.client.task_instances.start(ti.id, self.pid, start_date)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/api/client.py", line 215, in start
         resp = self.client.patch(f"task-instances/{id}/run", content=body.model_dump_json())
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/httpx/_client.py", line 1218, in patch
         return self.request(
                ^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/tenacity/__init__.py", line 338, in wrapped_f
         return copy(f, *args, **kw)
                ^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/tenacity/__init__.py", line 477, in __call__
         do = self.iter(retry_state=retry_state)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/tenacity/__init__.py", line 378, in iter
         result = action(retry_state)
                  ^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/tenacity/__init__.py", line 400, in <lambda>
         self._add_action_func(lambda rs: rs.outcome.result())
                                          ^^^^^^^^^^^^^^^^^^^
       File "/usr/python/lib/python3.12/concurrent/futures/_base.py", line 449, in result
         return self.__get_result()
                ^^^^^^^^^^^^^^^^^^^
       File "/usr/python/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
         raise self._exception
       File "/home/airflow/.local/lib/python3.12/site-packages/tenacity/__init__.py", line 480, in __call__
         result = fn(*args, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/api/client.py", line 887, in request
         return super().request(*args, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/httpx/_client.py", line 825, in request
         return self.send(request, auth=auth, follow_redirects=follow_redirects)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/httpx/_client.py", line 914, in send
         response = self._send_handling_auth(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/httpx/_client.py", line 942, in _send_handling_auth
         response = self._send_handling_redirects(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/httpx/_client.py", line 999, in _send_handling_redirects
         raise exc
       File "/home/airflow/.local/lib/python3.12/site-packages/httpx/_client.py", line 982, in _send_handling_redirects
         hook(response)
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/api/client.py", line 186, in raise_on_4xx_5xx_with_note
         return get_json_error(response) or response.raise_for_status()
                ^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/api/client.py", line 176, in get_json_error
         raise err
     airflow.sdk.api.client.ServerResponseError: Invalid auth token: Signature verification failed
     Correlation-id=<REDACTED_UUID>
   ```
   
   Nothing really changed on the deployment. The only thing I noticed was that one of the scheduler pods had restarted recently.
   
   ```
     2026-04-07T06:27:38.072406Z [info     ] Task instance state updated [airflow.api_fastapi.execution_api.routes.task_instances] loc=task_instances.py:424 new_state=failed rows_affected=1 ti_id=<REDACTED_UUID>
     INFO:     <REDACTED_IP>:<PORT> - "PATCH /execution/task-instances/<REDACTED_UUID>/state HTTP/1.1" 204 No Content
     INFO:     <REDACTED_IP>:<PORT> - "PUT /execution/task-instances/<REDACTED_UUID>/heartbeat HTTP/1.1" 204 No Content
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /admin/metrics/ HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "PUT /execution/task-instances/<REDACTED_UUID>/heartbeat HTTP/1.1" 204 No Content
     INFO:     <REDACTED_IP>:<PORT> - "PUT /execution/task-instances/<REDACTED_UUID>/heartbeat HTTP/1.1" 204 No Content
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /admin/metrics/ HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "PUT /execution/task-instances/<REDACTED_UUID>/heartbeat HTTP/1.1" 204 No Content
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "PUT /execution/task-instances/<REDACTED_UUID>/heartbeat HTTP/1.1" 204 No Content
     INFO:     <REDACTED_IP>:<PORT> - "GET /admin/metrics/ HTTP/1.1" 200 OK
     2026-04-07T06:28:21.716258Z [warning  ] Failed to validate JWT         [airflow.api_fastapi.execution_api.deps] loc=deps.py:86 token=<REDACTED_JWT>
     Traceback (most recent call last):
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/api_fastapi/execution_api/deps.py", line 83, in __call__
         claims = await validator.avalidated_claims(creds.credentials, validators)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/api_fastapi/auth/tokens.py", line 328, in avalidated_claims
         claims = jwt.decode(
                  ^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/jwt/api_jwt.py", line 371, in decode
         decoded = self.decode_complete(
                   ^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/jwt/api_jwt.py", line 268, in decode_complete
         decoded = self._jws.decode_complete(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/jwt/api_jws.py", line 244, in decode_complete
         self._verify_signature(signing_input, header, signature, key, algorithms)
       File "/home/airflow/.local/lib/python3.12/site-packages/jwt/api_jws.py", line 359, in _verify_signature
         raise InvalidSignatureError("Signature verification failed")
     jwt.exceptions.InvalidSignatureError: Signature verification failed
     2026-04-07T06:28:21.716787Z [warning  ] JWT reissue middleware failed to refresh token [airflow.api_fastapi.execution_api.app] error='Signature verification failed' loc=app.py:126
     Traceback (most recent call last):
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/api_fastapi/execution_api/app.py", line 115, in dispatch
         claims = await validator.avalidated_claims(token, {})
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/airflow/api_fastapi/auth/tokens.py", line 328, in avalidated_claims
         claims = jwt.decode(
                  ^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/jwt/api_jwt.py", line 371, in decode
         decoded = self.decode_complete(
                   ^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/jwt/api_jwt.py", line 268, in decode_complete
         decoded = self._jws.decode_complete(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/home/airflow/.local/lib/python3.12/site-packages/jwt/api_jws.py", line 244, in decode_complete
         self._verify_signature(signing_input, header, signature, key, algorithms)
       File "/home/airflow/.local/lib/python3.12/site-packages/jwt/api_jws.py", line 359, in _verify_signature
         raise InvalidSignatureError("Signature verification failed")
     jwt.exceptions.InvalidSignatureError: Signature verification failed
     INFO:     <REDACTED_IP>:<PORT> - "PATCH /execution/task-instances/<REDACTED_UUID>/run HTTP/1.1" 403 Forbidden
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /login/ HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "PUT /execution/task-instances/<REDACTED_UUID>/heartbeat HTTP/1.1" 204 No Content
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "PUT /execution/task-instances/<REDACTED_UUID>/heartbeat HTTP/1.1" 204 No Content
     INFO:     <REDACTED_IP>:<PORT> - "GET /admin/metrics/ HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     INFO:     <REDACTED_IP>:<PORT> - "GET /api/v2/version HTTP/1.1" 200 OK
     2026-04-07T06:28:42.641659Z [warning  ] Failed to validate JWT         [airflow.api_fastapi.execution_api.deps] loc=deps.py:86 token=<REDACTED_JWT>
     Traceback (most recent call last):
       [... same JWT InvalidSignatureError traceback ...]
     jwt.exceptions.InvalidSignatureError: Signature verification failed
     2026-04-07T06:28:42.642204Z [warning  ] JWT reissue middleware failed to refresh token [airflow.api_fastapi.execution_api.app] error='Signature verification failed' loc=app.py:126
     Traceback (most recent call last):
       [... same JWT InvalidSignatureError traceback ...]
     jwt.exceptions.InvalidSignatureError: Signature verification failed
     INFO:     <REDACTED_IP>:<PORT> - "PATCH /execution/task-instances/<REDACTED_UUID>/run HTTP/1.1" 403 Forbidden
     INFO:     <REDACTED_IP>:<PORT> - "GET /admin/metrics/ HTTP/1.1" 200 OK
     2026-04-07T06:28:46.212078Z [warning  ] Failed to validate JWT         [airflow.api_fastapi.execution_api.deps] loc=deps.py:86 token=<REDACTED_JWT>
     Traceback (most recent call last):
       [... same JWT InvalidSignatureError traceback ...]
     jwt.exceptions.InvalidSignatureError: Signature verification failed
     2026-04-07T06:28:46.212700Z [warning  ] JWT reissue middleware failed to refresh token [airflow.api_fastapi.execution_api.app] error='Signature verification failed' loc=app.py:126
     Traceback (most recent call last):
       [... same JWT InvalidSignatureError traceback ...]
     jwt.exceptions.InvalidSignatureError: Signature verification failed
     INFO:     <REDACTED_IP>:<PORT> - "PATCH /execution/task-instances/<REDACTED_UUID>/run HTTP/1.1" 403 Forbidden
   ```
   
   I did a rollout restart of the api-server, scheduler and dag-processor, and this seems to have fixed it.
   
   When I took the JWT from the pod logs and checked its signature on `jwt.io` against the key in the k8s secret `airflow-jwt-secret`, it verified OK, so I don't know why Airflow suddenly failed to verify signatures.
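
The same signature check can be done offline instead of pasting a token into `jwt.io`. The following is a minimal sketch only, assuming the token is signed with an HMAC algorithm (HS256, HS384 or HS512) and that the secret is used as raw bytes. `verify_hmac_jwt`, `sign_hmac_jwt` and `secret_fingerprint` are hypothetical helper names for illustration, not Airflow or PyJWT APIs. It checks the signature only, not `exp`, `aud` or any other claim.

```python
import base64
import hashlib
import hmac
import json

# JWT "alg" header values mapped to hashlib constructors (HMAC family only).
_HMAC_ALGS = {"HS256": hashlib.sha256, "HS384": hashlib.sha384, "HS512": hashlib.sha512}


def _b64url_decode(segment: str) -> bytes:
    """Decode an unpadded base64url segment as used in JWTs."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def verify_hmac_jwt(token: str, secret: bytes) -> dict:
    """Return the claims if the HMAC signature matches, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a three-part JWT")
    header_b64, payload_b64, signature_b64 = parts
    header = json.loads(_b64url_decode(header_b64))
    digest = _HMAC_ALGS.get(header.get("alg"))
    if digest is None:
        raise ValueError(f"unsupported alg: {header.get('alg')!r}")
    # The signature covers the first two segments exactly as transmitted.
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, digest).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("Signature verification failed")
    return json.loads(_b64url_decode(payload_b64))


def sign_hmac_jwt(claims: dict, secret: bytes, alg: str = "HS512") -> str:
    """Build a demo token so the verifier can be exercised without a real one."""
    header_b64 = _b64url_encode(json.dumps({"alg": alg, "typ": "JWT"}).encode())
    payload_b64 = _b64url_encode(json.dumps(claims).encode())
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    signature = hmac.new(secret, signing_input, _HMAC_ALGS[alg]).digest()
    return f"{header_b64}.{payload_b64}.{_b64url_encode(signature)}"


def secret_fingerprint(secret: bytes) -> str:
    """Short hash of a secret, safe to compare across pods without printing it."""
    return hashlib.sha256(secret).hexdigest()[:12]


if __name__ == "__main__":
    demo_secret = b"demo-secret"  # placeholder, not a real key
    demo_token = sign_hmac_jwt({"sub": "demo-task-instance"}, demo_secret)
    print(verify_hmac_jwt(demo_token, demo_secret))
    print(secret_fingerprint(demo_secret))
```

With a real token, `secret` has to be exactly the bytes the running process uses. A value that is still base64-encoded, or that carries a trailing newline, is a different key and fails verification. Comparing `secret_fingerprint` output taken inside each component's pod is one way to check whether they all hold the same key.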
   
   ### What you think should happen instead?
   
   No issues on running DAGs.
   
   ### Operating System
   
   Rocky Linux 8
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Apache Airflow Provider(s)
   
   _No response_
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Official Helm Chart version
   
   1.18.0
   
   ### Kubernetes Version
   
   v1.30.14
   
   ### Helm Chart configuration
   
   _No response_
   
   ### Docker Image customizations
   
   Not applicable
   
   ### Anything else?
   
   It occurred out of nowhere.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

