CraigChaffee opened a new issue, #28171:
URL: https://github.com/apache/airflow/issues/28171

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   Our scheduler started failing with this trace:
   
     File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1187, in get_failed_dep_statuses
       for dep_status in dep.get_dep_statuses(self, session, dep_context):
     File "/usr/local/lib/python3.9/site-packages/airflow/ti_deps/deps/base_ti_dep.py", line 95, in get_dep_statuses
       yield from self._get_dep_statuses(ti, session, dep_context)
     File "/usr/local/lib/python3.9/site-packages/airflow/ti_deps/deps/not_in_retry_period_dep.py", line 47, in _get_dep_statuses
       next_task_retry_date = ti.next_retry_datetime()
     File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1243, in next_retry_datetime
       return self.end_date + delay
   OverflowError: date value out of range
   
   We found that a DAG with a large number of retries and exponential backoff triggers this date error and takes down the entire scheduler. The workaround is to force a maximum delay with the `max_retry_delay` setting.
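
A minimal sketch of the failure mode, using only the standard library. It is a simplified model and not Airflow's actual code: the function name `simulated_retry_datetime`, the plain doubling rule, and the example date and delay are all assumptions made for illustration. It shows that a delay which still fits in a `timedelta` can push `end_date + delay` past `datetime.max`, and that a maximum delay avoids this.

```python
from datetime import datetime, timedelta, timezone


def simulated_retry_datetime(end_date, base_delay, try_number, max_delay=None):
    """Hypothetical, simplified model of exponential backoff (not Airflow's code)."""
    # Delay doubles with every attempt: base, 2*base, 4*base, ...
    seconds = base_delay.total_seconds() * (2 ** (try_number - 1))
    # Keep the value representable as a timedelta; this alone does not
    # keep end_date + delay inside the datetime range.
    seconds = min(seconds, timedelta.max.total_seconds() - 1)
    delay = timedelta(seconds=seconds)
    if max_delay is not None:
        delay = min(delay, max_delay)
    # Raises OverflowError once the result would pass datetime.max (year 9999).
    return end_date + delay


# Arbitrary example values, chosen only for the demonstration.
end = datetime(2022, 12, 6, tzinfo=timezone.utc)
base = timedelta(minutes=5)

try:
    simulated_retry_datetime(end, base, try_number=50)
except OverflowError as exc:
    print(f"uncapped backoff overflowed: {exc}")

capped = simulated_retry_datetime(end, base, try_number=50, max_delay=timedelta(hours=1))
print(f"capped backoff retries at: {capped}")
```

In this model the uncapped call fails for any sufficiently high try number, while the capped call always lands at most one hour after `end`.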
   
   The bug is here:
   
https://github.com/apache/airflow/blob/2.3.3/airflow/models/taskinstance.py#L1243
   
   The current version seems to use the same code:
   
https://github.com/apache/airflow/blob/main/airflow/models/taskinstance.py#L1147
   
   
   ### What you think should happen instead
   
   There are a few possible solutions. Exponential backoff should probably require a maximum delay value.
   
   At the very least, it shouldn't kill the scheduler.
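
One possible way to keep the overflow from propagating, sketched with the standard library only. The helper name `clamped_add` is hypothetical and this is not a proposed patch to Airflow; it only illustrates saturating the date addition instead of raising.

```python
from datetime import datetime, timedelta, timezone


def clamped_add(moment, delay):
    """Hypothetical helper: add delay to moment, saturating at datetime.max.

    Assumes delay is non-negative, as a retry delay would be.
    """
    try:
        return moment + delay
    except OverflowError:
        # Saturate instead of raising; keep the original tzinfo so the result
        # stays comparable with other datetimes of the same kind.
        return datetime.max.replace(tzinfo=moment.tzinfo)


# Arbitrary example value, chosen only for the demonstration.
finished = datetime(2022, 12, 6, tzinfo=timezone.utc)
print(clamped_add(finished, timedelta(minutes=5)))  # ordinary addition
print(clamped_add(finished, timedelta.max))         # saturates, no exception
```

A saturated retry date effectively means "never retry", so a caller would probably also want to log a warning. Requiring a maximum delay, as suggested above, avoids reaching this branch at all.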
   
   ### How to reproduce
   
   Create a DAG with exponential backoff on retries and force it to keep retrying until the computed retry date overflows.
   
   ### Operating System
   
   linux
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

