o-nikolas opened a new issue, #27069:
URL: https://github.com/apache/airflow/issues/27069
### Apache Airflow version
main (development)
### What happened
In the #14915 change to make tests run in parallel, the heartbeat interval
threshold was raised by an order of magnitude, from 0.05 to 0.5. Even so, I
frequently see tests failing in PRs because they breach that threshold by a
tiny amount. Do we need to increase the threshold again? CC @potiuk
In the example below, the deviation from the heartrate was
`0.5193889999999999`, which is `0.0193...` past the threshold for the test:
```
=================================== FAILURES ===================================
_________________ TestLocalTaskJob.test_heartbeat_failed_fast __________________

self = <tests.jobs.test_local_task_job.TestLocalTaskJob object at 0x7f4088400950>

    def test_heartbeat_failed_fast(self):
        """
        Test that task heartbeat will sleep when it fails fast
        """
        self.mock_base_job_sleep.side_effect = time.sleep
        dag_id = 'test_heartbeat_failed_fast'
        task_id = 'test_heartbeat_failed_fast_op'
        with create_session() as session:
            dag_id = 'test_heartbeat_failed_fast'
            task_id = 'test_heartbeat_failed_fast_op'
            dag = self.dagbag.get_dag(dag_id)
            task = dag.get_task(task_id)
            dr = dag.create_dagrun(
                run_id="test_heartbeat_failed_fast_run",
                state=State.RUNNING,
                execution_date=DEFAULT_DATE,
                start_date=DEFAULT_DATE,
                session=session,
            )
            ti = dr.task_instances[0]
            ti.refresh_from_task(task)
            ti.state = State.QUEUED
            ti.hostname = get_hostname()
            ti.pid = 1
            session.commit()
            job = LocalTaskJob(task_instance=ti, executor=MockExecutor(do_update=False))
            job.heartrate = 2
            heartbeat_records = []
            job.heartbeat_callback = lambda session: heartbeat_records.append(job.latest_heartbeat)
            job._execute()
            assert len(heartbeat_records) > 2
            for i in range(1, len(heartbeat_records)):
                time1 = heartbeat_records[i - 1]
                time2 = heartbeat_records[i]
                # Assert that difference small enough
                delta = (time2 - time1).total_seconds()
>               assert abs(delta - job.heartrate) < 0.5
E               assert 0.5193889999999999 < 0.5
E                +  where 0.5193889999999999 = abs((2.519389 - 2))
E                +    where 2 = <airflow.jobs.local_task_job.LocalTaskJob object at 0x7f408835a7d0>.heartrate

tests/jobs/test_local_task_job.py:312: AssertionError
```
[source](https://github.com/apache/airflow/actions/runs/3253568905/jobs/5341352671)
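To make the arithmetic in the failure concrete, here is a minimal standalone sketch of the same check, outside of Airflow. The helper `heartbeat_drifts` and the start timestamp are hypothetical and only for illustration; the `2.519389` second gap, the heartrate of `2`, and the `0.5` threshold are taken from the traceback above. The `1.0` threshold is an arbitrary wider value, not a proposal.

```python
from datetime import datetime, timedelta


def heartbeat_drifts(records, heartrate):
    """Return |delta - heartrate| for each consecutive pair of timestamps.

    Hypothetical helper mirroring the loop in test_heartbeat_failed_fast.
    """
    return [
        abs((later - earlier).total_seconds() - heartrate)
        for earlier, later in zip(records, records[1:])
    ]


# Arbitrary start time; only the gap between the two records matters.
start = datetime(2022, 1, 1, 12, 0, 0)
records = [start, start + timedelta(seconds=2.519389)]

drifts = heartbeat_drifts(records, heartrate=2)

# The observed drift is roughly 0.519 s, so the current 0.5 s bound fails.
passes_current = all(d < 0.5 for d in drifts)

# An arbitrary wider bound (assumption, for comparison only) would pass.
passes_wider = all(d < 1.0 for d in drifts)

print(drifts, passes_current, passes_wider)
```

Because the heartbeat sleeps on a shared CI runner, the drift depends on scheduling latency, so any fixed bound trades strictness against flakiness.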
### What you think should happen instead
Tests should not be flaky and should pass reliably :)
### How to reproduce
_No response_
### Operating System
Linux
### Versions of Apache Airflow Providers
_No response_
### Deployment
Other
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)