potiuk commented on PR #35473:
URL: https://github.com/apache/airflow/pull/35473#issuecomment-1798360252

   FYI: @bolkedebruin - so far so good. I have not seen any "self-hosted" job 
fail on this flaky test since yesterday (and I have been checking quite 
regularly). 
   
   Looks like my hypothesis about parallel job contention on some resources 
(I/O most likely) was right (and this test is particularly vulnerable): 
   
   Usually these tests take 8-10 seconds:
   
   
https://github.com/apache/airflow/actions/runs/6783603921/job/18438411043?pr=35492#step:6:2660
   
   ```
     9.97s call     tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues_no_queue_specified
     8.46s call     tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues
   ```
   
   But sometimes longer:
   
   
https://github.com/apache/airflow/actions/runs/6783603921/job/18438410767?pr=35492#step:6:2939
   ```
     14.14s call     tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutor::test_dask_executor_functions
     12.40s call     tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues
     10.83s call     tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues_no_queue_specified
   ```
   
   And there are cases where they take way longer:
   
   
https://github.com/apache/airflow/actions/runs/6783603921/job/18438412609?pr=35492#step:6:3399
   ```
     23.37s call     tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues
     22.87s call     tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues_no_queue_specified
     22.14s call     tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutor::test_dask_executor_functions
   ```
   
   So I guess increasing the timeout in this case was the right call to 
decrease the probability of flakiness.
   
   
   Of course a better solution would be to make the test less "fragile" - but 
this is an exercise for someone who understands the Dask integration better 
and can spend time assessing whether the test can be improved.
   
   Or maybe follow through with the Dask provider removal, which would 
possibly be the ultimate improvement in stability. 
   

