Hello Everyone,

I think we need a few more pairs of eyes on the potential fixes we have
for the pesky LocalExecutorTest failures that we are all experiencing in
our Travis builds. Once we solve this, I think we will be much closer to
having stable builds, together with the other flaky test fixes merged
recently.

It turned out that the problem lies in the internals of how data is passed
between processes using multiprocessing queues. This is deep in the core
processing of Airflow, so I think it would be great if other experienced
Airflowers could also review it, comment on it, and help select the best
solution, as we could have missed something.

I was looking at it together with Ash and Bas, and I merged a preliminary
version of the fix last week (a bit too hastily). We reverted it later
because it turned out to have some side effects, so we know we have to be
careful with this one.

After more detailed analysis and discussions with Omar, we now have two
candidate fixes. Both are green and, judging from local testing, both
solve the problem, each in a different way.

   - https://github.com/apache/airflow/pull/5199
   - https://github.com/apache/airflow/pull/5200

I tried to describe the problem and the candidate solutions, with their
pros and cons, in the JIRA ticket:
https://issues.apache.org/jira/browse/AIRFLOW-4401

I'd love it if we could get reviews on the PRs and input on which solution
to choose.

J.


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
E: jarek.pot...@polidea.com
