Thanks for the explanation. I'd go for the manager as well. In the end, if
you really want to scale out, you should go for Celery/Kubernetes.

Cheers, Fokko

Op ma 29 apr. 2019 om 23:09 schreef Jarek Potiuk <jarek.pot...@polidea.com>:

> Yep. Manager() creates a SyncManager instance and runs start().
> This in turn creates a Manager process that all processes communicate with
> via Proxies:
>
> https://docs.python.org/3.4/library/multiprocessing.html?highlight=process#sharing-state-between-processes
>
> It's reliable, but slower than Shared Memory used by direct Queue
> instantiation.
>
> J.
>
> On Mon, Apr 29, 2019 at 10:26 PM Bas Harenslak <
> basharens...@godatadriven.com> wrote:
>
> > @Fokko I believe that’s implicitly created by multiprocessing.Manager()<
> >
> https://docs.python.org/3.4/library/multiprocessing.html?highlight=process#multiprocessing.sharedctypes.multiprocessing.Manager
> > >:
> >
> > ….. The returned manager object corresponds to a spawned child process
> and
> > ….
> >
> > I’m in favour of the Manager route, since this doesn’t introduce
> > additional complex multiprocessing code in Airflow. I’m reviewing right
> now
> > although it’s a bit out of my comfort zone...
> >
> > Bas
> >
> > On 29 Apr 2019, at 21:46, Driesprong, Fokko <fo...@driesprong.frl
> <mailto:
> > fo...@driesprong.frl>> wrote:
> >
> > I'm missing the part of another process? This is within the Scheduler
> > process if I understand correctly.
> >
> > Cheers, Fokko
> >
> > Op ma 29 apr. 2019 om 21:33 schreef Jarek Potiuk <
> jarek.pot...@polidea.com
> > <mailto:jarek.pot...@polidea.com>>:
> >
> > I am also leaning towards the manager. I updated the
> > https://github.com/apache/airflow/pull/5200 PR now after review and once
> > it
> > passes CI I think we can merge it.
> > If anyone wants to have a look as well, happy to hear it.
> >
> > J.
> >
> > On Mon, Apr 29, 2019 at 2:14 PM Ash Berlin-Taylor <a...@apache.org
> <mailto:
> > a...@apache.org>> wrote:
> >
> > I think I lean towards the built-in/manager approach as it is less
> > concurrency code we have to manage/maintain in Airflow, though I'm not
> > hugely happy about another process :(
> >
> > -ash
> >
> > On 29 Apr 2019, at 07:33, Jarek Potiuk <jarek.pot...@polidea.com<mailto:
> > jarek.pot...@polidea.com>>
> > wrote:
> >
> > Hello Everyone,
> >
> > I think we need some more pairs of eyes to take a look at potential
> > fixes
> > we have for the pesky LocalExecutorTest that we are all experiencing
> > with
> > our Travis builds. Once we solve it I think we should be much closer to
> > have stable builds - including some other flaky test fixes merged
> > recently.
> >
> > It turned out that the problem relates to quite deep internals of how
> > data
> > is passed between processes using multiprocessing queues. It's really
> > deep
> > in the core processing of Airflow so I think it would be great if also
> > other experienced Airflowers review and comment it and help to select
> > the
> > best solution as we could have missed something.
> >
> > I was looking at it together with Ash and Bas and I (a bit too fast)
> > merged
> > a preliminary version of the fix last week. We reverted it later as it
> > turned out to have some side effects, so we know we have to be careful
> > with
> > this one.
> >
> > After more detailed analysis and discussions with Omar, we have now two
> > potential candidates to fix it. Both are green and from local testing -
> > both are solving the problem in a different way.
> >
> >  - https://github.com/apache/airflow/pull/5199
> >  - https://github.com/apache/airflow/pull/5200
> >
> > I tried to describe the problem, solution candidates with Pros and Cons
> > in
> > the JIRA ticket :
> > https://issues.apache.org/jira/browse/AIRFLOW-4401
> >
> > I'd love if we can get reviews in the PRs and input to discussion on
> > which
> > solution to choose.
> >
> > J.
> >
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129 <+48660796129>
> > E: jarek.pot...@polidea.com<mailto:jarek.pot...@polidea.com>
> >
> >
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129 <+48660796129>
> > E: jarek.pot...@polidea.com<mailto:jarek.pot...@polidea.com>
> >
> >
> >
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
> E: jarek.pot...@polidea.com
>

Reply via email to