I also like the "stop" idea. Let me also partly answer my own question and
explain the current behaviour.

We know that if you use systemd or similar (or simply run airflow in a
terminal and press ^C), the webserver and scheduler will be killed nicely.
But I think we miss the case where you want to kill the webserver process
itself by its pid (even though we handle the --pid option).

Not everyone knows this, but pressing ^C actually sends the INT signal to
the whole foreground process group, not just to the main process. This
surprises many people, even those who know how signals work in Unix, so I
wanted to mention it here.
You can read more about it here:
https://unix.stackexchange.com/questions/149741/why-is-sigint-not-propagated-to-child-process-when-sent-to-its-parent-process
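
To illustrate why ^C reaches the children too: a child started without any
special handling inherits its parent's process group, and that group is
what the terminal signals. A minimal Python sketch, not Airflow code (the
helper name child_process_group is made up for this example):

```python
import os
import subprocess
import sys


def child_process_group():
    """Return the process group ID reported by a freshly spawned child."""
    out = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpgrp())"],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout)
```

Comparing child_process_group() with os.getpgrp() in the parent should give
the same ID, which is why a signal sent to the group hits both.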


Systemd uses the "control-group" KillMode by default, which does
essentially the same: it signals every process in the unit's control
group. That's why systemd integration works well for airflow.

But if you start the webserver/scheduler manually in -D mode, even with a
--pid file specified, and then run kill -INT <webserver pid> or kill -INT
<scheduler pid>, then (if we do not propagate the signal) only the main
process is killed. The child processes are re-parented to init and
continue running.
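
The effect can be reproduced outside Airflow with a small Python sketch.
Everything here is hypothetical demo code (PARENT_SRC, kill_parent_only and
is_alive are names invented for the example), and it sends TERM rather than
INT so the result does not depend on how Python maps INT to
KeyboardInterrupt; the orphaning is the same either way:

```python
import os
import signal
import subprocess
import sys

# Stand-in for a daemonized parent: starts one child, prints its PID, waits.
PARENT_SRC = """
import subprocess, sys
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
print(child.pid, flush=True)
child.wait()
"""


def kill_parent_only(signum=signal.SIGTERM):
    """Signal only the parent PID; return (parent_exit_code, child_pid)."""
    parent = subprocess.Popen(
        [sys.executable, "-c", PARENT_SRC],
        stdout=subprocess.PIPE, text=True,
        start_new_session=True,  # no controlling terminal, similar to -D
    )
    child_pid = int(parent.stdout.readline())
    os.kill(parent.pid, signum)  # like: kill -TERM <parent pid>
    rc = parent.wait(timeout=10)
    parent.stdout.close()
    return rc, child_pid


def is_alive(pid):
    """True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    return True
```

After kill_parent_only() returns, is_alive(child_pid) should still be true:
the parent is gone, but nothing told the child to stop. Remember to kill
the leftover child by hand when trying this.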

I looked briefly at the code and - unless I missed something - it seems
that in -D mode we are not setting our own signal handlers. In the
interactive mode we are setting signal handlers that simply do
sys.exit(0).

I just wonder whether others have looked, now or in the past, at how this
is done and have some thoughts about it.

One way we could improve it (it worked for me in the past): we could have
the Webserver/Scheduler start all their processes in their own new process
group and propagate all signals to that group before handling them. That
would work nicely in both interactive and daemon mode. Both the systemd
integration and manually sending a signal to the webserver/scheduler would
then kill all the processes spawned by the webserver/scheduler.
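
A rough Python sketch of one way this could look, assuming the children go
into a group separate from the parent so that the parent still receives ^C
from the terminal. This is not existing Airflow code; spawn_in_group,
forward_to_group and install_forwarding are hypothetical names:

```python
import os
import signal
import subprocess
import sys


def spawn_in_group(cmd, pgid=0):
    """Start cmd in process group pgid (0 = new group led by the child)."""
    # preexec_fn runs in the child after fork() and before exec().
    return subprocess.Popen(cmd, preexec_fn=lambda: os.setpgid(0, pgid))


def forward_to_group(pgid, signum):
    """Deliver signum to every process in the group."""
    try:
        os.killpg(pgid, signum)
    except ProcessLookupError:
        pass  # the group is already gone


def install_forwarding(pgid, signums=(signal.SIGINT, signal.SIGTERM)):
    """Forward each listed signal to the child group, then exit the parent."""
    def handler(signum, frame):
        forward_to_group(pgid, signum)
        sys.exit(0)

    for signum in signums:
        signal.signal(signum, handler)
```

The first child would be started with spawn_in_group(cmd), later ones with
spawn_in_group(cmd, pgid=first.pid), and the parent would then call
install_forwarding(first.pid). A real implementation would probably also
wait for the children to exit before the parent does.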

Let me know what you think about it.

J.

On Sat, Jan 4, 2020 at 12:38 PM Kaxil Naik <[email protected]> wrote:

> That is a good idea I think.
>
> On Sat, Jan 4, 2020 at 11:33 AM Tomasz Urbaszek <[email protected]>
> wrote:
>
> > From some time I think about adding "stop" commands like "airflow
> scheduler
> > stop", "airflow celery worker stop".
> > What do you think? I have already done this in native executor POC and
> it's
> > helpful.
> >
> > T.
> >
> > On Sat, Jan 4, 2020 at 12:22 PM Kaxil Naik <[email protected]> wrote:
> >
> > > Systemd integrations have worked nicely for me:
> > > https://airflow.apache.org/docs/stable/howto/run-with-systemd.html
> > >
> > >
> > >
> > > On Sat, Jan 4, 2020 at 11:01 AM Jarek Potiuk <[email protected]
> >
> > > wrote:
> > >
> > > > I would like to bring the subject from user@ group
> > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/5add5e8a19cb86ef2141d9d0634bd01c12d74a7655c4eddfa7b8e75a%40%3Cusers.airflow.apache.org%3E
> > > >
> > > >
> > > > Seems some people have problems with nicely killing airflow
> > > > scheduler/webserver with signals and I was wondering if this already
> > > > implemented/or someone has some insight/experience with it and can
> > share
> > > > thoughts about it, before we dig deeper?
> > > >
> > > > I know Tomek had recently some experience with killing workers nicely
> > and
> > > > is looking at it, but I think it would be great to have working and
> > > > described scheduler/webserver killing scenarios - which signals work,
> > how
> > > > threads/processes behave when the signals are received etc.
> > > >
> > > > Does anyone have any insight into it ?
> > > >
> > > > J.
> > > > --
> > > >
> > > > Jarek Potiuk
> > > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > >
> > > > M: +48 660 796 129 <+48660796129>
> > > > [image: Polidea] <https://www.polidea.com/>
> > > >
> > >
> >
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
