Re: Some suggestions about s6 and s6-rc

2015-09-19 Thread Steve Litt
On Sat, 19 Sep 2015 11:11:37 -0700
Avery Payne  wrote:

> With regard to having scripted placement of down files, if it was in a
> template or compiled as such, then the entire process of writing it
> into the definition becomes trivial or moot.  While there should
> always be a manual option to override a script, or the option to
> write one directly, I think the days of writing all of the
> definitions needed by hand have long since passed.
> 
> But there is an issue that would keep this idea from easily
> occurring. You would need to find a way to signal the daemon that
> the system is going down, vs. merely the daemon going down.

I solved all this stuff, with LittKit, by defining files "reallydown"
and "nodown", signifying "yes, this service truly is supposed to be
down", and "this is the first service to run", respectively. I've
tested it with s6 and with daemontools-encore (slightly different
versions of the shellscripts), and it works perfectly.

Basically, on startup, before bringing up the process supervisor, you
write "down" files to every service not containing a "nodown". Then you
erase down files one at a time.

LittKit in no way requires any modification to the supervision system.
If the supervisor does what original daemontools does with the "down"
file, LittKit can bring up services (and oneshots) one at a time,
intermixing services and oneshots.
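
The down-file sequencing described above can be sketched roughly as
follows. This is only an illustration of the idea, not LittKit's actual
scripts; the function names mark_all_down and release_service, the
service names, and the directory layout are assumptions made for the
example.

```sh
#!/bin/sh
# Illustrative sketch only; not taken from LittKit.
# Assumes a scan directory whose subdirectories are service directories.

# Put a "down" file in every service directory that has no "nodown" file.
mark_all_down() {
    for svc in "$1"/*/; do
        [ -d "$svc" ] || continue          # glob matched nothing
        [ -e "${svc}nodown" ] && continue  # first service: leave it alone
        : > "${svc}down"
    done
}

# Remove one service's "down" file. A real setup would then also tell the
# supervisor to start the service (with s6, for example, via s6-svc -u).
release_service() {
    rm -f "$1/down"
}

# Demo in a throwaway directory with three hypothetical services.
scan=$(mktemp -d)
mkdir "$scan/getty" "$scan/network" "$scan/sshd"
: > "$scan/getty/nodown"
mark_all_down "$scan"
release_service "$scan/network"   # bring services up one at a time
ls "$scan"/*/
rm -rf "$scan"
```

Nothing here touches the supervisor itself, which matches the point
that the approach needs no modification to the supervision system.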

Here's the LittKit README:

http://troubleshooters.com/projects/littkit/README

I'm not saying nine little shellscripts is the best solution to the
situation, but it's not all that tough a situation.

SteveT

Steve Litt 
August 2015 featured book: Troubleshooting: Just the Facts
http://www.troubleshooters.com/tjust


Re: Some suggestions about s6 and s6-rc

2015-09-19 Thread Steve Litt
On Sun, 20 Sep 2015 11:06:34 +0800
"Casper Ti. Vector"  wrote:
> On Sun, Sep 20, 2015 at 12:33:28AM +0200, Laurent Bercot wrote:
> >   I agree that the name collision is confusing, and it is an
> > annoyance.
> 


> Since s6-rc is still unreleased, perhaps we can still take the chance
> to rename `up'/`down' in oneshots to `run'/`finish', in order to let
> them look a little more unified?
> 

Yes. The use of a file called "down" to tell the system not to run the
process, and also the use of a script called "down" to perform an
action at the appropriate time, will be holy hell to document, even if
theoretically they cannot both happen in the same script directory.

Also, don't be sure that they can't happen in the same script
directory. LittKit uses daemontools-style "down" files for both
oneshots and longruns, and there are probably other workarounds out
there that do the same thing. They'll all break upon the introduction
of a script called "down".

Breaking workarounds isn't a valid reason not to do something, but the
documentation problem is. The confusion it causes will dissuade a
non-trivial number of people from working with s6-rc, all because of a
(I think unfortunate) filename.

SteveT



Re: Some suggestions about s6 and s6-rc

2015-09-19 Thread Casper Ti. Vector
I just read your modification on the blurb page (commit e56e1294), and
found it somehow still lacking: in my experience, dependency is honoured
by OpenRC even with `rc_parallel' enabled; and more than that,
"readiness" (here defined as `exit 0' for a runscript) is also honoured:

> % head /etc/init.d/test.*
> ==> /etc/init.d/test.1 <==
> #!/sbin/openrc-run
> description="test 1"
> start() { sleep 2; }
> stop() { sleep 2; }
> 
> ==> /etc/init.d/test.2 <==
> #!/sbin/openrc-run
> description="test 2"
> depend() { need test.1; }
> start() { sleep 2; }
> stop() { sleep 2; }

> % (
>     sudo /etc/init.d/test.2 --quiet start & for i in $(seq 3);
>     do sleep 1.5; echo "# test $i"; rc-status | grep test; done
>   ) 2> /dev/null
> # test 1
>  test.1   [ starting  ]
>  test.2   [ starting  ]
> # test 2
>  test.1   [  started  ]
>  test.2   [ starting  ]
> # test 3
>  test.1   [  started  ]
>  test.2   [  started  ]

> % (
>     sudo /etc/init.d/test.1 --quiet stop & for i in $(seq 3);
>     do sleep 1.5; echo "# test $i"; rc-status | grep test; done
>   ) 2> /dev/null
> # test 1
>  test.1   [ stopping  ]
>  test.2   [ stopping  ]
> # test 2
>  test.1   [ stopping  ]
> # test 3

I (after reading your response) also grepped the OpenRC source tree and
found the `if not rc_parallel then waitpid()' lines; a cursory guess is
that OpenRC uses some other (whether elegant or ugly) technique to
ensure dependency and "readiness" (`exit 0').

On Sat, Sep 19, 2015 at 02:26:44PM +0200, Laurent Bercot wrote:
>   Grepping the current OpenRC git for "parallel" shows only a few instances
> of rc_parallel use; it seems to be used to defer waitpid() calls,
> which means OpenRC will be able to start/stop services without waiting
> for the exit code of previous invocations.
> 
>   This very much looks like an addition as an afterthought, not as an
> inherently parallel design. Unless I'm mistaken, there is no check for
> readiness; in the serial case, readiness is considered achieved when the
> invoked program exits. Using rc_parallel seems to defeat that design and
> possibly break service ordering: in other words, it is a hack that will
> only work if you're lucky, and goes contrary to the mechanics of OpenRC
> in the first place.
>   Unfortunately, since most services get ready quickly enough and the
> Linux scheduler isn't retarded, problems rarely occur, as you have
> experienced; so it is not entirely obvious that rc_parallel is broken -
> but broken it is, and broken it will be unless the whole OpenRC engine
> is redesigned.
> 
>   You can't add parallel service start/stop as an afterthought. It has to
> be included in the design. OpenRC is a good serial rc system, but it's
> not a parallel rc system by any means.

-- 
My current OpenPGP key:
4096R/0xE18262B5D9BF213A (expires: 2017.1.1)
D69C 1828 2BF2 755D C383 D7B2 E182 62B5 D9BF 213A



Re: Some suggestions about s6 and s6-rc

2015-09-19 Thread Avery Payne
With regard to having scripted placement of down files, if it was in a
template or compiled as such, then the entire process of writing it into
the definition becomes trivial or moot.  While there should always be a
manual option to override a script, or the option to write one directly, I
think the days of writing all of the definitions needed by hand have long
since passed.

But there is an issue that would keep this idea from easily occurring. You
would need to find a way to signal the daemon that the system is going
down, vs. merely the daemon going down.  I suppose you could argue that the
down and stop commands should be semantically different, and use those to
send the signal, but that's not how they are used today.  Beyond that, if
the placement of the down file was baked into all of the scripts, either by
compiling or templating, then there isn't an issue of repeatedly typing in
support for this.


Some suggestions about s6 and s6-rc

2015-09-19 Thread Casper Ti. Vector
Since it is now public that Laurent plans to release s6-rc in
September 2015, I think it would be beneficial to try to rid the related
documentation of factual errors (I keep imagining how Rachel Carson and
her friends tried to eliminate flaws in "Silent Spring").  Here are my
own findings:

* In `s6:doc/servicedir.html':

  In description of the `finish' file, there is "A finish script must do
  its work and exit in less than 3 seconds", which (1) does not mention
  the timeout can be modified in `timeout-finish' and (2) is out of sync
  with the default value (5000ms, i.e. 5s) in the implementation in
  `src/supervision/s6-supervise.c' and the description of
  `timeout-finish' on the same HTML page.

* In `s6-rc:doc/why.html':

  This "blurb" page describes OpenRC as starting (and shutting down,
  though not explicitly saying that) services sequentially.  This is
  only partially true: in Gentoo, parallel startup and shutdown in
  OpenRC can be enabled by setting `rc_parallel=YES' in `/etc/rc.conf'.
  The comment in `rc.conf' says lock-ups might still be encountered, but
  I only experienced that problem three or four times several years ago,
  out of several thousand boots so far.
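
For the first item, overriding the finish timeout amounts to writing a
number of milliseconds into the `timeout-finish' file of the service
directory. A minimal sketch, using a throwaway directory and a
hypothetical service name `myservice' (both are placeholders, not paths
from the s6 documentation):

```sh
#!/bin/sh
# Sketch only: "myservice" and the temporary path are placeholders.
svcdir=$(mktemp -d)/myservice
mkdir -p "$svcdir"

# Allow the finish script up to 10 seconds (the value is in milliseconds).
echo 10000 > "$svcdir/timeout-finish"
cat "$svcdir/timeout-finish"
```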

By the way, I feel one behaviour of s6-rc can be slightly adjusted for a
good reason:

* In `s6-rc:doc/s6-rc-compile.html':

  With the current behaviour, a oneshot requires an `up' file, but not a
  `down' file.  At first sight this asymmetry seemed really unnecessary
  to me, and I remember recently reading a post on this mailing list
  asking about oneshots with only a functioning `down', so this is not
  just imagination.  Therefore, I propose changing the requirement for
  oneshots from mandating `up' to mandating *at least one of* `up' and
  `down'; I think this change should be technically trivial and also
  backward-compatible.
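
The proposed rule can be stated as a small check. This is a hypothetical
sketch of the acceptance test only, not code from s6-rc-compile; the
function name oneshot_ok and the example directory names are made up for
the illustration.

```sh
#!/bin/sh
# Hypothetical check: accept a oneshot definition directory if it has
# at least one of "up" or "down" (the current rule would require "up").
oneshot_ok() {
    [ -f "$1/up" ] || [ -f "$1/down" ]
}

# Demo with throwaway definition directories.
src=$(mktemp -d)
mkdir "$src/mount-tmp" "$src/save-clock" "$src/empty"
: > "$src/mount-tmp/up"
: > "$src/save-clock/down"
for d in "$src"/*/; do
    if oneshot_ok "$d"; then echo "accept $d"; else echo "reject $d"; fi
done
rm -rf "$src"
```

Every definition accepted today (one with an `up' file) still passes
this check, which is the sense in which the change would be
backward-compatible.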

Finally, I acknowledge that code written by Laurent is excellent, but
I also think some git commits by Laurent have really terrible summaries:
"s6-rc-update doc, bugfix", " and another one", "More work on
s6-rc-update", etc.  Therefore, to ease future references, perhaps
Laurent can spend a little more time phrasing the summary when
committing more changes?

Sorry if this mail does not seem quite humble...




Re: Some suggestions about s6 and s6-rc

2015-09-19 Thread Laurent Bercot

On 19/09/2015 14:52, James Powell wrote:

> I don't see it, rc_parallel, as entirely broken, that is if you
> follow proper scripting techniques and create the proper dependency
> prestarts.


 Even if you do, it's not guaranteed to work as long as you don't
have a way to notify readiness. In the serial case, OpenRC starts a
subprocess to start a service, and readiness is assumed when the
subprocess exits. That defers readiness test to the subprocess, which
is perfectly reasonable.
 With rc_parallel, you just don't wait for the subprocess to exit.
I haven't studied the code in detail, but without any readiness
notification system, there's no way it's going to respect the
dependency graph. It's basically "start everything at the same
time, and yolo". Which defeats the purpose of a dependency-based
service manager.
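
 The ordering concern can be shown with a toy shell example. It models
the general argument only (exit of the start subprocess standing in for
readiness) and says nothing about how OpenRC actually implements
rc_parallel; start_a and start_b are placeholders for two services
where b depends on a.

```sh
#!/bin/sh
# Toy model, not OpenRC code. Service b depends on service a.
log=$(mktemp)

start_a() { sleep 1; echo "a ready" >> "$log"; }   # a needs time to become ready
start_b() { echo "b started" >> "$log"; }

# Serial: a's start procedure has exited (taken as "ready") before b starts.
start_serial() { start_a; start_b; }

# Parallel without readiness notification: b is launched without waiting for a.
start_parallel() { start_a & start_b; wait; }

: > "$log"; start_serial;   echo "serial:";   cat "$log"
: > "$log"; start_parallel; echo "parallel:"; cat "$log"
```

 In the serial run b is logged only after a has finished starting; in
the parallel run nothing makes b wait for a, so b is normally logged
first.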



> I've often wondered if services started via OpenRC could be run
> wrapped by s6: instead of scripting the daemon's start via direct
> execution, you would start it via OpenRC by executing the s6 run
> script and stop it via the finish script, with the OpenRC script
> acting as a manager layer.


 I think that's what the "supervisor=s6" variable does.
 See https://github.com/OpenRC/openrc/blob/master/s6-guide.md

--
 Laurent


Re: Some suggestions about s6 and s6-rc

2015-09-19 Thread Casper Ti. Vector
On Sat, Sep 19, 2015 at 02:26:44PM +0200, Laurent Bercot wrote:
>   You can't add parallel service start/stop as an afterthought. It has to
> be included in the design. OpenRC is a good serial rc system, but it's
> not a parallel rc system by any means.

Thanks for your explanation; it is very clear.  Nevertheless, I think it
will save us some arguments with a certain group of propagandists
after September if you somehow clarify this in the documentation -- as
you said on the DNG mailing list, "propaganda works", so making the blurb
more distortion-proof might reduce the productivity wasted in malicious
arguments.

>   So, I don't mind the asymmetry because it's a natural one given the
> way a system works, and working around it is trivial. I only made
> the "down" script optional because it's true that a lot of oneshots
> won't have anything to do when being turned off; but the opposite is
> exceptional. Does it really bother you?

Well, it is more of a "mathematical" ugliness to me -- a workaround is
also trivial.  If the proposed change is applied, it will make the model
(very) slightly better at nearly negligible cost, so this is really up
to the party in charge: I usually choose to apply such changes, but this
time it ultimately depends on your taste.

>   I promise I'll try to be more verbose and explicit in the commit logs
> for significant changes, and especially changes that add functionalities,
> i.e. things that may need a revert: it's important to know what commit
> broke stuff. For trivial modifications, I don't want to bother making
> nontrivial commit messages. Maybe I'm wrong about this.

That will be nice :)

>   As long as you're civil and relevant, that's really not a problem.
> We're all a big bunch of egos, no need to apologize for it. ;)

Thanks :)




Re: Some suggestions about s6 and s6-rc

2015-09-19 Thread Laurent Bercot

On 20/09/2015 00:23, Steve Litt wrote:

> Basically, on startup, before bringing up the process supervisor, you
> write "down" files to every service not containing a "nodown". Then you
> erase down files one at a time.


 Clarity check.
 Casper, Guillermo and I were not talking about ./down files in a
(longrun) service directory. We were talking about "down" scripts
in a (oneshot) s6-rc service definition directory, which is not
the same at all.

 I agree that the name collision is confusing, and it is an annoyance.
But in the context of s6-rc, the confusion cannot happen, because
those "down" scripts only exist for oneshot services, which do not
have a service directory managed by a supervision suite.

--
 Laurent



Re: Some suggestions about s6 and s6-rc

2015-09-19 Thread Casper Ti. Vector
Since s6-rc is still unreleased, perhaps we can still take the chance to
rename `up'/`down' in oneshots to `run'/`finish', in order to let them
look a little more unified?

On Sun, Sep 20, 2015 at 12:33:28AM +0200, Laurent Bercot wrote:
>   I agree that the name collision is confusing, and it is an annoyance.




Re: Some suggestions about s6 and s6-rc

2015-09-19 Thread Casper Ti. Vector
Allow me to clarify myself: what I proposed is to *also* allow oneshots
which have a `down' file but no `up' file.  But again, the choice is not
up to me, so I stop here...

On Sat, Sep 19, 2015 at 01:03:32PM -0300, Guillermo wrote:
> have an explicit start(). Being forced to always do 'touch down; chmod
> a+x down' for s6-rc oneshots, to always write "stop() { ; }" in OpenRC
> scripts, or to always do:
