Re: Some suggestions about s6 and s6-rc
On Sat, 19 Sep 2015 11:11:37 -0700 Avery Payne wrote:

> With regard to having scripted placement of down files, if it was in a
> template or compiled as such, then the entire process of writing it
> into the definition becomes trivial or moot. While there should
> always be a manual option to override a script, or the option to
> write one directly, I think the days of writing all of the
> definitions needed by hand have long since passed.
>
> But there is an issue that would keep this idea from easily
> occurring. You would need to find a way to signal the daemon that
> the system is going down, vs. merely the daemon going down.

I solved all this stuff, with LittKit, by defining files "reallydown" and "nodown", signifying "yes, this service truly is supposed to be down", and "this is the first service to run", respectively. I've tested it with s6 and with daemontools-encore (slightly different versions of the shellscripts), and it works perfectly.

Basically, on startup, before bringing up the process supervisor, you write "down" files to every service not containing a "nodown". Then you erase down files one at a time.

LittKit in no way requires any modification to the supervision system. If the supervisor does what original daemontools does with the "down" file, LittKit can bring up services (and oneshots) one at a time, intermixing services and oneshots.

Here's the LittKit README:

http://troubleshooters.com/projects/littkit/README

I'm not saying nine little shellscripts is the best solution to the situation, but it's not all that tough a situation.

SteveT

Steve Litt
August 2015 featured book: Troubleshooting: Just the Facts
http://www.troubleshooters.com/tjust
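The down-file approach described above can be sketched roughly as follows. This is only an illustration and not LittKit's actual scripts: the function names `mark_all_down` and `bring_up_in_order`, the scan directory layout, and the way "reallydown" is honoured are all assumptions. It relies only on the daemontools convention that a service directory containing a file named "down" is not started automatically.

```shell
#!/bin/sh
# Hypothetical sketch of the idea; NOT LittKit's real code.

# Create a "down" file in every service directory under $1,
# except those that contain a "nodown" marker file.
mark_all_down() {
    scandir=$1
    for svc in "$scandir"/*/; do
        [ -d "$svc" ] || continue           # skip if the glob matched nothing
        [ -e "${svc}nodown" ] && continue   # this service may start right away
        : > "${svc}down"
    done
}

# Remove "down" files one at a time, in the order the names are given.
# Services marked "reallydown" are left alone (assumed semantics).
# A real script would then ask the supervisor to start the service
# and wait until it is usable before moving on.
bring_up_in_order() {
    scandir=$1
    shift
    for name in "$@"; do
        if [ -e "$scandir/$name/reallydown" ]; then
            continue
        fi
        rm -f "$scandir/$name/down"
        echo "released $name"
    done
}
```

Usage would be something like `mark_all_down /service` before the supervisor starts, then `bring_up_in_order /service network syslog sshd` afterwards; the path and service names here are placeholders.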
Re: Some suggestions about s6 and s6-rc
On Sun, 20 Sep 2015 11:06:34 +0800 "Casper Ti. Vector" wrote:

> On Sun, Sep 20, 2015 at 12:33:28AM +0200, Laurent Bercot wrote:
> > I agree that the name collision is confusing, and it is an
> > annoyance.
>
> Since s6-rc is still unreleased, perhaps we can still take the chance
> to rename `up'/`down' in oneshots to `run'/`finish', in order to let
> them look a little more unified?

Yes. The use of a file called "down" to tell the system not to run the process, and also the use of a script called "down" to perform an action at the appropriate time, will be holy hell to document, even if theoretically they cannot both happen in the same script directory.

Also, don't be sure that they can't happen in the same script directory. LittKit uses daemontools-style "down" files for both oneshots and longruns, and there are probably other workarounds out there that do the same thing. They'll all break upon the introduction of a script called "down".

Breaking workarounds isn't a valid reason not to do something, but the documentation problem is. The confusion it causes will dissuade a non-trivial number of people from working with s6-rc, all because of a (I think unfortunate) filename.

SteveT

Steve Litt
August 2015 featured book: Troubleshooting: Just the Facts
http://www.troubleshooters.com/tjust
Re: Some suggestions about s6 and s6-rc
I just read your modification on the blurb page (commit e56e1294), and found it somewhat still lacking: in my experience, dependency is honoured by OpenRC even with `rc_parallel' enabled; and more than that, "readiness" (here defined as `exit 0' for a runscript) is also honoured:

> % head /etc/init.d/test.*
> ==> /etc/init.d/test.1 <==
> #!/sbin/openrc-run
> description="test 1"
> start() { sleep 2; }
> stop() { sleep 2; }
>
> ==> /etc/init.d/test.2 <==
> #!/sbin/openrc-run
> description="test 2"
> depend() { need test.1; }
> start() { sleep 2; }
> stop() { sleep 2; }
> % (
>     sudo /etc/init.d/test.2 --quiet start & for i in $(seq 3);
>     do sleep 1.5; echo "# test $i"; rc-status | grep test; done
> ) 2> /dev/null
> # test 1
> test.1 [ starting ]
> test.2 [ starting ]
> # test 2
> test.1 [ started ]
> test.2 [ starting ]
> # test 3
> test.1 [ started ]
> test.2 [ started ]
> % (
>     sudo /etc/init.d/test.1 --quiet stop & for i in $(seq 3);
>     do sleep 1.5; echo "# test $i"; rc-status | grep test; done
> ) 2> /dev/null
> # test 1
> test.1 [ stopping ]
> test.2 [ stopping ]
> # test 2
> test.1 [ stopping ]
> # test 3

I (after reading your response) also grepped the OpenRC source tree and found the `if not rc_parallel then waitpid()' lines; a cursory guess is that OpenRC uses some other (whether elegant or ugly) technique to ensure dependency and "readiness" (`exit 0').

On Sat, Sep 19, 2015 at 02:26:44PM +0200, Laurent Bercot wrote:
> Grepping the current OpenRC git for "parallel" shows only a few instances
> of rc_parallel use; it seems to be used to defer waitpid() calls,
> which means OpenRC will be able to start/stop services without waiting
> for the exit code of previous invocations.
>
> This very much looks like an addition as an afterthought, not as an
> inherently parallel design. Unless I'm mistaken, there is no check for
> readiness; in the serial case, readiness is considered achieved when the
> invoked program exits.
> Using rc_parallel seems to defeat that design and
> possibly break service ordering: in other words, it is a hack that will
> only work if you're lucky, and goes contrary to the mechanics of OpenRC
> in the first place.
> Unfortunately, since most services get ready quickly enough and the
> Linux scheduler isn't retarded, problems rarely occur, as you have
> experienced; so it is not entirely obvious that rc_parallel is broken -
> but broken it is, and broken it will be unless the whole OpenRC engine
> is redesigned.
>
> You can't add parallel service start/stop as an afterthought. It has to
> be included in the design. OpenRC is a good serial rc system, but it's
> not a parallel rc system by any means.

--
My current OpenPGP key:
4096R/0xE18262B5D9BF213A (expires: 2017.1.1)
D69C 1828 2BF2 755D C383 D7B2 E182 62B5 D9BF 213A
Re: Some suggestions about s6 and s6-rc
With regard to having scripted placement of down files, if it was in a template or compiled as such, then the entire process of writing it into the definition becomes trivial or moot. While there should always be a manual option to override a script, or the option to write one directly, I think the days of writing all of the definitions needed by hand have long since passed.

But there is an issue that would keep this idea from easily occurring. You would need to find a way to signal the daemon that the system is going down, vs. merely the daemon going down. I suppose you could argue that the down and stop commands should be semantically different, and use those to send the signal, but that's not how they are used today.

Beyond that, if the placement of the down file was baked into all of the scripts, either by compiling or templating, then there isn't an issue of repeatedly typing in support for this.
Some suggestions about s6 and s6-rc
Since it has been made public that Laurent plans to release s6-rc in September 2015, I think it will be beneficial to try to rid the related documentation of factual errors (I keep imagining how Rachel Carson and her friends tried to eliminate flaws in "Silent Spring"). Here are my own findings:

* In `s6:doc/servicedir.html':
  In the description of the `finish' file, there is "A finish script must do its work and exit in less than 3 seconds", which (1) does not mention that the timeout can be modified in `timeout-finish' and (2) is out of sync with the default value (5000ms, i.e. 5s) in the implementation in `src/supervision/s6-supervise.c' and with the description of `timeout-finish' on the same HTML page.

* In `s6-rc:doc/why.html':
  This "blurb" page describes OpenRC as starting (and shutting down, though not explicitly saying that) services sequentially. This is only partially true: in Gentoo, parallel startup and shutdown in OpenRC can be enabled by setting `rc_parallel=YES' in `/etc/rc.conf'. The comment in `rc.conf' says lock-ups might still be encountered, but I only experienced that problem three or four times several years ago, out of several thousand boots so far.

By the way, I feel one behaviour of s6-rc can be slightly adjusted for a good reason:

* In `s6-rc:doc/s6-rc-compile.html':
  With the current behaviour, a oneshot requires an `up' file, but not a `down' file. At first sight this asymmetry seemed really unnecessary to me, and I remember recently reading a post on this mailing list asking about oneshots where only `down' does anything, so this is not just imagination. Therefore, I propose to change the requirement for oneshots from mandating `up' to mandating *at least one of* `up' and `down'; I think this change should be technically trivial and also backward-compatible.
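To make the proposed rule concrete, here is a rough sketch of the two checks side by side. It is not taken from s6-rc-compile; the function names are made up for illustration, and the only assumption is that a oneshot definition is a directory that may contain files named `up' and `down'.

```shell
#!/bin/sh
# Hypothetical checks on a oneshot definition directory ($1).
# These are NOT s6-rc-compile's real code, only an illustration.

# Current behaviour as described above: `up' is mandatory.
oneshot_ok_current() {
    [ -f "$1/up" ]
}

# Proposed behaviour: at least one of `up' and `down' must exist.
oneshot_ok_proposed() {
    [ -f "$1/up" ] || [ -f "$1/down" ]
}
```

Every definition accepted by the first check is also accepted by the second, which is why the change would be backward-compatible.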
Finally, I acknowledge that the code written by Laurent is excellent, but I also think some git commits by Laurent have really terrible summaries: "s6-rc-update doc, bugfix", " and another one", "More work on s6-rc-update", etc. Therefore, to ease future reference, perhaps Laurent can spend a little more time phrasing the summary when committing more changes?

Sorry if this mail does not seem quite humble...

--
My current OpenPGP key:
4096R/0xE18262B5D9BF213A (expires: 2017.1.1)
D69C 1828 2BF2 755D C383 D7B2 E182 62B5 D9BF 213A
Re: Some suggestions about s6 and s6-rc
On 19/09/2015 14:52, James Powell wrote:
> I don't see it, rc_parallel, as entirely broken, that is if you follow
> proper scripting techniques and create the proper dependency prestarts.

 Even if you do, it's not guaranteed to work as long as you don't have a way to notify readiness.

 In the serial case, OpenRC starts a subprocess to start a service, and readiness is assumed when the subprocess exits. That defers the readiness test to the subprocess, which is perfectly reasonable.

 With rc_parallel, you just don't wait for the subprocess to exit. I haven't studied the code in detail, but without any readiness notification system, there's no way it's going to respect the dependency graph. It's basically "start everything at the same time, and yolo". Which defeats the purpose of a dependency-based service manager.

> I've often wondered if services started via OpenRC could be run wrapped
> to s6, such as instead of scripting to start the daemon normally via
> direct execution, you start it wrapped via OpenRC by executing the s6
> run script and stopped by the finish script within the OpenRC script
> acting as a manager layer.

 I think that's what the "supervisor=s6" variable does. See
 https://github.com/OpenRC/openrc/blob/master/s6-guide.md

--
 Laurent
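The difference between the two cases can be shown with a toy sketch. This is not OpenRC's code: `start_a` and `start_b` are made-up stand-ins for the subprocesses that start two services, where b is supposed to depend on a, and the `LOG` file only records the order in which things happen.

```shell
#!/bin/sh
# Toy model only; NOT how OpenRC is implemented.
LOG=${LOG:-./start-order.log}

# Stand-ins for service start subprocesses. Service a takes a moment
# to become ready; service b depends on a.
start_a() { sleep 1; echo "a ready" >> "$LOG"; }
start_b() { echo "b started" >> "$LOG"; }

# Serial case: readiness of a is assumed when its subprocess exits,
# so b is only started after the wait returns.
start_serial() {
    start_a &
    wait "$!"
    start_b
}

# Parallel case without readiness notification: nothing waits for a,
# so b can start before a is ready.
start_parallel_no_readiness() {
    start_a &
    start_b &
    wait
}
```

With these stand-ins, the serial variant logs a before b, while the parallel variant will normally log b first, because nothing tells it when a is ready.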
Re: Some suggestions about s6 and s6-rc
On Sat, Sep 19, 2015 at 02:26:44PM +0200, Laurent Bercot wrote:
> You can't add parallel service start/stop as an afterthought. It has to
> be included in the design. OpenRC is a good serial rc system, but it's
> not a parallel rc system by any means.

Thanks for your explanation, it is very clear. Nevertheless, I think it will save us some argument against a certain group of propagandists after September if you somehow clarify this in the documentation -- as you said in the DNG mailing list, "propaganda works", so making the blurb more distortion-proof might reduce productivity wasted in malicious arguments.

> So, I don't mind the asymmetry because it's a natural one given the
> way a system works, and working around it is trivial. I only made
> the "down" script optional because it's true that a lot of oneshots
> won't have anything to do when being turned off; but the opposite is
> exceptional. Does it really bother you?

Well, it is more of a "mathematical" ugliness to me -- a workaround is also trivial. If the proposed change is applied, it will make the model (very) slightly better with nearly negligible cost, so this is really up to the party in charge: I usually choose to apply such changes, but this time it finally depends on your taste.

> I promise I'll try to be more verbose and explicit in the commit logs
> for significant changes, and especially changes that add functionalities,
> i.e. things that may need a revert: it's important to know what commit
> broke stuff. For trivial modifications, I don't want to bother making
> nontrivial commit messages. Maybe I'm wrong about this.

That will be nice :)

> As long as you're civil and relevant, that's really not a problem.
> We're all a big bunch of egos, no need to apologize for it. ;)

Thanks :)

--
My current OpenPGP key:
4096R/0xE18262B5D9BF213A (expires: 2017.1.1)
D69C 1828 2BF2 755D C383 D7B2 E182 62B5 D9BF 213A
Re: Some suggestions about s6 and s6-rc
On 20/09/2015 00:23, Steve Litt wrote:
> Basically, on startup, before bringing up the process supervisor, you
> write "down" files to every service not containing a "nodown". Then you
> erase down files one at a time.

 Clarity check.
 Casper, Guillermo and I were not talking about ./down files in a (longrun) service directory. We were talking about "down" scripts in a (oneshot) s6-rc service definition directory, which is not the same at all.

 I agree that the name collision is confusing, and it is an annoyance. But in the context of s6-rc, the confusion cannot happen, because those "down" scripts only exist for oneshot services, which do not have a service directory managed by a supervision suite.

--
 Laurent
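The two meanings of "down" can be put side by side in a small sketch. Everything here is illustrative: the function name, directory names, and file contents are made up, and the `type` file and the exact format of `up`/`down` are assumptions about the s6-rc source format that may not match the unreleased version under discussion.

```shell
#!/bin/sh
# Illustrative layouts only; names and contents are hypothetical.
make_examples() {
    root=$1

    # (1) Supervision service directory for a longrun:
    #     "down" is an empty marker file meaning "do not start automatically".
    mkdir -p "$root/servicedir/mydaemon"
    printf '#!/bin/sh\nexec mydaemon\n' > "$root/servicedir/mydaemon/run"
    chmod +x "$root/servicedir/mydaemon/run"
    : > "$root/servicedir/mydaemon/down"

    # (2) s6-rc definition directory for a oneshot (assumed format):
    #     "down" holds the action to perform when the service is stopped.
    mkdir -p "$root/source/mymount"
    echo oneshot > "$root/source/mymount/type"
    echo 'mount /mnt/data' > "$root/source/mymount/up"
    echo 'umount /mnt/data' > "$root/source/mymount/down"
}
```

In the first directory the mere presence of `down` matters and its content is ignored; in the second, `down` is something to be executed.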
Re: Some suggestions about s6 and s6-rc
Since s6-rc is still unreleased, perhaps we can still take the chance to rename `up'/`down' in oneshots to `run'/`finish', in order to let them look a little more unified?

On Sun, Sep 20, 2015 at 12:33:28AM +0200, Laurent Bercot wrote:
> I agree that the name collision is confusing, and it is an annoyance.

--
My current OpenPGP key:
4096R/0xE18262B5D9BF213A (expires: 2017.1.1)
D69C 1828 2BF2 755D C383 D7B2 E182 62B5 D9BF 213A
Re: Some suggestions about s6 and s6-rc
Allow me to clarify: what I proposed is to *also* allow oneshots which have a `down' file but no `up' file. But again, the choice is not up to me, so I stop here...

On Sat, Sep 19, 2015 at 01:03:32PM -0300, Guillermo wrote:
> have an explicit start(). Being forced to always do 'touch down; chmod
> a+x down' for s6-rc oneshots, to always write "stop() { ; }" in OpenRC
> scripts, or to always do:

--
My current OpenPGP key:
4096R/0xE18262B5D9BF213A (expires: 2017.1.1)
D69C 1828 2BF2 755D C383 D7B2 E182 62B5 D9BF 213A