Re: s6 bites noob
> just take this as a data sample for what can happen when a random noob
> tries to use s6.

Although unpleasant (not gonna lie), it was a very useful user experience report, thank you. Among other things, it comforts me in the belief that a user interface layer on top of s6 + s6-rc + s6-linux-init is the way to go - a layer that makes things Just Work even when users don't do everything perfectly, and with friendlier behaviour in case of an error. People will still be able to look under the hood and tweak things manually, but they won't have to, and they won't be exposed to the nuts and bolts unless they want to.

Also, just in case someone tries the latest s6 / s6-rc git head: I have added "uid/self" and "gid/self" key checking in the accessrules library, for when the client runs with the same euid / the same egid as the server; and I have changed s6-rc-compile to use the functionality, removing its -u and -g options in the process.

So now, the behaviour should always be consistent: the user who can operate a s6-rc database is always the user who owns the supervision tree. No exceptions. root can also use s6-rc commands, but services will still run as the user who owns the supervision tree.

A numbered release of s6 and s6-rc (and lots of other packages) will happen some time next month.

> BTW, your explanations of why things are designed the way they are were
> helpful for understanding the system. I recommend copying them into the docs.

I should write a "rationale / policy recommendation" section in the documentation pages, that is a good idea.

--
Laurent
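As a rough illustration of the "uid/self" key mentioned above, here is a sketch of a filesystem rules directory that uses it. The layout (a uid/KEY/allow file under a rules directory) is assumed from the existing s6 accessrules conventions, the helper name make_self_rules is made up for this example, and per the message above s6-rc-compile now uses the functionality itself, so this is only meant to show what the key expresses.

```shell
#!/bin/sh
# make_self_rules DIR (hypothetical helper): create a rules directory whose
# only rule authorizes clients running with the same euid as the server.
# Assumption: the "self" key sits next to numeric uids and "default",
# i.e. DIR/uid/self/allow.
make_self_rules() {
  mkdir -p "$1/uid/self" || return 1
  # An empty "allow" file marks the key as authorized.
  : > "$1/uid/self/allow"
}
```

A "gid/self" rule would presumably be laid out the same way under DIR/gid/self, for clients sharing the server's egid.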
Re: Generic interrupt command?
> Not outputting anything causes kill (on my system at least) to exit non 0

Not outputting anything isn't an option, for the case where -o pid is used in addition to other fields. The field number and order must be respected.

It's probably best to use some OOB indicator. How about NA, which I already use for non-numeric fields? It makes kill correctly choke. Would it be better to use NA in all the numeric fields, too?

--
Laurent
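To make the point about field order concrete, here is a small sketch of what happens to a script that picks fields by position. The sample lines are invented for the example (assumed order: up, pid, wantedup) and are not captured s6-svstat output; field_at is a hypothetical helper.

```shell
#!/bin/sh
# field_at N LINE: print the Nth whitespace-separated field of LINE.
field_at() {
  printf '%s\n' "$2" | awk -v n="$1" '{ print $n }'
}

# Invented sample lines in the assumed order "up pid wantedup".
with_placeholder="false NA true"   # down service, pid shown as NA
without_placeholder="false true"   # down service, pid field left out

field_at 3 "$with_placeholder"     # wantedup is still field 3
field_at 2 "$without_placeholder"  # wantedup has slid into the pid slot
```

With the placeholder, a consumer can still rely on positions and only has to recognize NA; without it, every later field silently shifts left.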
Re: Generic interrupt command?
On Tue, Feb 5, 2019, 2:20 AM Laurent Bercot wrote:

> >Be careful, though. If the service is down, kill will use -1 for the PID,
> >and will probably signal everything in your system except PID 1.
>
> That's a good point. Should s6-svstat use 0 as the "service is down"
> pid value instead, to avoid this ?

0 behaves better for this use case, but can still produce unexpected behavior. The construction "echo 0 | xargs kill -STOP" for example leaves behind a paused background task that needs to be cleaned up by hand. The construction "kill -STOP $(echo 0)" hangs the terminal until someone resumes the user's shell. Most other "kill -whatever $(echo 0)" invocations result in the shell exiting and the user having to log back in. So, 0 is a lot better than -1, but still not great.

Not outputting anything causes kill (on my system at least) to exit non 0 and give some diagnostic ("`' not a pid or valid pid spec", "you need to specify whom to kill", or the usage message). That's nice, but would probably break other scripting that expects a value, especially for s6-svstat showing multiple fields.

I can't think of a safe and simple way to do this. For example, we could suggest people do something like this (based on Roger Pate's post):

    pid=$(s6-svstat -p /my/service) && [ "$pid" -ne -1 ] && kill -SIGNAL $pid

but that's a lot of typing and requires that people see and remember the suggestion, so not quite simple :-/

--
John O'Meara
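One way to cut down on the typing is to wrap the check in a function once. The following is only a sketch: is_safe_pid and signal_service are hypothetical names, and refusing pid 1 as well as 0, -1, NA and the empty string is a choice made for this example, not something s6 prescribes.

```shell
#!/bin/sh
# is_safe_pid VALUE: succeed only if VALUE is a decimal integer greater than 1.
# Rejects "", "NA", "0", "-1" and anything else that is not all digits.
is_safe_pid() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit ("-1", "NA")
  esac
  [ "$1" -gt 1 ]               # 0 and 1 are refused as well
}

# signal_service SIGNAL SERVICEDIR (hypothetical wrapper): send SIGNAL to the
# pid reported by "s6-svstat -p", but only if that value passes is_safe_pid.
signal_service() {
  pid=$(s6-svstat -p "$2") || return 1
  if is_safe_pid "$pid"; then
    kill "-$1" "$pid"
  else
    echo "refusing to signal: service reports pid '$pid'" >&2
    return 1
  fi
}
```

Usage would look like "signal_service STOP /my/service". Because the guard only accepts plain digits, it behaves the same whichever "service is down" value s6-svstat ends up printing.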
Re: s6 bites noob
Laurent Bercot writes:

>>Anyway, recompile with -u 1000, re-update, and try again. Now, I can't even
>>do s6-rc -a list; I get:
>>s6-rc fatal: unable to take locks: Permission denied
>
> Hmmm, that's weird. If all the previous operations have been done as
> the same user, you should never get EPERM. Have you run something as
> root before?

Indeed, I did. My command history from last night shows that before I remembered to try compiling with -u 1000, I tried sudo s6-rc change testpipe, after the previous non-sudo invocation failed with a permission error, so that must be what screwed it up. I don't remember doing that. Must have been really tired and frustrated.

So I killed svscan, removed my compiled databases and scan and live dirs, and started from scratch. Now s6-rc succeeds, but when I brought up testpipe (two daemons funneling to a logger), I got once per second:

fdclose: fatal: unable to exec ./run.user: Exec format error

Oops, I forgot #!/bin/bash at the top of one of the run files. (Would have been helpful if the error message had specified which one.) Fix that, recompile, make new link, do an update, try again. Now:

s6-fdholder-retrievec: fatal: unable to retrieve fd for id pipe:s6rc-r-logger: Broken pipe
s6-fdholder-retrievec: fatal: unable to retrieve fd for id pipe:s6rc-w-logger: Broken pipe
s6-fdholder-retrievec: fatal: unable to retrieve fd for id pipe:s6rc-r-logger: Connection reset by peer
s6-fdholder-retrievec: fatal: unable to retrieve fd for id pipe:s6rc-w-logger: Connection reset by peer

It also somehow managed to hose the terminal in which svscan was running. As in, when I try to type in it, only a small percentage of the letters actually appear. Killed svscan, tried to reset the terminal, no luck. This is the first time I remember ever getting an un-resettable terminal. No problem, I can just kill the terminal, but... weird.

Oops, after I added the forgotten #!/bin/bash, I forgot -u 1000 again when I recompiled. So, the failure should be expected, but hosing the terminal? Really? And the error messages give no hint of what's actually wrong, unless you're familiar with the internal design of s6, which seems an excessive burden for a mere user. I guess I'm spoiled by modern C compilers, which have become excellent in the past few years at explaining in exquisite detail exactly in which way I'm currently being an idiot.

So, remove the compiled databases and scan directory, recompile with -u 1000, restart svscan, re-run s6-rc-init, try testpipe again, and... success! Wow, that was unexpected. I'd become conditioned to expect failure.

Ok now, quick, while I remember how to use s6, I'll install it into my project and make sure it works perfectly, so I never have to touch it again. There are other things I'd be curious to try with it too, but I shouldn't keep pestering you and the mailing list for unpaid tech support, so I guess just take this as a data sample for what can happen when a random noob tries to use s6.

BTW, your explanations of why things are designed the way they are were helpful for understanding the system. I recommend copying them into the docs.
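Since the "Exec format error" above did not say which run file was missing its interpreter line, a check like the following can be run over the service source directory before compiling. It is a sketch, not part of s6: list_missing_shebang is a hypothetical helper, and it assumes every script to check is a regular file named "run" (a run file that is a compiled binary would be flagged too).

```shell
#!/bin/sh
# list_missing_shebang DIR: print every regular file named "run" under DIR
# whose first line does not start with "#!".
list_missing_shebang() {
  find "$1" -type f -name run | while IFS= read -r f; do
    first=
    # read fails on an empty file or a missing final newline; that is fine,
    # the variable still holds whatever was read.
    IFS= read -r first < "$f" || :
    case "$first" in
      '#!'*) ;;                      # has an interpreter line
      *) printf '%s\n' "$f" ;;       # would likely fail with ENOEXEC
    esac
  done
}
```

Running "list_missing_shebang ./source" (with ./source standing in for wherever the service definitions live) prints the offending paths, or nothing if every run file looks fine.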