Re: logging services with shell interaction
> we have a fair number of services which allow (and occasionally require) user interaction via a (built-in) shell. All the shell interaction is supposed to be logged, in addition to all the messages that are issued spontaneously by the process. So we cannot directly use a logger attached to the stdout/stderr of the process.

I don't understand the consequence relationship here.

- If you control your services / builtin shells, the services could have an option to log the IO of their shells to stderr, as well as their own messages.
- Even if you cannot make the services log the shell IO, you can add a small data dumper in front of the service's shell, which transmits full-duplex everything it gets but also writes it to its own stdout or stderr; if that stdout/err is the same pipe as the stdout/err of your service, then all the IO from the shell will be logged to the same place (and log lines won't be mixed unless they're more than PIPE_BUF bytes long, which shouldn't happen in practice).

So with that solution you could definitely make your services log to multilog.

> procServ is a process supervisor adapted to such situations. It allows an external process (conserver in our case) to attach to the service's shell via a TCP or UNIX domain socket. procServ supports logging everything it sees (input and output) to a file or stdout.

That works too.

> ```
> IOC=$1
> /usr/bin/procServ -f -L- --logstamp --timefmt="$TIMEFMT" \
>   -q -n %i --ignore=^D^C^] -P "unix:$RUNDIR/$IOC" -c "$BOOTDIR" "./$STCMD" \
>   | /usr/bin/multilog "s$LOGSIZE" "n$LOGNUM" "$LOGDIR/$IOC"
> ```
>
> So far this seems to do the job, but I have two questions:
>
> 1. Is there anything "bad" about this approach? Most supervision tools have this sort of thing as a built-in feature and I suspect there may be a reason for that other than mere convenience.

It's not *bad*, it's just not as airtight as supervision suites make it.
The reasons why it's a built-in feature in daemontools/runit/s6/others are:
- it allows the logger process to be supervised as well
- it keeps the pipe to the logger open, so service and logger can be restarted independently at will, without risk of losing logs.

As is, you can't send signals to multilog (useful if you want to force a rotation) without knowing its pid. And if multilog dies, procServ gets a broken pipe, so it (and your service) is probably forced to restart, and you lose the data that it wanted to write. A supervision architecture with integrated logging protects from this.

> 2. Do any of the existing process supervision tools support what procServ gives us wrt interactive shell access from outside?

Not that I know of, because that need is pretty specific to your service architecture. However, unless there are more details you have omitted, I still believe you could obtain the same functionality with a daemontools/etc. infrastructure and a program recording the IO from/to the shell. Since you don't seem opposed to using old djb programs, you could probably even directly reuse "recordio" from ucspi-tcp for this. :)

--
Laurent
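
For what it's worth, the "small data dumper" idea from the reply above can be sketched in a few lines of Python. This is only an illustration, not recordio and not anything procServ does internally: the names relay_and_log, write_all and CHUNK, and the tag prefix, are all made up here, and the chunk size is an assumption chosen to stay under PIPE_BUF so that each log write to a pipe stays in one piece.

```python
import os

CHUNK = 256  # assumption: stay well under PIPE_BUF (POSIX guarantees >= 512)


def write_all(fd, data):
    """Write every byte of data to fd, retrying on short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def relay_and_log(src_fd, dst_fd, log_fd, tag):
    """Copy src_fd to dst_fd until EOF, mirroring each chunk to log_fd.

    tag is a short bytes prefix marking the direction of the traffic
    (a hypothetical convention, e.g. b"< " for input, b"> " for output).
    """
    while True:
        chunk = os.read(src_fd, CHUNK)
        if not chunk:  # EOF: the peer closed its end
            break
        write_all(dst_fd, chunk)        # pass the data through unchanged
        write_all(log_fd, tag + chunk)  # tagged copy for the logger
```

One relay handles one direction only. For the full-duplex case you would presumably run two of them (say, in two threads), one per direction, both with log_fd pointing at the same stderr pipe that the service itself writes to.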
Re: Service watchdog
Petr Malat said on Tue, 19 Oct 2021 09:41:19 +0200

>Yes, in my usecase this would be used at the place where sd_notify()
>is used if the service runs under systemd. Then periodically executed
>watchdog could check the service makes progress and react if it
>doesn't.
>
>The question is how to implement the watchdog then - it could be either
>a global service or another executable in service directory, which
>would be started periodically by runsv.

LOL, I'll tell you how I did it on my reminder system, and you can decide whether or not to do it my way...

I have a reminder system written by me in Perl early this century, when I still used Perl. It runs 5 times a day via cron, popping a window up on the screen telling me of my appointments. Some consider it intrusive, I like it that way (which is why I wrote it that way).

After a few years of using my reminder system, it became apparent that sometimes it was failing silently, and I wouldn't notice the absence of popup windows, causing me to miss appointments and the like. So I wrote another program (by this time I'd switched to Python), run as a runit service:

    #!/bin/sh
    cd /d/at/python/reminder_check
    exec chpst -u slitt:slitt /d/at/python/reminder_check/reminder_check.py

The main routine of the Python program follows:

    while True:
        if tooOld(LOGFILE, TOO_OLD_HOURS):
            alarm_all()
        time.sleep(SLEEP_SECONDS)

So every SLEEP_SECONDS seconds, it checks logfile LOGFILE, which is written by the reminder program itself, to see if it's more than TOO_OLD_HOURS old, and if it is, it throws up a big old green and purple window proclaiming the alarm system is broken. In my case, SLEEP_SECONDS is 3600.

Yeah, it's polling instead of interrupt driven, but I make no apology for polling once an hour. Matter of fact, I'd make no apologies for 10 second polling, given that if everything's OK all it's going to do is check a file date.
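
The body of tooOld() isn't shown in the message, so the following is only a guess at what such a file-date check might look like. The "now" parameter and the choice to treat a missing logfile as too old are assumptions added for illustration.

```python
import os
import time


def tooOld(path, max_age_hours, now=None):
    """Return True if path was last modified more than max_age_hours ago.

    Assumption: a missing or unreadable file also counts as "too old",
    since it means the watched program never wrote its log.
    """
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True
    return (now - mtime) > max_age_hours * 3600
```

All it does when things are fine is one stat of the logfile, which is why polling even every few seconds would cost next to nothing.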
It seems to me the key question is how quickly do you need to be informed of the failure of the watched daemon. If being informed a minute later is OK, I'd say my method is fine. If being informed a second later is OK, I'd rewrite the time check in C and then if it flunks, system() the "on error" program. If you need subsecond warning, my method is probably not what you want.

By the way, when I test for a daemon functioning, I typically don't use svstatus or that other program that just returns a 1 or 0, because I don't care if the program is running: I want to know that it's *functioning*, so I test the functionality of the running program. So for the network, I'd do a quick 1 iteration ping, for PostgreSQL I might do a simple select statement, etc.

Best of luck.

SteveT

Steve Litt
Spring 2021 featured book: Troubleshooting Techniques of the Successful Technologist
http://www.troubleshooters.com/techniques
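
A rough sketch of the "test the functionality, not the process" idea from the message above: run a small check command and treat anything other than a clean exit within a time limit as a failure. The function name probe and the 5 second default are made up for this example, and the commands in the comment are only examples of checks one might plug in.

```python
import subprocess


def probe(cmd, timeout=5.0):
    """Run a check command; True only if it exits 0 within timeout seconds."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung check or a missing executable counts as "not functioning".
        return False
    return result.returncode == 0


# Possible checks (assumptions, adapt to your own setup):
#   probe(["ping", "-c", "1", "192.0.2.1"])   # one-iteration ping
#   probe(["psql", "-c", "SELECT 1"])         # trivial select statement
```

The timeout matters as much as the exit code: a daemon that accepts the request and then hangs is exactly the silent failure this kind of check is meant to catch.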
Re: Service watchdog
> Yes, in my usecase this would be used at the place where sd_notify() is used if the service runs under systemd. Then periodically executed watchdog could check the service makes progress and react if it doesn't.
>
> The question is how to implement the watchdog then - it could be either a global service or another executable in service directory, which would be started periodically by runsv.

If a single notification step is enough for you, i.e. the service goes from a "preparing" state to a "ready" state and remains ready until the process dies, then what you want is implemented in the s6 process supervisor:
https://skarnet.org/software/s6/notifywhenup.html

Then you can synchronously wait for service readiness (s6-svwait $service) or, if you have a watchdog service, periodically poll for readiness (s6-svstat -r $service).

But that's only valid if your service can only change states once (from "not ready" to "ready"). If you need anything more complex, s6 won't support it intrinsically.

The reason why there isn't more advanced support for this in any supervision suite (save systemd but even there it's pretty minimal) is that service states other than "not ready yet" and "ready" are very much service-dependent and it's impossible for a generic process supervisor to support enough states for every possible existing service. Daemons that need complex states usually come with their own monitoring software that handles their specific states, with integrated health checks etc.

So my advice would be:
- if what you need is just readiness notification, switch to s6. It's very similar to runit and I think you'll find it has other benefits as well. The drawback, obviously, is that it's not in busybox and the required effort to switch may not be worth it.
- if you need anything more complex, you can stick to runit, but you will kinda need to write your own monitor for your daemon, because that's what everyone does.
Depending on the details of the monitoring you need, the monitoring software can be implemented as another service (e.g. to receive heartbeats from your daemon), or as a polling client (e.g. to do periodic health checks). Both approaches are valid.

Don't hack on runit, especially the control pipe thing. It will not end well.

(runit's control pipe feature is super dangerous, because it allows a service to hijack the control flow of its supervisor, which endangers the supervisor's safety. That's why s6 does not implement it; it provides similar - albeit slightly less powerful - control features via ways that never give the service any power over the supervisor.)

--
Laurent
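
To make the "another service receiving heartbeats" option from the reply above a bit more concrete, here is a minimal sketch of the bookkeeping such a monitor would need. It is not part of runit or s6. The class name, its methods and the injectable clock are all assumptions for illustration, and how the heartbeats arrive (socket, fifo, file touch) is left open.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat and report when the daemon has gone quiet."""

    def __init__(self, max_silence, clock=time.monotonic):
        self.max_silence = max_silence  # seconds allowed between beats
        self.clock = clock              # monotonic, so wall-clock jumps don't matter
        self.last_beat = clock()        # assumption: start-up counts as the first beat

    def beat(self):
        """Call whenever the watched daemon reports progress."""
        self.last_beat = self.clock()

    def is_stalled(self):
        """True if no heartbeat has been seen for longer than max_silence."""
        return self.clock() - self.last_beat > self.max_silence
```

The monitor service would call beat() on every message from the daemon and check is_stalled() on a timer, reacting (alerting, or asking the supervisor to restart the service) when it returns True. The supervisor itself never has to know about any of this.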
logging services with shell interaction
Hi Everyone

we have a fair number of services which allow (and occasionally require) user interaction via a (built-in) shell. All the shell interaction is supposed to be logged, in addition to all the messages that are issued spontaneously by the process. So we cannot directly use a logger attached to the stdout/stderr of the process.

procServ is a process supervisor adapted to such situations. It allows an external process (conserver in our case) to attach to the service's shell via a TCP or UNIX domain socket. procServ supports logging everything it sees (input and output) to a file or stdout.

In the past we had recurring problems with processes that spew out an extreme amount of messages, quickly filling up our local disks. Since logrotate runs via cron it is not possible to reliably guarantee that this doesn't happen. Thus, inspired by process supervision suites a la daemontools, we are now using a small shell wrapper script that pipes the output of the process into the multilog tool from the daemontools package.

Here is the script, slightly simplified. Most of the parameters are passed via environment.

```
IOC=$1
/usr/bin/procServ -f -L- --logstamp --timefmt="$TIMEFMT" \
  -q -n %i --ignore=^D^C^] -P "unix:$RUNDIR/$IOC" -c "$BOOTDIR" "./$STCMD" \
  | /usr/bin/multilog "s$LOGSIZE" "n$LOGNUM" "$LOGDIR/$IOC"
```

So far this seems to do the job, but I have two questions:

1. Is there anything "bad" about this approach? Most supervision tools have this sort of thing as a built-in feature and I suspect there may be a reason for that other than mere convenience.
2. Do any of the existing process supervision tools support what procServ gives us wrt interactive shell access from outside?

Cheers
Ben
--
I would rather have questions that cannot be answered, than answers that cannot be questioned. -- Richard Feynman
Re: Service watchdog
Yes, in my usecase this would be used at the place where sd_notify() is used if the service runs under systemd. Then periodically executed watchdog could check the service makes progress and react if it doesn't.

The question is how to implement the watchdog then - it could be either a global service or another executable in service directory, which would be started periodically by runsv.

On Tue, Oct 19, 2021 at 07:24:38AM +0000, Ellenor Bjornsdottir wrote:
> Is this some genre of continuous readiness notification, or so?
>
> On 19 October 2021 07:20:41 UTC, Petr Malat wrote:
> >Hi,
> >I'm using the busybox implementation of runit to manage services and I
> >miss some kind of a watchdog in runsv. I thought about extending
> >supervise/control pipe by a status command which would allow to publish
> >a status, for example 's Running'. Runsv would then append a monotonic
> >timestamp when it was received and the passed string to its argv[0]
> >making it visible in the process listing. This could be used by "check"
> >to check the service is up and also by watchdog to see it made some
> >progress since the last run.
> >Any opinions on that?
> >BR,
> > Petr
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: Service watchdog
Is this some genre of continuous readiness notification, or so?

On 19 October 2021 07:20:41 UTC, Petr Malat wrote:
>Hi,
>I'm using the busybox implementation of runit to manage services and I
>miss some kind of a watchdog in runsv. I thought about extending
>supervise/control pipe by a status command which would allow to publish
>a status, for example 's Running'. Runsv would then append a monotonic
>timestamp when it was received and the passed string to its argv[0]
>making it visible in the process listing. This could be used by "check"
>to check the service is up and also by watchdog to see it made some
>progress since the last run.
>Any opinions on that?
>BR,
> Petr

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Service watchdog
Hi,
I'm using the busybox implementation of runit to manage services and I miss some kind of a watchdog in runsv. I thought about extending supervise/control pipe by a status command which would allow to publish a status, for example 's Running'. Runsv would then append a monotonic timestamp when it was received and the passed string to its argv[0] making it visible in the process listing. This could be used by "check" to check the service is up and also by watchdog to see it made some progress since the last run.

Any opinions on that?

BR,
  Petr