Re: logging services with shell interaction

2023-06-23 Thread Ben Franksen

Hi Casper

thanks for the heads-up; this is certainly an interesting project, but 
for me to start playing with it only makes sense if and when it has 
matured to the point where there is a minimum of documentation 
(-h/--help or something like that) and ideally some sort of revision 
control, too. I may be (barely) able to debug such low-level C code if I 
notice it misbehaving but to reverse-engineer what it is supposed to do 
is beyond my abilities.


Cheers
Ben

On 22.06.23 at 19:16, Casper Ti. Vector wrote:

On Thu, Oct 21, 2021 at 02:01:29AM +0800, Casper Ti. Vector wrote:

As has been said by Laurent, in the presence of a supervision system
with reliable logging and proper rotation, what `procServ' mainly does
can be done better by something like `socat' which wraps something like
`recordio', which in turn wraps the actual service process (EPICS IOC).
The devil is in the details: most importantly, when the service is to
be stopped, the ideal situation is that the actual service process gets
killed, leading to the graceful exit of `recordio' and then `socat'.


It is found that socat does not do I/O fan-in/fan-out with multiple
clients; it also assumes the `exec:'-ed subprocess is constantly present
(i.e. it does not handle IOC restarting).  So I have written a dedicated
program, ipctee (see below for link to source code), that does this.
I have also written a program, iotrap, that after receiving a
terminating signal, first closes the stdin of its children in the hope
that the latter exits cleanly, and after a tunable delay forwards the
signal.  This way IOCs are allowed to really run their clean-up code,
instead of just being killed instantly by the signal.


So the two wrapping programs need to propagate the killing signal, and
then exit after waiting for the subprocess; since `procServ' defaults
to kill the subprocess using SIGKILL, `recordio' also needs to translate
the signal if this is to be emulated.  `socat' does this correctly when
the `sighup'/`sigint'/`sigquit' options are given for `exec' addresses,
but its manual page does not mention SIGTERM.  `recordio' does not
seem to propagate (let alone translate) the signal; additionally, its
output format (which is after all mainly used for debugging) feels too
low-level to me, and perhaps needs to be adjusted.


Closer inspection of recordio revealed that it was designed in a smarter
way: after forking, the parent exec()s into the intended program, and
the child is what actually does the work of I/O forwarding.  This way
recordio (the child) does not need to forward signals.  Based on it,
I have written a program, recordln, that performs more line-oriented
recording: line fragments (without the line terminator) that go through
the same fd consecutively are joined before being copied to stderr.


At the facility where I am from, we use CentOS 7 and unsupervised
procServ (triple shame for a systemd opponent, s6 enthusiast and
minimalist :(), because we have not yet been bitten by log rotation
problems.  It also takes quite an amount of code to implement the
dynamic management of user supervision trees for IOCs, in addition
to the adjustments needed for `recordio'.  To make the situation even
worse, we are also using procServControl; anyway, I still hope we can
get rid of procServ entirely someday.


Source code for the programs above is available (licence: CC0) at

These programs can be tested with (in three different terminals):
$ ipctee /tmp/in.sock /tmp/out.sock
$ socat unix-connect:/tmp/in.sock exec:'recordln iotrap /bin/sh',sigint,sigquit
$ socat unix-connect:/tmp/out.sock -
Please feel free to tell me in case you find any defect in the code.
The dynamic management of IOC servicedirs is being developed, and will
be tested internally here before a paper gets submitted somewhere.



--
I would rather have questions that cannot be answered, than answers that
cannot be questioned.  -- Richard Feynman




Re: logging services with shell interaction

2021-10-24 Thread Ben Franksen

On 23.10.21 at 18:40, Casper Ti. Vector wrote:

On Sat, Oct 23, 2021 at 05:48:23PM +0200, Ben Franksen wrote:

I agree. BTW, another detail is the special handling of certain control
characters by procServ: ^X to restart the child, ^T to toggle auto-restart,
and the possibility to disable some others like ^C and especially ^D; which
is not only convenient but also avoids accidental restarts (people are used
to ^D meaning "exit the shell").


These functionalities would need to be (and would perhaps have better
been) done outside of the `socat'/`recordio' pair, as separate commands
(like `s6-svc -k ...' or `touch .../down') or wrappers.  `socat' simply
exits upon ^D/^C by default, so the IOC would not be hurt; I find this
enough to prevent most user errors, therefore more filtering of control
characters seems unnecessary.


Sure, there may be other solutions, it's just another one of those 
details that need to be taken care of somehow.



Our approach uses a somewhat hybrid mixture of several components. Since the
OS is Debian we use systemd service units, one for each IOC. They are
executing `/usr/bin/unshare -u sethostname %i runuser -u ioc -- softIOC-run
%i` which fakes the host name to trick EPICS' Channel Access "Security" into
the proper behavior, and then drops privileges. softIOC-run is the script of
which I posted a simplified version, with the pipeline between procServ and
multilog. Despite the disadvantages explained by Laurent, so far this works
pretty well (I have never yet observed multilog to crash or otherwise
misbehave). Finally, the configuration for all IOCs (name, which host
they run on, path to the startup script) resides in a small database and
there are scripts to automatically install everything, including automatic
enabling and disabling of the service units.


Frankly I find the above a little over-complicated, even discounting the
part about CA security which we do not yet use.  I think you might
find our paper (it is to be submitted next week) interesting with
regard to simplifying IOC management once it is published.


I am looking forward to it. You may want to post a link when it's done, 
here or on the EPICS mailing list.


Cheers
Ben




Re: logging services with shell interaction

2021-10-23 Thread Ben Franksen

Hi Casper

On 20.10.21 at 20:01, Casper Ti. Vector wrote:

On Wed, Oct 20, 2021 at 09:53:58AM +0200, Ben Franksen wrote:

Interesting, I didn't know about recordio, will take a look.


Hello from a fellow sufferer from EPICS.  (If you see a paper in some
synchrotron-related journal in a few months that mentions "automation
of automation", it will be from me, albeit not using a pseudonym.
Another shameless plug: <https://github.com/CasperVector/ADXspress3>.)


Interesting, I didn't know you are from the accelerator community!


As has been said by Laurent, in the presence of a supervision system
with reliable logging and proper rotation, what `procServ' mainly does
can be done better by something like `socat' which wraps something like
`recordio', which in turn wraps the actual service process (EPICS IOC).


Yeah, that's what I was thinking, too.


The devil is in the details: most importantly, when the service is to
be stopped, the ideal situation is that the actual service process gets
killed, leading to the graceful exit of `recordio' and then `socat'.

So the two wrapping programs need to propagate the killing signal, and
then exit after waiting for the subprocess; since `procServ' defaults
to kill the subprocess using SIGKILL, `recordio' also needs to translate
the signal if this is to be emulated.  `socat' does this correctly when
the `sighup'/`sigint'/`sigquit' options are given for `exec' addresses,
but its manual page does not mention SIGTERM.  `recordio' does not
seem to propagate (let alone translate) the signal; additionally, its
output format (which is after all mainly used for debugging) feels too
low-level to me, and perhaps needs to be adjusted.


I agree. BTW, another detail is the special handling of certain control 
characters by procServ: ^X to restart the child, ^T to toggle 
auto-restart, and the possibility to disable some others like ^C and 
especially ^D; which is not only convenient but also avoids accidental 
restarts (people are used to ^D meaning "exit the shell").



At the facility where I am from, we use CentOS 7 and unsupervised
procServ (triple shame for a systemd opponent, s6 enthusiast and
minimalist :(), because we have not yet been bitten by log rotation
problems.  It also takes quite an amount of code to implement the
dynamic management of user supervision trees for IOCs, in addition
to the adjustments needed for `recordio'.  To make the situation even
worse, we are also using procServControl; anyway, I still hope we can
get rid of procServ entirely someday.


Our approach uses a somewhat hybrid mixture of several components. Since 
the OS is Debian we use systemd service units, one for each IOC. They 
are executing `/usr/bin/unshare -u sethostname %i runuser -u ioc -- 
softIOC-run %i` which fakes the host name to trick EPICS' Channel Access 
"Security" into the proper behavior, and then drops privileges. 
softIOC-run is the script of which I posted a simplified version, with 
the pipeline between procServ and multilog. Despite the disadvantages 
explained by Laurent, so far this works pretty well (I have never yet 
observed multilog to crash or otherwise misbehave). Finally, the 
configuration for all IOCs (name, which host they run on, path to the 
startup script) resides in a small database and there are scripts to 
automatically install everything, including automatic enabling and 
disabling of the service units.


When I started developing this scheme I thought that systemd was a great 
leap forward from /etc/init.d scripts. I still think so, but I quickly 
became frustrated with its monolithic approach. Despite thousands of 
configuration options, it always seemed like the one I needed was 
missing. I spent days and days debugging service units that should have 
worked according to the docs but did not, for reasons I wasn't always 
able to figure out. Nowadays my standing assumption about systemd is 
that nothing you didn't thoroughly test should be expected to work, 
regardless of what the docs claim.


In contrast, I found that small specialized tools that use the 
chain-loading technique to modify a particular aspect of a program much 
more reliably produce exactly the desired effect and nothing more. The 
fine-grained control this gives you over the order of these effects 
(like, first fake the host name, then drop privileges) is something that 
a monolith with an unstructured flat configuration language cannot give 
you. The syntactic simplicity of systemd's configuration language is 
certainly appealing, especially for non-programmers, but this easily 
lets you forget the extreme complexity of its semantics. I cannot help 
but see the machine executing it as an idiosyncratic monster with lots 
of poorly handled corner cases.
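
To illustrate the point about ordering, here is a minimal sketch of the chain-loading style. The helpers `with_var` and `in_dir` are hypothetical and written as shell functions only to keep the example self-contained; real chain-loading tools are separate programs that exec() the rest of their command line.

```shell
#!/bin/sh
# Each helper changes exactly one aspect of the process state and then
# runs the remainder of its command line (hypothetical helpers).
with_var() { name=$1; value=$2; shift 2; export "$name=$value"; "$@"; }
in_dir()   { dir=$1; shift; cd "$dir" && "$@"; }

# The order of effects is simply the order on the command line:
# first set the variable, then change directory, then run the program.
result=$(with_var IOC demo in_dir / sh -c 'echo "$IOC runs in $(pwd)"')
echo "$result"    # prints: demo runs in /
```

Swapping `with_var ...` and `in_dir ...` on the command line swaps the order in which the two effects are applied, without any extra configuration syntax.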


I would like to experiment with alternatives like s6/s6-rc but that 
means using one of the small distros that support it and I am sure such 
a proposal would not be well received.

Re: logging services with shell interaction

2021-10-20 Thread Ben Franksen

On 20.10.21 at 01:27, Laurent Bercot wrote:
we have a fair number of services which allow (and occasionally 
require) user interaction via a (built-in) shell. All the shell 
interaction is supposed to be logged, in addition to all the messages 
that are issued spontaneously by the process. So we cannot directly 
use a logger attached to the stdout/stderr of the process.


  I don't understand the consequence relationship here.
  - If you control your services / builtin shells, the services could
have an option to log the IO of their shells to stderr, as well as
their own messages.


We do have control over them, theoretically, but adding this 
functionality seems impractical. This is a complex piece of software, 
built from multiple components maintained by different parties. There is 
some sort of common framework for issuing messages but none of the 
components strictly adhere to it. In other words, they use things like 
printf all over the place. The only way I see to reliably get all the IO 
for logging is to delegate this to an external process.



  - Even if you cannot make the services log the shell IO, you can add
a small data dumper in front of the service's shell, which transmits
full-duplex everything it gets but also writes it to its own stdout or
stderr; if that stdout/err is the same pipe as the stdout/err of your
service, then all the IO from the shell will be logged to the same place
(and log lines won't be mixed unless they're more than PIPE_BUF bytes
long, which shouldn't happen in practice). So with that solution you
could definitely make your services log to multilog.


Yes, that would be possible. More or less what procServ does minus the 
supervision aspect.
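
The "small data dumper" idea can be sketched with a few lines of shell. The `record` function below is a hypothetical, line-oriented stand-in (it is neither recordio nor procServ), and `sh` plays the role of the service's built-in shell; the `<` and `>` tags are arbitrary choices for this example.

```shell
#!/bin/sh
# record TAG: copy stdin to stdout line by line, and write a tagged copy
# of every line to stderr (hypothetical helper for illustration).
record() {
  tag=$1
  while IFS= read -r line; do
    printf '%s %s\n' "$tag" "$line" >&2
    printf '%s\n' "$line"
  done
}

# Input and output of the shell both pass through a recorder; all stderr
# copies end up in the same log, which a logger could read instead.
log=$(mktemp)
out=$({ printf 'echo hello\n' | record '<' | sh | record '>'; } 2>"$log")
recorded=$(cat "$log")
rm -f "$log"
printf '%s\n' "$recorded"
```

In a real setup the stderr of both recorders would be the same pipe as the service's own stdout/stderr, so the logger sees shell traffic and spontaneous messages in one stream.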



```
IOC=$1

/usr/bin/procServ -f -L- --logstamp --timefmt="$TIMEFMT" \
 -q -n %i --ignore=^D^C^] -P "unix:$RUNDIR/$IOC" -c "$BOOTDIR" "./$STCMD" \
 | /usr/bin/multilog "s$LOGSIZE" "n$LOGNUM" "$LOGDIR/$IOC"
```

So far this seems to do the job, but I have two questions:

1. Is there anything "bad" about this approach? Most supervision tools 
have this sort of thing as a built-in feature and I suspect there may 
be a reason for that other than mere convenience.


  It's not *bad*, it's just not as airtight as supervision suites make
it. The reasons why it's a built-in feature in daemontools/runit/s6/others
are:
  - it allows the logger process to be supervised as well
  - it maintains open the pipe to the logger, so service and logger can
be restarted independently at will, without risk of losing logs.

  As is, you can't send signals to multilog (useful if you want to force
a rotation) without knowing its pid. And if multilog dies, procServ gets
a broken pipe, and it (and your service) is probably forced to restart,
and you lose the data that it wanted to write.
  A supervision architecture with integrated logging protects from this.


Thanks, this answers my question perfectly.
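
For comparison, a service directory with an integrated, supervised logger might be laid out roughly as sketched below. This assumes a daemontools-family supervisor that treats a `log/` subdirectory as the logger service; the directory name `ioc-demo`, the `./main` log directory and the way the environment variables reach the run scripts are assumptions, and details differ between suites. The procServ and multilog arguments are taken from the script in this thread.

```shell
#!/bin/sh
# Sketch: generate a service directory with a separate, supervised logger.
svc=$(mktemp -d)/ioc-demo        # illustrative location only
mkdir -p "$svc/log"

cat >"$svc/run" <<'EOF'
#!/bin/sh
# Service side: procServ stays in the foreground and logs to stdout,
# which the supervisor connects to the logger's stdin.
exec 2>&1
exec /usr/bin/procServ -f -L- -P "unix:$RUNDIR/$IOC" -c "$BOOTDIR" "./$STCMD"
EOF

cat >"$svc/log/run" <<'EOF'
#!/bin/sh
# Logger side: supervised on its own; the supervisor holds the pipe open,
# so either side can be restarted without losing buffered log data.
exec /usr/bin/multilog "s$LOGSIZE" "n$LOGNUM" ./main
EOF

chmod +x "$svc/run" "$svc/log/run"
echo "service directory created at $svc"
```

With this layout the shell pipeline between procServ and multilog disappears; the supervisor creates and owns the pipe instead.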

2. Do any of the existing process supervision tools support what 
procServ gives us wrt interactive shell access from outside?


  Not that I know of, because that need is pretty specific to your
service architecture.


It sure is.


  However, unless there are more details you have omitted, I still
believe you could obtain the same functionality with a daemontools/etc.
infrastructure and a program recording the IO from/to the shell. Since
you don't seem opposed to using old djb programs, you could probably
even directly reuse "recordio" from ucspi-tcp for this. :)


Interesting, I didn't know about recordio, will take a look.

Again, thanks a lot for the detailed response!

Cheers
Ben




logging services with shell interaction

2021-10-19 Thread Ben Franksen

Hi Everyone

we have a fair number of services which allow (and occasionally require) 
user interaction via a (built-in) shell. All the shell interaction is 
supposed to be logged, in addition to all the messages that are issued 
spontaneously by the process. So we cannot directly use a logger 
attached to the stdout/stderr of the process.


procServ is a process supervisor adapted to such situations. It allows 
an external process (conserver in our case) to attach to the service's 
shell via a TCP or UNIX domain socket. procServ supports logging 
everything it sees (input and output) to a file or stdout.


In the past we had recurring problems with processes that spew out an 
extreme amount of messages, quickly filling up our local disks. Since 
logrotate runs via cron it is not possible to reliably guarantee that 
this doesn't happen. Thus, inspired by process supervision suites a la 
daemontools, we are now using a small shell wrapper script that pipes 
the output of the process into the multilog tool from the daemontools 
package.


Here is the script, slightly simplified. Most of the parameters are 
passed via environment.


```
IOC=$1

/usr/bin/procServ -f -L- --logstamp --timefmt="$TIMEFMT" \
 -q -n %i --ignore=^D^C^] -P "unix:$RUNDIR/$IOC" -c "$BOOTDIR" "./$STCMD" \
 | /usr/bin/multilog "s$LOGSIZE" "n$LOGNUM" "$LOGDIR/$IOC"
```

So far this seems to do the job, but I have two questions:

1. Is there anything "bad" about this approach? Most supervision tools 
have this sort of thing as a built-in feature and I suspect there may be 
a reason for that other than mere convenience.


2. Do any of the existing process supervision tools support what 
procServ gives us wrt interactive shell access from outside?


Cheers
Ben




Re: runit patches to fix compiler warnings on RHEL 7

2021-02-17 Thread Ben Franksen
On 27.11.19 at 21:33, J. Lewis Muir wrote:
> On 11/25, J. Lewis Muir wrote:
>> Is runit hosted in a public source code repo?  If so, where?
>>
>> Are patches to fix runit compiler warnings on RHEL 7 welcome?
> 
> I have patches now.  Is there a public source code repo I can contribute
> against?  Or would it be helpful to just send the patches to the list?

Hi Lewis

it's cool to see you are interested in runit. May I ask whether you are
using it for controls / EPICS?

AFAIK there is no public repository for runit. You may want to ping
Gerrit Pape personally to get his attention. His last message to the
list is from March this year where he said he was "looking forward to do
a maintenance release of runit eventually and [...] collecting patches".

Cheers
Ben




Re: Path monitoring support in s6-services

2021-02-17 Thread Ben Franksen
On 17.02.21 at 10:26, Casper Ti. Vector wrote:
> On Wed, Feb 17, 2021 at 01:08:44PM +0530, billa chaitanya wrote:
>> I am trying to start a service when a file/path is
>> modified/touched/created. Do we have any mechanism in s6 that supports
>> enabling a service upon monitoring a path?
> 
> inotifyd (or something similar) + s6-svc (or s6-rc)?

You can also write your own specialised daemon with a simple shell
script a la

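inotifywait -q -m ... | while IFS= read -r event; do
  # $event holds the next event line; reading it with the shell's own
  # `read` avoids an external reader consuming more than one line
  ...
done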

I am using this method in a production system.
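
Combined with the `s6-svc` suggestion quoted above, such a loop could look roughly like the sketch below. The function name `start_on_event` and the `S6_SVC` override variable are hypothetical (the latter exists only so the loop can be tried without s6 installed), and the paths in the usage comment are placeholders.

```shell
#!/bin/sh
# start_on_event SERVICEDIR: read event lines from stdin and, for each one,
# ask s6 to bring the given service up (hypothetical helper).
start_on_event() {
  servicedir=$1
  while IFS= read -r event; do
    printf 'event: %s\n' "$event" >&2
    "${S6_SVC:-s6-svc}" -u "$servicedir"
  done
}

# Possible usage (placeholder paths):
#   inotifywait -q -m -e create -e modify /path/to/watch \
#     | start_on_event /path/to/servicedir
```

Sending `-u` to a service that is already up should be harmless, so repeated events do not need to be filtered for this simple case.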

Cheers
Ben