Re: Logging in a web server context

2020-06-13 Thread yianiris
This is fungalnet with a different email

I am not 100% sure if we are talking about the same things, but here is
some of the fun we are having with Eric Vidal's 66 on Obarun and Void:

The tree named "root" is a tree of services; we can have many trees.
Except for oneshot services and ttys, every service supervised by s6
gets its own log, and the 66-inservice -p14 dhcpcd command shows the
last 14 lines of that service's log.

Here is some sample output (IPv4/IPv6 addresses scrambled):

oba@runner ~ % sudo 66-intree -zg root
Name : root
Initialized  : yes
Enabled  : yes
Starts after : None
Current  : yes
Allowed  : root
Symlinks : svc->source db->source
Contents : /
   ├─(1165,Enabled,classic) ntpd-log
   ├─(1168,Enabled,classic) ntpd
   ├─(1172,Enabled,classic) dhcpcd-log
   ├─(1169,Enabled,classic) tty@tty2
   ├─(1171,Enabled,classic) tty@tty1
   └─(1173,Enabled,classic) dhcpcd


oba@runner ~ % sudo 66-inservice -v2 -p14 dhcpcd
Name  : dhcpcd
Version   : 0.2.0
In tree   : root
Status: enabled, up (pid 1173) 38887 seconds
Type  : classic
Description   : dhcpcd daemon
Source: /usr/lib/66/service/dhcpcd
Live  : /run/66/scandir/0/dhcpcd
Dependencies  : dhcpcd-log
External dependencies : None
Optional dependencies : None
Start script  :  execl-cmdline -s { dhcpcd ${cmd_args} -f
${conf_dir}/${conf_file} }
Stop script   :  execl-cmdline -s { dhcpcd ${cmd_args_stop} }
Environment source: /etc/66/conf/dhcpcd/0.2.0
Environment file  : cmd_args=!-B
cmd_args_stop=!-x
conf_dir=!/etc
conf_file=!dhcpcd.conf
Log name  : dhcpcd-log
Log destination   : /var/log/66/dhcpcd
Log file  :
2020-06-13 21:17:16.955071365  eth0: fe80::fgf0:faff:fea4:4331 is
unreachable
2020-06-13 21:17:17.617629075  eth0: fe80::fgf0:faff:fea4:4331 is
reachable again
2020-06-13 21:17:17.872363985  eth0: soliciting an IPv6 router
2020-06-13 21:17:25.755133469  eth0: fe80::fgf0:faff:fea4:4331 is
unreachable
2020-06-13 21:17:26.729551295  eth0: soliciting an IPv6 router
2020-06-13 21:17:26.737844183  eth0: fe80::fgf0:faff:fea4:4331 is
reachable again
2020-06-13 21:36:32.380456414  eth0: fe80::fgf0:faff:fea4:4331: no
longer a default router
2020-06-13 21:36:32.380766427  eth0: deleting address
2v02:2149:862f:9d00:1a03:73ff:fe2b:30de/64
2020-06-13 21:36:32.381024234  eth0: deleting route to
2v02:2149:862f:9d00::/64
2020-06-13 21:36:32.381127805  eth0: deleting default route via
fv80::ce03:faff:fea4:4331
2020-06-13 21:36:34.051845089  eth0: adding address
2v02:2149:862f:9d00:1a03:73ff:fe2b:30de/64
2020-06-13 21:36:34.052160364  eth0: adding route to
2v02:2149:862f:9d00::/64
2020-06-13 21:36:34.083391650  eth0: adding default route via
fe80::ce03:faff:fea4:4331


Log files stored in those /var/log/xx/ directories go by the default
name "current".  There is also /run/66/log/0/current, a log of the boot
steps with adjustable verbosity.
66 just sets things up for s6 to do all the work.
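
To read them directly, ordinary file tools are enough (a small example,
using the paths shown above):

# follow the dhcpcd service log as new lines arrive
tail -f /var/log/66/dhcpcd/current

# read the boot-stage log
less /run/66/log/0/current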



Re: Logging in a web server context

2020-06-13 Thread Steve Litt
On Sat, 13 Jun 2020 10:41:24 +0200
Carl Winbäck  wrote:

> Hi folks!
> 
> Is it possible when using s6-svscan/s6-supervise to somehow arrange so
> that a daemon’s stdout is sent to one logdir and stderr to another
> logdir?

Hi Carl,

Many people have given excellent answers, some of which were over my
head. I personally think it's possible, with or without s6-rc, and will
do my best to explain that in this email.

Before I'd ever used s6 or runit, I used daemontools. With daemontools,
and therefore, I assume, with everything daemontools-inspired, if you didn't use one of those
log-handling daemons like syslog, your service directory acquired a
directory called "main", and inside "main" was your log file, nicely
rotated by daemontools (I assume). So like somebody said, you don't get
the fan-in fan-out, you don't need to write the daemon's name in each message,
and you don't need complex parsing to extract messages from a specific
daemon.
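
(For reference, the classic daemontools layout is a ./log subdirectory
inside the service directory, whose run script execs the logger; with
multilog, the conventional logdir name is ./main, which is presumably
the "main" I'm remembering. A minimal sketch:)

===
#!/bin/sh
# ./log/run: the supervisor pipes the service's stdout here; multilog
# timestamps each line and auto-rotates it under ./main
# (so the file ends up at ./log/main/current).
exec multilog t ./main
===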

I'd imagine that's possible in all daemontools-inspired process
supervisors such as s6.

Nowadays I run one of those log gathering programs, at the suggestion
of the experts on the Void Linux IRC channel, but maybe I'll switch
back if possible.

As I remember from daemontools, it captured only stdout for logging.
So, if you wanted it to record both, you used a little trick:

===
exec 2>&1
exec nginx 
===

But the preceding is exactly what you don't want: You don't want to mix
stderr and stdout. So perhaps something like the following:

===
exec 2>main/stderr.log
exec nginx 
===

I *think* the preceding would work for you, but I don't think you'd get
log rotation, a timestamp and maybe some other needed stuff. So I think
you could do the following:

===
[ -p stderr-fifo ] || mkfifo stderr-fifo
stderrhandler.sh main < stderr-fifo &
exec 2>stderr-fifo
exec nginx
===

Sorry for the contrived nature of the first three lines of the
preceding. I'm sure there must be something more direct that doesn't
require an intermediate fifo, but I couldn't quickly find it using
either /bin/sh or execline.

Anyway, perhaps stderrhandler.sh could be made to do log rotation too,
as well as tacking on a timestamp.
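
One way to get both without writing that logic yourself: make
stderrhandler.sh (still a hypothetical name) a thin wrapper around
s6-log, which already does timestamps and automatic rotation. A sketch:

===
#!/bin/sh
# stderrhandler.sh (hypothetical): read log lines on stdin and write
# them, TAI64N-timestamped and auto-rotated, into the directory named
# by $1 (a relative name like "main"; s6-log wants a logdir to start
# with '.' or '/', hence the ./ prefix).
exec s6-log -- t "./$1"
===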

If I'm correct about this, the logger run from ./log puts the stdout
log in ./main/current and rotates it. Meanwhile, stderrhandler.sh
puts stderr into a different log, and perhaps stderrhandler.sh could
also rotate its log, although I have a feeling there would be race
conditions doing so.

SteveT

Steve Litt 
May 2020 featured book: Troubleshooting Techniques
 of the Successful Technologist
http://www.troubleshooters.com/techniques


Re: Logging in a web server context

2020-06-13 Thread Laurent Bercot



 Add-on: Jan suggested running the service with only *one* output channel
in order to benefit from the reliable pipe integration feature in s6,
and to do the dispatching into different logdirs at the
s6-log level.

 I'm generally not a fan of fan-in fan-out mechanisms (pun unintended),
because it's a lot easier to mix data than to triage it, so if data
streams are separated early, mixing them to transmit them over one
channel and then parsing the data to separate it into several streams
again is wasteful. Having the ability to keep service-specific log streams
rather than conflating everything into a global logger is one of the
big advantages of the s6 logging infrastructure over the syslog model.

 However, in this precise case, until s6(-rc) provides tools to
cleanly and reliably handle several output channels for a service, it
may be worth considering the option, just to make the installation
easier.
 - prefix the stdout log lines with something easily recognizable
 - prefix the stderr log lines with something easily recognizable, and
obviously different
 - direct both streams to the service's stdout
 - use the s6 logger feature that will maintain a pipe for you
 - in your s6-log script, parse your log lines and direct them to
different logdirs depending on the prefix they have. If you choose
easily distinguishable prefixes, it shouldn't even take any noticeable
CPU time. A sketch of such a script follows.
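
 For instance, if the chosen prefixes are "out: " and "err: ", the
logger's run script is just an s6-log selection script (untested;
directory names are arbitrary):

#!/bin/sh
# log/run: the service prefixes stdout lines with "out: " and stderr
# lines with "err: " and sends both to its stdout, which the supervision
# tree pipes here.  Lines not starting with "err: " go to ./access, the
# remaining ("err: ") lines go to ./error; t adds TAI64N timestamps.
# The prefixes themselves stay in the logged lines.
mkdir -p access error
exec s6-log -- '-^err: ' t ./access f t ./error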

--
 Laurent



Re: Logging in a web server context

2020-06-13 Thread Laurent Bercot

I'm not completely sure about s6, but runsv (from runit) hands only the
stdout of the ./run script to the logger, and passes the stderr out of
its own stderr.


 All daemontools-inspired supervision suites, including s6, follow that
model, so yes, your runit example can be directly applied to s6.

 Generally speaking, supervision suites only provide integrated support
(i.e. a pipe that is already created and maintained by the supervision
tree) for one output channel, and if a service needs other output
channels then it's of course possible but the plumbing must be done
manually.

 In Carl's case, in order to have foo's stdout logged by a dedicated
logger and foo's stderr by *another* dedicated logger, you can
do something like the following:

 - If /service/foo is your service directory for foo, then you declare
your service directory for foo's stdout in /service/foo/log.
 - mkfifo -m 0600 /service/foo/stderr-pipe
 - In /service/foo/run, redirect stderr to stderr-pipe
 - Write a separate service directory, say /service/foo-stderr-log,
with a run script that redirects its stdin from /service/foo/stderr-pipe
and execs into s6-log (see the sketch below).
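
 Concretely, with illustrative names (foo, /var/log/foo-stderr), the
pieces could look like this:

# one-time setup
mkfifo -m 0600 /service/foo/stderr-pipe

==> /service/foo/run <==
#!/bin/sh
# stdout still goes to the supervised pipe read by /service/foo/log;
# stderr goes into the hand-made fifo.
exec 2>/service/foo/stderr-pipe
exec foo

==> /service/foo-stderr-log/run <==
#!/bin/sh
# read the fifo and log it like any other stream
mkdir -p /var/log/foo-stderr
exec s6-log -- t /var/log/foo-stderr < /service/foo/stderr-pipe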

 This should do what you want, with a small caveat: if the process for
foo-stderr-log dies, then you'll need to restart foo in order to get
any further logs, because the pipe has been broken and isn't maintained
by the supervision tree like the pipe to foo/log is.
 The way to prevent this is to use a fd-holder, and s6 provides tools
for that, but it's the next level of complexity.

 Historically, the vast majority of daemons have only needed one output
channel, and that is why supervision suites address the common case but
offer no help otherwise; unfortunately, certain web servers buck the
trend and need several output channels (which is not a bad idea per se,
it's just that the tooling to handle it in a user-friendly way is not
there yet), so for now the way to address it is still to whip out the
hammer and screwdriver.

 s6-rc wouldn't help you in that case, because its funnel model still
only supports one consumer per producer, and extra output channels have
to be set up manually as well.

 The nginx/apache case is something that comes up with some regularity,
either on this list or on the #s6 channel, so I should try and write
some tools that make it easier to automate. However, it is likely that
those tools would become part of s6-rc, not s6, because doing things in
a reliable way involves a fdholder, which requires some policy that
s6 does not provide but s6-rc does (and s6-rc already has all the
necessary infrastructure).

--
 Laurent



Re: Logging in a web server context

2020-06-13 Thread Carl Winbäck
On Sat, 13 Jun 2020 at 13:59, Casper Ti. Vector  wrote:
> > I gather from it that what I want is currently not possible without
> > s6-rc. Is that correct?
>
> I do not think so :)  In a nutshell, your service needs to connect its
> stdout and stderr to a pair of different loggers; that can be done by
> fd-holding and chainloading, and is how s6-rc internally implements its
> piping (well, funneling) mechanism.  However, s6-rc only connects the
> stdout, so the fd-holder needs some additional configuration whether
> you use s6-rc or not; the amount will be a bit bigger without s6-rc,
> but it is surely doable and even scriptable.

Aha! Implementing such a mechanism (outside of s6-rc) could be an
interesting exercise.

However, I believe that the proposals Jan posted would be sufficient
for my needs.


Re: Logging in a web server context

2020-06-13 Thread Carl Winbäck
On Sat, 13 Jun 2020 at 14:17, Carl Winbäck  wrote:
>> Replying to a private mail here on the list in case it is of interest
>> to someone else.[1]

Sorry for the botched quote level. I should know better than to use
Gmail’s web interface for posting to mailing lists :-p


Re: Logging in a web server context

2020-06-13 Thread Carl Winbäck
>
> Replying to a private mail here on the list in case it is of interest
> to someone else.[1]
>
> On Sat, 13 Jun 2020 at 12:05, Arsen Arsenović  wrote:
> > You're interested in nginxes `error_log', `access_log' and
> > `log_format' directives, I believe.
> > It can write both directly to a file, for better or for worse.
> > You could use fifos to separate the logs into different logdirs, but
> > I'm unsure of how well that would work, since AFAIK it also tries to
> > rotate logs every so often.
>
> Yes, nginx can write its messages to either stdout, stderr, syslog or
> directly to a file. Nginx itself will not do any log rotation. That
> could be handled by logrotate(8). But IMHO, using logrotate(8) or
> syslog feels like a clumsy and outdated solution.
>
> Syslog was introduced in the early 80s and I don’t believe that its
> design has aged well. logrotate(8) has the issue that it cannot work
> in tandem with either the daemon or the supervisor. This is
> something that Jonathan de Boyne Pollard has described well in his
> article “Don’t use logrotate or newsyslog in this century”.[2]
>
> > Personally, I'd just drop access logs entirely in favor of either
> > nothing, or Matomo.
>
> Yes, that could be an option and I will consider it. It needs way more
> computing resources though[3]. With that said, this option offers some
> features that an access log analyzer doesn’t.
>
>
>
> [1] Arsen has approved of this posting :)
> [2] https://jdebp.eu/FGA/do-not-use-logrotate.html
> [3] See
> https://www.mail-archive.com/supervision@list.skarnet.org/msg02185.html
> [4]
> https://matomo.org/docs/requirements/#recommended-servers-sizing-cpu-ram-disks
>


Re: Logging in a web server context

2020-06-13 Thread Casper Ti. Vector
On Sat, Jun 13, 2020 at 12:50:50PM +0200, Carl Winbäck wrote:
> I gather from it that what I want is currently not possible without
> s6-rc. Is that correct?

I do not think so :)  In a nutshell, your service needs to connect its
stdout and stderr to a pair of different loggers; that can be done by
fd-holding and chainloading, and is how s6-rc internally implements its
piping (well, funneling) mechanism.  However, s6-rc only connects the
stdout, so the fd-holder needs some additional configuration whether
you use s6-rc or not; the amount will be a bit bigger without s6-rc,
but it is surely doable and even scriptable.

-- 
My current OpenPGP key:
RSA4096/0x227E8CAAB7AA186C (expires: 2022.09.20)
7077 7781 B859 5166 AE07 0286 227E 8CAA B7AA 186C



Re: Logging in a web server context

2020-06-13 Thread Carl Winbäck
On Sat, 13 Jun 2020 at 12:59, Jan Braun  wrote:
> I'm not completely sure about s6, but runsv (from runit) hands only the
> stdout of the ./run script to the logger, and passes the stderr out of
> its own stderr. This allows you to nest two runsv instances, one for
> each output channel:
>
> [...]
>
> Translation to s6 is left as an exercise to the reader.

Very interesting! I will study this example in depth in order to see
if it could be applied to s6.

If anyone has done this in s6, I would be interested to hear how you
achieved it.

> You could also run the webserver with stderr redirected to stdout, and
> let s6-log/svlogd filter the messages into one of two logdirs:
>
> $ printf 'stdout\nstderr\n' | s6-log -- '-.*err' t ./stdout f t ./stderr
> $ head std*/current
> ==> stderr/current <==
> @40005ee4acd00efdfbe9 stderr
>
> ==> stdout/current <==
> @40005ee4acd00efd6f5e stdout
>
> However, that's a brittle solution, because it relies on you creating
> correctly-matching filter rules. Depending on the webserver's output, it
> might still be feasible, but I recommend the other approach.

I agree that this method is not rock-solid since it relies on text
parsing. But for my purposes it would probably be sufficient.

So I have at least one solution that should be feasible, and possibly
both of them are.

Thank you Jan!


Re: Logging in a web server context

2020-06-13 Thread Jan Braun
Hey,

Carl Winbäck schrob:
> Is it possible when using s6-svscan/s6-supervise to somehow arrange so
> that a daemon’s stdout is sent to one logdir and stderr to another
> logdir?

I'm not completely sure about s6, but runsv (from runit) hands only the
stdout of the ./run script to the logger, and passes the stderr out of
its own stderr. This allows you to nest two runsv instances, one for
each output channel:

$ find -type f | xargs head -n 99
==> ./inner/log/run <==
#!/bin/sh
mkdir -p stdout
svlogd -t stdout

==> ./inner/run <==
#!/bin/sh
echo stderr >&2
echo stdout
sleep 10

==> ./log/run <==
#!/bin/sh
mkdir -p stderr
svlogd -t stderr

==> ./run <==
#!/bin/sh
exec runsv inner 2>&1
$ timeout 11 runsv .
$ find -name current | xargs head
==> ./inner/log/stdout/current <==
@40005ee4a23f2591745c stdout
@40005ee4a24925b79614 stdout

==> ./log/stderr/current <==
@40005ee4a23f257ae304 stderr
@40005ee4a24925b76f04 stderr
$

Translation to s6 is left as an exercise to the reader.

> One tricky aspect of logging that is specific to web servers is that
> they emit two different categories of messages:
> 
> a) Errors and warnings
> 
> b) Info about page requests[2]

You could also run the webserver with stderr redirected to stdout, and
let s6-log/svlogd filter the messages into one of two logdirs:

$ printf 'stdout\nstderr\n' | s6-log -- '-.*err' t ./stdout f t ./stderr
$ head std*/current
==> stderr/current <==
@40005ee4acd00efdfbe9 stderr

==> stdout/current <==
@40005ee4acd00efd6f5e stdout

However, that's a brittle solution, because it relies on you creating
correctly-matching filter rules. Depending on the webserver's output, it
might still be feasible, but I recommend the other approach.

HTH & cheers,
Jan




Re: Logging in a web server context

2020-06-13 Thread Carl Winbäck
On Sat, 13 Jun 2020 at 12:04, Casper Ti. Vector  wrote:
> See ?

If you like, feel free to expand on the previous discussion.[1]

I gather from it that what I want is currently not possible without
s6-rc. Is that correct?



[1] Trying to browse the old archives was a bit confusing; it seems
like the threading was a bit weird.


Re: Logging in a web server context

2020-06-13 Thread Casper Ti. Vector
On Sat, Jun 13, 2020 at 10:41:24AM +0200, Carl Winbäck wrote:
> Is it possible when using s6-svscan/s6-supervise to somehow arrange so
> that a daemon’s stdout is sent to one logdir and stderr to another
> logdir?

See ?

-- 
My current OpenPGP key:
RSA4096/0x227E8CAAB7AA186C (expires: 2022.09.20)
7077 7781 B859 5166 AE07 0286 227E 8CAA B7AA 186C



Logging in a web server context

2020-06-13 Thread Carl Winbäck
Hi folks!

Is it possible when using s6-svscan/s6-supervise to somehow arrange so
that a daemon’s stdout is sent to one logdir and stderr to another
logdir?

Or must I use s6-rc in order to achieve this?

My premises are the following:

I want to run nginx under s6, on a mainstream Linux distro such as
Debian/Ubuntu/CentOS. My plan is currently to use s6 only for nginx
and not any other daemons.
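
(For reference, the run script I have in mind is a sketch like the
following; "daemon off;" keeps nginx in the foreground so s6-supervise
can track it. How to route its two output streams is the question above.)

#!/bin/sh
# /service/nginx/run (sketch): run nginx in the foreground under
# s6-supervise; stdout/stderr routing is the open question.
exec nginx -g 'daemon off;'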

I want to collect the web server access logs in order to generate
visitor statistics[1].

One tricky aspect of logging that is specific to web servers is that
they emit two different categories of messages:

a) Errors and warnings

b) Info about page requests[2]

For many other kinds of daemons, one only needs to arrange for logging
stderr and then call it a day. With web servers it’s not quite so
simple.

I am very interested to hear from others on the list what your
thoughts are about this scenario.

The reason I’m wary of s6-rc is that I think it might not play so
well with the distros I mentioned earlier. If this concern is
unfounded, please let me know.



[1] An alternative to collecting access logs could be to use Google
Analytics instead, but I want to avoid that since I don’t feel
comfortable handing my visitors’ personal data over to Google.

[2] I.e. lines like this: '10.131.214.101 - - [20/Nov/2017:18:52:17
+0000] "GET / HTTP/1.1" 401 188 "-" "Mozilla/5.0 (X11; Ubuntu; Linux
x86_64; rv:47.0) Gecko/20100101 Firefox/47.0"'