On Thu, May 03, 2018 at 06:00:45PM +0200, Tim Düsterhus wrote:
> > What you have above looks like stderr. The rest are logs. They are for
> > very different usages, stderr is there to inform you that something went
> > wrong during a reload operation (that systemd happily hides so that you
> > believe it was OK but it was not), while the logs are there for future
> > traffic analysis and troubleshooting.
> 
> The distinction does not seem to be clearly drawn:
> 
> This goes into my syslog:
> 
> > May  3 13:30:26 ### haproxy[21754]: Proxy bk_aaa stopped (FE: 0 conns, BE: 
> > 0 conns).
> > May  3 13:30:26 ### haproxy[21754]: Proxy bk_aaa stopped (FE: 0 conns, BE: 
> > 0 conns).
> > May  3 13:30:26 ### haproxy[21754]: Proxy zzz stopped (FE: 2 conns, BE: 0 
> > conns).
> > May  3 13:30:26 ### haproxy[21754]: Proxy zzz stopped (FE: 2 conns, BE: 0 
> > conns).
> > May  3 13:30:26 ### haproxy[14926]: Server bk_xxx/xxx is DOWN, changed from 
> > server-state after a reload. 0 active and 0 backup servers left. 0 sessions 
> > active, 0 requeued, 0 remaining in queue.
> > May  3 13:30:26 ### haproxy[14926]: Server bk_xxx/xxx is DOWN, changed from 
> > server-state after a reload. 0 active and 0 backup servers left. 0 sessions 
> > active, 0 requeued, 0 remaining in queue.
> > May  3 13:30:26 ### haproxy[14926]: backend bk_xxx has no server available!
> > May  3 13:30:26 ### haproxy[14926]: backend bk_xxx has no server available!
> > May  3 13:30:26 ### haproxy[14926]: Server bk_yyy/yyy is DOWN, changed from 
> > server-state after a reload. 1 active and 0 backup servers left. 0 sessions 
> > active, 0 requeued, 0 remaining in queue.
> 
> This goes into the journal:
> 
> > May 03 13:30:26 ### haproxy[11635]: Proxy bk_xxx started.
> > May 03 13:30:26 ### haproxy[11635]: Proxy bk_xxx started.
> > May 03 13:30:26 ### haproxy[11635]: Proxy bk_yyy started.
> > May 03 13:30:26 ### haproxy[11635]: Proxy bk_yyy started.
> > May 03 13:30:26 ### haproxy[11635]: Proxy bk_zzz started.
> > May 03 13:30:26 ### haproxy[11635]: Proxy bk_zzz started.
> > May 03 13:30:26 ### haproxy[11635]: Proxy aaa started.
> > May 03 13:30:26 ### haproxy[11635]: Proxy aaa started.
> 
> At least the Proxy ... started / stopped messages should go into the
> same log.

On a regular system, these messages are never even seen: they are mostly
debug messages, emitted after the fork, so they only show up when you're
running in foreground mode. On an init system which requires the daemon
to stay in the foreground... you get the debugging messages on output,
and since the init system confiscates your output, it sends them to its
journal. I *think* (not tried though) that you can hide them using "-q"
on the command line.

> > Going back to the initial subject, are you interested in seeing if you
> > can add a warning counter to each frontend/backend, and possibly a rate
> > limited warning in the logs as well ? I'm willing to help if needed, it's
> > just that I really cannot take care of this myself, given that I spent
> > the last 6 months dealing with bugs and various other discussions, almost
> > not having been able to start to do anything for the next release :-/ So
> > any help here is welcome as you can guess.
> > 
> 
> Personally I'd prefer the rate limited warning over the counter. As
> outlined before: A warning counter probably will be incremented for
> multiple unrelated reasons in the longer term and thus loses its
> usefulness. Having a warning_headers_too_big counter and a
> warning_whatever_there_may_be is stupid, no?

For now we don't have such a warning, so the only reason for logging
it would be this header issue. It's never supposed to happen in theory
as it normally needs to be addressed immediately and ultimately we
should block by default on this. And if later we find another reason
to add a warning, we'll figure out whether it makes sense to use a
different counter or not.

Also you said yourself that you wouldn't look at the logs first but at
munin first. And munin monitors your stats socket, so logically munin
should report to you any increase of this counter seen on the stats
socket.
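
To illustrate what a monitoring tool would do with such a counter, here
is a minimal sketch in Python. It assumes CSV text in the style of the
"show stat" output from the stats socket, and it assumes a column named
"warn_hdr" for the warning counter. That column name is purely
hypothetical, as no such counter exists for now. The sample data is made
up for the example, and the sketch is not how munin is implemented.

```python
import csv
import io


def parse_show_stat(text):
    """Parse CSV in the style of 'show stat' into {(pxname, svname): row}."""
    lines = text.strip().splitlines()
    # The header line starts with "# "; strip it so DictReader sees clean names.
    lines[0] = lines[0].lstrip("# ")
    reader = csv.DictReader(io.StringIO("\n".join(lines)))
    return {(row["pxname"], row["svname"]): row for row in reader}


def counter_increases(before, after, field):
    """Return {key: delta} for every entry whose counter grew between samples."""
    deltas = {}
    for key, row in after.items():
        old = int(before.get(key, {}).get(field) or 0)
        new = int(row.get(field) or 0)
        if new > old:
            deltas[key] = new - old
    return deltas


# Two made-up samples; "warn_hdr" is a hypothetical column name.
SAMPLE_1 = """# pxname,svname,eresp,warn_hdr
bk_xxx,BACKEND,0,3
bk_yyy,BACKEND,1,0
"""

SAMPLE_2 = """# pxname,svname,eresp,warn_hdr
bk_xxx,BACKEND,0,7
bk_yyy,BACKEND,1,0
"""

if __name__ == "__main__":
    grown = counter_increases(
        parse_show_stat(SAMPLE_1), parse_show_stat(SAMPLE_2), "warn_hdr"
    )
    for (pxname, svname), delta in grown.items():
        print(f"{pxname}/{svname}: warn_hdr increased by {delta}")
```

In practice the two samples would be read from the stats socket a few
minutes apart instead of being hard-coded.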

> I feel that the error counter could / should be re-used for this and
> just the log message should be added.

Except that it's not an error until we block. We detected an error and
decided to let it pass through, which is a warning. It would be an error
if we blocked on it though.
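
The distinction can be sketched as follows, in Python for brevity. This
is only an illustration of the idea and not HAProxy code: the class, the
method name, the message text and the 60 second interval are all
assumptions made up for the example. The error counter moves only when
the traffic is blocked, the warning counter moves when the anomaly is
let through, and the log message is rate limited.

```python
import time


class ProxyCounters:
    """Hypothetical per-proxy counters with a rate-limited warning log."""

    def __init__(self, name, log_interval=60.0, clock=time.monotonic):
        self.name = name
        self.warnings = 0  # anomaly detected, traffic let through
        self.errors = 0    # request or response blocked
        self.log_interval = log_interval
        self._clock = clock
        self._last_log = None
        self.logged = []   # stands in for the real log output

    def report_headers_too_big(self, block):
        """Account for one oversized-headers event."""
        if block:
            # Blocking keeps the guarantee: an error means nothing got through.
            self.errors += 1
            return "blocked"
        self.warnings += 1
        now = self._clock()
        # Emit at most one message per log_interval, however many events occur.
        if self._last_log is None or now - self._last_log >= self.log_interval:
            self._last_log = now
            self.logged.append(
                f"proxy {self.name}: headers too big, passed "
                f"({self.warnings} warnings so far)"
            )
        return "passed"


if __name__ == "__main__":
    px = ProxyCounters("bk_xxx")
    for _ in range(1000):
        px.report_headers_too_big(block=False)
    print(px.warnings, px.errors, len(px.logged))
```

The counter stays exact even when the log is throttled, so the stats
keep the full picture while the logs are not flooded.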

> My munin already monitors the
> error counts. The `eresp` counter seems to fit: "- failure applying
> filters to the response.".

If you see an error, you have the guarantee that the request or response
was blocked, so it definitely doesn't fit the case where you don't
block. And it's very important not to violate such guarantees as
some people really rely on them. For example during forensics after an
intrusion attempt on your systems, you really want to know if the attacker
managed to retrieve something or not.

Willy
