On 26/07/2016 03:30 PM, Willy Tarreau wrote:
> Hi Pavlos!
>
> On Tue, Jul 26, 2016 at 03:23:01PM +0200, Pavlos Parissis wrote:
>> Here is a suggestion
>> {
>>     "frontend": {
>>         "www.haproxy.org": { "bin": "999999999999", "lbtot": "555555", ... },
>>         "www.haproxy.com": { "bin": "999999999999", "lbtot": "555555", ... },
>>     },
>>     "backend": {
>>         "www.haproxy.org": {
>>             "bin": "999999999999", "lbtot": "555555", ....
>>             "server": {
>>                 "srv1": { "bin": "999999999999", "lbtot": "555555", .... },
>>                 ...
>>             }
>>         },
>>     },
>>     "haproxy": {
>>         "id1": { "PipesFree": "555", "Process_num": "1", ... },
>>         "id2": { "PipesFree": "555", "Process_num": "2", ... },
>>         ...
>>     },
>> }
>
> Thanks. How does it scale if we later want to aggregate these ones over
> multiple processes and/or nodes? The typed output already emits a process
> number for each field. Also, we do have the information of how the data
> need to be parsed and aggregated. I suspect that we want to produce this
> with the JSON output as well so that we don't lose information when
> dumping in JSON mode. I would not be surprised if people find JSON easier
> to process than our current format to aggregate their stats, provided we
> have all the fields :-)
>
> Cheers, Willy
I am glad you asked about aggregation, as I deliberately didn't include it. In all my setups I have nbproc > 1, and after many iterations on how I aggregate HAProxy stats and on what most people want to see on graphs, I came up with something like the following:

{
    "frontend": {
        "www.haproxy.org": { "bin": "999999999999", "lbtot": "555555", ... },
        "www.haproxy.com": { "bin": "999999999999", "lbtot": "555555", ... },
    },
    "backend": {
        "www.haproxy.org": {
            "bin": "999999999999", "lbtot": "555555", ....
            "server": {
                "srv1": { "bin": "999999999999", "lbtot": "555555", .... },
                ...
            },
        },
    },
    "haproxy": {
        "PipesFree": "555",
        ...,
        "per_process": {
            "id1": { "PipesFree": "555", "Process_num": "1", ... },
            "id2": { "PipesFree": "555", "Process_num": "2", ... },
            ...
        },
    },
    "server": {
        "srv1": { "bin": "999999999999", "lbtot": "555555", ... },
        ...
    },
}

Let me explain a bit:

- It is very useful and handy to know the stats for a server per backend, but also across all backends. Thus, I include a top-level key 'server' which holds the stats for each server across all backends. A few server stats have to be excluded as they are meaningless in this context, for example status, lastchg, check_duration, check_code and a few others. For those which aren't counters but fixed numbers, you want to either sum them (slim) or take the average (weight). I don't do the latter in my setup.

- Aggregation across multiple processes for haproxy stats ('show info' output): as you can see, I provide stats per process and across all processes. It has proven very useful to know the CPU utilization per process. We depend on the kernel to distribute incoming connections across all processes, and so far that works very well, but sometimes you see a single process consuming a lot of CPU, and if you don't provide percentiles or per-process stats you are going to miss it. The metrics about uptime, version, description and a few others can be excluded from the aggregation.
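As a rough illustration of the two roll-ups above, here is a minimal Python sketch. The field names, the set of excluded fields, and the counter-vs-average split are assumptions for illustration only, not an actual HAProxy API:

```python
# Hypothetical sketch of the two aggregations described above.
# Snapshot-of-state fields are dropped because summing them across
# backends or processes is meaningless; 'weight' is averaged, not summed.
STATUS_FIELDS = {"status", "lastchg", "check_duration", "check_code"}
INFO_EXCLUDED = {"Process_num", "Uptime", "Version", "description"}


def aggregate_server(rows):
    """Roll up one server's stats across all backends.

    rows: list of per-backend stat dicts with string values, as in the
    CSV stats output. Counters are summed; 'weight' is averaged.
    """
    out = {}
    for row in rows:
        for name, value in row.items():
            if name in STATUS_FIELDS:
                continue
            out[name] = out.get(name, 0) + int(value)
    if "weight" in out:              # fixed number: average, don't sum
        out["weight"] //= len(rows)
    return out


def aggregate_info(per_process):
    """Combine per-process 'show info' dicts, keeping the originals
    under a 'per_process' key as in the proposed JSON layout."""
    combined = {}
    for fields in per_process.values():
        for name, value in fields.items():
            if name in INFO_EXCLUDED:
                continue
            combined[name] = combined.get(name, 0) + int(value)
    combined["per_process"] = per_process
    return combined


rows = [{"bin": "100", "weight": "10", "status": "UP"},
        {"bin": "200", "weight": "20", "status": "DOWN"}]
print(aggregate_server(rows))              # {'bin': 300, 'weight': 15}

procs = {"id1": {"PipesFree": "555", "Process_num": "1"},
         "id2": {"PipesFree": "555", "Process_num": "2"}}
print(aggregate_info(procs)["PipesFree"])  # 1110
```

The same shape works for frontends and backends; only the exclusion lists differ.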
- nbproc > 1 and aggregation for frontend/backend/server: my proposal doesn't cover stats for frontend/backend/server per haproxy process. Those stats are already aggregated, and a few metrics are excluded, for example all the status-related ones. Each process performs its own health checking, so the processes act as little brains which never agree on the status of a server, as they run their checks at different intervals. But if nbproc == 1, then these metrics should be included.

Cheers,
Pavlos