Re: [PATCH] Support statistics in multi-process mode

2015-09-14 Thread Philipp Kolmann

Hi Willi,

On 09/14/15 12:17, Willy Tarreau wrote:

OK I now found a moment to spare some time on your patch. During my
first lecture I didn't understand that it relied on SIGUSR2 to
aggregate counters. I'm seeing several issues with that approach :


I never had the intent to look like I did the patch. The original mail 
is from Hiep Nguyen, hie...@vccloud.vn (CCed).


I just wanted to re-raise the topic again, since the mail from Hiep 
seemed to have drowned and I am interested in this feature.


@Hiep: Please look at Willi's  suggestions.

Thanks
Philipp

--
---
DI Mag. Philipp Kolmann  mail: kolm...@zid.tuwien.ac.at
Technische Universitaet Wien  web: www.zid.tuwien.ac.at
Zentraler Informatikdienst (ZID) tel: +43(1)58801-42011
Wiedner Hauptstr. 8-10, A-1040 WienDVR: 0005886
---




Re: [PATCH] Support statistics in multi-process mode

2015-09-14 Thread Aleksandar Lazic

Hi.

Am 14-09-2015 12:17, schrieb Willy Tarreau:

Hi Philipp,



[snipped]

What I'd like to have instead would be a per-proxy shared memory 
segment

for stats in addition to the per-process one, that is updated using
atomic operations each time other stats are updated. The max are a bit
tricky as you need to use a compare-and-swap operation but that's no 
big

deal. Please note that before doing this, it would be wise to move all
existing stats to a few dedicated structures so that in each proxy, 
server,

listener and so on we could simply have something like this :

struct proxy_stats *local;
struct proxy_stats *global;

As you guessed it local would be allocated in per process while global
would be shared between all of them.

Another benefit would be that we could improve the current sample fetch
functions which already look at some stats and use the global ones. 
That's
even more important for maxconn where it would *open* the possibility 
to
monitor the global connection count and not just the per-process one 
(but

there are other things to do prior to this being possible, such as
inter-process calls). However without inter-process calls we could 
decide

that we can slightly overbook the queue by up to one connection max per
process and that could be reasonably acceptable while waiting for a 
longterm

multi-threaded approach though.


I follow for some time uwsgi

https://uwsgi-docs.readthedocs.org/

Now I have implemented uwsgi now for some cgi's, yes there are such 
thinga still out ;-)


I like the stats solution, which is similar as you suggest, as far as I 
have understood you solution properly.


https://uwsgi-docs.readthedocs.org/en/latest/StatsServer.html

As far as I have understood there solution they use a registry pattern 
similar as already exist in haproxy.


https://github.com/unbit/uwsgi/search?utf8=%E2%9C%93=uwsgi_register_stats_pusher
http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/proto_http.c;h=eb3582bd77be9fb96a0babb0e5390c276c77e50e;hb=HEAD#l13043

How about to 'just' add a stats_register to the modules.

For example in proto_http
http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/proto_http.c;h=eb3582bd77be9fb96a0babb0e5390c276c77e50e;hb=HEAD#l13066


13066 static void __http_protocol_init(void)
13067 {
13068 acl_register_keywords(_kws);
13069 sample_register_fetches(_fetch_keywords);
13070 sample_register_convs(_conv_kws);

  stats_register_global_and_or_local($global|$proxy,...);

13071 http_req_keywords_register(_req_actions);
13072 http_res_keywords_register(_res_actions);
13073 }


From my point of view the benefit is that the stats server is in haproxy 
and the data could be stored in efficient way.


Maybe there is a standalone instance with 'mode stats' like the other 
modes.


http://cbonte.github.io/haproxy-dconv/configuration-1.6.html#4-mode

Best regads
Aleks



Re: [PATCH] Support statistics in multi-process mode

2015-09-14 Thread Willy Tarreau
On Mon, Sep 14, 2015 at 01:07:45PM +0200, Philipp Kolmann wrote:
> Hi Willi,
> 
> On 09/14/15 12:17, Willy Tarreau wrote:
> >OK I now found a moment to spare some time on your patch. During my
> >first lecture I didn't understand that it relied on SIGUSR2 to
> >aggregate counters. I'm seeing several issues with that approach :
> 
> I never had the intent to look like I did the patch. The original mail 
> is from Hiep Nguyen, hie...@vccloud.vn (CCed).

OK sorry for the confusion. I indeed saw a different name in the from
header but the name displayed was "root", letting me think it was
probably sent from a development server or something.

> I just wanted to re-raise the topic again, since the mail from Hiep 
> seemed to have drowned and I am interested in this feature.

Thanks for the clarification.

Best regards,
Willy




Re: [PATCH] Support statistics in multi-process mode

2015-09-14 Thread Willy Tarreau
Hi Philipp,

OK I now found a moment to spare some time on your patch. During my
first lecture I didn't understand that it relied on SIGUSR2 to
aggregate counters. I'm seeing several issues with that approach :

  - the time to scan all the proxies can be huge. Some people on this
list run with more than 5 backends. And the processing is
serialized in shm_proxy_update() so that means that even if they
run 32 processes on 32 cores, they'll have to wait 32 times the
time it takes to process 5 backends.

  - stats cannot easily be summed this way. The max are never valid
this way, because max(A+B) is somwhere between max(A,B) and
max(max(A)+max(B)). For this reason it's important to update stats
in real time using atomic operations.

  - the use of semaphores should *really* be avoided, as they create
huge trouble in production because they last after the process'
death. Here for example if you send a SIGUSR2 to your processes,
find that they're taking too much time to aggregate and decide
to kill the culprit, the other ones will be stuck forever and
when you decide to kill and restart the service, the old IPC is
still there until you reach the moment where there are no more
left and admins reboot the system because nowadays nobody thinks
to run "ipcs -a" then "ipcrm". I'd rather welcome a solution
involving atomic operations and/or mutexes even if it's limited
to a few architectures/operating systems. Note that the shared_ctx
used for SSL actually does that.

  - you have a process that remains stuck for up to 100ms per process
in the update_shm_proxy() function! By this time it's not processing
any traffic and is waiting for nothing, you must never ever do such
a thing. Imagine if someone sends a signal even just once a minute
to collect stats, you'll have big holes in your traffic graphs!

What I'd like to have instead would be a per-proxy shared memory segment
for stats in addition to the per-process one, that is updated using
atomic operations each time other stats are updated. The max are a bit
tricky as you need to use a compare-and-swap operation but that's no big
deal. Please note that before doing this, it would be wise to move all
existing stats to a few dedicated structures so that in each proxy, server,
listener and so on we could simply have something like this :

struct proxy_stats *local;
struct proxy_stats *global;

As you guessed it local would be allocated in per process while global
would be shared between all of them.

Another benefit would be that we could improve the current sample fetch
functions which already look at some stats and use the global ones. That's
even more important for maxconn where it would *open* the possibility to
monitor the global connection count and not just the per-process one (but
there are other things to do prior to this being possible, such as
inter-process calls). However without inter-process calls we could decide
that we can slightly overbook the queue by up to one connection max per
process and that could be reasonably acceptable while waiting for a longterm
multi-threaded approach though.

Best regards,
Willy




Re: [PATCH] Support statistics in multi-process mode

2015-09-07 Thread Willy Tarreau
Hi Philipp,

On Wed, Sep 02, 2015 at 06:23:16PM +0200, Philipp Kolmann wrote:
> Hi Willy,
> 
> I saw once a message that you forgot about this patch, but never saw any 
> comment on this ever again:

I remember I took a glance at it and there was something bothering me
but I didn't have the time to go further. I'll have to recheck. Thanks
for reminding me.

Willy




Re: [PATCH] Support statistics in multi-process mode

2015-09-02 Thread Philipp Kolmann

Hi Willy,

I saw once a message that you forgot about this patch, but never saw any 
comment on this ever again:


On 04/24/15 12:34, root wrote:

From: HiepNV 

Signed-off-by: root 
---
  Makefile  |   4 +-
  include/proto/shm_proxy.h |  28 +++
  src/dumpstats.c   |  59 ++-
  src/haproxy.c |  48 -
  src/shm_proxy.c   | 439 ++
  5 files changed, 571 insertions(+), 7 deletions(-)
  create mode 100644 include/proto/shm_proxy.h
  create mode 100644 src/shm_proxy.c




http://comments.gmane.org/gmane.comp.web.haproxy/21470


Could you please recheck, if that would be a possible feature?

thanks
Philipp


--
---
DI Mag. Philipp Kolmann  mail: kolm...@zid.tuwien.ac.at
Technische Universitaet Wien  web: www.zid.tuwien.ac.at
Zentraler Informatikdienst (ZID) tel: +43(1)58801-42011
Wiedner Hauptstr. 8-10, A-1040 WienDVR: 0005886
---