Re: [PATCH] Support statistics in multi-process mode
Hi Willi, On 09/14/15 12:17, Willy Tarreau wrote: OK I now found a moment to spare some time on your patch. During my first lecture I didn't understand that it relied on SIGUSR2 to aggregate counters. I'm seeing several issues with that approach : I never had the intent to look like I did the patch. The original mail is from Hiep Nguyen, hie...@vccloud.vn (CCed). I just wanted to re-raise the topic again, since the mail from Hiep seemed to have drowned and I am interested in this feature. @Hiep: Please look at Willi's suggestions. Thanks Philipp -- --- DI Mag. Philipp Kolmann mail: kolm...@zid.tuwien.ac.at Technische Universitaet Wien web: www.zid.tuwien.ac.at Zentraler Informatikdienst (ZID) tel: +43(1)58801-42011 Wiedner Hauptstr. 8-10, A-1040 WienDVR: 0005886 ---
Re: [PATCH] Support statistics in multi-process mode
Hi. Am 14-09-2015 12:17, schrieb Willy Tarreau: Hi Philipp, [snipped] What I'd like to have instead would be a per-proxy shared memory segment for stats in addition to the per-process one, that is updated using atomic operations each time other stats are updated. The max are a bit tricky as you need to use a compare-and-swap operation but that's no big deal. Please note that before doing this, it would be wise to move all existing stats to a few dedicated structures so that in each proxy, server, listener and so on we could simply have something like this : struct proxy_stats *local; struct proxy_stats *global; As you guessed it local would be allocated in per process while global would be shared between all of them. Another benefit would be that we could improve the current sample fetch functions which already look at some stats and use the global ones. That's even more important for maxconn where it would *open* the possibility to monitor the global connection count and not just the per-process one (but there are other things to do prior to this being possible, such as inter-process calls). However without inter-process calls we could decide that we can slightly overbook the queue by up to one connection max per process and that could be reasonably acceptable while waiting for a longterm multi-threaded approach though. I follow for some time uwsgi https://uwsgi-docs.readthedocs.org/ Now I have implemented uwsgi now for some cgi's, yes there are such thinga still out ;-) I like the stats solution, which is similar as you suggest, as far as I have understood you solution properly. https://uwsgi-docs.readthedocs.org/en/latest/StatsServer.html As far as I have understood there solution they use a registry pattern similar as already exist in haproxy. https://github.com/unbit/uwsgi/search?utf8=%E2%9C%93=uwsgi_register_stats_pusher http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/proto_http.c;h=eb3582bd77be9fb96a0babb0e5390c276c77e50e;hb=HEAD#l13043 How about to 'just' add a stats_register to the modules. For example in proto_http http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/proto_http.c;h=eb3582bd77be9fb96a0babb0e5390c276c77e50e;hb=HEAD#l13066 13066 static void __http_protocol_init(void) 13067 { 13068 acl_register_keywords(_kws); 13069 sample_register_fetches(_fetch_keywords); 13070 sample_register_convs(_conv_kws); stats_register_global_and_or_local($global|$proxy,...); 13071 http_req_keywords_register(_req_actions); 13072 http_res_keywords_register(_res_actions); 13073 } From my point of view the benefit is that the stats server is in haproxy and the data could be stored in efficient way. Maybe there is a standalone instance with 'mode stats' like the other modes. http://cbonte.github.io/haproxy-dconv/configuration-1.6.html#4-mode Best regads Aleks
Re: [PATCH] Support statistics in multi-process mode
On Mon, Sep 14, 2015 at 01:07:45PM +0200, Philipp Kolmann wrote: > Hi Willi, > > On 09/14/15 12:17, Willy Tarreau wrote: > >OK I now found a moment to spare some time on your patch. During my > >first lecture I didn't understand that it relied on SIGUSR2 to > >aggregate counters. I'm seeing several issues with that approach : > > I never had the intent to look like I did the patch. The original mail > is from Hiep Nguyen, hie...@vccloud.vn (CCed). OK sorry for the confusion. I indeed saw a different name in the from header but the name displayed was "root", letting me think it was probably sent from a development server or something. > I just wanted to re-raise the topic again, since the mail from Hiep > seemed to have drowned and I am interested in this feature. Thanks for the clarification. Best regards, Willy
Re: [PATCH] Support statistics in multi-process mode
Hi Philipp, OK I now found a moment to spare some time on your patch. During my first lecture I didn't understand that it relied on SIGUSR2 to aggregate counters. I'm seeing several issues with that approach : - the time to scan all the proxies can be huge. Some people on this list run with more than 5 backends. And the processing is serialized in shm_proxy_update() so that means that even if they run 32 processes on 32 cores, they'll have to wait 32 times the time it takes to process 5 backends. - stats cannot easily be summed this way. The max are never valid this way, because max(A+B) is somwhere between max(A,B) and max(max(A)+max(B)). For this reason it's important to update stats in real time using atomic operations. - the use of semaphores should *really* be avoided, as they create huge trouble in production because they last after the process' death. Here for example if you send a SIGUSR2 to your processes, find that they're taking too much time to aggregate and decide to kill the culprit, the other ones will be stuck forever and when you decide to kill and restart the service, the old IPC is still there until you reach the moment where there are no more left and admins reboot the system because nowadays nobody thinks to run "ipcs -a" then "ipcrm". I'd rather welcome a solution involving atomic operations and/or mutexes even if it's limited to a few architectures/operating systems. Note that the shared_ctx used for SSL actually does that. - you have a process that remains stuck for up to 100ms per process in the update_shm_proxy() function! By this time it's not processing any traffic and is waiting for nothing, you must never ever do such a thing. Imagine if someone sends a signal even just once a minute to collect stats, you'll have big holes in your traffic graphs! What I'd like to have instead would be a per-proxy shared memory segment for stats in addition to the per-process one, that is updated using atomic operations each time other stats are updated. The max are a bit tricky as you need to use a compare-and-swap operation but that's no big deal. Please note that before doing this, it would be wise to move all existing stats to a few dedicated structures so that in each proxy, server, listener and so on we could simply have something like this : struct proxy_stats *local; struct proxy_stats *global; As you guessed it local would be allocated in per process while global would be shared between all of them. Another benefit would be that we could improve the current sample fetch functions which already look at some stats and use the global ones. That's even more important for maxconn where it would *open* the possibility to monitor the global connection count and not just the per-process one (but there are other things to do prior to this being possible, such as inter-process calls). However without inter-process calls we could decide that we can slightly overbook the queue by up to one connection max per process and that could be reasonably acceptable while waiting for a longterm multi-threaded approach though. Best regards, Willy
Re: [PATCH] Support statistics in multi-process mode
Hi Philipp, On Wed, Sep 02, 2015 at 06:23:16PM +0200, Philipp Kolmann wrote: > Hi Willy, > > I saw once a message that you forgot about this patch, but never saw any > comment on this ever again: I remember I took a glance at it and there was something bothering me but I didn't have the time to go further. I'll have to recheck. Thanks for reminding me. Willy
Re: [PATCH] Support statistics in multi-process mode
Hi Willy, I saw once a message that you forgot about this patch, but never saw any comment on this ever again: On 04/24/15 12:34, root wrote: From: HiepNVSigned-off-by: root --- Makefile | 4 +- include/proto/shm_proxy.h | 28 +++ src/dumpstats.c | 59 ++- src/haproxy.c | 48 - src/shm_proxy.c | 439 ++ 5 files changed, 571 insertions(+), 7 deletions(-) create mode 100644 include/proto/shm_proxy.h create mode 100644 src/shm_proxy.c http://comments.gmane.org/gmane.comp.web.haproxy/21470 Could you please recheck, if that would be a possible feature? thanks Philipp -- --- DI Mag. Philipp Kolmann mail: kolm...@zid.tuwien.ac.at Technische Universitaet Wien web: www.zid.tuwien.ac.at Zentraler Informatikdienst (ZID) tel: +43(1)58801-42011 Wiedner Hauptstr. 8-10, A-1040 WienDVR: 0005886 ---