Re: set weight bug?
Hi, Willy, after upgraded to haproxy-ss-20131122, enable and disable servers via socket will crash haproxy, there's no this issue in haproxy-ss-20131031. Bests, -Igor On Thu, Nov 21, 2013 at 10:42 PM, Willy Tarreau w...@1wt.eu wrote: Hi Igor, On Thu, Nov 21, 2013 at 09:03:05PM +0800, Igor wrote: Thanks Willy, because I use snapshot haproxy in production, so I have no change to do more investigation, glad you could reproduce the bug :) now you'll have something a bit more reliable to work with. I've just committed the two following fixes : e7b7348 BUG/MEDIUM: checks: fix slow start regression after fix attempt 004e045 BUG/MAJOR: server: weight calculation fails for map-based algorithms The first one is a crash when using slowstart without checks. It's not your case but you'll probably want to fix it in case you happen to switch to slowstart. I encountered it while trying to reproduce your bug based on Lukas' findings. The second one fixes the issue you're facing which is also what Lukas noticed (wrong weight after a set weight on a map-based algorithm). I can confirm that it resulted in the total backend weight to be larger than the table, causing out of bounds accesses after a weight change as soon as the map indx went far enough (depending on your load and the total initial weights, it could take a few seconds to a few minutes). The fix is quite large because I wanted to get rid of all places where the computations were hazardous (and there were quite a few). From my opinion and my tests, it now correcty covers all situations. Thanks for reporting it! Willy
Re: set weight bug?
Hi Igor, On Wed, Nov 27, 2013 at 08:46:38PM +0800, Igor wrote: Hi, Willy, after upgraded to haproxy-ss-20131122, enable and disable servers via socket will crash haproxy, there's no this issue in haproxy-ss-20131031. I don't understand, I cannot reproduce this with your config, at least the config I used to reproduce and fix the previous bugs :-( Maybe you have something different there ? Thanks, Willy
Unsubscribe : set weight bug?
Unsubscribe -Original Message- From: paveoc...@gmail.com [mailto:paveoc...@gmail.com] On Behalf Of Igor Sent: Wednesday, November 27, 2013 9:41 AM To: Willy Tarreau Cc: Lukas Tribus; haproxy@formilux.org Subject: Re: set weight bug? It's almost the same except some servers with weight=0 in conf, script to disable/enable servers works fine with haproxy-ss-20131031, but with haproxy-ss-20131122 or newer, it will crash haproxy after several disable/enable commands via socket. I can mail you the conf if need. Bests, -Igor On Wed, Nov 27, 2013 at 9:30 PM, Willy Tarreau w...@1wt.eu wrote: Hi Igor, On Wed, Nov 27, 2013 at 08:46:38PM +0800, Igor wrote: Hi, Willy, after upgraded to haproxy-ss-20131122, enable and disable servers via socket will crash haproxy, there's no this issue in haproxy-ss-20131031. I don't understand, I cannot reproduce this with your config, at least the config I used to reproduce and fix the previous bugs :-( Maybe you have something different there ? Thanks, Willy
Re: set weight bug?
On Wed, Nov 27, 2013 at 10:41:12PM +0800, Igor wrote: It's almost the same except some servers with weight=0 in conf, script to disable/enable servers works fine with haproxy-ss-20131031, but with haproxy-ss-20131122 or newer, it will crash haproxy after several disable/enable commands via socket. I can mail you the conf if need. That would definitely help, as even by doing what you describe above I still fail to reproduce the issue. Thanks Igor, Willy
Re: set weight bug?
On Wed, Nov 27, 2013 at 10:41:12PM +0800, Igor wrote: It's almost the same except some servers with weight=0 in conf, script to disable/enable servers works fine with haproxy-ss-20131031, but with haproxy-ss-20131122 or newer, it will crash haproxy after several disable/enable commands via socket. I can mail you the conf if need. OK thank you for your config, Igor. It definitely helped. The issue was related to the track servers with the recent changes. It caused an (almost) infinite recursion. It was easier to see here because I have no server responding to your addresses, so the process crashes as soon as the first server fails the first check. I'm attaching the fix to apply on top of your snapshot, that I'm going to merge into mainline. Simon, I suggest you apply the attached patch on top of your development branch in case you still work on the lb agent checks, because then you're affected as well. Thanks, Willy From 151cd51485f8ebe47aa3b2f2eedb604d4c2cfd0d Mon Sep 17 00:00:00 2001 From: Willy Tarreau w...@1wt.eu Date: Wed, 27 Nov 2013 16:52:23 +0100 Subject: BUG/MAJOR: fix haproxy crash when using server tracking instead of checks Igor at owind reported a very recent bug (just present in latest snapshot). Commit 4a741432 MEDIUM: Paramatise functions over the check of a server causes up/down to die with tracked servers due to a typo. The following call in set_server_down causes the server to put itself down recurseively because check is the current server's check, so once fed to the function again, it will pass through the exact same path (note we have the exact symmetry in set_server_up) : for (srv = s-tracknext; srv; srv = srv-tracknext) if (!(srv-state SRV_MAINTAIN)) /* Only notify tracking servers that are not already in maintenance. */ set_server_down(check); Instead we should stop the tracking server being visited in the loop : for (srv = s-tracknext; srv; srv = srv-tracknext) if (!(srv-state SRV_MAINTAIN)) /* Only notify tracking servers that are not already in maintenance. */ set_server_down(srv-check); But that's not exactly enough because srv-check-server is only set when checks are enabled, so -server is NULL for tracking servers, still causing a crash upon first iteration. The fix is easy and consists in always initializing check-server when creating a new server, which is what was already done a few patches later by 69d29f9 (MEDIUM: cfgparse: Factor out check initialisation). With the fix above alone on top of current version or snapshot 20131122, the problem disappears. Thanks to Igor for testing and reporting the issue. --- src/checks.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/src/checks.c b/src/checks.c index cfbb8a3..a722c12 100644 --- a/src/checks.c +++ b/src/checks.c @@ -445,10 +445,10 @@ void set_server_down(struct check *check) s-counters.down_trans++; if (s-state SRV_CHECKED) - for(srv = s-tracknext; srv; srv = srv-tracknext) - if (! (srv-state SRV_MAINTAIN)) + for (srv = s-tracknext; srv; srv = srv-tracknext) + if (!(srv-state SRV_MAINTAIN)) /* Only notify tracking servers that are not already in maintenance. */ - set_server_down(check); + set_server_down(srv-check); } check-health = 0; /* failure */ @@ -520,10 +520,10 @@ void set_server_up(struct check *check) { send_log(s-proxy, LOG_NOTICE, %s.\n, trash.str); if (s-state SRV_CHECKED) - for(srv = s-tracknext; srv; srv = srv-tracknext) - if (! (srv-state SRV_MAINTAIN)) + for (srv = s-tracknext; srv; srv = srv-tracknext) + if (!(srv-state SRV_MAINTAIN)) /* Only notify tracking servers if they're not in maintenance. */ - set_server_up(check); + set_server_up(srv-check); } if (check-health = s-rise) -- 1.7.12.1
Re: set weight bug?
On Wed, Nov 27, 2013 at 05:09:47PM +0100, Willy Tarreau wrote: On Wed, Nov 27, 2013 at 10:41:12PM +0800, Igor wrote: It's almost the same except some servers with weight=0 in conf, script to disable/enable servers works fine with haproxy-ss-20131031, but with haproxy-ss-20131122 or newer, it will crash haproxy after several disable/enable commands via socket. I can mail you the conf if need. OK thank you for your config, Igor. It definitely helped. The issue was related to the track servers with the recent changes. It caused an (almost) infinite recursion. It was easier to see here because I have no server responding to your addresses, so the process crashes as soon as the first server fails the first check. I'm attaching the fix to apply on top of your snapshot, that I'm going to merge into mainline. Simon, I suggest you apply the attached patch on top of your development branch in case you still work on the lb agent checks, because then you're affected as well. Thanks and sorry for the bug. Thanks, Willy From 151cd51485f8ebe47aa3b2f2eedb604d4c2cfd0d Mon Sep 17 00:00:00 2001 From: Willy Tarreau w...@1wt.eu Date: Wed, 27 Nov 2013 16:52:23 +0100 Subject: BUG/MAJOR: fix haproxy crash when using server tracking instead of checks Igor at owind reported a very recent bug (just present in latest snapshot). Commit 4a741432 MEDIUM: Paramatise functions over the check of a server causes up/down to die with tracked servers due to a typo. The following call in set_server_down causes the server to put itself down recurseively because check is the current server's check, so once fed to the function again, it will pass through the exact same path (note we have the exact symmetry in set_server_up) : for (srv = s-tracknext; srv; srv = srv-tracknext) if (!(srv-state SRV_MAINTAIN)) /* Only notify tracking servers that are not already in maintenance. */ set_server_down(check); Instead we should stop the tracking server being visited in the loop : for (srv = s-tracknext; srv; srv = srv-tracknext) if (!(srv-state SRV_MAINTAIN)) /* Only notify tracking servers that are not already in maintenance. */ set_server_down(srv-check); But that's not exactly enough because srv-check-server is only set when checks are enabled, so -server is NULL for tracking servers, still causing a crash upon first iteration. The fix is easy and consists in always initializing check-server when creating a new server, which is what was already done a few patches later by 69d29f9 (MEDIUM: cfgparse: Factor out check initialisation). With the fix above alone on top of current version or snapshot 20131122, the problem disappears. Thanks to Igor for testing and reporting the issue. --- src/checks.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/src/checks.c b/src/checks.c index cfbb8a3..a722c12 100644 --- a/src/checks.c +++ b/src/checks.c @@ -445,10 +445,10 @@ void set_server_down(struct check *check) s-counters.down_trans++; if (s-state SRV_CHECKED) - for(srv = s-tracknext; srv; srv = srv-tracknext) - if (! (srv-state SRV_MAINTAIN)) + for (srv = s-tracknext; srv; srv = srv-tracknext) + if (!(srv-state SRV_MAINTAIN)) /* Only notify tracking servers that are not already in maintenance. */ - set_server_down(check); + set_server_down(srv-check); } check-health = 0; /* failure */ @@ -520,10 +520,10 @@ void set_server_up(struct check *check) { send_log(s-proxy, LOG_NOTICE, %s.\n, trash.str); if (s-state SRV_CHECKED) - for(srv = s-tracknext; srv; srv = srv-tracknext) - if (! (srv-state SRV_MAINTAIN)) + for (srv = s-tracknext; srv; srv = srv-tracknext) + if (!(srv-state SRV_MAINTAIN)) /* Only notify tracking servers if they're not in maintenance. */ - set_server_up(check); + set_server_up(srv-check); } if (check-health = s-rise) -- 1.7.12.1
RE: set weight bug?
Hi! Here is my config http://pastie.org/private/wf0dv30krqpasgmhtdnahw (Deleted some servers and two backends for clear config) I used script to handle servers weight since haproxy-ss-20131031, so I never tried previous versions. Sorry, still can't reproduce. For problem 1 (after setting weight the unix socket, authentication on the http stats doesn't work anymore): can you send us the output of the following curl test, when its working and another output when its not working: curl -v http://admin:passw0rd@127.0.0.1:11190/ha-stats; /dev/null For problem 2 (haproxy crashes after multiple fast set weight commands): please generate a coredump, that will help the developers here understand what happens: http://www.mail-archive.com/haproxy@formilux.org/msg09472.html Regards, Lukas
set weight bug?
Using newest snapshot, when I do echo set weight s1/p1 100| socat stdio /tmp/haproxy to a server already has weight 100, then fresh haproxy's stat page, it requires password, and it doesn't accept the right password set in stats auth until I reload the haproxy. I have a script to set servers weight, I found sometimes set weight to servers rapidly, like multi echo set weight s(*)/p(*) 100| socat stdio /tmp/haproxy, will crash haproxy daemon. Bests, -Igor
RE: set weight bug?
Hi Igor, Using newest snapshot, when I do echo set weight s1/p1 100| socat stdio /tmp/haproxy to a server already has weight 100, then fresh haproxy's stat page, it requires password, and it doesn't accept the right password set in stats auth until I reload the haproxy. I have a script to set servers weight, I found sometimes set weight to servers rapidly, like multi echo set weight s(*)/p(*) 100| socat stdio /tmp/haproxy, will crash haproxy daemon. I can't reproduce neither problem. Can you post your configurations so we can try to reproduce? If you know a certain snapshot/release where this worked fine for you, please tell, this will help reducing the regression range (if its a bug). Regards, Lukas
Re: set weight bug?
Here is my config http://pastie.org/private/wf0dv30krqpasgmhtdnahw (Deleted some servers and two backends for clear config) I used script to handle servers weight since haproxy-ss-20131031, so I never tried previous versions. Bests, -Igor On Wed, Nov 6, 2013 at 5:55 AM, Lukas Tribus luky...@hotmail.com wrote: Hi Igor, Using newest snapshot, when I do echo set weight s1/p1 100| socat stdio /tmp/haproxy to a server already has weight 100, then fresh haproxy's stat page, it requires password, and it doesn't accept the right password set in stats auth until I reload the haproxy. I have a script to set servers weight, I found sometimes set weight to servers rapidly, like multi echo set weight s(*)/p(*) 100| socat stdio /tmp/haproxy, will crash haproxy daemon. I can't reproduce neither problem. Can you post your configurations so we can try to reproduce? If you know a certain snapshot/release where this worked fine for you, please tell, this will help reducing the regression range (if its a bug). Regards, Lukas