Hi Igor,

On Thu, Nov 21, 2013 at 09:03:05PM +0800, Igor wrote:
> Thanks Willy, because I use snapshot haproxy in production, so I have
> no change to do more investigation, glad you  could reproduce the bug
> :)

now you'll have something a bit more reliable to work with. I've just
committed the two following fixes :

e7b7348 BUG/MEDIUM: checks: fix slow start regression after fix attempt
004e045 BUG/MAJOR: server: weight calculation fails for map-based algorithms

The first one is a crash when using slowstart without checks. It's not your
case but you'll probably want to fix it in case you happen to switch to
slowstart. I encountered it while trying to reproduce your bug based on
Lukas' findings.

The second one fixes the issue you're facing which is also what Lukas
noticed (wrong weight after a set weight on a map-based algorithm).
I can confirm that it resulted in the total backend weight to be larger
than the table, causing out of bounds accesses after a weight change
as soon as the map indx went far enough (depending on your load and
the total initial weights, it could take a few seconds to a few minutes).

The fix is quite large because I wanted to get rid of all places where
the computations were hazardous (and there were quite a few). From my
opinion and my tests, it now correcty covers all situations.

Thanks for reporting it!

Willy

>From e7b73485d0a2dbf7572f78756234337d1277b46c Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Thu, 21 Nov 2013 11:50:50 +0100
Subject: BUG/MEDIUM: checks: fix slow start regression after fix attempt

Commit 2e99390 (BUG/MEDIUM: checks: fix slowstart behaviour when server
tracking is in use) moved the slowstart task initialization within the
health check code and leaves it unset when checks are disabled. The
problem is that it's possible to trigger slowstart from the CLI by
issuing "disable server XXX / enable server XXX" even when checks are
disabled. The result is a crash when trying to wake up the slowstart
task of that server.

Move the task initialization earlier so that it is done even if the
checks are disabled.

This patch should be backported to 1.4 since the commit above was
backported there.
---
 src/checks.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/checks.c b/src/checks.c
index 44ad4b7..7269ac9 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -1553,6 +1553,20 @@ int start_checks() {
         */
        for (px = proxy; px; px = px->next) {
                for (s = px->srv; s; s = s->next) {
+                       if (s->slowstart) {
+                               if ((t = task_new()) == NULL) {
+                                       Alert("Starting [%s:%s] check: out of 
memory.\n", px->id, s->id);
+                                       return -1;
+                               }
+                               /* We need a warmup task that will be called 
when the server
+                                * state switches from down to up.
+                                */
+                               s->warmup = t;
+                               t->process = server_warmup;
+                               t->context = s;
+                               t->expire = TICK_ETERNITY;
+                       }
+
                        if (!(s->state & SRV_CHECKED))
                                continue;
 
@@ -1576,20 +1590,6 @@ int start_checks() {
         */
        for (px = proxy; px; px = px->next) {
                for (s = px->srv; s; s = s->next) {
-                       if (s->slowstart) {
-                               if ((t = task_new()) == NULL) {
-                                       Alert("Starting [%s:%s] check: out of 
memory.\n", px->id, s->id);
-                                       return -1;
-                               }
-                               /* We need a warmup task that will be called 
when the server
-                                * state switches from down to up.
-                                */
-                               s->warmup = t;
-                               t->process = server_warmup;
-                               t->context = s;
-                               t->expire = TICK_ETERNITY;
-                       }
-
                        if (!(s->state & SRV_CHECKED))
                                continue;
 
-- 
1.7.12.1

>From 004e045f3141cedcfe578a102a2a0d4e68363a15 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Thu, 21 Nov 2013 11:22:01 +0100
Subject: BUG/MAJOR: server: weight calculation fails for map-based algorithms

A crash was reported by Igor at owind when changing a server's weight
on the CLI. Lukas Tribus could reproduce a related bug where setting
a server's weight would result in the new weight being multiplied by
the initial one. The two bugs are the same.

The incorrect weight calculation results in the total farm weight being
larger than what was initially allocated, causing the map index to be out
of bounds on some hashes. It's easy to reproduce using "balance url_param"
with a variable param, or with "balance static-rr".

It appears that the calculation is made at many places and is not always
right and not always wrong the same way. Thus, this patch introduces a
new function "server_recalc_eweight()" which is dedicated to this task
of computing ->eweight from many other elements including uweight and
current time (for slowstart), and all users now switch to use this
function.

The patch is a bit large but the code was not trivially fixable in a way
that could guarantee this situation would not occur anymore. The fix is
much more readable and has been verified to work with all algorithms,
with both consistent and map-based hashes, and even with static-rr.

Slowstart was tested as well, just like enable/disable server.

The same bug is very likely present in 1.4 as well, so the patch will
probably need to be backported eventhough it will not apply as-is.

Thanks to Lukas and Igor for the information they provided to reproduce it.
---
 include/proto/server.h |  6 +++++
 src/checks.c           | 30 +++----------------------
 src/lb_chash.c         |  3 ++-
 src/lb_fas.c           |  3 ++-
 src/lb_fwlc.c          |  3 ++-
 src/lb_fwrr.c          |  3 ++-
 src/lb_map.c           |  2 +-
 src/proto_http.c       | 22 +-----------------
 src/server.c           | 61 +++++++++++++++++++++++++++++++-------------------
 9 files changed, 57 insertions(+), 76 deletions(-)

diff --git a/include/proto/server.h b/include/proto/server.h
index 1151b3c..d19c6ac 100644
--- a/include/proto/server.h
+++ b/include/proto/server.h
@@ -58,6 +58,12 @@ struct srv_kw *srv_find_kw(const char *kw);
 /* Dumps all registered "server" keywords to the <out> string pointer. */
 void srv_dump_kws(char **out);
 
+/* Recomputes the server's eweight based on its state, uweight, the current 
time,
+ * and the proxy's algorihtm. To be used after updating sv->uweight. The warmup
+ * state is automatically disabled if the time is elapsed.
+ */
+void server_recalc_eweight(struct server *sv);
+
 /*
  * Parses weight_str and configures sv accordingly.
  * Returns NULL on success, error message string otherwise.
diff --git a/src/checks.c b/src/checks.c
index 7269ac9..cfbb8a3 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -481,18 +481,10 @@ void set_server_up(struct check *check) {
 
                if (s->slowstart > 0) {
                        s->state |= SRV_WARMINGUP;
-                       if (s->proxy->lbprm.algo & BE_LB_PROP_DYN) {
-                               /* For dynamic algorithms, start at the first 
step of the weight,
-                                * without multiplying by BE_WEIGHT_SCALE.
-                                */
-                               s->eweight = s->uweight;
-                               if (s->proxy->lbprm.update_server_eweight)
-                                       
s->proxy->lbprm.update_server_eweight(s);
-                       }
                        task_schedule(s->warmup, tick_add(now_ms, 
MS_TO_TICKS(MAX(1000, s->slowstart / 20))));
                }
-               if (s->proxy->lbprm.set_server_status_up)
-                       s->proxy->lbprm.set_server_status_up(s);
+
+               server_recalc_eweight(s);
 
                /* If the server is set with "on-marked-up 
shutdown-backup-sessions",
                 * and it's not a backup server and its effective weight is > 0,
@@ -1273,23 +1265,7 @@ static struct task *server_warmup(struct task *t)
        if ((s->state & (SRV_RUNNING|SRV_WARMINGUP|SRV_MAINTAIN)) != 
(SRV_RUNNING|SRV_WARMINGUP))
                return t;
 
-       if (now.tv_sec < s->last_change || now.tv_sec >= s->last_change + 
s->slowstart) {
-               /* go to full throttle if the slowstart interval is reached */
-               s->state &= ~SRV_WARMINGUP;
-               if (s->proxy->lbprm.algo & BE_LB_PROP_DYN)
-                       s->eweight = s->uweight * BE_WEIGHT_SCALE;
-               if (s->proxy->lbprm.update_server_eweight)
-                       s->proxy->lbprm.update_server_eweight(s);
-       }
-       else if (s->proxy->lbprm.algo & BE_LB_PROP_DYN) {
-               /* for dynamic algorithms, let's slowly update the weight */
-               s->eweight = (BE_WEIGHT_SCALE * (now.tv_sec - s->last_change) +
-                             s->slowstart - 1) / s->slowstart;
-               s->eweight *= s->uweight;
-               if (s->proxy->lbprm.update_server_eweight)
-                       s->proxy->lbprm.update_server_eweight(s);
-       }
-       /* Note that static algorithms are already running at full throttle */
+       server_recalc_eweight(s);
 
        /* probably that we can refill this server with a bit more connections 
*/
        check_for_pending(s);
diff --git a/src/lb_chash.c b/src/lb_chash.c
index d65de74..41e06c7 100644
--- a/src/lb_chash.c
+++ b/src/lb_chash.c
@@ -382,7 +382,8 @@ void chash_init_server_tree(struct proxy *p)
 
        p->lbprm.wdiv = BE_WEIGHT_SCALE;
        for (srv = p->srv; srv; srv = srv->next) {
-               srv->prev_eweight = srv->eweight = srv->uweight * 
BE_WEIGHT_SCALE;
+               srv->eweight = (srv->uweight * p->lbprm.wdiv + p->lbprm.wmult - 
1) / p->lbprm.wmult;
+               srv->prev_eweight = srv->eweight;
                srv->prev_state = srv->state;
        }
 
diff --git a/src/lb_fas.c b/src/lb_fas.c
index daff666..b4cfc79 100644
--- a/src/lb_fas.c
+++ b/src/lb_fas.c
@@ -252,7 +252,8 @@ void fas_init_server_tree(struct proxy *p)
 
        p->lbprm.wdiv = BE_WEIGHT_SCALE;
        for (srv = p->srv; srv; srv = srv->next) {
-               srv->prev_eweight = srv->eweight = srv->uweight * 
BE_WEIGHT_SCALE;
+               srv->eweight = (srv->uweight * p->lbprm.wdiv + p->lbprm.wmult - 
1) / p->lbprm.wmult;
+               srv->prev_eweight = srv->eweight;
                srv->prev_state = srv->state;
        }
 
diff --git a/src/lb_fwlc.c b/src/lb_fwlc.c
index c2fcb4e..de1896e 100644
--- a/src/lb_fwlc.c
+++ b/src/lb_fwlc.c
@@ -244,7 +244,8 @@ void fwlc_init_server_tree(struct proxy *p)
 
        p->lbprm.wdiv = BE_WEIGHT_SCALE;
        for (srv = p->srv; srv; srv = srv->next) {
-               srv->prev_eweight = srv->eweight = srv->uweight * 
BE_WEIGHT_SCALE;
+               srv->eweight = (srv->uweight * p->lbprm.wdiv + p->lbprm.wmult - 
1) / p->lbprm.wmult;
+               srv->prev_eweight = srv->eweight;
                srv->prev_state = srv->state;
        }
 
diff --git a/src/lb_fwrr.c b/src/lb_fwrr.c
index 7f5c8a9..7cf0cc7 100644
--- a/src/lb_fwrr.c
+++ b/src/lb_fwrr.c
@@ -272,7 +272,8 @@ void fwrr_init_server_groups(struct proxy *p)
 
        p->lbprm.wdiv = BE_WEIGHT_SCALE;
        for (srv = p->srv; srv; srv = srv->next) {
-               srv->prev_eweight = srv->eweight = srv->uweight * 
BE_WEIGHT_SCALE;
+               srv->eweight = (srv->uweight * p->lbprm.wdiv + p->lbprm.wmult - 
1) / p->lbprm.wmult;
+               srv->prev_eweight = srv->eweight;
                srv->prev_state = srv->state;
        }
 
diff --git a/src/lb_map.c b/src/lb_map.c
index 9858249..7ecd003 100644
--- a/src/lb_map.c
+++ b/src/lb_map.c
@@ -180,7 +180,7 @@ void init_server_map(struct proxy *p)
 
        act = bck = 0;
        for (srv = p->srv; srv; srv = srv->next) {
-               srv->eweight = srv->uweight / pgcd;
+               srv->eweight = (srv->uweight * p->lbprm.wdiv + p->lbprm.wmult - 
1) / p->lbprm.wmult;
                srv->prev_eweight = srv->eweight;
                srv->prev_state = srv->state;
                if (srv->state & SRV_BACKUP)
diff --git a/src/proto_http.c b/src/proto_http.c
index b5e912e..5ad865b 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -2932,28 +2932,8 @@ int http_process_req_stat_post(struct stream_interface 
*si, struct http_txn *txn
                                                else
                                                        sv->uweight = 0;
 
-                                               if (px->lbprm.algo & 
BE_LB_PROP_DYN) {
-                                                       /* we must take care of 
not pushing the server to full throttle during slow starts */
-                                                       if ((sv->state & 
SRV_WARMINGUP) && (px->lbprm.algo & BE_LB_PROP_DYN))
-                                                               sv->eweight = 
(BE_WEIGHT_SCALE * (now.tv_sec - sv->last_change) + sv->slowstart - 1) / 
sv->slowstart;
-                                                       else
-                                                               sv->eweight = 
BE_WEIGHT_SCALE;
-                                                       sv->eweight *= 
sv->uweight;
-                                               } else {
-                                                       sv->eweight = 
sv->uweight;
-                                               }
+                                               server_recalc_eweight(sv);
 
-                                               /* static LB algorithms are a 
bit harder to update */
-                                               if 
(px->lbprm.update_server_eweight)
-                                                       
px->lbprm.update_server_eweight(sv);
-                                               else if (sv->eweight) {
-                                                       if 
(px->lbprm.set_server_status_up)
-                                                               
px->lbprm.set_server_status_up(sv);
-                                               }
-                                               else {
-                                                       if 
(px->lbprm.set_server_status_down)
-                                                               
px->lbprm.set_server_status_down(sv);
-                                               }
                                                altered_servers++;
                                                total_servers++;
                                                break;
diff --git a/src/server.c b/src/server.c
index dd6e9e6..efba257 100644
--- a/src/server.c
+++ b/src/server.c
@@ -158,6 +158,43 @@ static void __listener_init(void)
        srv_register_keywords(&srv_kws);
 }
 
+/* Recomputes the server's eweight based on its state, uweight, the current 
time,
+ * and the proxy's algorihtm. To be used after updating sv->uweight. The warmup
+ * state is automatically disabled if the time is elapsed.
+ */
+void server_recalc_eweight(struct server *sv)
+{
+       struct proxy *px = sv->proxy;
+       unsigned w;
+
+       if (now.tv_sec < sv->last_change || now.tv_sec >= sv->last_change + 
sv->slowstart) {
+               /* go to full throttle if the slowstart interval is reached */
+               sv->state &= ~SRV_WARMINGUP;
+       }
+
+       /* We must take care of not pushing the server to full throttle during 
slow starts.
+        * It must also start immediately, at least at the minimal step when 
leaving maintenance.
+        */
+       if ((sv->state & SRV_WARMINGUP) && (px->lbprm.algo & BE_LB_PROP_DYN))
+               w = (px->lbprm.wdiv * (now.tv_sec - sv->last_change) + 
sv->slowstart) / sv->slowstart;
+       else
+               w = px->lbprm.wdiv;
+
+       sv->eweight = (sv->uweight * w + px->lbprm.wmult - 1) / px->lbprm.wmult;
+
+       /* now propagate the status change to any LB algorithms */
+       if (px->lbprm.update_server_eweight)
+               px->lbprm.update_server_eweight(sv);
+       else if (sv->eweight) {
+               if (px->lbprm.set_server_status_up)
+                       px->lbprm.set_server_status_up(sv);
+       }
+       else {
+               if (px->lbprm.set_server_status_down)
+                       px->lbprm.set_server_status_down(sv);
+       }
+}
+
 /*
  * Parses weight_str and configures sv accordingly.
  * Returns NULL on success, error message string otherwise.
@@ -199,29 +236,7 @@ const char *server_parse_weight_change_request(struct 
server *sv,
                return "Backend is using a static LB algorithm and only accepts 
weights '0%' and '100%'.\n";
 
        sv->uweight = w;
-
-       if (px->lbprm.algo & BE_LB_PROP_DYN) {
-       /* we must take care of not pushing the server to full throttle during 
slow starts */
-               if ((sv->state & SRV_WARMINGUP))
-                       sv->eweight = (BE_WEIGHT_SCALE * (now.tv_sec - 
sv->last_change) + sv->slowstart - 1) / sv->slowstart;
-               else
-                       sv->eweight = BE_WEIGHT_SCALE;
-               sv->eweight *= sv->uweight;
-       } else {
-               sv->eweight = sv->uweight;
-       }
-
-       /* static LB algorithms are a bit harder to update */
-       if (px->lbprm.update_server_eweight)
-               px->lbprm.update_server_eweight(sv);
-       else if (sv->eweight) {
-               if (px->lbprm.set_server_status_up)
-                       px->lbprm.set_server_status_up(sv);
-       }
-       else {
-               if (px->lbprm.set_server_status_down)
-                       px->lbprm.set_server_status_down(sv);
-       }
+       server_recalc_eweight(sv);
 
        return NULL;
 }
-- 
1.7.12.1

Reply via email to