On 2007/11/18 6:04 PM, "Preston Norvell"
<[EMAIL PROTECTED]> muttered eloquently:

<snip> 
> The first is a basic issue with load balancing.  No matter which algorithm
> we choose, initial traffic is extremely heavily weighted towards the system in
> the table with the highest id.  In our experience so far, the only time
> more than one host is reliably used is when using the roundrobin type of
> load-balancing.  If 'loadbalance' or 'hash' is used, 99.9% of traffic ends
> up on a single host; some will end up on other hosts, sometimes only
> momentarily, and not in any way we've been able to see as deterministic.  The
> situation with 'loadbalance' we understand, since traffic from our test system
> on the internet is coming from essentially one address (though even in
> limited testing with a handful of additional requesting addresses, it
> appears to work the same).
> 
> With a test of traffic from our test host with roundrobin (50 separate,
> simultaneous single request/response sessions run for several seconds), 797
> of the requests ended up at the high id host and 628 across the remaining 7
> (89 or 90 for each).
> > 

We have discovered the issue with this unbalanced balancing.  The root cause
appears to be some invalid assumptions in the roundrobin code in the
relay_from_table function in relay.c.

If you look at the config (snipped here for space), you will notice that we
have 16 hosts in the appx table.  Hosts 9-16 are offline until further
notice, and it's their existence in the table that is causing the roundrobin
to be more of a half-moon robin.  If we remove them from the table, the
balancing returns to normal.

Here's the theory, borne out by experience and some snooping through the
code:

Basically when the requests start coming in, it tries #1 which is up and the
connection is sent there.  Then another connection comes in and it
roundrobins to #2 which is up so the connection is sent there, and so on and
so forth up to the 8th connection.  Then the ninth connection comes in, it
roundrobins to #9 which isn't up so it chews through the table (in backwards
order?), and finds #8 up first so it sends it to #8.  Then the tenth
connection comes in, which it roundrobins to #10, which isn't up so it chews
through the table and finds #8 up first so it sends it to #8.  This happens
until it's gone through the remaining hosts in the table, then it resets to
the first item in the table, sends the next connection to #1, and the next
to #2, etc.
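
The sequence above can be modelled with a minimal sketch, in Python rather
than C for brevity.  It models the behaviour as described, not the actual
relay_from_table code.  The function name simulate_observed, the backwards
fallback scan (the text itself is unsure of the direction), and the use of
1425 connections (797 + 628 from the quoted test) are all assumptions made
for illustration.

```python
from collections import Counter


def simulate_observed(up, connections):
    """Model the behaviour described above (not the actual relay.c code).

    up[i] is True when host i+1 is up.  The cursor visits every table slot
    in turn, whether or not the host in that slot is up.
    """
    if not any(up):
        raise ValueError("at least one host must be up")
    counts = Counter()
    cursor = 0
    for _ in range(connections):
        idx = cursor
        # Assumed fallback: scan backwards until an up host is found.
        while not up[idx]:
            idx = (idx - 1) % len(up)
        counts[idx + 1] += 1
        # The cursor advances one slot per connection, up or down.
        cursor = (cursor + 1) % len(up)
    return counts


# 16-entry table: hosts 1-8 up, hosts 9-16 down.
observed_table = [True] * 8 + [False] * 8
observed = simulate_observed(observed_table, 1425)
print(observed[8])                              # 801
print(sum(observed[h] for h in range(1, 8)))    # 624
```

Under these assumptions host 8 takes 9 of every 16 connections, giving 801
against 624 for the other seven hosts combined, which is close to the 797
and 628 in the quoted test.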

Pardon me if I get the exact interpretation wrong, as I haven't done C
programming in a very long time.  The balancer logic for roundrobin iterates
through the hosts in the table by incrementing a tracking variable in the
relay's struct.  It then breaks, and hops to the while loop to check if the
host is up.  If it's not up it iterates through the rest of the hosts in the
table until it finds one or runs out of items in the table.  If it runs out
it decides to run through the entire table from the top.  In either of these
cases, I believe the connection is dispatched to the first item it finds,
rather than the next one it should go to according to the theory of
roundrobin.
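
For contrast, here is a hedged sketch of what "the next one it should go to"
could look like: the cursor moves forward past down hosts and resumes just
after the host that was actually chosen.  This is one possible
interpretation, written in Python for brevity; pick_next_up is a hypothetical
name and this is not a patch for relay.c.

```python
from collections import Counter


def pick_next_up(up, cursor):
    """Return (chosen_index, new_cursor), skipping down hosts going forward."""
    n = len(up)
    for step in range(n):
        idx = (cursor + step) % n
        if up[idx]:
            # Resume after the chosen host, so down slots consume no turn.
            return idx, (idx + 1) % n
    raise RuntimeError("no hosts are up")


# Same 16-entry table: hosts 1-8 up, hosts 9-16 down.
fair_table = [True] * 8 + [False] * 8
fair = Counter()
cursor = 0
for _ in range(1600):
    idx, cursor = pick_next_up(fair_table, cursor)
    fair[idx + 1] += 1
print(sorted(fair.items()))
```

With the same table, each of the eight up hosts receives an equal share
(200 of 1600 here), regardless of how many down hosts sit in the table.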

This exactly matches the mathematical distribution of the sessions in our
logs.  In general the roundrobin seems to suffer from the assumption that a
large block of hosts wouldn't be down at one time.  This is an invalid
assumption (intentional or not) for a production environment where someone
may need to take down a substantial number of hosts at once for maintenance.
In addition, since the same logic is used for all three algorithms
(roundrobin, loadbalance, and hash), it explains why the non-roundrobin
modes were producing consistently incorrect balancing as well.  There is
some stickiness provided by the hash in these additional modes, but their
balancing seems to be similarly borked but in a more complicated fashion.
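
As a rough illustration of why the other modes could skew too, the sketch
below assumes that 'hash' and 'loadbalance' pick a starting slot across the
whole table (for example, a hash value modulo the table size) and then apply
the same down-host fallback.  Both the slot selection and the backwards scan
are assumptions; slot_owners is a hypothetical helper, not relayd code.

```python
from collections import Counter


def slot_owners(up):
    """For each table slot, return the host number that would serve it."""
    if not any(up):
        raise ValueError("at least one host must be up")
    owners = []
    for slot in range(len(up)):
        idx = slot
        # Assumed fallback: scan backwards until an up host is found.
        while not up[idx]:
            idx = (idx - 1) % len(up)
        owners.append(idx + 1)
    return owners


# 16-entry table: hosts 1-8 up, hosts 9-16 down.
owners = slot_owners([True] * 8 + [False] * 8)
print(owners)
print(Counter(owners)[8])   # slots served by host 8
```

Under these assumptions host 8 owns 9 of the 16 slots, so it would receive
more than half of evenly hashed traffic.  That does not by itself account
for the 99.9% figure, which also reflects the test traffic coming from
essentially one address.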

<snip>

Thoughts?

Thanks much,

;P mn

--
Preston M Norvell <[EMAIL PROTECTED]>
Systems/Network Administrator
Serials Solutions <http://www.serialssolutions.com>
Phone:  (866) SERIALS (737-4257) ext 1094
