hi!

On Wed, Nov 21, 2007 at 11:34:02PM -0800, Preston Norvell wrote:
> <snip> 
> > The first is a basic issue with load balancing.  No matter which algorithm
> > we choose, initial traffic is extremely heavily weighted towards the system in
> > the table with the highest id.  In our experience so far, the only time
> > more than one host is reliably used is when using the roundrobin type of
> > load-balancing.  If 'loadbalance' or 'hash' is used, 99.9% of traffic ends
> > up on a single host; some will end up on other hosts, sometimes momentarily
> > though, and not in any way we've been able to see as deterministic.  The
> > situation with 'loadbalance' we understand, since the traffic from our test
> > system on the internet is coming from essentially one address (though even in
> > limited testing with a handful of additional requesting addresses, it
> > appears to work the same).
> > 
> > With a test of traffic from our test host with roundrobin (50 separate,
> > simultaneous single request/response sessions run for several seconds), 797
> > of the requests ended up at the high id host and 628 across the remaining 7
> > (89 or 90 for each).
> > > 
> 
> We have discovered the issue with this unbalanced balancing.  The root cause
> appears to be some invalid assumptions in the roundrobin code in the
> relay_from_table function in relay.c.
> 

- please try the attached diff; it fixes the roundrobin mode by
saving the last index and traversing to the next available host.

(you can also have a look at my little test program to verify the
algorithm: http://team.vantronix.net/~reyk/q.c)
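
the idea of saving the last index and moving on to the next available
host can be sketched roughly as below. this is only an illustration, not
the hoststated code and not the q.c test program: struct rr_host,
struct rr_table, rr_next() and NHOSTS are made-up names, and the "up"
flag stands in for whatever host state check the relay really does.

```c
#include <stddef.h>

#define NHOSTS 4	/* hypothetical table size for this sketch */

struct rr_host {
	int idx;	/* position in the table, like host->idx in the diff */
	int up;		/* non-zero if the host is currently usable */
};

struct rr_table {
	struct rr_host hosts[NHOSTS];
	int nhosts;
	int key;	/* where the next search starts, like rlay->dstkey */
};

/*
 * Pick the next available host, starting at the saved key and
 * wrapping around once.  Returns NULL if no host is up.
 */
struct rr_host *
rr_next(struct rr_table *t)
{
	int i, idx;

	if (t->key >= t->nhosts)
		t->key = 0;
	for (i = 0; i < t->nhosts; i++) {
		idx = (t->key + i) % t->nhosts;
		if (t->hosts[idx].up) {
			/* resume after the host that was actually chosen */
			t->key = t->hosts[idx].idx + 1;
			return (&t->hosts[idx]);
		}
	}
	return (NULL);
}
```

the point of storing the chosen host's index plus one, instead of just
incrementing the key, is that a skipped (down) host does not cause the
same fallback host to be picked again and again.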

- i'm also looking into improving the loadbalance mode. the attached
diff includes the source port in loadbalance mode and the destination
(relay) port in loadbalance and hash mode. also make sure that you
feed in other variables if you want to get better results, for example
        request hash "Host"

to feed the virtual hostname into the hash/loadbalance hash.
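
to illustrate why feeding the source port into the hash helps when most
traffic comes from one address, here is a small standalone sketch. it is
not the relay code: toy_hash_buf() is only a stand-in for hash32_buf()
(a simple multiply-by-33 byte hash), and struct client_key,
pick_by_addr() and pick_by_addr_port() are hypothetical names made up
for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* stand-in for hash32_buf(): a simple multiply-by-33 byte hash */
static uint32_t
toy_hash_buf(const void *buf, size_t len, uint32_t hash)
{
	const unsigned char *p = buf;

	while (len--)
		hash = hash * 33 + *p++;
	return (hash);
}

/* hypothetical client key: IPv4 address and source port */
struct client_key {
	uint32_t addr;
	uint16_t port;
};

/* address only: every session from one client maps to the same host */
int
pick_by_addr(const struct client_key *k, int nhosts)
{
	uint32_t p;

	p = toy_hash_buf(&k->addr, sizeof(k->addr), 5381);
	return ((int)(p % (uint32_t)nhosts));
}

/* address plus source port: sessions from one client can spread out */
int
pick_by_addr_port(const struct client_key *k, int nhosts)
{
	uint32_t p;

	p = toy_hash_buf(&k->addr, sizeof(k->addr), 5381);
	/* hash only the bytes of the port itself */
	p = toy_hash_buf(&k->port, sizeof(k->port), p);
	return ((int)(p % (uint32_t)nhosts));
}
```

with a single client address, pick_by_addr() always returns the same
index, while pick_by_addr_port() can vary with the source port of each
connection.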

reyk

Index: hoststated.h
===================================================================
RCS file: /cvs/src/usr.sbin/hoststated/hoststated.h,v
retrieving revision 1.81
diff -u -p -r1.81 hoststated.h
--- hoststated.h        22 Nov 2007 10:09:53 -0000      1.81
+++ hoststated.h        22 Nov 2007 11:45:00 -0000
@@ -327,6 +327,7 @@ struct host {
        u_long                   up_cnt;
        int                      retry_cnt;
        struct ctl_tcp_event     cte;
+       int                      idx;
 };
 TAILQ_HEAD(hostlist, host);
 
Index: relay.c
===================================================================
RCS file: /cvs/src/usr.sbin/hoststated/relay.c,v
retrieving revision 1.65
diff -u -p -r1.65 relay.c
--- relay.c     22 Nov 2007 10:09:53 -0000      1.65
+++ relay.c     22 Nov 2007 11:45:01 -0000
@@ -463,6 +463,7 @@ relay_init(void)
                                if (rlay->dstnhosts >= RELAY_MAXHOSTS)
                                        fatal("relay_init: "
                                            "too many hosts in table");
+                               host->idx = rlay->dstnhosts;
                                rlay->dsthost[rlay->dstnhosts++] = host;
                        }
                        log_info("adding %d hosts from table %s%s",
@@ -1876,10 +1877,14 @@ relay_hash_addr(struct sockaddr_storage 
                sin4 = (struct sockaddr_in *)ss;
                p = hash32_buf(&sin4->sin_addr,
                    sizeof(struct in_addr), p);
+               p = hash32_buf(&sin4->sin_port,
+                   sizeof(sin4->sin_port), p);
        } else {
                sin6 = (struct sockaddr_in6 *)ss;
                p = hash32_buf(&sin6->sin6_addr,
                    sizeof(struct in6_addr), p);
+               p = hash32_buf(&sin6->sin6_port,
+                   sizeof(sin6->sin6_port), p);
        }
 
        return (p);
@@ -1903,7 +1908,7 @@ relay_from_table(struct session *con)
        case RELAY_DSTMODE_ROUNDROBIN:
                if ((int)rlay->dstkey >= rlay->dstnhosts)
                        rlay->dstkey = 0;
-               idx = (int)rlay->dstkey++;
+               idx = (int)rlay->dstkey;
                break;
        case RELAY_DSTMODE_LOADBALANCE:
                p = relay_hash_addr(&con->in.ss, p);
@@ -1933,6 +1938,8 @@ relay_from_table(struct session *con)
        fatalx("relay_from_table: no active hosts, desynchronized");
 
  found:
+       if (rlay->conf.dstmode == RELAY_DSTMODE_ROUNDROBIN)
+               rlay->dstkey = host->idx + 1;
        con->retry = host->conf.retry;
        con->out.port = table->conf.port;
        bcopy(&host->conf.ss, &con->out.ss, sizeof(con->out.ss));
