We have been trying to migrate from an Apache proxy balancer to hoststated
and have run into a couple of issues, one of which I document here and another
I will send along later.

We are using 4.2-stable:
OpenBSD mesh1 4.2 GENERIC.MP#1378 amd64

Our first issue is in getting load balancing to occur in a deterministic
fashion.  There are actually two parts to this, but first a basic
description of what we are trying to do...

We are doing layer 7 balancing with http for hosts connecting from the
Internet.  Traffic is to be spread across a sizeable number of machines (8
during this testing, but 16 or more in production).  Because the application
to be balanced is very sensitive to sessions (requests are made, results
take a while to queue and are typically referenced in subsequent requests),
cookies or GET variables (depending on how the application is accessed) are
used to manage state.

The first is a basic issue with load balancing.  No matter which algorithm
we choose, initial traffic is extremely heavily weighted towards the system in
the table with the highest id.  In our experience so far, the only time
more than one host is reliably used is when using the roundrobin type of
load balancing.  If 'loadbalance' or 'hash' is used, 99.9% of traffic ends
up on a single host; some requests end up on other hosts, but only
momentarily, and not in any way we have been able to identify as
deterministic.  The situation with 'loadbalance' we understand, since our
test traffic from the Internet is coming from essentially one address (though
even in limited testing with a handful of additional requesting addresses, it
appears to behave the same way).

With a test of traffic from our test host with roundrobin (50 separate,
simultaneous single request/response sessions run for several seconds), 797
of the requests ended up at the high id host and 628 across the remaining 7
(89 or 90 for each).
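
To put those numbers in perspective, here is the arithmetic as a small
Python sketch.  The variable names are ours; the counts are the ones from
the test run above.

```python
# Request counts from the roundrobin test run.
high_id_host = 797        # requests that reached the highest-id host
other_hosts_total = 628   # requests spread across the remaining 7 hosts
num_hosts = 8

total = high_id_host + other_hosts_total
even_share = total / num_hosts                        # what a perfectly even split would give
per_other_host = other_hosts_total / (num_hosts - 1)  # what the other 7 actually got
high_fraction = high_id_host / total                  # share taken by the high-id host

print(total, even_share, round(per_other_host, 1), round(high_fraction * 100))
```

An even split would be about 178 requests per host; instead the high id
host took roughly 56% of the total and each of the others about half of
an even share.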

For our application, this level of unbalanced weighting towards one host
would be bad, very bad.  Our hoststated.conf contents are a little further
down.  Our hope is that we are missing something basic in the configuration.
Is this the expected behavior, or have we misconfigured something?

The second issue is related to this in that we have a value in the
sessionid in the cookie (or the GET variables depending on how folks are
accessing the system).  This variable helps us point the request at a
specific machine in the load balance pool since, as stated, it is very
important subsequent requests end up going to the same host.  For instance,
the variable might be something like sessionID=1234bcdfadedf.APP1.  With our
Apache load balancer we can grab this value from the request, extract the
APP1 off the end, then route to a specific host associated with APP1.  It
doesn't appear, either from our reading of the man page or from any
configuration we have attempted, that this kind of deterministic
routing is possible in hoststated.  Is this possible with hoststated?
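
To make the routing we are after concrete, here is a minimal Python sketch
of the logic.  It is neither our Apache configuration nor anything
hoststated provides; the BACKENDS mapping (APP1 to 10.100.0.201 and so on)
and the backend_for_session name are hypothetical and only illustrate the
idea.

```python
# Hypothetical mapping from session suffix to backend address.
BACKENDS = {
    "APP1": "10.100.0.201",
    "APP2": "10.100.0.202",
}

def backend_for_session(session_id, backends=BACKENDS):
    """Return the backend named by the suffix after the last '.', or None."""
    _, sep, suffix = session_id.rpartition(".")
    if not sep:
        # No '.' in the value, so there is no suffix to route on.
        return None
    return backends.get(suffix)

print(backend_for_session("1234bcdfadedf.APP1"))
```

A request with no recognizable suffix returns None here; in practice that
is the case where we would want to fall back to normal load balancing.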

Any help is greatly appreciated.

The rest of this post is our config (some of the various permutations of
load balancing are left commented out):


#== MACROS
 #== IPs
CS1="10.100.0.201"
CS2="10.100.0.202"
CS3="10.100.0.203"
CS4="10.100.0.204"
CS5="10.100.0.205"
CS6="10.100.0.206"
CS7="10.100.0.207"
CS8="10.100.0.208"
CS9="10.100.0.209"
CS10="10.100.0.210"
CS11="10.100.0.211"
CS12="10.100.0.212"
CS13="10.100.0.213"
CS14="10.100.0.214"
CS15="10.100.0.215"
CS16="10.100.0.216"

#== GLOBAL OPTIONS
interval 10
timeout 200
prefork 5
log updates

#== TABLES
# mapped to tables in pf.conf
table appx {
        real port 8080

        #== check hosts in table via hash of systemStatus
        #check http "/systemStatus" \
        #        digest "fc3626a53938f22eed804aaec987cfe0b762b9f8"
        check icmp
 
        #== the actual hosts to push into the table
        host $CS1
        host $CS2
        host $CS3
        host $CS4
        host $CS5
        host $CS6
        host $CS7
        host $CS8
        host $CS9
        host $CS10
        host $CS11
        host $CS12
        host $CS13
        host $CS14
        host $CS15
        host $CS16
}

#== PROTOCOLS
# Layer 7 games

protocol appx {
        #== define what protocol we're manipulating
        protocol http

        #== RFC2616
        header append "$REMOTE_ADDR" to "X-Forwarded-For"
        header append "$SERVER_ADDR:$SERVER_PORT" to "X-Forwarded-By"

        #== grab session in URL for sru and log
        # order is important

        #== colon delimiters
        request url hash "::*::" log

        #== url encoded colon delimiters
        request url hash "%3A%3A*%3A%3A" log

        #== grab session from Cookie header in app and log
        request cookie hash "csSessionId" log

        #== TCP performance options
        tcp { nodelay, sack, socket buffer 65536, backlog 128 }
}

#== RELAYS
# this is where the things actually happen
relay appx {
        #== use protocol defined above
        protocol appx

        #== bind to ip and listen
        listen on $IP_CSAPP port 6000

        #== loadbalance with hash from request in protocols section
        #table appx hash
        table appx roundrobin

        #table appx loadbalance
}

Thanks much,

;P mn

--
Preston M Norvell <[EMAIL PROTECTED]>
