On 06.05.2009 15:08, Jim Jagielski wrote:
> 
> On May 6, 2009, at 4:35 AM, Jess Holle wrote:
> 
>>
>> Of course that redoes what a servlet engine would be doing, and
>> does so with lower fidelity.  An ability to ask a backend for its
>> current session count and load balance new requests on that basis
>> would be really helpful.  Whether this ability is built into AJP,
>> for instance, or is simply a separate request to a designated URL
>> is another question, but the latter approach seems fairly general,
>> and the number of such requests could be throttled by a
>> time-to-live setting on the last such count obtained.
>>
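(As an illustration of the "designated URL" idea: on a servlet
backend this could be as small as a session listener plus a status
servlet. All names here are made up for the sketch, not an existing
API, and the two classes would live in separate source files.)

    import java.io.IOException;
    import java.util.concurrent.atomic.AtomicInteger;
    import javax.servlet.http.*;

    // Assumed listener, registered in web.xml, that tracks the
    // number of active sessions.
    public class SessionCounter implements HttpSessionListener {
        private static final AtomicInteger ACTIVE = new AtomicInteger();
        public void sessionCreated(HttpSessionEvent e) {
            ACTIVE.incrementAndGet();
        }
        public void sessionDestroyed(HttpSessionEvent e) {
            ACTIVE.decrementAndGet();
        }
        public static int activeSessions() {
            return ACTIVE.get();
        }
    }

    // Hypothetical status servlet, mapped to a designated URL such
    // as /balancer-status, that reports the current session count.
    public class StatusServlet extends HttpServlet {
        protected void doGet(HttpServletRequest req,
                             HttpServletResponse resp) throws IOException {
            resp.setContentType("text/plain");
            resp.getWriter().println("active-sessions: "
                    + SessionCounter.activeSessions());
        }
    }
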
>> Actually this could and should be generalized beyond active
>> sessions to a back-end health metric.  Each backend could compute
>> and respond with a relative measure of busyness/health, and the
>> load balancer could then balance new (session-less) requests to the
>> least busy / most healthy backend.  This would seem to be a *huge*
>> step forward in load balancing capability/fidelity.
>>
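(On the balancer side, "least busy" could be as simple as caching
the last reported number per backend, with a time-to-live as
suggested above. Again only a sketch with invented names, nothing
resembling the actual mod_proxy code:)

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative least-busy picker with a TTL on cached metrics.
    public class LeastBusyPicker {
        static final class Status {
            final double busyness;
            final long fetchedAt;
            Status(double b, long t) { busyness = b; fetchedAt = t; }
        }

        private final Map<String, Status> cache =
                new ConcurrentHashMap<String, Status>();
        private final long ttlMillis;

        LeastBusyPicker(long ttlMillis) { this.ttlMillis = ttlMillis; }

        String pick(List<String> backends) {
            String best = null;
            double bestBusyness = Double.MAX_VALUE;
            long now = System.currentTimeMillis();
            for (String b : backends) {
                Status s = cache.get(b);
                if (s == null || now - s.fetchedAt > ttlMillis) {
                    // fetchBusyness() would issue the status request
                    // to the designated URL; stubbed out here.
                    s = new Status(fetchBusyness(b), now);
                    cache.put(b, s);
                }
                if (s.busyness < bestBusyness) {
                    bestBusyness = s.busyness;
                    best = b;
                }
            }
            return best;
        }

        double fetchBusyness(String backend) { return 0.0; }
    }
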
> 
> The trick, of course, at least with HTTP, is that querying the
> backend is itself a request, and so one needs to worry about such
> things as keepalives and persistent connections, how long we wait
> for responses, etc...
> 
> That's why oob-like health-and-status chatter is nice, because
> it doesn't interfere with the normal reverse-proxy/host logic.
> 
> An idea: instead of asking for this info before sending the
> request, what about the backend sending it as part of the response,
> as a response header?  You don't know the status of the machine
> "now", but you do know its status right after it handled the last
> request (the last time you saw it) and, assuming nothing else
> touched it, that status is likely still "good".  Latency will be an
> issue, of course... Overlapping requests, where you don't have the
> response from req1 before you send req2, mean that both requests see
> the server in the same state, whereas of course it isn't, but it may
> even out, since req3, for example (which happens after req1 is done),
> thinks that the backend has 2 concurrent requests instead of the
> actual 1 (req2) and so maybe isn't selected... The hysteresis would
> be interesting to model :)
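
That piggyback variant might look like this on the balancer side
(the header name and all class names below are made up for the
sketch):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch: remember the load each backend reported in its last
    // response header (e.g. a made-up "X-Backend-Load: 0.37") and
    // route new session-less requests to the lowest last-seen value.
    // As discussed above, the value is always at least one response
    // old, and overlapping requests see the same stale state.
    public class PiggybackTracker {
        private final Map<String, Double> lastSeenLoad =
                new ConcurrentHashMap<String, Double>();

        // Called from the proxy's response path.
        void onResponse(String backend, String loadHeader) {
            if (loadHeader == null) {
                return; // no header: keep the previous value
            }
            try {
                lastSeenLoad.put(backend, Double.valueOf(loadHeader));
            } catch (NumberFormatException ignored) {
                // malformed header: keep the previous value
            }
        }

        // Called when picking a backend for a new request.
        String pickLeastLoaded(Iterable<String> backends) {
            String best = null;
            double bestLoad = Double.MAX_VALUE;
            for (String b : backends) {
                Double load = lastSeenLoad.get(b);
                double l = (load == null) ? 0.0 : load.doubleValue();
                if (l < bestLoad) {
                    bestLoad = l;
                    best = b;
                }
            }
            return best;
        }
    }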

I think asking each time before sending data is too much overhead in
general. Of course it depends on how accurately you try to distribute
load. I would expect that in most situations the overhead of a
per-request-accurate decision does not pay off, especially since under
high load there is always a time window between getting the data and
handling the request, and a lot of concurrent requests will already
have changed the data again.

I expect that in most cases a granularity of status data between once
per second and once per minute will be appropriate (still a factor of
60 to decide on or make configurable).
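
At that kind of granularity the out-of-band variant is trivial to
drive, e.g. (interval and fetch logic only as placeholders):

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Illustrative out-of-band poller: fetch status once per
    // configured interval (1..60 seconds) instead of per request.
    public class StatusPoller {
        void start(long intervalSeconds, Runnable fetchStatus) {
            ScheduledExecutorService exec =
                    Executors.newSingleThreadScheduledExecutor();
            exec.scheduleAtFixedRate(fetchStatus, 0, intervalSeconds,
                    TimeUnit.SECONDS);
        }
    }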

When sending the data back as part of the response: some load numbers
might be too expensive to retrieve, say, 500 times per second. Other
load numbers might not really make sense as a per-request snapshot,
only as an average value (what does CPU load as a snapshot even mean?
Since the code collecting the load data is itself running on the CPU,
a single-CPU system will be 100% busy at that point in time, so CPU
measurement mostly makes sense as an average over relatively short
intervals).
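
(The JVM, for what it's worth, only exposes CPU load in exactly that
averaged form:)

    import java.lang.management.ManagementFactory;

    public class CpuLoadProbe {
        public static void main(String[] args) {
            // System load averaged over the last minute, or -1.0 if
            // the platform does not support it; there is no
            // instantaneous-snapshot variant, for the reason above.
            double load = ManagementFactory.getOperatingSystemMXBean()
                                           .getSystemLoadAverage();
            System.out.println("load average: " + load);
        }
    }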

So we should already expect the backend to send data which is not
necessarily up to date with respect to each request. I would assume
that when data comes with each response, one would use some sort of
floating average.
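
One common choice for such a floating average is an exponentially
weighted moving average; the factor 0.2 below is an arbitrary example:

    // EWMA: each new per-response sample only nudges the stored
    // value, so a single outlier does not dominate the metric.
    public class Ewma {
        private final double alpha; // weight of the newest sample
        private double value;
        private boolean seeded;

        Ewma(double alpha) { this.alpha = alpha; }

        synchronized double update(double sample) {
            value = seeded ? alpha * sample + (1 - alpha) * value
                           : sample;
            seeded = true;
            return value;
        }
    }

    // e.g. new Ewma(0.2).update(latestSessionCount)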

Piggybacking will be easier to implement (no real protocol needed,
etc.), while out-of-band communication will be more flexible.

Regards,

Rainer
