Most applications won't handle duplicate requests gracefully, especially REST-like 
systems where operations such as POST or PUT modify data. Generally, once you 
begin a write operation, you want to let it run until the app server itself 
reports a timeout rather than retrying it somewhere else.
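
In other words (rough Python, purely illustrative; the names here are made up 
for this example and have nothing to do with Varnish internals), the proxy-side 
rule boils down to: only read-only methods are ever candidates for a retry on 
another node:

    # Illustrative only; not real Varnish/VCL code.
    SAFE_TO_RETRY = {"GET", "HEAD"}

    def may_retry(method: str) -> bool:
        # Writes (POST/PUT/DELETE) are never retried elsewhere; they get to
        # run until the app server itself times out.
        return method.upper() in SAFE_TO_RETRY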

For read-only operations you might try a short timeout from the load balancer, 
and upon timeout (no first byte after 2 seconds, for example) mark the node as 
unhealthy and retry on a second node with no timeout. Don't be tempted to try 
more than two, because under high load that only makes matters worse. You also 
need a concept of a minimum pool size, so that when the whole cluster gets slow 
you don't prune it to the point where it's too small to do any useful work, and 
more importantly so you don't empty the cluster entirely. The idea is just to 
temporarily prune a node or two that are slow now but should recover later.
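
To sketch what I mean (rough Python against the standard library, just to 
illustrate the policy; the host names, the 2-second first-byte timeout and the 
pool-floor value are made-up numbers for the example, not anything taken from 
Varnish):

    import http.client

    FIRST_BYTE_TIMEOUT = 2.0   # give up on the first node after 2 seconds
    MIN_POOL_SIZE = 3          # never prune the pool below this many nodes

    healthy = ["app1.example.com", "app2.example.com",
               "app3.example.com", "app4.example.com"]

    def mark_unhealthy(node):
        # Temporarily drop a slow node, but only while the pool stays
        # above its floor; otherwise keep it and let it recover.
        if len(healthy) > MIN_POOL_SIZE:
            healthy.remove(node)

    def fetch(path):
        # Try the first node with a short timeout; on timeout, mark it
        # unhealthy and retry exactly once, with no timeout, on a second
        # node. Never try more than two.
        first = healthy[0]
        second = healthy[1] if len(healthy) > 1 else first
        try:
            conn = http.client.HTTPConnection(first, timeout=FIRST_BYTE_TIMEOUT)
            conn.request("GET", path)
            return conn.getresponse().read()
        except OSError:
            mark_unhealthy(first)
            conn = http.client.HTTPConnection(second)  # no timeout on retry
            conn.request("GET", path)
            return conn.getresponse().read()

The whole point of MIN_POOL_SIZE is that a cluster-wide slowdown can never 
prune the pool down to nothing.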

This approach is better than aggressive polling because it uses fewer resources 
(on the proxy and the app server), and the end user never gets trapped on a 
slow or dead node in the window between the last good health check and the 
failure event.

Adrian

Nils Goroll <sl...@schokola.de> wrote:

>Hi Paul,
>
>> so if you could duplicate an http request to multiple servers and get
>> the first response, you'd have to be pretty unlucky for all of them to
>> be garbage collecting and unresponsive.
>
>This issue can (and should) be solved by properly tuning your Java VMs.
>
>Duplicating workload on the backend to me sounds just like the opposite of
>what you usually want to achieve with an optimization tool like varnish.
>
>Nils
_______________________________________________
varnish-dev mailing list
varnish-dev@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-dev