Re: How to redispatch a request after queue timeout

2013-08-31 Thread Willy Tarreau
Hi Lukas,

On Fri, Aug 30, 2013 at 10:27:34AM +0200, Lukas Tribus wrote:
  Is there a way to redispatch the requests to other servers ignoring the
  persistence on queue timeout?
 
 No; after a queue timeout (configurable by timeout queue), haproxy will
 drop the request.
 
 Think of what would happen when we redispatch on queue timeout:
 
 The request would wait seconds for a free slot on the backend and only
 then be redispatched to another server. This would delay many requests
 and hide the problem from you, while your customers furiously switch
 to a competitor.

It could be even worse if you do so with caches: this would end up mixing
objects among all the nodes whenever one cache takes a long time to respond,
resulting in a lower hit rate, hence a higher overall load that aggravates
the problem, and you could very well end up killing the whole farm in a
domino effect.

Willy




Re: How to redispatch a request after queue timeout

2013-08-31 Thread Willy Tarreau
On Fri, Aug 30, 2013 at 02:10:50PM +0530, Sachin Shetty wrote:
 Thanks Lukas.
 
 Yes, I was hoping to work around this by setting a smaller maxqueue limit
 and a shorter queue timeout.
 
 So what other options do we have? I need to:
 1. Send all requests for a host (mytest.mydomain.com) to one backend as
 long as it can serve them.
 2. If the backend is swamped, it should go to any other backend available.

I'm wondering if we should not try to implement this when the hash type
is set to consistent. The whole principle of consistent hashing is that we
want the closest node, while accepting that things will occasionally be
slightly redistributed (eg: when adding/removing a server in the farm).
So maybe it would make sense to specify that when using consistent hashing,
if a server has a maxqueue parameter and this maxqueue is reached, we then
look for the closest usable server. That might be OK with caches as well,
since servers that are close to each other on the hash ring already tend to
share a few objects when the farm size changes.
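
For illustration, the two parameters this proposal combines already exist
today; only the fallback to the closest server would be new (a sketch with
made-up names and addresses):

    backend caches
        balance hdr(host)
        hash-type consistent    # keeps most host->server mappings stable when the farm changes
        server cache1 10.0.0.1:80 check maxconn 100 maxqueue 10
        server cache2 10.0.0.2:80 check maxconn 100 maxqueue 10
        # proposed behaviour (not implemented): once cache1's maxqueue is
        # reached, pick the next closest server on the hash ring instead of
        # queueing further.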

What do others think?

Willy




Re: How to redispatch a request after queue timeout

2013-08-31 Thread Sachin Shetty
Thanks Willy.

I am precisely using it for caching. I need requests to go to the same
nodes for cache hits, but when the node is already swamped I would prefer
a cache miss over a 503.

Thanks
Sachin

On 8/31/13 12:57 PM, Willy Tarreau w...@1wt.eu wrote:

On Fri, Aug 30, 2013 at 02:10:50PM +0530, Sachin Shetty wrote:
 Thanks Lukas.
 
 Yes, I was hoping to work around this by setting a smaller maxqueue limit
 and a shorter queue timeout.
 
 So what other options do we have? I need to:
 1. Send all requests for a host (mytest.mydomain.com) to one backend as
 long as it can serve them.
 2. If the backend is swamped, it should go to any other backend available.

I'm wondering if we should not try to implement this when the hash type
is set to consistent. The whole principle of consistent hashing is that we
want the closest node, while accepting that things will occasionally be
slightly redistributed (eg: when adding/removing a server in the farm).
So maybe it would make sense to specify that when using consistent hashing,
if a server has a maxqueue parameter and this maxqueue is reached, we then
look for the closest usable server. That might be OK with caches as well,
since servers that are close to each other on the hash ring already tend to
share a few objects when the farm size changes.

What do others think?

Willy






Re: How to redispatch a request after queue timeout

2013-08-31 Thread Sachin Shetty
We did try consistent hashing, but I found better distribution without it.
We don't add or remove servers often, so we should be OK. Our total pool is
sized correctly and we are able to serve 100% of requests when we use
roundrobin; however, stickiness on the Host header is what causes some nodes
to hit maxconn. My goal is to never send a 503 as long as we have other nodes
available, which is always the case in our pool.

Thanks
Sachin

On 8/31/13 1:17 PM, Willy Tarreau w...@1wt.eu wrote:

On Sat, Aug 31, 2013 at 01:03:34PM +0530, Sachin Shetty wrote:
 Thanks Willy.
 
 I am precisely using it for caching. I need requests to go to the same
 nodes for cache hits, but when the node is already swamped I would prefer
 a cache miss over a 503.

Then you should already be using hash-type consistent, otherwise when
you lose or add a server, you redistribute everything and end up with only
about 1/#cache of the objects still mapping to the same place, with all the
rest being misses. Not many cache architectures survive that, really.

Interestingly, a long time ago I wanted to have some outgoing rules (they're
on the diagram in the doc directory). The idea was to be able to apply some
processing *after* the LB algorithm was called. Such processing could include
inspecting the selected server's queue size or any such thing and deciding to
force the use of another server. But in practice it doesn't play well with
the current sequencing, so it was never done. It could have been useful in a
situation like this one, I think.

I'll wait a bit for others to weigh in on the idea of redistributing
connections only for consistent hashing. I really don't want to break
existing setups (even though I think in this case it should be OK).

Willy






Re: How to redispatch a request after queue timeout

2013-08-31 Thread Willy Tarreau
On Sat, Aug 31, 2013 at 01:27:41PM +0530, Sachin Shetty wrote:
 We did try consistent hashing, but I found better distribution without it.

That's known and normal.

 We don't add or remove servers often, so we should be OK.

It depends on what you do with them, in fact, because most places will
not accept the whole farm going down because a single server failure caused
a 100% redistribution. With reverse caches it is generally not a big issue
because the number of objects is very limited and the caches can quickly
refill. But outgoing caches generally take ages to fill up.

 Our total pool is
 sized correctly and we are able to serve 100% of requests when we use
 roundrobin; however, stickiness on the Host header is what causes some nodes
 to hit maxconn. My goal is to never send a 503 as long as we have other nodes
 available, which is always the case in our pool.

OK, so if we make the proposed change, it will not match your usage since
you're not using consistent hashing anyway. So we might have to add another
explicit option such as loose/strict assignment of the server. We could have
3 levels BTW:

   - no-queue : find another server if the destination is full
   - loose    : find another server if the destination has reached maxqueue
   - strict   : never switch to another server
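
Purely to make these levels concrete, a hypothetical per-server syntax could
look like the sketch below; none of these 'assign' keywords exist today:

    # hypothetical syntax, not implemented -- only illustrates the three levels
    server cache1 10.0.0.1:80 maxconn 100 maxqueue 10 assign no-queue  # switch as soon as cache1 is full
    server cache2 10.0.0.2:80 maxconn 100 maxqueue 10 assign loose     # switch once maxqueue is reached
    server cache3 10.0.0.3:80 maxconn 100 maxqueue 10 assign strict    # never switch; queue, then 503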

I would just like to find a way to do something clean for the map-based hash
that you're using, without recomputing a map that excludes the unusable
server(s), while still sticking as much as possible to the same servers to
optimize the hit rate.

Maybe scanning the table for the next usable server will be enough, though
it will not pick the same servers as those that would be used after a change
of the farm size. This could be a limitation that has to be accepted here.

Willy




Re: How to redispatch a request after queue timeout

2013-08-31 Thread Sachin Shetty
Yes, no-queue is what I am looking for. If this change is easier to
accommodate with consistent hashing, I won't mind switching to it once the
fix is available. Right now, with no support for failing over when maxqueue
is reached, consistent hashing is even worse for our use case.

Thanks again Willy.

Thanks
Sachin

On 8/31/13 2:17 PM, Willy Tarreau w...@1wt.eu wrote:

On Sat, Aug 31, 2013 at 01:27:41PM +0530, Sachin Shetty wrote:
 We did try consistent hashing, but I found better distribution without it.

That's known and normal.

 We don't add or remove servers often, so we should be OK.

It depends on what you do with them, in fact, because most places will
not accept the whole farm going down because a single server failure caused
a 100% redistribution. With reverse caches it is generally not a big issue
because the number of objects is very limited and the caches can quickly
refill. But outgoing caches generally take ages to fill up.

 Our total pool is
 sized correctly and we are able to serve 100% of requests when we use
 roundrobin; however, stickiness on the Host header is what causes some nodes
 to hit maxconn. My goal is to never send a 503 as long as we have other nodes
 available, which is always the case in our pool.

OK, so if we make the proposed change, it will not match your usage since
you're not using consistent hashing anyway. So we might have to add another
explicit option such as loose/strict assignment of the server. We could have
3 levels BTW:

   - no-queue : find another server if the destination is full
   - loose    : find another server if the destination has reached maxqueue
   - strict   : never switch to another server

I would just like to find a way to do something clean for the map-based hash
that you're using, without recomputing a map that excludes the unusable
server(s), while still sticking as much as possible to the same servers to
optimize the hit rate.

Maybe scanning the table for the next usable server will be enough, though
it will not pick the same servers as those that would be used after a change
of the farm size. This could be a limitation that has to be accepted here.

Willy






Re: How to redispatch a request after queue timeout

2013-08-31 Thread Willy Tarreau
On Sat, Aug 31, 2013 at 02:25:47PM +0530, Sachin Shetty wrote:
 Yes, no-queue is what I am looking for. If this change is easier to
 accommodate with consistent hashing,

I wouldn't say it's easier, rather that it's less complicated :-/

 I won't mind switching to it once the
 fix is available. Right now, with no support for failing over when maxqueue
 is reached, consistent hashing is even worse for our use case.

OK, I'll think about this. I can't provide any ETA though.

Willy




RE: How to redispatch a request after queue timeout

2013-08-30 Thread Lukas Tribus
Hi Sachin,


 We want to maintain stickiness to a backend server based on the Host header,
 so balance hdr(host) works pretty well for us. However, as soon as the
 backend hits its maxconn, requests pile up in the queue and eventually time
 out with a 503 and sQ in the logs.

I don't think balance hdr(host) is the correct load-balancing method then.



 Is there a way to redispatch the requests to other servers ignoring the
 persistence on queue timeout?

No; after a queue timeout (configurable by timeout queue), haproxy will
drop the request.
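
For reference, a minimal sketch of the kind of setup being discussed (the
backend name, server names, addresses and values below are made up):

    backend app
        balance hdr(host)       # pin each Host header value to one server
        timeout queue 10s       # max time a request may wait in the queue
        server srv1 192.168.0.11:80 check maxconn 100
        server srv2 192.168.0.12:80 check maxconn 100

Once a server is at maxconn, further requests for it wait in the queue; if
they wait longer than 'timeout queue', haproxy gives up and the client gets
a 503 (the sQ termination flags in the logs).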

Think of what would happen when we redispatch on queue timeout:

The request would wait seconds for a free slot on the backend and only
then be redispatched to another server. This would delay many requests
and hide the problem from you, while your customers furiously switch
to a competitor.


You should find another way to better distribute the load across your
backends.



Lukas 


Re: How to redispatch a request after queue timeout

2013-08-30 Thread Sachin Shetty
Thanks Lukas.

Yes, I was hoping to work around this by setting a smaller maxqueue limit
and a shorter queue timeout.

So what other options do we have? I need to:
1. Send all requests for a host (mytest.mydomain.com) to one backend as
long as it can serve them.
2. If the backend is swamped, it should go to any other backend available.

Thanks
Sachin

On 8/30/13 1:57 PM, Lukas Tribus luky...@hotmail.com wrote:

Hi Sachin,


 We want to maintain stickiness to a backend server based on the Host header,
 so balance hdr(host) works pretty well for us. However, as soon as the
 backend hits its maxconn, requests pile up in the queue and eventually time
 out with a 503 and sQ in the logs.

I don't think balance hdr(host) is the correct load-balancing method then.



 Is there a way to redispatch the requests to other servers ignoring the
 persistence on queue timeout?

No; after a queue timeout (configurable by timeout queue), haproxy will
drop the request.

Think of what would happen when we redispatch on queue timeout:

The request would wait seconds for a free slot on the backend and only
then be redispatched to another server. This would delay many requests
and hide the problem from you, while your customers furiously switch
to a competitor.


You should find another way to better distribute the load across your
backends.



Lukas