Willy, thanks for your answer.

On Sat, May 12, 2012 at 7:21 PM, Willy Tarreau <[email protected]> wrote:

> On Sat, May 12, 2012 at 07:01:19PM +0300, Bar Ziony wrote:
> > Hey,
> >
> > I have a dynamic backend with maxconn 80 with multiple servers.
> > Many times I can see on the haproxy stats page that servers on this
> > backend are reaching their maximum 80, but I don't see the number of
> > requests currently in queue. The maximum number I ever see is 80.
> > Why is that? Can I somehow see the number of requests in the queue?
>
> The queue is split between servers and backend. In the servers' queue,
> you only see the requests which absolutely need to be served by the
> given server (due to persistence cookie or stick-tables). Otherwise the
> request lies in the backend's queue so that it will be served by the
> first available server. It's very normal not to have too many requests
> in the server's queue and have more in the backend's queue.

I now see the "Queue" part in the backend line, and indeed I can see the
numbers rising under load! Thanks :)
I have no persistence, and my backend servers are totally agnostic to
which user they're serving (user sessions are stored in a centralized
memcached).

> > Also, with a munin plugin that checks the HTTP page with ";csv", I
> > see that sometimes the graph shows 400+ req/sec for this backend,
> > which is not possible since the maximum is 80...
> > Last, what is the difference between "Sessions" and "Session rate" ?
>
> You seem to be really confusing concurrency and rate I'm afraid.
> Imagine a highway, it's the same. Session rate is the number of cars
> you see pass an observation point each second. Session concurrency is
> the number of parallel lanes that are occupied at a given instant. If
> the traffic slows down, you need more lanes to drain the same number of
> cars without slowing the rate down. If your cars drive faster, you need
> fewer lanes for the same car rate.

So session rate is the number of requests per second? Why is it called
"session" then, if it's really requests?
And "Sessions" is just the plain number of sessions, regardless of how
many of them happened within one second?

> Regards,
> Willy
>
> > How can I tell when I need another dynamic backend server?
>
> It's simple : observe the total queue size in a backend (backend + sum
> of servers). Divide the number by the maxconn and it will tell you the
> number of servers that would allow the requests to be processed without
> queuing. Note that it's fine to have a bit of queueing, it saves you
> from buying more hardware at the expense of a slightly delayed
> processing. You just need to ensure the queue is not too deep. The
> average time spent in the queue is the average queue size divided by
> the maxconn and multiplied by the average response time.
>
> So in order to get an idea :
>
>   srv1 has maxconn 80 and queue around 10
>   srv2 has maxconn 80 and queue around 10
>   backend has a queue around 100
>
> The total queue is 120, which is the equivalent of 1.5 server. Let's
> say you add a single server, you'll then have around 80 requests spread
> over the last server, and 40 requests still in the queues. If your
> servers exhibit an average response time of 50 ms, the average time
> spent in the queue will be 40/80*50 ms = 25ms, so the total response
> time will increase from 50ms to 75ms due to the queue. For many sites
> this will not be noticeable and probably acceptable. Now if your site
> is already slow (eg: 2 seconds response time), adding 50% more will
> give you 3 seconds and your users will clearly notice the difference.

How can I find out the average response time of my servers? Does haproxy
provide that data somewhere?
I have a maximum of 800 requests in the backend queue (none in the
servers' queues, since there is no persistence). Is that a lot? :|

I also see 3,400 sessions in the frontend, but only ~100 in the dynamic
backend and 15 in the static backend (in the "cur" column). How is that
possible? Are many of the requests not valid, or are sessions kept open
and used for more than one request? :\ I don't understand that.

Thanks,
Bar.

> That's why you first need to maintain the response times as low as
> possible by limiting the maxconn, and only then estimate the number of
> servers needed to keep the response time low.
>
> Hoping this helps,
> Willy
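
For anyone who wants to play with the numbers, the rule of thumb quoted above (total queue divided by maxconn, and queue delay = queue / maxconn * response time) can be sketched in a few lines of Python. This is only an illustration of the arithmetic in Willy's example: the function names are made up for this sketch and nothing here is part of haproxy itself.

```python
# Rough sketch of the queue arithmetic from the quoted example.
# All function names are hypothetical helpers, not haproxy APIs.

def total_queue(backend_queue, server_queues):
    """Total queue for a backend: backend queue plus the sum of server queues."""
    return backend_queue + sum(server_queues)


def servers_equivalent(queue_size, maxconn):
    """Number of extra servers (possibly fractional) the queue corresponds to."""
    return queue_size / maxconn


def avg_queue_delay_ms(avg_queue_size, maxconn, avg_response_ms):
    """Average time spent in the queue: queue / maxconn * response time."""
    return avg_queue_size / maxconn * avg_response_ms


if __name__ == "__main__":
    maxconn = 80
    queued = total_queue(backend_queue=100, server_queues=[10, 10])
    print("total queue:", queued)
    print("equivalent servers:", servers_equivalent(queued, maxconn))

    # Adding one server absorbs about maxconn requests from the queue.
    remaining = max(queued - maxconn, 0)
    delay = avg_queue_delay_ms(remaining, maxconn, avg_response_ms=50)
    print("remaining queue:", remaining)
    print("queue delay (ms):", delay)
    print("total response time (ms):", 50 + delay)
```

With the figures from the example (queues of 10, 10 and 100, maxconn 80, 50 ms response time) this reproduces the 120, 1.5, 40 and 25 ms values. Plugging in a backend queue of 800 with maxconn 80 gives the equivalent of 10 servers by the same formula, assuming the 800 is a sustained figure rather than a short peak.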

