RE: Connection limiting & Sorry servers

2009-08-27 Thread Boštjan Merčun
Hi John, Willy,

On Mon, 2009-08-10 at 10:07 -0400, John Lauro wrote:
> Do you have haproxy between your web servers and the 3rd party?  If not
> (i.e., only to your servers), perhaps that is what you should do. Trying to
> throttle the maximum connections to your web servers sounds pointless, given
> that it correlates poorly with the traffic to the third party servers.

That is correct. I was also thinking about that, but this was later done
at the application level and the issue is supposed to be solved. If it
turns out not to be, I can still try putting another haproxy between our
servers and the 3rd party.

> If you need to rate limit the connections per second, you could always
> do that with iptables on Linux, or pf on BSD, etc., but it sounds like
> it's something the third party needs to fix.

I did exactly that, because I also have to protect my servers from the
users, but it has some limitations. Our users have to click a few times
over HTTP and then a few more times over SSL. I only limit traffic to
HTTP and have to enable keepalive, so that once a user comes to the site
he is no longer redirected to the waiting room. For such a simple
solution it works really well, but I don't like having to reconfigure
Apache to use keepalive (I actually run two instances on each server for
that), and I also have to intervene every time we expect higher load. I
would like to solve this with haproxy. However... :)

There is also a problem with haproxy that I could not figure out.
How can I make sure that a user who has already come to the site is not
redirected to the waiting room on his next click? I don't need (and also
don't want) any persistence, so how can this be done? I read that haproxy
doesn't work with keepalive connections, so even the only working
solution stops working if I put haproxy in between. Does haproxy have any
solution that doesn't require changing the application, for example
redirecting the user to a different IP after the first click?

I also decided (as you suggested) to try not limiting the connection
rate, only the total number of connections. The problem is that with
these two rules:

acl toomany connslots(main) lt 10
use_backend sorry if toomany

users don't see the waiting room; they just time out. I also tried with
dst_conn and it didn't work either.

For example, the rate limit (which I don't use now):

acl toofast be_sess_rate(main) gt 6
use_backend sorry if toofast

works fine.

I have a few more questions:
-Is it possible to see the value of some ACL criterion at a given moment?
Maybe put it into the logs or output it in the stats? (See the stats
sketch below.)
-Can you estimate the difference in resource usage between redirection at
layer 3/4 and at layer 7 (for example, an iptables redirect vs. checking
cookies in the HTTP header and then redirecting)?
-Is it possible, or planned for the future, to use some external
check/script with which we could decide how to handle traffic (I would
like to monitor database load and use it in ACLs)?
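
For the first question, the built-in stats page exposes the per-backend
session counts, rates, queue lengths and connection slots that these ACLs
read; a minimal sketch (address, port and refresh interval are
placeholders):

listen stats XXX.XXX.XXX.1:8090
    mode http
    stats enable
    stats uri /
    stats refresh 5s

The same counters should also be available in CSV form over a UNIX stats
socket ("stats socket" in the global section, queried with "show stat").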

Thank you and best regards

Bostjan




RE: Connection limiting & Sorry servers

2009-08-10 Thread John Lauro
Do you have haproxy between your web servers and the 3rd party?  If not
(i.e., only to your servers), perhaps that is what you should do. Trying to
throttle the maximum connections to your web servers sounds pointless, given
that it correlates poorly with the traffic to the third party servers.

If you need to rate limit the connections per second, you could always do
that with iptables on Linux, or pf on BSD, etc., but it sounds like it's
something the third party needs to fix.


> -----Original Message-----
> From: Boštjan Merčun [mailto:bostjan.mer...@dhimahi.com]
> Sent: Monday, August 10, 2009 9:32 AM
> To: Willy Tarreau
> Cc: haproxy@formilux.org
> Subject: Re: Connection limiting & Sorry servers
> 
> On Wed, 2009-08-05 at 18:26 +0200, Willy Tarreau wrote:
> > On Wed, Aug 05, 2009 at 05:52:50PM +0200, Boštjan Merčun wrote:
> > > Hi Willy
> > >
> > > On Mon, 2009-08-03 at 09:21 +0200, Willy Tarreau wrote:
> > >
> > > > why are you saying that ? Except for rare cases of huge bugs, a server
> > > > is not limited in requests per second. At full speed, it will simply use
> > > > 100% of the CPU, which is why you bought it after all. When a server dies,
> > > > it's almost always because a limited resource has been exhausted, and most
> > > > often this resource is memory. In some cases, it may be other limits such
> > > > as sockets, file descriptors, etc... which cause some unexpected exceptions
> > > > not to be properly caught.
> > >
> > > We have a problem that our servers open connections to some 3rd party,
> > > and if we get too many users at the same time, they get too many
> > > connections.
> >
> > So you're agreeing that the problem comes from "too many connections".
> > This is exactly what "maxconn" is solving.
> 
> The whole story is like this: during the process on our servers, we have
> to open a few connections to some 3rd party for every user, and then the
> process for that user finishes.
> If any of the connections is unsuccessful, so is everything the user did
> before that (unless he tries again and eventually succeeds).
> The 3rd party limits total concurrent connections and connections per
> second.
> The number of connections that users make to the 3rd party depends on
> what users do on our pages. A user can just browse the site for 10
> minutes and open no connections, or he can finish his process in a minute
> and open more than 10 connections during that time.
> 
> As you probably see, my problem is the difference between the user who
> just comes to check the site and the user who knows exactly what he wants
> on the site.
> 
> The factor is at least 20 (probably more), which means that one setting
> is not good for all scenarios: either it will be too high and users will
> flood the 3rd party with too many connections, or only a few users will
> be able to browse the site and the rest will wait even though the servers
> will be idle.
> 
> I know that these problems should be solved at different levels
> (application, 3rd party connection limiting, etc.), but the problem is
> actually more of a political nature, and what I am trying to do is just
> solve the current situation with the tools and options I have. One of
> them is HAProxy and its connection limiting, and with it I would like to
> help myself as much as I can.
> 
> I hope that clarified my situation a bit.
> 
> I will not be able to test anything for a week, or more likely two, but
> I will continue as soon as possible, and if I come to any useful
> conclusions I will also notify the list.
> 
> Thank you again and best regards
> 
> 
> Bostjan
> 
> 




Re: Connection limiting & Sorry servers

2009-08-10 Thread Boštjan Merčun
On Wed, 2009-08-05 at 18:26 +0200, Willy Tarreau wrote:
> On Wed, Aug 05, 2009 at 05:52:50PM +0200, Boštjan Merčun wrote:
> > Hi Willy
> > 
> > On Mon, 2009-08-03 at 09:21 +0200, Willy Tarreau wrote:
> > 
> > > why are you saying that ? Except for rare cases of huge bugs, a server
> > > is not limited in requests per second. At full speed, it will simply use
> > > 100% of the CPU, which is why you bought it after all. When a server dies,
> > > it's almost always because a limited resource has been exhausted, and most
> > > often this resource is memory. In some cases, it may be other limits such
> > > as sockets, file descriptors, etc... which cause some unexpected 
> > > exceptions
> > > not to be properly caught.
> > 
> > We have a problem that our servers open connections to some 3rd party,
> > and if we get too many users at the same time, they get too many
> > connections.
> 
> So you're agreeing that the problem comes from "too many connections". This
> is exactly what "maxconn" is solving.

The whole story is like this: during the process on our servers, we have
to open a few connections to some 3rd party for every user, and then the
process for that user finishes.
If any of the connections is unsuccessful, so is everything the user did
before that (unless he tries again and eventually succeeds).
The 3rd party limits total concurrent connections and connections per
second.
The number of connections that users make to the 3rd party depends on
what users do on our pages. A user can just browse the site for 10
minutes and open no connections, or he can finish his process in a minute
and open more than 10 connections during that time.

As you probably see, my problem is the difference between the user who
just comes to check the site and the user who knows exactly what he wants
on the site.

The factor is at least 20 (probably more), which means that one setting
is not good for all scenarios: either it will be too high and users will
flood the 3rd party with too many connections, or only a few users will
be able to browse the site and the rest will wait even though the servers
will be idle.

I know that these problems should be solved at different levels
(application, 3rd party connection limiting, etc.), but the problem is
actually more of a political nature, and what I am trying to do is just
solve the current situation with the tools and options I have. One of
them is HAProxy and its connection limiting, and with it I would like to
help myself as much as I can.

I hope that clarified my situation a bit.

I will not be able to test anything for a week or more likely two, but I
will continue as soon as possible and if I come to any useful
conclusions, I will also notify the list.

Thank you again and best regards


Bostjan




Re: Connection limiting & Sorry servers

2009-08-05 Thread Willy Tarreau
On Wed, Aug 05, 2009 at 05:52:50PM +0200, Boštjan Merčun wrote:
> Hi Willy
> 
> On Mon, 2009-08-03 at 09:21 +0200, Willy Tarreau wrote:
> 
> > why are you saying that ? Except for rare cases of huge bugs, a server
> > is not limited in requests per second. At full speed, it will simply use
> > 100% of the CPU, which is why you bought it after all. When a server dies,
> > it's almost always because a limited resource has been exhausted, and most
> > often this resource is memory. In some cases, it may be other limits such
> > as sockets, file descriptors, etc... which cause some unexpected exceptions
> > not to be properly caught.
> 
> We have a problem that our servers open connections to some 3rd party,
> and if we get too many users at the same time, they get too many
> connections.

So you're agreeing that the problem comes from "too many connections". This
is exactly what "maxconn" is solving.

> > I'm well aware of the problem, many sites have the same. The queuing
> > mechanism in haproxy was developed exactly for that. The first user
> > was a gaming site which went from 50 req/s to 1 req/s on patch days.
> > They too thought their servers could not handle that, while it was just
> > a matter of concurrent connections once again. By enabling the queueing
> > mechanism, they could sustain the 1 req/s with only a few hundred
> > concurrent connections.
> 
> If that is the case, I will try the same and only limit max connections
> and see what will happen.
> If that actually works, I will have a much simpler situation to handle.

I bet so ;-)

Willy




Re: Connection limiting & Sorry servers

2009-08-05 Thread Boštjan Merčun
Hi Willy

On Mon, 2009-08-03 at 09:21 +0200, Willy Tarreau wrote:

> why are you saying that ? Except for rare cases of huge bugs, a server
> is not limited in requests per second. At full speed, it will simply use
> 100% of the CPU, which is why you bought it after all. When a server dies,
> it's almost always because a limited resource has been exhausted, and most
> often this resource is memory. In some cases, it may be other limits such
> as sockets, file descriptors, etc... which cause some unexpected exceptions
> not to be properly caught.

We have a problem that our servers open connections to some 3rd party,
and if we get too many users at the same time, they get too many
connections.

> I'm well aware of the problem, many sites have the same. The queuing
> mechanism in haproxy was developed exactly for that. The first user
> was a gaming site which went from 50 req/s to 1 req/s on patch days.
> They too thought their servers could not handle that, while it was just
> a matter of concurrent connections once again. By enabling the queueing
> mechanism, they could sustain the 1 req/s with only a few hundred
> concurrent connections.

If that is the case, I will try the same and only limit max connections
and see what will happen.
If that actually works, I will have a much simpler situation to handle.

Thank you for now, you have been very helpful.

Best regards

Bostjan





Re: Connection limiting & Sorry servers

2009-08-03 Thread Willy Tarreau
Hi Bostjan,

On Mon, Aug 03, 2009 at 08:51:09AM +0200, Boštjan Merčun wrote:
> > I really don't know why you're limiting on the number of requests
> > per second. It is not the proper way to do this at all. In fact,
> > you should *only* need to play with the server's maxconn parameter,
> > as there should be no reason your servers would be sensitive to a
> > number of requests per second.
> 
> I don't know why limiting the number of requests per second wouldn't be
> a proper way to limit the traffic?

yes it precisely is a way of limiting the traffic.

> During high load, we can get a few thousand requests per second. The
> servers can't handle that.

why are you saying that ? Except for rare cases of huge bugs, a server
is not limited in requests per second. At full speed, it will simply use
100% of the CPU, which is why you bought it after all. When a server dies,
it's almost always because a limited resource has been exhausted, and most
often this resource is memory. In some cases, it may be other limits such
as sockets, file descriptors, etc... which cause some unexpected exceptions
not to be properly caught.

> If I only set max connections for every server, they won't even get
> there. With, for example, 50 new connections at the same time per
> server, I think they will die before reaching the limit.

Then set the limit lower. If you're sure they can't handle 50 concurrent
connections, it's exactly the reason you must set a maxconn below this
value to protect them.

> The thing is, we usually have events that start at a certain time and
> users know that. The event is also disabled until that time, so
> sometimes we really get a huge amount of connections at the beginning.

I'm well aware of the problem, many sites have the same. The queuing
mechanism in haproxy was developed exactly for that. The first user
was a gaming site which went from 50 req/s to 1 req/s on patch days.
They too thought their servers could not handle that, while it was just
a matter of concurrent connections once again. By enabling the queueing
mechanism, they could sustain the 1 req/s with only a few hundred
concurrent connections.
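
A minimal sketch of such a queueing setup (placeholder addresses and
numbers, 1.3-era syntax): a low per-server maxconn and no tight maxqueue,
so that excess requests wait in the backend queue instead of being
rejected:

backend main
    balance roundrobin
    timeout queue 30s    # how long a request may wait for a free server slot
    server web1 YYY.YYY.YYY.1 check maxconn 50
    server web2 YYY.YYY.YYY.2 check maxconn 50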

> If there is some mechanism inside HAProxy that would help us survive
> such cases, please let me know.
>
> Besides, the contractor thinks that getting a waiting room is a better
> user experience than waiting for the site to open.

with the queue, you're not particularly waiting for the site to open,
you access it normally, it's just that you don't rush on it all at once.

> That is why I wished to let a certain amount of people onto the site
> and show everybody else the waiting room.

The problem with the waiting room is that only the application knows how
many people are on the site. You can't guess that from TCP connection counts.
And unfortunately, you will randomly accept then reject users within the same
session. And by doing that, sometimes you'll send some users to the waiting
room while there are free slots on the servers.

However, with the queue you have the ability to send people to a waiting
room when the queue is full and the user has no application cookie, for
instance. That way you know it's a new user and you prefer to make him
wait until a better moment.
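
As an illustration only (the cookie name is a placeholder for whatever the
application really sets):

# in the frontend:
acl main_full connslots(main) lt 1         # no connection or queue slot left
acl has_sess  hdr_sub(Cookie) SESSIONID=   # user already has an application session
use_backend sorry if main_full !has_sess
default_backend main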

> > > I would like to limit the number of current users on the real servers
> > > and the amount of new connections that can be created per some time unit.
> > 
> > This is the point I cannot agree with. I think you need to limit the
> > amount of concurrent connections on your servers. Otherwise, your
> > application is deadly broken, but that's not how you introduced it
> > first :-)
> 
> If deadly broken means it does not take care of connection limiting,
> then it is deadly broken.

No, being dependent on connection limiting is normal. Being dependent
on connections per second is abnormal.

> The application does not do any limiting; the servers do, when they stop
> responding :)

Once again I'm well aware of this problem :-)

> If our servers can handle 200 concurrent connections and I limit all
> sites so that the total doesn't exceed that, I have to limit each site
> to about 5 concurrent connections (about 40 sites at the moment). That
> means that instead of using as many resources as possible, the site with
> the most traffic would be using only 5 connections.

No, if your servers are shared between multiple sites, you can very well
use the same backend for all those sites, or even one per group of sites.
Also, limiting the number of requests per second will never prevent a
slow site from saturating your 200 concurrent connections.
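
For example (a sketch, with placeholder group names and numbers), two
groups of sites can share the same physical server, each with its own
connection budget, and only the first backend needs to perform the health
checks:

backend group_a
    server web1 YYY.YYY.YYY.1 check port 8880 inter 10s maxconn 120

backend group_b
    server web1 YYY.YYY.YYY.1 track group_a/web1 maxconn 80

so that 120 + 80 stays below the 200 concurrent connections web1 can handle.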

> Am I right?
> I am thinking about the fullconn option now, and since we make sure that
> there is only one site having high traffic at a time, I might be able to
> use that?

The fullconn is only used in conjunction with the minconn; it only tells
the algorithm what threshold must be considered full load.
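
A sketch with placeholder numbers; each server's effective limit grows
from minconn toward maxconn as the backend's total load approaches
fullconn:

backend main
    fullconn 200    # backend considered fully loaded at 200 concurrent connections
    server web1 YYY.YYY.YYY.1 check minconn 10 maxconn 50
    # lightly loaded: web1 accepts about 10 concurrent connections;
    # as the backend nears 200, the limit rises dynamically up to 50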

Hoping this helps,
Willy




Re: Connection limiting & Sorry servers

2009-08-02 Thread Boštjan Merčun
Hi Willy.

First, thank you for your answer.

> I really don't know why you're limiting on the number of requests
> per second. It is not the proper way to do this at all. In fact,
> you should *only* need to play with the server's maxconn parameter,
> as there should be no reason your servers would be sensitive to a
> number of requests per second.

I don't know why limiting the number of requests per second wouldn't be
a proper way to limit the traffic?
During high load, we can get a few thousand requests per second. The
servers can't handle that. If I only set max connections for every
server, they won't even get there. With, for example, 50 new connections
at the same time per server, I think they will die before reaching the
limit.
The thing is, we usually have events that start at a certain time and
users know that. The event is also disabled until that time, so sometimes
we really get a huge amount of connections at the beginning.

If there is some mechanism inside HAProxy that would help us survive
such cases, please let me know.

Besides, the contractor thinks that getting a waiting room is a better
user experience than waiting for the site to open.

That is why I wished to let a certain amount of people onto the site and
show everybody else the waiting room.


> Well, you could use cookies to know whether a user has a session on the
> site or not, but it's still not the right solution.
> 
> > Can someone with more experience than me advise what would be the
> > best way to handle this?
> > 
> > I would like to limit the number of current users on the real servers
> > and the amount of new connections that can be created per some time
> > unit.
> 
> This is the point I cannot agree with. I think you need to limit the
> amount of concurrent connections on your servers. Otherwise, your
> application is deadly broken, but that's not how you introduced it
> first :-)

If deadly broken means it does not take care of connection limiting,
then it is deadly broken.
The application does not do any limiting; the servers do, when they stop
responding :)

I would like HAProxy to help me with that as much as possible.

> This is precisely where the splitting of the backend helps. Assuming
> that your servers can handle, say, 200 concurrent connections, you
> split your backends either per application or per application group
> (you might want to guarantee a quality of service for a group of
> applications). Then you adjust each server's maxconn so that the
> total does not exceed the real server's maxconn (=MaxClients on
> apache). That way, even if an application takes a lot of resources,
> it will not saturate the server. Also, just enable health-checks
> in one backend and configure the other ones to track the first one,
> it will greatly reduce the number of health-checks sent to your
> servers.

If our servers can handle 200 concurrent connections and I limit all
sites so that the total doesn't exceed that, I have to limit each site
to about 5 concurrent connections (about 40 sites at the moment). That
means that instead of using as many resources as possible, the site with
the most traffic would be using only 5 connections.

Am I right?
I am thinking about the fullconn option now, and since we make sure that
there is only one site having high traffic at a time, I might be able to
use that?

Thank you and best regards

Bostjan




Re: Connection limiting & Sorry servers

2009-07-31 Thread Willy Tarreau
Hello,

On Fri, Jul 31, 2009 at 09:19:23AM +0200, Boštjan Merčun wrote:
> Dear haproxy list,
> 
> This message will be a bit long; I hope that somebody will read it
> anyway and give me an opinion.
> I am testing HAProxy in our load-balanced environment, which currently
> runs keepalived; the main reasons for trying HAProxy are connection
> limiting and ACLs, which keepalived doesn't have.
> We have more than 40 sites load balanced across 6 web servers, and we
> have a problem that from time to time one of the sites gets a huge
> amount of traffic. Much more than the servers can handle.
> 
> I hoped I would be able to automate this with HAProxy, but I have some
> problems.
> I would like to have the limits as centralized as possible, which is why
> I created one backend for all sites. That way I can limit the traffic on
> the backend and have the frontends use whatever they can.
> The problem is that if I create the same rules for all frontends, they
> behave in the same way: when the ACLs are true, all of them stop working
> and then all start again. I would like to have all sites keep working
> even if one uses a lot of resources (for example: one using 80% of all
> resources and all the others 20%, or something like that).

Then you really need to use separate backends and assign each of them a
certain amount of connections. The advantage is that if a lot of traffic
comes to one backend, queueing will occur on that backend without
impacting the other ones.

> My configuration is like this:
> 
> frontend my_site1
> bind XXX.XXX.XXX.1:8880
> default_backend main
> acl my_site_toofast be_sess_rate(main) gt 6
> acl my_site_toomany connslots(main) lt 10
> acl my_site_slow fe_sess_rate lt 1
> use_backend sorry if my_site_toomany or my_site_toofast ! my_site_slow

I really don't know why you're limiting on the number of requests
per second. It is not the proper way to do this at all. In fact,
you should *only* need to play with the server's maxconn parameter,
as there should be no reason your servers would be sensitive to a
number of requests per second.

(...)
> backend main
> option httpchk OPTIONS * HTTP/1.1\r\nHost:\ www
> server web1 YYY.YYY.YYY.1 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
> server web2 YYY.YYY.YYY.2 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
> server web3 YYY.YYY.YYY.3 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
> server web4 YYY.YYY.YYY.4 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
> server web5 YYY.YYY.YYY.5 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
> server web6 YYY.YYY.YYY.6 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
> 
> backend sorry
> option httpchk OPTIONS * HTTP/1.1\r\nHost:\ www
> server sorry YYY.YYY.YYY.7:1234 check inter 10s fall 5 rise 5
> 
> When "be_sess_rate(main) gt 6" gets true, all sites stop working - I get
> the sorry page.

That's expected. Since all your traffic converges there, and it's the only
parameter you're monitoring from all frontends, it is normal that above a
given rate every frontend is affected.

> I don't want to split the backends per site because then I lose control
> over the total traffic; setting them to 1 request per second would limit
> each site too much, and when all of them were working they would flood
> the backend.

Once again, you don't have to limit in requests/s. You should just limit
the number of concurrent requests. The limit on requests/s was introduced
for a few reasons:
  - it was easy since the value was already measured for stats
  - it could be used by people experiencing heavy DDoS as a
temporary workaround
  - it could be used by people who know their application is broken
to prevent it from reaching a known breaking point

But I don't know anyone using it right now. The limit on the frontend
session rate is useful for hosting companies though, as it allows them
to enforce an SLA per hosted application.

> I cannot use fe_sess_rate to limit the traffic to the frontend because,
> the way it works, when we get a lot of traffic to one site, no user will
> actually come through it. Or is there a way to redirect most users to
> the sorry servers but still let some users onto the site even when the
> connection rate to the frontend is higher than the configured
> fe_sess_rate?

Well, you could use cookies to know whether a user has a session on the
site or not, but it's still not the right solution.

> Can someone with more experience than me advise what would be the best
> way to handle this?
> 
> I would like to limit the number of current users on the real servers
> and the amount of new connections that can be created per some time unit.

This is the point I cannot agree with. I think you need to limit the
amount of concurrent connections on your servers. Otherwise, your
application is deadly broken, but that's not how you introduced it first :-)