RE: Connection limiting & Sorry servers
Hi John, Willy,

On Mon, 2009-08-10 at 10:07 -0400, John Lauro wrote:
> Do you have haproxy between your web servers and the 3rd party? If not
> (ie: only to your servers), perhaps that is what you should do. Trying
> to throttle the maximum connections to your web servers sounds pointless
> given that it's not a very good correlation to the traffic to the third
> party servers.

That is correct. I was also thinking about that, but this was later done at the application level and the issue is supposed to be solved. If it turns out it is not, I can still try to put another haproxy between our servers and the 3rd party.

> If you need to rate limit the connections per second, you could always
> do that with iptables on linux, or pf on bsd, etc... but it sounds like
> it's something the third party needs to fix.

I did exactly that, because I also have to protect my servers from users, but it has some limitations. Our users have to click a few times over HTTP and then a few more times over SSL. I only limit traffic to HTTP and have to enable keepalive, so that once a user reaches the site he is no longer redirected to the waiting room. For such a simple solution it works really well, but I don't like having to reconfigure Apache to use keepalive (I actually run two instances on each server for that), and I also have to intervene every time we expect a higher load.

I would like to solve this with haproxy. However... :) There is also a problem with haproxy that I could not figure out: how can I make sure that a user who has already reached the site is not redirected to the waiting room on his next click? I don't need (and also don't want) any persistence, so how can this be done? I read that haproxy doesn't work with keepalive connections, so even the only working solution stops working if I put haproxy in between. Does haproxy offer any solution that doesn't require changing the application, such as redirecting the user to a different IP after the first click?
I also decided that (as you suggested) I will try not to limit the connection rate, only the total number of connections. The problem is that with these two rules:

    acl toomany connslots(main) lt 10
    use_backend sorry if toomany

users don't see the waiting room; they just time out. I also tried with dst_conn and it didn't work either. For comparison, a rate limit (which I don't use now):

    acl toofast be_sess_rate(main) gt 6
    use_backend sorry if toofast

works fine.

I have a few more questions:
- Is it possible to see the value of some ACL variable at a given moment? Maybe put it into the logs or output it in the stats?
- Can you estimate the difference in resource usage between redirection on layer 3/4 and layer 7 (for example, an iptables redirect vs. checking cookies in the HTTP header and then redirecting)?
- Is it possible, or planned for the future, to use some external check/script with which we could decide how to handle traffic (I would like to monitor database load and use it in ACLs)?

Thank you and best regards

Bostjan
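On the question of observing ACL values: one possible sketch (the socket path, port, and stats URI below are assumptions, not from this thread) is to enable HAProxy's stats socket and stats page, which expose the per-backend session counts and rates that ACLs such as connslots(main) and be_sess_rate(main) are computed from:

```haproxy
# Hypothetical sketch: expose the counters behind connslots(main) and
# be_sess_rate(main) so they can be inspected at any moment.
global
    stats socket /var/run/haproxy.sock   # query with "show stat" / "show info"

listen stats
    bind 0.0.0.0:8404
    mode http
    stats enable
    stats uri /stats    # live per-frontend/backend sessions, rates, queue
```

The stats page would then show, per backend and server, the current sessions, session rate and queue length that the ACLs evaluate.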
RE: Connection limiting & Sorry servers
Do you have haproxy between your web servers and the 3rd party? If not (ie: only to your servers), perhaps that is what you should do. Trying to throttle the maximum connections to your web servers sounds pointless given that it's not a very good correlation to the traffic to the third party servers.

If you need to rate limit the connections per second, you could always do that with iptables on linux, or pf on bsd, etc... but it sounds like it's something the third party needs to fix.

> -----Original Message-----
> From: Boštjan Merčun [mailto:bostjan.mer...@dhimahi.com]
> Sent: Monday, August 10, 2009 9:32 AM
> To: Willy Tarreau
> Cc: haproxy@formilux.org
> Subject: Re: Connection limiting & Sorry servers
>
> On Wed, 2009-08-05 at 18:26 +0200, Willy Tarreau wrote:
> > On Wed, Aug 05, 2009 at 05:52:50PM +0200, Boštjan Merčun wrote:
> > > Hi Willy
> > >
> > > On Mon, 2009-08-03 at 09:21 +0200, Willy Tarreau wrote:
> > >
> > > > why are you saying that ? Except for rare cases of huge bugs, a server
> > > > is not limited in requests per second. At full speed, it will simply use
> > > > 100% of the CPU, which is why you bought it after all. When a server dies,
> > > > it's almost always because a limited resource has been exhausted, and most
> > > > often this resource is memory. In some cases, it may be other limits such
> > > > as sockets, file descriptors, etc... which cause some unexpected exceptions
> > > > not to be properly caught.
> > >
> > > We have a problem that our servers open connections to some 3rd party,
> > > and if we get too many users at the same time, they get too many
> > > connections.
> >
> > So you're agreeing that the problem comes from "too many connections".
> > This is exactly what "maxconn" is solving.
>
> The whole story is like this: during the process on our servers, we have
> to open a few connections for every user to some 3rd party, and then the
> process for the user finishes.
> If any of the connections is unsuccessful, so is everything the user did
> before it (unless he tries again and eventually succeeds).
> The 3rd party limits total concurrent connections and connections per
> second.
> The number of connections that users make to the 3rd party depends on
> what users do on our pages. A user can just browse the site for 10
> minutes and open no connections, or he can finish his process in a
> minute and open more than 10 connections during that time.
>
> As you probably see, my problem is the difference between the user that
> comes to check the site and the user that knows exactly what he wants
> on the site.
>
> The factor is at least 20 (probably more), which means that one setting
> is not good for all scenarios: either it will be too high and users will
> flood the 3rd party with too many connections, or few users will be able
> to browse the site and the rest will wait even though the servers will
> be sleeping.
>
> I know that these problems should be solved on different levels like
> application, 3rd party connection limiting etc... but the problem is
> actually more of a political nature, and what I am trying to do is just
> solve the current situation with the tools and options I have. One of
> them is HAProxy and its connection limiting, and with it I would like
> to help myself as much as I can.
>
> I hope that clarified my situation a bit.
>
> I will not be able to test anything for a week, or more likely two, but
> I will continue as soon as possible, and if I come to any useful
> conclusions, I will also notify the list.
>
> Thank you again and best regards
>
> Bostjan
Re: Connection limiting & Sorry servers
On Wed, 2009-08-05 at 18:26 +0200, Willy Tarreau wrote:
> On Wed, Aug 05, 2009 at 05:52:50PM +0200, Boštjan Merčun wrote:
> > Hi Willy
> >
> > On Mon, 2009-08-03 at 09:21 +0200, Willy Tarreau wrote:
> >
> > > why are you saying that ? Except for rare cases of huge bugs, a server
> > > is not limited in requests per second. At full speed, it will simply use
> > > 100% of the CPU, which is why you bought it after all. When a server dies,
> > > it's almost always because a limited resource has been exhausted, and most
> > > often this resource is memory. In some cases, it may be other limits such
> > > as sockets, file descriptors, etc... which cause some unexpected exceptions
> > > not to be properly caught.
> >
> > We have a problem that our servers open connections to some 3rd party,
> > and if we get too many users at the same time, they get too many
> > connections.
>
> So you're agreeing that the problem comes from "too many connections".
> This is exactly what "maxconn" is solving.

The whole story is like this: during the process on our servers, we have to open a few connections for every user to some 3rd party, and then the process for the user finishes. If any of the connections is unsuccessful, so is everything the user did before it (unless he tries again and eventually succeeds).

The 3rd party limits total concurrent connections and connections per second. The number of connections that users make to the 3rd party depends on what users do on our pages. A user can just browse the site for 10 minutes and open no connections, or he can finish his process in a minute and open more than 10 connections during that time.

As you probably see, my problem is the difference between the user that comes to check the site and the user that knows exactly what he wants on the site.
The factor is at least 20 (probably more), which means that one setting is not good for all scenarios: either it will be too high and users will flood the 3rd party with too many connections, or few users will be able to browse the site and the rest will wait even though the servers will be sleeping.

I know that these problems should be solved on different levels like application, 3rd party connection limiting etc... but the problem is actually more of a political nature, and what I am trying to do is just solve the current situation with the tools and options I have. One of them is HAProxy and its connection limiting, and with it I would like to help myself as much as I can.

I hope that clarified my situation a bit.

I will not be able to test anything for a week, or more likely two, but I will continue as soon as possible, and if I come to any useful conclusions, I will also notify the list.

Thank you again and best regards

Bostjan
Re: Connection limiting & Sorry servers
On Wed, Aug 05, 2009 at 05:52:50PM +0200, Boštjan Merčun wrote:
> Hi Willy
>
> On Mon, 2009-08-03 at 09:21 +0200, Willy Tarreau wrote:
>
> > why are you saying that ? Except for rare cases of huge bugs, a server
> > is not limited in requests per second. At full speed, it will simply use
> > 100% of the CPU, which is why you bought it after all. When a server dies,
> > it's almost always because a limited resource has been exhausted, and most
> > often this resource is memory. In some cases, it may be other limits such
> > as sockets, file descriptors, etc... which cause some unexpected exceptions
> > not to be properly caught.
>
> We have a problem that our servers open connections to some 3rd party,
> and if we get too many users at the same time, they get too many
> connections.

So you're agreeing that the problem comes from "too many connections". This is exactly what "maxconn" is solving.

> > I'm well aware of the problem, many sites have the same. The queuing
> > mechanism in haproxy was developed exactly for that. The first user
> > was a gaming site which went from 50 req/s to 1 req/s on patch days.
> > They too thought their servers could not handle that, while it was just
> > a matter of concurrent connections once again. By enabling the queueing
> > mechanism, they could sustain the 1 req/s with only a few hundred
> > concurrent connections.
>
> If that is the case, I will try the same and only limit max connections
> and see what happens.
> If that actually works, I will have a much simpler situation to handle.

I bet so ;-)

Willy
Re: Connection limiting & Sorry servers
Hi Willy

On Mon, 2009-08-03 at 09:21 +0200, Willy Tarreau wrote:
> why are you saying that ? Except for rare cases of huge bugs, a server
> is not limited in requests per second. At full speed, it will simply use
> 100% of the CPU, which is why you bought it after all. When a server dies,
> it's almost always because a limited resource has been exhausted, and most
> often this resource is memory. In some cases, it may be other limits such
> as sockets, file descriptors, etc... which cause some unexpected exceptions
> not to be properly caught.

We have a problem that our servers open connections to some 3rd party, and if we get too many users at the same time, they get too many connections.

> I'm well aware of the problem, many sites have the same. The queuing
> mechanism in haproxy was developed exactly for that. The first user
> was a gaming site which went from 50 req/s to 1 req/s on patch days.
> They too thought their servers could not handle that, while it was just
> a matter of concurrent connections once again. By enabling the queueing
> mechanism, they could sustain the 1 req/s with only a few hundred
> concurrent connections.

If that is the case, I will try the same and only limit max connections and see what happens. If that actually works, I will have a much simpler situation to handle.

Thank you for now, you have been very helpful.

Best regards

Bostjan
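For reference, the "only limit max connections" approach being discussed might look roughly like this (the numbers and the queue timeout are illustrative assumptions, not from the thread): with per-server maxconn and no rate ACLs, excess requests wait in haproxy's queue instead of piling onto the servers:

```haproxy
# Sketch: concurrency limiting with queueing, no request-rate ACLs.
# Each server handles at most 50 concurrent connections; further
# requests queue in haproxy for up to "timeout queue" before failing.
backend main
    timeout queue 30s
    server web1 YYY.YYY.YYY.1 check maxconn 50
    server web2 YYY.YYY.YYY.2 check maxconn 50
```

Under a burst, the servers then never see more than their configured concurrency, and the burst drains through the queue at whatever rate the servers actually sustain.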
Re: Connection limiting & Sorry servers
Hi Bostjan,

On Mon, Aug 03, 2009 at 08:51:09AM +0200, Boštjan Merčun wrote:
> > I really don't know why you're limiting on the number of requests
> > per second. It is not the proper way to do this at all. In fact,
> > you should *only* need to play with the server's maxconn parameter,
> > as there should be no reason your servers would be sensitive to a
> > number of requests per second.
>
> I don't know why limiting the number of requests per second wouldn't be
> a proper way to limit the traffic?

yes it precisely is a way of limiting the traffic.

> During high load, we can get a few thousand requests per second. Servers
> can't handle that.

why are you saying that ? Except for rare cases of huge bugs, a server is not limited in requests per second. At full speed, it will simply use 100% of the CPU, which is why you bought it after all. When a server dies, it's almost always because a limited resource has been exhausted, and most often this resource is memory. In some cases, it may be other limits such as sockets, file descriptors, etc... which cause some unexpected exceptions not to be properly caught.

> If I only set max connections for every server, they won't even get
> there. With for example 50 new connections at the same time per server,
> I think they will die before reaching the limit.

Then set the limit lower. If you're sure they can't handle 50 concurrent connections, that's exactly the reason you must set a maxconn below this value to protect them.

> The thing is we usually have events that start at a certain time and
> users know that. The event is also disabled until that time, so
> sometimes we really get a huge amount of connections at the beginning.

I'm well aware of the problem, many sites have the same. The queuing mechanism in haproxy was developed exactly for that. The first user was a gaming site which went from 50 req/s to 1 req/s on patch days.
They too thought their servers could not handle that, while it was just a matter of concurrent connections once again. By enabling the queueing mechanism, they could sustain the 1 req/s with only a few hundred concurrent connections.

> If there is some mechanism inside HAProxy that would help us survive
> such cases, please let me know.
>
> Besides, the contractor thinks that getting a waiting room is a better
> user experience than waiting for the site to open.

with the queue, you're not particularly waiting for the site to open, you access it normally, it's just that you don't rush on it all at once.

> That is why I wished to let a certain amount of people onto the site
> and show everybody else the waiting room.

The problem with the waiting room is that only the application knows how many people are on the site. You can't guess that from TCP connection counts. And unfortunately, you will randomly accept then reject users inside a same session. And by doing that, sometimes you'll send some users to the waiting room while there are free slots on the servers.

However, with the queue you have the ability to decide to send people to a waiting room when the queue is full and the user has no application cookie, for instance. That way you know it's a new user and you prefer to let him wait for the best moment.

> > > I would like to limit the number of current users on real servers
> > > and the amount of new connections that can be created per some time
> > > unit.
> >
> > This is the point I cannot agree with. I think you need to limit the
> > amount of concurrent connections on your servers. Otherwise, your
> > application is deadly broken, but that's not how you introduced it
> > first :-)
>
> If deadly broken means it does not take care of connection limiting,
> then it is deadly broken.

No, being dependent on connection limiting is normal. Being dependent on connections per second is abnormal.
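The idea above of diverting only cookie-less newcomers to the waiting room when no slots are free might be sketched like this (the cookie name, threshold, and addresses are assumptions for illustration):

```haproxy
frontend my_site1
    bind XXX.XXX.XXX.1:8880
    # No free connection slots left in "main" AND the request carries no
    # application cookie: treat it as a new user and show the sorry page.
    # Users who already have a session cookie queue in "main" instead.
    acl main_full   connslots(main) lt 1
    acl has_session hdr_sub(cookie) SESSIONID=
    use_backend sorry if main_full !has_session
    default_backend main
```

This also answers the earlier question about users bouncing into the waiting room mid-session: the decision keys on the application's own session cookie rather than on connection counts alone.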
> Application does not do any limiting, servers do when they stop
> responding :)

Once again I'm well aware of this problem :-)

> If our servers can handle 200 concurrent connections and I limit all
> sites so that the total doesn't exceed that, I have to limit each site
> to about 5 concurrent connections (about 40 sites at the moment). That
> means that instead of using as many resources as possible, the site
> with the most traffic would be using 5 connections only.

No, if your servers are shared between multiple sites, you can very well use a same backend for all those sites, or even for some groups of sites. Also, limiting on the number of requests per second will never prevent a slow site from saturating your 200 concurrent connections.

> Am I right?
> I am thinking about the fullconn option now, and since we make sure
> that there is only one site having high traffic at a time, I might be
> able to use that?

The fullconn is only used in conjunction with the minconn; it only tells the algorithm what threshold must be considered full load.

Hoping this helps,
Willy
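The minconn/fullconn interaction described above could be sketched as follows (all values illustrative): each server's effective connection limit grows from minconn toward maxconn as the backend's total load approaches fullconn:

```haproxy
backend main
    fullconn 200              # total backend load considered "full"
    # At light load each server accepts around minconn (10) concurrent
    # connections; as total backend sessions approach fullconn, the
    # dynamic per-server limit rises toward maxconn (50).
    server web1 YYY.YYY.YYY.1 check minconn 10 maxconn 50
    server web2 YYY.YYY.YYY.2 check minconn 10 maxconn 50
```

The effect is that a single busy site cannot monopolize the servers at light load, yet capacity is not wasted when the backend genuinely fills up.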
Re: Connection limiting & Sorry servers
Hi Willy.

First, thank you for your answer.

> I really don't know why you're limiting on the number of requests
> per second. It is not the proper way to do this at all. In fact,
> you should *only* need to play with the server's maxconn parameter,
> as there should be no reason your servers would be sensitive to a
> number of requests per second.

I don't know why limiting the number of requests per second wouldn't be a proper way to limit the traffic? During high load, we can get a few thousand requests per second. Servers can't handle that. If I only set max connections for every server, they won't even get there. With for example 50 new connections at the same time per server, I think they will die before reaching the limit.

The thing is we usually have events that start at a certain time and users know that. The event is also disabled until that time, so sometimes we really get a huge amount of connections at the beginning. If there is some mechanism inside HAProxy that would help us survive such cases, please let me know.

Besides, the contractor thinks that getting a waiting room is a better user experience than waiting for the site to open. That is why I wished to let a certain amount of people onto the site and show everybody else the waiting room.

> Well you could use cookies to know if a user has a session on the site
> or not, but it's still not the right solution.
>
> > Can someone with more experience than me advise what would be the
> > best way to handle this?
> >
> > I would like to limit the number of current users on real servers and
> > the amount of new connections that can be created per some time unit.
>
> This is the point I cannot agree with. I think you need to limit the
> amount of concurrent connections on your servers. Otherwise, your
> application is deadly broken, but that's not how you introduced it
> first :-)

If deadly broken means it does not take care of connection limiting, then it is deadly broken.
The application does not do any limiting; the servers do, when they stop responding :) I would like HAProxy to help me with that as much as possible.

> This is precisely where the splitting of the backend helps. Assuming
> that your servers can handle, say, 200 concurrent connections, you
> split your backends either per application or per application group
> (you might want to guarantee a quality of service for a group of
> applications). Then you adjust each server's maxconn so that the
> total does not exceed the real server's maxconn (=MaxClients on
> apache). That way, even if an application takes a lot of resources,
> it will not saturate the server. Also, just enable health-checks
> in one backend and configure the other ones to track the first one,
> it will greatly reduce the number of health-checks sent to your
> servers.

If our servers can handle 200 concurrent connections and I limit all sites so that the total doesn't exceed that, I have to limit each site to about 5 concurrent connections (about 40 sites at the moment). That means that instead of using as many resources as possible, the site with the most traffic would be using 5 connections only. Am I right?

I am thinking about the fullconn option now, and since we make sure that there is only one site having high traffic at a time, I might be able to use that?

Thank you and best regards

Bostjan
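The health-check tracking mentioned in the quote above could look roughly like this (backend names are illustrative assumptions): only one backend actually probes the servers, and the per-site backends reuse its results instead of sending their own checks:

```haproxy
# Sketch: one checked backend; the per-site backends track its servers,
# so each physical server is probed once instead of once per site.
backend checks
    option httpchk OPTIONS * HTTP/1.1\r\nHost:\ www
    server web1 YYY.YYY.YYY.1 check port 8880 inter 10s
    server web2 YYY.YYY.YYY.2 check port 8880 inter 10s

backend site_a
    server web1 YYY.YYY.YYY.1 track checks/web1 maxconn 50
    server web2 YYY.YYY.YYY.2 track checks/web2 maxconn 50

backend site_b
    server web1 YYY.YYY.YYY.1 track checks/web1 maxconn 50
    server web2 YYY.YYY.YYY.2 track checks/web2 maxconn 50
```

With 40 sites sharing 6 servers, this reduces the check traffic by a factor of 40 while every backend still reacts to a server going down.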
Re: Connection limiting & Sorry servers
Hello,

On Fri, Jul 31, 2009 at 09:19:23AM +0200, Boštjan Merčun wrote:
> Dear haproxy list,
>
> This message will be a bit longer; I hope that somebody will read it
> though and give me some opinion.
> I am testing HAProxy in our load-balanced environment, which now works
> with keepalived, and the main reason for trying HAProxy is connection
> limiting and ACLs, which keepalived doesn't have.
> We have more than 40 sites load balanced on 6 web servers, and we have
> a problem that from time to time some of the sites gets a huge amount
> of traffic. Much more than the servers handle.
>
> I hoped I would be able to automate this with HAProxy, but I have some
> problems.
> I would like to have the limits as centralized as possible, which is
> why I created one backend for all sites. In this case I can limit the
> traffic on the backend and have the frontends use whatever they can.
> The problem is that if I create the same rules for all frontends, they
> behave in the same way; when the ACLs are true, all stop working, and
> then all start again. I would like to have all sites working even if
> one uses a lot of resources (for example: one using 80% of all
> resources and all others 20%, or something like that).

Then you really need to use separate backends and assign them a certain amount of connections. The advantage is that if a lot of traffic comes to a backend, queueing will occur on that backend without impacting the other ones.

> My configuration is like that:
>
> frontend my_site1
>     bind XXX.XXX.XXX.1:8880
>     default_backend main
>     acl my_site_toofast be_sess_rate(main) gt 6
>     acl my_site_toomany connslots(main) lt 10
>     acl my_site_slow fe_sess_rate lt 1
>     use_backend sorry if my_site_toomany or my_site_toofast ! my_site_slow

I really don't know why you're limiting on the number of requests per second. It is not the proper way to do this at all.
In fact, you should *only* need to play with the server's maxconn parameter, as there should be no reason your servers would be sensitive to a number of requests per second.

(...)

> backend main
>     option httpchk OPTIONS * HTTP/1.1\r\nHost:\ www
>     server web1 YYY.YYY.YYY.1 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
>     server web2 YYY.YYY.YYY.2 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
>     server web3 YYY.YYY.YYY.3 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
>     server web4 YYY.YYY.YYY.4 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
>     server web5 YYY.YYY.YYY.5 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
>     server web6 YYY.YYY.YYY.6 check port 8880 inter 10s fall 2 rise 3 weight 10 maxconn 50 maxqueue 1
>
> backend sorry
>     option httpchk OPTIONS * HTTP/1.1\r\nHost:\ www
>     server sorry YYY.YYY.YYY.7:1234 check inter 10s fall 5 rise 5
>
> When "be_sess_rate(main) gt 6" gets true, all sites stop working - I
> get the sorry page.

That's expected. Since all your traffic converges there, and it's the only parameter you're monitoring from all frontends, it is normal that above a given rate every frontend is affected.

> I don't want to split the backends for each site, because then I lose
> control over the total traffic; setting them to 1 request per second
> would limit each site too much, and when all of them were working, they
> would flood the backend.

Once again, you don't have to limit in requests/s. You should just limit concurrent requests. The limit on requests/s was introduced for a few reasons :
  - it was easy since the value was already measured for stats
  - it could be used by people experiencing heavy DDoS as a temporary workaround
  - it could be used by people who know their application is broken, to prevent it from reaching a known breaking point

But I don't know anyone using it right now.
The limit on the frontend session rate is useful for hosting companies though, as it allows them to enforce an SLA per hosted application.

> I can not use fe_sess_rate to limit the traffic to the frontend,
> because the way it works, when we get a lot of traffic to one site, no
> user will actually come through. Or is there a way to redirect most
> users to the sorry servers but still let some users onto the site even
> when the connection rate to the frontend is higher than the configured
> fe_sess_rate?

Well you could use cookies to know if a user has a session on the site or not, but it's still not the right solution.

> Can someone with more experience than me advise what would be the best
> way to handle this?
>
> I would like to limit the number of current users on real servers and
> the amount of new connections that can be created per some time unit.

This is the point I cannot agree with. I think you need to limit the amount of concurrent connections on your servers. Otherwise, your application is deadly broken, but that's not how you introduced it first :-)