Re: Backend connections leak

2019-10-01 Thread Marco Colli
>
> With "forever" you mean longer than 1m ?


Yes, unfortunately forever means forever, not just 1 minute. I have already
tried to wait several minutes (e.g. more than 10 min) and the number of
backend connections reported by Datadog remains the same (e.g. ~200). When
I restart HAProxy then the number of connections drops to the normal value
(0-2 connections).
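
For reference, the same lingering sessions can also be seen directly on the
stats socket from my config, independent of Datadog (a rough sketch; the
exact "show sess" output format may differ slightly between versions):

  # count sessions currently attached to the www-backend servers
  echo "show sess" | socat stdio /run/haproxy/admin.sock | grep -c "be=www-backend"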


> retry-on all-retryable-errors


I can't do that. I prefer to abort failed requests and return an error to
the client, instead of retrying them, for the reason given in the docs:

You have to make sure the application has a replay protection mechanism
built in, such as unique transaction IDs passed in requests, or that
replaying the same request has no consequence; otherwise it is very
dangerous to use any retry-on value beside "conn-failure" and "none".
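
So the most I would be comfortable with is limiting retries to connection
failures only, which as far as I understand is already the default behavior
(just a sketch, not tested):

  defaults
    retries 3
    option redispatch
    retry-on conn-failure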


On Tue, Oct 1, 2019 at 12:02 PM Aleksandar Lazic  wrote:

> Am 01.10.19 um 11:18 schrieb Marco Colli:
> > Here's my configuration:
> >
> > $ haproxy -vv
> > HA-Proxy version 2.0.7-1ppa1~bionic 2019/09/28 - https://haproxy.org/
>
> [snipp]
>
> > $ cat /etc/haproxy/haproxy.cfg
> > global
> > log /dev/log local0
> > log /dev/log local1 notice
> > chroot /var/lib/haproxy
> > stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd
> listeners
> > stats timeout 30s
> > user haproxy
> > group haproxy
> > daemon
> >
> > maxconn 16384
> >
> > nbproc 1
> > nbthread 4
> > cpu-map auto:1/1-4 0-3
> >
> > # Default SSL material locations
> > ca-base /etc/ssl/certs
> > crt-base /etc/ssl/private
> >
> > # See:
> >
> https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
> > ssl-default-bind-ciphers ...
> > ssl-default-bind-ciphersuites ...
> > ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11
> no-tls-tickets
> > tune.ssl.default-dh-param 2048
> >
> > defaults
> > log global
> > mode http
> > option  httpchk HEAD /health HTTP/1.1\r\nHost:\ example.com
> > \r\nX-Forwarded-Proto:\ https
> > option  httplog
> > option  dontlognull
> > option  dontlog-normal
> > option  forwardfor
> > option  http-server-close
> > option  redispatch
> > timeout client 10s
> > timeout client-fin 5s
> > timeout http-request 5s
> > timeout server 30s
> > timeout server-fin 10s
> > timeout connect 10s
> > timeout queue 10s
> > errorfile 400 /etc/haproxy/errors/400.http
> > errorfile 403 /etc/haproxy/errors/403.http
> > errorfile 408 /etc/haproxy/errors/408.http
> > errorfile 500 /etc/haproxy/errors/500.http
> > errorfile 502 /etc/haproxy/errors/502.http
> > errorfile 503 /etc/haproxy/errors/503.http
> > errorfile 504 /etc/haproxy/errors/504.http
> >
> > listen stats
> > bind :8000
> > bind-process 1
> > mode http
> > stats enable
> > stats hide-version
> > stats realm HAProxy\ Stats
> > stats uri /
> > stats auth theuser:thepassword
> >
> > frontend www-frontend
> > bind :::80 v4v6
> > bind :::443 v4v6 ssl crt /etc/ssl/private/ev-2019.pem
> > default_backend www-backend
> > compression algo gzip
> > compression type text/html text/css text/javascript
> > application/javascript application/json
> >
> > backend www-backend
> > http-request redirect prefix https://%[hdr(host),regsub(^www\.,,i)] if {
> > hdr_beg(host) -i www. }
> > http-request add-header X-Forwarded-Proto https
> > redirect scheme https if !{ ssl_fc }
> > balance roundrobin
> > default-server maxconn 256 inter 10s fall 3 rise 2 check
> > server web0 10.113.220.155:6000
> > server web1 10.113.221.156:6000
> > server web2 10.113.222.157:6000
> >
> >
> > On Tue, Oct 1, 2019 at 11:02 AM Aleksandar Lazic <al-hapr...@none.at> wrote:
> >
> > Hi.
> >
> > Am 01.10.19 um 10:46 schrieb Marco Colli:
> > > Hello!
> > >
> > > I use HAProxy to load balance HTTP(S) traffic to some web servers.
> Web servers
> > > then connect to a database. I have noticed that when we restart
> the database
> > > some errors occur (and that is normal during the restart).
> > >
> > > However the problem is that **a few hundred connections remain open from
> > > HAProxy to the Puma web servers forever**. That slows down HAProxy.

Re: Backend connections leak

2019-10-01 Thread Marco Colli
default-server maxconn 256 inter 10s fall 3 rise 2 check
server web0 10.113.220.155:6000
server web1 10.113.221.156:6000
server web2 10.113.222.157:6000


On Tue, Oct 1, 2019 at 11:02 AM Aleksandar Lazic  wrote:

> Hi.
>
> Am 01.10.19 um 10:46 schrieb Marco Colli:
> > Hello!
> >
> > I use HAProxy to load balance HTTP(S) traffic to some web servers. Web
> servers
> > then connect to a database. I have noticed that when we restart the
> database
> > some errors occur (and that is normal during the restart).
> >
> > However the problem is that **a few hundred connections remain open from
> > HAProxy to the Puma web servers forever**. That slows down HAProxy.
> >
> > When we restart HAProxy then everything works fine again and the number
> of
> > backend connections drops to zero, which is the normal value since we
> use option
> > http-server-close. We have also configured the following timeouts but
> nothing
> > has changed (some connections to backend remain open forever):
> >
> > timeout client 10s
> > timeout client-fin 5s
> > timeout http-request 5s
> > timeout server 30s
> > timeout server-fin 10s
> > timeout connect 10s
> > timeout queue 10s
> >
> > HAProxy Version: 2.0
>
> Please can you post the full haproxy -vv as there are many fixes in the
> latest
> versions.
>
> Are there any checks in the config?
> Can you share the (minimal) config so that we can see some more
> information
> about your setup.
>
> Regards
> Aleks
>


Backend connections leak

2019-10-01 Thread Marco Colli
Hello!

I use HAProxy to load balance HTTP(S) traffic to some web servers. Web
servers then connect to a database. I have noticed that when we restart the
database some errors occur (and that is normal during the restart).

However the problem is that **a few hundred connections remain open from
HAProxy to the Puma web servers forever**. That slows down HAProxy.

When we restart HAProxy then everything works fine again and the number of
backend connections drops to zero, which is the normal value since we use
option http-server-close. We have also configured the following timeouts
but nothing has changed (some connections to backend remain open forever):

timeout client 10s
timeout client-fin 5s
timeout http-request 5s
timeout server 30s
timeout server-fin 10s
timeout connect 10s
timeout queue 10s

HAProxy Version: 2.0


Re: How to wait some time before retry?

2019-09-27 Thread Marco Colli
I have already set "timeout connect" to 10s, but it doesn't work. I think
that is because the connection is *immediately rejected* by the OS (Ubuntu)
if there isn't any process listening on the port (i.e. during the Puma
server restart).
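
What I would like to approximate, given that fixed ~1s turn-around between
attempts, is something like this (only a sketch, and not what I would call a
clean solution):

  backend www-backend
    retries 10   # with ~1s turn-around per attempt, covers roughly a 10s restart
    option redispatch
    timeout connect 10s

An explicit "wait time before retry" option would still be much nicer.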

On Fri, Sep 27, 2019 at 2:11 PM Lukas Tribus  wrote:

> Ciao Marco,
>
>
>
> On Fri, Sep 27, 2019 at 1:21 PM Marco Colli 
> wrote:
> >
> > Still have this issue and I cannot find a solution. It would be great to
> have an option "wait time before retry" in the next versions of HAProxy
> (instead of the fixed value of 1 sec).
>
> Why not raise "timeout connect" to 10s in your configuration?
>
>
> Lukas
>


Re: How to wait some time before retry?

2019-09-27 Thread Marco Colli
Still have this issue and I cannot find a solution. It would be great to
have an option "wait time before retry" in the next versions of HAProxy
(instead of the fixed value of 1 sec).

On Mon, Sep 16, 2019 at 2:03 PM Marco Colli  wrote:

> Hello!
>
> I have a question about HAProxy configuration. Maybe someone has a
> solution ;)
>
> I have a HAProxy (v2.0) load balancer in front of many web servers.
>
> When I restart the web servers the TCP socket remains closed for a few
> seconds (~10s). For this reason I would like to retry failed attempts to
> connect after some seconds.
>
> I already use option redispatch, however it seems that does not solve my
> issue. The problem is that the request is retried immediately (after 1s),
> thus causing all the retries to fail. From the HAProxy docs:
>
> In order to avoid immediate reconnections to a server which is restarting,
> a turn-around timer of min("timeout connect", one second) is applied before
> a retry occurs.
>
> Is there any option to wait some more time (e.g. 10s) before retrying? Or
> do you have any other solution?
>
>


How to wait some time before retry?

2019-09-16 Thread Marco Colli
Hello!

I have a question about HAProxy configuration. Maybe someone has a solution
;)

I have a HAProxy (v2.0) load balancer in front of many web servers.

When I restart the web servers the TCP socket remains closed for a few
seconds (~10s). For this reason I would like to retry failed attempts to
connect after some seconds.

I already use option redispatch, however it seems that does not solve my
issue. The problem is that the request is retried immediately (after 1s),
thus causing all the retries to fail. From the HAProxy docs:

In order to avoid immediate reconnections to a server which is restarting,
a turn-around timer of min("timeout connect", one second) is applied before
a retry occurs.

Is there any option to wait some more time (e.g. 10s) before retrying? Or
do you have any other solution?


DDoS protection: ban clients with high HTTP error rates

2019-01-23 Thread Marco Colli
Hello!

I use HAProxy in front of a web app / service and I would like to add DDoS
protection and rate limiting. The problem is that each part of the
application has different request rates and for some customers we must
accept very high request rates and bursts, while this is not allowed for
unauthenticated users for example. So I was thinking about this solution:

1. Based on advanced conditions (e.g. current user) our Rails application
decides whether to return a normal response (e.g. 2xx) or a 429 (Too Many
Requests); it can also return other errors, like 401
2. HAProxy bans clients if they produce too many 4xx errors

What do you think about this solution?
Also, is it correct to use HAProxy directly, or is it more performant to use
fail2ban on HAProxy logs?

This is the HAProxy configuration that I would like to use:

frontend www-frontend
  tcp-request connection reject if { src_http_err_rate(st_abuse) ge 5 }
  http-request track-sc0 src table st_abuse
  ...
  default_backend www-backend

backend www-backend
  ...

backend st_abuse
  stick-table type ipv6 size 1m expire 10s store http_err_rate(10s)



Do you think that the above rules are correct? Am I missing something?
Also, is it correct to mix *tcp*-request and src_*http*_err_rate in the
frontend?
Is it possible to include only the 4xx errors (and not 5xx) in
http_err_rate?
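
In case http_err_rate cannot be limited to 4xx, this is the rough
alternative I was considering, counting only 4xx responses with a
general-purpose counter instead (just a sketch, not tested):

  frontend www-frontend
    tcp-request connection reject if { src_gpc0_rate(st_abuse) ge 5 }
    http-request track-sc0 src table st_abuse
    # increment the counter only for client errors (4xx), not server errors (5xx)
    http-response sc-inc-gpc0(0) if { status ge 400 } { status lt 500 }
    ...
    default_backend www-backend

  backend st_abuse
    stick-table type ipv6 size 1m expire 10s store gpc0_rate(10s)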


Any suggestion would be greatly appreciated
Thank you
Marco Colli


Re: Cannot handle more than 1,000 clients / s

2018-05-13 Thread Marco Colli
Thanks, I didn't see that low value... however that's not the problem,
because that value is ignored in my case, since I don't use minconn:
https://discourse.haproxy.org/t/backend-sessions-limit-200/1661
Basically fullconn is useful only if you set minconn (not my case),
otherwise it is ignored.
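
For reference, a minimal sketch of the case where fullconn does matter: with
minconn set, the effective per-server limit scales between minconn and
maxconn depending on how close the backend load is to fullconn (server name
and address below are just placeholders):

  backend www-backend
    fullconn 2000
    server web0 10.0.0.10:80 minconn 50 maxconn 256 check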

On Sat, May 12, 2018 at 3:53 PM, Jarno Huuskonen 
wrote:

> Hi,
>
> On Fri, May 11, Marco Colli wrote:
> > >
> > > Do you get better results if you'll use http instead of https ?
> >
> >
> > I already tested it yesterday and the results are pretty much the same
> > (only a very small improvement, which is expected, but not a substantial
> > change).
>
> Couple of things to check:
> - first: can you test serving the response straight from haproxy,
>   something like:
> frontend www-frontend
>   ...
>   http-request deny deny_status 200
>
> - second: from the stats screen captures you sent looks like
>   "backend www-backend" is limited to 500 sessions, try increasing
>   backend fullconn
>   (https://cbonte.github.io/haproxy-dconv/1.6/configuration.html#4.2-fullconn)
>
> Are you running haproxy 1.6.3 ? It's pretty old (December 2015).
>
> -Jarno
>
> --
> Jarno Huuskonen
>


Re: Cannot handle more than 1,000 clients / s

2018-05-11 Thread Marco Colli
>
> Do you get better results if you'll use http instead of https ?


I already tested it yesterday and the results are pretty much the same
(only a very small improvement, which is expected, but not a substantial
change).

Running top / htop should show if userspace uses all cpu.


 During the test the CPU usage is this:


%Cpu0  : 65.1 us,  5.0 sy,  0.0 ni, 29.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 49.0 us,  6.3 sy,  0.0 ni, 30.3 id,  0.0 wa,  0.0 hi, 14.3 si,  0.0 st
%Cpu2  : 67.7 us,  4.0 sy,  0.0 ni, 24.8 id,  0.0 wa,  0.0 hi,  3.6 si,  0.0 st
%Cpu3  : 72.1 us,  6.0 sy,  0.0 ni, 21.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st


Also note that when I increase the number of CPUs and HAProxy processes I
don't get any benefit on performance (and the CPU usage is much lower).


On Fri, May 11, 2018 at 5:45 PM, Jarno Huuskonen 
wrote:

> Hi,
>
> On Fri, May 11, Marco Colli wrote:
> > Hope that this is the right place to ask.
> >
> > We have a website that uses HAProxy as a load balancer and nginx in the
> > backend. The website is hosted on DigitalOcean (AMS2).
> >
> > The problem is that - no matter the configuration or the server size - we
> > cannot achieve a connection rate higher than 1,000 new connections / s.
> > Indeed we are testing using loader.io and these are the results:
> > - for a session rate of 1,000 clients per second we get exactly 1,000
> > responses per second
> > - for session rates higher than that, we get long response times (e.g.
> 3s)
> > and only some hundreds of responses per second (so there is a bottleneck)
> > https://ldr.io/2I5hry9
>
> Is your load tester using https connections or http (probably https,
> since you have redirect scheme https if !{ ssl_fc }) ? If https and each
> connection renegotiates tls then there's a chance you are testing how
> fast your VM can do tls negot.
>
> Running top / htop should show if userspace uses all cpu.
>
> Do you get better results if you'll use http instead of https ?
>
> -Jarno
>
> --
> Jarno Huuskonen
>


Re: Cannot handle more than 1,000 clients / s

2018-05-11 Thread Marco Colli
>
> Maybe you want to disable it


Thanks for the reply! I have already tried that and it doesn't help.

> Maybe you can run a "top" showing each CPU usage, so we can see how much
> time is spent in SI and in userland


During the test the CPU usage is pretty constant and the values are these:


%Cpu0  : 65.1 us,  5.0 sy,  0.0 ni, 29.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 49.0 us,  6.3 sy,  0.0 ni, 30.3 id,  0.0 wa,  0.0 hi, 14.3 si,  0.0 st
%Cpu2  : 67.7 us,  4.0 sy,  0.0 ni, 24.8 id,  0.0 wa,  0.0 hi,  3.6 si,  0.0 st
%Cpu3  : 72.1 us,  6.0 sy,  0.0 ni, 21.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st


I saw you're doing http-server-close. Is there any good reason for that?


I need to handle different requests from different clients (I am not
interested in keep alive, since clients usually make just 1 or 2 requests).
So I think that http-server-close doesn't matter because it is used only
for multiple requests *from the same client*.

> The maxconn on your frontend seems too low compared to your target
> traffic (despite the 5000 will apply to each process).


It is 5,000 * 4 = 20,000 which should be enough for a test with 2,000
clients. In any case I have also tried to increase it to 25,000 per process
and the performance is the same in the load tests.

> Last, I would create 4 bind lines, one per process, like this in your
> frontend:
>   bind :80 process 1
>   bind :80 process 2
>

Do you mean bind-process? The HAProxy docs say that when bind-process is
not present it is the same as bind-process all, so I think that it is useless
to write it explicitly.
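
As far as I understand the suggestion (sketching it here to make sure I got
it right), it is not the bind-process directive but the per-bind "process"
parameter, so that each listening socket is pinned to one process:

  frontend www-frontend
    # one socket per process, so incoming connections are spread evenly
    bind :80 process 1
    bind :80 process 2
    bind :80 process 3
    bind :80 process 4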


On Fri, May 11, 2018 at 4:58 PM, Baptiste  wrote:

> Hi Marco,
>
> I see you enabled compression in your HAProxy configuration. Maybe you
> want to disable it and re-run a test just to see (though I don't expect any
> improvement since you seem to have some free CPU cycles on the machine).
> Maybe you can run a "top" showing each CPU usage, so we can see how much
> time is spent in SI and in userland.
> I saw you're doing http-server-close. Is there any good reason for that?
> The maxconn on your frontend seem too low too compared to your target
> traffic (despite the 5000 will apply to each process).
> Last, I would create 4 bind lines, one per process, like this in your
> frontend:
>   bind :80 process 1
>   bind :80 process 2
>   ...
>
> Maybe one of your process is being saturated and you don't see it . The
> configuration above will ensure an even load distribution of the incoming
> connections to the HAProxy process.
>
> Baptiste
>
>
> On Fri, May 11, 2018 at 4:29 PM, Marco Colli 
> wrote:
>
>> how many connections you have opened on the private side
>>
>>
>> Thanks for the reply! What should I do exactly? Can you see it from
>> HAProxy stats? I have taken two screenshots (see attachments) during the
>> load test (30s, 2,000 client/s)
>>
>> here are not closing fast enough and you are reaching the limit.
>>
>>
>> What can I do to improve that?
>>
>>
>>
>>
>> On Fri, May 11, 2018 at 3:30 PM, Mihai Vintila  wrote:
>>
>>> Check how many connections you have opened on the private side(i.e.
>>> between haproxy and nginx), i'm thinking that they are not closing fast
>>> enough and you are reaching the limit.
>>>
>>> Best regards,
>>> Mihai
>>>
>>> On 5/11/2018 4:26 PM, Marco Colli wrote:
>>>
>>> Another note: each nginx server in the backend can handle 8,000 new
>>> clients/s: http://bit.ly/2Kh86j9 (tested with keep alive disabled and
>>> with the same http request)
>>>
>>> On Fri, May 11, 2018 at 2:02 PM, Marco Colli 
>>> wrote:
>>>
>>>> Hello!
>>>>
>>>> Hope that this is the right place to ask.
>>>>
>>>> We have a website that uses HAProxy as a load balancer and nginx in the
>>>> backend. The website is hosted on DigitalOcean (AMS2).
>>>>
>>>> The problem is that - no matter the configuration or the server size -
>>>> we cannot achieve a connection rate higher than 1,000 new connections / s.
>>>> Indeed we are testing using loader.io and these are the results:
>>>> - for a session rate of 1,000 clients per second we get exactly 1,000
>>>> responses per second
>>>> - for session rates higher than that, we get long response times (e.g.
>>>> 3s) and only some hundreds of responses per second (so there is a
>>>> bottleneck) https://ldr.io/2I5hry9
>>>>
>>>> Note that if we use a long http keep alive in HAProxy and the same
>>>> browser makes multiple requests we get much better results: however the
>>>> problem is that in reality we need to handle many different clients
>>>> (which make 1 or 2 requests on average), not many requests from the same
>>>> client.

Re: Cannot handle more than 1,000 clients / s

2018-05-11 Thread Marco Colli
>
> Solution is to have more than one ip on the backend and a round robin when
> sending to the backends.


What do you mean exactly? I already use round robin (as you can see in the
config file linked previously) and in the backend I have 10 different
servers with 10 different IPs

sysctl net.ipv4.ip_local_port_range


Currently I have ~30,000 ports available... they should be enough for 2,000
clients / s. Note that the number during the test is kept constant at 2,000
clients (the number of connected clients is not cumulative / does not
increase during the test).
In any case I have also tested increasing the number of ports to 64k and
run a load test, but nothing changes.
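
For reference, these are the checks I run around the port range and
TIME_WAIT sockets during a test (a sketch; the 64k range is the test
mentioned above):

  # current ephemeral port range
  sysctl net.ipv4.ip_local_port_range

  # widen it to ~64k ports
  sudo sysctl -w net.ipv4.ip_local_port_range="1024 65535"

  # count backend-side sockets stuck in TIME_WAIT while the load test runs
  ss -tan state time-wait | wc -l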

You are probably keeping it opened for around 60 seconds and thus the limit


No, on the backend side I use http-server-close. On the client side the
number is constant at 2k clients during the test and in any case I have
http keep alive timeout set to 500ms.


On Fri, May 11, 2018 at 4:51 PM, Mihai Vintila  wrote:

> You cannot have too many open ports. Once a new connection comes to
> haproxy, it'll initiate a new connection to the nginx on the backend. Each
> new connection opens a local port, and ports are limited by sysctl
> net.ipv4.ip_local_port_range. So even if you set it to 1024 65535 you
> still have only ~64000 sessions. The solution is to have more than one ip on
> the backend and a round robin when sending to the backends. This way you'll
> have 64000 sessions for each backend ip on the haproxy. Alternatively, make
> sure that you are not keeping the connections open for too long. You are
> probably keeping them open for around 60 seconds and thus hitting the limit.
> As you can see, you have 61565 sessions in the screenshots provided. Another
> limit could be the file descriptors, but it seems that this is set to 200k.
>
> Best regards,
> Mihai Vintila
>
> On 5/11/2018 5:29 PM, Marco Colli wrote:
>
> how many connections you have opened on the private side
>
>
> Thanks for the reply! What should I do exactly? Can you see it from
> HAProxy stats? I have taken two screenshots (see attachments) during the
> load test (30s, 2,000 client/s)
>
> here are not closing fast enough and you are reaching the limit.
>
>
> What can I do to improve that?
>
>
>
>
> On Fri, May 11, 2018 at 3:30 PM, Mihai Vintila  wrote:
>
>> Check how many connections you have opened on the private side(i.e.
>> between haproxy and nginx), i'm thinking that they are not closing fast
>> enough and you are reaching the limit.
>>
>> Best regards,
>> Mihai
>>
>> On 5/11/2018 4:26 PM, Marco Colli wrote:
>>
>> Another note: each nginx server in the backend can handle 8,000 new
>> clients/s: http://bit.ly/2Kh86j9 (tested with keep alive disabled and
>> with the same http request)
>>
>> On Fri, May 11, 2018 at 2:02 PM, Marco Colli 
>> wrote:
>>
>>> Hello!
>>>
>>> Hope that this is the right place to ask.
>>>
>>> We have a website that uses HAProxy as a load balancer and nginx in the
>>> backend. The website is hosted on DigitalOcean (AMS2).
>>>
>>> The problem is that - no matter the configuration or the server size -
>>> we cannot achieve a connection rate higher than 1,000 new connections / s.
>>> Indeed we are testing using loader.io and these are the results:
>>> - for a session rate of 1,000 clients per second we get exactly 1,000
>>> responses per second
>>> - for session rates higher than that, we get long response times (e.g.
>>> 3s) and only some hundreds of responses per second (so there is a
>>> bottleneck) https://ldr.io/2I5hry9
>>>
>>> Note that if we use a long http keep alive in HAProxy and the same
>>> browser makes multiple requests we get much better results: however the
>>> problem is that in the reality we need to handle many different clients
>>> (which make 1 or 2 requests on average), not many requests from the same
>>> client.
>>>
>>> Currently we have this configuration:
>>> - 1x HAProxy with 4 vCPU (we have also tested with 12 vCPU... the result
>>> is the same)
>>> - system / process limits and HAProxy configuration:
>>> https://gist.github.com/collimarco/347fa757b1bd1b3f1de536bf1e90f195
>>> - 10x nginx backend servers with 2 vCPU each
>>>
>>> What can we improve in order to handle more than 1,000 different new
>>> clients per second?
>>>
>>> Any suggestion would be extremely helpful.
>>>
>>> Have a nice day
>>> Marco Colli
>>>
>>>
>>
>


Re: Cannot handle more than 1,000 clients / s

2018-05-11 Thread Marco Colli
Another note: each nginx server in the backend can handle 8,000 new
clients/s: http://bit.ly/2Kh86j9 (tested with keep alive disabled and with
the same http request)

On Fri, May 11, 2018 at 2:02 PM, Marco Colli  wrote:

> Hello!
>
> Hope that this is the right place to ask.
>
> We have a website that uses HAProxy as a load balancer and nginx in the
> backend. The website is hosted on DigitalOcean (AMS2).
>
> The problem is that - no matter the configuration or the server size - we
> cannot achieve a connection rate higher than 1,000 new connections / s.
> Indeed we are testing using loader.io and these are the results:
> - for a session rate of 1,000 clients per second we get exactly 1,000
> responses per second
> - for session rates higher than that, we get long response times (e.g. 3s)
> and only some hundreds of responses per second (so there is a bottleneck)
> https://ldr.io/2I5hry9
>
> Note that if we use a long http keep alive in HAProxy and the same browser
> makes multiple requests we get much better results: however the problem is
> that in the reality we need to handle many different clients (which make 1
> or 2 requests on average), not many requests from the same client.
>
> Currently we have this configuration:
> - 1x HAProxy with 4 vCPU (we have also tested with 12 vCPU... the result
> is the same)
> - system / process limits and HAProxy configuration:
> https://gist.github.com/collimarco/347fa757b1bd1b3f1de536bf1e90f195
> - 10x nginx backend servers with 2 vCPU each
>
> What can we improve in order to handle more than 1,000 different new
> clients per second?
>
> Any suggestion would be extremely helpful.
>
> Have a nice day
> Marco Colli
>
>


Cannot handle more than 1,000 clients / s

2018-05-11 Thread Marco Colli
Hello!

Hope that this is the right place to ask.

We have a website that uses HAProxy as a load balancer and nginx in the
backend. The website is hosted on DigitalOcean (AMS2).

The problem is that - no matter the configuration or the server size - we
cannot achieve a connection rate higher than 1,000 new connections / s.
Indeed we are testing using loader.io and these are the results:
- for a session rate of 1,000 clients per second we get exactly 1,000
responses per second
- for session rates higher than that, we get long response times (e.g. 3s)
and only some hundreds of responses per second (so there is a bottleneck)
https://ldr.io/2I5hry9

Note that if we use a long http keep alive in HAProxy and the same browser
makes multiple requests we get much better results: however the problem is
that in reality we need to handle many different clients (which make 1
or 2 requests on average), not many requests from the same client.

Currently we have this configuration:
- 1x HAProxy with 4 vCPU (we have also tested with 12 vCPU... the result is
the same)
- system / process limits and HAProxy configuration:
https://gist.github.com/collimarco/347fa757b1bd1b3f1de536bf1e90f195
- 10x nginx backend servers with 2 vCPU each

What can we improve in order to handle more than 1,000 different new
clients per second?

Any suggestion would be extremely helpful.

Have a nice day
Marco Colli