RE: Haproxy timing issues

2011-11-04 Thread Erik Torlen
I get some problems on step 5 where it doesn't seem to do the ./Configure 
properly. I moved the existing Configure and made a symlink
named Configure that pointed to config. When running step 5 again it seemed 
to jump into an endless making of openssl :/
Meaning that it starts to do something but never finishes; I waited for 
~20 min.

Any ideas?

/E

-Original Message-
From: Vincent Bernat [mailto:ber...@luffy.cx] 
Sent: 2 November 2011 23:16
To: Erik Torlen
Cc: haproxy@formilux.org
Subject: Re: Haproxy timing issues

OoO On this cloudy night of Thursday 03 November 2011, at 01:21, Erik
Torlen erik.tor...@apicasystem.com said:

 Yes, I'm currently on Ubuntu 10.04. 
 So basically I could grab this (http://packages.ubuntu.com/oneiric/openssl) 
 .deb package and then
 add the patch you linked for me to it?
 Can  I then  compile  stud  as default  or  do I  have  to modify  the
 Makefile?

On a development machine :
 1. dget 
http://archive.ubuntu.com/ubuntu/pool/main/o/openssl/openssl_1.0.0e-2ubuntu4.dsc
 2. cd openssl-1.0.0e
 3. curl 
https://raw.github.com/gist/1272151/7f1c3cfa9e95474cfac7c248c7ab41b4fd9e1632/openssl-1.0.0e-backport.patch
 | patch -p1
 4. Update debian/changelog like the first hunk of the patch (which will
not apply cleanly since it is not targeted at the same version)
 5. dpkg-buildpackage -us -uc
 6. dpkg -i ../openssl*deb ../libssl*deb
 7. cd ../stud
 8. make USE_SHARED_CACHE=1
 9. You  get your  stud  linked  against OpenSSL  1.0.0e.  Now, on  your
server,   install  libssl1.0.0_1.0.0e-2ubuntu4~bpoXXX1.deb  then
stud.
-- 
Vincent Bernat ☯ http://vincent.bernat.im

 /*
  * For moronic filesystems that do not allow holes in file.
  * We may have to extend the file.
  */
2.4.0-test2 /usr/src/linux/fs/buffer.c


Re: Haproxy timing issues

2011-11-04 Thread Vincent Bernat
OoO During the meal on Friday 04 November 2011, at 19:22, Erik
Torlen erik.tor...@apicasystem.com said:

 I get some problems on step 5 where it doesn't seem to do the
 ./Configure properly. I moved the existing Configure and made a
 symlink
 named Configure that pointed to config. When running step 5 again
 it seemed to jump into an endless making of openssl :/
 Meaning that it starts to do something but never finishes; I waited 
 for ~20 min.

A symlink seems like a bad idea. Why doesn't it seem to run ./Configure properly?
-- 
Vincent Bernat ☯ http://vincent.bernat.im

Document your data layouts.
- The Elements of Programming Style (Kernighan & Plauger)



Re: Haproxy timing issues

2011-11-03 Thread Vincent Bernat
OoO On this cloudy night of Thursday 03 November 2011, at 01:21, Erik
Torlen erik.tor...@apicasystem.com said:

 Yes, I'm currently on Ubuntu 10.04. 
 So basically I could grab this (http://packages.ubuntu.com/oneiric/openssl) 
 .deb package and then
 add the patch you linked for me to it?
 Can  I then  compile  stud  as default  or  do I  have  to modify  the
 Makefile?

On a development machine :
 1. dget 
http://archive.ubuntu.com/ubuntu/pool/main/o/openssl/openssl_1.0.0e-2ubuntu4.dsc
 2. cd openssl-1.0.0e
 3. curl 
https://raw.github.com/gist/1272151/7f1c3cfa9e95474cfac7c248c7ab41b4fd9e1632/openssl-1.0.0e-backport.patch
 | patch -p1
 4. Update debian/changelog like the first hunk of the patch (which will
not apply cleanly since it is not targeted at the same version)
 5. dpkg-buildpackage -us -uc
 6. dpkg -i ../openssl*deb ../libssl*deb
 7. cd ../stud
 8. make USE_SHARED_CACHE=1
 9. You  get your  stud  linked  against OpenSSL  1.0.0e.  Now, on  your
server,   install  libssl1.0.0_1.0.0e-2ubuntu4~bpoXXX1.deb  then
stud.
-- 
Vincent Bernat ☯ http://vincent.bernat.im

 /*
  * For moronic filesystems that do not allow holes in file.
  * We may have to extend the file.
  */
2.4.0-test2 /usr/src/linux/fs/buffer.c



RE: Haproxy timing issues

2011-11-02 Thread Erik Torlen
Hi,

Yeah the clients are not the problem, we are using 5 different datacenters with 
5 machines each, so ~25 machines. Hardcore loadtesting :)
Btw, the loadtests are done transatlantic, so that is causing latency etc. 

After some more testing yesterday we found just what you mentioned here: 
using stud with too many processes made the results much worse.
The perfect setup turned out to be stud with n=3 and haproxy nbproc=1. 
Increasing n to n=4,5,6.. made the results much worse. 

When I got these results I used stud with n=6, which caused a lot of response 
time problems. However, I don't see these response times 
now when running with n=3 in the haproxy logs. So how could stud with n=6 affect 
the response time on the backend in the haproxy logs?

We are currently using the latest version of stud from github, 
bumptech-stud-0.2-76-g8012fe3. Are the emericbr patches merged 
in there, or is that a fork?

The loadtest client is doing a renegotiation for every connection. The scenario 
contains 3 small images.
Each connection makes 3 requests, 10 times over, with a 3-7 s wait time between each 
request.
This is to keep the connection open as long as possible and get many active 
connections. (We also have scenarios doing a lot of conns/s etc.)
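A back-of-envelope check of that scenario (a sketch; the helper names are mine, the numbers are from this thread) relates connection rate and connection lifetime to concurrent connections via Little's law:

```python
def avg_connection_lifetime(requests, wait_lo, wait_hi):
    """Rough connection lifetime: one think-time pause between successive
    requests, uniform in [wait_lo, wait_hi] seconds (transfer time ignored)."""
    avg_wait = (wait_lo + wait_hi) / 2.0
    return (requests - 1) * avg_wait

def active_connections(conn_rate, lifetime):
    """Little's law: mean concurrent connections = arrival rate * mean lifetime."""
    return conn_rate * lifetime

# 3 requests x 10 rounds = 30 requests per connection, 3-7 s between requests
lifetime = avg_connection_lifetime(30, 3, 7)   # 29 waits * 5 s = 145 s
print(active_connections(150, lifetime))       # ~150 conn/s -> ~21750 active
```

At ~150 conn/s this predicts on the order of 22k concurrent connections, consistent with the ~30k figure reported earlier in the thread.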

Yeah, Aloha would have been cool to test. But this is not for us, this is for a 
customer :)

These are my main sysctl values which gave me visible performance improvement:
net.ipv4.tcp_max_syn_backlog=262144
net.ipv4.tcp_syncookies=0
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_no_metrics_save=1
net.core.somaxconn=262144

net.ipv4.ip_local_port_range=1024 65536
net.ipv4.tcp_tw_recycle=1

These are some more I have tried, but they did not give me much 
improvement:
#net.ipv4.tcp_rmem=4096 87380 16777216
#net.ipv4.tcp_wmem=4096 65536 16777216
#net.ipv4.tcp_fin_timeout = 3
#net.ipv4.tcp_max_orphans = 262144
#net.ipv4.tcp_synack_retries = 2
#net.ipv4.tcp_syn_retries = 2

#net.core.rmem_max=16777216
#net.core.wmem_max=16777216
#net.core.netdev_max_backlog = 262144

/E


-Original Message-
From: Baptiste [mailto:bed...@gmail.com] 
Sent: 1 November 2011 16:08
To: Erik Torlen
Cc: haproxy@formilux.org
Subject: Re: Haproxy timing issues

Hi,

First question: are you sure you're reaching the limit of
haproxy/varnish and not the limit of your client?
Mainly concerning the increasing response time.

How many CPUs do you have in your VM? Starting too many stud processes
could be counter-productive.
I doubt doing CPU affinity in a VM improves anything :)

Concerning the logs, the times we can see on your client side are very
high! Too high :)
3-4 s for HAProxy to get the full request.

How are you running stud?
Which options? Are you using the one with emericbr patches?
Are your requests using the same SSL session ID, or do you
negotiate a new one for each connection?

Have you checked your network statistics, on both client and server side?
netstat -in and netstat -s
Are there a lot of drops, retransmissions, congestion, etc.?

On your last log line, we can see that HAProxy took 22s to establish a
TCP connection to your Varnish...

Can you share your stud, haproxy, and varnish configurations, the
version of each piece of software, and the startup parameters for Varnish?
What kind of tool do you use on your client to run your load test?
What sysctls have you already tuned?


Unfortunately, the Aloha does not run on Amazon :)


cheers,


On Tue, Nov 1, 2011 at 9:16 PM, Erik Torlen erik.tor...@apicasystem.com wrote:
 Hi,

 I am currently (and have been from time to time the last weeks) doing some 
 heavy loadtesting against haproxy with stud in front of it handling the ssl.

 My loadtest has been focused on loadtesting SSL traffic through stud against 
 haproxy on amazon ec2.

 Our current problem is that we cannot get more than ~30k active connections 
 (~150 conns/s) before we start to see increased response times (10-60s) on 
 the 
 client side. Running with 38k connections now and seeing much higher response 
 time.

 The setup is:
 1 instance running haproxy + stud
 2 instances running varnish, serving 3 cached images

 Varnish has 100% cache hit ratio so nothing goes to the backend.

 We have tried using m1.xlarge and c1.xlarge. The m1.xlarge uses almost 
 100% cpu when doing the loadtests, while c1.xlarge has a lot of resources left 
 (stud using a few percent per process) and haproxy ~60-70% cpu.
 The only difference is that c1.xlarge gives noticeably better response times before 
 the actual problem happens where resp times increase.

 Haproxy is running with nbproc=1
 Stud is running with n=6 and a shared session cache. (Tried it with n=3 as well.)

 From the logging in haproxy I could see that the time it takes to establish a 
 connection against the backend and receive the data:

 Haproxy.log
 Nov  1 18:40:35 127.0.0.1 haproxy[18511]: x.x.x.x:54113 
 [01/Nov/2011:18:39:40.273] varnish varnish/varnish1 4519/0/73/50215/54809 200 
 2715 - -  238/236/4/5/0 0/0 GET /assets/images/icons

RE: Haproxy timing issues

2011-11-02 Thread Lukas Tribus

Hi,



you should switch net.ipv4.tcp_tw_recycle off; you already have tcp_tw_reuse 
on, which serves the same purpose (and is less dangerous with NATted clients).


http://www.serverphorums.com/read.php?10,182544
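That advice can be turned into a quick sanity check; `check_time_wait_settings` is a hypothetical helper name, and the values mirror the sysctls posted earlier in the thread:

```python
def check_time_wait_settings(sysctl):
    """Return warnings for risky TIME-WAIT tunings.

    `sysctl` maps option names to integer values, e.g. as parsed
    from `sysctl -a` output."""
    warnings = []
    if sysctl.get("net.ipv4.tcp_tw_recycle") == 1:
        warnings.append("net.ipv4.tcp_tw_recycle=1 is dangerous behind NAT; "
                        "net.ipv4.tcp_tw_reuse=1 already serves the same purpose")
    return warnings

settings = {"net.ipv4.tcp_tw_reuse": 1, "net.ipv4.tcp_tw_recycle": 1}
for w in check_time_wait_settings(settings):
    print(w)
```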


Lukas

 From: erik.tor...@apicasystem.com
 To: bed...@gmail.com
 CC: haproxy@formilux.org
 Subject: RE: Haproxy timing issues
 Date: Wed, 2 Nov 2011 18:17:58 +
 
 Hi,
 
 Yeah the clients are not the problem, we are using 5 different datacenters 
 with 5 machines each, so ~25 machines. Hardcore loadtesting :)
 Btw, the loadtests are done transatlantic, so that is causing latency etc. 
 
 After some more testing yesterday we found just what you mentioned here: 
 using stud with too many processes made the results much worse.
 The perfect setup turned out to be stud with n=3 and haproxy nbproc=1. 
 Increasing n to n=4,5,6.. made the results much worse. 
 
 When I got these results I used stud with n=6, which caused a lot of response 
 time problems. However, I don't see these response times 
 now when running with n=3 in the haproxy logs. So how could stud with n=6 affect 
 the response time on the backend in the haproxy logs?
 
 We are currently using the latest version of stud from github, 
 bumptech-stud-0.2-76-g8012fe3. Are the emericbr patches merged 
 in there, or is that a fork?
 
 The loadtest client is doing a renegotiation for every connection. The 
 scenario contains 3 small images.
 Each connection makes 3 requests, 10 times over, with a 3-7 s wait time between each 
 request.
 This is to keep the connection open as long as possible and get many active 
 connections. (We also have scenarios doing a lot of conns/s etc.)
 
 Yeah, Aloha would have been cool to test. But this is not for us, this is for 
 a customer :)
 
 These are my main sysctl values which gave me visible performance improvement:
 net.ipv4.tcp_max_syn_backlog=262144
 net.ipv4.tcp_syncookies=0
 net.ipv4.tcp_tw_reuse=1
 net.ipv4.tcp_no_metrics_save=1
 net.core.somaxconn=262144
 
 net.ipv4.ip_local_port_range=1024 65536
 net.ipv4.tcp_tw_recycle=1
 
 These are some more I have tried, but they did not give me much 
 improvement:
 #net.ipv4.tcp_rmem=4096 87380 16777216
 #net.ipv4.tcp_wmem=4096 65536 16777216
 #net.ipv4.tcp_fin_timeout = 3
 #net.ipv4.tcp_max_orphans = 262144
 #net.ipv4.tcp_synack_retries = 2
 #net.ipv4.tcp_syn_retries = 2
 
 #net.core.rmem_max=16777216
 #net.core.wmem_max=16777216
 #net.core.netdev_max_backlog = 262144
 
 /E
 
 
 -Original Message-
 From: Baptiste [mailto:bed...@gmail.com] 
 Sent: 1 November 2011 16:08
 To: Erik Torlen
 Cc: haproxy@formilux.org
 Subject: Re: Haproxy timing issues
 
 Hi,
 
 First question: are you sure you're reaching the limit of
 haproxy/varnish and not the limit of your client?
 Mainly concerning the increasing response time.
 
 How many CPUs do you have in your VM? Starting too many stud processes
 could be counter-productive.
 I doubt doing CPU affinity in a VM improves anything :)
 
 Concerning the logs, the times we can see on your client side are very
 high! Too high :)
 3-4 s for HAProxy to get the full request.
 
 How are you running stud?
 Which options? Are you using the one with emericbr patches?
 Are your requests using the same SSL session ID, or do you
 negotiate a new one for each connection?
 
 Have you checked your network statistics, on both client and server side?
 netstat -in and netstat -s
 Are there a lot of drops, retransmissions, congestion, etc.?
 
 On your last log line, we can see that HAProxy took 22s to establish a
 TCP connection to your Varnish...
 
 Can you share your stud, haproxy, and varnish configurations, the
 version of each piece of software, and the startup parameters for Varnish?
 What kind of tool do you use on your client to run your load test?
 What sysctls have you already tuned?
 
 
 Unfortunately, the Aloha does not run on Amazon :)
 
 
 cheers,
 
 
 On Tue, Nov 1, 2011 at 9:16 PM, Erik Torlen erik.tor...@apicasystem.com 
 wrote:
  Hi,
 
  I am currently (and have been from time to time the last weeks) doing some 
  heavy loadtesting against haproxy with stud in front of it handling the ssl.
 
  My loadtest has been focused on loadtesting SSL traffic through stud 
  against haproxy on amazon ec2.
 
  Our current problem is that we cannot get more than ~30k active connections 
  (~150 conns/s) before we start to see increased response times (10-60s) 
  on the 
  client side. Running with 38k connections now and seeing much higher 
  response time.
 
  The setup is:
  1 instance running haproxy + stud
  2 instances running varnish, serving 3 cached images
 
  Varnish has 100% cache hit ratio so nothing goes to the backend.
 
  We have tried using m1.xlarge and c1.xlarge. The m1.xlarge uses almost 
  100% cpu when doing the loadtests, while c1.xlarge has a lot of resources 
  left (stud using a few percent per process) and haproxy ~60-70% cpu.
  The only difference is that c1.xlarge gives noticeably better response times 
  before

RE: Haproxy timing issues

2011-11-02 Thread Erik Torlen
Note: We did not make use of re-negotiation for every connection, only for the 
first 2000. 

I have started to do the same loadtests now with re-negotiation for each 
connection.

/E

-Original Message-
From: Baptiste [mailto:bed...@gmail.com] 
Sent: 1 November 2011 16:08
To: Erik Torlen
Cc: haproxy@formilux.org
Subject: Re: Haproxy timing issues

Hi,

First question: are you sure you're reaching the limit of
haproxy/varnish and not the limit of your client?
Mainly concerning the increasing response time.

How many CPUs do you have in your VM? Starting too many stud processes
could be counter-productive.
I doubt doing CPU affinity in a VM improves anything :)

Concerning the logs, the times we can see on your client side are very
high! Too high :)
3-4 s for HAProxy to get the full request.

How are you running stud?
Which options? Are you using the one with emericbr patches?
Are your requests using the same SSL session ID, or do you
negotiate a new one for each connection?

Have you checked your network statistics, on both client and server side?
netstat -in and netstat -s
Are there a lot of drops, retransmissions, congestion, etc.?

On your last log line, we can see that HAProxy took 22s to establish a
TCP connection to your Varnish...

Can you share your stud, haproxy, and varnish configurations, the
version of each piece of software, and the startup parameters for Varnish?
What kind of tool do you use on your client to run your load test?
What sysctls have you already tuned?


Unfortunately, the Aloha does not run on Amazon :)


cheers,


On Tue, Nov 1, 2011 at 9:16 PM, Erik Torlen erik.tor...@apicasystem.com wrote:
 Hi,

 I am currently (and have been from time to time the last weeks) doing some 
 heavy loadtesting against haproxy with stud in front of it handling the ssl.

 My loadtest has been focused on loadtesting SSL traffic through stud against 
 haproxy on amazon ec2.

 Our current problem is that we cannot get more than ~30k active connections 
 (~150 conns/s) before we start to see increased response times (10-60s) on 
 the 
 client side. Running with 38k connections now and seeing much higher response 
 time.

 The setup is:
 1 instance running haproxy + stud
 2 instances running varnish, serving 3 cached images

 Varnish has 100% cache hit ratio so nothing goes to the backend.

 We have tried using m1.xlarge and c1.xlarge. The m1.xlarge uses almost 
 100% cpu when doing the loadtests, while c1.xlarge has a lot of resources left 
 (stud using a few percent per process) and haproxy ~60-70% cpu.
 The only difference is that c1.xlarge gives noticeably better response times before 
 the actual problem happens where resp times increase.

 Haproxy is running with nbproc=1
 Stud is running with n=6 and a shared session cache. (Tried it with n=3 as well.)

 From the logging in haproxy I could see that the time it takes to establish a 
 connection against the backend and receive the data:

 Haproxy.log
 Nov  1 18:40:35 127.0.0.1 haproxy[18511]: x.x.x.x:54113 
 [01/Nov/2011:18:39:40.273] varnish varnish/varnish1 4519/0/73/50215/54809 200 
 2715 - -  238/236/4/5/0 0/0 GET /assets/images/icons/elite_logo_beta.png 
 HTTP/1.1
 Nov  1 18:40:35 127.0.0.1 haproxy[18511]: x.x.x.x:55635 
 [01/Nov/2011:18:39:41.547] varnish varnish/varnish1 3245/0/81/50207/53535 200 
 1512 - -  238/236/3/4/0 0/0 GET /assets/images/icons/favicon.ico 
 HTTP/1.1
 ...
 Nov  1 18:40:44 127.0.0.1 haproxy[18511]: x.x.x.x:34453 
 [01/Nov/2011:18:39:25.330] varnish varnish/varnish1 3082/0/225/32661/79559 
 200 1512 - -  234/232/1/2/0 0/0 GET /assets/images/icons/favicon.ico 
 HTTP/1.1
 Nov  1 18:40:44 127.0.0.1 haproxy[18511]: x.x.x.x:53731 
 [01/Nov/2011:18:39:25.036] varnish varnish/varnish1 3377/0/216/32669/79854 
 200 1725 - -  233/231/0/1/0 0/0 GET /assets/images/create/action_btn.png 
 HTTP/1.1

 Haproxy.err (NOTE: 504 error here)

 Nov  1 18:40:11 127.0.0.1 haproxy[18511]: x.x.x.x:34885 
 [01/Nov/2011:18:39:07.597] varnish varnish/varnish1 4299/0/27/-1/64330 504 
 194 - - sH-- 10916/10914/4777/2700/0 0/0 GET 
 /assets/images/icons/favicon.ico HTTP/1.1
 Nov  1 18:40:12 127.0.0.1 haproxy[18511]: x.x.x.x:58878 
 [01/Nov/2011:18:39:12.621] varnish varnish/varnish2 314/0/55/-1/60374 504 194 
 - - sH-- 3692/3690/3392/1623/0 0/0 GET /assets/images/icons/favicon.ico 
 HTTP/1.1

 Nov  1 18:40:18 127.0.0.1 haproxy[18511]: x.x.x.x:35505 
 [01/Nov/2011:18:39:42.670] varnish varnish/varnish1 3515/0/22078/10217/35811 
 200 1512 - -  1482/1481/1238/710/1 0/0 GET 
 /assets/images/icons/favicon.ico HTTP/1.1
 Nov  1 18:40:18 127.0.0.1 haproxy[18511]: x.x.x.x:40602 
 [01/Nov/2011:18:39:42.056] varnish varnish/varnish1 4126/0/22081/10226/36435 
 200 1512 - -  1475/1474/1231/703/1 0/0 GET 
 /assets/images/icons/favicon.ico HTTP/1.1


 Here is the logs from running haproxy with varnish as a backend on the local 
 machine:

 Haproxy.log
 Nov  1 20:00:52 127.0.0.1 haproxy[18953]: x.x.x.x:38552 
 [01/Nov/2011:20:00:45.157

RE: Haproxy timing issues

2011-11-02 Thread Erik Torlen
Yes, Vincent Bernat's blog posts are really good. However, using this software 
on EC2, which is VMs, does not give the 
same performance in every respect. But I think that it still performs pretty 
well.

I am using taskset for all processes. Haproxy goes to cpu 01 and each stud 
process gets bound to 02, 03, 04, etc., depending on 
how many stud processes I run.
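On Linux, the pinning that `taskset` does can also be applied from code via the scheduler-affinity syscalls; a minimal sketch (assuming you already know the haproxy/stud PIDs):

```python
import os

def pin_to_cpus(pid, cpus):
    """Bind `pid` (0 = the calling process) to the given set of CPU ids,
    like `taskset -cp`; returns the resulting affinity set (Linux only)."""
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)

# Demo: pin the current process to the first CPU it is allowed to run on,
# mirroring "haproxy on cpu 01, each stud worker on its own cpu".
first_cpu = min(os.sched_getaffinity(0))
print(pin_to_cpus(0, {first_cpu}))
```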

The latest test I'm doing is with the SSL cache turned off on the client side, which 
means that for each new connection it negotiates 
again. And for each connection it is doing the requests I mentioned before.

The results are totally different...

Running with negotiation for each connection gives better results using more 
stud processes.
Now I get the worst result using stud with n=3 and the best result with stud 
n=6 :)
Also I get far fewer actual connections/s in haproxy, only 800 conns/s compared 
to 2000 conns/s. 

This was definitely much heavier for the webserver, and stud proved to handle it 
better with more processes.
Also HAproxy worked less; it had an idle time of ~50% compared to 15% in the 
tests before.

Swap was not used; there was plenty of memory left.

Last but not least, when benching a platform, it's a bad idea to
introduce random stuff. IE your think time.
- I agree, using a script on the backend would have been much better. The only 
reason I made the script as is, with wait time, was because it's a customer 
environment that I am currently working with, and putting in a custom script was 
a longer process than just solving the problem myself :)
The script should make use of HTTP keep-alive with a small wait time between 
each request in order to not close the connection.

Thanks
/E


-Original Message-
From: Baptiste [mailto:bed...@gmail.com] 
Sent: 2 November 2011 14:22
To: Erik Torlen
Cc: haproxy@formilux.org
Subject: Re: Haproxy timing issues

Hi Erik,

I doubt this could improve things because of virtualization, but have
you tried binding processes to CPUs?
On physical hardware, the purpose is to benefit from the L2/L3 CPU
cache, mainly for network IO and HAProxy, and also to reduce the
overhead of the CPU moving processes from one core to another.

Again, I guess the virtualisation layer would lower the impact of
L2/L3 CPU cache :)

In your c1.xlarge, you have 8 virtual CPU cores. I guess you wanted 6
studs + 1 HAProxy to get the best of your hardware.
Since your best results are obtained with 3 CPU cores, it seems you
can only use 4 physical CPUs.
Read Vincent article: http://vincent.bernat.im/en/blog/2011-ssl-benchmark.html
You'll see that stud capacity grows almost linearly with the number of
stud processes.

When running 6 studs, have you recorded the vmstat output?
Compare it with a record without any load, and maybe also with one
from running 3 stud processes; maybe something obvious will appear.
I guess there are some locks somewhere, maybe in the underlying hypervisor...

You're maybe hitting the famous virtualization clock drift:
http://software.intel.com/en-us/blogs/2009/06/25/virtualization-and-performance-vm-time-drift/
Since HAProxy is event-driven, it is very sensitive to clock synchronization.

Once your VM is overloaded, all the processes are impacted, so the
network IO and HAProxy too.

By the way, have you checked that you are not using swap?
When doing load-balancing, swapping is the worst thing that could happen.
Actually, the worst thing would be to swap in a VM :)

By the way, it seems you're running the right stud version :)
Emericbr's patches have been included since mid-August.

Concerning your load test, do you mean an SSL negotiation per session
or per connection?
You should try first with no negotiation, to lower the impact on the CPU...
But you're right to renegotiate for each request; it gives you your
total SSL handshake capacity.

Last but not least, when benching a platform, it's a bad idea to
introduce random stuff, i.e. your think time.
How do you know how many requests your clients are generating if you
introduce randomization?
This might be interesting in a second step, but in the first step, when
benching the total capacity of a platform, it's not a good idea. Well,
this is my own point of view :)
If you want many active connections, just create a slow backend script
in PHP which holds the connection for a few seconds (let's say 20)
before answering Ok.
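The suggestion above names PHP; an equivalent sketch in Python (the helper name is mine) holds each connection for `delay` seconds before answering "Ok":

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

def make_slow_server(delay, port=0):
    """Build an HTTP server whose GET handler sleeps `delay` seconds and then
    answers "Ok". port=0 lets the OS pick a free port."""
    class SlowHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            time.sleep(delay)              # hold the connection open
            body = b"Ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):      # keep the benchmark quiet
            pass

    return HTTPServer(("127.0.0.1", port), SlowHandler)
```

Pointing a backend at `make_slow_server(20, 8080).serve_forever()` parks every request for 20 s, which is exactly the "many active connections" load pattern without randomized think time.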

Nothing to say on your sysctls, but your rmem and wmem seem very high :)

cheers


On Wed, Nov 2, 2011 at 7:17 PM, Erik Torlen erik.tor...@apicasystem.com wrote:
 Hi,

 Yeah the clients are not the problem, we are using 5 different datacenters 
 with 5 machines each, so ~25 machines. Hardcore loadtesting :)
 Btw, the loadtests are done transatlantic, so that is causing latency etc.

 After some more testing yesterday we found just what you mentioned here: 
 using stud with too many processes made the results much worse.
 The perfect setup turned out to be stud with n=3 and haproxy nbproc=1. 
 Increasing n to n=4,5,6.. made

Re: Haproxy timing issues

2011-11-02 Thread Vincent Bernat
OoO In this early evening of Wednesday 02 November 2011, at 21:13, Erik
Torlen erik.tor...@apicasystem.com said:

 /usr/local/bin/stud -b 127.0.0.1 85 -f *,443 --ssl -B 1000 -n 2 -C 4 -u stud -r /home/stud --write-proxy /usr/share/ssl-cert/cert.pem

 

 I have tried stud using 10k of shared cache which gave me worse
 performance. Has anyone tried stud with different sizes of the shared
 session cache?

It depends on the profile of your traffic. With about 4000 conn/s and
1000 new clients/s, a cache of 20k seems to provide the best
performance. Increasing it does not hinder performance. However, the
benchmark was a bit artificial because each client would do 4 connections
and then disappear forever.
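A rough model of those numbers (the helper names are mine): with 4000 conn/s of which 1000/s come from new clients, the session-resumption ratio and the time a session survives in a 20k-entry cache work out as:

```python
def resumption_ratio(conn_rate, new_client_rate):
    """Fraction of connections that can resume an already-cached SSL session."""
    return 1.0 - new_client_rate / conn_rate

def cache_coverage_seconds(cache_entries, new_client_rate):
    """Seconds before a session is evicted from a cache of `cache_entries`
    slots, assuming one new entry per new client."""
    return cache_entries / new_client_rate

print(resumption_ratio(4000, 1000))         # 0.75: 3 of each client's 4 conns resume
print(cache_coverage_seconds(20000, 1000))  # 20.0 s of new clients fit in the cache
```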
-- 
Vincent Bernat ☯ http://vincent.bernat.im

# Okay, what on Earth is this one supposed to be used for?
2.4.0 linux/drivers/char/cp437.uni



Re: Haproxy timing issues

2011-11-02 Thread Vincent Bernat
OoO The night having already covered this day in ink, Wednesday 02 November
2011, at 23:50, Erik Torlen erik.tor...@apicasystem.com said:

 How big is the difference between OpenSSL 0.9.8k and 1.0.0?
 I tried to get openssl 1.0.0 onto the system before, but had problems
 with other programs whose dependencies got broken.

Memory usage can be divided by 10 with OpenSSL 1.0.0. You need to ensure
that  you use  a  stud version  using  SSL_MODE_RELEASE_BUFFERS to  take
advantage of it.
-- 
Vincent Bernat ☯ http://vincent.bernat.im

Follow each decision as closely as possible with its associated action.
- The Elements of Programming Style (Kernighan & Plauger)



RE: Haproxy timing issues

2011-11-02 Thread Erik Torlen
Ok, could be an idea to use that then.

Btw, I am on a system where I can't upgrade to a later version of the dist and 
take advantage of openssl 1.0.0 through apt. 
Could I make stud use openssl with static libs? E.g. compiling openssl from 
source and then linking it in stud's Makefile.

/E


-Original Message-
From: Vincent Bernat [mailto:ber...@luffy.cx] 
Sent: 2 November 2011 16:01
To: Erik Torlen
Cc: Baptiste; haproxy@formilux.org
Subject: Re: Haproxy timing issues

OoO The night having already covered this day in ink, Wednesday 02 November
2011, at 23:50, Erik Torlen erik.tor...@apicasystem.com said:

 How big is the difference between OpenSSL 0.9.8k and 1.0.0?
 I tried to get openssl 1.0.0 onto the system before, but had problems
 with other programs whose dependencies got broken.

Memory usage can be divided by 10 with OpenSSL 1.0.0. You need to ensure
that  you use  a  stud version  using  SSL_MODE_RELEASE_BUFFERS to  take
advantage of it.
-- 
Vincent Bernat ☯ http://vincent.bernat.im

Follow each decision as closely as possible with its associated action.
- The Elements of Programming Style (Kernighan & Plauger)


Re: Haproxy timing issues

2011-11-02 Thread Vincent Bernat
OoO The night having already covered this day in ink, Wednesday 02 November
2011, at 23:55, Erik Torlen erik.tor...@apicasystem.com said:

 Okay, good to know, Vincent.
 Do you know the memory impact of using 10k, 20k, etc.?

Yes. Divide by two to get the size in kbytes. So a 10k cache will be
about 5 Mbytes. There is also the internal cache of OpenSSL, which can
contain 2 sessions. I think that you can account for about the same
size. Emeric submitted a recent patch that modifies the size of
the internal session cache depending on the size of the external cache
(8 times smaller).

An active SSL connection can take a lot more memory than a session, but I
don't know how much exactly. If you have long-running connections, this
will be more of an issue than the session cache.
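Vincent's rule of thumb can be written down directly (a sketch; the helper names are mine, and the factor-of-two external/internal total is my reading of "about the same size"):

```python
def external_cache_kbytes(entries):
    """'Divide by two to get the size in kbytes': ~0.5 kB per cached session."""
    return entries / 2.0

def total_cache_kbytes(entries):
    """External cache plus OpenSSL's internal cache, assumed about the same size."""
    return 2 * external_cache_kbytes(entries)

print(external_cache_kbytes(10000))  # 5000.0 kB, i.e. ~5 Mbytes for a 10k cache
print(total_cache_kbytes(20000))     # rough total for a 20k external cache
```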
-- 
Vincent Bernat ☯ http://vincent.bernat.im

Watch out for off-by-one errors.
- The Elements of Programming Style (Kernighan & Plauger)



Re: Haproxy timing issues

2011-11-02 Thread Vincent Bernat
OoO On this cloudy night of Thursday 03 November 2011, at 00:32, Erik
Torlen erik.tor...@apicasystem.com said:

 Ok, could be an idea to use that then.
 Btw, I am on a system that I can't upgrade to a later version of the
 dist and take advantage of openssl 1.0.0 through apt.
 Could I make stud use openssl with static libs? E.g. compiling openssl
 from source and then linking it in stud's Makefile.

It should  be possible.  But OpenSSL  1.0.0 can live  side by  side with
OpenSSL 0.9.8k.  I suppose that you  use Ubuntu LTS 10.04.  You can grab
the package from Oneiric and apply a simple patch to backport it.

 https://gist.github.com/1272151/b1a61124d1568eb795fa82b24b875889cbd0005c
-- 
Vincent Bernat ☯ http://vincent.bernat.im

panic("floppy: Port bolixed.");
2.2.16 /usr/src/linux/include/asm-sparc/floppy.h



RE: Haproxy timing issues

2011-11-02 Thread Erik Torlen
Yes, I'm currently on Ubuntu 10.04. 
So basically I could grab this (http://packages.ubuntu.com/oneiric/openssl) 
.deb package and then 
apply the patch you linked to it?
Can I then compile stud as default or do I have to modify the Makefile?

/E

-Original Message-
From: Vincent Bernat [mailto:ber...@luffy.cx] 
Sent: 2 November 2011 16:38
To: Erik Torlen
Cc: haproxy@formilux.org
Subject: Re: Haproxy timing issues

OoO On this cloudy night of Thursday 03 November 2011, at 00:32, Erik
Torlen erik.tor...@apicasystem.com said:

 Ok, could be an idea to use that then.
 Btw, I am on a system that I can't upgrade to a later version of the
 dist and take advantage of openssl 1.0.0 through apt.
 Could I make stud use openssl with static libs? E.g. compiling openssl
 from source and then linking it in stud's Makefile.

It should  be possible.  But OpenSSL  1.0.0 can live  side by  side with
OpenSSL 0.9.8k.  I suppose that you  use Ubuntu LTS 10.04.  You can grab
the package from Oneiric and apply a simple patch to backport it.

 https://gist.github.com/1272151/b1a61124d1568eb795fa82b24b875889cbd0005c
-- 
Vincent Bernat ☯ http://vincent.bernat.im

panic("floppy: Port bolixed.");
2.2.16 /usr/src/linux/include/asm-sparc/floppy.h


Re: Haproxy timing issues

2011-11-02 Thread Baptiste
I'm currently writing the blog article about it, but Emeric's last
patch will allow you to scale OUT your SSL performance through a
shared SSL session ID cache.

cheers


On Thu, Nov 3, 2011 at 1:21 AM, Erik Torlen erik.tor...@apicasystem.com wrote:
 Yes, I'm currently on Ubuntu 10.04.
 So basically I could grab this (http://packages.ubuntu.com/oneiric/openssl) 
 .deb package and then
 add the patch you linked for me to it?
 Can I then compile stud as default or do I have to modify the Makefile?

 /E

 -Original Message-
 From: Vincent Bernat [mailto:ber...@luffy.cx]
 Sent: 2 November 2011 16:38
 To: Erik Torlen
 Cc: haproxy@formilux.org
 Subject: Re: Haproxy timing issues

 OoO On this cloudy night of Thursday 03 November 2011, at 00:32, Erik
 Torlen erik.tor...@apicasystem.com said:

 Ok, could be an idea to use that then.
 Btw, I am on a system that I can't upgrade to a later version of the
 dist and take advantage of openssl 1.0.0 through apt.
 Could I make stud use openssl with static libs? E.g. compiling openssl
 from source and then linking it in stud's Makefile.

 It should  be possible.  But OpenSSL  1.0.0 can live  side by  side with
 OpenSSL 0.9.8k.  I suppose that you  use Ubuntu LTS 10.04.  You can grab
 the package from Oneiric and apply a simple patch to backport it.

  https://gist.github.com/1272151/b1a61124d1568eb795fa82b24b875889cbd0005c
 --
 Vincent Bernat ☯ http://vincent.bernat.im

 panic("floppy: Port bolixed.");
        2.2.16 /usr/src/linux/include/asm-sparc/floppy.h




Haproxy timing issues

2011-11-01 Thread Erik Torlen
Hi,

I am currently (and have been from time to time the last weeks) doing some 
heavy loadtesting against haproxy with stud in front of it handling the ssl.

My loadtest has been focused on loadtesting SSL traffic through stud against 
haproxy on amazon ec2. 

Our current problem is that we cannot get more than ~30k active connections 
(~150 conns/s) before we start to see increased response times (10-60s) on the
client side. Running with 38k connections now and seeing much higher response 
times.

The setup is:
1 instance running haproxy + stud
2 instances running Varnish, serving 3 cached images

Varnish has 100% cache hit ratio so nothing goes to the backend.

We have tried both m1.xlarge and c1.xlarge. The m1.xlarge uses almost 100% 
CPU during the load tests, while c1.xlarge has a lot of resources left (stud 
using a few percent per process) and haproxy at ~60-70% CPU.
The only difference is that c1.xlarge gives noticeably better response times 
before the actual problem (the climbing response times) appears.

Haproxy is running with nbproc=1
Stud is running with n=6 and a shared session cache. (Tried it with n=3 as well.)

From the haproxy logs I could see the time it takes to establish a 
connection to the backend and receive the data:

Haproxy.log
Nov  1 18:40:35 127.0.0.1 haproxy[18511]: x.x.x.x:54113 
[01/Nov/2011:18:39:40.273] varnish varnish/varnish1 4519/0/73/50215/54809 200 
2715 - -  238/236/4/5/0 0/0 GET /assets/images/icons/elite_logo_beta.png 
HTTP/1.1
Nov  1 18:40:35 127.0.0.1 haproxy[18511]: x.x.x.x:55635 
[01/Nov/2011:18:39:41.547] varnish varnish/varnish1 3245/0/81/50207/53535 200 
1512 - -  238/236/3/4/0 0/0 GET /assets/images/icons/favicon.ico HTTP/1.1
...
Nov  1 18:40:44 127.0.0.1 haproxy[18511]: x.x.x.x:34453 
[01/Nov/2011:18:39:25.330] varnish varnish/varnish1 3082/0/225/32661/79559 200 
1512 - -  234/232/1/2/0 0/0 GET /assets/images/icons/favicon.ico HTTP/1.1
Nov  1 18:40:44 127.0.0.1 haproxy[18511]: x.x.x.x:53731 
[01/Nov/2011:18:39:25.036] varnish varnish/varnish1 3377/0/216/32669/79854 200 
1725 - -  233/231/0/1/0 0/0 GET /assets/images/create/action_btn.png 
HTTP/1.1

Haproxy.err (NOTE: 504 error here)

Nov  1 18:40:11 127.0.0.1 haproxy[18511]: x.x.x.x:34885 
[01/Nov/2011:18:39:07.597] varnish varnish/varnish1 4299/0/27/-1/64330 504 194 
- - sH-- 10916/10914/4777/2700/0 0/0 GET /assets/images/icons/favicon.ico 
HTTP/1.1
Nov  1 18:40:12 127.0.0.1 haproxy[18511]: x.x.x.x:58878 
[01/Nov/2011:18:39:12.621] varnish varnish/varnish2 314/0/55/-1/60374 504 194 - 
- sH-- 3692/3690/3392/1623/0 0/0 GET /assets/images/icons/favicon.ico HTTP/1.1

Nov  1 18:40:18 127.0.0.1 haproxy[18511]: x.x.x.x:35505 
[01/Nov/2011:18:39:42.670] varnish varnish/varnish1 3515/0/22078/10217/35811 
200 1512 - -  1482/1481/1238/710/1 0/0 GET 
/assets/images/icons/favicon.ico HTTP/1.1
Nov  1 18:40:18 127.0.0.1 haproxy[18511]: x.x.x.x:40602 
[01/Nov/2011:18:39:42.056] varnish varnish/varnish1 4126/0/22081/10226/36435 
200 1512 - -  1475/1474/1231/703/1 0/0 GET 
/assets/images/icons/favicon.ico HTTP/1.1
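The five slash-separated numbers in each line are HAProxy's httplog timers, Tq/Tw/Tc/Tr/Tt: time to receive the full request, time spent queued, time to connect to the server, time waiting for the server's response, and total session time, all in milliseconds; -1 marks a phase that never completed. A small helper to read them, using log lines from the excerpts above:

```python
# Split the Tq/Tw/Tc/Tr/Tt timer field of an HAProxy httplog line.
# -1 means the phase never completed (e.g. Tr == -1 on the 504 lines:
# the server never answered before the timeout fired).
def parse_timers(field):
    names = ("Tq", "Tw", "Tc", "Tr", "Tt")
    return dict(zip(names, (int(v) for v in field.split("/"))))

slow = parse_timers("4519/0/73/50215/54809")   # first Haproxy.log line
print(slow["Tr"])   # 50215 -- Varnish took 50s to respond

gone = parse_timers("4299/0/27/-1/64330")      # first Haproxy.err line
print(gone["Tr"])   # -1 -- no response at all, hence the 504
```

Read this way, the excerpts show connect times (Tc) and server response times (Tr) blowing up while request receipt (Tq) stays in the single-digit seconds.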


Here is the logs from running haproxy with varnish as a backend on the local 
machine:

Haproxy.log
Nov  1 20:00:52 127.0.0.1 haproxy[18953]: x.x.x.x:38552 
[01/Nov/2011:20:00:45.157] varnish varnish/local_varnish 7513/0/0/0/7513 200 
1725 - -  4/3/0/1/0 0/0 GET /assets/images/create/action_btn.png HTTP/1.1
Nov  1 20:00:54 127.0.0.1 haproxy[18953]: x.x.x.x:40850 
[01/Nov/2011:20:00:48.219] varnish varnish/local_varnish 6524/0/0/0/6524 200 
1725 - -  2/1/0/1/0 0/0 GET /assets/images/create/action_btn.png HTTP/1.1

Haproxy.err
Nov  1 20:00:38 127.0.0.1 haproxy[18953]: x.x.x.x:39649 
[01/Nov/2011:20:00:08.665] varnish varnish/local_varnish 7412/0/22090/23/29525 
200 1511 - -  15700/15698/267/268/1 0/0 GET 
/assets/images/icons/favicon.ico HTTP/1.1
Nov  1 20:00:38 127.0.0.1 haproxy[18953]: x.x.x.x:54565 
[01/Nov/2011:20:00:12.255] varnish varnish/local_varnish 3823/0/22090/23/25936 
200 1511 - -  15700/15698/266/267/1 0/0 GET 
/assets/images/icons/favicon.ico HTTP/1.1

I see in all these tests that haproxy-stats shows 1% idle, but top shows 
that haproxy is using ~70% CPU?

The jungle aka Amazon and its internal network add a lot of latency 
when running varnish on an external machine. Response times improve 
(by ~0.5s) when running varnish locally.
But there are still very high response times in haproxy.err even when varnish 
runs locally?

I have played around with sysctl values and found some that improved 
performance.
My feeling is that I need to tune some more values in order to get beyond this 
level. Suggestions?
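Erik doesn't say which sysctls he changed; purely as a hedged reference, these are knobs commonly raised for tens of thousands of concurrent front-end connections (illustrative values, not the settings actually used in this test):

```
# /etc/sysctl.conf -- values commonly raised for many concurrent
# connections (illustrative only, not the settings used in this test)

# widen the ephemeral port range
net.ipv4.ip_local_port_range = 1024 65535
# deepen the pending-handshake and accept queues
net.ipv4.tcp_max_syn_backlog = 16384
net.core.somaxconn = 16384
# per-CPU backlog of packets awaiting processing
net.core.netdev_max_backlog = 16384
# recycle FIN_WAIT sockets sooner
net.ipv4.tcp_fin_timeout = 15
```

Load the file with `sysctl -p` after editing.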

Kind Regards
Erik



Re: Haproxy timing issues

2011-11-01 Thread Baptiste
Hi,

First question: are you sure you're reaching the limit of
haproxy/varnish and not the limit of your client?
Mainly concerning the increasing response time.

How many CPUs do you have in your VM? Starting too many stud processes
could be counterproductive.
I doubt doing CPU affinity in a VM improves anything :)

Concerning the logs, the times we can see on your client side are very
high! Too high :)
3-4s for HAProxy to get the full request.

How are you running stud?
Which options? Are you using the build with emericbr's patches?
Are your requests reusing the same SSL session ID, or do you
negotiate a new one for each connection?

Have you checked your network statistics on both the client and server side?
netstat -in and netstat -s
Are there a lot of drops, retransmissions, congestion, etc.?
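A quick way to act on this advice is to grep the `netstat -s` counters for anything suggesting loss. The sample output below is abridged and invented for illustration; in practice you would feed in the real command output.

```python
# Flag netstat -s counters that hint at packet loss or overload:
# retransmissions, drops, failures, errors. SAMPLE is invented,
# abridged output for illustration; feed real `netstat -s` text instead.
import re

SAMPLE = """\
Tcp:
    148336 active connections openings
    1520 failed connection attempts
    5120 segments retransmited
Udp:
    12 packet receive errors
"""

def loss_counters(text):
    hits = {}
    for line in text.splitlines():
        num = re.search(r"\d+", line)
        if num and re.search(r"retransmi|drop|failed|error", line, re.I):
            hits[line.strip()] = int(num.group())
    return hits

counters = loss_counters(SAMPLE)
for line, count in counters.items():
    print(count, "->", line)
```

Rapidly growing retransmission or drop counters between two runs of `netstat -s` would point at the network rather than haproxy or Varnish.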

On your last log line, we can see that HAProxy took 22s to establish a
TCP connection to your Varnish...

Can you share your stud, haproxy, and varnish configurations, the
version of each piece of software, and the startup parameters for Varnish?
What kind of tool do you use on the client to run your load test?
Which sysctls have you already tuned?


Unfortunately, the Aloha does not run on Amazon :)


cheers,


On Tue, Nov 1, 2011 at 9:16 PM, Erik Torlen erik.tor...@apicasystem.com wrote:
 Hi,

 I am currently (and have been off and on for the last few weeks) doing some 
 heavy load testing against haproxy, with stud in front of it handling the SSL.

 My load test has focused on SSL traffic through stud to haproxy on Amazon EC2.

 Our current problem is that we cannot get more than ~30k active connections 
 (~150 conns/s) before we start to see increased response times (10-60s) on 
 the
 client side. Running with 38k connections now and seeing much higher response 
 times.

 The setup is:
 1 instance running haproxy + stud
 2 instances running Varnish, serving 3 cached images

 Varnish has 100% cache hit ratio so nothing goes to the backend.

 We have tried both m1.xlarge and c1.xlarge. The m1.xlarge uses almost 
 100% CPU during the load tests, while c1.xlarge has a lot of resources left 
 (stud using a few percent per process) and haproxy at ~60-70% CPU.
 The only difference is that c1.xlarge gives noticeably better response times 
 before the actual problem (the climbing response times) appears.

 Haproxy is running with nbproc=1
 Stud is running with n=6 and a shared session cache. (Tried it with n=3 as well.)

 From the haproxy logs I could see the time it takes to establish a 
 connection to the backend and receive the data:

 Haproxy.log
 Nov  1 18:40:35 127.0.0.1 haproxy[18511]: x.x.x.x:54113 
 [01/Nov/2011:18:39:40.273] varnish varnish/varnish1 4519/0/73/50215/54809 200 
 2715 - -  238/236/4/5/0 0/0 GET /assets/images/icons/elite_logo_beta.png 
 HTTP/1.1
 Nov  1 18:40:35 127.0.0.1 haproxy[18511]: x.x.x.x:55635 
 [01/Nov/2011:18:39:41.547] varnish varnish/varnish1 3245/0/81/50207/53535 200 
 1512 - -  238/236/3/4/0 0/0 GET /assets/images/icons/favicon.ico 
 HTTP/1.1
 ...
 Nov  1 18:40:44 127.0.0.1 haproxy[18511]: x.x.x.x:34453 
 [01/Nov/2011:18:39:25.330] varnish varnish/varnish1 3082/0/225/32661/79559 
 200 1512 - -  234/232/1/2/0 0/0 GET /assets/images/icons/favicon.ico 
 HTTP/1.1
 Nov  1 18:40:44 127.0.0.1 haproxy[18511]: x.x.x.x:53731 
 [01/Nov/2011:18:39:25.036] varnish varnish/varnish1 3377/0/216/32669/79854 
 200 1725 - -  233/231/0/1/0 0/0 GET /assets/images/create/action_btn.png 
 HTTP/1.1

 Haproxy.err (NOTE: 504 error here)

 Nov  1 18:40:11 127.0.0.1 haproxy[18511]: x.x.x.x:34885 
 [01/Nov/2011:18:39:07.597] varnish varnish/varnish1 4299/0/27/-1/64330 504 
 194 - - sH-- 10916/10914/4777/2700/0 0/0 GET 
 /assets/images/icons/favicon.ico HTTP/1.1
 Nov  1 18:40:12 127.0.0.1 haproxy[18511]: x.x.x.x:58878 
 [01/Nov/2011:18:39:12.621] varnish varnish/varnish2 314/0/55/-1/60374 504 194 
 - - sH-- 3692/3690/3392/1623/0 0/0 GET /assets/images/icons/favicon.ico 
 HTTP/1.1

 Nov  1 18:40:18 127.0.0.1 haproxy[18511]: x.x.x.x:35505 
 [01/Nov/2011:18:39:42.670] varnish varnish/varnish1 3515/0/22078/10217/35811 
 200 1512 - -  1482/1481/1238/710/1 0/0 GET 
 /assets/images/icons/favicon.ico HTTP/1.1
 Nov  1 18:40:18 127.0.0.1 haproxy[18511]: x.x.x.x:40602 
 [01/Nov/2011:18:39:42.056] varnish varnish/varnish1 4126/0/22081/10226/36435 
 200 1512 - -  1475/1474/1231/703/1 0/0 GET 
 /assets/images/icons/favicon.ico HTTP/1.1


 Here is the logs from running haproxy with varnish as a backend on the local 
 machine:

 Haproxy.log
 Nov  1 20:00:52 127.0.0.1 haproxy[18953]: x.x.x.x:38552 
 [01/Nov/2011:20:00:45.157] varnish varnish/local_varnish 7513/0/0/0/7513 200 
 1725 - -  4/3/0/1/0 0/0 GET /assets/images/create/action_btn.png 
 HTTP/1.1
 Nov  1 20:00:54 127.0.0.1 haproxy[18953]: x.x.x.x:40850 
 [01/Nov/2011:20:00:48.219] varnish varnish/local_varnish 6524/0/0/0/6524 200 
 1725 - -  2/1/0/1/0 0/0 GET /assets/images/create/action_btn.png 
 HTTP/1.1

 Haproxy.err
 Nov