Hi Willy,

Le 24/01/2016 18:03, Willy Tarreau a écrit :
Hi Cyril,

On Sat, Jan 23, 2016 at 11:55:27PM +0100, Cyril Bonté wrote:
 From my first tests, I can see a performance drop  between commit
5506e3f8 and e583ea58 (before 1.6-dev1).

Here is the sample configuration I use :
global
   tune.ssl.default-dh-param 2048

listen test
   mode http
   bind :9443 ssl crt localhost.pem
   server s localhost:80


At commit 5506e3f8 (BUG/MINOR: stats: correctly set the request/response
analysers)

$ ab -n1000 -c1 -k https://127.0.0.1:9443/1mb
Time taken for tests: 2.403 seconds
Requests per second:    416.08 [#/sec] (mean)

The next commit (5be2f352 MAJOR: polling: centralize calls to I/O
callbacks) seems to break keep-alive, I can't test any commit until
e583ea58 where I have this result (still true with current 1.7 head) :

$ ab -n1000 -c1 -k https://127.0.0.1:9443/1mb
Time taken for tests:   2.933 second
Requests per second:    340.97 [#/sec] (mean)

So with exactly the same procedure and same config, I get the same
performance from both versions. But I have found something odd even
with 1.5 that I cannot explain for now. I initially ran the test on
an atom board which has the benefit of easily being CPU-bound and
not network bound and the numbers were the same with a 3-digit
precision for 1.5, 1.6 and 1.7 along multiple tests.

Then I tried on my laptop (everything local) and noticed varying
performances between two consecutive runs with the same version. I
first attributed this to the shitty intel-pstate which permanently
changes the CPU's frequency regardless of the governor, but after
I managed to stabilize the frequency (to the lowest available one),
I found that all versions were still showing random performance
levels varying from 1 to 3 on exactly the same test. Gprof is
useless there due to the short time spent in the various functions.
It's even visible between multiple output lines from "ab".

I'm also testing on my laptop on localhost to avoid some network side effects. Here, the tests are consistent, I always have the same performance for a given version.

Under stace, it's even worse. The number of calls to sendto() for
a single run varies between 1000 and 199000, without any error nor
EAGAIN. And the number of recvfrom() varies from 129001 to 519011,
which really doesn't make sense. read() and write() are perfectly
stable though, at 3030 and 65005 respectively. They're called from
openssl.

Using ltrace, I see an interesting pattern :

With haproxy 1.5, I get :
1453658897.664685 recv(7, 0xec50b4, 8192, 0)     = 8192
1453658897.664847 __errno_location()             = 0x7fa04fdd1690
1453658897.664957 recv(7, 0xec70b4, 8192, 0)     = 8192
1453658897.665115 SSL_write(0xeb83d0, 0xec50b4, 0x4000, 0x4000 <unfinished ...>
1453658897.665221 malloc(16472)                  = 0xec90d0
1453658897.665375 free(0xec90d0)                 = <void>
1453658897.665486 <... SSL_write resumed> )      = 0x4000
1453658897.665519 gettimeofday(0x7217d0, 0)      = 0
1453658897.665634 epoll_wait(3, 0xeac440, 200, 0) = 0

Where haproxy 1.7 gives :
1453658760.290096 recv(7, 0x173ae84, 0x3c08, 0)  = 0x3c08
1453658760.290271 __errno_location()             = 0x7fcabf9fa690
1453658760.290382 recv(7, 0x173ea8c, 1024, 0)    = 1024
1453658760.290545 SSL_write(0x17768b0, 0x173ae84, 0x4008, 0x4008 <unfinished ...>
1453658760.290654 malloc(16472)                  = 0x17773c0
1453658760.290818 <... SSL_write resumed> )      = 0x4000
1453658760.290868 SSL_write(0x17768b0, 0x173ee84, 8, 8 <unfinished ...>
1453658760.291012 free(0x17773c0)                = <void>
1453658760.291129 <... SSL_write resumed> )      = 8
1453658760.291162 gettimeofday(0x76b340, 0)      = 0
1453658760.291287 epoll_wait(3, 0x1775f40, 200, 0) = 0

Here we have 2 consecutive recv of 0x3c08 (15368) bytes then 1024 bytes = 15392 bytes. SSl_Write then is able to send 16384 bytes and needs to send the 8 remaining bytes in a 2nd call.


commit 5506e3f8 behaves like haproxy-1.5
commit e583ea58 like haproxy-1.7

I'm now investigating this, I don't understand how such syscalls
can be called as often. Maybe openssl releases only a few bytes
from the buffer allowing a few bytes read to be done, I don't
know. But I don't like this, to say the least.

I think we're close to the same conclusion ;-)


--
Cyril Bonté

Reply via email to