Hello, Willy.

Yes, this is a real machine, running FreeBSD 9 x64.

It is a dual Xeon E5-2650 (so we have 16 physical cores and 32 threads to
use here).

We are talking about approximately 100 kpps (input) and 140 kpps (output).

Here is the "vmstat 1" result:

procs      memory      page                    disks     faults                  cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr da0 pa0     in      sy      cs us sy id
 7 0 0   4818M    35G   643   0   0   0   714   0   0   0   4977    1364    5996  8 25 67
 3 0 0   4818M    35G   224   0   0   0   174   0   0   0  42698  355001  170303  8 22 71
 3 0 0   4818M    35G   177   0   0   0   174   0   0   0  28715  383061  138108  7 23 69
 4 0 0   4818M    35G   173   0   0   0   174   0   0   0  28342  375281  138067  8 24 69
 5 0 0   4818M    35G   185   0   0   0   174   0   0   0  32900  372294  148576  7 21 71
 5 0 0   4818M    35G   372   0   0   0   174   0   0   0  29112  364030  138826  7 25 68
 4 0 0   4818M    35G   159   0   0   0   174   0   0   0  34102  368835  150530  9 22 70
 4 0 0   4818M    35G   362   0   0   0   174   0   0   0  39928  366139  165853  8 21 71
 3 0 0   4818M    35G   220   0   0   0   174   0   0   0  39195  371933  163533  8 21 71
 6 0 0   4818M    35G   262   0   0   0   174   0   0   0  42681  354697  172687  8 21 71
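
To make the samples above easier to compare, here is a minimal Python sketch that averages the "faults" columns (in, sy, cs). The values are copied by hand from the output above; the first row is left out on the assumption that, as usual for vmstat, it reports averages since boot rather than a one-second sample. The variable names are only illustrative.

```python
# Per-second samples copied from the vmstat output above (first row skipped,
# assuming it holds since-boot averages rather than a live sample).
samples = [
    # (interrupts, syscalls, context switches)
    (42698, 355001, 170303),
    (28715, 383061, 138108),
    (28342, 375281, 138067),
    (32900, 372294, 148576),
    (29112, 364030, 138826),
    (34102, 368835, 150530),
    (39928, 366139, 165853),
    (39195, 371933, 163533),
    (42681, 354697, 172687),
]

def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)

mean_in = mean(s[0] for s in samples)
mean_sy = mean(s[1] for s in samples)
mean_cs = mean(s[2] for s in samples)

print(f"interrupts/s: {mean_in:.0f}")
print(f"syscalls/s:   {mean_sy:.0f}")
print(f"ctx switch/s: {mean_cs:.0f}")
```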

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: Monday, October 28, 2013 20:58
To: Fred Pedrisa
Cc: 'Lukas Tribus'; haproxy@formilux.org
Subject: Re: RES: RES: RES: RES: RES: RES: High CPU Usage (HaProxy)

Hello Fred,

On Mon, Oct 28, 2013 at 10:02:15AM -0200, Fred Pedrisa wrote:
> Hello, Willy.
> 
> As you said, take a look :
> 
> getsockopt(0x12e,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(302,"\^D\0\^V0\0\0^z\M-L-\a\0d8\0\0"...,926,0x80,NULL,0x0) = 926 (0x39e)
> recvfrom(682,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,8030,0x0,NULL,0x0) = 988 (0x3dc)
> recvfrom(682,0x801f3545c,7042,0x0,0x0,0x0)       ERR#35 'Resource temporarily unavailable'
> getsockopt(0x2a9,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(681,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,988,0x80,NULL,0x0) = 988 (0x3dc)
> recvfrom(1428,"\^N\0!\M-0\0\0\M-\\M^_\M-H-\^AoU"...,8030,0x0,NULL,0x0) = 444 (0x1bc)
> recvfrom(1428,0x8011b523c,7586,0x0,0x0,0x0)      ERR#35 'Resource temporarily unavailable'
> getsockopt(0x593,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(1427,"\^N\0!\M-0\0\0\M-\\M^_\M-H-\^AoU"...,444,0x80,NULL,0x0) = 444 (0x1bc)
> recvfrom(201,"\b\0\\0\0\0\M-=\M-]\M-G-\^O\0\0"...,8030,0x0,NULL,0x0) = 2627 (0xa43)
> recvfrom(201,0x800ec5ac3,5403,0x0,0x0,0x0)       ERR#35 'Resource temporarily unavailable'
> getsockopt(0xbf,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(191,"\b\0\\0\0\0\M-=\M-]\M-G-\^O\0\0"...,2627,0x80,NULL,0x0) = 2627 (0xa43)
> recvfrom(888,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,8030,0x0,NULL,0x0) = 1226 (0x4ca)
> recvfrom(888,0x801ee354a,6804,0x0,0x0,0x0)       ERR#35 'Resource temporarily unavailable'
> getsockopt(0x377,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(887,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,1226,0x80,NULL,0x0) = 1226 (0x4ca)
> recvfrom(674,"\f\0\M-=\M-0\0\0\M^K}\M-#-d\r\0"...,8030,0x0,NULL,0x0) = 982 (0x3d6)
> recvfrom(674,0x800f6f456,7048,0x0,0x0,0x0)       ERR#35 'Resource temporarily unavailable'
> getsockopt(0x2a1,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(673,"\f\0\M-=\M-0\0\0\M^K}\M-#-d\r\0"...,982,0x80,NULL,0x0) = 982 (0x3d6)
> recvfrom(1032,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,8030,0x0,NULL,0x0) = 1205 (0x4b5)
> recvfrom(1032,0x801ddb535,6825,0x0,0x0,0x0)      ERR#35 'Resource temporarily unavailable'
> getsockopt(0x407,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(1031,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,1205,0x80,NULL,0x0) = 1205 (0x4b5)
> recvfrom(1339,"\v\0tpDa\^A\^DV \0\0\^A\M^R\M^K"...,8030,0x0,NULL,0x0) = 68 (0x44)
> recvfrom(1339,0x8011790c4,7962,0x0,0x0,0x0)      ERR#35 'Resource temporarily unavailable'
> getsockopt(0x53c,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(1340,"\v\0tpDa\^A\^DV \0\0\^A\M^R\M^K"...,68,0x80,NULL,0x0) = 68 (0x44)
> recvfrom(913,"\v\0tpj\M-h\^A\^D\M-Q\^]\0\0\^A"...,8030,0x0,NULL,0x0) = 108 (0x6c)
> recvfrom(913,0x8019090ec,7922,0x0,0x0,0x0)       ERR#35 'Resource temporarily unavailable'
> getsockopt(0x392,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(914,"\v\0tpj\M-h\^A\^D\M-Q\^]\0\0\^A"...,108,0x80,NULL,0x0) = 108 (0x6c)
> recvfrom(166,"\^D\0\^V0\0\0\M-$\M^@\M-L-\^T\0p"...,8030,0x0,NULL,0x0) = 643 (0x283)
> recvfrom(166,0x800f13303,7387,0x0,0x0,0x0)       ERR#35 'Resource temporarily unavailable'
> 
> So yes, a lot of recv/send calls as you said before.

Yes, but they're not all that small. The average size looks like 0.5 to 1 kB.
That said, assuming you're dealing with 300 Mbps (about 40 MB/s) and, say,
500 bytes per message, this turns into 80k messages per second, each of
which requires:
  - 2 recvfrom()
  - 1 getsockopt()  (we can remove this one, 1.5 doesn't have it)
  - 1 sendto()

So 4 syscalls per message, resulting in 320k syscalls per second. It can
start to represent some CPU usage. But there's more. Such small messages are
transferred using TCP_NODELAY meaning that a TCP PUSH is set on each
outgoing packet and that each of them is immediately ACKed. So you get
80kpps per side in each direction, resulting in 320kpps as well. If you have
a firewall running on the system, it might take its share of load as well,
which is possibly attributed to the sending process on outgoing messages.
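
As a rough sanity check of the figures above, here is a minimal Python sketch of the arithmetic. The inputs are the rounded assumptions from the estimate (40 MB/s, 500 bytes per message), not measurements, and the variable names are only illustrative.

```python
# Back-of-the-envelope check of the estimate above (assumed inputs).
throughput_bytes_per_s = 40_000_000  # "about 40 MB/s", i.e. 300 Mbps rounded up
message_size_bytes = 500             # assumed average message size

# Syscalls issued per forwarded message, as listed above.
syscalls_per_message = {"recvfrom": 2, "getsockopt": 1, "sendto": 1}

messages_per_s = throughput_bytes_per_s // message_size_bytes
syscalls_per_s = messages_per_s * sum(syscalls_per_message.values())

# With TCP_NODELAY, each message is one data packet that is immediately ACKed:
# 2 packets per message on each of the 2 sides (client side and server side).
sides = 2
packets_per_message_per_side = 2  # one data packet + one ACK
packets_per_s = messages_per_s * sides * packets_per_message_per_side

print(f"messages/s: {messages_per_s}")
print(f"syscalls/s: {syscalls_per_s}")
print(f"packets/s:  {packets_per_s}")
```

Dropping the getsockopt() entry from the dictionary shows what removing that call would save under the same assumptions.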

That said, even with that in mind, I still consider that the system load is
high for the workload. Could you please share the output of "vmstat 1"
(just take the first 10 lines) ? Also, can you confirm that this is a real
machine and that we're not troubleshooting a VM ?

It could make sense to try 1.5 (latest snapshot), perhaps only for the most
heavily loaded process if that makes the test easier, and check whether its
CPU load drops.

Best regards,
Willy


