Hello Fred,

On Mon, Oct 28, 2013 at 10:02:15AM -0200, Fred Pedrisa wrote:
> Hello, Willy.
> 
> As you said, take a look :
> 
> getsockopt(0x12e,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(302,"\^D\0\^V0\0\0^z\M-L-\a\0d8\0\0"...,926,0x80,NULL,0x0) = 926
> (0x39e)
> recvfrom(682,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,8030,0x0,NULL,0x0) = 988
> (0x3dc)
> recvfrom(682,0x801f3545c,7042,0x0,0x0,0x0)       ERR#35 'Resource
> temporarily unavailable'
> getsockopt(0x2a9,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(681,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,988,0x80,NULL,0x0) = 988
> (0x3dc)
> recvfrom(1428,"\^N\0!\M-0\0\0\M-\\M^_\M-H-\^AoU"...,8030,0x0,NULL,0x0) = 444
> (0x1bc)
> recvfrom(1428,0x8011b523c,7586,0x0,0x0,0x0)      ERR#35 'Resource
> temporarily unavailable'
> getsockopt(0x593,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(1427,"\^N\0!\M-0\0\0\M-\\M^_\M-H-\^AoU"...,444,0x80,NULL,0x0) = 444
> (0x1bc)
> recvfrom(201,"\b\0\\0\0\0\M-=\M-]\M-G-\^O\0\0"...,8030,0x0,NULL,0x0) = 2627
> (0xa43)
> recvfrom(201,0x800ec5ac3,5403,0x0,0x0,0x0)       ERR#35 'Resource
> temporarily unavailable'
> getsockopt(0xbf,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(191,"\b\0\\0\0\0\M-=\M-]\M-G-\^O\0\0"...,2627,0x80,NULL,0x0) = 2627
> (0xa43)
> recvfrom(888,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,8030,0x0,NULL,0x0) = 1226
> (0x4ca)
> recvfrom(888,0x801ee354a,6804,0x0,0x0,0x0)       ERR#35 'Resource
> temporarily unavailable'
> getsockopt(0x377,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(887,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,1226,0x80,NULL,0x0) = 1226
> (0x4ca)
> recvfrom(674,"\f\0\M-=\M-0\0\0\M^K}\M-#-d\r\0"...,8030,0x0,NULL,0x0) = 982
> (0x3d6)
> recvfrom(674,0x800f6f456,7048,0x0,0x0,0x0)       ERR#35 'Resource
> temporarily unavailable'
> getsockopt(0x2a1,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(673,"\f\0\M-=\M-0\0\0\M^K}\M-#-d\r\0"...,982,0x80,NULL,0x0) = 982
> (0x3d6)
> recvfrom(1032,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,8030,0x0,NULL,0x0) =
> 1205 (0x4b5)
> recvfrom(1032,0x801ddb535,6825,0x0,0x0,0x0)      ERR#35 'Resource
> temporarily unavailable'
> getsockopt(0x407,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(1031,"\^S\0W0\0\0\M-,\^?\M-L-\^P\0\^E@"...,1205,0x80,NULL,0x0) = 1205
> (0x4b5)
> recvfrom(1339,"\v\0tpDa\^A\^DV \0\0\^A\M^R\M^K"...,8030,0x0,NULL,0x0) = 68
> (0x44)
> recvfrom(1339,0x8011790c4,7962,0x0,0x0,0x0)      ERR#35 'Resource
> temporarily unavailable'
> getsockopt(0x53c,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(1340,"\v\0tpDa\^A\^DV \0\0\^A\M^R\M^K"...,68,0x80,NULL,0x0) = 68
> (0x44)
> recvfrom(913,"\v\0tpj\M-h\^A\^D\M-Q\^]\0\0\^A"...,8030,0x0,NULL,0x0) = 108
> (0x6c)
> recvfrom(913,0x8019090ec,7922,0x0,0x0,0x0)       ERR#35 'Resource
> temporarily unavailable'
> getsockopt(0x392,0xffff,0x1007,0x7fffffffdb94,0x7fffffffdb90,0x0) = 0 (0x0)
> sendto(914,"\v\0tpj\M-h\^A\^D\M-Q\^]\0\0\^A"...,108,0x80,NULL,0x0) = 108
> (0x6c)
> recvfrom(166,"\^D\0\^V0\0\0\M-$\M^@\M-L-\^T\0p"...,8030,0x0,NULL,0x0) = 643
> (0x283)
> recvfrom(166,0x800f13303,7387,0x0,0x0,0x0)       ERR#35 'Resource
> temporarily unavailable'
> 
> So yes, a lot of recv/send calls as you said before.

Yes, but they're not all that small. The average size looks like 0.5 or 1 kB.
That said, assuming you're dealing with 300 Mbps (about 40 MB/s) and say
500 bytes per message, this turns into 80k messages per second, each of
which requires:
  - 2 recvfrom()
  - 1 getsockopt()  (we can remove this one, 1.5 doesn't have it)
  - 1 sendto()

So 4 syscalls per message, resulting in 320k syscalls per second. That can
start to represent some CPU usage. But there's more. Such small messages
are transferred using TCP_NODELAY, meaning that a TCP PUSH is set on each
outgoing packet and that each of them is immediately ACKed. So you get
80 kpps per side in each direction, resulting in 320 kpps as well. If you
have a firewall running on the system, it might take its share of the load
as well, and that share is possibly attributed to the sending process for
outgoing messages.
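
To make the arithmetic above easy to redo with your real numbers, here is a
rough sketch in Python. The function name and its parameters are made up for
the illustration (nothing here comes from haproxy), and the inputs are only
the assumptions stated above: about 40 MB/s of traffic, 500 bytes per
message, 4 syscalls per message, and one data packet plus one ACK per
message on each side.

```python
# Back-of-the-envelope estimate of syscall and packet rates.
# estimate_load() and its parameter names are hypothetical helpers,
# used only to reproduce the figures discussed in this mail.

def estimate_load(bytes_per_sec, avg_msg_size, syscalls_per_msg=4):
    """Return (messages/s, syscalls/s, packets/s) for a forwarded stream."""
    msgs_per_sec = bytes_per_sec / avg_msg_size
    # Assumed per message: 2 recvfrom() + 1 getsockopt() + 1 sendto()
    syscalls_per_sec = msgs_per_sec * syscalls_per_msg
    # Assumed per message: one data packet and one immediate ACK,
    # on each of the two sides (client side and server side).
    packets_per_sec = msgs_per_sec * 2 * 2
    return msgs_per_sec, syscalls_per_sec, packets_per_sec


if __name__ == "__main__":
    # Try both ends of the estimated average message size.
    for size in (500, 1000):
        msgs, syscalls, pps = estimate_load(40_000_000, size)
        print(f"{size} B/msg: {msgs:,.0f} msg/s, "
              f"{syscalls:,.0f} syscalls/s, {pps:,.0f} pps")
```

With 500-byte messages this gives the 80k messages, 320k syscalls and
320 kpps mentioned above; with 1 kB messages all three figures are halved.
Passing syscalls_per_msg=3 models the case without the getsockopt() call.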

That said, even with that in mind, I still think the system load is high
for this workload. Could you please share the output of "vmstat 1" (just
take the first 10 lines)? Also, can you confirm that this is a real
machine and that we're not troubleshooting a VM?

It could make sense to try 1.5 (latest snapshot), maybe only for the most
heavily loaded process if that makes the test easier, and check whether
its CPU load drops.

Best regards,
Willy

