On Sunday, December 16, 2007 Alex Pankratov wrote:

> Admittedly, I didn't run the test over a one-gig link, but
> still the discrepancy between your findings and my results
> is quite a bit odd.
I'm not sure what's happening at 100 MBit/s. The real fun did not even
start until I was well over that number - the original problem was to
somehow send a gigabit per second of non-fragmented, non-jumbo UDP
packets. All my testing without a gigabit interface was done over a
loopback connection on my home 2001 machine, so it was probably not all
that relevant (though the basic patterns were more or less similar, if
I remember correctly).

> * the execution time of sendto() on my machine clearly depends on
>   the size of the packet, and it is virtually the same for blocking
>   and non-blocking sockets.
>
>       bytes    microseconds
>       256      25
>       1024     87
>       4096     345
>       16384    1370

In the gigabit scenario you have to be under 10 microseconds per 1-KB
packet in order to fill the link to capacity. On our 1000-1400 byte
packets this time was several times higher than 10 microseconds, and
it was this time that was the performance bottleneck when a single
thread was doing all the sending. If multiple threads were sending
data, the CPU was maxing out and becoming the bottleneck instead.

Not sure how much of this would be noticeable with a 100-MBit link -
the times above are just about what you'd expect from a decently
behaving sendto(), whereas in our case both sendto() and CPU usage
behaved themselves fairly indecently. The only way to drop the
sendto() time to the desired ~10-mcs value was to use FIONBIO.

> * sendto() on a non-blocking socket does fail with WOULDBLOCK. The
>   larger the packet size, the more frequently it fails. With the
>   test code described below, the failure rate was 20% for 1024
>   byte packets and nearly 95% for 2048 byte ones.
> ...
> The test is a simple busy loop (i.e. without any sleeping)
> that calls sendto() and (conditionally) select() to wait for
> the socket's writability if sendto() fails with WOULDBLOCK.

Yes; this is not a surprise.
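For what it's worth, the ~10-microsecond budget above falls out of a
back-of-the-envelope calculation (payload bits only, ignoring headers
and inter-frame gaps; the function name is just for illustration):

```python
def budget_us(link_bits_per_s, packet_bytes):
    """Microseconds available per packet to keep the link full."""
    packets_per_s = link_bits_per_s / (packet_bytes * 8)
    return 1e6 / packets_per_s

# 1 Gbit/s with 1024-byte packets: just over 8 us per sendto() call.
print(round(budget_us(1e9, 1024), 1))   # 8.2
# 100 Mbit/s with the same packets: ~82 us, which is roughly the
# 87 us per call measured in the table above.
print(round(budget_us(1e8, 1024), 1))   # 81.9
```

So a 100-MBit link tolerates a sendto() that is an order of magnitude
slower before the call time, rather than the wire, becomes the limit.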
In the 100-MBit case it is not a problem to generate enough packets to
saturate the link, so naturally the non-blocking sendto() would return
WOULDBLOCK frequently. In our gigabit scenario the problem was to
generate enough packets to use the link to capacity, and without
FIONBIO one thread could not do it because of the excessive sendto()
time, and multiple threads could not because of the CPU saturation.

Best wishes -
S.Osokine.
17 Dec 2007.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alex
Pankratov
Sent: Sunday, December 16, 2007 11:23 PM
To: 'theory and practice of decentralized computer networks'
Subject: Re: [p2p-hackers] MTU in the real world

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of
> Serguei Osokine
> Sent: Sunday, December 16, 2007 6:15 PM
> To: theory and practice of decentralized computer networks
> Subject: Re: [p2p-hackers] MTU in the real world
>
> On Tuesday, May 31, 2005 Serguei Osokine wrote:
> > We tried to use UDP to transfer stuff over a gigabit LAN inside
> > the cluster. Pretty soon we discovered that with small (~1500 byte)
> > packets the CPU was the bottleneck, because you can send only so
> > many packets per second, and the resulting throughput was nowhere
> > close to a gigabit.
> > ...
> > (Datagrams smaller than MTU sucked performance-wise when compared
> > to TCP, but that is another story - gigabit cards tend to offload
> > plenty of TCP functionality from the CPU, so it was not that the
> > UDP was particularly bad, but rather that TCP performance was very
> > good.)
>
> An update for anyone who still cares after two and a half years:
> it turns out that UDP *was* particularly bad. We discovered this
> almost by accident, and it looks like a Windows problem - probably
> in the WinSock UDP implementation.
> As it turned out, the CPU percent vs sending rate chart has a
> clear 'hockey stick' shape - CPU use is zero until some middle point,
> and then it starts to grow linearly, which is already unexpected by
> itself. What's even funnier, the sendto() call time is always the
> same regardless of the sending rate (controlled by sleeps between
> sends) and regardless of the CPU usage percent, and it is this time
> that is limiting the single-thread sending performance.

Thanks for sharing this information, this is very interesting. In fact
so interesting that I had to run a few tests ASAP, which I just did.
Unfortunately I cannot reproduce your findings -

* the execution time of sendto() on my machine clearly depends on
  the size of the packet, and it is virtually the same for blocking
  and non-blocking sockets.

      bytes    microseconds
      256      25
      1024     87
      4096     345
      16384    1370

* sendto() on a non-blocking socket does fail with WOULDBLOCK. The
  larger the packet size, the more frequently it fails. With the
  test code described below, the failure rate was 20% for 1024
  byte packets and nearly 95% for 2048 byte ones.

* CPU usage patterns of the sendto() loop in the blocking and
  non-blocking cases are virtually the same.

The test is a simple busy loop (i.e. without any sleeping) that calls
sendto() and (conditionally) select() to wait for the socket's
writability if sendto() fails with WOULDBLOCK.

CPU was maxed when the packet size was between 16 and 256 bytes. The
usage dropped to 70-80% for 512-2048 byte packets, to 30% for 4096
byte ones, to 20% for 16K, and to 15% for 64K. The network link
utilization was close to 100% at all times.

I tested over 100-MBit cable and 802.11g wireless connections. I have
also tested the case when the recipient had a receiving socket open,
and the case when it did not (thus generating a backflow of ICMPs).

Admittedly, I didn't run the test over a one-gig link, but still the
discrepancy between your findings and my results is quite a bit odd.
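The busy loop described above can be sketched roughly as follows (a
Python/POSIX-flavoured analogue of the WinSock test, not the original
code; on Windows the socket would be made non-blocking with
ioctlsocket(FIONBIO) and the failure would be WSAEWOULDBLOCK, while
Python's setblocking(False) and BlockingIOError play the same roles
here; the loopback target and packet size are placeholders):

```python
import select
import socket

def send_loop(target=("127.0.0.1", 9), packet_bytes=1024, count=1000):
    """Busy-loop sender: sendto() on a non-blocking UDP socket, falling
    back to select() whenever the send buffer is full."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)            # FIONBIO equivalent
    payload = b"x" * packet_bytes
    sent = would_block = 0
    while sent < count:
        try:
            sock.sendto(payload, target)
            sent += 1
        except BlockingIOError:        # WOULDBLOCK: buffer is full
            would_block += 1
            # Wait (without sleeping) until the socket is writable.
            select.select([], [sock], [])
    sock.close()
    return sent, would_block
```

Over loopback the WOULDBLOCK branch is rarely taken; the failure rates
quoted above (20% at 1024 bytes, ~95% at 2048) come from a saturated
physical link, where the send buffer fills faster than it drains.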
Is there any chance your test machine was running some sort of 3rd
party firewall or, perhaps, a network monitor? If it was a socket- or
TDI-level filter, this could explain the constant sendto() execution
time and other observations.

Alex

_______________________________________________
p2p-hackers mailing list
[email protected]
http://lists.zooko.com/mailman/listinfo/p2p-hackers
