unruh <un...@invalid.ca> wrote:
> On 2012-06-08, Rick Jones <rick.jon...@hp.com> wrote:
> > I would suggest then trying disabling of the interrupt coalescing
> > via ethtool on the 1GbE NIC of your server and a few select
> > clients and see what that does. If things start to look cleaner
> > then you know it is an implementation-specific detail of one or
> > more GbE NICs.
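
For anyone following along, here is a rough sketch of the ethtool invocations involved, wrapped in Python so they can be scripted across a few hosts. The interface name "eth0" and the helper names are placeholders of mine, not anything from this thread. Which coalescing parameters a given driver accepts through `ethtool -C` varies, so treat the "disable" command as a starting point and not as something guaranteed to work on every NIC.

```python
import shutil
import subprocess


def query_commands(iface):
    """Commands that show link settings, driver info and coalescing settings."""
    return [
        ["ethtool", iface],        # link settings
        ["ethtool", "-i", iface],  # driver name and version
        ["ethtool", "-c", iface],  # current interrupt coalescing settings
    ]


def disable_coalescing_command(iface):
    """One plausible way to ask for coalescing to be turned off.

    Assumption: the driver accepts adaptive-rx and rx-usecs. Some drivers
    reject one or both, and some use other parameters entirely.
    """
    return ["ethtool", "-C", iface, "adaptive-rx", "off", "rx-usecs", "0"]


def run_if_available(cmd):
    """Run cmd and return its combined output, or None if the binary is missing."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout + result.stderr
```

Calling `run_if_available(cmd)` for each entry of `query_commands("eth0")` collects the three outputs in one go. Changing settings with `-C` normally needs root, and it is worth re-running `ethtool -c` afterwards to confirm what the driver actually applied.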
> It looks to me that interrupt coalescing is not enabled according to
> ethtool.

I'd like to see the full output of ethtool, ethtool -i and ethtool -c
for your interfaces if I may. Feel free to send as direct email if
you prefer.

> > If it is possible to connect a client "back-to-back" to your server at
> > the same time (via a second port) - still with interrupt coalescing
> > disabled at both ends that would be an excellent addition. That will
> > help evaluate the switch.
> >
> > I trust there were no OS changes when going from 100BT to GbE? Though
> > even if not, there is still the prospect of the drivers for the 100BT
> > cards not doing what linux calls "napi" and the drivers for the GbE
> > cards doing it, which may introduce some timing changes.

> What is napi?

NAPI is a mechanism whereby interrupts on a NIC get disabled and
packets are instead polled for over a certain length of time.

http://www.linuxfoundation.org/collaborate/workgroups/networking/napi
http://en.wikipedia.org/wiki/New_API

> > > So yes, I think it is the Gb technology that is causing trouble.
> >
> > I split what may seem a hair between Gb technology being the IEEE
> > specification and Gb implementation being what specific NIC vendors
> > do. So, to me, interrupt coalescing is implementation not technology.

> For me, I do not care which it is, it is all Gb.

I suspect that my caring about Gb technology/specification vs Gb
implementation may be not all that far from a timekeeper's desire to
distinguish between accuracy and precision, even when laypeople start
to mix the two :)

> Note that on one of the clients, there are two separate clusters of
> roundtrip delays, one from .15 to about .4ms, and the other from
> about 1.3 to 1.6 ms. The slope within each cluster is as above but
> the slope between the clusters is the opposite. Ie, within the
> cluster, the client to server is being delayed, while the clusters
> are due to a huge delay in the server to client.
> (if I have the signs right)

> In http://www.theory.physics.ubc.ca/scatter/scatter.html I have the
> scatter plots (offset vs return time) for two clients to two
> different servers. One of the servers is a Gb server, while the
> other is a 100Mb server. Both servers are disciplined by a GPS PPS
> device. The offset fluctuations on both servers are about 4 us, so
> none of the offset fluctuations come from the server clocks
> themselves.

It would be good to include the specific card name and driver rev etc
in subsequent writeups. Over the years there have been several Intel
gigabit cards and 100BT cards. I believe just about all the Intel GbE
cards have had support for interrupt coalescing in some form or
another. At least those which have crossed my path.

rick jones

lspci -v can help if you don't already know the card name(s)

-- 
It is not a question of half full or empty - the glass has a leak.
The real question is "Can it be patched?"
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
_______________________________________________
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions