I'm going to sum up what I've discovered while investigating this bug report.

Issue

ifconfig and iproute2 sometimes show different numbers for network statistics (such as sent and received packets, bytes, etc.).

Cause

The statistics are stored as "unsigned long" in the kernel. Variables of type "unsigned long" are either 32 or 64 bits wide depending on the architecture, so the problem only shows up with a 64-bit kernel (regardless of whether userspace is 32 or 64 bits).

ifconfig gets its statistics from /proc/net/dev, where they are exposed as _text_. iproute2 gets its statistics from a binary "netlink" interface. When the statistics are exported in binary form, kernel and userspace have to agree on how large an "unsigned long" is, and they may not: you can run a 64-bit kernel (where unsigned long is 64 bits) with a 32-bit userland (where it is 32 bits and would overflow).

The kernel has a fixed-size unsigned 32-bit type called "u32" to avoid this kind of problem. For some reason (perhaps to avoid bloating 32-bit architectures?) the netlink interface that iproute2 uses was defined with u32 instead of "u64", which would have enough room on both types of architecture (but would be wasteful on 32-bit ones). As a result, the numbers iproute2 receives have always rolled over at 32 bits, even if the kernel's unsigned long is 64 bits.

Problem

There really isn't a problem. Any program using these statistics needs to cope with rollovers, which will (eventually) happen for both 32-bit and 64-bit statistics. It is unfortunate, however, that ifconfig and iproute2 differ in their statistics, which is confusing for people who are not aware of the internals. The numbers from ifconfig and iproute2 can't be compared (unless the ifconfig numbers are post-processed to roll over at 32 bits as well).
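
To illustrate the post-processing mentioned above, here is a minimal Python sketch. It is not taken from ifconfig or iproute2; the function names and the sample /proc/net/dev text are made up for illustration. It assumes the usual /proc/net/dev layout of two header lines followed by one "name: 16 counters" line per interface, with receive bytes in the first column and transmit bytes in the ninth.

```python
MASK32 = 0xFFFFFFFF  # largest value a 32-bit counter can hold (2**32 - 1)

# Made-up sample in the /proc/net/dev layout (values are hypothetical).
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 5000000000 4000000 0 0 0 0 0 0 1234 10 0 0 0 0 0 0
"""


def wrap32(counter):
    """Reduce a counter to the value a 32-bit field would hold."""
    return counter & MASK32


def delta32(previous, current):
    """Difference between two samples of a 32-bit counter.

    Correct as long as the counter rolled over at most once in between.
    """
    return (current - previous) & MASK32


def parse_rx_tx_bytes(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev-style text."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, _, rest = line.partition(":")
        fields = rest.split()
        stats[name.strip()] = (int(fields[0]), int(fields[8]))
    return stats


if __name__ == "__main__":
    for name, (rx, tx) in parse_rx_tx_bytes(SAMPLE).items():
        # The 64-bit text value next to what a 32-bit field would hold.
        print(name, rx, wrap32(rx), tx, wrap32(tx))
```

With the sample above, the 64-bit receive counter of 5000000000 bytes becomes 705032704 once reduced to 32 bits. That is the kind of mismatch you see between the two tools. delta32() shows the rollover handling that any consumer of the 32-bit numbers needs anyway.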

Solution(s)

Even though this isn't really a bug, and ifconfig has supposedly been deprecated on Linux for a long time, it would still be preferable to handle it: ifconfig hasn't gone away and (unfortunately) most likely won't in the near future, so people comparing ifconfig and iproute2 output will keep being confused. There are two ways of doing this, the easy way and the hard (proper?) way.

The easy way is simply to force the /proc/net/dev output to roll over at 32 bits as well, even though that isn't really necessary in itself. Both methods would then roll over at 32 bits and the confusion wouldn't occur. Unfortunately, changing even the smallest thing in /proc files has caused breakage in userland code in the past, and one can never really be sure what effect this change would have on all the applications that might use /proc/net/dev.

The hard way is to add a 64-bit (u64) variant of the statistics to the kernel's netlink interface, and to add the ability in iproute2 to use it when the kernel provides it instead of the old/current 32-bit interface.

Both of these methods depend on modifications to the kernel!

Additionally, one could argue that the bug really is in the kernel, and that supporting this not-yet-existing kernel feature is a feature request / wishlist item for iproute2. Or that this is just as much a bug in ifconfig: just because iproute2 and ifconfig report different numbers doesn't mean iproute2 is at fault. If you wanted to shift the blame towards ifconfig, you could point out that it could even be "fixed" in ifconfig without requiring kernel modification (but then there are probably many other programs that use the /proc/net/dev values and would require the same "fix").

I've already submitted[1] a patch for "the simple solution" to the (Linux) netdev mailing list, and no one cared to comment on or apply it.
I guess they would prefer a nice backwards-compatible implementation of "the hard solution", or perhaps they consider this such a small issue that it's not worth fixing. Possibly the /proc filesystem will be cleaned up one day in the distant future to remove all the cruft that is not process-related (breaking all applications, like ifconfig, that depend on these deprecated methods of gathering kernel information).

I'm suggesting documenting the behaviour (unless this bug report counts as good enough documentation) and lowering the severity to wishlist. Most likely no one will care to fix this since there's so little to gain.

[1]: See http://www.spinics.net/lists/netdev/msg35472.html or http://marc.info/?l=linux-netdev&m=118415534518953

--
Regards,
Andreas Henriksson