Bug#696060: mtr: StDev overflowed to negative
Control: fixed -1 0.85-1

On Sun, Dec 16, 2012 at 04:44:04PM +0100, Rogier Wolff wrote:
> The variance, which is used to calculate the stdev, is stored in a
> 64-bit integer.
>
> However, what we store there are the squares of the difference from
> the average. So if you have a 70-second ping time (sometimes), the
> square of 70000 milliseconds becomes 4900 million! Quite a lot, but
> unlikely to overflow a 64-bit value. However, the calculation is
> done in microseconds. Thus your 70 seconds is 70 million
> microseconds, giving 4900 trillion (4.9 * 10^15) added to the running
> total every second or so (as long as the average remains around
> zero). This can overflow a 64-bit variable in human-observable time.
>
> I've modified the code to do the calculations in milliseconds from now
> on. This should buy us a factor of a million of margin. :-)

So, if I'm not mistaken, this is
https://github.com/traviscross/mtr/commit/bc39728995df74dd0ab78feea9a8ecfc53579fce
and was ultimately included in 0.83. Marking the first Debian version
that has the fix.

Bernhard
Bug#696060: mtr: StDev overflowed to negative
Package: mtr
Version: 0.82-3
Severity: minor

Dear Maintainer,

As part of an effort to diagnose - and later to confirm the fix of - an
ongoing network problem, I have maintained an mtr session running for
several weeks straight. The current overall summary for one hop in that
session presently reads as follows:

Hostname Loss Rcv Snt Last Best Avg Worst StDev
73.223.7.1 0.3% 4028069 4039341 687 12 60593 -2147483.75

The standard deviation value is negative, which is meaningless AFAIK,
and therefore should not be possible. The specific negative value in
question looks at a glance like the result of an overflow.

I am not clear on exactly what it would take to reproduce this problem.
Presumably, unreasonably high worst-case ping times in what is otherwise
a normal network environment would be at least a contributing factor.
However, I am relatively certain that I recall past sessions where this
hop has shown a Worst value of over 70000 milliseconds, but the StDev
value has remained positive; as such, I am not sure whether that would
be sufficient.

This bug is of course extremely minor, as even if it does occur
reproducibly, the circumstances for it are rare and it is unlikely to
have more than a cosmetic effect even when it does occur. However, as it
is still a bug, I felt it worth reporting anyway.

If there is anything I can do to help to track this down, please don't
hesitate to let me know.
-- System Information:
Debian Release: wheezy/sid
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.2.0-3-amd64 (SMP w/12 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages mtr depends on:
ii  libatk1.0-0         2.4.0-2
ii  libc6               2.13-35
ii  libcairo2           1.12.2-2
ii  libfontconfig1      2.9.0-7
ii  libfreetype6        2.4.9-1
ii  libgdk-pixbuf2.0-0  2.26.1-1
ii  libglib2.0-0        2.33.12+really2.32.4-3
ii  libgtk2.0-0         2.24.10-2
ii  libncurses5         5.9-10
ii  libpango1.0-0       1.30.0-1
ii  libtinfo5           5.9-10

mtr recommends no packages.

mtr suggests no packages.

-- no debconf information

--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
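A note on the reported figure: the StDev of -2147483.75 does look like an integer limit. One possible reading, offered as an assumption and not something confirmed anywhere in this thread, is that it corresponds to the most negative 32-bit signed integer (-2^31) divided by 1000 and then rounded to single precision. The sketch below only checks that arithmetic; the helper name as_float32 is made up for the illustration and says nothing about how mtr computes or prints the value.

```python
import struct


def as_float32(x):
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Assumption for illustration: a microsecond quantity pinned at INT32_MIN,
# displayed in milliseconds through a single-precision float.
INT32_MIN = -2**31
exact = INT32_MIN / 1000          # -2147483.648 as a double
displayed = as_float32(exact)     # singles near 2^21 are spaced 0.25 apart

print(f"{displayed:.2f}")
```

Near 2^21 a single-precision float can only represent multiples of 0.25, so -2147483.648 rounds to -2147483.75, the value shown in the report.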
Bug#696060: mtr: StDev overflowed to negative
Hi,

The variance, which is used to calculate the stdev, is stored in a
64-bit integer.

However, what we store there are the squares of the difference from
the average. So if you have a 70-second ping time (sometimes), the
square of 70000 milliseconds becomes 4900 million! Quite a lot, but
unlikely to overflow a 64-bit value. However, the calculation is
done in microseconds. Thus your 70 seconds is 70 million
microseconds, giving 4900 trillion (4.9 * 10^15) added to the running
total every second or so (as long as the average remains around
zero). This can overflow a 64-bit variable in human-observable time.

I've modified the code to do the calculations in milliseconds from now
on. This should buy us a factor of a million of margin. :-)

	Roger.

On Sun, Dec 16, 2012 at 08:30:28AM -0500, The Wanderer wrote:
> Package: mtr
> Version: 0.82-3
> Severity: minor
>
> Dear Maintainer,
>
> As part of an effort to diagnose - and later to confirm the fix of - an
> ongoing network problem, I have maintained an mtr session running for
> several weeks straight. The current overall summary for one hop in that
> session presently reads as follows:
>
> Hostname Loss Rcv Snt Last Best Avg Worst StDev
> 73.223.7.1 0.3% 4028069 4039341 687 12 60593 -2147483.75
>
> The standard deviation value is negative, which is meaningless AFAIK,
> and therefore should not be possible. The specific negative value in
> question looks at a glance like the result of an overflow.
>
> I am not clear on exactly what it would take to reproduce this problem.
> Presumably, unreasonably high worst-case ping times in what is otherwise
> a normal network environment would be at least a contributing factor.
> However, I am relatively certain that I recall past sessions where this
> hop has shown a Worst value of over 70000 milliseconds, but the StDev
> value has remained positive; as such, I am not sure whether that would
> be sufficient.
>
> This bug is of course extremely minor, as even if it does occur
> reproducibly, the circumstances for it are rare and it is unlikely to
> have more than a cosmetic effect even when it does occur. However, as it
> is still a bug, I felt it worth reporting anyway.
>
> If there is anything I can do to help to track this down, please don't
> hesitate to let me know.
>
> -- System Information:
> Debian Release: wheezy/sid
>   APT prefers stable-updates
>   APT policy: (500, 'stable-updates'), (500, 'testing'), (500, 'stable')
> Architecture: amd64 (x86_64)
> Foreign Architectures: i386
>
> Kernel: Linux 3.2.0-3-amd64 (SMP w/12 CPU cores)
> Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
> Shell: /bin/sh linked to /bin/dash
>
> Versions of packages mtr depends on:
> ii  libatk1.0-0         2.4.0-2
> ii  libc6               2.13-35
> ii  libcairo2           1.12.2-2
> ii  libfontconfig1      2.9.0-7
> ii  libfreetype6        2.4.9-1
> ii  libgdk-pixbuf2.0-0  2.26.1-1
> ii  libglib2.0-0        2.33.12+really2.32.4-3
> ii  libgtk2.0-0         2.24.10-2
> ii  libncurses5         5.9-10
> ii  libpango1.0-0       1.30.0-1
> ii  libtinfo5           5.9-10
>
> mtr recommends no packages.
>
> mtr suggests no packages.
>
> -- no debconf information

--
** r.e.wo...@bitwizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
** Delftechpark 26 2628 XH Delft, The Netherlands. KVK: 27239233 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike Phil, this
plan just might work.
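The figures in Rogier's explanation can be checked with a few lines of arithmetic. The sketch below is not taken from mtr: the function name samples_until_overflow is hypothetical, and it assumes the worst case in which every sample deviates from the average by the full 70 seconds. It counts how many such squared deviations fit into a signed 64-bit sum, first in microseconds and then in milliseconds.

```python
INT64_MAX = 2**63 - 1


def samples_until_overflow(deviation_seconds, units_per_second):
    """How many squared deviations of this size fit in a signed 64-bit sum."""
    deviation = deviation_seconds * units_per_second
    return INT64_MAX // (deviation * deviation)


# Worst-case assumption: every probe is 70 seconds away from the average.
per_sample_us = (70 * 1_000_000) ** 2   # 4.9 * 10^15, as in the message
per_sample_ms = (70 * 1_000) ** 2       # 4.9 * 10^9, i.e. 4900 million

in_microseconds = samples_until_overflow(70, 1_000_000)
in_milliseconds = samples_until_overflow(70, 1_000)

print(in_microseconds)                    # 1882
print(in_milliseconds // in_microseconds)
```

Under that assumption the microsecond accumulator lasts for about 1882 such samples, while the millisecond accumulator lasts roughly a million times longer, which matches the "factor of a million of margin" quoted above. Real sessions contain mostly small deviations, so the actual time to overflow depends on how often the large outliers occur.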
Bug#696060: mtr: StDev overflowed to negative
On 12/16/2012 10:44 AM, Rogier Wolff wrote:

> Hi,
>
> The variance, which is used to calculate the stdev, is stored in a
> 64-bit integer.
>
> However, what we store there are the squares of the difference from
> the average. So if you have a 70-second ping time (sometimes), the
> square of 70000 milliseconds becomes 4900 million! Quite a lot, but
> unlikely to overflow a 64-bit value. However, the calculation is
> done in microseconds. Thus your 70 seconds is 70 million
> microseconds, giving 4900 trillion (4.9 * 10^15) added to the running
> total every second or so (as long as the average remains around
> zero). This can overflow a 64-bit variable in human-observable time.

The case at hand was only about 60000 milliseconds, but yes, that would
explain the problem. The fact that I've seen 70000-millisecond Worst
times without seeing this problem would then be explained by the fact
that those sessions didn't last this long; IIRC they were about two
weeks at most, and this one is over six.

> I've modified the code to do the calculations in milliseconds from now
> on. This should buy us a factor of a million of margin. :-)

Not a 100% fix in theory, but it should hide the problem for pretty much
any case that's actually reasonable to support.

Sounds good to me; thanks for the prompt response!
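On the "not a 100% fix in theory" point: one alternative technique, named here only for comparison and not something proposed in this thread or used by mtr as far as the thread shows, is Welford's online algorithm with a floating-point accumulator. A float sum of squares cannot wrap around to a negative value; on very long runs it loses precision instead. The class name RunningStats and its methods are hypothetical, a minimal sketch only.

```python
import math


class RunningStats:
    """Welford's online mean/variance, kept in floating point."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0   # running sum of squared deviations from the mean

    def add(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        # Uses the deviation from both the old and the updated mean.
        self.m2 += delta * (x - self.mean)

    def stdev(self):
        """Sample standard deviation; 0.0 until two samples are seen."""
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))


stats = RunningStats()
for rtt_ms in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.add(rtt_ms)

print(stats.mean, stats.stdev())
```

Because m2 is only ever increased by non-negative terms, stdev() cannot return a negative number, whatever the session length.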