Re: svn commit: r265472 - head/bin/dd
On Thu, May 8, 2014 at 4:55 PM, Dmitry Morozovsky wrote:

> On Wed, 7 May 2014, Alan Somers wrote:
> [snip]
>> Even if nanosecond resolution isn't useful, monotonicity is. Nobody should be using a nonmonotonic clock just to measure durations. I started an audit of all of FreeBSD to look for other programs that use gettimeofday to measure durations. I haven't finished, but I've already found a lot, including xz, ping, hastd, fetch, systat, powerd, and others. I don't have time to fix them, though. Would you be interested, or do you know anyone else who would?
>
> From your list, hastd seems to be the most dangerous point to me.
>
> Adding trociny@ as one of the most active hastd-related committers so the issue would not be lost in the areas...

In hastd's case, the problematic comparison is only used to print debugging info. It happens twice, in primary.c lines 1978 and 2039.

> --
> Sincerely,
> D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
> [ FreeBSD committer: ma...@freebsd.org ]
>
> *** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
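For reference, a minimal sketch of the pattern the audit is looking for: measuring a duration with the monotonic clock instead of the wall clock. This is a standalone illustration; the names are not taken from hastd or dd.

    #include <err.h>
    #include <stdio.h>
    #include <time.h>

    int
    main(void)
    {
        struct timespec before, after;
        double secs;

        if (clock_gettime(CLOCK_MONOTONIC, &before) != 0)
            err(1, "clock_gettime");

        /* ... the work being timed goes here ... */

        if (clock_gettime(CLOCK_MONOTONIC, &after) != 0)
            err(1, "clock_gettime");

        /* Subtract first, then convert, so a large tv_sec does not
         * round away the nanoseconds (see the precision discussion
         * later in the thread). */
        secs = (after.tv_sec - before.tv_sec) +
            (after.tv_nsec - before.tv_nsec) * 1.0e-9;
        printf("elapsed: %.9f s\n", secs);
        return (0);
    }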
Re: svn commit: r265472 - head/bin/dd
On Wed, 7 May 2014, Alan Somers wrote:

[snip]
> Even if nanosecond resolution isn't useful, monotonicity is. Nobody should be using a nonmonotonic clock just to measure durations. I started an audit of all of FreeBSD to look for other programs that use gettimeofday to measure durations. I haven't finished, but I've already found a lot, including xz, ping, hastd, fetch, systat, powerd, and others. I don't have time to fix them, though. Would you be interested, or do you know anyone else who would?

From your list, hastd seems to be the most dangerous point to me.

Adding trociny@ as one of the most active hastd-related committers so the issue would not be lost in the areas...

--
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]

*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
Re: svn commit: r265472 - head/bin/dd
On Thu, 8 May 2014, Alan Somers wrote:

> On Wed, May 7, 2014 at 9:39 PM, Bruce Evans wrote:
>> On Wed, 7 May 2014, Jilles Tjoelker wrote:
>>> On Wed, May 07, 2014 at 12:10:31PM -0600, Alan Somers wrote:
>>>> On Tue, May 6, 2014 at 9:47 PM, Bruce Evans wrote:
>>>>> On Tue, 6 May 2014, Alan Somers wrote:
>>>>> [snip]
>>>>>> + st.start = tv.tv_sec + tv.tv_nsec * 1.0e-9; }
>>>
>>> The floating point addition starts losing precision after 8388608 seconds (slightly more than 97 days, a plausible uptime for a server). It is better to subtract the timespecs to avoid this issue.
>>
>> No, it is better to use floating point for results that only need to be approximate. Especially when the inputs are approximate and the final approximation doesn't need to be very accurate.
>>
>> Floating point is good for all timespec and timeval calculations, except in the kernel where it is unavailable. timespecs and timevals are mostly used for timeouts, and the kernel isn't very careful about exact timeouts. Short timeouts have inherent large inaccuracy due to interrupt granularity and latency. Long timeouts can be relatively more accurate, but only if the kernel is careful about them. It is only careful in some places.
>
> No, Jilles is right. The problem isn't that dd uses doubles; it's that dd converts longs to doubles _before_ subtracting the values. That causes rounding if the tv_sec values are large. If the implementation of CLOCK_MONOTONIC ever changed to measure time since the Epoch, or something similar, then the rounding error would be extremely significant. Better to subtract the timespecs, then convert to double.

Nowhere near a problem. Conversion from long to double is exact up to 2**53 seconds or 285 megayears. So it works for seconds since the POSIX Epoch although not for seconds since the big bang. It would be a problem for floats.

Actually, I didn't notice the fragile conversion order.

> With microseconds, the precision of a double is sufficient for 272 years, so that calculation is probably acceptable.

Actually 285 years. A measly 0.285 years in nanoseconds before exactness is lost. The subtraction problem also affects tv_nsec, but is not very serious since it wouldn't hurt to ignore the nanoseconds part of a runtime of 0.285 years.

We are 54 years from the Epoch now. That is a bit more than 0.285, so we lose a significant part of the nanoseconds precision. But we get microseconds precision for 285 years since the Epoch. About 5 times that now.

...

Bugs in the boot time can be fixed more easily than by micro-adjusting the monotonic clock. Just keep the initial boot time (except adjust it when it was initially local instead of UTC) and frob the real time using a different variable. Export both variables so that applications can compensate for the frobbing at the cost of some complexity. E.g., in uptime(1):

    clock_gettime(CLOCK_UPTIME, &ts);
    /*
     * Actually, do the compensations in the kernel for CLOCK_UPTIME.
     * It doesn't need to be monotonic. But suppose it is the same
     * as the unfixed CLOCK_MONOTONIC and compensate here.
     *
     * Also fix the bogus variable name 'tp'.
     */
    sysctl_mumble(&boottime);
    sysctl_mumble(&frobbed_boottime);
    uptime = ts.tv_sec +- (boottime.tv_sec - frobbed_boottime.tv_sec);

Note that the compensation may go backwards, so this method doesn't work in general for monotonic times. However, it can be used if the compensation is non-negative or relatively small negative. dd could use this method. It already has to fix up for zero times and still has parts of the old method that fixes up for negative times.

Note that the compensation may be very large across a suspension. You might start dd, SIGSTOP it, suspend the system and restart everything a day later. The compensation would be about 1 day. The average from this wouldn't be very useful, but it would be the same as if dd was stopped for a day but the system was not suspended.

> Wouldn't it be simpler just for the kernel to adjust CLOCK_MONOTONIC to add suspension time?

That works for forward jumps like ones for suspension.

BTW, I do a similar adjustment for "suspension" that is actually sitting in ddb. I only step the real time. My accuracy is better than 10 ppm. Good enough for an ntp server. The stepping is done in rtcintr() on seconds rollover interrupts 3 seconds after ddb has exited and remained inactive, according to measurements made at every 64 seconds rollover interrupt (the delay is to prevent adjustments while single stepping).

Nothing much notices this stepping, but for monotonic time you have to do something about scheduled timeouts. Strictly, stepping the monotonic clock forward by a lot should cause many timeouts to expire. This is the correct behaviour, but it won't happen automatically, and it would cause thundering herds. Cron should have similar scheduling problems after the realtime clock is stepped, but I don't use it much.
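A concrete rendering of Bruce's uptime(1) sketch, for reference. kern.boottime is a real FreeBSD sysctl; "kern.frobbed_boottime" is hypothetical and stands in for the proposed export of the second boot-time variable, which does not exist.

    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <err.h>
    #include <stdio.h>
    #include <time.h>

    int
    main(void)
    {
        struct timespec ts;
        struct timeval boottime, frobbed_boottime;
        size_t len;
        double uptime;

        if (clock_gettime(CLOCK_UPTIME, &ts) != 0)
            err(1, "clock_gettime");

        len = sizeof(boottime);
        if (sysctlbyname("kern.boottime", &boottime, &len, NULL, 0) != 0)
            err(1, "kern.boottime");

        /* Hypothetical second variable: the boot time after frobbing. */
        len = sizeof(frobbed_boottime);
        if (sysctlbyname("kern.frobbed_boottime", &frobbed_boottime,
            &len, NULL, 0) != 0)
            err(1, "kern.frobbed_boottime");

        /* The compensation term may be negative, as noted above. */
        uptime = ts.tv_sec + ts.tv_nsec * 1.0e-9 +
            (boottime.tv_sec - frobbed_boottime.tv_sec);
        printf("compensated uptime: %.3f s\n", uptime);
        return (0);
    }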
Re: svn commit: r265472 - head/bin/dd
On Wed, 7 May 2014, Alan Somers wrote:

> On Tue, May 6, 2014 at 9:47 PM, Bruce Evans wrote:
>> On Tue, 6 May 2014, Alan Somers wrote:

This is about some minor details that I didn't reply to for later followups.

>>> + if (clock_gettime(CLOCK_MONOTONIC_PRECISE, &tv))
>>> + 	err(EX_OSERR, "clock_gettime");
>>> + if (clock_getres(CLOCK_MONOTONIC_PRECISE, &tv_res))
>>> + 	err(EX_OSERR, "clock_getres");
>>
>> clock_getres() is almost useless, and is useless here. It is broken as designed, since the precision may be less than 1 nanosecond but 1 nanosecond is the smallest positive representable value, but that is not a problem here since clock_gettime() also can't distinguish differences smaller than 1 nanosecond.
>
> Since it's reporting the clock resolution and not precision, and since clock_gettime() only reports with 1ns resolution, I don't think it's a problem for clock_getres to report with 1ns resolution too.

I got most of the backwardness backwards. The syscall is clock_getres(), not clock_getprec(), and the variable name matches this. But what it returns is the precision. The resolution is just that of a timespec (1 nanosecond). No API is needed to report this. APIs are needed to report:

- the precision. The API is misnamed clock_getres().
- the granularity. This is the minimum time between successive measurements. It can be determined by actually doing some measurements (see the sketch after this message).
- the accuracy. No API is available.

For clocks based on timecounters, we use the timecounter clock period rounded up to nanoseconds for the precision. With a TSC, this is always 1 nanosecond above 1GHz.

dd needs more like the granularity than the precision, but it doesn't really matter since the runtime must be much larger than the granularity for the statistics to be accurate, and usually is.

>> The fixup is now only reachable in 3 cases that can't happen:
>> - when the monotonic time goes backwards due to a kernel bug
>> - when the monotonic time doesn't increase, so that the difference is 0.
>> Oops, this can happen for timecounters with very low "precision". You don't need to know the "precision" to check for this.
>
> On my Xeon E5504 systems, I can see adjacent calls to clock_gettime return equal values when using one of the _FAST clocks. It can't be proven that this case will never happen with any other clock either, so the program needs to handle it.

Hrmph. This is either from the design error of the existence of the _FAST clocks, or from the design error of the existence of TSC-low.

First, the _FAST clocks are only supposed to have a resolution of 1/hz. clock_getres() is quite broken here. It returns the timecounter precision for the _FAST clocks too. Also, if it returned 1/hz, then it would be inconsistent with the libc implementation of clock_gettime(). The latter gives the timecounter precision.

TSC-low intentionally destroys the hardware TSC precision by right shifting, due to FUD and to handle a minor problem above 4GHz. The shift used to be excessive in most cases. On freefall it used to be about 7, so the precision of ~1/2.67 nsec was reduced to ~48 nsec. This was easy to see in test programs for such things. Now the shift is just 1. Since 1<<1 is less than 2.67, the loss of precision from the shift is less than the loss of precision from converting from bintimes to timespecs. The shift is still a pessimization.

sysctl has a read-only tunable kern.timecounter.tsc_shift. Use of this seems to be quite broken. The shift count is determined dynamically, and the tunable barely affects this. The active shift count is not written back to the tunable, so you can't see what it is easily. However, the shift count is now always 1 except in exceptional cases. The tunable defaults to 1. This is for CPU speeds between 2GHz and 4GHz to implement the support for the FUD at these speeds. Above 4GHz, the shift is increased to 2 without changing the tunable. Above 8GHz, the shift is increased to 3. That can't happen yet, but you can tune higher to get a higher shift count at lower speeds. You can also tune to 0 to avoid the shift up to 4GHz.

The shift together with some fencing pessimizations that are not even done in the kernel version (only libc) is due to FUD. rdtsc is not a serializing instruction, so its direct use may give surprising results. I think it is serialized with respect to itself on the same CPU. It is obviously not serialized with respect to other instructions on the same CPU. So it doesn't work properly in code like "rdtsc; ; v++; rdtsc; " even with quite a bit more than v++ between the rdtsc's. Normally there is much more than v++ between rdtsc's so code like this works well in practice. When the rdtsc's are on separate CPUs, it is just a bug to depend on their order unless there are synchronization instructions for more than the rdtsc's. The old kernel code is sloppy about such things. It tries to do everything without atomic locking or mutexes. This most
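One way to measure the granularity Bruce distinguishes from the precision, i.e. the minimum observable time between successive readings, is simply to call clock_gettime() back to back and record the smallest nonzero delta and the number of equal readings. A rough sketch; the results vary with the timecounter in use.

    #include <stdio.h>
    #include <time.h>

    int
    main(void)
    {
        struct timespec prev, cur;
        long delta, min_delta = -1;
        int i, equal = 0;

        clock_gettime(CLOCK_MONOTONIC, &prev);
        for (i = 0; i < 1000000; i++) {
            clock_gettime(CLOCK_MONOTONIC, &cur);
            delta = (cur.tv_sec - prev.tv_sec) * 1000000000L +
                (cur.tv_nsec - prev.tv_nsec);
            if (delta == 0)
                equal++;        /* adjacent readings compared equal */
            else if (min_delta < 0 || delta < min_delta)
                min_delta = delta;
            prev = cur;
        }
        printf("min nonzero delta: %ld ns, equal readings: %d\n",
            min_delta, equal);
        return (0);
    }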
Re: svn commit: r265472 - head/bin/dd
On Wed, 7 May 2014, Alan Somers wrote:

> On Wed, May 7, 2014 at 2:26 PM, Jilles Tjoelker wrote:
>> The floating point addition starts losing precision after 8388608 seconds (slightly more than 97 days, a plausible uptime for a server). It is better to subtract the timespecs to avoid this issue.
>>
>> With microseconds, the precision of a double is sufficient for 272 years, so that calculation is probably acceptable.
>
> Good point. I'll fix it. BTW, FreeBSD does not have a timespecsub, but I notice that NetBSD does. This seems like a good candidate to import. http://netbsd.gw.com/cgi-bin/man-cgi?timespecsub++NetBSD-current

Hrmph. This was intentionally left out in FreeBSD. Floating point works better.

Unfortunately, floating point is unavailable in kernels, or was too slow, especially in 1980, so APIs use the not so good timespec and timeval APIs. Not to mention bintimes, sbintimes and struct tm. sbintime or just nanoseconds would be better if floating point is unavailable. struct tm is too bloated to be good for syscalls although not bloated enough to support sub-second times.

FreeBSD does have the corresponding design errors for timeval APIs: timeradd(), etc. These are even documented, but in a wrong way as prototypes. They are actually unsafe macros spelled as if they are safe. Some of them could be functions, but timercmp() must be a macro. Design errors in them start with their names (timer* instead of tv*).

<sys/time.h> has rather complicated ifdefs to keep these APIs in or out of the kernel. It already has timespecsub(), but limits it to the kernel. The efficiency of these functions is unimportant, and some like timevalsub() are extern functions in the kernel. This helps give the complicated ifdefs and confusing namespaces:

- bintime_sub() is kernel inline
- bintime_sub() is user inline (undocumented pollution)
- bintime_sub() uses an extra underscore
- the good name btsub() is not used
- timevalsub() is kernel extern
- timersub() is user macro (otherBSD compatibility cruft)
- the good name tvsub() is not used.

timersub() is an especially bad name for an API that handles timevals. timevals were intentionally left out of old versions of POSIX. POSIX invented timespecs instead. The old APIs using timevals couldn't be killed so easily, and POSIX eventually had to standardize them. POSIX also standardized many new timespec APIs. Then, just when the timevals could be killed better, the unnecessary APIs for manipulating them were introduced, though not in POSIX. The non-kernel versions of them have a name that is particularly bad when there is more than 1 timer struct.

- timespecsub() is kernel macro
- timespecsub() as user API is intentionally left out
- the good name tssub() is not used.

Parameter names and orders for these APIs are also inconsistent and mostly bad. In the macro timersub(), the parameters are (tvp, uvp, vvp). You might think tvp is the target, but it is one of the sources. In the man page "prototype", they are (a, b, res). Slightly better, but now missing a 'p'. But 'p' is not missing in the man page for timerisset() or timerclear(). The ordering with the target last is an old convention. bintime functions mostly but not always go the other way.

Floating point is easier to use than these functions, except for conversion from floating point back to a timespec or timeval. That takes about 2 statements instead of 1 function call. It's just a bit inelegant to go through floating point when you start with timespecs and have to convert back to timespecs.

Bruce
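For reference, the NetBSD macro Alan links to is the three-argument form, vsp = tsp - usp (the FreeBSD kernel-only macro of the era took two arguments):

    #define timespecsub(tsp, usp, vsp)                                  \
        do {                                                            \
            (vsp)->tv_sec = (tsp)->tv_sec - (usp)->tv_sec;              \
            (vsp)->tv_nsec = (tsp)->tv_nsec - (usp)->tv_nsec;           \
            if ((vsp)->tv_nsec < 0) {                                   \
                (vsp)->tv_sec--;                                        \
                (vsp)->tv_nsec += 1000000000L;                          \
            }                                                           \
        } while (0)

The floating-point route Bruce prefers needs no macro in one direction; the "about 2 statements" for converting back would be something like: ts.tv_sec = (time_t)d; ts.tv_nsec = (d - ts.tv_sec) * 1e9;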
Re: svn commit: r265472 - head/bin/dd
On Wed, 7 May 2014, Jilles Tjoelker wrote:

> On Wed, May 07, 2014 at 12:10:31PM -0600, Alan Somers wrote:
>> On Tue, May 6, 2014 at 9:47 PM, Bruce Evans wrote:
>>> On Tue, 6 May 2014, Alan Somers wrote:
>>>> ...
>>>> The solution is to use clock_gettime(2) with CLOCK_MONOTONIC_PRECISE as the clock_id. That clock advances steadily, regardless of changes to the system clock.
>>>> ...
>>>> +#include <sysexits.h>
>>>
>>> Use of <sysexits.h> is a style bug. It is not used in BSD or KNF code like dd used to be.
>>
>> sysexits.h is recommended by the err(3) man page. Is that recommendation meant to apply selectively, or is it obsolete, or is some sort of edit war being waged by man page authors?

Bug in the err(3) man page. Sort of an edit war. Just 2 FreeBSD committers liked sysexits and used it in their code and added a recommendation to use it in some man pages. But it has negative advantages, and normal BSD programs don't use it. It has been edited in and out of style(9).

> The recommendation for <sysexits.h> was incompletely removed, yes.

It is still in err(3), and sysexits(3) still justifies itself by pointing to partly-removed words in style(9).

err(3) is the last place that should recommend using sysexits. err() gives a nice way of encouraging text descriptions for all exits. With text descriptions, there is almost no need for cryptic numeric exit codes. Only sets of programs that communicate a little status in the exit code should use sysexits (or perhaps their own exit codes, or certain standard exit codes like 126 or 127 for xargs and some other utilities). Some of the uses of the standard exit codes are even. I don't know of any utility except possibly sendmail that documents that it uses sysexits enough for its exit codes to be useful for more than a binary success/fail decision. Certainly not dd after these changes. If its use of sysexits were documented, then the documentation would say "dd uses sysexits to report 3 errors that can't happen; otherwise, it uses the normal 2-state exit codes" (there is a macro for them. It expands to the concise but grammatically challenged "exits 0 on success, and >0 if an error occurs". Here ">0" standardises the usual sloppiness of not distinguishing codes between 1 and 127).

sysexits(3) now says:

% DESCRIPTION
%   According to style(9), it is not a good practice to call exit(3) with
%   arbitrary values to indicate a failure condition when ending a program.
%   Instead, the pre-defined exit codes from sysexits should be used, so the
%   caller of the process can get a rough estimation about the failure class
%   without looking up the source code.

but style(9) now says:

%   Exits should be 0 on success, or 1 on failure.
%
%           exit(0);        /*
%                            * Avoid obvious comments such as
%                            * "Exit 0 on success."
%                            */
%   }

The latter is not what I asked for either. In previous discussion of this, I think we agreed to at least mention EXIT_SUCCESS and EXIT_FAILURE, and possibly deprecate sysexits.

This is a weakened version of the 4.4BSD style rules, which say:

%           /*
%            * Exits should be 0 on success, and 1 on failure.  Don't denote
%            * all the possible exit points, using the integers 1 through 300.
%            */
%           exit(0);        /* Avoid obvious comments such as "Exit 0 on success." */

The main point of this is to disallow cryptic undocumented exit statuses. Recommending sysexits almost reverses this. It gives cryptic undocumented error statuses that are not even easy to decrypt for programs. Programs can look up sysexits, but without documentation there is no guarantee that the encoding is according to sysexits. Actually documenting use of sysexits would make it even more painful to use.

> [snip]
>>>> - st.start = tv.tv_sec + tv.tv_usec * 1e-6;
>>>> + if (clock_gettime(CLOCK_MONOTONIC_PRECISE, &tv))
>>>> + 	err(EX_OSERR, "clock_gettime");
> [snip]
>>>> + st.start = tv.tv_sec + tv.tv_nsec * 1.0e-9;
>>>> }
>
> The floating point addition starts losing precision after 8388608 seconds (slightly more than 97 days, a plausible uptime for a server). It is better to subtract the timespecs to avoid this issue.

No, it is better to use floating point for results that only need to be approximate. Especially when the inputs are approximate and the final approximation doesn't need to be very accurate.

Floating point is good for all timespec and timeval calculations, except in the kernel where it is unavailable. timespecs and timevals are mostly used for timeouts, and the kernel isn't very careful about exact timeouts. Short timeouts have inherent large inaccuracy due to interrupt granularity and latency. Long timeouts can be relatively more accurate, but only if the kernel is careful about them. It is only careful in some places.

> With microseconds, the precision of a double is sufficient for 272 years, so that calculation is probably acceptable.

dd actually us
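The style dispute in miniature, as a sketch: both variants report the text description via err(3) and differ only in the numeric status.

    #include <err.h>
    #include <stdlib.h>
    #include <sysexits.h>
    #include <time.h>

    int
    main(void)
    {
        struct timespec ts;

        if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
            /* r265472's style: a sysexits(3) class code. */
            err(EX_OSERR, "clock_gettime");
            /* The plain BSD style Bruce argues for:
             * err(1, "clock_gettime"); */
        }
        exit(0);    /* "exits 0 on success, and >0 if an error occurs" */
    }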
Re: svn commit: r265472 - head/bin/dd
On Wed, May 7, 2014 at 2:26 PM, Jilles Tjoelker wrote:

> On Wed, May 07, 2014 at 12:10:31PM -0600, Alan Somers wrote:
>> On Tue, May 6, 2014 at 9:47 PM, Bruce Evans wrote:
>>> On Tue, 6 May 2014, Alan Somers wrote:
>>>
>>>> Log:
>>>> dd(1) uses gettimeofday(2) to compute the throughput statistics. However, gettimeofday returns the system clock, which may jump forward or back, especially if NTP is in use. If the time jumps backwards, then dd will see negative elapsed time, round it up to 1usec, and print an absurdly fast transfer rate.
>>>>
>>>> The solution is to use clock_gettime(2) with CLOCK_MONOTONIC_PRECISE as the clock_id. That clock advances steadily, regardless of changes to the system clock.
>>>>
>>>> Reviewed by: delphij
>>>> MFC after: 3 days
>>>> Sponsored by: Spectra Logic
>>>> ...
>>>>
>>>> Modified: head/bin/dd/dd.c
>>>> ==============================================================
>>>> --- head/bin/dd/dd.c	Tue May 6 22:04:50 2014	(r265471)
>>>> +++ head/bin/dd/dd.c	Tue May 6 22:06:39 2014	(r265472)
>>>> @@ -50,7 +50,6 @@ __FBSDID("$FreeBSD$");
>>>> #include
>>>> #include
>>>> #include
>>>> -#include <sys/time.h>
>>>>
>>>> #include
>>>> #include
>>>> @@ -61,6 +60,8 @@ __FBSDID("$FreeBSD$");
>>>> #include
>>>> #include
>>>> #include
>>>> +#include <sysexits.h>
>>>
>>> Use of <sysexits.h> is a style bug. It is not used in BSD or KNF code like dd used to be.
>>
>> sysexits.h is recommended by the err(3) man page. Is that recommendation meant to apply selectively, or is it obsolete, or is some sort of edit war being waged by man page authors?
>
> The recommendation for <sysexits.h> was incompletely removed, yes.
>
>> [snip]
>>>> - st.start = tv.tv_sec + tv.tv_usec * 1e-6;
>>>> + if (clock_gettime(CLOCK_MONOTONIC_PRECISE, &tv))
>>>> + 	err(EX_OSERR, "clock_gettime");
>> [snip]
>>>> + st.start = tv.tv_sec + tv.tv_nsec * 1.0e-9;
>>>> }
>
> The floating point addition starts losing precision after 8388608 seconds (slightly more than 97 days, a plausible uptime for a server). It is better to subtract the timespecs to avoid this issue.
>
> With microseconds, the precision of a double is sufficient for 272 years, so that calculation is probably acceptable.

Good point. I'll fix it. BTW, FreeBSD does not have a timespecsub, but I notice that NetBSD does. This seems like a good candidate to import. http://netbsd.gw.com/cgi-bin/man-cgi?timespecsub++NetBSD-current

> I think I agree with Bruce that using CLOCK_MONOTONIC_PRECISE is not necessary here; the standard CLOCK_MONOTONIC should suffice.
>
>> [snip]
>> Even if nanosecond resolution isn't useful, monotonicity is. Nobody should be using a nonmonotonic clock just to measure durations. I started an audit of all of FreeBSD to look for other programs that use gettimeofday to measure durations. I haven't finished, but I've already found a lot, including xz, ping, hastd, fetch, systat, powerd, and others. I don't have time to fix them, though. Would you be interested, or do you know anyone else who would?
>
> I have a local patch for time(1).
>
> Whether the monotonic clock is right also depends on how long the durations typically are. For very long durations, users might refer to wall clocks and CLOCK_REALTIME may be more appropriate.

I would say that gettimeofday() or CLOCK_REALTIME should be used whenever the absolute value of the result matters. For example, if it's written to disk, displayed to the user, or used as an absolute wakeup time for something like at(1). But if only the relative value of 2 or more calls matters, then I think that a monotonic clock should always be used. A corollary is that durations that cross a reboot cycle, or when the start and end times are measured by different machines, must use the realtime clock.

-Alan

> --
> Jilles Tjoelker
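Alan's rule of thumb, sketched with illustrative names: the wall clock for values meaningful in absolute terms, the monotonic clock for differences.

    #include <stdio.h>
    #include <time.h>

    int
    main(void)
    {
        struct timespec wall, mono;

        /* Absolute value matters (logged, stored, shown): realtime. */
        clock_gettime(CLOCK_REALTIME, &wall);
        printf("event at %lld s since the Epoch\n",
            (long long)wall.tv_sec);

        /* Only differences matter (durations, rates): monotonic. */
        clock_gettime(CLOCK_MONOTONIC, &mono);
        printf("monotonic reading: %lld s (meaningless as a date)\n",
            (long long)mono.tv_sec);
        return (0);
    }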
Re: svn commit: r265472 - head/bin/dd
On Wed, May 07, 2014 at 12:10:31PM -0600, Alan Somers wrote:

> On Tue, May 6, 2014 at 9:47 PM, Bruce Evans wrote:
>> On Tue, 6 May 2014, Alan Somers wrote:
>>
>>> Log:
>>> dd(1) uses gettimeofday(2) to compute the throughput statistics. However, gettimeofday returns the system clock, which may jump forward or back, especially if NTP is in use. If the time jumps backwards, then dd will see negative elapsed time, round it up to 1usec, and print an absurdly fast transfer rate.
>>>
>>> The solution is to use clock_gettime(2) with CLOCK_MONOTONIC_PRECISE as the clock_id. That clock advances steadily, regardless of changes to the system clock.
>>>
>>> Reviewed by: delphij
>>> MFC after: 3 days
>>> Sponsored by: Spectra Logic
>>> ...
>>>
>>> Modified: head/bin/dd/dd.c
>>> ==============================================================
>>> --- head/bin/dd/dd.c	Tue May 6 22:04:50 2014	(r265471)
>>> +++ head/bin/dd/dd.c	Tue May 6 22:06:39 2014	(r265472)
>>> @@ -50,7 +50,6 @@ __FBSDID("$FreeBSD$");
>>> #include
>>> #include
>>> #include
>>> -#include <sys/time.h>
>>>
>>> #include
>>> #include
>>> @@ -61,6 +60,8 @@ __FBSDID("$FreeBSD$");
>>> #include
>>> #include
>>> #include
>>> +#include <sysexits.h>
>>
>> Use of <sysexits.h> is a style bug. It is not used in BSD or KNF code like dd used to be.
>
> sysexits.h is recommended by the err(3) man page. Is that recommendation meant to apply selectively, or is it obsolete, or is some sort of edit war being waged by man page authors?

The recommendation for <sysexits.h> was incompletely removed, yes.

> [snip]
>>> - st.start = tv.tv_sec + tv.tv_usec * 1e-6;
>>> + if (clock_gettime(CLOCK_MONOTONIC_PRECISE, &tv))
>>> + 	err(EX_OSERR, "clock_gettime");
> [snip]
>>> + st.start = tv.tv_sec + tv.tv_nsec * 1.0e-9;
>>> }

The floating point addition starts losing precision after 8388608 seconds (slightly more than 97 days, a plausible uptime for a server). It is better to subtract the timespecs to avoid this issue.

With microseconds, the precision of a double is sufficient for 272 years, so that calculation is probably acceptable.

I think I agree with Bruce that using CLOCK_MONOTONIC_PRECISE is not necessary here; the standard CLOCK_MONOTONIC should suffice.

> [snip]
> Even if nanosecond resolution isn't useful, monotonicity is. Nobody should be using a nonmonotonic clock just to measure durations. I started an audit of all of FreeBSD to look for other programs that use gettimeofday to measure durations. I haven't finished, but I've already found a lot, including xz, ping, hastd, fetch, systat, powerd, and others. I don't have time to fix them, though. Would you be interested, or do you know anyone else who would?

I have a local patch for time(1).

Whether the monotonic clock is right also depends on how long the durations typically are. For very long durations, users might refer to wall clocks and CLOCK_REALTIME may be more appropriate.

--
Jilles Tjoelker
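Jilles' precision point can be checked directly. Once tv_sec is large (here an Epoch-scale value, matching the hypothetical discussed elsewhere in the thread), converting each timespec to double before subtracting rounds a 1 ns difference away, while subtracting first preserves it:

    #include <stdio.h>
    #include <time.h>

    int
    main(void)
    {
        /* Two instants 1 ns apart, with a large tv_sec. */
        struct timespec a = { .tv_sec = 1700000000, .tv_nsec = 0 };
        struct timespec b = { .tv_sec = 1700000000, .tv_nsec = 1 };

        /* Convert first: the 1 ns difference is rounded away. */
        double ca = a.tv_sec + a.tv_nsec * 1.0e-9;
        double cb = b.tv_sec + b.tv_nsec * 1.0e-9;
        printf("convert first:  %.9f\n", cb - ca);  /* 0.000000000 */

        /* Subtract first: the difference survives. */
        double d = (b.tv_sec - a.tv_sec) +
            (b.tv_nsec - a.tv_nsec) * 1.0e-9;
        printf("subtract first: %.9f\n", d);        /* 0.000000001 */
        return (0);
    }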
Re: svn commit: r265472 - head/bin/dd
On Tue, May 6, 2014 at 9:47 PM, Bruce Evans wrote:

> On Tue, 6 May 2014, Alan Somers wrote:
>
>> Log:
>> dd(1) uses gettimeofday(2) to compute the throughput statistics. However, gettimeofday returns the system clock, which may jump forward or back, especially if NTP is in use. If the time jumps backwards, then dd will see negative elapsed time, round it up to 1usec, and print an absurdly fast transfer rate.
>>
>> The solution is to use clock_gettime(2) with CLOCK_MONOTONIC_PRECISE as the clock_id. That clock advances steadily, regardless of changes to the system clock.
>>
>> Reviewed by: delphij
>> MFC after: 3 days
>> Sponsored by: Spectra Logic
>> ...
>>
>> Modified: head/bin/dd/dd.c
>> ==============================================================
>> --- head/bin/dd/dd.c	Tue May 6 22:04:50 2014	(r265471)
>> +++ head/bin/dd/dd.c	Tue May 6 22:06:39 2014	(r265472)
>> @@ -50,7 +50,6 @@ __FBSDID("$FreeBSD$");
>> #include
>> #include
>> #include
>> -#include <sys/time.h>
>>
>> #include
>> #include
>> @@ -61,6 +60,8 @@ __FBSDID("$FreeBSD$");
>> #include
>> #include
>> #include
>> +#include <sysexits.h>
>
> Use of <sysexits.h> is a style bug. It is not used in BSD or KNF code like dd used to be.

sysexits.h is recommended by the err(3) man page. Is that recommendation meant to apply selectively, or is it obsolete, or is some sort of edit war being waged by man page authors?

>> +#include <time.h>
>> #include
>>
>> #include "dd.h"
>> @@ -123,7 +124,7 @@ static void
>> setup(void)
>> {
>> 	u_int cnt;
>> -	struct timeval tv;
>> +	struct timespec tv;
>>
>> 	if (in.name == NULL) {
>> 		in.name = "stdin";
>> @@ -240,8 +241,9 @@ setup(void)
>> 		ctab = casetab;
>> 	}
>>
>> -	(void)gettimeofday(&tv, NULL);
>
> Not checking for errors, and especially voiding the result to say that errors need not be checked for, was sloppy.
>
>> -	st.start = tv.tv_sec + tv.tv_usec * 1e-6;
>> +	if (clock_gettime(CLOCK_MONOTONIC_PRECISE, &tv))
>> +		err(EX_OSERR, "clock_gettime");
>
> The existence of CLOCK_MONOTONIC_PRECISE is a design error. It is the same as the standard CLOCK_MONOTONIC. Use of the former is just gratuitously unportable. It ensures a compile-time failure on OSes (including old versions of FreeBSD) that don't have the CLOCK_MONOTONIC_PRECISE mistake. It gives a runtime error on old versions of FreeBSD that have clock_gettime() and CLOCK_MONOTONIC but not CLOCK_MONOTONIC_PRECISE. Apart from this error, clock_gettime() is as likely to fail as gettimeofday(), or exit().
>
> The existence of CLOCK_MONOTONIC_FAST is a larger design error. It is not fast, but just sloppy, broken and not very fast (sloppiness is the intentional part of its design). Of course there is no need to use it to speed up the whole 2 clock_gettime() calls in dd. It is almost useless for other applications too. In non-broken implementations, the speed of CLOCK_* tends to be dominated by the syscall time even when the timer hardware is slow. On freefall now, the implementation is about half as good as possible, and takes between 32 and 35 nanoseconds for all clock ids, with the least accurate clock id TIME_SECOND tying with CLOCK_REALTIME for slowest (only about 1 nanosecond slower) since it has the worst implementation.
>
> (The implementation on freefall avoids syscalls provided the timer hardware is the TSC. The hardware timer and binuptime() are used for all syscalls including TIME_SECOND. TIME_SECOND is slowest because it does extra work to zero tv_nsec after calculating it with maximal precision. Doing it this way gives the correct time in seconds, but is incompatible with the kernel (syscall and kernel seconds variable) since the kernel calculations are sloppy. The kernel time for seconds and for the sloppy "FAST" clock ids and for the corresponding kernel timestamp functions lags the precise time by up to 1/HZ seconds. This difference can easily be seen from userland.
>
> The hardware part of the 32-35 nanoseconds is 2-4 nanoseconds. This is in sloppy tests. Reading the TSC is unserialized, and loops reading it may be pipelined too well to time it accurately. Also, the times read from it are not completely accurate. But clock_gettime() doesn't serialize it either. Serializing it would make it much slower (maybe 50 nanoseconds).
>
> 2-4 nanoseconds is almost as fast as on AthlonXP 10 years ago! rdtsc tends to be slower than that on Intel CPUs and on modern CPUs. Synchronization in hardware makes it slower on modern CPUs. ISTR it took about 65 cycles on a previous generation of Intel CPUs.
>
> 32-35 nanoseconds for the libc implementation is actually less than half as good as is possible if the hardware part really is only 2-4 nanoseconds. There is a lot of overhead from layering and from working around hardware bugs that usually don't exist.)
Re: svn commit: r265472 - head/bin/dd
On Tue, 6 May 2014, Alan Somers wrote:

> Log:
> dd(1) uses gettimeofday(2) to compute the throughput statistics. However, gettimeofday returns the system clock, which may jump forward or back, especially if NTP is in use. If the time jumps backwards, then dd will see negative elapsed time, round it up to 1usec, and print an absurdly fast transfer rate.
>
> The solution is to use clock_gettime(2) with CLOCK_MONOTONIC_PRECISE as the clock_id. That clock advances steadily, regardless of changes to the system clock.
>
> Reviewed by: delphij
> MFC after: 3 days
> Sponsored by: Spectra Logic
> ...
>
> Modified: head/bin/dd/dd.c
> ==============================================================
> --- head/bin/dd/dd.c	Tue May 6 22:04:50 2014	(r265471)
> +++ head/bin/dd/dd.c	Tue May 6 22:06:39 2014	(r265472)
> @@ -50,7 +50,6 @@ __FBSDID("$FreeBSD$");
> #include
> #include
> #include
> -#include <sys/time.h>
>
> #include
> #include
> @@ -61,6 +60,8 @@ __FBSDID("$FreeBSD$");
> #include
> #include
> #include
> +#include <sysexits.h>

Use of <sysexits.h> is a style bug. It is not used in BSD or KNF code like dd used to be.

> +#include <time.h>
> #include
>
> #include "dd.h"
> @@ -123,7 +124,7 @@ static void
> setup(void)
> {
> 	u_int cnt;
> -	struct timeval tv;
> +	struct timespec tv;
>
> 	if (in.name == NULL) {
> 		in.name = "stdin";
> @@ -240,8 +241,9 @@ setup(void)
> 		ctab = casetab;
> 	}
>
> -	(void)gettimeofday(&tv, NULL);

Not checking for errors, and especially voiding the result to say that errors need not be checked for, was sloppy.

> -	st.start = tv.tv_sec + tv.tv_usec * 1e-6;
> +	if (clock_gettime(CLOCK_MONOTONIC_PRECISE, &tv))
> +		err(EX_OSERR, "clock_gettime");

The existence of CLOCK_MONOTONIC_PRECISE is a design error. It is the same as the standard CLOCK_MONOTONIC. Use of the former is just gratuitously unportable. It ensures a compile-time failure on OSes (including old versions of FreeBSD) that don't have the CLOCK_MONOTONIC_PRECISE mistake. It gives a runtime error on old versions of FreeBSD that have clock_gettime() and CLOCK_MONOTONIC but not CLOCK_MONOTONIC_PRECISE. Apart from this error, clock_gettime() is as likely to fail as gettimeofday(), or exit().

The existence of CLOCK_MONOTONIC_FAST is a larger design error. It is not fast, but just sloppy, broken and not very fast (sloppiness is the intentional part of its design). Of course there is no need to use it to speed up the whole 2 clock_gettime() calls in dd. It is almost useless for other applications too. In non-broken implementations, the speed of CLOCK_* tends to be dominated by the syscall time even when the timer hardware is slow. On freefall now, the implementation is about half as good as possible, and takes between 32 and 35 nanoseconds for all clock ids, with the least accurate clock id TIME_SECOND tying with CLOCK_REALTIME for slowest (only about 1 nanosecond slower) since it has the worst implementation.

(The implementation on freefall avoids syscalls provided the timer hardware is the TSC. The hardware timer and binuptime() are used for all syscalls including TIME_SECOND. TIME_SECOND is slowest because it does extra work to zero tv_nsec after calculating it with maximal precision. Doing it this way gives the correct time in seconds, but is incompatible with the kernel (syscall and kernel seconds variable) since the kernel calculations are sloppy. The kernel time for seconds and for the sloppy "FAST" clock ids and for the corresponding kernel timestamp functions lags the precise time by up to 1/HZ seconds. This difference can easily be seen from userland.

The hardware part of the 32-35 nanoseconds is 2-4 nanoseconds. This is in sloppy tests. Reading the TSC is unserialized, and loops reading it may be pipelined too well to time it accurately. Also, the times read from it are not completely accurate. But clock_gettime() doesn't serialize it either. Serializing it would make it much slower (maybe 50 nanoseconds).

2-4 nanoseconds is almost as fast as on AthlonXP 10 years ago! rdtsc tends to be slower than that on Intel CPUs and on modern CPUs. Synchronization in hardware makes it slower on modern CPUs. ISTR it took about 65 cycles on a previous generation of Intel CPUs.

32-35 nanoseconds for the libc implementation is actually less than half as good as is possible if the hardware part really is only 2-4 nanoseconds. There is a lot of overhead from layering and from working around hardware bugs that usually don't exist.)

Bad implementations can be quite bad. With an i8254 timecounter, it takes about 5000 nanoseconds to read the hardware. On a 486, it took an additional 1-2 nanoseconds for overheads including a syscall. On a Pentium1, it only took 1000-2000 nanoseconds for overheads in old versions of FreeBSD (much more now). With an ACPI-"fast" timecounter, it takes 1000-2000 nanoseconds to read the hardware.
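A crude benchmark in the spirit of the numbers above: time N back-to-back clock_gettime() calls and report the per-call cost. A sketch only; the result depends heavily on the timecounter (TSC vs ACPI vs i8254) and on whether the libc fast path avoids the syscall.

    #include <stdio.h>
    #include <time.h>

    #define N 10000000

    int
    main(void)
    {
        struct timespec begin, end, scratch;
        double elapsed;
        int i;

        clock_gettime(CLOCK_MONOTONIC, &begin);
        for (i = 0; i < N; i++)
            clock_gettime(CLOCK_MONOTONIC, &scratch);
        clock_gettime(CLOCK_MONOTONIC, &end);

        /* Subtract first, then convert to double. */
        elapsed = (end.tv_sec - begin.tv_sec) +
            (end.tv_nsec - begin.tv_nsec) * 1.0e-9;
        printf("%.1f ns per clock_gettime()\n", elapsed / N * 1e9);
        return (0);
    }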