2500

Chuck Swiger Mon, 07 Sep 2009 15:09:25 -0700

Hi, all--

On Sep 7, 2009, at 12:21 AM, Luigi Rizzo wrote:

3. ?CAUSE an error in tvtohz(), reported long ago in
? ? ? ?http://www.dragonflybsd.org/presentations/nanosleep/

? ? ? ?which causes a systematic error of an extra tick in thecomputation

? ? ? ?of the sleep times.
? ?FIX: the above link also contains a proposed fix (which in fact
? ? ? ?reverts a bug introduced in some old commit on FreeBSD)
? ?PROBLEM: none that i see.


This change, as-is, is extremely dangerous.  tsleep/msleep() use a
value of 0 meaning 'forever'.  Changing tvtohz() so that it can now
return 0 for a non-zero tv is setting land-mines all over the place.
There's something like 27 callers of tvtohz() in sys/kern alone, some
of which are used to supply tsleep/msleep() timeouts.  Note that the

dragonflybsd patch above only adds the 'returns 0' check to onesinglecaller. You either need to patch all callers of tvtohz() sinceyou've

change the semantics, or add a 'if (ticks == 0) ticks = 1' check (or
checks) in the appropriate places inside tvtohz().

If you don't do it, then you end up with callers of random functions
with very small timeouts instead finding themselves sleeping forever.


You are right, a proper fix for this third issue should be different
(if we want a fix at all -- I'd be almost satisfied by just
removing the drift).

A while ago, I took a fairly close look at timing accuracy (ie, tryingto sleep until exactly the next second or whatever, and then comparingthe actual time you were woken up against the target time), did my ownfancy graphs showing sawtooth waves indicating timer aliasingproblems, etc. There's no question that the system is allowed tosleep longer than requested, and, if the system is busy doing othertasks, some amount of extra delay is expected and OK, but missing thetarget time by more than several ticks, especially if the system isidle, is bogus.

The Dragonfly patch to tvtohz() definitely improves timer accuracy,but it ends up computing a tick value which is too small by onesometimes (ie, the target time lands *within* that scheduler quantum,rather than at or before the start of that tick), causing negativeoffsets from the desired time (ie, you woke up just a bit too soon).That's also bogus. :-)

From my own testing of various platforms, as well as the resultsreported elsewhere, MacOS X has the most consistent timer accuracy,which is probably not too surprising given that soft realtimecapabilities for audio/visual processing tasks such as iTunes,Quicktime, LogicStudio, etc are pretty important to the Mac userbase.I just compared the tvtohz() implementations in kern/kern_clock.cbetween FreeBSD and Darwin/XNU (specifically xnu-1228.15.4, the kernelfor 10.5.8), and they are effectively identical. However, that'sprobably moot because Darwin uses tvtoabstime() rather than tvtohz()for the callouts in the setitimer()/getitimer() implementation fromkern/kern_time.c:


/* FreeBSD */
int

kern_setitimer(struct thread *td, u_int which, struct itimerval *aitv,struct itimerval *oitv)

{
[ ... ]
        if (which == ITIMER_REAL) {
                PROC_LOCK(p);
                if (timevalisset(&p->p_realtimer.it_value))
                        callout_stop(&p->p_itcallout);
                getmicrouptime(&ctv);
                if (timevalisset(&aitv->it_value)) {

callout_reset(&p->p_itcallout, tvtohz(&aitv->it_value),

                            realitexpire, p);
                        timevaladd(&aitv->it_value, &ctv);
                }
                *oitv = p->p_realtimer;
                p->p_realtimer = *aitv;
                PROC_UNLOCK(p);
                if (timevalisset(&oitv->it_value)) {
                        if (timevalcmp(&oitv->it_value, &ctv, <))
                                timevalclear(&oitv->it_value);
                        else
                                timevalsub(&oitv->it_value, &ctv);
                }
[ ... ]

/* Darwin */
int

setitimer(struct proc *p, struct setitimer_args *uap, register_t*retval)

{
[ ... ]
        switch (uap->which) {

        case ITIMER_REAL:
                proc_spinlock(p);
                if (timerisset(&aitv.it_value)) {
                        microuptime(&p->p_rtime);
                        timevaladd(&p->p_rtime, &aitv.it_value);
                        p->p_realtimer = aitv;

if (!thread_call_enter_delayed(p->p_rcall,tvtoabstime(&p->p_rtime))) // [1]

                                p->p_ractive++;
                } else  {
                        timerclear(&p->p_rtime);
                        p->p_realtimer = aitv;
                        if (thread_call_cancel(p->p_rcall))
                                p->p_ractive--;
                }
                proc_spinunlock(p);

                break;
[ ... ]

The simplest option is perhaps to compute a custom value for
nanosleep, select and poll. This would remove the risk of side
effects to other parts of the system, and we could also use the
chance to compensate for the errors that arise when hz*tick !=
1000000 or when we know that hardclock does not run exactly every
'tick' (an integer) microseconds.

I think it would be relatively simple to adapt the DragonFly changebut ensure that tztohz() returns at least 1, and that would be helpfulimprovement.

However, in the long run, a callout mechanism based upon schedulerticks rather than actual time values suffers an inherent loss ofaccuracy (you're performing an integer division of time intervals byHZ and rounding up to use the API, after all) and makes assumptionsthat the scheduler's clock is always running at a constant frequencysuch that tick * HZ is exactly a million microseconds.


Regards,
--
-Chuck

[1]: See 
http://www.opensource.apple.com/source/xnu/xnu-1228.15.4/osfmk/kern/thread_call.c
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: incorrect usleep/select delays with HZ > 2500

Reply via email to