Andi Kleen wrote:
> 
> On Fri, Oct 06, 2000 at 10:00:36PM +1100, Andrew Morton wrote:
> > The little-low-latency patch for test9 is at
> >
> >       http://www.uow.edu.au/~andrewm/linux/2.4.0-test9-low-latency.patch
> >
> > Notes:
> >
> > - It now passes Benno's tests with 50% headroom (thanks to
> >   Ingo's scheduler race fix).
> 
> What was that race exactly ?

Not completely sure.  I _think_ the problem was this: the kernel is
switching from your SCHED_FIFO process to some other process, and an
interrupt occurs between the reenabling of interrupts and the
switch_to() in schedule().  If that interrupt tries to wake the
SCHED_FIFO process, the wakeup isn't noticed until the next timeslice.
That was as far as I got when the problem magically disappeared, due to
this hunk:

        switch_to(prev, next, prev);
        __schedule_tail(prev);

same_process:
        reacquire_kernel_lock(current);
+       if (current->need_resched)
+               goto tq_scheduler_back;

        return;


> There is a scheduler race which may also hurt (and is harder to fix):
> when the timer interrupt hits in syscall exit after the need_resched check
> was done then you may lose a time slice. The window can be quite long
> when signals are handled (after do_signal returned there is no reschedule
> check). Without signals it is only a few instructions window.
> 
> I have not checked if it really is a problem in practice though. With
> lots of signals it may be a problem.

Is it not a matter of:

a) checking need_resched after the call to do_signal(), and

b) disabling local interrupts prior to the final need_resched
   check, to make this test atomic wrt interrupts.  RESTORE_ALL
   will do the right thing, and an intervening smp_send_reschedule()
   will be blocked until the return to user space.

Seems too simple...
