Re: Latency: allowing rescheduling while holding spin_locks
Roger Larsson wrote:
>
> On Sunday 14 January 2001 01:06, george anzinger wrote:
> > Nigel Gamble wrote:
> > > On Sat, 13 Jan 2001, Roger Larsson wrote:
> > > > A rethinking of the rescheduling strategy...
> > >
> > > Actually, I think you have more-or-less described how successful
> > > preemptible kernels have already been developed, given that your
> > > "sleeping spin locks" are really just sleeping mutexes (or binary
> > > semaphores).
> > >
> > > 1. Short critical regions are protected by spin_lock_irq(). The
> > > maximum value of "short" is therefore bounded by the maximum time
> > > we are happy to disable (local) interrupts - ideally ~100us.
> > >
> > > 2. Longer regions are protected by sleeping mutexes.
> > >
> > > 3. Algorithms are rearchitected until all of the highly contended
> > > locks are of type 1, and only low contention locks are of type 2.
> > >
> > > This approach has the advantage that we don't need to use a
> > > no-preempt count, and test it on exit from every spinlock to see
> > > if a preempting interrupt that has caused a need_resched has
> > > occurred, since we won't see the interrupt until it's safe to do
> > > the preemptive resched.
> >
> > I agree that this was true in days of yore. But these days the irq
> > instructions introduce serialization points and, methinks, may be
> > much more time consuming than the "++, --, if (false)" that a
> > preemption count implementation introduces. Could someone with
> > knowledge of the hardware comment on this?
> >
> > I am not suggesting that the "++, --, if (false)" is faster than an
> > interrupt, but that it is faster than cli, sti. Of course we are
> > assuming that there is stuff between the cli and the sti, as there
> > is between the ++ and the --, if (false).
>
> The problem with a counting scheme is that you cannot schedule inside
> any spinlock - you have to split them up. Maybe you will have to do
> that anyway. But if your RT process never needs more memory, it
> should be quite safe.
> The difference with a sleeping mutex is that it can be made lazier -
> keep it in the runlist, there should be very few...

Nigel and I agree on the approach he has laid out, with the possible
exception of just how to handle the short spinlocks. It is agreed that
we cannot preempt a task that holds a spinlock. He suggests that the
overhead of testing for preemption on the exit of a spinlock protected
with the preempt_count is higher than the cost of turning the interrupt
system off and on. He may well be right, and surely was right 5 or 10
years ago. Today the cost of a cli or sti is much higher relative to
the memory references, especially if we don't need to make the result
visible to other processors (and we don't). We only have to serialize
WRT our own interrupt system, but the interrupt itself will do this,
and only when we need it.

<snip>

WRT your patch: a big problem with simple sleeping mutexes is priority
inversion. An example: given tasks L of low priority, M of medium, and
H of high, and a mutex X. If L is holding X when it is preempted by M,
and M wants to run a long time, then when H preempts M and tries to get
X it will have to wait while M does his thing, just because L cannot
get the cycles needed to get out of X.

A priority-inherit mutex (pi_mutex) handles this by boosting the
priority of L (the holder of X) to H's priority when H tries to get X,
until L releases X. At that point L reverts to its prior priority and
H continues, now having succeeded in getting X. This is all
complicated, of course, by remembering that a task can hold several
mutexes at a time and each can have several waiters.

From a real-time point of view, we would NEVER want to scan the task
list looking for someone to wake up. We should know who to wake up
from the get-go. Likewise, clutter in the run list adds wasted cycles
and cache lines to the schedule process.
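The boost/restore step described above can be sketched as a toy model.
This is a minimal, single-step illustration only: the names (pi_mutex,
pi_lock, pi_unlock) and the two-field task model are hypothetical, and
a real implementation must also block the waiter and handle nested
locks and multiple waiters.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a priority-inheritance mutex: when a high-priority task
 * tries to take a mutex held by a low-priority task, the holder
 * inherits the waiter's priority until it releases the lock. */

struct task {
    int prio;        /* current (possibly boosted) priority */
    int base_prio;   /* priority before any inheritance boost */
};

struct pi_mutex {
    struct task *owner;   /* NULL when the mutex is free */
};

/* Waiter side: boost a lower-priority owner to our priority so it can
 * get the cycles needed to finish its critical region.  (Real code
 * would now sleep until the owner releases.) */
static void pi_lock(struct pi_mutex *m, struct task *t)
{
    if (m->owner) {
        if (m->owner->prio < t->prio)
            m->owner->prio = t->prio;   /* inherit waiter's priority */
        return;
    }
    m->owner = t;                       /* uncontended: just take it */
}

/* Owner side: release the mutex and revert any inherited boost. */
static void pi_unlock(struct pi_mutex *m)
{
    m->owner->prio = m->owner->base_prio;
    m->owner = NULL;
}
```

Usage: with L (prio 1) holding X and H (prio 10) calling pi_lock(), L
runs at prio 10 until its pi_unlock() drops it back to 1.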
George

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Re: Latency: allowing rescheduling while holding spin_locks
On Sunday 14 January 2001 01:06, george anzinger wrote:
> Nigel Gamble wrote:
> > On Sat, 13 Jan 2001, Roger Larsson wrote:
> > > A rethinking of the rescheduling strategy...
> >
> > Actually, I think you have more-or-less described how successful
> > preemptible kernels have already been developed, given that your
> > "sleeping spin locks" are really just sleeping mutexes (or binary
> > semaphores).
> >
> > 1. Short critical regions are protected by spin_lock_irq(). The
> > maximum value of "short" is therefore bounded by the maximum time
> > we are happy to disable (local) interrupts - ideally ~100us.
> >
> > 2. Longer regions are protected by sleeping mutexes.
> >
> > 3. Algorithms are rearchitected until all of the highly contended
> > locks are of type 1, and only low contention locks are of type 2.
> >
> > This approach has the advantage that we don't need to use a
> > no-preempt count, and test it on exit from every spinlock to see
> > if a preempting interrupt that has caused a need_resched has
> > occurred, since we won't see the interrupt until it's safe to do
> > the preemptive resched.
>
> I agree that this was true in days of yore. But these days the irq
> instructions introduce serialization points and, methinks, may be
> much more time consuming than the "++, --, if (false)" that a
> preemption count implementation introduces. Could someone with
> knowledge of the hardware comment on this?
>
> I am not suggesting that the "++, --, if (false)" is faster than an
> interrupt, but that it is faster than cli, sti. Of course we are
> assuming that there is stuff between the cli and the sti, as there
> is between the ++ and the --, if (false).

The problem with a counting scheme is that you cannot schedule inside
any spinlock - you have to split them up. Maybe you will have to do
that anyway. But if your RT process never needs more memory, it should
be quite safe.
The difference with a sleeping mutex is that it can be made lazier -
keep it in the runlist, there should be very few...

See first patch attempt.

(George, Nigel told me about your idea before I sent the previous
mail. So major influence comes from you. But I do not think that it
is equivalent.)

/RogerL

Note: changed email...

--- ./linux/kernel/sched.c.orig	Sat Jan 13 19:19:20 2001
+++ ./linux/kernel/sched.c	Sat Jan 13 23:27:13 2001
@@ -144,7 +144,7 @@
 	 * Also, dont trigger a counter recalculation.
 	 */
 	weight = -1;
-	if (p->policy & SCHED_YIELD)
+	if (p->policy & (SCHED_YIELD | SCHED_SPINLOCK))
 		goto out;
 
 	/*
@@ -978,7 +978,7 @@
 	read_lock(&tasklist_lock);
 	p = find_process_by_pid(pid);
 	if (p)
-		retval = p->policy & ~SCHED_YIELD;
+		retval = p->policy & ~(SCHED_YIELD | SCHED_SPINLOCK);
 	read_unlock(&tasklist_lock);
 
 out_nounlock:
@@ -1267,3 +1267,54 @@
 	atomic_inc(&init_mm.mm_count);
 	enter_lazy_tlb(&init_mm, current, cpu);
 }
+
+void wakeup_spinlock_yielder(spinlock_t *lock)
+{
+	int need_resched = 0;
+	struct list_head *tmp;
+	struct task_struct *p;
+
+	/* I do not like this part...
+	 * not SMP safe, the runqueue might change under us...
+	 * can not use spinlocks...
+	 * runlist might be long...
+	 */
+	local_irqsave(flags);
+	if (lock->spin) {
+		/* someone is "spinning" on it
+		 * it has to have higher prio than this
+		 * let go of ALL :-( spinning processes
+		 */
+		lock->spin = 0;
+
+		list_for_each(tmp, &runqueue_head) {
+			p = list_entry(tmp, struct task_struct, run_list);
+			if (p->policy & SCHED_SPINLOCK) {
+				p->policy &= ~SCHED_SPINLOCK;
+			}
+		}
+
+		need_resched = 1;
+	}
+	local_irqrestore(flags);
+
+	/* all higher prio will get a chance to run...
+	 */
+	if (need_resched)
+		schedule_running();
+}
+
+void schedule_spinlock(spinlock_t *lock)
+{
+	while (test_and_set(lock->lock)) {
+		/* note: owner can not race here, it has lower prio */
+
+		lock->spinon = 1;
+		p->policy |= SCHED_SPINLOCK;
+		schedule_running();
+		/* will be released in priority order */
+	}
+}
--- ./linux/include/linux/sched.h.orig	Sat Jan 13 19:25:53 2001
+++ ./linux/include/linux/sched.h	Sat Jan 13 19:26:31 2001
@@ -119,6 +119,7 @@
  * yield the CPU for one re-schedule..
  */
 #define SCHED_YIELD	0x10
+#define SCHED_SPINLOCK	0x20
 
 struct sched_param {
 	int sched_priority;
--- ./linux/include/linux/spinlock.h.orig	Sat Jan 13 19:40:30 2001
+++ ./linux/include/linux/spinlock.h	Sat Jan 13 21:51:14 2001
@@ -66,16 +66,37 @@
 typedef struct {
 	volatile unsigned long lock;
+	??? queue;
 } spinlock_t;
 
 #define SPIN_LOCK_UNLOCKED (spinlock_t) { 0 }
 
+void wakeup_spinlock_yielder(spinlock_t *lock);
+void schedule_spinlock(spinlock_t *lock);
+
 #define spin_lock_init(x)	do { (x)->lock = 0; } while (0)
 #define spin_is_locked(lock)	(test_bit(0,(lock)))
-#define spin_trylock(lock)	(!test_and_set_bit(0,(lock)))
+#define spin_trylock(lock)	(!test_and_set_bit(0,(lock)))	/* fail handled */
+
+#define spin_lock(x)	do { \
+	if (test_and_set(lock->lock)) \
+		schedule_spinlock();	/* kind of yield, giving low goodness, sticky */ \
+} while (0)
Re: Latency: allowing rescheduling while holding spin_locks
Nigel Gamble wrote:
>
> On Sat, 13 Jan 2001, Roger Larsson wrote:
> > A rethinking of the rescheduling strategy...
>
> Actually, I think you have more-or-less described how successful
> preemptible kernels have already been developed, given that your
> "sleeping spin locks" are really just sleeping mutexes (or binary
> semaphores).
>
> 1. Short critical regions are protected by spin_lock_irq(). The
> maximum value of "short" is therefore bounded by the maximum time we
> are happy to disable (local) interrupts - ideally ~100us.
>
> 2. Longer regions are protected by sleeping mutexes.
>
> 3. Algorithms are rearchitected until all of the highly contended
> locks are of type 1, and only low contention locks are of type 2.
>
> This approach has the advantage that we don't need to use a
> no-preempt count, and test it on exit from every spinlock to see if a
> preempting interrupt that has caused a need_resched has occurred,
> since we won't see the interrupt until it's safe to do the preemptive
> resched.

I agree that this was true in days of yore. But these days the irq
instructions introduce serialization points and, methinks, may be much
more time consuming than the "++, --, if (false)" that a preemption
count implementation introduces. Could someone with knowledge of the
hardware comment on this?

I am not suggesting that the "++, --, if (false)" is faster than an
interrupt, but that it is faster than cli, sti. Of course we are
assuming that there is stuff between the cli and the sti, as there is
between the ++ and the --, if (false).

George

> Nigel Gamble                        [EMAIL PROTECTED]
> Mountain View, CA, USA.             http://www.nrg.org/
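The "++, --, if (false)" scheme George is weighing can be sketched in
a few lines. This is an illustrative stand-in, not the kernel's actual
code: preempt_count, need_resched, and preempt_schedule() here are toy
globals for one CPU, where the real fields live per task.

```c
#include <assert.h>

/* Preemption-count scheme: the lock side is an increment, the unlock
 * side is a decrement plus one test that is almost always false. */

static int preempt_count;   /* > 0 means preemption is forbidden */
static int need_resched;    /* set by an interrupt that woke a higher-prio task */
static int resched_calls;   /* counts the deferred reschedules, for illustration */

static void preempt_schedule(void)
{
    resched_calls++;        /* a real kernel would call schedule() here */
    need_resched = 0;
}

/* Entering a critical region: the "++" */
#define preempt_disable()  do { preempt_count++; } while (0)

/* Leaving a critical region: the "--, if (false)".  Only on exit from
 * the outermost region do we run a reschedule that an interrupt
 * requested while we held a spinlock. */
#define preempt_enable()                              \
    do {                                              \
        if (--preempt_count == 0 && need_resched)     \
            preempt_schedule();                       \
    } while (0)
```

The point of the scheme is visible in the nesting: an interrupt that
sets need_resched inside a nested region costs nothing extra until the
outermost preempt_enable(), and no cli/sti serialization is needed.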
Re: Latency: allowing rescheduling while holding spin_locks
On Sat, 13 Jan 2001, Roger Larsson wrote:
> A rethinking of the rescheduling strategy...

Actually, I think you have more-or-less described how successful
preemptible kernels have already been developed, given that your
"sleeping spin locks" are really just sleeping mutexes (or binary
semaphores).

1. Short critical regions are protected by spin_lock_irq(). The
maximum value of "short" is therefore bounded by the maximum time we
are happy to disable (local) interrupts - ideally ~100us.

2. Longer regions are protected by sleeping mutexes.

3. Algorithms are rearchitected until all of the highly contended
locks are of type 1, and only low contention locks are of type 2.

This approach has the advantage that we don't need to use a no-preempt
count, and test it on exit from every spinlock to see if a preempting
interrupt that has caused a need_resched has occurred, since we won't
see the interrupt until it's safe to do the preemptive resched.

Nigel Gamble                        [EMAIL PROTECTED]
Mountain View, CA, USA.             http://www.nrg.org/
Latency: allowing rescheduling while holding spin_locks
Hi,

A rethinking of the rescheduling strategy...

I have come to this conclusion. A spinlock prevents other processes
from entering that specific region. But interrupts are still allowed,
and they might delay execution of a spin-locked region for an
undefined (small, but anyway) time. Code with critical maximum times
should use spin_lock_irq!

=> spin_locks are not about disallowing reschedules.

Prior to the introduction of spin locks it did not make sense to allow
reschedules in the kernel, since the big kernel lock was so big... Any
code that wanted to do anything beyond pure computation would hit it
very quickly. Now, with spin locks, the situation is quite
different...

[First assume a UP kernel for simplicity]

Suppose you have two processes, one normal (P) and one high priority
(RTP).

P runs user code, makes a system call, enters a spin lock region.
Interrupt! The interrupt service routine wakes up RTP, which marks P
as need_reschedule, and returns. On return from interrupt it detects
that P needs_reschedule - so do it, even if P is executing in the
kernel and holding a spin_lock.

RTP starts, and if it does not hit the same spin_lock nothing special
happens until it goes to sleep again. But suppose it does! RTP tries
to get the spin_lock but fails. Since it is the currently highest prio
process and P is running, it wants to reschedule to P to get its own
stuff done. P runs the final part of its spin_locked region; upon
spin_unlock it needs to get RTP running.
Something like this:

spin_lock(lock)
{
	while (test_and_set(lock->lock)) {
		schedule_spinlock();	/* kind of yield, giving low goodness, sticky */
	}
}

spin_unlock(lock)
{
	clear(lock);
	/* note: someone with higher prio than me might steal the lock
	 * from even higher prio waiters here */
	if (lock->queue)
		wakeup_spinlock_yielder(lock);
}

schedule_spinlock()
{
	/* note: owner can not run here, it has lower prio */
	addqueue(lock->queue, current);
	p->policy |= SCHED_SPINLOCK;
	schedule();
}

wakeup_spinlock_yielder(lock)
{
	int need_resched = 0;
	int my_goodness = goodness(current);

	forall p in lock->queue {
		p->policy &= ~SCHED_SPINLOCK;
		if (goodness(p) > my_goodness)
			need_resched = 1;
	}

	if (need_resched)
		schedule();
}

A final note on spin_lock_irq: since they prevent IRQs, there will be
no requests to wake up any process during their locked region => no
problems.

--
Home page:
  none currently
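The lock/unlock handshake sketched above can be modeled as plain C.
This is a single-CPU, one-waiter toy (no atomics, no real scheduler);
the toy_* names are illustrative, not the proposed kernel API.

```c
#include <assert.h>

/* Toy model of the yield-on-contention spinlock: a contender that
 * fails to take the lock queues itself and marks SCHED_SPINLOCK so
 * the scheduler skips it; spin_unlock clears the mark again. */

#define SCHED_SPINLOCK 0x20

struct toy_task {
    int policy;                /* SCHED_* flag bits */
};

struct toy_lock {
    int locked;
    struct toy_task *waiter;   /* at most one waiter in this toy model */
};

/* Non-atomic stand-in for the real test_and_set (UP illustration). */
static int test_and_set(int *v)
{
    int old = *v;
    *v = 1;
    return old;
}

/* Contender side: on failure, queue up, set the policy bit, and (in a
 * real kernel) call schedule() so the owner can finish the region.
 * Returns 1 if the lock was acquired, 0 if the caller yielded. */
static int toy_spin_lock(struct toy_lock *l, struct toy_task *t)
{
    if (test_and_set(&l->locked)) {
        l->waiter = t;
        t->policy |= SCHED_SPINLOCK;   /* goodness() will now skip us */
        return 0;                      /* would schedule() here */
    }
    return 1;
}

/* Owner side: release, then clear the waiter's SCHED_SPINLOCK bit so
 * the scheduler will pick the highest-priority contender next. */
static void toy_spin_unlock(struct toy_lock *l)
{
    l->locked = 0;
    if (l->waiter) {
        l->waiter->policy &= ~SCHED_SPINLOCK;
        l->waiter = 0;
    }
}
```

This mirrors the RTP scenario above: the high-priority contender parks
itself with SCHED_SPINLOCK set, and the owner's unlock makes it
runnable again.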