Re: delaying bottom halfes (was Re: nice policy in Linux)

Richard Gooch Thu, 8 Apr 1999 07:19:15 -0400
Torsten Scherer writes:
> On 01-Apr-99 Richard Gooch wrote:
> 
> > Well, I disagree. A RT task may be waiting for data from a network
> > socket. That's a legitimate thing to do.
> 
>  Sure, but then it's no longer very much RT and it doesn't matter to
> delay it a bit more.

Except for the case I described below.
[...]
> > But there is a further problem. Say you have two RT tasks. Task A
> > has highest priority and is waiting for data from the network. Task
> > B is at lower priority and is doing longer-term computing. By
> > preventing BH processing, you delay A from being woken up until B
> > finishes processing. This is undesirable.
> 
>  If it's a BH that actually wakes task A again (I just assume so, I
> don't dare to look at the network code ;-) then you're obviously
> right.

Yep, I think it's a BH that wakes A. The interrupt handler grabs the
data off the NIC, and the BH figures out where the packet should go.

>  But on the other hand you've got a RT task running as task B, and
> you expect it to be RT, don't you? It's a problem, yes, but you'll
> always get into problems if you've got more than just one RT task
> running at the same time - you'll always violate the demands of all
> but one of them. This is more a philosophical issue...

No, these things are all quite well defined. That's why we have
priorities. If RT task A has higher priority than RT task B, task A
gets the CPU. Always (provided task A is on the run queue).

If task A is sleeping and an IRQ/BH wakes it up, it gets put on the
run queue. It must then start running immediately if it's the highest
priority task on the run queue.
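
A rough user-space sketch of that rule, in case it helps. This is not the
real scheduler code: the struct, the field names and both functions are
made up for illustration, and I'm assuming a larger number means higher
priority.

```c
#include <stddef.h>

/* Hypothetical, simplified task descriptor (not the kernel's task_struct). */
struct rt_task {
    const char *name;
    int rt_priority;   /* assumption: larger value = higher priority */
    int on_runqueue;   /* nonzero if the task is runnable */
};

/* Return the highest-priority runnable task, or NULL if none is runnable. */
static struct rt_task *pick_next(struct rt_task *tasks, size_t n)
{
    struct rt_task *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!tasks[i].on_runqueue)
            continue;   /* sleeping tasks never compete for the CPU */
        if (best == NULL || tasks[i].rt_priority > best->rt_priority)
            best = &tasks[i];
    }
    return best;
}

/* What an IRQ/BH wakeup amounts to here: put the task on the run queue. */
static void wake_up_task(struct rt_task *t)
{
    t->on_runqueue = 1;
}
```

With A asleep, pick_next() returns B. As soon as wake_up_task() is called
for A, pick_next() returns A, which is the behaviour I'm arguing for.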

If task B can defer task A waking up, because BHs are deferred, that
is definitely no good. Task A may be waiting on urgent data from a
data capture card. Waiting until task B decides to block, or for the
next timer tick, is no good.

>  I didn't claim the idea is perfect. I just claim it may help for
> SOME (simple) cases. There still is RT-Linux for the more severe
> cases...

I agree about RT-Linux for more severe cases, but I don't agree with
deferring BHs. It makes the soft-RT we have now worse for some
cases. And it's worse in a way that's just wrong.

Now, if you could make sure that you don't defer BHs for an RT task,
it would be OK. But you basically can't do that. For an incoming
packet, you have to process it before you know who it's for, and once
you've done all that, you've done almost all the work anyway.
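
To illustrate, here is a toy demultiplexer. It is only a sketch and is
nothing like the real network code: the socket table, the owner_is_rt
field and the function name are all hypothetical, and the packet is
assumed to start with a UDP-style header (source port in bytes 0-1,
destination port in bytes 2-3, big-endian).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical socket table entry: which port, and is the owner an RT task? */
struct toy_socket {
    uint16_t port;
    int owner_is_rt;
};

/*
 * Find the socket a packet is destined for. Note that we cannot tell
 * whether the owner is an RT task until the header has been parsed and
 * the table searched, i.e. until the demux work has already been done.
 */
static const struct toy_socket *toy_demux(const struct toy_socket *socks,
                                          size_t n,
                                          const uint8_t *pkt, size_t len)
{
    uint16_t port;

    if (len < 4)
        return NULL;   /* too short to hold both port fields */

    port = (uint16_t)((pkt[2] << 8) | pkt[3]);   /* destination port */

    for (size_t i = 0; i < n; i++) {
        if (socks[i].port == port)
            return &socks[i];
    }
    return NULL;       /* nobody is listening on that port */
}
```

Only the return value of toy_demux() tells you whether deferring would
have been safe, and by then there is little left to defer.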

If you care about BHs blocking RT processes, I think you need to use
RT-Linux. Hacking around with BH deferral in this way is not going to
work, at least not for a general solution. It may be fine for your
problem, but it can make things much worse for others.

> > But if you have a scenario which I described, (A blocks and B
> > computes), then you lose badly. To make your scheme work, you'd need
> > to define another scheduling class (or a new scheduling flag), which
> > says "defer BH processing if I'm on the run queue". Call it
> > SCHED_DEFER_BH.
>
>  Saying `I am your task and thou shallst not have other tasks
> besides me'? :-) Sure, yes, but this would only declare the
> behaviour as `per definition', it would not avoid the behaviour.

Yes. Task A would use this so it doesn't lose CPU time to those
heathen BHs.

> > Hm, better yet, instead of basing BH deferral on SCHED_DEFER_BH
> > tasks on the run queue, base it on whether there is a SCHED_DEFER_BH
> > task running on the current processor. You'd just need the
> > switch_to() function to set/unset a per-CPU "defer_bh" flag. Then
> > the BH entry point can check this flag. This wouldn't need any
> > locking or atomic operation, and would not be part of the scheduler
> > proper (the bit that scans the run queue).
> >
> > Adding the code to switch_to() is OK, because that's a fairly heavy
> > operation anyway.
> 
>  Per CPU. Hmmm. I've not done this to a 2.2 kernel as we've still
> (2.2.4) got NFS problems with them. But it'll be a bit more
> complicated `per CPU'.

Nope. Just look at the "current" pointer. That is the task that is
currently running on the CPU. In fact, you don't even need to hack
switch_to().

*All* you need to do is add:
    if (current->sched_flags & SCHED_FLAG_DEFER_BH) return;

in the BH code, before calling the individual handlers. Easy! I don't
know why I kept on about switch_to() once I thought of checking the
task on the current processor.
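
For what it's worth, here is a user-space mock-up of where that check
would sit. It is a sketch under assumptions, not a patch: current_task
stands in for the kernel's "current", and the flag value, the pending
bitmask, the handler table and the function names are all invented for
the example.

```c
#include <stddef.h>

#define SCHED_FLAG_DEFER_BH 0x1UL   /* flag value is an assumption */
#define NR_BH 32                    /* number of BH slots in this mock-up */

/* Hypothetical task descriptor carrying the proposed scheduling flags. */
struct task {
    unsigned long sched_flags;
};

static struct task *current_task;         /* stand-in for "current" */
static unsigned long bh_pending;          /* one bit per pending BH */
static void (*bh_handlers[NR_BH])(void);  /* one handler per BH slot */

/* Demo handler that just counts how often it has been run. */
static int demo_count;
static void demo_bh(void) { demo_count++; }

/* Called from "interrupt context" to request that BH nr be run later. */
static void mark_bh_pending(int nr)
{
    bh_pending |= 1UL << nr;
}

/*
 * Run all pending BHs, unless the task on this CPU asked for deferral.
 * Returns the number of handlers actually run. When deferring, the
 * pending bits are left set so the BHs run on a later call.
 */
static int run_bottom_halves(void)
{
    int ran = 0;

    if (current_task->sched_flags & SCHED_FLAG_DEFER_BH)
        return 0;

    for (int nr = 0; nr < NR_BH; nr++) {
        unsigned long bit = 1UL << nr;

        if (!(bh_pending & bit))
            continue;
        bh_pending &= ~bit;          /* clear before running the handler */
        if (bh_handlers[nr]) {
            bh_handlers[nr]();
            ran++;
        }
    }
    return ran;
}
```

The point is that the check only reads a field of the task already
running on this CPU, so no locking or run queue scan is involved.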

Make it a scheduling flag, not a new scheduling class. I'd like to see
us move towards scheduling flags: it will make some of the scheduler
code simpler, IIRC.

> > Hm. I'm beginning to warm to this idea. It doesn't solve interrupts
> > stealing time, but it helps. Naturally, it does nothing for reducing
> > interrupt latency either, but it may prove useful for some
> > applications.
> 
>  I can send you the complete patch if you like.

Sure, although I'm currently not fiddling with RT stuff, but I will be
again. I'll certainly cast my eye over it.

                                Regards,

                                        Richard....