Garrett D'Amore wrote:

[EMAIL PROTECTED] wrote:

Garrett D'Amore wrote:

Erik Nordmark wrote:


[Sorry for the slow response time]

Garrett D'Amore wrote:

I'm not talking about consuming cycles. I'm not talking about wall clock time considerations. I'm talking about potential deadlocks due to cv_wait.

You must -not- under any conditions cv_wait while in ip_input's path. Because you don't know anything about what locks or PIL the calling context is coming from.



I don't think there is a hard and fast rule on this.

It is true that *typically* a mutex is used with a short or bounded hold time, and that a condition variable might be used when the hold time is unbounded.

But there might be subsystems that use a mutex (or rwlock) where the hold time is unbounded. And one can use condition variables to create mutual exclusion primitives that have bounded hold time.
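The point about building mutual exclusion out of condition variables, with the internal lock held only briefly, can be sketched in userland terms. This is a POSIX-threads analog, not Solaris kernel code; the `cv_gate` name and layout are illustrative only:

```c
#include <pthread.h>
#include <stdbool.h>

/* Illustrative userland analog (POSIX threads, not kernel cv_wait/mutex_enter):
 * a mutual-exclusion gate built from a mutex and a condition variable.
 * The internal mutex is held only long enough to test and flip the busy
 * flag, so its hold time is bounded even when the guarded resource is
 * held for an unbounded time. */
struct cv_gate {
    pthread_mutex_t lock;   /* held only briefly */
    pthread_cond_t  cv;
    bool            busy;   /* is the resource currently owned? */
};

void gate_enter(struct cv_gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (g->busy)                       /* re-check after each wakeup */
        pthread_cond_wait(&g->cv, &g->lock);
    g->busy = true;
    pthread_mutex_unlock(&g->lock);
    /* ...caller may now hold the resource for an unbounded time... */
}

void gate_exit(struct cv_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->busy = false;
    pthread_cond_signal(&g->cv);          /* wake one waiter, if any */
    pthread_mutex_unlock(&g->lock);
}
```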

The 'interrupts as threads' support in the Solaris kernel can deal with all combinations; a mutex_enter on an adaptive mutex will sleep if the thread holding the mutex is not running.

Thus cv vs. mutex/rwlock is a red herring. What matters is whether the hold time is bounded or not.



I think you've missed a very important point.

Mutexes that are adaptive have priority inheritance, to prevent priority inversion. Condition variables do not.

The problem I see is what happens if the resource you are waiting for requires an interrupt to be processed, but the interrupt thread cannot run because the thread you are currently executing on is running at a higher interrupt priority level, so the interrupt stays blocked on that CPU.



In the case of the driver which started this discussion, that argument does not hold true. The process that will wake the threads sleeping on the cv is spinning on proc and requires no interrupt. I understand the basic reason for this "rule" - but in this particular case forcing all threads entering the driver to spin would be an unnecessary waste of system resources. It does not buy us any safety against deadlock. In life there are very few absolutes....


_BUT_ the process that is spinning on proc _may_ not actually be _on_ proc. I.e. what happens if it cannot be scheduled on the CPU?


That could happen even if we followed the rules and never blocked, unless we were non-preemptible. Even in the case of the software crypto implementation which does not sleep, it could still be preempted and put to sleep by the OS.....


Unless you take _extreme_ caution to make sure that you validate 100% that the code can never, ever be context switched away (see "very few absolutes") ... even by real-time threads.... then you have a potential hang/deadlock.


Again, that could happen even if you don't block.

I will also point out that this smacks very, very highly of a common mis-optimization. That is, you are optimizing for the case where the system is under very light load (queue depth is at or close to empty). In such cases, the extra overhead of queuing is not generally noticeable unless you are tuning for extremely low latency. Generally, latencies of up to a msec or more are tolerable on LANs. This is even more true for packets that are processed by IPsec. (Everyone understands that you cannot get low latencies with IPsec due to encryption overheads.)


Give me a break - on a CMT system throughput is critical - clearly our design could never neglect that requirement. We spent several months trying different options, and the one we ended up with performed best both for throughput and latency....


I strongly recommend that you consider just queueing; your design will be simpler, will be correct, and will be performant when it _needs_ to be, i.e. when the system is under heavy load.

And, you won't waste any extra CPU spinning.

Unless maybe you are trying to create a benchmark special for low-latency? (But every benchmark I can think of generally wants to load the system up fairly heavily, so your optimization would only work for the first packet or two submitted.)
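The "just queue it" design being argued for can be sketched as a minimal work queue (POSIX-threads userland sketch; all names are hypothetical, not the driver's actual code): submitters enqueue a request and return immediately rather than spinning, and a dedicated worker drains the queue, blocking on a cv only when it is empty:

```c
#include <pthread.h>
#include <stddef.h>

/* Illustrative sketch of a queued-submission design.  Nobody spins:
 * submitters return at once, and under heavy load the queue simply
 * grows until the worker catches up. */
struct req {
    struct req *next;
    void (*fn)(void *);
    void *arg;
};

struct workq {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    struct req     *head, *tail;
};

/* Submitter side: append and return immediately. */
void workq_submit(struct workq *q, struct req *r)
{
    r->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = r;
    else
        q->head = r;
    q->tail = r;
    pthread_cond_signal(&q->nonempty);  /* wake the worker, if idle */
    pthread_mutex_unlock(&q->lock);
}

/* Worker side: dequeue one request, blocking (not spinning) while empty. */
struct req *workq_take(struct workq *q)
{
    struct req *r;

    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)
        pthread_cond_wait(&q->nonempty, &q->lock);
    r = q->head;
    q->head = r->next;
    if (q->head == NULL)
        q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return r;
}
```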


The example you mention above would just be a bug in my opinion.


It's possible that mutexes which do not have high-priority interrupt cookies are free from these problems (because of the "interrupts as threads" design you mentioned), but designing for that seems a bad idea ... some day the design could change.

So, the hard and fast rule that we have used for years is: don't sleep while in interrupt context. The STREAMS manual pages say that put() and srv() fall into the category of things that must follow these rules (because they may be run while in interrupt context; whether this ever actually occurs is a matter of debate -- but giving any different advice would be a major departure from what we've been telling folks for many years.)



sometimes its ok to be a flip-flopper ;-)


I'm not sure I understand your point. I still think you need to rethink what you're trying to do.


> but giving any different advice would be a major departure from what we've been telling folks for many years.

poor attempt at humor - if you changed your "advice" it could be considered flip-flopping.... never mind...


    -- Garrett


-gary


    -- Garrett


   Erik






_______________________________________________
networking-discuss mailing list
[email protected]
