Garrett D'Amore wrote:

[EMAIL PROTECTED] wrote:

Garrett D'Amore wrote:

Erik Nordmark wrote:


[Sorry for the slow response time]

Garrett D'Amore wrote:

I'm not talking about consuming cycles. I'm not talking about wall clock time considerations. I'm talking about potential deadlocks due to cv_wait.

You must -not- under any conditions cv_wait while in ip_input's path. Because you don't know anything about what locks or PIL the calling context is coming from.



I don't think there is a hard and fast rule on this.

It is true that *typically* a mutex is used with a short or bounded hold time, and that a condition variable might be used when the hold time is unbounded.

But there might be subsystems that use a mutex (or rwlock) where the hold time is unbounded. And one can use condition variables to create mutual exclusion primitives that have bounded hold time.
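The point about building mutual exclusion out of condition variables, with the internal lock held only briefly, can be sketched in userland terms. This is a POSIX-threads analog, not Solaris kernel code; the `cv_gate` name and layout are illustrative only:

```c
#include <pthread.h>
#include <stdbool.h>

/* Illustrative userland analog (POSIX threads, not kernel cv_wait/mutex_enter):
 * a mutual-exclusion gate built from a mutex and a condition variable.
 * The internal mutex is held only long enough to test and flip the busy
 * flag, so its hold time is bounded even when the guarded resource is
 * held for an unbounded time. */
struct cv_gate {
    pthread_mutex_t lock;   /* held only briefly */
    pthread_cond_t  cv;
    bool            busy;   /* is the resource currently owned? */
};

void gate_enter(struct cv_gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (g->busy)                       /* re-check after each wakeup */
        pthread_cond_wait(&g->cv, &g->lock);
    g->busy = true;
    pthread_mutex_unlock(&g->lock);
    /* ...caller may now hold the resource for an unbounded time... */
}

void gate_exit(struct cv_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->busy = false;
    pthread_cond_signal(&g->cv);          /* wake one waiter, if any */
    pthread_mutex_unlock(&g->lock);
}
```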

The 'interrupts as threads' support in the Solaris kernel can deal with all combinations; a mutex_enter on an adaptive mutex will sleep if the thread holding the mutex is not running.

Thus cv vs. mutex/rwlock is a red herring. What matters is whether the hold time is bounded or not.



I think you've missed a very important point.

Mutexes that are adaptive have priority inheritance, to prevent priority inversion. Condition variables do not.

The problem I see is what happens if the resource you are waiting for requires an interrupt to be processed, but the interrupt thread cannot run because the thread you are currently executing on is running at a higher interrupt priority level, so the interrupt stays blocked on that CPU.



In the case of the driver which started this discussion, that argument does not hold true. The process that will wake the threads sleeping on the cv is spinning on proc and requires no interrupt. I understand the basic reason for this "rule" - but in this particular case forcing all threads entering the driver to spin would be an unnecessary waste of system resources. It does not buy us any safety against deadlock. In life there are very few absolutes....


_BUT_ the process that is spinning on proc _may_ not actually be _on_ proc. I.e. what happens if it cannot be scheduled on the CPU?


That could happen even if we followed the rules and never blocked, unless we were non-preemptible. Even in the case of the software crypto implementation which does not sleep, it could still be preempted and put to sleep by the OS.....


Unless you take _extreme_ caution to make sure that you validate 100% that the code can never, ever be context switched away (see "very few absolutes") ... even by real-time threads.... then you have a potential hang/deadlock.


Again, that could happen even if you don't block.

I will also point out that this smacks very, very highly of a common mis-optimization. That is, you are optimizing for the case where the system is under very light load (queue depth is at or close to empty). In such cases, the extra overhead of queuing is not generally noticeable unless you are tuning for extremely low latency. Generally, latencies of up to a msec or more are tolerable on LANs. This is even more true for packets that are processed by IPsec. (Everyone understands that you cannot get low latencies with IPsec due to encryption overheads.)


Give me a break - on a CMT system throughput is critical - clearly our design could never neglect that requirement. We spent several months trying different options, and the one we ended up with performed best both for throughput and latency....


I strongly recommend that you consider just queueing; your design will be simpler, will be correct, and will be performant when it _needs_ to be, i.e. when the system is under heavy load.

And, you won't waste any extra CPU spinning.

Unless maybe you are trying to create a benchmark special for low-latency? (But every benchmark I can think of generally wants to load the system up fairly heavily, so your optimization would only work for the first packet or two submitted.)
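The "just queue it" design being argued for can be sketched as a minimal work queue (POSIX-threads userland sketch; all names are hypothetical, not the driver's actual code): submitters enqueue a request and return immediately rather than spinning, and a dedicated worker drains the queue, blocking on a cv only when it is empty:

```c
#include <pthread.h>
#include <stddef.h>

/* Illustrative sketch of a queued-submission design.  Nobody spins:
 * submitters return at once, and under heavy load the queue simply
 * grows until the worker catches up. */
struct req {
    struct req *next;
    void (*fn)(void *);
    void *arg;
};

struct workq {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    struct req     *head, *tail;
};

/* Submitter side: append and return immediately. */
void workq_submit(struct workq *q, struct req *r)
{
    r->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = r;
    else
        q->head = r;
    q->tail = r;
    pthread_cond_signal(&q->nonempty);  /* wake the worker, if idle */
    pthread_mutex_unlock(&q->lock);
}

/* Worker side: dequeue one request, blocking (not spinning) while empty. */
struct req *workq_take(struct workq *q)
{
    struct req *r;

    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)
        pthread_cond_wait(&q->nonempty, &q->lock);
    r = q->head;
    q->head = r->next;
    if (q->head == NULL)
        q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return r;
}
```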


The example you mention above would just be a bug in my opinion.


It's possible that mutexes which do not have high-priority interrupt cookies are free from these problems (because of the "interrupts as threads" design you mentioned), but designing for that seems a bad idea ... some day the design could change.

So, the hard and fast rule that we have used for years is: don't sleep while in interrupt context. The STREAMS manual pages say that put() and srv() fall into the category of things that must follow these rules (because they may be run while in interrupt context; whether this ever actually occurs is a matter of debate -- but giving any different advice would be a major departure from what we've been telling folks for many years.)



sometimes its ok to be a flip-flopper ;-)


I'm not sure I understand your point. I still think you need to rethink what you're trying to do.


> but giving any different advice would be a major departure from what we've been telling folks for many years.

poor attempt at humor - if you changed your "advice" it could be considered flip-flopping.... never mind...


    -- Garrett


-gary


    -- Garrett


   Erik






_______________________________________________
networking-discuss mailing list
[email protected]
