Re: kernel condvars: how to use?

Mouse Thu, 07 Dec 2017 19:27:27 -0800

Omnibus reply here to some half-dozen of the replies; I'll try to get
all the attributions right.


First, a big thank you to everyone who replied.  I was not expecting
this many, nor this helpful, responses to something about NetBSD as old
as 5.2!

[me - all the double-quoted text here is mine]
>> db{0}> tr
>> breakpoint() at netbsd:breakpoint+0x5
>> comintr() at netbsd:comintr+0x53a
>> Xintr_ioapic_edge7() at netbsd:Xintr_ioapic_edge7+0xeb
>> --- interrupt ---
>> x86_pause() at netbsd:x86_pause
>> intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x16
>> Xintr_ioapic_level5() at netbsd:Xintr_ioapic_level5+0xf3
>> --- interrupt ---
>> x86_pause() at netbsd:x86_pause+0x2
>> cdev_poll() at netbsd:cdev_poll+0x6d

[Taylor R Campbell]
> This stack trace suggests that you're in the middle of a busy-wait
> loop somewhere inside your d_poll routine

I didn't mention that because I knew it was false (and neglected to say
anything about it - my fault).  Why am I sure?  Because the driver in
question uses nopoll as its poll routine. :)

> It may be helpful to build with all debug options enabled.

DIAGNOSTIC and DEBUG are both on already.  None of the LOCK* options
are, though - see Paul Goyette's response, below.

>> On reflection, I think I know why.  Userland's syscall handler took
>> the mutex in preparation for cv_wait_sig(), the interrupt happens,
>> my code is called (verified with a printf), and it tries to take the
>> same mutex so it can cv_broadcast().
> cv_wait_sig releases the lock while it waits.

I'm positing an interrupt that strikes after taking the mutex and
before entering cv_wait_sig.  (See also your other message, below.)

>> I can of course provide more information if it would help, but I'm
>> not sure what would be useful to mention here.
> Can you provide the code you're using?  Hard to guess otherwise.

Heh.  Fair enough.

First, a brief sketch of my intent....

What I was/am trying to do was to take the lpt driver and give it a
mode in which it uses the four output and five input bits in the
control and status registers for low-data-rate parallel digital input
and output.

The design is that, on output, userland write()s a buffer of even
length and the kernel shoves alternate bytes out the data and control
ports, with no particular timing constraints (in the anticipated use
case there will be only one pair of bytes per write, and writes are
only occasional); for input, I have a callout that, once per hz, checks
to see if the status bits have changed, and, if so, saves them in a
buffer for read().  (The input data rate is low - in the anticipated
application, those input bits are driven off mechanical switches
operated by human actions - and only changes matter.)  I actually may
change this to provide, somehow, different interfaces for the control
bits and the data bits; I'm not sure the "alternate bytes" paradigm is
all that good a one for my anticipated use.
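
A minimal userland sketch of the two data paths described above; it is
not the driver code.  Every name here (struct raw_state,
raw_write_split, raw_poll_status, RAW_BUFSZ) is hypothetical, and plain
arrays stand in for the port writes and for the buffer that read()
would drain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RAW_BUFSZ 64	/* hypothetical buffer size */

/* Stand-in for the hardware and the input buffer (names hypothetical). */
struct raw_state {
	uint8_t	data_out[RAW_BUFSZ];	/* bytes that would go to the data port */
	uint8_t	ctl_out[RAW_BUFSZ];	/* bytes that would go to the control port */
	size_t	nout;			/* data/control pairs written so far */
	uint8_t	in_buf[RAW_BUFSZ];	/* status changes waiting for read() */
	size_t	nin;			/* entries used in in_buf */
	uint8_t	last_status;		/* last status value the poller saw */
};

/*
 * Output path: reject odd lengths, then send alternate bytes to the
 * data and control ports.  Returns 0 on success, -1 on an odd length
 * or if the stand-in arrays would overflow.
 */
static int
raw_write_split(struct raw_state *st, const uint8_t *buf, size_t len)
{
	size_t i;

	if (len % 2 != 0 || st->nout + len / 2 > RAW_BUFSZ)
		return -1;
	for (i = 0; i < len; i += 2) {
		st->data_out[st->nout] = buf[i];
		st->ctl_out[st->nout] = buf[i + 1];
		st->nout++;
	}
	return 0;
}

/*
 * Input path: meant to be called periodically (from the callout, in
 * the driver).  Saves the status byte only when it differs from the
 * previous one.  Returns 1 if a change was saved, 0 otherwise (no
 * change, or no room in the buffer).
 */
static int
raw_poll_status(struct raw_state *st, uint8_t status)
{
	if (status == st->last_status || st->nin >= RAW_BUFSZ)
		return 0;
	st->last_status = status;
	st->in_buf[st->nin++] = status;
	return 1;
}
```

With a zeroed struct raw_state, a four-byte write yields two
data/control pairs, and polling the same status value twice records
only one entry.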

Rather than replace the lpt driver entirely, I had it grow another flag
bit in the device minor number: 0x100 indicates this `raw' mode.
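
As a sketch of what testing that bit might look like: the value 0x100
is from the text above, but the macro and function names are
hypothetical and are not taken from the driver source.

```c
#include <assert.h>

#define LPT_RAW_FLAG	0x100u	/* value from the text; name is hypothetical */

/* Nonzero if the minor number selects the raw digital I/O mode. */
static int
lpt_minor_is_raw(unsigned int minor_no)
{
	return (minor_no & LPT_RAW_FLAG) != 0;
}
```

Under this sketch, minor 0x100 and 0x101 select raw mode, while any
minor below 0x100 behaves as before.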

As for the code....

ftp.rodents-montreal.org:/mouse/misc/lpt/ holds the code.  base/ has
the code I started with (which is stock 5.2, for these files); new/ has
my current version.  lptreg.h is identical; lptvar.h and lpt.c have
changes.  diffs is output from "diff -u -r base new".  (All these are
relative to sys/dev/ic/.)

>> So I wrote some code using a condvar and a mutex, and the system
>> promptly deadlocked.  [...]

[Joerg Sonnenberger]
> Did you set the IPL for your mutex correctly?

I don't know.  I tried to...

> Adaptive mutexes must not be shared with interrupts.

...I used IPL_VM, because splvm() is the same as spllpt(), and because
the manpage says that results in a spin mutex, which I thought was what
I needed.  Is there some way to tell what IPL the hardware interrupt
for the device comes in on?

[Taylor R Campbell again, different email]
> Note that if you use a mutex in a thread _and_ an interrupt handler,
> you must initialize the mutex with the ipl at which the interrupt
> handler runs.  That way, while you hold the lock, it will block the
> interrupt handler too, which avoids the scenario you described.

Oh!  That's very important; it was not clear to me from the manpage
that mutex_enter() implies spl*().  Yes, then, as you say, my scenario
is impossible if the mutex is correctly set up.
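
A toy userland model of that point, for illustration only.  It is not
kernel code and uses none of the real mutex(9) or spl(9) interfaces;
all names and level values are hypothetical.  It assumes one CPU and
one interrupt source whose handler takes the same lock as the thread.

```c
#include <assert.h>

/* Toy priority levels; only the ordering matters, not the values. */
enum { TOY_IPL_NONE = 0, TOY_IPL_VM = 1 };

struct toy_cpu {
	int cur_ipl;	/* current interrupt priority level */
	int intr_ipl;	/* level at which the single device interrupts */
	int pending;	/* interrupts held off while blocked */
	int handled;	/* times the handler ran to completion */
	int deadlocks;	/* times the handler found the lock already held */
};

struct toy_spin_mutex {
	int ipl;	/* level given when the lock was set up */
	int saved_ipl;	/* level to restore on exit */
	int held;
};

static void toy_mutex_exit(struct toy_cpu *, struct toy_spin_mutex *);

/* Taking the lock also raises the priority level to the lock's level. */
static void
toy_mutex_enter(struct toy_cpu *cpu, struct toy_spin_mutex *m)
{
	m->saved_ipl = cpu->cur_ipl;
	if (cpu->cur_ipl < m->ipl)
		cpu->cur_ipl = m->ipl;
	m->held = 1;
}

/* The handler takes the same lock; if it is held, a real CPU would hang. */
static void
toy_run_handler(struct toy_cpu *cpu, struct toy_spin_mutex *m)
{
	int old = cpu->cur_ipl;

	if (m->held) {
		cpu->deadlocks++;
		return;
	}
	cpu->cur_ipl = cpu->intr_ipl;
	toy_mutex_enter(cpu, m);
	cpu->handled++;		/* the wakeup would happen here */
	toy_mutex_exit(cpu, m);
	cpu->cur_ipl = old;
}

/* An interrupt at or below the current level is deferred, not run. */
static void
toy_interrupt(struct toy_cpu *cpu, struct toy_spin_mutex *m)
{
	if (cpu->intr_ipl <= cpu->cur_ipl)
		cpu->pending++;
	else
		toy_run_handler(cpu, m);
}

/* Releasing the lock restores the level and runs deferred interrupts. */
static void
toy_mutex_exit(struct toy_cpu *cpu, struct toy_spin_mutex *m)
{
	m->held = 0;
	cpu->cur_ipl = m->saved_ipl;
	while (cpu->pending > 0 && cpu->intr_ipl > cpu->cur_ipl) {
		cpu->pending--;
		toy_run_handler(cpu, m);
	}
}
```

In this model, a lock set up at the interrupt's level defers the
interrupt until the lock is released, so the handler never finds the
lock held.  A lock set up at a lower level lets the handler run while
the lock is held, which is the hang described earlier.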

[Brian Buhrow]
> 1.  [...].  Mutexes that use spin locks can't be used in interrupt
> context.

Sure you don't have that backwards?  I _think_ mutex(9) says that spin
mutexes are the only ones that _can_ be used from an interrupt.

> 2.  Initialize your convar with cv_init().

Done.  (I think.)

> 4.  If you run into lock contention when debugging your code, pay
> careful attention to who holds the lock at the time of the panic.

Or, in my case, hang?  I don't know who holds it; I'm not sure how to
find out.  I should probably go peek under the hood.

[me]
>> [...my scenario outline...]
[Brad Spencer]
> I recently worked with this for a driver I have written to provide
> entropy to the kernel random number generator subsystem from quantum
> event sources.  [...]

It sounds as though you're doing something similar enough that the
locking issues should be identical - and, looking at your code, it
looks as though you're doing basically the same thing I am.  This seems
to me to indicate that my problem most likely is with how I'm
initializing something - based on what Taylor R Campbell said, quite
likely the mutex.

[Paul Goyette]
> You might well find LOCKDEBUG to be your friend here!

Is there some list of such options?  sys/arch/amd64/conf/GENERIC has
only two lines containing "LOCK", neither of which looks relevant here:

# options       INTEL_ONDEMAND_CLOCKMOD
#options        IPFILTER_DEFAULT_BLOCK  # block all packets by default

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
