Hi Sebastian,

On Mon, Apr 15, 2019 at 05:20:22PM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-04-15 14:55:02 [+0000], Kirill Smelkov wrote:
> > Hi Sebastian,
> Hi Kirill,
> 
> > On Mon, Apr 15, 2019 at 04:38:57PM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2019-04-13 17:00:59 [+0000], Kirill Smelkov wrote:
> > > > stream_open.cocci was issuing only warning for pci/switchtec, but after
> > > > 8a29a3bae2a2 ("pci/switchtec: Don't use completion's wait queue") they
> > > > started to use wait_event_* inside read method and, since
> > > > stream_open.cocci considers wait_event_* as blocking the warning became
> > > > error. Previously it was completions there, but I added support for wait
> > > > events only for simplicity.
> > > 
> > > why is wait_event_interruptible() treated differently compared to
> > > wait_for_completion_interruptible()?
> > 
> > No particular reason. I just taught stream_open.cocci to consider
> > only "wait_event_*" as blocking:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/coccinelle/api/stream_open.cocci?h=v5.1-rc5#n35
> > 
> > based on original /proc/xen/xenbus deadlock:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/xen/xenbus/xenbus_dev_frontend.c?h=v5.1-rc5#n135
> > https://git.kernel.org/linus/581d21a2d02a
> > 
> > We can extend "a function that blocks" rule to cover other kernel
> > primitives.
> > 
> > For the reference: the deadlock scenario is described in
> > 
> > https://git.kernel.org/linus/10dce8af3422
> 
> As far as I understand the problem is when the ->read() callback waits for
> the ->write() callback. The locking isn't changed by patch you
> mentioned.

Yes, correct. The patch that I mentioned only adds a semantic patch which
finds places with such a problem and can generate a regular patch to
change the locking. Here is that place for pci/switchtec:

https://lab.nexedi.com/kirr/linux/commit/edaeb4101860?expand_all_diffs=1#ccc4baef911c8dad164d4ff29a8c0b287abed7c2_393_393

> So extending it might make sense. But then wait_event_* by itself in
> ->read() isn't a problem as long as its counterpart isn't in ->write().

It is a problem either if its counterpart is in write _or_ if that
wait_event depends on an external source and the wait can last for a
potentially unbounded time, e.g. waiting to receive a character from a
serial port or the network.

But you are right that even when wait_event is used, there can be cases
where the blocking does not depend on an external source, e.g. the code
just spawns a kernel thread to do some limited amount of work and waits
for it to complete. I did not teach stream_open.cocci about that because
when something goes wrong with a semantic patch and Coccinelle complains,
it is hard to understand what is going on, and because generally it is
better to convert files that do not depend on position to stream_open
even if there is no deadlock at all, i.e. not to do any f_pos_lock
locking at all.
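
To make the locking argument concrete, here is a minimal userspace model
of it. It is only a sketch, not kernel code: struct model_file and all
model_* functions are names made up for this illustration, a plain flag
stands in for f_pos_lock, and the details of when the kernel really
takes that lock are ignored. It only shows that a write is stuck behind
a read that sleeps with the position lock held, and that a stream-style
open, which does no position locking, avoids this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_file {
	bool atomic_pos;	/* models FMODE_ATOMIC_POS being set on the file */
	bool pos_locked;	/* models f_pos_lock being held */
};

/* stream == true models a stream_open()-style open: no position locking. */
static void model_open(struct model_file *f, bool stream)
{
	f->atomic_pos = !stream;
	f->pos_locked = false;
}

/* Entry into read(): position-aware files take the position lock. */
static void model_read_enter(struct model_file *f)
{
	if (f->atomic_pos) {
		assert(!f->pos_locked);	/* the model has a single reader */
		f->pos_locked = true;
	}
	/* ... here the real read handler would sleep in wait_event_*() ... */
}

/* Return from read(): drop the position lock if it was taken. */
static void model_read_exit(struct model_file *f)
{
	if (f->atomic_pos)
		f->pos_locked = false;
}

/*
 * Would a write() issued right now get past the position lock?
 * If not, and the sleeping read waits for exactly this write,
 * the two deadlock.
 */
static bool model_write_can_proceed(const struct model_file *f)
{
	return !(f->atomic_pos && f->pos_locked);
}
```

In this model, a file opened with stream == false reports that write
cannot proceed while read is inside its handler, whereas a file opened
with stream == true never blocks write on the position lock, whatever
read is waiting for.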

> But yes, nice finding.

Thanks,

Kirill
