> I wonder if we are simply racing the initialization of the completion
> vs when it is triggered by dma.
We have wmb() in the ioat2_tx_submit_unlock (before submitting work to dma,
but after init_completion), shouldn't it make race between interrupt handler
and our waiting thread impossible?

I also believe I traced x->done in the do_wait_for_common
before and after 'timeout = action(timeout);' and it was as expected:
0 - before and 1 - after.

> I also think we need to be waiting
> for in-flight tasklet runs when stopping the channel.
Yes, I also thought that's the reason of the race, because tasklet is in
the queue, but it is disabled (on my weird trace I don't see the interrupt,
so I abandoned this idea).
But it seems to be possible only when we issue dma, wait_for_completion_timeout
waits and times out, we do cleanup (tasklet_disable), we get interrupt and
do tasklet_schedule. But (again) I don't see timeout and
interrupt on my trace (could it be corrupted?).

Despite all this theory, the patch below (which just replaces tasklet_disable
with tasklet_kill in the cleanup routines) seems to be working (did about
40 reboots and didn't see the issue).

I think replacing tasklet_disable with tasklet_kill is reasonable anyway, so
should I send the patch rebased on 3.13 with comments or you'll take care
and merge it into mainline and stable yourself?

---
diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c
index 17a2393b3e25..6c62f4de02c0 100644
--- a/drivers/dma/ioat/dma.c
+++ b/drivers/dma/ioat/dma.c
@@ -379,7 +379,7 @@ static void ioat1_dma_free_chan_resources(struct dma_chan 
*c)
        if (ioat->desccount == 0)
                return;
 
-       tasklet_disable(&chan->cleanup_task);
+       tasklet_kill(&chan->cleanup_task);
        del_timer_sync(&chan->timer);
        ioat1_cleanup(ioat);
 
diff --git a/drivers/dma/ioat/dma_v2.c b/drivers/dma/ioat/dma_v2.c
index b925e1b1d139..268e93d1af2d 100644
--- a/drivers/dma/ioat/dma_v2.c
+++ b/drivers/dma/ioat/dma_v2.c
@@ -809,7 +809,7 @@ void ioat2_free_chan_resources(struct dma_chan *c)
        if (!ioat->ring)
                return;
 
-       tasklet_disable(&chan->cleanup_task);
+       tasklet_kill(&chan->cleanup_task);
        del_timer_sync(&chan->timer);
        device->cleanup_fn((unsigned long) c);
        device->reset_hw(chan);
diff --git a/scripts/package/builddeb b/scripts/package/builddeb
index acb86507828a..94a5f04e114e 100644
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to