On Fri 04-05-18 07:35:34, Tetsuo Handa wrote:
> Jan Kara wrote:
> > Make wb_workfn() use wakeup_wb() for requeueing the work which takes all
> > the necessary precautions against racing with bdi unregistration.
>
> Yes, this patch will solve NULL pointer dereference bug. But is it OK to
> leave list_empty(&wb->work_list) == false situation? Who takes over the
> role of making list_empty(&wb->work_list) == true?

That's a good question. The reason is that the last running instance of
wb_workfn() cannot leave with the work_list non-empty. Once WB_registered
is cleared, we cannot add new entries to work_list. Then we queue and
flush the last wb_workfn() to clean up the list. The NULL pointer
dereference was triggered not by this last running wb_workfn() but by one
running independently in parallel with wb_shutdown(). So something like:

CPU0                                    CPU1                    CPU2
wb_workfn()
  do {
    ...
  } while (!list_empty(&wb->work_list));
                                        wb_queue_work()
                                          if (test_bit(WB_registered, &wb->state)) {
                                            list_add_tail(&work->list, &wb->work_list);
                                            mod_delayed_work(bdi_wq, &wb->dwork, 0);
                                          }
                                                                wb_shutdown()
                                                                  if (!test_and_clear_bit(WB_registered, &wb->state)) {
                                                                  ...
                                                                  mod_delayed_work(bdi_wq, &wb->dwork, 0);
                                                                  flush_delayed_work(&wb->dwork);
  if (!list_empty(&wb->work_list))
    mod_delayed_work(bdi_wq, &wb->dwork, 0); -> queues buggy work

> Just a confirmation, for Fabiano Rosas is facing a problem that "write call
> hangs in kernel space after virtio hot-remove" and is thinking that we might
> need to go the opposite direction
> ( http://lkml.kernel.org/r/[email protected]
> ).

Yes, I'm aware of that report and I think it should be solved
differently than what Fabiano suggests.

								Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR