On Tue, Oct 27, 2015 at 02:19:39PM +0900, Tejun Heo wrote:
> Hello,
>
> On Tue, Oct 27, 2015 at 12:37:16PM +0900, Linus Torvalds wrote:
> > > I believe that the above should instead be:
> > >
> > > struct bdi_writeback *wb = list_entry_rcu(bdi->wb_list.next,
>
> I should have just used list_entry() here.  It's just offsetting the
> pointer to set up the initial iteration point.

OK, that sounds much better!

> ...
> > That said, I'm not sure why it doesn't just do the normal
> >
> >     rcu_read_lock();
> >     list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node) {
> >         ....
> >     }
> >     rcu_read_unlock();
> >
> > like the other places do. It looks like it wants that
> > "list_for_each_entry_continue_rcu()" because it does that odd "pin
> > entry and drop rcu lock and retake it and continue where you left
> > off", but I'm not sure why the continue version would be so
> > different. It's going to do that "follow next entry" regardless, and
> > the "goto restart" doesn't look like it actually adds anything. If
> > following the next pointer is ok even after having released the RCU
> > read lock, then I'm not seeing why the end of the loop couldn't just
> > do
> >
> >     rcu_read_unlock();
> >     wb_wait_for_completion(bdi, &fallback_work_done);
> >     rcu_read_lock();
> >
> > and just continue the loop (and the pinning of "wb" and releasing the
> > "last_wb" thing in the *next* iteration should make it all work the
> > same).
> >
> > Adding Tejun to the cc, because this is his code and there's probably
> > something subtle I'm missing. Tejun, can you take a look? It's
> > bdi_split_work_to_wbs() in fs/fs-writeback.c.
>
> Yeah, just releasing and regrabbing should work too as the iterator
> doesn't depend on anything other than the current entry (e.g. as
> opposed to imaginary list_for_each_entry_safe_rcu()).  It's slightly
> icky to meddle with locking behind the iterator's back tho.  Either
> way should be fine but how about something like the following?
>
> Subject: writeback: don't use list_entry_rcu() for pointer offsetting in
>  bdi_split_work_to_wbs()
>
> bdi_split_work_to_wbs() uses list_for_each_entry_continue_rcu() to
> walk @bdi->wb_list.
> To set up the initial iteration condition, it
> uses list_entry_rcu() to calculate the entry pointer corresponding to
> the list head; however, this isn't an actual RCU dereference and using
> list_entry_rcu() for it ended up breaking a proposed list_entry_rcu()
> change because it was feeding a non-lvalue pointer into the macro.
>
> Don't use the RCU variant for simple pointer offsetting.  Use
> list_entry() instead.
>
> Signed-off-by: Tejun Heo <t...@kernel.org>

Acked-by: Paul E. McKenney <paul...@linux.vnet.ibm.com>

> ---
>  fs/fs-writeback.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 29e4599..7378169 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -779,8 +779,8 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
>  					 bool skip_if_busy)
>  {
>  	struct bdi_writeback *last_wb = NULL;
> -	struct bdi_writeback *wb = list_entry_rcu(&bdi->wb_list,
> -						struct bdi_writeback, bdi_node);
> +	struct bdi_writeback *wb = list_entry(&bdi->wb_list,
> +					      struct bdi_writeback, bdi_node);
>
>  	might_sleep();
>  restart:
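
For anyone following along, here is a minimal userspace sketch of the "pointer offsetting" point. It assumes simplified stand-ins for struct list_head and list_entry(), not the kernel's actual definitions. The names struct item, sum_items(), and demo_sum() are hypothetical and do not come from fs/fs-writeback.c. There is no RCU in it. It only shows that applying list_entry() to the list head is address arithmetic that yields a pseudo-entry whose node member is the head itself, which is what a "continue"-style walk needs as its starting cursor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-in for the kernel's list node (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

/* Simplified list_entry(): subtract the member offset, nothing more. */
#define list_entry(ptr, type, member) \
	((type *)((uintptr_t)(ptr) - offsetof(type, member)))

/* Hypothetical element type; only the embedded node matters here. */
struct item {
	int value;
	struct list_head node;
};

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/*
 * Walk the list "continue" style: start from a pseudo-entry computed
 * from the head and always advance before using the cursor.
 */
static int sum_items(struct list_head *head)
{
	/* Plain address arithmetic; no item is read through 'pos' here. */
	struct item *pos = list_entry(head, struct item, node);
	int sum = 0;

	/* The pseudo-entry's node member is the list head itself. */
	assert(&pos->node == head);

	for (pos = list_entry(pos->node.next, struct item, node);
	     &pos->node != head;
	     pos = list_entry(pos->node.next, struct item, node))
		sum += pos->value;

	return sum;
}

/* Build a three-element list (values 1, 2, 3) and sum it. */
int demo_sum(void)
{
	struct list_head head = { &head, &head };
	struct item a = { .value = 1 }, b = { .value = 2 }, c = { .value = 3 };

	list_add_tail(&a.node, &head);
	list_add_tail(&b.node, &head);
	list_add_tail(&c.node, &head);

	return sum_items(&head);
}
```

Calling demo_sum() from a small main() exercises the walk. The pseudo-entry's value field is never touched, only its node member, which aliases the head. That is why the initial computation needs no RCU-protected load.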