On Wed, Jun 17, 2020 at 7:19 PM Josef Bacik <jo...@toxicpanda.com> wrote: > > On 6/17/20 2:11 PM, Filipe Manana wrote: > > On Wed, Jun 17, 2020 at 6:43 PM Chris Mason <c...@fb.com> wrote: > >> > >> On 17 Jun 2020, at 13:20, Filipe Manana wrote: > >> > >>> On Wed, Jun 17, 2020 at 5:32 PM Boris Burkov <bo...@bur.io> wrote: > >>> > >>>> --- > >>>> fs/btrfs/extent_io.c | 45 > >>>> ++++++++++++++++++++++++++++---------------- > >>>> 1 file changed, 29 insertions(+), 16 deletions(-) > >>>> > >>>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c > >>>> index c59e07360083..f6758ebbb6a2 100644 > >>>> --- a/fs/btrfs/extent_io.c > >>>> +++ b/fs/btrfs/extent_io.c > >>>> @@ -3927,6 +3927,11 @@ static noinline_for_stack int > >>>> write_one_eb(struct extent_buffer *eb, > >>>> clear_bit(EXTENT_BUFFER_WRITE_ERR, &eb->bflags); > >>>> num_pages = num_extent_pages(eb); > >>>> atomic_set(&eb->io_pages, num_pages); > >>>> + /* > >>>> + * It is possible for releasepage to clear the TREE_REF bit > >>>> before we > >>>> + * set io_pages. See check_buffer_tree_ref for a more > >>>> detailed comment. > >>>> + */ > >>>> + check_buffer_tree_ref(eb); > >>> > >>> This is a whole different case from the one described in the > >>> changelog, as this is in the write path. > >>> Why do we need this one? > >> > >> This was Josef’s idea, but I really like the symmetry. You set > >> io_pages, you do the tree_ref dance. Everyone fiddling with the write > >> back bit right now correctly clears writeback after doing the atomic_dec > >> on io_pages, but the race is tiny and prone to getting exposed again by > >> shifting code around. Tree ref checks around io_pages are the most > >> reliable way to prevent this bug from coming back again later. > > > > Ok, but that still doesn't answer my question. > > Is there an actual race/problem this hunk solves? > > > > Before calling write_one_eb() we increment the ref on the eb and we > > also call lock_extent_buffer_for_io(), > > which clears the dirty bit and sets the writeback bit on the eb while > > holding its ref_locks spin_lock. > > > > Even if we get to try_release_extent_buffer, it calls > > extent_buffer_under_io(eb) while holding the ref_locks spin_lock, > > so at any time it should return true, as either the dirty or the > > writeback bit is set. > > > > Is this purely a safety guard that is being introduced? > > > > Can we at least describe in the changelog why we are adding this hunk > > in the write path? > > All it mentions is a race between reading and releasing pages, there's > > nothing mentioned about races with writeback. > > > > I think maybe we make that bit a separate patch, and leave the panic fix on > it's > own. I suggested this because I have lofty ideas of killing the refs_lock, > and > this would at least keep us consistent in our usage TREE_REF to save us from > freeing stuff that may be currently under IO. > > I'm super not happy with our reference handling coupled with releasepage, but > these are the kind of hoops we're going to have to jump through until we have > some better infrastructure in place to handle metadata. Thanks,
Ok, so it's just a safety guard then. Yes, either a separate patch or at the very least mention why we are adding that in the change log. Thanks. > > Josef -- Filipe David Manana, “Whether you think you can, or you think you can't — you're right.”