On Tue 03-01-17 14:36:05, Ross Zwisler wrote:
> Currently in DAX if we have three read faults on the same hole address we
> can end up with the following:
> 
> Thread 0              Thread 1                Thread 2
> --------              --------                --------
> dax_iomap_fault
>  grab_mapping_entry
>   lock_slot
>    <locks empty DAX entry>
> 
>                       dax_iomap_fault
>                        grab_mapping_entry
>                         get_unlocked_mapping_entry
>                          <sleeps on empty DAX entry>
> 
>                                               dax_iomap_fault
>                                                grab_mapping_entry
>                                                 get_unlocked_mapping_entry
>                                                  <sleeps on empty DAX entry>
>   dax_load_hole
>    find_or_create_page
>    ...
>     page_cache_tree_insert
>      dax_wake_mapping_entry_waiter
>       <wakes one sleeper>
>      __radix_tree_replace
>       <swaps empty DAX entry with 4k zero page>
> 
>                       <wakes>
>                       get_page
>                       lock_page
>                       ...
>                       put_locked_mapping_entry
>                       unlock_page
>                       put_page
> 
>                                               <sleeps forever on the DAX
>                                                wait queue>
> 
> The crux of the problem is that once we insert a 4k zero page, all locking
> from then on is done in terms of that 4k zero page and any additional
> threads sleeping on the empty DAX entry will never be woken.  Fix this by
> waking all sleepers when we replace the DAX radix tree entry with a 4k zero
> page.  This will allow all sleeping threads to successfully transition from
> locking based on the DAX empty entry to locking on the 4k zero page.
> 
> With the test case reported by Xiong this happens very regularly in my test
> setup, with some runs resulting in 9+ threads in this deadlocked state.
> With this fix I've been able to run that same test dozens of times in a
> loop without issue.
> 
> Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com>
> Reported-by: Xiong Zhou <xz...@redhat.com>
> Fixes: commit ac401cc78242 ("dax: New fault locking")
> Cc: Jan Kara <j...@suse.cz>
> Cc: sta...@vger.kernel.org # 4.7+

Ah, very good catch. You can add:

Reviewed-by: Jan Kara <j...@suse.cz>

I wonder why I was not able to reproduce this... Probably the timing didn't
work out right on my test machine.

                                                                Honza

> ---
> 
> This issue exists as far back as v4.7, and I was easly able to reproduce it
> with v4.7 using the same test.
> 
> Unfortunately this patch won't apply cleanly to the stable trees, but the
> change is very simple and should be easy to replicate by hand.  Please ping
> me if you'd like patches that apply cleanly to the v4.9 and v4.8.15 trees.
> 
> ---
>  mm/filemap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index d0e4d10..b772a33 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -138,7 +138,7 @@ static int page_cache_tree_insert(struct address_space 
> *mapping,
>                               dax_radix_locked_entry(0, RADIX_DAX_EMPTY));
>                       /* Wakeup waiters for exceptional entry lock */
>                       dax_wake_mapping_entry_waiter(mapping, page->index, p,
> -                                                   false);
> +                                                   true);
>               }
>       }
>       __radix_tree_replace(&mapping->page_tree, node, slot, page,
> -- 
> 2.7.4
> 
-- 
Jan Kara <j...@suse.com>
SUSE Labs, CR

Reply via email to