Hi,
We are running into lockups during memory pressure tests on our
boards; the stalls eventually end in NMI panics. In short, the test
case is:
- THP shmem enabled in "advise" mode:
    echo advise > /sys/kernel/mm/transparent_hugepage/shmem_enabled
- a user-space process that does madvise(MADV_HUGEPAGE) on new mappings
  and madvise(MADV_REMOVE) when it wants to drop a page range (a minimal
  reproducer sketch follows this list)
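
For reference, a minimal sketch of the user-space side (assumptions:
tmpfs is mounted at /dev/shm and shmem_enabled is already set to
"advise"; this shows just the madvise() pattern the test exercises,
not our actual test harness):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define LEN (64UL << 20)        /* 64M, enough room for several THPs */

int main(void)
{
        int fd;
        char *p;

        fd = open("/dev/shm/thp-test", O_RDWR | O_CREAT, 0600);
        if (fd < 0 || ftruncate(fd, LEN))
                exit(1);

        p = mmap(NULL, LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
                exit(1);

        /* ask for huge pages on the new mapping */
        madvise(p, LEN, MADV_HUGEPAGE);
        /* touch the range so shmem actually allocates THPs */
        memset(p, 0xa5, LEN);
        /*
         * drop the range again; for shmem MADV_REMOVE ends up in
         * shmem_fallocate(FALLOC_FL_PUNCH_HOLE) via vfs_fallocate()
         */
        madvise(p, LEN, MADV_REMOVE);

        munmap(p, LEN);
        close(fd);
        unlink("/dev/shm/thp-test");
        return 0;
}

Run that in a loop under memory pressure, so that kswapd is active at
the same time.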
The problem boils down to a reversed lock ordering between the two paths:
kswapd does
  lock_page(page) -> down_read(page->mapping->i_mmap_rwsem)

while the madvise() process does
  down_write(page->mapping->i_mmap_rwsem) -> lock_page(page)
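
This is the classic ABBA pattern. For illustration only, a stand-alone
user-space sketch of the same shape (plain pthreads as stand-ins for
the kernel primitives; the lock names here are made up, not the real
objects; compile with -pthread and both threads block forever):

#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_rwlock_t i_mmap_rwsem = PTHREAD_RWLOCK_INITIALIZER;

/* "kswapd" side: page lock first, then i_mmap_rwsem for read */
static void *reclaim_path(void *arg)
{
        pthread_mutex_lock(&page_lock);
        sleep(1);                               /* widen the race window */
        pthread_rwlock_rdlock(&i_mmap_rwsem);   /* blocks: writer holds it */
        pthread_rwlock_unlock(&i_mmap_rwsem);
        pthread_mutex_unlock(&page_lock);
        return NULL;
}

/* "fallocate" side: i_mmap_rwsem for write first, then page lock */
static void *punch_hole_path(void *arg)
{
        pthread_rwlock_wrlock(&i_mmap_rwsem);
        sleep(1);
        pthread_mutex_lock(&page_lock);         /* blocks: reader holds it */
        pthread_mutex_unlock(&page_lock);
        pthread_rwlock_unlock(&i_mmap_rwsem);
        return NULL;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, reclaim_path, NULL);
        pthread_create(&b, NULL, punch_hole_path, NULL);
        pthread_join(a, NULL);  /* never returns: the threads deadlock */
        pthread_join(b, NULL);
        return 0;
}

In the kernel the two paths look like this: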
CPU0 (kswapd):
 shrink_node()
  shrink_active_list()
   page_referenced()                      << lock page: PG_locked >>
    rmap_walk_file()
     down_read(mapping->i_mmap_rwsem)     << W-locked on CPU1 >>
      rwsem_down_read_failed()
       __rwsem_down_read_failed_common()
        schedule()

CPU1 (vfs_fallocate()):
 shmem_fallocate()
  unmap_mapping_range()
   unmap_mapping_pages()                  << down_write(mapping->i_mmap_rwsem) >>
    zap_page_range_single()
     unmap_page_range()
      __split_huge_pmd()
       __lock_page()                      << page is PG_locked on CPU0 >>
        wait_on_page_bit_common()
         io_schedule()
-ss