Hi,
We are running into lockups during memory pressure tests on our
boards; the stalls eventually end in NMI panics. In short, the test
case is:
- THP shmem enabled in "advise" mode:
    echo advise > /sys/kernel/mm/transparent_hugepage/shmem_enabled
- a user-space process that does madvise(MADV_HUGEPAGE) on new mappings
  and madvise(MADV_REMOVE) when it wants to drop a page range (a minimal
  reproducer sketch follows this list)
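
For reference, a minimal sketch of the user-space side (assumptions:
tmpfs is mounted at /dev/shm and shmem_enabled is already set to
"advise"; this shows just the madvise() pattern the test exercises,
not our actual test harness):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define LEN (64UL << 20)        /* 64M, enough room for several THPs */

int main(void)
{
        int fd;
        char *p;

        fd = open("/dev/shm/thp-test", O_RDWR | O_CREAT, 0600);
        if (fd < 0 || ftruncate(fd, LEN))
                exit(1);

        p = mmap(NULL, LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
                exit(1);

        /* ask for huge pages on the new mapping */
        madvise(p, LEN, MADV_HUGEPAGE);
        /* touch the range so shmem actually allocates THPs */
        memset(p, 0xa5, LEN);
        /*
         * drop the range again; for shmem MADV_REMOVE ends up in
         * shmem_fallocate(FALLOC_FL_PUNCH_HOLE) via vfs_fallocate()
         */
        madvise(p, LEN, MADV_REMOVE);

        munmap(p, LEN);
        close(fd);
        unlink("/dev/shm/thp-test");
        return 0;
}

Run that in a loop under memory pressure, so that kswapd is active at
the same time.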
The problem boils down to a reversed lock ordering between the two paths:
kswapd does
  lock_page(page) -> down_read(page->mapping->i_mmap_rwsem)

while the madvise() process does
  down_write(page->mapping->i_mmap_rwsem) -> lock_page(page)
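
This is the classic ABBA pattern. For illustration only, a stand-alone
user-space sketch of the same shape (plain pthreads as stand-ins for
the kernel primitives; the lock names here are made up, not the real
objects; compile with -pthread and both threads block forever):

#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_rwlock_t i_mmap_rwsem = PTHREAD_RWLOCK_INITIALIZER;

/* "kswapd" side: page lock first, then i_mmap_rwsem for read */
static void *reclaim_path(void *arg)
{
        pthread_mutex_lock(&page_lock);
        sleep(1);                               /* widen the race window */
        pthread_rwlock_rdlock(&i_mmap_rwsem);   /* blocks: writer holds it */
        pthread_rwlock_unlock(&i_mmap_rwsem);
        pthread_mutex_unlock(&page_lock);
        return NULL;
}

/* "fallocate" side: i_mmap_rwsem for write first, then page lock */
static void *punch_hole_path(void *arg)
{
        pthread_rwlock_wrlock(&i_mmap_rwsem);
        sleep(1);
        pthread_mutex_lock(&page_lock);         /* blocks: reader holds it */
        pthread_mutex_unlock(&page_lock);
        pthread_rwlock_unlock(&i_mmap_rwsem);
        return NULL;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, reclaim_path, NULL);
        pthread_create(&b, NULL, punch_hole_path, NULL);
        pthread_join(a, NULL);  /* never returns: the threads deadlock */
        pthread_join(b, NULL);
        return 0;
}

In the kernel the two paths look like this: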
CPU0 (kswapd):
 shrink_node()
  shrink_active_list()
   page_referenced()                      << lock page: PG_locked >>
    rmap_walk_file()
     down_read(mapping->i_mmap_rwsem)     << W-locked on CPU1 >>
      rwsem_down_read_failed()
       __rwsem_down_read_failed_common()
        schedule()

CPU1 (vfs_fallocate()):
 shmem_fallocate()
  unmap_mapping_range()
   unmap_mapping_pages()                  << down_write(mapping->i_mmap_rwsem) >>
    zap_page_range_single()
     unmap_page_range()
      __split_huge_pmd()
       __lock_page()                      << page is PG_locked on CPU0 >>
        wait_on_page_bit_common()
         io_schedule()
-ss