Miaohe Lin <linmia...@huawei.com> writes: > On 2021/3/30 9:57, Huang, Ying wrote: >> Hi, Miaohe, >> >> Miaohe Lin <linmia...@huawei.com> writes: >> >>> Hi all, >>> I am investigating the swap code, and I found the below possible race >>> window: >>> >>> CPU 1 CPU 2 >>> ----- ----- >>> do_swap_page >>> skip swapcache case (synchronous swap_readpage) >>> alloc_page_vma >>> swapoff >>> release swap_file, >>> bdev, or ... >>> swap_readpage >>> check sis->flags is ok >>> access swap_file, bdev or ...[oops!] >>> si->flags = 0 >>> >>> The swapcache case is ok because swapoff will wait on the page_lock of >>> swapcache page. >>> Is this will really happen or Am I miss something ? >>> Any reply would be really grateful. Thanks! :) >> >> This appears possible. Even for swapcache case, we can't guarantee the > > Many thanks for reply! > >> swap entry gotten from the page table is always valid too. The > > The page table may change at any time. And we may thus do some useless work. > But the pte_same() check could handle these races correctly if these do not > result in oops. > >> underlying swap device can be swapped off at the same time. So we use >> get/put_swap_device() for that. Maybe we need similar stuff here. > > Using get/put_swap_device() to guard against swapoff for swap_readpage() > sounds > really bad as swap_readpage() may take really long time. Also such race may > not be > really hurtful because swapoff is usually done when system shutdown only. > I can not figure some simple and stable stuff out to fix this. Any > suggestions or > could anyone help get rid of such race?
Some reference counting on the swap device can prevent swap device from swapping-off. To reduce the performance overhead on the hot-path as much as possible, it appears we can use the percpu_ref. Best Regards, Huang, Ying