On Thu, 14 Feb 2019 15:33:18 +0100 Michal Hocko <mho...@kernel.org> wrote:
> > Because swapoff() is a very rare code path, to make the normal path
> > run as fast as possible, disabling preemption + stop_machine() is
> > used to implement get/put_swap_device() instead of a reference
> > count.  From get_swap_device() to put_swap_device(), preemption is
> > disabled, so stop_machine() in swapoff() will wait until
> > put_swap_device() is called.
> >
> > In addition to the swap_map, cluster_info, etc. data structures in
> > struct swap_info_struct, the swap cache radix tree will be freed
> > after swapoff, so this patch fixes the race between swap cache
> > lookup and swapoff too.
> >
> > Races between some other swap cache usages protected via disabled
> > preemption and swapoff are fixed too, via calling stop_machine()
> > between clearing PageSwapCache() and freeing the swap cache data
> > structure.
> >
> > An alternative implementation could replace disabling preemption
> > with rcu_read_lock_sched() and stop_machine() with
> > synchronize_sched().
>
> Using stop_machine() is generally discouraged.  It is a gross
> synchronization primitive.

This was discussed to death and I think the changelog explains the
conclusions adequately.  swapoff is super-rare, so a stop_machine() in
that path is appropriate if its use permits more efficiency in the
regular swap code paths.

> Besides that, since when do we have this problem?

What problem??