On Fri, Mar 05, 2021 at 02:11:43PM -0800, Luck, Tony wrote:
> This whole page table walking patch is trying to work around the
> races caused by multiple calls to memory_failure() for the same
> page.
> 
> Maybe better to just avoid the races.  The comment right above
> memory_failure says:
> 
>  * Must run in process context (e.g. a work queue) with interrupts
>  * enabled and no spinlocks hold.
> 
> So it should be safe to grab and hold a mutex.  See patch below.

The mutex approach looks simpler and safer, so I'm fine with it.

> 
> -Tony
> 
> commit 8dd0dbe7d595e02647e9c2c76c03341a9f6bd7b9
> Author: Tony Luck <tony.l...@intel.com>
> Date:   Fri Mar 5 10:40:48 2021 -0800
> 
>     Use a mutex to avoid memory_failure() races
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 24210c9bd843..c1509f4b565e 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1381,6 +1381,8 @@ static int memory_failure_dev_pagemap(unsigned long 
> pfn, int flags,
>       return rc;
>  }
>  
> +static DEFINE_MUTEX(mf_mutex);
> +
>  /**
>   * memory_failure - Handle memory failure of a page.
>   * @pfn: Page Number of the corrupted page
> @@ -1424,12 +1426,18 @@ int memory_failure(unsigned long pfn, int flags)
>               return -ENXIO;
>       }
>  
> +     mutex_lock(&mf_mutex);

Is it better to take mutex before memory_failure_dev_pagemap() block?
Or we don't have to protect against race for device memory?

Thanks,
Naoya Horiguchi

Reply via email to