On Fri, 2018-10-12 at 21:29 -0700, Dan Williams wrote:
> The device-dax unit test sometimes fails with the following kernel
> message signature:
> 
>      Memory failure: Unable to find user space address 204300 in lt-device-dax
>      Memory failure: 0x204300: forcibly killing lt-device-dax:1334 because of failure to unmap
> 
> This happens when there is a 3rd party vma in the rmap that has an entry
> at the same index as the currently failing page. While the test has
> munmap()'d the previous mapping we still trip over the fact that the
> kernel memory-failure code does not differentiate munmap vs mremap and
> upgrades the failure to process fatal.
> 
> The add_to_kill() routine in the kernel has a comment that says:
> 
>         /*
>          * In theory we don't have to kill when the page was
>          * munmaped. But it could be also a mremap. Since that's
>          * likely very rare kill anyways just out of paranoia, but use
>          * a SIGKILL because the error is not contained anymore.
>          */
> 
> ...when it is determining what to do when it can't find the given pfn
> mapped into the process at the given index.
> 
> Avoid this case by munmap()'ing *and* closing the file to trigger old /
> stale vmas to be reaped. With that, the only vma that can be looked up
> is the one where the error was injected, so the lookup succeeds and the
> test passes.
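
For readers following along, the unmap-and-close sequence described above
might look roughly like the sketch below. This is not the code from the
patch: remap_fresh() is a hypothetical helper name, and the path is assumed
to be anything that supports a shared mmap() (a regular file works for
experimenting, a device-dax node would additionally need a suitably aligned
length).

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from test/device-dax.c.
 *
 * Tear down an old mapping *and* its file descriptor, then reopen the
 * path and map it again, so that only the new mapping stays associated
 * with the file. Returns the new mapping or MAP_FAILED; *fd is updated
 * to the new descriptor (or -1 on failure).
 */
void *remap_fresh(const char *path, void *old, size_t len, int *fd)
{
	void *addr;

	/* Drop the previous mapping, if there was one. */
	if (old && old != MAP_FAILED && munmap(old, len) < 0)
		return MAP_FAILED;

	/* Close the old descriptor as well, not just the mapping. */
	if (*fd >= 0) {
		int rc = close(*fd);

		*fd = -1;
		if (rc < 0)
			return MAP_FAILED;
	}

	/* Start over with a fresh open and a fresh shared mapping. */
	*fd = open(path, O_RDWR);
	if (*fd < 0)
		return MAP_FAILED;

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, *fd, 0);
	if (addr == MAP_FAILED) {
		close(*fd);
		*fd = -1;
	}
	return addr;
}
```

A caller would pass the previous mapping and descriptor before each new
error injection and use the returned address for the next access.
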
> 
> Signed-off-by: Dan Williams <[email protected]>
> ---
>  test/device-dax.c |   49 ++++++++++++++++++++++++++++++++++---------------
>  1 file changed, 34 insertions(+), 15 deletions(-)

Looks good, applied.
