On Wed, May 15, 2019 at 7:43 AM Matthew Wilcox <wi...@infradead.org> wrote:
>
> > > W dniu 25.04.2019 o 11:25, Lech Perczak pisze:
> > >> Some time ago, after upgrading the Kernel on our i.MX6Q-based boards to 
> > >> mainline 4.18, and now to LTS 4.19 line, during stress tests we started 
> > >> noticing strange warnings coming from 'read' syscall, when 
> > >> page_copy_sane() check failed. Typical reproducibility is up to ~4 
> > >> events per 24h. Warnings originate from different processes, mostly
> > >> involved with the stress tests, but not necessarily with block devices 
> > >> we're stressing. If the warning appeared in process relating to block 
> > >> device stress test, it would be accompanied by corrupted data, as the 
> > >> read operation gets aborted.
> > >>
> > >> When I started debugging the issue, I noticed that in all cases we're 
> > >> dealing with highmem zero-order pages. In this case, page_head(page) == 
> > >> page, so page_address(page) should be equal to page_address(head).
> > >> However, it isn't the case, as page_address(head) in each case returns 
> > >> zero, causing the value of "v" to explode, and the check to fail.
>
> You're seeing a race between page_address(page) being called twice.
> Between those two calls, something has caused the page to be removed from
> the page_address_map() list.  Eric's patch avoids calling page_address(),
> so apply it and be happy.

Hmm... won't the kmap_atomic() done later, after page_copy_sane(),
suffer from the same race?

It seems there is a real bug somewhere to fix.
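
To make the arithmetic in the quoted report concrete, here is a minimal
userspace sketch of the check. It is not the kernel code: the names
SKETCH_PAGE_SIZE, old_check() and new_check_order0() are hypothetical, the
two page_address() results are passed in as plain integers, and only the
order-0 case is modelled. It assumes the old check adds the difference of
the two addresses to n + offset, as the report describes for "v".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as is usual on i.MX6Q (ARM32). */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/*
 * Model of a check that depends on two separate page_address() lookups.
 * addr_page and addr_head stand in for page_address(page) and
 * page_address(head); for an order-0 page they should be identical.
 */
static bool old_check(size_t n, size_t offset,
                      uintptr_t addr_page, uintptr_t addr_head)
{
    size_t v = n + offset + (size_t)(addr_page - addr_head);

    return n <= v && v <= SKETCH_PAGE_SIZE;
}

/*
 * Model of a check that never looks at mapped addresses, so a highmem
 * page losing its mapping between two lookups cannot affect it.
 */
static bool new_check_order0(size_t n, size_t offset)
{
    size_t v = n + offset;

    return n <= v && v <= SKETCH_PAGE_SIZE;
}
```

With consistent lookups, old_check(512, 0, 0xbfe00000, 0xbfe00000) passes.
If the second lookup returns 0 because the mapping went away in between,
old_check(512, 0, 0xbfe00000, 0) computes v = 512 + 0xbfe00000 and fails,
which matches the "v explodes" symptom. new_check_order0(512, 0) passes in
both situations, but it says nothing about whether the later kmap_atomic()
is safe.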

>
> Greg, can you consider 6daef95b8c914866a46247232a048447fff97279 for
> backporting to stable?  Nobody realised it was a bugfix at the time it
> went in.  I suspect there aren't too many of us running HIGHMEM kernels
> any more.
>
