On Thu, Apr 20, 2017 at 02:46:51PM -0700, Dan Williams wrote:
> On Sat, Mar 18, 2017 at 2:52 AM, tip-bot for Kirill A. Shutemov
> <[email protected]> wrote:
> > Commit-ID:  2947ba054a4dabbd82848728d765346886050029
> > Gitweb:     http://git.kernel.org/tip/2947ba054a4dabbd82848728d765346886050029
> > Author:     Kirill A. Shutemov <[email protected]>
> > AuthorDate: Fri, 17 Mar 2017 00:39:06 +0300
> > Committer:  Ingo Molnar <[email protected]>
> > CommitDate: Sat, 18 Mar 2017 09:48:03 +0100
> >
> > x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation
> >
> > This patch provides all required callbacks required by the generic
> > get_user_pages_fast() code and switches x86 over - and removes
> > the platform specific implementation.
> >
> > Signed-off-by: Kirill A. Shutemov <[email protected]>
> > Cc: Andrew Morton <[email protected]>
> > Cc: Aneesh Kumar K . V <[email protected]>
> > Cc: Borislav Petkov <[email protected]>
> > Cc: Catalin Marinas <[email protected]>
> > Cc: Dann Frazier <[email protected]>
> > Cc: Dave Hansen <[email protected]>
> > Cc: H. Peter Anvin <[email protected]>
> > Cc: Linus Torvalds <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Cc: Rik van Riel <[email protected]>
> > Cc: Steve Capper <[email protected]>
> > Cc: Thomas Gleixner <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
> > Link: http://lkml.kernel.org/r/[email protected]
> > [ Minor readability edits. ]
> > Signed-off-by: Ingo Molnar <[email protected]>
> 
> I'm still trying to spot the bug, but bisect points to this patch as
> the point at which my unit tests start failing with the following
> signature:
> 
> [   35.423841] WARNING: CPU: 8 PID: 245 at lib/percpu-refcount.c:155
> percpu_ref_switch_to_atomic_rcu+0x1f5/0x200

Okay, I've tracked it down. The issue is triggered by the replacement of
get_page() with page_cache_get_speculative().

page_cache_get_speculative() doesn't call get_zone_device_page(). :-|

And I think it's your bug, Dan: it's wrong to have
get_/put_zone_device_page() in get_/put_page(). It must be handled by
the page_ref_* machinery to catch all cases where we manipulate the page
refcount.
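
To make the imbalance concrete, here is a rough userspace sketch. It is
not kernel code: every toy_* name is made up for illustration, and the
plain counter only stands in for the pagemap's percpu_ref. It assumes
that the put side always drops the pagemap reference while only one of
the two get paths takes it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the percpu_ref owned by a ZONE_DEVICE pagemap. */
struct toy_pgmap {
	long ref;
};

/* Hypothetical stand-in for struct page. */
struct toy_page {
	int refcount;
	struct toy_pgmap *pgmap;	/* NULL for ordinary pages */
};

/* Models get_page(): takes the extra pagemap reference. */
static void toy_get_page(struct toy_page *p)
{
	p->refcount++;
	if (p->pgmap)
		p->pgmap->ref++;
}

/* Models page_cache_get_speculative(): touches only the page refcount. */
static bool toy_get_speculative(struct toy_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount++;
	return true;
}

/* Models put_page(): always drops the pagemap reference. */
static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
	if (p->pgmap)
		p->pgmap->ref--;
}

/* Returns the pagemap counter after one get/put pair on a device page. */
static long toy_pgmap_ref_after(bool speculative)
{
	struct toy_pgmap pgmap = { .ref = 0 };
	struct toy_page page = { .refcount = 1, .pgmap = &pgmap };

	if (speculative)
		toy_get_speculative(&page);
	else
		toy_get_page(&page);
	toy_put_page(&page);

	return pgmap.ref;
}
```

In this model toy_pgmap_ref_after(false) leaves the counter at 0, while
toy_pgmap_ref_after(true) leaves it at -1. That kind of underflow would be
consistent with the percpu-refcount warning above.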

Back to the big picture:

I hate that we need to have such additional code in the page refcount
primitives. I worked hard to remove the compound page ugliness from there,
and now zone_device is creeping in...

Is it the only option?

-- 
 Kirill A. Shutemov
