Re: [PATCH v5 1/6] mm/gup: remove unused vmas parameter from get_user_pages()

2023-05-16 Thread John Hubbard
On 5/16/23 07:35, David Hildenbrand wrote:
...
>>> When passing NULL as "pages" to get_user_pages(), __get_user_pages_locked()
>>> won't set FOLL_GET. As FOLL_PIN is also not set, we won't be messing with
>>> the mapcount of the page.
> 
> For completeness: s/mapcount/refcount/ :)

whew, you had me going there! Now it all adds up. :) 

thanks,
-- 
John Hubbard
NVIDIA



Re: [PATCH v5 1/6] mm/gup: remove unused vmas parameter from get_user_pages()

2023-05-16 Thread David Hildenbrand

On 16.05.23 16:30, Sean Christopherson wrote:
> On Tue, May 16, 2023, David Hildenbrand wrote:
>> On 15.05.23 21:07, Sean Christopherson wrote:
>>> On Sun, May 14, 2023, Lorenzo Stoakes wrote:
>>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>>> index cb5c13eee193..eaa5bb8dbadc 100644
>>>> --- a/virt/kvm/kvm_main.c
>>>> +++ b/virt/kvm/kvm_main.c
>>>> @@ -2477,7 +2477,7 @@ static inline int check_user_page_hwpoison(unsigned long addr)
>>>>  {
>>>>   int rc, flags = FOLL_HWPOISON | FOLL_WRITE;
>>>> - rc = get_user_pages(addr, 1, flags, NULL, NULL);
>>>> + rc = get_user_pages(addr, 1, flags, NULL);
>>>>   return rc == -EHWPOISON;
>>>
>>> Unrelated to this patch, I think there's a pre-existing bug here.  If gup() returns
>>> a valid page, KVM will leak the refcount and unintentionally pin the page.  That's
>>
>> When passing NULL as "pages" to get_user_pages(), __get_user_pages_locked()
>> won't set FOLL_GET. As FOLL_PIN is also not set, we won't be messing with
>> the mapcount of the page.

For completeness: s/mapcount/refcount/ :)

> Ah, that's what I'm missing.




--
Thanks,

David / dhildenb



Re: [PATCH v5 1/6] mm/gup: remove unused vmas parameter from get_user_pages()

2023-05-16 Thread Sean Christopherson
On Tue, May 16, 2023, David Hildenbrand wrote:
> On 15.05.23 21:07, Sean Christopherson wrote:
> > On Sun, May 14, 2023, Lorenzo Stoakes wrote:
> > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > index cb5c13eee193..eaa5bb8dbadc 100644
> > > --- a/virt/kvm/kvm_main.c
> > > +++ b/virt/kvm/kvm_main.c
> > > @@ -2477,7 +2477,7 @@ static inline int check_user_page_hwpoison(unsigned long addr)
> > >   {
> > >   int rc, flags = FOLL_HWPOISON | FOLL_WRITE;
> > > - rc = get_user_pages(addr, 1, flags, NULL, NULL);
> > > + rc = get_user_pages(addr, 1, flags, NULL);
> > >   return rc == -EHWPOISON;
> > 
> > Unrelated to this patch, I think there's a pre-existing bug here.  If gup() returns
> > a valid page, KVM will leak the refcount and unintentionally pin the page.  That's
> 
> When passing NULL as "pages" to get_user_pages(), __get_user_pages_locked()
> won't set FOLL_GET. As FOLL_PIN is also not set, we won't be messing with
> the mapcount of the page.

Ah, that's what I'm missing.

> So even if get_user_pages() returns "1", we should be fine.
> 
> 
> Or am I misunderstanding your concern?

Nope, you covered everything.  I do think we can drop the extra gup() though,
AFAICT it's 100% redundant.  But it's not a bug.

Thanks!


Re: [PATCH v5 1/6] mm/gup: remove unused vmas parameter from get_user_pages()

2023-05-16 Thread David Hildenbrand

On 15.05.23 21:07, Sean Christopherson wrote:
> On Sun, May 14, 2023, Lorenzo Stoakes wrote:
>> No invocation of get_user_pages() uses the vmas parameter, so remove it.
>>
>> The GUP API is confusing and caveated. Recent changes have done much to
>> improve that, however there is more we can do. Exporting vmas is a prime
>> target as the caller has to be extremely careful to preclude their use
>> after the mmap_lock has expired or otherwise be left with dangling
>> pointers.
>>
>> Removing the vmas parameter focuses the GUP functions upon their primary
>> purpose - pinning (and outputting) pages as well as performing the actions
>> implied by the input flags.
>>
>> This is part of a patch series aiming to remove the vmas parameter
>> altogether.
>>
>> Suggested-by: Matthew Wilcox (Oracle)
>> Acked-by: Greg Kroah-Hartman
>> Acked-by: David Hildenbrand
>> Reviewed-by: Jason Gunthorpe
>> Acked-by: Christian König  (for radeon parts)
>> Acked-by: Jarkko Sakkinen
>> Signed-off-by: Lorenzo Stoakes
>> ---
>>  arch/x86/kernel/cpu/sgx/ioctl.c     | 2 +-
>>  drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
>>  drivers/misc/sgi-gru/grufault.c     | 2 +-
>>  include/linux/mm.h                  | 3 +--
>>  mm/gup.c                            | 9 +++--
>>  mm/gup_test.c                       | 5 ++---
>>  virt/kvm/kvm_main.c                 | 2 +-
>>  7 files changed, 10 insertions(+), 15 deletions(-)
>
> Acked-by: Sean Christopherson  (KVM)
>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index cb5c13eee193..eaa5bb8dbadc 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -2477,7 +2477,7 @@ static inline int check_user_page_hwpoison(unsigned long addr)
>>  {
>>   int rc, flags = FOLL_HWPOISON | FOLL_WRITE;
>>
>> - rc = get_user_pages(addr, 1, flags, NULL, NULL);
>> + rc = get_user_pages(addr, 1, flags, NULL);
>>   return rc == -EHWPOISON;
>
> Unrelated to this patch, I think there's a pre-existing bug here.  If gup() returns
> a valid page, KVM will leak the refcount and unintentionally pin the page.  That's


When passing NULL as "pages" to get_user_pages(), 
__get_user_pages_locked() won't set FOLL_GET. As FOLL_PIN is also not 
set, we won't be messing with the mapcount of the page.


So even if get_user_pages() returns "1", we should be fine.


Or am I misunderstanding your concern? At least hva_to_pfn_slow() most
certainly didn't return "1" if we end up calling
check_user_page_hwpoison(), so nothing would have been pinned there either.
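
As a rough userspace illustration of the point above (a toy model, not kernel code): every toy_* name and TOY_* flag value below is made up for this sketch, and pinning is deliberately not modelled. It only shows the rule being described, namely that a reference is requested when the caller supplies a "pages" array and did not ask for a pin, so a NULL "pages" lookup leaves the count untouched even when it reports one page found.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; the values are arbitrary and NOT the kernel's definitions. */
#define TOY_FOLL_GET (1u << 0)
#define TOY_FOLL_PIN (1u << 1)

struct toy_page {
	int refcount;
};

/*
 * Model of the flag fix-up described above: a reference is requested
 * only when the caller supplies a pages array and did not ask for a pin.
 */
static unsigned int toy_fixup_flags(struct toy_page **pages, unsigned int flags)
{
	if (pages && !(flags & TOY_FOLL_PIN))
		flags |= TOY_FOLL_GET;
	return flags;
}

/* Look up one page; returns the number of pages found (always 1 here). */
static long toy_gup_one(struct toy_page *target, unsigned int flags,
			struct toy_page **pages)
{
	assert(target != NULL);

	flags = toy_fixup_flags(pages, flags);
	if (flags & TOY_FOLL_GET)
		target->refcount++;	/* a real caller would have to drop this later */
	if (pages)
		pages[0] = target;
	return 1;
}
```

In this model, calling toy_gup_one(&pg, 0, NULL) returns 1 and leaves pg.refcount as it was, while passing a one-element array bumps the count by one.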


--
Thanks,

David / dhildenb



Re: [PATCH v5 1/6] mm/gup: remove unused vmas parameter from get_user_pages()

2023-05-15 Thread Sean Christopherson
On Sun, May 14, 2023, Lorenzo Stoakes wrote:
> No invocation of get_user_pages() uses the vmas parameter, so remove it.
> 
> The GUP API is confusing and caveated. Recent changes have done much to
> improve that, however there is more we can do. Exporting vmas is a prime
> target as the caller has to be extremely careful to preclude their use
> after the mmap_lock has expired or otherwise be left with dangling
> pointers.
> 
> Removing the vmas parameter focuses the GUP functions upon their primary
> purpose - pinning (and outputting) pages as well as performing the actions
> implied by the input flags.
> 
> This is part of a patch series aiming to remove the vmas parameter
> altogether.
> 
> Suggested-by: Matthew Wilcox (Oracle) 
> Acked-by: Greg Kroah-Hartman 
> Acked-by: David Hildenbrand 
> Reviewed-by: Jason Gunthorpe 
> Acked-by: Christian König  (for radeon parts)
> Acked-by: Jarkko Sakkinen 
> Signed-off-by: Lorenzo Stoakes 
> ---
>  arch/x86/kernel/cpu/sgx/ioctl.c     | 2 +-
>  drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
>  drivers/misc/sgi-gru/grufault.c     | 2 +-
>  include/linux/mm.h                  | 3 +--
>  mm/gup.c                            | 9 +++--
>  mm/gup_test.c                       | 5 ++---
>  virt/kvm/kvm_main.c                 | 2 +-
>  7 files changed, 10 insertions(+), 15 deletions(-)

Acked-by: Sean Christopherson  (KVM)

> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index cb5c13eee193..eaa5bb8dbadc 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2477,7 +2477,7 @@ static inline int check_user_page_hwpoison(unsigned long addr)
>  {
>   int rc, flags = FOLL_HWPOISON | FOLL_WRITE;
>  
> - rc = get_user_pages(addr, 1, flags, NULL, NULL);
> + rc = get_user_pages(addr, 1, flags, NULL);
>   return rc == -EHWPOISON;

Unrelated to this patch, I think there's a pre-existing bug here.  If gup() returns
a valid page, KVM will leak the refcount and unintentionally pin the page.  That's
highly unlikely as check_user_page_hwpoison() is called iff get_user_pages_unlocked()
fails (called by hva_to_pfn_slow()), but it's theoretically possible that userspace
could change the VMAs between hva_to_pfn_slow() and check_user_page_hwpoison() since
KVM doesn't hold any relevant locks at this point.

E.g. if there's no VMA during hva_to_pfn_{fast,slow}(), npages==-EFAULT and KVM
will invoke check_user_page_hwpoison().  If userspace installs a valid mapping
after hva_to_pfn_slow() but before KVM acquires mmap_lock, then gup() will find
a valid page.

I _think_ the fix is to simply delete this code. The bug was introduced by commit
fafc3dbaac64 ("KVM: Replace is_hwpoison_address with __get_user_pages").  At that
time, KVM didn't check for "npages == -EHWPOISON" from the first call to
get_user_pages_unlocked().  Later on, commit 0857b9e95c1a ("KVM: Enable async page
fault processing") reworked the caller to be:

        mmap_read_lock(current->mm);
        if (npages == -EHWPOISON ||
              (!async && check_user_page_hwpoison(addr))) {
                pfn = KVM_PFN_ERR_HWPOISON;
                goto exit;
        }

where async really means NOWAIT, so that the hwpoison use of gup() didn't sleep.

    KVM: Enable async page fault processing

    If asynchronous hva_to_pfn() is requested call GUP with FOLL_NOWAIT to
    avoid sleeping on IO. Check for hwpoison is done at the same time,
    otherwise check_user_page_hwpoison() will call GUP again and will put
    vcpu to sleep.

There are other potential problems too, e.g. the hwpoison call doesn't honor
the recently introduced @interruptible flag.

I don't see any reason to keep check_user_page_hwpoison(), KVM can simply rely on
the "npages == -EHWPOISON" check.  get_user_pages_unlocked() is guaranteed to be
called with roughly equivalent flags, and the flags that aren't equivalent are
arguably bugs in check_user_page_hwpoison(), e.g. assuming FOLL_WRITE is wrong.

TL;DR: Go ahead with this change, I'll submit a separate patch to delete the
buggy KVM code.
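
For illustration only, a rough userspace sketch of what relying solely on the "npages == -EHWPOISON" check could look like next to the condition quoted earlier. This is a toy model under stated assumptions, not the actual KVM change: all toy_* names are hypothetical, TOY_EHWPOISON is a stand-in constant, and the second lookup is reduced to a counter so the difference in the number of gup()-style calls is visible.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_EHWPOISON 133	/* hypothetical stand-in for the errno value */

static int toy_probe_calls;	/* counts second gup()-style lookups */

/* Stand-in for check_user_page_hwpoison(): a second, separate lookup. */
static bool toy_second_probe(bool addr_is_poisoned)
{
	toy_probe_calls++;
	return addr_is_poisoned;
}

/* Shape of the condition quoted from the caller above. */
static bool toy_is_hwpoison_current(long npages, bool async, bool poisoned)
{
	return npages == -TOY_EHWPOISON ||
	       (!async && toy_second_probe(poisoned));
}

/* Proposed shape: trust only the result of the first lookup. */
static bool toy_is_hwpoison_proposed(long npages)
{
	assert(npages <= 1);	/* toy invariant: only one page is ever requested */
	return npages == -TOY_EHWPOISON;
}
```

In the model, the current shape performs the extra lookup whenever the first call failed for some other reason and async is false; the proposed shape never does.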