get_user_pages_fast is supposed to be a faster drop-in equivalent of
get_user_pages. As such, callers expect it to return a negative return
code when passed an invalid address, and never expect it to
return 0 when passed a positive number of pages, since
its documentation says:
* Returns number of pages pinned. This may be fewer than the number
* requested. If nr_pages is 0 or negative, returns 0. If no pages
* were pinned, returns -errno.
When get_user_pages_fast fall back on get_user_pages this is exactly
what happens. Unfortunately the implementation is inconsistent: it returns 0 if
passed a kernel address, confusing callers: for example, the following
is pretty common but does not appear to do the right thing with a kernel
address:
ret = get_user_pages_fast(addr, 1, writeable, &page);
if (ret < 0)
return ret;
Change get_user_pages_fast to return -EFAULT when supplied a
kernel address to make it match expectations.
All caller have been audited for consistency with the documented
semantics.
Lightly tested.
Cc: Kirill A. Shutemov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thorsten Leemhuis <[email protected]>
Cc: [email protected]
Fixes: 5b65c4677a57 ("mm, x86/mm: Fix performance regression in
get_user_pages_fast()")
Reported-by: [email protected]
Signed-off-by: Michael S. Tsirkin <[email protected]>
---
mm/gup.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/gup.c b/mm/gup.c
index 6afae32..8f3a064 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1806,9 +1806,12 @@ int get_user_pages_fast(unsigned long start, int
nr_pages, int write,
len = (unsigned long) nr_pages << PAGE_SHIFT;
end = start + len;
+ if (nr_pages <= 0)
+ return 0;
+
if (unlikely(!access_ok(write ? VERIFY_WRITE : VERIFY_READ,
(void __user *)start, len)))
- return 0;
+ return -EFAULT;
if (gup_fast_permitted(start, nr_pages, write)) {
local_irq_disable();
--
MST