When returning results to userspace, do_sys_poll repeatedly calls
put_user() - once per fd that it's watching.

This means that on architectures that support some form of
kernel-to-userspace access protection, we end up enabling and disabling
access once for each file descriptor we're watching. This is inefficient
and we can improve things. We could do careful batching of the opening
and closing of the access window, or we could just copy the entire
walk->entries array in one call. While that copies more data, it
potentially does so more efficiently, and the cost of the extra bytes is
much less than the per-fd lock/unlock overhead.

Unscientific benchmarking with the poll2_threads microbenchmark from
will-it-scale, run as `./poll2_threads -t 1 -s 15`:

  - Bare-metal Power9 with KUAP: ~49% speed-up
  - VM on amd64 laptop with SMAP: ~25% speed-up

Signed-off-by: Daniel Axtens <d...@axtens.net>

---

v2: Use copy_to_user instead of put_user, thanks Christoph Hellwig.
---
 fs/select.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 7aef49552d4c..52118cdddf77 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -1012,12 +1012,10 @@ static int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds,
        poll_freewait(&table);
 
        for (walk = head; walk; walk = walk->next) {
-               struct pollfd *fds = walk->entries;
-               int j;
-
-               for (j = 0; j < walk->len; j++, ufds++)
-                       if (__put_user(fds[j].revents, &ufds->revents))
-                               goto out_fds;
+               if (copy_to_user(ufds, walk->entries,
+                                sizeof(struct pollfd) * walk->len))
+                       goto out_fds;
+               ufds += walk->len;
        }
 
        err = fdcount;
-- 
2.25.1
