When returning results to userspace, do_sys_poll repeatedly calls put_user() - once per fd that it's watching.
This means that on architectures that support some form of
kernel-to-userspace access protection, we end up enabling and disabling
access once for each file descriptor we're watching.

This is inefficient and we can improve things. We could do careful
batching of the opening and closing of the access window, or we could
just copy the entire walk entries structure. While that copies more
data, it potentially does so more efficiently, and the overhead is much
less than the lock/unlock overhead.

Unscientific benchmarking with the poll2_threads microbenchmark from
will-it-scale, run as `./poll2_threads -t 1 -s 15`:

 - Bare-metal Power9 with KUAP: ~49% speed-up
 - VM on amd64 laptop with SMAP: ~25% speed-up

Signed-off-by: Daniel Axtens <d...@axtens.net>

---
v2: Use copy_to_user instead of put_user, thanks Christoph Hellwig.
---
 fs/select.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 7aef49552d4c..52118cdddf77 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -1012,12 +1012,10 @@ static int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds,
 	poll_freewait(&table);
 
 	for (walk = head; walk; walk = walk->next) {
-		struct pollfd *fds = walk->entries;
-		int j;
-
-		for (j = 0; j < walk->len; j++, ufds++)
-			if (__put_user(fds[j].revents, &ufds->revents))
-				goto out_fds;
+		if (copy_to_user(ufds, walk->entries,
+				 sizeof(struct pollfd) * walk->len))
+			goto out_fds;
+		ufds += walk->len;
 	}
 
 	err = fdcount;
-- 
2.25.1
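
For illustration only, here is a rough userspace model of the two copy-out strategies discussed above. It is not kernel code: `struct walk_chunk`, `model_put_user()`, `model_copy_to_user()` and the `access_windows` counter are hypothetical stand-ins, and each helper simply counts one simulated open/close of the user-access window per call. Under those assumptions, the per-fd loop opens the window once per descriptor, while the bulk copy opens it once per chunk.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one chunk of the kernel's list of pollfds. */
struct walk_chunk {
	struct walk_chunk *next;
	size_t len;
	struct pollfd *entries;
};

/* Number of times the simulated user-access window was opened. */
static unsigned long access_windows;

/* Model of a put_user()-style store: one window per value written. */
static void model_put_user(short val, short *dst)
{
	assert(dst != NULL);
	access_windows++;
	*dst = val;
}

/* Model of a copy_to_user()-style copy: one window per call. */
static void model_copy_to_user(void *dst, const void *src, size_t n)
{
	assert(dst != NULL && src != NULL);
	access_windows++;
	memcpy(dst, src, n);
}

/* Old scheme: write back only revents, one store per descriptor. */
unsigned long copy_out_per_fd(struct pollfd *ufds,
			      const struct walk_chunk *head)
{
	const struct walk_chunk *walk;
	size_t j;

	access_windows = 0;
	for (walk = head; walk; walk = walk->next)
		for (j = 0; j < walk->len; j++, ufds++)
			model_put_user(walk->entries[j].revents,
				       &ufds->revents);
	return access_windows;
}

/* New scheme: copy each chunk's whole pollfd array in one go. */
unsigned long copy_out_bulk(struct pollfd *ufds,
			    const struct walk_chunk *head)
{
	const struct walk_chunk *walk;

	access_windows = 0;
	for (walk = head; walk; walk = walk->next) {
		model_copy_to_user(ufds, walk->entries,
				   sizeof(struct pollfd) * walk->len);
		ufds += walk->len;
	}
	return access_windows;
}
```

Both functions return the number of simulated window openings for a given chunk list. Note that in this model the bulk variant also overwrites the `fd` and `events` fields of the destination, whereas the per-fd variant touches only `revents`; that mirrors the "copies more data" trade-off mentioned in the commit message.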