On Thu, Jan 25, 2018 at 1:20 PM, Linus Torvalds <torva...@linux-foundation.org> wrote:
> On Thu, Jan 25, 2018 at 1:08 PM, Andy Lutomirski <l...@kernel.org> wrote:
>>
>> With retpoline, the retpoline in the trampoline sucks. I don't need
>> perf for that -- I've benchmarked it both ways. It sucks. I'll fix
>> it, but it'll be kind of complicated.
>
> Ahh, I'd forgotten about that (and obviously didn't see it in the profiles).
>
> But yeah, that is fixable even if it does require a page per CPU. Or
> did you have some clever scheme in mind?

Nothing clever. I was going to see if I could get actual
binutils-generated relocations to work in the trampoline. We already
have code to parse ELF relocations and turn them into a simple table,
and it shouldn't be *that* hard to run a separate pass on the entry
trampoline.

Another potentially useful if rather minor optimization would be to
rejigger the SYSCALL_DEFINE macros a bit. Currently we treat all
syscalls like this:

    long func(long arg0, long arg1, long arg2, long arg3, long arg4, long arg5);

I wonder if we'd be better off doing:

    long func(const struct pt_regs *regs);

and autogenerating:

    static long SyS_read(const struct pt_regs *regs)
    {
            return sys_read(regs->di, ...);
    }
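
To make the wrapper idea concrete, here is a rough userspace sketch, not kernel code. The struct pt_regs below is a cut-down stand-in that models only the six x86-64 syscall argument registers (di, si, dx, r10, r8, r9), and sys_demo_add, SyS_demo_add, demo_table and demo_dispatch are all made-up names for illustration. The point is only that every table entry can share one signature while the wrapper pulls out just the registers the real function needs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pt_regs: only the six
 * x86-64 syscall argument registers are modeled here. */
struct pt_regs {
	unsigned long di, si, dx, r10, r8, r9;
};

/* Toy "real" syscall body with a normal typed argument list
 * (made-up example, not an actual kernel syscall). */
static long sys_demo_add(long a, long b, long c)
{
	return a + b + c;
}

/* Wrapper of the proposed shape: takes the saved registers and
 * unpacks only the arguments this syscall actually uses. */
static long SyS_demo_add(const struct pt_regs *regs)
{
	return sys_demo_add((long)regs->di, (long)regs->si, (long)regs->dx);
}

/* Every entry in the table has the same one-pointer signature. */
typedef long (*demo_call_ptr_t)(const struct pt_regs *regs);

static const demo_call_ptr_t demo_table[] = {
	SyS_demo_add,	/* hypothetical syscall number 0 */
};

/* Dispatch by number; unknown numbers return -ENOSYS. */
long demo_dispatch(unsigned long nr, const struct pt_regs *regs)
{
	assert(regs != NULL);
	if (nr >= sizeof(demo_table) / sizeof(demo_table[0]))
		return -ENOSYS;
	return demo_table[nr](regs);
}
```

Calling demo_dispatch(0, &regs) with di, si and dx filled in runs the toy syscall. In this shape the dispatcher passes a single pointer and never loads the six argument registers itself, so a syscall that takes three arguments only reads three of them.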