* Denys Vlasenko <dvlas...@redhat.com> wrote:

> PER_CPU_VAR(kernel_stack) was set up in a way where it
> points five stack slots below the top of stack.
>
> Presumably, it was done to avoid one "sub $5*8,%rsp" in
> syscall/sysenter code paths, where iret frame needs to be
> created by hand.
>
> Ironically, none of them benefit from this optimization,
> since all of them need to allocate additional data on
> stack (struct pt_regs), so they still have to perform
> subtraction.

Well, the original idea of percpu::kernel_stack was that of an
optimization of the 64-bit system_call() path: to set up RSP as it
has to be before we call into system calls.

This optimization has bitrotted away, because these days the first
SAVE_ARGS in the 64-bit entry path modifies RSP as well, undoing the
optimization.

But the fix should be to not touch RSP in SAVE_ARGS, to keep
percpu::kernel_stack as an optimized entry point - with
KERNEL_STACK_OFFSET pointing to.

So NAK - this should be fixed for real.

> And ia32_sysenter_target even needs to *undo* this
> optimization: it constructs iret stack with pushes
> instead of movs, so it needs to start right at the top.

Let's keep in mind that in any case the micro-costs of the 32-bit
entry path are almost always irrelevant: we optimize the 64-bit entry
path, and if that helps the 32-bit side as well then that's a happy
coincidence, nothing more.

If the 32-bit entry path can be optimized without affecting the
64-bit path then that's good, but we don't ever hurt the 64-bit path
to make things easier or simpler for the 32-bit path.

Thanks,

	Ingo
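
For readers following the stack arithmetic discussed in this thread,
here is a minimal user-space sketch of it. It is not kernel code and
not part of any patch: the helper names are hypothetical, the 21-slot
size of the x86-64 struct pt_regs (iret frame included) is an
assumption about the kernel version under discussion, and the stack
top address used in the self-check is an arbitrary illustrative value.
It only shows that an entry path which needs a full pt_regs still has
to subtract from RSP whether or not it starts five slots below the
top of the stack.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of the offsets; not kernel code. */
#define WORD_SIZE            8u   /* bytes per 64-bit stack slot */
#define IRET_FRAME_WORDS     5u   /* ss, rsp, rflags, cs, rip */
#define KERNEL_STACK_OFFSET  (IRET_FRAME_WORDS * WORD_SIZE)  /* 5*8 */
#define PT_REGS_WORDS        21u  /* assumed slot count of x86-64 pt_regs */

/* Value a per-CPU kernel_stack would hold: five slots below the top. */
static uint64_t kernel_stack_value(uint64_t stack_top)
{
    return stack_top - KERNEL_STACK_OFFSET;
}

/*
 * Bytes that still have to be subtracted from an initial RSP so that
 * a full pt_regs fits between RSP and the top of the stack.
 */
static uint64_t extra_sub_for_pt_regs(uint64_t stack_top, uint64_t initial_rsp)
{
    uint64_t pt_regs_base = stack_top - (uint64_t)PT_REGS_WORDS * WORD_SIZE;
    return initial_rsp - pt_regs_base;
}

/* Self-check of the model with an arbitrary, illustrative stack top. */
static void model_self_check(void)
{
    const uint64_t top = 0x10000;

    /* Starting five slots down: the remaining 16 slots are still needed. */
    assert(extra_sub_for_pt_regs(top, kernel_stack_value(top)) == 16 * WORD_SIZE);

    /* Starting at the very top: all 21 slots are needed. */
    assert(extra_sub_for_pt_regs(top, top) == 21 * WORD_SIZE);
}
```

In both cases the result is non-zero, so the pre-applied offset only
changes the size of the subtraction in this model, not whether one is
needed.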