[email protected] writes: > On December 28, 2017 2:47:47 PM PST, [email protected] wrote: >>Linus Torvalds <[email protected]> writes: >> >>> From: Linus Torvalds <[email protected]> >>> Date: Wed, 27 Dec 2017 11:41:30 -0800 >>> Subject: [PATCH] x86-32: fix kexec with stack canary >>(CONFIG_CC_STACKPROTECTOR) >>> >>> Commit e802a51ede91 ("x86/idt: Consolidate IDT invalidation") cleaned >>up >>> and unified the IDT invalidation that existed in a couple of places. >>It >>> changed no actual real code. >>> >>> Despite not changing any actual real code, it _did_ change code >>> generation: by implementing the common idt_invalidate() function in >>> archx86/kernel/idt.c, it made the use of the function in >>> arch/x86/kernel/machine_kexec_32.c be a real function call rather >>than >>> an (accidental) inlining of the function. >>> >>> That, in turn, exposed two issues: >>> >>> - in load_segments(), we had incorrectly reset all the segment >>> registers, which then made the stack canary load (which gcc does >>> using offset of %gs) cause a trap. Instead of %gs pointing to the >>> stack canary, it will be the normal zero-based kernel segment, and >>> the stack canary load will take a page fault at address 0x14. >>> >>> - to make this even harder to debug, we had invalidated the GDT just >>> before calling idt_invalidate(), which meant that the fault >>happened >>> with an invalid GDT, which in turn causes a triple fault and >>> immediate reboot. >>> >>> Fix this by >>> >>> (a) not reloading the special segments in load_segments(). We >>currently >>> don't do any percpu accesses (which would require %fs on x86-32) >>in >>> this area, but there's no reason to think that we might not want >>to >>> do them, and like %gs, it's pointless to break it. >>> >>> (b) doing idt_invalidate() before invalidating the GDT, to keep >>things >>> at least _slightly_ more debuggable for a bit longer. Without a >>> IDT, traps will not work. Without a GDT, traps also will not >>work, >>> but neither will any segment loads etc. So in a very real sense, >>> the GDT is even more core than the IDT. >>> >>> Reported-and-tested-by: Alexandru Chirvasitu <[email protected]> >>> Fixes: e802a51ede91 ("x86/idt: Consolidate IDT invalidation") >>> Cc: Thomas Gleixner <[email protected]> >>> Cc: Andy Lutomirski <[email protected]> >>> Cc: Peter Anvin <[email protected]> >>> Cc: Ingo Molnar <[email protected]> >>> Signed-off-by: Linus Torvalds <[email protected]> >>> --- >>> >>> I wrote "Reported-and-tested-by: Alexandru" because while this isn't >>> exactly the same patch as anything Alexandru tested, it's pretty >>close, >>> and I'm pretty sure this version will fix his issues too. >>> >>> I decided to try to just do the minimal changes: the GDT invalidation >>last >>> (because of the debugging) and _only_ removing the resetting of fs/gs >> >>> rather than removing load_segments() entirely. >>> >>> I think making idt_invalidate() be inline would be a good thing as >>well, >>> and I do think that all those "phys_to_virt(0)" things are garbage, >>but I >>> also think they are independent issues, so I didn't touch any of >>that. >>> >>> I'm assuming I'll get this patch back through the x86 tree, and will >>not >>> be applying it to my own git tree unless the x86 people ask me to. >>> >>> Comments? >> >>There is one significant problem with this patch. It changes the ABI >>that kexec provides to the next kernel. >> >>That ABI is that the segments will be set to a well defined value. >>That value is flat 32bit segments with a base address of 0. >> >>By removing %fs and %gs from load_segments they are now effectively >>random undefined values, to the next kernel. >> >>I don't know if anything actually cares. But if they do they are now >>broken. It is easy enough to preserve that invariant I don't see >>a point in risking potential breaking and looking to see if we have >>actually broken the ABI. >> >>It feels like this is something we should move into assembly rather >>than attempting to cater to the changing evironment of C code in the >>kernel. Or if not we need a big fat comment be very very careful >>this code is special. >> >>Eric >> >> >>> arch/x86/kernel/machine_kexec_32.c | 4 +--- >>> 1 file changed, 1 insertion(+), 3 deletions(-) >>> >>> diff --git a/arch/x86/kernel/machine_kexec_32.c >>b/arch/x86/kernel/machine_kexec_32.c >>> index 00bc751c861c..edfede768688 100644 >>> --- a/arch/x86/kernel/machine_kexec_32.c >>> +++ b/arch/x86/kernel/machine_kexec_32.c >>> @@ -48,8 +48,6 @@ static void load_segments(void) >>> "\tmovl $"STR(__KERNEL_DS)",%%eax\n" >>> "\tmovl %%eax,%%ds\n" >>> "\tmovl %%eax,%%es\n" >>> - "\tmovl %%eax,%%fs\n" >>> - "\tmovl %%eax,%%gs\n" >>> "\tmovl %%eax,%%ss\n" >>> : : : "eax", "memory"); >>> #undef STR >>> @@ -232,8 +230,8 @@ void machine_kexec(struct kimage *image) >>> * The gdt & idt are now invalid. >>> * If you want to load them you must set up your own idt & gdt. >>> */ >>> - set_gdt(phys_to_virt(0), 0); >>> idt_invalidate(phys_to_virt(0)); >>> + set_gdt(phys_to_virt(0), 0); >>> >>> /* now call it */ >>> image->start = relocate_kernel_ptr((unsigned long)image->head, > > The ABI the kernel requires on entry is also documented, and we should > stick to that.
Wrong interface this, does not transfer directly to a linux kernel. This transfers to a shim that starts a linux kernel or something else. It is way past time to be having design discussions about what this interface should do. It is more than a decade old. > That being said, the bottom line is to just stop putting these kinds > of final handovers into C and just hope the compiler (or > tracing/debugging developers) doesn't randomly break at some thing. In general I agree. That makes the code a little less approachable, but it would seem to remove the chance of surprise interactions even more. Eric

