On 20.12.2022 21:56, Andrew Cooper wrote: > On 20/12/2022 1:51 pm, Jan Beulich wrote: >> On 16.12.2022 21:17, Andrew Cooper wrote: >>> + "mov %[cr4], %%cr4\n\t" /* CR4.PGE = 1 */ >>> + : [cr4] "=&a" (tmp) /* Could be "r", but "a" makes better asm */ >>> + : [cr3] "r" (__pa(idle_pg_table)), >>> + [pge] "i" (X86_CR4_PGE) >>> + : "memory" ); >> The removed stack copying worries me, to be honest. Yes, for local >> variables of ours it doesn't matter because they are about to go out >> of scope. But what if the compiler instantiates any for its own use, >> e.g. ... >> >>> + /* >>> + * End of the critical region. Updates to locals and globals now work >>> as >>> + * expected. >>> + * >>> + * Updates to local variables which have been spilled to the stack >>> since >>> + * the memcpy() have been lost. But we don't care, because they're all >>> + * going out of scope imminently. >>> + */ >>> + >>> + printk("New Xen image base address: %#lx\n", xen_phys_start); >> ... the result of the address calculation for the string literal >> here? Such auxiliary calculations can happen at any point in the >> function, and won't necessarily (hence the example chosen) become >> impossible for the compiler to do with the memory clobber in the >> asm(). And while the string literal's address is likely cheap >> enough to calculate right in the function invocation sequence (and >> an option would also be to retain the printk() in the caller), >> other instrumentation options could be undermined by this as well. > > Right now, the compiler is free to calculate the address of the string > literal in a register, and move it the other side of the TLB flush. > This will work just fine. > > However, the compiler cannot now, or ever in the future, spill such a > calculation to the stack.
I'm afraid the compiler's view at things is different: Whatever it puts on the stack is viewed as virtual registers, unaffected by a memory clobber (of course there can be effects resulting from uses of named variables). Look at -O3 output of gcc12 (which is what I happened to play with; I don't think it's overly version dependent) for this (clearly contrived) piece of code: int __attribute__((const)) calc(int); int test(int i) { int j = calc(i); asm("nopl %0" : "+m" (j)); asm("nopq %%rsp" ::: "memory", "ax", "cx", "dx", "bx", "bp", "si", "di", "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"); j = calc(i); asm("nopl %0" :: "m" (j)); return j; } It instantiates a local on the stack for the result of calc(); it does not re-invoke calc() a 2nd time. Which means the memory clobber in the middle asm() does not affect that (and by implication: any) stack slot. Using godbolt I can also see that clang15 agrees with gcc12 in this regard. I didn't bother trying other versions. > Whether it's spelt "memory", or something else in the future, OSes > really do genuinely need a way of telling the compiler "you literally > cannot move anything the other side of this asm()", because otherwise > malfunctions will occur. Something like this may indeed be desirable in situations like this one, but I don't think such a construct exists. Jan