On 20.12.2022 21:56, Andrew Cooper wrote:
> On 20/12/2022 1:51 pm, Jan Beulich wrote:
>> On 16.12.2022 21:17, Andrew Cooper wrote:
>>> +        "mov    %[cr4], %%cr4\n\t"     /* CR4.PGE = 1 */
>>> +        : [cr4] "=&a" (tmp) /* Could be "r", but "a" makes better asm */
>>> +        : [cr3] "r" (__pa(idle_pg_table)),
>>> +          [pge] "i" (X86_CR4_PGE)
>>> +        : "memory" );
>> The removed stack copying worries me, to be honest. Yes, for local
>> variables of ours it doesn't matter because they are about to go out
>> of scope. But what if the compiler instantiates any for its own use,
>> e.g. ...
>>
>>> +    /*
>>> +     * End of the critical region.  Updates to locals and globals now work 
>>> as
>>> +     * expected.
>>> +     *
>>> +     * Updates to local variables which have been spilled to the stack 
>>> since
>>> +     * the memcpy() have been lost.  But we don't care, because they're all
>>> +     * going out of scope imminently.
>>> +     */
>>> +
>>> +    printk("New Xen image base address: %#lx\n", xen_phys_start);
>> ... the result of the address calculation for the string literal
>> here? Such auxiliary calculations can happen at any point in the
>> function, and won't necessarily (hence the example chosen) become
>> impossible for the compiler to do with the memory clobber in the
>> asm(). And while the string literal's address is likely cheap
>> enough to calculate right in the function invocation sequence (and
>> an option would also be to retain the printk() in the caller),
>> other instrumentation options could be undermined by this as well.
> 
> Right now, the compiler is free to calculate the address of the string
> literal in a register, and move it the other side of the TLB flush. 
> This will work just fine.
> 
> However, the compiler cannot now, or ever in the future, spill such a
> calculation to the stack.

I'm afraid the compiler's view at things is different: Whatever it puts
on the stack is viewed as virtual registers, unaffected by a memory
clobber (of course there can be effects resulting from uses of named
variables). Look at -O3 output of gcc12 (which is what I happened to
play with; I don't think it's overly version dependent) for this
(clearly contrived) piece of code:

int __attribute__((const)) calc(int);

int test(int i) {
        int j = calc(i);

        asm("nopl %0" : "+m" (j));
        asm("nopq %%rsp" ::: "memory", "ax", "cx", "dx", "bx", "bp", "si", "di",
                             "r8", "r9", "r10", "r11", "r12", "r13", "r14", 
"r15");
        j = calc(i);
        asm("nopl %0" :: "m" (j));

        return j;
}

It instantiates a local on the stack for the result of calc(); it does
not re-invoke calc() a 2nd time. Which means the memory clobber in the
middle asm() does not affect that (and by implication: any) stack slot.

Using godbolt I can also see that clang15 agrees with gcc12 in this
regard. I didn't bother trying other versions.

> Whether it's spelt "memory", or something else in the future, OSes
> really do genuinely need a way of telling the compiler "you literally
> cannot move anything the other side of this asm()", because otherwise
> malfunctions will occur.

Something like this may indeed be desirable in situations like this one,
but I don't think such a construct exists.

Jan

Reply via email to