On 25-02-15 18:26, Patrick Baggett wrote:
>>
>>
>> In general things don't get optimized across function calls, except in
>> case of inlinable functions.
>>
>> And for compiler attributes it's the opposite: __attribute__((const)) and
>> __attribute__((pure)) can be used to tell the compiler it is safe to optimize
>> across function calls.
>>
>> https://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html
>>
> This is true, but LTO increases the compiler's ability to make these sorts
> of optimizations across function calls and even C source file boundaries
> without you needing to explicitly mark functions as such.
Even if pthread_mutex_lock was completely inlined, there would still be an
asm volatile("" ::: "memory") in there, acting as a complete memory barrier
for the compiler.

Create a file called test.c, abusing the fact that gcc can't handle
pointers well, so main won't get reduced to a constant return value:
int x;
int *px = &x;

int main() {
    if (*px == 1)
        return 1;
    asm volatile("" ::: "memory");
    if (*px == 1)
        return 1;

    return -1;
}
Now compile with gcc test.c -O3 -fwhole-program, and run objdump -d a.out:
  400400:       83 3d 49 0c 20 00 01    cmpl   $0x1,0x200c49(%rip)        # 601050 <x>
  400407:       74 09                   je     400412 <main+0x12>
  400409:       83 3d 40 0c 20 00 01    cmpl   $0x1,0x200c40(%rip)        # 601050 <x>
  400410:       75 06                   jne    400418 <main+0x18>
  400412:       b8 01 00 00 00          mov    $0x1,%eax
  400417:       c3                      retq   
  400418:       83 c8 ff                or     $0xffffffff,%eax
  40041b:       c3                      retq   

Hey, my second check didn't get compiled away... magic.

And to show that a random function call does the same, replace the barrier with 
random():

0000000000400440 <main>:
  400440:       83 3d 09 0c 20 00 01    cmpl   $0x1,0x200c09(%rip)        # 601050 <x>
  400447:       74 1b                   je     400464 <main+0x24>
  400449:       50                      push   %rax
  40044a:       31 c0                   xor    %eax,%eax
  40044c:       e8 df ff ff ff          callq  400430 <random@plt>
  400451:       83 3d f8 0b 20 00 01    cmpl   $0x1,0x200bf8(%rip)        # 601050 <x>
  400458:       b8 01 00 00 00          mov    $0x1,%eax
  40045d:       75 0b                   jne    40046a <main+0x2a>
  40045f:       48 83 c4 08             add    $0x8,%rsp
  400463:       c3                      retq   
  400464:       b8 01 00 00 00          mov    $0x1,%eax
  400469:       c3                      retq   
  40046a:       83 c8 ff                or     $0xffffffff,%eax
  40046d:       eb f0                   jmp    40045f <main+0x1f>

And just to be thorough, here is what happens without a function call or barrier:
0000000000400400 <main>:
  400400:       8b 05 4a 0c 20 00       mov    0x200c4a(%rip),%eax        # 601050 <x>
  400406:       ba ff ff ff ff          mov    $0xffffffff,%edx
  40040b:       83 f8 01                cmp    $0x1,%eax
  40040e:       0f 45 c2                cmovne %edx,%eax
  400411:       c3                      retq   

~Maarten
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau