While working on some Linux kernel code, I've found that functions
declared 'static inline' have their arguments evaluated well before
they are used. For example, I have a function:

static inline void trace(arg1, arg2)
{
    if (unlikely(enabled)) {
        <use the arguments>
    }
}

To make this more concrete, here is a simple .c program:


#include <stdio.h>

# define unlikely(x)    __builtin_expect(!!(x), 0)

int enabled = 0;

struct foo {
        int value;
};

struct foo a = {
        .value = 10
};

static inline void evaluate(int value) {
        if (unlikely(enabled)) {
                printf("value is: %d\n", value);
        }
}


/*

#define evaluate(val) \
do { \
        if (unlikely(enabled)) { \
                printf("value is: %d\n", val); \
        } \
} while (0)

*/

int main(void) {

        evaluate((&a)->value);
        return 0;
}


With the macro commented out I get:
00000000004004cc <main>:
  4004cc:       55                      push   %rbp
  4004cd:       48 89 e5                mov    %rsp,%rbp
  4004d0:       48 83 ec 10             sub    $0x10,%rsp
  4004d4:       8b 3d 22 04 20 00       mov    0x200422(%rip),%edi        # 6008fc <a>
  4004da:       e8 02 00 00 00          callq  4004e1 <evaluate>

Thus, 'a' is loaded before the call to 'evaluate'.

However, if I compile the macro version of 'evaluate', I get:

00000000004004cc <main>:
  4004cc:       55                      push   %rbp
  4004cd:       48 89 e5                mov    %rsp,%rbp
  4004d0:       48 83 ec 10             sub    $0x10,%rsp
  4004d4:       8b 05 ee 03 20 00       mov    0x2003ee(%rip),%eax        # 6008c8 <enabled>
  4004da:       85 c0                   test   %eax,%eax
  4004dc:       0f 95 c0                setne  %al
  4004df:       0f b6 c0                movzbl %al,%eax
  4004e2:       48 85 c0                test   %rax,%rax
  4004e5:       74 15                   je     4004fc <main+0x30>
  4004e7:       8b 35 c7 03 20 00       mov    0x2003c7(%rip),%esi        # 6008b4 <a>
  4004ed:       bf f8 05 40 00          mov    $0x4005f8,%edi
  4004f2:       b8 00 00 00 00          mov    $0x0,%eax
  4004f7:       e8 bc fe ff ff          callq  4003b8 <pri...@plt>


Thus, the load of 'a' happens after the 'unlikely' test, as I would like.
It would be nice if gcc could do the same for the 'static inline' function,
deferring the argument load until inside the 'unlikely' branch.

thanks.


-- 
           Summary: request for enhancement: delay argument loading until
                    needed
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jbaron at redhat dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40207
