https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122
--- Comment #1 from Denis Vlasenko <vda.linux at googlemail dot com> ---
Created attachment 35528
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35528&action=edit
Preprocessed example exhibiting a bug

This is a preprocessed kernel/locking/mutex.c file from the kernel source.
When built with either -O2 or -Os, gcc wrongly deinlines spin_lock() and
spin_unlock(), emitting them as out-of-line functions:

$ gcc -O2 -c mutex.preprocessed.c -o mutex.preprocessed.o
$ objdump -dr mutex.preprocessed.o

mutex.preprocessed.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <spin_unlock>:
   0:   80 07 01                addb   $0x1,(%rdi)
   3:   c3                      retq
   4:   66 66 66 2e 0f 1f 84    data32 data32 nopw %cs:0x0(%rax,%rax,1)
   b:   00 00 00 00 00

0000000000000010 <__mutex_init>:
...
0000000000000040 <spin_lock>:
  40:   e9 00 00 00 00          jmpq   45 <spin_lock+0x5>
                        41: R_X86_64_PC32       _raw_spin_lock-0x4
  45:   66 66 2e 0f 1f 84 00    data32 nopw %cs:0x0(%rax,%rax,1)
  4c:   00 00 00 00

These functions are defined as:

static inline __attribute__((no_instrument_function)) void
spin_unlock(spinlock_t *lock)
{
        __raw_spin_unlock(&lock->rlock);
}

static inline __attribute__((no_instrument_function)) void
spin_lock(spinlock_t *lock)
{
        _raw_spin_lock(&lock->rlock);
}

and the programmer's intent was that they would always be inlined.

This is with gcc-4.7.2.
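
For comparison, plain "static inline" is only a hint to GCC, whereas the always_inline function attribute asks for inlining regardless of the optimizer's heuristics. The following is a minimal sketch of how the stated intent could be spelled out that way. It is not taken from the attachment: the spinlock_t and raw_spinlock_t layouts, the toy bodies of _raw_spin_lock() and __raw_spin_unlock(), and the demo_lock_cycle() caller are all hypothetical stand-ins so that the example compiles on its own.

```c
/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
typedef struct { volatile unsigned char locked; } raw_spinlock_t;
typedef struct { raw_spinlock_t rlock; } spinlock_t;

/* Toy, single-threaded lock primitives used only for illustration. */
static void _raw_spin_lock(raw_spinlock_t *lock)
{
    lock->locked = 1;
}

static inline void __raw_spin_unlock(raw_spinlock_t *lock)
{
    lock->locked = 0;
}

/*
 * always_inline is a GCC function attribute: the wrapper is inlined into
 * each caller even when the inliner's cost model would otherwise decline.
 */
static inline __attribute__((always_inline, no_instrument_function))
void spin_unlock(spinlock_t *lock)
{
    __raw_spin_unlock(&lock->rlock);
}

static inline __attribute__((always_inline, no_instrument_function))
void spin_lock(spinlock_t *lock)
{
    _raw_spin_lock(&lock->rlock);
}

/* Hypothetical caller: returns 1 if the lock was held, then released. */
int demo_lock_cycle(void)
{
    spinlock_t l = { { 0 } };

    spin_lock(&l);
    int held = (l.rlock.locked == 1);
    spin_unlock(&l);

    return held && l.rlock.locked == 0;
}
```

With wrappers declared like this, one would expect "objdump -dr" on the resulting object to show no standalone <spin_lock> or <spin_unlock> symbols, though that has not been checked against the attached file or gcc-4.7.2.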