Bug#702641: ia64, wrong asm register contraints in the futex implementation

2013-03-09 Thread Ben Hutchings
Control: tag -1 upstream moreinfo
Control: severity -1 normal

On Sat, 2013-03-09 at 14:53 +0100, Stephan Schreiber wrote:
> Package: src:linux
> Version: 3.2.23-1
> Severity: important
> Tags: patch
[...]
> I also filed another similar bug#702639 (wrong asm register contraints  
> in the kvm implementation).

Again, you will need to get this accepted upstream first.

Ben.

-- 
Ben Hutchings
Always try to do things in chronological order;
it's less confusing that way.




Bug#702641: ia64, wrong asm register contraints in the futex implementation

2013-03-09 Thread Stephan Schreiber

Package: src:linux
Version: 3.2.23-1
Severity: important
Tags: patch


The inline assembly in arch/ia64/include/asm/futex.h of the Linux kernel
uses wrong asm register constraints.
Since this causes trouble when compiling the kernel with GCC 4.4, it
should be fixed.


File arch/ia64/include/asm/futex.h:

static inline int
futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
                              u32 oldval, u32 newval)
{
        if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
                return -EFAULT;

        {
                register unsigned long r8 __asm ("r8");
                unsigned long prev;
                __asm__ __volatile__(
                        "  mf;;\n"
                        "  mov %0=r0   \n"
                        "  mov ar.ccv=%4;; \n"
                        "[1:]  cmpxchg4.acq %1=[%2],%3,ar.ccv  \n"
                        "  .xdata4 \"__ex_table\", 1b-., 2f-.\n"
                        "[2:]"
                        : "=r" (r8), "=r" (prev)
                        : "r" (uaddr), "r" (newval),
                          "rO" ((long) (unsigned) oldval)
                        : "memory");
                *uval = prev;
                return r8;
        }
}



The list of output operands is
: "=r" (r8), "=r" (prev)
The constraint "=r" tells GCC that these variables live in registers and
must contain valid values when the program flow leaves the assembly
block (output operands).
But "=r" also allows GCC to put them in registers that are used for
input operands. The input operands in this example are uaddr, newval
and oldval.
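
For comparison, GCC documents two constraint modifiers that matter here: '&' (earlyclobber) marks an output that is written before all inputs have been consumed, so GCC must not place it in a register used by any input, and '+' marks an operand that is both read and written. The following is only a sketch of the constraint syntax, not the ia64 code and not necessarily the fix that belongs in futex.h. The asm template is left empty on purpose so that it builds with GCC on any architecture, and the function name constraint_demo() and its parameters are made up for illustration.

```c
/* Hypothetical helper: only the constraint lists matter here.
 * The asm template is empty, so no instruction is emitted and the
 * operands leave the asm statement with the values they had on entry. */
static int constraint_demo(unsigned long *prev_out, unsigned long uaddr,
                           unsigned long newval, unsigned long oldval)
{
        unsigned long status = 0;       /* plays the role of r8 (%0) */
        unsigned long prev = oldval;    /* plays the role of prev (%1) */

        /* "+&r": read-write ('+') and earlyclobber ('&'), so GCC must keep
         * these two operands out of the registers chosen for the inputs. */
        __asm__ __volatile__(""
                : "+&r" (status), "+&r" (prev)
                : "r" (uaddr), "r" (newval), "r" (oldval)
                : "memory");

        *prev_out = prev;
        return (int) status;
}
```

Calling constraint_demo(&p, 1, 2, 3) returns 0 and leaves 3 in p, because nothing inside the empty template modifies the operands. With plain "=r" instead, GCC would be free to reuse an input register for status or prev.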

The second assembly instruction
"  mov %0=r0   \n"
is the first one that writes to a register; it sets %0 to 0. %0 is the
first operand, which is r8 here. (r0 is read-only and always 0 on the
Itanium; it can be used whenever an immediate zero value is needed.)
This instruction might overwrite one of the input registers that are
still needed by the following instructions.
Whether that really happens depends on which registers GCC chooses and
on how it optimizes the code.
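
The hazard can be replayed without ia64 hardware. The sketch below is a toy model, not kernel code: a small array stands in for the register file, and the hypothetical function replay() walks through the three relevant steps (mov %0=r0, mov ar.ccv=%4, cmpxchg4.acq) with an operand-to-register assignment chosen by the caller. The struct alloc type and the register indices are invented for illustration; the access check, the memory fence and the exception table are left out.

```c
#include <stdint.h>

enum { NREGS = 8 };

/* Hypothetical register assignment for three of the asm operands. */
struct alloc {
        int op0;        /* %0: status output (r8 in the report) */
        int op3;        /* %3: newval input */
        int op4;        /* %4: oldval input */
};

/* Replays the instruction sequence on a toy register file.
 * Returns 1 if the store to *word happened, 0 otherwise. */
static int replay(struct alloc a, uint32_t *word,
                  uint32_t newval, uint32_t oldval)
{
        uint64_t reg[NREGS] = {0};
        uint64_t ccv;

        reg[a.op3] = newval;            /* inputs set up by the compiler */
        reg[a.op4] = oldval;

        reg[a.op0] = 0;                 /* mov %0=r0 */
        ccv = reg[a.op4];               /* mov ar.ccv=%4 */

        if (*word == (uint32_t) ccv) {  /* cmpxchg4.acq %1=[%2],%3,ar.ccv */
                *word = (uint32_t) reg[a.op3];
                return 1;
        }
        return 0;
}
```

With distinct indices for op0 and op4 the word is compared against oldval as intended. If op0 and op4 get the same index, the first step zeroes the register that holds oldval, so the comparison is made against 0: an exchange that should succeed fails, and a word that happens to contain 0 is replaced even though oldval is different.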


The objdump utility can show us the disassembly.
The futex_atomic_cmpxchg_inatomic() function is inline, so we have to
look for a module that uses the function. One such user is the
cmpxchg_futex_value_locked() function in

kernel/futex.c:

static int cmpxchg_futex_value_locked(u32 *curval, u32 __user *uaddr,
                                      u32 uval, u32 newval)
{
        int ret;

        pagefault_disable();
        ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval);
        pagefault_enable();

        return ret;
}


Now the disassembly. First, from the kernel package 3.2.23, which was
compiled with GCC 4.4; remember, this kernel seemed to work:

objdump -d linux-3.2.23/debian/build/build_ia64_none_mckinley/kernel/futex.o

0230 :
 230:   0b 18 80 1b 18 21   [MMI]   adds r3=3168,r13;;
 236:   80 40 0d 00 42 00   adds r8=40,r3
 23c:   00 00 04 00 nop.i 0x0;;
 240:   0b 50 00 10 10 10   [MMI]   ld4 r10=[r8];;
 246:   90 08 28 00 42 00   adds r9=1,r10
 24c:   00 00 04 00 nop.i 0x0;;
 250:   09 00 00 00 01 00   [MMI]   nop.m 0x0
 256:   00 48 20 20 23 00   st4 [r8]=r9
 25c:   00 00 04 00 nop.i 0x0;;
 260:   08 10 80 06 00 21   [MMI]   adds r2=32,r3
 266:   00 00 00 02 00 00   nop.m 0x0
 26c:   02 08 f1 52 extr.u r16=r33,0,61
 270:   05 40 88 00 08 e0   [MLX]   addp4 r8=r34,r0
 276:   ff ff 0f 00 00 e0   movl r15=0xfffbfff;;
 27c:   f1 f7 ff 65
 280:   09 70 00 04 18 10   [MMI]   ld8 r14=[r2]
 286:   00 00 00 02 00 c0   nop.m 0x0
 28c:   f0 80 1c d0 cmp.ltu p6,p7=r15,r16;;
 290:   08 40 fc 1d 09 3b   [MMI]   cmp.eq p8,p9=-1,r14
 296:   00 00 00 02 00 40   nop.m 0x0
 29c:   e1 08 2d d0 cmp.ltu p10,p11=r14,r33
 2a0:   56 01 10 00 40 10   [BBB] (p10) br.cond.spnt.few 2e0
 2a6:   02 08 00 80 21 03   (p08) br.cond.dpnt.few 2b0
 2ac:   40 00 00 41 (p06) br.cond.spnt.few 2e0
 2b0:   0a 00 00 00 22 00   [MMI]   mf;;
 2b6:   80 00 00 00 42 00   mov r8=r0
 2bc:   00 00 04 00 nop.i 0x0
 2c0:   0b 00 20 40 2a 04   [MMI]   mov.m ar.ccv=r8;;
 2c6:   10 1a 85 22 20 00   cmpxchg4.acq r33=[r33],r35,ar.ccv
 2cc:   00 00