On 11/27, Paul Mackerras wrote:
>
> Oleg Nesterov writes:
>
> > 0xfeacd24
> > 0xfeacd28
> > 0xfeacd2c
> > 0xfeacd30
> > 0xfeacd34
> > ...
> >
> > and so on forever,
> ...
> > beg-> 0x0feacd24 <__GI__IO_list_lock+68>: lwarx   r0,0,r31
> >       0x0feacd28 <__GI__IO_list_lock+72>: cmpw    r0,r11
> >       0x0feacd2c <__GI__IO_list_lock+76>: bne-    0xfeacd38 <__GI__IO_list_lock+88>
> >       0x0feacd30 <__GI__IO_list_lock+80>: stwcx.  r9,0,r31
> > end-> 0x0feacd34 <__GI__IO_list_lock+84>: bne+    0xfeacd24 <__GI__IO_list_lock+68>
> >
> > I don't even know whether this is user-space bug or kernel bug,
> > the asm above is the black magic for me.
>
> The lwarx and stwcx. work together to do an atomic update to the word
> whose address is in r31.  They are like LL (load-linked) and SC
> (store-conditional) on other architectures such as alpha.  Basically
> the lwarx creates an internal "reservation" on the word pointed to by
> r31 and loads its value into r0.  The stwcx. stores into that word but
> only if the reservation still exists.  The reservation gets cleared
> (in hardware) if any other cpu writes to that word in the meantime.
> If the reservation did get cleared, the bne (branch if not equal)
> instruction will be taken and we loop around to try again.
>
> There is a difficulty when single-stepping through such a sequence
> because the process of taking the single-step exception and returning
> will clear the reservation.  Thus if you single-step through that
> sequence it will never succeed.  I believe gdb has code to recognize
> this kind of sequence and run through it without stopping until after
> the bne, precisely to avoid this problem.
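
For reference, the lwarx/stwcx. loop quoted above behaves roughly like a
compare-and-swap retry loop. Below is a minimal C11 sketch of that shape.
It is not glibc's actual _IO_list_lock code: try_lock_word() and its
parameters are hypothetical names chosen for illustration. "expected"
plays the role of r11, "desired" the role of r9, and "word" the address
in r31. On PowerPC, compilers typically lower a weak compare-exchange to
an lwarx/stwcx. pair, and a "spurious" failure is what a lost reservation
looks like from C.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical helper: atomically replace *word with `desired`, but only
 * if it currently holds `expected`. Returns true if the store happened.
 */
static bool try_lock_word(atomic_int *word, int expected, int desired)
{
    for (;;) {
        int seen = expected;

        /* Load, compare, and conditionally store in one atomic step. */
        if (atomic_compare_exchange_weak(word, &seen, desired))
            return true;        /* the conditional store succeeded */

        /* On failure, `seen` holds the value actually found in *word. */
        if (seen != expected)
            return false;       /* like the cmpw/bne- exit: value differs */

        /*
         * The value matched but the store still failed (a spurious
         * failure, e.g. the reservation was lost). Loop and retry,
         * like the final bne+ back to the lwarx.
         */
    }
}
```

If something clears the reservation on every pass, as single-stepping each
instruction does, the retry branch is taken every time and the loop never
exits. That matches the repeating address trace quoted above.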

Thanks! This explains everything, I think.

Could you look at this

	ptrace-copy_process-should-disable-stepping.patch
	http://marc.info/?l=linux-mm-commits&m=125789789322573

patch? It is not clear to me how we can modify the test-case to verify
it fixes the original problem for powerpc. At least, do you think this
patch is good for powerpc?

Oleg.