Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set ->ops = utrace_detached_ops lockless)

2010-01-11 Thread CAI Qian
Thanks for pointing that out. Sorry for the false alarm.



Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set ->ops = utrace_detached_ops lockless)

2009-12-14 Thread Jan Kratochvil
On Wed, 09 Dec 2009 19:12:41 +0100, Oleg Nesterov wrote:
> > > while the '.func_name' is the text address.
> >
> > tried to change the code to
> >
> >       REGS_ACCESS (regs, nip) = (unsigned long) .raise_sigusr2
> >
> > but gcc doesn't like this ;)
...
> Yes, I verified the patch below fixes step-jump-cont.c on
> ibm-js20-02.lab.bos.redhat.com.

I checked in a similar patch, written the same way as the code now used in
other testcases; sorry for not using your patch.


Regards,
Jan


--- step-jump-cont.c    8 Dec 2008 18:23:41 -0000       1.12
+++ step-jump-cont.c    14 Dec 2009 11:38:37 -0000      1.13
@@ -213,6 +213,24 @@ int main (void)
   REGS_ACCESS (regs, eip) = (unsigned long) raise_sigusr2;
 #elif defined __x86_64__
   REGS_ACCESS (regs, rip) = (unsigned long) raise_sigusr2;
+#elif defined __powerpc64__
+  {
+    /* ppc64 `raise_sigusr2' resolves to the function descriptor.  */
+    union
+      {
+        void (*f) (void);
+        struct
+          {
+            void *entry;
+            void *toc;
+          }
+        *p;
+      }
+    const func_u = { raise_sigusr2 };
+
+    REGS_ACCESS (regs, nip) = (unsigned long) func_u.p->entry;
+    REGS_ACCESS (regs, gpr[2]) = (unsigned long) func_u.p->toc;
+  }
 #elif defined __powerpc__
   REGS_ACCESS (regs, nip) = (unsigned long) raise_sigusr2;
 #else



Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set ->ops = utrace_detached_ops lockless)

2009-12-09 Thread Oleg Nesterov
On 12/08, Ananth N Mavinakayanahalli wrote:

> On Mon, Dec 07, 2009 at 07:05:40PM +0100, Oleg Nesterov wrote:
> > On 12/07, Oleg Nesterov wrote:
> > >
> > > On 12/07, Jan Kratochvil wrote:
> > > >
> > > > On Mon, 07 Dec 2009 15:24:51 +0100, Oleg Nesterov wrote:
> > > > > But. raise_sigusr2 is not equal to the actual address of raise_sigusr2(),
> > > > > this value points to the thunk (I do not know the correct English term)
> > > >
> > > > ppc64 calls it function descriptor (GDB
> > > > ppc64_linux_convert_from_func_ptr_addr):
> > > >    For PPC64, a function descriptor is a TOC entry,
> > >
> > > Thanks Jan.
> > >
> > > > in a data section,
> > >
> > > Yes!
> > >
> > > Now I can't understand how this test-case could ever work on ppc.
> > > step-jump-cont does:
> > >
> > >         regs->nip = raise_sigusr2;  <--- points to data section
> > >         ptrace(PTRACE_CONT);
> > >
> > > of course, the tracee gets SIGSEGV, this section is not executable.
> >
> > Hmm. Looks like, powerpc means a lot of different hardware, and
> > _PAGE_EXEC may be 0. I didn't notice this when I quickly grepped
> > arch/powerpc/
> >
> > IOW, perhaps on some machines r implies x ?
> >
> > If yes, this can explain why the results differ on different
> > machines.
>
> Well, powerpc 32-bit adheres to the SVR4 ABI, while powerpc 64-bit uses
> the PPC64-ELF ABI (http://refspecs.linuxfoundation.org/ELF/ppc64/). The
> 64bit ABI uses function descriptors and the 'func_name' is the data
> address,

Cai, Ananth, thank you.

So. I think we can forget about the possible kernel problems (and
in any case we can rule out utrace).

The test-case is just wrong and should be fixed. The tracee can't execute
the function descriptor in the data section, that is why it gets SIGSEGV.

> while the '.func_name' is the text address.

I tried to change the code to

        REGS_ACCESS (regs, nip) = (unsigned long) .raise_sigusr2

but gcc doesn't like this ;)

> (See
> handle_rt_signal64 in arch/powerpc/kernel/signal_64.c and
> kprobe_lookup_name in arch/powerpc/include/asm/kprobes.h.)

Thanks... looking at handle_rt_signal64(), it looks like we should
also set regs->gpr[2] = funct_desc_ptr->toc if we change regs->nip.


I hope someone who understands powerpc could fix the test-case ;)

Oleg.



Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set ->ops = utrace_detached_ops lockless)

2009-12-08 Thread Ananth N Mavinakayanahalli
On Mon, Dec 07, 2009 at 01:43:27PM +0100, Oleg Nesterov wrote:
> On 12/06, CAI Qian wrote:
>
> Ananth, could you please confirm once again that step-jump-cont (from
> ptrace-tests testsuite) does not fail on your machine? If yes, please tell
> me the version of glibc/gcc. Is PTRACE_GETREGS defined on your machine?

Hi Oleg,

It works for me on a Fedora 12 machine.

[ana...@mjs22lp1 ptrace-tests]$ gcc --version
gcc (GCC) 4.4.2 20091027 (Red Hat 4.4.2-7)

[ana...@mjs22lp1 ptrace-tests]$ rpm -qa |grep glibc
glibc-common-2.11-2.ppc
glibc-2.11-2.ppc64
glibc-devel-2.11-2.ppc
glibc-static-2.11-2.ppc
glibc-2.11-2.ppc
glibc-devel-2.11-2.ppc64
glibc-headers-2.11-2.ppc

And yes, PTRACE_GETREGS is defined in /usr/include/asm/ptrace.h

Ananth



Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set ->ops = utrace_detached_ops lockless)

2009-12-08 Thread Ananth N Mavinakayanahalli
On Mon, Dec 07, 2009 at 07:05:40PM +0100, Oleg Nesterov wrote:
> On 12/07, Oleg Nesterov wrote:
> >
> > On 12/07, Jan Kratochvil wrote:
> > >
> > > On Mon, 07 Dec 2009 15:24:51 +0100, Oleg Nesterov wrote:
> > > > But. raise_sigusr2 is not equal to the actual address of raise_sigusr2(),
> > > > this value points to the thunk (I do not know the correct English term)
> > >
> > > ppc64 calls it function descriptor (GDB
> > > ppc64_linux_convert_from_func_ptr_addr):
> > >    For PPC64, a function descriptor is a TOC entry,
> >
> > Thanks Jan.
> >
> > > in a data section,
> >
> > Yes!
> >
> > Now I can't understand how this test-case could ever work on ppc.
> > step-jump-cont does:
> >
> >       regs->nip = raise_sigusr2;  <--- points to data section
> >       ptrace(PTRACE_CONT);
> >
> > of course, the tracee gets SIGSEGV, this section is not executable.
>
> Hmm. Looks like, powerpc means a lot of different hardware, and
> _PAGE_EXEC may be 0. I didn't notice this when I quickly grepped
> arch/powerpc/
>
> IOW, perhaps on some machines r implies x ?
>
> If yes, this can explain why the results differ on different
> machines.

Well, powerpc 32-bit adheres to the SVR4 ABI, while powerpc 64-bit uses
the PPC64-ELF ABI (http://refspecs.linuxfoundation.org/ELF/ppc64/). The
64bit ABI uses function descriptors and the 'func_name' is the data
address, while the '.func_name' is the text address. (See
handle_rt_signal64 in arch/powerpc/kernel/signal_64.c and
kprobe_lookup_name in arch/powerpc/include/asm/kprobes.h.)

Ananth



Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set ->ops = utrace_detached_ops lockless)

2009-12-07 Thread caiqian

> I'll try to investigate, but currently I am all confused, and I
> suspect we have some user-space issues. If only I knew something about ppc...

Sorry for the confusion.

 
> Ananth, could you please confirm once again that step-jump-cont (from
> ptrace-tests testsuite) does not fail on your machine? If yes, please tell
> me the version of glibc/gcc. Is PTRACE_GETREGS defined on your
> machine?

Funnily enough, the above failure has only been seen on that particular system
so far. In fact, different PPC64 systems give different results there
(roland's git tree + your lockless patch).

ibm-js20-02.lab.bos.redhat.com
FAIL: watchpoint
ppc-dabr-race: ./../tests/ppc-dabr-race.c:141: handler_fail: Assertion `0' failed.
/bin/sh: line 5: 16928 Aborted ${dir}$tst
FAIL: ppc-dabr-race
syscall-reset: ./../tests/syscall-reset.c:95: main: Assertion `(*__errno_location ()) == 38' failed.
errno 14 (Bad address)
unexpected child status 67f
FAIL: syscall-reset
step-fork: ./../tests/step-fork.c:56: handler_fail: Assertion `0' failed.
/bin/sh: line 5: 31144 Aborted ${dir}$tst
FAIL: step-fork

ibm-js22-02.rhts.bos.redhat.com
ibm-js12-04.rhts.bos.redhat.com
ibm-js12-05.rhts.bos.redhat.com
On these, only syscall-reset and step-fork appear to fail, as we have discussed
before. I'll keep ibm-js20-02.lab.bos.redhat.com reserved for the moment.

Thanks,
CAI Qian



Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set ->ops = utrace_detached_ops lockless)

2009-12-07 Thread Oleg Nesterov
On 12/07, caiq...@redhat.com wrote:

> > Ananth, could you please confirm once again that step-jump-cont (from
> > ptrace-tests testsuite) does not fail on your machine? If yes, please tell
> > me the version of glibc/gcc. Is PTRACE_GETREGS defined on your
> > machine?
>
> Funnily enough, the above failure has only been seen on that particular system
> so far. In fact, different PPC64 systems give different results there
> (roland's git tree + your lockless patch).

Great! thanks.

OK, I seem to understand what happens, but I cannot explain WHY this
happens on that machine.

Once again. The tracer changes the tracee's instruction pointer to
the address of raise_sigusr2(), and resumes the tracee. The tracee
gets SIGSEGV right after that.

But. raise_sigusr2 is not equal to the actual address of raise_sigusr2(),
this value points to the thunk (I do not know the correct English term)
which contains the actual address:

(gdb) disassemble 0x100118c0
Dump of assembler code for function raise_sigusr2:
0x100118c0 <raise_sigusr2+0>:   .long 0x0       <--- SIGSEGV
0x100118c4 <raise_sigusr2+4>:   .long 0x1ab0    <--- address of raise_sigusr2()
0x100118c8 <raise_sigusr2+8>:   .long 0x0

And!!! this thunk does NOT live in .text, and its vma does NOT have
the VM_EXEC bit!

# cat /proc/30494/maps
00100000-00120000 r-xp 00000000 00:00 0          [vdso]
10000000-10010000 r-xp 00000000 fd:00 59262      /root/TST/sjc
10010000-10020000 rw-p 00000000 fd:00 59262      /root/TST/sjc

That is why the tracee gets SIGSEGV, and this is correct.


Cai, perhaps you could give me access to another ppc machine where
this test does not fail?

Or, could you please run the trivial program below on that machine?

Oleg.

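#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void my_func(void)
{
}

int main(void)
{
        char cmd[128];

        /* On ppc64 this prints the function descriptor's address. */
        printf("ptr: %p\n", my_func);
        fflush(stdout);

        /* Show the mappings, to see which one contains the pointer. */
        sprintf(cmd, "cat /proc/%d/maps", getpid());
        system(cmd);

        return 0;
}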



Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set ->ops = utrace_detached_ops lockless)

2009-12-07 Thread Jan Kratochvil
On Mon, 07 Dec 2009 15:24:51 +0100, Oleg Nesterov wrote:
> But. raise_sigusr2 is not equal to the actual address of raise_sigusr2(),
> this value points to the thunk (I do not know the correct English term)

ppc64 calls it function descriptor (GDB
ppc64_linux_convert_from_func_ptr_addr):
   For PPC64, a function descriptor is a TOC entry, in a data section,
   which contains three words: the first word is the address of the
   function, the second word is the TOC pointer (r2), and the third word
   is the static chain value.

(gdb) x/8gx 0x805b6f6258
0x805b6f6258 <open>:    0x000000805b65cf68      0x000000805b702ac0
0x805b6f6268 <open64>:  0x000000805b65d010      0x000000805b702ac0

(gdb) x/20i 0x000000805b65cf68
0x805b65cf68 <.__GI___open>:    lwz     r10,-30432(r13)
0x805b65cf6c <.__GI___open+4>:  cmpwi   r10,0
0x805b65cf70 <.__GI___open+8>:  bne-    0x805b65cf84 <.__GI___open+28>

(gdb) info sym 0x000000805b702ac0
last_nip in section .bss

I was not aware of any third word before, and I do not see it there.


Regards,
Jan



Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set ->ops = utrace_detached_ops lockless)

2009-12-07 Thread Oleg Nesterov
On 12/07, Jan Kratochvil wrote:

> On Mon, 07 Dec 2009 15:24:51 +0100, Oleg Nesterov wrote:
> > But. raise_sigusr2 is not equal to the actual address of raise_sigusr2(),
> > this value points to the thunk (I do not know the correct English term)
>
> ppc64 calls it function descriptor (GDB
> ppc64_linux_convert_from_func_ptr_addr):
>    For PPC64, a function descriptor is a TOC entry,

Thanks Jan.

> in a data section,

Yes!

Now I can't understand how this test-case could ever work on ppc.
step-jump-cont does:

        regs->nip = raise_sigusr2;  <--- points to data section
        ptrace(PTRACE_CONT);

of course, the tracee gets SIGSEGV, this section is not executable.

Oleg.



Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set ->ops = utrace_detached_ops lockless)

2009-12-07 Thread Oleg Nesterov
On 12/07, Oleg Nesterov wrote:

> On 12/07, Jan Kratochvil wrote:
> >
> > On Mon, 07 Dec 2009 15:24:51 +0100, Oleg Nesterov wrote:
> > > But. raise_sigusr2 is not equal to the actual address of raise_sigusr2(),
> > > this value points to the thunk (I do not know the correct English term)
> >
> > ppc64 calls it function descriptor (GDB
> > ppc64_linux_convert_from_func_ptr_addr):
> >    For PPC64, a function descriptor is a TOC entry,
>
> Thanks Jan.
>
> > in a data section,
>
> Yes!
>
> Now I can't understand how this test-case could ever work on ppc.
> step-jump-cont does:
>
>       regs->nip = raise_sigusr2;  <--- points to data section
>       ptrace(PTRACE_CONT);
>
> of course, the tracee gets SIGSEGV, this section is not executable.

Hmm. Looks like, powerpc means a lot of different hardware, and
_PAGE_EXEC may be 0. I didn't notice this when I quickly grepped
arch/powerpc/

IOW, perhaps on some machines r implies x ?

If yes, this can explain why the results differ on different
machines.

Oleg.