Re: Tests Failures on PPC64

2009-12-11 Thread Oleg Nesterov
On 12/11, K.Prasad wrote:

 On Thu, Dec 10, 2009 at 08:24:36PM +0100, Oleg Nesterov wrote:
 
  Oh well. I spent this day grepping arch/powerpc to understand how
  PTRACE_SET_DEBUGREG works and what is the problem. But I am afraid
  this time I need help from someone who understands the hardware
  magic on powerpc.
 

 There's relatively less magic with PPC64 (with just one DABR) compared
 to x86 :-)

 I hope to offer a little help here (given that I work to tweak
 ptrace_set_debugreg() in PPC64 to use the hw-breakpoint interfaces)

Thanks, please see another email, I cc'ed you.

 Watchpoints (using DABR) through GDB can fail for many reasons...they
 must ideally be set after the program has started execution, so that
 GDB knows the size of the variable...otherwise it resorts to
 single-stepping to trap accesses to the target variable.

Yes. I straced gdb to be sure it really does PTRACE_SET_DEBUGREG to
use the hardware watchpoint.
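
For readers following along, here is a rough sketch of how the value
handed to PTRACE_SET_DEBUGREG is put together on DABR-based PPC64 parts.
The helper name make_dabr is made up for illustration. The flag layout
(bit 0 read, bit 1 write, bit 2 translation, doubleword address in the
remaining bits) is an assumption based on how ptrace_set_debugreg() in
arch/powerpc is understood to treat its argument, so check it against
the kernel source before relying on it.

```c
#include <assert.h>

/* Assumed layout of the low three DABR bits (see the note above). */
#define DABR_DATA_READ   (1UL << 0)  /* trap on loads */
#define DABR_DATA_WRITE  (1UL << 1)  /* trap on stores */
#define DABR_TRANSLATION (1UL << 2)  /* match translated (virtual) addresses */

/*
 * Hypothetical helper: build the word a tracer would pass as the "data"
 * argument of PTRACE_SET_DEBUGREG. The DABR watches one aligned
 * doubleword, so the low three address bits are dropped and reused
 * for the flags.
 */
unsigned long make_dabr(unsigned long addr, int on_read, int on_write)
{
    unsigned long value;

    /* A watchpoint that triggers on neither loads nor stores is useless. */
    assert(on_read || on_write);

    value = addr & ~7UL;            /* containing 8-byte doubleword */
    value |= DABR_TRANSLATION;      /* assumed to be required by the kernel */
    if (on_read)
        value |= DABR_DATA_READ;
    if (on_write)
        value |= DABR_DATA_WRITE;
    return value;
}
```

On a powerpc tracer the result would then be passed along the lines of
ptrace (PTRACE_SET_DEBUGREG, child, 0, make_dabr (addr, 0, 1)), with the
addr argument of the request left at 0.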

There is something strange though. gdb does PTRACE_SINGLESTEP and only
then PTRACE_CONT after "watch xxx".

Where can one find the relevant piece of testcase?

http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/watchpoint.c?cvsroot=systemtap

Oleg.



Re: Tests Failures on PPC64

2009-12-11 Thread K.Prasad
On Fri, Dec 11, 2009 at 04:59:44PM +0100, Oleg Nesterov wrote:
 On 12/11, K.Prasad wrote:
  Watchpoints (using DABR) through GDB can fail for many reasons...they
  must ideally be set after the program has started execution, so that
  GDB knows the size of the variable...otherwise it resorts to
  single-stepping to trap accesses to the target variable.
 
 Yes. I straced gdb to be sure it really does PTRACE_SET_DEBUGREG to
 use the hardware watchpoint.
 
 There is something strange though. gdb does PTRACE_SINGLESTEP and only
 then PTRACE_CONT after "watch xxx".


I haven't taken a good look at the testcase...although I suspect that
the use of PTRACE_SINGLESTEP vs PTRACE_SET_DEBUGREG during your trials
is due to the way watch var is being set.

For instance, here are two screenlogs taken from a PPC64 (Power5 box
running RHEL 5.3 2.6.18-128.el5). It can be seen that hw-breakpoints are
used only during the second-run of GDB vs single-stepping done in the
first...wondering if you used it in similar ways.
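
As a rough illustration of what the single-stepping fallback amounts to
(this is not GDB's actual code, and every name below is hypothetical),
a tracer can step the tracee one instruction at a time and re-read the
watched word with PTRACE_PEEKDATA after each step. The sketch assumes a
Linux tracer, a word-sized variable, and a tracee created by fork() so
that the variable has the same address in both processes.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Variable the tracee modifies; same address in parent and child. */
volatile long watched;

/* Read one word from the tracee. Returns 0 on success, -1 on error. */
int peek_long(pid_t pid, void *addr, long *out)
{
    long v;

    errno = 0;                      /* -1 is also a valid data value */
    v = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
    if (v == -1 && errno != 0)
        return -1;
    *out = v;
    return 0;
}

/*
 * Single-step a stopped tracee until the word at addr changes.
 * Returns 1 if a change was seen, 0 if the tracee went away or
 * max_steps ran out first, -1 on a ptrace/waitpid error.
 */
int software_watch(pid_t pid, void *addr, long max_steps,
                   long *old_val, long *new_val)
{
    long i;

    if (peek_long(pid, addr, old_val) != 0)
        return -1;
    for (i = 0; i < max_steps; i++) {
        int status;

        if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) != 0)
            return -1;
        if (waitpid(pid, &status, 0) != pid)
            return -1;
        if (!WIFSTOPPED(status))
            return 0;               /* tracee exited or was killed */
        if (peek_long(pid, addr, new_val) != 0)
            return -1;
        if (*new_val != *old_val)
            return 1;
    }
    return 0;
}

/*
 * Fork a tracee that stops itself and then stores value into watched.
 * Returns the pid of the stopped child, or -1 on failure.
 */
pid_t spawn_stopped_tracee(long value)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(1);
        raise(SIGSTOP);             /* hand control to the tracer */
        watched = value;            /* the store the tracer looks for */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return -1;
    return pid;
}
```

A tracer would call spawn_stopped_tracee() and then software_watch() on
(void *) &watched. Every instruction costs a stop and a peek, which is
why the DABR path is preferable whenever it is available.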

First run
---
[r...@p510 ~]# gdb prasad
GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "ppc64-redhat-linux-gnu"...
(gdb) watch b
Watchpoint 1: b   <-- (plain "Watchpoint", not "Hardware watchpoint")
(gdb) r
Starting program: /root/prasad 

Prasad: sizeof(long): sizeof(b)=4
Watchpoint 1: b

Old value = 0
New value = 200
main () at a.c:17
17  i = 300;
(gdb) c


Second run
---
[r...@p510 ~]# gdb prasad
GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "ppc64-redhat-linux-gnu"...
(gdb) start
Breakpoint 1 at 0x14bc: file a.c, line 14.
Starting program: /root/prasad 
main () at a.c:14
14  printf("\nPrasad: sizeof(long): sizeof(b)=%d\n", sizeof(b));
(gdb) watch b
Hardware watchpoint 2: b   <-- (uses DABR, unlike the first run)
(gdb) q
The program is running.  Exit anyway? (y or n) y



Re: powerpc: PPC970FX dabr bug? (Was: Tests Failures on PPC64)

2009-12-11 Thread Oleg Nesterov
On 12/11, Oleg Nesterov wrote:

 For those who didn't read the whole thread, the test-case:
 http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/watchpoint.c?cvsroot=systemtap

or you can look at

https://www.redhat.com/archives/utrace-devel/2009-December/msg00096.html

it is very easy to reproduce the problem with gdb.

 On 12/10, Oleg Nesterov wrote:
 
  On 12/10, Oleg Nesterov wrote:
  
   On 12/09, CAI Qian wrote:
   
- Oleg Nesterov o...@redhat.com wrote:
   
 Thanks, but it doesn't fail for me on this machine...
   
Hmm, it failed for me.
   
# cd /root/ptrace-tests
   
# make check
...
FAIL: watchpoint
  
   OMG. Yet another test-case fails on powerpc. I didn't see this
   failure in the previous reports, or I missed it ...
  
   I bet it fails without utrace too? (please don't tell me it doesn't ;)
  
   Did you see it fail on other ppc64 machines?
  
  
   Oh well. I spent this day grepping arch/powerpc to understand how
   PTRACE_SET_DEBUGREG works and what is the problem. But I am afraid
   this time I need help from someone who understands the hardware
   magic on powerpc.
  
   So far:
  
 - the test-case looks correct to me
 
  OOPS.
 
  I am not sure, will re-check tomorrow. But it seems to me gcc
  optimizes out check = 1, despite the fact it is declared as
  volatile.
 
 No, I misread the asm (which I don't understand anyway). The tracee
 does write to check, and this is even seen by PTRACE_PEEKDATA.
 
 Looks like a hardware problem to me. For example, this patch
 
   --- watchpoint.c~   2009-12-11 15:32:14.0 +0100
   +++ watchpoint.c    2009-12-11 15:36:17.0 +0100
   @@ -144,7 +144,7 @@ handler_fail (int signo)
      raise (signo);
    }

   -static volatile long long check;
   +volatile long long check;

    int
    main (void)
 
 fixes the problem. This one
 
   --- watchpoint.c~   2009-12-11 15:32:14.0 +0100
   +++ watchpoint.c    2009-12-11 15:38:10.0 +0100
   @@ -169,7 +169,7 @@ main (void)
       i = raise (SIGUSR1);
       assert (i == 0);

   -   check = 1;
   +   check = 0xfff;

       i = raise (SIGUSR2);
       assert (i == 0);
 
 helps too (any value which can't be immediate for powerpc works,
 unless I misinterpret asm again).
 
 
 I give up, this needs help from powerpc experts. As a last resort
 I tried google,
 
   # grep cpu /proc/cpuinfo
   cpu : PPC970FX, altivec supported
   cpu : PPC970FX, altivec supported
 
 
 http://www.google.com/linux?q=powerpc+970FX+dabr+bug
 
 from http://lists.ozlabs.org/pipermail/linuxppc-dev/2008-March/052910.html
 
   Which is IBM PowerPC 970FX RISC Microprocessor Errata List for DD3.X
   and contains Erratum #8: DABRX register might not always be updated
   correctly:
 
   Projected Impact
     The data address breakpoint function might not always work.
   Workaround
     None.
   Status
     A fix is not planned at this time for the PowerPC 970FX.
 
 but this machine sets set_dabr = pseries_set_dabr(), not pseries_set_xdabr(),
 not sure this is relevant.
 
 Gurus, please help!
 
 Oleg.