[QUERY] signal_struct->count/live

2009-11-27 Thread Ananth N Mavinakayanahalli
Oleg,

I am confused as to why we need two atomics, count and live, in signal_struct.

report_death() uses ->live as the group_dead indicator, while there are
places (like the scheduler) which use ->count as the nr_threads
indicator.

I tried git blame to see if it remembers why, but the addition predates
2.6.12 and so it does not know.

Could you please shed some light on this?

Ananth



Re: [RFC,PATCH 0/14] utrace/ptrace

2009-11-27 Thread Christoph Hellwig
On Thu, Nov 26, 2009 at 01:24:41PM +0100, Ingo Molnar wrote:
 FYI, the merge window has not opened yet, so it cannot close in a few 
 days.

The subsystems' merge window, not Linus'.

 
  [...] and thus not getting any of the broad -next test coverage is a 
  pretty bad idea.  In the end it will be the maintainers ruling but 
  that doesn't make it a good idea from the engineering point of view.
 
 FYI, it's been in -mm, that's where it's maintained.

None of the recent mm snapshots has anything utrace related in there,
just a few ptrace patches from Oleg (which are in this series but a very
small part of it) and certainly not all this new code that is pretty
recent (take a look at the utrace list for the development).

 Yes. Which is a further argument to not do it like that but to do one 
 arch at a time. Trying to do too much at once is bad engineering.

I'm not sure why you're trying to pick fights here, but no one has said
anything about doing it all at once.  The point I'm trying to make is that
it's pretty bad to keep parallel ptrace implementations, and we should settle
on one.  A prerequisite for using the new one generically is to have the
architecture ptrace code updated.  I think arm and mips are the only two
relevant ones still missing, so updating them and killing the other
ones is easy.

If you think keeping the two ptrace implementations is fine argue for
that directly, but please stick to the technical points instead of just
fighting for fighting's sake.



Re: utrace-ptrace gdb testsuite tesults

2009-11-27 Thread Veaceslav Falico
On Wed, Nov 25, 2009 at 01:17:15PM -0800, Roland McGrath wrote:
 
 That's certainly good to hear.  If you are pretty confident about that,
 then I am quite happy to consider nonregression on all of ptrace-tests the
 sole gating test for kernel changes.  We just don't want to wind up having
 other upstream reviewers notice a regression using gdb that we didn't
 notice before we submitted a kernel change.
 

I've just done 'make check' twice on unpatched kernel, and found that the
results are not stable:

--- gdb.sum 2009-11-27 09:54:14.0 +0100
+++ gdb.sum2 2009-11-27 10:51:42.0 +0100
@@ -1,4 +1,4 @@
-Test Run By root on Thu Nov 26 18:52:09 2009
+Test Run By root on Fri Nov 27 09:54:33 2009
 Native configuration is i686-pc-linux-gnu
 
=== gdb tests ===
@@ -3537,12 +3537,12 @@ PASS: gdb.base/foll-fork.exp: unpatch ch
 PASS: gdb.base/foll-fork.exp: unpatch child, catch fork
 PASS: gdb.base/foll-fork.exp: unpatch child, breakpoint at exit call
 PASS: gdb.base/foll-fork.exp: unpatch child, set follow child
-FAIL: gdb.base/foll-fork.exp: unpatch child, unpatched parent breakpoints from child (timeout)
+PASS: gdb.base/foll-fork.exp: unpatch child, unpatched parent breakpoints from child
 PASS: gdb.base/foll-fork.exp: explicit parent follow, set tcatch fork
 PASS: gdb.base/foll-fork.exp: explicit parent follow, tcatch fork
 PASS: gdb.base/foll-fork.exp: set follow parent
 PASS: gdb.base/foll-fork.exp: set follow parent, tbreak
-PASS: gdb.base/foll-fork.exp: set follow parent, hit tbreak
+FAIL: gdb.base/foll-fork.exp: (timeout) set follow parent, hit tbreak
 PASS: gdb.base/foll-fork.exp: set follow parent, cleanup
 Running ./gdb.base/foll-vfork.exp ...
 PASS: gdb.base/foll-vfork.exp: set verbose
@@ -12499,7 +12499,7 @@ PASS: gdb.mi/mi-nsmoribund.exp: thread s
 PASS: gdb.mi/mi-nsmoribund.exp: resume all, thread specific breakpoint
 PASS: gdb.mi/mi-nsmoribund.exp: hit thread specific breakpoint
 PASS: gdb.mi/mi-nsmoribund.exp: thread state: all running except the breakpoint thread
-PASS: gdb.mi/mi-nsmoribund.exp: resume all, program exited normally
+FAIL: gdb.mi/mi-nsmoribund.exp: unexpected stop
 Running ./gdb.mi/mi-nsthrexec.exp ...
 PASS: gdb.mi/mi-nsthrexec.exp: successfully compiled posix threads test case
 PASS: gdb.mi/mi-nsthrexec.exp: breakpoint at main
@@ -14507,7 +14507,7 @@ PASS: gdb.threads/watchthreads2.exp: bre
 PASS: gdb.threads/watchthreads2.exp: all threads started
 PASS: gdb.threads/watchthreads2.exp: watch x
 PASS: gdb.threads/watchthreads2.exp: set var test_ready = 1
-KFAIL: gdb.threads/watchthreads2.exp: gdb can drop watchpoints in multithreaded app (PRMS: gdb/10116)
+PASS: gdb.threads/watchthreads2.exp: all threads incremented x
 Running ./gdb.threads/watchthreads.exp ...
 PASS: gdb.threads/watchthreads.exp: successfully compiled posix threads test case
 PASS: gdb.threads/watchthreads.exp: watch args[0]
@@ -14672,7 +14672,7 @@ UNSUPPORTED: gdb.xml/tdesc-xinclude.exp:
=== gdb Summary ===
 
 # of expected passes   13854
-# of unexpected failures   75
+# of unexpected failures   76
 # of expected failures 43
 # of untested testcases 7
 # of unsupported tests 59

--
Veaceslav



Re: utrace-ptrace gdb testsuite tesults

2009-11-27 Thread Jan Kratochvil
On Fri, 27 Nov 2009 15:11:09 +0100, Veaceslav Falico wrote:
 -FAIL: gdb.base/foll-fork.exp: unpatch child, unpatched parent breakpoints from child (timeout)
 +PASS: gdb.base/foll-fork.exp: unpatch child, unpatched parent breakpoints from child
 -PASS: gdb.base/foll-fork.exp: set follow parent, hit tbreak
 +FAIL: gdb.base/foll-fork.exp: (timeout) set follow parent, hit tbreak

To be ignored, fixed upstream:
http://sourceware.org/ml/gdb-patches/2009-11/msg00573.html


 -PASS: gdb.mi/mi-nsmoribund.exp: resume all, program exited normally
 +FAIL: gdb.mi/mi-nsmoribund.exp: unexpected stop
 -KFAIL: gdb.threads/watchthreads2.exp: gdb can drop watchpoints in multithreaded app (PRMS: gdb/10116)
 +PASS: gdb.threads/watchthreads2.exp: all threads incremented x

These are known to be unstable, but there are some known watch and non-stop
problems, so it may not even be a testcase-side bug.


Therefore this test shows no changes/regressions.


Regards,
Jan



Re: [RFC,PATCH 0/14] utrace/ptrace

2009-11-27 Thread Oleg Nesterov
On 11/27, Christoph Hellwig wrote:

 On Thu, Nov 26, 2009 at 01:24:41PM +0100, Ingo Molnar wrote:
 
  FYI, it's been in -mm, that's where it's maintained.

 None of the recent mm snapshots has anything utrace related in there,

Well, not that I think this is important, but...

Two weeks ago we asked Andrew to drop utrace-core.patch from -mm,
it should be replaced by this updated version.

Oleg.



Re: utrace-ptrace gdb testsuite tesults

2009-11-27 Thread Oleg Nesterov
On 11/27, Veaceslav Falico wrote:

 On Wed, Nov 25, 2009 at 01:17:15PM -0800, Roland McGrath wrote:
 
  That's certainly good to hear.  If you are pretty confident about that,
  then I am quite happy to consider nonregression on all of ptrace-tests the
  sole gating test for kernel changes.  We just don't want to wind up having
  other upstream reviewers notice a regression using gdb that we didn't
  notice before we submitted a kernel change.
 

 I've just done 'make check' twice on unpatched kernel, and found that the
 results are not stable:

 --- gdb.sum 2009-11-27 09:54:14.0 +0100
 +++ gdb.sum2 2009-11-27 10:51:42.0 +0100
 @@ -1,4 +1,4 @@
 -Test Run By root on Thu Nov 26 18:52:09 2009
 +Test Run By root on Fri Nov 27 09:54:33 2009
  Native configuration is i686-pc-linux-gnu

 === gdb tests ===
 @@ -3537,12 +3537,12 @@ PASS: gdb.base/foll-fork.exp: unpatch ch
  PASS: gdb.base/foll-fork.exp: unpatch child, catch fork
  PASS: gdb.base/foll-fork.exp: unpatch child, breakpoint at exit call
  PASS: gdb.base/foll-fork.exp: unpatch child, set follow child
 -FAIL: gdb.base/foll-fork.exp: unpatch child, unpatched parent breakpoints from child (timeout)
 +PASS: gdb.base/foll-fork.exp: unpatch child, unpatched parent breakpoints from child
  PASS: gdb.base/foll-fork.exp: explicit parent follow, set tcatch fork
  PASS: gdb.base/foll-fork.exp: explicit parent follow, tcatch fork
  PASS: gdb.base/foll-fork.exp: set follow parent
  PASS: gdb.base/foll-fork.exp: set follow parent, tbreak
 -PASS: gdb.base/foll-fork.exp: set follow parent, hit tbreak
 +FAIL: gdb.base/foll-fork.exp: (timeout) set follow parent, hit tbreak
  PASS: gdb.base/foll-fork.exp: set follow parent, cleanup
  Running ./gdb.base/foll-vfork.exp ...
  PASS: gdb.base/foll-vfork.exp: set verbose
 @@ -12499,7 +12499,7 @@ PASS: gdb.mi/mi-nsmoribund.exp: thread s
  PASS: gdb.mi/mi-nsmoribund.exp: resume all, thread specific breakpoint
  PASS: gdb.mi/mi-nsmoribund.exp: hit thread specific breakpoint
  PASS: gdb.mi/mi-nsmoribund.exp: thread state: all running except the breakpoint thread
 -PASS: gdb.mi/mi-nsmoribund.exp: resume all, program exited normally
 +FAIL: gdb.mi/mi-nsmoribund.exp: unexpected stop
  Running ./gdb.mi/mi-nsthrexec.exp ...
  PASS: gdb.mi/mi-nsthrexec.exp: successfully compiled posix threads test case
  PASS: gdb.mi/mi-nsthrexec.exp: breakpoint at main
 @@ -14507,7 +14507,7 @@ PASS: gdb.threads/watchthreads2.exp: bre
  PASS: gdb.threads/watchthreads2.exp: all threads started
  PASS: gdb.threads/watchthreads2.exp: watch x
  PASS: gdb.threads/watchthreads2.exp: set var test_ready = 1
 -KFAIL: gdb.threads/watchthreads2.exp: gdb can drop watchpoints in multithreaded app (PRMS: gdb/10116)
 +PASS: gdb.threads/watchthreads2.exp: all threads incremented x
  Running ./gdb.threads/watchthreads.exp ...
  PASS: gdb.threads/watchthreads.exp: successfully compiled posix threads test case
  PASS: gdb.threads/watchthreads.exp: watch args[0]
 @@ -14672,7 +14672,7 @@ UNSUPPORTED: gdb.xml/tdesc-xinclude.exp:
 === gdb Summary ===

  # of expected passes   13854
 -# of unexpected failures   75
 +# of unexpected failures   76
  # of expected failures 43
  # of untested testcases 7
  # of unsupported tests 59

Nice, thanks.

So. I am going to conclude that, more or less, utrace-ptrace passes
these tests.

Jan, if you see something particular which needs more attention or should
be fixed, please let me know. I'll try to investigate then.

Oleg.



Re: utrace-ptrace gdb testsuite tesults

2009-11-27 Thread Jan Kratochvil
On Fri, 27 Nov 2009 15:34:05 +0100, Oleg Nesterov wrote:
 Jan, if you see something particular which needs more attention or should
 be fixed, please let me know. I'll try to investigate then.

I still have not finished the verifications I started yesterday, but so far no
kernel behavior change has been proven and I doubt one will be.  Going to reply
today.

The ppc kernel should be checked, but I have not built a matching pair of
non-utrace/utrace kernel rpms for it.


Regards,
Jan



Re: powerpc: fork stepping (Was: [RFC, PATCH 0/14] utrace/ptrace)

2009-11-27 Thread Oleg Nesterov
On 11/27, Ananth N Mavinakayanahalli wrote:

 On Thu, Nov 26, 2009 at 03:50:51PM +0100, Oleg Nesterov wrote:

  Ananth, could you please run the test-case from the changelog
  below ? I do not really expect this can help, but just in case.

 Right, it doesn't help :-(

 GDB shows that the parent is forever stuck at wait().

Now this is interesting. Could you please double check the parent hangs
in wait() ?

This doesn't match the testing we did on powerpc machine with Veaceslav,
and I hoped the problem was already resolved?

Please see other emails in this thread.


Hmm. Fortunately I still have access to the testing machine.
Yes, according to gdb it looks as if it hangs in wait(). This
is not true. You can strace gdb itself, or look at xxx_ctxt_switches
in /proc/pid_of_parent/status.

Better yet, do not use gdb at all. Just strace (without -f) the parent,
you should see it continues to trace the child and loops forever.

Oleg.
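
The check against xxx_ctxt_switches suggested above can be scripted. What follows is a minimal sketch, assuming a Linux /proc whose status file carries the voluntary_ctxt_switches and nonvoluntary_ctxt_switches lines; the struct and function names are hypothetical and exist only for this illustration. Sample the parent twice a few seconds apart: if the counters keep advancing, the task is being scheduled and is not parked in wait().

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>	/* getpid(), handy for sampling the current process */

/* Hypothetical container for the two counters in /proc/<pid>/status. */
struct ctxt_switches {
	long voluntary;
	long nonvoluntary;
};

/*
 * Read both context-switch counters for `pid`.
 * Returns 0 on success, -1 if the file cannot be opened or a field is missing.
 */
static int read_ctxt_switches(pid_t pid, struct ctxt_switches *out)
{
	char path[64], line[256];
	int found = 0;
	FILE *f;

	assert(out != NULL);
	snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
	f = fopen(path, "r");
	if (f == NULL)
		return -1;

	while (fgets(line, sizeof(line), f) != NULL) {
		/* The "non" prefix fails the first pattern, so the order is safe. */
		if (sscanf(line, "voluntary_ctxt_switches: %ld", &out->voluntary) == 1)
			found |= 1;
		else if (sscanf(line, "nonvoluntary_ctxt_switches: %ld",
				&out->nonvoluntary) == 1)
			found |= 2;
	}
	fclose(f);
	return found == 3 ? 0 : -1;
}
```

Calling read_ctxt_switches(pid_of_parent, &a), sleeping, then calling it again into a second struct and comparing the fields gives the same answer as watching the file by hand.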



Re: [QUERY] signal_struct->count/live

2009-11-27 Thread Ananth N Mavinakayanahalli
On Fri, Nov 27, 2009 at 04:15:21PM +0100, Oleg Nesterov wrote:
 On 11/27, Ananth N Mavinakayanahalli wrote:
 
  I am confused as to why we need two atomics count and live in signal_struct.
 
  report_death() uses ->live as the group_dead indicator,
 
 report_death? Perhaps you meant do_exit() ?

Right, do_exit() and that is what is picked up by
tracehook_report_death(), and in turn by report_death().

  while there are
  places (like the scheduler) which uses ->count as the nr_threads
  indicator.
 
  I tried git blame to see if it remembers why, but the addition predates
  2.6.12 and so it does not know.
 
  Could you please shed some light on this?
 
 In short: signal->count must die. I was going to do this long ago
 but never had the time. See also commit
 4ab6c08336535f8c8e42cf45d7adeda882eff06e, this is the first step.
 
 Last time I did the grepping, almost every usage of signal->count was
 not right. For example, __exit_signal() is correct, but it doesn't
 need to use ->count.
 
 Except: it is needed for things like get_nr_threads() in proc.
 
 In short: never use signal->count ;)

Thanks for the clarification Oleg.

Ananth



Re: powerpc: fork stepping (Was: [RFC, PATCH 0/14] utrace/ptrace)

2009-11-27 Thread Ananth N Mavinakayanahalli
On Fri, Nov 27, 2009 at 06:46:27PM +0100, Veaceslav Falico wrote:
 On Thu, Nov 26, 2009 at 11:37:03PM +0100, Oleg Nesterov wrote:
 
  Could you look at this
 
ptrace-copy_process-should-disable-stepping.patch
http://marc.info/?l=linux-mm-commits&m=125789789322573
 
  patch? It is not clear to me how we can modify the test-case to
  verify it fixes the original problem for powerpc.
 
 I modified the test-case, it confirms that
 ptrace-copy_process-should-disable-stepping.patch fixes the
 problem with TIF_SINGLESTEP copied by fork() on powerpc.
 
 Probably we need a similar fix for step-fork.c in ptrace-tests.
 
 Modified the original testcase to call fork via syscall(__NR_fork),
 to avoid the looping inside libc's fork() on powerpc.
 The parent singlesteps until he sees that the child has forked, after
 that the parent PTRACE_CONTs until the child exits.

Thanks Veaceslav. This works:

Index: ptrace-tests/tests/step-fork.c
===
--- ptrace-tests.orig/tests/step-fork.c
+++ ptrace-tests/tests/step-fork.c
@@ -29,6 +29,7 @@
 #include <unistd.h>
 #include <sys/wait.h>
 #include <string.h>
+#include <sys/syscall.h>
 #include <signal.h>

 #ifndef PTRACE_SINGLESTEP
@@ -78,7 +79,7 @@ main (int argc, char **argv)
sigprocmask (SIG_BLOCK, &mask, NULL);
ptrace (PTRACE_TRACEME);
raise (SIGUSR1);
-   if (fork () == 0)
+   if (syscall(__NR_fork) == 0)
  {
read (-1, NULL, 0);
_exit (22);
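
The idea behind the one-line change, forking through the raw syscall so the tracer does not have to step through libc's fork() wrapper, can be shown in isolation. This is a minimal sketch with hypothetical helper names (raw_fork, fork_and_reap); it assumes the architecture defines __NR_fork and falls back to fork() where it does not, and it involves no ptrace at all.

```c
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork without going through libc's fork() wrapper where the
 * architecture provides a fork syscall; otherwise use fork().
 */
static pid_t raw_fork(void)
{
#ifdef __NR_fork
	return (pid_t)syscall(__NR_fork);
#else
	return fork();
#endif
}

/*
 * Fork a child that exits immediately with `code` and return the exit
 * code the parent observes, or -1 on any failure.
 */
static int fork_and_reap(int code)
{
	int status;
	pid_t pid;

	assert(code >= 0 && code <= 255);
	pid = raw_fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: no atexit handlers, no stdio flush */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit() right away because a raw fork skips whatever bookkeeping libc does around fork(), so the child should not rely on libc state beyond that point.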

Oleg,
With the above patch applied, syscall-reset is the only failure I see on
powerpc:

errno 14 (Bad address)
syscall-reset: syscall-reset.c:95: main: Assertion `(*__errno_location ()) == 38' failed.
unexpected child status 67f
FAIL: syscall-reset
...

1 of 40 tests failed
(11 tests were not run)
Please report to utrace-devel@redhat.com


Ananth
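
The "unexpected child status 67f" line in the output above can be decoded with the standard wait-status macros. A minimal sketch, with a hypothetical helper name; on Linux, 0x67f reads as "stopped by signal 6", and signal 6 is SIGABRT, which would be consistent with the failed assertion calling abort() in the traced child. That last link is an inference from the log, not something the test output states.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Render a wait status as text into buf and return buf. */
static const char *describe_status(int status, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	if (WIFEXITED(status))
		snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
	else if (WIFSIGNALED(status))
		snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
	else if (WIFSTOPPED(status))
		snprintf(buf, len, "stopped by signal %d", WSTOPSIG(status));
	else
		snprintf(buf, len, "unrecognized status 0x%x", status);
	return buf;
}
```

Passing 0x67f takes the WIFSTOPPED branch: the low byte 0x7f marks a stop, and the next byte, 0x06, is the stop signal.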