[QUERY] signal_struct->count/live
Oleg, I am confused as to why we need two atomics, count and live, in signal_struct. report_death() uses ->live as the group_dead indicator, while there are places (like the scheduler) which use ->count as the nr_threads indicator. I tried git blame to see if it remembers why, but the addition predates 2.6.12 and so it does not know. Could you please shed some light on this? Ananth
Re: [RFC,PATCH 0/14] utrace/ptrace
On Thu, Nov 26, 2009 at 01:24:41PM +0100, Ingo Molnar wrote: FYI, the merge window has not opened yet, so it cannot close in a few days. subsystems merged window, not Linus'. [...] and thus not getting any of the broad -next test coverage is a pretty bad idea. In the end it will be the maintainers' ruling but that doesn't make it a good idea from the engineering point of view. FYI, it's been in -mm, that's where it's maintained. None of the recent mm snapshots has anything utrace related in there, just a few ptrace patches from Oleg (which are in this series but a very small part of it) and certainly not all this new code that is pretty recent (take a look at the utrace list for the development). Yes. Which is a further argument to not do it like that but to do one arch at a time. Trying to do too much at once is bad engineering. I'm not sure why you're trying to pick fights here, but no one has said anything about doing it all at once. The point I'm trying to make is that it's pretty bad to keep parallel ptrace implementations, and we should settle on one. A prerequisite of using the new one generically is to have the architecture ptrace code updated. I think arm and mips are the only two relevant ones still missing, so updating them and killing the other ones is easy. If you think keeping the two ptrace implementations is fine, argue for that directly, but please stick to the technical points instead of just fighting for fighting's sake.
Re: utrace-ptrace gdb testsuite tesults
On Wed, Nov 25, 2009 at 01:17:15PM -0800, Roland McGrath wrote: That's certainly good to hear. If you are pretty confident about that, then I am quite happy to consider nonregression on all of ptrace-tests the sole gating test for kernel changes. We just don't want to wind up having other upstream reviewers notice a regression using gdb that we didn't notice before we submitted a kernel change. I've just done 'make check' twice on unpatched kernel, and found that the results are not stable:

--- gdb.sum 2009-11-27 09:54:14.0 +0100
+++ gdb.sum2 2009-11-27 10:51:42.0 +0100
@@ -1,4 +1,4 @@
-Test Run By root on Thu Nov 26 18:52:09 2009
+Test Run By root on Fri Nov 27 09:54:33 2009
 Native configuration is i686-pc-linux-gnu
 
 === gdb tests ===
@@ -3537,12 +3537,12 @@ PASS: gdb.base/foll-fork.exp: unpatch ch
 PASS: gdb.base/foll-fork.exp: unpatch child, catch fork
 PASS: gdb.base/foll-fork.exp: unpatch child, breakpoint at exit call
 PASS: gdb.base/foll-fork.exp: unpatch child, set follow child
-FAIL: gdb.base/foll-fork.exp: unpatch child, unpatched parent breakpoints from child (timeout)
+PASS: gdb.base/foll-fork.exp: unpatch child, unpatched parent breakpoints from child
 PASS: gdb.base/foll-fork.exp: explicit parent follow, set tcatch fork
 PASS: gdb.base/foll-fork.exp: explicit parent follow, tcatch fork
 PASS: gdb.base/foll-fork.exp: set follow parent
 PASS: gdb.base/foll-fork.exp: set follow parent, tbreak
-PASS: gdb.base/foll-fork.exp: set follow parent, hit tbreak
+FAIL: gdb.base/foll-fork.exp: (timeout) set follow parent, hit tbreak
 PASS: gdb.base/foll-fork.exp: set follow parent, cleanup
 Running ./gdb.base/foll-vfork.exp ...
 PASS: gdb.base/foll-vfork.exp: set verbose
@@ -12499,7 +12499,7 @@ PASS: gdb.mi/mi-nsmoribund.exp: thread s
 PASS: gdb.mi/mi-nsmoribund.exp: resume all, thread specific breakpoint
 PASS: gdb.mi/mi-nsmoribund.exp: hit thread specific breakpoint
 PASS: gdb.mi/mi-nsmoribund.exp: thread state: all running except the breakpoint thread
-PASS: gdb.mi/mi-nsmoribund.exp: resume all, program exited normally
+FAIL: gdb.mi/mi-nsmoribund.exp: unexpected stop
 Running ./gdb.mi/mi-nsthrexec.exp ...
 PASS: gdb.mi/mi-nsthrexec.exp: successfully compiled posix threads test case
 PASS: gdb.mi/mi-nsthrexec.exp: breakpoint at main
@@ -14507,7 +14507,7 @@ PASS: gdb.threads/watchthreads2.exp: bre
 PASS: gdb.threads/watchthreads2.exp: all threads started
 PASS: gdb.threads/watchthreads2.exp: watch x
 PASS: gdb.threads/watchthreads2.exp: set var test_ready = 1
-KFAIL: gdb.threads/watchthreads2.exp: gdb can drop watchpoints in multithreaded app (PRMS: gdb/10116)
+PASS: gdb.threads/watchthreads2.exp: all threads incremented x
 Running ./gdb.threads/watchthreads.exp ...
 PASS: gdb.threads/watchthreads.exp: successfully compiled posix threads test case
 PASS: gdb.threads/watchthreads.exp: watch args[0]
@@ -14672,7 +14672,7 @@ UNSUPPORTED: gdb.xml/tdesc-xinclude.exp:
 === gdb Summary ===
 
 # of expected passes 13854
-# of unexpected failures 75
+# of unexpected failures 76
 # of expected failures 43
 # of untested testcases 7
 # of unsupported tests 59

-- Veaceslav
Re: utrace-ptrace gdb testsuite tesults
On Fri, 27 Nov 2009 15:11:09 +0100, Veaceslav Falico wrote:
-FAIL: gdb.base/foll-fork.exp: unpatch child, unpatched parent breakpoints from child (timeout)
+PASS: gdb.base/foll-fork.exp: unpatch child, unpatched parent breakpoints from child
-PASS: gdb.base/foll-fork.exp: set follow parent, hit tbreak
+FAIL: gdb.base/foll-fork.exp: (timeout) set follow parent, hit tbreak
To be ignored, fixed upstream: http://sourceware.org/ml/gdb-patches/2009-11/msg00573.html
-PASS: gdb.mi/mi-nsmoribund.exp: resume all, program exited normally
+FAIL: gdb.mi/mi-nsmoribund.exp: unexpected stop
-KFAIL: gdb.threads/watchthreads2.exp: gdb can drop watchpoints in multithreaded app (PRMS: gdb/10116)
+PASS: gdb.threads/watchthreads2.exp: all threads incremented x
These are known to be unstable, but there are some known watch and non-stop problems, so it may not even be a testcase-side bug. Therefore this test shows no changes/regressions. Regards, Jan
Re: [RFC,PATCH 0/14] utrace/ptrace
On 11/27, Christoph Hellwig wrote: On Thu, Nov 26, 2009 at 01:24:41PM +0100, Ingo Molnar wrote: FYI, it's been in -mm, that's where it's maintained. None of the recent mm snapshots has anything utrace related in there, Well, not that I think this is important, but... Two weeks ago we asked Andrew to drop utrace-core.patch from -mm, it should be replaced by this updated version. Oleg.
Re: utrace-ptrace gdb testsuite tesults
On 11/27, Veaceslav Falico wrote: I've just done 'make check' twice on unpatched kernel, and found that the results are not stable: [...] Nice, thanks. So. I am going to conclude that, more or less, utrace-ptrace passes these tests. Jan, if you see something particular which needs more attention or should be fixed, please let me know. I'll try to investigate then. Oleg.
Re: utrace-ptrace gdb testsuite tesults
On Fri, 27 Nov 2009 15:34:05 +0100, Oleg Nesterov wrote: Jan, if you see something particular which needs more attention or should be fixed, please let me know. I'll try to investigate then. I am still not finished with yesterday's verifications, but so far no kernel behavior change has been proven and I doubt one will be. Going to reply today. The ppc kernel should be checked, but I have not built a matching pair of non-utrace/utrace kernel rpms for it. Regards, Jan
Re: powerpc: fork stepping (Was: [RFC, PATCH 0/14] utrace/ptrace)
On 11/27, Ananth N Mavinakayanahalli wrote: On Thu, Nov 26, 2009 at 03:50:51PM +0100, Oleg Nesterov wrote: Ananth, could you please run the test-case from the changelog below ? I do not really expect this can help, but just in case. Right, it doesn't help :-( GDB shows that the parent is forever stuck at wait(). Now this is interesting. Could you please double check the parent hangs in wait() ? This doesn't match the testing we did on powerpc machine with Veaceslav, and I hoped the problem was already resolved? Please see other emails in this thread. Hmm. Fortunately I still have access to the testing machine. Yes, according to gdb it looks as if it hangs in wait(). This is not true. You can strace gdb itself, or look at xxx_ctxt_switches in /proc/pid_of_parent/status. Better yet, do not use gdb at all. Just strace (without -f) the parent, you should see it continues to trace the child and loops forever. Oleg.
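For anyone who wants to script the xxx_ctxt_switches check suggested above instead of reading the file by hand, here is a minimal sketch. It assumes a Linux /proc whose status file carries the voluntary_ctxt_switches and nonvoluntary_ctxt_switches lines; the struct and the helper names (parse_status_line, read_ctxt_switches, counters_advanced) are made up for this illustration and are not part of any test suite mentioned in the thread.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>

/* Counters taken from /proc/<pid>/status; -1 means "field not seen yet". */
struct ctxt_switches {
    long voluntary;
    long nonvoluntary;
};

/* Update *out if the line is one of the two *_ctxt_switches fields. */
static void parse_status_line(const char *line, struct ctxt_switches *out)
{
    /* sscanf leaves the target untouched when the literal prefix differs. */
    sscanf(line, "voluntary_ctxt_switches: %ld", &out->voluntary);
    sscanf(line, "nonvoluntary_ctxt_switches: %ld", &out->nonvoluntary);
}

/* Read both counters for pid. Returns 0 on success, -1 on any error. */
static int read_ctxt_switches(pid_t pid, struct ctxt_switches *out)
{
    char path[64], line[256];
    FILE *f;

    out->voluntary = out->nonvoluntary = -1;
    snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f))
        parse_status_line(line, out);
    fclose(f);
    return (out->voluntary >= 0 && out->nonvoluntary >= 0) ? 0 : -1;
}

/* True if either counter grew between two samples of the same task. */
static bool counters_advanced(const struct ctxt_switches *before,
                              const struct ctxt_switches *after)
{
    return after->voluntary > before->voluntary ||
           after->nonvoluntary > before->nonvoluntary;
}
```

Sampling the parent twice, a second or so apart, and passing both samples to counters_advanced() gives a rough hint: a task that is really blocked in wait() should show no growth, while a task that keeps tracing the child should. This is only a heuristic; strace on the parent, as suggested above, remains the more direct check.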
Re: [QUERY] signal_struct->count/live
On Fri, Nov 27, 2009 at 04:15:21PM +0100, Oleg Nesterov wrote: On 11/27, Ananth N Mavinakayanahalli wrote: I am confused as to why we need two atomics count and live in signal_struct. report_death() uses ->live as the group_dead indicator, report_death? Perhaps you meant do_exit() ? Right, do_exit() and that is what is picked up by tracehook_report_death(), and in turn by report_death(). while there are places (like the scheduler) which uses ->count as the nr_threads indicator. I tried git blame to see if it remembers why, but the addition predates 2.6.12 and so it does not know. Could you please shed some light on this? In short: signal->count must die. I was going to do this long ago but never had the time. See also commit 4ab6c08336535f8c8e42cf45d7adeda882eff06e, this is the first step. Last time I did the grepping, almost every usage of signal->count was not right. For example, __exit_signal() is correct, but it doesn't need to use ->count. Except: it is needed for things like get_nr_threads() in proc. In short: never use signal->count ;) Thanks for the clarification Oleg. Ananth
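To make the difference between the two counters concrete, here is a toy user-space model. It is a sketch only, under the assumption (based on kernels of that era) that both counters are incremented when a thread is added, that ->live is decremented in do_exit(), and that ->count is decremented later in __exit_signal() when the thread is released. The struct toy_signal and the toy_* helpers are invented for this illustration and are not kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for the two counters discussed above; not the kernel struct. */
struct toy_signal {
    atomic_int count; /* threads not yet released (models ->count) */
    atomic_int live;  /* threads that have not started exiting (models ->live) */
};

/* A new process starts with one thread counted in both fields. */
static void toy_init(struct toy_signal *s)
{
    atomic_init(&s->count, 1);
    atomic_init(&s->live, 1);
}

/* Models adding a thread to the group: both counters go up. */
static void toy_clone_thread(struct toy_signal *s)
{
    atomic_fetch_add(&s->count, 1);
    atomic_fetch_add(&s->live, 1);
}

/* Models the exit path: returns true for the last live thread (group_dead). */
static bool toy_do_exit(struct toy_signal *s)
{
    return atomic_fetch_sub(&s->live, 1) == 1;
}

/* Models the later release of an already exited thread: only count drops. */
static void toy_release(struct toy_signal *s)
{
    atomic_fetch_sub(&s->count, 1);
}
```

In this model a thread that has exited but has not been released yet is still included in count and no longer in live, so the two values can disagree for a while. That window is one reason why code reading ->count as "number of running threads" can be misled.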
Re: powerpc: fork stepping (Was: [RFC, PATCH 0/14] utrace/ptrace)
On Fri, Nov 27, 2009 at 06:46:27PM +0100, Veaceslav Falico wrote: On Thu, Nov 26, 2009 at 11:37:03PM +0100, Oleg Nesterov wrote: Could you look at this ptrace-copy_process-should-disable-stepping.patch http://marc.info/?l=linux-mm-commits&m=125789789322573 patch? It is not clear to me how we can modify the test-case to verify it fixes the original problem for powerpc. I modified the test-case, it confirms that ptrace-copy_process-should-disable-stepping.patch fixes the problem with TIF_SINGLESTEP copied by fork() on powerpc. Probably we need a similar fix for step-fork.c in ptrace-tests. Modified the original testcase to call fork via syscall(__NR_fork), to avoid the looping inside libc's fork() on powerpc. The parent singlesteps until he sees that the child has forked, after that the parent PTRACE_CONTs until the child exits. Thanks Veaceslav. This works:

Index: ptrace-tests/tests/step-fork.c
===
--- ptrace-tests.orig/tests/step-fork.c
+++ ptrace-tests/tests/step-fork.c
@@ -29,6 +29,7 @@
 #include <unistd.h>
 #include <sys/wait.h>
 #include <string.h>
+#include <sys/syscall.h>
 #include <signal.h>
 
 #ifndef PTRACE_SINGLESTEP
@@ -78,7 +79,7 @@ main (int argc, char **argv)
   sigprocmask (SIG_BLOCK, &mask, NULL);
   ptrace (PTRACE_TRACEME);
   raise (SIGUSR1);
-  if (fork () == 0)
+  if (syscall(__NR_fork) == 0)
     {
       read (-1, NULL, 0);
       _exit (22);

Oleg, With the above patch applied, syscall-reset is the only failure I see on powerpc:

errno 14 (Bad address)
syscall-reset: syscall-reset.c:95: main: Assertion `(*__errno_location ()) == 38' failed.
unexpected child status 67f
FAIL: syscall-reset
...
1 of 40 tests failed (11 tests were not run)
Please report to utrace-devel@redhat.com

Ananth
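The idea behind the test change above, entering the fork syscall directly rather than going through the libc wrapper, can be tried in isolation. The following is a small sketch, not the ptrace-tests code: raw_fork and fork_and_collect are names made up here, and the fallback to fork() is an assumption for architectures whose headers do not define SYS_fork.

```c
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Enter the fork syscall directly where the architecture provides one. */
static pid_t raw_fork(void)
{
#ifdef SYS_fork
    return (pid_t)syscall(SYS_fork);
#else
    /* Assumption: no fork syscall number here, so use the libc wrapper. */
    return fork();
#endif
}

/* Fork a child that exits with 'code'; return the exit code seen by wait. */
static int fork_and_collect(int code)
{
    int status;
    pid_t pid = raw_fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);    /* child: leave at once, no libc cleanup */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_and_collect(22) mirrors the _exit (22) used by the child in step-fork.c. The child here does nothing but _exit(), which keeps it from relying on any libc state that the wrapper would normally fix up after a fork.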