(add Roland)

On 10/01, Jan Kratochvil wrote:
>
> the ptrace-testsuite
> 	http://sourceware.org/systemtap/wiki/utrace/tests
>
> currently FAILs (also) on Fedora 12 kernel-2.6.31.1-48.fc12.x86_64 for:
> 	FAIL: detach-stopped
> 	FAIL: stopped-attach-transparency
>
> Do you agree with the testcases and is it planned to fix them for F12?

I do not know. I'd leave this to Roland. I mean, if he thinks this
should be fixed - I'll try to fix.

But. This all looks unfixable to me. In my opinion, the kernel is
obviously wrong, and the test-cases are wrong too. And any fix in this
area is user-visible and can break the current expectations.

As for the kernel, I lost any hope to understand what the _supposed_
behaviour is. As for user-space, I don't really understand the second
test-case, which again means I don't understand the supposed behaviour.

Firstly, I think we should un-revert
edaba2c5334492f82d39ec35637c6dea5176a977. This unconditional wakeup is
hopelessly wrong imho, and it is removed from the utrace-ptrace code.
But this breaks another test-case, attach-wait-on-stopped. I still think
this test-case is wrong. We had a lengthy discussion about this.

Now, this patch

--- TTT_32/kernel/signal.c~PT_STOP	2009-10-04 04:08:36.000000000 +0200
+++ TTT_32/kernel/signal.c	2009-10-05 03:17:39.000000000 +0200
@@ -1708,7 +1708,7 @@ static int do_signal_stop(int signr)
 	 */
 	if (sig->group_stop_count) {
 		if (!--sig->group_stop_count)
-			sig->flags = SIGNAL_STOP_STOPPED;
+			sig->flags = SIGNAL_STOP_STOPPED | SIGNAL_STOP_DEQUEUED;
 		current->exit_code = sig->group_exit_code;
 		__set_current_state(TASK_STOPPED);
 	}

fixes the tests above. Of course this change is not enough, I did it
just to verify I really understand what happens.

Except, stopped-attach-transparency prints

	Excessive waiting SIGSTOP after the second attach/detach

afaics the test-case is not right here. attach_detach() leaves the
traced threads in STOPPED state, so why should pid_notifying_sigstop()
fail?

But as I said, I do not really understand what this test-case tries to
do. What should ptrace(PTRACE_DETACH, SIGSTOP) mean? I think that
ptrace(PTRACE_DETACH, signr) should mean the tracee should proceed with
this signal, as if it was sent by, say, kill.

In this case, I don't understand why stopped-attach-transparency "sends"
SIGSTOP to every sub-thread.

If the tracer wants to stop the thread group after detach, it can do

	ptrace(PTRACE_DETACH, anythread, SIGSTOP);
	for_each_other_thread(pid)
		ptrace(PTRACE_DETACH, anythread, 0);

or just

	kill(SIGSTOP);
	for_each_thread(pid)
		ptrace(PTRACE_DETACH, anythread, 0);

I am not saying this will really work with the current implementation,
we have other bugs/races. I mean I'd expect this to be the right way to
do detach+stop.

And. Currently PTRACE_CONT/PTRACE_DETACH/etc wakes up the tracee even if
the thread group is stopped. This is obviously not right, but
utrace-ptrace does the same. I guess we can't fix this without breaking
existing applications.

In short: I don't know what to do ;)

Oleg.
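
For illustration only, here is a minimal user-space sketch of the two
detach+stop sequences described above, reduced to a single-threaded
tracee. The names detach_and_stop(), wait_for_stop() and enum
detach_mode are hypothetical and not taken from the ptrace-testsuite.
With several threads, the tracer would also have to PTRACE_DETACH every
other thread with signal 0. As noted above, the outcome depends on the
kernel's handling of stopped tracees, so this shows what the tracer
would do, not what any particular kernel guarantees.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical names, for illustration only. */
enum detach_mode {
	DETACH_WITH_SIGSTOP,	/* ptrace(PTRACE_DETACH, pid, 0, SIGSTOP) */
	KILL_THEN_DETACH	/* kill(pid, SIGSTOP), then detach with 0 */
};

/*
 * Poll for up to about 2 seconds for a job-control stop report.
 * Returns 1 if the child is reported stopped, 0 if it is not (timeout
 * or the child terminated), -1 if waitpid() itself fails.
 */
static int wait_for_stop(pid_t pid)
{
	const struct timespec tick = { 0, 10 * 1000 * 1000 };	/* 10 ms */

	for (int i = 0; i < 200; i++) {
		int status;
		pid_t r = waitpid(pid, &status, WUNTRACED | WNOHANG);

		if (r == pid)
			return WIFSTOPPED(status) ? 1 : 0;
		if (r < 0)
			return -1;
		nanosleep(&tick, NULL);
	}
	return 0;
}

/*
 * Fork a child, attach to it, then detach so that it should end up
 * stopped. Returns 1 if the child is reported stopped after the
 * detach, 0 if it is not, -1 if fork/attach/detach could not be done
 * (for example when ptrace is not permitted).
 */
static int detach_and_stop(enum detach_mode mode)
{
	int status, result = -1;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		for (;;)
			pause();	/* tracee: just sleep until killed */
	}

	/* PTRACE_ATTACH sends SIGSTOP; wait for that stop to be reported. */
	if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == 0 &&
	    waitpid(pid, &status, 0) == pid && WIFSTOPPED(status)) {
		long rc;

		if (mode == DETACH_WITH_SIGSTOP) {
			/* The tracee proceeds "with" SIGSTOP. */
			rc = ptrace(PTRACE_DETACH, pid, NULL,
				    (void *)(long)SIGSTOP);
		} else {
			/* Queue a real SIGSTOP, then detach with no signal. */
			kill(pid, SIGSTOP);
			rc = ptrace(PTRACE_DETACH, pid, NULL, NULL);
		}
		if (rc == 0)
			result = wait_for_stop(pid);
	}

	/* Clean up the child whatever happened. */
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	return result;
}
```

Calling detach_and_stop(DETACH_WITH_SIGSTOP) and
detach_and_stop(KILL_THEN_DETACH) from a small main() and comparing the
two return values is one way to see whether a given kernel treats both
sequences the same.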