Re: clone bug (glibc?) (Was: clone-multi-ptrace test failure)
On Tue, 01 Dec 2009 20:39:40 +0100, Roland McGrath wrote: I think the best bet is to link with -Wl,-z,now and then minimize the library code you rely on. Checked-in the fix of at least Fedora 12 x86_64 below. getppid() does not look to be needed there - PTRACE_SYSCALL does stop (WIFSTOPPED) on the entry (before WIFEXITED) to __NR_exit keeping the PASS/FAIL reproducibility. Regards, Jan --- Makefile.am 29 Nov 2009 02:23:25 - 1.60 +++ Makefile.am 14 Dec 2009 09:47:54 - 1.61 @@ -111,6 +111,8 @@ stopped_attach_transparency_LDFLAGS = -l erestartsys_trap_LDFLAGS = -lutil erestartsys_trap_debugger_LDFLAGS = -lutil erestartsys_trap_32fails_debugger_LDFLAGS = -lutil +# After clone syscall it must call no glibc code (such as _dl_runtime_resolve). +clone_multi_ptrace_LDFLAGS = -Wl,-z,now check_TESTS = $(SAFE) xcheck_TESTS = $(CRASHERS) --- clone-multi-ptrace.c5 Dec 2008 14:41:57 - 1.6 +++ clone-multi-ptrace.c14 Dec 2009 09:47:54 - 1.7 @@ -65,10 +65,10 @@ static char grandchild_seen[THREAD_NUM]; static int grandchild_func (void *unused) { - /* Need to have at least one syscall before exit */ - getppid (); - /* _exit() would make ALL threads to exit. We need rew syscall */ + /* _exit() would make ALL threads to exit. We need rew syscall. After the + clone syscall it must call no glibc code (such as _dl_runtime_resolve). */ syscall (__NR_exit, 22); + return 0; }
Re: [PATCH v2] ptrace-tests: fix step-fork.c on powerpc for ptrace-utrace
On Tue, 01 Dec 2009 18:38:27 +0100, Veaceslav Falico wrote: Instead of using fork(), call syscall(__NR_fork) in step-fork.c to avoid looping on powerpc arch in libc. Checked-in. (Not seen any problems with syscall and using glibc afterwards as in the clone-multi-ptrace.c case so left it as is.) Regards, Jan Signed-off-by: Veaceslav Falico vfal...@redhat.com --- --- a/ptrace-tests/tests/step-fork.c 2009-12-01 17:17:14.0 +0100 +++ b/ptrace-tests/tests/step-fork.c 2009-12-01 18:35:15.0 +0100 @@ -29,6 +29,7 @@ #include unistd.h #include sys/wait.h #include string.h +#include sys/syscall.h #include signal.h #ifndef PTRACE_SINGLESTEP @@ -78,7 +79,12 @@ main (int argc, char **argv) sigprocmask (SIG_BLOCK, mask, NULL); ptrace (PTRACE_TRACEME); raise (SIGUSR1); - if (fork () == 0) + + /* + * Can't use fork() directly because on powerpc it loops inside libc under + * PTRACE_SINGLESTEP. See http://marc.info/?l=linux-kernelm=125927241130695 + */ + if (syscall(__NR_fork) == 0) { read (-1, NULL, 0); _exit (22);
Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set -ops = utrace_detached_ops lockless)
On Wed, 09 Dec 2009 19:12:41 +0100, Oleg Nesterov wrote: while the '.func_name' is the text address. tried to change the code to REGS_ACCESS (regs, nip) = (unsigned long) .raise_sigusr2 but gcc doesn't like this ;) ... Yes, I verified the patch below fixes step-jump-cont.c on ibm-js20-02.lab.bos.redhat.com. Checked-in a similar patch but same as used now in other testcases, sorry for not using the patch of yours. Regards, Jan --- step-jump-cont.c8 Dec 2008 18:23:41 - 1.12 +++ step-jump-cont.c14 Dec 2009 11:38:37 - 1.13 @@ -213,6 +213,24 @@ int main (void) REGS_ACCESS (regs, eip) = (unsigned long) raise_sigusr2; #elif defined __x86_64__ REGS_ACCESS (regs, rip) = (unsigned long) raise_sigusr2; +#elif defined __powerpc64__ + { +/* ppc64 `raise_sigusr2' resolves to the function descriptor. */ +union + { + void (*f) (void); + struct + { + void *entry; + void *toc; + } + *p; + } +const func_u = { raise_sigusr2 }; + +REGS_ACCESS (regs, nip) = (unsigned long) func_u.p-entry; +REGS_ACCESS (regs, gpr[2]) = (unsigned long) func_u.p-toc; + } #elif defined __powerpc__ REGS_ACCESS (regs, nip) = (unsigned long) raise_sigusr2; #else
Re: Tests Failures on PPC64
On Wed, 09 Dec 2009 19:31:52 +0100, Oleg Nesterov wrote: Hmm. it is obvioulsy racy, static volatile unsigned started is not atomic and thus the main thread can hang doing while (started THREADS); not that I think this explains the failure though. Thanks, fixed (but the problem is not reproducible for me). Regards, Jan --- ppc-dabr-race.c 8 Dec 2008 18:23:41 - 1.8 +++ ppc-dabr-race.c 14 Dec 2009 12:03:49 - 1.9 @@ -141,13 +141,14 @@ handler_fail (int signo) assert (0); } +/* STARTED requires atomic access. */ static volatile unsigned started; static void *child_thread (void *data) { pid_t tid = gettid (); - started++; + __sync_add_and_fetch (started, 1); /* We should stay in the syscall - better race probability. */ sleep (1); @@ -178,7 +179,7 @@ static void child_func (void) assert (i == 0); } - while (started THREADS); + while (__sync_add_and_fetch (started, 0) THREADS); l = ptrace (PTRACE_TRACEME, 0, NULL, NULL); assert (l == 0);
Re: [RFC,PATCH 14/14] utrace core
On Sun, 2009-12-13 at 16:25 -0800, Roland McGrath wrote: I'm sorry for the delay. I'm picking up with responding to the parts of your review that I did not include in my first reply. I appreciate very much the discussion you've had with Oleg about the issues that I did not address myself. I look forward to your replies to my comments in that first reply, which we have yet to see. Yeah, no worries, I'm not too quick these days myself.. Seems inconsistent on u32 vs enum utrace_resume_action. Why force enum utrace_resume_action into a u32? The first argument to almost all callbacks (all the ones made when the task is alive) is called action and it's consistent that its low bits contain an enum utrace_resume_action. The argument is u32 action in the report_signal and report_syscall_entry callbacks, where it combines an enum utrace_resume_action with an enum utrace_{signal,syscall}_action (respectively). It would indeed be more consistent to use u32 action throughout, but it seemed nicer not to have engine writers always writing utrace_resume_action(action) for all the cases where there are no other bits in there. C does implicit casts from enum to integer types, right? So always using u32 here would not impose any extra typing on the user, or am I missing something subtle here? The return value is a mirror of the u32 action argument. In many calls, it has only an enum utrace_resume_action in it. But in some it combines that with another enum utrace_*_action. There I went for consistency (and less typing) in the return type, since it doesn't make any difference to how you have to write the code in your callback function. The main reason I used u32 at all instead of unsigned int is just its shortness for less typing. I don't really mind changing these details to whatever people think is best. The other people writing code to use the utrace API may have more opinions than I do. I guess it could even be OK to make the return value and action argument always a plain enum utrace_resume_action, and use a second in/out enum utrace_{signal,syscall}_action * parameter in those particular calls. But that does put some more register pressure and loads/stores into this API. I don't mind the sharing of the argument, it just looked odd to have the u32/unsigned int/enum thing intermixed, since you care about typing length (as good a criteria as any) I'd just be lazy and make everything u32 ;-) Quite gross this.. can't we key off the tracehoook_report_clone_complete() and use a wakeup there? Yes, we intended to clean this up at some point later. But doing that is just a distraction and complication right now so we put it off. For this case, however, I suppose we could just punt for the initial version. This code exists to support the special semantics of calling utrace_attach_task() from inside the parent's report_clone() callback. That guarantees the parent that it wins any race with any third thread calling utrace_attach_task(). This guarantees it will be first attacher in the callback order, but the general subject of callback order is already something we think we will want to revisit in the future after we have more experience with lots of different engines trying to work together. It's most meaningful when using the UTRACE_ATTACH_EXCLUSIVE flag -- then you can use UTRACE_ATTACH_EXCLUSIVE|UTRACE_ATTACH_MATCH_OPS to synchronize with other threads trying to attach the same kind of engine to a task, and give special priority in that to the engine that attaches in report_clone() from tracing the parent. I broke the text so it reads easier for me, it might be me, it might not be proper English -- I'm not a native speaker -- but this is an example of what you asked for below. The thing is, your sentences are rather long, with lots of sub-parts and similar. I find I need a break after digesting a few such things. As to the content, can't you accomplish the same thing by processing such exclusive parent registration before exposing the child in the pid-hash? Right before cgroup_fork_callback() in kernel/fork.c:copy_process() seems like the ideal site. Really I came up with this special semantics originally just with ptrace in mind. ptrace is an engine that wants to exclude other tracer threads attaching asynchronously with the same kind of engine, and that wants to give special priority on a child to the parent's tracer (to implement PTRACE_O_TRACECLONE et al). However, Oleg's current ptrace code still relies on the old hard-coded locking kludges to exclude the separate attachers and to magically privilege the auto-attach from the parent's tracer. So we are not relying on this special semantics yet. We could just punt utrace_attach_delay() and remove the associated documentation about the special rights of report_clone() calling utrace_attach_task(). If we decide it helps clean things up later when we finish more
Re: [RFC,PATCH 14/14] utrace core
On 12/13, Roland McGrath wrote: Its not entirely clear why we can check pending_attach outside of the utrace-lock and not be racy. I think it can be racy sometimes but that does not matter. Maybe Oleg can verify my logic here. If it's right, he can add some comments to make it more clear. There is only a very limited sort of timeliness guarantee about getting your callbacks after utrace_attach_task()+utrace_set_events(). If you know somehow that the task was definitely still in TASK_STOPPED or TASK_TRACED after utrace_attach_task() returned, then your engine gets all possible callbacks starting from when it resumes. Otherwise, you can use utrace_control() with UTRACE_REPORT to ask to get some callback pretty soon. The only callback events you are ever 100% guaranteed about (after success return from utrace_set_events()) are for DEATH and REAP, which have an unconditional lock-and-check before making engine callbacks. Yes, I think this is correct. It is fine to miss -pending_attach == T, and in any case the new attacher can come right after the check, even if it was checked under utrace-lock. It is important that the tracee can't miss, say, UTRACE_REPORT request (as you already explained), and every time the tracee clears -resume it calls splice_attaching(). In the stopped cases, there are lots of locks and barriers and things after resuming. (Oleg?) Every time the tracee resumes after TASK_TRACED it uses utrace-lock to synchronize with utrace_control/etc, it must see any changes. Oleg.
Motores, Caixas de velocidades. Componentes Usados .....
TRADING PEÇAS AUTO USADAS A R F TRADING PEÇAS AUTO USADAS A R F A ARF- Tranding Peças Auto Usadas é a empresa líder no comércio e distribuição de PEÇAS AUTO USADAS, Existindo há 7 anos no Mercado, fomos a 1ª empresa em Portugal a fazer entregas de material usado ao domicilio. Durante estes 7 anos o mercado automóvel confiou em nós e nós transmitimos confiança ao mercado através de uma evolução constante, de forma a satisfazermos da melhor forma e o mais rápido possível os nossos clientes, vamos continuar a inovar para responder as necessidades dos nossos clientes. Assim, Alem da Comercialização e Distribuição de Todo o tipo de PEÇAS AUTO USADAS (MOTORES, CAIXAS de VELOCIDADES, Componentes MECÂNICOS e de CARROÇARIA), estamos a comercializar MATERIAL RECONSTRUIDO (MOTORES, CAIXAS de VELOCIDADES AUTOMÁTICAS, CABEÇAS de MOTOR e CATALIZADORES) com garantia até 2 anos e proveniente das mais conceituadas fábricas da Europa. Dispomos também actualmente de uma oferta muito significativa em AIRBAGS. ENTREGAMOS GRATUITAMENTE … “A TODO o VAPOR !” Zona Industrial de Santarem T: 243 579 097 F: 243 579 898 TM: 917 144 547 arf-pecas-usa...@jpsf.pt Zona Industrial de Santarem Rua Conde da Ribeira Grande, Lt 2 – Ed DET Lj 34 2001-905 SantarémMiguel Ferreira -Depto. Comercial
Re: [RFC,PATCH 14/14] utrace core
On 12/14, Peter Zijlstra wrote: Quite gross this.. can't we key off the tracehoook_report_clone_complete() and use a wakeup there? Yes, we intended to clean this up at some point later. But doing that is just a distraction and complication right now so we put it off. For this case, however, I suppose we could just punt for the initial version. This code exists to support the special semantics of calling utrace_attach_task() from inside the parent's report_clone() callback. That guarantees the parent that it wins any race with any third thread calling utrace_attach_task(). This guarantees it will be first attacher in the callback order, but the general subject of callback order is already something we think we will want to revisit in the future after we have more experience with lots of different engines trying to work together. It's most meaningful when using the UTRACE_ATTACH_EXCLUSIVE flag -- then you can use UTRACE_ATTACH_EXCLUSIVE|UTRACE_ATTACH_MATCH_OPS to synchronize with other threads trying to attach the same kind of engine to a task, and give special priority in that to the engine that attaches in report_clone() from tracing the parent. As to the content, can't you accomplish the same thing by processing such exclusive parent registration before exposing the child in the pid-hash? Right before cgroup_fork_callback() in kernel/fork.c:copy_process() seems like the ideal site. I thought about this too, but there are some complications. Just for example, what if copy_process() fails after we already reported the new child was attached. And, the new child is not properly initialized yet, it doesn't even have the valid pid/real_parent/etc. Just imagine a callback wants to record task_pid_vnr(). Or it decides to send a fatal signal (even private) to the new tracee. Best would be to fix up the utrace-ptrace implementation and get rid of those other kludges I think, not sure.. its all a bit involved and I'm not at all sure I'm fully aware of all the ptrace bits. It is not that utrace-ptrace is buggy. We try to preserve the current semantics. Yes, we can move this kludge to ptrace layer, but I am not sure about other UTRACE_ATTACH_EXCLUSIVE engines. If we move this to ptrace, then we can probably mark the new child as ptrace_attach() should fail, because we are going to auto-attach. But in this case we need multiple hooks in do_fork() path again, like the old ptrace does, while utrace needs only one. The major improvement this utrace stuff brings over the old ptrace is that it fully multiplexes the task tracing bits, however if the new bit isn't powerful enough to express all of the old bits with that then that seems to take the shine of the new bits. Note that it would be easy to add another callback, and hide this special case. But we should think twice before doing this. I'm well aware that ptrace had some quirky bits in, and this might well be one of the crazier parts of it, but to the un-initiated reviewer (me) this could have done with a comment explaining exactly why this particular site is not worth properly abstracting etc.. Well, agreed. In the pretty soon case, that means set_notify_resume: if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_RESUME)) kick_process(task); i.e. IPI after test_and_set. The test_and_set implies an smp_mb(). So it should be the case that the set of utrace-pending_attach was seen before the set of TIF_NOTIFY_RESUME. Not exactly sure where the matching rmb() would come from in start_report() and friends -- task_utrace_struct() ? I am a bit confused... As Roland said, we don't have any timing guarantees, and we can't have. Whatever we do, the tracee can miss, say, -report_syscall_entry() event even if it does a system call after utrace_set_events(UTRACE_EVENT_SYSCALL), this is fine. But it shouldn't miss TIF_NOTIFY_RESUME/signal_pending/etc. I mean, sooner or later it must hanlde the signal or TIF_NOTIFY_RESUME. And in this case we can't miss -pending_attach. IOW, we must ensure that if ever clear TIF_NOTIFY_RESUME we must not miss -pending_attach, correct? and for this case we have mb() after clear_thread_flag(). Perhaps instead we should add mb__after_clear_bit() into arch/ hooks, but this needs a lot of arch-dependent changes. And, btw, the usage of -replacement_session_keyring looks racy, exactly because (without utracee) we done have the necessary barriers on both sides. Oleg.
Re: [RFC,PATCH 14/14] utrace core
On 12/14, Oleg Nesterov wrote: IOW, we must ensure that if ever clear TIF_NOTIFY_RESUME we must not miss -pending_attach, correct? and for this case we have mb() after clear_thread_flag(). Perhaps instead we should add mb__after_clear_bit() into arch/ hooks, but this needs a lot of arch-dependent changes. Cough. And I always read this rmb as mb. Even when I changed the comment to explain that we need a barrier between clear bit and read flags, I didn't notice it is in fact rmb... I guess we need the trivial fix, Roland? Oleg.
Re: [RFC,PATCH 14/14] utrace core
On 12/14, Oleg Nesterov wrote: IOW, we must ensure that if ever clear TIF_NOTIFY_RESUME we must not miss -pending_attach, correct? and for this case we have mb() after clear_thread_flag(). Perhaps instead we should add mb__after_clear_bit() into arch/ hooks, but this needs a lot of arch-dependent changes. Since it's utrace/tracehook code that relies on the barrier I think it makes sense to have it in tracehook_notify_resume() or utrace_resume(). The arch requirement is having done clear_thread_flag() beforehand, so the arch-independent code can reasonably assume whatever semantics that is guaranteed to have. Cough. And I always read this rmb as mb. Even when I changed the comment to explain that we need a barrier between clear bit and read flags, I didn't notice it is in fact rmb... I guess we need the trivial fix, Roland? You're the barrier man, send me what changes it should get. Thanks, Roland
Re: [RFC,PATCH 14/14] utrace core
Yes, I think this is correct. It is fine to miss -pending_attach == T, and in any case the new attacher can come right after the check, even if it was checked under utrace-lock. Right. It is important that the tracee can't miss, say, UTRACE_REPORT request (as you already explained), and every time the tracee clears -resume it calls splice_attaching(). Right. In the stopped cases, there are lots of locks and barriers and things after resuming. (Oleg?) Every time the tracee resumes after TASK_TRACED it uses utrace-lock to synchronize with utrace_control/etc, it must see any changes. And TASK_STOPPED? Please send me patches to add whatever comments would make all this clear enough to Peter when reading the code. Thanks, Roland
Re: [RFC,PATCH 14/14] utrace core
On 12/14, Roland McGrath wrote: In the stopped cases, there are lots of locks and barriers and things after resuming. (Oleg?) Every time the tracee resumes after TASK_TRACED it uses utrace-lock to synchronize with utrace_control/etc, it must see any changes. And TASK_STOPPED? SIGCONT can wake up the TASK_STOPPED tracee. I don't think the tracer should ever rely on TASK_STOPPED (utrace never does). If the tracer needs the really stopped tracee we have UTRACE_STOP, and this means TASK_TRACED. But I am not sure I understand this part of discussion... In any case the tracee should see any changes which were done before the wakeup. But TASK_STOPPED can't guarantee the tracee won't be resumed until the tracer wakes it up. Of course, TASK_TRACED can't prevent SIGKILL, but in this case we should only care about the tracee can't resume until we drop utrace-lock case. Oleg.