Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set -ops = utrace_detached_ops lockless)
I'll try to investigate, but currently I am all confused, and I suspect we have some user-space issues. If only I knew something about ppc... Sorry for the confusing. Ananth, could you please confirm once again that step-jump-cont (from ptrace-tests testsuite) not fail on your machine? If yes, please tell me the version of glibc/gcc. Is PTRACE_GETREGS defined on your machine? Funny enough. The above failure only seen on that particular system so far. In fact, different PPC64 systems have different results there (roland's git tree + your lockless patch). ibm-js20-02.lab.bos.redhat.com FAIL: watchpoint ppc-dabr-race: ./../tests/ppc-dabr-race.c:141: handler_fail: Assertion `0' failed. /bin/sh: line 5: 16928 Aborted ${dir}$tst FAIL: ppc-dabr-race syscall-reset: ./../tests/syscall-reset.c:95: main: Assertion `(*__errno_location ()) == 38' failed. errno 14 (Bad address) unexpected child status 67f FAIL: syscall-reset step-fork: ./../tests/step-fork.c:56: handler_fail: Assertion `0' failed. /bin/sh: line 5: 31144 Aborted ${dir}$tst FAIL: step-fork ibm-js22-02.rhts.bos.redhat.com ibm-js12-04.rhts.bos.redhat.com ibm-js12-05.rhts.bos.redhat.com Looks like failed only for syscall-reset and step-fork, as we have discussed before. I'll be reserving ibm-js20-02.lab.bos.redhat.com at the moment. Thanks, CAI Qian
Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set -ops = utrace_detached_ops lockless)
On 12/07, caiq...@redhat.com wrote: Ananth, could you please confirm once again that step-jump-cont (from ptrace-tests testsuite) not fail on your machine? If yes, please tell me the version of glibc/gcc. Is PTRACE_GETREGS defined on your machine? Funny enough. The above failure only seen on that particular system so far. In fact, different PPC64 systems have different results there (roland's git tree + your lockless patch). Great! thanks. OK, I seem to understand what happens, but I can not explain WHY does this happen on that machine. Once again. The tracer changes the tracee's instruction pointer to the adrress of raise_sigusr2(), and resumes the tracee. The tracee gets SIGSEGV right after that. But. raise_sigusr2 is not equal to the actual address of raise_sigusr2(), this value points to the thunk (I do not know the correct English term) which contains the actual address: (gdb) disassemble 0x100118c0 Dump of assembler code for function raise_sigusr2: 0x100118c0 raise_sigusr2+0: .long 0x0 SIGSEGV 0x100118c4 raise_sigusr2+4: .long 0x1ab0 aof raise_sigusr2() 0x100118c8 raise_sigusr2+8: .long 0x0 And!!! this thunk does NOT live in .text, and vma does NOT have VM_EXEC bit! # cat /proc/30494/maps 0010-0012 r-xp 00:00 0 [vdso] 1000-1001 r-xp fd:00 59262 /root/TST/sjc 1001-1002 rw-p fd:00 59262 /root/TST/sjc That is why the tracee gets SIGSEGV, and this is correct. Cai, perhaps you could give me access to another ppc machine where this test does not fail? Or, could you please run the trivial program below on that machine? Oleg. #include stdio.h #include stdlib.h #include unistd.h void my_func(void) { } int main(void) { char cmd[128]; printf(ptr: %p\n, my_func); sprintf(cmd, cat /proc/%d/maps, getpid()); system(cmd); return 0; }
Vous avez reçu un Fax
Title: Fax Reception Si ce message ne s'affiche pas correctement, Visualisez la version en ligne conomique Plus besoin de ligne tlphonique ddie, ni de tlcopieur. Fini les problmes d'encre, de papier et de sonnerie. Simplicit En quelques minutes, vous choisissez votre nouveau numro de fax, et votre ligne est active et oprationnelle instantanment. Confidentialit Vous disposez d'un numro de fax personnel ainsi que d'un accs scuris. Mobilit Grce l'accs web, vous pouvez consulter vos fax mme lorsque vous tes en dplacement. Compatible PDA Ds souscription au service FaxReception, vous recevez aussi vos fax directement sur votre PDA. cologique En utilisant FaxReception pour recevoir vos fax, vous rduisez votre consommation de papier. Pour envoyer vos fax Envoyez directement vos Fax partir de votre PC : www.safefax.fr 2009 ALLIANCE MCA Remplissez le formulaire pour ne plus recevoir notre newsletter. A rception, celui-ci sera pris en compte conformment la loi n 78-17 du 06-01-1978 relative linformatique, aux fichiers et aux liberts modifie par la loi n2004-801 du 06-08-2004
Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set -ops = utrace_detached_ops lockless)
On Mon, 07 Dec 2009 15:24:51 +0100, Oleg Nesterov wrote: But. raise_sigusr2 is not equal to the actual address of raise_sigusr2(), this value points to the thunk (I do not know the correct English term) ppc64 calls it function descriptor (GDB ppc64_linux_convert_from_func_ptr_addr): For PPC64, a function descriptor is a TOC entry, in a data section, which contains three words: the first word is the address of the function, the second word is the TOC pointer (r2), and the third word is the static chain value. (gdb) x/8gx 0x805b6f6258 0x805b6f6258 open:0x00805b65cf68 0x00805b702ac0 0x805b6f6268 open64: 0x00805b65d010 0x00805b702ac0 (gdb) x/20i 0x00805b65cf68 0x805b65cf68 .__GI___open:lwz r10,-30432(r13) 0x805b65cf6c .__GI___open+4: cmpwi r10,0 0x805b65cf70 .__GI___open+8: bne-0x805b65cf84 .__GI___open+28 (gdb) info sym 0x00805b702ac0 last_nip in section .bss I was not aware there is any third word before and I do not see it there. Regards, Jan
Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set -ops = utrace_detached_ops lockless)
On 12/07, Jan Kratochvil wrote: On Mon, 07 Dec 2009 15:24:51 +0100, Oleg Nesterov wrote: But. raise_sigusr2 is not equal to the actual address of raise_sigusr2(), this value points to the thunk (I do not know the correct English term) ppc64 calls it function descriptor (GDB ppc64_linux_convert_from_func_ptr_addr): For PPC64, a function descriptor is a TOC entry, Thanks Jan. in a data section, Yes! Now I can't understand how this test-case could ever work on ppc. step-jump-cont does: regs-nip = raise_sigusr2; --- points to data section ptrace(PTRACE_CONT); of course, the tracee gets SIGSEGV, this section is not executable. Oleg.
Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set -ops = utrace_detached_ops lockless)
On 12/07, Oleg Nesterov wrote: On 12/07, Jan Kratochvil wrote: On Mon, 07 Dec 2009 15:24:51 +0100, Oleg Nesterov wrote: But. raise_sigusr2 is not equal to the actual address of raise_sigusr2(), this value points to the thunk (I do not know the correct English term) ppc64 calls it function descriptor (GDB ppc64_linux_convert_from_func_ptr_addr): For PPC64, a function descriptor is a TOC entry, Thanks Jan. in a data section, Yes! Now I can't understand how this test-case could ever work on ppc. step-jump-cont does: regs-nip = raise_sigusr2; --- points to data section ptrace(PTRACE_CONT); of course, the tracee gets SIGSEGV, this section is not executable. Hmm. Looks like, powerpc means a lot of different hardware, and _PAGE_EXEC may be 0. I didn't notice this when I quickly grepped arch/powerpc/ IOW, perhaps on some machines r implies x ? Is yes, this can explain why the results differ on different machines. Oleg.
Re: [RFC,PATCH 14/14] utrace core
On Tue, 2009-12-01 at 23:08 +0100, Oleg Nesterov wrote: @@ -560,6 +625,20 @@ static inline void tracehook_report_deat int signal, void *death_cookie, int group_dead) { + /* + * This barrier ensures that our caller's setting of + * @task-exit_state precedes checking @task-utrace_flags here. + * If utrace_set_events() was just called to enable + * UTRACE_EVENT(DEATH), then we are obliged to call + * utrace_report_death() and not miss it. utrace_set_events() + * uses tasklist_lock to synchronize enabling the bit with the + * actual change to @task-exit_state, but we need this barrier + * to be sure we see a flags change made just before our caller + * took the tasklist_lock. + */ + smp_mb(); + if (task_utrace_flags(task) _UTRACE_DEATH_EVENTS) + utrace_report_death(task, death_cookie, group_dead, signal); } I don't think its allowed to pair a mb with a lock-barrier, since the lock barriers are semi-permeable. Could you clarify? According to memory-barriers.txt a mb can be paired with mb,wmb,rmb depending on the situation. For LOCK it states: (1) LOCK operation implication: Memory operations issued after the LOCK will be completed after the LOCK operation has completed. Memory operations issued before the LOCK may be completed after the LOCK operation has completed. Which is not something I can see making sense to pair with an mb. So either the comment is confusing and its again referring to an UNLOCK +LOCK pair, or there's something fishy @@ -589,10 +668,20 @@ static inline void set_notify_resume(str * asynchronously, this will be called again before we return to * user mode. * - * Called without locks. + * Called without locks. However, on some machines this may be + * called with interrupts disabled. */ static inline void tracehook_notify_resume(struct pt_regs *regs) { + struct task_struct *task = current; + /* + * This pairs with the barrier implicit in set_notify_resume(). + * It ensures that we read the nonzero utrace_flags set before + * set_notify_resume() was called by utrace setup. + */ + smp_rmb(); + if (task_utrace_flags(task)) + utrace_resume(task, regs); } Sending an IPI implies the mb? Yes, but this has nothing to do with IPI. The caller, do_notify_resume(), does: clear_thread_flag(TIF_NOTIFY_RESUME); tracehook_notify_resume: if (task_utrace_flags(task)) utrace_resume(); We should not read task_utrace_flags() before we clear TIF_NOTIFY_RESUME. Then this comment needs an update.. that wasn't at all clear to me. +static inline struct utrace *task_utrace_struct(struct task_struct *task) +{ + struct utrace *utrace; + + /* + * This barrier ensures that any prior load of task-utrace_flags + * is ordered before this load of task-utrace. We use those + * utrace_flags checks in the hot path to decide to call into + * the utrace code. The first attach installs task-utrace before + * setting task-utrace_flags nonzero, with a barrier between. + * See utrace_task_alloc(). + */ + smp_rmb(); + utrace = task-utrace; + + smp_read_barrier_depends(); /* See utrace_task_alloc(). */ + return utrace; +} I spot two barriers here, but only 1 over in utrace_task_alloc(), hmm? smp_read_barrier_depends() pairs with utrace_task_alloc()-wmb(). smp_rmb() is needed for another reason. Suppose the code does: if (task_utrace_flags() SOMETHING) do_something_with(task-utrace); if we race with utrace_attach_task(), we can see -utrace_flags != 0 but task-utrace == NULL without rmb(). Still not clear what we pair with.. There is no obvious barrier in utrace_attach_task() nor a comment referring to this site. +struct utrace_engine { +/* private: */ + struct kref kref; + void (*release)(void *); + struct list_head entry; + +/* public: */ + const struct utrace_engine_ops *ops; + void *data; + + unsigned long flags; +}; Sorry, the kernel is written in C, not C++. Hmm. I almost never read the comments, but these 2 look very clear to me ;) I see no point in adding comments like that at all. + * Most callbacks take an @action argument, giving the resume action + * chosen by other tracing engines. All callbacks take an @engine + * argument, and a @task argument, which is always equal to @current. Given that some functions have a lot of arguments (depleting regparam), isn't it more expensive to push current on the stack than it is to simply read it again? Yes, perhaps. Only -report_reap() really needs @task, it may be !current. +struct utrace_engine_ops { + u32 (*report_signal)(u32 action, + struct