Hi,

I was running Alexey's test case (http://marc.info/?l=linux-kernel&m=117128445312243&w=2) to test utrace, and the system crashed when I pressed Ctrl+C.

Environment: 2.6.23-rc7, ppc64.

6:mon> e
cpu 0x6: Vector: 300 (Data Access) at [c00000002510b650]
    pc: c00000000038b0f8: ._spin_lock+0x20/0x88
    lr: c0000000000b1e78: .get_utrace_lock_attached+0x50/0xc0
    sp: c00000002510b8d0
   msr: 8000000000009032
   dar: 7f9d0000419e0058
 dsisr: 40000000
  current = 0xc000000035f800b0
  paca    = 0xc00000000058d900
    pid   = 23108, comm = a.out

On further analysis, I made the following observations:

1) When a process dies, it walks its list of tracees and detaches the engine from each one, as in ptrace_exit():

	list_for_each_safe_rcu(pos, n, &tsk->ptracees) {
		state = list_entry(pos, struct ptrace_state, entry);
		error = utrace_detach(state->task, state->engine);

6:mon> t
[c00000002510b950] c0000000000b1e78 .get_utrace_lock_attached+0x50/0xc0
[c00000002510b9e0] c0000000000b331c .utrace_detach+0x30/0x148
[c00000002510ba80] c0000000000b778c .ptrace_exit+0xa0/0x1c8
[c00000002510bb20] c000000000071848 .do_exit+0x188/0xa54
[c00000002510bbc0] c0000000000721e8 .sys_exit_group+0x0/0x8
[c00000002510bc50] c00000000007d6f8 .get_signal_to_deliver+0x480/0x4f4
[c00000002510bd00] c0000000000126d4 .do_signal+0x68/0x32c
[c00000002510be30] c000000000008af0 do_work+0x28/0x2c
--- Exception: c00 (System Call) at 000000000ff18d2c
SP (ffe0f030) is in userspace

2) But when the process tries to access state->task, it looks like state->task has already been released, and all of its fields contain invalid values.

6:mon> r
R00 = 0000000080000006   R16 = 0000000000000000
R01 = c00000002510b8d0   R17 = 0000000000000000
R02 = c00000000067a808   R18 = 0000000000000000
R03 = 7f9d0000419e0058   R19 = 0000000000000000
R04 = c00000007f86b740   R20 = 0000000000000000
R05 = 8000000000c24000   R21 = 0000000000000000
R06 = 8000000000000000   R22 = 0000000000000000
R07 = 000000007fffffff   R23 = 0000000000000000
R08 = c000000008133408   R24 = c000000035f807b8
R09 = c00000009f14aa30   R25 = c00000002510bea0
R10 = c000000000574e84   R26 = c00000002510bd90
R11 = fffffffffffffffd   R27 = ffffffffffffffff
R12 = 4000000000000000   R28 = c00000007f86b740
R13 = c00000000058d900   R29 = c0000000279f00b0
R14 = 0000000000000000   R30 = c000000000616430
R15 = 0000000000000000   R31 = 7f9d0000419e0058
pc  = c00000000038b0f8 ._spin_lock+0x20/0x88
lr  = c0000000000b1e78 .get_utrace_lock_attached+0x50/0xc0
msr = 8000000000009032   cr  = 22000448
ctr = 800000000014dcd0   xer = 0000000000000000   trap =  300
dar = 7f9d0000419e0058   dsisr = 40000000

task->utrace is at r29+1904:

6:mon> d c0000000279f0820
c0000000279f0820 7f9d0000419e0038 7c0018287c005800  |....A..8|..(|.X.|
c0000000279f0830 4082000c7d20192d 40c2fff04c00012c  |@...} .-@...L..,|
c0000000279f0840 2f80000040de0044 813f004893a90008  |/...@..D.?.H....|

3) A possible reason for this error: while the parent process (the reader) was walking the RCU tracees list, a writer (another thread from the same group, through ptrace_detach()) deleted the entry (state->entry) from that list. The parent (reader), still holding a reference to the old list entry, then accessed state->task, which had already been freed, and the system crashed.

4) Since the reader and the writer must run in parallel to hit this, the bug is very hard to reproduce.

This leads me to suspect a possible issue with the usage of RCU in utrace. Please let me know your comments.
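
To make the interleaving suspected in observation 3 concrete, here is a minimal single-threaded userspace model of it. This is only a sketch under stated assumptions: every name in it (model_task, model_state, model_detach, model_exit_race) is hypothetical and none of it is the utrace or ptrace code. The model marks the task as released instead of freeing it, so the stale access can be observed safely. The "pin" variant, where the reader takes its own reference before using state->task, is shown only to illustrate the hazard and is not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's structures. */
struct model_task {
    int refcount;   /* references currently held on this task */
    int released;   /* set at the point where the kernel would free the task */
};

struct model_state {
    struct model_task *task;    /* plays the role of state->task */
    struct model_state *next;   /* plays the role of state->entry */
};

static void model_task_get(struct model_task *t)
{
    t->refcount++;
}

static void model_task_put(struct model_task *t)
{
    assert(t->refcount > 0);
    if (--t->refcount == 0)
        t->released = 1;    /* only marked, so the model can inspect it later */
}

/* Writer side: unlink the entry and drop the list's reference on its task. */
static void model_detach(struct model_state **head, struct model_state *victim)
{
    struct model_state **pp;

    for (pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == victim) {
            *pp = victim->next;
            break;
        }
    }
    model_task_put(victim->task);
}

/*
 * Replays one fixed interleaving:
 *   step 1: the reader loads an entry from the list
 *   step 2: the writer detaches that same entry
 *   step 3: the reader uses state->task
 * Returns 1 if the reader touched a released task, 0 otherwise.
 * With pin != 0 the reader takes its own task reference in step 1.
 */
int model_exit_race(int pin)
{
    struct model_task task = { 1, 0 };          /* one reference, owned by the list */
    struct model_state state = { &task, NULL };
    struct model_state *head = &state;
    struct model_state *seen;
    int stale;

    seen = head;                                /* step 1 */
    if (pin)
        model_task_get(seen->task);

    model_detach(&head, seen);                  /* step 2 */

    stale = seen->task->released;               /* step 3 */

    if (pin)
        model_task_put(seen->task);
    return stale;
}
```

In this model, model_exit_race(0) reports that the reader used a released task, which corresponds to the crash above, while model_exit_race(1) does not. Whether the real code path can interleave this way is exactly the open question.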