Here's an interesting crash. It sometimes happens on EMC load. It's more or less reproducible by starting up and shutting down EMC repeatedly. Sometimes 20 or 30 tries are required to reproduce it.
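For anyone who wants to automate the start/stop cycling, here is a minimal sketch of a reproduction loop. It is only an illustration under assumptions: `reproduce`, `first_oops_line` and `read_dmesg` are names made up for this example, the marker strings are simply the ones that show up in the log of this report, and the actual command that starts and stops EMC is left to the caller because it depends on the local setup.

```python
import re
import subprocess
from typing import Callable, Optional

# Marker strings taken from the kernel log in this report (an assumption:
# other crashes may print different markers).
OOPS_PATTERN = re.compile(
    r"invalid opcode|kernel BUG at|Oops:|unable to handle kernel"
)


def first_oops_line(kernel_log: str) -> Optional[str]:
    """Return the first log line that looks like an oops, or None."""
    for line in kernel_log.splitlines():
        if OOPS_PATTERN.search(line):
            return line.strip()
    return None


def read_dmesg() -> str:
    """Read the kernel ring buffer via dmesg (may need root on some systems)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=False
    )
    return result.stdout


def reproduce(
    run_cycle: Callable[[], None],
    read_log: Callable[[], str] = read_dmesg,
    max_tries: int = 30,
) -> Optional[int]:
    """Call run_cycle up to max_tries times; stop once the log shows an oops.

    Returns the 1-based try number on which an oops marker was seen, or None.
    Note: a stale oops already present in the log is reported on try 1.
    """
    for attempt in range(1, max_tries + 1):
        run_cycle()  # caller-supplied: start EMC, then shut it down again
        if first_oops_line(read_log()) is not None:
            return attempt
    return None
```

`run_cycle` would be whatever starts and stops EMC on the machine in question, for example a small wrapper around a local script. Since the box may be unusable after the first oops, the return value is mostly of interest when the crash does not take the whole system down.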
So, what is this? Well, I'm fairly certain that we have a race condition here. The crash only seems to happen on SMP (I have two CPUs) when isolcpus is _not_ used. I'm using classic RCU.

[ 339.984905] RTAI[sched]: timer setup = 999 ns, resched latency = 2944 ns.
[ 340.046754] RTAI[math]: loaded.
[ 340.120839] parport_pc 00:05: activated
[ 340.126383] config string '0xDC00 0xD800'
[ 340.224226] invalid opcode: 0000 [#1] PREEMPT SMP
[ 340.224245] last sysfs file: /sys/devices/pci0000:00/0000:00:0b.0/0000:02:00.0/resource
[ 340.224248] CPU 0
[ 340.224254] Modules linked in: charge_pump stepgen hal_parport probe_parport motmod trivkins hal_lib rtapi rtai_math rtai_sem rtai_shm rtai_fifos rtai_sched rtai_hal k8temp evdev [last unloaded: rtai_hal]
[ 340.224299] Pid: 3359, comm: milltask Not tainted 2.6.29.6-rtai #2 GeForce7050M-M
[ 340.224303] RIP: 0010:[<ffffffff80c80d8e>] [<ffffffff80c80d8e>] 0xffffffff80c80d8e
[ 340.224313] RSP: 0018:ffffffff80c80d90 EFLAGS: 00214286
[ 340.224316] RAX: 000000004b309c23 RBX: ffffffff80cd7f58 RCX: 0000000000000000
[ 340.224320] RDX: 0000000030e298b1 RSI: 0000000013153411 RDI: 000000004b309c23
[ 340.224323] RBP: ffffffff80cd7b58 R08: ffff880072db4000 R09: 0000000000000000
[ 340.224326] R10: 0000000000000000 R11: ffffffffa00d70b0 R12: 0000000000000008
[ 340.224330] R13: ffffffff80903ebb R14: 0000000000000246 R15: ffffffff8024d1c8
[ 340.224333] FS: 00007fc192c0f6f0(0000) GS:ffffffff80c89040(0000) knlGS:0000000000000000
[ 340.224337] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 340.224341] CR2: 00007fcbf6fd5520 CR3: 0000000072d9c000 CR4: 00000000000006a0
[ 340.224344] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 340.224347] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
[ 340.224350] Process milltask (pid: 3359, threadinfo ffff880072db4000, task ffff88007f08f080)
[ 340.224354] Stack:
[ 340.224357] ffffffff80903ff7 0000000000000202 ffffffff8027d4f7 0000000000000009
[ 340.224366] 0000000000000001 0000000000000101 ffffffff8027d609 0000000000000040
[ 340.224377] 0000000000000257 0000004f36e3a398 0000000000000000 ffffffff80c80e20
[ 340.224391] Call Trace:
[ 340.224395] <IRQ> <0> [<ffffffff80903ff7>] ? _spin_unlock_irqrestore+0x47/0x60
[ 340.224407] [<ffffffff8027d4f7>] ? __rcu_process_callbacks+0x267/0x330
[ 340.224415] [<ffffffff8027d609>] ? rcu_process_callbacks+0x49/0x50
[ 340.224423] [<ffffffff8020cfcc>] ? call_softirq+0x1c/0x30
[ 340.224430] [<ffffffff8020eaa5>] ? do_softirq+0x75/0xc0
[ 340.224436] [<ffffffff80248317>] ? irq_exit+0x47/0xb0
[ 340.224442] [<ffffffff8021d235>] ? smp_apic_timer_interrupt+0x75/0xc0
[ 340.224450] [<ffffffff8021d1c0>] ? smp_apic_timer_interrupt+0x0/0xc0
[ 340.224457] [<ffffffff80280621>] ? __ipipe_sync_stage+0x281/0x287
[ 340.224464] [<ffffffff80280627>] ? __xirq_end+0x0/0x102
[ 340.224470] [<ffffffffa00bbc00>] ? rtai_hirq_dispatcher+0x3a0/0x620 [rtai_hal]
[ 340.224481] [<ffffffff80248604>] ? _local_bh_enable+0x4/0xc0
[ 340.224487] [<ffffffff8020c455>] ? common_interrupt+0x15/0x2e
[ 340.224493] <EOI> <0>Code: ff ff ff ff ff 8e 0d c8 80 ff ff ff ff 10 00 00 00 00 00 00 00 86 42 21 00 00 00 00 00 90 0d c8 80 ff ff ff ff 18 00 00 00 00 00 <00> 00 f7 3f 90 80 ff ff ff ff 02 02 00 00 00 00 00 00 f7 d4 27
[ 340.224620] RIP [<ffffffff80c80d8e>] 0xffffffff80c80d8e
[ 340.224626] RSP <ffffffff80c80d90>
[ 340.224630] ---[ end trace b768805364b420bd ]---
[ 340.224634] note: milltask[3359] exited with preempt_count 268435457
[ 340.254840] ------------[ cut here ]------------
[ 340.254840] ------------[ cut here ]------------
[ 340.254854] WARNING: at kernel/sched_fair.c:890 pick_next_task_fair+0x10b/0x120()
[ 340.254858] Hardware name: GeForce7050M-M
[ 340.254861] Modules linked in: charge_pump stepgen hal_parport probe_parport motmod trivkins hal_lib rtapi rtai_math rtai_sem rtai_shm rtai_fifos rtai_sched rtai_hal k8temp evdev [last unloaded: rtai_hal]
[ 340.254912] Pid: 3352, comm: hal_manualtoolc Tainted: G D 2.6.29.6-rtai #2
[ 340.254916] Call Trace:
[ 340.254924] [<ffffffff802421aa>] warn_slowpath+0xea/0x160
[ 340.254932] [<ffffffff8021d1c0>] smp_apic_timer_interrupt+0x0/0xc0
[ 340.254938] [<ffffffff80280621>] __ipipe_sync_stage+0x281/0x287
[ 340.254943] [<ffffffff80280627>] __xirq_end+0x0/0x102
[ 340.254948] [<ffffffff802c20b5>] mem_cgroup_charge_common+0x75/0xa0
[ 340.254954] [<ffffffff807d25af>] dev_kfree_skb_irq+0x5f/0x90
[ 340.254959] [<ffffffff8020c44e>] common_interrupt+0xe/0x2e
[ 340.254963] [<ffffffff8085de70>] unix_poll+0x0/0xa0
[ 340.254968] [<ffffffff8023555b>] pick_next_task_fair+0x10b/0x120
[ 340.254974] [<ffffffff809018d0>] schedule+0x370/0x8a0
[ 340.254978] [<ffffffff80280621>] __ipipe_sync_stage+0x281/0x287
[ 340.254983] [<ffffffff80280627>] __xirq_end+0x0/0x102
[ 340.254988] [<ffffffff80902945>] schedule_hrtimeout_range+0x125/0x140
[ 340.254997] [<ffffffff8021c6a0>] smp_invalidate_interrupt+0x0/0xb0
[ 340.255001] [<ffffffff8020c44e>] common_interrupt+0xe/0x2e
[ 340.255006] [<ffffffff802d4d1c>] poll_schedule_timeout+0x2c/0x50
[ 340.255011] [<ffffffff802d5ad1>] do_select+0x5c1/0x660
[ 340.255015] [<ffffffff802d6210>] __pollwait+0x0/0x120
[ 340.255019] [<ffffffff802d6330>] pollwake+0x0/0x50
[ 340.255023] [<ffffffff8023dc80>] default_wake_function+0x0/0x10
[ 340.255027] [<ffffffff80233fe3>] __wake_up_common+0x53/0x80
[ 340.255031] [<ffffffff802344e7>] __wake_up_sync+0x47/0x90
[ 340.255038] [<ffffffff80903ff7>] _spin_unlock_irqrestore+0x47/0x60
[ 340.255043] [<ffffffff807c76da>] sock_def_readable+0x3a/0x70
[ 340.255047] [<ffffffff808605ba>] unix_stream_sendmsg+0x26a/0x3e0
[ 340.255051] [<ffffffff807c33aa>] sock_aio_write+0x13a/0x150
[ 340.255056] [<ffffffff802a5914>] handle_mm_fault+0x1f4/0x8a0
[ 340.255059] [<ffffffff807c3270>] sock_aio_write+0x0/0x150
[ 340.255064] [<ffffffff802c4e8b>] do_sync_readv_writev+0xcb/0x110
[ 340.255067] [<ffffffff802d5da7>] core_sys_select+0x237/0x360
[ 340.255072] [<ffffffff802c56ab>] do_readv_writev+0x16b/0x220
[ 340.255075] [<ffffffff802d6151>] sys_select+0x51/0x110
[ 340.255079] [<ffffffff8020ba6f>] system_call_fastpath+0x16/0x1b
[ 340.255082] ---[ end trace b768805364b420be ]---
[ 340.255095] ------------[ cut here ]------------
[ 340.255098] kernel BUG at kernel/exit.c:1095!
[ 340.255101] invalid opcode: 0000 [#2] PREEMPT SMP
[ 340.255111] last sysfs file: /sys/devices/pci0000:00/0000:00:0b.0/0000:02:00.0/resource
[ 340.255115] CPU 0
[ 340.255121] ------------[ cut here ]------------
[ 340.255126] WARNING: at kernel/sched_fair.c:890 enqueue_task+0x52/0x60()
[ 340.255128] Hardware name: GeForce7050M-M
[ 340.255136] Modules linked in: charge_pump stepgen hal_parport probe_parport motmod trivkins hal_lib rtapi rtai_math rtai_sem rtai_shm rtai_fifos rtai_sched rtai_hal k8temp evdev [last unloaded: rtai_hal]
[ 340.255157] Pid: 2630, comm: Xorg Tainted: G D W 2.6.29.6-rtai #2
[ 340.255159] Call Trace:
[ 340.255166] [<ffffffff802421aa>] warn_slowpath+0xea/0x160
[ 340.255168] Modules linked in: charge_pump stepgen [<ffffffffa00d674a>] rt_timer_handler+0x86a/0x940 [rtai_sched]
[ 340.255189] hal_parport probe_parport motmod trivkins hal_lib rtapi rtai_math rtai_sem rtai_shm rtai_fifos rtai_sched rtai_hal k8temp evdev [last unloaded: rtai_hal]
[ 340.255200] Pid: 3376, comm: rsyslogd Tainted: G D W 2.6.29.6-rtai #2 GeForce7050M-M
[ 340.255201] RIP: 0010:[<ffffffff802465c9>] [<ffffffff802465c9>] do_exit+0x639/0x850
[ 340.255206] RSP: 0018:ffff880072dcfee8 EFLAGS: 00010292
[ 340.255207] RAX: ffff880072dcffd8 RBX: 0000000000000012 RCX: ffffffff80cdd2c0
[ 340.255209] RDX: ffff88000100f8c0 RSI: ffff880072c3f080 RDI: ffff88000101d600
[ 340.255211] RBP: ffff88007f08f080 R08: ffff880072dce000 R09: 0000000000000007
[ 340.255212] R10: 000000000000000e R11: ffffffffa00d70b0 R12: ffff88007f08f210
[ 340.255214] R13: ffff88007f08f238 R14: ffff88007f860000 R15: ffff88007f08f210
[ 340.255216] FS: 00007fcbf953f6e0(0000) GS:ffffffff80c89040(0000) knlGS:0000000000000000
[ 340.255217] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 340.255219] CR2: 00007fcbf52f7d20 CR3: 0000000072d8f000 CR4: 00000000000006a0
[ 340.255221] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 340.255223] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
[ 340.255225] Process rsyslogd (pid: 3376, threadinfo ffff880072dce000, task ffff88007f08f080)
[ 340.255227] Stack:
[ 340.255227] 000000000000000e ffff880072dcff08 0000000180c77120 ffff88007f08f238
[ 340.255230] ffff880072dcff08 ffff880072dcff08 00007fde3b5fe880 ffff8800781f3480
[ 340.255232] 0000000000000000 00007fde3b5fe880 0000000041056ed0 000000000040aec0
[ 340.255235] Call Trace:
[ 340.255236] [<ffffffff80246813>] ? do_group_exit+0x33/0xa0
[ 340.255239] [<ffffffff80246892>] ? sys_exit_group+0x12/0x20
[ 340.255241] [<ffffffff8020ba6f>] ? system_call_fastpath+0x16/0x1b
[ 340.255244] Code: bd f0 05 00 00 48 85 ff 74 05 e8 d3 61 08 00 65 48 8b 04 25 10 00 00 00 ff 80 44 e0 ff ff 48 c7 45 00 40 00 00 00 e8 97 af 6b 00 <0f> 0b eb fe 0f 0b eb fe 48 8b 85 30 04 00 00 48 8b 78 48 48 85
[ 340.255262] RIP [<ffffffff802465c9>] do_exit+0x639/0x850
[ 340.255264] RSP <ffff880072dcfee8>
[ 340.255266] ---[ end trace b768805364b420bf ]---
[ 340.255268] Fixing recursive fault but reboot is needed!
[ 340.255298] [<ffffffffa00bba0f>] rtai_hirq_dispatcher+0x1af/0x620 [rtai_hal]
[ 340.255304] [<ffffffff8020c455>] common_interrupt+0x15/0x2e
[ 340.255315] [<ffffffff80903fc8>] _spin_unlock_irqrestore+0x18/0x60
[ 340.255319] [<ffffffff80235de6>] tg_shares_up+0xd6/0x1e0
[ 340.255324] [<ffffffff80233d52>] enqueue_task+0x52/0x60
[ 340.255334] [<ffffffff80233e50>] activate_task+0x20/0x30
[ 340.255340] [<ffffffff8023dc62>] try_to_wake_up+0x202/0x220
[ 340.255345] [<ffffffff802d6373>] pollwake+0x43/0x50
[ 340.255356] [<ffffffff8023dc80>] default_wake_function+0x0/0x10
[ 340.255360] [<ffffffff80233fe3>] __wake_up_common+0x53/0x80
[ 340.255364] [<ffffffff802344e7>] __wake_up_sync+0x47/0x90
[ 340.255375] [<ffffffff80860932>] unix_write_space+0x42/0x90
[ 340.255379] [<ffffffff80860943>] unix_write_space+0x53/0x90
[ 340.255385] [<ffffffff807c81c1>] sock_wfree+0x51/0x60
[ 340.255397] [<ffffffff807c9f18>] skb_release_head_state+0x48/0xa0
[ 340.255401] [<ffffffff807cbe39>] __kfree_skb+0x9/0xa0
[ 340.255405] [<ffffffff8085ffd1>] unix_stream_recvmsg+0x291/0x610
[ 340.255420] [<ffffffffa00d674a>] rt_timer_handler+0x86a/0x940 [rtai_sched]
[ 340.255426] [<ffffffff802bb1ff>] add_partial+0x1f/0x80
[ 340.255436] [<ffffffff807c3502>] sock_aio_read+0x142/0x150
[ 340.255441] [<ffffffff802c50cb>] do_sync_read+0xdb/0x120
[ 340.255452] [<ffffffff80248317>] irq_exit+0x47/0xb0
[ 340.255458] [<ffffffff80259950>] autoremove_wake_function+0x0/0x30
[ 340.255463] [<ffffffff80280627>] __xirq_end+0x0/0x102
[ 340.255474] [<ffffffff802c5dd6>] vfs_read+0x176/0x180
[ 340.255478] [<ffffffff802c5ee3>] sys_read+0x53/0xa0
[ 340.255482] [<ffffffff8020ba6f>] system_call_fastpath+0x16/0x1b
[ 340.255486] ---[ end trace b768805364b420c0 ]---
[ 340.255600] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
[ 340.255614] IP: [<ffffffff80234c98>] check_preempt_wakeup+0x158/0x170
[ 340.255620] PGD 7dc61067 PUD 7e12d067 PMD 0
[ 340.255634] Oops: 0000 [#3] PREEMPT SMP
[ 340.255643] last sysfs file: /sys/devices/pci0000:00/0000:00:0b.0/0000:02:00.0/resource
[ 340.255652] CPU 1
[ 340.255657] Modules linked in: charge_pump stepgen hal_parport probe_parport motmod trivkins hal_lib rtapi rtai_math rtai_sem rtai_shm rtai_fifos rtai_sched rtai_hal k8temp evdev [last unloaded: rtai_hal]
[ 340.255714] Pid: 2630, comm: Xorg Tainted: G D W 2.6.29.6-rtai #2 GeForce7050M-M
[ 340.255717] RIP: 0010:[<ffffffff80234c98>] [<ffffffff80234c98>] check_preempt_wakeup+0x158/0x170
[ 340.255724] RSP: 0000:ffff88007cd8fac8 EFLAGS: 00010206
[ 340.255733] RAX: ffff88000101d670 RBX: 0000000000000000 RCX: 0000000000000002
[ 340.255736] RDX: 0000000000000000 RSI: ffff880072c3f080 RDI: ffff880001024480
[ 340.255739] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000007
[ 340.255743] R10: 000000000000000e R11: 0000000000000000 R12: ffff88007f08f080
[ 340.255752] R13: ffff880072c3f080 R14: ffff88000101d600 R15: 0000000000000001
[ 340.255756] FS: 00007f8f1f3136e0(0000) GS:ffff88007f803980(0000) knlGS:0000000000000000
[ 340.255759] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 340.255762] CR2: 0000000000000078 CR3: 000000007cc40000 CR4: 00000000000006a0
[ 340.255771] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 340.255775] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 340.255778] Process Xorg (pid: 2630, threadinfo ffff88007cd8e000, task ffff88007e146a40)
[ 340.255782] Stack:
[ 340.255790] 0000000000000001 ffff880072c3f080 ffff88000101d600 0000000000000001
[ 340.255799] 0000000000000000 ffffffff8023db94 ffffffff80c73700 000000000000001f
[ 340.255818] 0000000000000000 ffff880076d1cca0 0000000000000000 0000000000000001
[ 340.255838] Call Trace:
[ 340.255841] [<ffffffff8023db94>] try_to_wake_up+0x134/0x220
[ 340.255853] [<ffffffff802d6373>] pollwake+0x43/0x50
[ 340.255860] [<ffffffff8023dc80>] default_wake_function+0x0/0x10
[ 340.255873] [<ffffffff80233fe3>] __wake_up_common+0x53/0x80
[ 340.255878] [<ffffffff802344e7>] __wake_up_sync+0x47/0x90
[ 340.255889] [<ffffffff80860932>] unix_write_space+0x42/0x90
[ 340.255895] [<ffffffff80860943>] unix_write_space+0x53/0x90
[ 340.255900] [<ffffffff807c81c1>] sock_wfree+0x51/0x60
[ 340.255912] [<ffffffff807c9f18>] skb_release_head_state+0x48/0xa0
[ 340.255918] [<ffffffff807cbe39>] __kfree_skb+0x9/0xa0
[ 340.255930] [<ffffffff8085ffd1>] unix_stream_recvmsg+0x291/0x610
[ 340.255938] [<ffffffffa00d674a>] rt_timer_handler+0x86a/0x940 [rtai_sched]
[ 340.255956] [<ffffffff802bb1ff>] add_partial+0x1f/0x80
[ 340.255968] [<ffffffff807c3502>] sock_aio_read+0x142/0x150
[ 340.255974] [<ffffffff802c50cb>] do_sync_read+0xdb/0x120
[ 340.255979] [<ffffffff80248317>] irq_exit+0x47/0xb0
[ 340.255991] [<ffffffff80259950>] autoremove_wake_function+0x0/0x30
[ 340.255996] [<ffffffff80280627>] __xirq_end+0x0/0x102
[ 340.256009] [<ffffffff802c5dd6>] vfs_read+0x176/0x180
[ 340.256016] [<ffffffff802c5ee3>] sys_read+0x53/0xa0
[ 340.256021] [<ffffffff8020ba6f>] system_call_fastpath+0x16/0x1b
[ 340.256033] Code: 70 39 c8 7c f6 89 c1 39 c1 7d 20 48 8b 5b 70 ff c8 39 c1 7c f6 48 8b 43 78 48 39 45 78 0f 84 76 ff ff ff 48 8b 6d 70 48 8b 5b 70 <48> 8b 43 78 48 39 45 78 75 ee e9 5f ff ff ff 0f 0b eb fe 0f

--
Greetings, Michael.

_______________________________________________
Emc-developers mailing list
https://lists.sourceforge.net/lists/listinfo/emc-developers
