Hello Michael,

I can reproduce this with Linux 3.2.0-2-amd64 (3.2.12-1) and mdadm (3.2.3-2):

mdadm --grow /dev/md0 --bitmap=none
mdadm --grow /dev/md0 --bitmap=internal

A few seconds later the kernel panics and reboots:

[ 75.119802] md0: bitmap file is out of date (0 < 147) -- forcing full recovery
[ 75.119817] created bitmap (1 pages) for device md0
[ 80.797978] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 80.797996] IP: [<ffffffffa00272c1>] bitmap_endwrite+0x131/0x18f [md_mod]
[ 80.798013] PGD 0
[ 80.798020] Oops: 0000 [#1] SMP
[ 80.798028] CPU 0
[ 80.798032] Modules linked in: fuse evdev snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr ext3 mbcache jbd raid1 md_mod xen_netfront xen_blkfront
[ 80.798070]
[ 80.798075] Pid: 0, comm: swapper/0 Not tainted 3.2.0-2-amd64 #1
[ 80.798086] RIP: e030:[<ffffffffa00272c1>] [<ffffffffa00272c1>] bitmap_endwrite+0x131/0x18f [md_mod]
[ 80.798102] RSP: e02b:ffff8800ffe5ec88 EFLAGS: 00010046
[ 80.798109] RAX: 0000000000000000 RBX: ffff8800031835c0 RCX: 0000000000000888
[ 80.798117] RDX: 0000000000000000 RSI: 0000000000000088 RDI: ffff8800031835c0
[ 80.798124] RBP: 0000000001103d50 R08: 0000000000000000 R09: 0000000000000000
[ 80.798132] R10: 0000000000000246 R11: 0000000000000246 R12: 0000000000000008
[ 80.798140] R13: ffff8800031835fc R14: ffff880003180078 R15: 0000000000000001
[ 80.798154] FS: 00007f9fe4ac57c0(0000) GS:ffff8800ffe5b000(0000) knlGS:0000000000000000
[ 80.798165] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 80.798172] CR2: 0000000000000010 CR3: 00000000f48cc000 CR4: 0000000000002660
[ 80.798181] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 80.798189] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 80.798198] Process swapper/0 (pid: 0, threadinfo ffffffff81600000, task ffffffff8160d020)
[ 80.798207] Stack:
[ 80.798211] 0000000000000246 ffff880003183678 0000000000000000 000000000001c2b0
[ 80.798227] ffff8800032f0040 ffff8800f57c00c0 ffff88000309c3c0 ffff8800032f2e48
[ 80.798243] 000000000000000b ffff8800030a54d0 0000000000000000 ffffffffa0006808
[ 80.798258] Call Trace:
[ 80.798263] <IRQ>
[ 80.798273] [<ffffffffa0006808>] ? close_write+0x71/0x7d [raid1]
[ 80.798284] [<ffffffffa0009677>] ? r1_bio_write_done+0x1e/0x37 [raid1]
[ 80.798295] [<ffffffffa00097a8>] ? raid1_end_write_request+0x118/0x134 [raid1]
[ 80.798309] [<ffffffff81006670>] ? xen_force_evtchn_callback+0x9/0xa
[ 80.798320] [<ffffffff81006c52>] ? check_events+0x12/0x20
[ 80.798331] [<ffffffff811965e8>] ? blk_update_request+0x18c/0x30a
[ 80.798341] [<ffffffff8119677e>] ? blk_update_bidi_request+0x18/0x63
[ 80.798351] [<ffffffff81197a0c>] ? __blk_end_bidi_request+0xe/0x27
[ 80.798361] [<ffffffff81197a3f>] ? __blk_end_request_all+0x1a/0x23
[ 80.798371] [<ffffffffa0000794>] ? blkif_interrupt+0x23f/0x2ae [xen_blkfront]
[ 80.798384] [<ffffffff81090581>] ? handle_irq_event_percpu+0x50/0x180
[ 80.798394] [<ffffffff81070733>] ? arch_local_irq_restore+0x7/0x8
[ 80.798405] [<ffffffff81062441>] ? hrtimer_get_next_event+0x79/0x8f
[ 80.798414] [<ffffffff810906e5>] ? handle_irq_event+0x34/0x53
[ 80.798425] [<ffffffff81219206>] ? ack_dynirq+0x17/0x2e
[ 80.798435] [<ffffffff81092b21>] ? handle_edge_irq+0xa2/0xc9
[ 80.798446] [<ffffffff81218f25>] ? __xen_evtchn_do_upcall+0x157/0x1f2
[ 80.798457] [<ffffffff8106b8b3>] ? arch_local_irq_restore+0x7/0x8
[ 80.798468] [<ffffffff8121a548>] ? xen_evtchn_do_upcall+0x22/0x32
[ 80.798480] [<ffffffff813502be>] ? xen_do_hypervisor_callback+0x1e/0x30
[ 80.798488] <EOI>
[ 80.798496] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[ 80.798505] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[ 80.798515] [<ffffffff8100663a>] ? xen_safe_halt+0xc/0x13
[ 80.798525] [<ffffffff810144fc>] ? default_idle+0x47/0x7f
[ 80.798535] [<ffffffff8100d252>] ? cpu_idle+0xaf/0xf2
[ 80.798545] [<ffffffff816aab3d>] ? start_kernel+0x3bd/0x3c8
[ 80.798554] [<ffffffff816ac64a>] ? xen_start_kernel+0x590/0x596
[ 80.798561] Code: 77 0a 01 e1 48 8b 04 24 66 8b 10 ff ca 66 83 fa 02 66 89 10 77 2e 48 8b 4b 20 48 89 ee 48 89 df 83 e9 09 48 d3 ee e8 f9 f4 ff ff <48> 8b 40 10 48 8b 53 58 8d 04 85 01 00 00 00 0f ab 02 c7 43 78
[ 80.798665] RIP [<ffffffffa00272c1>] bitmap_endwrite+0x131/0x18f [md_mod]
[ 80.798678] RSP <ffff8800ffe5ec88>
[ 80.798683] CR2: 0000000000000010
[ 80.798691] ---[ end trace 96c25711b3dbe8e9 ]---
[ 80.798697] Kernel panic - not syncing: Fatal exception in interrupt
[ 80.798705] Pid: 0, comm: swapper/0 Tainted: G D 3.2.0-2-amd64 #1
[ 80.798712] Call Trace:
[ 80.798716] <IRQ> [<ffffffff81342930>] ? panic+0x95/0x1a5
[ 80.798731] [<ffffffff81070733>] ? arch_local_irq_restore+0x7/0x8
[ 80.798741] [<ffffffff81349e86>] ? oops_end+0xa9/0xb6
[ 80.798750] [<ffffffff8134227c>] ? no_context+0x1ff/0x20e
[ 80.798760] [<ffffffff8102bac8>] ? pvclock_clocksource_read+0x42/0xb2
[ 80.798770] [<ffffffff8134be99>] ? do_page_fault+0x1a8/0x337
[ 80.798779] [<ffffffff81006c3f>] ? xen_restore_fl_direct_reloc+0x4/0x4
[ 80.798789] [<ffffffff810b7227>] ? arch_local_irq_restore+0x7/0x8
[ 80.798800] [<ffffffff810359f1>] ? set_task_rq+0x23/0x35
[ 80.798809] [<ffffffff8104098c>] ? select_task_rq_fair+0x39f/0x67e
[ 80.798819] [<ffffffff8102bac8>] ? pvclock_clocksource_read+0x42/0xb2
[ 80.798828] [<ffffffff8102bac8>] ? pvclock_clocksource_read+0x42/0xb2
[ 80.798838] [<ffffffff81006670>] ? xen_force_evtchn_callback+0x9/0xa
[ 80.798848] [<ffffffff81006c52>] ? check_events+0x12/0x20
[ 80.798857] [<ffffffff813495f5>] ? page_fault+0x25/0x30
[ 80.798870] [<ffffffffa00272c1>] ? bitmap_endwrite+0x131/0x18f [md_mod]
[ 80.798882] [<ffffffffa00272c1>] ? bitmap_endwrite+0x131/0x18f [md_mod]
[ 80.798893] [<ffffffffa0006808>] ? close_write+0x71/0x7d [raid1]
[ 80.798903] [<ffffffffa0009677>] ? r1_bio_write_done+0x1e/0x37 [raid1]
[ 80.798914] [<ffffffffa00097a8>] ? raid1_end_write_request+0x118/0x134 [raid1]
[ 80.798925] [<ffffffff81006670>] ? xen_force_evtchn_callback+0x9/0xa
[ 80.798935] [<ffffffff81006c52>] ? check_events+0x12/0x20
[ 80.801968] [<ffffffff811965e8>] ? blk_update_request+0x18c/0x30a
[ 80.801968] [<ffffffff8119677e>] ? blk_update_bidi_request+0x18/0x63
[ 80.801968] [<ffffffff81197a0c>] ? __blk_end_bidi_request+0xe/0x27
[ 80.801968] [<ffffffff81197a3f>] ? __blk_end_request_all+0x1a/0x23
[ 80.801968] [<ffffffffa0000794>] ? blkif_interrupt+0x23f/0x2ae [xen_blkfront]
[ 80.801968] [<ffffffff81090581>] ? handle_irq_event_percpu+0x50/0x180
[ 80.801968] [<ffffffff81070733>] ? arch_local_irq_restore+0x7/0x8
[ 80.801968] [<ffffffff81062441>] ? hrtimer_get_next_event+0x79/0x8f
[ 80.801968] [<ffffffff810906e5>] ? handle_irq_event+0x34/0x53
[ 80.801968] [<ffffffff81219206>] ? ack_dynirq+0x17/0x2e
[ 80.801968] [<ffffffff81092b21>] ? handle_edge_irq+0xa2/0xc9
[ 80.801968] [<ffffffff81218f25>] ? __xen_evtchn_do_upcall+0x157/0x1f2
[ 80.801968] [<ffffffff8106b8b3>] ? arch_local_irq_restore+0x7/0x8
[ 80.801968] [<ffffffff8121a548>] ? xen_evtchn_do_upcall+0x22/0x32
[ 80.801968] [<ffffffff813502be>] ? xen_do_hypervisor_callback+0x1e/0x30
[ 80.801968] <EOI> [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[ 80.801968] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[ 80.801968] [<ffffffff8100663a>] ? xen_safe_halt+0xc/0x13
[ 80.801968] [<ffffffff810144fc>] ? default_idle+0x47/0x7f
[ 80.801968] [<ffffffff8100d252>] ? cpu_idle+0xaf/0xf2
[ 80.801968] [<ffffffff816aab3d>] ? start_kernel+0x3bd/0x3c8
[ 80.801968] [<ffffffff816ac64a>] ? xen_start_kernel+0x590/0x596

With a statically linked mdadm (v2.6.7 - 6th June 2008) everything works fine:

/tmp/mdadm.static64 --grow /dev/md0 --bitmap=none
/tmp/mdadm.static64 --grow /dev/md0 --bitmap=internal

[ 281.355160] md: md0: resync done.
[ 345.359062] md0: bitmap file is out of date (0 < 210) -- forcing full recovery
[ 345.359077] created bitmap (160 pages) for device md0
[ 345.359156] md0: bitmap file is out of date, doing full recovery
[ 345.392291] md0: bitmap initialized from disk: read 11/11 pages, set 327679 of 327679 bits

And the kernel keeps going without interruption.

I'm puzzled because the older mdadm doesn't trigger this bug.

Hope this helps.

--
greetings eMHa