On Fri, Sep 14, 2012 at 11:00:35AM +0000, Holger Kiehl wrote:
> Hello,
>
> Got the following error situation where I do not know why it happened. In
> /var/log/messages I found the following:

From what I see, drbd does a simple printk only.
And your fb console wants to print those,
but thinks it needs to scroll something to show it.
That results in some memory areas being mapped/unmapped,
which should have triggered a "might sleep" warning before this already.
Anyways, it ends up needing a tlb flush, which iterates over all cpus,
and that triggers the "someone calls smp_call_function_many with irq disabled,
and that could deadlock! WTF?" check right there.
Because, well, vprintk_emit disabled irqs.

So I guess, get rid of your funky fb console and be happy.
Or get someone to fix that mga g200 fb driver for you...

> Sep 14 08:32:06 praktifix kernel: WARNING: at kernel/smp.c:461 smp_call_function_many+0x6c/0x1bb()
> Sep 14 08:32:06 praktifix kernel: Hardware name: PRIMERGY RX300 S4
> Sep 14 08:32:06 praktifix kernel: Modules linked in: drbd(O) coretemp ipmi_devintf ipmi_si bonding binfmt_misc video acpi_ipmi ipmi_msghandler ac nvram sr_mod cdrom sg usbhid mgag200 fbcon ttm tileblit font bitblit softcursor drm_kms_helper drm i2c_algo_bit sysimgblt sysfillrect syscopyarea i5k_amb pata_acpi i2c_i801 ata_generic i2c_core i5000_edac ehci_hcd uhci_hcd usbcore usb_common [last unloaded: microcode]
> Sep 14 08:32:06 praktifix kernel: Pid: 4442, comm: drbd_r_r0 Tainted: G O 3.5.3 #1
> Sep 14 08:32:06 praktifix kernel: Call Trace:
> Sep 14 08:32:06 praktifix kernel: [<ffffffff81060411>] ? smp_call_function_many+0x6c/0x1bb
> Sep 14 08:32:06 praktifix kernel: [<ffffffff8102ab0e>] warn_slowpath_common+0x80/0x99
> Sep 14 08:32:06 praktifix kernel: [<ffffffff8102ab3c>] warn_slowpath_null+0x15/0x17
> Sep 14 08:32:06 praktifix kernel: [<ffffffff81060411>] smp_call_function_many+0x6c/0x1bb
> Sep 14 08:32:06 praktifix kernel: [<ffffffff81024bd1>] ? leave_mm+0x43/0x43
> Sep 14 08:32:06 praktifix kernel: [<ffffffff81024bd1>] ? leave_mm+0x43/0x43
> Sep 14 08:32:06 praktifix kernel: [<ffffffff810605c2>] smp_call_function+0x20/0x24
> Sep 14 08:32:06 praktifix kernel: [<ffffffff810606a9>] on_each_cpu+0x16/0x32
> Sep 14 08:32:06 praktifix kernel: [<ffffffff81024aa3>] flush_tlb_all+0x17/0x19
> Sep 14 08:32:06 praktifix kernel: [<ffffffff8109f971>] __purge_vmap_area_lazy+0x122/0x17a
> Sep 14 08:32:06 praktifix kernel: [<ffffffff8109fa4b>] free_vmap_area_noflush+0x54/0x5b
> Sep 14 08:32:06 praktifix kernel: [<ffffffff810a05e9>] free_unmap_vmap_area+0x20/0x24
> Sep 14 08:32:06 praktifix kernel: [<ffffffff810a064a>] remove_vm_area+0x5d/0x71
> Sep 14 08:32:06 praktifix kernel: [<ffffffff810a076a>] __vunmap+0x38/0xb5
> Sep 14 08:32:06 praktifix kernel: [<ffffffff810a080d>] vunmap+0x26/0x28
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa00e9fb7>] ttm_bo_kunmap+0x55/0xa3 [ttm]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa00fc6a3>] mga_dirty_update+0x10b/0x122 [mgag200]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa00fc6e4>] mga_imageblit+0x2a/0x2f [mgag200]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa00ca7a4>] bit_putcs+0x44b/0x4b0 [bitblit]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa00cacf7>] ? bit_cursor+0x4ee/0x7f7 [bitblit]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa0103a74>] fbcon_putcs+0xa1/0x101 [fbcon]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa00ca359>] ? bit_clear+0xd6/0xd6 [bitblit]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa0105232>] fbcon_redraw+0xd8/0x16c [fbcon]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa0104494>] ? fbcon_cursor+0x127/0x150 [fbcon]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa00ca809>] ? bit_putcs+0x4b0/0x4b0 [bitblit]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa01071c5>] fbcon_scroll+0x687/0xc6c [fbcon]
> Sep 14 08:32:06 praktifix kernel: [<ffffffff8102bc9f>] ? console_unlock+0x2e0/0x2ef
> Sep 14 08:32:06 praktifix kernel: [<ffffffff811ee543>] scrup+0x71/0xe8
> Sep 14 08:32:06 praktifix kernel: [<ffffffff811ee64e>] lf+0x2d/0x66
> Sep 14 08:32:06 praktifix kernel: [<ffffffff811f3119>] vt_console_print+0x1d9/0x304
> Sep 14 08:32:06 praktifix kernel: [<ffffffff8102afd5>] call_console_drivers+0x7b/0x8d
> Sep 14 08:32:06 praktifix kernel: [<ffffffff8102bc1f>] console_unlock+0x260/0x2ef
> Sep 14 08:32:06 praktifix kernel: [<ffffffff8102c435>] vprintk_emit+0x302/0x364
> Sep 14 08:32:06 praktifix kernel: [<ffffffff8102c97c>] printk_emit+0x88/0x8a
> Sep 14 08:32:06 praktifix kernel: [<ffffffff8104cd4b>] ? __wake_up+0x43/0x50
> Sep 14 08:32:06 praktifix kernel: [<ffffffff812fb18d>] ? netlink_broadcast_filtered+0x28e/0x2bb
> Sep 14 08:32:06 praktifix kernel: [<ffffffff81205d8b>] __dev_printk+0x1d2/0x1e4
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa01a5c82>] ? drbd_bcast_event+0xd7/0x11c [drbd]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa01a949f>] ? drbd_khelper+0x1cc/0x1ff [drbd]
> Sep 14 08:32:06 praktifix kernel: [<ffffffff8120636a>] dev_printk+0xa9/0xab
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa018f551>] ? drbd_recv+0x26/0x15a [drbd]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa018f551>] ? drbd_recv+0x26/0x15a [drbd]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa018e4d1>] drbd_sync_handshake+0x34b/0x548 [drbd]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa0194d8d>] receive_state+0x3ce/0x75d [drbd]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa01908fc>] drbdd+0x9d/0x13a [drbd]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa019118c>] drbdd_init+0x79/0x98 [drbd]
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa01a2b38>] drbd_thread_setup+0x97/0x13f [drbd]
> Sep 14 08:32:06 praktifix kernel: [<ffffffff81377254>] kernel_thread_helper+0x4/0x10
> Sep 14 08:32:06 praktifix kernel: [<ffffffffa01a2aa1>] ? drbd_bmio_clear_n_write+0x149/0x149 [drbd]
> Sep 14 08:32:06 praktifix kernel: [<ffffffff81377250>] ? gs_change+0xb/0xb
> Sep 14 08:32:06 praktifix kernel: ---[ end trace 8b6e7b6ecbb1b906 ]---
> Sep 14 08:32:06 praktifix kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0
> Sep 14 08:32:06 praktifix kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
> Sep 14 08:32:06 praktifix kernel: d-con r0: conn( NetworkFailure -> Disconnecting )
> Sep 14 08:32:06 praktifix kernel: d-con r0: error receiving ReportState, e: -5 l: 0!
> Sep 14 08:32:06 praktifix kernel: d-con r0: Connection closed
> Sep 14 08:32:06 praktifix kernel: d-con r0: conn( Disconnecting -> StandAlone )
> Sep 14 08:32:06 praktifix kernel: d-con r0: receiver terminated
> Sep 14 08:32:06 praktifix kernel: d-con r0: Terminating receiver thread
>
> Before this it was running kernel 3.2.x and drbd 8.4.1 for a long time
> without any errors. Any clue why this happened?
>
> If more information is needed please just ask.
>
> Regards,
> Holger
> _______________________________________________
> drbd-user mailing list
> [email protected]
> http://lists.linbit.com/mailman/listinfo/drbd-user

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
