On Wed, Oct 20, 2010 at 04:15:15PM +0200, Welterlen Benoit wrote:
> I'm doing some tests on OCFS2 with a 2.6.32-100 kernel (Oracle) or
> RHEL6/fedora and I have a hang in lowcomms.c as you can see below.
> I have a crash dump if you need more information. I'm lost and I need
> help to know where to search to debug this problem.

Whee! Userspace stack on the 2.6.32-100 kernel ;-) We haven't actually
tested this configuration yet; it's not supported officially. However,
it "should" work, just as the userspace stack stuff has worked for a
while. I've forwarded this report on to the fs/dlm maintainer for
pointers to see if we can get you any help.

Joel

> Thanks
>
> Regards,
>
> Benoit
>
>
>
> Kernel 2.6.32-100.0.19.el5 on an x86_64
> chili0 login: ------------[ cut here ]------------
> kernel BUG at fs/dlm/lowcomms.c:647!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/kernel/dlm/14E8093BB71D447EBEE691622CF86B9C/control
> CPU 34
> Modules linked in: ocfs2(U) ocfs2_nodemanager(U) nfsd(U) exportfs(U)
> sctp(U) libcrc32c(U) ocfs2_stack_user(U) ocfs2_stackglue(U) dlm(U)
> configfs(U) acpi_cpufreq(U) freq_table(U) ipmi_devintf(U) ipmi_si(U)
> ipmi_msghandler(U) nfs(U) lockd(U) fscache(U) nfs_acl(U) auth_rpcgss(U)
> sunrpc(U) ipv6(U) scsi_dh_emc(U) dm_round_robin(U) dm_multipath(U)
> iTCO_wdt(U) iTCO_vendor_support(U) mlx4_core(U) i2c_i801(U) igb(U)
> pcspkr(U) i2c_core(U) ioatdma(U) dca(U) ahci(U) uhci_hcd(U) ehci_hcd(U)
> lpfc(U) scsi_transport_fc(U) scsi_tgt(U) [last unloaded: ocfs2_nodemanager]
> Pid: 27062, comm: dlm_recv/34 Not tainted 2.6.32-100.0.19.el5 #1 bullx
> super-node
> RIP: 0010:[<ffffffffa02406c3>] [<ffffffffa02406c3>]
> receive_from_sock+0x554/0x6ed [dlm]
> RSP: 0018:ffff880c77c6bc60 EFLAGS: 00010246
> RAX: 0000000000000030 RBX: ffff8810774b8d30 RCX: ffff88087c4548f8
> RDX: 0000000000000030 RSI: ffff880876dce000 RDI: ffffffff81398045
> RBP: ffff880c77c6be50 R08: ffff000000000000 R09: ffff880c77c6b900
> R10: ffff880c77c6b8f0 R11: 0000000000000030 R12: 0000000000000030
> R13: ffff8810774b8d20 R14: ffff880c7caa00c0 R15: ffffffffa023ecca
> FS: 0000000000000000(0000) GS:ffff88048e600000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000fcb078 CR3: 0000000001001000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process dlm_recv/34 (pid: 27062, threadinfo ffff880c77c6a000, task
> ffff880c7caa00c0)
> Stack:
> ffff880c77c6bc70 ffffffff8122fa24 ffff880c77c6bc90 ffffffff8122faca
> <0> ffff88048e414ec0 0000100000000002 0000000000000000 ffffffff00000000
> <0> 0000000000000000 0000000000000000 ffffffffa024bb20 0000000000000030
> Call Trace:
> [<ffffffff8122fa24>] ? cpumask_next+0x19/0x1b
> [<ffffffff8122faca>] ? cpumask_next_and+0x20/0x32
> [<ffffffffa023ecca>] ? process_recv_sockets+0x0/0x28 [dlm]
> [<ffffffffa023ecea>] process_recv_sockets+0x20/0x28 [dlm]
> [<ffffffff81071802>] worker_thread+0x14d/0x1ed
> [<ffffffff81075a7c>] ? autoremove_wake_function+0x0/0x3d
> [<ffffffff810716b5>] ? worker_thread+0x0/0x1ed
> [<ffffffff810756d3>] kthread+0x6e/0x76
> [<ffffffff81012dea>] child_rip+0xa/0x20
> [<ffffffff81075665>] ? kthread+0x0/0x76
> [<ffffffff81012de0>] ? child_rip+0x0/0x20
> Code: 29 e7 ff ff e9 2d 01 00 00 41 8b 74 24 10 0f b7 d0 48 c7 c7 d1 8c
> 24 a0 31 c0 e8 ab 71 e1 e0 e9 12 01 00 00 41 83 7d 08 00 75 04 <0f> 0b
> eb fe 4d 8d 7d 68 49 be 00 00 00 00 00 16 00 00 41 8b 55
> RIP [<ffffffffa02406c3>] receive_from_sock+0x554/0x6ed [dlm]
> RSP <ffff880c77c6bc60>
> Initializing cgroup subsys cpuset
> Initializing cgroup subsys cpu
> Linux version 2.6.32-100.0.19.el5 (mockbu...@ca-build9.us.oracle.com)
> (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Fri Sep 17
> 17:51:41 EDT 2010
> Command line: ro root=/dev/mapper/vg_chili0-lv_root
> rd_LVM_LV=vg_chili0/lv_root rd_LVM_LV=vg_chili0/lv_swap rd_NO_LUKS
> rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16
> KEYBOARDTYPE=pc KEYTABLE=fr-pc cgroup_disable=memory selinux=0
> pcie_aspm=off nmi_watchdog=0 console=ttyS1,115200 maxcpus=1
> reset_devices memmap=exactmap memmap=6...@0k memmap=1959...@33408k
> elfcorehdr=229356K memmap=308K#1993940K memmap=16K#2077704K
> memmap=4K#2077748K memmap=4K#2077764K memmap=44K#2077768K
> memmap=72K#2077812K memmap=4K#2077884K memmap=4K#2077888K
> memmap=4K#2077892K memmap=4K#2078024K memmap=2716K#2078052K
> memmap=1024K#69204860K memmap=128K#69205884K
> KERNEL supported cpus:
> Intel GenuineIntel
> AMD AuthenticAMD
> Centaur CentaurHauls
> BIOS-provided physical RAM map:
>
> From the dump :
> GNU gdb (GDB) 7.0
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
>
> KERNEL: /usr/lib/debug/lib/modules/2.6.32-100.0.19.el5/vmlinux
> DUMPFILE: /var/var/crash/127.0.0.1-2010-10-18-16:42:07/vmcore
> [PARTIAL DUMP]
> CPUS: 64
> DATE: Mon Oct 18 16:41:48 2010
> UPTIME: 00:15:00
> LOAD AVERAGE: 1.06, 1.22, 1.65
> TASKS: 1594
> NODENAME: chili0
> RELEASE: 2.6.32-100.0.19.el5
> VERSION: #1 SMP Fri Sep 17 17:51:41 EDT 2010
> MACHINE: x86_64 (1999 Mhz)
> MEMORY: 64 GB
> PANIC: "kernel BUG at fs/dlm/lowcomms.c:647!"
> PID: 27062
> COMMAND: "dlm_recv/34"
> TASK: ffff880c7caa00c0 [THREAD_INFO: ffff880c77c6a000]
> CPU: 34
> STATE: TASK_RUNNING (PANIC)
>
> crash> bt
> PID: 27062 TASK: ffff880c7caa00c0 CPU: 34 COMMAND: "dlm_recv/34"
> #0 [ffff880c77c6b910] machine_kexec at ffffffff8102cc9b
> #1 [ffff880c77c6b990] crash_kexec at ffffffff810964d4
> #2 [ffff880c77c6ba60] oops_end at ffffffff81439bd9
> #3 [ffff880c77c6ba90] die at ffffffff81015639
> #4 [ffff880c77c6bac0] do_trap at ffffffff8143952c
> #5 [ffff880c77c6bb10] do_invalid_op at ffffffff81013902
> #6 [ffff880c77c6bbb0] invalid_op at ffffffff81012b7b
> [exception RIP: receive_from_sock+1364]
> RIP: ffffffffa02406c3 RSP: ffff880c77c6bc60 RFLAGS: 00010246
> RAX: 0000000000000030 RBX: ffff8810774b8d30 RCX: ffff88087c4548f8
> RDX: 0000000000000030 RSI: ffff880876dce000 RDI: ffffffff81398045
> RBP: ffff880c77c6be50 R8: ffff000000000000 R9: ffff880c77c6b900
> R10: ffff880c77c6b8f0 R11: 0000000000000030 R12: 0000000000000030
> R13: ffff8810774b8d20 R14: ffff880c7caa00c0 R15: ffffffffa023ecca
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #7 [ffff880c77c6be58] process_recv_sockets at ffffffffa023ecea
> #8 [ffff880c77c6be78] worker_thread at ffffffff81071802
> #9 [ffff880c77c6bee8] kthread at ffffffff810756d3
> #10 [ffff880c77c6bf48] kernel_thread at ffffffff81012dea
>
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users@oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users

-- 

"Every new beginning comes from some other beginning's end."

Joel Becker
Senior Development Manager
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127