Hi everyone. Last week node 1 of my cluster failed. I checked the logs, but I have no idea why it went down. This is part of the log messages. Does anyone know something about it? I would appreciate your help. Thank you.
*May 4 23:50:42 HSLL-BD1 kernel: BUG: soft lockup - CPU#3 stuck for 10s! [fs.sh:28923]*
May 4 23:50:42 HSLL-BD1 kernel:
May 4 23:50:42 HSLL-BD1 kernel: Pid: 28923, comm: fs.sh
May 4 23:50:42 HSLL-BD1 kernel: EIP: 0060:[<c04e3d98>] CPU: 3
May 4 23:50:42 HSLL-BD1 kernel: EIP is at memcmp+0x0/0x22
May 4 23:50:42 HSLL-BD1 kernel: EFLAGS: 00000202 Tainted: G (2.6.18-92.el5PAE #1)
May 4 23:50:42 HSLL-BD1 kernel: EAX: df5dcf1d EBX: df5dcf1c ECX: 00000004 EDX: df5dce95
May 4 23:50:42 HSLL-BD1 kernel: ESI: df5dce95 EDI: 00000004 EBP: 00000000 DS: 007b ES: 007b
May 4 23:50:42 HSLL-BD1 kernel: CR0: 8005003b CR2: 00c24540 CR3: 3487bde0 CR4: 000006f0
May 4 23:50:42 HSLL-BD1 kernel: [<f8d4c1f5>] abi_personality+0x55/0x7c [abi_lcall]
May 4 23:50:42 HSLL-BD1 kernel: [<c046f4d3>] do_sync_read+0xb6/0xf1
May 4 23:50:42 HSLL-BD1 kernel: [<c0457353>] get_page_from_freelist+0x96/0x333
May 4 23:50:42 HSLL-BD1 kernel: [<f922d01a>] xout_load_object+0x1a/0x82d [binfmt_xout]
May 4 23:50:42 HSLL-BD1 kernel: [<c045c6ba>] page_address+0x7a/0x81
May 4 23:50:42 HSLL-BD1 kernel: [<c045cc0f>] kunmap_high+0x14/0x7e
May 4 23:50:42 HSLL-BD1 kernel: [<f922d8bc>] xout_load_binary+0xe/0x26 [binfmt_xout]
May 4 23:50:42 HSLL-BD1 kernel: [<c0477cea>] search_binary_handler+0x99/0x219
May 4 23:50:42 HSLL-BD1 kernel: [<c047953b>] do_execve+0x158/0x1f5
May 4 23:50:42 HSLL-BD1 kernel: [<c040321f>] sys_execve+0x2a/0x4a
May 4 23:50:42 HSLL-BD1 kernel: [<c0404eff>] syscall_call+0x7/0xb
May 4 23:50:42 HSLL-BD1 kernel: =======================
*May 4 23:50:46 HSLL-BD1 kernel: BUG: soft lockup - CPU#1 stuck for 10s! [modclusterd:28924]*
May 4 23:50:46 HSLL-BD1 kernel:
May 4 23:50:46 HSLL-BD1 kernel: Pid: 28924, comm: modclusterd
May 4 23:50:46 HSLL-BD1 kernel: EIP: 0060:[<f8d4c208>] CPU: 1
May 4 23:50:46 HSLL-BD1 kernel: EIP is at abi_personality+0x68/0x7c [abi_lcall]
May 4 23:50:46 HSLL-BD1 kernel: EFLAGS: 00200293 Tainted: G (2.6.18-92.el5PAE #1)
May 4 23:50:46 HSLL-BD1 kernel: EAX: ffffff60 EBX: df5dcf34 ECX: cba85e9a EDX: 0000006c
May 4 23:50:46 HSLL-BD1 kernel: ESI: cba85e9a EDI: 00000008 EBP: 00000000 DS: 007b ES: 007b
May 4 23:50:46 HSLL-BD1 kernel: CR0: 8005003b CR2: b7f3e000 CR3: 29216640 CR4: 000006f0
May 4 23:50:46 HSLL-BD1 kernel: [<c046f4d3>] do_sync_read+0xb6/0xf1
May 4 23:50:46 HSLL-BD1 kernel: [<c0457353>] get_page_from_freelist+0x96/0x333
May 4 23:50:46 HSLL-BD1 kernel: [<f922d01a>] xout_load_object+0x1a/0x82d [binfmt_xout]
May 4 23:50:46 HSLL-BD1 kernel: [<c045c6ba>] page_address+0x7a/0x81
May 4 23:50:46 HSLL-BD1 kernel: [<c045cc0f>] kunmap_high+0x14/0x7e
May 4 23:50:46 HSLL-BD1 kernel: [<f922d8bc>] xout_load_binary+0xe/0x26 [binfmt_xout]
May 4 23:50:46 HSLL-BD1 kernel: [<c0477cea>] search_binary_handler+0x99/0x219
May 4 23:50:46 HSLL-BD1 kernel: [<c047953b>] do_execve+0x158/0x1f5
May 4 23:50:46 HSLL-BD1 kernel: [<c040321f>] sys_execve+0x2a/0x4a
May 4 23:50:46 HSLL-BD1 kernel: [<c0404eff>] syscall_call+0x7/0xb
May 4 23:50:46 HSLL-BD1 kernel: =======================
*May 4 23:50:52 HSLL-BD1 kernel: BUG: soft lockup - CPU#3 stuck for 10s! [fs.sh:28923]*
May 4 23:50:52 HSLL-BD1 kernel:
May 4 23:50:52 HSLL-BD1 kernel: Pid: 28923, comm: fs.sh
May 4 23:50:52 HSLL-BD1 kernel: EIP: 0060:[<f8d4c1f5>] CPU: 3
May 4 23:50:52 HSLL-BD1 kernel: EIP is at abi_personality+0x55/0x7c [abi_lcall]
May 4 23:50:52 HSLL-BD1 kernel: EFLAGS: 00000212 Tainted: G (2.6.18-92.el5PAE #1)
May 4 23:50:52 HSLL-BD1 kernel: EAX: 0000005d EBX: df5dcf1c ECX: df5dce95 EDX: 0000005d
May 4 23:50:52 HSLL-BD1 kernel: ESI: df5dce95 EDI: 00000004 EBP: 00000000 DS: 007b ES: 007b
May 4 23:50:52 HSLL-BD1 kernel: CR0: 8005003b CR2: 00c24540 CR3: 3487bde0 CR4: 000006f0
May 4 23:50:52 HSLL-BD1 kernel: [<c046f4d3>] do_sync_read+0xb6/0xf1
May 4 23:50:52 HSLL-BD1 kernel: [<c0457353>] get_page_from_freelist+0x96/0x333
May 4 23:50:52 HSLL-BD1 kernel: [<f922d01a>] xout_load_object+0x1a/0x82d [binfmt_xout]
May 4 23:50:52 HSLL-BD1 kernel: [<c045c6ba>] page_address+0x7a/0x81
May 4 23:50:52 HSLL-BD1 kernel: [<c045cc0f>] kunmap_high+0x14/0x7e
May 4 23:50:52 HSLL-BD1 kernel: [<f922d8bc>] xout_load_binary+0xe/0x26 [binfmt_xout]
May 4 23:50:52 HSLL-BD1 kernel: [<c0477cea>] search_binary_handler+0x99/0x219
May 4 23:50:52 HSLL-BD1 kernel: [<c047953b>] do_execve+0x158/0x1f5
May 4 23:50:52 HSLL-BD1 kernel: [<c040321f>] sys_execve+0x2a/0x4a
May 4 23:50:52 HSLL-BD1 kernel: [<c0404eff>] syscall_call+0x7/0xb
May 4 23:50:52 HSLL-BD1 kernel: =======================
*May 4 23:50:56 HSLL-BD1 kernel: BUG: soft lockup - CPU#1 stuck for 10s! [modclusterd:28924]*
May 4 23:50:56 HSLL-BD1 kernel:
May 4 23:50:56 HSLL-BD1 kernel: Pid: 28924, comm: modclusterd
May 4 23:50:56 HSLL-BD1 kernel: EIP: 0060:[<f8d4c1e9>] CPU: 1
May 4 23:50:56 HSLL-BD1 kernel: EIP is at abi_personality+0x49/0x7c [abi_lcall]
May 4 23:50:56 HSLL-BD1 kernel: EFLAGS: 00200202 Tainted: G (2.6.18-92.el5PAE #1)
May 4 23:50:56 HSLL-BD1 kernel: EAX: ffffff1a EBX: df5dcf1c ECX: cba85e9a EDX: fffffff1
May 4 23:50:56 HSLL-BD1 kernel: ESI: cba85e9a EDI: 00000008 EBP: 00000000 DS: 007b ES: 007b
May 4 23:50:56 HSLL-BD1 kernel: CR0: 8005003b CR2: b7f3e000 CR3: 29216640 CR4: 000006f0
May 4 23:50:56 HSLL-BD1 kernel: [<c046f4d3>] do_sync_read+0xb6/0xf1
May 4 23:50:56 HSLL-BD1 kernel: [<c0457353>] get_page_from_freelist+0x96/0x333
May 4 23:50:56 HSLL-BD1 kernel: [<f922d01a>] xout_load_object+0x1a/0x82d [binfmt_xout]
May 4 23:50:56 HSLL-BD1 kernel: [<c045c6ba>] page_address+0x7a/0x81
May 4 23:50:56 HSLL-BD1 kernel: [<c045cc0f>] kunmap_high+0x14/0x7e
May 4 23:50:56 HSLL-BD1 kernel: [<f922d8bc>] xout_load_binary+0xe/0x26 [binfmt_xout]
May 4 23:50:56 HSLL-BD1 kernel: [<c0477cea>] search_binary_handler+0x99/0x219
May 4 23:50:56 HSLL-BD1 kernel: [<c047953b>] do_execve+0x158/0x1f5
May 4 23:50:56 HSLL-BD1 kernel: [<c040321f>] sys_execve+0x2a/0x4a
May 4 23:50:56 HSLL-BD1 kernel: [<c0404eff>] syscall_call+0x7/0xb
May 4 23:50:56 HSLL-BD1 kernel: =======================
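For anyone who wants a quick overview of which processes and CPUs show up in the soft lockup reports, here is a minimal Python sketch. It is not part of the original logs and only assumes that the "BUG: soft lockup" lines look like the ones pasted above. The function name `summarize_lockups` and the `sample` string are illustrative, not taken from any cluster tool.

```python
import re
from collections import Counter

# Matches report lines such as:
#   ... kernel: BUG: soft lockup - CPU#3 stuck for 10s! [fs.sh:28923]
# \s* tolerates a line break between "10s!" and the "[comm:pid]" part.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!\s*"
    r"\[(?P<comm>[^\]:]+):(?P<pid>\d+)\]"
)


def summarize_lockups(text):
    """Count soft lockup reports per (command, pid, cpu) in a log excerpt."""
    counts = Counter()
    for m in LOCKUP_RE.finditer(text):
        key = (m.group("comm"), int(m.group("pid")), int(m.group("cpu")))
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample: only the four report lines from the excerpt above.
    sample = (
        "May 4 23:50:42 HSLL-BD1 kernel: BUG: soft lockup - CPU#3 stuck for 10s! [fs.sh:28923]\n"
        "May 4 23:50:46 HSLL-BD1 kernel: BUG: soft lockup - CPU#1 stuck for 10s! [modclusterd:28924]\n"
        "May 4 23:50:52 HSLL-BD1 kernel: BUG: soft lockup - CPU#3 stuck for 10s! [fs.sh:28923]\n"
        "May 4 23:50:56 HSLL-BD1 kernel: BUG: soft lockup - CPU#1 stuck for 10s! [modclusterd:28924]\n"
    )
    for (comm, pid, cpu), n in sorted(summarize_lockups(sample).items()):
        print(f"{comm} (pid {pid}) on CPU#{cpu}: {n} report(s)")
```

On a full log, the same function could be fed the contents of the messages file to see whether the reports are limited to these two processes.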
--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster