Re: Ceph kernel client - kernel crashes
Sorry your mail fell through the cracks before. I filed http://tracker.newdream.net/issues/2445 to track the ceph-related crashes.

Alex, do you think the first crash is related to ceph at all?

Josh

On 05/10/2012 11:00 AM, Giorgos Kappes wrote:
Sorry for my late response. I reproduced the above bug with the Linux kernel 3.3.4 and without using XEN:

uname -a
Linux node33 3.3.4 #1 SMP Wed May 9 13:00:07 EEST 2012 x86_64 GNU/Linux

The trace is shown below:

[ 763.984023] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[ 763.984177] BUG: unable to handle kernel paging request at 880037bd0800
[ 763.984402] IP: [880037bd0800] 0x880037bd07ff
[ 763.984568] PGD 1806063 PUD 180a063 PMD 800037a001e3
[ 763.984845] Oops: 0011 [#1] SMP
[ 763.985058] CPU 3
[ 763.985124] Modules linked in: cbc netconsole loop snd_pcm snd_timer snd soundcore snd_page_alloc processor tpm_tis i5400_edac tpm edac_core tpm_bios evdev pcspkr i5k_amb rng_core thermal_sys button shpchp pci_hotplug sd_mod crc_t10dif usbhid hid ide_cd_mod cdrom ata_generic uhci_hcd ehci_hcd ata_piix libata piix ide_core usbcore usb_common tg3 libphy mptsas mptscsih mptbase scsi_transport_sas scsi_mod [last unloaded: scsi_wait_scan]
[ 763.988002]
[ 763.988002] Pid: 0, comm: swapper/3 Not tainted 3.3.4 #1 HP ProLiant DL160 G5
[ 763.988002] RIP: 0010:[880037bd0800] [880037bd0800] 0x880037bd07ff
[ 763.988002] RSP: 0018:8800bfcc3e78 EFLAGS: 00010292
[ 763.988002] RAX: 8800b97745b0 RBX: 8800bfcce770 RCX: 880037bd0800
[ 763.988002] RDX: 880037bd1600 RSI: b9b6a040 RDI: 880037bd1600
[ 763.988002] RBP: 81820080 R08: 8800b9dd0b00 R09: 00018020001c
[ 763.988002] R10: 8020001c R11: 816075c0 R12: 8800bfcce7a0
[ 763.988002] R13: 8800b97745b0 R14: 0003 R15: 000a
[ 763.988002] FS: () GS:8800bfcc() knlGS:
[ 763.988002] CS: 0010 DS: ES: CR0: 8005003b
[ 763.988002] CR2: 880037bd0800 CR3: b895b000 CR4: 06e0
[ 763.988002] DR0: DR1: DR2:
[ 763.988002] DR3: DR6: 0ff0 DR7: 0400
[ 763.988002] Process swapper/3 (pid: 0, threadinfo 8800bbae, task 8800bbad8000)
[ 763.988002] Stack:
[ 763.988002] 8109b44d 8800bbacd820 8800b97745b0 8800bbae0010
[ 763.988002] 8800bbad8000 8800bfcc3ea0 0048 8800bbae1fd8
[ 763.988002] 0100 0001 0009 8800bbae1fd8
[ 763.988002] Call Trace:
[ 763.988002] IRQ
[ 763.988002] [8109b44d] ? __rcu_process_callbacks+0x1e9/0x335
[ 763.988002] [8109b8fb] ? rcu_process_callbacks+0x2c/0x56
[ 763.988002] [8103e3b1] ? __do_softirq+0xc4/0x1a0
[ 763.988002] [8102515b] ? lapic_next_event+0x18/0x1d
[ 763.988002] [815d3b1c] ? call_softirq+0x1c/0x30
[ 763.988002] [8100fba3] ? do_softirq+0x3f/0x79
[ 763.988002] [8103e186] ? irq_exit+0x44/0xb1
[ 763.988002] [81025c61] ? smp_apic_timer_interrupt+0x85/0x93
[ 763.988002] [815d311e] ? apic_timer_interrupt+0x6e/0x80
[ 763.988002] EOI
[ 763.988002] [810145e1] ? native_sched_clock+0x28/0x33
[ 763.988002] [810152f6] ? mwait_idle+0x8c/0xbc
[ 763.988002] [810152ae] ? mwait_idle+0x44/0xbc
[ 763.988002] [8100de94] ? cpu_idle+0xb9/0xf7
[ 763.988002] [815c43c6] ? start_secondary+0x270/0x275
[ 763.988002] Code: 00 00 00 00 04 8a b8 00 88 ff ff 00 04 8a b8 00 88 ff ff 00 03 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 16 bd 37 00 88 ff ff 40 ab cd bf 00 88 ff ff 20 15 42 b9 00
[ 763.988002] RIP [880037bd0800] 0x880037bd07ff
[ 763.988002] RSP 8800bfcc3e78
[ 763.988002] CR2: 880037bd0800
[ 763.988002] ---[ end trace 614049dc850267ac ]---
[ 763.988002] Kernel panic - not syncing: Fatal exception in interrupt
[ 763.997833] [ cut here ]
[ 763.997936] WARNING: at arch/x86/kernel/smp.c:120 update_process_times+0x57/0x63()
[ 763.998072] Hardware name: ProLiant DL160 G5
[ 763.998171] Modules linked in: cbc netconsole loop snd_pcm snd_timer snd soundcore snd_page_alloc processor tpm_tis i5400_edac tpm edac_core tpm_bios evdev pcspkr i5k_amb rng_core thermal_sys button shpchp pci_hotplug sd_mod crc_t10dif usbhid hid ide_cd_mod cdrom ata_generic uhci_hcd ehci_hcd ata_piix libata piix ide_core usbcore usb_common tg3 libphy mptsas mptscsih mptbase scsi_transport_sas scsi_mod [last unloaded: scsi_wait_scan]
[ 764.001205] Pid: 0, comm: swapper/3 Tainted: G D 3.3.4 #1
[ 764.001311] Call Trace:
[ 764.001404] IRQ [81038bb0] ? warn_slowpath_common+0x78/0x8c
[ 764.001573] [81044937] ? update_process_times+0x57/0x63
[
Re: Ceph kernel client - kernel crashes
Sorry for my late response. I reproduced the above bug with the Linux kernel 3.3.4 and without using XEN:

uname -a
Linux node33 3.3.4 #1 SMP Wed May 9 13:00:07 EEST 2012 x86_64 GNU/Linux

The trace is shown below:

[ 763.984023] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[ 763.984177] BUG: unable to handle kernel paging request at 880037bd0800
[ 763.984402] IP: [880037bd0800] 0x880037bd07ff
[ 763.984568] PGD 1806063 PUD 180a063 PMD 800037a001e3
[ 763.984845] Oops: 0011 [#1] SMP
[ 763.985058] CPU 3
[ 763.985124] Modules linked in: cbc netconsole loop snd_pcm snd_timer snd soundcore snd_page_alloc processor tpm_tis i5400_edac tpm edac_core tpm_bios evdev pcspkr i5k_amb rng_core thermal_sys button shpchp pci_hotplug sd_mod crc_t10dif usbhid hid ide_cd_mod cdrom ata_generic uhci_hcd ehci_hcd ata_piix libata piix ide_core usbcore usb_common tg3 libphy mptsas mptscsih mptbase scsi_transport_sas scsi_mod [last unloaded: scsi_wait_scan]
[ 763.988002]
[ 763.988002] Pid: 0, comm: swapper/3 Not tainted 3.3.4 #1 HP ProLiant DL160 G5
[ 763.988002] RIP: 0010:[880037bd0800] [880037bd0800] 0x880037bd07ff
[ 763.988002] RSP: 0018:8800bfcc3e78 EFLAGS: 00010292
[ 763.988002] RAX: 8800b97745b0 RBX: 8800bfcce770 RCX: 880037bd0800
[ 763.988002] RDX: 880037bd1600 RSI: b9b6a040 RDI: 880037bd1600
[ 763.988002] RBP: 81820080 R08: 8800b9dd0b00 R09: 00018020001c
[ 763.988002] R10: 8020001c R11: 816075c0 R12: 8800bfcce7a0
[ 763.988002] R13: 8800b97745b0 R14: 0003 R15: 000a
[ 763.988002] FS: () GS:8800bfcc() knlGS:
[ 763.988002] CS: 0010 DS: ES: CR0: 8005003b
[ 763.988002] CR2: 880037bd0800 CR3: b895b000 CR4: 06e0
[ 763.988002] DR0: DR1: DR2:
[ 763.988002] DR3: DR6: 0ff0 DR7: 0400
[ 763.988002] Process swapper/3 (pid: 0, threadinfo 8800bbae, task 8800bbad8000)
[ 763.988002] Stack:
[ 763.988002] 8109b44d 8800bbacd820 8800b97745b0 8800bbae0010
[ 763.988002] 8800bbad8000 8800bfcc3ea0 0048 8800bbae1fd8
[ 763.988002] 0100 0001 0009 8800bbae1fd8
[ 763.988002] Call Trace:
[ 763.988002] IRQ
[ 763.988002] [8109b44d] ? __rcu_process_callbacks+0x1e9/0x335
[ 763.988002] [8109b8fb] ? rcu_process_callbacks+0x2c/0x56
[ 763.988002] [8103e3b1] ? __do_softirq+0xc4/0x1a0
[ 763.988002] [8102515b] ? lapic_next_event+0x18/0x1d
[ 763.988002] [815d3b1c] ? call_softirq+0x1c/0x30
[ 763.988002] [8100fba3] ? do_softirq+0x3f/0x79
[ 763.988002] [8103e186] ? irq_exit+0x44/0xb1
[ 763.988002] [81025c61] ? smp_apic_timer_interrupt+0x85/0x93
[ 763.988002] [815d311e] ? apic_timer_interrupt+0x6e/0x80
[ 763.988002] EOI
[ 763.988002] [810145e1] ? native_sched_clock+0x28/0x33
[ 763.988002] [810152f6] ? mwait_idle+0x8c/0xbc
[ 763.988002] [810152ae] ? mwait_idle+0x44/0xbc
[ 763.988002] [8100de94] ? cpu_idle+0xb9/0xf7
[ 763.988002] [815c43c6] ? start_secondary+0x270/0x275
[ 763.988002] Code: 00 00 00 00 04 8a b8 00 88 ff ff 00 04 8a b8 00 88 ff ff 00 03 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 16 bd 37 00 88 ff ff 40 ab cd bf 00 88 ff ff 20 15 42 b9 00
[ 763.988002] RIP [880037bd0800] 0x880037bd07ff
[ 763.988002] RSP 8800bfcc3e78
[ 763.988002] CR2: 880037bd0800
[ 763.988002] ---[ end trace 614049dc850267ac ]---
[ 763.988002] Kernel panic - not syncing: Fatal exception in interrupt
[ 763.997833] [ cut here ]
[ 763.997936] WARNING: at arch/x86/kernel/smp.c:120 update_process_times+0x57/0x63()
[ 763.998072] Hardware name: ProLiant DL160 G5
[ 763.998171] Modules linked in: cbc netconsole loop snd_pcm snd_timer snd soundcore snd_page_alloc processor tpm_tis i5400_edac tpm edac_core tpm_bios evdev pcspkr i5k_amb rng_core thermal_sys button shpchp pci_hotplug sd_mod crc_t10dif usbhid hid ide_cd_mod cdrom ata_generic uhci_hcd ehci_hcd ata_piix libata piix ide_core usbcore usb_common tg3 libphy mptsas mptscsih mptbase scsi_transport_sas scsi_mod [last unloaded: scsi_wait_scan]
[ 764.001205] Pid: 0, comm: swapper/3 Tainted: G D 3.3.4 #1
[ 764.001311] Call Trace:
[ 764.001404] IRQ [81038bb0] ? warn_slowpath_common+0x78/0x8c
[ 764.001573] [81044937] ? update_process_times+0x57/0x63
[ 764.001681] [81075dbe] ? tick_sched_timer+0x65/0x8b
[ 764.001788] [810561bd] ? __run_hrtimer+0xb2/0x13d
[ 764.001832] [81013ca9] ? read_tsc+0x5/0x16
[ 764.001832] [81056482] ? hrtimer_interrupt+0xd8/0x1a7
[
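
For anyone comparing this trace with the Xen one from the first mail, a small sketch of how the call-trace entries could be pulled out of the pasted log text. The helper name `parse_frames` and the regular expression are my own, and the pattern assumes the shortened hex addresses exactly as they appear in this thread.

```python
import re

# Matches entries such as "[8109b44d] ? __rcu_process_callbacks+0x1e9/0x335".
# Timestamps like "[ 763.988002]" do not match because they contain a space and a dot.
FRAME_RE = re.compile(
    r"\[([0-9a-f]+)\]\s+\?\s+([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(text):
    """Return (address, symbol, offset, size) tuples in the order they appear."""
    return [
        (addr, sym, int(off, 16), int(size, 16))
        for addr, sym, off, size in FRAME_RE.findall(text)
    ]


# Two entries copied from the trace above, still flattened onto one line.
sample = (
    "[ 763.988002] [8109b44d] ? __rcu_process_callbacks+0x1e9/0x335 "
    "[ 763.988002] [8109b8fb] ? rcu_process_callbacks+0x2c/0x56"
)

for addr, sym, off, size in parse_frames(sample):
    print(f"{sym} +{off:#x} of {size:#x} at {addr}")
```

Because the pattern only looks at the `[addr] ? symbol+0xoff/0xsize` shape, it works whether the log has been split into lines or not.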
Ceph kernel client - kernel crashes
Hi,

When I run debootstrap to install a base Debian Squeeze system on a Ceph directory, the client's kernel crashes with the following message:

I: Retrieving Release
I: Validating Packages
I: Resolving dependencies of required packages...
I: Resolving dependencies of base packages...
I: Found additional required dependencies: insserv libbz2-1.0 libdb4.8 libslang2
I: Found additional base dependencies: libnfnetlink0 libsqlite3-0
I: Checking component main on http://ftp.us.debian.org/debian...
I: Validating libacl1
...
I: Extracting xz-utils...
I: Extracting zlib1g...
W: Failure trying to run: chroot /mnt/debian mount -t proc proc /proc

[ 759.776151] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[ 759.776169] BUG: unable to handle kernel paging request at e8fe4ab0
[ 759.776182] IP: [e8fe4ab0] 0xe8fe4aaf
[ 759.776195] PGD c42b067 PUD c42c067 PMD c42d067 PTE 80100c445067
[ 759.776209] Oops: 0011 [#1] SMP
[ 759.776219] CPU 0
[ 759.776224] Modules linked in: pcspkr [last unloaded: scsi_wait_scan]
[ 759.776237]
[ 759.776244] Pid: 0, comm: swapper/0 Tainted: G W 3.2.11 #2
[ 759.776255] RIP: e030:[e8fe4ab0] [e8fe4ab0] 0xe8fe4aaf
[ 759.776267] RSP: e02b:88001ffaae98 EFLAGS: 00010296
[ 759.776274] RAX: 880012d7a900 RBX: 88001ffb5960 RCX: e8fe4ab0
[ 759.776302] RDX: 88000d1a9b00 RSI: 000f RDI: 88000d1a9b00
[ 759.776309] RBP: 81c1fa80 R08: 88001eb74000 R09: 0001801f
[ 759.776317] R10: 801f R11: 818055f5 R12: 88001ffb5990
[ 759.776324] R13: 88000c5ea880 R14: 0001 R15: 000a
[ 759.776334] FS: 7f21095a4740() GS:88001ffa7000() knlGS:
[ 759.776342] CS: e033 DS: ES: CR0: 8005003b
[ 759.776349] CR2: e8fe4ab0 CR3: 12e28000 CR4: 2660
[ 759.776356] DR0: DR1: DR2:
[ 759.776364] DR3: DR6: 0ff0 DR7: 0400
[ 759.776372] Process swapper/0 (pid: 0, threadinfo 81c0, task 81c0d020)
[ 759.776379] Stack:
[ 759.776384] 81099405 0001 880012d7a900 88001ffaaeb0
[ 759.776397] 0048 81c01fd8 0100 0001
[ 759.776409] 0009 81c01fd8 81099898 81c01fd8
[ 759.776422] Call Trace:
[ 759.776427] IRQ
[ 759.776438] [81099405] ? __rcu_process_callbacks+0x1c7/0x2f8
[ 759.776447] [81099898] ? rcu_process_callbacks+0x2c/0x56
[ 759.776457] [8104cb72] ? __do_softirq+0xc4/0x1a0
[ 759.776465] [81096875] ? handle_percpu_irq+0x3d/0x54
[ 759.776475] [8150efb6] ? __xen_evtchn_do_upcall+0x1c7/0x205
[ 759.776484] [8176e52c] ? call_softirq+0x1c/0x30
[ 759.776493] [8100fa47] ? do_softirq+0x3f/0x79
[ 759.776501] [8104c942] ? irq_exit+0x44/0xb5
[ 759.776508] [8150ffc6] ? xen_evtchn_do_upcall+0x27/0x32
[ 759.776516] [8176e57e] ? xen_do_hypervisor_callback+0x1e/0x30
[ 759.776523] EOI
[ 759.776531] [81006f3f] ? xen_restore_fl_direct_reloc+0x4/0x4
[ 759.776539] [810013aa] ? hypercall_page+0x3aa/0x1000
[ 759.776547] [810013aa] ? hypercall_page+0x3aa/0x1000
[ 759.776556] [8163969b] ? cpuidle_idle_call+0x16/0x1af
[ 759.776564] [810068dc] ? xen_safe_halt+0xc/0x15
[ 759.776572] [810150a6] ? default_idle+0x4b/0x84
[ 759.776580] [8100ddf6] ? cpu_idle+0xb9/0xef
[ 759.776588] [81cf7bff] ? start_kernel+0x395/0x3a0
[ 759.776596] [81cfa536] ? xen_start_kernel+0x593/0x598
[ 759.776602] Code: e8 ff ff 80 4a fe ff ff e8 ff ff 0b 00 00 00 01 00 00 00 fa ff ff ff fa ff ff ff 06 00 00 00 02 00 00 00 05 00 00 00 cc cc cc cc 00 9b 1a 0d 00 88 ff ff 00 0f b7 1e 00 88 ff ff 01 00 00 00 00
[ 759.776699] RIP [e8fe4ab0] 0xe8fe4aaf
[ 759.776712] RSP 88001ffaae98
[ 759.776717] CR2: e8fe4ab0
[ 759.776725] ---[ end trace 36924001333caa12 ]---
[ 759.776731] Kernel panic - not syncing: Fatal exception in interrupt
[ 759.776739] Pid: 0, comm: swapper/0 Tainted: G D W 3.2.11 #2
[ 759.776745] Call Trace:
[ 759.776749] IRQ [81764003] ? panic+0x92/0x1a0
[ 759.776771] [810478c0] ? kmsg_dump+0x41/0xdd
[ 759.776779] [81766cc1] ? oops_end+0xa9/0xb6
[ 759.776788] [8102ec7d] ? no_context+0x1ff/0x20c
[ 759.776795] [81768d9f] ? do_page_fault+0x1ad/0x34c
[ 759.776805] [8106dfb3] ? tick_nohz_handler+0xcb/0xcb
[ 759.776813] [8102c12a] ? pvclock_clocksource_read+0x46/0xb4
[ 759.776821] [81006eb3] ? xen_vcpuop_set_next_event+0x4d/0x61
[ 759.776829] [8106cdcc] ? clockevents_program_event+0x99/0xb8
[ 759.776837] [817663b5] ? page_fault+0x25/0x30
[ 759.776845]
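
Both the Xen trace and the bare-metal trace report "Oops: 0011". Assuming that number is the raw x86 page-fault error code printed in hex, the sketch below decodes its bits. The bit meanings follow the architectural page-fault error code layout; the table and the function name `decode_oops_code` are only illustrative.

```python
# x86 page-fault error code bits: (mask, meaning if set, meaning if clear).
PF_BITS = [
    (0x01, "protection violation (page was present)", "page not present"),
    (0x02, "write access", "not a write access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in a page-table entry", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(code_text):
    """Decode the hex value printed after 'Oops:' into a list of descriptions."""
    code = int(code_text, 16)
    described = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            described.append(if_set)
        elif if_clear is not None:
            described.append(if_clear)
    return described


for item in decode_oops_code("0011"):
    print(item)
```

Under that assumption, 0011 reads as a kernel-mode instruction fetch from a page that is present but not executable, which agrees with the "kernel tried to execute NX-protected page" line at the top of each trace.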