Re: [Ocfs2-users] remove locks? or copy the whole file?
On Tue, Jul 03, 2012 at 11:46:08PM -0700, Aleks Clark wrote:
> any ideas how long this is going to take on a 2tb fs with ~400gb used?
> going on 10 hours of downtime, and it's been doing Pass 0a for the
> past 10 minutes. also, should all the nodes be up (but unmounted) for
> this?

I don't have any good idea how long it will take. That's a lot of
filesystem. However, it should take less than 10h! You don't need any
other nodes up or down. All that matters is that there are no mounts.

Joel
Re: [Ocfs2-users] remove locks? or copy the whole file?
any ideas how long this is going to take on a 2tb fs with ~400gb used?
going on 10 hours of downtime, and it's been doing Pass 0a for the past
10 minutes. also, should all the nodes be up (but unmounted) for this?

On Tue, Jul 3, 2012 at 11:42 PM, Joel Becker wrote:
> Because it's unsafe to do any I/O at that point. We'd rather you have
> to reboot than scribble more bad data on your disk!
>
> Joel
Re: [Ocfs2-users] remove locks? or copy the whole file?
Because it's unsafe to do any I/O at that point. We'd rather you have
to reboot than scribble more bad data on your disk!

Joel

On Tue, Jul 03, 2012 at 11:35:32PM -0700, Aleks Clark wrote:
> it said 'clean' and exited. Working on bringing the cluster down. Is
> there a reason why, after the kernel panics, ocfs2 makes all i/o
> block? I can't even unmount the filesystem on any node, I have to
> actually reboot it.
Re: [Ocfs2-users] remove locks? or copy the whole file?
it said 'clean' and exited. Working on bringing the cluster down. Is
there a reason why, after the kernel panics, ocfs2 makes all i/o block?
I can't even unmount the filesystem on any node, I have to actually
reboot it.

On Tue, Jul 3, 2012 at 11:17 PM, Joel Becker wrote:
> On Tue, Jul 03, 2012 at 06:57:53PM -0700, Aleks Clark wrote:
>> well, by 'clean', it said it was clean. the locks persisted though. I
>> seriously can't believe there's no way to force lock removal. is it
>> just a file somewhere I can delete?
>
> There's no lock hanging around past a full restart. This looks like
> on-disk corruption. Did fsck.ocfs2 say that it ran multiple passes, or
> just say "clean" and exit? Please try fsck.ocfs2 with the '-f' flag
> (obviously with the filesystem not mounted on ANY node).
>
> Joel
Re: [Ocfs2-users] remove locks? or copy the whole file?
On Tue, Jul 03, 2012 at 06:57:53PM -0700, Aleks Clark wrote:
> well, by 'clean', it said it was clean. the locks persisted though. I
> seriously can't believe there's no way to force lock removal. is it
> just a file somewhere I can delete?

There's no lock hanging around past a full restart. This looks like
on-disk corruption. Did fsck.ocfs2 say that it ran multiple passes, or
just say "clean" and exit? Please try fsck.ocfs2 with the '-f' flag
(obviously with the filesystem not mounted on ANY node).

Joel
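A minimal sketch of the forced check suggested in this message, not a
verified procedure. The device path /dev/sdX1 is a placeholder (the
thread never names the device), and the `run` helper and `RUN` variable
are made up for this example so that the script only prints the commands
unless RUN=1 is set. `mounted.ocfs2 -f` is used here to list the nodes
that appear to have the volume mounted; its output can be stale, so
treat it as a hint and confirm on each node.

```shell
#!/bin/sh
# Sketch only: forced OCFS2 check with the volume unmounted on every node.
# /dev/sdX1 is a placeholder, not a device taken from the thread.
DEVICE="${DEVICE:-/dev/sdX1}"

# Hypothetical helper: print each command, and execute it only when RUN=1,
# so the default behaviour is a harmless dry run.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# 1. List the nodes that seem to have the volume mounted; none should be listed.
run mounted.ocfs2 -f "$DEVICE"

# 2. Force a full check even though the filesystem is marked clean.
run fsck.ocfs2 -f "$DEVICE"
```

Run as-is, this prints the two commands without touching the disk.
Setting RUN=1 and DEVICE to the real volume would execute them.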
Re: [Ocfs2-users] remove locks? or copy the whole file?
grr. the copy I 'recovered' using dd to copy instead of cp is totally
munged. Would really appreciate some pointers on fixing the ocfs2 issue,
I've got data backups but not looking forward to rebuilding the whole
damned VM :/

On Tue, Jul 3, 2012 at 6:57 PM, Aleks Clark wrote:
> well, by 'clean', it said it was clean. the locks persisted though. I
> seriously can't believe there's no way to force lock removal. is it
> just a file somewhere I can delete?
Re: [Ocfs2-users] remove locks? or copy the whole file?
well, by 'clean', it said it was clean. the locks persisted though. I
seriously can't believe there's no way to force lock removal. is it just
a file somewhere I can delete?

On Tue, Jul 3, 2012 at 6:56 PM, Aleks Clark wrote:
> yep, tried that, returned clean.
Re: [Ocfs2-users] remove locks? or copy the whole file?
yep, tried that, returned clean.

On Tue, Jul 3, 2012 at 6:25 PM, herbert van.den.bergh wrote:
> One more thing: did you try running fsck.ocfs2 on it?
>
> Thanks,
> Herbert.
Re: [Ocfs2-users] remove locks? or copy the whole file?
Hmm, doesn't mean much to me, but maybe to someone else on the list. But I bet their first suggestion will be to try a recent kernel...

Thanks,
Herbert.

On 7/3/2012 6:19 PM, Aleks Clark wrote:
> Nick, I don't think so, it's a 2tb partition with only 300gb used.
>
> Herb,
>
> [kernel oops log snipped]
Re: [Ocfs2-users] remove locks? or copy the whole file?
One more thing: did you try running fsck.ocfs2 on it?

Thanks,
Herbert.

On 7/3/2012 6:23 PM, herbert van.den.bergh wrote:
> Hmm doesn't mean much to me, but maybe to someone else on the list. But
> I bet their first suggestion will be to try a recent kernel...
>
> Thanks,
> Herbert.
Re: [Ocfs2-users] remove locks? or copy the whole file?
Nick, I don't think so, it's a 2tb partition with only 300gb used.

Herb,

Jul 3 14:47:26 castor kernel: [3488036.578659] (25326,0):ocfs2_rotate_tree_right:2483 ERROR: bug expression: path_leaf_bh(left_path) == path_leaf_bh(right_path)
Jul 3 14:47:26 castor kernel: [3488036.578714] (25326,0):ocfs2_rotate_tree_right:2483 ERROR: Owner 18319883: error during insert of 15761664 (left path cpos 20725762) results in two identical paths ending at 395267
Jul 3 14:47:26 castor kernel: [3488036.578800] [ cut here ]
Jul 3 14:47:26 castor kernel: [3488036.578826] kernel BUG at /build/buildd-linux-2.6_2.6.32-38-amd64-bk66e4/linux-2.6-2.6.32/debian/build/source_amd64_none/fs/ocfs2/alloc.c:2483!
Jul 3 14:47:26 castor kernel: [3488036.578881] invalid opcode: [#1] SMP
Jul 3 14:47:26 castor kernel: [3488036.578909] last sysfs file: /sys/devices/virtual/net/lo/operstate
Jul 3 14:47:26 castor kernel: [3488036.578937] CPU 0
Jul 3 14:47:26 castor kernel: [3488036.578960] Modules linked in: drbd tun ocfs2 jbd2 quota_tree raid0 ip6table_filter ip6_tables iptable_filter ip_tables sha1_generic ebtable_nat ebtables hmac x_tables lru_cache cn kvm_intel kvm ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bridge stp loop md_mod snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801 i2c_core pcspkr processor button psmouse joydev evdev serio_raw usbhid hid ext3 jbd mbcache dm_mod sd_mod crc_t10dif ahci ehci_hcd libata usbcore scsi_mod e1000e nls_base thermal thermal_sys [last unloaded: drbd]
Jul 3 14:47:26 castor kernel: [3488036.579279] Pid: 25326, comm: kvm Not tainted 2.6.32-5-amd64 #1 X9SCL/X9SCM
Jul 3 14:47:26 castor kernel: [3488036.579309] RIP: 0010:[] [] ocfs2_do_insert_extent+0x5dc/0x1aaf [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.579363] RSP: 0018:880014839688 EFLAGS: 00010292
Jul 3 14:47:26 castor kernel: [3488036.579390] RAX: 00bf RBX: 00060803 RCX: 1806
Jul 3 14:47:26 castor kernel: [3488036.579435] RDX: RSI: 0096 RDI: 0246
Jul 3 14:47:26 castor kernel: [3488036.579479] RBP: 8800148398a8 R08: 000209d0 R09: 000a
Jul 3 14:47:26 castor kernel: [3488036.579524] R10: R11: 0001 R12: 013c4002
Jul 3 14:47:26 castor kernel: [3488036.579568] R13: 88002a1e4030 R14: 0001 R15: 88023c153c60
Jul 3 14:47:26 castor kernel: [3488036.579613] FS: 7f0cfef83700() GS:880008a0() knlGS:
Jul 3 14:47:26 castor kernel: [3488036.579659] CS: 0010 DS: 002b ES: 002b CR0: 8005003b
Jul 3 14:47:26 castor kernel: [3488036.579687] CR2: 7f0d25dbf000 CR3: 00023ccb6000 CR4: 000426e0
Jul 3 14:47:26 castor kernel: [3488036.579732] DR0: DR1: DR2:
Jul 3 14:47:26 castor kernel: [3488036.579776] DR3: DR6: 0ff0 DR7: 0400
Jul 3 14:47:26 castor kernel: [3488036.579821] Process kvm (pid: 25326, threadinfo 880014838000, task 88023b999c40)
Jul 3 14:47:26 castor kernel: [3488036.579867] Stack:
Jul 3 14:47:26 castor kernel: [3488036.579887] 00f08100 013c4002 00060803 880014839718
Jul 3 14:47:26 castor kernel: [3488036.579923] <0> 880232abde80 88023b999c40 88023b999c40 8800148397a8
Jul 3 14:47:26 castor kernel: [3488036.579977] <0> 8800148397c8 8800148398a8 88023d8027f8 00f08100
Jul 3 14:47:26 castor kernel: [3488036.580047] Call Trace:
Jul 3 14:47:26 castor kernel: [3488036.580074] [] ? ocfs2_insert_extent+0x5fb/0x6e6 [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580108] [] ? __ocfs2_journal_access+0x261/0x32a [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580156] [] ? ocfs2_add_clusters_in_btree+0x35f/0x53c [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580205] [] ? ocfs2_add_inode_data+0x62/0x6e [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580239] [] ? ocfs2_journal_access_di+0x0/0xf [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580272] [] ? ocfs2_write_begin_nolock+0x1376/0x1de2 [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580321] [] ? ocfs2_set_buffer_uptodate+0x15/0x60e [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580370] [] ? ocfs2_validate_inode_block+0x0/0x1ab [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580418] [] ? ocfs2_journal_access_di+0x0/0xf [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580451] [] ? ocfs2_write_begin+0x116/0x1d2 [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580484] [] ? generic_file_buffered_write+0x118/0x278
Jul 3 14:47:26 castor kernel: [3488036.580515] [] ? __generic_file_aio_write+0x25f/0x293
Jul 3 14:47:26 castor kernel: [3488036.580548] [] ? ocfs2_prepare_inode_for_write+0x683/0x69c [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580597] [] ? ocfs2_rw_lock+0x16d/0x239 [ocfs2]
Jul 3 14:47:26 castor kernel: [3488036.580628] [] ?
Re: [Ocfs2-users] remove locks? or copy the whole file?
On 07/03/2012 04:12 PM, Aleks Clark wrote:
> Ok, so I've got this ocfs2 cluster that's been running for a long
> while, hosting my VMs. All of a sudden I'm getting kernel panics
> originating from ocfs2 when trying to spin up one particular file.
> I've determined that there are several locks on this file, one of them
> exclusive. I restarted the whole cluster to try to get rid of it, but
> no go. I also tried to copy the file, both on and off of the cluster,
> but only half of it copied. Any way to get around either issue would
> be appreciated.

The panic stack may be helpful, and any messages that the kernel spit out before it.

Thanks,
Herbert.

___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-users
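For anyone gathering what is asked for here (the panic stack plus whatever the kernel logged just before it), the following is a minimal Python sketch, not part of any ocfs2 tooling. It assumes syslog-style lines of the form "... kernel: [timestamp] message", as in the log posted in this thread. The function name `summarize_oops`, both regular expressions, and the `SAMPLE` list are illustrative assumptions.

```python
import re

# Assumed line shape: "<date> <host> kernel: [<seconds>.<micros>] <message>"
LINE_RE = re.compile(r"kernel: \[\s*(?P<ts>\d+\.\d+)\]\s*(?P<msg>.*)")
# Call-trace entries look like "ocfs2_insert_extent+0x5fb/0x6e6 [ocfs2]"
FRAME_RE = re.compile(r"(?P<func>[A-Za-z_]\w*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(lines):
    """Split a kernel log into (messages before the BUG line, BUG line, trace functions)."""
    before, bug, frames = [], None, []
    in_trace = False
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a kernel log line
        msg = match.group("msg")
        if bug is None:
            # Everything up to the first "kernel BUG at" is context worth posting.
            if "kernel BUG at" in msg:
                bug = msg
            else:
                before.append(msg)
            continue
        if msg.startswith("Call Trace:"):
            in_trace = True
            continue
        if in_trace:
            frame = FRAME_RE.search(msg)
            if frame:
                frames.append(frame.group("func"))
    return before, bug, frames


# A few lines copied from the log posted in this thread.
SAMPLE = [
    "Jul 3 14:47:26 castor kernel: [3488036.578659] (25326,0):ocfs2_rotate_tree_right:2483 "
    "ERROR: bug expression: path_leaf_bh(left_path) == path_leaf_bh(right_path)",
    "Jul 3 14:47:26 castor kernel: [3488036.578826] kernel BUG at "
    "/build/buildd-linux-2.6_2.6.32-38-amd64-bk66e4/linux-2.6-2.6.32/debian/build/"
    "source_amd64_none/fs/ocfs2/alloc.c:2483!",
    "Jul 3 14:47:26 castor kernel: [3488036.580047] Call Trace:",
    "Jul 3 14:47:26 castor kernel: [3488036.580074] [] ? ocfs2_insert_extent+0x5fb/0x6e6 [ocfs2]",
    "Jul 3 14:47:26 castor kernel: [3488036.580108] [] ? __ocfs2_journal_access+0x261/0x32a [ocfs2]",
]

before, bug, frames = summarize_oops(SAMPLE)
print("messages before BUG:", len(before))
print("BUG line:", bug)
print("call trace:", " <- ".join(frames))
```

To run it against a real log, pass the lines of the saved syslog file instead of `SAMPLE`. The register dump between the BUG line and "Call Trace:" is skipped on purpose, since only the function names are collected.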
[Ocfs2-users] remove locks? or copy the whole file?
Ok, so I've got this ocfs2 cluster that's been running for a long while, hosting my VMs. All of a sudden I'm getting kernel panics originating from ocfs2 when trying to spin up one particular file. I've determined that there are several locks on this file, one of them exclusive. I restarted the whole cluster to try to get rid of it, but no go. I also tried to copy the file, both on and off of the cluster, but only half of it copied. Any way to get around either issue would be appreciated.

Regards,
--
Aleks Clark