Kernel BUG: __tree_mod_log_rewind
I can get btrfs to throw a kernel bug easily by running btrfs fi defrag on some files in 3.9.0:

May 7 01:57:33 caper kernel: [0.00] Linux version 3.9.0-030900-generic (apw@gomeisa) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201304291257 SMP Mon Apr 29 16:58:15 UTC 2013
...
May 7 02:09:21 caper kernel: [ 726.745485] [ cut here ]
May 7 02:09:21 caper kernel: [ 726.745567] Kernel BUG at a00ea503 [verbose debug info unavailable]
May 7 02:09:21 caper kernel: [ 726.745643] invalid opcode: [#1] SMP
May 7 02:09:21 caper kernel: [ 726.745807] Modules linked in: snd_hrtimer zram(C) bnep rfcomm bluetooth parport_pc ppdev nfsd nfs_acl auth_rpcgss nfs fscache binfmt_misc lockd sunrpc snd_hda_codec_hdmi joydev hid_gaff ff_memless snd_usb_audio snd_usbmidi_lib uvcvideo snd_seq_midi videobuf2_core videodev snd_rawmidi videobuf2_vmalloc videobuf2_memops snd_seq_midi_event dm_multipath snd_hda_codec_realtek snd_seq scsi_dh kvm_amd snd_seq_device snd_hda_intel kvm snd_hda_codec snd_hwdep microcode snd_pcm snd_timer k10temp edac_core edac_mce_amd serio_raw snd sp5100_tco i2c_piix4 soundcore snd_page_alloc mac_hid wmi it87 hwmon_vid lp parport xfs btrfs raid6_pq zlib_deflate xor libcrc32c ses enclosure dm_crypt hid_generic usbhid hid usb_storage firewire_ohci firewire_core crc_itu_t ahci pata_acpi pata_atiixp libahci r8169
May 7 02:09:21 caper kernel: [ 726.749841] CPU 3
May 7 02:09:21 caper kernel: [ 726.749900] Pid: 1703, comm: btrfs-endio-wri Tainted: G C 3.9.0-030900-generic #201304291257 Gigabyte Technology Co., Ltd. GA-MA790GP-UD4H/GA-MA790GP-UD4H
May 7 02:09:21 caper kernel: [ 726.750069] RIP: 0010:[] [] __tree_mod_log_rewind+0x253/0x260 [btrfs]
May 7 02:09:21 caper kernel: [ 726.750244] RSP: 0018:88011a2e1838 EFLAGS: 00010293
May 7 02:09:21 caper kernel: [ 726.750316] RAX: RBX: 88004b2798f0 RCX: 88011a2e17d8
May 7 02:09:21 caper kernel: [ 726.750390] RDX: 13f3a75c RSI: 05e8 RDI: 8800172ea880
May 7 02:09:21 caper kernel: [ 726.750463] RBP: 88011a2e1868 R08: 1000 R09: 88011a2e17e8
May 7 02:09:21 caper kernel: [ 726.750536] R10: 000103db R11: R12: 880098cf4d80
May 7 02:09:21 caper kernel: [ 726.750609] R13: 002b R14: 8800172ea700 R15: 0009c7a7
May 7 02:09:21 caper kernel: [ 726.750683] FS: 7fa2bc594700() GS:88014fd8() knlGS:
May 7 02:09:21 caper kernel: [ 726.750770] CS: 0010 DS: ES: CR0: 8005003b
May 7 02:09:21 caper kernel: [ 726.750841] CR2: fd82c000 CR3: 00014654d000 CR4: 07e0
May 7 02:09:21 caper kernel: [ 726.750914] DR0: DR1: DR2:
May 7 02:09:21 caper kernel: [ 726.750987] DR3: DR6: 0ff0 DR7: 0400
May 7 02:09:21 caper kernel: [ 726.751061] Process btrfs-endio-wri (pid: 1703, threadinfo 88011a2e, task 88004a6b2ea0)
May 7 02:09:21 caper kernel: [ 726.751147] Stack:
May 7 02:09:21 caper kernel: [ 726.751212] 88011a2e1858 880104c8de30 0009c7a7 8800
May 7 02:09:21 caper kernel: [ 726.751488] a8598000 880148278000 88011a2e18b8 a00ea5ef
May 7 02:09:21 caper kernel: [ 726.751763] 880098cf4d80 88004b2798f0 8800338d3000 0001
May 7 02:09:21 caper kernel: [ 726.752038] Call Trace:
May 7 02:09:21 caper kernel: [ 726.752135] [] tree_mod_log_rewind+0xdf/0x240 [btrfs]
May 7 02:09:21 caper kernel: [ 726.752237] [] btrfs_search_old_slot+0x4cb/0x670 [btrfs]
May 7 02:09:21 caper kernel: [ 726.752351] [] __resolve_indirect_ref+0xc8/0x150 [btrfs]
May 7 02:09:21 caper kernel: [ 726.752462] [] __resolve_indirect_refs+0x9e/0x200 [btrfs]
May 7 02:09:21 caper kernel: [ 726.752573] [] find_parent_nodes+0x45d/0x6b0 [btrfs]
May 7 02:09:21 caper kernel: [ 726.752684] [] btrfs_find_all_roots+0x99/0x100 [btrfs]
May 7 02:09:21 caper kernel: [ 726.752792] [] ? btrfs_submit_direct+0x190/0x190 [btrfs]
May 7 02:09:21 caper kernel: [ 726.752901] [] ? btrfs_submit_direct+0x190/0x190 [btrfs]
May 7 02:09:21 caper kernel: [ 726.753012] [] iterate_extent_inodes+0x177/0x2c0 [btrfs]
May 7 02:09:21 caper kernel: [ 726.753123] [] iterate_inodes_from_logical+0x92/0xb0 [btrfs]
May 7 02:09:21 caper kernel: [ 726.753244] [] ? btrfs_submit_direct+0x190/0x190 [btrfs]
May 7 02:09:21 caper kernel: [ 726.753353] [] record_extent_backrefs+0x78/0xf0 [btrfs]
May 7 02:09:21 caper kernel: [ 726.753462] [] relink_file_extents+0x44/0x180 [btrfs]
May 7 02:09:21 caper kernel: [ 726.753571] [] btrfs_finish_ordered_io+0x135/0x4d0 [btrfs]
May 7 02:09:21 caper kernel: [ 726.753681] [] finish_ordered_fn+0x15/0x20 [btrfs]
May 7 02:09:21 caper kernel: [ 726.753791] [] worker_loop+0xa0/0x320 [btrfs]
May 7 02:09:21 caper kernel: [ 726.753901
Re: A couple bugs with btrfs and 3.5.0 kernel
On Sun, Jan 20, 2013 at 5:51 PM, Liu Bo wrote:
> On Sun, Jan 20, 2013 at 05:39:57PM -0800, Elladan wrote:
>> Any ideas? I guess I could try to mount in degraded mode or try a 3.6
>> kernel or something, but this all seems like I should probably just
>> restore from backups and move on.
>
> Hi Elladan,
>
> For 'bio too big' issue, this patch is helpful,
>
> https://patchwork.kernel.org/patch/1619691/
>
> thanks,
> liubo

Hi,

After poking around, I determined that the 3.8 kernel is the first one with this patch. I installed it, and re-ran btrfs device delete. The delete ran to completion successfully.

However, "btrfs fi show" still indicated that the deleted device was part of the filesystem. I don't know if that was a bug in my older btrfs binary or not. It mounts fine without the deleted device.

Thanks!

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
A couple bugs with btrfs and 3.5.0 kernel
I upgraded to Ubuntu 12.10 and thought, "Hey, that 3.5 kernel is relatively recent. And they seem to finally have implemented restriping. Maybe it's time to try btrfs again!"

So, first off, I backed up all my data. Next, I decided I would attempt to use btrfs's features for my benefit. Specifically (this part is less interesting except as setup):

1. I put a btrfs filesystem on top of dm-crypt on an external USB drive.

2. I copied data to it.

3. I unmounted the original partition, and then immediately mounted the btrfs partition in its place.

Ok, now to the interesting bits:

My goal here is to delete the USB device and just leave myself with my data, migrated back to the internal disk (with minimal downtime). So, I figured I could use restriping/device delete to live-migrate back onto the internal hard disk.

4. I did a btrfs device add on a partition (over lvm/dm-crypt) on the internal disk. Now I have 2 partitions in the fs.

5. I attempted to btrfs device delete the USB disk, and it errored out (with somewhat inscrutable information) telling me that I can't reduce raid1 to dup this way.

   Note: Arguably, this is a bug. You really ought to do it, but with a -f option, and automatically reduce the chunks appropriately.

   Note: Also arguably, this is a bug because it should not have changed the metadata profile from dup to raid1 without asking me. Maybe I don't want raid1.

   Anyway, I figure I can fix this up with a balance filter (this is primarily what made me think btrfs might be more usable now).

6. I attempt to balance with a filter -mconvert=dup. This immediately errors out with no real indication as to why. In the dmesg log I found:

   [52656.153908] btrfs: unable to start balance with target metadata profile 32

   Clearly a bug.

7. After some random trial and error, I find that it accepts -mconvert=single, and the result appears to be metadata in dup state. Maybe.

   Ok now that's done, it's time to delete.

8. btrfs device delete /dev/dm-11 /btrfs

   Some hours later, it fails.
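A side note on the balance error in step 6: the number in "target metadata profile 32" looks like a bitmask of the kernel's BTRFS_BLOCK_GROUP_* profile flags rather than an opaque error code. The sketch below decodes it under that assumption. The bit positions are the ones I believe the kernel headers of that era define (RAID0, RAID1, DUP, RAID10), and the helper name decode_profile is made up for illustration.

```python
# Profile bits as (I believe) defined by BTRFS_BLOCK_GROUP_* in the kernel
# headers of the 3.5 era. Treat the table as an assumption, not a reference.
PROFILE_BITS = {
    1 << 3: "raid0",
    1 << 4: "raid1",
    1 << 5: "dup",
    1 << 6: "raid10",
}


def decode_profile(value):
    """Return the profile names whose bits are set in value.

    A value with none of these bits set is reported as "single".
    (decode_profile is a hypothetical helper, not part of btrfs-progs.)
    """
    names = [name for bit, name in PROFILE_BITS.items() if value & bit]
    return names or ["single"]


if __name__ == "__main__":
    # The value printed in the dmesg line from step 6.
    print(decode_profile(32))
```

Read this way, 32 is simply the dup profile that -mconvert=dup asked for, so the log line names the rejected target but says nothing about why it was rejected.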
I find stuff like this all over my dmesg log:

[113936.300109] bio too big device dm-11 (1024 > 240)
[113936.297242] btrfs: bdev /dev/dm-11 errs: wr 101, rd 10247, flush 0, corrupt 109, gen 0
[113935.425960] btrfs_dev_stat_print_on_error: 38 callbacks suppressed

It also found 2 files with csum errors, which were left on the USB device.

[92750.052638] btrfs csum failed ino 257 off 49278976 csum 948519347 private 2127080388
[95692.348662] btrfs: checksum error at logical 94682349568 on dev /dev/mapper/tempusb, sector 224788736, root 256, inode 114815, offset 14360576, length 4096, links 1 (path:...path to file)

The csum errors appeared to have caused it to stop.

Googling around seemed to indicate that someone had once experienced a similar problem with an external drive around the 3.0 kernel era. They suggested something about the filesystem not working when dealing with devices mixed between SATA and USB, which sounded a bit wacky to me.

I initially assumed that maybe the USB drive was a bit flaky, but this sounds to me like the csum errors were probably btrfs causing silent corruption.

I tried deleting the files with the csum errors and running the device delete again, but it immediately failed with invalid argument errors and nothing in the dmesg log. Clearly a bug.

Then, I tried unmounting, remounting, and then re-running the delete. This time it started, but it's been running for a long time and spamming my kernel logs with the bio too big for device errors. I'm guessing I'll probably need to sysrq reboot or something.

This is with Ubuntu's standard 3.5.0-22 generic kernel.

Any ideas? I guess I could try to mount in degraded mode or try a 3.6 kernel or something, but this all seems like I should probably just restore from backups and move on.

Thanks!
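To make the "bio too big device dm-11 (1024 > 240)" lines easier to read, here is a small sketch that pulls the two numbers out and converts them to KiB. It assumes the first number is the size of the rejected request and the second is the device queue's limit, both in 512-byte sectors, which is how I understand the block layer prints this message. The function name parse_bio_too_big is made up for illustration.

```python
import re

# Assumption: both numbers in the message are counts of 512-byte sectors.
SECTOR_BYTES = 512

BIO_TOO_BIG = re.compile(r"bio too big device (\S+) \((\d+) > (\d+)\)")


def parse_bio_too_big(line):
    """Parse one 'bio too big' dmesg line.

    Returns a dict with the device name, the request size and the limit
    (both in KiB), or None if the line does not match.
    (parse_bio_too_big is a hypothetical helper for this note.)
    """
    match = BIO_TOO_BIG.search(line)
    if match is None:
        return None
    device = match.group(1)
    bio_sectors = int(match.group(2))
    max_sectors = int(match.group(3))
    return {
        "device": device,
        "bio_kib": bio_sectors * SECTOR_BYTES // 1024,
        "limit_kib": max_sectors * SECTOR_BYTES // 1024,
    }


if __name__ == "__main__":
    line = "[113936.300109] bio too big device dm-11 (1024 > 240)"
    print(parse_bio_too_big(line))
```

Under that reading, a 512 KiB request was submitted to a device whose queue accepts at most 120 KiB per request, which would fit a filesystem spanning devices with different request-size limits.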
Kernel oops with 2.6.31 in btrfs_set_acl
Hi,

I got an oops this morning with btrfs on the Ubuntu 2.6.31 kernel. I mounted with compress and thread_pool=4. This is on a 64 bit quad AMD machine.

Sep 18 04:00:07 caper kernel: [614439.131866] BUG: unable to handle kernel NULL pointer dereference at 0004
Sep 18 04:00:07 caper kernel: [614439.131882] IP: [] posix_acl_equiv_mode+0x1/0xa0
Sep 18 04:00:07 caper kernel: [614439.131900] PGD 126d98067 PUD 11e572067 PMD 0
Sep 18 04:00:07 caper kernel: [614439.131911] Oops: [#4] SMP
Sep 18 04:00:07 caper kernel: [614439.131918] last sysfs file: /sys/devices/virtual/net/pan0/statistics/collisions
Sep 18 04:00:07 caper kernel: [614439.131925] CPU 1
Sep 18 04:00:07 caper kernel: [614439.131928] Modules linked in: joydev hid_gaff ff_memless btrfs zlib_deflate crc32c libcrc32c isofs udf crc_itu_t nls_iso8859_1 nls_cp437 vfat fat usb_storage binfmt_misc ppdev bnep ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp kvm_amd kvm tun video output nfsd nfs lockd nfs_acl auth_rpcgss sunrpc it87 hwmon_vid lp parport snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_usb_audio snd_usb_lib snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device uvcvideo usblp pcspkr amd64_edac_mod edac_core i2c_piix4 snd soundcore snd_page_alloc videodev v4l1_compat v4l2_compat_ioctl32 nvidia(P) xfs exportfs ohci1394 ieee1394 r8169 mii usbhid
Sep 18 04:00:07 caper kernel: [614439.132057] Pid: 23795, comm: rdiff-backup Tainted: P D W 2.6.31-020631-generic #020631 GA-MA790GP-UD4H
Sep 18 04:00:07 caper kernel: [614439.132064] RIP: 0010:[] [] posix_acl_equiv_mode+0x1/0xa0
Sep 18 04:00:07 caper kernel: [614439.132077] RSP: 0018:88007cc61c58 EFLAGS: 00010246
Sep 18 04:00:07 caper kernel: [614439.132082] RAX: 8180 RBX: ffea RCX: 0004
Sep 18 04:00:07 caper kernel: [614439.132088] RDX: 8000 RSI: 88007cc61c8c RDI:
Sep 18 04:00:07 caper kernel: [614439.132094] RBP: 88007cc61cb8 R08: R09: a0f5e9a0
Sep 18 04:00:07 caper kernel: [614439.132099] R10: R11: R12:
Sep 18 04:00:07 caper kernel: [614439.132104] R13: 8000 R14: 88005aa4b9b0 R15: 8000
Sep 18 04:00:07 caper kernel: [614439.132112] FS: 7fc423a2e6f0() GS:88002803f000() knlGS:f72c1750
Sep 18 04:00:07 caper kernel: [614439.132117] CS: 0010 DS: ES: CR0: 80050033
Sep 18 04:00:07 caper kernel: [614439.132123] CR2: 0004 CR3: 00011b806000 CR4: 06a0
Sep 18 04:00:07 caper kernel: [614439.132129] DR0: DR1: DR2:
Sep 18 04:00:07 caper kernel: [614439.132134] DR3: DR6: 0ff0 DR7: 0400
Sep 18 04:00:07 caper kernel: [614439.132141] Process rdiff-backup (pid: 23795, threadinfo 88007cc6, task 88007f432d60)
Sep 18 04:00:07 caper kernel: [614439.132146] Stack:
Sep 18 04:00:07 caper kernel: [614439.132149] 88007cc61cb8 a0f3c157 88007cc61ce8 0008
Sep 18 04:00:07 caper kernel: [614439.132158] <0> 0eb6 88003e0af110 81807cc61c98
Sep 18 04:00:07 caper kernel: [614439.132168] <0> 0004 8000 88005aa4b9b0 880079484240
Sep 18 04:00:07 caper kernel: [614439.132178] Call Trace:
Sep 18 04:00:07 caper kernel: [614439.132228] [] ? btrfs_set_acl+0x87/0x230 [btrfs]
Sep 18 04:00:07 caper kernel: [614439.132266] [] btrfs_xattr_set_acl+0x84/0xa0 [btrfs]
Sep 18 04:00:07 caper kernel: [614439.132300] [] btrfs_xattr_acl_access_set+0xe/0x10 [btrfs]
Sep 18 04:00:07 caper kernel: [614439.132311] [] generic_setxattr+0x6f/0x90
Sep 18 04:00:07 caper kernel: [614439.132349] [] btrfs_setxattr+0x4a/0xa0 [btrfs]
Sep 18 04:00:07 caper kernel: [614439.132359] [] vfs_setxattr+0xbe/0x210
Sep 18 04:00:07 caper kernel: [614439.132368] [] setxattr+0xb5/0x110
Sep 18 04:00:07 caper kernel: [614439.132375] [] ? path_put+0x2c/0x40
Sep 18 04:00:07 caper kernel: [614439.132383] [] ? putname+0x31/0x50
Sep 18 04:00:07 caper kernel: [614439.132390] [] ? user_path_at+0x59/0x90
Sep 18 04:00:07 caper kernel: [614439.132400] [] ? _atomic_dec_and_lock+0x4d/0x70
Sep 18 04:00:07 caper kernel: [614439.132407] [] ? path_put+0x2c/0x40
Sep 18 04:00:07 caper kernel: [614439.132415] [] ? sys_fchmodat+0x71/0xe0
Sep 18 04:00:07 caper kernel: [614439.132423] [] ? _atomic_dec_and_lock+0x4d/0x70
Sep 18 04:00:07 caper kernel: [614439.132432] [] sys_setxattr+0x92/0xa0
Sep 18 04:00:07 caper kernel: [614439.132442] [] system_call_fastpath+0x16/0x1b
Sep 18 04:00:07 caper kernel: [614439.132447] Code: 08 0f 85 6a ff ff ff 85 ff 90 0f 85 61 ff ff ff e9 77 f