Re: Seeking Help on Corruption Issues
On 10/3/2017 2:11 PM, Hugo Mills wrote:
> Hi, Stephen,
>
> On Tue, Oct 03, 2017 at 08:52:04PM +, Stephen Nesbitt wrote:
>> Here it is. There are a couple of out-of-order entries beginning at
>> 117. And yes, I did uncover a bad stick of RAM:
>>
>> btrfs-progs v4.9.1
>> leaf 2589782867968 items 134 free space 6753 generation 3351574 owner 2
>> fs uuid 24b768c3-2141-44bf-ae93-1c3833c8c8e3
>> chunk uuid 19ce12f0-d271-46b8-a691-e0d26c1790c6
>> [snip]
>> item 116 key (1623012749312 EXTENT_ITEM 45056) itemoff 10908 itemsize 53
>> 	extent refs 1 gen 3346444 flags DATA
>> 	extent data backref root 271 objectid 2478 offset 0 count 1
>> item 117 key (1621939052544 EXTENT_ITEM 8192) itemoff 10855 itemsize 53
>> 	extent refs 1 gen 3346495 flags DATA
>> 	extent data backref root 271 objectid 21751764 offset 6733824 count 1
>> item 118 key (1623012450304 EXTENT_ITEM 8192) itemoff 10802 itemsize 53
>> 	extent refs 1 gen 3351513 flags DATA
>> 	extent data backref root 271 objectid 5724364 offset 680640512 count 1
>> item 119 key (1623012802560 EXTENT_ITEM 12288) itemoff 10749 itemsize 53
>> 	extent refs 1 gen 3346376 flags DATA
>> 	extent data backref root 271 objectid 21751764 offset 6701056 count 1
>
> hex(1623012749312) = '0x179e3193000'
> hex(1621939052544) = '0x179a319e000'
> hex(1623012450304) = '0x179e314a000'
> hex(1623012802560) = '0x179e31a'
>
> That's "e" -> "a" in the fourth hex digit, which is a single-bit flip,
> and should be fixable by btrfs check (I think).
>
> However, even fixing that, it's not ordered, because 118 is then before
> 117, which could be another bitflip ("9" -> "4" in the 7th digit), but
> two bad bits that close to each other seems unlikely to me.
>
> Hugo.

Hope this is not a duplicate reply - I might have fat-fingered something.

The underlying file is disposable/replaceable. Any way to zero out/zap the bad btrfs entry?

-steve
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
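Hugo's single-bit-flip observation can be checked mechanically. A small sketch; the "corrected" value 0x179e319e000 is only the hypothesized repair of item 117's key (restoring the "a" back to "e"), not anything btrfs itself reports:

```python
# Item 117's key objectid as stored on disk, and the value it would need
# to be to sort after item 116 (hypothesized correction, per Hugo's mail).
stored    = 0x179a319e000   # 1621939052544
corrected = 0x179e319e000

diff = stored ^ corrected
print(hex(diff))                 # 0x40000000 -> exactly one bit differs
print(bin(diff).count("1"))      # 1, i.e. a single-bit flip
print(0x179e3193000 < corrected) # True: corrected key sorts after item 116
```

As Hugo notes, item 118 (0x179e314a000) would still be out of order relative to the corrected 117, so one flip alone doesn't fully explain the leaf.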
Seeking Help on Corruption Issues
All: I came back to my computer yesterday to find my filesystem in read-only mode. Running a btrfs scrub start -dB aborts as follows:

btrfs scrub start -dB /mnt
ERROR: scrubbing /mnt failed for device id 4: ret=-1, errno=5 (Input/output error)
ERROR: scrubbing /mnt failed for device id 5: ret=-1, errno=5 (Input/output error)
scrub device /dev/sdb (id 4) canceled
	scrub started at Mon Oct 2 21:51:46 2017 and was aborted after 00:09:02
	total bytes scrubbed: 75.58GiB with 1 errors
	error details: csum=1
	corrected errors: 0, uncorrectable errors: 1, unverified errors: 0
scrub device /dev/sdc (id 5) canceled
	scrub started at Mon Oct 2 21:51:46 2017 and was aborted after 00:11:11
	total bytes scrubbed: 50.75GiB with 0 errors

The resulting dmesg is:

[  699.534066] BTRFS error (device sdc): bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
[  699.703045] BTRFS error (device sdc): unable to fixup (regular) error at logical 1609808347136 on dev /dev/sdb
[  783.306525] BTRFS critical (device sdc): corrupt leaf, bad key order: block=2589782867968, root=1, slot=116
[  789.776132] BTRFS critical (device sdc): corrupt leaf, bad key order: block=2589782867968, root=1, slot=116
[  911.529842] BTRFS critical (device sdc): corrupt leaf, bad key order: block=2589782867968, root=1, slot=116
[  918.365225] BTRFS critical (device sdc): corrupt leaf, bad key order: block=2589782867968, root=1, slot=116

Running btrfs check /dev/sdc results in:

btrfs check /dev/sdc
Checking filesystem on /dev/sdc
UUID: 24b768c3-2141-44bf-ae93-1c3833c8c8e3
checking extents
bad key ordering 116 117
bad block 2589782867968
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
There is no free space entry for 1623012450304-1623012663296
There is no free space entry for 1623012450304-1623225008128
cache appears valid but isn't 1622151266304
found 288815742976 bytes used err is -22
total csum bytes: 0
total tree bytes: 350781440
total fs tree bytes: 0
total extent tree bytes: 350027776
btree space waste bytes: 115829777
file data blocks allocated: 156499968

uname -a: Linux sysresccd 4.9.24-std500-amd64 #2 SMP Sat Apr 22 17:14:43 UTC 2017 x86_64 Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz GenuineIntel GNU/Linux

btrfs --version: btrfs-progs v4.9.1

btrfs fi show:
Label: none  uuid: 24b768c3-2141-44bf-ae93-1c3833c8c8e3
	Total devices 2 FS bytes used 475.08GiB
	devid 4 size 931.51GiB used 612.06GiB path /dev/sdb
	devid 5 size 931.51GiB used 613.09GiB path /dev/sdc

btrfs fi df /mnt:
Data, RAID1: total=603.00GiB, used=468.03GiB
System, RAID1: total=64.00MiB, used=112.00KiB
System, single: total=32.00MiB, used=0.00B
Metadata, RAID1: total=9.00GiB, used=7.04GiB
Metadata, single: total=1.00GiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

What is the recommended procedure at this point? Run btrfs check --repair? I have backups, so losing a file or two isn't critical, but I really don't want to go through the effort of a bare-metal reinstall.

In the process of researching this I did uncover a bad DIMM. Am I correct that the problems I'm seeing are likely linked to the resulting memory errors?

Thx in advance,

-steve
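The "bad key ordering 116 117" complaint from both the kernel and btrfs check is just a monotonicity rule over a leaf's keys. A simplified sketch of that check (not btrfs code; the EXTENT_ITEM type number 168 is an assumption from the on-disk format):

```python
# Simplified model of the "bad key order" leaf check: keys in a leaf
# must be strictly increasing when compared as (objectid, type, offset).
def first_bad_slot(keys):
    """Return the first slot whose successor is not strictly greater, else None."""
    for slot in range(1, len(keys)):
        if keys[slot] <= keys[slot - 1]:
            return slot - 1   # the kernel message names the earlier slot
    return None

# The four keys from the leaf dump in this thread (EXTENT_ITEM assumed = 168).
leaf = [
    (1623012749312, 168, 45056),   # item 116
    (1621939052544, 168, 8192),    # item 117 -- out of order
    (1623012450304, 168, 8192),    # item 118
    (1623012802560, 168, 12288),   # item 119
]
print(first_bad_slot(leaf))  # 0, matching "slot=116" relative to this slice
```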
Re: [PATCH v7 00/22] fs: enhanced writeback error reporting with errseq_t (pile #1)
Hi Jeff,

On Mon, 19 Jun 2017 12:23:46 -0400 Jeff Layton <jlay...@redhat.com> wrote:
>
> If there are no major objections to this set, I'd like to have
> linux-next start picking it up to get some wider testing. What's the
> right vehicle for this, given that it touches stuff all over the tree?
>
> I can see 3 potential options:
>
> 1) I could just pull these into the branch that Stephen is already
> picking up for file-locks in my tree
>
> 2) I could put them into a new branch, and have Stephen pull that one in
> addition to the file-locks branch
>
> 3) It could go in via someone else's tree entirely (Andrew or Al's
> maybe?)
>
> I'm fine with any of these. Anyone have thoughts?

Given that this is a one-off development, either 1 or 3 (in Al's tree) would be fine. 2 is a possibility (but people forget to ask me to remove one-shot trees :-()

--
Cheers,
Stephen Rothwell
Planned feature status
I know that setting different RAID levels per subvolume is planned for the future, but I can't find documentation on the wiki as to what priority the feature has. I can find docs on some user-submitted feature requests, but since this is something that was planned longer ago, it seems it's not documented. Can someone tell me where to find a list of feature priorities, or when this might be done?

Thank you,
Stephen
Re: incoming merge conflict to linux-next
Hi Chris,

On Wed, 18 May 2016 17:10:43 -0400 Chris Mason <c...@fb.com> wrote:
>
> Dave Sterba's tree in linux-next has a few btrfs patches that we're not
> sending yet into Linus. We've got an update for Josef's enospc work
> that'll get sent in next week.
>
> So he prepped a pull for me that merged up a number of his branches but
> didn't include Josef's new code. It has all been in -next for some
> time, and then I put some fixes from Filipe on top.
>
> Long story short, you'll get a merge conflict from my next branch:
>
> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git next
>
> I've got the sample resolution in next-merge:
>
> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git next-merge
>
> Please let us know if you have any problems.

A bit of a mess, but I sorted it out, thanks for the test merge.

--
Cheers,
Stephen Rothwell
Re: Possible Raid Bug
Yeah, I think the Gotchas page would be a good place to give people a heads up.

--
Stephen Williams
steph...@veryfast.biz

On Sat, Mar 26, 2016, at 09:58 PM, Chris Murphy wrote:
> On Sat, Mar 26, 2016 at 8:00 AM, Stephen Williams <steph...@veryfast.biz> wrote:
>
> > I know this is quite a rare occurrence for home use but for Data center
> > use this is something that will happen A LOT.
> > This really should be placed in the wiki while we wait for a fix. I can
> > see a lot of sys admins crying over this.
>
> Maybe on the gotchas page? While it's not a data loss bug, it might be
> viewed as an uptime bug because the dataset is stuck being ro and
> hence unmodifiable, until a restore to a rw volume is complete.
Re: Possible Raid Bug
Can confirm that you only get one chance to fix the problem before the array is dead. I know this is quite a rare occurrence for home use but for Data center use this is something that will happen A LOT. This really should be placed in the wiki while we wait for a fix. I can see a lot of sys admins crying over this.

--
Stephen Williams
steph...@veryfast.biz

On Sat, Mar 26, 2016, at 11:51 AM, Patrik Lundquist wrote:
> So with the lessons learned:
>
> # mkfs.btrfs -m raid10 -d raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde
>
> # mount /dev/sdb /mnt; dmesg | tail
> # touch /mnt/test1; sync; btrfs device usage /mnt
>
> Only raid10 profiles.
>
> # echo 1 >/sys/block/sde/device/delete
>
> We lost a disk.
>
> # touch /mnt/test2; sync; dmesg | tail
>
> We've got write errors.
>
> # btrfs device usage /mnt
>
> No 'single' profiles because we haven't remounted yet.
>
> # reboot
> # wipefs -a /dev/sde; reboot
>
> # mount -o degraded /dev/sdb /mnt; dmesg | tail
> # btrfs device usage /mnt
>
> Still only raid10 profiles.
>
> # touch /mnt/test3; sync; btrfs device usage /mnt
>
> Now we've got 'single' profiles. Replace now or get hosed.
>
> # btrfs replace start -B 4 /dev/sde /mnt; dmesg | tail
>
> # btrfs device stats /mnt
>
> [/dev/sde].write_io_errs    0
> [/dev/sde].read_io_errs     0
> [/dev/sde].flush_io_errs    0
> [/dev/sde].corruption_errs  0
> [/dev/sde].generation_errs  0
>
> We didn't inherit the /dev/sde error count. Is that a bug?
>
> # btrfs balance start -dconvert=raid10,soft -mconvert=raid10,soft
> -sconvert=raid10,soft -vf /mnt; dmesg | tail
>
> # btrfs device usage /mnt
>
> Back to only 'raid10' profiles.
>
> # umount /mnt; mount /dev/sdb /mnt; dmesg | tail
>
> # btrfs device stats /mnt
>
> [/dev/sde].write_io_errs    11
> [/dev/sde].read_io_errs     0
> [/dev/sde].flush_io_errs    2
> [/dev/sde].corruption_errs  0
> [/dev/sde].generation_errs  0
>
> The old counters are back. That's good, but wtf?
>
> # btrfs device stats -z /dev/sde
>
> Give /dev/sde a clean bill of health. Won't warn when mounting again.
Re: Possible Raid Bug
Hi Patrik,

[root@Xen ~]# uname -r
4.4.5-1-ARCH
[root@Xen ~]# pacman -Q btrfs-progs
btrfs-progs 4.4.1-1

Your information below was very helpful and I was able to recreate the RAID array. However, my initial question still stands - what if the drive dies completely? I work in a data center and we see this quite a lot, where a drive is beyond dead - the OS will literally not detect it. At this point, would the RAID10 array be beyond repair? As you need the drive present in order to mount the array in degraded mode.

--
Stephen Williams
steph...@veryfast.biz

On Fri, Mar 25, 2016, at 02:57 PM, Patrik Lundquist wrote:
> On Debian Stretch with Linux 4.4.6, btrfs-progs 4.4 in VirtualBox
> 5.0.16 with 4*2GB VDIs:
>
> # mkfs.btrfs -m raid10 -d raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde
>
> # mount /dev/sdb /mnt
> # touch /mnt/test
> # umount /mnt
>
> Everything fine so far.
>
> # wipefs -a /dev/sde
>
> *reboot*
>
> # mount /dev/sdb /mnt
> mount: wrong fs type, bad option, bad superblock on /dev/sdb,
>        missing codepage or helper program, or other error
>
>        In some cases useful info is found in syslog - try
>        dmesg | tail or so.
>
> # dmesg | tail
> [   85.979655] BTRFS info (device sdb): disk space caching is enabled
> [   85.979660] BTRFS: has skinny extents
> [   85.982377] BTRFS: failed to read the system array on sdb
> [   85.996793] BTRFS: open_ctree failed
>
> Not very informative! An information regression?
>
> # mount -o degraded /dev/sdb /mnt
>
> # dmesg | tail
> [  919.899071] BTRFS info (device sdb): allowing degraded mounts
> [  919.899075] BTRFS info (device sdb): disk space caching is enabled
> [  919.899077] BTRFS: has skinny extents
> [  919.903216] BTRFS warning (device sdb): devid 4 uuid
> 8549a275-f663-4741-b410-79b49a1d465f is missing
>
> # touch /mnt/test2
> # ls -l /mnt/
> total 0
> -rw-r--r-- 1 root root 0 mar 25 15:17 test
> -rw-r--r-- 1 root root 0 mar 25 15:42 test2
>
> # btrfs device remove missing /mnt
> ERROR: error removing device 'missing': unable to go below four
> devices on raid10
>
> As expected.
>
> # btrfs replace start -B missing /dev/sde /mnt
> ERROR: source device must be a block device or a devid
>
> Would have been nice if missing worked here too. Maybe it does in
> btrfs-progs 4.5?
>
> # btrfs replace start -B 4 /dev/sde /mnt
>
> # dmesg | tail
> [ 1618.170619] BTRFS info (device sdb): dev_replace from <missing disk> (devid 4) to /dev/sde started
> [ 1618.184979] BTRFS info (device sdb): dev_replace from <missing disk> (devid 4) to /dev/sde finished
>
> Repaired!
>
> # umount /mnt
> # mount /dev/sdb /mnt
> # dmesg | tail
> [ 1729.917661] BTRFS info (device sde): disk space caching is enabled
> [ 1729.917665] BTRFS: has skinny extents
>
> All in all it works just fine with Linux 4.4.6.
Possible Raid Bug
Hi,

Find instructions on how to recreate below - I have a btrfs RAID 10 setup in VirtualBox (I'm getting to grips with the filesystem). I have the raid mounted to /mnt like so -

[root@Xen ~]# btrfs filesystem show /mnt/
Label: none  uuid: ad1d95ee-5cdc-420f-ad30-bd16158ad8cb
	Total devices 4 FS bytes used 1.00GiB
	devid 1 size 2.00GiB used 927.00MiB path /dev/sdb
	devid 2 size 2.00GiB used 927.00MiB path /dev/sdc
	devid 3 size 2.00GiB used 927.00MiB path /dev/sdd
	devid 4 size 2.00GiB used 927.00MiB path /dev/sde

And -

[root@Xen ~]# btrfs filesystem usage /mnt/
Overall:
	Device size:         8.00GiB
	Device allocated:    3.62GiB
	Device unallocated:  4.38GiB
	Device missing:      0.00B
	Used:                2.00GiB
	Free (estimated):    2.69GiB (min: 2.69GiB)
	Data ratio:          2.00
	Metadata ratio:      2.00
	Global reserve:      16.00MiB (used: 0.00B)

Data,RAID10: Size:1.50GiB, Used:1.00GiB
	/dev/sdb 383.50MiB
	/dev/sdc 383.50MiB
	/dev/sdd 383.50MiB
	/dev/sde 383.50MiB

Metadata,RAID10: Size:256.00MiB, Used:1.16MiB
	/dev/sdb 64.00MiB
	/dev/sdc 64.00MiB
	/dev/sdd 64.00MiB
	/dev/sde 64.00MiB

System,RAID10: Size:64.00MiB, Used:16.00KiB
	/dev/sdb 16.00MiB
	/dev/sdc 16.00MiB
	/dev/sdd 16.00MiB
	/dev/sde 16.00MiB

Unallocated:
	/dev/sdb 1.55GiB
	/dev/sdc 1.55GiB
	/dev/sdd 1.55GiB
	/dev/sde 1.55GiB

Right, so everything looks good, and I stuck some dummy files in there too -

[root@Xen ~]# ls -lh /mnt/
total 1.1G
-rw-r--r-- 1 root root 1.0G May 30  2008 1GB.zip
-rw-r--r-- 1 root root   28 Mar 24 15:16 hello
-rw-r--r-- 1 root root    6 Mar 24 16:12 niglu
-rw-r--r-- 1 root root    4 Mar 24 15:32 test

The bug appears to happen when you try to test out its ability to handle a dead drive.
If you follow the instructions here: https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Replacing_failed_devices

It tells you to mount the drive with the 'degraded' option; however, this just does not work. Allow me to show -

1) I power off the VM and remove one of the drives (simulating a drive being pulled from a machine)
2) Power on the VM
3) Check dmesg - everything looks good
4) Check how btrfs is feeling -

Label: none  uuid: ad1d95ee-5cdc-420f-ad30-bd16158ad8cb
	Total devices 4 FS bytes used 1.00GiB
	devid 1 size 2.00GiB used 1.31GiB path /dev/sdb
	devid 2 size 2.00GiB used 1.31GiB path /dev/sdc
	devid 3 size 2.00GiB used 1.31GiB path /dev/sdd
	*** Some devices missing

So far so good, /dev/sde is missing and btrfs has detected this.

5) Try to mount it as per the wiki so I can remove the bad drive and replace it with a good one -

[root@Xen ~]# mount -o degraded /dev/sdb /mnt/
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

Ok, this is not good. I check dmesg -

[root@Xen ~]# dmesg | tail
[    4.416445] e1000: enp0s3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[    4.416672] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s3: link becomes ready
[    4.631812] snd_intel8x0 :00:05.0: white list rate for 1028:0177 is 48000
[    7.091047] floppy0: no floppy controllers found
[   27.488345] BTRFS info (device sdb): allowing degraded mounts
[   27.488348] BTRFS info (device sdb): disk space caching is enabled
[   27.488349] BTRFS: has skinny extents
[   27.489794] BTRFS warning (device sdb): devid 4 uuid ebcd53d9-5956-41d9-b0ef-c59d08e5830f is missing
[   27.491465] BTRFS: missing devices(1) exceeds the limit(0), writeable mount is not allowed
[   27.520231] BTRFS: open_ctree failed

So here lies the problem - btrfs needs you to have all the devices present in order to mount it as writeable, but if a drive dies spectacularly (as they can do), you can't do that. And as a result you cannot mount any of the remaining drives and fix the problem.

Now you ARE able to mount it read-only, but you can't issue the fix that is recommended on the wiki, see here -

[root@Xen ~]# mount -o ro,degraded /dev/sdb /mnt/
[root@Xen ~]# btrfs device delete missing /mnt/
ERROR: error removing device 'missing': Read-only file system

So catch 22: you need all the drives, otherwise it won't let you mount. But what happens if a drive dies and the OS doesn't detect it? Btrfs won't allow you to mount the raid volume to remove the bad disk!
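The key line is "missing devices(1) exceeds the limit(0)": the writable-degraded limit is the lowest tolerance among the chunk profiles on the filesystem, and 'single' chunks (which appear after degraded-rw writes, as Patrik's follow-up in this thread shows) tolerate zero missing devices. A toy model of that arithmetic, with an assumed per-profile tolerance table (not kernel code):

```python
# Hypothetical sketch: how many missing devices each chunk profile can
# tolerate. The degraded-writable limit is the minimum over all
# profiles present on the filesystem.
TOLERATED = {
    "single": 0, "dup": 0, "raid0": 0,
    "raid1": 1, "raid10": 1, "raid5": 1, "raid6": 2,
}

def missing_limit(profiles):
    """Smallest tolerance among the chunk profiles in use."""
    return min(TOLERATED[p] for p in profiles)

print(missing_limit({"raid10"}))            # 1 -> degraded rw mount allowed
print(missing_limit({"raid10", "single"}))  # 0 -> "exceeds the limit(0)"
```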
kernel BUG at fs/btrfs/send.c:1482
I'm running 4.4.0-rc7. This exact problem was present on 4.0.5 and 4.3.3 too though.

I do a "btrfs send /var/lib/lxc/template64/rootfs", that generates the following error consistently at the same file, over and over again:

Dec 29 14:49:04 argo kernel: kernel BUG at fs/btrfs/send.c:1482!
Dec 29 14:49:04 argo kernel: Modules linked in: nfsd
Dec 29 14:49:04 argo kernel: task: 880041295c40 ti: 88010423c000 task.ti: 88010423c000
Dec 29 14:49:04 argo kernel: RSP: 0018:88010423fb20 EFLAGS: 00010202
Dec 29 14:49:04 argo kernel: RDX: 0001 RSI: RDI:
Dec 29 14:49:04 argo kernel: R10: 88019d53b5e0 R11: R12: 8801b35ac510
Dec 29 14:49:04 argo kernel: FS: 7fac9113f8c0() GS:88022fd8() knlGS:
Dec 29 14:49:04 argo kernel: CR2: 7f99ba308520 CR3: 000154a4 CR4: 001006e0
Dec 29 14:49:04 argo kernel: 81ed 0009b15a a1ff
Dec 29 14:49:04 argo kernel: 0009b15a 1a03 8801baf2e800
Dec 29 14:49:04 argo kernel: [] send_create_inode_if_needed+0x30/0x49
Dec 29 14:49:04 argo kernel: [] ? btrfs_item_key+0x19/0x1b
Dec 29 14:49:04 argo kernel: [] btrfs_compare_trees+0x2f2/0x4fe
Dec 29 14:49:04 argo kernel: [] btrfs_ioctl_send+0x846/0xce5
Dec 29 14:49:04 argo kernel: [] ? try_to_freeze_unsafe+0x9/0x32
Dec 29 14:49:04 argo kernel: [] ? _raw_spin_lock_irq+0xf/0x11
Dec 29 14:49:04 argo kernel: [] ? ptrace_do_notify+0x84/0x95
Dec 29 14:49:04 argo kernel: [] SyS_ioctl+0x43/0x61
Dec 29 14:49:04 argo kernel: RIP [] send_create_inode+0x1ce/0x30d

On the receiving end, I have a "btrfs receive" which takes the above stream as input, and *always* reports this:

receiving snapshot 20151230-141324.1451484804.965085668@argo uuid=53df0616-5715-ad40-ae81-78a023860fe0, ctransid=649684 parent_uuid=d3f807da-1e9d-aa4d-ab01-77ce5e2fbcd7, parent_ctransid=649735
utimes
rename bin -> o257-379784-0
mkdir o257-34888-0
rename o257-34888-0 -> bin
utimes
chown bin - uid=0, gid=0
chmod bin - mode=0755
utimes bin
rmdir boot
ERROR: rmdir boot failed. No such file or directory
mkdir o258-34888-0
rename o258-34888-0 -> boot
utimes
chown boot - uid=0, gid=0
chmod boot - mode=0755
utimes boot
rename dev -> o259-379784-0
mkdir o259-34888-0
rename o259-34888-0 -> dev

... rest of the logging follows as normal...
... then we get ...

rmdir media
mkdir o264-34888-0
rename o264-34888-0 -> media
utimes
chown media - uid=0, gid=0
chmod media - mode=0755
utimes media
rmdir mnt
ERROR: rmdir mnt failed. No such file or directory
rmdir opt
mkdir o266-34888-0
rename o266-34888-0 -> opt
utimes

... continues as normal ...

It then still creates lots of files, until it encounters the sudden EOF due to the sending side experiencing the kernel bug and abruptly halting the send.

Since the problem is consistently and easily reproducible, I can immediately try any proposed patches or fixes (or provide more insight into the subvolume this problem occurs with). Numerous other subvolumes in the same btrfs partition work flawlessly using btrfs send/receive.

The sending partition is RAID0 with two 512GB SSD drives. The receiving partition is RAID1 with 6 6TB HDD drives.

--
Stephen.
Re: kernel BUG at fs/btrfs/send.c:1482
Stephen R. van den Berg wrote:
>I'm running 4.4.0-rc7.
>This exact problem was present on 4.0.5 and 4.3.3 too though.
>I do a "btrfs send /var/lib/lxc/template64/rootfs", that generates
>the following error consistently at the same file, over and over again:

>Dec 29 14:49:04 argo kernel: kernel BUG at fs/btrfs/send.c:1482!

Ok, found part of the solution. The kernel bug was being triggered by symbolic links in that subvolume that have an empty target. It is unknown how these ever ended up on that partition. The partitions were created using regular btrfs. The only strange thing that might have happened is that I ran duperemove over those partitions afterward.

--
Stephen.
Re: linux-next conflict resolution branch for btrfs
Hi Chris,

On Thu, 20 Aug 2015 13:39:18 -0400 Chris Mason <c...@fb.com> wrote:
> There are a few conflicts for btrfs in linux-next this time. They are
> small, but I pushed out the merge commit I'm using here:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git next-merge

Thanks for that. It seems to have merged OK, but maybe it conflicts with something later in linux-next. Unfortunately, see my other email about a build problem. I will keep this example merge in mind for later.

--
Cheers,
Stephen Rothwell
s...@canb.auug.org.au
Panic while running defrag
I ran into a panic while running "find -xdev | xargs btrfs fi defrag '{}'". I don't remember the exact command because the history was not saved; I also started and stopped it a few times, however. The kernel logs were on a different filesystem. Here is the kern.log: http://fpaste.org/9383/36729191/

My setup is two 2TB hard drives in RAID 1. They are both SATA drives, so as far as I know the USB disconnect line isn't referring to btrfs.

Output of df -h:
/dev/sdb1  3.7T  3.2T  371G  90% /mount/point

I haven't figured out how to reproduce the bug.

--
Stephen
Re: [PATCH 0/3] Update LZO compression
Hi Markus,

On Tue, 09 Oct 2012 21:54:59 +0200 Markus F.X.J. Oberhumer <mar...@oberhumer.com> wrote:
> On 2012-10-09 21:26, Andrew Morton wrote:
> > On Sun, 7 Oct 2012 17:07:55 +0200 Markus F.X.J. Oberhumer <mar...@oberhumer.com> wrote:
> > >
> > > As requested by akpm I am sending my lzo-update branch at
> > >
> > >   git://github.com/markus-oberhumer/linux.git lzo-update
> > >
> > > to lkml as a patch series created by "git format-patch -M v3.5..lzo-update".
> > >
> > > You can also browse the branch at
> > >
> > >   https://github.com/markus-oberhumer/linux/compare/lzo-update
> > >
> > > and review the three patches at
> > >
> > >   https://github.com/markus-oberhumer/linux/commit/7c979cebc0f93dc692b734c12665a6824d219c20
> > >   https://github.com/markus-oberhumer/linux/commit/10f6781c8591fe5fe4c8c733131915e5ae057826
> > >   https://github.com/markus-oberhumer/linux/commit/5f702781f158cb59075cfa97e5c21f52275057f1
> >
> > The changes look OK to me. Please ask Stephen to include the tree in
> > linux-next, for a 3.7 merge.
>
> I'd ask you to include my lzo-update branch in linux-next:
>
>   git://github.com/markus-oberhumer/linux.git lzo-update

I have added this from today.

Thanks for adding your subsystem tree as a participant of linux-next. As you may know, this is not a judgment of your code. The purpose of linux-next is for integration testing and to lower the impact of conflicts between subsystems in the next merge window.

You will need to ensure that the patches/commits in your tree/series have been:
* submitted under GPL v2 (or later) and include the Contributor's Signed-off-by,
* posted to the relevant mailing list,
* reviewed by you (or another maintainer of your subsystem tree),
* successfully unit tested, and
* destined for the current or next Linux merge window.

Basically, this should be just what you would send to Linus (or ask him to fetch). It is allowed to be rebased if you deem it necessary.

--
Cheers,
Stephen Rothwell
s...@canb.auug.org.au

Legal Stuff: By participating in linux-next, your subsystem tree contributions are public and will be included in the linux-next trees. You may be sent e-mail messages indicating errors or other issues when the patches/commits from your subsystem tree are merged and tested in linux-next. These messages may also be cross-posted to the linux-next mailing list, the linux-kernel mailing list, etc. The linux-next tree project and IBM (my employer) make no warranties regarding the linux-next project, the testing procedures, the results, the e-mails, etc. If you don't agree to these ground rules, let me know and I'll remove your tree from participation in linux-next.
linux-next: build warnings in Linus' tree
Hi all,

After merging Linus' tree, today's linux-next build (powerpc ppc64_defconfig) produced these warnings:

fs/btrfs/sysfs.c:76:26: warning: 'btrfs_root_attrs' defined but not used
fs/btrfs/sysfs.c:97:26: warning: 'btrfs_super_attrs' defined but not used
fs/btrfs/sysfs.c:153:13: warning: 'btrfs_super_release' defined but not used
fs/btrfs/sysfs.c:160:13: warning: 'btrfs_root_release' defined but not used

I have started using gcc v4.5.2 (instead of v4.4.4) if that makes a difference.

--
Cheers,
Stephen Rothwell
s...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
Re: Observed unexpected behavior of BTRFS in d_instantiate
On Tue, 2011-04-26 at 20:15 -0700, Casey Schaufler wrote:
> I have been tracking down a problem that we've been seeing with Smack
> on top of btrfs and have narrowed it down to a check in
> smack_d_instantiate() that checks to see if the underlying filesystem
> supports extended attributes by looking at
>
>     inode->i_op->getxattr
>
> If the filesystem has no entry for getxattr it is assumed that it does
> not support extended attributes. The Smack code clearly finds this
> value to be NULL for btrfs and uses a fallback value. Clearly something
> is amiss, as other code paths clearly find the i_op->getxattr function
> and use it to effect. The btrfs code quite obviously includes getxattr
> functions.
>
> So, what is btrfs up to such that the inode ops does not include
> getxattr when security_d_instantiate is called?
>
> I am led to understand that SELinux has worked around this, but looking
> at the SELinux code I expect that there is a problem there as well.
>
> Thank you.

kernel version(s)? reproducer?

--
Stephen Smalley
National Security Agency
Re: Observed unexpected behavior of BTRFS in d_instantiate
On Thu, 2011-04-28 at 10:03 -0700, Casey Schaufler wrote:
> On 4/28/2011 6:30 AM, Stephen Smalley wrote:
> > On Tue, 2011-04-26 at 20:15 -0700, Casey Schaufler wrote:
> > > I have been tracking down a problem that we've been seeing with Smack
> > > on top of btrfs and have narrowed it down to a check in
> > > smack_d_instantiate() that checks to see if the underlying filesystem
> > > supports extended attributes by looking at inode->i_op->getxattr
> > >
> > > If the filesystem has no entry for getxattr it is assumed that it does
> > > not support extended attributes. The Smack code clearly finds this
> > > value to be NULL for btrfs and uses a fallback value. Clearly
> > > something is amiss, as other code paths clearly find the
> > > i_op->getxattr function and use it to effect. The btrfs code quite
> > > obviously includes getxattr functions.
> > >
> > > So, what is btrfs up to such that the inode ops does not include
> > > getxattr when security_d_instantiate is called?
> > >
> > > I am led to understand that SELinux has worked around this, but
> > > looking at the SELinux code I expect that there is a problem there as
> > > well. Thank you.
> >
> > kernel version(s)?
>
> 2.6.37
> 2.6.39rc4
>
> > reproducer?
>
> The MeeGo team saw the behavior first. I have been instrumenting the
> Smack code to track down what is happening. I am in the process of
> developing a Smack workaround for the btrfs behavior.

If this is for newly created files, then we initialize the in-core security label for the inode as part of the inode_init_security hook in SELinux and thus don't even try to call ->getxattr at d_instantiate time. Not sure though why it wouldn't already be set.

--
Stephen Smalley
National Security Agency
Re: Observed unexpected behavior of BTRFS in d_instantiate
On Thu, 2011-04-28 at 13:13 -0400, Stephen Smalley wrote:
> On Thu, 2011-04-28 at 10:03 -0700, Casey Schaufler wrote:
> > On 4/28/2011 6:30 AM, Stephen Smalley wrote:
> > > On Tue, 2011-04-26 at 20:15 -0700, Casey Schaufler wrote:
> > > > I have been tracking down a problem that we've been seeing with
> > > > Smack on top of btrfs and have narrowed it down to a check in
> > > > smack_d_instantiate() that checks to see if the underlying
> > > > filesystem supports extended attributes by looking at
> > > > inode->i_op->getxattr
> > > >
> > > > If the filesystem has no entry for getxattr it is assumed that it
> > > > does not support extended attributes. The Smack code clearly finds
> > > > this value to be NULL for btrfs and uses a fallback value. Clearly
> > > > something is amiss, as other code paths clearly find the
> > > > i_op->getxattr function and use it to effect. The btrfs code quite
> > > > obviously includes getxattr functions.
> > > >
> > > > So, what is btrfs up to such that the inode ops does not include
> > > > getxattr when security_d_instantiate is called?
> > > >
> > > > I am led to understand that SELinux has worked around this, but
> > > > looking at the SELinux code I expect that there is a problem there
> > > > as well. Thank you.
> > >
> > > kernel version(s)?
> >
> > 2.6.37
> > 2.6.39rc4
> >
> > > reproducer?
> >
> > The MeeGo team saw the behavior first. I have been instrumenting the
> > Smack code to track down what is happening. I am in the process of
> > developing a Smack workaround for the btrfs behavior.
>
> If this is for newly created files, then we initialize the in-core
> security label for the inode as part of the inode_init_security hook in
> SELinux and thus don't even try to call ->getxattr at d_instantiate
> time. Not sure though why it wouldn't already be set.

Actually, a quick look at the code makes it clear. btrfs_create() and friends call d_instantiate() before setting inode->i_op for new inodes. In contrast, ext[234] set the i_op before calling d_instantiate(). In any event, you don't really need to go through the slow path of calling ->getxattr for new inodes as you already know the label that is being set.

--
Stephen Smalley
National Security Agency
lockdep warnings
Running with lockdep I see these warnings (running 2.6.37-rc1). It occurred while rsync was running a backup.

Nov 14 12:03:31 nehalam kernel: [ 5527.284541] =============================================
Nov 14 12:03:31 nehalam kernel: [ 5527.284544] [ INFO: possible recursive locking detected ]
Nov 14 12:03:31 nehalam kernel: [ 5527.284546] 2.6.37-rc1+ #67
Nov 14 12:03:31 nehalam kernel: [ 5527.284547] ---------------------------------------------
Nov 14 12:03:31 nehalam kernel: [ 5527.284549] rsync/2782 is trying to acquire lock:
Nov 14 12:03:31 nehalam kernel: [ 5527.284551] (&(eb->lock)->rlock){+.+...}, at: [a005f026] btrfs_try_spin_lock+0x53/0xd1 [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284567]
Nov 14 12:03:31 nehalam kernel: [ 5527.284567] but task is already holding lock:
Nov 14 12:03:31 nehalam kernel: [ 5527.284569] (&(eb->lock)->rlock){+.+...}, at: [a005f0c6] btrfs_clear_lock_blocking+0x22/0x2c [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284581]
Nov 14 12:03:31 nehalam kernel: [ 5527.284581] other info that might help us debug this:
Nov 14 12:03:31 nehalam kernel: [ 5527.284583] 2 locks held by rsync/2782:
Nov 14 12:03:31 nehalam kernel: [ 5527.284585] #0: (&sb->s_type->i_mutex_key#13){+.+.+.}, at: [810f98be] do_lookup+0x9d/0x10d
Nov 14 12:03:31 nehalam kernel: [ 5527.284592] #1: (&(eb->lock)->rlock){+.+...}, at: [a005f0c6] btrfs_clear_lock_blocking+0x22/0x2c [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284605]
Nov 14 12:03:31 nehalam kernel: [ 5527.284605] stack backtrace:
Nov 14 12:03:31 nehalam kernel: [ 5527.284607] Pid: 2782, comm: rsync Not tainted 2.6.37-rc1+ #67
Nov 14 12:03:31 nehalam kernel: [ 5527.284609] Call Trace:
Nov 14 12:03:31 nehalam kernel: [ 5527.284615] [8106b651] __lock_acquire+0xc7a/0xcf1
Nov 14 12:03:31 nehalam kernel: [ 5527.284619] [810ba186] ? activate_page+0x130/0x13f
Nov 14 12:03:31 nehalam kernel: [ 5527.284622] [8106b799] lock_acquire+0xd1/0xf7
Nov 14 12:03:31 nehalam kernel: [ 5527.284633] [a005f026] ? btrfs_try_spin_lock+0x53/0xd1 [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284638] [813b31c7] _raw_spin_lock+0x31/0x40
Nov 14 12:03:31 nehalam kernel: [ 5527.284648] [a005f026] ? btrfs_try_spin_lock+0x53/0xd1 [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284659] [a005f0c6] ? btrfs_clear_lock_blocking+0x22/0x2c [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284669] [a005f026] btrfs_try_spin_lock+0x53/0xd1 [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284677] [a0022714] btrfs_search_slot+0x3e6/0x513 [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284687] [a003121f] btrfs_lookup_inode+0x2f/0x8f [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284698] [a003ac58] ? btrfs_init_locked_inode+0x0/0x2e [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284709] [a003e2a7] btrfs_iget+0xc3/0x415 [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284721] [a00419b9] btrfs_lookup_dentry+0x105/0x3c4 [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284724] [81069db5] ? trace_hardirqs_on+0xd/0xf
Nov 14 12:03:31 nehalam kernel: [ 5527.284735] [a0041c8e] btrfs_lookup+0x16/0x2e [btrfs]
Nov 14 12:03:31 nehalam kernel: [ 5527.284738] [810f97b5] d_alloc_and_lookup+0x55/0x74
Nov 14 12:03:31 nehalam kernel: [ 5527.284741] [810f98dc] do_lookup+0xbb/0x10d
Nov 14 12:03:31 nehalam kernel: [ 5527.284744] [810fb6a1] link_path_walk+0x2a6/0x3fc
Nov 14 12:03:31 nehalam kernel: [ 5527.284746] [810fb8f3] path_walk+0x69/0xd9
Nov 14 12:03:31 nehalam kernel: [ 5527.284750] [811e23c2] ? strncpy_from_user+0x48/0x76
Nov 14 12:03:31 nehalam kernel: [ 5527.284753] [810fba35] do_path_lookup+0x2a/0x4f
Nov 14 12:03:31 nehalam kernel: [ 5527.284756] [810fc52f] user_path_at+0x56/0x9a
Nov 14 12:03:31 nehalam kernel: [ 5527.284760] [810caa14] ? might_fault+0x5c/0xac
Nov 14 12:03:31 nehalam kernel: [ 5527.284764] [810f4701] ? cp_new_stat+0xf7/0x10d
Nov 14 12:03:31 nehalam kernel: [ 5527.284767] [810f45a2] vfs_fstatat+0x37/0x62
Nov 14 12:03:31 nehalam kernel: [ 5527.284770] [810f45eb] vfs_lstat+0x1e/0x20
Nov 14 12:03:31 nehalam kernel: [ 5527.284772] [810f4773] sys_newlstat+0x1f/0x3d
Nov 14 12:03:31 nehalam kernel: [ 5527.284776] [81069d84] ? trace_hardirqs_on_caller+0x118/0x13c
Nov 14 12:03:31 nehalam kernel: [ 5527.284779] [813b2fe0] ? trace_hardirqs_on_thunk+0x3a/0x3f
Nov 14 12:03:31 nehalam kernel: [ 5527.284783] [8100245b] system_call_fastpath+0x16/0x1b
namespace routines that could be static
I got namespace.pl working again, and it showed the following routines could be declared static.

fs/btrfs/ctree: btrfs_clear_path_blocking btrfs_insert_some_items btrfs_prev_leaf
fs/btrfs/delayed-ref: btrfs_delayed_ref_pending
fs/btrfs/dir-item: btrfs_match_dir_item_name
fs/btrfs/disk-io: btrfs_congested_async btrfs_lookup_fs_root btrfs_read_fs_root write_all_supers
fs/btrfs/extent-tree: block_rsv_release_bytes btrfs_get_block_group btrfs_init_new_buffer
fs/btrfs/extent_io: extent_bmap extent_commit_write extent_prepare_write set_range_dirty wait_extent_bit wait_on_extent_buffer_writeback wait_on_extent_writeback
fs/btrfs/file-item: btrfs_lookup_csum
fs/btrfs/free-space-cache: btrfs_block_group_free_space
fs/btrfs/inode: btrfs_orphan_del btrfs_writepages
fs/btrfs/inode-map: btrfs_find_highest_inode
fs/btrfs/ioctl: btrfs_ioctl_space_info
fs/btrfs/locking: btrfs_try_tree_lock
fs/btrfs/print-tree: btrfs_print_tree
fs/btrfs/root-tree: btrfs_search_root
fs/btrfs/struct-funcs: btrfs_device_bandwidth btrfs_device_group btrfs_device_seek_speed btrfs_device_start_offset btrfs_dir_transid btrfs_disk_block_group_chunk_objectid btrfs_disk_block_group_flags btrfs_disk_root_generation btrfs_disk_root_level btrfs_file_extent_generation btrfs_inode_transid btrfs_set_chunk_io_align btrfs_set_chunk_io_width btrfs_set_chunk_length btrfs_set_chunk_num_stripes btrfs_set_chunk_owner btrfs_set_chunk_sector_size btrfs_set_chunk_stripe_len btrfs_set_chunk_sub_stripes btrfs_set_chunk_type btrfs_set_disk_block_group_chunk_objectid btrfs_set_disk_block_group_flags btrfs_set_disk_block_group_used btrfs_set_disk_root_bytenr btrfs_set_disk_root_generation btrfs_set_disk_root_level btrfs_set_disk_root_refs btrfs_set_extent_refs_v0 btrfs_set_ref_generation_v0 btrfs_set_ref_objectid_v0 btrfs_set_ref_root_v0 btrfs_set_stripe_devid btrfs_set_stripe_offset
fs/btrfs/sysfs: btrfs_sysfs_add_root btrfs_sysfs_add_super btrfs_sysfs_del_root btrfs_sysfs_del_super
fs/btrfs/tree-log: btrfs_log_inode_parent
fs/btrfs/volumes: btrfs_add_device btrfs_alloc_dev_extent btrfs_lock_volumes btrfs_read_super_device btrfs_unlock_volumes btrfs_unplug_page
Quota Support
Sorry about emailing the list about this, but after doing some Googling I can't seem to find the answer. I'm just wondering if subvolumes or snapshots can have quotas imposed on them. The wiki says that: "Subvolumes can be given a quota of blocks, and once this quota is reached no new writes are allowed." Although many posts to this mailing list and on other mailing lists seem to indicate that quotas are not implemented, I'm just wondering if someone can clear this up. Thanks Steve
Re: Btrfs trees for linux-next
On Wed, 10 Dec 2008 20:06:04 -0800 Andrew Morton [EMAIL PROTECTED] wrote: I'd prefer that it go into linux-next in the usual fashion. But the first step is review.. OK, I wasn't sure where it was up to (not being a file system person). -- Cheers, Stephen Rothwell [EMAIL PROTECTED] http://www.canb.auug.org.au/~sfr/
Re: Raid1 with failing drive
On Wed, 29 Oct 2008 14:02:04 -0600 Joe Peterson [EMAIL PROTECTED] wrote: Chris Mason wrote: On Tue, 2008-10-28 at 16:48 -0700, Stephen Hemminger wrote:

I have a system with a pair of small/fast but unreliable SCSI drives. I tried setting up a raid1 configuration and using it for builds, using 2.6.26.7 and btrfs 0.16. When using ext3 (no raid) on the same partition, the driver would recalibrate, log something, and keep going. But with btrfs it doesn't recover and takes the drive offline.

Btrfs doesn't really take drives offline. In the future we'll notice that a drive is returning all errors, but for now we'll probably just keep beating on it.

It can also detect when a bad checksum is returned or the drive returns an I/O error, right? Would the all-zero test be a heuristic in case neither of those happened (but I cannot imagine why the zeros would get by the checksum check)?

The IO error handling code in btrfs currently expects it'll be able to find at least one good mirror. You're probably hitting some bad conditions as it fails to clean up.

What happens (or rather, will happen) on a regular/non-mirrored btrfs? Would it then return an I/O error to the user and/or mark a block as bad? In ZFS, the state of the volume changes, noting an issue (this also happens on a scrub), and the user can check this. What I don't like about ZFS is that the user can clear the condition, and then it appears OK again until another scrub. -Joe

I think my problem was that the metadata was mirrored but not the actual data. This led to total meltdown when the data got an error.
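Stephen's closing diagnosis matches a long-standing mkfs.btrfs default: on multiple devices, metadata is mirrored but data defaults to striping unless a data profile is requested explicitly. A sketch of the invocation that mirrors both (device names are placeholders, and the second command assumes the filesystem is mounted at /mnt):

```shell
# Mirror BOTH metadata (-m) and data (-d) across the two devices.
# Without "-d raid1", data defaulted to raid0/single, so a bad sector
# in file data had no second copy to fall back on -- only metadata did.
mkfs.btrfs -m raid1 -d raid1 /dev/sda2 /dev/sdb2

# After mounting, confirm which profiles are actually in use:
btrfs filesystem df /mnt
```

With both profiles at raid1, a checksum failure on one device can be repaired transparently from the surviving copy.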
Re: btrfs_tree_lock trylock
On Mon, 08 Sep 2008 12:20:32 -0400 Chris Mason [EMAIL PROTECTED] wrote: On Mon, 2008-09-08 at 12:13 -0400, jim owens wrote: Chris Mason wrote:

My guess is that the improvement happens mostly from the first couple of tries, not from repeated spinning. And since it is a mutex, you could even do:

I started with lower spin counts. I really didn't want to spin at all, but the current values came from trial and error.

Exactly the problem Steven describes with adaptive locking. Using benchmarks (or any test) on a small sample of systems leads you to conclude this design/tuning combination is better. I've been burned repeatedly by that... ugly things happen as you move away from your design testing center. I'm not saying your code does not work, just that we need a lot more proof with different configurations and loads to see that it is at least no worse.

Oh, I completely agree. This is tuned on just one CPU in a handful of workloads. In general, it makes sense to spin for about as long as it takes someone to do a btree search in the block, which we could benchmark up front at mount time. I could also get better results from an API where the holder of the lock indicates it is going to hold on to things for a while, which might happen right before doing an IO. Over the long term these are important issues, but for today I'm focused on the disk format ;) -chris

Not to mention the problem that developers seem to have faster machines than the average user, but slower than enterprise and future-generation CPUs. So any tuning value seems to get out of date fast.
Re: btrfs day 1
On Thu, 14 Aug 2008 14:21:22 -0400 Chris Mason [EMAIL PROTECTED] wrote: On Thu, 2008-08-14 at 11:06 -0700, Stephen Hemminger wrote: So, the question is why the kernel compile workload works for me. What kind of hardware are you running (ram, cpu, disks?) Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz Memory 2G Disk 80G (partition was 20G) It seems you have the secret to corrupting things. I'll try to reproduce with smaller partitions and less ram here. -chris Actually, the partition that got corrupted was 60G