Re: [RFC] Preliminary BTRFS Encryption
On Fri, 16 Sep 2016 11:12:13 +1000 Dave Chinner wrote:
> > As of now these patch set supports encryption on per subvolume, as
> > managing properties on per subvolume is a kind of core to btrfs, which is
> > easier for data center solution-ing, seamlessly persistent and easy to
> > manage.
>
> We've got dmcrypt for this sort of transparent "device level"
> encryption. Do we really need another btrfs layer that re-implements
> generic, robust, widely deployed, stable functionality?

"Btrfs subvolume-level" is far from "device-level": subvolumes are so lightweight and dynamic that they are akin to regular directories for most intents and purposes, not devices or partitions.

And yes, I'd say (effectively) directory-level encryption in a filesystem can be useful; for example, encrypting /home but not the rest of the filesystem, or any other scenario where only some of the stored data needs to be encrypted and it's not known in advance what proportion, so no static partition- or LVM-based bounds are convenient.

Currently this can be achieved with tools like encfs or ecryptfs -- so it's those you'd want to measure Btrfs encryption against, not dmcrypt.

--
With respect,
Roman
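For context, the kind of directory-level encryption being contrasted here can already be set up with ecryptfs, roughly like this (a minimal sketch only; the mount point and option values are illustrative and may differ between versions):

  # encrypt only /home, leaving the rest of the filesystem in the clear
  $ sudo mount -t ecryptfs /home /home -o ecryptfs_cipher=aes,ecryptfs_key_bytes=32,key=passphrase

  # or let the ecryptfs-utils helper set up a per-user encrypted directory
  $ ecryptfs-setup-private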
Re: df -i shows 0 inodes 0 used 0 free on 4.4.0-36-generic Ubuntu 14 - Bug or not?
GWB posted on Thu, 15 Sep 2016 18:58:24 -0500 as excerpted:

> I don't expect accurate data on a btrfs file system when using df, but
> after upgrading to kernel 4.4.0 I get the following:
>
> $ df -i ...
> /dev/sdc3 0 0 0 - /home
> /dev/sdc4 0 0 0 - /vm0 ...
>
> Where /dev/sdc3 and /dev/sdc4 are btrfs filesystems.
>
> So is this a bug or not?

Not a bug.

Btrfs uses inodes, but unlike ext*, it creates them dynamically as-needed, so showing inodes used vs. free simply makes no sense in btrfs context.

Now btrfs /does/ track data and metadata separately, creating chunks of each type, and it /is/ possible to have all otherwise free space already allocated to chunks of one type or the other, and then run out of space in the one type of chunk while there's plenty of space in the other type of chunk. But that's quite a different concept, and btrfs fi usage (tho your v3.12 btrfs-progs will be too old for usage), or btrfs fi df coupled with btrfs fi show (the old way to get the same info), gives the information for that.

And in fact, the btrfs fi show for vm0 says 374.66 GiB size and used, so indeed, all space on that one is allocated. Unfortunately you don't post the btrfs fi df for that one, so we can't tell where all that allocated space is going and whether it's actually used, but it's all allocated. You probably want to run a balance to get back some unallocated space.

Meanwhile, your kernel is 4.4.x LTS series so not bad there, but your userspace is extremely old, 3.12, making support a bit hard as some of the commands have changed (btrfs fi usage, for one, and I think the checker was still btrfsck in 3.12, while in current btrfs-progs, it's btrfs check). I'd suggest updating that to at least something around the 4.4 level to match the kernel, tho you can upgrade to the latest 4.7.2 (don't try 4.6 or 4.7 previous to 4.7.2, or don't btrfs check --repair if you do, as there's a bug with it in those versions that's fixed in 4.7.2) if you like, as newer userspace is designed to work with older kernels as well.

Besides which, while old btrfs userspace isn't a big deal (other than translating back and forth between old-style and new-style commands) when your filesystems are running pretty much correctly, as in that case all userspace is doing in most cases is calling the kernel to do the real work anyway, it becomes a much bigger deal when something goes wrong, because it's userspace code that's executing with btrfs check or btrfs restore, and newer userspace knows about and can fix a LOT more problems than the really ancient 3.12.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
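To illustrate the allocation check and recovery Duncan describes, the relevant commands look roughly like this (mount point taken from the original post; the usage filter value is only a starting point and may need tuning):

  # new-style overview of allocated vs. used space (needs reasonably recent btrfs-progs)
  $ sudo btrfs filesystem usage /vm0

  # old-style equivalent that works with older progs
  $ sudo btrfs filesystem df /vm0
  $ sudo btrfs filesystem show /vm0

  # reclaim mostly-empty data chunks so their space becomes unallocated again
  $ sudo btrfs balance start -dusage=50 /vm0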
Re: Thoughts on btrfs RAID-1 for cold storage/archive?
E V posted on Thu, 15 Sep 2016 11:48:13 -0400 as excerpted:

> I'm investigating using btrfs for archiving old data and offsite
> storage, essentially put 2 drives in btrfs RAID-1, copy the data to the
> filesystem and then unmount, remove a drive and take it to an offsite
> location. Remount the other drive -o ro,degraded until my system's slots
> fill up, then remove the local drive and put it on a shelf. I'd verify
> the file md5sums after data is written to the drive for peace of mind,
> but maybe a btrfs scrub would give the same assurances. Seem
> straightforward? Anything to look out for? Long term format stability
> seems good, right? Also, I like the idea of being able to pull the
> offsite drive back and scrub if the local drive ever has problems, a
> nice extra peace of mind we wouldn't get with ext4. Currently using the
> 4.1.32 kernel since the driver for the r750 card in our 45 drives system
> only supports up to 4.3 ATM.

As described, I believe it should work fine.

Btrfs raid1 isn't like normal raid1 in some ways, and in particular isn't designed to be mounted degraded and writable long term, only temporarily, in order to replace a bad device. As that's what I thought you were going to propose when I read the subject line, I was all ready to tell you no, don't try it and expect it to work. But of course you had something different in mind: only read-only mounting of the degraded raid1 (unless needed for scrub, etc.), not mounting it writable. As long as you are careful to do just that, only mount it read-only, you should be fine.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
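A rough sketch of the archive-and-verify cycle discussed above (device name and mount point are hypothetical):

  # after taking one member offsite, mount the remaining disk read-only
  $ sudo mount -o ro,degraded /dev/sdb /mnt/archive

  # read-only scrub: verifies data and metadata checksums without writing anything
  $ sudo btrfs scrub start -B -d -r /mnt/archive
  $ sudo btrfs scrub status -d /mnt/archive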
Re: unable to handle kernel paging request
Mark Gavalda posted on Thu, 15 Sep 2016 22:12:57 +0200 as excerpted:

[Moved to bottom to retain quote/reply order.]

> On Thu, Sep 15, 2016 at 6:05 PM, Chris Mason wrote:
>> On 09/15/2016 10:08 AM, Mark Gavalda wrote:
>>>
>>> Hi,
>>>
>>> Bumped into the following one today; kernel 4.4.0-36-generic Ubuntu
>>> 16.04.1; CPU went to 100% and only a hard restart solved the issue.
>>> Since then everything's back to normal.
>>>
>>> Please let me know how can I help get to the bottom of this?
>>
>> I saw similar traces when tracking down this bug:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/?h=for-linus-4.8&id=cbd60aa7cd17d81a434234268c55192862147439
>>
>> It's flagged for stable, so you'll get it with the next stable update,
>> or you can apply it by hand and rebuild.
>
> Thanks, I can see it included in 4.8-rc6 but not the other branches.
> Will it get pulled later or is this a 4.8 only fix?

Flagged for stable means it's headed to the maintained stable branches (well, the ones to which the fix applies for regression fixes, but definitely the 4.4 LTS series in this case, since Chris Mason indicated it should apply in your case), not just current. But stabilization policy says a patch must hit mainline current first, before it is eligible for older stable as well. So it would be /expected/ to hit 4.8-rc, current development, first.

After that, given that it's already flagged for stable, it should eventually hit all the stable kernels to which it applies as well. That can be right away, but if the stable maintainer (Greg K-H, normally) is backlogged due to just getting back from vacation or something, as sometimes happens, it can take a few weeks to work thru the backlog, so it can take a bit, as well.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
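If you don't want to wait for the stable update, "apply it by hand and rebuild" amounts to roughly the following (the commit ID is taken from the URL above; the source tree path and the use of cgit's patch endpoint are illustrative):

  $ cd ~/src/linux-stable            # a 4.4.y checkout, for example
  $ wget -O fix.patch 'https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/patch/?id=cbd60aa7cd17d81a434234268c55192862147439'
  $ git am fix.patch                 # or: patch -p1 < fix.patch
  # then rebuild and install the kernel as usual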
Re: [RFC] Preliminary BTRFS Encryption
On Tue, Sep 13, 2016 at 09:39:46PM +0800, Anand Jain wrote:
> This patchset adds btrfs encryption support.
>
> The main objective of this series is to have bugs fixed and stability.
> I have verified with fstests to confirm that there is no regression.
>
> A design write-up is coming next, however here below is the quick example
> on the cli usage. Please try out, let me know if I have missed something.

Yup; best practices say "do not roll your own encryption infrastructure". This is just my 2c worth - take it or leave it, don't bother flaming. Keep in mind that I'm not picking on btrfs here - I asked similar hard questions about the proposed f2fs encryption implementation. That was a "copy and snowflake" version of the ext4 encryption code - they made changes and now we have generic code and common functionality between ext4 and f2fs.

> Also would like to mention that a review from the security experts is due,
> which is important and I believe those review comments can be accommodated
> without major changes from here.

That's a fairly significant red flag to me - security reviews need to be done at the design phase against specific threat models - security review is not a code/implementation review... The ext4 developers got this right by publishing threat models and design docs, which got quite a lot of review and feedback before code was published for review.

https://docs.google.com/document/d/1ft26lUQyuSpiu6VleP70_npaWdRfXFoNnB8JYnykNTg/edit#heading=h.qmnirp22ipew

[small reorder of comments]

> As of now these patch set supports encryption on per subvolume, as
> managing properties on per subvolume is a kind of core to btrfs, which is
> easier for data center solution-ing, seamlessly persistent and easy to
> manage.

We've got dmcrypt for this sort of transparent "device level" encryption. Do we really need another btrfs layer that re-implements generic, robust, widely deployed, stable functionality?

What concerns me the most here is that it seems like nothing has been learnt from the btrfs RAID5/6 debacle. i.e. the btrfs reimplementation of existing robust, stable, widely deployed infrastructure was fatally flawed, and despite regular corruption reports they were ignored for, what, 2 years? And then a /user/ spent the time to isolate the problem, and now several months later it still hasn't been fixed. I haven't seen any developer interest in fixing it, either. This meets the definition of unmaintained software, and it sets a poor example for how complex new btrfs features might be maintained in the long term. Encryption simply cannot be treated like this - it has to be right, and it has to be well maintained.

So what is being done differently to the RAID5/6 review process this time that will make the new btrfs-specific encryption implementation solid and have minimal risk of zero-day fatal flaws? And how are you going to guarantee that it will be adequately maintained several years down the track?

> Also yes, thanks for the emails, I hear, per file encryption and inline
> with vfs layer is also important, which is wip among other things in the
> list.

The generic file encryption code is solid, reviewed, tested and already widely deployed via two separate filesystems. There is a much wider pool of developers who will maintain it, review changes and know all the traps that a new implementation might fall into.
There's a much bigger safety net here, which significantly lowers the risk of zero-day fatal flaws in a new implementation and of flaws in future modifications and enhancements.

Hence, IMO, the first thing to do is implement and make the generic file encryption support solid and robust, not tack it on as an afterthought for the magic btrfs encryption pixies to take care of.

Indeed, with the generic file encryption, btrfs may not even need the special subvolume encryption pixies. i.e. you can effectively implement subvolume encryption via configuration of a multi-user encryption key for each subvolume and apply it to the subvolume tree root at creation time. Then only users with permission to unlock the subvolume key can access it.

Once the generic file encryption is solid and fulfils the needs of most users, then you can look to solving the less common threat models that neither dmcrypt nor per-file encryption address. Only if the generic code cannot be expanded to address specific threat models should you then implement something that is unique to btrfs.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
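For comparison, the generic per-file encryption Dave refers to is driven from userspace on ext4 roughly as follows (a sketch using e4crypt from e2fsprogs; the key descriptor value is made up, and exact steps vary with kernel and progs versions):

  # enable the encryption feature on the filesystem (kernel 4.1+, recent e2fsprogs)
  $ sudo tune2fs -O encrypt /dev/sdXn

  # add a passphrase-derived key to the session keyring; this prints a key descriptor
  $ e4crypt add_key

  # mark a directory so everything created under it is encrypted with that key
  $ mkdir ~/private
  $ e4crypt set_policy 0123456789abcdef ~/private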
df -i shows 0 inodes 0 used 0 free on 4.4.0-36-generic Ubuntu 14 - Bug or not?
I don't expect accurate data on a btrfs file system when using df, but after upgrading to kernel 4.4.0 I get the following:

$ df -i
...
/dev/sdc3   0   0   0   -   /home
/dev/sdc4   0   0   0   -   /vm0
...

Where /dev/sdc3 and /dev/sdc4 are btrfs filesystems.

So is this a bug or not? I ask because root (/dev/sdc3) began to display the error message "no space left on device", which was eventually cured by deleting old snapshots, then btrfs fi sync and btrfs balance. fi show and fi df show space, even when df -i shows 0 inodes:

sudo btrfs fi show /
...
Label: none  uuid: 9acdb642-d743-4c2a-a59f-811022c2a2b0
        Total devices 1 FS bytes used 23.86GiB
        devid 1 size 60.00GiB used 42.03GiB path /dev/sdc3

sudo btrfs fi df /
Data, single: total=37.00GiB, used=21.11GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=5.00GiB, used=2.75GiB
unknown, single: total=512.00MiB, used=0.00

Please excuse my inexperience here; I don't know how to use btrfs commands to show inodes. btrfs inspect-internal will reference an inode as near as I can tell, but won't list the used and free inodes ("free" may not be the correct word here, given btrfs architecture).

Many Thanks,
Gordon

Machine Specs:

% uname -a
Linux Bon-E 4.4.0-36-generic #55~14.04.1-Ubuntu SMP Fri Aug 12 11:49:30 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

% btrfs --version
Btrfs v3.12

% sudo btrfs fi show
Label: none  uuid: 9acdb642-d743-4c2a-a59f-811022c2a2b0
        Total devices 1 FS bytes used 23.86GiB
        devid 1 size 60.00GiB used 42.03GiB path /dev/sdc3

Label: vm0  uuid: 72539416-d30e-4a34-8b2d-b2369d1fb075
        Total devices 1 FS bytes used 349.96GiB
        devid 1 size 374.66GiB used 374.66GiB path /dev/sdc4

dmesg does not appear to show anything useful for btrfs or the device, but mount shows:

% mount | grep btrfs
/dev/sdc3 on / type btrfs (rw,ssd,subvol=@)
/dev/sdc3 on /home type btrfs (rw,ssd,subvol=@home)
/dev/sdc4 on /vm0 type btrfs (rw,ssd,space_cache,compress=lzo,subvol=@vm0)
/dev/sdc3 on /mnt/btrfs-root type btrfs (rw)
Re: multi-device btrfs with single data mode and disk failure
On Thu, Sep 15, 2016 at 3:48 PM, Alexandre Pouxwrote: > > Le 15/09/2016 à 18:54, Chris Murphy a écrit : >> On Thu, Sep 15, 2016 at 10:30 AM, Alexandre Poux wrote: >>> Thank you very much for your answers >>> >>> Le 15/09/2016 à 17:38, Chris Murphy a écrit : On Thu, Sep 15, 2016 at 1:44 AM, Alexandre Poux wrote: > Is it possible to do some king of a "btrfs delete missing" on this > kind of setup, in order to recover access in rw to my other data, or > I must copy all my data on a new partition That *should* work :) Except that your file system with 6 drives is too full to be shrunk to 5 drives. Btrfs will either refuse, or get confused, about how to shrink a nearly full 6 drive volume into 5. So you'll have to do one of three things: 1. Add a 2+TB drive, then remove the missing one; OR 2. btrfs replace is faster and is raid10 reliable; OR 3. Read only scrub to get a file listing of bad files, then remount read-write degraded and delete them all. Now you maybe can do a device delete missing. But it's still a tight fit, it basically has to balance things out to get it to fit on an odd number of drives, it may actually not work even though there seems to be enough total space, there has to be enough space on FOUR drives. >>> Are you sure you are talking about data in single mode ? >>> I don't understand why you are talking about raid10, >>> or the fact that it will have to rebalance everything. >> Yeah sorry I got confused in that very last sentence. Single, it will >> find space in 1GiB increments. Of course this fails because that data >> doesn't exist anymore, but to start the operation it needs to be >> possible. > No problem >>> Moreover, even in degraded mode I cannot mount it in rw >>> It tells me >>> "too many missing devices, writeable remount is not allowed" >>> due to the fact I'm in single mode. >> Oh you're in that trap. Well now you're stuck. I've had the case where >> I could mount read write degraded with metadata raid1 and data single, >> but it was good for only one mount and then I get the same message you >> get and it was only possible to mount read only. At that point it's >> totally suck unless you're adept at manipulating the file system with >> a hex editor... >> >> Someone might have a patch somewhere that drops this check and lets >> too many missing devices to mount anyway... I seem to recall this. >> It'd be in the archives if it exists. >> >> >> >>> And as far as as know, btrfs replace and btrfs delete, are not supposed >>> to work in read only... >> It doesn't. Must be read write mounted. >> >> >>> I would like to tell him forgot about the missing data, and give me back >>> my partition. >> This feature doesn't exist yet. I really want to see this, it'd be >> great for ceph and gluster if the volume could lose a drive, report >> all the missing files to the cluster file system, delete the device >> and the file references, and then the cluster knows that brick doesn't >> have those files and can replicate them somewhere else or even back to >> the brick that had them. >> > So I found this patch : https://patchwork.kernel.org/patch/7014141/ > > Does this seems ok ? No idea I haven't tried it. > > So after patching my kernel with it, > I should be able to mount in rw my partition, and thus, > I will be able to do a btrfs delete missing > Which will just forgot about the old disk and everything should be fine > afterward ? 
It will forget about the old disk but it will try to migrate all metadata and data that was on that disk to the remaining drives; so until you delete all files that are corrupt, you'll continue to get corruption messages about them. > > Is this risky ? or not so much ? Probably. If you care about the data, mount read only, back up what you can, then see if you can fix it after that. > The scrubing is almost finished, and as I was expecting, I lost no data > at all. Well I'd guess the device delete should work then, but I still have no idea if that patch will let you mount it degraded read-write. Worth a shot though, it'll save time. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
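Assuming the writeable-degraded patch applies and the scrub really does come back clean, the recovery path under discussion would look roughly like this (device and mount point are illustrative, and the degraded read-write mount only works with the patched kernel):

  # with the patched kernel, mount the filesystem degraded and writeable
  $ sudo mount -o degraded /dev/sdb /mnt/data

  # drop the dead device; chunks that existed only on it are gone,
  # everything else is left on or migrated to the remaining drives
  $ sudo btrfs device delete missing /mnt/data

  # then verify the result
  $ sudo btrfs filesystem show /mnt/data
  $ sudo btrfs scrub start -B /mnt/data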
Re: Is stability a joke? (wiki updated)
On Thu, 2016-09-15 at 14:20 -0400, Austin S. Hemmelgarn wrote:
> 3. Fsck should be needed only for un-mountable filesystems. Ideally, we
> should be handling things like Windows does. Perform slightly better
> checking when reading data, and if we see an error, flag the filesystem
> for expensive repair on the next mount.

That philosophy also has some drawbacks:

- The user doesn't directly notice that anything went wrong. Thus errors may even continue to accumulate and get much worse than if the fs had immediately gone ro, giving the user the chance to manually intervene (possibly then with help from upstream).

- Any smart auto-magical™ repair may also just fail (and make things worse, as the current --repair e.g. may). Not performing such auto-repair gives the user at least the possible chance to make a bitwise copy of the whole fs before trying any rescue operations. This wouldn't be the case if the user never noticed that something happened, and the fs tried to repair things right at mounting.

So I think any such auto-repair should be used with extreme caution and only in those cases where one is absolutely 100% sure that the action will help and just do good.

Cheers,
Chris.
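The "bitwise copy before any rescue attempt" mentioned above can be as simple as the following (output paths and device are examples; the raw image needs as much free space as the source device is large):

  # raw copy of the whole block device, tolerating read errors
  $ sudo dd if=/dev/sdc3 of=/backup/sdc3.img bs=1M conv=noerror,sync status=progress

  # or a compressed metadata-only dump, useful for bug reports and offline debugging
  $ sudo btrfs-image -c9 /dev/sdc3 /backup/sdc3.btrfs-image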
Re: Size of scrubbed Data
On Thu, Sep 15, 2016 at 9:48 AM, Stefan Malte Schumacher wrote:
>
> btrfs --version
> btrfs-progs v4.7.1

Upgrade to 4.7.2 or downgrade to 4.6.1 before using btrfs check; see the changelog for details. I'm not recommending that you use btrfs check, just saying this version of the tools is not reliable for some file systems.

> btrfs fi df /mnt/btrfs-raid/
> Data, RAID1: total=6.17TiB, used=6.05TiB
> System, RAID1: total=32.00MiB, used=916.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, RAID1: total=10.00GiB, used=8.14GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B

The 3rd line is possibly rather dangerous, in that it might mean there's some tiny amount of system data that has only one copy on one drive. And since it's a system chunk, if it's true that there's only one copy and it were damaged or lost, it'd take out the whole volume.

btrfs balance start -mconvert=raid1,soft

See what that gets you, and then recheck with btrfs fi df, or better, use btrfs fi us.

Unfortunately I don't have an answer for the original question.

-- 
Chris Murphy
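Spelled out against the mount point from the quoted output, the suggested conversion and follow-up check might look like this (the "soft" filter skips chunks that already have the target profile):

  $ sudo btrfs balance start -mconvert=raid1,soft /mnt/btrfs-raid
  $ sudo btrfs fi df /mnt/btrfs-raid      # the "System, single" line should now be gone
  $ sudo btrfs fi usage /mnt/btrfs-raid   # newer progs give a fuller per-device breakdown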
Re: Size of scrubbed Data
Hi sash How do I move the single system data to raid1? Dmesg doesnt show any scrubbing errors, according to Smart all the disks are okay. I am not using any any compression. How would I change freespacecache to v2 and what benefits would it entail? I think I need to add the following to my original question: How is "total bytes scrubbed" calculated? On a Raid1, shouldnt it be exactly two times the actual data size on the disk? Yours Stefan 2016-09-15 22:11 GMT+02:00: > Hi Stefan, > > 1st you should run an balance on system data to move the single data to > raid1. imho. > then do the scrub again. > btw are there any scrubbing errors in dmesg? disks are ok?! any > compression involved? changed freespacecache to v2? > > > sash > > > > Am 15.09.2016 um 17:48 schrieb Stefan Malte Schumacher: >> Hello >> >> >> I have encountered a very strange phenomenon while using btrfs-scrub. >> I believe it may be a result of replacing my old installation of >> Debian Jessie with Debian Stretch, resulting in a Kernel Switch from >> 3.16+63 to 4.6.0-1. I scrub my filesystem once a month and let anacron >> send me the results. My filesystem, consisting of four four-gigabyte >> drives with both data and metadata as RAID1 was reported as containing >> nearly 12TiB of data in scrubs done in May, June, July and August. But >> then it changed and suddenly shows only 9TiB in size, despite the fact >> that I did not delete any large files. If I remember correctly my >> switch from Debian Jessie to Stretch was around that time period. >> Could someone explain this behavior to me? Was a new way of >> calculating the size of scrubed data introduced? How can I check if I >> have lost data? I have a backup, but only one generation and rsync >> will by now have deleted files on the NAS, which might have lost on >> the fileserver. According to the long and short self-tests, which I >> run with smartmontools my drives are alright. How do I proceed? 
>> >> >> >> Yours >> >> Stefan >> >> uname -a >> Linux mars 4.6.0-1-amd64 #1 SMP Debian 4.6.4-1 (2016-07-18) x86_64 GNU/Linux >> >> btrfs --version >> btrfs-progs v4.7.1 >> >> btrfs fi show >> Label: none uuid: 8c668854-db5d-45a7-875d-43c4e82a829e >> Total devices 4 FS bytes used 6.06TiB >> devid1 size 3.64TiB used 3.09TiB path /dev/sde >> devid2 size 3.64TiB used 3.09TiB path /dev/sdc >> devid3 size 3.64TiB used 3.09TiB path /dev/sdd >> devid4 size 3.64TiB used 3.09TiB path /dev/sda >> >> >> btrfs fi df /mnt/btrfs-raid/ >> Data, RAID1: total=6.17TiB, used=6.05TiB >> System, RAID1: total=32.00MiB, used=916.00KiB >> System, single: total=4.00MiB, used=0.00B >> Metadata, RAID1: total=10.00GiB, used=8.14GiB >> GlobalReserve, single: total=512.00MiB, used=0.00B >> >> Maybe this is also of use in identifying the problem: >> grep btrfs * >> grep: apt: Ist ein Verzeichnis >> grep: cups: Ist ein Verzeichnis >> dpkg.log:2016-09-03 15:20:16 upgrade btrfs-progs:amd64 4.7-1 4.7.1-1 >> dpkg.log:2016-09-03 15:20:16 status triggers-awaited btrfs-progs:amd64 4.7-1 >> dpkg.log:2016-09-03 15:20:16 status half-configured btrfs-progs:amd64 4.7-1 >> dpkg.log:2016-09-03 15:20:16 status unpacked btrfs-progs:amd64 4.7-1 >> dpkg.log:2016-09-03 15:20:16 status half-installed btrfs-progs:amd64 4.7-1 >> dpkg.log:2016-09-03 15:20:16 status half-installed btrfs-progs:amd64 4.7-1 >> dpkg.log:2016-09-03 15:20:17 status unpacked btrfs-progs:amd64 4.7.1-1 >> dpkg.log:2016-09-03 15:20:17 status unpacked btrfs-progs:amd64 4.7.1-1 >> dpkg.log:2016-09-03 15:20:45 configure btrfs-progs:amd64 4.7.1-1 >> dpkg.log:2016-09-03 15:20:45 status unpacked btrfs-progs:amd64 4.7.1-1 >> dpkg.log:2016-09-03 15:20:45 status unpacked btrfs-progs:amd64 4.7.1-1 >> dpkg.log:2016-09-03 15:20:45 status half-configured btrfs-progs:amd64 4.7.1-1 >> dpkg.log:2016-09-03 15:20:46 status triggers-awaited btrfs-progs:amd64 >> 4.7.1-1 >> dpkg.log:2016-09-03 15:20:51 status installed btrfs-progs:amd64 4.7.1-1 >> dpkg.log.1:2016-08-10 16:58:23 upgrade btrfs-progs:amd64 4.5.2-1 4.6.1-1 >> dpkg.log.1:2016-08-10 16:58:23 status triggers-awaited btrfs-progs:amd64 >> 4.5.2-1 >> dpkg.log.1:2016-08-10 16:58:23 status half-configured btrfs-progs:amd64 >> 4.5.2-1 >> dpkg.log.1:2016-08-10 16:58:23 status unpacked btrfs-progs:amd64 4.5.2-1 >> dpkg.log.1:2016-08-10 16:58:23 status half-installed btrfs-progs:amd64 >> 4.5.2-1 >> dpkg.log.1:2016-08-10 16:58:24 status half-installed btrfs-progs:amd64 >> 4.5.2-1 >> dpkg.log.1:2016-08-10 16:58:24 status unpacked btrfs-progs:amd64 4.6.1-1 >> dpkg.log.1:2016-08-10 16:58:24 status unpacked btrfs-progs:amd64 4.6.1-1 >> dpkg.log.1:2016-08-10 17:01:25 configure btrfs-progs:amd64 4.6.1-1 >> dpkg.log.1:2016-08-10 17:01:25 status unpacked btrfs-progs:amd64 4.6.1-1 >> dpkg.log.1:2016-08-10 17:01:26 status unpacked btrfs-progs:amd64 4.6.1-1 >> dpkg.log.1:2016-08-10 17:01:26 status half-configured btrfs-progs:amd64 >> 4.6.1-1 >> dpkg.log.1:2016-08-10
Re: Size of scrubbed Data
Hi Stefan, 1st you should run an balance on system data to move the single data to raid1. imho. then do the scrub again. btw are there any scrubbing errors in dmesg? disks are ok?! any compression involved? changed freespacecache to v2? sash Am 15.09.2016 um 17:48 schrieb Stefan Malte Schumacher: > Hello > > > I have encountered a very strange phenomenon while using btrfs-scrub. > I believe it may be a result of replacing my old installation of > Debian Jessie with Debian Stretch, resulting in a Kernel Switch from > 3.16+63 to 4.6.0-1. I scrub my filesystem once a month and let anacron > send me the results. My filesystem, consisting of four four-gigabyte > drives with both data and metadata as RAID1 was reported as containing > nearly 12TiB of data in scrubs done in May, June, July and August. But > then it changed and suddenly shows only 9TiB in size, despite the fact > that I did not delete any large files. If I remember correctly my > switch from Debian Jessie to Stretch was around that time period. > Could someone explain this behavior to me? Was a new way of > calculating the size of scrubed data introduced? How can I check if I > have lost data? I have a backup, but only one generation and rsync > will by now have deleted files on the NAS, which might have lost on > the fileserver. According to the long and short self-tests, which I > run with smartmontools my drives are alright. How do I proceed? > > > > Yours > > Stefan > > uname -a > Linux mars 4.6.0-1-amd64 #1 SMP Debian 4.6.4-1 (2016-07-18) x86_64 GNU/Linux > > btrfs --version > btrfs-progs v4.7.1 > > btrfs fi show > Label: none uuid: 8c668854-db5d-45a7-875d-43c4e82a829e > Total devices 4 FS bytes used 6.06TiB > devid1 size 3.64TiB used 3.09TiB path /dev/sde > devid2 size 3.64TiB used 3.09TiB path /dev/sdc > devid3 size 3.64TiB used 3.09TiB path /dev/sdd > devid4 size 3.64TiB used 3.09TiB path /dev/sda > > > btrfs fi df /mnt/btrfs-raid/ > Data, RAID1: total=6.17TiB, used=6.05TiB > System, RAID1: total=32.00MiB, used=916.00KiB > System, single: total=4.00MiB, used=0.00B > Metadata, RAID1: total=10.00GiB, used=8.14GiB > GlobalReserve, single: total=512.00MiB, used=0.00B > > Maybe this is also of use in identifying the problem: > grep btrfs * > grep: apt: Ist ein Verzeichnis > grep: cups: Ist ein Verzeichnis > dpkg.log:2016-09-03 15:20:16 upgrade btrfs-progs:amd64 4.7-1 4.7.1-1 > dpkg.log:2016-09-03 15:20:16 status triggers-awaited btrfs-progs:amd64 4.7-1 > dpkg.log:2016-09-03 15:20:16 status half-configured btrfs-progs:amd64 4.7-1 > dpkg.log:2016-09-03 15:20:16 status unpacked btrfs-progs:amd64 4.7-1 > dpkg.log:2016-09-03 15:20:16 status half-installed btrfs-progs:amd64 4.7-1 > dpkg.log:2016-09-03 15:20:16 status half-installed btrfs-progs:amd64 4.7-1 > dpkg.log:2016-09-03 15:20:17 status unpacked btrfs-progs:amd64 4.7.1-1 > dpkg.log:2016-09-03 15:20:17 status unpacked btrfs-progs:amd64 4.7.1-1 > dpkg.log:2016-09-03 15:20:45 configure btrfs-progs:amd64 4.7.1-1 > dpkg.log:2016-09-03 15:20:45 status unpacked btrfs-progs:amd64 4.7.1-1 > dpkg.log:2016-09-03 15:20:45 status unpacked btrfs-progs:amd64 4.7.1-1 > dpkg.log:2016-09-03 15:20:45 status half-configured btrfs-progs:amd64 4.7.1-1 > dpkg.log:2016-09-03 15:20:46 status triggers-awaited btrfs-progs:amd64 4.7.1-1 > dpkg.log:2016-09-03 15:20:51 status installed btrfs-progs:amd64 4.7.1-1 > dpkg.log.1:2016-08-10 16:58:23 upgrade btrfs-progs:amd64 4.5.2-1 4.6.1-1 > dpkg.log.1:2016-08-10 16:58:23 status triggers-awaited btrfs-progs:amd64 > 4.5.2-1 > dpkg.log.1:2016-08-10 16:58:23 status 
half-configured btrfs-progs:amd64 > 4.5.2-1 > dpkg.log.1:2016-08-10 16:58:23 status unpacked btrfs-progs:amd64 4.5.2-1 > dpkg.log.1:2016-08-10 16:58:23 status half-installed btrfs-progs:amd64 4.5.2-1 > dpkg.log.1:2016-08-10 16:58:24 status half-installed btrfs-progs:amd64 4.5.2-1 > dpkg.log.1:2016-08-10 16:58:24 status unpacked btrfs-progs:amd64 4.6.1-1 > dpkg.log.1:2016-08-10 16:58:24 status unpacked btrfs-progs:amd64 4.6.1-1 > dpkg.log.1:2016-08-10 17:01:25 configure btrfs-progs:amd64 4.6.1-1 > dpkg.log.1:2016-08-10 17:01:25 status unpacked btrfs-progs:amd64 4.6.1-1 > dpkg.log.1:2016-08-10 17:01:26 status unpacked btrfs-progs:amd64 4.6.1-1 > dpkg.log.1:2016-08-10 17:01:26 status half-configured btrfs-progs:amd64 > 4.6.1-1 > dpkg.log.1:2016-08-10 17:01:26 status triggers-awaited btrfs-progs:amd64 > 4.6.1-1 > dpkg.log.1:2016-08-10 17:02:34 status installed btrfs-progs:amd64 4.6.1-1 > dpkg.log.1:2016-08-19 00:45:05 upgrade btrfs-progs:amd64 4.6.1-1 4.7-1 > dpkg.log.1:2016-08-19 00:45:05 status triggers-awaited btrfs-progs:amd64 > 4.6.1-1 > dpkg.log.1:2016-08-19 00:45:05 status half-configured btrfs-progs:amd64 > 4.6.1-1 > dpkg.log.1:2016-08-19 00:45:05 status unpacked btrfs-progs:amd64 4.6.1-1 > dpkg.log.1:2016-08-19 00:45:05 status half-installed btrfs-progs:amd64 4.6.1-1 > dpkg.log.1:2016-08-19 00:45:06 status half-installed btrfs-progs:amd64
Re: Is stability a joke? (wiki updated)
On Thu, Sep 15, 2016 at 2:16 PM, Hugo Millswrote: > On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote: >> On Thu, Sep 15, 2016 at 12:20 PM, Austin S. Hemmelgarn >> wrote: >> >> > 2. We're developing new features without making sure that check can fix >> > issues in any associated metadata. Part of merging a new feature needs to >> > be proving that fsck can handle fixing any issues in the metadata for that >> > feature short of total data loss or complete corruption. >> > >> > 3. Fsck should be needed only for un-mountable filesystems. Ideally, we >> > should be handling things like Windows does. Preform slightly better >> > checking when reading data, and if we see an error, flag the filesystem for >> > expensive repair on the next mount. >> >> Right, well I'm vaguely curious why ZFS, as different as it is, >> basically take the position that if the hardware went so batshit that >> they can't unwind it on a normal mount, then an fsck probably can't >> help either... they still don't have an fsck and don't appear to want >> one. >> >> I'm not sure if the brfsck is really all that helpful to user as much >> as it is for developers to better learn about the failure vectors of >> the file system. >> >> >> > 4. Btrfs check should know itself if it can fix something or not, and that >> > should be reported. I have an otherwise perfectly fine filesystem that >> > throws some (apparently harmless) errors in check, and check can't repair >> > them. Despite this, it gives zero indication that it can't repair them, >> > zero indication that it didn't repair them, and doesn't even seem to give a >> > non-zero exit status for this filesystem. >> >> Yeah, it's really not a user tool in my view... >> >> >> >> > >> > As far as the other tools: >> > - Self-repair at mount time: This isn't a repair tool, if the FS mounts, >> > it's not broken, it's just a messy and the kernel is tidying things up. >> > - btrfsck/btrfs check: I think I covered the issues here well. >> > - Mount options: These are mostly just for expensive checks during mount, >> > and most people should never need them except in very unusual >> > circumstances. >> > - btrfs rescue *: These are all fixes for very specific issues. They >> > should >> > be folded into check with special aliases, and not be separate tools. The >> > first fixes an issue that's pretty much non-existent in any modern kernel, >> > and the other two are for very low-level data recovery of horribly broken >> > filesystems. >> > - scrub: This is a very purpose specific tool which is supposed to be part >> > of regular maintainence, and only works to fix things as a side effect of >> > what it does. >> > - balance: This is also a relatively purpose specific tool, and again only >> > fixes things as a side effect of what it does. > >You've forgotten btrfs-zero-log, which seems to have built itself a > reputation on the internet as the tool you run to fix all btrfs ills, > rather than a very finely-targeted tool that was introduced to deal > with approximately one bug somewhere back in the 2.x era (IIRC). > >Hugo. :-) It's in my original list, and it's in Austin's by way of being lumped into 'btrfs rescue *' along with chunk and super recover. Seems like super recover should be built into Btrfs check, and would be one of the first ambiguities to get out of the way but I'm just an ape that wears pants so what do I know. Thing is?? 
zero log has fixed file systems in cases where I never would have expected it to, and the user was recommended not to use it, or to use it only as a second-to-last resort. So, pfff. It's like throwing salt around.

-- 
Chris Murphy
Re: Is stability a joke? (wiki updated)
On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote: > On Thu, Sep 15, 2016 at 12:20 PM, Austin S. Hemmelgarn >wrote: > > > 2. We're developing new features without making sure that check can fix > > issues in any associated metadata. Part of merging a new feature needs to > > be proving that fsck can handle fixing any issues in the metadata for that > > feature short of total data loss or complete corruption. > > > > 3. Fsck should be needed only for un-mountable filesystems. Ideally, we > > should be handling things like Windows does. Preform slightly better > > checking when reading data, and if we see an error, flag the filesystem for > > expensive repair on the next mount. > > Right, well I'm vaguely curious why ZFS, as different as it is, > basically take the position that if the hardware went so batshit that > they can't unwind it on a normal mount, then an fsck probably can't > help either... they still don't have an fsck and don't appear to want > one. > > I'm not sure if the brfsck is really all that helpful to user as much > as it is for developers to better learn about the failure vectors of > the file system. > > > > 4. Btrfs check should know itself if it can fix something or not, and that > > should be reported. I have an otherwise perfectly fine filesystem that > > throws some (apparently harmless) errors in check, and check can't repair > > them. Despite this, it gives zero indication that it can't repair them, > > zero indication that it didn't repair them, and doesn't even seem to give a > > non-zero exit status for this filesystem. > > Yeah, it's really not a user tool in my view... > > > > > > > As far as the other tools: > > - Self-repair at mount time: This isn't a repair tool, if the FS mounts, > > it's not broken, it's just a messy and the kernel is tidying things up. > > - btrfsck/btrfs check: I think I covered the issues here well. > > - Mount options: These are mostly just for expensive checks during mount, > > and most people should never need them except in very unusual circumstances. > > - btrfs rescue *: These are all fixes for very specific issues. They should > > be folded into check with special aliases, and not be separate tools. The > > first fixes an issue that's pretty much non-existent in any modern kernel, > > and the other two are for very low-level data recovery of horribly broken > > filesystems. > > - scrub: This is a very purpose specific tool which is supposed to be part > > of regular maintainence, and only works to fix things as a side effect of > > what it does. > > - balance: This is also a relatively purpose specific tool, and again only > > fixes things as a side effect of what it does. You've forgotten btrfs-zero-log, which seems to have built itself a reputation on the internet as the tool you run to fix all btrfs ills, rather than a very finely-targeted tool that was introduced to deal with approximately one bug somewhere back in the 2.x era (IIRC). Hugo. > > Yeah I know, it's just much of this is non-obvious to users unfamiliar > with this file system. And even I'm often throwing spaghetti on a > wall. > > > -- > Chris Murphy > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Hugo Mills | It's against my programming to impersonate a deity! hugo@... carfax.org.uk | http://carfax.org.uk/ | PGP: E2AB1DE4 | C3PO, Return of the Jedi signature.asc Description: Digital signature
Re: unable to handle kernel paging request
Thanks, I can see it included in 4.8-rc6 but not the other branches. Will it get pulled later or is this a 4.8-only fix?

Mark

On Thu, Sep 15, 2016 at 6:05 PM, Chris Mason wrote:
> On 09/15/2016 10:08 AM, Mark Gavalda wrote:
>>
>> Hi,
>>
>> Bumped into the following one today; kernel 4.4.0-36-generic Ubuntu
>> 16.04.1; CPU went to 100% and only a hard restart solved the issue.
>> Since then everything's back to normal.
>>
>> Please let me know how can I help get to the bottom of this?
>
> I saw similar traces when tracking down this bug:
>
> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/?h=for-linus-4.8&id=cbd60aa7cd17d81a434234268c55192862147439
>
> It's flagged for stable, so you'll get it with the next stable update, or
> you can apply it by hand and rebuild.
>
> -chris
[PATCH] Expose verbose flag on subvolume delete
Exposing the verbose flag that already had the logic for verbose output Vincent Batts (1): btrfs-progs: subvolume verbose delete flag cmds-subvolume.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) -- 2.9.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs-progs: subvolume verbose delete flag
There was already the logic for verbose output, but the flag parsing
did not include it.

Signed-off-by: Vincent Batts
---
 cmds-subvolume.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index e7ef67d..f8c9f48 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -245,6 +245,7 @@ static const char * const cmd_subvol_delete_usage[] = {
 	"",
 	"-c|--commit-after  wait for transaction commit at the end of the operation",
 	"-C|--commit-each   wait for transaction commit after deleting each subvolume",
+	"-v|--verbose       verbose output of operations",
 	NULL
 };
@@ -267,10 +268,11 @@ static int cmd_subvol_delete(int argc, char **argv)
 	static const struct option long_options[] = {
 		{"commit-after", no_argument, NULL, 'c'},	/* commit mode 1 */
 		{"commit-each", no_argument, NULL, 'C'},	/* commit mode 2 */
+		{"verbose", no_argument, NULL, 'v'},
 		{NULL, 0, NULL, 0}
 	};

-	c = getopt_long(argc, argv, "cC", long_options, NULL);
+	c = getopt_long(argc, argv, "cCv", long_options, NULL);
 	if (c < 0)
 		break;
-- 
2.9.0
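With the patch applied, the new flag would be used alongside the existing commit options (the subvolume path below is just an example):

  # delete a subvolume and report each step as it happens
  $ sudo btrfs subvolume delete -v /mnt/snapshots/daily-2016-09-01

  # long form, combined with waiting for the transaction commit
  $ sudo btrfs subvolume delete --verbose --commit-after /mnt/snapshots/daily-2016-09-01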
Re: Is stability a joke? (wiki updated)
On Thu, Sep 15, 2016 at 12:20 PM, Austin S. Hemmelgarnwrote: > 2. We're developing new features without making sure that check can fix > issues in any associated metadata. Part of merging a new feature needs to > be proving that fsck can handle fixing any issues in the metadata for that > feature short of total data loss or complete corruption. > > 3. Fsck should be needed only for un-mountable filesystems. Ideally, we > should be handling things like Windows does. Preform slightly better > checking when reading data, and if we see an error, flag the filesystem for > expensive repair on the next mount. Right, well I'm vaguely curious why ZFS, as different as it is, basically take the position that if the hardware went so batshit that they can't unwind it on a normal mount, then an fsck probably can't help either... they still don't have an fsck and don't appear to want one. I'm not sure if the brfsck is really all that helpful to user as much as it is for developers to better learn about the failure vectors of the file system. > 4. Btrfs check should know itself if it can fix something or not, and that > should be reported. I have an otherwise perfectly fine filesystem that > throws some (apparently harmless) errors in check, and check can't repair > them. Despite this, it gives zero indication that it can't repair them, > zero indication that it didn't repair them, and doesn't even seem to give a > non-zero exit status for this filesystem. Yeah, it's really not a user tool in my view... > > As far as the other tools: > - Self-repair at mount time: This isn't a repair tool, if the FS mounts, > it's not broken, it's just a messy and the kernel is tidying things up. > - btrfsck/btrfs check: I think I covered the issues here well. > - Mount options: These are mostly just for expensive checks during mount, > and most people should never need them except in very unusual circumstances. > - btrfs rescue *: These are all fixes for very specific issues. They should > be folded into check with special aliases, and not be separate tools. The > first fixes an issue that's pretty much non-existent in any modern kernel, > and the other two are for very low-level data recovery of horribly broken > filesystems. > - scrub: This is a very purpose specific tool which is supposed to be part > of regular maintainence, and only works to fix things as a side effect of > what it does. > - balance: This is also a relatively purpose specific tool, and again only > fixes things as a side effect of what it does. > Yeah I know, it's just much of this is non-obvious to users unfamiliar with this file system. And even I'm often throwing spaghetti on a wall. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: kill BUG_ON in do_relocation
On 09/15/2016 03:01 PM, Liu Bo wrote: On Wed, Sep 14, 2016 at 11:19:04AM -0700, Liu Bo wrote: On Wed, Sep 14, 2016 at 01:31:31PM -0400, Josef Bacik wrote: On 09/14/2016 01:29 PM, Chris Mason wrote: On 09/14/2016 01:13 PM, Josef Bacik wrote: On 09/14/2016 12:27 PM, Liu Bo wrote: While updating btree, we try to push items between sibling nodes/leaves in order to keep height as low as possible. But we don't memset the original places with zero when pushing items so that we could end up leaving stale content in nodes/leaves. One may read the above stale content by increasing btree blocks' @nritems. Ok this sounds really bad. Is this as bad as I think it sounds? We should probably fix this like right now right? He's bumping @nritems with a fuzzer I think? As in this happens when someone forces it (or via some other bug) but not in normal operations. Oh ok if this happens with a fuzzer than this is fine, but I'd rather do -EIO so we know this is something bad with the fs. -EIO may be more appropriate to be given while reading btree blocks and checking their validation? Looks like EIO doesn't fit into this case, either, do we have any errno representing 'corrupted filesystem'? That's EIO. Sometimes the EIO is big enough we have to abort, but really the abort is just adding bonus. -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Btrfs: handle quota reserve failure properly
btrfs/022 was spitting a warning for the case that we exceed the quota.  If we
fail to make our quota reservation we need to clean up our data space
reservation.  Thanks,

Signed-off-by: Josef Bacik
---
 fs/btrfs/extent-tree.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 03da2f6..d72eaae 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4286,13 +4286,10 @@ int btrfs_check_data_free_space(struct inode *inode, u64 start, u64 len)
 	if (ret < 0)
 		return ret;
 
-	/*
-	 * Use new btrfs_qgroup_reserve_data to reserve precious data space
-	 *
-	 * TODO: Find a good method to avoid reserve data space for NOCOW
-	 * range, but don't impact performance on quota disable case.
-	 */
+	/* Use new btrfs_qgroup_reserve_data to reserve precious data space. */
 	ret = btrfs_qgroup_reserve_data(inode, start, len);
+	if (ret)
+		btrfs_free_reserved_data_space_noquota(inode, start, len);
 
 	return ret;
 }
-- 
2.7.4
Re: [PATCH] Btrfs: kill BUG_ON in do_relocation
On Wed, Sep 14, 2016 at 11:19:04AM -0700, Liu Bo wrote: > On Wed, Sep 14, 2016 at 01:31:31PM -0400, Josef Bacik wrote: > > On 09/14/2016 01:29 PM, Chris Mason wrote: > > > > > > > > > On 09/14/2016 01:13 PM, Josef Bacik wrote: > > > > On 09/14/2016 12:27 PM, Liu Bo wrote: > > > > > While updating btree, we try to push items between sibling > > > > > nodes/leaves in order to keep height as low as possible. > > > > > But we don't memset the original places with zero when > > > > > pushing items so that we could end up leaving stale content > > > > > in nodes/leaves. One may read the above stale content by > > > > > increasing btree blocks' @nritems. > > > > > > > > > > > > > Ok this sounds really bad. Is this as bad as I think it sounds? We > > > > should probably fix this like right now right? > > > > > > He's bumping @nritems with a fuzzer I think? As in this happens when > > > someone > > > forces it (or via some other bug) but not in normal operations. > > > > > > > Oh ok if this happens with a fuzzer than this is fine, but I'd rather do > > -EIO so we know this is something bad with the fs. > > -EIO may be more appropriate to be given while reading btree blocks and > checking their validation? Looks like EIO doesn't fit into this case, either, do we have any errno representing 'corrupted filesystem'? Thanks, -liubo > > > And change the changelog > > to make it explicit that this is the result of fs corruption, not normal > > operation. Then you can add > > > > Reviewed-by: Josef Bacik> > OK, make sense. > > Thanks, > > -liubo > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Is stability a joke? (wiki updated)
On 2016-09-15 14:01, Chris Murphy wrote: On Tue, Sep 13, 2016 at 5:35 AM, Austin S. Hemmelgarnwrote: On 2016-09-12 16:08, Chris Murphy wrote: - btrfsck status e.g. btrfs-progs 4.7.2 still warns against using --repair, and lists it under dangerous options also; while that's true, Btrfs can't be considered stable or recommended by default e.g. There's still way too many separate repair tools for Btrfs. Depending on how you count there's at least 4, and more realistically 8 ways, scattered across multiple commands. This excludes btrfs check's -E, -r, and -s flags. And it ignores sequence in the success rate. The permutations are just excessive. It's definitely not easy to know how to fix a Btrfs volume should things go wrong. I assume you're counting balance and scrub in that, plus check gives 3, what are you considering the 4th? - Self repair at mount time, similar to other fs's with a journal - fsck, similar to other fs's except the output is really unclear about what the prognosis is compared to ext4 or xfs - mount option usebackuproot/recovery - btrfs rescue zero-log - btrfs rescue super-recover - btrfs rescue chunk-recover - scrub - balance check --repair really needed to be fail safe a long time ago, it's what everyone's come to expect from fsck's, that they don't make things worse; and in particular on Btrfs it seems like its repairs should be reversible but the reality is the man page says do not use (except under advisement) and that it's dangerous (twice). And a user got a broken system in the bug that affects 4.7, 4.7.1, that 4.7.2 apparently can't fix. So... life is hard, file systems are hard. But it's also hard to see how distros can possibly feel comfortable with Btrfs by default when the fsck tool is dangerous, even if in theory it shouldn't often be necessary. For check specifically, I see four issues: 1. It spits out pretty low-level information about the internals in many cases when it returns an error. xfs_repair does this too, but it's needed even less frequently than btrfs check, and it at least uses relatively simple jargon by comparison. I've been using BTRFS for years and still can't tell what more than half the error messages check can return mean. In contrast to that, deciphering an error message from e2fsck is pretty trivial if you have some basic understanding of VFS level filesystem abstractions (stuff like what inodes and dentries are), and I never needed to learn low level things about the internals of ext4 to parse the fsck output (I did anyway, but that's beside the point). 2. We're developing new features without making sure that check can fix issues in any associated metadata. Part of merging a new feature needs to be proving that fsck can handle fixing any issues in the metadata for that feature short of total data loss or complete corruption. 3. Fsck should be needed only for un-mountable filesystems. Ideally, we should be handling things like Windows does. Preform slightly better checking when reading data, and if we see an error, flag the filesystem for expensive repair on the next mount. 4. Btrfs check should know itself if it can fix something or not, and that should be reported. I have an otherwise perfectly fine filesystem that throws some (apparently harmless) errors in check, and check can't repair them. Despite this, it gives zero indication that it can't repair them, zero indication that it didn't repair them, and doesn't even seem to give a non-zero exit status for this filesystem. 
As far as the other tools: - Self-repair at mount time: This isn't a repair tool, if the FS mounts, it's not broken, it's just a messy and the kernel is tidying things up. - btrfsck/btrfs check: I think I covered the issues here well. - Mount options: These are mostly just for expensive checks during mount, and most people should never need them except in very unusual circumstances. - btrfs rescue *: These are all fixes for very specific issues. They should be folded into check with special aliases, and not be separate tools. The first fixes an issue that's pretty much non-existent in any modern kernel, and the other two are for very low-level data recovery of horribly broken filesystems. - scrub: This is a very purpose specific tool which is supposed to be part of regular maintainence, and only works to fix things as a side effect of what it does. - balance: This is also a relatively purpose specific tool, and again only fixes things as a side effect of what it does. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Is stability a joke? (wiki updated)
On Tue, Sep 13, 2016 at 5:35 AM, Austin S. Hemmelgarnwrote: > On 2016-09-12 16:08, Chris Murphy wrote: >> >> - btrfsck status >> e.g. btrfs-progs 4.7.2 still warns against using --repair, and lists >> it under dangerous options also; while that's true, Btrfs can't be >> considered stable or recommended by default >> e.g. There's still way too many separate repair tools for Btrfs. >> Depending on how you count there's at least 4, and more realistically >> 8 ways, scattered across multiple commands. This excludes btrfs >> check's -E, -r, and -s flags. And it ignores sequence in the success >> rate. The permutations are just excessive. It's definitely not easy to >> know how to fix a Btrfs volume should things go wrong. > > I assume you're counting balance and scrub in that, plus check gives 3, what > are you considering the 4th? - Self repair at mount time, similar to other fs's with a journal - fsck, similar to other fs's except the output is really unclear about what the prognosis is compared to ext4 or xfs - mount option usebackuproot/recovery - btrfs rescue zero-log - btrfs rescue super-recover - btrfs rescue chunk-recover - scrub - balance check --repair really needed to be fail safe a long time ago, it's what everyone's come to expect from fsck's, that they don't make things worse; and in particular on Btrfs it seems like its repairs should be reversible but the reality is the man page says do not use (except under advisement) and that it's dangerous (twice). And a user got a broken system in the bug that affects 4.7, 4.7.1, that 4.7.2 apparently can't fix. So... life is hard, file systems are hard. But it's also hard to see how distros can possibly feel comfortable with Btrfs by default when the fsck tool is dangerous, even if in theory it shouldn't often be necessary. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: multi-device btrfs with single data mode and disk failure
On Thu, Sep 15, 2016 at 10:30 AM, Alexandre Pouxwrote: > Thank you very much for your answers > > Le 15/09/2016 à 17:38, Chris Murphy a écrit : >> On Thu, Sep 15, 2016 at 1:44 AM, Alexandre Poux wrote: >>> Is it possible to do some king of a "btrfs delete missing" on this >>> kind of setup, in order to recover access in rw to my other data, or >>> I must copy all my data on a new partition >> That *should* work :) Except that your file system with 6 drives is >> too full to be shrunk to 5 drives. Btrfs will either refuse, or get >> confused, about how to shrink a nearly full 6 drive volume into 5. >> >> So you'll have to do one of three things: >> >> 1. Add a 2+TB drive, then remove the missing one; OR >> 2. btrfs replace is faster and is raid10 reliable; OR >> 3. Read only scrub to get a file listing of bad files, then remount >> read-write degraded and delete them all. Now you maybe can do a device >> delete missing. But it's still a tight fit, it basically has to >> balance things out to get it to fit on an odd number of drives, it may >> actually not work even though there seems to be enough total space, >> there has to be enough space on FOUR drives. >> > Are you sure you are talking about data in single mode ? > I don't understand why you are talking about raid10, > or the fact that it will have to rebalance everything. Yeah sorry I got confused in that very last sentence. Single, it will find space in 1GiB increments. Of course this fails because that data doesn't exist anymore, but to start the operation it needs to be possible. > > Moreover, even in degraded mode I cannot mount it in rw > It tells me > "too many missing devices, writeable remount is not allowed" > due to the fact I'm in single mode. Oh you're in that trap. Well now you're stuck. I've had the case where I could mount read write degraded with metadata raid1 and data single, but it was good for only one mount and then I get the same message you get and it was only possible to mount read only. At that point it's totally suck unless you're adept at manipulating the file system with a hex editor... Someone might have a patch somewhere that drops this check and lets too many missing devices to mount anyway... I seem to recall this. It'd be in the archives if it exists. > And as far as as know, btrfs replace and btrfs delete, are not supposed > to work in read only... It doesn't. Must be read write mounted. > > I would like to tell him forgot about the missing data, and give me back > my partition. This feature doesn't exist yet. I really want to see this, it'd be great for ceph and gluster if the volume could lose a drive, report all the missing files to the cluster file system, delete the device and the file references, and then the cluster knows that brick doesn't have those files and can replicate them somewhere else or even back to the brick that had them. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: multi-device btrfs with single data mode and disk failure
Thank you very much for your answers Le 15/09/2016 à 17:38, Chris Murphy a écrit : > On Thu, Sep 15, 2016 at 1:44 AM, Alexandre Pouxwrote: >> Is it possible to do some king of a "btrfs delete missing" on this >> kind of setup, in order to recover access in rw to my other data, or >> I must copy all my data on a new partition > That *should* work :) Except that your file system with 6 drives is > too full to be shrunk to 5 drives. Btrfs will either refuse, or get > confused, about how to shrink a nearly full 6 drive volume into 5. > > So you'll have to do one of three things: > > 1. Add a 2+TB drive, then remove the missing one; OR > 2. btrfs replace is faster and is raid10 reliable; OR > 3. Read only scrub to get a file listing of bad files, then remount > read-write degraded and delete them all. Now you maybe can do a device > delete missing. But it's still a tight fit, it basically has to > balance things out to get it to fit on an odd number of drives, it may > actually not work even though there seems to be enough total space, > there has to be enough space on FOUR drives. > Are you sure you are talking about data in single mode ? I don't understand why you are talking about raid10, or the fact that it will have to rebalance everything. Moreover, even in degraded mode I cannot mount it in rw It tells me "too many missing devices, writeable remount is not allowed" due to the fact I'm in single mode. And as far as as know, btrfs replace and btrfs delete, are not supposed to work in read only... I would like to tell him forgot about the missing data, and give me back my partition. In fact I'm pretty sure, there was no data at all on the dead device, only metadata in raid1. I'm currently scrubing to be absolutely sure -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: unable to handle kernel paging request
On 09/15/2016 10:08 AM, Mark Gavalda wrote:
> Hi, Bumped into the following one today; kernel 4.4.0-36-generic Ubuntu
> 16.4.1; CPU went to 100% and only a hard restart solved the issue. Since
> then everything's back to normal. Please let me know how can I help get
> to the bottom of this?

I saw similar traces when tracking down this bug:

https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/?h=for-linus-4.8&id=cbd60aa7cd17d81a434234268c55192862147439

It's flagged for stable, so you'll get it with the next stable update, or you can apply it by hand and rebuild.

-chris
-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
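For anyone who wants the fix before the next stable update lands, applying it by hand is roughly the following (a sketch; it assumes a git checkout of the kernel you are running, the remote URL is the standard kernel.org path for Chris's tree, and the commit id is the one from the link above):

git remote add mason-btrfs git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
git fetch mason-btrfs for-linus-4.8
git cherry-pick cbd60aa7cd17d81a434234268c55192862147439
# then rebuild and install the kernel the usual way for your distro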
Size of scrubbed Data
Hello I have encountered a very strange phenomenon while using btrfs-scrub. I believe it may be a result of replacing my old installation of Debian Jessie with Debian Stretch, resulting in a Kernel Switch from 3.16+63 to 4.6.0-1. I scrub my filesystem once a month and let anacron send me the results. My filesystem, consisting of four four-gigabyte drives with both data and metadata as RAID1 was reported as containing nearly 12TiB of data in scrubs done in May, June, July and August. But then it changed and suddenly shows only 9TiB in size, despite the fact that I did not delete any large files. If I remember correctly my switch from Debian Jessie to Stretch was around that time period. Could someone explain this behavior to me? Was a new way of calculating the size of scrubed data introduced? How can I check if I have lost data? I have a backup, but only one generation and rsync will by now have deleted files on the NAS, which might have lost on the fileserver. According to the long and short self-tests, which I run with smartmontools my drives are alright. How do I proceed? Yours Stefan uname -a Linux mars 4.6.0-1-amd64 #1 SMP Debian 4.6.4-1 (2016-07-18) x86_64 GNU/Linux btrfs --version btrfs-progs v4.7.1 btrfs fi show Label: none uuid: 8c668854-db5d-45a7-875d-43c4e82a829e Total devices 4 FS bytes used 6.06TiB devid1 size 3.64TiB used 3.09TiB path /dev/sde devid2 size 3.64TiB used 3.09TiB path /dev/sdc devid3 size 3.64TiB used 3.09TiB path /dev/sdd devid4 size 3.64TiB used 3.09TiB path /dev/sda btrfs fi df /mnt/btrfs-raid/ Data, RAID1: total=6.17TiB, used=6.05TiB System, RAID1: total=32.00MiB, used=916.00KiB System, single: total=4.00MiB, used=0.00B Metadata, RAID1: total=10.00GiB, used=8.14GiB GlobalReserve, single: total=512.00MiB, used=0.00B Maybe this is also of use in identifying the problem: grep btrfs * grep: apt: Ist ein Verzeichnis grep: cups: Ist ein Verzeichnis dpkg.log:2016-09-03 15:20:16 upgrade btrfs-progs:amd64 4.7-1 4.7.1-1 dpkg.log:2016-09-03 15:20:16 status triggers-awaited btrfs-progs:amd64 4.7-1 dpkg.log:2016-09-03 15:20:16 status half-configured btrfs-progs:amd64 4.7-1 dpkg.log:2016-09-03 15:20:16 status unpacked btrfs-progs:amd64 4.7-1 dpkg.log:2016-09-03 15:20:16 status half-installed btrfs-progs:amd64 4.7-1 dpkg.log:2016-09-03 15:20:16 status half-installed btrfs-progs:amd64 4.7-1 dpkg.log:2016-09-03 15:20:17 status unpacked btrfs-progs:amd64 4.7.1-1 dpkg.log:2016-09-03 15:20:17 status unpacked btrfs-progs:amd64 4.7.1-1 dpkg.log:2016-09-03 15:20:45 configure btrfs-progs:amd64 4.7.1-1 dpkg.log:2016-09-03 15:20:45 status unpacked btrfs-progs:amd64 4.7.1-1 dpkg.log:2016-09-03 15:20:45 status unpacked btrfs-progs:amd64 4.7.1-1 dpkg.log:2016-09-03 15:20:45 status half-configured btrfs-progs:amd64 4.7.1-1 dpkg.log:2016-09-03 15:20:46 status triggers-awaited btrfs-progs:amd64 4.7.1-1 dpkg.log:2016-09-03 15:20:51 status installed btrfs-progs:amd64 4.7.1-1 dpkg.log.1:2016-08-10 16:58:23 upgrade btrfs-progs:amd64 4.5.2-1 4.6.1-1 dpkg.log.1:2016-08-10 16:58:23 status triggers-awaited btrfs-progs:amd64 4.5.2-1 dpkg.log.1:2016-08-10 16:58:23 status half-configured btrfs-progs:amd64 4.5.2-1 dpkg.log.1:2016-08-10 16:58:23 status unpacked btrfs-progs:amd64 4.5.2-1 dpkg.log.1:2016-08-10 16:58:23 status half-installed btrfs-progs:amd64 4.5.2-1 dpkg.log.1:2016-08-10 16:58:24 status half-installed btrfs-progs:amd64 4.5.2-1 dpkg.log.1:2016-08-10 16:58:24 status unpacked btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-10 16:58:24 status unpacked btrfs-progs:amd64 4.6.1-1 
dpkg.log.1:2016-08-10 17:01:25 configure btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-10 17:01:25 status unpacked btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-10 17:01:26 status unpacked btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-10 17:01:26 status half-configured btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-10 17:01:26 status triggers-awaited btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-10 17:02:34 status installed btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-19 00:45:05 upgrade btrfs-progs:amd64 4.6.1-1 4.7-1 dpkg.log.1:2016-08-19 00:45:05 status triggers-awaited btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-19 00:45:05 status half-configured btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-19 00:45:05 status unpacked btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-19 00:45:05 status half-installed btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-19 00:45:06 status half-installed btrfs-progs:amd64 4.6.1-1 dpkg.log.1:2016-08-19 00:45:06 status unpacked btrfs-progs:amd64 4.7-1 dpkg.log.1:2016-08-19 00:45:06 status unpacked btrfs-progs:amd64 4.7-1 dpkg.log.1:2016-08-19 00:47:06 configure btrfs-progs:amd64 4.7-1 dpkg.log.1:2016-08-19 00:47:06 status unpacked btrfs-progs:amd64 4.7-1 dpkg.log.1:2016-08-19 00:47:06 status unpacked btrfs-progs:amd64 4.7-1 dpkg.log.1:2016-08-19 00:47:06 status half-configured btrfs-progs:amd64 4.7-1 dpkg.log.1:2016-08-19 00:47:06 status
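Two quick sanity checks that may help here (a sketch; paths are examples, and the arithmetic assumes the older reports were counting raw reads across both RAID1 copies, in which case 6.05 TiB of data x 2 lines up with the ~12 TiB you used to see, leaving where the 9 TiB figure comes from as the open question):

# per-device scrub totals, to compare against btrfs fi df / btrfs fi show
btrfs scrub status -d /mnt/btrfs-raid
# checksum-only dry run against the backup to list files that differ or are missing
# (nothing is copied or deleted)
rsync -rin --checksum /mnt/btrfs-raid/ /path/to/backup/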
Thoughts on btrfs RAID-1 for cold storage/archive?
I'm investigating using btrfs for archiving old data and offsite storage, essentially put 2 drives in btrfs RAID-1, copy the data to the filesystem and then unmount, remove a drive and take it to an offsite location. Remount the other drive -o ro,degraded until my system's slots fill up, then remove the local drive and put it on a shelf. I'd verify the file md5sums after data is written to the drive for peace of mind, but maybe a btrfs scrub would give the same assurances. Seem straightforward? Anything to look out for? Long term format stability seems good, right? Also, I like the idea of being able to pull the offsite drive back and scrub if the local drive ever has problems, a nice extra peace of mind we wouldn't get with ext4. Currently using the 4.1.32 kernel since the driver for the r750 card in our 45 drives system only supports up to 4.3 ATM. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
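FWIW, the cycle described above is only a handful of commands end to end (a sketch; device names, the mount point and the source path are examples):

mkfs.btrfs -d raid1 -m raid1 /dev/sdx /dev/sdy
mount /dev/sdx /mnt/archive
cp -a /srv/old-data/. /mnt/archive/
# a scrub reads both copies and verifies btrfs checksums, covering what the md5sums would
btrfs scrub start -Bd /mnt/archive
umount /mnt/archive
# after one drive goes offsite, the remaining drive mounts like this:
mount -o ro,degraded /dev/sdx /mnt/archive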
Re: multi-device btrfs with single data mode and disk failure
On Thu, Sep 15, 2016 at 1:44 AM, Alexandre Pouxwrote: > I had a btrfs partition on a 6 disk array without raid (metadata in > raid10, but data in single), and one of the disks just died. > > So I lost some of my data, ok, I knew that. > > But two question : > > * > > Is it possible to know (using metadata I suppose) what data I have > lost ? The safest option is to remount read only and do a read only scrub. That will spit out messages for corrupt (missing) metadata and data, to the kernel message buffer. The missing data will appear as corrupt files that can't be fixed with full file paths. There will likely be so many that dmesg will be useless so you'll need to use journalctl -fk to follow the scrub; or journalctl -bk after the fact or even -b-1 -k or -b-2 -k, etc.; or /var/log/messages as it's probably going to exceed the kernel message buffer, so dmesg won't help. > Is it possible to do some king of a "btrfs delete missing" on this > kind of setup, in order to recover access in rw to my other data, or > I must copy all my data on a new partition That *should* work :) Except that your file system with 6 drives is too full to be shrunk to 5 drives. Btrfs will either refuse, or get confused, about how to shrink a nearly full 6 drive volume into 5. So you'll have to do one of three things: 1. Add a 2+TB drive, then remove the missing one; OR 2. btrfs replace is faster and is raid10 reliable; OR 3. Read only scrub to get a file listing of bad files, then remount read-write degraded and delete them all. Now you maybe can do a device delete missing. But it's still a tight fit, it basically has to balance things out to get it to fit on an odd number of drives, it may actually not work even though there seems to be enough total space, there has to be enough space on FOUR drives. I'd go with option 2. And that should still spit out the paths to bad files. If the replace works, I'm pretty sure you still need to delete all of the files that are missing in order to get rid of the corruption warnings on any subsequent scrub or balance. > > btrfs --version : > btrfs-progs v4.7.1 You should upgrade to 4.7.2 or downgrade to 4.6.1 before doing btrfs check. Not urgent so long as you don't actually do a repair with this version. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
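Concretely, the read-only scrub pass from option 3 looks something like this (a sketch; /dev/sdb and /mnt are examples, and the grep is only there to thin out the output):

mount -o ro,degraded /dev/sdb /mnt
# -r keeps the scrub read-only so nothing gets "fixed" while you're still deciding what to do
btrfs scrub start -Bdr /mnt
# the affected file paths land in the kernel log, which will usually overflow dmesg
journalctl -k | grep -i btrfs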
Re: stability matrix
Am Donnerstag, 15. September 2016, 07:54:26 CEST schrieb Austin S. Hemmelgarn: > On 2016-09-15 05:49, Hans van Kranenburg wrote: > > On 09/15/2016 04:14 AM, Christoph Anton Mitterer wrote: […] > I specifically do not think we should worry about distro kernels though. > If someone is using a specific distro, that distro's documentation > should cover what they support and what works and what doesn't. Some > (like Arch and to a lesser extent Gentoo) use almost upstream kernels, > so there's very little point in tracking them. Some (like Ubuntu and > Debian) use almost upstream LTS kernels, so there's little point > tracking them either. Many others though (like CentOS, RHEL, and OEL) > Use forked kernels that have so many back-ported patches that it's > impossible to track up-date to up-date what the hell they've got. A > rather ridiculous expression regarding herding of cats comes to mind > with respect to the last group. Yep. I just read through RHEL releasenotes for a RHEL 7 workshop I will hold for a customer… and noted that newer RHEL 7 kernels for example have device mapper from Kernel 4.1 (while the kernel still says its a 3.10 one), XFS from kernel this.that, including new incompat CRC disk format and the need to also upgrade xfsprogs in lockstep, and this and that from kernel this.that and so on. Frankenstein as an association comes to my mind, but I bet RHEL kernel engineers know what they are doing. -- Martin -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] Preliminary BTRFS Encryption
On 2016-09-15 10:06, Anand Jain wrote: Thanks for comments. Pls see inline as below. On 09/15/2016 07:37 PM, Austin S. Hemmelgarn wrote: On 2016-09-13 09:39, Anand Jain wrote: This patchset adds btrfs encryption support. The main objective of this series is to have bugs fixed and stability. I have verified with fstests to confirm that there is no regression. A design write-up is coming next, however here below is the quick example on the cli usage. Please try out, let me know if I have missed something. Also would like to mention that a review from the security experts is due, which is important and I believe those review comments can be accommodated without major changes from here. Also yes, thanks for the emails, I hear, per file encryption and inline with vfs layer is also important, which is wip among other things in the list. As of now these patch set supports encryption on per subvolume, as managing properties on per subvolume is a kind of core to btrfs, which is easier for data center solution-ing, seamlessly persistent and easy to manage. Steps: - Make sure following kernel TFMs are compiled in. # cat /proc/crypto | egrep 'cbc\(aes\)|ctr\(aes\)' name : ctr(aes) name : cbc(aes) Create encrypted subvolume. # btrfs su create -e 'ctr(aes)' /btrfs/e1 Create subvolume '/btrfs/e1' Passphrase: Again passphrase: A key is created and its hash is updated into the subvolume item, and then added to the system keyctl. # btrfs su show /btrfs/e1 | egrep -i encrypt Encryption: ctr(aes)@btrfs:75197c8e (594790215) # keyctl show 594790215 Keyring 594790215 --alsw-v 0 0 logon: btrfs:75197c8e Now any file data extents under the subvol /btrfs/e1 will be encrypted. You may revoke key using keyctl or btrfs(8) as below. # btrfs su encrypt -k out /btrfs/e1 # btrfs su show /btrfs/e1 | egrep -i encrypt Encryption: ctr(aes)@btrfs:75197c8e (Required key not available) # keyctl show 594790215 Keyring Unable to dump key: Key has been revoked As the key hash is updated, If you provide wrong passphrase in the next key in, it won't add key to the system. So we have key verification from the day1. # btrfs su encrypt -k in /btrfs/e1 Passphrase: Again passphrase: ERROR: failed to set attribute 'btrfs.encrypt' to 'ctr(aes)@btrfs:75197c8e' : Key was rejected by service ERROR: key set failed: Key was rejected by service # btrfs su encrypt -k in /btrfs/e1 Passphrase: Again passphrase: key for '/btrfs/e1' has logged in with keytag 'btrfs:75197c8e' Now if you revoke the key the read / write fails with key error. # md5sum /btrfs/e1/2k-test-file 8c9fbc69125ebe84569a5c1ca088cb14 /btrfs/e1/2k-test-file # btrfs su encrypt -k out /btrfs/e1 # md5sum /btrfs/e1/2k-test-file md5sum: /btrfs/e1/2k-test-file: Key has been revoked # cp /tfs/1k-test-file /btrfs/e1/ cp: cannot create regular file ‘/btrfs/e1/1k-test-file’: Key has been revoked Plain text memory scratches for security reason is pending. As there are some key revoke notification challenges to coincide with encryption context switch, which I do believe should be fixed in the due course, but is not a roadblock at this stage. Before I make any other comments, I should state that I asbolutely agree with Alex Elsayed about the issues with using CBC or CTR mode, and not supporting AE or AEAD modes. Alex comments was quite detailed, I did reply to it. Looks like you missed my reply to Alex's comments ? I've been having issues with GMail delaying random e-mails for excessive amounts of time (hours sometimes), so I didn't see your reply before sending this. 
Even so, I do want it on the record that I agree with him completely. How does this handle cloning of extents? Can extents be cloned across subvolume boundaries when one of the subvolumes is encrypted? Yes only if both the subvol keys match. OK, that makes sense. Can they be cloned within an encrypted subvolume? Yes. That's things as usual. Glad to see that that still works. Most people I know who do batch deduplication do so within subvolumes but not across them, so that still working with encrypted subvolumes is a good thing. What happens when you try to clone them in either case if it isn't supported? Gets -EOPNOTSUPP. That actually makes more sense than what my first thought for a return code was (-EINVAL). -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
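For anyone wanting to poke at the clone behaviour from userspace, cp can drive the clone ioctl directly (a sketch; it reuses the /btrfs/e1 example from the thread, and /btrfs/e2 is a hypothetical second subvolume with a non-matching key):

# clone within the encrypted subvolume: expected to succeed
cp --reflink=always /btrfs/e1/2k-test-file /btrfs/e1/2k-test-file.clone
# clone into a subvolume whose key doesn't match: per Anand the ioctl returns
# -EOPNOTSUPP, which cp should surface as "Operation not supported"
cp --reflink=always /btrfs/e1/2k-test-file /btrfs/e2/2k-test-file.clone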
Re: stability matrix
On Thu, Sep 15, 2016 at 5:54 AM, Austin S. Hemmelgarnwrote: > > > I specifically do not think we should worry about distro kernels though. It will be essentially impossible to keep such a thing up to date. It's difficult in the best case scenario to even track upstream's own backports to longterm kernels, whether those would actually even change anything in the matrix. I'd say each major version gets it's own page, and just dup the page for each version. So for starters, the current page is for version 4.7. If when 4.8 is released there's no significant change in stability that affects the color (stability status) of any listed feature, then that page could say 4.7 through current. If it's true that the status page has no major changes going back to 4.4 through current, label it that way. As soon as there's a change that affects the color coding of an item in the grid, duplicate the page. Old page gets a fixed range of kernels, say 4.4 to 4.7. And now the newest page is 4.8 - current. I think a column for version will lose the historical perspective of when something goes from red to yellow, yellow to green. > If > someone is using a specific distro, that distro's documentation should cover > what they support and what works and what doesn't. Some (like Arch and to a > lesser extent Gentoo) use almost upstream kernels, so there's very little > point in tracking them. Some (like Ubuntu and Debian) use almost upstream > LTS kernels, so there's little point tracking them either. Many others > though (like CentOS, RHEL, and OEL) Use forked kernels that have so many > back-ported patches that it's impossible to track up-date to up-date what > the hell they've got. A rather ridiculous expression regarding herding of > cats comes to mind with respect to the last group. Yeah you need the secret decoder ring to sort it out. Forget it, not worth it. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
unable to handle kernel paging request
Hi, Bumped into the following one today; kernel 4.4.0-36-generic Ubuntu 16.4.1; CPU went to 100% and only a hard restart solved the issue. Since then everything's back to normal. Please let me know how can I help get to the bottom of this? [239049.350514] BUG: unable to handle kernel paging request at d3c53de8 [239049.358107] IP: [] hrtimer_active+0x9/0x60 [239049.364127] PGD a688df067 PUD 0 [239049.367828] Oops: [#2] SMP [239049.371543] Modules linked in: xt_recent xt_nat xt_multiport ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 xt_limit xt_addrtype xt_conntrack binfmt_misc veth xt_CHECKSUM iptable_mangle xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_comment iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev parport_pc parport serio_raw pvpanic ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse virtio_scsi [239049.457553] CPU: 24 PID: 1298718 Comm: kworker/u64:24 Tainted: G D 4.4.0-36-generic #55-Ubuntu [239049.467497] Hardware name: Google Google/Google, BIOS Google 01/01/2011 [239049.474348] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] [239049.481472] task: 8801512dee00 ti: 8801fd1c8000 task.ti: 8801fd1c8000 [239049.489205] RIP: 0010:[] [] hrtimer_active+0x9/0x60 [239049.497702] RSP: 0018:8801fd1cbc20 EFLAGS: 00010046 [239049.503230] RAX: RBX: d3c53db8 RCX: [239049.510569] RDX: RSI: 0003 RDI: d3c53db8 [239049.517910] RBP: 8801fd1cbc20 R08: 88336f416d00 R09: 88333fe57e00 [239049.525250] R10: 00a0 R11: 01b5675f R12: 8807024cb628 [239049.532595] R13: 8802d5e5bd98 R14: R15: 0003 [239049.539938] FS: () GS:88336f40() knlGS: [239049.548246] CS: 0010 DS: ES: CR0: 80050033 [239049.554234] CR2: d3c53de8 CR3: 000a5e8a1000 CR4: 001406e0 [239049.561576] Stack: [239049.563812] 8801fd1cbc58 810ef249 0003 d3b71a0d [239049.571936] d3c53db8 8807024cb628 8802d5e5bd98 8801fd1cbca8 [239049.580141] 8182d3c1 810c35f2 00010001 [239049.588300] Call Trace: [239049.590976] [] hrtimer_try_to_cancel+0x29/0x130 [239049.597384] [] schedule_hrtimeout_range_clock+0xd1/0x1b0 [239049.604571] [] ? __wake_up_common+0x52/0x90 [239049.610619] [] __wake_up+0x39/0x50 [239049.615906] [] btrfs_remove_ordered_extent+0x154/0x250 [btrfs] [239049.623620] [] btrfs_finish_ordered_io+0x1d0/0x650 [btrfs] [239049.630993] [] finish_ordered_fn+0x15/0x20 [btrfs] [239049.637666] [] btrfs_scrubparity_helper+0xca/0x2f0 [btrfs] [239049.645042] [] btrfs_endio_write_helper+0xe/0x10 [btrfs] [239049.652239] [] process_one_work+0x165/0x480 [239049.658293] [] worker_thread+0x4b/0x4c0 [239049.663984] [] ? process_one_work+0x480/0x480 [239049.670196] [] ? process_one_work+0x480/0x480 [239049.676420] [] kthread+0xd8/0xf0 [239049.681514] [] ? kthread_create_on_node+0x1e0/0x1e0 [239049.688260] [] ret_from_fork+0x3f/0x70 [239049.693870] [] ? 
kthread_create_on_node+0x1e0/0x1e0 [239049.700599] Code: 00 00 0f 1f 44 00 00 55 48 c7 47 28 10 f0 0e 81 48 89 77 58 48 89 e5 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 <48> 8b 57 30 eb 1d 80 7f 38 00 75 32 48 3b 78 08 74 2c 39 50 04 [239049.727797] RIP [] hrtimer_active+0x9/0x60 [239049.733868] RSP [239049.737557] CR2: d3c53de8 [239049.741535] ---[ end trace 774da4af66731bb5 ]--- Thanks, Mark Gavalda -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: mkfs+mount failure of small fs on ppc64
On 9/13/16 4:44 PM, Eric Sandeen wrote: > on ppc64, 4.7-rc kernel, git btrfs-progs, v4.7.2: > > # truncate --size=500m testfile > # ./mkfs.btrfs testfile > # mkdir -p mnt > # mount -o loop testfile mnt Same failure on aarch64 if that makes it any more interesting. ;) # mount -o loop testfile mnt mount: mount /dev/loop0 on /root/mnt failed: No space left on device Sector size issue I guess, driven by page size. -Eric > btrfs-progs v4.7.2 > See http://btrfs.wiki.kernel.org for more information. > > Label: (null) > UUID: c531b759-a491-4c9f-a954-4787cea9106d > Node size: 65536 > Sector size:65536 > Filesystem size:500.00MiB > Block group profiles: > Data: single8.00MiB > Metadata: DUP 32.00MiB > System: DUP 8.00MiB > SSD detected: no > Incompat features: extref, skinny-metadata > Number of devices: 1 > Devices: >IDSIZE PATH > 1 500.00MiB testfile > > > # dmesg -c > [ 61.210287] loop: module loaded > [ 61.247105] BTRFS: device fsid a8d79cd0-977f-4b93-8410-246dc08b3683 devid > 1 transid 5 /dev/loop0 > [ 61.247391] BTRFS info (device loop0): disk space caching is enabled > [ 61.247397] BTRFS info (device loop0): has skinny extents > [ 61.270492] BTRFS info (device loop0): creating UUID tree > [ 61.312149] BTRFS warning (device loop0): failed to create the UUID tree: > -28 > [ 61.483028] BTRFS: open_ctree failed > > 2nd mount works: > > # mount -o loop testfile mnt > # dmesg -c > [ 87.504564] BTRFS info (device loop0): disk space caching is enabled > [ 87.504579] BTRFS info (device loop0): has skinny extents > [ 87.506979] BTRFS info (device loop0): creating UUID tree > > Any ideas? This seems to have regressed since 3.9.1, but there are a couple > other mkfs breakages in between, and my bisect was not fruitful. > > Thanks, > -Eric > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
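One way to narrow it down might be a quick sweep over image sizes on the affected arch, using the same loop-mount recipe (a sketch; file and directory names are examples):

mkdir -p mnt
for size in 500m 1g 2g 4g; do
    truncate --size=$size testfile.$size
    ./mkfs.btrfs -f testfile.$size
    if mount -o loop testfile.$size mnt; then
        echo "$size: first mount OK"
        umount mnt
    else
        echo "$size: first mount FAILED"
    fi
done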
Re: [RFC] Preliminary BTRFS Encryption
Thanks for comments. Pls see inline as below. On 09/15/2016 07:37 PM, Austin S. Hemmelgarn wrote: On 2016-09-13 09:39, Anand Jain wrote: This patchset adds btrfs encryption support. The main objective of this series is to have bugs fixed and stability. I have verified with fstests to confirm that there is no regression. A design write-up is coming next, however here below is the quick example on the cli usage. Please try out, let me know if I have missed something. Also would like to mention that a review from the security experts is due, which is important and I believe those review comments can be accommodated without major changes from here. Also yes, thanks for the emails, I hear, per file encryption and inline with vfs layer is also important, which is wip among other things in the list. As of now these patch set supports encryption on per subvolume, as managing properties on per subvolume is a kind of core to btrfs, which is easier for data center solution-ing, seamlessly persistent and easy to manage. Steps: - Make sure following kernel TFMs are compiled in. # cat /proc/crypto | egrep 'cbc\(aes\)|ctr\(aes\)' name : ctr(aes) name : cbc(aes) Create encrypted subvolume. # btrfs su create -e 'ctr(aes)' /btrfs/e1 Create subvolume '/btrfs/e1' Passphrase: Again passphrase: A key is created and its hash is updated into the subvolume item, and then added to the system keyctl. # btrfs su show /btrfs/e1 | egrep -i encrypt Encryption: ctr(aes)@btrfs:75197c8e (594790215) # keyctl show 594790215 Keyring 594790215 --alsw-v 0 0 logon: btrfs:75197c8e Now any file data extents under the subvol /btrfs/e1 will be encrypted. You may revoke key using keyctl or btrfs(8) as below. # btrfs su encrypt -k out /btrfs/e1 # btrfs su show /btrfs/e1 | egrep -i encrypt Encryption: ctr(aes)@btrfs:75197c8e (Required key not available) # keyctl show 594790215 Keyring Unable to dump key: Key has been revoked As the key hash is updated, If you provide wrong passphrase in the next key in, it won't add key to the system. So we have key verification from the day1. # btrfs su encrypt -k in /btrfs/e1 Passphrase: Again passphrase: ERROR: failed to set attribute 'btrfs.encrypt' to 'ctr(aes)@btrfs:75197c8e' : Key was rejected by service ERROR: key set failed: Key was rejected by service # btrfs su encrypt -k in /btrfs/e1 Passphrase: Again passphrase: key for '/btrfs/e1' has logged in with keytag 'btrfs:75197c8e' Now if you revoke the key the read / write fails with key error. # md5sum /btrfs/e1/2k-test-file 8c9fbc69125ebe84569a5c1ca088cb14 /btrfs/e1/2k-test-file # btrfs su encrypt -k out /btrfs/e1 # md5sum /btrfs/e1/2k-test-file md5sum: /btrfs/e1/2k-test-file: Key has been revoked # cp /tfs/1k-test-file /btrfs/e1/ cp: cannot create regular file ‘/btrfs/e1/1k-test-file’: Key has been revoked Plain text memory scratches for security reason is pending. As there are some key revoke notification challenges to coincide with encryption context switch, which I do believe should be fixed in the due course, but is not a roadblock at this stage. Before I make any other comments, I should state that I asbolutely agree with Alex Elsayed about the issues with using CBC or CTR mode, and not supporting AE or AEAD modes. Alex comments was quite detailed, I did reply to it. Looks like you missed my reply to Alex's comments ? How does this handle cloning of extents? Can extents be cloned across subvolume boundaries when one of the subvolumes is encrypted? Yes only if both the subvol keys match. Can they be cloned within an encrypted subvolume? 
Yes. That's things as usual. What happens when you try to clone them in either case if it isn't supported? Gets -EOPNOTSUPP. Thanks, Anand -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH]btrfs-progs: btrfs-convert.c : check source file system state
Signed-off-by: Lakshmipathi.G
---
 btrfs-convert.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/btrfs-convert.c b/btrfs-convert.c
index c10dc17..27da9ce 100644
--- a/btrfs-convert.c
+++ b/btrfs-convert.c
@@ -2171,6 +2171,17 @@ static void ext2_copy_inode_item(struct btrfs_inode_item *dst,
 	}
 	memset(&dst->reserved, 0, sizeof(dst->reserved));
 }
+static int check_filesystem_state(struct btrfs_convert_context *cctx)
+{
+	ext2_filsys fs = cctx->fs_data;
+
+	if (!(fs->super->s_state & EXT2_VALID_FS))
+		return 1;
+	else if (fs->super->s_state & EXT2_ERROR_FS)
+		return 1;
+	else
+		return 0;
+}
 
 /*
  * copy a single inode. do all the required works, such as cloning
@@ -2340,6 +2351,10 @@ static int do_convert(const char *devname, int datacsum, int packing,
 	ret = convert_open_fs(devname, &cctx);
 	if (ret)
 		goto fail;
+	ret = check_filesystem_state(&cctx);
+	if (ret)
+		warning("Source Filesystem is not clean, \
+running e2fsck is recommended.");
 	ret = convert_read_used_space(&cctx);
 	if (ret)
 		goto fail;
--
1.9.3
-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
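Usage-wise, the warning just nudges people toward doing something like this before converting (device name is an example):

e2fsck -f /dev/sdb1      # clear the unclean/error state on the source ext* filesystem
btrfs-convert /dev/sdb1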
Re: stability matrix
On 2016-09-15 05:49, Hans van Kranenburg wrote: On 09/15/2016 04:14 AM, Christoph Anton Mitterer wrote: Hey. As for the stability matrix... In general: - I think another column should be added, which tells when and for which kernel version the feature-status of each row was revised/updated the last time and especially by whom. If a core dev makes a statement on a particular feature, this probably means much more, than if it was made by "just" a list regular. And yes I know, in the beginning it already says "this is for 4.7"... but let's be honest, it's pretty likely when this is bumped to 4.8 that not each and every point will be thoroughly checked again. - Optionally even one further column could be added, that lists bugs where the specific cases are kept record of (if any). - Perhaps a 3rd Status like "eats-your-data" which is worse than critical, e.g. for things were it's known that there is a high chance for still getting data corruption (RAID56?) About the "for 4.7" issue... The Status page could have an extra column, which for every OK labeled row lists the first version (kernel.org x.y.0 release) it's OK for. The bugs make it more complicated. * Feature A is labeled OK in kernel 5.0 * During development of kernel 8-rc, an eat my data bug is fixed. The OK for this feature in the table is bumped to 8.0? * kernel 5 is EOL * kernel 6 is still supported, and the fix is applied to 6.12 * then there's distros which have their own old kernels, applying fixes on them whenever they like, for example 5.6-distro4 which is leading its own life "Normal" users are using distro kernels. They shouldn't be panicing about their data if they're running 6.14 or 5.6-distro4, but the OK in the table is bumped to 8.0 because of the serious bugs. At least the official kernels should be tracked in the table I think. Separately, a list of known serious bugs per feature (like the 4 about compression, http://www.spinics.net/lists/linux-btrfs/msg58674.html ) could be listed on another Bugs! page (lots of work) so a user, or someone helping the user can see if the listed commits are or aren't included in the actual whatever kernel a user is using. This list of serious bugs could also help disussions that now sound like "yeah, there were issues with compression which some time ago got fixed, but noone knows what it was and when, so don't use compression". Many of the commits which fix serious bugs (even if they're only triggered in an edge case) have some explanation about how to trigger them, like the excellent commit messages of Filipe in the commits mentioned above. This helps setting up and maintaining the bug page, and helps advanced users to decide if they're hitting the edge case or not with their usage pattern. I'd like to help creating/maintaining this bug overview. A good start would be to just crawl through all stable kernels and some distro kernels and see which commits show up in fs/btrfs. As of right now, we kind of do have such a page: https://btrfs.wiki.kernel.org/index.php/Gotchas It's not really well labeled though, ans it's easy to overlook. I specifically do not think we should worry about distro kernels though. If someone is using a specific distro, that distro's documentation should cover what they support and what works and what doesn't. Some (like Arch and to a lesser extent Gentoo) use almost upstream kernels, so there's very little point in tracking them. Some (like Ubuntu and Debian) use almost upstream LTS kernels, so there's little point tracking them either. 
Many others though (like CentOS, RHEL, and OEL) Use forked kernels that have so many back-ported patches that it's impossible to track up-date to up-date what the hell they've got. A rather ridiculous expression regarding herding of cats comes to mind with respect to the last group. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] Preliminary BTRFS Encryption
On Thu, 15 Sep 2016 19:33:48 +0800, Anand Jain wrote: > Thanks for commenting. pls see inline below. > > On 09/15/2016 12:53 PM, Alex Elsayed wrote: >> On Tue, 13 Sep 2016 21:39:46 +0800, Anand Jain wrote: >> >>> This patchset adds btrfs encryption support. >>> >>> The main objective of this series is to have bugs fixed and stability. >>> I have verified with fstests to confirm that there is no regression. >>> >>> A design write-up is coming next, however here below is the quick >>> example on the cli usage. Please try out, let me know if I have missed >>> something. >>> >>> Also would like to mention that a review from the security experts is >>> due, >>> which is important and I believe those review comments can be >>> accommodated without major changes from here. >>> >>> Also yes, thanks for the emails, I hear, per file encryption and >>> inline with vfs layer is also important, which is wip among other >>> things in the list. >>> >>> As of now these patch set supports encryption on per subvolume, as >>> managing properties on per subvolume is a kind of core to btrfs, which >>> is easier for data center solution-ing, seamlessly persistent and easy >>> to manage. >>> >>> >>> Steps: >>> - >>> >>> Make sure following kernel TFMs are compiled in. >>> # cat /proc/crypto | egrep 'cbc\(aes\)|ctr\(aes\)' >>> name : ctr(aes) >>> name : cbc(aes) >> >> First problem: These are purely encryption algorithms, rather than AE >> (Authenticated Encryption) or AEAD (Authenticated Encryption with >> Associated Data). As a result, they are necessarily vulnerable to >> adaptive chosen-ciphertext attacks, and CBC has historically had other >> issues. I highly recommend using a well-reviewed AE or AEAD mode, such >> as AES-GCM (as ecryptfs does), as long as the code can handle the >> ciphertext being longer than the plaintext. >> >> If it _cannot_ handle the ciphertext being longer than the plaintext, >> please consider that a very serious red flag: It means that you cannot >> provide better security than block-level encryption, which greatly >> reduces the benefit of filesystem-integrated encryption. Being at the >> extent level _should_ permit using AEAD - if it does not, something is >> wrong. >> >> If at all possible, I'd suggest _only_ permitting AEAD cipher modes to >> be used. >> >> Anyway, even for block-level encryption, CTR and CBC have been >> considered obsolete and potentially dangerous to use in disk encryption >> for quite a while - current recommendations for block-level encryption >> are to use either a narrow-block tweakable cipher mode (such as XTS), >> or a wide- block one (such as EME or CMC), with the latter providing >> slightly better security, but worse performance. > >Yes. CTR should be changed, so I have kept it as a cli option. And >with the current internal design, hope we can plugin more algorithms >as suggested/if-its-outdated and yes code can handle (or with a >little tweak) bigger ciphertext (than plaintext) as well. > >encryption + keyhash (as below) + Btrfs-data-checksum provides >similar to AE, right ? No, it does not provide anything remotely similar to AE. AE requires _cryptographic_ authentication of the data. Not only is a CRC (as Btrfs uses for the data checksum) not enough, a _cryptographic hash_ (such as SHA256) isn't even enough. A MAC (message authentication code) is necessary. 
Moreover, combining an encryption algorithm and a MAC is very easy to get wrong, in ways that absolutely ruin security - as an example, see the Vaudenay/Lucky13 padding oracle attacks on TLS. In order for this to be secure, you need to use a secure encryption system that also authenticates the data in a cryptographically secure manner. Certain schemes are well-studied and believed to be secure - AES- GCM and ChaCha20-Poly1305 are common and well-regarded, and there's a generic security reduction for Encrypt-then-MAC constructions (using CTR together with HMAC in such a construction is generally acceptable). The Btrfs data checksum is wholly inadequate, and the keyhash is a non- sequitur - it prevents accidentally opening the subvolume with the wrong key, but neither it (nor the btrfs data checksum, which is a CRC rather than a cryptographic MAC) protect adequately against malicious corruption of the ciphertext. I'd suggest pulling in Herbert Xu, as he'd likely be able to tell you what of the Crypto API is actually sane to use for this. >>> Create encrypted subvolume. >>> # btrfs su create -e 'ctr(aes)' /btrfs/e1 Create subvolume '/btrfs/e1' >>> Passphrase: >>> Again passphrase: >> >> I presume the command first creates a key, then creates a subvolume >> referencing that key? If so, that seems sensible. > > Hmm I didn't get the why part, any help ? (this doesn't encrypt > metadata part). Basically, if your tool merely sets up an entry in the kernel keyring, then calls the subvolume creation interface (passing in the key ID), then it
Re: [RFC] Preliminary BTRFS Encryption
On 2016-09-13 09:39, Anand Jain wrote: This patchset adds btrfs encryption support. The main objective of this series is to have bugs fixed and stability. I have verified with fstests to confirm that there is no regression. A design write-up is coming next, however here below is the quick example on the cli usage. Please try out, let me know if I have missed something. Also would like to mention that a review from the security experts is due, which is important and I believe those review comments can be accommodated without major changes from here. Also yes, thanks for the emails, I hear, per file encryption and inline with vfs layer is also important, which is wip among other things in the list. As of now these patch set supports encryption on per subvolume, as managing properties on per subvolume is a kind of core to btrfs, which is easier for data center solution-ing, seamlessly persistent and easy to manage. Steps: - Make sure following kernel TFMs are compiled in. # cat /proc/crypto | egrep 'cbc\(aes\)|ctr\(aes\)' name : ctr(aes) name : cbc(aes) Create encrypted subvolume. # btrfs su create -e 'ctr(aes)' /btrfs/e1 Create subvolume '/btrfs/e1' Passphrase: Again passphrase: A key is created and its hash is updated into the subvolume item, and then added to the system keyctl. # btrfs su show /btrfs/e1 | egrep -i encrypt Encryption: ctr(aes)@btrfs:75197c8e (594790215) # keyctl show 594790215 Keyring 594790215 --alsw-v 0 0 logon: btrfs:75197c8e Now any file data extents under the subvol /btrfs/e1 will be encrypted. You may revoke key using keyctl or btrfs(8) as below. # btrfs su encrypt -k out /btrfs/e1 # btrfs su show /btrfs/e1 | egrep -i encrypt Encryption: ctr(aes)@btrfs:75197c8e (Required key not available) # keyctl show 594790215 Keyring Unable to dump key: Key has been revoked As the key hash is updated, If you provide wrong passphrase in the next key in, it won't add key to the system. So we have key verification from the day1. # btrfs su encrypt -k in /btrfs/e1 Passphrase: Again passphrase: ERROR: failed to set attribute 'btrfs.encrypt' to 'ctr(aes)@btrfs:75197c8e' : Key was rejected by service ERROR: key set failed: Key was rejected by service # btrfs su encrypt -k in /btrfs/e1 Passphrase: Again passphrase: key for '/btrfs/e1' has logged in with keytag 'btrfs:75197c8e' Now if you revoke the key the read / write fails with key error. # md5sum /btrfs/e1/2k-test-file 8c9fbc69125ebe84569a5c1ca088cb14 /btrfs/e1/2k-test-file # btrfs su encrypt -k out /btrfs/e1 # md5sum /btrfs/e1/2k-test-file md5sum: /btrfs/e1/2k-test-file: Key has been revoked # cp /tfs/1k-test-file /btrfs/e1/ cp: cannot create regular file ‘/btrfs/e1/1k-test-file’: Key has been revoked Plain text memory scratches for security reason is pending. As there are some key revoke notification challenges to coincide with encryption context switch, which I do believe should be fixed in the due course, but is not a roadblock at this stage. Before I make any other comments, I should state that I asbolutely agree with Alex Elsayed about the issues with using CBC or CTR mode, and not supporting AE or AEAD modes. If that's going to be the case, then there's essentially no point in merging this as is, as it has worse security than other filesystem level encryption options in the kernel by a pretty significant margin. This absolutely _needs_ to be done right the first time, otherwise the reputation of BTRFS will suffer further, and nobody sane is going to use subvolume encryption for years after it's 'fixed' to be properly secure. 
Now, the other thing I wanted to comment about: How does this handle cloning of extents? Can extents be cloned across subvolume boundaries when one of the subvolumes is encrypted? Can they be cloned within an encrypted subvolume? What happens when you try to clone them in either case if it isn't supported? -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] Preliminary BTRFS Encryption
Thanks for commenting. pls see inline below. On 09/15/2016 12:53 PM, Alex Elsayed wrote: On Tue, 13 Sep 2016 21:39:46 +0800, Anand Jain wrote: This patchset adds btrfs encryption support. The main objective of this series is to have bugs fixed and stability. I have verified with fstests to confirm that there is no regression. A design write-up is coming next, however here below is the quick example on the cli usage. Please try out, let me know if I have missed something. Also would like to mention that a review from the security experts is due, which is important and I believe those review comments can be accommodated without major changes from here. Also yes, thanks for the emails, I hear, per file encryption and inline with vfs layer is also important, which is wip among other things in the list. As of now these patch set supports encryption on per subvolume, as managing properties on per subvolume is a kind of core to btrfs, which is easier for data center solution-ing, seamlessly persistent and easy to manage. Steps: - Make sure following kernel TFMs are compiled in. # cat /proc/crypto | egrep 'cbc\(aes\)|ctr\(aes\)' name : ctr(aes) name : cbc(aes) First problem: These are purely encryption algorithms, rather than AE (Authenticated Encryption) or AEAD (Authenticated Encryption with Associated Data). As a result, they are necessarily vulnerable to adaptive chosen-ciphertext attacks, and CBC has historically had other issues. I highly recommend using a well-reviewed AE or AEAD mode, such as AES-GCM (as ecryptfs does), as long as the code can handle the ciphertext being longer than the plaintext. If it _cannot_ handle the ciphertext being longer than the plaintext, please consider that a very serious red flag: It means that you cannot provide better security than block-level encryption, which greatly reduces the benefit of filesystem-integrated encryption. Being at the extent level _should_ permit using AEAD - if it does not, something is wrong. If at all possible, I'd suggest _only_ permitting AEAD cipher modes to be used. Anyway, even for block-level encryption, CTR and CBC have been considered obsolete and potentially dangerous to use in disk encryption for quite a while - current recommendations for block-level encryption are to use either a narrow-block tweakable cipher mode (such as XTS), or a wide- block one (such as EME or CMC), with the latter providing slightly better security, but worse performance. Yes. CTR should be changed, so I have kept it as a cli option. And with the current internal design, hope we can plugin more algorithms as suggested/if-its-outdated and yes code can handle (or with a little tweak) bigger ciphertext (than plaintext) as well. encryption + keyhash (as below) + Btrfs-data-checksum provides similar to AE, right ? Create encrypted subvolume. # btrfs su create -e 'ctr(aes)' /btrfs/e1 Create subvolume '/btrfs/e1' Passphrase: Again passphrase: I presume the command first creates a key, then creates a subvolume referencing that key? If so, that seems sensible. Hmm I didn't get the why part, any help ? (this doesn't encrypt metadata part). A key is created and its hash is updated into the subvolume item, and then added to the system keyctl. # btrfs su show /btrfs/e1 | egrep -i encrypt Encryption: ctr(aes)@btrfs:75197c8e (594790215) # keyctl show 594790215 Keyring 594790215 --alsw-v 0 0 logon: btrfs:75197c8e That's entirely reasonable, though you may want to support "trusted and encrypted keys" (Documentation/security/keys-trusted-encrypted.txt) Yes. 
that's in the list. Now any file data extents under the subvol /btrfs/e1 will be encrypted. You may revoke key using keyctl or btrfs(8) as below. # btrfs su encrypt -k out /btrfs/e1 # btrfs su show /btrfs/e1 | egrep -i encrypt Encryption: ctr(aes)@btrfs:75197c8e (Required key not available) # keyctl show 594790215 Keyring Unable to dump key: Key has been revoked As the key hash is updated, If you provide wrong passphrase in the next key in, it won't add key to the system. So we have key verification from the day1. This is good. Thanks. Thanks, Anand # btrfs su encrypt -k in /btrfs/e1 Passphrase: Again passphrase: ERROR: failed to set attribute 'btrfs.encrypt' to 'ctr(aes)@btrfs:75197c8e' : Key was rejected by service ERROR: key set failed: Key was rejected by service # btrfs su encrypt -k in /btrfs/e1 Passphrase: Again passphrase: key for '/btrfs/e1' has logged in with keytag 'btrfs:75197c8e' Now if you revoke the key the read / write fails with key error. # md5sum /btrfs/e1/2k-test-file 8c9fbc69125ebe84569a5c1ca088cb14 /btrfs/e1/2k-test-file # btrfs su encrypt -k out /btrfs/e1 # md5sum /btrfs/e1/2k-test-file md5sum: /btrfs/e1/2k-test-file: Key has been revoked # cp /tfs/1k-test-file /btrfs/e1/ cp: cannot create regular file ‘/btrfs/e1/1k-test-file’: Key has been revoked Plain text
Re: [RFC] Preliminary BTRFS Encryption
Thanks for the comments. Pls see inline below.

On 09/15/2016 01:38 PM, Chris Murphy wrote:
> On Tue, Sep 13, 2016 at 7:39 AM, Anand Jain wrote:
>> This patchset adds btrfs encryption support.
>>
>> The main objective of this series is to have bugs fixed and stability.
>> I have verified with fstests to confirm that there is no regression.
>>
>> A design write-up is coming next, however here below is the quick
>> example on the cli usage. Please try out, let me know if I have missed
>> something.
>
> What's the behavior with nested subvolumes having different keys?
>
> subvolume A (encrypted with key A)
>  |
>  - subvolume B (encrypted with key B)
>
> Without encryption I can discover either A or B whether top-level, A,
> or B are mounted. With encryption, must A be opened [1] for B to be
> discovered? Must A be opened before B can be opened? Or is the
> subvolume metadata always non-encrypted, and it's just file extents
> that are encrypted? Are filenames in those subvolumes discoverable
> (e.g. btrfs-debug-tree, btrfs-image) if the subvolume is not opened?
> And reflink handling between subvolumes behaves how?

Nested encrypted subvolumes aren't supported; it's just that it wasn't in my mind, or the use case analysis review which I did didn't tell me that. However I did a bit of code changes, it's not that tough to get that in the current setup though.

Yes, only extents are encrypted.

Thanks, Anand

> [1] open in the cryptsetup open/luksOpen sense

-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: stability matrix
On 09/15/2016 04:14 AM, Christoph Anton Mitterer wrote: > Hey. > > As for the stability matrix... > > In general: > - I think another column should be added, which tells when and for > which kernel version the feature-status of each row was > revised/updated the last time and especially by whom. > If a core dev makes a statement on a particular feature, this > probably means much more, than if it was made by "just" a list > regular. > And yes I know, in the beginning it already says "this is for 4.7"... > but let's be honest, it's pretty likely when this is bumped to 4.8 > that not each and every point will be thoroughly checked again. > - Optionally even one further column could be added, that lists bugs > where the specific cases are kept record of (if any). > - Perhaps a 3rd Status like "eats-your-data" which is worse than > critical, e.g. for things were it's known that there is a high > chance for still getting data corruption (RAID56?) About the "for 4.7" issue... The Status page could have an extra column, which for every OK labeled row lists the first version (kernel.org x.y.0 release) it's OK for. The bugs make it more complicated. * Feature A is labeled OK in kernel 5.0 * During development of kernel 8-rc, an eat my data bug is fixed. The OK for this feature in the table is bumped to 8.0? * kernel 5 is EOL * kernel 6 is still supported, and the fix is applied to 6.12 * then there's distros which have their own old kernels, applying fixes on them whenever they like, for example 5.6-distro4 which is leading its own life "Normal" users are using distro kernels. They shouldn't be panicing about their data if they're running 6.14 or 5.6-distro4, but the OK in the table is bumped to 8.0 because of the serious bugs. At least the official kernels should be tracked in the table I think. Separately, a list of known serious bugs per feature (like the 4 about compression, http://www.spinics.net/lists/linux-btrfs/msg58674.html ) could be listed on another Bugs! page (lots of work) so a user, or someone helping the user can see if the listed commits are or aren't included in the actual whatever kernel a user is using. This list of serious bugs could also help disussions that now sound like "yeah, there were issues with compression which some time ago got fixed, but noone knows what it was and when, so don't use compression". Many of the commits which fix serious bugs (even if they're only triggered in an edge case) have some explanation about how to trigger them, like the excellent commit messages of Filipe in the commits mentioned above. This helps setting up and maintaining the bug page, and helps advanced users to decide if they're hitting the edge case or not with their usage pattern. I'd like to help creating/maintaining this bug overview. A good start would be to just crawl through all stable kernels and some distro kernels and see which commits show up in fs/btrfs. -- Hans van Kranenburg -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH]btrfs-progs: Add fast,slow symlinks and fifo types to convert test
Signed-off-by: Lakshmipathi.G--- tests/common.convert | 18 +++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/tests/common.convert b/tests/common.convert index 67c99b1..2790be5 100644 --- a/tests/common.convert +++ b/tests/common.convert @@ -25,10 +25,10 @@ generate_dataset() { done ;; - symlink) + fast_symlink) for num in $(seq 1 $DATASET_SIZE); do run_check $SUDO_HELPER touch $dirpath/$dataset_type.$num - run_check $SUDO_HELPER ln -s $dirpath/$dataset_type.$num $dirpath/slink.$num + run_check $SUDO_HELPER cd $dirpath && ln -s $dataset_type.$num $dirpath/slink.$num && cd / done ;; @@ -71,12 +71,24 @@ generate_dataset() { run_check $SUDO_HELPER setfattr -n user.foo -v bar$num $dirpath/$dataset_type.$num done ;; + fifo) + for num in $(seq 1 $DATASET_SIZE); do + run_check $SUDO_HELPER mkfifo $dirpath/$dataset_type.$num + done + ;; + slow_symlink) + for num in $(seq 1 $DATASET_SIZE); do + fname64=`date +%s | sha256sum | cut -f1 -d'-'` + run_check $SUDO_HELPER touch $dirpath/$fname64 + run_check $SUDO_HELPER ln -s $dirpath/$fname64 $dirpath/slow_slink.$num + done + ;; esac } populate_fs() { -for dataset_type in 'small' 'hardlink' 'symlink' 'brokenlink' 'perm' 'sparse' 'acls'; do +for dataset_type in 'small' 'hardlink' 'fast_symlink' 'brokenlink' 'perm' 'sparse' 'acls' 'fifo' 'slow_symlink'; do generate_dataset "$dataset_type" done } -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
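For reference, these presumably get exercised through the usual btrfs-progs test targets from a built tree (assuming the standard test layout; run as root since the tests mount loop devices):

make test-convert
# or run the convert suite directly:
cd tests && ./convert-tests.sh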
Re: Is stability a joke?
Am Donnerstag, 15. September 2016, 07:55:36 CEST schrieb Kai Krakow: > Am Mon, 12 Sep 2016 08:20:20 -0400 > > schrieb "Austin S. Hemmelgarn": > > On 2016-09-11 09:02, Hugo Mills wrote: > > > On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote: > > >> Martin Steigerwald wrote: > > [...] > > [...] > > [...] > > [...] > > > > >> That is exactly the same reason I don't edit the wiki myself. I > > >> could of course get it started and hopefully someone will correct > > >> what I write, but I feel that if I start this off I don't have deep > > >> enough knowledge to do a proper start. Perhaps I will change my > > >> mind about this. > > >> > > >Given that nobody else has done it yet, what are the odds that > > > > > > someone else will step up to do it now? I would say that you should > > > at least try. Yes, you don't have as much knowledge as some others, > > > but if you keep working at it, you'll gain that knowledge. Yes, > > > you'll probably get it wrong to start with, but you probably won't > > > get it *very* wrong. You'll probably get it horribly wrong at some > > > point, but even the more knowledgable people you're deferring to > > > didn't identify the problems with parity RAID until Zygo and Austin > > > and Chris (and others) put in the work to pin down the exact > > > issues. > > > > FWIW, here's a list of what I personally consider stable (as in, I'm > > willing to bet against reduced uptime to use this stuff on production > > systems at work and personal systems at home): > > 1. Single device mode, including DUP data profiles on single device > > without mixed-bg. > > 2. Multi-device raid0, raid1, and raid10 profiles with symmetrical > > devices (all devices are the same size). > > 3. Multi-device single profiles with asymmetrical devices. > > 4. Small numbers (max double digit) of snapshots, taken at infrequent > > intervals (no more than once an hour). I use single snapshots > > regularly to get stable images of the filesystem for backups, and I > > keep hourly ones of my home directory for about 48 hours. > > 5. Subvolumes used to isolate parts of a filesystem from snapshots. > > I use this regularly to isolate areas of my filesystems from backups. > > 6. Non-incremental send/receive (no clone source, no parent's, no > > deduplication). I use this regularly for cloning virtual machines. > > 7. Checksumming and scrubs using any of the profiles I've listed > > above. 8. Defragmentation, including autodefrag. > > 9. All of the compat_features, including no-holes and skinny-metadata. > > > > Things I consider stable enough that I'm willing to use them on my > > personal systems but not systems at work: > > 1. In-line data compression with compress=lzo. I use this on my > > laptop and home server system. I've never had any issues with it > > myself, but I know that other people have, and it does seem to make > > other things more likely to have issues. > > 2. Batch deduplication. I only use this on the back-end filesystems > > for my personal storage cluster, and only because I have multiple > > copies as a result of GlusterFS on top of BTRFS. I've not had any > > significant issues with it, and I don't remember any reports of data > > loss resulting from it, but it's something that people should not be > > using if they don't understand all the implications. > > I could at least add one "don't do it": > > Don't use BFQ patches (it's an IO scheduler) if you're using btrfs. 
> Some people like to use it especially for running VMs and desktops > because it provides very good interactivity while maintaining very good > throughput. But it completely destroyed my btrfs beyond repair at least > twice, either while actually using a VM (in VirtualBox) or during high > IO loads. I now stick to the deadline scheduler instead which provides > very good interactivity for me, too, and the corruptions didn't occur > again so far. > > The story with BFQ has always been the same: System suddenly freezes > during moderate to high IO until all processes stop working (no process > shows D state, tho). Only hard reboot possible. After rebooting, access > to some (unrelated) files may fail with "errno=-17 Object already > exists" which cannot be repaired. If it affects files needed during > boot, you are screwed because file system goes RO.

This could be a further row in the table.

And well… as for CFQ: Jens Axboe is currently working on bandwidth throttling patches for exactly this reason, to provide better interactivity and fairness between I/O operations. Right now, the "Completely Fair" in CFQ is a *huge* exaggeration, at least while a dd bs=1M job is running.

Thanks,
-- 
Martin
-- 
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
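As a practical footnote to the scheduler advice quoted above (a sketch only; the device name is an example, and whether bfq is listed at all depends on the kernel build), checking and switching the scheduler at runtime looks like this:

  cat /sys/block/sdb/queue/scheduler               # e.g. "noop [deadline] cfq"
  echo deadline > /sys/block/sdb/queue/scheduler   # switch at runtime, as root

For a persistent choice with the legacy block layer, a boot parameter such as elevator=deadline is one common approach.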
Re: 4.4.0 - no space left with >1.7 TB free space left
On Fri, 8 Apr 2016 16:53:32 +0500 Roman Mamedovwrote: > On Fri, 08 Apr 2016 20:36:26 +0900 > Tomasz Chmielewski wrote: > > > On 2016-02-08 20:24, Roman Mamedov wrote: > > > > >> Linux 4.4.0 - btrfs is mainly used to host lots of test containers, > > >> often snapshots, and at times, there is heavy IO in many of them for > > >> extended periods of time. btrfs is on HDDs. > > >> > > >> > > >> Every few days I'm getting "no space left" in a container running > > >> mongo > > >> 3.2.1 database. Interestingly, haven't seen this issue in containers > > >> with MySQL. All databases have chattr +C set on their directories. > > > > > > Hello, > > > > > > Do you snapshot the parent subvolume which holds the databases? Can you > > > correlate that perhaps ENOSPC occurs at the time of snapshotting? If > > > yes, then > > > you should try the patch https://patchwork.kernel.org/patch/7967161/ > > > > > > (Too bad this was not included into 4.4.1.) > > > > By the way - was it included in any later kernel? I'm running 4.4.5 on > > that server, but still hitting the same issue. > > It's not in 4.4.6 either. I don't know why it doesn't get included, or what > we need to do. Last time I asked, it was queued: > http://www.spinics.net/lists/linux-btrfs/msg52478.html > But maybe that meant 4.5 or 4.6 only? While the bug is affecting people on > 4.4.x today. This got applied now in 4.4.21, thanks. -- With respect, Roman pgpqOFdEk406P.pgp Description: OpenPGP digital signature
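For anyone wanting to verify this kind of inclusion themselves (a sketch; the commit hash is a placeholder, not the actual ENOSPC fix), git can answer whether a given commit is contained in a given stable release:

  cd linux-stable
  # does v4.4.21 contain the fix?
  git merge-base --is-ancestor <commit-sha> v4.4.21 && echo included || echo missing
  # or: list all tags that already carry it
  git tag --contains <commit-sha>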
Re: Is stability a joke?
Hello Nicholas.

Am Mittwoch, 14. September 2016, 21:05:52 CEST schrieb Nicholas D Steeves:
> On Mon, Sep 12, 2016 at 08:20:20AM -0400, Austin S. Hemmelgarn wrote:
> > On 2016-09-11 09:02, Hugo Mills wrote:
[…]
> > As far as documentation though, we [BTRFS] really do need to get our act
> > together. It really doesn't look good to have most of the best
> > documentation be in the distro's wikis instead of ours. I'm not trying to
> > say the distros shouldn't be documenting BTRFS, but the point at which
> > Debian (for example) has better documentation of the upstream version of
> > BTRFS than the upstream project itself does, that starts to look bad.
>
> I would have loved to have this feature-to-stability list when I
> started working on the Debian documentation! I started it because I
> was saddened by number of horror story "adventures with btrfs"
> articles and posts I had read about, combined with the perspective of
> certain members within the Debian community that it was a toy fs.
>
> Are my contributions to that wiki of a high enough quality that I
> can work on the upstream one? Do you think the broader btrfs
> community is interested in citations and curated links to discussions?
>
> eg: if a company wants to use btrfs, they check the status page, see a
> feature they want is still in the yellow zone of stabilisation, and
> then follow the links to familiarise themselves with past discussions.
> I imagine this would also help individuals or grad students more
> quickly familiarise themselves with the available literature before
> choosing a specific project. If regular updates from SUSE, STRATO,
> Facebook, and Fujitsu are also publicly available the k.org wiki would
> be a wonderful place to syndicate them!

I definitely think the quality of your contributions is high enough, and others can proofread and add their own experiences, so… By *all* means, go ahead *already*.

Not all of it will fit inside the table directly, I bet, *but* you can use footnotes or further explanations for features that need them, with a headline per feature below the table and a link to it from within the table.

Thank you!
-- 
Martin
-- 
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
multi-device btrfs with single data mode and disk failure
I had a btrfs partition on a 6-disk array without raid (metadata in raid10, but data in single), and one of the disks just died. So I lost some of my data, ok, I knew that. But two questions:

* Is it possible to know (using the metadata, I suppose) which data I have lost?
* Is it possible to do some kind of "btrfs delete missing" on this kind of setup, in order to recover read-write access to my other data, or must I copy all my data to a new partition?

I haven't been able to find any answer on Google or in the wiki, so I'm sending an e-mail here, hoping this is the right place. Excuse me if I'm wrong. Thank you for any help. (Sorry for my poor English.)

uname -a:
Linux Grand-PC 4.7.2-1-ARCH #1 SMP PREEMPT Sat Aug 20 23:02:56 CEST 2016 x86_64 GNU/Linux

btrfs --version:
btrfs-progs v4.7.1

btrfs fi show:
Label: 'Data'  uuid: 62db560b-a040-4c64-b613-6e7db033dc4d
	Total devices 6 FS bytes used 6.66TiB
	devid  1 size 2.53TiB used 2.12TiB path /dev/sdd6
	devid  7 size 2.53TiB used 2.12TiB path /dev/sdb6
	devid  9 size 262.57GiB used 0.00B path /dev/sde6
	devid 11 size 2.53TiB used 2.12TiB path /dev/sdc6
	devid 12 size 728.32GiB used 312.03GiB path /dev/sda6
	*** Some devices missing

mount -o recovery,ro,degraded /dev/sda6 /Data

relevant part of dmesg:
[ 1828.093704] BTRFS warning (device sda6): 'recovery' is deprecated, use 'usebackuproot' instead
[ 1828.093708] BTRFS info (device sda6): trying to use backup root at mount time
[ 1828.093718] BTRFS info (device sda6): allowing degraded mounts
[ 1828.093719] BTRFS info (device sda6): disk space caching is enabled
[ 1828.107763] BTRFS warning (device sda6): devid 8 uuid 950378c0-307c-413d-9805-ab2bb899aa78 missing

btrfs fi df /Data
Data, single: total=6.65TiB, used=6.65TiB
System, RAID1: total=32.00MiB, used=768.00KiB
Metadata, RAID1: total=13.00GiB, used=10.99GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
-- 
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
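A rough sketch of one way to approach the first question, under the assumption that the filesystem stays mounted read-only and degraded as shown above (the log paths are examples): since the metadata profile is redundant (RAID1 per the btrfs fi df output), every file name should still be visible, so reading everything and recording what fails is a blunt way to enumerate the data that lived on the missing device:

  find /Data -type f -print0 |
  while IFS= read -r -d '' f; do
          cat "$f" >/dev/null 2>>/tmp/read-errors.log || echo "$f" >>/tmp/lost-files.txt
  done
  # dmesg will also log read/csum errors for the affected extents

This only identifies unreadable files; it doesn't answer the second question about regaining read-write access.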
Re: Is stability a joke?
Am Mon, 12 Sep 2016 08:20:20 -0400 schrieb "Austin S. Hemmelgarn": > On 2016-09-11 09:02, Hugo Mills wrote: > > On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote: > >> Martin Steigerwald wrote: > [...] > [...] > [...] > [...] > >> That is exactly the same reason I don't edit the wiki myself. I > >> could of course get it started and hopefully someone will correct > >> what I write, but I feel that if I start this off I don't have deep > >> enough knowledge to do a proper start. Perhaps I will change my > >> mind about this. > > > >Given that nobody else has done it yet, what are the odds that > > someone else will step up to do it now? I would say that you should > > at least try. Yes, you don't have as much knowledge as some others, > > but if you keep working at it, you'll gain that knowledge. Yes, > > you'll probably get it wrong to start with, but you probably won't > > get it *very* wrong. You'll probably get it horribly wrong at some > > point, but even the more knowledgable people you're deferring to > > didn't identify the problems with parity RAID until Zygo and Austin > > and Chris (and others) put in the work to pin down the exact > > issues. > FWIW, here's a list of what I personally consider stable (as in, I'm > willing to bet against reduced uptime to use this stuff on production > systems at work and personal systems at home): > 1. Single device mode, including DUP data profiles on single device > without mixed-bg. > 2. Multi-device raid0, raid1, and raid10 profiles with symmetrical > devices (all devices are the same size). > 3. Multi-device single profiles with asymmetrical devices. > 4. Small numbers (max double digit) of snapshots, taken at infrequent > intervals (no more than once an hour). I use single snapshots > regularly to get stable images of the filesystem for backups, and I > keep hourly ones of my home directory for about 48 hours. > 5. Subvolumes used to isolate parts of a filesystem from snapshots. > I use this regularly to isolate areas of my filesystems from backups. > 6. Non-incremental send/receive (no clone source, no parent's, no > deduplication). I use this regularly for cloning virtual machines. > 7. Checksumming and scrubs using any of the profiles I've listed > above. 8. Defragmentation, including autodefrag. > 9. All of the compat_features, including no-holes and skinny-metadata. > > Things I consider stable enough that I'm willing to use them on my > personal systems but not systems at work: > 1. In-line data compression with compress=lzo. I use this on my > laptop and home server system. I've never had any issues with it > myself, but I know that other people have, and it does seem to make > other things more likely to have issues. > 2. Batch deduplication. I only use this on the back-end filesystems > for my personal storage cluster, and only because I have multiple > copies as a result of GlusterFS on top of BTRFS. I've not had any > significant issues with it, and I don't remember any reports of data > loss resulting from it, but it's something that people should not be > using if they don't understand all the implications. I could at least add one "don't do it": Don't use BFQ patches (it's an IO scheduler) if you're using btrfs. Some people like to use it especially for running VMs and desktops because it provides very good interactivity while maintaining very good throughput. But it completely destroyed my btrfs beyond repair at least twice, either while actually using a VM (in VirtualBox) or during high IO loads. 
I now stick to the deadline scheduler instead which provides very good interactivity for me, too, and the corruptions didn't occur again so far. The story with BFQ has always been the same: System suddenly freezes during moderate to high IO until all processes stop working (no process shows D state, tho). Only hard reboot possible. After rebooting, access to some (unrelated) files may fail with "errno=-17 Object already exists" which cannot be repaired. If it affects files needed during boot, you are screwed because file system goes RO. -- Regards, Kai Replies to list-only preferred. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
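As an aside on items 4 and 7 of Austin's list quoted above (a sketch only; the paths are examples, not anything from the thread): taking a read-only snapshot as a stable source for a backup run, and scrubbing to verify checksums, typically looks like this:

  # read-only snapshot as a stable image for the backup run
  btrfs subvolume snapshot -r /home /home/.snapshots/home-backup
  # ... run the backup against /home/.snapshots/home-backup ...
  btrfs subvolume delete /home/.snapshots/home-backup
  # foreground scrub of the whole filesystem
  btrfs scrub start -B /home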