Re: Is stability a joke? (wiki updated)
On Mon, Sep 19, 2016 at 01:38:36PM -0400, Austin S. Hemmelgarn wrote: > >>I'm not sure if the btrfsck is really all that helpful to users as much > >>as it is for developers to better learn about the failure vectors of > >>the file system. > > > >ReiserFS had no working fsck for all of the 8 years I used it (and still > >didn't last year when I tried to use it on an old disk). "Not working" > >here means "much less data is readable from the filesystem after running > >fsck than before." It's not that much of an inconvenience if you have > >backups. > For a small array, this may be the case. Once you start looking into double > digit TB scale arrays though, restoring backups becomes a very expensive > operation. If you had a multi-PB array with a single dentry which had no > inode, would you rather be spending multiple days restoring files and > possibly losing recent changes, or spend a few hours to check the filesystem > and fix it with minimal data loss? I'd really prefer to be able to delete the dead dentry with 'rm' as root, or failing that, with a ZDB-like tool or ioctl, if it's the only known instance of such a bad metadata object and I already know where it's located. Usually the ultimate failure mode of a btrfs filesystem is a read-only filesystem from which you can read most or all of your data, but you can't ever make it writable again because of fsck limitations. The one thing I do miss about every filesystem that isn't ext2/ext3 is an automated fsck that prioritizes availability, making the filesystem safely writable even if it can't recover lost data. On the other hand, fixing an ext[23] filesystem is utterly trivial compared to any btree-based filesystem.
Re: Is stability a joke? (wiki updated)
On 2016-09-19 14:27, Chris Murphy wrote: On Mon, Sep 19, 2016 at 11:38 AM, Austin S. Hemmelgarn wrote: ReiserFS had no working fsck for all of the 8 years I used it (and still didn't last year when I tried to use it on an old disk). "Not working" here means "much less data is readable from the filesystem after running fsck than before." It's not that much of an inconvenience if you have backups. For a small array, this may be the case. Once you start looking into double digit TB scale arrays though, restoring backups becomes a very expensive operation. If you had a multi-PB array with a single dentry which had no inode, would you rather be spending multiple days restoring files and possibly losing recent changes, or spend a few hours to check the filesystem and fix it with minimal data loss? Yep, restoring backups, even fully re-replicating data in a cluster, is untenable, it's so expensive. But even offline fsck is sufficiently non-scalable that at a certain volume size it's not tenable. 100TB takes a long time to fsck offline, and is it even possible to fsck 1PB Btrfs? Seems to me it's another case where, if it were possible to isolate what tree limbs are sick, just cut them off and report the data loss rather than consider the whole fs unusable. That's what we do with living things. This is part of why I said the ZFS approach is valid. At the moment though, we can't even do that, and to do it properly, we'd need a tool to bypass the VFS layer to prune the tree, which is non-trivial in and of itself. It would be nice to have a mode in check where you could say 'I know this path in the FS has some kind of issue, figure out what's wrong and fix it if possible, otherwise optionally prune that branch from the appropriate tree'. On the same note, it would be nice to be able to manually restrict it to specific checks (eg, 'check only for orphaned inodes', or 'only validate the FSC/FST').
If we were to add such functionality, dealing with some minor corruption in a 100TB+ array wouldn't be quite as much of an issue. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Is stability a joke? (wiki updated)
On Mon, Sep 19, 2016 at 11:38 AM, Austin S. Hemmelgarn wrote: >> ReiserFS had no working fsck for all of the 8 years I used it (and still >> didn't last year when I tried to use it on an old disk). "Not working" >> here means "much less data is readable from the filesystem after running >> fsck than before." It's not that much of an inconvenience if you have >> backups. > > For a small array, this may be the case. Once you start looking into double > digit TB scale arrays though, restoring backups becomes a very expensive > operation. If you had a multi-PB array with a single dentry which had no > inode, would you rather be spending multiple days restoring files and > possibly losing recent changes, or spend a few hours to check the filesystem > and fix it with minimal data loss? Yep, restoring backups, even fully re-replicating data in a cluster, is untenable, it's so expensive. But even offline fsck is sufficiently non-scalable that at a certain volume size it's not tenable. 100TB takes a long time to fsck offline, and is it even possible to fsck 1PB Btrfs? Seems to me it's another case where, if it were possible to isolate what tree limbs are sick, just cut them off and report the data loss rather than consider the whole fs unusable. That's what we do with living things. -- Chris Murphy
Re: Is stability a joke? (wiki updated)
On 2016-09-19 00:08, Zygo Blaxell wrote: On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote: Right, well I'm vaguely curious why ZFS, as different as it is, basically take the position that if the hardware went so batshit that they can't unwind it on a normal mount, then an fsck probably can't help either... they still don't have an fsck and don't appear to want one. ZFS has no automated fsck, but it does have a kind of interactive debugger that can be used to manually fix things. ZFS seems to be a lot more robust when it comes to handling bad metadata (contrast with btrfs-style BUG_ON panics). When you delete a directory entry that has a missing inode on ZFS, the dirent goes away. In the ZFS administrator documentation they give examples of this as a response in cases where ZFS metadata gets corrupted. When you delete a file with a missing inode on btrfs, something (VFS?) wants to check the inode to see if it has attributes that might affect unlink (e.g. the immutable bit), gets an error reading the inode, and bombs out of the unlink() before unlink() can get rid of the dead dirent. So if you get a dirent with no inode on btrfs on a large filesystem (too large for btrfs check to handle), you're basically stuck with it forever. You can't even rename it. Hopefully it doesn't happen in a top-level directory. ZFS is also infamous for saying "sucks to be you, I'm outta here" when things go wrong. People do want ZFS fsck and defrag, but nobody seems to be bothered much about making those things happen. At the end of the day I'm not sure fsck really matters. If the filesystem is getting corrupted enough that both copies of metadata are broken, there's something fundamentally wrong with that setup (hardware bugs, software bugs, bad RAM, etc) and it's just going to keep slowly eating more data until the underlying problem is fixed, and there's no guarantee that a repair is going to restore data correctly. 
If we exclude broken hardware, the only thing btrfs check is going to repair is btrfs kernel bugs...and in that case, why would we expect btrfs check to have fewer bugs than the filesystem itself? I wouldn't, but I would still expect to have some tool to deal with things like orphaned inodes, dentries which are missing inodes, and other similar cases that don't make the filesystem unusable, but can't easily be fixed in a sane manner on a live filesystem. The ZFS approach is valid, but it can't deal with things like orphaned inodes where there's no reference in the directories any more. I'm not sure if the btrfsck is really all that helpful to users as much as it is for developers to better learn about the failure vectors of the file system. ReiserFS had no working fsck for all of the 8 years I used it (and still didn't last year when I tried to use it on an old disk). "Not working" here means "much less data is readable from the filesystem after running fsck than before." It's not that much of an inconvenience if you have backups. For a small array, this may be the case. Once you start looking into double digit TB scale arrays though, restoring backups becomes a very expensive operation. If you had a multi-PB array with a single dentry which had no inode, would you rather be spending multiple days restoring files and possibly losing recent changes, or spend a few hours to check the filesystem and fix it with minimal data loss?
Re: Is stability a joke? (wiki updated)
On Mon, Sep 19, 2016 at 08:32:14AM -0400, Austin S. Hemmelgarn wrote: > On 2016-09-18 23:47, Zygo Blaxell wrote: > >On Mon, Sep 12, 2016 at 12:56:03PM -0400, Austin S. Hemmelgarn wrote: > >>4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS > >>is healthy. > > > >I've found issues with OOB dedup (clone/extent-same): > > > >1. Don't dedup data that has not been committed--either call fsync() > >on it, or check the generation numbers on each extent before deduping > >it, or make sure the data is not being actively modified during dedup; > >otherwise, a race condition may lead to the filesystem locking up and > >becoming inaccessible until the kernel is rebooted. This is particularly > >important if you are doing bedup-style incremental dedup on a live system. > > > >I've worked around #1 by placing a fsync() call on the src FD immediately > >before calling FILE_EXTENT_SAME. When I do an A/B experiment with and > >without the fsync, "with-fsync" runs for weeks at a time without issues, > >while "without-fsync" hangs, sometimes in just a matter of hours. Note > >that the fsync() doesn't resolve the underlying race condition, it just > >makes the filesystem hang less often. > > > >2. There is a practical limit to the number of times a single duplicate > >extent can be deduplicated. As more references to a shared extent > >are created, any part of the filesystem that uses backref walking code > >gets slower. This includes dedup itself, balance, device replace/delete, > >FIEMAP, LOGICAL_INO, and mmap() (which can be bad news if the duplicate > >files are executables). Several factors (including file size and number > >of snapshots) are involved, making it difficult to devise workarounds or > >set up test cases.
99.5% of the time, these operations just get slower > >by a few ms each time a new reference is created, but the other 0.5% of > >the time, write operations will abruptly grow to consume hours of CPU > >time or dozens of gigabytes of RAM (in millions of kmalloc-32 slabs) > >when they touch one of these over-shared extents. When this occurs, > >it effectively (but not literally) crashes the host machine. > > > >I've worked around #2 by building tables of "toxic" hashes that occur too > >frequently in a filesystem to be deduped, and using these tables in dedup > >software to ignore any duplicate data matching them. These tables can > >be relatively small as they only need to list hashes that are repeated > >more than a few thousand times, and typical filesystems (up to 10TB or > >so) have only a few hundred such hashes. > > > >I happened to have a couple of machines taken down by these issues this > >very weekend, so I can confirm the issues are present in kernels 4.4.21, > >4.5.7, and 4.7.4. > OK, that's good to know. In my case, I'm not operating on a very big data > set (less than 40GB, but the storage cluster I'm doing this on only has > about 200GB of total space, so I'm trying to conserve as much as possible), > and it's mostly static data (less than 100MB worth of changes a day except > on Sunday when I run backups), so it makes sense that I've not seen either > of these issues. I ran into issue #2 on an 8GB filesystem last weekend. The lower limit on filesystem size could be as low as a few megabytes if they're arranged in *just* the right way. > The second one sounds like the same performance issue caused by having very > large numbers of snapshots, and based on what's happening, I don't think > there's any way we could fix it without rewriting certain core code. find_parent_nodes is the usual culprit for CPU usage. Fixing this is required for in-band dedup as well, so I assume someone has it on their roadmap and will get it done eventually. 
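The "toxic hash" mitigation described in this thread, blacklisting content hashes that are already shared too many times, comes down to a simple frequency cut-off. A minimal sketch of that bookkeeping follows; the threshold value and hash function are illustrative assumptions, not values taken from any real dedup tool.

```python
# Sketch of the "toxic hash" workaround: skip dedup for extents whose
# content hash already has too many references, since every extra
# reference makes backref walks slower. Threshold and hashing details
# are illustrative assumptions, not taken from an actual dedup tool.
import hashlib
from collections import Counter

TOXIC_THRESHOLD = 2000  # refs beyond this make backref walking too slow


def block_hash(data: bytes) -> str:
    """Content hash for one extent's data (any strong hash works)."""
    return hashlib.sha256(data).hexdigest()


def build_toxic_table(extent_hashes):
    """Given one hash per extent reference, return the set of hashes
    repeated often enough to be considered 'toxic'."""
    counts = Counter(extent_hashes)
    return {h for h, n in counts.items() if n > TOXIC_THRESHOLD}


def dedup_candidates(extent_hashes, toxic):
    """Hashes worth deduplicating: duplicated, but not toxic."""
    counts = Counter(extent_hashes)
    return [h for h, n in counts.items() if n > 1 and h not in toxic]
```

As noted above, such a table stays small in practice: it only has to store the few hundred hashes per filesystem that cross the threshold, not the full dedup index.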
Re: Is stability a joke? (wiki updated)
On Mon, Sep 19, 2016 at 12:08:55AM -0400, Zygo Blaxell wrote: > > At the end of the day I'm not sure fsck really matters. If the filesystem > is getting corrupted enough that both copies of metadata are broken, > there's something fundamentally wrong with that setup (hardware bugs, > software bugs, bad RAM, etc) and it's just going to keep slowly eating > more data until the underlying problem is fixed, and there's no guarantee > that a repair is going to restore data correctly. If we exclude broken > hardware, the only thing btrfs check is going to repair is btrfs kernel > bugs...and in that case, why would we expect btrfs check to have fewer > bugs than the filesystem itself? I see btrfs check as having a very useful role: fixing known problems introduced by previous versions of kernel / progs. In my ext conversion thread, I seem to have discovered a problem introduced by convert, balance, or defrag. The data and metadata seem to be OK, however the filesystem cannot be written to without btrfs falling over. If this was caused by some edge-case data in the btrfs partition, it makes a lot more sense to have btrfs check repair it than it does to modify the kernel code to work around this and possibly many other bugs. The upshot to this is that since (potentially all of) the data is intact, a functional btrfs check would save me the hassle of restoring from backup. --Sean
Re: Is stability a joke? (wiki updated)
On 2016-09-18 22:57, Zygo Blaxell wrote: On Fri, Sep 16, 2016 at 08:00:44AM -0400, Austin S. Hemmelgarn wrote: To be entirely honest, both zero-log and super-recover could probably be pretty easily integrated into btrfs check such that it detects when they need to be run and does so. zero-log has a very well defined situation in which it's absolutely needed (log tree corrupted such that it can't be replayed), which is pretty easy to detect (the kernel obviously does so, albeit by crashing). Check already includes zero-log. It loses a little data that way, but that is probably better than the alternative (try to teach btrfs check how to replay the log tree and keep up with kernel changes). Interesting, as I've never seen check try to zero the log (even in cases where it would fix things) unless it makes some other change in the FS. I won't dispute that it clears the log tree _if_ it makes other changes to the FS (it kind of has to for safety reasons), but that's the only circumstance that I've seen it do so (even on filesystems where the log tree was corrupted, but the rest of the FS was fine). There have been at least two log-tree bugs (or, more accurately, bugs triggered while processing the log tree during mount) in the 3.x and 4.x kernels. The most recent I've encountered was in one of the 4.7-rc kernels. zero-log is certainly not obsolete. I won't dispute this, as I've had it happen myself (albeit not quite that recently), all I was trying to say was that it fixes a very well defined problem. For a filesystem where availability is more important than integrity (e.g. root filesystems) it's really handy to have zero-log as a separate tool without the huge overhead (and regression risk) of check. Agreed, hence my later statement that if it gets fully merged, there should be an option to run just that.
Re: Is stability a joke? (wiki updated)
On 2016-09-18 23:47, Zygo Blaxell wrote: On Mon, Sep 12, 2016 at 12:56:03PM -0400, Austin S. Hemmelgarn wrote: 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS is healthy. I've found issues with OOB dedup (clone/extent-same): 1. Don't dedup data that has not been committed--either call fsync() on it, or check the generation numbers on each extent before deduping it, or make sure the data is not being actively modified during dedup; otherwise, a race condition may lead to the filesystem locking up and becoming inaccessible until the kernel is rebooted. This is particularly important if you are doing bedup-style incremental dedup on a live system. I've worked around #1 by placing a fsync() call on the src FD immediately before calling FILE_EXTENT_SAME. When I do an A/B experiment with and without the fsync, "with-fsync" runs for weeks at a time without issues, while "without-fsync" hangs, sometimes in just a matter of hours. Note that the fsync() doesn't resolve the underlying race condition, it just makes the filesystem hang less often. 2. There is a practical limit to the number of times a single duplicate extent can be deduplicated. As more references to a shared extent are created, any part of the filesystem that uses backref walking code gets slower. This includes dedup itself, balance, device replace/delete, FIEMAP, LOGICAL_INO, and mmap() (which can be bad news if the duplicate files are executables). Several factors (including file size and number of snapshots) are involved, making it difficult to devise workarounds or set up test cases. 99.5% of the time, these operations just get slower by a few ms each time a new reference is created, but the other 0.5% of the time, write operations will abruptly grow to consume hours of CPU time or dozens of gigabytes of RAM (in millions of kmalloc-32 slabs) when they touch one of these over-shared extents. When this occurs, it effectively (but not literally) crashes the host machine.
I've worked around #2 by building tables of "toxic" hashes that occur too frequently in a filesystem to be deduped, and using these tables in dedup software to ignore any duplicate data matching them. These tables can be relatively small as they only need to list hashes that are repeated more than a few thousand times, and typical filesystems (up to 10TB or so) have only a few hundred such hashes. I happened to have a couple of machines taken down by these issues this very weekend, so I can confirm the issues are present in kernels 4.4.21, 4.5.7, and 4.7.4. OK, that's good to know. In my case, I'm not operating on a very big data set (less than 40GB, but the storage cluster I'm doing this on only has about 200GB of total space, so I'm trying to conserve as much as possible), and it's mostly static data (less than 100MB worth of changes a day except on Sunday when I run backups), so it makes sense that I've not seen either of these issues. The second one sounds like the same performance issue caused by having very large numbers of snapshots, and based on what's happening, I don't think there's any way we could fix it without rewriting certain core code.
Re: Is stability a joke? (wiki updated)
On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote: > Right, well I'm vaguely curious why ZFS, as different as it is, > basically take the position that if the hardware went so batshit that > they can't unwind it on a normal mount, then an fsck probably can't > help either... they still don't have an fsck and don't appear to want > one. ZFS has no automated fsck, but it does have a kind of interactive debugger that can be used to manually fix things. ZFS seems to be a lot more robust when it comes to handling bad metadata (contrast with btrfs-style BUG_ON panics). When you delete a directory entry that has a missing inode on ZFS, the dirent goes away. In the ZFS administrator documentation they give examples of this as a response in cases where ZFS metadata gets corrupted. When you delete a file with a missing inode on btrfs, something (VFS?) wants to check the inode to see if it has attributes that might affect unlink (e.g. the immutable bit), gets an error reading the inode, and bombs out of the unlink() before unlink() can get rid of the dead dirent. So if you get a dirent with no inode on btrfs on a large filesystem (too large for btrfs check to handle), you're basically stuck with it forever. You can't even rename it. Hopefully it doesn't happen in a top-level directory. ZFS is also infamous for saying "sucks to be you, I'm outta here" when things go wrong. People do want ZFS fsck and defrag, but nobody seems to be bothered much about making those things happen. At the end of the day I'm not sure fsck really matters. If the filesystem is getting corrupted enough that both copies of metadata are broken, there's something fundamentally wrong with that setup (hardware bugs, software bugs, bad RAM, etc) and it's just going to keep slowly eating more data until the underlying problem is fixed, and there's no guarantee that a repair is going to restore data correctly. 
If we exclude broken hardware, the only thing btrfs check is going to repair is btrfs kernel bugs...and in that case, why would we expect btrfs check to have fewer bugs than the filesystem itself? > I'm not sure if the btrfsck is really all that helpful to users as much > as it is for developers to better learn about the failure vectors of > the file system. ReiserFS had no working fsck for all of the 8 years I used it (and still didn't last year when I tried to use it on an old disk). "Not working" here means "much less data is readable from the filesystem after running fsck than before." It's not that much of an inconvenience if you have backups.
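The dangling-dentry situation described above, where a directory entry points at an inode that can no longer be read, can be modelled in miniature. This toy model (my own illustration, with no resemblance to real btrfs on-disk structures) contrasts the unlink path that must read the inode first with a ZFS-style repair that simply drops the dead entry:

```python
# Toy model of the dangling-dentry problem. unlink_strict() mimics the
# btrfs/VFS behaviour described above: it must read the inode (e.g. to
# check the immutable bit) before removing the name, so a missing inode
# makes the dentry undeletable. prune_dangling() mimics the ZFS-style
# repair: just drop dentries whose inode is gone. Purely illustrative.

class FsImage:
    def __init__(self):
        self.inodes = {}    # inode number -> attributes
        self.dentries = {}  # name -> inode number

    def unlink_strict(self, name):
        """Refuse to unlink if the inode cannot be read."""
        ino = self.dentries[name]
        if ino not in self.inodes:
            raise IOError(f"cannot read inode {ino} for {name!r}")
        del self.dentries[name]

    def prune_dangling(self):
        """Repair pass: drop dentries whose inode is missing."""
        dangling = [n for n, i in self.dentries.items()
                    if i not in self.inodes]
        for name in dangling:
            del self.dentries[name]
        return dangling
```

The point of the sketch is the asymmetry: the normal write path cannot get rid of the bad name, while a repair pass that is allowed to mutate metadata directly removes it trivially.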
Re: Is stability a joke? (wiki updated)
On Mon, Sep 12, 2016 at 12:56:03PM -0400, Austin S. Hemmelgarn wrote: > 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS > is healthy. I've found issues with OOB dedup (clone/extent-same): 1. Don't dedup data that has not been committed--either call fsync() on it, or check the generation numbers on each extent before deduping it, or make sure the data is not being actively modified during dedup; otherwise, a race condition may lead to the filesystem locking up and becoming inaccessible until the kernel is rebooted. This is particularly important if you are doing bedup-style incremental dedup on a live system. I've worked around #1 by placing a fsync() call on the src FD immediately before calling FILE_EXTENT_SAME. When I do an A/B experiment with and without the fsync, "with-fsync" runs for weeks at a time without issues, while "without-fsync" hangs, sometimes in just a matter of hours. Note that the fsync() doesn't resolve the underlying race condition, it just makes the filesystem hang less often. 2. There is a practical limit to the number of times a single duplicate extent can be deduplicated. As more references to a shared extent are created, any part of the filesystem that uses backref walking code gets slower. This includes dedup itself, balance, device replace/delete, FIEMAP, LOGICAL_INO, and mmap() (which can be bad news if the duplicate files are executables). Several factors (including file size and number of snapshots) are involved, making it difficult to devise workarounds or set up test cases. 99.5% of the time, these operations just get slower by a few ms each time a new reference is created, but the other 0.5% of the time, write operations will abruptly grow to consume hours of CPU time or dozens of gigabytes of RAM (in millions of kmalloc-32 slabs) when they touch one of these over-shared extents. When this occurs, it effectively (but not literally) crashes the host machine.
I've worked around #2 by building tables of "toxic" hashes that occur too frequently in a filesystem to be deduped, and using these tables in dedup software to ignore any duplicate data matching them. These tables can be relatively small as they only need to list hashes that are repeated more than a few thousand times, and typical filesystems (up to 10TB or so) have only a few hundred such hashes. I happened to have a couple of machines taken down by these issues this very weekend, so I can confirm the issues are present in kernels 4.4.21, 4.5.7, and 4.7.4.
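The fsync-before-dedupe workaround for issue #1 can be sketched roughly as follows. The struct layouts follow `struct btrfs_ioctl_same_args` / `struct btrfs_ioctl_same_extent_info` from `linux/btrfs.h` as I understand them; treat the whole thing as a sketch that is only meaningful when both file descriptors live on an actual btrfs filesystem with a kernel supporting this ioctl.

```python
# Sketch of the fsync-before-FILE_EXTENT_SAME pattern discussed above.
# The ioctl number and struct layouts are derived from linux/btrfs.h
# (BTRFS_IOC_FILE_EXTENT_SAME = _IOWR(0x94, 54, ...)); only run this
# against real files on btrfs -- here it is illustrative.
import fcntl
import os
import struct

# btrfs_ioctl_same_args: logical_offset, length, dest_count, reserved1, reserved2
SAME_ARGS = struct.Struct("=QQHHI")
# btrfs_ioctl_same_extent_info: fd, logical_offset, bytes_deduped, status, reserved
SAME_INFO = struct.Struct("=qQQiI")


def _IOWR(typ, nr, size):
    # Mirror of the kernel's _IOWR() ioctl-number macro.
    return (3 << 30) | (size << 16) | (typ << 8) | nr


BTRFS_IOC_FILE_EXTENT_SAME = _IOWR(0x94, 54, SAME_ARGS.size)


def dedupe_range(src_fd, src_off, length, dst_fd, dst_off):
    """Dedupe one extent range, flushing the source first."""
    # The workaround from the thread: never hand uncommitted data to
    # extent-same, or the filesystem may eventually lock up.
    os.fsync(src_fd)
    buf = bytearray(SAME_ARGS.pack(src_off, length, 1, 0, 0)
                    + SAME_INFO.pack(dst_fd, dst_off, 0, 0, 0))
    fcntl.ioctl(src_fd, BTRFS_IOC_FILE_EXTENT_SAME, buf)
    _, _, deduped, status, _ = SAME_INFO.unpack_from(buf, SAME_ARGS.size)
    return deduped, status  # status < 0: errno; 1: data differs
```

Note the fsync() only narrows the race window, as the thread points out; it does not close it. On kernels 4.5+, the same operation is exposed filesystem-independently as the `FIDEDUPERANGE` ioctl with an equivalently laid-out `struct file_dedupe_range`.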
Re: Is stability a joke? (wiki updated)
On Fri, Sep 16, 2016 at 08:00:44AM -0400, Austin S. Hemmelgarn wrote: > To be entirely honest, both zero-log and super-recover could probably be > pretty easily integrated into btrfs check such that it detects when they > need to be run and does so. zero-log has a very well defined situation in > which it's absolutely needed (log tree corrupted such that it can't be > replayed), which is pretty easy to detect (the kernel obviously does so, > albeit by crashing). Check already includes zero-log. It loses a little data that way, but that is probably better than the alternative (try to teach btrfs check how to replay the log tree and keep up with kernel changes). There have been at least two log-tree bugs (or, more accurately, bugs triggered while processing the log tree during mount) in the 3.x and 4.x kernels. The most recent I've encountered was in one of the 4.7-rc kernels. zero-log is certainly not obsolete. For a filesystem where availability is more important than integrity (e.g. root filesystems) it's really handy to have zero-log as a separate tool without the huge overhead (and regression risk) of check.
Re: Is stability a joke? (wiki updated)
On 2016-09-15 17:23, Christoph Anton Mitterer wrote: On Thu, 2016-09-15 at 14:20 -0400, Austin S. Hemmelgarn wrote: 3. Fsck should be needed only for un-mountable filesystems. Ideally, we should be handling things like Windows does. Perform slightly better checking when reading data, and if we see an error, flag the filesystem for expensive repair on the next mount. That philosophy also has some drawbacks: - The user doesn't directly notice that anything went wrong. Thus errors may even continue to accumulate and get much worse than if the fs had immediately gone ro, giving the user the chance to manually intervene (possibly then with help from upstream). Except that the fsck implementation in windows for NTFS actually fixes things that are broken. MS policy is 'if chkdsk can't fix it, you need to just reinstall and restore from backups'. They don't beat around the bush trying to figure out what exactly went wrong, because 99% of the time on Windows a corrupted filesystem means broken hardware or a virus. BTRFS obviously isn't to that point yet, but it has the potential; if we were to start focusing on fixing stuff that's broken instead of working on shiny new features that will inevitably make everything else harder to debug, we could probably get there faster than most other Linux filesystems. - Any smart auto-magical™ repair may also just fail (and make things worse, as the current --repair e.g. may). Not performing such auto-repair gives the user at least the possible chance to make a bitwise copy of the whole fs before trying any rescue operations. This wouldn't be the case if the user never noticed that something happened, and the fs tries to repair things right at mounting. People talk about it being dangerous, but I have yet to see it break a filesystem that wasn't already in a state that in XFS or ext4 would be considered broken beyond repair.
For pretty much all of the common cases (orphaned inodes, dangling hardlinks, isize mismatches, etc), it does fix things correctly. Most of that stuff could be optionally checked at mount and fixed without causing issues, but it's not something that should be done all the time since it's expensive, hence me suggesting checking such things dynamically on-access and flagging them for cleanup next mount. So I think any such auto-repair should be used with extreme caution and only in those cases where one is absolutely 100% sure that the action will help and just do good. In general, I agree with this, and I'd say it should be opt-in, not mandatory. I'm not talking about doing things that are all that risky though, but things which btrfs check can handle safely.
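The check-on-access, repair-on-next-mount scheme discussed here is essentially the classic dirty-flag pattern (what NTFS does with its volume dirty bit). A minimal toy model, entirely my own illustration rather than anything btrfs implements:

```python
# Toy model of "flag on error, repair on next mount": cheap on-access
# checks set a persistent dirty flag when they hit bad metadata, and
# the expensive repair pass runs only on the next mount, only if the
# flag is set. Illustrative only; not real filesystem code.

class Volume:
    def __init__(self, blocks):
        self.blocks = blocks          # block id -> data, or None (corrupt)
        self.needs_check = False      # the persistent "dirty" flag
        self.repair_log = []

    def read(self, block_id):
        data = self.blocks.get(block_id)
        if data is None:              # cheap detection on the read path
            self.needs_check = True   # flag for repair; don't fix now
            raise IOError(f"bad block {block_id}")
        return data

    def mount(self):
        if self.needs_check:          # expensive pass, only when flagged
            for bid, data in list(self.blocks.items()):
                if data is None:      # prune what can't be recovered
                    del self.blocks[bid]
                    self.repair_log.append(bid)
            self.needs_check = False
        return self
```

The trade-off both posters describe lives in read(): the error is reported but nothing is mutated, so the user keeps the chance to image the device before any automatic repair touches it.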
Re: Is stability a joke? (wiki updated)
On 2016-09-15 16:26, Chris Murphy wrote: On Thu, Sep 15, 2016 at 2:16 PM, Hugo Mills wrote: On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote: On Thu, Sep 15, 2016 at 12:20 PM, Austin S. Hemmelgarn wrote: 2. We're developing new features without making sure that check can fix issues in any associated metadata. Part of merging a new feature needs to be proving that fsck can handle fixing any issues in the metadata for that feature short of total data loss or complete corruption. 3. Fsck should be needed only for un-mountable filesystems. Ideally, we should be handling things like Windows does. Perform slightly better checking when reading data, and if we see an error, flag the filesystem for expensive repair on the next mount. Right, well I'm vaguely curious why ZFS, as different as it is, basically takes the position that if the hardware went so batshit that they can't unwind it on a normal mount, then an fsck probably can't help either... they still don't have an fsck and don't appear to want one. I'm not sure if the btrfsck is really all that helpful to users as much as it is for developers to better learn about the failure vectors of the file system. 4. Btrfs check should know itself if it can fix something or not, and that should be reported. I have an otherwise perfectly fine filesystem that throws some (apparently harmless) errors in check, and check can't repair them. Despite this, it gives zero indication that it can't repair them, zero indication that it didn't repair them, and doesn't even seem to give a non-zero exit status for this filesystem. Yeah, it's really not a user tool in my view... As far as the other tools: - Self-repair at mount time: This isn't a repair tool; if the FS mounts, it's not broken, it's just messy and the kernel is tidying things up. - btrfsck/btrfs check: I think I covered the issues here well.
- Mount options: These are mostly just for expensive checks during mount, and most people should never need them except in very unusual circumstances. - btrfs rescue *: These are all fixes for very specific issues. They should be folded into check with special aliases, and not be separate tools. The first fixes an issue that's pretty much non-existent in any modern kernel, and the other two are for very low-level data recovery of horribly broken filesystems. - scrub: This is a very purpose specific tool which is supposed to be part of regular maintenance, and only works to fix things as a side effect of what it does. - balance: This is also a relatively purpose specific tool, and again only fixes things as a side effect of what it does. You've forgotten btrfs-zero-log, which seems to have built itself a reputation on the internet as the tool you run to fix all btrfs ills, rather than a very finely-targeted tool that was introduced to deal with approximately one bug somewhere back in the 2.x era (IIRC). Hugo. :-) It's in my original list, and it's in Austin's by way of being lumped into 'btrfs rescue *' along with chunk and super recover. Seems like super-recover should be built into btrfs check, and would be one of the first ambiguities to get out of the way, but I'm just an ape that wears pants so what do I know. Thing is, zero-log has fixed file systems in cases where I never would have expected it to, and the user was recommended not to use it, or use it as a 2nd to last resort. So, pfff. It's like throwing salt around. To be entirely honest, both zero-log and super-recover could probably be pretty easily integrated into btrfs check such that it detects when they need to be run and does so. zero-log has a very well defined situation in which it's absolutely needed (log tree corrupted such that it can't be replayed), which is pretty easy to detect (the kernel obviously does so, albeit by crashing).
super-recover is also used in a pretty specific set of circumstances (first SB corrupted, backups fine), which are also pretty easy to detect. In both cases, I'd like to see some switch (--single-fix maybe?) for directly invoking just those functions (as well as a few others like dropping the FSC/FST or cancelling a paused or crashed balance) that operate at a filesystem level instead of a block/inode/extent level like most of the other stuff in check does. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
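The rescue subcommands discussed above exist today as separate entry points; a rough sketch of how they and check are currently invoked (the device path is an example, the filesystem must be unmounted, and the `--single-fix` switch proposed above is hypothetical, not an existing flag):

```shell
# Current, separate entry points (run against an unmounted filesystem):
btrfs rescue zero-log /dev/sdb1       # clear a corrupted log tree that can't be replayed
btrfs rescue super-recover /dev/sdb1  # restore the primary superblock from a backup copy
btrfs check /dev/sdb1                 # read-only consistency check (no --repair)

# Hypothetical unified invocation as proposed in the thread -- this flag
# does NOT exist in btrfs-progs; it only illustrates the idea:
#   btrfs check --single-fix=zero-log /dev/sdb1
```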
Re: Is stability a joke? (wiki updated)
On Thu, 2016-09-15 at 14:20 -0400, Austin S. Hemmelgarn wrote: > 3. Fsck should be needed only for un-mountable filesystems. Ideally, > we > should be handling things like Windows does. Perform slightly > better > checking when reading data, and if we see an error, flag the > filesystem > for expensive repair on the next mount. That philosophy also has some drawbacks: - The user doesn't directly notice that anything went wrong. Thus errors may even continue to accumulate and get much worse than if the fs had immediately gone ro, giving the user the chance to manually intervene (possibly then with help from upstream). - Any smart auto-magical™ repair may also just fail (and make things worse, as the current --repair e.g. may). Not performing such auto-repair gives the user at least a chance to make a bitwise copy of the whole fs before trying any rescue operations. This wouldn't be the case if the user never noticed that something happened and the fs tried to repair things right at mounting. So I think any such auto-repair should be used with extreme caution, and only in those cases where one is absolutely 100% sure that the action will help and will do only good. Cheers, Chris.
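The "bitwise copy of the whole fs before trying any rescue operations" step above can be done with plain dd (or GNU ddrescue on failing media); a minimal self-contained sketch, using a scratch file in place of a real block device:

```shell
# On a real system the input would be the filesystem's block device,
# e.g. /dev/sdb1 (example name), and the fs should be unmounted first.
# A scratch file stands in for the device here so the sketch is runnable.
dd if=/dev/zero of=scratch.img bs=1M count=4 status=none   # stand-in "device"
dd if=scratch.img of=scratch.backup.img bs=1M status=none  # bitwise copy
cmp scratch.img scratch.backup.img && echo "backup verified"
```

Any risky --repair experiment can then be retried against a fresh copy restored from the image.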
Re: Is stability a joke? (wiki updated)
On Thu, Sep 15, 2016 at 2:16 PM, Hugo Mills wrote: > On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote: >> On Thu, Sep 15, 2016 at 12:20 PM, Austin S. Hemmelgarn >> wrote: >> >> > 2. We're developing new features without making sure that check can fix >> > issues in any associated metadata. Part of merging a new feature needs to >> > be proving that fsck can handle fixing any issues in the metadata for that >> > feature short of total data loss or complete corruption. >> > >> > 3. Fsck should be needed only for un-mountable filesystems. Ideally, we >> > should be handling things like Windows does. Perform slightly better >> > checking when reading data, and if we see an error, flag the filesystem for >> > expensive repair on the next mount. >> >> Right, well I'm vaguely curious why ZFS, as different as it is, >> basically takes the position that if the hardware went so batshit that >> they can't unwind it on a normal mount, then an fsck probably can't >> help either... they still don't have an fsck and don't appear to want >> one. >> >> I'm not sure if btrfsck is really all that helpful to users as much >> as it is for developers to better learn about the failure vectors of >> the file system. >> >> >> > 4. Btrfs check should know itself if it can fix something or not, and that >> > should be reported. I have an otherwise perfectly fine filesystem that >> > throws some (apparently harmless) errors in check, and check can't repair >> > them. Despite this, it gives zero indication that it can't repair them, >> > zero indication that it didn't repair them, and doesn't even seem to give a >> > non-zero exit status for this filesystem. >> >> Yeah, it's really not a user tool in my view... >> >> >> >> > >> > As far as the other tools: >> > - Self-repair at mount time: This isn't a repair tool, if the FS mounts, >> > it's not broken, it's just messy and the kernel is tidying things up. >> > - btrfsck/btrfs check: I think I covered the issues here well. 
>> > - Mount options: These are mostly just for expensive checks during mount, >> > and most people should never need them except in very unusual >> > circumstances. >> > - btrfs rescue *: These are all fixes for very specific issues. They >> > should >> > be folded into check with special aliases, and not be separate tools. The >> > first fixes an issue that's pretty much non-existent in any modern kernel, >> > and the other two are for very low-level data recovery of horribly broken >> > filesystems. >> > - scrub: This is a very purpose-specific tool which is supposed to be part >> > of regular maintenance, and only works to fix things as a side effect of >> > what it does. >> > - balance: This is also a relatively purpose-specific tool, and again only >> > fixes things as a side effect of what it does. > >You've forgotten btrfs-zero-log, which seems to have built itself a > reputation on the internet as the tool you run to fix all btrfs ills, > rather than a very finely-targeted tool that was introduced to deal > with approximately one bug somewhere back in the 2.x era (IIRC). > >Hugo. :-) It's in my original list, and it's in Austin's by way of being lumped into 'btrfs rescue *' along with chunk and super recover. Seems like super recover should be built into Btrfs check, and would be one of the first ambiguities to get out of the way, but I'm just an ape that wears pants so what do I know. Thing is, zero-log has fixed file systems in cases where I never would have expected it to, and the user was recommended not to use it, or to use it as a second-to-last resort. So, pfff. It's like throwing salt around. -- Chris Murphy
Re: Is stability a joke? (wiki updated)
On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote: > On Thu, Sep 15, 2016 at 12:20 PM, Austin S. Hemmelgarn > wrote: > > > 2. We're developing new features without making sure that check can fix > > issues in any associated metadata. Part of merging a new feature needs to > > be proving that fsck can handle fixing any issues in the metadata for that > > feature short of total data loss or complete corruption. > > > > 3. Fsck should be needed only for un-mountable filesystems. Ideally, we > > should be handling things like Windows does. Perform slightly better > > checking when reading data, and if we see an error, flag the filesystem for > > expensive repair on the next mount. > > Right, well I'm vaguely curious why ZFS, as different as it is, > basically takes the position that if the hardware went so batshit that > they can't unwind it on a normal mount, then an fsck probably can't > help either... they still don't have an fsck and don't appear to want > one. > > I'm not sure if btrfsck is really all that helpful to users as much > as it is for developers to better learn about the failure vectors of > the file system. > > > > 4. Btrfs check should know itself if it can fix something or not, and that > > should be reported. I have an otherwise perfectly fine filesystem that > > throws some (apparently harmless) errors in check, and check can't repair > > them. Despite this, it gives zero indication that it can't repair them, > > zero indication that it didn't repair them, and doesn't even seem to give a > > non-zero exit status for this filesystem. > > Yeah, it's really not a user tool in my view... > > > > > > > As far as the other tools: > > - Self-repair at mount time: This isn't a repair tool, if the FS mounts, > > it's not broken, it's just messy and the kernel is tidying things up. > > - btrfsck/btrfs check: I think I covered the issues here well. 
> > - Mount options: These are mostly just for expensive checks during mount, > > and most people should never need them except in very unusual circumstances. > > - btrfs rescue *: These are all fixes for very specific issues. They should > > be folded into check with special aliases, and not be separate tools. The > > first fixes an issue that's pretty much non-existent in any modern kernel, > > and the other two are for very low-level data recovery of horribly broken > > filesystems. > > - scrub: This is a very purpose-specific tool which is supposed to be part > > of regular maintenance, and only works to fix things as a side effect of > > what it does. > > - balance: This is also a relatively purpose-specific tool, and again only > > fixes things as a side effect of what it does. You've forgotten btrfs-zero-log, which seems to have built itself a reputation on the internet as the tool you run to fix all btrfs ills, rather than a very finely-targeted tool that was introduced to deal with approximately one bug somewhere back in the 2.x era (IIRC). Hugo. > > Yeah I know, it's just that much of this is non-obvious to users unfamiliar > with this file system. And even I'm often throwing spaghetti on a > wall. > > -- > Chris Murphy -- Hugo Mills | It's against my programming to impersonate a deity! hugo@... carfax.org.uk | http://carfax.org.uk/ | PGP: E2AB1DE4 | C3PO, Return of the Jedi
Re: Is stability a joke? (wiki updated)
On Thu, Sep 15, 2016 at 12:20 PM, Austin S. Hemmelgarn wrote: > 2. We're developing new features without making sure that check can fix > issues in any associated metadata. Part of merging a new feature needs to > be proving that fsck can handle fixing any issues in the metadata for that > feature short of total data loss or complete corruption. > > 3. Fsck should be needed only for un-mountable filesystems. Ideally, we > should be handling things like Windows does. Perform slightly better > checking when reading data, and if we see an error, flag the filesystem for > expensive repair on the next mount. Right, well I'm vaguely curious why ZFS, as different as it is, basically takes the position that if the hardware went so batshit that they can't unwind it on a normal mount, then an fsck probably can't help either... they still don't have an fsck and don't appear to want one. I'm not sure if btrfsck is really all that helpful to users as much as it is for developers to better learn about the failure vectors of the file system. > 4. Btrfs check should know itself if it can fix something or not, and that > should be reported. I have an otherwise perfectly fine filesystem that > throws some (apparently harmless) errors in check, and check can't repair > them. Despite this, it gives zero indication that it can't repair them, > zero indication that it didn't repair them, and doesn't even seem to give a > non-zero exit status for this filesystem. Yeah, it's really not a user tool in my view... > > As far as the other tools: > - Self-repair at mount time: This isn't a repair tool, if the FS mounts, > it's not broken, it's just messy and the kernel is tidying things up. > - btrfsck/btrfs check: I think I covered the issues here well. > - Mount options: These are mostly just for expensive checks during mount, > and most people should never need them except in very unusual circumstances. > - btrfs rescue *: These are all fixes for very specific issues. 
They should > be folded into check with special aliases, and not be separate tools. The > first fixes an issue that's pretty much non-existent in any modern kernel, > and the other two are for very low-level data recovery of horribly broken > filesystems. > - scrub: This is a very purpose-specific tool which is supposed to be part > of regular maintenance, and only works to fix things as a side effect of > what it does. > - balance: This is also a relatively purpose-specific tool, and again only > fixes things as a side effect of what it does. > Yeah I know, it's just that much of this is non-obvious to users unfamiliar with this file system. And even I'm often throwing spaghetti on a wall. -- Chris Murphy
Re: Is stability a joke? (wiki updated)
On 2016-09-15 14:01, Chris Murphy wrote: On Tue, Sep 13, 2016 at 5:35 AM, Austin S. Hemmelgarn wrote: On 2016-09-12 16:08, Chris Murphy wrote: - btrfsck status e.g. btrfs-progs 4.7.2 still warns against using --repair, and lists it under dangerous options also; while that's true, Btrfs can't be considered stable or recommended by default e.g. There's still way too many separate repair tools for Btrfs. Depending on how you count there's at least 4, and more realistically 8 ways, scattered across multiple commands. This excludes btrfs check's -E, -r, and -s flags. And it ignores sequence in the success rate. The permutations are just excessive. It's definitely not easy to know how to fix a Btrfs volume should things go wrong. I assume you're counting balance and scrub in that, plus check gives 3; what are you considering the 4th? - Self-repair at mount time, similar to other fs's with a journal - fsck, similar to other fs's except the output is really unclear about what the prognosis is compared to ext4 or xfs - mount option usebackuproot/recovery - btrfs rescue zero-log - btrfs rescue super-recover - btrfs rescue chunk-recover - scrub - balance check --repair really needed to be fail-safe a long time ago; it's what everyone's come to expect from fscks, that they don't make things worse; and in particular on Btrfs it seems like its repairs should be reversible, but the reality is the man page says do not use (except under advisement) and that it's dangerous (twice). And a user got a broken system in the bug that affects 4.7, 4.7.1, that 4.7.2 apparently can't fix. So... life is hard, file systems are hard. But it's also hard to see how distros can possibly feel comfortable with Btrfs by default when the fsck tool is dangerous, even if in theory it shouldn't often be necessary. For check specifically, I see four issues: 1. It spits out pretty low-level information about the internals in many cases when it returns an error. 
xfs_repair does this too, but it's needed even less frequently than btrfs check, and it at least uses relatively simple jargon by comparison. I've been using BTRFS for years and still can't tell what more than half the error messages check can return mean. In contrast to that, deciphering an error message from e2fsck is pretty trivial if you have some basic understanding of VFS level filesystem abstractions (stuff like what inodes and dentries are), and I never needed to learn low level things about the internals of ext4 to parse the fsck output (I did anyway, but that's beside the point). 2. We're developing new features without making sure that check can fix issues in any associated metadata. Part of merging a new feature needs to be proving that fsck can handle fixing any issues in the metadata for that feature short of total data loss or complete corruption. 3. Fsck should be needed only for un-mountable filesystems. Ideally, we should be handling things like Windows does. Perform slightly better checking when reading data, and if we see an error, flag the filesystem for expensive repair on the next mount. 4. Btrfs check should know itself if it can fix something or not, and that should be reported. I have an otherwise perfectly fine filesystem that throws some (apparently harmless) errors in check, and check can't repair them. Despite this, it gives zero indication that it can't repair them, zero indication that it didn't repair them, and doesn't even seem to give a non-zero exit status for this filesystem. As far as the other tools: - Self-repair at mount time: This isn't a repair tool, if the FS mounts, it's not broken, it's just messy and the kernel is tidying things up. - btrfsck/btrfs check: I think I covered the issues here well. - Mount options: These are mostly just for expensive checks during mount, and most people should never need them except in very unusual circumstances. - btrfs rescue *: These are all fixes for very specific issues. 
They should be folded into check with special aliases, and not be separate tools. The first fixes an issue that's pretty much non-existent in any modern kernel, and the other two are for very low-level data recovery of horribly broken filesystems. - scrub: This is a very purpose-specific tool which is supposed to be part of regular maintenance, and only works to fix things as a side effect of what it does. - balance: This is also a relatively purpose-specific tool, and again only fixes things as a side effect of what it does.
Re: Is stability a joke? (wiki updated)
On Tue, Sep 13, 2016 at 5:35 AM, Austin S. Hemmelgarn wrote: > On 2016-09-12 16:08, Chris Murphy wrote: >> >> - btrfsck status >> e.g. btrfs-progs 4.7.2 still warns against using --repair, and lists >> it under dangerous options also; while that's true, Btrfs can't be >> considered stable or recommended by default >> e.g. There's still way too many separate repair tools for Btrfs. >> Depending on how you count there's at least 4, and more realistically >> 8 ways, scattered across multiple commands. This excludes btrfs >> check's -E, -r, and -s flags. And it ignores sequence in the success >> rate. The permutations are just excessive. It's definitely not easy to >> know how to fix a Btrfs volume should things go wrong. > > I assume you're counting balance and scrub in that, plus check gives 3; what > are you considering the 4th? - Self-repair at mount time, similar to other fs's with a journal - fsck, similar to other fs's except the output is really unclear about what the prognosis is compared to ext4 or xfs - mount option usebackuproot/recovery - btrfs rescue zero-log - btrfs rescue super-recover - btrfs rescue chunk-recover - scrub - balance check --repair really needed to be fail-safe a long time ago; it's what everyone's come to expect from fscks, that they don't make things worse; and in particular on Btrfs it seems like its repairs should be reversible, but the reality is the man page says do not use (except under advisement) and that it's dangerous (twice). And a user got a broken system in the bug that affects 4.7, 4.7.1, that 4.7.2 apparently can't fix. So... life is hard, file systems are hard. But it's also hard to see how distros can possibly feel comfortable with Btrfs by default when the fsck tool is dangerous, even if in theory it shouldn't often be necessary. 
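For reference, the eight repair paths Chris lists map onto concrete invocations roughly as follows (device and mountpoint names are examples; the offline tools must only be run against an unmounted filesystem):

```shell
mount -o usebackuproot /dev/sdb1 /mnt  # try backup tree roots at mount ("recovery" on older kernels)
btrfs check /dev/sdb1                  # offline fsck (add --repair only as a last resort)
btrfs rescue zero-log /dev/sdb1        # clear a log tree that can't be replayed
btrfs rescue super-recover /dev/sdb1   # restore the primary superblock from a backup copy
btrfs rescue chunk-recover /dev/sdb1   # rebuild the chunk tree (slow, last-ditch)
btrfs scrub start /mnt                 # online: verify checksums, repair from a good mirror
btrfs balance start /mnt               # online: rewrite block groups (fixes some issues as a side effect)
# Self-repair at mount time has no command of its own: the kernel replays
# the log tree on any normal mount.
```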
-- Chris Murphy
Re: Is stability a joke? (wiki updated)
On Mon, Sep 12, 2016 at 02:44:35PM -0600, Chris Murphy wrote: > Just to cut yourself some slack, you could skip 3.14 because it's EOL > now, and just go from 4.4. Don't the btrfs-tools used to create the filesystem also play a huge role in this game? Greetings Marc -- - Marc Haber | "I don't trust Computers. They | Mailadresse im Header Leimen, Germany| lose things." Winona Ryder | Fon: *49 6224 1600402 Nordisch by Nature | How to make an American Quilt | Fax: *49 6224 1600421
Re: Is stability a joke? (wiki updated)
On Tuesday, 13 September 2016, 07:28:38 CEST, Austin S. Hemmelgarn wrote: > On 2016-09-12 16:44, Chris Murphy wrote: > > On Mon, Sep 12, 2016 at 2:35 PM, Martin Steigerwald wrote: > >> On Monday, 12 September 2016, 23:21:09 CEST, Pasi Kärkkäinen wrote: > >>> On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote: > On Monday, 12 September 2016, 18:27:47 CEST, David Sterba wrote: > > On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote: […] > > https://btrfs.wiki.kernel.org/index.php/Status > > Great. > > I made two minor adaptations. I added a link to the Status page to my > warning > in before the kernel log by feature page. And I also mentioned that at > the time the page was last updated the latest kernel version was 4.7. > Yes, that's some extra work to update the kernel version, but I think > it's > beneficial to explicitly mention the kernel version the page talks > about. Everyone who updates the page can update the version within a > second. > >>> > >>> Hmm.. that will still leave people wondering "but I'm running Linux 4.4, > >>> not 4.7, I wonder what the status of feature X is.." > >>> > >>> Should we also add a column for kernel version, so we can add "feature X > >>> is > >>> known to be OK on Linux 3.18 and later".. ? Or add those to "notes" > >>> field, > >>> where applicable? > >> > >> That was my initial idea, and it may be better than a generic kernel > >> version for all features. Even if we fill in 4.7 for any of the features > >> that are known to work okay for the table. > >> > >> For RAID 1 I am willing to say it has worked stably since kernel 3.14, as this > >> was the kernel I used when I switched /home and / to Dual SSD RAID 1 on > >> this ThinkPad T520. > > > > Just to cut yourself some slack, you could skip 3.14 because it's EOL > > now, and just go from 4.4. 
> > That reminds me, we should probably make a point to make it clear that > this is for the _upstream_ mainline kernel versions, not for versions > from some arbitrary distro, and that people should check the distro's > documentation for that info. I'd do the following: Really state the first known-to-work stable kernel version for a feature. But before the table state this: 1) Instead of the first known-to-work stable kernel for a feature, recommend using the latest upstream kernel, or alternatively the latest upstream LTS kernel for those users who want to play it a bit safer. 2) For stable distros such as SLES, RHEL, Ubuntu LTS, Debian Stable, recommend checking the distro documentation. Note that some distro kernels track upstream kernels quite closely, like the Debian backport kernel or the Ubuntu kernel backports PPA. Thanks, -- Martin
Re: Is stability a joke? (wiki updated)
On 2016-09-12 16:08, Chris Murphy wrote: On Mon, Sep 12, 2016 at 10:56 AM, Austin S. Hemmelgarn wrote: Things listed as TBD status: 1. Seeding: Seems to work fine the couple of times I've tested it, however I've only done very light testing, and the whole feature is pretty much undocumented. Mostly OK. Odd behaviors: - mount seed (ro), add device, remount mountpoint: this just changed the mounted fs volume UUID - if two sprouts for a seed exist, it's ambiguous which is remounted rw, you'd have to check - remount should probably be disallowed in this case somehow; require explicit mount of the sprout btrfs fi usage crash when multiple device volume contains seed device https://bugzilla.kernel.org/show_bug.cgi?id=115851 Yeah, like I said, I've only done very light testing. I kind of lost interest in seeding when overlayfs went mainline, as it offers pretty much everything I care about that seeding does, and it's filesystem agnostic. 2. Device Replace: Works perfectly as long as the filesystem itself is not corrupted, all the component devices are working, and the FS isn't using any raid56 profiles. Works fine if only the device being replaced is failing. I've not done much testing WRT replacement when multiple devices are suspect, but what I've done seems to suggest that it might be possible to make it work, but it doesn't currently. On raid56 it sometimes works fine, sometimes corrupts data, and sometimes takes an insanely long time to complete (putting data at risk from subsequent failures while the replace is running). 3. Balance: Works perfectly as long as the filesystem is not corrupted and nothing throws any read or write errors. IOW, only run this on a generally healthy filesystem. Similar caveats to those for replace with raid56 apply here too. 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS is healthy. Concur. Missing from the matrix: - default file system for distros recommendation e.g. 
between enospc and btrfsck status, I'd say in general this is not currently recommended by upstream (short of having a Btrfs kernel developer on staff). I'd add the whole UUID issue to that too. - enospc status e.g. there's new stuff in 4.8 that probably still needs to shake out, and Jeff's found some metadata accounting problem resulting in enospc where there's tons of unallocated space available. e.g. I have empty block groups, and they are not being deallocated, they just stick around, and this is with 4.7 and 4.8 kernels; so whatever was at one time automatically removing totally empty bg's isn't happening anymore. FWIW, that's still working on my systems. - btrfsck status e.g. btrfs-progs 4.7.2 still warns against using --repair, and lists it under dangerous options also; while that's true, Btrfs can't be considered stable or recommended by default e.g. There's still way too many separate repair tools for Btrfs. Depending on how you count there's at least 4, and more realistically 8 ways, scattered across multiple commands. This excludes btrfs check's -E, -r, and -s flags. And it ignores sequence in the success rate. The permutations are just excessive. It's definitely not easy to know how to fix a Btrfs volume should things go wrong. I assume you're counting balance and scrub in that, plus check gives 3; what are you considering the 4th? In the case of just balance, scrub, and check, the differentiation there makes more sense IMHO than combining them: check only runs on offline filesystems (and as much as we want online fsck, I doubt that will happen any time soon), while scrub and balance operate on online filesystems and do two semantically different things.
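The "empty block groups just stick around" situation described above can be cleared by hand with a filtered balance; a short sketch (the mountpoint is an example):

```shell
# Remove only completely unused block groups without rewriting live data;
# the usage=0 filter matches empty block groups, so this is cheap to run.
btrfs balance start -dusage=0 -musage=0 /mnt
```

Raising the threshold (e.g. `-dusage=10`) also compacts nearly-empty block groups, at the cost of more I/O.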
Re: Is stability a joke? (wiki updated)
On 2016-09-12 16:44, Chris Murphy wrote: On Mon, Sep 12, 2016 at 2:35 PM, Martin Steigerwald wrote: On Monday, 12 September 2016, 23:21:09 CEST, Pasi Kärkkäinen wrote: On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote: On Monday, 12 September 2016, 18:27:47 CEST, David Sterba wrote: On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote: I therefore would like to propose that some sort of feature / stability matrix for the latest kernel is added to the wiki, preferably somewhere where it is easy to find. It would be nice to archive old matrices as well in case someone runs a bit older kernel (we who use Debian tend to like older kernels). In my opinion it would make things a bit easier and perhaps a bit less scary too. Remember, if you get bitten badly once you tend to stay away from it all just in case; if you on the other hand know what bites, you can safely pet the fluffy end instead :) Somebody has put that table on the wiki, so it's a good starting point. I'm not sure we can fit everything into one table, some combinations do not bring new information and we'd need an n-dimensional matrix to get the whole picture. https://btrfs.wiki.kernel.org/index.php/Status Great. I made two minor adaptations. I added a link to the Status page to my warning in before the kernel log by feature page. And I also mentioned that at the time the page was last updated the latest kernel version was 4.7. Yes, that's some extra work to update the kernel version, but I think it's beneficial to explicitly mention the kernel version the page talks about. Everyone who updates the page can update the version within a second. Hmm.. that will still leave people wondering "but I'm running Linux 4.4, not 4.7, I wonder what the status of feature X is.." Should we also add a column for kernel version, so we can add "feature X is known to be OK on Linux 3.18 and later".. ? Or add those to "notes" field, where applicable? 
That was my initial idea, and it may be better than a generic kernel version for all features. Even if we fill in 4.7 for any of the features that are known to work okay for the table. For RAID 1 I am willing to say it has worked stably since kernel 3.14, as this was the kernel I used when I switched /home and / to Dual SSD RAID 1 on this ThinkPad T520. Just to cut yourself some slack, you could skip 3.14 because it's EOL now, and just go from 4.4. That reminds me, we should probably make a point to make it clear that this is for the _upstream_ mainline kernel versions, not for versions from some arbitrary distro, and that people should check the distro's documentation for that info.
Re: Is stability a joke? (wiki updated)
On 2016-09-13 04:38, Timofey Titovets wrote: https://btrfs.wiki.kernel.org/index.php/Status I suggest to mark RAID1/10 as 'mostly ok', as btrfs RAID1/10 is safe for data, but not for applications that use it; i.e. it does not hide I/O errors even when they could be masked. https://www.spinics.net/lists/linux-btrfs/msg56739.html /* Retested with upstream 4.7.2 - not fixed */ This doesn't match what my own testing indicates, at least for raid1 mode. I run similar tests myself every time a new stable kernel version comes out (but only on the most recent stable version) once I get my own patches re-based onto it, and I haven't seen issues like this in any of the 4.7 kernels, and don't recall any issues like this in any of the 4.6 kernels. In fact, I've actually dealt with systems with failing disks using BTRFS raid1 mode, including one at work just yesterday where the SATA cable had worked loose from vibrations and was causing significant data corruption. It survived just fine, as have all the other systems I've dealt with which had hardware issues while running BTRFS in raid1 mode. The indicated behavior would be consistent with issues seen sometimes when using compression, however the OP in the linked message made no indication of there being any in-line compression involved.
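The kind of raid1 self-test described above can be reproduced without spare hardware using loopback devices; a rough sketch (requires root, and the mountpoint and file names are examples):

```shell
# Build a two-device btrfs raid1 out of backing files, then scrub it.
truncate -s 1G disk0.img disk1.img
DEV0=$(losetup --find --show disk0.img)   # attach file, print the loop device
DEV1=$(losetup --find --show disk1.img)
mkfs.btrfs -f -d raid1 -m raid1 "$DEV0" "$DEV1"
mount "$DEV0" /mnt
# ... write test data, unmount, corrupt part of one image file, remount, then:
btrfs scrub start -B /mnt   # -B runs in the foreground and reports corrected errors
```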
Re: Is stability a joke? (wiki updated)
https://btrfs.wiki.kernel.org/index.php/Status I suggest to mark RAID1/10 as 'mostly ok', as btrfs RAID1/10 is safe for data, but not for applications that use it; i.e. it does not hide I/O errors even when they could be masked. https://www.spinics.net/lists/linux-btrfs/msg56739.html /* Retested with upstream 4.7.2 - not fixed */ -- Have a nice day, Timofey.
Re: Is stability a joke? (wiki updated)
Pasi Kärkkäinen wrote: On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote: Great. I made two minor adaptations. I added a link to the Status page to my warning in before the kernel log by feature page. And I also mentioned that at the time the page was last updated the latest kernel version was 4.7. Yes, that's some extra work to update the kernel version, but I think it's beneficial to explicitly mention the kernel version the page talks about. Everyone who updates the page can update the version within a second. Hmm.. that will still leave people wondering "but I'm running Linux 4.4, not 4.7, I wonder what the status of feature X is.." Should we also add a column for kernel version, so we can add "feature X is known to be OK on Linux 3.18 and later".. ? Or add those to "notes" field, where applicable? -- Pasi I think a separate column would be the best solution. Archiving the status page per kernel version (as I suggested) will lead to issues too. For example, if something that appears to be just fine in 4.6 is found to be horribly broken in, say, 4.10, the archive would still indicate that it WAS ok at that time even if it perhaps was not. Then you have regressions - something that worked in 4.4 may not work in 4.9 - but I still think the best idea is to simply label the status as ok / broken since 4.x, as those who really want to use a broken feature would probably do the research to see if it used to work. Besides, if something that used to work goes haywire, it should be fixed quickly :)
Re: Is stability a joke? (wiki updated)
On Mon, Sep 12, 2016 at 2:35 PM, Martin Steigerwald wrote:
> Am Montag, 12. September 2016, 23:21:09 CEST schrieb Pasi Kärkkäinen:
>> On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote:
>> > Am Montag, 12. September 2016, 18:27:47 CEST schrieb David Sterba:
>> > > On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
>> > > > > I therefore would like to propose that some sort of feature / stability matrix for the latest kernel is added to the wiki, preferably somewhere where it is easy to find. It would be nice to archive old matrixes as well in case someone runs on a bit older kernel (we who use Debian tend to like older kernels). In my opinion it would make things a bit easier and perhaps a bit less scary too. Remember, if you get bitten badly once you tend to stay away from it all just in case; if you on the other hand know what bites you can safely pet the fluffy end instead :)
>> > > >
>> > > > Somebody has put that table on the wiki, so it's a good starting point. I'm not sure we can fit everything into one table, some combinations do not bring new information and we'd need an n-dimensional matrix to get the whole picture.
>> > >
>> > > https://btrfs.wiki.kernel.org/index.php/Status
>> >
>> > Great.
>> >
>> > I made two minor adaptions. I added a link to the Status page to my warning before the kernel-log-by-feature page. And I also mentioned that at the time the page was last updated, the latest kernel version was 4.7. Yes, that's some extra work to update the kernel version, but I think it's beneficial to explicitly mention the kernel version the page talks about. Everyone who updates the page can update the version within a second.
>>
>> Hmm.. that will still leave people wondering "but I'm running Linux 4.4, not 4.7, I wonder what the status of feature X is.."
>>
>> Should we also add a column for kernel version, so we can add "feature X is known to be OK on Linux 3.18 and later".. ? Or add those to "notes" field, where applicable?
>
> That was my initial idea, and it may be better than a generic kernel version for all features. Even if we fill in 4.7 for any of the features that are known to work okay for the table.
>
> For RAID 1 I am willing to say it has worked stably since kernel 3.14, as this was the kernel I used when I switched /home and / to dual-SSD RAID 1 on this ThinkPad T520.

Just to cut yourself some slack, you could skip 3.14 because it's EOL now, and just go from 4.4.

--
Chris Murphy
Re: Is stability a joke? (wiki updated)
Am Montag, 12. September 2016, 23:21:09 CEST schrieb Pasi Kärkkäinen:
> On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote:
> > Am Montag, 12. September 2016, 18:27:47 CEST schrieb David Sterba:
> > > On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
> > > > > I therefore would like to propose that some sort of feature / stability matrix for the latest kernel is added to the wiki, preferably somewhere where it is easy to find. It would be nice to archive old matrixes as well in case someone runs on a bit older kernel (we who use Debian tend to like older kernels). In my opinion it would make things a bit easier and perhaps a bit less scary too. Remember, if you get bitten badly once you tend to stay away from it all just in case; if you on the other hand know what bites you can safely pet the fluffy end instead :)
> > > >
> > > > Somebody has put that table on the wiki, so it's a good starting point. I'm not sure we can fit everything into one table, some combinations do not bring new information and we'd need an n-dimensional matrix to get the whole picture.
> > >
> > > https://btrfs.wiki.kernel.org/index.php/Status
> >
> > Great.
> >
> > I made two minor adaptions. I added a link to the Status page to my warning before the kernel-log-by-feature page. And I also mentioned that at the time the page was last updated, the latest kernel version was 4.7. Yes, that's some extra work to update the kernel version, but I think it's beneficial to explicitly mention the kernel version the page talks about. Everyone who updates the page can update the version within a second.
>
> Hmm.. that will still leave people wondering "but I'm running Linux 4.4, not 4.7, I wonder what the status of feature X is.."
>
> Should we also add a column for kernel version, so we can add "feature X is known to be OK on Linux 3.18 and later".. ? Or add those to "notes" field, where applicable?

That was my initial idea, and it may be better than a generic kernel version for all features. Even if we fill in 4.7 for any of the features that are known to work okay for the table.

For RAID 1 I am willing to say it has worked stably since kernel 3.14, as this was the kernel I used when I switched /home and / to dual-SSD RAID 1 on this ThinkPad T520.

--
Martin
Re: Is stability a joke? (wiki updated)
On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote:
> Am Montag, 12. September 2016, 18:27:47 CEST schrieb David Sterba:
> > On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
> > > > I therefore would like to propose that some sort of feature / stability matrix for the latest kernel is added to the wiki, preferably somewhere where it is easy to find. It would be nice to archive old matrixes as well in case someone runs on a bit older kernel (we who use Debian tend to like older kernels). In my opinion it would make things a bit easier and perhaps a bit less scary too. Remember, if you get bitten badly once you tend to stay away from it all just in case; if you on the other hand know what bites you can safely pet the fluffy end instead :)
> > >
> > > Somebody has put that table on the wiki, so it's a good starting point. I'm not sure we can fit everything into one table, some combinations do not bring new information and we'd need an n-dimensional matrix to get the whole picture.
> >
> > https://btrfs.wiki.kernel.org/index.php/Status
>
> Great.
>
> I made two minor adaptions. I added a link to the Status page to my warning before the kernel-log-by-feature page. And I also mentioned that at the time the page was last updated, the latest kernel version was 4.7. Yes, that's some extra work to update the kernel version, but I think it's beneficial to explicitly mention the kernel version the page talks about. Everyone who updates the page can update the version within a second.

Hmm.. that will still leave people wondering "but I'm running Linux 4.4, not 4.7, I wonder what the status of feature X is.."

Should we also add a column for kernel version, so we can add "feature X is known to be OK on Linux 3.18 and later".. ? Or add those to "notes" field, where applicable?

--
Pasi

> --
> Martin
Re: Is stability a joke? (wiki updated)
On Mon, Sep 12, 2016 at 10:56 AM, Austin S. Hemmelgarn wrote:
>
> Things listed as TBD status:
> 1. Seeding: Seems to work fine the couple of times I've tested it, however I've only done very light testing, and the whole feature is pretty much undocumented.

Mostly OK. Odd behaviors:
- mount seed (ro), add device, remount mountpoint: this just changed the mounted fs volume UUID
- if two sprouts for a seed exist, it's ambiguous which one is remounted rw; you'd have to check
- remount should probably be disallowed in this case somehow; require explicit mount of the sprout

btrfs fi usage crashes when a multiple-device volume contains a seed device:
https://bugzilla.kernel.org/show_bug.cgi?id=115851

> 2. Device Replace: Works perfectly as long as the filesystem itself is not corrupted, all the component devices are working, and the FS isn't using any raid56 profiles. Works fine if only the device being replaced is failing. I've not done much testing WRT replacement when multiple devices are suspect, but what I've done seems to suggest that it might be possible to make it work, but it doesn't currently. On raid56 it sometimes works fine, sometimes corrupts data, and sometimes takes an insanely long time to complete (putting data at risk from subsequent failures while the replace is running).
> 3. Balance: Works perfectly as long as the filesystem is not corrupted and nothing throws any read or write errors. IOW, only run this on a generally healthy filesystem. Similar caveats to those for replace with raid56 apply here too.
> 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS is healthy.

Concur.

Missing from the matrix:

- default file system for distros recommendation: e.g. between the enospc and btrfsck status, I'd say in general this is not currently recommended by upstream (short of having a Btrfs kernel developer on staff)

- enospc status: e.g. there's new stuff in 4.8 that probably still needs to shake out, and Jeff's found some metadata accounting problem resulting in enospc where there's tons of unallocated space available. E.g. I have empty block groups, and they are not being deallocated; they just stick around, and this is with 4.7 and 4.8 kernels, so whatever was at one time automatically removing totally empty bg's isn't happening anymore.

- btrfsck status: e.g. btrfs-progs 4.7.2 still warns against using --repair, and also lists it under dangerous options; while that's true, Btrfs can't be considered stable or recommended by default. There are still way too many separate repair tools for Btrfs: depending on how you count, there are at least 4, and more realistically 8, ways, scattered across multiple commands. This excludes btrfs check's -E, -r, and -s flags. And it ignores sequence in the success rate. The permutations are just excessive. It's definitely not easy to know how to fix a Btrfs volume should things go wrong.

--
Chris Murphy
Re: Is stability a joke? (wiki updated)
Am Montag, 12. September 2016, 18:27:47 CEST schrieb David Sterba:
> On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
> > > I therefore would like to propose that some sort of feature / stability matrix for the latest kernel is added to the wiki, preferably somewhere where it is easy to find. It would be nice to archive old matrixes as well in case someone runs on a bit older kernel (we who use Debian tend to like older kernels). In my opinion it would make things a bit easier and perhaps a bit less scary too. Remember, if you get bitten badly once you tend to stay away from it all just in case; if you on the other hand know what bites you can safely pet the fluffy end instead :)
> >
> > Somebody has put that table on the wiki, so it's a good starting point. I'm not sure we can fit everything into one table, some combinations do not bring new information and we'd need an n-dimensional matrix to get the whole picture.
>
> https://btrfs.wiki.kernel.org/index.php/Status

Great.

I made two minor adaptions. I added a link to the Status page to my warning before the kernel-log-by-feature page. And I also mentioned that at the time the page was last updated, the latest kernel version was 4.7. Yes, that's some extra work to update the kernel version, but I think it's beneficial to explicitly mention the kernel version the page talks about. Everyone who updates the page can update the version within a second.

--
Martin
Re: Is stability a joke? (wiki updated)
On 2016-09-12 13:29, Filipe Manana wrote:
> On Mon, Sep 12, 2016 at 5:56 PM, Austin S. Hemmelgarn wrote:
>> On 2016-09-12 12:27, David Sterba wrote:
>>> On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
>>>> I therefore would like to propose that some sort of feature / stability matrix for the latest kernel is added to the wiki, preferably somewhere where it is easy to find. It would be nice to archive old matrixes as well in case someone runs on a bit older kernel (we who use Debian tend to like older kernels). In my opinion it would make things a bit easier and perhaps a bit less scary too. Remember, if you get bitten badly once you tend to stay away from it all just in case; if you on the other hand know what bites you can safely pet the fluffy end instead :)
>>>
>>> Somebody has put that table on the wiki, so it's a good starting point. I'm not sure we can fit everything into one table, some combinations do not bring new information and we'd need an n-dimensional matrix to get the whole picture.
>>>
>>> https://btrfs.wiki.kernel.org/index.php/Status
>>
>> Some things to potentially add based on my own experience:
>>
>> Things listed as TBD status:
>> 1. Seeding: Seems to work fine the couple of times I've tested it, however I've only done very light testing, and the whole feature is pretty much undocumented.
>> 2. Device Replace: Works perfectly as long as the filesystem itself is not corrupted, all the component devices are working, and the FS isn't using any raid56 profiles. Works fine if only the device being replaced is failing. I've not done much testing WRT replacement when multiple devices are suspect, but what I've done seems to suggest that it might be possible to make it work, but it doesn't currently. On raid56 it sometimes works fine, sometimes corrupts data, and sometimes takes an insanely long time to complete (putting data at risk from subsequent failures while the replace is running).
>> 3. Balance: Works perfectly as long as the filesystem is not corrupted and nothing throws any read or write errors. IOW, only run this on a generally healthy filesystem. Similar caveats to those for replace with raid56 apply here too.
>> 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS is healthy.
>
> Virtually all other features work fine if the fs is healthy...

I would add more, but I don't often have the time to test broken filesystems... TBH though, that's most of the issue I see with BTRFS in general at the moment. RAID5/6 works fine, as long as all the devices keep working and you don't try to replace them and don't lose power. Qgroups appear to work fine as long as no other bug shows up (other than the issues with accounting and returning ENOSPC instead of EDQUOT). We do so much testing on pristine filesystems, but most of the utilities and less widely used features have had near-zero testing on filesystems that are in bad shape. If you pay attention, many (possibly most?) of the recently reported bugs are from broken (or poorly curated) filesystems, not some random kernel bug. New features are nice, but they generally don't improve stability, and for BTRFS to be truly production ready outside of constrained environments like Facebook, it needs to not choke on encountering a FS with some small amount of corruption.

>> Other stuff:
>> 1. Compression: The specific known issue is that compressed extents don't always get recovered properly on failed reads when dealing with lots of failed reads. This can be demonstrated by generating a large raid1 filesystem image with huge numbers of small (1MB) readily compressible files, then putting that on top of a dm-flakey or dm-error target set to give a high read-error rate, then mounting and running cat `find .` > /dev/null from the top level of the FS multiple times in a row.
>> 2. Send: The particular edge case appears to be caused by metadata corruption on the sender and results in send choking on the same file every time you try to run it. The quick fix is to copy the contents of the file to another file and rename that over the original.
>
> I don't remember having seen such case at least for the last 2 or 3 years, all the problems I've seen/solved or seen fixes from others were all related to bugs in the send algorithm and definitely not any metadata corruption. So I wonder what evidence you have about this.

For the compression related issues, I can still reproduce it, but it takes a while. As for the send issues, I do still see these on rare occasion, but only on 2+ year old filesystems; I think the last time I saw it happen was more than 3 months ago.
Re: Is stability a joke? (wiki updated)
On Mon, Sep 12, 2016 at 5:56 PM, Austin S. Hemmelgarn wrote:
> On 2016-09-12 12:27, David Sterba wrote:
>> On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
>>>> I therefore would like to propose that some sort of feature / stability matrix for the latest kernel is added to the wiki, preferably somewhere where it is easy to find. It would be nice to archive old matrixes as well in case someone runs on a bit older kernel (we who use Debian tend to like older kernels). In my opinion it would make things a bit easier and perhaps a bit less scary too. Remember, if you get bitten badly once you tend to stay away from it all just in case; if you on the other hand know what bites you can safely pet the fluffy end instead :)
>>>
>>> Somebody has put that table on the wiki, so it's a good starting point. I'm not sure we can fit everything into one table, some combinations do not bring new information and we'd need an n-dimensional matrix to get the whole picture.
>>
>> https://btrfs.wiki.kernel.org/index.php/Status
>
> Some things to potentially add based on my own experience:
>
> Things listed as TBD status:
> 1. Seeding: Seems to work fine the couple of times I've tested it, however I've only done very light testing, and the whole feature is pretty much undocumented.
> 2. Device Replace: Works perfectly as long as the filesystem itself is not corrupted, all the component devices are working, and the FS isn't using any raid56 profiles. Works fine if only the device being replaced is failing. I've not done much testing WRT replacement when multiple devices are suspect, but what I've done seems to suggest that it might be possible to make it work, but it doesn't currently. On raid56 it sometimes works fine, sometimes corrupts data, and sometimes takes an insanely long time to complete (putting data at risk from subsequent failures while the replace is running).
> 3. Balance: Works perfectly as long as the filesystem is not corrupted and nothing throws any read or write errors. IOW, only run this on a generally healthy filesystem. Similar caveats to those for replace with raid56 apply here too.
> 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS is healthy.

Virtually all other features work fine if the fs is healthy...

> Other stuff:
> 1. Compression: The specific known issue is that compressed extents don't always get recovered properly on failed reads when dealing with lots of failed reads. This can be demonstrated by generating a large raid1 filesystem image with huge numbers of small (1MB) readily compressible files, then putting that on top of a dm-flakey or dm-error target set to give a high read-error rate, then mounting and running cat `find .` > /dev/null from the top level of the FS multiple times in a row.
> 2. Send: The particular edge case appears to be caused by metadata corruption on the sender and results in send choking on the same file every time you try to run it. The quick fix is to copy the contents of the file to another file and rename that over the original.

I don't remember having seen such case at least for the last 2 or 3 years, all the problems I've seen/solved or seen fixes from others were all related to bugs in the send algorithm and definitely not any metadata corruption. So I wonder what evidence you have about this.

--
Filipe David Manana,

"People will forget what you said, people will forget what you did, but people will never forget how you made them feel."
Re: Is stability a joke? (wiki updated)
On 2016-09-12 12:27, David Sterba wrote:
> On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
> > > I therefore would like to propose that some sort of feature / stability matrix for the latest kernel is added to the wiki, preferably somewhere where it is easy to find. It would be nice to archive old matrixes as well in case someone runs on a bit older kernel (we who use Debian tend to like older kernels). In my opinion it would make things a bit easier and perhaps a bit less scary too. Remember, if you get bitten badly once you tend to stay away from it all just in case; if you on the other hand know what bites you can safely pet the fluffy end instead :)
> >
> > Somebody has put that table on the wiki, so it's a good starting point. I'm not sure we can fit everything into one table, some combinations do not bring new information and we'd need an n-dimensional matrix to get the whole picture.
>
> https://btrfs.wiki.kernel.org/index.php/Status

Some things to potentially add based on my own experience:

Things listed as TBD status:
1. Seeding: Seems to work fine the couple of times I've tested it, however I've only done very light testing, and the whole feature is pretty much undocumented.
2. Device Replace: Works perfectly as long as the filesystem itself is not corrupted, all the component devices are working, and the FS isn't using any raid56 profiles. Works fine if only the device being replaced is failing. I've not done much testing WRT replacement when multiple devices are suspect, but what I've done seems to suggest that it might be possible to make it work, but it doesn't currently. On raid56 it sometimes works fine, sometimes corrupts data, and sometimes takes an insanely long time to complete (putting data at risk from subsequent failures while the replace is running).
3. Balance: Works perfectly as long as the filesystem is not corrupted and nothing throws any read or write errors. IOW, only run this on a generally healthy filesystem. Similar caveats to those for replace with raid56 apply here too.
4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS is healthy.

Other stuff:
1. Compression: The specific known issue is that compressed extents don't always get recovered properly on failed reads when dealing with lots of failed reads. This can be demonstrated by generating a large raid1 filesystem image with huge numbers of small (1MB) readily compressible files, then putting that on top of a dm-flakey or dm-error target set to give a high read-error rate, then mounting and running cat `find .` > /dev/null from the top level of the FS multiple times in a row.
2. Send: The particular edge case appears to be caused by metadata corruption on the sender and results in send choking on the same file every time you try to run it. The quick fix is to copy the contents of the file to another file and rename that over the original.
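The copy-and-rename workaround described for the send issue can be sketched as a couple of shell commands. The path here is a hypothetical stand-in for whatever file send chokes on; this only illustrates the mechanics (the new file gets freshly written extents and metadata, which is why it clears the failure when the old file's metadata was bad):

```shell
# Hypothetical stand-in for the file that btrfs send keeps choking on.
dir=/tmp/demo
f="$dir/problem-file"

mkdir -p "$dir"
printf 'original contents\n' > "$f"   # pretend this is the real file

# Rewrite the data into a fresh file, then rename it over the original.
cat "$f" > "$f.new"
mv "$f.new" "$f"
```

Note that ownership, permissions, and xattrs are not preserved by a plain cat; use cp --preserve=all to a temporary name, or fix them up afterwards, if they matter. The final mv within one directory is an atomic rename.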
Re: Is stability a joke? (wiki updated)
On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
> > I therefore would like to propose that some sort of feature / stability matrix for the latest kernel is added to the wiki, preferably somewhere where it is easy to find. It would be nice to archive old matrixes as well in case someone runs on a bit older kernel (we who use Debian tend to like older kernels). In my opinion it would make things a bit easier and perhaps a bit less scary too. Remember, if you get bitten badly once you tend to stay away from it all just in case; if you on the other hand know what bites you can safely pet the fluffy end instead :)
>
> Somebody has put that table on the wiki, so it's a good starting point. I'm not sure we can fit everything into one table, some combinations do not bring new information and we'd need an n-dimensional matrix to get the whole picture.

https://btrfs.wiki.kernel.org/index.php/Status