On Mon, Sep 12, 2016 at 08:20:20AM -0400, Austin S. Hemmelgarn wrote:
> On 2016-09-11 09:02, Hugo Mills wrote:
> >On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote:
> >>Martin Steigerwald wrote:
> >>>On Sunday, 11 September 2016, 13:43:59 CEST, Martin Steigerwald wrote:
> >>>>>>Thing is: this just seems to be a "when was a feature implemented" matrix, not when it is considered to be stable. I think this could be done with colors or something similar: red for not supported, yellow for implemented, and green for production ready.
> >>>>>Exactly, just like the Nouveau matrix. It clearly shows what you can expect from it.
> >>>I mentioned this matrix as a good *starting* point, and I think it would be easy to extend it:
> >>>
> >>>Just add another column called "Production ready", then research / ask about the production stability of each feature. The only challenge is: who is authoritative on that? I'd certainly ask the developer of a feature, but I'd also consider user reports to some extent.
> >>>
> >>>Maybe that's the real challenge.
> >>>
> >>>If you wish, I'd go through each feature there and give my own estimation, but I think there are others who are deeper into this.
> >>That is exactly the same reason I don't edit the wiki myself. I could of course get it started and hope that someone will correct what I write, but I feel that if I start this off I don't have deep enough knowledge to make a proper start. Perhaps I will change my mind about this.
> >
> > Given that nobody else has done it yet, what are the odds that someone else will step up to do it now? I would say that you should at least try. Yes, you don't have as much knowledge as some others, but if you keep working at it, you'll gain that knowledge. Yes, you'll probably get it wrong to start with, but you probably won't get it *very* wrong. You'll probably get it horribly wrong at some point, but even the more knowledgeable people you're deferring to didn't identify the problems with parity RAID until Zygo and Austin and Chris (and others) put in the work to pin down the exact issues.
> FWIW, here's a list of what I personally consider stable (as in, I'm willing to bet against reduced uptime to use this stuff on production systems at work and on personal systems at home):
> 1. Single-device mode, including DUP data profiles on a single device without mixed-bg.
> 2. Multi-device raid0, raid1, and raid10 profiles with symmetrical devices (all devices are the same size).
> 3. Multi-device single profiles with asymmetrical devices.
> 4. Small numbers (at most double digits) of snapshots, taken at infrequent intervals (no more than once an hour). I use single snapshots regularly to get stable images of the filesystem for backups, and I keep hourly ones of my home directory for about 48 hours.
> 5. Subvolumes used to isolate parts of a filesystem from snapshots. I use this regularly to isolate areas of my filesystems from backups.
> 6. Non-incremental send/receive (no clone source, no parents, no deduplication). I use this regularly for cloning virtual machines.
> 7. Checksumming and scrubs using any of the profiles I've listed above.
> 8. Defragmentation, including autodefrag.
> 9. All of the compat_features, including no-holes and skinny-metadata.
>
> Things I consider stable enough that I'm willing to use them on my personal systems but not on systems at work:
> 1. In-line data compression with compress=lzo.
> I use this on my laptop and home server system. I've never had any issues with it myself, but I know that other people have, and it does seem to make other things more likely to have issues.
> 2. Batch deduplication. I only use this on the back-end filesystems for my personal storage cluster, and only because I have multiple copies as a result of GlusterFS on top of BTRFS. I've not had any significant issues with it, and I don't remember any reports of data loss resulting from it, but it's something that people should not be using if they don't understand all the implications.
>
> Things that I don't consider stable but some people do:
> 1. Quotas and qgroups. Some people (such as SUSE) consider these to be stable. There are still a couple of known issues with them, however, such as returning the wrong errno when a quota is hit (it should return -EDQUOT, but instead returns -ENOSPC).
> 2. RAID5/6. There are a few people who use this, but it's generally agreed to be unstable. There are still at least 3 known bugs which can cause complete loss of a filesystem, and there's also a known issue with rebuilds taking insanely long, which puts data at risk as well.
> 3. Multi-device filesystems with asymmetrical devices running raid0, raid1, or raid10. The issue I have here is that it's much easier to hit errors regarding free space than a reliable system should allow. It's possible to avoid this with careful planning (for example, a 3-disk raid1 profile with one disk exactly twice the size of the other two will work fine, albeit with more load on the larger disk).
> ...
> As far as documentation goes, though, we [BTRFS] really do need to get our act together. It really doesn't look good to have most of the best documentation be in the distros' wikis instead of ours. I'm not trying to say the distros shouldn't be documenting BTRFS, but when Debian (for example) has better documentation of the upstream version of BTRFS than the upstream project itself does, that starts to look bad.
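Austin's point 3 above (asymmetrical devices with raid1) is easier to see with a quick back-of-the-envelope calculation. The Python sketch below only illustrates the usual pairing argument, not the kernel's actual allocator: it ignores metadata chunks, the global reserve and chunk granularity, so the numbers are rough estimates rather than what the real tools would report.

# Rough estimate of usable data space for btrfs raid1 (two copies of
# every chunk) on devices of mixed sizes.  Illustrative only: ignores
# metadata chunks, the global reserve and chunk granularity.
def raid1_usable(device_sizes_gib):
    total = sum(device_sizes_gib)
    largest = max(device_sizes_gib)
    # Each chunk is paired across two different devices, so the largest
    # device can never hold more data than all the other devices combined.
    if largest >= total - largest:
        return total - largest
    return total // 2

# Austin's example: one disk exactly twice the size of the other two.
print(raid1_usable([2048, 1024, 1024]))  # 2048 GiB usable, nothing stranded
# A less careful layout: half of the big disk can never be paired.
print(raid1_usable([2048, 1024]))        # 1024 GiB usable, 1024 GiB stranded

In the first case every chunk keeps one copy on the 2 TiB disk, which is exactly the extra load on the larger disk that Austin mentions.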
I would have loved to have had this feature-to-stability list when I started working on the Debian documentation! I started it because I was saddened by the number of horror-story "adventures with btrfs" articles and posts I had read, combined with the perspective of certain members within the Debian community that it was a toy fs.

Are my contributions to that wiki of a high enough quality that I can work on the upstream one? Do you think the broader btrfs community is interested in citations and curated links to discussions? E.g.: if a company wants to use btrfs, they check the status page, see that a feature they want is still in the yellow zone of stabilisation, and then follow the links to familiarise themselves with past discussions. I imagine this would also help individuals or grad students more quickly familiarise themselves with the available literature before choosing a specific project.

If regular updates from SUSE, STRATO, Facebook, and Fujitsu are also publicly available, the k.org wiki would be a wonderful place to syndicate them!

Sincerely,
Nicholas

> >
> > So, go for it. You have a lot to offer the community.
> >
> > Hugo.
> >
> >>>I do think for example that scrubbing and auto raid repair are stable, except for RAID 5/6. Also device statistics and RAID 0 and 1 I consider to be stable. I think RAID 10 is also stable, but as I do not run it, I don't know. For me skinny-metadata is also stable. For me, so far, even compress=lzo seems to be stable, though for others it may not be.
> >>>
> >>>Since what kernel version? Now, there you go. I have no idea. All I know is that I started with BTRFS at kernel 2.6.38 or 2.6.39 on my laptop, but not as RAID 1 at that time.
> >>>
> >>>See, the implementation time of a feature is much easier to assess. Maybe that's part of the reason why there is no stability matrix: maybe no one *exactly* knows *for sure*. How could you? So I would even put a footnote on that "Production ready" column explaining "Considered to be stable by developer and user opinion".
> >>>
> >>>Of course it would additionally be good to read about experiences of corporate usage of BTRFS. I know at least Fujitsu, SUSE, Facebook, and Oracle are using it, but I don't know in what configurations and with what experiences. One Oracle developer invests a lot of time in bringing BTRFS-like features to XFS, and Red Hat still favors XFS over BTRFS; even SLES defaults to XFS for /home and other non-/ filesystems. That also tells a story.
> >>>
> >>>You can get some ideas from the SUSE release notes. Even if you do not want to use SUSE, they tell you something, and I bet they are one of the better sources of information on your question that you can get at this time, because I believe the SUSE developers invested some time in assessing the stability of features: they would have to carefully assess what they can support in enterprise environments. There is also someone from Fujitsu who shared experiences in a talk; I can search for the URL to the slides again.
> >>By all means, SUSE's wiki is very valuable. I just said that I *prefer* to have that stuff on the BTRFS wiki and feel that is the right place for it.
> >>>
> >>>I bet Chris Mason and other BTRFS developers at Facebook have some idea of what they use within Facebook as well. To what extent they are allowed to talk about it… I don't know.
> >>>My personal impression is that as soon as Chris went to Facebook he became quite quiet. Maybe just due to being busy. Maybe due to Facebook being concerned much more about its own privacy than about that of its users.
> >>>
> >>>Thanks,
> >>
> >
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html