Re: unsolvable technical issues?

Duncan Fri, 22 Jun 2018 22:14:35 -0700

waxhead posted on Fri, 22 Jun 2018 01:13:31 +0200 as excerpted:

> According to this:
> 
> https://stratis-storage.github.io/StratisSoftwareDesign.pdf Page 4 ,
> section 1.2
> 
> It claims that BTRFS still have significant technical issues that may
> never be resolved.
> Could someone shed some light on exactly what these technical issues
> might be?! What are BTRFS biggest technical problems?
> 
> If you forget about the "RAID"5/6 like features then the only annoyances
> that I have with BTRFS so far is...
> 
> 1. Lack of per subvolume "RAID" levels
> 2. Lack of not using the deviceid to re-discover and re-add dropped
> devices
> 
> And that's about it really...


... And those both have solutions on the roadmap, with RFC patches 
already posted for #2 (tho I'm not sure they use devid) altho 
realistically they're likely to take years to appear and be tested to 
stability.  Meanwhile...

While as the others have said you really need to go to the author to get 
what was referred to, and I agree, I can speculate a bit.  While this 
*is* speculation, admittedly somewhat uninformed as I don't claim to be a 
dev, and I'd actually be interested in what others think so don't be 
afraid to tell me I haven't a clue, as long as you say why... based on 
several years reading the list now...

1) When I see btrfs "technical issue that may never be resolved", the 
first thing I think of, which AFAIK there are _definitely_ no plans to 
resolve because it's very deeply woven into the btrfs core by now, is...

Filesystem UUID Identification.  Btrfs takes the UU bit of Universally 
Unique quite literally, assuming they really *are* unique, at least on 
that system, and uses them to identify the possibly multiple devices that 
may be components of the filesystem, a problem most filesystems don't 
have to deal with since they're single-device-only.  Because btrfs uses 
this supposedly unique ID to ID devices that belong to the filesystem, it 
can get *very* mixed up, with results possibly including data loss, if it 
sees devices that carry the same UUID as a mounted filesystem but don't 
actually belong to it.

But technologies such as LVM allow cloning devices and these additional 
devices naturally have the same filesystem metadata, including filesystem 
UUID, as the original.  Making the problem worse is udev with its plug-n-
play style detection, which will normally trigger a btrfs device scan, 
thus making btrfs aware of new devices containing (a component of) a 
btrfs, as soon as udev detects the device.

So people, including users of redhat/fedora which standardizes on lvm and 
systemd/udev, have to be _very_ careful when cloning devices, etc, with 
existing mounted btrfs, not to allow btrfs to see the new clones, lest it 
get mixed up and write data to the wrong device due to it having the same 
UUID as the mounted filesystem, possibly resulting in data loss.
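
To make the failure mode concrete, here's a toy Python model of a device 
scan that trusts the filesystem UUID.  It is only a sketch: the names 
(Superblock, ToyScanner, the /dev/vg0 paths) are made up for illustration, 
and this is not how the kernel's device tracking is actually written.  It 
just shows why a clone with identical metadata is indistinguishable from 
the original when the UUID is the lookup key.

```python
from collections import defaultdict
from typing import Dict, NamedTuple, Optional


class Superblock(NamedTuple):
    """Hypothetical stand-in for the identifying fields of a superblock."""
    fsid: str   # filesystem UUID, shared by every member device
    devid: int  # per-device number within that filesystem


class ToyScanner:
    """Toy registry that assumes (fsid, devid) identifies exactly one device."""

    def __init__(self) -> None:
        # fsid -> {devid -> device path}
        self.filesystems: Dict[str, Dict[int, str]] = defaultdict(dict)

    def scan(self, path: str, sb: Superblock) -> Optional[str]:
        """Register a device; return the path it displaced, if any."""
        members = self.filesystems[sb.fsid]
        previous = members.get(sb.devid)
        members[sb.devid] = path  # last device seen wins
        return previous if previous != path else None


scanner = ToyScanner()
original = Superblock(fsid="11111111-2222-3333-4444-555555555555", devid=1)
scanner.scan("/dev/vg0/data", original)

# A block-level clone carries byte-identical metadata, UUID included.
clone = original
displaced = scanner.scan("/dev/vg0/data-clone", clone)
```

In this model the second scan silently replaces the original path with 
the clone's, which is the kind of mixup described above.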

But btrfs made the choice to use UUID as if it were really unique, just 
as it says it is on the label, many years ago, when btrfs was much 
younger, and that choice is now embedded so deeply it's not practical to 
consider changing it to something else (tho there is a utility to allow a 
suitably careful user to change it on a cloned device, should it be 
necessary).

For someone standardized on a solution such as lvm, that could be 
considered an unsolvable technical issue indeed, and indeed, I don't 
believe anyone here will argue that it's going to change.  Tho I'd 
definitely argue the bug is in apps that deliberately make UUIDs non-UUID 
any longer, no longer unique, not in btrfs, which simply takes the claim 
on the label at face value.


While that's the only truly "unsolvable" one I know of, depending on 
one's strictness in defining "unsolvable" and the scope of the time frame 
under consideration, it's quite conceivable (indeed, having read a bit 
about them before, it seems to be the case, certainly the PR case) that 
stratis, et al., have lost patience with the slow pace of btrfs 
development, and consider various other still missing features as now 
"practically insolvable as in won't be solved to production ready", at 
least in a "reasonable" time frame of under say 3-5 (or 5-7, or whatever) 
years.  These could arguably include:

2) Subvolume and (more technically) reflink-aware defrag.

It was there for a couple kernel versions some time ago, but "impossibly" 
slow, so it was disabled until such time as btrfs could be made to scale 
rather better in this regard.

There's no hint yet as to when that might actually be, if it will _ever_ 
be, so this can arguably be validly added to the "may never be resolved" 
list.

3) N-way-mirroring.

This one was on the roadmap for "right after raid56 support, since it'll 
use some of that code", since at least 3.5, when raid56 was supposed to 
be introduced in 3.6.  I know because this is the one I've been most 
looking forward to personally, tho my original reason, aging but still 
usable devices that I wanted extra redundancy for, has long since itself 
been aged out of rotation.

Of course we know the raid56 story and thus the implied delay here, if 
it's even still roadmapped at all now, and as with reflink-aware-defrag, 
there's no hint yet as to when we'll actually see this at all, let alone 
see it in a reasonably stable form, so at least in the practical sense, 
it's arguably "might never be resolved."

4) (Until relatively recently, and still in terms of scaling) Quotas.

Until relatively recently, quotas could arguably be added to the list.  
They were rewritten multiple times, and until recently, appeared to be 
effectively eternally broken.

While that has happily changed recently and (based on the list, I don't 
use 'em personally) quotas actually seem at least somewhat usable these 
days (altho less critical bugs are still being fixed), AFAIK quota 
scalability while doing btrfs maintenance remains a serious enough issue 
that the recommendation is to turn them off before doing balances, and 
the same would almost certainly apply to reflink-aware-defrag (turn 
quotas off before defragging) were it available, as well.  That 
scalability alone could arguably be a "technical issue that may never be 
resolved", and while quotas themselves appear to be reasonably functional 
now, that could arguably justify them still being on the list.
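
The "quotas off before balancing" recommendation can be sketched as a 
small Python wrapper around the btrfs command-line tool.  This is a 
minimal sketch under assumptions: the helper names and the mountpoint are 
hypothetical, it defaults to a dry run that only prints the commands, and 
it's worth checking the btrfs-progs documentation for what disabling 
quotas does to existing qgroup settings before running it for real.

```python
import subprocess
from typing import List


def balance_commands(mountpoint: str) -> List[List[str]]:
    """Return the command sequence: quotas off, balance, quotas back on."""
    return [
        ["btrfs", "quota", "disable", mountpoint],
        ["btrfs", "balance", "start", mountpoint],
        ["btrfs", "quota", "enable", mountpoint],
    ]


def run_balance_without_quotas(mountpoint: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them in order."""
    for argv in balance_commands(mountpoint):
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops the sequence if any step fails.
            subprocess.run(argv, check=True)


# Hypothetical mountpoint; the default dry run only prints the three commands.
run_balance_without_quotas("/mnt/data")
```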


And of course that's avoiding the two you mentioned, tho arguably they 
could go on the "may in practice never be resolved, at least not in the 
non-bluesky lifetime" list as well.


As for stratis, supposedly they're deliberately taking existing 
technology, already proven in multi-layer form, and simply exposing it in 
unified form.  They claim this dramatically lessens the required new code 
and shortens time-to-stability to something reasonable, in contrast to 
the roughly a decade btrfs has taken already, without yet reaching a full 
feature set and full stability.  IMO they may well have a point, tho 
AFAIK they're still new and immature themselves and (I believe) don't 
have a full feature set or full stability either, so it's a point that 
AFAIK has yet to be fully demonstrated.

We'll see how they evolve.  I do actually expect them to move faster than 
btrfs, but also expect the interface may not be as smooth and unified as 
they'd like to present as I expect there to remain some hiccups in 
smoothing over the layering issues.  Also, because they've deliberately 
chosen to go with existing technology where possible in order to evolve 
to stability faster, by the same token they're deliberately limiting the 
evolution to incremental over existing technology, and I expect there's 
some stuff btrfs will do better as a result... at least until btrfs (or a 
successor) becomes stable enough for them to integrate (parts of?) it as 
existing demonstrated-stable technology.

The other difference, AFAIK, is that stratis is specifically a corporate 
project, developed as a/the main money product, whereas btrfs was always 
something the btrfs devs used at their employers (oracle, facebook), who 
have other things as their main product.  As such, stratis is much more 
likely to prioritize things like raid status monitors, hot-spares, etc, 
that can be part of the product they sell, where they've been lower 
priority for btrfs.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
