On 2017-04-06 23:25, John Petrini wrote:
> Interesting. That's the first time I'm hearing this. If that's the
> case I feel like it's a stretch to call it RAID10 at all. It sounds a
> lot more like basic replication, similar to Ceph, only Ceph understands
> failure domains and therefore can be configured to handle device
> failure (albeit at a higher level).
Yeah, the stacking is a bit odd, and there are some rather annoying caveats that make most of the names other than raid5/raid6 misleading. In fact, when run on more than two disks, BTRFS raid1 mode is closer to what most people think of as RAID10 than BTRFS raid10 mode is, although it stripes at a much higher level.

> I do of course keep backups, but I chose RAID10 for the mix of
> performance and reliability. It doesn't seem worth losing 50% of
> my usable space for the performance gain alone.

> Thank you for letting me know about this. Knowing that, I think I may
> have to reconsider my choice here. I've really been enjoying the
> flexibility of BTRFS, which is why I switched to it in the first place,
> but with experimental RAID5/6 and what you've just told me I'm
> beginning to doubt that it's the right choice.
There are some other options in how you configure it. Most of the more useful operational modes actually require stacking BTRFS on top of LVM or MD. I'm rather fond of running BTRFS raid1 on top of LVM RAID0 volumes, which, while it provides no better data safety than BTRFS raid10 mode, gets noticeably better performance. You can also reverse that stacking to get something more like traditional RAID10, but then you lose the self-correcting aspect of BTRFS.
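
To make the layering a bit more concrete, here's a rough sketch of that first setup, assuming four disks already set up as PVs in a single volume group. The VG name, device names, and sizes are all placeholders, and in practice you'd probably just run the handful of commands by hand rather than script them:

#!/usr/bin/env python3
# Rough sketch: two LVM raid0 LVs (the striping layer), with BTRFS raid1
# across them (the mirroring/self-healing layer).  The VG name, devices,
# and sizes below are made up -- don't run this against disks you care about.
import subprocess

VG = "vg0"                      # assumed volume group spanning all four disks
PAIRS = [                       # each raid0 LV stripes across one pair of PVs
    ("fast0", ["/dev/sda1", "/dev/sdb1"]),
    ("fast1", ["/dev/sdc1", "/dev/sdd1"]),
]
SIZE = "500G"

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# The striping layer: one LVM raid0 logical volume per pair of devices.
for name, pvs in PAIRS:
    run(["lvcreate", "--type", "raid0", "--stripes", str(len(pvs)),
         "-L", SIZE, "-n", name, VG] + pvs)

# The redundancy layer: BTRFS raid1 (data and metadata) across the two LVs.
run(["mkfs.btrfs", "-d", "raid1", "-m", "raid1"]
    + ["/dev/{}/{}".format(VG, name) for name, _ in PAIRS])

Reversing the stacking is just the mirror image of this: LVM or MD raid1 pairs underneath, with '-d raid0' at the BTRFS level on top.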

> What's more concerning is that I haven't found a good way to monitor
> BTRFS. I might be able to accept that the array can only handle a
> single drive failure if I were confident that I could detect it, but so
> far I haven't found a good solution for this.
This I can actually give some advice on. There are a couple of options, but the easiest is to find a piece of generic monitoring software that can check the return code of external programs, and then write some simple scripts to perform the checks on BTRFS (there's a rough sketch of one such script after the list). The things you want to keep an eye on are:

1. Output of 'btrfs dev stats'. If you've got a new enough copy of btrfs-progs, you can pass '--check' and the return code will be non-zero if any of the error counters isn't zero. If you've got to use an older version, you'll instead have to write a script to parse the output (I will comment that this is much easier in a language like Perl or Python than it is in bash). You want to watch for steady increases in error counts or sudden large jumps. Single intermittent errors are worth tracking, but they tend to happen more frequently the larger the array is.

2. Results from 'btrfs scrub'. This is somewhat tricky because scrub is either asynchronous or blocks for a _long_ time. The simplest option I've found is to fire off an asynchronous scrub to run during down-time, and then schedule recurring checks with 'btrfs scrub status'. On the plus side, 'btrfs scrub status' already returns non-zero if the scrub found errors.

3. Watch the filesystem flags. Some monitoring software can easily do this for you (Monit, for example, can watch for changes in the flags). The general idea here is that BTRFS will go read-only if it hits certain serious errors, so you can watch for that transition and send a notification when it happens. This is also a worthwhile check because the filesystem flags should not change during normal operation of any filesystem.

4. Watch SMART status on the drives and run regular self-tests. Most of the time, issues will show up here before they show up in the FS, so by watching this, you may have an opportunity to replace devices before the filesystem ends up completely broken.

5. If you're feeling really ambitious, watch the kernel logs for errors from BTRFS and whatever storage drivers you use. This is the least reliable item on this list to automate, so I wouldn't suggest relying on it by itself.
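
To make the "simple scripts" part concrete, here's a rough sketch of the kind of check I mean for items 1 and 2 -- something any monitor that can test an external program's return code can call on a schedule. The mountpoint is just an example, and it assumes Python 3:

#!/usr/bin/env python3
# Rough sketch of a BTRFS health check for a return-code based monitor.
# Exits 0 if things look clean, non-zero otherwise.
import subprocess
import sys

MOUNTPOINT = "/mnt/data"   # adjust to your filesystem

def run(cmd):
    return subprocess.run(cmd, capture_output=True, text=True)

failed = False

# Item 1: device error counters.  With a new enough btrfs-progs you can
# instead run 'btrfs dev stats --check' and just test its return code;
# parsing the counters as below also works on older versions.
stats = run(["btrfs", "dev", "stats", MOUNTPOINT])
if stats.returncode != 0:
    failed = True
    print(stats.stderr.strip())
for line in stats.stdout.splitlines():
    # Lines look like: "[/dev/sdb1].write_io_errs   0"
    fields = line.split()
    if len(fields) == 2 and fields[1].isdigit() and int(fields[1]) != 0:
        failed = True
        print("non-zero error counter:", line)

# Item 2: result of the last (or currently running) scrub.  'scrub status'
# already returns non-zero if the scrub found errors.
scrub = run(["btrfs", "scrub", "status", MOUNTPOINT])
if scrub.returncode != 0:
    failed = True
    print(scrub.stdout.strip() or scrub.stderr.strip())

sys.exit(1 if failed else 0)

The scrub itself still gets kicked off separately (cron during your quiet hours is fine); this just picks up the result afterwards.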

The first two items are BTRFS specific. The rest, however, are standard things you should be monitoring regardless of what type of storage stack you have. Of these, item 3 will trigger immediately in the event of a catastrophic device failure, while items 1, 2, and 5 will provide better coverage of slow failures, and item 4 covers both.

As far as what to use to actually track these, that really depends on your use case. For tracking on an individual-system basis, I'd suggest Monit: it's efficient, easy to configure, provides some degree of error resilience, and can cover a lot of monitoring tasks beyond this sort of thing. If you want some kind of centralized monitoring, I'd probably go with Nagios, but that's more because it's the standard for that type of thing than because I've used it myself (I much prefer per-system decentralized monitoring, with only the checks that systems are online centralized).