On Sun, Oct 8, 2017 at 10:58 AM, Kai Hendry <hen...@iki.fi> wrote:
> Hi there,
>
> My /mnt/raid1 suddenly became full somewhat expectedly, so I bought 2
> new USB 4TB hard drives (one WD, one Seagate) to upgrade to.
>
> After adding sde and sdd I started to see errors in dmesg [2].
> https://s.natalian.org/2017-10-07/raid1-newdisks.txt
> [2] https://s.natalian.org/2017-10-07/btrfs-errors.txt

I'm not sure what the call traces mean exactly but they seem
non-fatal. The entire dmesg might be useful to see if there are device
or bus related errors.

I have a similarly modeled NUC, and I can tell you for sure it does not
provide enough USB bus power for 2.5" laptop drives. They must be
externally powered, or you need a really good USB hub with an even
better power supply, one that can bus power e.g. 4 drives at the same
time. I had lots of problems before I fixed this, but Btrfs managed to
recover gracefully once I solved the power issue.


>
>
> I assumed it perhaps had to do with the USB bus on my NUC5CPYB being
> maxed out, and to expedite the sync, I tried to remove one of the older
> 2TB drives, sdc1. However the load went crazy and my system became
> completely unstable. I shut down the machine and after an hour I hard
> powered it down since it seemed to hang (it's headless).

I've noticed recent kernels hanging under trivial scrub and balance
with hard drives. The operation does complete, but the system gets
really laggy and sometimes unresponsive to anything else unless the
operation is cancelled. I haven't had time to do regression testing;
my report about this, including the versions I think it started with,
is in the archives.




>
> Sidenote: I've since learnt that removing a drive actually deletes the
> contents of the drive? I don't want that. I was hoping to put that drive
> into cold storage. How do I remove a drive without losing data from a
> RAID1 configuration?

I'm pretty sure, but not certain, of the following: device
delete/remove replicates chunk by chunk, COW style. The entire
operation is not atomic, but the individual chunk operations are. I
expect the metadata is updated as each chunk is properly replicated,
so I don't think what you want is possible.
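
For reference, and only as a sketch (the device and mount point below
are placeholders for your sdc1 and /mnt/raid1), the remove plus a way
to watch the chunks migrate looks like:

  # Replicate this device's chunks elsewhere, then release it from the
  # volume; the remove is chunk by chunk, not one atomic operation.
  btrfs device remove /dev/sdc1 /mnt/raid1

  # In another terminal, watch allocation shift off the device.
  watch -n 60 btrfs device usage /mnt/raid1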

Again, pretty sure about this too, but not certain: device replace is
an atomic operation, the whole thing succeeds or fails, and at the end
merely the Btrfs signature is wiped from the replaced device(s). So you
could restore that signature and the device would be valid again;
HOWEVER, it's going to have the same volume UUID as the new devices.
Even though the device UUIDs are unique, and should prevent confusion,
maybe confusion is still possible.
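
If you go the replace route, the shape of it is roughly this (the devid
and target device below are examples; 'btrfs fi show /mnt/raid1' tells
you the real devid):

  # Copy devid 1 onto the new drive in one pass; only at the very end
  # is the old device's Btrfs signature wiped.
  btrfs replace start 1 /dev/sdd /mnt/raid1

  # Progress and completion status.
  btrfs replace status /mnt/raid1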

A better way, which currently isn't viable, would be to turn the raid1
into a seed device, then add the two new devices and remove the seed.
That way you get the replication you want: the instant the sprout is
mounted rw it can be used in production (all changes go to the
sprout), while the chunks from the seed are replicated. The reason
this isn't viable right now is that the tools aren't mature enough to
handle multi-device seeds yet. With a single-device seed and a single
sprout, though, this works and would be the way to do what you want.
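
For comparison, this is the single-device seed workflow that does work
today, sketched with made-up device names and mount point; it's the
multi-device seed case that the tools can't handle yet:

  # Flag the old device as a seed; from now on it is read-only.
  btrfstune -S 1 /dev/sdb

  # Mount the seed (it mounts ro), add the new device as the sprout,
  # then remount rw so all new writes land on the sprout.
  mount /dev/sdb /mnt/seed
  btrfs device add /dev/sdd /mnt/seed
  mount -o remount,rw /mnt/seed

  # Optionally migrate everything off the seed and drop it.
  btrfs device delete /dev/sdb /mnt/seed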

A better way that does exist is to set up an overlay for each of the
two original devices: mount the overlay devices, add the new devices,
then delete the overlays. That way the overlay devices absorb the
writes that invalidate them, while the original devices aren't really
touched. There's a way to do this with dmsetup, similar to how live
boot media work, and there's another way, which I haven't used before,
described here:

https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file
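
The dmsetup variant looks roughly like this, done once per original
device (sizes, file names and devices here are made up; the wiki page
above covers the loop file details):

  # Sparse file that will absorb any writes aimed at /dev/sdb.
  truncate -s 10G /tmp/overlay-sdb
  loopdev=$(losetup -f --show /tmp/overlay-sdb)

  # dm snapshot target: reads come from /dev/sdb, writes go to the
  # overlay, and /dev/sdb itself is never touched.
  dmsetup create overlay-sdb --table \
    "0 $(blockdev --getsz /dev/sdb) snapshot /dev/sdb $loopdev P 8"

  # Mount /dev/mapper/overlay-sdb (and its sibling) instead of the real
  # devices, add the new drives, then delete the overlays from the fs.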




> After a reboot it failed, namely because "nofail" wasn't in my fstab and
> systemd is pedantic by default. After managing to get it booting into my
> system without /mnt/raid1 I faced these "open ctree failed" issues.
> After running btrfs check on all the drives and getting nowhere, I
> decided to unplug the new drives and I discovered that when I take out
> the new 4TB WD drive, I could mount it with -o degraded.
>
> dmesg errors with the WD include "My Passport" Wrong diagnostic page;
> asked for 1 got 8 "Failed to get diagnostic page 0xffffffea" which
> raised my suspicions. The model number btw is WDBYFT0040BYI-WESN
>
> Anyway, I'm back up and running with 2x2TB  (one of them didn't finish
> removing, I don't know which) & 1x4TB.


Be aware that you are likely in a very precarious position now.
Anytime a raid1 volume is mounted rw,degraded, one or more of the
devices will end up with new empty single chunks (there is a patch to
prevent this; I'm not sure if it's in 4.13). The consequence of these
new empty single chunks is that they will prevent any subsequent
degraded rw mount. You get a one-time rw,degraded mount; any
subsequent attempt will require ro,degraded to get it to mount. If you
end up snared in this, there are patches in the archives to inhibit
the kernel's protection and allow mounting of such volumes. Super
annoying. You'll have to build a custom kernel.

My opinion is you should update backups before you do anything else,
just in case.

Next, you have to figure out a way to get all the devices used in this
volume healthy. Tricky, as you technically have a 4-device raid1 with
a missing device. I propose first checking whether you have single
chunks, with either 'btrfs fi us' or 'btrfs fi df', and if so, getting
rid of them with a filtered balance: 'btrfs balance start
-mconvert=raid1,soft -dconvert=raid1,soft'. Then in theory you should
be able to do 'btrfs device delete missing' to end up with a valid
three-device Btrfs raid1, which you can use until you get your USB
power supply issues sorted.
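
Concretely, and assuming /mnt/raid1 is still the mount point, that
sequence is something like:

  # Look for stray "single" profile chunks left by the degraded rw mount.
  btrfs fi usage /mnt/raid1
  btrfs fi df /mnt/raid1

  # Convert them back to raid1; the 'soft' filter skips chunks that are
  # already raid1, so this should be relatively quick.
  btrfs balance start -mconvert=raid1,soft -dconvert=raid1,soft /mnt/raid1

  # Drop the record of the missing device, leaving a 3-device raid1.
  btrfs device delete missing /mnt/raid1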

I have a lot of nausea and something of a fever right now as I'm
writing this, so you should definitely not take anything I've said at
face value. Except the backup-now business. That's probably good
advice.

-- 
Chris Murphy