On Fri, Aug 14, 2015 at 12:12 PM, Timothy Normand Miller
<theo...@gmail.com> wrote:
> Sorry about that empty email.  I hit a wrong key, and gmail decided to send.
>
> Anyhow, my replacement drive is going to arrive this evening, and I
> need to know how to add it to my btrfs array.  Here's the situation:
>
> - I had a drive fail, so I removed it and mounted degraded.
> - I hooked up a replacement drive, did an "add" on that one, and did a
> "delete missing".
> - During the rebalance, the replacement drive failed, there were OOPSes, etc.
> - Now, although all of my data is there, I can't mount degraded,
> because btrfs is complaining that too many devices are missing (3 are
> there, but it sees 2 missing).
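
For reference, the sequence you describe normally looks something like
this (device names here are placeholders, not taken from your report):

    mount -o degraded /dev/sdb /mnt     # mount without the failed disk
    btrfs device add /dev/sde /mnt      # add the replacement drive
    btrfs device delete missing /mnt    # relocate chunks off the missing slot

It's during that last step that your replacement died.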

It might be related to this (long) bug:
https://bugzilla.kernel.org/show_bug.cgi?id=92641

While Btrfs RAID 1 can tolerate only a single device failure, what you
have is an in-progress rebuild onto a replacement for a missing
device. If that replacement fails mid-rebuild, the volume should be no
worse off than it was before the rebuild started. But Btrfs doesn't
see it that way: it sees two separate missing devices, decides too
many devices are missing, and refuses to proceed. And there's no
mechanism to remove missing devices unless you can mount read-write.
So it's stuck.
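
If you haven't already, it may be worth attempting a read-only
degraded mount. As far as I can tell, the "too many missing devices"
check is aimed at writeable mounts, so something like this (device
name is a placeholder) might still work:

    mount -o degraded,ro /dev/sdb /mnt

No promises, but if it succeeds you can at least copy the data off
before doing anything else.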


> So I could use some help with cleaning up this mess.  All the data is
> there, so I need to know how to either force it to mount degraded, or
> add and remove devices offline.  Where do I begin?

You can try asking on IRC. I have no ideas for this scenario; I've
tried and failed with it myself. In my case the data was throwaway, so
I just recreated the filesystem. What should still be possible is
pulling files out with btrfs restore.
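
A minimal sketch of a restore, with placeholder paths; btrfs restore
reads the unmounted devices directly and copies files out to some
other filesystem:

    btrfs restore -D -v /dev/sdb /mnt/recovery   # -D = dry run, just list files
    btrfs restore -i -v /dev/sdb /mnt/recovery   # -i = ignore errors and continue

The destination needs enough free space for everything you want back.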


> Also, doesn't it seem a bit arbitrary that there are "too many
> missing," when all of the data is there?  If I understand correctly,
> all four drives in my RAID1 should all have copies of the metadata,

No, that's not correct. RAID 1 in Btrfs means two copies of metadata;
in a 4 device RAID 1 that's still only two copies. It is not n-way
RAID 1.
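
To illustrate with a hypothetical chunk layout across four devices,
where each chunk is mirrored on exactly two of them:

    chunk 1: devA devC
    chunk 2: devB devD
    chunk 3: devA devB

Losing any two devices can destroy both copies of some chunk, which is
why a 4 device RAID 1 still tolerates only one failure.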

But that doesn't matter here. The problem is that Btrfs has a narrow
view of the volume: it assumes, without context, that once the number
of devices falls below the minimum, the volume can't be mounted. In
reality an exception exists when the failed device is the target of an
in-progress rebuild of a missing drive. That drive failing should
leave the volume no worse off than before, but Btrfs doesn't know
that.

Pretty sure about that anyway.


> and of the remaining three good drives, there should be one or two
> copies of every data block.  So it's all there, but btrfs has decided,
> based on the NUMBER of missing devices, that it won't mount.
> Shouldn't it refuse to mount only if it knows there is data missing?
> For that matter, why should it even refuse in that case?  Some data
> might be missing, so it should throw some errors if you try to
> access that missing data.  Right?

I think no data is missing, no metadata is missing, and Btrfs is
confused and stuck in this case.
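
If you want to sanity check that without risking anything, btrfs check
is read-only in its default mode (device name is a placeholder, and
I'm not certain how it copes with devices missing):

    btrfs check /dev/sdb

It won't write to the devices unless you explicitly pass --repair.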

-- 
Chris Murphy