I'm not sure my situation is quite like the one you linked, so here's
my bug report:

https://bugzilla.kernel.org/show_bug.cgi?id=102881

On Fri, Aug 14, 2015 at 2:44 PM, Chris Murphy <li...@colorremedies.com> wrote:
> On Fri, Aug 14, 2015 at 12:12 PM, Timothy Normand Miller
> <theo...@gmail.com> wrote:
>> Sorry about that empty email.  I hit a wrong key, and Gmail decided to send it.
>>
>> Anyhow, my replacement drive is going to arrive this evening, and I
>> need to know how to add it to my btrfs array.  Here's the situation:
>>
>> - I had a drive fail, so I removed it and mounted degraded.
>> - I hooked up a replacement drive, did an "add" on that one, and did a
>> "delete missing" (roughly the commands sketched below).
>> - During the rebalance, the replacement drive failed, there were OOPSes, etc.
>> - Now, although all of my data is there, I can't mount degraded,
>> because btrfs is complaining that too many devices are missing (3 are
>> there, but it sees 2 missing).
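>>
>> In case it helps, this is roughly what I ran (sdX and sdY stand in for
>> my actual devices, /mnt for the mount point):
>>
>>   # mount the surviving members writable, in degraded mode
>>   mount -o degraded /dev/sdX /mnt
>>   # add the replacement drive to the filesystem
>>   btrfs device add /dev/sdY /mnt
>>   # drop the failed (missing) device; this starts the rebalance that
>>   # was still running when the replacement drive died
>>   btrfs device delete missing /mnt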
>
> It might be related to this (long) bug:
> https://bugzilla.kernel.org/show_bug.cgi?id=92641
>
> While Btrfs RAID 1 can tolerate only a single device failure, what you
> have is an in-progress rebuild onto a replacement for a device that was
> already missing. If that replacement fails, the volume should be no
> worse off than it was before the rebuild started. But Btrfs doesn't see
> it that way: it counts two separate missing devices, decides that too
> many devices are missing, and refuses to proceed. And there's no
> mechanism to remove missing devices unless you can mount rw. So it's
> stuck.
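>
> Concretely, the step that's blocked looks something like this (device
> and mount point names are placeholders):
>
>   # needs a writable mount, which is exactly what's being refused
>   mount -o degraded /dev/sdX /mnt
>   # only then could the phantom missing device be dropped
>   btrfs device delete missing /mnt
>
> and the degraded rw mount is rejected because Btrfs counts two missing
> devices against the single failure RAID 1 can tolerate.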
>
>
>> So I could use some help with cleaning up this mess.  All the data is
>> there, so I need to know how to either force it to mount degraded, or
>> add and remove devices offline.  Where do I begin?
>
> You can try asking on IRC. I have no ideas for this scenario; I've
> tried and failed myself. In my case the filesystem was throwaway. What
> should still be possible is pulling the data off with btrfs restore.
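>
> Something like this, as a sketch (device and destination paths are
> placeholders; restore works on the unmounted filesystem):
>
>   mkdir -p /mnt/recovery
>   # point it at any surviving member device; -v is verbose,
>   # -i keeps going past errors
>   btrfs restore -v -i /dev/sdX /mnt/recovery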
>
>
>> Also, doesn't it seem a bit arbitrary that there are "too many
>> missing," when all of the data is there?  If I understand correctly,
>> all four drives in my RAID 1 should have copies of the metadata,
>
> No, that's not correct. RAID 1 means two copies of metadata. In a
> four-device RAID 1 that's still only two copies. It is not n-way RAID 1.
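>
> For illustration (hypothetical device names): a four-device RAID 1
> filesystem is created like this, and every chunk is still mirrored on
> exactly two of the four devices:
>
>   mkfs.btrfs -m raid1 -d raid1 /dev/sda /dev/sdb /dev/sdc /dev/sdd
>   # after mounting, 'btrfs filesystem df <mountpoint>' reports the
>   # raid1 profile for data and metadata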
>
> But that doesn't matter here. The problem is that Btrfs has a narrow
> view of the volume: it assumes, without any context, that once the
> number of devices drops below the minimum, the volume can't be mounted.
> In reality, an exception should exist when the failed device is the
> target of an in-progress rebuild for a drive that was already missing.
> That device failing should leave the volume no worse off than before,
> but Btrfs doesn't know that.
>
> Pretty sure about that, anyway.
>
>
>> and on the remaining three good drives there should be one or two
>> copies of every data block.  So it's all there, but btrfs has decided,
>> based on the NUMBER of missing devices, that it won't mount.
>> Shouldn't it refuse to mount only if it knows data is actually
>> missing?  For that matter, why should it even refuse in that case?  If
>> some data were missing, it could just throw errors when you try to
>> access that missing data.  Right?
>
> I think no data is missing, no metadata is missing, and Btrfs is
> confused and stuck in this case.
>
> --
> Chris Murphy



-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project