Re: Can't mount degraded. How to remove/add drives OFFLINE?
I'm not sure my situation is quite like the one you linked, so here's my bug report: https://bugzilla.kernel.org/show_bug.cgi?id=102881

On Fri, Aug 14, 2015 at 2:44 PM, Chris Murphy li...@colorremedies.com wrote:
> On Fri, Aug 14, 2015 at 12:12 PM, Timothy Normand Miller theo...@gmail.com wrote:
>> Sorry about that empty email. I hit a wrong key, and gmail decided to send. Anyhow, my replacement drive is going to arrive this evening, and I need to know how to add it to my btrfs array. Here's the situation:
>> - I had a drive fail, so I removed it and mounted degraded.
>> - I hooked up a replacement drive, did an add on that one, and did a delete missing.
>> - During the rebalance, the replacement drive failed, there were OOPSes, etc.
>> - Now, although all of my data is there, I can't mount degraded, because btrfs is complaining that too many devices are missing (3 are there, but it sees 2 missing).
>
> It might be related to this (long) bug: https://bugzilla.kernel.org/show_bug.cgi?id=92641
>
> While Btrfs RAID 1 can tolerate only a single device failure, what you have is an in-progress rebuild of a missing device. If that device becomes missing, the volume should be no worse off than it was before. But Btrfs doesn't see it this way; instead it sees this as two separate missing devices, decides too many devices are missing, and refuses to proceed. And there's no mechanism to remove missing devices unless you can mount rw. So it's stuck.
>
>> So I could use some help with cleaning up this mess. All the data is there, so I need to know how to either force it to mount degraded, or add and remove devices offline. Where do I begin?
>
> You can try to ask on IRC. I have no ideas for this scenario; I've tried and failed. My case was a throwaway; what should still be possible is using btrfs restore.
>
>> Also, doesn't it seem a bit arbitrary that there are too many missing, when all of the data is there? If I understand correctly, all four drives in my RAID1 should all have copies of the metadata,
>
> No, that's not correct. RAID 1 means 2 copies of metadata. In a 4-device RAID 1 that's still only 2 copies. It is not n-way RAID 1.
>
> But that doesn't matter here. The problem is that Btrfs has a narrow idea of the volume: it assumes, without context, that once the number of devices is below the minimum, the volume can't be mounted. In reality, an exception exists when the failure is an in-progress rebuild of a missing drive. That drive failing should mean the volume is no worse off than before, but Btrfs doesn't know that. Pretty sure about that, anyway.
>
>> and of the remaining three good drives, there should be one or two copies of every data block. So it's all there, but btrfs has decided, based on the NUMBER of missing devices, that it won't mount. Shouldn't it refuse to mount only if it knows there is data missing? For that matter, why should it even refuse in that case? If some data is missing, it should throw some errors when you try to access that missing data. Right?
>
> I think no data is missing, no metadata is missing, and Btrfs is confused and stuck in this case.
>
> --
> Chris Murphy

--
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
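Chris's `btrfs restore` suggestion works on an unmounted (even unmountable) volume and never writes to the source device. A minimal sketch, assuming the surviving device is /dev/sdb and /mnt/recovery is an empty directory on another filesystem (both paths are placeholders, not from the thread); with DRY_RUN=1 it only prints the commands:

```shell
#!/bin/sh
# Sketch of offline recovery with `btrfs restore`.
# /dev/sdb and /mnt/recovery are assumed placeholder paths.
DRY_RUN=1
CMDS=""
run() {
  CMDS="${CMDS}$*;"            # record each command for inspection
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"                # dry run: print instead of executing
  else
    "$@"
  fi
}

# Sanity-check that the superblock is readable before attempting recovery.
run btrfs inspect-internal dump-super /dev/sdb

# Copy files out read-only: -v for verbose output, -i to ignore errors
# and continue past damaged extents. Nothing on /dev/sdb is modified.
run btrfs restore -v -i /dev/sdb /mnt/recovery
```

Set DRY_RUN=0 to actually execute; since restore only reads the source device, it is safe to try before any more invasive repair.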
Re: Can't mount degraded. How to remove/add drives OFFLINE?
On Fri, Aug 14, 2015 at 1:03 PM, Timothy Normand Miller theo...@gmail.com wrote:
> I'm not sure my situation is quite like the one you linked, so here's my bug report: https://bugzilla.kernel.org/show_bug.cgi?id=102881

I can easily reproduce this with just a 2-device RAID 1. I updated the bug. It's best that these are separate bugs, but I think the underlying problems are related.

The workaround is to mount -o ro,degraded, and then move the data to a new Btrfs volume with btrfs send/receive, or with a conventional copy for data that's not already in a read-only snapshot.

--
Chris Murphy
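A sketch of that workaround, under stated assumptions: /dev/sdb is a member of the broken volume, /dev/sdc holds a fresh Btrfs volume, and @snap is an existing read-only snapshot (all placeholders). DRY_RUN=1 only prints the commands:

```shell
#!/bin/sh
# Sketch of the ro,degraded escape hatch: mount read-only, copy data off.
# Device names, mount points, and the @snap snapshot name are assumptions.
DRY_RUN=1
CMDS=""
run() {
  CMDS="${CMDS}$*;"            # record each command for inspection
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# A read-only degraded mount is accepted where rw,degraded is refused.
run mount -o ro,degraded /dev/sdb /mnt/old
run mount /dev/sdc /mnt/new

# Data already in a read-only snapshot can be replicated with send/receive.
run sh -c 'btrfs send /mnt/old/@snap | btrfs receive /mnt/new'

# Everything else gets a conventional copy (the source is read-only anyway).
run rsync -aHAX /mnt/old/live/ /mnt/new/live/
```

Since the source is mounted read-only, new read-only snapshots cannot be created on it; that is why data outside an existing snapshot needs the plain copy path.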
Re: Can't mount degraded. How to remove/add drives OFFLINE?
On Fri, Aug 14, 2015 at 12:12 PM, Timothy Normand Miller theo...@gmail.com wrote:
> Sorry about that empty email. I hit a wrong key, and gmail decided to send. Anyhow, my replacement drive is going to arrive this evening, and I need to know how to add it to my btrfs array. Here's the situation:
> - I had a drive fail, so I removed it and mounted degraded.
> - I hooked up a replacement drive, did an add on that one, and did a delete missing.
> - During the rebalance, the replacement drive failed, there were OOPSes, etc.
> - Now, although all of my data is there, I can't mount degraded, because btrfs is complaining that too many devices are missing (3 are there, but it sees 2 missing).

It might be related to this (long) bug: https://bugzilla.kernel.org/show_bug.cgi?id=92641

While Btrfs RAID 1 can tolerate only a single device failure, what you have is an in-progress rebuild of a missing device. If that device becomes missing, the volume should be no worse off than it was before. But Btrfs doesn't see it this way; instead it sees this as two separate missing devices, decides too many devices are missing, and refuses to proceed. And there's no mechanism to remove missing devices unless you can mount rw. So it's stuck.

> So I could use some help with cleaning up this mess. All the data is there, so I need to know how to either force it to mount degraded, or add and remove devices offline. Where do I begin?

You can try to ask on IRC. I have no ideas for this scenario; I've tried and failed. My case was a throwaway; what should still be possible is using btrfs restore.

> Also, doesn't it seem a bit arbitrary that there are too many missing, when all of the data is there? If I understand correctly, all four drives in my RAID1 should all have copies of the metadata,

No, that's not correct. RAID 1 means 2 copies of metadata. In a 4-device RAID 1 that's still only 2 copies. It is not n-way RAID 1.

But that doesn't matter here. The problem is that Btrfs has a narrow idea of the volume: it assumes, without context, that once the number of devices is below the minimum, the volume can't be mounted. In reality, an exception exists when the failure is an in-progress rebuild of a missing drive. That drive failing should mean the volume is no worse off than before, but Btrfs doesn't know that. Pretty sure about that, anyway.

> and of the remaining three good drives, there should be one or two copies of every data block. So it's all there, but btrfs has decided, based on the NUMBER of missing devices, that it won't mount. Shouldn't it refuse to mount only if it knows there is data missing? For that matter, why should it even refuse in that case? If some data is missing, it should throw some errors when you try to access that missing data. Right?

I think no data is missing, no metadata is missing, and Btrfs is confused and stuck in this case.

--
Chris Murphy
Can't mount degraded. How to remove/add drives OFFLINE?
My

--
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
Can't mount degraded. How to remove/add drives OFFLINE?
Sorry about that empty email. I hit a wrong key, and gmail decided to send.

Anyhow, my replacement drive is going to arrive this evening, and I need to know how to add it to my btrfs array. Here's the situation:

- I had a drive fail, so I removed it and mounted degraded.
- I hooked up a replacement drive, did an add on that one, and did a delete missing.
- During the rebalance, the replacement drive failed, there were OOPSes, etc.
- Now, although all of my data is there, I can't mount degraded, because btrfs is complaining that too many devices are missing (3 are there, but it sees 2 missing).

So I could use some help with cleaning up this mess. All the data is there, so I need to know how to either force it to mount degraded, or add and remove devices offline. Where do I begin?

Also, doesn't it seem a bit arbitrary that there are too many missing, when all of the data is there? If I understand correctly, all four drives in my RAID1 should all have copies of the metadata, and of the remaining three good drives, there should be one or two copies of every data block. So it's all there, but btrfs has decided, based on the NUMBER of missing devices, that it won't mount. Shouldn't it refuse to mount only if it knows there is data missing? For that matter, why should it even refuse in that case? If some data is missing, it should throw some errors when you try to access that missing data. Right?

Thanks!

--
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
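For reference, the add-then-delete-missing sequence described in those bullets normally looks like the sketch below. /dev/sdb (a surviving member) and /dev/sde (the replacement) are placeholder names; DRY_RUN=1 only prints the commands:

```shell
#!/bin/sh
# Sketch of the normal single-drive replacement flow on Btrfs RAID 1.
# /dev/sdb (surviving member) and /dev/sde (replacement) are placeholders.
DRY_RUN=1
CMDS=""
run() {
  CMDS="${CMDS}$*;"            # record each command for inspection
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# With only one device missing, RAID 1 still permits a degraded rw mount.
run mount -o degraded /dev/sdb /mnt

# Add the replacement, then delete the missing device. "delete missing"
# is what triggers the relocation that re-replicates chunks onto the
# new drive; it is during that relocation that the replacement failed here.
run btrfs device add /dev/sde /mnt
run btrfs device delete missing /mnt
```

It was the replacement failing mid-relocation that left the volume counting two missing devices.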
Re: Can't mount degraded. How to remove/add drives OFFLINE?
On 08/15/2015 02:12 AM, Timothy Normand Miller wrote:
> Sorry about that empty email. I hit a wrong key, and gmail decided to send. Anyhow, my replacement drive is going to arrive this evening, and I need to know how to add it to my btrfs array. Here's the situation:
> - I had a drive fail, so I removed it and mounted degraded.

That bit is dangerous to do without the patch below; the patch has more details on why.

> - I hooked up a replacement drive, did an add on that one, and did a delete missing.
> - During the rebalance, the replacement drive failed, there were OOPSes, etc.
> - Now, although all of my data is there, I can't mount degraded, because btrfs is complaining that too many devices are missing (3 are there, but it sees 2 missing).

This is addressed in the patch [PATCH 23/23] Btrfs: allow -o rw,degraded for single group profile

Thanks, Anand

> So I could use some help with cleaning up this mess. All the data is there, so I need to know how to either force it to mount degraded, or add and remove devices offline. Where do I begin?
>
> Also, doesn't it seem a bit arbitrary that there are too many missing, when all of the data is there? If I understand correctly, all four drives in my RAID1 should all have copies of the metadata, and of the remaining three good drives, there should be one or two copies of every data block. So it's all there, but btrfs has decided, based on the NUMBER of missing devices, that it won't mount. Shouldn't it refuse to mount only if it knows there is data missing? For that matter, why should it even refuse in that case? If some data is missing, it should throw some errors when you try to access that missing data. Right?
>
> Thanks!
Re: Can't mount degraded. How to remove/add drives OFFLINE?
> Just to be clear, I removed the drive (the original failed drive) when the power was off, then powered up, and then mounted degraded. That's not dangerous that I know of.

The patch has details; please refer to it.

> Where is this patch, and what kernel versions can this be applied to?

https://patchwork.kernel.org/patch/7014141/

It's on 4.3, but it should apply cleanly on kernels below that.

Thanks, Anand
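A sketch of fetching that patch from patchwork and applying it to a local kernel source tree. The `/mbox/` URL suffix and the `linux` tree path are assumptions, not from the thread; DRY_RUN=1 only prints the commands:

```shell
#!/bin/sh
# Sketch: download the patchwork patch and apply it to a kernel tree.
# The mbox URL suffix and the linux/ path are assumed placeholders.
DRY_RUN=1
CMDS=""
run() {
  CMDS="${CMDS}$*;"            # record each command for inspection
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Patchwork serves patches in mbox form suitable for git am.
run wget -O degraded.mbox https://patchwork.kernel.org/patch/7014141/mbox/

# Apply to the source tree; `patch -p1 < degraded.mbox` works too if the
# tree is not a git checkout. Rebuild and boot the patched kernel after.
run git -C linux am degraded.mbox
```

If the patch does not apply cleanly to an older tree such as 4.1.x, `git am --abort` and falling back to `patch -p1` with manual fixups is the usual route.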
Re: Can't mount degraded. How to remove/add drives OFFLINE?
On Fri, Aug 14, 2015 at 7:49 PM, Anand Jain anand.j...@oracle.com wrote:
>> - I had a drive fail, so I removed it and mounted degraded.
>
> That bit is dangerous to do without the patch below; the patch has more details on why.

Just to be clear, I removed the drive (the original failed drive) when the power was off, then powered up, and then mounted degraded. That's not dangerous that I know of.

>> - I hooked up a replacement drive, did an add on that one, and did a delete missing.
>> - During the rebalance, the replacement drive failed, there were OOPSes, etc.
>> - Now, although all of my data is there, I can't mount degraded, because btrfs is complaining that too many devices are missing (3 are there, but it sees 2 missing).
>
> This is addressed in the patch [PATCH 23/23] Btrfs: allow -o rw,degraded for single group profile

Where is this patch, and what kernel versions can this be applied to?

--
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
Re: Can't mount degraded. How to remove/add drives OFFLINE?
I applied that patch to my 4.1.4, it mounted degraded, and now it's balancing to the new drive. Thanks for all the help!

On Fri, Aug 14, 2015 at 8:28 PM, Anand Jain anand.j...@oracle.com wrote:
>> Just to be clear, I removed the drive (the original failed drive) when the power was off, then powered up, and then mounted degraded. That's not dangerous that I know of.
>
> The patch has details; please refer to it.
>
>> Where is this patch, and what kernel versions can this be applied to?
>
> https://patchwork.kernel.org/patch/7014141/
>
> It's on 4.3, but it should apply cleanly on kernels below that.
>
> Thanks, Anand

--
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
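For anyone watching a rebuild like this complete, a few read-only status commands are handy. A sketch, with /mnt as a placeholder mount point; DRY_RUN=1 only prints the commands:

```shell
#!/bin/sh
# Sketch: read-only commands for monitoring a running rebalance.
# The /mnt mount point is a placeholder.
DRY_RUN=1
CMDS=""
run() {
  CMDS="${CMDS}$*;"            # record each command for inspection
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Progress of the running balance (chunks considered vs. relocated).
run btrfs balance status /mnt

# Per-device membership and allocation; the new drive's usage should
# grow as chunks are re-replicated onto it.
run btrfs filesystem show /mnt
run btrfs filesystem df /mnt
```

None of these commands modify the volume, so they are safe to run repeatedly while the balance proceeds.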
Re: Can't mount degraded. How to remove/add drives OFFLINE?
I thought for a second that maybe the problem is due to the phantom single chunk(s) created at mkfs time. I redid the test and did a balance to get rid of the single chunk, right after populating the volume with some data. But the problem still happens.

--
Chris Murphy