Timothy Normand Miller wrote on 2015/08/18 22:55 -0400:
On Tue, Aug 18, 2015 at 10:48 PM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
Timothy Normand Miller wrote on 2015/08/18 22:46 -0400:
On Tue, Aug 18, 2015 at 9:32 PM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
Hi Timothy,
Although I have already replied on the bugzilla, IMHO it's more appropriate to
discuss this on the mailing list, as it's not a kernel bug.
All four devices were online. The "missing" one was a drive that
died, which was replaced by a new one, but btrfs wouldn't finish the
deletion of the missing device.
By "replaced", did you mean that you used "btrfs replace", or did you just swap
the physical disk without using "btrfs replace"?
Here's what happened:
- A drive started throwing bad sectors. Somehow this caused metadata
on other drives to get messed up.
Did that cause any huge damage?
- I took that drive offline and mounted degraded (it's a 4-drive RAID1)
- I did a "btrfs device add" on a new drive and then a "btrfs device delete missing"
- The replacement drive failed during the replacement operation, and
everything went to crap.
- With some help, I got a kernel patch that allowed me to mount the
original three drives with TWO missing devices.
So the three remaining original drives are still OK, the original bad one is
missing, and the newly added one is also missing?
That sounds quite repairable.
- I added a brand new drive and then did "delete missing" again. This
time, the first "delete missing" was successful, but it didn't fully
balance the drives, and there was another missing device, so I had to
do a "delete missing" again, and that failed.
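For reference, the add-then-delete sequence described above corresponds roughly to the commands below. This is only a sketch: the device names and the mount point are placeholders, not the ones actually used here, and the "run" wrapper is a hypothetical helper that just prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: every path below is a placeholder, not taken from this thread.
OLD_DEV="${OLD_DEV:-/dev/sdX}"   # one surviving member of the RAID1 (placeholder)
NEW_DEV="${NEW_DEV:-/dev/sdY}"   # the replacement drive (placeholder)
MNT="${MNT:-/mnt/array}"         # mount point (placeholder)

# Hypothetical helper: print the command by default,
# and only execute it when DRY_RUN=0 is set (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recover() {
    # Mount the filesystem with the failed drive absent.
    run mount -o degraded "$OLD_DEV" "$MNT"
    # Add the new drive to the mounted filesystem.
    run btrfs device add "$NEW_DEV" "$MNT"
    # Drop the absent device; its data is re-created on the remaining drives.
    run btrfs device delete missing "$MNT"
}

recover
```

With the defaults nothing is executed; the three commands are only printed so they can be reviewed first.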
I wanted to get this back online and restored from a backup, but I was
willing to keep it this way if people wanted to probe at it, in case we
can uncover any btrfs bugs. So it was suggested to get a metadata
image, but that ran into some kind of bug in btrfs-image.
If btrfs-image doesn't work, you can also try btrfs-debug-tree.
IIRC, debug-tree should be more robust than btrfs-image.
BTW, have you tried btrfsck on it? Does it also cause the infinite loop?
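Something along these lines should do, as a rough sketch. The device path is a placeholder for any member device of the filesystem, and the "run" wrapper is a hypothetical helper so that nothing is executed unless DRY_RUN=0 is set. Both tools should be run against the unmounted filesystem.

```shell
#!/bin/sh
# Sketch only: DEV is a placeholder, not a device named in this thread.
DEV="${DEV:-/dev/sdX}"   # any one member device of the filesystem (placeholder)

# Hypothetical helper: print the command by default,
# and only execute it when DRY_RUN=0 is set (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

diagnose() {
    # Dump the tree metadata as text on stdout; redirect it to a file to keep it.
    run btrfs-debug-tree "$DEV"
    # Check the filesystem; without --repair this does not modify anything.
    run btrfsck "$DEV"
}

diagnose
```

If btrfsck hangs in the same way, the point where its output stops would already be a useful hint.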
I'll also try to reproduce it and investigate the code directly.
Thanks,
Qu
Currently, I'm restoring from backup, but I have at least a partial
metadata dump.