When converting from RAID1 to RAID5 I've been getting stripes that contain only 1G regardless of how wide the stripe is. So when doing a large convert I've had to limit the number of block groups converted per pass, then balance the target profile, and repeat until finished.

Has anyone else seen similar?
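A rough sketch of that limit-and-repeat loop, assuming the filesystem is mounted at /mnt and a RAID1 -> RAID5 data convert; the 50-chunk limit per pass, the mount point, and the termination check on 'btrfs filesystem df' output are only illustrative, not something settled in this thread:

    # Convert up to 50 data block groups per pass; 'soft' skips block
    # groups that are already RAID5. Then re-balance the RAID5 chunks
    # so narrow ~1G stripes can be rewritten at a wider stripe width.
    while btrfs filesystem df /mnt | grep -q 'Data, RAID1'; do
        btrfs balance start -dconvert=raid5,soft,limit=50 /mnt
        btrfs balance start -dprofiles=raid5 /mnt
    done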
On Wed, Mar 2, 2016 at 1:13 AM, Dan Blazejewski <dan.blazejew...@gmail.com> wrote:
> Hey all,
>
> Just wanted to follow up with this for anyone experiencing the same
> issue.
>
> First, I tried Qu's suggestion of re-balancing to single, then
> re-balancing to RAID 6. I noticed when I completed the conversion to
> single that a few drives didn't receive an identical amount of data.
> Balancing back to RAID 6 didn't totally work either. It definitely
> made it better, but I still had multiple stripes of varying widths.
> IIRC, I had one ~1.7TB stripe that went across all 7 drives, and then
> a collection of stripes ranging from 2 to 5 drives wide and from 30GB
> to 1TB in size. The majority of the data was striped across all 7,
> but I was concerned that as I added data, I'd run into the same
> situation as before.
>
> This process took quite a long time, as you all expected: about 11
> days for RAID 6 -> single -> RAID 6. Patience is a virtue with large
> arrays.
>
> Henk, for some reason I didn't receive the email suggesting the
> -dstripes= filter until I was well into the conversion to single.
> Once I finished the RAID 6 -> single -> RAID 6, I attempted your
> method. I'm happy to say that it worked, using -dstripes="1..6". This
> only took about 30 hours, as most of the data was already striped
> correctly. When it finished, I was left with one RAID 6 profile,
> about 2.50 TB striped across all 7 drives. As I understand it,
> running a balance with the -dstripes="1..$drivecount-1" filter forces
> BTRFS to balance only the chunks that are not striped evenly across
> all drives. I will definitely have to keep this trick in mind in the
> future.
>
> A side note: I'm happy with how robust BTRFS is becoming. I had a
> sustained power outage while I wasn't home that resulted in an
> unclean shutdown in the middle of the balance. (I had previously
> disconnected my UPS's USB connector to move the server to a different
> room and forgot to reconnect it. Doh!) When power was restored, it
> started right back up where it left off, with no corruption or data
> loss. I have backups, but I wasn't looking forward to the idea of
> restoring 11 TB of data.
>
> Thank you, everyone, for your help, and thank you for putting all
> this work into BTRFS. Your efforts are truly appreciated.
>
> Regards,
> Dan
>
> On Thu, Feb 18, 2016 at 8:36 PM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
>>
>> Henk Slager wrote on 2016/02/19 00:27 +0100:
>>>
>>> On Thu, Feb 18, 2016 at 3:03 AM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
>>>>
>>>> Dan Blazejewski wrote on 2016/02/17 18:04 -0500:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I upgraded my kernel to 4.4.2 and btrfs-progs to 4.4. I also added
>>>>> another 4TB disk and kicked off a full balance (currently 7x4TB
>>>>> RAID6). I'm interested to see what an additional drive will do to
>>>>> this. I'll also have to wait and see whether a full system balance
>>>>> on a newer version of the BTRFS tools does the trick or not.
>>>>>
>>>>> I also noticed that "btrfs device usage" shows multiple entries
>>>>> for Data,RAID6 on some drives. Is this normal? Please note that
>>>>> /dev/sdh is the new disk, and I only just started the balance.
>>>>>
>>>>> # btrfs dev usage /mnt/data
>>>>> /dev/sda, ID: 5
>>>>>    Device size:     3.64TiB
>>>>>    Data,RAID6:      1.43TiB
>>>>>    Data,RAID6:      1.48TiB
>>>>>    Data,RAID6:    320.00KiB
>>>>>    Metadata,RAID6:  2.55GiB
>>>>>    Metadata,RAID6:  1.50GiB
>>>>>    System,RAID6:   16.00MiB
>>>>>    Unallocated:   733.67GiB
>>>>>
>>>>> /dev/sdb, ID: 6
>>>>>    Device size:     3.64TiB
>>>>>    Data,RAID6:      1.48TiB
>>>>>    Data,RAID6:    320.00KiB
>>>>>    Metadata,RAID6:  1.50GiB
>>>>>    System,RAID6:   16.00MiB
>>>>>    Unallocated:     2.15TiB
>>>>>
>>>>> /dev/sdc, ID: 7
>>>>>    Device size:     3.64TiB
>>>>>    Data,RAID6:      1.43TiB
>>>>>    Data,RAID6:    732.69GiB
>>>>>    Data,RAID6:      1.48TiB
>>>>>    Data,RAID6:    320.00KiB
>>>>>    Metadata,RAID6:  2.55GiB
>>>>>    Metadata,RAID6: 982.00MiB
>>>>>    Metadata,RAID6:  1.50GiB
>>>>>    System,RAID6:   16.00MiB
>>>>>    Unallocated:    25.21MiB
>>>>>
>>>>> /dev/sdd, ID: 1
>>>>>    Device size:     3.64TiB
>>>>>    Data,RAID6:      1.43TiB
>>>>>    Data,RAID6:    732.69GiB
>>>>>    Data,RAID6:      1.48TiB
>>>>>    Data,RAID6:    320.00KiB
>>>>>    Metadata,RAID6:  2.55GiB
>>>>>    Metadata,RAID6: 982.00MiB
>>>>>    Metadata,RAID6:  1.50GiB
>>>>>    System,RAID6:   16.00MiB
>>>>>    Unallocated:    25.21MiB
>>>>>
>>>>> /dev/sdf, ID: 3
>>>>>    Device size:     3.64TiB
>>>>>    Data,RAID6:      1.43TiB
>>>>>    Data,RAID6:    732.69GiB
>>>>>    Data,RAID6:      1.48TiB
>>>>>    Data,RAID6:    320.00KiB
>>>>>    Metadata,RAID6:  2.55GiB
>>>>>    Metadata,RAID6: 982.00MiB
>>>>>    Metadata,RAID6:  1.50GiB
>>>>>    System,RAID6:   16.00MiB
>>>>>    Unallocated:    25.21MiB
>>>>>
>>>>> /dev/sdg, ID: 2
>>>>>    Device size:     3.64TiB
>>>>>    Data,RAID6:      1.43TiB
>>>>>    Data,RAID6:    732.69GiB
>>>>>    Data,RAID6:      1.48TiB
>>>>>    Data,RAID6:    320.00KiB
>>>>>    Metadata,RAID6:  2.55GiB
>>>>>    Metadata,RAID6: 982.00MiB
>>>>>    Metadata,RAID6:  1.50GiB
>>>>>    System,RAID6:   16.00MiB
>>>>>    Unallocated:    25.21MiB
>>>>>
>>>>> /dev/sdh, ID: 8
>>>>>    Device size:     3.64TiB
>>>>>    Data,RAID6:    320.00KiB
>>>>>    Unallocated:     3.64TiB
>>>>>
>>>>
>>>> Not sure how that multiple chunk type shows up.
>>>> Maybe all these RAID6 entries have different numbers of stripes?
>>>
>>> Indeed, it's 4 different sets of stripe widths, i.e. how many drives
>>> the data is striped across. Someone suggested indicating this in the
>>> output of the btrfs dev usage command some time ago.
>>>
>>> The fs has only the RAID6 profile and I am not fully sure the
>>> 'Unallocated' numbers are correct (on RAID10 they are 2x too high
>>> with unpatched v4.4 progs), but anyhow the lower devids are way too
>>> full.
>>>
>>> From the size, one can derive how many devices (the stripe width):
>>> 732.69GiB = 4, 1.43TiB = 5, 1.48TiB = 6, 320.00KiB = 7
>>>
>>>>> Qu, in regards to your question, I ran RAID 1 on multiple disks of
>>>>> different sizes. I believe I had a mix of 2x4TB, 1x2TB, and 1x3TB
>>>>> drives. I replaced the 2TB drive first with a 4TB and balanced.
>>>>> Later on, I replaced the 3TB drive with another 4TB and balanced,
>>>>> yielding an array of 4x4TB RAID1. A little while later, I wound up
>>>>> sticking a fifth 4TB drive in and converting to RAID6. The sixth
>>>>> 4TB drive was added some time after that. The seventh was added
>>>>> just a few minutes ago.
>>>>
>>>> Personally speaking, I just came up with one method to balance all
>>>> these disks, and in fact you don't need to add a disk:
>>>>
>>>> 1) Balance all data chunks to the single profile
>>>> 2) Balance all metadata chunks to the single or RAID1 profile
>>>> 3) Balance all data chunks back to the RAID6 profile
>>>> 4) Balance all metadata chunks back to the RAID6 profile
>>>>
>>>> The system chunk is so small that normally you don't need to bother
>>>> with it.
>>>>
>>>> The trick is that, as single is the most flexible chunk type, it
>>>> only needs one disk with unallocated space, and the btrfs chunk
>>>> allocator will allocate chunks to the device with the most
>>>> unallocated space.
>>>>
>>>> So after 1) and 2) you should find that chunk allocation is almost
>>>> perfectly balanced across all devices, as long as they are the same
>>>> size.
>>>>
>>>> Now you have a balanced base layout for the RAID6 allocation. That
>>>> should make things go quite smoothly and result in a balanced RAID6
>>>> chunk layout.
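Roughly, that sequence would look something like the following, assuming
the filesystem is mounted at /mnt/data; system chunks are left alone, per
the note above, and each convert can of course take a long time on a
large array:

    # Steps 1) and 2): convert data and metadata chunks to single
    # (metadata could equally be converted to raid1, as suggested).
    btrfs balance start -dconvert=single -mconvert=single /mnt/data

    # Steps 3) and 4): convert data and metadata chunks back to RAID6.
    btrfs balance start -dconvert=raid6 -mconvert=raid6 /mnt/data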
>>>
>>> This is a good trick to get out of the 'RAID6 full' situation. I
>>> have done some RAID5 tests on 100G VM disks with kernel/tools
>>> 4.5-rcX/v4.4, and various balance starts, cancels, profile converts
>>> etc. worked surprisingly well, compared to my experience a year back
>>> with RAID5 (hitting bugs, crashes).
>>>
>>> A full RAID6 balance with this setup might be very slow, even if the
>>> fs were not so full. The VMs I use are on a mixed SSD/HDD (bcache'd)
>>> array, so balancing within the last GB(s), i.e. with almost no
>>> workspace, still makes progress. But on HDD only, things can take a
>>> very long time. The 'Unallocated' space on devid 1 should be at
>>> least a few GiB, otherwise rebalancing will be very slow or just not
>>> work.
>>
>> That's true, the rebalance of all chunks will be quite slow. I just
>> hope the OP won't encounter anything super slow.
>>
>> BTW, the 'unallocated' space can be on any device, as btrfs chooses
>> devices in order of unallocated space when allocating a new chunk. In
>> the OP's case, the balance itself should continue without much
>> problem as several devices have a lot of unallocated space.
>>
>>> The route RAID6 -> single/RAID1 -> RAID6 might also be more
>>> acceptable w.r.t. total speed. Just watch the progress, I would say.
>>> Maybe it's not needed to do a full convert; just make sure you will
>>> have enough workspace before starting a convert from single/RAID1
>>> back to RAID6.
>>>
>>> With v4.4 tools, you can do a filtered balance based on stripe
>>> width, which avoids a complete re-balance of block groups that are
>>> already allocated across the right number of devices.
>>>
>>> In this case, to avoid re-balancing the '320.00KiB group' (which in
>>> the meantime could have grown much larger), you could do this:
>>>
>>> btrfs balance start -v -dstripes=1..6 /mnt/data
>>
>> Super brilliant idea!!!
>>
>> I didn't realize that's the silver bullet for such a use case.
>>
>> BTW, can the stripes option be used with convert? IMHO we still need
>> to use single as a temporary state for those not fully allocated
>> RAID6 chunks, or we won't be able to allocate new RAID6 chunks with
>> full stripes.
>>
>> Thanks,
>> Qu

--
Gareth Pye - blog.cerberos.id.au
Level 2 MTG Judge, Melbourne, Australia
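P.S. As a postscript to the -dstripes trick above, a minimal sketch of
generalizing it to any number of devices, assuming the filesystem is
mounted at /mnt/data and btrfs-progs v4.4 or newer; counting devices via
'btrfs filesystem show' is just one illustrative way to get the drive
count:

    # One "devid" line is printed per device in the filesystem.
    drivecount=$(btrfs filesystem show /mnt/data | grep -c devid)

    # Re-balance only data block groups striped across fewer than all
    # devices, i.e. stripe widths 1 .. drivecount-1.
    btrfs balance start -v -dstripes=1..$((drivecount - 1)) /mnt/data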