Certainly. My apologies for not including them before. As described,
the disks are reasonably balanced -- not as full as the last time. As
such, it may be that a balance would (slowly) free up enough chunks to
get things going. And if I have to, I will partially convert to single
again. btrfs replace certainly seems like the most predictable and
simple path, but it will result in a strange distribution of the
chunks.
Label: 'butter'  uuid: a91755d4-87d8-4acd-ae08-c11e7f1f5438
        Total devices 3 FS bytes used 6.11TiB
        devid    1 size 3.62TiB used 3.47TiB path /dev/sdj2
        devid    2 size 3.64TiB used 3.49TiB path /dev/sda
        devid    3 size 5.43TiB used 5.28TiB path /dev/sdi2

Overall:
    Device size:                  12.70TiB
    Device allocated:             12.25TiB
    Device unallocated:          459.95GiB
    Device missing:                  0.00B
    Used:                         12.21TiB
    Free (estimated):            246.35GiB  (min: 246.35GiB)
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB  (used: 1.32MiB)

Data,RAID1: Size:6.11TiB, Used:6.09TiB
   /dev/sda        3.48TiB
   /dev/sdi2       5.28TiB
   /dev/sdj2       3.46TiB

Metadata,RAID1: Size:14.00GiB, Used:12.38GiB
   /dev/sda        8.00GiB
   /dev/sdi2       7.00GiB
   /dev/sdj2      13.00GiB

System,RAID1: Size:32.00MiB, Used:888.00KiB
   /dev/sdi2      32.00MiB
   /dev/sdj2      32.00MiB

Unallocated:
   /dev/sda      153.02GiB
   /dev/sdi2     154.56GiB
   /dev/sdj2     152.36GiB

On Sat, May 26, 2018 at 7:16 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>
>
> On 2018年05月27日 10:06, Brad Templeton wrote:
>> Thanks.  These are all things which take substantial fractions of a
>> day to try, unfortunately.
>
> Normally I would suggest just using a VM and several small disks
> (~10G), along with fallocate (the fastest way to use space), to get a
> basic view of the procedure.
>
>> Last time I ended up fixing it in a fairly kludged way, which was to
>> convert from raid-1 to single long enough to get enough single
>> blocks that when I converted back to raid-1 they got distributed to
>> the right drives.
>
> Yep, that's the ultimate one-size-fits-all solution.
> Also, this reminds me that we could do the RAID1->Single/DUP->Single
> downgrade in a much, much faster way.
> I think it's worth considering as a later enhancement.
>
>> But this is, aside from being a kludge, a procedure with some minor
>> risk.  Of course I am taking a backup first, but still...
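Qu's VM suggestion can also be tried without a VM, using loop devices.
A minimal sketch, assuming root, btrfs-progs, and an existing
/mnt/test mount point (the file names are illustrative):

```shell
# Stand-in "disks": three sparse 10G files attached as loop devices.
truncate -s 10G disk1.img disk2.img disk3.img
DEV1=$(losetup --find --show disk1.img)
DEV2=$(losetup --find --show disk2.img)
DEV3=$(losetup --find --show disk3.img)

# RAID1 for both data and metadata, as in the real array.
mkfs.btrfs -d raid1 -m raid1 "$DEV1" "$DEV2" "$DEV3"
mount "$DEV1" /mnt/test

# fallocate reserves extents without writing data, so the filesystem
# fills almost instantly -- handy for reproducing the "2 full drives"
# state before experimenting with add/replace/balance.
fallocate -l 8G /mnt/test/filler
```

Deleting the filler files and re-running fallocate makes it cheap to
iterate on the different upgrade strategies discussed below.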
>>
>> This strikes me as something that should be a fairly common event --
>> your raid is filling up, and so you expand it by replacing the
>> oldest and smallest drive with a new, much bigger one.  In the old
>> days of RAID you could not do that; you had to grow all drives at
>> the same time, and this is one of the ways that BTRFS is quite
>> superior.
>>
>> When I had MD raid, I went through a strange process of always
>> having a raid 5 that consisted of different sized drives.  The
>> raid-5 was based on the smallest of the 3 drives, and then the
>> larger ones had extra space which could either be in raid-1, or more
>> simply was in solo disk mode and used for less critical data (such
>> as backups and old archives).  Slowly, and in a messy way, each time
>> I replaced the smallest drive, I could then grow the raid 5.  Yuck.
>> BTRFS is so much better, except for this issue.
>>
>> So if somebody has a thought of a procedure that is fairly sure to
>> work and doesn't involve too many copying passes -- copying 4TB is
>> not a quick operation -- it would be much appreciated and might be a
>> good thing to add to a wiki page, which I would be happy to do.
>
> Anyway, "btrfs fi show" and "btrfs fi usage" output would help before
> any further advice from the community.
>
> Thanks,
> Qu
>
>>
>> On Sat, May 26, 2018 at 6:56 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>>
>>>
>>> On 2018年05月27日 09:49, Brad Templeton wrote:
>>>> That is what did not work last time.
>>>>
>>>> I say I think there can be a "fix" because I hope the goal of
>>>> BTRFS raid is to be superior to traditional RAID -- that if one
>>>> replaces a drive and asks to balance, it figures out what needs to
>>>> be done to make that work.  I understand that the current balance
>>>> algorithm may have trouble with that.
>>>> In this situation, the ideal result would be for the system to
>>>> take the 3 drives (4TB and 6TB full, 8TB with 4TB free) and move
>>>> extents strictly from the 4TB and 6TB to the 8TB -- i.e. extents
>>>> which are currently on both the 4TB and 6TB -- by moving only one
>>>> copy.
>>>
>>> Btrfs can only do balance in chunk units.
>>> Thus btrfs can only do:
>>> 1) Create a new chunk
>>> 2) Copy the data
>>> 3) Remove the old chunk
>>>
>>> So it can't work the way you mentioned.
>>> But your purpose sounds pretty valid, and maybe we could enhance
>>> btrfs to do such a thing.
>>> (Currently only replace can behave like that.)
>>>
>>>> It is not strictly a "bug" in that the code is operating as
>>>> designed, but it is an undesired behavior.
>>>>
>>>> The problem is that the approach you describe did not work in the
>>>> prior upgrade.
>>>
>>> Would you please try 4/4/6 + 4 or 4/4/6 + 2 and then balance?
>>> Before and after the balance, "btrfs fi usage" and "btrfs fi show"
>>> output would also help.
>>>
>>> Thanks,
>>> Qu
>>>
>>>>
>>>> On Sat, May 26, 2018 at 6:41 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>>>>
>>>>>
>>>>> On 2018年05月27日 09:27, Brad Templeton wrote:
>>>>>> A few years ago, I encountered an issue (halfway between a bug
>>>>>> and a problem) when attempting to grow a fairly full BTRFS
>>>>>> 3-disk raid1.  The problem was that after replacing (by
>>>>>> add/delete) a small drive with a larger one, there were now 2
>>>>>> full drives and one new half-full one, and balance was not able
>>>>>> to correct this situation to produce the desired result: 3
>>>>>> drives, each with a roughly even amount of free space.  It can't
>>>>>> do it because the 2 smaller drives are full, and it doesn't
>>>>>> realize it could just move one of the copies of a block off a
>>>>>> smaller drive onto the larger drive to free space; it wants to
>>>>>> move them both, and there is nowhere to put them both.
>>>>>
>>>>> It's not that easy.
>>>>> For balance, btrfs must first find a large enough space to locate
>>>>> both copies, then copy the data.
>>>>> Otherwise, if power loss happens, it would cause data corruption.
>>>>>
>>>>> So in your case, btrfs can only find enough space for one copy,
>>>>> and is thus unable to relocate any chunk.
>>>>>
>>>>>>
>>>>>> I'm about to do it again, taking my nearly full array, which is
>>>>>> 4TB, 4TB, 6TB, and replacing one of the 4TB drives with an 8TB.
>>>>>> I don't want to repeat the very time-consuming situation, so I
>>>>>> wanted to find out if things were fixed now.  I am running
>>>>>> Xenial (kernel 4.4.0) and could consider the upgrade to bionic
>>>>>> (4.15), though that adds a lot more to my plate before a long
>>>>>> trip and I would prefer to avoid it if I can.
>>>>>
>>>>> Since there is nothing to fix, the behavior will not change at
>>>>> all.
>>>>>
>>>>>>
>>>>>> So what is the best strategy:
>>>>>>
>>>>>> a) Replace 4TB with 8TB, resize up and balance?  (This is the
>>>>>> "basic" strategy)
>>>>>> b) Add 8TB, balance, remove 4TB (automatic distribution of some
>>>>>> blocks from the 4TB, but possibly not enough)
>>>>>> c) Replace 6TB with 8TB, resize/balance, then replace 4TB with
>>>>>> the recently vacated 6TB -- a much longer procedure, but
>>>>>> possibly better
>>>>>>
>>>>>> Or has this all been fixed, so that method A will work fine and
>>>>>> get to the ideal goal -- 3 drives, with available space suitably
>>>>>> distributed to allow full utilization over time?
>>>>>
>>>>> The btrfs chunk allocator has been trying to utilize all drives
>>>>> for a long, long time.
>>>>> When allocating chunks, btrfs will choose the device with the
>>>>> most free space.  However, the nature of RAID1 requires btrfs to
>>>>> allocate extents from 2 different devices, which makes your
>>>>> replaced 4/4/6 a little complex.
>>>>> (If your 4/4/6 array were set up and then filled to the current
>>>>> stage, btrfs should be able to utilize all the space.)
>>>>>
>>>>>
>>>>> Personally speaking, if you're confident enough, just add a new
>>>>> device and then do a balance.
>>>>> If enough chunks get balanced, there should be enough space freed
>>>>> on the existing disks.
>>>>> Then remove the newly added device, and btrfs should handle the
>>>>> remaining space well.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>>
>>>>>> On Sat, May 26, 2018 at 6:24 PM, Brad Templeton <brad...@gmail.com> wrote:
>>>>>>> A few years ago, I encountered an issue (halfway between a bug
>>>>>>> and a problem) when attempting to grow a fairly full BTRFS
>>>>>>> 3-disk raid1.  The problem was that after replacing (by
>>>>>>> add/delete) a small drive with a larger one, there were now 2
>>>>>>> full drives and one new half-full one, and balance was not able
>>>>>>> to correct this situation to produce the desired result: 3
>>>>>>> drives, each with a roughly even amount of free space.  It
>>>>>>> can't do it because the 2 smaller drives are full, and it
>>>>>>> doesn't realize it could just move one of the copies of a block
>>>>>>> off a smaller drive onto the larger drive to free space; it
>>>>>>> wants to move them both, and there is nowhere to put them both.
>>>>>>>
>>>>>>> I'm about to do it again, taking my nearly full array, which is
>>>>>>> 4TB, 4TB, 6TB, and replacing one of the 4TB drives with an 8TB.
>>>>>>> I don't want to repeat the very time-consuming situation, so I
>>>>>>> wanted to find out if things were fixed now.  I am running
>>>>>>> Xenial (kernel 4.4.0) and could consider the upgrade to bionic
>>>>>>> (4.15), though that adds a lot more to my plate before a long
>>>>>>> trip and I would prefer to avoid it if I can.
>>>>>>>
>>>>>>> So what is the best strategy:
>>>>>>>
>>>>>>> a) Replace 4TB with 8TB, resize up and balance?  (This is the
>>>>>>> "basic" strategy)
>>>>>>> b) Add 8TB, balance, remove 4TB (automatic distribution of some
>>>>>>> blocks from the 4TB, but possibly not enough)
>>>>>>> c) Replace 6TB with 8TB, resize/balance, then replace 4TB with
>>>>>>> the recently vacated 6TB -- a much longer procedure, but
>>>>>>> possibly better
>>>>>>>
>>>>>>> Or has this all been fixed, so that method A will work fine and
>>>>>>> get to the ideal goal -- 3 drives, with available space
>>>>>>> suitably distributed to allow full utilization over time?
>>>>>>>
>>>>>>> On Fri, Mar 25, 2016 at 7:35 AM, Henk Slager <eye...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> On Fri, Mar 25, 2016 at 2:16 PM, Patrik Lundquist
>>>>>>>> <patrik.lundqu...@gmail.com> wrote:
>>>>>>>>> On 23 March 2016 at 20:33, Chris Murphy <li...@colorremedies.com> wrote:
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 23, 2016 at 1:10 PM, Brad Templeton <brad...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I am surprised to hear it said that having the mixed sizes
>>>>>>>>>>> is an odd case.
>>>>>>>>>>
>>>>>>>>>> Not odd as in wrong, just uncommon compared to other
>>>>>>>>>> arrangements being tested.
>>>>>>>>>
>>>>>>>>> I think mixed drive sizes in raid1 is a killer feature for a
>>>>>>>>> home NAS, where you replace an old, smaller drive with the
>>>>>>>>> latest and largest when you need more storage.
>>>>>>>>>
>>>>>>>>> My raid1 currently consists of 6TB+3TB+3*2TB.
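Qu's description of the allocator earlier in the thread (each RAID1
chunk goes on the two devices with the most unallocated space) also
explains both outcomes above. The following is a toy model, not btrfs
code -- the `simulate` function and the 1GiB chunk granularity are
illustrative:

```shell
#!/usr/bin/env bash
# Toy model of the RAID1 chunk allocator: each 1GiB chunk is placed on
# the two devices with the most unallocated space.  Arguments are
# per-device free space in GiB; prints how many chunks fit.
simulate() {
  local -a free=("$@")
  local chunks=0 i a b
  while :; do
    a=-1; b=-1    # indices of the largest and second-largest
    for i in "${!free[@]}"; do
      if [ "$a" -lt 0 ] || [ "${free[i]}" -gt "${free[a]}" ]; then
        b=$a; a=$i
      elif [ "$b" -lt 0 ] || [ "${free[i]}" -gt "${free[b]}" ]; then
        b=$i
      fi
    done
    # Stop when no two devices both have a free GiB left.
    if [ "${free[b]}" -lt 1 ]; then
      echo "$chunks"
      return
    fi
    free[a]=$((free[a] - 1))
    free[b]=$((free[b] - 1))
    chunks=$((chunks + 1))
  done
}

simulate 4096 4096 6144   # fresh 4TB+4TB+6TB: prints 7168 (7 TiB usable)
simulate 0 0 8192         # two full drives plus an empty 8TB: prints 0
```

The second case is Brad's situation: with only one device holding free
space, no RAID1 chunk can be allocated at all, which is why neither
new writes nor balance can make progress until space is freed on a
second device.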
>>>>>>>>
>>>>>>>> For the original OP situation, with chunks all filled up with
>>>>>>>> extents and devices all filled up with chunks, 'integrating' a
>>>>>>>> new 6TB drive into a 4TB+3TB+2TB raid1 array could probably be
>>>>>>>> done in a somewhat unusual way in order to avoid immediate
>>>>>>>> balancing needs:
>>>>>>>> - 'plug in' the 6TB
>>>>>>>> - btrfs replace the 4TB with the 6TB
>>>>>>>> - btrfs fi resize 6TB_devid:max
>>>>>>>> - btrfs replace the 2TB with the 4TB
>>>>>>>> - btrfs fi resize 4TB_devid:max
>>>>>>>> - 'unplug' the 2TB
>>>>>>>>
>>>>>>>> Then there would be 2 devices with roughly 2TB of space
>>>>>>>> available, so good for continued btrfs raid1 writes.
>>>>>>>>
>>>>>>>> An offline variant with dd instead of btrfs replace could also
>>>>>>>> be done (I used to do that sometimes when btrfs replace was
>>>>>>>> not yet implemented).  My experience is that btrfs replace
>>>>>>>> runs at roughly maximum speed (i.e. hard disk magnetic media
>>>>>>>> transfer speed) during the whole replace process, and it does
>>>>>>>> in a more direct way what you actually want.  So in total it
>>>>>>>> is usually far faster for a device replace/upgrade than the
>>>>>>>> add+delete method.  And raid1 redundancy is active all the
>>>>>>>> time.  Of course, it means first making sure the system runs
>>>>>>>> an up-to-date/latest kernel+tools.
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>>>> the body of a message to majord...@vger.kernel.org
>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
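Henk's steps map onto concrete commands roughly as follows. This is a
sketch only: the device paths, devids, and mount point are
placeholders, and the real devids should be read from
"btrfs fi show" first.

```shell
# 1. Replace the 4TB with the newly plugged-in 6TB; raid1 redundancy
#    stays active for the whole copy.
btrfs replace start /dev/disk_4tb /dev/disk_6tb /mnt

# 2. The new device starts at the old 4TB size; grow it to the full
#    6TB (the devid is the one reported for the new drive).
btrfs fi resize <devid_of_6tb>:max /mnt

# 3. Replace the 2TB with the now-vacated 4TB drive, and grow again.
btrfs replace start /dev/disk_2tb /dev/disk_4tb /mnt
btrfs fi resize <devid_of_4tb>:max /mnt

# 4. The 2TB drive is no longer part of the array and can be unplugged.
```

Note that "btrfs replace start" requires the target device to be at
least as large as the source, which is why the chain runs largest
drive first, and the resize step is easy to forget: until it is run,
the filesystem only uses as much of the new drive as the old one had.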