On 2018/05/27 10:06, Brad Templeton wrote:
> Thanks.  These are all things which take substantial fractions of a
> day to try, unfortunately.
Normally I would suggest just using a VM and several small disks
(~10G), along with fallocate (the fastest way to use space), to get a
basic view of the procedure.

> Last time I ended up fixing it in a fairly kluged way, which was to
> convert from raid-1 to single long enough to get enough single
> blocks that when I converted back to raid-1 they got distributed to
> the right drives.

Yep, that's the ultimate one-size-fits-all solution.

Also, this reminds me that we could do the RAID1->Single/DUP->Single
downgrade in a much, much faster way.
I think it's worth considering as a later enhancement.

> But this is, aside from being a kludge, a procedure with some minor
> risk.  Of course I am taking a backup first, but still...
>
> This strikes me as something that should be a fairly common event --
> your raid is filling up, and so you expand it by replacing the
> oldest and smallest drive with a new, much bigger one.  In the old
> days of RAID, you could not do that; you had to grow all drives at
> the same time, and this is one of the ways that BTRFS is quite
> superior.
> When I had MD raid, I went through a strange process of always
> having a raid 5 that consisted of different sized drives.  The
> raid-5 was based on the smallest of the 3 drives, and then the
> larger ones had extra space which could either be in raid-1, or more
> simply was in solo disk mode and used for less critical data (such
> as backups and old archives).  Slowly, and in a messy way, each time
> I replaced the smallest drive, I could then grow the raid 5.  Yuck.
> BTRFS is so much better, except for this issue.
>
> So if somebody has a thought of a procedure that is fairly sure to
> work and doesn't involve too many copying passes -- copying 4TB is
> not a quick operation -- it would be much appreciated and might be a
> good thing to add to a wiki page, which I would be happy to do.

Anyway, "btrfs fi show" and "btrfs fi usage" output would help before
any further advice from the community.
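The throwaway test array suggested above can be sketched with loop devices. Everything here is illustrative: the /tmp/btrfs-lab path, the 10G image size, and the loop device names are assumptions, and the steps that need root are left commented out.

```shell
# Create sparse backing files -- instant, and they consume almost no
# real disk space until written to.
mkdir -p /tmp/btrfs-lab
for i in 1 2 3; do
    truncate -s 10G "/tmp/btrfs-lab/disk$i.img"
done

# The remaining steps need root; losetup picks the loop device names.
# sudo losetup -fP --show /tmp/btrfs-lab/disk1.img   # repeat per image
# sudo mkfs.btrfs -m raid1 -d raid1 /dev/loop0 /dev/loop1 /dev/loop2
# sudo mount /dev/loop0 /mnt/lab
#
# fallocate is the fastest way to eat space and reproduce a full array:
# sudo fallocate -l 4G /mnt/lab/filler1
#
# Inspect the layout before and after each experiment:
# sudo btrfs filesystem show /mnt/lab
# sudo btrfs filesystem usage /mnt/lab
```

Since the images are sparse and the filesystems tiny, a full add/balance/remove or replace cycle can be rehearsed in minutes instead of days.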
Thanks,
Qu

> On Sat, May 26, 2018 at 6:56 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>
>> On 2018/05/27 09:49, Brad Templeton wrote:
>>> That is what did not work last time.
>>>
>>> I say I think there can be a "fix" because I hope the goal of
>>> BTRFS raid is to be superior to traditional RAID: that if one
>>> replaces a drive and asks to balance, it figures out what needs to
>>> be done to make that work.  I understand that the current balance
>>> algorithm may have trouble with that.  In this situation, the
>>> ideal result would be that the system takes the 3 drives (4TB and
>>> 6TB full, 8TB with 4TB free) and moves extents strictly from the
>>> 4TB and 6TB to the 8TB -- i.e. extents which are currently on both
>>> the 4TB and the 6TB -- by moving only one copy.
>>
>> Btrfs can only do balance in chunk units.
>> Thus btrfs can only:
>> 1) Create a new chunk
>> 2) Copy the data
>> 3) Remove the old chunk
>>
>> So it can't do it the way you mentioned.
>> But your purpose sounds pretty valid, and maybe we could enhance
>> btrfs to do such a thing.
>> (Currently only replace can behave like that.)
>>
>>> It is not strictly a "bug", in that the code is operating as
>>> designed, but it is an undesired function.
>>>
>>> The problem is that the approach you describe did not work in the
>>> prior upgrade.
>>
>> Would you please try 4/4/6 + 4 or 4/4/6 + 2 and then balance?
>> And before/after the balance, "btrfs fi usage" and "btrfs fi show"
>> output could also help.
>>
>> Thanks,
>> Qu
>>
>>> On Sat, May 26, 2018 at 6:41 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>>>
>>>> On 2018/05/27 09:27, Brad Templeton wrote:
>>>>> A few years ago, I encountered an issue (halfway between a bug
>>>>> and a problem) with attempting to grow a BTRFS 3-disk RAID 1
>>>>> which was fairly full.
>>>>> The problem was that after replacing (by add/delete) a small
>>>>> drive with a larger one, there were now 2 full drives and one
>>>>> new half-full one, and balance was not able to correct this
>>>>> situation to produce the desired result, which is 3 drives, each
>>>>> with a roughly even amount of free space.  It can't do it
>>>>> because the 2 smaller drives are full, and it doesn't realize it
>>>>> could just move one of the copies of a block off the smaller
>>>>> drive onto the larger drive to free space on the smaller drive;
>>>>> it wants to move them both, and there is nowhere to put them
>>>>> both.
>>>>
>>>> It's not that easy.
>>>> For balance, btrfs must first find a large enough space to locate
>>>> both copies, then copy the data.
>>>> Otherwise, if power loss happens, it will cause data corruption.
>>>>
>>>> So in your case, btrfs can only find enough space for one copy,
>>>> and is thus unable to relocate any chunk.
>>>>
>>>>> I'm about to do it again, taking my nearly full array which is
>>>>> 4TB, 4TB, 6TB and replacing one of the 4TB with an 8TB.  I don't
>>>>> want to repeat the very time consuming situation, so I wanted to
>>>>> find out if things were fixed now.  I am running Xenial (kernel
>>>>> 4.4.0) and could consider the upgrade to bionic (4.15), though
>>>>> that adds a lot more to my plate before a long trip and I would
>>>>> prefer to avoid it if I can.
>>>>
>>>> Since there is nothing to fix, the behavior will not change at
>>>> all.
>>>>
>>>>> So what is the best strategy:
>>>>>
>>>>> a) Replace 4TB with 8TB, resize up and balance?  (This is the
>>>>> "basic" strategy.)
>>>>> b) Add 8TB, balance, remove 4TB (automatic distribution of some
>>>>> blocks from 4TB, but possibly not enough).
>>>>> c) Replace 6TB with 8TB, resize/balance, then replace 4TB with
>>>>> the recently vacated 6TB -- a much longer procedure but possibly
>>>>> better.
>>>>>
>>>>> Or has this all been fixed, and method A will work fine and get
>>>>> to the ideal goal -- 3 drives, with available space suitably
>>>>> distributed to allow full utilization over time?
>>>>
>>>> The btrfs chunk allocator has already been trying to utilize all
>>>> drives for a long, long time.
>>>> When allocating chunks, btrfs will choose the device with the
>>>> most free space.  However, the nature of RAID1 requires btrfs to
>>>> allocate extents from 2 different devices, which makes your
>>>> replaced 4/4/6 a little complex.
>>>> (If your 4/4/6 array were set up and then filled to the current
>>>> stage, btrfs should be able to utilize all the space.)
>>>>
>>>> Personally speaking, if you're confident enough, just add a new
>>>> device, and then do a balance.
>>>> If enough chunks get balanced, there should be enough space freed
>>>> on the existing disks.
>>>> Then remove the newly added device, and btrfs should handle the
>>>> remaining space well.
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>> On Sat, May 26, 2018 at 6:24 PM, Brad Templeton <brad...@gmail.com> wrote:
>>>>>> A few years ago, I encountered an issue (halfway between a bug
>>>>>> and a problem) with attempting to grow a BTRFS 3-disk RAID 1
>>>>>> which was fairly full.  The problem was that after replacing
>>>>>> (by add/delete) a small drive with a larger one, there were now
>>>>>> 2 full drives and one new half-full one, and balance was not
>>>>>> able to correct this situation to produce the desired result,
>>>>>> which is 3 drives, each with a roughly even amount of free
>>>>>> space.
>>>>>> It can't do it because the 2 smaller drives are full, and it
>>>>>> doesn't realize it could just move one of the copies of a block
>>>>>> off the smaller drive onto the larger drive to free space on
>>>>>> the smaller drive; it wants to move them both, and there is
>>>>>> nowhere to put them both.
>>>>>>
>>>>>> I'm about to do it again, taking my nearly full array which is
>>>>>> 4TB, 4TB, 6TB and replacing one of the 4TB with an 8TB.  I
>>>>>> don't want to repeat the very time consuming situation, so I
>>>>>> wanted to find out if things were fixed now.  I am running
>>>>>> Xenial (kernel 4.4.0) and could consider the upgrade to bionic
>>>>>> (4.15), though that adds a lot more to my plate before a long
>>>>>> trip and I would prefer to avoid it if I can.
>>>>>>
>>>>>> So what is the best strategy:
>>>>>>
>>>>>> a) Replace 4TB with 8TB, resize up and balance?  (This is the
>>>>>> "basic" strategy.)
>>>>>> b) Add 8TB, balance, remove 4TB (automatic distribution of some
>>>>>> blocks from 4TB, but possibly not enough).
>>>>>> c) Replace 6TB with 8TB, resize/balance, then replace 4TB with
>>>>>> the recently vacated 6TB -- a much longer procedure but
>>>>>> possibly better.
>>>>>>
>>>>>> Or has this all been fixed, and method A will work fine and get
>>>>>> to the ideal goal -- 3 drives, with available space suitably
>>>>>> distributed to allow full utilization over time?
>>>>>>
>>>>>> On Fri, Mar 25, 2016 at 7:35 AM, Henk Slager <eye...@gmail.com> wrote:
>>>>>>> On Fri, Mar 25, 2016 at 2:16 PM, Patrik Lundquist
>>>>>>> <patrik.lundqu...@gmail.com> wrote:
>>>>>>>> On 23 March 2016 at 20:33, Chris Murphy
>>>>>>>> <li...@colorremedies.com> wrote:
>>>>>>>>> On Wed, Mar 23, 2016 at 1:10 PM, Brad Templeton
>>>>>>>>> <brad...@gmail.com> wrote:
>>>>>>>>>> I am surprised to hear it said that having the mixed sizes
>>>>>>>>>> is an odd case.
>>>>>>>>> Not odd as in wrong, just uncommon compared to other
>>>>>>>>> arrangements being tested.
>>>>>>>>
>>>>>>>> I think mixed drive sizes in raid1 is a killer feature for a
>>>>>>>> home NAS, where you replace an old, smaller drive with the
>>>>>>>> latest and largest when you need more storage.
>>>>>>>>
>>>>>>>> My raid1 currently consists of 6TB+3TB+3*2TB.
>>>>>>>
>>>>>>> For the original OP's situation, with chunks all filled up
>>>>>>> with extents and devices all filled up with chunks,
>>>>>>> 'integrating' a new 6TB drive into a 4TB+3TB+2TB raid1 array
>>>>>>> could probably be done in a somewhat unusual way in order to
>>>>>>> avoid immediate balancing needs:
>>>>>>> - 'plug in' the 6TB
>>>>>>> - btrfs-replace 4TB by 6TB
>>>>>>> - btrfs fi resize max 6TB_devID
>>>>>>> - btrfs-replace 2TB by 4TB
>>>>>>> - btrfs fi resize max 4TB_devID
>>>>>>> - 'unplug' the 2TB
>>>>>>>
>>>>>>> Then there would be 2 devices with roughly 2TB of space
>>>>>>> available, so good for continued btrfs raid1 writes.
>>>>>>>
>>>>>>> An offline variant with dd instead of btrfs-replace could also
>>>>>>> be done (I used to do that sometimes when btrfs-replace was
>>>>>>> not yet implemented).  My experience is that btrfs-replace
>>>>>>> runs at roughly maximum speed (i.e. the hard disk's magnetic
>>>>>>> media transfer speed) during the whole replace process, and it
>>>>>>> does in a more direct way what you actually want.  So in total
>>>>>>> it is mostly far faster for a device replace/upgrade than the
>>>>>>> add+delete method.  And raid1 redundancy is active the whole
>>>>>>> time.  Of course it means first making sure the system runs an
>>>>>>> up-to-date/latest kernel+tools.
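The replace-then-resize chain above can be written out as btrfs(8) commands. This is only a sketch under assumptions: /mnt/array, the devids (1 and 3), and the target device nodes are placeholders to adapt from your own "btrfs filesystem show" output, and the helper merely echoes each step so nothing runs by accident.

```shell
# Device-upgrade chain sketch: replace keeps raid1 redundancy intact
# for the whole copy, then resize grows the new device to full size.
MNT=/mnt/array        # placeholder mount point
NEW_BIG=/dev/sdd      # hypothetical node for the new big drive
FREED_MID=/dev/sde    # hypothetical node the freed mid-size drive gets

run() {               # print each step; drop the echo to execute
    echo "+ $*"
}

run btrfs replace start 1 "$NEW_BIG" "$MNT"    # old drive (devid 1) -> new
run btrfs filesystem resize 1:max "$MNT"       # grow devid 1 past old size
run btrfs replace start 3 "$FREED_MID" "$MNT"  # next hop down the chain
run btrfs filesystem resize 3:max "$MNT"
run btrfs filesystem usage "$MNT"              # check free-space spread
```

The resize step matters: after a replace onto a bigger device, btrfs keeps the old device's size until told otherwise, which is why the two-command pairs above always go together.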
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html