Just to clarify, when I referred to "file level backup/restore", I was 
referring to the MDT ldiskfs filesystem, not the whole Lustre filesystem (which 
would be _much_ too large for most sites).  The various backup/restore methods 
are documented in the Lustre Operations Manual.
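
For reference, the file-level MDT procedure in the manual looks roughly like
the following (a minimal sketch; device names and mount points are
placeholders, and the exact steps depend on your Lustre version, so please
follow the manual for 2.8 rather than this from-memory outline):

    # backup: stop the MDT, then mount the device as type ldiskfs
    mount -t ldiskfs /dev/old_mdt_dev /mnt/mdt
    cd /mnt/mdt
    getfattr -R -d -m '.*' -e hex -P . > ea-backup.bak
    tar czf /backup/mdt-backup.tgz --sparse .
    cd /; umount /mnt/mdt

    # restore: format the new MDT, mount it as ldiskfs, then
    cd /mnt/new_mdt
    tar xzpf /backup/mdt-backup.tgz --sparse
    setfattr --restore=ea-backup.bak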

Cheers, Andreas

> On Jul 31, 2019, at 15:10, Jesse Stroik <jesse.str...@ssec.wisc.edu> wrote:
> 
> This is excellent information, Andreas.
> 
> Presently we do file-level backups of the live file system and they take over 
> 24 hours, so they're done continuously. For that timeframe to work, we'd need 
> to be able to back up and recover the MDT to the new MDT with the file system 
> online.
> 
> Given that resizing the file system will proportionately increase the inodes 
> (I didn't realize that), dd to a logical volume may be a reasonable option 
> for us. The dd would be fast enough that we could weather the downtime.
> 
> PFL and FLR aren't features they're planning for this file system, and it may 
> be replaced next year, so I suspect they'll opt for the DNE method.
> 
> Thanks again,
> Jesse Stroik
> 
> On 7/31/19 3:11 PM, Andreas Dilger wrote:
>> Normally the easy answer would be a "dd" copy of the MDT device from your 
>> HDDs to a larger SSD LUN, followed by resize2fs to increase the filesystem 
>> size, which would also increase the number of inodes proportionately to the 
>> LUN size.
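>>
>> For illustration, that path would look roughly like the following (a sketch 
>> only, with placeholder device names; the MDT must be stopped and unmounted 
>> first, the SSD LUN must be at least as large as the HDD LUN, and the Lustre 
>> e2fsprogs packages should be used for the ldiskfs tools):
>>
>>     dd if=/dev/old_mdt_dev of=/dev/new_ssd_lun bs=4M
>>     e2fsck -f /dev/new_ssd_lun     # required before an offline resize
>>     resize2fs /dev/new_ssd_lun     # grow to the full LUN size, which also
>>                                    # adds inodes proportionately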
>> However, since you are *not* using a 1024-byte inode size, but rather a 
>> 512-byte inode plus 512 bytes of space for other metadata (i.e. a 1024 
>> bytes-per-inode ratio), I'd suggest a file-level MDT backup/restore to a 
>> newly-formatted MDT, because newer features like PFL and FLR need more space 
>> in the inode itself. The benefit of this approach is that you keep a full 
>> backup of the MDT on the HDDs in case of problems.  Note that after the 
>> backup/restore the LFSCK OI Scrub will run for some time (maybe an hour or 
>> two, depending on size), which will cause some slowdown, but that would 
>> likely be compensated for by the faster SSD storage.
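>>
>> If it helps, I believe the scrub status and progress can be watched on the 
>> MDS with something like the following (parameter name from memory, so treat 
>> it as an assumption and check on your version):
>>
>>     lctl get_param osd-ldiskfs.*.oi_scrub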
>> If you go the DNE route and migrate some of the namespace to the new MDT, 
>> you definitely still need to keep MDT0000.  However, you could combine these 
>> approaches and copy MDT0000 to new flash storage as well, instead of keeping 
>> the HDDs around forever.  I'd again recommend a file-level MDT 
>> backup/restore to a newly-formatted MDT to get the newer format options.
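>>
>> As a rough sketch of the DNE setup (fsname, MGS NID, device, and paths are 
>> placeholders; verify the mkfs.lustre options for your version before running 
>> anything):
>>
>>     mkfs.lustre --mdt --index=1 --fsname=testfs --mgsnode=mgs@tcp /dev/new_ssd_lun
>>     mount -t lustre /dev/new_ssd_lun /mnt/mdt1
>>     # then, from a client, create new directories on the new MDT and
>>     # migrate/copy files into them:
>>     lfs mkdir -i 1 /mnt/testfs/newdir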
>> Cheers, Andreas
>>> On Jul 31, 2019, at 13:50, Jesse Stroik <jesse.str...@ssec.wisc.edu> wrote:
>>> 
>>> Hi everyone,
>>> 
>>> One of our Lustre file systems outgrew its MDT and the original scope of 
>>> its operation. This one is still running ldiskfs on the MDT. Here's our 
>>> setup and restrictions:
>>> 
>>> - centos 6 / lustre 2.8
>>> - ldiskfs MDT
>>> - minimal downtime allowed, but the FS can be read-only for a while.
>>> 
>>> The MDT itself, set up with -i 1024, needs both more space and available 
>>> inodes. Its purpose changed in scope and we'd now like the performance 
>>> benefits of getting off of spinning media as well.
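>>>
>>> For reference, the numbers we're looking at can be seen with commands like 
>>> these (device path is a placeholder):
>>>
>>>     lfs df -i                    # inode usage per MDT/OST, from a client
>>>     dumpe2fs -h /dev/mdt_dev     # inode count/size details, on the MDS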
>>> 
>>> We need a new file system instead of expanding the existing ldiskfs one 
>>> because we need more inodes.
>>> 
>>> I think my options are (1) a file-level backup and recovery, or a direct 
>>> copy, onto the new file system, or (2) adding a new MDT to the system, 
>>> assigning all directories under the root to it, and then running 
>>> lfs_migrate on everything in the file system thereafter.
>>> 
>>> Is there a disadvantage to the DNE approach other than the fact that we 
>>> have to keep the original spinning-disk MDT around to service the root of 
>>> the FS?
>>> 
>>> If we had to do option 1, we'd want to remount the current MDT read-only 
>>> and continue using it while we were preparing the new MDT. When I searched, 
>>> I couldn't find anything that seemed definitive about ensuring no changes 
>>> to an ldiskfs MDT during operation, and I don't want to assume I can simply 
>>> remount it read-only.
>>> 
>>> Thanks,
>>> Jesse Stroik
>>> 

Cheers, Andreas
--
Andreas Dilger
Principal Lustre Architect
Whamcloud

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
