Re: [lustre-discuss] Replacing ldiskfs MDT with larger disk
Just to clarify, when I referred to "file level backup/restore", I was
referring to the MDT ldiskfs filesystem, not the whole Lustre filesystem
(which would be _much_ too large for most sites). The various backup/restore
methods are documented in the Lustre Operations Manual.

Cheers, Andreas

> On Jul 31, 2019, at 15:10, Jesse Stroik wrote:
>
> This is excellent information, Andreas.
>
> Presently we do file level backups to the live file system and they take
> over 24 hours, so they're done continuously. For that timeframe to work,
> we'd need to be able to back up and recover the MDT to the new MDT with the
> file system online.
>
> Given that resizing the file system will proportionately increase the
> inodes (I didn't realize that), dd to a logical volume may be a reasonable
> option for us. The dd would be fast enough that we could weather the
> downtime.
>
> PFL and FLR aren't features they're planning for the file system and it may
> be replaced next year so I suspect they'll opt for the DNE method.
>
> Thanks again,
> Jesse Stroik

--
Andreas Dilger
Principal Lustre Architect
Whamcloud

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] Replacing ldiskfs MDT with larger disk
This is excellent information, Andreas.

Presently we do file level backups to the live file system and they take
over 24 hours, so they're done continuously. For that timeframe to work,
we'd need to be able to back up and recover the MDT to the new MDT with the
file system online.

Given that resizing the file system will proportionately increase the inodes
(I didn't realize that), dd to a logical volume may be a reasonable option
for us. The dd would be fast enough that we could weather the downtime.

PFL and FLR aren't features they're planning for the file system and it may
be replaced next year so I suspect they'll opt for the DNE method.

Thanks again,
Jesse Stroik

On 7/31/19 3:11 PM, Andreas Dilger wrote:
> Normally the easy answer would be a "dd" copy of the MDT device from your
> HDDs to a larger SSD LUN, followed by resize2fs to increase the filesystem
> size, which would also increase the number of inodes proportionately to the
> LUN size.
>
> However, since you are *not* using 1024-byte inode size, only 512-byte
> inode size + 512 bytes of space for other things (i.e. 1024 bytes-per-inode
> ratio), I'd suggest a file-level MDT backup/restore to a newly-formatted
> MDT because newer features like PFL and FLR need more space in the inode
> itself. The benefit of this approach is that you keep a full backup of the
> MDT on the HDDs in case of problems. Note that after backup/restore the
> LFSCK OI Scrub will run for some time (maybe an hour or two, depending on
> size), which will result in a slowdown. That would likely be compensated by
> the faster SSD storage.
>
> If you go the DNE route and migrate some of the namespace to the new MDT,
> you definitely still need to keep the original MDT. However, you could
> combine these approaches and still copy the original MDT to new flash
> storage instead of keeping the HDDs around forever. I'd again recommend a
> file-level MDT backup/restore to a newly-formatted MDT to get the newer
> format options.
>
> Cheers, Andreas
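Whether the downtime for a dd copy is tolerable can be gauged, to first order, as device size divided by sustained sequential throughput. The sketch below is only that arithmetic. The MDT size and throughput figures are hypothetical placeholders, not numbers from this thread, and real copies are often limited by the slower of the source and target devices.

```python
# First-order estimate of how long a raw block copy (e.g. dd) would take.
# All sizes and throughputs below are hypothetical placeholders.
GIB = 1024 ** 3
MIB = 1024 ** 2


def estimate_copy_hours(device_bytes: int, throughput_bytes_per_s: float) -> float:
    """Return the estimated copy time in hours for a full-device copy."""
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return device_bytes / throughput_bytes_per_s / 3600


if __name__ == "__main__":
    mdt_size = 512 * GIB  # hypothetical MDT device size
    for mib_per_s in (100, 200, 400):  # hypothetical sustained throughputs
        hours = estimate_copy_hours(mdt_size, mib_per_s * MIB)
        print(f"{mib_per_s} MiB/s -> {hours:.2f} h")
```

Measuring the actual sequential read rate of the HDD-backed MDT before the outage would give a more trustworthy input than any assumed figure.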
Re: [lustre-discuss] Replacing ldiskfs MDT with larger disk
Normally the easy answer would be a "dd" copy of the MDT device from your
HDDs to a larger SSD LUN, followed by resize2fs to increase the filesystem
size, which would also increase the number of inodes proportionately to the
LUN size.

However, since you are *not* using 1024-byte inode size, only 512-byte inode
size + 512 bytes of space for other things (i.e. 1024 bytes-per-inode ratio),
I'd suggest a file-level MDT backup/restore to a newly-formatted MDT because
newer features like PFL and FLR need more space in the inode itself. The
benefit of this approach is that you keep a full backup of the MDT on the
HDDs in case of problems. Note that after backup/restore the LFSCK OI Scrub
will run for some time (maybe an hour or two, depending on size), which will
result in a slowdown. That would likely be compensated by the faster SSD
storage.

If you go the DNE route and migrate some of the namespace to the new MDT,
you definitely still need to keep the original MDT. However, you could
combine these approaches and still copy the original MDT to new flash
storage instead of keeping the HDDs around forever. I'd again recommend a
file-level MDT backup/restore to a newly-formatted MDT to get the newer
format options.

Cheers, Andreas

> On Jul 31, 2019, at 13:50, Jesse Stroik wrote:
>
> Hi everyone,
>
> One of our Lustre file systems outgrew its MDT and the original scope of
> its operation. This one is still running ldiskfs on the MDT. Here's our
> setup and restrictions:
>
> - CentOS 6 / Lustre 2.8
> - ldiskfs MDT
> - minimal downtime allowed, but the FS can be read-only for a while.
>
> The MDT itself, set up with -i 1024, needs both more space and available
> inodes. Its purpose changed in scope and we'd now like the performance
> benefits of getting off of spinning media as well.
>
> We need a new file system instead of expanding the existing ldiskfs because
> we need more inodes.
>
> I think my options are (1) a file level backup and recovery or direct copy
> onto the new file system or (2) add a new MDT to the system and assign all
> directories under the root to it, then lfs_migrate everything on the file
> system thereafter.
>
> Is there a disadvantage to the DNE approach other than the fact that we
> have to keep the original spinning-disk MDT around to service the root of
> the FS?
>
> If we had to do option 1, we'd want to remount the current MDT read only
> and continue using it while we were preparing the new MDT. When I searched,
> I couldn't find anything that seemed definitive about ensuring no changes
> to an ldiskfs MDT during operation and I don't want to assume I can simply
> remount it read only.
>
> Thanks,
> Jesse Stroik
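As a rough illustration of the inode arithmetic in the reply above: with a fixed bytes-per-inode ratio, the inode count of an ldiskfs (ext4-style) filesystem scales roughly linearly with device size, which is why a dd copy followed by resize2fs also grows the inode count. The sketch below assumes the simple approximation inodes = device size / bytes-per-inode and ignores mke2fs rounding and reserved space. The device sizes are hypothetical, not taken from the thread.

```python
# Back-of-the-envelope inode arithmetic for an ext4-style (ldiskfs) filesystem.
# This is an approximation, not the exact layout mke2fs would produce.
TIB = 1024 ** 4


def approx_inode_count(device_bytes: int, bytes_per_inode: int) -> int:
    """Approximate inode count for a given bytes-per-inode ratio (mke2fs -i)."""
    return device_bytes // bytes_per_inode


def inode_table_fraction(inode_size: int, bytes_per_inode: int) -> float:
    """Fraction of the device taken up by the inode tables themselves."""
    return inode_size / bytes_per_inode


if __name__ == "__main__":
    old_size = 1 * TIB  # hypothetical size of the existing MDT device
    new_size = 4 * TIB  # hypothetical size of the larger replacement LUN

    old_inodes = approx_inode_count(old_size, 1024)
    new_inodes = approx_inode_count(new_size, 1024)

    # With the ratio unchanged, inodes grow by the same factor as the device.
    print(old_inodes, new_inodes, new_inodes / old_inodes)

    # 512-byte inodes at one inode per 1024 bytes: the case described above.
    print(inode_table_fraction(512, 1024))
```

The second function shows the trade-off Andreas describes: at a 1024 bytes-per-inode ratio, 512-byte inodes leave the other half of each 1024 bytes for everything else, whereas a reformatted MDT can choose a different inode size and ratio.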
[lustre-discuss] Replacing ldiskfs MDT with larger disk
Hi everyone,

One of our Lustre file systems outgrew its MDT and the original scope of its
operation. This one is still running ldiskfs on the MDT. Here's our setup and
restrictions:

- CentOS 6 / Lustre 2.8
- ldiskfs MDT
- minimal downtime allowed, but the FS can be read-only for a while.

The MDT itself, set up with -i 1024, needs both more space and available
inodes. Its purpose changed in scope and we'd now like the performance
benefits of getting off of spinning media as well.

We need a new file system instead of expanding the existing ldiskfs because
we need more inodes.

I think my options are (1) a file level backup and recovery or direct copy
onto the new file system or (2) add a new MDT to the system and assign all
directories under the root to it, then lfs_migrate everything on the file
system thereafter.

Is there a disadvantage to the DNE approach other than the fact that we have
to keep the original spinning-disk MDT around to service the root of the FS?

If we had to do option 1, we'd want to remount the current MDT read only and
continue using it while we were preparing the new MDT. When I searched, I
couldn't find anything that seemed definitive about ensuring no changes to an
ldiskfs MDT during operation and I don't want to assume I can simply remount
it read only.

Thanks,
Jesse Stroik