On Jun 18, 2009 11:32 +0200, Ramiro Alba Queipo wrote:
> There are 3 ways of doing an MDT backup:
>
> 1) Device-level using dd command
>
> You can do it from the original device to another local device with at
> least the same capacity, BUT no clients and no OSTs should be active, so
> NOT SUITABLE for an automated nightly backup

Well, "no clients/OSTs should be active" is a relative term. You will
almost certainly have a usable backup even if the filesystem was active,
because ext3 has a robust on-disk layout, but you would need to run an
e2fsck afterward.

> 2) File-level using tar or rsync commands
>
> You can make a copy to another directory (even remotely) BUT you MUST STOP
> lustre and remount it as an 'ldiskfs' file system type. You also have to
> save additional information
> (cd /lustre/mds; getfattr -R -d -m '.*' -P . > /<backup-dir>/ea.bak).
> So NOT SUITABLE for an automated nightly backup either

Right. Note that when using "tar" or "rsync" you should use the "--sparse"
option so that it doesn't back up empty files. Also, with newer versions
of tar (on RHEL/FC) and rsync it is possible to have it do the
backup/restore of the extended attributes directly.

You could also use "dump-0.4b40" (or later) to do a hybrid device/file
level backup. It will back up the filesystem directly from the block
device, but only the files that are in use. Versions 0.4b40+ can also do
the backup/restore of extended attributes, which is critical.

> 3) File-level on LVM snapshots
>
> LVM allows you to make a duplication of the MDT while lustre file system
> is operational, so you can make afterwards a File-level backup of the
> LVM snapshot while everything is running. Then it IS SUITABLE for an
> automated backup.
> Disadvantages are that you need extra local space for LVM snapshots and
> the impact on performance of using LVM over the MDT.

This is probably the best option. It allows consistent backups to be done,
and if you only keep a single snapshot the performance hit isn't too big.

> By the way. The procedure described at 'How do I replace an OST or MDS?'
> in Appendix B of Lustre Operations Manual differs from procedure
> described at 15.1.3.1 (Backing Up an MDS File):
> - getfattr -R -d -m '.*' -P . > ea.bak
> - getfattr -R -e base64 -d . > /tmp/mdsea

I would say the first one is better, though I like to use "-e hex" instead
of "-e base64" because the hex output is easier for me to decode if I need
to for some reason.

Probably the "replace an OST/MDT" chapter should just reference the
backup/restore section instead of duplicating the content.

> On Wed, 2009-06-17 at 16:23 -0600, Andreas Dilger wrote:
> > On Jun 17, 2009 12:35 -0700, Cliff White wrote:
> > > Ramiro Alba Queipo wrote:
> > > > By reading Chapter 15 of Lustre Operations Manual, it follows that an
> > > > MDT backup is only useful if you are changing hardware or the like.
> > > > I am afraid that you cannot expect to replace a failed MDT with a
> > > > previous image, as data in OSTs and MDT is not matching any more, right?
> > >
> > > If you do a backup/immediate restore, it should be fine. If you restore
> > > from an old image you will lose the changes made post-backup, but the
> > > rest of the data should be fine.
> > > cliffw
> >
> > Right - just like any backup, any changes made after the backup will of
> > course not be restored. One additional issue is that some OST objects
> > will not be available if they were deleted after the backup, even though
> > the restored MDS will still reference them. Accessing these files will
> > return -ENOENT.
> >
> > At that point it would be possible (though not necessary) to run "lfsck"
> > to clean up the inconsistencies between the MDT and OST filesystems.
> > It is also possible to just re-delete the files that have "-ENOENT" and
> > restore (from some other filesystem-level backup) the rest of the files.
> >
> > An MDS backup is a good idea, because it avoids having to restore 100TB+
> > (or whatever) of data from backup, leaving only a smaller number of changed
> > files that might need to be restored. It should NOT be the only form of
> > backup for the filesystem, since it does not contain any of the FILE data.
> > You, or your users, should do backups of their critical files separately.
> >
> > > > On Wed, 2009-06-17 at 09:41 -0600, Daniel Kulinski wrote:
> > > >> As we move forward with our lustre testing I am wondering about MDT
> > > >> backup.
> > > >>
> > > >> Is it feasible to unmount the MDT, create an image of it and remount
> > > >> it after the backup. Of course this wouldn’t happen but nightly.
> > > >>
> > > >> From what I can identify, in the case of an MDT failure we would have
> > > >> to do the following:
> > > >>
> > > >> Restore from the last backup.
> > > >>
> > > >> Run an lfsck across the filesystem.
> > > >>
> > > >> Am I missing anything else at this point? We will also be doing file
> > > >> level backups of the filesystem as a whole but we are looking for
> > > >> quick ways to recover from an MDT failure.
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Dan Kulinski
> >
> > Cheers, Andreas
>
> --
> Ramiro Alba
>
> Centre Tecnològic de Tranferència de Calor
> http://www.cttc.upc.edu
>
> Escola Tècnica Superior d'Enginyeries
> Industrial i Aeronàutica de Terrassa
> Colom 11, E-08222, Terrassa, Barcelona, Spain
> Tel: (+34) 93 739 86 46

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
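
For reference, the LVM snapshot approach (option 3 in the thread) might be scripted roughly as follows. This is only a sketch under stated assumptions, not a tested procedure: the volume group, logical volume, snapshot name, snapshot size, mount point and output directory are all hypothetical placeholders, and it assumes the MDT already lives on an LVM logical volume. The getfattr flags, the "--sparse" option and the 'ldiskfs' mount type are the ones mentioned in the thread; "lvcreate -s" and "lvremove -f" are the standard LVM commands for creating and dropping a snapshot. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: nightly MDT backup from an LVM snapshot.
# All names below are hypothetical placeholders, not values from the thread.
# Commands are only printed unless DRY_RUN=0 is set (run as root in that case).
DRY_RUN="${DRY_RUN:-1}"
VG="${VG:-vgmdt}"               # hypothetical volume group holding the MDT
LV="${LV:-mdt}"                 # hypothetical logical volume of the MDT
SNAP="${SNAP:-mdt_snap}"        # hypothetical snapshot name
SNAP_SIZE="${SNAP_SIZE:-2G}"    # assumed space for changes made during the backup
MNT="${MNT:-/mnt/mdt_snap}"     # hypothetical mount point for the snapshot
OUT="${OUT:-/backup/mdt}"       # hypothetical backup directory

run() {
    # Print the command line; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        sh -c "$*"
    fi
}

mdt_snapshot_backup() {
    # 1. Snapshot the MDT volume while Lustre stays online.
    run "lvcreate -s -L $SNAP_SIZE -n $SNAP /dev/$VG/$LV" || return 1
    # 2. Mount the snapshot as ldiskfs so the files are visible.
    run "mount -t ldiskfs /dev/$VG/$SNAP $MNT" || return 1
    # 3. Save the extended attributes (hex encoding, as preferred in the thread).
    run "cd $MNT && getfattr -R -d -m '.*' -e hex -P . > $OUT/ea.bak"
    # 4. Archive the files, handling sparse files efficiently.
    run "cd $MNT && tar czf $OUT/mdt.tgz --sparse ."
    # 5. Clean up: unmount and drop the snapshot to limit the performance hit.
    run "umount $MNT"
    run "lvremove -f /dev/$VG/$SNAP"
}
```

Calling mdt_snapshot_backup with the defaults prints the six commands in order; setting DRY_RUN=0 (and real values for VG, LV, MNT and OUT) would execute them. Dropping the snapshot at the end follows the point made above that keeping only a single snapshot limits the performance impact.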
