Re: ?anyone using TSM to backup Panasus PanFS?
You could attach the TSM server to the clustered filesystem and let it perform the backup with its own client. This might get you better bandwidth if it's done properly.

MEMORYEFFICIENTBACKUP YES scans one directory at a time per producer thread. This keeps the client from eating up all system memory, at the expense of some scan speed. Memory usage is 400-1200 bytes per file in the list to be backed up.

RESOURCEUTILIZATION 10 is the limit and gets you only 4-6 producers. If you have good bandwidth and a large number of execution threads, it may be beneficial to run parallel backups. Parallel backups can spread the workload across more threads and/or more systems. VIRTUALNODENAME or ASNODENAME can be used if necessary.

VIRTUALMOUNTPOINT can be used to avoid scanning areas which should not be backed up, rather than excluding those areas.

To prevent retransmits, you could back up a replica or a snapshot.

Incremental-by-date is valuable for your busy days. It won't expire files, but it saves a bunch of scanning time.

Limitations for path/filename depth:

Platform                 File_space_name  Path_name or directory_name  File_name
AIX, HP-UX, Solaris      1024             1023                         256
Linux                    1024             768                          256
Windows XP/2000/2003     1024             248                          248

With friendly regards,
Josh-D. S. Davis
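The memory figure and the path limits quoted in this reply can be checked against a real tree before committing to a design. The following is only a rough sketch in Python: the function names are hypothetical, the 400-1200 bytes per file range is taken straight from the message, and measuring lengths with len() on the full directory path (characters, not bytes, and including the mount point) is an assumption that may not match how the client counts.

```python
import os

# Path and file name limits as quoted in the message above.
# Assumption: lengths are compared in characters against the full path.
LIMITS = {
    "aix_hpux_solaris": {"path": 1023, "name": 256},
    "linux": {"path": 768, "name": 256},
    "windows": {"path": 248, "name": 248},
}


def estimate_scan_memory(file_count, bytes_per_file=(400, 1200)):
    """Return the (low, high) memory estimate in bytes for a file list."""
    low, high = bytes_per_file
    return file_count * low, file_count * high


def find_over_limit(root, limits):
    """Yield directories and files under root that exceed the given limits."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if len(dirpath) > limits["path"]:
            yield dirpath
        for name in filenames:
            if len(name) > limits["name"]:
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    # Hypothetical file count, purely for illustration.
    low, high = estimate_scan_memory(100_000_000)
    print(f"estimated scan memory: {low / 1e9:.1f} to {high / 1e9:.1f} GB")
```

Running find_over_limit(root, LIMITS["linux"]) over a sample of the deepest project directories would show whether the 100s to 1000s of sublevels actually push paths past the quoted limits.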
Re: ?anyone using TSM to backup Panasus PanFS?
Does PanFS have a tool like GPFS's tslistall? If it does, you can use that to at least get a list of files to pass to TSM, so TSM itself doesn't have to do a scan. If you can get mtime information, you could even figure out exactly which files have changed since the last backup and just do a selective backup.

Another thing you can do, if your data is easily partitionable, is to set up a proxy node group and back up only part of the filesystem on each node.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S048, (206)-685-7354
-- University of Washington School of Medicine
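The two ideas in this reply, a change list built from mtime and a split of the tree across nodes, can be sketched in Python as follows. This is an illustration under stated assumptions, not a tested procedure: the function names are hypothetical, os.walk stands in for whatever native listing tool PanFS may offer (so this version still pays for a tree walk, just outside TSM), and wrapping each path in double quotes is an assumption about the file list format that should be checked against the client documentation. An mtime comparison also says nothing about deleted or renamed files.

```python
import os


def changed_since(root, last_backup_epoch):
    """Yield files under root whose mtime is newer than the last backup."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished between listing and stat
            if st.st_mtime > last_backup_epoch:
                yield path


def write_filelist(paths, out_path):
    """Write one quoted path per line and return the number written."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as fh:
        for path in paths:
            fh.write(f'"{path}"\n')
            count += 1
    return count


def partition(items, n_nodes):
    """Deal items (e.g. top-level directories) round-robin to n_nodes."""
    buckets = [[] for _ in range(n_nodes)]
    for i, item in enumerate(sorted(items)):
        buckets[i % n_nodes].append(item)
    return buckets
```

The resulting file could then be handed to the client with something like dsmc selective -filelist=/path/to/list, and each bucket from partition() assigned to one member of the proxy node group.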
Re: ?anyone using TSM to backup Panasus PanFS?
We use TSM to back up our primary research data server. It's a Sun SPARC server with 150TB of data: 51M files, 2M directories, 1+TB/day rate of change. It takes ~6-8 hours to run the backup on this. As you can expect, most of the time is spent scanning the filesystems to find changes. The filesystem is Veritas VxFS.

Since the scan/backup time is well within our window of opportunity, it is not a big deal. As we grow, we will probably add capacity with more drives and more server capacity (faster/more procs/more RAM/etc). TSM keeps up with this very well.

I would really *hate* to ever have to run a full backup on this beast.

Thanks,
Bill Evans
Research Computing Support
FRED HUTCHINSON CANCER RESEARCH CENTER
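For comparison with another site, the figures in this message translate into rough rates. The short Python calculation below assumes that the whole 6-8 hour window covers both the scan of all 51M files and the transfer of the daily change, and that 1 TB means 10^12 bytes; both are simplifications, since the message does not break the window down.

```python
def scan_rate(files, hours):
    """Average files examined per second over the backup window."""
    return files / (hours * 3600)


def transfer_rate_mb_s(bytes_changed, hours):
    """Average MB/s if the daily change is spread over the whole window."""
    return bytes_changed / (hours * 3600) / 1e6


FILES = 51_000_000      # from the message
DAILY_CHANGE = 1e12     # "1+TB / day", taken as 10**12 bytes (assumption)

for hours in (6, 8):
    print(
        f"{hours} h window: "
        f"{scan_rate(FILES, hours):,.0f} files/s, "
        f"{transfer_rate_mb_s(DAILY_CHANGE, hours):.1f} MB/s"
    )
```

Dividing a planned PanFS file count by a files-per-second figure in this range gives a first guess at the scan window, to be corrected once real measurements exist.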
?anyone using TSM to backup Panasus PanFS?
Yale uses Panasas PanFS, a massively parallel storage system, to store research data generated from HPC clusters. In considering the feasibility of backing up PanFS using TSM, we are concerned about whether TSM is appropriate to back up and restore:

1. very large volumes,
2. deep subdirectory hierarchy with 100's to 1000's of sublevels,
3. large numbers of files within individual subdirectories,
4. much larger numbers of files within each directory hierarchy.

Are there effective maximum limits for any of the above, beyond which TSM becomes inappropriate to effectively perform backups and restores?

Please advise about the feasibility and any configuration recommendation(s) to maximize PanFS backup and restore efficiency using TSM.

Thanks for your help.
--
jim.o...@yale.edu (w#203.432.6693, c#203.494.9201, h#203.387.3030)
Re: ?anyone using TSM to backup Panasus PanFS?
Hi,

I do not think this is a task for any normal backup solution (not to mention that PanFS is possibly not supported by any). With these specifications you may easily exceed whatever filename/path length limits the backup software has. Scanning the filesystem can take hours (if only!) before a single byte is transferred; a huge number of files is a killer for any backup solution.

What are your RPO and RTO? What is the purpose of the backup, and what granularity is required?

Without knowing anything more about your environment, it seems to me that replication (possibly synchronous) between two sites plus volume block-level backup (what does VERY LARGE mean?) is what you will end up with.

Crystal balls are scarce these days :)

Harry