Re: ?anyone using TSM to backup Panasus PanFS?

2010-02-01 Thread Josh Davis
You could attach the TSM server to the clustered filesystem and allow it to 
perform the backup from its own client.  This might get you better bandwidth if 
it's done properly.

MEMORYEFFICIENT YES will scan one directory at a time per producer
thread.  This keeps from eating up all system memory, at the expense of
some scan speed.  Memory usage is 400-1200 bytes per file in the list to be 
backed up.
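
As a rough illustration of the 400-1200 bytes per file figure above, the sketch below estimates the memory needed to hold a file list of a given size. The function names and the 50 million file count are made up for the example. With MEMORYEFFICIENT YES, the relevant count would presumably be the largest single directory per producer thread rather than the whole filesystem.

```python
def scan_memory_bytes(file_count, low_per_file=400, high_per_file=1200):
    """Return (low, high) estimated bytes needed to hold the file list."""
    return file_count * low_per_file, file_count * high_per_file


def to_gib(byte_count):
    """Convert bytes to GiB."""
    return byte_count / 2**30


if __name__ == "__main__":
    # Hypothetical file count, not taken from any real system.
    low, high = scan_memory_bytes(50_000_000)
    print(f"{to_gib(low):.1f} to {to_gib(high):.1f} GiB")
```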

RESOURCEUTIL 10 is the limit and gets you only 4-6 producers.  If you have good 
bandwidth, and a large number of execution threads, it may be beneficial to run 
parallel backups.

Parallel backups can be used to spread the workload across more threads
and/or more systems.  virtualnodename or asnode can be used if
necessary.
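
A rough sketch of the parallel-backup idea above: split the top-level directories across several dsmc sessions that all store data under one target node. The directory names, the node name PANFS_NODE and the helper functions are made up for illustration, and the -subdir and -asnodename options should be checked against the client documentation for your version. The script only builds the command lines; it does not run them.

```python
def partition(paths, workers):
    """Deal paths round-robin into `workers` buckets."""
    buckets = [[] for _ in range(workers)]
    for index, path in enumerate(sorted(paths)):
        buckets[index % workers].append(path)
    return buckets


def build_commands(paths, workers, target_node):
    """Return one dsmc argument list per non-empty bucket."""
    commands = []
    for bucket in partition(paths, workers):
        if not bucket:
            continue
        # Trailing slash plus -subdir=yes is intended to select the whole subtree.
        filespecs = [path.rstrip("/") + "/" for path in bucket]
        commands.append(
            ["dsmc", "incremental", *filespecs,
             "-subdir=yes", f"-asnodename={target_node}"]
        )
    return commands


if __name__ == "__main__":
    dirs = ["/panfs/projA", "/panfs/projB", "/panfs/projC"]  # hypothetical
    for command in build_commands(dirs, workers=2, target_node="PANFS_NODE"):
        print(" ".join(command))
```

Each command line could then be started from its own shell, scheduler entry or client system, depending on where the spare bandwidth is.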

VIRTUALMOUNTPOINT can be used to avoid scanning areas which should
not be backed up, rather than scanning them and then excluding them.

To prevent retransmits, you could perform backups of a replica or a snapshot.

Incremental by Date is valuable for your busy days.  It won't expire
files, but it saves a bunch of time scanning.
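
One way to apply this, sketched under assumptions: use -incrbydate on the busy days and a full incremental on the quiet days so that expiration still happens regularly. The choice of Monday to Friday as busy days and the function name are only examples.

```python
import datetime


def incremental_command(day, busy_weekdays=frozenset({0, 1, 2, 3, 4})):
    """Build the dsmc argument list for the given date.

    busy_weekdays uses datetime's numbering (Monday is 0).
    """
    command = ["dsmc", "incremental"]
    if day.weekday() in busy_weekdays:
        # Incremental by date: faster scan, but no expiration of deleted files.
        command.append("-incrbydate")
    return command


if __name__ == "__main__":
    print(" ".join(incremental_command(datetime.date.today())))
```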


Limitations for path/filename depth:

AIX, HP-UX, Solaris:
   File_space_name               1024
   Path_name or directory_name   1023
   File_name                      256

Linux:
   File_space_name               1024
   Path_name or directory_name    768
   File_name                      256

Windows XP/2000/2003:
   File_space_name               1024
   Path_name or directory_name    248
   File_name                      248

With friendly regards,
Josh-D. S. Davis
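
The path and file name limits listed above could be checked ahead of time with something like the following sketch. It assumes the limits are counted in characters and that the path limit applies to the directory part of the full path; both are guesses and should be confirmed for the client in use. The dictionary keys and function names are made up for the example.

```python
import os

# Limits from the table above: (directory path, file name).
LIMITS = {
    "aix_hpux_solaris": {"path": 1023, "file": 256},
    "linux": {"path": 768, "file": 256},
    "windows": {"path": 248, "file": 248},
}


def violations(full_path, platform="linux"):
    """Return a list of which limits ("path", "file") the path exceeds."""
    limit = LIMITS[platform]
    directory, name = os.path.split(full_path)
    problems = []
    if len(directory) > limit["path"]:
        problems.append("path")
    if len(name) > limit["file"]:
        problems.append("file")
    return problems


def scan(root, platform="linux"):
    """Walk root and yield (full_path, problems) for every offending file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            problems = violations(full_path, platform)
            if problems:
                yield full_path, problems


if __name__ == "__main__":
    for full_path, problems in scan("."):
        print(",".join(problems), full_path)
```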





From: James R Owen jim.o...@yale.edu
To: ADSM-L@VM.MARIST.EDU
Sent: Tue, January 26, 2010 5:09:53 PM
Subject: [ADSM-L] ?anyone using TSM to backup Panasus PanFS?

Yale uses Panasas PanFS, a massively parallel storage system, to store research 
data generated from HPC clusters.  In considering the feasibility of backing up PanFS 
using TSM, we are concerned about whether TSM is appropriate to back up and restore:

1. very large volumes,
2. deep subdirectory hierarchies with 100's to 1000's of sublevels,
3. large numbers of files within individual subdirectories,
4. much larger numbers of files within each directory hierarchy.

Are there effective maximum limits for any of the above, beyond which
TSM becomes inappropriate to effectively perform backups and restores?

Please advise about the feasibility and any configuration recommendation(s)
to maximize PanFS backup and restore efficiency using TSM.

Thanks for your help.
--
jim.o...@yale.edu   (w#203.432.6693, c#203.494.9201, h#203.387.3030)


Re: ?anyone using TSM to backup Panasus PanFS?

2010-01-27 Thread Skylar Thompson

Does PanFS have a tool like GPFS's tslistall? If it does, you can use
that to at least get a list of files to pass to TSM so TSM itself
doesn't have to do a scan. If you can get mtime information, you could
even figure out exactly which files have changed since the last backup
and just do a selective backup. Another thing you can do, if your data is
easily partitionable, is to set up a proxy node group and back up only part
of the filesystem on each node.
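
A minimal sketch of the mtime idea: collect the files modified since the last backup and write them to a list that could be handed to the client, for example with dsmc selective -filelist=/tmp/changed.txt. Note that os.walk here is only a stand-in and is itself a full scan; the point of a tool like tslistall would be to replace it with something faster. The function names, the list path and the 24-hour cutoff are assumptions for the example.

```python
import os
import time


def changed_since(root, last_backup_epoch):
    """Yield files under root whose mtime is newer than last_backup_epoch."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                if os.lstat(full_path).st_mtime > last_backup_epoch:
                    yield full_path
            except OSError:
                # File vanished or is unreadable; skip it.
                continue


def write_filelist(paths, list_path):
    """Write one path per line and return the number of entries written."""
    count = 0
    with open(list_path, "w", encoding="utf-8") as handle:
        for path in paths:
            handle.write(path + "\n")
            count += 1
    return count


if __name__ == "__main__":
    cutoff = time.time() - 24 * 3600  # hypothetical: last backup was 24 hours ago
    total = write_filelist(changed_since(".", cutoff), "changed.txt")
    print(total, "files listed")
```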


--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S048, (206)-685-7354
-- University of Washington School of Medicine


Re: ?anyone using TSM to backup Panasus PanFS?

2010-01-27 Thread Evans, Bill
We use TSM to back up our primary research data server.  It's a Sun SPARC
server with 150TB of data:  51M files, 2M directories, 1+TB / day rate
of change.  It takes ~6-8 hours to run the backup on this.  As you can
expect, most of the time is spent scanning the filesystems to find changes.
The filesystem is Veritas VxFS.

Since the scan/backup time is well within our window of opportunity, it
is not a big deal.  As we grow, we will probably add capacity with more
drives and more server capacity (faster/more procs/more ram/etc).  TSM
keeps up with this very well.  I would really *hate* to ever have to run
a full backup on this beast.
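
For a back-of-the-envelope comparison, the numbers above (51M files plus 2M directories in roughly 6-8 hours) can be turned into an objects-per-second rate and scaled to another filesystem. This assumes the whole window is spent scanning and that the rate scales linearly, neither of which is guaranteed; the function names and the tenfold example are made up.

```python
def scan_rate(objects, hours):
    """Objects (files plus directories) processed per second."""
    return objects / (hours * 3600)


def hours_for(objects, rate):
    """Hours needed to process `objects` at `rate` objects per second."""
    return objects / rate / 3600


if __name__ == "__main__":
    observed = 51_000_000 + 2_000_000
    slow, fast = scan_rate(observed, 8), scan_rate(observed, 6)
    print(f"{slow:.0f} to {fast:.0f} objects/second")
    # Hypothetical filesystem with ten times as many objects.
    print(f"{hours_for(10 * observed, fast):.0f} to "
          f"{hours_for(10 * observed, slow):.0f} hours")
```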


Thanks,

Bill Evans
Research Computing Support
FRED HUTCHINSON CANCER RESEARCH CENTER


?anyone using TSM to backup Panasus PanFS?

2010-01-26 Thread James R Owen

Yale uses Panasas PanFS, a massively parallel storage system, to store research 
data generated from HPC clusters.  In considering the feasibility of backing up PanFS 
using TSM, we are concerned about whether TSM is appropriate to back up and restore:

 1. very large volumes,
 2. deep subdirectory hierarchies with 100's to 1000's of sublevels,
 3. large numbers of files within individual subdirectories,
 4. much larger numbers of files within each directory hierarchy.

Are there effective maximum limits for any of the above, beyond which
TSM becomes inappropriate to effectively perform backups and restores?

Please advise about the feasibility and any configuration recommendation(s)
to maximize PanFS backup and restore efficiency using TSM.

Thanks for your help.
--
jim.o...@yale.edu   (w#203.432.6693, c#203.494.9201, h#203.387.3030)


Re: ?anyone using TSM to backup Panasus PanFS?

2010-01-26 Thread Harry Redl
Hi,

I do not think this is a task for any normal backup solution (not to mention 
that PanFS is possibly not supported by any). With these specifications you may 
easily exceed any limit on filename/path length that the backup software has. 
Scanning the filesystem can take hours (if only!) before a single byte is 
transferred; a huge number of files is a killer for any backup solution. What 
are your RPO and RTO? What is the purpose of the backup, and what granularity 
is required?
Without knowing anything more about your environment, it seems to me that 
replication (possibly synchronous) between two sites and volume block-level 
backup (what does VERY LARGE mean?) is what you will end up with, but crystal 
balls are scarce these days :)

Harry




