Re: Parallel dumps of a single file system?

2006-05-23 Thread Ross Vandegrift
On Tue, May 23, 2006 at 11:04:44AM -0400, Jon LaBadie wrote:
> Recently I had a look at amplot results for my new vtape setup.
> One thing it showed was that for 2/3 of the time, only one of the
> default four dumpers was active.

This is a good point.  amplot is awesome for checking out what kind of
stuff is slowing down your backups!  Also check the output at the end
of amstatus when a run is finished.  It'll give you a summary of the
same information.  But there's nothing like a cool graph!
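
(For the record, something along these lines is all it takes -- the
config name and log paths below are just examples, adjust them to your
own setup:

    amplot /var/log/amanda/DailySet1/amdump.1
    amstatus DailySet1 --file /var/log/amanda/DailySet1/amdump.1 --summary

amplot wants the amdump file from the run you care about, and --summary
prints just the condensed stats you'd otherwise see at the end of the
full amstatus output.)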

As for the original poster's question: I think you should try it
out.  Whether it's a performance win or loss is going to depend
heavily on how the data has ended up across those disks.

Your RAID5 performance is always dominated by the time it takes to
seek for data.  If all n disks can just stream for a while, you
get full streaming performance from the disks.  But if even one of
them needs to seek to find its blocks, you're going to have to wait
until that disk finishes.

This makes me think that in most cases, dumping a big RAID5 in
parallel would hurt performance.  However, if your array is old, it
may be highly fragmented.  The extra I/O requests might be smoothed
over by an elevator algorithm somewhere, and you might fit more data
into the same time...

I'd say it calls for an experiment.

--
Ross Vandegrift
[EMAIL PROTECTED]

"The good Christian should beware of mathematicians, and all those who
make empty prophecies. The danger already exists that the mathematicians
have made a covenant with the devil to darken the spirit and to confine
man in the bonds of Hell."
--St. Augustine, De Genesi ad Litteram, Book II, xviii, 37


Re: Parallel dumps of a single file system?

2006-05-23 Thread Jon LaBadie
On Tue, May 23, 2006 at 10:39:09AM -0400, Paul Lussier wrote:
> On 5/23/06, Andreas Hallmann <[EMAIL PROTECTED]> wrote:
> 
> >Since in the RAID, blocks are spread sequentially (w.r.t. the file) among
> >most (RAID5) of the available platters, it will behave more like a single
> >spindle with more platters.
> >
> >> So, my question is this: Am I doing the right thing by dumping these
> >> DLEs serially, or can I dump them in parallel?
> >
> >Dumping these DLEs sequentially is your only option to keep spindle
> >movement low. So you're doing it the way I would do it.
> >Anything else should reduce your throughput.
> 
> 
> I'm looking for ways to speed up my daily incremental backups.  We may well
> be purchasing a new RAID array in the near future, which may allow me to
> migrate the existing data to it and split it up into multiple file systems,
> then go back and re-format the old one.

This is something you could easily try in stages.  For example,
suppose you have 8 DLEs defined on that RAID, all as spindle "1".
Redefine two to four of them as spindle "2".  Also make sure your client
is not dumper-limited.  If you see significant improvement,
define a few more DLEs as spindle "3", and so on.
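
In disklist terms, something like this (the hostname, paths, and
dumptype below are only placeholders for your own entries):

    # spindle is the last field; DLEs sharing a spindle number are
    # never dumped at the same time, while DLEs on different spindle
    # numbers may be dumped in parallel (maxdumps permitting)
    fileserver  /raid/proj-a  comp-user-tar  1
    fileserver  /raid/proj-b  comp-user-tar  1
    fileserver  /raid/proj-c  comp-user-tar  1
    fileserver  /raid/proj-d  comp-user-tar  1
    fileserver  /raid/proj-e  comp-user-tar  2
    fileserver  /raid/proj-f  comp-user-tar  2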

Recently I had a look at amplot results for my new vtape setup.
One thing it showed was that for 2/3 of the time, only one of the
default four dumpers was active.  I changed the default number of
dumpers to six and changed the number of simultaneous dumps per
client from one to two.  My total backup time dropped from over
four hours to one and a half hours.
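
In amanda.conf terms, those two changes were roughly (values shown are
the ones I ended up with; check the defaults in your own config):

    inparallel 6    # total number of dumpers the server may run (default 4)
    maxdumps 2      # simultaneous dumps per client (default 1)

maxdumps can also be set in a dumptype if you only want it raised for
some clients.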

-- 
Jon H. LaBadie  [EMAIL PROTECTED]
 JG Computing
 4455 Province Line Road(609) 252-0159
 Princeton, NJ  08540-4322  (609) 683-7220 (fax)


Re: Parallel dumps of a single file system?

2006-05-23 Thread Paul Lussier
On 5/23/06, Andreas Hallmann <[EMAIL PROTECTED]> wrote:
> Since in the RAID, blocks are spread sequentially (w.r.t. the file) among
> most (RAID5) of the available platters, it will behave more like a single
> spindle with more platters.
>
> > So, my question is this: Am I doing the right thing by dumping these
> > DLEs serially, or can I dump them in parallel?
>
> Dumping these DLEs sequentially is your only option to keep spindle
> movement low. So you're doing it the way I would do it.
> Anything else should reduce your throughput.

Does that imply that if this RAID set were split into multiple file
systems, I'd still be better off dumping them one at a time?

I'm looking for ways to speed up my daily incremental backups.  We may
well be purchasing a new RAID array in the near future, which may allow
me to migrate the existing data to it and split it up into multiple
file systems, then go back and re-format the old one.

Thanks,
--
Paul


Re: Parallel dumps of a single file system?

2006-05-23 Thread Andreas Hallmann

Paul Lussier wrote:


> Hi all,
>
> I have a 1 TB RAID5 array which I inherited.  The previous admin
> configured it to be a single file system as well.  The disklist I have
> set up currently splits this file system up into multiple DLEs for
> backup purposes and dumps them using gtar.
>
> In the past, on systems with multiple partitions, I would configure
> all file systems on different physical drives to be backed up in
> parallel.  Since this system has but 1 file system, I've been backing
> them up one at a time.
>
> But since this is a RAID array, it really wouldn't matter whether this
> were many file systems or 1, since everything is RAIDed out across all
> disks, would it?


There is nothing to RAID out. Avoiding spindle movement is the key to
both longer disk lifetime and performance.
Since in the RAID, blocks are spread sequentially (w.r.t. the file) among
most (RAID5) of the available platters, it will behave more like a single
spindle with more platters.


> So, my question is this: Am I doing the right thing by dumping these
> DLEs serially, or can I dump them in parallel?


Dumping these DLEs sequentially is your only option to keep spindle
movement low. So you're doing it the way I would do it.

Anything else should reduce your throughput.
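
Keeping them serial is just a matter of giving every DLE on that array
the same spindle number in the disklist, for example (the host, paths,
and dumptype here are only illustrative):

    fileserver  /raid/part1  comp-user-tar  1
    fileserver  /raid/part2  comp-user-tar  1
    fileserver  /raid/part3  comp-user-tar  1

Amanda never dumps two DLEs with the same spindle number on the same
host at the same time, so they run strictly one after the other.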

Andreas