On Monday 18 March 2013 10:06:59 Amit Karpe did opine:

> Reply in-line:
> 
> On Mon, Mar 18, 2013 at 1:59 PM, Jon LaBadie <j...@jgcomp.com> wrote:
> > On Mon, Mar 18, 2013 at 12:55:40PM +0800, Amit Karpe wrote:
> > > Brian,
> > > Yes, initially I was using a holding disk in every config. But
> > > when I saw that the backup was slow and taking more time, I
> > > suspected it was because my backup & holding disk are on the same
> > > disk (a 19TB NAS). So I changed the config and forced amanda not
> > > to use it.
> > > 
> > > But now I realize it is required, so I will use it for my next
> > > amanda run.
> > > 
> > > Could someone explain the use of chunksize?
> > > Should it be big, like 10GB or 50GB?
> > > As I am using vtapes, all these temporary files are eventually
> > > going to be merged into one file. So why should we create these
> > > chunks? Why not let the dumper dump directly into the final
> > > vtape's slot directory?
> > 
> > The holding disk chunksize was added to overcome the 2GB max file
> > size limitation of some filesystems.  It is also useful if you
> > allocate multiple holding disks, some of which may not be big enough
> > for your large DLEs.
> > 
> > 
> > The parts of amanda doing the dumps do not peek to see where the
> > data will eventually be stored.  Taper does that part, and it is not
> > called until a DLE has been successfully dumped.  If you are going
> > direct to tape, taper receives the dump directly, but then you can
> > only do one DLE at a time.
> > 
> > Jon
> 
> So will using a chunksize of 10GB / 50GB / 100GB help amanda to run
> dumpers in parallel?

There are, generally speaking, several aspects of doing backups in 
parallel.

1. If you don't want your disk(s) to be thrashed by seeking, which of 
course carries a speed penalty, you must restrict the read operations to 
one file at a time per "disk spindle".  This is in the docs; man 
disklist, I believe.  A sketch follows.
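
As a hedged illustration only (the hostname, paths, and dumptype here 
are placeholders I made up, not anything from this thread), a disklist 
can give DLEs that live on the same physical disk the same spindle 
number, so amanda will not dump them at the same time:

    # hostname  diskname  dumptype       spindle
    # /home and /var/www share one physical disk -> same spindle
    client1     /home     comp-user-tar  1
    client1     /var/www  comp-user-tar  1
    # /opt is on its own disk, so it is free to dump in parallel
    client1     /opt      comp-user-tar  2

Entries sharing a spindle number are serialized against each other; 
entries on different spindles (or with no spindle given, the default) 
may be dumped concurrently.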

2. The chunk size is there to get around some filesystem limits that 
cause things to go agley when fast integer math in the filesystem falls 
over at >2GB on a single file.  It has little or nothing to do with 
speed, other than the overhead of breaking the dump up during the writes 
to the holding disk area, then splicing it back together as it's sent 
down the cable to the storage media.  IOW, it is there to keep your OS 
from trashing the file as it's being put in the holding area as a merged 
file from the directory tree specified in the disklist.  "Your" 
individual file is likely not the problem unless that file is a dvd 
image, but the merged output of tar or (spit) dump can easily be more 
than 2GB.  A sketch follows.
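
For instance, a minimal holdingdisk block in amanda.conf might look like 
this (the directory path and sizes are invented for illustration):

    holdingdisk hd1 {
        comment "main holding area"
        directory "/dumps/amanda"   # placeholder path
        use 200 Gb                  # how much of the disk amanda may use
        chunksize 1 Gb              # split each dump into <=1GB files,
                                    # safe for filesystems with a 2GB cap
    }

The chunks are purely an on-disk artifact of the holding area; taper 
splices them back into a single stream when the dump is flushed to the 
(v)tape.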

I have one directory in my /home that will almost certainly have to be 
backed up as separate entries, as I have debian-testing for 3 different 
architectures here.  That would be about 30 disklist entries all by 
itself, as there are 30 dvd images for the whole thing.

3. Parallelism would probably be helped, given sufficient data-moving 
bandwidth, if more than one holding disk area were allocated, with each 
allocation on a separate spindle/disk, so that the holding disk itself 
is not subjected to the same time-killing seek thrashing, IF you also 
have more than one storage drive being written in parallel.  If only one 
tape is mounted at a time in your setup, then once you've taken action 
against seek thrashing of the source disk(s), the next step is improving 
the bandwidth.  A sketch follows.
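
Something like the following amanda.conf sketch (the paths, sizes, and 
counts are assumptions, not a tested recipe) spreads the holding traffic 
over two spindles while letting several dumpers run at once:

    inparallel 4            # up to 4 dumpers running at the same time
    maxdumps 1              # max simultaneous dumps per single client

    holdingdisk hd1 {
        directory "/holding1"   # assumed to be its own physical disk
        use 200 Gb
        chunksize 10 Gb
    }
    holdingdisk hd2 {
        directory "/holding2"   # a second, separate spindle
        use 200 Gb
        chunksize 10 Gb
    }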

This last, however, may not be something that amanda has learned how to 
use effectively, as AFAIK there is no optional 'spindle' number in the 
holdingdisk entry to make that distinction.

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
My web page: <http://coyoteden.dyndns-free.com:85/gene> is up!
My views 
<http://www.armchairpatriot.com/What%20Has%20America%20Become.shtml>
"Engineering without management is art."
                -- Jeff Johnson
I was taught to respect my elders, but it's getting 
harder and harder to find any...
