Weird illustration of peculiar interactions :-}

2001-01-14 Thread David Wolfskill

So a few weeks ago, I finally got around to implementing (as recorded
either here in amanda-users or in -hackers; I forget) a separate
"Offsite" configuration that runs on Sundays, and does a full backup of
everything (with "no-record").  Since I had been able to (just barely!)
run amanda with a dumpcycle of 2 days, I figured that I should normally
be able to get the full ("Offsite") backup in 2 tapes... so I defined a
runtapes value of 3.

Well, as it happened, one of the file systems I back up is hovering right
around 12 GB.  And as luck would have it, amanda would typically get about
30 - 35 GB on the first tape before this file system was ready to be taped.
Then taper would try to write this backup image to tape, and eventually
fail, only to re-start on the next tape (all as expected).  But that would
leave a bunch of wasted space on the end of that first tape, with the
result that the Offsite backups would actually require all 3 tapes.

Friday, I added another system to the set of backups -- one that is on
our perimeter net (and thus, subject to firewall restrictions).  I was
able to use the "Firewall" entry in the FAQ-o-matic (thank you!), and
then today, for the Offsite backup -- as I half-suspected -- the timing
was such that the 12 GB file system wasn't ready to be taped until amanda
had switched to the second tape already.

So now, because I added another bunch of file systems (including one that
is 3.5 GB used), the Offsite backup only used 2 tapes.
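
A minimal sketch of the effect described above, in Python. It assumes a taper that writes each dump whole, in the order the dumps become ready, and starts a new tape whenever the next dump does not fit. The 40000 MB capacity and every size except the 12 GB and 3.5 GB file systems are made-up values chosen only to reproduce the behaviour; they are not taken from the real configuration.

```python
def tapes_used(dump_sizes_mb, tape_capacity_mb):
    """Count tapes when dumps are written whole, in arrival order.

    A dump that does not fit in the space left on the current tape is
    restarted on the next tape, and the leftover space is wasted.
    """
    tapes = 1
    used = 0
    for size in dump_sizes_mb:
        if size > tape_capacity_mb:
            raise ValueError("a dump larger than one tape cannot be written whole")
        if used + size > tape_capacity_mb:
            tapes += 1   # end of tape reached: move on to the next one
            used = 0
        used += size
    return tapes


CAPACITY_MB = 40000  # hypothetical tape capacity

# Hypothetical "before" order: the 12 GB dump is ready while tape 1 is nearly full.
before = [29000, 12000, 22000, 7000]

# Hypothetical "after" order: two extra dumps fill tape 1 first, and the
# 12 GB dump only becomes ready once the second tape is in use.
after = [29000, 3500, 7000, 22000, 12000]

print(tapes_used(before, CAPACITY_MB))  # 3
print(tapes_used(after, CAPACITY_MB))   # 2
```

With these numbers the larger data set needs fewer tapes purely because of the order in which the dumps reach the taper.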

Go figure. :-)

(Tape chunking should address this rather nicely, of course)

Anyway, I thought that some folks might find that vaguely amusing:

>These dumps were to tapes E15, E16.
>The next 3 tapes Amanda expects to used are: E17, E18, E19.
>
>
>STATISTICS:
>                            Total       Full      Daily
>
>Estimate Time (hrs:min)      0:36
>Run Time (hrs:min)          15:25
>Dump Time (hrs:min)        213:22     213:22       0:00
>Output Size (meg)         74162.4    74162.4        0.0
>Original Size (meg)      141433.4   141433.4        0.0
>Avg Compressed Size (%)      44.3       44.3        --
>Filesystems Dumped            150        150          0
>Avg Dump Rate (k/s)          98.9       98.9        --
>
>Tape Time (hrs:min)          3:49       3:49       0:00
>Tape Size (meg)           74167.1    74167.1        0.0
>Tape Used (%)               211.9      211.9        0.0
>Filesystems Taped             150        150          0
>Avg Tp Write Rate (k/s)    5517.6     5517.6        --

Cheers,
david
-- 
David Wolfskill  [EMAIL PROTECTED]   UNIX System Administrator
Desk: 650/577-7158   TIE: 8/499-7158   Cell: 650/759-0823

I need help: http://www.whistle.com/employment/employ-engg.html#K030391


>From [EMAIL PROTECTED] Sun Jan 14 17:20:37 2001
>Date: Sun, 14 Jan 2001 17:18:37 -0800 (PST)
>From: System Operator <[EMAIL PROTECTED]>
>To: [EMAIL PROTECTED]
>Subject: Whistle Communications AMANDA MAIL REPORT FOR January 14, 2001
>



Re: Weird illustration of peculiar interactions :-}

2001-01-14 Thread Martin Apel

On Sun, 14 Jan 2001, David Wolfskill wrote:

> Well, as it happened, one of the file systems I back up is hovering right
> around 12 GB.  And as luck would have it, amanda would typically get about
> 30 - 35 GB on the first tape before this file system was ready to be taped.
> Then taper would try to write this backup image to tape, and eventually
> fail, only to re-start on the next tape (all as expected).  But that would
> leave a bunch of wasted space on the end of that first tape, with the
> result that the Offsite backups would actually require all 3 tapes.

I implemented some changes in the driver that cause it to gather dumps
until a certain threshold is reached. Afterwards it will always write
the biggest dump still fitting on the tape. This works quite nicely for me
and improves tape utilization a lot. Unfortunately it also increases the
total dump time a bit, if your tape is slow.
I haven't released it yet, because I implemented it in Amanda 2.4.1p1
and didn't come around to porting it to 2.4.2.
But if you like I can post the patches for 2.4.1p1.
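
The selection rule described above could be sketched roughly as follows, in Python. This is not the actual patch: the function names are hypothetical, and it is an assumption that the "threshold" refers to the total size of finished dumps waiting on the holding disk.

```python
def gathered_enough(ready_mb, threshold_mb):
    """True once the finished dumps waiting for tape reach the threshold.

    Assumption: the threshold is compared against the combined size of
    the dumps currently waiting.
    """
    return sum(ready_mb) >= threshold_mb


def biggest_fitting(ready_mb, tape_left_mb):
    """Return the size of the largest waiting dump that still fits on
    the current tape, or None if none of them fits."""
    fitting = [size for size in ready_mb if size <= tape_left_mb]
    return max(fitting) if fitting else None


# Hypothetical state: three dumps waiting, 10000 MB left on the tape.
ready = [12000, 3500, 7000]
if gathered_enough(ready, threshold_mb=20000):
    print(biggest_fitting(ready, tape_left_mb=10000))  # 7000
```

When `biggest_fitting` returns None, a driver would have to either wait for a smaller dump to finish or move on to the next tape; which of the two the patch does is not stated here.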

Greetings,

Martin


Martin Apel, Dipl.-Inform.          t e c m a t h  A G
Group Manager Software Development  Human Solutions Division
phone +49 (0)6301 606-300           Sauerwiesen 2, 67661 Kaiserslautern
fax   +49 (0)6301 606-309           Germany
[EMAIL PROTECTED]                   http://www.tecmath.com





Re: Weird illustration of peculiar interactions :-}

2001-01-17 Thread Joi Ellis

On Mon, 15 Jan 2001, Martin Apel wrote:

>I implemented some changes in the driver that cause it to gather dumps
>until a certain threshold is reached. Afterwards it will always write
>the biggest dump still fitting on the tape. This works quite nicely for me
>and improves tape utilization a lot. Unfortunately it also increases the
>total dump time a bit, if your tape is slow.
>I haven't released it yet, because I implemented it in Amanda 2.4.1p1
>and didn't come around to porting it to 2.4.2.
>But if you like I can post the patches for 2.4.1p1.

I have a perl script which will go through my holding disk and spit
out a list of backup sets to select to best pack tapes.

here's an example:

[amanda@joi amanda]$ pack -C OffSite
138530/140906 (   98%)
/home/amanda/mnt/holdingdisk/OffSite/20010106
/home/amanda/mnt/holdingdisk/OffSite/20010114

116784/140906 (   82%)
/home/amanda/mnt/holdingdisk/OffSite/20010107
/home/amanda/mnt/holdingdisk/OffSite/20010108

114088/140906 (   80%)
/home/amanda/mnt/holdingdisk/OffSite/20010111

99327/140906 (   70%)
/home/amanda/mnt/holdingdisk/OffSite/20010112

Nothing left to pack!

I amflush these to tapes to send to offsite storage.
I've already done two tapes from this batch, I have four
left to do.
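
For illustration only, here is one way such a packing list could be computed, sketched in Python rather than Perl. It is not the script shown above: the function names, directory names, sizes and capacity are all hypothetical. For each tape it tries every combination of the remaining directories and keeps the one that comes closest to the tape capacity without exceeding it, which is only practical for a handful of directories.

```python
from itertools import combinations


def best_subset(sizes, capacity):
    """Return (total, names) for the subset that fills the tape best."""
    best_total, best_names = 0, ()
    names = sorted(sizes)
    for count in range(1, len(names) + 1):
        for combo in combinations(names, count):
            total = sum(sizes[name] for name in combo)
            if best_total < total <= capacity:
                best_total, best_names = total, combo
    return best_total, best_names


def pack(sizes, capacity):
    """Plan tapes one at a time, each with the best-filling subset."""
    remaining = dict(sizes)
    plan = []
    while remaining:
        total, names = best_subset(remaining, capacity)
        if not names:
            break  # nothing that is left fits on a single tape
        plan.append((total, list(names)))
        for name in names:
            del remaining[name]
    return plan


# Hypothetical holding-disk directories and sizes, tape capacity 100.
sizes = {"d1": 60, "d2": 38, "d3": 55, "d4": 30, "d5": 80}
for total, names in pack(sizes, 100):
    print(f"{total}/100 {names}")
```

Filling each tape as well as possible in turn is a greedy choice; it does not guarantee the fewest tapes overall.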

-- 
Joi Ellis
[EMAIL PROTECTED], http://www.visi.com/~gyles19/




Re: Weird illustration of peculiar interactions :-}

2001-01-17 Thread Martin Apel

On Wed, 17 Jan 2001, Joi Ellis wrote:

> I have a perl script which will go through my holding disk and spit
> out a list of backup sets to select to best pack tapes.
> 
> here's an example:
> 
> [amanda@joi amanda]$ pack -C OffSite
> 138530/140906 (   98%)
> /home/amanda/mnt/holdingdisk/OffSite/20010106
> /home/amanda/mnt/holdingdisk/OffSite/20010114
> 
> 116784/140906 (   82%)
> /home/amanda/mnt/holdingdisk/OffSite/20010107
> /home/amanda/mnt/holdingdisk/OffSite/20010108
> 
> 114088/140906 (   80%)
> /home/amanda/mnt/holdingdisk/OffSite/20010111
> 
> 99327/140906 (   70%)
> /home/amanda/mnt/holdingdisk/OffSite/20010112
> 
> Nothing left to pack!
> 
> I amflush these to tapes to send to offsite storage.
> I've already done two tapes from this batch, I have four
> left to do.

That's a nice idea, but I have more data to back up than fits on the
holding disk, so I have to flush some dumps to tape in order to dump
all filesystems completely.

Martin


Martin Apel, Dipl.-Inform.          t e c m a t h  A G
Group Manager Software Development  Human Solutions Division
phone +49 (0)6301 606-300           Sauerwiesen 2, 67661 Kaiserslautern
fax   +49 (0)6301 606-309           Germany
[EMAIL PROTECTED]                   http://www.tecmath.com





Re: Weird illustration of peculiar interactions :-}

2001-01-18 Thread Joi Ellis

On Thu, 18 Jan 2001, Martin Apel wrote:

>On Wed, 17 Jan 2001, Joi Ellis wrote:
>
>> 
>> I amflush these to tapes to send to offsite storage.
>> I've already done two tapes from this batch, I have four
>> left to do.
>
>That's a nice idea, but I have more data to back up than fits on the
>holding disk, so I have to flush some dumps to tape in order to dump
>all filesystems completely.
>
>Martin

So?  I was trying to point out that simply selecting the biggest
dump may not give you the best packing.  Often, the few tapes contain
four or five smaller dumps and can obtain a 99.8% usage rate.

The total amount of data to be backed up has no effect, since you can
only flush what's on the holding disk.

-- 
Joi Ellis
[EMAIL PROTECTED], http://www.visi.com/~gyles19/




Re: Weird illustration of peculiar interactions :-}

2001-01-18 Thread Martin Apel

On Thu, 18 Jan 2001, Joi Ellis wrote:

> On Thu, 18 Jan 2001, Martin Apel wrote:
> 
> >On Wed, 17 Jan 2001, Joi Ellis wrote:
> >
> >> 
> >> I amflush these to tapes to send to offsite storage.
> >> I've already done two tapes from this batch, I have four
> >> left to do.
> >
> >That's a nice idea, but I have more data to back up than fits on the
> >holding disk, so I have to flush some dumps to tape in order to dump
> >all filesystems completely.
> >
> >Martin
> 
> So?  I was trying to point out that simply selecting the biggest
> dump may not give you the best packing.  Often, the few tapes contain
> four or five smaller dumps and can obtain a 99.8% usage rate.
> 
> The total amount of data to be backed up has no effect, since you can
> only flush what's on the holding disk.

Yes, you are right. You might achieve a better packing by a more intelligent
algorithm. But after you have written out the first dump, things might already
have changed, because another dump has finished in the meantime. I have used
this scheme for a few months now and it gives me usage ratios close to 100%
for all but the last tape.
But this only works if you have a good mix of file system sizes.

Greetings,

Martin




Re: Weird illustration of peculiar interactions :-}

2001-01-18 Thread Chris Karakas

Martin Apel wrote:
> 
> >
> > So?  I was trying to point out that simply selecting the biggest
> > dump may not give you the best packing.  Often, the few tapes contain
> > four or five smaller dumps and can obtain a 99.8% usage rate.
> >
...

> Yes, you are right. You might achieve a better packing by a more intelligent
> algorithm. 

I don't know if you noticed it, but you are talking about the famous "bin
packing problem" in combinatorics. Just search the web for "bin packing"
and you will find quite a few algorithms and further literature on this
vast subject (even AMANDA uses one, according to some old papers). 
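
As one example of what such a search turns up, here is a sketch of the textbook "first-fit decreasing" heuristic in Python. It is shown only as a well-known bin packing heuristic, not as the algorithm AMANDA uses; the sizes and the capacity of 40 are made up.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack sizes into bins: largest item first, each item into the
    first bin that still has room, opening a new bin if none does."""
    bins = []
    for size in sorted(sizes, reverse=True):
        if size > capacity:
            raise ValueError("item larger than a bin")
        for contents in bins:
            if sum(contents) + size <= capacity:
                contents.append(size)
                break
        else:
            bins.append([size])  # no existing bin had room
    return bins


print(first_fit_decreasing([29, 12, 22, 7], 40))  # [[29, 7], [22, 12]]
```

Note that this needs all sizes up front, so it fits flushing from a full holding disk better than taping dumps as they finish.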

-- 
Regards

Chris Karakas
Don't waste your cpu time - crack rc5: http://www.distributed.net



Re: Weird illustration of peculiar interactions :-}

2001-01-18 Thread Martin Apel

On Fri, 19 Jan 2001, Chris Karakas wrote:

> Martin Apel wrote:
> > 
> > >
> > > So?  I was trying to point out that simply selecting the biggest
> > > dump may not give you the best packing.  Often, the few tapes contain
> > > four or five smaller dumps and can obtain a 99.8% usage rate.
> > >
> ...
> 
> > Yes, you are right. You might achieve a better packing by a more intelligent
> > algorithm. 
> 
> I don't know if you noticed it, but you are talking about the famous "bin
> packing problem" in combinatorics. Just search the web for "bin packing"
> and you will find quite a few algorithms and further literature on this
> vast subject (even AMANDA uses one, according to some old papers). 

Yes, I knew that this is a known problem. But most of these algorithms
are designed to work with full knowledge, i.e. they assume they know
the sizes of all dumps in advance. In Amanda's case this is not true, because
after you have written out the first dump you can make a new decision with
more information. And since I have had rather good experience with my simple
approach, there's not really a need for a better algorithm.

Regards,

Martin


Martin Apel, Dipl.-Inform.          t e c m a t h  A G
Group Manager Software Development  Human Solutions Division
phone +49 (0)6301 606-300           Sauerwiesen 2, 67661 Kaiserslautern
fax   +49 (0)6301 606-309           Germany
[EMAIL PROTECTED]                   http://www.tecmath.com