Gene, Stefan, Eric,

Well, that was frustrating: re-routed my network cable, hit the power switch on the strip, and lost my editing session (never edit files in /tmp; stupid, stupid, stupid).
I'm going to try to coalesce the threads here. I think I'm handling Eric's, Gene's and then Stefan's emails in order, but I wouldn't bet all that much that I haven't messed things up a little, and actually there is a good amount of cross-over between all of the emails.

Thanks for the info on how dumporder is handled.

I unfortunately don't have the resources to create new holding areas out of old disks. I'm part of the computer core and don't actually own any of the equipment that I'm working with (well, a small bit of it, but nothing related to the Samar system). This, by the way, is an SGI/IRIX Origin 300 multiprocessor (4x 500 MHz IP35 processors, for what it's worth). There is an internal PCI bus, but the internal SCSI bus is fully occupied, and the external one already has the raid array, the jukebox/SDLT, and a second SDLT in a shoe box; I'm not sure how long I can safely make the daisy chain. PCI to dual-SCSI interface cards are available (we installed one in another Origin 300 down the hall from this one) but I've been unable to rouse interest in its purchase.

If I could get additional spindle(s) it would help with contention, on the raid if not on the SCSI bus. I also need to remind myself that, with the data being stored in the amanda work area in chunksize pieces, the new spindle need not be all that large in order to be useful.

Because dump does not cross partition boundaries (at least no dump that I've worked with), I feel comfortable using the xfsdump utility to dump root without worrying about encoding exceptions in the disklist file. The other partitions do not crosslink, so they shouldn't need exceptions either, even the ones dumped with tar (gnutar specifically). Is "dumped" really the right term there? "Dumped with tar"?

Samar is an amanda client of the Samar amanda server; there are no other clients of this server.

samar 1# df -kl
Filesystem         Type      kbytes       use      avail  %use  Mounted on
/dev/root          xfs     67482780  19779524   47703256    30  /
/dev/dsk/dks0d2s0  xfs     63288988  60982928    2306060    97  /usr1
/dev/dsk/dks1d2s0  xfs    884945404 803357524   81587880    91  /usr5

We are using dump for root and /usr1. For /usr5, which is large, we are using tar to dump the partition broken down by user directory. We do not have a DLE for /usr5/dumps, which is where I've placed the amanda work area, nor for lost+found/. The /usr5/amanda directory contains the amanda config and log files. The amount of data that any given user owns or updates is highly variable and beyond my control; users do not even have quota limits.

samar 2# du -ks /usr5/*
63442848    /usr5/allengr
13440       /usr5/amanda
20078964    /usr5/bimal
132563428   /usr5/dtaylor
0           /usr5/dumps
69378168    /usr5/hxgao
161161372   /usr5/joy
121191328   /usr5/lalor
21337456    /usr5/leith
114485136   /usr5/liw
0           /usr5/lost+found
53840856    /usr5/ninggao
40176552    /usr5/skaur
2751032     /usr5/tapu

There is no assurance I will be allowed to continue using /usr5 as an amanda work area. Amanda work can consume whatever is left on /usr5, but there is no assurance that there will be any space, and I just know that the first time a user tries to save a large file and can't allocate the space there is going to be a problem.

We always like to use dump rather than tar for root partitions. It has been my understanding that tar is just a character archiver, whereas dump understands the vendor-specific special devices. Dump/restore are the recommended way to preserve and restore a bootable partition.

No, the "global" dumptype definition does not specify the program to use; it must be a compile-time default.
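For what it's worth, if we ever stop leaning on the compile-time default, here is roughly what I'd expect the explicit version to look like. This is only a sketch to show what I mean; the dumptype names, the chunksize, and the "use" value are placeholders I made up, not our live config:

    # amanda.conf (sketch) -- name the program per dumptype instead
    # of relying on the compile-time default
    define dumptype root-dump {
        global
        program "DUMP"        # vendor dump; xfsdump on this IRIX box
        comment "bootable partitions via dump/restore"
    }

    define dumptype user-gnutar {
        global
        program "GNUTAR"
        comment "per-user pieces of /usr5 via GNU tar"
    }

    # holdingdisk (sketch) -- chunksize is the reason even a modest
    # second spindle would be useful as additional amanda work space
    holdingdisk work1 {
        directory "/usr5/dumps"
        use -500 Mb           # use all but 500 MB, leaving the users headroom
        chunksize 1 Gb
    }

    # disklist (sketch) -- one DLE per user directory, and none for
    # /usr5/dumps or /usr5/lost+found
    samar  /              root-dump
    samar  /usr1          root-dump
    samar  /usr5/allengr  user-gnutar
    samar  /usr5/bimal    user-gnutar
    # ...and so on, one line per user directory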
A closer look at the two most recent amanda dump reports shows that a few partitions were promoted to level 0 and that a couple of the partitions that had encountered EOT on the tape were bumped to level 1. It is still unclear to me if the level 0 of those partitions was properly written to tape. Messages in the report include:

FAILURE AND STRANGE DUMP SUMMARY:
  samar /usr5/allengr lev 0 FAILED ["data write: Broken pipe"]
  samar /usr5/allengr lev 0 FAILED [dump to tape failed]
  samar /usr5/liw lev 0 FAILED ["data write: Broken pipe"]
  samar /usr5/liw lev 0 FAILED [dump to tape failed]
  samar /usr5/liw lev 0 STRANGE

FAILED AND STRANGE DUMP DETAILS:

/-- samar /usr5/allengr lev 0 FAILED ["data write: Broken pipe"]
sendbackup: start [samar:/usr5/allengr level 0]
sendbackup: info BACKUP=/usr/local/sbin/gnutar
sendbackup: info RECOVER_CMD=/usr/sbin/gzip -dc |/usr/local/sbin/gnutar -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
\--------

taper: tape SAMAR02 kb 161967808 fm 9 writing file: No space left on device
taper: retrying samar:/usr5/allengr.0 on new tape: [writing file: No space
+left on device]

This last message indicates that amanda will try to write the data to the next tape volume. However, there is no file to dd from the amanda work area; in order to retry, amanda would have to go all the way back and tar and zip it again. Does amanda retry from the start and reassign the DLE to a dumper, or does it simply reassign the DLE to the taper? I could always remount the tapes and find out, I guess.

> I know that you as the responsible admin tend to see things
> pessimistic. It is your job to do so, and I understand this perfectly
> as I am "the responsible admin" for several sites, too.
>
> We can't afford to be too OPTIMISTIC, do we? ;-)
>
> I also know from my view as an active member of the list that your
> installation is still in the process of development.

On that note I've got to tell you that one of the other guys here insists (rightly, I should add) that the purpose of data backup is not getting the data to tape, it's getting the data back off of the tape.

>> So it is very likely that not all of your level 0 backups that have to
>> be done first for new DLEs will fit on your tapes.

BC> Given a large enough holding area I'd expect that any DD

> DLE? ;-)

I was attempting to delineate the specific step involved. A DLE can be broken into many chunksize pieces fitting into more than one amanda work area; when the chunks are reassembled they MUST fit onto a single physical tape (or, I suppose, virtual tape, but I'm not going there today, though it is on the table for next week. Yes, for real, I have this new 4 TByte drive...). So when I said DD the partition to tape I meant literally using dd internally in amanda.

>> I understand that you can't run this config every day as it seems to
>> have run for full 3 days this time.

BC> After the initial run I added /usr5/dumps (which is on the same raid
BC> partition) and run time dramatically improved.

> /usr5/dumps is a holdingdisk in your amanda.conf?

BC> ** This is also misleading as the failed level 0 from the previous run
BC> should have re-run as level zero and many of them ran at level 1.
BC> This is a "second" problem, a result of the first but a completely
BC> different part of the logic.

> phew ...

Actually the jury hasn't returned a verdict yet. I would guess that in the case of direct-to-tape, amanda would not restart the DLE from the dumper but only from the taper, which can not work since there is nothing in the work area.
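When I do get around to mounting the tapes, something along these lines should settle it. A sketch only; I'm assuming the config is named "Samar" here, substitute the real name:

    # Ask amanda which tape/file (if any) each level 0 landed on:
    samar 3# amadmin Samar find samar /usr5/allengr
    samar 4# amadmin Samar find samar /usr5/liw

    # And, in the spirit of "getting the data back off of the tape",
    # have amverify read the volume back and check each image:
    samar 5# amverify Samar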
However, I don't know that for sure; it is a guess, and I haven't mounted the tapes to run those checks (yet).

> Getting the backups right is top-priority.
> Getting them fast is secondary, at least for a beginning.

You can say that again.

> I tend to use one "program" for the whole config as it is easier to
> configure (and wrap your head around).

It would be easier, but I question the reliability of restoring root from tar, and I can't use dump on the raid, it's too big.

I think that was all of the open questions; I can't be certain anymore. Thank you all for your time on this, I appreciate it.

	Brian

---
   Brian R Cuttler                 [EMAIL PROTECTED]
   Computer Systems Support        (v) 518 486-1697
   Wadsworth Center                (f) 518 473-6384
   NYS Department of Health        Help Desk 518 473-0773