Re: amrestore problem, headers ok but no data
Brian Cuttler wrote:
> samar 31# /usr/local/sbin/amrestore -p /dev/sdlt2 samar /usr5/bimal | /usr/local/sbin/gnutar -tf -
> amrestore:   0: skipping start of tape: date 20041229 label SAMAR24
> amrestore:   1: skipping samar._usr5_lalor.20041229.0
> amrestore:   2: skipping samar._usr5_tapu.20041229.1
> amrestore:   3: skipping samar._.20041229.1
> amrestore:   4: skipping samar._usr5_amanda.20041229.1
> amrestore:   5: restoring samar._usr5_bimal.20041229.1
> ./
> [...]
> ./cvs_test/spider/man/CVS/
> amrestore: read error: I/O error
> ./cvs_test/spider/src/
> ./cvs_test/spider/src/CVS/

That one's a GTAR DLE, isn't it? Looks as though you're only partially out of the woods for those...

Try manually dd'ing the data from the tape (basically, doing by hand what "amrestore -r" does). This will help you determine whether the data's physically on the tape:

    mt rewind
    mt fsf N        # for some appropriate N
    dd if=/dev/sdlt2 bs=32k of=tempfile

(It looks as though the right value of N is simply the number printed by amrestore; e.g. with the tape you used to generate the above excerpt, to get samar:/usr5/tapu, use "mt fsf 2".)

First, take note of the block counts printed by dd at the end, and see if they match your expectations. Note that it's counting the physical blocks it read from the tape; when it says "X+Y", the X is the number of full-size records it read -- 32 KB, since you said "bs=32k" -- and the Y is the number of partial blocks, i.e. those that were less than 32 KB. Since your tape is configured for variable-length blocks, I *think* I'd expect to see Y=0, i.e. all blocks being 32 KB long. Ditto if you configured it for 32-KB fixed-length blocks. I'm not sure what would happen if you configured the drive for shorter fixed-length blocks -- it probably depends on the drive, and its driver, whether it'd:
- emulate 32-KB blocks, i.e. break each one up into, say, 512-byte blocks and reassemble them at read time, thus yielding "X+0" from dd
- break up the 32-KB blocks, but not reassemble them, yielding "0+Y" from dd
- write the first 512 bytes of each 32-KB block and discard the rest, yielding a garbage tape

Then look at the file you dd'ed off the tape. Its expected contents are a 32-KB Amanda per-DLE header (just like the one you've been successfully getting from amrestore), followed by the backup of that DLE (either DUMP or GTAR format; compressed iff you're using software compression on that DLE). You can check the header with:

    dd if=tempfile bs=32k count=1 | cat -evt | more

> And I'd make the suggestion that somewhere in this setup, there is
> insufficient iron to do the job. I don't know anything about an
> SDLT, but if it's taking 33 hours of real time, that drive has got to
> be doing a lot of "shoe shining" [...]

I'm not sure about that. It's only 14:34 of tape time, and Amanda reports 2469.2 KB/sec tape speed. Is that reasonable for this drive?

> There has been a suggested rule of thumb for tape drives and
> capacities put forth that says that if it takes the drive, in
> streaming mode, more than 3 hours to complete that backup by writing
> a nearly full tape, it's too small and slow if it's streaming 100% of
> the time.

That might be useful for guesstimating, until one gets better stats, but I wouldn't depend on it for diagnosing problems! It's about as scientific as Moore's "law", which seems to hold surprisingly true, but even so would better be called "Moore's Observation" -- or as Dave Tillbrook's facetious comment that "the computer you want always costs $5000" :-)

--
|  | /\
|-_|/  >   Eric Siegerman, Toronto, Ont.        [EMAIL PROTECTED]
|  |  /
The animal that coils in a circle is the serpent; that's why so
many cults and myths of the serpent exist, because it's hard to
represent the return of the sun by the coiling of a hippopotamus.
        - Umberto Eco, "Foucault's Pendulum"
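[Editorial note: the "X+Y" bookkeeping described above is easy to see on a regular file; the file names below are invented for the demonstration, but the stats dd prints are the same mechanism as for a tape read.]

```shell
# Build a file that is one full 32-KB block plus a 4-byte tail.
dd if=/dev/zero of=payload bs=32768 count=1 2>/dev/null
printf 'tail' >> payload

# Read it back with bs=32k; dd's closing stats count full vs. partial blocks.
dd if=payload of=/dev/null bs=32768 2>stats
cat stats    # "1+1 records in": one full 32-KB record, one partial
```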
Re: amrestore problem, headers ok but no data
On Fri, Jan 07, 2005 at 11:17:21AM -0500, Brian Cuttler wrote:
> Following Gene's model, I set the default block size on the tape
> devices (sgi command # mt -f /dev/rmt/tps1d4nrns devblksz 32768)
> and also switched from the variable-length to the fixed-length tape
> device, used amlabel to relabel the tape (not what Gene indicated).
>
> Oddly, trying to dd if=/dev/rmt/tps... read no data
>
> samar 85# mt -f /dev/rmt/tps1d4nrns rewind
> samar 86# dd if=/dev/rmt/tps1d4nrns of=scratch
> Read error: Invalid argument
> 0+0 records in
> 0+0 records out

These two things might well be related. That dd command, without a "bs=" argument, is trying to read 512-byte blocks. But the physical blocks on the tape are 32 KB -- your adjustments have seen to that. It would be appropriate for the read() call to fail in that situation, as indeed it did. On Solaris (whose man pages I have access to at the moment), the error status would be ENOMEM; perhaps on your system it's EINVAL == "Invalid argument" instead. (The place to look that up would likely be the man page for the tape driver -- st(7) is where I found the Solaris version.)

> However, I ran amdump last night. Still having problems with TAR DLEs,
> though oddly I was able to see that a DUMP DLE attempted to write.

I'm lost. "Attempted to write" what? To tape during amdump, or to disk during amrestore? If the former, do you mean to say that the tar DLE *didn't* attempt it?

> I was able to retrieve the file, using both amrestore and Eric's
> suggestion of manually issuing the dd command to get the file from
> tape. I was able to open the dump file (DLE for /usr1) and saw that
> the file "kmitra" was present. This I thought to be good news since
> the only top-level file on the partition is kmitra/ (note directory
> slash). Unfortunately xfsdump reported the file as a regular file
> and not a directory and I was unable to proceed from there.
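[Editorial note: dd's 512-byte default block size, mentioned above, can be confirmed on an ordinary disk file; names invented for the demonstration. On a disk file the short reads are harmless, unlike on a variable-block tape.]

```shell
# A 1025-byte file: two full 512-byte blocks plus 1 leftover byte.
dd if=/dev/zero of=sample bs=1025 count=1 2>/dev/null
dd if=sample of=/dev/null 2>stats    # no "bs=", so dd reads 512 bytes at a time
cat stats                            # "2+1 records in": two full reads, one partial
```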
Something else you could try: "amrestore -r" one of those DLEs, and "dd" it from tape as I described before. Then "cmp" the two files. They should be identical, of course. That'll tell you whether there are problems with amrestore.

To see whether amdump's back end (taper) is putting the data on tape correctly, try this:
- run amdump *with no tape in the drive*; it'll run in degraded mode and leave all the dumps in the holding disk
- make copies of the dumps in the holding disk
- amflush them (the originals, that is) to tape
- "amrestore -r" them, and/or "dd" them from tape
- compare what came off the tape with the holding-disk copies you made before the amflush (use "dd" to strip off and discard the first 32 KB of each file, as I described previously, because there *will* be differences between the files' Amanda headers; but the remainders of the two files should be identical)

If the holding-disk files are split into multiple chunks, you'll have to do some "dd" magic to reassemble them; don't forget to discard the first 32 KB of *every* chunk.

To see if amdump's front end (dumper et al.) is getting the data onto the holding disk correctly in the first place, try to restore from the holding-disk copy, which hasn't made the journey to and from the tape. (Strip off the header and, if necessary, reassemble as described above.)

> I've tried to retrieve several of the TAR DLEs but have been unsuccessful
> with either method.

Sorry, but I gotta ask: with or without "bs=32k"?

Hmm, it seems some of my recipes above are premature. Oh well, I'll leave them in anyway, since they might be useful further down the road. It sounds to me as though it's time to:
- send you off looking at the debug files on the clients (in /tmp/amanda unless you've configured them otherwise); I'm not sure just what you'd be looking *for* -- just anything unusual...
- ask you to show us the following from a run that demonstrates the problem:
  - the email report
  - the log.MMDD and amdump.N files
  - a description of the results of restore attempts for a couple of representative DLEs

I can't recall: how much testing have you done on the tape subsystem itself, without Amanda in the loop to confuse things?
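[Editorial note: the header-stripping and chunk-reassembly "dd magic" above can be rehearsed on ordinary files. Everything below is invented for the demonstration -- real holding-disk chunks each begin with their own 32-KB Amanda header, which is what the zero-filled prefixes simulate.]

```shell
# Build two fake holding-disk chunks: a 32-KB "header" followed by payload.
dd if=/dev/zero bs=32k count=1 2>/dev/null > chunk1
printf 'payload-part-one;' >> chunk1
dd if=/dev/zero bs=32k count=1 2>/dev/null > chunk2
printf 'payload-part-two' >> chunk2

# Strip the first 32 KB of *every* chunk and concatenate the remainders.
{ dd if=chunk1 bs=32k skip=1; dd if=chunk2 bs=32k skip=1; } 2>/dev/null > reassembled

cat reassembled   # the two payloads, back to back, headers gone
```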
Re: amrestore problem, headers ok but no data
On Fri, Jan 07, 2005 at 11:59:54AM -0500, Brian Cuttler wrote:
> samar 126# which xfsrestore
> /sbin/xfsrestore
> samar 127# which xfsdump
> /usr/sbin/xfsdump

I suppose the theory must be that anyone can do a restore, but only root can use [xfs]dump.
Re: amrestore problem, headers ok but no data
On Fri, Jan 07, 2005 at 01:20:42PM -0500, Brian Cuttler wrote:
> samar 170# dd of=/dev/sdlt2 obs=32k if=./scratch
> 64+0 records in
> 1+0 records out
>
> "bs" block size, "obs" output BS, (there is an IBS also, which I
> am afraid of developing should this not resolve soon)

Yup, this makes sense. Since you specified "obs", the input block size remained at the default of 512 bytes. "dd" read enough of those (64) to make a 32-KB output block and then wrote the latter. If the file had been longer, it would have repeated the process -- 64 512-byte reads followed by 1 32-KB write -- until done. It can also convert the other way, from a large ibs to a smaller obs.

For a disk file, the block size doesn't matter much; things speed up if you use a larger one, but you get the same result except for dd's stats at the end. For a tape, block size can be a lot more important. With a traditional tape drive that uses variable-length blocks, each write() system call produces exactly one physical tape block, whose length is simply the length specified to write(). Correspondingly, each read() call reads exactly one physical tape block; the length specified to read() must be at least as large as the block currently under the heads, or something will go wrong (on Solaris SCSI tapes, as I mentioned before, the read will fail with ENOMEM; I don't know whether other systems behave differently, but one thing that almost certainly will *not* happen is that you get the first chunk of the physical block on this read(), and the next chunk on the next read()). One physical block for each read() call is the rule.

This is why "dd" has separate "ibs" and "obs" arguments in the first place -- to let you reblock a tape, i.e. copy it to another tape while changing the block size, without making an intermediate copy on disk. When copying from one disk file to another, or between tape and disk, specifying different input and output block sizes doesn't really accomplish anything.
I don't know how tape drives with fixed-length blocks (or configured for them, as in your case) work. Perhaps the drive does the necessary block-length conversion itself, using its internal RAM buffer.
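[Editorial note: the 64-reads-to-1-write behaviour described above is easy to reproduce on a disk file; the file names are invented for the demonstration.]

```shell
dd if=/dev/zero of=./scratch bs=32768 count=1 2>/dev/null  # a 32-KB input file
dd if=./scratch of=./copy obs=32k 2>stats                  # default ibs=512, obs=32k
cat stats   # "64+0 records in" (512-byte reads) and "1+0 records out" (one 32-KB write)
```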
Re: amrestore problem, headers ok but no data
On Fri, Jan 07, 2005 at 03:40:12PM -0500, Brian Cuttler wrote:
> samar 5# /usr/local/sbin/amrestore -r /dev/sdlt2
> amrestore: 0: skipping start of tape: date 20050107 label SAMAR05
> amrestore: 1: restoring samar._usr5_amanda.20050107.1
> amrestore: read error: I/O error
> [likewise with "dd bs=32k"]

Ooo, I don't like the look of "I/O error" at all. That suggests a hardware or media problem, as opposed to anything that software has any control over. If you haven't done so (can't recall; maybe you've already described this), look in the system logs for errors from that device. If there aren't any, it couldn't hurt to take a look at the driver's man page (and the pages for the SCSI subsystem in general -- it is a SCSI drive, isn't it?) to see if more verbose error reporting can be enabled. If you come up with anything, please post it.

> samar 12# mt -f /dev/sdlt2 rewind
> samar 13# mt -f /dev/sdlt2 fsf 1
> samar 14# cat -evt /dev/sdlt2 | more
> Input error: Invalid argument (/dev/sdlt2)

This one's to be expected. "cat" uses a smaller block size -- and unlike "dd", you can't change it.

> samar 15# mt -f /dev/sdlt2 rewind
> samar 16# mt -f /dev/sdlt2 fsf 1
> samar 17# dd if=/dev/sdlt2 bs=32768 | cat -evt | more
> AMANDA: FILE 20050107 samar /usr5/amanda lev 1 comp .gz program /usr/local/sbin/gnutar$
> To restore, position tape at start of file and run:$
> ^Idd if= bs=32k skip=1 | /usr/sbin/gzip -dc | usr/local/sbin/gnutar -f...
> -$
> ^L$
> ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ [...]
>
> ** many null characters removed.

Hmmm, I'd have expected to see "Read error: I/O error" again here, seeing as you get it on every other 32-KB read. Odd that the message is missing while the main output is still as though the error had occurred.

> samar 18# mt -f /dev/sdlt2 rewind
> samar 19# mt -f /dev/sdlt2 fsf 1
> samar 23# dd if=/dev/sdlt2 bs=32768 skip=1 | /usr/sbin/gzip -dc |
> /usr/local/sbin/gnutar -tf - | more
> Read error: I/O error
> 0+0 records in
> 0+0 records out
>
> gzip: stdin: unexpected end of file

Yeah, more of the same... If dd can't get the bits off the tape, there's not much point trying to do different things with its nonexistent output :-(

> I had followed Gene's instructions

I've never had to use these, so my comments here are pretty tentative.

> [Commands to show that the tape drive is set for
> variable-length blocks, stash the label in ./scratch, and
> verify that the label was stashed ok.]
I don't see the command to set the drive to 32768-byte blocks, but it's presumably there, because:

> samar 32# mt -f /dev/sdlt2 blksize
>
> Recommended tape I/O size: 131072 bytes (256 512-byte blocks)
> Minimum block size: 4 byte(s)
> Maximum block size: 16777212 bytes
> Current block size: 32768 byte(s)
>
> samar 33# mt -f /dev/sdlt2 rewind
>
> samar 34# mt -f /dev/sdlt2 blksize
>
> Recommended tape I/O size: 131072 bytes (256 512-byte blocks)
> Minimum block size: 4 byte(s)
> Maximum block size: 16777212 bytes
> Current block size: 32768 byte(s)
>
> samar 35# dd bs=32768 if=scratch of=/dev/sdlt2
> 1+0 records in
> 1+0 records out

If I recall from Gene's description of the problem, it's only when you go to *read* a tape that your setting gets magically zapped. So it couldn't hurt to do a final:

    mt rewind
    dd if=/dev/sdlt2 bs=32k of=./scratch2
    mt -f /dev/sdlt2 blksize

to verify that that hasn't happened.
Re: amrestore problem, headers ok but no data
in a very long time. The only optimization that's left for dd to perform is a small one: if ibs and obs are the same, it can save a tiny amount of CPU time by not using an inner loop; it can just do something like this (omitting all the error checking and handling for clarity):

    while ((actual = read(infd, buf, bs)) > 0) {
        if (actual == bs)
            ++wholeBlocksRead;
        else
            ++partialBlocksRead;
        if (write(outfd, buf, actual) == actual)
            ++wholeBlocksWritten;
        else
            ++partialBlocksWritten;
    }

The variables being incremented are the source of the stats dd prints at the end. The optimization is so small that in practice, dd implementations might not bother; they might just fold the ibs==obs case into one of the other two cases. If ibs and obs differ, the code has to be more complicated: a bunch of small read()s to fill up a larger output buffer, or a bunch of small write()s to empty out a larger input buffer (possibly with padding and syncing and other data-munging if specified, but none of that's relevant to this thread).

> So if dd is left with a default 512 byte "ibs", input block size,
> but the device is using a larger block size, like an amanda tape
> of 32k, dd has allocated a 512 byte piece of memory to hold the
> input data. But when dd requests the first block it unexpectedly
> gets 32k of data and has "insufficient memory" (ENOMEM).

Just so. Or maybe an "invalid argument" (EINVAL) :-)

> The reverse is not really a problem. Suppose you said "ibs=128k".
> dd would simply read sufficient device blocks until the buffer
> was filled, four blocks in the above example.

Yes. As you've said, it would be dd that did this, *not* the kernel. dd would call read() enough times -- in this case four -- to fill the buffer. Each call would read one 32-KB physical tape block.

> On output dd can make its own adjustments. If the obs is larger,
> it can move multiple input buffers to the output buffer before
> doing the write.
> If the reverse is true, input block size larger
> than output, it can copy part of the input block to the output
> buffer and do multiple outputs from a single input buffer.

Yes, except that in neither case does it need to copy the data from one buffer to another. It can just have a single buffer that's max(ibs,obs) long, and do a number of read()s at appropriate offsets within that one buffer, then one write() of the whole thing; or vice versa. The only time dd needs to copy data internally is when it's doing more complex manipulations. That's what the "conversion buffer", whose size is given by the "cbs=" argument, is for; and why the man page bothers to discuss when the conversion buffer is or is not needed.
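[Editorial note: the reverse case -- a large ibs split into smaller output blocks -- also shows up directly in dd's stats; file names invented for the demonstration.]

```shell
dd if=/dev/zero of=big bs=32768 count=1 2>/dev/null  # one 32-KB input file
dd if=big of=split ibs=32k obs=512 2>stats           # one big read, many small writes
cat stats   # "1+0 records in" and "64+0 records out"
```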
Re: amrestore problem, headers ok but no data
Apologies for following up my own post.

On Sat, Jan 08, 2005 at 06:20:59PM -0500, Eric Siegerman wrote:
> [...] in neither case does [dd] need to copy the data
> from one buffer to another. It can just have a single buffer
> that's max(ibs,obs) long [...]

Oops; I'm wrong about this. It's only true if ibs is an exact multiple of obs, or vice versa. Otherwise, the data *will* need to be copied. I have no idea whether any dd implementations do in fact optimize the exact-multiple case.
Re: implausibly old time stamp 1969-12-31 17:00:00
On Mon, Jan 10, 2005 at 02:47:43PM -0700, Jason Davis wrote:
> tar: ./2005-01-07_17-00-57/ibdata01.ibz: implausibly old time stamp
> 1969-12-31 17:00:00

For some reason, tar thinks the file's timestamp is 0, or else the timestamp recorded in the tarball is in fact 0. (1969-12-31 17:00:00 is the UNIX epoch, converted to your local timezone.)

I'm not sure why it's happening. Are you sure you're using GNU tar for both the backup and the restore? If the backup used gtar but the restore used your vendor's tar, there might be an incompatibility.

Which version of gtar are you using? Some versions are known to be incompatible with Amanda; I haven't seen this particular problem before, but it's a possibility to consider. 1.13.25 is the usually recommended version for use with Amanda. Darn, I keep forgetting whether anyone has seen problems with 1.14 or not. Sorry. :-(

> Here is the output
> of stat of the original file that was backed up
> [...]
> Access: 2005-01-10 13:02:04.0 -0700
> Modify: 2005-01-07 17:07:31.0 -0700
> Change: 2005-01-07 17:07:31.0 -0700

Just out of curiosity, what does stat say about the *restored* file?
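[Editorial note: that "implausibly old" date really is just UNIX time 0 rendered in a UTC-7 zone (the poster's, judging by the stat output). With GNU date:]

```shell
TZ=UTC date -d @0       # the epoch itself: Thu Jan  1 00:00:00 UTC 1970
TZ=MST7MDT date -d @0   # the same instant in US Mountain time: Dec 31 1969, 17:00:00
```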
Re: amrestore problem, headers ok but no data
On Tue, Jan 11, 2005 at 09:11:31AM +0100, Stefan G. Weichinger wrote:
> The minimum blocksize value
> is 32 KBytes. The maximum blocksize value is 32
> KBytes.

The man pages have configure variables, which are expanded during "make". Presumably the maximum block size is one of these, while the minimum is simple, hard-coded text.
Re: amrestore problem, headers ok but no data
On Tuesday 11 January 2005 16:40, Jon LaBadie wrote:
> Also, I think that if both types of devices exist on the same bus,
> the lower performance one determines the performance of the entire
> bus.

In theory, this is *not* the case. One of the (many) selling points of SCSI over IDE is supposed to be that a SCSI bus can run each device at its own speed (though perhaps later versions of the IDE spec have caught up, as they have in some other respects; I dunno). Of course, the slower/narrower device will consume more of the SCSI bus's available bandwidth to carry the same amount of data, even if it doesn't directly affect performance of the faster/wider devices.

In practice, according to the excellent SCSI FAQ, it depends on the devices in question. See these questions in particular:
- "Can I connect a SCSI-3 disk to my SCSI-1 host adapter? [...]" (which isn't Brian's precise situation, but the answer might well apply)
- "How can I calculate the performance I'll get with mixed SCSI devices?"

The SCSI FAQ is dated, but still useful. It's at www.scsifaq.org; click on the link for "Official comp.periphs.scsi FAQ". Sorry, but the site uses too-smart-for-its-own-good navigation that makes it hard to post the actual URL.

On Tue, Jan 11, 2005 at 10:48:17PM -0500, Gene Heskett wrote:
> [A SCSI bus is]
> double handicapped because the cable is, compared to a piece of well
> built coax, pretty much a guesstimate as to its operating impedance,
> usually quoted as being in the 120 to 130 ohm territory,

This isn't supposed to be a problem either, because cable impedance isn't supposed to be a guesstimate; it's explicitly specified in the SCSI specs. But in practice, what you've said is true; there's all manner of out-of-spec junk sold as "SCSI cables".
For example, I've read that you can have problems if you put a SCSI adapter in the middle of an internal/external chain, even if all the termination is correct, because the internal ribbon cable and the external cable might have different impedances, leading to signal reflections between the two cables. For a telling, if rather ancient, anecdote told by someone from Adaptec, see the question "What is the problem with the Adaptec 1542C and external cables?" in the SCSI FAQ.

> Too many folks forget that a SCSI bus is indeed
> an rf transmission line, subject to the usual rules about vswr.

Geez, you mean we've got an actual engineer in the e-room? Awesome! (Despite my rambling about impedances and such, I'm sure no hardware guy.)

From the context it's pretty clear what "vswr" means, but what does it stand for?

In theory, there is no difference between theory and practice.
But, in practice, there is.
        - Jan van de Snepscheut
Re: GNU tar versions (was: Backups failing on Solaris 9 with GNU tar)
On Mon, Jan 17, 2005 at 04:45:18AM -0500, Gene Heskett wrote:
> So I'm puzzled as to whether I'm still doing something wrong, or this
> new tar doesn't recognize the '-f... -' for the stdin from a pipe
> option.

That "..." is a placeholder for "x or t, with whatever other options you want, e.g. v"; you're not meant to type it literally.

> /bin/gtar: ./.kde/share/apps/RecentDocuments/cdrom.desktop: time stamp
> 2020-04-27 15:36:21 is 482063157 s in the future

Do the original files have the same bad timestamps? That'll determine whether this is really a bug or version incompatibility in tar, or just a case of shooting the messenger...
Re: Dump level, behaviour change
On Fri, Jan 21, 2005 at 11:30:02AM -0500, Brian Cuttler wrote:
> 4) We did also change the crontab for bin to force level 0 dumps on
>    Fridays. This change reflecting the changes made when we renamed
>    "notes_dlt" to "wcnotes".

Are you perchance seeing these level-2 dumps disproportionately often on Sunday nights? :-) Maybe people just don't use Notes much on weekends, and now that the level-0's are synched to the work week...

> [...] we are not losing any
> data nor do I believe that a file restore (of individual files) would
> be any more difficult than previously. A full restore would of course
> entail restores from each dump level.

Agreed on all points. As you said, it's a non-problem.

> Could we be seeing more change in the file system than previously
> because of changes to the disk cluster-size? i.e., the granularity
> in the number of blocks/sectors on the RAID vs. the older single
> spindle disk drives?

I'd expect to see more level-bumping as the amount of changed data *decreases*. But either way, it's hard to imagine how changing the cluster size would have any effect that's visible up at the file level. It'd affect the amount of wasted space at the ends of files, but tar doesn't back up that wasted space, and I rather doubt that dump does either, so Amanda shouldn't even notice the change.

> We could increase both the bumpdays and bumpsize values

If it ain't broke, don't fix it!

> We could even force level 0 every day if desired [...]

See above :-) There might well be legitimate reasons to do this, but the current situation isn't one of them.

Another possibility that hasn't been raised: has Notes's behaviour changed in some way, to make it modify fewer files? Could be fewer writes overall, or more creating of new files as opposed to modifying of old ones (which would make the old files in question contribute to the bumpable side of the ledger, instead of to the unbumpable side as before). Maybe it's a year-end rollover, i.e.
Notes is now writing a new, still-small foo.2005, and the huge foo.2004 is now static.
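[Editorial note: for reference, the bump parameters discussed above live in amanda.conf. The values below are illustrative defaults from sample configurations, not a recommendation for this site:]

```
bumpsize 20 Mb  # minimum savings required to bump from level 1 to level 2
bumpdays 1      # days a DLE must sit at a level before bumping higher
bumpmult 2      # each further level must save bumpsize * bumpmult^(level-1)
```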
Re: Dumper Timeouts
On Fri, Jan 21, 2005 at 11:08:57AM -0500, Gene Heskett wrote:
> > ? sendbackup: index tee cannot write [Connection timed out]
>
> Try increasing the etimeout value in your amanda.conf

dtimeout, no? (I have no idea whether that'd help, but it's more likely to than etimeout is.)
Re: Dump level, behaviour change
On Fri, Jan 21, 2005 at 03:29:16PM -0500, Brian Cuttler wrote:
> I think Jon LaBadie hit it

Cool. I was speaking in ignorance of what the data looked like.

> There is a word that I like to use for this type of design. "Hideous"

Yup, that's a technical term :-)

> So a one block file would
> have a min size of 5 rather than 3 blocks.

Well, yes and no. It does indeed consume 5 rather than 3 disk blocks, and that shows up in "du" and "df" output and in how quickly the partition fills up -- but a program that open()s and read()s the file will still see exactly the same number of bytes, no matter what the cluster size, or indeed the hardware sector size [1]. That's why I said that tar wouldn't notice the difference.

Dump doesn't go through the file system, but straight to the disk, so theoretically, it could back up the unused bytes at the end of a file's final cluster. But there's no reason to do so, and I'd hope that dump wouldn't be that dumb.

[1] This was one of UNIX's many innovations over the then-current mainframe OS's, which saw a file as a sequence of blocks, not of bytes as UNIX does. UNIX tried to do the same thing with tapes -- to abstract the concept of "file" away from its hardware implementation -- but as I recently described in another thread, that experiment failed, which is why we have to worry about buffer lengths and short reads and all that bs= with tapes, but not with disk files.
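[Editorial note: the point that programs reading through the file system see bytes, not disk blocks, is easy to check; file name invented, and the exact du figure depends on the file system's block/cluster size.]

```shell
printf 'hello' > tiny   # a 5-byte file
wc -c < tiny            # read() through the fs sees exactly 5 bytes
du -k tiny              # but the file occupies whole fs blocks on disk
```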
Re: Fedora Core 3 - which version of tar??
On Thu, Jan 20, 2005 at 10:22:16PM +0100, Stefan G. Weichinger wrote:
> - configure and make as $AMANDAUSER

I don't believe this is necessary. One should avoid building Amanda as root, but that's not because it'll cause problems for Amanda; it's for the same reason one should avoid building *anything* as root. I've never had a problem building Amanda under my own user account, and it's hard to see why such a problem might ever occur.

> make install as root

This *is* necessary, of course.
Re: Fedora Core 3 - which version of tar??
On Fri, Jan 21, 2005 at 11:05:23PM -0500, Gene Heskett wrote:
> Most users are not that privileged, and should not be. And that's the
> main justification for a separate user to run amanda.

Agreed 100%! "erics" isn't a member of "disk". (Sorry I didn't mention that. I agree with the above so fully that the possibility never even occurred to me. :-)

The reason I mentioned building under my own account was to back up my assertion that building as the Amanda user, or with any other kind of special privilege, is unnecessary. The build shouldn't need any particular permissions at all, since in theory:
- the build doesn't modify any files outside the build (and maybe source) trees
- any user or group ids that get hard-wired at build time are taken from the --with-user, --with-owner, or --with-group config parameters, not from getuid() or the like

If the above claim is false, i.e. if building Amanda as your Amanda user works better for you than building it as a completely unprivileged user (given that both builds are installed as root), then IMO that's a bug in Amanda. In that case, continuing to build as the Amanda user might be a useful workaround, but should only remain necessary until the bug gets fixed.

Gene, on your system, if you build Amanda as a vanilla, unprivileged user -- not root, not in the "disk" group -- and then install it as root, what specifically goes wrong?
Re: Fedora Core 3 - which version of tar??
On Mon, Jan 24, 2005 at 03:51:13PM -0500, Gene Heskett wrote:
> Now become 'amanda' and do an amcheck, which works just fine.
> Back out of that and become 'gene' and the permissions are denied; the
> user gene, even though he built it, cannot run it.
> [...]
> So basicly it has to be run by whomever is set in the configuration,
> but not by who built it.

That's my understanding, and it kind of makes sense. It's certainly how the permissions are set up here:

    -rwsr-x---  1 root  operator  87183 Apr 23  2004 /usr/local/sbin/amcheck

(Our Amanda server is a FreeBSD box, on which group "operator" serves the same function as "disk" on your machines.) Amdump insists on being run by the Amanda user too; the file has read and execute permission for everyone, but the script itself checks.

> If I were to change that line in the
> configuration, then I'd assume gene could run it, but not amanda.

I'd imagine so. Of course, amcheck might then get some errors, since "gene" isn't in the "disk" group, and (hopefully) doesn't have permission to write the index, log, and tapelist files. (If amcheck didn't notice, amdump certainly would...)

> I'll leave it this way for now & see how it runs tonight.

Cool. I'm looking forward to the results :-)
Re: archiving tapes?!
On Thu, Jan 27, 2005 at 06:08:01PM +0100, Sebastian Kösters wrote:
> if I want to restore a
> tape/backup older than the last one this fails. Iam only able to restore
> the last Backup from tape. I wanted to use the same amlabel for every Sunday
> because I dont want the tapelist file become that long.

DON'T DON'T DON'T give all your tapes the same label! This will confuse Amanda, and probably you too. Your problem with restoring from old backups is merely one symptom of Amanda's confusion. A symptom of *your* confusion would be mounting the wrong tape (or forgetting to change tapes), and thus overwriting a backup that you wanted to keep.

There are some Amanda features that it can sometimes make sense to work around (e.g. whatever I assume you're doing to force full backups on Sundays). The tape-labelling/tapelist mechanism is NOT one of them; trying to subvert it is a REALLY bad idea.
Re: AW: archiving tapes?!
On Fri, Jan 28, 2005 at 01:59:32PM -0500, Eric Dantan Rzewnicki wrote:
> what are some good options for long term archival storage?

There's only one: redundancy!

I don't know the answer to the question you're actually asking. All the media I know of are either not great under typical, less-than-ideal conditions (magnetic) or too new for there to be much real-world data (optical) -- not that I've made much of a study of it recently, I admit. But whatever technology (or technologies) you choose, making multiple copies is excellent insurance.
Re: Removing unwanted dumps.
On Mon, Jan 31, 2005 at 12:17:18PM -0500, Jon LaBadie wrote:
> I can't see anything that would cause problems if you delete
> [unneeded dumps that are still in holding disk].
> Others may.

The curinfo database will still think the dumps exist, won't it? That might confuse amrecover until the dumps in question have been superseded by newer ones (i.e. at the same or a lower dump level). Then again, maybe these particular dumps have already been superseded -- maybe that's why they're unneeded in the first place.

How long would it take for this discrepancy to clear itself up? (I know the answer for dumps that *have* been taped -- until the tape gets overwritten, one tapecycle hence. How does it work for untaped dumps like these?)
Re: Removing unwanted dumps.
On Mon, Jan 31, 2005 at 01:11:48PM -0500, Jon LaBadie wrote:
> The curinfo db has very little info that would make a difference.
> As to particular dumps, things like date, size, level are only
> kept for the most recent at each level.

I misspoke. I should have said, "curinfo + indexes + any other info that Amanda keeps around". What you say applies to curinfo itself, but what about the rest? If nothing else, those index files will take up some space; will they ever be deleted, or will they hang around forever?
Re: Trouble with amrestore
On Wed, Feb 02, 2005 at 08:13:06AM -0800, Steve H wrote:
> What I am confused about, is the client thinking itself is the server.

One way to change this is "amrecover -s <indexserver> -t <tapeserver>". Maybe the default can be changed by an option to configure; I'm not sure about that.
Re: mounting/identifying a new tape drive
On Thu, Feb 03, 2005 at 12:47:34PM -0500, Jon LaBadie wrote:
> rm -f /dev/rmt/*
> devfsadm -c tape

Does that do (the tape-related subset of) the same thing as a reconfiguration boot, i.e. with "-r"?
Re: mounting/identifying a new tape drive
On Thu, Feb 03, 2005 at 03:02:36PM -0500, Gil Naveh wrote:
> but I don't think it identified the new tape drive...
> > mt -f /dev/rmt/0n
> I got:
> > /dev/rmt/0n: no tape loaded or drive offline

On the contrary, I think that means your devfsadm command *did* work. You're now getting a tape-specific error message, instead of a generic one; that means the system now understands that /dev/rmt/0n is in fact a tape drive.

So, taking the message at face value... was there a tape loaded? Was the drive online?
Re: VXA-2 packet-loader issues and AMANDA [Fwd: hard luck with the new autoloader]
On Thu, Feb 03, 2005 at 03:03:22PM -0500, James D. Freels wrote:
> Here is what [drive vendor's Tech Support said] is needed:
>
> 1) need a separate scsi chain; they said I already have too many scsi
> devices on this chain to make it reliable.

See recent threads re. SCSI cables, bus lengths, etc. (recent == the last month or two).

> 2) need to upgrade the firmware in the autoloader to the latest version;
> this may not work on an alpha machine and more likely will only work
> from an Intel machine

I sure hope you mean only that the upgrade *process* might need an Intel box. If that's the case, doing the firmware upgrades is the cheapest and probably easiest thing to try, even if you do have to do some temporary recabling (well, as long as you have an Intel box with a SCSI adapter...)

If, on the other hand, you mean that once upgraded the unit might be less Alpha-compatible than it was before... send the #&!*~ thing back for a refund! :-)
Re: VXA-2 packet-loader issues and AMANDA [Fwd: hard luck with the new autoloader]
On Fri, Feb 04, 2005 at 07:21:59PM -0500, Gene Heskett wrote:
> Aha, LVD! LVD is not compatible with the rest of the system unless
> the rest of the system is also LVD. It is two, completely seperate
> signalling methods that just happen to use the same cabling.

Yes and no. From the SCSI FAQ:

    "[ANSI] specified that if an LVD device is designed properly, it can
    switch to S.E. [single-ended, i.e. "normal", SCSI] mode and operate
    with S.E. devices on the same bus segment."
    - http://h000625f788f5.ne.client2.attbi.com/scsi_faq/scsifaq.html#Generic099

So if you mix it with S.E., you lose its LVDness -- e.g. you have to stick to a S.E. bus length -- but you shouldn't fry any hardware.

HVD (high-voltage differential, i.e. the original differential variant of SCSI) is another story completely! That is indeed flat-out incompatible with S.E. (and presumably with LVD too...)
Re: VXA-2 packet-loader issues and AMANDA [Fwd: hard luck with the new autoloader]
On Mon, Feb 07, 2005 at 04:41:17PM -0500, Jon LaBadie wrote:
> Can on look at the device connectors, or better yet, the external connectors,
> and tell if a device is LVD or SE? Or does one have to check the HW doc?

I have no idea. Sorry.
Re: Flushed, but not forgotten
On Mon, Feb 14, 2005 at 05:05:40PM +, Gaby vanhegan wrote:
> If these dumps on the holding disk get superseded by other, more recent
> backups, will they automatically be removed from the holding disk?

Yes, when they're amflushed. (Unless there's something else preventing it, although I can't think what that might be. But simply the fact that they've been superseded doesn't keep them from being deleted once they've been flushed.)
Re: VXA-2 packet-loader issues -- a scsi question added
On Tue, Feb 15, 2005 at 10:53:55AM +0100, Geert Uytterhoeven wrote:
> I thought[*] 7 was the highest priority, and 0 the lowest (on a narrow
> channel).

That's what I recall too.

> Wide devices have an even lower priority: 15 to 8.

This sounds vaguely familiar too, but I'm *far* less certain about it.
Re: forcing skipped incrementals into holding disk
On Tue, Feb 15, 2005 at 01:26:20PM -0500, Brian Cuttler wrote:
> If you are really looking to get all of the dumps, regardless of what
> will actually fit on the tape you could always lie about the tape length.

This is less desirable than some of the other options, at least if you're using any of the "whatever-fit" taperalgos. In that case, Amanda chooses which DLE to tape next based partly on how much tape it thinks is left. If you've lied about the tape length, Amanda will sometimes pick a DLE that's too big, under the mistaken impression that there's room for it. This wastes tape space, and time as well.
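To make that concrete, here's a toy sketch of the selection step a "whatever-fit" algorithm performs (this is my own illustration, not Amanda's actual code); feed it an inflated "remaining" figure and it will happily pick a dump that won't really fit:

```shell
# largest_fit REMAINING SIZE...: print the largest SIZE that still fits
# in REMAINING; print nothing if none fits.  A loose illustration of a
# "whatever-fit" taperalgo -- NOT Amanda's code.
largest_fit() {
    remaining=$1; shift
    best=
    for s in "$@"; do
        # skip candidates bigger than the (believed) remaining tape
        if [ "$s" -le "$remaining" ]; then
            if [ -z "$best" ] || [ "$s" -gt "$best" ]; then
                best=$s
            fi
        fi
    done
    [ -n "$best" ] && echo "$best"
}

# With the honest length:      largest_fit 100 30 120 80   -> 80
# Lie that 200 KB is left:     largest_fit 200 30 120 80   -> 120,
# which will actually overrun the tape.
```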
Re: Exabyte VXA-2-Packet-Loader: new problem
On Tue, Feb 15, 2005 at 01:40:44PM -0500, Jon LaBadie wrote:
> If I issue another command to the drive before it is really
> ready, even an "mt status", I get error messages. Thus I routinely
> put in delays (sleep's) in scripts that might rewind a tape or change
> a tape to another slot. As much as 20 or 30 second delays IIRC.

Rather than a hard-coded delay (which is suboptimal if it's too long, and breaks things if it's too short), why not write a little "mtsync" script that polls the drive for readiness, by running "mt status" in a loop until one succeeds?
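Such an mtsync could be little more than a retry loop; here's a sketch (the function, its name, and the polling limits are mine -- adjust the mt invocation for your drive):

```shell
# wait_ready CMD [ARGS...]: run CMD repeatedly until it succeeds,
# polling every POLL_DELAY seconds, giving up after POLL_TRIES tries.
wait_ready() {
    tries=0
    until "$@" >/dev/null 2>&1; do
        tries=$((tries + 1))
        if [ "$tries" -ge "${POLL_TRIES:-60}" ]; then
            return 1        # drive never became ready
        fi
        sleep "${POLL_DELAY:-2}"
    done
    return 0
}

# e.g., in a changer script, instead of "sleep 30":
#   wait_ready mt -f /dev/nst0 status || exit 1
```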
Re: Amanda's report
On Wed, Feb 16, 2005 at 09:45:38AM -0500, Gil Naveh wrote:
> What does STRANGE means?

Amanda looks through the stderr output from dump (or gtar) and tries to classify each message as either an error or benign verbosity. "STRANGE" means that there was a message that Amanda doesn't recognize, and so can't classify. As others have pointed out, you have to look at the particular message to determine for yourself whether it indicates a serious problem. (In your temp-file case, I kind of suspect it does...)

This isn't relevant to your case, but for general reference: the messages that Amanda can recognize are in hard-coded lists in sendbackup-dump.c and sendbackup-gnutar.c. If you get a recurring STRANGE message and want Amanda to classify it for you, you can add an appropriate regex to the appropriate list.
amanda-users@amanda.org
On Wed, Feb 16, 2005 at 04:59:13PM -0300, Germán C. Basisty wrote:
> Can't determine disk and mount point from $CWD '/root'

That's normal; don't worry about it.

> EOF, check amidxtaped..debug file on bombon.

So what's in the amidxtaped..debug file on bombon?
Re: AMDUMP not working under DEBIAN 3
On Tue, Feb 15, 2005 at 07:00:04PM -0500, Gene Heskett wrote:
> [running amdump in] The
> group backup is also generally acceptable.

It depends. If you're using gtar, that runs as root, so indeed, the group shouldn't be relevant. (Avoiding important groups like root is a good idea from a security point of view, but shouldn't affect correct operation.)

If you're using dump, though, amdump usually has to run in the group that owns the relevant special files (/dev/whatever). Which group that is, is system-dependent; I've seen "disk", "operator", and "sys". (Of course you could chgrp the special files instead, but that's less wise, because something -- a system upgrade, say, or an automated file-permissions fixer -- might chgrp them back.)

I said amdump *usually* has to run in that group, because on some systems dump needs to run as root; in that case, I don't know whether the group matters -- same reasoning as for gtar.

Hmmm, maybe rundump could take care of running dump in the correct group, on those systems where that matters, instead of not being used at all on such systems...
Re: invalid compressed data--crc error and other corruption on disk files
On Fri, Feb 18, 2005 at 11:36:46AM +, Thomas Charles Robinson wrote:
> [an excellently clear, concise, and complete [1] problem report
> -- thank you! -- which included the following:]
>
> gzip: stdin: invalid compressed data--crc error

All of tar's varied complaints appear to stem from corrupt input, which in turn is adequately explained by this message. Thus, either gzip or hardware looks like the culprit.

RAM is a good place to look, especially considering that the data being backed up all resides on the Amanda server; you're giving that box quite a workout. The disk and its bus (SCSI, IDE, etc.) are possibilities too, but less likely IMO -- I'd expect the kernel to detect and report the I/O errors in that case.

Not to completely rule out problems with Amanda itself -- I've learned never to rule *anything* out where computers are concerned (or humans, for that matter :-/) -- but it seems unlikely. As for gtar, 1.13.25 is well regarded on this list. 'Nuff said, until its input is known to be good. (After all, even if, hypothetically, tar were producing complete junk, gzip should be able to compress and decompress that junk without reporting CRC errors :-)

> gzip-1.3.3-9

... is a beta. It might be worthwhile to try the latest released version, 1.2.4. From the web page, it looks as though that version can't handle files over 2 GB, so you'd have to split up any larger DLEs -- or just disable them for the duration of the test; no loss, since it's not as if you have usable backups of them now :-(

Another useful test would be to temporarily disable software compression completely. That should fairly quickly tell you whether the corruption is occurring during gzipping (whether gzip itself or hardware is the ultimate source of the problem).

> Lastly, I am currently using an nfs share for the holding disk but this
> was NOT being used previously and I was still getting the corruption
> mentioned.
Hmm, did you ever run with a local holding disk while explicitly testing holding-disk files as you're doing now? I.e. was there ever a point where neither NFS nor the tape drive was in the loop? I'm wondering about the possibility that two independent sources of data corruption -- NFS and the tape subsystem -- might be confounding your attempts to isolate "the" problem.
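One cheap way to catch corruption in the act with neither NFS nor the tape drive in the loop is to checksum the same holding-disk file twice: two reads of an unchanging file that disagree point at RAM or the bus rather than at gzip or tar. A sketch (the function name and the example path are mine):

```shell
# double_read FILE: read FILE twice and compare checksums.
# A mismatch on back-to-back reads of an unchanging file smells
# like flaky RAM or a flaky bus, not a software bug.
double_read() {
    s1=$(cksum < "$1") || return 2      # I/O error on first read
    s2=$(cksum < "$1") || return 2      # I/O error on second read
    if [ "$s1" != "$s2" ]; then
        echo "read mismatch on $1 -- suspect hardware" >&2
        return 1
    fi
    return 0
}

# e.g.  double_read /amanda/holding/20050218/somehost._usr.0
#       (the path is illustrative)
```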
Re: invalid compressed data--crc error and other corruption on disk files
On Fri, Feb 18, 2005 at 05:33:44PM +, Thomas Charles Robinson wrote:
> On Fri, 2005-02-18 at 16:30, Eric Siegerman wrote:
> > On Fri, Feb 18, 2005 at 11:36:46AM +, Thomas Charles Robinson wrote:
> > > [an excellently clear, concise, and complete [1] problem report

Oops, I forgot to type the footnote :-)

[1] "Concise and complete" might sound like an oxymoron, but it's not.

> > I'm wondering about the possibility that two independent
> > sources of data corruption -- NFS and the tape subsystem -- might
> > be confounding your attempts to isolate "the" problem.
>
> I was trying all the manual checks before I started using the nfs
> volume. Although it may be a factor I'm prepared to continue using the
> volume at this stage

Agreed. That you were experiencing the problem when explicitly checking local holding-disk files pretty much does in my hypothesis. I won't rule out NFS problems, but that's just on general principles (see my previous post) :-) NFS is very likely a red herring.
Re: invalid compressed data--crc error and other corruption ondiskfiles
On Fri, Feb 18, 2005 at 04:49:32PM +, Thomas Charles Robinson wrote:
> An interesting point is that after a second run of my test 'some' of the
> dump-files verified as good. This indicates a intermittent problem.
> Would bad memory gives this type of behaviour?

Oh yeah! That sure smells like a hardware problem of some sort...

BTW, I might have been wrong earlier about one thing, and misleading about another:

- I said that the kernel would detect SCSI- or IDE-bus errors; on second thought, I'm not so sure. It depends on the bus and its age. Any semi-recent SCSI revision has parity checking; though I know a lot less about IDE, I believe that semi-recent versions of that do CRC checking. But old IDE doesn't have any bus-error detection mechanism at all, and in truly ancient SCSI it's optional. If a bus doesn't have error detection, errors might well manifest as data corruption instead of as kernel log messages :-/

- If you do indeed have a hardware problem, removing gzip from the loop *might* remove just enough load from the machine to stop the hardware from malfunctioning; so if the problem goes away when you disable software compression, that *suggests* a gzip problem, but doesn't *confirm* it. Of course you could always run a few independent, long-running gzip's at the same time as amdump to restore the system load -- you know, something like:

    gzip </dev/zero >/dev/null &

as many times as Amanda now runs simultaneous gzip's.
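A sketch of such a load generator (the function name and the count are my own inventions); it prints the pids so the extra load can be killed off once the test run finishes:

```shell
# spawn_gzips N: start N never-ending gzip processes in the background,
# to keep CPU load comparable to Amanda's simultaneous gzips.
# Prints the pids, so the caller can "kill" them when done.
spawn_gzips() {
    n=$1
    pids=
    while [ "$n" -gt 0 ]; do
        gzip -9 </dev/zero >/dev/null &
        pids="$pids $!"
        n=$((n - 1))
    done
    echo "$pids"
}

# usage:
#   pids=$(spawn_gzips 4)    # 4 = however many gzips Amanda runs
#   ... run the amdump test ...
#   kill $pids
```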
Re: invalid compressed data--crc error and other corruption on disk files
On Fri, Feb 18, 2005 at 04:27:30PM -0500, Gene Heskett wrote:
> It might be a beta, but its a beta thats been used by the whole planet
> since back in 2002, without noticable errors.

Fair enough. I stand corrected.
Re: Issue making amanda on Solaris 8
On Wed, Feb 09, 2005 at 12:30:44PM -0800, Steve H wrote:
> killpgrp.c:90: error: too many arguments to function `getpgrp'

One common cause of weird build problems on Solaris is using the wrong tool set. I don't know about this specific error, but it sounds like a mismatch between the variant of getpgrp() that "configure" detected and the one that the C compiler subsequently tried to use (the BSD getpgrp() takes a pid argument; the POSIX/SysV one takes none).

To fix it, make sure that /usr/ccs/bin is in your PATH, and that /usr/ucb is *not*. If that doesn't work, try it the other way around :-)

Hmmm, looking at the Solaris 8 box on which I'm typing this, it seems I built Amanda with both /usr/ccs/bin and /usr/ucb in my path -- but in the order stated here; perhaps you have /usr/ucb first. Maybe it's sufficient to make sure that /usr/ccs/bin precedes /usr/ucb.

After making any such path change, it's best to "make distclean" and rerun configure; otherwise, stale feature detections from the previous setup might continue to screw things up.
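To check which directory a given tool will actually come from -- and hence whether /usr/ccs/bin really does precede /usr/ucb -- here's a small sketch (the function is mine; "which" would do too, but its output format varies between shells):

```shell
# first_in_path NAME: print the directory whose NAME executable would
# be run first, given the current $PATH; fail if NAME isn't found.
first_in_path() {
    save_ifs=$IFS
    IFS=:
    for d in $PATH; do
        if [ -x "$d/$1" ] && [ ! -d "$d/$1" ]; then
            IFS=$save_ifs
            echo "$d"
            return 0
        fi
    done
    IFS=$save_ifs
    return 1
}

# e.g.  first_in_path make    # want /usr/ccs/bin here, not /usr/ucb
```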
Re: Runtar error
On Fri, Feb 18, 2005 at 09:10:30AM -0600, Dege, Robert C. wrote:
> runtar: error [must be setuid root]

On Fri, Feb 18, 2005 at 10:49:46AM -0600, Dege, Robert C. wrote:
> -rwsr-x---  1 root  amanda  9947 Feb 16 10:43 runtar
> [plus evidence that this copy of runtar *is* the one being
> used]

Hmm, that looks like runtar complaining, so it must have been executed. That argues against the hypothesis that Amanda can't run runtar at all because it's not in the "amanda" group. And runtar clearly is setuid root.

I wonder if the file system is mounted "nosuid". You could test that by copying the "id" program into the directory where runtar lives, making it setuid root, and running it as a nonroot user to see what it says. (MAKE SURE to nuke your copy as soon as you're finished with it; "id" presumably hasn't been audited for setuid-safety!) On a Solaris box, I get (I've edited out the list of secondary groups):

    % pwd
    /home/erics/test

    % ls -ld id
    // I took away its world-execute more for security paranoia
    // than for the sake of strictly emulating runtar's perms
    -rwsr-x---  1 root  erics  8044 Feb 21 14:39 id

    // The real "id" command just says I'm me -- ho hum
    % /bin/id -a
    uid=1000(erics) gid=1000(erics) groups=...

    // My setuid-root "id" command. Still says my uid is my own,
    // but note the "euid=0(root)"; that's what we're looking
    // for. (euid==0 && uid==<you>) is the sign of a
    // setuid-root executable. (Similarly with gid's for setgid,
    // but that's not relevant here.)
    % ./id -a
    uid=1000(erics) gid=1000(erics) euid=0(root) groups=...

    // And just as a check, run it from a root shell; the "euid="
    // has gone away, since euid and ruid are now both 0.
    # ./id -a
    uid=0(root) gid=1(other) groups=...
Re: Amanda-2.4.4p4 - RedHat 7.3 - Samba-3.0
On Thu, Feb 24, 2005 at 09:19:31AM +, Tom Brown wrote:
> FAILED AND STRANGE DUMP DETAILS:

I posted about this in the last week or two. Look for "strange" in the list archives.
Re: Amanda's report
On Thu, Feb 24, 2005 at 12:52:17PM -0600, Karl W. Burkett wrote:
> mount("/dev/md/dsk/d92", "/tmp/.rlg.10aqFe/.rlg.10aqFe",
> MS_RDONLY|MS_DATA|MS_OPTIONSTR, "ufs", 0xFFBFEBBC, 4) = 0

That'd be the fundamental reason that ufsdump wants root. That it fails to create the temp directory otherwise turns out to be pretty irrelevant, since what it does with the thing requires root in the first place :-/

One can only guess what it's doing, but from Jon(?)'s observation that the problem only arises on partial-filesystem dumps, my guess would be that it's figuring out which inodes to dump by traversing the file system via readdir() like any other process -- even though it does the actual backup directly, via the special file. Well, I've seen things done in even weirder ways...
Re: CVS info on FAQ
[Cc'ed to -hackers; followups should probably go there]

On Fri, Feb 25, 2005 at 09:13:23PM +, Gavin Henry wrote:
> Could some update the details on howto checkout things from CVS?
> [...]
> Just DNS changes.

A guide to the repository would also be useful. In particular, which module does one want? There appear to be at least "amanda", "amanda-2", and "amanda-krb-2" to choose from. Then there are other modules like "cgi-support", which I guess are for www.amanda.org rather than for the package itself, but it'd be nice not to *have* to guess :-) (Easy for me, since I know Amanda doesn't do CGI, but someone new might not know that.)

This update should go on Amanda's "CVS" page on SourceForge, too.
Re: Linux Storage Options
On Mon, Feb 28, 2005 at 02:17:04PM -, Gavin Henry wrote:
> Actually, Amanda is very enterprise ready.

On Mon, Feb 28, 2005 at 03:47:31PM -0500, Brian Cuttler wrote:
> Amanda is limited to DLE (DiskList Entries) that do not exceed the
> capacity of a single tape volume.

On Wed, Feb 23, 2005 at 02:04:50PM +, Bruce S. Skinner wrote:
> There will always be a race between [disk and tape]
> for the biggest or fastest and there is no telling who is
> going to be out front next week, let alone next year. It is not
> reasonable to design a system that you know is likely to break
> whenever disk capacity pulls ahead in the race. If in that
> eventuality you are going to have to adopt another backup solution,
> the reasonable choice is to adopt that other backup solutiuon now.

Gentlemen, your timing is impeccable :-) :-)
Re: Amanda Backup
On Wed, Mar 02, 2005 at 01:10:53PM -0500, Jon LaBadie wrote:
> starttime -100 # start 1 hour before amdump is started :))

You could kind-of actually do this. It'd just mean "delay everything *else* by 1 hour". Whether there's a reason to implement it is another question entirely.
Re: Some questions about Amanda's emailing of backup reports
On Thu, Mar 03, 2005 at 12:35:03PM -0500, Jon LaBadie wrote:
> On Thu, Mar 03, 2005 at 11:23:51AM -0600, Hull, Dave wrote:
> > The amanda account has a procmail filter that saves a copy to a
> > local file and forwards another copy on to the required persons.

I just do that manually -- save the report to a particular email folder after I've looked it over. But yes, a procmail recipe would be even better.

> Good solution, particularly if the local file can be the report content
> without the email headers.

Personally, I like keeping the headers -- that way, if I'm looking for the reports that match certain criteria, I can often use my mail client to give me a filtered view of them. But if you want the headers stripped, just have procmail pipe the emails through a script that does that.
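Such a header-stripping script can be a one-liner; here's a sketch (the name is mine) that deletes everything up to and including the first blank line, i.e. the RFC 822 header block:

```shell
# strip_headers: read a mail message on stdin; emit only the body.
# The header block ends at the first empty line, so delete lines
# 1 through that empty line.
strip_headers() {
    sed '1,/^$/d'
}

# usage:  strip_headers < report.eml > report.txt
```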
Re: Unusual dump times?
On Tue, Sep 16, 2003 at 07:35:55AM -0400, Jack Baty wrote:
> With everything finally working, I'm wondering if my dump times are
> excessive or to be expected.
> [...]
> marvin.fusio /usr0        7009180 4311267 61.5 177:48 404.1 31:35 2275.2
> scooby.fusio /usr1          55450    5443  9.8   8:49  10.3  0:11  515.1
> stewie  -e/projects  1        200     200   --   0:02  96.7  0:00 9069.9
> stewie  -e/software  1       2290    2290   --   0:06 379.9  0:01 3149.8
> stewie  hda2         2      14380    2535 17.6   1:26  29.6  0:03  996.3

That depends on so many things that it's hard to give a simple answer:
client hardware, O/S, dump vs. gtar, many small files vs. fewer big
ones, network technology, network saturation, etc. etc. etc.  (And if
you set all those out for me, I'd still have difficulty saying "yes,
it's reasonable" or "no, it isn't".)

> I plan to gradually include more machines totalling about 20GB.  If
> all the hosts take as long as marvin (below), things could end up
> taking more than 12 hours to run.

Well, to do a full backup on them all, maybe.  But you won't be doing
that -- as with the run you quoted, most of the DLE's are doing
incrementals on any given night, so the one or two full backups
dominate the stats.

We run a two-configuration setup here (three actually, but two of them
are similar enough that for this discussion I'm treating them as one):
  - A daily backup to disk (i.e. file:), which is a standard Amanda
    configuration of mixed fulls and incrementals
  - A weekly full backup of everything to tape

The weekly backup is about 50 GB, and takes about 23 hours.  That's
why it runs Friday night :-)  But the last 30 dailies took between
0:37 and 4:33 each, with 80% of them under 3 hours.  Sending them to
tape would presumably slow down the total duration, but with enough
holding disk, the clients shouldn't be affected much.

(There's lots of optimization I could do, for both configurations --
I'm not at all happy with the level of parallelism I'm getting.  So
far I haven't needed to worry about it.)
So if you're using a standard configuration, where you let Amanda
schedule full and incremental backups, I'd add in the rest of your
DLEs and let Amanda run for a dumpcycle or two before worrying too
much about it.  Just add them in a few at a time, or you *will* face
some very long dump times at first, since the first dump Amanda does
of any given DLE has to be a level-0.

> Wondering if I should just stop using compression.

Again, that would depend on just what the bottleneck is.  If it's CPU
usage on the client, try reducing from --best to --fast as someone
else suggested, or try changing to server-side compression.  If it's
network bandwidth, go the other way: move compression from server to
client, and/or increase the compression level until you start maxing
out the client CPU.

You can't make meaningful optimizations until you know what to
optimize for, and you can't know that until you can find out, or
hypothesize, which resource is saturated.

--
|  | /\
|-_|/  >   Eric Siegerman, Toronto, Ont.        [EMAIL PROTECTED]
|  |  /
When I came back around from the dark side, there in front of me would
be the landing area where the crew was, and the Earth, all in the view
of my window.  I couldn't help but think that there in front of me was
all of humanity, except me.
        - Michael Collins, Apollo 11 Command Module Pilot
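One quick way to see whether compression is the CPU bottleneck on a
client is to time the two gzip levels directly.  This is only a
sketch: the synthetic test file below is a stand-in (highly
compressible zeros-turned-letters), so use a representative chunk of
your real backup data for numbers that mean anything.

```shell
# Compare CPU cost of gzip --fast vs. --best on a sample file.
testfile=$(mktemp)
dd if=/dev/zero bs=1k count=512 2>/dev/null \
    | tr '\0' 'a' > "$testfile"          # crude stand-in for real data

time gzip --fast < "$testfile" > /dev/null
time gzip --best < "$testfile" > /dev/null

rm -f "$testfile"
```

If --best takes several times the CPU of --fast for only a modest gain
in compressed size, dropping the level (or moving compression to the
server) is the obvious first experiment.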
Re: How to backup the firewall host itself?
This is only tangentially related to Amanda, but it seemed worth
posting to the list to get others' input.

On Thu, Sep 18, 2003 at 03:02:23PM -0300, Bruno Negrão wrote:
> I have an amanda server on my DMZ and i like it to backup my firewall
> machine (the amanda client).

Are you *really* sure you want to do this?  The security implications
are pretty frightening!

If an intruder takes over your Amanda server, they can hack Amanda to
write corrupted backups.  They might stick a trojan into the backup,
then wait for you to restore from it.  Ok, that's pretty far-fetched,
but how about this?

An intruder who takes over a machine on the DMZ can use it to stage
attacks on the firewall.  Because you've opened up ports on the
firewall to accept Amanda-related connections from the DMZ Amanda
server, you've given the intruder more ports to attack.  Worse yet,
because you have an Amanda client on the firewall, configured to
accept connections from the DMZ server, an intruder can exploit any
security problems (buffer overruns etc.) in Amanda itself!

At the very least, an intruder who takes over your Amanda server can
grab a full backup of the firewall machine -- including the firewall
rules, which they can then study to look for holes.

It seems to me to be *much* safer to put the Amanda server on your
internal network and have it reach *out* through the firewall to the
DMZ machines.  (You still weaken your firewall's security this way,
but not nearly as much, because the Amanda server itself is now much
less subject to attack.)
Re: ANSWER: pre- and post-dump script?
On Thu, Sep 18, 2003 at 09:52:32AM -0400, Kurt Yoder wrote:
> 1. compile amanda with tar=/usr/local/bin/tar
> 2. copy or symlink tar to /usr/local/bin/realtar
> 3. create a script /usr/local/bin/tar
> 4. chmod 755 /usr/local/bin/tar

The problem with this is that users typically have /usr/local/bin in
their paths, so now they'll get your script instead of a vanilla tar
command.  Breaking the environment like this is truly evil; your users
will not take it kindly!  Much better would be:

1. compile amanda with tar=/usr/local/libexec/amandatar (or whatever
   you prefer, as long as it's not both called "tar" and in peoples'
   paths)
2. leave the real tar well alone!!
3. create a script /usr/local/libexec/amandatar
4. chmod 755 /usr/local/libexec/amandatar
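A minimal sketch of what such an amandatar wrapper might look like --
the real-tar path and the pre/post hook paths are hypothetical; the
only firm point is that it lives outside users' PATHs and leaves the
real tar untouched.  It's written via a here-doc so the wrapper text
is shown verbatim, followed by a smoke test that substitutes echo for
the real tar.

```shell
# Create the wrapper (installed for real as
# /usr/local/libexec/amandatar; written to a temp file here).
wrapper=$(mktemp)
cat > "$wrapper" <<'EOF'
#!/bin/sh
# Sketch: run optional site-specific hooks around the real tar.
# REALTAR and the hook paths are assumptions -- adjust to your site.
REALTAR=${REALTAR:-/usr/local/bin/tar}
PRE=/usr/local/libexec/amanda-pre-dump
POST=/usr/local/libexec/amanda-post-dump

[ -x "$PRE" ] && "$PRE"         # e.g. quiesce a database first
"$REALTAR" "$@"
status=$?
[ -x "$POST" ] && "$POST"       # e.g. wake it up again
exit $status                    # report the real tar's exit status
EOF
chmod 755 "$wrapper"

# Smoke test: substitute echo for the real tar, so the wrapper just
# prints the arguments it would have passed through.
REALTAR=echo "$wrapper" -cf - /some/dir
```

Because Amanda was compiled with tar= pointing at the wrapper, only
Amanda's dumps get the pre/post behaviour; users' interactive tar
invocations are unaffected.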
Re: Obtaining CVS source
On Fri, Sep 19, 2003 at 11:01:04AM +0100, Stevens, Julian C wrote:
> Please can someone advise me how to obtain Amanda source via cvs?
> My only internet access is from an NT workstation(!) via a proxy
> server, but I don't know how to tell CVS about this.

I just installed TortoiseCVS (www.tortoisecvs.org) on somebody's
Windows machine here.  (I don't use it myself, since I pretty much
stick to UNIX -- but if I were a Windows type, I think it'd be my
preferred CVS client.)  I didn't need to configure proxy info for our
use of it, but I seem to recall that its Prefs dialog had a place for
it.  FWIW...
Re: Use /dev/nst0 or /dev/nrst0?
On Fri, Sep 19, 2003 at 04:00:51PM -0300, Bruno Negrão wrote:
> I'm using a redhat linux to make amanda backups based on tar.  Should
> I use the device /dev/nst0 or /dev/nrst0?
>
> What does the r letter stand for?

Not sure.  Historically, it has meant "raw", i.e. a character special
file; no prefix meant "cooked", i.e. the block special file for the
same underlying device.  But it's been a *long* time since I've seen a
tape device with both raw and cooked versions, and Linux seems to have
done away with the distinction even for disk devices.  Now, I suspect
"r" stands for redundant :-)

I'd be curious to see the output of "ls -ld /dev/*st0" on your
system...
Re: amanda skipped two runs.
On Mon, Sep 22, 2003 at 01:57:58PM -0500, Darin Dugan wrote:
> http://groups.yahoo.com/group/amanda-users/message/46310
>
> [...] Does anyone else think it odd that deleting that one line cures
> the warning without causing any problems?

Not at all.  As the referenced message says, amdump is doing something
whose effects are undefined.  The patch makes it not do that.

In case you're interested: "wait()" says to wait until one of the
caller's child processes exits, and then return the child's exit
status to the caller (in this case, the caller is amdump).
"signal(SIGCHLD, SIG_IGN)" says ... well, I won't go into what its
main purpose is, since that would take explaining all about signals.
But as a side-effect, it says, "don't keep around the child-process
information necessary to satisfy wait() calls".  In fact, I think this
side-effect is system-dependent.

So amdump is asking the kernel to do contradictory things: first,
"forget about my child processes; I won't be asking about them", then
"tell me about my child processes".  The patch simply makes it not
make the first request; thus the second request is no longer a
problem.

> I'm wondering if anyone has done a kernel upgrade to 2.4.20-20.9
> (RH9.0) as mentioned that it fixes signal delivery race condition,
> as I don't know whether to follow the Amanda patch, or the kernel
> patch.

There's no reason to assume they're mutually exclusive!  The Amanda
patch can't hurt, so do that for sure.  I don't know about the kernel
patch one way or the other.
Re: amrecover error
On Thu, Sep 25, 2003 at 12:34:57PM -0500, chris weisiger wrote:
> so how do i rewind the tape?

"mt rewind"
Re: Rewind before ejecting?
On Thu, Sep 25, 2003 at 11:57:35AM -0400, M3 Freak wrote:
> [...] should I just issue an "eject" command to the drive to spit the
> tape out, or do I have to rewind it before ejecting it?

It depends on the tape technology, I think.  DAT tapes rewind on their
own.  Some other kinds might not (though I don't really know).  What
kind of tape drive do you have?
Re: problems dumping certain filesystems
On Mon, Sep 29, 2003 at 12:06:48PM +0200, Paul Bijnens wrote:
> Marc Cuypers wrote:
> > Found the problem.  The firewall blocked communication between
> > taper and dumper.
>
> That's strange, because there is no immediate communication between
> these two, as far as I know.
>
> Driver is connected with a pipe to each dumper and to taper-reader.

I believe there is a dumper->taper connection, for direct-to-tape
dumps.  That's how I read docs/PORT.USAGE, anyway -- see the bits on
stream_server() and stream_client().  But both of those processes run
on the same host, so it's still hard to see how a firewall could get
between them.  Unless Amanda's running on the firewall machine itself
-- which I'd consider an unsafe idea anyway!

> Strange is also that some partitions on that host got backed up,
> while others did not.

It looks as though the ones that succeeded all dumped via holding
disk, so in those cases there was indeed no need for dumpers and taper
to talk directly.

> there was no error msg whatsoever in the mail report, except that it
> simply failed).

Yeah, that was the part that got my attention too!  But then, it's
2.4.2, so maybe that bug's been fixed.
Snapshot vs. -p1 (was Re: Lot's of I/O errors)
On Mon, Jul 14, 2003 at 11:07:10AM +0200, Toralf Lund wrote:
> I've been getting a lot of
>
> *** A TAPE ERROR OCCURRED: [[writing file: I/O error]].

On Mon, Jul 14, 2003 at 01:44:26PM +0200, Toralf Lund wrote:
> Note that I've now gone back to amanda-2.4.4, and successfully
> flushed some images that caused trouble in the past.  I think an
> important question here is whether something related to the holding
> disk handling or taping of images has changed since 2.4.4.

Did you ever learn anything more about this?  I'd like to upgrade from
2.4.4 to -p1 or the latest snapshot, but I'd appreciate your thoughts
first -- and those of anyone else who cares to speak up.

Thanks much.
Re: amdump - hosts report missing estimate
On Fri, Oct 03, 2003 at 06:06:55PM +0100, Steve Taylor wrote:
> planner: time 0.037: no feature set from host localhost
> error result for host localhost disk /var: missing estimate

Don't use "localhost".  Use the FQDN instead.

I'm beginning to think amcheck should print a warning if it sees
"localhost" in the disklist...
Re: Amanda not sending emails
On Fri, Oct 03, 2003 at 11:22:45AM -0400, M3 Freak wrote:
> Amanda is using "mail", and it works fine (just sent a test message
> using "mail").  Here's what I got when I typed in what you suggested:
>
> UNCOMPRESS_PATH="/usr/bin/gzip" MAILER="/usr/bin/Mail"

Note that sometimes "mail" and "Mail" are *not* (symlinks to) the same
program, so don't assume that they behave identically.

At some point in the distant past (at Berkeley, I believe), someone
came up with a spiffy new mail client (which we'd now consider as
awesome as, say, a 486 :-)  For backward compatibility, they couldn't
call it "mail", since that was the name of the old(er), ugly(er)
client.  Instead they picked a stupid name, "Mail", and for
compatibility with *them*, we've had to live with that ever since
(though Mail is sometimes known as "mailx" instead.)

> 02 4 * * * root run-parts /etc/cron.daily
> [...]
> Should I create a new file called, for example, "amanda.cron" under
> "/etc/cron.daily" instead of placing the cron entry for amanda in
> "/etc/crontab"?

NO!  Or rather, only if you want your amdump and amcheck to run:
  - as root -- which you don't, since they want to run as the amanda
    user (i.e. the value you gave for the --with-user= option to
    configure).  And for security reasons, that user should *not* be
    root!
  - at 4:02 AM, which is quite possibly too late for amdump, and is
    almost certainly too late for "amcheck -m" :-)

Either make separate entries in /etc/crontab, or use "crontab -e" to
create a crontab for the amanda user.  As far as I can tell, the two
methods are identical.  (The only difference is that the user
themselves can use "crontab -e", but only root can edit /etc/crontab.
For the amanda user, that's probably irrelevant.)

On Fri, Oct 03, 2003 at 12:29:06PM -0400, Jon LaBadie wrote:
> On Fri, Oct 03, 2003 at 11:22:45AM -0400, M3 Freak wrote:
> > [in crontab]
> > MAILTO=root
>
> The parameter in the amanda.conf file "mailto" (lowercase) is
> set correctly isn't it?
> But the uppercase form does appear in the source code.  I don't know
> if it might pick this up from the environment.

Doesn't look that way.  The only place MAILTO occurs (in the 2.4.4
source tree) is in server-src/conffile.c, where it's (a) an enum
constant, and (b) a string that's used to (case-insensitively) match
tokens from amanda.conf.

I think that the MAILTO environment variable is a red herring.
However, it might be useful in tracking down the real problem.  What
MAILTO in a crontab does is to tell cron where to email the program's
stdout and stderr, if there is any -- but of course there isn't from
"amcheck -m", since that mails its report instead.  But what if
amcheck is dying before it sends the email?  Like, because it's being
run as root instead of the amanda user perhaps...

Try removing the "-m", and setting MAILTO properly in the crontab, if
it isn't already.  Then "amcheck" will write its report to stdout, and
cron will mail you the results.  That way, if amcheck is aborting
early, you'll see the messages.

Also check the mail logs, as others have suggested; and the cron logs
too (they're in /var/log, or /var/cron, or /var/spool/cron, or
somewhere like that).  Either of those might tell you something
useful.
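To make the separate-crontab suggestion concrete, here's a sketch of a
crontab for the amanda user (created with "crontab -e" while logged in
as that user).  The times, config name, and install paths are
assumptions for illustration; the point is that amcheck runs early
enough in the evening for a human to fix problems, and amdump runs as
the amanda user rather than root:

```
# min hour dom mon dow  command
MAILTO=amanda-admin
0     16   *   *   1-5  /usr/local/sbin/amcheck -m DailySet1
45    0    *   *   2-6  /usr/local/sbin/amdump DailySet1
```

With MAILTO set, anything amdump or amcheck writes to stdout/stderr
(e.g. an early abort) gets mailed to you by cron itself, independently
of Amanda's own report mail.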
Re: Dumper Patch Problem
On Tue, Oct 07, 2003 at 10:57:50AM -0500, Jim Summers wrote:
> Reading the patch file does it mean I simply need to delete the line
> in dumper.c that has the SIGCHLD in it?

Correct.

> But I guess I should figure out the correct patch command also?

Well, you could, but for this particular patch, editing by hand is
probably easier :-)
Re: exclude question
On Tue, Oct 07, 2003 at 05:07:50PM -0400, Jean-Francois Malouin wrote:
> one subdir contains more than 100GB [i.e. tape size] but also has
> something close to 2000 subdirs.

You could write a script to generate the excludes dynamically.

Or maybe you could rearrange the directories a bit: split them up into
tape-sized subsets, and use symlinks to provide the appearance that
they're all still in one directory.  That is, if you currently have:

    /mountpoint/subdir/aardvark
    ...
    /mountpoint/subdir/zebra

turn it into something like:

    /mountpoint/subdir-storage/a-m/aardvark
    ...
    /mountpoint/subdir-storage/n-z/zebra

and:

    /mountpoint/subdir/aardvark -> ../subdir-storage/a-m/aardvark
    ...
    /mountpoint/subdir/zebra -> ../subdir-storage/n-z/zebra
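A sketch of that rearrangement in shell.  All the names (/mountpoint,
subdir, the a-m/n-z split) are the hypothetical ones from the example
above; the demo runs in a scratch directory created with mktemp, so
point MOUNT at the real mount point before doing this for real.

```shell
# Stand-in for /mountpoint, with two sample subdirectories.
MOUNT=$(mktemp -d)
mkdir -p "$MOUNT/subdir/aardvark" "$MOUNT/subdir/zebra"

cd "$MOUNT"
mkdir -p subdir-storage/a-m subdir-storage/n-z

# Move each subdirectory into its tape-sized subset, then leave a
# relative symlink behind so paths under subdir/ keep working.
for d in subdir/[a-m]*; do
    name=$(basename "$d")
    mv "$d" "subdir-storage/a-m/$name"
    ln -s "../subdir-storage/a-m/$name" "$d"
done
for d in subdir/[n-z]*; do
    name=$(basename "$d")
    mv "$d" "subdir-storage/n-z/$name"
    ln -s "../subdir-storage/n-z/$name" "$d"
done

ls -l subdir    # aardvark and zebra are now symlinks into the subsets
```

The disklist would then name /mountpoint/subdir-storage/a-m and
/mountpoint/subdir-storage/n-z as separate DLEs, each small enough for
one tape, while everything still appears under subdir/ for users.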
Re: exclude question
On Tue, Oct 07, 2003 at 11:19:37PM +0200, Paul Bijnens wrote:
> include "./[a-m]*"

That works too -- and it's a lot cleaner than either of my ideas :-/
Re: more doubts
On Fri, Oct 17, 2003 at 11:18:48AM +0200, JC Simonetti wrote:
> Just to be sure we are talking about the same thing [...]

We are indeed on the same wavelength.

> I have different values for the filemarks, measured with another
> program and not amtapetype.
> Do you know if the filemarks are application-dependant,
> tape-dependant, tape-and-taper-dependant ???

Doesn't surprise me.  A tape mark is a particular bit pattern on the
tape (well, duh!).  Whether it's a funny otherwise-illegal block,
which takes up some amount of extra space, as I gather was the case
with old mainframe 9-track tape; or a one-bit field in a block header
that would have been written anyway, as seems a reasonable hypothesis
for "filemark 0" technologies like DDS; or something else, depends on
the drive technology.  So in theory, all tape marks written by a given
drive should be the same size.

But you can't read a tape mark.  The whole idea is that it's a way to
represent an out-of-band signal, i.e. end of file.  So the drive reads
the tape mark, but all it'll tell you about that is, "I just read a
tape mark".  You don't get to see the magic bit pattern.  Thus,
programs *can't* find out a tape mark's size for sure; they have to
guess, using some heuristic or other.  Different programs will use
different heuristics, so I suppose they'll guess different values for
what is, in fact, the same quantity.  We can only hope that their
guesses are "close enough" to the real value.  Thus, the tape mark
itself is not application-dependent, but the estimate of its length
*is*.

> The IBM fms software tells that filemarks are
> tape-and-taper-dependant.  Do you know more?  Do you have any
> opinions concerning that?

By "tape-and-taper-dependent", do you mean, "dependent on the drive
and on the particular tape"?  If so, I'd agree.  In theory, as I said,
it depends only on the drive.  But in practice, this is mag-tape we're
talking about.  It's a notoriously unreliable medium.
One way of dealing with that, which was used by the old 9-track stuff,
is that writing a block consisted of:
 1. Write the block
 2. Go back and (try to) reread it
 3. If you can't read it, skip forward a bit (erasing? Not sure) and
    retry steps 1-2
 4. All of this was happening within the hardware, i.e. within one
    O/S-level I/O request.  There was a threshold (number of retries?
    length of tape consumed?  Not sure) after which the drive would
    give up and return an error status to the O/S
 5. Perhaps the O/S would retry the failed write a few times (steps
    1-4, i.e. each software retry would involve many hardware-level
    retries)
 6. Perhaps the O/S would then print a message on the console asking
    the operator what to do -- the mainframe equivalent of "Abort,
    Retry, Fail?".  If the operator said "retry", repeat steps 1-6.
    Only if the operator said "fail" would the application's I/O
    request (the local equivalent of the UNIX write() system call)
    finally return with an "I/O Error" status.

Of course the read algorithms in both hardware and O/S knew to
compensate for all of that.

So you can see that there are potentially an *awful* lot of retries
going on there, each one consuming a small chunk of tape.  Thus, a
tape record of a given length (whether a data block or a tape mark)
would always take up the same amount of tape *for the record itself*,
but the "inter-record gap" preceding the record could vary wildly in
length -- anywhere between a fraction of an inch and several feet.
Thus, the *apparent* length of a tape mark (or of any other tape
record, for that matter) would depend not only on how much tape the
bit pattern itself occupied (constant, I presume), but on whether you
happened to try to write it on a bad patch of the tape or a good one.

(I'm not sure how newer technologies deal with this error-prone-ness;
if they have better ways, the variability in apparent block lengths
might be a lot less.
But they haven't been able to reduce that variability to zero -- at
least not for DDS3, judging by my amdump reports.  And for streaming
technologies, that introduces yet another variable.)

So you can see that trying to intuit a tape mark's length is, to put
it mildly, a bit of a challenge.
Re: more doubts
On Fri, Oct 17, 2003 at 09:53:12AM -0400, Jon LaBadie wrote:
> The sermon is, unless you see really ridiculous filemark values,
> don't worry.  They have nearly zero impact on amanda planning.

Wisdom has been spoken :-)
Re: tape type for HP DDS3 C5708A
On Sat, Oct 18, 2003 at 03:29:13PM -0400, Jon LaBadie wrote:
> On Sat, Oct 18, 2003 at 11:24:05PM +0530, Rohit Peyyeti wrote:
> > FAILURE AND STRANGE DUMP SUMMARY:
> > localhost /h lev 0 FAILED [dumps too big, but cannot incremental
> > dump new disk]
> > [plus three more of the same]
>
> This suggests to me that the DLE's are each larger than your tape.

I don't think so.  In that case, the message would have been "dump
larger than tape, but cannot yada yada" (Phase 1 of delay_dumps()).
The "dumps too big" variant comes from Phase 2, when a full dump is
due, but planner wants to postpone it and do an incremental instead,
to fit all the DLEs onto the tape.  In the case of a new disk, as you
pointed out, the do-an-incremental-instead option isn't possible, so
planner's only choice is to skip the DLE entirely.

Rohit: The answer, as Jon said, is to add DLEs a few at a time -- or,
in your case, it looks like *one* at a time :-(, so as not to ask
Amanda to put more on a tape than will fit.

Or else just don't worry about it; leave all of the DLEs in there and
let them fight it out for tape space :-)  Sooner or later, they'll all
make it onto tape, as //windows/UsersG3 did this time.  I doubt that
it'll happen any faster if you follow the usual advice and only add
them slowly.  The only advantages of doing it that way are:
  - you won't get Amanda shouting at you that YOUR BACKUPS FAILED!!
  - if some DLEs are more important to back up than others, you get to
    put those into the backup system first, rather than letting Amanda
    choose randomly which one(s) to add in any given run

> > --> some NT_STATUS_OBJECT_NAME_NOT_FOUND errors <--

Someone recently posted a solution to that, I think.  Couldn't hurt to
check the archives for it.
Re: recommendation needed
On Fri, Oct 17, 2003 at 06:43:40PM -0400, Jon LaBadie wrote:
> I tried some throughput checks today.  Test one was a "cp -r" of a
> directory tree with 8.5GB (only a few large files) and test two was
> a ufsdump of a 1GB partition.  Both gave between 3 and 3.5MB/sec
> rates to the NFS device.  That certainly is higher than the 1MB/sec
> I get to tape, but quite a bit lower than the rate to a local disk.

NFS-2 write performance is lousy, since the server needs to
effectively fsync() on every write() call before it returns status to
the client.  (I suspect that one tends to notice this with large files
more than with small ones, because we expect lots-of-small-file
performance to be worse in any case.)  NFS-3 is supposed to be better,
as is FreeBSD's "nqnfs" (the "nq" stands for "not quite"), but I've
never tried either.

Here's a test, in addition to the ones Tony suggested:
  - Try copying a single large file from local disk to the NFS-mounted
    drive
  - Try copying the same large file, to the same partition on the NFS
    server, using FTP (if you use scp instead, turn off compression to
    keep it from becoming CPU-bound and confusing things)

Comparing the length of time for those two tests will tell you how
much of the problem is in NFS itself, and how much is in the lower
layers of the networking subsystem.

Also, if you can sit beside the Snap Server when doing those tests,
listen to it.  If its disks make noise when they seek, I bet you'll
hear a *lot* more of that during the NFS test than during the FTP one.
If there's a disk-activity LED, you might see a difference there too.
(I don't know how this manifests in more scientific measurements like
iostat results, if it does at all...
The completely unscientific seek-rattle is more viscerally convincing
anyway; at least I found it so :-)
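Not the FTP test itself, but a purely local illustration of why the
sync-per-write behaviour matters: the same amount of data written with
normal buffered writes versus with a sync forced on every write (which
is roughly what an NFS-2 server must do).  This is a sketch; GNU dd's
oflag=dsync is assumed, and absolute times will depend entirely on the
underlying disk.

```shell
# Buffered writes vs. per-write sync, same data volume.
f1=$(mktemp); f2=$(mktemp)

time dd if=/dev/zero of="$f1" bs=8k count=2000 2>/dev/null
time dd if=/dev/zero of="$f2" bs=8k count=2000 oflag=dsync 2>/dev/null

rm -f "$f1" "$f2"
```

On a spinning disk the dsync run is typically slower by a large
factor; that gap is the same effect the NFS-2 client is paying for on
every write() it sends to the server.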
Re: Question about new client setup
On Mon, Oct 20, 2003 at 11:53:34AM -0700, Dana Bourgeois wrote:
> alta / lev 0 FAILED [disk / offline on alta?]

I hate that message; it's extremely misleading.  Besides what it says,
it can also mean that sendsize was unable to parse the output of
whatever subcommand it ran.  That in turn can be because it used the
wrong subcommand entirely (e.g. /usr/ccs/bin/dump instead of ufsdump
on a Solaris box).  Only rarely, it seems, does "disk foo offline"
actually mean that disk foo was offline :-(

Try looking in the "amdump" file and the various debug files
(especially the sendsize* ones on the clients in question).

> I'm wondering if it's the mismatch between client version and server.

Unknown, but docs/UPGRADE doesn't mention it (and it does mention an
earlier incompatibility, so that'd be the place to look).  I'd try to
rule out other problems first.
Proposal for more run-time configuration
    --with-mmap                     force use of mmap instead of shared
                                    memory support
    --with-assertions               compile assertions into code
    --with-gnu-ld                   assume the C compiler uses GNU ld
                                    (default: no)
    --with-pic                      try to use only PIC/non-PIC objects
                                    (default: use both)
    --with-krb4-security=DIR        location of Kerberos software (default:
                                    /usr/kerberos /usr/cygnus /usr
                                    /opt/kerberos)
    --without-debugging=/debug/dir  do not record runtime debugging
                                    information in specified directory
    --with-tmpdir

These should be configurable at run time. Actually, both for backward compatibility and to make initial installation easier, it probably makes sense to keep the current configure-time options, but it should be possible to override their values in a run-time config file. This would be similar to the way Apache configuration works.

    --with-index-server=HOST        default amanda index server (default:
                                    `uname -n`)
    --with-config=CONFIG            default configuration (default: DailySet1)
    --with-tape-server=HOST         default restoring tape server is HOST
                                    (default: same as --with-index-server)
    --with-tape-device=ARG          restoring tape server HOST's no-rewinding
                                    tape drive
    --with-ftape-rawdevice=ARG      raw device on tape server HOST, if using
                                    Linux ftape >= 3.04d
    --with-changer-device=ARG       default tape changer device (default:
                                    /dev/ch0 if it exists)
    --with-gnutar-listdir=DIR       gnutar directory lists go in DIR (default:
                                    localstatedir/amanda/gnutar-lists)
    --with-maxtapeblocksize=kb      maximum size of a tape block
    --with-debug-days=NN            number of days to keep debugging files
                                    (default: 4)
    --with-dump-honor-nodump        if dump supports -h, use it for level 0s
                                    too
    --with-tmpdir=/temp/dir         area Amanda can use for temp files
                                    (default: /tmp/amanda)
    --with-testing=suffix           use alternate service names

For these, providing both configure-time and run-time configuration is especially useful, since configure "looks for one" by default:

    --with-gnutar=PROG              use PROG as GNU tar executable (default:
                                    looks for one)
    --with-smbclient=PROG           use PROG as Samba's smbclient executable
                                    (default: looks for one)

This one must be settable at configure time. For obvious bootstrapping reasons, it can't be overridden by a value in the run-time config file; but, as with Apache, it should be overridable on the command line:

    --with-configdir=DIR            runtime config files in DIR (default:
                                    sysconfdir/amanda)

These ones I'm not sure about. My arguments above point to making them run-time configurable, but there may be security-related arguments against this. (I've put the Kerberos-related ones in this class simply because I don't understand them...)

    --without-amandahosts           use .rhosts instead of .amandahosts
    --without-bsd-security          do not use BSD rsh/rlogin style security
    --with-portrange=low,high       bind unreserved TCP server sockets to
                                    ports within this range (default:
                                    unlimited)
    --with-tcpportrange=low,high    bind unreserved TCP server sockets to
                                    ports within this range (default:
                                    unlimited)
    --with-udpportrange=low,high    bind reserved UDP server sockets to ports
                                    within this range (default: unlimited)
    --with-user=USER                force execution to USER on client systems
                                    (required)
    --without-force-uid             do not force the uid to --with-user

Kerberos options:

    --with-server-principal=ARG     server host principal ("amanda")
    --with-server-instance=ARG      server host instance ("amanda")
    --with-server-keyfile=ARG       server host key file ("/.amanda")
    --with-client-principal=ARG     client host principal ("rcmd")
    --with-client-instance=ARG      client host instance (HOSTNAME_INSTANCE)
    --with-client-keyfile=ARG       client host key file (KEYFILE)
    --with-ticket-lifetime=ARG      ticket lifetime (128)

Special cases:

    --with-owner=USER               force ownership of files to USER
                                    (default: the --with-user value)

This is overloaded to mean both "set files to owner USER" and "run executables as user USER". It should be split into two; I DO NOT want Amanda running as the owner of its files -- least privilege and all that. Assuming it's split into --with-file-owner=USER to set file ownership, and --with-run-user=USER to specify who gets to run amdump, the former obviously has to be a configure-time option, and the latter goes into the "not sure because of security" list.

    --with-group=GROUP              group allowed to execute setuid-root
                                    programs (required)

As for --with-owner.

Not sure about these; I don't understand the issues well enough to have an opinion:

    --with-fqdn                     use FQDNs to back up multiple networks
    --with-buffered-dump            buffer the dumping sockets on the server
                                    for speed

These are splendid examples of what I'm advocating :-)

    --with-indexdir                 deprecated, use indexdir in amanda.conf
    --with-dbdir                    deprecated, use infofile in amanda.conf
    --with-logdir                   deprecated, use logfile in amanda.conf

--
Eric Siegerman,
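To make the proposal concrete, here is a sketch of what the run-time overrides might look like in an amanda.conf-style file. The directive names and host are invented for illustration -- none of this is existing Amanda syntax; it just mirrors the configure options above the way Apache's httpd.conf mirrors its build-time defaults:

```conf
# Hypothetical run-time overrides for what are currently configure-time
# options.  Directive names invented for illustration only.
index-server  "amandahost.example.com"   # would override --with-index-server
tape-server   "amandahost.example.com"   # would override --with-tape-server
tape-device   "/dev/nst0"                # would override --with-tape-device
gnutar        "/usr/local/bin/gtar"      # would override --with-gnutar
debug-days    4                          # would override --with-debug-days
```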
Re: NFS mount as second holding disk
On Thu, Oct 23, 2003 at 08:32:31AM -0500, Dan Willis wrote: > Has anyone successfully used an NFS mount as a secondary holding disk? Haven't tried it. > Can backups still be run through dump or should they all be tar > going this route? I don't see offhand why it would make a difference. From the Amanda server's point of view, they're both just byte streams coming in over a socket. > Or is this just not advisable at all? Perhaps this is overly alarmist, but over on the info-cvs list, the common wisdom is that people should *not* access their CVS repositories over NFS, but use the CVS client/server protocol instead. That's because interoperability problems between different O/S's have been known to knock holes out of files (whole blocks of NULs instead of the data that should be there). A recent thread discussed the problem, and the circumstances in which one can probably get away with NFS-mounting one's repo. See this message from Larry Jones, one of the main CVS maintainers at this point, and a guy who, IMO, generally seems to know what he's talking about: http://mail.gnu.org/archive/html/info-cvs/2003-10/msg00060.html and the final paragraph of this one: http://mail.gnu.org/archive/html/info-cvs/2003-10/msg00064.html (The rest of the thread is of less interest, since it deals with more CVS-specific issues.) There are circumstances in which that problem isn't as critical (e.g. CVS working directories; since everything's in the repo anyway, all you risk is your latest round of changes). But a backup isn't the kind of thing I'd want to gamble with -- especially not a compressed one, where an error would trash the rest of the file! -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / When I came back around from the dark side, there in front of me would be the landing area where the crew was, and the Earth, all in the view of my window. I couldn't help but think that there in front of me was all of humanity, except me. 
- Michael Collins, Apollo 11 Command Module Pilot
Re: log files lost
On Tue, Oct 28, 2003 at 04:23:08PM +0300, vlad f halilow wrote: > hello everyone. by different reason's log files of amanda war > deleted. i want to manual recover it from tape (amrecover do > not working, Warning: no log files found for tape serversXX for > all tapes ), but i cannot. > [...] > [EMAIL PROTECTED] # tar tvf /dev/rmt/0bn You can still use amrestore; it's only amrecover that needs index files, logs, etc. -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / When I came back around from the dark side, there in front of me would be the landing area where the crew was, and the Earth, all in the view of my window. I couldn't help but think that there in front of me was all of humanity, except me. - Michael Collins, Apollo 11 Command Module Pilot
Re: Rename tape labels?
On Fri, Oct 31, 2003 at 01:25:12PM +0530, Rohit wrote:
> Is there anyway I can change tape label names after I'm well into amanda
> dumpcycle? I currently have TLS-Set-1-01.. Set-1-02.. Set-2-01 and so on.
> I want to get rid of this set concept from the label.

This is untested; treat it with caution...

1. To begin the process, change the "labelstr" regexp in amanda.conf so
   that both the old and new label styles are recognized.

2. For the next tapecycle's worth of dumps, you'll have a mix of old- and
   new-style labels. Each day that a tape with an old-style label comes up
   to be reused:
   a. Do "amrmtape <config> <label>"
   b. Relabel the tape with a new-style label. Amanda will consider it to
      be a "new tape", but that's ok.

3. After all of the old-style tapes have been relabelled, change "labelstr"
   again, to accept only the new style of labels.

WARNINGS:

- DO NOT do (2) all at once for all of the tapes; you'll clobber your
  backups. Make sure you do those steps, for each tape, just before Amanda
  would have overwritten that tape anyway.

- If you're still on your first tapecycle, i.e. you're still adding new
  tapes one at a time, you'll have to finish adding in your new tapes
  before you can even begin step (2).

- During this process, you're circumventing all (or most) of Amanda's
  protections against clobbering the wrong tape. So be careful!

--
Eric Siegerman, Toronto, Ont.  [EMAIL PROTECTED]
Re: Spindle numbers in disklist
On Thu, Nov 06, 2003 at 10:05:19PM +0100, Alexander Jolk wrote: > I thought that by giving different spindle numbers to [NFS-mounted DLEs], amanda > would back them up in parallel (barring holding disk contention of > course). By default, Amanda will only back up one DLE *per client* at a time. You can fix that with the "maxdumps" parameter in amanda.conf, either globally or per-dumptype. "inparallel", by contrast, controls overall parallelism among the different clients. Before a new dump will start, among other constraints, there have to be fewer than "inparallel" dumps already running overall, *and* fewer than "maxdumps" dumps running on the client in question. The "client-constrained" state happens when either the "maxdumps" condition or the DLEs-per-spindle condition (which you've already dealt with) fails; if the "inparallel" condition fails, you get a different state reported -- "no-dumpers" I think. -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / When I came back around from the dark side, there in front of me would be the landing area where the crew was, and the Earth, all in the view of my window. I couldn't help but think that there in front of me was all of humanity, except me. - Michael Collins, Apollo 11 Command Module Pilot
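For reference, here is roughly how those two knobs sit in amanda.conf. The dumptype name below is made up for the example, but "inparallel", "maxdumps", and "spindle" are the real parameter names:

```conf
inparallel 4        # at most 4 dumps running overall, across all clients

maxdumps 2          # global default: at most 2 simultaneous dumps per client

define dumptype nfs-parallel {
    comp-user-tar   # parent dumptype, as in the sample amanda.conf
    maxdumps 3      # per-dumptype override for clients using this dumptype
    spindle -1      # -1 = no spindle constraint for this DLE
}
```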
Re: moved to new disk, now amanda wants to do level 0's on whole system
On Fri, Nov 14, 2003 at 09:20:23AM -0500, Jay Fenlason wrote:
> Also, cp/fr may not have correctly reset the modification times of the
> files when it copied them. Oh, and they may not handle links well
> either. To copy directory trees, I usually use "( cd /fromdir ; tar
> cf - . ) | ( cd /todir ; tar xpf -)", which preserves modification
> times, and permissions.

I've had problems with tar, too. Unfortunately, that was so long ago that I forget what they were. Maybe it stores only mtime in the tarball, and on extraction sets both mtime and atime to the saved mtime value. Oh, and I think it likes to (try to) copy the contents of special files, FIFOs, and the like, instead of recreating them in the destination tree.

Until recently, I used the cpio variant of your suggestion:

    cd /fromdir
    find . -depth -print0 | cpio -padmu0 /todir

(You need GNU find and cpio for the "0" part to work. -depth is to get the directories' mtimes copied properly. It makes each directory come *after* its contents in the file listing. Without -depth, the directory would come first; cpio would properly set its mtime, and then stomp on it by creating the directory's contents.)

But then I discovered rsync. Rsync rocks. "rsync -aH" copies everything the kernel lets you copy (i.e. not ctimes, and not inumbers). The only problem with rsync is the weird way it gives meaning to a trailing slash; these two are *not* equivalent:

    rsync -aH srcdir/ destdir
    rsync -aH srcdir destdir

Then again, I'm not sure whether either cpio or rsync can deal with a username that's changed its numerical userid, or similarly for groups. I think some tar's can. Or maybe it's cpio that can handle that; can't remember. And gtar probably doesn't have any of those problems -- people are using it for backups after all :-) -- but it's not always available, and even non-GNU cpio's do everything but the "0" trick. But all of those -- tar, cpio, rsync -- are kludges.
Is it just me, or do other people also find it ludicrous that 30+ years on, UNIX still doesn't have a proper copy command? -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / It must be said that they would have sounded better if the singer wouldn't throw his fellow band members to the ground and toss the drum kit around during songs. - Patrick Lenneau
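A minimal, self-contained sketch of the tar-pipe copy from Jay's suggestion, run against throwaway temp directories (all the paths and file names here are made up for the demo) so you can watch the mtime survive the copy:

```shell
set -e
src=$(mktemp -d)
dst=$(mktemp -d)

mkdir -p "$src/a"
echo hello > "$src/a/file.txt"
touch -t 200301011200 "$src/a/file.txt"    # back-date the mtime

# The copy itself: tar records mtimes in the archive and restores them on
# extraction; -p preserves the permissions as well.
( cd "$src" ; tar cf - . ) | ( cd "$dst" ; tar xpf - )

cmp -s "$src/a/file.txt" "$dst/a/file.txt" && echo "contents ok"
ls -l "$dst/a/file.txt"    # mtime shows Jan 2003, not today
```

A plain "cp -r" of the same tree would have reset the mtime to the time of the copy.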
Re: degraded mode
On Tue, Nov 18, 2003 at 12:52:04PM +1100, Barry Haycock wrote: > FAIL driver /dev/rdsk/c0t1d0s7 2003112 2 [can't dump no-hold > disk in degraded mode] > [...] > What is degraded mode? It's what happens when there's no tape in the drive, or no more room on the tape, or for whatever other reason, Amanda decides that the tape has become unusable. In that case, Amanda tries to dump as much as it can to the holding disk, and leaves it there for you to amflush to tape later. A no-hold DLE is one that has been configured to be dumped straight to tape, bypassing the holding disk. If all of this run's dumps must go to the holding disk, and this DLE must not go to the holding disk, you can see why Amanda has a problem with that... > This file system just happens to be where Amanda dumps are done to. An excellent reason for it to be marked no-hold. -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / It must be said that they would have sounded better if the singer wouldn't throw his fellow band members to the ground and toss the drum kit around during songs. - Patrick Lenneau
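For the archives, marking such a DLE no-hold is a one-line dumptype option. The dumptype names here are examples, but "holdingdisk no" is the real amanda.conf syntax:

```conf
define dumptype comp-root-nohold {
    comp-root          # parent dumptype (example name)
    holdingdisk no     # dump this DLE straight to tape, bypassing the
                       # holding disk (which lives on this filesystem)
}
```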
Re: Permission Denied error on client
On Mon, Nov 17, 2003 at 04:59:19PM -0500, John Grover wrote:
> Amanda Backup Client Hosts Check
>
> ERROR: host.domain.edu: [could not access /dev/vx/rdsk/var (/dev/vx/rdsk/var):
> Permission denied]
> ERROR: host.domain.edu: [could not access /dev/vx/rdsk/rootvol
> (/dev/vx/rdsk/rootvol): Permission denied]
>
> Is this a read permission error on the filesystem or an execute error
> on vxdump?

Looks like the former. Check the ownership and permissions on the special files mentioned. The user/group under which vxdump is running needs read permission. I don't know about vxdump, but other dumps I've used do NOT need write permission, and so I do my best to arrange that they don't have it, even if that means deviating from the defaults for the system in question. Least Privilege, and all that. E.g.

    brw-r-----  1 root  sys  32, 8 Jun 23  2000 /dev/dsk/c0t1d0s0

Amanda was configured with "--with-group=sys", and for good measure, the "--with-user=XXX" user (which is NOT root) is a member of group "sys" in /etc/group. For FreeBSD, replace "sys" with "operator". For Linux, it probably depends on the distro, or you might have to chgrp the special files to a group you've created, as it looks as though I did here. On at least some of our systems (can't remember which ones), the original mode was 660; I had to chmod it to 640. So far, nothing's blown up as a result...

--
Eric Siegerman, Toronto, Ont.  [EMAIL PROTECTED]
Re: Testing tapes before use / bad tape
On Mon, Nov 24, 2003 at 09:46:31AM +0100, Martin Oehler wrote: > Hmm, the only option that sounds like it could speed up the [amtapetype] process > is blocksize. Does anyone know a good value for this? The same value as amdump will be using! With some tape technologies, the tape's capacity depends very much on the block size. In such a case, using a different block size for the test would give misleading results. On Sun, Nov 23, 2003 at 10:28:38AM +0100, Martin Oehler wrote: > My second problem is how to handle the "short write"? > I have to send in the tape, but the are 3-4 GB of data on this tape. > Without this data, my backup is inconsistent. The only possibility > I see (at the moment) is doing a full backup of the partitions having > some data on this tape. That's one possibility. You can use "amadmin force", staging the full backups over a few runs if necessary to fit them in. Another possibility would be to wait a tapecycle (or at the very least a dumpcycle) for the backups to expire on their own. (Don't forget to erase the tape before sending it back, if it contains anything confidential.) -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / It must be said that they would have sounded better if the singer wouldn't throw his fellow band members to the ground and toss the drum kit around during songs. - Patrick Lenneau
Re: amtapetype idea (Was: Testing tapes before use / bad tape)
On Mon, Nov 24, 2003 at 07:14:56PM +0100, Paul Bijnens wrote: > My idea was to write only one large file in the first pass, just > until [amtapetype] hits end of tape. One problem with that is that the drive's internal buffering might distort the results, by letting amtapetype think it has successfully written blocks that in fact won't make it to tape. (That's a problem anyway, of course, but sticking in a filemark every once in a while puts a known upper bound on the error.) Perhaps amtapetype could have a "test-tape" flag, that would basically tell it to suppress the second pass. Or the second pass could become a verification pass (just re-seed the random-number generator to the value from the beginning of the write pass). Or provide both options. Of course that would make "amtapetype" a rather misleading name. "amtape" would be a great choice for a new name; too bad it's taken :-/ -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / It must be said that they would have sounded better if the singer wouldn't throw his fellow band members to the ground and toss the drum kit around during songs. - Patrick Lenneau
Re: Running amdump leads to high CPU load on Linux server
On Sun, Nov 23, 2003 at 07:46:32PM -0500, Kurt Raschke wrote:
> ...when amdump runs, the load spikes to between 4.00 and
> 6.00, and the system becomes nearly unresponsive for the duration of
> the backup. The server is backing up several local partitions, and
> also two partitions on remote servers.

Are you short of RAM? If the system's paging heavily, that'd make it crawl too.

> I've tried starting amdump
> with nice and setting it to a low priority, but when gtar and gzip are
> started by amanda, the priority setting is somehow lost.

Not surprising. Recall that Amanda runs client/server even when backing up the server's own DLEs. The client-side processes are descendants of [x]inetd, not of amdump, and so don't inherit the latter's "nice" level.

> The server
> isn't even trying to back up multiple partitions in parallel,

By this do you mean, "only one DLE at a time"; or "only one DLE *from the server* at a time, along with remote backups in parallel"? If the latter, well, of course there's some amount of server-side work even for the remote DLEs. Is the compression for the remote DLEs client- or server-side? If the latter, change "some amount" to "a lot" in the previous sentence :-)

--
Eric Siegerman, Toronto, Ont.  [EMAIL PROTECTED]
Re: Memory requirements for Amanda Server
On Mon, Nov 24, 2003 at 03:26:39PM -0500, Jon LaBadie wrote:
> Anyone out there running amanda on a 386 or 486 with 16M of ram?

Not any more, but I did in 1995 or thereabouts :-) 486-DX33 (or was it a DX2/66?) with 16 whole megabytes worth of 30-pin SIMMs. It turned out that (a) that wasn't enough RAM, and (b) FreeBSD 2.0.5's low-memory robustness left a fair amount to be desired, as I discovered a few times when I came in in the morning to find the backup server down, and the /var partition, where the holding disk was, thoroughly trashed.

(It also turns out that FreeBSD still supports the *3*86 -- last month's 4.9 release contains a bug fix for it. I imagine Linux still does too. Both are equally gratifying...)

I think the Amanda version was 2.2.6 -- just saw that mentioned near the beginning of the ChangeLog, and it rings a bell. Interesting limitations (some from memory, some from the ChangeLog; some probably incorrect):

- No changer support. One tape per run; that's it, that's all.
- No indexing, no amrecover. Amrestore would pull dumps off the tape, but
  from there you were strictly on your own.
- No gnutar; it was strictly dump.
- No "reserve". Effectively, it was hard-wired to 100%.
- No "runspercycle".
- No chunked dumps on holding disk.
- No promote_hills() in the planner. If you missed a tape (e.g. for a
  holiday), causing that night's full dumps to be postponed, they'd have a
  strong tendency to stay clumped together with the next night's full dumps
  for a long time, at least on a small network like the one I was
  responsible for.
- Blair Zajac's extensive patch set hadn't yet been merged into the
  canonical sources. If you wanted them, you had to download and apply them
  yourself.

I don't believe Jean-Louis was involved yet (from the ChangeLog, runspercycle and chunked dumps were among his early patches); I'm not even sure that (the long-departed) Alexandre Oliva and John R. Jackson were around back then.
Truth be told, I think Amanda was rather moribund at the time (hence the existence of BZ's patches in the first place); ISTR that it was Alexandre who woke the project up again. -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / It must be said that they would have sounded better if the singer wouldn't throw his fellow band members to the ground and toss the drum kit around during songs. - Patrick Lenneau
Re: Barcode readers: how important?
On Tue, Nov 25, 2003 at 05:28:02PM +0100, Alexander Jolk wrote: > I understand that tape changing is also a somewhat faster process if > barcode labels are used because it doesn't need to do a load-read-unload > cycle for every tape, but I'm not sure about that. If that's true (which I *can't* vouch for, having been stuck in chg-manual land till now :-), it would also mean that the barcode reader saves wear and tear on the tapes. ISTM that every load-read-unload to check the tape label should count as a "use" for purposes of deciding when to retire the tape -- 99.9% of the media wasn't touched, but if that first .1% goes bad, you've got problems. Does that make sense to people? If so, it seems like the barcode reader could pay for itself fairly quickly -- and in a far more measurable way than just convenience. -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / It must be said that they would have sounded better if the singer wouldn't throw his fellow band members to the ground and toss the drum kit around during songs. - Patrick Lenneau
Re: How to fix annoying break in tape sequence?
For the benefit of the archives (I know you've solved your recent problem):

On Mon, Dec 01, 2003 at 04:49:50PM +, Dave Ewart wrote:
> On Monday, 01.12.2003 at 16:32 +, Tom Brown wrote:
> > [...] alter the config/tapelist file so that the required OurName-C-Mon
> > is at the bottom (although this is less desirable)
>
> Are you sure? I have read that altering tapelist has no effect, since
> it tapelist is an end-result file, not a "read-at-start" config file ...

This is incorrect; editing tapelist *will* affect future runs.

That said, the suggestion still has a problem. Simply moving OurName-C-Mon to the bottom (making it swap places with the "skipped" tape, OurName-B-Fri) will work for that Monday's run, but you'll have to do it again on Tuesday night, and every night until OurName-B-Fri cycles around again.

I suppose you could move the OurName-B-Fri entry up to the top, to make it look as though the tape had been used in its proper sequence, but I'd be *very* reluctant to do that without carefully thinking through the ramifications. Reducing "tapecycle" for the duration is certainly cleaner, quite possibly safer, and in the end, probably easier.

N.B.: In a tapelist record, I can't recall offhand whether it's the date field, or the record's physical position within the file, that Amanda actually cares about. Perhaps both. So to be safe, "moving" a tapelist entry should probably consist of both:

- physically moving the line to the appropriate position
- editing the line's date so that it sorts properly into its new location

--
Eric Siegerman, Toronto, Ont.  [EMAIL PROTECTED]
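For reference, tapelist is a plain text file with one record per tape: a date stamp, the label, and a reuse flag, most recently written tape first. The labels below are the ones from this thread; the dates (and the middle entry) are invented for the example:

```conf
20031201 OurName-C-Mon reuse
20031130 OurName-A-Sun reuse
20031128 OurName-B-Fri reuse
```

The tape nearest the bottom is the next candidate for overwriting (by date and/or position, per the caveat above).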
Re: Bad tape or worse news?
On Wed, Dec 03, 2003 at 01:32:11AM -0800, Jack Twilley wrote:
> *** A TAPE ERROR OCCURRED: [writing label: short write].

How much was written to the tape before this message appeared? Anything at all? Look for a line like this in the NOTES section:

    taper: tape twilley008 kb 12033408 fm 8 writing file: short write

> This is showing up in my messages file:

Here's my interpretation:

> Dec 3 01:08:19 duchess kernel: (sa0:ahc0:0:6:0): WRITE FILEMARKS. CDB: 10 0 0 0 2 0

I believe this is the SCSI command that was being attempted when a SCSI error occurred.

> Dec 3 01:08:19 duchess kernel: (sa0:ahc0:0:6:0): CAM Status: SCSI Status Error
> Dec 3 01:08:19 duchess kernel: (sa0:ahc0:0:6:0): SCSI Status: Check Condition

There was a SCSI error.

> Dec 3 01:08:19 duchess kernel: (sa0:ahc0:0:6:0): DATA PROTECT asc:27,0
> Dec 3 01:08:19 duchess kernel: (sa0:ahc0:0:6:0): Write protected

This is the specific error code. ASC is "additional sense code"; there's also an ASCQ, "additional sense code qualifier". I'm guessing that 27,0 are the ASC and ASCQ resp. You can look these up at: http://www.t10.org/lists/asc-num.htm

The above is the kernel reporting the raw data from the hardware. The rest is the kernel's interpretation.

> Dec 3 01:08:19 duchess kernel: (sa0:ahc0:0:6:0): Unretryable error
> Dec 3 01:08:19 duchess kernel: (sa0:ahc0:0:6:0): failed to write terminating
> filemark(s)
> Dec 3 01:08:19 duchess kernel: (sa0:ahc0:0:6:0): tape is now frozen- use an
> OFFLINE, REWIND or MTEOM command to clear this state.

So what we have is that the tape drive is refusing to write a file mark, because it says that the tape is write-protected. The most likely possibility, of course, is the write-protect switch on the tape. But it could also be a problem with the drive -- maybe something as simple as a fluffball covering an optical sensor.
-- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / It must be said that they would have sounded better if the singer wouldn't throw his fellow band members to the ground and toss the drum kit around during songs. - Patrick Lenneau
Re: Update: Getting the Drive to Work...
Random thoughts: On Tue, Dec 09, 2003 at 10:09:28AM -0500, Josiah Ritchie wrote: > ># ./filltape > >dd: writing to `/dev/nst0': Input/output error > >285760+0 records in > >4464+0 records out > >dd: closing output file `/dev/nst0': Input/output error > >Command exited with non-zero status 1 > >0.28user 1.74system 18:35.34elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k > # bin/filltape > 6491712+0 records in > 101433+0 records out > 5.67user 33.95system 34:41.58elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k The second run got a lot higher data rate -- 48.7 output records per second, as opposed to 4.0 the first time. Changing the SCSI ID should *not* have done that if, as you say, the tape drive and the SCSI adapter are the only devices on the bus. (Even if there were an ID collision, I'm not sure it would have manifested as a 12-times slowdown -- but then I'm not sure it wouldn't have done, either.) > "Kernel Panic: for safety > In interrupt handler -- not syncing" > > Googling around seems to suggest that this is a problem with the card/the rest > of the hardware, That's what it smells like to me. > I've noticed that there is an aic7xxx_old driver. Maybe I should be using > that instead? Its an AHA2940/aic7870. At this point, I suspect you'll get much better answers on a mailing list for your O/S. > Is there any issue with both having > parity checking on If the card and drive both support parity, absolutely you should turn it on. Enabling it on only one of them won't work, but if both devices support it, having it enabled is a lot safer than not. > Can the SDT-9000 provide the term power to > the terminator on the end of the cable or is that just for the built-in > terminator? These are some thoughts that are floating in my head. I also noticed > that the syncing is initialized by the card in the card's bios is it possible > that this is the source of the issue? 
And these are subjects for a SCSI list, or see Gary Field's excellent SCSI FAQ: http://fieldhome.net:9080/scsi_faq/scsifaq.html There's also a SCSI FAQ at: http://scsifaq.paralan.com/ but I can't offer an opinion, having only just now discovered it. DON'T use the ones at faqs.org or rtfm.mit.edu; they're Gary Field's, but ancient versions, not updated since 1998. > >Card - Start of cable (auto terminate) Try setting the card's termination explicitly in its BIOS. I've read that sometimes auto-termination gets it wrong. > ># ./filltape > >dd: writing to `/dev/nst0': Input/output error > >[...] > >/dev/nst0: No such device or address OK, that second message suggests that the kernel somehow decided it no longer had a tape drive. Not sure about that, but it's not *completely* surprising under the circumstances. Definitely grounds for a reboot. > >Than I rebooted to make sure the thing reset... > > > ># bin/filltape > >dd: opening `/dev/nst0': Permission denied But this is just plain weird. A flaky controller should *not* have anything to do with the mode on a special file. Are you sure you ran this "filltape" as root (or as someone with write permission on /dev/nst0)? > >[...] > >/dev/nst0: No such file or directory And this is weird too. It looks as though the /dev/nst0 special file itself got deleted (as opposed to the underlying hardware appearing to go away, which is what the earlier "no such device or address" suggests.) > >The jumpers on the back look like this: > >Description setting > >--- --- --- --- --- --- --- > >Disable Compression off As an aside, for Amanda it's generally better to run with compression disabled (for details, see many threads in the archives). > >SCSI ID Settings above should be ID 6, but there are no other devices on the > >system (except maybe the card itself if that has an ID). Is it safe for me to > >set it to SCSI ID 0? I went ahead and did that and it booted up okay. The only requirement is that no two devices may have the same ID. 
The card *does* have an ID, btw -- probably 7; that's the traditional value. SCSI gives priority to the device with the higher ID, if both want the bus at once. That's why tapes are often at ID 6, so that they're less likely to be starved for data, which in turn means that they're more likely to keep streaming. See "How should I set the IDs of my devices?" in Gary Field's FAQ. But with only the two devices on the bus, priority isn't an issue, so 0 and 6 are equally good. -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / It must be said that they would have sounded better if the singer wouldn't throw his fellow band members to the ground and toss the drum kit around during songs. - Patrick Lenneau
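Incidentally, the "X+Y records" bookkeeping in the filltape output above can be explored without a tape drive at all; dd does the same full-record/partial-record counting against an ordinary file. A quick sketch using temp files only:

```shell
tmp=$(mktemp)

# Make a 100 KB file: at bs=32k that's three full 32 KB records plus one
# 4 KB partial record, so dd should count the reads as "3+1".
dd if=/dev/zero of="$tmp" bs=1k count=100 2>/dev/null

dd if="$tmp" of=/dev/null bs=32k 2>&1 | grep 'records in'
# prints: 3+1 records in
```

On a tape in variable-length mode, a nonzero partial count on read is the sort of thing worth investigating.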
Re: degraded mode
On Fri, Dec 12, 2003 at 10:11:51AM -0500, Joshua Baker-LePain wrote: > In your situation, I would setup the config that backs up the server > itself to not use the holding disk. On a local only config, it doesn't > really buy you much speed, so why bother? A holding disk still buys you parallelism. Sure, a local full backup can provide bytes faster than a typical tape drive can consume them. But on an incremental, dump or (especially) gtar can spend a lot of time searching for the next file to back up. During that time, bytes aren't being copied; in a direct-to-tape dump, your tape drive sits idle; with a holding disk, the drive could have been writing a different DLE. -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / It must be said that they would have sounded better if the singer wouldn't throw his fellow band members to the ground and toss the drum kit around during songs. - Patrick Lenneau
Re: gnutar version -- exactly 1.13.25 or just 1.13.25 and above?
On Tue, Dec 16, 2003 at 05:51:51PM -0500, Jon LaBadie wrote: > ** there still are no [gtar] point releases listed from 1.13.25 - 1.13.89 :)) > yet 1.13.90 was followed five weeks later with 1.13.91 and a week > after that 1.13.92. I've seen this before. It seems to be a new(ish) convention for labelling betas, the idea being that they're counting up towards a 1.14 release. I'm not sure what characteristics distinguish a point release (final component starting with a low digit) from a beta (yada yada high digit) -- or what they do if they need another beta after 1.13.99. I suppose both of those vary per-project. -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / Furthermore, if you do not like any of [the many standards from which to choose], you can just wait for next year's model. - Andrew Tanenbaum
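If you ever need to check where one of these betas falls relative to a release, GNU sort's version comparison (the -V option, GNU coreutils) orders them the way the counting-up convention intends:

```shell
printf '%s\n' 1.14 1.13.92 1.13.25 1.13.90 1.13.91 | sort -V
# 1.13.25
# 1.13.90
# 1.13.91
# 1.13.92
# 1.14
```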
Re: database01 /export lev 0 FAILED 20031222[could not connect to database01]
On Mon, Dec 22, 2003 at 01:22:27PM -0500, Gene Heskett wrote: > On Monday 22 December 2003 12:23, Dean Pullen wrote: > ># /bin/sh ../config/mkinstalldirs /usr/bin/man/man8 > > As root, move my script out of there, and do a make clean. If the configuration is botched, "make distclean" is better, since "make clean" does NOT clean up the files created by "configure". (Note that most of what Gene's script does is to run "configure" with the right options.) On Mon, Dec 22, 2003 at 06:00:31PM -, Dean Pullen wrote: > Ok I added --mandir=/usr/share/man to the configure params and it > successfully 'make install'. The --mandir option is provided to let you put the man pages in an unusual place. Be aware that using it to force them to go where they should have gone anyway, is patching around the underlying problem, not fixing it. -- | | /\ |-_|/ > Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED] | | / It must be said that they would have sounded better if the singer wouldn't throw his fellow band members to the ground and toss the drum kit around during songs. - Patrick Lenneau
Re: amanda server on sco 5.0.6
On Mon, Dec 22, 2003 at 04:52:27PM -0500, Kurt Yoder wrote:
> I get some warnings (I culled these from the other configure messages)
>
> configure: WARNING: *** You do not have gnuplot.  Amplot will not be
> installed.

This is a package dependency.  Amplot relies on gnuplot, so if
you want the former, you gotta install the latter.  If you don't
care about amplot, no problem.  (I have no opinion as to whether
or not you *should* care about amplot :-)

> configure: WARNING: `cc' requires `-belf' to build shared libraries

Ok, what this says about the compiler couldn't be clearer; but
as for what it means for configure, I haven't a clue :-/  Is it
telling you to add "-belf" to CFLAGS?  Why can't configure do so
itself?  Or did it add them already (in which case, why's it
bothering you with the warning?)?  If all else fails, you could
always --disable-shared.  But don't do that yet; we don't yet
know if it's needed.

> configure: WARNING: netinet/ip.h: present but cannot be compiled
> configure: WARNING: netinet/ip.h: check for missing prerequisite
> headers?
> configure: WARNING: netinet/ip.h: proceeding with the preprocessor's
> result

Ignore this.  It's not something users need to worry about --
it's more a message for the Amanda developers.  (If you're
curious, see http://groups.yahoo.com/group/amanda-users/message/45004).

> configure: WARNING: *** No readline library, no history and command
> line editing in amrecover!

As with the gnuplot message, this means that an optional
prerequisite couldn't be found, but that configure has worked
around it by suppressing some functionality.  Whether it's worth
fixing is up to you.

None of these configure warnings has anything (that I can see)
to do with tape changers.

> Can I just remove all references to changer-src from the configure
> script?  I don't need a tape changer.  Would I break amanda-server if
> I tried to compile without changer-src?

Not sure; try it and see...
Re: Dump Vs Tar tradeoffs (if any)
On Tue, Dec 23, 2003 at 11:42:14AM -0500, Henson, George Mr JMLFDC wrote:
> What are the advantages or disadvantages to using tar instead
> of dump?

(This is partially brief repetition, but also contains new points.)

In dump's favour:
- The estimate phase is faster
- It doesn't change any of the timestamps of files it's backing
  up (tar doesn't change mtime either of course, but can't avoid
  changing either atime or ctime; actually, I recently read that
  Solaris provides a way, if you're root, but I don't know
  whether GNU tar takes advantage of it)
- You can do interactive restores natively.  (amrecover gives
  you the same functionality, regardless of dump vs. tar, so
  this difference *only* applies if Amanda isn't in the loop at
  restore time, or if you don't have the index files, which
  amrecover requires.)
- Dump programs are customized to the local file system's
  idiosyncrasies.  I'm guessing (but don't know) that this means
  that dump can back up system-dependent metadata that tar has
  no clue about (ACLs, Linux ext2 "chattr" flags, FreeBSD's
  "chflags" variant thereof, and the like)

In tar's favour:
- You can exclude files
- You can split a partition into multiple DLEs.  This is
  necessary if you have partitions larger than will fit on a
  single tape, since Amanda can't split a single dump onto
  multiple tapes (not yet anyway; work is in progress, hooray!).
- Dump is reported to be undependable on Linux -- Linus says so,
  anyway.  (He has a thing against dump, so doesn't see that as
  a problem, but IMO it's because Linux has deviated from
  standard UNIX in undesirable ways.  Regardless of blame,
  though, it's an issue to be dealt with.)
- Backups are portable.  The downside of every dump being
  customized to its file system is that you very likely can't
  restore a dump from platform X using platform Y's "restore".
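To illustrate the "you can exclude files" point: here's a quick, self-contained demonstration of GNU tar's exclusion support (all paths below are made up for the demo; in Amanda you'd normally express this as an exclude list in the dumptype rather than on the command line):

```shell
# Build a throwaway tree, then archive it while excluding the cache files.
rm -rf /tmp/ex-demo /tmp/ex-demo.tar
mkdir -p /tmp/ex-demo/src /tmp/ex-demo/cache
echo 'int main(void){return 0;}' > /tmp/ex-demo/src/keep.c
echo 'scratch' > /tmp/ex-demo/cache/scratch.tmp

# The pattern is matched against member names, so cache contents are skipped.
tar -cf /tmp/ex-demo.tar -C /tmp --exclude='ex-demo/cache/*' ex-demo
tar -tf /tmp/ex-demo.tar
```

The listing shows keep.c but not scratch.tmp.  Dump has no equivalent: it works at the filesystem level and takes everything.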
I've never tried cross-file-system restores on the same box
(restoring from a Solaris VXFS dump onto a Solaris ufs
partition, for example), but I imagine that whether you can get
away with it depends on the specific combination and the
specific platform.
Re: Scheduled backup being "prevented"?
On Wed, Jan 07, 2004 at 10:22:27AM -0500, Ken D'Ambrosio wrote:
> Why the heck would we get "no estimate?"

Take a look at the debug files on that client, especially the
relevant sendsize.TIMESTAMP.debug.

> Why would it say "Preventing bump... as directed"?

Looks as though someone did "amadmin force-no-bump" on that DLE.

The two symptoms seem unrelated, but there might be some
connection I'm not aware of.
Re: extremely varied tape write rates?
On Fri, Jan 09, 2004 at 11:33:26AM -0500, Kurt Yoder wrote:
> >                                 DUMPER STATS                TAPER STATS
> > HOSTNAME     DISK       L   ORIG-KB   OUT-KB  COMP%  MMM:SS    KB/s  MMM:SS    KB/s
> > -----------------------------------------------------------------------------------
> > borneo.shc -corporate      25821120 12795031   49.6   67:34  3155.9   45:57  4641.0
> > borneo.shc -cs_shared      18595830 10108659   54.4   55:42  3024.9  145:29  1158.1
> > borneo.shc -_shared_2      12075330  8126030   67.3   40:23  3353.9   30:27  4447.4
> > britain.sh /shared01       25085110 12979245   51.7   37:29  5771.6  172:00  1257.7
> > sumatra.sh //java/c$        7488930  7488930     --   70:04  1781.5   22:30  5548.0

I can't see anything here that distinguishes those two DLEs from
the other three:
- All five seem to have used the holding disk (in each case, the
  dump time and tape time are different)
- As you remarked, some of Borneo's DLEs are fast, but one is
  slow, so it doesn't look client-specific
- They're neither the biggest nor the smallest DLEs

Are you sure there's nothing different about the slow DLEs'
dumptypes that could account for it?

Another thought: one thing that isn't shown in the report email
is the order in which the DLEs were taped.  Try digging through
the amdump or log file to find that out (and, indeed, the time
of day for each one).  I don't know what you'll find, but there
might be an interesting pattern.

What I'm wondering about is resource contention on the Amanda
server; e.g. a news expire that kicks in part-way through the
amdump run, and thrashes the drive that contains your holding
disk.  Or something that saturates the bus your tape drive's on.
Or any number of other possibilities.  For that theory to fit
your observations, the same DLEs would have to be taped at about
the same time every run, which doesn't seem very Amanda-like but
could perhaps happen under the right conditions.  It might be
interesting to look at the logs from more than one run, to see
what varies and what doesn't.
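One way to dig the taping order out of the log, sketched against a fabricated two-line excerpt (the real field layout varies a bit between Amanda versions, so adjust the awk fields to match yours):

```shell
# Fabricated sample of the server's log.YYYYMMDD.N file; only the
# taper lines matter here.
cat > /tmp/log.20040109.0 <<'EOF'
START taper datestamp 20040109 label DAILY05 tape 0
SUCCESS taper borneo.shc -corporate 20040109 1 [sec 2757 kb 12795031 kps 4641.0]
SUCCESS taper britain.sh /shared01 20040109 1 [sec 10320 kb 12979245 kps 1257.7]
EOF

# Taper lines appear in the order the DLEs hit tape; print host and disk.
grep '^SUCCESS taper' /tmp/log.20040109.0 | awk '{print $3, $4}'
```

Running that over the logs from several nights would show whether the slow DLEs land at the same point in the run every time.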
Re: Ramifications of dump cycle and number of tapes choices
On Mon, Jan 12, 2004 at 06:56:25PM -0600, Fran Fabrizio wrote:
> [Assuming dumpcycle=7, runspercycle=7, runtapes=1, tapecycle=33:]
> Are the following statements correct?
>
> 1. We will have a full dump of any given partition somewhere on the most
> recent 7 tapes

Yes.  As you've figured out, precisely where is up to Amanda.

> and that to restore a file to any given date in the past
> 7 days

27 days, actually (looking ahead to question #2).  The file
might not be in the most recent dump cycle, but whichever cycle
it's in is still only 7 tapes long...

> would take at most 7 tapes/steps but on average 3.5 tapes/steps.

In your case, I think the numbers are 4 and very roughly 2,
resp.  Amanda doesn't always "bump" a given DLE to the next dump
level; whether it does is determined by how much tape it would
save by doing so (see amanda.conf parameters "bumpsize",
"bumpmult", and "bumpdays").  With your parameters, and with the
default bumpdays value of 2, the maximum number of tapes for a
restore is 4:

    Day:   1 2 3 4 5 6 7 8
    Level: 0 1 1 2 2 3 3 0

(With bumpdays=1, you'd be right; the maximum would be 7.)

As for the average number of tapes...  The vast majority of DLEs
never get above level 1 or 2 in my experience; their dump
histories look more like "011011..." or "01101112...".  Thus, as
an estimate of the average number of tapes to be read for a
restore, (maximum_possible_dumplevel/2) is on the high side (in
your case, max/2 = 2 is pretty close, but that's purely by
accident).  (maximum_actual_dumplevel/2) is probably low; the
DLE spends a lot more days at the maximum dump level than it
does climbing up to it.  There are too many variables, I think,
to estimate the average number of tapes without looking at your
actual dump history, though the two formulas above might serve
as *very* rough bounds.
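The counting above can be checked mechanically.  A restore as of day i needs the most recent dump at each level below the current one, down to the level 0.  Here's that rule in plain awk, applied to the bumpdays=2 schedule above and to a typical DLE that never leaves level 1:

```shell
out=$(awk '
    # tapes(levels, i): level-0 dump plus the latest dump at each lower level
    function tapes(levels, i,   j, low, n) {
        n = 1; low = levels[i]
        for (j = i - 1; j >= 1 && low > 0; j--)
            if (levels[j] < low) { n++; low = levels[j] }
        return n
    }
    BEGIN {
        split("0 1 1 2 2 3 3", bump2, " ")   # bumpdays=2 schedule from the text
        split("0 1 1 1 1 1 1", flat1, " ")   # DLE that stays at level 1
        max = 0; sum = 0
        for (i = 1; i <= 7; i++) {
            t = tapes(bump2, i); if (t > max) max = t
            sum += tapes(flat1, i)
        }
        printf "max=%d avg=%.1f", max, sum / 7
    }')
echo "$out"     # -> max=4 avg=1.9
```

Which reproduces both the maximum of 4 tapes and the 1.9 average claimed below for level-1-only DLEs.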
My estimate of 2 for your average is very much
back-of-the-envelope guesswork:
- for all those DLEs that never get above dump level 1, the
  average is 1.9 ((1 + 2 + 2 + 2 + 2 + 2 + 2) / 7).
- for the few DLEs that stop at dump level 2, the average is 2.4
  ((1 + 2 + 2 + 3 + 3 + 3 + 3) / 7), but only approximately; the
  DLE might bump to level 2 later, making the average a bit lower
- for the *very* few DLEs that go to level 3 or above, the
  average will be higher still
- but the level-1 DLEs outnumber all the rest combined, so 2
  seems as good a guess as any

But then, all of this pseudomathematical blather only applies to
restores of many files (full-DLE disaster recovery, or a
user-requested restore of an entire directory).  For a
single-file restore, you should only need to read *one* tape.
Amrecover will figure out in advance which tape the desired file
lives on, so it won't need to search the rest.

> 2. We can retrieve a file as it existed on any date in the past 27 days,
> and possibly as it existed on days 28-33

This looks right to me.

> Hence, it seems you are guaranteed to be able to retrieve any
> file as it existed 27 or less days ago.

"Almost guaranteed" :-(  Be aware that in a panic situation, a
full backup can be postponed to a run after the one where it
should have been done ("delayed" is the word Amanda actually
uses).  If that happens, there will come a day or two, a month
or so hence, when you can't quite meet your 27-day guarantee.
Amanda tries hard to avoid delaying full backups, but it can
happen, due to things like tape errors, the operator failing to
mount the right tape (shouldn't be an issue for you), tape
filling up before it was expected to, and possibly other
circumstances that aren't coming to mind right now.
Re: planner schedule ?
On Wed, Jan 14, 2004 at 03:33:02PM -0500, Brian Cuttler wrote:
> I'm not sure its "planner", what component handles which order DLEs
> are handled in ?

Planner and driver both contribute, I believe:
- Planner offers a suggested ordering (e.g. it gives priority to
  a DLE that's overdue for a level 0)
- Driver determines dynamically which dump(s) it can start at
  any given time, based on your "dumporder" amanda.conf
  parameter, the ordering suggested by planner, and on available
  resources -- holding disk space, network bandwidth, number of
  dumps already running on that client, etc.

(N.B.: Once a dump has been started, it runs flat-out; Amanda
does no further throttling.)

> I've got a partition of 54Gig, of which about 36 Gig are in use.
> I've an amanda spool of 35 Gig. Fortunately for us compression
> seems to be working pretty well as we are able to use the work/spool
> area rather than going directly to tape.
>
> Got me wondering though, is amanda saving this DLE for last so that
> it can utilize all of the spool area by itself or where we just lucky ?

You're *not* just lucky.  I don't think Amanda makes a point of
specifically "saving this DLE for last" (unless your "dumporder"
tells it to); that's just how things work out in your case.  But
it does make a point of saving the DLE until there's
holding-disk budget for it -- and, once the DLE in question has
been started, Amanda holds off starting others if they'd
overbook the holding disk.  (It'd be driver making that sort of
decision, btw, not planner.)

> Since this DLE is running 'alone', are we really gaining any performance
> over running in degraded mode ?

I presume you mean "over running in direct-to-tape mode";
degraded mode is something different.  There probably is a gain.
When you dump to disk, taper is typically (these days, I'd
venture to guess almost certainly) able to provide data (from
the holding-disk file) as fast as the tape drive needs it; but
when you dump direct to tape (especially over the network), that
might not be possible.  In the latter case, the tape transport
has to stop and start as data becomes available -- some people
here call that "shoeshining the drive".  That reduces the
overall transfer rate, and depending on the tape technology,
possibly (probably?) the amount of data that will fit on the
tape.

Besides, now that you've added a bunch of spool space, the DLE
will no longer necessarily be running alone...
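For reference, the two knobs mentioned above live in amanda.conf.  This is a hypothetical fragment -- the directory, sizes, and dumporder string are examples only; see amanda(8) for the exact semantics:

```
# Hypothetical amanda.conf fragment (values are examples, not advice)
holdingdisk hd1 {
    directory "/amanda/spool"
    use 35 Gb          # budget driver may allocate before going direct-to-tape
}
# One letter per concurrent dumper; e.g. "s" = prefer smallest dumps first,
# "S" = largest first.  Check amanda(8) for your version's letter codes.
dumporder "sssS"
```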
Re: dump incremental level not greater than 1
On Thu, Feb 05, 2004 at 04:15:24PM +0100, [EMAIL PROTECTED] wrote:
> in our environment there is one linux system (SuSE 8.1 amanda-2.4.4-41)
> where amanda not increase the dump level to 2,3...

Presumably the bumpsize/bumpmult/bumpdays criteria are never
satisfied.  See the documentation of those three amanda.conf
parameters in amanda(8).
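For anyone hunting for them, the three parameters sit at the top level of amanda.conf.  The values below are hypothetical; the defaults and precise meanings are in amanda(8):

```
# Hypothetical amanda.conf fragment -- example values only
bumpsize 20 Mb    # minimum tape savings before a bump to the next level
bumpmult 2        # bumpsize is scaled by this for each further level
bumpdays 1        # days a DLE must sit at a level before bumping again
```

With a large bumpsize (or the incrementals simply staying small), the savings threshold is never met and the DLE stays at level 1 indefinitely, which matches the symptom above.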
Re: Concurrancy - dumping partitions
On Tue, Feb 03, 2004 at 10:40:45AM -0500, Brian Cuttler wrote:
> At this point we discovered "netusage" was set to a very low value
> (1200 though I'm uncertain of the units).

The units are bytes/second (*not* bits/second), adjusted of
course by whatever multiplier you specify.

> Note 1 on this, I was surprised that with server = client this was
> a factor.

Amanda has a more general way of dealing with this.  You can put
multiple "interface" sections in amanda.conf, with different
capacities; then specify for each DLE which interface its data
will travel over.  (The "interface" sections are intended to
correspond to the NICs in your server, but Amanda doesn't try to
enforce this.)  So you can just create an "interface" section
with a really high capacity, and use that for the server's DLEs.

As an aside, I recently set our Amanda server's netusage way
higher than its NIC's capacity (a factor of ten, to be specific;
that's how I discovered what the units are :-).  The Dump Time
reported in the nightly emails went through the roof, but the
interesting part is that the Run Time wasn't much affected --
maybe even a small decrease, though I don't recall for sure.
That surprised me at first, but it makes sense with a bit of
thought.
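A hypothetical sketch of what that looks like in practice -- the interface names and capacities here are made up, and the unit rules are in amanda(8):

```
# Hypothetical amanda.conf fragment: one section per real NIC, plus a
# generously-sized pseudo-interface for DLEs local to the server.
define interface eth0 {
    comment "the real NIC"
    use 10000 kbps
}
define interface local {
    comment "server backing itself up -- no real wire involved"
    use 10000000 kbps
}

# Then in the disklist, the interface goes in the optional last field:
#   hostname  diskname  dumptype  [spindle  [interface]]
# server  /home  comp-user-tar  -1  local
```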
Re: Concurrancy - dumping partitions
On Fri, Feb 06, 2004 at 04:29:36PM -0500, Eric Siegerman wrote:
> You can put
> multiple "interface" sections in amanda.conf, with different
> capacities; then specify for each DLE which interface its data
> will travel over.

I should have made clear that all this is purely descriptive,
not *pre*scriptive.  That is, whatever you do with "interface"
sections, DLEs, etc., will *not* affect which NIC the packets
actually travel through.  That's well beyond Amanda's control;
it's up to the kernel's TCP/IP stack as usual.  The only purpose
here is to tell Amanda what's going on, so that it can budget
the various NICs' bandwidth appropriately.
Re: Encryption
[CCing the author of the web page in question, in case she's no
longer subscribed]

On Tue, Jan 27, 2004 at 09:19:26AM +0100, Paul Bijnens wrote:
> Long ago, I bookmarked this page:
>
> http://security.uchicago.edu/tools/gpg-amanda/
>
> but I never tried it myself...

Long ago, it seems I had a small amount of input into it.  Funny
how that happens :-)  But I've never tried it either.

Looking at it now, I see that the basic approach is to have the
Amanda client do compression, with the "compression" program
(GZIP= environment variable to "configure") being a script that,
during backups, does essentially "gpg -e | gzip", and during
restores does the inverse.

The gzip step in this pipeline is a pointless waste of CPU time,
and may make the backup *larger*; by design, encrypted data is
supposed to resemble random data, and so it compresses very
poorly.  (If yours compresses well, I'd be *very* worried about
how secure your encryption is!)  Try removing the gzip step
entirely, and reducing it to just the gpg command.  (gpg
compresses the data internally, for the sake of better
encryption, so putting a gzip step *before* gpg is equally
pointless.)
Re: suggestions for a backup scheme?
On Fri, Jan 23, 2004 at 09:37:23AM -0500, Greg Troxel wrote:
> Then, have a cron job that copies in either amanda.conf.full or
> amanda.conf.incr to amanda.conf before the dump runs.

Our approach is to have multiple Amanda configurations with the
same disklist (using symlinks).  The crontab entry runs a script
that in turn runs amdump with the appropriate "configuration"
argument.

> With this (to answer Jon's query), you meet your imposed requirements,
> and you still get multi-machine backups, holding disk, concurrency,
> and the ability to find backups.  Making amanda do what you want is
> vastly easier than writing your own scripts from scratch.

Agreed.  There's a lot more to Amanda than just its scheduling
policy, and it does not seem to me prima facie ridiculous to
want to use it with a different policy, and to be frustrated at
how hard that can be.
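The symlink arrangement is simple enough to sketch concretely.  /tmp/etc/amanda stands in for the real config directory, and the disklist line is a made-up example:

```shell
# Two configurations, "full" and "incr", sharing one disklist.
rm -rf /tmp/etc/amanda
mkdir -p /tmp/etc/amanda/full /tmp/etc/amanda/incr
echo "client.example.com  /home  comp-user-tar" > /tmp/etc/amanda/full/disklist
ln -s ../full/disklist /tmp/etc/amanda/incr/disklist

# cron then just varies the configuration argument:
#   45 0 * * 0    /usr/local/sbin/amdump full
#   45 0 * * 1-6  /usr/local/sbin/amdump incr
```

Editing full/disklist updates both configurations at once, which is the whole point.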
Making two copies of a backup
[I'm CCing amanda-hackers because the answer to my question
might depend heavily on Amanda internals; but the discussion
doesn't belong there, so please reply to amanda-users.]

I want to make two identical copies of an Amanda backup.  This
is a one-off thing -- archival backups of a client that's about
to be wiped clean and repurposed.  If it were an ongoing need,
I'd ask for budget for a second tape drive and learn about rait.

I have enough holding-disk space to hold all of the client's
DLEs at once, so what I'm thinking is this:

1. Build a new Amanda configuration that backs up only the
   client in question (reserve=0; also record=no to be on the
   safe side)
2. Run the configuration with no tape in the drive, forcing all
   the (full) backups to holding disk
3. Hard-link the holding-disk files to another directory that
   Amanda doesn't know about
4. Run amflush
5. Hard-link the holding-disk files back to the Amanda spool
   directory (it's pure paranoia that I choose not to mv them
   instead and thus dispense with step 7)
6. Run amflush again
7. Delete the holding-disk files from the other directory

Does this look like a reasonable approach?  My main worry is
that the curinfo database and multiple amflushes of the same
data won't get along with each other.  Is that likely to be a
problem?
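The hard-link mechanics behind steps 3 and 5 can be demonstrated in isolation (the /tmp paths and file name below are stand-ins for the real holding disk): hard links cost no extra space, and removing one name leaves the other intact, which is exactly why amflush deleting its copy doesn't lose the data.

```shell
# Fake a holding-disk file, stash a second name for it, then simulate
# amflush removing the original after taping and step 5 restoring it.
rm -rf /tmp/holding /tmp/stash
mkdir -p /tmp/holding/20040214 /tmp/stash
echo "dump data" > /tmp/holding/20040214/client._usr.0

ln /tmp/holding/20040214/client._usr.0 /tmp/stash/client._usr.0   # step 3
rm /tmp/holding/20040214/client._usr.0        # what amflush does after taping
ln /tmp/stash/client._usr.0 /tmp/holding/20040214/client._usr.0   # step 5
```

After the second ln, both names again point at the same inode, ready for the second amflush.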
Re: amverify fails for large (?) samba share
On Wed, Feb 11, 2004 at 04:06:15PM +0100, Vincent GRENET wrote:
> [error from amverify]
> The amanda server runs RHS 7.3, amanda 2.4.2p2, gnu tar 1.13.19, samba
> 2.0.7.

Is that gtar old enough to be a problem, or is 1.13.19 known to
work?
Re: "Unable to create temporary directory" message
On Wed, Feb 11, 2004 at 05:16:59PM -0600, Fran Fabrizio wrote:
> ? Unable to create temporary directory in any of the directories listed
> below:
> [...]

It's ufsdump saying this.  The leading "?"s are a clue -- Amanda
prepends them to messages from the dump program that it doesn't
recognize (they ultimately appear in the report message,
labelled as "strange").  I confirmed it by finding the message
in the output of "strings /usr/sbin/ufsdump".

I don't know why it's happening.  Does ufsdump run as the Amanda
user or as root?  (Check the first couple of lines in one of the
sendbackup.* debug files.)  If the former, well, you've checked
that that should be ok.  If it's root -- about the only reasons
I can think of that root couldn't create a file are that the
filesystem's mounted readonly (totally bogus for /tmp of
course!) or NFS-mounted with permissions that don't allow
*world* write (again very weird for /tmp and probably broken,
but just barely conceivable).
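Concretely, the check looks something like this.  The debug-file contents below are fabricated (and the default location varies; /tmp/amanda is common), but the ruid/euid fields on the first line are what you're after:

```shell
# Fabricated sample of a client-side debug file.
cat > /tmp/sendbackup.20040211.debug <<'EOF'
sendbackup: debug 1 pid 12345 ruid 37 euid 37: start at Wed Feb 11 17:16:59 2004
sendbackup: got input request: DUMP /export 0 1970:1:1:0:0:0
EOF

# Pull out the real uid the process started with; look it up in /etc/passwd.
head -1 /tmp/sendbackup.20040211.debug | grep -o 'ruid [0-9]*'
```

If the uid printed is the Amanda user's rather than 0, ufsdump is not running as root, and the /tmp permissions as seen by that user are the place to look.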
Re: big filemark
On Thu, Feb 12, 2004 at 12:14:45PM -0500, Jon LaBadie wrote:
> Well I think we have a new champion filemark.
> filemark 534180 kbytes

That's only 1.9% away from 512 (binary) MB -- I suspect within
amtapetype's margin of error.  I wonder if that drive writes
fixed 512 MB physical blocks, doing the appropriate buffering
internally.  Not that it matters much, as Jon said; just idle
curiosity.
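The 1.9% figure checks out with a one-liner's worth of arithmetic (512 binary MB = 512 * 1024 KB):

```shell
half_gib_kb=$((512 * 1024))               # 524288 KB
diff_kb=$((534180 - half_gib_kb))         # 9892 KB over
pct=$(awk "BEGIN { printf \"%.1f\", 100 * $diff_kb / $half_gib_kb }")
echo "filemark is ${pct}% above 512 MB"   # -> 1.9%
```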