Re: Linux and dump

2001-05-20 Thread Alexandre Oliva

On May 17, 2001, "John R. Jackson" <[EMAIL PROTECTED]> wrote:

>>> Looks like gcc 2.8.1.

>> Gee.  That's dead broken.  ...

> Yeah, yeah.  But you're just a wee bit biased :-) :-).

FYI, I've just found this in GNU tar 1.13.19's README:


* Solaris issues.

GNU tar exercises many features that can cause problems with older GCC
versions.  In particular, GCC 2.8.1 (sparc, -O1 or -O2) is known to
miscompile GNU tar.  No compiler-related problems have been reported
when using GCC 2.95.2 or later.



And here's the patch for a crash I had mentioned I had found.  It
would crash while doing --listed-incremental full or incremental
backups whenever it didn't have permission to enter a directory:



--- src/incremen.c~	Sat Jan 13 03:59:29 2001
+++ src/incremen.c	Sun May 20 05:51:37 2001
@@ -183,7 +183,10 @@
 enum children children;
 
 if (! dirp)
-  savedir_error (path);
+  {
+	savedir_error (path);
+	return 0;
+  }
 errno = 0;
 
 name_buffer_size = strlen (path) + NAME_FIELD_SIZE;



-- 
Alexandre Oliva   Enjoy Guarana', see http://www.ic.unicamp.br/~oliva/
Red Hat GCC Developer                  aoliva@{cygnus.com, redhat.com}
CS PhD student at IC-Unicamp        oliva@{lsd.ic.unicamp.br, gnu.org}
Free Software Evangelist    *Please* write to mailing lists, not to me



Re: A TAPE ERROR OCCURRED: [[writing file: Bad file descriptor]]

2001-05-20 Thread Sven Kirmess

Alexandre Oliva wrote:

> > /-- localhost  /dev/md190 lev 1 FAILED ["data write: Broken pipe"]
> > What does that mean? Is my tape broken?
> It probably means you ran out of tape space during a direct-to-tape
> backup.

I don't think that was the problem because I have a DDS-3 tape and amanda
used only:

Output Size (meg)           1.1        0.0        1.1

I've never seen this under NOTES before:

NOTES:
  taper: tape daily26 kb 0 fm 0 writing filemark: Input/output error


Sven

--
I attach the whole report:

These dumps were to tape daily26.
*** A TAPE ERROR OCCURRED: [[writing file: Bad file descriptor]].
Some dumps may have been left in the holding disk.
Run amflush to flush them to tape.
The next tape Amanda expects to use is: daily27.

FAILURE AND STRANGE DUMP SUMMARY:
  localhost  /dev/md191 lev 1 FAILED [out of tape]
  localhost  /dev/md191 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md230 lev 1 FAILED [out of tape]
  localhost  /dev/md230 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md192 lev 1 FAILED [out of tape]
  localhost  /dev/md192 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md101 lev 1 FAILED [out of tape]
  localhost  /dev/md101 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md220 lev 1 FAILED [out of tape]
  localhost  /dev/md220 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md100 lev 1 FAILED [out of tape]
  localhost  /dev/md100 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md190 lev 1 FAILED [out of tape]
  localhost  /dev/md190 lev 1 FAILED ["data write: Broken pipe"]
  localhost  /dev/md190 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md199 lev 1 FAILED [out of tape]
  localhost  /dev/md199 lev 1 FAILED ["data write: Broken pipe"]
  localhost  /dev/md199 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md151 lev 1 FAILED [out of tape]
  localhost  /dev/md151 lev 1 FAILED ["data write: Broken pipe"]
  localhost  /dev/md151 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md251 lev 1 FAILED [out of tape]
  localhost  /dev/md251 lev 1 FAILED ["data write: Broken pipe"]
  localhost  /dev/md251 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md200 lev 1 FAILED [out of tape]
  localhost  /dev/md200 lev 1 FAILED ["data write: Broken pipe"]
  localhost  /dev/md200 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md150 lev 0 FAILED [out of tape]
  localhost  /dev/md150 lev 0 FAILED ["data write: Broken pipe"]
  localhost  /dev/md150 lev 0 FAILED [dump to tape failed]
  localhost  /dev/md159 lev 1 FAILED [out of tape]
  localhost  /dev/md159 lev 1 FAILED ["data write: Broken pipe"]
  localhost  /dev/md159 lev 1 FAILED [dump to tape failed]
  localhost  /dev/md170 lev 0 FAILED [out of tape]
  localhost  /dev/md170 lev 0 FAILED ["data write: Broken pipe"]
  localhost  /dev/md170 lev 0 FAILED [dump to tape failed]
  localhost  /dev/md210 lev 0 FAILED [out of tape]
  localhost  /dev/md210 lev 0 FAILED ["data write: Broken pipe"]
  localhost  /dev/md210 lev 0 FAILED [dump to tape failed]


STATISTICS:
                          Total       Full      Daily

Estimate Time (hrs:min)    2:08
Run Time (hrs:min)         2:50
Dump Time (hrs:min)        0:00       0:00       0:00
Output Size (meg)           1.1        0.0        1.1
Original Size (meg)         1.1        0.0        1.1
Avg Compressed Size (%)     --         --         --    (level:#disks ...)

Filesystems Dumped            6          0          6   (1:6)
Avg Dump Rate (k/s)       200.3        --       200.3

Tape Time (hrs:min)        0:00       0:00       0:00
Tape Size (meg)             0.0        0.0        0.0
Tape Used (%)               0.0        0.0        0.0
Filesystems Taped             0          0          0
Avg Tp Write Rate (k/s)     --         --         --

FAILED AND STRANGE DUMP DETAILS:

/-- localhost  /dev/md190 lev 1 FAILED ["data write: Broken pipe"]
sendbackup: start [localhost:/dev/md190 level 1]
sendbackup: info BACKUP=/opt/tar-1.13.19/bin/tar
sendbackup: info RECOVER_CMD=/opt/tar-1.13.19/bin/tar -f... -
sendbackup: info end
\

/-- localhost  /dev/md199 lev 1 FAILED ["data write: Broken pipe"]
sendbackup: start [localhost:/dev/md199 level 1]
sendbackup: info BACKUP=/opt/tar-1.13.19/bin/tar
sendbackup: info RECOVER_CMD=/opt/tar-1.13.19/bin/tar -f... -
sendbackup: info end
\

/-- localhost  /dev/md151 lev 1 FAILED ["data write: Broken pipe"]
sendbackup: start [localhost:/dev/md151 level 1]
sendbackup: info BACKUP=/opt/tar-1.13.19/bin/tar
sendbackup: info RECOVER_CMD=/opt/tar-1.13.19/bin/tar -f... -
sendbackup: info end
\

/-- localhost  /dev/md251 lev 1 FAILED ["data write: Broken pipe"]
sendbackup: start [localhost:/dev/md251 level 1]
sendbackup: info BACKUP=/opt/tar-1.13.19/bin/tar
sendbackup: info RECOVER_CMD=/opt/tar-1.13.19/bin/tar -f... -
sendbackup: info end
\

/-- localhost  /dev/md200 lev 1 FAILED ["data write: Broken pipe"]
sendba

Re: A TAPE ERROR OCCURRED: [[writing file: Bad file descriptor]]

2001-05-20 Thread Sven Kirmess

> NOTES:
>   taper: tape daily26 kb 0 fm 0 writing filemark: Input/output error

Sorry for replying to my own message. I found something in syslog:

kernel: st0: Error with sense data: Info fld=0x1, Current st09:00: sense key Medium Error
kernel: Additional sense indicates Write error


I think that means the tape is broken...?


Sven




Re: A TAPE ERROR OCCURRED: [[writing file: Bad file descriptor]]

2001-05-20 Thread Alexandre Oliva

On May 20, 2001, Sven Kirmess <[EMAIL PROTECTED]> wrote:

>> NOTES:
>> taper: tape daily26 kb 0 fm 0 writing filemark: Input/output error

> Sorry for replying to my own message. I found something in syslog:

> kernel: st0: Error with sense data: Info fld=0x1, Current st09:00: sense key Medium Error
> kernel: Additional sense indicates Write error


> I think that means the tape is broken...?

Quite possibly.  Try running tapetype on it and see how far it goes.




Re: [Amanda-users] Re: Linux and dump

2001-05-20 Thread Ray Shaw


On Sun, May 20, 2001 at 04:45:52PM +1000, Jason Thomas wrote:
> On Fri, May 18, 2001 at 12:18:30AM -0400, Ray Shaw wrote:
> > 
> > On Thu, May 17, 2001 at 09:53:59PM -0300, Alexandre Oliva wrote:
> > 
> > > I can also tell from personal experience that I haven't had trouble
> > > with GNU tar 1.13.17 on GNU/Linux/x86, but I still haven't been able
> > > to do backups reliably with 1.13.19 (it will generally abort part-way
> > > through the back-up; I suspect a network problem, but a bug in GNU tar
> > > still isn't ruled out)
> > 
> > My systems (Debian potato):
> > 
> > media:~$ tar --version
> > tar (GNU tar) 1.13.17
> > 
> > media:~$ gcc --version
> > 2.95.2
> > 
> > No problems here, except that the kernel isn't best friends with my
> > ATA tape drive sometimes.  The joys of an academic budget! :)
> 
> When you say no problems, have you tested using amverify?

No, but I've tried restoring random files from tapes.  And I've also
had to do some actual restores when another server went down (couldn't
simply move the disk over thanks to cursed proprietary Compaq SCSI
disk interface...)

> I'm running Debian potato, and tar does not seem to do the right
> thing: the archives are corrupt.  I'm about to start testing with a
> newer version of tar, most probably taken from sid.

Well, at least now your bzip2 option will be l instead of I :)


-- 
--Ray

-
Sotto la panca la capra crepa
sopra la panca la capra campa



Re: disk offline

2001-05-20 Thread Alexandre Oliva

On May 19, 2001, Olivier Nicole <[EMAIL PROTECTED]> wrote:

> I have upgraded to tar 1.13.19 but when I try to activate amanda on a
> client machine, I get disk off line error:

> FAILURE AND STRANGE DUMP SUMMARY:
>   oak/home/fidji lev 0 FAILED [disk /home/fidji offline on oak?]
>   oak/home/hawai lev 0 FAILED [disk /home/hawai offline on oak?]

Is this on Solaris?

It seems that the environment clean-up that sendsize does before
running runtar is getting Solaris/x86's start-up code somewhat
confused.  At least, I'm getting crashes very early in the program
execution, before ld.so gets to open librt.so (the first shared
library tar depends on).  This appears to be a bug that triggers in a
very particular condition: when the environment is empty.  It doesn't
trigger with 1.12 because it just doesn't depend on librt.so.

Anyway, I've managed to work around the bug by wrapping the tar
1.13.19 executable with a script that exports an environment variable
set to an empty string:

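#! /bin/sh
# Export one (empty) variable so the environment is never empty.
X=; export X
# Run the real binary, installed next to this script as <name>.exe.
exec "$0.exe" ${1+"$@"}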
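
To see what an empty environment looks like to a child process, and what
the wrapper changes, here is a small sketch.  It assumes an env utility
that supports -i (common, but not guaranteed on every older system);
count_env is a made-up helper name, not part of Amanda or tar.

```shell
#!/bin/sh
# count_env COMMAND...
# Runs COMMAND and prints how many lines it wrote; used here to count
# the environment entries that a child process sees.
count_env() {
    "$@" | wc -l | tr -d ' '
}

# env -i starts the inner env with a completely empty environment.
empty=$(count_env env -i env)

# Same, but with one variable set to the empty string, as the wrapper does.
one=$(count_env env -i X= env)

echo "empty environment: $empty entries"
echo "with X exported:   $one entries"
```

The second case is what tar sees when it is started through the wrapper:
one entry, even though its value is empty.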




planner failure under Linux

2001-05-20 Thread C. Chan

I recently "upgraded" an Amanda 2.4.2p2 server from Linux 2.2.19 kernel
to the 2.4.3 kernel with XFS patches. Amanda was compiled from source,
not a prepackaged RPM binary.

When backing up a config with a single client and a large disklist
(about 125 entries), everything is fine.

However, in another config with 38 hosts and 332 entries, the run
fails with "RESULTS MISSING" for all the hosts in the report and
"got empty schedule from planner" from the driver. When I trimmed the
disklist to 5 hosts and 38 entries, re-doing the run completed without
incident.

I recompiled Amanda and it didn't make any difference. Since these configs
worked fine under 2.2.19, I assume that something has changed in Linux.

The disk estimates are all returned properly, it is failing at the
planner stage.

From the amdump log for the failed run:

...

got result for host host1 disk /opt: 0 -> 3360K, 1 -> 100K, -1 -> -1K
got result for host host1 disk /usr: 0 -> 900K, 1 -> 450K, -1 -> -1K
syncpipe_get: w: unexpected EOF
taper: pid 2103 finish time Sat May 19 19:15:03 2001
/usr/local/amanda/sbin/amdump: line 103:  2095 Hangup   $libexecdir/planner$SUF $conf
      2096   | $libexecdir/driver$SUF $conf
amdump: end at Sat May 19 19:15:57 CDT 2001
...

And from the log file:

...
START taper datestamp 20010519 label XYZ_DailySetA20 tape 0
WARNING driver WARNING: got empty schedule from planner
STATS driver startup time 8927.065
INFO taper tape XYZ_DailySetA20 kb 0 fm 0 [OK]
FINISH driver date 20010519 time 8927.216
...

Any ideas where to look in planner.c or what Linux kernel
parameters may be relevant?


--
C. Chan < [EMAIL PROTECTED] > 
Finger [EMAIL PROTECTED] for PGP public key.




Re: disk offline

2001-05-20 Thread Alexandre Oliva

On May 20, 2001, Alexandre Oliva <[EMAIL PROTECTED]> wrote:

> This appears to be a bug that triggers in a very particular
> condition: when the environment is empty.  It doesn't trigger with
> 1.12 because it just doesn't depend on librt.so.

Nope.  It didn't trigger on 1.12 because it wasn't linked with GNU ld.
The problem is that GNU ld sets the dynamic linker as
/usr/lib/libc.so.1, whereas Sun ld sets it to /usr/lib/ld.so.1.  Even
though Sun officially supports both, according to a friend of mine at
Sun, the one chosen by GNU ld fails because it assumes the environment
contains at least one non-NULL entry.  My friend is working on a fix
for Solaris/x86 libc; I'm not sure we (GNU developers) are going to
fix GCC or GNU ld, or both, or neither, since the work-around is
simple: just link the program using
`-Wl,--dynamic-linker,/usr/lib/ld.so.1', or use Sun ld, or make sure
the environment is never empty, or wait for a patch from Sun.




Re: disk offline

2001-05-20 Thread Bernhard R. Erdmann

> #! /bin/sh
> X=; export X
> exec $0.exe ${1+"$@"}

What is ${1+"$@"} good for? Never seen that construction before...



Re: disk offline

2001-05-20 Thread Alexandre Oliva

On May 20, 2001, "Bernhard R. Erdmann" <[EMAIL PROTECTED]> wrote:

>> #! /bin/sh
>> X=; export X
>> exec $0.exe ${1+"$@"}

> What is ${1+"$@"} good for? Never seen that construction before...

Some shells expand "$@" to "" (a single empty argument) when $# = 0.
${1+"$@"} expands to "$@" only if $1 is set, and to nothing otherwise.
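
A minimal sketch of the idiom, assuming a POSIX-like sh.  The function
names count_args and forward_guarded are made up for illustration.  On
current shells a plain "$@" gives the same counts; the guard only matters
on the older shells described above.

```shell
#!/bin/sh
# count_args prints how many arguments it received.
count_args() {
    echo "$#"
}

# forward_guarded passes its arguments on using the ${1+"$@"} idiom:
# if $1 is set (even to an empty string), expand to "$@";
# otherwise expand to nothing at all.
forward_guarded() {
    count_args ${1+"$@"}
}

forward_guarded             # no arguments are forwarded
forward_guarded "a b" c     # two arguments; the space in "a b" is preserved
forward_guarded ""          # $1 is set but empty, so one argument is forwarded
```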




RE: amverify 'not at start of tape' errors

2001-05-20 Thread Carey Jung

>
> > we seem to get a lot of 'not at start of tape' errors
>
> It's not an error, just a warning that the tape section numbers that
> follow do not reflect the actual tape section numbers on tape, because
> you hadn't started amrestore at the beginning of the tape.  It has
> absolutely nothing to do with what is actually on the tape.  It has to
> do with whether amrestore found a tape label in the beginning of the
> first section it read or not, and this tape label is only written by
> Amanda in the beginning of a tape.
>

I'm still confused.  These errors are showing up in the middle of amverify
reports and consistently on the same partition, which is an smbclient
partition in the middle of the tape.  The label at the head of the tape is
fine.  (The first several filesystems check out fine.)  How can we correct
this, even assuming it's just a warning?  Here's fuller output:

amverify DailySet1
Sun May 20 06:53:14 CDT 2001

Loading current slot...
Using device /dev/tape0
Volume DailySet147, Date 20010519
Checked server._.20010519.1
...
...
...
Checked server.__LEAH5_C$.20010519.1
** Error detected (server.__LEAH5_D$.20010519.1)
amrestore: WARNING: not at start of tape, file numbers will be offset
amrestore:   0: restoring server.__LEAH5_D$.20010519.1

gzip: stdin: decompression OK, trailing garbage ignored
/bin/gtar: Skipping to next header
/bin/gtar: Error exit delayed from previous errors
64+0 records in
64+0 records out
Checked server.__ATHENA_G$.20010519.0
...
...
...
End-of-Tape detected.




Re: A TAPE ERROR OCCURRED: [[writing file: Bad file descriptor]]

2001-05-20 Thread Sven Kirmess

Alexandre Oliva wrote:

> > I think that means the tape is broken...?
> Quite possibly.  Try running tapetype on it and see how far it goes.

tapetype did not complain but it found 44235 mbytes on a DDS3 (without hw
compression). And I got a

tapetype: could not rewind /dev/nst0: Input/output error


Sven




Re: A TAPE ERROR OCCURRED: [[writing file: Bad file descriptor]]

2001-05-20 Thread Bernhard R. Erdmann

> tapetype did not complain but it found 44235 mbytes on a DDS3 (without hw
> compression). And I got a

Tell me the trick you did with your drive to store 44 GB on a DDS-3
tape! ;-)

You should get something around 11,900 MB without and 9,500 MB with
hardware compression (yes, random data, e.g. gzipped data, is blown up
by h/w compression).

I assume you got faulty hardware.
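
A quick sanity check of those numbers, as a sketch: it compares the
44235 MB that tapetype reported with the figure quoted above for a DDS-3
tape without hardware compression (taken here as 11900 MB).  The 110%
threshold and the helper name percent_of_expected are arbitrary choices
for illustration.

```shell
#!/bin/sh
# percent_of_expected MEASURED_MB EXPECTED_MB
# Prints the measured capacity as an integer percentage of the expected one.
percent_of_expected() {
    echo $(( $1 * 100 / $2 ))
}

measured=44235   # MB reported by tapetype in this thread
expected=11900   # MB quoted above for DDS-3 without hardware compression

pct=$(percent_of_expected "$measured" "$expected")

# Anything far above 100% cannot be real tape capacity.
if [ "$pct" -gt 110 ]; then
    echo "suspicious: ${pct}% of expected capacity"
else
    echo "plausible: ${pct}% of expected capacity"
fi
```

A result several times larger than the tape can hold suggests that the
writes were not really reaching the medium.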



Re: [Amanda-users] Re: Linux and dump

2001-05-20 Thread Jason Thomas

On Sun, May 20, 2001 at 08:28:32AM -0400, Ray Shaw wrote:
> Well, at least now your bzip2 option will be l instead of I :)
j actually

-- 
Jason Thomas   Phone:  +61 2 6257 7111
System Administrator  -  UID 0 Fax:+61 2 6257 7311
tSA Consulting Group Pty. Ltd. Mobile: 0418 29 66 81
1 Hall Street Lyneham ACT 2602 http://www.topic.com.au/



Re: disk offline

2001-05-20 Thread Olivier Nicole

>It's printed in sendsize.debug, IIRC.  And in runtar.debug.
>
>It will reference a gnutar-list-dir .new file.  It's created initially
>empty for level 0 estimates, and a copy of the lower level for
>incremental estimates.
>
>FWIW, I've had similar results with GNU tar 1.13.19 on Solaris 7/x86.
>I've started investigating, but didn't get very far, and ended up
>downgrading back to 1.12+patches.

Hi,

I solved the problem: it was an access permission issue that was
clearly indicated in /tmp/amanda/sendsize.*.debug

I would suggest that this debug file be mentioned in the FAQ, under
the "disk offline" question. It seems that it has been mentioned several
times on the mailing list.
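
As a rough sketch of how one might scan those debug files: the search
string "permission denied" is only an assumption about what the message
looks like (the exact text is not shown in this thread), and
find_permission_errors is a made-up helper name.

```shell
#!/bin/sh
# find_permission_errors [DIR]
# Lists the sendsize debug files under DIR (default /tmp/amanda) that
# mention "permission denied" in any letter case.
# The search string is an assumption, not taken from a real debug file.
find_permission_errors() {
    dir=${1:-/tmp/amanda}
    grep -il 'permission denied' "$dir"/sendsize.*.debug 2>/dev/null
}

find_permission_errors || echo "no permission errors found in sendsize debug files"
```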

Here are a few other ideas I had during the weekend.

- My problem was due to the fact that I ran ./configure --with-user=14,
  giving a user ID instead of a user name. Maybe it could be clearly
  mentioned that it should be the name and not the ID.

- It appears that the size estimate is done with
  euid=ruid=amanda while the actual backup (runtar) is done with
  ruid=amanda and euid=root. I think it would give a more accurate
  estimate if it was run with euid=root.

- Apparently amcheck does not perform a very deep verification: cases
  can arise where amcheck reports OK while the dump fails. The
  above case was one of them.

- Running amanda 2.4.2p2 with gnutar, there is no more problem with
  disk access permission: I configured and installed amanda as user/group
  amanda, did not modify anything on the disk, did not put amanda
  in the group operator, wheel or whatever, and it runs smoothly.

It is a great tool, thanks,

Olivier



Re: A TAPE ERROR OCCURRED: [[writing file: Bad file descriptor]]

2001-05-20 Thread Olivier Nicole

>tapetype: could not rewind /dev/nst0: Input/output error

Did you run tapetype as root?

Olivier



Re: Problems with amcheck and hostname resolution.

2001-05-20 Thread Peter Losher

On Wed, 16 May 2001, John R. Jackson wrote:

> >I recently decided to add another Amanda client to my existing Amanda
> >setup (server running v2.4.1p1) where I back up three clients using Krb4.
>
> Does the new client also use Krb4?

Yes, the clients are using Amanda 2.4.2 with the Krb4 patches.

> If none of that helps, I'd start stepping through the K4 code and see what
> it didn't like and (hopefully) why.  But K4 is only minimally supported in
> Amanda, and I certainly don't know much about it other than the basics.

I would agree, the patches are several releases behind, and in a perfect
world (well, my perfect world), Amanda would fully support Krb5 and
be done with it. :)  Alas, that is not the case.

Before I lose brain cells debugging this: is there a better way to handle
client authentication inside Amanda?  Perhaps a way to exchange RSA/DSA
keys, or some other challenge authentication method?  I don't care about
SSH tunneling at this point, I just want to have a way to authenticate the
amanda user on the client box w/o resorting to RHosts (which is a line I
won't cross, or my security officer and my conscience would have my head). :)

Thanks for any input you can provide.

-Peter
-- 
[EMAIL PROTECTED] - [ Systems Admin. | Nominum, Inc. ]