Broken Pipe errors

2002-02-25 Thread Matt Galer

I recently added a new machine to my backups, and a couple of its file
systems are failing with "index tee cannot write [Broken pipe]" errors.

Client is Solaris 7
Server is Solaris 7
amanda version 2.4.2p2
gnutar 1.13.19
Firewall between the two servers, but no security restrictions (all other
backups work fine).  Server is NAT'ed at the firewall.

I've run some tests just backing one of those filesystems up, and no matter
how I fiddle with the dumptype (thinking compression in particular might be
part of the problem) I cannot get this error to go away.  In test mode I'm
trying to dump only to the holding disk, which has 11.5 GB free.  The file
system in question is only ~1.5 GB.  Larger filesystems on the same host can be
backed up with the same dumptype parameters with no problems.

When I was watching the backup being performed with "amstatus" it showed
data being transferred, and the correct file was being created and updated
on the server.  I think when it stopped updating, the size sent as reported by
amstatus was a few KB (32?) over the estimated backup size (like a fool, I
didn't write the data down).  I realize this may or may not mean a thing, but
I'm just throwing it out there.
 
Here are some (hopefully!) appropriate log files (they don't tell me too
much):

=
log.20020225.0 from server
=
START planner date 20020225
START driver date 20020225
ERROR taper no-tape [no tape online]
FINISH planner date 20020225
STATS driver startup time 2.472
FAIL dumper rnbuyer /export/home 0 [mesg read: Connection timed out]
  sendbackup: start [rnbuyer:/export/home level 0]
  sendbackup: info BACKUP=/usr/local/bin/tar
  sendbackup: info RECOVER_CMD=/usr/local/bin/tar -f... -
  sendbackup: info end
FINISH driver date 20020225 time 7744.120

=
amdump.1 from server
=
amdump: start at Mon Feb 25 10:38:00 EST 2002
planner: pid 5123 executable /usr/local/libexec/planner version 2.4.2p2
planner: build: VERSION="Amanda-2.4.2p2"
planner:BUILT_DATE="Mon Jul 30 12:22:45 EDT 2001"
planner:BUILT_MACH="SunOS utl-atl-05 5.7 Generic_106541-15 sun4u
sparc SUNW,Ultra-4"
planner:CC="gcc"
planner: paths: bindir="/usr/local/bin" sbindir="/usr/local/sbin"
planner:libexecdir="/usr/local/libexec" mandir="/usr/local/man"
planner:AMANDA_TMPDIR="/tmp/amanda" AMANDA_DBGDIR="/tmp/amanda"
planner:CONFIG_DIR="/usr/local/etc/amanda" DEV_PREFIX="/dev/dsk/"
planner:RDEV_PREFIX="/dev/rdsk/" DUMP="/usr/sbin/ufsdump"
planner:RESTORE="/usr/sbin/ufsrestore"
planner:SAMBA_CLIENT="/usr/local/samba/bin/smbclient"
planner:GNUTAR="/usr/local/bin/tar"
planner:COMPRESS_PATH="/usr/local/bin/gzip"
planner:UNCOMPRESS_PATH="/usr/local/bin/gzip"
planner:MAILER="/usr/bin/mailx"
planner:listed_incr_dir="/usr/local/var/amanda/gnutar-lists"
planner: defs:  DEFAULT_SERVER="utl-atl-06" DEFAULT_CONFIG="DailySet1"
planner:DEFAULT_TAPE_SERVER="utl-atl-06"
planner:DEFAULT_TAPE_DEVICE="/dev/null" HAVE_MMAP HAVE_SYSVSHM
planner:LOCKING=POSIX_FCNTL SETPGRP_VOID DEBUG_CODE
planner:AMANDA_DEBUG_DAYS=4 BSD_SECURITY USE_AMANDAHOSTS
planner:CLIENT_LOGIN="siteops" FORCE_USERID HAVE_GZIP
planner:COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
planner:COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
planner: dgram_bind: socket bound to 0.0.0.0.603
READING CONF FILES...
startup took 0.010 secs

SETTING UP FOR ESTIMATES...
setting up estimates for rnbuyer:/export/home
rnbuyer:/export/home overdue 11744 days for level 0
setup_estimate: rnbuyer:/export/home: command 4, options:
last_level -1 next_level0 -11744 level_days 0
getting estimates 0 (0) -1 (-1) -1 (-1)
setting up estimates took 0.000 secs

GETTING ESTIMATES...
driver: pid 5122 executable /usr/local/libexec/driver version 2.4.2p2
driver: send-cmd time 0.005 to taper: START-TAPER 20020225
taper: pid 5124 executable taper version 2.4.2p2
driver: started dumper0 pid 5126
dumper: dgram_bind: socket bound to 0.0.0.0.606
dumper: pid 5126 executable dumper version 2.4.2p2, using port 606
dumper: dgram_bind: socket bound to 0.0.0.0.607
dumper: pid 5127 executable dumper version 2.4.2p2, using port 607
driver: started dumper1 pid 5127
driver: started dumper2 pid 5128
driver: started dumper3 pid 5129
driver: started dumper4 pid 5130
driver: started dumper5 pid 5131
dumper: dgram_bind: socket bound to 0.0.0.0.608
dumper: pid 5128 executable dumper version 2.4.2p2, using port 608
dumper: dgram_bind: socket bound to 0.0.0.0.610
dumper: pid 5130 exec

Re: Broken pipe errors

2001-07-02 Thread Olivier Nicole

>A DDS-3 tape should hold 12GB, compressed, right?  It looks like "taper"
>dies after 10GB.  Does the TAPE-ERROR/short write mean the tape is bad?  If
>the tape was more full, I wouldn't be suspicious; but it seems like I
>should be getting more space out of 125m.

Amanda considers only the uncompressed size.

Compressed size is purely a marketing figure that has NO basis in reality.

Do NOT calculate with compressed size.

Olivier



Re: Broken pipe errors

2001-07-02 Thread John R. Jackson

>> You don't, by any chance, have both hardware and software compression
>> turned on, do you?
>
>How do I tell, exactly?  I have a dumptype line reading
>
>   compress client fast
>
>But that just seems like which CPU it uses to compute the compression
>algorithm.

This line says you are using software compression.  It also happens to
say it's done on the client, but that doesn't really matter.

Since you're using software compression, you **must** turn off hardware
compression.  How you do that is highly OS dependent (it's based on the
tape device name for Solaris and AIX, I think it's an "mt" command for
Linux, and so on).

If you want to use hardware compression, you should turn off software
compression with:

  compress none

... for all the dumptypes.

I wouldn't advise going that route unless you're having trouble (e.g.
excessive load or wallclock time) with software compression.  Software
does a better job (since it can look at longer strings of data) and is
more flexible (you could turn it off for just certain disks whose data
was already compressed, for instance).

>> Which size specification?  If you mean the one from taper, it's very
>> accurate (to the KByte).
>
>Good.

By "the one from taper" I was referring to the "taper: tape /dev/whatever
kb NNN fm NNN ..." line in the "NOTES" section of the E-mail (or in the
amdump.NN file).  All the other Amanda statistics are based on what was
*successfully* done.  So if the image being processed at the time of
the error was large, they could be way off from where the error happened.
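
As a rough illustration of reading that figure, here is a minimal Python
sketch that pulls the "kb NNN fm NNN" pair out of a taper line and converts
it to GiB.  The sample line is the one from the amdump log in the original
post; the regular expression, the function names, and the assumption that
"kb" means 1024-byte units are mine, not something taken from the Amanda
sources.

```python
import re

# Sample line copied from the amdump log in the original post.
TAPER_LINE = "taper: writing end marker. [phg-weekly-015 ERR kb 10020128 fm 22]"

# Assumed pattern: a "kb <number> fm <number>" pair somewhere in the line.
KB_FM = re.compile(r"\bkb (\d+) fm (\d+)\b")


def parse_taper_totals(line):
    """Return (kbytes_written, filemarks) from a taper line, or None."""
    match = KB_FM.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def kbytes_to_gib(kbytes):
    """Convert KBytes (assumed to be 1024-byte units) to GiB (2**30 bytes)."""
    return kbytes / (1024 * 1024)


if __name__ == "__main__":
    kbytes, filemarks = parse_taper_totals(TAPER_LINE)
    print(f"{kbytes} KB = {kbytes_to_gib(kbytes):.2f} GiB, {filemarks} filemarks")
```

For the line above this comes out at roughly 9.56 GiB, i.e. the point at
which taper hit the error, not the total of the successfully written images.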

>Drew

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]



Re: Broken pipe errors

2001-07-02 Thread Drew Raines

* "John R. Jackson" <[EMAIL PROTECTED]>:
>
> I thought DDS-3 was 12 native and 24 compressed.

You're right.  I had my terminology mixed up.  By "12 compressed" I meant
*after* the compression had taken effect.  Anyway...

> You don't, by any chance, have both hardware and software compression
> turned on, do you?

How do I tell, exactly?  I have a dumptype line reading

compress client fast

But that just seems like which CPU it uses to compute the compression
algorithm.

> Double compressing things often makes them expand,
> i.e. you don't even get the native capacity, and might explain what
> you're seeing.

That could explain a lot.  I inherited this system from a previous admin
who had a few gigabytes less data to back up.  Now, after a few months have
passed, I'm the one stuck with the tapes filling up.  He very well could
have been using both compression types without knowing it.

> Which size specification?  If you mean the one from taper, it's very
> accurate (to the KByte).

Good.

> If you mean the one from the manufacturer, well, let's just say a lot
> of those folks won't ever have an employment problem as long as there
> are cars (or snake oil :-).

We all can't be engineers... :)

-- 
Drew



Re: Broken pipe errors

2001-07-02 Thread John R. Jackson

>A DDS-3 tape should hold 12GB, compressed, right?  ...

I thought DDS-3 was 12 native and 24 compressed.

>It looks like "taper" dies after 10GB.  ...

Correct.

>Does the TAPE-ERROR/short write mean the tape is bad?  ...

It means taper got an error.  That, in turn, might mean any number of
things are bad (tape, drive, cleaning, cables, distance to the nearest
candy machine, etc).

You don't, by any chance, have both hardware and software compression
turned on, do you?  Double compressing things often makes them expand,
i.e. you don't even get the native capacity, and might explain what
you're seeing.

>How accurate is the size specification of the tape?

Which size specification?  If you mean the one from taper, it's very
accurate (to the KByte).

If you mean the one from the manufacturer, well, let's just say a lot
of those folks won't ever have an employment problem as long as there
are cars (or snake oil :-).

>Drew

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]



Re: Broken pipe errors

2001-07-02 Thread Bernhard R. Erdmann

> A DDS-3 tape should hold 12GB, compressed, right?  It looks like "taper"
> dies after 10GB.  Does the TAPE-ERROR/short write mean the tape is bad?  If
> the tape was more full, I wouldn't be suspicious; but it seems like I
> should be getting more space out of 125m.

DDS-3 holds 12 GB uncompressed. Period.
Don't count on what marketing people say. Blah, 24 GB, blah... They have
no clue about the technical details.

How much will fit onto a tape with _either_ hardware or software
(gzip/bzip2) compression depends on the compressibility of your data.

_Do not_ use both hardware and software compression. _It will_ enlarge your
data.

You get something around 20 GB of the usual data mix onto DDS-3 using
either hardware or software compression.

Lots of source code, executables, news or mail increase compressibility;
lots of mp3s or gzipped files reduce it. The amount of entropy in a file
determines how far its size can be reduced.

Do you really expect some algorithm to shrink an mp3 with no loss of
information, when the mp3 already holds only 1/10th of the original PCM
signal _with_ loss of information (whether or not your ears can detect
that loss)? How do you want to compress _random_ data? What similarities
do you expect in random data that would let you write it in a shorter form?
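
To make the point concrete, here is a small Python sketch using zlib as a
stand-in for gzip or the drive's compressor (an assumption; the drive uses
its own algorithm). The helper names and the toy vocabulary are made up for
illustration. It compresses word-like text and pseudo-random bytes once and
then a second time, and reports the sizes.

```python
import random
import zlib


def pseudo_random_bytes(n, seed=0):
    """Deterministic stand-in for incompressible data (mp3s, gzipped files)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


def wordy_text(n_words, seed=0):
    """Deterministic stand-in for compressible data (text, source code)."""
    rng = random.Random(seed)
    vocab = ["amanda", "backup", "tape", "dump", "holding", "disk", "level", "index"]
    return " ".join(rng.choice(vocab) for _ in range(n_words)).encode("ascii")


def sizes(data):
    """Return (raw, compressed once, compressed twice) sizes in bytes."""
    once = zlib.compress(data)
    twice = zlib.compress(once)  # second pass over already compressed data
    return len(data), len(once), len(twice)


if __name__ == "__main__":
    samples = (("text", wordy_text(20000)), ("random", pseudo_random_bytes(100000)))
    for label, data in samples:
        raw, once, twice = sizes(data)
        print(f"{label:>6}: raw={raw} once={once} twice={twice}")
```

The random sample comes out slightly larger after the first pass and larger
again after the second, which is the same effect as feeding gzipped images to
a drive with hardware compression switched on.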

I assume you're using hardware compression with gzipped images.
Somewhere between 9.5 and 10 GB is the usual limit on DDS-3 in that case.

Using software compression makes things clearer for Amanda. She knows
how well an image compresses for both full and incremental dumps. The
planner uses this information when deciding what to back up at which
level. The usable tape size value can be very accurate.

Hardware compression saves CPU time, but Amanda doesn't have a clue
about the compressibility of each image. So you use an average value
for the tape size and hope for the best.
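
A back-of-the-envelope version of that guess, as a Python sketch. This is
not how Amanda computes anything internally; the function names and the
example fractions are assumptions chosen to match the numbers in this
thread (12 GB native, about 20 GB for a usual data mix).

```python
def effective_capacity_gb(native_gb, avg_compressed_fraction):
    """Estimate how much original data fits on a tape under drive compression.

    avg_compressed_fraction is compressed size / original size, an average
    the administrator has to guess or measure. Values above 1 model data
    that expands, e.g. images that were already gzipped.
    """
    if avg_compressed_fraction <= 0:
        raise ValueError("fraction must be positive")
    return native_gb / avg_compressed_fraction


def fits(image_sizes_gb, native_gb, avg_compressed_fraction):
    """True if the images are expected to fit, given the assumed average."""
    capacity = effective_capacity_gb(native_gb, avg_compressed_fraction)
    return sum(image_sizes_gb) <= capacity


if __name__ == "__main__":
    # DDS-3, usual data mix: a fraction of 0.6 gives about 20 GB.
    print(effective_capacity_gb(12, 0.6))
    # The same 15 GB of images fits with that average, but not if the
    # data turns out to be incompressible (fraction 1.0).
    print(fits([8, 7], 12, 0.6), fits([8, 7], 12, 1.0))
```

The weak spot is the single average: one night of mp3s or pre-gzipped files
and the real capacity drops back to the native 12 GB or below.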

> How accurate is the size specification of the tape?

When I pushed Amanda to the limit by adding more and more hosts or
filesystems using software compression, I noticed that tapetype's
measurement was very accurate. Sometimes it didn't fit onto one tape
(DDS-2, 4 GB) because it failed to write 10-50 MB. This is a really good
guess! You'll have to pay lots of money to get a measuring instrument
working below 1 % error.



Broken pipe errors

2001-07-02 Thread Drew Raines

I'm having some seemingly elementary errors that I don't quite understand.

Using Solaris 2.7, Amanda 2.4.1p1, Sun StorEdge DDS-3 changer on an E450.

From the amdump log:

  driver: dumping phg:/export/home directly to tape
  driver: send-cmd time 11090.642 to taper: PORT-WRITE 00-00043 phg
/export/home 0 20010630
  driver: result time 11090.656 from taper: PORT 41252
  driver: send-cmd time 11090.660 to dumper0: PORT-DUMP 01-00044 41252 phg
/export/home 0 1970:1:1:0:0:0 DUMP |;bsd-auth;compress-fast;index;
  driver: state time 11090.660 free kps: 1357 space: 15843292 taper: writing
idle-dumpers: 3 qlen tape q: 0 runq: 0 stoppedq: 0 wakeup: 86400 driver-idle: 
not-idle
  driver: interface-state time 11090.660 if : free 1 if LE0: free 1
if LOCAL: free 357
  driver: hdisk-state time 11090.660 hdisk 0: free 15843292 dumpers 0
  taper: writing end marker. [phg-weekly-015 ERR kb 10020128 fm 22]
  driver: result time 14749.051 from dumper0: FAILED 01-00044 ["data write:
Broken pipe"]
  dumper: kill index command
  driver: result time 14749.051 from taper: TAPE-ERROR 00-00043 [writing
file: short write]
  driver: QUITTING time 14749.137 telling children to quit
  driver: send-cmd time 14749.137 to taper: QUIT
Broken Pipe
  amdump: end at Sat Jun 30 05:05:50 CDT 2001

A DDS-3 tape should hold 12GB, compressed, right?  It looks like "taper"
dies after 10GB.  Does the TAPE-ERROR/short write mean the tape is bad?  If
the tape was more full, I wouldn't be suspicious; but it seems like I
should be getting more space out of 125m.

How accurate is the size specification of the tape?

-- 
Drew