Re: client failure

2009-05-07 Thread Brian Cuttler

Jean-Louis,
Dustin

Sorry it took so long, but we put the latest snapshot into
place and it DID resolve the issue with backup of the older
amanda client on the SGI/IRIX box.

Thank you,

Brian



Re: client failure

2009-04-13 Thread Brian Cuttler
I'm out the rest of the week and am reluctant to install a
new version when I wouldn't be here to check the result.

Will compile and install the new release early next week
(with the patch if it's not included in p1) and will leave
the everest DLE on the old server for the time being.

Will let you know how I make out with the install/patch
when I return.

Thank you,

Brian


Re: client failure

2009-04-10 Thread Brian Cuttler



On Thu, Apr 09, 2009 at 05:04:21PM -0400, Dustin J. Mitchell wrote:
> On Thu, Apr 9, 2009 at 4:55 PM, Brian Cuttler  wrote:
> > # more chunker.20090409164847.debug
> 
> That's a chunker debug log -- do you have a dumper debug log?
> dumper.20090409164847.debug or something similar?  You may have
> several -- see if you can find one that shows something "unusual" at
> the end (like a traceback).

Sorry, I included the amdump log, which is not the same as the dumper debug...
and when that mentioned the chunker...

From the client - I wonder if I need to rebuild with port restrictions
because my new server has them...

Not dynamically reconfigurable? Requires a rebuild?

everest 66# more sendbackup.debug
sendbackup: debug 1 pid 467288 ruid 0 euid 0 start time Thu Apr  9 18:46:58 2009
/usr/local/libexec/sendbackup: got input request: DUMP /images3  0 1970:1:1:0:0:0 OPTIONS |;bsd-auth;compress-fast;no-record;
  parsed request as: program `DUMP' disk `/images3' lev 0 since 1970:1:1:0:0:0 opt `|;bsd-auth;compress-fast;no-record;'
  waiting for connect on 857, then 690
  got all connections
sendbackup: spawning "/usr/sbin/gzip" in pipeline
sendbackup: argument list: "/usr/sbin/gzip" "--fast"
sendbackup: spawning "/usr/local/libexec/rundump" in pipeline
sendbackup: argument list: "xfsdump" "-J" "-F" "-l" "0" "-" "/dev/rdsk/dks0d3s0"
sendbackup: pid 464097 finish time Thu Apr  9 19:12:33 2009


I ran # amdump curie grifserv, generating these new dumper debug
files in directory /tmp/amanda/server/curie. I don't see anything
exciting here, but I'm not always certain what to look for.


> more dumper.*
::
dumper.20090410113619.debug
::
123939.211485: dumper: pid 24796 ruid 110 euid 110 version 2.6.1: start at Fri Apr 10 11:36:19 2009
123939.214530: dumper: pid 24796 ruid 110 euid 110 version 2.6.1: rename at Fri Apr 10 11:36:19 2009
123939.214802: dumper: getcmd: START 20090410113619
1239378177.267303: dumper: getcmd: PORT-DUMP 00-2 10092 everest 34cbfe811f0100 /images3 NODEVICE 0 1970:1:1:0:0:0 DUMP X X X bsd |;bsd-auth;compress-fast;index;
1239378177.273822: dumper: make_socket opening socket with family 2
1239378177.273911: dumper: connect_port: Try  port 10084: available - Success
1239378177.274039: dumper: connected to 127.0.0.1.10092
1239378177.274044: dumper: our side is 0.0.0.0.10084
1239378177.274053: dumper: try_socksize: send buffer size is 65536
::
dumper.20090410113619000.debug
::
123939.211908: dumper: pid 24795 ruid 110 euid 110 version 2.6.1: start at Fri Apr 10 11:36:19 2009
123939.214921: dumper: pid 24795 ruid 110 euid 110 version 2.6.1: rename at Fri Apr 10 11:36:19 2009
123939.215212: dumper: getcmd: START 20090410113619
1239378162.227997: dumper: getcmd: PORT-DUMP 00-1 10093 everest 34cbfe811f0100 /images3 NODEVICE 0 1970:1:1:0:0:0 DUMP X X X bsd |;bsd-auth;compress-fast;index;
1239378162.234231: dumper: make_socket opening socket with family 2
1239378162.234317: dumper: connect_port: Try  port 10084: available - Success
1239378162.234448: dumper: connected to 127.0.0.1.10093
1239378162.234453: dumper: our side is 0.0.0.0.10084
1239378162.234461: dumper: try_socksize: send buffer size is 65536
::
dumper.20090410113619001.debug
::
123939.212055: dumper: pid 24797 ruid 110 euid 110 version 2.6.1: start at Fri Apr 10 11:36:19 2009
123939.215056: dumper: pid 24797 ruid 110 euid 110 version 2.6.1: rename at Fri Apr 10 11:36:19 2009
123939.215350: dumper: getcmd: START 20090410113619
1239378177.276916: dumper: getcmd: QUIT ""
1239378177.277106: dumper: pid 24797 finish time Fri Apr 10 11:42:57 2009
::
dumper.20090410113619002.debug
::
123939.223566: dumper: pid 24798 ruid 110 euid 110 version 2.6.1: start at Fri Apr 10 11:36:19 2009
123939.226566: dumper: pid 24798 ruid 110 euid 110 version 2.6.1: rename at Fri Apr 10 11:36:19 2009
123939.226823: dumper: getcmd: START 20090410113619
1239378177.276956: dumper: getcmd: QUIT ""
1239378177.277162: dumper: pid 24798 finish time Fri Apr 10 11:42:57 2009
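
Going by the listings above, the two dumper logs that received a PORT-DUMP
command stop after the try_socksize line, while the two idle dumpers end with
a "finish time" line. A missing "finish time" line is therefore one thing to
look for, and it would be consistent with the "dumper1 died" report, though
the thread does not confirm that. The sketch below is a hypothetical helper,
not an Amanda tool; the name log_has_finish_line is an assumption made for
illustration. It only tests whether the last non-blank line of a debug log
contains "finish time".

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical check, not part of Amanda: returns true when the last
 * non-blank line read from fp contains the text "finish time".
 * Lines longer than the buffer are read in pieces, which is good
 * enough for a rough check like this one.
 */
bool log_has_finish_line(FILE *fp)
{
    char line[4096];
    bool last_is_finish = false;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* Skip lines that contain only whitespace. */
        if (line[strspn(line, " \t\r\n")] == '\0') {
            continue;
        }
        /* Remember the verdict for the most recent non-blank line. */
        last_is_finish = strstr(line, "finish time") != NULL;
    }
    return last_is_finish;
}
```

One possible use is to open each dumper.*.debug file with fopen(), pass the
stream to this function, and look more closely at any file for which it
returns false.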

---
   Brian R Cuttler brian.cutt...@wadsworth.org
   Computer Systems Support(v) 518 486-1697
   Wadsworth Center(f) 518 473-6384
   NYS Department of HealthHelp Desk 518 473-0773



IMPORTANT NOTICE: This e-mail and any attachments may contain
confidential or sensitive information which is, or may be, legally
privileged or otherwise protected by law from further disclosure.  It
is intended only for the addressee.  If you received this in error or
from someone who was not authorized to send it to you, please do not
distribute, copy or use it or any attachments.  Please notify the
sender immediately by reply e-mail and delete this from your
system. Thank you for your cooperation.




Re: client failure

2009-04-09 Thread Dustin J. Mitchell
On Thu, Apr 9, 2009 at 7:20 PM, Jean-Louis Martineau
 wrote:
> Can you try this patch?
> I need this patch to connect to a 2.4.2p1 client; I was not able to compile
> 2.4.1.

The patch looks good to me, regardless of whether it solves Brian's problem.

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com


Re: client failure

2009-04-09 Thread Jean-Louis Martineau

Brian,

Can you try this patch?
I need this patch to connect to a 2.4.2p1 client; I was not able to
compile 2.4.1.


Jean-Louis

Brian Cuttler wrote:

I'm trying to migrate an amanda client, SGI/IRIX with amanda 2.4.1p1
from server Solaris 9 with amanda 2.4.4 to a server Solaris 10 with
amanda 2.6.1.

I have an error, but don't see anything standing out in the client's
/tmp/amanda tree.

FAILURE DUMP SUMMARY:
   everest /images3 lev 0  FAILED [dumper1 died]

Did I cross a threshold on versioning?

You'd think I could easily find this on the internet, but no.

I'd thought there was a client/server protocol issue at 2.4.0;
did I misremember, or am I looking at a different issue?

This page suggests that I can work with clients older than 2.5.1, but
I'm not sure whether this is a client/server protocol issue rather
than a communications issue.

http://wiki.zmanda.com/index.php/Selfcheck_request_failed#Backing_Up_Older_Amanda_Clients_.28pre-2.5.1.29

It suggested that auth "bsd" would allow me to back up older clients, but
that wasn't the solution for me.

thank you,

Brian
---
   Brian R Cuttler brian.cutt...@wadsworth.org
   Computer Systems Support(v) 518 486-1697
   Wadsworth Center(f) 518 473-6384
   NYS Department of HealthHelp Desk 518 473-0773





  


Index: server-src/dumper.c
===
--- server-src/dumper.c	(revision 1854)
+++ server-src/dumper.c	(working copy)
@@ -2089,7 +2089,7 @@
 		   " ", dumpdate,
 		   " OPTIONS ", options,
 		   /* compat: if authopt=krb4, send krb4-auth */
-		   (strcasecmp(authopt, "krb4") ? "" : "krb4-auth"),
+		   (authopt && strcasecmp(authopt, "krb4") ? "" : "krb4-auth"),
 		   "\n",
 		   NULL);
 }
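
The one-line change guards the strcasecmp() call against a NULL authopt;
passing NULL to strcasecmp() is undefined behavior in C. As written in the
diff, && binds tighter than ?:, so a NULL authopt selects the "krb4-auth"
branch. A minimal sketch of the same guard follows, under the assumption that
a NULL authopt should instead mean "not krb4". The name krb4_compat_suffix is
hypothetical and is not a function in dumper.c.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/*
 * Hypothetical helper, not part of Amanda: returns the compatibility
 * suffix to append to the request options.
 * Assumption: a NULL authopt is treated as "not krb4".
 */
const char *krb4_compat_suffix(const char *authopt)
{
    if (authopt == NULL) {
        return "";      /* never hand NULL to strcasecmp() */
    }
    /* strcasecmp() returns 0 on a case-insensitive match. */
    return strcasecmp(authopt, "krb4") == 0 ? "krb4-auth" : "";
}
```

Writing the NULL test as its own statement keeps the intent visible and
avoids relying on operator precedence inside a conditional expression.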


Re: client failure

2009-04-09 Thread Dustin J. Mitchell
On Thu, Apr 9, 2009 at 4:55 PM, Brian Cuttler  wrote:
> # more chunker.20090409164847.debug

That's a chunker debug log -- do you have a dumper debug log?
dumper.20090409164847.debug or something similar?  You may have
several -- see if you can find one that shows something "unusual" at
the end (like a traceback).

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com


Re: client failure

2009-04-09 Thread Jean-Louis Martineau

Post dumper.*.debug files?

Jean-Louis


Re: client failure

2009-04-09 Thread Brian Cuttler
Dustin,
Jean-Louis,

On Thu, Apr 09, 2009 at 04:37:24PM -0400, Dustin J. Mitchell wrote:
> On Thu, Apr 9, 2009 at 4:21 PM, Brian Cuttler  wrote:
> >   everest /images3 lev 0  FAILED [dumper1 died]
> 
> Check the dumper debug logs on the server, rather than the client.




# more chunker.20090409164847.debug

1239310127.748174: chunker: pid 23052 ruid 110 euid 110 version 2.6.1: start at Thu Apr  9 16:48:47 2009
1239310127.751130: chunker: pid 23052 ruid 110 euid 110 version 2.6.1: rename at Thu Apr  9 16:48:47 2009
1239310127.751381: chunker: getcmd: START 20090409164827
1239310127.751428: chunker: getcmd: PORT-WRITE 00-3 /thump/amanda/work/20090409164827/everest._images3.0 everest 34cbfe811f0100 /images3 0 1970:1:1:0:0:0 1048576 DUMP 1178784 |;bsd-auth;compress-fast;index;
1239310127.752279: chunker: stream_server opening socket with family 2 (requested family was 2)
1239310127.752350: chunker: try_socksize: receive buffer size is 65536
1239310127.758397: chunker: bind_portrange2: Try  port 10096: Available - Success
1239310127.758476: chunker: stream_server: waiting for connection: 0.0.0.0.10096
1239310127.758491: chunker: putresult: 23 PORT
1239310127.764993: chunker: stream_accept: connection from 127.0.0.1.10084
1239310127.765019: chunker: try_socksize: receive buffer size is 65536
1239310127.765485: chunker: putresult: 10 FAILED
1239310127.765592: chunker: pid 23052 finish time Thu Apr  9 16:48:47 2009



amdump.1


amdump: start at Thu Apr 9 16:48:27 EDT 2009
amdump: datestamp 20090409
amdump: starttime 20090409164827
amdump: starttime-locale-independent 2009-04-09 16:48:27 EDT
planner: pid 22319 executable /usr/local/libexec/amanda/planner version 2.6.1-20090227
planner: build: VERSION="Amanda-2.6.1-20090227"
planner:        BUILT_DATE="Mon Mar 9 17:02:49 EDT 2009"
planner:        BUILT_MACH="i386-pc-solaris2.10" BUILT_REV="1714"
planner:        BUILT_BRANCH="amanda-261" CC="/opt/SUNWspro/bin/cc"
planner: paths: bindir="/usr/local/bin" sbindir="/usr/local/sbin"
planner:        libexecdir="/usr/local/libexec"
planner:        amlibexecdir="/usr/local/libexec/amanda"
planner:        mandir="/usr/local/share/man" AMANDA_TMPDIR="/tmp/amanda"
planner:        AMANDA_DBGDIR="/tmp/amanda"
planner:        CONFIG_DIR="/usr/local/etc/amanda" DEV_PREFIX="/dev/dsk/"
planner:        RDEV_PREFIX="/dev/rdsk/" DUMP="/usr/sbin/ufsdump"
planner:        RESTORE="/usr/sbin/ufsrestore" VDUMP=UNDEF VRESTORE=UNDEF
planner:        XFSDUMP=UNDEF XFSRESTORE=UNDEF VXDUMP=UNDEF VXRESTORE=UNDEF
planner:        SAMBA_CLIENT="/usr/sfw/bin/smbclient"
planner:        GNUTAR="/usr/sfw/bin/gtar" COMPRESS_PATH="/usr/bin/gzip"
planner:        UNCOMPRESS_PATH="/usr/bin/gzip" LPRCMD="/usr/bin/lpr"
planner:        MAILER=UNDEF
planner:        listed_incr_dir="/usr/local/var/amanda/gnutar-lists"
planner: defs:  DEFAULT_SERVER="curie" DEFAULT_CONFIG="DailySet1"
planner:        DEFAULT_TAPE_SERVER="curie" DEFAULT_TAPE_DEVICE=""
planner:        HAVE_MMAP NEED_STRSTR HAVE_SYSVSHM AMFLOCK_POSIX AMFLOCK_LOCKF
planner:        AMFLOCK_LNLOCK SETPGRP_VOID AMANDA_DEBUG_DAYS=4 BSD_SECURITY
planner:        USE_AMANDAHOSTS CLIENT_LOGIN="amanda" CHECK_USERID HAVE_GZIP
planner:        COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
planner:        COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
READING CONF INFO...
driver: pid 22320 executable /usr/local/libexec/amanda/driver version 2.6.1-20090227
planner: timestamp 20090409164827
planner: time 0.000: startup took 0.000 secs

SENDING FLUSHES...
driver: tape size 822083584
driver: adding holding disk 0 dir /amanda0/work size 469420032 chunksize 1048576
driver: adding holding disk 1 dir /thump/amanda/work size 9148094464 chunksize 1048576
reserving 0 out of 9617514496 for degraded-mode dumps
driver: send-cmd time 0.004 to taper: START-TAPER 20090409164827
FLUSH trel /trel 20090409133424 1 /thump/amanda/work/20090409133424/trel._trel.1
ENDFLUSH

SETTING UP FOR ESTIMATES...
planner: time 0.002: setting up estimates for everest:/images3
everest:/images3 overdue 14344 days for level 0
setup_estimate: everest:/images3: command 0, options: none    last_level -1 next_level0 -14344 level_days 0    getting estimates 0 (-2) -1 (-2) -1 (-2)
planner: time 0.002: setting up estimates took 0.000 secs

GETTING ESTIMATES...
driver: started dumper0 pid 22322
driver: send-cmd time 0.005 to dumper0: START 20090409164827
driver: started dumper1 pid 22323
driver: send-cmd time 0.006 to dumper1: START 20090409164827
driver: started dumper2 pid 22324
driver: send-cmd time 0.006 to dumper2: START 20090409164827
driver: started dumper3 pid 22325
driver: send-cmd time 0.007 to dumper3: START 20090409164827
driver: start time 0.007 inparallel 4 bandwidth 800 diskspace 9617514496  dir OBSOLETE datestamp 20090409164827 driver: drain-ends tapeq FIRST big-dumpers ssSS
dumper: pid 22322 executable dumper0 version 2.6.1-20090227
dumper: pid 22323 executable dumper1 version 2.6.1-20090227
dumper: pid 

Re: client failure

2009-04-09 Thread Jean-Louis Martineau

Brian,

I tested compatibility with 2.4.5, but it should work with 2.4.1.
It's a bug in the server since the driver died; can you get a backtrace
of the process?

Also, send me the amdump.1 file.

Jean-Louis





Re: client failure

2009-04-09 Thread Dustin J. Mitchell
On Thu, Apr 9, 2009 at 4:21 PM, Brian Cuttler  wrote:
>   everest /images3 lev 0  FAILED [dumper1 died]

Check the dumper debug logs on the server, rather than the client.

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com


Re: Client failure problem -answer

2005-01-06 Thread Jon LaBadie
On Thu, Jan 06, 2005 at 10:46:30AM +, Keith Matthews wrote:
> On Wed, 5 Jan 2005 15:15:01 -0500
> > Did you perchance increase the etimeout and dtimeout values first?
> > 
> 
> Nope, partly as there is either no documentation on them or it's well
> hidden.


Yup, we keep the man page for amanda well hidden :))


> > ... .  As long as the spindle numbers used are 
> > assigned to the individual drive per number, I've not even had any 
> > disk thrashing problems either.
> > 
> 
> This was all one spindle.


But did you tell amanda that?  If you don't tell it, by indicating which
DLEs are on the same spindle, amanda assumes each is a separate
spindle and might try many simultaneous dumps from that spindle.
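
As a rough illustration of telling amanda about a shared spindle, a disklist
along the following lines gives every DLE on the one physical drive the same
spindle number, so that amanda does not dump them at the same time. This is
only a sketch: client.example.com is a placeholder hostname, the spindle
value 1 is arbitrary, and the disk names and dumptypes are taken from the
disklist quoted elsewhere in this thread. A spindle of -1, as in that
original disklist, places no such restriction.

```
# disklist entry format: hostname diskname dumptype [spindle]
# client.example.com is a placeholder, not the real host
client.example.com  wd0a  user-tar       1
client.example.com  wd0e  comp-user-tar  1
client.example.com  wd0g  comp-user-tar  1
```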

-- 
Jon H. LaBadie  [EMAIL PROTECTED]
 JG Computing
 4455 Province Line Road(609) 252-0159
 Princeton, NJ  08540-4322  (609) 683-7220 (fax)


Re: Client failure problem -answer

2005-01-06 Thread Alexander Jolk
Keith Matthews wrote:
> Further
> testing revealed that it was quite happy with as many as 7 entries, more
> would cause the first few to fail, and the whole set would cause the lot
> to fail.

Hadn't there been talk once on the list of UDP packet size problems
too?  Something like a specific operating system supporting only so many
bytes in a single UDP packet, and when it got too big, things failed
mysteriously?  That would explain why there's so low a limit for this
host.  (I have several tens of DLEs for a few hosts, and things work just
perfectly.)

Alex

-- 
Alexander Jolk / BUF Compagnie
tel +33-1 42 68 18 28 /  fax +33-1 42 68 18 29


Re: Client failure problem -answer

2005-01-06 Thread Keith Matthews
On Wed, 5 Jan 2005 15:15:01 -0500
Gene Heskett <[EMAIL PROTECTED]> wrote:


> >Replacing the above with one entry per filesystem (i.e wd0a, wd0e,
> > wd0g) where the whole filesystem was needed, and the top level
> > directory (/var) for the other case, with an exclude file to
> > eliminate the unwanted had the whole set dumping correctly.  I have
> > no idea if this is a generic Amanda issue or one specific to the
> > OpenBSD port.
> >
> >Debugging was complicated by the disk entries being tried in reverse
> >order, something else that does not seem to be mentioned in the
> >documentation.
> >
> >In case anyone wonders about the effect of 'inparallel', I left it
> > at the default of 4.
> 
> Did you perchance increase the etimeout and dtimeout values first?
> 

Nope, partly as there is either no documentation on them or it's well
hidden.

I'm not sure it would have had any effect anyway as the original problem
situation had amandad failing very quickly (less than 5 seconds, I never
managed to catch it running with ps) with status 1.


> I have had as many as 53 entries for a single client in my disklist
> without any problems.  As long as each spindle number used is
> assigned to one individual drive, I've not even had any
> disk thrashing problems either.
> 
> 

This was all one spindle.


Re: Client failure problem -answer

2005-01-05 Thread Gene Heskett
On Wednesday 05 January 2005 13:35, Keith Matthews wrote:
>On Sat, 18 Dec 2004 10:01:41 +
>
>Keith Matthews <[EMAIL PROTECTED]> wrote:
>> On Sat, 18 Dec 2004 09:17:30 +
>>
>> Keith Matthews <[EMAIL PROTECTED]> wrote:
>> > In the light of messages just posted I'll report this in case it
>> > didn't get out. Apologies to those who got it first time, I
>> > don't like those who assume that no answer simply means people
>> > don't want to answer either.
>> >
>> > I'm having trouble getting a remote backup to work.
>> >
>> > The report states that the disk backup failed due to a timeout.
>> > Examination of /var/log/messages at the client shows that
>> > amandad exited 'status 1' but gives no other indication of the
>> > cause of the problem. This is happening with all disks on that
>> > client.
>
>OK, for the sake of posterity I'd better post some more for this.
>
>The problem seems to be related to the number of entries in the
> disklist for the relevant host.
>
>I originally had
>
> wd0a user-tar -1
> wd0e comp-user-tar -1
> wd0g comp-user-tar -1
> /var/amanda user-tar -1
> /var/backups user-tar -1
> /var/clamav user-tar -1
> /var/cron user-tar -1
> /var/mysql user-tar -1
> /var/named user-tar -1
> /var/spool comp-user-tar -1
> /var/www user-tar -1
>
>(I've replaced the real, fqdn, hostname for security reasons).
>
>After some considerable amount of cut-and-try testing I discovered
> that the system worked quite happily with just one disklist entry.
> Further testing revealed that it was quite happy with as many as 7
> entries, more would cause the first few to fail, and the whole set
> would cause the lot to fail.
>
>Replacing the above with one entry per filesystem (i.e. wd0a, wd0e,
> wd0g) where the whole filesystem was needed, and the top level
> directory (/var) for the other case, with an exclude file to
> eliminate the unwanted directories, had the whole set dumping
> correctly.  I have no idea if this is a generic Amanda issue or one
> specific to the OpenBSD port.
>
>Debugging was complicated by the disk entries being tried in reverse
>order, something else that does not seem to be mentioned in the
>documentation.
>
>In case anyone wonders about the effect of 'inparallel' I left it
> at the default of 4.

Did you perchance increase the etimeout and dtimeout values first?

I have had as many as 53 entries for a single client in my disklist
without any problems.  As long as each spindle number used is
assigned to one individual drive, I've not even had any
disk thrashing problems either.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.31% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message
by Gene Heskett are:
Copyright 2004 by Maurice Eugene Heskett, all rights reserved.


Re: Client failure problem -answer

2005-01-05 Thread Keith Matthews
On Sat, 18 Dec 2004 10:01:41 +
Keith Matthews <[EMAIL PROTECTED]> wrote:

> On Sat, 18 Dec 2004 09:17:30 +
> Keith Matthews <[EMAIL PROTECTED]> wrote:
> 
> > In the light of messages just posted I'll report this in case it
> > didn't get out. Apologies to those who got it first time, I don't
> > like those who assume that no answer simply means people don't want
> > to answer either.
> > 
> > I'm having trouble getting a remote backup to work. 
> > 
> > The report states that the disk backup failed due to a timeout.
> > Examination of /var/log/messages at the client shows that amandad
> > exited 'status 1' but gives no other indication of the cause of the
> > problem. This is happening with all disks on that client. 
> > 


OK, for the sake of posterity I'd better post some more for this.

The problem seems to be related to the number of entries in the disklist
for the relevant host.

I originally had 

 wd0a user-tar -1
 wd0e comp-user-tar -1
 wd0g comp-user-tar -1
 /var/amanda user-tar -1
 /var/backups user-tar -1
 /var/clamav user-tar -1
 /var/cron user-tar -1
 /var/mysql user-tar -1
 /var/named user-tar -1
 /var/spool comp-user-tar -1
 /var/www user-tar -1

(I've replaced the real, fqdn, hostname for security reasons).

After some considerable amount of cut-and-try testing I discovered that
the system worked quite happily with just one disklist entry. Further
testing revealed that it was quite happy with as many as 7 entries, more
would cause the first few to fail, and the whole set would cause the lot
to fail.

Replacing the above with one entry per filesystem (i.e. wd0a, wd0e, wd0g)
where the whole filesystem was needed, and the top level directory
(/var) for the other case, with an exclude file to eliminate the
unwanted directories, had the whole set dumping correctly.  I have no
idea if this is a generic Amanda issue or one specific to the OpenBSD
port.

Debugging was complicated by the disk entries being tried in reverse
order, something else that does not seem to be mentioned in the
documentation.

In case anyone wonders about the effect of 'inparallel' I left it at
the default of 4.
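
For anyone finding this in the archives: a rough way to see how many
disklist entries each host has is sketched below in Python. This is not
part of Amanda. The function names, the placeholder hostname
client.example.com and the limit of 7 (taken only from the behaviour
observed on this one client) are assumptions for illustration. The
sketch treats every non-comment line as one entry and does not handle
entries that span several lines.

```python
from collections import Counter


def count_dles(disklist_text):
    """Count disklist entries per host.

    Assumes one entry per line, with the hostname as the first
    whitespace-separated field. Comments start with '#'.
    """
    counts = Counter()
    for line in disklist_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        counts[line.split()[0]] += 1
    return counts


def hosts_over(counts, limit):
    """Return the hosts that have more than `limit` entries, sorted by name."""
    return sorted(host for host, n in counts.items() if n > limit)


# The original entry list from this thread, with a placeholder hostname.
SAMPLE = """\
# hostname is a placeholder, not the real client
client.example.com wd0a user-tar -1
client.example.com wd0e comp-user-tar -1
client.example.com wd0g comp-user-tar -1
client.example.com /var/amanda user-tar -1
client.example.com /var/backups user-tar -1
client.example.com /var/clamav user-tar -1
client.example.com /var/cron user-tar -1
client.example.com /var/mysql user-tar -1
client.example.com /var/named user-tar -1
client.example.com /var/spool comp-user-tar -1
client.example.com /var/www user-tar -1
"""

if __name__ == "__main__":
    counts = count_dles(SAMPLE)
    # 7 is only the point at which this particular client started failing.
    for host in hosts_over(counts, 7):
        print(host, "has", counts[host], "disklist entries")
```

A host flagged this way is not necessarily broken (others on the list
run far more entries per client); it is only a hint about where to look
first.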


Re: Client failure problem -part answer

2004-12-18 Thread Keith Matthews
On Sat, 18 Dec 2004 09:17:30 +
Keith Matthews <[EMAIL PROTECTED]> wrote:

> In the light of messages just posted I'll report this in case it
> didn't get out. Apologies to those who got it first time, I don't like
> those who assume that no answer simply means people don't want to
> answer either.
> 
> I'm having trouble getting a remote backup to work. 
> 
> The report states that the disk backup failed due to a timeout.
> Examination of /var/log/messages at the client shows that amandad
> exited 'status 1' but gives no other indication of the cause of the
> problem. This is happening with all disks on that client. 
> 
> The failure seems to happen immediately (I was trying to check the
> user that amandad was running under and the process did not last long
> enough to show).
> 
> There do not seem to be any logs on the client to give more
> information and I was wondering if there is some sort of debug setting
> I could invoke.
> 
> It ran correctly for a test about two weeks ago but has since
> consistently failed as above. Backups of the tape server host work
> correctly.
> 
> Server is Slackware 10, client OpenBSD 3.5 if that makes a difference.
> 
> Anyone got a clue how I can find out what the client is objecting to ?

OK, for the assistance of anyone who uses the archives: I've found a set
of files in /tmp/amanda which give some information.

One of them says:


amandad:UNCOMPRESS_OPT="-dc"
got packet:

Amanda 2.4 REQ HANDLE 000-E0750608 SEQ 1103362411
SECURITY USER amanda
SERVICE noop
OPTIONS features=feff9ffe0f;


sending nack:

Amanda 2.4 NAK HANDLE 000-E0750608 SEQ 1103362411
ERROR unknown service: noop



If I can work out what changed between the original run and these
later ones to cause that, I'll be able to fix it.
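
When there are many of these debug files to go through, the packets can
be pulled apart mechanically. The Python below is a minimal sketch that
assumes the packet layout is exactly what the excerpt above shows: a
header line of the form "Amanda <version> <type> HANDLE <handle> SEQ
<seq>", followed by "KEYWORD value" lines. The function name and the
dictionary keys are made up for illustration and are not Amanda's own.

```python
import re

# Header layout as it appears in the debug excerpt above (an assumption,
# not taken from the Amanda source).
HEADER = re.compile(
    r"^Amanda (?P<version>\S+) (?P<type>[A-Z]+) "
    r"HANDLE (?P<handle>\S+) SEQ (?P<seq>\d+)$"
)


def parse_packet(text):
    """Return the header fields plus each 'KEYWORD value' line as a dict."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    match = HEADER.match(lines[0])
    if match is None:
        raise ValueError("not an Amanda packet header: %r" % lines[0])
    packet = match.groupdict()
    for ln in lines[1:]:
        key, _, value = ln.partition(" ")  # split on the first space only
        packet[key] = value
    return packet


# The two packets quoted in this message.
REQ = """
Amanda 2.4 REQ HANDLE 000-E0750608 SEQ 1103362411
SECURITY USER amanda
SERVICE noop
OPTIONS features=feff9ffe0f;
"""

NAK = """
Amanda 2.4 NAK HANDLE 000-E0750608 SEQ 1103362411
ERROR unknown service: noop
"""

if __name__ == "__main__":
    request = parse_packet(REQ)
    reply = parse_packet(NAK)
    if reply["type"] == "NAK" and reply["handle"] == request["handle"]:
        print("service", request["SERVICE"], "refused:", reply["ERROR"])
```

Matching the NAK to its request by handle makes it easy to see which
requested service the client rejected.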