Re: 2.5.1p2: planner returns error=EOF on read

2006-11-27 Thread Jean-Louis Martineau

Jean-Francois,

Could you try the attached patch, it should kill the tar process.
If it still doesn't work, could you attach a debugger to the sendbackup 
process to know where it is?


Jean-Louis

Jean-Francois Malouin wrote:

* Jean-Louis Martineau [EMAIL PROTECTED] [20061123 17:20]:
  

Do amandad is running?



To be sure that I would hit the problem again I didn't load new tapes
so after the holddding filled up I have a bunch of gtar running and
just wasting a lot of cycles. A typical example (sorry for the long lines):

  amanda35559363557855  0- -   0:00 defunct 
  amanda3557855  1  0 05:37:39 ?   1:30 /opt/amanda/amanda1/libexec/sendbackup amandad bsdtcp

root35680873557855  0 05:40:02 ?  277:35 gtar --create --file - 
--directory /data/mafalda/mafalda1/susanita/sandra/DTI -

Tracing the sendbackup process yields no output while the gtar one
gives just an uninterupted flow of lines:

   68mS tar(3568087): write(1, fd 63 e3 6c ff 42 3c df 87
41 82 bf e9 dd 95 2c..., 10240) errno = 32 (Broken pipe)

Don't know if this is going to be of any help...
jf

  
Could you try to find what each process (amanda,sendbackup,tar) are 
doing? on which system call they are hung?

On linux, I would use strace for that.

Jean-Louis

Jean-Francois Malouin wrote:


Jean-Louis,

The amdump just finished and because amanda ran out of tapes
(runtapes=10) it completed badly leaving stuff in the holdding
disk. I'm flushing it at the moment but I noticed that when amanda
gave up it left a lot of gnutar lying around, with amandad as parent.
I had to manually kill them. Looks like signals are not doing ok.
I've attached one example of a DLE's sendbackup and runtar debug file.
It's not the first time I notice that when a DLE fails to make it
to tape successfully processes are left running...

* Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]:
 
  

Could you post amandad.timestamp.debug file from yorick?

Jean-Francois Malouin wrote:
   


On a server running irix-6.5 and amanda 2.5.1p2
planner debug shows:

security_getdriver(name=bsdtcp) returns 4075aa8
security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP))
security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP))
security_stream_close(10053a80)
security_stream_seterr(1005e538, SOCKET_EOF)
security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF
on read from yorick)
security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_stream_close(1005e538)

and the amanda report shows things like:

yorick  DATA_sub101   lev 0  FAILED [hmm, disk was stranded on waitq]
planner: ERROR Request to yorick failed: EOF on read from yorick

and the backup fails. The funny thing is that I have 2 other 
configurations

running 2.5.1p2 in parallel that doesn't exhibit this behaviour.
Any clues?
jf

 
  
 



runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 
2006

runtar: version 2.5.1p2
/usr/freeware/bin/tar version: tar (GNU tar) 1.13.25

config: stk_80-conf1
runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 
2006
running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' 
'--directory' 
'/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' 
'--one-file-system' '--listed-incremental' 
'/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' 
'--sparse' '--ignore-failed-read' '--totals' '.' runtar: pid 2826413 
finish time Thu Nov 23 08:04:13 2006
 



sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 
08:03:20 2006

sendbackup: version 2.5.1p2
Could not open conf file 
/opt/amanda/amanda1/etc/amanda/amanda-client.conf: No such file or 
directory
Reading conf file 
/opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf.
sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 
08:03:20 2006
 sendbackup req: GNUTAR sub101 
 /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 
 2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index;

 parsed request as: program `GNUTAR'
disk `sub101'
device 
`/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101'

level 1
since 2006:11:13:13:21:32
options `|;auth=bsdtcp;index;'
sendbackup: start: yorick:sub101 lev 1
sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental 
  
from '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to 


'/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new'
sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13 
13:22:26 GMT
sendbackup: time 53.246: started index creator: /usr/freeware/bin/tar -tf 
- 

Re: 2.5.1p2: planner returns error=EOF on read

2006-11-27 Thread Jean-Francois Malouin
* Jean-Louis Martineau [EMAIL PROTECTED] [20061127 09:24]:
 Jean-Francois,
 
 Could you try the attached patch, it should kill the tar process.
 If it still doesn't work, could you attach a debugger to the sendbackup 
 process to know where it is?

Thanks Jean-Louis. 
I'm recompiling and will let you know what happens.

jf

 
 Jean-Louis
 
 Jean-Francois Malouin wrote:
 * Jean-Louis Martineau [EMAIL PROTECTED] [20061123 17:20]:
   
 Do amandad is running?
 
 
 To be sure that I would hit the problem again I didn't load new tapes
 so after the holddding filled up I have a bunch of gtar running and
 just wasting a lot of cycles. A typical example (sorry for the long lines):
 
   amanda35559363557855  0- -   0:00 defunct 
   amanda3557855  1  0 05:37:39 ?   1:30 
   /opt/amanda/amanda1/libexec/sendbackup amandad bsdtcp
 root35680873557855  0 05:40:02 ?  277:35 gtar --create 
 --file - --directory /data/mafalda/mafalda1/susanita/sandra/DTI -
 
 Tracing the sendbackup process yields no output while the gtar one
 gives just an uninterupted flow of lines:
 
68mS tar(3568087): write(1, fd 63 e3 6c ff 42 3c df 87
 41 82 bf e9 dd 95 2c..., 10240) errno = 32 (Broken pipe)
 
 Don't know if this is going to be of any help...
 jf
 
   
 Could you try to find what each process (amanda,sendbackup,tar) are 
 doing? on which system call they are hung?
 On linux, I would use strace for that.
 
 Jean-Louis
 
 Jean-Francois Malouin wrote:
 
 Jean-Louis,
 
 The amdump just finished and because amanda ran out of tapes
 (runtapes=10) it completed badly leaving stuff in the holdding
 disk. I'm flushing it at the moment but I noticed that when amanda
 gave up it left a lot of gnutar lying around, with amandad as parent.
 I had to manually kill them. Looks like signals are not doing ok.
 I've attached one example of a DLE's sendbackup and runtar debug file.
 It's not the first time I notice that when a DLE fails to make it
 to tape successfully processes are left running...
 
 * Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]:
  
   
 Could you post amandad.timestamp.debug file from yorick?
 
 Jean-Francois Malouin wrote:

 
 On a server running irix-6.5 and amanda 2.5.1p2
 planner debug shows:
 
 security_getdriver(name=bsdtcp) returns 4075aa8
 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP))
 security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP))
 security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP))
 security_stream_close(10053a80)
 security_stream_seterr(1005e538, SOCKET_EOF)
 security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF
 on read from yorick)
 security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP))
 security_stream_close(1005e538)
 
 and the amanda report shows things like:
 
 yorick  DATA_sub101   lev 0  FAILED [hmm, disk was stranded on waitq]
 planner: ERROR Request to yorick failed: EOF on read from yorick
 
 and the backup fails. The funny thing is that I have 2 other 
 configurations
 running 2.5.1p2 in parallel that doesn't exhibit this behaviour.
 Any clues?
 jf
 
  
   
  
 
 
 runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 
 08:04:13 2006
 runtar: version 2.5.1p2
 /usr/freeware/bin/tar version: tar (GNU tar) 1.13.25
 
 config: stk_80-conf1
 runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 
 2006
 running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' 
 '--directory' 
 '/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' 
 '--one-file-system' '--listed-incremental' 
 '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' 
 '--sparse' '--ignore-failed-read' '--totals' '.' runtar: pid 2826413 
 finish time Thu Nov 23 08:04:13 2006
  
 
 
 sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 
 08:03:20 2006
 sendbackup: version 2.5.1p2
 Could not open conf file 
 /opt/amanda/amanda1/etc/amanda/amanda-client.conf: No such file or 
 directory
 Reading conf file 
 /opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf.
 sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 
 08:03:20 2006
  sendbackup req: GNUTAR sub101 
  /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 
  2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index;
  parsed request as: program `GNUTAR'
 disk `sub101'
 device 
 
  `/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101'
 level 1
 since 2006:11:13:13:21:32
 options `|;auth=bsdtcp;index;'
 sendbackup: start: yorick:sub101 lev 1
 sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental 
   
 from 

Re: 2.5.1p2: planner returns error=EOF on read

2006-11-26 Thread Jean-Louis Martineau

Amanda expect tar to exit with a broken pipe.
I looks that we need to kill it.

Jean-Louis

Jean-Francois Malouin wrote:

* Jean-Louis Martineau [EMAIL PROTECTED] [20061123 17:20]:
  

Do amandad is running?



To be sure that I would hit the problem again I didn't load new tapes
so after the holddding filled up I have a bunch of gtar running and
just wasting a lot of cycles. A typical example (sorry for the long lines):

  amanda35559363557855  0- -   0:00 defunct 
  amanda3557855  1  0 05:37:39 ?   1:30 /opt/amanda/amanda1/libexec/sendbackup amandad bsdtcp

root35680873557855  0 05:40:02 ?  277:35 gtar --create --file - 
--directory /data/mafalda/mafalda1/susanita/sandra/DTI -

Tracing the sendbackup process yields no output while the gtar one
gives just an uninterupted flow of lines:

   68mS tar(3568087): write(1, fd 63 e3 6c ff 42 3c df 87
41 82 bf e9 dd 95 2c..., 10240) errno = 32 (Broken pipe)

Don't know if this is going to be of any help...
jf

  
Could you try to find what each process (amanda,sendbackup,tar) are 
doing? on which system call they are hung?

On linux, I would use strace for that.

Jean-Louis

Jean-Francois Malouin wrote:


Jean-Louis,

The amdump just finished and because amanda ran out of tapes
(runtapes=10) it completed badly leaving stuff in the holdding
disk. I'm flushing it at the moment but I noticed that when amanda
gave up it left a lot of gnutar lying around, with amandad as parent.
I had to manually kill them. Looks like signals are not doing ok.
I've attached one example of a DLE's sendbackup and runtar debug file.
It's not the first time I notice that when a DLE fails to make it
to tape successfully processes are left running...

* Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]:
 
  

Could you post amandad.timestamp.debug file from yorick?

Jean-Francois Malouin wrote:
   


On a server running irix-6.5 and amanda 2.5.1p2
planner debug shows:

security_getdriver(name=bsdtcp) returns 4075aa8
security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP))
security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP))
security_stream_close(10053a80)
security_stream_seterr(1005e538, SOCKET_EOF)
security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF
on read from yorick)
security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_stream_close(1005e538)

and the amanda report shows things like:

yorick  DATA_sub101   lev 0  FAILED [hmm, disk was stranded on waitq]
planner: ERROR Request to yorick failed: EOF on read from yorick

and the backup fails. The funny thing is that I have 2 other 
configurations

running 2.5.1p2 in parallel that doesn't exhibit this behaviour.
Any clues?
jf

 
  
 



runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 
2006

runtar: version 2.5.1p2
/usr/freeware/bin/tar version: tar (GNU tar) 1.13.25

config: stk_80-conf1
runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 
2006
running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' 
'--directory' 
'/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' 
'--one-file-system' '--listed-incremental' 
'/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' 
'--sparse' '--ignore-failed-read' '--totals' '.' runtar: pid 2826413 
finish time Thu Nov 23 08:04:13 2006
 



sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 
08:03:20 2006

sendbackup: version 2.5.1p2
Could not open conf file 
/opt/amanda/amanda1/etc/amanda/amanda-client.conf: No such file or 
directory
Reading conf file 
/opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf.
sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 
08:03:20 2006
 sendbackup req: GNUTAR sub101 
 /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 
 2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index;

 parsed request as: program `GNUTAR'
disk `sub101'
device 
`/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101'

level 1
since 2006:11:13:13:21:32
options `|;auth=bsdtcp;index;'
sendbackup: start: yorick:sub101 lev 1
sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental 
  
from '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to 


'/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new'
sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13 
13:22:26 GMT
sendbackup: time 53.246: started index creator: /usr/freeware/bin/tar -tf 
- 2/dev/null | sed -e 's/^\.//'
sendbackup: time 53.254: spawning /opt/amanda/amanda1/libexec/runtar in 
pipeline

Re: 2.5.1p2: planner returns error=EOF on read

2006-11-25 Thread Jean-Francois Malouin
* Jean-Louis Martineau [EMAIL PROTECTED] [20061123 17:20]:
 Do amandad is running?

To be sure that I would hit the problem again I didn't load new tapes
so after the holddding filled up I have a bunch of gtar running and
just wasting a lot of cycles. A typical example (sorry for the long lines):

  amanda35559363557855  0- -   0:00 defunct 
  amanda3557855  1  0 05:37:39 ?   1:30 
/opt/amanda/amanda1/libexec/sendbackup amandad bsdtcp
root35680873557855  0 05:40:02 ?  277:35 gtar --create --file - 
--directory /data/mafalda/mafalda1/susanita/sandra/DTI -

Tracing the sendbackup process yields no output while the gtar one
gives just an uninterupted flow of lines:

   68mS tar(3568087): write(1, fd 63 e3 6c ff 42 3c df 87
41 82 bf e9 dd 95 2c..., 10240) errno = 32 (Broken pipe)

Don't know if this is going to be of any help...
jf

 
 Could you try to find what each process (amanda,sendbackup,tar) are 
 doing? on which system call they are hung?
 On linux, I would use strace for that.
 
 Jean-Louis
 
 Jean-Francois Malouin wrote:
 Jean-Louis,
 
 The amdump just finished and because amanda ran out of tapes
 (runtapes=10) it completed badly leaving stuff in the holdding
 disk. I'm flushing it at the moment but I noticed that when amanda
 gave up it left a lot of gnutar lying around, with amandad as parent.
 I had to manually kill them. Looks like signals are not doing ok.
 I've attached one example of a DLE's sendbackup and runtar debug file.
 It's not the first time I notice that when a DLE fails to make it
 to tape successfully processes are left running...
 
 * Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]:
   
 Could you post amandad.timestamp.debug file from yorick?
 
 Jean-Francois Malouin wrote:
 
 On a server running irix-6.5 and amanda 2.5.1p2
 planner debug shows:
 
 security_getdriver(name=bsdtcp) returns 4075aa8
 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP))
 security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP))
 security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP))
 security_stream_close(10053a80)
 security_stream_seterr(1005e538, SOCKET_EOF)
 security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF
 on read from yorick)
 security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP))
 security_stream_close(1005e538)
 
 and the amanda report shows things like:
 
  yorick  DATA_sub101   lev 0  FAILED [hmm, disk was stranded on waitq]
  planner: ERROR Request to yorick failed: EOF on read from yorick
 
 and the backup fails. The funny thing is that I have 2 other 
 configurations
 running 2.5.1p2 in parallel that doesn't exhibit this behaviour.
 Any clues?
 jf
  
   
 
   
 
 
 runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 
 2006
 runtar: version 2.5.1p2
 /usr/freeware/bin/tar version: tar (GNU tar) 1.13.25
 
 config: stk_80-conf1
 runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 
 2006
 running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' 
 '--directory' 
 '/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' 
 '--one-file-system' '--listed-incremental' 
 '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' 
 '--sparse' '--ignore-failed-read' '--totals' '.' runtar: pid 2826413 
 finish time Thu Nov 23 08:04:13 2006
   
 
 
 sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 
 08:03:20 2006
 sendbackup: version 2.5.1p2
 Could not open conf file 
 /opt/amanda/amanda1/etc/amanda/amanda-client.conf: No such file or 
 directory
 Reading conf file 
 /opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf.
 sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 
 08:03:20 2006
   sendbackup req: GNUTAR sub101 
   /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 
   2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index;
   parsed request as: program `GNUTAR'
  disk `sub101'
  device 
  
  `/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101'
  level 1
  since 2006:11:13:13:21:32
  options `|;auth=bsdtcp;index;'
 sendbackup: start: yorick:sub101 lev 1
 sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental 
 from '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to 
 '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new'
 sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13 
 13:22:26 GMT
 sendbackup: time 53.246: started index creator: /usr/freeware/bin/tar -tf 
 - 2/dev/null | sed -e 's/^\.//'
 sendbackup: time 53.254: spawning /opt/amanda/amanda1/libexec/runtar in 
 pipeline
 sendbackup: argument list: runtar stk_80-conf1 gtar --create 

Re: 2.5.1p2: planner returns error=EOF on read

2006-11-23 Thread Jean-Francois Malouin
Jean-Louis,

The amdump just finished and because amanda ran out of tapes
(runtapes=10) it completed badly leaving stuff in the holdding
disk. I'm flushing it at the moment but I noticed that when amanda
gave up it left a lot of gnutar lying around, with amandad as parent.
I had to manually kill them. Looks like signals are not doing ok.
I've attached one example of a DLE's sendbackup and runtar debug file.
It's not the first time I notice that when a DLE fails to make it
to tape successfully processes are left running...

* Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]:
 Could you post amandad.timestamp.debug file from yorick?
 
 Jean-Francois Malouin wrote:
 On a server running irix-6.5 and amanda 2.5.1p2
 planner debug shows:
 
 security_getdriver(name=bsdtcp) returns 4075aa8
 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP))
 security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP))
 security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP))
 security_stream_close(10053a80)
 security_stream_seterr(1005e538, SOCKET_EOF)
 security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF
 on read from yorick)
 security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP))
 security_stream_close(1005e538)
 
 and the amanda report shows things like:
 
   yorick  DATA_sub101   lev 0  FAILED [hmm, disk was stranded on waitq]
   planner: ERROR Request to yorick failed: EOF on read from yorick
 
 and the backup fails. The funny thing is that I have 2 other configurations
 running 2.5.1p2 in parallel that doesn't exhibit this behaviour.
 Any clues?
 jf
   

-- 
° 
runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 2006
runtar: version 2.5.1p2
/usr/freeware/bin/tar version: tar (GNU tar) 1.13.25

config: stk_80-conf1
runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 2006
running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' '--directory' 
'/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' 
'--one-file-system' '--listed-incremental' 
'/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' '--sparse' 
'--ignore-failed-read' '--totals' '.' 
runtar: pid 2826413 finish time Thu Nov 23 08:04:13 2006
sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 08:03:20 
2006
sendbackup: version 2.5.1p2
Could not open conf file /opt/amanda/amanda1/etc/amanda/amanda-client.conf: 
No such file or directory
Reading conf file 
/opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf.
sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 
08:03:20 2006
  sendbackup req: GNUTAR sub101 
/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 
2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index;
  parsed request as: program `GNUTAR'
 disk `sub101'
 device 
`/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101'
 level 1
 since 2006:11:13:13:21:32
 options `|;auth=bsdtcp;index;'
sendbackup: start: yorick:sub101 lev 1
sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental from 
'/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to 
'/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new'
sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13 
13:22:26 GMT
sendbackup: time 53.246: started index creator: /usr/freeware/bin/tar -tf - 
2/dev/null | sed -e 's/^\.//'
sendbackup: time 53.254: spawning /opt/amanda/amanda1/libexec/runtar in pipeline
sendbackup: argument list: runtar stk_80-conf1 gtar --create --file - 
--directory /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 
--one-file-system --listed-incremental 
/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new --sparse 
--ignore-failed-read --totals .
sendbackup-gnutar: time 53.292: /opt/amanda/amanda1/libexec/runtar: pid 2826413
sendbackup: time 53.292: started backup
sendbackup: time 1839.833: index tee cannot write [Broken pipe]
sendbackup: time 1839.833: pid 2836285 finish time Thu Nov 23 08:34:00 2006
sendbackup: time 1839.926: 118: strange(?): sendbackup: index tee cannot write 
[Broken pipe]
sendbackup: time 9284.787: error [/usr/freeware/bin/tar got signal 15]
sendbackup: time 9284.856: pid 2836263 finish time Thu Nov 23 10:38:05 2006


Re: 2.5.1p2: planner returns error=EOF on read

2006-11-23 Thread Jean-Louis Martineau

Do amandad is running?

Could you try to find what each process (amanda,sendbackup,tar) are 
doing? on which system call they are hung?

On linux, I would use strace for that.

Jean-Louis

Jean-Francois Malouin wrote:

Jean-Louis,

The amdump just finished and because amanda ran out of tapes
(runtapes=10) it completed badly leaving stuff in the holdding
disk. I'm flushing it at the moment but I noticed that when amanda
gave up it left a lot of gnutar lying around, with amandad as parent.
I had to manually kill them. Looks like signals are not doing ok.
I've attached one example of a DLE's sendbackup and runtar debug file.
It's not the first time I notice that when a DLE fails to make it
to tape successfully processes are left running...

* Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]:
  

Could you post amandad.timestamp.debug file from yorick?

Jean-Francois Malouin wrote:


On a server running irix-6.5 and amanda 2.5.1p2
planner debug shows:

security_getdriver(name=bsdtcp) returns 4075aa8
security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP))
security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP))
security_stream_close(10053a80)
security_stream_seterr(1005e538, SOCKET_EOF)
security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF
on read from yorick)
security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_stream_close(1005e538)

and the amanda report shows things like:

 yorick  DATA_sub101   lev 0  FAILED [hmm, disk was stranded on waitq]
 planner: ERROR Request to yorick failed: EOF on read from yorick

and the backup fails. The funny thing is that I have 2 other configurations
running 2.5.1p2 in parallel that doesn't exhibit this behaviour.
Any clues?
jf
 
  


  



runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 2006
runtar: version 2.5.1p2
/usr/freeware/bin/tar version: tar (GNU tar) 1.13.25

config: stk_80-conf1
runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 2006
running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' '--directory' '/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' '--one-file-system' '--listed-incremental' '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' '--sparse' '--ignore-failed-read' '--totals' '.' 
runtar: pid 2826413 finish time Thu Nov 23 08:04:13 2006
  



sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 08:03:20 
2006
sendbackup: version 2.5.1p2
Could not open conf file /opt/amanda/amanda1/etc/amanda/amanda-client.conf: 
No such file or directory
Reading conf file 
/opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf.
sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 
08:03:20 2006
  sendbackup req: GNUTAR sub101 
/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 2006:11:13:13:21:32 
OPTIONS |;auth=bsdtcp;index;
  parsed request as: program `GNUTAR'
 disk `sub101'
 device 
`/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101'
 level 1
 since 2006:11:13:13:21:32
 options `|;auth=bsdtcp;index;'
sendbackup: start: yorick:sub101 lev 1
sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental from 
'/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to 
'/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new'
sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13 
13:22:26 GMT
sendbackup: time 53.246: started index creator: /usr/freeware/bin/tar -tf - 
2/dev/null | sed -e 's/^\.//'
sendbackup: time 53.254: spawning /opt/amanda/amanda1/libexec/runtar in pipeline
sendbackup: argument list: runtar stk_80-conf1 gtar --create --file - 
--directory /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 
--one-file-system --listed-incremental 
/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new --sparse 
--ignore-failed-read --totals .
sendbackup-gnutar: time 53.292: /opt/amanda/amanda1/libexec/runtar: pid 2826413
sendbackup: time 53.292: started backup
sendbackup: time 1839.833: index tee cannot write [Broken pipe]
sendbackup: time 1839.833: pid 2836285 finish time Thu Nov 23 08:34:00 2006
sendbackup: time 1839.926: 118: strange(?): sendbackup: index tee cannot write 
[Broken pipe]
sendbackup: time 9284.787: error [/usr/freeware/bin/tar got signal 15]
sendbackup: time 9284.856: pid 2836263 finish time Thu Nov 23 10:38:05 2006
  




2.5.1p2: planner returns error=EOF on read

2006-11-22 Thread Jean-Francois Malouin
On a server running irix-6.5 and amanda 2.5.1p2
planner debug shows:

security_getdriver(name=bsdtcp) returns 4075aa8
security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP))
security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP))
security_stream_close(10053a80)
security_stream_seterr(1005e538, SOCKET_EOF)
security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF
on read from yorick)
security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_stream_close(1005e538)

and the amanda report shows things like:

  yorick  DATA_sub101   lev 0  FAILED [hmm, disk was stranded on waitq]
  planner: ERROR Request to yorick failed: EOF on read from yorick

and the backup fails. The funny thing is that I have 2 other configurations
running 2.5.1p2 in parallel that doesn't exhibit this behaviour.
Any clues?
jf


Re: 2.5.1p2: planner returns error=EOF on read

2006-11-22 Thread Jean-Louis Martineau

Could you post amandad.timestamp.debug file from yorick?

Jean-Francois Malouin wrote:

On a server running irix-6.5 and amanda 2.5.1p2
planner debug shows:

security_getdriver(name=bsdtcp) returns 4075aa8
security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP))
security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP))
security_stream_close(10053a80)
security_stream_seterr(1005e538, SOCKET_EOF)
security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF
on read from yorick)
security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_stream_close(1005e538)

and the amanda report shows things like:

  yorick  DATA_sub101   lev 0  FAILED [hmm, disk was stranded on waitq]
  planner: ERROR Request to yorick failed: EOF on read from yorick

and the backup fails. The funny thing is that I have 2 other configurations
running 2.5.1p2 in parallel that doesn't exhibit this behaviour.
Any clues?
jf
  




Re: 2.5.1p2: planner returns error=EOF on read

2006-11-22 Thread Jean-Francois Malouin
* Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]:
 Could you post amandad.timestamp.debug file from yorick?

See the attached file.

Jean-Louis I realize now that we went through something
similar a little while ago...


 
 Jean-Francois Malouin wrote:
 On a server running irix-6.5 and amanda 2.5.1p2
 planner debug shows:
 
 security_getdriver(name=bsdtcp) returns 4075aa8
 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP))
 security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP))
 security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP))
 security_stream_close(10053a80)
 security_stream_seterr(1005e538, SOCKET_EOF)
 security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF
 on read from yorick)
 security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP))
 security_stream_close(1005e538)
 
 and the amanda report shows things like:
 
   yorick  DATA_sub101   lev 0  FAILED [hmm, disk was stranded on waitq]
   planner: ERROR Request to yorick failed: EOF on read from yorick
 
 and the backup fails. The funny thing is that I have 2 other configurations
 running 2.5.1p2 in parallel that doesn't exhibit this behaviour.
 Any clues?
 jf
   

-- 
° 
amandad: debug 1 pid 2030786 ruid 666 euid 666: start at Wed Nov 22 00:15:15 
2006
security_getdriver(name=bsdtcp) returns 5ffb5aa8
amandad: version 2.5.1p2
amandad: build: VERSION=Amanda-2.5.1p2
amandad:BUILT_DATE=Mon Nov 13 12:48:25 EST 2006
amandad:BUILT_MACH=IRIX64 yorick 6.5 01100601 IP35 CC=gcc
amandad:CONFIGURE_COMMAND='./configure' '--prefix=/opt/amanda/amanda1' 
'--with-user=amanda' '--with-group=sys' '--with-bsdtcp-security' 
'--with-amandahosts' '--with-testing=conf1' '--with-tmpdir=/tmp/amanda-conf1' 
'--with-tcpportrange=15872,16383' '--with-udpportrange=739,762' 
'--with-tape-device=/hw/tape/tps21d1nrnsv' 
'--with-changer-device=/dev/changer-l80' '--with-gnutar=/usr/freeware/bin/tar' 
'--with-tape-server=yorick' '--with-index-server=yorick' 
'--with-config=stk_80-conf1' '--with-includes=/usr/freeware/include' 
'--with-libraries=/usr/freeware/lib32'
amandad: paths: bindir=/opt/amanda/amanda1/bin
amandad:sbindir=/opt/amanda/amanda1/sbin
amandad:libexecdir=/opt/amanda/amanda1/libexec
amandad:mandir=/opt/amanda/amanda1/man
amandad:AMANDA_TMPDIR=/tmp/amanda-conf1
amandad:AMANDA_DBGDIR=/tmp/amanda-conf1
amandad:CONFIG_DIR=/opt/amanda/amanda1/etc/amanda
amandad:DEV_PREFIX=/dev/dsk/ RDEV_PREFIX=/dev/rdsk/
amandad:DUMP=/sbin/dump RESTORE=/sbin/restore VDUMP=UNDEF
amandad:VRESTORE=UNDEF XFSDUMP=/sbin/xfsdump
amandad:XFSRESTORE=/sbin/xfsrestore VXDUMP=UNDEF VXRESTORE=UNDEF
amandad:SAMBA_CLIENT=UNDEF GNUTAR=/usr/freeware/bin/tar
amandad:COMPRESS_PATH=/usr/sbin/gzip
amandad:UNCOMPRESS_PATH=/usr/sbin/gzip LPRCMD=/usr/bsd/lpr
amandad:MAILER=/usr/sbin/Mail
amandad:listed_incr_dir=/opt/amanda/amanda1/var/amanda/gnutar-lists
amandad: defs:  DEFAULT_SERVER=yorick DEFAULT_CONFIG=stk_80-conf1
amandad:DEFAULT_TAPE_SERVER=yorick
amandad:DEFAULT_TAPE_DEVICE=/hw/tape/tps21d1nrnsv HAVE_MMAP
amandad:HAVE_SYSVSHM LOCKING=POSIX_FCNTL SETPGRP_VOID DEBUG_CODE
amandad:AMANDA_DEBUG_DAYS=4 BSD_SECURITY RSH_SECURITY USE_AMANDAHOSTS
amandad:CLIENT_LOGIN=amanda FORCE_USERID HAVE_GZIP
amandad:COMPRESS_SUFFIX=.gz COMPRESS_FAST_OPT=--fast
amandad:COMPRESS_BEST_OPT=--best UNCOMPRESS_OPT=-dc
security_handleinit(handle=1001f8f0, driver=5ffb5aa8 (BSDTCP))
security_streaminit(stream=1002b3f0, driver=5ffb5aa8 (BSDTCP))
amandad: time 0.007: accept recv REQ pkt:

SERVICE noop
OPTIONS features=feff9ffeff7f;

amandad: time 0.007: creating new service: noop
OPTIONS features=feff9ffeff7f;

amandad: time 0.017: sending ACK pkt:


amandad: time 0.037: sending REP pkt:

OPTIONS features=feff9ffeff7f;

amandad: time 0.047: received ACK pkt:


security_close(handle=1001f8f0, driver=5ffb5aa8 (BSDTCP))
security_stream_close(1002b3f0)
amandad: time 0.123: tcpm_recv_token: invalid size 34624
amandad: time 0.123: pid 2030786 finish time Wed Nov 22 00:15:15 2006