Re: 2.5.1p2: planner returns error=EOF on read
Jean-Francois, Could you try the attached patch, it should kill the tar process. If it still doesn't work, could you attach a debugger to the sendbackup process to know where it is? Jean-Louis Jean-Francois Malouin wrote: * Jean-Louis Martineau [EMAIL PROTECTED] [20061123 17:20]: Do amandad is running? To be sure that I would hit the problem again I didn't load new tapes so after the holddding filled up I have a bunch of gtar running and just wasting a lot of cycles. A typical example (sorry for the long lines): amanda35559363557855 0- - 0:00 defunct amanda3557855 1 0 05:37:39 ? 1:30 /opt/amanda/amanda1/libexec/sendbackup amandad bsdtcp root35680873557855 0 05:40:02 ? 277:35 gtar --create --file - --directory /data/mafalda/mafalda1/susanita/sandra/DTI - Tracing the sendbackup process yields no output while the gtar one gives just an uninterupted flow of lines: 68mS tar(3568087): write(1, fd 63 e3 6c ff 42 3c df 87 41 82 bf e9 dd 95 2c..., 10240) errno = 32 (Broken pipe) Don't know if this is going to be of any help... jf Could you try to find what each process (amanda,sendbackup,tar) are doing? on which system call they are hung? On linux, I would use strace for that. Jean-Louis Jean-Francois Malouin wrote: Jean-Louis, The amdump just finished and because amanda ran out of tapes (runtapes=10) it completed badly leaving stuff in the holdding disk. I'm flushing it at the moment but I noticed that when amanda gave up it left a lot of gnutar lying around, with amandad as parent. I had to manually kill them. Looks like signals are not doing ok. I've attached one example of a DLE's sendbackup and runtar debug file. It's not the first time I notice that when a DLE fails to make it to tape successfully processes are left running... * Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]: Could you post amandad.timestamp.debug file from yorick? Jean-Francois Malouin wrote: On a server running irix-6.5 and amanda 2.5.1p2 planner debug shows: security_getdriver(name=bsdtcp) returns 4075aa8 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP)) security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP)) security_stream_close(10053a80) security_stream_seterr(1005e538, SOCKET_EOF) security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF on read from yorick) security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_stream_close(1005e538) and the amanda report shows things like: yorick DATA_sub101 lev 0 FAILED [hmm, disk was stranded on waitq] planner: ERROR Request to yorick failed: EOF on read from yorick and the backup fails. The funny thing is that I have 2 other configurations running 2.5.1p2 in parallel that doesn't exhibit this behaviour. Any clues? jf runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 2006 runtar: version 2.5.1p2 /usr/freeware/bin/tar version: tar (GNU tar) 1.13.25 config: stk_80-conf1 runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 2006 running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' '--directory' '/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' '--one-file-system' '--listed-incremental' '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' '--sparse' '--ignore-failed-read' '--totals' '.' runtar: pid 2826413 finish time Thu Nov 23 08:04:13 2006 sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 08:03:20 2006 sendbackup: version 2.5.1p2 Could not open conf file /opt/amanda/amanda1/etc/amanda/amanda-client.conf: No such file or directory Reading conf file /opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf. sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 08:03:20 2006 sendbackup req: GNUTAR sub101 /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index; parsed request as: program `GNUTAR' disk `sub101' device `/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' level 1 since 2006:11:13:13:21:32 options `|;auth=bsdtcp;index;' sendbackup: start: yorick:sub101 lev 1 sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental from '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13 13:22:26 GMT sendbackup: time 53.246: started index creator: /usr/freeware/bin/tar -tf -
Re: 2.5.1p2: planner returns error=EOF on read
* Jean-Louis Martineau [EMAIL PROTECTED] [20061127 09:24]: Jean-Francois, Could you try the attached patch, it should kill the tar process. If it still doesn't work, could you attach a debugger to the sendbackup process to know where it is? Thanks Jean-Louis. I'm recompiling and will let you know what happens. jf Jean-Louis Jean-Francois Malouin wrote: * Jean-Louis Martineau [EMAIL PROTECTED] [20061123 17:20]: Do amandad is running? To be sure that I would hit the problem again I didn't load new tapes so after the holddding filled up I have a bunch of gtar running and just wasting a lot of cycles. A typical example (sorry for the long lines): amanda35559363557855 0- - 0:00 defunct amanda3557855 1 0 05:37:39 ? 1:30 /opt/amanda/amanda1/libexec/sendbackup amandad bsdtcp root35680873557855 0 05:40:02 ? 277:35 gtar --create --file - --directory /data/mafalda/mafalda1/susanita/sandra/DTI - Tracing the sendbackup process yields no output while the gtar one gives just an uninterupted flow of lines: 68mS tar(3568087): write(1, fd 63 e3 6c ff 42 3c df 87 41 82 bf e9 dd 95 2c..., 10240) errno = 32 (Broken pipe) Don't know if this is going to be of any help... jf Could you try to find what each process (amanda,sendbackup,tar) are doing? on which system call they are hung? On linux, I would use strace for that. Jean-Louis Jean-Francois Malouin wrote: Jean-Louis, The amdump just finished and because amanda ran out of tapes (runtapes=10) it completed badly leaving stuff in the holdding disk. I'm flushing it at the moment but I noticed that when amanda gave up it left a lot of gnutar lying around, with amandad as parent. I had to manually kill them. Looks like signals are not doing ok. I've attached one example of a DLE's sendbackup and runtar debug file. It's not the first time I notice that when a DLE fails to make it to tape successfully processes are left running... * Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]: Could you post amandad.timestamp.debug file from yorick? Jean-Francois Malouin wrote: On a server running irix-6.5 and amanda 2.5.1p2 planner debug shows: security_getdriver(name=bsdtcp) returns 4075aa8 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP)) security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP)) security_stream_close(10053a80) security_stream_seterr(1005e538, SOCKET_EOF) security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF on read from yorick) security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_stream_close(1005e538) and the amanda report shows things like: yorick DATA_sub101 lev 0 FAILED [hmm, disk was stranded on waitq] planner: ERROR Request to yorick failed: EOF on read from yorick and the backup fails. The funny thing is that I have 2 other configurations running 2.5.1p2 in parallel that doesn't exhibit this behaviour. Any clues? jf runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 2006 runtar: version 2.5.1p2 /usr/freeware/bin/tar version: tar (GNU tar) 1.13.25 config: stk_80-conf1 runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 2006 running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' '--directory' '/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' '--one-file-system' '--listed-incremental' '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' '--sparse' '--ignore-failed-read' '--totals' '.' runtar: pid 2826413 finish time Thu Nov 23 08:04:13 2006 sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 08:03:20 2006 sendbackup: version 2.5.1p2 Could not open conf file /opt/amanda/amanda1/etc/amanda/amanda-client.conf: No such file or directory Reading conf file /opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf. sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 08:03:20 2006 sendbackup req: GNUTAR sub101 /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index; parsed request as: program `GNUTAR' disk `sub101' device `/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' level 1 since 2006:11:13:13:21:32 options `|;auth=bsdtcp;index;' sendbackup: start: yorick:sub101 lev 1 sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental from
Re: 2.5.1p2: planner returns error=EOF on read
Amanda expect tar to exit with a broken pipe. I looks that we need to kill it. Jean-Louis Jean-Francois Malouin wrote: * Jean-Louis Martineau [EMAIL PROTECTED] [20061123 17:20]: Do amandad is running? To be sure that I would hit the problem again I didn't load new tapes so after the holddding filled up I have a bunch of gtar running and just wasting a lot of cycles. A typical example (sorry for the long lines): amanda35559363557855 0- - 0:00 defunct amanda3557855 1 0 05:37:39 ? 1:30 /opt/amanda/amanda1/libexec/sendbackup amandad bsdtcp root35680873557855 0 05:40:02 ? 277:35 gtar --create --file - --directory /data/mafalda/mafalda1/susanita/sandra/DTI - Tracing the sendbackup process yields no output while the gtar one gives just an uninterupted flow of lines: 68mS tar(3568087): write(1, fd 63 e3 6c ff 42 3c df 87 41 82 bf e9 dd 95 2c..., 10240) errno = 32 (Broken pipe) Don't know if this is going to be of any help... jf Could you try to find what each process (amanda,sendbackup,tar) are doing? on which system call they are hung? On linux, I would use strace for that. Jean-Louis Jean-Francois Malouin wrote: Jean-Louis, The amdump just finished and because amanda ran out of tapes (runtapes=10) it completed badly leaving stuff in the holdding disk. I'm flushing it at the moment but I noticed that when amanda gave up it left a lot of gnutar lying around, with amandad as parent. I had to manually kill them. Looks like signals are not doing ok. I've attached one example of a DLE's sendbackup and runtar debug file. It's not the first time I notice that when a DLE fails to make it to tape successfully processes are left running... * Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]: Could you post amandad.timestamp.debug file from yorick? Jean-Francois Malouin wrote: On a server running irix-6.5 and amanda 2.5.1p2 planner debug shows: security_getdriver(name=bsdtcp) returns 4075aa8 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP)) security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP)) security_stream_close(10053a80) security_stream_seterr(1005e538, SOCKET_EOF) security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF on read from yorick) security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_stream_close(1005e538) and the amanda report shows things like: yorick DATA_sub101 lev 0 FAILED [hmm, disk was stranded on waitq] planner: ERROR Request to yorick failed: EOF on read from yorick and the backup fails. The funny thing is that I have 2 other configurations running 2.5.1p2 in parallel that doesn't exhibit this behaviour. Any clues? jf runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 2006 runtar: version 2.5.1p2 /usr/freeware/bin/tar version: tar (GNU tar) 1.13.25 config: stk_80-conf1 runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 2006 running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' '--directory' '/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' '--one-file-system' '--listed-incremental' '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' '--sparse' '--ignore-failed-read' '--totals' '.' runtar: pid 2826413 finish time Thu Nov 23 08:04:13 2006 sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 08:03:20 2006 sendbackup: version 2.5.1p2 Could not open conf file /opt/amanda/amanda1/etc/amanda/amanda-client.conf: No such file or directory Reading conf file /opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf. sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 08:03:20 2006 sendbackup req: GNUTAR sub101 /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index; parsed request as: program `GNUTAR' disk `sub101' device `/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' level 1 since 2006:11:13:13:21:32 options `|;auth=bsdtcp;index;' sendbackup: start: yorick:sub101 lev 1 sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental from '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13 13:22:26 GMT sendbackup: time 53.246: started index creator: /usr/freeware/bin/tar -tf - 2/dev/null | sed -e 's/^\.//' sendbackup: time 53.254: spawning /opt/amanda/amanda1/libexec/runtar in pipeline
Re: 2.5.1p2: planner returns error=EOF on read
* Jean-Louis Martineau [EMAIL PROTECTED] [20061123 17:20]: Do amandad is running? To be sure that I would hit the problem again I didn't load new tapes so after the holddding filled up I have a bunch of gtar running and just wasting a lot of cycles. A typical example (sorry for the long lines): amanda35559363557855 0- - 0:00 defunct amanda3557855 1 0 05:37:39 ? 1:30 /opt/amanda/amanda1/libexec/sendbackup amandad bsdtcp root35680873557855 0 05:40:02 ? 277:35 gtar --create --file - --directory /data/mafalda/mafalda1/susanita/sandra/DTI - Tracing the sendbackup process yields no output while the gtar one gives just an uninterupted flow of lines: 68mS tar(3568087): write(1, fd 63 e3 6c ff 42 3c df 87 41 82 bf e9 dd 95 2c..., 10240) errno = 32 (Broken pipe) Don't know if this is going to be of any help... jf Could you try to find what each process (amanda,sendbackup,tar) are doing? on which system call they are hung? On linux, I would use strace for that. Jean-Louis Jean-Francois Malouin wrote: Jean-Louis, The amdump just finished and because amanda ran out of tapes (runtapes=10) it completed badly leaving stuff in the holdding disk. I'm flushing it at the moment but I noticed that when amanda gave up it left a lot of gnutar lying around, with amandad as parent. I had to manually kill them. Looks like signals are not doing ok. I've attached one example of a DLE's sendbackup and runtar debug file. It's not the first time I notice that when a DLE fails to make it to tape successfully processes are left running... * Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]: Could you post amandad.timestamp.debug file from yorick? Jean-Francois Malouin wrote: On a server running irix-6.5 and amanda 2.5.1p2 planner debug shows: security_getdriver(name=bsdtcp) returns 4075aa8 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP)) security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP)) security_stream_close(10053a80) security_stream_seterr(1005e538, SOCKET_EOF) security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF on read from yorick) security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_stream_close(1005e538) and the amanda report shows things like: yorick DATA_sub101 lev 0 FAILED [hmm, disk was stranded on waitq] planner: ERROR Request to yorick failed: EOF on read from yorick and the backup fails. The funny thing is that I have 2 other configurations running 2.5.1p2 in parallel that doesn't exhibit this behaviour. Any clues? jf runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 2006 runtar: version 2.5.1p2 /usr/freeware/bin/tar version: tar (GNU tar) 1.13.25 config: stk_80-conf1 runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 2006 running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' '--directory' '/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' '--one-file-system' '--listed-incremental' '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' '--sparse' '--ignore-failed-read' '--totals' '.' runtar: pid 2826413 finish time Thu Nov 23 08:04:13 2006 sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 08:03:20 2006 sendbackup: version 2.5.1p2 Could not open conf file /opt/amanda/amanda1/etc/amanda/amanda-client.conf: No such file or directory Reading conf file /opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf. sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 08:03:20 2006 sendbackup req: GNUTAR sub101 /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index; parsed request as: program `GNUTAR' disk `sub101' device `/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' level 1 since 2006:11:13:13:21:32 options `|;auth=bsdtcp;index;' sendbackup: start: yorick:sub101 lev 1 sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental from '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13 13:22:26 GMT sendbackup: time 53.246: started index creator: /usr/freeware/bin/tar -tf - 2/dev/null | sed -e 's/^\.//' sendbackup: time 53.254: spawning /opt/amanda/amanda1/libexec/runtar in pipeline sendbackup: argument list: runtar stk_80-conf1 gtar --create
Re: 2.5.1p2: planner returns error=EOF on read
Jean-Louis, The amdump just finished and because amanda ran out of tapes (runtapes=10) it completed badly leaving stuff in the holdding disk. I'm flushing it at the moment but I noticed that when amanda gave up it left a lot of gnutar lying around, with amandad as parent. I had to manually kill them. Looks like signals are not doing ok. I've attached one example of a DLE's sendbackup and runtar debug file. It's not the first time I notice that when a DLE fails to make it to tape successfully processes are left running... * Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]: Could you post amandad.timestamp.debug file from yorick? Jean-Francois Malouin wrote: On a server running irix-6.5 and amanda 2.5.1p2 planner debug shows: security_getdriver(name=bsdtcp) returns 4075aa8 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP)) security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP)) security_stream_close(10053a80) security_stream_seterr(1005e538, SOCKET_EOF) security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF on read from yorick) security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_stream_close(1005e538) and the amanda report shows things like: yorick DATA_sub101 lev 0 FAILED [hmm, disk was stranded on waitq] planner: ERROR Request to yorick failed: EOF on read from yorick and the backup fails. The funny thing is that I have 2 other configurations running 2.5.1p2 in parallel that doesn't exhibit this behaviour. Any clues? jf -- ° runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 2006 runtar: version 2.5.1p2 /usr/freeware/bin/tar version: tar (GNU tar) 1.13.25 config: stk_80-conf1 runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 2006 running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' '--directory' '/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' '--one-file-system' '--listed-incremental' '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' '--sparse' '--ignore-failed-read' '--totals' '.' runtar: pid 2826413 finish time Thu Nov 23 08:04:13 2006 sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 08:03:20 2006 sendbackup: version 2.5.1p2 Could not open conf file /opt/amanda/amanda1/etc/amanda/amanda-client.conf: No such file or directory Reading conf file /opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf. sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 08:03:20 2006 sendbackup req: GNUTAR sub101 /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index; parsed request as: program `GNUTAR' disk `sub101' device `/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' level 1 since 2006:11:13:13:21:32 options `|;auth=bsdtcp;index;' sendbackup: start: yorick:sub101 lev 1 sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental from '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13 13:22:26 GMT sendbackup: time 53.246: started index creator: /usr/freeware/bin/tar -tf - 2/dev/null | sed -e 's/^\.//' sendbackup: time 53.254: spawning /opt/amanda/amanda1/libexec/runtar in pipeline sendbackup: argument list: runtar stk_80-conf1 gtar --create --file - --directory /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 --one-file-system --listed-incremental /opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new --sparse --ignore-failed-read --totals . sendbackup-gnutar: time 53.292: /opt/amanda/amanda1/libexec/runtar: pid 2826413 sendbackup: time 53.292: started backup sendbackup: time 1839.833: index tee cannot write [Broken pipe] sendbackup: time 1839.833: pid 2836285 finish time Thu Nov 23 08:34:00 2006 sendbackup: time 1839.926: 118: strange(?): sendbackup: index tee cannot write [Broken pipe] sendbackup: time 9284.787: error [/usr/freeware/bin/tar got signal 15] sendbackup: time 9284.856: pid 2836263 finish time Thu Nov 23 10:38:05 2006
Re: 2.5.1p2: planner returns error=EOF on read
Do amandad is running? Could you try to find what each process (amanda,sendbackup,tar) are doing? on which system call they are hung? On linux, I would use strace for that. Jean-Louis Jean-Francois Malouin wrote: Jean-Louis, The amdump just finished and because amanda ran out of tapes (runtapes=10) it completed badly leaving stuff in the holdding disk. I'm flushing it at the moment but I noticed that when amanda gave up it left a lot of gnutar lying around, with amandad as parent. I had to manually kill them. Looks like signals are not doing ok. I've attached one example of a DLE's sendbackup and runtar debug file. It's not the first time I notice that when a DLE fails to make it to tape successfully processes are left running... * Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]: Could you post amandad.timestamp.debug file from yorick? Jean-Francois Malouin wrote: On a server running irix-6.5 and amanda 2.5.1p2 planner debug shows: security_getdriver(name=bsdtcp) returns 4075aa8 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP)) security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP)) security_stream_close(10053a80) security_stream_seterr(1005e538, SOCKET_EOF) security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF on read from yorick) security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_stream_close(1005e538) and the amanda report shows things like: yorick DATA_sub101 lev 0 FAILED [hmm, disk was stranded on waitq] planner: ERROR Request to yorick failed: EOF on read from yorick and the backup fails. The funny thing is that I have 2 other configurations running 2.5.1p2 in parallel that doesn't exhibit this behaviour. Any clues? jf runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 2006 runtar: version 2.5.1p2 /usr/freeware/bin/tar version: tar (GNU tar) 1.13.25 config: stk_80-conf1 runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 2006 running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' '--directory' '/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' '--one-file-system' '--listed-incremental' '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' '--sparse' '--ignore-failed-read' '--totals' '.' runtar: pid 2826413 finish time Thu Nov 23 08:04:13 2006 sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 08:03:20 2006 sendbackup: version 2.5.1p2 Could not open conf file /opt/amanda/amanda1/etc/amanda/amanda-client.conf: No such file or directory Reading conf file /opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf. sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 08:03:20 2006 sendbackup req: GNUTAR sub101 /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index; parsed request as: program `GNUTAR' disk `sub101' device `/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' level 1 since 2006:11:13:13:21:32 options `|;auth=bsdtcp;index;' sendbackup: start: yorick:sub101 lev 1 sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental from '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13 13:22:26 GMT sendbackup: time 53.246: started index creator: /usr/freeware/bin/tar -tf - 2/dev/null | sed -e 's/^\.//' sendbackup: time 53.254: spawning /opt/amanda/amanda1/libexec/runtar in pipeline sendbackup: argument list: runtar stk_80-conf1 gtar --create --file - --directory /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 --one-file-system --listed-incremental /opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new --sparse --ignore-failed-read --totals . sendbackup-gnutar: time 53.292: /opt/amanda/amanda1/libexec/runtar: pid 2826413 sendbackup: time 53.292: started backup sendbackup: time 1839.833: index tee cannot write [Broken pipe] sendbackup: time 1839.833: pid 2836285 finish time Thu Nov 23 08:34:00 2006 sendbackup: time 1839.926: 118: strange(?): sendbackup: index tee cannot write [Broken pipe] sendbackup: time 9284.787: error [/usr/freeware/bin/tar got signal 15] sendbackup: time 9284.856: pid 2836263 finish time Thu Nov 23 10:38:05 2006
2.5.1p2: planner returns error=EOF on read
On a server running irix-6.5 and amanda 2.5.1p2 planner debug shows: security_getdriver(name=bsdtcp) returns 4075aa8 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP)) security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP)) security_stream_close(10053a80) security_stream_seterr(1005e538, SOCKET_EOF) security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF on read from yorick) security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_stream_close(1005e538) and the amanda report shows things like: yorick DATA_sub101 lev 0 FAILED [hmm, disk was stranded on waitq] planner: ERROR Request to yorick failed: EOF on read from yorick and the backup fails. The funny thing is that I have 2 other configurations running 2.5.1p2 in parallel that doesn't exhibit this behaviour. Any clues? jf
Re: 2.5.1p2: planner returns error=EOF on read
Could you post amandad.timestamp.debug file from yorick? Jean-Francois Malouin wrote: On a server running irix-6.5 and amanda 2.5.1p2 planner debug shows: security_getdriver(name=bsdtcp) returns 4075aa8 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP)) security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP)) security_stream_close(10053a80) security_stream_seterr(1005e538, SOCKET_EOF) security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF on read from yorick) security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_stream_close(1005e538) and the amanda report shows things like: yorick DATA_sub101 lev 0 FAILED [hmm, disk was stranded on waitq] planner: ERROR Request to yorick failed: EOF on read from yorick and the backup fails. The funny thing is that I have 2 other configurations running 2.5.1p2 in parallel that doesn't exhibit this behaviour. Any clues? jf
Re: 2.5.1p2: planner returns error=EOF on read
* Jean-Louis Martineau [EMAIL PROTECTED] [20061122 09:32]: Could you post amandad.timestamp.debug file from yorick? See the attached file. Jean-Louis I realize now that we went through something similar a little while ago... Jean-Francois Malouin wrote: On a server running irix-6.5 and amanda 2.5.1p2 planner debug shows: security_getdriver(name=bsdtcp) returns 4075aa8 security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP)) security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP)) security_stream_close(10053a80) security_stream_seterr(1005e538, SOCKET_EOF) security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF on read from yorick) security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP)) security_stream_close(1005e538) and the amanda report shows things like: yorick DATA_sub101 lev 0 FAILED [hmm, disk was stranded on waitq] planner: ERROR Request to yorick failed: EOF on read from yorick and the backup fails. The funny thing is that I have 2 other configurations running 2.5.1p2 in parallel that doesn't exhibit this behaviour. Any clues? jf -- ° amandad: debug 1 pid 2030786 ruid 666 euid 666: start at Wed Nov 22 00:15:15 2006 security_getdriver(name=bsdtcp) returns 5ffb5aa8 amandad: version 2.5.1p2 amandad: build: VERSION=Amanda-2.5.1p2 amandad:BUILT_DATE=Mon Nov 13 12:48:25 EST 2006 amandad:BUILT_MACH=IRIX64 yorick 6.5 01100601 IP35 CC=gcc amandad:CONFIGURE_COMMAND='./configure' '--prefix=/opt/amanda/amanda1' '--with-user=amanda' '--with-group=sys' '--with-bsdtcp-security' '--with-amandahosts' '--with-testing=conf1' '--with-tmpdir=/tmp/amanda-conf1' '--with-tcpportrange=15872,16383' '--with-udpportrange=739,762' '--with-tape-device=/hw/tape/tps21d1nrnsv' '--with-changer-device=/dev/changer-l80' '--with-gnutar=/usr/freeware/bin/tar' '--with-tape-server=yorick' '--with-index-server=yorick' '--with-config=stk_80-conf1' '--with-includes=/usr/freeware/include' '--with-libraries=/usr/freeware/lib32' amandad: paths: bindir=/opt/amanda/amanda1/bin amandad:sbindir=/opt/amanda/amanda1/sbin amandad:libexecdir=/opt/amanda/amanda1/libexec amandad:mandir=/opt/amanda/amanda1/man amandad:AMANDA_TMPDIR=/tmp/amanda-conf1 amandad:AMANDA_DBGDIR=/tmp/amanda-conf1 amandad:CONFIG_DIR=/opt/amanda/amanda1/etc/amanda amandad:DEV_PREFIX=/dev/dsk/ RDEV_PREFIX=/dev/rdsk/ amandad:DUMP=/sbin/dump RESTORE=/sbin/restore VDUMP=UNDEF amandad:VRESTORE=UNDEF XFSDUMP=/sbin/xfsdump amandad:XFSRESTORE=/sbin/xfsrestore VXDUMP=UNDEF VXRESTORE=UNDEF amandad:SAMBA_CLIENT=UNDEF GNUTAR=/usr/freeware/bin/tar amandad:COMPRESS_PATH=/usr/sbin/gzip amandad:UNCOMPRESS_PATH=/usr/sbin/gzip LPRCMD=/usr/bsd/lpr amandad:MAILER=/usr/sbin/Mail amandad:listed_incr_dir=/opt/amanda/amanda1/var/amanda/gnutar-lists amandad: defs: DEFAULT_SERVER=yorick DEFAULT_CONFIG=stk_80-conf1 amandad:DEFAULT_TAPE_SERVER=yorick amandad:DEFAULT_TAPE_DEVICE=/hw/tape/tps21d1nrnsv HAVE_MMAP amandad:HAVE_SYSVSHM LOCKING=POSIX_FCNTL SETPGRP_VOID DEBUG_CODE amandad:AMANDA_DEBUG_DAYS=4 BSD_SECURITY RSH_SECURITY USE_AMANDAHOSTS amandad:CLIENT_LOGIN=amanda FORCE_USERID HAVE_GZIP amandad:COMPRESS_SUFFIX=.gz COMPRESS_FAST_OPT=--fast amandad:COMPRESS_BEST_OPT=--best UNCOMPRESS_OPT=-dc security_handleinit(handle=1001f8f0, driver=5ffb5aa8 (BSDTCP)) security_streaminit(stream=1002b3f0, driver=5ffb5aa8 (BSDTCP)) amandad: time 0.007: accept recv REQ pkt: SERVICE noop OPTIONS features=feff9ffeff7f; amandad: time 0.007: creating new service: noop OPTIONS features=feff9ffeff7f; amandad: time 0.017: sending ACK pkt: amandad: time 0.037: sending REP pkt: OPTIONS features=feff9ffeff7f; amandad: time 0.047: received ACK pkt: security_close(handle=1001f8f0, driver=5ffb5aa8 (BSDTCP)) security_stream_close(1002b3f0) amandad: time 0.123: tcpm_recv_token: invalid size 34624 amandad: time 0.123: pid 2030786 finish time Wed Nov 22 00:15:15 2006