Re: 2.6.6-rc2 and newer cause trouble with amanda
Quoting Andreas Sundstrom [EMAIL PROTECTED]:

> Eric Siegerman wrote:
>> On Sun, Jun 13, 2004 at 10:41:14PM +0200, Andreas Sundstrom wrote:
>>> --with-user=root \
>>> --with-group=root
>>
>> A stab in the dark here: these settings seem a bit suspicious. Normally one doesn't run Amanda as root; it's able to get root privilege when it needs it (that's what the little runtar and rundump programs are for). I don't know anything about 2.6-series kernels, but maybe some change snuck in that interferes with Amanda trying to run as root.
>>
>> Even if that turns out not to be the source of your 2.6.6 problem, for security reasons I'd suggest running Amanda as some other user.
>
> I'll give it a shot. I had actually meant to do that anyway, but since everything has worked flawlessly I didn't bother. But I guess you can never have too much security anyway.
>
> /Andreas

Well, now I have created a user named amanda which has default group membership disk and is also a member of users. Then I recompiled with user=amanda, group=disk and installed.

I had some trouble getting it to work, but I eventually found out that when run with xinetd this statement is needed: groups = yes.

All went well and the backup finished successfully. Then I switched to my identically compiled 2.6.6-rc2 kernel and it fails with the same error as earlier:

These dumps were to tape dflt10.
The next tape Amanda expects to use is: dflt11.
The next new tape already labelled is: dflt12.

FAILURE AND STRANGE DUMP SUMMARY:
  zappa.zapp /imagelib    lev 1 FAILED [bad CONNECT response]
  zappa.zapp /boot        lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /home/sunkan lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /var         lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /            lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /home/emelie lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /apps        lev 0 FAILED [bad CONNECT response]

So I tarred up the /tmp/amanda dir and put it at ftp://zappa.cx/pub/amanda-zappa.cx.tar.gz if anyone wants to take a look at it. This is the complete failed session and nothing else.

Thanks for the help and pointers so far, everyone.

/Andreas
Re: 2.6.6-rc2 and newer cause trouble with amanda
Andreas Sundstrom wrote:
> All went well and the backup finished successfully. Then I switched to my identically compiled 2.6.6-rc2 kernel and it fails with the same error as earlier:
>
> These dumps were to tape dflt10.
> The next tape Amanda expects to use is: dflt11.
> The next new tape already labelled is: dflt12.
>
> FAILURE AND STRANGE DUMP SUMMARY:
>   zappa.zapp /imagelib    lev 1 FAILED [bad CONNECT response]
>   zappa.zapp /boot        lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
>   zappa.zapp /home/sunkan lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
>   zappa.zapp /var         lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
>   zappa.zapp /            lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
>   zappa.zapp /home/emelie lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
>   zappa.zapp /apps        lev 0 FAILED [bad CONNECT response]
>
> So I tarred up the /tmp/amanda dir and put it at ftp://zappa.cx/pub/amanda-zappa.cx.tar.gz if anyone wants to take a look at it. This is the complete failed session and nothing else.

In /tmp/amanda are mostly client files. The debug files are all, as far as I can see, from the successful session. When failing, the client somehow does not even get started, or did I miss that file?

I am missing the amdump.1 and log.2004 files, which are the server-side logs (in ~amanda/dflt). But I don't expect anything shocking in those files anyway.

As it seems to be network related, I'm still more interested to know whether the client had trouble sending the packets or the server had trouble receiving them. A tcpdump trace of the failing session would be really helpful.
--
Paul Bijnens, Xplanation                           Tel +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM   Fax +32 16 397.512
http://www.xplanation.com/                         email: [EMAIL PROTECTED]
***
* I think I've got the hang of it now: exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt, abort, hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e, kill -1 $$, shutdown, *
* kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ... Are you sure? ... YES ... Phew ... I'm out *
***
Re: 2.6.6-rc2 and newer cause trouble with amanda
Hi, Andreas,

on Tuesday, 15 June 2004 at 09:29 you wrote to amanda-users:

AS> Well, now I have created a user named amanda which has default group membership
AS> disk and is also a member of users. Then I recompiled with user=amanda,
AS> group=disk and installed.

AS> I had some trouble getting it to work, but I eventually found out that when run
AS> with xinetd this statement is needed: groups = yes.

AS> All went well and the backup finished successfully.

Fine ;-)

AS> Then I switched to my
AS> identically compiled 2.6.6-rc2 kernel and it fails with the same error as earlier:

AS> These dumps were to tape dflt10.
AS> The next tape Amanda expects to use is: dflt11.
AS> The next new tape already labelled is: dflt12.

AS> FAILURE AND STRANGE DUMP SUMMARY:
AS>   zappa.zapp /imagelib    lev 1 FAILED [bad CONNECT response]
AS>   zappa.zapp /boot        lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
AS>   zappa.zapp /home/sunkan lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
AS>   zappa.zapp /var         lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
AS>   zappa.zapp /            lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
AS>   zappa.zapp /home/emelie lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
AS>   zappa.zapp /apps        lev 0 FAILED [bad CONNECT response]

AS> So I tarred up the /tmp/amanda dir and put it at
AS> ftp://zappa.cx/pub/amanda-zappa.cx.tar.gz if anyone wants to take a look at it.
AS> This is the complete failed session and nothing else.

AS> Thanks for the help and pointers so far, everyone.

I think you just ran out of TCP ports. I don't know exactly whether that is kernel-related, but you could try to reconfigure amanda with the options:

--with-tcpportrange=5,50040
--with-udpportrange=890,899

(Substitute your preferred port numbers.)

This would specify the range to use and rule out any differences between kernel releases in how this is handled.

I have had these options in my amanda configure script for quite some time now and have never hit these problems with any 2.6 kernel.

You could also read the docs/PORT-USAGE document for more information on port handling.

Give it a try and let us know.

Apart from that, the server-side logs would be interesting, but I suggest doing that reconfigured run first.

--
best regards,
Stefan

Stefan G. Weichinger
mailto:[EMAIL PROTECTED]
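As a rough illustration of what pinning the port range makes possible, the sketch below checks a list of port numbers against a range written in the same `low,high` form as the configure options. The function names and the sample port list are made up for this example and are not part of Amanda; the range string is copied as quoted in the message.

```python
def parse_port_range(spec):
    """Parse a 'low,high' string in the form used by --with-tcpportrange."""
    low_text, high_text = spec.split(",")
    low, high = int(low_text), int(high_text)
    if not (0 < low <= high <= 65535):
        raise ValueError(f"invalid port range: {spec!r}")
    return low, high


def ports_outside_range(ports, spec):
    """Return the ports that fall outside the configured range."""
    low, high = parse_port_range(spec)
    return [p for p in ports if not low <= p <= high]


# Range string exactly as quoted in the message; the port list is hypothetical.
tcp_range = "5,50040"
sample_ports = [50001, 50002, 65535]
print(ports_outside_range(sample_ports, tcp_range))
```

With a fixed range, any port seen in a trace that lies outside it stands out immediately, which is the point of ruling out kernel-dependent port selection.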
Re: 2.6.6-rc2 and newer cause trouble with amanda
Quoting Paul Bijnens [EMAIL PROTECTED]:

> Andreas Sundstrom wrote:
>> All went well and the backup finished successfully. Then I switched to my identically compiled 2.6.6-rc2 kernel and it fails with the same error as earlier:
>>
>> These dumps were to tape dflt10.
>> The next tape Amanda expects to use is: dflt11.
>> The next new tape already labelled is: dflt12.
>>
>> FAILURE AND STRANGE DUMP SUMMARY:
>>   zappa.zapp /imagelib    lev 1 FAILED [bad CONNECT response]
>>   zappa.zapp /boot        lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
>>   zappa.zapp /home/sunkan lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
>>   zappa.zapp /var         lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
>>   zappa.zapp /            lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
>>   zappa.zapp /home/emelie lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
>>   zappa.zapp /apps        lev 0 FAILED [bad CONNECT response]
>>
>> So I tarred up the /tmp/amanda dir and put it at ftp://zappa.cx/pub/amanda-zappa.cx.tar.gz if anyone wants to take a look at it. This is the complete failed session and nothing else.
>
> In /tmp/amanda are mostly client files. The debug files are all, as far as I can see, from the successful session. When failing, the client somehow does not even get started, or did I miss that file?
>
> I am missing the amdump.1 and log.2004 files, which are the server-side logs (in ~amanda/dflt). But I don't expect anything shocking in those files anyway.
>
> As it seems to be network related, I'm still more interested to know whether the client had trouble sending the packets or the server had trouble receiving them. A tcpdump trace of the failing session would be really helpful.

I have tarred them up into the file ftp://zappa.cx/pub/amdump-zappa.cx.tar.gz; have a look if you have time for it.

Also, what do you recommend as parameters to tcpdump? It's all running on the same host; does that mean I can sniff on lo?

/Andreas
Re: 2.6.6-rc2 and newer cause trouble with amanda
Andreas Sundstrom wrote:
> Also, what do you recommend as parameters to tcpdump? It's all running on the same host; does that mean I can sniff on lo?

This works for me on Linux 2.4.22:

    sudo tcpdump -i lo -w trace.lo

--
Paul Bijnens
Re: 2.6.6-rc2 and newer cause trouble with amanda
Quoting Paul Bijnens [EMAIL PROTECTED]:

> Andreas Sundstrom wrote:
>> Also, what do you recommend as parameters to tcpdump? It's all running on the same host; does that mean I can sniff on lo?
>
> This works for me on Linux 2.4.22:
>
>     sudo tcpdump -i lo -w trace.lo

I have now made a dump of the traffic passing through lo during the amdump. I have also recompiled amanda with these two settings (as another friendly person suggested):

--with-tcpportrange=5,50040
--with-udpportrange=890,899

The dump is available at ftp://zappa.cx/pub/amanda.pcap

I used this filter in ethereal to get rid of some other stuff that I suppose is not interesting:

! dns and ! tcp.port == 25

I'm not that used to looking at packet dumps and I have never looked at a working amanda dump, so have a look and see if it gives you any more clues about what is going on.

/Andreas
Re: 2.6.6-rc2 and newer cause trouble with amanda
Hi, Andreas,

on Tuesday, 15 June 2004 at 13:32 you wrote to amanda-users:

AS> I have now made a dump of the traffic passing through lo during the amdump.
AS> I have also recompiled amanda with these two settings (as another friendly person
AS> suggested):
AS> --with-tcpportrange=5,50040
AS> --with-udpportrange=890,899

Did it run through now? What was the report?

AS> The dump is available at ftp://zappa.cx/pub/amanda.pcap
AS> I used this filter in ethereal to get rid of some other stuff that I suppose is
AS> not interesting:
AS> ! dns and ! tcp.port == 25

AS> I'm not that used to looking at packet dumps and I have never looked at a
AS> working amanda dump, so have a look and see if it gives you any more clues about
AS> what is going on.

I am not so used to that, either. Looks pretty good to me so far... it uses the assigned ports and such.

Maybe Paul sees more in it ...

--
best regards,
Stefan

Stefan G. Weichinger
mailto:[EMAIL PROTECTED]
[no subject]
Stefan G. Weichinger wrote:
> Hi, Andreas,
>
> on Tuesday, 15 June 2004 at 13:32 you wrote to amanda-users:
>
> AS> I have now made a dump of the traffic passing through lo during the amdump.
> AS> I have also recompiled amanda with these two settings (as another friendly person
> AS> suggested):
> AS> --with-tcpportrange=5,50040
> AS> --with-udpportrange=890,899
>
> Did it run through now? What was the report?

Sorry, the report and result were the same as earlier; I forgot to mention that.

> AS> The dump is available at ftp://zappa.cx/pub/amanda.pcap
> AS> I used this filter in ethereal to get rid of some other stuff that I suppose is
> AS> not interesting:
> AS> ! dns and ! tcp.port == 25
>
> AS> I'm not that used to looking at packet dumps and I have never looked at a
> AS> working amanda dump, so have a look and see if it gives you any more clues about
> AS> what is going on.
>
> I am not so used to that, either. Looks pretty good to me so far... it uses the assigned ports and such.

It looks to me as if the TCP sessions don't transfer any actual data, but I don't understand why.

> Maybe Paul sees more in it ...

I hope he does.
Re: 2.6.6-rc2 and newer cause trouble with amanda
Andreas Sundstrom wrote:
> I have now made a dump of the traffic passing through lo during the amdump. I have also recompiled amanda with these two settings (as another friendly person suggested):
> --with-tcpportrange=5,50040
> --with-udpportrange=890,899
>
> The dump is available at ftp://zappa.cx/pub/amanda.pcap
>
> I used this filter in ethereal to get rid of some other stuff that I suppose is not interesting:
> ! dns and ! tcp.port == 25

It seems ethereal stores only the first few bytes of each packet. There is probably an option to set that size, similar to tcpdump -s 1500. That means I don't have the full info, but I believe I've seen enough! There is something strange indeed.

13:00:30.573676 192.168.20.100.amanda > 192.168.20.100.890: udp 125 (DF)
0x0000   4500 0099 008d 4000 4011 8fae c0a8 1464   E.....@.@......d
0x0010   c0a8 1464 2760 037a 0085 aebc 416d 616e   ...d'`.z....Aman
0x0020   6461 2032 2e34 2052 4550 2048 414e 444c   da.2.4.REP.HANDL
0x0030   4520 3030 302d 4338 4443 3036 3038 2053   E.000-C8DC0608.S
0x0040   4551 2031 3038 3732 3937 3037 310a 434f   EQ.1087297071.CO
0x0050   4e4e                                      NN
13:00:30.574146 192.168.20.100.890 > 192.168.20.100.amanda: udp 50 (DF)
0x0000   4500 004e 0001 4000 4011 9085 c0a8 1464   E..N..@.@......d
0x0010   c0a8 1464 037a 2760 003a b654 416d 616e   ...d.z'`.:.TAman
0x0020   6461 2032 2e34 2041 434b 2048 414e 444c   da.2.4.ACK.HANDL
0x0030   4520 3030 302d 4338 4443 3036 3038 2053   E.000-C8DC0608.S
0x0040   4551 2031 3038 3732 3937 3037 310a        EQ.1087297071.

The above was the request to set up the TCP connections. The tracer did not dump the whole packet; it broke off after CONN, which is followed by the TCP port numbers, but you will find those in one of the amandad..debug logs as well. Normally, there should be three consecutive numbers.

The three TCP port numbers are used in the next exchange to set up three TCP connections, for data, error, and index respectively.

13:00:45.622602 192.168.20.100.50027 > 192.168.20.100.50001: S ...
13:00:45.622696 192.168.20.100.50001 > 192.168.20.100.50027: S ...
13:00:45.622815 192.168.20.100.50027 > 192.168.20.100.50001: . ack ...

This was the handshake for the first connection: the data connection. (I have shortened the lines to fit on screen.) It connected to port 50001.

13:00:45.625090 192.168.20.100.50028 > 192.168.20.100.50002: S ...
13:00:45.625165 192.168.20.100.50002 > 192.168.20.100.50028: S ...
13:00:45.625237 192.168.20.100.50028 > 192.168.20.100.50002: . ack ...

The second handshake, to port 50002, is for the error messages.

13:00:45.627502 192.168.20.100.50029 > 192.168.20.100.65535: S ...
13:00:45.627564 192.168.20.100.65535 > 192.168.20.100.50029: R ...

You would expect a handshake to port 50003, for the index, but instead there is a connection to port 65535, which is rejected.

13:00:45.628082 192.168.20.100.50027 > 192.168.20.100.50001: F ...
13:00:45.628177 192.168.20.100.50028 > 192.168.20.100.50002: F ...
13:00:45.628849 192.168.20.100.50001 > 192.168.20.100.50027: . ack ...
13:00:45.628880 192.168.20.100.50002 > 192.168.20.100.50028: . ack ...

And amanda cleans up the other two connections.

Amanda tries again with another set of ports a few times, but always tries to connect to 65535 for the index. Then she gives up completely.

Can you verify in the amandad..log that the index connection was indeed requested on port 50003? The debug file looks like this (search for the string CONNECT):

    Amanda 2.4 REP HANDLE 000-C8DC0608 SEQ 1087282569
    CONNECT DATA 32771 MESG 32772 INDEX 32773
    OPTIONS features=feff9ffe0f;

The next thing to find out is who/why/when decided to connect to port 65535 instead. Also note that number: all 1-bits, 16 bits wide. A kernel bug is indeed one of the possibilities.

Just for fun: if you disable the indexing, then the backup will run fine, I believe (index no in dumptype).
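The check Paul asks for can be sketched in a few lines. The snippet below pulls the three port numbers out of a reply shaped like the sample he quotes and tests whether they are consecutive. The function names are invented for this example, and it assumes the CONNECT line always has exactly the `DATA … MESG … INDEX …` layout shown in the message; real debug files may differ. The last lines only illustrate the arithmetic behind "all 1-bits, 16 bits wide" and are not a diagnosis of where 65535 came from.

```python
import re

# Assumed layout of the CONNECT line, taken from the sample quoted in the message.
CONNECT_RE = re.compile(r"CONNECT DATA (\d+) MESG (\d+) INDEX (\d+)")


def parse_connect_ports(reply):
    """Extract the (data, mesg, index) ports from the text of a REP packet."""
    match = CONNECT_RE.search(reply)
    if match is None:
        raise ValueError("no CONNECT line found")
    return tuple(int(p) for p in match.groups())


def are_consecutive(ports):
    """True when each port is exactly one higher than the previous one."""
    return all(b - a == 1 for a, b in zip(ports, ports[1:]))


# Reply text as quoted in the message.
reply = (
    "Amanda 2.4 REP HANDLE 000-C8DC0608 SEQ 1087282569\n"
    "CONNECT DATA 32771 MESG 32772 INDEX 32773\n"
    "OPTIONS features=feff9ffe0f;\n"
)
ports = parse_connect_ports(reply)
print(ports, are_consecutive(ports))

# The ports actually contacted in the trace do not form such a run.
print(are_consecutive((50001, 50002, 65535)))

# 65535 is the largest 16-bit value; it is also what -1 becomes when
# truncated to an unsigned 16-bit port number.
print(65535 == 0xFFFF, (-1) & 0xFFFF)
```

Comparing the ports announced in the debug file with the ports seen on the wire would show whether the wrong number is already present in the reply or only appears when the connection is made.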
Re: 2.6.6-rc2 and newer cause trouble with amanda
Paul Bijnens wrote:
> It seems ethereal stores only the first few bytes of each packet. There is probably an option to set that size, similar to tcpdump -s 1500. That means I don't have the full info, but I believe I've seen enough!

Following up on myself. I've dug into the amanda source and noticed there is only one place where "bad CONNECT response" is triggered. That happens only for the data or message TCP connections.

But from the amdump.1 log file, it does not seem to be consistent: now and then it is "bad CONNECT response", other times it is the index that fails. So it is not always the index that fails, as I concluded earlier.

Therefore I would really like to have a complete trace, not cut off, together with the accompanying amandad..debug files and amdump.1, all from the same run.

--
Paul Bijnens
warning - last level 0 overwritten
Amanda 2.4.4p1
Solaris 8

I realize that this is probably not the current version of amanda, but it might represent an ongoing issue.

We haven't been backing up a particular partition; it's relatively large given the size of the tape drive (22 Gig used on a 32 Gig partition; the tape drive is a DLT 7000 and we use software compression).

Last night's amanda report warned us that we would overwrite the last level 0 dump in one day.

Being concerned about this, we thought we'd pull the tape with the level 0 from the pool (replacing it with one of the same label) until we were able to correct the no-dump issue.

However, an amadmin find shows us that the level 0 dumps all had short tape writes and that we had nothing better than a level 1 available.

Does amanda record the date of the last level 0 as a success before the tape write has actually completed?

thanks,

Brian
---
Brian R Cuttler                 [EMAIL PROTECTED]
Computer Systems Support        (v) 518 486-1697
Wadsworth Center                (f) 518 473-6384
NYS Department of Health        Help Desk 518 473-0773
Re: 2.6.6-rc2 and newer cause trouble with amanda
Paul Bijnens wrote:
> Paul Bijnens wrote:
>> It seems ethereal stores only the first few bytes of each packet. There is probably an option to set that size, similar to tcpdump -s 1500. That means I don't have the full info, but I believe I've seen enough!
>
> Following up on myself. I've dug into the amanda source and noticed there is only one place where "bad CONNECT response" is triggered. That happens only for the data or message TCP connections.
>
> But from the amdump.1 log file, it does not seem to be consistent: now and then it is "bad CONNECT response", other times it is the index that fails. So it is not always the index that fails, as I concluded earlier.
>
> Therefore I would really like to have a complete trace, not cut off, together with the accompanying amandad..debug files and amdump.1, all from the same run.

Sure, here's a new run, this time with -s 1500 so all packets will hopefully be intact.

All files are in: ftp://zappa.cx/pub/amanda-zappa.cx-2.tar.gz

Note that amdump.1 and log.20040615.6 reside in var/lib/amanda/dflt in my configuration.

/Andreas
Re: directory permissions cause tar to segfault?
Hi Eric,

I had a similar problem with several versions of gtar (1.13, 1.13.25 and 1.14) on Solaris 8. By running the gtar command used by amanda interactively with the -v flag, I found that gtar was SEGV-ing on a specific user directory. When I added that directory to the amanda exclude file, gtar worked fine, and the amanda backup succeeded.

This didn't really solve the problem, but it gives me some comfort that it was caused by gtar, and not amanda. I suppose running a debug version of gtar could shed some light on what is causing the SEGV.

HTH,
Marc.

On Fri, 2004-06-11 at 11:01, Eric Sproul wrote:
> On Fri, 2004-06-11 at 10:44, Jon LaBadie wrote:
>> Well, I feel it shouldn't segfault under any conditions.
>
> Hi Jon,
> I completely agree.
>
>> Just two considerations, is /bin/tar the one amanda uses?
>
> Yes. I copied the command directly from the sendsize debug, but I was confused because it mentions both /bin/tar and /usr/lib/amanda/runtar:
>
> sendsize[29880]: time 0.500: getting size via gnutar for md0 level 0
> sendsize[29880]: time 0.501: spawning /usr/lib/amanda/runtar in pipeline
> sendsize[29880]: argument list: /bin/tar (... rest of the options)
>
> I found that when I tested from the command line I needed to leave out /bin/tar. See below.
>
>> And I'm pretty sure that when run by amanda, tar is setuid root. It does most things as the amanda-user, then invokes tar with a setuid program called runtar. You might check that perms on runtar are unchanged.
>
> This host uses the Debian package, amanda-client_2.4.4p2-1. The runtar binary looks correct. I installed this version of amanda-client on 2/25, so a mod-date of 2/15 is reasonable, as we are tracking Debian-sid.
>
> # ls -l /usr/lib/amanda/runtar
> -rwsr-xr-- 1 root backup 4716 Feb 15 21:44 /usr/lib/amanda/runtar
>
> I tried as user backup running the estimate command using runtar instead, and still got a segfault.
>
> backup@HOST:~$ /usr/lib/amanda/runtar --create --file /dev/null \
>     --directory /usr --one-file-system --listed-incremental \
>     /var/lib/amanda/gnutar-lists/dhcp02md0_1.new --sparse \
>     --ignore-failed-read --totals --exclude-from \
>     /tmp/amanda/sendsize.md0.20040609041435000.exclude .
> Segmentation fault
>
> But... it didn't mention the directory permissions this time, as runtar is suid-root (which you noted). The segfault seems to be coming from somewhere else then. I guess I'll have to dig deeper into what is actually being backed up. Maybe one of my files is causing this.
>
> Thanks,
> Eric
Re: 2.6.6-rc2 and newer cause trouble with amanda
Hi, Paul,

on Tuesday, 15 June 2004 at 16:49 you wrote to amanda-users:

PB> And amanda cleans up the other two connections.

PB> Amanda tries again with another set of ports a few times,
PB> but always tries to connect to 65535 for the index.
PB> Then she gives up completely.

I also noticed the strange usage of 65535 ...

PB> Can you verify in the amandad..log that the index connection
PB> was indeed requested on port 50003?
PB> The debug file looks like this (search for the string CONNECT):
PB>
PB>     Amanda 2.4 REP HANDLE 000-C8DC0608 SEQ 1087282569
PB>     CONNECT DATA 32771 MESG 32772 INDEX 32773
PB>     OPTIONS features=feff9ffe0f;
PB>
PB> The next thing to find out is who/why/when decided to connect to port
PB> 65535 instead. Also note that number: all 1-bits, 16 bits wide.
PB> A kernel bug is indeed one of the possibilities.

Where could that one reside? A faulty network module? Remember that these things work fine here with each 2.6 kernel.

Andreas, you could give me your .config, I will take a look ...

Just to update me: did you also try this with 2.6.6?

Have you tried to really start over with your kernel tree? Did you get to 2.6.6 via patching or via downloading the whole tree?

If patched, did you look for .rej files? Did you use make clean or make distclean?

Also, the modules could be faulty ... any fancy network hardware used?

I would suggest a complete re-install of a vanilla 2.6.6 kernel and testing with that one. In my opinion it doesn't make much sense to look for bugs in that rc.

--
best regards,
Stefan

Stefan G. Weichinger
mailto:[EMAIL PROTECTED]
Re: 2.6.6-rc2 and newer cause trouble with amanda
Stefan G. Weichinger wrote:
> Hi, Paul,
>
> on Tuesday, 15 June 2004 at 16:49 you wrote to amanda-users:
>
> PB> And amanda cleans up the other two connections.
>
> PB> Amanda tries again with another set of ports a few times,
> PB> but always tries to connect to 65535 for the index.
> PB> Then she gives up completely.
>
> I also noticed the strange usage of 65535 ...
>
> PB> Can you verify in the amandad..log that the index connection
> PB> was indeed requested on port 50003?
> PB> The debug file looks like this (search for the string CONNECT):
> PB>
> PB>     Amanda 2.4 REP HANDLE 000-C8DC0608 SEQ 1087282569
> PB>     CONNECT DATA 32771 MESG 32772 INDEX 32773
> PB>     OPTIONS features=feff9ffe0f;
> PB>
> PB> The next thing to find out is who/why/when decided to connect to port
> PB> 65535 instead. Also note that number: all 1-bits, 16 bits wide.
> PB> A kernel bug is indeed one of the possibilities.
>
> Where could that one reside? A faulty network module? Remember that these things work fine here with each 2.6 kernel.

I'm almost sure it's some kind of kernel bug. That's why I have bothered to narrow down which -rc kernel started causing it.

> Andreas, you could give me your .config, I will take a look ...
>
> Just to update me: did you also try this with 2.6.6?

Sure, that's how it all started: problems when upgrading from 2.6.5 to 2.6.6. I'll attach my 2.6.6 config at the end.

> Have you tried to really start over with your kernel tree?

Yes.

> Did you get to 2.6.6 via patching or via downloading the whole tree?

Whole tree.

> If patched, did you look for .rej files? Did you use make clean or make distclean?

I've patched from 2.6.5 to 2.6.6-rc1 or rc2 of course, but I always patch a newly untarred kernel tree.

> Also, the modules could be faulty ... any fancy network hardware used?

Maybe, not very fancy but a bit uncommon:

00:00.0 Host bridge: Intel Corp. 440LX/EX - 82443LX/EX Host bridge (rev 03)
00:01.0 PCI bridge: Intel Corp. 440LX/EX - 82443LX/EX AGP bridge (rev 03)
00:07.0 ISA bridge: Intel Corp. 82371AB/EB/MB PIIX4 ISA (rev 02)
00:07.1 IDE interface: Intel Corp. 82371AB/EB/MB PIIX4 IDE (rev 01)
00:07.2 USB Controller: Intel Corp. 82371AB/EB/MB PIIX4 USB (rev 01)
00:07.3 Bridge: Intel Corp. 82371AB/EB/MB PIIX4 ACPI (rev 02)
00:09.0 RAID bus controller: Promise Technology, Inc. 20262 (rev 01)
00:0a.0 Ethernet controller: 3Com Corporation 3c905 100BaseTX [Boomerang]
00:0b.0 SCSI storage controller: Adaptec AHA-2940U/UW/D / AIC-7881U (rev 01)
00:0c.0 Ethernet controller: 3Com Corporation 3c980-TX 10/100baseTX NIC [Python-T] (rev 78)
01:00.0 VGA compatible controller: Intel Corp. i740 (rev 21)

The 3c980-TX is not the most common NIC. It's the one serving the LAN, but all my backups are local, so I don't really see why it should cause trouble.

> I would suggest a complete re-install of a vanilla 2.6.6 kernel and testing with that one. In my opinion it doesn't make much sense to look for bugs in that rc.

I have already done this several times before starting to try different rc versions; that is only done to narrow things down for my report to the lkml, which was where I started looking for help.
/Andreas

#
# Automatically generated make config: don't edit
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_STANDALONE=y
CONFIG_BROKEN_ON_SMP=y

#
# General setup
#
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=15
CONFIG_HOTPLUG=y
# CONFIG_IKCONFIG is not set
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set

#
# Loadable module support
#
CONFIG_MODULES=y
# CONFIG_MODULE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
CONFIG_KMOD=y

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
CONFIG_MPENTIUMII=y
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
Re: 2.6.6-rc2 and newer cause trouble with amanda
Paul Bijnens wrote:
> Just for fun: if you disable the indexing, then the backup will run fine, I believe (index no in dumptype).

Well, no, it doesn't work that way either.

/Andreas
Re: 2.6.6-rc2 and newer cause trouble with amanda
Hi, Andreas,

on Tuesday, 15 June 2004 at 21:56 you wrote to amanda-users:

>> Where could that one reside? A faulty network module? Remember that these things work fine here with each 2.6 kernel.

AS> I'm almost sure it's some kind of kernel bug. That's why I have bothered
AS> to narrow down which -rc kernel started causing it.

I am pretty cautious about pointing my finger at a kernel bug.

>> Andreas, you could give me your .config, I will take a look ...
>> Just to update me: did you also try this with 2.6.6?

AS> Sure, that's how it all started: problems when upgrading from 2.6.5 to
AS> 2.6.6. I'll attach my 2.6.6 config at the end.

Where did you get this config from? Have you modified it?

A diff against my current .config shows that this seems to be a pretty fat kernel, with many, many things compiled into it statically ...

I am no kernel hacker, but I know that having ONE of all those options wrong can break things ... this has happened to me several times.

>> If patched, did you look for .rej files? Did you use make clean or make distclean?

AS> I've patched from 2.6.5 to 2.6.6-rc1 or rc2 of course, but I always patch
AS> a newly untarred kernel tree.

Did you always do make clean between compile runs? Did you always check for .rej files after patching?

>> Also, the modules could be faulty ... any fancy network hardware used?

AS> Maybe, not very fancy but a bit uncommon:

AS> The 3c980-TX is not the most common NIC. It's the one serving the LAN, but
AS> all my backups are local, so I don't really see why it should cause trouble.

Never assume anything is local ;-) TCP/IP does not care much about local stuff ...

This reminds me of something completely different (who knows which TV series? ...): show us your disklist.

>> I would suggest a complete re-install of a vanilla 2.6.6 kernel and testing with that one. In my opinion it doesn't make much sense to look for bugs in that rc.

AS> I have already done this several times before starting to try different
AS> rc versions; that is only done to narrow things down for my report
AS> to the lkml, which was where I started looking for help.

Tasty problem. Looking forward to the solution ...

--
best regards,
Stefan

Stefan G. Weichinger
mailto:[EMAIL PROTECTED]
Re: 2.6.6-rc2 and newer cause trouble with amanda
Very very strange...

Do you have netcat installed? What is the output of this command on 2.6.6rc2?

    nc -v -v -s 127.0.0.1 -p 1234 127.0.0.1 1234

--
Paul Bijnens
Re: 2.6.6-rc2 and newer cause trouble with amanda
Paul Bijnens wrote:
> Very very strange...
>
> Do you have netcat installed? What is the output of this command on 2.6.6rc2?
>
>     nc -v -v -s 127.0.0.1 -p 1234 127.0.0.1 1234

[EMAIL PROTECTED]:/tmp/amanda$ nc -v -v -s 127.0.0.1 -p 1234 127.0.0.1 1234
localhost [127.0.0.1] 1234 (?) open

/Andreas
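For readers without netcat, a related loopback check can be sketched in Python. Note that this is not the same test as the nc command above, which uses the same source and destination port; the sketch instead starts a throwaway listener on an OS-chosen port on 127.0.0.1, connects to it, and confirms that a few bytes make the round trip. The function names, the payload and the timeout are arbitrary choices made for this example.

```python
import socket
import threading


def recv_exact(sock, count):
    """Read exactly count bytes, or fewer if the peer closes early."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def loopback_roundtrip(payload=b"ping", timeout=5.0):
    """Send payload over a TCP connection on 127.0.0.1 and return the echo."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    server.settimeout(timeout)
    port = server.getsockname()[1]

    def echo_once():
        # Accept a single connection and send back whatever arrives.
        conn, _ = server.accept()
        with conn:
            conn.settimeout(timeout)
            conn.sendall(recv_exact(conn, len(payload)))

    worker = threading.Thread(target=echo_once, daemon=True)
    worker.start()

    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as client:
        client.sendall(payload)
        echoed = recv_exact(client, len(payload))

    worker.join(timeout)
    server.close()
    return echoed


if __name__ == "__main__":
    print(loopback_roundtrip())
```

If the handshake succeeds but the echoed bytes never arrive, that would match the observation earlier in the thread that the TCP sessions appear to carry no data.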