Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Andreas Sundstrom
Quoting Andreas Sundstrom [EMAIL PROTECTED]:

 Eric Siegerman wrote:
  On Sun, Jun 13, 2004 at 10:41:14PM +0200, Andreas Sundstrom wrote:
  
  --with-user=root \
  --with-group=root
  
  
  A stab in the dark here:  these settings seem a bit suspicious.
  Normally one doesn't run Amanda as root; it's able to get root
  privilege when it needs it (that's what the little runtar and
  rundump programs are for).  I don't know anything about
  2.6-series kernels, but maybe some change snuck in that
  interferes with Amanda trying to run as root.
  
  Even if that turns out not to be the source of your 2.6.6
  problem, for security reasons I'd suggest running Amanda as some
  other user.
 
 I'll give it a shot. I have actually meant to do that anyway, but since
 everything has worked flawlessly I didn't bother. But I guess you can
 never have too much security anyway.
 
 /Andreas
 
 

Well, now I have created a user named amanda which has default group membership
disk and is also a member of users. Then I recompiled with user=amanda,
group=disk and installed.

I had some trouble getting it to work, but I eventually found out that when
Amanda is run from xinetd, the statement groups = yes is needed.

All went well and the backup finished successfully. Then I switched to my
identically compiled 2.6.6-rc2 kernel and it fails with the same error as earlier:
These dumps were to tape dflt10.
The next tape Amanda expects to use is: dflt11.
The next new tape already labelled is: dflt12.

FAILURE AND STRANGE DUMP SUMMARY:
  zappa.zapp /imagelib lev 1 FAILED [bad CONNECT response]
  zappa.zapp /boot lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /home/sunkan lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /var lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp / lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /home/emelie lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /apps lev 0 FAILED [bad CONNECT response]
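
The summary above mixes two different failure reasons, which matters later in the thread. As a rough aid, here is a minimal Python sketch that tallies the reasons from such a report. The regular expression is an assumption based only on the seven lines shown here, not on Amanda's documented report format.

```python
import re
from collections import Counter

# The seven failure lines copied from the report above.
REPORT = """\
  zappa.zapp /imagelib lev 1 FAILED [bad CONNECT response]
  zappa.zapp /boot lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /home/sunkan lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /var lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp / lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /home/emelie lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
  zappa.zapp /apps lev 0 FAILED [bad CONNECT response]
"""

# Assumed layout: host, disk, "lev N", "FAILED", optional datestamp, [reason]
LINE_RE = re.compile(
    r"^\s*(\S+)\s+(\S+)\s+lev\s+(\d+)\s+FAILED\s+(\d*)\[(.*)\]\s*$"
)


def tally_failures(report):
    """Count how often each bracketed failure reason occurs."""
    counts = Counter()
    for line in report.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[match.group(5)] += 1
    return counts


if __name__ == "__main__":
    for reason, count in tally_failures(REPORT).most_common():
        print(count, reason)
```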

So I tarred up the /tmp/amanda dir and put it at
ftp://zappa.cx/pub/amanda-zappa.cx.tar.gz if anyone wants to take a look at it.
This is the complete failed session and nothing else.

Thanks for the help and pointers so far everyone.
/Andreas


Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Paul Bijnens
Andreas Sundstrom wrote:
> All went well and the backup finished successfully. Then I switched to my
> identically compiled 2.6.6-rc2 kernel and it fails with the same error as earlier:
> These dumps were to tape dflt10.
> The next tape Amanda expects to use is: dflt11.
> The next new tape already labelled is: dflt12.
>
> FAILURE AND STRANGE DUMP SUMMARY:
>   zappa.zapp /imagelib lev 1 FAILED [bad CONNECT response]
>   zappa.zapp /boot lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
>   zappa.zapp /home/sunkan lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
>   zappa.zapp /var lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
>   zappa.zapp / lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
>   zappa.zapp /home/emelie lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
>   zappa.zapp /apps lev 0 FAILED [bad CONNECT response]
>
> So I tarred up the /tmp/amanda dir and put it at
> ftp://zappa.cx/pub/amanda-zappa.cx.tar.gz if anyone wants to take a look at it.
> This is the complete failed session and nothing else.

In /tmp/amanda are mostly client files.  The debug files are all, as
far as I can see, from the successful session.  When failing, the
client does not even get started somehow, or did I miss that file?

I am missing the amdump.1 and log.2004 files, which are the server-side
logs (in ~amanda/dflt).

But I don't expect anything shocking in those files anyway.
As it seems to be network related, I'm still more interested to know
whether the client had trouble sending the packets, or the server had
trouble receiving them.  A tcpdump trace of the failing session would be
really helpful.
--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/                          email:  [EMAIL PROTECTED]
***
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit,  ZZ, :q, :q!,  M-Z, ^X^C,  logoff, logout, close, bye,  /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* kill -9 1,  Alt-F4,  Ctrl-Alt-Del,  AltGr-NumLock,  Stop-A,  ...*
* ...  Are you sure?  ...   YES   ...   Phew ...   I'm out  *
***



Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Stefan G. Weichinger
Hi, Andreas,

on Tuesday, 15 June 2004 at 09:29 you wrote to amanda-users:

AS Well, now I have created a user named amanda which has default group membership
AS disk and is also a member of users. Then I recompiled with user=amanda,
AS group=disk and installed.

AS I had some trouble getting it to work, but I eventually found out that when
AS Amanda is run from xinetd, the statement groups = yes is needed.

AS All went well and the backup finished successfully.

Fine ;-)

AS Then I switched to my
AS identically compiled 2.6.6-rc2 kernel and it fails with the same error as earlier:
AS These dumps were to tape dflt10.
AS The next tape Amanda expects to use is: dflt11.
AS The next new tape already labelled is: dflt12.

AS FAILURE AND STRANGE DUMP SUMMARY:
AS   zappa.zapp /imagelib lev 1 FAILED [bad CONNECT response]
AS   zappa.zapp /boot lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
AS   zappa.zapp /home/sunkan lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
AS   zappa.zapp /var lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
AS   zappa.zapp / lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
AS   zappa.zapp /home/emelie lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
AS   zappa.zapp /apps lev 0 FAILED [bad CONNECT response]

AS So I tarred up the /tmp/amanda dir and put it at
AS ftp://zappa.cx/pub/amanda-zappa.cx.tar.gz if anyone wants to take a look at it.
AS This is the complete failed session and nothing else.

AS Thanks for the help and pointers so far everyone.

I think you have simply run out of TCP ports. I don't know exactly whether
that is kernel-related, but you could try reconfiguring amanda with the
options:

--with-tcpportrange=50000,50040 --with-udpportrange=890,899

(Substitute with your preferred port-numbers)

This would specify the range to use and rule out any differences that
might occur in how this is handled between kernel releases. I have had
these options in my amanda configure script for quite some time now and
have never hit these problems with any 2.6 kernel.

You could also read the docs/PORT-USAGE document for more information on
port handling.
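
To test the "ran out of ports" theory directly, one could count how many ports in a range can be bound at a given moment. The Python sketch below is only an illustration: the function name is made up, and the range in the usage line is an example to be replaced with whatever was passed to configure. Ports still in TIME_WAIT are reported as busy because SO_REUSEADDR is deliberately not set.

```python
import socket


def free_tcp_ports(low, high, host="127.0.0.1"):
    """Return the ports in [low, high] that can currently be bound on host."""
    free = []
    for port in range(low, high + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is in use, or binding is not permitted
            free.append(port)
    return free


if __name__ == "__main__":
    # Example range only; substitute the range amanda was configured with.
    ports = free_tcp_ports(50000, 50040)
    print(f"{len(ports)} of 41 ports are free")
```

If nearly all ports in the range come back as free on both kernels, port exhaustion becomes an unlikely explanation.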

Give it a try and let us know.

--

Apart from that, the server-side logs would be interesting, but I
suggest doing that reconfigure run first.

-- 
best regards,
Stefan

Stefan G. Weichinger
mailto:[EMAIL PROTECTED]






Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Andreas Sundstrom
Quoting Paul Bijnens [EMAIL PROTECTED]:

 Andreas Sundstrom wrote:
 
  All went well and the backup finished successful. Then I switched to my
  identically compiled 2.6.6-rc2 kernel and it fails with the same error as earlier:
  These dumps were to tape dflt10.
  The next tape Amanda expects to use is: dflt11.
  The next new tape already labelled is: dflt12.
  
  FAILURE AND STRANGE DUMP SUMMARY:
    zappa.zapp /imagelib lev 1 FAILED [bad CONNECT response]
    zappa.zapp /boot lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
    zappa.zapp /home/sunkan lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
    zappa.zapp /var lev 1 FAILED 20040615[could not connect to zappa.zappa.cx]
    zappa.zapp / lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
    zappa.zapp /home/emelie lev 0 FAILED 20040615[could not connect to zappa.zappa.cx]
    zappa.zapp /apps lev 0 FAILED [bad CONNECT response]
  
  So I tarred up the /tmp/amanda dir and put it at
  ftp://zappa.cx/pub/amanda-zappa.cx.tar.gz if anyone wants to take a look at it.
  This is the complete failed session and nothing else.
 
 In /tmp/amanda are mostly client files.  The debug files are all, as
 far as I can see, from the succeeded session.  When failing, the
 client does not even get started somehow, or did I miss that file?
 
 I miss the amdump.1 and log.2004 file, which are the server side 
 logs (in ~amanda/dflt ).
 
 But I don't expect something shocking in those files anyway.
 As it seems to be network related, I'm still more interested to know
 if the client had trouble sending the packets, or the server had trouble
 receiving it.  A tcpdump trace of the failing session would be really
 helpful.

I have tarred them up into the file ftp://zappa.cx/pub/amdump-zappa.cx.tar.gz;
have a look if you have time for it.

Also, what do you recommend as parameters to tcpdump? It's all running on the
same host; does that mean I can sniff on lo?

/Andreas


Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Paul Bijnens
Andreas Sundstrom wrote:
> Also what do you recommend as parameters to tcpdump, it's all running on the
> same host does that mean I can sniff on lo?

This works for me on Linux 2.4.22:

  sudo tcpdump -i lo -w trace.lo



Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Andreas Sundstrom
Quoting Paul Bijnens [EMAIL PROTECTED]:

 Andreas Sundstrom wrote:
 
  Also what do you recommend as parameters to tcpdump, it's all running on the
  same host does that mean I can sniff on lo?
 
 This works for me on Linux 2.4.22:
 
sudo tcpdump -i lo -w trace.lo

I have now made a dump of the traffic passing through lo during the amdump. I
have also recompiled amanda with these two settings (as another friendly person
suggested): --with-tcpportrange=50000,50040 --with-udpportrange=890,899

The dump is available at ftp://zappa.cx/pub/amanda.pcap

I used this filter in ethereal to get rid of some other stuff that I suppose is
not interesting: ! dns and ! tcp.port == 25

I'm not that used to looking at packet dumps and I have never looked at a
working amanda dump, so have a look and see if it gives you any more clues
about what is going on.

/Andreas


Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Stefan G. Weichinger
Hi, Andreas,

on Tuesday, 15 June 2004 at 13:32 you wrote to amanda-users:

AS I have now made a dump on the traffic passing through lo during the amdump. I
AS have also recompiled amanda with these two settings (as another friendly person
AS suggested): --with-tcpportrange=50000,50040 --with-udpportrange=890,899

Did it run through now? What was the report?

AS The dump is available at ftp://zappa.cx/pub/amanda.pcap

AS I used this filter in ethereal to get rid of some other stuff that I suppose is
AS not interesting: ! dns and ! tcp.port == 25

AS I'm not that used to looking at packet dumps and I have never looked at a
AS working amanda dump so have a look and see if it gives you anymore clues about
AS what is going on.

I am not so used to that, either. Looks pretty good to me so far...
uses the assigned ports and such.

Maybe Paul sees more in it ...

-- 
best regards,
Stefan

Stefan G. Weichinger
mailto:[EMAIL PROTECTED]






[no subject]

2004-06-15 Thread Andreas Sundstrom
Stefan G. Weichinger wrote:
 Hi, Andreas,

 on Tuesday, 15 June 2004 at 13:32 you wrote to amanda-users:

 AS I have now made a dump on the traffic passing through lo during the amdump. I
 AS have also recompiled amanda with these two settings (as another friendly person
 AS suggested): --with-tcpportrange=50000,50040 --with-udpportrange=890,899

 Did it run through now? What was the report?

Sorry, the report and result were the same as earlier; I forgot to
mention that.

 AS The dump is available at ftp://zappa.cx/pub/amanda.pcap

 AS I used this filter in ethereal to get rid of some other stuff that I suppose is
 AS not interesting: ! dns and ! tcp.port == 25

 AS I'm not that used to looking at packet dumps and I have never looked at a
 AS working amanda dump so have a look and see if it gives you anymore clues about
 AS what is going on.

 I am not so used to that, either. Looks pretty good to me so far...
 uses the assigned ports and such.

It looks to me as if the TCP sessions don't transfer any actual data, but
I don't understand why.


 Maybe Paul sees more in it ...

I hope he does.


Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Paul Bijnens
Andreas Sundstrom wrote:
> I have now made a dump on the traffic passing through lo during the amdump. I
> have also recompiled amanda with these two settings (as another friendly person
> suggested): --with-tcpportrange=50000,50040 --with-udpportrange=890,899
>
> The dump is available at ftp://zappa.cx/pub/amanda.pcap
>
> I used this filter in ethereal to get rid of some other stuff that I suppose is
> not interesting: ! dns and ! tcp.port == 25

It seems ethereal stored only the first few bytes of each packet.
There is probably an option to set that size, similar to tcpdump -s 1500.
That means I don't have the full info, but I believe I've seen enough!

There is something strange indeed.

13:00:30.573676 192.168.20.100.amanda > 192.168.20.100.890: udp 125 (DF)
0x0000   4500 0099 008d 4000 4011 8fae c0a8 1464        E.....@.@......d
0x0010   c0a8 1464 2760 037a 0085 aebc 416d 616e        ...d'`.z....Aman
0x0020   6461 2032 2e34 2052 4550 2048 414e 444c        da.2.4.REP.HANDL
0x0030   4520 3030 302d 4338 4443 3036 3038 2053        E.000-C8DC0608.S
0x0040   4551 2031 3038 3732 3937 3037 310a 434f        EQ.1087297071.CO
0x0050   4e4e                                           NN

13:00:30.574146 192.168.20.100.890 > 192.168.20.100.amanda: udp 50 (DF)
0x0000   4500 004e 0001 4000 4011 9085 c0a8 1464        E..N..@.@......d
0x0010   c0a8 1464 037a 2760 003a b654 416d 616e        ...d.z'`.:.TAman
0x0020   6461 2032 2e34 2041 434b 2048 414e 444c        da.2.4.ACK.HANDL
0x0030   4520 3030 302d 4338 4443 3036 3038 2053        E.000-C8DC0608.S
0x0040   4551 2031 3038 3732 3937 3037 310a             EQ.1087297071.

The above was the request to set up the TCP connections.  The tracer
did not dump all of the packet; it broke off after CONN, which is followed
by the TCP port numbers. You can find those in one of the amandad..debug
logs as well.
Normally, there should be three consecutive numbers.
The three TCP port numbers are used in the next exchange to set up
three TCP connections, for data, error, and index respectively.
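
For readers not used to hex dumps, the first 28 bytes of the first packet can be decoded by hand. The Python sketch below does that with the standard IPv4 and UDP header layouts; the hex string is copied from the first two rows of the dump, and the function name is just illustrative.

```python
import struct

# IP header (20 bytes) + UDP header (8 bytes) of the first packet in the trace.
HEX = "4500 0099 008d 4000 4011 8fae c0a8 1464 c0a8 1464 2760 037a 0085 aebc"


def decode_ipv4_udp(hex_dump):
    """Decode the fixed IPv4 header and the UDP header from a hex dump."""
    raw = bytes.fromhex(hex_dump.replace(" ", ""))
    (ver_ihl, _tos, total_len, _ident, _frag, _ttl,
     proto, _csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    ihl = (ver_ihl & 0x0F) * 4  # IP header length in bytes
    sport, dport, udp_len, _udp_csum = struct.unpack("!HHHH", raw[ihl:ihl + 8])
    return {
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
        "protocol": proto,              # 17 means UDP
        "ip_total_length": total_len,
        "src_port": sport,
        "dst_port": dport,
        "udp_payload": udp_len - 8,     # UDP length includes its 8-byte header
    }


if __name__ == "__main__":
    for key, value in decode_ipv4_udp(HEX).items():
        print(f"{key:16} {value}")
```

The source port decodes to 10080 (the amanda service), the destination port to 890, and the payload to 125 bytes, matching the summary line tcpdump printed. The IP total length is 153 bytes, while the dump stops at offset 0x52, which is how one can tell the capture was truncated.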

13:00:45.622602 192.168.20.100.50027 > 192.168.20.100.50001: S ...
13:00:45.622696 192.168.20.100.50001 > 192.168.20.100.50027: S ...
13:00:45.622815 192.168.20.100.50027 > 192.168.20.100.50001: . ack ...

This was the handshake for the first connection: the data connection.
(I have shortened the lines to fit on screen.)
It connected to port 50001.

13:00:45.625090 192.168.20.100.50028 > 192.168.20.100.50002: S ...
13:00:45.625165 192.168.20.100.50002 > 192.168.20.100.50028: S ...
13:00:45.625237 192.168.20.100.50028 > 192.168.20.100.50002: . ack ...

The second handshake, to port 50002, is for the error messages.

13:00:45.627502 192.168.20.100.50029 > 192.168.20.100.65535: S ...
13:00:45.627564 192.168.20.100.65535 > 192.168.20.100.50029: R ...

You would expect a handshake to port 50003, for the index, but
instead there is a connection attempt to port 65535, which is rejected.

13:00:45.628082 192.168.20.100.50027 > 192.168.20.100.50001: F ...
13:00:45.628177 192.168.20.100.50028 > 192.168.20.100.50002: F ...
13:00:45.628849 192.168.20.100.50001 > 192.168.20.100.50027: . ack ...
13:00:45.628880 192.168.20.100.50002 > 192.168.20.100.50028: . ack ...

And amanda cleans up the other two connections.
Amanda tries again with another set of ports a few times,
but always tries to connect to 65535 for the index.
Then she gives up completely.

Can you verify in the amandad..log that the index connection
was indeed requested on port 50003?
The debug file looks like (search for string CONNECT):
  
  Amanda 2.4 REP HANDLE 000-C8DC0608 SEQ 1087282569
  CONNECT DATA 32771 MESG 32772 INDEX 32773
  OPTIONS features=feff9ffe0f;
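
A quick way to run this check over a debug file is to pull the three port numbers out of the CONNECT line and test them. The Python sketch below assumes the line always has the DATA/MESG/INDEX shape of the excerpt above; the second line in the usage section is hypothetical and only mirrors what the packet trace suggests, it is not taken from any log.

```python
import re

# Assumed line shape, taken from the debug excerpt above.
CONNECT_RE = re.compile(r"CONNECT\s+DATA\s+(\d+)\s+MESG\s+(\d+)\s+INDEX\s+(\d+)")


def check_connect(line):
    """Return the three ports and simple sanity checks, or None if no match."""
    match = CONNECT_RE.search(line)
    if match is None:
        return None
    data, mesg, index = (int(group) for group in match.groups())
    ports = (data, mesg, index)
    return {
        "ports": ports,
        "consecutive": mesg == data + 1 and index == mesg + 1,
        # 0 and 65535 are never sensible listening ports here
        "suspicious": [p for p in ports if p in (0, 0xFFFF)],
    }


if __name__ == "__main__":
    print(check_connect("CONNECT DATA 32771 MESG 32772 INDEX 32773"))
    # Hypothetical line, for illustration only:
    print(check_connect("CONNECT DATA 50001 MESG 50002 INDEX 65535"))
```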
  

The next thing to find out is who/why/when decided to connect to port
65535 instead.  Also note that number: all 1-bits, 16 bits wide.
A kernel bug is indeed one of the possibilities.

Just for fun: if you disable the indexing, then the backup will run
fine, I believe.  (index no in dumptype).
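
The "all 1-bits" remark can be made concrete. One common way such a value appears is a -1 (a typical error return) being stored in an unsigned 16-bit field such as a port number. The Python sketch below only illustrates that arithmetic; it is not a claim about where the 65535 in this trace actually comes from.

```python
import struct


def as_u16(value):
    """Reinterpret a signed 16-bit integer as unsigned, like a C cast."""
    return struct.unpack("!H", struct.pack("!h", value))[0]


if __name__ == "__main__":
    print(as_u16(-1))          # 65535
    print(hex(as_u16(-1)))     # 0xffff
    print(bin(as_u16(-1)))     # sixteen 1-bits
```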



Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Paul Bijnens
Paul Bijnens wrote:
> It seems ethereal store only the first few bytes of each packet.
> There is probably an option to set that size; similar to tcpdump -s 1500.
> That means I don't have the full info, but I believe I've seen enough!

Following up on myself.

I've dug into the amanda source, and noticed there is only one
place where bad CONNECT response is triggered.
That happens only for data or message TCP connections.
But from the amdump.1 log file, it does not seem consistent: now
and then it is bad CONNECT response, other times it is the index
that fails.
So it is not always the index that fails, as I concluded earlier.

Therefore I really would like to have a complete trace,
not cut off, together with the accompanying amandad..debug
files + amdump.1, all from the same run.



warning - last level 0 overwritten

2004-06-15 Thread Brian Cuttler

Amanda 2.4.4p1
Solaris 8

I realize that this is probably not the current version of amanda,
but it might represent an ongoing issue.

We haven't been backing up a particular partition; it's relatively
large given the size of the tape drive (22 Gig used on a 32 Gig
partition; the tape drive is a DLT 7000 and we use SW compression).

Last night's amanda report warned us that we would overwrite the
last level 0 dump in one day. Being concerned about this, we thought
we'd pull the tape with the level 0 from the pool (replacing it with one
of the same label) until we were able to correct the no-dump issue.

However, an amadmin find shows us that the level 0 dumps all had short
tape writes and that we had nothing better than a level 1 available.

Does amanda record the date of the last level 0 as a success prior to
actually completing the tape write?

thanks,

Brian

---
   Brian R Cuttler                 [EMAIL PROTECTED]
   Computer Systems Support        (v) 518 486-1697
   Wadsworth Center                (f) 518 473-6384
   NYS Department of Health        Help Desk 518 473-0773



Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Andreas Sundstrom
Paul Bijnens wrote:
> Paul Bijnens wrote:
>> It seems ethereal store only the first few bytes of each packet.
>> There is probably an option to set that size; similar to tcpdump -s 1500.
>> That means I don't have the full info, but I believe I've seen enough!
>
> Following up on myself.
> I've dug into the amanda source, and noticed there is only one
> place where bad CONNECT response is triggered.
> That happens only for data or message tcp-connections.
> But from the amdump.1 log file, it seems not always consistent, now
> and then it is bad CONNECT response, other times it is the index
> that fails.
> So it is not always the index that fails, as I concluded earlier.
> Therefore I really would like to have a complete trace,
> not cut off, together with the accompanying amandad..debug
> files + amdump.1, all from the same run.

Sure, here's a new run, this time with -s 1500 so all packets will
hopefully be intact. All files are in:
ftp://zappa.cx/pub/amanda-zappa.cx-2.tar.gz

Note that amdump.1 and log.20040615.6 reside in var/lib/amanda/dflt
in my configuration.

/Andreas
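
Before sending a trace around, it can be worth confirming that it was captured with the intended snapshot length. The classic pcap file format stores that value in its 24-byte global header, so it can be read without any packet library. This Python sketch assumes a classic (non-pcapng, microsecond-resolution) capture file; the file name in the usage line is the one mentioned earlier in the thread.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap, microsecond timestamps


def read_snaplen(header):
    """Return the snapshot length stored in a 24-byte pcap global header."""
    if len(header) < 24:
        raise ValueError("need at least 24 bytes of header")
    for endian in ("<", ">"):  # the file may be little- or big-endian
        (magic,) = struct.unpack(endian + "I", header[:4])
        if magic == PCAP_MAGIC:
            # version major/minor, thiszone, sigfigs, snaplen, link type
            _maj, _min, _zone, _sig, snaplen, _link = struct.unpack(
                endian + "HHiIII", header[4:24]
            )
            return snaplen
    raise ValueError("not a classic pcap file")


if __name__ == "__main__":
    with open("amanda.pcap", "rb") as capture:
        print("snaplen:", read_snaplen(capture.read(24)))
```

A value of 1500 or more would mean full-size Ethernet payloads were kept; a much smaller value explains packets that break off mid-payload.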


Re: directory permissions cause tar to segfault?

2004-06-15 Thread Marc Langlois
Hi Eric,

I had a similar problem with several versions of gtar (1.13, 1.13.25 and
1.14) on Solaris 8. By running the gtar command used by amanda
interactively with the -v flag, I found that gtar was SEGV-ing on a
specific user directory. When I added that directory to the amanda
exclude file, gtar worked fine, and the amanda backup succeeded.

This didn't really solve the problem, but it gives me some comfort that
it was caused by gtar, and not amanda. I suppose running a debug version
of gtar could shed some light on what is causing the SEGV.
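
The manual search described above (run tar by hand, see where it dies) can be automated one directory at a time. The Python sketch below is a rough, POSIX-only illustration: the helper names are made up, and the tar options are the subset of the amanda estimate command quoted further down that is needed to walk a directory.

```python
import signal
import subprocess
import sys


def killed_by_signal(cmd):
    """Run cmd; return the signal number if a signal killed it, else None."""
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    # On POSIX, a negative return code -N means "terminated by signal N".
    return -proc.returncode if proc.returncode < 0 else None


def find_crashing_dirs(parent, names, tar="tar"):
    """Return the entries under parent for which tar dies with SIGSEGV."""
    crashing = []
    for name in names:
        cmd = [tar, "--create", "--file", "/dev/null",
               "--directory", parent, name]
        if killed_by_signal(cmd) == signal.SIGSEGV:
            crashing.append(name)
    return crashing


if __name__ == "__main__":
    import os
    top = sys.argv[1] if len(sys.argv) > 1 else "."
    print(find_crashing_dirs(top, sorted(os.listdir(top))))
```

Any directory it reports is a candidate for the exclude file, and a good starting point for a debug build of gtar.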

HTH,
Marc. 

   
On Fri, 2004-06-11 at 11:01, Eric Sproul wrote:
 On Fri, 2004-06-11 at 10:44, Jon LaBadie wrote:
  Well, I feel it shouldn't segfault under any conditions.
 
 Hi Jon,
 I completely agree.
 
  
  Just two considerations, is /bin/tar the one amanda uses?
 
 Yes.  I copied the command directly from the sendsize debug, but I was
 confused because it mentions both /bin/tar and /usr/lib/amanda/runtar:
 
 sendsize[29880]: time 0.500: getting size via gnutar for md0 level 0
 sendsize[29880]: time 0.501: spawning /usr/lib/amanda/runtar in pipeline
 sendsize[29880]: argument list: /bin/tar (... rest of the options)
 
 I found that when I tested from the command line I needed to leave out
 /bin/tar.  See below.
 
  And I'm pretty sure that when run by amanda, tar is setuid root.
  It does most things as the amanda-user, then invokes tar with
  a setuid program called runtar.  You might check that perms
  on runtar are unchanged.
 
 This host uses the Debian package, amanda-client_2.4.4p2-1.  The runtar
 binary looks correct.  I installed this version of amanda-client on
 2/25, so a mod-date of 2/15 is reasonable, as we are tracking
 Debian-sid.
 
 # ls -l /usr/lib/amanda/runtar
 -rwsr-xr--1 root backup   4716 Feb 15 21:44 /usr/lib/amanda/runtar
 
 I tried as user backup running the estimate command using runtar
 instead, and still got a segfault.
 
 backup@HOST:~$ /usr/lib/amanda/runtar --create --file /dev/null \
  --directory /usr --one-file-system --listed-incremental \
  /var/lib/amanda/gnutar-lists/dhcp02md0_1.new --sparse \
  --ignore-failed-read --totals --exclude-from \
  /tmp/amanda/sendsize.md0.20040609041435000.exclude .
 Segmentation fault
 
 But... it didn't mention the directory permissions this time, as runtar
 is suid-root (which you noted).  The segfault seems to be coming from
 somewhere else then.  I guess I'll have to dig deeper into what is
 actually being backed up.  Maybe one of my files is causing this.
 
 Thanks,
 Eric




Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Stefan G. Weichinger
Hi, Paul,

on Tuesday, 15 June 2004 at 16:49 you wrote to amanda-users:

PB And amanda cleans up the other two connections.
PB Amanda tries again with another set of ports a few times
PB but always trying to connect to 65535 for the index.
PB Then she gives up completely.

I also noticed the strange usage of 65535 ...

PB Can you verify in the amandad..log that the index connection
PB was indeed asked to port 50003?
PB The debug file looks like (search for string CONNECT):

PB
PB   Amanda 2.4 REP HANDLE 000-C8DC0608 SEQ 1087282569
PB   CONNECT DATA 32771 MESG 32772 INDEX 32773
PB   OPTIONS features=feff9ffe0f;
PB

PB Next thing to find out is who/why/when decided to connect to port
PB 65535 instead.  Also note that number:  all 1-bits 16-bit wide.

PB A kernel bug is indeed one of the possibilities.

Where could that one reside? A faulty network-module? Remember that
these things work fine here with each 2.6 ..

Andreas, you could give me your .config, I will take a look ...
Just to update me: You also tried this with 2.6.6?

Have you tried to really start over with your kernel-tree?
Did you get to 2.6.6 via patching or via downloading the whole tree?
If patched, did you look for .rej-files?
Did you use make clean or make distclean?
Also the modules could be faulty ... any fancy network-hardware used?

I would suggest a complete re-install of a vanilla 2.6.6-kernel and
testing with this one. Doesn't make much sense to look for bugs in
that rc in my opinion.

-- 
best regards,
Stefan

Stefan G. Weichinger
mailto:[EMAIL PROTECTED]






Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Andreas Sundstrom
Stefan G. Weichinger wrote:
> Hi, Paul,
>
> on Tuesday, 15 June 2004 at 16:49 you wrote to amanda-users:
>
> PB And amanda cleans up the other two connections.
> PB Amanda tries again with another set of ports a few times
> PB but always trying to connect to 65535 for the index.
> PB Then she gives up completely.
>
> I also noticed the strange usage of 65535 ...
>
> PB Can you verify in the amandad..log that the index connection
> PB was indeed asked to port 50003?
> PB The debug file looks like (search for string CONNECT):
> PB
> PB   Amanda 2.4 REP HANDLE 000-C8DC0608 SEQ 1087282569
> PB   CONNECT DATA 32771 MESG 32772 INDEX 32773
> PB   OPTIONS features=feff9ffe0f;
> PB
> PB Next thing to find out is who/why/when decided to connect to port
> PB 65535 instead.  Also note that number:  all 1-bits 16-bit wide.
> PB A kernel bug is indeed one of the possibilities.
>
> Where could that one reside? A faulty network-module? Remember that
> these things work fine here with each 2.6 ..

I'm almost sure it's some kind of kernel bug. That's why I have bothered
to narrow down which -rc kernel started causing it.

> Andreas, you could give me your .config, I will take a look ...
> Just to update me: You also tried this with 2.6.6?

Sure, that's how it all started: the problems appeared when upgrading from
2.6.5 to 2.6.6. I'll attach my 2.6.6 config at the end.

> Have you tried to really start over with your kernel-tree?

Yes.

> Did you get to 2.6.6 via patching or via downloading the whole tree?

Whole tree.

> If patched, did you look for .rej-files?
> Did you use make clean or make distclean?

I've patched from 2.6.5 to 2.6.6-rc1 or rc2 of course, but I always patch
a freshly untarred kernel tree.

> Also the modules could be faulty ... any fancy network-hardware used?

Maybe. Not very fancy, but a bit uncommon:

00:00.0 Host bridge: Intel Corp. 440LX/EX - 82443LX/EX Host bridge (rev 03)
00:01.0 PCI bridge: Intel Corp. 440LX/EX - 82443LX/EX AGP bridge (rev 03)
00:07.0 ISA bridge: Intel Corp. 82371AB/EB/MB PIIX4 ISA (rev 02)
00:07.1 IDE interface: Intel Corp. 82371AB/EB/MB PIIX4 IDE (rev 01)
00:07.2 USB Controller: Intel Corp. 82371AB/EB/MB PIIX4 USB (rev 01)
00:07.3 Bridge: Intel Corp. 82371AB/EB/MB PIIX4 ACPI (rev 02)
00:09.0 RAID bus controller: Promise Technology, Inc. 20262 (rev 01)
00:0a.0 Ethernet controller: 3Com Corporation 3c905 100BaseTX [Boomerang]
00:0b.0 SCSI storage controller: Adaptec AHA-2940U/UW/D / AIC-7881U (rev 01)
00:0c.0 Ethernet controller: 3Com Corporation 3c980-TX 10/100baseTX NIC [Python-T] (rev 78)
01:00.0 VGA compatible controller: Intel Corp. i740 (rev 21)

The 3c980-TX is not the most common NIC. It's the one serving the LAN, but
all my backups are local, so I don't really see why it should mean trouble.

> I would suggest a complete re-install of a vanilla 2.6.6-kernel and
> testing with this one. Doesn't make much sense to look for bugs in
> that rc in my opinion.

I have already done this several times before starting to try different
rc versions; that is only being done to narrow things down for my report
to the lkml, which is where I started looking for help.

/Andreas

#
# Automatically generated make config: don't edit
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_STANDALONE=y
CONFIG_BROKEN_ON_SMP=y

#
# General setup
#
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=15
CONFIG_HOTPLUG=y
# CONFIG_IKCONFIG is not set
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set

#
# Loadable module support
#
CONFIG_MODULES=y
# CONFIG_MODULE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
CONFIG_KMOD=y

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
CONFIG_MPENTIUMII=y
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y

Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Andreas Sundstrom
Paul Bijnens wrote:
> Just for fun: if you disable the indexing, then the backup will run
> fine, I believe.  (index no in dumptype).

Well, no, it doesn't work that way either.

/Andreas


Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Stefan G. Weichinger
Hi, Andreas,

on Tuesday, 15 June 2004 at 21:56 you wrote to amanda-users:

 Where could that one reside? A faulty network-module? Remember that
 these things work fine here with each 2.6 ..

AS I'm almost sure it's some kind of kernel bug. That's why I have bothered
AS to narrow down which -rc kernel started causing it.

I am pretty cautious with pointing my finger at a kernel bug.

 Andreas, you could give me your .config, I will take a look ...
 Just to update me: You also tried this with 2.6.6?

AS Sure, that's how it all started problems when upgrading from 2.6.5 to
AS 2.6.6. I'll attach my 2.6.6 config at the end.

Where did you get this config from? Have you modified it?

A diff against my current .config shows that this seems to be a pretty
fat kernel, many many things compiled into it statically ...

I am no kernel hacker, but I know that having just ONE of all those options
wrong can break things ... this has happened to me several times.
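
Comparing two kernel configurations option by option is easier than reading a raw diff. The Python sketch below parses the two line forms that appear in the .config attached earlier (CONFIG_X=value and "# CONFIG_X is not set") and lists the options that differ. The second configuration in the usage section is invented purely for illustration.

```python
import re

SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each CONFIG_ option to its value; unset options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = SET_RE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = UNSET_RE.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_configs(mine, other):
    """Return {option: (mine, other)} for every option that differs."""
    keys = sorted(set(mine) | set(other))
    return {k: (mine.get(k), other.get(k))
            for k in keys if mine.get(k) != other.get(k)}


if __name__ == "__main__":
    mine = parse_config(
        "CONFIG_X86=y\n# CONFIG_MODULE_UNLOAD is not set\nCONFIG_LOG_BUF_SHIFT=15\n"
    )
    # Hypothetical second config, for illustration only.
    other = parse_config(
        "CONFIG_X86=y\nCONFIG_MODULE_UNLOAD=y\nCONFIG_LOG_BUF_SHIFT=14\n"
    )
    for option, (a, b) in diff_configs(mine, other).items():
        print(f"{option}: {a} -> {b}")
```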

 If patched, did you look for .rej-files?
 Did you use make clean or make distclean?
AS I've patched from 2.6.5 to 2.6.6-rc1 or rc2 of course, but I always patch
AS a newly untarred kernel tree.

Did you always run make clean between compiler runs?
Did you always check for .rej files after patching?

 Also the modules could be faulty ... any fancy network-hardware used?
AS Maybe, not very fancy but a bit uncommon:

AS The 3c980-TX is not the most common NIC, it's the one serving the lan, but
AS all my backups are local so I don't really see why it should mean trouble.

Never assume anything is local ;-)
TCP/IP does not care much about local stuff ...

This reminds me of something completely different (who knows which
TV series? ...):

Show us your disklist.

 I would suggest a complete re-install of a vanilla 2.6.6-kernel and
 testing with this one. Doesn't make much sense to look for bugs in
 that rc in my opinion.

AS I have already done this several times before starting to try different
AS rc versions, that is only done to narrow down things for my report
AS to the lkml which was where I started looking for help.

Tasty problem. Looking forward to the solution ...

-- 
best regards,
Stefan

Stefan G. Weichinger
mailto:[EMAIL PROTECTED]






Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Paul Bijnens
Very, very strange...

Do you have netcat installed?
What is the output of this command on 2.6.6-rc2?

  nc -v -v -s 127.0.0.1 -p 1234 127.0.0.1 1234
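
As far as can be told from its arguments, this netcat command asks for a TCP connection whose source address and port are the same as its destination, so the socket ends up connected to itself. For hosts without netcat, the same experiment can be sketched in Python. This is an assumption about what the command is meant to exercise, and whether the connect succeeds depends on the operating system's TCP stack.

```python
import socket


def self_connect(host="127.0.0.1", port=0):
    """Bind a TCP socket and try to connect it to its own address.

    port=0 lets the kernel pick a free port; pass 1234 to mirror the
    netcat command above. Returns (succeeded, human-readable detail).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        address = sock.getsockname()
        sock.connect(address)  # source and destination are identical
        return True, f"{address} connected to itself, peer {sock.getpeername()}"
    except OSError as exc:
        return False, f"self-connect failed: {exc}"
    finally:
        sock.close()


if __name__ == "__main__":
    ok, detail = self_connect()
    print("open" if ok else "refused", "-", detail)
```

Running it on both the working and the failing kernel gives a comparison that does not involve amanda at all.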


[no subject]

2004-06-15 Thread Fernando Serto
unsubscribe [EMAIL PROTECTED]

--
Fernando Serto
Systems Administrator
Memetrics Pty.
Phone: +61 2 95560833
Fax: +61 2 95556911
Mobile: 0403 338 005
E-mail: [EMAIL PROTECTED]

--- 
Certain disclaimers and policies apply to all email sent from Memetrics.
For the full text of these disclaimers and policies see
http://www.memetrics.com/emailpolicy.html


Re: 2.6.6-rc2 and newer cause trouble with amanda

2004-06-15 Thread Andreas Sundstrom
Paul Bijnens wrote:
> Very very strange...
> Do you have netcat installed?
> What is the output of this command on 2.6.6rc2?
> nc -v -v -s 127.0.0.1 -p 1234 127.0.0.1 1234

[EMAIL PROTECTED]:/tmp/amanda$ nc -v -v -s 127.0.0.1 -p 1234 127.0.0.1 1234
localhost [127.0.0.1] 1234 (?) open

/Andreas