Paul Bijnens wrote:
Mike Heller wrote:
I have amanda running on several servers and last night I tried to back up one more to the tape server. When I arrived this morning, the backups were still running and the new server had an extremely high load
Am I correct that this is the first time that a backup is tried on that host?
That is correct, it was the first time that this client has been backed up.
It has wait=yes already.
on it. It's a RedHat Linux 9.0 server and the load was over 520 (quad Xeon system). There were about 1500 processes with "amanda" as the
Have a close look at your xinetd configuration for amanda. Maybe you have "wait = no" instead of "yes" in the file?
service amanda
{
        socket_type = dgram
        protocol    = udp
        wait        = yes
        user        = amanda
        group       = disk
        server      = /usr/local/libexec/amandad
        disable     = no
}
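For anyone who wants to check the "wait" setting on several clients without reading each file by eye, here is a minimal Python sketch. The function name xinetd_service_settings and the inline SAMPLE text are illustrative assumptions, not part of amanda or xinetd; the parser only handles simple "key = value" lines inside one service block and ignores xinetd's "+=" and "-=" operators.

```python
import re

def xinetd_service_settings(text, service):
    """Return the key/value settings of one `service NAME { ... }` block,
    or None if the block is not found. Hypothetical helper for illustration."""
    match = re.search(
        r"service\s+" + re.escape(service) + r"\s*\{(.*?)\}", text, re.DOTALL
    )
    if match is None:
        return None
    settings = {}
    for line in match.group(1).splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings

# Sample text copied from the configuration quoted in this thread.
SAMPLE = """
service amanda
{
        socket_type = dgram
        protocol    = udp
        wait        = yes
        user        = amanda
        group       = disk
        server      = /usr/local/libexec/amandad
        disable     = no
}
"""

settings = xinetd_service_settings(SAMPLE, "amanda")
if settings is None:
    print("no amanda service block found")
elif settings.get("wait") != "yes":
    print("wait is set to", settings.get("wait"))
else:
    print("wait = yes, as expected for a dgram service")
```

In real use the text would come from wherever the xinetd service definition lives on the client; that location varies by installation, so it is left out here.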
After only one session I have 1029 files in the /tmp/amanda directory. Looking at the most recent, I see:
Just a guess. Have a look in /tmp/amanda/*debug files too.
amandad: time 40.815: sending ACK pkt:
<<<<<
>>>>>
amandad: try_socksize: send buffer size is 65536
amandad: try_socksize: receive buffer size is 65536
amandad: time 57.335: stream_server: waiting for connection: 0.0.0.0.33752
amandad: try_socksize: send buffer size is 65536
amandad: try_socksize: receive buffer size is 65536
amandad: time 57.335: stream_server: waiting for connection: 0.0.0.0.33753
amandad: try_socksize: send buffer size is 65536
amandad: try_socksize: receive buffer size is 65536
amandad: time 57.335: stream_server: waiting for connection: 0.0.0.0.33754
amandad: time 57.335: sending REP pkt:
<<<<<
CONNECT DATA 33752 MESG 33753 INDEX 33754
OPTIONS features=fffffeff9ffe0f;
>>>>>
amandad: time 57.336: received ACK pkt:
<<<<<
>>>>>
amandad: time 87.327: stream_accept: timeout after 30 seconds
amandad: time 87.327: stream 0 accept failed: bad SECURITY line: ''
amandad: time 117.327: stream_accept: timeout after 30 seconds
amandad: time 117.327: stream 1 accept failed: bad SECURITY line: ''
amandad: time 147.327: stream_accept: timeout after 30 seconds
amandad: time 147.327: stream 2 accept failed: bad SECURITY line: ''
amandad: time 148.337: pid 11395 finish time Wed Jan 7 08:04:36 2004
However at this point the server may be totally hung and things may not be working well. The first one (right after the backup started) seems to be working better:
amandad: time 1.026: sending ACK pkt:
<<<<<
>>>>>
amandad: time 6.573: sending REP pkt:
<<<<<
OPTIONS features=fffffeff9ffe0f;
/big/www/docs 0 SIZE 4984710
/big/mysqldata 0 SIZE 31290
/var/log 0 SIZE 6390
/boot 0 SIZE 32335
>>>>>
amandad: time 6.574: received ACK pkt:
<<<<<
>>>>>
amandad: time 30.581: pid 4004 finish time Wed Jan 7 01:01:56 2004
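With over a thousand debug files, reading them one at a time is not practical. The following Python sketch shows one way to pick out the "accept failed" entries seen in the later log. The names ACCEPT_FAILED, failed_streams and scan_debug_dir are hypothetical, and the regular expression is derived only from the log lines quoted in this thread, so other amandad versions may word the message differently.

```python
import glob
import re

# Pattern based on the lines quoted in this thread, e.g.
#   amandad: time 87.327: stream 0 accept failed: bad SECURITY line: ''
ACCEPT_FAILED = re.compile(
    r"amandad: time ([\d.]+): stream (\d+) accept failed: (.*)"
)

def failed_streams(log_text):
    """Return (time, stream number, reason) for each failed stream accept."""
    return [
        (float(t), int(stream), reason.strip())
        for t, stream, reason in ACCEPT_FAILED.findall(log_text)
    ]

def scan_debug_dir(pattern="/tmp/amanda/*debug"):
    """Map each debug file matching the pattern to its failed accepts.
    The default pattern is the one mentioned in this thread."""
    results = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as handle:
            failures = failed_streams(handle.read())
        if failures:
            results[path] = failures
    return results

# Excerpt from the log quoted above, one entry per line.
SAMPLE_LOG = """\
amandad: time 57.336: received ACK pkt:
amandad: time 87.327: stream_accept: timeout after 30 seconds
amandad: time 87.327: stream 0 accept failed: bad SECURITY line: ''
amandad: time 117.327: stream_accept: timeout after 30 seconds
amandad: time 117.327: stream 1 accept failed: bad SECURITY line: ''
amandad: time 147.327: stream_accept: timeout after 30 seconds
amandad: time 147.327: stream 2 accept failed: bad SECURITY line: ''
"""

for when, stream, reason in failed_streams(SAMPLE_LOG):
    print(f"stream {stream} failed at {when}s: {reason}")
```

Calling scan_debug_dir() on the client would then list only the files where the data, message or index stream was never accepted, which should make it easier to tell the early, working sessions from the later, hung ones.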
Mike