Timeout during backup
On one of my Linux servers, I have recently been getting the following error when running the nightly backup. It backed up the / and /usr partitions, but it fails on /home every time. Is this a file size problem? I cannot figure this out. The last time I had an "index tee cannot write" error, it was a network card problem. It does not appear to be so this time. Any help would be appreciated.

Regards,
Ryan Williams

Here is the error:

sendbackup: debug 1 pid 6960 ruid 502 euid 502 start time Thu Aug 9 05:01:06 2001
/usr/local/libexec/sendbackup: got input request: DUMP hda7 0 1970:1:1:0:0:0 OPTIONS |;bsd-auth;srvcomp-fast;index;
  parsed request as: program `DUMP' disk `hda7' lev 0 since 1970:1:1:0:0:0 opt `|;bsd-auth;srvcomp-fast;index;'
  waiting for connect on 3387, then 3388, then 3389
  got all connections
sendbackup: spawning /sbin/dump in pipeline
sendbackup: argument list: dump 0usf 1048576 - /dev/hda7
sendbackup: started index creator: /sbin/restore -tvf - 2>&1 | sed -e '
s/^leaf[ ]*[0-9]*[ ]*\.//
t
/^dir[ ]/ {
s/^dir[ ]*[0-9]*[ ]*\.//
s%$%/%
t
}
d
'
index tee cannot write [Broken pipe]
Re: Strange Amanda Error: sendbackup: index tee cannot write [Broken pipe]
The problem ended up being the autosensing media. I have never run into a problem with media autosense before, but from now on I will try to set the media explicitly every time I set up a network card.

Thanks for the help,
Ryan Williams

- Original Message -
From: John R. Jackson [EMAIL PROTECTED]
To: Ryan Williams [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, May 01, 2001 6:59 PM
Subject: Re: Strange Amanda Error: sendbackup: index tee cannot write [Broken pipe]

> It is a 100meg card set to autoselect between all of the 100 and 10 base T.

Uh, huh. On Solaris, at least, autoselect is death. No matter what you try to do, it will pick wrong and really bad things start to happen. We **always** force the issue with an explicit config file entry. I don't know that's what's happening to you, but it's been a common problem.

> The network is a 10baseT. I am not sure if it is full or half duplex.
> ...

I'm out of my depth here, but I think if the cable is actually 10Mbit, duplex does not matter. Only 100Mbit brings duplex into the picture.

> What do you make of that? I don't see anywhere that it says what it is actually set at.
> ...

Which might make sense if it does not apply because you're really only doing 10Mbit.

>> It would also be useful to go to the client and look at sendbackup*debug in /tmp/amanda, in particular the start and stop time (first and last lines).
>
> /usr/local/libexec/sendbackup: got input request: DUMP ad0s1e 0 1970:1:1:0:0:0 OPTIONS |;bsd-auth;srvcomp-fast;index;
>   parsed request as: program `DUMP' disk `ad0s1e' lev 0 since 1970:1:1:0:0:0 opt `|;bsd-auth;srvcomp-fast;index;'
>   waiting for connect on 2622, then 2623, then 2624
> /usr/local/libexec/sendbackup: timeout on mesg port 2623
> /usr/local/libexec/sendbackup: timeout on index port 2624
> sendbackup: pid 79500 finish time Tue May 1 01:47:00 2001

You cut off the first line, but in any case, this indicates something else is going on.
It says sendbackup on the client got tired (30 seconds) of waiting on dumper on the server side to make the connections on those ports. The next line after "waiting for connect ..." should have been "got all connections".

The sequence of events is:

  * dumper connects to the amandad port on the client
  * they do some security stuff (UDP packets)
  * then dumper tells it to start sendbackup
  * sendbackup starts listening on two or three new ports and sends their numbers (UDP packet) back to dumper on the server
  * dumper connects to those ports (TCP) on the client and data starts to flow

So either dumper never got the list of ports, or the connections it tried to make didn't work. It looks like dumper would log an error if the connection failed, which implies sendbackup never saw it.

If you upgrade at least the server to 2.4.2p2, you'll get messages like this in the amdump.NN file showing that dumper did its part:

  dumper: stream_client: connected to 128.210.10.26.63832
  dumper: stream_client: our side is 0.0.0.0.63834

Upgrading the client would show similar extra detail on that side.

I don't know why this would be happening. Any firewalls or other protection between the two machines that would not allow these ports to go through? Any help from the FreeBSD folks?

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
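To make the last step of that sequence concrete (the server opening TCP connections to ports on the client), here is a minimal Python sketch for checking whether a TCP connection to a given host and port can be made at all, for example when a firewall is suspected. It is not part of Amanda; the function names `can_connect` and `check_ports` are made up for this illustration, and the check only means something while a process is actually listening on the port being tested.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host: str, ports, timeout: float = 5.0) -> dict:
    """Map each port to True/False depending on whether it accepted a connection."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

Run from the server side against a test listener on the client, something like `check_ports("web45.internal", [2622, 2623, 2624])` would show whether those ports are reachable; the host name and port numbers are simply the ones that appear in the log above.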
Re: Strange Amanda Error: sendbackup: index tee cannot write [Broken pipe]
- Original Message -
From: John R. Jackson [EMAIL PROTECTED]
To: Ryan Williams [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, April 30, 2001 7:20 PM
Subject: Re: Strange Amanda Error: sendbackup: index tee cannot write [Broken pipe]

> Below are the two pertinent parts (I think) of an error that is occurring on an everyday basis on this server.
> ...

What version of Amanda? Same version on client and server?

2.4.2 on both client and server.

> ...
> There are 11 other servers that are being backed up just fine but this one just does not want to work.
> ...

So what's different about it? (just kidding :-).

> ...
> Of course it was upgraded because we could not find a network card that worked decently
> ...

Are we talking 100 Mbit Ethernet? Any chance you have a duplex problem between the client and switch? Getting that wrong can cause truly amazingly bad performance.

It is a 100meg card set to autoselect between all of the 100 and 10 base T. The network is a 10baseT. I am not sure if it is full or half duplex. It could be a duplex problem. I believe that our backup network can only support 1/2 duplex.

rl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet 10.0.0.203 netmask 0xff00 broadcast 10.0.0.255
        inet6 fe80::250:baff:fe88:c760%rl0 prefixlen 64 scopeid 0x2
        ether 00:50:ba:88:c7:60
        media: autoselect (none) status: active
        supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex> 10baseT/UTP 100baseTX <hw-loopback>

What do you make of that? I don't see anywhere that it says what it is actually set at. Maybe I should try to manually set the mode on the card.

> web45.inte ad0s1f lev 0 FAILED [data timeout]
> web45.inte ad0s1a lev 0 FAILED [could not connect to web45.internal]
> web45.inte ad0s1e lev 0 FAILED [could not connect to web45.internal]

The first one says Amanda waited 30 minutes between getting started or getting a block of data and then gave up.
I'm guessing the other two failures were because amandad was still running on the client and so the new connections were not allowed.

If you're running 2.4.2 or beyond, you could increase the dtimeout value in amanda.conf, but half an hour to wait on data is a long time. I suspect it means something else is wrong.

1/2 an hour should be plenty of time, I would think.

It would also be useful to go to the client and look at sendbackup*debug in /tmp/amanda, in particular the start and stop time (first and last lines).

/usr/local/libexec/sendbackup: got input request: DUMP ad0s1e 0 1970:1:1:0:0:0 OPTIONS |;bsd-auth;srvcomp-fast;index;
  parsed request as: program `DUMP' disk `ad0s1e' lev 0 since 1970:1:1:0:0:0 opt `|;bsd-auth;srvcomp-fast;index;'
  waiting for connect on 2622, then 2623, then 2624
/usr/local/libexec/sendbackup: timeout on mesg port 2623
/usr/local/libexec/sendbackup: timeout on index port 2624
sendbackup: pid 79500 finish time Tue May 1 01:47:00 2001

The "index tee cannot write [Broken pipe]" stuff is just a symptom of the server giving up on the client. The server shut down the connections and the client was still trying to write, so it got a broken pipe error.

The real problem is why the data stream quit moving. Some other possibilities:

* An **extremely** busy disk that tar has a hard time getting access to.

Judging from a ps auxw, it does not appear that the server is doing an awful lot on the disk. It should not be doing much as it is mostly just a mail server for a small number of users.

* A busy client and you have compression turned on, so it just cannot get enough CPU.

It has a big enough CPU for that, I would believe. It is a PII 350 and, like I said, it does not do much.

* A very large file system that compresses extremely well so tar and gzip are crunching along but not generating any (enough) output.

Small filesystem. The whole server is only about 3-4 gigs used.

* A broken disk that is very slow to respond (e.g. lots of retries).
I don't see any evidence of this on the server.

* Some kind of networking problem, software or hardware. You might try some big ftp put's from the client to /dev/null on the server and see what happens.

Most likely. I was at first under the impression from the output that it was a software problem. I am going to look further into the hardware aspects of it now.

> Ryan Williams

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]

Thanks
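Since the thread ends up pointing at media autoselection, here is a small Python sketch that pulls the configured and active media out of ifconfig output like the rl0 listing quoted in this message. It assumes the FreeBSD-style "media: <configured> (<active>)" layout seen there; the names `MEDIA_RE`, `media_status`, and `uses_autoselect` are invented for this example, and other systems print this information differently.

```python
import re

# Matches "media: autoselect (none)" but not the "supported media:" list.
MEDIA_RE = re.compile(
    r"(?<!supported )media:\s*(?P<configured>[^()\n]*?)\s*\((?P<active>[^)]*)\)"
)


def media_status(ifconfig_text: str):
    """Return (configured, active) from the media line, or None if absent."""
    match = MEDIA_RE.search(ifconfig_text)
    if match is None:
        return None
    return match.group("configured"), match.group("active")


def uses_autoselect(ifconfig_text: str) -> bool:
    """True if the interface is configured to autoselect its media."""
    status = media_status(ifconfig_text)
    return status is not None and "autoselect" in status[0].split()
```

For the listing above, `media_status` would give `("autoselect", "none")`, which matches the observation that the output does not say what the card actually negotiated.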
Strange Amanda Error: sendbackup: index tee cannot write [Broken pipe]
Below are the two pertinent parts (I think) of an error that is occurring on an everyday basis on this server. Amcheck runs just fine; it is only the backup itself that does not work on this server. There are 11 other servers that are being backed up just fine but this one just does not want to work.

I recently upgraded the OS from FreeBSD 2.2.7 to FreeBSD 4.2. This is when the error started occurring. The upgrade kind of botched itself when it ran, but I was able to recover most of the server. Of course it was upgraded because we could not find a network card that worked decently to do a backup with Amanda, so we did not have a backup of the server. (3c509's work terribly in FreeBSD, especially under a load like Amanda; even the fix that we found on one of the mailing lists did not fix the problem.)

Any help would be appreciated. Also, let me know if there is any further information that I could give pertaining to this backup.

FAILURE AND STRANGE DUMP SUMMARY:
  web45.inte ad0s1f lev 0 FAILED [data timeout]
  web45.inte ad0s1a lev 0 FAILED [could not connect to web45.internal]
  web45.inte ad0s1e lev 0 FAILED [could not connect to web45.internal]

/-- web45.inte ad0s1f lev 0 FAILED [data timeout]
sendbackup: start [web45.internal:ad0s1f level 0]
sendbackup: info BACKUP=/usr/bin/tar
sendbackup: info RECOVER_CMD=/usr/bin/tar -f... -
sendbackup: info end
? sendbackup: index tee cannot write [Broken pipe]
? index returned 1
sendbackup: error [/usr/bin/tar got signal 13]
\

Regards,
Ryan Williams
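When several machines report failures, it can help to pull the FAILED lines out of the report and group them by reason. The following Python sketch does that for lines shaped like the ones in the summary above. The `Failure` tuple and the function names are invented for this illustration, and the pattern is only inferred from the three lines shown here, not from any description of Amanda's report format.

```python
import re
from collections import Counter
from typing import List, NamedTuple


class Failure(NamedTuple):
    host: str
    disk: str
    level: int
    reason: str


# Pattern inferred from lines such as:
#   web45.inte ad0s1f lev 0 FAILED [data timeout]
FAILED_RE = re.compile(
    r"(?P<host>\S+)\s+(?P<disk>\S+)\s+lev\s+(?P<level>\d+)\s+FAILED\s+\[(?P<reason>[^\]]*)\]"
)


def parse_failures(report_text: str) -> List[Failure]:
    """Extract every FAILED entry found in the report text."""
    return [
        Failure(m["host"], m["disk"], int(m["level"]), m["reason"])
        for m in FAILED_RE.finditer(report_text)
    ]


def count_by_reason(failures: List[Failure]) -> Counter:
    """Count how many failures share each reason string."""
    return Counter(f.reason for f in failures)
```

Applied to the summary above, this would separate the single "data timeout" from the two "could not connect" entries, which is the distinction the replies in this thread turn on.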
spam messages in the amanda-user mailing list
There are now daily spam messages about toner supplies going to the amanda mailing list. This is a big annoyance. Please do something to prevent such a thing from happening again. If needed, I can provide the headers and the emails that I received.

Regards,
Ryan Williams
Re: spam messages in the amanda-user mailing list
I do not have any experience with majordomo, but I know that with mailman it is easy to do. If nothing else, there could be a line put into the sendmail access file that says REJECT bad.relay.mailserver.mx, or just make sure that the person is subscribed before they can send email to the list. Like I said though, I do not run majordomo and I am unfamiliar with how it works. I am just assuming that this can be done.

Regards,
Ryan Williams

- Original Message -
From: "Jonathan Dill" [EMAIL PROTECTED]
To: "Ryan Williams" [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Monday, February 19, 2001 3:11 PM
Subject: Re: spam messages in the amanda-user mailing list

Ryan Williams wrote:
> There are now daily spam messages about toner supplies going to the amanda mailing list. This is a big annoyance. Please do something to prevent such a thing from happening again. If needed, I can provide the headers and the emails that I received.

I'm mad about it too, but what makes you think that John R. Jackson can do anything about it, or anybody else subscribed to the list for that matter? Majordomo doesn't have much capabilities for spam filtering as far as I know. Personally, I'd like to see the list run on mailman rather than majordomo.

The list appears to be hosted on surly.omniscient.com. I'm going to inquire about helping out with the admin of the server and some other options. Until then, I'd suggest getting a real e-mail client like Netscape Messenger and using a message filter on "Sender-Contains" and "toner" then "Move To-Spam" or something like that.

--
"Jonathan F. Dill" ([EMAIL PROTECTED])
CARB Systems and Network Administrator
Home Page: http://www.umbi.umd.edu/~dill
Re: Configuration question
Would it be feasible to do one of the following two things?

Keep the old files in a different directory and only back up certain directories. Or possibly gzip the old ones so that they aren't so large. I usually find that gzip will compress dumped database files to about a tenth of their original size.

or

Just let Amanda work the way it normally does, and it will only back up the ones that it does not already have. If you have to keep the databases that are X days old, then every 10 days it would back all of them up and you would not have to worry about overwriting your old tapes with new ones. That way you would always have one of every database on tape and one on the partition.

Regards,
Ryan Williams

- Original Message -
From: "Wood, David" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, February 14, 2001 8:46 PM
Subject: Configuration question

Hi,

I have a filesystem "/sybase_dumps". Each night a script dumps Sybase databases to this filesystem and the files are quite large. There is a purge script which cleans out the filesystem of files more than X days old - we need to keep X days worth of dump files on-line.

Basically, I only want Amanda to back up new dump files (as much as possible) within this filesystem. Does anyone have advice on how to configure Amanda for this filesystem?

I've thought about using the 'incronly' strategy; however, the usefulness of this strategy breaks down when dump levels reach 9. I've thought about switching to GNUTAR and tinkering with the source to have DUMP_LEVELS = 2147483647 (size of int); however, I'm afraid of breaking other parts of the code (will I?). I imagine the bump parameters are going to give me grief too.

Thanks for any advice ...

David
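The suggestion to gzip the older dump files could be scripted along the following lines. This is only a sketch under assumptions: the function name `gzip_old_files` is invented, it treats every regular file in the directory that does not already end in ".gz" as a dump file, and it keeps the original modification time on the compressed copy so that an age-based purge script would still see the right age.

```python
import gzip
import os
import shutil
import time
from pathlib import Path


def gzip_old_files(directory, min_age_days, now=None):
    """Gzip regular files older than min_age_days; return the new .gz paths."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    compressed = []
    for path in sorted(Path(directory).iterdir()):
        # Skip directories and files that are already compressed.
        if not path.is_file() or path.suffix == ".gz":
            continue
        stat = path.stat()
        if stat.st_mtime >= cutoff:
            continue  # too recent, leave it alone
        target = path.with_name(path.name + ".gz")
        with open(path, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        # Preserve timestamps so age-based purging keeps working.
        os.utime(target, (stat.st_atime, stat.st_mtime))
        path.unlink()
        compressed.append(target)
    return compressed
```

A nightly job could call `gzip_old_files("/sybase_dumps", 1)` after the dump script finishes. Whether compressing in place suits the restore procedure is a separate question that this sketch does not address.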
Re: amanda mailing list
Perhaps it would help the two of you. Perhaps you can explain why the rest of us should be inconvenienced because you can't spare the time to learn to use the right tool for the right job? 200 m/day might be a burden for you. For many of us that's a light day. The list messages are already tagged in the Sender: header. Please learn to use it.

Who is to say that we would be the ones inconveniencing others? If everyone but you wants this, you would be inconveniencing us by asking that it not be done. I was just putting out an idea when I suggested this, to get some feedback and to see what other people thought about it. We all know your opinion about the matter now; how about we hear some other people's opinions?
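For anyone who wants to try the Sender: header approach mentioned above, here is a minimal Python sketch of such a filter using the standard email module. The address in `LIST_SENDER` is an assumption made for the example (check the Sender: header of a real list message for the actual value), and `is_list_mail` is an invented name.

```python
from email import message_from_string

# Assumed value; look at a real list message to find the actual Sender address.
LIST_SENDER = "owner-amanda-users@amanda.org"


def is_list_mail(raw_message: str, list_sender: str = LIST_SENDER) -> bool:
    """Return True if the message's Sender: header names the list address."""
    msg = message_from_string(raw_message)
    sender = msg.get("Sender", "")
    # Header values may include a display name, so match on the address part.
    return list_sender.lower() in sender.lower()
```

A mail client or delivery script could use the result to file list traffic into its own folder, which gives much the same effect as a subject tag without changing the list configuration.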
amanda mailing list
Just a little pet peeve I would like to ask about. Would it be possible to put an [amanda-users] tag in the subject of everything sent to the mailing list? I know that mailman is capable of this, but I am not sure of the capabilities of majordomo.

Regards,
Ryan Williams
Re: Amanda Client on FreeBSD
- Original Message -
From: "John R. Jackson" [EMAIL PROTECTED]
To: "Ryan Williams" [EMAIL PROTECTED]
Cc: "Shawn M. Green" [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Tuesday, February 06, 2001 12:06 AM
Subject: Re: Amanda Client on FreeBSD

Like I said, not an expert. The first 4 times reading the book were just to understand the concepts (dumpcycles, tapecycles) :), and the 5th time I installed it.

> First off, I just want to start this out saying that I am not an expert at this by far but I have read the chapter from backup central about 5 times ...

As far as I'm concerned, that makes you an "expert" :-).

> In freebsd you must add amanda to the group operators. ...

That depends on whether you are using dump or GNU tar and what group has read access to the raw disk devices. GNU tar doesn't need special group membership because it runs under a setuid-root wrapper. For dump, you either need to put Amanda in the group that owns the devices, or change the group of the devices to something Amanda is a member of (possibly a brand new Amanda only group).

> You don't use the directories when telling it what to backup ...

I prefer to use the logical (mount point) names rather than the disk names. I've moved data around too often in the past and prefer the extra level of indirection. But this is personal preference. Amanda can handle either. For GNU tar it will convert a disk name to the mount point. For dump it will convert a mount point to the disk (assuming, in either case, your /etc/fstab or the equivalent is correct).

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
server crashes
I just implemented Amanda to back up about 12 clients, and so far it has run twice. Each time amdump runs on the server, one of the clients (a FreeBSD 4.0 box) completely hangs without an error in /var/log/messages. I have not found any logs to tell me what broke when I ran a backup. Does anyone have an idea about what is wrong or what I should check?

Thanks in advance.
Ryan Williams
gzip
I have 2 questions relating to gzip.

1. I have all of my gzips set to fast instead of best, but whenever amdump is running there will be a gzip --fast and a gzip --best for every file that is in my holding disk. What are the reasons behind this?

2. Quoting a colocation facility's website: "We use bzip2 instead of gzip for data compression. Unlike gzip, bzip2 compresses data in blocks, which means that in the unlikely event that a small part of the backup is corrupted, only the affected block is lost. All other data is still recoverable." Is this true, and if so, is there a way to use bzip2 instead of gzip? Has anyone ever looked into this?

Thanks,
Ryan Williams
Re: Amanda Client on FreeBSD
First off, I just want to start this out saying that I am not an expert at this by far, but I have read the chapter from backup central about 5 times so that I would understand what I am doing. Please correct me if I am wrong about anything here. I just recently installed Amanda on about 15 FreeBSD boxes so I feel I could do it in my sleep now :).

In FreeBSD you must add amanda to the group operators. (Edit /etc/group and after root on the group operators put ,amanda so it looks like root,amanda.)

You don't use the directories when telling it what to back up, at least not in my experience. You would have to be root to do so even if it was allowed (I think). Type `df` at a prompt and it will tell you all of your partitions. It should come out something like the following:

Filesystem   1K-blocks     Used    Avail Capacity  Mounted on
/dev/ad0s1a      49583    36173     9444    79%    /
/dev/ad0s1f    5425485  3993979   997468    80%    /usr
/dev/ad0s1e     496111   323299   133124    71%    /var
procfs               4        4        0   100%    /proc

If you want to back up the /usr partition, you would put ad0s1f from that example into your disklist and not /usr.

- Original Message -
From: "John R. Jackson" [EMAIL PROTECTED]
To: "Shawn M. Green" [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, February 05, 2001 11:18 PM
Subject: Re: Amanda Client on FreeBSD

> I have an Amanda 2.4.2 client set to backup the /home of a FreeBSD client. When I run amcheck on the index server (Red Hat 6.2), it comes back saying 'permission denied' to access the /home partition.

I assume what you mean is it said permission denied trying to access something or other in /dev, right?

> The partition is owned by root.wheel, 755 perms ...

Again, I assume you mean the /dev for the partition? Any chance one of the parent directories is blocking access?

> and dump is suid on the FreeBSD box. ...

Which is not relevant. It is only setuid because of the insanely stupid way they (and most dump vendors) start the rmt protocol.
It drops its permissions right after startup, so has no more access than the calling (Amanda) user.

> User amanda has been added to the wheel group.

So if you run something like this on the client:

  su amanda-user -c "dump 9f - /home > /dev/null"

(whatever your dump program is called) does it work?

Is Amanda using dump or GNU tar?

Are you running xinetd on the client? Did you use "groups yes" in the amandad entry so xinetd gives the child all the alternate groups?

Did amcheck report the correct device that you think maps to /home? In other words, is /etc/fstab (or whatever your system uses) correct?

If none of this helps, please post the exact messages from amcheck and a "ls -lL" of the items it says have a permission problem.

> Shawn

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
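The df lookup described in this message (finding the device name that belongs to a mount point) is easy to script. Below is a minimal Python sketch that reads df-style output and returns the bare device name for a given mount point. The function names are invented for this example, and it assumes the six-column layout shown in the df listing above, with mount points that contain no spaces. As noted elsewhere in this thread, Amanda is said to accept either the mount point or the disk name, so this only illustrates the mapping.

```python
def mount_to_device(df_output: str) -> dict:
    """Map each mount point to its device, given df-style output text."""
    mapping = {}
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # not a complete df row
        device, mount_point = fields[0], fields[5]
        mapping[mount_point] = device
    return mapping


def disklist_name(df_output: str, mount_point: str) -> str:
    """Return the device name without the /dev/ prefix for a mount point."""
    device = mount_to_device(df_output)[mount_point]
    prefix = "/dev/"
    return device[len(prefix):] if device.startswith(prefix) else device
```

With the listing from this message, `disklist_name(df_text, "/usr")` would return `ad0s1f`, the name suggested for the disklist.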
Re: server crashes
One thing on that server that I find kind of hard to understand is that at a different time of the day, every day, the whole server is dumped to a second hard drive with the same vendor dump program. I took out the -u option so it did not use /etc/dumpdates.

I am more apt to believe that it is a driver problem with the ethernet card because I am having a problem on two other machines with that same NIC. They are all 3c509 NICs, but the weird thing is that on the other ones, when a backup is run, they just stop responding on that NIC. If I try to ping out that NIC it gives an error of

  ping: sendto: No buffer space available

I have been searching places like Google and it seems like this may be linked to a bad driver, because I found people that had the same driver and the same error. I don't know why that one would be freezing instead of just breaking that NIC though.

- Original Message -
From: "John R. Jackson" [EMAIL PROTECTED]
To: "Ryan Williams" [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, February 05, 2001 11:42 PM
Subject: Re: server crashes

> ... Each time amdump runs on the server, one of the clients (a freebsd 4.0 box) completely hangs without an error in the /var/log/messages. I have not found any logs to tell me what broke when I ran a backup. Does anyone have an idea about what is wrong or what I should check?

This is all guesswork because I don't run FreeBSD, but that's never stopped me from shooting off my E-mouth before :-):

* It could be any number of hardware problems, including the disk being backed up, the cable, the controller, seating of any of the things that move, bad termination, bad option switch settings, speed mismatches (SCSI-2 on a SCSI-1 bus), bad firmware, bad DMA control to/from memory, bad memory, etc. It could also be any of the above with the network interface rather than the disk.

* It could be a kernel problem dealing with a hardware problem, or a just plain kernel/driver bug.
* It is unlikely to be a problem with the dump program or any portion of Amanda. Those all run in normal user space and usually with minimal special privileges.

Probably the first few things I'd try are reseating everything that moves and checking all the DIP switches and jumpers. Twice. Then once more.

Then I'd try a few dummy dumps (are you using dump or GNU tar?) along these lines (adjust as needed for your OS):

  dump 0f - /some/file/system > /dev/null
  dump 9f - /some/file/system > /dev/null

If you have maxdumps set greater than one, you might try two (or more) of these at the same time to add even more contention.

You might also try some large (dump image sized) ftp transfers from the client to /dev/null on the server.

Unless you happen to find a FreeBSD expert here (which is certainly likely), my guess is you'll need to go to those mailing lists to get much more help with this. They might be able to tell you, for instance, how to get into the machine when it is hung and find out exactly what processes are running, what the kernel is doing, etc.

> Ryan Williams

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]