Re: Solaris 7, ufsdump - (very occasional) system hang
> I've had two instances this year (the last one just this week) on one of my backup clients of a file system on that client becoming locked, seemingly due to Amanda's estimate run. ... Mail doesn't go through, processes get stuck and virtual memory fills up. When I manage to get into the machine I've found both times that there are a number of amanda processes running - sorry, I forgot to grab the exact ps output, but I'm pretty sure sendsize and killpgrp were there. When I kill these off the system gets _real_ busy for a while as it catches up with things but then settles down. ...

That's not a lot to go on, but if you saw killpgrp running, that's such a trivial program my guess is that one of the ufsdump processes is hung (e.g. in disk wait) and cannot be killed, which sends you down that path. Then either something loosens up, or just killing the killpgrp process (or sendsize) lets Amanda continue on. It could be you still have ufsdump (or even killpgrp) processes stranded.

In any case, this sounds like an OS hang problem rather than an Amanda issue.

While I understand perfectly the "make it stop hurting" feeling :-), next time it happens (if there is one), see if you can grab a "ps -lu amanda" and "ps -fu amanda". Then kill things off one at a time until it starts going. Start with the ufsdump processes, and start with the bottom of that chain. You might also watch and see if kill has any effect on what you hit.

It might also be useful to grab a copy of /tmp/amanda/killpgrp*debug **before** you start killing things to see how far it got before it hung.

> Does anyone recognise these symptoms? Any ideas on whether it's an Amanda problem (which might go away if I update my installation to 2.4.2p2, which I should probably do anyway) or something to do with ufsdump?

We run Solaris 2.6, 2.7 and 2.8 here and I don't think we've ever seen this. You might also want to make sure you're up to the latest Sun patches both for the kernel/OS and ufsdump/ufsrestore.
Also, there are essentially no changes to killpgrp since 2.4.2 (just a couple of minor message formatting things), so I don't know that an upgrade will help with this particular problem.

> Paul

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
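The advice above to save /tmp/amanda/killpgrp*debug before killing anything could be scripted roughly as follows. This is only a sketch in Python and is not part of Amanda: the function name snapshot_debug_files, the /var/tmp destination and the "amanda-hang-" directory prefix are all hypothetical choices for illustration.

```python
import glob
import os
import shutil
import time


def snapshot_debug_files(debug_dir="/tmp/amanda", pattern="killpgrp*debug",
                         dest_root="/var/tmp"):
    """Copy matching debug files into a fresh timestamped directory.

    Returns the list of copied file paths (empty if nothing matched).
    The default paths other than /tmp/amanda are illustrative assumptions.
    """
    matches = sorted(glob.glob(os.path.join(debug_dir, pattern)))
    if not matches:
        return []
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = os.path.join(dest_root, "amanda-hang-" + stamp)
    os.makedirs(dest, exist_ok=True)
    copied = []
    for path in matches:
        # copy2 keeps modification times, which can hint at when things stalled
        copied.append(shutil.copy2(path, dest))
    return copied


if __name__ == "__main__":
    for saved in snapshot_debug_files():
        print("saved", saved)
```

Run it as the Amanda user before the first kill, so the debug files still reflect the hung state.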
Re: no index
Here's what I got when I gunzip'd the first few lines of one of your index files:

  07323430437/./
  07323430437/./qbopt.C141/
  07323430437/./qbopt.C141/ccxx/
  07323430437/./qbopt.C141/ccxx/base/
  07323430437/./qbopt.C141/ccxx/base/obj/
  07323430437/./qbopt.C141/ccxx/base/obj/gcc/
  07323430437/./qbopt.C141/ccxx/base/obj/gcc/dbg/
  07323430437/./qbopt.C141/ccxx/base/obj/gcc/opt/
  07323430437/./qbopt.C141/ccxx/cnm/
  07323430437/./qbopt.C141/ccxx/cnm/obj/

Those big numbers on the front of the lines are a symptom of a broken version of GNU tar. Several versions of 1.13 have had serious problems. You should get 1.13.19 from alpha.gnu.org or go back to 1.12 (plus the Amanda patches from www.amanda.org).

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
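A quick way to check other index files for the same symptom might look like the sketch below. It is not an Amanda tool. It assumes that a healthy index line starts with "/" and that the broken GNU tar output always has the shape shown above (a run of digits followed by "/./"); the function names are made up for this example.

```python
import gzip
import re

# Assumption: broken lines look like "07323430437/./path/", i.e. digits + "/./".
BROKEN_LINE = re.compile(r"^\d+/\./")


def count_broken_index_lines(lines):
    """Return how many lines carry the numeric prefix seen in the broken index."""
    return sum(1 for line in lines if BROKEN_LINE.match(line))


def check_index_file(path, limit=100):
    """Scan the first `limit` lines of a gzip'd index file; return the broken count."""
    with gzip.open(path, "rt", errors="replace") as handle:
        head = [line for _, line in zip(range(limit), handle)]
    return count_broken_index_lines(head)
```

A non-zero result from check_index_file on a recent index suggests the client is still running one of the affected tar versions.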
Re: Appending incrementals
John R. Jackson [EMAIL PROTECTED] wrote:
> Again, future work may allow this, but only after you've aced an intensive

I implemented tape appending about one or two years ago. Your warnings, however, still apply :)
RE: Problem starting Amanda
Sir,

I get these errors in my log:

  START planner date 20010716
  INFO planner Adding new disk chino.master.com:/.
  START driver date 20010716
  START taper datestamp 20010716 label Daily000 tape 0
  FAIL planner chino.master.com / 0 [Request to chino.master.com timed out.]
  FINISH planner date 20010716
  WARNING driver WARNING: got empty schedule from planner
  STATS driver startup time 29.993
  INFO taper tape Daily000 kb 0 fm 0 [OK]
  FINISH driver date 20010716 time 33.530

On top of that, I have changed my amanda.conf and tapelist file.

Thanks very much.

-----Original Message-----
From: John R. Jackson [mailto:[EMAIL PROTECTED]]
Sent: Sunday, July 15, 2001 12:49 AM
To: Desmond Lim
Cc: [EMAIL PROTECTED]
Subject: Re: Problem starting Amanda

> After I run the command amdump Daily, I hear the tape drive moving and the tape light on the drive blinking for a while, then it stopped. It seems that Amanda never starts the backup process. ...

What version of Amanda are you using?

Are there any Amanda processes still running, either on the server or any of the clients listed in your disklist?

Another of your comments implies you used localhost in your disklist. I always recommend you not do this. Use the fully qualified host name of the machine instead.

Did you kill the amdump or did it terminate by itself?

Did you get an E-mail report? According to your amanda.conf, it would have gone to address "amanda". If the report was lost and you want to regenerate it, cd to the directory with the log.MMDD.NN files (should be /usr/adm/amanda/Daily according to your amanda.conf) and run amreport again. You can either change the mailto entry in amanda.conf to have it mailed again, or you can use the -f option and have amreport write the letter to a file instead of mailing it. If the most recent log file is log.20010714.1, it would look like this:

  amreport Daily -l log.20010714.1 -f /tmp/amanda-report

Make sure you run this as the Amanda user (in particular, do **not** run it as root).
> I ran amcheck -lt Daily and there's no error.

It's good you used amcheck to look for problems. But you should probably run it without any flags, e.g. "amcheck Daily". With -lt you did not check anything on the clients (even localhost is a client as far as Amanda is concerned), and that could be where the problem is.

> Also, after I run amstatus Daily after amdump Daily, I get this error:
>
>   /usr/adm/Amanda/Daily/amdump:No such file or directory at /usr/local/sbin/amstatus line 103

Amstatus is usually used while amdump is running. If you use it after amdump is done, you need to tell it what file to process. For instance, "amstatus Daily --stats --summary --file amdump.1". The amdump.1 file is always the most recent amdump run (amdump.2 is the second most recent and so on).

> Desmond Lim

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]

amanda.conf disklist