Re: Solaris 7, ufsdump - (very occasional) system hang

2001-07-15 Thread John R. Jackson

I've had two instances this year (the last one just this week) on one
of my backup clients of a file system on that client becoming locked
seemingly due to Amanda's estimate run.
...
Mail doesn't go through and processes get stuck and virtual memory fills up.

When I manage to get into the machine I've found both times that
there are a number of amanda processes running - sorry I forgot to grab
the exact ps output but I'm pretty sure sendsize and killpgrp were
there.  When I kill these off the system gets _real_ busy for
a while as it catches up with things but then settles down.  ...

That's not a lot to go on, but if you saw killpgrp running, that's such
a trivial program my guess is that one of the ufsdump processes is hung
(e.g. in disk wait) and cannot be killed, which sends you down that path.
Then either something loosens up, or just killing the killpgrp process
(or sendsize) lets Amanda continue on.  It could be you still have
ufsdump (or even killpgrp) processes stranded.

In any case, this sounds like an OS hang problem rather than an Amanda
issue.

While I understand perfectly the "make it stop hurting" feeling :-),
next time it happens (if there is one), see if you can grab a ps -lu
amanda and ps -fu amanda.  Then kill things off one at a time
until it starts going.  Start with the ufsdump processes, and start with
the bottom of that chain.  You might also watch and see if kill has
any effect on what you hit.

It might also be useful to grab a copy of /tmp/amanda/killpgrp*debug
**before** you start killing things to see how far it got before it hung.
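
The capture steps above could be wrapped in a small helper so nothing is
forgotten in the heat of the moment.  This is only a sketch: the function
name, the default user "amanda" and the default output directory are
assumptions for illustration, not anything Amanda ships.

```shell
# Hypothetical helper: snapshot Amanda process state and killpgrp debug files
# before any process is killed.
# Arguments (all defaults are assumptions for this sketch):
#   $1  user Amanda runs as            (default: amanda)
#   $2  directory with the debug files (default: /tmp/amanda)
#   $3  directory to save into         (default: /var/tmp/amanda-hang.<pid>)
snapshot_amanda_state() {
    user=${1:-amanda}
    debug_dir=${2:-/tmp/amanda}
    out_dir=${3:-/var/tmp/amanda-hang.$$}

    mkdir -p "$out_dir" || return 1

    # Long and full listings.  ps may fail (for example if the user does
    # not exist); record whatever it prints and carry on.
    ps -lu "$user" > "$out_dir/ps-lu.txt" 2>&1 || true
    ps -fu "$user" > "$out_dir/ps-fu.txt" 2>&1 || true

    # Copy the killpgrp debug files, if any, so later kills cannot change them.
    for f in "$debug_dir"/killpgrp*debug; do
        [ -f "$f" ] && cp -p "$f" "$out_dir/"
    done

    # Tell the caller where the snapshot went.
    echo "$out_dir"
}
```

Calling snapshot_amanda_state with no arguments uses the defaults; run it
first, then start killing processes one at a time.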

Does anyone recognise these symptoms?  Any ideas on whether it's an Amanda
problem (which might go away if I update my installation to 2.4.2p2 which
I should probably do anyway) or something to do with ufsdump?

We run Solaris 2.6, 2.7 and 2.8 here and I don't think we've ever seen
this.  You might also want to make sure you're up to the latest Sun
patches both for the kernel/OS and ufsdump/ufsrestore.

Also, there are essentially no changes to killpgrp since 2.4.2 (just
a couple of minor message formatting things), so I don't know that an
upgrade will help with this particular problem.

Paul

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]



Re: no index

2001-07-15 Thread John R. Jackson

Here's what I got when I gunzip'd the first few lines of one of your
index files:

  07323430437/./
  07323430437/./qbopt.C141/
  07323430437/./qbopt.C141/ccxx/
  07323430437/./qbopt.C141/ccxx/base/
  07323430437/./qbopt.C141/ccxx/base/obj/
  07323430437/./qbopt.C141/ccxx/base/obj/gcc/
  07323430437/./qbopt.C141/ccxx/base/obj/gcc/dbg/
  07323430437/./qbopt.C141/ccxx/base/obj/gcc/opt/
  07323430437/./qbopt.C141/ccxx/cnm/
  07323430437/./qbopt.C141/ccxx/cnm/obj/

Those big numbers on the front of the lines are a symptom of a broken
version of GNU tar.  Several versions of 1.13 have had serious problems.
You should get 1.13.19 from alpha.gnu.org or go back to 1.12 (plus the
Amanda patches from www.amanda.org).
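
A quick way to test other index files for the same symptom might look
like the sketch below.  It assumes a healthy index line starts with the
path itself rather than a run of digits; the function name, the 20-line
sample and the six-digit threshold are arbitrary choices for illustration.

```shell
# Hypothetical check: succeed (exit 0) if the first lines of a gzipped index
# file start with a run of digits before the path, as in the listing above.
index_looks_broken() {
    # Decompress, sample the first 20 lines, and look for "digits/" at the
    # start of a line.  A line that begins with the path does not match.
    gzip -dc "$1" 2>/dev/null | head -n 20 | grep -Eq '^[0-9]{6,}/'
}
```

For example, "index_looks_broken some-index.gz && echo suspect" flags a
file worth looking at by hand (the file name here is a placeholder).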

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]



Re: Appending incrementals

2001-07-15 Thread Marc SCHAEFER

John R. Jackson [EMAIL PROTECTED] wrote:
 Again, future work may allow this, but only after you've aced an intensive

I implemented tape appending about one or two years ago. Your
warnings however still apply :)




RE: Problem starting Amanda

2001-07-15 Thread Desmond Lim

Sir,

I get these errors in my log:

START planner date 20010716
INFO planner Adding new disk chino.master.com:/.
START driver date 20010716
START taper datestamp 20010716 label Daily000 tape 0
FAIL planner chino.master.com / 0 [Request to chino.master.com timed out.]
FINISH planner date 20010716
WARNING driver WARNING: got empty schedule from planner
STATS driver startup time 29.993
INFO taper tape Daily000 kb 0 fm 0 [OK]
FINISH driver date 20010716 time 33.530
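
In a longer log it can help to pull out just the problem records.  A
minimal sketch, assuming the record type is always the first field as in
the lines above (the function name is made up for this example):

```shell
# Hypothetical helper: print only the FAIL and WARNING records of an Amanda
# log file such as the one quoted above.
amanda_log_problems() {
    # The first whitespace-separated field is the record type.
    awk '$1 == "FAIL" || $1 == "WARNING"' "$1"
}
```

Run against the log above, this would leave only the planner FAIL line
and the driver WARNING line.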

On top of that, I have changed my amanda.conf and tapelist files.

Thanks very much.


-----Original Message-----
From: John R. Jackson [mailto:[EMAIL PROTECTED]]
Sent: Sunday, July 15, 2001 12:49 AM
To: Desmond Lim
Cc: [EMAIL PROTECTED]
Subject: Re: Problem starting Amanda

After I run the command amdump Daily, I hear the tape drive moving and
see the tape light on the drive blinking for a while, then it stops. It
seems that Amanda never starts the backup process. ...

What version of Amanda are you using?

Are there any Amanda processes still running, either on the server or
any of the clients listed in your disklist?

Another of your comments implies you used localhost in your disklist.
I always recommend you not do this.  Use the fully qualified host name
of the machine instead.
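
To find such entries, something like the following sketch could be run
against the disklist.  It assumes the usual one-entry-per-line layout
(host, disk, dumptype) with '#' starting a comment; the function name is
hypothetical and multi-line entries are not handled.

```shell
# Hypothetical check: list disklist entries whose host field is "localhost"
# (or localhost.<something>), with file name and line number.
disklist_localhost_entries() {
    awk '
        { sub(/#.*/, "") }    # drop comments; awk re-splits the fields
        $1 == "localhost" || $1 ~ /^localhost\./ {
            print FILENAME ":" FNR ": " $0
        }
    ' "$1"
}
```

Any line it prints is a candidate for replacing the host field with the
fully qualified host name.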

Did you kill the amdump or did it terminate by itself?

Did you get an E-mail report?  According to your amanda.conf, it would
have gone to address amanda.

If the report was lost and you want to regenerate it, cd to the directory
with the log.YYYYMMDD.NN files (should be /usr/adm/amanda/Daily according
to your amanda.conf) and run amreport again.  You can either change the
mailto entry in amanda.conf to have it mailed again, or you can use the -f
option and have amreport write the letter to a file instead of mailing it.
If the most recent log file is log.20010714.1, it would look like this:

  amreport Daily -l log.20010714.1 -f /tmp/amanda-report

Make sure you run this as the Amanda user (in particular, do **not**
run it as root).
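
If you are not sure which log file is the most recent, a sketch like this
one picks it by modification time and prints the matching command.  The
function name and its arguments are assumptions for illustration; it only
prints the command and does not run amreport.

```shell
# Hypothetical helper: print (not run) an amreport command for the most
# recently modified log.* file in a log directory.
# Arguments: $1 config name, $2 log directory, $3 report output file.
amreport_command_for_latest() {
    config=$1
    log_dir=$2
    report=${3:-/tmp/amanda-report}

    # Newest log file by modification time (assumes ordinary file names).
    latest=$(cd "$log_dir" && ls -t log.* 2>/dev/null | head -n 1)
    if [ -z "$latest" ]; then
        echo "no log.* files in $log_dir" >&2
        return 1
    fi

    echo "amreport $config -l $latest -f $report"
}
```

For example, "amreport_command_for_latest Daily /usr/adm/amanda/Daily"
prints a command to run from that directory, as the Amanda user.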

I ran amcheck -lt Daily and there's no error.

It's good you used amcheck to look for problems.  But you should probably
run it without any flags, e.g. amcheck Daily.  With -lt you did not
check anything on the clients (even localhost is a client as far as
Amanda is concerned), and that could be where the problem is.

Also after I run amstatus Daily after amdump Daily, I get this error:
/usr/adm/Amanda/Daily/amdump: No such file or directory at
/usr/local/sbin/amstatus line 103

Amstatus is usually used while amdump is running.  If you use it after
amdump is done, you need to tell it what file to process.  For instance,
amstatus Daily --stats --summary --file amdump.1.  The amdump.1 file
is always the most recent amdump run (amdump.2 is the second most recent
and so on).
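
The choice between the two forms can be sketched as below.  It assumes
that a file named amdump exists in the log directory only while a run is
in progress (which is what the error message above suggests); the
function name is hypothetical and the command is printed, not executed.

```shell
# Hypothetical helper: print the amstatus command suited to the state of
# the log directory.
# Arguments: $1 config name, $2 log directory.
amstatus_command() {
    config=$1
    log_dir=$2

    if [ -f "$log_dir/amdump" ]; then
        # A run appears to be in progress: the default file is there.
        echo "amstatus $config"
    elif [ -f "$log_dir/amdump.1" ]; then
        # No run in progress: point amstatus at the most recent finished run.
        echo "amstatus $config --stats --summary --file amdump.1"
    else
        echo "no amdump or amdump.1 file in $log_dir" >&2
        return 1
    fi
}
```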

Desmond Lim

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
