yes, /var/mail does tend to be a problem. My mail.log can get to a
quarter million lines a day, but that's not the same as getting that
number of messages. Typically, my level 1 backups aren't much different
from the level 0, because basically everything changes every day. I get
by with fssnap. That isn't the same as quiescing the system, but it
reduces exposure to issues to the time it takes to do the snapshot. The
thing is, with an active mail system you need a place for the backing
store that is nearly as large as the mail partition, because things are
going to change while you are backing up, so they will have to get copied.
If your mail admin isn't cooperative, ask your mail admin, or your mail
admin's boss, whether they care about having backups. Ask them if it is
alright not to have a backup. You could even ask them if there are any
legal requirements relating to discovery, etc. (NYS Dept. of Health?
I'll bet there are!) That might get you in deeper than you want, but
hey, it should get some attention.
For details on how I do it (on Solaris 9), see
http://wiki.zmanda.com/index.php/Backup_client#Chris_Hoogendyk.27s_Example.
If they would allow it, you could set up a sudo entry for the amanda
user to stop sendmail, take a snapshot, and start sendmail again (or
whatever mail software you are using). It would be sort of like what I
have for stopping xntpd to snapshot the root partition, but you would
special case it for the /var/mail partition. Presumably, they have the
/etc/init.d/sendmail or similar script properly set up so that you won't
be in danger of screwing things up. Presumably, they would work with you
to get it going.
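A minimal sketch of the stop, snapshot, restart sequence described above, written in Python purely for illustration. The sudo prefix, the backing-store path /snap/varmail.bs, and the function name are assumptions and not part of any existing setup; the fssnap options follow the documented `-F ufs -o backing-store=...` form.

```python
import subprocess

# All paths below are hypothetical; adjust them for the real system.
STOP_MAIL = ["sudo", "/etc/init.d/sendmail", "stop"]
START_MAIL = ["sudo", "/etc/init.d/sendmail", "start"]
SNAPSHOT = [
    "sudo", "/usr/sbin/fssnap", "-F", "ufs",
    "-o", "backing-store=/snap/varmail.bs",  # assumed backing-store location
    "/var/mail",
]


def snapshot_with_mail_stopped(run=subprocess.check_call):
    """Stop mail, take the snapshot, and restart mail even if the snapshot fails.

    `run` is injectable so the sequence can be exercised without touching
    a real mail server. If stopping mail fails, nothing else is attempted.
    """
    run(STOP_MAIL)
    try:
        run(SNAPSHOT)
    finally:
        # Mail is restarted whether or not the snapshot succeeded.
        run(START_MAIL)
```

Calling snapshot_with_mail_stopped() from a wrapper that runs before the dump keeps the mail outage down to the few seconds the snapshot takes. The try/finally is the important part: mail delivery comes back even when fssnap fails.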
---------------
Chris Hoogendyk
-
O__ ---- Systems Administrator
c/ /'_ --- Biology & Geology Departments
(*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst
<hoogen...@bio.umass.edu>
---------------
Erdös 4
Brian Cuttler wrote:
Amanda users,
For the issue below, we see several hundred thousand emails
move through the system each day. ufsdump is failing because
too many files seem to come and go; it asks whether to "continue" but can't
get a reply (I don't know of a way to give it one anyway).
We tried to switch the problem DLE to gtar - but the estimate phase
seems to take hours to run. I haven't set etimeout high enough to get
an estimate yet, and this will push actual dumps back by hours.
Is there a workaround I can employ, either to get quicker estimates
(ok to assume level 0 is the usage of the partition) or get ufsdump
to work ?
I've recommended we do something to quiesce the system, but our mail god
hasn't seemed to take any interest in that suggestion. Nor do we
currently have a mechanism to snapshot or replicate (rsync, break a
mirror, etc) the partition.
thanks,
Brian
----- Forwarded message from Brian Cuttler <br...@wadsworth.org> -----
Date: Tue, 11 Aug 2009 11:37:35 -0400
From: Brian Cuttler <br...@wadsworth.org>
To: daver <da...@wadsworth.org>, amanda-users@amanda.org,
Chris Knight <kni...@wadsworth.org>
Cc: Ivan Auger <ivan.au...@wadsworth.org>
Subject: Re: amanda probelm
In-Reply-To: <4a818aa0.5070...@wadsworth.org>
User-Agent: Mutt/1.4.1i
Reviewing the issue.
Server, Solaris 10x86, Amanda 2.6.1 (with patches)
Client, Solaris 9, Amanda 2.4.4
The problem performing level 0 dumps is that there are a large
number of files in flux -- it's the mailhost system -- so ufsdump
eventually asks for help, to continue or quit.
There is no help in non-interactive mode, and I don't know if there
is a mechanism to get amanda to respond to ufsdump's query. So the
level 0 of /usr1 usually fails.
Warnings - Dave's suggestion, that the history of the error be more
explicit, is a good one. Can amdump report the last successful level 0,
i.e. the due date, if the current level 0 fails? Put it in the notes
section or something?
Non-solutions - a snapshot of an open file is an open file. All things
being equal, you will get as many open files in a snapshot as in a
live system. This will not resolve the problem.
Workarounds
1) Quiesce the mail server for a period of time, force a level 0
of /usr1 during that interval.
2) Quiesce mail delivery long enough to snapshot/rsync or break
a mirror. Back up the placid copy.
3) Q: will a tar of the DLE get a backup when ufsdump cannot?
4) "It's not really a problem."
By and large each individual message will be backed up "most"
of the time: each message is its own file, and if it's not on
the level 0 it's on almost all of the level 1 and 2 dumps.
What we will lose are the index files, which would probably
require a rebuild after a restore anyway, so they are not
that important to get onto tape.
On Tue, Aug 11, 2009 at 11:13:36AM -0400, daver wrote:
brian, checking over all the overdue file systems on curie, I find only
mailserv:/usr1 is truly overdue
As you mentioned, this may be due to the system being too active.
there are NO warnings, in this regard, with the exception of a "Can't
switch to degraded mode for unknown reason" in amdump
as we just about never read these amanda files and use the email
generated by the system to notify us of problems, this would seem to be
a significant problem with Amanda. I agree that the developers should
be contacted in this regard.
as to getting /usr1 backed up: if amanda can't do it, perhaps we need to
consider an alternative, like tar