Ahh. So are you saying that the other nodes work when the backup is run manually, but hang forever when it is run from a CRON job? That might imply a permissions difference when run as a cron job vs. live.

Deb
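One way to test the cron-versus-live theory is to record the identity and environment the job actually gets in each context and compare the two. What follows is only a rough sketch, assuming Python 3 is available on the backup server. The script name `cronctx.py`, the function names, and the snapshot file paths are made up for illustration; none of this is part of Amanda.

```python
import json
import os
import sys


def snapshot_context(environ=None):
    """Collect the identity and environment of the current process as a flat dict."""
    environ = os.environ if environ is None else environ
    ctx = {
        "uid": os.getuid(),
        "euid": os.geteuid(),
        "gid": os.getgid(),
        "groups": sorted(os.getgroups()),
        "cwd": os.getcwd(),
        # cron jobs normally have no terminal attached to stdin
        "stdin_is_tty": os.isatty(0),
    }
    # Prefix environment variables so they cannot collide with the keys above.
    for name, value in environ.items():
        ctx["env." + name] = value
    return ctx


def diff_contexts(a, b):
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


def save_snapshot(path):
    """Write the current context to `path` as JSON."""
    with open(path, "w") as fh:
        json.dump(snapshot_context(), fh, indent=2, sort_keys=True)


def load_snapshot(path):
    """Read a snapshot previously written by save_snapshot()."""
    with open(path) as fh:
        return json.load(fh)


if __name__ == "__main__":
    args = sys.argv[1:]
    if len(args) == 1:
        # One argument: record the current context.
        save_snapshot(args[0])
    elif len(args) == 2:
        # Two arguments: print every difference between two snapshots.
        left, right = load_snapshot(args[0]), load_snapshot(args[1])
        for key, (lval, rval) in diff_contexts(left, right).items():
            print(f"{key}: {lval!r} != {rval!r}")
    else:
        print("usage: cronctx.py SNAPSHOT | cronctx.py SNAPSHOT_A SNAPSHOT_B")
```

Run it once with a single file argument from an interactive shell as the account that runs the backup, once more from a temporary crontab entry for that same account (writing to a different file), and then pass both files to print the differences. A different uid or group list, or a much shorter PATH under cron, would be the first things to look at.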
On Aug 17, 2015, at 3:25 PM, Seann <nombran...@tsukinokage.net> wrote:
> Deb,
>
> The CRON issue happens both with and without the two clients who fail.
>
> All of my normal testing and manual runs are done with the two problem hosts
> disabled, because they are known problems, and I figure it is due to the large
> number of files on both of them causing a timeout.
>
> I am working on a separate solution to see if I can get a better method of
> backup on those two hosts.
> They are both VMs, and one is notoriously slow in disk reads, in addition to
> having a large number of files in the directory that is backed up.
> Two other similar VMs back up properly without issue.
>
> Regards,
> Seann
>
> On 8/17/2015 3:05 PM, Debra S Baddorf wrote:
>> Does this include the two clients who fail? Do THEY also say that their
>> estimates are complete? Or are they still working on estimates, and thus
>> holding up the whole works? All of the estimates seem to need to finish
>> before anybody gets to start.
>> Deb Baddorf
>> Fermilab
>>
>> On Aug 17, 2015, at 2:33 PM, Seann <nombran...@tsukinokage.net> wrote:
>>
>>> All,
>>>
>>> I am looking for a little direction on a problem that has cropped up for me
>>> recently.
>>>
>>> I have a backup set that was created using Amanda 2.5 (default on CentOS
>>> 5.11) and ran very well, both manually and from the cron job I had set for
>>> it.
>>> It has approximately 13 hosts to back up, from as simple as backing up a
>>> single directory to backing up the full system, and it ran with no issues
>>> on CentOS 5.11.
>>> The basic setup uses hard drives as the backup media, compresses the
>>> backups on the server to save space, and uses GNU-TAR as the archive
>>> format.
>>>
>>> Fast forward to today: I have the system upgraded to CentOS 7, which also
>>> upgraded Amanda to 3.3.3-13, and after some configuration file re-writing,
>>> I got most of the backups to work.
>>> Two systems, one backing up the web directory, the other backing up the
>>> full disk, fail constantly.
>>> When these two disklist statements are removed, the backup runs, and takes
>>> approximately 2 and a half hours to run on the 8 other hosts (the other 3
>>> hosts are currently offline and not in scope).
>>>
>>> When the CRON job kicks off at midnight, it runs for over 12 hours (I have
>>> the etimeout set to one day, as the planner kept dying saying it timed out).
>>> This is the same basic error that I get with the two above-mentioned
>>> failing backups.
>>>
>>> When the hung backup job is running, I see the dumpers and main dump
>>> process running on the backup server, but nothing in the logs outside of
>>> the "We started the backup job" type of log messages.
>>> On all of the hosts, I don't see the client running, nor do I see any TAR
>>> processes running.
>>> There are also no clues in the logs as to which host the server is waiting
>>> on, and checking all the hosts in scope shows they are all in the same
>>> state: they have sent the estimate to the backup server and are waiting on
>>> the next phase.
>>>
>>>
>>> Any help on this would be appreciated. Also, is there a better way of
>>> making sense of the logs (such as using something like Graylog2?), and of
>>> reporting on issues with Amanda 3.3?
>>>
>>>
>>> Regards,
>>> Seann
>
>
> --
>
> Regards,
> Seann
>
> This message is confidential. It may also be privileged or otherwise
> protected by work product immunity or other legal rules. If you have received
> it by mistake, please let us know by e-mail reply and delete it from your
> system; you may not copy this message or disclose its contents to anyone.
> Please send us by fax any message containing deadlines as incoming e-mails
> are not screened for response deadlines. The integrity and security of this
> message cannot be guaranteed on the Internet.