Re: Expecting new tape...
Jon LaBadie wrote:
> On Mon, Jun 09, 2008 at 10:39:16PM -0400, Robert Kuropkat wrote:
>> Did something bad, but not sure what. I had a tape with a bar code
>> label but apparently no Amanda label. So I labeled it and hoped the
>> next backup would write to it. It did not. It was tape #28 and I had
>> already passed that number in the sequence (Tape #30). So I figured I
>> did something goofy and just loaded the next set. Unfortunately, it
>> still says it needs a new tape.
>>
>> Unfortunately, I was confident I knew what I was doing, so I also made
>> other changes while I was there. I had taken the top 10 tapes out of
>> the sequence reducing it from 100 to 90. So I took the top ten entries
>> out of the tapelist file and changed the tapecycle entry in amanda.conf
>> to 90.
>>
>> I'm new to Amanda and inherited this setup so aside from posting every
>> config file, I'm not sure what the relevant entries to post would be.
>>
>
> Couple of things. First the entries in the tapelist file can be
> active or inactive as indicated by the "reuse/noreuse" tag. If
> you are getting a "need a new tape" message, the number of "reuse"
> tapes is less than the tapecycle value.
>
> Second, tapecycle need not match the actual number of tapes
> in rotation. It must be equal to or less. By having it less,
> when a tape is damaged or archived (and marked "noreuse" in
> tapelist) you will not get the dreaded "need a new tape".
>
> Third, amanda does not use the tapes in human defined numeric
> order. If it originally saw the tapes in the order 1,3,5,99,
> that is the order it will expect them in the future. And there
> are several ways the expected order can be affected later on.
>
> So, even though tape 30 has been passed, tape 28's turn may
> be coming up. Alternatively, as you labelled it just recently,
> it may be the "new tape" that amanda is seeking. If there
> are 90 tapes listed as "reuse" in tapelist and tape 28 is the
> only one without a date last used, that is likely the case.
>
> jl

Ooh, I think you nailed it. One of the reasons I was taking the top 10
out of the cycle was that I wanted to fill in some holes where tapes
had gotten damaged, and because I had 6 sets of 15 and 1 set of 10
tapes. Unfortunately, I had not yet added those tapes in, so I had a
tapecycle of 90 and only 87 tapes in the tapelist.

Running it now to see if that works. If so, I will add a couple more
tapes at the top of the list again to prevent this in the future.

Thanks!

Robert Kuropkat
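Jon's first point (fewer "reuse" tapes than tapecycle triggers the "new tape" request) can be checked mechanically. The following is only a rough sketch, not an Amanda tool. It assumes each tapelist line has the form `<datestamp> <label> <flag>`, that a datestamp of 0 marks a tape that has never been written, and that active tapes carry the literal flag `reuse`. The function name and the sample labels are hypothetical.

```python
def summarize_tapelist(text, tapecycle):
    """Count reusable tapes and list labels that have never been written.

    Assumed line format (an assumption, not checked against any Amanda
    release):  <datestamp> <label> <flag>
    """
    reusable = []
    never_used = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        datestamp, label, flag = fields[0], fields[1], fields[2]
        if flag != "reuse":
            continue  # any other flag is treated as inactive
        reusable.append(label)
        if datestamp == "0":
            never_used.append(label)
    return {
        "reusable": len(reusable),
        "shortfall": max(0, tapecycle - len(reusable)),
        "never_used": never_used,
    }


# Hypothetical three-tape sample; a real tapelist is much longer.
sample = """\
20080608 Daily-030 reuse
20080607 Daily-029 no-reuse
0 Daily-028 reuse
"""

result = summarize_tapelist(sample, tapecycle=3)
print(result)
```

With Robert's numbers (87 reusable entries against a tapecycle of 90), the shortfall computed this way would be 3, which matches the "needs a new tape" symptom.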
[Amanda-users] confusing problem with NO-NEW-TAPE
Jean-Louis took a look at my config, and it turns out I had made a foolish error. I had used the following lines from the default config:

# flush-threshold-dumped, flush-threshold-scheduled, taperflush, and autoflush
# are used to control tape utilization. See the amanda.conf (5) manpage for
# details on how they work. Taping will not start until all criteria are
# satisfied. Here are some examples:

# You want to keep the most recent dumps on holding disk, for faster recovery.
# Older dumps will be rotated to tape during each run.
flush-threshold-dumped 300 # (or more)
flush-threshold-scheduled 300 # (or more)
taperflush 300
autoflush yes

This configuration instructs the system not to write any tape volumes until the holding disk contains 300 *percent* of a volume size. I had understood it to mean 300 MB on disk. I have no idea why I thought that.

Commenting out those lines allowed the backup to work as I had originally thought it would. I am posting this for others who are having the same issue: read the manual, as they say.

Thanks for the help!
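To make the units concrete, here is a small arithmetic sketch of the reading described above, where the threshold is a percentage of one volume's size rather than a number of megabytes. It is a simplification that ignores how the individual parameters interact; the helper name and the 100 GiB volume size are hypothetical and do not come from the post.

```python
def bytes_required_before_taping(volume_size_bytes, threshold_percent):
    """Data that must accumulate when the threshold is a percent of volume size."""
    return volume_size_bytes * threshold_percent // 100


GIB = 1024 ** 3
volume_size = 100 * GIB  # hypothetical tape capacity, chosen for illustration

needed = bytes_required_before_taping(volume_size, 300)
print(needed // GIB, "GiB must be waiting before taping starts")
```

Under these assumptions a value of 300 means three full volumes' worth of dumps, which a small nightly backup may never reach.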
Re: dumper abort()ing occasionally
Douglas,

Many bugs are already fixed in the 2.5.1 tree, but we didn't make a release. You can try the latest 2.5.1 snapshot from http://www.zmanda.com/community-builds.php
It contains only bug fixes since the 2.5.1p3 release. I don't remember if this bug was fixed.

Jean-Louis

Douglas K. Rand wrote:

Once or twice a week my amanda backups are failing when a dumper exits on signal 6, SIGABRT:

Jun 5 23:27:38 scotch kernel: pid 82566 (dumper), uid 0: exited on signal 6
Jun 8 19:53:43 scotch kernel: pid 96672 (dumper), uid 0: exited on signal 6

In looking at the source there clearly are calls to abort() in several places. I'm assuming that there is an overflow problem with file descriptors, that 4294967295 isn't a valid FD?

driver: event_register: Invalid file descriptor 4294967295

I'm running FreeBSD 6.3 with Amanda 2.5.1p3 from ports. There isn't much information in the log:
Re: dumper abort()ing occasionally
Ian> Are there any details regarding this issue in the /tmp/amanda
Ian> debug files for this dumper?

Doug> Nothing that I saw. Here is the dumper.*.debug file for that
Doug> pid. I also uploaded all of the debug files for that run from
Doug> the server to:
Doug> http://meridian-enviro.com/rand/amanda/

Dustin> I can't look at the logs at that URL -- the Apache user
Dustin> doesn't have read permission on the files themselves.

Doh! Checked that index worked, not that they were readable. Sorry.
Fixed.

Dustin> The debug logs do show the client connection timing out,
Dustin> though. It's likely that this condition is what is tickling
Dustin> the dumper bug, and since 2.5.1 is no longer maintained, the
Dustin> solution is to stop tickling the bug :). See if you can
Dustin> figure out why that connection is timing out -- busy network?
Dustin> Downed client? Network partition?

Well, in this particular case it was because the system being backed
up froze. (I think the motherboard is failing.) Usually when this
happens it is not due to a crashed system.

I see that the FreeBSD port for Amanda is still at 2.5.1. (I'm usually
lazy and assume that if I'm up to date with the port I'm up to date
with the software.) I'll see about getting the port upgraded to 2.6.0.

Thanks for the help.
Nominate Amanda for SourceForge Community Choice Awards
Reminder: nominations for SourceForge's community choice awards close soon. If you haven't already, please take a moment to nominate Amanda! http://sourceforge.net/community/cca08-nominate?group_id=120 You can nominate in multiple categories -- after your first nomination, simply click on the link "nominate this project in another category." Dustin -- Storage Software Engineer http://www.zmanda.com
Re: dumper abort()ing occasionally
On Mon, Jun 9, 2008 at 5:11 PM, Douglas K. Rand <[EMAIL PROTECTED]> wrote:
> Once or twice a week my amanda backups are failing when a dumper exits
> on signal 6, SIGABRT:
>
> Jun 5 23:27:38 scotch kernel: pid 82566 (dumper), uid 0: exited on signal 6
> Jun 8 19:53:43 scotch kernel: pid 96672 (dumper), uid 0: exited on signal 6
>
> In looking at the source there clearly are calls to abort() in several
> places. I'm assuming that there is an overflow problem with file
> descriptors, that 4294967295 isn't a valid FD?
>
> driver: event_register: Invalid file descriptor 4294967295

That large integer is also known as -1. I'm guessing that when the
dumper exits unexpectedly, the driver gets an EOF from its file
descriptor and sets that fd to -1, but then incorrectly tries to
re-register it with the event system. The pre-2.6.0 event system was a
careful balancing act, but in this case it seems to have handled the
error correctly.

The problem is to figure out why dumper aborted. Most (all?) abort
calls in Amanda are through the error() macro, which should log a
message to the debug log as well. But looking at the debug logs you
sent, I see no such thing.

I can't look at the logs at that URL -- the Apache user doesn't have
read permission on the files themselves.

The debug logs do show the client connection timing out, though. It's
likely that this condition is what is tickling the dumper bug, and
since 2.5.1 is no longer maintained, the solution is to stop tickling
the bug :). See if you can figure out why that connection is timing
out -- busy network? Downed client? Network partition?

Dustin

--
Storage Software Engineer
http://www.zmanda.com
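The remark that 4294967295 "is also known as -1" is easy to verify: it is the bit pattern of -1 reinterpreted as an unsigned 32-bit integer. The short check below assumes the file descriptor is held in a 32-bit int, which the logged value suggests but the thread does not state outright.

```python
import ctypes

# -1 stored in 32 bits, then read back as an unsigned value
as_unsigned = ctypes.c_uint32(-1).value

# the logged value read back as a signed 32-bit value
as_signed = ctypes.c_int32(4294967295).value

print(as_unsigned, as_signed)  # prints: 4294967295 -1
```

So the "Invalid file descriptor 4294967295" message is most plausibly a descriptor of -1 printed with an unsigned format, not a descriptor table that overflowed.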