Re: Question to: Friday tape question - Top 10
Ralf Auer wrote:
> Let's assume the five tapes from my last example were labeled 'Backup01'-'Backup05'. Now, as in the last example, this 'short-of-tapes' problem occurs after using 'Backup02'. To avoid overwriting 'Backup03', Amanda would kindly ask me to add another tape (if I understood correctly). When I add the new tape, what label would it get? 'Backup02a'? 'BackupPre03'? '02Backup03'?

That depends on your definition of labelstr and on which label you give the tape when you run amlabel. Amanda does not care about prefixes or anything like that; *she* just uses every tape whose label matches the defined labelstr. There is no need to 'insert' a tape into an existing sequence (for example, between 02 and 03). So the tape would very likely be labeled 'Backup06', and you would have to do the labeling yourself.

Stefan
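To make the labelstr point concrete, here is a minimal sketch in Python. It assumes a hypothetical labelstr of "^Backup[0-9][0-9]*$" (the thread never shows the real one) and uses Python's `re` module as a stand-in for Amanda's own pattern matching, which may differ in details. The function name `label_matches` is made up for illustration.

```python
import re

# Hypothetical pattern, in the style of an amanda.conf line such as
#   labelstr "^Backup[0-9][0-9]*$"
# The real labelstr of the configuration discussed here is not known.
LABELSTR = r"^Backup[0-9][0-9]*$"


def label_matches(label: str, labelstr: str = LABELSTR) -> bool:
    """Return True if the tape label matches the labelstr pattern."""
    return re.search(labelstr, label) is not None


if __name__ == "__main__":
    # Check the candidate labels from the question against the pattern.
    for candidate in ["Backup06", "Backup02a", "BackupPre03", "02Backup03"]:
        print(candidate, label_matches(candidate))
```

With this particular pattern only 'Backup06' would be usable; a looser labelstr would accept the other candidates as well, which is why the answer depends on the configuration.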
Re: Question to: Friday tape question - Top 10
Hi Stefan,

> Amanda does not care for pre or something, *she* just uses every tape with a label matching the defined labelstr. No need to 'insert' a tape into an existing sequence (for example, between 2 and 3 ...). So the tape would very likely be labeled as 'Backup06', and this would have to be done by you.

thanks, but does this also mean that in the future I always have to insert the tapes in ...-02-06-03-... order, or does *she* switch over to 02-06-03-04-05-01-02-03-04-05-06-01... automatically, so that the newest tape is in fact added at the end of the list after one full cycle?

Ralf

--
Ralf Auer
Physics Institute IV, Office: 2.137
University of Erlangen-Nuremberg, Phone: +49-9131-8527087
Erwin-Rommel-Str. 1, Fax: +49-9131-15249
D-91058 Erlangen, Germany
[EMAIL PROTECTED]
Re: Question to: Friday tape question - Top 10
Patrick M. Hausen wrote:
> Hi!
>
> On Wed, Aug 01, 2007 at 11:45:00AM +0200, Ralf Auer wrote:
>> thanks, but does this also mean, that in future times I have always to put the tapes in ...-02-06-03-... order, or does *she* switch over to 02-06-03-04-05-01-02-03-04-05-06-01... automatically, so that the newest tape is in fact added at the end of the list after one full cycle?
>
> Amanda may reorder tapes to be used any time, anyway. Set up a cron job that executes amcheck every morning and sends you the output via email. It contains the information which tape Amanda expects next.

Ralf, Patrick is correct: Amanda tracks tape usage and tells you which tape to insert. You can get that information by running amcheck or 'amadmin conf tape', and the report mail of amdump also contains it. Just relax about this ;-)

Stefan
Re: Question to: Friday tape question - Top 10
Hi!

On Wed, Aug 01, 2007 at 11:45:00AM +0200, Ralf Auer wrote:
> thanks, but does this also mean, that in future times I have always to put the tapes in ...-02-06-03-... order, or does *she* switch over to 02-06-03-04-05-01-02-03-04-05-06-01... automatically, so that the newest tape is in fact added at the end of the list after one full cycle?

Amanda may reorder tapes to be used any time, anyway. Set up a cron job that executes amcheck every morning and sends you the output via email. It tells you which tape Amanda expects next.

HTH,
Patrick M. Hausen
Head of Networks and Security
--
punkt.de GmbH * Vorholzstr. 25 * 76137 Karlsruhe
Tel. 0721 9109 0 * Fax 0721 9109 100
[EMAIL PROTECTED] http://www.punkt.de
Managing director: Jürgen Egeling, AG Mannheim 108285
Re: Question to: Friday tape question - Top 10
I think the short answer is that if you're running a configuration where the requested fulls just barely fit before the tapes cycle, then yes, it will overflow if tons of data are added. This is pretty much inherent in running with that few tapes.

For reliability, you should have more than one full dump of everything on tape at all times. So if you are going to have the normal weekday setup of runspercycle 5 and dumpcycle 7, you should probably have 15 tapes, or at least 12. And your tapes should be at least twice as big as the per-run size that 'amadmin conf balance' reports. When all goes well, you'll just be overwriting the third full dump with each new one, leaving two.

In my view, tapes are cheap (even LTO-2 tapes are $32 each) compared to the value of the data, the cost of the tape drive, and management time, and it's silly to try to run with fewer tapes than is appropriate, especially with only a handful. I would agree that amanda is not optimized for operation with too few tapes, but I think it does pretty well given what is essentially an untenable situation.
Re: Question to: Friday tape question - Top 10
On Wed, Aug 01, 2007 at 07:25:00AM +0200, Ralf Auer wrote:
>> Why with such a tight setup do you have runtapes set to greater than 1. With only 1 tape allowed per day you would have gotten a failure to backup that DLE that did not fit. This would have been noticed by amanda during the estimate phase, before the dump started and noted in the report.
>
> That is not true, maybe. Since the data of that one client could be on several hard disks, it would be possible to backup the client to several tapes, as long as all the single HDs are smaller than the tape capacity.

Not sure I know what part you feel is not true. You hypothesised a situation where normally every day's dump fit on one tape. Then on one particular day a single DLE grew enough to cause the day's run to require 3 tapes. That would only happen if that one DLE were now larger than a single tape (even for an incremental, as the scenario is constructed). With runtapes == 1, amanda would not even start the dump of that DLE.

--
Jon H. LaBadie [EMAIL PROTECTED]
JG Computing
4455 Province Line Road, (609) 252-0159
Princeton, NJ 08540-4322, (609) 683-7220 (fax)
Re: Question to: Friday tape question - Top 10
Stefan G. Weichinger wrote:
> Patrick M. Hausen wrote:
>> On Wed, Aug 01, 2007 at 11:45:00AM +0200, Ralf Auer wrote:
>>> thanks, but does this also mean, that in future times I have always to put the tapes in ...-02-06-03-... order, or does *she* switch over to 02-06-03-04-05-01-02-03-04-05-06-01... automatically, so that the newest tape is in fact added at the end of the list after one full cycle?
>>
>> Amanda may reorder tapes to be used any time, anyway. Set up a cron job that executes amcheck every morning and sends you the output via email. It contains the information which tape Amanda expects next.
>
> Ralf, Patrick is correct: Amanda tracks tape usage and tells you which tape to insert. Either by running amcheck or 'amadmin conf tape', and the report mail of amdump also contains that info.

just to clarify this a bit... amanda doesn't re-order tapes. It takes them as they come (as you give them to it), as long as they fit amanda's algorithm according to your cycle. Rather than go over it all here, look at the wiki documentation: http://wiki.zmanda.com/index.php/Taper_scan_algorithm.

With a tape library, amanda is not going to scan the whole library. It is going to look at the current slot. If the tape in that slot is acceptable according to the algorithm, it will use it. Otherwise it will go on to the next slot until it finds an acceptable tape or runs out of slots.

So, while amanda will tell you what tape it expects on the next run (the least recently used one), it will take what you give it as long as it is acceptable. It's not going to take that '06' and put it back in order. The order as far as amanda is concerned is according to the date the tapes were written. The label is just a pattern you specify that amanda will require a match on.

The one time I lost a tape, it was bio-daily-007. I labeled a new tape bio-daily-0072, put it in the current slot, and let amanda use it. So my order now goes through 006, 0072, 008, etc. That just reminds me that it is a tape I replaced.

If you look at the tapelist file, you will see the dates that all the tapes were last written to. If you do `amtape conf update`, amanda will scan all the slots in your changer and you will see what you have in every slot.

And, as others have said, having dumpcycle=runspercycle=tapecycle is not good. I have 5,5,35, which gives me 6+ weeks of full backups.

You are responsible for your tapes and your cycle, but amanda will always do the right thing as best as it can and will keep you informed. Running amcheck in the afternoon from cron will result in amanda sending you an email if things aren't ready to go for the night backup. It will tell you what it was expecting and exactly what was wrong. The report amanda sends you after a backup tells you exactly what it did and what tapes were used, or whether there were any errors with the tape. And if you come in in the morning, don't have an email report, and wonder what the heck is going on, you can run amstatus and it will give you full details of exactly what is going on: what has been completed, how far along the running dumps are, and what is waiting to be done.

--
Chris Hoogendyk
Systems Administrator
Biology & Geology Departments
140 Morrill Science Center
University of Massachusetts, Amherst
[EMAIL PROTECTED]
Erdös 4
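The slot-by-slot behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not Amanda's actual taper scan code. It assumes that a tape is "acceptable" when its label matches labelstr and it is not among the (tapecycle - 1) most recently written tapes, that the tapelist can be reduced to (datestamp, label) pairs, and that the scan wraps around the changer. All names and the example data are hypothetical.

```python
import re


def is_acceptable(label, tapelist, labelstr, tapecycle):
    """Return True if a tape with this label may be (over)written.

    tapelist is a list of (datestamp, label) pairs, datestamp as "YYYYMMDD".
    """
    if re.search(labelstr, label) is None:
        return False  # label does not match the configured pattern
    newest_first = sorted(tapelist, reverse=True)
    # Assumed rule: the (tapecycle - 1) most recently written tapes are protected.
    protected = {lbl for _, lbl in newest_first[:max(tapecycle - 1, 0)]}
    return label not in protected


def find_tape(slots, current, tapelist, labelstr, tapecycle):
    """Check slots one at a time, starting at the current slot.

    Returns the index of the first acceptable tape, or None if the
    changer holds no acceptable tape.
    """
    for offset in range(len(slots)):
        index = (current + offset) % len(slots)
        if is_acceptable(slots[index], tapelist, labelstr, tapecycle):
            return index
    return None


# Hypothetical example: Backup06 is freshly labeled and was never written.
TAPELIST = [
    ("20070725", "Backup03"),
    ("20070726", "Backup04"),
    ("20070727", "Backup05"),
    ("20070730", "Backup01"),
    ("20070731", "Backup02"),
]
LABELSTR = r"^Backup[0-9]+$"
SLOTS = ["Backup02", "Backup06", "Backup03", "Backup04", "Backup05"]

if __name__ == "__main__":
    print(find_tape(SLOTS, 0, TAPELIST, LABELSTR, tapecycle=5))
```

In this example the scan skips 'Backup02' in the current slot because it was written too recently, and settles on slot 1 holding 'Backup06', even though 'Backup03' is the least recently used tape. That is the "it takes what you give it" behaviour: the first acceptable tape wins, not the oldest one.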
Re: Question to: Friday tape question - Top 10
Hi Jon,

> Not sure I know what part you feel is not true. You hypothesised a situation where normally every day's dump fit on one tape. Then on one particular day a single DLE grew enough to cause the day's run to require 3 tapes. That would only happen if that one DLE were now larger than a single tape (even for an incremental as the scenario is constructed). With runtapes == 1, amanda would not even start the dump of that DLE.

Maybe I am completely wrong, but I thought that if the big dump is distributed over several partitions, each one with its own entry in the disklist and each smaller than the tape capacity, the backup should work. For instance, my tape capacity is 400GB, and the specific client has three partitions of 350GB each, each with its own entry in the disklist. So I assumed that one partition backup would go to the first tape, another one to the second, and the third one onto the last tape. Since 'tape spanning' is not necessary in this case, I thought that Amanda would run the backup. But, as I said, I am not sure about that, and you're probably right...

Ralf
Re: Question to: Friday tape question - Top 10
Hi Chris, thank you for your explanation! That's exactly what I tried to figure out. Now I am completely happy, and I think we can 'close' this thread ... Thanks again to all who tried to help and 'sorry for bothering' :-) Best regards, Ralf
Re: Question to: Friday tape question - Top 10
Interestingly, I've seen at my site that a large partition which might fit on a tape can be not-the-first dump on that tape and hit EOT while being written; it will then be dumped to a second volume (if jukebox == 1). You can have a similar cascading failure if you try to put another DLE on the end of the second tape volume, rolling onto the third. You can have a scenario where a single DLE would have fit properly on the first tape volume had the large intervening DLE not been put to tape before it. Taper ordering is a critical thing sometimes.

On Wed, Aug 01, 2007 at 08:55:59PM +0200, Ralf Auer wrote:
> Maybe I am completely wrong, but I thought, if the big dump is distributed over several partitions, each one with a single entry in the disklist and each partition size smaller than the tape capacity, the backup should work.
> [...]

---
Brian R Cuttler [EMAIL PROTECTED]
Computer Systems Support (v) 518 486-1697
Wadsworth Center (f) 518 473-6384
NYS Department of Health Help Desk 518 473-0773
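The effect of taper ordering can be illustrated with a small simulation. This is a simplified model in Python, not Amanda's real taper: it assumes no tape spanning, so a dump that does not fit in the space left on the current tape is restarted from the beginning on the next one, and the leftover space is lost. The function name and the sizes are made up for illustration.

```python
def tapes_needed(dle_sizes_gb, tape_capacity_gb):
    """Count the tapes used when dumps are written in the given order.

    Simplified model without tape spanning: a dump that does not fit in
    the remaining space starts over on a fresh tape.
    """
    tapes_used = 0
    free = 0  # space left on the current tape; 0 means no tape loaded yet
    for size in dle_sizes_gb:
        if size > tape_capacity_gb:
            raise ValueError("a single dump is larger than one tape")
        if size > free:
            tapes_used += 1          # move on to a fresh tape
            free = tape_capacity_gb  # leftover space on the old tape is lost
        free -= size
    return tapes_used


if __name__ == "__main__":
    capacity = 400
    # Same three dumps, two different write orders.
    print(tapes_needed([100, 350, 300], capacity))
    print(tapes_needed([100, 300, 350], capacity))
    # Three 350GB partitions on 400GB tapes, as in the example above.
    print(tapes_needed([350, 350, 350], capacity))
```

Under these assumptions the first order needs three tapes and the second only two, because in the first case the 350GB dump pushes the 300GB dump off the tape it would otherwise have shared with the 100GB one. The three 350GB partitions need three tapes in any order, so they only fit in a run if runtapes is at least 3.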
Re: Question to: Friday tape question - Top 10
Chris Hoogendyk wrote:
> just to clarify this a bit... amanda doesn't re-order tapes. It takes them as they come (as you give them to it), as long as they fit amanda's algorithm according to your cycle.
> [...]

Nice summary, Chris. Why not contribute stuff like this to the docs/Wiki?

Thanks, greetings,
Stefan
Re: Question to: Friday tape question - Top 10
On Wed, Aug 01, 2007 at 08:55:59PM +0200, Ralf Auer wrote:
>> Not sure I know what part you feel is not true. You hypothesised a situation where normally every day's dump fit on one tape. Then on one particular day a single DLE grew enough to cause the day's run to require 3 tapes. That would only happen if that one DLE were now larger than a single tape (even for an incremental as the scenario is constructed). With runtapes == 1, amanda would not even start the dump of that DLE.
>
> Maybe I am completely wrong, but I thought, if the big dump is distributed over several partitions, each one with a single entry in the disklist and each partition size smaller than the tape capacity, the backup should work. For instance, my tape capacity is 400GB, the specific client has three partitions of 350GB each and each partition has its own entry in the disklist. So I assumed, that one partition backup would go to the first tape, another one to the second and the third one onto the last tape. Since 'tape spanning' is not necessary in this case, I thought that the backup would be run by Amanda. But, as I said, I am not sure about that and probably you're right...

If I can try to summarize: you're discussing situations where Amanda is fairly massively oversubscribed; that is, Amanda has very little room to deal with unexpected circumstances, such as an overlarge incremental, an unavailable client, etc.

In this specific situation, under normal circumstances, you expect Amanda to balance dumps into about 1 tape per run. You've set runtapes to a larger number to allow Amanda to use more tapes if necessary, but you don't really have enough tapes to support your full retention period with 1 run per tape.

The correct calculation is:

    tapecycle = redundancy_factor * runtapes * runspercycle + epsilon

where epsilon is 1 or 2 -- spare tapes to allow slack for damaged tapes, etc. The redundancy_factor is the number of full backups you'd like to have around at any time -- 1 is OK, 2 or more is recommended. Anything less than 1 is asking for trouble.

In your case, if I remember the numbers correctly, you had:

    tapecycle    5
    runspercycle 5
    runtapes     3
    epsilon      0

Solving for redundancy_factor gives

    5 = redundancy_factor * 3 * 5 + 0
    redundancy_factor = 0.33

which is clearly suboptimal.

This is not to say that this kind of configuration won't work -- Amanda will do her level best -- but it should not be a surprise that her level best is not always good enough, especially when unexpected things happen.

I think the bottom line is: this is your Wednesday email telling you to buy more tapes ;)

Dustin

--
Dustin J. Mitchell
Storage Software Engineer, Zmanda, Inc.
http://www.zmanda.com/
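The calculation above is easy to check with a few lines of Python. The sketch below only restates the formula from this message; the function names are made up for illustration, and rounding the tape count up to a whole number is an added assumption.

```python
import math


def redundancy_factor(tapecycle, runtapes, runspercycle, epsilon=0):
    """Solve tapecycle = rf * runtapes * runspercycle + epsilon for rf."""
    return (tapecycle - epsilon) / (runtapes * runspercycle)


def required_tapecycle(redundancy, runtapes, runspercycle, epsilon=2):
    """Tapes needed to keep `redundancy` full cycles plus `epsilon` spares."""
    return math.ceil(redundancy * runtapes * runspercycle) + epsilon


if __name__ == "__main__":
    # The configuration from the example: 5 tapes, 5 runs, up to 3 tapes per run.
    print(round(redundancy_factor(5, 3, 5, epsilon=0), 2))
    # Tapes needed for two full cycles with the same runtapes and 2 spares.
    print(required_tapecycle(2, 3, 5, epsilon=2))
```

For the example configuration this gives a redundancy factor of about 0.33, matching the figure above. Asking for a redundancy factor of 2 with runtapes 3 and runspercycle 5 would call for 32 tapes, or 12 tapes if runtapes were reduced to 1.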
Question to: Friday tape question - Top 10
Hi everybody,

I have a question related to the 'Friday tape question' in Amanda's Top 10, which I quote below:

/* Imagine that you have your classic backup-schedule running fine. Everything is calculated and designed well, so your tape gets filled well each night. Now one user generates an unforeseen huge amount of data. For example, he duplicates one big data-directory by mistake. So the size of the directory raises within one day, maybe for multiple GBs. Would your classic backup-scheme catch that? Or would it run out of tape, simply because it was not calculated to have that filesystem with that size? Amanda would try to catch it (and most of the time succeed ...). */

What if it does NOT succeed? For example, what if I run out of free tapes unexpectedly early? Do I get a notification, or do I lose data?

Example: Let's assume I have 5 clients to back up, my tapecycle is 5, my dumpcycle is 3 and runspercycle is 5. I have all 5 tapes in my autochanger. On Monday, clients 'U' and 'V' get their full backup. On Tuesday, client 'W' gets its full backup, and during the day some user produces so much data that the Wednesday backup uses all three remaining tapes. Then there is no tape left to hold the still pending full backups for clients 'X' and 'Y' on Thursday or Friday...

The best outcome would be an email after Wednesday telling me that I have to buy more tapes. The worst would be Amanda overwriting the tape with the 'U' and 'V' full backups for the 'X' and 'Y' backups.

So, can you tell me what will happen in this case? Can I rely on not losing data just because some user wrote an infinite loop in his Monte Carlo program, producing GBs of data?

Thanks for your help!
Re: Question to: Friday tape question - Top 10
On Wed, Aug 01, 2007 at 04:52:01AM +0200, Ralf Auer wrote:
> [...]
> Example: Let's assume, I have 5 clients to backup, my tapecycle is 5, my dumpcycle is 3 and runspercycle is 5. I have all 5 tapes in my autochanger. On Monday, clients 'U' and 'V' are in the full backup. On Tuesday, client 'W' is in full backup and during the day some user produces so much data, that for Wednesday-Backup all three remaining tapes are used. Then, there is no tape left to hold the still pending full backups for clients 'X' and 'Y' on Thursday or Friday...
> [...]
> So, can you tell me, what will happen in this case? Can I rely on not losing data, just because some user wrote an infinite loop in his MonteCarlo program producing GBs of data?

What is this, trolling for arguments against amanda usage?

How/why are you running 5 amdumps (runs per cycle) with only a 3 day dump cycle? Oh, was that a mistype, and it should have been a 5 day dumpcycle? Then why does your number of tapes in rotation exactly match the runspercycle, when the recommended number is 2-4 times that?

Why, with such a tight setup, do you have runtapes set to greater than 1? With only 1 tape allowed per day you would have gotten a failure to back up the DLE that did not fit. This would have been noticed by amanda during the estimate phase, before the dump started, and noted in the report.

Even without the staged situation, amanda would have told you after one dumpcycle that you were about to overwrite your only full backup of [insert a DLE name], notifying you of the need for more tapes.

You only have one DLE per client? Very unusual. But that is the only reason I'd see for amanda choosing to do a full backup of exactly one client on each day of a dumpcycle. Or are you artificially forcing the desired full dumps?

--
Jon H. LaBadie [EMAIL PROTECTED]
JG Computing
4455 Province Line Road, (609) 252-0159
Princeton, NJ 08540-4322, (609) 683-7220 (fax)
Re: Question to: Friday tape question - Top 10
Hello Jon,

first of all, thanks for your reply.

> What is this, trolling for arguments against amanda usage?

This is definitely not a troll against Amanda usage. I have been using it for quite a long time without any problems (although I am missing some useful features, to be honest). So SORRY for any misunderstanding or wrong words...

> How/why are you running 5 amdumps (runs per cycle) with only a 3 day dump cycle?

Of course I am not moron enough to run the configuration I described in the example in real life. All I wanted was to know from an expert what WOULD happen if that kind of 'missing tape' problem occurs; could that lead to data loss or not, that's all. Therefore I constructed an example where I could IMAGINE that something would go wrong...

> Why with such a tight setup do you have runtapes set to greater than 1. With only 1 tape allowed per day you would have gotten a failure to backup that DLE that did not fit. This would have been noticed by amanda during the estimate phase, before the dump started and noted in the report.

That is not true, maybe. Since the data of that one client could be on several hard disks, it would be possible to back up the client to several tapes, as long as every single HD is smaller than the tape capacity.

> Even without the staged situation, amanda would have told you after one dumpcycle that you were about to overwrite your only full backup of [insert a DLE name], notifying you of the need for more tapes.

So, thank you, THAT is exactly the info I was looking for, because I have not run into that situation at my site. I think I do have enough tapes in my cycle, even to cope with about 12 TB on 40 clients... :-)

But, if you don't mind, I would like to use the opportunity of having an expert 'on the line' and push that question a little more to the extreme, just out of curiosity:

Let's assume the five tapes from my last example were labeled 'Backup01'-'Backup05'. Now, as in the last example, this 'short-of-tapes' problem occurs after using 'Backup02'. To avoid overwriting 'Backup03', Amanda would kindly ask me to add another tape (if I understood correctly). When I add the new tape, what label would it get? 'Backup02a'? 'BackupPre03'? '02Backup03'?

Again, this is just to understand better how Amanda works. I am neither running nor planning to run this configuration, but sometimes it's good to try to take your own system down, to be prepared if somebody else tries...

With my best regards,
Ralf

--
Ralf Auer
Physics Institute IV, Office: 2.137
University of Erlangen-Nuremberg, Phone: +49-9131-8527087
Erwin-Rommel-Str. 1, Fax: +49-9131-15249
D-91058 Erlangen, Germany
[EMAIL PROTECTED]