On Sat, Jul 08, 2006 at 01:50:09AM +0200, Peter Kunst wrote:
> Jon LaBadie wrote:
> >On Fri, Jul 07, 2006 at 05:11:50PM -0400, [EMAIL PROTECTED] wrote:
> >>Dump images spanning multiple media volumes:
> >
> >And no more simple recovery of spanned dumps using
> >standard unix tools when amanda is not available.
> >That needs to be pointed out in any revision.
>
> Indeed. Just had another case last week, where the dd method tells me
> "this is chunk 3 of 4"... well, on which tapes do i find the other
> chunks, i asked myself.

Run the amandatape program after every backup, and you will always have
the answer to this question right on your tape labels.

But back to the tape chunking: I don't like this static chunking. You
have to choose between either wasting a lot of tape or having lots of
chunks, which makes manual recovery a fragile task. So you trade waste
against inconvenience.

It would be much better if the chunking algorithm took into account how
much tape is available for the next chunk. Something like:

  chunkfactor 3/4    # 0 <= chunkfactor <= 1
  minsize 1GB

The chunkfactor specifies how much of the rest of the tape will be
allocated for the next chunk. The minsize specifies the minimum size of
a chunk, to avoid a high number of chunks once they are getting too
small.

With the above specification and a tapesize of 100GB we would get:

  first chunk:  75.00GB == 3/4*(100)
  second chunk: 18.75GB == 3/4*(100-75)
  third chunk:   4.69GB == 3/4*(100-75-18.75)
  fourth chunk:  1.17GB == 3/4*(100-75-18.75-4.69)
  fifth chunk:   0.29GB == 3/4*(100-75-18.75-4.69-1.17)  # forced to 1GB

  # Since the fifth chunk is less than minsize, it is forced to minsize
  # (1GB). If it doesn't fit on the tape, we start over on the next tape:
  fifth chunk:  75.00GB == 3/4*(100)  # goes to a new tape

On the new tape, the fifth chunk starts over at 75GB because the new
tape has 100GB available again. We end up with only four chunks on the
first tape (the fifth is discarded since we got a write error) and have
wasted only 0.39GB (that is 0.39%) of the tape.
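The recurrence above is simple enough to sketch in a few lines. This is my own illustration in Python, not Amanda code; the name `plan_chunks` and the return convention are mine. Each chunk is allotted `chunkfactor` of the tape still remaining, floored at `minsize`; when the next chunk no longer fits, the dump would continue on a fresh tape and whatever is left here is the waste:

```python
def plan_chunks(tape_size, chunkfactor, minsize):
    """Sketch of the proposed adaptive chunking (hypothetical helper,
    not part of Amanda).  Returns (chunk sizes, tape wasted), in GB."""
    chunks = []
    remaining = float(tape_size)
    while remaining > 0:
        # Next chunk gets chunkfactor of the rest, but at least minsize.
        size = max(chunkfactor * remaining, minsize)
        if size > remaining:
            # The chunk no longer fits: this is where the write error
            # happens and the dump moves on to a new tape.
            break
        chunks.append(size)
        remaining -= size
    return chunks, remaining   # `remaining` is the waste on this tape

chunks, wasted = plan_chunks(100, 0.75, 1)
# chunks -> [75.0, 18.75, 4.6875, 1.171875]; wasted -> 0.390625 GB
```

Running this with the numbers from the text (100GB tape, chunkfactor 3/4, minsize 1GB) reproduces the four chunks of the example and shows the 0.39GB of waste directly.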
We will never waste more than 1% (minsize/tapesize), and with these
settings we will never get more than four chunks of a given dump on a
single tape. (In contrast, the current algorithm ends up with 100
chunks if you don't want to waste more than 1%.)

With a growing chunkfactor (the limit is 1) you get a lower number of
chunks, but you risk wasting more tape if you get an early write error.
So now you trade risk against waste (instead of inconvenience against
waste, as in the current algorithm). With reliable tapes you can use a
higher chunkfactor and end up with a low number of chunks. With
unreliable tapes it is better to use a lower chunkfactor, because the
earlier you get a write error, the more tape you waste. Either way, you
have bigger chunks at the start of the tape, honoring the assumption
that the probability of write errors is higher at the end of the tape.

For vtape usage (there is no risk of write errors here), chunks can be
made exactly the size that is needed to fill the vtape with:

  chunkfactor 1
  minsize 0

So we never waste any disk space on vtapes any more!

The current chunking behavior can be achieved with:

  chunkfactor 0     # will always be smaller than minsize
  minsize 10GB      # thus all chunks will be 10GB

I have already implemented such a system in a different project (Jon,
you remember the ssh-based system I mentioned a couple of months ago?)
and I am pretty happy with this algorithm.
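Both limiting configurations fall out of the same recurrence described in the text (next chunk = max(chunkfactor * remaining, minsize), stop when it no longer fits). A self-contained sketch, again my own illustration rather than Amanda code:

```python
def plan(tape, f, minsize):
    """Adaptive chunking recurrence from the text (hypothetical helper):
    each chunk takes f of the remaining tape, floored at minsize."""
    sizes, remaining = [], tape
    while remaining > 0:
        size = max(f * remaining, minsize)
        if size > remaining:   # would not fit -> next tape
            break
        sizes.append(size)
        remaining -= size
    return sizes

print(plan(100, 1.0, 0))    # vtapes: one exact-fit chunk, zero waste
print(plan(100, 0.0, 10))   # current behavior: ten fixed 10GB chunks
```

With chunkfactor 1 and minsize 0 the first chunk consumes the whole (v)tape exactly; with chunkfactor 0 and minsize 10GB every chunk degenerates to the fixed 10GB size, i.e. the static chunking we have today.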