On 1/5/11 12:00 PM, rory_f wrote:
> I want to ensure tapes are filled 100% each time where possible. I've written a
> script in python to look at directory, figure out size, and create a disklist
> which will ensure a round about size for each disklist file - so for instance
> it will try to create a disklist file that contains entries in groups of
> 400gb's - the size of a tape. I know amanda will fill a tape to 100% where
> possible but sometimes, if it is using compression, this doesn't work, and the
> first two tapes will fill 500gb+ and then the last tape will be left with
> 200gb. This is a waste of 200gb - I'm trying to make sure all tapes are full
> where possible and not waste any space.

Not to be rude, but that's a false economy.
It could just as easily be said that you would be wasting tape capacity by not
using compression.
You are asking to not allow more than 400GB per tape, and thus no more than 1200GB on the set of 3.
Then you are complaining that the 1200GB is unevenly distributed across the 3 tapes, because
compression allowed more than 400GB on each of the first 2 tapes. So, stated another way, you are
asking that the "wasted" (or unused) 300GB (or so) of space be distributed across all 3 tapes,
rather than just being on the last tape, and/or to just not use compression so that you can imagine
that you are not wasting tape.
500GB per tape means that you are getting about 20% compression (500GB of data stored in 400GB of
tape). If that is consistent, have your Python script queue up somewhere between 1400GB and 1500GB
for backup, the choice depending on how close you want to shave it (with a higher risk of
overrunning the last tape). Then you are being economical with your tape usage, getting a couple
hundred more GB on the set of tapes than you were originally thinking.
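
As a rough sketch of that arithmetic, something like the following could go in your script. The
function name, its parameters, and the 5% default safety margin are my own assumptions for
illustration, not anything Amanda provides.

```python
def queue_target_gb(tape_capacity_gb, tapes, compression_saving, safety_margin=0.05):
    """Estimate how many GB of uncompressed data to queue for a tape set.

    compression_saving is the fraction of space saved by compression
    (0.20 means 500GB of data lands in 400GB of tape).
    safety_margin is a hypothetical cushion against overrunning the last tape.
    """
    if not 0 <= compression_saving < 1:
        raise ValueError("compression_saving must be in [0, 1)")
    if not 0 <= safety_margin < 1:
        raise ValueError("safety_margin must be in [0, 1)")
    # Physical capacity of the whole set, expressed in uncompressed GB.
    raw_capacity = tape_capacity_gb * tapes / (1 - compression_saving)
    # Back off a little, since compressibility is never perfectly consistent.
    return raw_capacity * (1 - safety_margin)


if __name__ == "__main__":
    print(queue_target_gb(400, 3, 0.20, safety_margin=0.0))
    print(queue_target_gb(400, 3, 0.20))
```

With three 400GB tapes and 20% savings, a zero margin gives the 1500GB upper figure, and a margin of
a few percent lands in the 1400GB to 1500GB range.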
Of course, compressibility varies widely. Huge directories of TIFF and JPEG files can be essentially
incompressible. Typical Unix directories of predominantly text-based stuff, like log files or
configuration files, are highly compressible, and repetitive things like Apache access logs can
compress as much as 10:1. So, you have to know your data to efficiently plan what you are trying to do.
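
One way to get to know your data is to compress a sample of it and measure the savings. The sketch
below is only an illustration: it uses Python's zlib as a stand-in for whatever compressor your
Amanda configuration actually runs, and the helper names and the 1MB sample size are assumptions of
mine.

```python
import zlib


def read_sample(path, sample_bytes=1 << 20):
    """Read up to sample_bytes from the start of a file (hypothetical helper)."""
    with open(path, "rb") as fh:
        return fh.read(sample_bytes)


def estimate_saving(samples, level=6):
    """Return the fraction of space saved when compressing the given byte samples.

    0.0 means no savings (or no data); values near 1.0 mean highly compressible.
    The result can be slightly negative for data that does not compress at all.
    """
    raw = 0
    packed = 0
    for chunk in samples:
        raw += len(chunk)
        packed += len(zlib.compress(chunk, level))
    if raw == 0:
        return 0.0
    return 1 - packed / raw
```

You would feed it something like `estimate_saving(read_sample(p) for p in some_paths)`, where
`some_paths` is a representative selection of files from one disklist entry. The start of a file is
not always typical of the whole, so treat the number as a rough guide only.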
--
---------------
Chris Hoogendyk
-
O__ ---- Systems Administrator
c/ /'_ --- Biology & Geology Departments
(*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst
<hoogen...@bio.umass.edu>
---------------
Erdös 4