On Friday 23 March 2018 08:01:30 Austin S. Hemmelgarn wrote:
> On 2018-03-22 19:03, Ryan, Lyle (US) wrote:
> > I've got an Amanda 3.4.5 server running on CentOS 7 now, and am able to do rudimentary backups of a remote client.
> >
> > But in spite of reading man pages, HowTo's, etc., I need help choosing config params. I don't mind continuing to read and experiment, but if someone could get me at least in the ballpark, I'd really appreciate it.
> >
> > The server has an 11TB filesystem to store the backups in. I should probably be fancier and split this up more, but not now. So I've got my holding, state, and vtapes directories all in there.
> >
> > The main client has 4TB I want to back up. It's almost all in one filesystem, but the HowTo for splitting DLEs with exclude lists is clear, so it should be easy to split this into (say) 10 smaller individual dumps. The bulk of the data is pretty static, maybe 10%/month changes. It's hard to imagine 20%/month changing.
> >
> > For a start, I'd like to get a full done every 2 weeks, and incrementals/differentials on the intervening days. If I have room to keep 2 fulls (2 complete dumpcycles), that would be great.
>
> Given what you've said, you should have enough room to do so, but only if you use compression. Assuming the rate of change you quote above is approximately constant and doesn't result in bumping to a level higher than 1, then without compression you will need roughly 4.015TB per cycle (4TB for the full backup, ~15.38GB for the incrementals (roughly 0.38% change per day for 13 days)), plus 4TB of space for the holding disk (because you have to have room for a full backup _there_ prior to taping anything). With compression, and assuming you get a compression ratio of about 50%, you should actually be able to fit four complete cycles (you would need about 2.0075TB per cycle), though if you decide you want that I would bump the tapecycle to 60 and the number of slots to 60.
>
> > So I'm thinking:
> >
> > - dumpcycle = 14
> > - runspercycle = 0 (default)
> > - tapecycle = 30
> > - runtapes = 1 (default)
> >
> > I'd break the filesystem into 10 pieces, so 400GB each, and make the vtapes 400GB each (with tapetype length), relying on server-side compression to make it fit.
> >
> > The HowTo "Use pigz to speed compression" looks clear, and the DL380 G7 isn't doing anything else, so server-side compression sounds good.
> >
> > Any advice on this or better ideas? Maybe I'm off in left field.
> >
> > And one bonus question: I'm assuming Amanda will just make vtapes as necessary, but is there any guidance as to how many vtape slots I should create ahead of time? If my dumpcycle=14, maybe create 14 slots just to make tapes easier to find?
>
> Debra covered the requirements for vtapes, slots, and everything very well in her reply, so I won't repeat any of that here. I do however have some other more generic advice I can give based on my own experience:
>
> * Make your vtapes as large as possible. They won't take up any space beyond what's stored on them (in storage terminology, they're thinly provisioned), so their total 'virtual' size can be far more than your actual storage capacity. But if you can make it so that you can always fit a full backup on a single vtape, it will make figuring out how many vtapes you need easier, and additionally give a slight boost to taping performance (because the taper never has to stop to switch to a new vtape). In your case, I'd say setting 5TB for your vtape size is reasonable; that would give you some extra room if you suddenly have more data, without being insanely over-sized.
>
> * Make sure to set a reasonable part_size for your vtapes. While you wouldn't have to worry about splitting dumps if you take my above advice about vtape size, using parts has some other performance-related advantages. I normally use 1G, but all of my dumps are less than 100G in size. In your case, if you'll have 10 400G dumps, I'd probably go for 4G for the part size.
>
> * Match your holding disk chunk size to your vtape's part_size. I have no hard number to back this up, but it appears to provide a slight performance improvement while dumping data.
>
> * Don't worry right now about parallelizing the taping process. It's somewhat complicated to get it working right, significantly changes how you have to calculate vtape slots and sizes, and will probably not provide much benefit unless you're taping to a really fast RAID array that does a very good job of handling parallel writes.
>
> * There's essentially zero performance benefit to having your holding disk on a separate partition from your final storage unless you have it on a completely separate disk. There are some benefits in terms of reliability, but realizing them requires some significant planning (you have to figure out exactly what amount of space your holding disk will need).
>
> * If you're indexing the backups, store the working index directory (the one Amanda actually reads and writes to) on a separate drive from the holding disk and final backup storage, but make sure it doesn't get included in the backup if you're backing up your local system as part of this configuration. This is the single biggest performance booster I've found so far when dealing with Amanda. You can still copy the index over to the final backup storage location (and I would actually encourage you to do so), but just make sure it's not being read from or written to in that location while backups are being taped.
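
To put numbers on the advice above, an amanda.conf fragment along these lines would capture the plan being discussed (14-day dumpcycle, 30 vtape slots, 5TB vtapes, a 4G part size, a matching holding-disk chunksize, and server-side custom compression). The paths and the config, tapetype, dumptype, and wrapper names here are assumptions for illustration only, a sketch rather than a tested configuration:

    org "DailySet1"              # illustrative config name
    dumpcycle 14 days            # full backup every two weeks
    runspercycle 0               # default: a run on every day of the cycle
    tapecycle 30 tapes           # room for two complete cycles of vtapes
    runtapes 1

    define changer vtape_changer {
        tpchanger "chg-disk:/amandavtapes"   # assumed directory on the 11TB filesystem
        property "num-slot" "30"             # matches tapecycle
        property "auto-create-slot" "yes"    # let Amanda create slot directories itself
    }
    tpchanger "vtape_changer"
    autolabel "DailySet1-%%%" empty          # label new vtapes automatically

    define tapetype VTAPE {
        length 5120 gbytes       # ~5TB vtapes, large and thinly provisioned as suggested
        part_size 4 gbytes       # part size suggested for ~400G dumps
    }
    tapetype "VTAPE"

    define holdingdisk hd1 {
        directory "/amandahold"  # assumed path
        chunksize 4 gbytes       # matched to part_size, as suggested
    }

    define dumptype server-comp-tar {
        program "GNUTAR"
        compress server custom
        server_custom_compress "/usr/local/sbin/amzstd"   # hypothetical wrapper, sketched further down
        index yes
    }

    device_output_buffer_size 1048576   # 1 MiB starting point for a local disk
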
Based on this, I see another way I can improve GenesAmandaHelper. I am currently backing up the Amanda database as well as copying it and the config to the end of the current backup, which makes bare metal recoveries much easier. Thank you. If I can find the time, it will get done. At the moment I am tied up taking care of my wife, a heavy smoker with papier-mache bones and zero padding, who broke a hip last Feb, then a leg this past Feb, and will be in a cast till late May. And, weather permitting, I am currently busy outside putting a wheelchair ramp onto the front deck. But all these nor'easters, 4 in 2 weeks now and a 5th promised over the weekend, aren't making it easy. I'll announce it when I've used it for a month with no surprises.

> * Given the fact that you're going to need to use compression, I would suggest looking into how much processing power you can throw at that by doing some actual testing. In particular, I would suggest trying test dumps a couple of times with different compression types to see how fast each type runs and how much space it saves you. Keep in mind that you can pass extra options to any compression program you want by using the custom compression support and a wrapper script like this:
>
> #!/bin/bash
> /path/to/program --options "$@"
>
> If you can get it on your distribution, I'd suggest looking into zstandard [1] for compression. The default settings for it compress both better _and_ faster than the default gzip settings.
>
> * Given that you're only backing up to a local disk, try tweaking the device_output_buffer_size and see how that impacts your performance. 1M seems to be a good starting point for local disks, but higher values may get you much better performance.

--
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>
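
For anyone wanting to try the zstandard suggestion above, a minimal wrapper along the lines Austin describes might look like the following. The install path, the -T0 option, and the name amzstd are assumptions made for illustration (they match the hypothetical server_custom_compress entry in the config sketch earlier), not anything from the thread:

    #!/bin/bash
    # Hypothetical /usr/local/sbin/amzstd: a filter for Amanda's custom
    # compression support (compress server custom + server_custom_compress
    # in the dumptype). Amanda pipes the dump image in on stdin and reads
    # the compressed stream from stdout; any extra arguments Amanda supplies
    # (for example -d when decompressing on restore) are passed through.
    exec /usr/bin/zstd -T0 "$@"

Remember to make the script executable; if compression is done on the client instead, the same wrapper would be referenced from client_custom_compress rather than server_custom_compress.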