>> On Fri, 20 Mar 2009 09:39:23 -0500, Nick Laflamme <dplafla...@gmail.com> said:

> My heart leapt when my RSS reader presented me an article in the TSM
> updates feed from IBM with the heading, "Keeping more than one TSM
> server database backup on a tape." As I'm implementing a new server
> using 3592 drives, I haven't been happy with my options for this
> particular issue. Maybe, I thought, I was about to learn something
> of immediate use and high value!

> My heart sank when I read the actual article, which might be
> paraphrased as, "Sorry, Charlie, too risky."

I say, bunk.

Of course, your decisions have to be guided by your own sense of
paranoia, but I think a blanket "too risky" is just plain wrong. If
you actually measure your risks, I think you'll find you can lower
them, not raise them, and get "more than one DB backup on a tape" as a
side effect.

Here's what I do:

My library manager is also the server-to-server virtual volume target
for all my infrastructure's database backups. The DB backups are thus
primary archive data, from the perspective of the LM instance.

I then make offsite and onsite copies of these primary stgpools. I end
up with three different physical copies of the same backup run.

Contrast with direct backups to volumes: You can do a normal full and
a snapshot, in the interest of having something to take offsite and
something to keep onsite. But they are -different- backups. They
require different procedures, and only one of them can (for example)
be used as part of a full/incremental scheme.

Further, you have to re-do work. If you want "a backup onsite, and a
backup offsite", you have to run two backups; you can't copy a DB
backup at all. More of your 24-hour clock occluded with DB-intensive
maintenance tasks. Just what you need.

---

Media risk in the direct-backup case is the basic media failure risk
of the device in question. Low for any modern media, astronomically so
for 3592-class volumes. But not zero, as we all well know.

Media risk in my case is basic-media-failure _cubed_.
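
For anyone who wants to put numbers on "cubed", here is a
back-of-the-envelope sketch in Python. It assumes the copies fail
independently of one another, which is the assumption hiding inside
"cubed". The per-volume failure probability below is a made-up figure
for illustration, not a measured 3592 number, and the function name is
hypothetical.

```python
def loss_probability(p_single: float, copies: int) -> float:
    """Chance that every one of `copies` volumes holding the same backup
    has failed, assuming each fails independently with probability
    `p_single`."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_single ** copies


# Hypothetical per-volume failure probability, for illustration only.
p = 1e-4

# 1 copy  = direct backup to a single volume
# 3 copies = primary + onsite copy pool + offsite copy pool
# 5 copies = primary + two onsite + two offsite copy pools
for copies in (1, 3, 5):
    print(f"{copies} copies: {loss_probability(p, copies):.3e}")
```

If the copies share a failure mode (same library, same bad drive
writing all of them), the real number is worse than this; the sketch
only covers the independent case.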

I'll handwave around the procedural risks, "did I manage to make my
copies", and address that separately. If you'll grant me the copies,
you can clearly see that I need three different pieces of media to
have failed in order to miss my restore: the primary, the onsite copy,
and the offsite copy.

Better still, if you want more belts and suspenders, go to town. Two
copy stgpools? Why not four: two onsite, two offsite! We could go for
one-googolth risk levels. That'd be silly, but achievable.
One-molarth is probably adequate for humans.

---

I handwaved at procedural risks, but I don't intend to just ignore
them:

Yes, you have to maintain the copy stgpools in order to get that
increased security. But we do that all the time, every day. If our TSM
administrative scheduling isn't adequate to maintain a few small copy
pools (mine total under 3T each), it's not adequate to manage the DB
backups in the first place.

---

Note I haven't specifically addressed 'more than one DB backup on a
tape' yet. It's offstage, behind that 'the DB backups are primary
data, from the perspective of the LM instance' dodge.

I've managed my servers' DB backups in a variety of ways. Right now, I
collocate them by node, to prevent server_a from occluding a restore
by server_b, but all the fulls and incrementals for a given machine
are on one tape.

---

Finally, don't be misled by the eggs-to-basket ratio. It's an
emotionally persuasive argument, but irrelevant to your needs.

You don't care about the other eggs, the other DB backups: you care
about a particular one. If you wanted Monday's full, and a tape has
gone bad, this doesn't somehow mean you want Friday's full instead.
This means you're falling back.

What I'm suggesting is that you 'fall back' to another copy of the
full backup you wanted in the first place.

- Allen S. Rout