Re: Two different retention policies for the same node
Thanks Steven and Wanda! Your ideas are very valuable!

I've been thinking about Steven's recipe number 2. It seems OK, but in a DR situation I wouldn't really want to mess with DB restores. But what if:

1. Create a new domain for the high-priority servers (there are just a few of them) with the same retention as before, but...
2. ...point the domain's copygroup destination to a separate DISK stgpool.
3. During the incremental, have simultaneous write put the active data into an offsite active-data pool (of devtype FILE, of course).
4. Migrate the DISK stgpool down to an onsite FILE stgpool that has a MIGDELAY of 7.
5. Replicate that FILE stgpool (by means of the underlying storage) to offsite storage (with or without dedupe).

In a DR situation I'll have to:

1. Restore the DB.
2. The active data is in the active-data pool.
3. Plus the 7-day-old data is in the FILE stgpool.

How is that? Any flaws you can spot in this approach?

--
Warm regards,
Michael Green
Re: Two different retention policies for the same node
Michael,

A migdelay of 7 will keep newly changed files for 7 days. Older ones will move off, so your basic OS files and so on won't be there. The only way around that is to run a selective backup once a week, or to change MODE to ABSOLUTE in the backup copygroup on Sunday morning and back to MODIFIED on Monday morning.

But if you are going to do that, you might as well buy NetBackup.

Regards
Steve
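Steve's point about the migration delay can be illustrated with a small toy model. This is only a sketch: it assumes a file becomes eligible to leave the FILE pool once it has sat there for at least the delay, and the file names and ages are invented for illustration. It is not a description of the server's actual migration logic.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    name: str          # hypothetical file name
    days_in_pool: int  # days since this file was last written to the pool


def split_by_migdelay(files, migdelay_days):
    """Partition files into (still in pool, eligible to migrate).

    Simplified assumption: a file is eligible once it has been in the
    pool for at least migdelay_days.
    """
    stay = [f for f in files if f.days_in_pool < migdelay_days]
    move = [f for f in files if f.days_in_pool >= migdelay_days]
    return stay, move


files = [
    StoredFile("erp/data/orders.dbf", 1),  # changes nightly
    StoredFile("erp/logs/app.log", 3),
    StoredFile("os/etc/hosts", 45),        # static OS files
    StoredFile("os/bin/ls", 120),
]

stay, move = split_by_migdelay(files, 7)
print("still in FILE pool:", [f.name for f in stay])
print("eligible to migrate:", [f.name for f in move])
```

In this model the static OS files are the ones that leave the pool, which is why a complete system image could not be rebuilt from the FILE pool alone.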
Re: Two different retention policies for the same node
(7 days of retention at the DR site also sounds bogus to me. Why 7? It sounds like they don't really know what they need, and you should reexamine RTO and RPO.)

But regardless of that, here's what I would recommend to solve the problem as stated:

For the primary site, keep your 180 days retention. On the DR site, set up the same mgmt classes but with 7 days retention.

On the primary site each day, do an incremental EXPORT of the nodes. An incremental export is one that specifies EXPORT with FILEDATA=ALL and with FROMDATE/FROMTIME set to 24 hours ago. That will put only the new data on tape. Send those tapes to the DR site and IMPORT. The data that rolls into the DB disk pool at the DR site will be governed by the mgmt classes there, i.e. 7 days retention.

Better to do the export directly to the remote TSM server instead of to tape, if you have the bandwidth. (If you don't have the bandwidth, then your mgmt isn't serious about doing the DR this way.)

Also, I'm not a big fan of replicating backup data, which is what you're really doing here. If you're going to spend the money to have a live DR site, why not use replication technology to replicate the LIVE applications to the DR site, instead of the backup data? You generally move less data per day, and you have business continuance, not just backup.

W
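The daily incremental export described above could be driven by a small script that computes the "24 hours ago" cutoff and builds the command text. This is a sketch under assumptions: the node names (ERP01, HR01) and the target server name (DRTSM) are hypothetical, the MM/DD/YYYY date format is assumed, and the parameter spellings should be checked against the EXPORT NODE documentation for your server level before use.

```python
from datetime import datetime, timedelta


def incremental_export_command(nodes, target_server, now=None, hours_back=24):
    """Build an EXPORT NODE command covering the last hours_back hours.

    nodes and target_server are placeholders chosen by the caller.
    """
    now = now or datetime.now()
    since = now - timedelta(hours=hours_back)
    return (
        f"EXPORT NODE {','.join(nodes)} FILEDATA=ALL "
        f"FROMDATE={since:%m/%d/%Y} FROMTIME={since:%H:%M:%S} "
        f"TOSERVER={target_server}"
    )


# A fixed timestamp keeps the example reproducible.
cmd = incremental_export_command(
    ["ERP01", "HR01"], "DRTSM", now=datetime(2009, 3, 19, 6, 0, 0)
)
print(cmd)
```

The resulting string would then be passed to the administrative client by whatever scheduling mechanism is already in place.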
Re: Two different retention policies for the same node
On Tue, Mar 17, 2009 at 11:33 PM, Kelly Lipp l...@storserver.com wrote:
> Where are the people that use the data going to be? How will customers interact with them?

This is a very valid point! Luckily I'm not the one who needs to think about it ;-)

> If you worry DR application by application and think about all aspects of using the data, the problem actually becomes simpler since there is less data to worry about and the time frame to recover it is probably longer than you think. It's all about RPO and RTO.
>
> In our sales practice, I'm spending a lot of time consulting (during pre-sales so it's free) about DR issues. Bottom line: you need to have a very good plan, but since you will probably never execute (beyond testing), you probably shouldn't spend too much time/money on it.
Re: Two different retention policies for the same node
On Tue, Mar 17, 2009 at 11:18 PM, Conway, Timothy timothy.con...@jbssa.com wrote:
> Is this to avoid having two copypools? That's a reasonable goal. I have only one copypool, which is my DR offsite pool. Just make your onsite copypool an offsite pool, and you can give them 25 times better than they're asking for.

No, the idea is to keep 7 days of history offsite, on disk, for a very few most important servers (ERP, HR). I don't much care whether that will be a primary pool or a copy pool. As long as I can get my data back off it, it's fine.

Today, I manage 3 servers here and am sitting on 0.5 petabytes of backup data. There is no point in having all that data (most of which is inactive) at the DR site (we do have an offsite vault though). At the DR site we want to keep preconfigured turn-key ERP and HR servers, a preconfigured TSM server with its database, and SAN- or NAS-attached disk that holds the 7 days of history.

I have yet to work out how and by what means my 140GB database will get to the DR site on a daily basis. Maybe we will use dedupe, or maybe we will open a separate TSM instance just for these few servers so that the DB we have to copy to the DR site is as small as possible. Also, the smaller the DB, the better in a DR situation.

> Unless most of the data changes every day, the difference between 7 days and 180 days worth of copypool is remarkably small.

It can be big. The ERP server backs up over 100G nightly. I guess it dedupes pretty well though.

> If you have no copypool at all, the whole thing is a farce. If they're wanting fast portable full restores of a subset of the total nodes, how about backupsets? Make a nodegroup containing all the nodes they want daily fulls of, and make a backupset of that nodegroup every day.

Backup sets are fine as long as they are relatively small and you don't have to generate them on a daily basis. Imagine your ERP is about 400GB worth of active data and you have to generate a backup set that big daily. I don't even know yet what kind of bandwidth I'll have to our DR location. Assuming I get the backupset generated in 4-5 hours, how many hours will be required to send it off? Also, what happens if management then decides they want a few more machines to join the first one at the DR location? This solution sounds like a nice idea TSM-wise, but imho it's not very scalable otherwise. As it looks to me, the best approach is to back up locally, dedupe, and send it off deduped.

> Give the backupset a 7 day retention, and keep track of the volumes that are in the list of backupset volumes from one day that disappear the next (simple to script). That same script can note tapes that show up in the list of backupset volumes that weren't there the day before, and check them out of the library and throw your operations team an email listing every tape to be sent offsite and to be recalled. I find that I can generate file-class backupsets (with TOC) at about 27MB/s: 8.5 hours to do an 814GB node, single-threaded.
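The timing question in this exchange is simple arithmetic, sketched below. The 27 MB/s rate and the 814GB and 400GB sizes come from the thread. The 100 Mbit/s WAN link is purely a hypothetical figure, since the real bandwidth to the DR site is unknown, and protocol overhead is ignored.

```python
def hours_to_move(size_gb, rate_mb_per_s):
    """Hours needed to process size_gb at rate_mb_per_s (1 GB = 1024 MB)."""
    return size_gb * 1024 / rate_mb_per_s / 3600


def link_rate_mb_per_s(link_mbit_per_s):
    """Convert a link speed in Mbit/s to MB/s, ignoring protocol overhead."""
    return link_mbit_per_s / 8


# Generation time at the quoted 27 MB/s.
print(f"814 GB node: {hours_to_move(814, 27):.2f} h")
print(f"400 GB ERP:  {hours_to_move(400, 27):.2f} h")

# Transfer time over an assumed (hypothetical) 100 Mbit/s link.
wan = link_rate_mb_per_s(100)
print(f"400 GB over 100 Mbit/s: {hours_to_move(400, wan):.2f} h")
```

At 27 MB/s the 814GB node lands close to the quoted 8.5 hours, and a 400GB backupset falls in the 4-5 hour range assumed above. On the assumed link, shipping it would take roughly twice as long as generating it.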
Re: Two different retention policies for the same node
Michael,

I have two ideas about your problem.

Idea 1. Create another domain for your high priority servers, with the 7 day retention period. Move the nodes into this domain. Create new nodes for these machines in the old domain with different names. For machines performing a normal incremental, run two backups every day, one to each node name. For machines with too many files for that, run a weekly incremental to the longer retention domain. For databases/TDP nodes, run a weekly extra backup to the longer retention domain. Explain carefully to management that the coverage of the longer retention period data now has holes in it.

Idea 2. The 7 day retention for offsite sounds like a simple-minded notion of what might be nice to have. What they really want is an activepool, but they aren't comfortable with it. Create an activepool daily for the high priority data. Send it offsite to disk. Set the DB expiration to not less than 7 days. Set the activepool pending delay to 7 days. Continue to send your normal copypool tapes offsite. Have a small library offsite too.

If you have a disaster, all your active data is instantly available. If you need day -n, restore the DB for that day and the activepool data for that day will be available. Alternatively, use your small library and restore day -n data from the usual copypool tapes.

One loose end: if your ERP is SAP, the backups are actually TSM archives. I'm not sure how activepools work with archives and don't have the time to look it up now.

Regards
Steve.

Steven Harris
TSM Admin, Sydney Australia
Two different retention policies for the same node
I've been asked to provide a DR/backup solution that seems to contradict TSM methodology, but I've decided I'll throw this in here anyway.

Given the following retention policy:

RETE=180 RETO=180 VERE=NOL VERD=NOL (180 days, no version limit)

I've been asked to find a way to keep only 7 days' worth of data offsite (on deduped disk or something like that), both active and inactive, so that we could restore a complete system image from any day within the last week.

Doable (without resorting to double backups under different MCs)?

--
Warm regards,
Michael Green
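To make the two retention windows concrete, here is a simplified model of which versions of one file survive under a time-based retention with no version limit. It assumes an inactive version is kept while the number of days since it was deactivated does not exceed the retention value, and that the active version is always kept. The dates are invented, and real server expiration has more rules than this sketch covers.

```python
from datetime import date


def retained(versions, today, retextra_days):
    """Return backup dates of versions still kept.

    versions: list of (backup_date, deactivated_date or None) for one file;
    None means the version is still active.
    """
    keep = []
    for backup_date, deactivated in versions:
        if deactivated is None or (today - deactivated).days <= retextra_days:
            keep.append(backup_date)
    return keep


today = date(2009, 3, 17)
versions = [
    (date(2009, 1, 5), date(2009, 2, 1)),    # deactivated 44 days ago
    (date(2009, 2, 1), date(2009, 3, 12)),   # deactivated 5 days ago
    (date(2009, 3, 12), None),               # active version
]

print("180-day policy keeps:", retained(versions, today, 180))
print("7-day policy keeps:  ", retained(versions, today, 7))
```

Under the 7-day window the version backed up on 1 February is still kept, because it was the active version until 12 March and is therefore needed for any restore to a point earlier in the last week.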
Re: Two different retention policies for the same node
There could be some serious issues with this. If you have an onsite volume that has 170 day old data, with 5 day old data and 80 day old data (due to reclamation), and the volume goes bad, all you'll be able to restore is the 5 day old data.

However, this is a real challenge. I'd like to see the solution. It appears on the surface that this is a result of the "we want to keep it forever but can't afford the cost" mentality. Champagne taste on a beer budget.

See Ya'
Howard
Re: Two different retention policies for the same node
Is this to avoid having two copypools? That's a reasonable goal. I have only one copypool, which is my DR offsite pool. Just make your onsite copypool an offsite pool, and you can give them 25 times better than they're asking for. Unless most of the data changes every day, the difference between 7 days and 180 days worth of copypool is remarkably small. If you have no copypool at all, the whole thing is a farce.

If they're wanting fast portable full restores of a subset of the total nodes, how about backupsets? Make a nodegroup containing all the nodes they want daily fulls of, and make a backupset of that nodegroup every day. Give the backupset a 7 day retention, and keep track of the volumes that are in the list of backupset volumes from one day that disappear the next (simple to script). That same script can note tapes that show up in the list of backupset volumes that weren't there the day before, and check them out of the library and throw your operations team an email listing every tape to be sent offsite and to be recalled.

I find that I can generate file-class backupsets (with TOC) at about 27MB/s: 8.5 hours to do an 814GB node, single-threaded.
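The day-to-day volume tracking described as "simple to script" boils down to a set difference between two volume lists. The sketch below assumes the two lists have already been collected by some other means. The volume labels are invented, and checking tapes out of the library and sending the email are left out.

```python
def volume_changes(yesterday, today):
    """Compare two days of backupset volume lists.

    Returns volumes that are new today (to send offsite) and volumes that
    were listed yesterday but not today (to recall).
    """
    y, t = set(yesterday), set(today)
    return {
        "send_offsite": sorted(t - y),
        "recall": sorted(y - t),
    }


# Hypothetical volume labels for two consecutive days.
yesterday = ["A00001", "A00002", "A00003"]
today = ["A00002", "A00003", "A00004", "A00005"]

changes = volume_changes(yesterday, today)
print("send offsite:", changes["send_offsite"])
print("recall:      ", changes["recall"])
```

Sorting the results keeps the operator report stable from run to run.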
Re: Two different retention policies for the same node
The more important issue regarding DR is the prioritization of application restore based on the business itself. As it turns out, after a disaster about 90% of the stuff that's backed up isn't necessary to run the business. And the ability to get the previous seven days doesn't help either.

It is more important to have two different DR pools: one for important data and one for the rest. Then optimize the important data DR pool to ensure it can do what you need it to do: I must have application XYZ back up and available to users within 24 hours of the disaster, to a point no further back than 48 hours.

And it turns out that most businesses' ability to use recovered data within 24 hours is sketchy at best. Where are the people that use the data going to be? How will customers interact with them? All of these issues are actually about 100 times more difficult than restoring data. Yet few think of them!

If you worry DR application by application and think about all aspects of using the data, the problem actually becomes simpler since there is less data to worry about and the time frame to recover it is probably longer than you think. It's all about RPO and RTO.

In our sales practice, I'm spending a lot of time consulting (during pre-sales so it's free) about DR issues. Bottom line: you need to have a very good plan, but since you will probably never execute (beyond testing), you probably shouldn't spend too much time/money on it.

Kelly Lipp
CTO
STORServer, Inc.
485-B Elkton Drive
Colorado Springs, CO 80907
719-266-8777 x7105
www.storserver.com