Re: BRP
On Thu, 11 Nov 2010 13:10:49 -0800, Schuh, Richard rsc...@visa.com wrote:

> In essence, we will be breaking the connections with the main system at a time not previously disclosed to us, and will not be allowed to go back to it or reference anything on it for the duration of the test. We will have to resync the dasd after the test has been completed. The main system will stay up and running so that those who are not part of the test can continue working.

If you'll indulge me, I have a side question: When you break the replication in order to perform a DR test, this puts updates to your production site at risk of loss should a real disaster occur during your DR test. Is there something else in your setup that eliminates this risk, or has management signed off on the risk?

To eliminate the risk I would expect a setup whereby the mirrored copy gets flashed to a tertiary copy and the DR test is conducted from the tertiary copy, so that replication is never broken. If you have an alternate setup that eliminates the risk I would be interested in what it is.

Brian Nielsen
Re: BRP
It does indeed leave a window open; however, I am not setting the rules. The ones who do are aware of it and are willing to accept the risk. They have set the parameters such that we must get the system back up within 12 hours, and that the system brought up be no more than 24 hours out of step. That is fully met by the test plans and does not require the additional hardware (3100+ devices) needed to create the tertiary copy.

Regards,
Richard Schuh
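The two parameters in this reply (back up within 12 hours, no more than 24 hours out of step) can be checked against a scenario with a few lines of arithmetic. The following is only a sketch: the function name, the timestamps, and the scenario of a real disaster striking mid-test are assumptions for illustration, not anything described by the site.

```python
from datetime import datetime, timedelta

# Objectives quoted in the message; the constant names are assumptions.
MAX_DOWNTIME = timedelta(hours=12)   # system must be back up within 12 hours
MAX_STALENESS = timedelta(hours=24)  # recovered data at most 24 hours out of step


def meets_objectives(disaster_at, last_sync_at, restored_at):
    """Return (downtime_ok, staleness_ok) for one recovery scenario."""
    downtime = restored_at - disaster_at
    staleness = disaster_at - last_sync_at
    return downtime <= MAX_DOWNTIME, staleness <= MAX_STALENESS


# Hypothetical scenario: replication was suspended for a DR test, a real
# disaster struck 10 hours later, and recovery took 9 hours.
split = datetime(2010, 11, 12, 6, 0)
disaster = split + timedelta(hours=10)
restored = disaster + timedelta(hours=9)
print(meets_objectives(disaster, split, restored))
```

In this made-up case both objectives hold, which matches the point that a test window shorter than 24 hours stays inside the accepted risk.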
Re: BRP
Hi Richard,

We have all EMC DASD and have a very nice, albeit slightly convoluted process. We have 4 hours to get our TPF system to norm. VM comes up during that time. If you want to chat offline, let me know. :-)
Re: BRP
Christy,

I definitely would like to chat off list. BTW, whatever was in the box below didn't make it through the list.

Regards,
Richard Schuh
Re: BRP
We also have all EMC dasd. To guard against application faux pas that are not immediately discovered, we maintain 3 copies of TPF at 8 hour intervals (we can also bring home our offsite copies, which you need to be able to do when a real disaster is over). For DR testing, we snap off point-in-time copies of TPF, z/OS, z/VM, and z/Linux (ECKD) dasd. We bring up TPF under our z/VM at the DR site so we can remap devices to correspond with the vendor provided hardware environment. It all works like a charm.

Once the vendor moves our dasd over, we IPL z/VM, check the hardware environment, then un-NOLOG TPFPROD, and IPL it. Getting the network switched over takes more time than this, so we wind up waiting on them.
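As a rough illustration of why several staggered copies help with faults that are not noticed right away, the sketch below lists how old each retained copy is and which ones predate a fault. The function names and sample numbers are assumptions for illustration; only the 3 copies and the 8 hour interval come from the message.

```python
def copy_ages(hours_since_latest, copies=3, interval_hours=8):
    """Ages in hours of each retained point-in-time copy, newest first."""
    return [hours_since_latest + i * interval_hours for i in range(copies)]


def clean_copies(hours_since_latest, fault_age_hours, copies=3, interval_hours=8):
    """Ages of the copies that were taken before the fault was introduced."""
    return [age
            for age in copy_ages(hours_since_latest, copies, interval_hours)
            if age > fault_age_hours]


# Hypothetical case: the newest copy is 3 hours old and the fault crept in
# 10 hours ago, so only the two older copies are free of it.
print(copy_ages(3))         # [3, 11, 19]
print(clean_copies(3, 10))  # [11, 19]
```

With this rotation the oldest copy is always between 16 and 24 hours old, so a fault that goes unnoticed for longer than that would be present in every retained copy.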
Re: BRP
VM is my main concern; we already have multiple copies of TPF at different centers. The TPF folks have their own DR requirements, including no complete network outage. We are concerned with the ability to update source and test the updates, which requires both VM and Linux, and to run potentially critical applications that require VM. z/OS has its own set of requirements which are at least partially met by there being running instances of z/OS at each of the centers.

Our DR site is a CBU LPAR in our other datacenter. The hardware configuration is (supposedly, no confirmation as yet) maintained in parallel with our running system. Once the DR test starts, we will be allowed no contact with the running system and there will be no ability to snap off a copy prior to the test - in fact, it is expressly forbidden.

Regards,
Richard Schuh
Re: BRP
If you can't snap off a copy what are you going to do during a test? Stop replicating? Kind of defeats the purpose.

Anyway, we've never had a problem with the vm or linux filesystems. A lost inode here or there, but that is to be expected.
Re: BRP
Ok, so you're really doing a proof of concept of your dasd replication solution. Obviously, once in production one doesn't want to stop replicating just to do a test, unless you don't care how stale your DR data gets. So once the concept is proved, you'll have to come up with procedures to do testing which will involve various R2's, BCVs, PIT gold copies, etc. You'll need to understand those requirements ahead of time to properly size your DR dasd solution.

We also successfully restore our VTAPE library using this technique. It is very small (one or two mod 3's), but the concept is extensible.

From: The IBM z/VM Operating System [mailto:ib...@listserv.uark.edu] On Behalf Of Schuh, Richard
Sent: Thursday, November 11, 2010 4:11 PM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: BRP

> In essence, we will be breaking the connections with the main system at a time not previously disclosed to us, and will not be allowed to go back to it or reference anything on it for the duration of the test. We will have to resync the dasd after the test has been completed. The main system will stay up and running so that those who are not part of the test can continue working.
>
> This is far from defeating the purpose of the test, which is to demonstrate that we can get the BRP system up and fully functional in x hours (x has yet to be determined, but it will be fairly small) without reverting to using the main system to help in any way. With the tape backup system, x used to be 24; however, it was trimmed to be only 12 and we demonstrated that it could not be done in that time frame. The restore of our (VSSI) VTAPE library, which is not tiny, did not complete during the window. It had been running for almost 8 hours and was only about half done when the window closed.
>
> We just got confirmation that the current configuration at the DR site has not been kept up to date. :-( That is a problem we do not expect to have if we are replicating the dasd.
>
> Regards,
> Richard Schuh
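The VTAPE restore figures quoted above (almost 8 hours elapsed, about half done, 12 hour window) can be extrapolated with simple arithmetic. This is a sketch that assumes a constant restore rate and reads "about half" as exactly 0.5; the function name is made up for illustration.

```python
def projected_restore_hours(elapsed_hours, fraction_done):
    """Linear extrapolation of total restore time from progress so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


WINDOW_HOURS = 12  # recovery window stated in the thread

total = projected_restore_hours(elapsed_hours=8, fraction_done=0.5)
print(total)                 # 16.0
print(total > WINDOW_HOURS)  # True: the tape restore alone overruns the window
```

Under these assumptions the tape restore by itself would need roughly 16 hours, which is consistent with the conclusion that the 12 hour target cannot be met from tape.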
Re: BRP
We will be required to do a complete DR test once per year. Our VTAPE is 48 3390-03 volumes. The usage is in the range of 40-80% with an occasional foray into the 90+% range.

Regards,
Richard Schuh
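For a sense of scale, the size of a 48-volume 3390-3 VTAPE pool can be estimated from the standard 3390-3 geometry (3,339 cylinders, 15 tracks per cylinder, 56,664 bytes per track). This is a sketch of the raw upper bound only: usable space depends on how the volumes are formatted, and the variable names are assumptions.

```python
# Standard 3390 Model 3 geometry (raw capacity, before formatting overhead).
CYLINDERS = 3339
TRACKS_PER_CYLINDER = 15
BYTES_PER_TRACK = 56_664

VOLUMES = 48  # VTAPE pool size stated in the message

bytes_per_volume = CYLINDERS * TRACKS_PER_CYLINDER * BYTES_PER_TRACK
raw_bytes = VOLUMES * bytes_per_volume
raw_gb = raw_bytes / 1e9  # decimal gigabytes

# Typical usage range quoted in the message: 40% to 80% of the pool.
low_gb, high_gb = 0.40 * raw_gb, 0.80 * raw_gb

print(f"raw pool: {raw_gb:.1f} GB")
print(f"typical data to replicate: {low_gb:.1f} to {high_gb:.1f} GB")
```

The raw figure works out to about 136 GB, so the typical amount of VTAPE data is on the order of 55 to 109 GB.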
BRP
Finally, the powers that be are considering remote shadowing of DASD as the way to handle the BRP situation. The time we are allotted to recover the system has been reduced to a number that is impossible using tape backups.

I would appreciate it if anyone who is already doing this would regale me with their experiences - what they are doing, what the gotchas are, how satisfied they are, etc. It undoubtedly is different depending on the dasd vendors, so here is what we have:

* EMC DASD - about half of our DASD.
* HDS DASD - the other half.
* Currently, there is no SCSI; it is all ECKD.

We currently have no IBM DASD; however, that does not mean that we will not have some in the future. Every couple of years, we go through a DASD refresh, at which time we may change vendors.

I will gladly accept replies on or off list. TIA.

Regards,
Richard Schuh
Re: BRP
Richard,

We're using disk replication on HDS DASD for disaster recovery. We were able to reduce a script that had about 30 steps to one that isn't much more complicated than:

1. Suspend replication
2. IPL
3. Answer one question: is this a test or a real disaster?
4. Certify

The test or real disaster answer determines the network configuration. DR tests are behind a firewall, so they use different OSAs than we would use for a real disaster. The answer can also be used for other things, such as determining which guests are started.

We don't start z/Linux guests automatically during DR tests, because our Linux security product has a problem with two guests having the same name. If the DR guest refreshes keys or whatever it does with Active Directory, the production guest loses access. This seems to happen around midnight, so we start up DR guests when we're ready to test and shut them down as soon as we're done. One of our Linux people could explain this much better than I can.

Make sure that your CP maintenance is current. We had a CP abend and guest I/O errors when we resumed replication on a z/VM 5.4.0 RSU 0801 or 0802 (I forget which) system. The problem went away when we upgraded to RSU 1001 plus some additional service.

We have two copies of the disks in our DR site. This allows us to continue replicating production on the secondary disks while we test on the tertiary disks.

Dennis

In all matters of opinion, our adversaries are insane. -- Mark Twain
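The "one question" step described in this message, where the test-or-real answer picks the network configuration and decides which guests start automatically, can be sketched as a small decision function. Everything named here (the `Guest` type, the guest names, the OSA labels) is hypothetical; the actual script is not shown in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guest:
    name: str
    is_linux: bool


def startup_plan(mode, guests):
    """Pick the network and autostart list from the test/real answer.

    Assumption for illustration: Linux guests are held back during a test
    (to avoid clashing with same-named production guests) and started by
    hand only when the testers are ready.
    """
    if mode not in ("test", "real"):
        raise ValueError("mode must be 'test' or 'real'")
    network = "firewalled-test-osa" if mode == "test" else "production-osa"
    autostart = [g.name for g in guests if mode == "real" or not g.is_linux]
    return {"network": network, "autostart": autostart}


# Hypothetical guest list.
guests = [Guest("TCPIP", False), Guest("LNXWEB1", True), Guest("RACFVM", False)]
print(startup_plan("test", guests))
print(startup_plan("real", guests))
```

Keeping the decision in one place means the rest of the recovery procedure is identical for tests and real events, which is part of what lets the script shrink to a handful of steps.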