Re: DDR'ing 3390 DASD To Remote Location

Michael Coffin Wed, 18 Jun 2008 07:04:28 -0700

Hi Tom,

Holy Cow!  A 3 MONTH recovery window?!?!?  I don't think I've ever come
across such a "generous" recovery window, most companies would be out of
business in 3 months without access to their mission-critical systems.
:)


On another note, has anyone ever come across an FTP stage for the CMS
PIPE command?  Someone on the LINUX-390 listserv suggested there might
have been one once upon a time.

-Mike

-----Original Message-----
From: The IBM z/VM Operating System [mailto:[EMAIL PROTECTED] On
Behalf Of Tom Duerbusch
Sent: Tuesday, June 17, 2008 3:42 PM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: DDR'ing 3390 DASD To Remote Location


It depends on your recovery window.  
In the shop where we don't have disaster recovery tests, the window is
officially 3 months, but I have it down to 1 month. In the shop where we
do disaster recovery tests, the window is 3 days. IMHO, the closer you
get to a hot site (already spinning a copy of your files), the more you
need the monkey boy.  In the "days" recovery window, good instructions
are needed.  In the "month" recovery window, we can get a system
programmer consultant (to replace "me, a consultant"), that can get the
system up.

But all that depends on your recovery time frame, what you have in place
(contacts, contracts, locations, etc), and what the shop is willing to
pay for.

We are pretty much in sync in the other aspects.

I'm planning on a VM recovery system, on a single 3590 tape, with the
script to restore the system, on floppy.  That includes a small VSE
system (no VM tape management, but VSE does have Dynam, and all the rest
of the 390 backups are done within VSE), to restore each of the other
VSE systems from tape (currently 3590, soon to be TS1120).  Then, we
(may) need to do application level restores from the most current backup
available.  But that can be done via the scheduling software.  

Yep, DDR is nice in that it restores the cylinders that are on that tape
without caring if the cylinders before or after are done.  Just mount
the tapes, have rexx scan the TLBL, attach the disk drive and restore
that set of cylinders.  Eventually, all cylinders that got backed up,
will be restored.  I had a similar concept back in the late 80's/early
'90s.  
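
A minimal sketch of the bookkeeping behind "mount the tapes in any order", assuming each tape label can be parsed into a volume serial and an inclusive cylinder range. The function names and the label contents are hypothetical; the actual DDR restore and TLBL scan are not shown.

```python
def record_restore(progress, volser, first_cyl, last_cyl):
    """Mark cylinders first_cyl..last_cyl (inclusive) of volser as restored."""
    progress.setdefault(volser, set()).update(range(first_cyl, last_cyl + 1))


def missing_cylinders(progress, volser, total_cyls):
    """Return the cylinders of volser (0..total_cyls-1) not yet restored."""
    restored = progress.get(volser, set())
    return [c for c in range(total_cyls) if c not in restored]


def is_complete(progress, volser, total_cyls):
    """True once every expected cylinder of volser has been restored."""
    return not missing_cylinders(progress, volser, total_cyls)
```

For a 3390-3 (3339 cylinders), restoring a tape holding cylinders 2000-3338 before the tape holding 0-1999 leaves the volume complete all the same; only the union of the ranges matters, not the mount order.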

Thanks for the info

Tom Duerbusch
THD Consulting

Law of Cat Acceleration

  A cat will accelerate at a constant rate, until he gets good and
  ready to stop.


>>> Michael Coffin <[EMAIL PROTECTED]> 6/17/2008 12:28 PM >>>
Hi Tom,

You really should plan towards "monkey boy" as being the only resource
capable of performing the recovery.  I classify "disasters" in three
classes:

1.  Minor: This would be like a prolonged regional power failure, but
all facilities, systems and equipment are still present.

2.  Major:  This would be something involving the total loss of the
computer facility and perhaps even some staff, but at least some people
with expertise and knowledge of the business and systems are available
to participate in the recovery.  For example, a fire at the computer
facility.

3.  Total:  All facilities and staff are impacted and unavailable,
nobody has expertise and knowledge of the business and systems - this is
the "monkey boy" scenario.  An example might be a natural disaster
(hurricane, tornado, earthquake, etc.) or act of terrorism.

The recovery system is prestaged on tape, it's a small footprint z/VM
system with EVERYTHING needed to perform the recovery (it senses
available ARM libraries, networks, DASD, etc. etc.).  Restoring the
recovery system to disk and IPL'ing is the one part of the process that
is not menu driven, but it's well documented and really only involves a
few steps:

1.  Mount recovery tape.

2.  IPL DDR on the head of the recovery tape.

3.  Restore the recovery system to disk.

4.  IPL the recovery system from disk and startup the menu-based
recovery system.

The 4 steps above are the most "technical" part of the process, the
"monkey boy" simply needs to execute commands and provide responses
provided on a Recovery Worksheet completed by the host site technical
staff, so even though it is a "technical" process, it's really more a
question of following instructions (whether you understand them or not).
:)

I think we probably use around 30 or 40 full 3590 tapes, but since the
entire process is automated and you can run multiple concurrent restore
streams it becomes a moot point (basically you just tell the process how
many 3590 drives are available for your use in the available ARM's and
it will dispatch a number of dynamic DDR recovery slaves, one per tape
drive).
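
The "one slave per tape drive" dispatch described above can be sketched roughly as follows. This is an illustration only, not the actual recovery system: `run_restore_streams` and the `restore_one` callback are hypothetical names, and the callback stands in for whatever really drives DDR against a drive.

```python
import queue
import threading


def run_restore_streams(tapes, drives, restore_one):
    """Run one worker per drive; each worker restores tapes until none remain.

    restore_one(drive, tape) is a caller-supplied stand-in for the real
    restore step. Returns (drive, tape) pairs in completion order.
    """
    pending = queue.Queue()
    for tape in tapes:
        pending.put(tape)

    done = []
    done_lock = threading.Lock()

    def worker(drive):
        while True:
            try:
                tape = pending.get_nowait()
            except queue.Empty:
                return  # no tapes left for this drive
            restore_one(drive, tape)
            with done_lock:
                done.append((drive, tape))

    threads = [threading.Thread(target=worker, args=(d,)) for d in drives]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done
```

Because every worker pulls from the same queue, a drive that finishes early simply picks up the next tape, so the number of tapes matters less than the number of drives available.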

Years ago I wrote an automated recovery system for BellCore (the
research and development arm of the old "phone company").  Back then, it
might take 3 or 4 3480 carts to hold a single DASD spindle.  That
process was pretty elegant too, the SL on the tapes (and yes, all of my
DDR tapes use SL tapes - which of course is a challenge in and of
itself!) specified what contents were on that cartridge.  The operators
would fire up the DR Recovery process and simply start opening buckets
of tapes and stuffing them into 3480 autoloaders in any order.  The
recovery system would piece it all together, and of course had a nice
status monitor showing the progress of each spindle recovery and the
ability to quiesce/restart slaves and such.  I remember one time when we
were at Sungard for a DR drill, the z/OS guys (MVS back in those days)
were busily responding to a million human-interactive steps, locating
and mounting specific tapes, initiating jobs to read tapes/write to
DASD, etc. etc. - the VM guys were just sitting there having a coffee
and watching the automated recovery monitor, occasionally opening a
fresh tub of tapes and stuffing them into 3480 autoloaders in no
particular order.  Since we also had multiple concurrent streams going
our recovery was done painlessly and QUICKLY compared to the poor ol'
MVS guys.  :)

-Mike

-----Original Message-----
From: The IBM z/VM Operating System [mailto:[EMAIL PROTECTED] On
Behalf Of Tom Duerbusch
Sent: Tuesday, June 17, 2008 1:04 PM
To: IBMVM@LISTSERV.UARK.EDU 
Subject: Re: DDR'ing 3390 DASD To Remote Location


I do like the "monkey boy" concept.  I keep trying to go towards that
state.

You have a menu driven system, which implies you are not restoring to
bare iron.  So what was prestaged?  Do you have a mini VM system on tape
(or DVD) to be restored?  If this is a Flex or P/390 type system, never
mind.  They have some other, easier, more interesting options.  

For one site, which doesn't do disaster recovery tests, I'm thinking of
a TS1120 backup and a script, which will be tested on one of our LPARs,
for recovery of our systems.

At another site, which does disaster recovery tests, and Sungard has VM,
I'm looking at scripts for the standalone side, to eliminate the people
errors when restoring their systems.  BTW, they keep the most recent 2
generations offsite, just in case of a media failure.  They are 3480
based, and the standalone restores take some 80 tapes.  Uggg!

Tom Duerbusch
THD Consulting

>>> Michael Coffin <[EMAIL PROTECTED]> 6/17/2008 11:49 AM >>>
Hi Tom,

Yep, I have that all covered.  This is actually a DR process that I
developed about 8 years ago (and have been improving over the years)
that is a complete "soup to nuts" system, automating both the backups
AND the recovery.  The system assumes a "worst case scenario" where
the computer center is gone, and all of the people having detailed knowledge
of the system are likewise "gone", it allows someone with virtually no
knowledge about the specifics of the system being recovered and very
minimal mainframe knowledge to fully recover the system.  When we
conduct DR drills we typically recruit a management type that has
minimal computing skills and little specific knowledge about the system
(someone we affectionately refer to as the "monkey boy", with the
understanding that a slightly trained monkey could actually complete
this task) to actually conduct the recovery.  The programming staff
provides no input to the "monkey boy", instead taking notes of anything
in the documentation that they found unclear and/or any technical
problems that may arise.  The entire process is menu-driven and pretty
slick (including a Recovery Monitor that reports what each DDR slave is
doing, what its ETA to completion of the current task is, what the
total ETA to full system recovery is, the ability to quiesce and restart
slaves/streams/devices, etc.).

In 2006 there was a massive flood in Washington DC that required
implementation of this DR Plan.  I'm pleased to say it worked without a
hitch, and from the time we got the green light to start spinning tapes
we were back up and running in something like 3 or 4 hours (I think we
had about 20 DDR slaves running simultaneously).  While this process
works extremely well, I now want to remove tapes from the process -
there are a number of reasons why this makes sense:

1.  There is a Federal mandate to encrypt all removable media which this
site is subject to, and we don't presently have TS1120 tape
drives/cartridges.

2.  Tapes can be lost and/or damaged (damage used to happen with
ALARMING frequency!), one bad tape and your entire recovery could be
jeopardized.

Ultimately, I'd like to have our production DASD replicate (either in
realtime or via a nightly batch job) to a remote DASD array using PPRC -
but until such time as I get funding to do that (perhaps never!) I need
to eliminate the darned tapes.  :)

-Mike

-----Original Message-----
From: The IBM z/VM Operating System [mailto:[EMAIL PROTECTED] On
Behalf Of Tom Duerbusch
Sent: Tuesday, June 17, 2008 12:31 PM
To: IBMVM@LISTSERV.UARK.EDU 
Subject: Re: DDR'ing 3390 DASD To Remote Location


The back half of this also needs to be considered.

In case of an actual disaster, the process you are using requires a
running VM system before you can do the restore.  Disaster recovery
sites have VM running.  If you have your own replacement hardware, you
can bring up the VM starter system, but you may still need software,
along with scripts etc, on that site, in order for you to connect to
your backup site, and bring the volumes back.

Just something to consider before getting too far into the backup half of
the project.

Tom Duerbusch
THD Consulting

>>> Michael Coffin <[EMAIL PROTECTED]> 6/17/2008 10:42 AM >>>
(Cross-posted on VMESA-L and LINUX-390)
 
Hi Folks,
 
I want to eliminate use of tapes in my weekly DR process.  Currently we
DDR numerous 3390 spindles to 3590 tape cartridges.
 
I have set up a Linux server at our DR site with a ton of free disk
space, but the question becomes what is the best method to get images of
our DASD stored on it?
 
I've modified our procedures to use DDR2CMS to create CMS files
representing the 3390 DASD images, which are then FTP'd to the Linux
server - but the process is VERY inefficient:

1.      DDR2CMS produces RECFM=V files which are unsuitable for FTP
(I've NEVER had any luck successfully FTP'ing RECFM=V files to a non-CMS
environment and getting them back in the correct format later), so I
have to COPYFILE (PACK the output from DDR2CMS.   DDR2CMS takes around
47 minutes/spindle, and the COPYFILE takes around 38 minutes - the FTP
only takes around 17 minutes!  So we are really wasting nearly 90
minutes/spindle just prepping the data to be transmitted.
2.      The output from DDR2CMS for a 3390-3 spindle may actually be
LARGER than a 3390-3 spindle (even using COMPACT), so we need to use
3390-9 spindles as "work space", something I'm not fond of doing (as a
general rule we don't use 3390-9's at this site, but I configured a
string of them just for this purpose).
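
Worked through with the figures quoted above, the per-spindle cost looks like this. The three timings come from the post; the sequential-processing assumption and the function names are illustrative only.

```python
DDR2CMS_MIN = 47   # DDR2CMS dump, minutes per spindle (from the post)
COPYFILE_MIN = 38  # COPYFILE (PACK, minutes per spindle (from the post)
FTP_MIN = 17       # FTP transfer, minutes per spindle (from the post)


def per_spindle_minutes():
    """Return (prep, total) minutes for one spindle."""
    prep = DDR2CMS_MIN + COPYFILE_MIN
    return prep, prep + FTP_MIN


def weekly_minutes(spindles):
    """Total minutes, assuming spindles are processed one after another."""
    _prep, total = per_spindle_minutes()
    return spindles * total
```

That is 85 minutes of preparation against 17 minutes of transfer, or 102 minutes per spindle, so roughly five sixths of the elapsed time is spent before any data leaves the system.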

There is a great tool on the VM download page called PIPEDDR which
basically does what DDR2CMS does using PIPE TRACKREAD - and it can write
the output to a TCPIP stage.  This is exactly what I'm looking for, with
ONE important difference - PIPEDDR only talks to a remote VM/CMS system
running PIPEDDR to receive the output, I need to be able to PIPE the
output to a remote Linux storage server.
 
Can anyone recommend a nice client that can run on Linux and listen on a
TCPIP port, accept some authorization credentials and host commands
(i.e. MKDIR, CD to dir, etc.) and receive/write to disk a stream of data
similar to what PIPEDDR might write to its TCPIP stage?  I could then
skip creating the DDR2CMS file and COPYFILE (PACKing it, writing
"indirectly" to the Linux server.  I'd rather not reinvent the wheel if
there's already something out there.  :)
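
A bare-bones sketch of the Linux side of such a listener, assuming the sender simply writes raw bytes and closes the connection when the image is complete. It does not implement PIPEDDR's actual wire format, and it has no authentication or host commands; `receive_stream`, `serve_once`, and the port, directory and file name shown afterwards are all hypothetical.

```python
import os
import socket

CHUNK = 64 * 1024  # read size in bytes; an arbitrary choice


def receive_stream(conn, out_path, chunk_size=CHUNK):
    """Copy everything the peer sends to out_path until it closes the socket."""
    total = 0
    with open(out_path, "wb") as out:
        while True:
            data = conn.recv(chunk_size)
            if not data:  # empty read means the sender closed the connection
                break
            out.write(data)
            total += len(data)
    return total


def serve_once(host, port, out_dir, name):
    """Accept a single connection and store its payload as out_dir/name."""
    os.makedirs(out_dir, exist_ok=True)
    with socket.create_server((host, port)) as srv:
        conn, _peer = srv.accept()
        with conn:
            return receive_stream(conn, os.path.join(out_dir, name))
```

Something like `serve_once("0.0.0.0", 5000, "/srv/dr-images", "VOL001.img")` would then wait for one sender and write whatever arrives to disk. Anything beyond that (credentials, MKDIR/CD-style commands, encryption in transit) would have to be layered on top, for example by tunnelling the connection through SSH.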
 
PS:  It would be sweet if there were just a way to mount a remote EXT3
filesystem somehow on CMS, but it looks like the only way to do this is
with NFS, which is a problem because it is considered an "unsafe
protocol".  :(
 
-Mike
