If indeed all of your I/O is a consistent copy across that DMX 1000, then I think you could do it while they are up. It would be no different than a VM abend or other failure. You might get some file system errors that should be correctable with fsck (you are running a journaled FS, presumably). I'd still want the file system backups for data that can't otherwise be recreated, just in case... And there's nothing like backups for "oops, I erased that file I shouldn't have".
What does EMC say? What happens, too, when you outgrow the one DMX 1000 and have to buy a second box? (Or maybe you don't have those wacky growth issues.)

Marcy

-----Original Message-----
From: Linux on 390 Port [mailto:linux-...@vm.marist.edu] On Behalf Of Hallock, Arthur T
Sent: Monday, September 28, 2009 11:47 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: [LINUX-390] How to ensure I/O write order

From Marcy:
What's a consistency option? Does it go across all LCUs and file systems? Linux does timestamp its I/O. VM does not (currently - see IBM). I'm not familiar with EMC's consistency. We do use GDPS/XRC for our recovery, giving a recovery point of less than a minute. Does this EMC solution use those? Once you have many, many interdependent applications, the same point-in-time recovery becomes very important.

My response:
We have all VM and Linux volumes on a single physical controller, an EMC DMX-1000. On z/OS I run a batch job to execute an EMCSNAP utility. EMCSNAP allows me to identify the volumes to snap and to make a consistent (point-in-time) copy. The controller handles the I/O activity such that a point-in-time copy is made to the target volumes.

In our mirrored configuration, the local DMX-1000 is mirrored to another DMX-1000 using EMC's SRDF/A feature. The controllers handle all aspects of the mirroring (the remote DMX-1000 is not connected to a host until DR time). You can logically view this mirroring as the local controller performing a consistent snap of all volumes every thirty seconds to the remote controller.

We are primarily a z/OS shop, and all of the production and system-type volumes for z/OS, z/VM, and z/Linux reside on the local DMX-1000 controller. There is a second local EMC controller used to support development environments. It is not mirrored. The z/VM is used to support the z/Linux only. There are no CMS users other than those needed to support the systems. The z/Linux servers are used to support a couple of lightly used applications under WebSphere with DB/2 as the database. An Oracle database is also on a z/Linux server and is lightly used for a z/OS application.

From Mark Post and David Boyes:
Both stress that the Linux cache will cause problems with recovery and that the databases should be backed up under the Linux OS.

My response:
We do perform database backups under the Linux OS. They can be used for local site recovery and are available at the DR site (because they are on the DASD that is mirrored and snapped/dumped to tape). I suppose for a tape recovery, the database backups could be used, since most of the time the snaps are performed after the database is backed up. The idea behind the mirroring is to reduce the amount of data loss. But recovering from the database backup and forward recovering with the logs would work. In either case, it would just be easier to IPL and go at the DR site.

Since VM and most of the servers are static (little I/O), I don't expect problems getting them started at the DR site. The main concern is recovering the databases. If DB/2 states it can automatically restart/recover after a server failure and reboot, then what is the difference between a failure (where the cache didn't get written) and a consistent snap? I would think DB/2 and Oracle would need to somehow compensate for how Linux caches the write I/Os. Otherwise, their claim to be able to restart/recover from a crash is somewhat misleading.

I appreciate the feedback.
Art

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390