One of the selling points I've been using to promote Linux here is that it appears to be the first truly cost-effective means of getting our Unix/Linux mid-range systems to our hot site. You know the drill: the Linux file systems reside on 390 DASD, the same DASD we subscribe to at hot site; the hardware is the same, so we can run our guests under VM there as well; there is no need to buy duplicate hardware; and so on.
Our approach has been to use our OS/390 LPAR to touch the z/VM DASD volumes (temporarily) and take full volume dumps using ADRDSSU. The full volumes contain the Linux mini disks for one or more of the Linux guests. The production OS/390 LPAR can then handle the processing of the tapes for hot site just like all the other OS/390 disaster recovery backups we take - i.e. they fall into a designated tape rotation to our offsite tape vault, and can subsequently be pulled for transport to hot site. Once at hot site, we should be able to restore these full volume dumps, have all of our mini disks back, and then fire up z/VM and the Linux guests. We are using the same process to back up the z/VM system volumes.

We are not having good luck getting the Linux systems to boot when we test the restore process here. We (thought we) understood the need to have the Linux guest quiesced so we could get a clean dump, and so we issued a 'sync' command in the affected guest(s), hoping all of the buffers would be written to disk. What (we think) is happening is that either some level of VM caching is still keeping some of the data from being written when we expect it, or - even though it's late at night - there is still some activity going on in the Linux OS that is corrupting the backups. The IPLs fail at assorted points in the boot process. We are now thinking that we may have to shut down the guests in order to get reliable backups.

My questions relating to this are: Does this seem like a feasible approach? Is anyone out there backing up a large number of guests in this, or a similar, manner? If so, how are you doing it? Is there a better way to temporarily quiesce a 390-Linux system to get good copies of the mini disk volumes?
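For reference, the fallback we are considering (shutting the guests down around the dump) amounts to the ordering sketched below in Python. This is only an illustration of the sequence, not something we run: the three callables (shutdown_guest, dump_volume, start_guest) are hypothetical placeholders for whatever actually stops a guest, runs the full volume dump, and restarts the guest, and the guest and volume names in the example further down are made up.

```python
from typing import Callable, Dict, List


def backup_guests(
    guest_volumes: Dict[str, List[str]],
    shutdown_guest: Callable[[str], None],  # hypothetical: stops one Linux guest
    dump_volume: Callable[[str], None],     # hypothetical: full volume dump of one volume
    start_guest: Callable[[str], None],     # hypothetical: restarts one Linux guest
) -> List[str]:
    """Shut down every guest, dump each volume once, then restart the guests.

    Returns the list of steps in the order they were performed.
    """
    steps: List[str] = []

    # A full volume can hold mini disks for several guests, so every guest
    # must be down before any volume is dumped.
    for guest in guest_volumes:
        shutdown_guest(guest)
        steps.append(f"shutdown {guest}")

    try:
        # Dump each distinct volume exactly once, in first-seen order.
        dumped = set()
        for volumes in guest_volumes.values():
            for volume in volumes:
                if volume not in dumped:
                    dumped.add(volume)
                    dump_volume(volume)
                    steps.append(f"dump {volume}")
    finally:
        # Restart the guests even if a dump fails part way through.
        for guest in guest_volumes:
            start_guest(guest)
            steps.append(f"start {guest}")

    return steps
```

Called with two made-up guests that share a volume, e.g. backup_guests({"LINUX01": ["VMLX01"], "LINUX02": ["VMLX01", "VMLX02"]}, ...), the shared volume is dumped once and both guests stay down for the whole dump, which is exactly the outage we would like to shorten.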
We would like the Linux guests to be available as close as possible to 7x24x365, meaning we would like to avoid a large "backup window" type outage, and we are thinking about buying the "point-in-time" backup capability for our DASD subsystems to reduce this window. The process would be similar to what I described above, except that the backups would take seconds per volume, which would mean the Linux guests could resume work after the "Snap" occurred, and the "snapped" copies could be dumped to tape as the tape drives became available.

Any help would be appreciated.

Chuck Gowans
USDA - Nat'l IT Center - Kansas City