Re: Backups of Linux volume

2002-11-12 Thread John Summerfield
On Tue, 12 Nov 2002, Alan Cox wrote:

> On Tue, 2002-11-12 at 18:22, Gowans, Chuck wrote:
> > The IPLs fail at assorted points in the boot process.  We are thinking that
> > we now may have to shut down the guests in order to get reliable backups.
> >
>
> That sounds a lot more drastic a failure than I would expect if the dump
> is being done cleanly. The way I dump my live file systems is a little
> zany but it works for me
>
> I hot add a mirror volume to the raid
> I wait for a rebuild to finish
> I hot remove it
>
> at that point it's close enough that the ext3 journalling log replay
> should fix it up, and if not fsck will have almost no work to do. I
> guess the right approach if you run LVM would be to make a snapshot and
> back up the snapshot.
>
> I don't think what you are doing is impossible, and it's a rather
> important item to get right for a 24x7 OS. In the x86 world drbd seems
> to be the popular approach (basically raid over network)

I would expect _some_ problems because you're backing up a filesystem that's
mounted rw. I would expect the filesystem to be marked dirty and so need some
fscking on boot.

Apparently you _can_ do a coherent backup on Linux if you're using LVM.



--


Cheers
John.

Please, no off-list mail. You will fall foul of my spam treatment.
Join the "Linux Support by Small Businesses" list at
http://mail.computerdatasafe.com.au/mailman/listinfo/lssb



Re: Backups of Linux volume

2002-11-12 Thread Loren Charnley, Jr.
Chuck,

Although I only have two LINUX guests under zVM 3.1, I have an exec that
runs the backups using DDR.  I have the operator shut down each instance
while the backup runs, and at the end the EXEC asks whether to
IPL (boot) or not.  I successfully recovered both LINUX guests when I went
through a DR in September.

Loren Charnley, Jr.
Tech Support Administrator
Family Dollar Stores, Inc.
Phone:  (704) 847-6961 Ext. 2000

> -----Original Message-----
> From: Gowans, Chuck [SMTP:[EMAIL PROTECTED]]
> Sent: Tuesday, November 12, 2002 1:22 PM
> To:   [EMAIL PROTECTED]
> Subject:  Backups of Linux volume
>
> One of the selling points I've been using to promote Linux here is the fact
> that it appears to be the first truly cost effective means of transporting
> Unix/Linux mid-range systems - affordably - to our hot site.  You know the
> drill: the Linux file systems reside on 390 DASD, the same DASD we
> subscribe to at hot site...  the hardware is the same so we can run our
> guests under VM there also, no need to buy duplicate hardware etc., etc.
>
> Our approach has been that I can use our OS/390 LPAR to touch the z/VM DASD
> volumes (temporarily) and get full volume dumps using ADRDSSU.  The full
> volumes would contain Linux mini disks for one or more of the Linux guests.
> The production OS/390 LPAR can then handle the processing of the tapes for
> hot site just like all other OS/390 disaster recovery backups we take - i.e.
> they fall into a designated tape rotation to our offsite tape vault, and
> subsequently can be pulled for transport to hot site.  Once at hot site, we
> should be able to restore these full volume dumps and would have all of our
> mini disks back and can then fire up z/VM and the Linux guests.  We are
> using the same process to back up the z/VM system volumes.
>
> We are not having good luck getting the Linux systems to boot when we have
> tested the restore process here.  We (thought we) understood the need to
> have the Linux guest quiesced so we could get a clean dump - and as such
> issued a 'sync' command hoping all of the buffers would be written to disk
> from the affected guest(s).  What (we think) is apparently happening is that
> some level of VM caching is still keeping all of the data from getting
> written when we expect it, or - even though it's late at night - there is
> still some activity going on in the Linux OS that's corrupting the backups.
> The IPLs fail at assorted points in the boot process.  We are thinking that
> we now may have to shut down the guests in order to get reliable backups.
>
> My questions relating to this are:
>
> Does this seem like a feasible approach?
> Is anyone out there backing up a large number of guests in this, or a
> similar manner?  If so, how are you doing it?
> Is there a better way to temporarily quiesce a 390-Linux system to get good
> copies of the mini disk volumes?
>
> We would like the Linux guests to be available as close as possible to
> 7x24x365 - meaning we would like to avoid a large "backup window" type
> outage; and we are thinking about buying the "point-in-time" backup
> capability for our DASD subsystems to reduce this window.  The process would
> be similar to what I described above except the backups would take seconds
> per volume - which would mean the Linux guests could resume work after the
> "Snap" occurred - and the "snapped" copies could be dumped to tape as the
> tape drives became available.
>
> Any help would be appreciated.
>
> Chuck Gowans
> USDA - Nat'l IT Center - Kansas City
>
>
> 
> NOTE:
> This e-mail message contains PRIVILEGED and CONFIDENTIAL information and
> is intended only for the use of the specific individual or individuals to
> which it is addressed. If you are not an intended recipient of this
> e-mail, you are hereby notified that any unauthorized use, dissemination
> or copying of this e-mail or the information contained herein or attached
> hereto is strictly prohibited. If you receive this e-mail in error, notify
> the person named above by reply e-mail and please delete it. Thank you.




Re: Backups of Linux volume

2002-11-12 Thread Alan Altmark
On Tuesday, 11/12/2002 at 12:22 CST, "Gowans, Chuck"
<[EMAIL PROTECTED]> wrote:
[snip]
> We are not having good luck getting the Linux systems to boot when we have
> tested the restore process here.  We (thought we) understood the need to
> have the Linux guest quiesced so we could get a clean dump - and as such
> issued a 'sync' command hoping all of the buffers would be written to disk
> from the affected guest(s).  What (we think) is apparently happening is that
> some level of VM caching is still keeping all of the data from getting
> written when we expect it, or - even though it's late at night - there is
> still some activity going on in the Linux OS that's corrupting the backups.
> The IPLs fail at assorted points in the boot process.  We are thinking that
> we now may have to shut down the guests in order to get reliable backups.

The VM minidisk cache is probably getting in your way.  Look at the CP SET
MDCACHE command for ways to flush (or turn off) the cache after you have
quiesced Linux.  You can do it on a volume or minidisk basis, depending on
how you do your backups.
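
The ordering described above (quiesce the guest, flush the cache, then
dump) can be sketched as follows.  This is only an illustration: the
function name backup_quiesced and its four callables are hypothetical
placeholders, and the actual quiesce, cache flush, dump and resume
commands are left to the caller rather than guessed at here.

```python
from typing import Callable, List


def backup_quiesced(
    quiesce: Callable[[], None],
    flush_cache: Callable[[], None],
    dump: Callable[[], None],
    resume: Callable[[], None],
) -> List[str]:
    """Run the backup steps in order and return the names of the steps run.

    All four callables are hypothetical hooks supplied by the caller.
    """
    steps: List[str] = []
    quiesce()                      # stop the guest writing to its disks
    steps.append("quiesce")
    try:
        flush_cache()              # push any cached writes out to the volume
        steps.append("flush_cache")
        dump()                     # only now take the volume or minidisk dump
        steps.append("dump")
    finally:
        # Always bring the guest back, even if the flush or dump failed.
        resume()
        steps.append("resume")
    return steps
```

The try/finally is a design choice, not something from the thread: it
keeps a failed dump from leaving the guest quiesced.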

Where 7x24x365 is required, you should already have load
balancers/sprayers, etc. in place.  Make sure your web/app server
"clusters" have minidisks on different volumes so you can take down one
server at a time, yet still maintain the service.
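
A minimal sketch of checking such a layout, assuming that dumping a
volume takes down every guest with a minidisk on it.  The function
plan_rolling_backups, the server and cluster names, and the volume
labels are all made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def plan_rolling_backups(placement: Dict[str, Tuple[str, str]]) -> List[List[str]]:
    """Return one batch of servers per volume, in volume order.

    placement maps server name -> (cluster name, volume label).
    Raises ValueError if backing up one volume would take down every
    member of some cluster at once.
    """
    by_volume: Dict[str, List[str]] = defaultdict(list)
    cluster_size: Dict[str, int] = defaultdict(int)
    for server, (cluster, volume) in sorted(placement.items()):
        by_volume[volume].append(server)
        cluster_size[cluster] += 1

    batches: List[List[str]] = []
    for volume in sorted(by_volume):
        servers = by_volume[volume]
        down: Dict[str, int] = defaultdict(int)
        for server in servers:
            down[placement[server][0]] += 1
        for cluster, count in down.items():
            if count == cluster_size[cluster]:
                raise ValueError(
                    f"volume {volume} holds every member of cluster {cluster}"
                )
        batches.append(servers)
    return batches


if __name__ == "__main__":
    layout = {
        "web1": ("web", "VOL001"),
        "web2": ("web", "VOL002"),
        "app1": ("app", "VOL001"),
        "app2": ("app", "VOL002"),
    }
    for batch in plan_rolling_backups(layout):
        print(batch)
```

Each batch is the set of guests to shut down together while their
volume is dumped; the next batch starts only after the previous one is
back up.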

Alan Altmark
Sr. Software Engineer
IBM z/VM Development



Re: Backups of Linux volume

2002-11-12 Thread Alan Cox
On Tue, 2002-11-12 at 18:22, Gowans, Chuck wrote:
> The IPLs fail at assorted points in the boot process.  We are thinking that
> we now may have to shut down the guests in order to get reliable backups.
>

That sounds a lot more drastic a failure than I would expect if the dump
is being done cleanly. The way I dump my live file systems is a little
zany but it works for me

I hot add a mirror volume to the raid
I wait for a rebuild to finish
I hot remove it

at that point it's close enough that the ext3 journalling log replay
should fix it up, and if not fsck will have almost no work to do. I
guess the right approach if you run LVM would be to make a snapshot and
back up the snapshot.
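
A rough sketch of the snapshot route, assuming the LVM2 command-line
tools (lvcreate, lvremove) plus ordinary mount, tar and umount.  The
function name, the volume group and logical volume names, the snapshot
size, the mount point and the archive path are all placeholders.  The
function only builds the command lines; it does not run anything.

```python
from typing import List


def snapshot_backup_commands(
    vg: str,
    lv: str,
    snap_size: str = "1G",
    mount_point: str = "/mnt/snap",
    archive: str = "/backup/lv.tar.gz",
) -> List[List[str]]:
    """Build the commands for an LVM snapshot backup, in the order to run them.

    Every argument is an illustrative placeholder; adjust to the real system.
    """
    snap = f"{lv}_snap"
    return [
        # Create a point-in-time snapshot of the live logical volume.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap, f"/dev/{vg}/{lv}"],
        # Mount the snapshot read-only so the backup sees a frozen image.
        ["mount", "-o", "ro", f"/dev/{vg}/{snap}", mount_point],
        # Archive the snapshot contents rather than the live filesystem.
        ["tar", "-czf", archive, "-C", mount_point, "."],
        # Clean up: unmount, then drop the snapshot.
        ["umount", mount_point],
        ["lvremove", "-f", f"/dev/{vg}/{snap}"],
    ]


if __name__ == "__main__":
    for cmd in snapshot_backup_commands("vg0", "root"):
        print(" ".join(cmd))
```

Printing the commands first and running them by hand (or through
subprocess.run once they look right) keeps a typo in a volume name from
doing damage.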

I don't think what you are doing is impossible, and it's a rather
important item to get right for a 24x7 OS. In the x86 world drbd seems
to be the popular approach (basically raid over network)


Alan



Backups of Linux volume

2002-11-12 Thread Gowans, Chuck
One of the selling points I've been using to promote Linux here is the fact
that it appears to be the first truly cost effective means of transporting
Unix/Linux mid-range systems - affordably - to our hot site.  You know the
drill: the Linux file systems reside on 390 DASD, the same DASD we
subscribe to at hot site...  the hardware is the same so we can run our
guests under VM there also, no need to buy duplicate hardware etc.,
etc.

Our approach has been that I can use our OS/390 LPAR to touch the z/VM DASD
volumes (temporarily) and get full volume dumps using ADRDSSU.  The full
volumes would contain Linux mini disks for one or more of the Linux guests.
The production OS/390 LPAR can then handle the processing of the tapes for
hot site just like all other OS/390 disaster recovery backups we take - i.e.
they fall into a designated tape rotation to our offsite tape vault, and
subsequently can be pulled for transport to hot site.  Once at hot site, we
should be able to restore these full volume dumps and would have all of our
mini disks back and can then fire up z/VM and the Linux guests.  We are
using the same process to back up the z/VM system volumes.

We are not having good luck getting the Linux systems to boot when we have
tested the restore process here.  We (thought we) understood the need to
have the Linux guest quiesced so we could get a clean dump - and as such
issued a 'sync' command hoping all of the buffers would be written to disk
from the affected guest(s).  What (we think) is apparently happening is that
some level of VM caching is still keeping all of the data from getting
written when we expect it, or - even though it's late at night - there is
still some activity going on in the Linux OS that's corrupting the backups.
The IPLs fail at assorted points in the boot process.  We are thinking that
we now may have to shut down the guests in order to get reliable backups.

My questions relating to this are:

Does this seem like a feasible approach?
Is anyone out there backing up a large number of guests in this, or a
similar manner?  If so, how are you doing it?
Is there a better way to temporarily quiesce a 390-Linux system to get good
copies of the mini disk volumes?

We would like the Linux guests to be available as close as possible to
7x24x365 - meaning we would like to avoid a large "backup window" type
outage; and we are thinking about buying the "point-in-time" backup
capability for our DASD subsystems to reduce this window.  The process would
be similar to what I described above except the backups would take seconds
per volume - which would mean the Linux guests could resume work after the
"Snap" occurred - and the "snapped" copies could be dumped to tape as the
tape drives became available.

Any help would be appreciated.

Chuck Gowans
USDA - Nat'l IT Center - Kansas City