Re: Bad Linux backups
Ar Sad, 2006-07-29 am 11:08 +0800, ysgrifennodd John Summerfield:
> Aside from users' aversion to cookies, their correct use isn't any
> easier than good backups;-) I reckon a lot of application authors trust
> the data held in cookies, saying "we provided that so we know it's okay."

It is possible to use cookies for passing data (or hidden form fields, in much the same way) and still not have to worry about which web server gets the request. You simply digitally sign the cookie with a secret only the web server knows and ensure it includes enough info to stop long-term reuse or reuse by another user.

Alan

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
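Alan's scheme above can be sketched in a few lines. This is a minimal illustration, not anything posted to the list: the payload carries a user id and an expiry time, and an HMAC computed with the servers' shared secret makes the cookie tamper-evident, so any web server in the cluster can verify it without shared session storage.

```python
import hashlib
import hmac
import time

SECRET = b"server-side secret"  # hypothetical; shared by all the web servers

def make_cookie(user_id: str, ttl: int = 3600) -> str:
    """Build a tamper-evident cookie: payload plus an HMAC over it."""
    expires = int(time.time()) + ttl
    payload = f"{user_id}|{expires}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"

def check_cookie(cookie: str, user_id: str) -> bool:
    """Any server holding SECRET can verify, so any node may take the request."""
    try:
        payload, sig = cookie.rsplit("|", 1)
        cookie_user, expires = payload.split("|")
    except ValueError:
        return False
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False                       # forged or altered cookie
    if cookie_user != user_id:
        return False                       # reuse by another user
    return int(expires) > time.time()      # stops long-term reuse

print(check_cookie(make_cookie("lea"), "lea"))  # True
```

Tampering with either the payload or the signature makes verification fail, which is what lets the servers trust "the data held in cookies" without trusting the client.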
Re: Bad Linux backups
Mark Perry wrote:
> ----- Start Original Message -----
> Sent: Fri, 28 Jul 2006 12:35:26 +0200
> From: Rob van der Heij <[EMAIL PROTECTED]>
> To: LINUX-390@VM.MARIST.EDU
> Subject: Re: Bad Linux backups
>
> > I would hope that anything that prevents the flashcopy from starting
> > could be reported for example with a command reject upon device end.
>
> I have not analyzed a CCW trace of the whole FLASHCOPY operation, but
> the z/VM FLASHCOPY command returns immediately if the request can be
> queued to the Shark. At some later point in time an error may be
> received by z/VM and an asynchronous console message is issued. This is
> the complication that Mike related to, in that handling such
> asynchronous messages in a REXX script is complicated and not for a
> novice. Once again this is practical advice, not theoretical. I have
> been through this, and it was painful.

It seems to me simple enough :-) One script initiates the copy. A second, run when (or waiting until) the copy _should_ have completed, confirms that it has worked and takes appropriate action depending on the results.

I'll leave it to the more skilled to determine _how_ to tell whether the copy's done, is still in progress or has failed.

--
Cheers
John

-- spambait
[EMAIL PROTECTED] [EMAIL PROTECTED]
Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/

do not reply off-list
Re: Bad Linux backups
Post, Mark K wrote:
> From what I've seen, a lot of that information is usually kept in the
> user's browser via cookies or "session cookies." For things that
> aren't, mirroring the data on separate physical devices, on separate
> controllers, etc., etc., provides the redundancy needed. The whole
> point of clustering is not to have _any_ single points of failure.
> That's why clustering an application is _at least_ two times more
> expensive than not clustering it.

Aside from users' aversion to cookies, their correct use isn't any easier than good backups ;-) I reckon a lot of application authors trust the data held in cookies, saying "we provided that so we know it's okay."

--
Cheers
John
Re: Bad Linux backups
You are correct about the caching of filesystems. The clustered systems I referred to are DB2 Connect gateways and contain no shared filesystems. I will route all DB2C users through one gateway while backing up the other, then reverse positions. When both are done, bring them both online.

For systems with active filesystems, they must be shut down and a SNAP taken. They can be brought up while the SNAP volumes are copied to tape.

Lea Stahr
Sr. System Administrator
Linux/Unix Team
630-753-5445
[EMAIL PROTECTED]

-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of J Leslie Turriff
Sent: Friday, July 28, 2006 11:04 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Bad Linux backups

Well, I can see that clustering is a solution to the availability of a service while a host is shut down, but I don't see how it makes things any better in regards to backing up the filesystems used by the cluster. As long as any of the hosts in the cluster is using a filesystem in R/W mode there are going to be pieces of data cached in main memory that the filesystem doesn't know about, and that means that snapshots, etc. done from outside will not be valid. Seems like the base issue is that Linux doesn't do write-through caching, so the filesystem will almost never be valid to an outside observer (see also Schroedinger's cat).

J. Leslie Turriff
VM Systems Programmer
Central Missouri State University
Room 400 Ward Edwards Building
Warrensburg MO 64093
660-543-4285 660-580-0523
[EMAIL PROTECTED]

>>> [EMAIL PROTECTED] 07/28/06 10:31 am >>>
From what I've seen, a lot of that information is usually kept in the user's browser via cookies or "session cookies." For things that aren't, mirroring the data on separate physical devices, on separate controllers, etc., etc., provides the redundancy needed. The whole point of clustering is not to have _any_ single points of failure.
That's why clustering an application is _at least_ two times more expensive than not clustering it.

Mark Post

-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of John Summerfied
Sent: Thursday, July 27, 2006 8:37 PM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Bad Linux backups

David Boyes wrote:
> I think Lea means:
>
> For cluster takeover to work seamlessly, your application has to keep
> session data in some common location between the servers.

There's the point that has me: how do you back up that location? Is it something that, if it fails, you quickly find a new one and tell the PC buyer you had a "technical problem" and would they mind starting again?

--
Cheers
John
Re: Bad Linux backups
> From what I've seen, a lot of that information is usually kept in the
> user's browser via cookies or "session cookies." For things that
> aren't, mirroring the data on separate physical devices, on separate
> controllers, etc., etc., provides the redundancy needed.

The other common technique is storing the session data in an RDBMS. You use the DBMS's live backup tools to dump to a stable copy on disk, and then back up the stable copy.

> The whole point of clustering is not to have _any_ single points of
> failure. That's why clustering an application is _at least_ two times
> more expensive than not clustering it.

Yup.
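The RDBMS approach described above can be illustrated with SQLite's online backup API. This is only a stand-in for the live-backup tools of a server DBMS, and the table layout is invented for the example:

```python
import os
import sqlite3
import tempfile

# Live session store; a server RDBMS in production, SQLite as a stand-in here.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, state TEXT)")
live.execute("INSERT INTO sessions VALUES ('abc123', 'cart=2 items')")
live.commit()

# Step 1: the DBMS's live-backup facility writes a coherent stable copy to
# disk without stopping the application (sqlite3.Connection.backup here).
stable_path = os.path.join(tempfile.mkdtemp(), "sessions-stable.db")
stable = sqlite3.connect(stable_path)
with stable:
    live.backup(stable)
stable.close()

# Step 2: the ordinary file-level backup then reads the stable copy, never
# the live database files, so it always sees a consistent snapshot.
check = sqlite3.connect(stable_path)
row = check.execute("SELECT state FROM sessions WHERE id = 'abc123'").fetchone()
print(row[0])  # cart=2 items
```

The point is the two-step split: the DBMS, which knows its own transaction state, produces the coherent copy; the dumb file-level backup only ever touches that copy.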
Re: Bad Linux backups
Well, I can see that clustering is a solution to the availability of a service while a host is shut down, but I don't see how it makes things any better in regards to backing up the filesystems used by the cluster. As long as any of the hosts in the cluster is using a filesystem in R/W mode there are going to be pieces of data cached in main memory that the filesystem doesn't know about, and that means that snapshots, etc. done from outside will not be valid. Seems like the base issue is that Linux doesn't do write-through caching, so the filesystem will almost never be valid to an outside observer (see also Schroedinger's cat).

J. Leslie Turriff
VM Systems Programmer
Central Missouri State University
Room 400 Ward Edwards Building
Warrensburg MO 64093
660-543-4285 660-580-0523
[EMAIL PROTECTED]

>>> [EMAIL PROTECTED] 07/28/06 10:31 am >>>
From what I've seen, a lot of that information is usually kept in the user's browser via cookies or "session cookies." For things that aren't, mirroring the data on separate physical devices, on separate controllers, etc., etc., provides the redundancy needed. The whole point of clustering is not to have _any_ single points of failure. That's why clustering an application is _at least_ two times more expensive than not clustering it.

Mark Post

-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of John Summerfied
Sent: Thursday, July 27, 2006 8:37 PM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Bad Linux backups

David Boyes wrote:
> I think Lea means:
>
> For cluster takeover to work seamlessly, your application has to keep
> session data in some common location between the servers.

There's the point that has me: how do you back up that location? Is it something that, if it fails, you quickly find a new one and tell the PC buyer you had a "technical problem" and would they mind starting again?
--
Cheers
John
Re: Bad Linux backups
> > On Wed, Jul 26, 2006 at 01:27:06PM -0500, J Leslie Turriff wrote:
> > Okay, now, wait; are you saying that the storage device _does_ have a
> > mechanism for communicating with the Linux filesystem to determine what
> > filesystem pages are still cached in main storage and have not yet been
> > committed to external storage?
>
> It doesn't. It's also not as easy as having a list of pages that need
> committing. What we would need is a way for the storage device (or
> rather the software controlling it) to call the existing Linux lockfs
> functionality.

A way for the storage device to call the lockfs via an API was what I was thinking of.
Re: Bad Linux backups
From what I've seen, a lot of that information is usually kept in the user's browser via cookies or "session cookies." For things that aren't, mirroring the data on separate physical devices, on separate controllers, etc., etc., provides the redundancy needed. The whole point of clustering is not to have _any_ single points of failure. That's why clustering an application is _at least_ two times more expensive than not clustering it.

Mark Post

-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of John Summerfied
Sent: Thursday, July 27, 2006 8:37 PM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Bad Linux backups

David Boyes wrote:
> I think Lea means:
>
> For cluster takeover to work seamlessly, your application has to keep
> session data in some common location between the servers.

There's the point that has me: how do you back up that location? Is it something that, if it fails, you quickly find a new one and tell the PC buyer you had a "technical problem" and would they mind starting again?

--
Cheers
John
Re: Bad Linux backups
David Boyes wrote:
> I think Lea means:
>
> For cluster takeover to work seamlessly, your application has to keep
> session data in some common location between the servers.

There's the point that has me: how do you back up that location? Is it something that, if it fails, you quickly find a new one and tell the PC buyer you had a "technical problem" and would they mind starting again?

--
Cheers
John
Re: Bad Linux backups
----- Start Original Message -----
Sent: Fri, 28 Jul 2006 12:35:26 +0200
From: Rob van der Heij <[EMAIL PROTECTED]>
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Bad Linux backups

> I would hope that anything that prevents the flashcopy
> from starting could be reported for example with a command reject upon
> device end.

I have not analyzed a CCW trace of the whole FLASHCOPY operation, but the z/VM FLASHCOPY command returns immediately if the request can be queued to the Shark. At some later point in time an error may be received by z/VM and an asynchronous console message is issued. This is the complication that Mike related to, in that handling such asynchronous messages in a REXX script is complicated and not for a novice. Once again this is practical advice, not theoretical. I have been through this, and it was painful.

Alan has stated that something better is coming, perhaps a rewrite of the actual z/VM FLASHCOPY command? Can you enlighten us, Alan?

> If the device could just give up somewhere in the middle of copying a
> volume as if it were normal business, then I think I don't want to
> share my views on that design with folks on a public mailing list.

It is very likely that I have already said many words relating to such views :-)

Mark
Re: Bad Linux backups
On 7/28/06, Mark Perry <[EMAIL PROTECTED]> wrote:
> > Unless I am terribly misinformed, it *is* an atomic operation for the
> > operating system.
>
> Sorry Rob, but you are terribly misinformed.

What I meant to say with "atomic operation" is that things remain in order. The flashcopy itself is initiated by some (undocumented?) CCWs in a channel program, so it is very clear to the host which I/O got device end before the SSCH for the flashcopy was issued. If the operation is atomic for the host, then anything you wrote to disk before the flashcopy is in the copy, and anything you write after that is not. I can imagine it takes some smoke and mirrors (like keeping copies of tracks that were modified on the source during the background copy process, and redirecting reads to the copied volume while the copying takes place).

I would hope that anything that prevents the flashcopy from starting could be reported, for example with a command reject upon device end (why would you give device end before sorting out that it will work?). Slightly less convenient would be to report an issue with the next I/O to the device, but chaining a NOP in should make it synchronous. From that point on, it seems to me the device has no option but to complete the given task. If things that could not be foreseen prevent this, then imho there's no option but to reject reading from the source or writing to the target. Both rather unpleasant, and a good reason to make the check robust.

If the device could just give up somewhere in the middle of copying a volume as if it were normal business, then I think I don't want to share my views on that design with folks on a public mailing list.

Rob
Re: Bad Linux backups
Rob van der Heij wrote:
> On 7/26/06, Mark Perry <[EMAIL PROTECTED]> wrote:
> > One point not mentioned yet is that FLASHCOPY is an asynchronous
> > process. You can start a FLASHCOPY operation and it *can* return an
> > error status asynchronously. 90+% of the time this is not apparent;
> > the request is made and the Shark goes happily on its way. However,
> > if the request that is queued within the Shark has to be terminated
> > (resource shortages, target volume errors etc.) then beware!
>
> Unless I am terribly misinformed, it *is* an atomic operation for the
> operating system.

My reply is a little late, but emails from the list are not coming in synchronously ;-)

Sorry Rob, but you are terribly misinformed. The Shark responds with a successful completion on accepting the request to perform the FLASHCOPY. Remember that a Shark is not just a bunch of DASD/disks; it is actually an AIX system running on POWER - i.e. it is an intelligent system, and you are communicating with software, not hardware. A FLASHCOPY operation is one of many that get put onto queues and are processed asynchronously. This is not a failure, but any host system utilizing FLASHCOPY must be able to handle asynchronous errors.

I believe Alan has already confirmed what I was going to say, but hard experience using FLASHCOPY under z/VM has led to my statements. I am not theorizing - it comes from painful experience. Painful examples, such as regular backups that utilize the same SOURCE and TARGET volumes: when an asynchronous error (resource shortage within the Shark) is "missed", the data on the TARGET volume is "old". Any attempt to use the TARGET volume succeeds but utilizes the "old" data, hence backups etc. are useless. This problem hit us during FLASHCOPY cloning of a master system; we couldn't understand why the new clones behaved as they did (updates/changes missing etc.), until later analysis showed that the clones were running from the "old" data - the FLASHCOPY had failed asynchronously.

It became obvious that we could not rely on the return code from the FLASHCOPY operation; we needed to wait longer for any asynchronous error to come back. The tricky decision was how long to wait for such an error. If you wait too long, you effectively defeat any advantage of using FLASHCOPY over a simple DASD copy operation. We compromised on a parameter value that defaulted to about 1 minute (remember that a copy operation could take 20-30 minutes).

Mark
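Mark's one-minute compromise amounts to a bounded wait for bad news after the command returns. The sketch below is hypothetical scaffolding: the error-message text is invented, and the way console output reaches the script (the hard REXX part under z/VM) is simulated with a simple queue.

```python
import queue
import time

def flashcopy_trustworthy(console_msgs, wait_seconds=60.0):
    """The FLASHCOPY command already returned success; keep watching the
    console for an asynchronous failure before trusting the target volume."""
    deadline = time.monotonic() + wait_seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return True                # window elapsed with no bad news
        try:
            msg = console_msgs.get(timeout=remaining)
        except queue.Empty:
            return True                # window elapsed with no bad news
        if "FLASHCOPY" in msg and "FAIL" in msg.upper():
            return False               # async failure: target holds old data

# An asynchronous failure arriving inside the window is caught...
bad = queue.Queue()
bad.put("FLASHCOPY OF 0200 TO 0300 FAILED")   # invented message text
print(flashcopy_trustworthy(bad, wait_seconds=0.2))              # False

# ...while a quiet console for the whole window counts as success.
print(flashcopy_trustworthy(queue.Queue(), wait_seconds=0.2))    # True
```

The trade-off Mark describes is visible in `wait_seconds`: too short and you miss the asynchronous error (and back up "old" data); too long and FLASHCOPY loses its advantage over a plain DASD copy.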
Re: Bad Linux backups
I think Lea means:

For cluster takeover to work seamlessly, your application has to keep session data in some common location between the servers. If that's the case, then when the shutdown of the second server commences, it takes itself out of the load balancer queue, completes whatever transactions are in flight at that moment on that node, signals the other node that it's now in charge, and then takes a swan dive into oblivion. The other node takes the session state data from the shared location, and picks up where the original node left off. Works well for applications that are aware of how to play nicely.

If you go all the way to OpenSSI, then a node shutdown triggers process migration to nodes other than the one going down, and the system continues operation without the application even noticing.

It takes some planning to get clustered applications to work properly, but once that's done, it's pretty slick. RTFineM for 'cluster' for more interesting details. Once you have the cluster properly configured for takeover, then you use VMUTIL or S5INIT to issue the SIGNAL SHUTDOWN to each node in turn, back it up, then XAUTOLOG it so that it re-enters the cluster and all is well.

David Boyes
Sine Nomine Associates

> -----Original Message-----
> From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of John
> Summerfied
> Sent: Thursday, July 27, 2006 7:46 PM
> To: LINUX-390@VM.MARIST.EDU
> Subject: Re: Bad Linux backups
>
> Stahr, Lea wrote:
> > A piece of cake! Use VMUTIL on VM to do the shutdowns and startups and
> > have the backups scheduled appropriately. Or get the CONTROL-M agent and
> > have that do it all from ZOS.
>
> I don't understand how that addresses my concern.
>
> Stahr, Lea wrote:
> > With clustering, you shut down one image and do an OFFLINE backup while
> > the application runs on the second image. Then bring up the primary
> > image and shutdown the secondary system for backup.
>
> which sounds every bit as tricky to me as getting good backups from a
> live Linux system.
>
> I'm negotiating purchase of a PC with your online retail shop and you
> take down the box I'm talking to while I'm negotiating PC options such
> as RAM, CPU, disk...
>
> --
> Cheers
> John
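David's takeover sequence can be mocked up to show the ordering that matters. Every interface here is an invented stand-in: a shared set for load-balancer membership, a dict for the common session-state location, and a callable for signalling the surviving node.

```python
class Node:
    def __init__(self, name, balancer, sessions, peer_signal, log):
        self.name = name
        self.balancer = balancer        # shared load-balancer membership set
        self.sessions = sessions        # shared session-state store
        self.peer_signal = peer_signal  # callable notifying the surviving node
        self.log = log
        self.inflight = []              # transactions currently on this node

    def graceful_shutdown(self):
        self.balancer.discard(self.name)      # 1. stop taking new work
        for txn in self.inflight:             # 2. complete in-flight work;
            self.sessions[txn] = "committed"  #    state lands in the shared store
        self.inflight.clear()
        self.peer_signal(self.name)           # 3. tell the peer it's in charge
        self.log.append(f"{self.name} down")  # 4. swan dive into oblivion

balancer = {"node-a", "node-b"}
sessions, log = {}, []
node_a = Node("node-a", balancer, sessions,
              lambda who: log.append(f"peer notified: {who} leaving"), log)
node_a.inflight = ["txn-1", "txn-2"]
node_a.graceful_shutdown()
print(sorted(balancer))   # ['node-b']
print(sessions["txn-1"])  # committed
print(log)
```

The surviving node can then read the committed session state from the shared store and pick up where the original node left off; nothing is lost because step 1 happens before step 4.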
Re: Bad Linux backups
Stahr, Lea wrote:
> A piece of cake! Use VMUTIL on VM to do the shutdowns and startups and
> have the backups scheduled appropriately. Or get the CONTROL-M agent and
> have that do it all from ZOS.

I don't understand how that addresses my concern.

Stahr, Lea wrote:
> With clustering, you shut down one image and do an OFFLINE backup while
> the application runs on the second image. Then bring up the primary
> image and shutdown the secondary system for backup.

which sounds every bit as tricky to me as getting good backups from a live Linux system.

I'm negotiating purchase of a PC with your online retail shop and you take down the box I'm talking to while I'm negotiating PC options such as RAM, CPU, disk...

--
Cheers
John
Re: Bad Linux backups
On Wed, Jul 26, 2006 at 03:04:34PM -0400, Alan Altmark wrote:
> On Wednesday, 07/26/2006 at 01:27 EST, J Leslie Turriff
> <[EMAIL PROTECTED]> wrote:
> > Okay, now, wait; are you saying that the storage device _does_ have a
> > mechanism for communicating with the Linux filesystem to determine what
> > filesystem pages are still cached in main storage and have not yet been
> > committed to external storage?
>
> No. I'm saying that an application that closes or flushes all of its open
> files and then tells the filesystem "commit the filesystem to disk" (e.g.
> sync) is then at a known point with respect to the dasd. It is free at
> that point to kick off a flashcopy via some command or utility and start
> running again.

If you are doing an fsync, that data is guaranteed to be on stable storage, yes. But that's not enough, because

a) it is not specified where on stable storage; it could for example still be in the log of a data journaling device, and

b) you risk severe corruption if the filesystem metadata is not in a coherent state, up to the point that you can't find your data anymore despite it being on stable storage.
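The narrow fsync guarantee conceded above looks like this in application code. A sketch only: it puts one file's data on stable storage, which is the "known point" Alan describes, but it says nothing about the coherence of the rest of the filesystem, which is Christoph's point.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "checkpoint.dat")

# Write application state and force it to stable storage.
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
os.write(fd, b"session checkpoint\n")
os.fsync(fd)   # data for *this file* is now guaranteed on stable storage
os.close(fd)

# The guarantee is narrow: the new directory entry needs its own fsync,
# and filesystem-wide metadata may still be incoherent on disk - which
# is exactly why an external snapshot of the volume can be unusable.
dfd = os.open(os.path.dirname(path), os.O_RDONLY)
os.fsync(dfd)
os.close(dfd)

print(open(path, "rb").read())   # b'session checkpoint\n'
```

Even with both fsyncs, other files, the journal, and allocation metadata are outside the application's control, so per-file durability does not add up to a coherent on-disk filesystem.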
Re: Bad Linux backups
On Wed, Jul 26, 2006 at 01:27:06PM -0500, J Leslie Turriff wrote:
> Okay, now, wait; are you saying that the storage device _does_ have a
> mechanism for communicating with the Linux filesystem to determine what
> filesystem pages are still cached in main storage and have not yet been
> committed to external storage?

It doesn't. It's also not as easy as having a list of pages that need committing. What we would need is a way for the storage device (or rather the software controlling it) to call the existing Linux lockfs functionality.
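The lockfs functionality mentioned here is exposed on modern Linux through the FIFREEZE/FITHAW ioctls, wrapped by the fsfreeze(8) utility, so software controlling the storage device could bracket the copy with a freeze and a thaw. The sketch below only builds the command sequence rather than running it (freezing a filesystem needs root), and the snapshot command is a placeholder:

```python
def quiesced_snapshot_plan(mountpoint, snapshot_cmd):
    """Return the freeze/snapshot/thaw command sequence as argv lists."""
    return [
        ["fsfreeze", "--freeze", mountpoint],    # lockfs: block writes, flush caches
        list(snapshot_cmd),                      # copy taken while frozen
        ["fsfreeze", "--unfreeze", mountpoint],  # resume normal operation
    ]

# The snapshot step is a placeholder; under z/VM it might be a CP FLASHCOPY
# issued through vmcp, elsewhere an LVM or SAN snapshot command.
plan = quiesced_snapshot_plan("/srv/data", ["vmcp", "FLASHCOPY 200 0 END 300 0 END"])
for argv in plan:
    print(argv)
```

Run in that order (e.g. via `subprocess.run` on each argv list), the filesystem is coherent on disk for the duration of the middle step, which is precisely what an externally triggered FlashCopy lacks on its own.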
Re: Bad Linux backups
On Wed, Jul 26, 2006 at 12:50:09PM -0400, Alan Altmark wrote:
> You're right, however, and as we've been discussing, that these features
> can be misused or misinterpreted to provide an *application*-consistent
> view of the data. They don't do that. That applies to any operating
> system, not just Linux. And it's not the lock/unlock features of a
> filesystem that are important. Instead, the application must be able to
> exert control on the filesystem in such a way that it *knows* that all
> [relevant] data has been committed to disk and can say "OK. Now is a good
> time to take that backup."

With a transaction-oriented application the filesystem data is always coherent; if your application isn't transaction-based, all hope for a coherent backup is lost.

> Properly used, these features can drastically reduce the amount of down
> time needed to perform application-consistent backups.
Re: Bad Linux backups
On Wed, Jul 26, 2006 at 06:21:03PM +0200, Rob van der Heij wrote:
> On 7/26/06, Mark Perry <[EMAIL PROTECTED]> wrote:
>
> > One point not mentioned yet is that FLASHCOPY is an asynchronous process.
> > You can start a FLASHCOPY operation and it *can* return an error status
> > asynchronously. 90+% of the time this is not apparent; the request is made
> > and the Shark goes happily on its way. However, if the request that is
> > queued within the Shark has to be terminated (resource shortages, target
> > volume errors etc.) then beware!
>
> Unless I am terribly misinformed, it *is* an atomic operation for the
> operating system.

Doing it atomically is not enough. You need to put the filesystem into a coherent state first.
Re: Bad Linux backups
On Thursday, 07/27/2006 at 09:57 ZE2, Carsten Otte <[EMAIL PROTECTED]> wrote:
> I am sorry, but I have to disagree with Alan's statement. They _are_
> currently dangerous to use with Linux volumes that are being accessed,
> _because_, unlike dm-snapshot, the filesystem is not frozen in Linux
> (lockfs) and thus the data on disk is inconsistent due to caching.
> DM-snapshot does the desired trick; flashcopy does not.
>
> I feel sorry for having earlier created the expectation that flashcopy
> can be used to snapshot Linux volumes.

I had previously said that you shouldn't use Flashcopy on active volumes. I meant that Flashcopy is just a mechanism for making a copy of the media. If you don't worry about the mechanics underneath and just think of it as a super-fast copy function, then all will be fine. High-speed copy technology does not absolve the system manager from properly preparing the system for backup.

Alan Altmark
z/VM Development
IBM Endicott
Re: Bad Linux backups
A piece of cake! Use VMUTIL on VM to do the shutdowns and startups and have the backups scheduled appropriately. Or get the CONTROL-M agent and have that do it all from ZOS.

Lea Stahr
Sr. System Administrator
Linux/Unix Team
630-753-5445
[EMAIL PROTECTED]

-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of John Summerfied
Sent: Wednesday, July 26, 2006 7:18 PM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Bad Linux backups

Stahr, Lea wrote:
> With clustering, you shut down one image and do an OFFLINE backup while
> the application runs on the second image. Then bring up the primary
> image and shutdown the secondary system for backup.

which sounds every bit as tricky to me as getting good backups from a live Linux system.

I'm negotiating purchase of a PC with your online retail shop and you take down the box I'm talking to while I'm negotiating PC options such as RAM, CPU, disk...

--
Cheers
John
Re: Bad Linux backups
Funny thing, testing! I tested it and it worked four times in a row. Then when I actually needed it, it failed. Thank you, fuzzy backups!

Lea Stahr
Sr. System Administrator
Linux/Unix Team
630-753-5445
[EMAIL PROTECTED]

-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of Alan Altmark
Sent: Wednesday, July 26, 2006 3:13 PM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Bad Linux backups

On Wednesday, 07/26/2006 at 02:55 EST, J Leslie Turriff <[EMAIL PROTECTED]> wrote:
> Okay. I may be wrong, but it seems to me that the majority of Linux
> applications (probably excepting database packages and such) rely on the
> filesystem to eventually get their data to disk without them doing
> anything besides open, write and close operations.

And the circle is closed. Hence this entire thread/rant about shutting down servers while you are flashcopying or otherwise performing external physical backups. If you know what you and the application are doing, take a live backup. If you don't, don't. If the application provides you with a set of backup functions, use them.

Oh, and the point that actually started the whole thing: test your backups. You should already be doing that in your DR tests, but if you change your processes, re-test. "There's a hole in my bucket, dear Liza, dear Liza" ;-)

Alan Altmark
z/VM Development
IBM Endicott
Re: Bad Linux backups
J Leslie Turriff wrote:
> Sounds to me, then, like the snapshot/mirror/peer-to-peer copy features
> of storage devices, e.g. Shark, SATABeast, etc., are currently dangerous
> to use with Linux filesystems. They would need to be able to coordinate
> their activities with the filesystem lock/unlock components of the
> kernel to be made safe?

My limited knowledge suggests you need do no more than reboot, maybe less (basically you want the filesystem read-only for a moment), initiating the flashcopy at the right instant.

--
Cheers
John
Re: Bad Linux backups
Carsten Otte wrote: Fargusson.Alan wrote: I agree. I think you should make your backups with the Linux system down. You should test this to make sure that there is not some other operational error causing problems. I think we got close to the bottom of the stack now: If one can take down the system for backup it is a good idea to do so because of the reasons discussed in this thread. Backing up a running system involves trust in the application and the file system. I've found this an interesting discussion; I've been wondering for some time how the pros who have their big businesses (and/or careers) on the line do it. I've always suspected it's not so simple as it might be, and this has confirmed my opinion: One's choices are 1. Do it perfectly, with the system down. 2. Do it less rigorously with the system up, but analyse the implications very carefully. You don't want your latest recruit doing this:-) It's become clear to me that even experienced folk can get this wrong, and argue that they're right! -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Alan Altmark wrote: On Wednesday, 07/26/2006 at 01:27 EST, J Leslie Turriff <[EMAIL PROTECTED]> wrote: Okay, now, wait; are you saying that the storage device _does_ have a mechanism for communicating with the Linux filesystem to determine what filesystem pages are still cached in main storage and have not yet been committed to external storage? No. I'm saying that an application that closes or flushes all of its open files and then tells the filesystem "commit the filesystem to disk" (e.g. sync) is then at a known point with respect to the dasd. It is free at that point to kick off a flashcopy via some command or utility and start running again. _Only_ if all users of the filesystem agree! It seems less straightforward to me if you have more than one application writing to the filesystem, or if your application's files are spread across filesystems. -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Stahr, Lea wrote: With clustering, you shut down one image and do an OFFLINE backup while the application runs on the second image. Then bring up the primary image and shutdown the secondary system for backup. which sounds every bit as tricky to me as getting good backups from a live Linux system. I'm negotiating purchase of a PC with your online retail shop and you take down the box I'm talking to while I'm negotiating PC options such as RAM, CPU, disk -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
J Leslie Turriff wrote: > Okay, now, wait; are you saying that the storage device _does_ have a > mechanism for communicating with the Linux filesystem to determine what > filesystem pages are still cached in main storage and have not yet been > committed to external storage? No, it does not. Invention required. cheers, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
> On Wednesday, 07/26/2006 at 10:33 EST, J Leslie Turriff > <[EMAIL PROTECTED]> wrote: >> Sounds to me, then, like the use of the >> snapshot/mirror/peer-to-peer copy features of storage devices e.g. >> Shark, SATABeast, etc. are currently dangerous to use with Linux >> filesystems. They would need to be able to coordinate their activities >> with the filesystem lock/unlock components of the kernel to be made >> safe? Alan Altmark wrote: > No, they are not "currently dangerous to use with Linux". The > snapshot/flashcopy features provide a point-in-time consistent view of an > entire device or range of blocks/cylinders. In a "normal" track-by-track > read, data on the device can change while you're reading. I am sorry, but I have to disagree with Alan's statement. They _are_ currently dangerous to use with Linux volumes that are being accessed, _because_ unlike dm-snapshot the filesystem is not frozen in Linux (lockfs) and thus the data on disk is inconsistent due to caching. DM-snapshot does the desired trick, flashcopy does not. I am sorry for earlier causing confusion by creating the expectation that flashcopy could be used to snapshot Linux volumes. cheers, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
On Wednesday, 07/26/2006 at 02:55 EST, J Leslie Turriff <[EMAIL PROTECTED]> wrote: > Okay. I may be wrong, but it seems to me that the majority of Linux > applications (probably excepting database packages and such) rely on the > filesystem to eventually get their data to disk without them doing > anything besides open, write and close operations. And the circle is closed. Hence this entire thread/rant about shutting down servers while you are flashcopying or otherwise performing external physical backups. If you know what you and the application are doing, take a live backup. If you don't, don't. If the application provides you with a set of backup functions, use them. Oh, and the point that actually started the whole thing: Test your backups. You should already be doing that in your DR tests, but if you change your processes, re-test. "There's a hole in my bucket, dear Liza, dear Liza" ;-) Alan Altmark z/VM Development IBM Endicott -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
On Wednesday, 07/26/2006 at 02:20 AST, David Kreuter <[EMAIL PROTECTED]> wrote: > including ESTABLISH, QUERY and WITHDRAW ala ickdsf on z/OS? > will ickdsf on z/vm be changed to support these functions? I give an inch and you want a mile! :-) We will, among other things, be adding a QUERY capability for convenience. As far as ICKDSF goes, you can establish flashcopy relationships among your minidisks as long as they are on the same controller. You may need to order service for DSF to bring its functionality up to that documented in the -30 level of the manual. Alan Altmark z/VM Development IBM Endicott -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Okay. I may be wrong, but it seems to me that the majority of Linux applications (probably excepting database packages and such) rely on the filesystem to eventually get their data to disk without them doing anything besides open, write and close operations. J. Leslie Turriff VM Systems Programmer Central Missouri State University Room 400 Ward Edwards Building Warrensburg MO 64093 660-543-4285 660-580-0523 [EMAIL PROTECTED] >>>[EMAIL PROTECTED] 07/26/06 2:04 pm >>> On Wednesday, 07/26/2006 at 01:27 EST, J Leslie Turriff <[EMAIL PROTECTED]> wrote: >Okay, now, wait; are you saying that the storage device _does_ have a >mechanism for communicating with the Linux filesystem to determine what >filesystem pages are still cached in main storage and have not yet been >committed to external storage? No. I'm saying that an application that closes or flushes all of its open files and then tells the filesystem "commit the filesystem to disk" (e.g. sync) is then at a known point with respect to the dasd. It is free at that point to kick off a flashcopy via some command or utility and start running again. Alan Altmark z/VM Development IBM Endicott -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
On Wednesday, 07/26/2006 at 01:27 EST, J Leslie Turriff <[EMAIL PROTECTED]> wrote: > Okay, now, wait; are you saying that the storage device _does_ have a > mechanism for communicating with the Linux filesystem to determine what > filesystem pages are still cached in main storage and have not yet been > committed to external storage? No. I'm saying that an application that closes or flushes all of its open files and then tells the filesystem "commit the filesystem to disk" (e.g. sync) is then at a known point with respect to the dasd. It is free at that point to kick off a flashcopy via some command or utility and start running again. Alan Altmark z/VM Development IBM Endicott -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
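[The sequence Alan describes — application flushes and closes its files, the filesystem is synced, then the copy is kicked off — can be sketched as follows. This is an illustration only: "app_quiesce"/"app_resume" and the FLASHCOPY line are hypothetical placeholders for site-specific commands (on a z/VM guest the real command would be issued to CP), and, as later messages in the thread note, sync alone without a filesystem freeze may still not be sufficient.]

```shell
# Hedged sketch of "flush, sync, flashcopy, resume".
backup_point() {
  echo "app_quiesce"          # placeholder: application closes/flushes its files
  sync                        # commit the filesystem to disk -- the "known point"
  echo "FLASHCOPY 0201 0301"  # placeholder: initiate the point-in-time copy
  echo "app_resume"           # placeholder: application starts running again
}
backup_point
```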
Re: Bad Linux backups
Okay, now, wait; are you saying that the storage device _does_ have a mechanism for communicating with the Linux filesystem to determine what filesystem pages are still cached in main storage and have not yet been committed to external storage? J. Leslie Turriff VM Systems Programmer Central Missouri State University Room 400 Ward Edwards Building Warrensburg MO 64093 660-543-4285 660-580-0523 [EMAIL PROTECTED] >>>[EMAIL PROTECTED] 07/26/06 11:50 am >>> On Wednesday, 07/26/2006 at 10:33 EST, J Leslie Turriff <[EMAIL PROTECTED]> wrote: >Sounds to me, then, like the use of the >snapshot/mirror/peer-to-peer copy features of storage devices e.g. >Shark, SATABeast, etc. are currently dangerous to use with Linux >filesystems. They would need to be able to coordinate their activities >with the filesystem lock/unlock components of the kernel to be made >safe? No, they are not "currently dangerous to use with Linux". The snapshot/flashcopy features provide a point-in-time consistent view of an entire device or range of blocks/cylinders. In a "normal" track-by-track read, data on the device can change while you're reading. You're right, however, and as we've been discussing, that these features can be misused or misinterpreted to provide an *application*-consistent view of the data. They don't do that. That applies to any operating system, not just Linux. And it's not the lock/unlock features of a filesystem that are important. Instead, the application must be able to exert control on the filesystem in such a way that it *knows* that all [relevant] data has been committed to disk and can say "OK. Now is a good time to take that backup." Properly used, these features can drastically reduce the amount of down time needed to perform application-consistent backups.
Alan Altmark z/VM Development IBM Endicott -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
including ESTABLISH, QUERY and WITHDRAW ala ickdsf on z/OS? will ickdsf on z/vm be changed to support these functions? David Yes, the CP FLASHCOPY command is one of those asynchronous commands. But take heart! We are busily improving it, making it more suitable for use in scripts. (And adding function to it while we're at it.) Alan Altmark z/VM Development IBM Endicott -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
On Wednesday, 07/26/2006 at 01:59 AST, Michael MacIsaac/Poughkeepsie/[EMAIL PROTECTED] wrote: > The z/VM FLASHCOPY command can give a return code of 0 and then *fail* > later asynchronously. It is difficult to trap in REXX (for a mere mortal > like myself). And it will fail reliably (and asynchronously) if you queue > up too much work to the Shark/DSx000. This behavior is not well suited to > scripting :(( As such we had to pull back in a few cases on FLASHCOPY in > "The Virtualization Cookbook". Yes, the CP FLASHCOPY command is one of those asynchronous commands. But take heart! We are busily improving it, making it more suitable for use in scripts. (And adding function to it while we're at it.) Alan Altmark z/VM Development IBM Endicott -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
> Unless I am terribly misinformed, it *is* an atomic operation for the > operating system. Even though from a storage management point of view > it may take some time. The z/VM FLASHCOPY command can give a return code of 0 and then *fail* later asynchronously. It is difficult to trap in REXX (for a mere mortal like myself). And it will fail reliably (and asynchronously) if you queue up too much work to the Shark/DSx000. This behavior is not well suited to scripting :(( As such we had to pull back in a few cases on FLASHCOPY in "The Virtualization Cookbook". "Mike MacIsaac" <[EMAIL PROTECTED]> (845) 433-7061 -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
On Wednesday, 07/26/2006 at 10:33 EST, J Leslie Turriff <[EMAIL PROTECTED]> wrote: > Sounds to me, then, like the use of the > snapshot/mirror/peer-to-peer copy features of storage devices e.g. > Shark, SATABeast, etc. are currently dangerous to use with Linux > filesystems. They would need to be able to coordinate their activities > with the filesystem lock/unlock components of the kernel to be made > safe? No, they are not "currently dangerous to use with Linux". The snapshot/flashcopy features provide a point-in-time consistent view of an entire device or range of blocks/cylinders. In a "normal" track-by-track read, data on the device can change while you're reading. You're right, however, and as we've been discussing, that these features can be misused or misinterpreted to provide an *application*-consistent view of the data. They don't do that. That applies to any operating system, not just Linux. And it's not the lock/unlock features of a filesystem that are important. Instead, the application must be able to exert control on the filesystem in such a way that it *knows* that all [relevant] data has been committed to disk and can say "OK. Now is a good time to take that backup." Properly used, these features can drastically reduce the amount of down time needed to perform application-consistent backups. Alan Altmark z/VM Development IBM Endicott -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
On 7/26/06, Mark Perry <[EMAIL PROTECTED]> wrote: One point not mentioned yet, is that FLASHCOPY is an asynchronous process. You can start a FLASHCOPY operation and it *can* return an error status asynchronously. 90+% of the time this is not apparent, the request is made and the Shark goes happily on its way. However if the request that is queued within the Shark has to be terminated (Resource shortages, target volume errors etc.) then beware! Unless I am terribly misinformed, it *is* an atomic operation for the operating system. Even though from a storage management point of view it may take some time. And to maintain the illusion the device needs resources (e.g. cache, extra disk space, etc). The same applies to freezing the file system in Linux as suggested. If freezing means that dirty pages are held back until the freeze is over, then it will increase the demand for memory in the server. If the server is large enough this process would increase the working set size, but not worse than otherwise because page cache would be used anyway. Using snapshot on the DASD subsystem means you can shorten the time that Linux needs to hold its breath, and thus limit the amount of data to be held up. The alternative (file level backup inside Linux) will fill the page cache with meta data for the entire file system (rather than the content that changed during backup). Which is worse depends on your situation. Rob -- Rob van der Heij Velocity Software, Inc http://velocitysoftware.com/ -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
J Leslie Turriff wrote: > Sounds to me, then, like the use of the > snapshot/mirror/peer-to-peer copy features of storage devices e.g. > Shark, SATABeast, etc. are currently dangerous to use with Linux > filesystems. They would need to be able to coordinate their activities > with the filesystem lock/unlock components of the kernel to be made > safe? Exactly, yes. cheers, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
- Start Original Message - Sent: Wed, 26 Jul 2006 10:33:32 -0500 From: J Leslie Turriff <[EMAIL PROTECTED]> To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups > Sounds to me, then, like the use of the > snapshot/mirror/peer-to-peer copy features of storage devices e.g. > Shark, SATABeast, etc. are currently dangerous to use with Linux > filesystems. They would need to be able to coordinate their activities > with the filesystem lock/unlock components of the kernel to be made > safe? One point not mentioned yet, is that FLASHCOPY is an asynchronous process. You can start a FLASHCOPY operation and it *can* return an error status asynchronously. 90+% of the time this is not apparent, the request is made and the Shark goes happily on its way. However if the request that is queued within the Shark has to be terminated (Resource shortages, target volume errors etc.) then beware! Mark -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Sounds to me, then, like the use of the snapshot/mirror/peer-to-peer copy features of storage devices e.g. Shark, SATABeast, etc. are currently dangerous to use with Linux filesystems. They would need to be able to coordinate their activities with the filesystem lock/unlock components of the kernel to be made safe? J. Leslie Turriff VM Systems Programmer Central Missouri State University Room 400 Ward Edwards Building Warrensburg MO 64093 660-543-4285 660-580-0523 [EMAIL PROTECTED] >>>[EMAIL PROTECTED] 07/26/06 9:04 am >>> On Wed, Jul 26, 2006 at 02:28:53PM +0200, Carsten Otte wrote: >Very interesting indeed. This pointed me to reading the >lockfs/unlockfs semantics in Linux, and I think I need to withdraw my >statement regarding flashcopy snapshots: because of the fact that >there is no lockfs/unlockfs interaction when doing flashcopy, and >because of dirty pages in the page cache during snapshot, flashcopy >will not generate a consistent snapshot. Therefore, using flashcopy on >an active volume from outside Linux is _not_ suitable for backup purposes. > >The only feasible way to get a consistent snapshot is to use >dm-snapshot from within Linux. This snapshot copy can later on be used >with a backup feature outside Linux. If you use xfs you can also put the filesystem in frozen state from userspace with the xfs_freeze utility. I know of inhouse backup tools at various companies that make use of this feature. -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
On Wed, Jul 26, 2006 at 02:28:53PM +0200, Carsten Otte wrote: > Very interesting indeed. This pointed me to reading the > lockfs/unlockfs semantics in Linux, and I think I need to withdraw my > statement regarding flashcopy snapshots: because of the fact that > there is no lockfs/unlockfs interaction when doing flashcopy, and > because of dirty pages in the page cache during snapshot, flashcopy > will not generate a consistent snapshot. Therefore, using flashcopy on > an active volume from outside Linux is _not_ suitable for backup purposes. > > The only feasible way to get a consistent snapshot is to use > dm-snapshot from within Linux. This snapshot copy can later on be used > with a backup feature outside Linux. If you use xfs you can also put the filesystem in frozen state from userspace with the xfs_freeze utility. I know of inhouse backup tools at various companies that make use of this feature. -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
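[Christoph's xfs_freeze suggestion might look like this in a backup script. A hedged sketch, assuming an XFS filesystem at a hypothetical mount point and root privileges; FREEZE=0 (the default here) prints the steps rather than executing them, because freezing a live filesystem should never happen by accident.]

```shell
# Hedged sketch: freeze an XFS filesystem, take the external copy, thaw.
MNT=/srv/xfsdata                 # hypothetical mount point
FREEZE=${FREEZE:-0}
fs_step() { if [ "$FREEZE" = 1 ]; then "$@"; else echo "would run: $*"; fi; }

fs_step xfs_freeze -f "$MNT"     # -f: flush the log, block new writes; fs now consistent on disk
fs_step echo "external copy of the frozen volume goes here"  # placeholder step
fs_step xfs_freeze -u "$MNT"     # -u: thaw, writes resume
```

While frozen, any process that writes to the filesystem blocks, so the copy window should be kept as short as possible.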
Re: Bad Linux backups
- Start Original Message - Sent: Wed, 26 Jul 2006 11:49:45 +0200 From: Christoph Hellwig <[EMAIL PROTECTED]> To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups > On Tue, Jul 25, 2006 at 01:22:53PM +0200, Mark Perry wrote: > > I believe that several DB systems offer direct/raw I/O to avoid Linux cache > > problems, and that journaling filesystems, although by default only journal > > meta-data, offer mount options to journal data too. This of course comes at > > a performance price, though Hans Reiser did claim that the new Reiser4 FS > > will journal data without the previous performance penalties. > > Journalled filesystems only journal buffered I/O. Direct I/O means you > do direct dma operations from the storage controller to the user address > space. It's physically impossible to journal. I agree, I did not mean to connect direct I/O and journalling. I was merely commenting on features that are available to assist in ensuring data integrity on disk. Mark -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Christoph Hellwig wrote: > But that's not how snapshots work. When you do a snapshot the filesystem > is frozen. That means: new file writers are blocked from dirtying the > filesystem through the pagecache. The filesystem blocks callers that want > to create new transactions. Then the whole file cache is written out > and the asynchronous write ahead log (journal) is written out on disk. > The filesystem is in a fully consistent state. Trust me, I've > implemented this myself for XFS. Very interesting indeed. This pointed me to reading the lockfs/unlockfs semantics in Linux, and I think I need to withdraw my statement regarding flashcopy snapshots: because of the fact that there is no lockfs/unlockfs interaction when doing flashcopy, and because of dirty pages in the page cache during snapshot, flashcopy will not generate a consistent snapshot. Therefore, using flashcopy on an active volume from outside Linux is _not_ suitable for backup purposes. The only feasible way to get a consistent snapshot is to use dm-snapshot from within Linux. This snapshot copy can later on be used with a backup feature outside Linux. regards, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
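[The dm-snapshot route Carsten recommends is commonly driven through LVM tooling. A hedged sketch, with a hypothetical volume group "vg0", logical volume "data", and illustrative sizes and paths; DRYRUN=1 (the default here) prints the commands instead of running them, since they need root and real devices.]

```shell
# Hedged sketch: dm-snapshot via LVM -- consistent point-in-time backup
# of a live volume from within Linux.
DRYRUN=${DRYRUN:-1}
lvm_step() { if [ "$DRYRUN" = 0 ]; then "$@"; else echo "would run: $*"; fi; }

lvm_step lvcreate -s -L 2G -n data_snap /dev/vg0/data  # COW snapshot; fs briefly frozen
lvm_step mount -o ro /dev/vg0/data_snap /mnt/snap      # stable read-only view
lvm_step tar czf /backup/data.tar.gz -C /mnt/snap .    # back up from the snapshot
lvm_step umount /mnt/snap
lvm_step lvremove -f /dev/vg0/data_snap                # drop snapshot, free COW space
```

The 2G snapshot size only has to hold blocks changed while the backup runs; if it fills, the snapshot is invalidated, so size it for the expected write rate.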
Re: Bad Linux backups
On Tue, Jul 25, 2006 at 09:06:54AM +0800, John Summerfield wrote: > To avoid the nitpickers, let's say that David means all filesystems must > be flushed and ro. > > As I understand it, journalling (by default) logs metadata (directory > info) but not data. > > If you create a file, that's journalled. If you extend a file, that's > journalled. The data you write to the file are not. > > Let's say that you create a file, write 4K to it, close it. Let's say > you do a backup of the volume externally while the 4K data remains > unwritten. Note: read in "man 2 close" "A successful close does not > guarantee that the data has been successfully saved to disk." > > So now you have journalled (or committed) metadata that says the file's > got 4K of data in it. > > But, it hasn't. In the ordinary course of events, the data gets written > to disk and all is well. > > The same sort of thing happens when a file's updated in place, as I > expect databases commonly are. But that's not how snapshots work. When you do a snapshot the filesystem is frozen. That means: new file writers are blocked from dirtying the filesystem through the pagecache. The filesystem blocks callers that want to create new transactions. Then the whole file cache is written out and the asynchronous write ahead log (journal) is written out on disk. The filesystem is in a fully consistent state. Trust me, I've implemented this myself for XFS. -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
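[John's point about close(2) can be shown in a few harmless lines: after write+close the data may still sit only in the page cache, and an explicit sync (or fsync(2) from inside the application) is what forces it to disk. This is a runnable illustration using a temp file; the "external backup sees stale data" part of course cannot be demonstrated without pulling the plug.]

```shell
# Small runnable illustration: write, close, then explicitly flush.
f=$(mktemp)
printf 'important data' > "$f"   # written and closed -- but possibly not on disk yet
sync                             # force dirty pages to disk (an app would fsync per file)
content=$(cat "$f")              # only now is the on-disk state guaranteed to match
rm -f "$f"
echo "$content"
```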
Re: Bad Linux backups
On Tue, Jul 25, 2006 at 01:22:53PM +0200, Mark Perry wrote: > I believe that several DB systems offer direct/raw I/O to avoid Linux cache > problems, and that journaling filesystems, although by default only journal > meta-data, offer mount options to journal data too. This of course comes at > a performance price, though Hans Reiser did claim that the new Reiser4 FS > will journal data without the previous performance penalties. Journalled filesystems only journal buffered I/O. Direct I/O means you do direct dma operations from the storage controller to the user address space. It's physically impossible to journal. -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
I haven't finished all the replies to this thread, so I apologize if I'm duplicating someone else's comment/question. The thing that comes to my mind is, what _exactly_ do you mean by "not boot." Nothing happens at all? The system starts to come up, but can't find the root file system? The root file system gets mounted, but things start dying for various reasons? The problem description needs to be filled in quite a bit more. Mark Post -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of Stahr, Lea Sent: Tuesday, July 25, 2006 8:18 AM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups FDR says working as designed. They back up the entire volume and restore the entire volume. I have restored 3 systems and they DO NOT BOOT. Lea Stahr Sr. System Administrator Linux/Unix Team -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
On Tuesday, 07/25/2006 at 02:19 ZE2, Carsten Otte <[EMAIL PROTECTED]> wrote: > Stahr, Lea wrote: > > FDR says working as designed. They back up the entire volume and restore > > the entire volume. I have restored 3 systems and they DO NOT BOOT. > How does FDR copy the volume? Do they sequentially copy track-by-track > or use flashcopy? You would have to presume track-by-track copy since flashcopy is an optional feature and isn't available on all dasd brands. Alan Altmark z/VM Development IBM Endicott -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Sorry for asking this at such a late time but did you say the Linux guest was shutdown when the FDR backup jobs were run? >>> [EMAIL PROTECTED] 7/25/2006 11:27 AM >>> Gentlemen, I must agree with the validity of external backups, but only when the Linux is down. Any backup taken internally OR externally while the system is running may not work due to extensive caching by the system itself and by the applications. If I cannot restore my application to a current state, then it's broken. And these were all either EXT3 or REISERFS. Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of David Boyes Sent: Tuesday, July 25, 2006 10:11 AM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups > Therefore, dm-snapshot and > flashcopy are two sides of the same medal once the entire filesystem > is on a single dasd. That's a pretty large assumption, especially since the recommended wisdom for most "advanced applications" -- like DB/2 and WAS -- is *not* to put things like data and logs on the same filesystem for performance reasons. > > Given how quickly this can change in > > most real production systems, I don't have time or spare cycles to try > > to second-guess this, or make excuses when I miss backing up something > > important because someone didn't tell me that a change in data location > > was made. > The point is, that data is considered stable at any time. That's a > basic assumption which is true for ext3 and most applications. If you > run a file system or an application that does have inconsistent data > from time to time, you are in trouble in case of a power outage or > system crash. I hope this is not the case in any production environment. With respect, I think this is an unrealistic expectation. I don't control the application programmers at IBM or S/AP or Oracle, etc. 
If you want to preach on proper application design to those folks, I'll happily supply amens from the pews, but out here in the real world, it ain't so, and it ain't gonna be so for a good long while (or at least until the current crop of programmers re-discover all the development validation model work that we did back in the 70s at PARC). We're faced with dealing with the world as it is, not as we'd like it to be, and that reality contradicts your assertion. The filesystem contents may be technically consistent, but if the applications disagree for *any* reason, then that doesn't help us at all given what we have to work with in the field. It's a goal to build toward, but for now, it's just that: a goal. With *today's* applications, you need a guaranteed valid state both from the application *and* filesystem standpoint, and to get that, you need to coordinate backups from both inside and outside the guest if you want to use facilities outside the guest to dump the data. How you do that coordination is what I think you're trying to argue and there, your points are extremely valid and useful; my point still stands that without coordination between Linux and whatever else you're using, you're not going to get the desired result, which is a no-exceptions way to handle backup and restore of critical data in the most efficient manner available. -- db -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Fw: [LINUX-390] Bad Linux backups
Rob van der Heij wrote: On 7/25/06, John Campbell <[EMAIL PROTECTED]> wrote: In all of this, isn't the UNIONFS still a live deal? If as many client systems as possible use a set of backing F/Ss that are Read Only, wouldn't Yes, it's mostly working. I have done quite a lot with it on s390. You probably don't want to use it for all your data (for performance reasons) but just for parts of the file system that are mostly unmodified (like /etc). I would not use it for all data on the system though. With unionfs you can put a sparse R/W file system on top and have the modified files reside on some private R/W disk. Because that R/W disk still is a real file system, you could sort of run a file level backup of that disk outside the unionfs. For a stable backup you would have the issues we discussed in this thread though. But you could even put a temporary R/W layer on top and divert all writes to that layer, backup the (now frozen) first R/W layer, and then merge any updates during the backup back into the first R/W layer. This is neat because it's file level, but there may be a performance issue when files need to be copied up. Rob -- Rob van der Heij Velocity Software, Inc http://velocitysoftware.com/ -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390 AFAIK unionfs copies the entire file when a change is made, unlike other snapshot methods which only record the changes at a more granular level. Thus as Rob mentions it is *very* useful for ascii text file changes (/etc or source code etc.), but not for your DB files! Mark -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Fw: [LINUX-390] Bad Linux backups
On 7/25/06, John Campbell <[EMAIL PROTECTED]> wrote: In all of this, isn't the UNIONFS still a live deal? If as many client systems as possible use a set of backing F/Ss that are Read Only, wouldn't Yes, it's mostly working. I have done quite a lot with it on s390. You probably don't want to use it for all your data (for performance reasons) but just for parts of the file system that are mostly unmodified (like /etc). I would not use it for all data on the system though. With unionfs you can put a sparse R/W file system on top and have the modified files reside on some private R/W disk. Because that R/W disk still is a real file system, you could sort of run a file level backup of that disk outside the unionfs. For a stable backup you would have the issues we discussed in this thread though. But you could even put a temporary R/W layer on top and divert all writes to that layer, backup the (now frozen) first R/W layer, and then merge any updates during the backup back into the first R/W layer. This is neat because it's file level, but there may be a performance issue when files need to be copied up. Rob -- Rob van der Heij Velocity Software, Inc http://velocitysoftware.com/ -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
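[Editor's note] The layering Rob describes maps onto a mount configuration along these lines. This is a hypothetical fragment: the device name, mount points, and branch layout are invented, and the `dirs=` branch option is unionfs's commonly documented syntax, so treat it as a sketch rather than a recipe.

```shell
# Hypothetical unionfs layout: a small private R/W minidisk stacked over
# a shared read-only system image. Shown as an fstab-style outline only;
# names are illustrative, not from the thread.
#
#   /dev/dasdg1   /rw      ext3      defaults                 1 2   # private R/W branch
#   none          /union   unionfs   dirs=/rw=rw:/shared=ro   0 0   # union of the two
#
# A file-level backup then only needs to walk the small R/W branch:
#   tar czf /backup/rw-layer.tar.gz -C /rw .
```

As Mark notes below, any modified file is copied up into the R/W branch whole, so this pays off for small config files, not for databases.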
Fw: [LINUX-390] Bad Linux backups
In all of this, isn't the UNIONFS still a live deal? If as many client systems as possible use a set of backing F/Ss that are Read Only, wouldn't the local copy ONLY consist of changed files? And wouldn't the local copy (I'm not sure, UNIONFS _does_ handle having a R/W copy on a hard disk, right? Or am I talking through a sphincter again?) be in a form that could be copied off VERY quickly because it'd be pretty darn small? It seems, at least w/ z/VM, that this kind of trick would be almost *made* for a virtualized environment. I realize, though, that this discussion has been about LPAR'd environments so I'm not sure what would be different. Note that my experiences are currently limited to Intel and pSeries (PowerPC) based systems... John R. Campbell, Speaker to Machines (GNUrd) (813) 356-5322 (t/l 697) Adsumo ergo raptus sum Why MacOS X? Well, it's proof that making Unix user-friendly was much easier than debugging Windows. Red Hat Certified Engineer (#803004680310286, RHEL3) - Forwarded by John Campbell/Tampa/IBM on 07/25/06 01:20 PM - From: Adam Thornton <[EMAIL PROTECTED]> Sent by: Linux on 390 Port To: LINUX-390@VM.MARIST.EDU Subject: Re: [LINUX-390] Bad Linux backups 07/24/06 04:38 PM Please respond to Linux on 390 Port On Jul 24, 2006, at 1:35 PM, David Boyes wrote: >> Such an approach does require discipline to properly register what >> you >> have modified and to assure the copy of that customized file is held >> somewhere. > > Tripwire is a handy tool for this. Run it every night and have it > generate a list of changes in diff format. You can then turn that diff > into input to patch and deploy it as a .deb or .rpm. Or, if you're feeling REALLY parsimonious, you can use Bacula in its "verify" mode to do the same thing. 
Adam -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
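[Editor's note] The nightly Tripwire run (and Bacula's "verify" mode) boil down to "record a baseline, then diff against it". A minimal sketch with standard tools follows; a throwaway temp tree stands in for a real /etc so it can run anywhere, and the file names are invented for illustration.

```shell
# Baseline-and-verify sketch in the spirit of tripwire / bacula verify mode.
set -e
TREE=$(mktemp -d)
mkdir -p "$TREE/etc"
echo "HOSTNAME=linux01" > "$TREE/etc/HOSTNAME"
echo "nameserver 10.0.0.1" > "$TREE/etc/resolv.conf"

# 1. Record the baseline once, right after install/clone.
( cd "$TREE" && find etc -type f | sort | xargs sha256sum ) > "$TREE/baseline.sums"

# 2. Time passes; someone customizes a file.
echo "HOSTNAME=linux02" > "$TREE/etc/HOSTNAME"

# 3. Nightly verify: files whose checksum changed show up as '>' lines
#    in the diff, ready to be collected into a patch / .rpm / .deb.
( cd "$TREE" && find etc -type f | sort | xargs sha256sum ) > "$TREE/current.sums"
CHANGED=$(diff "$TREE/baseline.sums" "$TREE/current.sums" | grep '^>' | awk '{print $3}')
echo "changed: $CHANGED"
```

Real Tripwire adds a signed database, ownership/permission checks, and tamper resistance; the diff-of-checksums core is the same.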
Re: Bad Linux backups
> Earlier in this thread there was mention of using clustering > services to avoid outages while doing backups. Wouldn't that involve > the same sort of data-in-flight issues? Not really, because the major thrust of clustering tools is to coordinate the services and workload between the cluster members. In that case, there is no exposure for the guest being shut down, because the work and inflight transactions have been moved elsewhere in a coordinated manner. Once the guest leaves the cluster, then the shutdown/logoff is trivial, and you can do what you like with dumping the disks for that guest from outside the guest. David Boyes Sine Nomine Associates -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
J Leslie Turriff wrote: > Earlier in this thread there was mention of using clustering > services to avoid outages while doing backups. Wouldn't that involve > the same sort of data-in-flight issues? If the data is shared among the nodes, like with nfs or a cluster filesystem, yes. with kind regards, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
With clustering, you shut down one image and do an OFFLINE backup while the application runs on the second image. Then bring up the primary image and shut down the secondary system for backup. Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of J Leslie Turriff Sent: Tuesday, July 25, 2006 10:54 AM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups Earlier in this thread there was mention of using clustering services to avoid outages while doing backups. Wouldn't that involve the same sort of data-in-flight issues? J. Leslie Turriff VM Systems Programmer Central Missouri State University Room 400 Ward Edwards Building Warrensburg MO 64093 660-543-4285 660-580-0523 [EMAIL PROTECTED] -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Earlier in this thread there was mention of using clustering services to avoid outages while doing backups. Wouldn't that involve the same sort of data-in-flight issues? J. Leslie Turriff VM Systems Programmer Central Missouri State University Room 400 Ward Edwards Building Warrensburg MO 64093 660-543-4285 660-580-0523 [EMAIL PROTECTED] -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
On Jul 25, 2006, at 8:49 AM, Carsten Otte wrote: James Melin wrote: Not even remotely close to what I was thinking Greater minds than mine any ideas? Oh not me. The oops seems to be issued in the filesystem code, probably reiserfs. Lea, could you run the oops message through ksymoops please? That would utterly fail to surprise me. ReiserFS and I don't get along. Ext3, on the other hand, has rarely bitten me without extreme provocation. Adam -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
James Melin wrote: > Not even remotely close to what I was thinking Greater minds than > mine any ideas? Oh not me. The oops seems to be issued in the filesystem code, probably reiserfs. Lea, could you run the oops message through ksymoops please? regards, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
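[Editor's note] For anyone following along, decoding an oops like the one in this thread with ksymoops on a 2.4-era kernel typically follows the outline below. The file names are the usual defaults, not taken from this thread, so it is shown as a commented sketch rather than an exact command.

```shell
# Commented outline: resolve the raw oops addresses into symbol names.
# System.map must match the kernel that produced the oops.
#
#   ksymoops -m /boot/System.map < oops.txt > oops.decoded
#
# The decoded output names the function(s) in the call trace, which is
# what identifies whether the fault is in (e.g.) reiserfs code.
```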
Re: Bad Linux backups
Fargusson.Alan wrote: > I agree. I think you should make your backups with the Linux system down. > You should test this to make sure that there is not some other operational > error causing problems. I think we got close to the bottom of the stack now: If one can take down the system for backup it is a good idea to do so because of the reasons discussed in this thread. Backing up a running system involves trust in the application and the file system. with kind regards, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Not even remotely close to what I was thinking. Greater minds than mine, any ideas? "Stahr, Lea" <[EMAIL PROTECTED]> Sent by: Linux on 390 Port To: LINUX-390@VM.MARIST.EDU 07/25/2006 09:31 AM Subject: Re: Bad Linux backups Please respond to Linux on 390 Port All systems are clones of an original, so all device assignments in Linux are the same across systems. Here is my ZIPL and FSTAB. I cannot generate the error messages as I had to re-clone the system and get it back online. I was receiving errors on different filesystems and I ran FSCKs on them, then received this abend in the kernel: I restored the Linux volumes from the backup tapes and it will not come up all the way. I switched tape sets with the same result. The kernel is abending.

Unable to handle kernel pointer dereference at virtual kernel address 0c80
Oops: 0010 CPU: 0 Not tainted
Process find (pid: 497, task: 09804000, ksp: 09805940)
Krnl PSW : 07081000 8d020b96
Krnl GPRS: 09709a80 fffd2348 0c7fff79 6003 0c5c7570 d140 09a3e779 09a34300 b814a409 00643c03 0d14 6004 09a414b0 8d020864 8d020a54 09805c98
Krnl ACRS: 0001
Krnl Code: d2 ff 40 00 20 00 41 40 41 00 41 20 21 00 a7 16 ff f9 a7 15
Call Trace: <1>Unable to handle kernel pointer dereference at virtual kernel address 18 00
Oops: 0010 CPU: 0 Not tainted
Process find (pid: 497, task: 09804000, ksp: 09805940)
Krnl PSW : 07082000 800765aa
Krnl GPRS: 00a83f80 0002 1800 00c0 **

[EMAIL PROTECTED]:~> cd /etc
[EMAIL PROTECTED]:/etc> cat zipl.conf
# Generated by YaST2
[defaultboot]
default=ipl
[ipl]
target=/boot/zipl
image=/boot/kernel/image
ramdisk=/boot/initrd
parameters="dasd=0201-020f,0300-030f root=/dev/dasda1"
[dumpdasd]
target=/boot/zipl
dumpto=/dev/dasd??
[dumptape]
target=/boot/zipl
dumpto=/dev/rtibm0

[EMAIL PROTECTED]:/etc> cat fstab
/dev/dasda1  /         reiserfs  defaults         1 1
/dev/dasda2  /tmp      ext2      defaults         1 2
/dev/dasdb1  /usr      reiserfs  defaults         1 2
/dev/dasdd1  /var      reiserfs  defaults         1 2
/dev/dasde1  /home     reiserfs  defaults         1 2
/dev/dasdf1  /user2    reiserfs  defaults         1 2
/dev/dasdc1  swap      swap      pri=42           0 0
/dev/dasdd2  /opt/IBM  reiserfs  defaults         0 2
devpts       /dev/pts  devpts    mode=0620,gid=5  0 0
proc         /proc     proc      defaults         0 0
[EMAIL PROTECTED]:/etc>

Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of James Melin Sent: Tuesday, July 25, 2006 8:43 AM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups I think I might have an idea as to where your problem MAY be - I'm guessing at this point so I'd like you to fill in some blanks. Are your Linux guests device-number consistent across Linuxen? As in, /dev/dasdb1 is always device 601 and /{root file system} is 600, etc? Or is it on Linux A it's 600,601,602 etc and Linux B it's 400,401,402 etc? What does your zipl.conf look like? Also can you post any messages you get when you try to boot them? "Stahr, Lea" <[EMAIL PROTECTED]> Sent by: Linux on 390 Port To: LINUX-390@VM.MARIST.EDU 07/25/2006 07:17 AM Subject: Re: Bad Linux backups Please respond to Linux on 390 Port FDR says working as designed. They back up the entire volume and restore the entire volume. I have restored 3 systems and they DO NOT BOOT. Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port
Re: Bad Linux backups
David Boyes wrote: >> Therefore, dm-snapshot and >> flashcopy are two sides of the same medal once the entire filesystem >> is on a single dasd. > > That's a pretty large assumption, especially since the recommended > wisdom for most "advanced applications" -- like DB/2 and WAS -- is *not* > to put things like data and logs on the same filesystem for performance > reasons. Yup, I know that "everything on a single dasd" is a strong limitation. But since flashcopy doesn't allow snapshotting multiple volumes at a time, it is the only way I know of to get a snapshot of all the data involved from outside the system. >> The point is, that data is considered stable at any time. That's a >> basic assumption which is true for ext3 and most applications. If you >> run a file system or an application that does have inconsistent data >> from time to time, you are in trouble in case of a power outage or >> system crash. I hope this is not the case in any production > environment. > > With respect, I think this is an unrealistic expectation. I don't > control the application programmers at IBM or S/AP or Oracle, etc. If > you want to preach on proper application design to those folks, I'll > happily supply amens from the pews, but out here in the real world, it > ain't so, and it ain't gonna be so for a good long while (or at least > until the current crop of programmers re-discover all the development > validation model work that we did back in the 70s at PARC). It depends on the type of application. For a fileserver or static webserver, for example, this requirement is fulfilled. For more complex servers, it can get nasty. > With *today's* applications, you need a guaranteed valid state both from > the application *and* filesystem standpoint, and to get that, you need > to coordinate backups from both inside and outside the guest if you want > to use facilities outside the guest to dump the data. 
How you do that > coordination is what I think you're trying to argue and there, your > points are extremely valid and useful; my point still stands that > without coordination between Linux and whatever else you're using, > you're not going to get the desired result, which is a no-exceptions way > to handle backup and restore of critical data in the most efficient > manner available. Some people seem to trust today's applications more, for example the developers of dm-snapshot and the users of per-file backup solutions like TSM, which usually also run while the application is active. with kind regards, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
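[Editor's note] For reference, the dm-snapshot approach Carsten mentions usually looks like the LVM sequence below. It needs root and real volumes, so it is shown as a commented outline with hypothetical volume and mount names rather than a runnable script.

```shell
# Commented outline of a dm-snapshot (LVM) backup: the snapshot freezes a
# point-in-time view while the application keeps running on the origin.
# Volume names are hypothetical; requires root on a real LVM setup.
#
#   lvcreate --size 512M --snapshot --name backsnap /dev/vg0/datalv
#   mount -o ro /dev/vg0/backsnap /mnt/snap     # stable view; origin stays live
#   tar czf /backup/data.tar.gz -C /mnt/snap .  # file-level backup of the view
#   umount /mnt/snap
#   lvremove -f /dev/vg0/backsnap
#
# Whether the captured state is *usable* still depends on the application
# having consistent on-disk data at the snapshot instant -- exactly the
# trust issue discussed in this thread.
```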
Re: Bad Linux backups
I agree. I think you should make your backups with the Linux system down. You should test this to make sure that there is not some other operational error causing problems. -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] Behalf Of Stahr, Lea Sent: Tuesday, July 25, 2006 8:28 AM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups Gentlemen, I must agree with the validity of external backups, but only when the Linux is down. Any backup taken internally OR externally while the system is running may not work due to extensive caching by the system itself and by the applications. If I cannot restore my application to a current state, then it's broken. And these were all either EXT3 or REISERFS. Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of David Boyes Sent: Tuesday, July 25, 2006 10:11 AM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups > Therefore, dm-snapshot and > flashcopy are two sides of the same medal once the entire filesystem > is on a single dasd. That's a pretty large assumption, especially since the recommended wisdom for most "advanced applications" -- like DB/2 and WAS -- is *not* to put things like data and logs on the same filesystem for performance reasons. > > Given how quickly this can change in > > most real production systems, I don't have time or spare cycles to try > > to second-guess this, or make excuses when I miss backing up something > > important because someone didn't tell me that a change in data location > > was made. > The point is, that data is considered stable at any time. That's a > basic assumption which is true for ext3 and most applications. If you > run a file system or an application that does have inconsistent data > from time to time, you are in trouble in case of a power outage or > system crash. I hope this is not the case in any production environment. 
With respect, I think this is an unrealistic expectation. I don't control the application programmers at IBM or S/AP or Oracle, etc. If you want to preach on proper application design to those folks, I'll happily supply amens from the pews, but out here in the real world, it ain't so, and it ain't gonna be so for a good long while (or at least until the current crop of programmers re-discover all the development validation model work that we did back in the 70s at PARC). We're faced with dealing with the world as it is, not as we'd like it to be, and that reality contradicts your assertion. The filesystem contents may be technically consistent, but if the applications disagree for *any* reason, then that doesn't help us at all given what we have to work with in the field. It's a goal to build toward, but for now, it's just that: a goal. With *today's* applications, you need a guaranteed valid state both from the application *and* filesystem standpoint, and to get that, you need to coordinate backups from both inside and outside the guest if you want to use facilities outside the guest to dump the data. How you do that coordination is what I think you're trying to argue and there, your points are extremely valid and useful; my point still stands that without coordination between Linux and whatever else you're using, you're not going to get the desired result, which is a no-exceptions way to handle backup and restore of critical data in the most efficient manner available. 
-- db -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Gentlemen, I must agree with the validity of external backups, but only when the Linux is down. Any backup taken internally OR externally while the system is running may not work due to extensive caching by the system itself and by the applications. If I cannot restore my application to a current state, then it's broken. And these were all either EXT3 or REISERFS. Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of David Boyes Sent: Tuesday, July 25, 2006 10:11 AM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups > Therefore, dm-snapshot and > flashcopy are two sides of the same medal once the entire filesystem > is on a single dasd. That's a pretty large assumption, especially since the recommended wisdom for most "advanced applications" -- like DB/2 and WAS -- is *not* to put things like data and logs on the same filesystem for performance reasons. > > Given how quickly this can change in > > most real production systems, I don't have time or spare cycles to try > > to second-guess this, or make excuses when I miss backing up something > > important because someone didn't tell me that a change in data location > > was made. > The point is, that data is considered stable at any time. That's a > basic assumption which is true for ext3 and most applications. If you > run a file system or an application that does have inconsistent data > from time to time, you are in trouble in case of a power outage or > system crash. I hope this is not the case in any production environment. With respect, I think this is an unrealistic expectation. I don't control the application programmers at IBM or S/AP or Oracle, etc. 
If you want to preach on proper application design to those folks, I'll happily supply amens from the pews, but out here in the real world, it ain't so, and it ain't gonna be so for a good long while (or at least until the current crop of programmers re-discover all the development validation model work that we did back in the 70s at PARC). We're faced with dealing with the world as it is, not as we'd like it to be, and that reality contradicts your assertion. The filesystem contents may be technically consistent, but if the applications disagree for *any* reason, then that doesn't help us at all given what we have to work with in the field. It's a goal to build toward, but for now, it's just that: a goal. With *today's* applications, you need a guaranteed valid state both from the application *and* filesystem standpoint, and to get that, you need to coordinate backups from both inside and outside the guest if you want to use facilities outside the guest to dump the data. How you do that coordination is what I think you're trying to argue and there, your points are extremely valid and useful; my point still stands that without coordination between Linux and whatever else you're using, you're not going to get the desired result, which is a no-exceptions way to handle backup and restore of critical data in the most efficient manner available. -- db -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
> Therefore, dm-snapshot and > flashcopy are two sides of the same medal once the entire filesystem > is on a single dasd. That's a pretty large assumption, especially since the recommended wisdom for most "advanced applications" -- like DB/2 and WAS -- is *not* to put things like data and logs on the same filesystem for performance reasons. > > Given how quickly this can change in > > most real production systems, I don't have time or spare cycles to try > > to second-guess this, or make excuses when I miss backing up something > > important because someone didn't tell me that a change in data location > > was made. > The point is, that data is considered stable at any time. That's a > basic assumption which is true for ext3 and most applications. If you > run a file system or an application that does have inconsistent data > from time to time, you are in trouble in case of a power outage or > system crash. I hope this is not the case in any production environment. With respect, I think this is an unrealistic expectation. I don't control the application programmers at IBM or S/AP or Oracle, etc. If you want to preach on proper application design to those folks, I'll happily supply amens from the pews, but out here in the real world, it ain't so, and it ain't gonna be so for a good long while (or at least until the current crop of programmers re-discover all the development validation model work that we did back in the 70s at PARC). We're faced with dealing with the world as it is, not as we'd like it to be, and that reality contradicts your assertion. The filesystem contents may be technically consistent, but if the applications disagree for *any* reason, then that doesn't help us at all given what we have to work with in the field. It's a goal to build toward, but for now, it's just that: a goal. 
With *today's* applications, you need a guaranteed valid state both from the application *and* filesystem standpoint, and to get that, you need to coordinate backups from both inside and outside the guest if you want to use facilities outside the guest to dump the data. How you do that coordination is what I think you're trying to argue and there, your points are extremely valid and useful; my point still stands that without coordination between Linux and whatever else you're using, you're not going to get the desired result, which is a no-exceptions way to handle backup and restore of critical data in the most efficient manner available. -- db -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Here is my FDR restore job step for a volume:

//DISK5    DD UNIT=SYSALLDA,VOL=SER=VML061,DISP=OLD
//TAPE5    DD DSN=DRP.OPR.SOV.M3DMP.VML061(-3),
//            SUBSYS=SOV,DISP=SHR
//SYSPRIN5 DD SYSOUT=*

Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of Jeremy Warren Sent: Tuesday, July 25, 2006 8:50 AM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups Did you try the FDR from cyl / to cyl options on the backups? This sounds eerily familiar to our label issue. "Stahr, Lea" <[EMAIL PROTECTED]> Sent by: Linux on 390 Port 07/25/2006 08:17 AM Please respond to Linux on 390 Port To: LINUX-390@VM.MARIST.EDU Subject: Re: [LINUX-390] Bad Linux backups FDR says working as designed. They back up the entire volume and restore the entire volume. I have restored 3 systems and they DO NOT BOOT. Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of John Summerfield Sent: Monday, July 24, 2006 6:03 PM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups Stahr, Lea wrote: > I run SuSE SLES 8 under ZVM 5.1 in an IFL. The DASD are in a SAN that is > also accessed by ZOS. Backups are taken by ZOS using FDR full volume > copies on Saturday morning (low usage). When I restore a backup, it will > not boot. The backup and the restore have the same byte counts. Linux > support at MainLine Systems tells me that he has seen this before at > other customers. What is everyone using for Linux under ZVM backups? > HELP! My backups are no good! What do the FDR suppliers say? 
-- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
All systems are clones of an original, so all device assignments in Linux are the same across systems. Here is my ZIPL and FSTAB. I cannot generate the error messages as I had to re-clone the system and get it back online. I was receiving errors on different filesystems and I ran FSCKs on them, then received this abend in the kernel: I restored the Linux volumes from the backup tapes and it will not come up all the way. I switched tape sets with the same result. The kernel is abending.

Unable to handle kernel pointer dereference at virtual kernel address 0c80
Oops: 0010 CPU: 0 Not tainted
Process find (pid: 497, task: 09804000, ksp: 09805940)
Krnl PSW : 07081000 8d020b96
Krnl GPRS: 09709a80 fffd2348 0c7fff79 6003 0c5c7570 d140 09a3e779 09a34300 b814a409 00643c03 0d14 6004 09a414b0 8d020864 8d020a54 09805c98
Krnl ACRS: 0001
Krnl Code: d2 ff 40 00 20 00 41 40 41 00 41 20 21 00 a7 16 ff f9 a7 15
Call Trace: <1>Unable to handle kernel pointer dereference at virtual kernel address 18 00
Oops: 0010 CPU: 0 Not tainted
Process find (pid: 497, task: 09804000, ksp: 09805940)
Krnl PSW : 07082000 800765aa
Krnl GPRS: 00a83f80 0002 1800 00c0 **

[EMAIL PROTECTED]:~> cd /etc
[EMAIL PROTECTED]:/etc> cat zipl.conf
# Generated by YaST2
[defaultboot]
default=ipl
[ipl]
target=/boot/zipl
image=/boot/kernel/image
ramdisk=/boot/initrd
parameters="dasd=0201-020f,0300-030f root=/dev/dasda1"
[dumpdasd]
target=/boot/zipl
dumpto=/dev/dasd??
[dumptape]
target=/boot/zipl
dumpto=/dev/rtibm0

[EMAIL PROTECTED]:/etc> cat fstab
/dev/dasda1  /         reiserfs  defaults         1 1
/dev/dasda2  /tmp      ext2      defaults         1 2
/dev/dasdb1  /usr      reiserfs  defaults         1 2
/dev/dasdd1  /var      reiserfs  defaults         1 2
/dev/dasde1  /home     reiserfs  defaults         1 2
/dev/dasdf1  /user2    reiserfs  defaults         1 2
/dev/dasdc1  swap      swap      pri=42           0 0
/dev/dasdd2  /opt/IBM  reiserfs  defaults         0 2
devpts       /dev/pts  devpts    mode=0620,gid=5  0 0
proc         /proc     proc      defaults         0 0
[EMAIL PROTECTED]:/etc>

Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of James Melin Sent: Tuesday, July 25, 2006 8:43 AM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups I think I might have an idea as to where your problem MAY be - I'm guessing at this point so I'd like you to fill in some blanks. Are your Linux guests device-number consistent across Linuxen? As in, /dev/dasdb1 is always device 601 and /{root file system} is 600, etc? Or is it on Linux A it's 600,601,602 etc and Linux B it's 400,401,402 etc? What does your zipl.conf look like? Also can you post any messages you get when you try to boot them? "Stahr, Lea" <[EMAIL PROTECTED]> Sent by: Linux on 390 Port To: LINUX-390@VM.MARIST.EDU 07/25/2006 07:17 AM Subject: Re: Bad Linux backups Please respond to Linux on 390 Port FDR says working as designed. They back up the entire volume and restore the entire volume. I have restored 3 systems and they DO NOT BOOT. Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of John Summerfield Sent: Monday, July 24, 2006 6:03 PM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups Stahr, Lea wrote: > I run SuSE SLES 8 under ZVM 5.1 in an IFL. The DASD are in a SAN that is > also accessed by ZOS. Backups are taken by ZOS using FDR full volume > copies on Saturday morning (low usage). When I restore a backup, it will > not boot. The backup and the restore have the same byte counts. Linux > support at MainLine Systems tells me that he has seen this before at > other customers. What is everyone using for Linux under ZVM backups? > HELP! My backups are no good! What do the FDR suppliers say? -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list --
Re: Bad Linux backups
What happens when you try to boot a restored system? Jon FDR says working as designed. They back up the entire volume and restore the entire volume. I have restored 3 systems and they DO NOT BOOT. -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Did you try the FDR from-cyl / to-cyl options on the backups? This sounds eerily familiar to our label issue. "Stahr, Lea" <[EMAIL PROTECTED]> Sent by: Linux on 390 Port 07/25/2006 08:17 AM Please respond to Linux on 390 Port To LINUX-390@VM.MARIST.EDU cc Subject Re: [LINUX-390] Bad Linux backups FDR says working as designed. They back up the entire volume and restore the entire volume. I have restored 3 systems and they DO NOT BOOT. Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of John Summerfield Sent: Monday, July 24, 2006 6:03 PM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups Stahr, Lea wrote: > I run SuSE SLES 8 under ZVM 5.1 in an IFL. The DASD are in a SAN that is > also accessed by ZOS. Backups are taken by ZOS using FDR full volume > copies on Saturday morning (low usage). When I restore a backup, it will > not boot. The backup and the restore have the same byte counts. Linux > support at MainLine Systems tells me that he has seen this before at > other customers. What is everyone using for Linux under ZVM backups? > HELP! My backups are no good! What do the FDR suppliers say? -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
I think I might have an idea as to where your problem MAY be - I'm guessing at this point, so I'd like you to fill in some blanks. Are your Linux guests' device numbers consistent across Linuxen? As in, /dev/dasdb1 is always device 601 and /{root file system} is 600, etc.? Or is it on Linux A it's 600,601,602 etc. and on Linux B it's 400,401,402 etc.? What does your zipl.conf look like? Also, can you post any messages you get when you try to boot them? "Stahr, Lea" <[EMAIL PROTECTED]> Sent by: Linux on 390 Port To LINUX-390@VM.MARIST.EDU cc 07/25/2006 07:17 AM Subject Re: Bad Linux backups Please respond to Linux on 390 Port FDR says working as designed. They back up the entire volume and restore the entire volume. I have restored 3 systems and they DO NOT BOOT. Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of John Summerfield Sent: Monday, July 24, 2006 6:03 PM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups Stahr, Lea wrote: > I run SuSE SLES 8 under ZVM 5.1 in an IFL. The DASD are in a SAN that is > also accessed by ZOS. Backups are taken by ZOS using FDR full volume > copies on Saturday morning (low usage). When I restore a backup, it will > not boot. The backup and the restore have the same byte counts. Linux > support at MainLine Systems tells me that he has seen this before at > other customers. What is everyone using for Linux under ZVM backups? > HELP! My backups are no good! What do the FDR suppliers say? 
-- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
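For what it's worth, one quick way to answer James's device-consistency question is to reduce /proc/dasd/devices to its device-number-to-node mapping and diff the result between guests. A minimal sketch; the sample text below is hypothetical, and the exact column layout of /proc/dasd/devices varies with the dasd driver version, so the awk field numbers may need adjusting:

```shell
# Hypothetical /proc/dasd/devices output; on a real guest you would read
# the file itself:  awk '{ print $1, $7 }' /proc/dasd/devices
sample='0600(ECKD) at ( 94: 0) is dasda : active at blocksize: 4096, 600840 blocks
0601(ECKD) at ( 94: 4) is dasdb : active at blocksize: 4096, 600840 blocks'

# Keep only "device-number node" pairs; diff this output across guests.
printf '%s\n' "$sample" | awk '{ print $1, $7 }'
```

If the two guests print different pairings (600 is dasda on one and dasdb on the other), a restored volume can end up mounted in the wrong place, which is exactly the failure mode James is probing for.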
Re: Bad Linux backups
Stahr, Lea wrote: > FDR says working as designed. They back up the entire volume and restore > the entire volume. I have restored 3 systems and they DO NOT BOOT. How does FDR copy the volume? Do they sequentially copy track-by-track or use flashcopy? cheers, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
FDR says working as designed. They back up the entire volume and restore the entire volume. I have restored 3 systems and they DO NOT BOOT. Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of John Summerfield Sent: Monday, July 24, 2006 6:03 PM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups Stahr, Lea wrote: > I run SuSE SLES 8 under ZVM 5.1 in an IFL. The DASD are in a SAN that is > also accessed by ZOS. Backups are taken by ZOS using FDR full volume > copies on Saturday morning (low usage). When I restore a backup, it will > not boot. The backup and the restore have the same byte counts. Linux > support at MainLine Systems tells me that he has seen this before at > other customers. What is everyone using for Linux under ZVM backups? > HELP! My backups are no good! What do the FDR suppliers say? -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
- Start Original Message - Sent: Tue, 25 Jul 2006 11:48:09 +0200 From: Ingo Adlung <[EMAIL PROTECTED]> To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups > Linux on 390 Port wrote on 25.07.2006 11:23:02: > > > Alan Altmark wrote: > > > But, Carsten, the application may start up just fine, however it may be > > > using old data. I have an application running on my workstation right now > > > that saves its configuration data only when you shut it down (working as > > > designed according to the vendor). Since the application is only > > > terminated when the system is shut down, a live backup of the disk would > > > have no effect. I mean, it would restore and run just fine, but be > > > running with old data. > > Well, backups are always about using old data in case of recovery as > > far as I can see. Using an application that saves important data only > > on shutdown in a mission critical environment is very dangerous > > regardless of the backup solution. > > > > Well, I guess the question is whether you relaunch from a deterministic > starting point, or whether your starting point is arbitrary. You are > arguing > along the line that one shouldn't be afraid about discretionary starting > points, as anecdotal knowledge suggests that it will usually work anyhow. > Alan is arguing that customers would typically not want to bet on > arbitrariness and we shouldn't paper over the risks of doing so but clearly > articulate > them. Either you have application support/awareness for live backups, or the > result by definition *is* arbitrary - unless you can guarantee a well > defined > transactional state (as viewed by an application) which we currently lack > file > system or more generally operating system support for. > Your articulation of the English language is quite exquisite :-) > > > I can jump up and down and stamp my feet, claiming that the application > is > > > broken, but that doesn't make it so. > > I fully support Alan's view. 
Whilst the application may not be "broken", it is most definitely unsuitable for use within an Enterprise. The robustness of an application, and its ability to recover from unexpected system errors (power outages etc.) without *any* data loss, is of paramount importance. I believe that several DB systems offer direct/raw I/O to avoid Linux cache problems, and that journaling filesystems, although by default they journal only meta-data, offer mount options to journal data too. This of course comes at a performance price, though Hans Reiser did claim that the new Reiser4 FS will journal data without the previous performance penalties. Mark -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
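For ext3, the mount option Mark alludes to is data=journal: the default data=ordered journals only metadata but forces data blocks to disk before the corresponding metadata commits, while data=journal pushes the data through the journal as well, at the performance cost he mentions. A hypothetical fstab line (device and mount point are examples, not from Lea's configuration):

```
/dev/dasdb1  /usr  ext3  defaults,data=journal  1 2
```

The same effect can be had at mount time with "mount -o data=journal /dev/dasdb1 /usr". Note these are ext3 option names; reiserfs grew similar data=ordered/data=journal options only in later kernels.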
Re: Bad Linux backups
Ingo Adlung wrote: > Whether the file system is consistent in itself after a restart is > irrelevant from an application perspective if the application has e.g. > state that is independent from the file system content. You can only > capture that by application collaboration or by forcing that state to > be hardened on persistent storage, hence shutting down the application > prior to backup/archive. True, but this is a general restriction of live backups (e.g. file level backup). Not specific to full volume snapshot backup. cheers, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
David Boyes wrote: > As others have described, just dumping the data actually physically on > the disk doesn't always provide a consistent backup that is useful for > restoring the system. You *must* coordinate what happens inside the > guest with something outside the guest -- regardless of what that > outside system is -- to get something usable. I am well aware that there are other opinions. I am just trying to explain why those are wrong. > LVM snapshot is neat and cool, but it doesn't help in this case if the > outside system doesn't know it's happening or what data can be > considered "stable" to back up. LVM also doesn't know about filesystem internals; it just grabs a copy of the entire volume at a given point in time. Like flashcopy. Also, in the Linux layering, LVM sits "behind" the page cache (on the side where the physical disk is), just like the flashcopy mechanism. Therefore, dm-snapshot and flashcopy are two sides of the same coin once the entire filesystem is on a single DASD. > Given how quickly this can change in > most real production systems, I don't have time or spare cycles to try > to second-guess this, or make excuses when I miss backing up something > important because someone didn't tell me that a change in data location > was made. The point is that data is considered stable at any time. That's a basic assumption which is true for ext3 and most applications. If you run a file system or an application that does have inconsistent data from time to time, you are in trouble in case of a power outage or system crash. I hope this is not the case in any production environment. >> z/OS does not need to know what Linux is doing when the setup ensures >> consistent on-disk data at all times. > Clearly, from the live example that started this discussion, this is not > the case. I agree, it doesn't work in the current live example with the current setup. But the list is suggesting to "never do flashcopy backups from outside a running linux guest". 
This suggestion is wrong. > You are talking about something that happens INSIDE the Linux guest, > coordinating things on the Linux side to produce a copy of the data that > z/OS can dump in a consistent manner. Given that, then z/OS darn well > *better* know that the Linux system has done this, or the data you are > backing up is demonstrably crap. It can be demonstrably crap, or a reliably usable backup. If done properly, one can be sure to get the latter. regards, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
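For readers who have not used dm-snapshot, the in-Linux equivalent of an outboard flashcopy looks roughly like this. A sketch only, assuming a hypothetical volume group vg0 with logical volume usrlv; the names, sizes, and backup path are made up, and the commands need root on a guest with LVM configured:

```shell
# Create a copy-on-write snapshot: an atomic point-in-time view of the volume
lvcreate --snapshot --size 512M --name usrsnap /dev/vg0/usrlv

# Back up the frozen view while the original stays mounted and in use
dd if=/dev/vg0/usrsnap bs=64k | gzip > /backup/usr.img.gz

# Drop the snapshot once the backup has been taken
lvremove -f /dev/vg0/usrsnap
```

As with flashcopy, this captures whatever was on disk at the instant of the snapshot; Carsten's journal-replay argument is what makes such a crash-consistent image usable after a restore.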
Re: Bad Linux backups
Linux on 390 Port wrote on 25.07.2006 11:23:02: > Alan Altmark wrote: > > But, Carsten, the application may start up just fine, however it may be > > using old data. I have an application running on my workstation right now > > that saves its configuration data only when you shut it down (working as > > designed according to the vendor). Since the application is only > > terminated when the system is shut down, a live backup of the disk would > > have no effect. I mean, it would restore and run just fine, but be > > running with old data. > Well, backups are always about using old data in case of recovery as > far as I can see. Using an application that saves important data only > on shutdown in a mission critical environment is very dangerous > regardless of the backup solution. > Well, I guess the question is whether you relaunch from a deterministic starting point, or whether your starting point is arbitrary. You are arguing along the line that one shouldn't be afraid about discretionary starting points, as anecdotal knowledge suggests that it will usually work anyhow. Alan is arguing that customers would typically not want to bet on arbitrariness and we shouldn't paper over the risks of doing so but clearly articulate them. Either you have application support/awareness for live backups, or the result by definition *is* arbitrary - unless you can guarantee a well defined transactional state (as viewed by an application) which we currently lack file system or more generally operating system support for. > > I can jump up and down and stamp my feet, claiming that the application is > > broken, but that doesn't make it so. I fully support Alan's view. > What application (Server Application on Linux) acts like you claim? 
> > regards, > Carsten > > -- > For LINUX-390 subscribe / signoff / archive access instructions, > send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit > http://www.marist.edu/htbin/wlvindex?LINUX-390 Ingo -- Ingo Adlung, STSM, System z Linux and Virtualization Architecture mail: [EMAIL PROTECTED] - phone: +49-7031-16-4263 -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Linux on 390 Port wrote on 25.07.2006 11:10:04: > > Carsten Otte wrote: > >> Wrong. Due to caching, as correctly described by David Boyes, the > >> system may change on-disk content even when the application is not > >> running. Example: the syslogd generates a "mark" every 20 minutes. > > John Summerfield wrote: > > syslogd's mark message has nothing to do with caching. > > > > According to its man page, "sync forces changed blocks to disk, updates > > the super block." > > > > If you don't believe (or trust) that, then "mount -o remount" is your > > friend. > You missed my point: From the file system perspective, a snapshot of > an ext3 is _always_ consistent. No need to do remount, sync, shutdown > of application or shutdown of the entire system. > Whether the file system is consistent in itself after a restart is irrelevant from an application perspective if the application has e.g. state that is independent from the file system content. You can only capture that by application collaboration or by forcing that state to be hardened on persistent storage, hence shutting down the application prior to backup/archive. > regards, > Carsten > > -- > For LINUX-390 subscribe / signoff / archive access instructions, > send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit > http://www.marist.edu/htbin/wlvindex?LINUX-390 Ingo -- Ingo Adlung, STSM, System z Linux and Virtualization Architecture mail: [EMAIL PROTECTED] - phone: +49-7031-16-4263 -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Alan Altmark wrote: > On Monday, 07/24/2006 at 06:35 ZE2, Carsten Otte <[EMAIL PROTECTED]> wrote: >>> But rather than focus on that "edge" condition, we are all, I think, > in >>> violent agreement that you cannot take a volume-by-volume physical > backup >>> from outside a running Linux system and expect to have a usable > backup. >> That is a wrong assumption, I clearly disagree with it. If planned >> properly, and I agree that there are lots of things one can do wrong >> when planning the setup, physical backup of mounted and actively used >> volumes _is_ reliable. > > But you are making assumptions about the applications, something I am not > willing to do quite yet. If a database update requires a change to the > data file, the index file, and the log file, how do you (from the outside) > know that all changes have been made and that it is safe to copy them? And > that another transaction has not started? As for the first part of the question: doing "sync" after the update ensures that everything relevant has been flushed out to the disk proper. If another transaction has been started, fine. I expect the database to be capable of rolling back the transaction after restore. That brings things to the same situation as if I was doing the backup before the transaction. > From my days as a database application developer, the transaction > journal was meant to be replayed against a copy of the database as it > existed when the database was started, not replayed against a more > current snapshot. I.e. today's log is replayed against last night's > backup. And the transaction log is specifically NOT placed on the same > device as the data itself. In Linux terms, I guess that means don't place > it in the same filesystem since that's the smallest consistent unit of > data, right? If you lose the data device, you haven't lost a whole day's > worth of transactions. (Maybe database technology no longer requires such > precautions?) 
When using snapshots for backup purposes, you would obviously need a snapshot of both journal and data at the same time. Therefore you either need to use dm-snapshot if you have data and log on different devices, or you need to put both on the same device if you want to use flashcopy. > So I'll admit that I'm obviously not "getting it". If you would summarize > the steps needed to allow a reliable, usable, uncoordinated live backup of > Linux volumes, I for one would sincerely appreciate it. How do you > integrate them into your server? How do you automate the process? Right > now I'm a fan of SIGNAL SHUTDOWN, FLASHCOPY, XAUTOLOG, but that's just > me... Please don't get upset, I am doing my best to explain the situation. You need:
- the capability of getting a consistent snapshot of all data relevant to a) the file system _and_ b) the application. If the file system or the data set relevant to the application spans multiple volumes, you need the capability to snapshot all volumes at the very same time. The easy way to fulfill this requirement is to use just a single file system - which can span multiple physical disks in the case of dm-snapshot.
- an application that has consistent on-disk data at all times (which is a basic requirement for any server application)
- a file system that has consistent on-disk data at all times (such as ext3)

Now you can:
- take a snapshot backup at any time while the server is doing regular disk I/O
- pull the plug (crash)
- copy the data back to the original disk
- start the snapshot copy of the server and let the file system replay its journal, then start the application again

cheers, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
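Carsten's recipe reduces to a very short sketch. The device numbers below are hypothetical, and the CP command is issued from an authorized z/VM user, not from inside the guest:

```shell
# Inside the guest: push dirty pages out so the on-disk state is current
sync && echo "disk state hardened"

# From z/VM, take the point-in-time copy of the source minidisk, e.g.:
#   CP FLASHCOPY 0201 0 END TO 0301 0 END
# (the guest keeps running throughout)

# After restoring the snapshot, the first mount replays the ext3 journal
# and the application performs its normal crash recovery.
```

The sync is optional for correctness under Carsten's argument (the snapshot is crash-consistent either way); it merely narrows how much recent data the restored image is missing.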
Re: Bad Linux backups
Alan Altmark wrote: > But, Carsten, the application may start up just fine, however it may be > using old data. I have an application running on my workstation right now > that saves its configuration data only when you shut it down (working as > designed according to the vendor). Since the application is only > terminated when the system is shut down, a live backup of the disk would > have no effect. I mean, it would restore and run just fine, but be > running with old data. Well, backups are always about using old data in case of recovery as far as I can see. Using an application that saves important data only on shutdown in a mission critical environment is very dangerous regardless of the backup solution. > I can jump up and down and stamp my feet, claiming that the application is > broken, but that doesn't make it so. What application (Server Application on Linux) acts like you claim? regards, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
> Carsten Otte wrote: >> Wrong. Due to caching, as correctly described by David Boyes, the >> system may change on-disk content even when the application is not >> running. Example: the syslogd generates a "mark" every 20 minutes. John Summerfield wrote: > syslogd's mark message has nothing to do with caching. > > According to its man page, "sync forces changed blocks to disk, updates > the super block." > > If you don't believe (or trust) that, then "mount -o remount" is your > friend. You missed my point: From the file system perspective, a snapshot of an ext3 is _always_ consistent. No need to do remount, sync, shutdown of application or shutdown of the entire system. regards, Carsten -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Dominic Coulombe wrote: I'm sorry, but I don't get your point. "I don't mind losing data on the system filesystems as we are only interested in the database stuff." On 24-Jul-2006, at 19:19, John Summerfield wrote: so you don't care that it doesn't actually work! It might be that in your circumstances what you do is fine, because the stuff that does not work is not important to you. I'd need to consider the database stuff more carefully before agreeing that it really does work; there's too much I don't know. -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
I'm sorry, but I don't get your point. On 24-Jul-2006, at 19:19, John Summerfield wrote: so you don't care that it doesn't actually work! -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Dominic Coulombe wrote: On 7/24/06, David Boyes <[EMAIL PROTECTED]> wrote: One more time: Unless your Linux systems *are completely down* at the time of backup, full volume dumps from outside the Linux system are more than likely to be useless. Can you explain why is that ? To avoid the nitpickers, let's say that David means all filesystems must be flushed and ro. As I understand it, journalling (by default) logs metadata (directory info) but not data. If you create a file, that's journalled. If you extend a file, that's journalled. The data you write to the file are not. Let's say that you create a file, write 4K to it, close it. Let's say you do a backup of the volume externally while the 4K data remains unwritten. Note: read in "man 2 close" "A successful close does not guarantee that the data has been successfully saved to disk." So now you have journalled (or committed) metadata that says the file's got 4K of data in it. But, it hasn't. In the ordinary course of events, the data gets written to disk and all is well. The same sort of thing happens when a file's updated in place, as I expect databases commonly are. If a database product says its backup program works with active databases, I expect it does, but I'd never trust an external program, let alone an external system, to back up my database, unless the database is down. I have never experienced such a failure after doing live backups of journaled filesystems. Have you looked for a failure? I think it more likely you've had a failure that you didn't notice than that you didn't have a failure. I've never noticed a problem with losing data due to a power failure (except when it took the hardware with it!), but I'm not so foolish as to assume that I've had no file corruption. It is like brute forcing a shutdown by logging off the VM machine : not ideal, but not supposed to break your Linux machine. It is the reason to use journaled filesystems. Thanks. 
-- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390 -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list
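The close(2) caveat John quotes is the crux: a successful write+close only means the data reached the page cache. A small sketch of how a writer can force its data to stable storage before an external dump could see it; GNU dd and stat are assumed, and the file is a throwaway stand-in for John's 4K example:

```shell
tmp=$(mktemp)

# conv=fsync makes dd call fsync() on the output file before exiting, so
# the 4K really is on disk when the command returns. A plain write+close
# promises no such thing.
dd if=/dev/zero of="$tmp" bs=4096 count=1 conv=fsync 2>/dev/null

stat -c %s "$tmp"    # 4096
rm -f "$tmp"
```

An application that wants its on-disk state to be backup-safe at all times has to do the equivalent fsync() at every point it cares about; that is exactly the "application support for live backups" the rest of the thread argues over.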
Re: Bad Linux backups
Depending on the model of the library, there might be a Linux version of the STK control software. All you need it to do is let you tell the library to put volume A in drive B, and make drive B available to the VM system. Bacula doesn't need more brains than that. If you can't get the z/OS side to let go of the drive, then for now you'd have to use the NFS trick we documented elsewhere. Once the mainline Bacula code supports volume-level migration as a non-experimental feature, it'd probably be worth porting the Bacula storage daemon to USS and writing the tape interface routines to let it use z/OS-based tape. That'd be a killer use for a hipersocket...hmm. Anyone got a C++ compiler on their z/OS box and want to collaborate a bit? David Boyes Sine Nomine Associates > -Original Message- > From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of Jon > Brock > Sent: Monday, July 24, 2006 4:46 PM > To: LINUX-390@VM.MARIST.EDU > Subject: Re: Bad Linux backups > > Our robotics are controlled by Storagetek's software on our z/OS system; > we do not have a VM version. As far as APIs go, I have no idea. > > Jon -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Carsten Otte wrote: But Dominic, it has nothing to do with a journaled file system. The fact that you stopped the application and sync'd the file system (equivalent to unmounting it) is what makes it work, not the file system implementation. Wrong. Due to caching, as correctly described by David Boyes, the system may change on-disk content even when the application is not running. Example: the syslogd generates a "mark" every 20 minutes. syslogd's mark message has nothing to do with caching. According to its man page, "sync forces changed blocks to disk, updates the super block." If you don't believe (or trust) that, then "mount -o remount" is your friend. -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
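To make John's "sync / mount -o remount" suggestion concrete: quiescing a filesystem before an outboard volume dump might look like the sketch below. The remount lines need root and a filesystem with no open files for writing, so they are shown commented; /usr is just an example mount point:

```shell
# Flush dirty buffers and superblocks to disk
sync && echo "buffers flushed"

# Freeze further writes for the duration of the external dump (root only):
#   mount -o remount,ro /usr
# ... take the FDR/DDR/flashcopy backup of the volume here ...
#   mount -o remount,rw /usr
```

With the filesystem read-only, even Carsten's objection about background writers (syslogd marks and the like) disappears for the duration of the dump.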
Re: Bad Linux backups
> It shouldn't be hard to create a tape (or something) that IPLs and > restores. Should it? Sure -- you *could* create such an animal. On the other hand, there's a perfectly good one shipped with VM -- the IPLable DDR utility. DDR also handles z/OS, VSE, Linux, TPF, and pretty much any other stream of bits you can put on a DASD, and once you get a one-pack VM system up on the bare metal, you can do as many parallel restore streams as you have tape drives, restoring anything and everything that goes on the disks, regardless of source or creator. The point of effective DR restore is to get back on the air as quickly as possible, preferably to have one really effective tool that works for all the data you have, and you don't have to confuse your operators with different instructions for different types of data -- it all works the same way. There's already enough chaos in DR; no need to introduce any more. People use ADRDSSU or FDR for the same reason -- the tools are capable of doing the image dump and restore regardless of content, and it's a question of what you're most familiar with or already have procedures to deal with. People buy 3rd party gadgets like FDR or CA's VM:Backup-Hydro because they're more efficient or easier to use than the IBM utilities, but the purpose is the same: get the bits back on the disks as fast as possible. Once you have the basic snapshot laid down from the image backup, you can do file-level restores quickly to bring a guest up to the most recent date. The IBM utilities (DDR and ADRDSSU) do the job for the image backup part; the file level part is the thing that isn't widely deployed yet. Question for the list: if I were to put together a Bacula appliance image similar to the SSLSERV Enabler, would people contribute a small amount ($500-1K) to get it, or consider buying support for it? It'd take a week or so to get it right, and I don't want to waste the time if nobody would want it. 
David Boyes Sine Nomine Associates -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
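For anyone who hasn't driven DDR, a dump job is controlled by a handful of statements along these lines. Sketched from memory, with hypothetical device numbers and volser; check the z/VM CP DDR documentation for the exact operand syntax before relying on it:

```
SYSPRINT CONS
INPUT 0201 3390 LNX001
OUTPUT 0181 TAPE
DUMP ALL
```

The restore at the DR site is the mirror image (INPUT from tape, OUTPUT to DASD, RESTORE ALL), which is why one tool and one set of operator instructions cover every volume regardless of what created it.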
Re: Bad Linux backups
Rob van der Heij wrote: On 7/24/06, Adam Thornton <[EMAIL PROTECTED]> wrote: It strikes me that file-level backups are generally a lot easier to work with, and use less archival media. File level backup is great for "oops backup" when you erased a few files and want them back. I am not sure whether you ever tried to restore the entire server from file level backups when you lost the disk. Typically you will need to re-install a new system and then restore your backups on top of that. _I_ expect to boot a recovery system (on Intel it would be a bootable CD, but on Zeds I imagine I'd have a small system ready), repartition as my backup suggests, mount and copy - untar or whatever. I expect some application-specific work, but I don't see a good way round that (without other penalties). Whether a file-level backup is quicker than volume-level, like so much else, depends. dd (for example) minimises head movement, tar (for example) backs up only files actually mentioned in the directories. dump combines the two, but still has problems with files (maybe filesystems) that are being written to. -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
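The trade-off John describes, sketched with a throwaway directory standing in for a real filesystem; the dd line is shown against a hypothetical DASD partition and is commented out:

```shell
src=$(mktemp -d)
echo "payload" > "$src/f"

# File-level: walks the tree, archives only files the directories reference
tar czf /tmp/demo-backup.tar.gz -C "$src" .

# Image-level: sequential block copy, minimal head movement, but copies
# free space and dead data too:
#   dd if=/dev/dasdb1 bs=64k of=/backup/dasdb1.img

tar tzf /tmp/demo-backup.tar.gz
rm -rf "$src" /tmp/demo-backup.tar.gz
```

The tar listing contains only the one live file, while the dd image would be the full partition size no matter how empty the filesystem is; that is the media-versus-restore-speed trade discussed above.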
Re: Bad Linux backups
Carsten Otte wrote: As Alan said, it's a question of one system knowing what's happening on the other system. There's no way that z/OS is going to be able to know that the Linux system is "safe" (without some kind of automation on BOTH systems to be able to signal same) so dumps taken from outside the Linux system (even with hardware features like flashcopy) are going to be inconsistent. z/OS does not need to know what Linux is doing when the setup ensures consistent on-disk data at all times. It is, I think, time for a bakeoff (http://www.isi.edu/in-notes/rfc1025.txt): Team A, led by Carsten, constructs and implements a backup strategy that runs on z/OS and safely creates and restores backups of Linux systems created by Team B; Team B, led by David Boyes, will construct a Linux system and workload that Carsten cannot back up and restore. I nominate David Boies (http://en.wikipedia.org/wiki/David_Boies) to head the rules committee. Iterations are allowed: I'd not expect Carsten to win in the first round. Nominations for someone to handle the betting? -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
David Boyes wrote: Why is everyone so hung up on volume backups? It strikes me that file-level backups are generally a lot easier to work with, and use less archival media. Restore time in DR situations. Volume-level backups are a LOT faster to restore, and you don't have to configure anything special -- you restore all the data to disk using the same tools, regardless of source. It shouldn't be hard to create a tape (or something) that IPLs and restores. Should it? -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Post, Mark K wrote: For one thing, full-volume backups preserve partition information, making recovery much simpler. If I had to recover a hundred Linux systems, dig through the system documentation to figure out which partitions were what size, and belonged to these particular file systems or were LVM PVs (or md volumes), a lot of time could go by before we even started restoring data from tape. From my perspective, if we could fix just that part of the equation with some kind of automation/tool, then file level backups would be the only thing needed (aside from database-specific requirements/tools). Mark Post My backup script does this: sfdisk -d /dev/hda > /etc/disktab I _could_ copy it to a separate repository of this info, and I could handle similar info (eg filesystem labels, fstab) in a like manner: I have the info I need, and it doesn't make sense for _me_ to do anything more elaborate. When I initiated my plan, it included making a bootable DVD from which to restore. The last bit's not done, but the info I need is there and there are bootable systems _I_ can use (eg Knoppix) to do a manual restore. -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of Adam Thornton Sent: Monday, July 24, 2006 2:48 PM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups -snip- Why is everyone so hung up on volume backups? It strikes me that file-level backups are generally a lot easier to work with, and use less archival media. One has to think harder to get it right, and it's less obvious that volume-level backups are risky. -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
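The partition-table capture John describes generalizes to a save/replay pair. This sketch assumes root access and a real disk device node; the device path and output file are illustrative:

```shell
# At backup time: capture the layout in sfdisk's re-loadable dump format.
sfdisk -d /dev/hda > /etc/disktab

# At recovery time: replay the saved layout onto the replacement disk
# before restoring any files into the filesystems.
sfdisk /dev/hda < /etc/disktab
```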
Re: Bad Linux backups
James Melin wrote: This is more of a 'how do YOU do it ancillary question'... Obviously to get a decent system backup from within Linux you should be in single user mode, or even quiesced completely (if you're doing CDL volume backups, for instance). What are people doing to get a given image into single-user mode (or shut down) and then restarted in an automated way? Just curious because as always, there's 10 ways to do something that achieve the same goal, and comparing the various methods/philosophies might be of use to some. I regularly back up a Linux server running on Intel. I've decided some files (eg logs) aren't that important: if they were, I'd log to another machine. I used to rsync from one box to another (over ADSL) but that proved way too slow, whatever the rsync folk said. Now, I create an ext2 filesystem image:

dd if=/dev/zero of=${Image} count=0 bs=1024 \
    seek=$((7*1024*1024))
mke2fs -Fq ${Image}

I mount it, populate it with tar:

find /var -xdev \( -type p -o -type s \) >${excludes}
tar clC / --exclude=backup.img --exclude=/tmp --exclude='/mnt/*' --exclude=/var/lock --exclude='swapfil*' \
    --exclude=/var/autofs --exclude=lost+found --exclude=/var/tmp --exclude=/var/local --exclude=squid-cache \
    --exclude=/var/spool/cyrus/mail-backup \
    --exclude-from=${excludes} \
    / /boot /home /var \
  | buffer -m $((2*1024*1024)) -p 75 \
  | tar xpC /mnt/backup || { df -h ; exit ; }

I want the files compressed and on an ISO filesystem, so:

rm ${Image}
mkzftree --one-filesystem /mnt/backup/ /var/tmp/backup
umount /mnt/backup/
mkisofs -R -z -quiet -nobak \
    -o /var/tmp/backup-${HOSTNAME}.iso /var/tmp/backup

I then have an image which I can burn to DVD (if it still fits!), or "mount -o loop" and the Linux kernel decompresses the files. I use tar because it has adequate file filtering capability; mkzftree has none:-(( (and that's why I unlink the backup where I do). I'd exclude databases &c here and make separate arrangements for them.
Once I have the image, I rsync it to images in two other locations, one local and one off-site. Using rsync to replicate the directory structure took hours, days if it got a bit behind (and took an enormous amount of virtual storage, fortunately without inducing swapping), whereas the image takes about an hour to sync. -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Dominic Coulombe wrote: On 7/24/06, Rob van der Heij <[EMAIL PROTECTED]> wrote: Stop dreaming. Not even in theory - at least not my theory. Hi, I'm sorry, but we managed to do live backups of our systems without any problem. We restored a lot of backups and all were recoverable without any problem. Even when data was stored on LVM volumes. We stop our databases prior to doing the backup, sync the filesystems, do a flashcopy, then restart everything. As the databases are down, I don't see why we would lose data on those filesystems. I don't mind losing data on the system filesystems as we are only interested in the database stuff. So you don't care that it doesn't actually work! -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Stahr, Lea wrote: I run SuSE SLES 8 under ZVM 5.1 in an IFL. The DASD are in a SAN that is also accessed by ZOS. Backups are taken by ZOS using FDR full volume copies on Saturday morning (low usage). When I restore a backup, it will not boot. The backup and the restore have the same byte counts. Linux support at MainLine Systems tells me that he has seen this before at other customers. What is everyone using for Linux under ZVM backups? HELP! My backups are no good! What do the FDR suppliers say? -- Cheers John -- spambait [EMAIL PROTECTED] [EMAIL PROTECTED] Tourist pics http://portgeographe.environmentaldisasters.cds.merseine.nu/ do not reply off-list -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
On 7/24/06, Jeremy Warren <[EMAIL PROTECTED]> wrote: After reading the previous post though, does anyone know if that method would correctly configure the boot sector. The bootstrap uses a list of block numbers for the kernel, initrd etc. When you restore a physical backup all these files go into their old location so the bootstrap will still work. A file-level restore that puts the kernel and initrd somewhere else on disk will leave the bootstrap incomplete, but that's a moot point because it will not restore the bootstrap itself either. Running "zipl" after the restore should take care of those. Rob -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
FWIW FDR's Upstream DR Recovery comes with a script you can use to automate a good chunk of the "due diligence" aspect of this. It's not a panacea mind you, but throw it in cron as shipped (unless you're running LVM, in which case you need to uncomment some stuff) and at least you know you have all of the info you need to rebuild the box. If I had a few hours that I didn't know what to do with, I always thought it could be tweaked to produce a script that did the recovery itself, rather than just a bunch of reports... After reading the previous post though, does anyone know if that method would correctly configure the boot sector? Right now we build an empty system from the reader images, install the FDR DR Recovery Tool, and restore. In the previous post it sounds like we could skip that first step as long as all of the filesystems were correctly laid out beneath the rescue box? TIA jrw Rob van der Heij <[EMAIL PROTECTED]> Sent by: Linux on 390 Port 07/24/2006 04:23 PM Please respond to Linux on 390 Port To LINUX-390@VM.MARIST.EDU cc Subject Re: [LINUX-390] Bad Linux backups On 7/24/06, Stahr, Lea <[EMAIL PROTECTED]> wrote: > These are standard image systems that I can clone from a master and have > in production in 2 hours. But what if it's not standard? Then I have > customizations that are lost. Such an approach does require discipline to properly register what you have modified and to assure the copy of that customized file is held somewhere. You know what files the clone should have; if you also have a list of what you consider variable data or stuff that otherwise does not need to be backed up, then the difference is what should have been registered as customization. You can use the check either to correct your registration or to educate your colleagues. Bonus points for when you can enhance the cloning process to also re-apply these customization things. If you keep the copy of the customized files in a handy way (e.g. 
an NFS server) you could get a mechanism for applying changes with it. You might have a look at cfEngine. -- Rob -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390 -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
Our robotics are controlled by Storagetek's software on our z/OS system; we do not have a VM version. As far as APIs go, I have no idea. Jon What do you currently use to control your tape robotics? We did a demo of how to use a client-server program to make Bacula think it could drive your tape robotics. The freely available demo is just for a manual operator-driven back end, but if your library has a CMS-manipulatable interface then it's pretty easy to write a new back end that drives it. If the robot has any sort of API, then what you need is a server on the side that drives the robot, and a client on the Linux side that makes requesting tapes look to Bacula like its mtx-changer output. It's really pretty easy. -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
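For concreteness, the client side David describes is just a script speaking Bacula's changer-script calling convention. A heavily simplified sketch follows; the gateway host name and its "robot" command are purely hypothetical stand-ins for whatever actually drives the STK library:

```shell
#!/bin/sh
# Bacula invokes its changer script roughly as:
#   <script> <changer-device> <command> <slot> <archive-device> <drive-index>
# This stub forwards each request to an assumed gateway box that talks
# to the real robot. "robot-gw.example.com" and "robot" are invented names.
CHANGER=$1; CMD=$2; SLOT=$3; DEVICE=$4; DRIVE=$5
GATEWAY=robot-gw.example.com

case "$CMD" in
  load)    ssh "$GATEWAY" robot load "$SLOT" "$DRIVE" ;;
  unload)  ssh "$GATEWAY" robot unload "$SLOT" "$DRIVE" ;;
  loaded)  ssh "$GATEWAY" robot loaded "$DRIVE" ;;  # print slot no., 0 if empty
  list)    ssh "$GATEWAY" robot list ;;             # one "slot:volume" per line
  slots)   ssh "$GATEWAY" robot slots ;;            # total slot count
  *)       echo "unknown command: $CMD" >&2; exit 1 ;;
esac
```

The key design point is the one from the post: Bacula never needs to know the robot is on another system, as long as the script's output looks like what mtx-changer would have printed.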
Re: Bad Linux backups
On Jul 24, 2006, at 1:35 PM, David Boyes wrote: Such an approach does require discipline to properly register what you have modified and to assure the copy of that customized file is held somewhere. Tripwire is a handy tool for this. Run it every night and have it generate a list of changes in diff format. You can then turn that diff into input to patch and deploy it as a .deb or .rpm. Or, if you're feeling REALLY parsimonious, you can use Bacula in its "verify" mode to do the same thing. Adam -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
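Adam's nightly diff-and-patch idea, reduced to its moving parts on scratch files (the file names and contents are invented for illustration):

```shell
set -e
mkdir -p /tmp/twdemo
printf 'Port 22\n' > /tmp/twdemo/sshd_config.orig   # the pristine clone copy
printf 'Port 2222\n' > /tmp/twdemo/sshd_config      # the local customization

# What a nightly change report could emit for this file, in diff format:
diff -u /tmp/twdemo/sshd_config.orig /tmp/twdemo/sshd_config \
    > /tmp/twdemo/custom.patch || true   # diff exits 1 when files differ

# Re-apply the captured customization to a freshly cloned copy:
cp /tmp/twdemo/sshd_config.orig /tmp/twdemo/sshd_config.new
patch -s /tmp/twdemo/sshd_config.new < /tmp/twdemo/custom.patch
```

Wrapping the resulting patch set into a .deb or .rpm, as suggested, then gives you a versioned, installable record of each guest's deviations from the golden image.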
Re: Bad Linux backups
> Such an approach does require discipline to properly register what you > have modified and to assure the copy of that customized file is held > somewhere. Tripwire is a handy tool for this. Run it every night and have it generate a list of changes in diff format. You can then turn that diff into input to patch and deploy it as a .deb or .rpm. -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
On 7/24/06, Stahr, Lea <[EMAIL PROTECTED]> wrote: These are standard image systems that I can clone from a master and have in production in 2 hours. But what if it's not standard? Then I have customizations that are lost. Such an approach does require discipline to properly register what you have modified and to assure the copy of that customized file is held somewhere. You know what files the clone should have; if you also have a list of what you consider variable data or stuff that otherwise does not need to be backed up, then the difference is what should have been registered as customization. You can use the check either to correct your registration or to educate your colleagues. Bonus points for when you can enhance the cloning process to also re-apply these customization things. If you keep the copy of the customized files in a handy way (e.g. an NFS server) you could get a mechanism for applying changes with it. You might have a look at cfEngine. -- Rob -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
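One low-tech way to run the check Rob describes is a checksum manifest recorded at clone time; anything that no longer matches is an unregistered customization. A sketch on scratch paths (directory and file names are illustrative):

```shell
set -e
mkdir -p /tmp/clonedemo/etc
echo "defaults" > /tmp/clonedemo/etc/app.conf

# At clone time: record a checksum for every file the golden image ships.
( cd /tmp/clonedemo && find etc -type f -exec md5sum {} + ) \
    > /tmp/clonedemo/manifest.md5

# Later, someone customizes the guest...
echo "tuned" >> /tmp/clonedemo/etc/app.conf

# Files failing the check are candidate unregistered customizations:
( cd /tmp/clonedemo && md5sum -c --quiet manifest.md5 ) \
    > /tmp/clonedemo/changed.txt 2>&1 || true
```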
Re: Bad Linux backups
On Jul 24, 2006, at 12:47 PM, Rob van der Heij wrote: On 7/24/06, Adam Thornton <[EMAIL PROTECTED]> wrote: It strikes me that file-level backups are generally a lot easier to work with, and use less archival media. File level backup is great for "oops backup" when you erased a few files and want them back. I am not sure whether you ever tried to restore the entire server from file level backups when you lost the disk. Typically you will need to re-install a new system and then restore your backups on top of that. Think about how that works for many servers at the same time (because it probably must be a major problem if you actually lost DASD). It's not that bad. You should have a rescue system--which it *IS* a good idea to do volume backups of, and easy too because it's almost never running. This system has authority to link EVERYONE's disks. You bring it up. You attach disks in batches of however many you're comfortable with (I've only ever done it with one client at a time, but you certainly could do more). You format and then mount those disks in the right layout relative to the mount point. You do the restore of your files from the rescue system into the mounted filesystems. Then you do a chroot, run zipl (zipl -b would also work, I guess), unmount the file systems, detach the disks, and do the next batch. No, you don't want to try a restore onto the same devices that are actually RUNNING the system you're restoring on to. But the nice thing about VM is that it makes not doing that much, much easier than it is on discrete systems. Bacula also supports a Bootstrap Record feature, but this has not been extended to work with s390. The idea there is that you get a minimal system (on CD-ROM, as it stands) which has just enough smarts to find your disks, ask for your Bacula server, and then request the appropriate restore for that client (you have one bsr per client). This would be neat to port. 
Adam -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
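Adam's batch procedure, written out as one batch from the rescue guest. Every device number, device name, and mount point below is an illustrative assumption, the restore step itself is elided, and this is a sketch rather than a tested runbook:

```shell
# Bring one linked minidisk online (s390-tools), then format and partition:
chccwdev -e 0.0.0201
dasdfmt -b 4096 -y /dev/dasdb
fdasd -a /dev/dasdb            # auto-create a single partition
mke2fs -j /dev/dasdb1
mount /dev/dasdb1 /mnt/restore

# ...file-level restore into /mnt/restore from the backup server here...

# Rebuild the boot record from inside the restored tree, then clean up:
mount --bind /dev /mnt/restore/dev
chroot /mnt/restore zipl
umount /mnt/restore/dev /mnt/restore
# detach the disk and move on to the next batch
```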
FW: Bad Linux backups
Comments: What we have done is: 1. MANUAL JOB - SHUT DOWN the Linux 4 z/VM servers - after 7pm. 2. DDR the specific volumes thru an automated exec process. 3. MANUAL JOB - START the Linux 4 z/VM servers - after 7pm. We shut down the Linux servers thru the Linux Web interface, then perform the DDR of the volumes. This is a 99.9% non-violent approach! Down time: approx 1 hour per Linux server. Start: XAUTOLOG the Linux 4 z/VM server back up; works out just fine. Yes, we give up 100% up-time, but this is our trade-off. Our experience backing up the file system from the network server side has had many issues, but our staff keeps trying! P.S. No FlashCopy license for VM at this time. -Original Message- On Behalf Of Alan Altmark But rather than focus on that "edge" condition, we are all, I think, in violent agreement that you cannot take a volume-by-volume physical backup from outside a running Linux system and expect to have a usable backup. Shared dasd on System z has all the same issues that shared LUNs have on distributed systems. The backup *strategies* are identical, even if the mechanisms used to create the backups are not. -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Bad Linux backups
These are standard image systems that I can clone from a master and have in production in 2 hours. But what if it's not standard? Then I have customizations that are lost. Lea Stahr Sr. System Administrator Linux/Unix Team 630-753-5445 [EMAIL PROTECTED] -Original Message- From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of Rob van der Heij Sent: Monday, July 24, 2006 2:47 PM To: LINUX-390@VM.MARIST.EDU Subject: Re: Bad Linux backups On 7/24/06, Adam Thornton <[EMAIL PROTECTED]> wrote: > It strikes me that file-level backups are generally a lot easier to > work with, and use less archival media. File level backup is great for "oops backup" when you erased a few files and want them back. I am not sure whether you ever tried to restore the entire server from file level backups when you lost the disk. Typically you will need to re-install a new system and then restore your backups on top of that. Think about how that works for many servers at the same time (because it is probably a major problem if you actually lost DASD). I have been involved in several attempts to recover a system from file level backup, but none worked as planned. Last one I remember we found TSM trying to restore the upgraded glibc over the vanilla install of SuSE. Once you start looking at it, you will find that many servers don't really have data that you need to back up. You might be better off with some tooling to quickly create a fresh server and some structure to manage any customizing you do on top of that. Which eventually leaves the servers that actually hold business data in some application, and you can look at the best way to deal with those applications. 
Rob -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390 -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390