Re: Long Term Data Retention - off topic

2008-05-16 Thread Steven Harris
Hi David,

A few years ago I was working for a state health department that had
similar sorts of retention issues and was about to retire their main
patient admin system as they moved to a new one.   In this case, even
keeping existing data was not sufficient because different rules applied to
different data.  Some of it was supposed to be kept literally forever so
that historians could get at it, some was required for 80 years so that
epidemiological studies could be made, and some had retention lengths that
depended on the life of the patient.  In opposition to that, privacy
legislation required that some data be deleted when there was no longer an
operational need for it.

After convincing them that TSM was not an appropriate vehicle, using a
reductio ad absurdum argument, I researched a little further.

The best method for long term data retention is probably flat XML files.
These are well understood and self describing, require no specialist
software to read, yet can be searched by machine when this is necessary.
There are a number of specialized XML dialects developed for different
purposes, so a complete re-invention of the wheel is not necessary.
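
As a rough illustration of the flat XML idea, here is a minimal Python
sketch that writes one self-describing record and reads it back with a
generic parser.  The element and attribute names (record, retention, field)
and the rule label are invented for this example; they are not taken from
any real XML dialect or from the system described above.

```python
import xml.etree.ElementTree as ET


def build_record(record_id, retention_rule, fields):
    """Return one self-describing XML record as a string.

    All element and attribute names here are hypothetical.
    """
    record = ET.Element("record", {"id": record_id})
    # The retention rule travels with the data it applies to.
    ET.SubElement(record, "retention", {"rule": retention_rule})
    for name, value in fields.items():
        field = ET.SubElement(record, "field", {"name": name})
        field.text = value
    return ET.tostring(record, encoding="unicode")


xml_text = build_record(
    "A-0001", "keep-80-years", {"admitted": "1998-03-02", "ward": "7B"}
)
print(xml_text)

# Reading it back needs nothing more than a generic XML parser.
parsed = ET.fromstring(xml_text)
print(parsed.find("retention").get("rule"))
```

Keeping the retention rule inside each record is one possible design; a
real project would more likely adopt an existing dialect than invent tags.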

I did not pursue this to completion.  It turned out that there was a
section in the organization whose primary job was data retention: mostly
paper based, but recognizably moving into data (just think of all those
Word documents and spreadsheets that are also subject to legal retention
requirements), and the problem was passed to them.

It did occur to me that there is a business opportunity for consulting on
such problems.  Just understanding the web of retention standards, which
tend to refer to other standards nested three or four levels deep, is a huge
job.  Applying those standards to the data at hand, in order to write some
code to produce the final XML, is another.  It would however take the
sort of analytical accountant/actuary mindset to do this successfully, and
that is not my style.

I hope that has given you some insight.

Regards

Steve

Steven Harris
TSM Admin, Sydney Australia






Long Term Data Retention

2008-05-16 Thread David Longo
Wanted to get some thoughts on what people are doing for
Long Term Data Retention - specifically on obsolete applications.

Say we have an NT 4.0 system that is no longer used.  Business
owner says we need to keep it for 25 years.  I know that is not
practical/possible for a number of reasons.  Even if we VMware it,
will they support NT 4.0 for 25 years?  (Will ANYBODY support
Windows 2008 in 25 years?)

I know even if they take a DB dump and I Archive it for 25 years, if
we retrieve the file 20 years from now, who can decipher it?  There
are several systems here for which people are hinting that they want
to do this.

I have hinted that they need to take whatever data and dump it
to a text or pdf file and then I archive that.  I realize that this may
not be that simple for some applications as it probably involves
more than a simple data dump or whatever.  Plus some applications
are spread across multiple servers.

So, before we have a big meeting and I push the text or pdf file
idea, what are people doing for retention of data on obsolete
servers/applications?

Thanks,
David Longo



#
This message is for the named person's use only.  It may 
contain private, proprietary, or legally privileged information.  
No privilege is waived or lost by any mistransmission.  If you 
receive this message in error, please immediately delete it and 
all copies of it from your system, destroy any hard copies of it, 
and notify the sender.  You must not, directly or indirectly, use, 
disclose, distribute, print, or copy any part of this message if you 
are not the intended recipient.  Health First reserves the right to 
monitor all e-mail communications through its networks.  Any views 
or opinions expressed in this message are solely those of the 
individual sender, except (1) where the message states such views 
or opinions are on behalf of a particular entity;  and (2) the sender 
is authorized by the entity to give such views or opinions.
#


Re: Long term data retention for retired clients

2005-07-15 Thread John Naylor
Many thanks for the excellent responses to my question
Computers are great but organics are better.
John

**

The information in this E-Mail is confidential and may be legally privileged. 
It may not represent the views of Scottish and Southern Energy Group.

It is intended solely for the addressees. Access to this E-Mail by anyone else 
is unauthorised. If you are not the intended recipient, any disclosure, 
copying, distribution or any action taken or omitted to be taken in reliance on 
it, is prohibited and may be unlawful. Any unauthorised recipient should advise 
the sender immediately of the error in transmission. Unless specifically stated 
otherwise, this email (or any attachments to it) is not an offer capable of 
acceptance or acceptance of an offer and it does not form part of a binding 
contractual agreement.

Scottish Hydro-Electric, Southern Electric, SWALEC, S+S and SSE Power 
Distribution are trading names of the Scottish and Southern Energy Group.

**


Re: Long term data retention for retired clients

2005-07-14 Thread Prather, Wanda
The biggest problem I have with using EXPORT or BACKUPSET is that people
most likely are going to ask for a partial restore.  And they probably
aren't going to remember exactly what the directory structure &
filenames were.  Two years from now, NO ONE will have any idea what was
really on that EXPORT tape.  And you can't hunt for it effectively
unless all the data is still in the DB.

So what I've done as a compromise is use SQL SELECT to pull a list of
all the file names/backup dates for a retired client from the TSM DB
into a flat file, then do the EXPORT and delete the filespaces.  The
flat file remains around and can be searched using ordinary tools to
figure out what is on the EXPORT tape.
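
To show the kind of search the flat file allows, here is a minimal Python
sketch.  It assumes the listing was saved as comma-delimited text with
three columns (filespace, path, backup date); that layout and the sample
rows are made up for illustration, and the real columns depend on the
SELECT that produced the file.

```python
import csv
import io


def find_files(listing, fragment):
    """Yield (filespace, path, backup_date) rows whose path contains fragment.

    The three-column layout is an assumption, not a TSM output format.
    """
    needle = fragment.lower()
    for row in csv.reader(listing):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        filespace, path, backup_date = row
        if needle in path.lower():
            yield filespace, path, backup_date


# Made-up sample standing in for the saved listing of a retired client.
sample = io.StringIO(
    "/home,/alice/budget2003.xls,2004-11-30\n"
    "/home,/bob/notes.txt,2004-12-01\n"
    "/data,/finance/Budget2004.xls,2005-01-15\n"
)

for hit in find_files(sample, "budget"):
    print(hit)
```

The match is case-insensitive because people rarely remember the exact
capitalisation of a file name years later.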

Wanda Prather
"I/O, I/O, It's all about I/O"  -(me)





Re: Long term data retention for retired clients

2005-07-14 Thread Allen S. Rout
==> On Thu, 14 Jul 2005 08:45:11 -0400, Richard Rhodes <[EMAIL PROTECTED]> said:


> If there was one thing I really wish in all this was a comments field.  The
> only place we found to put comments about a node is in the contacts field.
> I wish there was another field where we could enter comments.

AMEN, brother.  Preach it!

> I am interested in how others handle this also.

Heh, I've been thinking about a white paper on just the topic:  "What they
left out, and what I did about it". I may just write it.

The short version is: I built an XML 'application' (dialect) to hold a bunch
of data about my servers, schedules, domains, storage pools and nodes.  I
generate all of my automation scripts by distilling that one big (dang, it's
50K now) file, and get out of it all the normal maintenance scripts and
schedules for my 10 servers, chargeback accounting, and trending data.

Oh, and my automatically-generated DR-restore-my-TSM-server shell scripts. :)

Most of my needs could have been filled with a ~1K "comments" field on:

nodes
domains
filespaces
stgpools



- Allen S. Rout


Re: Long term data retention for retired clients

2005-07-14 Thread Allen S. Rout
==> On Thu, 14 Jul 2005 10:27:49 +0100, John Naylor <[EMAIL PROTECTED]> said:

> I have considered various approaches
> 1) Export
> 2) Backup set
> 3) Create a new domain for retired clients which have the long term
> retention requirement

> I see export and backup sets as reducing database overhead, but being less
> easy to track and rather unfriendly if you just need to get a subset of
> the data back

Export yes, backupset no; [ see below ]


> The new domain would retain great visibility for the data and allow easy
> access to subsets of the data, but you would still have the database
> overhead.

Yes, but in the long term this overhead devolves to just space.  For example,
if everything in the node is ACTIVE, I don't believe that it represents much
of an e.g. expiration hit.  So there's an incrementally (hah) larger full DB
backup, and more space on disk for the DB, but not a lot of day to day DB
overhead.

> Does the choice just depend, on how often in reality you will need to get
> the  data back ?

I'd say that and how frequently you want to do the retention thing, for how
long.

Off the top of my head, if keeping the bits and pieces around would add up to,
say, a third of my [DB space / data / whatever] I'd consider doing something
nearline with it.


[below]

Keep in mind:  you can restore from a backupset stored on the server; this
need only be a little less convenient than restorations from online data.
Gedankenexperiment:

Say you want to deal with TheNode:

rename node TheNode OLD-2005-07-12-13-TheNode

   [ in case you ever want to use TheNode name again ]

GEN BACKUPSET OLD-2005-07-12-13-TheNode Terminal devclass=foobar RET=NOLimit

which uses tapes FOO1 and FOO2.

Then you DEL FILESPACE OLD-2005-07-12-13-TheNode *

and

CHECKOUT LIBVOLUME FOOLIB FOO1
CHECKOUT LIBVOLUME FOOLIB FOO2

At this point, you've got a -permanent- record of the state of TheNode, at the
cost of a few records in the node and backupset tables. Of course, you have an
increased exposure to media failure: no copypools for backupsets.  Anyway, to
restore from it, all you have to do is check the tapes back in and
issue a

dsmc restore backupset Terminal -loc=server [sourcespec] [destspec]

This is going to be a much less efficient restore than the online one, but
only in wall clock time and tape use, not in human skull sweat.
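
The sequence above can be collected into a small Python sketch that only
builds the command strings for review; it does not talk to a TSM server.
The function name and parameters are hypothetical, and the command text is
copied from the example in this post rather than checked against any
particular TSM release.  In practice the volume names are only known after
the backupset has been generated.

```python
def retire_commands(node, stamp, set_prefix, devclass, library, volumes):
    """Return the admin commands from the walkthrough, in order.

    Hypothetical helper: it formats strings and nothing else.
    """
    retired = f"OLD-{stamp}-{node}"
    commands = [
        # Free up the original node name in case it is needed again.
        f"rename node {node} {retired}",
        f"GEN BACKUPSET {retired} {set_prefix} devclass={devclass} RET=NOLimit",
        f"DEL FILESPACE {retired} *",
    ]
    # One checkout per tape the backupset ended up on.
    commands.extend(f"CHECKOUT LIBVOLUME {library} {vol}" for vol in volumes)
    return commands


for line in retire_commands(
    "TheNode", "2005-07-12-13", "Terminal", "foobar", "FOOLIB", ["FOO1", "FOO2"]
):
    print(line)
```

Printing the commands instead of running them leaves a human in the loop
before the DEL FILESPACE step, which cannot be undone.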


Plus, if someone gets crotchety about the archive, you can hand them the
checked out tapes and tell them to get their own LTO3. (heh)


- Allen S. Rout


Re: Long term data retention for retired clients

2005-07-14 Thread Nicholas Cassimatis
Remember - moving the nodes to a new domain does nothing to rebind the data
to new management classes - that only happens during an actual backup.  So
the data will stick around for as long as intended, except for the last
active version of the files.  Those you have to delete manually.  One
practice I've used is to rename the node, putting the date when it can
finally be deleted at the beginning of the name, so "NODENAME" becomes
"051231_NODENAME".  If you move this node to a new domain with a descriptive
name such as RETIRED, you can easily see which nodes can be
deleted, and when.  The simple version:  "query node domain=retired" or the
more complex: "select node_name from nodes where domain_name='RETIRED' order by
node_name"
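
For what it's worth, here is a minimal Python sketch of how the date prefix
could be checked once the node names have been pulled out of the server.
It assumes the prefix is YYMMDD followed by an underscore, as in the
"051231_NODENAME" example; the function name and sample node names are
invented for illustration.

```python
from datetime import date, datetime


def deletable_nodes(node_names, today):
    """Return the nodes whose YYMMDD_ prefix is on or before today.

    Names without a valid six-digit date prefix are ignored.
    """
    ready = []
    for name in node_names:
        prefix, sep, _rest = name.partition("_")
        if not sep or len(prefix) != 6 or not prefix.isdigit():
            continue  # not a retired-node name in this convention
        try:
            delete_after = datetime.strptime(prefix, "%y%m%d").date()
        except ValueError:
            continue  # six digits, but not a real calendar date
        if delete_after <= today:
            ready.append(name)
    return sorted(ready)


nodes = ["051231_NODENAME", "071231_OTHERNODE", "PRODNODE"]
print(deletable_nodes(nodes, date(2006, 1, 1)))
```

Because the prefix is year-first, a plain alphabetical sort of the names
(as in the "order by node_name" query) already lists them in deletion order.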

Nick Cassimatis
email: [EMAIL PROTECTED]

Re: Long term data retention for retired clients

2005-07-14 Thread Thomas Denier
> If there was one thing I really wish in all this was a comments field.  The
> only place we found to put comments about a node is in the contacts field.
> I wish there was another field where we could enter comments.

The 'define machine' and 'insert machine' commands can be used to store
large amounts of text data about nodes. I am not sure about the
licensing requirements for these commands; they may be part of the
Disaster Recovery Manager feature.


Re: Long term data retention for retired clients

2005-07-14 Thread Richard Sims

On Jul 14, 2005, at 8:45 AM, Richard Rhodes wrote:


> If there was one thing I really wish in all this was a comments field.  The
> only place we found to put comments about a node is in the contacts field.
> I wish there was another field where we could enter comments.
> ...


Richard -

With a "decommissioned" node, you could add 200 chars of annotation
in the node "URL" field, which accepts arbitrary text.

  Richard Sims


Re: Long term data retention for retired clients

2005-07-14 Thread Richard Rhodes
We have hundreds of retired servers, both from a server consolidation
project and general server rollover.  We decided we required access to the
backups and archives on the retired servers, so exports and backup sets
wouldn't work.  Also, by keeping the data in the normal pools we keep
redundancy (primary and copy pool) and DR issues covered.

We thought about moving the nodes into their own domain, but decided not to
because most domains use default management classes and we weren't sure how
to keep the policies straight in one domain holding retired servers from
many domains.  We thought about a separate retired domain for each
production domain, and rejected it.  Finally we decided to simply rename
the nodes.  All retired nodes are given a prefix of "zzrt_".  For example,
node "someserver"  becomes retired node "zzrt_someserver".  Normal policies
are allowed to expire inactive versions.  We put a comment in the contact
field for when the active versions can be deleted and the node removed.

This is anything but ideal . . . . but I think we found that this was true
for any method.  Many if not most of the sql queries we run against the tsm
db have logic to exclude any nodes that are like 'zzrt_%'.
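
A minimal Python sketch of the same convention, for reports built outside
the TSM db: one helper adds the prefix and another separates retired nodes
from live ones.  The function names and sample node names are hypothetical;
only the "zzrt_" prefix comes from the description above.

```python
RETIRED_PREFIX = "zzrt_"


def is_retired(node_name):
    """True if the node name carries the retired prefix (any letter case)."""
    return node_name.lower().startswith(RETIRED_PREFIX)


def retired_name(node_name):
    """Return the name a node would get on retirement; already-retired names pass through."""
    return node_name if is_retired(node_name) else RETIRED_PREFIX + node_name


def split_nodes(node_names):
    """Split a list of node names into (active, retired) lists."""
    active = [n for n in node_names if not is_retired(n)]
    retired = [n for n in node_names if is_retired(n)]
    return active, retired


print(retired_name("someserver"))
print(split_nodes(["someserver", "zzrt_oldbox", "ZZRT_OLDERBOX"]))
```

Note that in standard SQL LIKE patterns the underscore matches any single
character, so 'zzrt_%' is slightly looser than the literal prefix test used
here.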

I think a cleaner method would be a separate tsm server for just retired
nodes . . . . . but that has obvious drawbacks also.

If there was one thing I really wish in all this was a comments field.  The
only place we found to put comments about a node is in the contacts field.
I wish there was another field where we could enter comments.

I am interested in how others handle this also.

Rick









-
The information contained in this message is intended only for the
personal and confidential use of the recipient(s) named above. If the
reader of this message is not the intended recipient or an agent
responsible for delivering it to the intended recipient, you are hereby
notified that you have received this document in error and that any
review, dissemination, distribution, or copying of this message is
strictly prohibited. If you have received this communication in error,
please notify us immediately, and delete the original message.


Re: Long term data retention for retired clients

2005-07-14 Thread Richard Sims

On Jul 14, 2005, at 5:27 AM, John Naylor wrote:


> I have considered various approaches
> 1) Export
> 2) Backup set
> 3) Create a new domain for retired clients which have the long term
> retention requirement



Hi, John -

Another possibility for your list is TSM for Data Retention.
This would be more for a site where long-term, assured retention is
a big, ongoing thing as it's a non-trivial implementation.

   Richard Sims


Re: Long term data retention for retired clients

2005-07-14 Thread John Naylor
Further to my query, I have not mentioned archive as this is not a
function that we regularly use, but would this be a good candidate for
retiring clients, and can you archive a whole drive, i.e. D:?
thanks,
John





Re: Long term data retention for retired clients

2005-07-14 Thread Iain Barnetson
I'd be interested in the same info as regards NetApp client data, ie:
NDMP dump backups.
Iain

 

--
This e-mail, including any attached files, may contain confidential and 
privileged information for the sole use of the intended recipient.  Any review, 
use, distribution, or disclosure by others is strictly prohibited.  If you are 
not the intended recipient (or authorized to receive information for the 
intended recipient), please contact the sender by reply e-mail and delete all 
copies of this message.


Long term data retention for retired clients

2005-07-14 Thread John Naylor
Hi out there,
Just wondering what the consensus is on the best way to retain TSM client
data that has to be kept for many years (legal requirement) after the
client box is retired.
I have considered various approaches:
1) Export
2) Backup set
3) Create a new domain for retired clients which have the long term
retention requirement

I see export and backup sets as reducing database overhead, but being less
easy to track and rather unfriendly if you just need to get a subset of
the data back.
The new domain would retain great visibility for the data and allow easy
access to subsets of the data, but you would still have the database
overhead.
Does the choice just depend on how often in reality you will need to get
the data back?
Thoughts appreciated
John






