wishlist suggestions

2002-11-07 Thread Todd T. Fries
Hope this is useful.  I'm not currently subscribed to the mailing list, so
please Cc: any replies.  (I took the wishlist from amanda 2.4.2p2, hope none
of the below is already in the wishlist for 2.4.3).

Simple:

- support ipv6
- encrypt with ssl (not just limit encryption to kerberos users)
- instead of 'crc' use 'sha1' or 'rmd160' in the current WISHLIST (a rough
  sketch follows after this list)
- auto-loading of tapes in a tape silo
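
For the checksum item, here is a minimal sketch of what hashing a dump image
with sha1 instead of crc could look like (purely my own illustration in
Python, not existing amanda code; rmd160 would be analogous where the
underlying crypto library provides it):

    import hashlib
    import zlib

    def checksum_dump(path, algorithm="sha1", blocksize=64 * 1024):
        """Stream a dump image and return its checksum as a hex string.

        algorithm may be "sha1" (the suggestion above) or "crc32"
        (roughly the current scheme), for comparison.
        """
        if algorithm == "crc32":
            value = 0
            with open(path, "rb") as f:
                for block in iter(lambda: f.read(blocksize), b""):
                    value = zlib.crc32(block, value)
            return "%08x" % (value & 0xffffffff)
        h = hashlib.new(algorithm)   # "sha1"; "ripemd160" only if OpenSSL provides it
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(blocksize), b""):
                h.update(block)
        return h.hexdigest()

e.g. checksum_dump("/holding/20021107/host._usr.0", "sha1") would give the
digest to record alongside that dump (the path is made up for illustration).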

Complex:

- tape defragmentation + multiple backups on multiple tapes
- 'all versions of this file' in amrecover

Explanation:

With adsm (now part of Tivoli), IBM's backup solution, there is something
very similar to 'amrecover'.  You can specify the dump date, and it will
load the tape(s) necessary to recover the data, and then restore it as
you ask it to.

Adsm also has the concept (already in the wishlist) of using a filesystem
as a backup medium: you specify a number of 'large files of x size', they
are created, and each one is treated as a 'volume' (a tape is also a
'volume').

I hope I explain the defragmentation properly; if not, ask for a better
explanation.  To need defragmentation, you also need support for multiple
(partial) backups per tape.  It is best done with a lot of holding disk and
a tape silo, but can be done otherwise.  The concept stems from adsm's
practice of doing a full backup once, then incrementals forever afterwards.
There are retention rules like 'keep only one copy of a file if the most
recent modification is over x days old', 'keep at most x copies of a file
if the modification times are within the last x days', and 'keep all files
modified within the last x days'.  This may make more sense with an example
(a rough sketch of such a policy follows the list):

  - keep only one copy of each file if the most recent modification date
    is older than 90 days
  - keep at most 3 copies of a file if the modification times are newer
    than 90 days
  - or -
    keep all files newer than 90 days
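
A rough sketch of how such a retention policy might be evaluated (the record
layout and names here are my own, for illustration only, not adsm's or
amanda's):

    from datetime import datetime, timedelta

    def versions_to_keep(versions, now=None, max_copies=3, cutoff_days=90):
        """Pick which backed-up versions of one file to keep.

        versions: list of (mtime, volume_label) tuples, newest first.
        If the newest version is older than cutoff_days, keep only that
        single copy; otherwise keep up to max_copies of the versions
        modified within the last cutoff_days (all of them if max_copies
        is None, the 'keep all files newer than 90 days' variant).
        """
        now = now or datetime.now()
        cutoff = now - timedelta(days=cutoff_days)
        if not versions:
            return []
        newest_mtime, _ = versions[0]
        if newest_mtime < cutoff:
            return versions[:1]          # one copy of a file that stopped changing
        recent = [v for v in versions if v[0] >= cutoff]
        return recent if max_copies is None else recent[:max_copies]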

When you do things like the above, you end up having to go through a tape
that has many files backed up and remove the few files that have fallen
outside the scope of the backup.  The tape gets dumped to the holding area,
the data is manipulated to remove the files no longer needed in the backup
system, and what remains is spooled back out with other data heading to
tape.  In general data gets spooled to tape together, so each tape (or
'volume') gets fully used each time it is written to.
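
A minimal sketch of that reclamation step, with the tape and catalog access
passed in as callables because none of these interfaces exist in amanda
today (this is just the shape of the idea):

    def reclaim_volume(entries, is_wanted, stage, spool):
        """Reclaim one partly-expired tape.

        entries   -- catalog records stored on this volume
        is_wanted -- callable: does the retention policy still want this record?
        stage     -- callable: copy one record from tape to the holding disk,
                     returning its staged path
        spool     -- callable: queue a staged record for the next run to tape

        Returns the number of records dropped.  Once everything still wanted
        has been re-spooled, the tape can be marked empty and rewritten from
        the beginning.
        """
        dropped = 0
        for entry in entries:
            if not is_wanted(entry):
                dropped += 1                 # expired: do not carry it forward
                continue
            spool(entry, stage(entry))       # still wanted: goes out with new data
        return dropped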

The indexing mechanism for adsm required several tuning stages.  A site I
used to work for ran adsm against a 500 GB filesystem with a poor algorithm
for spreading out data (usually about 5 subdirs deep to find a file, with
only one or two files in each subdir, if any).  It took the indexing
mechanism 22 hours to work out what to back up, then about 20 minutes to do
the actual backup.

It could get time consuming, and perhaps the indexes would save time, but
the concept of doing 'ls -allversions file' and seeing each version of a
specific file in the backup set was extremely useful in adsm.  Users who
were very uncertain as to when a file was removed could be told we have a
copy from the 10th, the 11th, and the 13th.
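
A sketch of what an 'all versions of this file' lookup could look like on
top of per-dump index listings; the index layout assumed here (one plain
text file per dump date, one pathname per line) is a simplification of my
own, not amanda's actual index format:

    import os

    def all_versions(index_dir, wanted_path):
        """Report every dump date on which wanted_path was backed up."""
        hits = []
        for dump_date in sorted(os.listdir(index_dir)):
            with open(os.path.join(index_dir, dump_date)) as idx:
                if any(line.rstrip("\n") == wanted_path for line in idx):
                    hits.append(dump_date)
        return hits

    # e.g. all_versions("/var/lib/backup-index", "/home/todd/report.txt")
    # -> ["20021110", "20021111", "20021113"]  (copies from the 10th, 11th, 13th)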

-- 
Todd Fries .. [EMAIL PROTECTED]

(last updated $ToddFries: signature.p,v 1.2 2002/03/19 15:10:18 todd Exp $)




Re: wishlist suggestions

2002-11-07 Thread Frank Smith


--On Thursday, November 07, 2002 07:34:32 -0600 "Todd T. Fries" <[EMAIL PROTECTED]> wrote:


> The defragmentation I hope I explain properly, if not, ask for more/better
> explanation.  To need defragmentation, you also need support for multiple
> (partial)backups per tape.  It is best done with alot of holding disk and
> a tape silo, but can be done otherwise.  The concept stems from adsm's
> reality of doing a full backup once, then incremental always afterwards.
> There is a concept of 'keep only one copy of a file if the most recent
> modification is over x days old' and 'keep at max x copies of a file if
> the modification times are since x days old' and 'keep all files since x
> days old'.  This may make more sense with an example:
>
>   - keep only one copy of each file if the most recent modification date
>     is older than 90 days
>   - keep at max 3 copies of a file if the modification times are newer
>     than 90 days
>   - or -
>     keep all files newer than 90 days
>
> When you do things like the above, you end up having to go through a tape
> that has many files backed up, and remove a few files that are outside
> the scope of backup.  The tape gets dumped to the holding area, and then
> the data is manipulated to remove the files not needed in the backup system
> anymore, then spooled with other data heading to tape.  In general data
> gets spooled to tape together, but each tape (or 'volume') gets fully
> used each time it is written to.


This seems pretty scary, since it means you only have one copy of static
files.  When you have a tape error on that tape, how do you recover?


> The indexing mechanism for adsm required several tuning stages, since there
> was a site I used to work for that used adsm that had a 500gb filesystem
> with a poor algorithm for spreading out data (usually about 5 subdirs
> deep to find a file and one or two files in the subdir, if any).  It took
> the indexing mechanism 22 hours to do a search for what to backup, then
> about 20 minutes to do the actual backup.
>
> It could get time consuming, perhaps the indexes would save time, but the
> concept of doing 'ls -allverions file' and seeing each version of a specific
> file that is in the backup set was extremely useful in adsm, users that
> were very uncertain as to when the file was removed could be told we have
> a file from the 10th, the 11th, and the 13th.


This would be a nice feature, and if indexing is available it shouldn't be
too hard to implement (although it would probably be easier to implement as
a stand-alone program to browse the indexes than to add it to amrecover).

Frank


--
Frank Smith                         [EMAIL PROTECTED]
Systems Administrator               Voice: 512-374-4673
Hoover's Online                     Fax:   512-374-4501



Re: wishlist suggestions

2002-11-07 Thread Todd T. Fries
My recollection is that with DLT tapes, things are supposed to be reliable.

It has been a few years since I worked with adsm; perhaps things were not
as I recall, and perhaps it kept a minimum of two copies of the same data.
A redundancy setting could not hurt in any event, but the concept of using
a full tape instead of only a portion of a tape would, I would expect, be
very useful to many people.

At the very least, one could make the following requirements for this
to occur:

- the 'holding disk' must have enough free space to extract the
  current tape
- the time to extract the tape to the hard drive is acceptable

Given that, one could add more backups to existing backups until a 2nd
tape was needed, and still abide by amanda's mantra 'only write to tapes
from the beginning'.
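
A quick sketch of the kind of pre-flight check those two requirements amount
to, before pulling the current tape back to the holding disk (the drive
speed and acceptable-time numbers are made up for illustration):

    import shutil

    def can_reclaim(holding_dir, bytes_on_tape, drive_mb_per_s=5.0, max_hours=4.0):
        """Decide whether extracting the current tape to disk is reasonable.

        Requires enough free space on the holding disk for the whole tape,
        and an estimated read-back time no worse than max_hours.
        """
        free = shutil.disk_usage(holding_dir).free
        est_hours = bytes_on_tape / (drive_mb_per_s * 1024 * 1024) / 3600.0
        return free >= bytes_on_tape and est_hours <= max_hours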
-- 
Todd Fries .. [EMAIL PROTECTED]

(last updated $ToddFries: signature.p,v 1.2 2002/03/19 15:10:18 todd Exp $)
