Re: can amanda auto-size DLE's?

2014-03-12 Thread Stefan G. Weichinger
On 12.03.2014 17:52, Michael Stauffer wrote:
> Thanks Stefan! I'll take a look.
> 
> How did this work for you in terms of daily, or almost daily, creating new
> DLE's? I imagine it made for near-constant level 0 dumps? Maybe that was
> what you needed anyway with lots of new data?

I have to admit that I didn't use this for very long ... as you see from
the date the script is from ~2010 ... back then I used it for weekly
dumps of my video data, but I was far from consistent about it ;-)

Additionally, my amanda tape server is now a separate physical machine, so
the generated DLEs and include-lists would have to be transferred over
somehow ... another todo left.

I'd be happy to hear some feedback and maybe see some scripting improvements ...

Stefan



Re: can amanda auto-size DLE's?

2014-03-12 Thread Michael Stauffer
Thanks Stefan! I'll take a look.

How did this work for you in terms of daily, or almost daily, creating new
DLE's? I imagine it made for near-constant level 0 dumps? Maybe that was
what you needed anyway with lots of new data?

-M


On Wed, Mar 12, 2014 at 5:48 AM, Stefan G. Weichinger wrote:

> On 05.03.2014 14:10, Stefan G. Weichinger wrote:
>
> > Aside from this I back then had some other scripts that generated
> > include-lists resulting in chunks of <= X GB (smaller than one tape) ...
> > I wanted to dump the videos in my mythtv-config and had the problem of
> > very dynamic data in there ;-)
> >
> > So the goal was to re-create dynamic include-lists for DLEs everyday (or
> > even at the actual time of amdump). It worked mostly. I would have to
> > dig that up again.
>
> Digged that up and put it on github:
>
> https://github.com/stefangweichinger/am_dyn_dles
>
> feel free to use or improve.
>
> Stefan
>
>
>


Re: can amanda auto-size DLE's?

2014-03-12 Thread Stefan G. Weichinger
On 05.03.2014 14:10, Stefan G. Weichinger wrote:

> Aside from this I back then had some other scripts that generated
> include-lists resulting in chunks of <= X GB (smaller than one tape) ...
> I wanted to dump the videos in my mythtv-config and had the problem of
> very dynamic data in there ;-)
> 
> So the goal was to re-create dynamic include-lists for DLEs everyday (or
> even at the actual time of amdump). It worked mostly. I would have to
> dig that up again.

Dug that up and put it on github:

https://github.com/stefangweichinger/am_dyn_dles

feel free to use or improve.

Stefan




Re: can amanda auto-size DLE's?

2014-03-05 Thread Michael Stauffer
Thanks again Jon - very helpful as usual.

-M

On Mon, Mar 3, 2014 at 7:01 PM, Jon LaBadie  wrote:

> On Mon, Mar 03, 2014 at 02:47:53PM -0500, Michael Stauffer wrote:
> > >
> ...
> > > > > Any thoughts on how I can approach this? If amanda can't do it, I
> > > thought I
> > > > > might try a script to create DLE's of a desired size based on
> > > disk-usage,
> > > > > then run the script everytime I wanted to do a new level 0 dump.
> That
> > > of
> > > > > course would mean telling amanda when I wanted to do level 0's,
> rather
> > > than
> > > > > amanda controlling it.
> > >
> > > Using a scheme like that, when it comes to recovering data, which DLE
> > > was the object in last summer?  Remember that when you are asked to
> > > recover some data, you will probably be under time pressure with
> clients
> > > and bosses looking over your shoulder.  That's not the time you want
> > > to fumble around trying to determine which DLE the data is in.
> >
> >
> > Yes, I can see the complications. That makes me think of some things:
> >
> > 1) what do people do when they need to split a DLE? Just rely on
> > notes/memory of DLE for restoring from older dumps if needed? Or just
> > search using something like in question 3) below?
>
> In addition to the report, amanda can also print a TOC for the tapes.
> This is a list of what DLE's and levels are on the tape.  Its a joke
> today, but the original reason was to put the TOC in the plastic box
> with the tape.  I print them out in 3-hole format (8.5x11) and file
> them.  I also add handwritten notes for things like a DLE split.
>
> When you are splitting a DLE, in the short run you probably remember
> the differences when you need to recover.  For archive recovery the
> written notes are helpful.
>
> >
> > 2) What happens if you split or otherwise modify a DLE during a cycle
> when
> > normally the DLE would be getting an incremental dump? Will amanda do a
> new
> > level 0 dump for it?
> >
> Splitting a DLE means there is 'at least' one new DLE.  All new DLE
> must get a level 0.  If the original DLE is still active, possibly
> "excluding" some things that go into the new DLE, it will continue on
> its current dumpcycle.  I would probably use amadmin to force it to
> do a level 0 though.
>
> > 3) Is there a tool for seaching for a path or filename across all dump
> > indecies? Or do I just grep through all the index files
> > in /etc/amanda/config-name// ?
>
> No am-tool that I know of.  Just "zgrep" (the indexes are compressed).
>
> --
> Jon H. LaBadie j...@jgcomp.com
>  11226 South Shore Rd.  (703) 787-0688 (H)
>  Reston, VA  20190  (609) 477-8330 (C)
>


Re: can amanda auto-size DLE's?

2014-03-05 Thread Michael Stauffer
Thanks Debra, this is very helpful.


On Mon, Mar 3, 2014 at 3:50 PM, Debra S Baddorf  wrote:

> Comments on questions that are at the very bottom.
>
> On Mar 3, 2014, at 1:47 PM, Michael Stauffer 
>  wrote:
>
> > > > 3) I had figured that when restoring, amrestore has to read in a
> complete
> > > > dump/tar file before it can extract even a single file. So if I have
> a
> > > > single DLE that's ~2TB that fits (with multiple parts) on a single
> tape,
> > > > then to restore a single file, amrestore has to read the whole tape.
> > > > HOWEVER, I'm now testing restoring a single file from a large 2.1TB
> DLE,
> > > > and the file has been restored, but the amrecover operation is still
> > > > running, for quite some time after restoring the file. Why might
> this be
> > > > happening?
> >
> > Most (all?) current tape formats and drives can fast forward looking
> > for end of file marks.  Amanda knows the position of the file on the
> > tape and will have to drive go at high speed to that tape file.
> >
> > For formats like LTO, which have many tracks on the tape, I think it
> > is even faster.  I "think" a TOC records where (i.e. which track) each
> > file starts.  So it doesn't have to fast forward and back 50 times to
> > get to the "tenth" file which is on the 51st track.
> >
> > Jon, Olivier and Debra - thanks for reading my long post and replying.
> >
> > OK this makes sense about searching for eof marks from what I've read.
> Seems like it's a good reason to use smaller DLE's.
> >
> > > > 3a) Where is the recovered dump file written to by amrecover? I
> can't see
> > > > space being used for it on either server or client. Is it streaming
> and
> > > > untar'ing in memory, only writing the desired files to disk?
> > >
> > The tar file is not written to disk be amrecover.  The desired files are
> > extracted as the tarchive streams.
> >
> > Thanks, that makes sense too from what I've seen (or not seen, actually
> - i.e. large temporary files).
> >
> > > > So assuming all the above is true, it'd be great if amdump could
> > > > automatically break large DLE's into small DLE's to end up with
> smaller
> > > > dump files and faster restore of individual files. Maybe it would
> happen
> > > > only for level 0 dumps, so that incremental dumps would still use
> the same
> > > > sub-DLE's used by the most recent level 0 dump.
> >
> > Sure, great idea.  Then all you would need to configure is one DLE
> > starting at "/".  Amanda would break things up into sub-DLEs.
> >
> > Nope, sorry amanda asks the backup-admin to do that part of the
> > config.  That's why you get the big bucks ;)
> >
> > Good point! A bit of job security there. ;)
> >
> > > > Any thoughts on how I can approach this? If amanda can't do it, I
> thought I
> > > > might try a script to create DLE's of a desired size based on
> disk-usage,
> > > > then run the script everytime I wanted to do a new level 0 dump.
> That of
> > > > course would mean telling amanda when I wanted to do level 0's,
> rather than
> > > > amanda controlling it.
> >
> > Using a scheme like that, when it comes to recovering data, which DLE
> > was the object in last summer?  Remember that when you are asked to
> > recover some data, you will probably be under time pressure with clients
> > and bosses looking over your shoulder.  That's not the time you want
> > to fumble around trying to determine which DLE the data is in.
> >
> > Yes, I can see the complications. That makes me think of some things:
> >
> > 1) what do people do when they need to split a DLE? Just rely on
> notes/memory of DLE for restoring from older dumps if needed? Or just
> search using something like in question 3) below?
>
> I leave the old DLE  in my disk list, commented out.  Possibly with the
> date when it was removed.  This helps me to remember that
> I need to  UNcomment it before trying to restore using it.  I.E.  The DLE
> needs to be recreated  (needs to be in your disklist file)  when
> you run amrecover, in order for it to be a valid choice.  So if you are
> looking at an older tape,  you need to have those older DLEs  still in
> place.
>
> As I understand it, anyway!
>
>
> >
> > 2) What happens if you split or otherwise modify a DLE during a cycle
> when normally the DLE would be getting an incremental dump? Will amanda do
> a new level 0 dump for it?
>
> Yes.  It's now a totally new DLE as far as amanda knows, so it gets a
> level 0 dump on the first backup.
>
> I've found  "amdump  myconfig  --no-taper   node-name  [DLE-name] "
>  useful sometimes.  It will do a backup of just the requested node and DLE
> but won't waste a tape on this small bit of data.   The data stays on my
> holding disk.  The next amdump will autoflush  it to tape with everything
> else
> (assuming   "autoflush"   is set to  AUTO  or YES  -- see your amanda.conf
>  file)
>
> I use the  --no-taper   when I need to test a new DLE to make sure it
> works,  before the regular backup is due.Or perhaps,  to get that

Re: can amanda auto-size DLE's?

2014-03-05 Thread Stefan G. Weichinger
On 28.02.2014 06:33, Jon LaBadie wrote:
> Sure, great idea.  Then all you would need to configure is one DLE
> starting at "/".  Amanda would break things up into sub-DLEs.

A bit off topic maybe .. but:

What I have wished for for years now (and never taken the time to sit
down and script) is a helper script for amanda that reads in the disklist
and compares it with the actual filesystem(s).

A practical example:

When I set up a server for a customer I create initial DLEs like:

garden pictures /mnt/samba/pictures {
    root-tar
    exclude "./B"
    exclude append "./F"
    exclude append "./G"
    exclude append "./H"

    [...]

}

garden pictures_b /mnt/samba/pictures {
    root-tar
    include "./B"
}

garden pictures_f /mnt/samba/pictures {
    root-tar
    include "./F"
}

[...]

---

The main pictures folder gets caught by the DLE "pictures"; that DLE
excludes some (big) subdirs, which in turn are defined as separate DLEs.

Over time it sometimes becomes necessary to add more excludes and
separate DLEs (when things grow).

(2nd-level sidenote here: I would also love a warning mechanism for
amanda writing me mails like "DLE X has not been fully dumped for more
than Y days now. Seems it has grown too much, check your setup or
clean up a bit")

Now it would be really nice to have a check script that reads in all
this and compares it to the actual filesystem to tell me "yes, all your
subdirs are caught by at least one DLE" or "attention: subdir X gets
excluded in DLE_A but is not included anywhere".

Or a graphical tree where the dirs covered by DLEs show up green and the
ones not covered by amanda show up red ...

This would make things easier in big configs with more complex
DLE definitions and dynamically created dirs and mountpoints.

Did I explain it right?
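
A very rough sketch of what I mean, in shell (the disklist path, the
mountpoint and the one-path-per-line include/exclude layout are
assumptions, matching the example above):

#!/bin/sh
# Sketch only: report which subdirs of a mountpoint are covered by the disklist.
DISKLIST=/etc/amanda/MyConfig/disklist   # assumption
TOP=/mnt/samba/pictures                  # assumption

for dir in "$TOP"/*/; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    if grep -q "include .*\"\./$name\"" "$DISKLIST"; then
        echo "OK:      ./$name is explicitly included in a DLE"
    elif grep -q "exclude .*\"\./$name\"" "$DISKLIST"; then
        echo "WARNING: ./$name is excluded but not included anywhere"
    else
        echo "NOTE:    ./$name is not mentioned, the catch-all DLE will get it"
    fi
done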

---

Aside from this, back then I also had some scripts that generated
include-lists resulting in chunks of <= X GB (smaller than one tape) ...
I wanted to dump the videos in my mythtv-config and had the problem of
very dynamic data in there ;-)

So the goal was to re-create the dynamic include-lists for the DLEs every
day (or even at the actual time of amdump). It mostly worked. I would
have to dig that up again.
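
To give an idea of the approach (only a simplified sketch, not the actual
script -- the paths, names and chunk size are made up): sum up the
subdirectories with du and start a new include-list whenever the running
total would exceed the chunk size.

#!/bin/sh
# Sketch only: split the subdirs of $TOP into include-lists of <= $CHUNK_KB each.
TOP=/srv/mythtv/videos                 # assumption
OUTDIR=/etc/amanda/MyConfig/includes   # assumption
CHUNK_KB=$((200 * 1024 * 1024))        # ~200 GB; du -sk reports kilobytes

mkdir -p "$OUTDIR"
n=1
total=0
: > "$OUTDIR/videos_$n"

cd "$TOP" || exit 1
for dir in */; do
    [ -d "$dir" ] || continue
    dir=${dir%/}
    size=$(du -sk "$dir" | awk '{print $1}')
    if [ "$total" -gt 0 ] && [ $((total + size)) -gt "$CHUNK_KB" ]; then
        n=$((n + 1))
        total=0
        : > "$OUTDIR/videos_$n"
    fi
    echo "./$dir" >> "$OUTDIR/videos_$n"
    total=$((total + size))
done

Each DLE then references its generated list with something like
include list "/etc/amanda/MyConfig/includes/videos_1" -- and of course the
disklist has to be kept in sync when the number of chunks changes.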

Stefan



Re: can amanda auto-size DLE's?

2014-03-03 Thread Jon LaBadie
On Mon, Mar 03, 2014 at 02:47:53PM -0500, Michael Stauffer wrote:
> >
...
> > > > Any thoughts on how I can approach this? If amanda can't do it, I
> > thought I
> > > > might try a script to create DLE's of a desired size based on
> > disk-usage,
> > > > then run the script everytime I wanted to do a new level 0 dump. That
> > of
> > > > course would mean telling amanda when I wanted to do level 0's, rather
> > than
> > > > amanda controlling it.
> >
> > Using a scheme like that, when it comes to recovering data, which DLE
> > was the object in last summer?  Remember that when you are asked to
> > recover some data, you will probably be under time pressure with clients
> > and bosses looking over your shoulder.  That's not the time you want
> > to fumble around trying to determine which DLE the data is in.
> 
> 
> Yes, I can see the complications. That makes me think of some things:
> 
> 1) what do people do when they need to split a DLE? Just rely on
> notes/memory of DLE for restoring from older dumps if needed? Or just
> search using something like in question 3) below?

In addition to the report, amanda can also print a TOC for the tapes.
This is a list of what DLE's and levels are on the tape.  It's a joke
today, but the original reason was to put the TOC in the plastic box
with the tape.  I print them out in 3-hole format (8.5x11) and file
them.  I also add handwritten notes for things like a DLE split.

When you are splitting a DLE, in the short run you probably remember
the differences when you need to recover.  For archive recovery the
written notes are helpful.
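
One way to generate such a listing is the amtoc(8) helper, which builds a
TOC from an amdump logfile; for example (the log path depends on your
"logdir" setting, this one is made up):

amtoc /var/log/amanda/MyConfig/log.20140305.0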

> 
> 2) What happens if you split or otherwise modify a DLE during a cycle when
> normally the DLE would be getting an incremental dump? Will amanda do a new
> level 0 dump for it?
> 
Splitting a DLE means there is 'at least' one new DLE.  All new DLEs
must get a level 0.  If the original DLE is still active, possibly
"excluding" some things that go into the new DLE, it will continue on
its current dumpcycle.  I would probably use amadmin to force it to
do a level 0 though.
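
For example (config, host and disk names are placeholders):

amadmin MyConfig force client.example.com /mnt/samba/pictures

That flags the DLE for a full dump on the next run; "amadmin MyConfig
unforce ..." clears the flag again.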

> 3) Is there a tool for seaching for a path or filename across all dump
> indecies? Or do I just grep through all the index files
> in /etc/amanda/config-name// ?

No am-tool that I know of.  Just "zgrep" (the indexes are compressed).
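
For example, to see which host/DLE/date a file turns up in (the index tree
lives under your "indexdir" setting, so the path below is only an example):

zgrep -l 'path/to/lost/file' /var/lib/amanda/MyConfig/index/*/*/*.gz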

-- 
Jon H. LaBadie j...@jgcomp.com
 11226 South Shore Rd.  (703) 787-0688 (H)
 Reston, VA  20190  (609) 477-8330 (C)


Re: can amanda auto-size DLE's?

2014-03-03 Thread Debra S Baddorf
Comments on questions that are at the very bottom.

On Mar 3, 2014, at 1:47 PM, Michael Stauffer 
 wrote:

> > > 3) I had figured that when restoring, amrestore has to read in a complete
> > > dump/tar file before it can extract even a single file. So if I have a
> > > single DLE that's ~2TB that fits (with multiple parts) on a single tape,
> > > then to restore a single file, amrestore has to read the whole tape.
> > > HOWEVER, I'm now testing restoring a single file from a large 2.1TB DLE,
> > > and the file has been restored, but the amrecover operation is still
> > > running, for quite some time after restoring the file. Why might this be
> > > happening?
> 
> Most (all?) current tape formats and drives can fast forward looking
> for end of file marks.  Amanda knows the position of the file on the
> tape and will have to drive go at high speed to that tape file.
> 
> For formats like LTO, which have many tracks on the tape, I think it
> is even faster.  I "think" a TOC records where (i.e. which track) each
> file starts.  So it doesn't have to fast forward and back 50 times to
> get to the "tenth" file which is on the 51st track.
> 
> Jon, Olivier and Debra - thanks for reading my long post and replying.
> 
> OK this makes sense about searching for eof marks from what I've read. Seems 
> like it's a good reason to use smaller DLE's.
>  
> > > 3a) Where is the recovered dump file written to by amrecover? I can't see
> > > space being used for it on either server or client. Is it streaming and
> > > untar'ing in memory, only writing the desired files to disk?
> >
> The tar file is not written to disk be amrecover.  The desired files are
> extracted as the tarchive streams.
> 
> Thanks, that makes sense too from what I've seen (or not seen, actually - 
> i.e. large temporary files).
>  
> > > So assuming all the above is true, it'd be great if amdump could
> > > automatically break large DLE's into small DLE's to end up with smaller
> > > dump files and faster restore of individual files. Maybe it would happen
> > > only for level 0 dumps, so that incremental dumps would still use the same
> > > sub-DLE's used by the most recent level 0 dump.
> 
> Sure, great idea.  Then all you would need to configure is one DLE
> starting at "/".  Amanda would break things up into sub-DLEs.
> 
> Nope, sorry amanda asks the backup-admin to do that part of the
> config.  That's why you get the big bucks ;)
> 
> Good point! A bit of job security there. ;)
>  
> > > Any thoughts on how I can approach this? If amanda can't do it, I thought 
> > > I
> > > might try a script to create DLE's of a desired size based on disk-usage,
> > > then run the script everytime I wanted to do a new level 0 dump. That of
> > > course would mean telling amanda when I wanted to do level 0's, rather 
> > > than
> > > amanda controlling it.
> 
> Using a scheme like that, when it comes to recovering data, which DLE
> was the object in last summer?  Remember that when you are asked to
> recover some data, you will probably be under time pressure with clients
> and bosses looking over your shoulder.  That's not the time you want
> to fumble around trying to determine which DLE the data is in.
> 
> Yes, I can see the complications. That makes me think of some things:
> 
> 1) what do people do when they need to split a DLE? Just rely on notes/memory 
> of DLE for restoring from older dumps if needed? Or just search using 
> something like in question 3) below?

I leave the old DLE  in my disk list, commented out.  Possibly with the date 
when it was removed.  This helps me to remember that
I need to  UNcomment it before trying to restore using it.  I.E.  The DLE needs 
to be recreated  (needs to be in your disklist file)  when
you run amrecover, in order for it to be a valid choice.  So if you are looking 
at an older tape,  you need to have those older DLEs  still in place.

As I understand it, anyway!
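
For example, something like this sitting in the disklist (names made up):

# removed 2014-03-01 -- uncomment before recovering from older tapes
#garden pictures_old /mnt/samba/pictures-old  root-tar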


> 
> 2) What happens if you split or otherwise modify a DLE during a cycle when 
> normally the DLE would be getting an incremental dump? Will amanda do a new 
> level 0 dump for it?

Yes.  It's now a totally new DLE as far as amanda knows, so it gets a level 0 
dump on the first backup.

I've found  "amdump  myconfig  --no-taper   node-name  [DLE-name] "  useful 
sometimes.  It will do a backup of just the requested node and DLE
but won't waste a tape on this small bit of data.   The data stays on my 
holding disk.  The next amdump will autoflush  it to tape with everything else
(assuming   "autoflush"   is set to  AUTO  or YES  -- see your amanda.conf  
file)

I use the  --no-taper   when I need to test a new DLE to make sure it works,  
before the regular backup is due.  Or perhaps, to get that new level-0
out of the way now,  so it doesn't extend the runtime of the regular amdump job.
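
For example (config and host names are placeholders; depending on the
version the --no-taper option may need to go before the config name,
check "man amdump"):

# dump a single DLE to the holding disk only, without starting the taper
amdump --no-taper MyConfig client.example.com /mnt/samba/pictures

# and in amanda.conf, so that the next regular run flushes it to tape:
autoflush yes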

> 
> 3) Is there a tool for seaching for a path or filename across all dump 
> indecies? Or do I just grep through all the index files in 
> /etc/amanda/config

Re: can amanda auto-size DLE's?

2014-03-03 Thread Michael Stauffer
Yes thanks, this is what I do. I've had some complications running the
restore from the backup server rather than the client, but I'll worry about
that later.


On Fri, Feb 28, 2014 at 1:47 PM, Debra S Baddorf  wrote:

> one small comment inserted below
>
> On Feb 27, 2014, at 11:33 PM, Jon LaBadie 
>  wrote:
>
> > Oliver already provided good answers, I'll just add a bit.
> >
> > On Fri, Feb 28, 2014 at 10:35:08AM +0700, Olivier Nicole wrote:
> >> Muchael,
> >>
> > ...
> >>
> >>> 3) I had figured that when restoring, amrestore has to read in a
> complete
> >>> dump/tar file before it can extract even a single file. So if I have a
> >>> single DLE that's ~2TB that fits (with multiple parts) on a single
> tape,
> >>> then to restore a single file, amrestore has to read the whole tape.
> >>> HOWEVER, I'm now testing restoring a single file from a large 2.1TB
> DLE,
> >>> and the file has been restored, but the amrecover operation is still
> >>> running, for quite some time after restoring the file. Why might this
> be
> >>> happening?
> >>
> >> Your touching the essence or tapes here: they are sequential access.
> >>
> >> So in order to access one specifi DLE on the tape, the tape has to
> >> position at the very begining of the tape and read everything until it
> >> reaches that dle (the nth file on the tape).
> >>
> >
> > Most (all?) current tape formats and drives can fast forward looking
> > for end of file marks.  Amanda knows the position of the file on the
> > tape and will have to drive go at high speed to that tape file.
> >
> > For formats like LTO, which have many tracks on the tape, I think it
> > is even faster.  I "think" a TOC records where (i.e. which track) each
> > file starts.  So it doesn't have to fast forward and back 50 times to
> > get to the "tenth" file which is on the 51st track.
> >
> >> Then it has to read sequentially all that file containing the backup of
> >> a dle to find the file(s) you want to restore. I am not sure about dump,
> >> but I am pretty sure that if your tar backup was a file on a disk
> >> instead of a file on a tape, it would read sequentially from the
> >> begining of the tar file, in a similar way.
> >>
> >> Then it has to read until the end of the tar (not sure about dump) to
> >> make sure that there is no other file(s) satisfying your extraction
> >> criteria.
> >>
> >> So yes, if the file you want to extract is at the begining of your tar,
> >> it will continue reading for a certain amount of time after the file has
> >> been extracted.
> >
> > Another reason this happens is the "append" feature of tar.  It is
> > possible that a second, later version of the same file is in the tar
> > file.  Amanda does not use this feature but tar does not know this.
> > If you see the file you want has been recovered, you can interupt
> > amrecover.
> >
> >>> The recover log shows this on the client doing the recovery:
> >>>
> >>> [root@cfile amRecoverTest_Feb_27]# tail -f
> >>> /var/log/amanda/client/jet1/amrecover.20140227135820.debug
> >>> Thu Feb 27 17:23:12 2014: thd-0x25f1590: amrecover:
> stream_read_callback:
> >>> data is still flowing
> >>>
> >>> 3a) Where is the recovered dump file written to by amrecover? I can't
> see
> >>> space being used for it on either server or client. Is it streaming and
> >>> untar'ing in memory, only writing the desired files to disk?
> >>
> > The tar file is not written to disk be amrecover.  The desired files are
> > extracted as the tarchive streams.
> >
> >> In the directory from where you started the amrecover command. With tar,
> >> it will create the same exact hierarchy, reflecting the original DLE.
> >>
> >> try:
> >>
> >> find . -name myfilename -print
> >
> > I strongly suggest you NOT use amrecover to extract directly to the
> > filesystem.  Extract them in a temporary directory and once you are
> > sure they are what you want, copy/move them to their correct location.
>
> To make this completely clear  (i.e. "restoring guide for idiots")
> -  cd  /tmp/something
> -  amrecover  .
>
> The files will be restored into the /tmp/something  which is your current
> directory
> when you typed the amrecover command.
>
>
> >
> > ...
> >>> So assuming all the above is true, it'd be great if amdump could
> >>> automatically break large DLE's into small DLE's to end up with smaller
> >>> dump files and faster restore of individual files. Maybe it would
> happen
> >>> only for level 0 dumps, so that incremental dumps would still use the
> same
> >>> sub-DLE's used by the most recent level 0 dump.
> >
> > Sure, great idea.  Then all you would need to configure is one DLE
> > starting at "/".  Amanda would break things up into sub-DLEs.
> >
> > Nope, sorry amanda asks the backup-admin to do that part of the
> > config.  That's why you get the big bucks ;)
> >
> >>
> >>> The issue I have is that with 30TB of data, there'd be lots of manual
> >>> fragmenting of data directories to get more easily-restorable DLE's
> sizes
> >>> of sa

Re: can amanda auto-size DLE's?

2014-03-03 Thread Michael Stauffer
>
> > > 3) I had figured that when restoring, amrestore has to read in a
> complete
> > > dump/tar file before it can extract even a single file. So if I have a
> > > single DLE that's ~2TB that fits (with multiple parts) on a single
> tape,
> > > then to restore a single file, amrestore has to read the whole tape.
> > > HOWEVER, I'm now testing restoring a single file from a large 2.1TB
> DLE,
> > > and the file has been restored, but the amrecover operation is still
> > > running, for quite some time after restoring the file. Why might this
> be
> > > happening?
>
> Most (all?) current tape formats and drives can fast forward looking
> for end of file marks.  Amanda knows the position of the file on the
> tape and will have to drive go at high speed to that tape file.
>
> For formats like LTO, which have many tracks on the tape, I think it
> is even faster.  I "think" a TOC records where (i.e. which track) each
> file starts.  So it doesn't have to fast forward and back 50 times to
> get to the "tenth" file which is on the 51st track.


Jon, Olivier and Debra - thanks for reading my long post and replying.

OK, this makes sense about searching for EOF marks, from what I've read.
Seems like it's a good reason to use smaller DLE's.


> > > 3a) Where is the recovered dump file written to by amrecover? I can't
> see
> > > space being used for it on either server or client. Is it streaming and
> > > untar'ing in memory, only writing the desired files to disk?
> >
> The tar file is not written to disk be amrecover.  The desired files are
> extracted as the tarchive streams.


Thanks, that makes sense too from what I've seen (or not seen, actually -
i.e. large temporary files).


> > > So assuming all the above is true, it'd be great if amdump could
> > > automatically break large DLE's into small DLE's to end up with smaller
> > > dump files and faster restore of individual files. Maybe it would
> happen
> > > only for level 0 dumps, so that incremental dumps would still use the
> same
> > > sub-DLE's used by the most recent level 0 dump.
>
> Sure, great idea.  Then all you would need to configure is one DLE
> starting at "/".  Amanda would break things up into sub-DLEs.
>
> Nope, sorry amanda asks the backup-admin to do that part of the
> config.  That's why you get the big bucks ;)


Good point! A bit of job security there. ;)


> > > Any thoughts on how I can approach this? If amanda can't do it, I
> thought I
> > > might try a script to create DLE's of a desired size based on
> disk-usage,
> > > then run the script everytime I wanted to do a new level 0 dump. That
> of
> > > course would mean telling amanda when I wanted to do level 0's, rather
> than
> > > amanda controlling it.
>
> Using a scheme like that, when it comes to recovering data, which DLE
> was the object in last summer?  Remember that when you are asked to
> recover some data, you will probably be under time pressure with clients
> and bosses looking over your shoulder.  That's not the time you want
> to fumble around trying to determine which DLE the data is in.


Yes, I can see the complications. That makes me think of some things:

1) what do people do when they need to split a DLE? Just rely on
notes/memory of DLE for restoring from older dumps if needed? Or just
search using something like in question 3) below?

2) What happens if you split or otherwise modify a DLE during a cycle when
normally the DLE would be getting an incremental dump? Will amanda do a new
level 0 dump for it?

3) Is there a tool for searching for a path or filename across all dump
indices? Or do I just grep through all the index files
in /etc/amanda/config-name// ?

Thanks

-M


Re: can amanda auto-size DLE's?

2014-02-28 Thread Debra S Baddorf
one small comment inserted below

On Feb 27, 2014, at 11:33 PM, Jon LaBadie 
 wrote:

> Oliver already provided good answers, I'll just add a bit.
> 
> On Fri, Feb 28, 2014 at 10:35:08AM +0700, Olivier Nicole wrote:
>> Muchael,
>> 
> ...
>> 
>>> 3) I had figured that when restoring, amrestore has to read in a complete
>>> dump/tar file before it can extract even a single file. So if I have a
>>> single DLE that's ~2TB that fits (with multiple parts) on a single tape,
>>> then to restore a single file, amrestore has to read the whole tape.
>>> HOWEVER, I'm now testing restoring a single file from a large 2.1TB DLE,
>>> and the file has been restored, but the amrecover operation is still
>>> running, for quite some time after restoring the file. Why might this be
>>> happening?
>> 
>> Your touching the essence or tapes here: they are sequential access.
>> 
>> So in order to access one specifi DLE on the tape, the tape has to
>> position at the very begining of the tape and read everything until it
>> reaches that dle (the nth file on the tape).
>> 
> 
> Most (all?) current tape formats and drives can fast forward looking
> for end of file marks.  Amanda knows the position of the file on the
> tape and will have to drive go at high speed to that tape file.
> 
> For formats like LTO, which have many tracks on the tape, I think it
> is even faster.  I "think" a TOC records where (i.e. which track) each
> file starts.  So it doesn't have to fast forward and back 50 times to
> get to the "tenth" file which is on the 51st track.
> 
>> Then it has to read sequentially all that file containing the backup of
>> a dle to find the file(s) you want to restore. I am not sure about dump,
>> but I am pretty sure that if your tar backup was a file on a disk
>> instead of a file on a tape, it would read sequentially from the
>> begining of the tar file, in a similar way.
>> 
>> Then it has to read until the end of the tar (not sure about dump) to
>> make sure that there is no other file(s) satisfying your extraction
>> criteria.
>> 
>> So yes, if the file you want to extract is at the begining of your tar,
>> it will continue reading for a certain amount of time after the file has
>> been extracted.
> 
> Another reason this happens is the "append" feature of tar.  It is
> possible that a second, later version of the same file is in the tar
> file.  Amanda does not use this feature but tar does not know this.
> If you see the file you want has been recovered, you can interupt
> amrecover.
> 
>>> The recover log shows this on the client doing the recovery:
>>> 
>>> [root@cfile amRecoverTest_Feb_27]# tail -f
>>> /var/log/amanda/client/jet1/amrecover.20140227135820.debug
>>> Thu Feb 27 17:23:12 2014: thd-0x25f1590: amrecover: stream_read_callback:
>>> data is still flowing
>>> 
>>> 3a) Where is the recovered dump file written to by amrecover? I can't see
>>> space being used for it on either server or client. Is it streaming and
>>> untar'ing in memory, only writing the desired files to disk?
>> 
> The tar file is not written to disk be amrecover.  The desired files are
> extracted as the tarchive streams.
> 
>> In the directory from where you started the amrecover command. With tar,
>> it will create the same exact hierarchy, reflecting the original DLE.
>> 
>> try:
>> 
>> find . -name myfilename -print
> 
> I strongly suggest you NOT use amrecover to extract directly to the
> filesystem.  Extract them in a temporary directory and once you are
> sure they are what you want, copy/move them to their correct location.

To make this completely clear  (i.e. "restoring guide for idiots")
-  cd  /tmp/something
-  amrecover  …..

The files will be restored into /tmp/something, which was your current
directory when you typed the amrecover command.


> 
> ...
>>> So assuming all the above is true, it'd be great if amdump could
>>> automatically break large DLE's into small DLE's to end up with smaller
>>> dump files and faster restore of individual files. Maybe it would happen
>>> only for level 0 dumps, so that incremental dumps would still use the same
>>> sub-DLE's used by the most recent level 0 dump.
> 
> Sure, great idea.  Then all you would need to configure is one DLE
> starting at "/".  Amanda would break things up into sub-DLEs.
> 
> Nope, sorry amanda asks the backup-admin to do that part of the
> config.  That's why you get the big bucks ;)
> 
>> 
>>> The issue I have is that with 30TB of data, there'd be lots of manual
>>> fragmenting of data directories to get more easily-restorable DLE's sizes
>>> of say, 500GB each. Some top-level dirs in my main data drive have 3-6TB
>>> each, while many others have only 100GB or so. Manually breaking these into
>>> smaller DLE's once is fine, but since data gets regularly moved, added and
>>> deleted, things would quickly change and upset my smaller DLE's.
> 
> I'll bet if you try you will be able to make some logical splits.
>>> 
>>> Any thoughts on how I can approach this? If aman

Re: can amanda auto-size DLE's?

2014-02-27 Thread Jon LaBadie
Olivier already provided good answers; I'll just add a bit.

On Fri, Feb 28, 2014 at 10:35:08AM +0700, Olivier Nicole wrote:
> Muchael,
> 
...
> 
> > 3) I had figured that when restoring, amrestore has to read in a complete
> > dump/tar file before it can extract even a single file. So if I have a
> > single DLE that's ~2TB that fits (with multiple parts) on a single tape,
> > then to restore a single file, amrestore has to read the whole tape.
> > HOWEVER, I'm now testing restoring a single file from a large 2.1TB DLE,
> > and the file has been restored, but the amrecover operation is still
> > running, for quite some time after restoring the file. Why might this be
> > happening?
> 
> Your touching the essence or tapes here: they are sequential access.
> 
> So in order to access one specifi DLE on the tape, the tape has to
> position at the very begining of the tape and read everything until it
> reaches that dle (the nth file on the tape).
> 

Most (all?) current tape formats and drives can fast forward looking
for end of file marks.  Amanda knows the position of the file on the
tape and will have the drive go at high speed to that tape file.

For formats like LTO, which have many tracks on the tape, I think it
is even faster.  I "think" a TOC records where (i.e. which track) each
file starts.  So it doesn't have to fast forward and back 50 times to
get to the "tenth" file which is on the 51st track.

> Then it has to read sequentially all that file containing the backup of
> a dle to find the file(s) you want to restore. I am not sure about dump,
> but I am pretty sure that if your tar backup was a file on a disk
> instead of a file on a tape, it would read sequentially from the
> begining of the tar file, in a similar way.
> 
> Then it has to read until the end of the tar (not sure about dump) to
> make sure that there is no other file(s) satisfying your extraction
> criteria.
> 
> So yes, if the file you want to extract is at the begining of your tar,
> it will continue reading for a certain amount of time after the file has
> been extracted.

Another reason this happens is the "append" feature of tar.  It is
possible that a second, later version of the same file is in the tar
file.  Amanda does not use this feature but tar does not know this.
If you see the file you want has been recovered, you can interrupt
amrecover.

> > The recover log shows this on the client doing the recovery:
> > 
> > [root@cfile amRecoverTest_Feb_27]# tail -f
> > /var/log/amanda/client/jet1/amrecover.20140227135820.debug
> > Thu Feb 27 17:23:12 2014: thd-0x25f1590: amrecover: stream_read_callback:
> > data is still flowing
> > 
> > 3a) Where is the recovered dump file written to by amrecover? I can't see
> > space being used for it on either server or client. Is it streaming and
> > untar'ing in memory, only writing the desired files to disk?
> 
The tar file is not written to disk by amrecover.  The desired files are
extracted as the tar archive streams.

> In the directory from where you started the amrecover command. With tar,
> it will create the same exact hierarchy, reflecting the original DLE.
> 
> try:
> 
> find . -name myfilename -print

I strongly suggest you NOT use amrecover to extract directly to the
filesystem.  Extract them in a temporary directory and once you are
sure they are what you want, copy/move them to their correct location.

...
> > So assuming all the above is true, it'd be great if amdump could
> > automatically break large DLE's into small DLE's to end up with smaller
> > dump files and faster restore of individual files. Maybe it would happen
> > only for level 0 dumps, so that incremental dumps would still use the same
> > sub-DLE's used by the most recent level 0 dump.

Sure, great idea.  Then all you would need to configure is one DLE
starting at "/".  Amanda would break things up into sub-DLEs.

Nope, sorry amanda asks the backup-admin to do that part of the
config.  That's why you get the big bucks ;)

> 
> > The issue I have is that with 30TB of data, there'd be lots of manual
> > fragmenting of data directories to get more easily-restorable DLE's sizes
> > of say, 500GB each. Some top-level dirs in my main data drive have 3-6TB
> > each, while many others have only 100GB or so. Manually breaking these into
> > smaller DLE's once is fine, but since data gets regularly moved, added and
> > deleted, things would quickly change and upset my smaller DLE's.

I'll bet if you try you will be able to make some logical splits.
> > 
> > Any thoughts on how I can approach this? If amanda can't do it, I thought I
> > might try a script to create DLE's of a desired size based on disk-usage,
> > then run the script everytime I wanted to do a new level 0 dump. That of
> > course would mean telling amanda when I wanted to do level 0's, rather than
> > amanda controlling it.

Using a scheme like that, when it comes to recovering data, which DLE
was the object in last summer?  Remember that when you are asked to

Re: can amanda auto-size DLE's?

2014-02-27 Thread Olivier Nicole
Michael,

> 1) if I have multiple DLE's in my disklist, then tell amdump to perform a
> level 0 dump of the complete config, each DLE gets written to tape as a
> separate dump/tar file (possibly in parts if the tar is > part-size). Is
> that right?

Yes

> 2) If multiple DLE's are processed in a single level 0 amdump run, with
> each DLE << tape-size, then as many as can fit will be written to a single
> tape, or possibly spanning tapes. But in any case it won't be a single DLE
> per tape. Is that right? That looks like what I've observed so far.

Yes, Amanda tries to fit as many DLEs per tape as it can, in order to fill up the tape.

> 3) I had figured that when restoring, amrestore has to read in a complete
> dump/tar file before it can extract even a single file. So if I have a
> single DLE that's ~2TB that fits (with multiple parts) on a single tape,
> then to restore a single file, amrestore has to read the whole tape.
> HOWEVER, I'm now testing restoring a single file from a large 2.1TB DLE,
> and the file has been restored, but the amrecover operation is still
> running, for quite some time after restoring the file. Why might this be
> happening?

You're touching on the essence of tapes here: they are sequential access.

So in order to access one specific DLE on the tape, the drive has to
position at the very beginning of the tape and read everything until it
reaches that DLE (the nth file on the tape).

Then it has to read sequentially through all of that file containing the
backup of a DLE to find the file(s) you want to restore. I am not sure
about dump, but I am pretty sure that if your tar backup were a file on a
disk instead of a file on a tape, it would read sequentially from the
beginning of the tar file in a similar way.

Then it has to read until the end of the tar (not sure about dump) to
make sure that there are no other files satisfying your extraction
criteria.

So yes, if the file you want to extract is at the beginning of your tar,
it will continue reading for a certain amount of time after the file has
been extracted.

> The recover log shows this on the client doing the recovery:
> 
> [root@cfile amRecoverTest_Feb_27]# tail -f
> /var/log/amanda/client/jet1/amrecover.20140227135820.debug
> Thu Feb 27 17:23:12 2014: thd-0x25f1590: amrecover: stream_read_callback:
> data is still flowing
> 
> 3a) Where is the recovered dump file written to by amrecover? I can't see
> space being used for it on either server or client. Is it streaming and
> untar'ing in memory, only writing the desired files to disk?

In the directory from which you started the amrecover command. With tar,
it will recreate the exact same hierarchy, reflecting the original DLE.

try:

find . -name myfilename -print

> 4) To restore from a single DLE's dump/tar file that's smaller than tape
> size, and exists on a tape with multiple other smaller DLE dump/tar files,
> amrestore can seek to the particular DLE's dump/tar file and only has to
> read that one file. Is that right?

As mentioned above, a seek on a tape is a sequential read of the tape
(unless the tape is already positioned at file x (known) and you want to
read file y, in which case it only needs to read through y-x files).

> So assuming all the above is true, it'd be great if amdump could
> automatically break large DLE's into small DLE's to end up with smaller
> dump files and faster restore of individual files. Maybe it would happen
> only for level 0 dumps, so that incremental dumps would still use the same
> sub-DLE's used by the most recent level 0 dump.

Yes, but then what happens for levels above 0?

You have to do the planning by hand and break up your DLEs yourself.

> The issue I have is that with 30TB of data, there'd be lots of manual

Depending on the size of your tapes, even with many small DLEs, you will
most probably end up reading the tape from the beginning for every
restore.

If your DLE is split across many tapes, you will have to read every tape
even if the file you want was found on the first tape (I am not 100%
sure about that, though).

> fragmenting of data directories to get more easily-restorable DLE's sizes
> of say, 500GB each. Some top-level dirs in my main data drive have 3-6TB
> each, while many others have only 100GB or so. Manually breaking these into
> smaller DLE's once is fine, but since data gets regularly moved, added and
> deleted, things would quickly change and upset my smaller DLE's.
> 
> Any thoughts on how I can approach this? If amanda can't do it, I thought I
> might try a script to create DLE's of a desired size based on disk-usage,
> then run the script everytime I wanted to do a new level 0 dump. That of
> course would mean telling amanda when I wanted to do level 0's, rather than
> amanda controlling it.

While the script may be a good idea, running it before each level 0 will
completely confuse Amanda: remember, you don't control/know when
Amanda will do a level 0, so you don't know when to run your script.

And if you remove a dle from your disklist (because you h

can amanda auto-size DLE's?

2014-02-27 Thread Michael Stauffer
Amanda 3.3.4

Hi,

I'm guessing the answer is no since I haven't read about this, but maybe...

I'm hoping amanda might be able to auto-size DLE's into sub-DLE's of an
approximate size, say 500GB.

My understanding is this:

1) if I have multiple DLE's in my disklist, then tell amdump to perform a
level 0 dump of the complete config, each DLE gets written to tape as a
separate dump/tar file (possibly in parts if the tar is > part-size). Is
that right?

2) If multiple DLE's are processed in a single level 0 amdump run, with
each DLE << tape-size, then as many as can fit will be written to a single
tape, or possibly spanning tapes. But in any case it won't be a single DLE
per tape. Is that right? That looks like what I've observed so far.

3) I had figured that when restoring, amrestore has to read in a complete
dump/tar file before it can extract even a single file. So if I have a
single DLE that's ~2TB that fits (with multiple parts) on a single tape,
then to restore a single file, amrestore has to read the whole tape.
HOWEVER, I'm now testing restoring a single file from a large 2.1TB DLE,
and the file has been restored, but the amrecover operation is still
running, for quite some time after restoring the file. Why might this be
happening?

The recover log shows this on the client doing the recovery:

[root@cfile amRecoverTest_Feb_27]# tail -f
/var/log/amanda/client/jet1/amrecover.20140227135820.debug
Thu Feb 27 17:23:12 2014: thd-0x25f1590: amrecover: stream_read_callback:
data is still flowing

3a) Where is the recovered dump file written to by amrecover? I can't see
space being used for it on either server or client. Is it streaming and
untar'ing in memory, only writing the desired files to disk?

4) To restore from a single DLE's dump/tar file that's smaller than tape
size, and exists on a tape with multiple other smaller DLE dump/tar files,
amrestore can seek to the particular DLE's dump/tar file and only has to
read that one file. Is that right?

So assuming all the above is true, it'd be great if amdump could
automatically break large DLE's into small DLE's to end up with smaller
dump files and faster restore of individual files. Maybe it would happen
only for level 0 dumps, so that incremental dumps would still use the same
sub-DLE's used by the most recent level 0 dump.

The issue I have is that with 30TB of data, there'd be lots of manual
fragmenting of data directories to get more easily-restorable DLE's sizes
of say, 500GB each. Some top-level dirs in my main data drive have 3-6TB
each, while many others have only 100GB or so. Manually breaking these into
smaller DLE's once is fine, but since data gets regularly moved, added and
deleted, things would quickly change and upset my smaller DLE's.
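
For scoping, something like this would at least show which top-level dirs
are over the target size (the path is just an example):

du -s --block-size=1G /data/* | sort -rn | awk '$1 > 500'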

Any thoughts on how I can approach this? If amanda can't do it, I thought I
might try a script to create DLE's of a desired size based on disk-usage,
then run the script every time I wanted to do a new level 0 dump. That of
course would mean telling amanda when I wanted to do level 0's, rather than
amanda controlling it.

Thanks for reading this long post!

-M