Re: All level 0 on the same run?

2018-11-12 Thread Nathan Stratton Treadway
On Mon, Nov 12, 2018 at 13:56:33 -0500, Chris Nighswonger wrote:
> backup@scriptor:~ amadmin campus info host dev
> 
> Current info for host dev:
>   Stats: dump rates (kps), Full:  6015.9, 5898.2, 5771.4
> Incremental:  1658.2, 1456.6, 343.4
>   compressed size, Full:  66.9%, 67.4%, 67.8%
> Incremental:  71.3%, 68.0%, 67.4%
>   Dumps: lev datestmp  tape file   origK   compK secs
>   0  19691231   0 -1 -1 -1
> 
>  That timestamp is a bit odd, but...
> 
(I don't know offhand why Amanda would have saved a "no data recorded"
> > status rather than still having a record of the last time ZWC ran
> > successfully -- perhaps the last successful dump went to a volume that
> > has since been overwritten, and so the entry had to be deleted from the
> > info database without anything new replacing it?)
> >
> >
> This makes the most sense. This client has been offline for two cycles.

Yeah, notice the "tape" column is empty, and the rest of the columns are
0 or -1.  It definitely removed all traces of whatever was there before
(and presumably there was something there at some point, since the
Stats: section has data), so the "last successful dump's volume has
since been overwritten" explanation seems to fit.

Nathan



Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-11-12 Thread Chris Nighswonger
On Mon, Nov 12, 2018 at 1:50 PM Nathan Stratton Treadway 
wrote:

> On Mon, Nov 12, 2018 at 13:28:15 -0500, Chris Nighswonger wrote:
> > I found the long overdue culprit... It was a windows client using the ZWC
> > community client. The ZWC service had hung (not an uncommon problem) and
> > Amanda was missing backups from it. If I had been paying attention to the
> > nightly reports, I would have investigated it sooner. However, it does
> beg
> > the question: That DLE is NOT 49 years overdue... So where in the world
> did
> > the planner get that idea from?
>
> Again, that 1969 date is the "no data saved" timestamp.  You can see
> this mostly-directly by doing "amadmin campus info [HOST] [DEV]" on that
> DLE... or completely-directly by looking at the
>  .../campus/curinfo/[HOST]/[DEV]/info
> file (where the "seconds since the epoch" timestamp will show up as "-1"
> instead of a number in the range around 1542048289).
>
>
backup@scriptor:~ amadmin campus info host dev

Current info for host dev:
  Stats: dump rates (kps), Full:  6015.9, 5898.2, 5771.4
Incremental:  1658.2, 1456.6, 343.4
  compressed size, Full:  66.9%, 67.4%, 67.8%
Incremental:  71.3%, 68.0%, 67.4%
  Dumps: lev datestmp  tape file   origK   compK secs
  0  19691231   0 -1 -1 -1

 That timestamp is a bit odd, but...

(I don't know offhand why Amanda would have saved a "no data recorded"
> status rather than still having a record of the last time ZWC ran
> successfully -- perhaps the last successful dump went to a volume that
> has since been overwritten, and so the entry had to be deleted from the
> info database without anything new replacing it?)
>
>
This makes the most sense. This client has been offline for two cycles.

Chris


Re: All level 0 on the same run?

2018-11-12 Thread Nathan Stratton Treadway
On Mon, Nov 12, 2018 at 13:28:15 -0500, Chris Nighswonger wrote:
> On Sat, Nov 10, 2018 at 5:03 PM Nathan Stratton Treadway 
> wrote:
> 
> > On Sat, Nov 10, 2018 at 11:39:58 -0500, Chris Nighswonger wrote:
> > >  (3 filesystems overdue. The most being overdue 17841 days.)
> > >
> > > Not sure what's up with the overdues. There were none prior to breaking
> > up
> > > the DLEs. It may just be an artifact.
> >
> > With a 5-day dumpcycle that would mean Amanda thinks the last dump took
> > place 17846-ish days ago:
> >   $ date --date="17846 days ago"
> >   Wed Dec 31 14:24:39 EST 1969
> > ... and the date of 1969/12/31 is the "no data saved" placeholder date
> > within Amanda's info database.
> >
> > Anyway, you should be able to identify the three DLEs in question with
> >   amadmin campus due | grep "Overdue"
> > and then use "amadmin campus info [...]" to see what amanda has recorded
> > about them.
> >
> >
> I found the long overdue culprit... It was a windows client using the ZWC
> community client. The ZWC service had hung (not an uncommon problem) and
> Amanda was missing backups from it. If I had been paying attention to the
> nightly reports, I would have investigated it sooner. However, it does beg
> the question: That DLE is NOT 49 years overdue... So where in the world did
> the planner get that idea from?

Again, that 1969 date is the "no data saved" timestamp.  You can see
this mostly-directly by doing "amadmin campus info [HOST] [DEV]" on that
DLE... or completely-directly by looking at the 
 .../campus/curinfo/[HOST]/[DEV]/info 
file (where the "seconds since the epoch" timestamp will show up as "-1"
instead of a number in the range around 1542048289).
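
(For anyone converting those values by hand: assuming GNU date and a
US-Eastern timezone like the listings elsewhere in this thread,
  $ date --date="@1542048289"
  Mon Nov 12 13:44:49 EST 2018
  $ date --date="@-1"
  Wed Dec 31 18:59:59 EST 1969
so a "-1" stored in the info file is what ultimately surfaces as that
19691231 datestamp.)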

(I don't know offhand why Amanda would have saved a "no data recorded"
status rather than still having a record of the last time ZWC ran
successfully -- perhaps the last successful dump went to a volume that
has since been overwritten, and so the entry had to be deleted from the
info database without anything new replacing it?)

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-11-12 Thread Chris Nighswonger
On Sat, Nov 10, 2018 at 5:03 PM Nathan Stratton Treadway 
wrote:

> On Sat, Nov 10, 2018 at 11:39:58 -0500, Chris Nighswonger wrote:
> >  (3 filesystems overdue. The most being overdue 17841 days.)
> >
> > Not sure what's up with the overdues. There were none prior to breaking
> up
> > the DLEs. It may just be an artifact.
>
> With a 5-day dumpcycle that would mean Amanda thinks the last dump took
> place 17846-ish days ago:
>   $ date --date="17846 days ago"
>   Wed Dec 31 14:24:39 EST 1969
> ... and the date of 1969/12/31 is the "no data saved" placeholder date
> within Amanda's info database.
>
> Anyway, you should be able to identify the three DLEs in question with
>   amadmin campus due | grep "Overdue"
> and then use "amadmin campus info [...]" to see what amanda has recorded
> about them.
>
>
I found the long overdue culprit... It was a windows client using the ZWC
community client. The ZWC service had hung (not an uncommon problem) and
Amanda was missing backups from it. If I had been paying attention to the
nightly reports, I would have investigated it sooner. However, it does beg
the question: That DLE is NOT 49 years overdue... So where in the world did
the planner get that idea from?

I'll check the balance again after tonight's run.

Kind regards,
Chris


Re: All level 0 on the same run?

2018-11-11 Thread Gene Heskett
On Sunday 11 November 2018 15:36:52 Nathan Stratton Treadway wrote:

> On Sun, Nov 11, 2018 at 05:25:29 -0500, Gene Heskett wrote:
> > > > amanda@coyote:/amandatapes/Dailys/data$ /usr/local/sbin/amadmin
> > > > Daily balance
> > > >
> > > >  due-date  #fsorig MB out MB   balance
> > > > --
> > > > 11/10 Sat1   7912   3145-78.7%
> > > > 11/11 Sun1  10886  10886-26.1%
> > > > 11/12 Mon1  32963   7875-46.6%
> > > > 11/13 Tue1   7688   7688-47.8%
> > > > 11/14 Wed2  22109  22109+50.0%
> > > > 11/15 Thu4  75027  46623   +216.3%
> > > > 11/16 Fri6   8257   6109-58.6%
> > > > 11/17 Sat   29  14034   8932-39.4%
> > > > 11/18 Sun4  21281  16842+14.3%
> > > > 11/19 Mon   18  34599  17188+16.6%
> > > > --
> > > > TOTAL   67 234756 147397 14739
>
> [...]
>
> > After this mornings run, its better again:
> > amanda@coyote:/root$ /usr/local/sbin/amadmin Daily balance
> >
> >  due-date  #fsorig MB out MB   balance
> > --
> > 11/11 Sun1  10886  10886-35.8%
> > 11/12 Mon1  32963   7875-53.6%
> > 11/13 Tue1   7688   7688-54.7%
> > 11/14 Wed2  22109  22109+30.4%
> > 11/15 Thu4  75027  46623   +175.0%
> > 11/16 Fri6   8257   6109-64.0%
> > 11/17 Sat   29  14034   8932-47.3%
> > 11/18 Sun4  21281  16842 -0.7%
> > 11/19 Mon   18  34599  17188 +1.4%
> > 11/20 Tue1  49240  25295+49.2%
> > --
> > TOTAL   67 276084 169547 16954
> >   (estimated 10 runs per dumpcycle)
> >
> > And it must be calculating the balance based on the original size,
> > not the compressed size (out MB), that growing Tuesday the 20th is
> > still on one vtape, but if unchanged, Thu 15nth will still use a 2nd
> > vtape.
>
> Comparing these two runs, I see that all the rows (the first four
> columns on each line) are the same -- except that the single DLE
> listed on 11/10 of the first run came out much larger when it was
> actually dumped last night.  (7912/3145 v.s. 49240/25295).  When you
> said "that growing Tuesday the 20th", was that an indication that you
> already expected that particular DLE (which should be easy to identify
> as the only full dump listed in last night's report) to be growing
> rapidly?
>
> (Note that the "balance" column is in fact calculated based on the
> compressed/"out MB": the "average size" of 16954 listed at the bottom
> of the "balance" column is sum-of-all-"out"-sizes/runs-per-dumpcycle
> (i.e. the average of the out-sizes), and then the balance percentile
> in each row is (today's-out-MB less average-size)/average-size.  For
> example, for 11/20, you have
>   (25295-16954) / 16954 = 0.492
> ,and for 11/16 you have
>   ( 6109-16954) / 16954 = -0.640
> , etc.)
>
> So, from these two days of "balance" reports, the takeaway is that one
> particular DLE seems to have become much bigger (3145 -> 25295 MB
> compressed size), and as a result the average dump size went up by
> 2200MB, and thus all the positive-balance days had their balance
> percentages go down a bit.  So the balance did improve, but because of
> the growth of the total backup volume rather than because of
> rescheduling any particular dump(s).
>
> Also, the fact that this one DLE grew to be larger than the average
> dump size explains why no other DLEs were promoted to last night's run
> (and thus none of the other date's rows changed at all).
>
> However, the next three days all show negative balance figures, so it
> will be interesting to see if Amanda promotes any DLEs from the 11/14,
> 11/15, or 11/19's groups to try to even the batches out a bit.
> However, the first two of those dates currently have small DLE counts,
> and 11/19 includes many DLEs but totals just slightly over the
> average, so the opportunities for rebalancing may be pretty
> limited
>

It is an improvement, so I'm not going to try to move anything else for
about a week, just to see if it levels out better. PublicA is currently
the 800 lb gorilla, so I may next attempt to adapt one of the recipes
for a-f etc in the top of the disklist. Three, maybe even four, of those
might get it under control.  And it's a technique I've not yet explored.
>   Nathan
>

Re: All level 0 on the same run?

2018-11-11 Thread Nathan Stratton Treadway
On Sun, Nov 11, 2018 at 05:25:29 -0500, Gene Heskett wrote:
> > > amanda@coyote:/amandatapes/Dailys/data$ /usr/local/sbin/amadmin
> > > Daily balance
> > >
> > >  due-date  #fsorig MB out MB   balance
> > > --
> > > 11/10 Sat1   7912   3145-78.7%
> > > 11/11 Sun1  10886  10886-26.1%
> > > 11/12 Mon1  32963   7875-46.6%
> > > 11/13 Tue1   7688   7688-47.8%
> > > 11/14 Wed2  22109  22109+50.0%
> > > 11/15 Thu4  75027  46623   +216.3%
> > > 11/16 Fri6   8257   6109-58.6%
> > > 11/17 Sat   29  14034   8932-39.4%
> > > 11/18 Sun4  21281  16842+14.3%
> > > 11/19 Mon   18  34599  17188+16.6%
> > > --
> > > TOTAL   67 234756 147397 14739
> > >
[...] 
> After this mornings run, its better again:
> amanda@coyote:/root$ /usr/local/sbin/amadmin Daily balance
> 
>  due-date  #fsorig MB out MB   balance
> --
> 11/11 Sun1  10886  10886-35.8%
> 11/12 Mon1  32963   7875-53.6%
> 11/13 Tue1   7688   7688-54.7%
> 11/14 Wed2  22109  22109+30.4%
> 11/15 Thu4  75027  46623   +175.0%
> 11/16 Fri6   8257   6109-64.0%
> 11/17 Sat   29  14034   8932-47.3%
> 11/18 Sun4  21281  16842 -0.7%
> 11/19 Mon   18  34599  17188 +1.4%
> 11/20 Tue1  49240  25295+49.2%
> --
> TOTAL   67 276084 169547 16954
>   (estimated 10 runs per dumpcycle)
> 
> And it must be calculating the balance based on the original size, not 
> the compressed size (out MB), that growing Tuesday the 20th is still on 
> one vtape, but if unchanged, Thu the 15th will still use a 2nd vtape.

Comparing these two runs, I see that all the rows (the first four
columns on each line) are the same -- except that the single DLE listed
on 11/10 of the first run came out much larger when it was actually
dumped last night.  (7912/3145 vs. 49240/25295).  When you said "that
growing Tuesday the 20th", was that an indication that you already
expected that particular DLE (which should be easy to identify as the
only full dump listed in last night's report) to be growing rapidly?

(Note that the "balance" column is in fact calculated based on the
compressed/"out MB": the "average size" of 16954 listed at the bottom of
the "balance" column is sum-of-all-"out"-sizes/runs-per-dumpcycle (i.e.
the average of the out-sizes), and then the balance percentile in each
row is (today's-out-MB less average-size)/average-size.  For example,
for 11/20, you have
  (25295-16954) / 16954 = 0.492
,and for 11/16 you have 
  ( 6109-16954) / 16954 = -0.640
, etc.)
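
(A quick way to sanity-check that arithmetic -- just a sketch using awk,
not anything Amanda itself runs:
  $ awk 'BEGIN { avg = 169547/10; printf "%+.1f%% %+.1f%%\n", (25295-avg)/avg*100, (6109-avg)/avg*100 }'
  +49.2% -64.0%
which reproduces the percentages shown for 11/20 and 11/16 above.)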

So, from these two days of "balance" reports, the takeaway is that one
particular DLE seems to have become much bigger (3145 -> 25295 MB
compressed size), and as a result the average dump size went up by
2200MB, and thus all the positive-balance days had their balance
percentages go down a bit.  So the balance did improve, but because of
the growth of the total backup volume rather than because of
rescheduling any particular dump(s).  

Also, the fact that this one DLE grew to be larger than the average dump
size explains why no other DLEs were promoted to last night's run (and
thus none of the other dates' rows changed at all).

However, the next three days all show negative balance figures, so it
will be interesting to see if Amanda promotes any DLEs from the 11/14,
11/15, or 11/19's groups to try to even the batches out a bit. 
However, the first two of those dates currently have small DLE counts,
and 11/19 includes many DLEs but totals just slightly over the average,
so the opportunities for rebalancing may be pretty limited

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-11-11 Thread Gene Heskett
On Saturday 10 November 2018 13:55:30 Nathan Stratton Treadway wrote:

[...]
> > > >  due-date  #fsorig MB out MB   balance
> > > > --
> > > > 10/30 Tue5  0  0  ---
> > > > 10/31 Wed1  17355   8958-45.3%
> > > > 11/01 Thu2  10896  10887-33.5%
> > > > 11/02 Fri4  35944   9298-43.2%
> > > > 11/03 Sat4  14122  10835-33.8%
> > > > 11/04 Sun3  57736  57736   +252.7%
> > > > 11/05 Mon2  39947  30635+87.1%
> > > > 11/06 Tue8   4235   4215-74.3%
> > > > 11/07 Wed4  19503  14732-10.0%
> > > > 11/08 Thu   32  31783  16408 +0.2%
> > > > --
> > > > TOTAL   65 231521 163704 16370
[...]
> > What does your "balance" output show now?
[...]
> > Its some better:
> > amanda@coyote:/amandatapes/Dailys/data$ /usr/local/sbin/amadmin
> > Daily balance
> >
> >  due-date  #fsorig MB out MB   balance
> > --
> > 11/10 Sat1   7912   3145-78.7%
> > 11/11 Sun1  10886  10886-26.1%
> > 11/12 Mon1  32963   7875-46.6%
> > 11/13 Tue1   7688   7688-47.8%
> > 11/14 Wed2  22109  22109+50.0%
> > 11/15 Thu4  75027  46623   +216.3%
> > 11/16 Fri6   8257   6109-58.6%
> > 11/17 Sat   29  14034   8932-39.4%
> > 11/18 Sun4  21281  16842+14.3%
> > 11/19 Mon   18  34599  17188+16.6%
> > --
> > TOTAL   67 234756 147397 14739
> >
> > It will be interesting to see if it continues to get "better".
> > I should think it will be under 150% by the 15th if so. If the
> > planner behaves itself.
>
[...]
> (I am pretty sure that the small-DLE bug didn't affect the overall
> balance.  You can see from the "balance" output on 10/30 that the 5
> DLEs in question all show up as needing to be full-dumped that day --
> but the total size for that group is still zero.  So I suspect that
> there is/are some other factor(s) behind the single-day surge and
> whatever "shuffling" has been going on ...)
>
> > We shall see. Perhaps I could add a balance report to the end of
> > backup.sh so I get it emailed to me every morning? I'll take a look
> > after ingesting enough caffeine to get both eyes open
> > simultaneously.
>
> Yes, if you are trying to really understand what's going on with the
> scheduling it can certainly be useful to be able to watch the
> day-to-day changes to the balance listing.
>
>   Nathan

After this morning's run, it's better again:
amanda@coyote:/root$ /usr/local/sbin/amadmin Daily balance

 due-date   #fs   orig MB    out MB   balance
---------------------------------------------
11/11 Sun     1     10886     10886    -35.8%
11/12 Mon     1     32963      7875    -53.6%
11/13 Tue     1      7688      7688    -54.7%
11/14 Wed     2     22109     22109    +30.4%
11/15 Thu     4     75027     46623   +175.0%
11/16 Fri     6      8257      6109    -64.0%
11/17 Sat    29     14034      8932    -47.3%
11/18 Sun     4     21281     16842     -0.7%
11/19 Mon    18     34599     17188     +1.4%
11/20 Tue     1     49240     25295    +49.2%
---------------------------------------------
TOTAL        67    276084    169547     16954
  (estimated 10 runs per dumpcycle)

And it must be calculating the balance based on the original size, not 
the compressed size (out MB), that growing Tuesday the 20th is still on 
one vtape, but if unchanged, Thu the 15th will still use a 2nd vtape.

And from the emailed report, much less "churn". So I'm thinking we may 
have solved a goodly part of my planner complaints at the same time.
--
NOTES:
  planner: Incremental of coyote:/usr/local bumped to level 2.
  planner: Incremental of picnc:/ bumped to level 2.
  taper: tape Dailys-41 kb 26949743 fm 67 [OK]
---
And I woke up to recycle some water, so I'm going back to bed till a more 
civilized time of the day. :)

Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-11-10 Thread Nathan Stratton Treadway
On Sat, Nov 10, 2018 at 11:39:58 -0500, Chris Nighswonger wrote:
> I think that is output from one of Gene's systems, but here is the latest
> from mine after DLE balancing has been through one successful run. (The
> next run will take place Monday morning @ 0200).
> 
> backup@scriptor:~ amadmin campus balance
> 
>  due-date   #fs       orig kB        out kB     balance
> -------------------------------------------------------
> 11/10 Sat     4     676705513     447515666     +178.0%
> 11/11 Sun     8       8943859       5250400      -96.7%
> 11/12 Mon    12     127984592      84074623      -47.8%
> 11/13 Tue    19     304110025     267932333      +66.5%
> 11/14 Wed     0             0             0         ---
> -------------------------------------------------------
> TOTAL        43    1117743989     804773022   160954604
>   (estimated 5 runs per dumpcycle)
>  (3 filesystems overdue. The most being overdue 17841 days.)
> 
> Not sure what's up with the overdues. There were none prior to breaking up
> the DLEs. It may just be an artifact.

With a 5-day dumpcycle that would mean Amanda thinks the last dump took
place 17846-ish days ago:
  $ date --date="17846 days ago"
  Wed Dec 31 14:24:39 EST 1969
... and the date of 1969/12/31 is the "no data saved" placeholder date
within Amanda's info database.

Anyway, you should be able to identify the three DLEs in question with
  amadmin campus due | grep "Overdue"
and then use "amadmin campus info [...]" to see what amanda has recorded
about them.

I guess there should also be one DLE listed in the "amadmin ... due"
output as being due "today".  It would be interesting to see the info
for that one as well (in order to understand the components of the very
large total size shown on the 11/10 line).

Because Amanda won't postpone any of those four DLEs unless it really has
to, the balance will presumably still have the +175-ish% "surge" after
your run on Monday... but in the runs after that it might try to spread
things out some more (for example, by promoting a few of the DLEs
currently included in the 11/13 line).

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-11-10 Thread Gene Heskett
On Saturday 10 November 2018 13:55:30 Nathan Stratton Treadway wrote:

> On Sat, Nov 10, 2018 at 12:48:15 -0500, Gene Heskett wrote:
> > On Saturday 10 November 2018 10:47:03 Nathan Stratton Treadway wrote:
> > > On Tue, Oct 30, 2018 at 15:51:36 -0400, Gene Heskett wrote:
> > > > I just changed the length of the dumpcycle and runspercycle up
> > > > to 10, about last Friday while I was making the bump* stuff
> > > > more attractive, but the above command returns that there are 5
> > > > filesystems out of date: su amanda -c "/usr/local/sbin/amadmin
> > > > Daily balance"
> > > >
> > > >  due-date  #fsorig MB out MB   balance
> > > > --
> > > > 10/30 Tue5  0  0  ---
> > > > 10/31 Wed1  17355   8958-45.3%
> > > > 11/01 Thu2  10896  10887-33.5%
> > > > 11/02 Fri4  35944   9298-43.2%
> > > > 11/03 Sat4  14122  10835-33.8%
> > > > 11/04 Sun3  57736  57736   +252.7%
> > > > 11/05 Mon2  39947  30635+87.1%
> > > > 11/06 Tue8   4235   4215-74.3%
> > > > 11/07 Wed4  19503  14732-10.0%
> > > > 11/08 Thu   32  31783  16408 +0.2%
> > > > --
> > > > TOTAL   65 231521 163704 16370
> > >
> > > Okay, now that the small-DLE distraction is out of the way, we can
> > > get back to the original question regarding the scheduling of
> > > dumps over your dumpcycle.
> > >
> > > What does your "balance" output show now?
> > >
> > > (In particular, I'm curious if there is still one day with a huge
> > > surge like shown for 11/04 in the listing above.)
> > >
> > >
> > >   Nathan
> >
> > Its some better:
> > amanda@coyote:/amandatapes/Dailys/data$ /usr/local/sbin/amadmin
> > Daily balance
> >
> >  due-date  #fsorig MB out MB   balance
> > --
> > 11/10 Sat1   7912   3145-78.7%
> > 11/11 Sun1  10886  10886-26.1%
> > 11/12 Mon1  32963   7875-46.6%
> > 11/13 Tue1   7688   7688-47.8%
> > 11/14 Wed2  22109  22109+50.0%
> > 11/15 Thu4  75027  46623   +216.3%
> > 11/16 Fri6   8257   6109-58.6%
> > 11/17 Sat   29  14034   8932-39.4%
> > 11/18 Sun4  21281  16842+14.3%
> > 11/19 Mon   18  34599  17188+16.6%
> > --
> > TOTAL   67 234756 147397 14739
> >
> > It will be interesting to see if it continues to get "better".
> > I should think it will be under 150% by the 15th if so. If the
> > planner behaves itself.
>
> Off hand I am still suspicious of the entry here for 11/15, both
> because the data size is very high for only 4 DLEs and because you've
> gone through a full cycle since your previous listing and the huge
> surge hasn't evened out very much.  So I'm guessing that one of those
> DLEs is probably very large compared to all your other ones
>
> Anyway, my next step would be to figure out which 4 DLEs are the ones
> in that group, which you should be able to do by looking through the
> output of "/usr/local/sbin/amadmin Daily due".  For example, try
>   /usr/local/sbin/amadmin Daily due | grep "5 day"
> and see if you get 4 DLEs listed (and adjust the count by a day if
> that shows you the wrong group of DLEs).
>
> Once you see which four are in that group, you can cross reference
> with your Amanda mail reports to figure out the relative sizes of
> those particular DLEs.
>
> > I'm of the opinion now that this bug has been tickling it wrong for
> > much more than the last 30 days or so that its been visible with the
> > update from 3.3.7p1 I'd been running forever. balance reports
> > weren't all that encouraging and the planner was half out of it mind
> > trying to shuffle things to help, without ever getting in "balance".
>
> (I am pretty sure that the small-DLE bug didn't affect the overall
> balance.  You can see from the "balance" output on 10/30 that the 5
> DLEs in question all show up as needing to be full-dumped that day --
> but the total size for that group is still zero.  So I suspect that
> there is/are some other factor(s) behind the single-day surge and
> whatever "shuffling" has been going on ...)
>
> > We shall see. Perhaps I could add a balance report to the end of
> > backup.sh so I get it emailed to me every morning? I'll take a look
> > after ingesting enough caffeine to get both eyes open
> > simultaneously.
>
> Yes, if you are trying to really understand what's going on with the
> scheduling it can certainly be useful to be able to watch the
> day-to-day changes to the balance listing.
>
>   Nathan
The biggest problem is where do I put the stuff I have downloaded, which 
includes several 

Re: All level 0 on the same run?

2018-11-10 Thread Nathan Stratton Treadway
On Sat, Nov 10, 2018 at 12:48:15 -0500, Gene Heskett wrote:
> On Saturday 10 November 2018 10:47:03 Nathan Stratton Treadway wrote:
> 
> > On Tue, Oct 30, 2018 at 15:51:36 -0400, Gene Heskett wrote:
> > > I just changed the length of the dumpcycle and runspercycle up to
> > > 10, about last Friday while I was making the bump* stuff more
> > > attractive, but the above command returns that there are 5 filesystems
> > > out of date: su amanda -c "/usr/local/sbin/amadmin Daily balance"
> > >
> > >  due-date  #fsorig MB out MB   balance
> > > --
> > > 10/30 Tue5  0  0  ---
> > > 10/31 Wed1  17355   8958-45.3%
> > > 11/01 Thu2  10896  10887-33.5%
> > > 11/02 Fri4  35944   9298-43.2%
> > > 11/03 Sat4  14122  10835-33.8%
> > > 11/04 Sun3  57736  57736   +252.7%
> > > 11/05 Mon2  39947  30635+87.1%
> > > 11/06 Tue8   4235   4215-74.3%
> > > 11/07 Wed4  19503  14732-10.0%
> > > 11/08 Thu   32  31783  16408 +0.2%
> > > --
> > > TOTAL   65 231521 163704 16370
> >
> > Okay, now that the small-DLE distraction is out of the way, we can get
> > back to the original question regarding the scheduling of dumps over
> > your dumpcycle.
> >
> > What does your "balance" output show now?
> >
> > (In particular, I'm curious if there is still one day with a huge
> > surge like shown for 11/04 in the listing above.)
> >
> >
> > Nathan
> Its some better:
> amanda@coyote:/amandatapes/Dailys/data$ /usr/local/sbin/amadmin Daily 
> balance
> 
>  due-date  #fsorig MB out MB   balance
> --
> 11/10 Sat1   7912   3145-78.7%
> 11/11 Sun1  10886  10886-26.1%
> 11/12 Mon1  32963   7875-46.6%
> 11/13 Tue1   7688   7688-47.8%
> 11/14 Wed2  22109  22109+50.0%
> 11/15 Thu4  75027  46623   +216.3%
> 11/16 Fri6   8257   6109-58.6%
> 11/17 Sat   29  14034   8932-39.4%
> 11/18 Sun4  21281  16842+14.3%
> 11/19 Mon   18  34599  17188+16.6%
> --
> TOTAL   67 234756 147397 14739
> 
> It will be interesting to see if it continues to get "better".
> I should think it will be under 150% by the 15th if so. If the planner 
> behaves itself.

Offhand I am still suspicious of the entry here for 11/15, both because
the data size is very high for only 4 DLEs and because you've gone
through a full cycle since your previous listing and the huge surge
hasn't evened out very much.  So I'm guessing that one of those DLEs is
probably very large compared to all your other ones

Anyway, my next step would be to figure out which 4 DLEs are the ones in
that group, which you should be able to do by looking through the output
of "/usr/local/sbin/amadmin Daily due".  For example, try
  /usr/local/sbin/amadmin Daily due | grep "5 day" 
and see if you get 4 DLEs listed (and adjust the count by a day if that
shows you the wrong group of DLEs).

Once you see which four are in that group, you can cross reference with
your Amanda mail reports to figure out the relative sizes of those
particular DLEs.

> 
> I'm of the opinion now that this bug has been tickling it wrong for much 
> more than the last 30 days or so that its been visible with the update 
> from 3.3.7p1 I'd been running forever. balance reports weren't all that 
> encouraging and the planner was half out of it mind trying to shuffle 
> things to help, without ever getting in "balance".

(I am pretty sure that the small-DLE bug didn't affect the overall
balance.  You can see from the "balance" output on 10/30 that the 5 DLEs
in question all show up as needing to be full-dumped that day -- but the
total size for that group is still zero.  So I suspect that there is/are
some other factor(s) behind the single-day surge and whatever
"shuffling" has been going on ...)

> 
> We shall see. Perhaps I could add a balance report to the end of 
> backup.sh so I get it emailed to me every morning? I'll take a look 
> after ingesting enough caffeine to get both eyes open simultaneously. 

Yes, if you are trying to really understand what's going on with the
scheduling it can certainly be useful to be able to watch the day-to-day
changes to the balance listing.

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  

Re: All level 0 on the same run?

2018-11-10 Thread Gene Heskett
On Saturday 10 November 2018 10:47:03 Nathan Stratton Treadway wrote:

> On Tue, Oct 30, 2018 at 15:51:36 -0400, Gene Heskett wrote:
> > I just changed the length of the dumpcycle and runspercycle up to
> > 10, about last Friday while I was making the bump* stuff more
> > attractive, but the above command returns that there are 5 filesystems
> > out of date: su amanda -c "/usr/local/sbin/amadmin Daily balance"
> >
> >  due-date  #fsorig MB out MB   balance
> > --
> > 10/30 Tue5  0  0  ---
> > 10/31 Wed1  17355   8958-45.3%
> > 11/01 Thu2  10896  10887-33.5%
> > 11/02 Fri4  35944   9298-43.2%
> > 11/03 Sat4  14122  10835-33.8%
> > 11/04 Sun3  57736  57736   +252.7%
> > 11/05 Mon2  39947  30635+87.1%
> > 11/06 Tue8   4235   4215-74.3%
> > 11/07 Wed4  19503  14732-10.0%
> > 11/08 Thu   32  31783  16408 +0.2%
> > --
> > TOTAL   65 231521 163704 16370
>
> Okay, now that the small-DLE distraction is out of the way, we can get
> back to the original question regarding the scheduling of dumps over
> your dumpcycle.
>
> What does your "balance" output show now?
>
> (In particular, I'm curious if there is still one day with a huge
> surge like shown for 11/04 in the listing above.)
>
>
>   Nathan
It's some better:
amanda@coyote:/amandatapes/Dailys/data$ /usr/local/sbin/amadmin Daily 
balance

 due-date   #fs   orig MB    out MB   balance
---------------------------------------------
11/10 Sat     1      7912      3145    -78.7%
11/11 Sun     1     10886     10886    -26.1%
11/12 Mon     1     32963      7875    -46.6%
11/13 Tue     1      7688      7688    -47.8%
11/14 Wed     2     22109     22109    +50.0%
11/15 Thu     4     75027     46623   +216.3%
11/16 Fri     6      8257      6109    -58.6%
11/17 Sat    29     14034      8932    -39.4%
11/18 Sun     4     21281     16842    +14.3%
11/19 Mon    18     34599     17188    +16.6%
---------------------------------------------
TOTAL        67    234756    147397     14739

It will be interesting to see if it continues to get "better".
I should think it will be under 150% by the 15th if so. If the planner 
behaves itself.

I'm of the opinion now that this bug has been tickling it wrong for much
longer than the last 30 days or so that it's been visible with the update
from the 3.3.7p1 I'd been running forever. Balance reports weren't all that
encouraging, and the planner was half out of its mind trying to shuffle
things to help, without ever getting in "balance".

We shall see. Perhaps I could add a balance report to the end of 
backup.sh so I get it emailed to me every morning? I'll take a look 
after ingesting enough caffeine to get both eyes open simultaneously. 

I've been lying around & taking care of the missus while waiting on
stuff from China to rebuild the interfaces on my Grizzly g0704 milling
machine, destroyed by the failure of a $2 regulator that failed
shorted, putting 35 volts on the 5 volt vcc line of the breakout board.
That obviously gave up the ghost too. The rebuild will include a
dedicated 5 volt supply and the 35 volt parts to supply some of the
other line-power-controlling ice cube relays that control jigs and
vacuums to suck up the swarf when I'm making furniture parts on it. I'm
rather partial to the Greene & Greene huge box joint, which lends itself
to being carved on cnc machinery.  I wrote that code too.

Obviously I have way too many "hobbies" ;-) I bought, 2 years ago, a lathe
big enough to do some gunsmithing work, and since I bought a 70 yo
Sheldon model that had been badly abused, rebuilt it to be cnc
controlled, which works well considering I didn't use a regular pc to do
it, but an r-pi-3b, breaking new ground. Good enough that I used it to put
a new barrel on "old meat in the pot" last fall, which I also reload for;
it was a 30-06 Ackley Improved, now a 6.5 Creedmoor that's quite a bit
easier on the old man's shoulder. Shoots well again; the old Douglas
barrel was getting rusty, with accuracy to match. ;-)

Thank you Nathan.

Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-11-10 Thread Chris Nighswonger
On Sat, Nov 10, 2018 at 10:57 AM Nathan Stratton Treadway <
natha...@ontko.com> wrote:

> On Tue, Oct 30, 2018 at 15:51:36 -0400, Gene Heskett wrote:
> > I just changed the length of the dumpcycle and runspercycle up to 10,
> > about last Friday while I was making the bump* stuff more attractive,
> > but the above command returns that there are 5 filesystems out of date:
> > su amanda -c "/usr/local/sbin/amadmin Daily balance"
> >
> >  due-date  #fsorig MB out MB   balance
> > --
> > 10/30 Tue5  0  0  ---
> > 10/31 Wed1  17355   8958-45.3%
> > 11/01 Thu2  10896  10887-33.5%
> > 11/02 Fri4  35944   9298-43.2%
> > 11/03 Sat4  14122  10835-33.8%
> > 11/04 Sun3  57736  57736   +252.7%
> > 11/05 Mon2  39947  30635+87.1%
> > 11/06 Tue8   4235   4215-74.3%
> > 11/07 Wed4  19503  14732-10.0%
> > 11/08 Thu   32  31783  16408 +0.2%
> > --
> > TOTAL   65 231521 163704 16370
>
> Okay, now that the small-DLE distraction is out of the way, we can get
> back to the original question regarding the scheduling of dumps over
> your dumpcycle.
>
> What does your "balance" output show now?
>

I think that is output from one of Gene's systems, but here is the latest
from mine after DLE balancing has been through one successful run. (The
next run will take place Monday morning @ 0200).

backup@scriptor:~ amadmin campus balance

 due-date   #fs       orig kB        out kB     balance
-------------------------------------------------------
11/10 Sat     4     676705513     447515666     +178.0%
11/11 Sun     8       8943859       5250400      -96.7%
11/12 Mon    12     127984592      84074623      -47.8%
11/13 Tue    19     304110025     267932333      +66.5%
11/14 Wed     0             0             0         ---
-------------------------------------------------------
TOTAL        43    1117743989     804773022   160954604
  (estimated 5 runs per dumpcycle)
 (3 filesystems overdue. The most being overdue 17841 days.)

Not sure what's up with the overdues. There were none prior to breaking up
the DLEs. It may just be an artifact.

Kind regards,
Chris

>
> (In particular, I'm curious if there is still one day with a huge surge
> like shown for 11/04 in the listing above.)
>
>
> Nathan
>
>
> 
> Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
> Ray Ontko & Co.  -  Software consulting services  -
> http://www.ontko.com/
>  GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
>  Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
>


Re: All level 0 on the same run?

2018-11-10 Thread Nathan Stratton Treadway
On Tue, Oct 30, 2018 at 15:51:36 -0400, Gene Heskett wrote:
> I just changed the length of the dumpcycle and runspercycle up to 10,
> about last Friday while I was making the bump* stuff more attractive,
> but the above command returns that there are 5 filesystems out of date:
> su amanda -c "/usr/local/sbin/amadmin Daily balance"
> 
>  due-date  #fsorig MB out MB   balance
> --
> 10/30 Tue5  0  0  ---
> 10/31 Wed1  17355   8958-45.3%
> 11/01 Thu2  10896  10887-33.5%
> 11/02 Fri4  35944   9298-43.2%
> 11/03 Sat4  14122  10835-33.8%
> 11/04 Sun3  57736  57736   +252.7%
> 11/05 Mon2  39947  30635+87.1%
> 11/06 Tue8   4235   4215-74.3%
> 11/07 Wed4  19503  14732-10.0%
> 11/08 Thu   32  31783  16408 +0.2%
> --
> TOTAL   65 231521 163704 16370

Okay, now that the small-DLE distraction is out of the way, we can get
back to the original question regarding the scheduling of dumps over
your dumpcycle. 

What does your "balance" output show now?  

(In particular, I'm curious if there is still one day with a huge surge
like shown for 11/04 in the listing above.)


Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-11-01 Thread Gene Heskett
On Thursday 01 November 2018 09:09:36 Nathan Stratton Treadway wrote:

> On Thu, Nov 01, 2018 at 08:21:55 -0400, Gene Heskett wrote:
> > And an amcheck fusses with 3.3.7p1 reinstalled:
> >
> > amcheck-server: Bogus line in the tapelist file: 20181101030104
> > Dailys-28 reuse BLOCKSIZE:32 POOL:Daily STORAGE:Daily CONFIG:Daily
> >
> > Should I nuke that line before it runs?
>
> Ah, ugh.
>
> It's complaining because v3.4/3.5 added pool and storage, etc. info to
> those lines, which 3.3 doesn't understand.
>
> I am pretty sure it will complain about all the tapelist lines that
> have been touched since you moved to 3.5, so nuking that one line
> won't be sufficient.
>
> I have no experience downgrading like this, and suspect there might
> be other files that have similar incompatibilities.
>
> Personally I would say you'd be better off switching back to 3.5 and
> moving forward from there, rather than trying to track down all the
> strange problems that will pop up trying to downgrade.  (Especially
> since no one who actually knows anything about what would be involved
> in such an undertaking is around to guide you...)

Yeah, BETSOL sure isn't making an effort, and actions speak a hell of a
lot louder than their non-existent messages to this list have. Sorry
about the attitude, but that's how I see it.

We, the community, should start a GoFundMe to hire JLM a week a month and
fork this. But I know zip about how to start the "gofundme", and if it's
Facebook's copyright, do something else, someplace else. Anyplace else;
Zuckerberg would sell his mother by the lb and stand there smiling.

> (Keep in mind that 3.5 appears to be actually doing the backups just
> fine; the saving-info problem is more cosmetic than functional)

True also, it seems. But it would be nice to fix some of these warts.

>   Nathan

Take care all.

-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-11-01 Thread Gene Heskett
On Thursday 01 November 2018 06:53:19 Gene Heskett wrote:

> 3.4.3 is still broken, 3.3.7p1 being installed now.

Because 3.3.7p1 squawks about a bogus line in the tapelist, I put 3.5.1 
back in.

Someone asked how small it had to be to trigger this bug.
root@coyote:/home/amanda/amanda-3.5.1/server-src# su 
amanda -c "/usr/local/sbin/amadmin Daily due"|grep Overdue -

Overdue 22 days: shop:/usr/local = du -h 136k

Overdue 22 days: shop:/var/amanda = du -h 4k, but only file is a link 
to /etc/amandates

Overdue 22 days: lathe:/usr/local = du -h 136k

Overdue 22 days: lathe:/var/amanda = du -h 4k, but only file is a link 
to /etc/amandates

Overdue 22 days: GO704:/var/amanda =du -h 4k, but only file is a link 
to /etc/amandates

So the /var/amanda can be dumped if the reference to amandates is changed
to the real file in /etc, which is backed up by a different DLE. That
will fix 3 of the overdue warnings.

But the /usr/local warning is a different critter. I had not
considered /usr to be a backup target because it will be restored by the
distro install dvd + an update/upgrade.

Removing the local will make it about 6 GB, a ridiculous increase for no
benefit.  Rebuilding the client to use the real file in /etc/amandates
makes more sense.

Other ideas?

-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-11-01 Thread Nathan Stratton Treadway
On Thu, Nov 01, 2018 at 08:21:55 -0400, Gene Heskett wrote:
> And an amcheck fusses with 3.3.7p1 reinstalled:
> 
> amcheck-server: Bogus line in the tapelist file: 20181101030104 Dailys-28 
> reuse BLOCKSIZE:32 POOL:Daily STORAGE:Daily CONFIG:Daily
> 
> Should I nuke that line before it runs?

Ah, ugh.

It's complaining because v3.4/3.5 added pool and storage, etc. info to
those lines, which 3.3 doesn't understand.
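
(To make the mismatch concrete -- and this is an assumption about the
older format rather than something checked against the 3.3 sources -- a
pre-3.4 tapelist entry was essentially just the three leading fields,
  20181101030104 Dailys-28 reuse
while 3.5 now writes
  20181101030104 Dailys-28 reuse BLOCKSIZE:32 POOL:Daily STORAGE:Daily CONFIG:Daily
and it is those trailing keyword fields that the 3.3 parser rejects as a
"bogus line".)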

I am pretty sure it will complain about all the tapelist lines that have
been touched since you moved to 3.5, so nuking that one line won't
be sufficient.

I have no experience downgrading like this, and suspect there might be
other files that have similar incompatibilities.

Personally I would say you'd be better off switching back to 3.5 and
moving forward from there, rather than trying to track down all the
strange problems that will pop up trying to downgrade.  (Especially
since no one who actually knows anything about what would be involved in
such an undertaking is around to guide you...)

(Keep in mind that 3.5 appears to be actually doing the backups just
fine; the saving-info problem is more cosmetic than functional)


Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-11-01 Thread Chris Nighswonger
On Wed, Oct 31, 2018 at 6:30 PM Debra S Baddorf  wrote:

> We may have found an answer for Gene’s problem.
> Has the original poster, Chris Nighswonger  found an answer?
>
> Deb
>
>
Indeed! Enough to get me set off in what seems to be a right direction.

Thanks to everyone for the kind assistance.

Kind regards,
Chris


Re: All level 0 on the same run?

2018-11-01 Thread Chris Nighswonger
On Wed, Oct 31, 2018 at 5:32 PM Nathan Stratton Treadway 
wrote:

>
> Am I correct that you actually ran two separate amdump runs within the
> calendar day of 10/30 (with the first "balance" command executed between
> the runs)?  That would explain why all 39 DLEs are now showing as due on
> the same day.
>
>
No. The first balance was run on 10/29. The second on 10/30.

Here is the balance after last night's run (10/31):

root@scriptor:/var/log/amanda/server/campus# su backup -c
"/usr/sbin/amadmin campus balance"

 due-date   #fs       orig kB        out kB     balance
-------------------------------------------------------
11/01 Thu     1             0             0         ---
11/02 Fri     0             0             0         ---
11/03 Sat    21     899936964     623033612     +301.4%
11/04 Sun     0             0             0         ---
11/05 Mon    18     179898725     153104174       -1.4%
-------------------------------------------------------
TOTAL        40    1079835689     776137786   155227557
  (estimated 5 runs per dumpcycle)

Sounds to me like I need to give it another week or so to settle out. Then
revisit breaking up the two or three large DLEs.

Kind regards,
Chris


Re: All level 0 on the same run?

2018-11-01 Thread Gene Heskett
On Thursday 01 November 2018 06:53:19 Gene Heskett wrote:

> On Thursday 01 November 2018 01:50:31 Gene Heskett wrote:
> > On Wednesday 31 October 2018 22:40:58 Nathan Stratton Treadway wrote:
> > > On Wed, Oct 31, 2018 at 13:26:40 -0400, Gene Heskett wrote:
> > > > That makes more sense than anything else I've found. Now, I have
> > > > 3.3.7p1. 3.4.3, and 3.5.1 which I've been running about that
> > > > long. So lets install 3.4.3 for tonight. Building now,
> > > > apparently had not been done before. and on the instal and
> > > > amcheck, I had to move the
> > >
> > > Okay, let us know if 3.4.3 behaves any differently for those small
> > > DLEs. I took a quick look at the source code commits in the 3.5
> > > timeframe and nothing jumped out at me as touching the
> > > info-file-updating, so it would not surprise me too much if that
> > > bug were in 3.4 as well.
> > >
> > > But in any case, it would help narrow down where to look to know
> > > if it was or was not fixed by downgrading.
> >
> > The next step down would be 3.3.7p1, which has ran here for years. I
> > also have 3.3.6, but thats as old as I go except for some
> > amanda-4.x.x-alpha stuff that the previous programmer who was fond
> > of perl was doing, but can't remember his name ATM.  Most of that
> > also seemed to work. So if its present in 3.3.7p1, then test 3.3.6,
> > then start on the alpha stuff. At least till someone comes up with a
> > patch. What file do you think the bug is in?
> >
> > Take care now, Nathan & thank you.
>
> 3.4.3 is still broken, 3.3.7p1 being installed now.

And an amcheck fusses with 3.3.7p1 reinstalled:

amcheck-server: Bogus line in the tapelist file: 20181101030104 Dailys-28 
reuse BLOCKSIZE:32 POOL:Daily STORAGE:Daily CONFIG:Daily

Should I nuke that line before it runs?

-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-11-01 Thread Gene Heskett
On Thursday 01 November 2018 01:50:31 Gene Heskett wrote:

> On Wednesday 31 October 2018 22:40:58 Nathan Stratton Treadway wrote:
> > On Wed, Oct 31, 2018 at 13:26:40 -0400, Gene Heskett wrote:
> > > That makes more sense than anything else I've found. Now, I have
> > > 3.3.7p1. 3.4.3, and 3.5.1 which I've been running about that long.
> > > So lets install 3.4.3 for tonight. Building now, apparently had
> > > not been done before. and on the instal and amcheck, I had to move
> > > the
> >
> > Okay, let us know if 3.4.3 behaves any differently for those small
> > DLEs. I took a quick look at the source code commits in the 3.5
> > timeframe and nothing jumped out at me as touching the
> > info-file-updating, so it would not surprise me too much if that bug
> > were in 3.4 as well.
> >
> > But in any case, it would help narrow down where to look to know if
> > it was or was not fixed by downgrading.
>
> The next step down would be 3.3.7p1, which has ran here for years. I
> also have 3.3.6, but thats as old as I go except for some
> amanda-4.x.x-alpha stuff that the previous programmer who was fond of
> perl was doing, but can't remember his name ATM.  Most of that also
> seemed to work. So if its present in 3.3.7p1, then test 3.3.6, then
> start on the alpha stuff. At least till someone comes up with a patch.
>  What file do you think the bug is in?
>
> Take care now, Nathan & thank you.

3.4.3 is still broken, 3.3.7p1 being installed now.


Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 22:40:58 Nathan Stratton Treadway wrote:

> On Wed, Oct 31, 2018 at 13:26:40 -0400, Gene Heskett wrote:
> > That makes more sense than anything else I've found. Now, I have
> > 3.3.7p1. 3.4.3, and 3.5.1 which I've been running about that long.
> > So lets install 3.4.3 for tonight. Building now, apparently had not
> > been done before. and on the instal and amcheck, I had to move the
>
> Okay, let us know if 3.4.3 behaves any differently for those small
> DLEs. I took a quick look at the source code commits in the 3.5
> timeframe and nothing jumped out at me as touching the
> info-file-updating, so it would not surprise me too much if that bug
> were in 3.4 as well.
>
> But in any case, it would help narrow down where to look to know if it
> was or was not fixed by downgrading.
>
The next step down would be 3.3.7p1, which has run here for years. I also
have 3.3.6, but that's as old as I go except for some amanda-4.x.x-alpha
stuff that the previous programmer, who was fond of perl, was doing, but I
can't remember his name ATM.  Most of that also seemed to work. So if
it's present in 3.3.7p1, then test 3.3.6, then start on the alpha stuff.
At least till someone comes up with a patch.  What file do you think the
bug is in?
>
Take care now, Nathan & thank you.
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 13:26:40 -0400, Gene Heskett wrote:
> That makes more sense than anything else I've found. Now, I have 3.3.7p1,
> 3.4.3, and 3.5.1, which I've been running about that long.
> So let's install 3.4.3 for tonight. Building now, apparently had not been
> done before. And on the install and amcheck, I had to move the 

Okay, let us know if 3.4.3 behaves any differently for those small DLEs.
I took a quick look at the source code commits in the 3.5 timeframe and
nothing jumped out at me as touching the info-file-updating, so it would
not surprise me too much if that bug were in 3.4 as well.

But in any case, it would help narrow down where to look to know if it
was or was not fixed by downgrading.


Nathan



Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 14:52:46 Debra S Baddorf wrote:

> “  Those datestamps are obviously wrong, should be 20181031  "
>
> The two DLE’s that you showed, with datestamp “wrong”, are among the
> “too small” disks you are talking about.   So it seems that the
> datestamp probably isn’t wrong? These sets are being missed?
>
> (Not completely following, cuz it’s not (so far) relevant to me.)
> But the date thing seems to be real, so I thought I’d point it out.
>
> Deb Baddorf
>
Nope, Deb, those DLEs are present, and carrying today's date when the
vtape is accessed by an ls -l, as I posted a few hours ago.

-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 



Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 12:07:06 -0400, Gene Heskett wrote:
> root@coyote:/amandatapes/Dailys# ls -l data/
> total 18143556
> -rw--- 1 amanda amanda  32768 Oct 31 03:03 0.Dailys-27
> -rw--- 1 amanda amanda  70603 Oct 31 03:03 1.shop._root.0
> -rw--- 1 amanda amanda  32934 Oct 31 03:03 2.shop._var_amanda.0
> -rw--- 1 amanda amanda 272120 Oct 31 03:03 3.GO704._root.0
[...]
> -rw--- 1 amanda amanda   13353012 Oct 31 03:44 00064.GO704._usr_local.0
> -rw--- 1 amanda amanda2163585 Oct 31 03:44 
> 00065.GO704._usr_lib_amanda.0
> -rw-r--r-- 1 amanda amanda 112640 Oct 31 03:45 configuration.tar
> -rw-r--r-- 1 amanda amanda  469934080 Oct 31 03:45 indices.tar
> 
> So all 66 dle's are there, subbing out the first line & the last 2.
> 
[...]
> Now, possibly interesting: does amadmin skip some that aren't due? It
> only sees 65 lines of output.
> root@coyote:/amandatapes/Dailys# su amanda -c "/usr/local/sbin/amadmin Daily 
> dles"|wc -l
> 65

Note that in the vtape directory, the 0 file is the tape label, so
actually there are only 65 DLEs backed up there (thus matching your
"amadmin ... dles" output).  

(Amanda generally does _some_ dump for every DLE, to make sure to catch
changes since the previous dump... though of course if nothing has
changed on that DLE you may end up with an empty incremental dump for
that DLE.)

> 
> So where does amanda keep the file with the last backup, is that amandates?

(In 3.5 I seem to remember it's possible to choose from different
storage back-ends, but generally) the "amadmin ... info" data is stored
in a pile of
  /var/lib/amanda/[CONFIG]/curinfo/[HOST]/[DISK]/info
text files (one file for each DLE).

> On shop, they don't resemble dates:
> gene@shop:/etc$ ls -l amandates
> -rw-r- 1 amandabackup disk 380 Oct 31 03:28 amandates
> gene@shop:/etc$ cat amandates
> /etc 0 1540969428
> /etc 1 1540796552

I don't believe amandates has anything to do with the "amadmin ...
info/due/balance" commands, but anyway those are
seconds-since-Unix-epoch numbers.  For a reasonably up-to-date version
of the GNU date command, you can translate that to a human-readable
date-time string with date --date="@n", e.g.

  $ date --date="@1540969428"
  Wed Oct 31 03:03:48 EDT 2018
  $ date --date="@1540796552"
  Mon Oct 29 03:02:32 EDT 2018


Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Debra S Baddorf
We may have found an answer for Gene’s problem.
Has the original poster, Chris Nighswonger  found an answer?

Deb

> On Oct 30, 2018, at 1:32 PM, Debra S Baddorf  wrote:
> 
> Is this the first backup run for a long while?  If so, then they are all DUE, 
> so amanda feels it has to schedule them all, now.
> 
> Is this the first backup ever?   Ditto above.
> 
> Did you perhaps run  “amadmin [CONFIG] force [HOST] *”  which forces a level 0 on
> all disks.
> Did you specify  “strategy noinc”  which does the same?
> Or  “skip-incr yes”  ?  Ditto.
> 
> Did you replace a whole disk, making all the files look like they’ve never 
> been backed up?
> 
> Okay,  failing all the above obvious reasons,  I’ll leave others to discuss 
> “planner” reasons.  Sorry!
> Deb Baddorf
> Fermilab
> 
>> On Oct 30, 2018, at 1:20 PM, Chris Nighswonger 
>>  wrote:
>> 
>> Why in the world does Amanda plan level 0 backups for all entries in a DLE
>> for the same run? This causes all sorts of problems.
>> Is there any solution for this? I've read some of the creative suggestions, 
>> but it seems a bunch of trouble.
>> Kind regards,
>> Chris
>> 
>> 0 19098649k waiting for dumping
>> 0 9214891k waiting for dumping
>> 0 718824k waiting for dumping
>> 0 365207k waiting for dumping
>> 0 2083027k waiting for dumping
>> 0 3886869k waiting for dumping
>> 0 84910k waiting for dumping
>> 0 22489k dump done (7:23:34), waiting for writing to tape
>> 0 304k dump done (7:22:30), waiting for writing to tape
>> 0 2613k waiting for dumping
>> 0 30k dump done (7:23:07), waiting for writing to tape
>> 0 39642k dump done (7:23:07), waiting for writing to tape
>> 0 8513409k waiting for dumping
>> 0 39519558k waiting for dumping
>> 0 47954k waiting for dumping
>> 0 149877984k dumping 145307840k ( 96.95%) (7:22:15)
>> 0 742804k waiting for dumping
>> p" 0 88758k waiting for dumping
>> 0 12463k dump done (7:24:19), waiting for writing to tape
>> 0 5544352k waiting for dumping
>> 0 191676480k waiting for dumping
>> 0 3799277k waiting for dumping
>> 0 3177171k waiting for dumping
>> 0 11058544k waiting for dumping
>> 0 230026440k dump done (7:22:13), waiting for writing to tape
>> 0 8k dump done (7:24:24), waiting for writing to tape
>> 0 184k dump done (7:24:19), waiting for writing to tape
>> 0 1292009k waiting for dumping
>> 0 2870k dump done (7:23:23), waiting for writing to tape
>> 0 13893263k waiting for dumping
>> 0 6025026k waiting for dumping
>> 0 6k dump done (7:22:15), waiting for writing to tape
>> 0 42k dump done (7:24:24), waiting for writing to tape
>> 0 53k dump done (7:24:19), waiting for writing to tape
>> 0 74462169k waiting for dumping
>> 0 205032k waiting for dumping
>> 0 32914k waiting for dumping
>> 0 1k dump done (7:24:02), waiting for writing to tape
>> 0 854272k waiting for dumping
>> 
> 




Re: All level 0 on the same run?

2018-10-31 Thread Jon LaBadie
On Wed, Oct 31, 2018 at 08:25:39AM -0400, Chris Nighswonger wrote:
> So, looking at this more, it may be self-inflicted. Last week I changed
> blocksize to 512k, and began amrmtape and amlabel with the oldest tape
> first and working backward day by day. I run backups 5 nights per week with
> a cycle of 13 tapes (see below). I would have thought that this would have
> allowed the change in blocksize to run seamlessly. Maybe not. I'm now
> suspecting that by amrmtape --cleanup, this caused Amanda to bork and fall
> back to level 0 backups. She did this two nights in a row!!!
> 
> Anyway, I'm going to hold off any further concerns until I finish a
> complete tapecycle. If the problem continues after that point, I'll pick
> back up.
> 
> Relevant lines from amanda.conf:
> 
> dumpcycle 5 days
> runspercycle 5
> tapecycle 13 tapes
> runtapes 1
> flush-threshold-dumped 50
> bumpsize 10 Mbytes
> bumppercent 0
> bumpmult 1.5
> bumpdays 2
> 
> Kind regards,
> Chris
> 

When I introduce a lot of DLEs into the disklist (or start a
new amanda instance) I typically will comment out all the new
DLEs and only uncomment a few each amdump run.
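
A rough sketch of what that looks like in the disklist (hypothetical
hosts and dumptype names; I just uncomment a couple more entries before
each run):

  newhost  /home      comp-user-tar
  #newhost /var/www   comp-user-tar
  #newhost /srv/data  comp-user-tar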

Jon
-- 
Jon H. LaBadie j...@jgcomp.com
 11226 South Shore Rd.  (703) 787-0688 (H)
 Reston, VA  20190  (703) 935-6720 (C)


Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 08:29:08 -0400, Chris Nighswonger wrote:
> FWIW, here is the output of amadmin balance before last nights run and
> again this morning. No overdues, so I guess that's good. I'm not
> experienced enough to make much of the balance percentages, but am now
> wondering if I should work at breaking up the large DLEs into smaller
> subsets as several have suggested.
> 
> root@scriptor:/home/manager# su backup -c "/usr/sbin/amadmin campus balance"
> 
>  due-date  #fs    orig kB     out kB   balance
> ------------------------------------------------
> 10/30 Tue   25  359009166  284972243   +102.1%
> 
> 11/03 Sat   15  632526057  420083122   +197.9%
> ------------------------------------------------
> TOTAL   40  991535223  705055365 141011073
>   (estimated 5 runs per dumpcycle)
>  (13 filesystems overdue. The most being overdue 1 day.)

Regarding "balance percentages": if you divide the "TOTAL out kB" figure
by the number of runs per dumpcycle, you'll get the number in the bottom
right corner of the chart (i.e. "141011073").  Amanda tries to spread
out the full dumps so that roughly that volume of full dumps happens
each day (or, in other words, so that the full dumps are split evenly
over each day in the dumpcycle).  The "balance" figure is simply a
calculation of how the currently-scheduled cycle compares to that ideal
-- so in this case the 10/30 figure is about twice the average (102%
above the target), while the 11/03 figure is about three times the
target.
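
To spell the arithmetic out with the figures above:

  705055365 kB / 5 runs = 141011073 kB per run
  284972243 / 141011073 = ~2.02  ->  +102.1%
  420083122 / 141011073 = ~2.98  ->  +197.9%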

> root@scriptor:/home/manager# su backup -c "/usr/sbin/amadmin campus balance"
> 
>  due-date  #fs    orig kB     out kB   balance
> ------------------------------------------------
> 10/31 Wed    1          0          0      ---
> 
> 11/03 Sat   39 1079730080  776153947   +400.0%
> 11/04 Sun    0          0          0      ---
> ------------------------------------------------
> TOTAL   40 1079730080  776153947 155230789
>   (estimated 5 runs per dumpcycle)

Am I correct that you actually ran two separate amdump runs within the
calendar day of 10/30 (with the first "balance" command executed between
the runs)?  That would explain why all 39 DLEs are now showing as due on
the same day.

Anyway, the "clump" showing here is a direct result of the fact that all
your DLEs got set to full dumps yesterday (for whatever reason that
happened).  

Over the rest of this cycle, you should see the planner promoting about
one fifth of your full-dump volume forward on each run, so that the full
dumps spread back out to near a "zero" balance figure.  (That is, some
level 0s will happen sooner than a full dumpcycle after the last one, to
spread things back out.)

If you know for sure that certain DLEs are larger than (or very close
to) the balance size, it will definitely help to split them.  Otherwise,
I'd say you might as well wait 5 days and then take a look at the
balance listing, to see if Amanda ended up having trouble evening things
back out over the course of the cycle.
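
(One common way to do the split is with GNU-tar include patterns in the
disklist -- a sketch with hypothetical names; in a real disklist the
patterns have to cover every top-level entry between them:

  bighost  /data-a-m  /data {
      comp-user-tar
      include "./[a-m]*"
  }
  bighost  /data-n-z  /data {
      comp-user-tar
      include "./[n-z]*"
  }
)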

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Jose M Calhariz
On Wed, Oct 31, 2018 at 02:13:27PM -0400, Nathan Stratton Treadway wrote:
> On Wed, Oct 31, 2018 at 14:38:43 +, Jose M Calhariz wrote:
> > I bet this DLEs are very small and that you have done the upgrade of
> > the amand server between 20 and 20+tapecycle ago.
> > 
> > There is a bug in recent amanda server that if a DLE is very small it
> > will not update properly the internal database after a level 0.
> > Making that DLE overdue.
> 
> Yes, that definitely would explain what Gene was seeing.
> 
> Did you recieve or create a fix for that bug?  (I didn't immediately
> recognize any of the debian/patch files in the amanda_3.5.1-3_WIP_2
> source package I downloaded from you a few weeks ago as applying to this
> bug.)

No fix.  I was waiting for a new release before reporting this bug,
as I do not like to have many patches applied to the Debian packages
besides the ones needed to debianize the software.


> 
> Do you know how small a DLE has to be to trigger this problem?

No; usually they show up as 0 size in amreport, even though they are really MBs or GBs.

> 
>   
>   Nathan
> 
> 
> Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
> Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
>  GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
>  Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
> 
>

Kind regards
Jose M Calhariz

-- 
--
Nothing is so good that someone, somewhere, will not hate it.
-- Joseph Murphy




Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 18:52:46 +, Debra S Baddorf wrote:
> "Those datestamps are obviously wrong, should be 20181031"
> 
> The two DLE's that you showed, with datestamp "wrong", are among the
> "too small" disks you are talking about.  So it seems that the
> datestamp probably isn't wrong? These sets are being missed?
> 
> (Not completely following, cuz it's not (so far) relevant to me.)  
> But the date thing seems to be real, so I thought I'd point it out.

Actually, in the listing of the vtape directory (found in Gene's message
dated "Wed, 31 Oct 2018 12:07:06 -0400") one can see that the DLEs in
question _did_ get dumped.

So it appears that almost all parts of the process are working
correctly; the only problem is that the info database for those
particular dumps is not getting updated.  This in turn causes Amanda to
constantly think they are overdue, and thus I assume it always schedules
them for level 0 dumps (as well as showing them as "overdue" in the
"amadmin ... due" output") -- but since they are so tiny, presumable
that ends up having a negligible effect on the overall balance.  (And in
fact the vtape listing confirms that many of Gene's level 1 dumps are
much larger than those 5 specific level 0s.)

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
 


Re: All level 0 on the same run?

2018-10-31 Thread Debra S Baddorf
“  Those datestamps are obviously wrong, should be 20181031  "

The two DLE’s that you showed, with datestamp “wrong”, are among the “too small”
disks you are talking about.   So it seems that the datestamp probably isn’t 
wrong?
These sets are being missed?

(Not completely following, cuz it’s not (so far) relevant to me.)  
But the date thing seems to be real, so I thought I’d point it out.

Deb Baddorf


> On Oct 31, 2018, at 11:07 AM, Gene Heskett  wrote:
> 
> On Wednesday 31 October 2018 10:37:11 Nathan Stratton Treadway wrote:
> 
>> On Wed, Oct 31, 2018 at 08:50:55 -0400, Gene Heskett wrote:
>>> On Wednesday 31 October 2018 08:39:43 Nathan Stratton Treadway wrote:
>>>> On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
>>>>> yadda yadda. So just where the hell do I look for these?
>>>>> root@coyote:/amandatapes/Dailys# su amanda -c
>>>>> "/usr/local/sbin/amadmin Daily due"|grep Overdue
>>>>> Overdue 21 days: shop:/usr/local
>>>>> Overdue 21 days: shop:/var/amanda
>>>>> Overdue 21 days: lathe:/usr/local
>>>>> Overdue 21 days: lathe:/var/amanda
>>>>> Overdue 21 days: GO704:/var/amanda
>>>>> 
>>>>> So I look in the vtapes, and find its being done too.
>>>>> Checking for lathe:/var/amanda, its there
>>>>> Checking for lathe:/usr/local, they've been done too
>>>> 
>>>> What do "amadmin Daily info shop", etc. say?
>>> 
>>> That "info" directory does not exist, never has existed here that I
>>> can
>> 
>> "info" is a subcommand of "amadmin", along the lines of "balance" and
>> "due".
>> 
> I don't believe its being treated as a subcommand, or maybe I didn't 
> give it all the arguments, lemme look at the man page. Yes, I seem to 
> have failed to assert the "Daily", and now I get output like.
> shop:/var/amanda:
> Current info for shop /var/amanda:
>  Stats: dump rates (kps), Full:1.0,   1.0,   1.0
>Incremental:1.0,   1.0,   1.0
>  compressed size, Full:  10.0%, 10.0%, 10.0%
>Incremental:  10.0%, 10.0%, 10.0%
>  Dumps: lev datestmp  tape file   origK   compK secs
>  0  20181001  Dailys-272 10 1 0
>  1  20181004  Dailys-5613 10 1 1
> 
> And shop:/usr/local:
> Current info for shop /usr/local:
>  Stats: dump rates (kps), Full:1.0,   1.0,   1.0
>Incremental:1.0,   1.0,   1.0
>  compressed size, Full:   2.5%,  2.5%,  2.5%
>Incremental:   2.5%,  2.5%,  2.5%
>  Dumps: lev datestmp  tape file   origK   compK secs
>  0  20181001  Dailys-276 40 1 1
>  1  20181004  Dailys-5923 40 1 1
> 
> Those datestamps are obviously wrong, should be 20181031
> 
> root@coyote:/amandatapes/Dailys# ls -l data
> lrwxrwxrwx 1 amanda amanda 6 Oct 31 03:01 data -> slot27
> 
> root@coyote:/amandatapes/Dailys# ls -l data/
> total 18143556
> -rw------- 1 amanda amanda      32768 Oct 31 03:03 00000.Dailys-27
> -rw------- 1 amanda amanda      70603 Oct 31 03:03 00001.shop._root.0
> -rw------- 1 amanda amanda      32934 Oct 31 03:03 00002.shop._var_amanda.0
> -rw------- 1 amanda amanda     272120 Oct 31 03:03 00003.GO704._root.0
> -rw------- 1 amanda amanda      32943 Oct 31 03:03 00004.GO704._var_amanda.0
> -rw------- 1 amanda amanda    5013684 Oct 31 03:03 00005.lathe._usr_src.0
> -rw------- 1 amanda amanda      33687 Oct 31 03:03 00006.shop._usr_local.0
> -rw------- 1 amanda amanda     206886 Oct 31 03:03 00007.shop._var_lib_amanda.0
> -rw------- 1 amanda amanda    1156067 Oct 31 03:03 00008.lathe._etc.0
> -rw------- 1 amanda amanda      32933 Oct 31 03:03 00009.lathe._var_amanda.0
> -rw------- 1 amanda amanda    1024710 Oct 31 03:03 00010.shop._etc.0
> -rw------- 1 amanda amanda      33689 Oct 31 03:03 00011.lathe._usr_local.0
> -rw------- 1 amanda amanda     264766 Oct 31 03:03 00012.lathe._var_lib_amanda.0
> -rw------- 1 amanda amanda     113999 Oct 31 03:04 00013.lathe._root.0
> -rw------- 1 amanda amanda   22061583 Oct 31 03:04 00014.shop._lib_firmware.0
> -rw------- 1 amanda amanda    1325898 Oct 31 03:04 00015.shop._usr_lib_amanda.0
> -rw------- 1 amanda amanda   21973059 Oct 31 03:05 00016.lathe._lib_firmware.0
> -rw------- 1 amanda amanda    1325856 Oct 31 03:05 00017.lathe._usr_lib_amanda.0
> -rw------- 1 amanda amanda 3193481270 Oct 31 03:09 00018.coyote._home_gene_src.0
> -rw------- 1 amanda amanda 3298505184 Oct 31 03:23 00019.coyote._home_gene.0
> -rw------- 1 amanda amanda     163341 Oct 31 03:23 00020.coyote._home_gene_Downloads.2
> -rw------- 1 amanda amanda       3123 Oct 31 03:23 00021.coyote._home_ups.0
> -rw------- 1 amanda amanda 6189105783 Oct 31 03:28 00022.picnc._.0
> -rw------- 1 amanda amanda   68618563 Oct 31 03:29 00023.shop._home.0
> -rw------- 1 amanda amanda   36166458 Oct 31 03:29 00024.lathe._home.0
> -rw------- 1 amanda amanda  827898079 Oct 31 03:34 00025.coyote._home_amanda.0
> -rw------- 1 amanda amanda    1117337 Oct 31 03:34 00026.coyote._home_nut.0
> -rw------- 1 amanda amanda 

Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 14:38:43 +, Jose M Calhariz wrote:
> I bet this DLEs are very small and that you have done the upgrade of
> the amand server between 20 and 20+tapecycle ago.
> 
> There is a bug in recent amanda server that if a DLE is very small it
> will not update properly the internal database after a level 0.
> Making that DLE overdue.

Yes, that definitely would explain what Gene was seeing.

Did you receive or create a fix for that bug?  (I didn't immediately
recognize any of the debian/patch files in the amanda_3.5.1-3_WIP_2
source package I downloaded from you a few weeks ago as applying to this
bug.)

Do you know how small a DLE has to be to trigger this problem?


Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 10:38:43 Jose M Calhariz wrote:

> On Wed, Oct 31, 2018 at 08:39:43AM -0400, Nathan Stratton Treadway wrote:
> > On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> > > yadda yadda. So just where the hell do I look for these?
> > > root@coyote:/amandatapes/Dailys# su amanda -c
> > > "/usr/local/sbin/amadmin Daily due"|grep Overdue
> > > Overdue 21 days: shop:/usr/local
> > > Overdue 21 days: shop:/var/amanda
> > > Overdue 21 days: lathe:/usr/local
> > > Overdue 21 days: lathe:/var/amanda
> > > Overdue 21 days: GO704:/var/amanda
>
> I bet this DLEs are very small and that you have done the upgrade of
> the amand server between 20 and 20+tapecycle ago.
>
> There is a bug in recent amanda server that if a DLE is very small it
> will not update properly the internal database after a level 0.
> Making that DLE overdue.
>
That makes more sense than anything else I've found. Now, I have 3.3.7p1,
3.4.3, and 3.5.1, which I've been running about that long.
So lets install 3.4.3 for tonight. Building now; apparently that had not
been done before, and on the install and amcheck, I had to move the
amanda-security.conf file up one level to /usr/local/etc. Might be wiser
to softlink it if they are going to bounce it around like that.
So we'll see how it works in the morning. amcheck is at least happy with
it.

Thanks for the bug report, Jose M Calhariz; you may have saved what little 
hair I have not pulled out looking for this.

Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 10:37:11 Nathan Stratton Treadway wrote:

> On Wed, Oct 31, 2018 at 08:50:55 -0400, Gene Heskett wrote:
> > On Wednesday 31 October 2018 08:39:43 Nathan Stratton Treadway wrote:
> > > On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> > > > yadda yadda. So just where the hell do I look for these?
> > > > root@coyote:/amandatapes/Dailys# su amanda -c
> > > > "/usr/local/sbin/amadmin Daily due"|grep Overdue
> > > > Overdue 21 days: shop:/usr/local
> > > > Overdue 21 days: shop:/var/amanda
> > > > Overdue 21 days: lathe:/usr/local
> > > > Overdue 21 days: lathe:/var/amanda
> > > > Overdue 21 days: GO704:/var/amanda
> > > >
> > > > So I look in the vtapes, and find its being done too.
> > > > Checking for lathe:/var/amanda, its there
> > > > Checking for lathe:/usr/local, they've been done too
> > >
> > > What do "amadmin Daily info shop", etc. say?
> >
> > That "info" directory does not exist, never has existed here that I
> > can
>
> "info" is a subcommand of "amadmin", along the lines of "balance" and
> "due".
>
I don't believe its being treated as a subcommand, or maybe I didn't 
give it all the arguments, lemme look at the man page. Yes, I seem to 
have failed to assert the "Daily", and now I get output like.
shop:/var/amanda:
Current info for shop /var/amanda:
  Stats: dump rates (kps), Full:1.0,   1.0,   1.0
Incremental:1.0,   1.0,   1.0
  compressed size, Full:  10.0%, 10.0%, 10.0%
Incremental:  10.0%, 10.0%, 10.0%
  Dumps: lev datestmp  tape file   origK   compK secs
  0  20181001  Dailys-272 10 1 0
  1  20181004  Dailys-5613 10 1 1

And shop:/usr/local:
Current info for shop /usr/local:
  Stats: dump rates (kps), Full:1.0,   1.0,   1.0
Incremental:1.0,   1.0,   1.0
  compressed size, Full:   2.5%,  2.5%,  2.5%
Incremental:   2.5%,  2.5%,  2.5%
  Dumps: lev datestmp  tape file   origK   compK secs
  0  20181001  Dailys-276 40 1 1
  1  20181004  Dailys-5923 40 1 1

Those datestamps are obviously wrong, should be 20181031

root@coyote:/amandatapes/Dailys# ls -l data
lrwxrwxrwx 1 amanda amanda 6 Oct 31 03:01 data -> slot27

root@coyote:/amandatapes/Dailys# ls -l data/
total 18143556
-rw------- 1 amanda amanda      32768 Oct 31 03:03 00000.Dailys-27
-rw------- 1 amanda amanda      70603 Oct 31 03:03 00001.shop._root.0
-rw------- 1 amanda amanda      32934 Oct 31 03:03 00002.shop._var_amanda.0
-rw------- 1 amanda amanda     272120 Oct 31 03:03 00003.GO704._root.0
-rw------- 1 amanda amanda      32943 Oct 31 03:03 00004.GO704._var_amanda.0
-rw------- 1 amanda amanda    5013684 Oct 31 03:03 00005.lathe._usr_src.0
-rw------- 1 amanda amanda      33687 Oct 31 03:03 00006.shop._usr_local.0
-rw------- 1 amanda amanda     206886 Oct 31 03:03 00007.shop._var_lib_amanda.0
-rw------- 1 amanda amanda    1156067 Oct 31 03:03 00008.lathe._etc.0
-rw------- 1 amanda amanda      32933 Oct 31 03:03 00009.lathe._var_amanda.0
-rw------- 1 amanda amanda    1024710 Oct 31 03:03 00010.shop._etc.0
-rw------- 1 amanda amanda      33689 Oct 31 03:03 00011.lathe._usr_local.0
-rw------- 1 amanda amanda     264766 Oct 31 03:03 00012.lathe._var_lib_amanda.0
-rw------- 1 amanda amanda     113999 Oct 31 03:04 00013.lathe._root.0
-rw------- 1 amanda amanda   22061583 Oct 31 03:04 00014.shop._lib_firmware.0
-rw------- 1 amanda amanda    1325898 Oct 31 03:04 00015.shop._usr_lib_amanda.0
-rw------- 1 amanda amanda   21973059 Oct 31 03:05 00016.lathe._lib_firmware.0
-rw------- 1 amanda amanda    1325856 Oct 31 03:05 00017.lathe._usr_lib_amanda.0
-rw------- 1 amanda amanda 3193481270 Oct 31 03:09 00018.coyote._home_gene_src.0
-rw------- 1 amanda amanda 3298505184 Oct 31 03:23 00019.coyote._home_gene.0
-rw------- 1 amanda amanda     163341 Oct 31 03:23 00020.coyote._home_gene_Downloads.2
-rw------- 1 amanda amanda       3123 Oct 31 03:23 00021.coyote._home_ups.0
-rw------- 1 amanda amanda 6189105783 Oct 31 03:28 00022.picnc._.0
-rw------- 1 amanda amanda   68618563 Oct 31 03:29 00023.shop._home.0
-rw------- 1 amanda amanda   36166458 Oct 31 03:29 00024.lathe._home.0
-rw------- 1 amanda amanda  827898079 Oct 31 03:34 00025.coyote._home_amanda.0
-rw------- 1 amanda amanda    1117337 Oct 31 03:34 00026.coyote._home_nut.0
-rw------- 1 amanda amanda   27806189 Oct 31 03:34 00027.coyote._home_gene_Mail.1
-rw------- 1 amanda amanda      45027 Oct 31 03:35 00028.coyote._home_gene_Download.1
-rw------- 1 amanda amanda  339806208 Oct 31 03:35 00029.coyote._usr_dlds_misc.0
-rw------- 1 amanda amanda  101449728 Oct 31 03:35 00030.coyote._boot.0
-rw------- 1 amanda amanda  751423488 Oct 31 03:35 00031.coyote._var.3
-rw------- 1 amanda amanda  631574528 Oct 31 03:35 00032.coyote._usr_bin.0
-rw------- 1 amanda amanda  589254403 Oct 31 03:37 00033.coyote._GenesAmandaHelper-0.61.3
-rw------- 1 amanda amanda    2623488 

Re: All level 0 on the same run?

2018-10-31 Thread Jose M Calhariz
On Wed, Oct 31, 2018 at 08:39:43AM -0400, Nathan Stratton Treadway wrote:
> On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> > yadda yadda. So just where the hell do I look for these?
> > root@coyote:/amandatapes/Dailys# su amanda -c "/usr/local/sbin/amadmin 
> > Daily due"|grep Overdue
> > Overdue 21 days: shop:/usr/local
> > Overdue 21 days: shop:/var/amanda
> > Overdue 21 days: lathe:/usr/local
> > Overdue 21 days: lathe:/var/amanda
> > Overdue 21 days: GO704:/var/amanda

I bet these DLEs are very small and that you did the upgrade of
the amanda server between 20 and 20+tapecycle runs ago.

There is a bug in recent amanda servers where, if a DLE is very small, it
will not properly update the internal database after a level 0,
making that DLE overdue.


> > 
> > So I look in the vtapes, and find its being done too.
> > Checking for lathe:/var/amanda, its there
> > Checking for lathe:/usr/local, they've been done too
> 
> What do "amadmin Daily info shop", etc. say?
> 
>   Nathan
> 
> Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
> Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
>  GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
>  Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
> 
>

Kind regards
Jose M Calhariz

-- 
--
Life is like rapadura: it is sweet, but it is not soft.




Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 08:50:55 -0400, Gene Heskett wrote:
> On Wednesday 31 October 2018 08:39:43 Nathan Stratton Treadway wrote:
> 
> > On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> > > yadda yadda. So just where the hell do I look for these?
> > > root@coyote:/amandatapes/Dailys# su amanda -c
> > > "/usr/local/sbin/amadmin Daily due"|grep Overdue
> > > Overdue 21 days: shop:/usr/local
> > > Overdue 21 days: shop:/var/amanda
> > > Overdue 21 days: lathe:/usr/local
> > > Overdue 21 days: lathe:/var/amanda
> > > Overdue 21 days: GO704:/var/amanda
> > >
> > > So I look in the vtapes, and find its being done too.
> > > Checking for lathe:/var/amanda, its there
> > > Checking for lathe:/usr/local, they've been done too
> >
> > What do "amadmin Daily info shop", etc. say?
> >
> That "info" directory does not exist, never has existed here that I can 

"info" is a subcommand of "amadmin", along the lines of "balance" and
"due".

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 08:34:21 Nathan Stratton Treadway wrote:

> On Wed, Oct 31, 2018 at 08:18:47 -0400, Gene Heskett wrote:
> > that link from tapelist.last_write -> 27062 is dead, there is no
> > 27062 file to be found. WTH is that? And how can that be causeing
> > the erroneous amadmin due's.
>
> (You successfully moved your server to Amanda 3.5.x, right?)
>
Yes, both are local builds, so all I need to do to switch is install the 
older version. All built in /home/amanda.

> The last_write symlink is used for locking or something; in any case
> it's supposed to point to a number (perhaps a sequence number?) rather
> than an actual file.

It does, looks like it might be a PID. A big one but...

> > So I'm apparently plumb bumfuzzled at this point. Help! Where is JLM
> > when I need him.
>
> (Yeah.)
>   Nathan
> --
>-- Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic
> region Ray Ontko & Co.  -  Software consulting services  -  
> http://www.ontko.com/ GPG Key:
> http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239 Key
> fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239



Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 08:39:43 Nathan Stratton Treadway wrote:

> On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> > yadda yadda. So just where the hell do I look for these?
> > root@coyote:/amandatapes/Dailys# su amanda -c
> > "/usr/local/sbin/amadmin Daily due"|grep Overdue
> > Overdue 21 days: shop:/usr/local
> > Overdue 21 days: shop:/var/amanda
> > Overdue 21 days: lathe:/usr/local
> > Overdue 21 days: lathe:/var/amanda
> > Overdue 21 days: GO704:/var/amanda
> >
> > So I look in the vtapes, and find its being done too.
> > Checking for lathe:/var/amanda, its there
> > Checking for lathe:/usr/local, they've been done too
>
> What do "amadmin Daily info shop", etc. say?
>
That "info" directory does not exist, never has existed here that I can 
recall.

>   Nathan
> --
>-- Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic
> region Ray Ontko & Co.  -  Software consulting services  -  
> http://www.ontko.com/ GPG Key:
> http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239 Key
> fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239



Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> yadda yadda. So just where the hell do I look for these?
> root@coyote:/amandatapes/Dailys# su amanda -c "/usr/local/sbin/amadmin 
> Daily due"|grep Overdue
> Overdue 21 days: shop:/usr/local
> Overdue 21 days: shop:/var/amanda
> Overdue 21 days: lathe:/usr/local
> Overdue 21 days: lathe:/var/amanda
> Overdue 21 days: GO704:/var/amanda
> 
> So I look in the vtapes, and find its being done too.
> Checking for lathe:/var/amanda, its there
> Checking for lathe:/usr/local, they've been done too

What do "amadmin Daily info shop", etc. say?
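
For example, wrapped the same way as your "due" command (adjust the
host/DLE to taste):

  su amanda -c "/usr/local/sbin/amadmin Daily info shop /usr/local"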

Nathan

Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 08:18:47 -0400, Gene Heskett wrote:
> that link from tapelist.last_write -> 27062 is dead, there is no 27062 
> file to be found. WTH is that? And how can that be causeing the 
> erroneous amadmin due's.

(You successfully moved your server to Amanda 3.5.x, right?)

The last_write symlink is used for locking or something; in any case
it's supposed to point to a number (perhaps a sequence number?) rather than
an actual file.

> So I'm apparently plumb bumfuzzled at this point. Help! Where is JLM when 
> I need him.

(Yeah.)
Nathan

Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Chris Nighswonger
FWIW, here is the output of amadmin balance before last nights run and
again this morning. No overdues, so I guess that's good. I'm not
experienced enough to make much of the balance percentages, but am now
wondering if I should work at breaking up the large DLEs into smaller
subsets as several have suggested.

root@scriptor:/home/manager# su backup -c "/usr/sbin/amadmin campus balance"

 due-date  #fs    orig kB     out kB   balance
------------------------------------------------
10/30 Tue   25  359009166  284972243   +102.1%

11/03 Sat   15  632526057  420083122   +197.9%
------------------------------------------------
TOTAL   40  991535223  705055365 141011073
  (estimated 5 runs per dumpcycle)
 (13 filesystems overdue. The most being overdue 1 day.)
root@scriptor:/home/manager# su backup -c "/usr/sbin/amadmin campus balance"

 due-date  #fs    orig kB     out kB   balance
------------------------------------------------
10/31 Wed    1          0          0      ---

11/03 Sat   39 1079730080  776153947   +400.0%
11/04 Sun    0          0          0      ---
------------------------------------------------
TOTAL   40 1079730080  776153947 155230789
  (estimated 5 runs per dumpcycle)


On Tue, Oct 30, 2018 at 3:56 PM Gene Heskett  wrote:

> On Tuesday 30 October 2018 15:29:37 Nathan Stratton Treadway wrote:
>
> > On Tue, Oct 30, 2018 at 14:20:55 -0400, Chris Nighswonger wrote:
> > > Why in the world does Amanda plan level 0 backups for all entries in
> > > a DLE for the same run This causes all sorts of problems.
> > >
> > > Is there any solution for this? I've read some of the creative
> > > suggestions, but it seems a bunch of trouble.
> >
> > The operation of Amanda's planner depends on many inputs, both "fixed"
> > (e.g. configuration options) and constantly-varying (e.g. estimate
> > sizes and dump history), and I suspect there are only a few people in
> > the world who really understand it fully -- and I don't know how many
> > of them still read this mailing list :(.  But even one of those people
> > would probably need to look at a lot of information in order to know
> > what exactly was going on.
> >
> >
> > The good news is that I have noticed that the planner records a bunch
> > of interesting information in the amdump.DATETIMESTAMP log file, so at
> > least that seems like the place to start investigated.  Look in
> > particular for the following sections: DONE QUEUE, ANALYZING
> > ESTIMATES, INITIAL SCHEDULE, DELAYING DUMPS IF NEEDED, PROMOTING DUMPS
> > IF NEEDED, and finally GENERATING SCHEDULE.
> >
> > In your case, it seems likely that the  PROMOTING DUMPS section should
> > have a bunch of activity listed; if so, that might explain what it's
> > "thinking".
> >
> > If that doesn't give a clear answer, does the INITIAL SCHEDULE section
> > show all the dumps are already scheduled for level 0?  If not, pick a
> > DLE that is not shown at level 0 there and follow it down the log to
> > see if you can figure out what stage bumps it back to level 0...
> >
> >
> > On a different track of investigation, the  output of "amadmin CONFIG
> > balance" might show something useful (though off hand it seems
> > unlikely to explain why _all_ DLEs would be switched to level 0).
> >
> >
> > Let us know what you find out :)
> >   Nathan
> >
> I just changed the length of the dumpcycle and runs percycle up to 10,
> about last friday while I was makeing the bump* stuff more attractive,
> but the above command returns that the are 5 filesystens out of date:
> su amanda -c "/usr/local/sbin/amadmin Daily balance"
>
>  due-date  #fs   orig MB   out MB   balance
> --------------------------------------------
> 10/30 Tue    5        0        0       ---
> 10/31 Wed    1    17355     8958     -45.3%
> 11/01 Thu    2    10896    10887     -33.5%
> 11/02 Fri    4    35944     9298     -43.2%
> 11/03 Sat    4    14122    10835     -33.8%
> 11/04 Sun    3    57736    57736    +252.7%
> 11/05 Mon    2    39947    30635     +87.1%
> 11/06 Tue    8     4235     4215     -74.3%
> 11/07 Wed    4    19503    14732     -10.0%
> 11/08 Thu   32    31783    16408      +0.2%
> --------------------------------------------
> TOTAL       65   231521   163704     16370
>   (estimated 10 runs per dumpcycle)
>  (5 filesystems overdue. The most being overdue 20 days.)
>
> That last line is disturbing. Ideas anyone? I'll certainly keep an eye on
> it.
>
> Cheers & thanks, Gene Heskett
> --
> "There are four boxes to be used in defense of liberty:
>  soap, ballot, jury, and ammo. Please use in that order."
> -Ed Howdershelt (Author)
> Genes Web page 
>


Re: All level 0 on the same run?

2018-10-31 Thread Chris Nighswonger
So, looking at this more, it may be self-inflicted. Last week I changed
blocksize to 512k, and began running amrmtape and amlabel on the oldest tape
first, working backward day by day. I run backups 5 nights per week with
a cycle of 13 tapes (see below). I would have thought that this would have
allowed the change in blocksize to run seamlessly. Maybe not. I'm now
suspecting that running amrmtape --cleanup caused Amanda to bork and fall
back to level 0 backups. She did this two nights in a row!!!

Anyway, I'm going to hold off any further concerns until I finish a
complete tapecycle. If the problem continues after that point, I'll pick
back up.

Relevant lines from amanda.conf:

dumpcycle 5 days
runspercycle 5
tapecycle 13 tapes
runtapes 1
flush-threshold-dumped 50
bumpsize 10 Mbytes
bumppercent 0
bumpmult 1.5
bumpdays 2

Kind regards,
Chris

On Tue, Oct 30, 2018 at 2:32 PM Debra S Baddorf  wrote:

> Is this the first backup run for a long while?  If so, then they are all
> DUE, so amanda feels it has to schedule them all, now.
>
> Is this the first backup ever?   Ditto above.
>
> Did you perhaps run  “amadmin <config> force <host> *”  which forces a level 0
> on all disks.
> Did you specify  “strategy noinc” which does the same?
> Or  "skip-incr yes”  ?  Ditto.
>
> Did you replace a whole disk, making all the files look like they’ve never
> been backed up?
>
> Okay,  failing all the above obvious reasons,  I’ll leave others to
> discuss “planner” reasons.  Sorry!
> Deb Baddorf
> Fermilab
>
> > On Oct 30, 2018, at 1:20 PM, Chris Nighswonger <
> cnighswon...@foundations.edu> wrote:
> >
> > Why in the world does Amanda plan level 0 backups for all entries in a
> DLE for the same run This causes all sorts of problems.
> > Is there any solution for this? I've read some of the creative
> suggestions, but it seems a bunch of trouble.
> > Kind regards,
> > Chris
> >
> > 0 19098649k waiting for dumping
> > 0 9214891k waiting for dumping
> > 0 718824k waiting for dumping
> > 0 365207k waiting for dumping
> > 0 2083027k waiting for dumping
> > 0 3886869k waiting for dumping
> > 0 84910k waiting for dumping
> > 0 22489k dump done (7:23:34), waiting for writing to tape
> > 0 304k dump done (7:22:30), waiting for writing to tape
> > 0 2613k waiting for dumping
> > 0 30k dump done (7:23:07), waiting for writing to tape
> > 0 39642k dump done (7:23:07), waiting for writing to tape
> > 0 8513409k waiting for dumping
> > 0 39519558k waiting for dumping
> > 0 47954k waiting for dumping
> > 0 149877984k dumping 145307840k ( 96.95%) (7:22:15)
> > 0 742804k waiting for dumping
> > 0 88758k waiting for dumping
> > 0 12463k dump done (7:24:19), waiting for writing to tape
> > 0 5544352k waiting for dumping
> > 0 191676480k waiting for dumping
> > 0 3799277k waiting for dumping
> > 0 3177171k waiting for dumping
> > 0 11058544k waiting for dumping
> > 0 230026440k dump done (7:22:13), waiting for writing to tape
> > 0 8k dump done (7:24:24), waiting for writing to tape
> > 0 184k dump done (7:24:19), waiting for writing to tape
> > 0 1292009k waiting for dumping
> > 0 2870k dump done (7:23:23), waiting for writing to tape
> > 0 13893263k waiting for dumping
> > 0 6025026k waiting for dumping
> > 0 6k dump done (7:22:15), waiting for writing to tape
> > 0 42k dump done (7:24:24), waiting for writing to tape
> > 0 53k dump done (7:24:19), waiting for writing to tape
> > 0 74462169k waiting for dumping
> > 0 205032k waiting for dumping
> > 0 32914k waiting for dumping
> > 0 1k dump done (7:24:02), waiting for writing to tape
> > 0 854272k waiting for dumping
> >
>
>


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 07:02:53 Nathan Stratton Treadway wrote:

> On Wed, Oct 31, 2018 at 06:32:41 -0400, Gene Heskett wrote:
> > I'll see if I can find the logs, I assume on the clients marked
> > guilty?
>
> Personally I'd probably start with "amstatus" on the server to see if
> it said anything about the DLEs in question, then maybe look into the
> amdump.1 log file (the one mentioned at the top of the amstatus
> report) for more details on that DLE.  If there is evidence in those
> places that it actually tried contacting the client to kick off a
> dump, that would tell me it was worth going over to the client's logs
> to try to track down those specific requests.
>
>   Nathan
Not it at all Nathan, the backups are in the vtapes with zero errors 
logged. See my previous post a few minutes ago. There is either 
something goofy in the $config or a nilmerg, and the only thing I see 
thats odd is:

root@coyote:/usr/local/etc/amanda/Daily# ls -l
total 132
-rw-r--r-- 1 amanda disk   21488 Oct 25  2005 3hole.ps
-rw-r--r-- 1 amanda disk5887 Oct 25  2005 8.5x11.ps
-rw-r--r-- 1 amanda disk   25423 Oct 28 05:19 amanda.conf
-rw-r--r-- 1 amanda disk   24655 Apr 20  2012 amanda.conf~
-rw------- 1 amanda disk     222 Oct  4 04:11 chg-disk
-rw-r--r-- 1 amanda disk       2 Aug 24 13:42 chg-disk-access
-rw-r--r-- 1 amanda disk       3 Aug 24 13:42 chg-disk-clean
-rw-r--r-- 1 amanda disk       2 Aug 24 13:42 chg-disk-slot
-rw-r--r-- 1 amanda disk     765 May 22  2004 chg-scsi.conf
-rw------- 1 amanda disk      18 Oct 31 03:44 command_file
-rw-r--r-- 1 amanda disk    3977 Aug 30 06:28 disklist
-rw-r--r-- 1 amanda disk    5002 Apr  3  2012 disklist~
-rw------- 1 amanda amanda  3809 Oct 31 03:03 tapelist
-rw------- 1 amanda disk    1071 Aug 24 13:22 tapelist.amlabel
lrwxrwxrwx 1 amanda amanda     5 Oct 31 03:03 tapelist.last_write -> 27062
-rw------- 1 amanda disk       0 Aug 31 03:03 tapelist.lock

that link from tapelist.last_write -> 27062 is dead; there is no 27062
file to be found. WTH is that? And how can that be causing the
erroneous amadmin due's?

So I'm apparently plumb bumfuzzled at this point. Help! Where is JLM when 
I need him.

-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 06:32:41 Gene Heskett wrote:

> On Tuesday 30 October 2018 17:56:33 Gene Heskett wrote:
> > On Tuesday 30 October 2018 16:45:50 Nathan Stratton Treadway wrote:
> > > On Tue, Oct 30, 2018 at 15:51:36 -0400, Gene Heskett wrote:
> > > > I just changed the length of the dumpcycle and runs percycle up
> > > > to 10, about last friday while I was makeing the bump* stuff
> > > > more attractive, but the above command returns that the are 5
> > > > filesystens out of date: su amanda -c "/usr/local/sbin/amadmin
> > > > Daily balance"
> > > >
> > > >  due-date  #fs   orig MB   out MB   balance
> > > > --------------------------------------------
> > > > 10/30 Tue    5        0        0       ---
> > > > 10/31 Wed    1    17355     8958     -45.3%
> > > > 11/01 Thu    2    10896    10887     -33.5%
> > > > 11/02 Fri    4    35944     9298     -43.2%
> > > > 11/03 Sat    4    14122    10835     -33.8%
> > > > 11/04 Sun    3    57736    57736    +252.7%
> > > > 11/05 Mon    2    39947    30635     +87.1%
> > > > 11/06 Tue    8     4235     4215     -74.3%
> > > > 11/07 Wed    4    19503    14732     -10.0%
> > > > 11/08 Thu   32    31783    16408      +0.2%
> > > > --------------------------------------------
> > > > TOTAL       65   231521   163704     16370
> > > >   (estimated 10 runs per dumpcycle)
> > > >  (5 filesystems overdue. The most being overdue 20 days.)
> > > >
> > > > That last line is disturbing. Ideas anyone? I'll certainly keep
> > > > an eye on it.
> > >
> > > Did you already run today's (10/30's) dump?  Assuming you are
> > > running one dump per day, running the "balance" command after the
> > > dump has occurred generates confusing output, because the output
> > > includes a line for today's date but actually no (new) dumps will
> > > be happening today. So if you are able to run the balance command
> > > before a particular day's dump (but on the day in question), the
> > > output is a little bit more helpful
> > >
> > > Anyway by "last line" are you talking about the "overdue" line?
> > > "amadmin ... due" should tell you which DLEs are overdue, and you
> > > can then look back through through your Amanda Mail Reports to see
> > > if there's any indication of why they are overdue (especially the
> > > one that's 20 days overdue...)
> > >
> > > Anyway, the other thing that jumps out from the above listing is
> > > the line for 11/4, with a balance of 250%.  We can't tell from the
> > > listing what the relatives sizes of the three DLEs in question
> > > are, though. Here's where a true Amanda expert could advise you
> > > better, but off hand I'd guess that one particule one of those
> > > three DLEs is much larger than all the rest of your DLEs and that
> > > fact is making it hard for the planner to come up with 10
> > > consecutive days of plans that when taken as a group actually work
> > > out to functional cycle
> > >
> > > ("amadmin ... due" should help you figure out which three DLEs are
> > > scheduled for that day, if you don't already know off hand which
> > > one is super large.)
> > >
> > > Hmmm, it would be interesting to know if the the super-large DLE
> > > is also the one that's 20 days overdue.  Perhaps its so big it
> > > can't fit on a tape, or something?
> > >
> > >   Nathan
> >
> > Its rigged to use 2 40GB vtapes if it has to.  But I found some
> > amandates problems in my kicking the tires so we'll see what it does
> > for tonight's after midnight run.
>
> And my perms & linkages fixes for amandates didn't help a bit, this
> morning after the run, its still showing the same 5 dle's as 21 days
> overdue.
>
> I'll see if I can find the logs, I assume on the clients marked
> guilty?
>
Apparently not on the client; I've read every 20181031 log
containing /usr/local, no failures there. Back to here I guess. Trawl
thru another 20 megs of logs, no "fail" to be found by grep for this
morning's date.  Humm, go look in /amandatapes/Dailys/data, and lo and
behold even, that backup was done!  Picked out of 65 files:
-rw------- 1 amanda amanda   13353012 Oct 31 03:44 00064.GO704._usr_local.0

Sooo, lets backup a day, and to slot-26 in that tree.
did a level0 on the 30th
did a level1 on the 29th
did a level0 on the 28th
did a level1 on the 27th

yadda yadda. So just where the hell do I look for these?
root@coyote:/amandatapes/Dailys# su amanda -c "/usr/local/sbin/amadmin 
Daily due"|grep Overdue
Overdue 21 days: shop:/usr/local
Overdue 21 days: shop:/var/amanda
Overdue 21 days: lathe:/usr/local
Overdue 21 days: lathe:/var/amanda
Overdue 21 days: GO704:/var/amanda

So I look in the vtapes, and find its being done too.
Checking for lathe:/var/amanda, its there
Checking for lathe:/usr/local, they've been done too

I think I need a beer and its not even 8am yet...



> > Funny is that 5 machines are still running wheezy, but the default
> > names in both 

Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 06:32:41 -0400, Gene Heskett wrote:
> I'll see if I can find the logs, I assume on the clients marked guilty?

Personally I'd probably start with "amstatus" on the server to see if it
said anything about the DLEs in question, then maybe look into the
amdump.1 log file (the one mentioned at the top of the amstatus report)
for more details on that DLE.  If there is evidence in those places that
it actually tried contacting the client to kick off a dump, that would
tell me it was worth going over to the client's logs to try to track
down those specific requests.
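
For example -- the paths here are guesses; the amdump.* files live
under whatever "logdir" points to in your amanda.conf:

  $ su amanda -c "/usr/local/sbin/amstatus Daily --file /var/log/amanda/Daily/amdump.1"
  $ grep -n 'lathe /usr/local' /var/log/amanda/Daily/amdump.1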

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Tuesday 30 October 2018 17:56:33 Gene Heskett wrote:

> On Tuesday 30 October 2018 16:45:50 Nathan Stratton Treadway wrote:
> > On Tue, Oct 30, 2018 at 15:51:36 -0400, Gene Heskett wrote:
> > > I just changed the length of the dumpcycle and runs percycle up to
> > > 10, about last friday while I was makeing the bump* stuff more
> > > attractive, but the above command returns that the are 5
> > > filesystens out of date: su amanda -c "/usr/local/sbin/amadmin
> > > Daily balance"
> > >
> > >  due-date  #fs   orig MB   out MB   balance
> > > --------------------------------------------
> > > 10/30 Tue    5        0        0       ---
> > > 10/31 Wed    1    17355     8958     -45.3%
> > > 11/01 Thu    2    10896    10887     -33.5%
> > > 11/02 Fri    4    35944     9298     -43.2%
> > > 11/03 Sat    4    14122    10835     -33.8%
> > > 11/04 Sun    3    57736    57736    +252.7%
> > > 11/05 Mon    2    39947    30635     +87.1%
> > > 11/06 Tue    8     4235     4215     -74.3%
> > > 11/07 Wed    4    19503    14732     -10.0%
> > > 11/08 Thu   32    31783    16408      +0.2%
> > > --------------------------------------------
> > > TOTAL       65   231521   163704     16370
> > >   (estimated 10 runs per dumpcycle)
> > >  (5 filesystems overdue. The most being overdue 20 days.)
> > >
> > > That last line is disturbing. Ideas anyone? I'll certainly keep an
> > > eye on it.
> >
> > Did you already run today's (10/30's) dump?  Assuming you are
> > running one dump per day, running the "balance" command after the
> > dump has occurred generates confusing output, because the output
> > includes a line for today's date but actually no (new) dumps will be
> > happening today. So if you are able to run the balance command
> > before a particular day's dump (but on the day in question), the
> > output is a little bit more helpful
> >
> > Anyway by "last line" are you talking about the "overdue" line?
> > "amadmin ... due" should tell you which DLEs are overdue, and you
> > can then look back through through your Amanda Mail Reports to see
> > if there's any indication of why they are overdue (especially the
> > one that's 20 days overdue...)
> >
> > Anyway, the other thing that jumps out from the above listing is the
> > line for 11/4, with a balance of 250%.  We can't tell from the
> > listing what the relatives sizes of the three DLEs in question are,
> > though. Here's where a true Amanda expert could advise you better,
> > but off hand I'd guess that one particule one of those three DLEs is
> > much larger than all the rest of your DLEs and that fact is making
> > it hard for the planner to come up with 10 consecutive days of plans
> > that when taken as a group actually work out to functional cycle
> >
> > ("amadmin ... due" should help you figure out which three DLEs are
> > scheduled for that day, if you don't already know off hand which one
> > is super large.)
> >
> > Hmmm, it would be interesting to know if the the super-large DLE is
> > also the one that's 20 days overdue.  Perhaps its so big it can't
> > fit on a tape, or something?
> >
> > Nathan
>
> Its rigged to use 2 40GB vtapes if it has to.  But I found some
> amandates problems in my kicking the tires so we'll see what it does
> for tonight's after midnight run.
>
And my perms & linkages fixes for amandates didn't help a bit, this 
morning after the run, its still showing the same 5 dle's as 21 days 
overdue.

I'll see if I can find the logs, I assume on the clients marked guilty?

> Funny is that 5 machines are still running wheezy, but the default
> names in both /etc/passwd and in /etc/group don't match! One of then
> isn't a Wurlitzer, but since they were installed at different times
> over the history here...  Who knows and I'm too lazy to lock
> in /var/cache/apt/archives to check  versions. We'll see what happens
> tonight.
>
> > 
> >-- -- Nathan Stratton Treadway  -  natha...@ontko.com  - 
> > Mid-Atlantic region Ray Ontko & Co.  -  Software consulting services
> >  -
> > http://www.ontko.com/ GPG Key:
> > http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239 Key
> > fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
>
> Copyright 2018 by Maurice E. Heskett



Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-10-30 Thread Gene Heskett
On Tuesday 30 October 2018 16:45:50 Nathan Stratton Treadway wrote:

> On Tue, Oct 30, 2018 at 15:51:36 -0400, Gene Heskett wrote:
> > I just changed the length of the dumpcycle and runs percycle up to
> > 10, about last friday while I was makeing the bump* stuff more
> > attractive, but the above command returns that the are 5 filesystens
> > out of date: su amanda -c "/usr/local/sbin/amadmin Daily balance"
> >
> >  due-date  #fs   orig MB   out MB   balance
> > --------------------------------------------
> > 10/30 Tue    5        0        0       ---
> > 10/31 Wed    1    17355     8958     -45.3%
> > 11/01 Thu    2    10896    10887     -33.5%
> > 11/02 Fri    4    35944     9298     -43.2%
> > 11/03 Sat    4    14122    10835     -33.8%
> > 11/04 Sun    3    57736    57736    +252.7%
> > 11/05 Mon    2    39947    30635     +87.1%
> > 11/06 Tue    8     4235     4215     -74.3%
> > 11/07 Wed    4    19503    14732     -10.0%
> > 11/08 Thu   32    31783    16408      +0.2%
> > --------------------------------------------
> > TOTAL       65   231521   163704     16370
> >   (estimated 10 runs per dumpcycle)
> >  (5 filesystems overdue. The most being overdue 20 days.)
> >
> > That last line is disturbing. Ideas anyone? I'll certainly keep an
> > eye on it.
>
> Did you already run today's (10/30's) dump?  Assuming you are running
> one dump per day, running the "balance" command after the dump has
> occurred generates confusing output, because the output includes a
> line for today's date but actually no (new) dumps will be happening
> today. So if you are able to run the balance command before a
> particular day's dump (but on the day in question), the output is a
> little bit more helpful
>
> Anyway by "last line" are you talking about the "overdue" line?
> "amadmin ... due" should tell you which DLEs are overdue, and you can
> then look back through through your Amanda Mail Reports to see if
> there's any indication of why they are overdue (especially the one
> that's 20 days overdue...)
>
> Anyway, the other thing that jumps out from the above listing is the
> line for 11/4, with a balance of 250%.  We can't tell from the listing
> what the relatives sizes of the three DLEs in question are, though.
> Here's where a true Amanda expert could advise you better, but off
> hand I'd guess that one particule one of those three DLEs is much
> larger than all the rest of your DLEs and that fact is making it hard
> for the planner to come up with 10 consecutive days of plans that when
> taken as a group actually work out to functional cycle
>
> ("amadmin ... due" should help you figure out which three DLEs are
> scheduled for that day, if you don't already know off hand which one
> is super large.)
>
> Hmmm, it would be interesting to know if the the super-large DLE is
> also the one that's 20 days overdue.  Perhaps its so big it can't fit
> on a tape, or something?
>
>   Nathan

Its rigged to use 2 40GB vtapes if it has to.  But I found some amandates 
problems in my kicking the tires so we'll see what it does for tonight's 
after midnight run.

Funny is that 5 machines are still running wheezy, but the default names 
in both /etc/passwd and in /etc/group don't match! One of them isn't a 
Wurlitzer, but since they were installed at different times over the 
history here...  Who knows, and I'm too lazy to look 
in /var/cache/apt/archives to check versions. We'll see what happens 
tonight.
>
> --
>-- Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic
> region Ray Ontko & Co.  -  Software consulting services  -  
> http://www.ontko.com/ GPG Key:
> http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239 Key
> fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239



Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-10-30 Thread Nathan Stratton Treadway
On Tue, Oct 30, 2018 at 16:27:29 -0400, Gene Heskett wrote:
> On Tuesday 30 October 2018 15:31:59 Nathan Stratton Treadway wrote:
> 
> > On Tue, Oct 30, 2018 at 15:29:37 -0400, Nathan Stratton Treadway wrote:
> > > On a different track of investigation, the  output of "amadmin
> > > CONFIG balance" might show something useful (though off hand it
> > > seems unlikely
> >
> > Also, "amadmin CONFIG due".
> >
> > Nathan
> So I used this to lookup the overdues.

Yes, good.

> My first impression is that the directory is locked as if they are   
> locked from access by amgtar:

Why do you think that's the problem?

> From a root shell on this machine, the server: after 
> making /etc/amandates amanda:disk from a shell on that machine:
> root@coyote:/usr/local/etc/amanda/Daily# su 
> amanda -c "/usr/local/sbin/amcheck Daily"
> Amanda Tape Server Host Check
> -
> NOTE: Holding disk '/usr/dumps': 641240 MB disk space available, using 
> 640740 MB
> Searching for label 'Dailys-27':found in slot 27: volume 'Dailys-27'
> Will write to volume 'Dailys-27' in slot 27.
> NOTE: skipping tape-writable test
> Server check took 1.473 seconds
> Amanda Backup Client Hosts Check
> 
> ERROR: shop: [can not read/write /var/amanda/amandates: Permission denied 
> (ruid:63998 euid:63998)
> Client check: 5 hosts checked in 3.280 seconds.  1 problem found.
> (brought to you by Amanda 3.5.1)
[...]> 
> So it looks as if I need to check for similar missfits on other machines, 
> and this should fix the Overdue reports for that machine. But why didn't 
> amcheck yell about that before now? It hasn't, so thats a hellofagood ?

You can see that amcheck actually does complain when the amandates
permissions are wrong, so off hand the fact that amcheck didn't
complain earlier makes me think that's not the true reason these DLEs
haven't been backed up.

I would say look carefully through your Amanda Mail Reports and/or
amdump.DATETIMESTAMP files for the past 20 days to see if there is any
mention of what's happening with those DLEs



> All this is the debian way the client is installed.

(Obviously this is not the way the Debian-distribution packages install
things, but rather the way you have built Amanda yourself and then
installed it on your Debian-running client machines.  [The normal
Debian packages use "backup" as the owner.])
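
(So on a self-built client the fix is presumably just a matter of
matching the file's owner to whatever user amandad runs as there -- a
sketch, run as root on the client, with the user/group names being
assumptions:

  # chown amanda:disk /var/amanda/amandates
  # chmod 660 /var/amanda/amandates
)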


Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-30 Thread Nathan Stratton Treadway
On Tue, Oct 30, 2018 at 15:51:36 -0400, Gene Heskett wrote:
> I just changed the length of the dumpcycle and runs percycle up to 10, 
> about last friday while I was makeing the bump* stuff more attractive, 
> but the above command returns that the are 5 filesystens out of date:
> su amanda -c "/usr/local/sbin/amadmin Daily balance"
> 
>  due-date  #fs   orig MB   out MB   balance
> --------------------------------------------
> 10/30 Tue    5        0        0       ---
> 10/31 Wed    1    17355     8958     -45.3%
> 11/01 Thu    2    10896    10887     -33.5%
> 11/02 Fri    4    35944     9298     -43.2%
> 11/03 Sat    4    14122    10835     -33.8%
> 11/04 Sun    3    57736    57736    +252.7%
> 11/05 Mon    2    39947    30635     +87.1%
> 11/06 Tue    8     4235     4215     -74.3%
> 11/07 Wed    4    19503    14732     -10.0%
> 11/08 Thu   32    31783    16408      +0.2%
> --------------------------------------------
> TOTAL       65   231521   163704     16370
>   (estimated 10 runs per dumpcycle)
>  (5 filesystems overdue. The most being overdue 20 days.)
> 
> That last line is disturbing. Ideas anyone? I'll certainly keep an eye on 
> it. 

Did you already run today's (10/30's) dump?  Assuming you are running
one dump per day, running the "balance" command after the dump has
occurred generates confusing output, because the output includes a line
for today's date but actually no (new) dumps will be happening today. 
So if you are able to run the balance command before a particular day's
dump (but on the day in question), the output is a little bit more
helpful

Anyway by "last line" are you talking about the "overdue" line? 
"amadmin ... due" should tell you which DLEs are overdue, and you can
then look back through through your Amanda Mail Reports to see if
there's any indication of why they are overdue (especially the one
that's 20 days overdue...)

Anyway, the other thing that jumps out from the above listing is the
line for 11/4, with a balance of 250%.  We can't tell from the listing
what the relatives sizes of the three DLEs in question are, though. 
Here's where a true Amanda expert could advise you better, but off hand
I'd guess that one particule one of those three DLEs is much larger than
all the rest of your DLEs and that fact is making it hard for the
planner to come up with 10 consecutive days of plans that when taken as
a group actually work out to a functional cycle.

("amadmin ... due" should help you figure out which three DLEs are
scheduled for that day, if you don't already know off hand which one is
super large.) 

Hmmm, it would be interesting to know if the super-large DLE is also
the one that's 20 days overdue.  Perhaps it's so big it can't fit on a
tape, or something?
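
(One quick way to sanity-check that theory -- HOST and DISK below are
placeholders for whichever DLE turns out to be the huge one, and I'm
assuming your tapetype definition lives in the main amanda.conf:

   su amanda -c "/usr/local/sbin/amadmin Daily info HOST DISK"
   grep -A 5 'define tapetype' /usr/local/etc/amanda/Daily/amanda.conf

If the level-0 origK/compK numbers in the "info" output are larger than
the tapetype's length, that by itself would be one explanation for why
the planner never manages to get it dumped.)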

Nathan



Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-30 Thread Gene Heskett
On Tuesday 30 October 2018 15:31:59 Nathan Stratton Treadway wrote:

> On Tue, Oct 30, 2018 at 15:29:37 -0400, Nathan Stratton Treadway wrote:
> > On a different track of investigation, the  output of "amadmin
> > CONFIG balance" might show something useful (though off hand it
> > seems unlikely
>
> Also, "amadmin CONFIG due".
>
>   Nathan
So I used this to look up the overdues.

My first impression is that those directories are locked, as if amgtar 
is being denied access to them:
root@coyote:/usr/local/etc/amanda/Daily# su 
amanda -c "/usr/local/sbin/amadmin Daily due"|grep Overdue
Overdue 20 days: shop:/usr/local
Overdue 20 days: shop:/var/amanda
Overdue 20 days: lathe:/usr/local
Overdue 20 days: lathe:/var/amanda
Overdue 20 days: GO704:/var/amanda

What are the perms supposed to be?

cd-ing over to shop's /etc (via sshfs) and looking at amandates:
gene@coyote:/sshnet/shop/etc$ cat amandates
/etc 0 1540882971
/etc 1 1540796552
/home 0 1540884324
/home 1 1540796542
/lib/firmware 0 1540882919
/lib/firmware 1 1540796545
/root 0 1540882911
/root 1 1540796549
/usr/lib/amanda 0 1540882975
/usr/lib/amanda 1 1540796556
/usr/local 0 1540882908
/usr/local 1 1538809433
/var/amanda 0 1540882905
/var/amanda 1 1538636569
/var/lib/amanda 0 1540882914
/var/lib/amanda 1 1540796566
gene@coyote:/sshnet/shop/etc$ ls -l amandates
-rw-r- 1 63998 disk 380 Oct 30 03:26 amandates

But it takes root to do a chown as sshfs only allows gene to modify his 
own stuff. I've got two paths to that machine, ssh has me logged in.
From a root shell on this machine, the server: after 
making /etc/amandates amanda:disk from a shell on that machine:
root@coyote:/usr/local/etc/amanda/Daily# su 
amanda -c "/usr/local/sbin/amcheck Daily"
Amanda Tape Server Host Check
-
NOTE: Holding disk '/usr/dumps': 641240 MB disk space available, using 
640740 MB
Searching for label 'Dailys-27':found in slot 27: volume 'Dailys-27'
Will write to volume 'Dailys-27' in slot 27.
NOTE: skipping tape-writable test
Server check took 1.473 seconds
Amanda Backup Client Hosts Check

ERROR: shop: [can not read/write /var/amanda/amandates: Permission denied 
(ruid:63998 euid:63998)
Client check: 5 hosts checked in 3.280 seconds.  1 problem found.
(brought to you by Amanda 3.5.1)

Then I made it owned by amandabackup:disk

root@coyote:/usr/local/etc/amanda/Daily# su 
amanda -c "/usr/local/sbin/amcheck Daily"
Amanda Tape Server Host Check
-
NOTE: Holding disk '/usr/dumps': 641240 MB disk space available, using 
640740 MB
Searching for label 'Dailys-27':found in slot 27: volume 'Dailys-27'
Will write to volume 'Dailys-27' in slot 27.
NOTE: skipping tape-writable test
Server check took 0.272 seconds
Amanda Backup Client Hosts Check

Client check: 5 hosts checked in 3.091 seconds.  0 problems found.
(brought to you by Amanda 3.5.1)

So it looks as if I need to check for similar misfits on other machines, 
and this should fix the Overdue reports for that machine. But why didn't 
amcheck yell about that before now? It hasn't, so that's a hellofagood question.

All this is the debian way the client is installed.

I'll fix the other two and we'll see what happens at 3:45 am tomorrow.
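
Probably something like this from the server, assuming I can still ssh
in as root on each of those boxes (paths per what amcheck complained
about above; adjust if a client keeps its amandates somewhere else):

# check amandates ownership on each client that showed up as overdue
for h in shop lathe GO704; do
    ssh root@$h "ls -l /var/amanda/amandates /etc/amandates 2>/dev/null"
done

then a chown amandabackup:disk on whichever ones show the wrong owner,
same as what fixed shop above.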

> --
>-- Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic
> region Ray Ontko & Co.  -  Software consulting services  -  
> http://www.ontko.com/ GPG Key:
> http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239 Key
> fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239



Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 



Re: All level 0 on the same run?

2018-10-30 Thread Gene Heskett
On Tuesday 30 October 2018 15:29:37 Nathan Stratton Treadway wrote:

> On Tue, Oct 30, 2018 at 14:20:55 -0400, Chris Nighswonger wrote:
> > Why in the world does Amanda plan level 0 backups for all entries in
> > a DLE for the same run? This causes all sorts of problems.
> >
> > Is there any solution for this? I've read some of the creative
> > suggestions, but it seems a bunch of trouble.
>
> The operation of Amanda's planner depends on many inputs, both "fixed"
> (e.g. configuration options) and constantly-varying (e.g. estimate
> sizes and dump history), and I suspect there are only a few people in
> the world who really understand it fully -- and I don't know how many
> of them still read this mailing list :(.  But even one of those people
> would probably need to look at a lot of information in order to know
> what exactly was going on.
>
>
> The good news is that I have noticed that the planner records a bunch
> of interesting information in the amdump.DATETIMESTAMP log file, so at
> least that seems like the place to start investigating.  Look in
> particular for the following sections: DONE QUEUE, ANALYZING
> ESTIMATES, INITIAL SCHEDULE, DELAYING DUMPS IF NEEDED, PROMOTING DUMPS
> IF NEEDED, and finally GENERATING SCHEDULE.
>
> In your case, it seems likely that the  PROMOTING DUMPS section should
> have a bunch of activity listed; if so, that might explain what it's
> "thinking".
>
> If that doesn't give a clear answer, does the INITIAL SCHEDULE section
> show all the dumps are already scheduled for level 0?  If not, pick a
> DLE that is not shown at level 0 there and follow it down the log to
> see if you can figure out what stage bumps it back to level 0...
>
>
> On a different track of investigation, the  output of "amadmin CONFIG
> balance" might show something useful (though off hand it seems
> unlikely to explain why _all_ DLEs would be switched to level 0).
>
>
> Let us know what you find out :)
>   Nathan
>
I just changed the length of the dumpcycle and runspercycle up to 10, 
about last Friday while I was making the bump* stuff more attractive, 
but the above command returns that there are 5 filesystems out of date:
su amanda -c "/usr/local/sbin/amadmin Daily balance"

 due-date     #fs    orig MB     out MB    balance
 --------------------------------------------------
 10/30 Tue      5          0          0       ---
 10/31 Wed      1      17355       8958     -45.3%
 11/01 Thu      2      10896      10887     -33.5%
 11/02 Fri      4      35944       9298     -43.2%
 11/03 Sat      4      14122      10835     -33.8%
 11/04 Sun      3      57736      57736    +252.7%
 11/05 Mon      2      39947      30635     +87.1%
 11/06 Tue      8       4235       4215     -74.3%
 11/07 Wed      4      19503      14732     -10.0%
 11/08 Thu     32      31783      16408      +0.2%
 --------------------------------------------------
 TOTAL         65     231521     163704     16370
  (estimated 10 runs per dumpcycle)
 (5 filesystems overdue. The most being overdue 20 days.)

That last line is disturbing. Ideas anyone? I'll certainly keep an eye on 
it. 

Cheers & thanks, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-10-30 Thread Gene Heskett
On Tuesday 30 October 2018 14:32:14 Debra S Baddorf wrote:

> Is this the first backup run for a long while?  If so, then they are
> all DUE, so amanda feels it has to schedule them all, now.
>
> Is this the first backup ever?   Ditto above.
>
> Did you perhaps run  “amadmin CONFIG force HOST *”  which forces a
> level 0 on all disks? Did you specify  “strategy noinc” which does
> the same?
> Or  “skip-incr yes”  ?  Ditto.
>
> Did you replace a whole disk, making all the files look like they’ve
> never been backed up?
>
> Okay,  failing all the above obvious reasons,  I’ll leave others to
> discuss “planner” reasons.  Sorry! Deb Baddorf
> Fermilab

Them is dangerous waters, Deb, the defaults aren't anywhere near optimum 
for my smaller system. Only 65 dle's now but may get expanded by next 
year. See my previous post, but it's not quite ideal yet.  Yes, planner 
needs to issue the commands it says it's going to do in the email, but 
the promoter overrides the planner's first inclination, too often IMO. So 
I'm doing level0's too frequently yet. This effect is enhanced if the 
dle's vary greatly in size, so it's best to split the really big ones 
into smaller pieces.
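
One way to do the splitting -- only a sketch, the host, dumptype name
and patterns are made up for illustration and need adjusting to
whatever is actually in the big directory -- is two disklist entries
pointing at the same directory, with include/exclude dividing the
contents between them:

  # first half of /home, everything starting a-m
  shop  /home-a-m  /home  {
      comp-user-tar
      include "./[a-m]*"
  }
  # second half, everything else
  shop  /home-n-z  /home  {
      comp-user-tar
      exclude "./[a-m]*"
  }

Amanda then treats the two halves as independent DLEs and can schedule
their level 0s on different runs.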

> > On Oct 30, 2018, at 1:20 PM, Chris Nighswonger
> >  wrote:
> >
> > Why in the world does Amanda plan level 0 backups for all entries in
> > a DLE for the same run? This causes all sorts of problems. Is
> > there any solution for this? I've read some of the creative
> > suggestions, but it seems a bunch of trouble. Kind regards,
> > Chris
> >
> > 0 19098649k waiting for dumping
> > 0 9214891k waiting for dumping
> > 0 718824k waiting for dumping
> > 0 365207k waiting for dumping
> > 0 2083027k waiting for dumping
> > 0 3886869k waiting for dumping
> > 0 84910k waiting for dumping
> > 0 22489k dump done (7:23:34), waiting for writing to tape
> > 0 304k dump done (7:22:30), waiting for writing to tape
> > 0 2613k waiting for dumping
> > 0 30k dump done (7:23:07), waiting for writing to tape
> > 0 39642k dump done (7:23:07), waiting for writing to tape
> > 0 8513409k waiting for dumping
> > 0 39519558k waiting for dumping
> > 0 47954k waiting for dumping
> > 0 149877984k dumping 145307840k ( 96.95%) (7:22:15)
> > 0 742804k waiting for dumping
> 0 88758k waiting for dumping
> > 0 12463k dump done (7:24:19), waiting for writing to tape
> > 0 5544352k waiting for dumping
> > 0 191676480k waiting for dumping
> > 0 3799277k waiting for dumping
> > 0 3177171k waiting for dumping
> > 0 11058544k waiting for dumping
> > 0 230026440k dump done (7:22:13), waiting for writing to tape
> > 0 8k dump done (7:24:24), waiting for writing to tape
> > 0 184k dump done (7:24:19), waiting for writing to tape
> > 0 1292009k waiting for dumping
> > 0 2870k dump done (7:23:23), waiting for writing to tape
> > 0 13893263k waiting for dumping
> > 0 6025026k waiting for dumping
> > 0 6k dump done (7:22:15), waiting for writing to tape
> > 0 42k dump done (7:24:24), waiting for writing to tape
> > 0 53k dump done (7:24:19), waiting for writing to tape
> > 0 74462169k waiting for dumping
> > 0 205032k waiting for dumping
> > 0 32914k waiting for dumping
> > 0 1k dump done (7:24:02), waiting for writing to tape
> > 0 854272k waiting for dumping



Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 



Re: All level 0 on the same run?

2018-10-30 Thread Nathan Stratton Treadway
On Tue, Oct 30, 2018 at 15:29:37 -0400, Nathan Stratton Treadway wrote:
> On a different track of investigation, the  output of "amadmin CONFIG
> balance" might show something useful (though off hand it seems unlikely

Also, "amadmin CONFIG due".

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-30 Thread Nathan Stratton Treadway
On Tue, Oct 30, 2018 at 14:20:55 -0400, Chris Nighswonger wrote:
> Why in the world does Amanda plan level 0 backups for all entries in a DLE
> for the same run? This causes all sorts of problems.
> 
> Is there any solution for this? I've read some of the creative suggestions,
> but it seems a bunch of trouble.

The operation of Amanda's planner depends on many inputs, both "fixed"
(e.g. configuration options) and constantly-varying (e.g. estimate sizes
and dump history), and I suspect there are only a few people in the
world who really understand it fully -- and I don't know how many of
them still read this mailing list :(.  But even one of those people
would probably need to look at a lot of information in order to know
what exactly was going on.


The good news is that I have noticed that the planner records a bunch of
interesting information in the amdump.DATETIMESTAMP log file, so at
least that seems like the place to start investigating.  Look in
particular for the following sections: DONE QUEUE, ANALYZING ESTIMATES,
INITIAL SCHEDULE, DELAYING DUMPS IF NEEDED, PROMOTING DUMPS IF NEEDED,
and finally GENERATING SCHEDULE.
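
(If the amdump file is big, something like this should jump you straight
to those sections -- the path is only illustrative, use your own logdir
and datestamp:

   grep -n -E 'DONE QUEUE|ANALYZING ESTIMATES|INITIAL SCHEDULE|DELAYING DUMPS|PROMOTING DUMPS|GENERATING SCHEDULE' \
       /var/log/amanda/CONFIG/amdump.DATETIMESTAMP

and then open the file at those line numbers in your pager or editor.)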

In your case, it seems likely that the  PROMOTING DUMPS section should
have a bunch of activity listed; if so, that might explain what it's
"thinking".

If that doesn't give a clear answer, does the INITIAL SCHEDULE section
show all the dumps are already scheduled for level 0?  If not, pick a
DLE that is not shown at level 0 there and follow it down the log to see
if you can figure out what stage bumps it back to level 0...


On a different track of investigation, the  output of "amadmin CONFIG
balance" might show something useful (though off hand it seems unlikely
to explain why _all_ DLEs would be switched to level 0).


Let us know what you find out :)
Nathan





Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-30 Thread Gene Heskett
On Tuesday 30 October 2018 14:20:55 Chris Nighswonger wrote:

> Why in the world does Amanda plan level 0 backups for all entries in a
> DLE for the same run? This causes all sorts of problems.
>
If all dle's are "new", never been backed up by amanda before, this is 
normal.  Most start out with perhaps 10 to 20 dle's, then add 20 or so 
for run two, wash rinse and repeat till you have them all. Then give 
amanda a couple weeks to work out a schedule that loads each (v)tape 
with a similar amount of data. It will shuffle the order of the levels, 
but will also make sure no DLE goes longer than the dumpcycle days 
between level 0s, as set in your amanda.conf.

> Is there any solution for this? I've read some of the creative
> suggestions, but it seems a bunch of trouble.

Not really, and pay attention to the stuff in the amanda.conf called 
bump*. They can and will have a huge effect on how much tape is used per 
run.

This is what I have, but isn't yet ideal to me:
bumpsize  2 MB    # minimum savings (threshold) to bump level 1 -> 2
bumpdays  2   # minimum days at each level
bumppercent   0   # new var I didn't know about till Dec 2015
bumpmult  1.125   # threshold = bumpsize * bumpmult^(level-1)

Everything but bumpmult s/b integer.
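
Just to put numbers on that formula, if I'm reading it right, the bump
thresholds with the values above come out to roughly (plain arithmetic,
nothing amanda-specific):

  bump 1 -> 2:  2 MB * 1.125^0 = 2.00 MB
  bump 2 -> 3:  2 MB * 1.125^1 = 2.25 MB
  bump 3 -> 4:  2 MB * 1.125^2 = about 2.53 MB
  bump 4 -> 5:  2 MB * 1.125^3 = about 2.85 MB

so with bumpmult that close to 1 the threshold hardly grows from one
level to the next.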

> Kind regards,
>
> Chris
>
>
> 0 19098649k waiting for dumping
>
> 0 9214891k waiting for dumping
>
> 0 718824k waiting for dumping
>
> 0 365207k waiting for dumping
>
> 0 2083027k waiting for dumping
>
> 0 3886869k waiting for dumping
>
> 0 84910k waiting for dumping
>
> 0 22489k dump done (7:23:34), waiting for writing to tape
>
> 0 304k dump done (7:22:30), waiting for writing to tape
>
> 0 2613k waiting for dumping
>
> 0 30k dump done (7:23:07), waiting for writing to tape
>
> 0 39642k dump done (7:23:07), waiting for writing to tape
>
> 0 8513409k waiting for dumping
>
> 0 39519558k waiting for dumping
>
> 0 47954k waiting for dumping
>
> 0 149877984k dumping 145307840k ( 96.95%) (7:22:15)
>
> 0 742804k waiting for dumping
>
> 0 88758k waiting for dumping
>
> 0 12463k dump done (7:24:19), waiting for writing to tape
>
> 0 5544352k waiting for dumping
>
> 0 191676480k waiting for dumping
>
> 0 3799277k waiting for dumping
>
> 0 3177171k waiting for dumping
>
> 0 11058544k waiting for dumping
>
> 0 230026440k dump done (7:22:13), waiting for writing to tape
>
> 0 8k dump done (7:24:24), waiting for writing to tape
>
> 0 184k dump done (7:24:19), waiting for writing to tape
>
> 0 1292009k waiting for dumping
>
> 0 2870k dump done (7:23:23), waiting for writing to tape
>
> 0 13893263k waiting for dumping
>
> 0 6025026k waiting for dumping
>
> 0 6k dump done (7:22:15), waiting for writing to tape
>
> 0 42k dump done (7:24:24), waiting for writing to tape
>
> 0 53k dump done (7:24:19), waiting for writing to tape
>
> 0 74462169k waiting for dumping
>
> 0 205032k waiting for dumping
>
> 0 32914k waiting for dumping
>
> 0 1k dump done (7:24:02), waiting for writing to tape
>
> 0 854272k waiting for dumping



Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: All level 0 on the same run?

2018-10-30 Thread Debra S Baddorf
Is this the first backup run for a long while?  If so, then they are all DUE, 
so amanda feels it has to schedule them all, now.

Is this the first backup ever?   Ditto above.

Did you perhaps run  “amadmin CONFIG force HOST *”  which forces a level 0 on 
all disks?
Did you specify  “strategy noinc” which does the same?
Or  “skip-incr yes”  ?  Ditto.
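
(If one of those force commands turns out to be what happened, the flag
can be cleared again before the next run with  “amadmin CONFIG unforce
HOST DISK”  -- CONFIG, HOST and DISK being the usual placeholders.)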

Did you replace a whole disk, making all the files look like they’ve never been 
backed up?

Okay,  failing all the above obvious reasons,  I’ll leave others to discuss 
“planner” reasons.  Sorry!
Deb Baddorf
Fermilab

> On Oct 30, 2018, at 1:20 PM, Chris Nighswonger  
> wrote:
> 
> Why in the world does Amanda plan level 0 backups for all entries in a DLE 
> for the same run? This causes all sorts of problems.
> Is there any solution for this? I've read some of the creative suggestions, 
> but it seems a bunch of trouble.
> Kind regards,
> Chris
> 
> 0 19098649k waiting for dumping
> 0 9214891k waiting for dumping
> 0 718824k waiting for dumping
> 0 365207k waiting for dumping
> 0 2083027k waiting for dumping
> 0 3886869k waiting for dumping
> 0 84910k waiting for dumping
> 0 22489k dump done (7:23:34), waiting for writing to tape
> 0 304k dump done (7:22:30), waiting for writing to tape
> 0 2613k waiting for dumping
> 0 30k dump done (7:23:07), waiting for writing to tape
> 0 39642k dump done (7:23:07), waiting for writing to tape
> 0 8513409k waiting for dumping
> 0 39519558k waiting for dumping
> 0 47954k waiting for dumping
> 0 149877984k dumping 145307840k ( 96.95%) (7:22:15)
> 0 742804k waiting for dumping
> 0 88758k waiting for dumping
> 0 12463k dump done (7:24:19), waiting for writing to tape
> 0 5544352k waiting for dumping
> 0 191676480k waiting for dumping
> 0 3799277k waiting for dumping
> 0 3177171k waiting for dumping
> 0 11058544k waiting for dumping
> 0 230026440k dump done (7:22:13), waiting for writing to tape
> 0 8k dump done (7:24:24), waiting for writing to tape
> 0 184k dump done (7:24:19), waiting for writing to tape
> 0 1292009k waiting for dumping
> 0 2870k dump done (7:23:23), waiting for writing to tape
> 0 13893263k waiting for dumping
> 0 6025026k waiting for dumping
> 0 6k dump done (7:22:15), waiting for writing to tape
> 0 42k dump done (7:24:24), waiting for writing to tape
> 0 53k dump done (7:24:19), waiting for writing to tape
> 0 74462169k waiting for dumping
> 0 205032k waiting for dumping
> 0 32914k waiting for dumping
> 0 1k dump done (7:24:02), waiting for writing to tape
> 0 854272k waiting for dumping
>