Re: How's amanda feeling these days?

2020-11-23 Thread Jon LaBadie
On Mon, Nov 16, 2020 at 07:25:41AM -0600, Dave Sherohman wrote:
> Hello, again!
> 
> You may recall my earlier question to the list, included below.  I've
> now talked with my other coworkers who work with servers and they've
> agreed to go with amanda for our new backup system.
> 
> Now I'd like to get some hardware recommendations.  I'm mostly unsure
> about what we'll need in terms of capacity, both for processing power
> and for storing the actual backups.  Less interested in specific model
> or part numbers, because it will need to come from one of our approved
> vendors, of course, and most likely by way of a formal tender process -
> but I can say that we almost always end up buying complete Dell
> rackmount systems.
> 
> The basic parameters I'm working with are:
> 
> - Backing up around 75 servers (mostly Debian, with a handful of other
>   linux distros and a handful of windows machines).
> 
> - Total amount of data to back up is currently in the 40 TB range.
> 
> - Everything is connected by fast (10- or 100-gigabit) networks.
> 
> - Backup will be to disk/vtapes.
> 
> - I've been asked to have backups available for the previous 6 months.
> 
> - I'm assuming that the best way to handle backup of windows clients
>   will be to mount the disk on a linux box and back it up from there,
>   although some of them are virtual machines, so doing a kvm snapshot
>   and backing that up instead would also be an option.
> 
> Given all that, how beefy of a box should I be looking at, and how much
> disk space can I expect to need?

I did reply to the original message, but looking back it was addressed
to you rather than the list.  In case it was overlooked, here were my
comments regarding space:

"Just some simple numbers.  Assuming a 7 day dumpcycle and daily runs.
 40TB / 7 day plus some promotion is about 7TB of level 0 (full) dumps
 per day.  Add a TB for incrementals means about 8TB of backup data / day.

 8TB / 1GB/sec is about 8000sec network traffic.  3-4 hrs, doable on your
 slower network.

 6 months retention is nominally 200 days X 8TB / day is 1600 TB of
 vtape capacity.  With 5TB disks that's 320 disks.  Compression will
 reduce that some, how much only experience will tell you."
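
The arithmetic quoted above can be replayed as a small Python sketch.
The function names are made up for illustration; the inputs (40 TB
total, 7 day dumpcycle, about 8 TB per run, 1 GB/sec, 200 days, 5 TB
disks) are the rough figures from the quote, not measurements.

```python
import math

def transfer_seconds(data_tb, rate_gb_per_sec):
    """Seconds to move data_tb terabytes at rate_gb_per_sec (1 TB = 1000 GB)."""
    return data_tb * 1000 / rate_gb_per_sec

def vtape_capacity_tb(daily_tb, retention_days):
    """Total vtape space if every retained day holds daily_tb of dumps."""
    return daily_tb * retention_days

def disks_needed(capacity_tb, disk_tb):
    """Whole disks required for capacity_tb, ignoring RAID and filesystem overhead."""
    return math.ceil(capacity_tb / disk_tb)

full_per_day_tb = 40 / 7   # level 0 share per run, before any promotion
daily_tb = 8               # rounded estimate: fulls + promotion + incrementals

secs = transfer_seconds(daily_tb, 1)
capacity = vtape_capacity_tb(daily_tb, 200)

print(f"level 0 share per run: {full_per_day_tb:.1f} TB")
print(f"network time per run:  {secs:.0f} s ({secs / 3600:.1f} h at full rate)")
print(f"vtape capacity:        {capacity} TB")
print(f"5 TB disks:            {disks_needed(capacity, 5)}")
```

Changing the dumpcycle, the retention period or the disk size in the
calls shows quickly how sensitive the disk count is to each assumption.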

> 
> Also, as a side note, I'm planning on using VDO (Virtual Data Optimizer)
> to provide on-the-fly data compression and deduplication on the backup
> server, which should reduce disk consumption at the cost of CPU
> overhead.  I'm thinking it would make the most sense to use VDO only for
> the filesystem holding the vtapes, and not for the staging area, but
> feel free to correct me on that.

Did not know what VDO was, so I read a Red Hat description.  It seems
to consist of 3 components, and I question the value of each for amanda
backups.  Hopefully someone with VDO experience can share it.

Elimination of zero filled blocks:  Compression is likely to greatly
shrink the storage of a string of zeros.

Only one copy of duplicate blocks:  Were your files being backed up
individually, as I do in a separate backup of my Home directory using
rsync, this could provide a worthwhile savings.  But you will likely
be merging your files into a tarball or a dumpfile.  The original
disk block alignment will be lost and likely not even match in one
day's tarball to the next.
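
A minimal Python sketch of the alignment concern, assuming a
deduplicator that compares fixed 4 KiB blocks (the block size is an
assumption here; check your own VDO setup).  Random bytes stand in for
unchanged file contents, and a 512-byte prefix stands in for the same
contents landing one tar record later in the next day's tarball.

```python
import hashlib
import os

BLOCK = 4096  # assumed fixed dedup block size in bytes

def block_hashes(data, block=BLOCK):
    """Hash every fixed-size block, as a fixed-block deduplicator would see it."""
    return {hashlib.sha256(data[i:i + block]).digest()
            for i in range(0, len(data), block)}

def shared_blocks(a, b):
    """Number of distinct blocks that appear in both byte strings."""
    return len(block_hashes(a) & block_hashes(b))

payload = os.urandom(64 * BLOCK)              # unchanged file contents
day1 = payload
day2_shifted = os.urandom(512) + payload      # same contents, moved by 512 bytes
day2_aligned = os.urandom(BLOCK) + payload    # same contents, moved by one full block

print("shared after 512-byte shift:", shared_blocks(day1, day2_shifted))
print("shared after 4 KiB shift:   ", shared_blocks(day1, day2_aligned))
```

When the shift is not a multiple of the block size, none of the
unchanged data lines up with the previous day's blocks, so there is
nothing for a fixed-block deduplicator to share.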

LZ4 compression on the fly:  I don't know the cpu load for the server
compressing 8TB of data daily.  One thing you would have to deal with
is amanda's view of what has been sent to the backup device and what
size the data actually consume on the device.

There are points where amanda calculates how much space is left on
the device based on its configuration-specified size and how much it
has already sent.  Of course there is actually more space available
because the compression occurs after amanda's involvement.  The
difference may cause amanda to make less than optimal decisions.

Amanda administrators who use tape drive compression face the same
problem.  I believe most overspecify the size of the storage medium
to allow more complete tape utilization.
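
As a rough illustration of overspecifying, here is a Python sketch of
how one might pick the size to tell amanda when compression happens
below it.  The function name, the 1.5:1 ratio and the 0.9 safety factor
are all hypothetical; the real ratio is only known after watching
actual runs.

```python
def configured_length_gb(physical_gb, expected_ratio, safety=0.9):
    """Size to configure so that compressed data fills about `safety` of the medium.

    expected_ratio is uncompressed size / compressed size (e.g. 1.5 for 1.5:1).
    amanda counts the uncompressed bytes it sends, so the medium can accept
    roughly physical_gb * expected_ratio of them before it is actually full.
    """
    if expected_ratio <= 0 or not 0 < safety <= 1:
        raise ValueError("ratio must be positive and safety must be in (0, 1]")
    return physical_gb * expected_ratio * safety

# Hypothetical example: 1000 GB of real space, 1.5:1 compression expected.
print(f"{configured_length_gb(1000, 1.5):.0f} GB")
```

Setting the ratio too optimistically means the device fills before
amanda expects it to, which is why a safety margin is included.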


As to Windows backup, I hope someone suggests a good solution.  I
currently use the proprietary "Zmanda Windows Client".  Generally
works well but suffers from a lack of development and unexplained
failures to connect.  It is often corrected by restarting the ZWC
services on the Windows system and always corrected by rebooting.

In the distant past I backed up windows systems by mounting the drives
on a UNIX host.  Most often used NFS.  Liked that approach except for
one thing.  Windows, at least then, did not like a file to be opened
by multiple processes.  So each backup included several files that
did not back up because the file was already opened by another Windows
process.  And a few system files were never backed up.

Regarding backing up a KVM snapshot, would that mean that to recover
one file you would have to take a new snapshot, restore the entire system
from the backed up snap, copy the file to somewhere 

Re: amanda going to hell, probably because of a bad sdb.

2020-11-23 Thread Gene Heskett
On Monday 23 November 2020 12:31:23 Debra S Baddorf wrote:

> Well, if you mean this amanda-list  mail server,  I got this message
> Friday night, but didn’t do any work email until today.
>
> I don’t have any answers to your prob, either.   Is it working better
> with the SSD?
>
Haven't tried it; it hasn't gotten a new tummy ache in about a week.
With the mail server down, I've been nuking the leftovers and it
generally just works for several days.  Like waiting for the other shoe
to drop... The dump's CRC goes to hell sitting in the holding disk. Might
be an 8 or 9 part file, broken into 2G pieces.

I'm suspecting the drive or cable. The drive claims it's healthy now but
did have a problem way back in its log history; it's got around 70k
spinning hours on it.  Around here, 70k is a youngster. I've apparently
lost track of it, but there's a 1T someplace, one of the first ones, with
nearly 200k hours on it. 25 re-allocated sectors since the first time I
looked at it at about 5k hours. I updated its firmware and it's just
worked since.

Thanks Deb.

Stay safe and well. As you know, I'm definitely geriatric at 86. And take 
your A,D,C and selenium vitamins. I've managed to escape it so far. 
They've been assaying bodies, those that have died have been starving 
for the trace element selenium. Said a different way, nobody with enough 
selenium has died.

Take care now.

> Deb Baddorf
> Fermilab
>
> > On Nov 11, 2020, at 6:45 AM, Gene Heskett 
> > wrote:
> >
> > Greetings all;
> >
> > 2 weeks in a row, amanda has fussed about crc's in the holding disk.
> >  But the disk checks good.  And the mail server seems to have turned
> > into a black hole, no messages since Oct 21st.
> >
> > Since /sdb1 is used only as a dump buffer, I'm going to sub a 240GB
> > SSD on the end of that cable.
> >
> > But can we fix the mail server?
> >
> > Cheers, Gene Heskett
> > --
> > "There are four boxes to be used in defense of liberty:
> > soap, ballot, jury, and ammo. Please use in that order."
> > -Ed Howdershelt (Author)
> > If we desire respect for the law, we must first make the law
> > respectable. - Louis D. Brandeis



Copyright 2019 by Maurice E. Heskett
Cheers, Gene Heskett



Re: How's amanda feeling these days?

2020-11-23 Thread Stefan G. Weichinger
On 16.11.20 at 14:25, Dave Sherohman wrote:
> Hello, again!
> 
> You may recall my earlier question to the list, included below.  I've
> now talked with my other coworkers who work with servers and they've
> agreed to go with amanda for our new backup system.
> 
> Now I'd like to get some hardware recommendations.  I'm mostly unsure
> about what we'll need in terms of capacity, both for processing power
> and for storing the actual backups.  Less interested in specific model
> or part numbers, because it will need to come from one of our approved
> vendors, of course, and most likely by way of a formal tender process -
> but I can say that we almost always end up buying complete Dell
> rackmount systems.
> 
> The basic parameters I'm working with are:
> 
> - Backing up around 75 servers (mostly Debian, with a handful of other
>   linux distros and a handful of windows machines).
> 
> - Total amount of data to back up is currently in the 40 TB range.
> 
> - Everything is connected by fast (10- or 100-gigabit) networks.
> 
> - Backup will be to disk/vtapes.
> 
> - I've been asked to have backups available for the previous 6 months.
> 
> - I'm assuming that the best way to handle backup of windows clients
>   will be to mount the disk on a linux box and back it up from there,
>   although some of them are virtual machines, so doing a kvm snapshot
>   and backing that up instead would also be an option.
> 
> Given all that, how beefy of a box should I be looking at, and how much
> disk space can I expect to need?
> 
> Also, as a side note, I'm planning on using VDO (Virtual Data Optimizer)
> to provide on-the-fly data compression and deduplication on the backup
> server, which should reduce disk consumption at the cost of CPU
> overhead.  I'm thinking it would make the most sense to use VDO only for
> the filesystem holding the vtapes, and not for the staging area, but
> feel free to correct me on that.

I am a bit surprised that you haven't received any reply on the list so
far (maybe you got one by direct/private reply).

Your "project" and the related questions could start a new thread
without problems ;-)

In fact this is a rather *big* amanda installation as far as I know and
there are many things to consider:

* how dynamic is your data: are the incremental changes big or small ...

* what $dumpcycle is targeted?

* parallelism: will your new amanda server have multiple NICs etc / plan
for a big holding disk (array)

* fast network is nice, but this results in a bottleneck called
*storage* -> fast RAID arrays, maybe SSDs.

I am absolutely convinced that Amanda is able to back up your servers.

But IMO this will need a rather big box with fast storage and NICs.

And a fast holding disk (array) to provide parallelism.

-

I'd start with asking: what do your current backups look like?

What is the current rate of new/changed data generated?

(maybe I am overlooking some of your earlier postings right now, sorry)

-

Other amanda-users here run way bigger installations than I do, and
should be able to share some tips here.

I think I would do some basic calculations at first:

* how long does it take to copy all the 40TB into my amanda box (*if* I
did a FULL backup every time)?

* what degree of parallelism is possible?

-> which client server hosts X TB, which bandwidth is available to each
server, which server is able to deliver this and that performance
because of its storage hw/setup ...

etc etc
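
These basic calculations could be sketched in Python roughly like this.
The 70% link efficiency and the per-client and holding disk rates are
invented placeholders, only the 40 TB and the 10/100 Gbit figures come
from the original question.

```python
def full_copy_hours(total_tb, link_gbit_per_sec, efficiency=0.7):
    """Hours to move total_tb over one link, using a fraction of the line rate."""
    usable_bytes_per_sec = link_gbit_per_sec * 1e9 / 8 * efficiency
    return total_tb * 1e12 / usable_bytes_per_sec / 3600

def effective_rate_mb_s(client_rates_mb_s, server_nic_mb_s, holding_disk_mb_s):
    """Parallel dumps are limited by the slowest stage: clients, NIC or holding disk."""
    return min(sum(client_rates_mb_s), server_nic_mb_s, holding_disk_mb_s)

# Time for a FULL backup of everything, every time, over a single link.
for gbit in (10, 100):
    print(f"{gbit:>3} Gbit/s: {full_copy_hours(40, gbit):.1f} h for 40 TB")

# Hypothetical example: four clients dumping in parallel, a 10 Gbit NIC
# (about 1250 MB/s) and a holding disk array that sustains 600 MB/s.
print(effective_rate_mb_s([200, 300, 150, 250], 1250, 600), "MB/s")
```

In the made-up example the holding disk is the limit, not the network,
which is the storage bottleneck mentioned above.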

-

Nevertheless a very interesting project, yes ;-)


Re: amanda going to hell, probably because of a bad sdb.

2020-11-23 Thread Debra S Baddorf
Well, if you mean this amanda-list mail server, I got this message Friday
night, but didn’t do any work email until today.

I don’t have any answers to your prob, either.   Is it working better with
the SSD?

Deb Baddorf
Fermilab

> On Nov 11, 2020, at 6:45 AM, Gene Heskett  wrote:
> 
> Greetings all;
> 
> 2 weeks in a row, amanda has fussed about crc's in the holding disk.  But 
> the disk checks good.  And the mail server seems to have turned into a 
> black hole, no messages since Oct 21st.
> 
> Since /sdb1 is used only as a dump buffer, I'm going to sub a 240GB SSD 
> on the end of that cable.
> 
> But can we fix the mail server?
> 
> Cheers, Gene Heskett