Re: two tape libraries with one configuration

2017-07-16 Thread Lorenzo Marcantonio
On Fri, Jul 14, 2017 at 05:38:49PM -0400, Chris Hoogendyk wrote:
> I've just acquired and set up a new Overland T48 with two LTO7 tape drives.
> 
> I already have an Overland T24 with one LTO6 tape drive that I've been using 
> for a few years.
> 
> I plan on transitioning our large data to the LTO7 and thought I might keep
> the basic administrative backups on the LTO6, using the same instance of
> Amanda.
> 
> Anyone know how to configure this? I can imagine defining two changers and
> specifying which to use in the dumptype, but I don't see quite how that
> would be done.

You'll need two different configurations. The planner tries to fill a
certain number of tapes (of a fixed size), so it can't handle mixed
storage the way you suggest.

If the tapes were the same size you could play with the aggregate changer,
but that's not the case here.

I'd suggest one configuration for the admin data on the LTO6 and one
configuration for the other data on the LTO7, each scheduled externally
as appropriate.
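
Roughly like this, with two independent config directories and two cron
entries. This is only an untested sketch: the config names, /dev paths and
schedule are invented, and the tapetype/dumptype definitions are left out.

  # /etc/amanda/lto7-data/amanda.conf -- bulk data on the T48 / LTO7
  define changer robot_t48 {
      tpchanger "chg-robot:/dev/sg3"            # robot device node, adjust
      property "tape-device" "0=tape:/dev/nst0" "1=tape:/dev/nst1"
  }
  tpchanger "robot_t48"
  tapetype LTO7

  # /etc/amanda/lto6-admin/amanda.conf -- admin backups on the T24 / LTO6
  define changer robot_t24 {
      tpchanger "chg-robot:/dev/sg4"            # robot device node, adjust
      property "tape-device" "0=tape:/dev/nst2"
  }
  tpchanger "robot_t24"
  tapetype LTO6

  # crontab of the backup user: run each configuration on its own schedule
  0 22 * * *  /usr/sbin/amdump lto7-data
  0 04 * * *  /usr/sbin/amdump lto6-admin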

-- 
Lorenzo Marcantonio




Restoring ZFS datasets with amfetchdump and feedback on S3 conf

2017-07-16 Thread Giorgio Valoti
Hi there,
I’m setting up a backup server running Amanda 3.3.6 on Ubuntu 16.04. My main 
idea is to use the amzfs-sendrecv plugin. The dump part works, but I have a 
couple of questions about the restore procedures. If anyone could give me some 
advice, that would be great.

I’m still having a hard time figuring out the interplay between ZFS 
snapshots and Amanda. As far as I can tell, Amanda creates a temporary ZFS 
snapshot and sends the delta between it and the previous one as the backup. 
Then it proceeds to swap/rename the snapshot for the next run. When a full 
backup is needed, it sends the snapshot without the `-i` flag. Assuming my 
understanding is correct, I still don’t know how to:
- execute a full restore of a ZFS dataset, i.e. when the dataset is gone. In 
particular, I’d like to use amfetchdump so that I can pipe the stream back into 
`zfs recv`
- execute a partial restore, i.e. a point-in-time recovery into an existing ZFS 
dataset. Now, I realize that this particular use case could be solved with ZFS 
alone, assuming a usable ZFS snapshot is still around, but since I’m just 
getting started with Amanda, I’d like to know how to solve it with this tool, 
as well.
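
To be concrete, this is the kind of invocation I have in mind. It is only a 
sketch: client.example.com and tank/data stand in for my DLE host and dataset.

  # full restore of a dataset that no longer exists: pull the level 0 image
  # off the backup media and feed it straight to zfs receive
  amfetchdump -p zfs-dataset client.example.com tank/data | sudo zfs receive tank/data

  # for a point-in-time restore I assume I would replay the level 0 first and
  # then each incremental level in order, e.g. the level 1 of a given run:
  amfetchdump -p zfs-dataset client.example.com tank/data 20170716 1 | sudo zfs receive tank/data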

The configuration shown below is what I came up with by looking at the wiki 
pages. I’d be really grateful if someone could give me feedback on it. Does 
anybody see anything wrong or anything that could be improved?


My conf:

> infofile "/var/lib/amanda/state/curinfo"
> indexdir "/var/lib/amanda/state/index"
> dumpuser "backup"
> mailto   "<…>"
> 
> define changer s3 {
> tapedev "chg-multi:s3:<…>/slot-bb2e601e63b3408bb6a865e442a28366-{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}"
> changerfile "/var/lib/amanda/state/s3-statefile"
> device-property "NB_THREADS_BACKUP"    "8"
> device-property "NB_THREADS_RECOVERY"  "8"
> device_property "S3_ACCESS_KEY"        "<…>"
> device_property "S3_SECRET_KEY"        "<…>"
> device_property "S3_BUCKET_LOCATION"   "eu-west-1"
> device_property "S3_SSL"   "YES"
> device_property "BLOCK_SIZE"   "10 megabytes"
> }
> 
> tpchanger "s3"
> tapetype S3
> 
> define tapetype S3 {
> comment "S3 Bucket"
> length 10240 gigabytes
> }
> 
> org "zfs-dataset"
> logdir "/var/log/zfs-dataset"
> 
> define application-tool zfs-dataset-app {
>    comment  "amzfs-sendrecv"
>    plugin   "amzfs-sendrecv"
>    property "ZFS-PATH"     "/sbin/zfs"
>    property "PFEXEC-PATH"  "/usr/bin/sudo"
>    property "PFEXEC"       "YES"
> }
> 
> define dumptype zfs-dataset-dump {
>   program "APPLICATION"
>   application "zfs-dataset-app"
>   auth     "ssh"
>   ssh_keys "/var/backups/.ssh/id_rsa"
> }
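
For completeness, this is how I plan to label and sanity-check it. Again just 
a sketch: the label name is arbitrary and I have not defined labelstr/autolabel 
yet.

  amlabel zfs-dataset zfs-dataset-1 slot 1   # label the first S3 "slot"
  amcheck zfs-dataset                        # server and client checks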





--
Giorgio Valoti




Amcheck fails for several clients if one of them is not reachable

2017-07-16 Thread Luc Lalonde
Hello Folks,

I think that this is an old bug that has come back in version 3.4.5.

If one of the clients is not reachable for an ‘amcheck’, then I get multiple 
errors:

#
[root@beagle amandad]# su amandabackup -c '/usr/sbin/amcheck Journalier-VTAPE'
Amanda Tape Server Host Check
-
NOTE: Holding disk '/amanda/stage/Journalier-VTAPE': 6194409472 KB disk space 
available, using 1048576000 KB as requested
Searching for label 'vtape-4':found in slot 4: volume 'vtape-4'
Will write to volume 'vtape-4' in slot 4.
NOTE: skipping tape-writable test
Server check took 0.260 seconds
Amanda Backup Client Hosts Check

ERROR: trinidad: selfcheck request failed: error sending REQ: write error to: 
Broken pipe
ERROR: ada: selfcheck request failed: error sending REQ: write error to: Broken 
pipe
ERROR: moe-alt: selfcheck request failed: error sending REQ: write error to: 
Broken pipe
ERROR: moe-180: selfcheck request failed: error sending REQ: write error to: 
Broken pipe
ERROR: ldap1: selfcheck request failed: error sending REQ: write error to: 
Broken pipe
ERROR: nanofs: selfcheck request failed: error sending REQ: write error to: 
Broken pipe
ERROR: bonne: selfcheck request failed: Connection timed out
Client check: 14 hosts checked in 392.043 seconds.  7 problems found.
(brought to you by Amanda 3.4.5)
#

The last client ‘bonne’ is down and not reachable on the network. If I remove 
the entry for that client from the ‘disklist’, everything works fine:

#
[root@beagle amandad]# su amandabackup -c '/usr/sbin/amcheck Journalier-VTAPE'
Amanda Tape Server Host Check
-
NOTE: Holding disk '/amanda/stage/Journalier-VTAPE': 6194409472 KB disk space 
available, using 1048576000 KB as requested
Searching for label 'vtape-4':found in slot 4: volume 'vtape-4'
Will write to volume 'vtape-4' in slot 4.
NOTE: skipping tape-writable test
Server check took 0.271 seconds
Amanda Backup Client Hosts Check

Client check: 13 hosts checked in 2.175 seconds.  0 problems found.
(brought to you by Amanda 3.4.5)
#
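
In case it helps narrow this down, the unreachable host can also be checked on 
its own. This is just the invocation I would use, relying on amcheck's 
optional host argument:

  su amandabackup -c '/usr/sbin/amcheck -c Journalier-VTAPE bonne'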

We were using 3.3.7 not so long ago, and I don’t remember having this kind of 
problem when an Amanda client was down.

Is this a known bug?

Thank You!






Re: Problems with a big amanda-server

2017-07-16 Thread Jose M Calhariz
On Thu, Jul 13, 2017 at 04:07:19PM +0100, Jose M Calhariz wrote:
> 
> Hi,
> 
> I have another installation of Amanda.  This one is very big: 120
> hosts and 750 DLEs, using ssh authentication.  Is anyone running an
> installation of this size?
> 
> My problem is that this setup works without problems on some days,
> while on other days it cannot back up all the servers.
> 
> I have investigated the clients.  When there are problems, nothing even
> tries to contact the faulty clients; no traffic is generated from the
> server to those clients.
> 
> Now I am trying to make sense of the server logs.  Looking into the
> planner logs I see messages for the successful servers but not the
> names of the faulty ones.  Looking into /var/log/amanda/Daily I see
> messages in the log and amdump files requesting estimates and, in the
> same second, saying:
> 
> 
> amdump.20170713000603:planner: time 0.055: setting up estimates for 
> hostanme.domain.name:/
> amdump.20170713000603:setup_estimate: hostanme.domain.name:/: command 0, 
> options: nonelast_level 1 next_level0 5 level_days 6getting estimates 
> 0 (-3) 1 (-3) 2 (-3)
> amdump.20170713000603:planner: time 0.055: setting up estimates for 
> hostanme.domain.name:/boot
> amdump.20170713000603:setup_estimate: hostanme.domain.name:/boot: command 0, 
> options: nonelast_level 1 next_level0 4 level_days 7getting estimates 
> 0 (-3) 1 (-3) -1 (-3)
> amdump.20170713000603:planner: FAILED hostanme.domain.name / 20170713000603 0 
> "[hmm, no error indicator!]"
> amdump.20170713000603:planner: FAILED hostanme.domain.name /boot 
> 20170713000603 0 "[hmm, no error indicator!]"
> 
> I am out of ideas about what to do to find a possible reason for the
> failure.  Can anyone help me?

If I comment out entries in the disklist, the list of failed machines changes.


To give an idea of how big this installation is, here is the output of:
amstatus Daily --summary
Using /var/log/amanda/Daily/amdump.1
From Sun Jul 16 07:45:03 BST 2017


SUMMARY           part      real  estimated
                            size       size
partition       :  967
estimated       :  166                578g
flush           :  240     2711g
failed          :  561                  0g           (  0.00%)
wait for dumping:    0                  0g           (  0.00%)
dumping to tape :    0              0sunit           (  0.00%)
dumping         :    0        0g        0g (  0.00%) (  0.00%)
dumped          :  166      539g      578g ( 93.13%) ( 93.13%)
wait for writing:  166      539g      578g ( 93.13%) ( 93.13%)
wait to flush   :   62      269g      269g (100.00%) (  0.00%)
writing to tape :    0        0g        0g (  0.00%) (  0.00%)
failed to tape  :    0        0g        0g (  0.00%) (  0.00%)
taped           :  178     2441g     2441g (100.00%) ( 74.21%)
  tape 1        :  178     2441g     2441g (100.00%) Daily-39 (178 chunks)
16 dumpers idle : 0
taper 0 status: Idle
taper qlen: 229
network free kps:  1000
holding space   : 10191g (122.95%)




> 
> Kind regards
> Jose M Calhariz
> 
>


Now I am looking into the code in planner.c to see if I can insert some
debug code and understand why it is failing to launch some ssh commands.
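
Before patching anything, there are a couple of simpler things I will try
first. This is only a sketch: hostanme.domain.name is the same anonymized
client name as in the logs above, and the debug options are the ones from
amanda.conf(5).

  # client-only check of a single failing host, run as the Amanda dump user
  su backup -c '/usr/sbin/amcheck -c Daily hostanme.domain.name'

  # more verbose planner/protocol logging in /etc/amanda/Daily/amanda.conf
  debug-planner  9
  debug-protocol 9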

Kind regards
Jose M Calhariz

-- 
--
Truth is the best camouflage. Nobody believes it.
--  Max Frisch, Swiss writer.