Fwd: amgtar: defaults for NORMAL and STRANGE
-- Forwarded message -
From: Tom Robinson
Date: Thu, 21 Jan 2021 at 09:13
Subject: Re: amgtar: defaults for NORMAL and STRANGE
To: Nathan Stratton Treadway

On Wed, 20 Jan 2021 at 16:09, Nathan Stratton Treadway wrote:
> On Wed, Jan 20, 2021 at 14:22:02 +1100, Tom Robinson wrote:
> > I'm still seeing messages in the report that should have been squashed. It
> > also doesn't matter what I have configured as 'NORMAL' for the application
> > configuration.
> >
> > STRANGE DUMP DETAILS:
> >   /-- lambo.motec.com.au / lev 1 STRANGE
> >   sendbackup: info BACKUP=APPLICATION
> >   sendbackup: info APPLICATION=amgtar
> >   sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/usr/lib64/amanda/application/amgtar restore [./file-to-restore]+
> >   sendbackup: info COMPRESS_SUFFIX=.gz
> >   sendbackup: info end
> >   | /usr/bin/tar: ./dev: directory is on a different filesystem; not dumped
> >   | /usr/bin/tar: ./proc: directory is on a different filesystem; not dumped
> >   | /usr/bin/tar: ./run: directory is on a different filesystem; not dumped
> >   | /usr/bin/tar: ./sys: directory is on a different filesystem; not dumped
> >   | /usr/bin/tar: ./mnt/s3backup: directory is on a different filesystem; not dumped
> >   | /usr/bin/tar: ./var/lib/nfs/rpc_pipefs: directory is on a different filesystem; not dumped
> >   ? /usr/bin/tar: ./mnt/s3backup: Warning: Cannot flistxattr: Operation not supported
> > [...]
> >   property "NORMAL" ": socket ignored$"
> >   property append "NORMAL" ": file changed as we read it$"
> >   property append "NORMAL" ": directory is on a different filesystem;
>
> Note that the man page explanation of NORMAL includes the sentence
> 'These output are in the "FAILED DUMP DETAILS" section of the email
> report if the dump result is STRANGE'.
>
> In this case, the "Operation not supported" message is considered
> STRANGE... which in turn causes all the NORMAL message lines to be
> included in the report output as well.

I see. I've been working backwards on this. I was trying to squash the 'NORMAL' messaging to make the STRANGE results more clear! I've fixed the 'Operation not supported' issue and the 'NORMAL' messages have gone, too. Amanda now produces a clean report.

> So presumably once you resolve all of those error messages for a
> particular DLE, that DLE will not show up with a STRANGE DUMP DETAILS
> section at all, in which case those NORMAL-category messages will no
> longer be included in the report.

Correct. Thanks for clarifying, Nathan.

Kind regards,
Tom
Re: amgtar: defaults for NORMAL and STRANGE
Hi Nathan, Thanks for your insights and help. On Tue, 19 Jan 2021 at 17:24, Nathan Stratton Treadway wrote: > On Tue, Jan 19, 2021 at 11:53:52 +1100, Tom Robinson wrote: > > > > Also, the man page says there are defaults for NORMAL and STRANGE but > these > > 'defaults' don't seem to be included into the application definition > when I > > dump the config information with amadmin daily config: > > [...] > > Is the man page incorrect? Are the 'defaults' really applied or do I have > > to manually specify them in the config file? > > I haven't looked closely at this functionality before, but from a quick > skim of the code in application-src/amgtar.c, it looks like those > default values are built directly in to the program itself. > > That is, they aren't implemented as part of the config system and thus > don't show in the output of "amadmin ... config", but they do indeed > exist underneath the hood. > Ah, that explains it. Although it is somewhat confusing to not see that in the config file at first. > (As a corollary to that, it seems like there isn't any way to completely > delete the default strings from amgtar's processing, though you can > override the treatment of a particular regex by explicitly specifying > it as another type in the config file.) > > > (Are you seeing any situations where it looks like the default strings > aren't being applied as you would have expected from the man page > description?) > > CentOS Linux release 8.2.2004 (Core) amanda 3.5.1 I'm still seeing messages in the report that should have been squashed. It also doesn't matter what I have configured as 'NORMAL' for the application configuration. STRANGE DUMP DETAILS: /-- lambo.motec.com.au / lev 1 STRANGE sendbackup: info BACKUP=APPLICATION sendbackup: info APPLICATION=amgtar sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/usr/lib64/amanda/application/amgtar restore [./file-to-restore]+ sendbackup: info COMPRESS_SUFFIX=.gz sendbackup: info end | /usr/bin/tar: ./dev: directory is on a different filesystem; not dumped | /usr/bin/tar: ./proc: directory is on a different filesystem; not dumped | /usr/bin/tar: ./run: directory is on a different filesystem; not dumped | /usr/bin/tar: ./sys: directory is on a different filesystem; not dumped | /usr/bin/tar: ./mnt/s3backup: directory is on a different filesystem; not dumped | /usr/bin/tar: ./var/lib/nfs/rpc_pipefs: directory is on a different filesystem; not dumped ? 
/usr/bin/tar: ./mnt/s3backup: Warning: Cannot flistxattr: Operation not supported | /usr/bin/tar: ./var/lib/sss/pipes/nss: socket ignored | /usr/bin/tar: ./var/lib/sss/pipes/private/sbus-dp_implicit_files.558138: socket ignored | /usr/bin/tar: ./var/lib/sss/pipes/private/sbus-dp_implicit_files.782: socket ignored | /usr/bin/tar: ./var/spool/postfix/private/anvil: socket ignored | /usr/bin/tar: ./var/spool/postfix/private/bounce: socket ignored | /usr/bin/tar: ./var/spool/postfix/private/defer: socket ignored | /usr/bin/tar: ./var/spool/postfix/private/discard: socket ignored | /usr/bin/tar: ./var/spool/postfix/private/error: socket ignored
Look in the '/var/log/amanda/log.error/lambo.motec.com.au._.1.20210120132743.errout' file for full error messages
\

More configs:

DLE:
lambo.motec.com.au / { comp-root-amgtar }

dumptypes:

define dumptype global {
    comment "Global definitions"
    auth "bsdtcp"
    exclude list optional ".amanda.excludes"
}
define dumptype root-amgtar {
    global
    program "APPLICATION"
    application "app_amgtar"
    comment "root partitions dumped with tar"
    compress none
    index
    priority low
}
define dumptype comp-root-amgtar {
    root-amgtar
    comment "Root partitions with compression dumped with tar"
    compress client fast
}

Application: Note that this config setting:

define application-tool app_amgtar {
    comment "amgtar"
    plugin "amgtar"
    property "XATTRS" "YES"
    property "ACLS" "YES"
    #property "GNUTAR-PATH" "/path/to/gtar"
    #property "GNUTAR-LISTDIR" "/path/to/gnutar_list_dir"
    property "NORMAL" ": socket ignored$"
    property append "NORMAL" ": file changed as we read it$"
    property append "NORMAL" ": directory is on a different filesystem; not dumped$"
}

AND this config setting:

define application-tool app_amgtar {
    comment "amgtar"
    plugin "amgtar"
    property "XATTRS" "YES"
    property "ACLS" "YES"
    #property "GNUTAR-PATH" "/path/to/gtar"
    #property "GNUTAR-LISTDIR" "/path/to/gnutar_list_dir"
    #property "NORMAL" ": socket ignored$"
    #property append "NORMAL" ": file changed as we read it$"
    #property append "NORMAL" ": directory is on a different filesystem; not dumped$"
}

give the same results in the report.

Kind regards,
Tom
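A minimal sketch of the override Nathan describes above (reclassifying one particular message rather than trying to remove the built-in defaults); the regex below is only an assumption based on the warning text in the report, and the built-in NORMAL/STRANGE defaults still apply underneath it:

define application-tool app_amgtar {
    comment "amgtar"
    plugin "amgtar"
    property "XATTRS" "YES"
    property "ACLS" "YES"
    # assumed regex: treat the flistxattr warning as NORMAL instead of STRANGE
    property append "NORMAL" ": Warning: Cannot flistxattr: Operation not supported$"
}

With something like this in place the warning would no longer promote the DLE to STRANGE, although fixing the underlying cause (as was done in the forwarded message above) is the cleaner outcome.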
Re: amgtar: Operation not permitted
On Tue, 19 Jan 2021 at 18:39, Diego Zuccato wrote:
> On 19/01/21 01:53, Tom Robinson wrote:
> > I now get a lot of permission warnings and errors. Of particular concern
> > are the 'Operation not permitted' messages:
> Maybe you're running SELinux on the clients and amgtar is not allowed to
> access everything?

Diego, you are right and thanks for the reminder! I should have checked that but forgot! There is indeed an SE Alert regarding the access issue. There is no installed policy for amgtar so I've created one and done a manual dump which looks good.

Thanks for your insight and help.

Kind regards,
Tom
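For the archives, a rough sketch of how such a local policy module can be generated from the logged denials (the module name amgtar_local is just a placeholder, and paths/packages can differ between distributions):

# on the client, after a run that produced the SELinux denials
ausearch -m AVC -ts recent | grep amgtar | audit2allow -M amgtar_local
# review amgtar_local.te before installing, since this allows exactly what was denied
semodule -i amgtar_local.pp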
amgtar: Operation not permitted
: Cannot open: Operation not permitted
? /usr/bin/tar: ./lib/tpm: Warning: Cannot open: Operation not permitted
? /usr/bin/tar: ./lib/unbound: Warning: Cannot open: Operation not permitted
| /usr/bin/tar: ./lib/nfs/rpc_pipefs: directory is on a different filesystem; not dumped
? /usr/bin/tar: ./lib/nfs/statd: Warning: Cannot open: Operation not permitted
? /usr/bin/tar: ./lib/sss/db: Warning: Cannot open: Operation not permitted
? /usr/bin/tar: ./lib/sss/gpo_cache: Warning: Cannot open: Operation not permitted
Look in the '/var/log/amanda/log.error/lambo.motec.com.au._var.0.20210118200116.errout' file for full error messages
\

The amgtar application is set UID root:

# ls -l /usr/lib64/amanda/application/amgtar
-rwsr-x---. 1 root disk 60368 May 15 2019 /usr/lib64/amanda/application/amgtar

Why am I seeing these errors and warnings about access?

Also, the man page says there are defaults for NORMAL and STRANGE but these 'defaults' don't seem to be included in the application definition when I dump the config information with amadmin daily config:

Config file excerpt:

#define application-tool and dumptype for the amgtar application
define application-tool app_amgtar {
    comment "amgtar"
    plugin "amgtar"
    property "XATTRS" "YES"
    property "ACLS" "YES"
    #property "GNUTAR-PATH" "/path/to/gtar"
    #property "GNUTAR-LISTDIR" "/path/to/gnutar_list_dir"
    #property "NORMAL" ": socket ignored$"
    #property append "NORMAL" ": file changed as we read it$"
    #property append "NORMAL" ": directory is on a different filesystem; not dumped$"
}

$ amadmin daily config
:
DEFINE APPLICATION app_amgtar {
    COMMENT "amgtar"
    PLUGIN "amgtar"
    PROPERTY visible "xattrs" "YES"
    PROPERTY visible "acls" "YES"
    CLIENT-NAME ""
}

Uncommenting the 'property "NORMAL"' lines in the config changes the definition:

#define application-tool and dumptype for the amgtar application
define application-tool app_amgtar {
    comment "amgtar"
    plugin "amgtar"
    property "XATTRS" "YES"
    property "ACLS" "YES"
    #property "GNUTAR-PATH" "/path/to/gtar"
    #property "GNUTAR-LISTDIR" "/path/to/gnutar_list_dir"
    property "NORMAL" ": socket ignored$"
    property append "NORMAL" ": file changed as we read it$"
    property append "NORMAL" ": directory is on a different filesystem; not dumped$"
}

$ amadmin daily config
:
DEFINE APPLICATION app_amgtar {
    COMMENT "amgtar"
    PLUGIN "amgtar"
    PROPERTY visible "xattrs" "YES"
    PROPERTY visible "acls" "YES"
    PROPERTY visible "normal" ": socket ignored$" ": file changed as we read it$" ": directory is on a different filesystem; not dumped$"
    CLIENT-NAME ""
}

Is the man page incorrect? Are the 'defaults' really applied or do I have to manually specify them in the config file?

Kind regards,
Tom Robinson
IT Manager/System Administrator
MoTeC Pty Ltd
121 Merrindale Drive Croydon South 3136 Victoria Australia
T: 61 3 9761 5050
W: www.motec.com
Re: lev 0 FAILED [data timeout]
On Tue, 14 May 2019 at 22:14, Nathan Stratton Treadway wrote:

Hi Nathan,

Thanks for your reply and help.

> On Mon, May 13, 2019 at 09:59:13 +1000, Tom Robinson wrote:
> > I have a weekly backup that backs up the daily disk-based backup to tape (dailies are a separate
> > amanda config).
>
> (As a side note: if you are running Amanda 3.5 you might consider using
> vaulting to do this sort of backup, so that Amanda knows about the
> copies that are put onto the tape.)

Unfortunately, no. We're on 3.3.3. Considering updating that on an Illumos variant (OmniOS CE)... looks like I may have to compile a custom package. Originally I installed a CSW package but I haven't seen any updates for that as of yet.

> > Occasionally on the weekly backup a DLE will fail to dump, writing only a 32k header file before
> > timing out.
> >
> > I can't seem to identify the error when looking in logging. Has anyone a few clues as to what to
> > look for?
> >
> > FAILURE DUMP SUMMARY:
> >   monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [data timeout]
> >   monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [dumper returned FAILED]
> >   monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [data timeout]
> >   monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [dumper returned FAILED]
> [...]
> >
> > A while ago I changed estimate to calcsize as the estimates were taking a very long time and all
> > daily slots are a known maximum fixed size. I thought it might help with time outs. Alas not.
>
> Amanda's estimate timeouts are separate from data timeouts; changing to
> calcsize will help with the former but not the latter. (Note that if
> the slots are really a known size, using "server" estimate is even
> faster than "calcsize", since it then just uses the size from the
> previous run and doesn't read through the current contents of the DLE
> directory to add up the sizes.)
>
> You can control the data timeouts with the "dtimeout" parameter in
> amanda.conf. Just try bumping that up so that you are sure it's longer
> than a dump actually takes.

My dtimeout was set to half an hour. After I checked some logs and found the average dump time for my DLEs was around 55 minutes, I've adjusted the dtimeout to 1 hour. The last run went well so I'm waiting on the next run to see if it's consistent.

> (The sendbackup..debug and runtar..debug client log
> files should confirm that the GNU tar is running without error but then
> unable to write to the output pipe on the server side. In the server
> logs, I think the 'data timeout reached, aborting'-sort of message would
> be found in the dumper..debug file for that run...)

Yes, that helps knowing where to look!

I am getting a new issue now which is annoying but not a show stopper. I think I will have to revisit my threshold settings to fix this but maybe you can offer some insight.

I have a tape robot and the following settings in amanda.conf:

runtapes 3
flush-threshold-dumped 100
flush-threshold-scheduled 100
taperflush 100
autoflush yes

There is enough room for all the data to go on three tapes yet after the amdump run is complete only two tapes are written and I am left to flush the remaining dumps to tape manually. I think it's because I'm trying to get a whole tape's worth of data before writing to tape. Is my thinking correct?

What I'd like to do is make sure there's a tape's worth of data to write to the first two tapes in turn and then dump all remaining backup data to tape three (this will not be a complete tape's worth).
Should I be setting taperflush as follows to achieve this?

taperflush 0

Kind regards,
Tom
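For reference, a sketch of the amanda.conf settings discussed in this thread; the dtimeout value simply reflects "longer than the ~55 minute dump" mentioned above, and taperflush 0 is the proposed change, so treat these as illustrative rather than tested values:

dtimeout 3600                  # data timeout in seconds (was 1800)
runtapes 3
flush-threshold-dumped 100     # only start writing a tape once a full tape's worth is dumped
flush-threshold-scheduled 100  # ... or scheduled
taperflush 0                   # at the end of the run, flush whatever remains even if it won't fill a tape
autoflush yes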
lev 0 FAILED [data timeout]
Hi,

I have a weekly backup that backs up the daily disk-based backup to tape (dailies are a separate amanda config).

Occasionally on the weekly backup a DLE will fail to dump, writing only a 32k header file before timing out.

I can't seem to identify the error when looking in logging. Has anyone a few clues as to what to look for?

FAILURE DUMP SUMMARY:
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [data timeout]
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [dumper returned FAILED]
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [data timeout]
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [dumper returned FAILED]

FAILED DUMP DETAILS:

/-- monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [data timeout]
sendbackup: start [monza:/data/backup/amanda/vtapes/daily/slot9 level 0]
sendbackup: info BACKUP=/opt/csw/bin/gtar
sendbackup: info RECOVER_CMD=/opt/csw/bin/gtar -xpGf - ...
sendbackup: info end
\

/-- monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [data timeout]
sendbackup: start [monza:/data/backup/amanda/vtapes/daily/slot9 level 0]
sendbackup: info BACKUP=/opt/csw/bin/gtar
sendbackup: info RECOVER_CMD=/opt/csw/bin/gtar -xpGf - ...
sendbackup: info end

DLEs are using:
  root-tar
  strategy noinc
  estimate calcsize
  index no

A while ago I changed estimate to calcsize as the estimates were taking a very long time and all daily slots are a known maximum fixed size. I thought it might help with time outs. Alas not.

Kind regards,
Tom
Re: dump failed: [request failed: No route to host](too)
I've updated the server to amanda-backup_server-3.5-1 (64bit) which appears to have fixed the issue. The client that failed most regularly is running amanda-backup_client-3.3.9-1 (32bit). I'll keep monitoring this in case the situation changes but it looks like it's working properly now. On 05/10/17 08:58, Tom Robinson wrote: > > It may well be just that I can't see the wood for the trees when looking at > logging but I can't > find the problem :-( > > I'm running daily manual dumps of the FAILED DLE's to keep backups intact! > > I'm still getting the following: > > FAILURE DUMP SUMMARY: > bentley Resources lev 1 FAILED [too many dumper retry: [request failed: No > route to host]] > bentley sysadmin lev 1 FAILED [too many dumper retry: [request failed: No > route to host]] > > Apart from the two KVM hosts, all these systems are KVM Guests. The backup > server is a KVM guest. > Has anyone seen or know of issues that may occur with amanda on virtualised > infrastructure? > > From my understanding of KVM networking between guests, whole network frames > are dumped and picked > up between them. This allows higher transport speeds. I've tested the > throughput with iperf and > have seen througput as high as 25Gbps. The following ipef session shows the > connection between the > failed guest, bentley, and the backup server. I've only shown the 'server' > side results for iperf > below: > > # systemctl stop xinetd > > # iperf -p 10080 -s > > Server listening on TCP port 10080 > TCP window size: 85.3 KByte (default) > > [ 4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39214 > [ ID] Interval Transfer Bandwidth > [ 4] 0.0-10.0 sec 20.5 GBytes 17.6 Gbits/sec > [ 4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39215 > [ 4] 0.0-10.0 sec 20.7 GBytes 17.8 Gbits/sec > [ 4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39218 > [ 4] 0.0-10.0 sec 21.3 GBytes 18.3 Gbits/sec > [ 4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39223 > [ 4] 0.0-10.0 sec 21.4 GBytes 18.4 Gbits/sec > > Any clues/help for the above are appreciated. > > I'm now also getting some other strange errors that I've never seen before. > These report as > 'FAILED' but further on into the report they appear to have completed without > issue. What do the > error codes signify (e.g. FAILED [02-00098] etc.)? > > ---8<--- > > FAILURE DUMP SUMMARY: > ---8<--- > bentley ECN lev 0 FAILED [02-00098] > bentley Repair lev 1 FAILED [06-00229] > garage /var lev 1 FAILED [shm_ring cancelled] > modena /usr/src lev 1 FAILED [12-00205] > > ---8<--- > NOTES: > planner: Last full dump of bentley:ECN on tape daily02 overwritten in 5 > runs. > planner: Last level 1 dump of bentley:ECN on tape daily01 overwritten in 4 > runs. > planner: Last full dump of bentley:Repair on tape daily07 overwritten in 2 > runs. > planner: Last full dump of garage:/var on tape daily01 overwritten in 4 > runs. > > ---8<--- > DUMP SUMMARY: > DUMPER STATS > TAPER STATS > HOSTNAME DISK L ORIG-KB OUT-KB COMP% MMM:SS > KB/s MMM:SS KB/s > --- > --- --- > ---8<--- > bentley ECN 0 19790 19790-- 0:03 > 7325.0 0:00 197900.0 > bentley Repair110 0.00:00 > 4.2 0:00 0.0 > garage /var 1 7000 7000-- 0:00 > 33341.0 0:00 7.0 > modena /usr/src 1 190 147.40:04 > 3.3 0:00140.0 > ---8<--- > > > What are the error codes and did amanda dump these OK or not? 
> > Kind regards, > Tom > > > Tom Robinson > IT Manager/System Administrator > > MoTeC Pty Ltd > > 121 Merrindale Drive > Croydon South > 3136 Victoria > Australia > > T: +61 3 9761 5050 > F: +61 3 9761 5051 > E: tom.robin...@motec.com.au > On 13/09/17 23:09, Jean-Louis Martineau wrote: >> Tom, >> >> It is the system that return the "No route to host" error. >> You should check your system log (on server, client, router, firewall, nat, >> ...) for network error. >> >> Jean-Louis >> >> On 12/09/17 06:01 PM, Tom Robinson wrote: >>> bump >>> >>> On 11/09/17 12:45, Tom Robi
FAILED [shm_ring cancelled]
Hi, Splitting this out of the other thread for clarity. What does this error mean? FAILED [shm_ring cancelled] Also getting: FAILED [02-00098] FAILED [06-00229] FAILED [12-00205] These report as 'FAILED' but further on into the report they appear to have completed without issue. What do the error codes signify (e.g. FAILED [02-00098] etc.)? ---8<--- FAILURE DUMP SUMMARY: ---8<--- bentley ECN lev 0 FAILED [02-00098] bentley Repair lev 1 FAILED [06-00229] garage /var lev 1 FAILED [shm_ring cancelled] modena /usr/src lev 1 FAILED [12-00205] ---8<--- NOTES: planner: Last full dump of bentley:ECN on tape daily02 overwritten in 5 runs. planner: Last level 1 dump of bentley:ECN on tape daily01 overwritten in 4 runs. planner: Last full dump of bentley:Repair on tape daily07 overwritten in 2 runs. planner: Last full dump of garage:/var on tape daily01 overwritten in 4 runs. ---8<--- DUMP SUMMARY: DUMPER STATS TAPER STATS HOSTNAME DISK L ORIG-KB OUT-KB COMP% MMM:SS KB/s MMM:SS KB/s --- --- --- ---8<--- bentley ECN 0 19790 19790-- 0:03 7325.0 0:00 197900.0 bentley Repair110 0.00:00 4.2 0:00 0.0 garage /var 1 7000 7000-- 0:00 33341.0 0:00 7.0 modena /usr/src 1 190 147.40:04 3.3 0:00140.0 ---8<--- What are the error codes and did amanda dump these OK or not? Kind regards, Tom Tom Robinson IT Manager/System Administrator MoTeC Pty Ltd 121 Merrindale Drive Croydon South 3136 Victoria Australia T: +61 3 9761 5050 F: +61 3 9761 5051 E: tom.robin...@motec.com.au On 13/09/17 23:09, Jean-Louis Martineau wrote: > Tom, > > It is the system that return the "No route to host" error. > You should check your system log (on server, client, router, firewall, nat, > ...) for network error. > > Jean-Louis > > On 12/09/17 06:01 PM, Tom Robinson wrote: >> bump >> >> On 11/09/17 12:45, Tom Robinson wrote: >> > Hi, >> > >> > I've recently migrated our backup server from CentOS 5 to CentOS 7. I've >> > also upgraded from amanda >> > 3.3.7 to 3.4.5 >> > >> > The amcheck works fine and reports no issues. Yet, on backup runs on some >> > DLEs I get the error: >> > >> > dump failed: [request failed: No route to host](too) >> > >> > It also appears to be random as to which DLEs fail. Sometimes it's just >> > one or two on a client. >> > Other times it's all DLEs for a client. And, for any particular client it >> > can be a different DLE on >> > that client each day. >> > >> > Below is a dumper..debug log from the server. I'm not sure what to check >> > for in there. What other >> > logs should I check? 
>> > >> > Kind regards, >> > Tom >> > >> > Sun Sep 10 20:16:32.115899592 2017: pid 6088: thd-0x257f400: dumper: >> > close_producer_shm_ring >> > sem_close(sem_write 0x7fbc1588b000 >> > Sun Sep 10 20:16:32.115911222 2017: pid 6088: thd-0x257f400: dumper: >> > am_sem_close 0x7fbc1588b000 0 >> > Sun Sep 10 20:16:32.115927349 2017: pid 6088: thd-0x257f400: dumper: >> > am_sem_close 0x7fbc15889000 0 >> > Sun Sep 10 20:16:32.115938800 2017: pid 6088: thd-0x257f400: dumper: >> > am_sem_close 0x7fbc1588a000 0 >> > Sun Sep 10 20:16:32.115949293 2017: pid 6088: thd-0x257f400: dumper: >> > am_sem_close 0x7fbc15888000 0 >> > Sun Sep 10 20:16:32.337361676 2017: pid 6088: thd-0x257f400: dumper: >> > getcmd: SHM-DUMP 00-00217 >> 34076 >> > NULL 5 bentley 9efefbff3f Dispatch >> /var/lib/samba/data/public/Dispatch 1 >> > 2017:9:6:4:6:22 GNUTAR "" "" "" "" "" "" "" 1 "" "" bsdtcp AMANDA /amand >> > a_shm_control-6956-0 20 |" bsdtcp\n >> > FAST\n >> > YES\n YES\n >> > AMANDA\n""" >> > Sun Sep 10 20:16:32.337507787 2017: pid 6088: thd-0x257f400: dumper: >> > Sending header to >> localhost:34076 >> > Sun Sep 10 20:16:32.339939372 2017: pid 6088: thd-0x257f400: dumper: >> > make_socket opening socket >> with >> > family 10 >> > Sun Sep 10 20:16:32.339978452 2017: pid 6088: thd-0x257f400: dumper: >> > connect_port: Try port 1024: >> &
Re: dump failed: [request failed: No route to host](too)
It may well be just that I can't see the wood for the trees when looking at logging but I can't find the problem :-(

I'm running daily manual dumps of the FAILED DLEs to keep backups intact!

I'm still getting the following:

FAILURE DUMP SUMMARY:
  bentley Resources lev 1 FAILED [too many dumper retry: [request failed: No route to host]]
  bentley sysadmin lev 1 FAILED [too many dumper retry: [request failed: No route to host]]

Apart from the two KVM hosts, all these systems are KVM guests. The backup server is a KVM guest. Has anyone seen, or does anyone know of, issues that may occur with amanda on virtualised infrastructure?

From my understanding of KVM networking between guests, whole network frames are dumped and picked up between them. This allows higher transport speeds. I've tested the throughput with iperf and have seen throughput as high as 25Gbps. The following iperf session shows the connection between the failed guest, bentley, and the backup server. I've only shown the 'server' side results for iperf below:

# systemctl stop xinetd
# iperf -p 10080 -s
Server listening on TCP port 10080
TCP window size: 85.3 KByte (default)
[  4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39214
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  20.5 GBytes  17.6 Gbits/sec
[  4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39215
[  4]  0.0-10.0 sec  20.7 GBytes  17.8 Gbits/sec
[  4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39218
[  4]  0.0-10.0 sec  21.3 GBytes  18.3 Gbits/sec
[  4] local 10.0.19.21 port 10080 connected with 192.168.0.3 port 39223
[  4]  0.0-10.0 sec  21.4 GBytes  18.4 Gbits/sec

Any clues/help for the above are appreciated.

I'm now also getting some other strange errors that I've never seen before. These report as 'FAILED' but further on into the report they appear to have completed without issue. What do the error codes signify (e.g. FAILED [02-00098] etc.)?

---8<---
FAILURE DUMP SUMMARY:
---8<---
  bentley ECN lev 0 FAILED [02-00098]
  bentley Repair lev 1 FAILED [06-00229]
  garage /var lev 1 FAILED [shm_ring cancelled]
  modena /usr/src lev 1 FAILED [12-00205]
---8<---
NOTES:
  planner: Last full dump of bentley:ECN on tape daily02 overwritten in 5 runs.
  planner: Last level 1 dump of bentley:ECN on tape daily01 overwritten in 4 runs.
  planner: Last full dump of bentley:Repair on tape daily07 overwritten in 2 runs.
  planner: Last full dump of garage:/var on tape daily01 overwritten in 4 runs.
---8<---
DUMP SUMMARY:
                              DUMPER STATS               TAPER STATS
HOSTNAME DISK      L  ORIG-KB  OUT-KB  COMP%  MMM:SS  KB/s  MMM:SS  KB/s
--- --- ---
---8<---
bentley ECN 0 19790 19790-- 0:03 7325.0 0:00 197900.0
bentley Repair110 0.00:00 4.2 0:00 0.0
garage /var 1 7000 7000-- 0:00 33341.0 0:00 7.0
modena /usr/src 1 190 147.40:04 3.3 0:00140.0
---8<---

What are the error codes and did amanda dump these OK or not?

Kind regards,
Tom

Tom Robinson
IT Manager/System Administrator
MoTeC Pty Ltd
121 Merrindale Drive Croydon South 3136 Victoria Australia
T: +61 3 9761 5050
F: +61 3 9761 5051
E: tom.robin...@motec.com.au

On 13/09/17 23:09, Jean-Louis Martineau wrote:
> Tom,
>
> It is the system that return the "No route to host" error.
> You should check your system log (on server, client, router, firewall, nat, ...) for network error.
>
> Jean-Louis
>
> On 12/09/17 06:01 PM, Tom Robinson wrote:
>> bump
>>
>> On 11/09/17 12:45, Tom Robinson wrote:
>> > Hi,
>> >
>> > I've recently migrated our backup server from CentOS 5 to CentOS 7.
I've >> > also upgraded from amanda >> > 3.3.7 to 3.4.5 >> > >> > The amcheck works fine and reports no issues. Yet, on backup runs on some >> > DLEs I get the error: >> > >> > dump failed: [request failed: No route to host](too) >> > >> > It also appears to be random as to which DLEs fail. Sometimes it's just >> > one or two on a client. >> > Other times it's all DLEs for a client. And, for any particular client it >> > can be a different DLE on >> > that client each day. >> > >> > Below is a dumper..debug log from the server. I'm not sure what to check >> > for in there. What other >> > logs should I check? >> > >> > Kind regar
Re: dump failed: [request failed: No route to host](too)
bump On 11/09/17 12:45, Tom Robinson wrote: > Hi, > > I've recently migrated our backup server from CentOS 5 to CentOS 7. I've also > upgraded from amanda > 3.3.7 to 3.4.5 > > The amcheck works fine and reports no issues. Yet, on backup runs on some > DLEs I get the error: > > dump failed: [request failed: No route to host](too) > > It also appears to be random as to which DLEs fail. Sometimes it's just one > or two on a client. > Other times it's all DLEs for a client. And, for any particular client it can > be a different DLE on > that client each day. > > Below is a dumper..debug log from the server. I'm not sure what to check for > in there. What other > logs should I check? > > Kind regards, > Tom > > Sun Sep 10 20:16:32.115899592 2017: pid 6088: thd-0x257f400: dumper: > close_producer_shm_ring > sem_close(sem_write 0x7fbc1588b000 > Sun Sep 10 20:16:32.115911222 2017: pid 6088: thd-0x257f400: dumper: > am_sem_close 0x7fbc1588b000 0 > Sun Sep 10 20:16:32.115927349 2017: pid 6088: thd-0x257f400: dumper: > am_sem_close 0x7fbc15889000 0 > Sun Sep 10 20:16:32.115938800 2017: pid 6088: thd-0x257f400: dumper: > am_sem_close 0x7fbc1588a000 0 > Sun Sep 10 20:16:32.115949293 2017: pid 6088: thd-0x257f400: dumper: > am_sem_close 0x7fbc15888000 0 > Sun Sep 10 20:16:32.337361676 2017: pid 6088: thd-0x257f400: dumper: getcmd: > SHM-DUMP 00-00217 34076 > NULL 5 bentley 9efefbff3f Dispatch > /var/lib/samba/data/public/Dispatch 1 > 2017:9:6:4:6:22 GNUTAR "" "" "" "" "" "" "" 1 "" "" bsdtcp AMANDA /amand > a_shm_control-6956-0 20 |" bsdtcp\n > FAST\n > YES\n YES\n AMANDA\n""" > Sun Sep 10 20:16:32.337507787 2017: pid 6088: thd-0x257f400: dumper: Sending > header to localhost:34076 > Sun Sep 10 20:16:32.339939372 2017: pid 6088: thd-0x257f400: dumper: > make_socket opening socket with > family 10 > Sun Sep 10 20:16:32.339978452 2017: pid 6088: thd-0x257f400: dumper: > connect_port: Try port 1024: > available - Success > Sun Sep 10 20:16:32.340075462 2017: pid 6088: thd-0x257f400: dumper: > connect_portrange: Connect from > :::1024 failed: Connection refused > Sun Sep 10 20:16:32.340101209 2017: pid 6088: thd-0x257f400: dumper: > connect_portrange: connect to > ::1:34076 failed: Connection refused > Sun Sep 10 20:16:32.342383119 2017: pid 6088: thd-0x257f400: dumper: > make_socket opening socket with > family 2 > Sun Sep 10 20:16:32.342418634 2017: pid 6088: thd-0x257f400: dumper: > connect_port: Try port 1024: > available - Success > Sun Sep 10 20:16:32.342489613 2017: pid 6088: thd-0x257f400: dumper: > connected to 127.0.0.1:34076 > Sun Sep 10 20:16:32.342501059 2017: pid 6088: thd-0x257f400: dumper: our side > is 0.0.0.0:1024 > Sun Sep 10 20:16:32.342509347 2017: pid 6088: thd-0x257f400: dumper: > try_socksize: send buffer size > is 131072 > Sun Sep 10 20:16:32.342558663 2017: pid 6088: thd-0x257f400: dumper: send > request: > > SERVICE sendbackup > OPTIONS > features=9efefbfff3fffbf70f;maxdumps=5;hostname=bentley;config=daily; > > GNUTAR > Dispatch > /var/lib/samba/data/public/Dispatch > 1 > bsdtcp > FAST > YES > YES > AMANDA > > > > > Sun Sep 10 20:16:32.342572947 2017: pid 6088: thd-0x257f400: dumper: > security_getdriver(name=bsdtcp) > returns 0x7fbc153e86a0 > Sun Sep 10 20:16:32.342582472 2017: pid 6088: thd-0x257f400: dumper: > security_handleinit(handle=0x25e2e70, driver=0x7fbc153e86a0 (BSDTCP)) > Sun Sep 10 20:16:32.343623490 2017: pid 6088: thd-0x257f400: dumper: > security_streaminit(stream=0x283d6e0, driver=0x7fbc153e86a0 (BSDTCP)) > Sun Sep 10 20:16:32.346176806 2017: pid 6088: 
thd-0x257f400: dumper: > make_socket opening socket with > family 2 > Sun Sep 10 20:16:32.346230063 2017: pid 6088: thd-0x257f400: dumper: > connect_port: Try port 571: > available - Success > Sun Sep 10 20:16:32.346247716 2017: pid 6088: thd-0x257f400: dumper: > connect_portrange: Connect from > 0.0.0.0:571 failed: Cannot assign requested address > Sun Sep 10 20:16:32.346261235 2017: pid 6088: thd-0x257f400: dumper: > connect_portrange: connect to > 192.168.0.3:10080 failed: Cannot assign requested address > Sun Sep 10 20:16:32.348492651 2017: pid 6088: thd-0x257f400: dumper: > make_socket opening socket with > family 2 > Sun Sep 10 20:16:32.348526207 2017: pid 6088: thd-0x257f400: dumper: > connect_port: Try port 585: > available - Success > Sun Sep 10 20:18:39.587177652 2017: pid 6088: thd-0x257f400: dumper: > connect_portrange: Connect from > 0.0.0.0:585
dump failed: [request failed: No route to host](too)
Hi, I've recently migrated our backup server from CentOS 5 to CentOS 7. I've also upgraded from amanda 3.3.7 to 3.4.5 The amcheck works fine and reports no issues. Yet, on backup runs on some DLEs I get the error: dump failed: [request failed: No route to host](too) It also appears to be random as to which DLEs fail. Sometimes it's just one or two on a client. Other times it's all DLEs for a client. And, for any particular client it can be a different DLE on that client each day. Below is a dumper..debug log from the server. I'm not sure what to check for in there. What other logs should I check? Kind regards, Tom Sun Sep 10 20:16:32.115899592 2017: pid 6088: thd-0x257f400: dumper: close_producer_shm_ring sem_close(sem_write 0x7fbc1588b000 Sun Sep 10 20:16:32.115911222 2017: pid 6088: thd-0x257f400: dumper: am_sem_close 0x7fbc1588b000 0 Sun Sep 10 20:16:32.115927349 2017: pid 6088: thd-0x257f400: dumper: am_sem_close 0x7fbc15889000 0 Sun Sep 10 20:16:32.115938800 2017: pid 6088: thd-0x257f400: dumper: am_sem_close 0x7fbc1588a000 0 Sun Sep 10 20:16:32.115949293 2017: pid 6088: thd-0x257f400: dumper: am_sem_close 0x7fbc15888000 0 Sun Sep 10 20:16:32.337361676 2017: pid 6088: thd-0x257f400: dumper: getcmd: SHM-DUMP 00-00217 34076 NULL 5 bentley 9efefbff3f Dispatch /var/lib/samba/data/public/Dispatch 1 2017:9:6:4:6:22 GNUTAR "" "" "" "" "" "" "" 1 "" "" bsdtcp AMANDA /amand a_shm_control-6956-0 20 |" bsdtcp\n FAST\n YES\n YES\n AMANDA\n""" Sun Sep 10 20:16:32.337507787 2017: pid 6088: thd-0x257f400: dumper: Sending header to localhost:34076 Sun Sep 10 20:16:32.339939372 2017: pid 6088: thd-0x257f400: dumper: make_socket opening socket with family 10 Sun Sep 10 20:16:32.339978452 2017: pid 6088: thd-0x257f400: dumper: connect_port: Try port 1024: available - Success Sun Sep 10 20:16:32.340075462 2017: pid 6088: thd-0x257f400: dumper: connect_portrange: Connect from :::1024 failed: Connection refused Sun Sep 10 20:16:32.340101209 2017: pid 6088: thd-0x257f400: dumper: connect_portrange: connect to ::1:34076 failed: Connection refused Sun Sep 10 20:16:32.342383119 2017: pid 6088: thd-0x257f400: dumper: make_socket opening socket with family 2 Sun Sep 10 20:16:32.342418634 2017: pid 6088: thd-0x257f400: dumper: connect_port: Try port 1024: available - Success Sun Sep 10 20:16:32.342489613 2017: pid 6088: thd-0x257f400: dumper: connected to 127.0.0.1:34076 Sun Sep 10 20:16:32.342501059 2017: pid 6088: thd-0x257f400: dumper: our side is 0.0.0.0:1024 Sun Sep 10 20:16:32.342509347 2017: pid 6088: thd-0x257f400: dumper: try_socksize: send buffer size is 131072 Sun Sep 10 20:16:32.342558663 2017: pid 6088: thd-0x257f400: dumper: send request: SERVICE sendbackup OPTIONS features=9efefbfff3fffbf70f;maxdumps=5;hostname=bentley;config=daily; GNUTAR Dispatch /var/lib/samba/data/public/Dispatch 1 bsdtcp FAST YES YES AMANDA Sun Sep 10 20:16:32.342572947 2017: pid 6088: thd-0x257f400: dumper: security_getdriver(name=bsdtcp) returns 0x7fbc153e86a0 Sun Sep 10 20:16:32.342582472 2017: pid 6088: thd-0x257f400: dumper: security_handleinit(handle=0x25e2e70, driver=0x7fbc153e86a0 (BSDTCP)) Sun Sep 10 20:16:32.343623490 2017: pid 6088: thd-0x257f400: dumper: security_streaminit(stream=0x283d6e0, driver=0x7fbc153e86a0 (BSDTCP)) Sun Sep 10 20:16:32.346176806 2017: pid 6088: thd-0x257f400: dumper: make_socket opening socket with family 2 Sun Sep 10 20:16:32.346230063 2017: pid 6088: thd-0x257f400: dumper: connect_port: Try port 571: available - Success Sun Sep 10 20:16:32.346247716 2017: pid 6088: thd-0x257f400: dumper: 
connect_portrange: Connect from 0.0.0.0:571 failed: Cannot assign requested address Sun Sep 10 20:16:32.346261235 2017: pid 6088: thd-0x257f400: dumper: connect_portrange: connect to 192.168.0.3:10080 failed: Cannot assign requested address Sun Sep 10 20:16:32.348492651 2017: pid 6088: thd-0x257f400: dumper: make_socket opening socket with family 2 Sun Sep 10 20:16:32.348526207 2017: pid 6088: thd-0x257f400: dumper: connect_port: Try port 585: available - Success Sun Sep 10 20:18:39.587177652 2017: pid 6088: thd-0x257f400: dumper: connect_portrange: Connect from 0.0.0.0:585 failed: Connection timed out Sun Sep 10 20:18:39.587235409 2017: pid 6088: thd-0x257f400: dumper: connect_portrange: connect to 192.168.0.3:10080 failed: Connection timed out Sun Sep 10 20:18:39.587267623 2017: pid 6088: thd-0x257f400: dumper: stream_client: Could not bind to port in range 512-1023. Sun Sep 10 20:18:39.587290672 2017: pid 6088: thd-0x257f400: dumper: security_seterror(handle=0x25e2e70, driver=0x7fbc153e86a0 (BSDTCP) error=Connection timed out) Sun Sep 10 20:18:39.587299769 2017: pid 6088: thd-0x257f400: dumper: security_close(handle=0x25e2e70, driver=0x7fbc153e86a0 (BSDTCP)) Sun Sep 10 20:18:39.58
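Jean-Louis's advice earlier in the digest (check the system logs on the server, the client, and everything in between for network errors) can be made concrete with something like the following on a CentOS 7 box; the timestamps are just the window around the failed dump in the debug log above, and the grep patterns are only a starting point, not a recipe from this thread:

# look for network-level errors around the failure time
journalctl --since "2017-09-10 20:00" --until "2017-09-10 20:30" | grep -iE "route|reject|drop|conntrack"
# see whether a host firewall is active and what it allows
firewall-cmd --state && firewall-cmd --list-all
# kernel messages such as "nf_conntrack: table full, dropping packet" would explain intermittent refusals
dmesg | grep -i conntrack

As noted elsewhere in the thread, the failures here eventually went away after upgrading the server packages, so these checks are only for ruling out the network layer, not the established cause.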
Re: ERROR: client: [DUMP program not available]
On 25/09/15 06:24, Heiko Schlittermann wrote:
> Stefan Piperov (Thu 24 Sep 2015 22:16:12 CEST):
>> It's amazing that a utility like dump/restore, which has been part of UNIX
>> since forever, can reach the state where it's considered a dead project and be
>> unsupported...
>>
>> - Stefan.
> Yes. I'm talking about http://sourceforge.net/projects/dump/, if you
> know a dump/restore tool that's more alive, I'd appreciate any hint.
>
> (And yes, I was happy with dump/restore, until restore wasn't able to
> restore a dump.)
>
> Bugs:
> http://sourceforge.net/p/dump/bugs/

Agreed. The dump utility has a long history. Looking at the changelog on the CentOS 7 rpm package and the sourceforge bug list I can see that Red Hat have been active in maintenance. Not sure if these patches are making it back to the repository at sourceforge, though. Also not sure how many of the bugs they have addressed.
Re: ERROR: client: [DUMP program not available]
On 25/09/15 08:52, Tom Robinson wrote: > On 24/09/15 08:23, Tom Robinson wrote: >> Hi Paul, >> >> Thanks and, yes, that was it. The community package to which you refer works. > Well, I spoke too soon. I neglected to remember that CentOS 7 defaults to an xfs filesystem (which > I'm using). I'm pretty sure that you have to use xfsdump for that. > > I posted on the CentOS list as well. Here's what they said > https://lists.centos.org/pipermail/centos/2015-September/155046.html (note there are a couple of > unsigned test packages referenced there as well: > https://lists.centos.org/pipermail/centos/2015-March/150446.html) Are there any plans to support the xfs filesystem with xfsdump? Has anyone encountered this before?
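Pending proper xfsdump support, the usual workaround (and what the other two DLEs on this client already do) is to back the xfs filesystem up via GNU tar instead of DUMP. A sketch of what that might look like for the /var DLE in this thread, reusing the existing "global" dumptype; the dumptype name comp-root-tar is just an illustrative choice, not something from the thread:

define dumptype comp-root-tar {
    global
    program "GNUTAR"
    comment "Root partitions with compression, dumped with tar"
    compress client fast
    priority low
}

client /var {
    auth "bsdtcp"
    comp-root-tar
}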
Re: ERROR: client: [DUMP program not available]
On 24/09/15 08:23, Tom Robinson wrote: > Hi Paul, > > Thanks and, yes, that was it. The community package to which you refer works. Well, I spoke too soon. I neglected to remember that CentOS 7 defaults to an xfs filesystem (which I'm using). I'm pretty sure that you have to use xfsdump for that. I posted on the CentOS list as well. Here's what they said https://lists.centos.org/pipermail/centos/2015-September/155046.html (note there are a couple of unsigned test packages referenced there as well: https://lists.centos.org/pipermail/centos/2015-March/150446.html)
Re: ERROR: client: [DUMP program not available]
Hi Paul, Thanks and, yes, that was it. The community package to which you refer works. Note that RHEL7/CentOS7 now use systemd extensively and amanda backup networking is handled by amanda.socket in their native packages (i.e. they do not use xinetd). The community package still uses xinetd. Are there any plans to update the community packages to use the systemd structure? Kind regards, Tom On 23/09/15 10:36, Paul Yeatman wrote: > Hi Tom! > > It sounds like either Amanda was not compiled on the client when "dump" > was installed or you are using an Amanda package that does not have > "dump" built-in (was not compiled on a host with "dump") which I > understand is true for the Amanda package in the CentOS 7 repo. If you > want to use "dump" on this host, you will need to compile Amanda on the > host now that "dump" is installed or use a package from another source > that has "dump" built-in such as the community packages from Zmanda: > > http://www.zmanda.com/downloads/community/Amanda/3.3.7/Redhat_Enterprise_7.0/amanda-backup_client-3.3.7-1.rhel7.x86_64.rpm > > Cheers! > > > On Wed, 2015-09-23 at 09:05 +1000, Tom Robinson wrote: >> CentOS 5, amanda server 2.6.0p2-1.rhel5 >> CentOS 7, amanda client 3.3.3-13.el7 >> >> Hi, >> >> I'm configuring a new client into our backups with two GNUTAR based DLEs and >> one DUMP based DLE. The >> DUMP based one fails: >> >> client /var { >> auth "bsdtcp" >> nocomp-root >> } >> >> Here are the dumptypes: >> >> define dumptype global { >> comment "Global definitions" >> index yes >> } >> define dumptype comp-root { >> global >> comment "Root partitions with compression" >> compress client fast >> priority low >> } >> define dumptype >> nocomp-roohttp://www.zmanda.com/downloads/community/Amanda/3.3.7/Redhat_Enterprise_7.0/amanda-backup_client-3.3.7-1.rhel7.x86_64.rpmt >> { >> comp-root >> comment "Root partitions without compression" >> compress none >> } >> >> When I run a client check I get this error on the server: >> >> $ amcheck daily -c client >> >> Amanda Backup Client Hosts Check >> >> ERROR: client: [DUMP program not available] >> ERROR: client: [RESTORE program not available] >> Client check: 1 host checked in 0.138 seconds. 2 problems found. >> >> (brought to you by Amanda 2.6.0p2) >> >> Yet on the client I have dump installed: >> >> $ rpm -qa | grep dump >> dump-0.4-0.22.b44.el7.x86_64 >> $ which restore >> /usr/sbin/restore >> $ which dump >> /usr/sbin/dump >> $ ls -l /usr/sbin/{dump,restore} >> -rwxr-xr-x. 1 root root 83616 Jun 10 2014 /usr/sbin/dump >> -rwxr-xr-x. 1 root root 129392 Jun 10 2014 /usr/sbin/restore >> $ ls -l /sbin/{dump,restore} >> -rwxr-xr-x. 1 root root 83616 Jun 10 2014 /sbin/dump >> -rwxr-xr-x. 1 root root 129392 Jun 10 2014 /sbin/restore >> >> And selinux is usually on but I have it off for this test: >> >> $ getenforce >> Permissive >> >> AFAICT I have dump installed but there seems to be an error in executing it. >> When the backup runs >> the sendsize log on the client I shows: >> >> ...8<... 
>> Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: waiting for any >> estimate child: 1 running >> Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: calculating for >> amname /var, dirname /var, >> spindle -1 DUMP >> Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: getting size via >> dump for /var level 0 >> Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: calculating for >> device /var with >> Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: critical (fatal): no >> dump program available >> /lib64/libamanda-3.3.3.so(+0x2b707)[0x7f62963f3707] >> /lib64/libglib-2.0.so.0(g_logv+0x1b1)[0x7f62951419c1] >> /lib64/libglib-2.0.so.0(g_log+0x8f)[0x7f6295141c5f] >> /usr/lib64/amanda/sendsize(getsize_dump+0x344)[0x7f6296cbdad4] >> /usr/lib64/amanda/sendsize(dump_calc_estimates+0x95)[0x7f6296cbdb85] >> /usr/lib64/amanda/sendsize(calc_estimates+0x2b6)[0x7f6296cc1c86] >> /usr/lib64/amanda/sendsize(main+0x151a)[0x7f6296cbc6fa] >> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f6293b8eaf5] >> /usr/lib64/amanda/sendsize(+0x47a9)[0x7f6296cbc7a9] >> Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: child 502 exited >> with status 1 >> Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: pid 483 finish time >> Tue Sep 22 20:00:02 2015 >> ...8<... >> >> Is this a permissions issue or am I missing a library? Can anyone please >> shed some light on this? >> >> Kind regards, >> Tom >> >> signature.asc Description: OpenPGP digital signature
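On the systemd/xinetd point: a short, hedged checklist of how the two packagings are typically activated on a CentOS 7 client (unit names as shipped; the Zmanda community client still relies on xinetd as noted above):

# native CentOS 7 client packages: socket activation
systemctl enable amanda.socket
systemctl start amanda.socket
systemctl status amanda.socket

# Zmanda community client package: xinetd
yum install -y xinetd
systemctl enable xinetd
systemctl start xinetd
ls /etc/xinetd.d/ | grep -i amanda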
ERROR: client: [DUMP program not available]
CentOS 5, amanda server 2.6.0p2-1.rhel5 CentOS 7, amanda client 3.3.3-13.el7 Hi, I'm configuring a new client into our backups with two GNUTAR based DLEs and one DUMP based DLE. The DUMP based one fails: client /var { auth "bsdtcp" nocomp-root } Here are the dumptypes: define dumptype global { comment "Global definitions" index yes } define dumptype comp-root { global comment "Root partitions with compression" compress client fast priority low } define dumptype nocomp-root { comp-root comment "Root partitions without compression" compress none } When I run a client check I get this error on the server: $ amcheck daily -c client Amanda Backup Client Hosts Check ERROR: client: [DUMP program not available] ERROR: client: [RESTORE program not available] Client check: 1 host checked in 0.138 seconds. 2 problems found. (brought to you by Amanda 2.6.0p2) Yet on the client I have dump installed: $ rpm -qa | grep dump dump-0.4-0.22.b44.el7.x86_64 $ which restore /usr/sbin/restore $ which dump /usr/sbin/dump $ ls -l /usr/sbin/{dump,restore} -rwxr-xr-x. 1 root root 83616 Jun 10 2014 /usr/sbin/dump -rwxr-xr-x. 1 root root 129392 Jun 10 2014 /usr/sbin/restore $ ls -l /sbin/{dump,restore} -rwxr-xr-x. 1 root root 83616 Jun 10 2014 /sbin/dump -rwxr-xr-x. 1 root root 129392 Jun 10 2014 /sbin/restore And selinux is usually on but I have it off for this test: $ getenforce Permissive AFAICT I have dump installed but there seems to be an error in executing it. When the backup runs the sendsize log on the client I shows: ...8<... Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: waiting for any estimate child: 1 running Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: calculating for amname /var, dirname /var, spindle -1 DUMP Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: getting size via dump for /var level 0 Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: calculating for device /var with Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: critical (fatal): no dump program available /lib64/libamanda-3.3.3.so(+0x2b707)[0x7f62963f3707] /lib64/libglib-2.0.so.0(g_logv+0x1b1)[0x7f62951419c1] /lib64/libglib-2.0.so.0(g_log+0x8f)[0x7f6295141c5f] /usr/lib64/amanda/sendsize(getsize_dump+0x344)[0x7f6296cbdad4] /usr/lib64/amanda/sendsize(dump_calc_estimates+0x95)[0x7f6296cbdb85] /usr/lib64/amanda/sendsize(calc_estimates+0x2b6)[0x7f6296cc1c86] /usr/lib64/amanda/sendsize(main+0x151a)[0x7f6296cbc6fa] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f6293b8eaf5] /usr/lib64/amanda/sendsize(+0x47a9)[0x7f6296cbc7a9] Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: child 502 exited with status 1 Tue Sep 22 20:00:02 2015: thd-0x7f6298019e00: sendsize: pid 483 finish time Tue Sep 22 20:00:02 2015 ...8<... Is this a permissions issue or am I missing a library? Can anyone please shed some light on this? Kind regards, Tom -- Tom Robinson IT Manager/System Administrator MoTeC Pty Ltd 121 Merrindale Drive Croydon South 3136 Victoria Australia T: +61 3 9761 5050 F: +61 3 9761 5051 E: tom.robin...@motec.com.au signature.asc Description: OpenPGP digital signature
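As Paul's reply further up explains, amcheck reports "DUMP program not available" when the client build did not find dump at compile time, so the fix is either a client package built with dump support or a rebuild on the client after installing dump. A rough sketch of the rebuild route; the configure flags shown are common choices and the user/group names are the usual Zmanda defaults, not something prescribed in this thread:

# install dump first so Amanda's configure detects it
yum install -y dump
# then rebuild/reinstall the client from source (3.3.7 matches the package Paul linked)
tar xf amanda-3.3.7.tar.gz && cd amanda-3.3.7
./configure --with-user=amandabackup --with-group=disk
make && make install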
Re: zmanda DLEs
On 24/06/15 16:02, Tom Robinson wrote:

I have a DLE that I'm not sure about when backing up Windows clients using zmanda. If the DLE has spaces in it, do I need to escape them? Also, do I need to escape special characters? e.g.

mito cirris C:/Program Files (x86)/Cirris {

I believe the above is correct but the planner still fails. The amanda server is quite old (2.6.0p2).

Using the following DLE in the disklist:

mito cirris C:/Program Files (x86)/Cirris {
    zwc-compress
}

fails at the planner stage. This is what the planner shows:

# /usr/libexec/amanda/planner dailytest
---8<---
GETTING ESTIMATES...
got a bad message, stopped at: C:/Program Files (x86)/Cirris
error result for host mito disk cirris: badly formatted response from mito
error result for host mito disk C:/TNT: badly formatted response from mito
error result for host mito disk C:/Startrak: badly formatted response from mito
planner: time 0.420: getting estimates took 0.389 secs
FAILED QUEUE:
  0: mito cirris
  1: mito C:/TNT
  2: mito C:/Startrak
DONE QUEUE: empty
ANALYZING ESTIMATES...
planner: FAILED mito cirris 20150630083237 0 [badly formatted response from mito]
planner: FAILED mito C:/TNT 20150630083237 0 [badly formatted response from mito]
planner: FAILED mito C:/Startrak 20150630083237 0 [badly formatted response from mito]
INITIAL SCHEDULE (size 2064):
---8<---

Interestingly, when I remove the 'diskname' from the DLE the error goes away. So with a DLE of:

mito C:/Program Files (x86)/Cirris {
    zwc-compress
}

I see this:

# /usr/libexec/amanda/planner dailytest
---8<---
GETTING ESTIMATES...
planner: time 0.339: got result for host mito disk C:/Program Files (x86)/Cirris: 0 - 23438K, -1 - -2K, -1 - -2K
planner: time 0.339: got result for host mito disk C:/TNT: 0 - 22552K, -1 - -2K, -1 - -2K
planner: time 0.339: got result for host mito disk C:/Startrak: 0 - 369740K, -1 - -2K, -1 - -2K
planner: time 0.339: getting estimates took 0.338 secs
FAILED QUEUE: empty
DONE QUEUE:
  0: mito C:/Program Files (x86)/Cirris
  1: mito C:/TNT
  2: mito C:/Startrak
ANALYZING ESTIMATES...
pondering mito:C:/Program Files (x86)/Cirris... next_level0 -16616 last_level -1 (due for level 0) (new disk, can't switch to degraded mode)
curr level 0 nsize 23438 csize 11719 total size 14815 total_lev0 11719 balanced-lev0size 1674
pondering mito:C:/TNT... next_level0 -16616 last_level -1 (due for level 0) (new disk, can't switch to degraded mode)
curr level 0 nsize 22552 csize 11276 total size 27123 total_lev0 22995 balanced-lev0size 3284
pondering mito:C:/Startrak... next_level0 -16616 last_level -1 (due for level 0) (new disk, can't switch to degraded mode)
curr level 0 nsize 369740 csize 184870 total size 213025 total_lev0 207865 balanced-lev0size 29694
INITIAL SCHEDULE (size 213025):
  mito C:/Startrak pri 16617 lev 0 nsize 369740 csize 184870
  mito C:/Program Files (x86)/Cirris pri 16617 lev 0 nsize 23438 csize 11719
  mito C:/TNT pri 16617 lev 0 nsize 22552 csize 11276
---8<---

The syntax for a DLE from the documentation is this:

hostname diskname [diskdevice] dumptype [spindle [interface]]

But using a diskname fails at planner stage as shown in the first example. Also, the DLE with a diskname disrupts the other DLEs that don't have disknames. That is, the following also fail until I remove the diskname from the other DLE:

mito C:/Startrak {
    #starttime 0525
    zwc-compress
}
mito C:/TNT {
    #starttime 0525
    zwc-compress
}

Is it because the amanda version I'm running is very old or is there a problem elsewhere? Maybe I just did something wrong.
Kind regards,
Tom
zmanda DLEs
Hi,

I have a DLE that I'm not sure about when backing up Windows clients using zmanda. If the DLE has spaces in it, do I need to escape them? Also, do I need to escape special characters? e.g.

mito cirris C:/Program Files (x86)/Cirris {

or

mito cirris C:/Program\ Files\ \(x86\)/Cirris {

or

mito cirris C:/Program\ Files\ (x86)/Cirris {

or none of the above!

Kind regards,
Tom

--
Tom Robinson
IT Manager/System Administrator
MoTeC Pty Ltd
121 Merrindale Drive Croydon South 3136 Victoria Australia
T: +61 3 9761 5050
F: +61 3 9761 5051
E: tom.robin...@motec.com.au
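For anyone searching the archives: in the disklist, names containing spaces are normally handled by double-quoting the diskname and/or diskdevice rather than backslash-escaping, and parentheses need no escaping inside the quotes. A sketch using the names above (whether the rather old 2.6.0p2 planner copes with a separate diskname at all is a different question, as the follow-up earlier in this digest shows):

mito "cirris" "C:/Program Files (x86)/Cirris" {
    zwc-compress
}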
mysql-zrm issues
Hi,

Does anyone know if there is a working mailing list for mysql-zrm? I have some issues with the mysql-zrm post-backup plugin if anyone knows anything about that.

Kind regards,
Tom

--
Tom Robinson
IT Manager/System Administrator
MoTeC Pty Ltd
121 Merrindale Drive Croydon South 3136 Victoria Australia
T: +61 3 9761 5050
F: +61 3 9761 5051
E: tom.robin...@motec.com.au
Re: mysql-zrm issues
Thanks Paddy. I posted there about a week ago but no one has responded.

https://forums.zmanda.com/showthread.php?5647-Problems-running-post-backup-plugin

I thought the community would be a bit more knowledgeable and active.

On 13/02/15 09:32, Paddy Sreenivasan wrote:

Please post it in forums.zmanda.com

On Thu, Feb 12, 2015 at 2:16 PM, Tom Robinson tom.robin...@motec.com.au wrote:

Hi,

Does anyone know if there is a working mailing list for mysql-zrm? I have some issues with the mysql-zrm post-backup plugin if anyone knows anything about that.

Kind regards,
Tom

--
Tom Robinson
IT Manager/System Administrator
MoTeC Pty Ltd
121 Merrindale Drive Croydon South 3136 Victoria Australia
T: +61 3 9761 5050
F: +61 3 9761 5051
E: tom.robin...@motec.com.au
Re: Backups to tape consistently under 60% tape capacity
The hardware was just over a year old. Even IBM commented on the usage hours being very low. Very surprising to have it go bad when so young. t. On 06/12/14 04:16, Debra S Baddorf wrote: Wow. How old was the hardware? I had the impression it was newish. Just curious. Deb On Dec 4, 2014, at 11:19 PM, Tom Robinson tom.robin...@motec.com.au wrote: Hi, Just to tidy off this thread, the hardware was at fault. We got a new HBA in the process as we thought tape performance may have been affected by sharing an HBA with disks. It turns out that the real issue was that the tape unit itself was faulty (and it had damaged one tape - or was it the tape that damaged the unit?). With the support of IBM, we updated the firmware of both the library and tape drive but after several tests showed no improvement, IBM shipped a replacement drive SLED. I now get the native capacity on the tape as expected. Kind regards, Tom On 20/10/14 10:49, Tom Robinson wrote: Hi, I'm not sure why I'm not getting such good tape usage any more and wonder if someone can help me. Until recently I was getting quite good tape usage on my 'weekly' config: USAGE BY TAPE: Label Time Size % DLEs Parts weekly013:10 1749362651K 117.91616 weekly023:09 1667194493K 112.42121 weekly033:08 1714523420K 115.51616 weekly043:04 1664570982K 112.22121 weekly053:11 1698357067K 114.51717 weekly063:07 1686467027K 113.72121 weekly073:03 1708584546K 115.11717 weekly083:11 1657764181K 111.72121 weekly093:03 1725209913K 116.31717 weekly103:12 1643311109K 110.72121 weekly013:06 1694157008K 114.21717 For that last entry, the mail report looked like this: These dumps were to tape weekly01. Not using all tapes because 1 tapes filled; runtapes=1 does not allow additional tapes. There are 198378440K of dumps left in the holding disk. They will be flushed on the next run. Which was fairly typical and to be expected since the tune of flush settings was: flush-threshold-dumped 100 flush-threshold-scheduled 100 taperflush 100 autoflush yes Now, without expectation, the dumps started to look like this: weekly023:21 1289271529K 86.91010 weekly033:17 854362421K 57.61111 weekly043:20 839198404K 56.61111 weekly059:40 637259676K 42.9 5 5 weekly06 10:54 806737591K 54.41515 weekly091:1235523072K2.4 1 1 weekly093:21 841844504K 56.71111 weekly013:16 842557835K 56.81919 About the time it started looking different, I introduced a second config for 'archive' but I can't see why that would affect my 'weekly' run. I had a couple of bad runs and had to flush them manually and I'm not sure what happened with tapes weekly07 and weekly08 (they appear to be missing) and weekly09 is dumped to twice in succession. This looks very weird. 
$ amadmin weekly find | grep weekly07 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot4 0 weekly07 1 1/-1 PARTIAL PARTIAL $ amadmin weekly find | grep weekly08 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot4 0 weekly08 1 1/-1 PARTIAL PARTIAL $ amadmin weekly find | grep weekly09 2014-09-21 00:00:00 monza / 0 weekly09 9 1/1 OK 2014-09-21 00:00:00 monza /data/backup/amanda/vtapes/daily/slot1 0 weekly09 10 1/1 OK 2014-09-21 00:00:00 monza /data/backup/amanda/vtapes/daily/slot2 0 weekly09 11 1/-1 OK PARTIAL 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot4 0 weekly09 1 1/1 OK 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot5 0 weekly09 2 1/1 OK 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot6 0 weekly09 3 1/1 OK 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot7 0 weekly09
Re: Backups to tape consistently under 60% tape capacity
Hi,

Just to tidy off this thread, the hardware was at fault. We got a new HBA in the process as we thought tape performance may have been affected by sharing an HBA with disks. It turns out that the real issue was that the tape unit itself was faulty (and it had damaged one tape - or was it the tape that damaged the unit?). With the support of IBM, we updated the firmware of both the library and tape drive but, after several tests showed no improvement, IBM shipped a replacement drive SLED. I now get the native capacity on the tape as expected.

Kind regards,
Tom

On 20/10/14 10:49, Tom Robinson wrote:
[...]
Re: Backups to tape consistently under 60% tape capacity
On 24/10/14 07:48, Jon LaBadie wrote: On Thu, Oct 23, 2014 at 10:34:38AM -0400, Gene Heskett wrote: On Thursday 23 October 2014 01:28:01 Tom Robinson did opine ... If you are feeding the tape device compressed files, and the drives compressor is enabled too, this will quite often cause file expansions on the tape itself. The drives compressor, because it is intended to handle the compression on the fly, is generally not sophisticated enough to do any further compression and will add to the datasize, expanding what actually goes down the cable to the drives heads. Tom is using an LTO drive (-5 I think). Most modern tape drives, including all LTO's do not exhibit the bad behavior of the DDS drives with their run-length encoding scheme. IIRC, they have enough cpu smarts and memory to first collect the data in memory, try to compress it to another another memory buffer, and if it is enlarged the block is saved uncompressed. Note, instead of a flag at the start of the tape indicating compressed or uncompressed, there is a flag for each tape block. jl Thanks for all the feedback. Jon is correct, I'm using an LTO5 drive. Note that, about a year ago I posed this question (and received no response): Tape Compression: Is it on or off (https://www.mail-archive.com/amanda-users%40amanda.org/msg47097.html) If you read that post, there is a riddle to be answered by checking the tape driver flags to determine if compression is on or off. I set the driver flags and used a 'non-compression' device node, as per the man page (/dev/rmt/0bn or BSD style, no-compression, no-rewind). All seemed well until recently. For reference, below are my amtapetype outputs from back then and from yesterday. Notably, compression always reports as 'enabled'. Also, I'm dumping the same, compressed DLEs as before. I'm not sure compression is the factor here. Regards, Tom Back then (2014-10-14) amtapetype showed: $ amtapetype -f -t ULT3580-TD5 weekly /dev/rmt/0bn Checking for FSF_AFTER_FILEMARK requirement device-property FSF_AFTER_FILEMARK false Applying heuristic check for compression. Wrote random (uncompressible) data at 73561559.3650794 bytes/sec Wrote fixed (compressible) data at 193099093.33 bytes/sec Compression: enabled Writing one file to fill the volume. Wrote 1515988615168 bytes at 73464 kb/sec Got LEOM indication, so drive and kernel together support LEOM Writing smaller files (15159885824 bytes) to determine filemark. define tapetype ULT3580-TD5 { comment Created by amtapetype; compression enabled length 1480457632 kbytes filemark 0 kbytes speed 73464 kps blocksize 32 kbytes } # for this drive and kernel, LEOM is supported; add # device-property LEOM TRUE # for this device. today (2014-10-23) I get: Checking for FSF_AFTER_FILEMARK requirement Applying heuristic check for compression. Wrote random (uncompressible) data at 67415370.8307692 bytes/sec Wrote fixed (compressible) data at 273874944 bytes/sec Compression: enabled Writing one file to fill the volume. Wrote 761266700288 bytes at 57975 kb/sec Got LEOM indication, so drive and kernel together support LEOM Writing smaller files (7612661760 bytes) to determine filemark. device-property FSF_AFTER_FILEMARK false define tapetype ULT3580-TD5 { comment Created by amtapetype; compression enabled length 743424512 kbytes filemark 987 kbytes speed 57975 kps blocksize 512 kbytes } # for this drive and kernel, LEOM is supported; add # device-property LEOM TRUE # for this device. signature.asc Description: OpenPGP digital signature
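For what it's worth, the compression heuristic can be reproduced by hand if anyone wants to double-check it outside amtapetype. The sketch below is only an illustration: it assumes the no-rewind node /dev/rmt/0bn from earlier in this thread, a couple of GB of free space on the holding disk, and a scratch tape (it overwrites whatever is loaded).

$ dd if=/dev/urandom of=/data/backup/amanda/random.dat bs=1024k count=2048   # stage uncompressible data
$ mt -f /dev/rmt/0bn rewind
$ time dd if=/data/backup/amanda/random.dat of=/dev/rmt/0bn bs=512k          # uncompressible write
$ mt -f /dev/rmt/0bn rewind
$ time dd if=/dev/zero of=/dev/rmt/0bn bs=512k count=4096                    # highly compressible write
# If the zero run is much faster than the random run, the drive is compressing;
# comparable speeds suggest compression really is off on this node.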
Re: Backups to tape consistently under 60% tape capacity
On 24/10/14 07:59, Jon LaBadie wrote:
On Thu, Oct 23, 2014 at 04:28:01PM +1100, Tom Robinson wrote:
Now I have to work out why my tape is reporting as smaller! amtapetype reports my tape is only half as big for the same block size...(was 1483868160 is now 743424512). :-/

Checking for FSF_AFTER_FILEMARK requirement
Applying heuristic check for compression.
Wrote random (uncompressible) data at 67415370.8307692 bytes/sec
Wrote fixed (compressible) data at 273874944 bytes/sec
Compression: enabled
Writing one file to fill the volume.
Wrote 761266700288 bytes at 57975 kb/sec
Got LEOM indication, so drive and kernel together support LEOM
Writing smaller files (7612661760 bytes) to determine filemark.
device-property FSF_AFTER_FILEMARK false
define tapetype ULT3580-TD5 {
    comment Created by amtapetype; compression enabled
    length 743424512 kbytes
    filemark 987 kbytes
    speed 57975 kps
    blocksize 512 kbytes
}
# for this drive and kernel, LEOM is supported; add
#   device-property LEOM TRUE
# for this device.

Note it is not only reporting the lower size, but dumps are experiencing it as well. IIRC, you are using an LTO-5. My peek at the web says that format can record at up to 280 MB/s. You are now only seeing about 58 MB/s. Is that a big change from your previous runs? Feeding a drive too slowly, i.e. below its ability to stream continuously, can reduce the apparent capacity of the tape. If this is the case you may have to find ways to increase the data flow rate to your drive.

Jon

Thanks Jon. I was getting 'speed 73464 kps' previously, so yes, there is some performance degradation. I never saw anything close to 280 MB/s in my tests with amtapetype. Is there another way to accurately test tape write performance?

Regards,
Tom
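Following on from Jon's point about feeding the drive too slowly, a quick way to see whether the source side can keep up is to time a plain read of one of the vtape slot files shown elsewhere in this thread and compare it with the drive's native rate (about 140 MB/s for LTO-5). This is only a sketch; the path is an assumption, and the file should be larger than RAM (or freshly written) so the filesystem cache doesn't flatter the result.

$ dd if=/data/backup/amanda/vtapes/daily/slot7/00070.scion._home.1 of=/dev/null bs=512k
# If this reports well under ~140 MB/s, the disk side alone cannot stream the
# drive, and shoe-shining (with reduced apparent capacity) is to be expected.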
Re: Backups to tape consistently under 60% tape capacity
On 24/10/14 08:30, Jean-Francois Malouin wrote:
* Jon LaBadie j...@jgcomp.com [20141023 16:59]:
[...]

Stepping in, I missed the earlier comments so maybe this is not appropriate or OT, in which case just toss me in the dust bin. Your length and speed are way off. This is my tapetype for a HP Ultrium LTO-5:

define tapetype tape-lto5 {
    comment Created by amtapetype; compression disabled
    length 1480900608 kbytes
    filemark 3413 kbytes
    speed 107063 kps
    blocksize 2048 kbytes
    part-size 100gb
    part-cache-max-size 100gb
    part-cache-type disk
    part-cache-dir /holddisk
}

In this case, amtapetype disappointingly reported only ~100 MB/s (native speed per specs is 140 MB/s) but in my local setup I frequently see values up to 300 MB/s with 'averaged xfer rate' around the specs value, eg, see the attached munin graph of my holding disk performance from last night's run; the green line is data from the holding disk to the tape. LTO-5 will stream from 40 MB/s (I think) to 150 MB/s; lower than that and you're shoe-shining. If the data xfer from the holding disk to the drive can sustain the max data rate of the drive (140 MB/s), this suggests you have a dud drive or are experiencing other hardware issues. I would test the hardware directly without amanda in the way using native OS tools, dd or whatever you fancy.

Quite possibly the tape driver is at fault. I'm using the generic SCSI tape driver (st) together with the SCSI generic driver (sgen) for the tape library. I tried installing the IBMtape driver, but failed to get it to work on OmniOS.
Re: Backups to tape consistently under 60% tape capacity
Tom Robinson IT Manager/System Administrator MoTeC Pty Ltd 121 Merrindale Drive Croydon South 3136 Victoria Australia T: +61 3 9761 5050 F: +61 3 9761 5051 E: tom.robin...@motec.com.au On 24/10/14 09:23, Jon LaBadie wrote: On Fri, Oct 24, 2014 at 08:23:55AM +1100, Tom Robinson wrote: Thanks for all the feedback. Jon is correct, I'm using an LTO5 drive. Note that, about a year ago I posed this question (and received no response): Tape Compression: Is it on or off (https://www.mail-archive.com/amanda-users%40amanda.org/msg47097.html) If you read that post, there is a riddle to be answered by checking the tape driver flags to determine if compression is on or off. I set the driver flags and used a 'non-compression' device node, as per the man page (/dev/rmt/0bn or BSD style, no-compression, no-rewind). All seemed well until recently. For reference, below are my amtapetype outputs from back then and from yesterday. Notably, compression always reports as 'enabled'. Also, I'm dumping the same, compressed DLEs as before. I'm not sure compression is the factor here. Regards, Tom As reported on the manpage for amtapetype, the compression enabled test can be fooled by several factors including very fast drives. I think the test assumes that tape writing speed is the limiting factor. Thus uncompressible data approximates the actual write speed and if more compressible data can be written in the same time, compression must be enabled. My LTO experience is limited, but I wonder about block size. You do not specify a tape block size in your amtapetype runs (-b size) and thus are defaulting to 32KB. Is this also true in your amdump runs, i.e. do you specify a blocksize in your tapetype definition? It is possible you can get better performance using a larger tape block size. Check it out with a couple of amtapetype runs. As to the performance, it seems lower than it should be and lower than it used to be. What has changed? Have you added more devices to the controller that the LTO-5 drive uses? Is it still using the same controller? I recall some reports of amanda users dedicating a high-performance controller to their LTO drives. Otherwise they couldn't feed the drive as fast as it could write. Sorry, I've copied the wrong output. Last year I did do the amtapetype test specifying block size 512 kbytes. Here's what I got last year: Checking for FSF_AFTER_FILEMARK requirement Applying heuristic check for compression. Wrote random (uncompressible) data at 85721088 bytes/sec Wrote fixed (compressible) data at 295261525.33 bytes/sec Compression: enabled Writing one file to fill the volume. Wrote 1519480995840 bytes at 85837 kb/sec Got LEOM indication, so drive and kernel together support LEOM Writing smaller files (15194390528 bytes) to determine filemark. device-property FSF_AFTER_FILEMARK false define tapetype ULT3580-TD5 { comment Created by amtapetype; compression enabled length 1483868160 kbytes filemark 868 kbytes speed 85837 kps blocksize 512 kbytes } # for this drive and kernel, LEOM is supported; add # device-property LEOM TRUE # for this device. The recent test with amtapetype is also specifying block size of 512 kbytes: Checking for FSF_AFTER_FILEMARK requirement Applying heuristic check for compression. Wrote random (uncompressible) data at 67415370.8307692 bytes/sec Wrote fixed (compressible) data at 273874944 bytes/sec Compression: enabled Writing one file to fill the volume. 
Wrote 761266700288 bytes at 57975 kb/sec
Got LEOM indication, so drive and kernel together support LEOM
Writing smaller files (7612661760 bytes) to determine filemark.
device-property FSF_AFTER_FILEMARK false
define tapetype ULT3580-TD5 {
    comment Created by amtapetype; compression enabled
    length 743424512 kbytes
    filemark 987 kbytes
    speed 57975 kps
    blocksize 512 kbytes
}
# for this drive and kernel, LEOM is supported; add
#   device-property LEOM TRUE
# for this device.

As to what has changed: about 25 days ago (around the end of September 2014) we upgraded the OS (prompted by issues unrelated to backup), but the tape performance was already suffering before that, starting in August with a drop from 117% (2014-08-10) to 86% of tape capacity the following week, then declining week by week to 57.6%, 56.6%, 42.9%, 54.4%, etc., and remaining at about 56% since.

Some speed tests with dd:

$ dd if=/dev/zero of=/data/backup/amanda/file1 bs=1024k count=2048
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 1.34568 s, 1.6 GB/s

$ time dd if=/data/backup/amanda/file1 of=/dev/rmt/0b bs=512k count=4000
4000+0 records in
4000+0 records out
2097152000 bytes (2.1 GB) copied, 13.4122 s, 156 MB/s

real    0m13.427s
user    0m0.015s
sys     0m1.519s

$ time dd if=/data/backup/amanda/file1 of=/dev/rmt/0b bs=512k count=4000
4000+0
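If it helps, Jon's block-size suggestion can be tried without waiting for a production run. This is just a sketch using the same -b flag that appears elsewhere in this thread; a full amtapetype pass overwrites the tape, so use a scratch volume.

$ amtapetype -f -b 2048k -t ULT3580-TD5 weekly /dev/rmt/0bn   # measure with 2 MiB tape blocks
# If the reported speed and length improve, carry the larger size into the
# tapetype definition:
#   blocksize 2048 kbytes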
Re: Backups to tape consistently under 60% tape capacity
On 24/10/14 08:30, Jean-Francois Malouin wrote:
[...]
I would test the hardware directly without amanda in the way using native OS tool, dd or whatever you fancy.

Here are two files from the daily dump:

1021882368 Oct 22 21:02 slot4/00076.rook._data_data0.1
1094219252 Oct 17 20:32 slot7/00070.scion._home.1

Here's a tar directly to tape (run as the amanda user):

$ time tar cvf /dev/rmt/0b slot7/00070.scion._home.1 slot4/00076.rook._data_data0.1
slot7/00070.scion._home.1
slot4/00076.rook._data_data0.1

real    0m29.755s
user    0m0.656s
sys     0m5.063s

$ echo '((1094219252+1021882368)/1000/1000)/29.755' | bc -l
71.11751369517728112922

Some speed tests with dd:

$ dd if=/dev/zero of=/data/backup/amanda/file1 bs=1024k count=2048
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 1.34568 s, 1.6 GB/s

$ time dd if=/data/backup/amanda/file1 of=/dev/rmt/0b bs=512k count=4000
4000+0 records in
4000+0 records out
2097152000 bytes (2.1 GB) copied, 13.4122 s, 156 MB/s

real    0m13.427s
user    0m0.015s
sys     0m1.519s

$ time dd if=/data/backup/amanda/file1 of=/dev/rmt/0b bs=512k count=4000
4000+0 records in
4000+0 records out
2097152000 bytes (2.1 GB) copied, 13.4404 s, 156 MB/s

real    0m13.456s
user    0m0.014s
sys     0m1.471s

$ time dd if=/dev/zero of=/dev/rmt/0b bs=512k count=4000
4000+0 records in
4000+0 records out
2097152000 bytes (2.1 GB) copied, 8.24686 s, 254 MB/s

real    0m8.262s
user    0m0.011s
sys     0m0.297s

$ time dd if=/dev/zero of=/dev/rmt/0b bs=512k count=4000
4000+0 records in
4000+0 records out
2097152000 bytes (2.1 GB) copied, 8.1345 s, 258 MB/s

real    0m8.150s
user    0m0.011s
sys     0m0.299s

Writing directly to tape with dd is much faster than reading from the filesystem and writing to tape. In which case maybe I have a controller issue?
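One more data point that might help separate the suspects: the dd runs above test reading and writing separately, while taper does both at once. A rough sketch of an overlapped test, using a pipe so the read and the write happen concurrently (file name and block size as in the runs above):

$ time dd if=/data/backup/amanda/file1 ibs=512k | dd of=/dev/rmt/0b obs=512k
# obs=512k makes the second dd accumulate full 512 KiB blocks before writing.
# If this stays near the ~156 MB/s seen above, simultaneous read+write is not
# the problem; a sharp drop would point at the disk and the tape drive
# contending on the same HBA/controller.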
Re: Backups to tape consistently under 60% tape capacity
Now I have to work out why my tape is reporting as smaller! amtapetype reports my tape is only half as big for the same block size...(was 1483868160 is now 743424512). :-/ Checking for FSF_AFTER_FILEMARK requirement Applying heuristic check for compression. Wrote random (uncompressible) data at 67415370.8307692 bytes/sec Wrote fixed (compressible) data at 273874944 bytes/sec Compression: enabled Writing one file to fill the volume. Wrote 761266700288 bytes at 57975 kb/sec Got LEOM indication, so drive and kernel together support LEOM Writing smaller files (7612661760 bytes) to determine filemark. device-property FSF_AFTER_FILEMARK false define tapetype ULT3580-TD5 { comment Created by amtapetype; compression enabled length 743424512 kbytes filemark 987 kbytes speed 57975 kps blocksize 512 kbytes } # for this drive and kernel, LEOM is supported; add # device-property LEOM TRUE # for this device. On 23/10/14 03:19, Debra S Baddorf wrote: Re-running amtapetype might be a very good idea. It might point to where the problem isn’t, at least! Do double check your cables. People have found problems in cables which look like reduced throughput. “Mpath” — I don’t know what it is, but could it have changed with your OS upgrade? Wouldn’t hurt to check that the tape driver setting haven’t changed with the OS work ….. but otherwise, it sounds good. Deb On Oct 21, 2014, at 7:43 PM, Tom Robinson tom.robin...@motec.com.au wrote: Hmm, I did check my tape driver settings. When I set the tape library up, I spent a long time getting the driver settings right (on OmniOS) and took copious notes on the settings. My queries reveal that I'm not using compression (which is what I wanted as the vtapes are already compressed). LTO5 native is 1.5T; compressed is 3T. All my tapes are 'newish' (about one year old). The tape unit is the same age. For months I was consistently getting over 110% (highest 117%), then capacity dropped once to 86% and then consistently to about 56% (lowest 42%). Is there a block size issue (2x56=112)? All the weekly dumps are local so network shouldn't be an issue. The tape unit is using redundant SAS connectors. Maybe it's a mpath thing? Should I re-run amtapetype to see what it thinks the 'new' tape capacity is after upgrading the OS? On 22/10/14 10:47, Debra S Baddorf wrote: Yeah, it sure does look like it ought to fit! Could the tapes be dirty and not holding as much any more??? I have no idea if that’s even possible. But it kinds seems like your tapes are saying “I don’t want that much data”.Compression issues? Your tapes were previously getting 117% capacity, and now are only doing 86%. Is that the general summary? Unless somebody else can suggest some network (cable to tape drive?) or system problems which might make the tapes appear smaller than before? Compression issues? Read the original message at the bottom of this email for the original problem complaint. Deb On Oct 21, 2014, at 6:20 PM, Tom Robinson tom.robin...@motec.com.au wrote: Hi Debra, A brilliant motivational speech. Thanks. Well worth the read. In homage, I strongly suggest anyone who hasn't read it to go and do that now. Here it is again for those whose mouse wheel is dysfunctional: http://www.appleseeds.org/Big-Rocks_Covey.htm I will try your suggestions but want to make clear that the virtual tapes you see are the product of a daily run which is disk only. The weekly run puts all those daily dumps onto tape which then leaves the building. So I have both virtual and real tapes. 
The issues I'm having are in the weekly run (the dump to real tape of a set of virtual tapes). The tapes can be viewed as a bunch of big/little rocks. The total amount of data, however they are stacked, should still fit on a single LTO5 tape (amtapetype told me: length 1483868160 kbytes): $ pwd /data/backup/amanda/vtapes/daily $ du -shc * 512 data 1.0Kinfo 119Gslot1 144Gslot2 115Gslot3 101Gslot4 80G slot5 157Gslot6 189Gslot7 117Gslot8 1019G total Plus: 4.2G/ 212M/export 212M/export/home 212M/export/home/tom So, it looks like I do still have some big rocks to put in first but on the surface of it, it looks like it should all fit in anyway (Did I sum that up wrongly? Looks less than my tape length.). Thanks, Tom (BTW, your last email is not so much a diatribe as a oratory or allocution). On 22/10/14 03:31, Debra S Baddorf wrote: Since nobody else is chiming in, I’ll have another go. I don’t think there IS a dry-run of the taping process, since so much depends on the timing of when a DLE is finished and ready to go to tape, and the physical fitting it onto tape (although, since you have a virtual tape, presumably that isn’t as subject
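Just to make the arithmetic explicit (a quick check with bc, in the same spirit as the other calculations in this thread): roughly 1019G of vtapes plus about 4.4G of system DLEs comes to about 1.07e9 kbytes, against the measured tape length of 1483868160 kbytes, so on paper it should fit with room to spare.

$ echo '(1019 + 4.2 + 0.212) * 1024 * 1024' | bc -l
1073125261.312
# ~1.07e9 kbytes of data vs 1483868160 kbytes of tape: roughly 400 GB of headroom.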
Re: Backups to tape consistently under 60% tape capacity
Hi Debra, A brilliant motivational speech. Thanks. Well worth the read. In homage, I strongly suggest anyone who hasn't read it to go and do that now. Here it is again for those whose mouse wheel is dysfunctional: http://www.appleseeds.org/Big-Rocks_Covey.htm I will try your suggestions but want to make clear that the virtual tapes you see are the product of a daily run which is disk only. The weekly run puts all those daily dumps onto tape which then leaves the building. So I have both virtual and real tapes. The issues I'm having are in the weekly run (the dump to real tape of a set of virtual tapes). The tapes can be viewed as a bunch of big/little rocks. The total amount of data, however they are stacked, should still fit on a single LTO5 tape (amtapetype told me: length 1483868160 kbytes): $ pwd /data/backup/amanda/vtapes/daily $ du -shc * 512 data 1.0Kinfo 119Gslot1 144Gslot2 115Gslot3 101Gslot4 80G slot5 157Gslot6 189Gslot7 117Gslot8 1019G total Plus: 4.2G/ 212M/export 212M/export/home 212M/export/home/tom So, it looks like I do still have some big rocks to put in first but on the surface of it, it looks like it should all fit in anyway (Did I sum that up wrongly? Looks less than my tape length.). Thanks, Tom (BTW, your last email is not so much a diatribe as a oratory or allocution). On 22/10/14 03:31, Debra S Baddorf wrote: Since nobody else is chiming in, I’ll have another go. I don’t think there IS a dry-run of the taping process, since so much depends on the timing of when a DLE is finished and ready to go to tape, and the physical fitting it onto tape (although, since you have a virtual tape, presumably that isn’t as subject to variation as a real tape might be). I wonder if your root (or boot or sys or whatever you call them) partitions are now just slightly bigger, after your operating system upgrade. That would affect the way things fit into the tape. One has to put the biggest things in first, then the next biggest that will still fit, etc to make the most of the tape size. (see http://www.appleseeds.org/Big-Rocks_Covey.htm for the life motivational analysis type speech that uses this principal too) Yet you, Tom, are telling amanda to finish all the small things first, and then put them onto tape as soon as they are done: dumporder “sssS” taperalgo first I have mine set to finish the big dumps first, so I can put them on the tape first dumporder “BTBTBTBTBT Then — I want amanda to wait until it has a whole tapeful before it starts writing — just so that all those “big pieces” are done and available to be chosen. flush-threshold-dumped100 And THEN — I tell amanda to use the principle in the above motivational speech — PUT THE BIG THINGS IN FIRST to be sure they fit (and that I don’t have a 40% space left at the end of the tape which still isn’t big enough for that Big DLE that just now finished). taperalgo largestfit# pick the biggest file that will fit in space left #Greedy Algorithm -- best polynomial time choice # (err, I think it was maybe my suggestion that caused the creation of this option, # cuz of the Knapsack problem the Greedy Algorithm from comp sic # classes.Which is the same as the motivational speech above.) Put the # big stuff in first! Then you can always fit the little stuff in the remaining space. 
SO TRY THIS: If your operating system DLE is now big enough that it doesn’t fit in that last 40% of the tape — then make sure it is ready earlier dumporder “BBB” or “BTBT” etc and that the taper waits till it has a whole tape worth flush-threshold-dumped 100 AND that it chooses the biggest bits first taperalgo largestfit. Make those three changes and see if it helps. I bet your tapes will again be mostly full, and only the little bits will be left over to flush next time. Deb Baddorf Fermilab (ps the caps aren’t shouting — they are meant to make skimming this long winded diatribe easier!) On Oct 20, 2014, at 6:51 PM, Tom Robinson tom.robin...@motec.com.au wrote: Hi Debra, Thanks for you comments especially regarding 'no record'. I did already make that setting in my disklist file for all DLEs. eg: host /path { root-tar strategy noinc record no } I didn't check it though until you mentioned it, so thanks again. I did read the man page regarding the settings for autoflush to distinguish the no/yes/all semantics. I chose specifically 'yes' ('With yes, only dump [sic] matching the command line argument are flushed.'). Since I'm using 'yes' and not 'all' for autoflush, I don't think that has been interfering. When I ran the manual flush I did
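For anyone following along, the three changes Deb suggests would look something like this in the weekly config's amanda.conf (a sketch only; the parameter names are the ones already used in this thread):

dumporder "BTBTBTBTBT"        # finish the biggest dumps first
flush-threshold-dumped 100    # wait until a full tape's worth is in the holding disk
taperalgo largestfit          # then write the largest dump that still fits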
Re: Backups to tape consistently under 60% tape capacity
Hmm, I did check my tape driver settings. When I set the tape library up, I spent a long time getting the driver settings right (on OmniOS) and took copious notes on the settings. My queries reveal that I'm not using compression (which is what I wanted as the vtapes are already compressed). LTO5 native is 1.5T; compressed is 3T. All my tapes are 'newish' (about one year old). The tape unit is the same age. For months I was consistently getting over 110% (highest 117%), then capacity dropped once to 86% and then consistently to about 56% (lowest 42%). Is there a block size issue (2x56=112)? All the weekly dumps are local so network shouldn't be an issue. The tape unit is using redundant SAS connectors. Maybe it's a mpath thing? Should I re-run amtapetype to see what it thinks the 'new' tape capacity is after upgrading the OS? On 22/10/14 10:47, Debra S Baddorf wrote: Yeah, it sure does look like it ought to fit! Could the tapes be dirty and not holding as much any more??? I have no idea if that’s even possible. But it kinds seems like your tapes are saying “I don’t want that much data”. Compression issues? Your tapes were previously getting 117% capacity, and now are only doing 86%. Is that the general summary? Unless somebody else can suggest some network (cable to tape drive?) or system problems which might make the tapes appear smaller than before? Compression issues? Read the original message at the bottom of this email for the original problem complaint. Deb On Oct 21, 2014, at 6:20 PM, Tom Robinson tom.robin...@motec.com.au wrote: Hi Debra, A brilliant motivational speech. Thanks. Well worth the read. In homage, I strongly suggest anyone who hasn't read it to go and do that now. Here it is again for those whose mouse wheel is dysfunctional: http://www.appleseeds.org/Big-Rocks_Covey.htm I will try your suggestions but want to make clear that the virtual tapes you see are the product of a daily run which is disk only. The weekly run puts all those daily dumps onto tape which then leaves the building. So I have both virtual and real tapes. The issues I'm having are in the weekly run (the dump to real tape of a set of virtual tapes). The tapes can be viewed as a bunch of big/little rocks. The total amount of data, however they are stacked, should still fit on a single LTO5 tape (amtapetype told me: length 1483868160 kbytes): $ pwd /data/backup/amanda/vtapes/daily $ du -shc * 512 data 1.0Kinfo 119Gslot1 144Gslot2 115Gslot3 101Gslot4 80G slot5 157Gslot6 189Gslot7 117Gslot8 1019G total Plus: 4.2G/ 212M/export 212M/export/home 212M/export/home/tom So, it looks like I do still have some big rocks to put in first but on the surface of it, it looks like it should all fit in anyway (Did I sum that up wrongly? Looks less than my tape length.). Thanks, Tom (BTW, your last email is not so much a diatribe as a oratory or allocution). On 22/10/14 03:31, Debra S Baddorf wrote: Since nobody else is chiming in, I’ll have another go. I don’t think there IS a dry-run of the taping process, since so much depends on the timing of when a DLE is finished and ready to go to tape, and the physical fitting it onto tape (although, since you have a virtual tape, presumably that isn’t as subject to variation as a real tape might be). I wonder if your root (or boot or sys or whatever you call them) partitions are now just slightly bigger, after your operating system upgrade. That would affect the way things fit into the tape. 
One has to put the biggest things in first, then the next biggest that will still fit, etc to make the most of the tape size. (see http://www.appleseeds.org/Big-Rocks_Covey.htm for the life motivational analysis type speech that uses this principal too) Yet you, Tom, are telling amanda to finish all the small things first, and then put them onto tape as soon as they are done: dumporder “sssS” taperalgo first I have mine set to finish the big dumps first, so I can put them on the tape first dumporder “BTBTBTBTBT Then — I want amanda to wait until it has a whole tapeful before it starts writing — just so that all those “big pieces” are done and available to be chosen. flush-threshold-dumped100 And THEN — I tell amanda to use the principle in the above motivational speech — PUT THE BIG THINGS IN FIRST to be sure they fit (and that I don’t have a 40% space left at the end of the tape which still isn’t big enough for that Big DLE that just now finished). taperalgo largestfit# pick the biggest file that will fit in space left #Greedy Algorithm -- best polynomial time choice # (err, I think it was maybe my suggestion that caused the creation of this option
Re: Backups to tape consistently under 60% tape capacity
Anyone care to comment?

On 20/10/14 10:49, Tom Robinson wrote:
[...]
Re: Backups to tape consistently under 60% tape capacity
Thanks Debra. I know there's a lot of info I dumped in my original email so maybe my question/message wasn't clear. I'm still confused over this. I only started dabbling with the flush settings because I wasn't getting more than about 56% on the tape. I can't see how setting it back will change that. When I add up what flushed and what's not flushed, it appears as if it would all fit on the tape. Is there any way of testing this in a so called 'dry run'? Otherwise I'll be waiting weeks to see what a couple of tweaks here and there will actually do. On 21/10/14 08:28, Debra S Baddorf wrote: Here’s a thought: orig: flush-threshold-dumped 100 flush-threshold-scheduled 100 taperflush 100 autoflush yes now: flush-threshold-dumped 50 flush-threshold-scheduled 100 taperflush 0 autoflush yes You now allow amanda to start writing to tape when only 50% of the data is ready. (flush-threshold-dumped). Previously, 100% had to be ready — and THAT allows the best fit of DLE’s onto tape. Ie: - pick the biggest DLE that will fit. Write it to tape. - repeat. Now, the biggest one may not be done yet. But you’ve already started writing all the small pieces onto the tape, so maybe when you reach the Big Guy, there is no space for it. The “Greedy Algorithm” (above: pick biggest. repeat) works best when all the parts are available for it to choose. Try setting flush-threshold-dumped back to 100.It won’t write as SOON — cuz it waits till 100% of a tape is available, but it might FILL the tape better. I think. Deb Baddorf Fermilab On Oct 20, 2014, at 3:44 PM, Tom Robinson tom.robin...@motec.com.au wrote: Anyone care to comment? On 20/10/14 10:49, Tom Robinson wrote: Hi, I'm not sure why I'm not getting such good tape usage any more and wonder if someone can help me. Until recently I was getting quite good tape usage on my 'weekly' config: USAGE BY TAPE: Label Time Size % DLEs Parts weekly013:10 1749362651K 117.91616 weekly023:09 1667194493K 112.42121 weekly033:08 1714523420K 115.51616 weekly043:04 1664570982K 112.22121 weekly053:11 1698357067K 114.51717 weekly063:07 1686467027K 113.72121 weekly073:03 1708584546K 115.11717 weekly083:11 1657764181K 111.72121 weekly093:03 1725209913K 116.31717 weekly103:12 1643311109K 110.72121 weekly013:06 1694157008K 114.21717 For that last entry, the mail report looked like this: These dumps were to tape weekly01. Not using all tapes because 1 tapes filled; runtapes=1 does not allow additional tapes. There are 198378440K of dumps left in the holding disk. They will be flushed on the next run. Which was fairly typical and to be expected since the tune of flush settings was: flush-threshold-dumped 100 flush-threshold-scheduled 100 taperflush 100 autoflush yes Now, without expectation, the dumps started to look like this: weekly023:21 1289271529K 86.91010 weekly033:17 854362421K 57.61111 weekly043:20 839198404K 56.61111 weekly059:40 637259676K 42.9 5 5 weekly06 10:54 806737591K 54.41515 weekly091:1235523072K2.4 1 1 weekly093:21 841844504K 56.71111 weekly013:16 842557835K 56.81919 About the time it started looking different, I introduced a second config for 'archive' but I can't see why that would affect my 'weekly' run. I had a couple of bad runs and had to flush them manually and I'm not sure what happened with tapes weekly07 and weekly08 (they appear to be missing) and weekly09 is dumped to twice in succession. This looks very weird. 
$ amadmin weekly find | grep weekly07 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot4 0 weekly07 1 1/-1 PARTIAL PARTIAL $ amadmin weekly find | grep weekly08 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot4 0 weekly08 1 1/-1 PARTIAL PARTIAL $ amadmin weekly find | grep weekly09 2014-09-21 00:00:00 monza / 0 weekly09 9 1/1 OK 2014-09-21 00:00:00 monza /data/backup/amanda/vtapes/daily/slot1 0 weekly09 10 1/1 OK 2014-09-21 00:00:00 monza /data/backup/amanda/vtapes/daily/slot2 0 weekly09
Re: Backups to tape consistently under 60% tape capacity
Hi Debra, Thanks for you comments especially regarding 'no record'. I did already make that setting in my disklist file for all DLEs. eg: host /path { root-tar strategy noinc record no } I didn't check it though until you mentioned it, so thanks again. I did read the man page regarding the settings for autoflush to distinguish the no/yes/all semantics. I chose specifically 'yes' ('With yes, only dump [sic] matching the command line argument are flushed.'). Since I'm using 'yes' and not 'all' for autoflush, I don't think that has been interfering. When I ran the manual flush I did have to override the flush settings because amanda didn't want to flush to tape at all. Just sat there waiting for more data, I guess. I didn't record the command and it's no longer in my history. From memory, I think it was: $ amflush -o flush-threshold-dumped=0 -o flush-threshold-scheduled=0 -o taperflush=0 -o autoflush=no weekly So essentially I was trying to flush with 'defaults' restored. Would that mess with my scheduled runs? Anyone have some clues about 'dry running' to see what tuning I need to tune without actually doing it? Regards, Tom On 21/10/14 10:27, Debra S Baddorf wrote: Not an actual answer, but two comments: 1- you’ve added a new config “archive”.Make sure you set it “no record” so that when IT does a level 0 of some disk, your normal config doesn’t read that as ITS level 0. The “level 0 was done date” info is not specific to the configuration, but to the disk itself. For a “dump” type dump (as opposed to tar) it is stored in /etc/dumpdates, and any dump done gets written there. Amanda’s configurations are “meta data” that amanda knows about but that the disk itself doesn’t know about. So your archive config might be changing the dump level patterns of your other config, unless you set the archive config to “no record”. I’m not sure if this is affecting your current setup, but you did just add that new config. 2- I became aware about a year ago that “autoflush yes” is no longer the only opposite to “autoflush no”.There is also a new-ish “autoflush all”. If you type “amdump MyConfig”the either “yes” or “all” should flush everything. But if you type “amdump MyConfig aParticularNodeName” then it will only flush DLE’s that match that node name, unless you set it to “autoflush all”. You did mention that you had to do a few flushes lately. If you really meant that you had to allow some DLE’s to auto-flush, then the “all” vs “yes” might make a difference to you. Other people: how can he do a “dry run” here? Deb On Oct 20, 2014, at 6:05 PM, Tom Robinson tom.robin...@motec.com.au wrote: Thanks Debra. I know there's a lot of info I dumped in my original email so maybe my question/message wasn't clear. I'm still confused over this. I only started dabbling with the flush settings because I wasn't getting more than about 56% on the tape. I can't see how setting it back will change that. When I add up what flushed and what's not flushed, it appears as if it would all fit on the tape. Is there any way of testing this in a so called 'dry run'? Otherwise I'll be waiting weeks to see what a couple of tweaks here and there will actually do. On 21/10/14 08:28, Debra S Baddorf wrote: Here’s a thought: orig: flush-threshold-dumped 100 flush-threshold-scheduled 100 taperflush 100 autoflush yes now: flush-threshold-dumped 50 flush-threshold-scheduled 100 taperflush 0 autoflush yes You now allow amanda to start writing to tape when only 50% of the data is ready. (flush-threshold-dumped). 
Previously, 100% had to be ready — and THAT allows the best fit of DLE’s onto tape. Ie: - pick the biggest DLE that will fit. Write it to tape. - repeat. Now, the biggest one may not be done yet. But you’ve already started writing all the small pieces onto the tape, so maybe when you reach the Big Guy, there is no space for it. The “Greedy Algorithm” (above: pick biggest. repeat) works best when all the parts are available for it to choose. Try setting flush-threshold-dumped back to 100.It won’t write as SOON — cuz it waits till 100% of a tape is available, but it might FILL the tape better. I think. Deb Baddorf Fermilab On Oct 20, 2014, at 3:44 PM, Tom Robinson tom.robin...@motec.com.au wrote: Anyone care to comment? On 20/10/14 10:49, Tom Robinson wrote: Hi, I'm not sure why I'm not getting such good tape usage any more and wonder if someone can help me. Until recently I was getting quite good tape usage on my 'weekly' config: USAGE BY TAPE: Label Time Size % DLEs Parts weekly013:10 1749362651K 117.91616 weekly023:09 1667194493K 112.42121 weekly033:08 1714523420K 115.5
Backups to tape consistently under 60% tape capacity
Hi,

I'm not sure why I'm not getting such good tape usage any more and wonder if someone can help me. Until recently I was getting quite good tape usage on my 'weekly' config:

USAGE BY TAPE:
  Label      Time         Size      %  DLEs  Parts
  weekly01   3:10  1749362651K  117.9    16     16
  weekly02   3:09  1667194493K  112.4    21     21
  weekly03   3:08  1714523420K  115.5    16     16
  weekly04   3:04  1664570982K  112.2    21     21
  weekly05   3:11  1698357067K  114.5    17     17
  weekly06   3:07  1686467027K  113.7    21     21
  weekly07   3:03  1708584546K  115.1    17     17
  weekly08   3:11  1657764181K  111.7    21     21
  weekly09   3:03  1725209913K  116.3    17     17
  weekly10   3:12  1643311109K  110.7    21     21
  weekly01   3:06  1694157008K  114.2    17     17

For that last entry, the mail report looked like this:

These dumps were to tape weekly01.
Not using all tapes because 1 tapes filled; runtapes=1 does not allow additional tapes.
There are 198378440K of dumps left in the holding disk.
They will be flushed on the next run.

Which was fairly typical and to be expected, since the flush settings were tuned as:

flush-threshold-dumped 100
flush-threshold-scheduled 100
taperflush 100
autoflush yes

Now, unexpectedly, the dumps started to look like this:

  weekly02   3:21  1289271529K   86.9    10     10
  weekly03   3:17   854362421K   57.6    11     11
  weekly04   3:20   839198404K   56.6    11     11
  weekly05   9:40   637259676K   42.9     5      5
  weekly06  10:54   806737591K   54.4    15     15
  weekly09   1:12    35523072K    2.4     1      1
  weekly09   3:21   841844504K   56.7    11     11
  weekly01   3:16   842557835K   56.8    19     19

About the time it started looking different, I introduced a second config for 'archive' but I can't see why that would affect my 'weekly' run. I had a couple of bad runs and had to flush them manually, and I'm not sure what happened with tapes weekly07 and weekly08 (they appear to be missing) and why weekly09 is dumped to twice in succession. This looks very weird.

$ amadmin weekly find | grep weekly07
2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot4 0 weekly07  1 1/-1 PARTIAL PARTIAL

$ amadmin weekly find | grep weekly08
2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot4 0 weekly08  1 1/-1 PARTIAL PARTIAL

$ amadmin weekly find | grep weekly09
2014-09-21 00:00:00 monza /                                      0 weekly09  9 1/1  OK
2014-09-21 00:00:00 monza /data/backup/amanda/vtapes/daily/slot1 0 weekly09 10 1/1  OK
2014-09-21 00:00:00 monza /data/backup/amanda/vtapes/daily/slot2 0 weekly09 11 1/-1 OK PARTIAL
2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot4 0 weekly09  1 1/1  OK
2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot5 0 weekly09  2 1/1  OK
2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot6 0 weekly09  3 1/1  OK
2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot7 0 weekly09  4 1/1  OK
2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot8 0 weekly09  5 1/1  OK
2014-09-14 00:00:00 monza /export                                0 weekly09  6 1/1  OK
2014-09-14 00:00:00 monza /export/home                           0 weekly09  7 1/1  OK
2014-09-14 00:00:00 monza /export/home/tom                       0 weekly09  8 1/1  OK

More recently (about three weeks ago) I upgraded the OS. I don't think it has anything to do with this but mention it for completeness.

To get as much on tape as possible I was originally using:

flush-threshold-dumped 100
flush-threshold-scheduled 100
taperflush 100
autoflush yes

But now, in an effort to tune
Re: Which blocksize do I need to set and where?
Hi Jean-Louis, Thanks for clarifying that. Interesting to see that the '-f' option for amlabel overrides the readblocksize. I got an error reading 512MiB but found a threshold at 1MiB where I could read the tape: dd if=/dev/rmt/0bn bs=1024K count=1 of=/dev/null 0+1 records in 0+1 records out 32768 bytes (33 kB) copied, 0.0103029 s, 3.2 MB/s This doesn't make sense as I had written to the tape 2MiB blocks using amtapetype previously: $ amtapetype -f -b 2048k -t ULT3580-TD5 weekly /dev/rmt/0bn Checking for FSF_AFTER_FILEMARK requirement device-property FSF_AFTER_FILEMARK false Applying heuristic check for compression. Wrote random (uncompressible) data at 87839727.2131148 bytes/sec Wrote fixed (compressible) data at 282011755.789474 bytes/sec Compression: enabled Writing one file to fill the volume. Wrote 1519520841728 bytes at 86133 kb/sec Writing smaller files (15193866240 bytes) to determine filemark. define tapetype ULT3580-TD5 { comment Created by amtapetype; compression enabled length 1483907072 kbytes filemark 1820 kbytes speed 86133 kps blocksize 2048 kbytes } # LEOM is not supported for this drive and kernel I powered off, then on the tape unit and got this: dd if=/dev/rmt/0bn bs=1024K count=1 of=/dev/null 0+1 records in 0+1 records out 1047552 bytes (1.0 MB) copied, 0.0172134 s, 60.9 MB/s On further inspection I think I have a driver issue. I see this in my logs when I try to write with blocksizes over 1MiB: dd if=/dev/rmt/0b bs=1048577 count=1 of=/dev/null dd: reading ‘/dev/rmt/0b’: Invalid argument 0+0 records in 0+0 records out 0 bytes (0 B) copied, 0.00532612 s, 0.0 kB/s Dec 18 15:09:50 monza.motec.com.au scsi: [ID 107833 kern.notice] /pci@0,0/pci8086,3c0a@3,2/pci1000,3020@0/iport@f/tape@w5000e11156304003,0 (st5): Dec 18 15:09:50 monza.motec.com.au Read Write scsi_init_pkt() failure Dec 18 15:09:50 monza.motec.com.au scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,3c0a@3,2/pci1000,3020@0/iport@f/tape@w5000e11156304003,0 (st5): Dec 18 15:09:50 monza.motec.com.au errors after pkt alloc (b_flags=0x2200065, b_error=0x16) Any thoughts? Kind regards, Tom On 18/12/13 01:14, Jean-Louis Martineau wrote: Tom, You can define the block size as the BLOCK-SIZE device property in the changer section or in the tapetype, the device-property is used if it is set. You can define the read_block_size as the READ-BLOCK-SIZE device property in the changer section or in the tapetype, the device-property is used if it is set. The problem is the tape is already written with a block larger than 2048k. dd if=/dev/rmt/0bn bs=512m count=1 of=/dev/null should print the size of the block (if = 512m) You can increase the value of the READ-BLOCK-SIZE device property, the readblocksize tapetype setting or simply use the '-f' flag of amlabel: amlabel -f weekly weekly00 slot 1 Jean-Louis On 12/16/2013 08:01 PM, Tom Robinson wrote: Amanda 3.3.3 OmniOS (OpenSolaris derivative) Hi, I've got an IBM TS3200 robot working with mtx. The raw mtx command sees the robot and tapes OK but when I use amtape I get a 'block size too small' error. The amlabel command also fails with: $ amlabel weekly weekly00 slot 1 Reading label... Error reading volume label: Error reading Amanda header: block size too small Not writing label. In amanda.conf I adjusted the blocksize and readblocksize but amanda is still not satisfied. Which blocksize is amanda unhappy about and where can I successfully adjust the settings? 
Kind regards, Tom in amanda.conf: define changer robot { tpchanger chg-robot:/dev/scsi/changer/c1t5000E11156304003d1 property tape-device 0=tape:/dev/rmt/0bn #property eject-before-unload yes property use-slots 1-23 device-property BLOCK_SIZE 2048k } tapedev robot tapetype ULT3580-TD5 define tapetype global { part_size 3G part_cache_type none } define tapetype ULT3580-TD5 { global comment Created by amtapetype; compression enabled length 1483907072 kbytes filemark 1820 kbytes speed 86133 kps blocksize 2048 kbytes readblocksize 2048 kbytes } $ mtx status Storage Changer /dev/scsi/changer/c1t5000E11156304003d1:1 Drives, 24 Slots ( 1 Import/Export ) Data Transfer Element 0:Full (Storage Element 3 Loaded):VolumeTag = 050CHOL5 Storage Element 1:Full :VolumeTag=WEEK00L5 Storage Element 2:Full :VolumeTag=WEEK01L5 Storage Element 3:Empty Storage Element 4:Full :VolumeTag=051CHOL5 Storage Element 5:Full :VolumeTag=052CHOL5 Storage Element 6:Full :VolumeTag=053CHOL5 Storage Element 7:Full :VolumeTag=054CHOL5 Storage Element 8:Empty Storage Element 9:Empty Storage Element 10:Empty Storage Element 11:Empty Storage Element 12:Empty Storage Element 13:Empty Storage Element 14:Empty Storage Element 15:Empty Storage Element 16:Empty Storage Element 17:Empty Storage Element 18:Empty Storage Element 19:Empty Storage Element 20:Empty Storage Element 21:Empty Storage Element 22:Empty Storage Element 23:Empty Storage Element 24
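As a concrete sketch of Jean-Louis's alternatives (values here are placeholders; READ_BLOCK_SIZE would need to be at least as large as the blocks already written on the tape, and the property spelling follows the underscore form already used for BLOCK_SIZE above):

define changer robot {
    tpchanger chg-robot:/dev/scsi/changer/c1t5000E11156304003d1
    property tape-device 0=tape:/dev/rmt/0bn
    property use-slots 1-23
    device-property BLOCK_SIZE 2048k
    device-property READ_BLOCK_SIZE 4096k   # placeholder: >= largest block on the tape
}

or simply force the relabel as he suggests:

$ amlabel -f weekly weekly00 slot 1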
Tape Compression: Is it on or off
amanda 3.3.3 OmniOS 151006 Hi, This question has probably been answered elsewhere so maybe someone can redirect me to the short answer. I am still not sure if I'm using compression or not. My apologies in advance for the long post. I have an IBM-3580-LTO5 configured through the 'st' driver (and the 'sgen' driver for the tape robot). Simply, amtapetype reports that compression is on even though I'm using a device node that doesn't do compression: $ amtapetype -f -t ULT3580-TD5 weekly /dev/rmt/0bn Checking for FSF_AFTER_FILEMARK requirement device-property FSF_AFTER_FILEMARK false Applying heuristic check for compression. Wrote random (uncompressible) data at 73561559.3650794 bytes/sec Wrote fixed (compressible) data at 193099093.33 bytes/sec Compression: enabled Writing one file to fill the volume. Wrote 1515988615168 bytes at 73464 kb/sec Got LEOM indication, so drive and kernel together support LEOM Writing smaller files (15159885824 bytes) to determine filemark. define tapetype ULT3580-TD5 { comment Created by amtapetype; compression enabled length 1480457632 kbytes filemark 0 kbytes speed 73464 kps blocksize 32 kbytes } # for this drive and kernel, LEOM is supported; add # device-property LEOM TRUE # for this device. I had determined that device node /dev/rmt/0bn does NOT use compression but still amtapetype reports compression is being used. Can shed some more light on this? To determine which device node does/doesn't use compression I did the following: Reading the man page I have determined that, for my configuration, unless I use a device node that is specifically for compression, then compress will not be used. The 'st' driver uses a bit pattern for the tape device options. Mine is: Property Value options 0x1018619 ST_VARIABLE 0x0001 ST_BSF 0x0008 ST_BSR 0x0010 ST_KNOWS_EOD0x0200 ST_UNLOADABLE 0x0400 ST_NO_RECSIZE_LIMIT 0x8000 ST_MODE_SEL_COMP 0x1 ST_WORMABLE x100 AFAICT, compression is set on by ST_MODE_SEL_COMP. But you must understand the following riddle to really determine if compression will be used. From 'man st': ST_MODE_SEL_COMP If the ST_MODE_SEL_COMP flag is set, the driver deter- mines which of the two mode pages the device supports for selecting or deselecting compression. It first tries the Data Compression mode page (0x0F); if this fails, it tries the Device Configuration mode page (0x10). Some devices, however, may need a specific density code for selecting or deselecting compression. Please refer to the device specific SCSI manual. When the flag is set, compression is enabled only if the c or u device is used. Note that when the lower 2 densities of a drive are identically configured and the upper 2 densities are identically configured, but the lower and upper differ from each other and ST_MODE_SEL_COMP is set, the m node sets compression on for the lower density code (for example, 0x42) and the c and u nodes set compression on for the higher density (for example, 0x43). For any other device densities, compression is disabled. 
To make more sense of the above you need to know the configuration of the tape:

$ mt -f /dev/rmt/0cb config
IBM ULT3580-TD5, IBM ULT3580-TD5 , CFGIBMULT3580TD5;
CFGIBMULT3580TD5 = 2,0x3B,0,0x1018619,4,0x46,0x46,0x58,0x58,3,60,1500,600,16920,780,780,16380;

From that output and reading more of the man page, I see that I have four densities:

density 0 0x46
density 1 0x46
density 2 0x58
density 3 0x58

In other words, because the lower two densities are configured identically (density 0 0x46, density 1 0x46), the upper two densities are configured identically (density 2 0x58, density 3 0x58), they differ from each other (0x46 != 0x58), and ST_MODE_SEL_COMP is set (ST_MODE_SEL_COMP = 0x1), compression is selected by using the correct device node.

Lower density compression:
/dev/rmt/0m /dev/rmt/0mb /dev/rmt/0mbn /dev/rmt/0mn

Higher density compression:
/dev/rmt/0c /dev/rmt/0cb /dev/rmt/0cbn /dev/rmt/0cn
or
/dev/rmt/0u /dev/rmt/0ub /dev/rmt/0ubn /dev/rmt/0un

All other device nodes have no compression:
/dev/rmt/0 /dev/rmt/0b /dev/rmt/0bn /dev/rmt/0h /dev/rmt/0hb /dev/rmt/0hbn /dev/rmt/0hn /dev/rmt/0l /dev/rmt/0lb /dev/rmt/0lbn /dev/rmt/0ln /dev/rmt/0n

Hopefully I have understood that.

Regards,
Tom

--
Tom Robinson
IT Manager/System Administrator
MoTeC Pty Ltd
121 Merrindale Drive
Croydon South 3136
Victoria Australia
T: +61 3 9761 5050
F: +61 3 9761 5051
E: tom.robin...@motec.com.au
Re: amtapetype: Error setting FSF_AFTER_FILEMARK:
To answer my own question: using the BSD-behaviour device node (/dev/rmt/0bn) seems to work:

$ amtapetype -f -c -t ULT3580-TD5 weekly /dev/rmt/0bn
Checking for FSF_AFTER_FILEMARK requirement
device-property FSF_AFTER_FILEMARK false
Applying heuristic check for compression.
Wrote random (uncompressible) data at 74407533.1147541 bytes/sec
Wrote fixed (compressible) data at 206311796.363636 bytes/sec
Compression: enabled

On 14/10/13 11:07, Tom Robinson wrote:
amanda version 3.3.3
Hi, I'm running amanda on OmniOS version 151006 and have configured my IBM-TS3100 tape library (ULT3580-TD5 drive) using the 'st' and 'sgen' drivers. I can label tapes and get inventory successfully. All appears to work well except I can't successfully run amtapetype. I get the following error:
$ id
uid=33(amanda) gid=3(sys) groups=3(sys)
$ amtapetype -f -c -t ULT3580-TD5 weekly /dev/rmt/0
Checking for FSF_AFTER_FILEMARK requirement
amtapetype: Error setting FSF_AFTER_FILEMARK: Success at /opt/csw/sbin/amtapetype line 420.
$ amtapetype -f -t ULT3580-TD5 weekly /dev/rmt/0
Checking for FSF_AFTER_FILEMARK requirement
amtapetype: Error setting FSF_AFTER_FILEMARK: Success at /opt/csw/sbin/amtapetype line 420.
Can anyone please shed some light on what the above errors mean?
Kind regards, Tom
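For the actual backup runs (as opposed to the amtapetype probe), the same property can also be pinned in amanda.conf so it never has to be detected at run time. A sketch only; the device name is an example and the syntax is the 3.x "define device" form:

define device lto5_drive {
    tapedev "tape:/dev/rmt/0bn"
    device-property "FSF_AFTER_FILEMARK" "false"   # the value the working run detected above
}
tapedev "lto5_drive"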
Re: wrong link to 3.1.1 on download page
It is still broken. On Tue 2010-08-24 17:03, Paddy Sreenivasan wrote: Hi Tom, Thanks for reporting it. It has been fixed. Paddy On Tue, Aug 24, 2010 at 3:23 PM, Tom Schutter t.schut...@comcast.net wrote: On http://amanda.org/download.php, what should be a link to 3.1.1 is actually a link to 3.1.2. The tag and branch links are OK, it is just the Release link that is wrong. -- Tom Schutter t.schut...@comcast.net -- Tom Schutter t.schut...@comcast.net
wrong link to 3.1.1 on download page
On http://amanda.org/download.php, what should be a link to 3.1.1 is actually a link to 3.1.2. The tag and branch links are OK, it is just the Release link that is wrong. -- Tom Schutter t.schut...@comcast.net
3.1.2 Windows binary
Does anyone know when the 3.1.2 Windows binary will be available? It is not listed at http://www.zmanda.com/download-amanda.php -- Tom Schutter t.schut...@comcast.net
Re: very slow dumper (42.7KB/s)
Dustin J. Mitchell wrote:
On Mon, Aug 31, 2009 at 11:51 PM, Tom Robinson tom.robin...@motec.com.au wrote:
While the disk is reaching saturation (and recovering quickly) I'm thinking that all the retransmissions would be slowing things down more. I don't see any errors on the client interface but there are four on the server interface over the last four days.
Hmm, the causation may be going the other way -- if the disk is generating too many IRQs for the CPU to handle, then network packets might get dropped. Alternately, perhaps the PCI bus is maxed out? Anyway, this sounds like a problem local to the client. Is there a way to slow down the disk IO so that it doesn't wedge the machine?

Thanks Dustin,

I've found that our very old client (RH 7.1 Seawolf), running a very old kernel (2.4.20), has a bug in the IDE driver. I can't say categorically that this is the root cause of the dump issue I saw but, finally, I've got permission to move forward with a planned upgrade that I've been pushing for some time now. For those that are interested, I suspect this is the problem: https://bugzilla.redhat.com/show_bug.cgi?id=134579

Thanks for all the help

Regards, Tom

-- Tom Robinson System Administrator MoTeC 121 Merrindale Drive Croydon South 3136 Victoria Australia T: +61 3 9761 5050 F: +61 3 9761 5051 M: +61 4 3268 7026 E: tom.robin...@motec.com.au
Re: very slow dumper (42.7KB/s)
Tom Robinson wrote:
Hi, I'm running amanda (2.6.0p2-1) but have an older client running 2.4.2p2-1. On that client the full backup of a 4GB disk takes a very long time:

DUMP SUMMARY:
                               DUMPER STATS                 TAPER STATS
HOSTNAME  DISK  L  ORIG-KB   OUT-KB  COMP%  MMM:SS   KB/s  MMM:SS    KB/s
--------------------------------------------------------------------------
host      /     0  4256790  1819411   42.7  637:22   47.6   26:01  1165.9

I'm not sure where to start looking for this bottle-neck. Any clues would be appreciated.

bump

-- Tom Robinson System Administrator MoTeC 121 Merrindale Drive Croydon South 3136 Victoria Australia T: +61 3 9761 5050 F: +61 3 9761 5051 M: +61 4 3268 7026 E: tom.robin...@motec.com.au
Re: very slow dumper (42.7KB/s)
Frank Smith wrote:
Tom Robinson wrote:
Hi, I'm running amanda (2.6.0p2-1) but have an older client running 2.4.2p2-1. On that client the full backup of a 4GB disk takes a very long time:

DUMP SUMMARY:
                               DUMPER STATS                 TAPER STATS
HOSTNAME  DISK  L  ORIG-KB   OUT-KB  COMP%  MMM:SS   KB/s  MMM:SS    KB/s
--------------------------------------------------------------------------
host      /     0  4256790  1819411   42.7  637:22   47.6   26:01  1165.9

I'm not sure where to start looking for this bottle-neck. Any clues would be appreciated.
bump

Try looking on the client while the backup is running. Could be any of a lot of things. Network problems (check for errors on the NIC and the switch port), lack of CPU to run the compression, disk I/O contention, huge numbers of files (either in aggregate or in a single directory), or possibly even impending disk failure (lots of read retries or a degraded RAID). Looking at something like 'top' during the backup should give you an idea of whether your CPU is overloaded or if you are always waiting for disk, and if there is some other process(es) running that may also be trying to do a lot of disk I/O. Your system logs should show if you are seeing disk errors, and the output of ifconfig or similar will show the error counts on the NIC. If you don't see anything obvious at first, try running your dump program (dump or tar or whatever Amanda is configured to use) with the output directed to /dev/null and see how long that takes; if that is also slow then it is not the network or Amanda. Then try it without compression to see how much that speeds things up.

Hi,

Thanks for the feedback Frank. I am running dump. After re-nicing the sendbackups and dumpers to first -1 and then -3, the load average still hovers at zero:

load average: 0.00, 0.00, 0.00

Re-nicing again to 0, I looked at iostat -x and found the disk saturated (%util frequently reaches 100 but drops quickly).
The average queue size (avgqu-sz) and await are also astoundingly high:

avg-cpu:  %user  %nice  %sys  %idle
           0.00   0.00  0.00  100.00
Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s  avgrq-sz     avgqu-sz    await    svctm   %util
hda        8.67    0.00  0.00   1.33   74.67   10.67     64.00  14316554.32     0.00  7500.00  100.00

avg-cpu:  %user  %nice  %sys  %idle
           0.00   0.00  0.33  99.67
Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s  avgrq-sz     avgqu-sz    await    svctm   %util
hda        1.00    0.67  0.67  10.00   10.67   85.33      9.00  14316556.95   700.00   140.62   15.00

avg-cpu:  %user  %nice  %sys  %idle
           1.67   0.00  6.00  92.33
Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s  avgrq-sz     avgqu-sz    await    svctm   %util
hda       10.00    0.00  0.67   0.00   85.33    0.00    128.00         1.93  28150.00  6600.00   44.00

avg-cpu:  %user  %nice  %sys  %idle
           0.00   0.00  2.00  98.00
Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s  avgrq-sz     avgqu-sz    await    svctm   %util
hda       10.00    0.00  0.00   0.00   85.33    0.00      0.00         0.53     0.00     0.00    2.67

avg-cpu:  %user  %nice  %sys  %idle
           0.33   0.00  5.33  94.33
Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s  avgrq-sz     avgqu-sz    await    svctm   %util
hda       10.00    0.00  1.00   1.33   85.33   10.67     41.14         6.90  10128.57  4271.43   99.67

More concerning: monitoring the network with tshark on the server side, I see a lost segment followed by a flurry of Dup ACKs and TCP retransmissions every eight to ten seconds:

11.641313 10.0.225.2 -> 192.168.0.31 TCP 53096 > 11003 [ACK] Seq=558073 Ack=1 Win=5392 Len=1348 TSV=519917696 TSER=427020080
11.670185 10.0.225.2 -> 192.168.0.31 TCP [TCP Previous segment lost] 53096 > 11003 [ACK] Seq=594469 Ack=1 Win=5392 Len=1348 TSV=519917781 TSER=427020930
11.670211 192.168.0.31 -> 10.0.225.2 TCP 11003 > 53096 [ACK] Seq=1 Ack=559421 Win=501 Len=0 TSV=427020990 TSER=519917696 SLE=594469 SRE=595817
11.699896 10.0.225.2 -> 192.168.0.31 TCP 53096 > 11003 [ACK] Seq=595817 Ack=1 Win=5392 Len=1348 TSV=519917781 TSER=427020930
11.699916 192.168.0.31 -> 10.0.225.2 TCP [TCP Dup ACK 657#1] 11003 > 53096 [ACK] Seq=1 Ack=559421 Win=501 Len=0 TSV=427021020 TSER=519917696 SLE=594469 SRE=597165
11.730662 10.0.225.2 -> 192.168.0.31 TCP 53096 > 11003 [ACK] Seq=597165 Ack=1 Win=5392 Len=1348 TSV=519917787 TSER=427020990
11.730747 192.168.0.31 -> 10.0.225.2 TCP [TCP Dup ACK 657#2] 11003 > 53096 [ACK] Seq=1 Ack=559421 Win=501 Len=0 TSV=427021050 TSER=519917696 SLE=594469 SRE=598513
11.730716 10.0.225.2 -> 192.168.0.31 TCP 53096 > 11003 [ACK] Seq=598513 Ack=1 Win=5392 Len=1348 TSV=519917787 TSER=427020990
11.730761 192.168.0.31 -> 10.0.225.2 TCP [TCP Dup ACK 657#3] 11003 > 53096 [ACK] Seq=1 Ack=559421 Win=501 Len=0 TSV=427021050 TSER=519917696 SLE
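A couple of quick checks that go with the capture above, for anyone chasing a similar mix of disk saturation and retransmissions (interface names are examples; ethtool may not exist on very old distributions):

# run on both client and server, before and after a dump, and compare the deltas
ifconfig eth0                                # RX/TX errors, dropped and overruns counters
ethtool -S eth0 | grep -i -e err -e drop     # per-driver error counters, where supported
netstat -s | grep -i -e retrans -e loss      # kernel-wide TCP retransmission totals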
very slow dumper (42.7KB/s)
Hi,

I'm running amanda (2.6.0p2-1) but have an older client running 2.4.2p2-1. On that client the full backup of a 4GB disk takes a very long time:

DUMP SUMMARY:
                               DUMPER STATS                 TAPER STATS
HOSTNAME  DISK  L  ORIG-KB   OUT-KB  COMP%  MMM:SS   KB/s  MMM:SS    KB/s
--------------------------------------------------------------------------
host      /     0  4256790  1819411   42.7  637:22   47.6   26:01  1165.9

I'm not sure where to start looking for this bottle-neck. Any clues would be appreciated.

Thanks, Tom

-- Tom Robinson System Administrator MoTeC 121 Merrindale Drive Croydon South 3136 Victoria Australia T: +61 3 9761 5050 F: +61 3 9761 5051 M: +61 4 3268 7026 E: tom.robin...@motec.com.au
Re: ZWC Pre/Post scripting
The tool is ZWC. So I shouldn't use amanda then?

McGraw, Robert P wrote:
This reminds me of the old vaudeville act.
PATIENT: Doctor, it hurts when I do that.
DOCTOR: Then don't do that.
In your case:
USER to MICROSOFT: It hurts when I use backup tool
MICROSOFT to USER: Then don't use backup tool
Robert
_ Robert P. McGraw, Jr. Manager, Computer System EMAIL: rmcg...@purdue.edu Purdue University ROOM: MATH-807 Department of Mathematics PHONE: (765) 494-6055 150 N. University Street West Lafayette, IN 47907-2067

-Original Message- From: owner-amanda-us...@amanda.org [mailto:owner-amanda-us...@amanda.org] On Behalf Of Tom Robinson Sent: Thursday, May 14, 2009 10:22 PM To: amanda-users@amanda.org Subject: ZWC Pre/Post scripting

ZWC Version: 2.6.4p1
amanda server: 2.6.0p2

I'm using ZWC to back up a Windows server that runs MS SQL Server 2005. MS SQL Server 2005 runs its own built-in task to create database backup files and then ZWC picks those up for delivery to the amanda server running on a remote Linux machine. We've just discovered that (due to a Microsoft bug with the way the SQL Server Writer service interacts with a 'backup tool') the backups get corrupted and aren't recoverable. The Microsoft recommended workaround is to stop the SQL Server Writer service before running the 'backup tool' (see here for more details: http://support.microsoft.com/kb/937683/en-us)

Does ZWC have pre/post scripting capabilities to do this?

Regards, Tom

-- Tom Robinson System Administrator MoTeC 121 Merrindale Drive Croydon South 3136 Victoria Australia T: +61 3 9761 5050 F: +61 3 9761 5051 M: +61 4 3268 7026 E: tom.robin...@motec.com.au
ZWC Pre/Post scripting
ZWC Version: 2.6.4p1
amanda server: 2.6.0p2

I'm using ZWC to back up a Windows server that runs MS SQL Server 2005. MS SQL Server 2005 runs its own built-in task to create database backup files and then ZWC picks those up for delivery to the amanda server running on a remote Linux machine. We've just discovered that (due to a Microsoft bug with the way the SQL Server Writer service interacts with a 'backup tool') the backups get corrupted and aren't recoverable. The Microsoft recommended workaround is to stop the SQL Server Writer service before running the 'backup tool' (see here for more details: http://support.microsoft.com/kb/937683/en-us)

Does ZWC have pre/post scripting capabilities to do this?

Regards, Tom

-- Tom Robinson System Administrator MoTeC 121 Merrindale Drive Croydon South 3136 Victoria Australia T: +61 3 9761 5050 F: +61 3 9761 5051 M: +61 4 3268 7026 E: tom.robin...@motec.com.au
lvm snapshots for backup
Hi, I'm using amanda-2.6.0p2 and want to use lvm snapshots to backup some changing filesystems. Are there hook scripts for this or is there some plugin to use? I've seen a thing called MySQL ZRM which looks like it might work. Is that ok to use for volumes other than MySQL databases? Is it overkill? Thanks, Tom -- Tom Robinson System Administrator MoTeC 121 Merrindale Drive Croydon South 3136 Victoria Australia T: +61 3 9761 5050 F: +61 3 9761 5051 M: +61 4 3268 7026 E: tom.robin...@motec.com.au
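MySQL ZRM is specific to MySQL, so for general filesystems the usual do-it-yourself approach with 2.6.0 is a small wrapper that creates the snapshot before amdump touches the data and tears it down afterwards (later Amanda releases grew proper pre/post script hooks). The sketch below is generic LVM2 usage, not an Amanda API; volume group, LV, snapshot size and mount point are all example names:

#!/bin/sh
# snapshot the changing filesystem and expose it at a fixed mount point;
# point the DLE at /mnt/amanda-snap instead of the live filesystem
VG=vg00
LV=data
SNAP=${LV}_amsnap

lvcreate --size 2G --snapshot --name "$SNAP" "/dev/$VG/$LV"
mkdir -p /mnt/amanda-snap
mount -o ro "/dev/$VG/$SNAP" /mnt/amanda-snap

# ... amdump backs up /mnt/amanda-snap here ...

umount /mnt/amanda-snap
lvremove -f "/dev/$VG/$SNAP"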
Re: Weird compression results for DLE using 'compress NONE' (nocomp-root)
Jean-Louis Martineau wrote: Tom Robinson wrote: DUMPER STATS TAPER STATS HOSTNAME DISK L ORIG-KB OUT-KB COMP% MMM:SS KB/s MMM:SS KB/s -- --- host /disk 1 20316904063380 200.0 36:34 1852.3 6:27 10487.2 Note the ORIG-KB blows out to twice the size! COMP% is 200.0... ORIG-KB is the size reported by the dump program, it report a number of blocks, but all dump doesn't have the same blocksize. Can you post the sendsize.*.debug and sendbackup.*.debug files from the client and answer the following questions? amanda version on the client? Os of the client? Filesystem? dump program? dump version? Client: amanda: 2.4.4p3 OS: CentOS 4.7 filesystem: ext3 dump prog-ver: dump-0.4b39 (2 disks) dump prog-ver: tar-1.14(1 disk) For ease of debugging full names are restored: DUMPER STATS TAPER STATS HOSTNAME DISK L ORIG-KB OUT-KB COMP% MMM:SS KB/s MMM:SS KB/s -- --- rook /dev/mapper/vg-root 1 20316904063380 200.0 36:34 1852.3 6:27 10487.2 rook /dev/sda1 0 2010 4020 200.00:06 634.9 0:01 3247.0 rook /etc 0 11200 2900 25.90:21 137.6 0:01 4732.5 DLE: rook /etc comp-root-tar rook /dev/sda1 nocomp-root rook /dev/mapper/vg-rootnocomp-root - sendsize: debug 1 pid 2434 ruid 33 euid 33: start at Wed Jan 21 21:45:28 2009 sendsize: version 2.4.4p3 sendsize[2436]: time 0.101: calculating for amname '/etc', dirname '/etc', spindle -1 sendsize[2436]: time 0.101: getting size via gnutar for /etc level 0 sendsize[2434]: time 0.101: waiting for any estimate child: 1 running sendsize[2436]: time 0.119: spawning /usr/lib/amanda/runtar in pipeline sendsize[2436]: argument list: /bin/tar --create --file /dev/null --directory /etc --one-file-system --listed-incremental /var/lib/amanda/gnutar-lists/rook_etc_0.new --sparse --ignore-failed-read --totals . sendsize[2436]: time 5.762: Total bytes written: 11468800 (11MiB, 2.2MiB/s) sendsize[2436]: time 5.762: . sendsize[2436]: estimate time for /etc level 0: 5.643 sendsize[2436]: estimate size for /etc level 0: 11200 KB sendsize[2436]: time 5.762: waiting for /bin/tar /etc child sendsize[2436]: time 5.763: after /bin/tar /etc wait sendsize[2436]: time 5.763: getting size via gnutar for /etc level 1 sendsize[2436]: time 5.803: spawning /usr/lib/amanda/runtar in pipeline sendsize[2436]: argument list: /bin/tar --create --file /dev/null --directory /etc --one-file-system --listed-incremental /var/lib/amanda/gnutar-lists/rook_etc_1.new --sparse --ignore-failed-read --totals . sendsize[2436]: time 5.828: Total bytes written: 296960 (290KiB, ?/s) sendsize[2436]: time 5.829: . sendsize[2436]: estimate time for /etc level 1: 0.025 sendsize[2436]: estimate size for /etc level 1: 290 KB sendsize[2436]: time 5.829: waiting for /bin/tar /etc child sendsize[2436]: time 5.829: after /bin/tar /etc wait sendsize[2436]: time 5.829: done with amname '/etc', dirname '/etc', spindle -1 sendsize[2434]: time 5.830: child 2436 terminated normally sendsize[2440]: time 5.830: calculating for amname '/dev/sda1', dirname '/boot', spindle -1 sendsize[2440]: time 5.830: getting size via dump for /dev/sda1 level 0 sendsize[2440]: time 5.831: calculating for device '/dev/sda1' with 'ext3' sendsize[2440]: time 5.831: running /sbin/dump 0Ssf 1048576 - /dev/sda1 sendsize[2440]: time 5.832: running /usr/lib/amanda/killpgrp sendsize[2434]: time 5.832: waiting for any estimate child: 1 running sendsize[2440]: time 7.831: 4077568 sendsize[2440]: time 7.832: . 
sendsize[2440]: estimate time for /dev/sda1 level 0: 2.001 sendsize[2440]: estimate size for /dev/sda1 level 0: 3982 KB sendsize[2440]: time 7.832: asking killpgrp to terminate sendsize[2440]: time 8.835: getting size via dump for /dev/sda1 level 1 sendsize[2440]: time 8.836: calculating for device '/dev/sda1' with 'ext3' sendsize[2440]: time 8.836: running /sbin/dump 1Ssf 1048576 - /dev/sda1 sendsize[2440]: time 8.839: running /usr/lib/amanda/killpgrp sendsize[2440]: time 8.888: 21504 sendsize[2440]: time 8.889: . sendsize[2440]: estimate time for /dev/sda1 level 1: 0.053 sendsize[2440]: estimate size for /dev/sda1 level 1: 21 KB sendsize[2440]: time 8.889: asking killpgrp to terminate sendsize[2440]: time 9.896: done with amname '/dev/sda1', dirname '/boot', spindle -1 sendsize[2434]: time 9.896: child 2440 terminated normally sendsize[2445]: time 9.897: calculating for amname '/dev/mapper/vg-root', dirname '/', spindle -1
Weird compression results for DLE using 'compress NONE' (nocomp-root)
Hi

I've got several disks that are showing weird compression results in the amanda report. Here's one of them:

                                DUMPER STATS                  TAPER STATS
HOSTNAME  DISK   L  ORIG-KB   OUT-KB  COMP%  MMM:SS    KB/s  MMM:SS     KB/s
------------------------------------------------------------------------------
host      /disk  1  2031690  4063380  200.0   36:34  1852.3    6:27  10487.2

Note the output blows out to twice the ORIG-KB size! COMP% is 200.0... This happens on more than one disk actually. I chose this disk as it's the biggest disk that I dump, it shows the biggest blowout and I noticed it first. This disk uses 'compress NONE' (dumptype is nocomp-root). Some of the other disks showing compression weirdness are using 'compress client fast' in their DLEs. Just so you know, I'm using file tapes for this backup. The size on disk is:

-rw--- 1 amanda disk 4160933888 Jan 21 22:53 00018.host._disk.1

The server is amanda 2.6.0p2 and the client is amanda 2.4.4p3-1. There are a few more details below that may help. I'm not sure what's happening but do appreciate any help regarding the strange results.

Thanks in advance, Tom

amadmin conf disklist shows:

host host:
    interface default
    disk /disk:
        program DUMP
        priority 0
        dumpcycle 7
        maxdumps 1
        maxpromoteday 1
        bumppercent 20
        bumpdays 1
        bumpmult 4.00
        strategy STANDARD
        ignore NO
        estimate CLIENT
        compress NONE
        encrypt NONE
        auth BSD
        kencrypt NO
        amandad_path X
        client_username X
        ssh_keys X
        holdingdisk AUTO
        record YES
        index YES
        starttime
        fallback_splitsize 10Mb
        skip-incr NO
        skip-full NO
        spindle -1

the log entries for that disk are:

SUCCESS dumper host /disk 20090121214501 1 [sec 2193.730 kb 4063380 kps 1852.3 orig-kb 2031690]
SUCCESS chunker host /disk 20090121214501 1 [sec 2193.762 kb 4063380 kps 1852.3]
STATS driver estimate host /disk 20090121214501 1 [sec 2090 nkb 4064493 ckb 4064512 kps 1945]
PART taper conf 18 host /disk 20090121214501 1/1 1 [sec 387.462180 kb 4063380 kps 10487.165483]
DONE taper host /disk 20090121214501 1 1 [sec 387.462180 kb 4063380 kps 10487.165483]

-- Tom Robinson System Administrator MoTeC 121 Merrindale Drive Croydon South 3136 Victoria Australia T: +61 3 9761 5050 F: +61 3 9761 5051 M: +61 4 3268 7026 E: tom.robin...@motec.com.au
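For what it's worth, the 200% figure is just the ratio of the two columns: COMP% = 100 * OUT-KB / ORIG-KB = 100 * 4063380 / 2031690 = 200.0, i.e. OUT-KB is exactly twice ORIG-KB (and OUT-KB matches the ~4 GB file actually written to the vtape). An exact factor of two like that would be consistent with the dump program's block count being interpreted in the wrong unit somewhere (for example 2 KB blocks counted as 1 KB), rather than with anything genuinely growing on tape.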
Re: Weird compression results for DLE using 'compress NONE' (nocomp-root)
John Hein wrote:
John Hein wrote at 21:38 -0700 on Jan 21, 2009:
Tom Robinson wrote at 12:30 +1100 on Jan 22, 2009:
I've got several disks that are showing weird compression results in the amanda report. Here's one of them:

                                DUMPER STATS                  TAPER STATS
HOSTNAME  DISK   L  ORIG-KB   OUT-KB  COMP%  MMM:SS    KB/s  MMM:SS     KB/s
------------------------------------------------------------------------------
host      /disk  1  2031690  4063380  200.0   36:34  1852.3    6:27  10487.2

Note the output blows out to twice the ORIG-KB size! COMP% is 200.0... This happens on more than one disk actually. I chose this disk as it's the biggest disk that I dump, it shows the biggest blowout and I noticed it first. This disk uses 'compress NONE' (dumptype is nocomp-root). Some of the other disks showing compression weirdness are using 'compress client fast' in their DLEs.

Smells like a factor of two error somewhere (512 byte blocks vs. 1024?). What does 'env -i du -ks /disk' say?

Never mind that last request... your report above shows a level 1, not 0. So du output won't be a useful comparison to the numbers above. Does it behave the same (x2) for level 0 dumps, too?

The last full run looks like this:

host      /disk  0  75564750  108727227  143.9  384:15  4716.0   98:47  18345.1

The thing is, I WAS using compression back then and thought that maybe it was causing the blowout so turned it off. (I'm yet to run a full dump on 'compress NONE'). It's not quite x2 but it's larger than ORIG-KB.

t.

-- Tom Robinson System Administrator MoTeC 121 Merrindale Drive Croydon South 3136 Victoria Australia T: +61 3 9761 5050 F: +61 3 9761 5051 M: +61 4 3268 7026 E: tom.robin...@motec.com.au
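A quick way to see whose number is off, once a level 0 exists to compare against (device path taken from the thread; run on the client):

/sbin/dump -0 -S /dev/mapper/vg-root   # dump's own level-0 size estimate, in bytes
env -i du -ksx /                       # what is actually in the filesystem, in 1 KB units
# comparing the two (dump reports bytes, du reports KB) shows whether dump itself
# already under-reports by 2x or whether the halving happens later in the
# sendbackup/report chain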
Tape Open, Input/Output Error
I'm still pretty new to Linux and Amanda. I'm receiving the following error during amcheck: Amanda Tape Server Host Check - WARNING: holding disk /opt/amanda: only 15564476 KB free, using nothing ERROR: /dev/nst0: tape_rdlabel: tape open: /dev/nst0: Input/output error (expecting tape Daily-002 or a new tape) NOTE: skipping tape-writable test Server check took 0.648 seconds Amanda Backup Client Hosts Check Client check: 3 hosts checked in 16.401 seconds, 0 problems found (brought to you by Amanda 2.4.4p1) Recommendations on best way to proceed? Thanx in advance. -- View this message in context: http://www.nabble.com/Tape-Open%2C-Input-Output-Error-tp16197933p16197933.html Sent from the Amanda - Users mailing list archive at Nabble.com.
Re: Tape Open, Input/Output Error
I'm still pretty new to Linux and Amanda. I'm receiving the following error during amcheck: Amanda Tape Server Host Check - WARNING: holding disk /opt/amanda: only 15564476 KB free, using nothing ERROR: /dev/nst0: tape_rdlabel: tape open: /dev/nst0: Input/output error (expecting tape Daily-002 or a new tape) NOTE: skipping tape-writable test Server check took 0.648 seconds Amanda Backup Client Hosts Check Client check: 3 hosts checked in 16.401 seconds, 0 problems found (brought to you by Amanda 2.4.4p1) Recommendations on best way to proceed? Thanx in advance. Jon Labadie wrote: I'd start by mimicing amanda's actions manually to narrow where to look. Can you manually read the tape label as the amanda user id? Something like this: $ su - amanda_user ... $ dd if=/dev/nst0 bs=32k count=1 The label is text and you should see it on-screen. I may have fixed the issue by physical means. I ejected the tape, turned the drive off and on, used a cleaning tape, then reinserted the tape. amcheck now reports all is well, but I'm, well, a little concerned. When I run dd if=/dev/nst0 bs=32k count=1 the response is: 0+0 records in 0+0 records out When I run amflush Daily the response is: Scanning /opt/amanda... Today is: 20080321 Flushing dumps in to tape drive /dev/nst0. Expecting tape Daily-002 or a new tape. (The last dumps were to tape Daily-001) Are you sure you want to do this [yN]? N Ok, quitting. Run amflush again when you are ready. As you can see, there is no text label using the dd command, and I would think there ought to be something to flush, since the amreport was screaming: *** A TAPE ERROR OCCURRED: [tape_rdlabel: tape open: /dev/nst0: Input/output error]. Some dumps may have been left in the holding disk. Run amflush to flush them to tape. The next tape Amanda expects to use is: Daily-002. FAILURE AND STRANGE DUMP SUMMARY: madison.ch //accounting/Apps lev 1 FAILED [no more holding disk space] madison.ch /opt lev 2 FAILED [can't dump no-hold disk in degraded mode] madison.ch /boot lev 1 FAILED [no more holding disk space] nostep.chi /boot lev 1 FAILED [no more holding disk space] tyler.chi. /var lev 1 FAILED [no more holding disk space] tyler.chi. /export lev 1 FAILED [no more holding disk space] tyler.chi. /opt lev 1 FAILED [no more holding disk space] madison.ch /var lev 1 FAILED [no more holding disk space] madison.ch /export/home lev 1 FAILED [no more holding disk space] madison.ch //accounting/Finance lev 1 FAILED [no more holding disk space] tyler.chi. / lev 1 FAILED [no more holding disk space] tyler.chi. /boot lev 1 FAILED [no more holding disk space] madison.ch / lev 1 FAILED [no more holding disk space] nostep.chi / lev 1 FAILED [no more holding disk space] It doesn't look like I need to run amflush, or should I anyway? Is all of this an indication that there may be something wrong with either the tape or drive? Or should I just skip over my twitch of paranoia and move on? Thanx.
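One small follow-up on the dd test: reading 0+0 records is what you would see if the tape is sitting at a filemark or at end-of-data rather than at the beginning, so it is worth rewinding explicitly before trying to read the label. A sketch of the by-hand check, using the device and label names from this thread:

mt -f /dev/nst0 rewind
mt -f /dev/nst0 status            # should show ONLINE with no error bits set
dd if=/dev/nst0 bs=32k count=1    # first block of file 0 is the text label, e.g.
                                  #   "AMANDA: TAPESTART DATE ... TAPE Daily-001"
mt -f /dev/nst0 rewind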
Re: amreport analysis
Just a quick note to say thanx for the suggestions, and that I'm working on them. I have a question which I think is more Linux than Amanda, but it's affecting Amanda. On one of the servers I receive the amreport via email just fine, but on the other I have to manually request it to run. I checked the crontab stuff and it seems to be in there okay. Anyone have a suggestion on where I should look to clear this up? -- View this message in context: http://www.nabble.com/amreport-analysis-tp14338749p14372564.html Sent from the Amanda - Users mailing list archive at Nabble.com.
amreport analysis
Quick thanks to Jon, Gene, Chris, and John for the assist with my amflush question. Am now looking for folks' suggestions on what I'm seeing in my amreport. I've copied it below. Since I'm still very new to both Linux and Amanda I'm wondering which lines I need to be concerned about and which ones I can ignore. Any assistance would be appreciated.

These dumps were to tape Daily-004. The next tape Amanda expects to use is: Daily-005.

FAILURE AND STRANGE DUMP SUMMARY:
brain.milw / lev 0 FAILED [/bin/gtar returned 2]

STATISTICS:
                                 Total      Full     Daily
Estimate Time (hrs:min)           0:06
Run Time (hrs:min)                3:45
Dump Time (hrs:min)               1:30      1:28      0:02
Output Size (meg)              10241.2   10241.0       0.2
Original Size (meg)            16661.7   16658.7       3.0
Avg Compressed Size (%)           44.4      44.5       8.4
                                  (level:#disks ...)
Filesystems Dumped                  10         7         3   (1:2 2:1)
Avg Dump Rate (k/s)             1943.5    1980.2       2.6
Tape Time (hrs:min)               2:21      2:21      0:00
Tape Size (meg)                10241.6   10241.2       0.3
Tape Used (%)                     51.3      51.3       0.0
                                  (level:#disks ...)
Filesystems Taped                   10         7         3   (1:2 2:1)
Avg Tp Write Rate (k/s)         1235.6    1239.4      13.3

FAILED AND STRANGE DUMP DETAILS:

/-- brain.milw / lev 0 FAILED [/bin/gtar returned 2]
sendbackup: start [brain.milw.nvisia.com:/ level 0]
sendbackup: info BACKUP=/bin/gtar
sendbackup: info RECOVER_CMD=/bin/gtar -f... -
sendbackup: info end
? gtar: ./mnt/cdrom: Cannot savedir: Input/output error
? gtar: ./mnt/cdrom: Warning: Cannot savedir: Input/output error
| gtar: ./lib/dev-state/gpmctl: socket ignored
| gtar: ./lib/dev-state/log: socket ignored
| gtar: ./opt/IBMHTTPServer-V1.x/logs/siddport: socket ignored
| gtar: ./opt/IBMHttpServer/conf/socket.26158: socket ignored
| gtar: ./opt/IBMHttpServer/conf/socket.26307: socket ignored
| gtar: ./opt/IBMHttpServer/conf/socket.26386: socket ignored
| gtar: ./opt/IBMHttpServer/conf/socket.27585: socket ignored
| gtar: ./opt/save-IBMHTTPServer/logs/siddport: socket ignored
| Total bytes written: 3503226880 (3.3GiB, 1.7MiB/s)
? gtar: Error exit delayed from previous errors
sendbackup: error [/bin/gtar returned 2]
\

NOTES:
planner: Preventing bump of brain.milw.nvisia.com:/ as directed.
planner: Last full dump of brain.milw.nvisia.com:/mnt/archive_vol/home on tape Daily-004 overwritten on this run.
planner: Dump too big for tape: full dump of brain.milw.nvisia.com:/mnt/archive_vol/home delayed.
planner: Full dump of was.demo.nvisia.com:/ promoted from 3 days ahead.
planner: Full dump of was.demo.nvisia.com:/opt promoted from 3 days ahead.
planner: Full dump of brain.milw.nvisia.com:/mnt/archive_vol/utility promoted from 3 days ahead.
planner: Full dump of brain.milw.nvisia.com:/mnt/archive_vol/cvs_repository promoted from 3 days ahead.
planner: Full dump of brain.milw.nvisia.com:/boot promoted from 3 days ahead.
taper: tape Daily-004 kb 10487360 fm 10 [OK]

DUMP SUMMARY:
                                      DUMPER STATS               TAPER STATS
HOSTNAME      DISK         L  ORIG-KB   OUT-KB  COMP%  MMM:SS    KB/s  MMM:SS    KB/s
---------------------------------------------------------------------------------------
brain.milw.n  /            0  FAILED ---------------------------------------
brain.milw.n  /boot        0     6592     6624     --    0:04  1537.3    0:05  1318.2
brain.milw.n  -xport/home  0  5220192  5220224     --   47:26  1834.2   47:27  1833.9
brain.milw.n  -repository  0    38270    30784   80.4    1:05   470.5    0:24  1267.5
brain.milw.n  -repository  1      450       64   14.2    0:39     0.8    0:02    39.9
brain.milw.n  -e_vol/home  2     2110      224   10.6    0:38     5.0    0:02   136.5
brain.milw.n  -ol/utility  0   124150    39328   31.7    1:18   505.0    0:30  1298.6
brain.milw.n  /usr2        1      470       64   13.6    0:22     1.4    0:23     2.8
was.demo.nvi  /            0  1991870   690112   34.6    4:09  2775.1   34:37   332.2
was.demo.nvi  /home        0  7490830  3354208   44.8   28:03  1992.7   43:09  1295.4
was.demo.nvi  /opt         0  2186640  1145728   52.4    6:10  3093.2   14:49  1289.2

(brought to you by Amanda version 2.4.3)

-- View this message in context: http://www.nabble.com/amreport-analysis-tp14338749p14338749.html Sent from the Amanda - Users mailing list archive at Nabble.com.
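On the one real failure in that report: the level 0 of / dies because gtar gets an I/O error reading ./mnt/cdrom (a stale or faulty CD mount), and a non-zero gtar exit marks the whole DLE FAILED. Besides unmounting the dead CD, the mount point can be excluded in the dumptype. This is only a sketch - "root-tar" is an assumed parent dumptype name, and the line shown is the plain amanda.conf exclude form:

define dumptype root-tar-nocdrom {
    root-tar
    exclude "./mnt/cdrom"    # GNUTAR patterns are relative to the top of the DLE
}

Then point the disklist entry for brain.milw.nvisia.com:/ at root-tar-nocdrom. The "socket ignored" lines, by contrast, are harmless notes.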
Running amflush to a Separate Tape
I am new to both Linux and Amanda, so please be gentle. I've recently come over from the dark side of Windoze. I've inherited a small net from a previous sysop. He was using Amanda on two servers, and I would like to continue use, but I'm finding it a bit challenging. I've gotten as far as determining there are a large number of Amanda directories left on the holding disk. I would like to flush those to an entirely separate tape, outside of my Daily config, but am unsure how to do this. I would like to flush those directories today if possible, so that I can move on to resolving errors on the amreport I receive (and which I will ask about in a separate email). Any help would be greatly appreciated. Tom -- View this message in context: http://www.nabble.com/Running-amflush-to-a-Separate-Tape-tp14318108p14318108.html Sent from the Amanda - Users mailing list archive at Nabble.com.
Re: Running amflush to a Separate Tape
Gene Heskett wrote:
Generally speaking, Tom, anything that's left in the holding disk, probably due to a tape error, is still part of the database amanda keeps, so they should be flushed to the next available tape in the rotation. Otherwise you are misplacing valuable info that amanda may need in the future. The real question is why they were not properly flushed originally.

Gene and Jon,

Thanks for the quick replies. I am not at all sure why there are so many directories to be flushed - they range in age from March thru last week, and all but the last apply to a time before I became sysop here. So I am not all that concerned about the ancient ones, I'd just like to flush the last one (recent) and move ahead.

So it would be okay for me to tar/gzip all the other directories? If I remove them Amanda won't get mad at me? Or do I somehow need to tell Amanda, hey, those aren't so important anymore?

Tom

-- View this message in context: http://www.nabble.com/Running-amflush-to-a-Separate-Tape-tp14318108p14320406.html Sent from the Amanda - Users mailing list archive at Nabble.com.
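If those ancient directories really are abandoned, the cleanup itself is unexciting: holding-disk dumps live in one directory per amdump run, named by date, under the holding disk path. A sketch only - the path and dates are examples - and archiving before deleting, as asked about above, costs little:

cd /opt/amanda                 # whatever "holdingdisk" points at in amanda.conf
ls -d 200*                     # one YYYYMMDD* directory per run that never got flushed
tar czf /var/tmp/holding-20070305.tar.gz 20070305 && rm -rf 20070305
# the recent run should be written to tape with amflush rather than removed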
Re: Running amflush to a Separate Tape
Chris Hoogendyk wrote: One other point to check. Since Tom is coming new into an existing setup, and didn't specify in his original post, it's possible those are from another amanda configuration that is not being run. The names of the files should indicate when and what they are from. I mean, if the daily is running and writing a tape, and there are still things being left on the holding disk, then Tom needs to figure out why they are still there. First step is to know for sure where they came from. Then, if they are from the daily that is currently being run, why were they left on the holding disk? I checked, and all the directories on the holding disk are associated with the one and only amanda config file on the system, named Daily. Since I was not here for the bulk of the directories listed I am not sure why they were never flushed. Still curious if it is okay, amanda-wise, to just remove the ancient directories, or if I have to do something more than that. -- View this message in context: http://www.nabble.com/Running-amflush-to-a-Separate-Tape-tp14318108p14323017.html Sent from the Amanda - Users mailing list archive at Nabble.com.
Re: Multi-tape span failure
Ian Turner wrote: Tom, What is runtapes set to? --Ian The runtapes parameter is set to 25. -Tom On Wednesday 31 October 2007 00:31:53 Tom Hansen wrote: BACKGROUND INFO: I have Amanda 2.5.2p1 running on Ubuntu linux 6.10, configured to backup several large (300Gb +) filesystems spanning several tapes. I have a robot changer, LTO1 tapes (100Gb capacity) and I used: tape_splitsize 3Gb fallback_splitsize 256m (An unrelated issue: I couldn't seem to be able to get split_diskbuffer to have any effect so the chunks were all 256mb. No big deal, it was not a bottleneck.) After much time configuring, everything seems to be working properly, and on my first big run, it successfully spanned six tapes and was nearly finished. Then it grabbed tape 7, which I had inadvertently left in write protect mode. Unfortunately, at this point Amanda completely aborted the entire 800+ Gb backup and left nothing in the index, thus completely wasting 7+ hours of backup time. This behavior is unexpected and bad. What if a tape simply goes bad during a run? If I'm running 7 or 8 tapes each backup, I don't want to lose the whole thing if there's an error on the last tape! I _thought_ that Amanda was programmed to simply go to the next tape when a tape error occurs. In this case, if Amanda _had_ gone to the next tape, it could have completed the job, since tape 8 was a good tape. MY QUESTION: Is there any way to configure Amanda such that such a tape error would simply go to the next tape, instead of the worst possible action, which is to abort the whole job? Short of that, is there any way Amanda could start up from where it left off? Thanks. -- Tom Hansen Senior Information Processing Consultant Great Lakes WATER Institute tomh -at- uwm.edu www.glwi.uwm.edu -- Tom Hansen Senior Information Processing Consultant UWM Great Lakes WATER Institute www.glwi.uwm.edu [EMAIL PROTECTED]
Re: Multi-tape span failure
Jon LaBadie wrote:
On Tue, Oct 30, 2007 at 11:31:53PM -0500, Tom Hansen wrote:
BACKGROUND INFO: I have Amanda 2.5.2p1 running on Ubuntu linux 6.10, configured to backup several large (300Gb +) filesystems spanning several tapes. I have a robot changer, LTO1 tapes (100Gb capacity) and I used: tape_splitsize 3Gb fallback_splitsize 256m
[ stuff deleted ]
MY QUESTION: Is there any way to configure Amanda such that such a tape error would simply go to the next tape, instead of the worst possible action, which is to abort the whole job? Short of that, is there any way Amanda could start up from where it left off?

Short answer - no. If the backups are in a holding disk they can still be flushed to tapes, but resume a backup, no.

Something in your report is amiss. If amanda had successfully used 6 tapes, it would have completed backing up and taping one or more of your 300GB DLEs. There is no reason a failed tape after that would invalidate those backups. And your report (emailed or available with amreport) would show that.

Following is the report. It clearly says FAILED for all 4 filesystems under FAILURE AND STRANGE DUMP SUMMARY and sure enough, I could not see any files using amrecover. (I have done a test using one small filesystem, and amrecover did work in that case, so I'm pretty confident that my setup is good.) I did just notice that, at the very bottom, it does not indicate failure for the two filesystems that were complete. I'm not sure what to make of that. Thanks for your comments. (Oh and BTW, I was totally wrong about the dump time, it was more like 20 hours) -Tom

Hostname: waterbase
Org     : GLWI
Config  : fullback
Date    : October 29, 2007

These dumps were to tapes GLWIBACK-001, GLWIBACK-002, GLWIBACK-003, GLWIBACK-004, GLWIBACK-005, GLWIBACK-006.
*** A TAPE ERROR OCCURRED: [No more writable valid tape found].
Some dumps may have been left in the holding disk. Run amflush to flush them to tape.
The next 9 tapes Amanda expects to use are: 9 new tapes.

FAILURE AND STRANGE DUMP SUMMARY:
waterbase.uwm.edu /media/raid2 lev 0 FAILED [out of tape]
waterbase.uwm.edu /media/raid2 lev 0 FAILED [data write: Broken pipe]
waterbase.uwm.edu /            lev 0 FAILED [can't switch to incremental dump]
waterbase.uwm.edu /media/raid2 lev 0 FAILED [dump to tape failed]

STATISTICS:
                                 Total       Full     Incr.
Estimate Time (hrs:min)           1:00
Run Time (hrs:min)               20:06
Dump Time (hrs:min)              16:25      16:25      0:00
Output Size (meg)             690435.5   690435.5       0.0
Original Size (meg)           690351.3   690351.3       0.0
Avg Compressed Size (%)            --         --         --
Filesystems Dumped                   2          2         0
Avg Dump Rate (k/s)            11966.4    11966.4        --
Tape Time (hrs:min)              16:14      16:14      0:00
Tape Size (meg)               690435.5   690435.5       0.0
Tape Used (%)                    665.3      665.3       0.0
Filesystems Taped                    2          2         0
Chunks Taped                      3121       3121         0
Avg Tp Write Rate (k/s)        12093.4    12093.4        --

USAGE BY TAPE:
Label          Time        Size      %   Nb   Nc
GLWIBACK-001   3:01  130531776K  122.8    0  498
GLWIBACK-002   3:10  135774016K  127.7    0  518
GLWIBACK-003   3:01  123874432K  116.5    1  473
GLWIBACK-004   3:05  143113152K  134.6    0  546
GLWIBACK-005   2:56  124765312K  117.4    0  476
GLWIBACK-006   3:38  159734400K  150.3    1  610

FAILED AND STRANGE DUMP DETAILS:

/-- waterbase.uwm.edu /media/raid2 lev 0 FAILED [data write: Broken pipe]
sendbackup: start [waterbase.uwm.edu:/media/raid2 level 0]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/bin/tar -xpGf - ...
sendbackup: info end
| gtar: ./mysql_trans/mysql.sock: socket ignored
\

NOTES:
planner: Adding new disk waterbase.uwm.edu:/.
planner: Adding new disk waterbase.uwm.edu:/media/raid0.
planner: Adding new disk waterbase.uwm.edu:/media/raid1.
planner: Adding new disk waterbase.uwm.edu:/media/raid2.
taper: mmap failed (Cannot allocate memory): using fallback split size of 262144kb to buffer waterbase.uwm.edu:/media/raid1.0 in-memory
taper: tape GLWIBACK-001 kb 130547712 fm 499 writing file: short write
taper: continuing waterbase.uwm.edu:/media/raid1.0 on new tape from 130547712kb mark: [writing file: short write]
taper: tape GLWIBACK-002 kb 135895488 fm 519 writing file: short write
taper: continuing waterbase.uwm.edu:/media/raid1.0 on new tape from 266338304kb mark: [writing file: short write]
taper: mmap failed (Cannot allocate memory): using fallback split size of 262144kb to buffer waterbase.uwm.edu:/media/raid0.0 in-memory
taper: tape GLWIBACK-003 kb 124064672 fm 474 writing file: short write
taper: continuing waterbase.uwm.edu:/media/raid0.0 on new tape from 21233664kb
Multi-tape span failure
BACKGROUND INFO: I have Amanda 2.5.2p1 running on Ubuntu linux 6.10, configured to backup several large (300Gb +) filesystems spanning several tapes. I have a robot changer, LTO1 tapes (100Gb capacity) and I used: tape_splitsize 3Gb fallback_splitsize 256m (An unrelated issue: I couldn't seem to be able to get split_diskbuffer to have any effect so the chunks were all 256mb. No big deal, it was not a bottleneck.) After much time configuring, everything seems to be working properly, and on my first big run, it successfully spanned six tapes and was nearly finished. Then it grabbed tape 7, which I had inadvertently left in write protect mode. Unfortunately, at this point Amanda completely aborted the entire 800+ Gb backup and left nothing in the index, thus completely wasting 7+ hours of backup time. This behavior is unexpected and bad. What if a tape simply goes bad during a run? If I'm running 7 or 8 tapes each backup, I don't want to lose the whole thing if there's an error on the last tape! I _thought_ that Amanda was programmed to simply go to the next tape when a tape error occurs. In this case, if Amanda _had_ gone to the next tape, it could have completed the job, since tape 8 was a good tape. MY QUESTION: Is there any way to configure Amanda such that such a tape error would simply go to the next tape, instead of the worst possible action, which is to abort the whole job? Short of that, is there any way Amanda could start up from where it left off? Thanks. -- Tom Hansen Senior Information Processing Consultant Great Lakes WATER Institute tomh -at- uwm.edu www.glwi.uwm.edu
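For what it's worth, the damage from a bad tape is much smaller when dumps land on a holding disk first: the run still ends with a tape error, but everything dumped so far survives on disk and can be flushed later instead of being lost. A sketch of the relevant amanda.conf pieces - paths and sizes are examples, and autoflush assumes a version that supports it:

holdingdisk hd1 {
    directory "/dumps/amanda"
    use 400 gbytes       # big enough that large DLEs are not forced direct to tape
}
runtapes 8
autoflush yes            # the next amdump run also flushes anything left in the holding disk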
DGUX - Any chance amanda would work with this?
Hi

I have been tasked with making sure we have a valid backup of a box that I know nothing about! It's in a corner of one of our IDCs, has been up for about 3 years and no-one knows anything about it.

uname -a gives me:
dgux hostname R4.20MU07 generic AViiON PentiumPro

Does anyone know what I am dealing with here, or have any clue as to whether amanda will be able to make me a backup of it?

thanks
Re: Backup device for Amanda
The backup devices I am evaluating are:
1. HP Ultrium 448 Tape Drive
2. HP DLT VS160 Tape Drive
3. HP SDLT 320
4. HP SDLT 600
I would like to know whether these backup devices are fully compatible with Amanda. Does anybody on this list use these backup devices in his/her setup? Suggestions for any other suitable backup device are also welcome.

amanda does not really care about your backup device, be it a tape drive, removable media or even local disks. As long as your OS can see and write to the drive (and control the robot if you have one) then you should be set. Personally I have used amanda with DDS3, DDS4, LTO, LTO2 and LTO3 tape drives and robots from HP and Dell without issue, using mtx to control the robot. My amanda servers have always been Linux, from RH 7.3 up to CentOS 4.3.

hope that helps some
Re: Restoring from tape when Amanda server failed
I am familiar with the amrestore command. But the problem I am facing is that the Amanda server, which also holds other applications, crashed. So I have to restore data from another server - I have Solaris 9 and/or Solaris 10 servers that I can connect to the tape drive... I also saved the configuration files of the amanda server. Is there a way to directly connect to the tape drive and use unix commands to restore data from it? Or any other suggestions...

you have the config files you say? Then why not reinstall the amanda server on the same OS as the original and then run the amrestore command as per normal?
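If reinstalling the server is not an option, an Amanda tape can be read with nothing but mt, dd and the native restore tool: tape file 0 is the volume label, and each dump image after it is its own tape file with a 32 KB Amanda header in front that names the host and disk and prints the exact recovery pipeline. A sketch only - the device path assumes a Solaris no-rewind, BSD-behaviour node, and the final pipeline assumes a server-compressed ufsdump image, which may not match your dumps:

mt -f /dev/rmt/0bn rewind
mt -f /dev/rmt/0bn fsf 1               # skip the Amanda volume label
dd if=/dev/rmt/0bn bs=32k count=1      # show the 32 KB header of the first dump image
dd if=/dev/rmt/0bn bs=32k | gzip -dc | ufsrestore -ivf -
                                       # the rest of this tape file is the image itself;
                                       # to reach a later image, rewind and "mt fsf N"
                                       # to the tape file you want (file 1 is the first)

If you can install the Amanda tools anywhere at all, amrestore does the same thing with fewer sharp edges.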
Re: Dell PowerVault 124T
We got this unit delivered and it has been installed. I need to start reading the amanda documentation this weekend. Meanwhile, I was wondering if anyone using this unit or anything comparable would mind sharing some of their experiences and/or any setup warnings. Yes i have this unit (scsi0:A:6): 160.000MB/s transfers (80.000MHz DT, offset 127, 16bit) Vendor: CERTANCE Model: ULTRIUM 2 Rev: 1801 Type: Sequential-Access ANSI SCSI revision: 03 Vendor: DELL Model: PV-124T Rev: 0026 Type: Medium Changer ANSI SCSI revision: 02 works like a charm with amanda and mtx - i would suggest just going for it and then post if you encounter issues. It works out of the box basically.
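For the archives, the moving parts on a PV-124T under Linux are the sg device for the robot and the st/nst device for the drive; mtx drives the robot directly and Amanda's chg-zd-mtx glue sits on top of mtx. A sketch only - device paths, slot numbers and file locations are examples, not taken from this setup:

mtx -f /dev/sg1 status        # inventory of the drive and magazine slots
mtx -f /dev/sg1 load 2 0      # move the cartridge in slot 2 into drive 0
mtx -f /dev/sg1 unload 2 0    # and back again

# amanda.conf:
tpchanger   "chg-zd-mtx"
tapedev     "/dev/nst0"
changerdev  "/dev/sg1"
changerfile "/etc/amanda/Daily/changer"   # prefix for the chg-zd-mtx config/state files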
Samba backup failing
Hi Amanda 2.4.5p1 on CentOS 4 Samba 3.0.21b Windows 2003 SP1 The above setup worked fine 'out of the box' and has been running for about 6 months or so. Came in to find that one of the backups was taking ages and when it did finish the samba shares were failing. Looking in the debug i can see this so not sure if its a windows/samba error but anyone seen this before or know any hints? thanks sendbackup: debug 1 pid 1068 ruid 11 euid 11: start at Sat Apr 22 00:50:45 2006 /opt/amanda-2.4.5p1/libexec/sendbackup: version 2.4.5p1 parsed request as: program `GNUTAR' disk `//host/share' device `//host/share' level 0 since 1970:1:1:0:0:0 options `|;bsd-auth;' sendbackup: try_socksize: send buffer size is 65536 sendbackup: time 0.005: stream_server: waiting for connection: 0.0.0.0.35500 sendbackup: time 0.005: stream_server: waiting for connection: 0.0.0.0.35501 sendbackup: time 0.005: waiting for connect on 35500, then 35501 sendbackup: time 0.009: stream_accept: connection from 192.168.1.2.35502 sendbackup: time 0.009: stream_accept: connection from 192.168.1.2.35503 sendbackup: time 0.009: got all connections sendbackup-gnutar: time 0.017: doing level 0 dump from date: 1970-01-01 0:00:00 GMT sendbackup-gnutar: time 0.021: backup of \\host/share sendbackup: time 0.023: spawning /opt/samba/bin/smbclient in pipeline sendbackup: argument list: smbclient //host/share -U user -E -W WORKGROUP -d0 -Tqca - sendbackup-gnutar: time 0.025: /opt/samba/bin/smbclient: pid 1070 sendbackup: time 0.477: 121: normal(|): Domain=[HOST] OS=[Windows Server 2003 3790 Service Pack 1] Server=[Windows Server 2003 5.2] sendbackup: time 24570.113: 75: normal(|): tar: dumped 4923 files and directories sendbackup: time 24570.135: 53:size(|): Total bytes written: 4779003904 sendbackup: time 24570.135: 125: strange(?): write_data: write failure. Error = Connection reset by peer sendbackup: time 24570.135: 125: strange(?): write_socket: Error writing 39 bytes to socket 14: ERRNO = Connection reset by peer sendbackup: time 24570.136: 125: strange(?): Error writing 39 bytes to client. -1 (Connection reset by peer) sendbackup: time 24570.255: pid 1068 finish time Sat Apr 22 07:40:15 2006
Re: Samba backup failing
Note the time betwen the first message (0.477) and the next and final messages (24570.113). This seems like the problem (solution) described here: http://wiki.zmanda.com/index.php/Amdump:_mesg_read:_Connection_reset_by_peer thanks - will see if that helps - There is no firewall between these 2 machines so will see how i get on with that change. thanks
Re: What is the best way to duplicate a tape?
On Mon, 2006-04-17 at 16:01 -0400, Jon LaBadie wrote: On Mon, Apr 17, 2006 at 01:35:04PM -0600, Tom Schutter wrote: I was wondering if anyone can tell be the best way to duplicate an Amanda tape on a Linux box. I will have two identical tape drives. I would much prefer to use standard UNIX commands. The program tcopy would be perfect, but it is not available on Debian, it a google search indicates that is has problems. I would think that dd could do the trick, but what is the correct incantation? tcopy was going to be my suggestion until you said you were not using Solaris :( Is tcopy available on any other OS? If you know the tape block size (amanda's is 32K) then I think it would be: dd bs=32k if=/dev/xxx of=/dev/yyy where xxx are the no compression devices or with compression turned off. As I type this I realize that dd will only copy the first tape file. So this will have to loop. And you will need to use the no-rewind device. So something like: $ mt -f /dev/nxxx rewind $ mt -f /dev/nyyy rewind $ while dd bs=32k if=/dev/nxxx of=/dev/nyyy do : # a colon == no-op command done dd returns 0 on successful copy, non-zero on failure. The loop should terminate when the input fails at EOT. Perfect, just what I needed. For the archives, a slight improvement (so the command is easier to repeat): while dd bs=32k if=/dev/nxxx of=/dev/nyyy ; do : ; done -- Tom Schutter (mailto:[EMAIL PROTECTED]) Platte River Associates, Inc. (http://www.platte.com)
What is the best way to duplicate a tape?
I was wondering if anyone can tell me the best way to duplicate an Amanda tape on a Linux box. I will have two identical tape drives. I would much prefer to use standard UNIX commands. The program tcopy would be perfect, but it is not available on Debian, and a google search indicates that it has problems. I would think that dd could do the trick, but what is the correct incantation?

-- Tom Schutter (mailto:[EMAIL PROTECTED]) Platte River Associates, Inc. (http://www.platte.com)
Exclude using dump?
Hi

Using 2.4.5p1 on Linux and wondering if it's possible to exclude a directory from the backup using dump? I presume not, and if not, what would be the impact of not having a holding disk?

Scenario is that I'm backing up a box and one of the partitions contains data that has to be backed up but also contains the holding disk. I presume that having the holding disk on a partition that's getting backed up is a bad thing.
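Since dump(8) works at the filesystem level it has no way to exclude a directory, so the usual answer is to switch that one DLE to GNUTAR, where the holding disk can be excluded. A sketch only; the dumptype name and holding-disk path are examples:

define dumptype comp-tar-noholding {
    program "GNUTAR"
    compress client fast
    index yes
    exclude "./amanda-holding"   # path of the holding disk, relative to the top of the DLE
}

Running with no holding disk at all does work, but dumps then go straight to tape one at a time, so runs get slower and a tape problem costs you the dump that was in progress.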
Restore error - centos 3/amanda-2.4.5 server - centos4/amanda-2.4.5 client
Hi

Testing a restore on the above config I have found this error. Backups run fine but this is the error when trying to do a restore - anyone seen it before, as I have not come across this? The restore is being attempted NOT on the same server that did the backup.

amrestore: 53: restoring x1._dev_sda2.20051130.1
Input block size is 32
Dump date: Wed Nov 30 03:06:53 2005
Dumped from: Tue Nov 29 03:11:26 2005
Level 1 dump of / on x1:/dev/sda2
Label: /
Extract directories from tape
Mangled directory: reclen not multiple of 4
reclen less than DIRSIZ (2 12)
Mangled directory: reclen less than DIRSIZ (0 12)
Mangled directory: reclen less than DIRSIZ (0 12)
Mangled directory: reclen less than DIRSIZ (0 12)
Mangled directory: reclen not multiple of 4
reclen less than DIRSIZ (2 12)
Mangled directory: reclen less than DIRSIZ (0 12)
Mangled directory: reclen less than DIRSIZ (0 12)
Mangled directory: reclen less than DIRSIZ (0 12)
Mangled directory: reclen not multiple of 4

thanks
Re: Restore error - centos 3/amanda-2.4.5 server - centos4/amanda-2.4.5 client
What OS and dump version was the data backed up from? What OS and dump version are you trying to restore to? In general, the dump type (xfsdump, vdump, etc.) and sometimes even the version must be the same for backup and restore. This is an unfortunate consequence of Amanda's use of native tools.

the clients that I'm testing at the moment are running CentOS 4.2 and the server that did this backup is running CentOS 3.something (3.6, I think) - I'm testing this restore on a server running RH 7.3 and this is chucking the error. I'm currently running a test restore on the same server that did the backup to see if that works - I'll confirm dump versions shortly
Re: Restore error - centos 3/amanda-2.4.5 server - centos4/amanda-2.4.5 client
What OS and dump version was the data backed up from? What OS and dump version are you trying to restore to? In general, the dump type (xfsdump, vdump, etc.) and sometimes even the version must be the same for backup and restore. This is an unfortunate consequence of Amanda's use of native tools. server that ran the backup Linux 2.4.21-27.0.2.ELsmp dump 0.4b37 client that got backed up Linux 2.6.9-22.ELsmp dump 0.4b39 and i can confirm that even on the server that did the backup i get the same garbage when trying to restore - Is the dump version difference the issue do you think?
Re: Restore error - centos 3/amanda-2.4.5 server - centos4/amanda-2.4.5 client
server that ran the backup Linux 2.4.21-27.0.2.ELsmp dump 0.4b37 client that got backed up Linux 2.6.9-22.ELsmp dump 0.4b39 sorry to reply to myself here but on another test client Linux 2.4.20 dump 0.4b27 and the same server did a succesful restore so it seems that if the client version of dump is higher than the server the restore barfs however the opposite appears to give a good restore - Looks like i'm going to have to install myself a new server box with dump 0.4b39 so get around this - Unless anyone has any other suggestions? thanks
** resolved ** Re: Restore error - centos 3/amanda-2.4.5 server - centos4/amanda-2.4.5 client
and the same server did a succesful restore so it seems that if the client version of dump is higher than the server the restore barfs however the opposite appears to give a good restore - Looks like i'm going to have to install myself a new server box with dump 0.4b39 so get around this - Unless anyone has any other suggestions? thanks OK - I have managed to resolve this issue. I had to install a newer version of dump that was higher than the client version and then re-compile amanda so that it picked up this new version of dump. It appears that dump/restore is a bit picky regarding version numbers - the client version must be the same or lower than the server version otherwise restores may not work. thanks
Re: Estimate Timeout Issue - Dump runs fine
OK thanks - I have increased the etimeout to 2400 seconds and also changed the udp timeout within checkpoint to also be 2400 seconds so i'll see how the run goes tonight everything was fine today - no estimate timeout thanks for the pointer
Estimate Timeout Issue - Dump runs fine
Hi Server is 2.4.5 and client is now 2.4.5p1 both on CentOS I use Amanda and have done for years with no issues setting up etc - I can pretty much set up with my eyes closed now!! Amanda rocks... But i'm getting a slightly strange error with a large partition. The partition in question is around 900gig in size although only a few hundred meg are currently used. When the estimate runs it returns FAILURE AND STRANGE DUMP SUMMARY: planner: ERROR Estimate timeout from servername Thing is though the actual dump of this filesystem runs fine - I have increased my eTimeout to 20mins but this still occurs - Any ideas on this one? thanks
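The two knobs that usually matter here, as a sketch (values are examples): etimeout in amanda.conf, and the per-dumptype estimate method if your 2.4.5 build supports it, since the level-by-level "dump -S" runs are what eat the time on a big filesystem:

etimeout 2400            # seconds allowed per DLE for estimates (a negative value means
                         # a total for all DLEs on that client)

define dumptype big-partition {
    global
    estimate calcsize    # or "server"; both avoid running dump -S for every level
}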
Re: Estimate Timeout Issue - Dump runs fine
Look in /tmp/amanda/sendsize*debug and/or amandad*debug to see how long the estimate is actually taking. Also, what do your iptables rules look like on the server? thanks - iptables are not being used, local firewall is off sendsize degug is below and looks OK # more /tmp/amanda/sendsize.20051102003001.debug sendsize: debug 1 pid 12320 ruid 11 euid 11: start at Wed Nov 2 00:30:01 2005 sendsize: version 2.4.5p1 sendsize[12322]: time 0.002: calculating for amname '/dev/sda2', dirname '/', spindle -1 sendsize[12322]: time 0.002: getting size via dump for /dev/sda2 level 0 sendsize[12322]: time 0.002: calculating for device '/dev/sda2' with 'ext3' sendsize[12322]: time 0.002: running /sbin/dump 0Ssf 1048576 - /dev/sda2 sendsize[12322]: time 0.003: running /opt/amanda-2.4.5p1/libexec/killpgrp sendsize[12320]: time 0.003: waiting for any estimate child: 1 running sendsize[12322]: time 21.884: 1447269376 sendsize[12322]: time 21.885: . sendsize[12322]: estimate time for /dev/sda2 level 0: 21.882 sendsize[12322]: estimate size for /dev/sda2 level 0: 1413349 KB sendsize[12322]: time 21.885: asking killpgrp to terminate sendsize[12322]: time 22.886: getting size via dump for /dev/sda2 level 1 sendsize[12322]: time 22.887: calculating for device '/dev/sda2' with 'ext3' sendsize[12322]: time 22.887: running /sbin/dump 1Ssf 1048576 - /dev/sda2 sendsize[12322]: time 22.888: running /opt/amanda-2.4.5p1/libexec/killpgrp sendsize[12322]: time 195.606: 4647936 sendsize[12322]: time 195.606: . sendsize[12322]: estimate time for /dev/sda2 level 1: 172.718 sendsize[12322]: estimate size for /dev/sda2 level 1: 4539 KB sendsize[12322]: time 195.606: asking killpgrp to terminate sendsize[12322]: time 196.608: done with amname '/dev/sda2', dirname '/', spindle -1 sendsize[12320]: time 196.608: child 12322 terminated normally sendsize[12334]: time 196.609: calculating for amname '/dev/sda1', dirname '/boot', spindle -1 sendsize[12334]: time 196.609: getting size via dump for /dev/sda1 level 0 sendsize[12334]: time 196.609: calculating for device '/dev/sda1' with 'ext3' sendsize[12334]: time 196.609: running /sbin/dump 0Ssf 1048576 - /dev/sda1 sendsize[12320]: time 196.609: waiting for any estimate child: 1 running sendsize[12334]: time 196.610: running /opt/amanda-2.4.5p1/libexec/killpgrp sendsize[12334]: time 197.239: 5737472 sendsize[12334]: time 197.239: . sendsize[12334]: estimate time for /dev/sda1 level 0: 0.630 sendsize[12334]: estimate size for /dev/sda1 level 0: 5603 KB sendsize[12334]: time 197.239: asking killpgrp to terminate sendsize[12334]: time 198.242: getting size via dump for /dev/sda1 level 1 sendsize[12334]: time 198.243: calculating for device '/dev/sda1' with 'ext3' sendsize[12334]: time 198.243: running /sbin/dump 1Ssf 1048576 - /dev/sda1 sendsize[12334]: time 198.243: running /opt/amanda-2.4.5p1/libexec/killpgrp sendsize[12334]: time 198.684: 27648 sendsize[12334]: time 198.684: . 
sendsize[12334]: estimate time for /dev/sda1 level 1: 0.441 sendsize[12334]: estimate size for /dev/sda1 level 1: 27 KB sendsize[12334]: time 198.684: asking killpgrp to terminate sendsize[12334]: time 199.687: done with amname '/dev/sda1', dirname '/boot', spindle -1 sendsize[12320]: time 199.687: child 12334 terminated normally sendsize[12339]: time 199.687: calculating for amname '/dev/sda5', dirname '/export/disk1', spindle -1 sendsize[12339]: time 199.688: getting size via dump for /dev/sda5 level 0 sendsize[12320]: time 199.688: waiting for any estimate child: 1 running sendsize[12339]: time 199.688: calculating for device '/dev/sda5' with 'ext3' sendsize[12339]: time 199.688: running /sbin/dump 0Ssf 1048576 - /dev/sda5 sendsize[12339]: time 199.689: running /opt/amanda-2.4.5p1/libexec/killpgrp sendsize[12339]: time 545.606: 88973312 sendsize[12339]: time 545.617: . sendsize[12339]: estimate time for /dev/sda5 level 0: 345.928 sendsize[12339]: estimate size for /dev/sda5 level 0: 86888 KB sendsize[12339]: time 545.617: asking killpgrp to terminate sendsize[12339]: time 546.619: getting size via dump for /dev/sda5 level 1 sendsize[12339]: time 546.646: calculating for device '/dev/sda5' with 'ext3' sendsize[12339]: time 546.646: running /sbin/dump 1Ssf 1048576 - /dev/sda5 sendsize[12339]: time 546.647: running /opt/amanda-2.4.5p1/libexec/killpgrp sendsize[12339]: time 2182.684: 25811968 sendsize[12339]: time 2182.696: . sendsize[12339]: estimate time for /dev/sda5 level 1: 1636.054 sendsize[12339]: estimate size for /dev/sda5 level 1: 25207 KB sendsize[12339]: time 2182.701: asking killpgrp to terminate sendsize[12339]: time 2183.703: done with amname '/dev/sda5', dirname '/export/disk1', spindle -1 sendsize[12320]: time 2183.704: child 12339 terminated normally sendsize: time 2183.704: pid 12320 finish time Wed Nov 2 01:06:24 2005 one of my amanda.debugs does have this at the bottom of it amandad: time 2193.716: dgram_recv: timeout after 10 seconds amandad: time
Re: Estimate Timeout Issue - Dump runs fine
Yep. So you can just increase etimeout and/or figure out why /sbin/dump 1Ssf 1048576 - /dev/sda5 is taking so long. OK thanks - I have increased etimeout to 2400 seconds and also changed the UDP timeout within Check Point to 2400 seconds as well, so i'll see how the run goes tonight. thanks
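For reference, a minimal amanda.conf sketch of that change; etimeout is a real server-side parameter and the value is simply the one mentioned above (if I recall correctly, a positive etimeout is allowed per disklist entry, so the effective per-host budget is larger):
# in the server's amanda.conf for this configuration
etimeout 2400    # seconds allowed per DLE for the estimate phase (default is 300)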
LTO-2 - Tape Type
Anyone got a tape type for an LTO-2 please? thanks
Re: R: LTO-2 - Tape Type
Yes, I know - but rather than wait all that time I wondered if anyone already had one to hand. just run the amtapetype command and you will get a tapetype definition to use in your amanda.conf configuration file... Giovanni -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of Tom Brown Sent: Wednesday, 29 June 2005 10:25 To: amanda-users Subject: LTO-2 - Tape Type Anyone got a tape type for an LTO-2 please? thanks
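For anyone who just wants a starting point while amtapetype runs, here is a hedged sketch of what an LTO-2 tapetype stanza in amanda.conf might look like. The numbers are placeholders based only on LTO-2's nominal 200 GB native capacity, not measured values; substitute the output of amtapetype for your own drive:
define tapetype LTO2 {
    comment "LTO-2 -- placeholder values, replace with amtapetype output"
    length 200 gbytes
    filemark 0 kbytes
    speed 30000 kbytes
}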
Re: Amanda - External HDD's
Geert Uytterhoeven wrote: On Wed, 8 Jun 2005, Jon LaBadie wrote: On Wed, Jun 08, 2005 at 01:43:25PM +0100, Tom Brown wrote: Not entirely (or only) related to amanda, but i'm sure people have done this before so here we go ;) Linux amanda server (WBEL4) and it's using the file driver for virtual tapes as the client does not want to spend $$$ on an LTO3 drive - a LOT of data here. So has anyone used USB2/Firewire external HDD's as their virtual tapes? How does amanda/Linux handle this? I presume that amanda does not really care and that this is perhaps more a Linux question, but any gotchas? makes to avoid? Does Linux mind hotplugging? Only a single experience, for a client that wanted to transport a pair of external drives back and forth from office to home as off-site protection. This was RHEL 3 and plugging them in did not seem to be a problem; they mounted whenever they were plugged in or powered up. But it was not as if we actively tested regular plugging/unplugging. The original plan was weekly, so that was not an issue. Right now I'm using such a scheme, using IDE disks in `hot-pluggable'[*] IDE bays. One thing that did happen to me during testing: I plugged the drives in one at a time and converted the disks to ext format. Then began some testing. Later the client rebooted and I did not realize it. I blindly continued some testing where the first step was to clean out one of the drives (rm -rf /mnt/usb1/* or some such). What I did not realize was that across the reboot what had been usb1 was now usb2 and vice versa. It depended on the order they were seen by the OS. It was time to experiment with something I had read about RHEL, maybe other Linuxes too: that a drive can be assigned a unique id, and whenever that id is seen, it can be made to automount on the same directory. Set it up and it worked like a charm. I gave the partitions on my backup disks labels using cfdisk, and use different mount points. In /etc/fstab I have
| LABEL=backup-disk-1 /media/backup1 ext3 defaults,noauto 0 0
| LABEL=backup-disk-2 /media/backup2 ext3 defaults,noauto 0 0
I.e. the right disk will always show up at the right mount point. My removable disks are much larger than my vtapes, so from time to time I copy my vtapes (which reside on an internal non-removable disk) to a removable disk. Right now all this is done manually, but my intention is to write some scripts to automate managing vtapes on multiple removable disks. Gr{oetje,eeting}s, thanks all - good advice in there so i'll see how i get on, but it looks like firewire will be the way forward. thanks
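A small sketch of the labeling step that makes those LABEL= fstab entries work; e2label is one way to do it on ext2/ext3 partitions, and the device names here are hypothetical:
e2label /dev/sdb1 backup-disk-1    # first removable disk
e2label /dev/sdc1 backup-disk-2    # second removable disk
mount /media/backup1               # fstab resolves LABEL=backup-disk-1 to whichever device carries it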
Amanda - External HDD's
Hi Not entirely (or only) related to amanda, but i'm sure people have done this before so here we go ;) Linux amanda server (WBEL4) and it's using the file driver for virtual tapes as the client does not want to spend $$$ on an LTO3 drive - a LOT of data here. So has anyone used USB2/Firewire external HDD's as their virtual tapes? How does amanda/Linux handle this? I presume that amanda does not really care and that this is perhaps more a Linux question, but any gotchas? makes to avoid? Does Linux mind hotplugging? Thanks guys
Re: Samba 3.0.14 problems with Amanda 2.4.4p3?
I tried this:
--- client-src/sendbackup-gnutar.c.orig 2005-05-20 12:46:49.0 +0200
+++ client-src/sendbackup-gnutar.c 2005-05-20 12:31:03.0 +0200
@@ -73,6 +73,7 @@
     AM_NORMAL_RE("^[Aa]dded interface"),
     AM_NORMAL_RE("^session request to "),
     AM_NORMAL_RE("^tar: dumped [0-9][0-9]* (tar )?files"),
+    AM_NORMAL_RE("^Domain=.*OS=.*Server=.*"),
 #if SAMBA_VERSION >= 2
     AM_NORMAL_RE("^doing parameter"),
backed up, recompiled, ran make install in client-src, restarted services. - waiting for results :) did you ever get any results? thanks
Re: Amanda client stopped working
Type something on the client: you should see the letters on the other side. Wait some time, and type something on the other program, and see if the letters appear on the opposite program again. Some experimentation with how long to wait before answering should give you an idea whether it is a timing issue with UDP or not. I also started to get weird etimeout errors on random hosts for no good reason - However we had changed switch configuration, tagged VLANs, and also firewall version, FW-1 running on Linux to FW-1 running on SPLAT. Turns out that in FW-1 the default UDP timeout is 40 seconds, and increasing that to something higher has solved the issue for us. The error we were seeing in the debug was the same as yours. thanks
Dumps Fail - Estimate Timeout errors
Hi Clients are all RH 7.3 or WhiteBox respin 2. Server is 2.4.4p4 running on WhiteBox respin 2. My amchecks run fine and without issue, however i have come in on 2 mornings now to find that some of the clients failed. The actual fails have occurred on different clients, ie some that failed 2 nights ago worked last night without changes, and i can't figure out why. All clients were working fine at a different site; we moved IDC over the weekend so we have a new network architecture. Failure errors are
hostname /dev/rd/c0d0p3 lev 0 FAILED [Estimate timeout from hostname]
hostname /dev/rd/c0d0p1 lev 0 FAILED [Estimate timeout from hostname]
anotherhostname /dev/rd/c0d0p2 lev 0 FAILED [Estimate timeout from anotherhostname]
anotherhostname /dev/rd/c0d0p5 lev 0 FAILED [Estimate timeout from anotherhostname]
There is nothing in the firewall log to indicate a dropped packet. I have, as a test, actually allowed any ports between these networks, but it has not helped. Does anyone know how to debug these timeout type issues, as i have been using amanda for about 3 years now and have not encountered this before. thanks
Re: Dumps Fail - Estimate Timeout errors
Have a look on that client in /tmp/amanda, look for the files sendsize.DATETIME.debug and see how long the estimate did take. The first line of the file is the start time and the last line is the finish time. How long did it really take? You may have to change the etimeout parameter in amanda.conf. If there is no finish time line, then the estimate crashed, and probably there is an error message in that file too. Hi Please see pasted below the 3 entries that apply to last night - everything appears OK in there it seems. If sendsize is not crashing, what else could cause this? thanks
sendsize: debug 1 pid 16820 ruid 11 euid 11: start at Wed May 18 00:29:27 2005
sendsize: version 2.4.4p1
snip
sendsize[16820]: time 118.545: child 16874 terminated normally
sendsize: time 118.545: pid 16820 finish time Wed May 18 00:31:25 2005
sendsize: debug 1 pid 17384 ruid 11 euid 11: start at Wed May 18 00:54:26 2005
sendsize: version 2.4.4p1
snip
sendsize[17384]: time 101.841: child 17418 terminated normally
sendsize: time 101.841: pid 17384 finish time Wed May 18 00:56:08 2005
sendsize: debug 1 pid 17795 ruid 11 euid 11: start at Wed May 18 01:19:26 2005
sendsize: version 2.4.4p1
snip
sendsize[17795]: time 106.417: child 17842 terminated normally
sendsize: time 106.417: pid 17795 finish time Wed May 18 01:21:12 2005
Re: iptables script
thanks all! Tom : On Sat, May 14, 2005 at 05:29:10PM -0400, Joshua Baker-LePain enlightened us: For the first time ever i have to backup a machine over the 'internet' - This client is using iptables as its firewall. Does anyone have an iptables rule they would like to share that would allow amanda through to be able to backup this client? If you haven't compiled with any portrange options, you'll have to do something like this:
-A INPUT -p udp -s $AMANDA_SERVER -d 0/0 --dport 10080 -j ACCEPT
-A INPUT -p tcp -m tcp -s $AMANDA_SERVER -d 0/0 --dport 1025:65535 -j ACCEPT
Or:
-A INPUT -p udp -s $AMANDA_SERVER -d $AMANDA_CLIENT --dport 10080 -j ACCEPT
and load the ip_conntrack_amanda kernel module. I use the following in /etc/modprobe.conf (the install directive is all on one line in the file):
options ip_conntrack_amanda master_timeout=2400
install ip_tables /sbin/modprobe --ignore-install ip_tables && /sbin/modprobe ip_conntrack_amanda
This sets the UDP timeout for amanda packets to 2400 seconds, up from the default 300 (don't hold me to that, it might be 600). I was getting estimate timeouts since they were taking longer than 300/600 seconds and the firewall would close the port. Makes things a little more secure than opening up everything above 1024 ;-) Matt
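As a usage sketch only: those -A fragments assume they are fed to the iptables command (or an iptables-restore file). With a hypothetical server address it might look like:
AMANDA_SERVER=192.0.2.10
iptables -A INPUT -p udp -s $AMANDA_SERVER --dport 10080 -j ACCEPT
iptables -A INPUT -p tcp -s $AMANDA_SERVER --dport 1025:65535 -j ACCEPT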
iptables script
Hi For the first time ever i have to backup a machine over the 'internet' - This client is using iptables as its firewall. Does anyone have an iptables rule they would like to share that would allow amanda through to be able to backup this client? thanks
Re: tapecycle and the doc
On Tue, 2005-03-15 at 23:52 +0100, Stefan G. Weichinger wrote: Hi, Tom, on Tuesday, 15 March 2005 at 23:32 you wrote to amanda-users: TS On Tue, 2005-03-15 at 10:25 -0700, Tom Schutter wrote: On Tue, 2005-03-15 at 01:03 -0500, Jon LaBadie wrote: While amanda is always willing to use a new tape in its rotation, it refuses to reuse a tape until at least 'tapecycle' number of other tapes have been used. TS Ooops. I think that should be: TS While amanda is always willing to use a new tape in its rotation, TS it refuses to reuse a tape until at least 'tapecycle-1' number of TS other tapes have been used. 10 points for that. In case you forgot, it does not appear to be fixed here yet: http://www.amanda.org/docs/amanda.8.html -- Tom Schutter (mailto:[EMAIL PROTECTED]) Platte River Associates, Inc. (http://www.platte.com)
Re: backup Oracle DB at AMANDA server
Jack$on wrote: hi. Thanks to everyone! Currently I'm recompiling my amanda with my tar script. The shutdown/startup scheme is working well now. I'm replacing the start/stop of my Oracle DB with switching the DB in and out of hot backup mode... I'm reconfiguring my Oracle DB, changing archive_log_dest to another partition (maybe to another server), and will try backing up the redo logs with the starttime param in the dumptype. I think that if I back up the redo logs with an estimated delay (2 or 3 hours) it should all work correctly... I think... :) not sure what version of Oracle you use but here we use 8.1.7.4 and 9.2.0.4, all on Linux. To back up, I simply back up the DB using RMAN to a directory on the local Oracle server. I then use Amanda to back up that directory, and so I get control files, datafiles etc. all in the RMAN backup. This works fine as I can't stop my DBs either, as we are a 24x7 shop. thanks
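A hedged sketch of that RMAN-to-disk-then-Amanda approach; the hostname, script path, backup directory and dumptype name below are all hypothetical, the point being only that the RMAN output directory becomes an ordinary tar DLE:
# nightly cron on the Oracle host, scheduled before amdump runs;
# the cmdfile writes backup pieces under /u01/rman_backup
rman target / cmdfile=/home/oracle/backup_to_disk.rman
# matching disklist entry on the Amanda server
oraclehost  /u01/rman_backup  comp-user-tar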
Re: tapecycle and the doc
On Tue, 2005-03-15 at 01:03 -0500, Jon LaBadie wrote: On Mon, Mar 14, 2005 at 09:14:43AM -0700, Tom Schutter wrote: Here is my bad attempt at an improvement, please do not use it verbatim: Here is my attempt at a revision: tapecycle int Default: 15 tapes. Typically tapes are used by amanda in an ordered rotation. The tapecycle parameter defines the size of that rotation. The number of tapes in rotation must be larger than the number of tapes required for a complete dump cycle (see the dumpcycle parameter). This is calculated by multiplying the number of amdump runs per dump cycle (runspercycle parameter) times the number of tapes used per run (runtapes parameter). Typically two to four times this calculated number of tapes are in rotation. While amanda is always willing to use a new tape in its rotation, it refuses to reuse a tape until at least 'tapecycle' number of other tapes have been used. It is considered good administrative practice to set the tapecycle parameter slightly lower than the actual number of tapes in rotation. This allows the administrator to more easily cope with damaged or misplaced tapes or schedule adjustments that call for slight adjustments in the rotation order. Your attempt is far better than mine, and it says what I meant. -- Tom Schutter (mailto:[EMAIL PROTECTED]) Platte River Associates, Inc. (http://www.platte.com)
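A hypothetical amanda.conf fragment illustrating the arithmetic described above (the values are invented for the example):
dumpcycle 14 days      # a full dump of everything at least every two weeks
runspercycle 14        # one amdump run per day
runtapes 1             # one tape per run
tapecycle 20 tapes     # must be at least runspercycle * runtapes + 1 = 15; spares ease tape problems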
Re: tapecycle and the doc
On Tue, 2005-03-15 at 10:25 -0700, Tom Schutter wrote: On Tue, 2005-03-15 at 01:03 -0500, Jon LaBadie wrote: While amanda is always willing to use a new tape in its rotation, it refuses to reuse a tape until at least 'tapecycle' number of other tapes have been used. Ooops. I think that should be: While amanda is always willing to use a new tape in its rotation, it refuses to reuse a tape until at least 'tapecycle-1' number of other tapes have been used. -- Tom Schutter (mailto:[EMAIL PROTECTED]) Platte River Associates, Inc. (http://www.platte.com)
tapecycle and the doc
I had some questions regarding tapecycle, and after reading the man page and the doc (old and new), I think that they fall short on describing what tapecycle should be set to. The minimum value of tapecycle is well covered, but not the maximum value, and how tapecycle should relate to the number of tapes that have been labeled. From the man page: tapecycle int Default: 15 tapes. The number of tapes in the active tape cycle. This must be at least one larger than the number of Amanda runs done during a dump cycle (see the dumpcycle parameter) times the number of tapes used per run (see the runtapes parameter). For instance, if dumpcycle is set to 14 days, one Amanda run is done every day (Sunday through Saturday), and runtapes is set to one, then tapecycle must be at least 15 (14 days * one run/day * one tape/run + one tape). In practice, there should be several extra tapes to allow for schedule adjustments or disaster recovery. So what is an active tape cycle? That is never defined anywhere. Although the last sentence is correct and it makes sense, it does not explain how tapecycle should relate to the actual number of labeled tapes. Here is my bad attempt at an improvement, please do not use it verbatim: You must have at least tapecycle tapes labeled, but you can have more. By labeling extra tapes, you can allow for schedule adjustments or disaster recovery. For example, let's say that your tapecycle is set to 20 and you have 20 labeled tapes. If you discover that tape #5 that you are about to put in the drive is bad, your only alternative is to immediately label a new replacement tape. If tapecycle was 20 and you had 25 labeled tapes, then you could put tape #6 in the drive and deal with the problem later. On the other hand, if the number of labeled tapes greatly exceeds tapecycle, then AMANDA (insert inefficiency issue here). -- Tom Schutter (mailto:[EMAIL PROTECTED]) Platte River Associates, Inc. (http://www.platte.com)
Amanda-2.4.4p4 - RedHat 7.3 - Samba-3.0
Hi Using the above fine for all Unix type hosts, and for these the tar and dump based backups work fine. Just tried to backup my first windows partition last night and it seems to have worked, but the dump ran with 'strange' - Can anyone point me in the direction of what might be wrong? FAILED AND STRANGE DUMP DETAILS:
/-- titan //printserver/hyperion lev 0 STRANGE
sendbackup: start [titan://printserver/hyperion level 0]
sendbackup: info BACKUP=/opt/samba/bin/smbclient
sendbackup: info RECOVER_CMD=/opt/samba/bin/smbclient -f... -
sendbackup: info end
? [2005/02/24 00:46:52, 0] client/clitar.c:process_tar(1433)
? tar: dumped 3546 files and directories
? [2005/02/24 00:46:52, 0] client/clitar.c:process_tar(1434)
| Total bytes written: 3155058688
sendbackup: size 3081112
sendbackup: end
\
thanks for any help! Tom
Restore problem - tar backup
Hi Posted earlier about a 'strange' report to do with a windows partition. I'm trying to restore this partition to see what made it onto tape. The issue i'm having is that amanda is ignoring this partition and skipping it. eg amrestore -p /dev/nst0 titan \\printserver\hyperion | tar xvfp - -- snip amrestore: 9: skipping titan.__printserver_hyperion.20050224.0 I have tried escaping the \\ with a $ so had amrestore -p /dev/nst0 titan $\\printserver$\hyperion | tar xvfp - but that yielded the same - Any hints here? The OS of the amanda server where i'm doing this from is RH7.3 thanks!
Re: Restore problem - tar backup
try titan //printserver/hyperion Ah yes - it's been a long week! Thanks!
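Putting the corrected disk name into the earlier command, the restore invocation would then look something like:
amrestore -p /dev/nst0 titan //printserver/hyperion | tar xvfp -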
Re: Amanda-2.4.4p4 - RedHat 7.3 - Samba-3.0
FAILED AND STRANGE DUMP DETAILS:
/-- titan //printserver/hyperion lev 0 STRANGE
sendbackup: start [titan://printserver/hyperion level 0]
sendbackup: info BACKUP=/opt/samba/bin/smbclient
sendbackup: info RECOVER_CMD=/opt/samba/bin/smbclient -f... -
sendbackup: info end
? [2005/02/24 00:46:52, 0] client/clitar.c:process_tar(1433)
? tar: dumped 3546 files and directories
? [2005/02/24 00:46:52, 0] client/clitar.c:process_tar(1434)
| Total bytes written: 3155058688
sendbackup: size 3081112
sendbackup: end
\
responding to myself here, but it seems that this may be due to a samba error? We are using version 3.0.0 and perhaps this does not play nicely with amanda 2.4.4p4 - The backup restored fine though. Anyone know a 'good' 2.4.4p4 and Samba version combo? thanks
backup to Iomega REV
My original post Can Amanda use an Iomega REV drive as a tape? was answered affirmatively (thanks to all who responded!), but I'm still unsure of how to proceed. I have 2 servers:
Amanda:
36gb raid-1 (mirrored) -- RedHat AS 3.0 opsys, Amanda 2.4.4p4
72gb raid-1 (mirrored) -- CVS repository, some user apps data
33gb Iomega REV rdd
other:
36gb raid-1 (mirrored) -- RedHat AS 3.0 opsys
72gb raid-1 (mirrored) -- most user apps data
I'd like daily backups, so users could restore files/directories from a few days past. I'd like Amanda to use the Iomega REV cartridge as a single tape, maybe filling it up slowly every day and sending me an e-mail when it's full so I can load a new cartridge, or maybe making full backups weekly. The full backup will eventually exceed 33 gb (although Iomega claims 90gb using their Windows compression). I'd also like to keep the previous month's (or thereabouts) full backup cartridges off site. This is roughly what we now have on our old server, with NetBackup making 7 daily incremental bkps and 5 full weekly bkps. How do I translate this to Amanda configurations? Do I need to use work disks at all? Why would the REV cartridge have to be split into multiple vtapes?
Can Amanda use an Iomega REV drive as a tape?
Can/should Amanda use an Iomega REV drive as an output tape? We've got 2 servers, both running RedHat AS 3.0, with 35gb and 70gb hard drives on each, and we're interested in running Amanda on one of the servers to back up both (more servers to follow). The backup server has a 35gb Iomega REV drive, which RedHat sees as /cdrom2. Has anyone used a REV drive with Amanda?
Re: rackmount tape changers
We're looking at increasing our backup capacity, and I'm wondering if anyone has any recommendations for a rackable tape changer, with about a 9 tape capacity (I'm thinking LTO tapes) and a SCSI interface. Anything at all would help; I'm especially interested in such devices that you are using in production with AMANDA, though. we use these http://www.de.nec.de/productdetail.php/id/684 they can be rack mounted with the kit and 2 can go side by side in the rack - They are relatively cheap in comparison to some and so far (about a year and a half) they have been great. Have capacity for 10 tapes and they also have a barcode reader thanks