Re: amrecover fails
On 30 Oct at 16:58 Paul Bijnens [EMAIL PROTECTED] wrote in message [EMAIL PROTECTED]

[snip]

[EMAIL PROTECTED] recover]# amrecover HomeDumps -s tony-lx
AMRECOVER Version 2.5.1. Contacting server on tony-lx ...
NAK: amindexd: invalid service

amindexd appears to be running; it's certainly configured as per the documentation in xinetd.d. There appears to be no mention of this in either the wiki or the faq.

http://wiki.zmanda.com/index.php/Configuring_bsd/bsdudp/bsdtcp_authentication

Since 2.5.1 you need to add the services that amandad is allowed to run in the server_args:

service amanda
{
    only_from   = amandasrv.example.com amandaclnt.example.com
    socket_type = dgram
    protocol    = udp
    wait        = yes
    user        = amandabackup
    group       = disk
    groups      = yes
    server      = /usr/lib/amanda/amandad
    server_args = -auth=bsd amdump amindexd amidxtaped
    disable     = no
}

Thanks very much, Paul; that fixed it! I simply didn't associate my error message with that change.

A supplementary question for anyone who knows: is it possible to get amrestore to preserve file ownership/permissions, instead of assigning everything to root?

Cheers, Tony
--
Tony van der Hoff | mailto:[EMAIL PROTECTED]
Buckinghamshire, England
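For anyone hitting the same "NAK: amindexd: invalid service" message, a quick way to see whether the xinetd entry lists the services amrecover needs can be sketched as below. This is only an illustration: the function name is made up here, and the file path you pass in (for example /etc/xinetd.d/amanda) is an assumption about where your distribution keeps the entry.

```sh
#!/bin/sh
# Hypothetical helper: print each service amrecover needs (amindexd,
# amidxtaped) that is NOT listed on the server_args line of the given
# xinetd service file. Prints nothing when both are present.
missing_amandad_services() {
    conf=$1
    # Take the first server_args line and turn tabs into spaces so the
    # word match below only has to deal with blanks.
    args=$(grep -E '^[[:space:]]*server_args[[:space:]]*=' "$conf" | head -n 1 | tr '\t' ' ')
    for svc in amindexd amidxtaped; do
        case " $args " in
            *" $svc "*) ;;                  # service is listed
            *) printf '%s\n' "$svc" ;;      # service is missing
        esac
    done
}
```

Calling `missing_amandad_services /etc/xinetd.d/amanda` on a pre-2.5.1 style entry (server_args with only -auth=bsd amdump) would list both names; after the change Paul describes it prints nothing. xinetd still has to be reloaded after editing the file.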
Re: amrecover failed - time out contacting to server itself
Are you sure the server received the request? Did amandad get executed on the server? Post its debug log. Check your firewall/network log.

Jean-Louis

fedora wrote:

hi all, I am having a problem with amrecover.

amrecover DailySet1
AMRECOVER Version 2.5.1p3. Contacting server on server.com ...
[request failed: timeout waiting for ACK]

the amrecover debug file was:

amrecover: debug 1 pid 12565 ruid 0 euid 0: start at Tue Oct 30 16:35:10 2007
Reading conf file /usr/local/etc/amanda/amanda-client.conf.
Reading conf file /usr/local/etc/amanda/DailySet1/amanda-client.conf.
amrecover: debug 1 pid 12565 ruid 0 euid 0: rename at Tue Oct 30 16:35:10 2007
security_getdriver(name=bsd) returns 0x50a0e0
security_handleinit(handle=0x8c04660, driver=0x50a0e0 (BSD))
amrecover: bind_portrange2: Skip port 706: Owned by silc.
amrecover: bind_portrange2: Skip port 707: Owned by borland-dsj.
amrecover: bind_portrange2: Try port 708: Available - Success
amrecover: dgram_bind: socket bound to 0.0.0.0.708
amrecover: dgram_send_addr(addr=0xbfc07310, dgram=0x50b004)
amrecover: (sockaddr_in *)0xbfc07310 = { 2, 10080, 202.53.250.162 }
amrecover: dgram_send_addr: 0x50b004->socket = 3
amrecover: dgram_send_addr(addr=0xbfc070b0, dgram=0x50b004)
amrecover: (sockaddr_in *)0xbfc070b0 = { 2, 10080, 202.53.250.162 }
amrecover: dgram_send_addr: 0x50b004->socket = 3
amrecover: dgram_send_addr(addr=0xbfc070b0, dgram=0x50b004)
amrecover: (sockaddr_in *)0xbfc070b0 = { 2, 10080, 202.53.250.162 }
amrecover: dgram_send_addr: 0x50b004->socket = 3
security_seterror(handle=0x8c04660, driver=0x50a0e0 (BSD) error=timeout waiting for ACK)
security_close(handle=0x8c04660, driver=0x50a0e0 (BSD))

any help would be appreciated. :-D
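When checking the firewall as suggested, it helps to know exactly where the client was sending its datagrams. The debug file already contains that in the sockaddr_in lines; a rough sketch for pulling it out follows. The function name is invented for this example, and the line layout is assumed to match the excerpt above.

```sh
#!/bin/sh
# Hypothetical helper: list the unique "address udp/port" pairs that an
# amrecover debug file shows datagrams being sent to. Reads the files
# named as arguments, or stdin when none are given.
amrecover_targets() {
    # Matches lines such as:
    #   amrecover: (sockaddr_in *)0xbfc07310 = { 2, 10080, 202.53.250.162 }
    # and prints "202.53.250.162 udp/10080".
    sed -n 's/.*(sockaddr_in \*)[^=]*= { *[0-9]*, *\([0-9]*\), *\([0-9.]*\) *}.*/\2 udp\/\1/p' "$@" | sort -u
}
```

For the log above this reports 202.53.250.162 udp/10080, three sends with no ACK coming back. That is the host and port to look for in the server's firewall rules and in its xinetd/amandad logs.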
Re: discrepency between amadmin, logs and tape content?
* Jean-Louis Martineau [EMAIL PROTECTED] [20071030 10:40]:

This bug is fixed in 2.5.3alpha, but the patch was not backported to 2.5.2p1. Can you try the attached patch?

Looks like the patch did it. amadmin now reports correctly and amfetchdump can restore the dump. Thanks!

jf

Jean-Louis

Jean-Francois Malouin wrote:

Hi,

With amanda-2.5.2p1 I did an archive a few days ago and trying to restore it caused me a few problems: amfetchdump tells me that there is no valid data to be restored for that date and amadmin reports a connection timeout:

grumpy: /opt/amanda/sbin/amfetchdump -p -d /dev/nst1 archive-nihpd3-right1 \
    yorick /data/nihpd/nihpd3/data/mri_processing/1.1 20071022 | tar -tvf -
No matching dumps found

grumpy: su amanda -c /opt/amanda/sbin/amadmin archive-nihpd3-right1 \
    find yorick /data/nihpd/nihpd3/data/mri_processing/1.1
2007-10-22 yorick /data/nihpd/nihpd3/data/mri_processing/1.1 0 av24-1_archive-nihpd3-right1_T6L30 -- FAILED (dumper) [data read: recv error: Connection timed out]

However I was able to manually extract all the chunks on that tape and, after reassembling them, to untar everything without a glitch.

Looking at the logs I see that it was retried with success:

DISK planner yorick /data/nihpd/nihpd3/data/mri_processing/1.1
FAIL dumper yorick /data/nihpd/nihpd3/data/mri_processing/1.1 20071022 0 [data read: recv error: Connection timed out]
sendbackup: start [yorick:/data/nihpd/nihpd3/data/mri_processing/1.1 level 0]
PARTIAL chunker yorick /data/nihpd/nihpd3/data/mri_processing/1.1 20071022 0 [sec 9771.397 kb 105401080 kps 10786.7]
SUCCESS dumper yorick /data/nihpd/nihpd3/data/mri_processing/1.1 20071022 0 [sec 10432.472 kb 113375650 kps 10867.6 orig-kb 113375650]
SUCCESS chunker yorick /data/nihpd/nihpd3/data/mri_processing/1.1 20071022 0 [sec 10432.600 kb 113375650 kps 10867.4]
STATS driver estimate yorick /data/nihpd/nihpd3/data/mri_processing/1.1 20071022 0 [sec 8192 nkb 113375682 ckb 113375712 kps 13840]
CHUNK taper yorick /data/nihpd/nihpd3/data/mri_processing/1.1 20071022 1 0 [sec 197.311 kb 5242848 kps 26571.5 {wr: writers 163840 rdwait 79.637 wrwait 112.112 filemark 4.930}]
CHUNK taper yorick /data/nihpd/nihpd3/data/mri_processing/1.1 20071022 2 0 [sec 101.364 kb 5242848 kps 51722.7 {wr: writers 163840 rdwait 42.617 wrwait 57.238 filemark 1.013}]
...
CHUNK taper yorick /data/nihpd/nihpd3/data/mri_processing/1.1 20071022 22 0 [sec 60.077 kb 3275872 kps 54527.8 {wr: writers 102372 rdwait 3.036 wrwait 55.529 filemark 1.051}]
CHUNKSUCCESS taper yorick /data/nihpd/nihpd3/data/mri_processing/1.1 20071022 0 [sec 2313.467 kb 113376352 kps 49007.1 {wr: writers 102372 rdwait 3.036 wrwait 55.529 filemark 1.051}]

what gives?

jf
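The log excerpt shows a FAIL dumper line followed later by a SUCCESS dumper line for the same disk and datestamp, and the complaint is that amadmin reported only the first. Until a patched amadmin is installed, the last dumper result per disk can be read straight from the log with something like the sketch below. The function name is made up, and the field positions (status, program, host, disk, datestamp) are assumed from the excerpt rather than taken from any format specification.

```sh
#!/bin/sh
# Hypothetical helper: for every host/disk/datestamp in an Amanda log,
# print the LAST dumper result (FAIL or SUCCESS), so a retry that
# succeeded wins over the earlier failure. Reads files or stdin.
last_dumper_result() {
    awk '$2 == "dumper" && ($1 == "FAIL" || $1 == "SUCCESS") {
             key = $3 " " $4 " " $5        # host, disk, datestamp
             if (!(key in result)) order[++n] = key
             result[key] = $1              # later lines overwrite earlier ones
         }
         END { for (i = 1; i <= n; i++) print order[i], result[order[i]] }' "$@"
}
```

Run against the log lines quoted above, the disk in question comes out as SUCCESS, which matches what jf found by extracting the chunks by hand.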
Re: Multi-tape span failure
Tom,

What is runtapes set to?

--Ian

On Wednesday 31 October 2007 00:31:53 Tom Hansen wrote:

BACKGROUND INFO:

I have Amanda 2.5.2p1 running on Ubuntu linux 6.10, configured to backup several large (300Gb +) filesystems spanning several tapes. I have a robot changer, LTO1 tapes (100Gb capacity) and I used:

tape_splitsize 3Gb
fallback_splitsize 256m

(An unrelated issue: I couldn't seem to be able to get split_diskbuffer to have any effect so the chunks were all 256mb. No big deal, it was not a bottleneck.)

After much time configuring, everything seems to be working properly, and on my first big run, it successfully spanned six tapes and was nearly finished. Then it grabbed tape 7, which I had inadvertently left in write protect mode.

Unfortunately, at this point Amanda completely aborted the entire 800+ Gb backup and left nothing in the index, thus completely wasting 7+ hours of backup time.

This behavior is unexpected and bad. What if a tape simply goes bad during a run? If I'm running 7 or 8 tapes each backup, I don't want to lose the whole thing if there's an error on the last tape! I _thought_ that Amanda was programmed to simply go to the next tape when a tape error occurs. In this case, if Amanda _had_ gone to the next tape, it could have completed the job, since tape 8 was a good tape.

MY QUESTION:

Is there any way to configure Amanda such that such a tape error would simply go to the next tape, instead of the worst possible action, which is to abort the whole job? Short of that, is there any way Amanda could start up from where it left off?

Thanks.

--
Tom Hansen
Senior Information Processing Consultant
Great Lakes WATER Institute
tomh -at- uwm.edu
www.glwi.uwm.edu

--
Zmanda: Open Source Backup and Recovery. http://www.zmanda.com
Re: Multi-tape span failure
On Tue, Oct 30, 2007 at 11:31:53PM -0500, Tom Hansen wrote:

BACKGROUND INFO:

I have Amanda 2.5.2p1 running on Ubuntu linux 6.10, configured to backup several large (300Gb +) filesystems spanning several tapes. I have a robot changer, LTO1 tapes (100Gb capacity) and I used:

tape_splitsize 3Gb
fallback_splitsize 256m

(An unrelated issue: I couldn't seem to be able to get split_diskbuffer to have any effect so the chunks were all 256mb. No big deal, it was not a bottleneck.)

After much time configuring, everything seems to be working properly, and on my first big run, it successfully spanned six tapes and was nearly finished. Then it grabbed tape 7, which I had inadvertently left in write protect mode.

Unfortunately, at this point Amanda completely aborted the entire 800+ Gb backup and left nothing in the index, thus completely wasting 7+ hours of backup time.

This behavior is unexpected and bad. What if a tape simply goes bad during a run? If I'm running 7 or 8 tapes each backup, I don't want to lose the whole thing if there's an error on the last tape! I _thought_ that Amanda was programmed to simply go to the next tape when a tape error occurs. In this case, if Amanda _had_ gone to the next tape, it could have completed the job, since tape 8 was a good tape.

MY QUESTION:

Is there any way to configure Amanda such that such a tape error would simply go to the next tape, instead of the worst possible action, which is to abort the whole job? Short of that, is there any way Amanda could start up from where it left off?

Short answer - no. If the backups are in a holding disk they can still be flushed to tape, but resuming a backup, no.

Something in your report is amiss. If amanda had successfully used 6 tapes, it would have completed backing up and taping one or more of your 300GB DLEs. There is no reason a failed tape after that would invalidate those backups. And your report (emailed or available with amreport) would show that.

Also, IIRC, an LTO-1 tape at full speed takes about 1.5-2 hrs to tape completely. I would expect 6 successful tapes to take longer than 7 hours, more like 10-15, not counting the estimate phases.

Might the estimate phase have taken 7 hours and then amanda rejected all the tapes as inappropriate, never writing to them?

--
Jon H. LaBadie                  [EMAIL PROTECTED]
JG Computing
4455 Province Line Road         (609) 252-0159
Princeton, NJ 08540-4322        (609) 683-7220 (fax)
two tape drives
Quick question: will Amanda work correctly with two different configurations, each one using a tape drive, running at the same time?

Ex.
Data Transfer Element 0:Empty
Data Transfer Element 1:Full (Storage Element 9 Loaded):VolumeTag = EGV010

These are my 2 drives. However, when I try to run

/opt/amanda/server/sbin/amlabel -f FullDB EGV010 slot 9

here is the error I get:

amlabel: could not load slot 9: Drive not ready after 120 seconds, rewind said /dev/rmt/3cn:Drive not ready after no seconds, rewind said tapeDrive not ready after loaded seconds, rewind said orDrive not ready after drive seconds, rewind said offline

The tape EGV010 always goes to drive 0 instead of 1.

mtx -f /dev/scsi/changer/c5t3d1 status
Storage Changer /dev/scsi/changer/c5t3d1:2 Drives, 24 Slots ( 1 Import/Export )
Data Transfer Element 0:Full (Storage Element 9 Loaded):VolumeTag = EGV010

Here are the conf files for FullDB:

cat changer.conf
firstslot=1
lastslot=23
driveslot=1
cleanslot=-1
autoclean=0
autocleancount=99
havereader=1
offline_before_unload=0
initial_poll_delay=0

tapedev /dev/rmt/3cn # the no-rewind tape device to be used ***Currently using hardware compression
changerdev /dev/scsi/changer/c5t3d1
amrecover_changer /dev/scsi/changer/c5t3d1

Thanks
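The symptom here is a mismatch: changer.conf says driveslot=1, while mtx status shows the tape sitting in Data Transfer Element 0. A small filter over the mtx status output makes that comparison easy to repeat. What follows is a sketch only; the function name is invented, and it assumes the status lines look like the ones quoted above and that the volume tag is plain letters and digits.

```sh
#!/bin/sh
# Hypothetical helper: read `mtx ... status` output on stdin and report
# which drive (Data Transfer Element) currently holds the given volume
# tag, together with the slot it was loaded from.
drive_for_tag() {
    tag=$1
    sed -n "s/^ *Data Transfer Element \([0-9][0-9]*\):Full (Storage Element \([0-9][0-9]*\) Loaded):VolumeTag *= *$tag *\$/drive \1 (from slot \2)/p"
}
```

Used as `mtx -f /dev/scsi/changer/c5t3d1 status | drive_for_tag EGV010`, the status shown above would report drive 0 (from slot 9). If that number differs from driveslot in the changer.conf that amlabel is actually reading, the config is looking at the wrong drive.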
RE: two tape drives
Figured out what the error was, my changer file was pointing to the other configuration for drive 0. Can anyone still answer this question?

Quick question will Amanda work correctly with two different configurations, each one using a tape drive running at the same time?
Re: two tape drives
Andy,

I had two configs that ran concurrently on the same amanda server, each having access to its own tape drive; the drives were actually in Storedge L9/LTO jukeboxes, so each config owned a jukebox.

In order to avoid conflicts I was able to segregate my clients into two groups, one having only local UFS mounted partitions on the server, the other having partitions that were non-local to the amanda server, so I was able to avoid any conflicts on the known amanda TCP and UDP ports without having to compile and run different amanda binaries for each config. I also avoided any chance of putting a single amanda client in both configs, which would have caused conflict (since the client will only respond to a single server at any one time and there is no way to interlock/schedule a common client between amanda servers).

For added clarity, each jukebox had its own tape pool with config-specific tape labels. I guess you could run a single jukebox with two drives with unique tape labels, especially if you restrict the slots per config, but I don't think that would be necessary since each config would check labels before writing (but it would save time finding the next tape during the run).

I ran that config for years with no problems. Not sure if I'm addressing your exact question; if not, or if you need follow-up, please let me know.

Brian

On Wed, Oct 31, 2007 at 11:02:08AM -0500, Krahn, Anderson wrote:

Quick question: will Amanda work correctly with two different configurations, each one using a tape drive, running at the same time?

Ex.
Data Transfer Element 0:Empty
Data Transfer Element 1:Full (Storage Element 9 Loaded):VolumeTag = EGV010

These are my 2 drives.
---
Brian R Cuttler                 [EMAIL PROTECTED]
Computer Systems Support        (v) 518 486-1697
Wadsworth Center                (f) 518 473-6384
NYS Department of Health        Help Desk 518 473-0773
RE: two tape drives
That would almost be my case, except both of my config's would run off the same tape changer. I guess I will find out tomorrow if it works.
Re: Multi-tape span failure
Ian Turner wrote:

Tom, What is runtapes set to? --Ian

The runtapes parameter is set to 25.

-Tom

--
Tom Hansen
Senior Information Processing Consultant
UWM Great Lakes WATER Institute
www.glwi.uwm.edu
[EMAIL PROTECTED]
RE: two tape drives
On Wed, 31 Oct 2007 at 1:12pm, Krahn, Anderson wrote

That would almost be my case, except both of my config's would run off the same tape changer. I guess I will find out tomorrow if it works.

First off, please clean up your quoting -- it's nearly impossible in your replies to figure out who said what.

Secondly, yes, this can work. I've been doing this for years. I run 2 configs on the same server. Each config has its own set of slots in the loader and its own drive. I stagger the start times by 5 minutes to try to keep them from competing for the robotics.

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
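The "own set of slots" part is easy to get wrong when two changer.conf files are edited by hand, so a sanity check along these lines may be worth running before the first concurrent night. It is a sketch under assumptions: the function name is invented, and the numbers you feed it are the firstslot/lastslot values from each config's changer.conf.

```sh
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) when two inclusive slot
# ranges share at least one slot.
#   $1 $2 = firstslot and lastslot of config A
#   $3 $4 = firstslot and lastslot of config B
slot_ranges_overlap() {
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}
```

For example, splitting a 23-slot loader as 1-12 and 13-23 (an arbitrary split chosen for illustration), `slot_ranges_overlap 1 12 13 23` fails, meaning the ranges are disjoint, whereas leaving one config at 1-23 and giving the other 13-23 would be flagged. Each config also needs its own driveslot and tapedev, as described above.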
Re: two tape drives
OK, so you figured out your error, and this one configuration is working? Now the question remains, can two configurations run using the different drives?

According to this thread, the answer is yes: http://forums.zmanda.com/showthread.php?t=315

And the person who asked reported back that it worked.

---
Chris Hoogendyk
Systems Administrator
Biology Geology Departments
140 Morrill Science Center
University of Massachusetts, Amherst
[EMAIL PROTECTED]
---
Erdös 4

Krahn, Anderson wrote:

Figured out what the error was, my changer file was pointing to the other configuration for drive 0. Can anyone still answer this question?

Quick question will Amanda work correctly with two different configurations, each one using a tape drive running at the same time?
Re: Multi-tape span failure
Tom,

I've looked into this, and it is indeed a bug -- errors writing tape labels are not treated as robustly as errors at other times. I'll write up a patch for this, but it may not help you unless and until you upgrade, because taper has been completely rewritten since the last community release.

--Ian

On Wednesday 31 October 2007 14:31:44 Tom Hansen wrote:

Ian Turner wrote:

Tom, What is runtapes set to? --Ian

The runtapes parameter is set to 25.

-Tom

--
Zmanda: Open Source Backup and Recovery. http://www.zmanda.com
Re: Multi-tape span failure
Jon LaBadie wrote:

On Tue, Oct 30, 2007 at 11:31:53PM -0500, Tom Hansen wrote:

BACKGROUND INFO:

I have Amanda 2.5.2p1 running on Ubuntu linux 6.10, configured to backup several large (300Gb +) filesystems spanning several tapes. I have a robot changer, LTO1 tapes (100Gb capacity) and I used:

tape_splitsize 3Gb
fallback_splitsize 256m

[ stuff deleted ]

MY QUESTION:

Is there any way to configure Amanda such that such a tape error would simply go to the next tape, instead of the worst possible action, which is to abort the whole job? Short of that, is there any way Amanda could start up from where it left off?

Short answer - no. If the backups are in a holding disk they can still be flushed to tapes, but resume a backup no.

Something in your report is amiss. If amanda had successfully used 6 tapes, it would have completed backing up and taping one or more of your 300GB DLE's. There is no reason a failed tape after that would invalidate those backups. And your report (emailed or available with amreport) would show that.

Following is the report. It clearly says FAILED for all 4 filesystems under FAILURE AND STRANGE DUMP SUMMARY and sure enough, I could not see any files using amrecover. (I have done a test using one small filesystem, and amrecover did work in that case, so I'm pretty confident that my setup is good.)

I did just notice that, at the very bottom, it does not indicate failure for the two filesystems that were complete. I'm not sure what to make of that.

Thanks for your comments. (Oh and BTW, I was totally wrong about the dump time, it was more like 20 hours)

-Tom

Hostname: waterbase
Org     : GLWI
Config  : fullback
Date    : October 29, 2007

These dumps were to tapes GLWIBACK-001, GLWIBACK-002, GLWIBACK-003, GLWIBACK-004, GLWIBACK-005, GLWIBACK-006.
*** A TAPE ERROR OCCURRED: [No more writable valid tape found].
Some dumps may have been left in the holding disk.
Run amflush to flush them to tape.
The next 9 tapes Amanda expects to use are: 9 new tapes.

FAILURE AND STRANGE DUMP SUMMARY:
  waterbase.uwm.edu  /media/raid2  lev 0  FAILED [out of tape]
  waterbase.uwm.edu  /media/raid2  lev 0  FAILED [data write: Broken pipe]
  waterbase.uwm.edu  /             lev 0  FAILED [can't switch to incremental dump]
  waterbase.uwm.edu  /media/raid2  lev 0  FAILED [dump to tape failed]

STATISTICS:
                              Total       Full      Incr.
  Estimate Time (hrs:min)      1:00
  Run Time (hrs:min)          20:06
  Dump Time (hrs:min)         16:25      16:25       0:00
  Output Size (meg)        690435.5   690435.5        0.0
  Original Size (meg)      690351.3   690351.3        0.0
  Avg Compressed Size (%)        --         --         --
  Filesystems Dumped              2          2          0
  Avg Dump Rate (k/s)       11966.4    11966.4         --
  Tape Time (hrs:min)         16:14      16:14       0:00
  Tape Size (meg)          690435.5   690435.5        0.0
  Tape Used (%)               665.3      665.3        0.0
  Filesystems Taped               2          2          0
  Chunks Taped                 3121       3121          0
  Avg Tp Write Rate (k/s)   12093.4    12093.4         --

USAGE BY TAPE:
  Label          Time        Size      %    Nb    Nc
  GLWIBACK-001   3:01  130531776K  122.8     0   498
  GLWIBACK-002   3:10  135774016K  127.7     0   518
  GLWIBACK-003   3:01  123874432K  116.5     1   473
  GLWIBACK-004   3:05  143113152K  134.6     0   546
  GLWIBACK-005   2:56  124765312K  117.4     0   476
  GLWIBACK-006   3:38  159734400K  150.3     1   610

FAILED AND STRANGE DUMP DETAILS:

/-- waterbase.uwm.edu /media/raid2 lev 0 FAILED [data write: Broken pipe]
sendbackup: start [waterbase.uwm.edu:/media/raid2 level 0]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/bin/tar -xpGf - ...
sendbackup: info end
| gtar: ./mysql_trans/mysql.sock: socket ignored
\

NOTES:
  planner: Adding new disk waterbase.uwm.edu:/.
  planner: Adding new disk waterbase.uwm.edu:/media/raid0.
  planner: Adding new disk waterbase.uwm.edu:/media/raid1.
  planner: Adding new disk waterbase.uwm.edu:/media/raid2.
  taper: mmap failed (Cannot allocate memory): using fallback split size of 262144kb to buffer waterbase.uwm.edu:/media/raid1.0 in-memory
  taper: tape GLWIBACK-001 kb 130547712 fm 499 writing file: short write
  taper: continuing waterbase.uwm.edu:/media/raid1.0 on new tape from 130547712kb mark: [writing file: short write]
  taper: tape GLWIBACK-002 kb 135895488 fm 519 writing file: short write
  taper: continuing waterbase.uwm.edu:/media/raid1.0 on new tape from 266338304kb mark: [writing file: short write]
  taper: mmap failed (Cannot allocate memory): using fallback split size of 262144kb to buffer waterbase.uwm.edu:/media/raid0.0 in-memory
  taper: tape GLWIBACK-003 kb 124064672 fm 474 writing file: short write
  taper: continuing waterbase.uwm.edu:/media/raid0.0 on new tape from 21233664kb
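The per-tape times in this report can be cross-checked against the average write rate with a little integer arithmetic, which also bears on the earlier question of how long six tapes should take. The helper below is a sketch with an invented name; it takes a size in KB and a rate in KB/s as whole numbers, so the fractional part of the reported 12093.4 k/s is dropped.

```sh
#!/bin/sh
# Hypothetical helper: estimate how long it takes to write a given
# number of KB at a given KB/s rate, printed as H:MM (rounded down).
tape_time() {
    kb=$1; rate=$2                 # both must be whole numbers
    secs=$(( kb / rate ))
    printf '%d:%02d\n' $(( secs / 3600 )) $(( secs % 3600 / 60 ))
}

# GLWIBACK-001: 130531776K at the average tape write rate of ~12093 k/s
tape_time 130531776 12093    # prints 2:59
```

That is within a couple of minutes of the 3:01 shown for GLWIBACK-001 under USAGE BY TAPE, so roughly three hours per tape at this write rate is consistent with the 16:14 total tape time for six tapes.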
Re: Multi-tape span failure
On Wed, Oct 31, 2007 at 01:59:48PM -0500, Tom Hansen wrote: Jon LaBadie wrote: On Tue, Oct 30, 2007 at 11:31:53PM -0500, Tom Hansen wrote: BACKGROUND INFO: I have Amanda 2.5.2p1 running on Ubuntu linux 6.10, configured to backup several large (300Gb +) filesystems spanning several tapes. I have a robot changer, LTO1 tapes (100Gb capacity) and I used: tape_splitsize 3Gb fallback_splitsize 256m [ stuff deleted ] MY QUESTION: Is there any way to configure Amanda such that such a tape error would simply go to the next tape, instead of the worst possible action, which is to abort the whole job? Short of that, is there any way Amanda could start up from where it left off? Short answer - no. If the backups are in a holding disk they can still be flushed to tapes, but resume a backup no. Something in your report is amiss. If amanda had successfully used 6 tapes, it would have completed backing up and taping one or more of your 300GB DLE's. There is no reason a failed tape after that would invalidate those backups. And your report (emailed or available with amreport) would show that. Following is the report. It clearly says FAILED for all 4 filesystems under FAILURE AND STRANGE DUMP SUMMARY and sure enough, I could not see any files using amrecover. (I have done a test using one small filesystem, and amrecover did work in that case, so I'm pretty confident that my setup is good.) I did just notice that, at the very bottom, it does not indicate failure for the two filesystems that were complete. I'm not sure what to make of that. Thanks for your comments. (Oh and BTW, I was totally wrong about the dump time, it was more like 20 hours) -Tom Hostname: waterbase Org : GLWI Config : fullback Date: October 29, 2007 These dumps were to tapes GLWIBACK-001, GLWIBACK-002, GLWIBACK-003, GLWIBACK-004, GLWIBACK-005, GLWIBACK-006. *** A TAPE ERROR OCCURRED: [No more writable valid tape found]. Some dumps may have been left in the holding disk. Run amflush to flush them to tape. 
The next 9 tapes Amanda expects to use are: 9 new tapes. FAILURE AND STRANGE DUMP SUMMARY: waterbase.uwm.edu /media/raid2 lev 0 FAILED [out of tape] waterbase.uwm.edu /media/raid2 lev 0 FAILED [data write: Broken pipe] waterbase.uwm.edu / lev 0 FAILED [can't switch to incremental dump] waterbase.uwm.edu /media/raid2 lev 0 FAILED [dump to tape failed] STATISTICS: Total Full Incr. Estimate Time (hrs:min)1:00 Run Time (hrs:min)20:06 Dump Time (hrs:min) 16:25 16:25 0:00 Output Size (meg) 690435.5 690435.50.0 Original Size (meg)690351.3 690351.30.0 Avg Compressed Size (%) -- -- -- Filesystems Dumped2 2 0 Avg Dump Rate (k/s) 11966.411966.4-- Tape Time (hrs:min) 16:14 16:14 0:00 Tape Size (meg)690435.5 690435.50.0 Tape Used (%) 665.3 665.30.0 Filesystems Taped 2 2 0 Chunks Taped 3121 3121 0 Avg Tp Write Rate (k/s) 12093.412093.4-- USAGE BY TAPE: Label Time Size %NbNc GLWIBACK-001 3:01 130531776K 122.8 0 498 GLWIBACK-002 3:10 135774016K 127.7 0 518 GLWIBACK-003 3:01 123874432K 116.5 1 473 GLWIBACK-004 3:05 143113152K 134.6 0 546 GLWIBACK-005 2:56 124765312K 117.4 0 476 GLWIBACK-006 3:38 159734400K 150.3 1 610 FAILED AND STRANGE DUMP DETAILS: /-- waterbase.uwm.edu /media/raid2 lev 0 FAILED [data write: Broken pipe] sendbackup: start [waterbase.uwm.edu:/media/raid2 level 0] sendbackup: info BACKUP=/bin/tar sendbackup: info RECOVER_CMD=/bin/tar -xpGf - ... sendbackup: info end | gtar: ./mysql_trans/mysql.sock: socket ignored \ NOTES: planner: Adding new disk waterbase.uwm.edu:/. planner: Adding new disk waterbase.uwm.edu:/media/raid0. planner: Adding new disk waterbase.uwm.edu:/media/raid1. planner: Adding new disk waterbase.uwm.edu:/media/raid2. 
  taper: mmap failed (Cannot allocate memory): using fallback split size of 262144kb to buffer waterbase.uwm.edu:/media/raid1.0 in-memory
  taper: tape GLWIBACK-001 kb 130547712 fm 499 writing file: short write
  taper: continuing waterbase.uwm.edu:/media/raid1.0 on new tape from 130547712kb mark: [writing file: short write]
  taper: tape GLWIBACK-002 kb 135895488 fm 519 writing file: short write
  taper: continuing waterbase.uwm.edu:/media/raid1.0 on new tape from 266338304kb mark: [writing file: short write]
  taper: mmap failed (Cannot allocate memory): using fallback split size of 262144kb to buffer
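
The split sizes in this thread can be cross-checked with a little arithmetic. The sketch below assumes that the "m" and "Gb" suffixes are binary units (MiB, GiB) and that taper reports sizes in KiB; the variable and function names are made up for illustration. Under those assumptions, the 256m fallback size is the 262144kb that taper reports after the mmap failure, and the 498 chunks listed for GLWIBACK-001 account for the 130547712 kb in the taper note.

```python
# Values copied from the configuration, report, and taper notes in this thread.
# Assumption: "256m" means 256 MiB, "3Gb" means 3 GiB, and "kb" means KiB.
TAPE_SPLITSIZE_KB = 3 * 1024 * 1024      # tape_splitsize 3Gb
FALLBACK_SPLITSIZE_KB = 256 * 1024       # fallback_splitsize 256m


def kb_written(chunks: int, chunk_kb: int = FALLBACK_SPLITSIZE_KB) -> int:
    """Total kb written by a number of full-size chunks."""
    return chunks * chunk_kb


# Size of one chunk once taper falls back to the in-memory buffer.
print(FALLBACK_SPLITSIZE_KB)

# GLWIBACK-001 shows Nc = 498 in USAGE BY TAPE.
print(kb_written(498))

# How many fallback-size chunks fit in one chunk of the requested split size.
print(TAPE_SPLITSIZE_KB // FALLBACK_SPLITSIZE_KB)
```

If the assumptions hold, the run was split into chunks twelve times smaller than requested, which would be in line with the 3121 chunks shown under Chunks Taped.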
Restore from file, not tape
I was really hoping I could get the tape changer configured before this happened, but it bit me on the rear. Amanda 2.4.5 server. I've got a level 0 sitting on my disk. It's too large to fit on tape. I need to extract a directory from it. I know the entire path. It doesn't show up in my list on amrestore because it never got dumped to tape from the holding disk. This has probably been covered before and I just didn't search right - but does someone know the commands to pull the directory from the file? Thanks, LP
Re: Restore from file, not tape
On Wed, Oct 31, 2007 at 04:24:25PM -0500, Linda Pahdoco wrote:

I was really hoping I could get the tape changer configured before this happened, but it bit me on the rear. Amanda 2.4.5 server. I've got a level 0 sitting on my disk. It's too large to fit on tape. I need to extract a directory from it. I know the entire path. It doesn't show up in my list on amrestore because it never got dumped to tape from the holding disk. This has probably been covered before and I just didn't search right - but does someone know the commands to pull the directory from the file?

Amrecover, not amrestore, would be the command to extract a directory tree from that holding disk (or tape). If that is what you used, and you were unable to see the index of files in the dump, was the index parameter set to yes when the dump was made? No index is created otherwise. If it was set, show the command and interactive amrecover session you used (see the script command as a possible recorder).

--
Jon H. LaBadie  [EMAIL PROTECTED]
JG Computing
4455 Province Line Road  (609) 252-0159
Princeton, NJ 08540-4322  (609) 683-7220 (fax)
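
For the case where no index exists, a dump image sitting in the holding disk can also be read by hand. The following is only a sketch under stated assumptions: that the holding-disk file starts with a 32 KiB text header followed by the raw dump image, and that the image is in a single file rather than split into holding-disk chunks. The function names are hypothetical. As far as I know, amrestore also accepts a holding-disk file in place of a tape device, which is probably the simpler route when it is available.

```python
import shutil

# Assumption: Amanda writes a 32 KiB header block in front of the dump image.
HEADER_BYTES = 32 * 1024


def read_header(path):
    """Return the text header stored at the start of a holding-disk file."""
    with open(path, "rb") as f:
        raw = f.read(HEADER_BYTES)
    # The header is text followed by padding; keep only the part before
    # the first NUL byte.
    return raw.split(b"\0", 1)[0].decode("ascii", errors="replace")


def copy_payload(path, out):
    """Copy everything after the header to the binary stream `out`."""
    with open(path, "rb") as f:
        f.seek(HEADER_BYTES)
        shutil.copyfileobj(f, out)
```

Printing read_header() for the file should show which program wrote the image, and whether it was compressed, before anything is extracted. The payload can then be fed to that program, for example by calling copy_payload() with sys.stdout.buffer and piping the result to the restore command named in the header, followed by the directory path to extract.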
Re: amrecover failed - time out contacting to server itself
Here is the log in /var/log/messages when triggering amrecover:

Nov 1 10:45:16 server xinetd[4021]: START: amanda pid=2345 from=202.53.250.162
Nov 1 10:45:46 server xinetd[4021]: EXIT: amanda status=0 pid=2345 duration=30(sec)

This is the amandad log:

amandad: debug 1 pid 2345 ruid 501 euid 501: start at Thu Nov 1 10:45:16 2007
security_getdriver(name=bsd) returns 0xd3f0e0
amandad: version 2.5.1p3
amandad: build: VERSION=Amanda-2.5.1p3
amandad: BUILT_DATE=Mon Oct 22 06:37:53 WIT 2007
amandad: BUILT_MACH=Linux server.com 2.6.18-8.el5 #1 SMP Fri Jan 26 14:15:21 EST 2007 i686 i686 i386 GNU/Linux
amandad: CC=gcc
amandad: CONFIGURE_COMMAND='./configure' '--with-configdir=/usr/local/etc/amanda' '--with-user=amanda' '--with-group=disk' '--with-config=DailySet1' '--with-gnutar=/bin/tar' '--with-tcpportrange=5,50100' '--with-udpportrange=700,710'
amandad: paths: bindir=/usr/local/bin sbindir=/usr/local/sbin
amandad: libexecdir=/usr/local/libexec mandir=/usr/local/man
amandad: AMANDA_TMPDIR=/tmp/amanda AMANDA_DBGDIR=/tmp/amanda
amandad: CONFIG_DIR=/usr/local/etc/amanda DEV_PREFIX=/dev/
amandad: RDEV_PREFIX=/dev/ DUMP=/sbin/dump
amandad: RESTORE=/sbin/restore VDUMP=UNDEF VRESTORE=UNDEF
amandad: XFSDUMP=UNDEF XFSRESTORE=UNDEF VXDUMP=UNDEF VXRESTORE=UNDEF
amandad: SAMBA_CLIENT=/usr/bin/smbclient GNUTAR=/bin/tar
amandad: COMPRESS_PATH=/bin/gzip UNCOMPRESS_PATH=/bin/gzip
amandad: LPRCMD=/usr/bin/lpr MAILER=/usr/bin/Mail
amandad: listed_incr_dir=/usr/local/var/amanda/gnutar-lists
amandad: defs: DEFAULT_SERVER=server.com
amandad: DEFAULT_CONFIG=DailySet1
amandad: DEFAULT_TAPE_SERVER=server.com HAVE_MMAP
amandad: HAVE_SYSVSHM LOCKING=POSIX_FCNTL SETPGRP_VOID DEBUG_CODE
amandad: AMANDA_DEBUG_DAYS=4 BSD_SECURITY RSH_SECURITY USE_AMANDAHOSTS
amandad: CLIENT_LOGIN=amanda FORCE_USERID HAVE_GZIP
amandad: COMPRESS_SUFFIX=.gz COMPRESS_FAST_OPT=--fast
amandad: COMPRESS_BEST_OPT=--best UNCOMPRESS_OPT=-dc
amandad: time 0.000: dgram_recv(dgram=0xd40004, timeout=0, fromaddr=0xd4fff0)
amandad: time 0.000: (sockaddr_in *)0xd4fff0 = { 2, 708, 202.53.250.162 }
amandad: time 9.916: dgram_recv(dgram=0xd40004, timeout=0, fromaddr=0xd4fff0)
amandad: time 9.916: (sockaddr_in *)0xd4fff0 = { 2, 708, 202.53.250.162 }
amandad: time 19.913: dgram_recv(dgram=0xd40004, timeout=0, fromaddr=0xd4fff0)
amandad: time 19.913: (sockaddr_in *)0xd4fff0 = { 2, 708, 202.53.250.162 }
amandad: time 29.912: pid 2345 finish time Thu Nov 1 10:45:46 2007

Jean-Louis Martineau-2 wrote:

Are you sure the server receive the request? Did amandad get execute on the server? Post its debug log. Check your firewall/network log.
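
One way to read the amandad debug log above is to count the datagrams it received against the ones it sent. The sketch below does that for the last few log lines. It assumes that amandad would log a reply with a dgram_send_addr( line, as amrecover does in its own debug file; that is an assumption, and the summarize helper is hypothetical rather than part of Amanda.

```python
import re

# Matches the timestamp on lines such as
# "amandad: time 9.916: dgram_recv(dgram=..., timeout=0, fromaddr=...)"
RECV = re.compile(r"time (\d+\.\d+): dgram_recv\(")


def summarize(log_text):
    """Count received and sent datagrams and the gaps between receives."""
    recv_times = [float(t) for t in RECV.findall(log_text)]
    # Assumption: a reply would show up as a dgram_send_addr( line.
    sent = log_text.count("dgram_send_addr(")
    gaps = [round(b - a, 3) for a, b in zip(recv_times, recv_times[1:])]
    return {"received": len(recv_times), "sent": sent, "gaps": gaps}


# Lines copied from the amandad debug log in this message.
sample = """\
amandad: time 0.000: dgram_recv(dgram=0xd40004, timeout=0, fromaddr=0xd4fff0)
amandad: time 9.916: dgram_recv(dgram=0xd40004, timeout=0, fromaddr=0xd4fff0)
amandad: time 19.913: dgram_recv(dgram=0xd40004, timeout=0, fromaddr=0xd4fff0)
amandad: time 29.912: pid 2345 finish time Thu Nov 1 10:45:46 2007
"""

print(summarize(sample))
```

Three receives roughly ten seconds apart and no sends would suggest that the requests from port 708 did reach amandad, and that it exited after about 30 seconds without acknowledging them. That points away from a firewall dropping the inbound request, though it does not by itself say why no ACK went out.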