Re: dumps way too big
On Sat, 24 Mar 2007, [EMAIL PROTECTED] wrote:
> On Thursday 22 March 2007 at 20:13 -0400, Gene Heskett wrote:
> > On Thursday 22 March 2007, [EMAIL PROTECTED] wrote:
> > > Hello,
> > >
> > > One backup partly failed with:
> > >
> > > FAILURE AND STRANGE DUMP SUMMARY:
> > >   k400 /mnt/d_mails lev 1 FAILED [dumps way too big, 1025270 KB, must skip incremental dumps]
> > >   k400 /home/jpp lev 1 FAILED [dumps way too big, 1116100 KB, must skip incremental dumps]
> > >   k400 /etc lev 0 STRANGE
> > >
> > > For some other directories and machines the backup is OK. What is the problem?
> > >
> > > Regards
> > > Storm66
> >
> > Your kernel version please?
>
> Kernel 2.6.16 on the master machine, 2.6.18 and 2.6.20 on other machines.
>
> Frank Smith asks for the size of tape I am using: it is a virtual tape on a separate disk with more than 100G available.

You write `partly' failed? Doesn't it just mean that some DLEs were dumped (or estimated to dump) to tape, but Amanda noticed the remaining DLEs couldn't fit anymore?

I see it from time to time, too. No harm, it just gets solved the next night :-)

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say programmer or something like that.
  -- Linus Torvalds
Backup to multiple DVDs (like scdbackup)
Hello List, does someone here back up to multiple DVDs? I am looking for some amanda/scdbackup solution. Thanks, Mario
Re: forcing a failed backup to disk to use the same slot
On 2007-03-25 01:58, Gene Heskett wrote:
> On Saturday 24 March 2007, Frank Smith wrote:
> > René Kanters wrote:
> > > Hi,
> > >
> > > I have had some problems where an overnight backup 'hung up', i.e., a level 0 dump started but never properly finished, so backups from other machines did not work either. So far the problem has been with the Western Digital disk on my Mac amanda server, where a restart of the external disk solves the problem.
> > >
> > > I catch these problems the same day they happen, so I am wondering whether it is possible to run amdump with a configuration of dumping to a disk (using tpchanger chg-disk) and not have amanda use the next slot, but the slot that the failed dump started on.
> > >
> > > I could not find any information on that in the man pages. Any ideas?
> >
> > If you're sure that nothing you need was written to the 'tape', you can use amrmtape and then relabel it with the same label. That should make it the first tape to use on the next run.
> >
> > I've often wondered why Amanda marks a tape as 'used' on a failed backup even when nothing was written to tape.
> >
> > Frank
>
> I think, and someone is welcome to correct me, that the re-write of the label on the tape with the current date, and all the housekeeping that gets done when that is done, really should be delayed until such time as there really is something in the holding disk that's complete and ready to move to tape. As it is, I believe this bit of housekeeping is done up front before the error is known.
>
> This would be one improvement that could be put into amanda, and would be most welcome. Hint hint...

On the other hand, delaying the check that the correct tape is loaded and writable misses the opportunity to fall back to plan B, the degraded-mode dumps (incrementals to the holding disk instead), if something is wrong with the tape.

But we could get around that by writing a dummy label, just like for a new tape, during that write test (just like amcheck -w does). And when the first real data is to be written, rewrite that label with the correct one for the date. This would keep the tape available for the next run if anything goes wrong with getting the dumps.

--
Paul Bijnens, xplanation Technology Services        Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email: [EMAIL PROTECTED]
***********************************************************************
* I think I've got the hang of it now: exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt, abort, hangup,   *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e, kill -1 $$, shutdown,   *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ... Are you sure? ... YES ... Phew ... I'm out                      *
***********************************************************************
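Frank's amrmtape-then-relabel suggestion can be sketched as a small dry-run script. It only prints the two commands rather than running them. The configuration name `daily`, the label `DAILY-07`, and slot `7` are hypothetical placeholders, and the use of `-f` with amlabel is an assumption (the volume still carries its old label); check the amrmtape and amlabel man pages for your Amanda version before running anything.

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would free a vtape and relabel it.
# The config name, label and slot passed in are hypothetical example values.
relabel_plan() {
    config=$1; label=$2; slot=$3
    # Drop the label from Amanda's tapelist and database.
    echo "amrmtape $config $label"
    # Write the same label again; -f is assumed to be needed because the
    # volume still carries its old Amanda label.
    echo "amlabel -f $config $label slot $slot"
}

relabel_plan daily DAILY-07 7
```

Printing the commands first makes it easy to confirm the label and slot before touching the tapelist.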
Re: forcing a failed backup to disk to use the same slot
On Monday 26 March 2007, Paul Bijnens wrote:
> On 2007-03-25 01:58, Gene Heskett wrote:
> > On Saturday 24 March 2007, Frank Smith wrote:
> > > René Kanters wrote:
> > > > ...
> > >
> > > If you're sure that nothing you need was written to the 'tape', you can use amrmtape and then relabel it with the same label. That should make it the first tape to use on the next run.
> > >
> > > I've often wondered why Amanda marks a tape as 'used' on a failed backup even when nothing was written to tape.
> >
> > I think, and someone is welcome to correct me, that the re-write of the label on the tape with the current date, and all the housekeeping that gets done when that is done, really should be delayed until such time as there really is something in the holding disk that's complete and ready to move to tape. As it is, I believe this bit of housekeeping is done up front before the error is known.
> >
> > This would be one improvement that could be put into amanda, and would be most welcome. Hint hint...
>
> On the other hand, delaying the check that the correct tape is loaded and writable misses the opportunity to fall back to plan B, the degraded-mode dumps (incrementals to the holding disk instead), if something is wrong with the tape.
>
> But we could get around that by writing a dummy label, just like for a new tape, during that write test (just like amcheck -w does). And when the first real data is to be written, rewrite that label with the correct one for the date. This would keep the tape available for the next run if anything goes wrong with getting the dumps.

I think the how is less important to most tape users. My only problem with that is the wear and tear on the tape and drive, which is not a consideration for vtapes. But what's wrong with going to plan B when it's found that the first file isn't writable? Most holding disks should have enough area to do that, I'd think.

In the event of a failure, could it not be sufficient to just move that tape back to the top of the tapelist and mark it somehow as new?

--
Cheers, Gene

There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order.
  -Ed Howdershelt (Author)

When the blind lead the blind they will both fall over the cliff.
  -- Chinese proverb
Re: amrecover fails with unknown error
Karsten,

Did you get an amidxtaped debug file on the server? What does it contain? Is amidxtaped configured correctly in xinetd? Is there any xinetd log?

Jean-Louis

Karsten Fuhrmann wrote:
> Hello List,
>
> I have a serious situation here: I cannot recover anything from my backups anymore. I am using version 2.5.0p2 on FreeBSD 6.1-STABLE:
>
>   FreeBSD 6.1-STABLE #0: SMP i386
>
> amrecover always exits with 'Could not read from control socket: Unknown error: 0'. Does anybody have an idea what is going on here?
>
> This is a log file from an amrecover run:
>
>   [EMAIL PROTECTED] /tmp/amanda]# more amrecover.20070325205302.debug
>   amrecover: debug 1 pid 11917 ruid 0 euid 0: start at Sun Mar 25 20:53:02 2007
>   amrecover: bind_portrange2: trying port=923
>   amrecover: stream_client_privileged: connected to 192.168.1.1.10082
>   amrecover: stream_client_privileged: our side is 0.0.0.0.923
>   ...
>   amrecover: bind_portrange2: trying port=991
>   amrecover: stream_client_privileged: connected to 192.168.1.1.10083
>   amrecover: stream_client_privileged: our side is 0.0.0.0.991
>   amrecover: try_socksize: receive buffer size is 65536
>   amrecover: Could not read from control socket: Unknown error: 0
>   amrecover: pid 11917 finish time Sun Mar 25 20:54:09 2007
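To answer Jean-Louis's first question, a helper like the following could locate the newest amidxtaped debug file on the server. This is a minimal sketch: the `/tmp/amanda` directory and the `amidxtaped.*.debug` naming are assumptions modeled on the `amrecover.20070325205302.debug` file in the post, and the function name is made up for illustration. The debug directory on a given server depends on how Amanda was built.

```shell
#!/bin/sh
# Print the most recently modified amidxtaped debug file in a directory.
# The default directory and the amidxtaped.*.debug pattern are assumptions
# modeled on the amrecover.<timestamp>.debug file shown in the post.
newest_amidxtaped_debug() {
    dir=${1:-/tmp/amanda}
    # ls -t sorts newest first; head keeps only the first match.
    ls -t "$dir"/amidxtaped.*.debug 2>/dev/null | head -n 1
}

newest_amidxtaped_debug /tmp/amanda
```

If the function prints nothing, no such file exists in that directory, which would itself suggest amidxtaped was never started by (x)inetd.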
Re: A non-amanda issue: the abysmal misery of LUNs
Executive summary: mt status works if no tape is loaded!

Details: This is what I have done so far (I will try the suggestion to echo the new device into /proc/scsi).

I added the last line to /etc/modprobe.conf as follows:

  [EMAIL PROTECTED] ~]# tail /etc/modprobe.conf
  alias eth0 3c59x
  alias scsi_hostadapter aic7xxx
  alias usb-controller uhci-hcd
  alias scsi_hostadapter1 aic79xx
  options scsi_mod max_luns=255
  [EMAIL PROTECTED] ~]#

(In my version of CentOS, which is close to RHEL 4, what was previously

  options scsi_mod max_scsi_luns=255

became

  options scsi_mod max_luns=255)

Then I created a new initial ram disk with

  mkinitrd /boot/initrd-2.6.9-42.0.10.ELsmpLUN.img 2.6.9-42.0.10.ELsmp

and modified /boot/grub/menu.lst accordingly. But nothing sort of worked until I located the SCSI terminator.

Now something odd occurs: with, say, storage element 4 loaded into the drive,

  [EMAIL PROTECTED] ~]# mtx -f /dev/sg4 status
    Storage Changer /dev/sg4:1 Drives, 11 Slots ( 0 Import/Export )
  Data Transfer Element 0:Full (Storage Element 4 Loaded):VolumeTag = A4
        Storage Element 1:Full :VolumeTag=A1
        Storage Element 2:Full :VolumeTag=A2
        Storage Element 3:Full :VolumeTag=A3
        Storage Element 4:Empty:VolumeTag=
        Storage Element 5:Full :VolumeTag=A5
        Storage Element 6:Full :VolumeTag=A6
        Storage Element 7:Full :VolumeTag=A7
        Storage Element 8:Full :VolumeTag=A8
        Storage Element 9:Full :VolumeTag=A9
        Storage Element 10:Full :VolumeTag=A00010
        Storage Element 11:Full :VolumeTag=A00011
  [EMAIL PROTECTED] ~]#

I cannot see the status:

  [EMAIL PROTECTED] ~]# mt -f /dev/nst2 status
  /dev/nst2: Input/output error
  [EMAIL PROTECTED] ~]#

But if I unload the tape drive I CAN see the status:

  [EMAIL PROTECTED] ~]# mtx -f /dev/sg4 unload 4
  Unloading Data Transfer Element into Storage Element 4...done
  [EMAIL PROTECTED] ~]#
  [EMAIL PROTECTED] ~]# mtx -f /dev/sg4 status
    Storage Changer /dev/sg4:1 Drives, 11 Slots ( 0 Import/Export )
  Data Transfer Element 0:Empty
        Storage Element 1:Full :VolumeTag=A1
        Storage Element 2:Full :VolumeTag=A2
        Storage Element 3:Full :VolumeTag=A3
        Storage Element 4:Full :VolumeTag=A4
        Storage Element 5:Full :VolumeTag=A5
        Storage Element 6:Full :VolumeTag=A6
        Storage Element 7:Full :VolumeTag=A7
        Storage Element 8:Full :VolumeTag=A8
        Storage Element 9:Full :VolumeTag=A9
        Storage Element 10:Full :VolumeTag=A00010
        Storage Element 11:Full :VolumeTag=A00011
  [EMAIL PROTECTED] ~]#
  [EMAIL PROTECTED] ~]# mt -f /dev/nst2 status
  SCSI 2 tape drive:
  File number=-1, block number=-1, partition=0.
  Tape block size 0 bytes. Density code 0x0 (default).
  Soft error count since last status=0
  General status bits on (5):
   DR_OPEN IM_REP_EN
  [EMAIL PROTECTED] ~]#
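When repeating this loaded/unloaded comparison, it can help to pull the drive state out of the `mtx status` text automatically. The following is a minimal sketch that reads that text on stdin and reports which storage element sits in drive 0; the function name `loaded_slot` is made up for illustration, and the patterns are based only on the output format shown in the post.

```shell
#!/bin/sh
# Report which storage element (slot) is loaded in drive 0, reading
# `mtx status` text from stdin. Prints "empty" if the drive has no tape.
loaded_slot() {
    sed -n \
        -e 's/^ *Data Transfer Element 0:Full (Storage Element \([0-9][0-9]*\) Loaded).*/\1/p' \
        -e 's/^ *Data Transfer Element 0:Empty.*/empty/p'
}

# On real hardware (device name taken from the post):
#   mtx -f /dev/sg4 status | loaded_slot
# Here the first status line from the post is fed in directly:
printf '%s\n' 'Data Transfer Element 0:Full (Storage Element 4 Loaded):VolumeTag = A4' | loaded_slot
```

A wrapper script could then run `mt -f /dev/nst2 status` only after the changer reports a loaded slot, which separates changer problems from drive problems.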
Re: A non-amanda issue: the abysmal misery of LUNs
On Monday 26 March 2007, FL wrote:
> Executive summary: mt status works if no tape is loaded!
>
> Details: This is what I have done so far (I will try the suggestion to echo the new device into /proc/scsi).
>
> I added the last line to /etc/modprobe.conf as follows:
>
>   options scsi_mod max_luns=255
>
> (In my version of CentOS, which is close to RHEL 4, what was previously options scsi_mod max_scsi_luns=255 became options scsi_mod max_luns=255)
>
> Then I created a new initial ram disk with
>
>   mkinitrd /boot/initrd-2.6.9-42.0.10.ELsmpLUN.img 2.6.9-42.0.10.ELsmp
>
> and modified /boot/grub/menu.lst accordingly. But nothing sort of worked until I located the SCSI terminator.

This is very important. Missing or improper terminators have caused the wasted sacrifice of lots of virgins over the last 30 or so years we've had a SCSI spec. Few people understand that this is in fact a transmission line, and as such it _must_ be terminated at both _ends_ of the physical cable. Using the next connector on the cable, and leaving another foot or so curled up and unused at the end, will cause data-trashing echoes and cost you any religion you may have thought you had.

> Now something odd occurs: with, say, storage element 4 loaded into the drive,
>
>   [EMAIL PROTECTED] ~]# mtx -f /dev/sg4 status
>   ...
>
> I cannot see the status:
>
>   [EMAIL PROTECTED] ~]# mt -f /dev/nst2 status
>   /dev/nst2: Input/output error
>
> But if I unload the tape drive I CAN see the status:
>
>   [EMAIL PROTECTED] ~]# mtx -f /dev/sg4 unload 4
>   Unloading Data Transfer Element into Storage Element 4...done
>   ...
>   [EMAIL PROTECTED] ~]# mt -f /dev/nst2 status
>   SCSI 2 tape drive:
>   File number=-1, block number=-1, partition=0.
>   Tape block size 0 bytes. Density code 0x0 (default).
>   Soft error count since last status=0
>   General status bits on (5):
>    DR_OPEN IM_REP_EN

I'm wondering if RHEL has continued the practice of only scanning the SCSI bus for LUN=0, in which case many libraries and changers will be missed. This requires a rebuild of the kernel, with 'scan all luns' set to 'y' in the SCSI menu.

I would, however, scan the /var/log/dmesg file to see if /dev/nst2 is the correct address for the tape drive, which may not be the same as the changer robot you are running with the mtx command. Your use of /dev/sg4 to address the robot, but /dev/nst2 for the drive, looks like something I'd want to check.

--
Cheers, Gene

Everything that you know is wrong, but you can be straightened out.
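The "echo the new device into /proc/scsi" suggestion mentioned in this thread presumably refers to the `add-single-device` command that 2.4 and 2.6 kernels with CONFIG_SCSI_PROC_FS accept on /proc/scsi/scsi; that reading is an assumption, since the thread never spells it out. The sketch below only builds and prints the command for a given host, channel, id, and LUN. The function name `scan_cmd` and the address used are illustrative, not taken from a working setup.

```shell
#!/bin/sh
# Build the /proc/scsi/scsi command that asks a 2.6-era kernel to probe one
# host:channel:id:lun address. This only prints the command (dry run);
# running the printed line requires root.
scan_cmd() {
    host=$1; channel=$2; id=$3; lun=$4
    echo "echo \"scsi add-single-device $host $channel $id $lun\" > /proc/scsi/scsi"
}

# Illustrative address: host 1, channel 0, id 6, lun 1.
scan_cmd 1 0 6 1
```

If the device then appears in `cat /proc/scsi/scsi`, the kernel can see that LUN without a rebuild, and only the automatic scan at boot is missing it.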
Re: A non-amanda issue: the abysmal misery of LUNs
On 3/26/07, Gene Heskett [EMAIL PROTECTED] wrote:
> On Monday 26 March 2007, FL wrote:
> > Executive summary: mt status works if no tape is loaded!
> > ...
> > Then I created a new initial ram disk with
> >
> >   mkinitrd /boot/initrd-2.6.9-42.0.10.ELsmpLUN.img 2.6.9-42.0.10.ELsmp
> >
> > and modified /boot/grub/menu.lst accordingly. But nothing sort of worked until I located the SCSI terminator.
>
> This is very important. Missing or improper terminators have caused the wasted sacrifice of lots of virgins over the last 30 or so years we've had a SCSI spec. Few people understand that this is in fact a transmission line, and as such it _must_ be terminated at both _ends_ of the physical cable.

Yes, I was actually well aware of this (it was kind of a joke -- I even have a ham license). It is crucially important to terminate the SCSI bus, but sometimes a person's SCSI terminator falls off, and one proceeds anyway. I probably should have omitted these remarks, but now that I've broadcast it to the entire world, it will be held against me.

> Using the next connector on the cable, and leaving another foot or so curled up and unused at the end, will cause data-trashing echoes and cost you any religion you may have thought you had.

In that case, I owe the world a religion.

> I'm wondering if RHEL has continued the practice of only scanning the SCSI bus for LUN=0, in which case many libraries and changers will be missed. This requires a rebuild of the kernel, with 'scan all luns' set to 'y' in the SCSI menu.

AHA (behold the last line):

  [EMAIL PROTECTED] 2.6.9-42.0.10.EL-smp-i686]# grep SCSI .config
  CONFIG_CISS_SCSI_TAPE=y
  CONFIG_BLK_DEV_IDESCSI=m
  # SCSI device support
  CONFIG_SCSI=m
  CONFIG_SCSI_PROC_FS=y
  # SCSI support type (disk, tape, CD-ROM)
  CONFIG_SCSI_DUMP=m
  # Some SCSI devices (e.g. CD jukebox) support multiple LUNs
  # CONFIG_SCSI_MULTI_LUN is not set

Oh well. I figured it would come to this.

> I would, however, scan the /var/log/dmesg file to see if /dev/nst2 is the correct address for the tape drive, which may not be the same as the changer robot you are running with the mtx command. Your use of /dev/sg4 to address the robot, but /dev/nst2 for the drive, looks like something I'd want to check.

Here is an additional piece of information (what I call a factlet: a crucial fact that you need to know that no one will tell you unless you ask): I have another SCSI card to which my Spectra Logic 2K is attached. Drives 0 and 1 of the 2K are nst0 and nst1, respectively. The Exabyte drive is nst2.

  [EMAIL PROTECTED] 2.6.9-42.0.10.EL-smp-i686]# cat /proc/scsi/scsi
  Attached devices:
  Host: scsi0 Channel: 00 Id: 00 Lun: 00
    Vendor: SONY     Model: SDX-300C         Rev: 04c7
    Type:   Sequential-Access                ANSI SCSI revision: 02
  Host: scsi0 Channel: 00 Id: 01 Lun: 00
    Vendor: SONY     Model: SDX-300C         Rev: 04c7
    Type:   Sequential-Access                ANSI SCSI revision: 02
  Host: scsi0 Channel: 00 Id: 02 Lun: 00
    Vendor: SPECTRA  Model: 215              Rev: 2201
    Type:   Medium Changer                   ANSI SCSI revision: 02
  Host: scsi1 Channel: 00 Id: 06 Lun: 00
    Vendor: HP       Model: Ultrium 2-SCSI   Rev: S33U
    Type:   Sequential-Access                ANSI SCSI revision: 03
  Host: scsi1 Channel: 00 Id: 06 Lun: 01
    Vendor: EXABYTE  Model: MAGNUM 224       Rev: C118
    Type:   Medium Changer                   ANSI SCSI revision: 04
  [EMAIL PROTECTED] 2.6.9-42.0.10.EL-smp-i686]#

And,

  [EMAIL PROTECTED] 2.6.9-42.0.10.EL-smp-i686]# dmesg | grep -A 1 -B 4 EXABYTE
  (scsi1:A:6): 160.000MB/s transfers (80.000MHz DT, 16bit)
    Vendor: HP       Model: Ultrium 2-SCSI   Rev: S33U
    Type:   Sequential-Access                ANSI SCSI revision: 03
    Vendor: EXABYTE  Model: MAGNUM 224       Rev: C118
    Type:   Medium Changer                   ANSI SCSI revision: 04
  [EMAIL PROTECTED] 2.6.9-42.0.10.EL-smp-i686]#

> --
> Cheers, Gene
>
> Everything that you know is wrong, but you can be straightened out.

Many thanks!
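To make the drive/changer check Gene asks for less error-prone, the `/proc/scsi/scsi` listing can be condensed to one line per device. This is a minimal sketch: the function name `list_scsi` is made up for illustration, and the awk patterns assume the three-line-per-device layout the kernel prints (Host, Vendor, Type).

```shell
#!/bin/sh
# Condense /proc/scsi/scsi-style text on stdin to one line per device:
#   host:channel:id:lun vendor type
list_scsi() {
    awk '
        /^Host:/     { addr = $2 ":" $4 ":" $6 ":" $8 }
        /^ *Vendor:/ { vendor = $2 }
        /^ *Type:/   {
            type = $0
            sub(/^ *Type: */, "", type)   # drop the "Type:" label
            sub(/ *ANSI.*$/, "", type)    # drop the trailing ANSI revision
            print addr, vendor, type
        }
    '
}

# Only read the real file when it exists (Linux with CONFIG_SCSI_PROC_FS).
if [ -r /proc/scsi/scsi ]; then
    list_scsi < /proc/scsi/scsi
fi
```

Applied to the listing in the post, the HP drive and the EXABYTE changer come out with the same host, channel, and id and differ only in the LUN (00 versus 01), so the kernel already sees LUN 1 on that target.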
Re: A non-amanda issue: the abysmal misery of LUNs
On Monday 26 March 2007, FL wrote:
> On 3/26/07, Gene Heskett [EMAIL PROTECTED] wrote:
> > ...
> > I'm wondering if RHEL has continued the practice of only scanning the SCSI bus for LUN=0, in which case many libraries and changers will be missed. This requires a rebuild of the kernel, with 'scan all luns' set to 'y' in the SCSI menu.
>
> AHA (behold the last line):
>
>   [EMAIL PROTECTED] 2.6.9-42.0.10.EL-smp-i686]# grep SCSI .config
>   ...
>   # Some SCSI devices (e.g. CD jukebox) support multiple LUNs
>   # CONFIG_SCSI_MULTI_LUN is not set
>
> Oh well. I figured it would come to this.

Chuckle, aww, come on, it's not that bad; it's been months since I ran a Fedora kernel. I bleed a little for the cause occasionally. :-)

> > I would, however, scan the /var/log/dmesg file to see if /dev/nst2 is the correct address for the tape drive, which may not be the same as the changer robot you are running with the mtx command. Your use of /dev/sg4 to address the robot, but /dev/nst2 for the drive, looks like something I'd want to check.
>
> Here is an additional piece of information (what I call a factlet: a crucial fact that you need to know that no one will tell you unless you ask): I have another SCSI card to which my Spectra Logic 2K is attached. Drives 0 and 1 of the 2K are nst0 and nst1, respectively. The Exabyte drive is nst2.
>
>   [EMAIL PROTECTED] 2.6.9-42.0.10.EL-smp-i686]# cat /proc/scsi/scsi
>   ...
>
> Many thanks!

--
Cheers, Gene

Life's the same, except for the shoes.
  - The Cars