FATAL taper
I just received a fatal taper error in last night's backup. The error is:

    Wed Sep 5 08:59:23 2012: thd-840d870: taper: critical (fatal):
    Part numbers do not match! at /local/amanda/amanda-3.3.2/lib/amanda/perl/Amanda/Taper/Scribe.pm line 877
        Amanda::Taper::Scribe::_xmsg_part_done('Amanda::Taper::Scribe=HASH(0x89a9298)', 'Amanda::MainLoop::Source=HASH(0x893b1cc)', 'Amanda::Xfer::Msg=HASH(0x89e5ca0)', 'Amanda::Xfer::Xfer=SCALAR(0x89515c4)') called at /local/amanda/amanda-3.3.2/lib/amanda/perl/Amanda/Taper/Scribe.pm line 855
        Amanda::Taper::Scribe::handle_xmsg('Amanda::Taper::Scribe=HASH(0x89a9298)', 'Amanda::MainLoop::Source=HASH(0x893b1cc)', 'Amanda::Xfer::Msg=HASH(0x89e5ca0)', 'Amanda::Xfer::Xfer=SCALAR(0x89515c4)') called at /local/amanda/amanda-3.3.2/lib/amanda/perl/Amanda/Taper/Worker.pm line 680
        Amanda::Taper::Worker::__ANON__('Amanda::MainLoop::Source=HASH(0x893b1cc)', 'Amanda::Xfer::Msg=HASH(0x89e5ca0)', 'Amanda::Xfer::Xfer=SCALAR(0x89515c4)') called at /local/amanda/amanda-3.3.2/lib/amanda/perl/Amanda/Xfer.pm line 655
        Amanda::Xfer::__ANON__('Amanda::MainLoop::Source=HASH(0x893b1cc)', 'Amanda::Xfer::Msg=HASH(0x89e5ca0)', 'Amanda::Xfer::Xfer=SCALAR(0x89515c4)') called at /local/amanda/amanda-3.3.2/lib/amanda/perl/Amanda/MainLoop.pm line 790
        eval {...} called at /local/amanda/amanda-3.3.2/lib/amanda/perl/Amanda/MainLoop.pm line 790
        Amanda::MainLoop::run() called at /local/amanda/amanda-3.3.2/libexec/amanda/taper line 74

From the looks of what was going on, I suspect that one DLE finished at the end of the first tape (of two) and the next DLE was due to start writing at the beginning of the next tape. What are the chances of that happening? Pretty good.

This is with amanda 3.3.2. If more information is needed, please let me know.

Pieter
Re: MAC Amanda client backup
>> ...
>> I installed the 2.5.0p2 client with the same options mentioned on the above
>> wiki page. Then I started the launch daemons for bsdudp. For some reason I
>> am seeing the following errors in the logs:
>>
>>     com.apple.launchd[1] org.amanda.amandad.basdudp getpwnam("amandabackup") failed
>>     com.apple.launchd[1] org.amanda.amandad.bsdudp Exited with exit code 1.
>>
>> Any input on how to fix this?
>> ...

You need to create the user amandabackup. You can do that via
System Preferences->Accounts.

Pieter
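If you prefer the command line over System Preferences, the account can also be created with dscl. This is a sketch, not a verified recipe: the dscl paths are standard on OS X, but UID/GID 59 and the home directory below are arbitrary examples, and the script only prints the commands (dry-run) unless you set DRYRUN=no.

```shell
#!/bin/sh
# Sketch: create the "amandabackup" user from the command line.
# UID/GID 59 and /var/lib/amanda are illustrative -- pick values
# that are unused on your system.
# Prints the commands instead of running them unless DRYRUN=no.
run() { if [ "${DRYRUN:-yes}" = yes ]; then echo "$@"; else "$@"; fi; }

run sudo dscl . -create /Users/amandabackup
run sudo dscl . -create /Users/amandabackup UserShell /bin/bash
run sudo dscl . -create /Users/amandabackup RealName "Amanda backup user"
run sudo dscl . -create /Users/amandabackup UniqueID 59
run sudo dscl . -create /Users/amandabackup PrimaryGroupID 59
run sudo dscl . -create /Users/amandabackup NFSHomeDirectory /var/lib/amanda
```

Once the user exists, getpwnam("amandabackup") should succeed and the launchd job should stay up.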
Re: MAC Amanda client backup
>> ...
>> > > I have downloaded the tar file and built it myself. I didn't find any
>> > > xinetd files, so I created an xinetd.d directory myself and created a
>> > > file for amanda.
>> ...
>> Does your OS use a different mechanism for this?
>> Solaris uses /etc/inetd.conf; I thought MAC used xinetd services,
>> but you aren't finding an /etc/xinetd.d directory.
>>
>> Is there such a service installed on your particular machine?
>> Could that package have been removed or not been installed?
>>
>> There is something odd here that we haven't quite gotten to yet.
>> ...

You may be able to find all your MacOS X questions answered here:

http://wiki.zmanda.com/index.php/Installation/OS_Specific_Notes/Installing_Amanda_on_Mac_OS_X

Specifically, look at the section about LaunchDaemons.

Pieter
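For orientation, Mac OS X uses launchd rather than (x)inetd, so the amanda service is described by a plist in /Library/LaunchDaemons. A LaunchDaemon for amandad has roughly the following shape; the program path, label, and service name here are illustrative, and the wiki page above has the canonical files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.amanda.amandad</string>
    <!-- launchd resolves this name with getpwnam(), hence the error
         above when the amandabackup user does not exist. -->
    <key>UserName</key>
    <string>amandabackup</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/libexec/amanda/amandad</string>
        <string>-auth=bsdtcp</string>
    </array>
    <key>inetdCompatibility</key>
    <dict>
        <key>Wait</key>
        <false/>
    </dict>
    <key>Sockets</key>
    <dict>
        <key>Listeners</key>
        <dict>
            <key>SockServiceName</key>
            <string>amanda</string>
        </dict>
    </dict>
</dict>
</plist>
```

A file like this is loaded with `sudo launchctl load /Library/LaunchDaemons/org.amanda.amandad.plist`.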
RE: Cannot alloc contig buf for I/O error.
>> ...
>> I have resolved the problem, but I am not sure how.
>>
>> I changed
>>
>>     device-property "BLOCK_SIZE" "1024k"
>>
>> to
>>
>>     device-property "BLOCK_SIZE" "1 mbytes"
>>
>> and I rebooted the system.
>>
>> My amflush is running without the errors.
>> ...

I did a little checking, but the only related thing I found was a page talking
about "Sun StorEdge QFS and Sun StorEdge SAM-FS Limitations". The advice given
there is:

    The current workaround is to increase the system memory to at least
    4 gigabytes. This problem is being tracked under Solaris bug 6334803.

I checked the Oracle knowledge base, but can't find anything related. I suspect
the reboot was the thing that did the trick. You might also need to get any OS
patches (if you can).

Pieter
amanda timeout Solaris 10/x86
I have set up a test configuration because of an issue I'm seeing with a Solaris 10/x86 amanda 3.2.1 server/client. Auth is set to "local". The error from the amanda report is:

    planner: ERROR xxx.math.utah.edu NAK: timeout on reply pipe

The test configuration has 11 DLEs. If I let the sendsize/calcsize runs go to completion, they may take about 30 minutes. However, the timeout happens after about 2 minutes, even though the etimeout value is set to 1800 seconds. PREP packets are seen until the ~2 minute period has passed, at which point sendbackup starts dumping the DLEs which have estimates.

My best guess is that the problem is between amandad and the sendsize processes. This problem also happens when a 2.6.1 server tries to back up the 3.2.1 client. I have compiled amanda 3.2.1 with both gcc and cc, with no change in behavior.

Any ideas on where to look for the problem?

Thanks, Pieter
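For reference, the timeout knobs involved live in amanda.conf on the server side. A minimal excerpt might look like the following; the values are illustrative (1800 matches the etimeout mentioned above), and the comments paraphrase the documented meanings.

```
# amanda.conf (server side) -- values are illustrative
etimeout 1800   # seconds allowed per DLE for the estimate (sendsize) phase
dtimeout 1800   # seconds of data inactivity before a running dump is aborted
ctimeout 30     # seconds amcheck waits for each client host to respond
```

Note that a "timeout on reply pipe" NAK after ~2 minutes despite a large etimeout suggests the timeout being hit is not etimeout itself but something between amandad and its children, which is consistent with the guess above.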
Re: Zmanda Windows Client
>> ...
>> We just tried it for the first time and saw the same thing.
>> What is your disklist config?
>> ...

The disklist looks like:

    host.math.utah.edu "C:/Documents and Settings" zwc-compress 1

The amanda.conf has (among other settings):

    define dumptype global {
        index yes
        maxdumps 4
        tape_splitsize 1 Gb
    }

    define dumptype zwc-compress {
        global
        auth "bsdtcp"
        compress client fast
        program "DUMP"
        maxdumps 1
    }

Pieter
Zmanda Windows Client
I've got ZWC installed on a couple of Windows XP boxes. The one thing I notice is that the level 0 and level 1 backups are always about the same size, and the file counts are approximately the same. This leads me to believe that I am getting full dumps every time. Any hints on what might be going on?

The server is running 2.6.1p2; the ZWC is 2.6.1.2, 8th January 2010, 20180.

Thanks, Pieter
Re: Amanda and ZFS
>> ...
>> The gtar devs finally accepted something to help with this problem:
>> --no-check-device.
>> ...

Thanks, I hadn't caught the addition of that option. That also reminds me that it wasn't the inode number but the device number that was the problem.

Pieter
Re: Amanda and ZFS
I started using ZFS in a big way over a year ago on our main file server. Since there is no ufsdump replacement to use with ZFS, I elected to use GNU tar. I know this doesn't yet cover backing up things like ACLs, but we don't use them in our very heterogeneous environment.

The main idea I had was to take a snapshot and point tar at the snapshot, so it has a nice static, read-only copy of the filesystem to work from. I created a shell script, run as a cron job just before amdump, which cleans up the previous snapshots and takes new snapshots of each of the pools (effectively):

    zfs destroy -r [EMAIL PROTECTED]
    zfs snapshot -r [EMAIL PROTECTED]

Fortunately, amanda has a nice way to specify that the filesystem name is something like "/local" while the point for tar to start at is a different location. A disklist entry such as:

    foo.math.utah.edu /local /local/.zfs/snapshot/AMANDA user-tar

The final issue I found was that the inode numbers in the snapshots change each time a new snapshot is created. This is a problem for GNU tar's listed-incremental facility. To work around it I ended up hacking GNU tar to make it ignore the inodes stored in the listed-incremental files. This was just a simple change: have ZFS filesystems treated the same as NFS. The patch was submitted to the GNU tar developers, but was rejected. Here is the patch as applied to GNU tar 1.16 (this patch also contains what I consider a fix for an actual coding bug):

    diff -r -c tar-1.16/src/incremen.c tar-1.16-local/src/incremen.c
    *** tar-1.16/src/incremen.c	Fri Sep  8 10:42:18 2006
    --- tar-1.16-local/src/incremen.c	Fri Dec  8 14:53:37 2006
    ***************
    *** 71,77 ****
      #if HAVE_ST_FSTYPE_STRING
        static char const nfs_string[] = "nfs";
    ! # define NFS_FILE_STAT(st) (strcmp ((st).st_fstype, nfs_string) == 0)
      #else
      # define ST_DEV_MSB(st) (~ (dev_t) 0 << (sizeof (st).st_dev * CHAR_BIT - 1))
      # define NFS_FILE_STAT(st) (((st).st_dev & ST_DEV_MSB (st)) != 0)
    --- 71,77 ----
      #if HAVE_ST_FSTYPE_STRING
        static char const nfs_string[] = "nfs";
    ! # define NFS_FILE_STAT(st) (strcmp ((st).st_fstype, nfs_string) == 0 || strcmp ((st).st_fstype, "zfs") == 0)
      #else
      # define ST_DEV_MSB(st) (~ (dev_t) 0 << (sizeof (st).st_dev * CHAR_BIT - 1))
      # define NFS_FILE_STAT(st) (((st).st_dev & ST_DEV_MSB (st)) != 0)
    ***************
    *** 247,253 ****
           directories, consider all NFS devices as equal,
           relying on the i-node to establish differences.  */

    !   if (! (((DIR_IS_NFS (directory) & nfs)
               || directory->device_number == stat_data->st_dev)
              && directory->inode_number == stat_data->st_ino))
        {
    --- 247,253 ----
           directories, consider all NFS devices as equal,
           relying on the i-node to establish differences.  */

    !   if (! (((DIR_IS_NFS (directory) && nfs)
               || directory->device_number == stat_data->st_dev)
              && directory->inode_number == stat_data->st_ino))
        {

I hope this helps other people using amanda and ZFS. I'm happy to clear up any unclear issues.

Pieter
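The snapshot-rotation cron job described above can be sketched as follows. The zfs destroy/snapshot commands are the ones from the message; "tank" is a hypothetical pool name, and the script only prints the commands (dry-run) unless DRYRUN=no, since zfs destroy is unforgiving.

```shell
#!/bin/sh
# Sketch: refresh per-pool AMANDA snapshots just before amdump runs.
# "tank" is a hypothetical pool name -- list your real pools here.
# Prints the commands instead of running them unless DRYRUN=no.
run() { if [ "${DRYRUN:-yes}" = yes ]; then echo "$@"; else "$@"; fi; }

for pool in tank; do
    # Drop yesterday's recursive snapshot, then take a fresh one so
    # tar sees a static tree under /<fs>/.zfs/snapshot/AMANDA.
    run zfs destroy -r "$pool@AMANDA"
    run zfs snapshot -r "$pool@AMANDA"
done
```

Scheduled from cron shortly before amdump, this keeps the snapshot path in the disklist entry (/local/.zfs/snapshot/AMANDA) pointing at a current copy of the filesystem.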
Re: 2.5.2 compilation failure on irix-6.5.x
>> ...
>> Can you try the attached patch for the vstrallocf problem.
>> ...

That corrected the problem on both IRIX and OSF/1 for me.

Pieter
Re: 2.5.2 compilation failure on irix-6.5.x
>> ...
>> cc-1084 cc: ERROR File = /usr/include/sys/socket.h, Line = 66
>>   The indicated declaration has an invalid combination of type specifiers.
>>
>>   typedef int socklen_t;
>> ...

I'm guessing that Jean-Francois' release of IRIX is probably newer than the one I have (IRIX 6.5.4m here). The socket.h here doesn't have a typedef for socklen_t. This also probably means that defining INET6 might be a bad idea.

After looking through the amanda source and /usr/include/netinet/in.h a little more, I've found that INET_ADDRSTRLEN is only defined if INET6 is defined. However, one thing I hadn't looked at is that on our IRIX system, WORKING_IPV6 is not being defined.

From the Solaris version of netinet/in.h:

    /*
     * Miscellaneous IPv6 constants.
     */
    #define INET_ADDRSTRLEN   16   /* max len IPv4 addr in ascii dotted */
                                   /* decimal notation. */
    #define INET6_ADDRSTRLEN  46   /* max len of IPv6 addr in ascii */
                                   /* standard colon-hex notation. */

This seems to imply that INET_ADDRSTRLEN should not be used on systems which can't do IPv6, or an alternate definition should be set up.

I was able to complete a build of amanda 2.5.2 on IRIX with the following addition to amanda.h, using gcc (SGI's cc won't handle the vstrallocf definition):

    #ifndef INET_ADDRSTRLEN
    #define INET_ADDRSTRLEN 16
    #endif

Pieter
Re: 2.5.2 compilation failure on irix-6.5.x
>> ...
>> First go at amanda-2.5.2 on a system running irix-6.5.x
>> and compile fails with the error:
>> ...

This is the case on both IRIX and OSF/1. These compilers can't handle:

    #define vstrallocf(...) debug_vstrallocf(__FILE__,__LINE__,__VA_ARGS__)

The patch for sockaddr_storage seems to work with the addition of these two definitions in amanda.h:

    /* Needed on SGI IRIX 6.5 */
    #ifdef WORKING_IPV6
    #define INET6
    #endif

    #ifndef INET_ADDRSTRLEN
    #define INET_ADDRSTRLEN 16
    #endif

Pieter