Re: I had enough holding disk but part of the dump failed -
On Fri, Nov 03, 2000 at 11:23:10PM -0500, John R. Jackson wrote:
> > I think that's what's happening, a dump with est_kps larger than the
> > bandwidth will never be scheduled to dump on holding disk.
>
> I agree it looks like that could happen in principle (in fact, I wonder
> if other such "don't do this now" states should be examined), but in
> this particular case the estimated bandwidth for the two disks not
> processed was well below the available.

The schedule gives:

    admin1.cor sda9 lv 2 t 5 s52939 p 13

52939/5 = 10587, that's above 1, and that's why sda9 didn't get dumped.

As Alexandre said, sda10 didn't get dumped because the disk was big and
the chunksize was set to -1.

Jean-Louis
--
Jean-Louis Martineau             email: [EMAIL PROTECTED]
Departement IRO, Universite de Montreal
C.P. 6128, Succ. CENTRE-VILLE    Tel: (514) 343-6111 ext. 3529
Montreal, Canada, H3C 3J7        Fax: (514) 343-5834
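The division in the message above can be checked with a small sketch. It assumes that `s52939` in the schedule line is the estimated dump size in KB, that `t 5` is the estimated time in seconds, and that the rate is truncated to an integer. The function name `est_kps` is borrowed from the discussion for illustration; this is not Amanda's actual code.

```c
#include <assert.h>

/* Estimated dump rate in KB/s from the planner's size and time estimates.
 * Assumptions: size is in KB, time is in seconds, result is truncated. */
long est_kps(long est_size_kb, long est_time_sec)
{
    assert(est_time_sec > 0);   /* this sketch does not handle a zero time */
    return est_size_kb / est_time_sec;
}
```

Under these assumptions, `est_kps(52939, 5)` gives 10587, the figure quoted for sda9.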
Re: I had enough holding disk but part of the dump failed -
On Sat, Nov 04, 2000 at 09:35:54AM -0500, Jean-Louis Martineau wrote:
> On Fri, Nov 03, 2000 at 11:23:10PM -0500, John R. Jackson wrote:
> > > I think that's what's happening, a dump with est_kps larger than the
> > > bandwidth will never be scheduled to dump on holding disk.
> >
> > I agree it looks like that could happen in principle (in fact, I wonder
> > if other such "don't do this now" states should be examined), but in
> > this particular case the estimated bandwidth for the two disks not
> > processed was well below the available.
>
> The schedule give:
>
>     admin1.cor sda9 lv 2 t 5 s52939 p 13
>
> 52939/5 = 10587 that's above 1, and that's why sda9 didn't get dumped.

In free_kps() we have:

    /* XXX - kludge - if we are currently using nothing
    **       on this interface then lie and say he can
    **       have as much as he likes.
    */
    if (ip->curusage == 0) res = 1;
    else res = ip->maxusage - ip->curusage;

Even if maxusage > 1, it will never return more than 1 if curusage == 0.

Should we fix it and increase it to 10? Or is my previous patch better,
and should we remove this kludge?

Jean-Louis
--
Jean-Louis Martineau             email: [EMAIL PROTECTED]
Departement IRO, Universite de Montreal
C.P. 6128, Succ. CENTRE-VILLE    Tel: (514) 343-6111 ext. 3529
Montreal, Canada, H3C 3J7        Fax: (514) 343-5834
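To make the effect of the kludge concrete, here is a minimal, self-contained sketch of the two behaviours being weighed. Everything in it is an assumption for illustration: `struct iface` is a cut-down stand-in for Amanda's interface record, `idle_kps` is a parameter standing in for the fixed value the quoted code returns, and `can_start` assumes the driver refuses a dump whose estimated rate exceeds the reported free bandwidth. It is not the code of any posted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for Amanda's per-interface bookkeeping. */
struct iface {
    int maxusage;   /* configured bandwidth limit, KB/s */
    int curusage;   /* bandwidth already promised to running dumps, KB/s */
};

/* Mirrors the quoted branch: an idle interface reports a fixed value
 * (idle_kps, an assumed parameter) instead of its real headroom. */
int free_kps_kludge(const struct iface *ip, int idle_kps)
{
    assert(ip != NULL);
    if (ip->curusage == 0)
        return idle_kps;
    return ip->maxusage - ip->curusage;
}

/* One possible alternative: always report the real headroom. */
int free_kps_plain(const struct iface *ip)
{
    assert(ip != NULL);
    return ip->maxusage - ip->curusage;
}

/* Assumed scheduling rule: a dump may start only if its estimated
 * rate fits in the free bandwidth. Returns 1 if it fits, else 0. */
int can_start(int est_kps, int free_kps)
{
    return est_kps <= free_kps;
}
```

With illustrative values (an idle interface whose `maxusage` is 20000 and an `idle_kps` of 10000, both assumed), a dump estimated at 10587 KB/s is refused by the kludge version even though nothing else is running, while the plain version lets it start. Raising the fixed value only moves the threshold; a dump with a larger estimate would hit the same wall.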
I had enough holding disk but part of the dump failed -
claiming there was [no more holding disk space] - What do you think went wrong?

Filesystem            kbytes    used   avail capacity  Mounted on
/dev/md/dsk/d20     11560144 1847220 9597323    17%    /dump

Subject: daily AMANDA MAIL REPORT FOR November 3, 2000

*** THE DUMPS DID NOT FINISH PROPERLY! ***

*** A TAPE ERROR OCCURRED: [no tape online].
*** PERFORMED ALL DUMPS TO HOLDING DISK.

THESE DUMPS WERE TO DISK.  Flush them onto a new tape.
Tonight's dumps should go onto 1 tape: a new tape.

FAILURE AND STRANGE DUMP SUMMARY:
  admin1.cor sda9 lev 2 FAILED [no more holding disk space]
  admin1.cor sda10 lev 1 FAILED [no more holding disk space]

STATISTICS:
                          Total       Full      Daily
Dump Time (hrs:min)        0:02       0:00       0:00   (0:02 start)
Output Size (meg)         370.4        0.0      370.4
Original Size (meg)       370.4        0.0      370.4
Avg Compressed Size (%)     --         --         --
Tape Used (%)               1.0        0.0        1.0   (level:#disks ...)
Filesystems Dumped            7          0          7   (1:1 2:5 3:1)
Avg Dump Rate (k/s)      1642.0        --      1642.0
Avg Tp Write Rate (k/s)     --         --         --

NOTES:
  planner: Incremental of admin1.corp.walid.com:sda2 bumped to level 2.
  planner: Incremental of admin1.corp.walid.com:sda9 bumped to level 2.
  planner: Incremental of admin1.corp.walid.com:sda5 bumped to level 2.
  planner: Incremental of admin1.corp.walid.com:sda3 bumped to level 2.
  planner: Incremental of admin1.corp.walid.com:sda6 bumped to level 2.
  planner: Incremental of sundev1.corp.walid.com:c0t0d0s3 bumped to level 3.
  planner: Incremental of sundev1.corp.walid.com:c0t0d0s0 bumped to level 2.

DUMP SUMMARY:
                                     DUMPER STATS             TAPER STATS
HOSTNAME  DISK      L  ORIG-KB   OUT-KB COMP%  MMM:SS    KB/s  MMM:SS  KB/s
--------------------- ---------------------------------------- ------------
admin1.co sda10     1   FAILED
admin1.co sda2      2    63808    63808   --     0:21  3095.4    N/A    N/A
admin1.co sda3      2       32       32   --     0:00   126.4    N/A    N/A
admin1.co sda5      2     6272     6272   --     0:21   293.6    N/A    N/A
admin1.co sda6      2     1184     1184   --     0:03   419.8    N/A    N/A
admin1.co sda9      2   FAILED
sundev1.c c0t0d0s0  2    63872    63872   --     0:52  1228.8    N/A    N/A
sundev1.c c0t0d0s3  3    63296    63296   --     0:24  2630.6    N/A    N/A
sundev1.c c0t0d0s7  1   180800   180800   --     1:50  1645.3    N/A    N/A

(brought to you by A

root@sundev1:adm/amanda/daily# more log.20001103.0
START planner date 20001103
START driver date 20001103
ERROR taper no-tape [no tape online]
INFO planner Incremental of admin1.corp.walid.com:sda2 bumped to level 2.
INFO planner Incremental of admin1.corp.walid.com:sda9 bumped to level 2.
INFO planner Incremental of admin1.corp.walid.com:sda5 bumped to level 2.
INFO planner Incremental of admin1.corp.walid.com:sda3 bumped to level 2.
INFO planner Incremental of admin1.corp.walid.com:sda6 bumped to level 2.
INFO planner Incremental of sundev1.corp.walid.com:c0t0d0s3 bumped to level 3.
INFO planner Incremental of sundev1.corp.walid.com:c0t0d0s0 bumped to level 2.
FINISH planner date 20001103
STATS driver startup time 132.610
SUCCESS dumper admin1.corp.walid.com sda3 20001103 2 [sec 0.253 kb 32 kps 126.4 orig-kb 17]
SUCCESS dumper admin1.corp.walid.com sda6 20001103 2 [sec 2.820 kb 1184 kps 419.8 orig-kb 1163]
SUCCESS dumper sundev1.corp.walid.com c0t0d0s3 20001103 3 [sec 24.061 kb 63296 kps 2630.6 orig-kb 63263]
SUCCESS dumper admin1.corp.walid.com sda5 20001103 2 [sec 21.361 kb 6272 kps 293.6 orig-kb 6274]
SUCCESS dumper admin1.corp.walid.com sda2 20001103 2 [sec 20.613 kb 63808 kps 3095.4 orig-kb 63793]
SUCCESS dumper sundev1.corp.walid.com c0t0d0s0 20001103 2 [sec 51.977 kb 63872 kps 1228.8 orig-kb 63839]
SUCCESS dumper sundev1.corp.walid.com c0t0d0s7 20001103 1 [sec 109.887 kb 180800 kps 1645.3 orig-kb 180767]
FAIL driver admin1.corp.walid.com sda9 2 [no more holding disk space]
FAIL driver admin1.corp.walid.com sda10 1 [no more holding disk space]
Re: I had enough holding disk but part of the dump failed -
From: "Denise Ives" [EMAIL PROTECTED]
> claiming there was [no more holding disk space] - What do you think went wrong?
>
> Filesystem            kbytes    used   avail capacity  Mounted on
> /dev/md/dsk/d20     11560144 1847220 9597323    17%    /dump
>
> Subject: daily AMANDA MAIL REPORT FOR November 3, 2000
>
> FAILURE AND STRANGE DUMP SUMMARY:
>   admin1.cor sda9 lev 2 FAILED [no more holding disk space]
>   admin1.cor sda10 lev 1 FAILED [no more holding disk space]

Your amanda.conf file may be specifying a maximum amount of drive space
to use for the holding disk.

    holdingdisk hd1 {
        comment "main holding disk"
        directory "/dumps/amanda"   # where the holding disk is
        use -20 Mb                  # how much space can we use on it
                                    # a negative value mean:
                                    #        use all space except that value
        chunksize 1 Gb              # size of chunk if you want big dump to be
                                    # dumped on multiple files on holding disks
                                    #  N Kb/Mb/Gb split disks in chunks of size N
                                    #  0          split disks in INT_MAX/1024 Kb chunks
                                    # -1          same as -INT_MAX/1024 (see below)
                                    # -N Kb/Mb/Gb dont split, dump larger
                                    #             filesystems directly to tape
                                    #             (example: -2 Gb)
        # chunksize 2 Gb
    }

Try changing the "use" size to a more appropriate value for your setup.

Scot
Re: I had enough holding disk but part of the dump failed -
Here is my amanda.conf file with respect to the holding disk -

    holdingdisk hd1 {
        comment "main holding disk"
        directory "/dump/amanda"    # where the holding disk is
        use -500Mb                  # how much space can we use on it
                                    # a negative value mean:
                                    #        use all space except that value
        chunksize -1                # size of chunk if you want big dump to be
                                    # dumped on multiple files on holding disks
                                    #  N Kb/Mb/Gb split disks in chunks of size N
                                    #  0          split disks in INT_MAX/1024 Kb chunks
                                    # -1          same as -INT_MAX/1024 (see below)
                                    # -N Kb/Mb/Gb dont split, dump larger
                                    #             filesystems directly to tape
                                    #             (example: -2 Gb)

Is the use wrong?

On Fri, 3 Nov 2000, Scot W. Hetzel wrote:
> From: "Denise Ives" [EMAIL PROTECTED]
> > claiming there was [no more holding disk space] - What do you think went wrong?
> >
> > Filesystem            kbytes    used   avail capacity  Mounted on
> > /dev/md/dsk/d20     11560144 1847220 9597323    17%    /dump
> >
> > Subject: daily AMANDA MAIL REPORT FOR November 3, 2000
> >
> > FAILURE AND STRANGE DUMP SUMMARY:
> >   admin1.cor sda9 lev 2 FAILED [no more holding disk space]
> >   admin1.cor sda10 lev 1 FAILED [no more holding disk space]
>
> Your amanda.conf file may be specifying a maximum amount of drive space
> to use for the holding disk.
>
>     holdingdisk hd1 {
>         comment "main holding disk"
>         directory "/dumps/amanda"   # where the holding disk is
>         use -20 Mb                  # how much space can we use on it
>                                     # a negative value mean:
>                                     #        use all space except that value
>         chunksize 1 Gb              # size of chunk if you want big dump to be
>                                     # dumped on multiple files on holding disks
>                                     #  N Kb/Mb/Gb split disks in chunks of size N
>                                     #  0          split disks in INT_MAX/1024 Kb chunks
>                                     # -1          same as -INT_MAX/1024 (see below)
>                                     # -N Kb/Mb/Gb dont split, dump larger
>                                     #             filesystems directly to tape
>                                     #             (example: -2 Gb)
>         # chunksize 2 Gb
>     }
>
> Try, changing the "use" size to a more appropriate valuse for your setup.
>
> Scot
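The "use" comment in the configuration above says a negative value means "use all space except that value". A minimal sketch of that rule, applied to the df figures from the original report, might look like the following. The function name `holding_space_kb` is hypothetical, the treatment of zero and the capping of a positive value at the available space are assumptions, and 1 Mb is taken as 1024 KB.

```c
#include <assert.h>

/* Space Amanda may use on a holding disk, in KB (illustrative only).
 * use_kb > 0: use at most that much (capped at what is available;
 *             the cap is an assumption of this sketch).
 * use_kb < 0: use everything except that much, per the amanda.conf comment.
 * use_kb == 0 is treated as "use everything", which is also an assumption. */
long holding_space_kb(long avail_kb, long use_kb)
{
    long usable;

    assert(avail_kb >= 0);
    if (use_kb > 0)
        usable = use_kb < avail_kb ? use_kb : avail_kb;
    else
        usable = avail_kb + use_kb;   /* use_kb is zero or negative here */
    return usable > 0 ? usable : 0;
}
```

With the 9597323 KB shown as available on /dump and `use -500Mb` (that is, -512000 KB), this yields 9085323 KB, far more than the roughly 53 MB estimated for sda9. That supports the view that the `use` setting alone does not explain the failure.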
Re: I had enough holding disk but part of the dump failed -
> Here is my amanda.conf file with respect to the holding disk -
> ...
> use -500Mb
> ...

That's OK.

> chunksize -1
> ...

This is not related to your problem, but you should change this at some
point to something like "1000 Mb".

> Is the use wrong?

It might be a little large (why not let Amanda have almost all of that
file system?), but is probably not the problem.

Much as I hate to do this :-), please post the entire amdump.1 file that
goes with this run. It's the only way we'll be able to tell what was
going on.

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
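For a sense of what a positive chunksize such as the suggested "1000 Mb" implies, the sketch below counts how many holding-disk files a dump of a given size would be split into. It assumes plain ceiling division and ignores any per-chunk header overhead, so it illustrates the idea rather than describing Amanda's exact behaviour. The name `chunk_count` is hypothetical.

```c
#include <assert.h>

/* Number of chunk files needed for a dump, assuming each chunk holds
 * exactly chunk_kb of data (per-chunk headers are ignored here). */
long chunk_count(long dump_kb, long chunk_kb)
{
    assert(dump_kb >= 0 && chunk_kb > 0);
    return (dump_kb + chunk_kb - 1) / chunk_kb;   /* ceiling division */
}
```

For example, a 4550177 KB dump (the level 0 estimate reported for sda10 elsewhere in this thread) with 1000 Mb chunks (1024000 KB) would need 5 files under these assumptions.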
Re: I had enough holding disk but part of the dump failed -
On Fri, 3 Nov 2000, John R. Jackson wrote:
> > Here is my amanda.conf file with respect to the holding disk -
> > ...
> > use -500Mb
> > ...
>
> That's OK.
>
> > chunksize -1
> > ...
>
> This is not related to your problem, but you should change this at some
> point to something like "1000 Mb".
>
> > Is the use wrong?
>
> It might be a little large (why not let Amanda have almost all of that
> file system?), but is probably not the problem.
>
> Much as I hate to do this :-), please post the entire amdump.1 file that
> goes with this run. It's the only way we'll be able to tell what was
> going on.
>
> John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]

--
Denise E. Ives                  [EMAIL PROTECTED]
Systems Engineer                734.822.2037
Multilingual Internet Domain Name Registrations - http://www.walid.com

amdump: start at Fri Nov 3 01:15:00 EST 2000
planner: pid 14161 executable /usr/local/pkg/amanda-2.4.1p1/libexec/planner version 2.4.1p1
planner: build: VERSION="Amanda-2.4.1p1"
planner:        BUILT_DATE="Tue Oct 17 14:52:15 GMT 2000"
planner:        BUILT_MACH="SunOS sundev1.corp.walid.com 5.8 Generic sun4u sparc SUNW,Ultra-60"
planner:        CC="gcc"
planner: paths: bindir="/usr/local/pkg/amanda-2.4.1p1/bin"
planner:        sbindir="/usr/local/pkg/amanda-2.4.1p1/sbin"
planner:        libexecdir="/usr/local/pkg/amanda-2.4.1p1/libexec"
planner:        mandir="/usr/local/pkg/amanda-2.4.1p1/man"
planner:        CONFIG_DIR="/usr/local/pkg/amanda-2.4.1p1/etc/amanda"
planner:        DEV_PREFIX="/dev/dsk/" RDEV_PREFIX="/dev/rdsk/"
planner:        DUMP="/usr/sbin/ufsdump" RESTORE="/usr/sbin/ufsrestore"
planner:        GNUTAR="/usr/local/bin/tar" COMPRESS_PATH="/usr/bin/gzip"
planner:        UNCOMPRESS_PATH="/usr/bin/gzip" MAILER="/usr/ucb/Mail"
planner:        listed_incr_dir="/usr/local/pkg/amanda-2.4.1p1/var/amanda/gnutar-lists"
planner: defs:  DEFAULT_SERVER="sundev1.corp.walid.com"
planner:        DEFAULT_CONFIG="daily"
planner:        DEFAULT_TAPE_SERVER="sundev1.corp.walid.com"
planner:        DEFAULT_TAPE_DEVICE="/dev/rmt/0bn" HAVE_MMAP HAVE_SYSVSHM
planner:        LOCKING=POSIX_FCNTL SETPGRP_VOID DEBUG_CODE BSD_SECURITY
planner:        USE_AMANDAHOSTS CLIENT_LOGIN="amanda" FORCE_USERID HAVE_GZIP
planner:        COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
planner:        COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
READING CONF FILES...
startup took 0.025 secs

SETTING UP FOR ESTIMATES...
setting up estimates for sundev1.corp.walid.com:c0t0d0s0
sundev1.corp.walid.com:c0t0d0s0 overdue 11 days for level 0
setup_estimate: sundev1.corp.walid.com:c0t0d0s0: command 0, options:
    last_level 1 next_level0 -11 level_days 2
    getting estimates 0 (2049855) 1 (255) 2 (0)
setting up estimates for sundev1.corp.walid.com:c0t0d0s3
sundev1.corp.walid.com:c0t0d0s3 overdue 11 days for level 0
setup_estimate: sundev1.corp.walid.com:c0t0d0s3: command 0, options:
    last_level 2 next_level0 -11 level_days 1
    getting estimates 0 (481183) 2 (52991) 3 (0)
setting up estimates for sundev1.corp.walid.com:c0t0d0s7
sundev1.corp.walid.com:c0t0d0s7 overdue 11 days for level 0
setup_estimate: sundev1.corp.walid.com:c0t0d0s7: command 0, options:
    last_level 0 next_level0 -11 level_days 0
    getting estimates 0 (5101407) 1 (737663) -1 (-1)
setting up estimates for admin1.corp.walid.com:sda6
admin1.corp.walid.com:sda6 overdue 11 days for level 0
setup_estimate: admin1.corp.walid.com:sda6: command 0, options:
    last_level 1 next_level0 -11 level_days 2
    getting estimates 0 (93824) 1 (490) 2 (0)
setting up estimates for admin1.corp.walid.com:sda3
driver: pid 14160 executable /usr/local/pkg/amanda-2.4.1p1/libexec/driver version 2.4.1p1
driver: send-cmd time 0.022 to taper: START-TAPER 20001103
admin1.corp.walid.com:sda3 overdue 11 days for level 0
setup_estimate: admin1.corp.walid.com:sda3: command 0, options:
    last_level 1 next_level0 -11 level_days 2
    getting estimates 0 (6020) 1 (19) 2 (0)
setting up estimates for admin1.corp.walid.com:sda5
admin1.corp.walid.com:sda5 overdue 11 days for level 0
setup_estimate: admin1.corp.walid.com:sda5: command 0, options:
    last_level 1 next_level0 -11 level_days 2
    getting estimates 0 (1594953) 1 (3698) 2 (0)
setting up estimates for admin1.corp.walid.com:sda10
admin1.corp.walid.com:sda10 overdue 11 days for level 0
setup_estimate: admin1.corp.walid.com:sda10: command 0, options:
    last_level 0 next_level0 -11 level_days 0
    getting estimates 0 (4550177) 1 (0) -1 (-1)
setting up estimates for admin1.corp.walid.com:sda9
admin1.corp.walid.com:sda9 overdue 11 days for level 0
setup_estimate: admin1.corp.walid.com:sda9: command 0, options:
    last_level 1 next_level0 -11 level_days 2
    getting estimates 0 (58442) 1 (48454) 2 (0)
setting up estimates for admin1.corp.walid.com:sda2
taper: pid 14162 executable taper version 2.4.1p1
admin1.corp.walid.com:sda2
Re: I had enough holding disk but part of the dump failed -
> I think that's what's happening, a dump with est_kps larger than the
> bandwidth will never be scheduled to dump on holding disk.

I agree it looks like that could happen in principle (in fact, I wonder
if other such "don't do this now" states should be examined), but in
this particular case the estimated bandwidth for the two disks not
processed was well below the available.

> Jean-Louis

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]