Re: Why Oh Why only THIS DLE is giving me those timeout problems ?
On Wed, 31 Aug 2005, Steve Wray wrote: Geert Uytterhoeven wrote: On Tue, 30 Aug 2005, Graeme Humphries wrote: Guy Dallaire wrote: Yes, thanks. I know about hard links. But how would it impact the size or performance of my backups? Well, if a file is hard linked multiple times, it'll be backed up multiple times. Therefore, a filesystem with tons of hard links will take a really long time to back up. :) Fortunately tar is sufficiently smart to back it up only once. Usually the problem with lots of hard links is not the data timeout value, but the estimate timeout value, as I found out the hard way[*]. We've been having similar problems with estimates timing out. I just ran the 'find' command given in an earlier email and found a grand total of 607 hard links on the entire filesystem. What I'm wondering is, does 607 count as 'lots' WRT amanda estimate timeouts? Not really, given I have many files with more than 600 hard links. I seem to have 1582186 of them in my cluster of Linux kernel source trees. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED] In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say programmer or something like that. -- Linus Torvalds
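For anyone wanting to reproduce a count like the ones above, here is a rough sketch in Python. It is not the 'find' command from the earlier email (that command is not quoted in this message); the function name count_hard_linked is made up for illustration. It walks a tree and reports how many file names have a link count above 1, and how many distinct inodes those names refer to. Unlike a backup run it does not stop at filesystem boundaries.

```python
import os
import stat


def count_hard_linked(root):
    """Return (names, inodes) for regular files under root with st_nlink > 1.

    names  = directory entries whose inode has more than one link
    inodes = distinct (device, inode) pairs behind those entries
    """
    names = 0
    inodes = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            # lstat so that symlinks are not followed
            st = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                names += 1
                inodes.add((st.st_dev, st.st_ino))
    return names, len(inodes)
```

The second number is the one that matters for backup size if tar really stores each inode once, as Geert says; the first is closer to the amount of bookkeeping an estimate has to do.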
Re: This is retarded.
On 8/30/05, Joe Rhett [EMAIL PROTECTED] wrote: tracking ability, but let's use 3 tapes and write not a single byte to them? On Tue, Aug 30, 2005 at 06:35:12PM -0400, Jon LaBadie wrote: And not using samba, right Joe :)) Right. Straight amanda clients, some Solaris, some Linux, lots of FreeBSD, and some Windows. But all samba native clients 2.4.4p2 or later. The dumps were flushed to tapes svk17, svk18, svk19. The next 7 tapes Amanda expects to used are: svk20, svk21, svk01, svk02, svk03, svk04, svk05. Looks like runtapes is 7. Output Size (meg) 0.0 0.0 0.0 Nothing made it to any holding disk. This was a flush. It's already on the holding disk. taper: tape svk17 kb 34286880 fm 1 writing file: short write taper: retrying customer-plat1:/.1 on new tape: [writing file: short write] taper: tape svk18 kb 34227328 fm 1 writing file: short write taper: retrying customer-plat1:/.1 on new tape: [writing file: short write] taper: tape svk19 kb 0 fm 0 [OK] This looks like the dump was going directly to tape and was too large to fit your 35GB tape. So it was retried and of course was still too big. customer- / lev 1 FAILED 20050825 [too many taper retries] Yes. The item it is trying to flush is 53 1gb tar block files. du -ks /amandadump/20050825 53364786 20050825 This seems to be an improvement over what I would have expected. I recall amanda continuing through any and all runtapes tapes. It would have done 7 attempts in your config. At least now it stops after a couple of attempts. As to whether amanda's behavior is reasonable, well ... It is nearly impossible, perhaps totally impossible, to tell if the taping failed because of reaching the end of the tape or a tape or hardware error. Is having a backup important enough to continue and try again or should the first failure, possibly a bad/worn out tape terminate all the remaining backups? I don't think there is a simple answer. What would be your recommendation? I don't think this is the answer. 
The real answer is that the filesystem it is trying to flush was 53gb in size when you put all the chunks together. That won't fit on a 33gb tape. 1. Why aren't we backing up chunks to different tapes yet? Amanda is the only backup software which doesn't handle this. -and more importantly- 2. Why is it trying to back up 53gb to a 33gb tape definition? #2 is clearly a bug. #1 is a feature request long overdue, but #2 is clearly the bug. -- Joe Rhett senior geek meer.net
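To put numbers on point 1, here is a small sketch of what spanning holding-disk chunks across tapes would mean arithmetically. It is only an illustration: it is not how Amanda 2.4.x behaves, and the function name tapes_needed and the greedy in-order packing are assumptions made for the example.

```python
def tapes_needed(chunk_sizes_gb, tape_capacity_gb):
    """Greedy, in-order packing: start a new tape when the next chunk won't fit."""
    tapes = 0
    free = 0  # space left on the current tape; 0 means no tape loaded yet
    for size in chunk_sizes_gb:
        if size > tape_capacity_gb:
            raise ValueError("a single chunk is larger than one tape")
        if size > free:
            tapes += 1
            free = tape_capacity_gb
        free -= size
    return tapes


# 53 chunks of 1 GB against the 33 GB figure quoted above
print(tapes_needed([1] * 53, 33))
```

With whole chunks as the unit, the 53 x 1 GB dump from this thread would need two tapes of either 30 GB or 33 GB rather than failing on every tape it is offered.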
Re: This is retarded.
On Tue, Aug 30, 2005 at 06:01:07PM -0500, Frank Smith wrote: In order to tell if Amanda is deficient in some way, we need some more information. According to the original post, this was an amflush run, so this DLE must be sitting on a holding disk. Since normally Amanda refuses to back up a DLE that won't fit on a tape and gives a warning, I see two possibilities: 1. Your tapelength is set incorrectly in your config, so Amanda thinks a dump will fit when it won't. Tapelength is actually less than real tape length. We adjusted this well below the real length of DLT tapes some time ago to avoid amanda miscalculation errors. 2. For some reason the dump ended up larger than the estimate, perhaps due to a changing filesystem or using both H/W compression on already compressed data. We don't use compression anywhere. Never. None of our backup definitions have compression enabled. Is your tapelength set to something less than 34.2GB? Yes, it's set to 30gb. Much less than the real capacity. How big is the dump in the holding disk (not how big the chunks are, if there are more than one, but the total of all the chunks of that DLE)? 53 1gb chunks. With that information we could find where the problem lies and maybe find a solution. Some part of the code mistakenly decided to store 53gb to the holding disk, and then tried to flush it to tape, regardless of the fairly basic math involved. This works fine without holding disks, or when the holding disk is too small for the backup. The logic flaw must be somewhere in the "tape not available, store to holding disk for later" logic and/or the "flush this to tape" logic not checking the sizes involved. I am still strongly of the opinion that amanda's handling of DLEs is still its strongest failing point. We're hacking vainly at something which needs to be redesigned to work properly. -- Joe Rhett senior geek meer.net
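The "fairly basic math" can be written down as a pre-flush check. This is a sketch under stated assumptions, not Amanda's code: the function name fits_on_tape is invented, sizes are in KB, and 1 GB is taken as 1024 * 1024 KB.

```python
KB_PER_GB = 1024 * 1024  # assumption: binary gigabytes


def fits_on_tape(chunk_sizes_kb, tapelength_kb):
    """Return (fits, total_kb) for the sum of all holding-disk chunks of one DLE."""
    total_kb = sum(chunk_sizes_kb)
    return total_kb <= tapelength_kb, total_kb


# Figures from this thread: 53 chunks of about 1 GB, tapelength set to 30 GB.
fits, total_kb = fits_on_tape([KB_PER_GB] * 53, 30 * KB_PER_GB)
print(fits, total_kb - 30 * KB_PER_GB)  # whether it fits, and the excess in KB
```

Any check of this kind done before the flush starts would report that the image cannot fit, without loading a tape.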
dump larger than tape
Hi, I just got this in my daily report from Amanda (2.4.5-1, Debian etch/testing):
| FAILURE AND STRANGE DUMP SUMMARY:
|
| [...]
|
| host dle lev 5 FAILED [dump larger than tape, -1 KB, skipping incremental]
|
| [...]
|
| DUMP SUMMARY:
|
| [...]
|
| host dle 5 FAILED
|
| [...]
The funny thing is that this DLE is only about 5 GiB large (according to du), while the tapetype length is 2 mbytes. Anyone ever seen this before? Thx! Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED] In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say programmer or something like that. -- Linus Torvalds
extend chg-mtx
Hello, if the main difference between chg-mtx and chg-zd-mtx is only this: "The mtx program must support commands such as `-s', `-l' and `-u'. If the one you've got requires `status', `load' and `unload', you should use chg-zd-mtx instead", would it be possible to just add a variable like UseFullOption = yes or no, set StatusOption, LoadOption and UnloadOption to the right choice (-s or status, -l or load, -u or unload), and use them when the script calls mtx? Is it too simple? -- freedom - share - respect
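The proposal can be sketched in a few lines. The changer scripts themselves are shell; Python is used here only to show the idea. The variable names come from the message above, the function name mtx_options is made up, and nothing here claims that option spelling is the only difference between the two scripts.

```python
# Short option for each action, as quoted from the chg-mtx documentation above.
SHORT_OPTIONS = {"status": "-s", "load": "-l", "unload": "-u"}


def mtx_options(use_full_option):
    """Map StatusOption/LoadOption/UnloadOption to short or full-word spelling."""
    return {
        f"{action.capitalize()}Option": (action if use_full_option else short)
        for action, short in SHORT_OPTIONS.items()
    }


print(mtx_options(False))  # short style: -s, -l, -u
print(mtx_options(True))   # full-word style: status, load, unload
```

The script would then pass the chosen StatusOption, LoadOption or UnloadOption wherever it currently hard-codes one spelling.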
Re: Help restoring
On 2005-08-30 13:25:09 -0400, Jon LaBadie wrote: As to the other suggestions, my /dev/tape is symlinked to /dev/nst0, so is not rewinding. Typically 'tape' references st0 and 'ntape' references nst0. OTOH, on my install of CentOS 4, something (udev, which manages /dev on most new Linux-kernel-based OS distros) has automagically made the symlink from tape to nst0, not st0. Not saying it's right, or conventional in the wider ixoid world, just that that's what happened and that's what a fresh install of CentOS 4 (and therefore presumably RHEL4) will do. [EMAIL PROTECTED] ~]# grep -R tape /etc/udev/rules.d/ /etc/udev/rules.d/50-udev.rules:KERNEL=qft0, SYMLINK=ftape /etc/udev/rules.d/50-udev.rules:KERNEL=nst[0-9]*, SYMLINK=tape%e
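Since the meaning of /dev/tape evidently varies between installs, a quick check of where the symlink ends up may save a rewound tape. A minimal sketch, assuming the Linux st driver naming where nstN is the non-rewinding node and stN the rewinding one; the function name is_non_rewinding is made up.

```python
import os
import re


def is_non_rewinding(device_path):
    """True if device_path resolves to a Linux nstN (non-rewinding) tape node."""
    target = os.path.realpath(device_path)  # follows symlinks such as /dev/tape
    # nst0, nst1, ... optionally with a mode suffix letter such as nst0a
    return re.fullmatch(r"nst\d+[a-z]?", os.path.basename(target)) is not None


print(is_non_rewinding("/dev/tape"))
```

On the CentOS 4 box described above this should report True for /dev/tape; on a system following the older 'tape' to st0 convention it would report False.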
Re: This is retarded.
On Wed, Aug 31, 2005 at 12:41:43AM -0700, Joe Rhett wrote: Some part of the code mistakenly decided to store 53gb to the holding disk, and then tried to flush it to tape, regardless of the fairly basic math involved. And well, tonight, it does it again.
USAGE BY TAPE:
  Label       Time      Size      %    Nb
  svk17       0:00       0.0    0.0     0
  svk18       0:00       0.0    0.0     0
  svk19       1:32   15181.6   45.1    42
  svk20       0:00       0.0    0.0     0
Yeah, I didn't need those tapes for anything else. Let's just waste 12 hours and 3 tapes doing absolutely NOTHING. -- Joe Rhett senior geek meer.net
estimate server initial value?
This may be an RTFM question, but I can't see the answer in the amanda.conf man page (http://www.amanda.org/docs/amanda.conf.5.html): When using estimate server, is there a way to configure what the initial estimate of a disk entry is before there's any historical data? It looks like Amanda's defaulting to about 5MB, and I'd rather it defaulted to closer to 1GB. Graeme -- Graeme Humphries ([EMAIL PROTECTED]) (306) 955-7075 ext. 485 My views are not the views of my employers.
Re: This is retarded.
--On Wednesday, August 31, 2005 15:46:36 -0700 Joe Rhett [EMAIL PROTECTED] wrote: On Wed, Aug 31, 2005 at 12:41:43AM -0700, Joe Rhett wrote: Some part of the code mistakenly decided to store 53gb to the holding disk, and then tried to flush it to tape, regardless of the fairly basic math involved. And well, tonight, it does it again.
USAGE BY TAPE:
  Label       Time      Size      %    Nb
  svk17       0:00       0.0    0.0     0
  svk18       0:00       0.0    0.0     0
  svk19       1:32   15181.6   45.1    42
  svk20       0:00       0.0    0.0     0
Yeah, I didn't need those tapes for anything else. Let's just waste 12 hours and 3 tapes doing absolutely NOTHING. I'm still curious how a 53gb dump was even done if your tapelength was set to 30ish GB. Was your tapelength set to something longer at the time the dump was originally done? In the meantime, to quit burning tapes, set autoflush off (or is it false?) to let the old dump stay on disk while continuing your current backups and researching the problem. Or are you saying it did a new 53gb dump? Yes, I agree that Amanda shouldn't mark a tape as used if nothing was successfully written to it. You could edit your tapelist to 'unuse' a tape, just be aware you can screw things up if you get it wrong. Frank -- Joe Rhett senior geek meer.net -- Frank Smith [EMAIL PROTECTED] Sr. Systems Administrator Voice: 512-374-4673 Hoover's Online Fax: 512-374-4501
Re: planner timeouts
Charles Sprickman wrote: h13 (client) debug logs. Note that there is two-way communication, and everything seems to go correctly. In the debug dir, there are only amandad debug logs, nothing else. That doesn't sound right to me. There should be a sendbackup log file as well, a runtar one, and so on. Can you verify your inetd config on that particular client, to see whether there's something afoul? Have a look at the system logs as well, while you're at it. amanda might be unable to run any secondary programs, for instance.
GETTING ESTIMATES...
planner: time 30.956: error result for host h13.blah.com disk /spool: Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk /var/qmail/bin: Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk /var/qmail/control: Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk /var/db/pkg: Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk /usr/local/: Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk /home: Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk /: Request to h13.blah.com timed out.
planner: time 30.956: getting estimates took 30.811 secs
Does that spell a 30s timeout somewhere? amanda.conf not taken into account, perhaps? And the obligatory question, did you double-check that there's no firewall between that particular client and server? (If you did, triple-check. :-) ) Alex -- Alexander Jolk / BUF Compagnie tel +33-1 42 68 18 28 / fax +33-1 42 68 18 29
AIT-2 length specifier, et. al.
Hey, all. I've got an inherited Amanda system I'm managing, and we're using an AIT-2 tape drive / changer. I see the following:
define tapetype AIT-2 {
    comment "Generic AIT 2 Drive -- real world numbers"
    length 41000 mbytes
    filemark 1000 kbytes
    speed 2920 kps
}
Now, we've got data compression turned on on the tape drive, which should theoretically make more space, and we also have clients doing compression. Looking at the Faq-O-Matic, I see this: http://amanda.sourceforge.net/fom-serve/cache/439.html In short, the person entering that data got less space and speed with hardware compression enabled. They were using the same drive we use here. My questions are profligate, but I'll limit them to these: 1. Why would the length parameter end up being so much shorter with hardware compression enabled on that drive? This confuses me. 2. Am I to take it from the derived speed parameter that dumps will go quicker without hardware compression enabled? It seems strange that hardware compression would create that sort of impact, even when it's dealing with already-compressed data. I'll try to schedule some time to run amtapetype at work, but we're in the middle of a move, so I might not get the chance for a while, and if I can safely get more speed and space out of my tapes with such a small change, I'd like to do so. Thanks kindly and heart-feltedly in advance! -- Mason Loring Bliss [EMAIL PROTECTED] They also surf who awake ? sleep : dream; http://blisses.org/ only stand on waves.
Re: This is retarded.
On Wed, Aug 31, 2005 at 03:46:36PM -0700, Joe Rhett wrote: On Wed, Aug 31, 2005 at 12:41:43AM -0700, Joe Rhett wrote: Some part of the code mistakenly decided to store 53gb to the holding disk, and then tried to flush it to tape, regardless of the fairly basic math involved. And well, tonight, it does it again.
USAGE BY TAPE:
  Label       Time      Size      %    Nb
  svk17       0:00       0.0    0.0     0
  svk18       0:00       0.0    0.0     0
  svk19       1:32   15181.6   45.1    42
  svk20       0:00       0.0    0.0     0
Yeah, I didn't need those tapes for anything else. Let's just waste 12 hours and 3 tapes doing absolutely NOTHING. What's the saying? If you keep doing what you've been doing you'll keep getting what you've been getting! You know you have a DLE that is too big to tape and that amanda does not handle it well. Isn't it time to stop wasting time and tape and split the DLE? jl -- Jon H. LaBadie [EMAIL PROTECTED] JG Computing 4455 Province Line Road (609) 252-0159 Princeton, NJ 08540-4322 (609) 683-7220 (fax)
RE: This is retarded.
On Wed, Aug 31, 2005 at 03:46:36PM -0700, Joe Rhett wrote: And well, tonight, it does it again. USAGE BY TAPE: Label Time Size % Nb svk17 0:00 0.0 0.0 0 svk18 0:00 0.0 0.0 0 svk19 1:32 15181.6 45.1 42 svk20 0:00 0.0 0.0 0 Yeah, I didn't need those tapes for anything else. Let's just waste 12 hours and 3 tapes doing absolutely NOTHING. What's the saying? If you keep doing what you've been doing you'll keep getting what you've been getting! You know you have a DLE that is too big to tape and that amanda does not handle it well. Isn't it time to stop wasting time and tape and split the DLE? jl Ah, not splitting the DLE is retarded. The whining is a most productive use of bandwidth, however. I'd like to see a discussion of the code with the purported error. That would have some intellectual (as opposed to emotional) content
Re: This is retarded.
This one time, at band camp, Joe Rhett wrote: 1. Why aren't we backing up chunks to different tapes yet? Amanda is the only backup software which doesn't handle this. -and more importantly- 2. Why is it trying to back up 53gb to a 33gb tape definition? #2 is clearly a bug. #1 is a feature request long overdue, but #2 is clearly the bug. There's a patch from John Stange that I don't believe has been committed to CVS, but it takes care of splitting dumps and spanning tapes. Grep for it on the amanda-hackers list. Doesn't fix #2, but should take care of #1 for you.
Re: AIT-2 length specifier, et. al.
--On Wednesday, August 31, 2005 10:17:54 -0400 Mason Loring Bliss [EMAIL PROTECTED] wrote: Hey, all. I've got an inherited Amanda system I'm managing, and we're using an AIT-2 tape drive / changer. I see the following:
define tapetype AIT-2 {
    comment "Generic AIT 2 Drive -- real world numbers"
    length 41000 mbytes
    filemark 1000 kbytes
    speed 2920 kps
}
Now, we've got data compression turned on on the tape drive, which should theoretically make more space, and we also have clients doing compression. Generally not good to do both. Looking at the Faq-O-Matic, I see this: http://amanda.sourceforge.net/fom-serve/cache/439.html In short, the person entering that data got less space and speed with hardware compression enabled. They were using the same drive we use here. My questions are profligate, but I'll limit them to these: 1. Why would the length parameter end up being so much shorter with hardware compression enabled on that drive? This confuses me. Trying to compress already compressed data makes it larger. Some tape drives are smart enough to not compress the data again but AIT isn't one of them (at least up through AIT-3, not certain about -4). 2. Am I to take it from the derived speed parameter that dumps will go quicker without hardware compression enabled? It seems strange that hardware compression would create that sort of impact, even when it's dealing with already-compressed data. A tape drive can only write so many MB/sec. If the data is getting larger it has to write more (i.e. your 41GB is turning into maybe 46GB). I'll try to schedule some time to run amtapetype at work, but we're in the middle of a move, so I might not get the chance for a while, and if I can safely get more speed and space out of my tapes with such a small change, I'd like to do so. Pick either H/W or S/W compression, but don't do both.
If you decide to disable H/W compression on your drive, you probably need to look for Gene's recurring postings on the subject in the archives, as many drives detect that a tape was previously used compressed and will re-enable compression even if you think you've disabled it. With H/W compression off, you should be able to get close to 50GB on your AIT-2 drive and not 41GB. Frank Thanks kindly and heart-feltedly in advance! -- Mason Loring Bliss [EMAIL PROTECTED] They also surf who awake ? sleep : dream; http://blisses.org/ only stand on waves. -- Frank Smith [EMAIL PROTECTED] Sr. Systems Administrator Voice: 512-374-4673 Hoover's Online Fax: 512-374-4501
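Frank's point about speed is plain arithmetic, sketched below. The 2920 kps and 41000 mbytes figures are the tapetype values quoted above; the 46/41 expansion factor is only his "maybe 46GB" guess, and the function name write_seconds is made up for the example.

```python
def write_seconds(host_kb, native_kps, expansion=1.0):
    """Seconds to write host_kb of data when the drive inflates it by `expansion`.

    native_kps is what the drive physically writes per second; an expansion
    above 1.0 models hardware compression making compressed data bigger.
    """
    if native_kps <= 0 or expansion <= 0:
        raise ValueError("rate and expansion must be positive")
    return host_kb * expansion / native_kps


host_kb = 41000 * 1024                      # 41000 mbytes from the tapetype
plain = write_seconds(host_kb, 2920)        # H/W compression off
inflated = write_seconds(host_kb, 2920, 46 / 41)  # 41GB turning into ~46GB
print(round(plain / 3600, 2), round(inflated / 3600, 2))  # hours
```

The same expansion factor also explains the shorter measured length: if every 41 units sent by the host occupy 46 on tape, the tape fills after proportionally less host data.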
RE: Estimate timeout
Well, the tar command by itself is still running, but the backup with the new version of tar is complete, so my estimate timeout problem is fixed with an updated tar executable. Thank you all.
-Original Message- From: Joshua Baker-LePain [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 30, 2005 11:21 AM To: LaValley, Brian E Cc: Amanda (E-mail) Subject: Re: Estimate timeout
On Tue, 30 Aug 2005 at 11:01am, LaValley, Brian E wrote
sendsize: debug 1 pid 12359 ruid 548 euid 548: start at Mon Aug 29 18:00:02 2005
sendsize: version 2.4.4p2
sendsize[12359]: time 0.034: waiting for any estimate child: 1 running
sendsize[12361]: time 0.035: calculating for amname '/dev/vx/dsk/homedg/homevol', dirname '/home', spindle -1
sendsize[12361]: time 0.035: getting size via gnutar for /dev/vx/dsk/homedg/homevol level 0
sendsize[12361]: time 0.092: spawning /home/backup/amanda_sun/libexec/runtar in pipeline
sendsize[12361]: argument list: /opt/sfw/bin/gtar --create --file /dev/null --directory /home --one-file-system --listed-incremental /home/backup/amanda_sun/var/amanda/gnutar-lists/coneng_dev_vx_dsk_homedg_homevol_0.new --sparse --ignore-failed-read --totals --exclude-from /tmp/amanda/sendsize._dev_vx_dsk_homedg_homevol.20050829180002.exclude .
Run this command yourself on the command line (as root) and see how long it takes to complete. Also, what version of tar are you running? -- Joshua Baker-LePain Department of Biomedical Engineering Duke University
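Joshua's "see how long it takes" suggestion can be scripted so the result is directly comparable with the estimate timeout. A rough sketch: run_timed and exceeds_timeout are made-up helper names, and the timeout value you compare against is whatever your own amanda.conf uses for estimates (the etimeout setting, if memory serves).

```python
import subprocess
import sys
import time


def run_timed(argv):
    """Run argv, discard its output, and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    proc = subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return proc.returncode, time.monotonic() - start


def exceeds_timeout(elapsed_seconds, timeout_seconds):
    """True if a run of this length would not finish within the timeout."""
    return elapsed_seconds > timeout_seconds


if __name__ == "__main__":
    # Pass the command to time as arguments, e.g. the gtar argument list
    # copied from the sendsize debug file.
    code, elapsed = run_timed(sys.argv[1:] or [sys.executable, "-c", "pass"])
    print(f"exit {code}, {elapsed:.1f} s")
```

Running the old and the new tar binary through the same wrapper gives two numbers to compare, which is essentially what the fix reported above came down to.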
Re: samba backups
Here is a process list and strace output for smbclient and tar. You can see that it is stalled while opening a file. I modified the share name and file name for security reasons. regards, gregor
19432 ?        S   0:00 /bin/sh /usr/sbin/amdump tape
19442 ?        S   0:20 /usr/libexec/amanda/driver tape
19443 ?        S  21:31 taper tape
19444 ?        S  69:20 dumper0 tape
19445 ?        S   5:36 dumper1 tape
19446 ?        S   3:26 dumper2 tape
19447 ?        S  17:10 taper tape
24836 ?        S   0:00 /usr/libexec/amanda/sendbackup
24838 ?        S   0:00 /bin/gzip --best
26743 ?        S   0:01 /usr/libexec/amanda/sendbackup
26761 ?        S   0:00 pickup -l -t fifo -u
26747 ?        S   0:00 sed -e s/^\.//
26746 ?        S   0:00 /bin/tar -tf -
26745 ?        S   0:00 sh -c /bin/tar -tf - 2>/dev/null | sed -e 's/^\.//'
26744 ?        S   0:07 smbclient \\server\sharename -U username -E -d0 -Tqca
26778 pts/15   R   0:00 ps ax
[EMAIL PROTECTED] root]# strace -p 26744
Process 26744 attached - interrupt to quit
write(2, "NT_STATUS_ACCESS_DENIED opening "..., 134 <unfinished ...>
Process 26744 detached
[EMAIL PROTECTED] root]# strace -p 26745
Process 26745 attached - interrupt to quit
wait4(-1,
Process 26745 detached
[EMAIL PROTECTED] root]# strace -p 26746
Process 26746 attached - interrupt to quit
read(0, <unfinished ...>
Process 26746 detached
[EMAIL PROTECTED] root]# strace -p 26747
Process 26747 attached - interrupt to quit
read(0, <unfinished ...>
Process 26747 detached
[EMAIL PROTECTED] root]# strace -p 26743
Process 26743 attached - interrupt to quit
read(0, <unfinished ...>
Process 26743 detached
[EMAIL PROTECTED] root]# strace -p 24836
Process 24836 attached - interrupt to quit
write(2, "filename.txt (\\Data\\"..., 50 <unfinished ...>
Process 24836 detached