Re: amanda getting confused
On Thursday 10 January 2002 02:31 pm, Thomas Hepper wrote:
> To change this will take some time, i think you will hear from me tomorrow,

I hadn't heard, my cold isn't any better, and the backup schedule restarts tonight for the week. Here is what I've done, and which *appears* to have fixed this problem.

1. Download, configure, build and install the 20020111 snapshot, which didn't do anything for this problem. The problem being that when show or amcheck reads a tape label, it does not rewind the tape for the next pass. The eject/reload of a tape change will of course do this, but if amcheck finds a match, the tape was being left at block 64, where a label read by amdump then failed. IMO amdump really should have a rewind in front of its label reads, but I'm not sure where to put that.

2. Tracing through the sources, in tape-src/tapeio.c, in the function tapefd_readlabel(), I've added a call to tapefd_rewind(fd); above the tapefd_close(fd); call.

I can now run amtape /config/ show repeatedly, as well as amcheck /config/ repeatedly, without errors.

The question is: Is this automatic rewind in the tape-src/tapeio.c-tapefd_rdlabel(fd) function going to mess with anything else? IMO, not knowing all the fine points of how amanda is organised, it (to me) doesn't make any sense to do a readlabel without rewinding for the next readlabel; in most cases it will come as a request from amdump to make sure it's writing to the correct tape.

So the question then is: will amdump then overwrite the tape's label after it reads it, gets a good one, rewinds to write the fresh one, writes it, rewinds to check it (maybe, I don't know if it does), with the end result being an overwritten label, overwritten by the dump's contents, because the tape is NOT going to be left at block 64 after a tapefd_rdlabel(fd) as it was prior to this patch...

Thomas, anybody else care to comment here?

-- 
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
98.3+% setiathome rank, not too shabby for a hillbilly
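The positioning problem described in this post can be illustrated with a small model. This is only a sketch under stated assumptions: `struct fake_tape`, `fake_tape_rewind()` and `fake_tape_read_label()` are hypothetical stand-ins, not Amanda functions, and the value 64 is simply the block number reported in the post. The model assumes a label can only be read with the head at block 0 and that a successful read leaves the head past the label.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a tape drive; not Amanda code. */
struct fake_tape {
    long block;         /* current head position */
    const char *label;  /* label stored at the start of the tape */
};

/* Head position after a label read, as reported in the post. */
#define LABEL_END_BLOCK 64L

/* Move the head back to the beginning of the tape. */
void fake_tape_rewind(struct fake_tape *t)
{
    t->block = 0;
}

/*
 * Copy the label into buf and return 0 if the head is at block 0.
 * Return -1 if the head is anywhere else. A successful read leaves
 * the head at LABEL_END_BLOCK, so a second read fails until a rewind.
 */
int fake_tape_read_label(struct fake_tape *t, char *buf, size_t buflen)
{
    assert(t != NULL && buf != NULL);
    if (t->block != 0 || buflen == 0)
        return -1;
    strncpy(buf, t->label, buflen - 1);
    buf[buflen - 1] = '\0';
    t->block = LABEL_END_BLOCK;
    return 0;
}
```

In this model, two label reads in a row fail on the second call unless `fake_tape_rewind()` is called in between, which is the amcheck-then-amdump sequence described above.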
Re: amanda getting confused
On Sunday 13 January 2002 01:54 pm, Thomas Hepper wrote:
> Hi Gene
>
> On Sun, Jan 13, 2002 at 12:17:27PM -0500, Gene Heskett wrote:
>> On Thursday 10 January 2002 02:31 pm, Thomas Hepper wrote:
>>> To change this will take some time, i think you will hear from me tomorrow,
>>
>> I hadn't heard, my cold isn't any better, and the backup schedule restarts tonight for the week. Here is what I've done, and which *appears* to have fixed this problem.
>
> Sorry, but we had a crash of one database system at my work which took all my time (including free time at home). It was/is important that the system will stay online for the weekend, so most of my time this weekend is used by looking at this system. Sorry, but work is more important.

Of course!

> I will read your posts and if nothing bad happens you will hear from me on Monday.

Well, I've been thinking on this, and it makes more sense to do the rewind before the label read, therefore leaving the tape at block 64 when done. The incorrect assumption that the tape actually has been rewound when the tape_rdlabel(fd); is attempted seems to be the problem. So I will interchange those two statements in the tapeio.c sources, which will then do a dummy rewind with no errors returned before reading the label, and will leave the tape in the same position when done as before, e.g. at block 64.

To refresh the users-list, in tapeio.c, the function tape_rdlabel() now looks like this:

    char *
    tape_rdlabel(devname, datestamp, label)
        char *devname;
        char **datestamp;
        char **label;
    {
        int fd;
        char *r = NULL;

        if((fd = tape_open(devname, O_RDONLY)) < 0) {
            r = errstr = newvstralloc(errstr,
                                      "tape_rdlabel: tape open: ",
                                      devname,
                                      ": ",
                                      strerror(errno),
                                      NULL);
        } else
            tapefd_rewind(fd); /* a dummy rewind to make sure it is, by G. Heskett */
        if(tapefd_rdlabel(fd, datestamp, label) != NULL) {
            r = errstr;
        }
        if(fd >= 0) {
            tapefd_close(fd);
        }
        return r;
    }

That code is now building, and will be tested by seeing if an error can be induced by doing a dd read of the label before starting an amcheck or amshow.

Now here is an odd one: I did that editing as amanda, but when I make it, I get all sorts of questions from the compiler about changing the perms from 0644, so I've been exiting to root, and doing a chown -R amanda:amanda * in the top sourcedir. But that time I just changed the whole damned system because I didn't note the path bash printed when I exited user amanda. This will take a while to sort, and frankly, I'd druther sort rattlesnakes. They're more predictable.

And my test of running dd to read the label in front of an amcheck now works with or without the dd first.

Comments please!

-- 
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
98.3+% setiathome rank, not too shabby for a hillbilly
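One thing that stands out when reading the function as posted: because the rewind sits in a bare else branch, the label read that follows appears to run even when the open failed. The sketch below shows one way to keep the rewind and the label read together on the success path. It is a model only. Every function here (`stub_open`, `stub_rewind`, `stub_rdlabel`, `stub_close`, `read_label_guarded`) is a hypothetical stub that just counts calls, not Amanda's tape layer.

```c
#include <stddef.h>

/* Hypothetical call counters; the stubs below only record that they ran. */
struct stub_calls {
    int rewinds;
    int label_reads;
    int closes;
};

/* Pretend open: returns -1 on failure, a dummy descriptor otherwise. */
static int stub_open(int should_fail) { return should_fail ? -1 : 3; }
static void stub_rewind(struct stub_calls *c) { c->rewinds++; }
static int stub_rdlabel(struct stub_calls *c) { c->label_reads++; return 0; }
static void stub_close(struct stub_calls *c) { c->closes++; }

/*
 * Returns NULL on success or a static error string on failure.
 * The rewind, the label read and the close are all inside the
 * braces of the success branch, so none of them runs after a
 * failed open.
 */
const char *read_label_guarded(int open_should_fail, struct stub_calls *c)
{
    const char *r = NULL;
    int fd = stub_open(open_should_fail);

    if (fd < 0) {
        r = "tape open failed";
    } else {
        stub_rewind(c);               /* position at start of tape first */
        if (stub_rdlabel(c) != 0)
            r = "label read failed";
        stub_close(c);
    }
    return r;
}
```

With a failed open, none of the counters move; with a successful open, each stub runs exactly once.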
Re: amanda getting confused
On Sunday 13 January 2002 03:40 pm, you wrote:
> On Sunday 13 January 2002 12:17 pm, you wrote:
>> On Thursday 10 January 2002 02:31 pm, Thomas Hepper wrote:
>>> To change this will take some time, i think you will hear from me tomorrow,
>>
>> I hadn't heard, my cold isn't any better, and the backup schedule restarts tonight for the week. Here is what I've done, and which *appears* to have fixed this problem.
>>
>> 1. Download, configure, build and install the 20020111 snapshot, which didn't do anything for this problem. The problem being that when show or amcheck reads a tape label, it does not rewind the tape for the next pass. The eject/reload of a tape change will of course do this, but if amcheck finds a match, the tape was being left at block 64, where a label read by amdump then failed. IMO amdump really should have a rewind in front of its label reads, but I'm not sure where to put that.
>>
>> 2. Tracing through the sources, in tape-src/tapeio.c, in the function tapefd_readlabel(), I've added a call to tapefd_rewind(fd); above the tapefd_close(fd); call.
>
> The above function is misnamed, it should read: 'tape_rdlabel('. My bad. See a later post for how I have it now, including a c/p of the whole function as it's running *right now*.
>
>> I can now run amtape /config/ show repeatedly, as well as amcheck /config/ repeatedly, without errors.
>>
>> The question is: Is this automatic rewind in the tape-src/tapeio.c-tapefd_rdlabel(fd) function going to mess with anything else? IMO, not knowing all the fine points of how amanda is organised, it (to me) doesn't make any sense to do a readlabel without rewinding for the next readlabel; in most cases it will come as a request from amdump to make sure it's writing to the correct tape.
>> So the question then is: will amdump then overwrite the tape's label after it reads it, gets a good one, rewinds to write the fresh one, writes it, rewinds to check it (maybe, I don't know if it does), with the end result being an overwritten label, overwritten by the dump's contents, because the tape is NOT going to be left at block 64 after a tapefd_rdlabel(fd) as it was prior to this patch...
>>
>> Thomas, anybody else care to comment here?

Ok, I rebuilt it with the rewind call in front of the rdlabel, and it fails for a run of amtape /config/ show on the tape already loaded! I'm looking at the error line:

    amtape: scanning all 3 slots in tape-changer rack:
    slot 0: rewinding tape: Input/output error
    slot 1: date 20011222 label DailySet1-02
    slot 2: date 20011222 label DailySet1-03

So obviously, amtape must use a different access method to read the labels. Does this not need to be a common method? Back to the drawing board I guess. And my former method.

Stupid Q: Why is a rewind returned as an i/o error when the tape is already rewound? It really should just return success, setting the bot = TRUE flag if it's used.

Watching the LEDs on the drive, a rewind IS being done by amcheck before it reads the label.

Aha! After amcheck is done, finding the tape it wants, a dd silently returns nothing!

    $ dd if=/dev/nst0 count=1
    0+0 records in
    0+0 records out
    $

But repeats are getting compressed data, and finally an advisory about how to restore in an emergency. Neat.

Now for a run of amtape show, which rewinds the tape, makes no read attempt at all, and reports an i/o error while rewinding the tape, but waits about 30 seconds or more to report that on the first, already loaded tape. And it makes no attempt to rewind the next tape, nor does it rewind the next one before the read of the label. So amtape must have its own idea of how to do it. ammt returns no such errors for repeat runs.
The odd thing, and maybe I'm getting macular degeneration or something, but I cannot find anyplace in the tapeio.c file where it actually issues a 'rewind'; all these errors would appear to be coming from bogus slot numbers and such. And this has been holding me online for several hours now, so I'm gonna hit send and go away.

-- 
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
98.3+% setiathome rank, not too shabby for a hillbilly
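The "Stupid Q" in this message, that a rewind on an already rewound tape should simply succeed, can be modelled as a thin wrapper that asks for the current position first. This is a sketch of the requested behaviour only, and makes no claim about why the real driver returned an I/O error. All names (`struct drive_ops`, `rewind_if_needed`, `struct fake_drive`, `fake_tell`, `fake_rewind`) are hypothetical, and the fake drive is deliberately written to fail when asked to rewind at block 0.

```c
#include <stddef.h>

/* Hypothetical drive interface; the callbacks stand in for driver calls. */
struct drive_ops {
    int (*tell)(void *ctx, long *block);  /* 0 on success */
    int (*rewind)(void *ctx);             /* 0 on success */
};

/* Rewind only when the head is not already known to be at block 0. */
int rewind_if_needed(const struct drive_ops *ops, void *ctx)
{
    long block;

    if (ops->tell(ctx, &block) == 0 && block == 0)
        return 0;             /* already at the beginning of the tape */
    return ops->rewind(ctx);  /* position unknown or non-zero */
}

/* Fake drive for illustration: rewinding at block 0 reports an error. */
struct fake_drive {
    long block;
    int rewind_calls;
};

int fake_tell(void *ctx, long *block)
{
    *block = ((struct fake_drive *)ctx)->block;
    return 0;
}

int fake_rewind(void *ctx)
{
    struct fake_drive *d = ctx;

    d->rewind_calls++;
    if (d->block == 0)
        return -1;            /* models the complaint in the message */
    d->block = 0;
    return 0;
}
```

Called on a fake drive at block 0, the wrapper returns success without touching `fake_rewind()`; called at block 64, it issues exactly one rewind.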
Re: amanda getting confused
On Sunday 13 January 2002 05:28 pm, Gene Heskett wrote:

Yeah, he wrote all that, sorta like I can't believe I ate the whole thing! So lemme snip hugely:

> The odd thing, and maybe I'm getting macular degeneration or something, but I cannot find anyplace in the tapeio.c file where it actually issues a 'rewind', all these errors would appear to be coming from bogus slot numbers and such. And this has been holding me online for several hours now so I'm gonna hit send and go away.

And I still don't have a mental image of the connection between this code, and actually telling the flippin drive to DO a %$#@ rewind!

However, this is what works here, 100%, like this:

    [amanda@gene amanda-2.4.3b2-20020111]$ dd if=/dev/nst0 bs=32k count=1 amtape DailySet1 show amcheck DailySet1
    AMANDA: TAPESTART DATE 20011221 TAPE DailySet1-01
    1+0 records in
    1+0 records out
    amtape: scanning all 3 slots in tape-changer rack:
    slot 0: date 20011221 label DailySet1-01
    slot 1: date 20011222 label DailySet1-02
    slot 2: date 20011222 label DailySet1-03
    Amanda Tape Server Host Check - Holding disk /dumps: 30895900 KB disk space available, using 28798748 KB
    amcheck-server: slot 2: date 20011222 label DailySet1-03 (active tape)
    amcheck-server: slot 0: date 20011221 label DailySet1-01 (exact label match)
    NOTE: skipping tape-writable test
    Tape DailySet1-01 label ok
    Server check took 177.929 seconds
    Amanda Backup Client Hosts Check
    Client check: 1 host checked in 0.144 seconds, 0 problems found
    (brought to you by Amanda 2.4.3b2-20020111)
    You have new mail in /var/spool/mail/root
    [amanda@gene amanda-2.4.3b2-20020111]$ mt -f /dev/nst0 tell
    At block 0.
    [amanda@gene amanda-2.4.3b2-20020111]$ amcheck DailySet1
    Amanda Tape Server Host Check - Holding disk /dumps: 30895800 KB disk space available, using 28798648 KB
    amcheck-server: slot 0: date 20011221 label DailySet1-01 (exact label match)
    NOTE: skipping tape-writable test
    Tape DailySet1-01 label ok
    Server check took 34.872 seconds
    Amanda Backup Client Hosts Check
    Client check: 1 host checked in 0.133 seconds, 0 problems found
    (brought to you by Amanda 2.4.3b2-20020111)
    [amanda@gene amanda-2.4.3b2-20020111]$

And I have uparrowed the prompt and repeated that 4 times.

And here is the modified tape_rdlabel() routine in tapeio.c:

    char *
    tape_rdlabel(devname, datestamp, label)
        char *devname;
        char **datestamp;
        char **label;
    {
        int fd;
        char *r = NULL;

        if((fd = tape_open(devname, O_RDONLY)) < 0) {
            r = errstr = newvstralloc(errstr,
                                      "tape_rdlabel: tape open: ",
                                      devname,
                                      ": ",
                                      strerror(errno),
                                      NULL);
        } else if(fd >= 0)
            tapefd_rewind(fd); /* a dummy rewind to make sure it is, by G.eneHeskett. Doesn't always work. . . ): */
        if(tapefd_rdlabel(fd, datestamp, label) != NULL) {
            r = errstr;
        }
        if(fd >= 0) {
            tapefd_rewind(fd); /* but this one does! */
            tapefd_close(fd);
        }
        return r;
    }

I'm sure I'll find out when amdump runs at 1am, if this is actually workable. If it does, you _will_ hear me cheering from there, wherever there is, at whatever local time I check. I'm gmt -5:00 here in WV. :-)

-- 
Cheers, gene
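On the question of how code ends up telling the drive to rewind: on Linux, the SCSI tape driver accepts tape operations through `ioctl()` requests declared in `<sys/mtio.h>`. The sketch below shows a rewind (`MTIOCTOP` with `MTREW`) and a position query (`MTIOCPOS`, which is what `mt tell` reports). The wrapper names `st_rewind` and `st_tell` are made up for illustration, and I have not checked whether Amanda's tapefd_rewind() is implemented this way; treat it as a Linux-only example, not a description of tapeio.c.

```c
#include <sys/ioctl.h>
#include <sys/mtio.h>

/*
 * Ask the tape driver to rewind the tape open on fd.
 * Returns 0 on success, -1 on failure with errno set by ioctl().
 */
int st_rewind(int fd)
{
    struct mtop op;

    op.mt_op = MTREW;   /* rewind operation */
    op.mt_count = 1;    /* count field; a single rewind */
    return ioctl(fd, MTIOCTOP, &op);
}

/*
 * Store the current block number in *block.
 * Returns 0 on success, -1 on failure with errno set by ioctl().
 */
int st_tell(int fd, long *block)
{
    struct mtpos pos;

    if (ioctl(fd, MTIOCPOS, &pos) < 0)
        return -1;
    *block = pos.mt_blkno;
    return 0;
}
```

A caller would open the non-rewinding device (for example /dev/nst0, as used in this thread) and pass the descriptor to these helpers. On a descriptor that is not valid, both return -1.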