Backing up mail spools (was Re: How do I deal with STRANGE backups ?)
On Monday, 22.08.2005 at 22:53 -0400, Jason 'XenoPhage' Frisvold wrote:

> [ ... some are new mail messages coming in ... ]

I've often wondered about this. I back up the mail spools in /var/mail of the appropriate server and occasionally get a 'STRANGE' from it because one of the spools changes during the course of the backup job.

What is considered the Correct Way to handle backing up files which are so dynamic? Our mail spools are pretty quiet at the time the backup runs, but some of you out there must have more traffic than us ...

My thoughts:

1. Just put up with it: once in a while, spools that are changing will result in a backup which is probably not of any use. This is probably fine, unless there is a large amount of mail coming in at the time the backup runs;

2. Make a policy decision not to back up mail spools at all. There are often reasons for making such a policy, although it's not something that we do;

3. Copy in a robust fashion from the mail spools to a temporary location prior to the backup job, so that these copies of the spools will not change; then back up the copies rather than the 'live' spools. The robust fashion would work in a similar way to how mail spool locking operates when appending/deleting messages.

If option #3, has anyone actually done that? How?

Dave.

--
Dave Ewart [EMAIL PROTECTED]
Computing Manager, Cancer Epidemiology Unit
Cancer Research UK / Oxford University
PGP: CC70 1883 BD92 E665 B840 118B 6E94 2CFD 694D E370
N 51.7518, W 1.2016
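Option 3 could be sketched roughly as follows. This is only an illustration, not something from the thread: it assumes the spools are mbox files and that the local mail delivery agent honours the same locks that Python's standard `mailbox` module takes (a dot lock plus `flock()`/`lockf()` where available). The function name `copy_spool_locked` and the staging directory are hypothetical.

```python
import mailbox
import shutil
from pathlib import Path


def copy_spool_locked(spool: Path, dest_dir: Path) -> Path:
    """Copy one mbox spool file to dest_dir while holding the mailbox lock.

    Assumption: whatever delivers mail to `spool` respects the same
    locking scheme, otherwise the lock protects nothing.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / spool.name
    box = mailbox.mbox(str(spool), create=False)
    box.lock()  # raises mailbox.ExternalClashError if someone else holds it
    try:
        shutil.copy2(spool, target)  # copy contents and timestamps
    finally:
        box.unlock()
        box.close()
    return target
```

The backup job would then point at the staging directory instead of /var/mail. The cost is extra disk space for the copies and a short delay for deliveries that arrive while a spool is locked.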
Re: Problem with backup of windows shares
>> I have a problem with the backup of two Windows shares. At the beginning, everything works fine: I do a full backup of the shares (they have the same contents). Afterwards, I add a 255 KB file to both shares. I do an incremental backup and I've got a problem: one of the shares backs up just the file and the other backs up the whole share. But both backups go up to level 1. Why don't I get an incremental backup for both shares?
>>
>> The debug file sendsize.XXX.debug seems very strange to me. It mentions level 2 backups. But we know that a Windows share can only go up to level 1.
>
> This is mostly guesswork on my part without delving into the actual mechanics of the code.
>
> Remember that the server is not backing up a Windows share directly. It is backing up a unix/linux client that happens to have access to a Windows share. It is the client that makes the distinction between local DLEs and remote DLEs. So the planner and estimator on the server might ask the reasonable question (reasonable to ask of a unix/linux client): "if I do a level 0, level 1, and level 2 backup of this DLE, what do you think the sizes will be?"
>
> I note in your logs that the client has returned the same size for both level 1 and level 2 (6804 KB). Perhaps the client code realizes that a true level 2 cannot be done on a PC share, and rather than returning an error ("can't be done"), simply reports the same value. This seems to me to be a reasonable response, as the planner will choose the higher level 1 over the level 2 if they are the same size.

Thanks a lot for your explanation of how a Windows share backup works. I understand that the server backs up a Linux client which has access to the Windows share. My client distinguishes between local DLEs and remote DLEs. The remote DLEs are:

unix client //pc windows/share dumptype

After that, the login and the password are taken from the amandapass file. There is no problem with that.

The problem is that I have the same data on the two shares and the backups are different. The level 0 is the same but the level 1 is different, even though I added the same file to both shares.

//Yoann.neotip/backup   level 1:  255 KB
//Yoann.neotip/backup1  level 1: 6804 KB

The backup of the second share is a full one. Why?

Regards.

Yoann TANGUY
Student at ESINSA Sophia-Antipolis
Changing tape label in tapelist file
My coworker has labelled a tape with an incorrect label (he entered the wrong sequence number). Now my tapelist looks something like this:

20050822 backup08 reuse
20050819 backup07 reuse
20050818 backup06 reuse
20050817 backup05 reuse
20050805 backup04 reuse
20050804 backup99 reuse   - this is the tape with the incorrect label
20050803 backup02 reuse
20050802 backup01 reuse
20050801 backup00 reuse
20050729 backup19 reuse
20050728 backup18 reuse
20050727 backup17 reuse
20050726 backup16 reuse
20050725 backup15 reuse
20050722 backup14 reuse
20050721 backup13 reuse
20050720 backup12 reuse
20050719 backup11 reuse
20050718 backup10 reuse
20050715 backup09 reuse

What's the correct procedure to replace 'backup99' with 'backup03'? I suppose that if I edit the tapelist file, amanda can't recognise the tape correctly because the label on the tape itself is 'backup99'...

Thanks for your help.

Giovanni
Re: Backing up mail spools (was Re: How do I deal with STRANGE backups ?)
Dave Ewart wrote:

> 3. Copy in a robust fashion from the mail spools to a temporary location prior to the backup job, so that these copies of the spools will not change; then backup the copies rather than the 'live' spools. The robust fashion would work in a similar way to how locking mail spools operates when appending/deleting messages.

I use filesystem snapshots. Taking a snapshot is only a matter of seconds. I make a snapshot a few minutes before the amanda backup starts. Amanda then makes a backup of that snapshot instead.

Your OS has to support it, however. Solaris 2.8 (plain 2.8 needs patches) can do it. Linux with lvm1 can do it too; Linux with lvm2 is not yet stable enough for doing snapshots. (I have lvm2 snapshots working on one system without problems, but on other systems it makes the computer crash; maybe related to the amount of memory and/or system load: the system where it does work has lots of RAM and is very quiet at night.)

Mail me for the scripts to create a snapshot if you're interested.

--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  [EMAIL PROTECTED]
***********************************************************************
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e, kill -1 $$,  shutdown,  *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ...  Are you sure?  ...   YES   ...   Phew ...   I'm out            *
***********************************************************************
Re: Changing tape label in tapelist file
Montagni, Giovanni wrote:

> My coworker have lebelled a tape with an incorrect label (he has inserted an incorrect progressive number). Now, my tapelist is something like this:
>
> 20050822 backup08 reuse
> 20050819 backup07 reuse
> 20050818 backup06 reuse
> 20050817 backup05 reuse
> 20050805 backup04 reuse
> 20050804 backup99 reuse   - this is the tape with incorrect label
> 20050803 backup02 reuse
> 20050802 backup01 reuse
> 20050801 backup00 reuse
> 20050729 backup19 reuse
> 20050728 backup18 reuse
> 20050727 backup17 reuse
> 20050726 backup16 reuse
> 20050725 backup15 reuse
> 20050722 backup14 reuse
> 20050721 backup13 reuse
> 20050720 backup12 reuse
> 20050719 backup11 reuse
> 20050718 backup10 reuse
> 20050715 backup09 reuse
>
> What's the correct procedure to replace 'backup99' with 'backup03' ? I suppose if i edit tapelist file amanda can't recognise the tape correctly because on tape the label is 'backup99'...

Wait for the day that tape backup99 will be used. Then do:

    amrmtape Config backup99
    amlabel -f Config backup03

--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  [EMAIL PROTECTED]
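Before relabelling, it can help to confirm which label is out of sequence and which one is missing. The following is a minimal sketch, not an Amanda tool: it assumes each tapelist line has the form "date label reuse" as shown in Giovanni's post, and that the intended label set is backup00 through backup19 (inferred from the listing, so treat `prefix` and `count` as assumptions).

```python
# Tapelist contents copied from the post above.
TAPELIST = """\
20050822 backup08 reuse
20050819 backup07 reuse
20050818 backup06 reuse
20050817 backup05 reuse
20050805 backup04 reuse
20050804 backup99 reuse
20050803 backup02 reuse
20050802 backup01 reuse
20050801 backup00 reuse
20050729 backup19 reuse
20050728 backup18 reuse
20050727 backup17 reuse
20050726 backup16 reuse
20050725 backup15 reuse
20050722 backup14 reuse
20050721 backup13 reuse
20050720 backup12 reuse
20050719 backup11 reuse
20050718 backup10 reuse
20050715 backup09 reuse
"""


def parse_tapelist(text):
    """Return (date, label, flag) tuples, skipping malformed lines."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


def missing_and_stray(entries, prefix="backup", count=20):
    """Compare the labels seen with the expected prefix00..prefixNN set."""
    expected = {f"{prefix}{n:02d}" for n in range(count)}
    seen = {label for _, label, _ in entries}
    return sorted(expected - seen), sorted(seen - expected)


missing, stray = missing_and_stray(parse_tapelist(TAPELIST))
print("missing:", missing)  # missing: ['backup03']
print("stray:", stray)      # stray: ['backup99']
```

The check only reads the file; the actual fix is still the amrmtape/amlabel pair given above.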
Question about data timeout.
I have recently added a set of disks (file systems) to my back-up set and that ended up with a failure due to data timeout. I didn't even know there was a dtimeout value to be specified in amanda.conf. I have learnt that it is an idle time measured against the disks in question. My question is now, how is this idle time measured and where is it reported? Only by knowing what amanda sees of the idle time am I able to specify a reasonable dtimeout value. -- Regards, Erik P. Olsen
Re: Backing up mail spools (was Re: How do I deal with STRANGE backups ?)
On Tuesday, 23.08.2005 at 11:01 +0200, Paul Bijnens wrote:

>> 3. Copy in a robust fashion from the mail spools to a temporary location prior to the backup job, so that these copies of the spools will not change; then backup the copies rather than the 'live' spools. The robust fashion would work in a similar way to how locking mail spools operates when appending/deleting messages.
>
> I use filesystem snapshots. Taking a snapshot is only a matter of seconds. I make a snapshot a few minutes before the amanda backup starts. Amanda then makes a backup of that snapshot instead.
>
> Your OS has to support it, however. Solaris 2.8 (plain 2.8 needs patches) can do it. Linux with lvm1 can do it too; Linux with lvm2 is not yet stable enough for doing snapshots. (I have lvm2 snapshots working on one system without problems, but on other systems it makes the computer crash; maybe related to the amount of memory and/or system load: the system where it does work has lots of RAM and is very quiet at night.)
>
> Mail me for the scripts to create a snapshot if you're interested.

The server in question is a fairly generic Debian/Sarge mail server, RAID setup for disks, 2.6 kernel. And /var/mail is on its own ext3 partition.

You mention that it works with LVM. Do snapshots *require* LVM?

Dave.

--
Dave Ewart [EMAIL PROTECTED]
Computing Manager, Cancer Epidemiology Unit
Cancer Research UK / Oxford University
PGP: CC70 1883 BD92 E665 B840 118B 6E94 2CFD 694D E370
N 51.7518, W 1.2016
Re: Backing up mail spools (was Re: How do I deal with STRANGE backups ?)
Dave Ewart wrote:

> The server in question is a fairly generic Debian/Sarge mail server, RAID setup for disks, 2.6 kernel. And /var/mail is on its own ext3 partition.
>
> You mention that it works with LVM. Do snapshots *require* LVM?

I create snapshots with "lvcreate --snapshot ...", so, yes indeed, you need LVM (and free space in the volume group!).

2.6 kernels probably work with lvm2 and device-mapper, so that currently falls in the "buggy" category. I'm not certain what Debian/Sarge uses.

--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  [EMAIL PROTECTED]
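Paul's own scripts are not part of the thread, so the following is only a guess at the general shape of a snapshot wrapper: it builds the commands to run before and after the backup without executing them. The volume group `vg0`, logical volume `mail`, snapshot name, size and mount point are all hypothetical, and the approach only applies when the filesystem already lives on an LVM logical volume (which is not the case for a plain ext3 partition).

```python
import shlex


def snapshot_plan(vg, lv, snap="amsnap", size="1G", mountpoint="/mnt/amsnap"):
    """Return (before, after) command lists for a snapshot-based backup.

    Nothing is executed here; the caller decides how and when to run them.
    """
    origin = f"/dev/{vg}/{lv}"
    snapdev = f"/dev/{vg}/{snap}"
    before = [
        # Needs free extents in the volume group to hold changed blocks.
        ["lvcreate", "--snapshot", "--size", size, "--name", snap, origin],
        ["mount", "-o", "ro", snapdev, mountpoint],
    ]
    after = [
        ["umount", mountpoint],
        ["lvremove", "-f", snapdev],
    ]
    return before, after


before, after = snapshot_plan("vg0", "mail")
for cmd in before + after:
    print(shlex.join(cmd))
```

The disklist entry would then name the snapshot mount point rather than the live filesystem, and the "after" commands would run once the backup has finished. `shlex.join` requires Python 3.8 or later.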
Re: Backing up mail spools (was Re: How do I deal with STRANGE backups ?)
On Tue, Aug 23, 2005 at 09:16:03AM +0100, Dave Ewart wrote:

> On Monday, 22.08.2005 at 22:53 -0400, Jason 'XenoPhage' Frisvold wrote:
>
>> [ ... some are new mail messages coming in ... ]
>
> I've often wondered about this. I back up the mail spools in /var/mail of the appropriate server and occasionally get a 'STRANGE' from it because one of the spools changes during the course of the backup job.
>
> What is considered the Correct Way to handle backing up files which are so dynamic? Our mail spools are pretty quiet at the time the backup runs, but some of you out there must have more traffic than us ...
>
> My thoughts:
>
> 1. Just put up with it: spools that are changing will result in a backup which is probably not of any use once in a while. This is probably fine, unless there is a large amount of mail coming in at the time the backup runs;

Just one point on this. I don't think you get a backup that is useless. It is my understanding that the backup is still ok; the problem is that specific files within the backup are questionable.

--
Jon H. LaBadie                  [EMAIL PROTECTED]
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)
Re: Problem with backup of windows shares
On Tue, Aug 23, 2005 at 10:23:56AM +0200, tanguy yoann wrote:

> After, the login and the password are take in the file amandapass. There is no problem on that.
>
> The problem is that I have the same data on the two shares and the backup are different. The level 0 is the same but the level 1 is different. However, I add the same file to the two shares.
>
> //Yoann.neotip/backup   level 1:  255 KB
> //Yoann.neotip/backup1  level 1: 6804 KB
>
> The backup of the second share is full. Why?

Seven MB, small share. Likely for testing?

I'm probably going to show my great ignorance of PCs here, but what the heck.

When long ago I was looking at the code or the docs, something I recall was that samba (or amanda) uses the PC file system's "archive bit" to determine if a file needs an incremental or not. The dependency on this two-state bit is why you can't do levels of incrementals.

Perhaps on the one system something other than amanda/samba prevents (or causes) this archive bit to flip. That would be analogous to touching a file on a unix system and making the file look like it needs backup.

--
Jon H. LaBadie                  [EMAIL PROTECTED]
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)
Re: Question about data timeout.
On Tue, Aug 23, 2005 at 11:19:59AM +0200, Erik P. Olsen wrote:

> I have recently added a set of disks (file systems) to my back-up set and that ended up with a failure due to data timeout. I didn't even know there was a dtimeout value to be specified in amanda.conf. I have learnt that it is an idle time measured against the disks in question. My question is now, how is this idle time measured and where is it reported? Only by knowing what amanda sees of the idle time am I able to specify a reasonable dtimeout value.

I may be totally wrong here, but I don't think it is tracking idle time. I believe it is total time to dump. This would take care of stuck or runaway dump scenarios.

--
Jon H. LaBadie                  [EMAIL PROTECTED]
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)
data timeout error
I still have a data timeout error for a DLE in my amanda log this morning. This is the second time this has happened, and this DLE is very important for us. It has to be backed up correctly.

I've looked in the sendbackup.debug files on the client side and there is no error for this DLE.

I asked the network admin to check the firewall (CheckPoint FW-1) logs for anything unusual between the tape server and the client, and he spotted this:

TCP packet out of state: First packet isn't SYN

I don't know if it is related. I don't know where else to look. There does not seem to be any error message on the server either.

I also increased the dtimeout from 1800 to 2400, but that does not seem to be the problem. In the sendbackup.debug for the DLE in question you can see that it takes more than 1800 secs to complete (it took 5945.296 secs), so I think the dtimeout must be an idle time limit.

Here is the sendbackup.debug:

sendbackup: debug 1 pid 6921 ruid 555 euid 555: start at Tue Aug 23 04:40:36 2005
/usr/local/libexec/sendbackup: version 2.4.5
  parsed request as: program `GNUTAR'
                     disk `/disk1'
                     device `/disk1'
                     level 1
                     since 2005:8:20:8:37:56
                     options `|;bsd-auth;srvcomp-best;index;exclude-list=.amanda.excludes;exclude-optional;'
sendbackup: try_socksize: send buffer size is 65536
sendbackup: time 0.000: stream_server: waiting for connection: 0.0.0.0.50090
sendbackup: time 0.000: stream_server: waiting for connection: 0.0.0.0.50091
sendbackup: time 0.001: stream_server: waiting for connection: 0.0.0.0.50092
sendbackup: time 0.001: waiting for connect on 50090, then 50091, then 50092
sendbackup: time 0.008: stream_accept: connection from 192.197.124.40.50070
sendbackup: time 0.012: stream_accept: connection from 192.197.124.40.50071
sendbackup: time 0.015: stream_accept: connection from 192.197.124.40.50072
sendbackup: time 0.015: got all connections
sendbackup-gnutar: time 0.058: doing level 1 dump as listed-incremental from /usr/local/var/amanda/gnutar-lists/sol_disk1_0 to /usr/local/var/amanda/gnutar-lists/sol_disk1_1.new
sendbackup-gnutar: time 0.074: doing level 1 dump from date: 2005-08-20 8:37:56 GMT
sendbackup: time 0.077: spawning /usr/local/libexec/runtar in pipeline
sendbackup: argument list: gtar --create --file - --directory /disk1 --one-file-system --listed-incremental /usr/local/var/amanda/gnutar-lists/sol_disk1_1.new --sparse --ignore-failed-read --totals --exclude-from /tmp/amanda/sendbackup._disk1.20050823044036.exclude .
sendbackup-gnutar: time 0.078: /usr/local/libexec/runtar: pid 6924
sendbackup: time 0.079: started index creator: "/usr/local/bin/tar -tf - 2>/dev/null | sed -e 's/^\.//'"
sendbackup: time 5945.219: index created successfully
sendbackup: time 5945.253: 53:size(|): Total bytes written: 5450424320 (5.1 GiB, 896KiB/s)
sendbackup: time 5945.296: pid 6921 finish time Tue Aug 23 06:19:41 2005
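To compare a run like this against dtimeout, the relative timestamps and the byte total can be pulled out of the debug file. The sketch below is not an Amanda utility; it assumes only the "time N.NNN:" and "Total bytes written:" patterns visible in the log above, and the function name `summarize` is made up for the example.

```python
import re

# Last few lines of the sendbackup.debug quoted above.
LOG_TAIL = """\
sendbackup: time 0.015: got all connections
sendbackup: time 5945.219: index created successfully
sendbackup: time 5945.253: 53:size(|): Total bytes written: 5450424320 (5.1 GiB, 896KiB/s)
sendbackup: time 5945.296: pid 6921 finish time Tue Aug 23 06:19:41 2005
"""


def summarize(log):
    """Return (elapsed seconds, bytes written, average KiB/s) for one run."""
    # "finish time Tue ..." is not matched because a number must follow "time".
    stamps = [float(t) for t in re.findall(r"time (\d+\.\d+):", log)]
    total = re.search(r"Total bytes written: (\d+)", log)
    elapsed = max(stamps)
    nbytes = int(total.group(1))
    return elapsed, nbytes, nbytes / 1024 / elapsed


elapsed, nbytes, rate = summarize(LOG_TAIL)
print(f"elapsed {elapsed:.0f} s, {nbytes} bytes, about {rate:.0f} KiB/s")
for dtimeout in (1800, 2400):
    print(f"ran {elapsed / dtimeout:.1f}x longer than dtimeout={dtimeout}")
```

The client finished normally after roughly 5945 seconds, far beyond both dtimeout settings, which fits the poster's reading that dtimeout limits idle time rather than total dump time. The average rate works out close to the figure tar reports itself.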
Re: Backing up mail spools (was Re: How do I deal with STRANGE backups ?)
On Tuesday, 23.08.2005 at 09:23 -0400, Jon LaBadie wrote:

>> 1. Just put up with it: spools that are changing will result in a backup which is probably not of any use once in a while. This is probably fine, unless there is a large amount of mail coming in at the time the backup runs;
>
> Just one point on this. I don't think you get a backup that is useless. It is my understanding that the backup is still ok; the problem is that specific files within the backup are questionable.

Yes, I understand that: I should have said that the backup is probably not of any use FOR THAT FILE WHICH CHANGES. Having said that, it might be, or you could probably extract parts of it ...

Dave.

--
Dave Ewart [EMAIL PROTECTED]
Computing Manager, Cancer Epidemiology Unit
Cancer Research UK / Oxford University
PGP: CC70 1883 BD92 E665 B840 118B 6E94 2CFD 694D E370
N 51.7518, W 1.2016
[OT] Tape drive failure
So my sturdy HP Ultrium 1 (LTO) tape drive just ate a tape and bit the big one 2 months after the warranty expired. They want $3,500 for a refurb with a 90-day warranty. I've been wanting a rackmount unit for a while now, and this gives me a good excuse to get one :-) I was looking at the Certance (formerly Quantum) CL400H (1U, single drive upgradable to dual drive). It looks like I can get it for about $2,200 (gotta love educational pricing). Does anyone have any comments on this drive, or other recommendations? Thanks, Matt -- Matt Hyclak Department of Mathematics Department of Social Work Ohio University (740) 593-1263
Re: Question about data timeout.
On Tue, 2005-08-23 at 09:38 -0400, Jon LaBadie wrote:

> On Tue, Aug 23, 2005 at 11:19:59AM +0200, Erik P. Olsen wrote:
>
>> I have recently added a set of disks (file systems) to my back-up set and that ended up with a failure due to data timeout. I didn't even know there was a dtimeout value to be specified in amanda.conf. I have learnt that it is an idle time measured against the disks in question. My question is now, how is this idle time measured and where is it reported? Only by knowing what amanda sees of the idle time am I able to specify a reasonable dtimeout value.
>
> I may be totally wrong here, but I don't think it is tracking idle time. I believe it is total time to dump. This would take care of stuck or runaway dump scenarios.

The documentation says:

dtimeout int
    Default: 1800 seconds. Amount of idle time per disk on a given client that a dumper running from within amdump will wait before it fails with a data timeout error.

--
Regards,
Erik P. Olsen
Re: Problem with backup of windows shares
--- Jon LaBadie [EMAIL PROTECTED] wrote:

> On Tue, Aug 23, 2005 at 10:23:56AM +0200, tanguy yoann wrote:
>
>> After, the login and the password are take in the file amandapass. There is no problem on that.
>>
>> The problem is that I have the same data on the two shares and the backup are different. The level 0 is the same but the level 1 is different. However, I add the same file to the two shares.
>>
>> //Yoann.neotip/backup   level 1:  255 KB
>> //Yoann.neotip/backup1  level 1: 6804 KB
>>
>> The backup of the second share is full. Why?
>
> Seven MB, small share. Likely for testing?

Indeed, I'm still in the testing phase.

> I'm probably going to show my great ignorance of PCs here, but what the heck.
>
> When long ago I was looking at the code or the docs, something I recall was that samba (or amanda) uses the PC file system's "archive bit" to determine if a file needs an incremental or not. The dependency on this two-state bit is why you can't do levels of incrementals.
>
> Perhaps on the one system something other than amanda/samba prevents (or causes) this archive bit to flip. That would be analogous to touching a file on a unix system and making the file look like it needs backup.

Yes, amanda uses the archive bit to know which data to back up. I have a problem with some shares, but I can't find out the value of the archive bit on these shares. How can I check it?

Thanks a lot for your explanations.

Regards.

Yoann TANGUY
Student at ESINSA Sophia-Antipolis
Re: Question about data timeout.
On Tue, Aug 23, 2005 at 05:04:02PM +0200, Erik P. Olsen enlightened us:

>>> On Tue, Aug 23, 2005 at 11:19:59AM +0200, Erik P. Olsen wrote:
>>>
>>> I have recently added a set of disks (file systems) to my back-up set and that ended up with a failure due to data timeout. I didn't even know there was a dtimeout value to be specified in amanda.conf. I have learnt that it is an idle time measured against the disks in question. My question is now, how is this idle time measured and where is it reported? Only by knowing what amanda sees of the idle time am I able to specify a reasonable dtimeout value.
>>
>> I may be totally wrong here, but I don't think it is tracking idle time. I believe it is total time to dump. This would take care of stuck or runaway dump scenarios.
>
> The documentation says:
>
> dtimeout int
>     Default: 1800 seconds. Amount of idle time per disk on a given client that a dumper running from within amdump will wait before it fails with a data timeout error.

Yes, and that "per disk" is important. If you have a machine with 3 Disklist Entries (DLEs), it will wait 5400 seconds (90 minutes) for that machine. Another machine with 1 DLE will only get 30 minutes to complete.

Matt

--
Matt Hyclak
Department of Mathematics
Department of Social Work
Ohio University
(740) 593-1263
Re: Question about data timeout.
On Tue, Aug 23, 2005 at 05:04:02PM +0200, Erik P. Olsen wrote:

> On Tue, 2005-08-23 at 09:38 -0400, Jon LaBadie wrote:
>
>> I may be totally wrong here, but I don't think it is tracking idle time. I believe it is total time to dump. This would take care of stuck or runaway dump scenarios.
>
> The documentation says:
>
> dtimeout int
>     Default: 1800 seconds. Amount of idle time per disk on a given client that a dumper running from within amdump will wait before it fails with a data timeout error.

Glad I said "I may be totally wrong" :(

Thanks,

--
Jon H. LaBadie                  [EMAIL PROTECTED]
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)
Re: Question about data timeout.
Jon LaBadie wrote:

>> The documentation says:
>>
>> dtimeout int
>>     Default: 1800 seconds. Amount of idle time per disk on a given client that a dumper running from within amdump will wait before it fails with a data timeout error.
>
> Glad I said "I may be totally wrong" :(

Even though the documentation reads that way, I've found it to *behave* the way you described, Jon. When I recently added a new disk of over 200GB to a server, I had to increase the timeout; otherwise the dump itself would trigger the timeout and cause it to abort.

Is this expected behavior? If so, should the docs be modified?

Graeme
Re: Question about data timeout.
Jon LaBadie wrote:

> On Tue, Aug 23, 2005 at 11:19:59AM +0200, Erik P. Olsen wrote:
>
>> I have recently added a set of disks (file systems) to my back-up set and that ended up with a failure due to data timeout. I didn't even know there was a dtimeout value to be specified in amanda.conf. I have learnt that it is an idle time measured against the disks in question. My question is now, how is this idle time measured and where is it reported? Only by knowing what amanda sees of the idle time am I able to specify a reasonable dtimeout value.
>
> I may be totally wrong here, but I don't think it is tracking idle time. I believe it is total time to dump. This would take care of stuck or runaway dump scenarios.

Correct me if I'm wrong -- the coffee machine is broken here, so I'm writing this on a diet of pure fresh water!

Reading through the sources, it seems that dtimeout is used as the timeout value on a select() call in dumper.c, around line 1356 (amanda 2.4.5 sources). The select waits for activity on the data stream or on the messages stream.

That means that if no traffic is received within dtimeout seconds on one of those streams, you get a "data timeout". The default of 1800 seconds seems more than reasonable to me in that case.

A pathological case could be a sequence of very compressible data (all a's or zeros, like an empty database file). When compressing such a sequence, together with some buffering on client and server, it could well take a long time before any bytes come out of such a pipe. But 1800 seconds seems to me more than enough even for those cases.

There is also one of the latest enhancements in gnutar for handling sparse files, which could result in a long time without emitting any data (and some systems create sparse files with 64-bit sizes...):

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=154882
http://lists.gnu.org/archive/html/bug-tar/2005-07/msg00025.html

But is that only when doing estimates, or does it also affect the backup itself?

And of course firewall timeouts come into play too, blocking one of the streams (e.g. the messages stream usually has almost no traffic), with the result that the end-of-file indication on that stream is never received. After dtimeout seconds, that results in a "data timeout" too.

--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  [EMAIL PROTECTED]
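The difference between an idle timeout and a total-time limit can be shown with a small select() loop. This is a Python analogue of the behaviour Paul describes, not a translation of dumper.c; `read_with_idle_timeout` is a hypothetical name, and select() on pipes as used here works on Unix-like systems only.

```python
import os
import select


def read_with_idle_timeout(fd, idle_timeout):
    """Read fd until EOF; fail only if no data arrives for idle_timeout seconds.

    The timer effectively restarts after every chunk, so the total transfer
    may take far longer than idle_timeout as long as data keeps trickling in.
    """
    chunks = []
    while True:
        ready, _, _ = select.select([fd], [], [], idle_timeout)
        if not ready:
            raise TimeoutError("data timeout")
        data = os.read(fd, 65536)
        if not data:  # EOF: the writer closed its end
            return b"".join(chunks)
        chunks.append(data)
```

With a writer that sends a byte every 0.3 seconds for 1.2 seconds and an idle timeout of 1.0 second, the read completes normally; with a writer that stays silent, or a stream whose end-of-file never arrives because something in between dropped the connection, the same call raises after the idle period. That matches both the long-running dump that succeeds and the firewall case described above.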
Re: Problem with backup of windows shares
tanguy yoann wrote:

>> Perhaps on the one system something other than amanda/samba prevents (or causes) this archive bit to flip. That

You need administrator privilege on the PC to be able to reset the archive bit. (At least I think so. Great ignorance of MS Windows here too :-)

>> would be analogous to touching a file on a unix system and making the file look like it needs backup.
>
> Yes, amanda use the bit archive to know who data backup. I have a problem with some shares but I can't know the value of the archive bit of these shares. I don't know how I can do ?

smbclient '//pc/share' -U ... -W ...
password: .
smb: \> dir
...
  Documents and Settings      D        0  Tue May  6 07:04:56 2003
  Program Files              DR        0  Tue May  6 07:05:44 2003
  CONFIG.SYS                  H        0  Tue May  6 16:19:58 2003
  AUTOEXEC.BAT                H        0  Tue May  6 16:19:58 2003
  IO.SYS                   AHSR        0  Tue May  6 16:19:58 2003
  MSDOS.SYS                AHSR        0  Tue May  6 16:19:58 2003

The "A" in the flags column is the archive bit. Verify whether the archive bit is cleared after you have done a level 0 backup.

--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  [EMAIL PROTECTED]
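For a share with many files, checking the flags column by eye gets tedious. The sketch below parses a saved `dir` listing and reports the entries that still have the archive bit set. It assumes the column layout shown in Paul's example (name, flags, size, date) and that flags are drawn from the letters A, D, H, N, R and S; other smbclient versions may print additional letters, and a file name ending in a word made only of those letters could be misread.

```python
import re

# Listing copied from the smbclient session above.
LISTING = """\
  Documents and Settings      D        0  Tue May  6 07:04:56 2003
  Program Files              DR        0  Tue May  6 07:05:44 2003
  CONFIG.SYS                  H        0  Tue May  6 16:19:58 2003
  AUTOEXEC.BAT                H        0  Tue May  6 16:19:58 2003
  IO.SYS                   AHSR        0  Tue May  6 16:19:58 2003
  MSDOS.SYS                AHSR        0  Tue May  6 16:19:58 2003
"""

LINE_RE = re.compile(
    r"^\s*(?P<name>.+?)"                # file name, may contain spaces
    r"(?:\s+(?P<attrs>[ADHNRS]+))?"     # flags column, may be empty
    r"\s+(?P<size>\d+)"                 # size in bytes
    r"\s+\w{3}\s+\w{3}\s+\d{1,2}\s+\d\d:\d\d:\d\d\s+\d{4}\s*$"
)


def archive_bit_set(listing):
    """Return the names of entries whose flags include 'A'."""
    names = []
    for line in listing.splitlines():
        m = LINE_RE.match(line)
        if m and "A" in (m.group("attrs") or ""):
            names.append(m.group("name"))
    return names


print(archive_bit_set(LISTING))  # ['IO.SYS', 'MSDOS.SYS']
```

Running this on listings of both shares after a level 0 would show whether the archive bits are being cleared on one share but not on the other.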
Dumpers dying using GNUTAR
I'm having more weird problems backing up the server I mentioned yesterday (where amstatus was dying). I came in this morning, and discovered that last night's incremental backup job is still running, and that it seems to be because 4 dumper threads to that server are stalled out. amstatus shows the following:

fileserver:/files/1  1  25455835k dumping 25212958k ( 99.05%) (4:02:22)
fileserver:/files/2  1    600485k dumping        0k           (8:58:54)
fileserver:/files/3  1   4588766k dumping     4766k (  0.10%) (7:22:08)
fileserver:/files/4  1   5691741k dumping       27k (  0.00%) (6:19:02)

These numbers haven't changed in the last hour. As well, it looks like perhaps half of the disk entries for that server, the other half are still waiting to be dumped. A ps on the amanda server reveals:

 5420 ?  S   0:00  \_ /bin/sh /usr/sbin/amdump daily
 5430 ?  S   0:02      \_ /usr/lib/amanda/driver daily
 5431 ?  S   0:27          \_ taper daily
 5432 ?  S   0:05          |   \_ taper daily
 5439 ?  S   0:14          \_ dumper0 daily
 6763 ?  Z   0:01          |   \_ [gzip] <defunct>
 6764 ?  Z   0:00          |   \_ [gzip] <defunct>
 5440 ?  S   3:47          \_ dumper1 daily
 6519 ?  Z  85:00          |   \_ [gzip] <defunct>
 6520 ?  Z   0:00          |   \_ [gzip] <defunct>
 5441 ?  S   0:28          \_ dumper2 daily
 6585 ?  Z   0:00          |   \_ [gzip] <defunct>
 6586 ?  Z   0:00          |   \_ [gzip] <defunct>
 5442 ?  S   1:46          \_ dumper3 daily
 6792 ?  S   0:00          |   \_ /bin/gzip --fast
 6793 ?  S   0:00          |   \_ /bin/gzip --best
 5443 ?  S   0:05          \_ dumper4 daily

Doesn't look good. :( On the client server, I see the following:

 5128 ?  S   0:00 /usr/lib/amanda/sendbackup
 7659 ?  S   0:00 /usr/lib/amanda/sendbackup
 9345 ?  S   0:00 /usr/lib/amanda/sendbackup
11830 ?  S   0:00 /usr/lib/amanda/sendbackup

According to top, none of them is using any processor time. I'm doing server-side compression on these disks, so that shouldn't be a problem.

Does anyone have any ideas what's going on here, or any ideas what I should look at next? This config was all working just last week; the only thing I've changed is that I've added a few more disklist entries. :(

Graeme

--
Graeme Humphries ([EMAIL PROTECTED])
(306) 955-7075 ext. 485
My views are not the views of my employers.
Holding disk size misread by amcheck
I have a Fedora Core 3 amanda server. I have specified an nfs mounted directory for one of the holding disks. Does anyone know why amcheck finds much less space available on this drive than a command like 'df' does?
amlabel Issue
We are currently experiencing the issue below using amlabel. I have tried with and without a tape loaded, with and without a slot specified, etc. All chg-zd-mtx tests described in the script notes have been confirmed to work.

amlabel: could not load slot 1: no slots available

Any help on this issue is much appreciated.
Re: Question about data timeout.
On Tue, 2005-08-23 at 11:24 -0400, Matt Hyclak wrote:
On Tue, Aug 23, 2005 at 05:04:02PM +0200, Erik P. Olsen enlightened us:
On Tue, Aug 23, 2005 at 11:19:59AM +0200, Erik P. Olsen wrote:

I have recently added a set of disks (file systems) to my back-up set, and that ended in a failure due to a data timeout. I didn't even know there was a dtimeout value to be specified in amanda.conf. I have learnt that it is an idle time measured against the disks in question. My question is now: how is this idle time measured, and where is it reported? Only by knowing what amanda sees of the idle time am I able to specify a reasonable dtimeout value.

I may be totally wrong here, but I don't think it is tracking idle time. I believe it is total time to dump. This would take care of stuck or runaway dump scenarios.

The documentation says: dtimeout int Default: 1800 seconds. Amount of idle time per disk on a given client that a dumper running from within amdump will wait before it fails with a data timeout error.

Yes, and that "per disk" is important. If you have a machine with 3 Disklist Entries (DLEs), it will wait 5400 seconds (90 minutes) for that machine. Another machine with 1 DLE will only get 30 minutes to complete.

I read it to mean that each disk gets 1800 seconds of idle (wait?) time before a timeout. That is, if disk 1 uses 1 second of that time, the remaining 1799 seconds are lost and are not added to the idle time of the two remaining disks. I have 13 DLEs, which should give me 6H 30M if this theory is true, yet my data timeout happened after 3H 19M! I had hoped that amanda would report how much idle time had occurred for each disk.

--
Regards, Erik P. Olsen
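The arithmetic behind the "per disk" reading above can be checked with a short Python sketch. It only reproduces the multiplication discussed in this thread (dtimeout times the number of DLEs on a client); whether amanda actually accumulates the timeout this way is exactly what is in question here, so the function name and the model are assumptions, not a description of amanda's code.

```python
from datetime import timedelta

# Documented default for dtimeout, in seconds (from the quoted man page text).
DTIMEOUT_DEFAULT = 1800


def per_client_timeout(dle_count, dtimeout=DTIMEOUT_DEFAULT):
    """Total wait per client under the ASSUMED model: dtimeout * number of DLEs.

    This is the interpretation proposed in the thread, not confirmed behaviour.
    """
    if dle_count < 1:
        raise ValueError("a client needs at least one DLE")
    return timedelta(seconds=dle_count * dtimeout)


if __name__ == "__main__":
    for dle_count in (1, 3, 13):
        print(dle_count, "DLE(s):", per_client_timeout(dle_count))

    # The timeout reported in the thread occurred after 3 hours 19 minutes.
    observed = timedelta(hours=3, minutes=19)
    print("observed earlier than model predicts:",
          observed < per_client_timeout(13))
```

Under this model 13 DLEs would allow 6 hours 30 minutes, so a timeout after 3 hours 19 minutes does not fit it, which is the discrepancy raised in the message.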
Re: amlabel Issue
On Tue, Aug 23, 2005 at 02:04:13PM -0400, James Jacocks wrote:

We are currently experiencing the issue below when using amlabel. I have tried with and without a tape loaded, with and without a slot specified, etc. All chg-zd-mtx tests described in the script notes have been confirmed to work.

As an intermediate test (between chg-zd-mtx direct and am-applications), did you try amtape to manipulate your drive and changer?

--
Jon H. LaBadie [EMAIL PROTECTED]
JG Computing
4455 Province Line Road (609) 252-0159
Princeton, NJ 08540-4322 (609) 683-7220 (fax)
RE: Backing up mail spools (was Re: How do I deal with STRANGE backups ?)
-----Original Message-----

3. Copy in a robust fashion from the mail spools to a temporary location prior to the backup job, so that these copies of the spools will not change; then back up the copies rather than the 'live' spools. The robust fashion would work in a similar way to how locking mail spools operates when appending/deleting messages.

I use filesystem snapshots. Taking a snapshot is only a matter of seconds. I make a snapshot a few minutes before the amanda backup starts. Amanda then makes a backup of that snapshot instead. Your OS has to support it, however. Solaris 2.8 (2.8 plain needs patches) can do it. Linux with lvm1 can do it too; Linux with lvm2 is not yet stable enough for doing snapshots. (I have lvm2 snapshots working on one system without problems, but on other systems it makes the computer crash; maybe related to amount of memory and/or system load: the system where it does work has lots of RAM, and is very quiet in the night.) Mail me for the scripts to create a snapshot if you're interested.

The server in question is a fairly generic Debian/Sarge mail server, with a RAID setup for disks and a 2.6 kernel. And /var/mail is on its own ext3 partition. You mention that it works with LVM. Do snapshots *require* LVM?

Dave.

Generally I think the only files in a spool directory that need to be backed up are the files that haven't been processed, for whatever reason, for some time. It doesn't matter if you miss some files because they have been processed correctly. There are heaps of files that move through the spool dir during the day that never get backed up. Only files that happen to be in the directory at backup time might be backed up.

I've seen some places that take snapshots of their spool dirs on a regular basis (every minute or less), and these snapshots get backed up with the nightly backup. Even then there is a good chance you won't get all the files. This was more for a security audit trail than data protection.

Regards, Greg.
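For the snapshot approach described in this thread, the sequence "create snapshot, mount it, back up, clean up" might look roughly like the Python sketch below. It is not the poster's script (those were offered by mail and are not shown here). It assumes the spool lives on an LVM logical volume, and the volume group, volume, snapshot name, size, and mount point are all hypothetical placeholders. The function only builds the command lines so they can be reviewed before anything is run as root.

```python
import shlex


def snapshot_backup_commands(volume_group, logical_volume, snapshot_name,
                             snapshot_size, mount_point):
    """Build LVM snapshot commands to run before and after a backup.

    All names passed in are placeholders chosen by the caller; nothing is
    executed here. Returns (before, after) lists of argument vectors.
    """
    origin = f"/dev/{volume_group}/{logical_volume}"
    snapshot = f"/dev/{volume_group}/{snapshot_name}"
    before = [
        # Create a copy-on-write snapshot of the live spool volume.
        ["lvcreate", "--snapshot", "--size", snapshot_size,
         "--name", snapshot_name, origin],
        # Mount the frozen copy read-only; the backup reads from here.
        ["mount", "-o", "ro", snapshot, mount_point],
    ]
    after = [
        ["umount", mount_point],
        # Drop the snapshot so it stops consuming copy-on-write space.
        ["lvremove", "-f", snapshot],
    ]
    return before, after


if __name__ == "__main__":
    # Hypothetical names: adjust to the real volume layout.
    before, after = snapshot_backup_commands(
        "vg0", "mail", "mailsnap", "1G", "/mnt/mailsnap")
    for command in before + after:
        print(shlex.join(command))
```

The disklist entry would then point at the mount point of the snapshot rather than at the live /var/mail, with the "before" commands scheduled shortly ahead of the amanda run and the "after" commands once it has finished.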