Re: planner timeout

2007-11-29 Thread Paul Bijnens



René Kanters wrote:

Hi Paul,

thanks for getting back to me on this.

I don't think it is a firewall issue, because up till now the backup ran 
fine. Actually, it ran fine last night, when nobody was doing any big 
calculations on it, which could have affected that. So maybe 
something else strange was going on on the network.


Is there a way to affect the 30 sec timeout for the planner's ACK timeout?


It's hardcoded.  Anyway, no reasonable program should need more than 
30 seconds to send a simple ACK packet back :-)
If it can't, then something is really badly wrong: crashing client 
programs, firewalls, routing problems, etc.
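
To narrow down which of those causes applies, a rough UDP round-trip probe can help. The following is only a sketch: wait_for_udp_reply is a hypothetical helper, it does not speak Amanda's protocol, and whether a given service answers an arbitrary datagram at all is an assumption you have to check for your setup. It only shows the "send one packet, wait a bounded time for any reply" pattern.

```python
import socket


def wait_for_udp_reply(host, port, payload=b"ping", timeout=30.0):
    """Send one UDP datagram and report whether any reply arrives in time.

    Hypothetical helper for illustration only: it knows nothing about
    Amanda's packet format, so it tests reachability of a UDP service
    that answers arbitrary datagrams, not an Amanda client as such.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(payload, (host, port))
            sock.recvfrom(65535)  # block until a reply or the timeout
            return True
        except OSError:
            # socket.timeout is a subclass of OSError, so this covers both
            # "no reply in time" and lower-level network errors.
            return False
```

For a real check of the Amanda clients themselves, running amcheck against the configuration is the more appropriate tool; the probe above is only useful for separating "the network drops UDP" from "the client is too slow".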






Cheers,
René

On Nov 29, 2007, at 10:24 AM, Paul Bijnens wrote:


On 2007-11-27 18:13, René Kanters wrote:

Hi,
I have been running into problems because some of my systems are heavily 
used for long computations, which makes them somewhat less responsive.
Last night I ran into the issue that four systems did not send 
acknowledgments back to the dumper on time during the planning process:
  planner: ERROR Request to werner.richmond.edu failed: timeout waiting for ACK
I looked into allowing more time for that stage, which I believe 
etimeout should allow, but my amanda.conf has 'etimeout 600' in it 
while the planner's debug file ends with:
security_seterror(handle=0x3038a0, driver=0xa2a0c (BSD) error=timeout waiting for ACK)

security_close(handle=0x3038a0, driver=0xa2a0c (BSD))
planner: time 29.898: pid 3734 finish time Tue Nov 27 00:45:36 2007
suggesting that it still only waits for 30 seconds.


planner sends a packet to the client(s) and it expects at least
a UDP ACK packet back within 30 seconds, indicating that the
client did receive at least the request.
The etimeout is the time that planner will wait for the packet
with the different size estimates from the client, which will usually
take more than 30 seconds.


Am I setting the wrong timeout?


So it seems you can't even get a reply back.  Firewall issues?


--
Paul Bijnens, xplanation Technology Services        Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/  email:  [EMAIL PROTECTED]
***
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out  *
***




Re: planner timeout

2007-11-29 Thread René Kanters

Hi Paul,

thanks for getting back to me on this.

I don't think it is a firewall issue, because up till now the backup 
ran fine. Actually, it ran fine last night, when nobody was doing any big 
calculations on it, which could have affected that. So maybe 
something else strange was going on on the network.


Is there a way to affect the 30 sec timeout for the planner's ACK  
timeout?


Cheers,
René






Re: planner timeout

2007-11-29 Thread Paul Bijnens

On 2007-11-27 18:13, René Kanters wrote:

Hi,

I have been running into problems because some of my systems are heavily 
used for long computations, which makes them somewhat less responsive.


Last night I ran into the issue that four systems did not send 
acknowledgments back to the dumper on time during the planning process:


  planner: ERROR Request to werner.richmond.edu failed: timeout waiting for ACK


I looked into allowing more time for that stage, which I believe 
etimeout should allow, but my amanda.conf has 'etimeout 600' in it while 
the planner's debug file ends with:
security_seterror(handle=0x3038a0, driver=0xa2a0c (BSD) error=timeout waiting for ACK)

security_close(handle=0x3038a0, driver=0xa2a0c (BSD))
planner: time 29.898: pid 3734 finish time Tue Nov 27 00:45:36 2007

suggesting that it still only waits for 30 seconds.


planner sends a packet to the client(s) and it expects at least
a UDP ACK packet back within 30 seconds, indicating that the
client did receive at least the request.
The etimeout is the time that planner will wait for the packet
with the different size estimates from the client, which will usually
take more than 30 seconds.
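
The two waits described above can be pictured as a toy model. This is only a sketch of the logic as explained in this thread, not Amanda's actual code: planner_outcome and the "estimate timed out" label are made up for illustration, the 30-second constant is the hardcoded ACK wait mentioned here, and any per-disk scaling that the real etimeout may apply is ignored.

```python
ACK_TIMEOUT = 30  # seconds; the hardcoded ACK wait described in this thread


def planner_outcome(ack_delay, estimate_delay, etimeout):
    """Toy model of the two separate waits (hypothetical, not Amanda code).

    ack_delay / estimate_delay are seconds until the client's ACK and its
    size estimates arrive, or None if they never arrive. etimeout is the
    configured budget for the estimate phase only.
    """
    # Phase 1: the client must acknowledge the request quickly.
    # etimeout plays no role here, however large it is.
    if ack_delay is None or ack_delay > ACK_TIMEOUT:
        return "timeout waiting for ACK"
    # Phase 2: only now does etimeout apply, while the estimates are computed.
    if estimate_delay is None or estimate_delay > etimeout:
        return "estimate timed out"  # illustrative label, not a real log line
    return "ok"
```

With 'etimeout 600', planner_outcome(None, None, 600) still gives "timeout waiting for ACK", which matches the debug file ending at about 30 seconds: the failure happens in the first phase, before etimeout is ever consulted.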



Am I setting the wrong timeout?


So it seems you can't even get a reply back.  Firewall issues?


--
Paul Bijnens, xplanation Technology Services        Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/  email:  [EMAIL PROTECTED]
***
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out  *
***


Re: cleaning tapes and integration into Amanda backup scheme?

2007-11-29 Thread Chris Hoogendyk



Craig Dewick wrote:

On Mon, 22 Oct 2007, Olivier Nicole wrote:


My Sun L9 array has told me it needs a cleaning tape run. I have one so
that's no problem, but what I'd like to know is if there is a way that
Amanda can receive info from the tape drive about the requirement for
cleaning and co-ordinate cleaning tape runs as part of the overall 
backup strategy?


My strategy, all human based, is to have the cleaning tape on the pile
of the next set of 6 tapes to be used.

I have my tapes pool divided into 3 sets of 6, once I have run through
the current set, I move the stack to the back and come up with a new
stack, on top of which is the cleaning tape.


My L9 tape array has DLT-4 tapes in the first 8 slots, and today I've 
put a brand new cleaning tape into the 9th slot. There's nothing in my 
tape server's amanda.conf file relating to cleaning tapes, so I'm 
wondering where in the Amanda config schema the info about location of 
cleaning tapes needs to be. Does Amanda itself need to know, or does 
mtx need to know directly? 


I have a 16 slot library. I configured Amanda daily backups to work with 
slots 1-15, so it would never look at slot 16. Originally, I had the 
cleaning tape in 16, but I've only had to use it once in a year, and 
that was when I had a faulty tape get caught in the drive. So, I've just 
removed it, and I use the 16th slot for archive runs and other special 
cases.


From just scanning the wiki with Google, it appears that the chg 
scripts have the capacity to call a cleaning tape, and that can be 
defined for the scripts, but it isn't built into Amanda per se. With 
modern drives, it shouldn't be needed much and my inclination is that I 
would like to be in manual control of it. If the drive is misbehaving 
and seems to need cleaning, I don't want an automated process to keep it 
out of sight (out of mind) until it fails and needs major work. That's 
just my opinion.



---

Chris Hoogendyk

-
  O__   Systems Administrator
 c/ /'_ --- Biology & Geology Departments
(*) \(*) -- 140 Morrill Science Center
~~ - University of Massachusetts, Amherst 


<[EMAIL PROTECTED]>

--- 


Erdös 4