vtapes documentation

2005-09-05 Thread Gavin Henry
Hi Guys,

Been a while, but that's due to Amanda being so rock solid!!!

I am about to try a new setup using vtapes, as a new client won't be
getting a tape drive for about 5 weeks, but would like Amanda set up now.
Then when it arrives, we will be moving on to tapes.

However, I can't seem to find anything in the main docs about vtapes?

If there is and I have missed it, then sorry. If there isn't, then I could
work on whatever we have and then speak to Stefan to work with him to get
them added to the core docs?

Thanks,

Gavin.

-- 
Kind Regards,

Gavin Henry.
Managing Director.

T +44 (0) 1224 279484
M +44 (0) 7930 323266
F +44 (0) 1224 742001
E [EMAIL PROTECTED]

Open Source. Open Solutions(tm).

http://www.suretecsystems.com/




amdump is failing for localhost

2005-09-05 Thread Rodrigo Ventura

Hello all. My backup setup has been working fine for the last few
months. Now I'm getting systematic FAILED dumps on the local
filesystem. What changed? There is a user process running a
scientific simulation program that has a virtual memory footprint
of about 500M (the machine has 520M RAM and 1G swap). This may be causing
timeouts, and I could increase the timeout value, but from
inspecting the debug files, I'm not sure it is just a timeout
issue.

Therefore, I'm sending this mail to ask for help figuring out what
is wrong from the log files.

Ok, let's start with the amanda mail report:

--
FAILURE AND STRANGE DUMP SUMMARY:
  omni   //new/E$ lev 0 FAILED [Estimate timeout from omni]
  omni   //new/C$ lev 0 FAILED [Estimate timeout from omni]
  omni   /var/spool/imap/user/uz lev 0 FAILED [Estimate timeout from omni]
  omni   /var/spool/imap/user/nt lev 0 FAILED [Estimate timeout from omni]
[...]
--

(we can forget for now the smb backups, and restrict our attention to
the local filesystems)

The following file is amandad.2005090419540.debug, which I assume
is from the amandad daemon running on localhost, serving
sendsize/backup requests:

--
[...]
amandad: time 0.000: got packet:

Amanda 2.4 REQ HANDLE 000-789C0608 SEQ 1125860041
SECURITY USER amanda
SERVICE sendsize
OPTIONS features=feff9ffe0f;maxdumps=1;hostname=omni;
GNUTAR //new/E$ 0 1970:1:1:0:0:0 -1 OPTIONS |;bsd-auth;compress-fast;index;
GNUTAR //new/E$ 1 2005:8:30:3:5:35 -1 OPTIONS |;bsd-auth;compress-fast;index;
GNUTAR //new/E$ 2 2005:8:31:3:50:7 -1 OPTIONS |;bsd-auth;compress-fast;index;
GNUTAR //new/C$ 0 1970:1:1:0:0:0 -1 OPTIONS |;bsd-auth;compress-fast;index;exclude-file=./pagefile.sys;
GNUTAR //new/C$ 1 2005:8:28:3:18:35 -1 OPTIONS |;bsd-auth;compress-fast;index;exclude-file=./pagefile.sys;
GNUTAR //new/C$ 2 2005:8:31:3:53:48 -1 OPTIONS |;bsd-auth;compress-fast;index;exclude-file=./pagefile.sys;
GNUTAR /var/spool/imap/user/uz /var/spool/imap/user 0 1970:1:1:0:0:0 -1 OPTIONS |;bsd-auth;compress-fast;index;exclude-file=./[a-t]*;
GNUTAR /var/spool/imap/user/uz /var/spool/imap/user 1 2005:8:29:3:20:17 -1 OPTIONS |;bsd-auth;compress-fast;index;exclude-file=./[a-t]*;
[...]


amandad: time 0.000: sending ack:

Amanda 2.4 ACK HANDLE 000-789C0608 SEQ 1125860041


amandad: time 0.001: bsd security: remote host omni.isr.ist.utl.pt user amanda local user amanda
amandad: time 0.001: amandahosts security check passed
amandad: time 0.001: running service /usr/local/amanda/libexec/sendsize
amandad: time 12571.560: sending REP packet:

Amanda 2.4 REP HANDLE 000-789C0608 SEQ 1125860041
OPTIONS features=feff9ffe0f;
/ 0 SIZE 4974600
/ 1 SIZE 678600
/boot 0 SIZE 6750
/boot 1 SIZE 10
/usr 0 SIZE 3312330
/usr 1 SIZE 174220
[...]


amandad: time 12581.571: dgram_recv: timeout after 10 seconds
amandad: time 12581.571: waiting for ack: timeout, retrying
amandad: time 12591.572: dgram_recv: timeout after 10 seconds
amandad: time 12591.572: waiting for ack: timeout, retrying
amandad: time 12601.572: dgram_recv: timeout after 10 seconds
amandad: time 12601.572: waiting for ack: timeout, retrying
amandad: time 12611.572: dgram_recv: timeout after 10 seconds
amandad: time 12611.572: waiting for ack: timeout, retrying
amandad: time 12621.572: dgram_recv: timeout after 10 seconds
amandad: time 12621.572: waiting for ack: timeout, giving up!
amandad: time 12621.602: pid 2657 finish time Sun Sep  4 23:24:21 2005
--

Now, let's check sendsize.20050904195400.debug:

--
sendsize: debug 1 pid 2659 ruid 92 euid 92: start at Sun Sep  4 19:54:00 2005
sendsize: version 2.4.4p4
sendsize[2659]: time 0.232: waiting for any estimate child: 1 running
sendsize[2664]: time 0.232: calculating for amname '/', dirname '/', spindle -1
sendsize[2664]: time 0.232: getting size via gnutar for / level 0
sendsize[2664]: time 0.513: spawning /usr/local/amanda/libexec/runtar in pipeline
sendsize[2664]: argument list: /bin/tar --create --file /dev/null --directory / --one-file-system --listed-incremental /usr/local/amanda/var/amanda/gnutar-lists/omni__0.new --sparse --ignore-failed-read --totals --exclude-from /tmp/amanda/sendsize._.20050904195400.exclude .
sendsize[2664]: time 91.573: /bin/tar: ./var/amavis/tmp/spamassassin.149.wxdvgp.tmp: Warning: Cannot stat: No such file or directory
sendsize[2664]: time 108.878: /bin/tar: ./var/spool/mqueue-mta/local/tfj84IdMD1018683: Warning: Cannot stat: No such file or directory
[...]
sendsize[1498]: estimate time for //new/E$ level 2: 0.444
sendsize[1498]: estimate size for //new/E$ level 2: 872525 KB
sendsize[1498]: time 12571.433: 

Re: extimate server initial value?

2005-09-05 Thread Paul Bijnens

Graeme Humphries wrote:
This may be a RTFM question, but I can't see the answer in the 
amanda.conf man page (http://www.amanda.org/docs/amanda.conf.5.html):


When using estimate server, is there a way to configure what the
initial estimate of a disk entry is before there's any historical data?
It looks like Amanda's defaulting to about 5MB, and I'd rather it
defaulted to closer to 1GB.



It is indeed not yet documented in the amanda.conf man page, but since
the source code is available, see server-src/planner.c, starting at
line 1393:

For a level 0: 100 Kbytes

For a level X, first day at this level: 10 Kbytes

There is also a default of 1 Kbyte for a level X that is not a
new level, which is only used when just switching to server
side estimates.  Otherwise, you always have at least one level X
already (i.e. the previous day).

How did you get the number of 5 MB ?

Or is this a level X without any historical data (1 KB above)
with a default comprate 0.50 0.50 which results in getting about
5 MByte compressed size?

Are you 100% sure there was no previous run at all? e.g. from
some little test run or so.
You can find the historical data at the end of the file:
  curinfo/host.name/_disk_list_entry/info

It has the format:

history: level size csize date duration

size and csize (compressed size) are in Kbytes.
duration is in seconds.
The date is in seconds since start of 1970; to print the date in
a human readable format:

$ perl -le 'print scalar(localtime($seconds_since_1970))'
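
As a rough illustration of the format above, the following Python sketch parses one history line and converts the date field. Only the field order comes from the description above; the sample line and its numbers are invented for the example, and the assumption that every field is a plain integer has not been checked against the Amanda sources.

```python
import time
from typing import NamedTuple


class HistoryEntry(NamedTuple):
    level: int
    size_kb: int      # uncompressed size in Kbytes
    csize_kb: int     # compressed size in Kbytes
    date: int         # seconds since the start of 1970
    duration_s: int   # dump duration in seconds


def parse_history_line(line: str) -> HistoryEntry:
    """Parse a 'history: level size csize date duration' line.

    Assumes all five fields are integers (an assumption for this sketch).
    """
    tag, *fields = line.split()
    if tag != "history:" or len(fields) != 5:
        raise ValueError(f"not a history line: {line!r}")
    return HistoryEntry(*(int(f) for f in fields))


# Hypothetical sample line; the values are made up for illustration.
entry = parse_history_line("history: 0 4974600 2487300 1125860041 1200")
print(entry.level, entry.size_kb, entry.csize_kb, entry.duration_s)
print(time.ctime(entry.date))  # same idea as the perl one-liner above
```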


--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/  email:  [EMAIL PROTECTED]
***
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ...  Are you sure?  ...   YES   ...   Phew ...   I'm out  *
***




Re: vtapes documentation

2005-09-05 Thread Stefan G. Weichinger

Gavin Henry wrote:

Hi Guys,

Been a while, but that's due to Amanda being so rock solid!!!

I am about to try a new setup using vtapes, as a new client won't be
getting a tape drive for about 5 weeks, but would like Amanda set up now.
Then when it arrives, we will be moving on to tapes.

However, I can't seem to find anything in the main docs about vtapes?

If there is and I have missed it, then sorry. If there isn't, then I could
work on whatever we have and then speak to Stefan to work with him to get
them added to the core docs?


What about

http://www.amanda.org/docs/howto-filedriver.html

?

This is where the term vtape originates ...

--
Stefan G. Weichinger
AMANDA core team member
mailto://[EMAIL PROTECTED]
--
oops! linux consulting  implementation
http://www.oops.co.at
--


Re: Strange Restoration

2005-09-05 Thread Paul Bijnens

Rija ANDRIANALY wrote:

Hi all,

The problem:
When I restore (with amrestore) the directory /var/courier, it takes only
10 min, but when I restore the sub-directory /var/courier/toto/, it takes
more than 1 hour.

With:
/var/courier = ~ 20 GB
/var/courier/toto/ is only 20 MB.


I bet you used fast skip forward implicitly or explicitly
when doing the restore with amrestore of the complete filesystem,
then rewound the tape, and used the much slower, read-and-skip
that is built into amrestore when not providing any fsf info.



- The summary of amanda.conf :

[...]

You can add directives to amanda.conf
to let amrecover also use the fsf features:

amrecover_do_fsf yes
amrecover_check_label yes   ( this implies a rewind -- also nice )






Re: got FAILED for no apparent reason

2005-09-05 Thread Paul Bijnens

Rodrigo Ventura wrote:

It happened again. This time the timeout was on localhost! There is no
firewall involved, either in this case or in the last one (mail from
July 12th).

FAILURE AND STRANGE DUMP SUMMARY:
  omni   //new/E$ lev 0 FAILED [Estimate timeout from omni]
  omni   //new/C$ lev 0 FAILED [Estimate timeout from omni]
  omni   /var/spool/imap/user/uz lev 0 FAILED [Estimate timeout from omni]
  omni   /var/spool/imap/user/nt lev 0 FAILED [Estimate timeout from omni]
[...]

The timeout setting in amanda.conf is 300 (the default, I
believe). There is a comment stating that:
# a positive number will be multiplied by the number of filesystems on
# each host; a negative number will be taken as an absolute total time-out.
# The default is 5 minutes per filesystem.

Let's look at the debug files:

-[amandad.2005090103001.debug]
[...]
//new/C$ 2 SIZE 874751
//new/E$ 0 SIZE 7786744
//new/E$ 1 SIZE 467994
//new/E$ 2 SIZE 467994



4 * 300 sec = 1200 seconds timeout
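
The rule in the quoted amanda.conf comment can be written out as a small Python sketch. The function name is made up for illustration and this is not Amanda's own code; it only restates the comment (positive value times the number of filesystems, negative value taken as an absolute total).

```python
def total_estimate_timeout(etimeout: int, num_filesystems: int) -> int:
    """Total estimate timeout in seconds, following the amanda.conf comment.

    A positive etimeout is multiplied by the number of filesystems on the
    host; a negative etimeout is taken as an absolute total timeout.
    """
    if etimeout < 0:
        return -etimeout
    return etimeout * num_filesystems


# The case computed above: 4 filesystems at the default of 300 seconds.
print(total_estimate_timeout(300, 4))    # 1200
# A negative value (hypothetical here) ignores the filesystem count.
print(total_estimate_timeout(-1200, 4))  # 1200
```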




amandad: time 14935.917: dgram_recv: timeout after 10 seconds


but apparently it took almost 15000 seconds...

That is very strange too. smbclient uses the superfast builtin
du command to do the estimates (except for very old smbclient versions).
On which filesystem did it spend all that time?  You did not show that.

 
-[sendsize.20050901030010.debug]

[...]
sendsize[22886]: estimate time for //new/E$ level 2: 0.422


And indeed, this particular estimate took less than half a second.


sendsize[22886]: estimate size for //new/E$ level 2: 467994 KB
sendsize[22886]: time 14925.824: waiting for /usr/bin/smbclient //new/E$ child
sendsize[22886]: time 14925.824: after /usr/bin/smbclient //new/E$ wait
sendsize[22886]: time 14925.824: done with amname '//new/E$', dirname 
'//new/E$', spindle -1
sendsize[1525]: time 14925.827: child 22886 terminated normally
sendsize: time 14925.834: pid 1525 finish time Thu Sep  1 07:08:56 2005



What can we gather from this? sendsize finishes 50 seconds before
amandad; the amandad timeouts start 10 seconds before sendsize
finishes. What is happening here?


???  amandad timeouts start 10 seconds AFTER sendsize finishes.
That looks completely normal to me.
Can you find out which filesystem(s) took the majority of the
15000 seconds?






Re: got FAILED for no apparent reason

2005-09-05 Thread Rodrigo Ventura

Hi.

Meanwhile I sent a mail to amanda-users re-reporting the problem. Here
goes the data relative to that: I have 14 filesystems to dump on
localhost, so the total timeout should be 300*14=4200 seconds, right?
Doing a grep on the sendsize log, I get:

$ grep 'estimate time' sendsize.20050904195400.debug
sendsize[2664]: estimate time for / level 0: 8169.854
sendsize[2664]: estimate time for / level 1: 342.415
sendsize[28610]: estimate time for /boot level 0: 0.186
sendsize[28610]: estimate time for /boot level 1: 0.021
sendsize[28613]: estimate time for /usr level 0: 1200.347
sendsize[28613]: estimate time for /usr level 1: 830.321
sendsize[28613]: estimate time for /usr level 2: 899.342
sendsize[32577]: estimate time for /root level 0: 21.309
sendsize[32577]: estimate time for /root level 1: 2.288
sendsize[32602]: estimate time for /home/ag level 0: 50.686
sendsize[32602]: estimate time for /home/ag level 1: 2.806
sendsize[32636]: estimate time for /home/hm level 0: 127.386
sendsize[32636]: estimate time for /home/hm level 1: 544.951
sendsize[1152]: estimate time for /home/nt level 0: 96.014
sendsize[1152]: estimate time for /home/nt level 1: 4.326
sendsize[1226]: estimate time for /home/uz level 0: 73.265
sendsize[1226]: estimate time for /home/uz level 1: 2.615
sendsize[1305]: estimate time for /var/spool/imap/user/ag level 0: 87.474
sendsize[1305]: estimate time for /var/spool/imap/user/ag level 2: 4.176
sendsize[1305]: estimate time for /var/spool/imap/user/ag level 3: 4.861
sendsize[1393]: estimate time for /var/spool/imap/user/hm level 0: 20.776
sendsize[1393]: estimate time for /var/spool/imap/user/hm level 1: 5.285
sendsize[1393]: estimate time for /var/spool/imap/user/hm level 2: 4.355
sendsize[1458]: estimate time for /var/spool/imap/user/nt level 0: 11.698
sendsize[1458]: estimate time for /var/spool/imap/user/nt level 2: 1.072
sendsize[1458]: estimate time for /var/spool/imap/user/nt level 3: 0.868
sendsize[1465]: estimate time for /var/spool/imap/user/uz level 0: 21.152
sendsize[1465]: estimate time for /var/spool/imap/user/uz level 1: 3.358
sendsize[1465]: estimate time for /var/spool/imap/user/uz level 2: 2.961
sendsize[1486]: estimate time for //new/C$ level 0: 22.735
sendsize[1486]: estimate time for //new/C$ level 1: 1.289
sendsize[1486]: estimate time for //new/C$ level 2: 1.182
sendsize[1498]: estimate time for //new/E$ level 0: 4.540
sendsize[1498]: estimate time for //new/E$ level 1: 0.410
sendsize[1498]: estimate time for //new/E$ level 2: 0.444

It seems that the level 0 estimate for / is the one taking longer.
The tail of that log is:

$ tail sendsize.20050904195400.debug
sendsize[1498]: time 12571.432: 59992 blocks of size 262144. 29027 blocks available
sendsize[1498]: time 12571.432: Total number of bytes: 893464856
sendsize[1498]: time 12571.433: .
sendsize[1498]: estimate time for //new/E$ level 2: 0.444
sendsize[1498]: estimate size for //new/E$ level 2: 872525 KB
sendsize[1498]: time 12571.433: waiting for /usr/bin/smbclient //new/E$ child
sendsize[1498]: time 12571.433: after /usr/bin/smbclient //new/E$ wait
sendsize[1498]: time 12571.433: done with amname '//new/E$', dirname '//new/E$', spindle -1
sendsize[2659]: time 12571.433: child 1498 terminated normally
sendsize: time 12571.438: pid 2659 finish time Sun Sep  4 23:23:31 2005

It takes 12571.438 secs for the estimates; much greater than
4200.
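
To see where the time goes without reading the grep output by eye, the per-filesystem estimate times can be added up with a short script. This is a minimal Python sketch, assuming the "estimate time for DISK level N: SECONDS" line format shown in the grep output above; the function name is made up for illustration and only a few of the lines are repeated here as sample input.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   sendsize[2664]: estimate time for / level 0: 8169.854
ESTIMATE_RE = re.compile(r"estimate time for (\S+) level (\d+): ([\d.]+)")


def estimate_seconds_per_disk(lines):
    """Sum the estimate times of all levels for each disk."""
    totals = defaultdict(float)
    for line in lines:
        match = ESTIMATE_RE.search(line)
        if match:
            disk, _level, seconds = match.groups()
            totals[disk] += float(seconds)
    return dict(totals)


sample = [
    "sendsize[2664]: estimate time for / level 0: 8169.854",
    "sendsize[2664]: estimate time for / level 1: 342.415",
    "sendsize[28610]: estimate time for /boot level 0: 0.186",
    "sendsize[28613]: estimate time for /usr level 0: 1200.347",
    "sendsize[1498]: estimate size for //new/E$ level 2: 872525 KB",
]

totals = estimate_seconds_per_disk(sample)
# Print the slowest disks first.
for disk, secs in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{disk:10s} {secs:10.3f} s")
```

To run it on the real log, pass `open("sendsize.20050904195400.debug")` instead of `sample`.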

If this is correct, then I should increase the estimate timeout, maybe
ten-fold. But I'm still not sure that is the problem. Is it worthwhile
to try with a giant timeout and see what happens?

Cheers,

Rodrigo

-- 

*** Rodrigo Martins de Matos Ventura [EMAIL PROTECTED]
***  Web page: http://www.isr.ist.utl.pt/~yoda
***   Teaching Assistant and PhD Student at ISR:
***Instituto de Sistemas e Robotica, Polo de Lisboa
*** Instituto Superior Tecnico, Lisboa, PORTUGAL
*** PGP fingerprint = 0119 AD13 9EEE 264A 3F10  31D3 89B3 C6C4 60C6 4585


Re: vtapes documentation

2005-09-05 Thread Gavin Henry
Doh!!!

Been one of those days already, and it's only 11:30am here ;-)





Re: got FAILED for no apparent reason

2005-09-05 Thread Paul Bijnens

Rodrigo Ventura wrote:

Hi.

Meanwhile I sent a mail to amanda-users re-reporting the problem. Here


I just read it.

From the timings below, it seems to me that the scientific simulation
program is also doing heavy I/O, or at least causing very heavy swapping.

I wouldn't be surprised if those state-of-the-art scientific programs
used prehistoric dumb-user "it works on these three lines of input,
so it should work on these million lines too" algorithms  :-)

Note that RAM access is measured in nanoseconds, while disk access is
measured in milliseconds!  When the working set of a program gets
larger than the physical RAM, the kernel can do little more than swap,
and accessing a variable then takes 10 times more time.



goes the data relative to that: I have 14 filesystems to dump on
localhost, so the total timeout should be 300*14=4200 seconds, right?
Doing a grep on the sendsize log, I get:

$ grep 'estimate time' sendsize.20050904195400.debug
sendsize[2664]: estimate time for / level 0: 8169.854
sendsize[2664]: estimate time for / level 1: 342.415


root is indeed taking a very long time.
Possible causes:
  - some unresponsive filesystems mounted
  - using gnutar and having many small files
  - and of course, using the disk for something else while
    trying to back up...



sendsize[28610]: estimate time for /boot level 0: 0.186
sendsize[28610]: estimate time for /boot level 1: 0.021
sendsize[28613]: estimate time for /usr level 0: 1200.347
sendsize[28613]: estimate time for /usr level 1: 830.321
sendsize[28613]: estimate time for /usr level 2: 899.342


/usr is also taking a long time.

Could that be because / and /usr are on the same disk
as the swap area?



sendsize[32577]: estimate time for /root level 0: 21.309
sendsize[32577]: estimate time for /root level 1: 2.288
sendsize[32602]: estimate time for /home/ag level 0: 50.686
sendsize[32602]: estimate time for /home/ag level 1: 2.806
sendsize[32636]: estimate time for /home/hm level 0: 127.386
sendsize[32636]: estimate time for /home/hm level 1: 544.951
sendsize[1152]: estimate time for /home/nt level 0: 96.014
sendsize[1152]: estimate time for /home/nt level 1: 4.326
sendsize[1226]: estimate time for /home/uz level 0: 73.265
sendsize[1226]: estimate time for /home/uz level 1: 2.615
sendsize[1305]: estimate time for /var/spool/imap/user/ag level 0: 87.474
sendsize[1305]: estimate time for /var/spool/imap/user/ag level 2: 4.176
sendsize[1305]: estimate time for /var/spool/imap/user/ag level 3: 4.861
sendsize[1393]: estimate time for /var/spool/imap/user/hm level 0: 20.776
sendsize[1393]: estimate time for /var/spool/imap/user/hm level 1: 5.285
sendsize[1393]: estimate time for /var/spool/imap/user/hm level 2: 4.355
sendsize[1458]: estimate time for /var/spool/imap/user/nt level 0: 11.698
sendsize[1458]: estimate time for /var/spool/imap/user/nt level 2: 1.072
sendsize[1458]: estimate time for /var/spool/imap/user/nt level 3: 0.868
sendsize[1465]: estimate time for /var/spool/imap/user/uz level 0: 21.152
sendsize[1465]: estimate time for /var/spool/imap/user/uz level 1: 3.358
sendsize[1465]: estimate time for /var/spool/imap/user/uz level 2: 2.961
sendsize[1486]: estimate time for //new/C$ level 0: 22.735
sendsize[1486]: estimate time for //new/C$ level 1: 1.289
sendsize[1486]: estimate time for //new/C$ level 2: 1.182
sendsize[1498]: estimate time for //new/E$ level 0: 4.540
sendsize[1498]: estimate time for //new/E$ level 1: 0.410
sendsize[1498]: estimate time for //new/E$ level 2: 0.444

It seems that the level 0 estimate for / is the one taking longer.
The tail of that log is:

$ tail sendsize.20050904195400.debug
sendsize[1498]: time 12571.432: 59992 blocks of size 262144. 29027 blocks available
sendsize[1498]: time 12571.432: Total number of bytes: 893464856
sendsize[1498]: time 12571.433: .
sendsize[1498]: estimate time for //new/E$ level 2: 0.444
sendsize[1498]: estimate size for //new/E$ level 2: 872525 KB
sendsize[1498]: time 12571.433: waiting for /usr/bin/smbclient //new/E$ child
sendsize[1498]: time 12571.433: after /usr/bin/smbclient //new/E$ wait
sendsize[1498]: time 12571.433: done with amname '//new/E$', dirname '//new/E$', spindle -1
sendsize[2659]: time 12571.433: child 1498 terminated normally
sendsize: time 12571.438: pid 2659 finish time Sun Sep  4 23:23:31 2005

It takes 12571.438 secs for the estimates; much greater than
4200.

If this is correct, then I should increase the estimate timeout, maybe
ten-fold. But I'm still not sure that is the problem. Is it worthwhile
to try with a giant timeout and see what happens?


Maybe set it to something like "etimeout -16000".
But remember that after the estimate, the backup itself starts,
which also heavily uses the disks.
And because this is the amanda server itself, the holdingdisk is used
very hard too.  Make sure the holdingdisk can feed the bytes fast
enough to the tape drive; otherwise, you'll end up with a thrashing tape
drive (shoeshining effect of having to stop, 

Re: How testing a new backup system in parrallel with the old one ?

2005-09-05 Thread rangzen
OK, it takes a long time to configure this f changer ...

I tried your solution, but with these lines in the disklist
mars /u1/mars /u1 comp-user-tar
mars mars_u2 /u2 comp-user-tar

I get an error:
/usr/local/etc/amanda/Set/disklist, line 4: undefined dumptype `/U1'
/usr/local/etc/amanda/Set/disklist, line 5: undefined dumptype `/U2'

Is there a compile-time option, or is this disklist format standard?
It seems that it did not take the name into account ...

2005/8/25, Jon LaBadie [EMAIL PROTECTED]:
 On Thu, Aug 25, 2005 at 05:17:50PM +0200, rangzen wrote:
  Hello,
 
  I use Amanda with a single DAT drive. Now I have a new automatic changer
  for 8 DATs, and I want to test it and run both systems for one month to
  be sure of not losing anything.
  Is it possible?
  How can I configure this?
 
  I found "Can I backup separate disks of the same host in different
  configurations?" at
  http://amanda.sourceforge.net/fom-serve/cache/31.html but I want to
  back up the SAME disks with different configurations.
 
  I'm afraid that Amanda will mix up some internal files about what it
  has backed up before or not ...
 
 It is a very valid concern.  I've been thinking about it and can't give
 a solid answer, but maybe some pointers.
 
 Any answer depends in part on whether you use gnutar or some version of
 dump to do your backups.  Dump traditionally has used a file in /etc
 called dumpdates to track when it did various levels of backups.
 Amanda has followed the same concept for gnutar backups by using a
 file in /etc called amandates.
 
 If you are using a dump program amanda may not be able to control what
 goes into /etc/dumpdates.  Thus there is potential for conflict between
 two configs backing up the same filesystem.  Solaris' ufsdump does have
 an option to specify what the entry in /etc/dumpdates should be called
 but I don't know if that is common to 'all' dumps or if amanda uses it.
 Your only solution in that case is to use the "record no" setting for
 one config, telling amanda not to update the /etc/dumpdates records.  That
 will limit you to level 0's for that config, I believe.
 
 If instead you are using gnutar, and thus /etc/amandates, then there is
 a way to avoid conflicts.  This technique should also work if the ufsdump
 option I mention above is common amongst dump programs AND amanda uses it.
 
 A tiny section of my /etc/amandates file:
 
 / 0 1104304406
 / 1 1104821711
 /w 0 1104908114
 /w 1 1104821681
 /w/Packages 0 1066888427
 /w/Packages 1 1065851620
 StaticFS-1 0 1104821741
 StaticFS-1 1 1058421785
 StaticFS-2 0 1104908231
 alpha-dumps 0 1101885483
 alpha-home 0 1101885469
 
 Note most entries are directory paths, but the last several are not.
 That is the key.  In your disklist file you can name your entries
 distinct from the path.  The definition of a disklist entry is:
 
 host name [device] dumptype [spindle [interface]]
 
 If all your disklist entry contains is a path, it is considered both
 the device (path) and the entry name.  So in your new config use a
 meaningful name for each of your disklist entries.  Then as far as
 I can recall all info will be recorded for the entry name, not for
 the directory path.
 
 HTH
 
 --
 Jon H. LaBadie  [EMAIL PROTECTED]
  JG Computing
  4455 Province Line Road(609) 252-0159
  Princeton, NJ  08540-4322  (609) 683-7220 (fax)
 


-- 
liberté - partage - respect
freedom - share - respect



Re: How testing a new backup system in parrallel with the old one ?

2005-09-05 Thread Paul Bijnens

rangzen wrote:

OK, it takes a long time to configure this f changer ...

I tried your solution, but with these lines in the disklist
mars /u1/mars /u1 comp-user-tar
mars mars_u2 /u2 comp-user-tar

I get an error:
/usr/local/etc/amanda/Set/disklist, line 4: undefined dumptype `/U1'
/usr/local/etc/amanda/Set/disklist, line 5: undefined dumptype `/U2'

Is there a compile-time option, or is this disklist format standard?
It seems that it did not take the name into account ...



It has been standard since version 2.4.3b3 (according to the NEWS file in
the sources).  And that's been around for quite a while now (2002).

What version are you using?  (amadmin xx version).
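
To make the difference concrete, here is a minimal Python sketch of how an entry in the "host name [device] dumptype [spindle [interface]]" form quoted earlier in this thread could be split into its parts. It is illustrative only and is not Amanda's parser: the function name is hypothetical, and deciding whether the optional device is present by checking the third field against the set of defined dumptypes is an assumption made for this sketch.

```python
def parse_disklist_entry(line, known_dumptypes):
    """Split a disklist line of the form
    host name [device] dumptype [spindle [interface]].

    Illustrative heuristic (an assumption, not Amanda's code): if the
    third field is a defined dumptype, no separate device was given and
    the name doubles as the device.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"too few fields: {line!r}")
    host, name, rest = fields[0], fields[1], fields[2:]
    if rest[0] in known_dumptypes:
        device = name
    elif len(rest) >= 2 and rest[1] in known_dumptypes:
        device, rest = rest[0], rest[1:]
    else:
        raise ValueError(f"undefined dumptype {rest[0]!r}")
    return {
        "host": host,
        "name": name,
        "device": device,
        "dumptype": rest[0],
        "extra": rest[1:],  # optional spindle and interface
    }


dumptypes = {"comp-user-tar"}
print(parse_disklist_entry("mars mars_u2 /u2 comp-user-tar", dumptypes))
print(parse_disklist_entry("mars /u1 comp-user-tar", dumptypes))
```

A parser that only knows the older three-field form would take the third field as the dumptype, which is consistent with the "undefined dumptype" errors quoted above.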






Re: How testing a new backup system in parrallel with the old one ?

2005-09-05 Thread rangzen
 It has been standard since version 2.4.3b3 (according to the NEWS file in
 the sources).  And that's been around for quite a while now (2002).
 
 What version are you using?  (amadmin xx version).

build: VERSION=Amanda-2.4.2p2
Argh ... Thank you Debian woody ...

And all my clients are on 2.4.4p2 ...

A lot of updates ahead ...

-- 
liberté - partage - respect
freedom - share - respect



Re: samba backups

2005-09-05 Thread Gregor Ibic

Hm, hm, strange.

I created a new amanda conf with only that share in the disklist and it
finishes OK. Could it be an indexing problem? I saw that once in a while
sendbackup sends index info, for file indexing purposes.

Could this be the problem? Where can I trace this?
It seems that it is not an smbclient problem, but something in Amanda.

regards,
gregor


Re: planner timeouts

2005-09-05 Thread Charles Sprickman

Charles Sprickman wrote:
h13 (client) debug logs.  Note that there is two-way communication, and 
everything seems to go correctly.  In the debug dir, there are only 
amandad debug logs, nothing else.


That doesn't sound right to me.  There should be a sendbackup log file as 
well, a runtar one, and so on.  Can you verify your inetd config on that 
particular client, to see whether there's something afoul?  Have a look at 
the system logs as well, while you're at it.  amanda might be unable to run 
any secondary programs, for instance.


Me neither...  Is there any way to increase the verbosity of amandad? 
inetd config is good (it's a client, so it's only got the single line for 
amandad), nothing in the system logs, all of the stuff in the libexec 
directory appears to have correct perms (the proper things are setuid to 
my amanda user)...



GETTING ESTIMATES...
planner: time 30.956: error result for host h13.blah.com disk /spool: 
Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk 
/var/qmail/bin: Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk 
/var/qmail/control: Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk /var/db/pkg: 
Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk /usr/local/: 
Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk /home: 
Request to h13.blah.com timed out.
planner: time 30.956: error result for host h13.blah.com disk /: Request to 
h13.blah.com timed out.

planner: time 30.956: getting estimates took 30.811 secs


Does that spell a 30s timeout somewhere?  amanda.conf not taken into account, 
perhaps?  And the obligatory question, did you double-check that there's no 
firewall between that particular client and server?  (If you did, 
triple-check. :-) )


I have bumped up all the timeouts in amanda.conf to ridiculously large 
values. :)  There are firewalls, and the tcpdump trace I sent was taken 
with each host's firewall software (ipfilter) disabled.  Additionally, the 
firewall logs blocked traffic and had nothing to say about this.


What else can I look at here?

I'm also including the tcpdump output again at the end of this message.

Thanks,

Charles

The devel2 (server) view:

19:46:23.967162 devel2.937 > h13.blah.com.amanda: udp 117
19:46:24.074337 h13.blah.com.amanda > devel2.937: udp 50
19:46:24.249414 h13.blah.com.amanda > devel2.937: udp 81
19:46:24.249497 devel2.937 > h13.blah.com.amanda: udp 50
19:46:24.489787 devel2.937 > h13.blah.com.amanda: udp 1465
19:46:34.497794 devel2.937 > h13.blah.com.amanda: udp 1465
19:46:44.508815 devel2.937 > h13.blah.com.amanda: udp 1465

The h13 (client) view:

19:46:23.982760 devel2.937 > h13.blah.com.amanda: udp 117
19:46:24.054390 h13.blah.com.amanda > devel2.937: udp 50
19:46:24.230317 h13.blah.com.amanda > devel2.937: udp 81
19:46:24.264200 devel2.937 > h13.blah.com.amanda: udp 50
19:46:24.523791 devel2.937 > h13.blah.com.amanda: udp 1465 (frag 59731:[EMAIL PROTECTED])
19:46:34.531821 devel2.937 > h13.blah.com.amanda: udp 1465 (frag 62763:[EMAIL PROTECTED])
19:46:44.542471 devel2.937 > h13.blah.com.amanda: udp 1465 (frag 9535:[EMAIL PROTECTED])



Alex


--
Alexander Jolk / BUF Compagnie
tel +33-1 42 68 18 28 /  fax +33-1 42 68 18 29