On Mon, Aug 16, 2010 at 3:55 PM, Valeriu Mutu <vm...@pcbi.upenn.edu> wrote:
> According to the documentation, to speed up backups, one could set up 
> holding disks where the data will be buffered before it is written to tape. 
> This method is known as FILE-WRITE [1]. This sounds good and works well for 
> DLE's which can fit into the holding disk area.

Correct.

> Nevertheless, for the DLE's that don't fit into the holding disk, Amanda 
> would use the second method known as PORT-WRITE [1]. With this method, Amanda 
> splits the DLE into chunks of a given size S, writes each chunk to disk one 
> at a time, and then, once the chunk of size S is completely on disk, writes 
> the chunk to tape.

This is not correct.

First, "chunks" are used in the holding disk, and are completely
unrelated to splitting.  Amanda writes "parts" to tapes.

Second, and more importantly, Amanda writes the data directly to tape
as it arrives, but writes it to a part buffer in parallel, either on
disk or in memory.  I'm specifically objecting to "then, once the
chunk .. is completely on disk", as this implies the operations do not
occur in parallel.

The part buffer is only consulted if there is an error writing the
data to tape: the failed part is rewritten from the buffer rather than
starting the dump over.  Note that 3.2 will reduce the need for these
split buffers -- but I won't get into that right now.
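
To make the ordering concrete, here is a minimal Python sketch of that
flow.  It is not Amanda's code: FakeTape, TapeError and dump_to_tapes
are hypothetical names invented for illustration, and the "tape" is
just an in-memory object with a fixed capacity.  It only shows that
each block goes to tape and to the part buffer in the same step, and
that the buffer is read back only when a tape write fails.

```python
class TapeError(Exception):
    """Raised by the hypothetical tape object when a write fails."""


class FakeTape:
    """Stand-in for a tape: fixed capacity, records completed parts."""

    def __init__(self, capacity):
        self.free = capacity
        self.parts = []             # parts written completely
        self.current = bytearray()  # part currently being written

    def write(self, block):
        if len(block) > self.free:
            raise TapeError("end of tape")
        self.free -= len(block)
        self.current += block

    def abort_part(self):
        # A partially written part is discarded.
        self.current = bytearray()

    def finish_part(self):
        self.parts.append(bytes(self.current))
        self.current = bytearray()


def dump_to_tapes(stream, tapes, part_size, block_size=4):
    """Split stream into parts, writing each block to tape as it arrives."""
    tapes = iter(tapes)
    tape = next(tapes)
    while True:
        part_buffer = bytearray()  # the buffer only ever holds one part
        while len(part_buffer) < part_size:
            want = min(block_size, part_size - len(part_buffer))
            block = stream.read(want)
            if not block:
                break
            part_buffer += block   # copy into the part buffer ...
            try:
                tape.write(block)  # ... and write to tape in the same step
            except TapeError:
                # Only now is the buffer read back: restart this part on
                # the next tape instead of restarting the whole dump.
                tape.abort_part()
                tape = next(tapes)
                tape.write(bytes(part_buffer))
        if not part_buffer:
            return
        tape.finish_part()
```

With a 20-byte stream, 8-byte parts and two 12-byte tapes, part 1 lands
on the first tape, part 2 hits end-of-tape halfway through and is
replayed from the buffer onto the second tape, and part 3 follows it.
The sketch simply raises if it runs out of tapes.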

> Questions:
> - Does Amanda continuously keep the disk buffer full? In other words, as it 
> starts writing to tape the buffered chunk1, will it start buffering chunk2? 
> Probably not, because it would need the complete copy of chunk1, if chunk1 
> fails to be written successfully to tape. Right?

Right - it only starts filling the disk buffer with data for part 2
once part 1 is written to tape.  But in ordinary operation, that
occurs immediately after the last byte of part 1 is read from the
dumper.

> - Is there a way to see the speed at which 'taper' writes data to tape?

You can look at the report sent by amreport after the dump run - but
note that it includes the time to write filemarks and labels and
whatnot, so it is not the full "streaming" rate.

> - Why is disk buffer so slow for me? When I use FILE-WRITE, I get great 
> speeds! This is not the case with PORT-WRITE. I am running Amanda 2.6.1p2 
> (I'll upgrade to 3.1.2 soon), with a holding disk of 840Gb and a disk buffer 
> of 370Gb. I have some DLE's which are greater than 1Tb and Amanda is using the 
> PORT-WRITE method to write them to tape. When this happens, I see that the 
> disk buffer becomes full, which is good, but yet the speed of writing to tape 
> seems slow. I don't have a way to see the writing speed of the 'taper', so 
> I'm relying on the amount of data Amanda reads from the disk device. With 
> 'iostat' I can see that Amanda is reading from the device at a peak speed of 
> 4Mb/sec (the device /dev/dm-8 or /dev/mapper/cronos-amanda-diskbuffer below 
> is dedicated to Amanda's disk buffer):

I wouldn't necessarily trust iostat - what's going on is a bit
higher-level than iostat is intended to address.

I wonder why you have a 370GB disk buffer.  Unless your tapes are 8TB,
that's too big a part size.  Part size (and thus disk buffer) should
be 5-10% of your tape size, at most.  You could probably take 160GB+
of disk buffer and use it as holding instead of disk buffer, allowing
this DLE to fit in holding disk.
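
The arithmetic behind those numbers can be sketched in a few lines of
Python.  The function names are made up for illustration, and the 800GB
tape is only an assumed example size (the tape size was not given in
this thread); the 840GB, 370GB and 160GB figures are the ones quoted
above.

```python
def part_size_range_gb(tape_gb, low_pct=5, high_pct=10):
    """Return the (low, high) part size in GB under the 5-10% guideline."""
    return tape_gb * low_pct / 100, tape_gb * high_pct / 100


def implied_tape_gb(buffer_gb, pct=5):
    """Tape size at which buffer_gb would be exactly pct percent of a tape."""
    return buffer_gb * 100 / pct


def move_buffer_to_holding(holding_gb, buffer_gb, moved_gb):
    """Shift moved_gb of space from the disk buffer to the holding disk."""
    return holding_gb + moved_gb, buffer_gb - moved_gb


# Assumed 800GB tape: parts (and the disk buffer) of roughly 40-80GB.
print(part_size_range_gb(800))                # (40.0, 80.0)

# A 370GB buffer is 5% of a 7400GB tape, i.e. close to 8TB.
print(implied_tape_gb(370))                   # 7400.0

# Moving 160GB from buffer to holding: 1000GB holding, 210GB buffer.
print(move_buffer_to_holding(840, 370, 160))  # (1000, 210)
```

Whether 1000GB of holding is enough depends on how far over 1TB the
DLE actually is, hence the "160GB+".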

The bottleneck with PORT-WRITE is generally the filesystem.  If you
were dumping bytes from a raw disk onto tape, then the additional
write to the disk buffer might be a problem.  But in most cases, the
bytes are being dumped by tar, which is making all sorts of funky
filesystem calls, traversing directories, inodes, etc., and generally
putting a strain on the filesystem to keep up.  It then adds a lot of
overhead to encode that data into 512 byte tar records.  The
bottleneck can be hard to see because it's not all I/O, and it's not
all userspace CPU time.

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com
