Re: Trap for the unwary user: split_diskbuffer

2009-05-11 Thread Dustin J. Mitchell
On Thu, Mar 5, 2009 at 10:32 PM, Uwe Menges uwe.men...@web.de wrote:
> Another quick check with ls -l /proc/`pidof taper`/fd/ showed me it was
> really true, the taper was happily reading its own amanda-split-buffer-* files
> (marked as (deleted) but it nevertheless had the file handles), writing them
> again as amanda-split-buffer-* files...

Now that I've been over this code several times with a
fine-toothed comb (I've rewritten it to use Perl and the transfer
architecture), I can't figure out what happened here.  It makes sense
that the *taper* was reading and writing from its split-buffer files
-- that's what they're for!  If *gnutar* were reading the split files,
there would be a problem, since those files are created with mkstemp
and immediately unlinked.
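
For readers unfamiliar with the mkstemp-then-unlink idiom mentioned above, here is a minimal Python sketch of it. It is not Amanda's code; the helper name make_unlinked_buffer is hypothetical, and only the amanda-split-buffer- prefix is taken from the file names reported in this thread. It assumes a POSIX system, where an open file can be unlinked.

```python
import os
import tempfile


def make_unlinked_buffer(directory=None):
    """Create a temp file, unlink it at once, return (fd, former_path).

    Hypothetical helper for illustration; not taken from Amanda.
    """
    fd, path = tempfile.mkstemp(prefix="amanda-split-buffer-", dir=directory)
    os.unlink(path)  # the name is gone; the data lives until fd is closed
    return fd, path


fd, path = make_unlinked_buffer()
try:
    os.write(fd, b"chunk of dump data")
    os.lseek(fd, 0, os.SEEK_SET)       # rewind to read back what was written
    data = os.read(fd, 1024)
    visible = os.path.exists(path)     # no directory entry remains
finally:
    os.close(fd)                       # the kernel frees the blocks here

print(data, visible)
```

The process holding the descriptor can keep reading and writing the file even though no directory entry points at it any more, which is why such files show up as (deleted) under /proc.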

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com


Re: Trap for the unwary user: split_diskbuffer

2009-05-11 Thread Gene Heskett
On Monday 11 May 2009, Dustin J. Mitchell wrote:
> On Thu, Mar 5, 2009 at 10:32 PM, Uwe Menges uwe.men...@web.de wrote:
> > Another quick check with ls -l /proc/`pidof taper`/fd/ showed me it was
> > really true, the taper was happily reading its own amanda-split-buffer-*
> > files (marked as (deleted) but it nevertheless had the file handles),
> > writing them again as amanda-split-buffer-* files...
>
> Now that I've been over this code several times with a
> fine-toothed comb (I've rewritten it to use Perl and the transfer
> architecture), I can't figure out what happened here.  It makes sense
> that the *taper* was reading and writing from its split-buffer files
> -- that's what they're for!  If *gnutar* were reading the split files,
> there would be a problem, since those files are created with mkstemp
> and immediately unlinked.
>
> Dustin

This walks and talks like another duck I believe I've seen on lkml, Dustin.

There has been an ongoing discussion on lkml regarding the nearly required use
of fsync on ext3/ext4 systems if absolute data integrity is needed.
Otherwise the 5 second rule, with no locking used to force synchronization,
can find a file on the drive that has not actually been unlinked on the
platter surface, or some such tomfoolery whose nuances I don't fully grok.
I don't believe I have been a victim of that, but the discussion has been
hot and heavy at times about how to speed up such ops without forcing software
to use fsync and therefore suffer a slowdown, since fsync seems to be an
expensive operation in that it bypasses the scheduler's ability to
efficiently manage disk accesses by forcing an out-of-order (seek-wise) flush
of all dirty buffers held.  At least that is what I have gotten out of trying
to follow the thread so far.  Ted Ts'o can probably shed more light on this,
far more clearly than I can.

-- 
Cheers, Gene



Trap for the unwary user: split_diskbuffer

2009-03-05 Thread Uwe Menges
Hi,

I just wanted to report a PEBCAK that I just ran into, as a reference for
others.

Short setup description: I wanted to fully back up the 1.5TB on my Linux
workstation to two 1TB USB hard disks, using amanda 2.6.2alpha-20090130 and
'tpchanger chg-disk:/backup/amanda', with GNUTAR onto 40GB vtapes (24 slots 
on each USB disk), no holding disk, tape_splitsize 4 gbyte and 
fallback_splitsize 1 gbyte. Of course, I needed a split_diskbuffer for 
splitting, and I put it to /data/scratch.

After roughly 68 hours being afk, I had 1129GB on the USB disks (yes, I'll
take a look into performance soon). I noticed that the (non-)performance of
the backup had degraded even more in the last 12 hours, and it was down to a
mere 64 bytes/s write speed by then.

A quick peek with strace showed that the taper process was mainly doing
  getdents64(3, /* 85 entries */, 4096)   = 4080
and I discovered there were 230952 32KB files in /backup/amanda/drive0/data/. 
A sample look at two of them showed that they contained only 0-bytes after the 
header.

After some more digging with the great help of DJ Mitchell on IRC, he gave the 
crucial hint: I didn't exclude the split_diskbuffer location from backup!

Another quick check with ls -l /proc/`pidof taper`/fd/ showed me it was 
really true, the taper was happily reading its own amanda-split-buffer-* files 
(marked as (deleted) but it nevertheless had the file handles), writing them 
again as amanda-split-buffer-* files...
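
The ls -l /proc/<pid>/fd/ check above can also be scripted. The following is only a sketch, assuming a Linux /proc filesystem; the function name deleted_open_files is made up for this example. It returns the descriptors of a process whose target has been unlinked, i.e. the ones ls shows with a trailing (deleted).

```python
import os


def deleted_open_files(pid="self"):
    """Return sorted (fd, target) pairs for descriptors whose file was unlinked.

    Relies on Linux's /proc/<pid>/fd symlinks; returns [] where /proc is absent.
    """
    fd_dir = f"/proc/{pid}/fd"
    found = []
    if not os.path.isdir(fd_dir):
        return found
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
        if target.endswith(" (deleted)"):
            found.append((int(name), target))
    return sorted(found)


if __name__ == "__main__":
    for fd, target in deleted_open_files():
        print(fd, target)
```

To inspect another process, pass its pid, e.g. the number printed by pidof taper; reading another user's /proc entry normally requires root.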

*** So, be sure to have your split_diskbuffer in the exclude list. *** :)
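
A pre-flight check for this trap could look like the sketch below. It is not part of Amanda: the function name and the /data backup root are assumptions for illustration, and excludes are treated as plain directory paths, whereas real Amanda exclude entries are patterns interpreted by the backup program.

```python
import os


def buffer_is_backed_up(buffer_dir, backup_roots, excludes=()):
    """True if buffer_dir lies inside a backed-up tree and is not excluded.

    Simplification: excludes are directories, not glob patterns.
    """
    buf = os.path.realpath(buffer_dir)

    def inside(child, parent):
        parent = os.path.realpath(parent)
        return os.path.commonpath([child, parent]) == parent

    if any(inside(buf, ex) for ex in excludes):
        return False
    return any(inside(buf, root) for root in backup_roots)


# /data/scratch is the split_diskbuffer from the setup above;
# "/data" as a backed-up directory is an assumed example.
if buffer_is_backed_up("/data/scratch", ["/data"]):
    print("warning: split_diskbuffer is inside a backed-up directory")
```

Passing excludes=["/data/scratch"] makes the check return False, which corresponds to the fix described above.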

Maybe there will be a fix some day (tar, e.g., now also recognizes the
archive it is currently creating and won't read it), but I see that this is
probably not the most urgent part to work on.

Yours, Uwe