Re: Trap for the unwary user: split_diskbuffer
On Thu, Mar 5, 2009 at 10:32 PM, Uwe Menges uwe.men...@web.de wrote:

> Another quick check with ls -l /proc/`pidof taper`/fd/ showed me it was really true, the taper was happily reading its own amanda-split-buffer-* files (marked as (deleted) but it nevertheless had the file handles), writing them again as amanda-split-buffer-* files...

Now that I've been over this code several times with a fine-toothed comb (I've rewritten it to use Perl and the transfer architecture), I can't figure out what happened here. It makes sense that the *taper* was reading and writing from its split-buffer files -- that's what they're for! If *gnutar* were reading the split files, there would be a problem, since those files are created with mkstemp and immediately unlinked.

Dustin

--
Open Source Storage Engineer
http://www.zmanda.com
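For readers unfamiliar with the "mkstemp and immediately unlinked" pattern mentioned above, here is a minimal Python sketch of the general technique. It is not Amanda's code: the helper name make_unlinked_buffer and the scratch directory are made up for illustration, and only the amanda-split-buffer- prefix comes from the thread. It assumes a POSIX system, where a file can be unlinked while it is still open.

```python
import os
import tempfile


def make_unlinked_buffer(directory, prefix="amanda-split-buffer-"):
    """Create a temp file, then unlink it while keeping the descriptor open.

    Hypothetical helper for illustration only; Amanda's taper is not
    implemented this way in Python.
    """
    fd, path = tempfile.mkstemp(prefix=prefix, dir=directory)
    os.unlink(path)  # the name disappears, the open descriptor stays valid
    return fd, path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        fd, path = make_unlinked_buffer(scratch)
        try:
            os.write(fd, b"chunk of dump data")
            os.lseek(fd, 0, os.SEEK_SET)
            # The directory listing no longer contains the buffer file...
            print("visible in directory:", os.path.basename(path) in os.listdir(scratch))
            # ...but the process holding the descriptor can still read it.
            print("still readable via fd:", os.read(fd, 64))
        finally:
            os.close(fd)
```

On Linux, such a descriptor shows up under /proc/<pid>/fd/ with a "(deleted)" suffix, which matches what the ls -l output quoted above reported for the taper.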
Re: Trap for the unwary user: split_diskbuffer
On Monday 11 May 2009, Dustin J. Mitchell wrote:

> On Thu, Mar 5, 2009 at 10:32 PM, Uwe Menges uwe.men...@web.de wrote:
>
> > Another quick check with ls -l /proc/`pidof taper`/fd/ showed me it was really true, the taper was happily reading its own amanda-split-buffer-* files (marked as (deleted) but it nevertheless had the file handles), writing them again as amanda-split-buffer-* files...
>
> Now that I've been over this code several times with a fine-toothed comb (I've rewritten it to use Perl and the transfer architecture), I can't figure out what happened here. It makes sense that the *taper* was reading and writing from its split-buffer files -- that's what they're for! If *gnutar* were reading the split files, there would be a problem, since those files are created with mkstemp and immediately unlinked.
>
> Dustin

This walks and talks like another duck I believe I've seen on lkml, Dustin.

There has been an ongoing discussion on lkml regarding the nearly required use of fsync on ext3/ext4 systems if absolute data integrity is needed. Otherwise the 5 second rule, with no locking used to force synchronization, can find a file on the drive that has not actually been unlinked on the platter surface, or some such tomfoolery whose nuances I don't fully grok.

I don't believe I have been a victim of that, but the discussion has been hot and heavy at times about how to speed up such ops without forcing software to call fsync and suffer the resulting slowdown. fsync seems to be an expensive operation because it bypasses the scheduler's ability to manage disk accesses efficiently, forcing an out-of-order (seek-wise) flush of all dirty buffers held. At least that is what I have gotten out of trying to follow the thread so far. Theodore Ts'o can probably shed more light on this, and far more clearly than I can.

--
Cheers, Gene

There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order.
-Ed Howdershelt (Author)

Look, we play the Star Spangled Banner before every game. You want us to pay income taxes, too?
-- Bill Veeck, Chicago White Sox
Trap for the unwary user: split_diskbuffer
Hi,

I just wanted to report a PEBCAK that I just ran into, as a reference for others.

Short setup description: I wanted to fully back up the 1.5TB on my Linux workstation to two 1TB USB hard disks, using amanda 2.6.2alpha-20090130 and 'tpchanger chg-disk:/backup/amanda', with GNUTAR onto 40GB vtapes (24 slots on each USB disk), no holding disk, tape_splitsize 4 gbyte and fallback_splitsize 1 gbyte. Of course, I needed a split_diskbuffer for splitting, and I put it in /data/scratch.

After roughly 68 hours of being afk, I had 1129GB on the USB disks (yes, I'll take a look into performance soon). I noticed that the (non-)performance of the backup had degraded even further in the last 12 hours, and by then the write speed was a mere 64 bytes/s.

A quick peek with strace showed that the taper process was mainly doing

    getdents64(3, /* 85 entries */, 4096) = 4080

and I discovered there were 230952 32KB files in /backup/amanda/drive0/data/. A sample look at two of them showed that they contained only 0-bytes after the header.

After some more digging with the great help of DJ Mitchell on IRC, he gave the crucial hint: I hadn't excluded the split_diskbuffer location from the backup!

Another quick check with ls -l /proc/`pidof taper`/fd/ showed me it was really true: the taper was happily reading its own amanda-split-buffer-* files (marked as (deleted), but it nevertheless had the file handles) and writing them again as amanda-split-buffer-* files...

*** So, be sure to have your split_diskbuffer in the exclude list. *** :)

Maybe there will be a fix some day (tar, for example, now also recognizes the archive it is currently creating and won't read it), but I see that this is probably not the most urgent part to work on.

Yours, Uwe
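As a rough illustration of the rule above, the following Python sketch checks whether a split_diskbuffer directory sits inside a backed-up directory without being excluded. Everything here is an assumption made for the example: the function names are hypothetical, "/data" is only a guess at a backed-up directory (the post does not show the disklist), and the check compares plain absolute paths rather than parsing Amanda's real exclude patterns.

```python
from pathlib import PurePosixPath


def is_within(path, parent):
    """Return True if path equals parent or lies somewhere below it."""
    path, parent = PurePosixPath(path), PurePosixPath(parent)
    return path == parent or parent in path.parents


def buffer_is_backed_up(buffer_dir, backup_roots, excluded_dirs):
    """Return True if buffer_dir would be dumped along with the data.

    backup_roots and excluded_dirs are plain absolute directory paths
    (hypothetical inputs, not Amanda configuration syntax).
    """
    covered = any(is_within(buffer_dir, root) for root in backup_roots)
    excluded = any(is_within(buffer_dir, ex) for ex in excluded_dirs)
    return covered and not excluded


if __name__ == "__main__":
    # The situation described above: buffer inside a dumped tree, no exclude.
    print(buffer_is_backed_up("/data/scratch", ["/data"], []))
    # After adding the buffer location to the exclude list.
    print(buffer_is_backed_up("/data/scratch", ["/data"], ["/data/scratch"]))
```

The first call reports the trap (the buffer would be backed up), the second reports that the exclude covers it. A real check would have to read the disklist and the dumptype's exclude settings instead of taking hand-written lists.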