Create large preallocated file blocks when performing merges

2009-09-30 Thread Jason Rutherglen
I wanted to post this before I forget. Based on an informal discussion at the Katta meeting regarding the high write throughput of ZooKeeper (see http://wiki.apache.org/hadoop/ZooKeeper/Performance ), which uses the database technique of preallocating large empty files before filling them up with re…

Re: Create large preallocated file blocks when performing merges

2009-10-01 Thread Michael McCandless
We are already using File.setLength to pre-set the length of the CFS file during merging, in the hope that it'll help the filesystem minimize fragmentation of the file, but we don't use it when creating the individual index files. We could pursue doing so for the individual index files too... I wasn'…

Re: Create large preallocated file blocks when performing merges

2009-10-01 Thread Ted Dunning
The place it would help is where small writes are done and flushed. If you do large writes or have good buffering (aka do large writes), then preallocation probably won't help. It helps ZK because ZK writes a transaction log that must flush after each small write.