Maybe we should look at testing with KFS - I did some tests, and the IO seems to be slightly slower, but lost data is lost data.
Append and sync do work in KFS, unlike in HDFS. It would make a good stir in the Hadoop community...

On Thu, Apr 2, 2009 at 11:24 PM, stack <[email protected]> wrote:

> On Fri, Apr 3, 2009 at 2:41 AM, Ryan Rawson <[email protected]> wrote:
>
> > So, what will be in hadoop-0.20 to minimize this kind of horrible data
> > loss?
>
> In 0.20 timeframe, you will have to enable flush (HADOOP-5332) but as Jim
> says, it's not going to do much good without HADOOP-4379. The latter won't
> be in hadoop 0.20. We'll have to work to make sure it makes it into
> HADOOP 0.21. One recent suggestion was to contribute a patch to HADOOP
> that enabled appends in TRUNK -- but we should make sure first that all
> objections to append have been put to rest.
>
> Working flush/sync is the most important hbase issue. Up to this, we've
> not been doing a good job staying on top of its progress.
>
> St.Ack
