We have implemented this idea, and it definitely increases HLog performance by a wide margin. The one drawback is that writes to HLog (from the HDFS perspective) become more "batchy", and writes to an HDFS file consume quite a bit of CPU. So I have observed that this change increases overall system throughput, but individual transaction latency suffers slightly.
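To make the batching idea concrete, here is a minimal sketch of the two-list scheme Mingjian describes below. It is not the actual HLog code or the patch: the class name GroupCommitLog, the field names, and the in-memory "durable" list standing in for an HDFS sync are all assumptions for illustration. Each writer adds its edit to a pending list, then takes the sync lock; whoever holds the lock swaps out the whole pending list and flushes it in one call, so writers that arrive later often find their edit already flushed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical group-commit sketch; names are illustrative, not HBase APIs.
class GroupCommitLog {
    private final Object bufferLock = new Object(); // guards pending and appended
    private final Object syncLock = new Object();   // plays the role of "updateLock"

    private List<String> pending = new ArrayList<>();       // the "waledits" list
    private final List<String> durable = new ArrayList<>(); // stand-in for the file on disk
    private long appended = 0; // sequence number of the last edit buffered
    private long synced = 0;   // sequence number of the last edit flushed
    private int syncCalls = 0; // how many simulated syncs were issued

    // Returns only after the edit has been included in some flush.
    public void append(String edit) {
        long mySeq;
        synchronized (bufferLock) {
            pending.add(edit);
            mySeq = ++appended;
        }
        synchronized (syncLock) {
            if (synced >= mySeq) {
                return; // another thread's batch already covered this edit
            }
            List<String> flushEdits; // the "flushedits" list
            long upTo;
            synchronized (bufferLock) {
                flushEdits = pending;
                pending = new ArrayList<>();
                upTo = appended;
            }
            durable.addAll(flushEdits); // one simulated sync for the whole batch
            syncCalls++;
            synced = upTo;
        }
    }

    public int syncCalls() {
        synchronized (syncLock) { return syncCalls; }
    }

    public int durableCount() {
        synchronized (syncLock) { return durable.size(); }
    }

    public static void main(String[] args) throws InterruptedException {
        GroupCommitLog log = new GroupCommitLog();
        List<Thread> writers = new ArrayList<>();
        for (int t = 0; t < 8; t++) {
            final int id = t;
            Thread w = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    log.append("edit-" + id + "-" + i);
                }
            });
            writers.add(w);
            w.start();
        }
        for (Thread w : writers) {
            w.join();
        }
        // Under contention the number of syncs is typically below the number of edits.
        System.out.println("edits=" + log.durableCount() + " syncs=" + log.syncCalls());
    }
}
```

In this sketch the simulated sync is nearly free, so the batches stay small; with a real, slow sync more writers queue up behind the lock and each flush carries more edits. A writer may also wait behind a batch that does not contain its own edit, which is in line with the latency effect I mentioned.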
-dhruba

On Wed, Jun 29, 2011 at 6:08 AM, Joey Echeverria <[email protected]> wrote:

> Hey Mingjian,
>
> This sounds like a good idea. Your patch didn't make it through. Would you
> mind either filing a JIRA and uploading your patch there or at least posting
> it to something like pastebin so we can take a look.
>
> -Joey
>
> On Jun 29, 2011, at 3:27, Mingjian Deng <[email protected]> wrote:
>
> > Hi:
> > We found that the hlog sync to disk each time. When one thread exec
> > "doWrite(info, logKey, edit);", the others wait for "updateLock" in
> > HLog.java.
> > Why not the others add their edits into a list and wait. When sync's
> > time, the whole list sync to disk once. I think it will decrease the IO
> > calls.
> >
> > So Maybe we will make two lists for edits. Each thread write to the
> > "waledits" and wait for "updateLock". Each thread can copy the "waledits" to
> > "flushedits" and flush the "flushedits" to
> > disk once it gets "updateLock".
> >
> > In my test, it can increase the write speed of 40%.
> >
> > Just see the HLog.patch.

--
Connect to me at http://www.facebook.com/dhruba
