[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput

Liu Shaohui (JIRA) Thu, 26 Sep 2013 20:32:37 -0700

    [ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779607#comment-13779607 ]


Liu Shaohui commented on HBASE-8755:
------------------------------------

Updated the result table and added an ops diff column:
||Thread number||Time without Patch||Ops without Patch||Time with Patch||Ops with Patch||Time diff %||Ops diff %||
|1|579.38|1725.983|625.937|1597.605|-8.04|-7.44|
|1|580.307|1723.226|630.346|1586.43|-8.62|-7.94|
|1|577.853|1730.544|654.205|1528.573|-13.21|-11.67|
|5|799.579|6253.291|785.696|6363.785|1.74|1.77|
|5|795.013|6289.206|780.642|6404.984|1.81|1.84|
|5|826.27|6051.291|781.909|6394.606|5.37|5.67|
|50|3290.482|15195.343|1165.773|42890|64.57|182.26|
|50|3298.387|15158.925|1167.992|42808.516|64.59|182.40|
|50|3224.495|15506.304|1154.921|43293.004|64.18|179.20|
|75|4450.76|16851.055|1253.448|59834.953|71.84|255.08|
|75|4506.143|16643.945|1269.806|59064.141|71.82|254.87|
|75|4516.453|16605.951|1245.954|60194.84|72.41|262.49|
|100|5561.074|17982.137|1493.102|66974.656|73.15|272.45|
|100|5616.81|17803.699|1496.263|66833.172|73.36|275.39|
|100|5612.268|17818.107|1468.5|68096.695|73.83|282.18|

Time diff = (Time without Patch - Time with Patch) / Time without Patch * 100
Ops diff = (Ops with Patch - Ops without Patch) / Ops without Patch * 100
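
As a quick check, the two formulas can be written as a small Java helper. This is only a sketch: the class and method names (PatchDiff, timeDiffPercent, opsDiffPercent) are made up for illustration and are not part of the patch or the benchmark tool. The inputs in main are the first 1-thread row and the first 50-thread row of the table.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of HBase or the benchmark tool.
class PatchDiff {
    // Time diff %: positive when the patched run takes less time.
    static double timeDiffPercent(double timeWithout, double timeWith) {
        return (timeWithout - timeWith) / timeWithout * 100.0;
    }

    // Ops diff %: positive when the patched run reports a higher ops figure.
    static double opsDiffPercent(double opsWithout, double opsWith) {
        return (opsWith - opsWithout) / opsWithout * 100.0;
    }

    public static void main(String[] args) {
        // First 1-thread row of the table.
        System.out.println(String.format(Locale.ROOT, "1 thread:   time %.2f%%, ops %.2f%%",
            timeDiffPercent(579.38, 625.937), opsDiffPercent(1725.983, 1597.605)));
        // First 50-thread row of the table.
        System.out.println(String.format(Locale.ROOT, "50 threads: time %.2f%%, ops %.2f%%",
            timeDiffPercent(3290.482, 1165.773), opsDiffPercent(15195.343, 42890.0)));
    }
}
```

Note that the two percentages use different baselines and signs, so a 64.57% time reduction at 50 threads corresponds to a 182.26% ops increase, not a 64.57% one.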

[~stack] Which HDFS and HBase versions did you use in your test? We could redo 
the tests on a cluster running the same HDFS and HBase versions as yours.

                
> A new write thread model for HLog to improve the overall HBase write 
> throughput
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-8755
>                 URL: https://issues.apache.org/jira/browse/HBASE-8755
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance, wal
>            Reporter: Feng Honghua
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.96.1
>
>         Attachments: 8755trunkV2.txt, HBASE-8755-0.94-V0.patch, 
> HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch, HBASE-8755-trunk-V1.patch
>
>
> In the current write model, each write handler thread (executing put()) 
> individually goes through a full 'append (hlog local buffer) => HLog writer 
> append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, 
> which incurs heavy contention on updateLock and flushLock.
> The only optimization is checking whether the current syncTillHere > txid, in 
> the hope that another thread has already written/synced this thread's txid to 
> hdfs so that the write/sync can be skipped; it helps much less than expected.
> Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi 
> proposed a new write thread model for writing hdfs sequence files, and the 
> prototype implementation shows a 4X throughput improvement (from 17000 to 
> 70000+). 
> I applied this new write thread model to HLog, and the performance test in our 
> test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 1 
> RS, from 22000 to 70000 for 5 RS). The 1 RS write throughput (1K row size) 
> even beats that of Bigtable (Percolator, published in 2011, says Bigtable's 
> write throughput then was 31002). I can provide the detailed performance test 
> results if anyone is interested.
> The changes for the new write thread model are as follows:
>  1> All put handler threads append their edits to HLog's local pending buffer 
> (this notifies the AsyncWriter thread that there are new edits in the buffer);
>  2> All put handler threads wait in the HLog.syncer() function for the 
> underlying threads to finish the sync that contains their txid;
>  3> A single AsyncWriter thread is responsible for retrieving all the buffered 
> edits from HLog's local pending buffer and writing them to hdfs 
> (hlog.writer.append) (it notifies the AsyncFlusher thread that there are new 
> writes to hdfs that need a sync);
>  4> A single AsyncFlusher thread is responsible for issuing a sync to hdfs 
> to persist the writes made by AsyncWriter (it notifies the AsyncNotifier 
> thread that the sync watermark has increased);
>  5> A single AsyncNotifier thread is responsible for notifying all pending 
> put handler threads that are waiting in the HLog.syncer() function;
>  6> There is no LogSyncer thread any more (since the AsyncWriter/AsyncFlusher 
> threads always do the same job it did).
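
For readers who want to see the shape of steps 1-5 in the description above, here is a minimal, self-contained Java sketch. It is not the code from the attached patches: the class WalPipelineSketch, the Watermark helper, the sync(txid) method name and the in-memory list standing in for the HDFS log file are all assumptions made for illustration, and the HDFS sync itself is reduced to a comment.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: names and structure are assumptions, not the HBASE-8755 patch.
class WalPipelineSketch {
    // Monotonically increasing txid watermark that threads can wait on.
    static final class Watermark {
        private long value;
        synchronized void advance(long v) {
            if (v > value) { value = v; notifyAll(); }
        }
        synchronized long awaitAbove(long seen) throws InterruptedException {
            while (value <= seen) wait();
            return value;
        }
    }
    private interface Loop { void run() throws InterruptedException; }

    private final Object bufferLock = new Object();
    private List<String> pending = new ArrayList<>();   // local pending buffer
    private long lastTxid = 0;
    private final List<String> log = new ArrayList<>(); // stand-in for the HDFS log file
    private final Watermark written = new Watermark(), flushed = new Watermark(), synced = new Watermark();

    // Step 1: a put handler buffers its edit and wakes the writer thread.
    long append(String edit) {
        synchronized (bufferLock) {
            pending.add(edit);
            bufferLock.notifyAll();
            return ++lastTxid;
        }
    }

    // Step 2: a put handler blocks until the synced watermark covers its txid.
    void sync(long txid) {
        try {
            synced.awaitAbove(txid - 1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    int loggedCount() { synchronized (log) { return log.size(); } }

    void start() {
        // Step 3: the single writer drains the whole pending buffer as one batch.
        startDaemon("AsyncWriter", () -> {
            while (true) {
                List<String> batch; long high;
                synchronized (bufferLock) {
                    while (pending.isEmpty()) bufferLock.wait();
                    batch = pending;
                    pending = new ArrayList<>();
                    high = lastTxid;
                }
                synchronized (log) { log.addAll(batch); } // stands in for hlog.writer.append
                written.advance(high);
            }
        });
        // Step 4: the single flusher covers everything written so far with one sync.
        startDaemon("AsyncFlusher", () -> {
            for (long seen = 0; ; ) {
                seen = written.awaitAbove(seen);
                flushed.advance(seen); // a real flusher would sync HDFS before advancing
            }
        });
        // Step 5: the single notifier publishes the watermark, waking waiting handlers.
        startDaemon("AsyncNotifier", () -> {
            for (long seen = 0; ; ) {
                seen = flushed.awaitAbove(seen);
                synced.advance(seen);
            }
        });
    }

    private static void startDaemon(String name, Loop loop) {
        Thread t = new Thread(() -> {
            try { loop.run(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }, name);
        t.setDaemon(true);
        t.start();
    }

    public static void main(String[] args) throws InterruptedException {
        WalPipelineSketch wal = new WalPipelineSketch();
        wal.start();
        Thread[] handlers = new Thread[8];
        for (int h = 0; h < handlers.length; h++) {
            handlers[h] = new Thread(() -> {
                for (int i = 0; i < 100; i++) wal.sync(wal.append("edit"));
            });
            handlers[h].start();
        }
        for (Thread t : handlers) t.join();
        System.out.println("edits logged: " + wal.loggedCount());
    }
}
```

The point of the sketch is the batching: handlers only touch the buffer lock and the final watermark, while each writer pass and each flusher pass can cover edits from many handlers at once, which is consistent with the gain appearing only at higher thread counts in the table.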

