[ 
https://issues.apache.org/jira/browse/HBASE-19024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209147#comment-16209147
 ] 

Vikas Vishwakarma commented on HBASE-19024:
-------------------------------------------

[~anoop.hbase] we carried out a few combinations of tests, since this was part of a 
larger story to enable WAL on SSD. We looked at:
* hflush HDD vs SSD
* hsync HDD vs SSD
* HDD hflush vs SSD hsync
* HDD hflush vs hsync 

Each test was carried out for both small batches of a few hundred bytes and large 
batches of 1 MB and 10 MB.

We used a multithreaded native HBase write loader for the tests. It does batch 
puts of 100 bytes, 1 MB, and 10 MB using random data. Latency is measured for 
each batch put, along with the total time the loader takes to complete a few 
million rows. Our observation:
* on HDD, using hsync instead of hflush causes a 10-15% latency degradation
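
For anyone who wants to reproduce this kind of measurement, here is a rough sketch of the shape of such a loader. It is not our actual tool. BatchSink, LoaderSketch, and degradationPercent are hypothetical names made up for illustration; a real run would put the HBase batch put call behind BatchSink.write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for the write path; a real loader would issue the HBase batch put here. */
interface BatchSink {
    void write(byte[] batch) throws Exception;
}

final class LoaderSketch {
    /**
     * Runs `threads` writers, each sending `batchesPerThread` batches of random data,
     * and returns every per-batch latency in nanoseconds.
     */
    static List<Long> run(BatchSink sink, int threads, int batchesPerThread, int batchBytes)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<List<Long>>> futures = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final long seed = t;
            futures.add(pool.submit(() -> {
                Random rnd = new Random(seed);
                List<Long> latencies = new ArrayList<>();
                byte[] batch = new byte[batchBytes];
                for (int i = 0; i < batchesPerThread; i++) {
                    rnd.nextBytes(batch);            // fresh random payload per batch
                    long start = System.nanoTime();
                    sink.write(batch);               // the call being timed
                    latencies.add(System.nanoTime() - start);
                }
                return latencies;
            }));
        }
        List<Long> all = new ArrayList<>();
        try {
            for (Future<List<Long>> f : futures) {
                all.addAll(f.get());                 // wait for each writer to finish
            }
        } finally {
            pool.shutdown();
        }
        return all;
    }

    /** Percentage by which `candidate` is slower than `baseline`; positive means degradation. */
    static double degradationPercent(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }
}
```

Running the same loader once with hflush and once with hsync behind the sink, and passing the two mean latencies to degradationPercent, gives a figure comparable to the 10-15% quoted above.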

The SSD results are slightly controversial and not in line with conventional belief, 
and we had a long discussion and experimentation phase on them. They will also 
depend on the type and grade of SSD being used (value or low-grade SSD vs 
enterprise SSD) and other factors, so I am not posting those results here, as 
this jira is independent of SSD anyway :)





> provide a configurable option to hsync WAL edits to the disk for better 
> durability
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-19024
>                 URL: https://issues.apache.org/jira/browse/HBASE-19024
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>         Environment: 
>            Reporter: Vikas Vishwakarma
>
> At present we do not have an option to hsync WAL edits to the disk for better 
> durability. In our local tests we see a 10-15% latency impact from using hsync 
> instead of hflush, which is not very high.
> We should have a configurable option to hsync WAL edits instead of just 
> sync/hflush, which will call the corresponding API on the hadoop side. 
> Currently HBase handles SYNC_WAL and FSYNC_WAL identically, calling 
> FSDataOutputStream sync/hflush on the hadoop side. This can be modified to 
> let FSYNC_WAL call hsync on the hadoop side instead of sync/hflush. We can 
> keep sync as the default, matching the current behavior, and hsync can be 
> enabled through explicit configuration.
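
The dispatch proposed in the description could look roughly like the sketch below. This is an illustration only, not the patch. WalDurability, SyncableStream, WalSyncSketch, and the hsyncEnabled flag are hypothetical names; SyncableStream just mirrors the hflush/hsync pair that FSDataOutputStream exposes on the hadoop side.

```java
import java.io.IOException;

/** Local mirror of the two durability levels named in the issue; not HBase's own class. */
enum WalDurability { SYNC_WAL, FSYNC_WAL }

/** Hypothetical stand-in for the hflush/hsync pair on the hadoop output stream. */
interface SyncableStream {
    void hflush() throws IOException;
    void hsync() throws IOException;
}

final class WalSyncSketch {
    // Assumed config switch: false keeps the current behavior (always hflush).
    private final boolean hsyncEnabled;

    WalSyncSketch(boolean hsyncEnabled) {
        this.hsyncEnabled = hsyncEnabled;
    }

    /** Calls hsync only for FSYNC_WAL when explicitly enabled; otherwise falls back to hflush. */
    void sync(SyncableStream out, WalDurability durability) throws IOException {
        if (hsyncEnabled && durability == WalDurability.FSYNC_WAL) {
            out.hsync();
        } else {
            out.hflush();
        }
    }
}
```

With the flag off, both durability levels take the hflush path, so existing deployments see no change unless they opt in.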



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
