[ 
https://issues.apache.org/jira/browse/PIG-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031024#comment-13031024
 ] 

Dmitriy V. Ryaboy commented on PIG-1825:
----------------------------------------

Btw, skipping the WAL is a BAD idea, because it means no WAL at all, even during *normal* operation. 
WALs are useful for many things, recovery being only one of them.

From the HBase book at 
http://hbase.apache.org/book.html#perf.hbase.client.putwal :

13.7.7. Turn off WAL on Puts
A frequently discussed option for increasing throughput on Puts is to call 
writeToWAL(false). Turning this off means that the RegionServer will not write 
the Put to the Write Ahead Log, only into the memstore, HOWEVER the consequence 
is that if there is a RegionServer failure there will be data loss. If 
writeToWAL(false) is used, do so with extreme caution. You may find in 
actuality that it makes little difference if your load is well distributed 
across the cluster.

In general, it is best to use WAL for Puts, and where loading throughput is a 
concern to use bulk loading techniques instead.
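To make the data-loss consequence described above concrete, here is a minimal toy model in Java. It is not HBase code: the class ToyRegionServer and all of its methods are hypothetical names invented for illustration. The only assumption is the behaviour the book excerpt states, namely that a Put without the WAL lives only in the memstore and is gone after a RegionServer failure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a region server. NOT HBase code; every name here is hypothetical. */
public class ToyRegionServer {
    // Stands in for the write ahead log: in this model it survives a crash.
    private final List<String[]> wal = new ArrayList<>();
    // Stands in for the memstore: in this model it is lost on a crash.
    private Map<String, String> memstore = new HashMap<>();

    /** Stores a value; the edit is logged only when writeToWal is true. */
    public void put(String row, String value, boolean writeToWal) {
        if (writeToWal) {
            wal.add(new String[] {row, value});
        }
        memstore.put(row, value);
    }

    /** Simulates a failure: the memstore is dropped, then rebuilt from the log. */
    public void crashAndRecover() {
        memstore = new HashMap<>();
        for (String[] edit : wal) {
            memstore.put(edit[0], edit[1]);
        }
    }

    /** Returns the stored value, or null if the row is unknown. */
    public String get(String row) {
        return memstore.get(row);
    }

    public static void main(String[] args) {
        ToyRegionServer rs = new ToyRegionServer();
        rs.put("row1", "logged", true);    // goes to the log and the memstore
        rs.put("row2", "unlogged", false); // goes to the memstore only
        rs.crashAndRecover();
        System.out.println("row1 = " + rs.get("row1")); // prints "row1 = logged"
        System.out.println("row2 = " + rs.get("row2")); // prints "row2 = null"
    }
}
```

In this sketch both rows are readable before the simulated failure, but only the logged row comes back afterwards. A real RegionServer also flushes the memstore to disk periodically, which the toy model leaves out, so it only illustrates edits that have not been flushed yet.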

> ability to turn off the write ahead log for pig's HBaseStorage
> --------------------------------------------------------------
>
>                 Key: PIG-1825
>                 URL: https://issues.apache.org/jira/browse/PIG-1825
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.8.0
>            Reporter: Corbin Hoenes
>            Priority: Minor
>         Attachments: HBaseStorage_noWAL.patch, PIG-1825_1.patch
>
>
> Added an option to allow a caller of HBaseStorage to turn off the 
> WriteAheadLog feature while doing bulk loads into hbase.
> From the performance tuning wikipage: 
> http://wiki.apache.org/hadoop/PerformanceTuning
> "To speed up the inserts in a non critical job (like an import job), you can 
> use Put.writeToWAL(false) to bypass writing to the write ahead log."
> We've tested this on HBase 0.20.6 and it helps dramatically.  
> The -noWAL option is passed in just like other options for hbase storage:
> STORE myalias INTO 'MyTable' USING 
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('mycolumnfamily:field1 
> mycolumnfamily:field2','-noWAL');
> This would be my first patch so please educate me with any steps I need to 
> do.  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
