[ https://issues.apache.org/jira/browse/PIG-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bill Graham updated PIG-1825:
-----------------------------
    Attachment: PIG-1825_3.patch

Here's patch #3 with {{testStoreToHBase_2_no_WAL}} removed. I agree we should remove it if HBase doesn't even deal with it in the unit test mode.

I think using the {{-noWAL}} option makes the most sense, since it's very clear what it does. I've added comments in the Javadocs to make sure the risks are clear. If someone uses an obscurely named flag (i.e., {{-noWAL}}) without understanding what it does by reading either the Pig Javadocs or the HBase documentation, then they're really flying blind.

> ability to turn off the write ahead log for pig's HBaseStorage
> --------------------------------------------------------------
>
>                 Key: PIG-1825
>                 URL: https://issues.apache.org/jira/browse/PIG-1825
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.8.0
>            Reporter: Corbin Hoenes
>            Assignee: Bill Graham
>            Priority: Minor
>         Attachments: HBaseStorage_noWAL.patch, PIG-1825_1.patch,
> PIG-1825_2.patch, PIG-1825_3.patch
>
>
> Added an option to allow a caller of HBaseStorage to turn off the
> WriteAheadLog feature while doing bulk loads into HBase.
> From the performance tuning wiki page:
> http://wiki.apache.org/hadoop/PerformanceTuning
> "To speed up the inserts in a non critical job (like an import job), you can
> use Put.writeToWAL(false) to bypass writing to the write ahead log."
> We've tested this on HBase 0.20.6 and it helps dramatically.
> The -noWAL option is passed in just like other options for HBaseStorage:
> STORE myalias INTO 'MyTable' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('mycolumnfamily:field1
> mycolumnfamily:field2','-noWAL');
> This would be my first patch, so please educate me on any steps I need to
> take.
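
For readers unfamiliar with the flag, here is a minimal Java sketch of the behavior described in the issue: look for -noWAL in the storage options and, if it is present, turn off the write-ahead log on each put. This is not the code from the attached patches. FakePut, hasNoWal, and buildPut are hypothetical stand-ins so the example runs without HBase on the classpath, and the real HBaseStorage may parse its options differently.

```java
import java.util.Arrays;
import java.util.List;

public class NoWalOptionSketch {

    /** Hypothetical stand-in for an HBase Put; it only models the WAL flag. */
    static final class FakePut {
        private final String rowKey;
        private boolean writeToWal = true; // WAL stays on unless explicitly disabled

        FakePut(String rowKey) { this.rowKey = rowKey; }

        String getRowKey() { return rowKey; }
        void setWriteToWal(boolean write) { this.writeToWal = write; }
        boolean getWriteToWal() { return writeToWal; }
    }

    /** Returns true if the space-separated option string contains the -noWAL flag. */
    static boolean hasNoWal(String optString) {
        if (optString == null || optString.trim().isEmpty()) {
            return false;
        }
        List<String> tokens = Arrays.asList(optString.trim().split("\\s+"));
        return tokens.contains("-noWAL");
    }

    /** Builds a put for the row, skipping the WAL only when the caller asked for it. */
    static FakePut buildPut(String rowKey, String optString) {
        FakePut put = new FakePut(rowKey);
        if (hasNoWal(optString)) {
            put.setWriteToWal(false);
        }
        return put;
    }

    public static void main(String[] args) {
        // Second constructor argument of HBaseStorage in the STORE statement above.
        System.out.println(buildPut("row1", "-noWAL").getWriteToWal()); // false
        // No options given: the WAL is left enabled.
        System.out.println(buildPut("row1", "").getWriteToWal());       // true
    }
}
```

The default stays on the safe side: the WAL is bypassed only when the flag is given explicitly. Edits written without the WAL can be lost if a region server fails before they are flushed, so the option fits only jobs that can be re-run, such as the import job mentioned in the wiki quote.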