[jira] [Commented] (PIG-1825) ability to turn off the write ahead log for pig's HBaseStorage
[ https://issues.apache.org/jira/browse/PIG-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031012#comment-13031012 ]

Dmitriy V. Ryaboy commented on PIG-1825:

The patch is really straightforward, and the test doesn't actually test the patch, except to make sure the argument doesn't break parsing. WAL behavior is not actually verified.

Two things we can do here:
1) make a createPut() method in HBaseStorage, call it from putNext(), and in a test create our own HBaseStorage, call createPut(), and check that put.getWriteToWAL() returns the right value
2) ignore the trivial test.

Option 1 is the right thing to do; I can probably be convinced of option 2. As is, we shouldn't commit, since the test just adds extra time to the unit tests without doing much useful work.

ability to turn off the write ahead log for pig's HBaseStorage
--------------------------------------------------------------

Key: PIG-1825
URL: https://issues.apache.org/jira/browse/PIG-1825
Project: Pig
Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Corbin Hoenes
Priority: Minor
Attachments: HBaseStorage_noWAL.patch, PIG-1825_1.patch

Added an option to allow a caller of HBaseStorage to turn off the WriteAheadLog feature while doing bulk loads into HBase.

From the performance tuning wiki page: http://wiki.apache.org/hadoop/PerformanceTuning

    "To speed up the inserts in a non critical job (like an import job), you can use Put.writeToWAL(false) to bypass writing to the write ahead log."

We've tested this on HBase 0.20.6 and it helps dramatically.

The -noWAL option is passed in just like other options for HBaseStorage:

    STORE myalias INTO 'MyTable' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('mycolumnfamily:field1 mycolumnfamily:field2','-noWAL');

This would be my first patch, so please educate me on any steps I need to take.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
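Option 1 in Dmitriy's comment above (factor the Put construction out of putNext() so a test can inspect the WAL flag) could be sketched roughly as follows. This is a minimal, self-contained Java illustration, not the actual patch: PutStub and StorageSketch are hypothetical stand-ins for HBase's Put and Pig's HBaseStorage, and the varargs constructor is a simplification of the real option-string parsing.

```java
/** Hypothetical stand-in for HBase's Put; only the row key and WAL flag are modelled. */
final class PutStub {
    private final String rowKey;
    private boolean writeToWAL = true; // assumption: the WAL is on unless turned off

    PutStub(String rowKey) { this.rowKey = rowKey; }

    void setWriteToWAL(boolean write) { this.writeToWAL = write; }
    boolean getWriteToWAL() { return writeToWAL; }
    String getRowKey() { return rowKey; }
}

/** Hypothetical stand-in for the storage function under test. */
final class StorageSketch {
    private final boolean noWAL;

    /** Simplified option handling: looks for a literal "-noWAL" argument. */
    StorageSketch(String... options) {
        boolean found = false;
        for (String option : options) {
            if ("-noWAL".equals(option)) {
                found = true;
            }
        }
        this.noWAL = found;
    }

    /** Builds the Put in one place so a test can check its WAL flag directly. */
    PutStub createPut(String rowKey) {
        PutStub put = new PutStub(rowKey);
        put.setWriteToWAL(!noWAL);
        return put;
    }

    /** The write path delegates to createPut(); a real implementation would then hand the Put to the record writer. */
    PutStub putNext(String rowKey) {
        return createPut(rowKey);
    }
}

/** Small demo of the check a unit test would make. */
final class NoWalSketchDemo {
    public static void main(String[] args) {
        System.out.println(new StorageSketch().createPut("row1").getWriteToWAL());
        System.out.println(new StorageSketch("-noWAL").createPut("row1").getWriteToWAL());
    }
}
```

The point of the refactoring is that the test no longer needs a running HBase cluster to verify the option: it constructs the storage object with and without -noWAL and asserts on the flag of the Put that comes back.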
[jira] Commented: (PIG-1825) ability to turn off the write ahead log for pig's HBaseStorage
[ https://issues.apache.org/jira/browse/PIG-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990759#comment-12990759 ]

Alan Gates commented on PIG-1825:

Unit tests pass. The output of test-patch:

[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no tests are needed for this patch.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

As this points out, the functionality isn't tested. Before we can check it in we'll need a test added to the hbase unit tests that shows that you can write to hbase with this option set.

ability to turn off the write ahead log for pig's HBaseStorage
--------------------------------------------------------------

Key: PIG-1825
URL: https://issues.apache.org/jira/browse/PIG-1825
Project: Pig
Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Corbin Hoenes
Priority: Minor
Fix For: 0.8.0
Attachments: HBaseStorage_noWAL.patch
[jira] Commented: (PIG-1825) ability to turn off the write ahead log for pig's HBaseStorage
[ https://issues.apache.org/jira/browse/PIG-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989330#comment-12989330 ]

Alan Gates commented on PIG-1825:

Dmitriy, is this something we should check in? You seemed to indicate that this was no longer necessary after we moved to HBase 0.89 or above.
[jira] Commented: (PIG-1825) ability to turn off the write ahead log for pig's HBaseStorage
[ https://issues.apache.org/jira/browse/PIG-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989436#comment-12989436 ]

Dmitriy V. Ryaboy commented on PIG-1825:

Sounds fine to me (though I haven't read the patch yet). HBase 0.90 has significant speed improvements but I imagine it still writes a WAL and you can still turn it off.