[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885287#comment-13885287 ]

Hudson commented on HBASE-10416:

SUCCESS: Integrated in HBase-0.98 #113 (See [https://builds.apache.org/job/HBase-0.98/113/])
HBASE-10416 Improvements to the import flow (Vasu Mariyala) (tedyu: rev 1562342)
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
* /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java

Improvements to the import flow
---
Key: HBASE-10416
URL: https://issues.apache.org/jira/browse/HBASE-10416
Project: HBase
Issue Type: New Feature
Components: mapreduce
Reporter: Vasu Mariyala
Assignee: Vasu Mariyala
Fix For: 0.98.0, 0.99.0
Attachments: HBASE-10416-rev1.patch, HBASE-10416.patch

The following improvements can be made to the Import logic:
a) Make the import extensible (i.e., stop holding the filter as a static member of Import and make it an instance variable of the mapper; make the mappers or variables of interest protected).
b) Make sure that Import calls the filterRowKey method of the filter (useful if we want to filter the data of an organization based on the row key, or when using filters like PrefixFilter, which filter the data in the filterRowKey method rather than the filterKeyValue method). The existing test case TestImportExport#testWithFilter relies on this assumption but has passed so far only because a single row is inserted into the table.
c) Provide an option to specify the durability during the import (specifying the Durability as SKIP_WAL would improve the performance of a restore considerably). [~lhofhansl] suggested that this should be a parameter to the import.
d) Some minor refactoring to avoid building a comma-separated string for the filter args.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
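Point b) can be illustrated with a small, self-contained Java sketch. Everything in it (SketchFilter, SketchPrefixFilter, ImportFilterSketch and the checkRowKey flag) is a hypothetical stand-in written for illustration; it is not the HBase Filter API and not the code in the attached patches. It only shows why an import loop that skips the row-key check lets every row through when the filter makes its decision in filterRowKey.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a filter; NOT the HBase Filter interface.
interface SketchFilter {
    // Returns true when the whole row should be dropped, judged by its key.
    boolean filterRowKey(byte[] rowKey);

    // Returns true when an individual cell should be dropped.
    boolean filterCell(byte[] rowKey, String value);
}

// Hypothetical prefix filter that decides only in filterRowKey.
final class SketchPrefixFilter implements SketchFilter {
    private final byte[] prefix;

    SketchPrefixFilter(String prefix) {
        this.prefix = prefix.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public boolean filterRowKey(byte[] rowKey) {
        if (rowKey.length < prefix.length) {
            return true; // key too short to carry the prefix
        }
        for (int i = 0; i < prefix.length; i++) {
            if (rowKey[i] != prefix[i]) {
                return true; // prefix mismatch, drop the row
            }
        }
        return false;
    }

    @Override
    public boolean filterCell(byte[] rowKey, String value) {
        return false; // simplification: the cell-level check never drops anything
    }
}

final class ImportFilterSketch {
    // Simulates the import loop over rows that each hold one placeholder cell.
    // checkRowKey toggles whether the loop consults filterRowKey at all.
    static List<String> importRows(List<String> rowKeys, SketchFilter filter, boolean checkRowKey) {
        List<String> kept = new ArrayList<>();
        for (String key : rowKeys) {
            byte[] raw = key.getBytes(StandardCharsets.UTF_8);
            if (checkRowKey && filter.filterRowKey(raw)) {
                continue; // row rejected by its key
            }
            if (filter.filterCell(raw, "v")) {
                continue; // cell rejected
            }
            kept.add(key);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("org1-a", "org2-b", "org1-c");
        SketchFilter filter = new SketchPrefixFilter("org1-");
        System.out.println(importRows(rows, filter, true));  // [org1-a, org1-c]
        System.out.println(importRows(rows, filter, false)); // [org1-a, org2-b, org1-c]
    }
}
```

With a single matching row in the input, both calls return the same list, which is consistent with the remark that a one-row test cannot expose the missing row-key check.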
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885456#comment-13885456 ]

Nick Dimiduk commented on HBASE-10416:

[~vasu.mariy...@gmail.com] let's take up the conversation of the other tickets elsewhere.
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885543#comment-13885543 ]

Hudson commented on HBASE-10416:

FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #69 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/69/])
HBASE-10416 Improvements to the import flow (Vasu Mariyala) (tedyu: rev 1562343)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884491#comment-13884491 ]

Vasu Mariyala commented on HBASE-10416:

Sorry for the delay, and thanks for the review comments, [~yuzhih...@gmail.com].
1. I felt that constructing a filter object from the filter class and filter args would be a utility method, and that it would be useful when extending the import utility for specific customizations.
2. Fixed the long line and the javadoc warnings.
3. Updated the release note description in the jira.

[~ndimiduk] I saw your other issues related to making things like the mapper or reducer configurable and reusing the code. Would you mind discussing these issues when you are free? You can ping me on gmail.
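The utility idea in item 1 and the refactoring in d) can be sketched as follows. This is an illustration under stated assumptions, not the patch: SketchArgsFilter and FilterFactorySketch are hypothetical names, and constructor-based reflection is simply one way to build an object from a class name plus a list of arguments. The point it tries to show is that passing the arguments as a list keeps an argument containing a comma intact, whereas a comma-joined string cannot be split back reliably.

```java
import java.lang.reflect.Constructor;
import java.util.Arrays;
import java.util.List;

// Hypothetical filter type for this sketch only; not an HBase class.
class SketchArgsFilter {
    final List<String> args;

    public SketchArgsFilter(List<String> args) {
        this.args = args;
    }
}

final class FilterFactorySketch {
    // Builds an instance of the named class by calling its (List) constructor.
    // Reflection failures are rethrown as an unchecked exception.
    static Object instantiate(String className, List<String> filterArgs) {
        try {
            Class<?> cls = Class.forName(className);
            Constructor<?> ctor = cls.getDeclaredConstructor(List.class);
            return ctor.newInstance(filterArgs);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot create filter " + className, e);
        }
    }

    public static void main(String[] unused) {
        // The first argument deliberately contains a comma.
        List<String> filterArgs = Arrays.asList("org1,eu", "cf:q");
        SketchArgsFilter filter =
            (SketchArgsFilter) instantiate(SketchArgsFilter.class.getName(), filterArgs);

        // The list keeps two arguments.
        System.out.println(filter.args.size()); // 2

        // Joining with commas and splitting again yields three pieces.
        System.out.println(String.join(",", filterArgs).split(",").length); // 3
    }
}
```

Keeping such a helper accessible to subclasses or other tools is the kind of reuse the comment describes; whether it should be public is the question raised in review.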
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884565#comment-13884565 ]

Ted Yu commented on HBASE-10416:

lgtm
[~apurtell]: Do you want this in 0.98 ?
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884811#comment-13884811 ]

Hadoop QA commented on HBASE-10416:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12625635/HBASE-10416-rev1.patch
against trunk revision .
ATTACHMENT ID: 12625635

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.util.TestHBaseFsck

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8544//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8544//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8544//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8544//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8544//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8544//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8544//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8544//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8544//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8544//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8544//console

This message is automatically generated.
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884915#comment-13884915 ]

Andrew Purtell commented on HBASE-10416:

+1 for 0.98, thanks Ted.
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884998#comment-13884998 ]

Hudson commented on HBASE-10416:

SUCCESS: Integrated in HBase-TRUNK #4863 (See [https://builds.apache.org/job/HBase-TRUNK/4863/])
HBASE-10416 Improvements to the import flow (tedyu: rev 1562343)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885051#comment-13885051 ]

Hudson commented on HBASE-10416:

SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #104 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/104/])
HBASE-10416 Improvements to the import flow (Vasu Mariyala) (tedyu: rev 1562342)
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
* /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881799#comment-13881799 ]

Hadoop QA commented on HBASE-10416:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12625121/HBASE-10416.patch
against trunk revision .
ATTACHMENT ID: 12625121

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 3 warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:red}-1 site{color}. The patch appears to cause mvn site goal to fail.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8504//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8504//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8504//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8504//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8504//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8504//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8504//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8504//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8504//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8504//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8504//console

This message is automatically generated.
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881496#comment-13881496 ]

Lars Hofhansl commented on HBASE-10416:

looks good to me. Thanks Vasu.
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881508#comment-13881508 ]

Ted Yu commented on HBASE-10416:

Nit:
{code}
-  private static Filter instantiateFilter(Configuration conf) {
-    // get the filter, if it was configured
+  public static Filter instantiateFilter(Configuration conf) {
{code}
Does instantiateFilter() need to be public?
{code}
+  public static void flushRegionsIfNecessary(Configuration conf) throws IOException, InterruptedException {
{code}
Long line above.

Mind adding release notes for the new parameters?
[jira] [Commented] (HBASE-10416) Improvements to the import flow
[ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881516#comment-13881516 ]

Nick Dimiduk commented on HBASE-10416:

Good work [~vasu.mariy...@gmail.com]. Have you seen my notes on HBASE-8074 and HBASE-8115 ?