[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1355#comment-1355 ]

Hudson commented on HBASE-7342:
---

Integrated in HBase-0.94-security-on-Hadoop-23 #10 (See [https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/10/])
    HBASE-7342 Split operation without split key incorrectly finds the middle key in off-by-one error (Aleksandr Shulman) (Revision 1423112)

     Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java

> Split operation without split key incorrectly finds the middle key in off-by-one error
> --
>
>                 Key: HBASE-7342
>                 URL: https://issues.apache.org/jira/browse/HBASE-7342
>             Project: HBase
>          Issue Type: Bug
>          Components: HFile, io
>    Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
>            Reporter: Aleksandr Shulman
>            Assignee: Aleksandr Shulman
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.4
>
>         Attachments: 7342-0.94.txt, 7342-trunk-v3.txt, HBASE-7342-v1.patch, HBASE-7342-v2.patch
>
>
> I took a deeper look into issues I was having using region splitting when specifying a region (but not a key for splitting).
> The midkey calculation is off by one and, when there are 2 rows, will pick the 0th one. This causes the firstkey to be the same as the midkey, and the split will fail. Removing the -1 causes it to work correctly, as per the test I've added.
> Looking into the code, here is what goes on:
> 1. Split takes the largest storefile.
> 2. It puts all the keys into a 2-dimensional array called blockKeys[][]. Key i resides at blockKeys[i].
> 3. Getting the middle root-level index should yield the key in the middle of the storefile.
> 4. In step 3, we see that there is a possibly erroneous (-1) to adjust for the 0-offset indexing.
> 5. In a case where there are only 2 blockKeys, this yields the 0th block key.
> 6. Unfortunately, this is the same block key that 'firstKey' will be.
> 7. This yields the result in HStore.java:1873 ("cannot split because midkey is the same as first or last row").
> 8. Removing the -1 solves the problem (in this case).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
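The arithmetic behind steps 4 to 6 of the report can be illustrated with a small, self-contained sketch. It is not the HBase source: the class and method names are hypothetical, and the `(keyCount - 1) / 2` form of the pre-patch index is an assumption inferred from the report's wording ("removing the -1"). It only shows why, with two block keys, integer division lands on index 0, which is also the first key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: not the HBase source. All names here are hypothetical. */
final class MidKeySketch {

    /** Assumed pre-patch form: the extra -1 biases the index toward the start. */
    static int midIndexWithMinusOne(int keyCount) {
        return (keyCount - 1) / 2;
    }

    /** Form after "removing the -1", as the report describes the fix. */
    static int midIndexWithoutMinusOne(int keyCount) {
        return keyCount / 2;
    }

    public static void main(String[] args) {
        // Two block keys, the case described in the report.
        byte[][] blockKeys = {
            "row-a".getBytes(StandardCharsets.UTF_8),
            "row-b".getBytes(StandardCharsets.UTF_8),
        };
        byte[] firstKey = blockKeys[0];

        byte[] before = blockKeys[midIndexWithMinusOne(blockKeys.length)];
        byte[] after = blockKeys[midIndexWithoutMinusOne(blockKeys.length)];

        // (2 - 1) / 2 == 0 in integer arithmetic, so the midkey equals the first key.
        System.out.println("with -1, midkey == firstKey: " + Arrays.equals(before, firstKey));
        // 2 / 2 == 1, so the midkey is the second block key.
        System.out.println("without -1, midkey == firstKey: " + Arrays.equals(after, firstKey));
    }
}
```

For an odd number of keys the two forms agree (for example, both give index 1 for three keys), which may be why the problem only surfaced for small, even counts such as 2.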
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537915#comment-13537915 ]

Hudson commented on HBASE-7342:
---

Integrated in HBase-0.94-security #87 (See [https://builds.apache.org/job/HBase-0.94-security/87/])
    HBASE-7342 Split operation without split key incorrectly finds the middle key in off-by-one error (Aleksandr Shulman) (Revision 1423112)

     Result = SUCCESS
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534576#comment-13534576 ]

Hudson commented on HBASE-7342:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #300 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/300/])
    HBASE-7342 Split operation without split key incorrectly finds the middle key in off-by-one error (Aleksandr Shulman) (Revision 1423110)

     Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534466#comment-13534466 ]

Hudson commented on HBASE-7342:
---

Integrated in HBase-TRUNK #3629 (See [https://builds.apache.org/job/HBase-TRUNK/3629/])
    HBASE-7342 Split operation without split key incorrectly finds the middle key in off-by-one error (Aleksandr Shulman) (Revision 1423110)

     Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534424#comment-13534424 ]

Hudson commented on HBASE-7342:
---

Integrated in HBase-0.94 #635 (See [https://builds.apache.org/job/HBase-0.94/635/])
    HBASE-7342 Split operation without split key incorrectly finds the middle key in off-by-one error (Aleksandr Shulman) (Revision 1423112)

     Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534225#comment-13534225 ]

Ted Yu commented on HBASE-7342:
---

I forgot to mention that I ran the tests reported by Hadoop QA below and didn't see any hanging tests:
{code}
BEGIN zombies jstack extract
  at org.apache.hadoop.hbase.util.TestHBaseFsck.testFixByTable(TestHBaseFsck.java:1188)
  at org.apache.hadoop.hbase.util.TestHBaseFsck.testLingeringSplitParent(TestHBaseFsck.java:1262)
  at org.apache.hadoop.hbase.catalog.TestCatalogTracker.testServerNotRunningIOException(TestCatalogTracker.java:250)
END zombies jstack extract
{code}
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534220#comment-13534220 ]

Ted Yu commented on HBASE-7342:
---

Integrated to trunk and 0.94 without the test

Thanks for the patch, Alex.

Thanks for the reviews, Stack, Ram and Lars
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534194#comment-13534194 ]

Aleksandr Shulman commented on HBASE-7342:
---

No worries. I also agree that it's not necessary.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534191#comment-13534191 ]

stack commented on HBASE-7342:
---

We should not commit the test for this patch. Spinning up a cluster to check a plain array math problem is over the top (but thanks for making the test, Aleksandr ...)
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533448#comment-13533448 ]

Hadoop QA commented on HBASE-7342:
---

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12561192/7342-trunk-v3.txt
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.

    {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.

    {color:green}+1 javadoc{color}. The javadoc tool did not generate any additional warning messages.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 findbugs{color}. The patch appears to introduce 26 new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests:

    {color:red}-1 core zombie tests{color}. There are zombie tests. See build logs for details.

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3566//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3566//console

This message is automatically generated.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533280#comment-13533280 ]

Lars Hofhansl commented on HBASE-7342:
---

Cool. +1 for both trunk and 0.94
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533278#comment-13533278 ]

ramkrishna.s.vasudevan commented on HBASE-7342:
---

Yes Lars, you are right. The midkey calculation does not find an exact middle inside a block. That is why this happens. So if there is only one block, we will still get the first KV in that block.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533274#comment-13533274 ]

Lars Hofhansl commented on HBASE-7342:
--------------------------------------

Just so I understand...
Since the code picks the midkey as the row key of the first KeyValue of the mid block, it should never pick the first block, because that would by definition be the same as the first key of the file (unless there is only one block, in which case it should not split anyway).
But it's fine to pick the last block, because the first key in that block is likely not the last key of the file (unless there's only a single KeyValue in that last block).

Did I understand this correctly?
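The reading in the comment above can be illustrated with a rough sketch (not code from HBase). If the midkey is taken as the first key of a chosen block, then choosing block 0 always reproduces the file's first key, while choosing the last block usually does not reproduce the file's last key. The class name, the `firstKeyOfBlock` helper, and the sample row keys are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: each block is modelled as a sorted list of row keys.
final class BlockFirstKeySketch {
    /** Returns the first row key stored in the given block. */
    static String firstKeyOfBlock(List<List<String>> blocks, int blockIndex) {
        return blocks.get(blockIndex).get(0);
    }

    public static void main(String[] args) {
        List<List<String>> blocks = Arrays.asList(
            Arrays.asList("row-a", "row-b"),
            Arrays.asList("row-c", "row-d"));
        String firstKey = blocks.get(0).get(0);
        List<String> lastBlock = blocks.get(blocks.size() - 1);
        String lastKey = lastBlock.get(lastBlock.size() - 1);

        // Block 0 as the mid block: the candidate equals firstKey, so no split is possible.
        System.out.println(firstKeyOfBlock(blocks, 0).equals(firstKey));

        // Last block as the mid block: the candidate differs from both ends in this data.
        String candidate = firstKeyOfBlock(blocks, 1);
        System.out.println(!candidate.equals(firstKey) && !candidate.equals(lastKey));
    }
}
```

If the last block held only a single key, the candidate would equal the last key, which is the exception noted in the comment.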
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533198#comment-13533198 ]

stack commented on HBASE-7342:
------------------------------

[~ram_krish] Was leaving it open a while in case you remembered your workaround. Will commit Monday unless objection or alternative proposed.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533107#comment-13533107 ]

ramkrishna.s.vasudevan commented on HBASE-7342:
-----------------------------------------------

So are we going to commit this issue?
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532043#comment-13532043 ]

stack commented on HBASE-7342:
------------------------------

[~ram_krish] No problem boss. That's interesting that you saw this below. Would be good to know what you fellas decided. Good on you.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531947#comment-13531947 ]

ramkrishna.s.vasudevan commented on HBASE-7342:
-----------------------------------------------

@Stack
Sorry Stack. I did not mean to -1 it. Actually, we once faced exactly the same issue and decided on something which I don't remember now :(
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531309#comment-13531309 ]

stack commented on HBASE-7342:
------------------------------

bq. I'll go back and refine the test case. We can add it later. The hard part is getting blockKeys to be a size-2 array. Any suggestions?

You could just make a test apart from splits. Conjure arrays of 0, 1, 2, and 10 elements. Verify that when you divide by two you get a 'midpoint' that makes sense for hbase. But such a test verges on the silly, I would argue.

Let's just commit your fix unless someone comes up w/ a reason for why the -1 was there in the first place.
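A standalone check along the lines suggested above might look like the following sketch. It assumes that "removing the -1" leaves an index of `count / 2`, and it treats an index as usable when it is in range and is not the first entry; both the helper names and that usability criterion are assumptions made for illustration, not HBase code.

```java
// Hypothetical sketch of a midpoint check that needs no cluster.
final class MidpointCheckSketch {
    /** Midpoint index with no -1 adjustment (assumed form of the fix). */
    static int midIndex(int count) {
        return count / 2;
    }

    /** True when the index is inside the array and is not the first entry. */
    static boolean isUsableSplitIndex(int index, int count) {
        return index > 0 && index < count;
    }

    public static void main(String[] args) {
        int[] sizes = {0, 1, 2, 10};
        for (int size : sizes) {
            int index = midIndex(size);
            System.out.println("size=" + size
                + " index=" + index
                + " usable=" + isUsableSplitIndex(index, size));
        }
    }
}
```

Under these assumptions, sizes 0 and 1 yield no usable index, which matches the expectation that such a file should not be split at all, while sizes 2 and 10 yield indices 1 and 5.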
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531300#comment-13531300 ]

Aleksandr Shulman commented on HBASE-7342:
------------------------------------------

I'll go back and refine the test case. We can add it later. The hard part is getting blockKeys to be a size-2 array. Any suggestions?
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531284#comment-13531284 ]

stack commented on HBASE-7342:
------------------------------

[~ted_yu] The problem is plain. Look at it. Imagine an array w/ two elements in it only.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531275#comment-13531275 ]

stack commented on HBASE-7342:
------------------------------

+1 on fix.

I'd not add the test. It's overkill running a cluster to test an array math problem.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531272#comment-13531272 ]

Ted Yu commented on HBASE-7342:
-------------------------------

I extracted testBasicSplit from patch v2 and it passed.
{code}
Running org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
2012-12-13 10:17:06.866 java[69043:1903] Unable to load realm mapping info from SCDynamicStore
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 29.711 sec
{code}
@Aleksandr:
Can you refine your test case to show us the problem?

Thanks
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531254#comment-13531254 ]

Ted Yu commented on HBASE-7342:
-------------------------------

Since the fix is for array size being 2, maybe add a check for this case and don't subtract 1.
Otherwise keep the current logic.
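The alternative proposed in the comment above (special-casing a size of 2 and otherwise keeping the existing subtraction) could be sketched as follows. This is only an illustration of the suggestion, with a hypothetical class and method name; it is not the change attached to this issue, which removes the -1 outright.

```java
// Hypothetical sketch of the special-case variant suggested in the comment.
final class SpecialCaseMidIndexSketch {
    /** Keeps the (count - 1) / 2 logic except when there are exactly two entries. */
    static int midIndex(int count) {
        if (count == 2) {
            return 1; // do not subtract 1, so the result is not the first entry
        }
        return (count - 1) / 2;
    }

    public static void main(String[] args) {
        for (int count : new int[] {2, 3, 10}) {
            System.out.println("count=" + count + " index=" + midIndex(count));
        }
    }
}
```

One consequence worth noting: for even counts above 2 this variant picks a different index than simply dropping the -1 (for 10 entries, 4 rather than 5).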
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531247#comment-13531247 ]

Aleksandr Shulman commented on HBASE-7342:
------------------------------------------

Hi Ramkrishna,

The logic for the change is as follows:

With the existing implementation (using -1), when there are two items in the array, it returns the 0th item ((2 - 1) / 2 = 0), which is equal to the index of the firstKey. This is a problem during splits because a split is invalid if the midkey is equal to the firstKey.

What we really want here is for the index to be 1. This is because the lastKey is going to be the first key in the next block, so there won't be a collision with it, and the midkey will really represent the mid of first and last.
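The arithmetic in the explanation above can be checked with a small sketch. The method names are hypothetical and the two formulas are simply the "with -1" and "without -1" forms described in this issue, evaluated with Java integer division; the actual HBase code is not reproduced here.

```java
// Hypothetical sketch comparing the two index formulas discussed in this issue.
final class MidKeyIndexSketch {
    /** Index chosen when 1 is subtracted before halving. */
    static int indexWithMinusOne(int blockKeyCount) {
        return (blockKeyCount - 1) / 2;
    }

    /** Index chosen once the -1 is removed. */
    static int indexWithoutMinusOne(int blockKeyCount) {
        return blockKeyCount / 2;
    }

    public static void main(String[] args) {
        int blockKeyCount = 2;
        // (2 - 1) / 2 truncates to 0, the same index as the first key.
        System.out.println("with -1:    " + indexWithMinusOne(blockKeyCount));
        // 2 / 2 is 1, the second block key.
        System.out.println("without -1: " + indexWithoutMinusOne(blockKeyCount));
    }
}
```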
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531162#comment-13531162 ]

ramkrishna.s.vasudevan commented on HBASE-7342:
-----------------------------------------------

It tries to find the mid from the blockkey indices, right? As the offset starts from 0, this should be fine, right?
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530590#comment-13530590 ]

Hadoop QA commented on HBASE-7342:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12560695/HBASE-7342-v2.patch
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.

    {color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3502//console

This message is automatically generated.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530571#comment-13530571 ]

Aleksandr Shulman commented on HBASE-7342:
------------------------------------------

Looks like my trunk branch is a little out of date and some things were deprecated. The next patch should integrate cleanly. I noted your suggestions in my change.

> Split operation without split key incorrectly finds the middle key in off-by-one error
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-7342
>                 URL: https://issues.apache.org/jira/browse/HBASE-7342
>             Project: HBase
>          Issue Type: Bug
>          Components: HFile, io
>    Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
>            Reporter: Aleksandr Shulman
>            Assignee: Aleksandr Shulman
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.4
>
>         Attachments: HBASE-7342-v1.patch
>
>
> I took a deeper look into issues I was having using region splitting when
> specifying a region (but not a key for splitting).
> The midkey calculation is off by one and, when there are 2 rows, will pick the
> 0th one. This causes the firstkey to be the same as midkey and the split will
> fail. Removing the -1 causes it to work correctly, as per the test I've added.
> Looking into the code, here is what goes on:
> 1. Split takes the largest storefile
> 2. It puts all the keys into a 2-dimensional array called blockKeys[][]. Key
> i resides at blockKeys[i]
> 3. Getting the middle root-level index should yield the key in the middle of
> the storefile
> 4. In step 3, we see that there is a possibly erroneous (-1) to adjust for
> the 0-offset indexing.
> 5. In a case where there are only 2 blockKeys, this yields the 0th
> block key.
> 6. Unfortunately, this is the same block key that 'firstKey' will be.
> 7. This yields the result in HStore.java:1873 ("cannot split because midkey
> is the same as first or last row")
> 8. Removing the -1 solves the problem (in this case).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
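The off-by-one described in the issue summary can be illustrated with a small standalone sketch. This is not the HFileBlockIndex code: the class and method names are hypothetical, and the index expressions `(count - 1) / 2` and `count / 2` are an assumption about what "removing the -1" refers to. It only shows why two block keys make the midkey collide with the first key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; names do not come from HBase.
class MidKeySketch {

    // Assumed shape of the calculation before the fix: subtract 1, then halve.
    static byte[] midKeyWithMinusOne(byte[][] blockKeys) {
        return blockKeys[(blockKeys.length - 1) / 2];
    }

    // Assumed shape after the fix: halve the count directly.
    static byte[] midKeyWithoutMinusOne(byte[][] blockKeys) {
        return blockKeys[blockKeys.length / 2];
    }

    // Mirrors the check quoted in the issue: a split is refused when
    // the midkey equals the first key.
    static boolean canSplit(byte[] firstKey, byte[] midKey) {
        return !Arrays.equals(firstKey, midKey);
    }

    public static void main(String[] args) {
        byte[][] blockKeys = {
            "row1".getBytes(StandardCharsets.UTF_8),
            "row2".getBytes(StandardCharsets.UTF_8)
        };
        byte[] firstKey = blockKeys[0];

        // With two keys, (2 - 1) / 2 is index 0, i.e. the first key itself.
        System.out.println("with -1:    canSplit = "
            + canSplit(firstKey, midKeyWithMinusOne(blockKeys)));

        // With two keys, 2 / 2 is index 1, i.e. the second key.
        System.out.println("without -1: canSplit = "
            + canSplit(firstKey, midKeyWithoutMinusOne(blockKeys)));
    }
}
```

For larger key counts the two expressions differ only when the count is even, which matches the report's caveat that removing the -1 solves the problem "in this case".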
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530478#comment-13530478 ]

Aleksandr Shulman commented on HBASE-7342:
------------------------------------------

Noted... let me take a look.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530472#comment-13530472 ]

Ted Yu commented on HBASE-7342:
-------------------------------

There are compilation errors:
{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hbase-server: Compilation failure: Compilation failure:
[ERROR] /Users/zhihyu/trunk-hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java:[40,30] cannot find symbol
[ERROR] symbol : class HServerAddress
[ERROR] location: package org.apache.hadoop.hbase
[ERROR] /Users/zhihyu/trunk-hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java:[763,21] cannot find symbol
[ERROR] symbol : class HServerAddress
[ERROR] location: class org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
[ERROR] /Users/zhihyu/trunk-hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java:[763,48] cannot find symbol
[ERROR] symbol : method getRegionsInfo()
[ERROR] location: class org.apache.hadoop.hbase.client.HTable
[ERROR] /Users/zhihyu/trunk-hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java:[772,17] cannot find symbol
[ERROR] symbol : method getRegionsInfo()
[ERROR] location: class org.apache.hadoop.hbase.client.HTable
{code}
HServerAddress is replaced by ServerName in trunk.
getRegionsInfo() is replaced by getRegionLocations().
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530463#comment-13530463 ]

Ted Yu commented on HBASE-7342:
-------------------------------

{code}
+System.out.println("Original table has: " + loadedTableCount + " rows");
{code}
Please use the LOG variable for the above.
{code}
+ Thread.currentThread();
{code}
Does the above statement have any effect?
{code}
+Thread.sleep(1000);
{code}
Can the sleep duration be shorter?
{code}
+ } catch (InterruptedException e) {
+e.printStackTrace();
{code}
Throw InterruptedIOException from the catch block.
{code}
+return;
+
{code}
nit: remove the empty line.
{code}
+throw new Exception("Split did not increase the number of regions");
{code}
nit: use fail().
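Several of the review suggestions on the test's wait logic (a shorter sleep, throwing InterruptedIOException from the catch block, failing instead of throwing a generic Exception) could be combined roughly as below. This is a sketch under stated assumptions, not the patch itself: the class, method, and parameter names are hypothetical, the region count is passed in as a plain supplier rather than read from an HBase client, and AssertionError stands in for JUnit's fail() so the example has no test-framework dependency.

```java
import java.io.InterruptedIOException;
import java.util.function.IntSupplier;

// Hypothetical helper; none of these names come from the HBase patch.
class SplitWaitSketch {

    // Polls regionCount until it reaches expectedRegions, sleeping pollMillis
    // between checks, for at most maxRetries retries after the first check.
    static void waitForRegionCount(IntSupplier regionCount, int expectedRegions,
                                   long pollMillis, int maxRetries)
            throws InterruptedIOException {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (regionCount.getAsInt() >= expectedRegions) {
                return;
            }
            if (attempt < maxRetries) {
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and surface the interruption
                    // as an IOException subtype instead of printing a stack trace.
                    Thread.currentThread().interrupt();
                    InterruptedIOException iioe =
                        new InterruptedIOException("Interrupted while waiting for split");
                    iioe.initCause(e);
                    throw iioe;
                }
            }
        }
        // Stand-in for JUnit's fail(); fail() also throws an AssertionError.
        throw new AssertionError("Split did not increase the number of regions");
    }
}
```

Polling with a short interval and a bounded retry count keeps the fast path quick while still giving a slow split time to finish.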