[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2013-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1355#comment-1355
 ] 

Hudson commented on HBASE-7342:
---

Integrated in HBase-0.94-security-on-Hadoop-23 #10 (See 
[https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/10/])
HBASE-7342 Split operation without split key incorrectly finds the middle 
key in off-by-one error (Aleksandr Shulman) (Revision 1423112)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java


> Split operation without split key incorrectly finds the middle key in 
> off-by-one error
> --
>
> Key: HBASE-7342
> URL: https://issues.apache.org/jira/browse/HBASE-7342
> Project: HBase
> Issue Type: Bug
> Components: HFile, io
> Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
> Reporter: Aleksandr Shulman
> Assignee: Aleksandr Shulman
> Priority: Minor
> Fix For: 0.96.0, 0.94.4
>
> Attachments: 7342-0.94.txt, 7342-trunk-v3.txt, HBASE-7342-v1.patch, 
> HBASE-7342-v2.patch
>
>
> I took a deeper look into the issues I was having with region splitting when
> specifying a region (but not a key for splitting).
> The midkey calculation is off by one: when there are 2 rows, it picks the
> 0th one. This makes the midkey the same as the first key, so the split
> fails. Removing the -1 makes it work correctly, as per the test I've added.
> Looking into the code, here is what goes on:
> 1. Split takes the largest storefile.
> 2. It puts all the keys into a 2-dimensional array called blockKeys[][]. Key
> i resides at blockKeys[i].
> 3. Getting the middle root-level index should yield the key in the middle of
> the storefile.
> 4. In step 3, there is a possibly erroneous (-1) that adjusts for
> 0-based indexing.
> 5. When there are only 2 blockKeys, this yields the 0th
> block key.
> 6. Unfortunately, this is the same block key that 'firstKey' will be.
> 7. This produces the message at HStore.java:1873 ("cannot split because midkey
> is the same as first or last row").
> 8. Removing the -1 solves the problem (in this case). 
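
The arithmetic in steps 3 to 8 can be sketched in a few lines of Java. This is only an illustration, not the HBase source: the class name, the method names and the sample row keys are hypothetical, and the shape of the index expression, `(n - 1) / 2` before the change and `n / 2` after it, is an assumption based on the description's "removing the -1".

```java
class MidKeyIndexSketch {
    // Assumed shape of the calculation before the fix: subtract 1, then halve.
    static int midIndexWithMinusOne(int rootCount) {
        return (rootCount - 1) / 2;
    }

    // Assumed shape after the fix: the -1 is removed.
    static int midIndexWithoutMinusOne(int rootCount) {
        return rootCount / 2;
    }

    public static void main(String[] args) {
        // Hypothetical root-level index holding one key per block, 2 blocks in total.
        String[] blockKeys = {"row-a", "row-m"};
        String firstKey = blockKeys[0];

        String before = blockKeys[midIndexWithMinusOne(blockKeys.length)];
        String after = blockKeys[midIndexWithoutMinusOne(blockKeys.length)];

        // Integer division truncates, so (2 - 1) / 2 is 0 while 2 / 2 is 1.
        System.out.println("with -1:    midkey=" + before
            + " sameAsFirst=" + before.equals(firstKey));
        System.out.println("without -1: midkey=" + after
            + " sameAsFirst=" + after.equals(firstKey));
    }
}
```

With two entries, the first expression selects index 0, so the midkey equals the first key and the split is refused. The second expression selects index 1.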

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537915#comment-13537915
 ] 

Hudson commented on HBASE-7342:
---

Integrated in HBase-0.94-security #87 (See 
[https://builds.apache.org/job/HBase-0.94-security/87/])
HBASE-7342 Split operation without split key incorrectly finds the middle 
key in off-by-one error (Aleksandr Shulman) (Revision 1423112)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java




[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534576#comment-13534576
 ] 

Hudson commented on HBASE-7342:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #300 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/300/])
HBASE-7342 Split operation without split key incorrectly finds the middle 
key in off-by-one error (Aleksandr Shulman) (Revision 1423110)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java




[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534466#comment-13534466
 ] 

Hudson commented on HBASE-7342:
---

Integrated in HBase-TRUNK #3629 (See 
[https://builds.apache.org/job/HBase-TRUNK/3629/])
HBASE-7342 Split operation without split key incorrectly finds the middle 
key in off-by-one error (Aleksandr Shulman) (Revision 1423110)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java




[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534424#comment-13534424
 ] 

Hudson commented on HBASE-7342:
---

Integrated in HBase-0.94 #635 (See 
[https://builds.apache.org/job/HBase-0.94/635/])
HBASE-7342 Split operation without split key incorrectly finds the middle 
key in off-by-one error (Aleksandr Shulman) (Revision 1423112)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java




[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534225#comment-13534225
 ] 

Ted Yu commented on HBASE-7342:
---

I forgot to mention that I ran the tests reported by Hadoop QA below and didn't 
see any hanging tests:
{code}
 BEGIN zombies jstack extract
at 
org.apache.hadoop.hbase.util.TestHBaseFsck.testFixByTable(TestHBaseFsck.java:1188)
at 
org.apache.hadoop.hbase.util.TestHBaseFsck.testLingeringSplitParent(TestHBaseFsck.java:1262)
at 
org.apache.hadoop.hbase.catalog.TestCatalogTracker.testServerNotRunningIOException(TestCatalogTracker.java:250)
 END  zombies jstack extract
{code}



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534220#comment-13534220
 ] 

Ted Yu commented on HBASE-7342:
---

Integrated to trunk and 0.94 without the test

Thanks for the patch, Alex.

Thanks for the reviews, Stack, Ram and Lars



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-17 Thread Aleksandr Shulman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534194#comment-13534194
 ] 

Aleksandr Shulman commented on HBASE-7342:
--

No worries. I also agree that it's not necessary.



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534191#comment-13534191
 ] 

stack commented on HBASE-7342:
--

We should not commit the test for this patch.  Spinning up a cluster to check a 
plain array math problem is over the top (but thanks for making the test, 
Aleksandr ...)
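
For comparison, the array math can be checked without a cluster. The following is a minimal sketch and not the test from the patch: `MidIndexCheck`, `midIndex` and `neverPicksFirst` are hypothetical names, and the expression `rootCount / 2` is an assumption about what the calculation looks like once the -1 is removed.

```java
class MidIndexCheck {
    // Hypothetical helper mirroring the index arithmetic under discussion
    // (assumed to be rootCount / 2 once the -1 is removed).
    static int midIndex(int rootCount) {
        return rootCount / 2;
    }

    // Returns true if, for every count in [2, maxCount], the chosen index
    // is within bounds and is not the first entry.
    static boolean neverPicksFirst(int maxCount) {
        for (int n = 2; n <= maxCount; n++) {
            int i = midIndex(n);
            if (i <= 0 || i >= n) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(neverPicksFirst(1000));
    }
}
```

A check of this kind only exercises the index arithmetic. It says nothing about how the keys are read from a storefile.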



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533448#comment-13533448
 ] 

Hadoop QA commented on HBASE-7342:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12561192/7342-trunk-v3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
additional warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 26 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are zombie tests. See build 
logs for details.

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3566//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3566//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3566//console

This message is automatically generated.



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-15 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533280#comment-13533280
 ] 

Lars Hofhansl commented on HBASE-7342:
--

Cool. +1 for both trunk and 0.94



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-15 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533278#comment-13533278
 ] 

ramkrishna.s.vasudevan commented on HBASE-7342:
---

Yes, Lars, you are right. The midkey calculation does not find an exact midpoint 
inside a block. That is why this happens.
So if there is only one block, we will still get the first KV in 
that block. 



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-15 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533274#comment-13533274
 ] 

Lars Hofhansl commented on HBASE-7342:
--

Just so I understand... Since the code picks the midkey as the row key of the 
first KeyValue of the mid block, it should never pick the first block, because 
that would by definition be the same as the first key of the file (unless there 
is only one, in which case it should not split anyway).

But it's fine to pick the last block, because the first key in that block 
is likely not the last key of the file (unless there's only a single 
KeyValue in that last block).

Did I understand this correctly?
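
The reasoning above can be sketched as follows. This is a simplified model rather than HBase code: blocks are plain lists of row-key strings, and `MidBlockSketch`, `midKey` and `canSplit` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

class MidBlockSketch {
    // The midkey is taken to be the first row key of the chosen block.
    static String midKey(List<List<String>> blocks, int blockIndex) {
        return blocks.get(blockIndex).get(0);
    }

    // A split is only possible if the midkey differs from both ends of the file.
    static boolean canSplit(List<List<String>> blocks, int blockIndex) {
        String first = blocks.get(0).get(0);
        List<String> lastBlock = blocks.get(blocks.size() - 1);
        String last = lastBlock.get(lastBlock.size() - 1);
        String mid = midKey(blocks, blockIndex);
        return !mid.equals(first) && !mid.equals(last);
    }

    public static void main(String[] args) {
        List<List<String>> blocks = Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("m", "n", "o"));
        // Block 0: the midkey "a" is the first key of the file.
        System.out.println(canSplit(blocks, 0));
        // Block 1: the midkey "m" is neither the first nor the last key.
        System.out.println(canSplit(blocks, 1));

        List<List<String>> singleRowLastBlock = Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("z"));
        // Last block holds a single row, so the midkey "z" is the last key.
        System.out.println(canSplit(singleRowLastBlock, 1));
    }
}
```

In this model, picking block 0 always fails the check, while picking the last block fails only when that block holds a single row.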




[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533198#comment-13533198
 ] 

stack commented on HBASE-7342:
--

[~ram_krish] Was leaving it open a while in case you remembered your 
workaround.  Will commit Monday unless objection or alternative proposed.



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-15 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533107#comment-13533107
 ] 

ramkrishna.s.vasudevan commented on HBASE-7342:
---

So are we going to commit this issue? 



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532043#comment-13532043
 ] 

stack commented on HBASE-7342:
--

[~ram_krish] No problem boss.  That's interesting that you saw this below.  
Would be good to know what you fellas decided.  Good on you.



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531947#comment-13531947
 ] 

ramkrishna.s.vasudevan commented on HBASE-7342:
---

@Stack
Sorry Stack. I did not mean to -1 it. Actually, we once faced exactly the same 
issue and we decided on something which I don't remember now :(




[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531309#comment-13531309
 ] 

stack commented on HBASE-7342:
--

bq. I'll go back and refine the test case. We can add it later. The hard part 
is getting blockKeys to be a size-2 array. Any suggestions?

You could just make a test apart from splits.  Conjure an array of 0, 1, 2, and 
10 elements.  Verify that when you divide by two you get a 'midpoint' that 
makes sense for hbase. 

But such a test verges on the silly, I would argue.  Let's just commit your fix 
unless someone comes up w/ a reason for why the -1 was there in the first place.
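
For illustration only, the array check suggested above could be sketched
without a cluster roughly as follows. The class and method names are
hypothetical, midIndex stands in for the index expression with the -1
removed, and treating an empty array as an error is an assumption of this
sketch rather than something stated in the thread.

```java
final class MidpointCheckSketch {
    // Hypothetical stand-in for the index expression without the -1.
    static int midIndex(int length) {
        if (length == 0) {
            // Assumption: an empty array has no midpoint to return.
            throw new IllegalArgumentException("empty array has no midpoint");
        }
        return length / 2;
    }

    // A midpoint "makes sense" here if it is a valid index and, when there
    // is more than one element, it is not the first element.
    static boolean isSensible(int length) {
        int mid = midIndex(length);
        return mid < length && (length == 1 || mid > 0);
    }

    public static void main(String[] args) {
        for (int length : new int[] {1, 2, 10}) {
            System.out.println(length + " elements: mid index " + midIndex(length)
                + ", sensible = " + isSensible(length));
        }
    }
}
```

With a single element the midpoint is necessarily the first element, which
matches the point made elsewhere in the thread that such a file should not
split anyway.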



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread Aleksandr Shulman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531300#comment-13531300
 ] 

Aleksandr Shulman commented on HBASE-7342:
--

I'll go back and refine the test case. We can add it later. The hard part is 
getting blockKeys to be a size-2 array. Any suggestions?



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531284#comment-13531284
 ] 

stack commented on HBASE-7342:
--

[~ted_yu] The problem is plain.  Look at it.  Imagine an array w/ two elements 
in it only.



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531275#comment-13531275
 ] 

stack commented on HBASE-7342:
--

+1 on fix.  I'd not add the test.  It's overkill running a cluster to test an 
array math problem. 



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531272#comment-13531272
 ] 

Ted Yu commented on HBASE-7342:
---

I extracted testBasicSplit from patch v2 and it passed.
{code}
Running org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
2012-12-13 10:17:06.866 java[69043:1903] Unable to load realm mapping info from 
SCDynamicStore
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 29.711 sec
{code}
@Aleksandr:
Can you refine your test case to show us the problem ?

Thanks



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531254#comment-13531254
 ] 

Ted Yu commented on HBASE-7342:
---

Since the fix is for array size being 2, maybe add a check for this case and 
don't subtract 1. Otherwise keep the current logic.
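
As a sketch of this suggestion only (hypothetical names, assuming the current
expression is (count - 1) / 2 as described in the issue), the special case
would look something like this:

```java
final class SizeTwoSpecialCaseSketch {
    // Hypothetical: special-case a two-element array, otherwise keep the -1.
    static int midIndexSpecialCased(int blockCount) {
        if (blockCount == 2) {
            return 1; // do not subtract 1 in this case
        }
        return (blockCount - 1) / 2; // current logic, as described in the issue
    }

    // For comparison: simply dropping the -1 everywhere.
    static int midIndexWithoutMinusOne(int blockCount) {
        return blockCount / 2;
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 3, 4, 10}) {
            System.out.println(n + " blocks: special-cased -> " + midIndexSpecialCased(n)
                + ", without -1 -> " + midIndexWithoutMinusOne(n));
        }
    }
}
```

The two variants agree for 2 and for odd counts, and differ by one for even
counts above 2 (for example 4 blocks gives index 1 versus 2).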



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread Aleksandr Shulman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531247#comment-13531247
 ] 

Aleksandr Shulman commented on HBASE-7342:
--

Hi Ramkrishna,

The logic for the change is as follows:

With the existing implementation (using -1), when there are two items in the 
array, it returns the 0th item ( (2 - 1) / 2 = 0 ), which is equal to the index 
of the firstKey. This is a problem during splits because a split is invalid if 
the midkey is equal to the firstKey. What we really want here is for the index 
to be 1. This is because the lastKey is going to be the first key in the next 
block. So there won't be a collision with it and the midkey will really 
represent the midpoint of first and last.
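
The arithmetic above can be checked with a small sketch. It assumes the index
expression has the form (count - 1) / 2 with Java integer division, as in the
(2 - 1) / 2 = 0 example; the class and method names are hypothetical and are
not the HBase source.

```java
final class MidIndexSketch {
    // Hypothetical helper: the index expression with the -1.
    static int midIndexWithMinusOne(int blockCount) {
        return (blockCount - 1) / 2; // integer division truncates toward zero
    }

    // Hypothetical helper: the same expression with the -1 removed.
    static int midIndexWithoutMinusOne(int blockCount) {
        return blockCount / 2;
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 3, 4, 10}) {
            System.out.println(n + " blocks: with -1 -> " + midIndexWithMinusOne(n)
                + ", without -1 -> " + midIndexWithoutMinusOne(n));
        }
    }
}
```

For two blocks this gives index 0 with the -1 and index 1 without it, which
is the case the issue is about.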



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531162#comment-13531162
 ] 

ramkrishna.s.vasudevan commented on HBASE-7342:
---

It tries to find the mid from the blockkey indices, right? As the offset starts 
from 0, this should be fine, right?




[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530590#comment-13530590
 ] 

Hadoop QA commented on HBASE-7342:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560695/HBASE-7342-v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3502//console

This message is automatically generated.

> Split operation without split key incorrectly finds the middle key in 
> off-by-one error
> --
>
> Key: HBASE-7342
> URL: https://issues.apache.org/jira/browse/HBASE-7342
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, io
>Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
>Reporter: Aleksandr Shulman
>Assignee: Aleksandr Shulman
>Priority: Minor
> Fix For: 0.96.0, 0.94.4
>
> Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch
>
>
> I took a deeper look into issues I was having with region splitting when 
> specifying a region (but not a key for splitting).
> The midkey calculation is off by one and, when there are 2 rows, will pick the 
> 0th one. This causes the firstkey to be the same as midkey, and the split will 
> fail. Removing the -1 causes it to work correctly, as per the test I've added.
> Looking into the code, here is what goes on:
> 1. Split takes the largest storefile.
> 2. It puts all the keys into a 2-dimensional array called blockKeys[][]. Key 
> i resides at blockKeys[i].
> 3. Getting the middle root-level index should yield the key in the middle of 
> the storefile.
> 4. In step 3, we see that there is a possibly erroneous (-1) to adjust for 
> the 0-offset indexing.
> 5. In a case where there are only 2 blockKeys, this yields the 0th 
> block key.
> 6. Unfortunately, this is the same block key that 'firstKey' will be.
> 7. This yields the result in HStore.java:1873 ("cannot split because midkey 
> is the same as first or last row").
> 8. Removing the -1 solves the problem (in this case).
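
For illustration, the off-by-one described in steps 4 through 8 can be sketched as 
follows. This is a minimal sketch and not the code from HFileBlockIndex.java: the 
class MidKeySketch and all three helper methods are hypothetical, and the formula 
(keyCount - 1) / 2 is an assumption about where the reported -1 sits.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class MidKeySketch {
    // Assumed form of the reported calculation: the extra -1 shifts the index down.
    static int midIndexWithMinusOne(int keyCount) {
        return (keyCount - 1) / 2;
    }

    // Assumed form after removing the -1.
    static int midIndexWithoutMinusOne(int keyCount) {
        return keyCount / 2;
    }

    // True when the chosen midkey is byte-for-byte the same as the first block key,
    // which is the condition the report says makes the split fail.
    static boolean midKeyEqualsFirstKey(byte[][] blockKeys, int midIndex) {
        return Arrays.equals(blockKeys[midIndex], blockKeys[0]);
    }

    public static void main(String[] args) {
        // Two block keys, as in the failing case from the report.
        byte[][] blockKeys = {
            "row1".getBytes(StandardCharsets.UTF_8),
            "row2".getBytes(StandardCharsets.UTF_8)
        };
        int before = midIndexWithMinusOne(blockKeys.length);   // 0 for 2 keys
        int after = midIndexWithoutMinusOne(blockKeys.length); // 1 for 2 keys
        System.out.println("with -1: index " + before
            + ", same as firstKey: " + midKeyEqualsFirstKey(blockKeys, before));
        System.out.println("without -1: index " + after
            + ", same as firstKey: " + midKeyEqualsFirstKey(blockKeys, after));
    }
}
```

Under these assumptions the two formulas agree for odd key counts and differ by 
one for even counts, so the two-key case is the smallest one in which the choice 
decides whether midkey collides with firstKey.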

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-12 Thread Aleksandr Shulman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530571#comment-13530571
 ] 

Aleksandr Shulman commented on HBASE-7342:
--

Looks like my trunk branch is a little out of date and some things were 
deprecated. The next patch should integrate cleanly. I noted your suggestions 
in my change.



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-12 Thread Aleksandr Shulman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530478#comment-13530478
 ] 

Aleksandr Shulman commented on HBASE-7342:
--

Noted...let me take a look.



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530472#comment-13530472
 ] 

Ted Yu commented on HBASE-7342:
---

There are compilation errors:
{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hbase-server: Compilation failure: Compilation failure:
[ERROR] /Users/zhihyu/trunk-hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java:[40,30] cannot find symbol
[ERROR] symbol  : class HServerAddress
[ERROR] location: package org.apache.hadoop.hbase
[ERROR] /Users/zhihyu/trunk-hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java:[763,21] cannot find symbol
[ERROR] symbol  : class HServerAddress
[ERROR] location: class org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
[ERROR] /Users/zhihyu/trunk-hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java:[763,48] cannot find symbol
[ERROR] symbol  : method getRegionsInfo()
[ERROR] location: class org.apache.hadoop.hbase.client.HTable
[ERROR] /Users/zhihyu/trunk-hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java:[772,17] cannot find symbol
[ERROR] symbol  : method getRegionsInfo()
[ERROR] location: class org.apache.hadoop.hbase.client.HTable
{code}
In trunk, HServerAddress has been replaced by ServerName, and 
getRegionsInfo() has been replaced by getRegionLocations().



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530463#comment-13530463
 ] 

Ted Yu commented on HBASE-7342:
---

{code}
+System.out.println("Original table has: " + loadedTableCount + " rows");
{code}
Please use LOG variable for the above.
{code}
+  Thread.currentThread();
{code}
Does the above statement have any effect?
{code}
+Thread.sleep(1000);
{code}
Can the sleep duration be shorter?
{code}
+  } catch (InterruptedException e) {
+e.printStackTrace();
{code}
Throw InterruptedIOException from the catch block.
{code}
+return;
+
{code}
nit: remove the empty line.
{code}
+throw new Exception("Split did not increase the number of regions");
{code}
nit: use fail().
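
Taken together, the review comments above could be applied roughly as in the 
sketch below. It is an illustration only, not the patch itself: the class 
SplitWaitSketch, the method waitForRegionCountAbove and its parameters are 
hypothetical, java.util.logging stands in for the test's LOG variable, and the 
local fail() helper stands in for JUnit's fail() so that the example compiles 
without extra dependencies.

```java
import java.io.InterruptedIOException;
import java.util.function.IntSupplier;
import java.util.logging.Logger;

class SplitWaitSketch {
    // Stand-in for the LOG variable the review asks for instead of System.out.
    private static final Logger LOG = Logger.getLogger(SplitWaitSketch.class.getName());

    // Stand-in for JUnit's fail(String); throws AssertionError like the real one.
    static void fail(String message) {
        throw new AssertionError(message);
    }

    // Polls regionCount until it exceeds originalCount, sleeping briefly between polls.
    static void waitForRegionCountAbove(IntSupplier regionCount, int originalCount,
            int attempts, long sleepMillis) throws InterruptedIOException {
        for (int i = 0; i < attempts; i++) {
            int current = regionCount.getAsInt();
            if (current > originalCount) {
                LOG.info("Region count grew from " + originalCount + " to " + current);
                return;
            }
            try {
                Thread.sleep(sleepMillis); // a short, configurable sleep instead of 1000 ms
            } catch (InterruptedException e) {
                // Restore the interrupt flag, then rethrow as InterruptedIOException
                // rather than printing the stack trace.
                Thread.currentThread().interrupt();
                InterruptedIOException iioe =
                    new InterruptedIOException("Interrupted while waiting for split");
                iioe.initCause(e);
                throw iioe;
            }
        }
        fail("Split did not increase the number of regions");
    }

    public static void main(String[] args) throws Exception {
        int[] polls = {0};
        // Simulated region count: reports 1 region twice, then 2 regions.
        waitForRegionCountAbove(() -> ++polls[0] >= 3 ? 2 : 1, 1, 10, 10L);
    }
}
```

Passing the region count in as an IntSupplier keeps the sketch independent of 
any HBase client class; a real test would read the count from the cluster.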
