[ 
https://issues.apache.org/jira/browse/HADOOP-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559181#action_12559181
 ] 

Hadoop QA commented on HADOOP-2493:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12373141/2493.patch
against trunk revision r612161.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1598/testReport/
Findbugs warnings: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1598/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1598/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1598/console

This message is automatically generated.

> hbase will split on a row when the start and end row are the same, causing data loss
> ------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2493
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2493
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: Billy Pearson
>            Assignee: Bryan Duxbury
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: 2493.patch, regions_shot.JPG
>
>
> While testing hbase splits with my code, I was loading a table to serve as an
> inverted index on some links.
> I was using the anchor text as the row key
> and the column parent:child as
> url:(siteurl), and the data is the count of the links pointing to the siteurl
> with that anchor text as the row key.
> A lot of sites have image links, and I use "image" as the anchor text for
> them in my testing code, so there are a lot of "image" links.
> I changed the max file size of hbase to 16 MB for testing and have been able
> to recreate the same error.
> When the table gets big, it splits on the row "image" as the end key for one
> region and the start key of the next region. Later it splits to where the
> start key and end key were both "image" for one of the splits. After that it
> keeps splitting the region with start key "image" and the same end key. So I
> have multiple splits with start key and end key "image". Unless the master
> keeps track of the row key and parent:child data on the splits, I do not
> think all the data will get returned when querying it.
> I have attached a screenshot of my regions. I think there should be some
> logic so that if the start and end row keys are the same the region does not
> split, or we need to start keeping track of the start key and column data on
> the master for each split so we can know where each row is in the database.
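
The first suggestion in the description above, refusing to split a region whose
start and end row keys are the same, could be sketched roughly as follows. This
is only an illustration and not the code in 2493.patch: the class SplitGuard and
the methods isSplittable and key are hypothetical names, and the sketch assumes
that an empty key marks the open first or last boundary of a table.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: decides whether a region may be split
// based only on its start and end row keys.
class SplitGuard {

    // Returns false when the region is bounded by the same row key on both
    // sides (for example start "image" and end "image"), because any further
    // split would have to cut through the middle of a single row.
    // Assumption: an empty key means "no boundary" (first or last region).
    static boolean isSplittable(byte[] startKey, byte[] endKey) {
        if (startKey.length == 0 || endKey.length == 0) {
            return true; // open-ended region, still room to split on a row boundary
        }
        return !Arrays.equals(startKey, endKey);
    }

    // Convenience conversion from a string row key to bytes.
    static byte[] key(String rowKey) {
        return rowKey.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The case from the report: start and end key are both "image".
        System.out.println(isSplittable(key("image"), key("image")));
        // A region spanning two different rows can still be split.
        System.out.println(isSplittable(key("home"), key("image")));
        // The single initial region of a table has empty start and end keys.
        System.out.println(isSplittable(key(""), key("")));
    }
}
```

A check like this only prevents the repeated split. It does not repair regions
that already share the same start and end key.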

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
