[ 
https://issues.apache.org/jira/browse/HBASE-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252454#comment-13252454
 ] 

Eric Newton commented on HBASE-5754:
------------------------------------

Yes, that looks like a clean run.

My run of 1B completed correctly yesterday.

However, I just did a run of 100M.  I suspect split/balancing is causing the 
data loss.  So, during the 100M run, I pushed the split button on the master 
web page every 5 seconds until I had several hundred regions.  Then I left it 
alone.

{noformat}
12/04/12 14:17:01 INFO mapred.JobClient:   goraci.Verify$Counts
12/04/12 14:17:01 INFO mapred.JobClient:     UNDEFINED=37099
12/04/12 14:17:01 INFO mapred.JobClient:     REFERENCED=89961385
12/04/12 14:17:01 INFO mapred.JobClient:     UNREFERENCED=10001516
{noformat}

Perhaps if you do this you can replicate the problem.
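
As a quick sanity check on the counters above, the three values can be summed and the share of UNDEFINED entries computed. This is only a sketch: it assumes the counter meanings given in the goraci README (UNDEFINED being nodes that are referenced but were not found), and the variable names are illustrative.

```python
# Counter values copied from the Verify job output above.
counts = {
    "UNDEFINED": 37099,
    "REFERENCED": 89961385,
    "UNREFERENCED": 10001516,
}

# Total number of distinct node keys the Verify job saw.
total = sum(counts.values())

# Share of keys that were referenced but never found (assumed meaning
# of UNDEFINED, per the goraci README).
undefined_fraction = counts["UNDEFINED"] / total

print(total)                          # 100000000
print(f"{undefined_fraction:.4%}")
```

The three counters add up to exactly 100,000,000, the size of the run, so under that reading 37,099 of the 100M entries are unaccounted for.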
                
> data lost with gora continuous ingest test (goraci)
> ---------------------------------------------------
>
>                 Key: HBASE-5754
>                 URL: https://issues.apache.org/jira/browse/HBASE-5754
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>         Environment: 10 node test cluster
>            Reporter: Eric Newton
>            Assignee: stack
>
> Keith Turner re-wrote the accumulo continuous ingest test using gora, which 
> has both hbase and accumulo back-ends.
> I put a billion entries into HBase, and ran the Verify map/reduce job.  The 
> verification failed because about 21K entries were missing.  The goraci 
> [README|https://github.com/keith-turner/goraci] explains the test, and how it 
> detects missing data.
> I re-ran the test with 100 million entries, and it verified successfully.  
> Both of the times I tested using a billion entries, the verification failed.
> If I run the verification step twice, the results are consistent, so the 
> problem is probably not in the verify step.
> Here are the versions of the various packages:
> ||package||version||
> |hadoop|0.20.205.0|
> |hbase|0.92.1|
> |gora|http://svn.apache.org/repos/asf/gora/trunk r1311277|
> |goraci|https://github.com/ericnewton/goraci  tagged 2012-04-08|
> The change I made to goraci was to configure it for hbase and to allow it to 
> build properly.
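
For readers unfamiliar with the test, the detection idea can be sketched in a few lines. This is not the goraci code. It is a simplified in-memory model that assumes each node stores the key of the previous node in its list, as the README describes, and that the verifier classifies every key it sees as REFERENCED, UNREFERENCED, or UNDEFINED. The `verify` function and the sample data are hypothetical.

```python
from collections import Counter


def verify(nodes):
    """Classify node keys the way the Verify counters are assumed to work.

    nodes maps each stored node key to the key of the previous node it
    points at (None for the first node of a list).
    """
    defined = set(nodes)
    referenced = {prev for prev in nodes.values() if prev is not None}

    counts = Counter()
    for key in defined | referenced:
        if key not in defined:
            counts["UNDEFINED"] += 1      # pointed at, but the row is gone
        elif key in referenced:
            counts["REFERENCED"] += 1     # present and pointed at
        else:
            counts["UNREFERENCED"] += 1   # present, nothing points at it
    return counts


# A four-node linked list: d -> c -> b -> a
chain = {"a": None, "b": "a", "c": "b", "d": "c"}
intact = verify(chain)

# Simulate losing the row for "b" after it was written.
lossy = verify({k: v for k, v in chain.items() if k != "b"})

print(dict(intact))
print(dict(lossy))
```

In the intact list no key is UNDEFINED. After dropping "b", node "c" still points at it, so "b" is counted as UNDEFINED, which is how a lost row shows up in the verification.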

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
