[ 
https://issues.apache.org/jira/browse/HBASE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542726#comment-13542726
 ] 

chunhui shen commented on HBASE-7299:
-------------------------------------

I have analysed the logs of trunk build #3686 and found the cause of the failure.

1. Both testBatchWithPut and testFlushCommitsWithAbort abort regionserver 0.

2. Before each test, we ensure that 2 regionservers are alive:
{code}
@Before public void before() throws IOException {
  LOG.info("before");
  if (UTIL.ensureSomeRegionServersAvailable(slaves)) {
    // Distribute regions
    UTIL.getMiniHBaseCluster().getMaster().balance();
  }
  LOG.info("before done");
}
{code}

3. In trunk build #3686, testFlushCommitsWithAbort ran after testBatchWithPut:
{code}
2013-01-02 12:28:33,183 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): before: client.TestMultiParallel#testBatchWithPut
...
2013-01-02 12:30:08,410 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): before: client.TestMultiParallel#testFlushCommitsWithAbort
{code}

4. testFlushCommitsWithAbort aborts regionserver 0, which was already aborted 
by testBatchWithPut, so we see the following log:
{code}
2013-01-02 12:30:08,410 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): before: client.TestMultiParallel#testFlushCommitsWithAbort
2013-01-02 12:30:08,410 INFO  [pool-1-thread-1] client.TestMultiParallel(77): before
2013-01-02 12:30:08,410 INFO  [pool-1-thread-1] hbase.LocalHBaseCluster(243): Not alive RegionServer:0;juno.apache.org,40265,1357129678691
2013-01-02 12:30:08,410 INFO  [pool-1-thread-1] client.TestMultiParallel(82): before done
2013-01-02 12:30:08,410 INFO  [Thread-709] client.TestMultiParallel(226): test=testFlushCommitsWithAbort
...
2013-01-02 12:30:09,059 INFO  [Thread-709] hbase.LocalHBaseCluster(243): Not alive RegionServer:0;juno.apache.org,40265,1357129678691
2013-01-02 12:30:09,059 INFO  [Thread-709] client.TestMultiParallel(277): Count=1, Alive=juno.apache.org,40198,1357129678744
2013-01-02 12:30:09,059 INFO  [Thread-709] client.TestMultiParallel(277): Count=2, Alive=juno.apache.org,51431,1357129753348
{code}

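5. From the above, it's clear that there are 3 regionservers in total, of which 
2 are alive, but testFlushCommitsWithAbort assumes there are only 2 regionservers 
in total. Its abort of regionserver 0 therefore changes nothing, and the test still 
counts 2 alive servers where it expects 1.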
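
To make the counting in points 4 and 5 concrete, here is a small stand-alone sketch. It is a toy model, not HBase code: plain booleans stand in for regionservers, and every class and method name (AbortCountSketch, abortByIndex, abortFirstAlive, ...) is made up for illustration. The "abort the first alive server" variant is only one possible way to avoid the problem; I am not claiming it is what the addendum does.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: booleans stand in for regionservers. Nothing here is HBase API.
public class AbortCountSketch {

  // Cluster state when testFlushCommitsWithAbort starts in build #3686,
  // as read from the log: server 0 is dead, two other servers are alive.
  static List<Boolean> stateBeforeSecondTest() {
    List<Boolean> alive = new ArrayList<>();
    alive.add(false); // regionserver 0, already aborted by testBatchWithPut
    alive.add(true);  // regionserver that survived the first test
    alive.add(true);  // regionserver started later (assumed: to get back to 2 alive)
    return alive;
  }

  static int countAlive(List<Boolean> alive) {
    int count = 0;
    for (boolean a : alive) {
      if (a) {
        count++;
      }
    }
    return count;
  }

  // Aborts the server at a fixed index; has no effect if it is already dead.
  static void abortByIndex(List<Boolean> alive, int index) {
    alive.set(index, false);
  }

  // Hypothetical alternative: abort the first server that is still alive.
  // Returns the index that was aborted, or -1 if no server was alive.
  static int abortFirstAlive(List<Boolean> alive) {
    for (int i = 0; i < alive.size(); i++) {
      if (alive.get(i)) {
        alive.set(i, false);
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // Fixed index: the abort hits the dead server, so the alive count stays the same.
    List<Boolean> servers = stateBeforeSecondTest();
    int expected = countAlive(servers) - 1;
    abortByIndex(servers, 0);
    System.out.println("fixed index: expected=" + expected
        + ", actual=" + countAlive(servers));

    // First alive server: the abort removes a live server, so the counts agree.
    servers = stateBeforeSecondTest();
    expected = countAlive(servers) - 1;
    abortFirstAlive(servers);
    System.out.println("first alive: expected=" + expected
        + ", actual=" + countAlive(servers));
  }
}
```

With the fixed index, the expected and actual counts differ in the same way as in the reported assertion failure (expected 1, was 2); picking a server that is actually alive makes them agree.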


Uploading addendum2 to fix this bug in the test case.
                
> TestMultiParallel fails intermittently in trunk builds
> ------------------------------------------------------
>
>                 Key: HBASE-7299
>                 URL: https://issues.apache.org/jira/browse/HBASE-7299
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.96.0
>
>         Attachments: 7299.addendum, 7299-v4.txt, HBASE-7299.patch, 
> HBASE-7299v2.patch, HBASE-7299v3.patch
>
>
> From trunk build #3598:
> {code}
>  testFlushCommitsNoAbort(org.apache.hadoop.hbase.client.TestMultiParallel): 
> Count of regions=8
> {code}
> It failed in 3595 as well:
> {code}
> java.lang.AssertionError: Server count=2, abort=true expected:<1> but was:<2>
>       at org.junit.Assert.fail(Assert.java:93)
>       at org.junit.Assert.failNotEquals(Assert.java:647)
>       at org.junit.Assert.assertEquals(Assert.java:128)
>       at org.junit.Assert.assertEquals(Assert.java:472)
>       at 
> org.apache.hadoop.hbase.client.TestMultiParallel.doTestFlushCommits(TestMultiParallel.java:267)
>       at 
> org.apache.hadoop.hbase.client.TestMultiParallel.testFlushCommitsWithAbort(TestMultiParallel.java:226)
> {code}

