[ 
https://issues.apache.org/jira/browse/HBASE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541312#comment-13541312
 ] 

chunhui shen commented on HBASE-7299:
-------------------------------------

[~ted_yu]
I have looked at the log again, and I think the failure comes down to region balance.

First, look at the order in which the tests ran:
{code}
2012-12-31 03:11:48,688 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): 
before: client.TestMultiParallel#testActiveThreadsCount 
2012-12-31 03:11:49,247 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): 
before: client.TestMultiParallel#testBatchWithGet 
2012-12-31 03:11:50,151 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): 
before: client.TestMultiParallel#testBadFam 
2012-12-31 03:11:50,169 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): 
before: client.TestMultiParallel#testFlushCommitsNoAbort 
2012-12-31 03:11:50,825 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): 
before: client.TestMultiParallel#testFlushCommitsWithAbort 
{code}

Therefore, we only need to care about what happened before 2012-12-31 03:11:50,825.


Then I grepped the logs for every opened region:
{code}
2012-12-31 03:11:46,309 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-0] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,,1356923505778.5e876dba9be19501a1eb65bf3a169e52. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,164 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,bbb,1356923506859.7c3f09396e7314de6f5a757b010b6497. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,202 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-0] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,ccc,1356923506862.2a80b82e2d6c3152e3f12bc91e1cc621. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,303 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,45800,1356923500558-1] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,fff,1356923506868.63ffa8986cd30ff5314b4c2a70cf846a. on 
server:asf001.sp2.ygridcore.net,45800,1356923500558
2012-12-31 03:11:47,329 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,ddd,1356923506864.744510f09d963e39dd9c0b6e3119dc10. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,370 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,45800,1356923500558-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,iii,1356923506875.d09ca7b9b80b6cde560772598a240d0e. on 
server:asf001.sp2.ygridcore.net,45800,1356923500558
2012-12-31 03:11:47,400 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-0] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,eee,1356923506866.6a1697e740f121d009c3085e0cccd18d. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,439 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,45800,1356923500558-0] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,jjj,1356923506878.f25b9086263fb7a4f983524c708503b6. on 
server:asf001.sp2.ygridcore.net,45800,1356923500558
2012-12-31 03:11:47,465 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-1] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,,1356923506856.2db538d9e2005dba4e28746d51cf3831. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,482 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,ggg,1356923506871.7adeba3045bdbb0f4e499b221d2ffc87. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,598 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,45800,1356923500558-1] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,nnn,1356923506888.9cc1e013ebfba7da8e00e4963c2d111a. on 
server:asf001.sp2.ygridcore.net,45800,1356923500558
2012-12-31 03:11:47,603 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-1] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,kkk,1356923506880.a2a3e39af3fa95eb1a3979998b075bb6. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,634 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,45800,1356923500558-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,ppp,1356923506893.915969809cfe733d325591b7c27bd088. on 
server:asf001.sp2.ygridcore.net,45800,1356923500558
2012-12-31 03:11:47,643 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,lll,1356923506883.0e6b1c9b373cecb0c74380b78d1cc492. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,701 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,45800,1356923500558-0] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,rrr,1356923506899.67925003b24f6408e7ee6ef2360a77f6. on 
server:asf001.sp2.ygridcore.net,45800,1356923500558
2012-12-31 03:11:47,717 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-1] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,mmm,1356923506886.d0a07239a287e74e7706e4b9a0c9f491. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,745 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,ooo,1356923506891.524c6a4fb529fbb5b86e0865ac0131f5. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,867 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,sss,1356923506901.af5693d7dc46541210d7c26cf4e4c1a0. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,936 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-0] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,hhh,1356923506873.12dff64cde2a448c9d5b7adecfabfaaa. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:47,957 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,ttt,1356923506904.c121cfbfb3e248f820d4729e4452ff14. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:48,012 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,vvv,1356923506908.797a80f1a86a9256a833e4cd48554185. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:48,076 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,www,1356923506911.79129a00e6718ae7ca478e3dde854524. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:48,185 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-1] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,qqq,1356923506896.cf9a88d3961133afeaaeabdf5a9cffc3. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:48,411 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-0] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,uuu,1356923506906.62d60488f81f0e0edce10369200b1543. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:48,556 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-2] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,xxx,1356923506913.bf54cd9fae68060237f700e0c7acc6b4. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
2012-12-31 03:11:48,626 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-1] 
handler.OpenRegionHandler(149): Opened 
multi_test_table,yyy,1356923506915.7b41bd006a2c832842a67b85f1837c68. on 
server:asf001.sp2.ygridcore.net,38198,1356923500609
{code}

These regions are created by
{code}
 @BeforeClass public static void beforeClass() throws Exception {
...
    UTIL.createMultiRegions(t, Bytes.toBytes(FAMILY));
...
  }
{code}

From the above, we can see that asf001.sp2.ygridcore.net,38198,1356923500609 
serves 20 regions, while asf001.sp2.ygridcore.net,45800,1356923500558 serves 
only 6.
The cause seems clear once we look at the assertion in the test:
{code}
for (JVMClusterUtil.RegionServerThread t: liveRSs) {
  int regions = ProtobufUtil.getOnlineRegions(t.getRegionServer()).size();
  Assert.assertTrue("Count of regions=" + regions, regions > 10);
}
{code}
I don't know why we assert that each regionserver has more than 10 regions.
The failure log says "java.lang.AssertionError: Count of regions=7", so there 
is one more region on asf001.sp2.ygridcore.net,45800,1356923500558:
{code}
2012-12-31 03:11:44,306 DEBUG 
[RS_OPEN_REGION-asf001.sp2.ygridcore.net,45800,1356923500558-0] 
handler.OpenRegionHandler(149): Opened -ROOT-,,0.70236052 on 
server:asf001.sp2.ygridcore.net,45800,1356923500558
{code}
Yes, it's the -ROOT- region.
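
For anyone who wants to reproduce the per-server tally from the log, here is a minimal sketch. It assumes the wrapped log lines have first been rejoined so that each OpenRegionHandler message sits on a single line. RegionTally and countByServer are hypothetical names used only for illustration; they are not part of HBase or of the test.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of HBase): counts "Opened <region> on server:<name>"
// messages per server name.
class RegionTally {
    // Tolerates extra whitespace left over from rejoining wrapped lines.
    private static final Pattern OPENED =
        Pattern.compile("Opened\\s+(\\S+)\\s+on\\s+server:(\\S+)");

    static Map<String, Integer> countByServer(List<String> logLines) {
        // TreeMap keeps the servers in a stable, sorted order for printing.
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : logLines) {
            Matcher m = OPENED.matcher(line);
            if (m.find()) {
                counts.merge(m.group(2), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Three messages taken from the log in this comment, rejoined onto one line each.
        List<String> sample = Arrays.asList(
            "2012-12-31 03:11:47,164 DEBUG [RS_OPEN_REGION-asf001.sp2.ygridcore.net,38198,1356923500609-2] "
                + "handler.OpenRegionHandler(149): Opened "
                + "multi_test_table,bbb,1356923506859.7c3f09396e7314de6f5a757b010b6497. on "
                + "server:asf001.sp2.ygridcore.net,38198,1356923500609",
            "2012-12-31 03:11:47,303 DEBUG [RS_OPEN_REGION-asf001.sp2.ygridcore.net,45800,1356923500558-1] "
                + "handler.OpenRegionHandler(149): Opened "
                + "multi_test_table,fff,1356923506868.63ffa8986cd30ff5314b4c2a70cf846a. on "
                + "server:asf001.sp2.ygridcore.net,45800,1356923500558",
            "2012-12-31 03:11:44,306 DEBUG [RS_OPEN_REGION-asf001.sp2.ygridcore.net,45800,1356923500558-0] "
                + "handler.OpenRegionHandler(149): Opened -ROOT-,,0.70236052 on "
                + "server:asf001.sp2.ygridcore.net,45800,1356923500558");
        countByServer(sample).forEach(
            (server, n) -> System.out.println(server + " -> " + n + " region(s)"));
    }
}
```

Run over the full set of "Opened" messages (including the -ROOT- one), this kind of tally is what gives 20 regions on the 38198 server and 7 on the 45800 server.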

Also, we can see the balance logs later:
{code}
2012-12-31 03:11:58,883 INFO  [pool-1-thread-1] master.HMaster(1325): balance 
hri=multi_test_table,mmm,1356923506886.d0a07239a287e74e7706e4b9a0c9f491., 
src=asf001.sp2.ygridcore.net,38198,1356923500609, 
dest=asf001.sp2.ygridcore.net,59241,1356923517635
2012-12-31 03:11:58,890 INFO  [pool-1-thread-1] master.HMaster(1325): balance 
hri=multi_test_table,,1356923505778.5e876dba9be19501a1eb65bf3a169e52., 
src=asf001.sp2.ygridcore.net,38198,1356923500609, 
dest=asf001.sp2.ygridcore.net,59241,1356923517635
2012-12-31 03:11:58,949 INFO  [pool-1-thread-1] master.HMaster(1325): balance 
hri=multi_test_table,bbb,1356923506859.7c3f09396e7314de6f5a757b010b6497., 
src=asf001.sp2.ygridcore.net,38198,1356923500609, 
dest=asf001.sp2.ygridcore.net,59241,1356923517635
2012-12-31 03:11:58,967 INFO  [pool-1-thread-1] master.HMaster(1325): balance 
hri=multi_test_table,eee,1356923506866.6a1697e740f121d009c3085e0cccd18d., 
src=asf001.sp2.ygridcore.net,38198,1356923500609, 
dest=asf001.sp2.ygridcore.net,59241,1356923517635
{code}



So I think the cause is that the regions were unbalanced across the servers 
at that point, and I don't think it is necessary to assert that each 
regionserver has more than 10 regions.

By the way, I noticed that we abort regionserver 0 in 
TestMultiParallel#testBatchWithPut, but we also abort regionserver 0 in 
TestMultiParallel#testFlushCommitsWithAbort(). That seems confusing.
                
> TestMultiParallel fails intermittently in trunk builds
> ------------------------------------------------------
>
>                 Key: HBASE-7299
>                 URL: https://issues.apache.org/jira/browse/HBASE-7299
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.96.0
>
>         Attachments: 7299-v4.txt, HBASE-7299.patch, HBASE-7299v2.patch, 
> HBASE-7299v3.patch
>
>
> From trunk build #3598:
> {code}
>  testFlushCommitsNoAbort(org.apache.hadoop.hbase.client.TestMultiParallel): 
> Count of regions=8
> {code}
> It failed in 3595 as well:
> {code}
> java.lang.AssertionError: Server count=2, abort=true expected:<1> but was:<2>
>       at org.junit.Assert.fail(Assert.java:93)
>       at org.junit.Assert.failNotEquals(Assert.java:647)
>       at org.junit.Assert.assertEquals(Assert.java:128)
>       at org.junit.Assert.assertEquals(Assert.java:472)
>       at 
> org.apache.hadoop.hbase.client.TestMultiParallel.doTestFlushCommits(TestMultiParallel.java:267)
>       at 
> org.apache.hadoop.hbase.client.TestMultiParallel.testFlushCommitsWithAbort(TestMultiParallel.java:226)
> {code}
