[ https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560710#comment-13560710 ]
chunhui shen commented on HBASE-6466:
-------------------------------------

TestLogRolling#testLogRollOnDatanodeDeath() failed in trunk builds 3779 and 3780 at this assertion:
{code}
assertTrue("LowReplication Roller should've been disabled", !log.isLowReplicationRollEnabled());
{code}

lowReplicationRollEnabled is set to false only in FSHLog#checkLowReplication(). FSHLog#checkLowReplication() is called only by FSHLog#syncer, which skips the call while the log is rolling:
{code}
if (!this.logRollRunning) {
  checkLowReplication();
  ...
}
{code}

Therefore, I can think of only one reason for the test failure: the log is rolling when syncer() is called.

In the logs, I could find "HDFS pipeline error detected. Found 1 replicas but expecting no less than 2 replicas" (logged by FSHLog#checkLowReplication()) only 3 times, but it must appear at least 4 times for the test to pass.

It's easy to reproduce the failed test with the following change in FSHLog:
{code}
--- hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java (revision 1437274)
+++ hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java (working copy)
@@ -501,6 +501,10 @@
     byte [][] regionsToFlush = null;
     try {
       this.logRollRunning = true;
+      try {
+        Thread.sleep(1500);
+      } catch (InterruptedException e) {
+      }
       boolean isClosed = closed;
       if (isClosed || !closeBarrier.beginOp()) {
         LOG.debug("HLog " + (isClosed ? "closed" : "closing") + ". Skipping rolling of writer");
{code}

In addition, with patch v6, TestLogRolling passed 50 times on my local PC.
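To make the suspected race concrete, here is a minimal single-threaded sketch of the skipped check. It is not the FSHLog code: the class LowReplicationSketch, the counter, and the threshold of 4 checks are assumptions chosen only to mirror the "3 times found, 4 times needed" count above. Only the guard on logRollRunning is taken from the snippet quoted earlier.

```java
public class LowReplicationSketch {
    // Hypothetical stand-ins for the FSHLog fields named in the comment.
    private boolean logRollRunning = false;
    private boolean lowReplicationRollEnabled = true;
    private int lowReplicationChecks = 0;
    // Assumed threshold: the roller is disabled on this many low-replication checks.
    private final int checksBeforeDisable;

    public LowReplicationSketch(int checksBeforeDisable) {
        this.checksBeforeDisable = checksBeforeDisable;
    }

    public void setLogRollRunning(boolean running) {
        this.logRollRunning = running;
    }

    // Simplified model: every call counts as one detected low-replication event.
    void checkLowReplication() {
        lowReplicationChecks++;
        if (lowReplicationChecks >= checksBeforeDisable) {
            lowReplicationRollEnabled = false;
        }
    }

    // Mirrors the guard from the comment: the check is skipped during a roll.
    public void syncer() {
        if (!logRollRunning) {
            checkLowReplication();
        }
    }

    public boolean isLowReplicationRollEnabled() {
        return lowReplicationRollEnabled;
    }

    public int getLowReplicationChecks() {
        return lowReplicationChecks;
    }

    public static void main(String[] args) {
        LowReplicationSketch log = new LowReplicationSketch(4);
        for (int i = 0; i < 3; i++) {
            log.syncer();              // three checks run normally
        }
        log.setLogRollRunning(true);   // a roll is in progress
        log.syncer();                  // the fourth check is skipped
        log.setLogRollRunning(false);
        System.out.println("checks=" + log.getLowReplicationChecks()
            + " enabled=" + log.isLowReplicationRollEnabled());
        // prints "checks=3 enabled=true"
    }
}
```

In this model, the fourth sync lands while logRollRunning is true, so the roller stays enabled and an assertion like the one in the test would fail.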
Attaching patch v7, which makes a small change to TestLogRolling.

> Enable multi-thread for memstore flush
> --------------------------------------
>
>                 Key: HBASE-6466
>                 URL: https://issues.apache.org/jira/browse/HBASE-6466
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>   Affects Versions: 0.96.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.96.0
>
>         Attachments: 6466-v6.patch, HBASE-6466.patch, HBASE-6466v2.patch,
> HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch,
> HBASE-6466-v4.patch, HBASE-6466-v5.patch
>
>
> If the KV is large or the HLog is closed under high-pressure putting, we found
> the memstore is often above the high water mark and blocks the putting.
> So should we enable multi-thread for memstore flush?
> Some performance test data for reference:
> 1. Test environment:
> random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400
> regions per regionserver; row len=50 bytes, value len=1024 bytes; 5
> regionservers, 300 ipc handlers per regionserver; 5 clients, 50 thread handlers
> per client for writing
> 2. Test results:
> one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per
> regionserver, with many aboveGlobalMemstoreLimit blockings
> two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per
> regionserver
> 200 thread handlers per client & two cacheFlush handlers: tps 16.1k/s per
> regionserver, flush 18.6MB/s per regionserver