[ 
https://issues.apache.org/jira/browse/HBASE-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14807:
--------------------------
    Attachment: 14807.patch

Down timeout from 30 seconds to 15. It runs fast. Let it fail before it 
generates loads of spew.

The append/puts had not edit in them so were skirting a code path; on occasion 
we'd go intended route. We also had a flush in there to add complexity... but 
mostly it just was a noop because we didn't have anything in memstore.

Address both of above. This fixes the hang we were seeing which was us waiting 
on flush to be done 'flushing' but no flush was happening.



> TestWALLockup is flakey
> -----------------------
>
>                 Key: HBASE-14807
>                 URL: https://issues.apache.org/jira/browse/HBASE-14807
>             Project: HBase
>          Issue Type: Bug
>          Components: flakey, test
>            Reporter: stack
>            Assignee: stack
>         Attachments: 14807.patch
>
>
> Fails frequently. 
> Looks like this:
> {code}
> 2015-11-12 10:38:51,812 DEBUG [Time-limited test] regionserver.HRegion(3882): 
> Found 0 recovered edits file(s) under 
> /home/jenkins/jenkins-slave/workspace/HBase-1.2/jdk/latest1.7/label/Hadoop/hbase-server/target/test-data/8b8f8f12-1819-47e3-b1f1-8ffa789438ad/data/default/testLockupWhenSyncInMiddleOfZigZagSetup/c8694b53368f3301a8d370089120388d
> 2015-11-12 10:38:51,821 DEBUG [Time-limited test] 
> regionserver.FlushLargeStoresPolicy(56): 
> hbase.hregion.percolumnfamilyflush.size.lower.bound is not specified, use 
> global config(16777216) instead
> 2015-11-12 10:38:51,880 DEBUG [Time-limited test] wal.WALSplitter(729): Wrote 
> region 
> seqId=/home/jenkins/jenkins-slave/workspace/HBase-1.2/jdk/latest1.7/label/Hadoop/hbase-server/target/test-data/8b8f8f12-1819-47e3-b1f1-8ffa789438ad/data/default/testLockupWhenSyncInMiddleOfZigZagSetup/c8694b53368f3301a8d370089120388d/recovered.edits/2.seqid
>  to file, newSeqId=2, maxSeqId=0
> 2015-11-12 10:38:51,881 INFO  [Time-limited test] regionserver.HRegion(868): 
> Onlined c8694b53368f3301a8d370089120388d; next sequenceid=2
> 2015-11-12 10:38:51,994 ERROR [sync.1] wal.FSHLog$SyncRunner(1226): Error 
> syncing, request close of WAL
> java.io.IOException: FAKE! Failed to replace a bad datanode...SYNC
>       at 
> org.apache.hadoop.hbase.regionserver.TestWALLockup$1DodgyFSLog$1.sync(TestWALLockup.java:162)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1222)
>       at java.lang.Thread.run(Thread.java:745)
> 2015-11-12 10:38:51,997 DEBUG [Thread-4] regionserver.LogRoller(139): WAL 
> roll requested
> 2015-11-12 10:38:52,019 DEBUG [flusher] 
> regionserver.FlushLargeStoresPolicy(100): Since none of the CFs were above 
> the size, flushing all.
> 2015-11-12 10:38:52,192 INFO  [Thread-4] 
> regionserver.TestWALLockup$1DodgyFSLog(129): LATCHED
> java.lang.InterruptedException: sleep interrupted
>       at java.lang.Thread.sleep(Native Method)
>       at org.apache.hadoop.hbase.util.Threads.sleep(Threads.java:146)
>       at 
> org.apache.hadoop.hbase.regionserver.TestWALLockup.testLockupWhenSyncInMiddleOfZigZagSetup(TestWALLockup.java:245)
> 2015-11-12 10:39:18,609 INFO  [main] regionserver.TestWALLockup(91): Cleaning 
> test directory: 
> /home/jenkins/jenkins-slave/workspace/HBase-1.2/jdk/latest1.7/label/Hadoop/hbase-server/target/test-data/8b8f8f12-1819-47e3-b1f1-8ffa789438ad
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>       at java.lang.Thread.run(Thread.java:745)
> {code}
> ... then times out after being locked up for 30 seconds.  Writes 50+MB of 
> logs while spinning.
> Reported as this:
> {code}
> -------------------------------------------------------------------------------
> Test set: org.apache.hadoop.hbase.regionserver.TestWALLockup
> -------------------------------------------------------------------------------
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 198.23 sec 
> <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestWALLockup
> testLockupWhenSyncInMiddleOfZigZagSetup(org.apache.hadoop.hbase.regionserver.TestWALLockup)
>   Time elapsed: 0.049 sec  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 30000 
> milliseconds
>       at org.apache.log4j.Category.callAppenders(Category.java:205)
>       at org.apache.log4j.Category.forcedLog(Category.java:391)
>       at org.apache.log4j.Category.log(Category.java:856)
>       at 
> org.apache.commons.logging.impl.Log4JLogger.debug(Log4JLogger.java:155)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1386)
>       at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1352)
>       at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1302)
>       at 
> org.apache.hadoop.hbase.regionserver.TestWALLockup.testLockupWhenSyncInMiddleOfZigZagSetup(TestWALLockup.java:260)
> {code}
> Failed here most recently:
> https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.2/364/jdk=latest1.7,label=Hadoop/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to