[ 
https://issues.apache.org/jira/browse/HBASE-19554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360401#comment-16360401
 ] 

Duo Zhang commented on HBASE-19554:
-----------------------------------

https://builds.apache.org/job/HBASE-Flaky-Tests/25832/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.TestDLSFSHLog-output.txt/*view*/

{noformat}
2018-02-12 04:56:20,895 DEBUG [PEWorker-16] procedure.ServerCrashProcedure(192): pid=132, state=RUNNABLE:SERVER_CRASH_PROCESS_META; ServerCrashProcedure server=asf911.gq1.ygridcore.net,41715,1518411358686, splitWal=true, meta=true; Processing hbase:meta that was on asf911.gq1.ygridcore.net,41715,1518411358686
2018-02-12 04:56:20,895 INFO  [PEWorker-16] procedure2.ProcedureExecutor(1498): Initialized subprocedures=[{pid=135, ppid=132, state=RUNNABLE:RECOVER_META_SPLIT_LOGS; RecoverMetaProcedure failedMetaServer=asf911.gq1.ygridcore.net,41715,1518411358686, splitWal=true}]
{noformat}

After the RecoverMetaProcedure (pid=135) is initialized, the log shows no 
further progress, and the test eventually times out. Let me add a thread dump 
just before the timeout fires to see if we can find something.
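
For illustration only, here is a minimal sketch of what dumping all threads shortly before a timeout could look like using only the JDK. This is not the actual patch: the class name TimeoutThreadDump and the methods dumpAllThreads and scheduleDump are hypothetical, and the real change may rely on existing HBase test utilities instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of HBase: dumps all live threads so that a
// hung test leaves some evidence behind before it is killed by its timeout.
final class TimeoutThreadDump {
  private TimeoutThreadDump() {}

  /** Returns a text dump of every live thread with its full stack trace. */
  static String dumpAllThreads() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    // Only request lock details when the JVM supports them; otherwise
    // dumpAllThreads would throw UnsupportedOperationException.
    ThreadInfo[] infos = bean.dumpAllThreads(
        bean.isObjectMonitorUsageSupported(),
        bean.isSynchronizerUsageSupported());
    StringBuilder sb = new StringBuilder();
    for (ThreadInfo info : infos) {
      if (info == null) {
        continue; // defensive: skip threads that ended during the dump
      }
      sb.append('"').append(info.getThreadName()).append("\" state=")
          .append(info.getThreadState());
      if (info.getLockName() != null) {
        sb.append(" waiting on ").append(info.getLockName());
      }
      sb.append(System.lineSeparator());
      // ThreadInfo.toString() truncates the stack, so print frames by hand.
      for (StackTraceElement frame : info.getStackTrace()) {
        sb.append("    at ").append(frame).append(System.lineSeparator());
      }
      sb.append(System.lineSeparator());
    }
    return sb.toString();
  }

  /**
   * Schedules a dump to stderr after the given delay. The caller would pick a
   * delay slightly shorter than the test timeout and cancel the returned
   * future if the test finishes in time.
   */
  static ScheduledFuture<?> scheduleDump(
      ScheduledExecutorService scheduler, long delay, TimeUnit unit) {
    return scheduler.schedule(
        () -> System.err.println(dumpAllThreads()), delay, unit);
  }
}
```

A test could call scheduleDump at the start with a delay a little below its timeout, so the stacks of the PEWorker threads are captured while the procedure is still stuck.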

> AbstractTestDLS.testThreeRSAbort sometimes fails in pre commit
> --------------------------------------------------------------
>
>                 Key: HBASE-19554
>                 URL: https://issues.apache.org/jira/browse/HBASE-19554
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Recovery, wal
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.0.0-beta-2
>
>         Attachments: HBASE-19554.patch
>
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/10554/artifact/patchprocess/patch-unit-hbase-server.txt
> The error message is a bit strange:
> {quote}
> [ERROR] testThreeRSAbort(org.apache.hadoop.hbase.master.TestDLSAsyncFSWAL) 
> Time elapsed: 20.627 s <<< ERROR!
> org.apache.hadoop.hbase.TableNotFoundException: Region of 
> 'hbase:namespace,,1513320505933.451650152885a3b41d0b1110deca513c.' is 
> expected in the table of 'testThreeRSAbort', but hbase:meta says it is in the 
> table of 'hbase:namespace'. hbase:meta might be damaged.
> {quote}
> It fails for both FSHLog and AsyncFSWAL. Need to dig more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
