[ 
https://issues.apache.org/jira/browse/HBASE-19554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358324#comment-16358324
 ] 

Duo Zhang commented on HBASE-19554:
-----------------------------------

I guess the problem is related to meta reassignment. [~zghaobac] has already 
found something strange in AMv2 in HBASE-19965, so let's wait for the result 
of his digging.

I also found something strange in the output:

{noformat}
java.lang.AssertionError
        at org.apache.hadoop.hbase.wal.WALKeyImpl.getWriteEntry(WALKeyImpl.java:82)
        at org.apache.hadoop.hbase.regionserver.wal.WALUtil.doFullAppendTransaction(WALUtil.java:159)
        at org.apache.hadoop.hbase.regionserver.wal.WALUtil.writeMarker(WALUtil.java:132)
        at org.apache.hadoop.hbase.regionserver.wal.WALUtil.writeRegionEventMarker(WALUtil.java:97)
        at org.apache.hadoop.hbase.regionserver.HRegion.writeRegionCloseMarker(HRegion.java:1103)
        at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1615)
        at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1437)
        at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:104)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
{noformat}

The assertion is
{code}
  public MultiVersionConcurrencyControl.WriteEntry getWriteEntry() throws InterruptedIOException {
    assert this.writeEntry != null;
    return this.writeEntry;
  }
{code}

I think the problem was introduced by HBASE-19929. Since we now close the WAL 
directly, it is possible for a WAL.append to fail without an mvcc writeEntry 
ever being assigned. Let me file an issue to fix this.
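
To illustrate the suspected sequence, here is a minimal, self-contained sketch. 
It is not HBase code: the Key and Wal classes and the writeMarker method are 
hypothetical stand-ins, and it assumes that the caller reads the write entry 
while handling a failed append. The null check is thrown explicitly instead of 
using "assert" so that the sketch behaves the same with or without -ea.

```java
import java.io.IOException;

public class WalAppendSketch {
  /** Hypothetical stand-in for a WAL key that carries an mvcc write entry. */
  static final class Key {
    private Object writeEntry; // assigned only when an append succeeds

    void setWriteEntry(Object entry) {
      this.writeEntry = entry;
    }

    Object getWriteEntry() {
      // Stands in for "assert this.writeEntry != null".
      if (writeEntry == null) {
        throw new AssertionError("writeEntry was never assigned");
      }
      return writeEntry;
    }
  }

  /** Hypothetical WAL that rejects appends once it has been closed. */
  static final class Wal {
    private boolean closed;

    void close() {
      closed = true;
    }

    void append(Key key) throws IOException {
      if (closed) {
        // The append fails before any write entry is assigned.
        throw new IOException("WAL is closed");
      }
      key.setWriteEntry(new Object());
    }
  }

  /** Appends a marker; on failure, the cleanup path reads the write entry. */
  static String writeMarker(Wal wal) {
    Key key = new Key();
    try {
      wal.append(key);
      return "appended";
    } catch (IOException e) {
      try {
        key.getWriteEntry(); // assumption: cleanup expects the entry to exist
        return "cleaned up";
      } catch (AssertionError ae) {
        return "AssertionError after failed append";
      }
    }
  }

  public static void main(String[] args) {
    Wal wal = new Wal();
    System.out.println(writeMarker(wal)); // open WAL: the append succeeds
    wal.close();
    System.out.println(writeMarker(wal)); // closed WAL: the append fails first
  }
}
```

In this sketch the second call hits the null check only because the append 
failed before a write entry was set, which matches the guess above.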

> AbstractTestDLS.testThreeRSAbort sometimes fails in pre commit
> --------------------------------------------------------------
>
>                 Key: HBASE-19554
>                 URL: https://issues.apache.org/jira/browse/HBASE-19554
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Recovery, wal
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.0.0-beta-2
>
>         Attachments: HBASE-19554.patch
>
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/10554/artifact/patchprocess/patch-unit-hbase-server.txt
> The error message is a bit strange:
> {quote}
> [ERROR] testThreeRSAbort(org.apache.hadoop.hbase.master.TestDLSAsyncFSWAL) 
> Time elapsed: 20.627 s <<< ERROR!
> org.apache.hadoop.hbase.TableNotFoundException: Region of 
> 'hbase:namespace,,1513320505933.451650152885a3b41d0b1110deca513c.' is 
> expected in the table of 'testThreeRSAbort', but hbase:meta says it is in the 
> table of 'hbase:namespace'. hbase:meta might be damaged.
> {quote}
> It fails for both FSHLog and AsyncFSWAL. Need to dig more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
