[ 
https://issues.apache.org/jira/browse/HBASE-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644791#comment-16644791
 ] 

Duo Zhang commented on HBASE-21278:
-----------------------------------

Good news. After changing the testRecoveryAndDoubleExecution to only quit when 
the stepNum equals the lastStep, and change the lastStep to 8(It is the id for 
MERGE_TABLE_REGIONS_UPDATE_META, we can rollback before this state, actually) . 
The old test does not fail always is because that, we sometimes do not persist 
the TRSPs so when restarting we do not need to rollback the TRSPs. By changing 
the lastStep to 8, we make sure that the two TRSPs have been persistent so the 
test will fail always.

Let me dig.

> TestMergeTableRegionsProcedure is flaky
> ---------------------------------------
>
>                 Key: HBASE-21278
>                 URL: https://issues.apache.org/jira/browse/HBASE-21278
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Duo Zhang
>            Priority: Major
>         Attachments: 
> org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt
>
>
> https://builds.apache.org/job/HBase-Flaky-Tests/job/master/1235/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt/*view*/
> I think the problem is
> {noformat}
> 2018-10-08 03:44:30,315 INFO  [PEWorker-1] 
> procedure.MasterProcedureScheduler(689): pid=43, ppid=42, state=SUCCESS, 
> hasLock=false; TransitRegionStateProcedure 
> table=testRollbackAndDoubleExecution, 
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN checking lock on 
> 9bac7c539ac0cff6dc5706ed375a3bfb
> 2018-10-08 03:44:30,320 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159): 
> CODE-BUG: Uncaught runtime exception for pid=43, ppid=42, state=SUCCESS, 
> hasLock=true; TransitRegionStateProcedure 
> table=testRollbackAndDoubleExecution, 
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN
> java.lang.UnsupportedOperationException
>       at 
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:458)
>       at 
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:97)
>       at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:208)
>       at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:957)
>       at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1605)
>       at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1567)
>       at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1446)
>       at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
> {noformat}
> Typically there is no rollback for TRSP. Need to dig more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to