[ https://issues.apache.org/jira/browse/HBASE-20921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553311#comment-16553311 ]
stack commented on HBASE-20921:
-------------------------------

LGTM

> Possible NPE in ReopenTableRegionsProcedure
> -------------------------------------------
>
>                 Key: HBASE-20921
>                 URL: https://issues.apache.org/jira/browse/HBASE-20921
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2
>    Affects Versions: 3.0.0, 2.1.0, 2.0.2
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>         Attachments: HBASE-20921.branch-2.0.001.patch
>
>
> After HBASE-20752, we issue a ReopenTableRegionsProcedure in
> ModifyTableProcedure to ensure all regions are reopened.
> However, ModifyTableProcedure and ReopenTableRegionsProcedure do not hold the
> lock (why?), so a merge/split procedure can run while ModifyTableProcedure
> is executing.
> As a result, when ReopenTableRegionsProcedure reaches the
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state, some of the persisted regions
> it needs to check no longer exist, and an NPE is thrown.
> {code}
> 2018-07-18 01:38:57,528 INFO [PEWorker-9] procedure2.ProcedureExecutor(1246): Finished pid=6110, state=SUCCESS; MergeTableRegionsProcedure table=IntegrationTestBigLinkedList, regions=[845d286231eb01b71aeaa17b0e30058d, 4a46ab0918c99cada72d5336ad83a828], forcibly=false in 10.8610sec
> 2018-07-18 01:38:57,530 ERROR [PEWorker-8] procedure2.ProcedureExecutor(1478): CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList
> java.lang.NullPointerException
>         at org.apache.hadoop.hbase.master.assignment.RegionStates.checkReopened(RegionStates.java:651)
>         at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>         at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
>         at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>         at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>         at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>         at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>         at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>         at org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:102)
>         at org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:45)
>         at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:184)
>         at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1453)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1221)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> I think we need to refresh the region list of the table at the
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state. Regions that have been
> merged or split do not need to be checked, since we can be sure the regions
> resulting from the merge or split were opened after the table descriptor
> was changed.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
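
For illustration, the idea in the quoted description (skip persisted regions that no longer exist when confirming the reopen) might look roughly like the sketch below. This is not the attached patch and does not use the real HBase classes: `ReopenCheckSketch`, `regionsStillToReopen`, and the map of region name to "reopened" flag are hypothetical stand-ins chosen only to show the null check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch only. The names here are hypothetical and are NOT the
 * real HBase RegionStates / ReopenTableRegionsProcedure API.
 */
final class ReopenCheckSketch {

    /**
     * Returns the persisted regions that still need to be reopened.
     *
     * @param persistedRegions region names recorded when the procedure started
     * @param currentReopened  assumed view of the current region states:
     *                         region name -> whether it has been reopened;
     *                         a region removed by a merge/split has no entry
     */
    static List<String> regionsStillToReopen(List<String> persistedRegions,
                                             Map<String, Boolean> currentReopened) {
        List<String> pending = new ArrayList<>();
        for (String region : persistedRegions) {
            Boolean reopened = currentReopened.get(region);
            if (reopened == null) {
                // Region no longer exists (merged or split away): skip it
                // instead of dereferencing a missing state.
                continue;
            }
            if (!reopened) {
                pending.add(region);
            }
        }
        return pending;
    }
}
```

With a persisted list of `a`, `b`, and `merged-away`, where only `a` and `b` still have state and `b` has not reopened yet, the method returns just `b`; the vanished region is ignored rather than triggering an exception.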