[ https://issues.apache.org/jira/browse/HBASE-19287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yi Liang updated HBASE-19287:
-----------------------------
    Status: Patch Available  (was: Open)

> master hangs forever if RecoverMeta send assign meta region request to target server fail
> -----------------------------------------------------------------------------------------
>
>                 Key: HBASE-19287
>                 URL: https://issues.apache.org/jira/browse/HBASE-19287
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yi Liang
>            Assignee: Yi Liang
>
> 2017-11-10 19:26:56,019 INFO [ProcExecWrkr-1] procedure.RecoverMetaProcedure: pid=138, state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure failedMetaServer=null, splitWal=true; Retaining meta assignment to server=hadoop-slave1.hadoop,16020,1510341981454
> 2017-11-10 19:26:56,029 INFO [ProcExecWrkr-1] procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454}]
> 2017-11-10 19:26:56,067 INFO [ProcExecWrkr-2] procedure.MasterProcedureScheduler: pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454 hbase:meta hbase:meta,,1.1588230740
> 2017-11-10 19:26:56,071 INFO [ProcExecWrkr-2] assignment.AssignProcedure: Start pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454; rit=OFFLINE, location=hadoop-slave1.hadoop,16020,1510341981454; forceNewPlan=false, retain=false
> 2017-11-10 19:26:56,224 INFO [ProcExecWrkr-4] zookeeper.MetaTableLocator: Setting hbase:meta (replicaId=0) location in ZooKeeper as hadoop-slave2.hadoop,16020,1510341988652
> 2017-11-10 19:26:56,230 INFO [ProcExecWrkr-4] assignment.RegionTransitionProcedure: Dispatch pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454; rit=OPENING, location=hadoop-slave2.hadoop,16020,1510341988652
> 2017-11-10 19:26:56,382 INFO [ProcedureDispatcherTimeoutThread] procedure.RSProcedureDispatcher: Using procedure batch rpc execution for serverName=hadoop-slave2.hadoop,16020,1510341988652 version=2097152
> 2017-11-10 19:26:57,542 INFO [main-EventThread] zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [hadoop-slave2.hadoop,16020,1510341988652]
> 2017-11-10 19:26:57,543 INFO [main-EventThread] master.ServerManager: Master doesn't enable ServerShutdownHandler during initialization, delay expiring server hadoop-slave2.hadoop,16020,1510341988652
> 2017-11-10 19:26:58,875 INFO [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Registering server=hadoop-slave1.hadoop,16020,1510342016106
> 2017-11-10 19:27:05,832 INFO [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Registering server=hadoop-slave2.hadoop,16020,1510342023184
> 2017-11-10 19:27:05,832 INFO [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Triggering server recovery; existingServer hadoop-slave2.hadoop,16020,1510341988652 looks stale, new server:hadoop-slave2.hadoop,16020,1510342023184
> 2017-11-10 19:27:05,832 INFO [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Master doesn't enable ServerShutdownHandler during initialization, delay expiring server hadoop-slave2.hadoop,16020,1510341988652
> 2017-11-10 19:27:49,815 INFO [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] client.RpcRetryingCallerImpl: tarted=38594 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: hbase:meta,,1 is not online on hadoop-slave2.hadoop,16020,1510342023184
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3290)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1370)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2401)
>     at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41544)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:278)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:258)
> row 'hbase:namespace' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-slave2.hadoop,16020,1510341988652, seqNum=0

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)