Dear community, Really appreciate it if someone could provide some guides on how to debug this.
We are using HBase 1.1.9 ( + rsgroup patch) on AWS EC2 instances, and I found it is quite slow to drop a table with a lot of regions (300+). The Master's log shows: 2018-04-27 12:39:58,483 INFO [B.defaultRpcServer.handler=11,queue=2,port=16000] master.HMaster: Client=ec2-user/null delete mytable 2018-04-27 12:39:58,737 INFO [ProcedureExecutorThread-0] hbase.MetaTableAccessor: Deleted [{ENCODED => 0b3a3c3e991b38dd0921905f7200f094, NAME => 'mytable,,1524805200089.0b3a3c3e991b38dd0921905f7200f094.', STARTKEY => '', ENDKEY => '000001-0000000000000000000-######'}, {ENCODED => 5fba610c04629121e53fec53af869fe3, NAME => 'mytable,000001-0000000000000000000-######,1524805200089.5fba610c04629121e53fec53af869fe3.', STARTKEY => '000001-0000000000000000000-######', ENDKEY => '000002-0000000000000000000-######'}, {ENCODED => 653dbf4a1aba1fedb56fea9522d7c96a, NAME => 'mytable,000002-0000000000000000000-######,1524805200089.653dbf4a1aba1fedb56fea9522d7c96a.', STARTKEY => '000002-0000000000000000000-######', ENDKEY => '000003-0000000000000000000-######'}, {ENCODED => 7ba6bce2e7cc53a9bcc234244c835ac8, NAME => 'mytable,000003-0000000000000000000-######,1524805200089.7ba6bce2e7cc53a9bcc234244c835ac8.', STARTKEY => '000003-0000000000000000000-######', ENDKEY => '000004-0000000000000000000-######'}, {ENCODED => 5106e0ca76d2c026d2d8ff5ce59316cc,... 2018-04-27 12:40:58,687 INFO [B.defaultRpcServer.handler=20,queue=2,port=16000] master.HMaster: Client=ec2-user/null delete mytable 2018-04-27 12:42:57,592 WARN [B.defaultRpcServer.handler=11,queue=2,port=16000] ipc.RpcServer: (responseTooSlow): {"call":"DeleteTable(org.apache.hadoop.hbase.protobuf.generated.MasterProtos$DeleteTableRequest)","starttimems":1524832798483,"responsesize":4,"method":"DeleteTable","param":"TODO: class org.apache.hadoop.hbase.protobuf.generated.MasterProtos$DeleteTableRequest","processingtimems":179108,"client":" 10.22.22.236:45640","queuetimems":0,"class":"HMaster"} 2018-04-27 12:42:57,924 INFO [ProcedureExecutorThread-3] procedure2.ProcedureExecutor: Rolledback procedure DeleteTableProcedure (table=mytable) id=16515 owner=ec2-user state=ROLLEDBACK exec-time=1mins, 59.108sec exception=org.apache.hadoop.hbase.TableNotFoundException: mytable The stack trace is like Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.TableNotFoundException): mytable at org.apache.hadoop.hbase.master.HMaster.checkTableModifiable(HMaster.java:2216) at org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure.prepareDelete(DeleteTableProcedure.java:255) at org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure.executeFromState(DeleteTableProcedure.java:101) ... It is quite weird for me that: 1. Why is the following message logged twice (with different handlers: 20 and 11) ? Does it means MasterProcedureUtil.NonceProcedureRunnable#run() is called twice and handled by different handlers? INFO [B.defaultRpcServer.handler=20(or 11),queue=2,port=16000] master.HMaster: Client=ec2-user/null delete mytable" 2. The rollback is triggered by the warning of "responseTooSlow" ? 3. I set hbase.ipc.warn.response.time to take the default value (10,000 ms), why the responseTooSlow warning is reported with "processingtimems":179108 ? 4. How could I debug further? -- 李响 Xiang Li 邮件 e-mail :wate...@gmail.com