Hi Hossein,

Aborting procedures can be dangerous (especially if the procedure is not
rolled back). AFAIK, you can use the hbck2 tool (apache/hbase-operator-tools)
to abort a procedure with its 'bypass' command. I would like to quote the
official hbck2 doc here:

 bypass [OPTIONS] <PID>...
   Options:
    -o,--override   override if procedure is running/stuck
    -r,--recursive  bypass parent and its children. SLOW! EXPENSIVE!
    -w,--lockWait   milliseconds to wait on lock before giving up; default=1
   Pass one (or more) procedure 'pid's to skip to procedure finish.
   Parent of bypassed procedure will also be skipped to the finish.
   Entities will be left in an inconsistent state and will require
   manual fixup. May need Master restart to clear locks still held.
   Bypass fails if procedure has children. Add 'recursive' if all
   you have is a parent pid to finish parent and children. This
   is SLOW, and dangerous so use selectively. Does not always work.
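
With 492 procedures listed, picking the pids by hand is tedious, so here is a rough sketch (in Python, untested against a real cluster) that pulls pids out of list_procedures output and builds a bypass command line. It assumes the row layout shown in the original mail, an HBase 2.x master (which, as far as I know, is what hbck2 targets), and a jar named hbase-hbck2.jar in the current directory. The jar name, the sample pid 9001 and the table t_example are made up for illustration. The script only prints the command, it does not run anything.

```python
import re
import shlex

# Start of a list_procedures row as shown in this thread, e.g.
# "1530 DisableTableProcedure (table=t2151) FINISHED Fri Dec 07 ..."
ROW = re.compile(
    r"^(?P<pid>\d+)\s+(?P<proc>\w+)\s+\(table=(?P<table>[^)]+)\)\s+(?P<state>[A-Z_]+)\b"
)


def parse_procedures(text):
    """Return (pid, procedure, table, state) tuples found in the shell output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:  # wrapped continuation lines and blanks do not match and are skipped
            rows.append((int(m["pid"]), m["proc"], m["table"], m["state"]))
    return rows


def bypass_command(pids, jar="hbase-hbck2.jar", override=False):
    """Build (but do not run) an hbck2 bypass invocation; the jar path is an assumption."""
    argv = ["hbase", "hbck", "-j", jar, "bypass"]
    if override:
        argv.append("-o")  # -o,--override from the usage text quoted above
    argv += [str(pid) for pid in pids]
    return argv


# Two rows copied from the thread plus one hypothetical unfinished row.
sample = """\
1530 DisableTableProcedure (table=t2151) FINISHED Fri Dec 07 11:32:45 +0330 2018 Sun Dec 09 20:07:59 +0330 2018
1535 DeleteTableProcedure (table=t13946) FINISHED Fri Dec 07 12:02:59 +0330 2018 Sun Dec 09 20:07:27 +0330 2018
9001 DisableTableProcedure (table=t_example) RUNNABLE Fri Dec 07 12:10:00 +0330 2018 Sun Dec 09 20:07:27 +0330 2018
"""

# Only procedures that are not already FINISHED are candidates for bypass.
stuck = [pid for pid, _, _, state in parse_procedures(sample) if state != "FINISHED"]
print(shlex.join(bypass_command(stuck, override=True)))
```

Please review the printed command before running it, and keep in mind the warnings quoted above: bypassed entities are left in an inconsistent state and need manual fixup.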

+Other members, please correct me if I am wrong.

Sakthi

On Sun, Dec 9, 2018 at 6:18 PM Hossein Zolfi <hossein.zo...@gmail.com>
wrote:

> Hi,
> I ran the HBase performance tools, and thousands of tables were created.
> Our cluster is now in an inconsistent state (we don't know the cause yet,
> but we are trying to find it). At first I tried to disable/drop the created
> tables (1700 tables), but nothing happened. list_procedures shows 492 rows,
> and it is not possible to abort any of them. Then I restarted the HMaster
> service, but now I get an endless stream of the following exception:
>
> 2018-12-09 20:01:30,194 WARN  [MASTER_SERVER_OPERATIONS-master-4:16000-0] master.AssignmentManager: Failed assignment of t53889,00000000000000000007603345,1542715604227.4cc63591941dbe92866388fbde075cac. to data-22-54,16020,1543392184445, waiting a little before trying on the same region server try=1 of 10
> org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: Received OPEN for the region:t53889,00000000000000000007603345,1542715604227.4cc63591941dbe92866388fbde075cac. , which we are already trying to CLOSE
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.openRegion(RSRpcServices.java:1604)
>         at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22239)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2196)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
>         at java.lang.Thread.run(Thread.java:748)
>
>         at sun.reflect.GeneratedConstructorAccessor10.newInstance(Unknown Source)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
>         at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:330)
>         at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:772)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2164)
>         at org.apache.hadoop.hbase.master.AssignmentManager$2.process(AssignmentManager.java:860)
>         at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
>
> How can we stop these log messages?
>
> Output of `list_procedures` contains something like this:
>
> 1530 DisableTableProcedure (table=t2151) FINISHED Fri Dec 07 11:32:45 +0330 2018 Sun Dec 09 20:07:59 +0330 2018
>
> 1532 DisableTableProcedure (table=t21514) FINISHED Fri Dec 07 11:42:53 +0330 2018 Sun Dec 09 20:07:27 +0330 2018
>
> 1534 DisableTableProcedure (table=t21518) FINISHED Fri Dec 07 11:53:02 +0330 2018 Sun Dec 09 20:07:57 +0330 2018
>
> 1535 DeleteTableProcedure (table=t13946) FINISHED Fri Dec 07 12:02:59 +0330 2018 Sun Dec 09 20:07:27 +0330 2018
>
>
> I don't know whether removing /hbase/MasterProcWALs from HDFS would cause
> problems or not.
>
> Any help will be appreciated.
>
> With best regards.
>
