Hi Hossein, which HBase version are you seeing this issue on?
Removing "/hbase/MasterProcWALs" would probably help clear the
reported error, but it risks creating other inconsistencies, depending
on which procedures are still running. Does the list_procedures command
show any "running" procedures, or does it only list finished ones?
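
With 492 rows it may be easier to check this with a small script than by
eye. Below is a rough sketch, assuming each entry starts with
"<pid> <ProcedureClass> (<details>) <STATE>" as in the sample quoted
below. The function name and the RUNNABLE row in the demo are only
illustrative, and the regex may need adjusting for your version's output.

```python
import re
from collections import Counter

# Assumed entry shape, based only on the sample quoted in this thread:
#   <pid> <ProcedureClass> (<details>) <STATE> <timestamps...>
PROC_LINE = re.compile(r"^\s*(\d+)\s+(\w+)\s+\((.*?)\)\s+([A-Z_]+)\b")


def summarize_procedures(text):
    """Count procedures per state and collect those not marked FINISHED."""
    counts = Counter()
    unfinished = []
    for line in text.splitlines():
        match = PROC_LINE.match(line)
        if not match:
            continue  # headers, blank lines, wrapped timestamp tails
        pid, name, details, state = match.groups()
        counts[state] += 1
        if state != "FINISHED":
            unfinished.append((int(pid), name, details, state))
    return counts, unfinished


if __name__ == "__main__":
    # First row is from the thread; the second is a made-up example.
    demo = (
        "1530 DisableTableProcedure (table=t2151) FINISHED Fri Dec 07 11:32:45 +0330\n"
        "2018 Sun Dec 09 20:07:59 +0330 2018\n"
        "9999 DisableTableProcedure (table=example) RUNNABLE Fri Dec 07 11:32:45 +0330 2018\n"
    )
    counts, unfinished = summarize_procedures(demo)
    print(dict(counts))
    for entry in unfinished:
        print(entry)
```

Save the list_procedures output to a file and pass its contents to
summarize_procedures(); anything in the second return value is a
procedure that had not finished when the listing was taken.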
On Mon, Dec 10, 2018 at 02:39, Sakthi Vel
<sakthivel.azh...@gmail.com> wrote:
>
> Hi Hossein,
>
> Aborting procedures can be dangerous (especially if the procedure is not
> rolled back). AFAIK, you can use the hbck2 tool (apache/hbase-operator-tools)
> to abort a procedure with the 'bypass' option. I would like to quote the
> official hbck2 doc here:
>
>  bypass [OPTIONS] <PID>...
>    Options:
>     -o,--override   override if procedure is running/stuck
>     -r,--recursive  bypass parent and its children. SLOW! EXPENSIVE!
>     -w,--lockWait   milliseconds to wait on lock before giving up;
>                     default=1
>    Pass one (or more) procedure 'pid's to skip to procedure finish.
>    Parent of bypassed procedure will also be skipped to the finish.
>    Entities will be left in an inconsistent state and will require
>    manual fixup. May need Master restart to clear locks still held.
>    Bypass fails if procedure has children. Add 'recursive' if all
>    you have is a parent pid to finish parent and children. This
>    is SLOW, and dangerous so use selectively. Does not always work.
>
> +Other members, please correct me if I am wrong.
>
> Sakthi
>
> On Sun, Dec 9, 2018 at 6:18 PM Hossein Zolfi <hossein.zo...@gmail.com>
> wrote:
>
> > Hi,
> > I ran the HBase performance tools, and thousands of tables were created.
> > Our cluster is now in an inconsistent state (we don't know the cause yet,
> > but we are trying to find it). At first I tried to disable/drop the created
> > tables (1700 tables), but nothing happened. list_procedures shows 492 rows,
> > and it is not possible to abort any of them. Then I restarted the HMaster
> > service, but now I get an endless stream of the following exception:
> >
> > 2018-12-09 20:01:30,194 WARN  [MASTER_SERVER_OPERATIONS-master-4:16000-0]
> > master.AssignmentManager: Failed assignment of
> > t53889,00000000000000000007603345,1542715604227.4cc63591941dbe92866388fbde075cac.
> > to data-22-54,16020,1543392184445, waiting a little before
> > trying on the same region server try=1 of 10
> > org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException:
> > org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException:
> > Received OPEN for the region:t53889,00000000000000000007603345,1542715604227.4cc63591941dbe92866388fbde075cac. ,
> > which we are already trying to CLOSE
> >         at org.apache.hadoop.hbase.regionserver.RSRpcServices.openRegion(RSRpcServices.java:1604)
> >         at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22239)
> >         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2196)
> >         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> >         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> >         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> >         at java.lang.Thread.run(Thread.java:748)
> >
> >         at sun.reflect.GeneratedConstructorAccessor10.newInstance(Unknown Source)
> >         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> >         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> >         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
> >         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
> >         at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:330)
> >         at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:772)
> >         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2164)
> >         at org.apache.hadoop.hbase.master.AssignmentManager$2.process(AssignmentManager.java:860)
> >         at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> >         at java.lang.Thread.run(Thread.java:748)
> >
> > How can we stop these log messages?
> >
> > Output of `list_procedures` contains something like this:
> >
> > 1530 DisableTableProcedure (table=t2151) FINISHED Fri Dec 07 11:32:45 +0330 2018 Sun Dec 09 20:07:59 +0330 2018
> > 1532 DisableTableProcedure (table=t21514) FINISHED Fri Dec 07 11:42:53 +0330 2018 Sun Dec 09 20:07:27 +0330 2018
> > 1534 DisableTableProcedure (table=t21518) FINISHED Fri Dec 07 11:53:02 +0330 2018 Sun Dec 09 20:07:57 +0330 2018
> > 1535 DeleteTableProcedure (table=t13946) FINISHED Fri Dec 07 12:02:59 +0330 2018 Sun Dec 09 20:07:27 +0330 2018
> >
> > I don't know whether removing /hbase/MasterProcWALs from HDFS would
> > cause problems or not.
> >
> > Any help will be appreciated.
> >
> > With best regards.
> >
