Thanks, Wellington,

I have already built hbck1-tools for 2.1.0 using the method described in other 
topics. HBASE and the JDK are the same version on every cluster here, so if it 
worked for fixing HBASE on one cluster it should work for the other installs.

Fiddling with MasterProcWALs will require a complete shutdown of HBASE 
operations to prevent incoming reads/writes on other tables, and I am not sure 
how disruptive that will be other than "probably a lot".

-----Original Message-----
From: Wellington Chevreuil <wellington.chevre...@gmail.com> 
Sent: Tuesday, March 2, 2021 10:57 AM
To: Hbase-User <user@hbase.apache.org>
Subject: Re: HBASE WALs

Sorry, I missed your previous email. I was hoping you were on a stable 
release, so that you could benefit from hbck2 tool support.
Unfortunately, 2.1.0 is among the early releases that don't work with this tool 
(it requires at least 2.0.3, 2.1.1 or 2.2.0).
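
To make the version cutoff above concrete, here is a minimal sketch of a check against those three minimums. The function name and the handling of suffixes are illustrative assumptions, not part of any HBase tooling; in particular, vendor builds may carry backports that this simple comparison cannot see.

```python
# Minimum patch level per release line, taken from the minimums quoted above
# (2.0.3, 2.1.1, 2.2.0). Everything else here is an illustrative assumption.
MIN_HBCK2_PATCH = {(2, 0): 3, (2, 1): 1}


def supports_hbck2(version: str) -> bool:
    """Return True if `version` meets the minimums quoted above."""
    # Assumption: ignore suffixes such as "-SNAPSHOT" or vendor tags.
    core = version.split("-", 1)[0]
    parts = [int(p) for p in core.split(".")]
    parts += [0] * (3 - len(parts))  # treat "2.1" as "2.1.0"
    major, minor, patch = parts[:3]
    if major < 2:
        return False  # 1.x clusters use the original hbck
    if (major, minor) in MIN_HBCK2_PATCH:
        return patch >= MIN_HBCK2_PATCH[(major, minor)]
    return True  # 2.2.0 and later release lines
```

With this, `supports_hbck2("2.1.0")` is False while `supports_hbck2("2.1.1")` is True, which matches the situation described in this thread.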

> Multiple locks exist for DISABLE/ENABLE/UNASSIGN but the system seems
> mostly unhappy with one region in particular, and is reporting on that.
>
Are the other regions for the table properly closed, and this is the only one 
stuck? If you do a list_procedures, are you able to identify an 'unassign' 
procedure still running for this table? Or if you grep the master logs for this 
region, do you see any messages suggesting there are still ongoing attempts to 
bring the region offline? If there's apparently no procedure and no ongoing 
attempt to offline the region, you might try to manually update its state in 
the meta table, then flip masters (assuming you have master HA), so that the 
new active master loads an up-to-date state from the meta table.
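
As a rough illustration of the log check suggested above, the following sketch filters master log lines that mention a given region together with a keyword hinting at an offline attempt. The keyword list and the function name are assumptions chosen for illustration; real master log messages may use different wording, so adjust the keywords to what you actually see.

```python
from typing import Iterable, List, Tuple

# Assumed keywords; they are guesses at useful search terms, not a list of
# actual HBase log messages.
DEFAULT_KEYWORDS: Tuple[str, ...] = ("unassign", "clos", "offline")


def region_lines(lines: Iterable[str], encoded_region: str,
                 keywords: Tuple[str, ...] = DEFAULT_KEYWORDS) -> List[str]:
    """Keep lines that mention the region and at least one keyword."""
    region = encoded_region.lower()
    hits = []
    for line in lines:
        lowered = line.lower()  # case-insensitive matching
        if region in lowered and any(k in lowered for k in keywords):
            hits.append(line.rstrip("\n"))
    return hits


# Example use (the path and region name are placeholders):
#   with open("/path/to/master.log", errors="replace") as fh:
#       for hit in region_lines(fh, "<encoded-region-name>"):
#           print(hit)
```

An empty result would be consistent with no ongoing attempts to offline the region, although it does not prove it; list_procedures remains the more direct check.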

Otherwise, if there's still a rogue procedure trying to offline the region, 
unfortunately, due to the lack of hbck2 support, you would most likely need a 
more disruptive intervention similar to what you described in your first 
email. Instead of the normal WALs folder, though, MasterProcWALs is what you 
would really need to clean out here, as that is where procedure state is 
persisted, and you wouldn't want the rogue procedure to be resumed.

On Mon, Mar 1, 2021 at 10:22, Marc Hoppins <marc.hopp...@eset.sk>
wrote:

> If you know of anything that will help I would appreciate it.
>
> If you need any log output let me know.
>
> Thanks
>
>
> -----Original Message-----
> From: Wellington Chevreuil <wellington.chevre...@gmail.com>
> Sent: Thursday, February 25, 2021 4:08 PM
> To: Hbase-User <user@hbase.apache.org>
> Subject: Re: HBASE WALs
>
> >
> > Do WAL files contain information for multiple regions per WAL or is 
> > one WAL associated with one region?
> >
> Edits for multiple regions are present in a single WAL file. That's 
> why, upon an RS crash and WAL processing, there's a WAL split phase.
>
> > I am trying to find a way to clear a RIT for a disabled table. A 
> > similar problem (but on a test cluster) involved me clearing znode info, 
> > deleting HDFS data for the table and deleting WALs/MasterProcWAL 
> > files, finally restarting HBASE service.
> >
> Which hbase version are you on?
>
> On Thu, Feb 25, 2021 at 11:51, Marc Hoppins 
> <marc.hopp...@eset.sk>
> wrote:
>
> > Hi all,
> >
> > Do WAL files contain information for multiple regions per WAL or is 
> > one WAL associated with one region?
> >
> > I am trying to find a way to clear a RIT for a disabled table. A 
> > similar problem (but on a test cluster) involved me clearing znode 
> > info, deleting HDFS data for the table and deleting 
> > WALs/MasterProcWAL files, finally restarting HBASE service.
> >
> > Table cannot be enabled.
> >
> > Multiple locks exist for DISABLE/ENABLE/UNASSIGN but the system 
> > seems mostly unhappy with one region in particular, and is reporting on 
> > that.
> >
> > There are many tables that are very active so I don't think it is 
> > possible to stop the entire service without a lot of forewarning to
> > users.
> >
> > Thanks in advance.
> >
>
