Thanks, Wellington. I have already built hbck1-tools for 2.1.0 using the method described in other topics. All the HBASE and JDK installs here are the same version, so if it worked for fixing HBASE on one cluster then it should work for the other installs.
Fiddling with masterprocWALs will require a complete shutdown of HBASE operations to prevent incoming reads/writes on other tables, and I am not sure how disruptive that will be other than "probably a lot".

-----Original Message-----
From: Wellington Chevreuil <wellington.chevre...@gmail.com>
Sent: Tuesday, March 2, 2021 10:57 AM
To: Hbase-User <user@hbase.apache.org>
Subject: Re: HBASE WALs

EXTERNAL

Sorry, missed your previous email. I was hoping you were not on a non-stable version, so that you would benefit from hbck2 tool support. Unfortunately, 2.1.0 is among the early releases that don't work with this tool (it requires at least 2.0.3, 2.1.1 or 2.2.0).

> Multiple locks exist for DISABLE/ENABLE/UNASSIGN but the system seems
> mostly unhappy with one region in particular, and is reporting on that.
>
Are the other regions for the table properly closed, and is this the only one stuck? If you do a list_procedures, are you able to identify an 'unassign' procedure still running for this table? Or if you grep master logs for this region, do you see any messages suggesting there are still ongoing attempts to bring the region offline?

If there's apparently no procedure and no ongoing attempts to offline the region, you might try to manually update its state in the meta table, then flip masters (assuming you have master HA), so that the new active master loads an up-to-date state from the meta table.

Otherwise, if there's still a rogue procedure trying to offline the region, unfortunately, due to the lack of hbck support, you would most likely need a more disruptive intervention similar to what you described in your first email. Instead of the normal WAL folder, though, the master proc WALs are what you would really need to clean out here, as that is where procedure state is persisted, and you wouldn't want the rogue procedure to be resumed.

On Mon, Mar 1, 2021 at 10:22, Marc Hoppins <marc.hopp...@eset.sk> wrote:

> If you know of anything that will help I would appreciate it.
>
> If you need any log output let me know.
>
> Thanks
>
> -----Original Message-----
> From: Wellington Chevreuil <wellington.chevre...@gmail.com>
> Sent: Thursday, February 25, 2021 4:08 PM
> To: Hbase-User <user@hbase.apache.org>
> Subject: Re: HBASE WALs
>
> EXTERNAL
>
> > Do WAL files contain information for multiple regions per WAL or is
> > one WAL associated with one region?
>
> Multiple regions edits would be present in a single wal file. That's
> why upon a RS crash and wal processing, there's a wal split phase.
>
> > I am trying to find a way to clear a RIT for a disabled table. A similar
> > problem (but on a test cluster) involved me clearing znode info,
> > deleting HDFS data for the table and deleting WALs/MasterProcWAL
> > files, finally restarting HBASE service.
>
> Which hbase version are you on?
>
> On Thu, Feb 25, 2021 at 11:51, Marc Hoppins <marc.hopp...@eset.sk> wrote:
>
> > Hi all,
> >
> > Do WAL files contain information for multiple regions per WAL or is
> > one WAL associated with one region?
> >
> > I am trying to find a way to clear a RIT for a disabled table. A
> > similar problem (but on a test cluster) involved me clearing znode
> > info, deleting HDFS data for the table and deleting
> > WALs/MasterProcWAL files, finally restarting HBASE service.
> >
> > Table cannot be enabled.
> >
> > Multiple locks exist for DISABLE/ENABLE/UNASSIGN but the system
> > seems mostly unhappy with one region in particular, and is reporting
> > on that.
> >
> > There are many tables that are very active so I don't think it is
> > possible to stop the entire service without a lot of forewarning to
> > users.
> >
> > Thanks in advance.
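
As a rough illustration of the point made earlier in the thread (edits for multiple regions share one WAL file, hence the split phase after a RegionServer crash), here is a toy sketch in Python. It is not HBase code and does not read real WAL files; the entry layout, the region names and the function `split_wal_by_region` are all hypothetical, chosen only to show the idea of regrouping interleaved edits per region.

```python
from collections import defaultdict

# Hypothetical, heavily simplified WAL entries: (sequence_id, region, edit).
# Real HBase WAL entries carry far more information; this is only a toy model.
wal_entries = [
    (1, "region-a", "put row1"),
    (2, "region-b", "put row9"),
    (3, "region-a", "delete row1"),
    (4, "region-c", "put row5"),
]


def split_wal_by_region(entries):
    """Group interleaved entries into per-region lists, ordered by sequence id."""
    per_region = defaultdict(list)
    for seq_id, region, edit in sorted(entries):
        per_region[region].append((seq_id, edit))
    return dict(per_region)


if __name__ == "__main__":
    # Print each region with the edits that would be replayed for it.
    for region, edits in split_wal_by_region(wal_entries).items():
        print(region, edits)
```

In this model a single log holds edits for three regions, and the split step produces one ordered list per region, which is roughly why one WAL cannot be treated as belonging to one region.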