On Thu, Dec 6, 2018 at 7:45 AM Sean Busbey <[email protected]> wrote:

> This week I've run into two cases where I needed the set of regions in
> transition so I could recover them, and I ran into what I think is a
> gap in our operator tooling. I'm hoping folks will have some ideas
> I've missed.
>
> Depending on how this thread goes, I'll follow up on the
> dev@hbase list about implementing changes and documentation.
>
>
> ....


>
> Case 2: HBase 2.1-ish RIT following cluster wide crash
>
> AFAICT the cluster had experienced a failure of all RS and masters. Upon
> coming back up, the Master was left with ~10% of ~10K regions in a state
> of PENDING_OPEN or OPENING, all assigned to a RS that had no idea it was
> involved with those regions. I'm pretty sure this is a bug; I'm still
> triaging it and I don't think it's relevant to the current question.
>
>
Yeah. This sounds like an interesting case.



> Once I confirmed the given RS was not currently doing anything for any
> of those regions, I figured I'd use HBCK2 to run assigns to get
> things fixed. However, since there were roughly 900 RITs, the Master UI
> was unusable for getting a complete list.



How unusable, Sean? Was it up?


> Also with that many all in
> the same state I want to be able to automate running against each of
> them.
>
> I ended up grepping the master log file and pulling out the WARN
> messages about RIT to tease out the list of regions, then passed those
> to hbck2.
>
>
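For anyone who needs to repeat the log-scraping workaround described above, here is a minimal sketch in Python. It is not an HBase tool, and several things in it are assumptions: the `Region-In-Transition` marker string, the default jar name, and the `hbase hbck -j` launcher prefix should all be checked against your own master log and the hbck2 README. It relies on encoded region names being 32-character hex strings, and on the hbck2 `assigns` command taking encoded region names as arguments.

```python
import re

# Encoded region names are 32-character lowercase hex strings (MD5 digests).
ENCODED_REGION = re.compile(r"\b[0-9a-f]{32}\b")


def rit_regions(log_lines, marker="Region-In-Transition"):
    """Collect encoded region names from WARN lines that mention `marker`.

    `marker` is an assumed substring; adjust it to match the RIT warnings
    that actually appear in your master log. Duplicates are dropped and
    first-seen order is preserved.
    """
    seen = {}
    for line in log_lines:
        if "WARN" not in line or marker not in line:
            continue
        for name in ENCODED_REGION.findall(line):
            seen.setdefault(name, None)
    return list(seen)


def assigns_commands(regions, batch_size=50, prefix="hbase hbck -j hbase-hbck2.jar"):
    """Build one `assigns` command line per batch of encoded region names.

    `prefix` is an assumed launcher invocation; replace it with whatever
    the hbck2 README for your version tells you to run.
    """
    return [
        "{} assigns {}".format(prefix, " ".join(regions[i:i + batch_size]))
        for i in range(0, len(regions), batch_size)
    ]
```

Feeding it the lines of the master log and printing the result of `assigns_commands(rit_regions(lines))` gives a reviewable list of commands rather than running anything directly, which seems safer with ~900 regions in play.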

Yeah. Have you seen the hbck2 doc?
https://github.com/apache/hbase-operator-tools/tree/master/hbase-hbck2

Did you have this commit:

commit fa6373660f622e7520a9f2639485cc386f18ede0
Author: jingyuntian <[email protected]>
Date:   Thu Nov 8 15:30:30 2018 +0800

    HBASE-21410 A helper page that help find all problematic regions and procedures

It dumps the problematic regions and procedures on the UI, which can save
digging through logs.

Thanks,
S





> ----
>
> Am I missing some obvious place where I can use a CLI tool to get a
> list of RIT? I don't see anything in the ref guide. I looked through
> the help of HBCK 1 and the shell and couldn't find anything.
>
> I think I can use Admin.getClusterStatus() and getClusterMetrics() to
> get this info from the Java API. That means there's some way to get it
> in the hbase shell, but it'll probably be ugly. If there's not already
> an easier way I'll want to wrap that so it's a simple command.
>
