I was just facing the many-zero-sized-regions issue last week and pondering how best to approach it. So ++ for this work!
> The trouble is we ship defaults for all of the `*min*` configs, and right
> now there's no way to "unset" them, disable the functionality.

Why is that the case? Can I not just set
hbase.normalizer.merge.min_region_size.mb to 0? Do I risk blowing away
regions from pre-splits or something?

> HBASE-23562 added a RegionsMerger tool to hbase-operators-tools

Nice. I didn't know about this. I see the tool wants you to specify a
desired number of regions. In my particular case I don't have a view on the
number of regions I want. I just know that all the post-compaction 0-sized
regions should go. Can I still make use of this tool?

Whitney

On Mon, Jun 29, 2020 at 2:42 AM Wellington Chevreuil
<wellington.chevre...@gmail.com> wrote:

> > The trouble is we ship defaults for all of the `*min*` configs, and
> > right now there's no way to "unset" them, disable the functionality.
> > Which means there still isn't a way to support the empty regions
> > use-case without awkward special-case checks.
>
> HBASE-23562 added a RegionsMerger tool to the hbase-operators-tools
> project, as a means to allow multiple merges without checking minimum
> size. Of course it's not as convenient as the normalizer, but at least it
> gives an alternative for such edge cases where users ended up with lots
> of empty regions.
>
> On Fri, Jun 26, 2020 at 22:30, Nick Dimiduk <ndimi...@apache.org> wrote:
>
> > Heya,
> >
> > I've seen a lot of use-cases where the normalizer would be a nice
> > solution for operators and application developers. I've been trying to
> > beef it up a bit to handle these cases. However, some of these
> > considerations are at odds, so I want to vet the ideas here.
> >
> > The normalizer is a background chore in the HMaster that attempts to
> > converge region sizes within a table toward the average region size. It
> > has a pretty wide error bar, but that's the overall goal.
> >
> > Early on, it was observed that an operator needs to pre-split a table,
> > so special considerations were included, by way of
> > `hbase.normalizer.min.region.count`,
> > `hbase.normalizer.merge.min_region_age.days`, and
> > `hbase.normalizer.merge.min_region_size.mb`. All these knobs are
> > designed to give an operator a means of controlling this behavior.
> >
> > We have (what I see as) a competing objective: doing away with empty, or
> > nearly-empty regions. The use-case is pretty common when there's a TTL
> > applied to a table, especially if there's also a timestamp component in
> > the rowkey. In this case, we want the normalizer to "merge away" these
> > empty regions.
> >
> > The trouble is we ship defaults for all of the `*min*` configs, and
> > right now there's no way to "unset" them, disable the functionality.
> > Which means there still isn't a way to support the empty regions
> > use-case without awkward special-case checks. This is where I'm looking
> > for suggestions from the community. There's some discussion under way
> > over on the PR for HBASE-24583. Please take a look.
> >
> > Thanks in advance,
> > Nick
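
P.S. To make the tension between the three `*min*` knobs and the empty-region case concrete, here is a toy model in Python. It is only a sketch under my own assumptions, not the normalizer's actual code: the `Region` class, the `merge_candidates` function, and the exact comparison rules (a region is skipped for merging if the table has too few regions, or if the region is younger or smaller than the configured minimum) are all hypothetical stand-ins for the behavior described in the thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    """Hypothetical stand-in for a region's normalizer-relevant stats."""
    name: str
    size_mb: int
    age_days: int


def merge_candidates(regions: List[Region],
                     min_region_count: int,
                     min_region_age_days: int,
                     min_region_size_mb: int) -> List[str]:
    """Toy model (assumed rules, not HBase code) of which regions pass the
    three `*min*` checks and so may be considered for merging."""
    # Assumed rule: tables with too few regions are left alone entirely,
    # which protects a freshly pre-split table.
    if len(regions) < min_region_count:
        return []
    return [
        r.name
        for r in regions
        # Assumed rule: young regions and regions below the size floor
        # are skipped, so an empty region never qualifies unless the
        # size floor is 0.
        if r.age_days >= min_region_age_days
        and r.size_mb >= min_region_size_mb
    ]


regions = [
    Region("empty-old-1", size_mb=0, age_days=10),
    Region("empty-old-2", size_mb=0, age_days=10),
    Region("full-old", size_mb=500, age_days=10),
    Region("empty-new", size_mb=0, age_days=1),
]

# With a non-zero size floor the old, empty regions are never candidates.
with_floor = merge_candidates(regions, 3, 3, 1)
# With the size floor at 0 they become candidates; the young one is still
# protected by the age check.
without_floor = merge_candidates(regions, 3, 3, 0)
print(with_floor, without_floor)
```

Under these assumed rules, dropping the size floor to 0 is what lets the old empty regions through, while the age and region-count checks keep protecting a recent pre-split. Whether the real normalizer treats a value of 0 this way is exactly the open question above.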