Do you have the Region Normalizer (
https://hbase.apache.org/book.html#normalizer) or something similar configured?

The balancer does not split or merge regions. AFAIK, splitting is governed by
the split policy set in `hbase.regionserver.region.split.policy`, and there is
no comparable policy for merges.
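
For the pre-splitting mentioned in the question below, the split points can be
computed ahead of time. This is only a minimal sketch, assuming the row keys
start with a fixed-width, lowercase hex prefix that is roughly uniformly
distributed. The thread does not describe the key design, so that assumption
and the helper name `hex_split_points` are hypothetical.

```python
def hex_split_points(num_regions, width=4):
    """Return num_regions - 1 evenly spaced, zero-padded hex split keys.

    Assumes (hypothetically) that row keys begin with a `width`-character
    lowercase hex prefix spread uniformly over the whole key space.
    """
    if num_regions < 2:
        # A single region needs no split points.
        return []
    keyspace = 16 ** width
    if num_regions > keyspace:
        raise ValueError("more regions requested than distinct key prefixes")
    # Split key i marks the start of region i (regions are numbered from 0).
    return [format(i * keyspace // num_regions, "0{}x".format(width))
            for i in range(1, num_regions)]


if __name__ == "__main__":
    print(hex_split_points(4))  # ['4000', '8000', 'c000']
```

The resulting keys could then be passed as the `SPLITS` list when creating the
table from the HBase shell. Whether the regions stay split afterwards still
depends on what is merging them in the first place.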

---
Mallikarjun


On Mon, Jul 12, 2021 at 2:48 PM Christian Pfarr <[email protected]>
wrote:

> Hello @all,
>
> I have a question about controlling the number of regions on small tables
> in HBase.
> But first I have to give you some background on our use case.
>
> We've built a lambda architecture with HDFS (batch), HBase (speed) and
> Drill as the serving layer, where we combine Parquet files from HDFS with
> HBase rows that are newer than the most recent row in HDFS.
> The HBase table is filled in real time via NiFi and is cleaned up after
> every batch (nightly), so that Drill can put most of the workload on HDFS.
> Unfortunately, the HBase table is very small, so it has only one region.
> Because of that, Drill cannot parallelize the query, which leads to long
> query times.
>
> If I pre-split the HBase table, everything is fine until the balancer
> comes along and merges the small regions. So after a few hours everything
> is slow again :-/
>
> So... my question now is: what's the best way to handle this
> parallelization issue?
> I thought about setting hbase.hregion.max.filesize to a very small
> number, for example the HDFS block size (128 MB), but I'm not sure whether
> this leads to new problems.
>
> What do you think? Is there a better way to handle this?
>
> Regards,
> z0ltrix
>
>
>
