Ah, OK... I thought this was done by the balancer. The normalizer is enabled (checked via hbase shell), but with no special configuration beyond what is in hbase-default.xml.
We run HBase 1.5.0 atm...

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
Mallikarjun <[email protected]> wrote on Monday, 12 July 2021 at 13:16:

> Do you have any configuration for Region Normalizer
> (https://hbase.apache.org/book.html#normalizer) or something?
>
> Balancer does not split or merge regions. AFAIK, the split policy controlled by
> `hbase.regionserver.region.split.policy` does the splitting and there is
> nothing similar for merges.
>
> --
> Mallikarjun
>
> On Mon, Jul 12, 2021 at 2:48 PM Christian Pfarr [email protected] wrote:
>
> > Hello @all,
> >
> > I have a question regarding controlling the number of regions on small
> > tables in HBase.
> >
> > But first I have to give you some hints about our use case.
> >
> > We've built a lambda architecture with HDFS (Batch), HBase (Speed) and
> > Drill as Serving Layer, where we are combining Parquet files from HDFS
> > with HBase rows that are newer than the most recent row in HDFS.
> >
> > The HBase table is filled in real time via NiFi, while it is cleaned up
> > every batch (nightly) so that Drill can put most of the workload on HDFS.
> >
> > Unfortunately the HBase table is very small, and because of this we have
> > only one region, and because of that Drill cannot parallelize the query,
> > which leads to long query times.
> >
> > If I pre-split the HBase table everything is fine, until the balancer
> > comes and merges the small regions. So after a few hours everything is
> > slow again :-/
> >
> > So... my question is now, what's the best way to handle this
> > parallelization issue?
> >
> > I thought about setting hbase.hregion.max.filesize to a very small
> > number, for example HDFS block size = 128 MB, but I'm not sure if this
> > leads to new problems.
> >
> > What do you think? Is there a better way to handle this?
> >
> > Regards,
> >
> > z0ltrix
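For reference, the pre-split mentioned in the quoted message could be sketched roughly as below. This is only an illustration under assumptions that are not stated in the thread: that row keys are spread more or less uniformly over their leading byte(s), and that the resulting split points are then handed to whatever creates the table (for example the `SPLITS` option of `create` in hbase shell). The function name `split_keys` and its parameters are hypothetical.

```python
def split_keys(num_regions: int, key_bytes: int = 1) -> list[bytes]:
    """Return num_regions - 1 evenly spaced split points.

    Assumption (not from the thread): row keys are uniformly distributed
    over a prefix of `key_bytes` bytes, so equal-width ranges give
    roughly equal-sized regions.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    space = 256 ** key_bytes  # number of distinct prefixes
    if num_regions > space:
        raise ValueError("more regions requested than distinct prefixes")
    # Split point i marks the start of region i (region 0 starts at the empty key).
    return [
        (i * space // num_regions).to_bytes(key_bytes, "big")
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # Print the split points for four regions as hex, one per line.
    for key in split_keys(4):
        print(key.hex())
```

Pre-splitting this way only helps as long as nothing merges the regions again afterwards, which is the actual question in this thread.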
