Any hints on that?

Regards,
z0ltrix

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

Christian Pfarr <[email protected]> wrote on Monday, 12 July 2021 at 13:45:

> Ah, OK... I thought this was done by the balancer.
>
> The normalizer is enabled (checked via the hbase shell), but with no special
> configuration beyond what is in hbase-default.xml.
>
> We run HBase 1.5.0 at the moment.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>
> Mallikarjun [email protected] wrote on Monday, 12 July 2021 at 13:16:
>
> > Do you have any configuration for the Region Normalizer
> > (https://hbase.apache.org/book.html#normalizer) or something?
> >
> > The balancer does not split or merge regions. AFAIK, the split policy
> > controlled by `hbase.regionserver.region.split.policy` does the splitting,
> > and there is nothing similar for merges.
> >
> > Mallikarjun
> >
> > On Mon, Jul 12, 2021 at 2:48 PM Christian Pfarr [email protected] wrote:
> >
> > > Hello @all,
> > >
> > > I have a question about controlling the number of regions on small
> > > tables in HBase.
> > >
> > > But first I have to give you some background on our use case.
> > >
> > > We've built a lambda architecture with HDFS (batch), HBase (speed) and
> > > Drill as the serving layer, where we combine Parquet files from HDFS
> > > with HBase rows that are newer than the most recent row in HDFS.
> > >
> > > The HBase table is filled in real time via NiFi, and it is cleaned up
> > > after every (nightly) batch so that Drill can put most of the workload
> > > on HDFS.
> > >
> > > Unfortunately the HBase table is very small, so we have only one
> > > region, and because of that Drill cannot parallelize the query, which
> > > leads to long query times.
> > >
> > > If I pre-split the HBase table everything is fine, until the balancer
> > > comes and merges the small regions. So after a few hours everything is
> > > slow again :-/
> > >
> > > So my question is: what is the best way to handle this parallelization
> > > issue?
> > >
> > > I thought about setting hbase.hregion.max.filesize to a very small
> > > number, for example the HDFS block size (128 MB), but I'm not sure
> > > whether this leads to new problems.
> > >
> > > What do you think? Is there a better way to handle this?
> > >
> > > Regards,
> > >
> > > z0ltrix
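
For reference, the pre-splitting mentioned in the original question needs a set of split points. The following is only a rough sketch of how evenly spaced split keys could be computed. It assumes row keys start with a uniformly distributed fixed-width byte prefix (for example a hash or salt), which the thread does not state. The function name `uniform_split_keys` and its parameters are made up for illustration and are not part of HBase.

```python
def uniform_split_keys(num_regions: int, prefix_bytes: int = 1) -> list[bytes]:
    """Return num_regions - 1 split points spaced evenly over the key prefix space.

    Assumption (not from the thread): row keys begin with a uniformly
    distributed prefix of `prefix_bytes` bytes.
    """
    if num_regions < 2:
        return []  # a single region needs no split points
    space = 256 ** prefix_bytes  # number of distinct prefixes
    if num_regions > space:
        raise ValueError("more regions requested than distinct key prefixes")
    # The i-th split point sits at i/num_regions of the prefix space.
    return [
        (i * space // num_regions).to_bytes(prefix_bytes, "big")
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # Four regions over a one-byte prefix give three split points.
    print([key.hex() for key in uniform_split_keys(4)])
```

The resulting byte strings would then be supplied as split points when the table is created. Whether the regions stay split afterwards depends on the normalizer settings discussed above.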
