I think that no matter how well a balancer cost function is written, it
cannot compensate for a suboptimal row key design. Say, for example, you
have 10 regionservers and 100 regions, and your application is heavy on the
latest data, which lands mostly in 1 or 2 regions. No matter how many
splits and/or merges you do, it becomes very hard to balance the load among
the regionservers.

Here is how we have solved this problem for our clients. It might not work
for existing clients, but it can be a thought for new ones.

Every request with a row key goes through an enrichment process, which
prefixes the key with a hash (from, say, murmurhash) based on the
client-requested distribution (this stays fixed for the lifetime of that
table for that client). We also wrote an HBase client abstraction that
takes care of this in a seamless manner for our clients.

Example: an actual row key of *0QUPHSBTLGM*, with the client having
requested a 3-digit prefix based on the table's region range (000 - 999),
would translate to *115-0QUPHSBTLGM* with murmurhash.
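
To make this concrete, here is a minimal Java sketch of the prefixing step
(the class and method names are illustrative, not our actual client
abstraction; it assumes Guava's Murmur3 is on the classpath, but any murmur
implementation would do):

import java.nio.charset.StandardCharsets;
import com.google.common.hash.Hashing;

public final class RowKeyEnricher {

    // Client-requested distribution: a 3-digit prefix over 000-999,
    // matching a table pre-split on that range. This value stays fixed
    // for the lifetime of the table.
    private static final int BUCKETS = 1000;

    // Deterministic, so the write path and the read path derive the
    // same enriched key from the same actual row key.
    public static String enrich(String actualRowKey) {
        int hash = Hashing.murmur3_32()
                .hashString(actualRowKey, StandardCharsets.UTF_8)
                .asInt();
        int bucket = Math.floorMod(hash, BUCKETS); // floorMod avoids negative buckets
        return String.format("%03d-%s", bucket, actualRowKey);
    }

    public static void main(String[] args) {
        // A given key always maps to the same bucket,
        // e.g. "0QUPHSBTLGM" -> "115-0QUPHSBTLGM" as in the example above.
        System.out.println(enrich("0QUPHSBTLGM"));
    }
}

Our client abstraction applies this on every write and read, so callers
only ever deal with the actual row key.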

---
Mallikarjun


On Tue, May 18, 2021 at 1:33 AM Bryan Beaudreault
<bbeaudrea...@hubspot.com.invalid> wrote:

> Hey all,
>
> We run a bunch of big hbase clusters that get used by hundreds of product
> teams for a variety of real-time workloads. We are a B2B company, so most
> data has a customerId somewhere in the rowkey. As the team that owns the
> hbase infrastructure, we try to help product teams properly design schemas
> to avoid hotspotting, but inevitably it happens. It may not necessarily
> be hotspotting per se; for example, request volume may simply not be
> evenly distributed across all regions of a table.
>
> This hotspotting/distribution issue makes it hard for the balancer to keep
> the cluster balanced from a load perspective -- sure, all RS have the same
> number of regions, but those regions are not all created equal from a load
> perspective. This results in cases where one RS might be consistently at
> 70% cpu, another might be at 30%, and all the rest are in a band
> in-between.
>
> We already have a normalizer job which works similarly to the
> SimpleRegionNormalizer -- keeping regions approximately the same size from
> a data size perspective. I'm considering updating our normalizer to also
> take into account region load.
>
> My general plan is to follow a similar strategy to the balancer -- keep a
> configurable number of RegionLoad objects in memory per-region, and extract
> averages for readRequestsCount from those. If a region's average load is >
> some threshold relative to other regions in the same table, split it. If
> it's < some threshold relative to other regions in the same table, merge
> it.
>
> I'm writing because I'm wondering if anyone else has had this problem and
> if there exists prior art here. Is there a reason HBase does not provide a
> configurable load-based normalizer (beyond typical OSS reasons -- no one
> contributed it yet)?
>
> Thanks!
>
