Agree. It solves the problem of randomizing the distribution, but it does not increase cardinality.
On Tue, May 18, 2021 at 6:29 PM Bryan Beaudreault <[email protected]> wrote:

> Hi Mallikarjun, thanks for the response.
>
> I agree that it is hard to fully mitigate a bad rowkey design. We do make pretty heavy use of hash prefixes, and we don't really have many examples of the common problem you describe where the "latest" data is in 1-2 regions. Our distribution issues instead come from the fact that we have customers both big and small living in the same table -- a small customer might have only a few rows and might only use the product enough to access those rows a few times a day. A large customer might have thousands or millions of rows, and be constantly using the product and accessing those rows.
>
> While we do use hash prefixes, developers need to choose what to put in that hash -- for example, if the rowkey is customerId + emailAddress + somethingElse, ideally the hash could at the very least be hash(customerId + emailAddress). This would result in a relatively well distributed schema, assuming the cardinality of somethingElse isn't too big. But often developers need to scan data for a given customer, so the hash might just be hash(customerId). For a large customer with many email addresses, this presents a problem. In these cases we do try to lean on small index tables + multi-gets where possible, but that's not always possible and doesn't fully solve the potential for hotspotting.
>
> We have about 1000 tables across 40 clusters and over 5000 regionservers. This sort of scenario plays out over and over, where large customers end up causing certain regions to get more traffic than others. One might hope that these "heavy" regions would also be bigger in size, if read and write volume were relatively proportional. If that were true, the HFile size-based normalizer might be enough. For us, though, that is not always the case. My thought is to identify these read-heavy regions and split them so we can better distribute the load with the balancer.
>
> Perhaps this problem is very specific to our use case, which is why it has not been more generally solved by a load-based normalizer.
>
> On Mon, May 17, 2021 at 10:09 PM Mallikarjun <[email protected]> wrote:
>
> > I think no matter how good a balancer cost function is, it cannot cover for a suboptimal row key design. Say, for example, you have 10 regionservers and 100 regions, and your application is heavy on the latest data, which sits mostly in 1 or 2 regions; no matter how many splits and/or merges you do, it becomes very hard to balance the load among the regionservers.
> >
> > Here is how we have solved this problem for our clients. It might not work for existing clients, but it can be an option for new clients.
> >
> > Every request with a row key goes through an enrichment process, which prefixes the key with a hash (from, say, murmurhash) based on the client-requested distribution (this stays fixed for the lifetime of that table for that client). We also wrote an HBase client abstraction to take care of this in a seamless manner for our clients.
> > Example: an actual row key --> *0QUPHSBTLGM*, with a client-requested 3-digit prefix based on the table's region range (000 - 999), would translate to *115-0QUPHSBTLGM* with murmurhash.
> >
> > ---
> > Mallikarjun
> >
> > On Tue, May 18, 2021 at 1:33 AM Bryan Beaudreault <[email protected]> wrote:
> >
> > > Hey all,
> > >
> > > We run a bunch of big hbase clusters that get used by hundreds of product teams for a variety of real-time workloads. We are a B2B company, so most data has a customerId somewhere in the rowkey. As the team that owns the hbase infrastructure, we try to help product teams properly design schemas to avoid hotspotting, but inevitably it happens. It may not necessarily just be hotspotting; for example, request volume may not be evenly distributed across all regions of a table.
> > >
> > > This hotspotting/distribution issue makes it hard for the balancer to keep the cluster balanced from a load perspective -- sure, all RS have the same number of regions, but those regions are not all created equal from a load perspective. This results in cases where one RS might be consistently at 70% cpu, another might be at 30%, and all the rest are in a band in-between.
> > >
> > > We already have a normalizer job which works similarly to the SimpleRegionNormalizer -- keeping regions approximately the same size from a data size perspective. I'm considering updating our normalizer to also take region load into account.
> > >
> > > My general plan is to follow a similar strategy to the balancer -- keep a configurable number of RegionLoad objects in memory per region, and extract averages for readRequestsCount from those. If a region's average load is > some threshold relative to other regions in the same table, split it. If it's < some threshold relative to other regions in the same table, merge it.
> > >
> > > I'm writing because I'm wondering if anyone else has had this problem and if there exists prior art here. Is there a reason HBase does not provide a configurable load-based normalizer (beyond typical OSS reasons -- no one has contributed it yet)?
> > >
> > > Thanks!
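For readers skimming the thread, here is a minimal, self-contained sketch of just the split/merge decision step described in the quoted plan above: average the readRequestsCount samples per region and compare each region against the table-wide mean. The class name, the 2.0/0.5 thresholds, and the plain Map input are assumptions for illustration; actually collecting the samples from RegionLoad/RegionMetrics and issuing the splits/merges is left out.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch (not HBase code) of a read-load-based normalizer decision:
 * per-region read-request averages compared against the table-wide mean.
 * Thresholds and input shape are assumptions for the example.
 */
public final class ReadLoadNormalizerSketch {

  // Hypothetical thresholds: split when a region carries 2x the table mean, merge below 0.5x.
  private static final double SPLIT_FACTOR = 2.0;
  private static final double MERGE_FACTOR = 0.5;

  enum Plan { SPLIT, MERGE, NONE }

  /**
   * @param readCountsByRegion recent readRequestsCount samples per region, e.g. taken from
   *        periodically retained RegionLoad/RegionMetrics snapshots for one table
   */
  public static Map<String, Plan> plan(Map<String, List<Long>> readCountsByRegion) {
    // Average the retained samples for each region.
    Map<String, Double> avgByRegion = new HashMap<>();
    readCountsByRegion.forEach((region, samples) ->
        avgByRegion.put(region, samples.stream().mapToLong(Long::longValue).average().orElse(0)));

    // Table-wide mean of the per-region averages.
    double tableMean =
        avgByRegion.values().stream().mapToDouble(Double::doubleValue).average().orElse(0);

    Map<String, Plan> plans = new HashMap<>();
    avgByRegion.forEach((region, avg) -> {
      if (tableMean > 0 && avg > SPLIT_FACTOR * tableMean) {
        plans.put(region, Plan.SPLIT);   // read-heavy relative to the rest of the table
      } else if (tableMean > 0 && avg < MERGE_FACTOR * tableMean) {
        plans.put(region, Plan.MERGE);   // cold relative to the rest of the table
      } else {
        plans.put(region, Plan.NONE);
      }
    });
    return plans;
  }

  private ReadLoadNormalizerSketch() {}
}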

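Similarly, a minimal sketch of the hash-prefix enrichment Mallikarjun describes (a 3-digit bucket in the 000-999 range, derived from a murmur hash of the raw key, prepended to it), assuming Guava's murmur3 implementation is available. The class and method names are hypothetical, and the concrete bucket value will generally differ from the 115 in the example above.

import com.google.common.hash.Hashing;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that prefixes raw row keys with a murmur-derived bucket (000-999). */
public final class SaltedRowKeys {

  private static final int BUCKETS = 1000; // matches the 000-999 range in the example above

  /** Returns e.g. "115-0QUPHSBTLGM" for the raw key "0QUPHSBTLGM" (bucket value illustrative). */
  public static String salt(String rawKey) {
    long h = Hashing.murmur3_128().hashString(rawKey, StandardCharsets.UTF_8).asLong();
    int bucket = (int) Math.floorMod(h, (long) BUCKETS); // stable, non-negative bucket
    return String.format("%03d-%s", bucket, rawKey);
  }

  private SaltedRowKeys() {}
}

The usual trade-off applies: a scan over a raw-key range now has to fan out across all 1000 prefixes (or the client abstraction has to hide that), which is presumably part of what the client-side abstraction mentioned in the thread handles.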