On Mon, Apr 26, 2021 at 12:30 PM Stack <[email protected]> wrote:

> On Mon, Apr 26, 2021 at 8:10 AM Mallikarjun <[email protected]>
> wrote:
>
>> We use FavoredStochasticBalancer, which by description says the same thing
>> as FavoredNodeLoadBalancer. Ignoring that fact, problem appears to be
>>
>
> Other concerns:
>
> * Hard-coded triplet of nodes that will inevitably rot as machines come
> and go (Are there tools for remediation?)
> * A workaround for a facility that belongs in the NN
> * Opaque in operation
> * My understanding was that the feature was never finished; in particular
> the balancer wasn't properly wired up (Happy to be incorrect here).
>

One more concern was that the feature was dead/unused. You seem to refute
this notion of mine.
S
>
>> Going a step back.
>>
>> Did we ever consider giving a thought towards truly multi-tenant hbase?
>>
>
> Always.
>
>
>> Where each rsgroup has a group of datanodes, and data of namespace tables
>> created under that particular rsgroup would sit on those datanodes only?
>> We have attempted to do that and have largely been very successful running
>> clusters of hundreds of terabytes with hundreds of regionservers
>> (datanodes) per cluster.
>>
>
> So isolation of load by node? (I believe this is where the rsgroup feature
> came from originally; the desire for a deploy like you describe above.
> IIUC, it's what Thiru and crew run).
>
>
>> 1. We use a modified version of RSGroupBasedFavoredNodeLoadBalancer
>> contributed by Thiruvel Thirumoolan -->
>> https://issues.apache.org/jira/browse/HBASE-15533
>>
>> On each balance operation, while a region is moved around (or while a
>> table is created), favored nodes are assigned based on the rsgroup that
>> the region is pinned to, and hence data is pinned to those datanodes only.
>> (Pinning favored nodes is best effort on the HDFS side, but there are only
>> a few exception scenarios where data spills over, and those recover after
>> a major compaction.) A configuration sketch of this wiring appears at the
>> end of this message.
>>
>
> Sounds like you have studied this deploy in operation. Write it up? Blog
> post on hbase.apache.org?
>
>
>> 2. We have introduced several balancer cost functions to restore things to
>> normalcy (multi-tenancy with FN pinning), such as when a node is dead or
>> when FNs are imbalanced within the same rsgroup, etc. A sketch of such a
>> cost score appears at the end of this message.
>>
>> 3. We had diverse workloads in the same cluster, so WAL isolation became a
>> requirement, and we went ahead with the same philosophy as in point 1:
>> WALs are created with FN pinning so that they are tied to datanodes
>> belonging to the same rsgroup. Some discussion seems to have happened
>> here --> https://issues.apache.org/jira/browse/HBASE-21641
>>
>> There are several other enhancements we have worked on, such as an
>> rsgroup-aware export snapshot, an rsgroup-aware region mover, rsgroup-aware
>> cluster replication, etc.
>>
>> For the above use cases, we need FN information in hbase:meta.
>>
>> If this use case seems to be a fit for how we would want hbase to be taken
>> forward as one of the supported use cases, we are willing to contribute
>> our changes back to the community. (I was anyway planning to initiate this
>> discussion.)
>>
>
> Contribs always welcome.
>
> Thanks Mallikarjun,
> S
>
>
>> To strengthen the above use case, here is what one of our multi-tenant
>> clusters looks like:
>>
>> RSGroups (tenants): 21 (with tenant isolation)
>> Regionservers: 275
>> Regions hosted: 6k
>> Tables hosted: 87
>> Capacity: 250 TB (100 TB used)
>>
>> ---
>> Mallikarjun
>>
>>
>> On Mon, Apr 26, 2021 at 9:15 AM 张铎(Duo Zhang) <[email protected]>
>> wrote:
>>
>> > As you all know, we always want to reduce the size of the hbase-server
>> > module. This time we want to separate the balancer-related code into
>> > another sub-module.
>> >
>> > The design doc:
>> >
>> > https://docs.google.com/document/d/1T7WSgcQBJTtbJIjqi8sZYLxD2Z7JbIHx4TJaKKdkBbE/edit#
>> >
>> > As you can see at the bottom of the design doc, the favored node
>> > balancer is a problem, as it stores the favored node information in
>> > hbase:meta. Stack mentioned that the feature is already dead, so maybe
>> > we could just purge it from our code base.
>> >
>> > So here we want to know if there are still some users in the community
>> > who still use the favored node balancer.
>> > Please share your experience and whether you still want to use it.
>> >
>> > Thanks.
>> >
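
For anyone wanting to picture the wiring described in point 1 of the quoted
mail, here is a minimal configuration sketch, assuming HBase 2.x with the
rsgroup feature enabled. The property keys and the stock class names are the
upstream ones; the modified HBASE-15533 balancer from the thread is not part
of a stock install, so the per-group slot below shows the stock
FavoredStochasticBalancer instead. Read it as a sketch, not a recommended
setup.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class FavoredNodeRsGroupConfigSketch {
  // The same settings would normally live in hbase-site.xml; they are set
  // programmatically here only to keep the sketch self-contained.
  public static Configuration sketch() {
    Configuration conf = HBaseConfiguration.create();

    // Enable the rsgroup feature: the rsgroup-aware top-level balancer plus
    // the master coprocessor that exposes the rsgroup admin operations.
    conf.set("hbase.master.loadbalancer.class",
        "org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer");
    conf.set("hbase.coprocessor.master.classes",
        "org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint");

    // Balancer that RSGroupBasedLoadBalancer delegates to within each group.
    // A deployment like the one described above would plug its modified
    // favored-node balancer (HBASE-15533) in here.
    conf.set("hbase.rsgroup.grouploadbalancer.class",
        "org.apache.hadoop.hbase.master.balancer.FavoredStochasticBalancer");

    return conf;
  }
}

The favored node assignments such a balancer makes end up persisted in
hbase:meta, which is exactly the coupling the design doc above calls out.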
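
Point 2 of the quoted mail mentions cost functions that penalize favored-node
imbalance within an rsgroup. The class below is a standalone illustration of
the kind of normalized score such a function could feed the stochastic
balancer; it is not the HBase CostFunction API, and the class name and metric
are invented for the example.

import java.util.Map;

public final class FavoredNodeImbalanceSketch {

  // Returns 0.0 when every datanode in an rsgroup carries the same number of
  // favored-node assignments, and a larger value (below 1.0) as the
  // assignments concentrate on a single node.
  public static double cost(Map<String, Integer> fnCountPerNode) {
    if (fnCountPerNode.isEmpty()) {
      return 0.0;
    }
    long total = 0;
    int max = 0;
    for (int count : fnCountPerNode.values()) {
      total += count;
      max = Math.max(max, count);
    }
    if (total == 0) {
      return 0.0;
    }
    double mean = (double) total / fnCountPerNode.size();
    // Gap between the busiest node and the group mean, normalized by the
    // total so that different group sizes produce comparable scores.
    return (max - mean) / total;
  }

  public static void main(String[] args) {
    // Example: dn1 carries far more favored-node slots than its peers, so
    // the score is well above zero and a balancer would try to shrink it.
    System.out.println(cost(Map.of("dn1", 30, "dn2", 10, "dn3", 8)));
  }
}

A real implementation would hook into the balancer's cost-function machinery
so the score is recomputed as candidate region moves are evaluated, and would
be combined with the other costs the thread mentions (dead-node recovery,
per-group balance, and so on).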
