On Mon, Apr 26, 2021 at 12:30 PM Stack <[email protected]> wrote:

> On Mon, Apr 26, 2021 at 8:10 AM Mallikarjun <[email protected]>
> wrote:
>
>> We use FavoredStochasticBalancer, which by description says the same thing
>> as FavoredNodeLoadBalancer. Ignoring that fact, problem appears to be
>>
>
> Other concerns:
>
> * Hard-coded triplet of nodes that will inevitably rot as machines come
> and go (Are there tools for remediation?)
> * A workaround for a facility that belongs in the NN
> * Opaque in operation
> * My understanding was that the feature was never finished; in particular
> the balancer wasn't properly wired up (Happy to be incorrect here).
>

One more concern was that the feature was dead/unused. You seem to refute
this notion of mine.
S
>
>> Going a step back.
>>
>> Did we ever consider giving a thought towards truly multi-tenant hbase?
>>
>
> Always.
>
>
>> Where each rsgroup has a group of datanodes, and data of namespace tables
>> created under that particular rsgroup would sit on those datanodes only?
>> We have attempted to do that and have largely been very successful running
>> clusters of hundreds of terabytes with hundreds of regionservers
>> (datanodes) per cluster.
>>
>
> So isolation of load by node? (I believe this is where the rsgroup feature
> came from originally; the desire for a deploy like you describe above.
> IIUC, it's what Thiru and crew run).
>
>
>> 1. We use a modified version of RSGroupBasedFavoredNodeLoadBalancer
>> contributed by Thiruvel Thirumoolan -->
>> https://issues.apache.org/jira/browse/HBASE-15533
>>
>> On each balance operation, while a region is moved around (or while a
>> table is created), favored nodes are assigned based on the rsgroup that
>> the region is pinned to, and hence data is pinned to those datanodes only.
>> (Pinning favored nodes is best effort on the HDFS side, but there are only
>> a few exception scenarios where data spills over, and those recover after
>> a major compaction.) A configuration sketch of this wiring appears at the
>> end of this message.
>>
>
> Sounds like you have studied this deploy in operation. Write it up? Blog
> post on hbase.apache.org?
>
>
>> 2. We have introduced several balancer cost functions to restore things to
>> normalcy (multi-tenancy with FN pinning), such as when a node is dead or
>> when FNs are imbalanced within the same rsgroup, etc. A sketch of such a
>> cost score appears at the end of this message.
>>
>> 3. We had diverse workloads in the same cluster, so WAL isolation became a
>> requirement, and we went ahead with the same philosophy as in point 1:
>> WALs are created with FN pinning so that they are tied to datanodes
>> belonging to the same rsgroup. Some discussion seems to have happened
>> here --> https://issues.apache.org/jira/browse/HBASE-21641
>>
>> There are several other enhancements we have worked on, such as an
>> rsgroup-aware export snapshot, an rsgroup-aware region mover, rsgroup-aware
>> cluster replication, etc.
>>
>> For the above use cases, we need FN information in hbase:meta.
>>
>> If this use case seems to be a fit for how we would want hbase to be taken
>> forward as one of the supported use cases, we are willing to contribute
>> our changes back to the community. (I was anyway planning to initiate this
>> discussion.)
>>
>
> Contribs always welcome.
>
> Thanks Mallikarjun,
> S
>
>
>> To strengthen the above use case, here is what one of our multi-tenant
>> clusters looks like:
>>
>> RSGroups (tenants): 21 (with tenant isolation)
>> Regionservers: 275
>> Regions hosted: 6k
>> Tables hosted: 87
>> Capacity: 250 TB (100 TB used)
>>
>> ---
>> Mallikarjun
>>
>>
>> On Mon, Apr 26, 2021 at 9:15 AM 张铎(Duo Zhang) <[email protected]>
>> wrote:
>>
>> > As you all know, we always want to reduce the size of the hbase-server
>> > module. This time we want to separate the balancer-related code into
>> > another sub-module.
>> >
>> > The design doc:
>> >
>> > https://docs.google.com/document/d/1T7WSgcQBJTtbJIjqi8sZYLxD2Z7JbIHx4TJaKKdkBbE/edit#
>> >
>> > As you can see at the bottom of the design doc, the favored node
>> > balancer is a problem, as it stores the favored node information in
>> > hbase:meta. Stack mentioned that the feature is already dead, so maybe
>> > we could just purge it from our code base.
>> >
>> > So here we want to know if there are still some users in the community
>> > who still use the favored node balancer.
>> > Please share your experience and whether you still want to use it.
>> >
>> > Thanks.
>> >
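
For anyone wanting to picture the wiring described in point 1 of the quoted
mail, here is a minimal configuration sketch, assuming HBase 2.x with the
rsgroup feature enabled. The property keys and the stock class names are the
upstream ones; the modified HBASE-15533 balancer from the thread is not part
of a stock install, so the per-group slot below shows the stock
FavoredStochasticBalancer instead. Read it as a sketch, not a recommended
setup.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class FavoredNodeRsGroupConfigSketch {
  // The same settings would normally live in hbase-site.xml; they are set
  // programmatically here only to keep the sketch self-contained.
  public static Configuration sketch() {
    Configuration conf = HBaseConfiguration.create();

    // Enable the rsgroup feature: the rsgroup-aware top-level balancer plus
    // the master coprocessor that exposes the rsgroup admin operations.
    conf.set("hbase.master.loadbalancer.class",
        "org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer");
    conf.set("hbase.coprocessor.master.classes",
        "org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint");

    // Balancer that RSGroupBasedLoadBalancer delegates to within each group.
    // A deployment like the one described above would plug its modified
    // favored-node balancer (HBASE-15533) in here.
    conf.set("hbase.rsgroup.grouploadbalancer.class",
        "org.apache.hadoop.hbase.master.balancer.FavoredStochasticBalancer");

    return conf;
  }
}

The favored node assignments such a balancer makes end up persisted in
hbase:meta, which is exactly the coupling the design doc above calls out.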
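
Point 2 of the quoted mail mentions cost functions that penalize favored-node
imbalance within an rsgroup. The class below is a standalone illustration of
the kind of normalized score such a function could feed the stochastic
balancer; it is not the HBase CostFunction API, and the class name and metric
are invented for the example.

import java.util.Map;

public final class FavoredNodeImbalanceSketch {

  // Returns 0.0 when every datanode in an rsgroup carries the same number of
  // favored-node assignments, and a larger value (below 1.0) as the
  // assignments concentrate on a single node.
  public static double cost(Map<String, Integer> fnCountPerNode) {
    if (fnCountPerNode.isEmpty()) {
      return 0.0;
    }
    long total = 0;
    int max = 0;
    for (int count : fnCountPerNode.values()) {
      total += count;
      max = Math.max(max, count);
    }
    if (total == 0) {
      return 0.0;
    }
    double mean = (double) total / fnCountPerNode.size();
    // Gap between the busiest node and the group mean, normalized by the
    // total so that different group sizes produce comparable scores.
    return (max - mean) / total;
  }

  public static void main(String[] args) {
    // Example: dn1 carries far more favored-node slots than its peers, so
    // the score is well above zero and a balancer would try to shrink it.
    System.out.println(cost(Map.of("dn1", 30, "dn2", 10, "dn3", 8)));
  }
}

A real implementation would hook into the balancer's cost-function machinery
so the score is recomputed as candidate region moves are evaluated, and would
be combined with the other costs the thread mentions (dead-node recovery,
per-group balance, and so on).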
