On Mon, Apr 26, 2021 at 7:30 PM Mallikarjun <mallik.v.ar...@gmail.com> wrote:
> Inline reply
>
> On Tue, Apr 27, 2021 at 1:03 AM Stack <st...@duboce.net> wrote:
>
> > On Mon, Apr 26, 2021 at 12:30 PM Stack <st...@duboce.net> wrote:
> >
> > > On Mon, Apr 26, 2021 at 8:10 AM Mallikarjun <mallik.v.ar...@gmail.com>
> > > wrote:
> > >
> > >> We use FavoredStochasticBalancer, which by description says the same
> > >> thing as FavoredNodeLoadBalancer. Ignoring that fact, the problem
> > >> appears to be
> > >
> > > Other concerns:
> > >
> > > * Hard-coded triplet of nodes that will inevitably rot as machines
> > > come and go (Are there tools for remediation?)
>
> It doesn't really rot, if you think of it with the balancer responsible
> for assigning regions:
>
> 1. On every region assignment to a particular regionserver, the balancer
> reassigns this triplet, and hence there is no scope for rot (the same
> logic applies to WALs as well). (On compaction, HDFS blocks will be
> pulled back if any spilled over.)

I don't follow the above but no harm; I can wait for the write-up (smile).

> 2. We used hostnames only (so, come and go is not going to be new nodes
> but the same hostnames).

Ack.

> Couple of outstanding problems though.
>
> 1. We couldn't increase the replication factor to > 3. Which was fine so
> far for our use cases, but we have had thoughts around fixing it.

Not the end-of-the-world I'd say. Would be nice to have though.

> 2. The balancer doesn't understand the favored nodes construct, so a
> perfectly balanced FN spread among the rsgroup datanodes isn't possible,
> but some variance, like a 10-20% difference, is expected.

Can be worked on.....

> > > * A workaround for a facility that belongs in the NN
>
> Probably, you can argue both ways. HBase is the owner of the data.

Sort-of. NN hands out where replicas should be placed according to its
configured policies. Then there is the HDFS balancer....

> > One more concern was that the feature was dead/unused. You seem to
> > refute this notion of mine.
> > S
>
> We have been using this for more than a year with hbase 2.1 in highly
> critical workloads for our company. And several years with hbase 1.2 as
> well, with rsgroup backported from master at that time (2017-18 ish).
>
> And it has been very smooth operationally in hbase 2.1.

Sweet. Trying to get the other FN users to show up here on this thread
to speak of their experience....

Thanks for speaking up,
S

> > >> Going a step back.
> > >>
> > >> Did we ever consider giving a thought towards a truly multi-tenant
> > >> hbase?
> > >
> > > Always.
> > >
> > >> Where each rsgroup has a group of datanodes, and the data of tables
> > >> created under that particular rsgroup's namespace would sit on those
> > >> datanodes only? We have attempted to do that and have largely been
> > >> very successful running clusters of hundreds of terabytes with
> > >> hundreds of regionservers (datanodes) per cluster.
> > >
> > > So isolation of load by node? (I believe this is where the rsgroup
> > > feature came from originally; the desire for a deploy like you
> > > describe above. IIUC, it's what Thiru and crew run.)
> > >
> > >> 1. We use a modified version of RSGroupBasedFavoredNodeLoadBalancer
> > >> contributed by Thiruvel Thirumoolan -->
> > >> https://issues.apache.org/jira/browse/HBASE-15533
> > >>
> > >> On each balance operation, while the region is moved around (or
> > >> while creating a table), favored nodes are assigned based on the
> > >> rsgroup that the region is pinned to. And hence data is pinned to
> > >> those datanodes only. (Pinning favored nodes is best effort from
> > >> the hdfs side, but there are only a few exception scenarios where
> > >> data will be spilled over, and they recover after a major
> > >> compaction.)
> > >
> > > Sounds like you have studied this deploy in operation. Write it up?
> > > Blog post on hbase.apache.org?
>
> Definitely will write up.
>
> > >> 2. We have introduced several balancer cost functions to restore
> > >> things to normalcy (multi-tenancy with FN pinning), such as when a
> > >> node is dead, or when FNs are imbalanced within the same rsgroup,
> > >> etc.
> > >>
> > >> 3. We had diverse workloads under the same cluster and WAL isolation
> > >> became a requirement, so we went ahead with a philosophy similar to
> > >> the one mentioned in point 1, where WALs are created with FN pinning
> > >> so that they are tied to datanodes belonging to the same rsgroup.
> > >> Some discussion seems to have happened here -->
> > >> https://issues.apache.org/jira/browse/HBASE-21641
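For anyone on the thread who hasn't dug into the mechanics: the HFile
pinning in 1 and the WAL pinning in 3 presumably both bottom out in the
favored-nodes variant of create() that HDFS's DistributedFileSystem
exposes. A rough sketch of just that call; how an rsgroup resolves to
its datanode triplet is left as a hypothetical input:

  import java.io.IOException;
  import java.net.InetSocketAddress;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.permission.FsPermission;
  import org.apache.hadoop.hdfs.DistributedFileSystem;

  public final class FavoredNodePinningSketch {
    /**
     * Create a file whose blocks are placed, best effort, on the given
     * datanodes. HDFS treats favored nodes as a hint rather than a
     * guarantee, which is where the spill-over above comes from, and
     * why a major compaction, rewriting HFiles through this same path,
     * pulls blocks back onto the rsgroup's nodes.
     */
    static FSDataOutputStream createPinned(DistributedFileSystem dfs,
        Path path, InetSocketAddress[] favoredNodes) throws IOException {
      return dfs.create(path,
          FsPermission.getFileDefault(),
          true,                                // overwrite
          dfs.getConf().getInt("io.file.buffer.size", 4096),
          (short) favoredNodes.length,         // replication = triplet size
          dfs.getDefaultBlockSize(path),
          null,                                // no progress callback
          favoredNodes);                       // the rsgroup-local triplet
    }
  }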
> > >> There are several other enhancements we have worked on, with
> > >> respect to rsgroup-aware export snapshot, rsgroup-aware
> > >> regionmover, rsgroup-aware cluster replication, etc.
> > >>
> > >> For the above use cases, we would need the FN information in
> > >> hbase:meta.
> > >>
> > >> If the use case seems to be a fit for how we would want hbase to
> > >> be taken forward as one of the supported use cases, we are willing
> > >> to contribute our changes back to the community. (I was anyway
> > >> planning to initiate this discussion.)
> > >
> > > Contribs always welcome.
>
> Happy to see our thoughts are in line. We will prepare a plan on these
> contributions.

> > > Thanks Mallikarjun,
> > > S
> > >
> > >> To strengthen the above use case, here is what one of our
> > >> multi-tenant clusters looks like:
> > >>
> > >> RSGroups (Tenants): 21 (with tenant isolation)
> > >> Regionservers: 275
> > >> Regions Hosted: 6k
> > >> Tables Hosted: 87
> > >> Capacity: 250 TB (100 TB used)
> > >>
> > >> ---
> > >> Mallikarjun
> > >>
> > >> On Mon, Apr 26, 2021 at 9:15 AM 张铎(Duo Zhang) <palomino...@gmail.com>
> > >> wrote:
> > >>
> > >> > As you all know, we always want to reduce the size of the
> > >> > hbase-server module. This time we want to separate the balancer
> > >> > related code into another sub-module.
> > >> >
> > >> > The design doc:
> > >> >
> > >> > https://docs.google.com/document/d/1T7WSgcQBJTtbJIjqi8sZYLxD2Z7JbIHx4TJaKKdkBbE/edit#
> > >> >
> > >> > You can see at the bottom of the design doc that the favored node
> > >> > balancer is a problem, as it stores the favored node information
> > >> > in hbase:meta. Stack mentioned that the feature is already dead;
> > >> > maybe we could just purge it from our code base.
> > >> >
> > >> > So here we want to know if there are still some users in the
> > >> > community who still use the favored node balancer. Please share
> > >> > your experience and whether you still want to use it.
> > >> >
> > >> > Thanks.
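Re the cost functions in 2 above, for anyone curious what such a thing
looks like: the balancer extension point has moved around between HBase
versions (a CostFunction class nested in the 2.x StochasticLoadBalancer,
a top-level class later), so what follows is only a sketch of the
scoring idea, not the real plumbing; the per-server favored-node counts
are assumed to come from the balancer's cluster state:

  /**
   * Scoring idea for an "FNs imbalanced within the same rsgroup" cost.
   * fnCounts[i] is how many favored-node slots land on server i of one
   * rsgroup. Returns 0.0 when the spread is perfectly even and tends
   * toward 1.0 as everything piles onto one server; a stochastic
   * balancer weighs such a cost against its others via a multiplier.
   */
  public final class FavoredNodeSkewCostSketch {
    static double cost(int[] fnCounts) {
      if (fnCounts.length < 2) {
        return 0.0; // one server or none: nothing to skew
      }
      long total = 0;
      int max = 0;
      for (int c : fnCounts) {
        total += c;
        max = Math.max(max, c);
      }
      if (total == 0) {
        return 0.0; // no favored nodes assigned yet
      }
      double even = (double) total / fnCounts.length; // ideal share
      // Normalized distance of the most loaded server from the even
      // share; a 10-20% variance in spread shows up as a small nonzero
      // cost rather than a balancer-breaking one.
      return (max - even) / (total - even);
    }
  }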