Hi Wellington,

I see, thanks for the pointer. I guess I was staring too much at the region count skew and the overall costs. I would still expect that, with a weight of 500 vs. 5, the region count function would quickly dominate; but that was just a gut feeling.
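Doing the arithmetic on the 09:57 log entry quoted further down may actually explain what I was missing. My working assumption (not verified against the 2.5.7 source) is that the reported imbalance is the multiplier-weighted average of the per-function values, with the "not needed" functions left out; the numbers below do reproduce the logged 0.6933..., so that reading seems plausible. Quick back-of-the-envelope check (my own throwaway code, the class name is made up):

    // Back-of-the-envelope check against the 09:57:54 log entry quoted below.
    // Assumption (not verified against the 2.5.7 source): the reported imbalance is
    // sum(multiplier * imbalance) / sum(multiplier), "not needed" functions excluded.
    public class BalancerCostCheck {
        public static void main(String[] args) {
            double[][] functions = {
                // {multiplier, imbalance}
                {500.0, 0.004313540707257566},  // RegionCountSkewCostFunction
                {  7.0, 0.1888262494457465},    // MoveCostFunction
                { 25.0, 0.39761170318154926},   // ServerLocalityCostFunction
                { 15.0, 0.0},                   // RackLocalityCostFunction
                { 35.0, 11.404401695266312},    // TableSkewCostFunction
                {  5.0, 0.028254565577063396},  // ReadRequestCostFunction
                {  5.0, 0.7593874996431397},    // WriteRequestCostFunction
                {  5.0, 0.16192309175499753},   // MemStoreSizeCostFunction
                {  5.0, 0.01758057650125178},   // StoreFileCostFunction
            };
            double weightedSum = 0, multiplierSum = 0;
            for (double[] f : functions) {
                weightedSum += f[0] * f[1];
                multiplierSum += f[0];
            }
            System.out.printf("weighted sum = %.2f, multiplier sum = %.0f%n", weightedSum, multiplierSum);
            System.out.printf("weighted average = %.6f (the log said 0.69336982505148)%n",
                weightedSum / multiplierSum);
        }
    }

If that reading is right, the balancer isn't ignoring the region count weight at all: the table skew term (35 * 11.40 ~ 399) simply dwarfs the region count term (500 * 0.0043 ~ 2.2), so trading a bit of region count skew for less table skew still lowers the total.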
We have a lot of tables which are hit very unevenly, mostly by MapReduce jobs. That could explain the weird upward curve of the computed imbalance over consecutive balancer runs:

    Going from a computed imbalance of 1.5651524162506638 to a new imbalance of 0.6658199926299904
    Going from a computed imbalance of 1.4975902458776988 to a new imbalance of 0.6489234870537978
    Going from a computed imbalance of 1.4656262927807626 to a new imbalance of 0.6874494481391566
    Going from a computed imbalance of 1.4416755394161005 to a new imbalance of 0.6621195486623185
    Going from a computed imbalance of 1.4649057809821355 to a new imbalance of 0.6215222730451326
    Going from a computed imbalance of 1.4793890584018785 to a new imbalance of 0.69336982505148
    Going from a computed imbalance of 1.4827752629827082 to a new imbalance of 0.5807016932469299
    Going from a computed imbalance of 1.5334691467580437 to a new imbalance of 0.5211528953736886
    Going from a computed imbalance of 1.5853428527425468 to a new imbalance of 0.6737463520617091
    Going from a computed imbalance of 1.6487293617093945 to a new imbalance of 0.6923698540740604
    Going from a computed imbalance of 1.7068097905034438 to a new imbalance of 0.7102944789044411
    Going from a computed imbalance of 1.8460020984257268 to a new imbalance of 0.8314854593561547
    Going from a computed imbalance of 1.9391440009465157 to a new imbalance of 0.8170066710313877
    Going from a computed imbalance of 2.0137944186440104 to a new imbalance of 1.2622553059799655
    Going from a computed imbalance of 2.057726780120024 to a new imbalance of 1.0095133476731533

I'll try fiddling with the weights (see the sketch at the bottom of this mail for the knobs I have in mind)! A dry-run CLI option would be nice for this :D

BR,
Frens

> On 29 Aug 2024, at 22:37, Wellington Chevreuil <[email protected]> wrote:
>
> The balancer logs shared suggest it's deciding to move regions because of the following factors:
> - Data locality (HDFS blocks for the regions' files)
> - Read/Write load
> - Memstore size/utilisation
> So you need to look into those stats. It could be that the cluster is under a "hotspot" situation, where a subset of your regions handle most of the requests.
>
> On Thu, 29 Aug 2024 at 21:22, Frens Jan Rumph <[email protected]> wrote:
>>
>> Dear HBase users/devs!
>>
>> Summary
>>
>> After a node outage, the HBase balancer was switched off. When it was later turned back on, the StochasticLoadBalancer increased/created region count skew, which, given the mostly default configuration, is unexpected. Any help is much appreciated!
>>
>> Details
>>
>> I'm fighting an issue with HBase 2.5.7 on an 11-node cluster with ~15,000 regions from ~1,000 tables. I'm hoping that someone has a pointer.
>>
>> Incident -> turned balancer off
>>
>> We recently lost one of the nodes and ran into severe data imbalance issues at the level of the HDFS disks while the cluster was 'only' 80% full. Some nodes were filling up to over 98%, causing YARN to take them out of rotation. We were unable to identify the cause of this imbalance. In an attempt to mitigate it, the HBase region balancer was disabled.
>>
>> Manually under control -> turned balancer on again
>>
>> Two region servers had a hard restart after the initial incident, so regions were reassigned, but not yet balanced. I didn't dare turn the balancer on right away, fearing to get back into the situation of imbalanced disk usage.
>> So regions were manually (with some scripting) re-assigned to get back to a balanced situation with ~1,500 regions per node; in a naive way, similar to the SimpleLoadBalancer.
>>
>> We've got the disk usage fairly balanced right now, so I turned the balancer back on.
>>
>> Region count skew increased
>>
>> However, it started moving regions away from a few nodes quite aggressively. Every run it moved 2,000 to 4,000 regions, expecting a cost decrease. But then at the next run, the initial computed cost was higher than before. I gave the balancer some rounds, but stopped it when some servers had only ~400 regions while others were responsible for 2,000+ regions; above this limit, splits are prevented.
>>
>> This chart shows the effect of switching the balancer on at ~09:30; I stopped it at ~11:30:
>>
>> [chart not shown]
>>
>> Some (formatted) example logging from the balancer chore:
>>
>>   2024-08-28 09:57:54,678 INFO [master/m1:16000.Chore.5] balancer.StochasticLoadBalancer: ...
>>   Going from a computed imbalance of 1.4793890584018785 to a new imbalance of 0.69336982505148. funtionCost=
>>     RegionCountSkewCostFunction : (multiplier=500.0, imbalance=0.004313540707257566);
>>     PrimaryRegionCountSkewCostFunction : (not needed);
>>     MoveCostFunction : (multiplier=7.0, imbalance=0.1888262494457465, need balance);
>>     ServerLocalityCostFunction : (multiplier=25.0, imbalance=0.39761170318154926, need balance);
>>     RackLocalityCostFunction : (multiplier=15.0, imbalance=0.0);
>>     TableSkewCostFunction : (multiplier=35.0, imbalance=11.404401695266312, need balance);
>>     RegionReplicaHostCostFunction : (not needed);
>>     RegionReplicaRackCostFunction : (not needed);
>>     ReadRequestCostFunction : (multiplier=5.0, imbalance=0.028254565577063396, need balance);
>>     WriteRequestCostFunction : (multiplier=5.0, imbalance=0.7593874996431397, need balance);
>>     MemStoreSizeCostFunction : (multiplier=5.0, imbalance=0.16192309175499753, need balance);
>>     StoreFileCostFunction : (multiplier=5.0, imbalance=0.01758057650125178);
>>
>>   ...
>>
>>   2024-08-28 10:26:34,946 INFO [RpcServer.default.FPBQ.Fifo.handler=63,queue=3,port=16000] balancer.StochasticLoadBalancer: ...
>>   Going from a computed imbalance of 1.5853428527425468 to a new imbalance of 0.6737463520617091. funtionCost=
>>     RegionCountSkewCostFunction : (multiplier=500.0, imbalance=0.023543776971639504);
>>     PrimaryRegionCountSkewCostFunction : (not needed);
>>     MoveCostFunction : (multiplier=7.0, imbalance=0.20349610488314648, need balance);
>>     ServerLocalityCostFunction : (multiplier=25.0, imbalance=0.41889718087643735, need balance);
>>     RackLocalityCostFunction : (multiplier=15.0, imbalance=0.0);
>>     TableSkewCostFunction : (multiplier=35.0, imbalance=10.849642781445127, need balance);
>>     RegionReplicaHostCostFunction : (not needed);
>>     RegionReplicaRackCostFunction : (not needed);
>>     ReadRequestCostFunction : (multiplier=5.0, imbalance=0.02832763401695891, need balance);
>>     WriteRequestCostFunction : (multiplier=5.0, imbalance=0.2960273848432453, need balance);
>>     MemStoreSizeCostFunction : (multiplier=5.0, imbalance=0.08973896446650413, need balance);
>>     StoreFileCostFunction : (multiplier=5.0, imbalance=0.02370918640463713);
>>
>> The balancer has the default configuration with one exception: hbase.master.balancer.maxRitPercent was set to 0.001 because of the impact on availability.
>>
>> I don't understand why the balancer would allow such a skew for the region count, as (per the default configuration) this cost function has a very high weight.
>>
>> I did notice this warning:
>>
>>   calculatedMaxSteps:126008000 for loadbalancer's stochastic walk is larger than maxSteps:1000000. Hence load balancing may not work well. Setting parameter "hbase.master.balancer.stochastic.runMaxSteps" to true can overcome this issue.(This config change does not require service restart)
>>
>> This might make the balancer perform worse than expected. But I'm under the impression that the balancer is eager and takes any randomly generated step that decreases the imbalance. With a default weight of 500, I would expect region count skew to initially dominate the balancing process.
>>
>> At a later point in time, I tried to turn the balancer back on again, this time after creating an ideal distribution of regions. However, again, in just one round the balancer made a complete mess of the region count distribution:
>>
>> [chart not shown]
>>
>> I would very much appreciate any insights or pointers into this matter.
>>
>> Best regards,
>> Frens Jan
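Coming back to the calculatedMaxSteps warning quoted above: as far as I can tell, the calculated value is roughly numRegions * stepsPerRegion * numServers, with stepsPerRegion defaulting to 800 and maxSteps to 1,000,000 (formula and defaults taken from the docs, not checked against the 2.5.7 source). Plugging in round numbers from our cluster lands in the same ballpark as the 126,008,000 from the warning and suggests each round only walks a small fraction of the candidate steps:

    // Rough sanity check of the calculatedMaxSteps warning. The formula and the
    // defaults are my reading of the docs, not verified against HBase 2.5.7.
    public class MaxStepsCheck {
        public static void main(String[] args) {
            long regions = 15_000;      // ~15,000 regions (round number from my original mail)
            long servers = 11;          // 11-node cluster
            long stepsPerRegion = 800;  // hbase.master.balancer.stochastic.stepsPerRegion (assumed default)
            long maxSteps = 1_000_000;  // hbase.master.balancer.stochastic.maxSteps (assumed default)

            long calculated = regions * stepsPerRegion * servers;
            System.out.printf("calculatedMaxSteps ~ %,d (warning reported 126,008,000)%n", calculated);
            System.out.printf("fraction of steps walked per round ~ %.2f%%%n",
                100.0 * maxSteps / calculated);
        }
    }

If that is correct, each balancer round explores well under 1% of the proposed moves, and setting hbase.master.balancer.stochastic.runMaxSteps to true (as the warning suggests) would let a round walk the full calculated number of steps, at the price of much longer balancer runs.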
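And on the weight fiddling I mentioned at the top: these are the knobs I have in mind, written out as a small configuration sketch so the exact property names are in one place. The values are only a first guess on my side, not a recommendation, and in practice they would of course go into hbase-site.xml on the master rather than into code:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    // Sketch of the balancer tuning I intend to try; the values are guesses.
    // Property names as documented for the StochasticLoadBalancer.
    public class BalancerWeightOverrides {
        public static void main(String[] args) {
            Configuration conf = HBaseConfiguration.create();
            // default 500: keep region count skew as the dominant factor (stated explicitly for clarity)
            conf.setFloat("hbase.master.balancer.stochastic.regionCountCost", 500f);
            // default 35: lower it so table skew no longer outweighs region count
            conf.setFloat("hbase.master.balancer.stochastic.tableSkewCost", 5f);
            // default false: let each round walk the full calculated number of steps
            conf.setBoolean("hbase.master.balancer.stochastic.runMaxSteps", true);

            System.out.println(conf.get("hbase.master.balancer.stochastic.regionCountCost"));
            System.out.println(conf.get("hbase.master.balancer.stochastic.tableSkewCost"));
            System.out.println(conf.get("hbase.master.balancer.stochastic.runMaxSteps"));
        }
    }

The warning above implies at least the runMaxSteps flag can be picked up without a restart; I'm not sure whether the cost multipliers can, so a master restart (or update_all_config from the shell, if that applies here) may be needed.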