You don't need to rebuild HBase. Just add an entry in hbase-site.xml for the following config:

    hbase.master.balancer.stochastic.tableSkewCost
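A minimal sketch of what such an entry could look like, assuming it goes inside the existing <configuration> element of the hbase-site.xml read by the master. The value 500 is only the starting point suggested in the quoted message below, not a tuned recommendation:

```xml
<!-- Assumed placement: inside <configuration> in the master's hbase-site.xml -->
<property>
  <!-- Key taken from TABLE_SKEW_COST_KEY in StochasticLoadBalancer.java -->
  <name>hbase.master.balancer.stochastic.tableSkewCost</name>
  <!-- Default is 35; 500 is the illustrative value mentioned in this thread -->
  <value>500</value>
</property>
```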
Restart the master after the addition.

Cheers

> On Aug 30, 2016, at 12:10 AM, Manish Maheshwari <mylogi...@gmail.com> wrote:
>
> Hi Ted,
>
> Where do we set this value, DEFAULT_TABLE_SKEW_COST = 35? I see it only
> in StochasticLoadBalancer.java.
> We don't find it in any of the HBase config files. Do we need to rebuild
> HBase from code for this?
>
> Thanks,
> Manish
>
>> On Tue, Aug 30, 2016 at 6:44 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> StochasticLoadBalancer by default would balance regions evenly across the
>> cluster.
>>
>> Regions of a particular table may not be evenly distributed.
>>
>> Increase the value for the following config:
>>
>> private static final String TABLE_SKEW_COST_KEY =
>>     "hbase.master.balancer.stochastic.tableSkewCost";
>>
>> private static final float DEFAULT_TABLE_SKEW_COST = 35;
>>
>> You can set 500 or higher.
>>
>> FYI
>>
>> On Mon, Aug 29, 2016 at 3:22 PM, Manish Maheshwari <mylogi...@gmail.com>
>> wrote:
>>
>>> Thanks Ted for the maxregionsize per table idea. We will try to keep it
>>> around 1-2 GB and see how it goes. Will this also make sure that the
>>> region migrates to another region server? Or do we still need to do that
>>> manually?
>>>
>>> On JMX: since the environment is production, we are not yet able to use
>>> JMX for stats collection. But in dev we are trying it out.
>>>
>>>> On Aug 30, 2016 1:01 AM, "Ted Yu" <yuzhih...@gmail.com> wrote:
>>>>
>>>> bq. We cannot change the maxregionsize parameter
>>>>
>>>> The region size can be changed on a per-table basis:
>>>>
>>>> hbase> alter 't1', MAX_FILESIZE => '134217728'
>>>>
>>>> See the beginning of hbase-shell/src/main/ruby/shell/commands/alter.rb
>>>> for more details.
>>>>
>>>> FYI
>>>>
>>>> On Sun, Aug 28, 2016 at 10:44 PM, Manish Maheshwari <mylogi...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We have a scenario where HBase is used like a Key Value Database to
>>>>> map Keys to Regions. We have over 5 million keys, but the table size
>>>>> is less than 7 GB. The read volume is pretty high, about 50x the
>>>>> put/delete volume. This causes hot spotting on the Data Node, and the
>>>>> region is not split. We cannot change the maxregionsize parameter as
>>>>> that will impact other tables too.
>>>>>
>>>>> Our idea is to manually inspect the row key ranges, then split the
>>>>> region manually and assign the resulting regions to different region
>>>>> servers. We will then continue to monitor the rows in one region to
>>>>> see if it needs to be split.
>>>>>
>>>>> Does anyone have experience doing this on HBase? Is this a
>>>>> recommended approach?
>>>>>
>>>>> Thanks,
>>>>> Manish