You don't need to rebuild HBase.
Just add an entry in hbase-site.xml for the following config:
> hbase.master.balancer.stochastic.tableSkewCost
Restart master after the addition.
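A minimal sketch of what that hbase-site.xml entry could look like. The value 50 is only an illustration (the thread notes the default in StochasticLoadBalancer is 35, and the advice is to increase it):

```xml
<!-- Illustrative value only; default is 35 in StochasticLoadBalancer.
     A higher value weights per-table skew more heavily during balancing. -->
<property>
  <name>hbase.master.balancer.stochastic.tableSkewCost</name>
  <value>50</value>
</property>
```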
Cheers
> On Aug 30, 2016, at 12:10 AM, Manish Maheshwari wrote:
Hi Ted,
Where do we set this value, DEFAULT_TABLE_SKEW_COST = 35? I see it only in
StochasticLoadBalancer.java.
We don't find it in any of the HBase config files. Do we need to rebuild
HBase from source for this?
Thanks,
Manish
On Tue, Aug 30, 2016 at 6:44 AM, Ted Yu wrote:
StochasticLoadBalancer by default would balance regions evenly across the
cluster.
Regions of particular table may not be evenly distributed.
Increase the value for the following config:
private static final String TABLE_SKEW_COST_KEY =
    "hbase.master.balancer.stochastic.tableSkewCost";
Thanks Ted for the maxregionsize per table idea. We will try to keep it
around 1-2 Gigs and see how it goes. Will this also make sure that the
region migrates to another region server? Or do we still need to do that
manually?
On JMX: since the environment is production, we are as yet unable to use it.
bq. We cannot change the maxregionsize parameter
The region size can be changed on a per-table basis:
hbase> alter 't1', MAX_FILESIZE => '134217728'
See the beginning of hbase-shell/src/main/ruby/shell/commands/alter.rb for
more details.
FYI
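As a quick arithmetic check (plain Python, not an HBase API): the value 134217728 in the alter example above is 128 MB, and the 1-2 GB target mentioned earlier corresponds to the byte strings below.

```python
# Convert a target region size in GiB to the byte string MAX_FILESIZE expects.
def max_filesize_bytes(gib: float) -> str:
    return str(int(gib * 1024 ** 3))

print(max_filesize_bytes(0.125))  # 134217728, the 128 MB value from the alter example
print(max_filesize_bytes(1))      # 1073741824 (1 GB)
print(max_filesize_bytes(2))      # 2147483648 (2 GB)
```

So a 2 GB cap on 't1' would be `alter 't1', MAX_FILESIZE => '2147483648'`.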
On Sun, Aug 28, 2016 at 10:44 PM, Manish Maheshwari wrote:
Cycling old bits:
http://search-hadoop.com/m/YGbb3E2a71UVLBK=Re+HBase+Count+Rows+in+Regions+and+Region+Servers
You can use /jmx to inspect regions and find the hotspot.
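As a sketch of what inspecting /jmx could look like: the endpoint on the region server info port (16030 by default in HBase 1.x) returns JSON, and region-level read counters can be ranked to find the hotspot. The metric-name pattern and the sample payload below are assumptions for illustration, not captured from a real cluster:

```python
import json

# Illustrative sample of the "sub=Regions" bean a region server's /jmx
# endpoint might return (in practice: curl http://<regionserver>:16030/jmx).
# Bean and metric names follow the assumed HBase 1.x pattern; values are made up.
sample = json.loads("""
{"beans": [{
  "name": "Hadoop:service=HBase,name=RegionServer,sub=Regions",
  "Namespace_default_table_t1_region_aaa111_metric_readRequestCount": 905213,
  "Namespace_default_table_t1_region_bbb222_metric_readRequestCount": 18042
}]}
""")

def hottest_regions(jmx: dict) -> list:
    """Return (region metric key, read count) pairs, busiest first."""
    for bean in jmx["beans"]:
        if bean.get("name", "").endswith("sub=Regions"):
            counts = {k: v for k, v in bean.items()
                      if k.endswith("_metric_readRequestCount")}
            return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return []

for key, reads in hottest_regions(sample):
    print(key, reads)
```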
On Mon, Aug 29, 2016 at 7:29 AM, Manish Maheshwari wrote:
Hi Dima,
Thanks for the suggestion. We could load the data into the heap, but HBase
makes it easy for one process to write while another reads. With the heap we
would need to build something to coordinate the two processes, and also write
to a log so as not to lose updates in case of a process failure.
Thanks
Manish
(Though if it is only 7 GB, why not just store it in memory?)
On Sunday, August 28, 2016, Dima Spivak wrote:
If your data can all fit on one machine, HBase is not the best choice. I
think you'd be better off using a simpler solution for small data and leaving
HBase for use cases that require proper clusters.
On Sunday, August 28, 2016, Manish Maheshwari wrote:
We don't want to invest in another DB like Dynamo or Cassandra; we are already
on the Hadoop stack, and managing another DB would be a pain. Why HBase over
an RDBMS? Because we call HBase via Spark Streaming to look up the keys.
Manish
On Mon, Aug 29, 2016 at 1:47 PM, Dima Spivak wrote:
Hey Manish,
Just to ask the naive question, why use HBase if the data fits into such a
small table?
On Sunday, August 28, 2016, Manish Maheshwari wrote:
Hi,
We have a scenario where HBase is used like a Key Value Database to map
Keys to Regions. We have over 5 Million Keys, but the table size is less
than 7 GB. The read volume is pretty high, about 50x the put/delete
volume. This causes hot spotting on the DataNode, and the region is not