We don't want to invest in another DB like Dynamo or Cassandra; we are already on the Hadoop stack, and managing another DB would be a pain. As for why HBase over an RDBMS: we call HBase from Spark Streaming to look up the keys.
Manish

On Mon, Aug 29, 2016 at 1:47 PM, Dima Spivak <dspi...@cloudera.com> wrote:

> Hey Manish,
>
> Just to ask the naive question, why use HBase if the data fits into such a
> small table?
>
> On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com> wrote:
>
> > Hi,
> >
> > We have a scenario where HBase is used like a key-value database to map
> > keys to regions. We have over 5 million keys, but the table size is less
> > than 7 GB. The read volume is pretty high - about 50x the put/delete
> > volume. This causes hotspotting on the data node, and the region is not
> > split. We cannot change the maxregionsize parameter as that would impact
> > other tables too.
> >
> > Our idea is to manually inspect the row key ranges, split the region
> > manually, and assign the resulting regions to different region servers.
> > We will then continue to monitor the rows in each region to see if it
> > needs to be split.
> >
> > Does anyone have experience doing this on HBase? Is this a recommended
> > approach?
> >
> > Thanks,
> > Manish
>
> --
> -Dima
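For what it's worth, the manual-split idea above can be sketched roughly like this: sample the row keys, pick evenly spaced split points, then apply each point with the standard `split` command in the HBase shell. This is only a minimal illustration under assumptions I'm making up (the table name `mytable`, the key format, and the helper name `pick_split_points` are all hypothetical), not a definitive procedure:

```python
# Sketch: choose evenly spaced split points from a sorted sample of row keys.
# Each chosen key would then be applied manually, e.g. in hbase shell:
#   split 'mytable', '<split-key>'
# (table name and key format here are made up for illustration)

def pick_split_points(sorted_keys, n_regions):
    """Return n_regions - 1 keys that divide the sorted sample
    into roughly equal-sized ranges."""
    step = len(sorted_keys) // n_regions
    return [sorted_keys[i * step] for i in range(1, n_regions)]

# Hypothetical sample of 100 sorted row keys:
sample = sorted(f"key{n:07d}" for n in range(5_000_000, 5_000_100))
points = pick_split_points(sample, 4)

# Print the shell commands you would run for each split point:
for p in points:
    print(f"split 'mytable', '{p}'")
```

The same split points could instead be applied programmatically via the Java `Admin.split` API, but the shell is usually the simpler route for a one-off manual rebalance.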