Hi Dima,

Thanks for the suggestion. We can load the data into the heap, but HBase makes it easier for one process to write and another to read. With an in-heap store, we would need to build a single process that handles both the reads and the writes, and that also writes to a log so that updates are not lost if the process fails.
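To make the in-heap option concrete, here is a rough sketch of the minimum such a process would have to do: keep the map in memory, append every update to a log before applying it, and replay the log on restart. The class name `LoggedKeyValueStore` and the JSON-lines log format are made up for illustration, and the sketch leaves out the part that serves reads to other processes, which we would also have to build.

```python
import json
import os


class LoggedKeyValueStore:
    """In-memory dict whose updates are appended to a log before being applied."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # rebuild state left behind by a previous run, if any
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                try:
                    op, key, value = json.loads(line)
                except (ValueError, TypeError):
                    break  # unreadable record, e.g. a write cut short by a crash
                if op == "put":
                    self.data[key] = value
                elif op == "delete":
                    self.data.pop(key, None)

    def _append(self, record):
        self._log.write(json.dumps(record) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())  # push the record to disk before applying it

    def put(self, key, value):
        self._append(["put", key, value])
        self.data[key] = value

    def delete(self, key):
        self._append(["delete", key, None])
        self.data.pop(key, None)

    def get(self, key, default=None):
        return self.data.get(key, default)

    def close(self):
        self._log.close()
```

Even this small version leaves open questions (log compaction, trimming a half-written last record, concurrent readers), which is the work we would rather leave to HBase.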
Thanks,
Manish

On Aug 29, 2016 2:18 PM, "Dima Spivak" <dspi...@cloudera.com> wrote:

> (Though if it is only 7 GB, why not just store it in memory?)
>
> On Sunday, August 28, 2016, Dima Spivak <dspi...@cloudera.com> wrote:
>
> > If your data can all fit on one machine, HBase is not the best choice. I
> > think you'd be better off using a simpler solution for small data and leave
> > HBase for use cases that require proper clusters.
> >
> > On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com> wrote:
> >
> >> We dont want to invest into another DB like Dynamo, Cassandra and Already
> >> are in the Hadoop Stack. Managing another DB would be a pain. Why HBase
> >> over RDMS, is because we call HBase via Spark Streaming to lookup the
> >> keys.
> >>
> >> Manish
> >>
> >> On Mon, Aug 29, 2016 at 1:47 PM, Dima Spivak <dspi...@cloudera.com>
> >> wrote:
> >>
> >> > Hey Manish,
> >> >
> >> > Just to ask the naive question, why use HBase if the data fits into
> >> > such a small table?
> >> >
> >> > On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > We have a scenario where HBase is used like a Key Value Database to
> >> > > map Keys to Regions. We have over 5 Million Keys, but the table size
> >> > > is less than 7 GB. The read volume is pretty high - About 50x of the
> >> > > put/delete volume. This causes hot spotting on the Data Node and the
> >> > > region is not split. We cannot change the maxregionsize parameter as
> >> > > that will impact other tables too.
> >> > >
> >> > > Our idea is to manually inspect the row key ranges and then split the
> >> > > region manually and assign them to different region servers. We will
> >> > > continue to then monitor the rows in one region to see if needs to be
> >> > > split.
> >> > >
> >> > > Any experience of doing this on HBase. Is this a recommended
> >> > > approach?
> >> > >
> >> > > Thanks,
> >> > > Manish
> >> > >
> >> >
> >> >
> >> > --
> >> > -Dima
> >> >
> >>
> >
> >
> > --
> > -Dima
> >
>
>
> --
> -Dima
>
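For the manual split described in the original question, one way to choose split points is to take a sample of row keys with a per-key weight and pick the keys that divide the total weight into roughly equal ranges. The sketch below is only an illustration under assumptions: the function name `pick_split_keys` is hypothetical, the weights are assumed to be read counts collected separately (use 1 per key to balance by key count instead), and Python string ordering stands in for HBase's byte-wise row key ordering.

```python
from itertools import accumulate


def pick_split_keys(key_weights, num_regions):
    """Return up to num_regions - 1 row keys that split total weight evenly.

    key_weights: iterable of (row_key, weight) pairs. The weight could be an
    observed read count (an assumed input), or 1 to balance by key count.
    """
    if num_regions < 2:
        return []
    items = sorted(key_weights)  # stand-in for HBase's byte-wise key order
    total = sum(weight for _, weight in items)
    if total <= 0:
        return []
    cumulative = list(accumulate(weight for _, weight in items))
    split_keys = []
    target = 1
    # 'below' is the combined weight of every key sorted before 'key', which
    # is what would land in the region ending at 'key'.
    for (key, _), below in zip(items[1:], cumulative):
        while target < num_regions and below >= target * total / num_regions:
            if not split_keys or split_keys[-1] != key:
                split_keys.append(key)
            target += 1
    return split_keys


# 100 equally weighted keys, 4 target regions.
sample = [("k%02d" % i, 1) for i in range(100)]
print(pick_split_keys(sample, 4))  # ['k25', 'k50', 'k75']
```

Each returned key could then be passed to the HBase shell, for example `split 'my_table', 'k25'` (the table name is a placeholder), and the resulting regions placed on other region servers with the shell's `move` command. A single very hot key cannot be helped by splitting, since one row always lives in one region.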