Hello all,
I have a use case where I need to write 1 million to 10 million records
periodically (at intervals of 1 to 10 minutes) into an HBase
table.
Once the insert is completed, these records are queried immediately, with
multiple reads, from another program.
So, this is one massive
Can you tell us the average size of your records and how much heap is given to
the region servers?
Thanks
On Aug 23, 2013, at 12:11 AM, Gautam Borah gautam.bo...@gmail.com wrote:
Hello all,
I have a use case where I need to write 1 million to 10 million records
periodically (with
Hi,
The average size of my records is 60 bytes: a 20-byte key and a 40-byte value.
The table has one column family.
I have set up a cluster for testing with 1 master and 3 region servers. Each
region server has a heap size of 3 GB and a single CPU.
I have pre-split the table into 30 regions. I do not have to keep data
Assuming you are using 0.94, the default value
for hbase.regionserver.global.memstore.lowerLimit is 0.35.
This means the memstores on each region server would be able to hold roughly
3000M * 0.35 / 60 = 17.5 million records.
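For anyone who wants to redo this estimate with their own numbers, here is a minimal sketch of the same arithmetic in Java. The class and method names (MemstoreEstimate, recordsThatFit) are made up for illustration and are not part of HBase. The limit is passed as a whole-number percentage so the calculation stays in integer arithmetic.

```java
public class MemstoreEstimate {
    // Hypothetical helper: rough count of records that fit under the global
    // memstore lower limit of one region server, counting only key + value bytes.
    static long recordsThatFit(long heapBytes, int lowerLimitPercent, long recordBytes) {
        long memstoreBytes = heapBytes * lowerLimitPercent / 100;
        return memstoreBytes / recordBytes;
    }

    public static void main(String[] args) {
        long heapBytes = 3_000_000_000L; // 3 GB heap, rounded to 3000M as in the estimate above
        int lowerLimitPercent = 35;      // lowerLimit of 0.35, the 0.94 default quoted above
        long recordBytes = 60;           // 20-byte key + 40-byte value
        System.out.println(recordsThatFit(heapBytes, lowerLimitPercent, recordBytes));
    }
}
```

Treat the result as an upper bound: each KeyValue held in the memstore carries extra bytes beyond the raw key and value (lengths, column family, qualifier, timestamp), so fewer records will fit in practice.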
bq. If I use HTable interface, would the inserted data be in the HBase
cache,
Thanks, Ted, for your response and for clarifying the behavior when using the
HTable interface.
What would the behavior be when inserting data using a MapReduce job? Would
the recently added records be in the memstore, or would I need to load them for
read queries after the insert is done?
Thanks,
Gautam
On
What would the behavior be when inserting data using a MapReduce job? Would
the recently added records be in the memstore, or would I need to load them for
read queries after the insert is done?
Using MR, you have two options for insertion. One creates the HFiles
directly as output (using