Re: hbase memstore size

2014-08-06 Thread yonghu
I don't quite understand your problem. You store your data in HBase, and
I guess you will also read it back later. Generally, HBase first checks
whether the data exists in the memstore; if it does not, it checks the disk.
If you set the memstore size to 0, every read goes straight to disk. How
heavy will the I/O cost be? Moreover, you can think of the memstore as the
counterpart of buffer management in an RDBMS.


On Tue, Aug 5, 2014 at 5:54 AM, Alex Newman posi...@gmail.com wrote:

 Could you explain a bit more about why you don't want a memstore? I can't
 see why it is harmful. Sorry to be dense.
 On Aug 3, 2014 11:24 AM, Ozhan Gulen ozhangu...@gmail.com wrote:

  Hello,
  In our HBase cluster the memstore flush size is 128 MB, and we insert data
  into tables only with the bulk load tool. Since bulk loading bypasses the
  memstores, they are never used, so we want to minimize the memstore flush
  size. But the memstore flush size is used in many important calculations
  in HBase, such as:
 
  region split size = Min (R^2 * “hbase.hregion.memstore.flush.size”,
  “hbase.hregion.max.filesize”)
 
  So setting the memstore flush size to a smaller value, or to 0 for
  example, causes other problems.
  What do you suggest in that case? Setting the memstore size to 128 MB
  holds some memory for the tens of regions on a region server, and we want
  to get rid of it.
  Thanks a lot.
 
  ozhan
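
To make the concern in the quoted message concrete, here is a minimal sketch of the split-size formula exactly as it is written there. It assumes R is the number of regions of the table on the region server, which the message does not spell out. The function name is hypothetical, and the 10 GB value for hbase.hregion.max.filesize is an assumed setting, not one given in the thread.

```python
MB = 1024 * 1024
GB = 1024 * MB

def region_split_size(r, flush_size, max_filesize):
    """Split threshold as quoted in the thread: min(R^2 * flush size, max file size)."""
    return min(r * r * flush_size, max_filesize)

FLUSH_SIZE = 128 * MB    # hbase.hregion.memstore.flush.size from the thread
MAX_FILESIZE = 10 * GB   # hbase.hregion.max.filesize, assumed for illustration

for r in (1, 2, 4, 9):
    size = region_split_size(r, FLUSH_SIZE, MAX_FILESIZE)
    print(f"R={r}: split at {size // MB} MB")

# Shrinking the flush size shrinks the threshold too; at 0 it collapses to 0.
print(region_split_size(4, 1 * MB, MAX_FILESIZE) // MB, "MB with a 1 MB flush size")
print(region_split_size(4, 0, MAX_FILESIZE), "bytes with a flush size of 0")
```

Under these assumptions the threshold grows with R until the max file size caps it, while a flush size of 0 gives a threshold of 0, which is presumably one of the "other problems" the message refers to.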
 



Re: hbase memstore size

2014-08-06 Thread Ted Yu
bq. HBase will first check if the data exist in memstore, if not, it will
check the disk

For the read path, don't forget the block cache / bucket cache.

Cheers
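
The lookup order discussed here can be illustrated with a toy model. This is only a sketch of the simplified picture from the thread (memstore, then block cache, then disk). The class and attribute names are hypothetical and are not HBase APIs. The real read path merges cells from the memstore and store files rather than stopping at the first hit, and it caches blocks rather than single rows.

```python
class ToyStore:
    """Toy read path: memstore first, then block cache, then disk."""

    def __init__(self, on_disk):
        self.memstore = {}          # recent writes that are not flushed yet
        self.block_cache = {}       # data recently read from disk
        self.disk = dict(on_disk)   # stands in for store files (e.g. bulk-loaded data)
        self.disk_reads = 0

    def get(self, key):
        if key in self.memstore:
            return self.memstore[key]
        if key in self.block_cache:
            return self.block_cache[key]
        # Cache miss: go to disk and remember the result for later reads.
        self.disk_reads += 1
        value = self.disk.get(key)
        if value is not None:
            self.block_cache[key] = value
        return value


# Bulk-loaded data never passes through the memstore, so it starts on disk only.
store = ToyStore({"row1": "a"})
first = store.get("row1")    # served from disk
second = store.get("row1")   # served from the block cache
print(first, second, store.disk_reads)
```

In this model an empty memstore does not force every read to disk, because repeated reads are absorbed by the cache.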





Re: hbase memstore size

2014-08-04 Thread Alex Newman
Could you explain a bit more about why you don't want a memstore? I can't see
why it is harmful. Sorry to be dense.
On Aug 3, 2014 11:24 AM, Ozhan Gulen ozhangu...@gmail.com wrote:

 Hello,
 In our HBase cluster the memstore flush size is 128 MB, and we insert data
 into tables only with the bulk load tool. Since bulk loading bypasses the
 memstores, they are never used, so we want to minimize the memstore flush
 size. But the memstore flush size is used in many important calculations
 in HBase, such as:

 region split size = Min (R^2 * “hbase.hregion.memstore.flush.size”,
 “hbase.hregion.max.filesize”)

 So setting the memstore flush size to a smaller value, or to 0 for
 example, causes other problems.
 What do you suggest in that case? Setting the memstore size to 128 MB
 holds some memory for the tens of regions on a region server, and we want
 to get rid of it.
 Thanks a lot.

 ozhan