[ 
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264459#comment-16264459
 ] 

Eshcar Hillel commented on HBASE-18294:
---------------------------------------

I am back with some numbers.
First, I noticed that master suffers a major performance degradation w.r.t. 
branch-2. This is out of the scope of this Jira and I plan to discuss this 
issue separately.
Considering only the delta introduced by the current patch, here is what I 
observe for a write-only workload with default parameters (Basic memstore 
compaction):

||code||Throughput||#flushes||#global heap pressure log lines||
|master|58-59Kops|~1250|~700|
|master+patch|70-71Kops|~2000|0|

We see similar trends when running with no memstore compaction. Looking at 
the heap size instead of the data size causes more disk flushes, since each 
store triggers flushes more frequently. However, the throughput increases 
significantly as we *never* reach global heap pressure. IMO this demonstrates 
that frequent pressure due to global heap size is not healthy, at least from 
a performance perspective.
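
To illustrate why comparing heap size rather than data size leads to more 
flushes, here is a minimal sketch. It is not HBase code: the class, the method 
and the per-cell overhead of 100 bytes are assumptions chosen only for 
illustration. The 128MB threshold is the default mentioned in this issue.

```java
final class FlushTriggerSketch {
    // Hypothetical per-cell overhead (index entry, metadata); not an HBase constant.
    static final long CELL_OVERHEAD_BYTES = 100;

    /**
     * Counts how many flushes a stream of equally sized cells triggers.
     * With useHeapSize=false only key-value bytes are counted; with
     * useHeapSize=true the assumed per-cell overhead is counted as well.
     */
    static long countFlushes(long cells, long cellDataBytes,
                             long thresholdBytes, boolean useHeapSize) {
        long perCell = useHeapSize ? cellDataBytes + CELL_OVERHEAD_BYTES : cellDataBytes;
        long accumulated = 0;
        long flushes = 0;
        for (long i = 0; i < cells; i++) {
            accumulated += perCell;
            if (accumulated > thresholdBytes) { // size exceeds the threshold: flush
                flushes++;
                accumulated = 0;
            }
        }
        return flushes;
    }

    public static void main(String[] args) {
        long threshold = 128L * 1024 * 1024; // 128MB default
        long byData = countFlushes(10_000_000L, 100, threshold, false);
        long byHeap = countFlushes(10_000_000L, 100, threshold, true);
        System.out.println("flushes by data size: " + byData
            + ", flushes by heap size: " + byHeap);
    }
}
```

For the same stream of writes the heap-size check crosses the threshold 
sooner, so each store flushes more often while holding less memory at any 
given time.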

These experiments show the benefit of the patch for on-heap stores. I think it 
is best to enforce symmetric behavior for on-heap and off-heap stores. And this 
should start with the naming convention. So let's not have data size vs. 
on-heap size but rather on-heap vs off-heap size. 
The reason I think we should have two (optional) thresholds is that the space 
allocated on- and off-heap and their usage can vary. Or let me phrase it as a 
question: is there a reason not to give the admin the liberty to set these 
thresholds differently? If they are not set by the admin they get the default 
value (which is currently 128MB).
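
A rough sketch of what I mean by two optional thresholds follows. The class 
and method names are hypothetical and do not correspond to anything in the 
patch; the only value taken from the current behavior is the 128MB default.

```java
import java.util.OptionalLong;

final class MemstoreFlushThresholds {
    // Default used when the admin does not set a threshold (currently 128MB).
    static final long DEFAULT_FLUSH_SIZE_BYTES = 128L * 1024 * 1024;

    private final long onHeapThreshold;
    private final long offHeapThreshold;

    // Each threshold is optional; an unset one falls back to the default.
    MemstoreFlushThresholds(OptionalLong onHeap, OptionalLong offHeap) {
        this.onHeapThreshold = onHeap.orElse(DEFAULT_FLUSH_SIZE_BYTES);
        this.offHeapThreshold = offHeap.orElse(DEFAULT_FLUSH_SIZE_BYTES);
    }

    // Symmetric check: flush when either size exceeds its own threshold.
    boolean shouldFlush(long onHeapSizeBytes, long offHeapSizeBytes) {
        return onHeapSizeBytes > onHeapThreshold
            || offHeapSizeBytes > offHeapThreshold;
    }
}
```

For example, `new MemstoreFlushThresholds(OptionalLong.of(64L * 1024 * 1024), 
OptionalLong.empty())` would flush at 64MB of on-heap usage while keeping the 
default for off-heap usage.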

> Reduce global heap pressure: flush based on heap occupancy
> ----------------------------------------------------------
>
>                 Key: HBASE-18294
>                 URL: https://issues.apache.org/jira/browse/HBASE-18294
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>         Attachments: HBASE-18294.01.patch, HBASE-18294.02.patch, 
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, 
> HBASE-18294.06.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size 
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the 
> store to another threshold (that can be configured with 
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size 
> (key-value only) to the threshold where it should compare the heap size 
> (which includes index size, and metadata).


