[ https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264459#comment-16264459 ]
Eshcar Hillel commented on HBASE-18294:
---------------------------------------

I am back with some numbers.

First, I noticed that master suffers a major performance degradation w.r.t. branch-2. This is out of the scope of this Jira and I plan to discuss it separately.

Considering only the delta introduced by the current patch, here is what I observe for a write-only workload with default parameters (Basic memstore compaction):

||code||Throughput||#flushes||#global heap pressure log lines||
|master|58-59Kops|~1250|~700|
|master+patch|70-71Kops|~2000|0|

We see similar trends when running with no memstore compaction.

Looking at the heap size instead of the data size causes more disk flushes, since each store triggers flushes more frequently. However, the throughput increases significantly because we *never* reach global heap pressure. IMO this demonstrates that frequent pressure due to global heap size is not healthy, at least from a performance perspective.

These experiments show the benefit of the patch for on-heap stores. I think it is best to enforce symmetric behavior for on-heap and off-heap stores, and this should start with the naming convention. So let's not have data size vs. on-heap size, but rather on-heap size vs. off-heap size.

The reason I think we should have two (optional) thresholds is that the space allocated on-heap and off-heap, and the way each is used, can vary. Or let me phrase it as a question: is there a reason not to give the admin the liberty to set these thresholds differently? If they are not set by the admin, they get the default value (which is currently 128MB).
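To make the two-threshold proposal concrete, here is a minimal Java sketch. It is not the code in the patch: the class `FlushThresholdSketch`, its fields, and `shouldFlush` are hypothetical names, and the rule "flush when either footprint exceeds its own threshold" is an assumption about how the two optional thresholds would be combined. The sizes in `main` are made-up numbers chosen only to show how a data-size check and a heap-size check can disagree.

```java
import java.util.OptionalLong;

// Hypothetical sketch; none of these names are HBase APIs.
final class FlushThresholdSketch {
    // 128 MB, the default flush size mentioned in the comment above.
    static final long DEFAULT_FLUSH_SIZE = 128L * 1024 * 1024;

    final long onHeapThreshold;
    final long offHeapThreshold;

    // Each threshold is optional; an unset one falls back to the default.
    FlushThresholdSketch(OptionalLong onHeap, OptionalLong offHeap) {
        this.onHeapThreshold = onHeap.orElse(DEFAULT_FLUSH_SIZE);
        this.offHeapThreshold = offHeap.orElse(DEFAULT_FLUSH_SIZE);
    }

    // Assumed rule: flush when either footprint exceeds its own threshold.
    boolean shouldFlush(long onHeapBytes, long offHeapBytes) {
        return onHeapBytes > onHeapThreshold || offHeapBytes > offHeapThreshold;
    }

    public static void main(String[] args) {
        final long mb = 1024L * 1024;
        FlushThresholdSketch defaults =
            new FlushThresholdSketch(OptionalLong.empty(), OptionalLong.empty());

        // Illustrative on-heap store: 100 MB of key-value data plus
        // 40 MB of index and metadata, so 140 MB of heap in total.
        long dataSize = 100 * mb;
        long heapSize = dataSize + 40 * mb;

        // Comparing only the data size stays under the 128 MB default,
        // while comparing the full heap size crosses it.
        System.out.println("data-size check flushes: " + defaults.shouldFlush(dataSize, 0));
        System.out.println("heap-size check flushes: " + defaults.shouldFlush(heapSize, 0));

        // An admin could set the off-heap threshold independently.
        FlushThresholdSketch custom =
            new FlushThresholdSketch(OptionalLong.empty(), OptionalLong.of(256 * mb));
        System.out.println("200 MB off-heap flushes: " + custom.shouldFlush(0, 200 * mb));
    }
}
```

With both thresholds left unset the sketch behaves like a single 128MB limit, which matches the default described above; setting only one of them changes the behavior for that memory type alone.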
> Reduce global heap pressure: flush based on heap occupancy
> ----------------------------------------------------------
>
>                 Key: HBASE-18294
>                 URL: https://issues.apache.org/jira/browse/HBASE-18294
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>         Attachments: HBASE-18294.01.patch, HBASE-18294.02.patch,
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch,
> HBASE-18294.06.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the
> store to another threshold (that can be configured with
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size
> (key-value only) to the threshold, where it should compare the heap size
> (which includes index size and metadata).