[
https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467229#comment-16467229
]
Eshcar Hillel commented on HBASE-20188:
---------------------------------------
Hi, wanted to share some interesting insights and benchmark results.
We tried to understand why the benefit of in-memory compaction decreases when
MSLABs are used.
We finally realized this is due to internal fragmentation, which causes
underutilization of memory.
For example, setting the active segment threshold to A=0.02 means the active
segment stores 0.02*128MB=2.56MB. Each such 2.5MB segment occupies 2 chunks
(spanning *4MB*), which are carried in the compaction pipeline until the data is
flushed to disk.
With each 2.5MB of data taking 4MB of space, IMC heap utilization is roughly 65%.
Not ideal.
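To make the arithmetic concrete, here is a minimal sketch of that utilization
computation (plain Java, not HBase code; the class and helper names are made up
for illustration). It assumes the default 128MB flush size
(hbase.hregion.memstore.flush.size) and 2MB MSLAB chunks
(hbase.hregion.memstore.mslab.chunksize):

    // Sketch of the heap-utilization arithmetic for an IMC active segment.
    public class ImcUtilizationSketch {
      static final long FLUSH_SIZE = 128L * 1024 * 1024; // hbase.hregion.memstore.flush.size default
      static final long CHUNK_SIZE = 2L * 1024 * 1024;   // hbase.hregion.memstore.mslab.chunksize default

      // A segment is backed by whole chunks, so a partially filled chunk still costs CHUNK_SIZE.
      static double heapUtilization(double factor) {
        long segmentBytes = (long) (factor * FLUSH_SIZE);
        long chunks = (segmentBytes + CHUNK_SIZE - 1) / CHUNK_SIZE; // ceiling division
        return (double) segmentBytes / (chunks * CHUNK_SIZE);
      }

      public static void main(String[] args) {
        // A=0.02 -> ~2.56MB spread over 2 chunks (4MB) -> ~64% utilization
        System.out.printf("A=0.020 -> %.0f%%%n", 100 * heapUtilization(0.020));
      }
    }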
We therefore experimented with A=0.014, namely an active segment of roughly
1.8MB, which fits into a single chunk (leaving some space for overflow, etc.).
Running workloadx+workloada+workloadc shows a performance improvement in all
these workloads with respect to the default IMC parameters (results are attached
in [^HBase 2.0 performance evaluation - throughput SSD_HDD.pdf]).
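For completeness, a hedged sketch of how the lower factor could be applied. We
assume the factor is exposed via hbase.memstore.inmemoryflush.threshold.factor
(the key CompactingMemStore reads; worth double-checking before relying on it),
and on a real cluster it would be set in hbase-site.xml on the region servers
rather than programmatically:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class ImcFactorConfigSketch {
      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // In-memory flush triggers at factor * flush size:
        // 0.014 * 128MB ~ 1.8MB, which fits a single 2MB MSLAB chunk (~90% utilization).
        conf.setDouble("hbase.memstore.inmemoryflush.threshold.factor", 0.014);
      }
    }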
While the new default improves performance, we believe there are still cases
where an overflow may cause a segment to use 2 chunks instead of one. We have an
idea for how to circumvent this problem.
We suggest moving in two phases:
(1) In HBASE-20390, set the IMC default parameters to best utilize memory also
when MSLABs are used.
(2) In a new Jira, present and implement a solution that avoids the chunk
overflow problem.
In addition, we are considering a further optimization in HBASE-20480 that
potentially reduces the overhead of temporary cell objects while searching in a
CCM segment.
> [TESTING] Performance
> ---------------------
>
> Key: HBASE-20188
> URL: https://issues.apache.org/jira/browse/HBASE-20188
> Project: HBase
> Issue Type: Umbrella
> Components: Performance
> Reporter: stack
> Assignee: stack
> Priority: Blocker
> Fix For: 3.0.0, 2.1.0
>
> Attachments: CAM-CONFIG-V01.patch, HBASE-20188-xac.sh,
> HBASE-20188.sh, HBase 2.0 performance evaluation - 8GB(1).pdf, HBase 2.0
> performance evaluation - 8GB.pdf, HBase 2.0 performance evaluation - Basic vs
> None_ system settings.pdf, HBase 2.0 performance evaluation - throughput
> SSD_HDD.pdf, ITBLL2.5B_1.2.7vs2.0.0_cpu.png,
> ITBLL2.5B_1.2.7vs2.0.0_gctime.png, ITBLL2.5B_1.2.7vs2.0.0_iops.png,
> ITBLL2.5B_1.2.7vs2.0.0_load.png, ITBLL2.5B_1.2.7vs2.0.0_memheap.png,
> ITBLL2.5B_1.2.7vs2.0.0_memstore.png, ITBLL2.5B_1.2.7vs2.0.0_ops.png,
> ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png,
> YCSB_GC_TIME.png, YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png,
> YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png,
> flamegraph-1072.1.svg, flamegraph-1072.2.svg, hbase-env.sh, hbase-site.xml,
> hbase-site.xml, hits.png, hits_with_fp_scheduler.png,
> lock.127.workloadc.20180402T200918Z.svg,
> lock.2.memsize2.c.20180403T160257Z.svg, perregion.png, run_ycsb.sh,
> total.png, tree.txt, workloadx, workloadx
>
>
> How does 2.0.0 compare to old versions? Is it faster, slower? There is a rumor
> that it is much slower, that the problem is the asyncwal writing. Does
> in-memory compaction slow us down or speed us up? What happens when you
> enable offheaping?
> Keep notes here in this umbrella issue. Need to be able to say something
> about perf when 2.0.0 ships.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)