[ https://issues.apache.org/jira/browse/HBASE-18056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015485#comment-16015485 ]
Anastasia Braginsky commented on HBASE-18056:
---------------------------------------------

When two segments are merged, only the CellArrayMap is rebuilt; all the cells remain in place, whether on MSLAB or on the JVM heap.

As for why GC works better with merge, we have a theory, and it is supported by the performance numbers. Of course, we cannot be 100% certain of it. The theory is as follows.

With merge, let's say each of the old CellArrayMaps was 1KB in size and the new CellArrayMap is 2KB. So 2KB are released to GC now. Later the same happens five more times, and we release 5KB (1KB at each point in time). Finally, when flushing to disk, we release 7KB in one piece. Without merge, we release 7 pieces of 1KB.

So although in total there is more memory to collect, we believe it is easier for the GC to cope when memory is released over time and in one piece, as opposed to a single release of multiple pieces.

I am putting a patch here to see the QA results.

> Change CompactingMemStore in BASIC mode to merge multiple segments in pipeline
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-18056
>                 URL: https://issues.apache.org/jira/browse/HBASE-18056
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anastasia Braginsky
>         Attachments: HBASE-18056-V01.patch
>
>
> Under HBASE-16417 it was decided that CompactingMemStore in BASIC mode should
> merge multiple ImmutableSegments in CompactionPipeline. Basic+Merge actually
> demonstrated reduction in GC, alongside improvement in other metrics.
> However, the limit on the number of segments in pipeline is still set to 30.
> Under this JIRA it should be changed to 1, as it was tested under HBASE-16417.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)