[ https://issues.apache.org/jira/browse/HBASE-20312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440902#comment-16440902 ]
Chance Li commented on HBASE-20312: ----------------------------------- HBASE-20312-master.v5.patch fix a bug in close method. > CCSMap: A faster, GC-friendly, less memory Concurrent Map for memstore > ---------------------------------------------------------------------- > > Key: HBASE-20312 > URL: https://issues.apache.org/jira/browse/HBASE-20312 > Project: HBase > Issue Type: New Feature > Components: regionserver > Reporter: Xiang Wang > Assignee: Chance Li > Priority: Major > Fix For: 3.0.0 > > Attachments: 1.1.2-ccsmap-number.png, HBASE-20312-1.3.2.patch, > HBASE-20312-master.v1.patch, HBASE-20312-master.v2.patch, > HBASE-20312-master.v3.patch, HBASE-20312-master.v4.patch, > HBASE-20312-master.v5.patch, ccsmap-branch-1.1.patch, hits.png, jira1.png, > jira2.png, jira3.png, off-heap-test-put-master.png, > on-heap-test-put-master.png > > > Now hbase use ConcurrentSkipListMap as memstore's data structure. > Although MemSLAB reduces memory fragment brought by key-value pairs. > Hundred of millions key-value pairs still make young generation > garbage-collection(gc) stop time long. > > These are 2 gc problems of ConcurrentSkipListMap: > 1. HBase needs 3 objects to store one key-value on expectation. One > Index(skiplist's average node height is 1), one Node, and one KeyValue. Too > many objects are created for memstore. > 2. Recent inserted KeyValue and its map structure(Index, Node) are assigned > on young generation.The card table (for CMS gc algorithm) or RSet(for G1 gc > algorithm) will change frequently on high writing throughput, which makes YGC > slow. > > We devleoped a new skip-list map called CompactdConcurrentSkipListMap(CCSMap > for short), > which provides similary features like ConcurrentSkipListMap but get rid of > Objects for every key-value pairs. > CCSMap's memory structure is like this picture: > !jira1.png! > > One CCSMap consists of a certain number of Chunks. One Chunk consists of a > certain number of nodes. One node is corresspding one element. This element's > all information and its key-value is encoded on a continuous memory segment > without any objects. > Features: > 1. all insert,update,delete operations is lock-free on CCSMap. > 2. Consume less memory, it brings 40% memory saving for 50Byte key-value. > 3. Faster on small key-value because of better cacheline usage. 20~30% better > read/write troughput than ConcurrentSkipListMap for 50Byte key-value. > CCSMap do not support recyle space when deleting element. But it doesn't > matter for hbase because of region flush. > CCSMap has been running on Alibaba's hbase clusters over 17 months, it cuts > down YGC time significantly. here are 2 graph of before and after. > !jira2.png! > !jira3.png! > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)