[ 
https://issues.apache.org/jira/browse/HBASE-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608659#comment-13608659
 ] 

chunhui shen commented on HBASE-8163:
-------------------------------------

bq.How big would size the pool?
The pool size depends on how much the memstore size fluctuates; normally the 
memstore size stays fairly balanced.

Consider the following scenario:
1. Current state: memstore size is 3GB and pool size is 500MB.
2. A flush decreases the memstore size to 2.5GB, so the pool size increases 
to 1GB.
3. Writes increase the memstore size to 2.9GB, so the pool size decreases 
to 600MB.

This means we could set the pool size to the gap between the minimum and 
maximum memstore size of a running cluster.

If the gap is very large, e.g. the memstore is regularly flushed to empty 
while running, using the pool is not appropriate.
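
The sizing rule above can be sketched as a small calculation. This is only an 
illustration: the class and method names are hypothetical and not part of 
HBase, and the samples are the three memstore sizes from the scenario, with 
1GB treated as 1000MB.

```java
/** Hypothetical helper for illustration only; not part of HBase or the patch. */
final class PoolSizing {
    static final long MB = 1024L * 1024L;

    /**
     * Returns the gap between the largest and smallest observed memstore size,
     * which is the pool size suggested in the comment above.
     */
    static long suggestPoolSizeBytes(long[] observedMemstoreBytes) {
        if (observedMemstoreBytes.length == 0) {
            throw new IllegalArgumentException("need at least one sample");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long size : observedMemstoreBytes) {
            min = Math.min(min, size);
            max = Math.max(max, size);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // Memstore sizes from the scenario: 3GB, 2.5GB, 2.9GB.
        long[] samples = {3000 * MB, 2500 * MB, 2900 * MB};
        System.out.println(suggestPoolSizeBytes(samples) / MB + " MB"); // prints "500 MB"
    }
}
```

How the samples are collected (for example from regionserver metrics over a 
representative period) is left open here.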

I hope that answers your question.
                
> MemStoreChunkPool: An improvement for JAVA GC when using MSLAB
> --------------------------------------------------------------
>
>                 Key: HBASE-8163
>                 URL: https://issues.apache.org/jira/browse/HBASE-8163
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-8163v1.patch
>
>
> *Background*:
> When we use MSLAB, we copy the KeyValues together into a structure called 
> *MemStoreLAB$Chunk*, which reduces heap fragmentation. 
> *Problem*:
> When one chunk is full, we create a new chunk, and the old chunk is 
> reclaimed by the JVM once nothing references it.
> In most cases the chunk object is promoted during Young GC, which 
> increases the cost of YGC.
> When does a Chunk object have no references? Both of the following 
> conditions must hold:
> 1. The memstore that the chunk belongs to has been flushed.
> 2. No scanner is open on the memstore that the chunk belongs to.
> *Solution:*
> 1. Create a chunk pool to manage the unreferenced chunks, instead of letting 
> the JVM reclaim them.
> 2. When a Chunk has no references, put it back into the pool.
> 3. The pool has a maximum capacity; once it is reached, further returned 
> chunks are skipped.
> 4. When we need a new Chunk to store KeyValues, get one from the pool if 
> available; otherwise the pool creates a new one. This lets us reuse old 
> chunks.
> *Test results:*
> Environment:
> hbase-version:0.94
> -Xms4G -Xmx4G -Xmn2G
> Row size=50 bytes, Value size=1024 bytes
> 50 concurrent threads per client, insert 10,000,000 rows
> Before:
> Avg write requests per second: 12953
> After testing, final result of jstat -gcutil:
> YGC  YGCT    FGC  FGCT   GCT
> 747  36.503  48   2.492  38.995
> After:
> Avg write requests per second: 14025
> After testing, final result of jstat -gcutil:
> YGC  YGCT    FGC  FGCT   GCT
> 711  20.344  4    0.284  20.628
> *Improvement: YGC time 40+%; WPS 5%+*
> Review board:
> https://reviews.apache.org/r/10056/
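
The four Solution steps in the quoted description can be sketched roughly as 
follows. This is a minimal illustration under assumptions, not the code from 
hbase-8163v1.patch: the class name, method names, and the use of a plain 
byte[] as the chunk are all hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Illustrative sketch only; names are hypothetical and this is not the actual patch. */
final class SimpleChunkPool {
    private final int chunkSize;
    // Bounded queue of chunks that are no longer referenced (steps 1 and 3).
    private final BlockingQueue<byte[]> reclaimed;

    SimpleChunkPool(int chunkSize, int maxChunks) {
        this.chunkSize = chunkSize;
        this.reclaimed = new LinkedBlockingQueue<>(maxChunks);
    }

    /** Step 4: reuse a pooled chunk if one exists, otherwise allocate a new one. */
    byte[] getChunk() {
        byte[] chunk = reclaimed.poll();
        return chunk != null ? chunk : new byte[chunkSize];
    }

    /**
     * Steps 2 and 3: hand an unreferenced chunk back to the pool. Returns false
     * when the pool is already at capacity, in which case the chunk is simply
     * left for the garbage collector.
     */
    boolean putBack(byte[] chunk) {
        return reclaimed.offer(chunk);
    }

    /** Number of chunks currently held for reuse. */
    int pooledCount() {
        return reclaimed.size();
    }
}
```

A caller would invoke putBack only once both conditions from the Problem 
section hold (memstore flushed, no open scanners). A reused chunk still 
contains old bytes, so the allocator writing into it would need to reset its 
own offset.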

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
