[ https://issues.apache.org/jira/browse/HBASE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771564#comment-13771564 ]
Lars Hofhansl commented on HBASE-9553:
--------------------------------------

So I did some simple tests with just byte[]s:
# allocated 10000 chunks of 64k+-100 bytes
# allocated 10000 chunks of 65636 (64k+100) bytes
# allocated 10000 chunks of 64k+-1000 bytes
# allocated 10000 chunks of 66536 (64k+1000) bytes

Each run allocates and GCs 10m of those 64k byte[]s, with various GC settings... There was no discernible difference between the fixed- and variable-sized blocks.

Maybe I should have done this testing before I filed this idea. Going to close as "Invalid".

> Pad HFile blocks to a fixed size before placing them into the blockcache
> ------------------------------------------------------------------------
>
>                 Key: HBASE-9553
>                 URL: https://issues.apache.org/jira/browse/HBASE-9553
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>
> In order to make it easy on the garbage collector and to avoid full compaction phases, we should make sure that all (or at least a large percentage) of the HFile blocks cached in the block cache are exactly the same size.
>
> Currently an HFile block is typically slightly larger than the declared block size, as the block will accommodate the last KV on the block. The padding would be a ColumnFamily option. In many cases 100 bytes would probably be a good value to make all blocks exactly the same size (but of course it depends on the max size of the KVs).
>
> This does not have to be perfect. The more of the blocks evicted and replaced in the block cache that are exactly the same size, the easier it should be on the GC.
>
> Thoughts?
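The allocation test described in the comment could be sketched roughly as follows. This is a reconstruction under stated assumptions, not the actual test code: the class name, random seed, live-set handling, and timing are mine; only the block sizes (64k +/- jitter vs. a fixed 64k + jitter) and the 10000-chunk live set come from the comment.

```java
import java.util.Random;

// Rough sketch of the allocation test described above: repeatedly allocate
// byte[] blocks of either variable (64k +/- jitter) or fixed (64k + jitter)
// size, keeping a bounded live set so older blocks become garbage for the GC.
public class BlockAllocTest {
    static final int BASE = 64 * 1024; // 64k base block size
    static final int KEEP = 10_000;    // live set of 10000 chunks, as in the test

    // Allocate 'total' blocks; only the most recent KEEP stay reachable.
    // Returns the total number of bytes allocated.
    static long run(int total, int jitter, boolean fixed) {
        byte[][] live = new byte[KEEP][];
        Random rnd = new Random(42);
        long allocated = 0;
        for (int i = 0; i < total; i++) {
            int size = fixed
                ? BASE + jitter                                // e.g. 65636 = 64k + 100
                : BASE - jitter + rnd.nextInt(2 * jitter + 1); // 64k +/- jitter
            live[i % KEEP] = new byte[size];
            allocated += size;
        }
        return allocated;
    }

    public static void main(String[] args) {
        // The original test allocated and GC'd 10m blocks per run; a much
        // smaller count is used here just to keep the sketch quick.
        int total = 2_000;
        long t0 = System.nanoTime();
        run(total, 100, false); // variable: 64k +/- 100 bytes
        long t1 = System.nanoTime();
        run(total, 100, true);  // fixed: 65636 bytes
        long t2 = System.nanoTime();
        System.out.printf("variable: %d ms, fixed: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
}
```

As the comment concludes, a run like this shows no discernible timing difference between the fixed- and variable-sized cases: a modern generational collector copes equally well with both, which is why the issue was closed.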