[ https://issues.apache.org/jira/browse/HBASE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134408#comment-13134408 ]
Matt Corgan commented on HBASE-4577:
------------------------------------

I think the intention of adding it was to see how big the data would be in memory as opposed to on disk, which is a valuable metric. However, we're already jumping ahead to doing delta encoding and prefix compression, so there will soon be a need for a third metric to track encoded size. Maybe these 3 names would be better:

storefileSize: size as reported by the filesystem (lzo/gzip compressed)
encodedDataSize: size in the block cache (with delta encoding or prefix compression, but no gzip)
rawDataSize (instead of uncompressedBytes): size when stored in the current concatenated KeyValue format (the biggest of the 3)

The last 2 would only count datablocks of KeyValues. I'm not sure where bloomfilters and indexblocks should be counted into these. Possibly separate metrics?

> Region server reports storefileSizeMB bigger than storefileUncompressedSizeMB
> -----------------------------------------------------------------------------
>
> Key: HBASE-4577
> URL: https://issues.apache.org/jira/browse/HBASE-4577
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.0
> Reporter: Jean-Daniel Cryans
> Assignee: gaojinchao
> Priority: Minor
> Fix For: 0.92.0
>
>
> Minor issue while looking at the RS metrics:
> bq. numberOfStorefiles=8, storefileUncompressedSizeMB=2418, storefileSizeMB=2420, compressionRatio=1.0008
> I guess there's a truncation somewhere when it's adding the numbers up.
> FWIW there's no compression on that table.
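
The reporter's guess about truncation can be illustrated with a small sketch. This is not the HBase code path (which is Java and is not shown in this thread); it is a hypothetical Python example, with made-up file sizes and function names, showing how two totals over the same 8 store files can disagree by several MB if one is truncated to whole megabytes per file and the other only once after summing bytes.

```python
MB = 1024 * 1024

def sum_of_truncated_mb(sizes_bytes):
    """Truncate each file's size to whole MB first, then add them up."""
    return sum(size // MB for size in sizes_bytes)

def truncated_mb_of_sum(sizes_bytes):
    """Add the byte counts first, then truncate to whole MB once."""
    return sum(sizes_bytes) // MB

# Hypothetical sizes: 8 store files, each one byte short of 303 MB.
sizes = [303 * MB - 1] * 8

per_file_total = sum_of_truncated_mb(sizes)  # each file counts as 302 MB
single_total = truncated_mb_of_sum(sizes)    # only 8 bytes are lost overall

print(per_file_total, single_total)
```

With 8 files, per-file truncation can lose up to just under 8 MB in total, which is more than enough to account for a 2 MB gap like 2418 vs. 2420. Whether that is what actually happens in the region server metrics would need to be checked against the code.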