[ https://issues.apache.org/jira/browse/HBASE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936397#comment-13936397 ]
chunhui shen commented on HBASE-10752:
--------------------------------------

As I understand current trunk, a data block is only ever cached in one fixed format (encoded or not encoded).

In HColumnDescriptor:
{code}
  public static final String ENCODE_ON_DISK = // To be removed, it is not used anymore
      "ENCODE_ON_DISK";
{code}

Also, HFileDataBlockEncoderImpl has only one variable, 'encoding', rather than the two variables 'onDisk' and 'inCache'.

Thus, in trunk, I think we could remove 'DataBlockEncoding' directly from BlockCacheKey. There is no need to check 'DataBlockEncoding' after reading a block from the cache.

In addition, it would be better to have a test that shows the case 'there is a potential for caching them twice or missing the cache' mentioned in HBASE-10270.

> Port HBASE-10270 'Remove DataBlockEncoding from BlockCacheKey' to trunk
> -----------------------------------------------------------------------
>
>                 Key: HBASE-10752
>                 URL: https://issues.apache.org/jira/browse/HBASE-10752
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Minor
>             Fix For: 0.99.0
>
>         Attachments: 10752-v1.txt, 10752-v2.txt, 10752-v3.txt
>
>
> The JIRA removes the block encoding from the key and forces the caller of
> HFileReaderV2.readBlock() to specify the expected BlockType as well as the
> expected DataBlockEncoding when these matter. This allows for a decision on
> either of these at read time instead of cache time, puts responsibility where
> appropriate, fixes some cache misses when using the scan preloading (which
> does a read without knowing the type or encoding), allows for the
> BlockCacheKey to be re-used by the L2 BucketCache and sets us up for a future
> CompoundScannerV2 which can read both un-encoded and encoded data blocks.
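
For illustration, the idea in the issue description (an encoding-free cache key, with the caller stating the expected encoding at read time) might look roughly like the sketch below. This is not the HBase code: the class BlockCacheKeySketch, its nested Key, Block and Encoding types, and this readBlock() signature are all hypothetical stand-ins, and the "disk read" is faked.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Illustrative only: none of these types are the real HBase classes. */
public class BlockCacheKeySketch {

    /** Hypothetical stand-in for DataBlockEncoding. */
    public enum Encoding { NONE, PREFIX, DIFF }

    /** Hypothetical cache key: file name and offset only, no encoding. */
    public static final class Key {
        private final String hfileName;
        private final long offset;

        public Key(String hfileName, long offset) {
            this.hfileName = hfileName;
            this.offset = offset;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return offset == other.offset && hfileName.equals(other.hfileName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(hfileName, offset);
        }
    }

    /** Hypothetical cached block that remembers how it was encoded. */
    public static final class Block {
        public final Encoding encoding;

        public Block(Encoding encoding) {
            this.encoding = encoding;
        }
    }

    private final Map<Key, Block> cache = new HashMap<>();
    private final Encoding fileEncoding;

    /** fileEncoding is the single format in which blocks are stored (assumed). */
    public BlockCacheKeySketch(Encoding fileEncoding) {
        this.fileEncoding = fileEncoding;
    }

    /**
     * Looks the block up by (file, offset) alone. Pass null for expected when
     * the caller does not know or care about the encoding, as a preloading
     * scan would; otherwise the encoding is checked at read time.
     */
    public Block readBlock(String hfileName, long offset, Encoding expected) {
        Key key = new Key(hfileName, offset);
        Block block = cache.get(key);
        if (block == null) {
            block = new Block(fileEncoding); // stands in for a real disk read
            cache.put(key, block);
        }
        if (expected != null && block.encoding != expected) {
            throw new IllegalStateException(
                "expected " + expected + " but cached block is " + block.encoding);
        }
        return block;
    }

    public int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        BlockCacheKeySketch reader = new BlockCacheKeySketch(Encoding.PREFIX);
        reader.readBlock("hfile-a", 0L, null);            // preload, encoding unknown
        reader.readBlock("hfile-a", 0L, Encoding.PREFIX); // same key, so a cache hit
        System.out.println("cached blocks: " + reader.size());
    }
}
```

Because the key carries no encoding, the preloading read and the later typed read land on the same cache entry, so the block is neither cached twice nor missed. If blocks really are cached in only one format, as the comment above argues for trunk, the read-time check in this sketch would never fail.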