[ https://issues.apache.org/jira/browse/HBASE-27264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wellington Chevreuil updated HBASE-27264:
-----------------------------------------
    Affects Version/s: 3.0.0-alpha-4

> Add options to consider compressed size when delimiting blocks during hfile
> writes
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-27264
>                 URL: https://issues.apache.org/jira/browse/HBASE-27264
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 3.0.0-alpha-4
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>
> In HBASE-27232 we modified the "hbase.writer.unified.encoded.blocksize.ratio"
> property so that the encoded size can be considered when delimiting hfile
> blocks during writes.
> -Here we propose two additional properties,
> "hbase.block.size.limit.compressed" and "hbase.block.size.max.compressed",
> that would allow the compressed size (if compression is in use) to be
> considered when delimiting blocks during hfile writing. When compression is
> enabled, certain datasets can have very high compression efficiency, so that
> the default 64KB block size and 10GB max file size can lead to hfiles with a
> very large number of blocks.-
> -In this proposal, "hbase.block.size.limit.compressed" is a boolean flag that
> switches to compressed size for delimiting blocks, and
> "hbase.block.size.max.compressed" is an int with the limit, in bytes, for the
> compressed block size, in order to avoid very large uncompressed blocks
> (defaulting to 320KB).-
> Note: As of 15/08/2022, the original proposal above has been modified to
> define a pluggable strategy for predicating block compression rate. Please
> refer to the release notes for more details.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)