[ https://issues.apache.org/jira/browse/HBASE-27232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571581#comment-17571581 ]
Hudson commented on HBASE-27232:
--------------------------------

Results for branch master
[build #642 on builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/642/]: (/) *{color:green}+1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/642/General_20Nightly_20Build_20Report/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/642/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/642/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}


> Fix checking for encoded block size when deciding if block should be closed
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-27232
>                 URL: https://issues.apache.org/jira/browse/HBASE-27232
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0-alpha-3, 2.4.13
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>             Fix For: 2.6.0, 3.0.0-alpha-4
>
>
> In HFileWriterImpl.checkBlockBoundary, we used to consider the unencoded and
> uncompressed data size when deciding to close a block and start a new one.
> That could lead to varying "on-disk" block sizes, depending on the encoding
> efficiency for the cells in each block.
> HBASE-17757 introduced the hbase.writer.unified.encoded.blocksize.ratio
> property, as a ratio of the originally configured block size, to be compared
> against the encoded size. This was an attempt to ensure homogeneous block
> sizes.
> However, the check introduced by HBASE-17757 also considers the
> unencoded size, so in cases where encoding efficiency is higher than
> what's configured in hbase.writer.unified.encoded.blocksize.ratio, it
> still leads to varying block sizes.
> This patch changes that logic to consider only the encoded size if the
> hbase.writer.unified.encoded.blocksize.ratio property is set; otherwise, it
> considers the unencoded size. This gives finer control over the on-disk
> block sizes and the overall number of blocks when encoding is in use.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
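
For readers skimming the description above, the decision it describes can be sketched roughly as follows. This is only an illustration and not the code from the patch: the class BlockBoundarySketch, the method shouldFinishBlock and its parameters are hypothetical names, and treating a ratio of 0 or less as "property not set" is an assumption made for the example.

```java
/** Hypothetical sketch of the block-boundary decision; not the actual HFileWriterImpl code. */
final class BlockBoundarySketch {
  private BlockBoundarySketch() {}

  /**
   * Decides whether the current block should be closed.
   *
   * @param encodedSize   bytes written to the current block after encoding
   * @param unencodedSize size in bytes of the same cells before encoding
   * @param blockSize     configured block size in bytes
   * @param encodedRatio  value of hbase.writer.unified.encoded.blocksize.ratio;
   *                      assumed here to be <= 0 when the property is not set
   */
  static boolean shouldFinishBlock(long encodedSize, long unencodedSize,
      int blockSize, float encodedRatio) {
    if (encodedRatio > 0f) {
      // Ratio is set: only the encoded size is compared, against blockSize * ratio.
      long encodedLimit = (long) (blockSize * encodedRatio);
      return encodedSize >= encodedLimit;
    }
    // Ratio is not set: fall back to comparing the unencoded size.
    return unencodedSize >= blockSize;
  }
}
```

With a 64 KB block size and a ratio of 0.5, this sketch closes a block once 32 KB of encoded data has been written, regardless of how large the unencoded data was.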