apurtell commented on a change in pull request #3244: URL: https://github.com/apache/hbase/pull/3244#discussion_r629757222
##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCellCodec.java
##########
@@ -241,10 +246,27 @@ public void write(Cell cell) throws IOException {
           compression.getDictionary(CompressionContext.DictionaryIndex.FAMILY));
       PrivateCellUtil.compressQualifier(out, cell,
         compression.getDictionary(CompressionContext.DictionaryIndex.QUALIFIER));
-      // Write timestamp, type and value as uncompressed.
+      // Write timestamp, type and value.
       StreamUtils.writeLong(out, cell.getTimestamp());
-      out.write(cell.getTypeByte());
-      PrivateCellUtil.writeValue(out, cell, cell.getValueLength());
+      byte type = cell.getTypeByte();
+      if (compression.getValueCompressor() != null &&
+          cell.getValueLength() > VALUE_COMPRESS_THRESHOLD) {
+        // Try compressing the cell's value
+        byte[] compressedBytes = compressValue(cell);
+        // Only write the compressed value if we have achieved some space savings.
+        if (compressedBytes.length < cell.getValueLength()) {
+          // Set the high bit of type to indicate the value is compressed
+          out.write((byte)(type|0x80));

Review comment:
   > Jetty settled on a size threshold of 23 bytes.

   Thank you @ndimiduk. gzip and deflate are essentially the same thing. Let's opt for the smaller threshold and see how it goes. In the worst case, if the compressor produces output that is not smaller than the original, we just discard it and write the original, so that's not a problem. With a smaller threshold, more values are eligible for compression, so more time will be spent in compression, but presumably with a payoff in space savings, so that seems fine.
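
The threshold-plus-fallback behavior described in the comment can be sketched in isolation as below. This is a minimal illustration, not the code in the pull request: the class `ValueCompressionSketch`, its helpers, and the simplified record layout (one type byte followed by the payload, with no length fields) are hypothetical, the threshold of 23 is simply the Jetty figure quoted above, and `java.util.zip.Deflater` stands in for whatever value compressor the WAL codec is actually configured with.

```java
import java.io.ByteArrayOutputStream;
import java.util.Random;
import java.util.zip.Deflater;

public class ValueCompressionSketch {
  // Assumption: the 23-byte figure quoted in the review, not necessarily the final HBase value.
  static final int VALUE_COMPRESS_THRESHOLD = 23;
  // High bit of the type byte marks a compressed value, as in the diff.
  static final int COMPRESSED_FLAG = 0x80;

  /** Compresses the input with the JDK's Deflater (zlib format). */
  static byte[] deflate(byte[] input) {
    Deflater deflater = new Deflater();
    try {
      deflater.setInput(input);
      deflater.finish();
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[256];
      while (!deflater.finished()) {
        int n = deflater.deflate(buf);
        out.write(buf, 0, n);
      }
      return out.toByteArray();
    } finally {
      deflater.end(); // release native resources
    }
  }

  /** Returns a simplified record: one type byte followed by the (possibly compressed) value. */
  static byte[] encode(byte type, byte[] value) {
    byte outType = type;
    byte[] payload = value;
    // Only values above the threshold are eligible for compression.
    if (value.length > VALUE_COMPRESS_THRESHOLD) {
      byte[] compressed = deflate(value);
      // Keep the compressed form only if it actually saves space; otherwise discard it.
      if (compressed.length < value.length) {
        outType = (byte) (type | COMPRESSED_FLAG);
        payload = compressed;
      }
    }
    byte[] record = new byte[1 + payload.length];
    record[0] = outType;
    System.arraycopy(payload, 0, record, 1, payload.length);
    return record;
  }

  /** True if the record's type byte has the compressed flag set. */
  static boolean isCompressed(byte[] record) {
    return (record[0] & COMPRESSED_FLAG) != 0;
  }

  public static void main(String[] args) {
    byte type = 4; // arbitrary type code for illustration
    byte[] repetitive = new byte[1000]; // all zeros, highly compressible
    byte[] tiny = new byte[10];         // below the threshold, never compressed
    byte[] noisy = new byte[64];        // above the threshold but effectively incompressible
    new Random(42).nextBytes(noisy);

    System.out.println("repetitive compressed: " + isCompressed(encode(type, repetitive)));
    System.out.println("tiny compressed: " + isCompressed(encode(type, tiny)));
    System.out.println("noisy compressed: " + isCompressed(encode(type, noisy)));
  }
}
```

Because the compressed form is kept only when it is strictly smaller, lowering the threshold cannot make a record larger than writing the value uncompressed; the cost is the CPU time spent on compression attempts that end up discarded.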