bharathv commented on a change in pull request #3244:
URL: https://github.com/apache/hbase/pull/3244#discussion_r629772178



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCellCodec.java
##########
@@ -302,14 +343,28 @@ protected Cell parseCell() throws IOException {
        compression.getDictionary(CompressionContext.DictionaryIndex.QUALIFIER));
       pos += elemLen;
 
-      // timestamp, type and value
-      int tsTypeValLen = length - pos;
+      // timestamp
+      long ts = StreamUtils.readLong(in);
+      pos = Bytes.putLong(backingArray, pos, ts);
+      // type and value
+      int typeValLen = length - pos;
       if (tagsLength > 0) {
-        tsTypeValLen = tsTypeValLen - tagsLength - KeyValue.TAGS_LENGTH_SIZE;
+        typeValLen = typeValLen - tagsLength - KeyValue.TAGS_LENGTH_SIZE;
+      }
+      // high bit of type byte is 1 if value is compressed
+      byte type = (byte)in.read();
+      if ((type & 0x80) == 0x80) {

Review comment:
       Ya, my gut feeling says it will be amortized in a mix of small and large values, and in the worst case, where every value is small, the overhead is not that much (since it's off by default anyway, people who choose to turn it on will experiment to see if it actually works for them). So is it worth complicating the code for that? Just throwing this idea out here; a sketch of what I mean is below.
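
   To make the trade-off concrete, here is a minimal, hypothetical Java sketch of that idea (the class name, `writeTypeAndValue`, and `VALUE_COMPRESS_THRESHOLD` are invented for illustration, not code from this patch): compress only values above a size cutoff, and record the choice in the high bit of the type byte, matching the `0x80` flag the hunk above decodes.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Deflater;

// Hypothetical sketch, not the patch's code: small values skip deflate()
// entirely, so they never pay the compression overhead discussed above.
public class ValueThresholdSketch {
  // Illustrative cutoff; a real value would have to come from experiments.
  static final int VALUE_COMPRESS_THRESHOLD = 100;

  static void writeTypeAndValue(OutputStream out, byte type, byte[] value,
      Deflater deflater) throws IOException {
    if (value.length >= VALUE_COMPRESS_THRESHOLD) {
      out.write(type | 0x80);             // high bit set => value is compressed
      deflater.reset();
      deflater.setInput(value);
      deflater.finish();
      byte[] buf = new byte[4096];
      while (!deflater.finished()) {
        int n = deflater.deflate(buf);
        out.write(buf, 0, n);             // compressed value bytes
      }
    } else {
      out.write(type & 0x7F);             // high bit clear => raw value bytes
      out.write(value);
    }
  }
}
```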

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressionContext.java
##########
@@ -34,21 +36,49 @@
 @InterfaceAudience.LimitedPrivate({HBaseInterfaceAudience.COPROC, HBaseInterfaceAudience.PHOENIX})
 public class CompressionContext {
 
-  static final String ENABLE_WAL_TAGS_COMPRESSION =
+  public static final String ENABLE_WAL_TAGS_COMPRESSION =
       "hbase.regionserver.wal.tags.enablecompression";
 
+  public static final String ENABLE_WAL_VALUE_COMPRESSION =
+      "hbase.regionserver.wal.value.enablecompression";
+
   public enum DictionaryIndex {
     REGION, TABLE, FAMILY, QUALIFIER, ROW
   }
 
+  static class ValueCompressor {
+    final Deflater deflater;
+    final Inflater inflater;
+
+    public ValueCompressor() {
+      deflater = new Deflater();
+      inflater = new Inflater();

Review comment:
       I'm talking about performance, not buffer overwrites. Following is the code path:
   
   [FS/Async]WAL#doAppend() -> writer.append() -> cellEncoder.write() -> WALCellCodec.compressValue() -> deflater.deflate()
   
   All the deflate() calls are synchronized on the above `zsRef`. My question was: does this affect the throughput of WAL appends? A rough way to measure the per-call cost is sketched below.
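
   For a rough sense of that cost: `java.util.zip.Deflater` synchronizes each `deflate()` call on its internal `zsRef` lock, but because every `CompressionContext` owns its own `Deflater`, that lock should be uncontended, leaving only the raw per-call work. A throwaway harness like the one below (my own sketch, not a benchmark from this PR) could put a number on the small-value worst case:

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.zip.Deflater;

// Hedged microbenchmark sketch: times deflate() on small, incompressible
// values using a single dedicated Deflater, mirroring how each
// CompressionContext owns its own instance (so zsRef is uncontended).
public class DeflateCostSketch {
  public static void main(String[] args) {
    Deflater deflater = new Deflater();
    byte[] value = new byte[64];                   // a "small" WAL value
    ThreadLocalRandom.current().nextBytes(value);  // random => worst case for deflate
    byte[] out = new byte[256];
    final int iterations = 1_000_000;
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
      deflater.reset();
      deflater.setInput(value);
      deflater.finish();
      while (!deflater.finished()) {
        deflater.deflate(out);
      }
    }
    long elapsed = System.nanoTime() - start;
    System.out.printf("~%d ns per 64-byte deflate%n", elapsed / iterations);
  }
}
```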



