[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229697#comment-13229697 ]
jirapos...@reviews.apache.org commented on HBASE-4608:
------------------------------------------------------

bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java, line 107
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92102#file92102line107>
bq. >
bq. > Nit: Comment here that the status byte is the higher order byte of the dict entry.

Done in next version.

bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java, line 108
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92102#file92102line108>
bq. >
bq. > I assume we're entirely sure that a dictionary will never have > 2^15 entries.
bq.
bq. Li Pi wrote:
bq. It'll start evicting once it hits its max size, which is currently 2^15.

Added a comment to LRUDictionary on what happens when it hits the limit, as well as a comment on the max expected size of the dictionary for any one WAL.

bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java, line 128
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92102#file92102line128>
bq. >
bq. > Nit: The naming convention is a bit strange.
bq. > This one is called uncompress... whereas the method returning a new byte[] is called readCompressed

It's not the worst. It's descriptive, I think.

bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java, line 1678
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92104#file92104line1678>
bq. >
bq. > Have a constructor that takes a compression context too?
bq. > It seems like once anything has been written to the HLog this should be immutable.

That won't work for the writing case, since WAL compression is internal to the wal package and the HLog.Entry used for writing is made outside of the HLog... which means that for the writing case we need the above method.
Might work for the read side, though here we allow 'reuse' of the shell HLog.Entry, so we would need the above method on the read side too....

bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java, line 53
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92105#file92105line53>
bq. >
bq. > COMPRESSED is a bit of a strange name.
bq. > It happens to be a version of the WAL that supports compression, but it is not necessarily compressed.

Added a comment that this enum means 'The WAL version that first had compression'.

bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java, line 303
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92105#file92105line303>
bq. >
bq. > ugly whitespace :)

Fixed in next version.

bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/LRUDictionary.java, line 32
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92107#file92107line32>
bq. >
bq. > I think I had that question to Li Pi... How much memory do we expect this dictionary to take worst case?
bq. > I guess since there is one WAL per region server and it is rolled periodically it is not a problem at all.
bq.
bq. Li Pi wrote:
bq. 65536 * 5 (Regionname, Row key, CF, Column qual, table) * 100 bytes (these are some big names) = 32768000 bytes. Or 32 megabytes.
bq.
bq. If you want to get silly, even at 1kb entries (wtf are you naming things?), it maxes out at 320 megabytes.
bq.
bq. Li Pi wrote:
bq. Actually halve those amounts, 2^15, not 2^16.

Added the above as a class comment.

- Michael

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4328/#review5951
-----------------------------------------------------------

On 2012-03-14 07:34:58, Michael Stack wrote:
bq.
bq. (Updated 2012-03-14 07:34:58)
bq.
bq. Review request for hbase.
bq.
bq. Summary
bq. -------
bq.
bq. See issue
bq.
bq. This addresses bug hbase-4608.
bq. https://issues.apache.org/jira/browse/hbase-4608
bq.
bq. Diffs
bq. -----
bq.
bq. src/main/java/org/apache/hadoop/hbase/HConstants.java 045c6f3
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressionContext.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/Dictionary.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java b5049b1
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java 311ea1b
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/KeyValueCompression.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/LRUDictionary.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java ff63a5f
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java 01ebb5c
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java d8f317c
bq. src/main/java/org/apache/hadoop/hbase/util/Bytes.java de8e40b
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestCompressor.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestKeyValueCompression.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLRUDictionary.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java a11899c
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplayCompressed.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/4328/diff
bq.
bq. Testing
bq. -------
bq.
bq. Thanks,
bq.
bq. Michael
bq.

> HLog Compression
> ----------------
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
> Issue Type: New Feature
> Reporter: Li Pi
> Assignee: stack
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt,
> 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt,
> 4608v18.txt, 4608v23.txt, 4608v24.txt, 4608v25.txt, 4608v27.txt, 4608v5.txt,
> 4608v6.txt, 4608v7.txt, 4608v8fixed.txt, hbase-4608-v28-delta.txt,
> hbase-4608-v28.txt, hbase-4608-v28.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends
> across different datanodes. We can speed up this process by compressing the
> HLog. The current plan involves using a dictionary to compress table name, region
> id, cf name, and possibly other bits of repeated data. Also, the HLog format may
> be changed in other ways to produce a smaller HLog.
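
For readers following the review, the dictionary scheme discussed in this thread (a status byte that doubles as the high-order byte of a two-byte dictionary index, a cap of 2^15 entries, and LRU eviction once the cap is hit) might be sketched roughly as below. This is an illustration only, not the code from the patch: the names `WalDictSketch`, `LruDict`, `write`, `read`, and `worstCaseBytes`, the marker value -1, and the fixed 4-byte length prefix are all assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; names and wire format are illustrative, not the HBase patch.
class WalDictSketch {
    static final byte NOT_IN_DICTIONARY = -1;  // assumed status byte for a literal entry
    static final int MAX_ENTRIES = 1 << 15;    // 2^15, the limit named in the review

    /** Bounded dictionary that reuses the least recently used slot when full. */
    static class LruDict {
        private final byte[][] slots;
        // Access-ordered map: iteration starts at the least recently used entry.
        private final LinkedHashMap<ByteBuffer, Short> index =
            new LinkedHashMap<>(16, 0.75f, true);

        LruDict(int capacity) {
            if (capacity < 1 || capacity > MAX_ENTRIES) {
                throw new IllegalArgumentException("capacity must be in [1, 2^15]");
            }
            slots = new byte[capacity][];
        }

        /** Returns the slot holding data, or NOT_IN_DICTIONARY; a hit refreshes recency. */
        short find(byte[] data) {
            Short slot = index.get(ByteBuffer.wrap(data));
            return slot == null ? NOT_IN_DICTIONARY : slot;
        }

        /** Stores data, evicting the least recently used entry if the dictionary is full. */
        short add(byte[] data) {
            short slot;
            if (index.size() < slots.length) {
                slot = (short) index.size();
            } else {
                Iterator<Map.Entry<ByteBuffer, Short>> it = index.entrySet().iterator();
                slot = it.next().getValue();
                it.remove();
            }
            slots[slot] = data.clone();
            index.put(ByteBuffer.wrap(slots[slot]), slot);
            return slot;
        }

        /** Reads a slot and refreshes its recency, mirroring find() on the writer side. */
        byte[] get(short slot) {
            byte[] data = slots[slot];
            index.get(ByteBuffer.wrap(data));
            return data.clone();
        }
    }

    /** Writes either a 2-byte slot index or a literal (status byte, length, bytes). */
    static void write(ByteBuffer out, byte[] data, LruDict dict) {
        short slot = dict.find(data);
        if (slot == NOT_IN_DICTIONARY) {
            out.put(NOT_IN_DICTIONARY).putInt(data.length).put(data);
            dict.add(data);
        } else {
            // Big-endian short: the high-order byte comes first and is always >= 0.
            out.putShort(slot);
        }
    }

    /** Reads one field; the reader's dictionary must see the same sequence as the writer's. */
    static byte[] read(ByteBuffer in, LruDict dict) {
        byte status = in.get();
        if (status == NOT_IN_DICTIONARY) {
            byte[] data = new byte[in.getInt()];
            in.get(data);
            dict.add(data);
            return data;
        }
        short slot = (short) ((status << 8) | (in.get() & 0xFF));
        return dict.get(slot);
    }

    /** Rough upper bound on dictionary memory: entries x dictionaries x bytes per entry. */
    static long worstCaseBytes(int entries, int dictionaries, int bytesPerEntry) {
        return (long) entries * dictionaries * bytesPerEntry;
    }
}
```

Because slot numbers stay below 2^15, the first byte of an index is never negative, so under these assumptions a reader can tell a literal from a dictionary hit by looking at that one byte. `worstCaseBytes(1 << 15, 5, 100)` gives 16,384,000 bytes, which is the halved figure from the thread.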