[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229697#comment-13229697 ]
[email protected] commented on HBASE-4608:
------------------------------------------------------
bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java, line 107
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92102#file92102line107>
bq. >
bq. > Nit: Comment here that the status byte is the higher order byte of the dict entry.
done in next version
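To illustrate the point about the status byte, here is a minimal sketch of how a single leading byte can serve both as a "not in dictionary" marker and as the high-order byte of a two-byte dictionary index. It is not the code under review: the class name StatusByteSketch, the marker value -1, and the length-prefixed literal layout are assumptions made for illustration. The trick only works while indices stay below 2^15, because then the high-order byte is never 0xFF.

```java
import java.nio.ByteBuffer;

class StatusByteSketch {
    // Assumed marker: a leading byte of -1 (0xFF) means "literal bytes follow".
    static final byte NOT_IN_DICTIONARY = -1;
    // With at most 2^15 entries, the high-order byte of an index is 0x00..0x7F.
    static final int MAX_ENTRIES = 1 << 15;

    /** Writes a dictionary index as two bytes, high-order byte first. */
    static void writeIndex(ByteBuffer out, int idx) {
        if (idx < 0 || idx >= MAX_ENTRIES) {
            throw new IllegalArgumentException("index out of range: " + idx);
        }
        out.putShort((short) idx); // ByteBuffer is big-endian by default
    }

    /** Writes the marker, a 4-byte length, then the raw bytes. */
    static void writeLiteral(ByteBuffer out, byte[] data) {
        out.put(NOT_IN_DICTIONARY);
        out.putInt(data.length);
        out.put(data);
    }

    /** Returns the dictionary index, or -1 after storing a literal in literalOut[0]. */
    static int read(ByteBuffer in, byte[][] literalOut) {
        byte status = in.get();
        if (status == NOT_IN_DICTIONARY) {
            byte[] data = new byte[in.getInt()];
            in.get(data);
            literalOut[0] = data;
            return -1;
        }
        // The status byte doubles as the high-order byte of the index.
        return ((status & 0xFF) << 8) | (in.get() & 0xFF);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        writeIndex(buf, 32767);
        writeLiteral(buf, new byte[] {1, 2, 3});
        buf.flip();
        byte[][] literal = new byte[1][];
        System.out.println(read(buf, literal));
        System.out.println(read(buf, literal));
        System.out.println(literal[0].length);
    }
}
```

Reading the first byte is enough to decide which branch to take, so a dictionary hit costs two bytes on the wire and a miss costs one marker byte plus the literal.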
bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java, line 108
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92102#file92102line108>
bq. >
bq. > I assume we're entirely sure that a dictionary will never have > 2^15 entries.
bq.
bq. Li Pi wrote:
bq. It'll start evicting once it hits its max size, which is currently 2^15.
Added a comment to LRUDictionary on what happens when it hits its limit, as well
as a comment on the maximum expected size of the dictionary for any one WAL.
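The eviction behaviour discussed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the LRUDictionary from the patch: the class name LruDictionarySketch and its methods are hypothetical, entries are Strings rather than byte arrays, and the evicted entry's index is simply handed to the newcomer.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

class LruDictionarySketch {
    private final int capacity;
    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<String, Integer> indexByEntry;
    private final String[] entryByIndex;

    LruDictionarySketch(int capacity) {
        this.capacity = capacity;
        this.indexByEntry = new LinkedHashMap<>(16, 0.75f, true);
        this.entryByIndex = new String[capacity];
    }

    /** Returns the entry's index if known; otherwise adds it and returns -1. */
    int findOrAdd(String entry) {
        Integer idx = indexByEntry.get(entry); // get() also refreshes recency
        if (idx != null) {
            return idx;
        }
        int slot;
        if (indexByEntry.size() < capacity) {
            slot = indexByEntry.size(); // next unused index
        } else {
            // Full: evict the least recently used entry and reuse its index.
            Iterator<Map.Entry<String, Integer>> it = indexByEntry.entrySet().iterator();
            slot = it.next().getValue();
            it.remove();
        }
        indexByEntry.put(entry, slot);
        entryByIndex[slot] = entry;
        return -1;
    }

    String get(int index) {
        return entryByIndex[index];
    }

    int size() {
        return indexByEntry.size();
    }
}
```

Constructed with a capacity of 1 << 15, such a dictionary never hands out an index above 32767. A writer and a reader that apply the same sequence of findOrAdd calls evict the same entries, so their indices stay in agreement.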
bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java, line 128
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92102#file92102line128>
bq. >
bq. > Nit: The naming convention is a bit strange.
bq. > This one is called uncompress... whereas the method returning a new byte[] is called readCompressed
It's not the worst. It's descriptive, I think.
bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java, line 1678
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92104#file92104line1678>
bq. >
bq. > Have a constructor that takes a compression context too?
bq. > It seems like once anything has been written to the HLog this should be immutable.
That won't work for the write case: WAL compression is internal to the wal
package, but the HLog.Entry used for writing is created outside of HLog, so for
writing we need the above method. A constructor might work on the read side,
though there we allow 'reuse' of the shell HLog.Entry, so we would need the
above method on the read side too.
bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java, line 53
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92105#file92105line53>
bq. >
bq. > COMPRESSED is a bit of a strange name.
bq. > It happens to be a version of the WAL that supports compression, but it is not necessarily compressed.
Added a comment that this enum value means 'the WAL version that first had compression'.
bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java, line 303
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92105#file92105line303>
bq. >
bq. > ugly whitespace :)
Fixed in next version.
bq. On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/LRUDictionary.java, line 32
bq. > <https://reviews.apache.org/r/4328/diff/2/?file=92107#file92107line32>
bq. >
bq. > I think I had that question to Li Pi... How much memory do we expect this dictionary to take worst case?
bq. > I guess since there is one WAL per region server and it is rolled periodically it is not a problem at all.
bq.
bq. Li Pi wrote:
bq. 65536 * 5 (Regionname, Row key, CF, Column qual, table) * 100 bytes (these are some big names) = 32768000 bytes. Or 32 megabytes.
bq.
bq. If you want to get silly, even at 1kb entries (wtf are you naming things?), it maxes out at 320 megabytes.
bq.
bq. Li Pi wrote:
bq. Actually halve those amounts, 2^15, not 2^16.
Added the above as a class comment.
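For reference, the worst-case arithmetic above can be reproduced with a few lines of Java. The class and method names are made up for this example, and the estimate counts only the raw entry bytes, not any per-entry object or map overhead in the JVM.

```java
class DictionaryMemoryEstimate {
    /** Raw bytes held if every dictionary is full of entries of the given size. */
    static long worstCaseBytes(int entriesPerDictionary, int dictionaries, int bytesPerEntry) {
        return (long) entriesPerDictionary * dictionaries * bytesPerEntry;
    }

    public static void main(String[] args) {
        // Five dictionaries: region name, row key, CF, column qualifier, table.
        int dictionaries = 5;
        // Original figures, computed with 2^16 entries per dictionary.
        System.out.println(worstCaseBytes(1 << 16, dictionaries, 100));
        System.out.println(worstCaseBytes(1 << 16, dictionaries, 1024));
        // Corrected figures with the actual limit of 2^15 entries.
        System.out.println(worstCaseBytes(1 << 15, dictionaries, 100));
        System.out.println(worstCaseBytes(1 << 15, dictionaries, 1024));
    }
}
```

With the 2^15 limit this gives 16,384,000 bytes (about 16 megabytes) at 100-byte entries and 167,772,160 bytes (160 megabytes) at 1 KB entries, which is half of the figures first quoted.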
- Michael
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4328/#review5951
-----------------------------------------------------------
On 2012-03-14 07:34:58, Michael Stack wrote:
bq.
bq. (Updated 2012-03-14 07:34:58)
bq.
bq.
bq. Review request for hbase.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. See issue
bq.
bq.
bq. This addresses bug hbase-4608.
bq. https://issues.apache.org/jira/browse/hbase-4608
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. src/main/java/org/apache/hadoop/hbase/HConstants.java 045c6f3
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressionContext.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/Dictionary.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java b5049b1
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java 311ea1b
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/KeyValueCompression.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/LRUDictionary.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java ff63a5f
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java 01ebb5c
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java d8f317c
bq. src/main/java/org/apache/hadoop/hbase/util/Bytes.java de8e40b
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestCompressor.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestKeyValueCompression.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLRUDictionary.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java a11899c
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplayCompressed.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/4328/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq.
bq. Thanks,
bq.
bq. Michael
bq.
bq.
> HLog Compression
> ----------------
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
> Issue Type: New Feature
> Reporter: Li Pi
> Assignee: stack
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt,
> 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt,
> 4608v18.txt, 4608v23.txt, 4608v24.txt, 4608v25.txt, 4608v27.txt, 4608v5.txt,
> 4608v6.txt, 4608v7.txt, 4608v8fixed.txt, hbase-4608-v28-delta.txt,
> hbase-4608-v28.txt, hbase-4608-v28.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends
> across different datanodes. We can speed up this process by compressing the
> HLog. Current plan involves using a dictionary to compress table name, region
> id, cf name, and possibly other bits of repeated data. Also, HLog format may
> be changed in other ways to produce a smaller HLog.