[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175296#comment-13175296 ]
[email protected] commented on HBASE-4608:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2740/
-----------------------------------------------------------
(Updated 2011-12-23 06:00:24.065183)
Review request for hbase, Eli Collins and Todd Lipcon.
Changes
-------
Some new things, for WALCompress.
I've modified TestWALReplay to test compression. This is a quick hack to get
effective test cases; I'll build my own subset later.
Integration is done, including config, but it doesn't all work yet. It worked
before I tried compressing HLogKeys. SequenceFile seems to try to read them out
of order, which causes it to hit empty dictionary entries. I'm not sure what to
do about this. Any advice?
If you only compress KeyValues/WALEdits, it works fine.
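To illustrate the failure mode described above, here is a minimal sketch of an
order-dependent dictionary. It is not the code in this patch: the class names
(DictionarySketch, Encoder, Decoder), the -1 marker, and the record layout are
all assumptions made up for the example. The point is only that an index is
meaningful to the reader after it has seen the record that defined it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and record layout are illustrative assumptions.
public class DictionarySketch {
    // Marker meaning "a literal follows; add it to the dictionary".
    static final short NOT_IN_DICTIONARY = -1;

    /** Write side: assigns each distinct string the next free index. */
    static class Encoder {
        private final Map<String, Short> indexOf = new HashMap<>();

        byte[] encode(String value) {
            Short index = indexOf.get(value);
            if (index != null) {
                // Seen before: write only the two-byte index.
                return ByteBuffer.allocate(2).putShort(index).array();
            }
            // First occurrence: write marker, length, and the literal bytes.
            byte[] literal = value.getBytes(StandardCharsets.UTF_8);
            indexOf.put(value, (short) indexOf.size());
            return ByteBuffer.allocate(4 + literal.length)
                    .putShort(NOT_IN_DICTIONARY)
                    .putShort((short) literal.length)
                    .put(literal)
                    .array();
        }
    }

    /** Read side: rebuilds the same dictionary, but only if records arrive in write order. */
    static class Decoder {
        private final List<String> entries = new ArrayList<>();

        String decode(byte[] record) {
            ByteBuffer in = ByteBuffer.wrap(record);
            short index = in.getShort();
            if (index == NOT_IN_DICTIONARY) {
                byte[] literal = new byte[in.getShort()];
                in.get(literal);
                String value = new String(literal, StandardCharsets.UTF_8);
                entries.add(value);
                return value;
            }
            if (index >= entries.size()) {
                // The defining record has not been read yet.
                throw new IllegalStateException("empty dictionary entry " + index);
            }
            return entries.get(index);
        }
    }

    public static void main(String[] args) {
        Encoder encoder = new Encoder();
        byte[] first = encoder.encode("usertable");   // literal, defines index 0
        byte[] second = encoder.encode("usertable");  // index 0 only

        Decoder inOrder = new Decoder();
        System.out.println(inOrder.decode(first));
        System.out.println(inOrder.decode(second));

        Decoder outOfOrder = new Decoder();
        try {
            outOfOrder.decode(second); // read before the record that defines index 0
        } catch (IllegalStateException e) {
            System.out.println("out of order: " + e.getMessage());
        }
    }
}
```

In this sketch the second record is only two bytes, which is where the savings
come from, but a reader that sees it before the first record has nothing to
resolve the index against.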
Summary
-------
Here's what I have so far. Things are written, and "should work". I need to
rework the test cases to test this, and put something in the config file to
enable/disable. Obviously this isn't ready for commit at the moment, but I can
get those two things done pretty quickly.
Obviously the dictionary is incredibly simple at the moment; I'll come up with
something cooler soon. Let me know how this looks.
This addresses bug HBASE-4608.
https://issues.apache.org/jira/browse/HBase-4608
Diffs (updated)
-----
src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 24407af
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java f067221
src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java d9cd6de
src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java cbef70f
src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java e1117ef
src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java 59910bf
Diff: https://reviews.apache.org/r/2740/diff
Testing
-------
Thanks,
Li
> HLog Compression
> ----------------
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
> Issue Type: New Feature
> Reporter: Li Pi
> Assignee: Li Pi
> Attachments: 4608v1.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends
> across different datanodes. We can speed up this process by compressing the
> HLog. Current plan involves using a dictionary to compress table name, region
> id, cf name, and possibly other bits of repeated data. Also, HLog format may
> be changed in other ways to produce a smaller HLog.