[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175315#comment-13175315 ]
jirapos...@reviews.apache.org commented on HBASE-4608:
------------------------------------------------------

bq. On 2011-12-23 06:34:53, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java, line 73
bq. > <https://reviews.apache.org/r/2740/diff/2/?file=65772#file65772line73>
bq. >
bq. >     What if you have a hash collision?
bq. >     You now overwrite the old value that just happens to have the same hash code. Is that OK?
bq.
bq. Li Pi wrote:
bq.     I overwrite the old value. As long as we do it for both reads and writes, that's okay! (The state of the dictionary must be consistent).

I see, because read and write would do that in the same order.


bq. On 2011-12-23 06:34:53, Lars Hofhansl wrote:
bq. > src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java, line 130
bq. > <https://reviews.apache.org/r/2740/diff/2/?file=65774#file65774line130>
bq. >
bq. >     Would sure be nice if we had a KeyValue interface and the implementations would just do the right thing.
bq.
bq. Li Pi wrote:
bq.     Didn't want to create a new KeyValue, or modify it, rather - thus the CompressedKeyValue thing.
bq.
bq.     I can refactor this.

That was just a general comment. I've been thinking quite often how nice our life would be if KeyValue was just an interface rather than a concrete class.
Fixing that would be a huge PITA... Different jira :)


- Lars


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2740/#review4100
-----------------------------------------------------------


On 2011-12-23 06:00:24, Li Pi wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2740/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-12-23 06:00:24)
bq.
bq.
bq. Review request for hbase, Eli Collins and Todd Lipcon.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. Here's what I have so far. Things are written, and "should work". I need to rework the test cases to test this, and put something in the config file to enable/disable. Obviously this isn't ready for commit at the moment, but I can get those two things done pretty quickly.
bq.
bq. Obviously the dictionary is incredibly simple at the moment; I'll come up with something cooler soon. Let me know how this looks.
bq.
bq.
bq. This addresses bug HBASE-4608.
bq.     https://issues.apache.org/jira/browse/HBase-4608
bq.
bq.
bq. Diffs
bq. -----
bq.
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java PRE-CREATION
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java PRE-CREATION
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 24407af
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java f067221
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java d9cd6de
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java cbef70f
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java PRE-CREATION
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java PRE-CREATION
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java e1117ef
bq.   src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java 59910bf
bq.
bq. Diff: https://reviews.apache.org/r/2740/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq.
bq. Thanks,
bq.
bq. Li
bq.
bq.

> HLog Compression
> ----------------
>
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: Li Pi
>         Attachments: 4608v1.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends
> across different datanodes. We can speed up this process by compressing the
> HLog.
> Current plan involves using a dictionary to compress table name, region
> id, cf name, and possibly other bits of repeated data. Also, HLog format may
> be changed in other ways to produce a smaller HLog.
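
To make the overwrite-on-collision point from the review thread concrete, here is a minimal Java sketch of a slot dictionary in which a new value simply replaces whatever hashed to the same slot. It is not the SimpleDictionary from the patch: the class name OverwritingDictionary, the Token type, and the encode/decode helpers are all hypothetical, made up for illustration. The only property it tries to show is the one discussed above: if the writer and the reader apply the same put calls in the same order, their dictionaries stay identical, so a collision costs a repeated literal rather than correctness.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; NOT the SimpleDictionary from the HBASE-4608 patch. */
final class OverwritingDictionary {
  private final byte[][] slots;

  OverwritingDictionary(int size) {
    slots = new byte[size][];
  }

  /** Slot index derived from the content hash; floorMod keeps it non-negative. */
  int slotFor(byte[] value) {
    return Math.floorMod(Arrays.hashCode(value), slots.length);
  }

  /** True only if the slot for this value currently holds exactly this value. */
  boolean contains(byte[] value) {
    return Arrays.equals(slots[slotFor(value)], value);
  }

  /** Stores the value, overwriting whatever occupied its slot. */
  void put(byte[] value) {
    slots[slotFor(value)] = value.clone();
  }

  byte[] get(int slot) {
    return slots[slot];
  }

  /** One encoded entry: either a slot reference or a value sent in full. */
  static final class Token {
    final int slot;       // meaningful only when literal == null
    final byte[] literal; // non-null when the value is sent in full

    Token(int slot, byte[] literal) {
      this.slot = slot;
      this.literal = literal;
    }
  }

  /** Writer side: emit a reference on a hit, otherwise a literal plus an overwrite. */
  static List<Token> encode(List<byte[]> values, int size) {
    OverwritingDictionary dict = new OverwritingDictionary(size);
    List<Token> out = new ArrayList<>();
    for (byte[] v : values) {
      if (dict.contains(v)) {
        out.add(new Token(dict.slotFor(v), null));
      } else {
        dict.put(v);
        out.add(new Token(-1, v.clone()));
      }
    }
    return out;
  }

  /** Reader side: replays the same overwrites in the same order as the writer. */
  static List<byte[]> decode(List<Token> tokens, int size) {
    OverwritingDictionary dict = new OverwritingDictionary(size);
    List<byte[]> out = new ArrayList<>();
    for (Token t : tokens) {
      if (t.literal != null) {
        dict.put(t.literal);
        out.add(t.literal.clone());
      } else {
        out.add(dict.get(t.slot).clone());
      }
    }
    return out;
  }
}
```

With a dictionary of size 1 every distinct value collides, so encoding the sequence t1, t2, t1, t1 emits three literals followed by one slot reference, and decoding those tokens with the same size returns the original four values.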