[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145965#comment-13145965
 ] 

jirapos...@reviews.apache.org commented on HBASE-4608:
------------------------------------------------------



bq.  On 2011-11-07 23:39:59, Lars Hofhansl wrote:
bq.  > Cool stuff.
bq.  > 
bq.  > I am probably just missing something... But when is the dictionary itself stored? Don't we need to read out the logs again?
bq.  > 
bq.  > Just so I understand: We build up the dictionary as we go along. In the beginning most things won't be in the dictionary, so we write them out and add them to the dict, and from that time on when we encounter them again we just write the index.
bq.  > On the read we could also build up the dict as we go along, because when values weren't in the dictionary they were written into the file, so we can recreate the dictionary as we read. Right?
bq.  > 
bq.  > (As I said, I am probably missing something).
bq.  > 
bq.  > See minor comments inline.

You aren't missing anything! That's exactly how it works.

Each WAL starts off with a brand new shiny dictionary. We build up the 
dictionary as we write, and when we read, we start off with a shiny new 
dictionary again. The dictionary is recreated upon read.
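
To make the write/read symmetry concrete, here is a minimal sketch of the idea. It is not the code in the patch: the class names (SketchDictionary, SketchWalCodec), the one-byte LITERAL/INDEXED tag, the 4-byte length and index fields, and the use of strings rather than byte arrays are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical append-only dictionary; not the SimpleDictionary from the patch.
final class SketchDictionary {
  private final Map<String, Integer> indexOf = new HashMap<>();
  private final List<String> entries = new ArrayList<>();

  // Returns the index of a known entry, or -1 if it has not been seen yet.
  int find(String value) {
    Integer index = indexOf.get(value);
    return index == null ? -1 : index;
  }

  // Appends a new entry; it receives the next free index.
  void add(String value) {
    indexOf.put(value, entries.size());
    entries.add(value);
  }

  String get(int index) {
    return entries.get(index);
  }
}

// Hypothetical codec showing that the reader can rebuild the dictionary
// purely from the literals it encounters in the log.
final class SketchWalCodec {
  private static final byte LITERAL = 0; // value follows in full
  private static final byte INDEXED = 1; // only a dictionary index follows

  static byte[] write(List<String> values) {
    SketchDictionary dict = new SketchDictionary(); // fresh dictionary per log
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      for (String value : values) {
        int index = dict.find(value);
        if (index < 0) {
          // First occurrence: write the value itself and remember it.
          byte[] raw = value.getBytes(StandardCharsets.UTF_8);
          out.writeByte(LITERAL);
          out.writeInt(raw.length);
          out.write(raw);
          dict.add(value);
        } else {
          // Repeat occurrence: write only the index.
          out.writeByte(INDEXED);
          out.writeInt(index);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  static List<String> read(byte[] log) {
    SketchDictionary dict = new SketchDictionary(); // fresh dictionary again
    List<String> values = new ArrayList<>();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(log))) {
      while (in.available() > 0) {
        if (in.readByte() == LITERAL) {
          byte[] raw = new byte[in.readInt()];
          in.readFully(raw);
          String value = new String(raw, StandardCharsets.UTF_8);
          dict.add(value); // same insertion order as the writer
          values.add(value);
        } else {
          values.add(dict.get(in.readInt()));
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return values;
  }
}
```

Because both sides add entries in the same order, an index written by the writer always refers to the same entry on the reader, so the dictionary itself never has to be stored.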


bq.  On 2011-11-07 23:39:59, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1088
bq.  > <https://reviews.apache.org/r/2740/diff/1/?file=56620#file56620line1088>
bq.  >
bq.  >     This is functionally the same as before, but less readable. I don't think this leads to much performance improvement.

Good point, I can get rid of this.


bq.  On 2011-11-07 23:39:59, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java, line 2
bq.  > <https://reviews.apache.org/r/2740/diff/1/?file=56621#file56621line2>
bq.  >
bq.  >     I think we leave out the line with the year now.
bq.  >     Lots of leading whitespace and weird indentation in this file.

I need to fix my Eclipse autoformatter. I will take care of this and the other 
formatting problems.


bq.  On 2011-11-07 23:39:59, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java, line 62
bq.  > <https://reviews.apache.org/r/2740/diff/1/?file=56621#file56621line62>
bq.  >
bq.  >     Passing 0 here? I might be missing something, but looking down at readCompressed that looks wrong.

We pass a 0 because we don't encode the length of the qualifier. I don't know 
why we don't, but that's how KeyValue does it.


bq.  On 2011-11-07 23:39:59, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java, line 157
bq.  > <https://reviews.apache.org/r/2740/diff/1/?file=56624#file56624line157>
bq.  >
bq.  >     Could we have a no-op compressor instead?

No-op compressor? As in one that does nothing?
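
One reading of the suggestion, which is an assumption and not something the patch or the review confirms, is a null-object style compressor: the caller always goes through a compressor, and the "off" case is an implementation that passes bytes through untouched. The interface and class names below are hypothetical and do not exist in the patch.

```java
// Hypothetical interface; the patch under review does not define it.
interface SketchCompressor {
  byte[] compress(byte[] raw);

  byte[] decompress(byte[] stored);
}

// The "does nothing" variant: bytes pass through unchanged in both directions.
final class NoOpCompressor implements SketchCompressor {
  @Override
  public byte[] compress(byte[] raw) {
    return raw;
  }

  @Override
  public byte[] decompress(byte[] stored) {
    return stored;
  }
}

// Hypothetical caller: it never checks whether compression is enabled,
// it just uses whichever compressor it was constructed with.
final class SketchEditWriter {
  private final SketchCompressor compressor;

  SketchEditWriter(SketchCompressor compressor) {
    this.compressor = compressor;
  }

  byte[] encode(byte[] field) {
    return compressor.compress(field);
  }
}
```

With this shape, disabling compression would mean constructing the writer with a NoOpCompressor rather than branching at each call site.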


- Li


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2740/#review3093
-----------------------------------------------------------


On 2011-11-07 23:12:37, Li Pi wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2740/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-11-07 23:12:37)
bq.  
bq.  
bq.  Review request for hbase, Eli Collins and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Here's what I have so far. Things are written, and "should work". I need to rework the test cases to test this, and put something in the config file to enable/disable it. Obviously this isn't ready for commit at the moment, but I can get those two things done pretty quickly.
bq.  
bq.  Obviously the dictionary is incredibly simple at the moment; I'll come up with something cooler soon. Let me know how this looks.
bq.  
bq.  
bq.  This addresses bug HBase-4608.
bq.      https://issues.apache.org/jira/browse/HBase-4608
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/KeyValue.java e68e486
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java e1117ef
bq.  
bq.  Diff: https://reviews.apache.org/r/2740/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Li
bq.  
bq.


                
> HLog Compression
> ----------------
>
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: Li Pi
>         Attachments: 4608v1.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.


        
