[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226477#comment-13226477 ]
stack commented on HBASE-4608:
------------------------------
The TestLRUDictionary test looks like it could be fatter. Looks like you
should be able to throw a bunch more combinations at it, and exercise the new
BidirectionalLRUMap type harder. Better to find the issues here in a unit test
than....
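To make "more combinations" concrete, here is a rough sketch of a randomized round-trip check. It does not use the patch's classes: LruDict below is a hypothetical stand-in built on java.util.LinkedHashMap, and the method names find/get/add are assumptions for illustration only. A writer and a reader dictionary are driven with the same token stream across many capacities and alphabet sizes, and the decoded stream must equal the input.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class LruDictRoundTrip {
    // Hypothetical stand-in for an LRU dictionary; NOT the patch's BidirectionalLRUMap.
    static class LruDict {
        private final int capacity;
        // accessOrder=true: get() moves the entry to the most-recently-used end.
        private final LinkedHashMap<String, Integer> toIndex = new LinkedHashMap<>(16, 0.75f, true);
        private final String[] byIndex;

        LruDict(int capacity) {
            this.capacity = capacity;
            this.byIndex = new String[capacity];
        }

        // Returns the index of s (and marks it recently used), or -1 on a miss.
        int find(String s) {
            Integer i = toIndex.get(s);
            return i == null ? -1 : i;
        }

        // Resolves an index back to its entry and marks it recently used.
        String get(int idx) {
            String s = byIndex[idx];
            toIndex.get(s);
            return s;
        }

        // Adds s, reusing the least-recently-used slot once the dictionary is full.
        int add(String s) {
            int idx;
            if (toIndex.size() < capacity) {
                idx = toIndex.size();
            } else {
                Iterator<Map.Entry<String, Integer>> it = toIndex.entrySet().iterator();
                idx = it.next().getValue();
                it.remove();
            }
            toIndex.put(s, idx);
            byIndex[idx] = s;
            return idx;
        }
    }

    // Encodes with a writer dictionary and decodes with a separate reader dictionary.
    public static List<String> roundTrip(List<String> input, int capacity) {
        LruDict writer = new LruDict(capacity);
        LruDict reader = new LruDict(capacity);
        List<String> decoded = new ArrayList<>();
        for (String s : input) {
            int idx = writer.find(s);
            if (idx < 0) {          // miss: ship the literal, both sides learn it
                writer.add(s);
                reader.add(s);
                decoded.add(s);
            } else {                // hit: ship only the index
                decoded.add(reader.get(idx));
            }
        }
        return decoded;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        int combos = 0;
        for (int capacity = 1; capacity <= 8; capacity++) {
            for (int alphabet = 1; alphabet <= 12; alphabet++) {
                List<String> input = new ArrayList<>();
                for (int i = 0; i < 200; i++) {
                    input.add("k" + rnd.nextInt(alphabet));
                }
                if (!roundTrip(input, capacity).equals(input)) {
                    throw new AssertionError("mismatch at capacity=" + capacity + " alphabet=" + alphabet);
                }
                combos++;
            }
        }
        System.out.println("checked " + combos + " combinations");
    }
}
```

Alphabets smaller than, equal to, and larger than the capacity cover the never-evict, just-full, and constant-eviction cases, which is where writer and reader are most likely to drift apart.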
What's the difference between
{code}
+ public static int hashBytes(byte[] bytes, int offset, int length) {
{code}
and the existing
{code}
public static int hashCode(final byte [] b, final int length) {
{code}
They look to do the same thing? We should remove the new one if so.
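A quick way to check whether the two really overlap is to compare them on the same input. The sketch below does not reproduce either method body from the patch or from the existing code; both implementations are assumptions, using the common multiply-by-31 loop, and the class name HashCompare is made up. If the real bodies follow this shape, the length-only form is just the offset form with offset 0.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class HashCompare {
    // Assumed shape of the offset-aware hash (not copied from the patch).
    public static int hashBytes(byte[] bytes, int offset, int length) {
        int hash = 1;
        for (int i = offset; i < offset + length; i++) {
            hash = 31 * hash + bytes[i];
        }
        return hash;
    }

    // Assumed shape of the length-only hash: always starts at index 0.
    public static int hashCode(byte[] b, int length) {
        return hashBytes(b, 0, length);
    }

    public static void main(String[] args) {
        byte[] row = "table1/region7/cf".getBytes(StandardCharsets.UTF_8);

        // Whole array: the two signatures agree.
        System.out.println(hashBytes(row, 0, row.length) == hashCode(row, row.length));

        // A slice: the length-only form needs a copy first, the offset form does not.
        byte[] slice = Arrays.copyOfRange(row, 7, 14);
        System.out.println(hashBytes(row, 7, 7) == hashCode(slice, slice.length));
    }
}
```

Under that assumption the two are not quite duplicates: the offset variant can hash a region of a larger buffer without copying, so one option is to keep a single implementation and have the other delegate to it.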
We will have a keycontext when we are deserializing? How's that work?
So we compress at the individual entry level? Why not a file at a time? (Sorry
if this has been explained earlier.)
Is this right in the WALReader?
{code}
+ compression = conf.getBoolean(HConstants.ENABLE_WAL_COMPRESSION, false);
{code}
How does that work if the WAL was written compressed but this flag is false?
Do we break? Shouldn't this instead be keyed off the entries themselves? Or
should there be a sequence file attribute saying this is a compressed file?
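To illustrate the file-attribute idea, here is a minimal sketch in which the writer records the compression choice in a small header and the reader trusts the header rather than its own configuration. The header layout, the MAGIC value, and the class and method names are all hypothetical; this is not the SequenceFile format or the patch's reader.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class WalHeaderSketch {
    // Hypothetical magic number ("WAL1" in ASCII) marking the start of a log file.
    static final int MAGIC = 0x57414C31;

    // Writes: magic, one flag byte for compression, then a length-prefixed entry.
    public static byte[] writeLog(boolean compressed, String entry) {
        byte[] payload = entry.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + 1 + 4 + payload.length);
        buf.putInt(MAGIC);
        buf.put((byte) (compressed ? 1 : 0));
        buf.putInt(payload.length);
        buf.put(payload);
        return buf.array();
    }

    // Reads the compression flag from the file itself, ignoring any reader-side config.
    public static boolean isCompressed(byte[] file) {
        ByteBuffer buf = ByteBuffer.wrap(file);
        if (buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("not a log file");
        }
        return buf.get() != 0;
    }

    public static void main(String[] args) {
        boolean readerConfigFlag = false;            // reader's config says "no compression"
        byte[] file = writeLog(true, "some entry");  // but the file was written compressed

        boolean useCompression = isCompressed(file); // the header wins over the config flag
        System.out.println("config=" + readerConfigFlag + " header=" + useCompression);
    }
}
```

With the flag stored in the file, a reader whose configuration has changed since the log was written can still decode it, which matters for log splitting and replay after a restart.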
Do we foresee replication being able to use this facility? Having it ship
compressed entries seems like a natural fit.
Good stuff.
> HLog Compression
> ----------------
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
> Issue Type: New Feature
> Reporter: Li Pi
> Assignee: Li Pi
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt,
> 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt,
> 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends
> across different datanodes. We can speed up this process by compressing the
> HLog. Current plan involves using a dictionary to compress table name, region
> id, cf name, and possibly other bits of repeated data. Also, HLog format may
> be changed in other ways to produce a smaller HLog.