[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192604#comment-13192604
 ] 

Nicolas Spiegelberg commented on HBASE-4608:
--------------------------------------------

If the goal is to avoid scanning the entire log and to seek instead, I think 
we should put more effort into rolling logs at a lower size threshold, making 
log GC size-based, and getting rid of (or greatly raising) the 
file-count-based pressure.

In production, the major bottleneck for us in log replay (after distributed log 
splitting) has been IO.  We normally don't max out CPU.  Anything we can do to 
minimize IO size at the expense of CPU would help reduce replay time.

As an aside, do we currently compress the output of our log split?  Having the 
resulting per-region logs in LZO or GZ format would decrease our replay time, 
perhaps more than this optimization will.  That said, this feature is very 
useful; I just want to make sure that we're not missing less cool but 
potentially more beneficial optimizations.
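
To make the CPU-for-IO trade concrete, here is a minimal sketch using only the 
JDK's GZIP stream. It is not HBase code: the class name, the record layout, and 
the table/region/cf values are made up for illustration, and real split output 
would go through HBase's own writers rather than a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustration only: not HBase code. Record layout and names are hypothetical.
class SplitOutputGzipSketch {

    // Builds a synthetic "per-region log": many edits that repeat the same
    // table, region, and column family names.
    static byte[] syntheticRegionLog(int edits) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < edits; i++) {
            sb.append("usertable,region-0001,cf1,row").append(i)
              .append(",value").append(i).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Spends CPU on GZIP so that fewer bytes have to be written and read back.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Replay side: decompresses the bytes back to the original records.
    static byte[] gunzip(byte[] packed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = syntheticRegionLog(10_000);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes:  " + raw.length);
        System.out.println("gzip bytes: " + packed.length);
    }
}
```

The ratio printed for synthetic data like this says nothing about real WAL 
contents; it only shows where the extra CPU work sits relative to the bytes 
that hit disk.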
                
> HLog Compression
> ----------------
>
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: Li Pi
>         Attachments: 4608v1.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 
> 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.
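
The dictionary idea in the description above can be sketched roughly as 
follows. This is an assumption about the general technique, not the format used 
in the attached patches: the class and method names are hypothetical, and the 
convention of returning -1 for a first occurrence is chosen only for this 
example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a hypothetical dictionary, not the HBASE-4608 patch format.
class WalDictionarySketch {
    private final Map<String, Integer> indexByValue = new HashMap<>();
    private final List<String> valueByIndex = new ArrayList<>();

    // Returns the existing index for a value, or -1 after registering a new one.
    // A writer would emit the full string on -1 and only the small index otherwise.
    int lookupOrAdd(String value) {
        Integer idx = indexByValue.get(value);
        if (idx != null) {
            return idx;
        }
        indexByValue.put(value, valueByIndex.size());
        valueByIndex.add(value);
        return -1;
    }

    // Reader side: resolves an index back to the original string.
    String get(int index) {
        return valueByIndex.get(index);
    }

    public static void main(String[] args) {
        WalDictionarySketch dict = new WalDictionarySketch();
        String[] fields = {"usertable", "cf1", "usertable", "cf1"};
        for (String f : fields) {
            int idx = dict.lookupOrAdd(f);
            System.out.println(idx < 0 ? "literal " + f : "index " + idx);
        }
    }
}
```

A reader would have to register each literal in the same order as the writer 
did so that the indices line up on both sides.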

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
