[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229771#comment-13229771 ]
stack commented on HBASE-4608:
------------------------------
Here are some WALs comparing dictionary compression (patch v29) against lzma,
plus the dictionary-compressed file itself lzma'd (Todd's request). LZMA'ing
the dictionary-compressed file makes it smaller than the lzma'd original, and
roughly 1/4 the size of the dictionary-compressed file.
I didn't get a chance to lzo it....
{code}
....
-rw-r--r-- 1 stack staff 64589199 Mar 13 20:24 sv4r21s12%3A60020.1331685637452
-rwxrwxrwx 1 stack staff 28906432 Mar 14 15:34 sv4r21s12%3A60020.1331685637452.compressed
-rw-r--r-- 1 stack staff 7417213 Mar 14 16:25 sv4r21s12%3A60020.1331685637452.compressed.lzma
-rw-r--r-- 1 stack staff 8511618 Mar 14 16:24 sv4r21s12%3A60020.1331685637452.lzma
-rw-r--r-- 1 stack staff 63755620 Mar 13 20:24 sv4r21s12%3A60020.1331687005652
-rwxrwxrwx 1 stack staff 28804928 Mar 14 15:34 sv4r21s12%3A60020.1331687005652.compressed
-rw-r--r-- 1 stack staff 6866107 Mar 14 16:28 sv4r21s12%3A60020.1331687005652.compressed.lzma
-rw-r--r-- 1 stack staff 8328771 Mar 14 16:27 sv4r21s12%3A60020.1331687005652.lzma
-rw-r--r-- 1 stack staff 63755688 Mar 13 20:24 sv4r21s12%3A60020.1331688224458
-rwxrwxrwx 1 stack staff 27701052 Mar 14 15:34 sv4r21s12%3A60020.1331688224458.compressed
-rw-r--r-- 1 stack staff 6614637 Mar 14 16:31 sv4r21s12%3A60020.1331688224458.compressed.lzma
-rw-r--r-- 1 stack staff 8462991 Mar 14 16:31 sv4r21s12%3A60020.1331688224458.lzma
-rw-r--r-- 1 stack staff 64024836 Mar 13 20:24 sv4r21s12%3A60020.1331689518188
-rwxrwxrwx 1 stack staff 28851435 Mar 14 15:34 sv4r21s12%3A60020.1331689518188.compressed
-rw-r--r-- 1 stack staff 6677112 Mar 14 16:35 sv4r21s12%3A60020.1331689518188.compressed.lzma
-rw-r--r-- 1 stack staff 8158847 Mar 14 16:34 sv4r21s12%3A60020.1331689518188.lzma
-rw-r--r-- 1 stack staff 63757131 Mar 13 20:24 sv4r21s12%3A60020.1331690608900
-rwxrwxrwx 1 stack staff 28201506 Mar 14 15:34 sv4r21s12%3A60020.1331690608900.compressed
-rw-r--r-- 1 stack staff 6941982 Mar 14 16:38 sv4r21s12%3A60020.1331690608900.compressed.lzma
-rw-r--r-- 1 stack staff 8513895 Mar 14 16:37 sv4r21s12%3A60020.1331690608900.lzma
-rw-r--r-- 1 stack staff 63754114 Mar 13 20:24 sv4r21s12%3A60020.1331691711502
-rwxrwxrwx 1 stack staff 28318314 Mar 14 15:34 sv4r21s12%3A60020.1331691711502.compressed
-rw-r--r-- 1 stack staff 7392701 Mar 14 16:42 sv4r21s12%3A60020.1331691711502.compressed.lzma
-rw-r--r-- 1 stack staff 9136798 Mar 14 16:41 sv4r21s12%3A60020.1331691711502.lzma
-rw-r--r-- 1 stack staff 63756667 Mar 13 20:24 sv4r21s12%3A60020.1331692886725
-rwxrwxrwx 1 stack staff 28309792 Mar 14 15:34 sv4r21s12%3A60020.1331692886725.compressed
-rw-r--r-- 1 stack staff 7139965 Mar 14 16:44 sv4r21s12%3A60020.1331692886725.compressed.lzma
-rw-r--r-- 1 stack staff 8968155 Mar 14 16:43 sv4r21s12%3A60020.1331692886725.lzma
-rw-r--r-- 1 stack staff 63755003 Mar 13 20:24 sv4r21s12%3A60020.1331694049033
-rwxrwxrwx 1 stack staff 28127053 Mar 14 15:35 sv4r21s12%3A60020.1331694049033.compressed
-rw-r--r-- 1 stack staff 6498486 Mar 14 16:45 sv4r21s12%3A60020.1331694049033.compressed.lzma
-rw-r--r-- 1 stack staff 8175618 Mar 14 16:45 sv4r21s12%3A60020.1331694049033.lzma
-rw-r--r-- 1 stack staff 23441144 Mar 13 20:24 sv4r21s12%3A60020.1331695045194
-rwxrwxrwx 1 stack staff 10561645 Mar 14 15:35 sv4r21s12%3A60020.1331695045194.compressed
-rw-r--r-- 1 stack staff 2922204 Mar 14 16:46 sv4r21s12%3A60020.1331695045194.compressed.lzma
-rw-r--r-- 1 stack staff 3228837 Mar 14 16:46 sv4r21s12%3A60020.1331695045194.lzma
{code}
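To make the ratios in the listing above easier to check, here is a small
sketch that computes them for the first WAL (1331685637452). The byte counts
are copied from the listing; the dictionary keys and the ratio() helper are
just illustrative names, not anything from the patch.

```python
# Sizes in bytes for the first WAL in the listing (sv4r21s12%3A60020.1331685637452).
sizes = {
    "original": 64589199,         # uncompressed WAL
    "compressed": 28906432,       # dictionary compressed (patch v29)
    "compressed.lzma": 7417213,   # dictionary compressed, then lzma'd
    "lzma": 8511618,              # original lzma'd directly
}


def ratio(numerator: str, denominator: str) -> float:
    """Return size(numerator) / size(denominator) for two entries in `sizes`."""
    return sizes[numerator] / sizes[denominator]


# Each line prints one of the comparisons made in the comment text.
print(f"dictionary / original:          {ratio('compressed', 'original'):.3f}")
print(f"lzma / original:                {ratio('lzma', 'original'):.3f}")
print(f"dictionary+lzma / dictionary:   {ratio('compressed.lzma', 'compressed'):.3f}")
print(f"dictionary+lzma / lzma:         {ratio('compressed.lzma', 'lzma'):.3f}")
```

For this file the dictionary+lzma output is a bit over 1/4 of the dictionary
compressed file and somewhat smaller than the plain lzma output; the other
WALs in the listing can be checked the same way by swapping in their sizes.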
> HLog Compression
> ----------------
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
> Issue Type: New Feature
> Reporter: Li Pi
> Assignee: stack
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt,
> 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt,
> 4608v18.txt, 4608v23.txt, 4608v24.txt, 4608v25.txt, 4608v27.txt, 4608v29.txt,
> 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt,
> hbase-4608-v28-delta.txt, hbase-4608-v28.txt, hbase-4608-v28.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends
> across different datanodes. We can speed up this process by compressing the
> HLog. Current plan involves using a dictionary to compress table name, region
> id, cf name, and possibly other bits of repeated data. Also, HLog format may
> be changed in other ways to produce a smaller HLog.
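
As a rough illustration of the dictionary idea in the description above (this
is not the format the patch uses, just a minimal sketch): the first time a
value such as a table name is seen it is written as a literal and added to the
dictionary, and later occurrences are written as a short index. The reader
rebuilds the same dictionary while decoding. SimpleDictionary, encode_field,
decode_field and the sample field values are all hypothetical names, and the
dictionary here is unbounded with no eviction.

```python
class SimpleDictionary:
    """Hypothetical append-only dictionary kept in sync by writer and reader."""

    def __init__(self):
        self.index_of = {}   # value -> index
        self.entries = []    # index -> value

    def add(self, value: bytes) -> int:
        idx = len(self.entries)
        self.index_of[value] = idx
        self.entries.append(value)
        return idx


def encode_field(value: bytes, dictionary: SimpleDictionary):
    """Emit a short reference if the value was seen before, else a literal."""
    idx = dictionary.index_of.get(value)
    if idx is None:
        dictionary.add(value)
        return ("literal", value)
    return ("ref", idx)


def decode_field(token, dictionary: SimpleDictionary) -> bytes:
    """Inverse of encode_field; literals are added so indices stay in sync."""
    kind, payload = token
    if kind == "literal":
        dictionary.add(payload)
        return payload
    return dictionary.entries[payload]


# Two WAL entries that repeat the same table name, region id and cf name.
fields = [b"usertable", b"region-1", b"cf", b"usertable", b"region-1", b"cf"]

writer_dict = SimpleDictionary()
tokens = [encode_field(f, writer_dict) for f in fields]

reader_dict = SimpleDictionary()
decoded = [decode_field(t, reader_dict) for t in tokens]

print(tokens)
print(decoded == fields)
```

The second entry's fields all come out as references rather than literals,
which is where the saving on repeated table, region and family names would
come from.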