[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224786#comment-13224786 ]
Zhihong Yu edited comment on HBASE-4608 at 3/7/12 10:33 PM:
------------------------------------------------------------

I issued the following command:
{code}
bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 5
{code}
After the above job finished, I saw this in the region server log:
{code}
2012-03-07 13:01:12,408 INFO wal.SequenceFileLogWriter (SequenceFileLogWriter.java:init(91)) <<regionserver60020.logRoller>> - WAL compression enabled for hdfs://sea-lab-0:54310/hbase/.logs/sea-lab-5,60020,1331150872956/sea-lab-5%2C60020%2C1331150872956.1331154072399
{code}
After copying the HLog to the local filesystem, I issued:
{code}
bin/hbase org.apache.hadoop.hbase.regionserver.wal.Compressor -u sea-lab-5%2C60020%2C1331150872956.1331154072399 sea-lab-5.decomp
{code}
I got:
{code}
-rwxr-xr-x 1 hduser hduser 119487372 2012-03-07 14:12 sea-lab-5.decomp
-rw-r--r-- 1 hduser hduser 120660017 2012-03-07 14:11 sea-lab-5%2C60020%2C1331150872956.1331154072399
{code}
When I issued the compression command, I saw:
{code}
$ bin/hbase org.apache.hadoop.hbase.regionserver.wal.Compressor -c sea-lab-5.decomp sea-lab-5.comp
12/03/07 14:14:17 INFO wal.SequenceFileLogReader: Input stream class: org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker, not adjusting length
12/03/07 14:14:17 INFO wal.SequenceFileLogWriter: WAL compression enabled for sea-lab-5.comp
12/03/07 14:14:17 DEBUG wal.SequenceFileLogWriter: new createWriter -- HADOOP-6840 -- not available
12/03/07 14:14:17 WARN util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
12/03/07 14:14:17 WARN util.NativeCodeLoader: java.library.path=/apache/hbase/bin/../lib/native/Linux-amd64-64
12/03/07 14:14:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/03/07 14:14:17 INFO compress.CodecPool: Got brand-new compressor [.deflate]
12/03/07 14:14:17 DEBUG wal.SequenceFileLogWriter: Path=sea-lab-5.comp, syncFs=true, hflush=true
Exception in thread "main" java.io.IOException: sea-lab-5.decomp, entryStart=124, pos=1406386, end=119487372, edit=0
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.addFileInfoToException(SequenceFileLogReader.java:275)
	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:231)
	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:200)
	at org.apache.hadoop.hbase.regionserver.wal.Compressor.transformFile(Compressor.java:93)
	at org.apache.hadoop.hbase.regionserver.wal.Compressor.main(Compressor.java:59)
Caused by: java.io.IOException: //0 read 36 bytes, should read 22
	at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2118)
	at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2155)
	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:229)
	... 3 more
{code}

> HLog Compression
> ----------------
>
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: Li Pi
>             Fix For: 0.94.0
>
>         Attachments: 4608-v19.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt,
> 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt,
> 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends
> across different datanodes. We can speed up this process by compressing the
> HLog. The current plan involves using a dictionary to compress the table name,
> region id, cf name, and possibly other bits of repeated data. Also, the HLog
> format may be changed in other ways to produce a smaller HLog.
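
For readers unfamiliar with the dictionary idea in the issue description, here is a minimal sketch of it in Java. It is not the code from the attached patches: the class name WalDictionarySketch, the 2-byte index width, the -1 marker, and the sample values "TestTable" and "info" are all assumptions made for illustration. A repeated value (such as a table name or cf name) is written in full the first time and as a short index afterwards; the reader rebuilds the same dictionary in the same order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of dictionary compression; not the HBASE-4608 patch. */
final class WalDictionarySketch {
    /** Marker meaning "the full value follows" (an assumption of this sketch). */
    static final short NOT_IN_DICTIONARY = -1;

    private final Map<String, Short> indexByValue = new HashMap<>();
    private final List<String> valueByIndex = new ArrayList<>();

    /** Writes the full value on first sight and a 2-byte index afterwards. */
    void write(DataOutputStream out, String value) throws IOException {
        Short index = indexByValue.get(value);
        if (index != null) {
            out.writeShort(index);
            return;
        }
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > 0xFFFF) {
            throw new IllegalArgumentException("value too long for this sketch");
        }
        out.writeShort(NOT_IN_DICTIONARY);
        out.writeShort(bytes.length);
        out.write(bytes);
        remember(value);
    }

    /** Mirrors write(): resolves an index, or reads and remembers a new value. */
    String read(DataInputStream in) throws IOException {
        short index = in.readShort();
        if (index != NOT_IN_DICTIONARY) {
            return valueByIndex.get(index);
        }
        byte[] bytes = new byte[in.readUnsignedShort()];
        in.readFully(bytes);
        String value = new String(bytes, StandardCharsets.UTF_8);
        remember(value);
        return value;
    }

    /** Both sides apply the same rule, so their dictionaries stay in sync. */
    private void remember(String value) {
        if (valueByIndex.size() < Short.MAX_VALUE) {
            indexByValue.put(value, (short) valueByIndex.size());
            valueByIndex.add(value);
        }
    }

    /** Encodes the values with a fresh dictionary. */
    static byte[] encode(String... values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        WalDictionarySketch dictionary = new WalDictionarySketch();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (String value : values) {
                dictionary.write(out, value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Decodes count values with a fresh dictionary. */
    static List<String> decode(byte[] encoded, int count) {
        List<String> values = new ArrayList<>();
        WalDictionarySketch dictionary = new WalDictionarySketch();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded))) {
            for (int i = 0; i < count; i++) {
                values.add(dictionary.read(in));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] encoded = encode("TestTable", "info", "TestTable", "info");
        System.out.println("encoded bytes: " + encoded.length);
        System.out.println("decoded: " + decode(encoded, 4));
    }
}
```

In this sketch the first occurrence of a value costs 4 bytes more than the raw value, so the scheme only pays off when values repeat, which is the case the description targets.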