[jira] [Commented] (HBASE-3691) Add compressor support for 'snappy', google's compressor

Nicholas Telford (JIRA) Mon, 09 May 2011 02:25:46 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030661#comment-13030661
 ]


Nicholas Telford commented on HBASE-3691:
-----------------------------------------

Thanks Nichole, without your patch to HColumnDescriptor it wasn't possible to 
use snappy. I'd only tested it with CompressionTest, which I now see is not a 
complete enough test: it only checks that compression on an HFile works, not 
that Column Families can use it.

One thing does concern me: in your patch the Algorithm implementation for 
SNAPPY seems to have moved position in the enum. From the comments it sounds 
like it should be added as the _last_ implementation to avoid breaking HFiles 
compressed with the other implementations. This looks like it may just be a 
merge glitch from when you first applied my patch.
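
To illustrate the concern, here is a minimal Java sketch. The enums below are 
hypothetical stand-ins, not the actual HBase Algorithm code, and the sketch 
assumes a file records its algorithm by position in the enum (its ordinal), 
which is what the comments suggest.

```java
public class EnumOrdinalSketch {
    // Hypothetical ordering before SNAPPY is added.
    enum Before { LZO, GZ, NONE }

    // SNAPPY appended last: every existing ordinal keeps its meaning.
    enum AppendedLast { LZO, GZ, NONE, SNAPPY }

    // SNAPPY inserted before NONE: the ordinal of NONE shifts from 2 to 3.
    enum InsertedMidList { LZO, GZ, SNAPPY, NONE }

    // Resolve a stored ordinal against a given version of the enum.
    static <E extends Enum<E>> String decode(Class<E> type, int storedOrdinal) {
        return type.getEnumConstants()[storedOrdinal].name();
    }

    public static void main(String[] args) {
        // Ordinal written by code that used the old enum.
        int stored = Before.NONE.ordinal();

        System.out.println(decode(AppendedLast.class, stored));    // prints NONE
        System.out.println(decode(InsertedMidList.class, stored)); // prints SNAPPY
    }
}
```

Under that assumption, a file written as NONE would be read back as SNAPPY if 
the new value lands anywhere other than the end.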

Using Nichole's patch, the steps to getting Snappy working are currently:

# Install hadoop-snappy using these instructions: 
http://code.google.com/p/hadoop-snappy/
# You need to ensure the hadoop-snappy libs (incl. the native libs) are in the 
HBase classpath. Unless there are any other recommendations, I just symlinked 
the libs from HADOOP_HOME/lib to HBASE_HOME/lib. This needs to be done on all 
HBase nodes, as with LZO.
# Use CompressionTest to verify snappy support is enabled and the libs can be 
loaded: 
    bq. $ hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://host/path/to/hbase snappy
# Create a column family with snappy compression and verify it:
    {quote}$ hbase shell
    > create 't1', \{ NAME => 'cf1', COMPRESSION => 'snappy' \}
    > describe 't1'{quote}

    Check that the output of the "describe" command lists 
"COMPRESSION => 'snappy'".

> Add compressor support for 'snappy', google's compressor
> --------------------------------------------------------
>
>                 Key: HBASE-3691
>                 URL: https://issues.apache.org/jira/browse/HBASE-3691
>             Project: HBase
>          Issue Type: Task
>            Reporter: stack
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: hbase-snappy-3691-trunk-002.patch, 
> hbase-snappy-3691-trunk-003.patch, hbase-snappy-3691-trunk.patch
>
>
> http://code.google.com/p/snappy/ is apache licensed.
> bq. Snappy is a compression/decompression library. It does not aim for 
> maximum compression, or compatibility with any other compression library; 
> instead, it aims for very high speeds and reasonable compression. For 
> instance, compared to the fastest mode of zlib, Snappy is an order of 
> magnitude faster for most inputs, but the resulting compressed files are 
> anywhere from 20% to 100% bigger. On a single core of a Core i7 processor in 
> 64-bit mode, Snappy compresses at about 250 MB/sec or more and decompresses 
> at about 500 MB/sec or more.
> bq. Snappy is widely used inside Google, in everything from BigTable and 
> MapReduce to our internal RPC systems. (Snappy has previously been referred 
> to as "Zippy" in some presentations and the likes.)
> Let's get it in.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
