[jira] Commented: (HADOOP-5793) High speed compression algorithm like BMDiff

stack (JIRA) Wed, 02 Jun 2010 08:06:10 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874602#action_12874602 ]


stack commented on HADOOP-5793:
-------------------------------

Hey Pirroh: How would it work? What's the name of the compression that does 
both bmz and lzo? We'd download the hadoop gpl stuff, compile it, and then refer 
to the new combo... how? (Sounds great.) This code came over from HT?

> High speed compression algorithm like BMDiff
> --------------------------------------------
>
>                 Key: HADOOP-5793
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5793
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: elhoim gibor
>            Priority: Minor
>
> Add a high speed compression algorithm like BMDiff.
> It gives speeds of ~100 MB/s for writes and ~1000 MB/s for reads, compressing 
> 2.1 billion web pages from 45.1 TB to 4.2 TB.
> Reference:
> http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=437
> 2005 Jeff Dean talk about google architecture - around 46:00.
> http://feedblog.org/2008/10/12/google-bigtable-compression-zippy-and-bmdiff/
> http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=755678
> A reference implementation exists in HyperTable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
