On 6/21/07, Doug Cutting <[EMAIL PROTECTED]> wrote:
That's true only if decompression runs faster than disk input.
In my case, compression speed also matters, since each step
decompresses on read and compresses on write.
I ran a test on this. Looks like on 2GHz Opterons, end-to-
That's true only if decompression runs faster than disk input. Disk
transfer rates are nearly 100MB/second, but bzip2 decompression is
around 20MB/second, while lzo can probably run at 100MB/second.
Obviously these vary with disk and cpu speed, but you get the idea. If
lzo compresses just 2:1 th
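
The throughput argument above can be put into numbers with a rough sketch. This is only an illustration under stated assumptions, not a measurement: it assumes disk reads and decompression overlap perfectly, and that the quoted decompression speeds are counted in uncompressed output bytes. The function name and parameters are hypothetical.

```python
def effective_read_rate(disk_mb_s, decompress_out_mb_s, ratio):
    """Estimate uncompressed MB/s delivered to a mapper reading compressed input.

    Assumptions (not from the original posts):
      - disk I/O and decompression are fully pipelined, so the slower stage wins
      - decompress_out_mb_s is measured in uncompressed output bytes
      - ratio is uncompressed size / compressed size (2.0 means 2:1)
    """
    # The disk delivers compressed bytes; each one expands to `ratio` bytes.
    disk_limited = disk_mb_s * ratio
    # The CPU can emit at most this many uncompressed bytes per second.
    cpu_limited = decompress_out_mb_s
    return min(disk_limited, cpu_limited)


if __name__ == "__main__":
    disk = 100.0  # MB/s, figure quoted in the thread
    # bzip2 at ~20 MB/s: CPU-bound for any ratio of at least 0.2
    print("bzip2, 4:1 (ratio assumed):", effective_read_rate(disk, 20.0, 4.0))
    # lzo at ~100 MB/s with the 2:1 ratio mentioned in the thread
    print("lzo, 2:1:", effective_read_rate(disk, 100.0, 2.0))
```

If the decompression speed is instead counted in compressed input bytes, the CPU-limited term should also be multiplied by the ratio, which makes a fast codec look better than this sketch suggests.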
Bwolen Yang wrote:
For disk-bound map/reduce applications (those that do very little
computation and mainly collate large amounts of relevant data
and extract a smaller summary for future computations), I was
wondering whether it makes sense for mappers to work
directly on compressed inputs. i.e