Re: compression and disk-bound application

2007-06-27 Thread Bwolen Yang
On 6/21/07, Doug Cutting <[EMAIL PROTECTED]> wrote: That's true only if decompression runs faster than disk input. And in my case, compression speed also matters, since each step decompresses on read and compresses on write. I ran a test on this. Looks like on 2GHz Opterons, end-to-
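
For anyone who wants to run a similar test on their own hardware, here is a minimal sketch of a round-trip timing harness. It is not the test from the message above: it uses Python's standard-library zlib and bz2 (lzo is not in the standard library), and `measure` and `sample_data` are hypothetical helper names working on synthetic log-like input, so the numbers will differ from real data.

```python
import bz2
import time
import zlib


def sample_data(n_lines=50_000):
    # Hypothetical log-like text; real inputs will compress differently.
    return b"".join(
        b"%d\tuser%d\tGET /page/%d\n" % (i, i % 1000, i % 37)
        for i in range(n_lines)
    )


def measure(compress, decompress, data):
    """Return (compression ratio, compress MB/s, decompress MB/s).

    Both rates are expressed in MB of *uncompressed* data per second.
    """
    mb = len(data) / 1e6
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    unpacked = decompress(packed)
    t2 = time.perf_counter()
    if unpacked != data:
        raise RuntimeError("round trip was not lossless")
    # Guard against a zero interval on very small inputs.
    c_rate = mb / max(t1 - t0, 1e-9)
    d_rate = mb / max(t2 - t1, 1e-9)
    return len(data) / len(packed), c_rate, d_rate


if __name__ == "__main__":
    data = sample_data()
    for name, comp, decomp in [
        ("zlib", zlib.compress, zlib.decompress),
        ("bz2", bz2.compress, bz2.decompress),
    ]:
        ratio, c_rate, d_rate = measure(comp, decomp, data)
        print(f"{name}: ratio {ratio:.1f}:1, "
              f"compress {c_rate:.1f} MB/s, decompress {d_rate:.1f} MB/s")
```

Since each step both reads and writes, the compress and decompress rates are reported separately; either one can end up as the bottleneck.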

Re: compression and disk-bound application

2007-06-21 Thread Bwolen Yang
That's true only if decompression runs faster than disk input. Disk transfer rates are nearly 100MB/second, but bzip2 decompression is around 20MB/second, while lzo can probably run at 100MB/second. Obviously these vary with disk and cpu speed, but you get the idea. If lzo compresses just 2:1 th
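
The trade-off described above can be put into a back-of-the-envelope model. This is a rough sketch under stated assumptions: the decompression rates are taken as MB of uncompressed output per second, disk reads and decompression are assumed to overlap perfectly, and the 4:1 ratio for bzip2 is a made-up illustrative value (only the 2:1 figure for lzo appears in the message). `effective_read_rate` is a hypothetical helper name.

```python
def effective_read_rate(disk_mb_s, decompress_mb_s, ratio):
    """Uncompressed MB/s delivered to a mapper reading compressed input.

    disk_mb_s       -- raw disk transfer rate (compressed bytes)
    decompress_mb_s -- decompressor output rate (uncompressed bytes)
    ratio           -- compression ratio, e.g. 2.0 for 2:1
    """
    if disk_mb_s <= 0 or decompress_mb_s <= 0 or ratio <= 0:
        raise ValueError("rates and ratio must be positive")
    # The disk supplies disk_mb_s * ratio of uncompressed data per second;
    # the slower of the two pipeline stages sets the overall rate.
    return min(disk_mb_s * ratio, decompress_mb_s)


DISK = 100.0  # MB/s, figure from the message

# (codec, decompression MB/s from the message, assumed ratio)
for name, rate, ratio in [("bzip2", 20.0, 4.0), ("lzo", 100.0, 2.0)]:
    eff = effective_read_rate(DISK, rate, ratio)
    verdict = "slower than" if eff < DISK else "no slower than"
    print(f"{name}: {eff:.0f} MB/s effective, {verdict} reading uncompressed")
```

Under these assumptions, compressed input only beats uncompressed input when the decompressor's output rate exceeds the raw disk rate, which is the condition stated in the message.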

Re: compression and disk-bound application

2007-06-21 Thread Doug Cutting
Bwolen Yang wrote: For disk-bound map/reduce applications (those that do very little computation and mainly collate large amounts of relevant data and extract a smaller summary for future computations), I was wondering whether it makes sense for mappers to work directly on comp

compression and disk-bound application

2007-06-20 Thread Bwolen Yang
For disk-bound map/reduce applications (those that do very little computation and mainly collate large amounts of relevant data and extract a smaller summary for future computations), I was wondering whether it makes sense for mappers to work directly on compressed inputs. i.e