Hi Jeff,

Thank you very much for your reply.

I know Hadoop has overhead, but is it really this large in my case?

The 1GB text input produces about 500 map tasks because the input is composed of 
many small text files. Each map task takes between 8 and 20 seconds. 
I already enable compression with conf.setCompressMapOutput(true).
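
Concretely, the relevant part of my job setup looks roughly like this (a minimal 
sketch using the old org.apache.hadoop.mapred API; the class name JobSetup and the 
codec choice are placeholders, and I have added the JVM-reuse setting you mentioned):

    import org.apache.hadoop.io.compress.DefaultCodec;
    import org.apache.hadoop.mapred.JobConf;

    public class JobSetup {                        // placeholder name
        public static JobConf configure() {
            JobConf conf = new JobConf(JobSetup.class);
            // Compress the intermediate map output, as in my job.
            conf.setCompressMapOutput(true);
            conf.setMapOutputCompressorClass(DefaultCodec.class);
            // JVM reuse, one of the tunings you suggested:
            // -1 means a JVM may run an unlimited number of tasks.
            conf.setNumTasksToExecutePerJvm(-1);
            return conf;
        }
    }

Since the input is ~500 small files, each map task also pays its own scheduling 
and startup cost, which is where I suspect the extra time goes.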

Thanks,
Jander




At 2010-10-05 16:28:55,"Jeff Zhang" <zjf...@gmail.com> wrote:

>Hi Jander,
>
>Hadoop has overhead compared to a single-machine solution. How many tasks
>did you get when you ran your Hadoop job? And how much time does each map
>and reduce task take?
>
>There are lots of tips for performance tuning of Hadoop, such as
>compression and JVM reuse.
>
>
>2010/10/5 Jander <442950...@163.com>:
>> Hi, all
>> I have an application built on Hadoop.
>> I take 1GB of text data as input, with the following results:
>>    (1) a cluster of 3 PCs: the time consumed is 1020 seconds.
>>    (2) a cluster of 4 PCs: the time is about 680 seconds.
>> But the same application took about 280 seconds before I moved it to Hadoop, 
>> so at the speeds above I would need 8 PCs just to match the original 
>> single-machine speed. My question: is this result expected?
>>
>> Jander,
>> Thanks.
>>
>>
>>
>
>
>
>-- 
>Best Regards
>
>Jeff Zhang
