On Wed, Mar 11, 2009 at 4:41 PM, Sami Siren <ssi...@gmail.com> wrote:

> dayz...@gmail.com wrote:
>
>> Hi,
>>
>> If I want to run several parsers on a single quad-core machine
>> simultaneously, would I still need to have Hadoop set up as a single-node
>> cluster?
>>
> I think that the fetcher is currently the only component that can take
> advantage of multiple cores when running in "local" mode. We should perhaps
> address that at some point, since it is not that hard to parallelize at
> least some of the processing inside the individual tools, so that
> single-machine users could also benefit from multiple cores.
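
(For context: in local mode the fetcher gets its parallelism from its own
thread pool, configured via the fetcher.threads.fetch property in
nutch-site.xml; the value below is just illustrative:

  <property>
    <name>fetcher.threads.fetch</name>
    <value>10</value>
  </property>
)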


But for parsing on a single quad-core machine, I can set up 2 map and 2
reduce tasks and get about 80-90% utilisation of the 4 CPUs. When only the
reduce tasks are left, I get about 50% utilisation, as expected (2/4). So do
I actually not get a performance gain, even though more CPUs are being
utilised?
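
Concretely, by "2 map and 2 reduce" I mean per-tasktracker slot limits along
these lines in hadoop-site.xml (property names as in the 0.19-era
configuration; the values are illustrative, not a recommendation):

  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>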

Michael


>
> I am not sure, but I think that the only way to do it properly is to run a
> jobtracker and a tasktracker on that machine and configure proper block
> sizes and numbers of map and reduce tasks.
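
A minimal pseudo-distributed setup along those lines might look roughly like
this in hadoop-site.xml (all values illustrative, not a tested
configuration):

  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <!-- 64 MB blocks -->
    <name>dfs.block.size</name>
    <value>67108864</value>
  </property>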
>
>> Can several updatedbs be run simultaneously? I believe not, since the db
>> seems to be locked when it's being updated.
>>
> Locking prevents multiple applications from accessing the crawl db
> simultaneously (the same applies to the linkdb).
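
The lock is just a marker file in the db directory. A simplified sketch of
the idea using Hadoop's FileSystem API (this is not Nutch's actual LockUtil
code, just an illustration of the mechanism):

  import java.io.IOException;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class DbLock {
    // Acquire an exclusive lock on a db directory by atomically creating
    // a ".locked" marker file inside it; createNewFile() returns false if
    // the file already exists, so only one process can win.
    public static void lock(Configuration conf, Path db) throws IOException {
      FileSystem fs = db.getFileSystem(conf);
      if (!fs.createNewFile(new Path(db, ".locked"))) {
        throw new IOException("db " + db + " is locked by another process");
      }
    }

    // Release the lock by deleting the marker file (non-recursively).
    public static void unlock(Configuration conf, Path db) throws IOException {
      FileSystem fs = db.getFileSystem(conf);
      fs.delete(new Path(db, ".locked"), false);
    }
  }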
>
> --
> Sami Siren
>
>
