Gal Nitzan wrote:
Hi,
Just installed 0.8 with Hadoop from trunk.
I noticed all tasktrackers are participating in the fetch.
I have only one site in the injected seed file.
I have 5 tasktrackers, and all except one access the same site.
I am using Nutch 0.8-dev with Hadoop.
Please, any idea?
Hadoop doesn't have any mechanism for coordinating simultaneous access
to resources across the cluster (global locking). I described the
problem on the hadoop-dev list, no comments yet...
(FYI: if you wonder how it was working before, the trick was to generate
just 1 split for the fetch job, which then led to just one task being
created for any input fetchlist. This was a hack that apparently stopped
working after Hadoop was moved to its own codebase; the proper solution
is to implement a global lock manager).
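
To make the lock-manager idea concrete, here is a minimal sketch of the bookkeeping such a component would need: at most one fetch task may hold a given host at a time. Everything in it is an assumption for illustration; the class name HostLockManager, its methods, and the task ids are hypothetical and are not part of Hadoop or Nutch. It also only works inside a single JVM, so a real global lock manager would have to expose the same operations through a service that all tasktrackers can reach.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical per-host lock table (illustration only, not a Hadoop/Nutch API).
 * Maps a host name to the id of the task currently allowed to fetch from it.
 */
final class HostLockManager {
    private final ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();

    /**
     * Tries to make taskId the exclusive fetcher for host.
     * Returns true if the lock was free or is already held by the same task.
     */
    boolean tryAcquire(String host, String taskId) {
        String holder = owners.putIfAbsent(host, taskId);
        return holder == null || holder.equals(taskId);
    }

    /**
     * Releases the lock on host, but only if taskId is the current holder.
     * Returns true if a lock was actually released.
     */
    boolean release(String host, String taskId) {
        return owners.remove(host, taskId);
    }

    /** Returns the id of the task holding host, or null if nobody holds it. */
    String holderOf(String host) {
        return owners.get(host);
    }
}
```

In this sketch a fetch task would call tryAcquire before contacting a host and skip or postpone the URL when it returns false, then call release once it is done with that host.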
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com