Sami Siren wrote:
Stefan Groschupf wrote:
See:
http://www.find23.net/nutch_guiToHadoop.pdf
Section required hadoop changes.
I guess you refer to these:
• LocalJobRunner:
  • Run as a kind of singleton
  • Have a kind of jobQueue
  • Implement JobSubmissionProtocol status-report methods
  • Implement killJob method
Is there an issue in Hadoop's Jira for this? Is there a patch that
implements these? If there is, then I suggest folks vote for the issue.
- How about writing a nutchrunner that just extends the functionality of
  LocalJobRunner?
- Scheduling (jobQueue) could be completely outside of the job runner?
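
A minimal sketch of what keeping the queue outside the job runner could
look like, assuming nothing about Hadoop's actual classes. The names
ExternalJobQueue, submit, status, kill and shutdown are hypothetical and
exist only for this illustration, and the jobs are plain Runnables rather
than Hadoop jobs. It only shows the shape of the idea: a queue that runs
jobs one at a time, reports their status, and can kill them.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: none of these names come from Hadoop or Nutch.
public class ExternalJobQueue {
    public enum Status { QUEUED_OR_RUNNING, SUCCEEDED, FAILED, KILLED, UNKNOWN }

    // A single worker thread runs jobs one at a time, in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Map<Integer, Future<?>> jobs = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    // Queue a job and return an id that can be used for status and kill.
    public int submit(Runnable job) {
        int id = nextId.incrementAndGet();
        jobs.put(id, worker.submit(job));
        return id;
    }

    // Report the state of a job without blocking on unfinished work.
    public Status status(int id) {
        Future<?> f = jobs.get(id);
        if (f == null) return Status.UNKNOWN;
        if (f.isCancelled()) return Status.KILLED;
        if (!f.isDone()) return Status.QUEUED_OR_RUNNING;
        try {
            f.get(); // already done, so this returns or throws immediately
            return Status.SUCCEEDED;
        } catch (Exception e) {
            return Status.FAILED;
        }
    }

    // Cancel a queued job, or interrupt it if it is already running.
    public boolean kill(int id) {
        Future<?> f = jobs.get(id);
        return f != null && f.cancel(true);
    }

    // Stop accepting jobs and wait briefly for the queued ones to finish.
    public void shutdown() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With this split, the runner itself would only need to execute a single
job, and the status-report and kill operations would live in the queue.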
These also sound like good solutions. If it is not Nutch-specific,
then perhaps it could be integrated into Hadoop, so that it is
maintained as Hadoop evolves. If that sounds like a good approach,
please submit a patch to Hadoop with some unit tests.
Cheers,
Doug