Sami Siren wrote:
Stefan Groschupf wrote:
See:
http://www.find23.net/nutch_guiToHadoop.pdf
See the section on required Hadoop changes.

I guess you refer to these:

•  LocalJobRunner:
  •  Run as a kind of singleton
  •  Have a kind of jobQueue
  •  Implement JobSubmissionProtocol status-report
     methods
  •  Implement killJob method

Is there an issue in Hadoop's Jira for this? Is there a patch that implements these? If there is, then I suggest folks vote for the issue.

- How about writing a nutchrunner that just extends the functionality of LocalJobRunner?
- Could scheduling (jobQueue) live completely outside of the job runner?

These also sound like good solutions. If it is not Nutch-specific, then perhaps it could be integrated into Hadoop, so that it is maintained as Hadoop evolves. If that sounds like a good approach, please submit a patch to Hadoop with some unit tests.
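
For illustration only, the combination discussed above (a single runner instance, a job queue, status reporting, and a kill method) could be sketched roughly as follows. This is plain Java and deliberately does not touch Hadoop's LocalJobRunner or JobSubmissionProtocol. QueuedLocalRunner, JobState, submit, getStatus, await and killJob are made-up names, and a real patch would have to follow the actual Hadoop interfaces.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical job states; not Hadoop's own status type. */
enum JobState { QUEUED, RUNNING, SUCCEEDED, FAILED, KILLED }

/** Hypothetical single-instance local runner with a FIFO job queue. */
final class QueuedLocalRunner {
    private static final QueuedLocalRunner INSTANCE = new QueuedLocalRunner();

    // One daemon worker thread: jobs run one at a time, in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "queued-local-runner");
        t.setDaemon(true);
        return t;
    });
    private final Map<String, JobState> states = new ConcurrentHashMap<>();
    private final Map<String, Future<?>> futures = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    private QueuedLocalRunner() {}

    static QueuedLocalRunner getInstance() { return INSTANCE; }

    /** Queues a job and returns its id immediately. */
    String submit(Runnable job) {
        String id = "job_" + nextId.incrementAndGet();
        states.put(id, JobState.QUEUED);
        futures.put(id, worker.submit(() -> {
            // Do nothing if the job was killed while it was still queued.
            if (!states.replace(id, JobState.QUEUED, JobState.RUNNING)) return;
            try {
                job.run();
                // These replaces fail (and keep KILLED) if the job was killed meanwhile.
                states.replace(id, JobState.RUNNING, JobState.SUCCEEDED);
            } catch (RuntimeException e) {
                states.replace(id, JobState.RUNNING, JobState.FAILED);
            }
        }));
        return id;
    }

    /** Status report: current state, or null for an unknown id. */
    JobState getStatus(String id) { return states.get(id); }

    /** Marks a queued or running job as killed; returns false if it already ended. */
    boolean killJob(String id) {
        if (states.replace(id, JobState.QUEUED, JobState.KILLED)
                || states.replace(id, JobState.RUNNING, JobState.KILLED)) {
            // Removes a queued job from the queue, or interrupts a running one.
            futures.get(id).cancel(true);
            return true;
        }
        return false;
    }

    /** Waits until the job has ended or been cancelled, then returns its state. */
    JobState await(String id) {
        Future<?> f = futures.get(id);
        if (f == null) return null;
        try {
            f.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (Exception e) {
            // Cancelled (killed) or failed: the state map already records it.
        }
        return states.get(id);
    }
}
```

Because the queue here is just one worker thread, jobs run in submission order. Calling killJob on a job that is still waiting cancels it before it starts, while a running job only receives an interrupt and has to cooperate by stopping.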

Cheers,

Doug
