[ http://issues.apache.org/jira/browse/NUTCH-191?page=comments#action_12364773 ]
Owen O'Malley commented on NUTCH-191:
-------------------------------------

I would schedule the getSplits task and, when it completed, I would schedule the map tasks. This would closely parallel the way the completion of the map tasks causes the reduces to be scheduled. I think the right place to hook it in would be JobTracker.JobInProgress.completedTask(String).

One difference that I'm aware of is that until getSplits returns, you don't have any idea how many maps will be needed, so you can't create the map tasks when the job is created.

> InputFormat used in job must be in JobTracker classpath (not loaded from job JAR)
> ---------------------------------------------------------------------------------
>
>          Key: NUTCH-191
>          URL: http://issues.apache.org/jira/browse/NUTCH-191
>      Project: Nutch
>         Type: Bug
>     Versions: 0.8-dev
>  Environment: ~20 node nutch mapreduce environment, running SVN trunk, on Linux
>     Reporter: Bryan Pendleton
>     Priority: Minor
>
> During development, I've been creating/tweaking custom InputFormat implementations. However, when you try to run a job against a running cluster, you get:
>
> Exception in thread "main" java.io.IOException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: my.custom.InputFormat
>         at org.apache.nutch.ipc.Client.call(Client.java:294)
>         at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
>         at $Proxy0.submitJob(Unknown Source)
>         at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259)
>         at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288)
>         at com.parc.uir.wikipedia.WikipediaJob.main(WikipediaJob.java:85)
>
> This error goes away if I restart the TaskTrackers/JobTracker with a classpath which includes the needed code. Other classes (Mapper, Reducer) appear to be available out of the jar file specified in the JobConf, but not the InputFormat. Obviously, it's less than ideal to have to restart the JobTracker whenever there's a change to a job-specific class.

--
This message is automatically generated by JIRA.

_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
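
The scheduling order proposed in the comment at the top of this message (run getSplits as its own task, create the map tasks only once it completes, then let map completion trigger the reduces) might look roughly like the sketch below. This is only an illustration under stated assumptions: JobInProgressSketch, reportSplits, scheduledTasks and the task id strings are hypothetical names made up for the example, and none of this is the actual JobTracker.JobInProgress code. Only the idea of hooking into completedTask(String) comes from the comment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for JobTracker.JobInProgress; not the real Nutch class.
class JobInProgressSketch {
    // Hypothetical id for the single task that runs getSplits.
    static final String SPLIT_TASK_ID = "task_split_0";

    private final int numReduces;
    private final List<String> scheduled = new ArrayList<>(); // tasks in scheduling order
    private final Set<String> pendingMaps = new HashSet<>();
    private int splitCount = -1;            // unknown until the split task reports back
    private boolean mapsScheduled = false;
    private boolean reducesScheduled = false;

    JobInProgressSketch(int numReduces) {
        this.numReduces = numReduces;
        // The number of maps is not known yet, so only the split task is created here.
        scheduled.add(SPLIT_TASK_ID);
    }

    // Assumed mechanism: the split task reports how many splits getSplits produced.
    void reportSplits(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("negative split count");
        }
        splitCount = count;
    }

    // Called whenever a task finishes; the next phase is scheduled from here.
    void completedTask(String taskId) {
        if (SPLIT_TASK_ID.equals(taskId)) {
            if (splitCount < 0) {
                throw new IllegalStateException("split task finished without reporting splits");
            }
            if (!mapsScheduled) {
                mapsScheduled = true;
                // Only now is the number of map tasks known.
                for (int i = 0; i < splitCount; i++) {
                    String mapId = "task_m_" + i;
                    pendingMaps.add(mapId);
                    scheduled.add(mapId);
                }
            }
        } else {
            pendingMaps.remove(taskId); // no-op for ids that are not pending maps
        }
        // Same pattern as before: completion of the last map schedules the reduces.
        if (mapsScheduled && pendingMaps.isEmpty() && !reducesScheduled) {
            reducesScheduled = true;
            for (int i = 0; i < numReduces; i++) {
                scheduled.add("task_r_" + i);
            }
        }
    }

    List<String> scheduledTasks() {
        return Collections.unmodifiableList(scheduled);
    }

    public static void main(String[] args) {
        JobInProgressSketch job = new JobInProgressSketch(2);
        job.reportSplits(3);                 // pretend getSplits returned 3 splits
        job.completedTask(SPLIT_TASK_ID);    // map tasks get created here
        job.completedTask("task_m_0");
        job.completedTask("task_m_1");
        job.completedTask("task_m_2");       // last map done, reduces get scheduled
        System.out.println(job.scheduledTasks());
    }
}
```

The sketch leaves open where the split task itself would run. For the reported ClassNotFoundException to go away, getSplits would presumably have to execute somewhere the job JAR is on the classpath rather than inside the JobTracker, and the sketch makes no attempt to model that.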
