[ 
http://issues.apache.org/jira/browse/NUTCH-191?page=comments#action_12364678 ] 

Doug Cutting commented on NUTCH-191:
------------------------------------

We've thus far avoided loading job-specific code in the JobTracker and 
TaskTracker, in order to keep these more reliable.  File splitting is performed 
by the job tracker.  So if you're overriding InputFormat.getSplits(), then 
fixing this is harder.  But if you're simply overriding getRecordReader(), then 
this should be easier to fix.  In that case one could fix this by moving 
getSplits() to a new interface that's used only by the JobTracker.  If this is 
important to you, please submit a patch to this effect.
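
To make the suggestion concrete, here is a rough sketch of what separating 
the two responsibilities could look like. None of this is Nutch code: 
Splitter, RecordReaderFactory, Split, EvenSplitter and 
CustomReaderFactory are hypothetical names invented for illustration, and 
the real InputFormat signatures differ. The point is only that the code 
computing splits has no compile-time or load-time dependency on the 
job-specific reader class, so the reader could come from the job JAR.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitInterfaceSketch {

    /** Hypothetical value type: one byte range of one input file. */
    static final class Split {
        final String path;
        final long start;
        final long length;

        Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /** Hypothetical interface: splitting only, no job-specific code needed. */
    interface Splitter {
        List<Split> getSplits(String path, long fileLength, int numSplits);
    }

    /** Hypothetical stand-in for a record reader. */
    interface RecordReader {
        String describe();
    }

    /** Hypothetical interface: record reading only, implemented by job code. */
    interface RecordReaderFactory {
        RecordReader getRecordReader(Split split);
    }

    /** Generic splitter that cuts a file into roughly equal byte ranges. */
    static final class EvenSplitter implements Splitter {
        @Override
        public List<Split> getSplits(String path, long fileLength, int numSplits) {
            if (numSplits <= 0) {
                throw new IllegalArgumentException("numSplits must be positive");
            }
            List<Split> splits = new ArrayList<>();
            long chunk = (fileLength + numSplits - 1) / numSplits; // ceiling division
            for (long start = 0; start < fileLength; start += chunk) {
                splits.add(new Split(path, start, Math.min(chunk, fileLength - start)));
            }
            return splits;
        }
    }

    /** Job-specific class: in this sketch, the only part that would live in the job JAR. */
    static final class CustomReaderFactory implements RecordReaderFactory {
        @Override
        public RecordReader getRecordReader(final Split split) {
            return new RecordReader() {
                @Override
                public String describe() {
                    return split.path + ":" + split.start + "+" + split.length;
                }
            };
        }
    }

    public static void main(String[] args) {
        // Splitting side: uses only the generic Splitter.
        List<Split> splits = new EvenSplitter().getSplits("in.txt", 100, 3);

        // Task side: the job-specific factory is only touched here.
        RecordReaderFactory factory = new CustomReaderFactory();
        for (Split s : splits) {
            System.out.println(factory.getRecordReader(s).describe());
        }
    }
}
```

With a layout like this, a job that overrides only the record-reading side 
would not require a restart of the long-running daemons when its code 
changes. A job that needs custom splitting logic would still hit the 
original problem.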

> InputFormat used in job must be in JobTracker classpath (not loaded from job 
> JAR)
> ---------------------------------------------------------------------------------
>
>          Key: NUTCH-191
>          URL: http://issues.apache.org/jira/browse/NUTCH-191
>      Project: Nutch
>         Type: Bug
>     Versions: 0.8-dev
>  Environment: ~20 node nutch mapreduce environment, running SVN trunk, on 
> Linux
>     Reporter: Bryan Pendleton
>     Priority: Minor
>
> During development, I've been creating/tweaking custom InputFormat 
> implementations. However, when you try to run a job against a running 
> cluster, you get:
>   Exception in thread "main" java.io.IOException: java.lang.RuntimeException: 
> java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> my.custom.InputFormat
>           at org.apache.nutch.ipc.Client.call(Client.java:294)
>           at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
>           at $Proxy0.submitJob(Unknown Source)
>           at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259)
>           at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288)
>           at com.parc.uir.wikipedia.WikipediaJob.main(WikipediaJob.java:85)
> This error goes away if I restart the TaskTrackers/JobTracker with a 
> classpath which includes the needed code. Other classes (Mapper, Reducer) 
> appear to be available out of the jar file specified in the JobConf, but not 
> the InputFormat. Obviously, it's less than ideal to have to restart the 
> JobTracker whenever there's a change to a job-specific class.
