[ https://issues.apache.org/jira/browse/HADOOP-12406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinod Kumar Vavilapalli updated HADOOP-12406:
---------------------------------------------
    Assignee: Nadeem Douba
      Status: Open  (was: Patch Available)

Hi [~ndouba], I'm about to do a 2.7.3 Apache Hadoop release and finally got around to this again.

h4. Analysis

To make progress, I had to read up a bit on Nutch and how to run it, so that I could reproduce the bug and rationalize your patch. I finally succeeded in doing so!

I tested this with the 2.7.2 release and Nutch 1.11, using the URL feed [given at NUTCH-1084|https://issues.apache.org/jira/browse/NUTCH-1084?focusedCommentId=13882771&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13882771]
{code}
~/tmp/common/hadoop-common-2.7.2/bin/hadoop jar apache-nutch-1.11.job org.apache.nutch.crawl.CrawlDbReader file:///tmp/nutch/apache-nutch-1.11/runtime/local/crawl/crawldb/ -url http://bappenas.go.id/
{code}

I can reproduce all the problems listed at NUTCH-1084: with readdb, with an MR local-job-runner based job for crawling, etc.

The real issue is that Nutch's readdb is client-only and does *not* run a MapReduce job, which answers my earlier question. For regular MR jobs, the job-jar *is* on the system class-loader. For client-only invocations using "hadoop jar" and for the local-job-runner, the job-jar is actually *not* on the system classpath. That is why you are running into the issue.

h4. Summary

Your patch looks good to me. The thread context class-loader falls back to the system class-loader wherever it is not overridden, so we are fine for all the ways of loading classes in readFields.

I'll resubmit your patch to Jenkins with minor comment-related changes and commit it if Mr. Jenkins is also fine.
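For readers following along, here is a minimal Java sketch of the lookup pattern discussed above. It is not the attached patch: the names `ContextLoaderSketch`, `RecordingLoader`, and `loadViaContext` are hypothetical, and `RecordingLoader` only stands in for a loader that can see the job jar. The one-argument `Class.forName(String)` resolves through the loader that defined the calling class, while the three-argument form lets the caller pass the thread context loader explicitly.

```java
import java.util.ArrayList;
import java.util.List;

public class ContextLoaderSketch {

    // Hypothetical stand-in for a loader that can see the job jar.
    // It records each requested class name, then delegates to its parent.
    static final class RecordingLoader extends ClassLoader {
        final List<String> requested = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            synchronized (requested) {
                requested.add(name);
            }
            return super.loadClass(name, resolve);
        }
    }

    // Resolve a class through the thread context loader, falling back to
    // the system loader when no context loader is set on the thread.
    static Class<?> loadViaContext(String className)
            throws ClassNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return Class.forName(className, true, loader);
    }

    public static void main(String[] args) throws Exception {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        RecordingLoader jobLoader =
                new RecordingLoader(ClassLoader.getSystemClassLoader());
        current.setContextClassLoader(jobLoader);
        try {
            Class<?> loaded = loadViaContext("java.util.ArrayList");
            System.out.println("loaded " + loaded.getName()
                    + ", context loader consulted: "
                    + jobLoader.requested.contains("java.util.ArrayList"));
        } finally {
            // Restore the original loader so later code on this thread is unaffected.
            current.setContextClassLoader(original);
        }
    }
}
```

Because the context loader normally delegates to its parent, classes visible to the system loader should still resolve through this path, which matches the fallback reasoning in the Summary.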
> AbstractMapWritable.readFields throws ClassNotFoundException with custom writables
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-12406
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12406
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 2.7.1
>         Environment: Ubuntu Linux 14.04 LTS amd64
>            Reporter: Nadeem Douba
>            Assignee: Nadeem Douba
>            Priority: Blocker
>              Labels: bug, hadoop, io, newbie, patch-available
>         Attachments: HADOOP-12406.patch
>
>
> Note: I am not an expert at JAVA, Class loaders, or Hadoop. I am just a
> hacker. My solution might be entirely wrong.
> AbstractMapWritable.readFields throws a ClassNotFoundException when reading
> custom writables. Debugging the job using remote debugging in IntelliJ
> revealed that the class loader being used in Class.forName() is different
> than that used by the Thread's current context
> (Thread.currentThread().getContextClassLoader()). The class path for the
> system class loader does not include the libraries of the job jar. However,
> the class path for the context class loader does. The proposed patch changes
> the class loading mechanism in readFields to use the Thread's context class
> loader instead of the system's default class loader.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)