[ https://issues.apache.org/jira/browse/MAHOUT-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003334#comment-14003334 ]
Hudson commented on MAHOUT-1498:
--------------------------------

FAILURE: Integrated in Mahout-Quality #2610 (See [https://builds.apache.org/job/Mahout-Quality/2610/])
MAHOUT-1498 DistributedCache.setCacheFiles in DictionaryVectorizer overwrites jars pushed using oozie (ssc: rev 1595643)
* /mahout/trunk/CHANGELOG
* /mahout/trunk/mrlegacy/src/main/java/org/apache/mahout/common/HadoopUtil.java
* /mahout/trunk/mrlegacy/src/main/java/org/apache/mahout/vectorizer/DictionaryVectorizer.java
* /mahout/trunk/mrlegacy/src/main/java/org/apache/mahout/vectorizer/term/TFPartialVectorReducer.java
* /mahout/trunk/mrlegacy/src/main/java/org/apache/mahout/vectorizer/tfidf/TFIDFConverter.java
* /mahout/trunk/mrlegacy/src/main/java/org/apache/mahout/vectorizer/tfidf/TFIDFPartialVectorReducer.java
* /mahout/trunk/mrlegacy/src/test/java/org/apache/mahout/common/DistributedCacheFileLocationTest.java


> DistributedCache.setCacheFiles in DictionaryVectorizer overwrites jars pushed using oozie
> -----------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1498
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1498
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.7
>         Environment: mahout-core-0.7-cdh4.4.0.jar
>            Reporter: Sergey
>            Assignee: Sebastian Schelter
>              Labels: patch
>             Fix For: 1.0
>
>         Attachments: MAHOUT-1498.patch
>
>
> Hi, I get exception
> {code}
> <<< Invocation of Main class completed <<<
> Failing Oozie Launcher, Main class [org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles], main() threw exception, Job failed!
> java.lang.IllegalStateException: Job failed!
>     at org.apache.mahout.vectorizer.DictionaryVectorizer.makePartialVectors(DictionaryVectorizer.java:329)
>     at org.apache.mahout.vectorizer.DictionaryVectorizer.createTermFrequencyVectors(DictionaryVectorizer.java:199)
>     at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:271)
> {code}
> The root cause is:
> {code}
> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:247)
> {code}
> Looks like it happens because of DictionaryVectorizer.makePartialVectors method.
> It has code:
> {code}
> DistributedCache.setCacheFiles(new URI[] {dictionaryFilePath.toUri()}, conf);
> {code}
> which overrides jars pushed with job by oozie:
> {code}
> public static void setCacheFiles(URI[] files, Configuration conf) {
>   String sfiles = StringUtils.uriToString(files);
>   conf.set("mapred.cache.files", sfiles);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
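
For readers following the report above, here is a minimal, self-contained sketch of why a set-style call on "mapred.cache.files" drops entries registered earlier (for example jars pushed by Oozie), while an append-style call keeps them. It is not the committed patch. A plain Map stands in for Hadoop's Configuration, and the class name CacheFilesSketch, the helper addCacheFile, and the example HDFS paths are all hypothetical names chosen for illustration.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the "mapred.cache.files" property; a Map stands in for Hadoop's Configuration. */
public class CacheFilesSketch {

    static final String CACHE_FILES_KEY = "mapred.cache.files";

    /** Mirrors the quoted setCacheFiles: replaces whatever was registered before. */
    static void setCacheFiles(URI[] files, Map<String, String> conf) {
        StringBuilder joined = new StringBuilder();
        for (URI file : files) {
            if (joined.length() > 0) {
                joined.append(',');
            }
            joined.append(file);
        }
        conf.put(CACHE_FILES_KEY, joined.toString());
    }

    /** Hypothetical append-style alternative: keeps existing entries and adds one more. */
    static void addCacheFile(URI file, Map<String, String> conf) {
        String existing = conf.get(CACHE_FILES_KEY);
        boolean empty = existing == null || existing.isEmpty();
        conf.put(CACHE_FILES_KEY, empty ? file.toString() : existing + "," + file);
    }

    public static void main(String[] args) {
        URI jar = URI.create("hdfs:///oozie/lib/mahout-math.jar");   // assumed path
        URI dictionary = URI.create("hdfs:///tmp/dictionary.file-0"); // assumed path

        // Overwrite: the jar entry is lost once the dictionary is set.
        Map<String, String> overwritten = new HashMap<>();
        overwritten.put(CACHE_FILES_KEY, jar.toString());
        setCacheFiles(new URI[] {dictionary}, overwritten);
        System.out.println("set:    " + overwritten.get(CACHE_FILES_KEY));

        // Append: the jar entry survives next to the dictionary.
        Map<String, String> appended = new HashMap<>();
        appended.put(CACHE_FILES_KEY, jar.toString());
        addCacheFile(dictionary, appended);
        System.out.println("append: " + appended.get(CACHE_FILES_KEY));
    }
}
```

In real Hadoop code the append behaviour is available through DistributedCache.addCacheFile(URI, Configuration); whether the change in rev 1595643 uses that call or another approach is not stated in this message.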