[ https://issues.apache.org/jira/browse/MAHOUT-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975537#comment-13975537 ]
Sergey commented on MAHOUT-1498:
--------------------------------

Yes, it's possible.
1. Is http://commons.apache.org/patches.html a valid guide?
2. Which branch should I check out for patching?
Right now it's done 'quick-and-dirty'. It can be used as a patch...

> DistributedCache.setCacheFiles in DictionaryVectorizer overwrites jars pushed using oozie
> -----------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1498
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1498
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.7
>         Environment: mahout-core-0.7-cdh4.4.0.jar
>            Reporter: Sergey
>             Fix For: 1.0
>
>
> Hi, I get an exception:
> {code}
> <<< Invocation of Main class completed <<<
> Failing Oozie Launcher, Main class [org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles], main() threw exception, Job failed!
> java.lang.IllegalStateException: Job failed!
> at org.apache.mahout.vectorizer.DictionaryVectorizer.makePartialVectors(DictionaryVectorizer.java:329)
> at org.apache.mahout.vectorizer.DictionaryVectorizer.createTermFrequencyVectors(DictionaryVectorizer.java:199)
> at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:271)
> {code}
> The root cause is:
> {code}
> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:247)
> {code}
> It looks like this is caused by the DictionaryVectorizer.makePartialVectors method.
> It contains this call:
> {code}
> DistributedCache.setCacheFiles(new URI[] {dictionaryFilePath.toUri()}, conf);
> {code}
> which overwrites the jars pushed with the job by Oozie, because setCacheFiles replaces the whole "mapred.cache.files" value:
> {code}
> public static void setCacheFiles(URI[] files, Configuration conf) {
>   String sfiles = StringUtils.uriToString(files);
>   conf.set("mapred.cache.files", sfiles);
> }
> {code}
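
To illustrate the overwrite described in the issue, here is a minimal sketch that does not depend on Hadoop. A plain Map stands in for Configuration, the class name CacheFilesSketch and both HDFS paths are hypothetical, and the append-style method mimics what Hadoop's DistributedCache.addCacheFile(URI, Configuration) does (it appends to "mapred.cache.files" rather than replacing it). Whether switching to addCacheFile is the right fix for MAHOUT-1498 is an assumption, not something confirmed in this thread.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Stand-in for the behaviour described in the issue; a Map replaces Hadoop's Configuration.
final class CacheFilesSketch {
    static final String CACHE_FILES = "mapred.cache.files";

    // Mirrors the quoted setCacheFiles: replaces whatever was registered before.
    static void setCacheFiles(URI[] files, Map<String, String> conf) {
        StringBuilder joined = new StringBuilder();
        for (URI file : files) {
            if (joined.length() > 0) {
                joined.append(',');
            }
            joined.append(file);
        }
        conf.put(CACHE_FILES, joined.toString());
    }

    // Append-style alternative: keeps existing entries and adds one more.
    static void addCacheFile(URI file, Map<String, String> conf) {
        String existing = conf.get(CACHE_FILES);
        conf.put(CACHE_FILES, existing == null ? file.toString() : existing + "," + file);
    }

    public static void main(String[] args) {
        URI jar = URI.create("hdfs:///user/oozie/lib/mahout-math.jar"); // hypothetical path
        URI dictionary = URI.create("hdfs:///tmp/dictionary.file-0");   // hypothetical path

        // Case 1: the jar registered by the launcher is lost after setCacheFiles.
        Map<String, String> overwritten = new HashMap<>();
        overwritten.put(CACHE_FILES, jar.toString());
        setCacheFiles(new URI[] {dictionary}, overwritten);
        System.out.println("setCacheFiles: " + overwritten.get(CACHE_FILES));

        // Case 2: the jar survives when the dictionary file is appended instead.
        Map<String, String> appended = new HashMap<>();
        appended.put(CACHE_FILES, jar.toString());
        addCacheFile(dictionary, appended);
        System.out.println("addCacheFile:  " + appended.get(CACHE_FILES));
    }
}
```

In the first case only the dictionary file remains in the cache list, which matches the ClassNotFoundException for org.apache.mahout.math.Vector reported in the issue; in the second case both entries are kept.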