[ https://issues.apache.org/jira/browse/MAHOUT-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14001120#comment-14001120 ]
Sergey commented on MAHOUT-1498:
--------------------------------

Hi, I've added a separate class and a test. I hope I followed the guidelines. Please check whether it is OK or whether I need to work on it more. Thanks.

> DistributedCache.setCacheFiles in DictionaryVectorizer overwrites jars pushed using oozie
> -----------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1498
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1498
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.7
>         Environment: mahout-core-0.7-cdh4.4.0.jar
>            Reporter: Sergey
>              Labels: patch
>             Fix For: 1.0
>
>         Attachments: MAHOUT-1498.patch
>
>
> Hi, I get an exception:
> {code}
> <<< Invocation of Main class completed <<<
> Failing Oozie Launcher, Main class [org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles], main() threw exception, Job failed!
> java.lang.IllegalStateException: Job failed!
> 	at org.apache.mahout.vectorizer.DictionaryVectorizer.makePartialVectors(DictionaryVectorizer.java:329)
> 	at org.apache.mahout.vectorizer.DictionaryVectorizer.createTermFrequencyVectors(DictionaryVectorizer.java:199)
> 	at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:271)
> {code}
> The root cause is:
> {code}
> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> 	at java.lang.Class.forName0(Native Method)
> 	at java.lang.Class.forName(Class.java:247)
> {code}
> It looks like this happens because of the DictionaryVectorizer.makePartialVectors method.
> It has this code:
> {code}
> DistributedCache.setCacheFiles(new URI[] {dictionaryFilePath.toUri()}, conf);
> {code}
> which overwrites the jars pushed with the job by Oozie, because setCacheFiles replaces the whole property value:
> {code}
> public static void setCacheFiles(URI[] files, Configuration conf) {
>   String sfiles = StringUtils.uriToString(files);
>   conf.set("mapred.cache.files", sfiles);
> }
> {code}
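
To make the overwrite-versus-append difference described in the issue concrete, here is a minimal, self-contained sketch. It does not use Hadoop: a plain `Map` stands in for `Configuration`, the class name `CacheFilesSketch` and both HDFS paths are hypothetical, and the append method only imitates what Hadoop's `DistributedCache.addCacheFile(URI, Configuration)` does with the same property. It is not necessarily what the attached MAHOUT-1498.patch implements.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Illustration only: a Map stands in for Hadoop's Configuration.
public class CacheFilesSketch {
    static final String CACHE_FILES_KEY = "mapred.cache.files";

    // Same behaviour as the quoted setCacheFiles: the property is replaced,
    // so anything registered earlier (e.g. jars pushed by Oozie) is lost.
    static void setCacheFiles(URI[] files, Map<String, String> conf) {
        StringBuilder joined = new StringBuilder();
        for (URI file : files) {
            if (joined.length() > 0) {
                joined.append(',');
            }
            joined.append(file);
        }
        conf.put(CACHE_FILES_KEY, joined.toString());
    }

    // Append semantics: keep the existing comma-separated list and add one entry.
    static void addCacheFile(URI file, Map<String, String> conf) {
        String existing = conf.get(CACHE_FILES_KEY);
        if (existing == null || existing.isEmpty()) {
            conf.put(CACHE_FILES_KEY, file.toString());
        } else {
            conf.put(CACHE_FILES_KEY, existing + "," + file);
        }
    }

    public static void main(String[] args) {
        // Hypothetical paths, chosen only for the demonstration.
        URI jar = URI.create("hdfs:///oozie/lib/mahout-math.jar");
        URI dictionary = URI.create("hdfs:///tmp/dictionary.file-0");

        Map<String, String> overwritten = new HashMap<>();
        addCacheFile(jar, overwritten);                         // pushed by the launcher
        setCacheFiles(new URI[] {dictionary}, overwritten);     // drops the jar
        System.out.println("set: " + overwritten.get(CACHE_FILES_KEY));

        Map<String, String> appended = new HashMap<>();
        addCacheFile(jar, appended);                            // pushed by the launcher
        addCacheFile(dictionary, appended);                     // keeps the jar
        System.out.println("add: " + appended.get(CACHE_FILES_KEY));
    }
}
```

With the append variant the jar entry survives next to the dictionary file, which is the behaviour the job needs in order to find org.apache.mahout.math.Vector on the task classpath.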