I wrote a message to the Hadoop list about it. I also found this ticket: https://issues.apache.org/jira/browse/MAHOUT-1498
Could it be a related bug?

Best,
Max
On 01/08/2015 06:18 PM, Pat Ferrel wrote:
That sounds like a Hadoop list question.

All I can say is there is a job.jar in mrlegacy/target with all dependencies packaged. This should have everything needed for LDA.
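
A quick way to check that a given jar really carries the class reported missing further down in this thread (org.apache.mahout.math.Vector) is to look for its entry in the jar. The following is only a minimal sketch in plain Java, with no Hadoop or Mahout on the classpath; the class name JobJarCheck and the command-line arguments are made up for illustration. It inspects top-level entries only, so classes inside nested jars (for example under lib/) would not be found.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JobJarCheck {

    /** Converts a fully qualified class name into its entry path inside a jar. */
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /**
     * Returns true if the jar at jarPath has a top-level entry for className.
     * Nested jars (e.g. lib/*.jar inside the job jar) are NOT inspected.
     */
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JobJarCheck <path-to-jar> <class-name>");
            return;
        }
        boolean found = containsClass(args[0], args[1]);
        System.out.println(args[1] + (found ? " found in " : " NOT found in ") + args[0]);
    }
}
```

It would be run with the path to the job jar as the first argument and org.apache.mahout.math.Vector as the second.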

On Jan 8, 2015, at 5:50 AM, mw <m...@plista.com> wrote:

Hello again,

Maybe my question was misleading.
I am asking whether the intended usage is to bundle the required libraries with the job and send them to YARN together with it (and if so, how can this be done?), or to add the required classes to the classpath of every node in the cluster.
What is the best practice?

Best,
Max


On 01/07/2015 06:13 PM, mw wrote:
Hello,

The first error was due to a missing property in yarn.xml. However, now I have a different problem.

I am working on a web application that should execute LDA on an external YARN cluster.

I am uploading all the relevant sequence files onto the YARN cluster.
This is how I try to remotely execute LDA on the cluster:

        try {
            ugi.doAs(new PrivilegedExceptionAction<Void>() {
                public Void run() throws Exception {
                    Configuration hdoopConf = new Configuration();
                    hdoopConf.set("fs.defaultFS", "hdfs://xxx.xxx.xxx.xxx:9000/user/xx");
                    hdoopConf.set("yarn.resourcemanager.hostname", "xxx.xxx.xxx.xxx");
                    hdoopConf.set("mapreduce.framework.name", "yarn");
                    hdoopConf.set("mapred.framework.name", "yarn");
                    hdoopConf.set("mapred.job.tracker", "xxx.xxx.xxx.xxx");
                    hdoopConf.set("dfs.permissions.enabled", "false");
                    hdoopConf.set("hadoop.job.ugi", "xx");
                    hdoopConf.set("mapreduce.jobhistory.address", "xxx.xxx.xxx.xxx:10020");
                    CVB0Driver driver = new CVB0Driver();
                    try {
                        driver.run(hdoopConf, sparseVectorIn.suffix("/matrix"),
                                topicsOut, k, numTerms, doc_topic_smoothening,
                                term_topic_smoothening, maxIter, iteration_block_size,
                                convergenceDelta, sparseVectorIn.suffix("/dictionary.file-0"),
                                topicsOut.suffix("/DocumentTopics/"), sparseVectorIn,
                                seed, testFraction, numTrainThreads,
                                numUpdateThreads, maxItersPerDoc,
                                numReduceTasks, backfillPerplexity);
                    } catch (ClassNotFoundException e) {
                        e.printStackTrace();
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    return null;
                }
            });
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

I am getting the following error message:

Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
    at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:344)
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
    at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:837)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
    at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

java.lang.InterruptedException: Failed to complete iteration 1 stage 1
    at org.apache.mahout.clustering.lda.cvb.CVB0Driver.runIteration(CVB0Driver.java:502)
    at org.apache.mahout.clustering.lda.cvb.CVB0Driver.run(CVB0Driver.java:319)
    ...

So apparently the job is missing some Mahout classes. How can I provide the required classes to YARN?

Best,

Max
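
When a job is missing classes on the cluster, it can help to see which jar each class is loaded from on the submitting side. Hadoop's Job.setJarByClass ships the jar that contains the class it is given; if the driver relies on that (an assumption here, not verified against the Mahout source), only that one jar travels with the job, and classes living in other jars of the web application would be absent in the task JVMs. The sketch below is plain Java and the name ClassOrigin is made up for illustration.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /**
     * Returns the jar or directory a class was loaded from,
     * or null when the JVM does not report one (e.g. JDK core classes).
     */
    static String originOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return null;
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Each argument is a fully qualified class name that must be on the classpath.
        for (String name : args) {
            System.out.println(name + " -> " + originOf(Class.forName(name)));
        }
    }
}
```

Run inside the web application's classpath with org.apache.mahout.clustering.lda.cvb.CVB0Driver and org.apache.mahout.math.Vector as arguments, it would show whether the two classes come from the same jar.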

