Hello,

The first error was due to a missing property in yarn.xml. However, I now have a different problem.


I am working on a web application that should execute LDA on an external YARN cluster.

I am uploading all the relevant sequence files to the YARN cluster.
This is how I try to execute LDA remotely on the cluster:

        try {
            ugi.doAs(new PrivilegedExceptionAction<Void>() {
                public Void run() throws Exception {
                    Configuration hdoopConf = new Configuration();
                    hdoopConf.set("fs.defaultFS", "hdfs://xxx.xxx.xxx.xxx:9000/user/xx");
                    hdoopConf.set("yarn.resourcemanager.hostname", "xxx.xxx.xxx.xxx");
                    hdoopConf.set("mapreduce.framework.name", "yarn");
                    hdoopConf.set("mapred.framework.name", "yarn");
                    hdoopConf.set("mapred.job.tracker", "xxx.xxx.xxx.xxx");
                    hdoopConf.set("dfs.permissions.enabled", "false");
                    hdoopConf.set("hadoop.job.ugi", "xx");
                    hdoopConf.set("mapreduce.jobhistory.address", "xxx.xxx.xxx.xxx:10020");
                    CVB0Driver driver = new CVB0Driver();
                    try {
                        driver.run(hdoopConf, sparseVectorIn.suffix("/matrix"), topicsOut, k, numTerms,
                                doc_topic_smoothening, term_topic_smoothening, maxIter, iteration_block_size,
                                convergenceDelta, sparseVectorIn.suffix("/dictionary.file-0"),
                                topicsOut.suffix("/DocumentTopics/"), sparseVectorIn, seed, testFraction,
                                numTrainThreads, numUpdateThreads, maxItersPerDoc,
                                numReduceTasks, backfillPerplexity);
                    } catch (ClassNotFoundException e) {
                        e.printStackTrace();
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    return null;
                }
            });
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

I am getting the following error message:

Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
        at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:344)
        at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
        at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:837)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
        at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
        at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
        at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:344)
        at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
        at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:837)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
        at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
        at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
        at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:344)
        at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
        at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:837)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
        at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
        at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
        at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:344)
        at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
        at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:837)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
        at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
        at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

java.lang.InterruptedException: Failed to complete iteration 1 stage 1
        at org.apache.mahout.clustering.lda.cvb.CVB0Driver.runIteration(CVB0Driver.java:502)
        at org.apache.mahout.clustering.lda.cvb.CVB0Driver.run(CVB0Driver.java:319)
    ...

So apparently the map tasks on the cluster cannot find some Mahout classes (org.apache.mahout.math.Vector in this case). How can I provide the required classes to YARN?
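
For context, one candidate approach is the `tmpjars` configuration property, which is the property that Hadoop's `-libjars` command-line option fills in; jars listed there are shipped with the job and added to the task classpath. The following is only a minimal sketch and untested against this cluster. It assumes the Mahout jars are available in a local directory on the web application host; the class name `TmpJars`, the method `buildTmpJars`, and the directory argument are hypothetical. It builds the comma-separated list of `file:` URIs that could then be set on the configuration before calling the driver.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: builds a value for Hadoop's "tmpjars" property
// from the jar files found directly inside a local directory.
final class TmpJars {

    // Returns the *.jar files in dir as a comma-separated list of file: URIs,
    // sorted by path so the result is deterministic.
    static String buildTmpJars(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(p -> Files.isRegularFile(p)
                            && p.getFileName().toString().endsWith(".jar"))
                    .sorted()
                    .map(p -> p.toAbsolutePath().toUri().toString())
                    .collect(Collectors.joining(","));
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the first argument is the local directory holding the Mahout jars.
        Path libDir = Paths.get(args.length > 0 ? args[0] : ".");
        String tmpJars = buildTmpJars(libDir);
        System.out.println(tmpJars);
        // In the web application this value would be set before driver.run(...), e.g.:
        // hdoopConf.set("tmpjars", tmpJars);
    }
}
```

The `file:` scheme is used so that the paths are treated as local files rather than being resolved against `fs.defaultFS`. Whether this is enough here depends on how CVB0Driver creates its jobs from the given Configuration.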

Best,

Max
