You first need to convert the *.sgm files from the Reuters download to text
files (this should happen before running seqdirectory).

To convert .sgm to text, run:

$MAHOUT org.apache.lucene.benchmark.utils.ExtractReuters ${WORK_DIR}/reuters-sgm ${WORK_DIR}/reuters-out

Then run seqdirectory on the output of the previous step.
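
Put together, the two steps might look like the sketch below. The
ExtractReuters class and the seqdirectory -i/-o options come from this
thread; everything else is an assumption for illustration: the WORK_DIR
default, the reuters-out-seqdir output name, the reuters_to_seq and run
helper names, and the DRY_RUN switch, which makes the script only print the
commands unless it is set to 0.

```shell
#!/bin/sh
set -e

# Assumed defaults; adjust to your installation.
MAHOUT="${MAHOUT:-mahout}"                 # Mahout driver script, assumed to be on PATH
WORK_DIR="${WORK_DIR:-/tmp/mahout-work}"   # hypothetical working directory
DRY_RUN="${DRY_RUN:-1}"                    # 1 = only print commands, 0 = execute them

# Print a command and, unless in dry-run mode, execute it.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

reuters_to_seq() {
  # Step 1: extract plain-text files from the Reuters .sgm files.
  run "$MAHOUT" org.apache.lucene.benchmark.utils.ExtractReuters \
    "$WORK_DIR/reuters-sgm" "$WORK_DIR/reuters-out"

  # Step 2: run seqdirectory on the extracted text, not on the .sgm directory.
  run "$MAHOUT" seqdirectory \
    -i "$WORK_DIR/reuters-out" \
    -o "$WORK_DIR/reuters-out-seqdir"
}

reuters_to_seq
```

Run it once as-is to check the printed commands, then with DRY_RUN=0 to
execute them. If Mahout is running against a Hadoop cluster, the extracted
text probably has to be copied to HDFS before step 2; that copy is not shown
here.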


On Mon, Jun 23, 2014 at 6:43 PM, Parimi Rohit <rohit.par...@gmail.com>
wrote:

> Hi All,
>
> I am trying to run LDA from Mahout and as a first step I wanted to run the
> "SequenceFilesFromDirectory" job to convert the text files into sequence
> files. Following is the command I am using:
>
> hadoop jar
> /Users/rohitp/Desktop/rohitp/Downloads/mahout-distribution-0.9/mahout-examples-0.9-job.jar
> org.apache.mahout.text.SequenceFilesFromDirectory -i
> LDA/reuter_example/reuters-sgm/ -o LDA/reuter_example/reuters-sgm_seq/
>
>
>
> However, I get the following class not found exception. I also tried to use
> the mahout driver program but got the same exception (mahout seqdirectory
> -i LDA/reuter_example/reuters-sgm/ -o LDA/reuter_example/reuters-sgm_seq/).
>
>
> Hadoop Version: Hadoop 1.2.1
>
> Mahout version: 0.9
>
>
> Any help is much appreciated.
>
>
>
> Rohit
>
>
>
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
>     at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
>     at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
>     at org.apache.mahout.text.MultipleTextFileInputFormat.createRecordReader(MultipleTextFileInputFormat.java:43)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:488)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:394)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>     at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
>     ... 10 more
> Caused by: java.lang.NoClassDefFoundError: org/apache/mahout/common/AbstractJob
>     at java.lang.ClassLoader.defineClass1(Native Method)
>     at java.lang.ClassLoader.defineClassCond(ClassLoader.java:637)
>     at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
>     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
>     at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
>     at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>     at org.apache.mahout.text.WholeFileRecordReader.<init>(WholeFileRecordReader.java:61)
>     ... 15 more
>
