Mahout 0.8/0.9 are certified for Hadoop 1.2.1.
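For later readers, a sketch of what running against Hadoop 1.2.1 might look like; the install path below is a hypothetical example, not from this thread, so adjust it for your machine:

```shell
# Sketch, assuming Hadoop 1.2.1 is unpacked under /opt/hadoop-1.2.1
# (hypothetical path). Mahout 0.8/0.9 MapReduce jobs were built against
# the Hadoop 1.x API, so point Mahout's launcher at the 1.2.1 install.
export HADOOP_HOME=/opt/hadoop-1.2.1
export HADOOP_CONF_DIR="$HADOOP_HOME/conf"
export PATH="$HADOOP_HOME/bin:$PATH"
hadoop version   # confirm this reports 1.2.1 before launching any Mahout MR job
```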
On Thursday, February 20, 2014 4:25 PM, "Zhang, Pengchu" <pzh...@sandia.gov> wrote:

Thanks, it executed successfully. Two more questions related to this:

1. Does this mean I have to run Mahout for further analysis in non-MR mode?
2. It is too bad that Hadoop 2.2 does not support the newer versions of Mahout. Are you aware of Hadoop 1.x working with Mahout 0.8 or 0.9 in MR mode? I do have a large dataset to be clustered.

Thanks.
Pengchu

-----Original Message-----
From: Suneel Marthi [mailto:suneel_mar...@yahoo.com]
Sent: Thursday, February 20, 2014 1:17 PM
To: user@mahout.apache.org
Subject: [EXTERNAL] Re: Mapreduce job failed

... and the reason for this failure is that 'TaskAttemptContext', which was a class in Hadoop 1.x, has become an interface in Hadoop 2.2. Suggest that you execute this job in non-MR mode with '-xm sequential'.

On Thursday, February 20, 2014 2:26 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:

Seems like you are running this on Hadoop 2.2 (officially not supported for Mahout 0.8 or 0.9); the workaround is to run this in sequential mode with "-xm sequential".

On Thursday, February 20, 2014 1:36 PM, "Zhang, Pengchu" <pzh...@sandia.gov> wrote:

Hello,

I am trying to run "seqdirectory" with Mahout (0.8 and 0.9) on my Linux box with Hadoop (2.2.0), but it keeps failing consistently. I tested Hadoop with the bundled pi and wordcount examples; both worked well. With a simple text file or a directory of multiple text files, e.g. shakespeare_text, I get the same failure message:

$ mahout seqdirectory --input /shakespeare_text --output /shakespeare-seqdir --charset utf-8
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /home/pzhang/hadoop-2.2.0/bin/hadoop and HADOOP_CONF_DIR=/home/pzhang/hadoop-2.2.0/etc/hadoop
MAHOUT-JOB: /home/pzhang/MAHOUT_HOME/mahout-examples-0.8-job.jar
14/02/20 11:29:42 INFO common.AbstractJob: Command line arguments: {--charset=[utf-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[/shakespeare_text], --keyPrefix=[], --method=[mapreduce], --output=[/shakespeare-seqdir], --startPhase=[0], --tempDir=[temp]}
14/02/20 11:29:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/02/20 11:29:42 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/02/20 11:29:42 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
14/02/20 11:29:42 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/02/20 11:29:43 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/02/20 11:29:44 INFO input.FileInputFormat: Total input paths to process : 43
14/02/20 11:29:44 INFO input.CombineFileInputFormat: DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 5284832
14/02/20 11:29:44 INFO mapreduce.JobSubmitter: number of splits:1
14/02/20 11:29:44 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
14/02/20 11:29:44 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/02/20 11:29:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1392919123773_0004
14/02/20 11:29:45 INFO impl.YarnClientImpl: Submitted application application_1392919123773_0004 to ResourceManager at /0.0.0.0:8032
14/02/20 11:29:45 INFO mapreduce.Job: The url to track the job: http://savm0072lx.sandia.gov:8088/proxy/application_1392919123773_0004/
14/02/20 11:29:45 INFO mapreduce.Job: Running job: job_1392919123773_0004
14/02/20 11:29:53 INFO mapreduce.Job: Job job_1392919123773_0004 running in uber mode : false
14/02/20 11:29:53 INFO mapreduce.Job:  map 0% reduce 0%
14/02/20 11:29:58 INFO mapreduce.Job: Task Id : attempt_1392919123773_0004_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
        at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
        at org.apache.mahout.text.MultipleTextFileInputFormat.createRecordReader(MultipleTextFileInputFormat.java:43)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:491)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:734)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:534)
        at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
        ... 10 more
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
        at org.apache.mahout.text.WholeFileRecordReader.<init>(WholeFileRecordReader.java:52)
        ... 15 more

Any suggestion is helpful. Thanks.

Pengchu
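The '-xm sequential' workaround suggested above can be sketched as follows, reusing the input/output paths from the failing command; sequential mode runs the conversion in a single local process, so the Hadoop 2.2 TaskAttemptContext incompatibility in the MapReduce path is never exercised:

```shell
# Sketch of the suggested workaround: run seqdirectory in non-MR
# (sequential) mode instead of submitting a MapReduce job.
mahout seqdirectory \
  --input /shakespeare_text \
  --output /shakespeare-seqdir \
  --charset utf-8 \
  -xm sequential
```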