Yeah, the default AMI has these lucene jars in ~/lib: /home/hadoop/lib/lucene-surround-2.9.4.jar /home/hadoop/lib/lucene-bdb-je-2.9.4.jar /home/hadoop/lib/lucene-bdb-2.9.4.jar /home/hadoop/lib/lucene-ant-2.9.4.jar /home/hadoop/lib/lucene-xml-query-parser-2.9.4.jar /home/hadoop/lib/lucene-swing-2.9.4.jar /home/hadoop/lib/lucene-queries-2.9.4.jar /home/hadoop/lib/lucene-instantiated-2.9.4.jar /home/hadoop/lib/lucene-spellchecker-2.9.4.jar /home/hadoop/lib/lucene-analyzers-2.9.4.jar /home/hadoop/lib/lucene-wordnet-2.9.4.jar /home/hadoop/lib/lucene-misc-2.9.4.jar /home/hadoop/lib/lucene-queryparser-2.9.4.jar /home/hadoop/lib/lucene-snowball-2.9.4.jar /home/hadoop/lib/lucene-collation-2.9.4.jar /home/hadoop/lib/lucene-remote-2.9.4.jar /home/hadoop/lib/lucene-lucli-2.9.4.jar /home/hadoop/lib/lucene-smartcn-2.9.4.jar /home/hadoop/lib/lucene-regex-2.9.4.jar /home/hadoop/lib/lucene-highlighter-2.9.4.jar /home/hadoop/lib/lucene-benchmark-2.9.4.jar /home/hadoop/lib/lucene-memory-2.9.4.jar /home/hadoop/lib/lucene-fast-vector-highlighter-2.9.4.jar /home/hadoop/lib/lucene-wikipedia-2.9.4.jar /home/hadoop/lib/lucene-spatial-2.9.4.jar /home/hadoop/lib/lucene-demos-2.9.4.jar /home/hadoop/lib/lucene-core-2.9.4.jar
Works fine when specifying an AMI with Hadoop 2, and here's what they ship: /home/hadoop/.versions/mahout-0.8/lib/lucene-sandbox-4.3.0.jar /home/hadoop/.versions/mahout-0.8/lib/lucene-core-4.3.0.jar /home/hadoop/.versions/mahout-0.8/lib/lucene-analyzers-common-4.3.0.jar /home/hadoop/.versions/mahout-0.8/lib/lucene-benchmark-4.3.0.jar /home/hadoop/.versions/mahout-0.8/lib/lucene-highlighter-4.3.0.jar /home/hadoop/.versions/mahout-0.8/lib/lucene-facet-4.3.0.jar /home/hadoop/.versions/mahout-0.8/lib/lucene-memory-4.3.0.jar /home/hadoop/.versions/mahout-0.8/lib/lucene-queries-4.3.0.jar /home/hadoop/.versions/mahout-0.8/lib/lucene-queryparser-4.3.0.jar On Thu, Mar 13, 2014 at 6:32 PM, Suneel Marthi <[email protected]>wrote: > MAHOUT 0.8 and later is Lucene 4.x, suspect this is being caused by an > antiquated lucene jar (lucene 3.x or earlier) that's part of Hadoop 1.0.3 > distro and is not compatible with lucene 4.x. > > > > > > > On Thursday, March 13, 2014 8:28 PM, Andrew Musselman < > [email protected]> wrote: > > It's with Hadoop 1.0.3 btw > > $ ./mahout-distribution-0.9/bin/mahout recommenditembased --input > ratings.csv --output recommendations --numRecommendations 10 > --outputPathForSimilarityMatrix similarity-matrix --similarityClassname > SIMILARITY_COSINE > Running on hadoop, using /home/hadoop/bin/hadoop and HADOOP_CONF_DIR= > MAHOUT-JOB: > /home/hadoop/mahout-distribution-0.9/mahout-examples-0.9-job.jar > 14/03/13 23:04:20 INFO common.AbstractJob: Command line arguments: > {--booleanData=[false], --endPhase=[2147483647], --input=[ratings.csv], > --maxPrefsInItemSimilarity=[500], --maxPrefsPerUser=[10], > --maxSimilaritiesPerItem=[100], --minPrefsPerUser=[1], > --numRecommendations=[10], --output=[recommendations], > --outputPathForSimilarityMatrix=[similarity-matrix], > --similarityClassname=[SIMILARITY_COSINE], --startPhase=[0], > --tempDir=[temp]} > 14/03/13 23:04:20 INFO common.AbstractJob: Command line arguments: > {--booleanData=[false], --endPhase=[2147483647], --input=[ratings.csv], > --minPrefsPerUser=[1], --output=[temp/preparePreferenceMatrix], > --ratingShift=[0.0], --startPhase=[0], --tempDir=[temp]} > 14/03/13 23:04:24 INFO mapred.JobClient: Default number of map tasks: null > 14/03/13 23:04:24 INFO mapred.JobClient: Setting default number of map > tasks based on cluster size to : 12 > 14/03/13 23:04:24 INFO mapred.JobClient: Default number of reduce tasks: 5 > 14/03/13 23:04:26 INFO security.ShellBasedUnixGroupsMapping: add hadoop to > shell userGroupsCache > 14/03/13 23:04:26 INFO mapred.JobClient: Setting group to hadoop > 14/03/13 23:04:26 INFO input.FileInputFormat: Total input paths to process > : 1 > 14/03/13 23:04:26 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library > 14/03/13 23:04:26 WARN lzo.LzoCodec: Could not find build properties file > with revision hash > 14/03/13 23:04:26 INFO lzo.LzoCodec: Successfully loaded & initialized > native-lzo library [hadoop-lzo rev UNKNOWN] > 14/03/13 23:04:26 WARN snappy.LoadSnappy: Snappy native library is > available > 14/03/13 23:04:26 INFO snappy.LoadSnappy: Snappy native library loaded > 14/03/13 23:04:28 INFO mapred.JobClient: Running job: job_201403132009_0001 > 14/03/13 23:04:29 INFO mapred.JobClient: map 0% reduce 0% > 14/03/13 23:05:17 INFO mapred.JobClient: map 31% reduce 0% > 14/03/13 23:05:20 INFO mapred.JobClient: map 50% reduce 0% > 14/03/13 23:05:23 INFO mapred.JobClient: map 84% reduce 0% > 14/03/13 23:05:26 INFO mapred.JobClient: map 100% reduce 0% > 14/03/13 23:05:47 INFO mapred.JobClient: map 100% reduce 20% > 14/03/13 23:05:59 INFO mapred.JobClient: map 100% reduce 40% > 14/03/13 23:06:02 INFO mapred.JobClient: map 100% reduce 60% > 14/03/13 23:06:05 INFO mapred.JobClient: map 100% reduce 80% > 14/03/13 23:06:08 INFO mapred.JobClient: map 100% reduce 100% > 14/03/13 23:06:13 INFO mapred.JobClient: Job complete: > job_201403132009_0001 > 14/03/13 23:06:13 INFO mapred.JobClient: Counters: 29 > 14/03/13 23:06:13 INFO mapred.JobClient: Job Counters > 14/03/13 23:06:13 INFO mapred.JobClient: Launched reduce tasks=7 > 14/03/13 23:06:13 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=41955 > 14/03/13 23:06:13 INFO mapred.JobClient: Total time spent by all > reduces waiting after reserving slots (ms)=0 > 14/03/13 23:06:13 INFO mapred.JobClient: Total time spent by all maps > waiting after reserving slots (ms)=0 > 14/03/13 23:06:13 INFO mapred.JobClient: Rack-local map tasks=1 > 14/03/13 23:06:13 INFO mapred.JobClient: Launched map tasks=1 > 14/03/13 23:06:13 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=80785 > 14/03/13 23:06:13 INFO mapred.JobClient: File Output Format Counters > 14/03/13 23:06:13 INFO mapred.JobClient: Bytes Written=45263 > 14/03/13 23:06:13 INFO mapred.JobClient: FileSystemCounters > 14/03/13 23:06:13 INFO mapred.JobClient: FILE_BYTES_READ=273711 > 14/03/13 23:06:13 INFO mapred.JobClient: HDFS_BYTES_READ=11553569 > 14/03/13 23:06:13 INFO mapred.JobClient: FILE_BYTES_WRITTEN=281101 > 14/03/13 23:06:13 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=45263 > 14/03/13 23:06:13 INFO mapred.JobClient: File Input Format Counters > 14/03/13 23:06:13 INFO mapred.JobClient: Bytes Read=11553456 > 14/03/13 23:06:13 INFO mapred.JobClient: Map-Reduce Framework > 14/03/13 23:06:13 INFO mapred.JobClient: Map output materialized > bytes=22207 > 14/03/13 23:06:13 INFO mapred.JobClient: Map input records=1000209 > 14/03/13 23:06:13 INFO mapred.JobClient: Reduce shuffle bytes=22207 > 14/03/13 23:06:13 INFO mapred.JobClient: Spilled Records=21398 > 14/03/13 23:06:13 INFO mapred.JobClient: Map output bytes=3946230 > 14/03/13 23:06:13 INFO mapred.JobClient: Total committed heap usage > (bytes)=257757184 > 14/03/13 23:06:13 INFO mapred.JobClient: CPU time spent (ms)=25370 > 14/03/13 23:06:13 INFO mapred.JobClient: Combine input records=1014195 > 14/03/13 23:06:13 INFO mapred.JobClient: SPLIT_RAW_BYTES=113 > 14/03/13 23:06:13 INFO mapred.JobClient: Reduce input records=3706 > 14/03/13 23:06:13 INFO mapred.JobClient: Reduce input groups=3706 > 14/03/13 23:06:13 INFO mapred.JobClient: Combine output records=17692 > 14/03/13 23:06:13 INFO mapred.JobClient: Physical memory (bytes) > snapshot=653774848 > 14/03/13 23:06:13 INFO mapred.JobClient: Reduce output records=3706 > 14/03/13 23:06:13 INFO mapred.JobClient: Virtual memory (bytes) > snapshot=3967475712 > 14/03/13 23:06:13 INFO mapred.JobClient: Map output records=1000209 > 14/03/13 23:06:13 INFO mapred.JobClient: Default number of map tasks: null > 14/03/13 23:06:13 INFO mapred.JobClient: Setting default number of map > tasks based on cluster size to : 12 > 14/03/13 23:06:13 INFO mapred.JobClient: Default number of reduce tasks: 5 > 14/03/13 23:06:14 INFO mapred.JobClient: Setting group to hadoop > 14/03/13 23:06:14 INFO input.FileInputFormat: Total input paths to process > : 1 > 14/03/13 23:06:15 INFO mapred.JobClient: Running job: job_201403132009_0002 > 14/03/13 23:06:16 INFO mapred.JobClient: map 0% reduce 0% > 14/03/13 23:07:02 INFO mapred.JobClient: map 30% reduce 0% > 14/03/13 23:07:05 INFO mapred.JobClient: map 58% reduce 0% > 14/03/13 23:07:08 INFO mapred.JobClient: map 84% reduce 0% > 14/03/13 23:07:11 INFO mapred.JobClient: map 100% reduce 0% > 14/03/13 23:07:41 INFO mapred.JobClient: map 100% reduce 20% > 14/03/13 23:07:47 INFO mapred.JobClient: map 100% reduce 40% > 14/03/13 23:07:48 INFO mapred.JobClient: map 100% reduce 60% > 14/03/13 23:07:53 INFO mapred.JobClient: map 100% reduce 80% > 14/03/13 23:07:56 INFO mapred.JobClient: map 100% reduce 100% > 14/03/13 23:08:10 INFO mapred.JobClient: Job complete: > job_201403132009_0002 > 14/03/13 23:08:10 INFO mapred.JobClient: Counters: 30 > 14/03/13 23:08:10 INFO mapred.JobClient: > org.apache.mahout.cf.taste.hadoop.item.ToUserVectorsReducer$Counters > 14/03/13 23:08:10 INFO mapred.JobClient: USERS=6040 > 14/03/13 23:08:10 INFO mapred.JobClient: Job Counters > 14/03/13 23:08:10 INFO mapred.JobClient: Launched reduce tasks=7 > 14/03/13 23:08:10 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=51634 > 14/03/13 23:08:10 INFO mapred.JobClient: Total time spent by all > reduces waiting after reserving slots (ms)=0 > 14/03/13 23:08:10 INFO mapred.JobClient: Total time spent by all maps > waiting after reserving slots (ms)=0 > 14/03/13 23:08:10 INFO mapred.JobClient: Rack-local map tasks=1 > 14/03/13 23:08:10 INFO mapred.JobClient: Launched map tasks=1 > 14/03/13 23:08:10 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=86972 > 14/03/13 23:08:10 INFO mapred.JobClient: File Output Format Counters > 14/03/13 23:08:10 INFO mapred.JobClient: Bytes Written=6105753 > 14/03/13 23:08:10 INFO mapred.JobClient: FileSystemCounters > 14/03/13 23:08:10 INFO mapred.JobClient: FILE_BYTES_READ=11084374 > 14/03/13 23:08:10 INFO mapred.JobClient: HDFS_BYTES_READ=11553569 > 14/03/13 23:08:10 INFO mapred.JobClient: FILE_BYTES_WRITTEN=16021548 > 14/03/13 23:08:10 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=6105753 > 14/03/13 23:08:10 INFO mapred.JobClient: File Input Format Counters > 14/03/13 23:08:10 INFO mapred.JobClient: Bytes Read=11553456 > 14/03/13 23:08:10 INFO mapred.JobClient: Map-Reduce Framework > 14/03/13 23:08:10 INFO mapred.JobClient: Map output materialized > bytes=5289088 > 14/03/13 23:08:10 INFO mapred.JobClient: Map input records=1000209 > 14/03/13 23:08:10 INFO mapred.JobClient: Reduce shuffle bytes=5289088 > 14/03/13 23:08:10 INFO mapred.JobClient: Spilled Records=3000627 > 14/03/13 23:08:10 INFO mapred.JobClient: Map output bytes=7964758 > 14/03/13 23:08:10 INFO mapred.JobClient: Total committed heap usage > (bytes)=257757184 > 14/03/13 23:08:10 INFO mapred.JobClient: CPU time spent (ms)=37120 > 14/03/13 23:08:10 INFO mapred.JobClient: Combine input records=0 > 14/03/13 23:08:10 INFO mapred.JobClient: SPLIT_RAW_BYTES=113 > 14/03/13 23:08:10 INFO mapred.JobClient: Reduce input records=1000209 > 14/03/13 23:08:10 INFO mapred.JobClient: Reduce input groups=6040 > 14/03/13 23:08:10 INFO mapred.JobClient: Combine output records=0 > 14/03/13 23:08:10 INFO mapred.JobClient: Physical memory (bytes) > snapshot=658481152 > 14/03/13 23:08:10 INFO mapred.JobClient: Reduce output records=6040 > 14/03/13 23:08:10 INFO mapred.JobClient: Virtual memory (bytes) > snapshot=3908874240 > 14/03/13 23:08:10 INFO mapred.JobClient: Map output records=1000209 > 14/03/13 23:08:10 INFO mapred.JobClient: Default number of map tasks: null > 14/03/13 23:08:10 INFO mapred.JobClient: Setting default number of map > tasks based on cluster size to : 12 > 14/03/13 23:08:10 INFO mapred.JobClient: Default number of reduce tasks: 5 > 14/03/13 23:08:11 INFO mapred.JobClient: Setting group to hadoop > 14/03/13 23:08:11 INFO input.FileInputFormat: Total input paths to process > : 5 > 14/03/13 23:08:11 INFO mapred.JobClient: Running job: job_201403132009_0003 > 14/03/13 23:08:12 INFO mapred.JobClient: map 0% reduce 0% > 14/03/13 23:08:50 INFO mapred.JobClient: map 8% reduce 0% > 14/03/13 23:08:53 INFO mapred.JobClient: map 20% reduce 0% > 14/03/13 23:09:08 INFO mapred.JobClient: map 24% reduce 0% > 14/03/13 23:09:11 INFO mapred.JobClient: map 39% reduce 0% > 14/03/13 23:09:14 INFO mapred.JobClient: map 50% reduce 0% > 14/03/13 23:09:17 INFO mapred.JobClient: map 60% reduce 0% > 14/03/13 23:09:20 INFO mapred.JobClient: map 62% reduce 0% > 14/03/13 23:09:23 INFO mapred.JobClient: map 72% reduce 0% > 14/03/13 23:09:26 INFO mapred.JobClient: map 87% reduce 0% > 14/03/13 23:09:30 INFO mapred.JobClient: map 94% reduce 0% > 14/03/13 23:09:33 INFO mapred.JobClient: map 100% reduce 2% > 14/03/13 23:09:39 INFO mapred.JobClient: map 100% reduce 10% > 14/03/13 23:09:45 INFO mapred.JobClient: map 100% reduce 17% > 14/03/13 23:09:51 INFO mapred.JobClient: map 100% reduce 45% > 14/03/13 23:09:54 INFO mapred.JobClient: map 100% reduce 60% > 14/03/13 23:10:00 INFO mapred.JobClient: map 100% reduce 80% > 14/03/13 23:10:03 INFO mapred.JobClient: map 100% reduce 100% > 14/03/13 23:10:08 INFO mapred.JobClient: Job complete: > job_201403132009_0003 > 14/03/13 23:10:08 INFO mapred.JobClient: Counters: 30 > 14/03/13 23:10:08 INFO mapred.JobClient: Job Counters > 14/03/13 23:10:08 INFO mapred.JobClient: Launched reduce tasks=6 > 14/03/13 23:10:08 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=206275 > 14/03/13 23:10:08 INFO mapred.JobClient: Total time spent by all > reduces waiting after reserving slots (ms)=0 > 14/03/13 23:10:08 INFO mapred.JobClient: Total time spent by all maps > waiting after reserving slots (ms)=0 > 14/03/13 23:10:08 INFO mapred.JobClient: Rack-local map tasks=2 > 14/03/13 23:10:08 INFO mapred.JobClient: Launched map tasks=7 > 14/03/13 23:10:08 INFO mapred.JobClient: Data-local map tasks=5 > 14/03/13 23:10:08 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=171104 > 14/03/13 23:10:08 INFO mapred.JobClient: File Output Format Counters > 14/03/13 23:10:08 INFO mapred.JobClient: Bytes Written=6086109 > 14/03/13 23:10:08 INFO mapred.JobClient: FileSystemCounters > 14/03/13 23:10:08 INFO mapred.JobClient: FILE_BYTES_READ=4129920 > 14/03/13 23:10:08 INFO mapred.JobClient: HDFS_BYTES_READ=6106528 > 14/03/13 23:10:08 INFO mapred.JobClient: FILE_BYTES_WRITTEN=8160297 > 14/03/13 23:10:08 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=6086109 > 14/03/13 23:10:08 INFO mapred.JobClient: File Input Format Counters > 14/03/13 23:10:08 INFO mapred.JobClient: Bytes Read=6105753 > 14/03/13 23:10:08 INFO mapred.JobClient: Map-Reduce Framework > 14/03/13 23:10:08 INFO mapred.JobClient: Map output materialized > bytes=3774632 > 14/03/13 23:10:08 INFO mapred.JobClient: Map input records=6040 > 14/03/13 23:10:08 INFO mapred.JobClient: Reduce shuffle bytes=3774632 > 14/03/13 23:10:08 INFO mapred.JobClient: Spilled Records=34676 > 14/03/13 23:10:08 INFO mapred.JobClient: Map output bytes=16987356 > 14/03/13 23:10:08 INFO mapred.JobClient: Total committed heap usage > (bytes)=806703104 > 14/03/13 23:10:08 INFO mapred.JobClient: CPU time spent (ms)=55360 > 14/03/13 23:10:08 INFO mapred.JobClient: Combine input records=1000209 > 14/03/13 23:10:08 INFO mapred.JobClient: SPLIT_RAW_BYTES=775 > 14/03/13 23:10:08 INFO mapred.JobClient: Reduce input records=17338 > 14/03/13 23:10:08 INFO mapred.JobClient: Reduce input groups=3706 > 14/03/13 23:10:08 INFO mapred.JobClient: Combine output records=17338 > 14/03/13 23:10:08 INFO mapred.JobClient: Physical memory (bytes) > snapshot=1430761472 > 14/03/13 23:10:08 INFO mapred.JobClient: Reduce output records=3706 > 14/03/13 23:10:08 INFO mapred.JobClient: Virtual memory (bytes) > snapshot=6453673984 > 14/03/13 23:10:08 INFO mapred.JobClient: Map output records=1000209 > 14/03/13 23:10:08 INFO common.AbstractJob: Command line arguments: > {--endPhase=[2147483647], --excludeSelfSimilarity=[true], > --input=[temp/preparePreferenceMatrix/ratingMatrix], > --maxObservationsPerColumn=[500], --maxObservationsPerRow=[500], > --maxSimilaritiesPerRow=[100], --numberOfColumns=[6040], > --output=[temp/similarityMatrix], --randomSeed=[-9223372036854775808], > --similarityClassname=[SIMILARITY_COSINE], --startPhase=[0], > --tempDir=[temp], --threshold=[4.9E-324]} > 14/03/13 23:10:08 INFO mapred.JobClient: Default number of map tasks: null > 14/03/13 23:10:08 INFO mapred.JobClient: Setting default number of map > tasks based on cluster size to : 12 > 14/03/13 23:10:08 INFO mapred.JobClient: Default number of reduce tasks: 1 > 14/03/13 23:10:09 INFO security.ShellBasedUnixGroupsMapping: add hadoop to > shell userGroupsCache > 14/03/13 23:10:09 INFO mapred.JobClient: Setting group to hadoop > 14/03/13 23:10:09 INFO input.FileInputFormat: Total input paths to process > : 5 > 14/03/13 23:10:09 INFO mapred.JobClient: Running job: job_201403132009_0004 > 14/03/13 23:10:10 INFO mapred.JobClient: map 0% reduce 0% > 14/03/13 23:10:44 INFO mapred.JobClient: map 20% reduce 0% > 14/03/13 23:11:00 INFO mapred.JobClient: map 60% reduce 0% > 14/03/13 23:11:12 INFO mapred.JobClient: map 74% reduce 0% > 14/03/13 23:11:15 INFO mapred.JobClient: map 100% reduce 0% > 14/03/13 23:11:19 INFO mapred.JobClient: map 100% reduce 20% > 14/03/13 23:11:28 INFO mapred.JobClient: map 100% reduce 100% > 14/03/13 23:11:33 INFO mapred.JobClient: Job complete: > job_201403132009_0004 > 14/03/13 23:11:33 INFO mapred.JobClient: Counters: 30 > 14/03/13 23:11:33 INFO mapred.JobClient: Job Counters > 14/03/13 23:11:33 INFO mapred.JobClient: Launched reduce tasks=1 > 14/03/13 23:11:33 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=130734 > 14/03/13 23:11:33 INFO mapred.JobClient: Total time spent by all > reduces waiting after reserving slots (ms)=0 > 14/03/13 23:11:33 INFO mapred.JobClient: Total time spent by all maps > waiting after reserving slots (ms)=0 > 14/03/13 23:11:33 INFO mapred.JobClient: Rack-local map tasks=2 > 14/03/13 23:11:33 INFO mapred.JobClient: Launched map tasks=7 > 14/03/13 23:11:33 INFO mapred.JobClient: Data-local map tasks=5 > 14/03/13 23:11:33 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=37245 > 14/03/13 23:11:33 INFO mapred.JobClient: File Output Format Counters > 14/03/13 23:11:33 INFO mapred.JobClient: Bytes Written=98 > 14/03/13 23:11:33 INFO mapred.JobClient: FileSystemCounters > 14/03/13 23:11:33 INFO mapred.JobClient: FILE_BYTES_READ=157116 > 14/03/13 23:11:33 INFO mapred.JobClient: HDFS_BYTES_READ=6086889 > 14/03/13 23:11:33 INFO mapred.JobClient: FILE_BYTES_WRITTEN=468691 > 14/03/13 23:11:33 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=60379 > 14/03/13 23:11:33 INFO mapred.JobClient: File Input Format Counters > 14/03/13 23:11:33 INFO mapred.JobClient: Bytes Read=6086109 > 14/03/13 23:11:33 INFO mapred.JobClient: Map-Reduce Framework > 14/03/13 23:11:33 INFO mapred.JobClient: Map output materialized > bytes=157202 > 14/03/13 23:11:33 INFO mapred.JobClient: Map input records=3706 > 14/03/13 23:11:33 INFO mapred.JobClient: Reduce shuffle bytes=157202 > 14/03/13 23:11:33 INFO mapred.JobClient: Spilled Records=10 > 14/03/13 23:11:33 INFO mapred.JobClient: Map output bytes=301216 > 14/03/13 23:11:33 INFO mapred.JobClient: Total committed heap usage > (bytes)=656494592 > 14/03/13 23:11:33 INFO mapred.JobClient: CPU time spent (ms)=25460 > 14/03/13 23:11:33 INFO mapred.JobClient: Combine input records=5 > 14/03/13 23:11:33 INFO mapred.JobClient: SPLIT_RAW_BYTES=780 > 14/03/13 23:11:33 INFO mapred.JobClient: Reduce input records=5 > 14/03/13 23:11:33 INFO mapred.JobClient: Reduce input groups=1 > 14/03/13 23:11:33 INFO mapred.JobClient: Combine output records=5 > 14/03/13 23:11:33 INFO mapred.JobClient: Physical memory (bytes) > snapshot=1001426944 > 14/03/13 23:11:33 INFO mapred.JobClient: Reduce output records=0 > 14/03/13 23:11:33 INFO mapred.JobClient: Virtual memory (bytes) > snapshot=3751100416 > 14/03/13 23:11:33 INFO mapred.JobClient: Map output records=5 > 14/03/13 23:11:33 INFO mapred.JobClient: Default number of map tasks: null > 14/03/13 23:11:33 INFO mapred.JobClient: Setting default number of map > tasks based on cluster size to : 12 > 14/03/13 23:11:33 INFO mapred.JobClient: Default number of reduce tasks: 5 > 14/03/13 23:11:33 INFO mapred.JobClient: Setting group to hadoop > 14/03/13 23:11:33 INFO input.FileInputFormat: Total input paths to process > : 5 > 14/03/13 23:11:34 INFO mapred.JobClient: Running job: job_201403132009_0005 > 14/03/13 23:11:35 INFO mapred.JobClient: map 0% reduce 0% > 14/03/13 23:12:12 INFO mapred.JobClient: map 1% reduce 0% > 14/03/13 23:12:15 INFO mapred.JobClient: map 20% reduce 0% > 14/03/13 23:12:28 INFO mapred.JobClient: map 40% reduce 0% > 14/03/13 23:12:31 INFO mapred.JobClient: map 42% reduce 0% > 14/03/13 23:12:34 INFO mapred.JobClient: map 60% reduce 0% > 14/03/13 23:12:46 INFO mapred.JobClient: map 80% reduce 0% > 14/03/13 23:12:52 INFO mapred.JobClient: map 83% reduce 4% > 14/03/13 23:12:55 INFO mapred.JobClient: map 100% reduce 4% > 14/03/13 23:12:58 INFO mapred.JobClient: map 100% reduce 10% > 14/03/13 23:13:13 INFO mapred.JobClient: map 100% reduce 25% > 14/03/13 23:13:16 INFO mapred.JobClient: map 100% reduce 39% > 14/03/13 23:13:19 INFO mapred.JobClient: map 100% reduce 60% > 14/03/13 23:13:25 INFO mapred.JobClient: map 100% reduce 66% > 14/03/13 23:13:28 INFO mapred.JobClient: map 100% reduce 100% > 14/03/13 23:13:33 INFO mapred.JobClient: Job complete: > job_201403132009_0005 > 14/03/13 23:13:33 INFO mapred.JobClient: Counters: 33 > 14/03/13 23:13:33 INFO mapred.JobClient: Job Counters > 14/03/13 23:13:33 INFO mapred.JobClient: Launched reduce tasks=7 > 14/03/13 23:13:33 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=215396 > 14/03/13 23:13:33 INFO mapred.JobClient: Total time spent by all > reduces waiting after reserving slots (ms)=0 > 14/03/13 23:13:33 INFO mapred.JobClient: Total time spent by all maps > waiting after reserving slots (ms)=0 > 14/03/13 23:13:33 INFO mapred.JobClient: Rack-local map tasks=2 > 14/03/13 23:13:33 INFO mapred.JobClient: Launched map tasks=7 > 14/03/13 23:13:33 INFO mapred.JobClient: Data-local map tasks=5 > 14/03/13 23:13:33 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=183527 > 14/03/13 23:13:33 INFO mapred.JobClient: File Output Format Counters > 14/03/13 23:13:33 INFO mapred.JobClient: Bytes Written=6696610 > 14/03/13 23:13:33 INFO mapred.JobClient: FileSystemCounters > 14/03/13 23:13:33 INFO mapred.JobClient: FILE_BYTES_READ=6092304 > 14/03/13 23:13:33 INFO mapred.JobClient: HDFS_BYTES_READ=6388294 > 14/03/13 23:13:33 INFO mapred.JobClient: FILE_BYTES_WRITTEN=10847303 > 14/03/13 23:13:33 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=6696631 > 14/03/13 23:13:33 INFO mapred.JobClient: File Input Format Counters > 14/03/13 23:13:33 INFO mapred.JobClient: Bytes Read=6086109 > 14/03/13 23:13:33 INFO mapred.JobClient: > > org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters > 14/03/13 23:13:33 INFO mapred.JobClient: ROWS=3706 > 14/03/13 23:13:33 INFO mapred.JobClient: NEGLECTED_OBSERVATIONS=344034 > 14/03/13 23:13:33 INFO mapred.JobClient: USED_OBSERVATIONS=656175 > 14/03/13 23:13:33 INFO mapred.JobClient: Map-Reduce Framework > 14/03/13 23:13:33 INFO mapred.JobClient: Map output materialized > bytes=4483794 > 14/03/13 23:13:33 INFO mapred.JobClient: Map input records=3706 > 14/03/13 23:13:33 INFO mapred.JobClient: Reduce shuffle bytes=4483794 > 14/03/13 23:13:33 INFO mapred.JobClient: Spilled Records=59962 > 14/03/13 23:13:33 INFO mapred.JobClient: Map output bytes=13756468 > 14/03/13 23:13:33 INFO mapred.JobClient: Total committed heap usage > (bytes)=806703104 > 14/03/13 23:13:33 INFO mapred.JobClient: CPU time spent (ms)=63880 > 14/03/13 23:13:33 INFO mapred.JobClient: Combine input records=656190 > 14/03/13 23:13:33 INFO mapred.JobClient: SPLIT_RAW_BYTES=780 > 14/03/13 23:13:33 INFO mapred.JobClient: Reduce input records=29981 > 14/03/13 23:13:33 INFO mapred.JobClient: Reduce input groups=6043 > 14/03/13 23:13:33 INFO mapred.JobClient: Combine output records=29981 > 14/03/13 23:13:33 INFO mapred.JobClient: Physical memory (bytes) > snapshot=1466478592 > 14/03/13 23:13:33 INFO mapred.JobClient: Reduce output records=6040 > 14/03/13 23:13:33 INFO mapred.JobClient: Virtual memory (bytes) > snapshot=6403231744 > 14/03/13 23:13:33 INFO mapred.JobClient: Map output records=656190 > 14/03/13 23:13:33 INFO mapred.JobClient: Default number of map tasks: null > 14/03/13 23:13:33 INFO mapred.JobClient: Setting default number of map > tasks based on cluster size to : 12 > 14/03/13 23:13:33 INFO mapred.JobClient: Default number of reduce tasks: 5 > 14/03/13 23:13:35 INFO mapred.JobClient: Setting group to hadoop > 14/03/13 23:13:35 INFO input.FileInputFormat: Total input paths to process > : 5 > 14/03/13 23:13:35 INFO mapred.JobClient: Running job: job_201403132009_0006 > 14/03/13 23:13:36 INFO mapred.JobClient: map 0% reduce 0% > 14/03/13 23:14:25 INFO mapred.JobClient: map 1% reduce 0% > 14/03/13 23:14:28 INFO mapred.JobClient: map 2% reduce 0% > 14/03/13 23:14:31 INFO mapred.JobClient: map 4% reduce 0% > 14/03/13 23:14:34 INFO mapred.JobClient: map 5% reduce 0% > 14/03/13 23:14:37 INFO mapred.JobClient: map 10% reduce 0% > 14/03/13 23:14:40 INFO mapred.JobClient: map 15% reduce 0% > 14/03/13 23:14:43 INFO mapred.JobClient: map 22% reduce 0% > 14/03/13 23:14:46 INFO mapred.JobClient: map 25% reduce 0% > 14/03/13 23:14:49 INFO mapred.JobClient: map 29% reduce 0% > 14/03/13 23:14:52 INFO mapred.JobClient: map 33% reduce 0% > 14/03/13 23:14:55 INFO mapred.JobClient: map 37% reduce 0% > 14/03/13 23:14:58 INFO mapred.JobClient: map 42% reduce 0% > 14/03/13 23:15:01 INFO mapred.JobClient: map 46% reduce 0% > 14/03/13 23:15:04 INFO mapred.JobClient: map 51% reduce 0% > 14/03/13 23:15:07 INFO mapred.JobClient: map 54% reduce 0% > 14/03/13 23:15:13 INFO mapred.JobClient: map 56% reduce 0% > 14/03/13 23:15:14 INFO mapred.JobClient: map 57% reduce 0% > 14/03/13 23:15:17 INFO mapred.JobClient: map 63% reduce 0% > 14/03/13 23:15:20 INFO mapred.JobClient: map 66% reduce 0% > 14/03/13 23:15:23 INFO mapred.JobClient: map 71% reduce 0% > 14/03/13 23:15:26 INFO mapred.JobClient: map 76% reduce 0% > 14/03/13 23:15:29 INFO mapred.JobClient: map 78% reduce 0% > 14/03/13 23:15:32 INFO mapred.JobClient: map 80% reduce 0% > 14/03/13 23:15:35 INFO mapred.JobClient: map 82% reduce 0% > 14/03/13 23:15:41 INFO mapred.JobClient: map 85% reduce 0% > 14/03/13 23:15:44 INFO mapred.JobClient: map 87% reduce 0% > 14/03/13 23:15:47 INFO mapred.JobClient: map 91% reduce 0% > 14/03/13 23:15:50 INFO mapred.JobClient: map 94% reduce 0% > 14/03/13 23:15:53 INFO mapred.JobClient: map 96% reduce 0% > 14/03/13 23:15:56 INFO mapred.JobClient: map 98% reduce 0% > 14/03/13 23:16:02 INFO mapred.JobClient: map 99% reduce 0% > 14/03/13 23:16:18 INFO mapred.JobClient: map 99% reduce 4% > 14/03/13 23:16:21 INFO mapred.JobClient: map 99% reduce 8% > 14/03/13 23:16:25 INFO mapred.JobClient: map 99% reduce 12% > 14/03/13 23:16:30 INFO mapred.JobClient: map 100% reduce 12% > 14/03/13 23:16:36 INFO mapred.JobClient: map 100% reduce 16% > 14/03/13 23:16:37 INFO mapred.JobClient: map 100% reduce 17% > 14/03/13 23:16:39 INFO mapred.JobClient: map 100% reduce 24% > 14/03/13 23:16:42 INFO mapred.JobClient: map 100% reduce 45% > 14/03/13 23:16:46 INFO mapred.JobClient: map 100% reduce 48% > 14/03/13 23:16:48 INFO mapred.JobClient: map 100% reduce 50% > 14/03/13 23:16:51 INFO mapred.JobClient: map 100% reduce 54% > 14/03/13 23:16:52 INFO mapred.JobClient: map 100% reduce 56% > 14/03/13 23:16:57 INFO mapred.JobClient: map 100% reduce 62% > 14/03/13 23:16:58 INFO mapred.JobClient: map 100% reduce 65% > 14/03/13 23:17:00 INFO mapred.JobClient: map 100% reduce 77% > 14/03/13 23:17:01 INFO mapred.JobClient: map 100% reduce 89% > 14/03/13 23:17:03 INFO mapred.JobClient: map 100% reduce 93% > 14/03/13 23:17:04 INFO mapred.JobClient: map 100% reduce 97% > 14/03/13 23:17:09 INFO mapred.JobClient: map 100% reduce 98% > 14/03/13 23:17:10 INFO mapred.JobClient: map 100% reduce 100% > 14/03/13 23:17:15 INFO mapred.JobClient: Job complete: > job_201403132009_0006 > 14/03/13 23:17:15 INFO mapred.JobClient: Counters: 32 > 14/03/13 23:17:15 INFO mapred.JobClient: Job Counters > 14/03/13 23:17:15 INFO mapred.JobClient: Launched reduce tasks=6 > 14/03/13 23:17:15 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=682491 > 14/03/13 23:17:15 INFO mapred.JobClient: Total time spent by all > reduces waiting after reserving slots (ms)=0 > 14/03/13 23:17:15 INFO mapred.JobClient: Total time spent by all maps > waiting after reserving slots (ms)=0 > 14/03/13 23:17:15 INFO mapred.JobClient: Rack-local map tasks=3 > 14/03/13 23:17:15 INFO mapred.JobClient: Launched map tasks=7 > 14/03/13 23:17:15 INFO mapred.JobClient: Data-local map tasks=4 > 14/03/13 23:17:15 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=227987 > 14/03/13 23:17:15 INFO mapred.JobClient: File Output Format Counters > 14/03/13 23:17:15 INFO mapred.JobClient: Bytes Written=48495057 > 14/03/13 23:17:15 INFO mapred.JobClient: FileSystemCounters > 14/03/13 23:17:15 INFO mapred.JobClient: FILE_BYTES_READ=496094365 > 14/03/13 23:17:15 INFO mapred.JobClient: HDFS_BYTES_READ=6697350 > 14/03/13 23:17:15 INFO mapred.JobClient: FILE_BYTES_WRITTEN=732617372 > 14/03/13 23:17:15 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=48495057 > 14/03/13 23:17:15 INFO mapred.JobClient: File Input Format Counters > 14/03/13 23:17:15 INFO mapred.JobClient: Bytes Read=6696610 > 14/03/13 23:17:15 INFO mapred.JobClient: > > org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters > 14/03/13 23:17:15 INFO mapred.JobClient: PRUNED_COOCCURRENCES=0 > 14/03/13 23:17:15 INFO mapred.JobClient: COOCCURRENCES=76994409 > 14/03/13 23:17:15 INFO mapred.JobClient: Map-Reduce Framework > 14/03/13 23:17:15 INFO mapred.JobClient: Map output materialized > bytes=238087629 > 14/03/13 23:17:15 INFO mapred.JobClient: Map input records=6040 > 14/03/13 23:17:15 INFO mapred.JobClient: Reduce shuffle bytes=238087629 > 14/03/13 23:17:15 INFO mapred.JobClient: Spilled Records=91757 > 14/03/13 23:17:15 INFO mapred.JobClient: Map output bytes=777266857 > 14/03/13 23:17:15 INFO mapred.JobClient: Total committed heap usage > (bytes)=1117577216 > 14/03/13 23:17:15 INFO mapred.JobClient: CPU time spent (ms)=252750 > 14/03/13 23:17:15 INFO mapred.JobClient: Combine input records=662899 > 14/03/13 23:17:15 INFO mapred.JobClient: SPLIT_RAW_BYTES=635 > 14/03/13 23:17:15 INFO mapred.JobClient: Reduce input records=29485 > 14/03/13 23:17:15 INFO mapred.JobClient: Reduce input groups=3679 > 14/03/13 23:17:15 INFO mapred.JobClient: Combine output records=36209 > 14/03/13 23:17:15 INFO mapred.JobClient: Physical memory (bytes) > snapshot=1777143808 > 14/03/13 23:17:15 INFO mapred.JobClient: Reduce output records=3679 > 14/03/13 23:17:15 INFO mapred.JobClient: Virtual memory (bytes) > snapshot=6485929984 > 14/03/13 23:17:15 INFO mapred.JobClient: Map output records=656175 > 14/03/13 23:17:15 INFO mapred.JobClient: Default number of map tasks: null > 14/03/13 23:17:15 INFO mapred.JobClient: Setting default number of map > tasks based on cluster size to : 12 > 14/03/13 23:17:15 INFO mapred.JobClient: Default number of reduce tasks: 5 > 14/03/13 23:17:16 INFO security.ShellBasedUnixGroupsMapping: add hadoop to > shell userGroupsCache > 14/03/13 23:17:16 INFO mapred.JobClient: Setting group to hadoop > 14/03/13 23:17:16 INFO input.FileInputFormat: Total input paths to process > : 5 > 14/03/13 23:17:16 INFO mapred.JobClient: Running job: job_201403132009_0007 > 14/03/13 23:17:17 INFO mapred.JobClient: map 0% reduce 0% > 14/03/13 23:18:12 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000002_0, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:18:23 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000003_0, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:18:24 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000000_0, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:18:33 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000004_0, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:18:39 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000001_0, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > attempt_201403132009_0007_m_000001_0: log4j:WARN No appenders could be > found for logger (org.apache.hadoop.mapred.Task). > attempt_201403132009_0007_m_000001_0: log4j:WARN Please initialize the > log4j system properly. > attempt_201403132009_0007_m_000001_0: log4j:WARN See > http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. > 14/03/13 23:18:42 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000003_1, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:18:53 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000000_1, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > attempt_201403132009_0007_m_000000_1: log4j:WARN No appenders could be > found for logger (org.apache.hadoop.mapred.Task). > attempt_201403132009_0007_m_000000_1: log4j:WARN Please initialize the > log4j system properly. > attempt_201403132009_0007_m_000000_1: log4j:WARN See > http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. > 14/03/13 23:18:56 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000002_1, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:18:57 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000004_1, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:19:03 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000003_2, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:19:09 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000001_1, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:19:17 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000004_2, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:19:18 INFO mapred.JobClient: Task Id : > attempt_201403132009_0007_m_000000_2, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:19:33 INFO mapred.JobClient: Job complete: > job_201403132009_0007 > 14/03/13 23:19:33 INFO mapred.JobClient: Counters: 8 > 14/03/13 23:19:33 INFO mapred.JobClient: Job Counters > 14/03/13 23:19:33 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=276447 > 14/03/13 23:19:33 INFO mapred.JobClient: Total time spent by all > reduces waiting after reserving slots (ms)=0 > 14/03/13 23:19:33 INFO mapred.JobClient: Total time spent by all maps > waiting after reserving slots (ms)=0 > 14/03/13 23:19:33 INFO mapred.JobClient: Rack-local map tasks=12 > 14/03/13 23:19:33 INFO mapred.JobClient: Launched map tasks=17 > 14/03/13 23:19:33 INFO mapred.JobClient: Data-local map tasks=5 > 14/03/13 23:19:33 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0 > 14/03/13 23:19:33 INFO mapred.JobClient: Failed map tasks=1 > 14/03/13 23:19:33 INFO mapred.JobClient: Default number of map tasks: null > 14/03/13 23:19:33 INFO mapred.JobClient: Setting default number of map > tasks based on cluster size to : 12 > 14/03/13 23:19:33 INFO mapred.JobClient: Default number of reduce tasks: 5 > 14/03/13 23:19:34 INFO mapred.JobClient: Setting group to hadoop > 14/03/13 23:19:34 INFO input.FileInputFormat: Total input paths to process > : 0 > 14/03/13 23:19:34 INFO mapred.JobClient: Running job: job_201403132009_0008 > 14/03/13 23:19:35 INFO mapred.JobClient: map 0% reduce 0% > 14/03/13 23:20:25 INFO mapred.JobClient: map 0% reduce 20% > 14/03/13 23:20:28 INFO mapred.JobClient: map 0% reduce 40% > 14/03/13 23:20:31 INFO mapred.JobClient: map 0% reduce 80% > 14/03/13 23:20:34 INFO mapred.JobClient: map 0% reduce 100% > 14/03/13 23:20:48 INFO mapred.JobClient: Job complete: > job_201403132009_0008 > 14/03/13 23:20:48 INFO mapred.JobClient: Counters: 18 > 14/03/13 23:20:48 INFO mapred.JobClient: Job Counters > 14/03/13 23:20:48 INFO mapred.JobClient: Launched reduce tasks=5 > 14/03/13 23:20:48 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=26526 > 14/03/13 23:20:48 INFO mapred.JobClient: Total time spent by all > reduces waiting after reserving slots (ms)=0 > 14/03/13 23:20:48 INFO mapred.JobClient: Total time spent by all maps > waiting after reserving slots (ms)=0 > 14/03/13 23:20:48 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=57942 > 14/03/13 23:20:48 INFO mapred.JobClient: File Output Format Counters > 14/03/13 23:20:48 INFO mapred.JobClient: Bytes Written=0 > 14/03/13 23:20:48 INFO mapred.JobClient: FileSystemCounters > 14/03/13 23:20:48 INFO mapred.JobClient: FILE_BYTES_WRITTEN=128665 > 14/03/13 23:20:48 INFO mapred.JobClient: Map-Reduce Framework > 14/03/13 23:20:48 INFO mapred.JobClient: Reduce input groups=0 > 14/03/13 23:20:48 INFO mapred.JobClient: Combine output records=0 > 14/03/13 23:20:48 INFO mapred.JobClient: Reduce shuffle bytes=0 > 14/03/13 23:20:48 INFO mapred.JobClient: Physical memory (bytes) > snapshot=421289984 > 14/03/13 23:20:48 INFO mapred.JobClient: Reduce output records=0 > 14/03/13 23:20:48 INFO mapred.JobClient: Spilled Records=0 > 14/03/13 23:20:48 INFO mapred.JobClient: CPU time spent (ms)=4120 > 14/03/13 23:20:48 INFO mapred.JobClient: Total committed heap usage > (bytes)=131727360 > 14/03/13 23:20:48 INFO mapred.JobClient: Virtual memory (bytes) > snapshot=3127447552 > 14/03/13 23:20:48 INFO mapred.JobClient: Combine input records=0 > 14/03/13 23:20:48 INFO mapred.JobClient: Reduce input records=0 > 14/03/13 23:20:48 INFO mapred.JobClient: Default number of map tasks: null > 14/03/13 23:20:48 INFO mapred.JobClient: Setting default number of map > tasks based on cluster size to : 12 > 14/03/13 23:20:48 INFO mapred.JobClient: Default number of reduce tasks: 5 > 14/03/13 23:20:49 INFO mapred.JobClient: Setting group to hadoop > 14/03/13 23:20:49 INFO input.FileInputFormat: Total input paths to process > : 0 > 14/03/13 23:20:49 INFO input.FileInputFormat: Total input paths to process > : 5 > 14/03/13 23:20:49 INFO mapred.JobClient: Running job: job_201403132009_0009 > 14/03/13 23:20:50 INFO mapred.JobClient: map 0% reduce 0% > 14/03/13 23:21:45 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000004_0, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:21:58 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000003_0, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:21:58 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000000_0, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > attempt_201403132009_0009_m_000000_0: log4j:WARN No appenders could be > found for logger (org.apache.hadoop.mapred.Task). > attempt_201403132009_0009_m_000000_0: log4j:WARN Please initialize the > log4j system properly. > attempt_201403132009_0009_m_000000_0: log4j:WARN See > http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. > 14/03/13 23:22:09 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000002_0, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:22:12 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000003_1, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:22:15 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000001_0, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > attempt_201403132009_0009_m_000001_0: log4j:WARN No appenders could be > found for logger (org.apache.hadoop.mapred.Task). > attempt_201403132009_0009_m_000001_0: log4j:WARN Please initialize the > log4j system properly. > attempt_201403132009_0009_m_000001_0: log4j:WARN See > http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. > 14/03/13 23:22:19 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000000_1, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:22:19 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000004_1, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:22:33 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000003_2, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:22:36 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000001_1, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:22:39 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000002_1, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:22:42 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000000_2, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:22:58 INFO mapred.JobClient: Task Id : > attempt_201403132009_0009_m_000001_2, Status : FAILED > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > 14/03/13 23:23:04 INFO mapred.JobClient: Job complete: > job_201403132009_0009 > 14/03/13 23:23:04 INFO mapred.JobClient: Counters: 8 > 14/03/13 23:23:04 INFO mapred.JobClient: Job Counters > 14/03/13 23:23:04 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=257919 > 14/03/13 23:23:04 INFO mapred.JobClient: Total time spent by all > reduces waiting after reserving slots (ms)=0 > 14/03/13 23:23:04 INFO mapred.JobClient: Total time spent by all maps > waiting after reserving slots (ms)=0 > 14/03/13 23:23:04 INFO mapred.JobClient: Rack-local map tasks=12 > 14/03/13 23:23:04 INFO mapred.JobClient: Launched map tasks=17 > 14/03/13 23:23:04 INFO mapred.JobClient: Data-local map tasks=5 > 14/03/13 23:23:04 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0 > 14/03/13 23:23:04 INFO mapred.JobClient: Failed map tasks=1 > 14/03/13 23:23:04 INFO driver.MahoutDriver: Program took 1124754 ms > (Minutes: 18.7459) > > > > On Thu, Mar 13, 2014 at 5:22 PM, Suneel Marthi <[email protected] > >wrote: > > > Could u print the complete stacktrace? > > > > > > > > > > On Thursday, March 13, 2014 7:31 PM, Andrew Musselman < > > [email protected]> wrote: > > > > I'm getting this error repeated for several attempts in the last phase of > > the recommenditembased example on EMR with the default AMI and Hadoop > > version and a fresh Mahout 0.9 non-source tarball: > > > > 14/03/13 23:22:58 INFO mapred.JobClient: Task Id : > > attempt_201403132009_0009_m_000001_2, Status : FAILED > > Error: org.apache.lucene.util.PriorityQueue.<init>(I)V > > > > The ultimate output is several empty part files. > > > > Here's the du on the temp directory: > > $ hadoop fs -du temp > > Found 10 items > > 7 hdfs://10.196.18.64:9000/user/hadoop/temp/maxValues.bin > > 7 hdfs://10.196.18.64:9000/user/hadoop/temp/norms.bin > > 98 hdfs://10.196.18.64:9000/user/hadoop/temp/notUsed > > 7 > > hdfs://10.196.18.64:9000/user/hadoop/temp/numNonZeroEntries.bin > > 60281 hdfs:// > > 10.196.18.64:9000/user/hadoop/temp/observationsPerColumn.bin > > 48495057 hdfs://10.196.18.64:9000/user/hadoop/temp/pairwiseSimilarity > > 0 hdfs://10.196.18.64:9000/user/hadoop/temp/partialMultiply > > 12237129 hdfs:// > > 10.196.18.64:9000/user/hadoop/temp/preparePreferenceMatrix > > 0 hdfs://10.196.18.64:9000/user/hadoop/temp/similarityMatrix > > 8016325 hdfs://10.196.18.64:9000/user/hadoop/temp/weights > > > > Has anyone encountered this? > > > > Thanks > > Andrew > > >
