Hi Pragnesh,
I really don't know what to suggest. I just did a fresh Mahout
checkout and build, then uploaded the synthetic_control.data file to a
local Hadoop instance. The k-means job ran without incident. On a
hunch, I also uploaded the file as testdata (a file named testdata, not
a file inside a testdata directory) and that worked too. I'm baffled
that I can't reproduce this and suspect it is a local system issue.
What OS are you running?
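A minimal sketch of that reproduction sequence, written as a dry run
that only prints the commands. The Maven goal (clean install), the
hadoop fs -mkdir / -put forms, and the helper name repro_commands are
assumptions for illustration; only the data file name, the testdata
target and the Job class name come from this thread.

```python
import shlex

# Job class name as it appears in the command line quoted in this thread.
KMEANS_JOB = "org.apache.mahout.clustering.syntheticcontrol.kmeans.Job"


def repro_commands(mahout_home, data_file="synthetic_control.data",
                   as_directory=True):
    """Return the commands (as argument lists) for a rebuild, upload and run.

    Hypothetical helper: nothing is executed here, the lists are only built.
    """
    # Assumed Maven goal; run from the root of the Mahout checkout.
    cmds = [["mvn", "clean", "install"]]
    if as_directory:
        # Upload the data file into an HDFS directory named testdata.
        cmds.append(["hadoop", "fs", "-mkdir", "testdata"])
    # Without the mkdir above, this stores the data as a file named testdata.
    cmds.append(["hadoop", "fs", "-put", data_file, "testdata"])
    cmds.append([mahout_home + "/bin/mahout", KMEANS_JOB])
    return cmds


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed.
    for cmd in repro_commands("/path/to/mahout"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Calling repro_commands(..., as_directory=False) gives the second
variant, where the data is stored as a file named testdata.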
If yours works from Eclipse but not from the command line, have you run
a clean Maven build (e.g. mvn clean install) from the command line
before running the CLI Mahout job? Eclipse compiles its classes into
different directories and does not build the necessary job files. Other
than that, I suggest checking your file system groups and permissions.
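A quick local pre-flight check along those lines might look like the
sketch below. It is only an illustration: the glob pattern
*/target/*.job is an assumption about where a command-line build
leaves its job files, the helper names find_job_files and
describe_access are hypothetical, and it inspects the local file
system only, not HDFS.

```python
import glob
import os


def find_job_files(mahout_home, pattern="*/target/*.job"):
    """List .job build artifacts under the checkout (pattern is an assumption)."""
    return sorted(glob.glob(os.path.join(mahout_home, pattern)))


def describe_access(path):
    """Report whether the current user can read and write a local path."""
    return {
        "exists": os.path.exists(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    home = os.environ.get("MAHOUT_HOME", ".")
    jobs = find_job_files(home)
    # An empty list suggests the command-line build has not been run yet.
    print("job files:", jobs if jobs else "none found")
    print("checkout access:", describe_access(home))
```

If no job files turn up, rebuilding from the command line is the first
thing to try; if the access report looks wrong, check groups and
permissions on the checkout.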
If you find something that gets you running again, *please* post your
solution so we can advise others who are experiencing the same error
message.
On 10/5/10 12:06 AM, pragnesh (JIRA) wrote:
[
https://issues.apache.org/jira/browse/MAHOUT-504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917502#action_12917502
]
pragnesh edited comment on MAHOUT-504 at 10/5/10 3:05 AM:
----------------------------------------------------------
I am also getting the same exception with trunk code:
10/10/04 12:42:34 INFO mapred.JobClient: Running job: job_201010041038_0019
10/10/04 12:42:35 INFO mapred.JobClient: map 0% reduce 0%
10/10/04 12:42:45 INFO mapred.JobClient: Task Id : attempt_201010041038_0019_m_000000_0, Status : FAILED
java.lang.IllegalStateException: No clusters found. Check your -c path.
at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:61)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
This runs fine from Eclipse, but when I run it from the command line
with Hadoop, I see the following output. Meanwhile, $MAHOUT_HOME/bin/mahout
org.apache.mahout.clustering.syntheticcontrol.dirichlet.Job runs fine
without any error.
pragnesh-laptop% $MAHOUT_HOME/bin/mahout
org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
Running on hadoop, using HADOOP_HOME=/usr/lib/hadoop/
HADOOP_CONF_DIR=/etc/hadoop/conf.pseudo
10/10/05 12:26:05 WARN driver.MahoutDriver: No
org.apache.mahout.clustering.syntheticcontrol.kmeans.Job.props found on
classpath, will use command-line arguments only
10/10/05 12:26:05 INFO kmeans.Job: Running with default arguments
10/10/05 12:26:06 INFO kmeans.Job: Preparing Input
10/10/05 12:26:06 WARN mapred.JobClient: Use GenericOptionsParser for parsing
the arguments. Applications should implement Tool for the same.
10/10/05 12:26:07 INFO input.FileInputFormat: Total input paths to process : 1
10/10/05 12:26:09 INFO mapred.JobClient: Running job: job_201010051117_0005
10/10/05 12:26:10 INFO mapred.JobClient: map 0% reduce 0%
10/10/05 12:26:26 INFO mapred.JobClient: map 100% reduce 0%
10/10/05 12:26:28 INFO mapred.JobClient: Job complete: job_201010051117_0005
10/10/05 12:26:29 INFO mapred.JobClient: Counters: 7
10/10/05 12:26:29 INFO mapred.JobClient: Job Counters
10/10/05 12:26:29 INFO mapred.JobClient: Launched map tasks=1
10/10/05 12:26:29 INFO mapred.JobClient: Data-local map tasks=1
10/10/05 12:26:29 INFO mapred.JobClient: FileSystemCounters
10/10/05 12:26:29 INFO mapred.JobClient: HDFS_BYTES_READ=288374
10/10/05 12:26:29 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=335470
10/10/05 12:26:29 INFO mapred.JobClient: Map-Reduce Framework
10/10/05 12:26:29 INFO mapred.JobClient: Map input records=600
10/10/05 12:26:29 INFO mapred.JobClient: Spilled Records=0
10/10/05 12:26:29 INFO mapred.JobClient: Map output records=600
10/10/05 12:26:29 INFO kmeans.Job: Running Canopy to get initial clusters
10/10/05 12:26:29 INFO canopy.CanopyDriver: Build Clusters Input: output/data
Out: output Measure:
org.apache.mahout.common.distance.euclideandistancemeas...@136a43c t1: 80.0 t2:
55.0
10/10/05 12:26:29 WARN mapred.JobClient: Use GenericOptionsParser for parsing
the arguments. Applications should implement Tool for the same.
10/10/05 12:26:29 INFO input.FileInputFormat: Total input paths to process : 1
10/10/05 12:26:30 INFO mapred.JobClient: Running job: job_201010051117_0006
10/10/05 12:26:31 INFO mapred.JobClient: map 0% reduce 0%
10/10/05 12:26:42 INFO mapred.JobClient: map 100% reduce 0%
10/10/05 12:26:54 INFO mapred.JobClient: map 100% reduce 100%
10/10/05 12:26:56 INFO mapred.JobClient: Job complete: job_201010051117_0006
10/10/05 12:26:56 INFO mapred.JobClient: Counters: 17
10/10/05 12:26:56 INFO mapred.JobClient: Job Counters
10/10/05 12:26:56 INFO mapred.JobClient: Launched reduce tasks=1
10/10/05 12:26:56 INFO mapred.JobClient: Launched map tasks=1
10/10/05 12:26:56 INFO mapred.JobClient: Data-local map tasks=1
10/10/05 12:26:56 INFO mapred.JobClient: FileSystemCounters
10/10/05 12:26:56 INFO mapred.JobClient: FILE_BYTES_READ=13906
10/10/05 12:26:56 INFO mapred.JobClient: HDFS_BYTES_READ=335470
10/10/05 12:26:56 INFO mapred.JobClient: FILE_BYTES_WRITTEN=27844
10/10/05 12:26:56 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=7131
10/10/05 12:26:56 INFO mapred.JobClient: Map-Reduce Framework
10/10/05 12:26:56 INFO mapred.JobClient: Reduce input groups=1
10/10/05 12:26:56 INFO mapred.JobClient: Combine output records=0
10/10/05 12:26:56 INFO mapred.JobClient: Map input records=600
10/10/05 12:26:56 INFO mapred.JobClient: Reduce shuffle bytes=0
10/10/05 12:26:56 INFO mapred.JobClient: Reduce output records=6
10/10/05 12:26:56 INFO mapred.JobClient: Spilled Records=50
10/10/05 12:26:56 INFO mapred.JobClient: Map output bytes=13800
10/10/05 12:26:56 INFO mapred.JobClient: Combine input records=0
10/10/05 12:26:56 INFO mapred.JobClient: Map output records=25
10/10/05 12:26:56 INFO mapred.JobClient: Reduce input records=25
10/10/05 12:26:56 INFO kmeans.Job: Running KMeans
10/10/05 12:26:56 INFO kmeans.KMeansDriver: Input: output/data Clusters In:
output/clusters-0 Out: output Distance:
org.apache.mahout.common.distance.EuclideanDistanceMeasure
10/10/05 12:26:56 INFO kmeans.KMeansDriver: convergence: 0.5 max Iterations: 10
num Reduce Tasks: org.apache.mahout.math.VectorWritable Input Vectors: {}
10/10/05 12:26:56 INFO kmeans.KMeansDriver: K-Means Iteration 1
10/10/05 12:26:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing
the arguments. Applications should implement Tool for the same.
10/10/05 12:26:57 INFO input.FileInputFormat: Total input paths to process : 1
10/10/05 12:26:58 INFO mapred.JobClient: Running job: job_201010051117_0007
10/10/05 12:26:59 INFO mapred.JobClient: map 0% reduce 0%
10/10/05 12:27:08 INFO mapred.JobClient: Task Id : attempt_201010051117_0007_m_000000_0, Status : FAILED
java.lang.IllegalStateException: No clusters found. Check your -c path.
at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:61)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
10/10/05 12:27:14 INFO mapred.JobClient: Task Id : attempt_201010051117_0007_m_000000_1, Status : FAILED
java.lang.IllegalStateException: No clusters found. Check your -c path.
at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:61)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
10/10/05 12:27:23 INFO mapred.JobClient: Task Id : attempt_201010051117_0007_m_000000_2, Status : FAILED
java.lang.IllegalStateException: No clusters found. Check your -c path.
at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:61)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
10/10/05 12:27:35 INFO mapred.JobClient: Job complete: job_201010051117_0007
10/10/05 12:27:35 INFO mapred.JobClient: Counters: 3
10/10/05 12:27:35 INFO mapred.JobClient: Job Counters
10/10/05 12:27:35 INFO mapred.JobClient: Launched map tasks=4
10/10/05 12:27:35 INFO mapred.JobClient: Data-local map tasks=4
10/10/05 12:27:35 INFO mapred.JobClient: Failed map tasks=1
10/10/05 12:27:35 INFO kmeans.KMeansDriver: Clustering data
10/10/05 12:27:35 INFO kmeans.KMeansDriver: Running Clustering
10/10/05 12:27:35 INFO kmeans.KMeansDriver: Input: output/data Clusters In:
output/clusters-1 Out: output/clusteredPoints Distance:
org.apache.mahout.common.distance.euclideandistancemeas...@136a43c
10/10/05 12:27:35 INFO kmeans.KMeansDriver: convergence: 0.5 Input Vectors:
org.apache.mahout.math.VectorWritable
10/10/05 12:27:35 WARN mapred.JobClient: Use GenericOptionsParser for parsing
the arguments. Applications should implement Tool for the same.
10/10/05 12:27:36 INFO input.FileInputFormat: Total input paths to process : 1
10/10/05 12:27:37 INFO mapred.JobClient: Running job: job_201010051117_0008
10/10/05 12:27:38 INFO mapred.JobClient: map 0% reduce 0%
10/10/05 12:27:47 INFO mapred.JobClient: Task Id : attempt_201010051117_0008_m_000000_0, Status : FAILED
java.lang.IllegalStateException: Cluster is empty!
at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.setup(KMeansClusterMapper.java:57)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
10/10/05 12:27:53 INFO mapred.JobClient: Task Id : attempt_201010051117_0008_m_000000_1, Status : FAILED
java.lang.IllegalStateException: Cluster is empty!
at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.setup(KMeansClusterMapper.java:57)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
10/10/05 12:27:59 INFO mapred.JobClient: Task Id : attempt_201010051117_0008_m_000000_2, Status : FAILED
java.lang.IllegalStateException: Cluster is empty!
at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.setup(KMeansClusterMapper.java:57)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
10/10/05 12:28:11 INFO mapred.JobClient: Job complete: job_201010051117_0008
10/10/05 12:28:11 INFO mapred.JobClient: Counters: 3
10/10/05 12:28:11 INFO mapred.JobClient: Job Counters
10/10/05 12:28:11 INFO mapred.JobClient: Launched map tasks=4
10/10/05 12:28:11 INFO mapred.JobClient: Data-local map tasks=4
10/10/05 12:28:11 INFO mapred.JobClient: Failed map tasks=1
10/10/05 12:28:12 INFO driver.MahoutDriver: Program took 126495 ms
Kmeans clustering error
-----------------------
Key: MAHOUT-504
URL: https://issues.apache.org/jira/browse/MAHOUT-504
Project: Mahout
Issue Type: Bug
Reporter: Zhen Guo
Assignee: Robin Anil
Fix For: 0.4
I tried the k-means algorithm on the Synthetic Control data and the
following error appears. The Canopy algorithm runs fine. The error comes
from the Mapper. I am using trunk.
10/09/20 19:40:06 INFO mapred.JobClient: Task Id : attempt_201008261432_1324_m_000000_0, Status : FAILED
java.lang.IllegalStateException: Cluster is empty!
at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.setup(KMeansClusterMapper.java:57)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)