Thanks Suneel for the help! I will examine my workflow again.

Wei

On Fri, Aug 22, 2014 at 4:57 PM, Suneel Marthi <[email protected]> wrote:

> u would see that error with KMeans if u do not provide input centroids
> or ask KMeans to randomly generate initial centroids.
> look at the wiki page for running Mahout KMeans and ensure that u r not
> missing anything.
>
>
> On Sat, Aug 23, 2014 at 2:22 AM, Wei Zhang <[email protected]> wrote:
>
> > Thanks a lot Pat for the pointer! I tried with the latest trunk against
> > Hadoop 2.5.0 for the k-means cluster example. It successfully finished the
> > text to tf-idf vectorization MR jobs but failed at the actual clustering,
> > complaining
> >
> > Exception in thread "main" java.lang.IllegalStateException: No input
> > clusters found in ... Check your -c argument.
> >
> > I had the same problem before when running against Hadoop 1.2.1, but the
> > problem went away after I reformatted HDFS. I didn't get lucky with HDFS
> > reformat this time. I'll do a bit more investigation later.
> >
> > Thanks again for the pointers!
> >
> > Wei
> >
> >
> > On Fri, Aug 22, 2014 at 11:20 AM, Pat Ferrel <[email protected]>
> > wrote:
> >
> > > I’m no expert here since I’m stuck on hadoop 1.2.1 but the latest master
> > > on github is meant to be used with hadoop 2.x as the default.
> > > Mahout 0.9 had some experimental support but is AFAIK not recommended for
> > > h 2. No clue about hadoop 2.5 specifically but if it truly is a minor
> > > backwards compatible release you may be ok. Mahout now uses Spark, which
> > > claims support through hadoop 2.4.x.
> > >
> > > On Aug 20, 2014, at 3:03 PM, Wei Zhang <[email protected]> wrote:
> > >
> > > Hello,
> > >
> > > After a system upgrade, we have a Hadoop 2.5.0 cloud instance. We are
> > > trying to run Mahout on top of it. We are using Mahout 0.9 (downloaded
> > > from http://mahout.apache.org/general/downloads.html )
> > >
> > > https://issues.apache.org/jira/browse/MAHOUT-1329 indicates Mahout
> > > supports Hadoop 2.2.0
> > >
> > > But even with mvn clean install -Dhadoop2.version=2.2.0 -DskipTests=true,
> > > it doesn't appear Hadoop 2.x jar is downloaded anywhere (i.e., there is
> > > only Hadoop 1.2.1 jar downloaded, which is what the pom.xml dictates).
> > >
> > > Further, even a simple HDFS I/O call would fail (i.e., creating an HDFS
> > > file). It seems Hadoop 2.2.0 is not quite compatible with Mahout 0.9
> > >
> > > My question is:
> > > (1) Does Mahout support Hadoop 2.5.0 or should I check out the code
> > > from git (as suggested at
> > > http://mahout.apache.org/developers/buildingmahout.html)
> > > in order to get a Hadoop 2.x friendly Mahout ?
> > >
> > > Thanks!
> > >
> > > Wei
