I'll fish for one more hint. I'm using the MAHOUT-126 code to turn text
into data via TF-IDF. What comes out of there is not in the same format as
your example data. Does this mean that I need a different InputDriver? Is one
lying about for the format written by that DocumentVector class?

On Fri, May 29, 2009 at 10:29 AM, Jeff Eastman
<[email protected]> wrote:

> Benson Margulies wrote:
>
>> OK, I've got some inputs, I want to run k-means, how do I feed the beast?
>>
>>
>>
> Make sure you can run the Synthetic Control example to get everything wired
> together correctly: JDK, Hadoop, Mahout. See
> http://cwiki.apache.org/MAHOUT/syntheticcontroldata.html. Then write an
> input job to convert your data similar to
> /Mahout/examples/src/main/java/org/apache/mahout/clustering/syntheticcontrol/canopy/InputDriver.java
> and make a new job like
> /Mahout/examples/src/main/java/org/apache/mahout/clustering/syntheticcontrol/kmeans/Job.java.
> You will have a small adventure and then be operational.
>
> Have fun,
> Jeff
>
