Take a look at o.a.m.clustering.ClusterDumper in mahout-utils. The
points file is a SequenceFile<Text,Text> where the key is the vector
id and the value is a cluster id.
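
If it helps, here is a rough sketch of going from a cluster back to its
input vectors once the points records are in hand. It assumes the
(vector id, cluster id) pairs have already been read out of the
SequenceFile (for example with Hadoop's SequenceFile.Reader) and
converted to Strings. PointsIndex and groupByCluster are hypothetical
names, not part of Mahout, and the ids are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not a Mahout class.
final class PointsIndex {

    // Each pair is {vectorId, clusterId}, mirroring one key/value record
    // of the points file after converting Text to String.
    static Map<String, List<String>> groupByCluster(List<String[]> pairs) {
        // TreeMap keeps cluster ids sorted so the output is stable.
        Map<String, List<String>> members = new TreeMap<>();
        for (String[] pair : pairs) {
            String vectorId = pair[0];
            String clusterId = pair[1];
            members.computeIfAbsent(clusterId, k -> new ArrayList<>()).add(vectorId);
        }
        return members;
    }

    public static void main(String[] args) {
        // Made-up ids standing in for records from the points file.
        List<String[]> pairs = Arrays.asList(
            new String[] {"doc1", "c0"},
            new String[] {"doc2", "c1"},
            new String[] {"doc3", "c0"});
        // prints {c0=[doc1, doc3], c1=[doc2]}
        System.out.println(groupByCluster(pairs));
    }
}
```

Looking up a cluster id in the resulting map gives the ids of the
vectors assigned to it, which you can then match against the ids your
index-to-vector dumper wrote.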

On Tue, Jan 5, 2010 at 9:51 PM, Bogdan Vatkov <[email protected]> wrote:
> I customized the lucene index-to-vector dumper already quite a lot (e.g.
> applied stop-words (from file), stop-regex) but I am wondering how the input
> vectors are later reachable if I start from cluster vectors, you say points
> are somehow doing that, where can I read more or can you tell me more, or is
> there a piece of code which would best guide me through the points format?
