Sorry, sure, I meant 17 instead of 16... too tired. Thanks for the info, so there's nothing to worry about.

-----Original Message-----
From: Jeff Eastman <j...@windwardsolutions.com>
To: user <user@mahout.apache.org>
Sent: Tue, 5 Feb 2013 12:03 am
Subject: Re: ClusterOutputPostProcessorDriver - strange numbering of generated output foldersas


Maybe a typo? I would expect folder 16 to follow folder 15. For many
reasons, though, the cluster numbers may not be monotonic. I suggest you
just iterate over the directories that are presented; their names should
correspond to the clusterIds that exist in your clusters-final directory.
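
A minimal sketch of that suggestion, assuming the post-processor output sits on the local filesystem and that each cluster folder is named with the plain integer clusterId. ClusterDirs and byClusterId are hypothetical names used for illustration; they are not part of Mahout. For output on HDFS, the same idea would apply through Hadoop's FileSystem.listStatus rather than java.nio.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not a Mahout class): discovers cluster folders by
// listing what exists instead of counting from 0 to n-1.
final class ClusterDirs {
    private ClusterDirs() {}

    /**
     * Returns a map from cluster id to folder path, sorted by id.
     * Assumes each cluster folder name is the integer cluster id;
     * plain files and folders with non-numeric names are skipped.
     */
    static Map<Integer, Path> byClusterId(Path outputDir) throws IOException {
        Map<Integer, Path> result = new TreeMap<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(outputDir)) {
            for (Path entry : entries) {
                if (!Files.isDirectory(entry)) {
                    continue; // only folders can be cluster outputs
                }
                String name = entry.getFileName().toString();
                try {
                    result.put(Integer.parseInt(name), entry);
                } catch (NumberFormatException e) {
                    // Not a numeric folder name, so not treated as a cluster folder.
                }
            }
        }
        return result;
    }
}
```

Because the map is keyed by the id taken from the folder name, gaps in the numbering do not matter, and the same key can be used to look up the matching cluster from clusters-final when attaching child clusters to their parent.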

On 2/4/13 1:56 PM, Stefan Kreuzer wrote:
Two questions:
a) Sometimes the ClusterOutputPostProcessorDriver leaves gaps in the
numbering of the generated folders, e.g. after folder 15, folder 16 is
generated. The total number of folders is right (as many folders as
there were clusters produced in the previous step). Is this a bug?
It causes me some concern, as I (naively) planned to iterate over the
folders with a simple counter variable.

b) What is the right way to detect which
ClusterOutputPostProcessorDriver output folder corresponds to which
cluster? I am trying to construct a hierarchical structure, so it is vital
that the clusters produced from the
ClusterOutputPostProcessorDriver output get assigned to their "legal"
parent.
