I see, but I was looking for a more practical definition, e.g. what the effect
would be if I use one distance measure class or another.
The mathematical explanations in the javadoc do not help much.
If there is no way to explain the different distance algorithms in a more
practical way, then maybe we do not need different algorithms :)
For example, is the distance affected more by the number of common terms, or by
the weights of the common terms, or ...? This is just a possible example; I do
not know if it matches any of the distance algorithms.
There should be guidance for the people who will use this stuff. It is
expected that these users know something about their input data, so based on
different characteristics of that data (e.g. number of docs, doc size, etc.)
and the desired result (e.g. number of clusters, number of unique terms in
clusters, etc.) they should be able to pick the right Mahout configuration with
regard to numbers, classes, algorithms, etc.
I currently miss such a guideline.
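
To illustrate the kind of practical comparison meant here, below is a minimal
plain-Java sketch. It does not call Mahout; it re-implements the textbook
cosine, Euclidean and Manhattan formulas on small term-weight arrays. The class
name DistanceSketch and the sample weights are made up for illustration, and
Mahout's own measure classes may differ in details.

```java
// Hypothetical, self-contained sketch: textbook distance formulas, not Mahout code.
class DistanceSketch {

    // Cosine distance: 1 - cos(angle between a and b).
    // Depends only on the direction of the vectors, not on their length.
    public static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Euclidean distance: straight-line distance, sensitive to vector length.
    public static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Manhattan distance: sum of absolute per-term differences.
    public static double manhattan(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up term weights over a four-term vocabulary.
        double[] shortDoc = {1, 2, 0, 0};
        double[] longDoc  = {3, 6, 0, 0}; // same term mix, three times the weight
        double[] otherDoc = {0, 1, 2, 0}; // shares only one term with shortDoc

        System.out.println("cosine    short/long : " + cosine(shortDoc, longDoc));
        System.out.println("cosine    short/other: " + cosine(shortDoc, otherDoc));
        System.out.println("euclidean short/long : " + euclidean(shortDoc, longDoc));
        System.out.println("euclidean short/other: " + euclidean(shortDoc, otherDoc));
        System.out.println("manhattan short/long : " + manhattan(shortDoc, longDoc));
        System.out.println("manhattan short/other: " + manhattan(shortDoc, otherDoc));
    }
}
```

With these sample vectors, the cosine distance treats shortDoc and longDoc as
essentially identical (distance 0 up to rounding) because they use the same
terms in the same proportions, while the Euclidean distance ranks otherDoc as
closer to shortDoc than longDoc is, because it also reacts to the size of the
weights. So if documents differ a lot in size, the choice between the two can
change which documents end up in the same cluster.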

On Thu, Jan 7, 2010 at 2:38 PM, Felix Lange <[email protected]> wrote:

> Hi Bogdan,
> I didn't read any javadocs about this package, but the cluster distance
> should be the distance between two clusters. There are different distance
> measures in this respect, e.g. you can take the distance between two
> clusters' centers as their distance value.
> Greetings
> Felix
>
>
> 2010/1/6 Bogdan Vatkov <[email protected]>
>
> > What is the practical meaning of the "cluster distance"? E.g. I am currently
> > using org.apache.mahout.common.distance.CosineDistanceMeasure but I do not
> > have any clue what that means and what other values could bring to the
> > game. Any guidance here?
> >
> > --
> > Best regards,
> > Bogdan
> >
>



-- 
Best regards,
Bogdan
