Thanks for the help and explanation. :)

On Thu, Feb 25, 2010 at 1:20 PM, Jake Mannix <[email protected]> wrote:
> And to clarify: you can use either one, but you should think of them like
> this:
> RandomAccessSparseVector is useful for vectors whose contents change
> a great deal (the moving centroids of a clustering algorithm, for example),
> and SequentialAccessSparseVector are useful (ie faster) in the case where
> they are built up, and then are essentially used in an immutable fashion
> (you repeatedly compute a lot of dot-products and add multiples of them
> onto other vectors [either DenseVectors or RandomAccessSparseVectors]).
>
>   -jake
>
> On Wed, Feb 24, 2010 at 7:45 PM, Robin Anil <[email protected]> wrote:
>
> > They are replaced by the two impls RandomAccessSparseVector or
> > SequentialAccessSparseVector
> >
> > On Thu, Feb 25, 2010 at 9:10 AM, Arshad Khan <[email protected]> wrote:
> >
> > > Thanks for the quick reply.
> > >
> > > I have downloaded the latest 0.3 code. There seems to be significant
> > > changes in this version. For example, currently I am using
> > > org.apache.mahout.matrix.SparseVector class but in 0.3 I cannot find
> > > this class.
> > >
> > > What class it is replaced with?
> > >
> > > Thanks
> > >
> > > On Thu, Feb 25, 2010 at 10:12 AM, Ted Dunning <[email protected]>
> > > wrote:
> > >
> > > > There are known problems with that version of k-means.
> > > >
> > > > Try using the trunk version. 0.3 is very close and we are entering
> > > > code freeze for that so you should be fine with the latest version.
> > > >
> > > > On Wed, Feb 24, 2010 at 5:46 PM, Arshad Khan <[email protected]>
> > > > wrote:
> > > >
> > > > > Hello
> > > > >
> > > > > I am using Mahout 0.2 implementation of KMeans in one of my Text
> > > > > Mining project. I apply KMeans with a default K value of 4. It
> > > > > seems that every time I repeat the clustering process on the same
> > > > > data set, the results are different and difference (in terms of
> > > > > cluster size and membership) is great from run to run. The initial
> > > > > set of centroid points are chosen randomly through
> > > > > RandomSeedGenerator. Is there a way to obtain more consistent
> > > > > results that do not differ so greatly? Or may be I am doing
> > > > > something wrong?
> > > > >
> > > > > Any help or idea is very much appreciated.
> > > > >
> > > > > Thanks and Regards
> > > > > Arshad
> > > >
> > > > --
> > > > Ted Dunning, CTO
> > > > DeepDyve
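
For anyone reading this thread later, here is a rough plain-Java sketch of the trade-off Jake describes. It does not use Mahout at all: SparseLayouts, HashSparse and SortedSparse are hypothetical names made up for illustration, and the storage layouts (a hash map for the frequently changing vector, sorted parallel arrays for the build-once vector) are my assumption about why the two kinds of vector behave differently, not a description of Mahout's actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: these are NOT Mahout classes.
final class SparseLayouts {

    /** Hash-backed sparse vector: cheap updates at arbitrary indices (e.g. a moving centroid). */
    static final class HashSparse {
        private final Map<Integer, Double> cells = new HashMap<>();

        void add(int index, double delta) {
            cells.merge(index, delta, Double::sum);
        }

        double get(int index) {
            return cells.getOrDefault(index, 0.0);
        }
    }

    /** Sorted parallel arrays: built once, then only scanned in index order. */
    static final class SortedSparse {
        private final int[] indices;   // strictly increasing
        private final double[] values; // values[i] belongs to indices[i]

        SortedSparse(int[] sortedIndices, double[] values) {
            if (sortedIndices.length != values.length) {
                throw new IllegalArgumentException("indices and values differ in length");
            }
            this.indices = sortedIndices.clone();
            this.values = values.clone();
        }

        /** Dot product by a single merge pass over both sorted index lists. */
        double dot(SortedSparse other) {
            double sum = 0.0;
            int i = 0;
            int j = 0;
            while (i < indices.length && j < other.indices.length) {
                if (indices[i] == other.indices[j]) {
                    sum += values[i++] * other.values[j++];
                } else if (indices[i] < other.indices[j]) {
                    i++;
                } else {
                    j++;
                }
            }
            return sum;
        }

        /** Adds scale * this onto a mutable target, leaving this vector untouched. */
        void addMultipleTo(double scale, HashSparse target) {
            for (int i = 0; i < indices.length; i++) {
                target.add(indices[i], scale * values[i]);
            }
        }
    }

    public static void main(String[] args) {
        SortedSparse docA = new SortedSparse(new int[] {1, 4, 7}, new double[] {2.0, 3.0, 5.0});
        SortedSparse docB = new SortedSparse(new int[] {4, 7, 9}, new double[] {10.0, 1.0, 8.0});
        HashSparse centroid = new HashSparse();

        System.out.println("dot = " + docA.dot(docB));
        docA.addMultipleTo(0.5, centroid);
        System.out.println("centroid[4] = " + centroid.get(4));
    }
}
```

The point of the sketch is only that the immutable documents are read by a linear scan while the centroid takes scattered writes, which is the usage split Jake recommends.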
