Yes, I am seeing the same behaviour with m=2, but the convergence is faster.
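
For reference, here is a minimal sketch of how the fuzziness parameter m
shapes cluster memberships. It uses the textbook fuzzy c-means membership
formula, which is an assumption on my part and not necessarily what the
Mahout code computes; the function name `memberships` is hypothetical.

```python
def memberships(distances, m):
    """Textbook fuzzy c-means membership of one point in each cluster.

    distances: distance from the point to each cluster center.
    m: fuzziness parameter, must be greater than 1.
    Assumption: this is the standard formula, not Mahout's actual code.
    """
    if m <= 1.0:
        raise ValueError("m must be greater than 1")
    # A point sitting exactly on one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(distances) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(distances))]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
            for d_i in distances]


if __name__ == "__main__":
    for m in (2.0, 3.0):
        print(m, memberships([1.0, 2.0], m))
```

For a point at distances 1 and 2 from two centers, m=2 gives memberships
of roughly 0.8 and 0.2, while m=3 gives roughly 0.67 and 0.33. A larger m
flattens the memberships, so every point contributes more evenly to every
center, which is consistent with centers drifting close to each other.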

On Wed, Feb 17, 2010 at 11:21 PM, Palleti, Pallavi <
pallavi.pall...@corp.aol.com> wrote:

> How many iterations of FuzzyKMeans are you running? Here is my
> observation: when I ran only a few iterations, the cluster centroids
> were far apart. However, when I ran more than 50 iterations or so, the
> cluster points were still different, but so close together that they
> were practically the same. By the way, I am using m=3 in the
> membership function.
>
> Thanks
> Pallavi
>
> -----Original Message-----
> From: Robin Anil [mailto:robin.a...@gmail.com]
> Sent: Wednesday, February 17, 2010 8:10 PM
> To: mahout-dev@lucene.apache.org
> Subject: Re: Fuzzy K Means
>
> Tests are passing fine, but not when testing on Reuters.
>
> On Wed, Feb 17, 2010 at 8:07 PM, Pallavi Palleti <
> pallavi.pall...@corp.aol.com> wrote:
>
> > If we just need to verify with some sample dataset, we already have
> > the data in the TestFuzzyKMeansClustering code. Won't that suffice?
> > Otherwise, I need to manually generate a sample dataset, as I don't
> > have such a small dataset with me. I am actually running on MovieLens
> > data, using movie ratings as the vector (movie as dimension, rating
> > as coefficient) and the user as the point.
> >
> >
> > Thanks
> > Pallavi
> >
> > Robin Anil wrote:
> >
> >> I tracked the versions back to before the change to Writables was
> >> made. There is no significant change in the code.
> >>
> >> Can you give me a small dataset, maybe 10 points in 5 dimensions? I
> >> can verify the trunk, just in case.
> >>
> >> Robin
> >>
> >> On Wed, Feb 17, 2010 at 7:49 PM, Pallavi Palleti <
> >> pallavi.pall...@corp.aol.com> wrote:
> >>
> >>> I have a local version, which I submitted long ago. I am using it
> >>> on real data, and it is not giving the same point for all clusters.
> >>> However, I haven't tried the latest Mahout code. I kept my code
> >>> writing its output as text so that it is easy for me to verify.
> >>> The current Mahout code, however, outputs binary data (as a
> >>> SequenceFile), so it is difficult to verify.
> >>>
> >>>
> >>> Thanks
> >>> Pallavi
> >>>
> >>> Robin Anil wrote:
> >>>
> >>>> Have you verified the trunk code on some real data? I am getting the
> >>>> same point for all clusters regardless of the distance measure.
> >>>>
> >>>> Robin
> >>>>
> >>>> On Wed, Feb 17, 2010 at 6:41 PM, Pallavi Palleti <
> >>>> pallavi.pall...@corp.aol.com> wrote:
> >>>>
> >>>>> Yes. It shouldn't be a problem. My point was that, by extending
> >>>>> ClusterBase, we inherit numpoints even though we are not using it
> >>>>> in SoftCluster.
> >>>>> Other than that, I don't see any issue w.r.t. functionality.
> >>>>>
> >>>>>
> >>>>> Thanks
> >>>>> Pallavi
> >>>>>
> >>>>> Robin Anil wrote:
> >>>>>
> >>>>>> In the SoftCluster implementation, writeOut calculates the
> >>>>>> centroid and writes it, and read(in) reads the centroid into
> >>>>>> the center.
> >>>>>>
> >>>>>> ClusterDumper reads into the ClusterBase and calls
> >>>>>> value.getCenter(). It should work normally, right?
> >>>>>>
> >>>>>> Robin
> >>>>>>
> >>>>>> On Wed, Feb 17, 2010 at 6:02 PM, Pallavi Palleti <
> >>>>>> pallavi.pall...@corp.aol.com> wrote:
> >>>>>>
> >>>>>>> Yes, but not the total number of points. So the numpoints from
> >>>>>>> ClusterBase will not be used in SoftCluster. numpoints is
> >>>>>>> specific to k-means, just as the weighted point total is
> >>>>>>> specific to fuzzy k-means.
> >>>>>>>
> >>>>>>>
> >>>>>>> Robin Anil wrote:
> >>>>>>>
> >>>>>>>> The center is still the averaged-out centroid, right?
> >>>>>>>> weightedtotalvector/totalprobWeight
> >>>>>>>>
> >>>>>>>> On Wed, Feb 17, 2010 at 5:10 PM, Pallavi Palleti <
> >>>>>>>> pallavi.pall...@corp.aol.com> wrote:
> >>>>>>>>
> >>>>>>>>> I haven't yet gone through ClusterDumper. However, ClusterBase
> >>>>>>>>> would hold the number of points to average over
> >>>>>>>>> (pointTotal/numPoints, as in k-means), whereas SoftCluster has
> >>>>>>>>> a weighted point total. So I am wondering how we can reuse
> >>>>>>>>> ClusterBase here.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks
> >>>>>>>>> Pallavi
> >>>>>>>>>
> >>>>>>>>> Robin Anil wrote:
> >>>>>>>>>
> >>>>>>>>>> Yes, so that ClusterDumper can print it out.
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Feb 17, 2010 at 5:02 PM, Pallavi Palleti <
> >>>>>>>>>> pallavi.pall...@corp.aol.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Robin,
> >>>>>>>>>>>
> >>>>>>>>>>> When you talk about reusing ClusterBase, are you planning to
> >>>>>>>>>>> extend ClusterBase in SoftCluster? For example, SoftCluster
> >>>>>>>>>>> extends ClusterBase?
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks
> >>>>>>>>>>> Pallavi
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Robin Anil wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> I have been trying to convert the FuzzyKMeans SoftCluster
> >>>>>>>>>>>> (which should ideally be named FuzzyKmeansCluster) to use
> >>>>>>>>>>>> ClusterBase.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I am getting *the same center* for all the clusters. For the
> >>>>>>>>>>>> conversion, all I did was remove the center vector from
> >>>>>>>>>>>> the SoftCluster class and reuse the one from ClusterBase.
> >>>>>>>>>>>> This essentially makes no change to the tests, which pass
> >>>>>>>>>>>> correctly.
> >>>>>>>>>>>>
> >>>>>>>>>>>> So I am questioning whether the implementation keeps the
> >>>>>>>>>>>> average center at all. Has anyone who has used FuzzyKMeans
> >>>>>>>>>>>> experienced this?
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Robin
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>
> >>>
> >>
> >>
> >
>
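
The two centroid formulas discussed in this thread (pointTotal/numPoints
for k-means and weightedtotalvector/totalprobWeight for fuzzy k-means)
can be sketched as below. This is only an illustration under my own
assumptions: the function names are hypothetical and the code is not taken
from ClusterBase or SoftCluster.

```python
def kmeans_center(points):
    """Plain average: pointTotal / numPoints."""
    num_points = len(points)
    dims = len(points[0])
    return [sum(p[d] for p in points) / num_points for d in range(dims)]


def fuzzy_center(points, weights):
    """Weighted average: weighted point total / total weight.

    weights[i] is the (hypothetical) weight of points[i] for this cluster,
    e.g. its membership raised to the power m.
    """
    total_weight = sum(weights)
    dims = len(points[0])
    return [sum(w * p[d] for p, w in zip(points, weights)) / total_weight
            for d in range(dims)]


if __name__ == "__main__":
    pts = [[0.0, 0.0], [2.0, 4.0]]
    print(kmeans_center(pts))
    print(fuzzy_center(pts, [3.0, 1.0]))
```

With every weight equal to 1, the weighted average reduces to the plain
average, so numPoints plays the same role for k-means that the total
weight plays for fuzzy k-means. If the weights come out nearly identical
for every cluster, all weighted averages land on almost the same point.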
