2015-01-06 15:03 GMT+08:00 Lee S <sle...@gmail.com>:
> But the parameters and distance measure are the same. The only difference:
> Mahout kmeans convergence is based on whether every cluster has converged;
> scikit-learn is based on a within-cluster sum-of-squares criterion.
>
> 2015-01-06 14:15 GMT+08:00 Ted Dunning <ted.dunn...@gmail.com>:
>
>> I don't think that data is sufficiently clusterable to expect a unique
>> solution.
>>
>> Mean squared error would be a better measure of quality.
>>
>> On Mon, Jan 5, 2015 at 10:07 PM, Lee S <sle...@gmail.com> wrote:
>>
>> > Data in this link:
>> > http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
>> > I converted it to a sequencefile with InputDriver.
>> >
>> > 2015-01-06 14:04 GMT+08:00 Ted Dunning <ted.dunn...@gmail.com>:
>> >
>> > > What kind of synthetic data did you use?
>> > >
>> > > On Mon, Jan 5, 2015 at 8:29 PM, Lee S <sle...@gmail.com> wrote:
>> > >
>> > > > Hi, I used the synthetic data to test the kmeans method.
>> > > > I wrote my own code to convert the center points to sequencefiles.
>> > > > Then I ran kmeans with the parameters (-i input -o output -c center
>> > > > -x 3 -cd 1 -cl).
>> > > > I compared the dumped clusteredPoints with the result of
>> > > > scikit-learn kmeans, and it's totally different. I'm very confused.
>> > > >
>> > > > Does anybody ever run kmeans with center points provided and
>> > > > compare the result with another ml-library?
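For what it's worth, here is a minimal sketch of the scikit-learn side of such a comparison: fixed starting centers (analogous to Mahout's `-c` directory), a capped iteration count mirroring `-x 3`, and mean squared error computed from inertia, as Ted suggests. The toy two-blob data and the specific centers are illustrative assumptions, not the synthetic_control dataset from the thread.

```python
# Sketch of a reproducible scikit-learn kmeans run with provided centers.
# Assumptions: toy 2-D data, 2 clusters; not the original synthetic_control run.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(0)
# Two well-separated blobs so the clustering outcome is unambiguous.
data = np.vstack([rng.randn(50, 2), rng.randn(50, 2) + [10.0, 10.0]])

# Fixed starting centers, analogous to Mahout's `-c center` input.
init_centers = np.array([[0.0, 0.0], [10.0, 10.0]])

# n_init=1 because the init is explicit; max_iter=3 mirrors `-x 3`.
km = KMeans(n_clusters=2, init=init_centers, n_init=1, max_iter=3).fit(data)

# Mean squared error: within-cluster sum of squares (inertia) per point.
mse = km.inertia_ / len(data)
print("MSE:", mse)
```

Comparing this MSE against the squared distances in Mahout's dumped clusteredPoints is more meaningful than comparing cluster assignments directly, since the two libraries stop on different criteria and may disagree on weakly clusterable data.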