[jira] [Updated] (MATH-1371) Provide accelerated kmeans++ implementation

2019-12-24 Thread Gilles Sadowski (Jira)


 [ 
https://issues.apache.org/jira/browse/MATH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gilles Sadowski updated MATH-1371:
--
Fix Version/s: (was: 4.0)
   4.X

> Provide accelerated kmeans++ implementation
> ---
>
> Key: MATH-1371
> URL: https://issues.apache.org/jira/browse/MATH-1371
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Artem Barger
> Assignee: Artem Barger
> Priority: Major
> Fix For: 4.X
>
> Attachments: ElkanKmeansPlusPlusClusterer.java, 
> ElkanKmeansPlusPlusClustererTest.java
>
>
> There is an accelerated version of the kmeans++ algorithm, based on the 
> method published in: Elkan, Charles. "Using the triangle inequality to 
> accelerate k-means." ICML. Vol. 3. 2003.
> The main idea is to speed up the k-means iterations by skipping distance 
> computations between centers and points when they are not needed. For 
> example, if a cluster center has not moved far from a point after an update, 
> that point's assignment does not need to be re-evaluated. The accelerated 
> algorithm avoids unnecessary distance calculations by applying the triangle 
> inequality in two different ways, and by keeping track of lower and upper 
> bounds for distances between points and centers.
> The full algorithm description is available in the paper.
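
For readers without the paper at hand, here is a minimal Java sketch of one of the two triangle-inequality tests mentioned in the description. For a point x currently closest to center c, if d(c, c') >= 2 * d(x, c), then d(x, c') >= d(c, c') - d(x, c) >= d(x, c), so the distance from x to c' never needs to be computed. This is not the attached ElkanKmeansPlusPlusClusterer.java: the class name, method names and the evaluation counter are hypothetical, and the sketch leaves out the lower and upper bounds that the full algorithm carries between iterations.

```java
import java.util.Arrays;

/** Illustrative sketch only; all names here are hypothetical. */
final class TriangleInequalityAssignment {

    /** Point-to-center distances computed by the last call to assign(). */
    static int distanceEvaluations;

    /** Euclidean distance between two vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Assigns each point to its nearest center. A candidate center c' is
     * skipped whenever d(c, c') >= 2 * d(x, c) for the current best center c,
     * because the triangle inequality then guarantees d(x, c') >= d(x, c).
     */
    static int[] assign(double[][] points, double[][] centers) {
        int k = centers.length;
        // Center-to-center distances, computed once for all points.
        double[][] cc = new double[k][k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                cc[i][j] = cc[j][i] = distance(centers[i], centers[j]);
            }
        }
        distanceEvaluations = 0;
        int[] assignment = new int[points.length];
        for (int p = 0; p < points.length; p++) {
            int best = 0;
            double bestDist = distance(points[p], centers[0]);
            distanceEvaluations++;
            for (int c = 1; c < k; c++) {
                if (cc[best][c] >= 2 * bestDist) {
                    continue; // center c cannot be closer than the current best
                }
                double d = distance(points[p], centers[c]);
                distanceEvaluations++;
                if (d < bestDist) {
                    bestDist = d;
                    best = c;
                }
            }
            assignment[p] = best;
        }
        return assignment;
    }

    public static void main(String[] args) {
        double[][] centers = {{0, 0}, {10, 0}, {0, 10}};
        double[][] points = {{1, 1}, {9, 1}, {1, 9}, {0.5, 0}};
        System.out.println(Arrays.toString(assign(points, centers)));
        System.out.println("distance evaluations: " + distanceEvaluations);
    }
}
```

A plain nearest-center search on the data in main would compute 12 point-to-center distances (4 points, 3 centers); the counter shows how many the test leaves. The sketch pays for this with k*(k-1)/2 center-to-center distances per pass, which is cheap when there are far more points than centers.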



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (MATH-1371) Provide accelerated kmeans++ implementation

2017-04-18 Thread Rob Tompkins (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rob Tompkins updated MATH-1371:
---
Fix Version/s: 4.0

> Provide accelerated kmeans++ implementation
> ---
>
> Key: MATH-1371
> URL: https://issues.apache.org/jira/browse/MATH-1371
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Artem Barger
> Assignee: Artem Barger
> Fix For: 4.0
>
> Attachments: ElkanKmeansPlusPlusClusterer.java, 
> ElkanKmeansPlusPlusClustererTest.java
>
>





[jira] [Updated] (MATH-1371) Provide accelerated kmeans++ implementation

2016-05-31 Thread Artem Barger (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Barger updated MATH-1371:
---
Attachment: (was: ElkanKmeansPlusPlusClusterer.java)

> Provide accelerated kmeans++ implementation
> ---
>
> Key: MATH-1371
> URL: https://issues.apache.org/jira/browse/MATH-1371
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Artem Barger
> Assignee: Artem Barger
> Attachments: ElkanKmeansPlusPlusClusterer.java, 
> ElkanKmeansPlusPlusClustererTest.java
>
>





[jira] [Updated] (MATH-1371) Provide accelerated kmeans++ implementation

2016-05-31 Thread Artem Barger (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Barger updated MATH-1371:
---
Attachment: ElkanKmeansPlusPlusClustererTest.java
ElkanKmeansPlusPlusClusterer.java

Updated version of the kmeans implementation; all comments have been addressed 
as requested.

Unit test added.

> Provide accelerated kmeans++ implementation
> ---
>
> Key: MATH-1371
> URL: https://issues.apache.org/jira/browse/MATH-1371
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Artem Barger
> Assignee: Artem Barger
> Attachments: ElkanKmeansPlusPlusClusterer.java, 
> ElkanKmeansPlusPlusClusterer.java, ElkanKmeansPlusPlusClustererTest.java
>
>





[jira] [Updated] (MATH-1371) Provide accelerated kmeans++ implementation

2016-05-30 Thread Artem Barger (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Barger updated MATH-1371:
---
Attachment: ElkanKmeansPlusPlusClusterer.java

This is my Java implementation of the algorithm described in: Elkan, Charles. 
"Using the triangle inequality to accelerate k-means." ICML. Vol. 3. 2003. 
https://www.aaai.org/Papers/ICML/2003/ICML03-022.pdf.

I used it recently in my research project and found that it speeds up kmeans 
clustering by an order of magnitude compared with the algorithm provided by 
Commons Math (CM). I am also not sure whether the current CM solution already 
includes an Elkan variant of kmeans++.

> Provide accelerated kmeans++ implementation
> ---
>
> Key: MATH-1371
> URL: https://issues.apache.org/jira/browse/MATH-1371
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Artem Barger
> Assignee: Artem Barger
> Attachments: ElkanKmeansPlusPlusClusterer.java
>
>


