Hope this helps!
Manuel
@Article{Ciampi2008,
author="Ciampi, Antonio
and Lechevallier, Yves
and Limas, Manuel Castej{\'o}n
and Marcos, Ana Gonz{\'a}lez",
title="Hierarchical clustering of subpopulations with a dissimilarity based
on the likelihood ratio statistic: application to clustering
Yes, use an approximate nearest neighbors approach. None is included in
scikit-learn, but there are numerous implementations with Python interfaces.
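As an illustration, an exact k-neighbors graph from scikit-learn can stand in
for what an ANN library (annoy, nmslib, etc.) would produce approximately; the
sparse graph then feeds AgglomerativeClustering, so the full n x n distance
matrix is never materialized. A minimal sketch on toy data:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import AgglomerativeClustering

rng = np.random.RandomState(0)
X = rng.rand(1000, 8)  # toy data; replace with the real dataset

# Sparse connectivity: each point is linked only to its 10 nearest
# neighbors, so the dense pairwise distance matrix is never built.
connectivity = kneighbors_graph(X, n_neighbors=10, include_self=False)

# Agglomerative clustering restricted to that sparse graph.
model = AgglomerativeClustering(n_clusters=5, connectivity=connectivity,
                                linkage='ward')
labels = model.fit_predict(X)
print(labels.shape)  # (1000,)
```

Swapping the exact graph for one built by an ANN library only changes how
`connectivity` is constructed; the clustering step stays the same.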
On 5 January 2018 at 12:51, Shiheng Duan wrote:
> Thanks, Joel,
> I am working on KD-tree to find the nearest neighbors. Basically, I find
> the nearest neighbors for each point and then merge a couple of points if
> they are both NN for each other.
Thanks, Joel,
I am working on KD-tree to find the nearest neighbors. Basically, I find
the nearest neighbors for each point and then merge a couple of points if
they are both NN for each other. The problem is that after each iteration,
we will have a new bunch of points, where new clusters are added.
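A rough sketch of that mutual-NN merging step (a hypothetical one-iteration
helper using scipy's cKDTree; merged pairs are replaced by their centroid):

```python
import numpy as np
from scipy.spatial import cKDTree

def merge_mutual_nn(points):
    """One iteration: merge pairs that are each other's nearest neighbor."""
    tree = cKDTree(points)          # rebuilt on every call
    # k=2 because the first neighbor of each point is the point itself.
    _, idx = tree.query(points, k=2)
    nn = idx[:, 1]
    merged, used = [], set()
    for i, j in enumerate(nn):
        if i in used or j in used:
            continue
        if nn[j] == i:              # mutual nearest neighbors
            merged.append(points[[i, j]].mean(axis=0))  # centroid
            used.update((i, j))
    # Points not merged this round are carried over unchanged.
    merged.extend(points[i] for i in range(len(points)) if i not in used)
    return np.array(merged)

pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(merge_mutual_nn(pts))  # two centroids: [0.05, 0.] and [5.05, 5.]
```

The tree is rebuilt at the top of every call, which is exactly where the
repeated cost over many iterations comes from.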
Can you use nearest neighbors with a KD tree to build a distance matrix
that is sparse, in that distances to all but the nearest neighbors of a
point are (near-)infinite? Yes, this again has an additional parameter
(neighborhood size), just as BIRCH has its threshold. I suspect you will
not be able to avoid such a parameter entirely.
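Concretely, such a sparse "distance matrix" might be built like this
(assuming scikit-learn's kneighbors_graph with mode='distance'; entries that
are simply not stored play the role of the (near-)infinite distances):

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

rng = np.random.RandomState(0)
X = rng.rand(500, 3)

# Sparse matrix holding only the distances to the 5 nearest neighbors
# of each point; all other pairwise distances are never stored.
D = kneighbors_graph(X, n_neighbors=5, mode='distance')
print(D.shape, D.nnz)  # (500, 500) with only 500 * 5 stored entries
```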
Yes, it is an efficient method, but we still need to specify the number of
clusters or the threshold. Is there another way to run hierarchical
clustering on a big dataset? The main problem is the distance matrix.
Thanks.
On Tue, Jan 2, 2018 at 6:02 AM, Olivier Grisel wrote:
> Have you had a look at BIRCH?
Have you had a look at BIRCH?
http://scikit-learn.org/stable/modules/clustering.html#birch
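A minimal sketch (the threshold value here is illustrative, not a
recommendation):

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.RandomState(0)
# Two well-separated blobs of toy data.
X = np.vstack([rng.randn(100, 2), rng.randn(100, 2) + 10])

# BIRCH builds a compact CF-tree in a single pass over the data, so the
# full distance matrix is never needed; `threshold` bounds subcluster size.
model = Birch(threshold=1.0, n_clusters=2)
labels = model.fit_predict(X)
print(np.unique(labels))  # [0 1]
```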
--
Olivier
___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
Hi all,
I wonder if there is any method to do exact hierarchical clustering on a
huge dataset where it is impossible to use a distance matrix. I am
considering a KD-tree, but it has to be rebuilt every time, which consumes
a lot of time.
Any ideas?