> I want to impose an additional constraint. When two clusters are combined and
> the cost of the combination is equal for multiple cluster pairs, I want to
> choose the pair for which the combined cluster has the smallest size.
> What is the cleanest and easiest way of achieving this?
I don't think th
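For illustration, a hand-rolled sketch of that tie-break (this is not a built-in scikit-learn option; the function name and the assumption that D is the precomputed square L1 distance matrix are mine):

import numpy as np

def agglomerate_with_tiebreak(D, n_clusters):
    # Naive complete-linkage agglomeration: on equal merge cost, prefer the
    # pair whose combined cluster is smallest. O(n^3) per merge, sketch only.
    clusters = [[i] for i in range(len(D))]          # start from singletons
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # complete linkage: largest pairwise distance between members
                cost = D[np.ix_(clusters[a], clusters[b])].max()
                size = len(clusters[a]) + len(clusters[b])
                # lexicographic comparison: ties on cost fall back to size
                if best is None or (cost, size) < best[0]:
                    best = ((cost, size), a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])              # merge b into a
        del clusters[b]
    return clusters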
Hi everyone,
I am using agglomerative clustering with an L1 distance matrix as input and
the "complete" linkage option.
I want to impose an additional constraint. When two clusters are combined and
the cost of the combination is equal for multiple cluster pairs, I want to
choose the pair for which the combined cluster has the smallest size. What is
the cleanest and easiest way of achieving this?
Typically, when I think of limiting the number of points in a cluster, I
think of KD-trees. I suppose that wouldn't work?
On Tue, Jul 11, 2017 at 11:22 AM, Ariani A wrote:
> Dear Uri,
> Thanks. I just have a pairwise distance matrix and I want to implement it
> so that each cluster has at least 40 data points (in agglomerative clustering).
Dear Shane,
Sorry for bothering you!
Are the "precomputed" option and the "distance matrix" you are talking about
related to "DBSCAN"?
Thanks,
Best.
On Thu, Jul 13, 2017 at 7:03 PM, Ariani A wrote:
> Dear Shane,
> Thanks for your prompt answer.
> Do you mean that for DBSCAN there is no need to feed other parameters?
Dear Shane,
Thanks for your prompt answer.
Do you mean that for DBSCAN there is no need to feed other parameters? Do I
just call the function, or do I have to modify the code?
P.S. I was not able to find the DBSCAN code on GitHub.
Looking forward to hearing from you.
Best,
-Noushin
On Thu, Jul 13,
Hi Ariani,
Yes, you can use a distance matrix-- I think that what you want is
metric='precomputed', and then X would be your N by N distance matrix.
Hope that helps,
~Shane
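For what it's worth, a minimal sketch of what Shane describes; the file name, eps value, and min_samples value below are placeholders, not anything from the thread:

import numpy as np
from sklearn.cluster import DBSCAN

D = np.load("distances.npy")        # hypothetical: your N x N pairwise distance matrix
db = DBSCAN(eps=0.5, min_samples=40, metric="precomputed")
labels = db.fit_predict(D)          # label -1 marks noise points
# Note: min_samples is the core-point neighbourhood threshold, not a hard
# minimum cluster size.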
On 07/13, Ariani A wrote:
Dear Shane,
Thanks for your answer.
Does DBSCAN work with a distance matrix? I have a distance matrix
Dear Shane,
Thanks for your answer.
Does DBSCAN work with a distance matrix? I have a distance matrix
(a symmetric matrix which contains the pairwise distances). Can you help me? I
did not find the DBSCAN code at that link.
Best,
-Ariani
On Thu, Jul 6, 2017 at 12:32 PM, Shane Grigsby wrote:
> This sounds
Dear Uri,
Thanks. I just have a pairwise distance matrix and I want to implement it
so that each cluster has at least 40 data points (in agglomerative clustering).
Will that work?
Thanks,
-Ariani
On Tue, Jul 11, 2017 at 1:54 PM, Uri Goren wrote:
> Take a look at scipy's fcluster function.
> If M is a matrix of all of your feature vectors, this code snippet should work.
Take a look at scipy's fcluster function.
If M is a matrix of all of your feature vectors, this code snippet should
work.
You need to figure out what metric and algorithm work for you:
from sklearn.metrics import pairwise_distances
from scipy.cluster import hierarchy
X = pairwise_distances(M)
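To make that concrete, here is the pipeline completed end to end; the toy M, the "l1" metric, and the cut height are placeholders to adapt:

import numpy as np
from sklearn.metrics import pairwise_distances
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform

M = np.random.rand(100, 8)                          # hypothetical feature vectors
X = pairwise_distances(M, metric="l1")              # square pairwise L1 distance matrix
Z = hierarchy.linkage(squareform(X, checks=False), method="complete")
labels = hierarchy.fcluster(Z, t=2.0, criterion="distance")   # cut the tree at height 2.0
print(np.bincount(labels)[1:])                      # resulting cluster sizes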
Hi all,
I want to perform agglomerative clustering, but I have no idea of the number
of clusters beforehand. However, I want every cluster to have at least 40 data
points in it. How can I apply this with sklearn's AgglomerativeClustering?
Should I use the dendrogram and cut it somehow? I have no idea how to relate
the dendrogram to this.
Dear Shane,
Thanks for your time. But I have to implement it with agglomerative
clustering and cut it so that each cluster has at least 40 data points. I
am not sure how to do the cut. I was guessing maybe it can be done by
cutting the dendrogram? Is that correct? If so, I do not know how to apply
it.
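Not an official recipe, but a hedged sketch of that cut: build the linkage from the precomputed distances, then walk down from many clusters to few until every cluster holds at least 40 points (the file name is a placeholder, and this assumes there are at least 40 points in total):

import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform

D = np.load("distances.npy")                        # hypothetical precomputed distance matrix
Z = hierarchy.linkage(squareform(D, checks=False), method="complete")

n = D.shape[0]
for k in range(n // 40, 0, -1):                     # at most n // 40 clusters can each hold 40 points
    labels = hierarchy.fcluster(Z, t=k, criterion="maxclust")
    sizes = np.bincount(labels)[1:]
    if sizes.min() >= 40:
        break                                       # finest cut where every cluster has >= 40 points
print(k, sizes)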
This sounds like it may be a problem more amenable to either DBSCAN or
OPTICS. Neither algorithm requires a priori knowledge of the number of
clusters, and both let you specify a minimum point-membership threshold
for clusters. The OPTICS algorithm will also produce a dendrogram.
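And a sketch of the OPTICS route, using sklearn.cluster.OPTICS from newer scikit-learn releases; the file name is a placeholder, and min_samples plays the role of the membership threshold Shane mentions:

import numpy as np
from sklearn.cluster import OPTICS

D = np.load("distances.npy")                        # hypothetical precomputed distance matrix
opt = OPTICS(min_samples=40, metric="precomputed")
labels = opt.fit_predict(D)                         # label -1 marks unclustered points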
You can have a look at the test named "test_agglomerative_clustering" in:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cluster/tests/test_hierarchical.py
--
Olivier
I have some data and also the pairwise distance matrix of these data
points. I want to cluster them using agglomerative clustering. I read that
in sklearn we can use 'precomputed' as the affinity, and I expect it takes
the distance matrix. But I could not find any example which uses a
precomputed affinity.
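A minimal sketch of that usage; the matrix file and n_clusters value are placeholders, and the keyword is affinity= in older scikit-learn releases and metric= in newer ones:

import numpy as np
from sklearn.cluster import AgglomerativeClustering

D = np.load("distances.npy")                        # hypothetical N x N pairwise distance matrix
model = AgglomerativeClustering(n_clusters=5, affinity="precomputed",
                                linkage="complete")  # 'ward' does not accept precomputed input
labels = model.fit_predict(D)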