Re: [scikit-learn] Contribution - Markov Clustering

2017-07-11 Thread Uri Goren
I've added this PR and addressed some of your concerns in the comments (publications, comparison to affinity propagation, etc.): https://github.com/scikit-learn/scikit-learn/pull/9329 I'd love for you to review it, since this is my first PR in the scikit-learn repository. On Wed, Jul 12, 2017 at 1

Re: [scikit-learn] Contribution - Markov Clustering

2017-07-11 Thread Olivier Grisel
If this is your first contribution, please make sure to read the contributors guide carefully, all the way to the end: http://scikit-learn.org/stable/developers/contributing.html In particular, make sure to follow the estimator API conventions so that your PR gets a chance to be reviewed. In particular

Re: [scikit-learn] Agglomerative clustering problem

2017-07-11 Thread Ariani A
Dear Uri, Thanks. I just have a pairwise distance matrix, and I want to implement this so that each cluster has at least 40 data points (in agglomerative clustering). Does it work? Thanks, -Ariani On Tue, Jul 11, 2017 at 1:54 PM, Uri Goren wrote: > Take a look at scipy's fcluster function. > If M is a matri

Re: [scikit-learn] Agglomerative clustering problem

2017-07-11 Thread Uri Goren
Take a look at scipy's fcluster function. If M is a matrix of all of your feature vectors, this code snippet should work. You need to figure out which metric and algorithm work for you: from sklearn.metrics import pairwise_distances from scipy.cluster import hierarchy X = pairwise_dista
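The approach suggested above (pairwise distances, then hierarchical linkage, then fcluster) might look roughly like the following. This is only a sketch: the toy data, the Euclidean metric, the average linkage method, and the threshold value are all assumptions for illustration and are not taken from the original message.

```python
import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform
from sklearn.metrics import pairwise_distances

# Toy data standing in for M (hypothetical): two well-separated blobs.
rng = np.random.RandomState(0)
M = np.vstack([rng.normal(0.0, 0.1, size=(5, 2)),
               rng.normal(5.0, 0.1, size=(5, 2))])

# Square pairwise distance matrix; the metric is an assumption.
D = pairwise_distances(M, metric="euclidean")

# linkage expects a condensed distance vector, not the square matrix.
condensed = squareform(D, checks=False)
Z = hierarchy.linkage(condensed, method="average")

# Cut the tree into flat clusters at an assumed distance threshold.
labels = hierarchy.fcluster(Z, t=1.0, criterion="distance")
```

If only a precomputed square distance matrix is available, the `pairwise_distances` step can be skipped and the matrix passed to `squareform` directly.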

[scikit-learn] Agglomerative clustering problem

2017-07-11 Thread Ariani A
Hi all, I want to perform agglomerative clustering, but I have no idea of the number of clusters beforehand. I do want every cluster to have at least 40 data points in it. How can I apply this to sklearn's agglomerative clustering? Should I use a dendrogram and cut it somehow? I have no idea how to rela
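One possible way to cut a dendrogram under a minimum cluster size is to walk the tree from the root and accept a split only when both children are large enough. The sketch below assumes scipy's linkage tree; `cut_with_min_size` is a hypothetical helper, not part of scipy or scikit-learn, and the toy data and average linkage are illustrative choices.

```python
import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import pdist


def cut_with_min_size(Z, min_size):
    """Split the dendrogram top-down, keeping a split only when both
    children hold at least min_size points (hypothetical helper)."""
    root = hierarchy.to_tree(Z)
    labels = np.empty(root.get_count(), dtype=int)
    stack, next_label = [root], 0
    while stack:
        node = stack.pop()
        left, right = node.get_left(), node.get_right()
        can_split = (
            not node.is_leaf()
            and left.get_count() >= min_size
            and right.get_count() >= min_size
        )
        if can_split:
            stack.extend([left, right])
        else:
            # pre_order() lists the original observation indices under node.
            labels[node.pre_order()] = next_label
            next_label += 1
    return labels


# Toy data (assumed): two large blobs plus a small group near the second.
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
               rng.normal(10.0, 0.3, size=(50, 2)),
               rng.normal(12.0, 0.3, size=(5, 2))])
Z = hierarchy.linkage(pdist(X), method="average")
labels = cut_with_min_size(Z, min_size=40)
```

In this toy case the small group is absorbed into its nearest large cluster. The rule is greedy: if the very first split at the root isolates a small group, everything stays in one cluster, so it may need adapting for real data. With a precomputed square distance matrix, `scipy.spatial.distance.squareform` can produce the condensed form that `linkage` expects.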

Re: [scikit-learn] Contribution - Markov Clustering

2017-07-11 Thread Jacob Schreiber
You don't need our permission to submit a PR, go ahead! We welcome PRs. On Mon, Jul 10, 2017 at 9:36 PM, Uri Goren wrote: > I have, > The only criterion that I am unsure about is the number of citations. > > In the literature Markov clustering is usually compared to affinity > propagation, which a