+1
On 07/31/2015 04:50 PM, Sebastian Raschka wrote:
Hi, Timo,
wow, the code is really short, well organized, and commented. But it's probably
better to submit a pull request so that people can directly comment on sections
of the code and get notifications and updates.
Best,
Sebastian
> On Jul 31, 2015, at 4:35 PM, Timo Erkkilä wrote:
Good ideas. I'm fine with integrating the code into Scikit-Learn, even though it's
a bit of work. :) I've pushed the first version of the code under the feature
branch "kmedoids":
https://github.com/terkkila/scikit-learn/blob/kmedoids/sklearn/cluster/k_medoids_.py
I've added drafts of the "clustering" and "d..."
To address the efficiency issue for large datasets (to some extent), we could
maybe have a `clustering` argument where `clustering='pam'` or
`clustering='clara'`; 'pam' should probably be the default.
In a nutshell, CLARA repeatedly draws random subsamples (of size k < n_samples),
applies PAM to each, and keeps the medoid set with the lowest total dissimilarity
over the full dataset.
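To make the CLARA idea above concrete, here is a minimal sketch. It is not the code from the kmedoids branch; the `pam` callable and every name in it are assumptions for illustration, with `pam(X, n_clusters)` expected to return medoid indices into its input.

import numpy as np
from sklearn.metrics import pairwise_distances

def clara(X, n_clusters, pam, n_draws=5, sample_size=None, random_state=0):
    """Sketch of CLARA: run PAM on random subsamples and keep the
    medoid set with the lowest total dissimilarity on the full data."""
    rng = np.random.RandomState(random_state)
    n_samples = X.shape[0]
    if sample_size is None:
        # heuristic subsample size from Kaufman & Rousseeuw's CLARA
        sample_size = min(n_samples, 40 + 2 * n_clusters)
    best_cost, best_medoids = np.inf, None
    for _ in range(n_draws):
        idx = rng.choice(n_samples, size=sample_size, replace=False)
        local_medoids = pam(X[idx], n_clusters)  # indices within the subsample
        medoids = idx[local_medoids]             # map back to full-data indices
        # score the candidates on *all* samples, not just the subsample
        cost = pairwise_distances(X, X[medoids]).min(axis=1).sum()
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    return best_medoids

The key point is the scoring step: each subsample's medoids are evaluated against the whole dataset, so one expensive PAM run is replaced by n_draws cheap PAM runs plus n_draws distance scans.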
Cool.
Including the code in scikit-learn is often a bit of a process, but it
might indeed be interesting.
You could just start with a pull request - or publish a gist if you
don't think you'll have time to work on the inclusion and leave that
part to someone else.
Cheers,
Andy
Hi Sebastian.
Have you seen this used much recently? How does it compare against
DBSCAN, BIRCH, OPTICS or just KMeans?
Cheers,
Andy
On 07/31/2015 10:28 AM, Sebastián Palacio wrote:
Hello all,
I've been investigating clustering algorithms with special interest in
non-parametric methods and, one that is being mentioned quite often is
DBCLASD [1]. I've looked around but I haven't been able to find one single
implementation of this algorithm whatsoever, so I decided to implement...
Hi all,
I was checking the archive of the mailing list to see if there were any
attempts in the past to incorporate Conditional Inference Trees into
the Ensemble module. I've found a mail from Theo Strinopoulos
(07-07-2013) asking if this would be welcomed as a contribution of his.
Gilles Lo...
That makes sense. The basic implementation is definitely short, just ~20
lines of code if you don't count comments, etc. I can make the source code
available so that you can judge whether it's worth taking further. I am
familiar with the documentation libraries you are using (Sphinx with NumPy-style
docstrings).
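Timo's ~20-line implementation isn't quoted in the thread, so as a reference point, here is what a minimal k-medoids of roughly that size can look like. This is an illustrative sketch in the alternating ("Voronoi iteration") style, not PAM's swap phase and not the code under review; it works on a precomputed distance matrix, which is also what makes non-Euclidean metrics easy to plug in.

import numpy as np

def k_medoids(D, n_clusters, max_iter=100, random_state=0):
    """Minimal alternating k-medoids on a precomputed
    (n_samples, n_samples) distance matrix D."""
    rng = np.random.RandomState(random_state)
    medoids = rng.choice(D.shape[0], size=n_clusters, replace=False)
    for _ in range(max_iter):
        # assignment step: attach each point to its nearest medoid
        labels = np.argmin(D[:, medoids], axis=1)
        # update step: within each cluster, pick the member that
        # minimizes the summed distance to the other members
        new_medoids = medoids.copy()
        for k in range(n_clusters):
            members = np.where(labels == k)[0]
            if members.size:
                costs = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[k] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

The alternating scheme is cheaper per iteration than PAM's swap phase but can settle into worse local optima; PAM trades extra cost for better solutions, which fits its role as the suggested default.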
> Is it required that an algorithm, which is implemented in Scikit-Learn, scales
> well wrt n_samples?
The requirement is 'be actually useful', which is something that is a bit
hard to judge :).
I think that K-medoids is borderline on this requirement, probably on the
right side of the border.
I was using a dynamic time warping (DTW) distance with KMedoids, which made
more sense than using Euclidean distance since the profiles indeed had
warps along the time axis. The DTW implementation was taken from MLPY, since
it's not in Scikit-Learn either.
Is it required that an algorithm, which is implemented in Scikit-Learn, scales
well wrt n_samples?
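MLPY's DTW code isn't shown in the thread; as a rough reference, the classic dynamic-programming recurrence behind DTW looks like the sketch below (plain Python, O(len(x) * len(y)) time, absolute-difference local cost; real implementations usually add a warping-window constraint and a compiled inner loop).

import numpy as np

def dtw_distance(x, y):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(x), len(y)
    # acc[i, j] = cost of the best warping path aligning x[:i] with y[:j]
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j],      # insertion
                                   acc[i, j - 1],      # deletion
                                   acc[i - 1, j - 1])  # match
    return acc[n, m]

# A precomputed matrix of pairwise DTW distances is all a k-medoids
# implementation needs, e.g.:
# D = np.array([[dtw_distance(a, b) for b in series] for a in series])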