Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-12 Thread Olivier Grisel
On Sep 12, 2013 8:51 AM, "Robert Layton" wrote:
> In the interests of a decision, can I push for renaming to SingleLinkageCluster, and then I'll work with Gael on a solution to either introduce a threshold cut to his implementation, or choose some other path?

+1

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-11 Thread Robert Layton
In the interests of a decision, can I push for renaming to SingleLinkageCluster, and then I'll work with Gael on a solution to either introduce a threshold cut to his implementation, or choose some other path?

- Robert

On 9 September 2013 20:22, Robert Layton wrote:
> I haven't yet compared ag

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-09 Thread Robert Layton
I haven't yet compared against scipy's implementation. The main reason for this is that they are different types of clusterers (with the MSTCluster here generating flat clusters). That said, they are easily convertible. Perhaps we should just drop the separate class altogether, and add an ability

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-09 Thread Andreas Mueller
On 09/08/2013 06:51 PM, Olivier Grisel wrote:
> I just had a look at the results section and it looks very
> interesting, in particular in its ability to bring noise robustness to
> single linkage. Have you tried to compare it with ward?

FYI the output of "examples.py" for the smaller datasets.

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-09 Thread Gael Varoquaux
On Mon, Sep 09, 2013 at 05:41:08PM +1000, Robert Layton wrote:
> 3) Gael's PR can either use this class, or replace it if he comes up with
> something better/faster/stronger. If the second case, remove this class
> then.

Ideally, I'd like a function in addition to the class. And the more

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-09 Thread Robert Layton
*SingleLinkageClustering

On 9 September 2013 17:41, Robert Layton wrote:
> Thanks for the comments everyone, and the praise Jake.
>
> Based on this conversation, I think a good avenue would be:
>
> 1) Rename MSTCluster to SingleLinkageCluster
> 2) Merge (after checks, I still need to rebase aga

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-09 Thread Robert Layton
Thanks for the comments everyone, and the praise Jake.

Based on this conversation, I think a good avenue would be:

1) Rename MSTCluster to SingleLinkageCluster
2) Merge (after checks, I still need to rebase again)
3) Gael's PR can either use this class, or replace it if he comes up with something

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-08 Thread Andreas Mueller
On 09/08/2013 06:51 PM, Olivier Grisel wrote:
>
> I just had a look at the results section and it looks very
> interesting, in particular in its ability to bring noise robustness to
> single linkage. Have you tried to compare it with ward?

Yeah. I think the "experiments.py" had ward in it: https://

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-08 Thread Andreas Mueller
On 09/08/2013 07:10 PM, Olivier Grisel wrote:
> BTW it might make sense to keep `SingleLinkageClustering` as a special case
> as:
>
> - the MST algorithm can benefit from extracting the nearest neighbors
> graph only using the ball tree as done in Andreas implementation:
> https://github.com/amuel

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-08 Thread Olivier Grisel
BTW it might make sense to keep `SingleLinkageClustering` as a special case as:

- the MST algorithm can benefit from extracting the nearest neighbors graph only using the ball tree as done in Andreas implementation: https://github.com/amueller/information-theoretic-mst/blob/master/itm.py#L76 and t

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-08 Thread Olivier Grisel
2013/9/8 Gael Varoquaux:
> On Sun, Sep 08, 2013 at 05:14:35PM +0200, Alexandre Gramfort wrote:
>> I would be in favor of a HierarchicalClustering object that supports
>> various linkage
>> criteria.
>
>> something like:
>
>> hc = HierarchicalClustering(linkage='single')
>
>> linkage='ward' would b

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-08 Thread Olivier Grisel
2013/9/7 Andreas Mueller:
> On 09/07/2013 12:35 PM, Lars Buitinck wrote:
>> 2013/9/7 Robert Layton:
>>> This algorithm finds a minimum spanning tree, then cuts any edge higher than
>>> a given threshold.
>>>
>>> This is equivalent to the single linkage clustering. Olivier and I are
>>> talking ab

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-08 Thread Gael Varoquaux
On Sun, Sep 08, 2013 at 05:14:35PM +0200, Alexandre Gramfort wrote:
> I would be in favor of a HierarchicalClustering object that supports
> various linkage
> criteria.

> something like:

> hc = HierarchicalClustering(linkage='single')

> linkage='ward' would be another option.

Yes, indeed. This

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-08 Thread Alexandre Gramfort
I would be in favor of a HierarchicalClustering object that supports various linkage criteria.

something like:

hc = HierarchicalClustering(linkage='single')

linkage='ward' would be another option.

Alex

On Sat, Sep 7, 2013 at 4:25 PM, Jacob Vanderplas wrote:
> On Sat, Sep 7, 2013 at 5:21 AM,
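To make the idea of a `linkage` parameter concrete, here is a minimal sketch of how three common criteria differ when measuring the distance between two clusters. The function name `linkage_distance` and its signature are hypothetical and are not part of scikit-learn or of the proposal above. Ward is left out because it merges on the increase in within-cluster variance and is not a simple reduction over pairwise distances.

```python
from itertools import product
from math import dist


def linkage_distance(cluster_a, cluster_b, linkage="single"):
    """Distance between two clusters of points under a named linkage criterion.

    Hypothetical helper for illustration only.
    """
    # All point-to-point Euclidean distances across the two clusters.
    pairwise = [dist(a, b) for a, b in product(cluster_a, cluster_b)]
    if linkage == "single":
        return min(pairwise)  # closest pair of points
    if linkage == "complete":
        return max(pairwise)  # farthest pair of points
    if linkage == "average":
        return sum(pairwise) / len(pairwise)  # mean over all pairs
    raise ValueError(f"unknown linkage: {linkage!r}")
```

An agglomerative clusterer repeatedly merges the two clusters with the smallest such distance, so switching `linkage` changes which pair merges next without changing the outer loop.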

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-07 Thread Jacob Vanderplas
On Sat, Sep 7, 2013 at 5:21 AM, bthirion wrote:
> > I think single-linkage is what people are going to look for when they
> > want a clustering algorithm. The fact that this is equivalent to
> > finding an MST is an implementation detail (although it's still a good
> > thing to have that in the d

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-07 Thread bthirion
On 07/09/2013 12:35, Lars Buitinck wrote:
> 2013/9/7 Robert Layton:
>> This algorithm finds a minimum spanning tree, then cuts any edge higher than
>> a given threshold.
>>
>> This is equivalent to the single linkage clustering. Olivier and I are
>> talking about which name would be best to use. T

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-07 Thread Andreas Mueller
On 09/07/2013 12:35 PM, Lars Buitinck wrote:
> 2013/9/7 Robert Layton:
>> This algorithm finds a minimum spanning tree, then cuts any edge higher than
>> a given threshold.
>>
>> This is equivalent to the single linkage clustering. Olivier and I are
>> talking about which name would be best to use

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-07 Thread Lars Buitinck
2013/9/7 Robert Layton:
> This algorithm finds a minimum spanning tree, then cuts any edge higher than
> a given threshold.
>
> This is equivalent to the single linkage clustering. Olivier and I are
> talking about which name would be best to use. The leading option at the
> moment is SingleLinkag

[Scikit-learn-general] Question about naming a clustering algorithm

2013-09-06 Thread Robert Layton
In my recent PR, I've implemented the MSTCluster algorithm. This algorithm finds a minimum spanning tree, then cuts any edge higher than a given threshold. This is equivalent to the single linkage clustering. Olivier and I are talking about
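The MST-then-cut procedure described in this message can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code from the PR: the function name `mst_threshold_clusters` is hypothetical, distances are assumed Euclidean, and the MST is built with a brute-force Kruskal pass over all pairs, which only suits small inputs.

```python
from itertools import combinations
from math import dist


def mst_threshold_clusters(points, threshold):
    """Flat clusters from an MST with every edge longer than `threshold` cut.

    Hypothetical helper for illustration only.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Kruskal's algorithm: scan all pairwise edges from shortest to longest
    # and keep each edge that joins two previously separate components.
    edges = sorted(
        (dist(points[i], points[j]), i, j) for i, j in combinations(range(n), 2)
    )
    mst = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            mst.append((weight, i, j))

    # Cut step: rebuild components using only MST edges within the threshold.
    parent = list(range(n))
    for weight, i, j in mst:
        if weight <= threshold:
            parent[find(i)] = find(j)

    # Relabel roots as consecutive cluster ids in order of first appearance.
    labels, ids = [], {}
    for i in range(n):
        labels.append(ids.setdefault(find(i), len(ids)))
    return labels
```

The result matches cutting a single-linkage dendrogram at the same height: two points share a label exactly when a chain of hops no longer than `threshold` connects them.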