On Mon, Mar 26, 2012 at 09:48:52AM +1100, Robert Layton wrote:
>It's a good description of DBSCAN. I would point out that the outliers are
>found as "The points which do not belong to any current cluster and do not
>have enough close neighbours to start a new cluster."
Thanks, I have a
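Robert's phrasing maps directly onto DBSCAN's output convention: points that belong to no cluster and lack enough close neighbours get the label -1. A minimal sketch with scikit-learn (the data here is illustrative, not from the thread):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two dense blobs plus one isolated point with no close neighbours.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
              [20.0, 20.0]])  # the last point is isolated

labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)
# Points that belong to no cluster are labelled -1 (noise / outliers).
print(labels)
```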
On 26 March 2012 09:38, Gael Varoquaux wrote:
> On Mon, Mar 26, 2012 at 12:27:37AM +0200, Andreas wrote:
> > Well as you can tell my motivation for working on the examples
> > and the data sets was not all altruistic ;)
>
> The key to success in a shared project is that every actor should get a
>
On Mon, Mar 26, 2012 at 12:27:37AM +0200, Andreas wrote:
> Well as you can tell my motivation for working on the examples
> and the data sets was not all altruistic ;)
The key to success in a shared project is that every actor should get a
benefit. I don't work on the scikit for the glory of mankind
On 03/26/2012 12:31 AM, Gael Varoquaux wrote:
> On Mon, Mar 26, 2012 at 12:22:53AM +0200, Andreas wrote:
>
>> Thanks for the great work. This is really a step forward for the docs!
>>
> Thanks guys. I must confess that I had a presentation to give tomorrow
> about clustering and I jumped on the occasion to improve the docs.
On Mon, Mar 26, 2012 at 12:22:53AM +0200, Andreas wrote:
> Thanks for the great work. This is really a step forward for the docs!
Thanks guys. I must confess that I had a presentation to give tomorrow
about clustering and I jumped on the occasion to improve the docs.
Gael
On 03/26/2012 12:19 AM, Gael Varoquaux wrote:
> Thanks for all the feedback. I have included it and merged to master,
> because I was running out of time, but it can still be improved!
>
>
Thanks for the great work. This is really a step forward for the docs!
---
On Mon, Mar 26, 2012 at 09:21:21AM +1100, Robert Layton wrote:
>This is great,
Thanks,
>and I think it would be a good idea to include such a
>summary table for classification at some point as well.
Yes. Actually I believe that every main use case should have one, at the
beginning of
On 26 March 2012 09:19, Gael Varoquaux wrote:
> Thanks for all the feedback. I have included it and merged to master,
> because I was running out of time, but it can still be improved!
>
> Gael
Thanks for all the feedback. I have included it and merged to master,
because I was running out of time, but it can still be improved!
Gael
On 03/26/2012 12:06 AM, Gael Varoquaux wrote:
> On Sun, Mar 25, 2012 at 11:56:31PM +0200, Andreas wrote:
>
>> As far as I can see, your groups are "KMeans + Ward" and "rest".
>> I don't know how ward works but looking at the lena example,
>> the clusters don't seem to be convex.
>>
> But
On Sun, Mar 25, 2012 at 11:56:31PM +0200, Andreas wrote:
> As far as I can see, your groups are "KMeans + Ward" and "rest".
> I don't know how ward works but looking at the lena example,
> the clusters don't seem to be convex.
But you are looking in the wrong space: the physical space, and not the feature space in which the clustering is done.
On 03/25/2012 11:47 PM, Gael Varoquaux wrote:
> On Sun, Mar 25, 2012 at 11:38:50PM +0200, Andreas wrote:
>
>>> Unlike something like spectral clustering, it is the euclidean distance
>>> to the centers that is minimized. Thus K-Means will seek clusters that
>>> are regular in the flat euclidean
On Sun, Mar 25, 2012 at 11:38:50PM +0200, Andreas wrote:
> > Unlike something like spectral clustering, it is the euclidean distance
> > to the centers that is minimized. Thus K-Means will seek clusters that
> > are regular in the flat euclidean space.
> Ok, that's right. Though I would argue that
> Unlike something like spectral clustering, it is the euclidean distance
> to the centers that is minimized. Thus K-Means will seek clusters that
> are regular in the flat euclidean space.
>
>
Ok, that's right. Though I would argue that the distance measure
is not the only factor here. MeanSh
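The point about euclidean geometry is easiest to see on a non-convex example: K-Means draws a straight euclidean boundary through two concentric rings, while spectral clustering recovers them. A sketch using scikit-learn's make_circles (parameters here are illustrative choices, not from the thread):

```python
from sklearn.datasets import make_circles
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.metrics import adjusted_rand_score

# Two concentric rings: non-convex clusters in the input space.
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
sc = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                        random_state=0).fit_predict(X)

# K-Means cuts across both rings (low agreement with the true rings);
# spectral clustering separates them (high agreement).
print(adjusted_rand_score(y, km))
print(adjusted_rand_score(y, sc))
```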
On Sun, Mar 25, 2012 at 11:30:36PM +0200, Andreas wrote:
> On 03/25/2012 11:32 PM, Gael Varoquaux wrote:
> > On Sun, Mar 25, 2012 at 11:23:55PM +0200, Andreas wrote:
> >> Without looking at the source, it could be that we initialize GMM
> >> with the result of KMeans.
> > We do.
> Then I would s
On 03/25/2012 11:32 PM, Gael Varoquaux wrote:
> On Sun, Mar 25, 2012 at 11:23:55PM +0200, Andreas wrote:
>
>> Without looking at the source, it could be that we initialize GMM
>> with the result of KMeans.
>>
> We do.
>
>
Then I would suggest changing that.
Although not sure what the
On Sun, Mar 25, 2012 at 11:22:32PM +0200, Andreas wrote:
> >> I'm not sure if "flat geometry" is a good way to describe the case that
> >> KMeans works in. I would have said "convex clusters". Not sure in how far
> >> that applies to hierarchical clustering, though.
> > Euclidean distance.
> Can
On Sun, Mar 25, 2012 at 11:23:55PM +0200, Andreas wrote:
> Without looking at the source, it could be that we initialize GMM
> with the result of KMeans.
We do.
> I read that if you do this, the GMM
> solution rarely changes.
Not surprising.
> Instead, one should only run KMeans for one or two iterations.
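For context, in today's API (sklearn.mixture.GaussianMixture, which replaced the old GMM class) the K-Means initialization under discussion is an explicit parameter; this is a sketch with the current interface, not the 2012 code:

```python
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# init_params="kmeans" seeds the EM run with a full K-Means fit;
# "random" starts EM from random responsibilities instead.
gmm_km = GaussianMixture(n_components=3, init_params="kmeans",
                         random_state=0).fit(X)
gmm_rand = GaussianMixture(n_components=3, init_params="random",
                           random_state=0).fit(X)

print(gmm_km.converged_, gmm_rand.converged_)
```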
On 03/25/2012 11:20 PM, Gael Varoquaux wrote:
> On Sun, Mar 25, 2012 at 10:51:36PM +0200, Gael Varoquaux wrote:
>
>>> - You should at least refer to GMMs, as this is the most popular
>>> clustering framework that comes with a natural probabilistic setting
>>>
>
>> Agreed.
>>
>
>> I'm not sure if "flat geometry" is a good way to describe the case that
>> KMeans works in. I would have said "convex clusters". Not sure in how far
>> that applies to hierarchical clustering, though.
>>
> Euclidean distance.
>
Can you please elaborate?
>> Also, I would mention explic
On Sun, Mar 25, 2012 at 10:51:36PM +0200, Gael Varoquaux wrote:
> > - You should at least refer to GMMs, as this is the most popular
> > clustering framework that comes with a natural probabilistic setting
> Agreed.
Actually, on our various examples, it is impressive how much GMMs behave
similarly
On Sun, Mar 25, 2012 at 10:12:59PM +0200, Andreas wrote:
> For the input, I would hope we can implement Olivier's proposal soon
> so that we don't need to differentiate the different input types.
Agreed. It was literally itching me when I was playing with the example.
> I'm not sure if "flat geomet
On Sun, Mar 25, 2012 at 10:22:39PM +0200, Andreas wrote:
> I might not have the time next week but after that I can give
> it a shot if you don't have the time.
It would be great, as I am not a specialist of this method.
Gael
--
On Sun, Mar 25, 2012 at 10:10:51PM +0200, bthirion wrote:
> - "Hierarchical clustering -> Few clusters": I thought it was not the
> best use case for these algorithms
Yes, this is clearly a typo.
> - "Hierarchical clustering -> even cluster size": this is not true if
> you consider single linkage
> - You should at least refer to GMMs, as this is the most popular
> clustering framework that comes with a natural probabilistic setting
>
+1
> - With mean shift, I would refer to 'modes' rather than 'blobs'.
>
+1
In general the mean shift docs could be improved a lot.
There is quite a n
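On the "modes" wording: in the estimator, cluster_centers_ are exactly the modes of the kernel density estimate that mean shift converges to. A small illustrative sketch (the data and bandwidth choice are assumptions, not from the thread):

```python
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs

# Two well-separated blobs, i.e. two density modes.
X, _ = make_blobs(n_samples=200, centers=[[0, 0], [5, 5]],
                  cluster_std=0.5, random_state=0)

bandwidth = estimate_bandwidth(X, quantile=0.3)
ms = MeanShift(bandwidth=bandwidth).fit(X)

# Each cluster center is a mode (local maximum) of the density
# estimate; points are assigned to the mode they converge to.
print(ms.cluster_centers_)
```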
On 03/25/2012 10:25 PM, Olivier Grisel wrote:
> On 25 March 2012 22:12, Andreas wrote:
>
>> ps: Maybe I'll find time to do the "fit_distance"/"fit_kernel" API in
>> one or two weeks.
>>
> As discussed earlier, I would prefer `fit_symmetric` or `fit_pairwise`
> when working with squared
On 25 March 2012 22:12, Andreas wrote:
>
> ps: Maybe I'll find time to do the "fit_distance"/"fit_kernel" API in
> one or two weeks.
As discussed earlier, I would prefer `fit_symmetric` or `fit_pairwise`
when working with squared distance / affinity / kernel matrices as
main data input.
--
Ol
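For reference, the pattern scikit-learn eventually settled on for this use case is affinity="precomputed" (rather than a dedicated fit_pairwise method): the square, symmetric affinity matrix is passed as the main data input to fit. A hedged sketch:

```python
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics.pairwise import rbf_kernel

X, _ = make_blobs(n_samples=100, centers=2, random_state=0)

# Square, symmetric affinity (kernel) matrix as the main data input.
A = rbf_kernel(X, gamma=0.1)

labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(A)
print(labels.shape)  # one label per sample
```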
> I am working on a summary table on clustering methods. It is not
> finished, I need to do a bit more literature review, however, I'd love
> some feedback on the current status:
> https://github.com/GaelVaroquaux/scikit-learn/blob/master/doc/modules/clustering.rst
>
>
>
Thanks for starting on
Hi Gael,
Here are some suggestions regarding details of the page:
- "Hierarchical clustering -> Few clusters": I thought it was not the
best use case for these algorithms
- "Hierarchical clustering -> even cluster size": this is not true if
you consider single linkage, or even in general with Ward
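The linkage-dependence bthirion points out is visible directly in the current AgglomerativeClustering, where the linkage criterion is a parameter (single linkage landed in scikit-learn well after this thread; the example is illustrative):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

# Ward minimizes within-cluster variance and tends toward even sizes;
# single linkage merges nearest neighbours and can chain freely.
for linkage in ("ward", "single"):
    model = AgglomerativeClustering(n_clusters=3, linkage=linkage).fit(X)
    sizes = np.bincount(model.labels_)
    print(linkage, sizes)
```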
On 25 March 2012 20:40, Gael Varoquaux wrote:
> Hi list,
>
> I am working on a summary table on clustering methods. It is not
> finished, I need to do a bit more literature review, however, I'd love
> some feedback on the current status:
> https://github.com/GaelVaroquaux/scikit-learn/blob/maste
Hi list,
I am working on a summary table on clustering methods. It is not
finished, I need to do a bit more literature review, however, I'd love
some feedback on the current status:
https://github.com/GaelVaroquaux/scikit-learn/blob/master/doc/modules/clustering.rst
Cheers,
Gaël
---