Re: [Scikit-learn-general] GSoC - Improving GMM

2014-01-29 Thread Manoj Kumar
Hi Andy. Thanks for the response :) I'm looking into the project ideas but I am unable to zero in on a single idea for GSoC. My knowledge is limited to linear and clustering models; however, I am willing to learn and read the literature well before GSoC, and I am a pretty quick learner. It would

Re: [Scikit-learn-general] GSoC - Improving GMM

2014-01-29 Thread Andy
Hey Manoj. I agree that the description is vague. I think what Vlad was trying to say is that refurbishing only makes sense if it comes with long-term support by an active user. Basically, "refurbishing" means - have a simple and sklearn-consistent interface - be numerically stable, reliable and r

Re: [Scikit-learn-general] Request for CC0 licensing on examples

2014-01-29 Thread Andy
On 01/28/2014 01:44 PM, Olivier Grisel wrote: > 2014/1/28 Gael Varoquaux : >> >> I had never worried about that, and I guess nobody usually does. Do >> people actually respect that clause in slides? >> >> Just put a tiny "BSD licensed" somewhere in the slide Same here. > >> Practically, to relicens

Re: [Scikit-learn-general] What's up with our Debian popcon results

2014-01-29 Thread Andy
On 01/29/2014 12:36 PM, Olivier Grisel wrote: > Maybe some organization has a Debian-based compute cluster with popcon > installed and the sysadmin installed sklearn on all the nodes? > Not it! On the other hand let me see where the default images lie around ;)

Re: [Scikit-learn-general] Combine criterions for building a tree

2014-01-29 Thread Alexandre Gramfort
A hack that might be good enough could be to use the cos and sin of the angle as features and split the output (then using multi-output trees). Alex
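The cos/sin trick mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `encode_angle` and `decode_angle` are hypothetical and not part of scikit-learn, and angles are assumed to be given in degrees.

```python
import math


def encode_angle(theta_deg):
    """Map an angle in degrees to a (cos, sin) pair on the unit circle."""
    theta = math.radians(theta_deg)
    return math.cos(theta), math.sin(theta)


def decode_angle(c, s):
    """Recover an angle in [0, 360) degrees from a (cos, sin) pair."""
    return math.degrees(math.atan2(s, c)) % 360.0


# 350 and 10 degrees are far apart as raw numbers but close on the circle.
a = encode_angle(350.0)
b = encode_angle(10.0)

# Euclidean distance between the two encoded points (a chord of the circle).
chord = math.hypot(a[0] - b[0], a[1] - b[1])
raw_gap = abs(350.0 - 10.0)
```

For a circular input, the two encoded columns would replace the raw angle column. For a circular target, the cos and sin columns could be fitted as two outputs of a multi-output regressor and the prediction mapped back with `decode_angle`.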

Re: [Scikit-learn-general] Combine criterions for building a tree

2014-01-29 Thread Pablo Rozas Larraondo
I think you're right, it might be quite a bit of a hack... :-) Anyway, if any of the main developers think it might be worth enabling this kind of extensibility (handling different variable types) in the tree implementation, I'd be more than happy to contribute under limited direction. Otherwise, I

Re: [Scikit-learn-general] Combine criterions for building a tree

2014-01-29 Thread Olivier Grisel
Alright, thanks for the clarification. In that case you will have to hack a lot indeed, good luck :)

Re: [Scikit-learn-general] What's up with our Debian popcon results

2014-01-29 Thread Olivier Grisel
Maybe some organization has a Debian-based compute cluster with popcon installed and the sysadmin installed sklearn on all the nodes? -- Olivier

Re: [Scikit-learn-general] What's up with our Debian popcon results

2014-01-29 Thread Gael Varoquaux
On Wed, Jan 29, 2014 at 11:07:28AM +, Nigel Legg wrote: > Is the chart showing all installs, or just successful installs? It's installed instances, and only for the Debian boxes that have popcon enabled, so a minor fraction of them. G -

Re: [Scikit-learn-general] What's up with our Debian popcon results

2014-01-29 Thread Nigel Legg
Is the chart showing all installs, or just successful installs? I was reinstalling dependencies in vagrant yesterday and the statsmodels lib install failed, which meant I had to start installing all dependencies again - three times. Though this won't account for all the variation, if other people

Re: [Scikit-learn-general] Combine criterions for building a tree

2014-01-29 Thread Pablo Rozas Larraondo
Thanks Gilles and Lars, I will try to figure out what is the best way to hack the current implementation of the tree builder to combine different splitters. Olivier: What I want to implement is a tree that can combine both linear and circular variables as input features and also the target variabl

[Scikit-learn-general] What's up with our Debian popcon results

2014-01-29 Thread Gael Varoquaux
I don't understand why, but it seems that our count of installations on Debian has jumped crazily in the last weeks: http://qa.debian.org/popcon.php?package=scikit-learn Does anybody have an explanation? Most likely some other package has declared a dependency on us. It would be interesting to know whic

Re: [Scikit-learn-general] Combine criterions for building a tree

2014-01-29 Thread Olivier Grisel
Just to clarify things Pablo: are your circular variables input features or target regression variables in a multi-output regression task? -- Olivier

Re: [Scikit-learn-general] Combine criterions for building a tree

2014-01-29 Thread Gilles Louppe
> I suppose I will have to test all my variables at every node to find the > optimum split measured with a criterion. What is still not clear to me is > if there exists an elegant way of choosing the right splitter depending on the > variable, via tagging or any other solution. Yes, we don't have a

Re: [Scikit-learn-general] Combine criterions for building a tree

2014-01-29 Thread Pablo Rozas Larraondo
Hi Gilles, Thanks for your help, you're right, what I'm looking for is a new implementation of a Splitter that deals with circular data. I suppose I will have to test all my variables at every node to find the optimum split measured with a criterion. What is still not clear to me is if there exist

Re: [Scikit-learn-general] Combine criterions for building a tree

2014-01-29 Thread Gilles Louppe
Hi Pablo, I am not sure re-implementing a new criterion is what you are looking for. Criteria are made to evaluate the goodness of a split (i.e., a binary partition of the samples in the current node) in terms of impurity with regard to the output variable - not the inputs. What you should do in
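To illustrate the point that a criterion scores a split by the impurity of the output values in each child, here is a minimal plain-Python sketch. It is not scikit-learn's Criterion interface: the functions `variance` and `split_impurity` are hypothetical names, and variance is used here as an MSE-style impurity for a single linear output.

```python
def variance(values):
    """Population variance of a non-empty list (MSE-style impurity)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def split_impurity(x, y, threshold):
    """Weighted impurity of the two children of the split x <= threshold.

    Only the output values y are scored; the input x just decides
    which child each sample falls into.
    """
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    if not left or not right:
        # Degenerate split: all samples stay in one node.
        return variance(y)
    n = len(y)
    return len(left) / n * variance(left) + len(right) / n * variance(right)


x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 0.0, 10.0, 10.0]

good = split_impurity(x, y, 2.5)  # separates the two output groups
poor = split_impurity(x, y, 1.5)  # leaves mixed outputs on the right
```

In this sketch the threshold with the lowest weighted impurity would be preferred. How candidate thresholds are generated for each input variable is a separate concern from how the resulting partition is scored.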

Re: [Scikit-learn-general] Combine criterions for building a tree

2014-01-29 Thread Lars Buitinck
2014-01-29 Pablo Rozas Larraondo : > Suppose I want to create a regression tree accepting both continuous linear > data and circular data. If I implement a new RegressionCriterion specific > for circular data, how difficult would it be to grow a tree combining two > different Criterions (i.e. MSE and

[Scikit-learn-general] Combine criterions for building a tree

2014-01-29 Thread Pablo Rozas Larraondo
Suppose I want to create a regression tree accepting both continuous linear data and circular data. If I implement a new RegressionCriterion specific for circular data, how difficult would it be to grow a tree combining two different Criterions (i.e. MSE and the new CircularCriterion)? I suppose the