Re: [Scikit-learn-general] GoS wish list updated

2012-03-13 Thread Olivier Grisel
On 13 March 2012 at 07:53, Alejandro Weinstein wrote: > On Tue, Mar 13, 2012 at 6:37 AM, Shankar Satish wrote: >> Do you think my proposal about implementing reinforcement-learning >> algorithms (subject line: "GSOC project idea: online learning algorithms") >> is something that is well suited for

[Scikit-learn-general] Graph reduction via sequential importance sampling (SIS)

2012-03-13 Thread Timmy Wilson
Hi scikit-learn community, I'm interested in reducing a large adjacency matrix (1M x 1M) to 2 dimensions (1M x 2) This is very similar to force-based graph embedding (drawing): http://en.wikipedia.org/wiki/Force-based_algorithms_(graph_drawing) But instead of using force for optimization, i wan

Re: [Scikit-learn-general] Not all plots generated on website

2012-03-13 Thread Andreas
I tried with .99 and the plots show correctly. When trying to get .98 to build, there were gcc errors, which I didn't want to bother with. Therefore it would be good to know which version the server actually runs or if we could have access to the error logs. Any ideas? > I know that several of th

Re: [Scikit-learn-general] DBSCAN demo's input

2012-03-13 Thread Robert Layton
On 14 March 2012 08:54, Lars Buitinck wrote: > 2012/3/13 Robert Layton : > > Lars, you are right, it should have metric='precomputed' in it. > > However by passing the distance matrix without a metric, the features > become > > "distance to point i", which act as sort of meta-features anyway, > a

Re: [Scikit-learn-general] DBSCAN demo's input

2012-03-13 Thread Lars Buitinck
2012/3/13 Robert Layton : > Lars, you are right, it should have metric='precomputed' in it. > However by passing the distance matrix without a metric, the features become > "distance to point i", which act as sort of meta-features anyway, allowing > training to happen. This means that it works with
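
The distinction discussed in this thread can be sketched as follows. This is a minimal illustration, not the demo from the documentation: the toy data, `eps`, and `min_samples` values are assumptions, and the calls assume a recent scikit-learn release. With `metric="precomputed"`, `eps` is compared against the entries of the distance matrix; without it, each row of the matrix is treated as a feature vector ("distance to point i") and `eps` is compared against distances between rows.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import pairwise_distances

# Toy data (assumed for illustration): two well-separated groups on a line.
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])

# Square matrix of pairwise Euclidean distances, shape (n_samples, n_samples).
D = pairwise_distances(X)

# Declared as distances: eps is compared against the entries of D.
labels_precomputed = (
    DBSCAN(eps=0.3, min_samples=2, metric="precomputed").fit(D).labels_
)

# Not declared: rows of D are used as feature vectors, so eps is compared
# against Euclidean distances between rows of D instead.
labels_as_features = DBSCAN(eps=0.3, min_samples=2).fit(D).labels_

# The two notions of distance differ even for the same pair of points.
true_distance = D[0, 2]
row_distance = pairwise_distances(D)[0, 2]
print(labels_precomputed, labels_as_features, true_distance, row_distance)
```

Both calls run without error, which is why the omission is easy to miss; the meaning of `eps` is what changes.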

Re: [Scikit-learn-general] DBSCAN demo's input

2012-03-13 Thread Robert Layton
On 14 March 2012 08:05, Andreas wrote: > Hi Lars. > All I can say is that it worked for me by passing X directly: > > http://scikit-learn.org/dev/auto_examples/cluster/plot_cluster_comparison.html > > I'm deadlining right now, hopefully I have time to work on Olivier's > "quadratic_fit" (or whate

Re: [Scikit-learn-general] DBSCAN demo's input

2012-03-13 Thread Andreas
Hi Lars. All I can say is that it worked for me by passing X directly: http://scikit-learn.org/dev/auto_examples/cluster/plot_cluster_comparison.html I'm deadlining right now, hopefully I have time to work on Olivier's "quadratic_fit" (or whatever) proposal afterward. Cheers, Andy On 03/13/2012

[Scikit-learn-general] DBSCAN demo's input

2012-03-13 Thread Lars Buitinck
Hi all, A colleague approached me today asking how the scikit-learn DBSCAN algorithm should be applied and I must admit that the documentation and example were confusing even to me. The fit docstring says X: array [n_samples, n_samples] or [n_samples, n_features] Array of distances bet

Re: [Scikit-learn-general] Video of the PyCon tutorial on-line!

2012-03-13 Thread Olivier Grisel
On 13 March 2012 at 01:44, Emanuele Olivetti wrote: > Hi, > > I guess the correct link of the video is now: > http://pyvideo.org/video/622/introduction-to-interactive-predictive-analytics > or > https://www.youtube.com/watch?v=Zd5dfooZWG4 Thanks, indeed the URL has changed. -- Olivier http://tw

Re: [Scikit-learn-general] GoS wish list updated

2012-03-13 Thread Paolo Losi
On Tue, Mar 13, 2012 at 1:37 PM, Shankar Satish wrote: > Hi Paolo and others, > > Do you think my proposal about implementing reinforcement-learning > algorithms (subject line: "GSOC project idea: online learning algorithms") > is something that is well suited for integration into scikit-learn? Do

Re: [Scikit-learn-general] GridSearchCV pickleability

2012-03-13 Thread Martin Fergie
Thanks Andy, have added to the issue. On 13 March 2012 15:16, Andreas wrote: > Hi Martin. > > This might be solved by issue 565 > https://github.com/scikit-learn/scikit-learn/issues/565. > Maybe you should add there that keeping the estimators in memory also > prevents pickling. > > Cheers, > And

Re: [Scikit-learn-general] GridSearchCV pickleability

2012-03-13 Thread Andreas
Hi Martin. This might be solved by issue 565 https://github.com/scikit-learn/scikit-learn/issues/565. Maybe you should add there that keeping the estimators in memory also prevents pickling. Cheers, Andy On 03/13/2012 03:11 PM, Martin Fergie wrote: > Good Afternoon, > > I'm trying to pickle a

Re: [Scikit-learn-general] GoS wish list updated

2012-03-13 Thread Alejandro Weinstein
On Tue, Mar 13, 2012 at 6:37 AM, Shankar Satish wrote: > Do you think my proposal about implementing reinforcement-learning > algorithms (subject line: "GSOC project idea: online learning algorithms") > is something that is well suited for integration into scikit-learn? Do you > think it makes mor

Re: [Scikit-learn-general] Decision tree pruning

2012-03-13 Thread Mathieu Blondel
2012/3/13 Frédéric Bastien : > I would also add that ensembles are probably slower to train than > pruned trees. In academia, this is not too big a problem, but in > industry it can be important in some cases. And slower to make predictions too, I guess. Mathieu

[Scikit-learn-general] GridSearchCV pickleability

2012-03-13 Thread Martin Fergie
Good Afternoon, I'm trying to pickle a class that contains a reference to a GridSearchCV object and it is failing due to an instancemethod type. I was under the impression that scikit-learn objects should be picklable, is this correct? If so, would it be appropriate for me to raise an issue? Info
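
A quick way to check the behaviour described here is a pickle round trip on a fitted search object. The sketch below is an assumption-laden illustration: it uses the `sklearn.model_selection` import path from recent scikit-learn releases (the module layout in 2012 was different), and the dataset, estimator, and parameter grid are arbitrary choices rather than the poster's setup.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Arbitrary example data and grid, chosen only to keep the check small.
X, y = load_iris(return_X_y=True)
search = GridSearchCV(SVC(), {"C": [0.1, 1.0, 10.0]}, cv=3).fit(X, y)

# Serialize to bytes and load back; an unpicklable attribute would raise here.
restored = pickle.loads(pickle.dumps(search))

# The restored object should carry the same selected parameters and predictions.
same_params = restored.best_params_ == search.best_params_
same_predictions = bool((restored.predict(X) == search.predict(X)).all())
print(same_params, same_predictions)
```

If the round trip raises, the traceback usually names the attribute type that could not be pickled, which is useful detail for an issue report.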

Re: [Scikit-learn-general] Decision tree pruning

2012-03-13 Thread Frédéric Bastien
I would also add that ensembles are probably slower to train than pruned trees. In academia, this is not too big a problem, but in industry it can be important in some cases. Fred On Tue, Mar 13, 2012 at 7:34 AM, Paolo Losi wrote: > On Tue, Mar 13, 2012 at 12:09 PM, Andreas wrote: >> On 03/13/2

Re: [Scikit-learn-general] GoS wish list updated

2012-03-13 Thread Shankar Satish
Hi Paolo and others, Do you think my proposal about implementing reinforcement-learning algorithms (subject line: "GSOC project idea: online learning algorithms") is something that is well suited for integration into scikit-learn? Do you think it makes more sense to start a new scikit focussed on

[Scikit-learn-general] GoS wish list updated

2012-03-13 Thread Paolo Losi
I've added my personal item in the GoS 2012 wish list [1]. That list is a very good source of ideas. I would urge all GoS applicants to look at it before formulating a proposal. Thanks Paolo [1] https://github.com/scikit-learn/scikit-learn/wiki/A-list-of-topics-for-a-google-summer-of-code-(gso

Re: [Scikit-learn-general] Decision tree pruning

2012-03-13 Thread Paolo Losi
On Tue, Mar 13, 2012 at 12:09 PM, Andreas wrote: > On 03/13/2012 12:11 PM, Paolo Losi wrote: >> Since ensemble methods consistently outperform "traditional" tree building >> (where variance is controlled by pruning), what are the advantages of >> implementing >> pruning in sklearn? >> >> > I think

Re: [Scikit-learn-general] Decision tree pruning

2012-03-13 Thread Peter Prettenhofer
2012/3/13 Andreas > On 03/13/2012 12:11 PM, Paolo Losi wrote: > > Since ensemble methods consistently outperform "traditional" tree > building > > (where variance is controlled by pruning), what are the advantages of > > implementing > > pruning in sklearn? > > > > > I think the idea would be to

Re: [Scikit-learn-general] Decision tree pruning

2012-03-13 Thread Andreas
On 03/13/2012 12:11 PM, Paolo Losi wrote: > Since ensemble methods consistently outperform "traditional" tree building > (where variance is controlled by pruning), what are the advantages of > implementing > pruning in sklearn? > > I think the idea would be to have an easy to interpret model. T
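
To make the interpretability argument concrete, the sketch below compares the size of an unpruned tree with a pruned one. It relies on the `ccp_alpha` parameter (minimal cost-complexity pruning) found in recent scikit-learn releases, which did not exist when this thread was written and is not necessarily the method the posters had in mind; the dataset and the value 0.02 are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Fully grown tree: splits until every leaf is pure.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# Same tree with cost-complexity pruning; larger ccp_alpha prunes more.
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.02).fit(X, y)

full_nodes = full.tree_.node_count
pruned_nodes = pruned.tree_.node_count
print(full_nodes, pruned_nodes, pruned.score(X, y))
```

A tree with only a handful of nodes can be read as a short list of rules, which is the use case a single pruned tree serves and an ensemble does not.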

Re: [Scikit-learn-general] Decision tree pruning

2012-03-13 Thread Paolo Losi
Since ensemble methods consistently outperform "traditional" tree building (where variance is controlled by pruning), what are the advantages of implementing pruning in sklearn? Paolo N.B. The question is not directed specifically to Brian but to GoS applicants and sklearn contributors. On Tue,

Re: [Scikit-learn-general] Decision tree pruning

2012-03-13 Thread Charanpal Dhanjal
Hi Andy, Thanks for the information. I read the thread by Vikram, and would gladly share my work with him. My particular interest is in model selection for decision trees and at this stage I would like to test how different prunings can improve generalisation. Best, Charanpal On 13/03/2012 11:18

Re: [Scikit-learn-general] Shrunken Centroid Classifier

2012-03-13 Thread Robert Layton
On 13 March 2012 18:16, Andreas Mueller wrote: > On 03/13/2012 07:49 AM, Olivier Grisel wrote: > > On 12 March 2012 at 17:49, Robert Layton wrote: > >> I'll work off that template, and when I work out the details of the > >> shrinking parameters (specifically which one is more in use), I'll > bra

Re: [Scikit-learn-general] Decision tree pruning

2012-03-13 Thread Brian Holt
Decision trees tend to overfit, so they are most often used (unpruned) in a forest. That said, I think it would be a useful contribution to our offering. Brian -Original Message- From: Charanpal Dhanjal Date: Tue, 13 Mar 2012 11:20:45 To: Reply-To: scikit-learn-general@lists.sourcefo

Re: [Scikit-learn-general] Decision tree pruning

2012-03-13 Thread Andreas
Hi Charanpal. In the recent GSoC-Thread, Vikram Kamath has proposed this and other improvements as a GSoC project. This idea is still in a pretty early stage, though. In general, this is definitely a useful feature. Cheers, Andy

[Scikit-learn-general] Decision tree pruning

2012-03-13 Thread Charanpal Dhanjal
I noticed that decision trees are currently unpruned, and wondered if anyone was working on this (or has been)? If not, I might implement pruning myself. Charanpal

Re: [Scikit-learn-general] Video of the PyCon tutorial on-line!

2012-03-13 Thread Emanuele Olivetti
Hi, I guess the correct link of the video is now: http://pyvideo.org/video/622/introduction-to-interactive-predictive-analytics or https://www.youtube.com/watch?v=Zd5dfooZWG4 Best, Emanuele On 03/11/2012 09:52 AM, Olivier Grisel wrote: > Hi all, > > The video of the tutorial I gave on Thursday

Re: [Scikit-learn-general] Shrunken Centroid Classifier

2012-03-13 Thread Andreas Mueller
On 03/13/2012 07:49 AM, Olivier Grisel wrote: > On 12 March 2012 at 17:49, Robert Layton wrote: >> I'll work off that template, and when I work out the details of the >> shrinking parameters (specifically which one is more in use), I'll branch >> and submit a PR. > Great. I think the nearest centro
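
The baseline role of a nearest centroid classifier, and the effect of a shrinking parameter, can be sketched as below. This assumes the `NearestCentroid` estimator and its `shrink_threshold` parameter as found in recent scikit-learn releases, which may differ from what was eventually proposed in this thread; the dataset and the threshold value 0.2 are arbitrary illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import NearestCentroid

X, y = load_iris(return_X_y=True)

# Plain nearest centroid: one mean vector per class, no hyper-parameters.
baseline = NearestCentroid()

# Shrunken variant: class centroids are shrunk toward the overall centroid,
# which can zero out uninformative features (threshold value is an assumption).
shrunken = NearestCentroid(shrink_threshold=0.2)

baseline_scores = cross_val_score(baseline, X, y, cv=5)
shrunken_scores = cross_val_score(shrunken, X, y, cv=5)
print(baseline_scores.mean(), shrunken_scores.mean())
```

Because fitting only computes per-class means, a score from this model is a cheap sanity check to run before trying higher-variance classifiers.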

Re: [Scikit-learn-general] Shrunken Centroid Classifier

2012-03-13 Thread Mathieu Blondel
> Great. I think the nearest centroid is a very nice baseline classifier > for sanity check: fast to fit, fast to predict, zero hyper-parameters > and yet makes reasonable assumptions for many classification datasets (a > good example of high bias, low variance, the opposite of deep decision > trees or