Re: [Scikit-learn-general] Nice blog post comparing various scikit-learn classifier runtimes

2012-06-24 Thread Peter Prettenhofer
Andy, I just recently discussed this with Gilles. There are a number of things involved here: Gilles told me that, in his experience, randomized trees are usually deeper than regular trees, hence the increased training time. After looking at the code I also found that ``_find_random_split`` r

Re: [Scikit-learn-general] ANN: scikits-image v0.6 release

2012-06-24 Thread Gael Varoquaux
> We're happy to announce the 6th version of scikits-image! Congratulations! It is nice to see scikits-image threading along. Gael

[Scikit-learn-general] ANN: scikits-image v0.6 release

2012-06-24 Thread Stéfan van der Walt
Announcement: scikits-image 0.6 === We're happy to announce the 6th version of scikits-image! Scikits-image is an image processing toolbox for SciPy that includes algorithms for segmentation, geometric transformations, color space manipulation, analysis, filtering, mor

Re: [Scikit-learn-general] Nice blog post comparing various scikit-learn classifier runtimes

2012-06-24 Thread amueller
I just read the post and I was wondering: shouldn't extra trees be faster than random forests? In the blog post they are slower. Andy -- This message was sent from my Android phone with K-9 Mail. Olivier Grisel wrote: Here is the link: http://blog.explainmydata.com/2012/0

[Scikit-learn-general] Fwd: Fall CS287 HW2 Dataset

2012-06-24 Thread Daniel Duckworth
For posterity, this is the written consent for the "load_kalman_data" dataset that will (hopefully) be integrated into sklearn soon. Daniel Duckworth -- Forwarded message -- From: Daniel Duckworth Date: Wed, May 16, 2012 at 11:59 AM Subject: Re: Fall CS287 HW2 Dataset To: Pieter

Re: [Scikit-learn-general] Lasso lars path

2012-06-24 Thread Charles-Pierre Astolfi
Hi Alexandre, >> 2. alphas should be sorted in ascending order? Or at least sorted? > > no. The path is fit starting from high alpha so alphas are returned in > decreasing order. Okay, thanks. I have an instance where this list of values is not sorted. I'll submit a bug report. -- Cp ---
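
A minimal sketch of how the ordering described above could be checked before filing a report. It assumes the alphas are available as a plain Python sequence; the helper name and the sample values are hypothetical and are not taken from the thread or from scikit-learn.

```python
def first_out_of_order(alphas):
    """Return the index of the first value that is larger than its
    predecessor, or None if the sequence is in decreasing
    (non-increasing) order."""
    for i in range(1, len(alphas)):
        if alphas[i] > alphas[i - 1]:
            return i
    return None


# Hypothetical alpha values, for illustration only.
sorted_alphas = [1.0, 0.5, 0.25, 0.0]
unsorted_alphas = [1.0, 0.5, 0.6, 0.0]

ok_index = first_out_of_order(sorted_alphas)
bad_index = first_out_of_order(unsorted_alphas)
```

Reporting the offending index together with the neighbouring values makes it easier for others to reproduce the problem.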

Re: [Scikit-learn-general] Nice blog post comparing various scikit-learn classifier runtimes

2012-06-24 Thread Olivier Grisel
2012/6/24 Mathieu Blondel : > It's important to bear in mind that some parameters have a huge impact on > performance and that just using the default ones may result in unfair > comparisons. For example, SGDClassifier uses the quite small n_iter=5 by > default whereas liblinear-based algorithms che

Re: [Scikit-learn-general] Nice blog post comparing various scikit-learn classifier runtimes

2012-06-24 Thread Mathieu Blondel
It's important to bear in mind that some parameters have a huge impact on performance and that just using the default ones may result in unfair comparisons. For example, SGDClassifier uses the quite small n_iter=5 by default whereas liblinear-based algorithms check that the solution is close enough

Re: [Scikit-learn-general] Supporting users on stackoverflow

2012-06-24 Thread Olivier Grisel
2012/6/24 Lars Buitinck : > 2012/6/24 Olivier Grisel : >> I find stackoverflow a good way to answer questions and they are very >> well indexed in search engine (much better than the mailing list >> archives for instance) and the vote system makes it easy to find the >> most useful answers / commen

Re: [Scikit-learn-general] Supporting users on stackoverflow

2012-06-24 Thread Lars Buitinck
2012/6/24 Olivier Grisel : > I find stackoverflow a good way to answer questions and they are very > well indexed in search engine (much better than the mailing list > archives for instance) and the vote system makes it easy to find the > most useful answers / comments. At some point we will also b

[Scikit-learn-general] Supporting users on stackoverflow

2012-06-24 Thread Olivier Grisel
Hi all, I spent a bunch of time re-tagging the questions about scikit-learn on stackoverflow so that the new scikit-learn tag will be used by new users in a more consistent way: http://stackoverflow.com/questions/tagged/scikit-learn?pagesize=50 If you would like to support our users by answering

[Scikit-learn-general] Nice blog post comparing various scikit-learn classifier runtimes

2012-06-24 Thread Olivier Grisel
Here is the link: http://blog.explainmydata.com/2012/06/ntrain-24853-ntest-25147-ncorrupt.html -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel

Re: [Scikit-learn-general] GradientBoosting and GridSearchCV: how?

2012-06-24 Thread Peter Prettenhofer
Emanuele, I just realized that the above approach might not be what you actually want: It will select the best value for ``n_estimators`` for _each_ fold - what we actually should do is to average the staged scores over all folds and select the ``n_estimators`` which has the best average score. I
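
The selection rule described above (average the staged scores over all folds, then pick the ``n_estimators`` with the best average) might be sketched as follows. This is an illustration under stated assumptions, not the code from the thread: the function name and the score values are hypothetical, and the per-fold staged scores are assumed to have been computed already, for example by scoring each stage's predictions on the held-out fold.

```python
from statistics import mean


def best_n_estimators(staged_scores_per_fold):
    """Pick the number of boosting stages with the best mean score.

    staged_scores_per_fold[k][i] is assumed to be the validation score
    on fold k after i + 1 stages, with higher meaning better.
    Returns (n_estimators, mean_score).
    """
    # Only compare stages that every fold has a score for.
    n_stages = min(len(scores) for scores in staged_scores_per_fold)
    mean_scores = [
        mean(fold[i] for fold in staged_scores_per_fold)
        for i in range(n_stages)
    ]
    # max() keeps the first maximum, so ties go to fewer stages.
    best_stage = max(range(n_stages), key=mean_scores.__getitem__)
    return best_stage + 1, mean_scores[best_stage]


# Hypothetical scores for 3 folds and 4 stages. The per-fold optima
# are stages 3, 2 and 4, so choosing per fold gives no single answer.
fold_scores = [
    [0.60, 0.70, 0.80, 0.70],
    [0.50, 0.75, 0.70, 0.65],
    [0.55, 0.65, 0.75, 0.80],
]
n_best, best_score = best_n_estimators(fold_scores)
```

With these made-up numbers the averaged curve peaks at three stages, although each fold on its own would pick a different value.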