Re: [Scikit-learn-general] Scikit-learn's website is down

2013-07-09 Thread Nigel Legg
Yes, sorry, realised my mistake just after sending. Regards, Nigel Legg 07914 740972 http://www.treavnianlegg.co.uk http://twitter.com/nigellegg http://uk.linkedin.com/in/nigellegg On 10 July 2013 06:46, Robert Layton wrote: > Can confirm, it works without the *www*, but doesn't work with it.

Re: [Scikit-learn-general] Scikit-learn's website is down

2013-07-09 Thread Robert Layton
Can confirm, it works without the *www*, but doesn't work with it. On 10 July 2013 15:38, Nigel Legg wrote: > Total content on www.scikit-learn.org > This space is managed by SourceForge.net. You have attempted to access a > URL that either never existed or is no longer active. Please check the

Re: [Scikit-learn-general] Scikit-learn's website is down

2013-07-09 Thread Nigel Legg
Total content on www.scikit-learn.org This space is managed by SourceForge.net. You have attempted to access a URL that either never existed or is no longer active. Please check the source of your link and/or contact the maintainer of the link to have them update their records. Regards, Nigel Legg

Re: [Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-09 Thread Robert Layton
The ParameterGrid object is created before the jobs are run, so it would be trivial to move this object creation up, calculate the number of jobs and output it. Happy to take a PR. On 10 July 2013 08:57, Joel Nothman wrote: > The number of jobs is actually len(ParameterGrid(search.param_grid))

Re: [Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-09 Thread Joel Nothman
The number of jobs is actually len(ParameterGrid(search.param_grid)) * len(check_cv(search.cv)), and I think this should be output at the start of the search if verbose >= 1, and perhaps should also be calculated by some method, so a user can estimate the time before finalising the grid... - Joel
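A minimal sketch of the count Joel describes, written in plain Python so it runs without scikit-learn. It assumes that param_grid is either a dict or a list of dicts, and that the number of CV folds is already known as an integer; the helper names n_grid_points and n_fits are hypothetical and not part of the library.

```python
from functools import reduce
from operator import mul


def n_grid_points(param_grid):
    """Count parameter combinations, as len(ParameterGrid(param_grid)) would."""
    # A single dict is treated as a list holding one grid.
    grids = [param_grid] if isinstance(param_grid, dict) else param_grid
    # Each grid contributes the product of the lengths of its value lists.
    return sum(reduce(mul, (len(values) for values in grid.values()), 1)
               for grid in grids)


def n_fits(param_grid, n_folds):
    """Total fits: one per parameter combination per CV fold."""
    return n_grid_points(param_grid) * n_folds


# Hypothetical grid, for illustration only.
param_grid = [
    {"kernel": ["linear"], "C": [1, 10, 100]},
    {"kernel": ["rbf"], "C": [1, 10, 100], "gamma": [0.01, 0.1]},
]
total = n_fits(param_grid, n_folds=5)
```

With the real library, the same two factors would come from len(ParameterGrid(...)) and the length of the CV iterator, as stated in the message above.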

Re: [Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-09 Thread Robert Layton
Hi Josh, This is decided by the param_grid that you give it. The actual internals are handled by the ParameterGrid class ( http://scikit-learn.org/dev/modules/generated/sklearn.grid_search.ParameterGrid.html ). The example on that page shows how you could calculate the number of runs based on you

Re: [Scikit-learn-general] Scikit-learn's website is down

2013-07-09 Thread Josh Wasserstein
Thank you all. http://scikit-learn.org was down for a few minutes when I sent the email, but it's up again. Josh On Tue, Jul 9, 2013 at 6:40 PM, Robert Layton wrote: > /stable and /dev are both up for me at this time (which was two hours > since Josh's email). > > > On 10 July 2013 06:29, Josh

[Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-09 Thread Josh Wasserstein
Hi there, Prior to running clf.fit(X,y) with GridSearchCV, is there any easy/direct way to know how many jobs GridSearchCV will run (i.e. the total number of parameter combinations in the grid search)? Josh

Re: [Scikit-learn-general] Scikit-learn's website is down

2013-07-09 Thread Robert Layton
/stable and /dev are both up for me at this time (which was two hours since Josh's email). On 10 July 2013 06:29, Josh Wasserstein wrote: > FYI: The website seems to be currently down. > > Josh

Re: [Scikit-learn-general] Scikit-learn's website is down

2013-07-09 Thread Alexandre ABRAHAM
Hi Josh, The website works for me. Maybe you are trying to access http://www.scikit-learn.org instead of http://scikit-learn.org? Alexandre. On Tue, Jul 9, 2013 at 10:29 PM, Josh Wasserstein wrote: > FYI: The website seems to be currently down. > > Josh

[Scikit-learn-general] Scikit-learn's website is down

2013-07-09 Thread Josh Wasserstein
FYI: The website seems to be currently down. Josh

Re: [Scikit-learn-general] Py3 port: FAILED (SKIP=2, errors=1, failures=5)

2013-07-09 Thread Justin Vincent
Thanks for getting that all set up. I'll take another crack at it tonight. On Jul 9, 2013 2:46 PM, "Olivier Grisel" wrote: > We are almost there. I have reconfigured the Jenkins build for Python > 3.3, NumPy 1.7.1 and SciPy 0.12.0: > > > https://jenkins.shiningpanda-ci.com/scikit-learn/job/python

[Scikit-learn-general] Py3 port: FAILED (SKIP=2, errors=1, failures=5)

2013-07-09 Thread Olivier Grisel
We are almost there. I have reconfigured the Jenkins build for Python 3.3, NumPy 1.7.1 and SciPy 0.12.0: https://jenkins.shiningpanda-ci.com/scikit-learn/job/python-3.3-numpy-1.7.1-scipy-0.12.0/ Here are the remaining errors / failures: https://jenkins.shiningpanda-ci.com/scikit-learn/job/python

Re: [Scikit-learn-general] Python 3 port

2013-07-09 Thread Olivier Grisel
The README-Py3k.rst was not reflecting the current situation. I just updated it. We don't use 2to3 anymore but a single code base with helpers in sklearn.externals.six. Please feel free to submit pull requests to fix the remaining test failures if you wish. -- Olivier

Re: [Scikit-learn-general] Sigmoid's degree?

2013-07-09 Thread Lars Buitinck
2013/7/9 Josh Wasserstein : > After running a SVC grid search with linear, rbf and sigmoid kernels, I got > the following: > [snip] > > note that the above says that the best estimator is a sigmoid with: > coef0 = 20.0855369232 > gamma=0.367879441171 > degree=3 ? > > I am confused about the above.

[Scikit-learn-general] Sigmoid's degree?

2013-07-09 Thread Josh Wasserstein
After running a SVC grid search with linear, rbf and sigmoid kernels, I got the following: Classification report for the best estimator: > SVC(C=403.428793493, cache_size=600, class_weight=None, > coef0=20.0855369232, degree=3, gamma=0.367879441171, kernel=sigmoid, > max_iter=-1, probability=Fals

Re: [Scikit-learn-general] Recursive Feature selection for a One hot encoded data set

2013-07-09 Thread Maheshakya Wijewardena
This is about the application of the one-hot encoder. I used the label encoder so that the data would look like a set of distinct categorical values (just to demonstrate the functionality of the one-hot encoder). What I want to know is what feature_indices_ and active_features_ indicate. As I've used the same

Re: [Scikit-learn-general] Defining a Density Estimation Interface

2013-07-09 Thread Bertrand Thirion
- Original message - > From: "Skipper Seabold" > To: scikit-learn-general@lists.sourceforge.net > Sent: Monday 8 July 2013 19:40:36 > Subject: Re: [Scikit-learn-general] Defining a Density Estimation Interface > > On Mon, Jul 8, 2013 at 1:20 PM, Bertrand Thirion > wrote: > > > > From: "Jacob

Re: [Scikit-learn-general] Recursive Feature selection for a One hot encoded data set

2013-07-09 Thread Lars Buitinck
2013/7/9 Joel Nothman : > Sorry, I got confused with binarizer somehow. Thanks, Lars. So did I because LabelBinarizer does not do a one-hot encoding, but the general point stands. -- Lars Buitinck Scientific programmer, ILPS University of Amsterdam --

Re: [Scikit-learn-general] Recursive Feature selection for a One hot encoded data set

2013-07-09 Thread Joel Nothman
Sorry, I got confused with binarizer somehow. Thanks, Lars. On Tue, Jul 9, 2013 at 9:27 PM, Lars Buitinck wrote: > 2013/7/9 Joel Nothman : > > * You probably want to use Encoder rather than LabelEncoder in that > example > > There is no Encoder. But LabelEncoder is indeed the wrong thing to > u

Re: [Scikit-learn-general] Recursive Feature selection for a One hot encoded data set

2013-07-09 Thread Lars Buitinck
2013/7/9 Joel Nothman : > * You probably want to use Encoder rather than LabelEncoder in that example There is no Encoder. But LabelEncoder is indeed the wrong thing to use, since it encodes *labels*, not *samples*, using a one-hot scheme. On feature arrays, the result is unspecified. But even if

Re: [Scikit-learn-general] Recursive Feature selection for a One hot encoded data set

2013-07-09 Thread Joel Nothman
So: * There may not be an issue with RFE? * You probably want to use Encoder rather than LabelEncoder in that example * It seems as if the output of feature_indices_ needs to be understood as if it is then masked by active_features_, which only registers exactly those features active in training. So

Re: [Scikit-learn-general] Some concerns about the MLP pull request

2013-07-09 Thread Lars Buitinck
2013/7/8 Issam : > On 7/8/2013 12:53 PM, Lars Buitinck wrote: >> cost = np.sum(np.einsum('ij,ji->i', diff, diff.T)) / (2 * n_samples) > Thanks for all the remarks! > > I found out that the `einsum` can be replaced simply by 'cost = > np.sum(diff**2) / (2 * n_samples)' which is faster and more reada
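A quick check that the two cost expressions quoted above agree. The einsum call computes, for each row i, the sum over j of diff[i, j] * diff.T[j, i], which is the sum of squares of row i, so summing over rows gives the sum of all squared entries. The array below is made-up data assumed only for illustration; speed is not measured here.

```python
import numpy as np

# Hypothetical residual matrix (n_samples x n_outputs), for illustration only.
rng = np.random.default_rng(0)
diff = rng.standard_normal((4, 3))
n_samples = diff.shape[0]

# Original form: per-row sums of squares via einsum, then summed over rows.
cost_einsum = np.sum(np.einsum('ij,ji->i', diff, diff.T)) / (2 * n_samples)

# Simplified form: sum of all squared entries.
cost_simple = np.sum(diff ** 2) / (2 * n_samples)
```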

Re: [Scikit-learn-general] mean(scores) vs score(concatenation). E.g. AUC with LOO validation

2013-07-09 Thread Vincent Dubourg
Hi, I have the same issue with the R2 score for regression: http://sourceforge.net/mailarchive/message.php?msg_id=31136945 Most scores use averages over a test sample. Hence, I think the choice between mean(scores) and score(concatenation) depends on the CV iterator: - For KFold it makes sense
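The difference between mean(scores) and score(concatenation) can be sketched for R2 in plain Python. The fold data below is invented for illustration, and the r2_score helper is a hand-written stand-in rather than the scikit-learn function. Returning None for a zero-variance fold is a choice made for this sketch; it shows why a per-fold R2 is undefined on a single-sample leave-one-out fold.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination; None when y_true has zero variance."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    if ss_tot == 0:
        return None  # undefined, e.g. a fold holding a single sample
    return 1 - ss_res / ss_tot


# Hypothetical (targets, predictions) pairs for two held-out folds.
folds = [
    ([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]),
    ([10.0, 11.0, 12.0], [10.5, 11.5, 11.0]),
]

# mean(scores): score each fold separately, then average.
per_fold = [r2_score(y, p) for y, p in folds]
mean_of_scores = sum(per_fold) / len(per_fold)

# score(concatenation): pool all held-out predictions, then score once.
all_true = [y for ys, _ in folds for y in ys]
all_pred = [p for _, ps in folds for p in ps]
score_of_concatenation = r2_score(all_true, all_pred)

# A leave-one-out fold has one target value, so its R2 is undefined.
loo_fold_score = r2_score([2.0], [1.9])
```

In this made-up data the folds have very different target means, so the pooled score is dominated by between-fold variance and comes out well above the per-fold average.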