Re: [Scikit-learn-general] What is the right way to convert unseen categorical value into numeric?

2013-10-22 Thread Andreas Mueller
On 10/22/2013 09:46 PM, ChungHung Liu wrote: > I read following links > > > http://scikit-learn.org/stable/modules/preprocessing.html#encoding-categorical-features > > http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.DictVectorizer.html > > It seems that I should

Re: [Scikit-learn-general] Negative mean_squared_error & other custom metrics using SVR on 0.14.1

2013-10-22 Thread Andreas Mueller
On 10/22/2013 09:40 PM, Ralf Gunter wrote: > > However, I'm new to all of this and so have a related but potentially > dumb question. Don't worry about it ;) I think there are two parts to the answer: GridSearchCV has a parameter "refit" which is True by default, which means that after training,

[Scikit-learn-general] What is the right way to convert unseen categorical value into numeric?

2013-10-22 Thread ChungHung Liu
I read the following links: http://scikit-learn.org/stable/modules/preprocessing.html#encoding-categorical-features http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.DictVectorizer.html It seems that I should use DictVectorizer, but http://www.mail-archive.com
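The question above is what should happen to a categorical value that was never seen during fitting. A minimal, library-free sketch of one common answer follows: learn one column per (feature, value) pair at fit time and silently drop pairs that were not seen, so an unseen value encodes as an all-zero row. The helper names `fit_vocabulary` and `transform` and the sample data are hypothetical and are not scikit-learn's API; the sketch only mimics the documented DictVectorizer behaviour of ignoring features not encountered during fit.

```python
def fit_vocabulary(records):
    """Map each (feature, value) pair seen in training to a column index."""
    pairs = sorted({(key, value) for rec in records for key, value in rec.items()})
    return {pair: index for index, pair in enumerate(pairs)}


def transform(records, vocabulary):
    """One-hot encode records; pairs missing from the vocabulary are ignored."""
    rows = []
    for rec in records:
        row = [0] * len(vocabulary)
        for pair in rec.items():
            column = vocabulary.get(pair)
            if column is not None:  # unseen category: leave the row untouched
                row[column] = 1
        rows.append(row)
    return rows


# Hypothetical training data with two known cities.
train = [{"city": "Dubai"}, {"city": "London"}]
vocab = fit_vocabulary(train)

# "Tokyo" was never seen during fitting, so its row stays all zeros.
encoded = transform([{"city": "London"}, {"city": "Tokyo"}], vocab)
```

An all-zero row keeps the feature dimension fixed, at the cost of making every unseen value indistinguishable from every other unseen value.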

Re: [Scikit-learn-general] Negative mean_squared_error & other custom metrics using SVR on 0.14.1

2013-10-22 Thread Ralf Gunter
2013/10/22 Andreas Mueller > I think the answer is that the results of RMSE are (somewhat > counterintuitively) always negative when reporting the results of > GridSearchCV. > The "greater_is_better" actually just flips the sign. > The reason it does that is that GridSearchCV always tries to max

Re: [Scikit-learn-general] Negative mean_squared_error & other custom metrics using SVR on 0.14.1

2013-10-22 Thread Andreas Mueller
I think the answer is that the results of RMSE are (somewhat counterintuitively) always negative when reporting the results of GridSearchCV. The "greater_is_better" actually just flips the sign. The reason it does that is that GridSearchCV always tries to maximize the score. I am not sure this
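The sign flip described above can be illustrated without scikit-learn. This is a sketch under stated assumptions: `make_sign_flipping_scorer` is a hypothetical helper, not the library's `make_scorer`, and the candidate predictions are made up. It shows only why a search that always maximizes its score ends up reporting negative values for an error metric.

```python
def mean_squared_error(y_true, y_pred):
    """Plain MSE: lower is better."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def make_sign_flipping_scorer(metric, greater_is_better=True):
    """Wrap a metric so that a larger score always means a better model."""
    sign = 1 if greater_is_better else -1

    def scorer(y_true, y_pred):
        return sign * metric(y_true, y_pred)

    return scorer


y_true = [1.0, 2.0, 3.0]
# Hypothetical predictions from two candidate parameter settings.
candidates = {"close": [1.0, 2.0, 4.0], "far": [3.0, 0.0, 6.0]}

mse_scorer = make_sign_flipping_scorer(mean_squared_error, greater_is_better=False)
scores = {name: mse_scorer(y_true, pred) for name, pred in candidates.items()}

# The search maximizes the score, so the least negative value wins.
best = max(scores, key=scores.get)
```

Negating the reported score recovers the ordinary, non-negative MSE.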

Re: [Scikit-learn-general] Negative mean_squared_error & other custom metrics using SVR on 0.14.1

2013-10-22 Thread Andreas Mueller
Hi Ralf. Sorry, I'm tired and didn't see the attachment. Andy On 10/22/2013 08:48 PM, Ralf Gunter wrote: Hi Andreas, I'm not sure what you mean by "more comprehensive"; the gist on the first message should reproduce the problem -- if not, then it might be something on my local configurat

Re: [Scikit-learn-general] Negative mean_squared_error & other custom metrics using SVR on 0.14.1

2013-10-22 Thread Ralf Gunter
Hi Andreas, I'm not sure what you mean by "more comprehensive"; the gist on the first message should reproduce the problem -- if not, then it might be something on my local configuration (python, numpy, etc). The script is exactly the same one I'm using in "production", just with a much bigger dat

Re: [Scikit-learn-general] Negative mean_squared_error & other custom metrics using SVR on 0.14.1

2013-10-22 Thread Andreas Mueller
Hi Ralf. Can you give a more comprehensive gist maybe? https://gist.github.com/ My first intuition would be that you are in fact using the r2 score, not the MSE, when outputting these numbers. Cheers, Andy On 10/22/2013 07:20 PM, Ralf Gunter wrote: Hello, I'm testing a few regression algor

Re: [Scikit-learn-general] Re : Image Feature Classification Conceptual Fog

2013-10-22 Thread Andreas Mueller
On 10/22/2013 03:54 PM, Ankit Agrawal wrote: > Hi Jim, > > What Joe said is correct when you want to label/classify images, > since classifying images by trying to find similarity of the test image with > the training images on pixel level would not work even if there is some > ordinary

Re: [Scikit-learn-general] Image Feature Classification Conceptual Fog

2013-10-22 Thread Andreas Mueller
I would also suggest the book "computer vision" by Richard Szeliski. For your classification problem it really depends on what you want as output and what the statistics of the data are. If I understand you correctly, you want a prediction for each label. If your images are somewhat natural, the

[Scikit-learn-general] Negative mean_squared_error & other custom metrics using SVR on 0.14.1

2013-10-22 Thread Ralf Gunter
Hello, I'm testing a few regression algorithms to map ndarrays of eigenvalues to floats, using StratifiedKFold + GridSearchCV for cross-validation & hyperparameter estimation using some code borrowed from [1]. Although GridSearchCV appears to be working as advertised (i.e. the "best_estimator_"

[Scikit-learn-general] Re : Image Feature Classification Conceptual Fog

2013-10-22 Thread Ankit Agrawal
Hi Jim, What Joe said is correct when you want to label/classify images, since classifying images by trying to find similarity of the test image with the training images on pixel level would not work even if there is some ordinary geometric transform like scaling or rotation or Intensity c

[Scikit-learn-general] C integer types: the missing manual

2013-10-22 Thread Lars Buitinck
Dear all, I promised some time ago to write a guideline for using C integer types in Cython code. Here's a start; currently on the wiki instead of in a PR because of the rough state. https://github.com/scikit-learn/scikit-learn/wiki/C-integer-types:-the-missing-manual Regards, Lars

Re: [Scikit-learn-general] Image Feature Classification Conceptual Fog

2013-10-22 Thread jim vickroy
On 10/22/2013 3:32 PM, Joseph Jacobs wrote: The best book I have come across for image processing/vision + machine learning is one by Simon Prince. You can download the book from his website (http://computervisionmodels.com/). Chapter 13 gives a good intro to feature extraction. OK, great --

Re: [Scikit-learn-general] Image Feature Classification Conceptual Fog

2013-10-22 Thread jim vickroy
On 10/22/2013 3:32 PM, Joseph Jacobs wrote: The best book I have come across for image processing/vision + machine learning is one by Simon Prince. You can download the book from his website (http://computervisionmodels.com/). Chapter 13 gives a good intro to feature extraction. Joe On 22 Oc

Re: [Scikit-learn-general] Image Feature Classification Conceptual Fog

2013-10-22 Thread Joseph Jacobs
The best book I have come across for image processing/vision + machine learning is one by Simon Prince. You can download the book from his website (http://computervisionmodels.com/). Chapter 13 gives a good intro to feature extraction. Joe On 22 Oct 2013, at 22:27, jim vickroy wrote: > On 10/

Re: [Scikit-learn-general] Image Feature Classification Conceptual Fog

2013-10-22 Thread jim vickroy
On 10/22/2013 2:47 PM, Joseph Jacobs wrote: Hey Jim, From my (non-expert) perspective, performing classification pixel-wise would not be ideal (please correct me if I am wrong). I think the better way would be to perform some sort of feature extraction on the image (e.g. SIFT, SURF, HOG, LBP a

Re: [Scikit-learn-general] Image Feature Classification Conceptual Fog

2013-10-22 Thread Joseph Jacobs
Hey Jim, From my (non-expert) perspective, performing classification pixel-wise would not be ideal (please correct me if I am wrong). I think the better way would be to perform some sort of feature extraction on the image (e.g. SIFT, SURF, HOG, LBP and many, many more... check out scikit-image or

[Scikit-learn-general] Image Feature Classification Conceptual Fog

2013-10-22 Thread jim vickroy
Hi, Apologies if this is an inappropriate question for this forum. I have a collection of (1024x1024) mono-chromatic images in which each pixel is to be labeled as 1 of several categories (e.g., 10). Furthermore, each mono-chromatic image was captured through several filters (e.g., 5). My

Re: [Scikit-learn-general] Self Organizing Map implementation

2013-10-22 Thread Kyle Kastner
I stumbled across an implementation of this a while back which used numba - maybe it will be helpful for comparison? http://nbviewer.ipython.org/3407544 Kyle On Tue, Oct 22, 2013 at 10:35 AM, Gmail wrote: > Okay very cool! That gives me a plan of attack going forward. I'll try > it on the

Re: [Scikit-learn-general] Self Organizing Map implementation

2013-10-22 Thread Gmail
Okay very cool! That gives me a plan of attack going forward. I'll try it on the digits to start. Thank you guys for all of the guidance! Sent from my iPhone > On Oct 22, 2013, at 12:06 AM, Olivier Grisel wrote: > > I would rather not add a new model class if there is no way to > demonstra

Re: [Scikit-learn-general] GradientBoostingRegressor with LogisticRegression

2013-10-22 Thread Attila Balogh
Hm, maybe I'm doing something wrong but I'm still getting the error: ValueError: operands could not be broadcast together with shapes (3) (6) I am using 0.14.1. Full stacktrace: Traceback (most recent call last): File "GB_problem.py", line 46, in main() File "GB_problem.py", line 43, in

Re: [Scikit-learn-general] GradientBoostingRegressor with LogisticRegression

2013-10-22 Thread Peter Prettenhofer
Ok, below is the adaptor that will work. The code requires that the output of predict is 2d. Thanks for the test-case. best, Peter
class Adaptor(object):
    def __init__(self, est):
        self.est = est
    def predict(self, X):
        return self.est.predict_proba(X)[:, np.newaxis]
    de
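The requirement that predict return a 2-D result can be sketched without NumPy. Everything here is an assumption for illustration: `ConstantProbabilityEstimator` is a hypothetical stand-in for a fitted classifier, `ColumnAdaptor` is not the adaptor from the thread (whose code is cut off above), and taking the positive-class column is my own choice. The sketch shows only the reshaping from a flat sequence of n_samples values to n_samples rows of one column each.

```python
class ConstantProbabilityEstimator:
    """Hypothetical classifier: predicts the training positive rate for every row."""

    def fit(self, X, y):
        self.p_ = sum(y) / len(y)
        return self

    def predict_proba(self, X):
        # One [P(class 0), P(class 1)] pair per sample.
        return [[1.0 - self.p_, self.p_] for _ in X]


class ColumnAdaptor:
    """Expose predict_proba through predict, shaped as n_samples rows x 1 column."""

    def __init__(self, est):
        self.est = est

    def fit(self, X, y):
        self.est.fit(X, y)
        return self

    def predict(self, X):
        # Keep the positive-class probability and wrap it in its own row,
        # so the result is 2-D rather than a flat list.
        return [[row[1]] for row in self.est.predict_proba(X)]


X_train = [[0], [1], [2], [3]]
y_train = [0, 1, 1, 1]
adaptor = ColumnAdaptor(ConstantProbabilityEstimator()).fit(X_train, y_train)
predictions = adaptor.predict([[5], [6]])
```

With NumPy arrays the same reshaping is typically written with `np.newaxis` or `reshape(-1, 1)`.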

Re: [Scikit-learn-general] GradientBoostingRegressor with LogisticRegression

2013-10-22 Thread Peter Prettenhofer
Right, I thought you were using the multi-class loss function. Please send me a testcase so that I can investigate the issue. thanks, Peter 2013/10/22 Attila Balogh > Hi Peter, > > thanks for your answer. I have tried this before also, and the problem is > that in this case I get > ValueErro

Re: [Scikit-learn-general] GradientBoostingRegressor with LogisticRegression

2013-10-22 Thread Attila Balogh
Hi Peter, thanks for your answer. I have tried this before also, and the problem is that in this case I get ValueError: operands could not be broadcast together with shapes (74) (148), because the y array is raveled and it has shape (74,2). Do you need a self-contained test case that reproduces

Re: [Scikit-learn-general] GradientBoostingRegressor with LogisticRegression

2013-10-22 Thread Peter Prettenhofer
Hi Attila, please use the following adaptor::
    def __init__(self, est):
        self.est = est
    def predict(self, X):
        return self.est.predict_proba(X)
    def fit(self, X, y):
        self.est.fit(X, y)
The one in the stackoverflow question returns an array of shape (n_samples,) bu

[Scikit-learn-general] GradientBoostingRegressor with LogisticRegression

2013-10-22 Thread Attila Balogh
Hi all, first of all thanks to all the developers for working on scikit-learn, it is a wonderful library. I have been struggling for a while now with the following problem: Trying to use GBR with LR as a BaseEstimator, and I'm getting the following error: File "main.py", line 110, in main score =

Re: [Scikit-learn-general] Self Organizing Map implementation

2013-10-22 Thread Olivier Grisel
I would rather not add a new model class if there is no way to demonstrate that they can solve a non-synthetic task in an example. I would rather not have the scikit-learn code base turn into a museum of useless algorithms. So +1 for inclusion of a SOM model if it can lead to interesting results