[Scikit-learn-general] random splitting and random seed

2015-02-16 Thread Pagliari, Roberto
I'm comparing a few algorithms, and trying to have them run using the same random datasets. Each algorithm is a separate python process and I provide a file with a list of integers, generated using numpy.random.randint. It is a small sequence of random integers between 0 and 10,000,000. Every
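A minimal sketch of one way separate processes can reproduce the same random splits from a shared list of seeds. The helper `split_indices`, the `test_fraction` value, and the seed values are hypothetical and not taken from the original message; the only assumption is that each process reads the same integers from the shared file.

```python
import numpy as np

def split_indices(n_samples, seed, test_fraction=0.25):
    """Return (train_idx, test_idx); the same seed always yields the same split."""
    # A private generator avoids touching NumPy's global random state.
    rng = np.random.RandomState(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]

# Stand-in values for the integers read from the shared seed file.
seeds = [12345, 678901, 4242424]
splits = [split_indices(100, s) for s in seeds]
```

Because every process builds its generator from the same seed, each algorithm sees identical train and test indices for a given entry in the list.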

[Scikit-learn-general] Maximum Relevance Minimum Redundancy Feature Selection

2015-02-16 Thread Neelmani Sehgal
Hi all, I recently worked on a small academic project where I used the mRMR feature selection technique. It would be great if this could be included in scikit-learn. I would like to contribute it as a GSoC project. Please have a look at this technique and tell

Re: [Scikit-learn-general] which methods do I need to implement for a regressor?

2015-02-16 Thread Pagliari, Roberto
Hi Vlad, Thanks for your help. I'm requiring a scalar number now (not a list) and I think it is working. I did not implement get_params and set_params because I don't need them, and sklearn does not complain; I guess because they are inherited from the base class. Regarding the copy, I'm using fit and trans

Re: [Scikit-learn-general] which methods do I need to implement for a regressor?

2015-02-16 Thread Pagliari, Roberto
Hi Gael, I think using a list may cause problems with what I was doing, so I changed things so that I only need a scalar, and everything works now. Thanks for your help! From: Gael Varoquaux [mailto:gael.varoqu...@normalesup.org] Sent: Monday, February 16, 2015 5:39 PM To: scik

Re: [Scikit-learn-general] custom regressor keeps failing

2015-02-16 Thread Vlad Niculae
Hi Roberto, Everything I say below is also explained in the developers documentation that I linked to in the other e-mail. [1] You are breaking some conventions that make the default `get_params` and `set_params` not work well. As I said in the other thread, fitted attributes are suffixed with
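A short sketch of the scikit-learn estimator conventions that let the inherited `get_params` and `set_params` work without being overridden. The class `MeanShifter` and its `scale` parameter are hypothetical illustrations, not the code from this thread. The conventions shown are that constructor arguments are stored unchanged under the same attribute name, and that values learned in `fit` get a trailing underscore.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class MeanShifter(TransformerMixin, BaseEstimator):
    """Hypothetical transformer that subtracts a scaled column mean."""

    def __init__(self, scale=1.0):
        # Constructor arguments are stored as-is, under the same name,
        # so the inherited get_params/set_params can find them.
        self.scale = scale

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Learned during fit, hence the trailing underscore.
        self.mean_ = X.mean(axis=0)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        return X - self.scale * self.mean_

est = MeanShifter(scale=0.5)
params_before = est.get_params()  # dictionary of constructor parameters
est.set_params(scale=2.0)         # updates est.scale in place
```

With this layout there is no need to write `get_params` or `set_params` by hand; both come from `BaseEstimator`.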

Re: [Scikit-learn-general] which methods do I need to implement for a regressor?

2015-02-16 Thread Gael Varoquaux
Your get_params looks wrong to me: it is not returning a dictionary. On Feb 16, 2015, 20:02, at 20:02, "Pagliari, Roberto" wrote: >Hi Vlad/All, >Thanks for the pointers. The reason I return a copy of X is because I >don't want to mo

[Scikit-learn-general] custom regressor keeps failing

2015-02-16 Thread Pagliari, Roberto
I keep failing with my custom transformer implementation. I posted a question on Stack Overflow and then deleted it, as I think it is more appropriate here. I followed the suggestions from other people. The code right now is this: from sklearn.base import TransformerMixin, BaseEstimator clas

Re: [Scikit-learn-general] which methods do I need to implement for a regressor?

2015-02-16 Thread Pagliari, Roberto
Hi Vlad/All, Thanks for the pointers. The reason I return a copy of X is that I don't want to modify the dataset during grid search with cross-validation (I'm not sure if the argument of transform is a deep copy or shallow copy). I implemented the class as shown below. Basically a transformer

Re: [Scikit-learn-general] which methods do I need to implement for a regressor?

2015-02-16 Thread Vlad Niculae
Hi Roberto, This is all documented in more detail here: [1] The transform looks good (just that you might want to add a flag to avoid memory copies when you can afford to destroy the original data). It’s not clear what the intention of `my_param` is here. It’s not user specified, right? Conven
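A sketch of the kind of flag suggested here for skipping the memory copy when the caller can afford to overwrite the input. The class `Centerer` is hypothetical; the `copy` parameter name is borrowed from scikit-learn's own `StandardScaler`, but the implementation below is only an assumption about how such a flag might be wired up.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class Centerer(TransformerMixin, BaseEstimator):
    """Hypothetical transformer that subtracts the column mean."""

    def __init__(self, copy=True):
        self.copy = copy

    def fit(self, X, y=None):
        self.mean_ = np.asarray(X, dtype=float).mean(axis=0)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        if self.copy:
            X = X.copy()   # leave the caller's data untouched
        X -= self.mean_    # in place; overwrites the input when copy=False
        return X

data = np.array([[1.0], [3.0]])
safe = Centerer(copy=True).fit(data).transform(data)  # data is unchanged
```

With `copy=False` and a float array as input, the subtraction happens directly on the caller's array, which saves memory but destroys the original values.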

Re: [Scikit-learn-general] which methods do I need to implement for a regressor?

2015-02-16 Thread Pagliari, Roberto
I looked into some examples I found online but I’m a bit confused. Suppose I want to implement my own transformer, something similar to the standard scaler. Would this be sufficient to be used in a pipeline, or should it be done differently? class ModelTransformer(TransformerMixin): def

Re: [Scikit-learn-general] which methods do I need to implement for a regressor?

2015-02-16 Thread Michael Eickenberg
Can I conclude that you are looking to implement a transformer? Note that scikit-learn transformers only act on X data, not on y data at the moment. If this is what you need, then you need to implement a transform method for your class. fit will still be necessary though, as the pipeline calls it a
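The point about needing both `fit` and `transform` can be sketched with a stateless transformer placed in a pipeline. The class `LogTransformer`, the step names, and the toy data are hypothetical and chosen only for illustration; `fit` does nothing but return `self`, since the pipeline still calls it.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

class LogTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical stateless transformer applying log(1 + x) to X only."""

    def fit(self, X, y=None):
        # Nothing to learn, but the pipeline calls fit, so it must exist
        # and return self.
        return self

    def transform(self, X):
        return np.log1p(np.asarray(X, dtype=float))

# Toy data in which y is exactly linear in log(1 + x).
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = 2.0 * np.log1p(X).ravel() + 1.0

pipe = Pipeline([("log", LogTransformer()), ("reg", LinearRegression())])
pipe.fit(X, y)
pred = pipe.predict(X)
```

The transformer only ever sees and modifies `X`; `y` is passed through to the final regressor unchanged.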

Re: [Scikit-learn-general] which methods do I need to implement for a regressor?

2015-02-16 Thread Pagliari, Roberto
Broadly speaking, I would like to add my own custom function into a pipeline. However, my function is not really a classifier, nor a regressor. What do you think would be the best way to go about it? Is there a shortcut that does not require implementing the functions below? Thank you,

Re: [Scikit-learn-general] Regarding classification with one variable only

2015-02-16 Thread shalu jhanwar
Thanks a lot Artem :) It worked! Shalu On Mon, Feb 16, 2015 at 2:05 PM, Artem wrote: > X needs to be a matrix of shape (n_samples, n_features), not a vector. You > need to reshape it into the matrix by doing > > X_train = X_train.reshape( (len(X_train), 1) ) > > On Mon, Feb 16, 2015 at 4:01 PM,

Re: [Scikit-learn-general] Regarding classification with one variable only

2015-02-16 Thread Artem
X needs to be a matrix of shape (n_samples, n_features), not a vector. You need to reshape it into the matrix by doing X_train = X_train.reshape( (len(X_train), 1) ) On Mon, Feb 16, 2015 at 4:01 PM, shalu jhanwar wrote: > Hi Scikit fans, > > I am facing following error while performing classifi
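The reshape described above can be sketched as follows. The values in `X_train` are made up for illustration; `reshape(-1, 1)` is shown as an equivalent spelling that lets NumPy infer the number of rows.

```python
import numpy as np

# Made-up single-feature data: a 1-D vector of shape (4,).
X_train = np.array([0.5, 1.2, 3.3, 4.1])

# Turn it into a matrix of shape (n_samples, n_features) = (4, 1).
X_2d = X_train.reshape((len(X_train), 1))

# Equivalent: -1 asks NumPy to infer the number of rows.
X_alt = X_train.reshape(-1, 1)
```

The same reshape has to be applied to the test data before calling `score` or `predict`.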

[Scikit-learn-general] Regarding classification with one variable only

2015-02-16 Thread shalu jhanwar
Hi Scikit fans, I am facing the following error while performing classification with *single feature* only: reg = linear_model.LogisticRegression() *scores.append(reg.fit(X_train, y_train).score(X_test, y_test))* Traceback (most recent call last): File "", line 1, in File "/software/so/el6.3/Pyt