Re: [Scikit-learn-general] Handle sparse data on Instance Reduction

2014-07-04 Thread Lars Buitinck
2014-07-04 10:28 GMT+02:00 Olivier Grisel:
> 2014-07-04 3:35 GMT+02:00 Dayvid Victor:
>> Should I do the classifier setup in the __init__ (passing all arguments of
>> the KNN to the InstanceReduction constructor)?
>
> You might want to pass a KNN instance as `base_estimator` directly as
> the
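For readers following the `base_estimator` suggestion, here is a minimal sketch of how it could look. It is not code from this thread or from scikit-protopy: the class name `EditedReducer`, its `reduce` method and the leave-one-out editing rule are assumptions made for illustration, and the sketch handles dense arrays only. Only `BaseEstimator`, `clone` and `KNeighborsClassifier` are real scikit-learn names.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.neighbors import KNeighborsClassifier


class EditedReducer(BaseEstimator):
    """Hypothetical reducer: drops samples the base estimator misclassifies."""

    def __init__(self, base_estimator=None):
        # Store the argument untouched; all setup is deferred to reduce().
        self.base_estimator = base_estimator

    def reduce(self, X, y):
        X = np.asarray(X)  # dense input only in this sketch
        y = np.asarray(y)
        base = self.base_estimator
        if base is None:
            base = KNeighborsClassifier(n_neighbors=3)  # assumed default
        keep = np.ones(len(y), dtype=bool)
        for i in range(len(y)):
            rest = np.arange(len(y)) != i
            # clone() gives a fresh, unfitted copy with the same parameters.
            model = clone(base).fit(X[rest], y[rest])
            keep[i] = model.predict(X[i:i + 1])[0] == y[i]
        return X[keep], y[keep]


# Two clusters; the sample at 0.3 carries the label of the other cluster.
X = np.array([[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2], [5.3]])
y = np.array([0, 0, 0, 1, 1, 1, 1, 1])
reducer = EditedReducer(base_estimator=KNeighborsClassifier(n_neighbors=3))
X_kept, y_kept = reducer.reduce(X, y)
```

Passing the estimator instance keeps the constructor small: the KNN parameters live on the KNN object instead of being duplicated in the reducer's signature.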

Re: [Scikit-learn-general] Handle sparse data on Instance Reduction

2014-07-04 Thread Olivier Grisel
2014-07-04 15:38 GMT+02:00 Dayvid Victor:
> Also, are there any plans to add a transform(X, y, axis=1) in the pipeline?
> Is anybody working on that?

What would that mean? IMO this is a very confusing API. In particular, y can be 1D.

-- Olivier

Re: [Scikit-learn-general] Handle sparse data on Instance Reduction

2014-07-04 Thread Dayvid Victor
Also, are there any plans to add a transform(X, y, axis=1) in the pipeline? Is anybody working on that?

Thanks,

On Fri, Jul 4, 2014 at 10:29 AM, Dayvid Victor wrote:
> Hi Olivier,
>
> Thank you for your considerations. I'll follow your recommendations.
>
> Instance Reduction is the opposite of resa

Re: [Scikit-learn-general] Handle sparse data on Instance Reduction

2014-07-04 Thread Dayvid Victor
Hi Olivier,

Thank you for your considerations. I'll follow your recommendations.

Instance Reduction is the opposite of resampling (like an inverse SMOTE), so it would require the pipeline to accept transformers that change the number of samples (axis=0 in the input data). Maybe in the future I coul

Re: [Scikit-learn-general] Handle sparse data on Instance Reduction

2014-07-04 Thread Olivier Grisel
2014-07-04 3:35 GMT+02:00 Dayvid Victor:
> Hi Olivier,
>
> I solved this issue, but talking to some people on the mailing list,
> they advised me to start a new project (already referenced in the wiki)
> and later think about including instance reduction in sklearn.
>
> https://github.com/dvro/scik

Re: [Scikit-learn-general] Handle sparse data on Instance Reduction

2014-07-03 Thread Dayvid Victor
Hi Olivier,

I solved this issue, but talking to some people on the mailing list, they advised me to start a new project (already referenced in the wiki) and later think about including instance reduction in sklearn.

https://github.com/dvro/scikit-protopy (the name is not final yet);

If you could t

Re: [Scikit-learn-general] Handle sparse data on Instance Reduction

2013-12-04 Thread Olivier Grisel
Hi,

indeed, the generic exception catching / re-raising in the common tests is not very helpful. You can add a test to your own test suite to check where it breaks in your code:

    import scipy.sparse as sp

    X_train_csr = sp.csr_matrix(X_train)
    X_test_csr = sp.csr_matrix(X_test)
    model = MyModel().
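A self-contained version of such a check might look like the sketch below. The quoted message is cut off above, so this is not its continuation: the helper name `check_sparse_input` and the toy data are assumptions, and `KNeighborsClassifier` stands in for `MyModel` only so that the example runs as written.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.neighbors import KNeighborsClassifier


def check_sparse_input(make_model, X_train, y_train, X_test):
    """Fit on dense and on CSR input and compare the predictions."""
    dense_pred = make_model().fit(X_train, y_train).predict(X_test)

    X_train_csr = sp.csr_matrix(X_train)
    X_test_csr = sp.csr_matrix(X_test)
    # No try/except here, so a failure shows the real traceback.
    sparse_pred = make_model().fit(X_train_csr, y_train).predict(X_test_csr)

    np.testing.assert_array_equal(dense_pred, sparse_pred)
    return sparse_pred


# Toy data: two well-separated classes that live on different features.
rng = np.random.RandomState(0)
X_train = np.zeros((20, 4))
X_train[:10, 0] = 1.0 + rng.rand(10)
X_train[10:, 3] = 1.0 + rng.rand(10)
y_train = np.array([0] * 10 + [1] * 10)
X_test = np.array([[1.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.5]])

# Replace the lambda with a factory for your own estimator.
pred = check_sparse_input(lambda: KNeighborsClassifier(n_neighbors=3),
                          X_train, y_train, X_test)
```

Because the helper calls `fit` and `predict` directly, an estimator that cannot handle CSR input fails at the exact line in its own code instead of inside a generic re-raise.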

[Scikit-learn-general] Handle sparse data on Instance Reduction

2013-12-01 Thread Dayvid Victor
Hi everybody,

I have some IR techniques implemented in Python, and I want to contribute to sklearn. But I am having some trouble with sparse data:

https://github.com/dvro/scikit-learn/blob/instance_reduction/sklearn/instance_reduction/enn.py

Here is one of the techniques (I'll improve the style