[Scikit-learn-general] 'module' object has no attribute 'SingleBlockManager'

2014-02-10 Thread Alessandro Gagliardi
Sorry for the cross-post. I can’t tell if this is an IPython issue, a scikit-learn issue, or a StarCluster issue. When I try to get back results from ExtraTreesRegressor from a load_balanced_view on StarCluster, I get: AttributeError: 'module' object has no attribute 'SingleBlockManager'. Not s

Re: [Scikit-learn-general] Confidence of Trees

2014-02-10 Thread Alessandro Gagliardi
Update: I messed up with my training set (I included a variable I shouldn’t have) and am now getting more reasonable results (score = .634). My question about predicting error still stands, however. I should be able to train a classifier on the error (now that I’ve got enough that are wrong) but

[Scikit-learn-general] Confidence of Trees

2014-02-10 Thread Alessandro Gagliardi
I got ExtraTreesRegressor running on IPython.parallel (Pyrallel doesn’t work for me but the example at http://nbviewer.ipython.org/github/ogrisel/notebooks/blob/master/Distributed%20Learning%20of%20Extra%20Trees%20with%20IPython.parallel.ipynb did). Now I’d like to be able to predict my error (i

Re: [Scikit-learn-general] RandomForestClassifier w/ IPython.parallel

2014-02-10 Thread Olivier Grisel
Extra Trees are even more random than random forests. Have a look at the referenced papers. To choose one vs the other you can evaluate the generalization power via cross-validation on your data (you might also want to grid search the optimal parameter values for max_features and min_samples_split

Re: [Scikit-learn-general] Sparse matrix support for Decision tree implementation

2014-02-10 Thread Olivier Grisel
2014-02-08 2:25 GMT-08:00 Arnaud Joly : > > I have looked a bit at your code and it’s a great start. It would be easier > to help you if you open a pull request. +1. Don't hesitate to open an early PR with the "[WIP]" marker as a title prefix to emphasize that you don't consider it finished work y

Re: [Scikit-learn-general] RandomForestClassifier w/ IPython.parallel

2014-02-10 Thread Alessandro Gagliardi
This looks perfect. I’m pretty new to ensemble methods, so please forgive this ignorant question: what’s the difference between ExtraTrees and RandomForests? From http://scikit-learn.org/stable/modules/ensemble.html it looks like ExtraTrees is an extension of RandomForests. Examples of when one

Re: [Scikit-learn-general] Strange Error Message

2014-02-10 Thread Lars Buitinck
2014-02-08 18:44 GMT+01:00 Lorenzo Isella : > This is the range of my data. > > train.max() is, > 2.33326321223e+41 > train.min is, > -24799.05 > > Do you think that the max is simply too large to be handled by the random > regressor? Random forests cast to float32 internally, so yes, that's too l
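
The magnitudes quoted above can be checked against the single-precision range with a few lines of standard-library Python. This is only a sketch: the constant `FLOAT32_MAX` and the helper `fits_in_float32` are names made up for this illustration, not scikit-learn API, and the two values are the `train.max()` and `train.min()` figures from the quoted message.

```python
import math

# Largest finite IEEE 754 single-precision (float32) value, about 3.4e38.
FLOAT32_MAX = (2 - 2 ** -23) * 2.0 ** 127


def fits_in_float32(value):
    """Hypothetical helper: True if value is finite and within float32 range."""
    return math.isfinite(value) and abs(value) <= FLOAT32_MAX


train_max = 2.33326321223e+41  # train.max() from the quoted message
train_min = -24799.05          # train.min() from the quoted message

print(fits_in_float32(train_max))  # False: exceeds the float32 range
print(fits_in_float32(train_min))  # True
```

A check like this only flags the problem; how to rescale or clip the offending feature depends on the data.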

Re: [Scikit-learn-general] Bug in BernoulliRBM

2014-02-10 Thread Lars Buitinck
2014-02-10 12:56 GMT+01:00 Pedro Cardoso : > I am using the version 0.14 > > I believe that there is a bug in defining the slices for the batches. ex: on > a matrix with 1078 rows, the last batch is from 1070 to 1080. > > Created with : > batch_slices = list(gen_even_slices(n_batches * self.batch_si

Re: [Scikit-learn-general] Contributing in a New Topic : Recommender Systems

2014-02-10 Thread NALINI RANGARAJU
About scikit-crab, I think the authors are re-engineering it and it is currently not open to the community for contribution. This is what someone said recently on the google group for crab. As far as I could tell (and I could very well be wrong), crab recommender system does not have support for

[Scikit-learn-general] Bug in BernoulliRBM

2014-02-10 Thread Pedro Cardoso
I am using version 0.14. I believe that there is a bug in defining the slices for the batches. E.g., on a matrix with 1078 rows, the last batch is from 1070 to 1080. Created with: batch_slices = list(gen_even_slices(n_batches * self.batch_size, n_batch
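
The numbers in this report can be reproduced with a small pure-Python sketch. Everything here is an assumption made for illustration: `gen_even_slices_sketch` is a simplified stand-in written for this example (not the scikit-learn function itself), and a batch size of 10 is inferred from the reported 1070 to 1080 range rather than stated in the message.

```python
import math


def gen_even_slices_sketch(n, n_packs):
    """Simplified stand-in: split range(n) into n_packs near-equal slices."""
    start = 0
    for pack_num in range(n_packs):
        this_n = n // n_packs
        if pack_num < n % n_packs:
            this_n += 1  # spread any remainder over the first packs
        if this_n > 0:
            end = start + this_n
            yield slice(start, end)
            start = end


n_samples = 1078  # number of rows, as reported
batch_size = 10   # assumed; not stated in the message
n_batches = int(math.ceil(n_samples / batch_size))

# Slicing n_batches * batch_size rounds the total up past n_samples.
batch_slices = list(gen_even_slices_sketch(n_batches * batch_size, n_batches))

rows = list(range(n_samples))
last = batch_slices[-1]
print(last)             # slice(1070, 1080, None)
print(len(rows[last]))  # 8: Python clamps a slice stop beyond the end
```

The sketch shows only how a stop index of 1080 can arise for 1078 rows and that ordinary Python slicing truncates such a slice to the available rows; it makes no claim about how the library's training loop is affected.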