What do you mean? It's pretty trivial to implement a one-hot encoding; the
issue is that if you use a non-sparse format then you'll end up with a
matrix which is far too large to be practical for anything but trivial
examples.
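For illustration, a minimal sketch of such an encoding assembled directly in
sparse COO format (the integer-coded toy input and the offset arithmetic are
illustrative assumptions, not scikit-learn's implementation):

    import numpy as np
    import scipy.sparse as sp

    X = np.array([[0, 2], [1, 0], [0, 1]])       # integer-coded categories
    n_values = X.max(axis=0) + 1                 # cardinality of each column
    offsets = np.concatenate([[0], np.cumsum(n_values)[:-1]])
    rows = np.repeat(np.arange(X.shape[0]), X.shape[1])
    cols = (X + offsets).ravel()                 # one active column per value
    data = np.ones(len(cols), dtype=np.float32)
    X_onehot = sp.coo_matrix((data, (rows, cols)),
                             shape=(X.shape[0], n_values.sum())).tocsr()

Only the nonzero entries are stored, so memory scales with the number of
samples times the number of original columns, not with the full count of
binary features.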
On Fri, Jun 21, 2013 at 10:46 AM, Maheshakya Wijewardena <
pmaheshak...@gmail.com> wrote:
I'd like to analyse it a bit and encode using that method so that it works
with the random forests in scikit-learn.
On Fri, Jun 21, 2013 at 2:08 PM, Peter Prettenhofer <
peter.prettenho...@gmail.com> wrote:
> ? you already use one-hot encoding in your example
> (preprocessing.OneHotEncoder)
>
>
> 2013/6/21 Maheshakya Wijewardena
? you already use one-hot encoding in your example
(preprocessing.OneHotEncoder)
2013/6/21 Maheshakya Wijewardena
> can anyone give me a sample algorithm for one hot encoding used in
> scikit-learn?
>
>
> On Thu, Jun 20, 2013 at 8:37 PM, Peter Prettenhofer <
> peter.prettenho...@gmail.com> wrote:
can anyone give me a sample algorithm for one hot encoding used in
scikit-learn?
On Thu, Jun 20, 2013 at 8:37 PM, Peter Prettenhofer <
peter.prettenho...@gmail.com> wrote:
> you can try an ordinal encoding instead - just map each categorical value
> to an integer so that you end up with 8 numerical features
you can try an ordinal encoding instead - just map each categorical value
to an integer so that you end up with 8 numerical features - if you use
enough trees and grow them deep it may work
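A minimal sketch of that mapping, assuming string-valued columns and applying
LabelEncoder column by column (the toy array is an illustration, not the
actual data):

    import numpy as np
    from sklearn.preprocessing import LabelEncoder

    X_cat = np.array([["a", "x"], ["b", "y"], ["a", "y"]], dtype=object)
    X_ord = np.empty(X_cat.shape, dtype=np.float32)
    for j in range(X_cat.shape[1]):
        # each column gets its own integer codes 0..K-1
        X_ord[:, j] = LabelEncoder().fit_transform(X_cat[:, j])

The integer codes impose an arbitrary ordering on the categories, which is
why the trees need to be numerous and deep to carve the categories apart
again.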
2013/6/20 Maheshakya Wijewardena
> And yes Gilles, It is the Amazon challenge :D
>
>
> On Thu, Jun 20, 2013 at 8:21 PM, Maheshakya Wijewardena wrote:
And yes Gilles, It is the Amazon challenge :D
On Thu, Jun 20, 2013 at 8:21 PM, Maheshakya Wijewardena <
pmaheshak...@gmail.com> wrote:
> The shape of X after encoding is (32769, 16600). Seems as if that is too
> big to be converted into a dense matrix. Can Random forest handle this
> amount of features?
The shape of X after encoding is (32769, 16600). Seems as if that is too
big to be converted into a dense matrix. Can Random forest handle this
amount of features?
On Thu, Jun 20, 2013 at 7:31 PM, Olivier Grisel wrote:
> 2013/6/20 Lars Buitinck :
> > 2013/6/20 Olivier Grisel :
> >>> Actually twice as much, even on a 32-bit platform (float size is
> >>> always 64 bits).
2013/6/20 Lars Buitinck :
> 2013/6/20 Gilles Louppe :
>> This looks like the dataset from the Amazon challenge currently
>> running on Kaggle. When one-hot-encoded, you end up with roughly
>> 15000 binary features, which means that the dense representation
>> requires at least 32000*15000*4 bytes to hold in memory (or even twice
>> as much depending on your platform).
2013/6/20 Lars Buitinck :
> 2013/6/20 Olivier Grisel :
>>> Actually twice as much, even on a 32-bit platform (float size is
>>> always 64 bits).
>>
>> The decision tree code always uses 32 bits floats:
>>
>> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_tree.pyx#L38
>>
>> but you have to cast your data to `dtype=np.float32`.
So Maheshakya's `toarray` might work with
`X.astype(np.float32).toarray('F')`...
(But by "might work" I mean won't throw a ValueError...)
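Something like this, on a toy matrix (shape and density are illustrative,
and scipy.sparse.random assumes a reasonably recent SciPy):

    import numpy as np
    import scipy.sparse as sp

    X = sp.random(1000, 500, density=0.01, format="csr")  # float64 by default
    X_dense = X.astype(np.float32).toarray("F")
    # float32 halves the dense footprint and matches the dtype the tree
    # code uses internally: 1000 * 500 * 4 bytes instead of * 8
    print(X_dense.dtype, X_dense.nbytes)                  # float32 2000000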
On Thu, Jun 20, 2013 at 11:56 PM, Olivier Grisel
wrote:
> 2013/6/20 Lars Buitinck :
> > 2013/6/20 Gilles Louppe :
> >> This looks like the dataset from the Amazon challenge currently
> >> running on Kaggle.
2013/6/20 Olivier Grisel :
>> Actually twice as much, even on a 32-bit platform (float size is
>> always 64 bits).
>
> The decision tree code always uses 32 bits floats:
>
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_tree.pyx#L38
>
> but you have to cast your data to `dtype=np.float32`.
2013/6/20 Gilles Louppe :
> This looks like the dataset from the Amazon challenge currently
> running on Kaggle. When one-hot-encoded, you end up with roughly
> 15000 binary features, which means that the dense representation
> requires at least 32000*15000*4 bytes to hold in memory (or even twice
> as much depending on your platform).
What is the cardinality of each feature?
Hi,
This looks like the dataset from the Amazon challenge currently
running on Kaggle. When one-hot-encoded, you end up with roughly
15000 binary features, which means that the dense representation
requires at least 32000*15000*4 bytes to hold in memory (or even twice
as much depending on your platform).
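Back-of-the-envelope, using the exact shape reported earlier in the thread:

    n_samples, n_features = 32769, 16600
    print(n_samples * n_features * 4 / 1e9)  # ~2.2 GB dense as float32
    print(n_samples * n_features * 8 / 1e9)  # ~4.4 GB dense as float64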
Hi Maheshakya,
It's probably right: your feature space is too big and sparse to be
reasonable for random forests. What sort of categorical data are you
encoding? What is the shape of the matrix after applying one-hot encoding?
If you need to use random forests, and not a method that natively handles
sparse input…
2013/6/20 Maheshakya Wijewardena :
> The shape is (32769, 8). There are 8 categorical variables before applying
> OneHotEncoding.
And what is the shape after?
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
The shape is (32769, 8). There are 8 categorical variables before applying
OneHotEncoding.
On Thu, Jun 20, 2013 at 5:43 PM, Peter Prettenhofer <
peter.prettenho...@gmail.com> wrote:
>
> Hi,
>
> seems like your sparse matrix is too large to be converted to a dense
> matrix. What shape does X have?
Hi,
seems like your sparse matrix is too large to be converted to a dense
matrix. What shape does X have? How many categorical variables do you have
(before applying the OneHotEncoder)?
Hi,
I'm new to scikit-learn. I'm trying to use preprocessing.OneHotEncoder to
encode my training and test data. After encoding I tried to train a random
forest classifier on that data, but I get the following error when
fitting.
(Here is the error trace:)
    model.fit(X_train, y_train)
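A hedged reconstruction of that setup (the toy data, n_estimators, and the
variable names are assumptions; the point is that densifying a large one-hot
matrix is the step that fails):

    import numpy as np
    from sklearn.preprocessing import OneHotEncoder
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.RandomState(0)
    X_train = rng.randint(0, 10, size=(100, 8))  # 8 categorical columns
    y_train = rng.randint(0, 2, size=100)

    enc = OneHotEncoder()
    X_enc = enc.fit_transform(X_train)           # sparse CSR matrix

    model = RandomForestClassifier(n_estimators=10)
    # fine at toy scale; at (32769, 16600) this dense conversion exhausts RAM
    model.fit(X_enc.toarray(), y_train)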