Re: [Scikit-learn-general] Are you good at algebra with permutations and factorials?

2011-10-27 Thread Robert Layton
On 27 October 2011 23:22, Satrajit Ghosh wrote:
> hi robert,
>
>> I had a way of doing it that would be really robust, but really slow.
>> Expand each of the factorials on the denominator and numerator separately
>> and store as separate lists. Then remove all terms common to both lists and
>>

Re: [Scikit-learn-general] Question on KFold CV and an error with Naive Bayes classifiers

2011-10-27 Thread Lars Buitinck
2011/10/27 SK Sn:
> Hi all, I was playing around with KFold CV and found I need to transfer an X
> (scipy sparse matrix after text vectorization) by todense() in order to work
> with Kfold CV using following code:
>
> for train_index, test_index in kf:
>     X_train, X_test = X[train_index],

[Scikit-learn-general] Question on KFold CV and an error with Naive Bayes classifiers

2011-10-27 Thread SK Sn
Hi all, I was playing around with KFold CV and found that I need to convert X (a scipy sparse matrix produced by text vectorization) with todense() in order to use it with KFold CV in the following code:

    for train_index, test_index in kf:
        X_train, X_test = X[train_index], X[test_index]
        y_train, y_te
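The loop quoted in this message indexes X with integer index arrays. As a rough sketch of the same pattern without todense(): it assumes X is stored in CSR format (some scipy sparse formats, such as COO, do not support indexing) and it uses the current `sklearn.model_selection.KFold` interface, not the 2011 API the message was written against. The toy data and the `fold_shapes` name are made up for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.model_selection import KFold

# Toy stand-in for a vectorized text corpus: 6 samples, 4 features (assumed data).
X = csr_matrix(np.arange(24, dtype=float).reshape(6, 4))
y = np.array([0, 1, 0, 1, 0, 1])

fold_shapes = []
for train_index, test_index in KFold(n_splits=3).split(X):
    # CSR matrices accept integer index arrays for row selection,
    # so no dense conversion is needed here.
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    fold_shapes.append((X_train.shape, X_test.shape))

print(fold_shapes)
```

If X arrives in another sparse format, calling `X.tocsr()` once before the loop is usually cheaper than densifying the whole matrix.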

Re: [Scikit-learn-general] Are you good at algebra with permutations and factorials?

2011-10-27 Thread Satrajit Ghosh
hi robert,

> I had a way of doing it that would be really robust, but really slow.
> Expand each of the factorials on the denominator and numerator separately
> and store as separate lists. Then remove all terms common to both lists and
> multiply the results. However, this heavily uses append an
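The approach quoted above (expand each factorial into its terms, cancel the terms common to numerator and denominator, then multiply what is left) could be sketched roughly as follows. This is not the code from the thread: the function names are hypothetical, and a `collections.Counter` is used as a multiset in place of the appended lists the message describes.

```python
from collections import Counter
from fractions import Fraction
from math import prod


def factorial_terms(n):
    """Return the terms of n! as a multiset (the factor 1 is skipped)."""
    return Counter(range(2, n + 1))


def factorial_ratio(numerator, denominator):
    """Evaluate (a! * b! * ...) / (c! * d! * ...) by cancelling common terms.

    `numerator` and `denominator` are iterables of non-negative integers
    whose factorials are multiplied together.
    """
    num = Counter()
    for n in numerator:
        num += factorial_terms(n)
    den = Counter()
    for n in denominator:
        den += factorial_terms(n)

    # Multiset intersection: the terms present on both sides.
    common = num & den
    num -= common
    den -= common

    # Multiply only the surviving terms; prod() of nothing is 1.
    return Fraction(prod(num.elements()), prod(den.elements()))


print(factorial_ratio([10], [8, 2]))  # 10! / (8! * 2!)
```

Cancelling before multiplying keeps the intermediate products small, which is the robustness the quoted message is after; the cost is building the term multisets for every factorial.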

Re: [Scikit-learn-general] Storing and loading decision tree classifiers

2011-10-27 Thread bdholt1
I would have agreed if I were working on a machine with less memory: the server has 144 GB and I'm only using a few percent.

Re: [Scikit-learn-general] A question about multi-label classification

2011-10-27 Thread Lars Buitinck
2011/10/26 SK Sn:
> I will try to come up with a wrapper for the multi-label.

Something like this: https://github.com/scikit-learn/scikit-learn/pull/417 ?

--
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam

Re: [Scikit-learn-general] Storing and loading decision tree classifiers

2011-10-27 Thread Peter Prettenhofer
100K nodes is not much larger than my test (60K)... have you checked the memory consumption during the load operation? I suspect that you run out of memory and the huge overhead is due to thrashing.

2011/10/27 Brian Holt:
> Firstly, thanks for all the helpful comments.  I didn't know that the
> p

Re: [Scikit-learn-general] Storing and loading decision tree classifiers

2011-10-27 Thread Brian Holt
Firstly, thanks for all the helpful comments. I didn't know that the protocol made such a big difference, so until now, in ignorance, I've been using the default. That said, I left a test running last night on one of our centre's servers and it took 8 hrs to load 20 forests (each with 10 trees, dep
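For readers unfamiliar with the protocol setting mentioned here, a minimal sketch of choosing a pickle protocol explicitly is shown below. The `forest` object is a hypothetical nested structure standing in for fitted trees, not a scikit-learn estimator, and nothing here reproduces the timings reported in the thread. On Python 2, which was current in 2011, the default was the text-based protocol 0; on Python 3 the default is already a binary protocol.

```python
import pickle

# Hypothetical stand-in for a small forest: 10 "trees", each a dict of node arrays.
forest = [
    {"feature": list(range(1000)), "threshold": [i / 2 for i in range(1000)]}
    for _ in range(10)
]

# Protocol 0 is the original text-based format.
text_blob = pickle.dumps(forest, protocol=0)

# HIGHEST_PROTOCOL selects the newest binary format this interpreter supports.
binary_blob = pickle.dumps(forest, protocol=pickle.HIGHEST_PROTOCOL)

# Loading does not need the protocol: it is detected from the data.
restored = pickle.loads(binary_blob)
print(len(text_blob), len(binary_blob), restored == forest)
```

The same `protocol` argument is accepted by `pickle.dump` when writing to a file opened in binary mode.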