On 27 October 2011 23:22, Satrajit Ghosh wrote:
> hi robert,
>
>
>> I had a way of doing it that would be really robust, but really slow.
>> Expand each of the factorials on the denominator and numerator separately
>> and store as separate lists. Then remove all terms common to both lists and
>>
2011/10/27 SK Sn :
> Hi all, I was playing around with KFold CV and found that I need to convert X
> (a scipy sparse matrix produced by text vectorization) with todense() in order
> to use it with KFold CV in the following code:
>
> for train_index, test_index in kf:
>     X_train, X_test = X[train_index],
Hi all, I was playing around with KFold CV and found that I need to convert X
(a scipy sparse matrix produced by text vectorization) with todense() in order
to use it with KFold CV in the following code:
for train_index, test_index in kf:
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_te
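
For reference, a minimal sketch of splitting a sparse matrix by fold without calling todense(). It assumes X is stored in CSR format, which in current SciPy accepts integer index arrays for row selection. The `kfold_indices` helper is a hypothetical, hand-rolled stand-in for the `kf` iterator above, not scikit-learn's own KFold, and the toy X and y are made up for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse


def kfold_indices(n_samples, n_folds):
    """Yield (train_index, test_index) integer arrays for contiguous folds.

    Hypothetical helper standing in for the `kf` iterator; requires n_folds >= 2.
    """
    folds = np.array_split(np.arange(n_samples), n_folds)
    for i, test_index in enumerate(folds):
        # Training rows are all folds except the i-th one.
        train_index = np.concatenate(folds[:i] + folds[i + 1:])
        yield train_index, test_index


# Toy data (assumption): 6 samples, 6 features, stored as CSR.
X = csr_matrix(np.eye(6))
y = np.arange(6)

splits = []
for train_index, test_index in kfold_indices(X.shape[0], 3):
    # Row selection with integer arrays keeps the result sparse.
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    splits.append((X_train, X_test, y_train, y_test))
```

If X arrives in another sparse format, converting it once with X.tocsr() before the loop is usually much cheaper than densifying it.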
hi robert,
> I had a way of doing it that would be really robust, but really slow.
> Expand each of the factorials on the denominator and numerator separately
> and store as separate lists. Then remove all terms common to both lists and
> multiply the results. However, this heavily uses append an
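
A rough sketch of the cancel-then-multiply idea described in the quoted text, using multisets instead of repeated list appends. The function names are hypothetical, and this is only one way to read the description: each factorial is expanded into its integer terms, terms common to the numerator and denominator are removed, and the remainders are multiplied.

```python
from collections import Counter
from fractions import Fraction
from math import prod


def expand(factorial_args):
    """Count every integer term of n! for each n in factorial_args."""
    terms = Counter()
    for n in factorial_args:
        # n! = 2 * 3 * ... * n (the factor 1 is skipped).
        terms.update(range(2, n + 1))
    return terms


def factorial_ratio(numerator_args, denominator_args):
    """Return prod(n!) / prod(d!) exactly, cancelling shared terms first."""
    num = expand(numerator_args)
    den = expand(denominator_args)
    common = num & den  # multiset intersection: terms present on both sides
    num -= common
    den -= common
    top = prod(k ** count for k, count in num.items())
    bottom = prod(k ** count for k, count in den.items())
    return Fraction(top, bottom)


ratio = factorial_ratio([10], [8, 2])  # 10! / (8! * 2!)
```

Counter keeps one entry per distinct integer, so the cancellation step is a single intersection rather than a search through two lists.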
I would have agreed if I were working on a machine with less memory: the server
has 144GB and I'm only using a few percent.
-Original Message-
From: Peter Prettenhofer
Date: Thu, 27 Oct 2011 12:20:40
To:
Reply-To: scikit-learn-general@lists.sourceforge.net
Subject: Re: [Scikit-learn-ge
2011/10/26 SK Sn :
> I will try to come up with a wrapper for the multi-label.
Something like this: https://github.com/scikit-learn/scikit-learn/pull/417 ?
--
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam
--
100K nodes is not much larger than my test (60K)... have you checked
the memory consumption during the load operation? I suspect that you
run out of memory and the huge overhead is due to thrashing.
2011/10/27 Brian Holt :
> Firstly, thanks for all the helpful comments. I didn't know that the
> p
Firstly, thanks for all the helpful comments. I didn't know that the
protocol made such a big difference, so until now in ignorance I've
been using the default.
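
To illustrate the protocol point, a small sketch comparing the text-based protocol 0 (the default in Python 2) with the highest binary protocol available. The payload is a made-up stand-in for a forest, not the real estimator objects.

```python
import pickle

# Dummy payload (assumption): 20 "trees", each just a list of integers.
payload = {"trees": [list(range(1000)) for _ in range(20)]}

# Protocol 0 is the human-readable ASCII format.
text_bytes = pickle.dumps(payload, protocol=0)

# HIGHEST_PROTOCOL selects the most compact binary format this Python supports.
binary_bytes = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)

restored = pickle.loads(binary_bytes)
size_ratio = len(text_bytes) / len(binary_bytes)
```

The same protocol argument is accepted by pickle.dump when writing to a file opened in binary mode.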
That said, I left a test running last night on one of our centre's
servers and it took 8 hours to load 20 forests (each with 10 trees,
dep