Firstly, thanks for all the helpful comments.  I didn't know that the
protocol made such a big difference, so until now in ignorance I've
been using the default.

That said, I left a test running last night on one of our centre's
servers and it took 8 hours to load 20 forests (each with 10 trees,
depth 20, approx. 100K nodes) using `pickle`.  It dropped to 6 hours
using `cPickle`.  The trees aren't complete binary trees (100K nodes
out of a possible 2 million).
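
For scale, here is a rough back-of-the-envelope check of those numbers.
It assumes the root sits at depth 0 and that the 100K figure is per
tree; both are my assumptions, not something I measured.

```python
# Rough size check for the forests described above.
# Assumptions: root is at depth 0, and "100K nodes" is per tree.
depth = 20
nodes_per_tree = 100000
trees_per_forest = 10
n_forests = 20

# A complete binary tree of depth d has 2**(d + 1) - 1 nodes.
full_nodes = 2 ** (depth + 1) - 1
fill_percent = 100.0 * nodes_per_tree / full_nodes
total_nodes = n_forests * trees_per_forest * nodes_per_tree

print(full_nodes)                # 2097151, i.e. roughly 2 million
print("%.1f%%" % fill_percent)   # 4.8%
print(total_nodes)               # 20000000 nodes across all forests
```

So each tree is only about 5% full, but the whole set still adds up to
about 20 million node objects to unpickle.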

It's really quick to load a single one; the trouble seems to arise
when I load all of them into memory consecutively.  It gets
progressively slower as more memory is used.  I will give it another
try, saving and loading using the highest protocol.
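
A minimal sketch of the test I have in mind: dump each forest with
`HIGHEST_PROTOCOL`, then load them back one after another and time each
load separately, so any progressive slowdown shows up per file.  The
toy forest (`build_toy_forest`), the file names and the helper
functions are hypothetical stand-ins, not the real scikit-learn
objects.

```python
import os
import shutil
import tempfile
import time

try:
    import cPickle as pickle  # Python 2, as used in this thread
except ImportError:
    import pickle  # Python 3: the C implementation is used automatically


def build_toy_forest(n_trees=10, n_nodes=1000, offset=0):
    """Hypothetical stand-in for a forest: one list of node dicts per tree."""
    return [
        [{"feature": (i + offset) % 7, "threshold": 0.5 * i,
          "left": 2 * i + 1, "right": 2 * i + 2} for i in range(n_nodes)]
        for _ in range(n_trees)
    ]


def save_forests(forests, directory):
    """Dump each forest to its own file using the highest pickle protocol."""
    paths = []
    for i, forest in enumerate(forests):
        path = os.path.join(directory, "forest_%02d.pkl" % i)
        # Binary mode is required for the binary protocols.
        with open(path, "wb") as f:
            pickle.dump(forest, f, pickle.HIGHEST_PROTOCOL)
        paths.append(path)
    return paths


def load_forests(paths):
    """Load the files consecutively, keeping everything in memory."""
    forests, timings = [], []
    for path in paths:
        start = time.time()
        with open(path, "rb") as f:
            forests.append(pickle.load(f))
        timings.append(time.time() - start)
    return forests, timings


tmp_dir = tempfile.mkdtemp()
try:
    originals = [build_toy_forest(offset=i) for i in range(5)]
    paths = save_forests(originals, tmp_dir)
    loaded, timings = load_forests(paths)
    for path, secs in zip(paths, timings):
        print("%s loaded in %.4fs" % (os.path.basename(path), secs))
finally:
    shutil.rmtree(tmp_dir)
```

If the per-file times stay flat here but climb with the real forests,
that would suggest the problem is tied to total memory use and not to
the pickle format itself.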


On 26 October 2011 19:58, Peter Prettenhofer
<[email protected]> wrote:
> I just dumped and loaded a fairly large tree (~40000 nodes; from
> bench_sgd_covertype.py) with cPickle, both operations performed in
> less than 1 sec (w/ and w/o HIGHEST_PROTOCOL).
>
> Brian: how large are your trees (are they complete binary trees?)
>
> best,
>  Peter
>
>
> 2011/10/26 Peter Prettenhofer <[email protected]>:
>> brian, try to save the tree using::
>>
>> cPickle.dump(tree, f, cPickle.HIGHEST_PROTOCOL)
>>
>> if this doesn't solve the issue we should reconsider Gael's array
>> representation.
>>
>> best,
>> peter
>>
>> On 26.10.2011 14:37, "Andreas Mueller" <[email protected]> wrote:
>>>
>>> > My question is: is there a way to improve the performance of loading
>>> > classifiers, either using different pickle options (of which I don't
>>> > know any, but there may be)
>>> >
>>> >
>>> Just to be sure, you used the latest pickling format, right?
>>> cPickle uses the oldest one by default afaik.
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> The demand for IT networking professionals continues to grow, and the
>>> demand for specialized networking skills is growing even more rapidly.
>>> Take a complimentary Learning@Cisco Self-Assessment and learn
>>> about Cisco certifications, training, and career opportunities.
>>> http://p.sf.net/sfu/cisco-dev2dev
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>
>
>
> --
> Peter Prettenhofer
>
>



-- 
He is no fool who gives what he cannot keep to gain what he cannot lose.
 - Jim Elliot.

