100K nodes is not much larger than my test (60K)... have you checked
the memory consumption during the load operation? I suspect that you
run out of memory and the huge overhead is due to thrashing.
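
A rough way to check this, as a sketch only: load the files one after
another and log the time and memory after each load. It assumes
Python 3 (where cPickle has been merged into pickle); the file names
and the `load_all` helper are made up for illustration. Note that
`tracemalloc` only sees memory allocated by Python, so watching the
process in a tool like top is still the better test for swapping.

```python
import os
import pickle
import tempfile
import time
import tracemalloc


def load_all(paths):
    """Load pickles consecutively; record time and traced memory per file."""
    tracemalloc.start()
    loaded, log = [], []
    for path in paths:
        start = time.perf_counter()
        with open(path, "rb") as f:
            loaded.append(pickle.load(f))
        elapsed = time.perf_counter() - start
        # current/peak are bytes allocated by Python since tracemalloc.start()
        current, peak = tracemalloc.get_traced_memory()
        log.append((path, elapsed, current, peak))
        print("%s: %.3fs, current=%.1f MB, peak=%.1f MB"
              % (os.path.basename(path), elapsed, current / 1e6, peak / 1e6))
    tracemalloc.stop()
    return loaded, log


# Stand-in data: three small pickles instead of the real forests.
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(3):
    path = os.path.join(tmpdir, "forest_%d.pkl" % i)
    with open(path, "wb") as f:
        pickle.dump(list(range(10000)), f, pickle.HIGHEST_PROTOCOL)
    paths.append(path)

forests, log = load_all(paths)
```

If the per-file load time jumps once memory use approaches the
machine's RAM, that would point to thrashing rather than to pickle.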

2011/10/27 Brian Holt <[email protected]>:
> Firstly, thanks for all the helpful comments.  I didn't know that the
> protocol made such a big difference, so until now in ignorance I've
> been using the default.
>
> That said, I left a test running last night on one of our centre's
> servers and it took 8 hours to load 20 forests (each with 10 trees,
> depth 20, approx 100K nodes) using `pickle`.  It dropped to 6 hours
> using `cPickle`.  The trees aren't complete binary trees (100K nodes
> out of a possible 2 million).
>
> It's really quick to load a single one; the trouble seems to start
> when I load all of them into memory consecutively.  It gets
> progressively slower as more memory is used. I will give it another
> try saving and loading using the highest protocol.
>
>
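
For comparing protocols, a minimal sketch (again assuming Python 3,
with a nested dict as a hypothetical stand-in for the tree, not the
actual scikit-learn tree object): dump and load the same object with
protocol 0 and with HIGHEST_PROTOCOL and compare size and time.

```python
import pickle
import time


def make_tree(depth):
    """Build a complete binary tree of nested dicts (illustrative stand-in)."""
    if depth == 0:
        return {"value": 1.0}
    return {"feature": depth, "threshold": 0.5,
            "left": make_tree(depth - 1), "right": make_tree(depth - 1)}


def roundtrip(obj, protocol):
    """Return (pickle size in bytes, seconds for dumps + loads)."""
    start = time.perf_counter()
    data = pickle.dumps(obj, protocol)
    restored = pickle.loads(data)
    elapsed = time.perf_counter() - start
    assert restored == obj  # the round trip must preserve the object
    return len(data), elapsed


tree = make_tree(12)  # 2**13 - 1 = 8191 nodes
size_old, secs_old = roundtrip(tree, 0)
size_new, secs_new = roundtrip(tree, pickle.HIGHEST_PROTOCOL)
print("protocol 0:       %d bytes, %.3fs" % (size_old, secs_old))
print("HIGHEST_PROTOCOL: %d bytes, %.3fs" % (size_new, secs_new))
```

The numbers depend on the object and the machine, so it is worth
running this on one of the real forests before drawing conclusions.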
> On 26 October 2011 19:58, Peter Prettenhofer
> <[email protected]> wrote:
>> I just dumped and loaded a fairly large tree (~40000 nodes; from
>> bench_sgd_covertype.py) with cPickle, both operations performed in
>> less than 1 sec (w/ and w/o HIGHEST_PROTOCOL).
>>
>> Brian: how large are your trees (are they complete binary trees?)
>>
>> best,
>>  Peter
>>
>>
>> 2011/10/26 Peter Prettenhofer <[email protected]>:
>>> brian, try to save the tree using::
>>>
>>> cPickle.dump(tree, f, cPickle.HIGHEST_PROTOCOL)
>>>
>>> If this doesn't solve the issue we should reconsider Gael's array
>>> representation.
>>>
>>> best,
>>> peter
>>>
>>> On 26.10.2011 14:37, "Andreas Mueller" <[email protected]> wrote:
>>>>
>>>> > My question is: is there a way to improve the performance of loading
>>>> > classifiers, either using different pickle options (of which I don't
>>>> > know any, but there may be)
>>>> >
>>>> >
>>>> Just to be sure, you used the latest pickling format, right?
>>>> cPickle uses the oldest one by default afaik.
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> The demand for IT networking professionals continues to grow, and the
>>>> demand for specialized networking skills is growing even more rapidly.
>>>> Take a complimentary Learning@Cisco Self-Assessment and learn
>>>> about Cisco certifications, training, and career opportunities.
>>>> http://p.sf.net/sfu/cisco-dev2dev
>>>> _______________________________________________
>>>> Scikit-learn-general mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>
>>
>>
>>
>> --
>> Peter Prettenhofer
>>
>>
>
>
>
> --
> He is no fool who gives what he cannot keep to gain what he cannot lose.
>  - Jim Elliot.
>
>



-- 
Peter Prettenhofer

