Re: [Scikit-learn-general] random forest prediction performance

Lars Buitinck Tue, 18 Nov 2014 02:46:14 -0800

2014-11-18 11:07 GMT+01:00 Nicola Sambin <[email protected]>:
> - when I computed:
> for vector in vectors:
>     classifier.predict_proba(vector)
> it took:
> 2227,99s user 90,75s system 21:29,94 total
>
> - while
> classifier.predict_proba(vectors)
> took:
> 1,06s user 0,39s system 1,984 total
>
> What does this (impressive) difference depend on?


Python overhead (slow loops, slow function calls), plus memory
allocation for the slicing (the vectors aren't copied, but a header
structure is allocated per vector), plus memory allocation for the
output, plus input validation. In the loop, all of these costs are paid
once per vector. If you feed in a batch of vectors, all of them are
validated in one go, all the loops are performed in C and the output
matrix is allocated in one operation as well.
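
To see the effect on a small scale, something like the following rough
sketch should do. The synthetic data set, its size and the forest
parameters are my own assumptions and not the original poster's setup,
so only the relative gap is meaningful, not the absolute timings.

```python
import time

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data; the sizes are arbitrary and only meant to make the
# gap visible without a long wait.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
classifier = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# One call per vector: input validation, output allocation and Python
# call overhead are paid once per row. Recent scikit-learn releases
# require 2-D input, hence the reshape.
start = time.perf_counter()
one_by_one = np.vstack(
    [classifier.predict_proba(vector.reshape(1, -1)) for vector in X]
)
loop_seconds = time.perf_counter() - start

# One call for the whole batch: same predictions, overhead paid once.
start = time.perf_counter()
batched = classifier.predict_proba(X)
batch_seconds = time.perf_counter() - start

print("per-vector: %.3fs, batch: %.3fs" % (loop_seconds, batch_seconds))
```

Both variants should return the same probabilities; only the number of
calls into the estimator differs.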
