Re: [Scikit-learn-general] Storing and loading decision tree classifiers

Olivier Grisel Fri, 04 Nov 2011 02:29:17 -0700

2011/11/4 Peter Prettenhofer <[email protected]>:
> [..]
>>
>> Interesting. What is the order of magnitude of the decrease in speed
>> at fit time?
>
> IMHO it's negligible.
>
> Here are some timings for::
>
>    import numpy as np
>    from sklearn import tree
>
>    rs = np.random.RandomState(13)
>    X = rs.rand(50000, 100)
>    y = rs.randint(2, size=50000)
>    clf = tree.DecisionTreeClassifier(max_depth=2)
>    %timeit clf.fit(X, y)      # %timeit is an IPython magic
>    %timeit clf.predict(X)
>
>
> Array repr      fit       predict
> max_depth=2     1.11 s    14.5 ms
> max_depth=9     4.03 s    15.6 ms
> max_depth=20    7.92 s    22.4 ms
>
> Comp repr       fit       predict
> max_depth=2     1.11 s    64.9 ms
> max_depth=9     3.96 s    65.8 ms
> max_depth=20    8.03 s    72.9 ms
>
> The array repr is significantly faster at prediction time (roughly 3x
> to 4.5x in the timings above), and there's still some room for
> improvement because it might be possible to vectorize the predict
> computation (easy for decision stumps but more difficult for trees of
> depth 3 or larger).
>
> Given the above timings I think there is too little to be gained
> considering the additional code complexity of such a hybrid approach.
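
To make the vectorization remark concrete, here is a rough sketch of what predicting from a flat-array tree could look like in NumPy. The array names (`feature`, `threshold`, `left`, `right`, `value`), the `LEAF` marker and the tiny hand-built tree are all assumptions made for illustration; they are not the layout used in the patch discussed in this thread. The stump case is a single comparison over all rows, while deeper trees need one pass per level.

```python
import numpy as np

# Hypothetical flat-array layout for a small binary tree (illustrative only).
# Node 0 is the root; LEAF marks nodes that have no split.
LEAF = -1
feature = np.array([0, LEAF, 1, LEAF, LEAF])      # feature tested at each node
threshold = np.array([0.5, 0.0, 0.3, 0.0, 0.0])   # split threshold at each node
left = np.array([1, LEAF, 3, LEAF, LEAF])         # child taken when x <= threshold
right = np.array([2, LEAF, 4, LEAF, LEAF])        # child taken otherwise
value = np.array([0, 0, 0, 1, 0])                 # class label stored at each node


def predict_stump(X, feat, thresh, left_label, right_label):
    """Depth-1 tree: one vectorized comparison over all rows."""
    return np.where(X[:, feat] <= thresh, left_label, right_label)


def predict_tree(X):
    """Move every sample down one level per iteration until all sit in a leaf."""
    node = np.zeros(X.shape[0], dtype=np.intp)
    # Indices of the samples whose current node is still an internal node.
    active = np.flatnonzero(feature[node] != LEAF)
    while active.size:
        cur = node[active]
        go_left = X[active, feature[cur]] <= threshold[cur]
        node[active] = np.where(go_left, left[cur], right[cur])
        active = active[feature[node[active]] != LEAF]
    return value[node]


X_demo = np.array([[0.2, 0.9], [0.8, 0.1], [0.8, 0.9]])
stump_pred = predict_stump(X_demo, 0, 0.5, 0, 1)
tree_pred = predict_tree(X_demo)
```

The loop in `predict_tree` runs at most once per tree level, so the Python overhead grows with the depth rather than with the number of samples; whether that beats a compiled per-sample traversal would have to be measured.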


OK, great, this is good news. Good work :) I'll let you work out with
Gilles how to merge your work with his.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
