2011/11/4 Peter Prettenhofer <[email protected]>:
> [..]
>>
>> Interesting. What is the order of magnitude of the decrease in speed
>> at fit time?
>
> IMHO it's negligible
>
> here are some timings for::
>
>     rs = np.random.RandomState(13)
>     X = rs.rand(50000, 100)
>     y = rs.randint(2, size=50000)
>     from sklearn.tree import tree
>     clf = tree.DecisionTreeClassifier(max_depth=2)
>     %timeit clf.fit(X, y)
>     %timeit clf.predict(X)
>
>
> Array repr:     fit       predict
> max_depth=2     1.11 s    14.5 ms
> max_depth=9     4.03 s    15.6 ms
> max_depth=20    7.92 s    22.4 ms
>
> Comp repr:      fit       predict
> max_depth=2     1.11 s    64.9 ms
> max_depth=9     3.96 s    65.8 ms
> max_depth=20    8.03 s    72.9 ms
>
> The array repr is significantly faster at prediction time - and
> there's still some room for improvement because it might be possible
> to vectorize the predict computation (easy for decision stumps but
> more difficult for trees of depth 3 or larger)
>
> Given the above timings I think there is too little to be gained
> considering the additional code complexity of such a hybrid approach.

OK, great, this is good news. Good work :) I'll let you work out with
Gilles how to merge your work with his...

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
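
The timings above compare an "array repr" with a "comp repr" without
spelling either out. As a rough sketch only, assuming "array repr" means
storing the tree in parallel arrays indexed by node id and "comp repr"
means a composite of linked node objects, the two prediction loops might
look like the following. All names here (Node, LEAF, predict_composite,
predict_array) and the tiny hand-built tree are hypothetical and are not
taken from the scikit-learn code under discussion.

```python
# Hypothetical illustration: these names and this layout are assumptions,
# not scikit-learn's actual internals.

LEAF = -1  # sentinel stored in the arrays to mark "no child / no feature"


class Node:
    """Composite representation: each node object links to its children."""

    def __init__(self, feature=None, threshold=None,
                 left=None, right=None, value=None):
        self.feature = feature      # index of the feature tested at this node
        self.threshold = threshold  # go left if x[feature] <= threshold
        self.left = left            # child Node, or None for a leaf
        self.right = right
        self.value = value          # predicted class, set on leaves only


def predict_composite(root, x):
    """Follow object references from the root down to a leaf."""
    node = root
    while node.left is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value


# Array representation of the same tree: node i is described by
# entry i of each parallel list; node 0 is the root.
feature = [0, LEAF, 1, LEAF, LEAF]
threshold = [0.5, 0.0, 0.3, 0.0, 0.0]
left = [1, LEAF, 3, LEAF, LEAF]
right = [2, LEAF, 4, LEAF, LEAF]
value = [None, 0, None, 1, 0]


def predict_array(x):
    """Follow integer child indices from the root down to a leaf."""
    i = 0
    while left[i] != LEAF:
        i = left[i] if x[feature[i]] <= threshold[i] else right[i]
    return value[i]


# The same tree built from linked objects.
root = Node(feature=0, threshold=0.5,
            left=Node(value=0),
            right=Node(feature=1, threshold=0.3,
                       left=Node(value=1),
                       right=Node(value=0)))
```

Both functions walk the same decision path; the array form only swaps
attribute lookups for index lookups, which is presumably what makes a
tight compiled or vectorized predict loop easier to write.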
