cPickle with HIGHEST_PROTOCOL is significantly faster: it averages 15
seconds to load the 10-tree forest, compared to 5 minutes without.
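
For reference, here is a minimal sketch of the save/load round trip I mean. The helper names `save_forest` and `load_forest` and the toy dictionary are placeholders for illustration, not part of scikit-learn, and the import fallback is only there so the snippet also runs on Python 3.

```python
import os
import tempfile

try:
    import cPickle as pickle  # Python 2: C implementation of pickle
except ImportError:
    import pickle  # Python 3: the C accelerator is used automatically


def save_forest(obj, path):
    """Hypothetical helper: dump obj with the highest available protocol."""
    # 'wb' because every protocol above 0 writes binary data.
    with open(path, 'wb') as f:
        pickle.dump(obj, f, pickle.HIGHEST_PROTOCOL)


def load_forest(path):
    """Hypothetical helper: load a pickle written by save_forest."""
    # 'rb' to match the binary write above.
    with open(path, 'rb') as f:
        return pickle.load(f)


# Toy stand-in for a fitted forest (an assumption; any picklable object works).
toy_forest = {'n_trees': 10, 'thresholds': [0.5 * i for i in range(1000)]}
path = os.path.join(tempfile.mkdtemp(), 'forest0.pkl')
save_forest(toy_forest, path)
restored = load_forest(path)
```

The protocol only has to be given when dumping; `load` detects it from the file.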

What still confuses me is why loading the forests and storing them in
a list should be any slower than loading them individually.  In other
words, why should this code

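    import cPickle
    from datetime import datetime

    for i in range(0, 20):
        # 'rb': pickles written with HIGHEST_PROTOCOL are binary
        with open("forest%d.pkl" % (i), 'rb') as f:
            start = datetime.now()
            a = cPickle.load(f)  # each load replaces the previous forest
            print 'loaded ', i, datetime.now() - start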

produce these run-time results

loaded  0 0:00:14.952436
loaded  1 0:00:15.759927
loaded  2 0:00:15.839598
loaded  3 0:00:14.505774
loaded  4 0:00:15.703471
loaded  5 0:00:15.492304
loaded  6 0:00:16.379292
loaded  7 0:00:17.276785
loaded  8 0:00:17.725532
loaded  9 0:00:16.245370
loaded  10 0:00:12.884921
loaded  11 0:00:15.775455
loaded  12 0:00:14.682209
loaded  13 0:00:16.039402
loaded  14 0:00:19.444111
loaded  15 0:00:14.574627
loaded  16 0:00:16.927921
loaded  17 0:00:18.554036
loaded  18 0:00:13.532662
loaded  19 0:00:18.664413


and this code

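    import cPickle
    from datetime import datetime

    classifier_bank = []
    for i in range(0, 20):
        # 'rb': pickles written with HIGHEST_PROTOCOL are binary
        with open("forest%d.pkl" % (i), 'rb') as f:
            start = datetime.now()
            a = cPickle.load(f)
            classifier_bank.append(a)  # every forest stays alive in the list
            print 'loaded ', i, datetime.now() - start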

produce these results?

loaded  0 0:00:16.561096
loaded  1 0:00:28.319847
loaded  2 0:00:37.514201
loaded  3 0:00:47.548183
loaded  4 0:00:56.997077
loaded  5 0:01:06.473708
loaded  6 0:01:20.373356
loaded  7 0:01:33.540237
loaded  8 0:01:42.579691
loaded  9 0:01:46.615368
loaded  10 0:01:44.872446
loaded  11 0:02:03.572577
loaded  12 0:02:16.641806
loaded  13 0:02:32.068945
loaded  14 0:02:59.249750
loaded  15 0:02:34.015673
loaded  16 0:03:02.650718
loaded  17 0:03:38.124911
loaded  18 0:03:00.707291
loaded  19 0:03:55.910640

This has got to be a Python quirk that I don't have a handle on yet.
Any thoughts?
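
One experiment that might narrow this down is to pause the cyclic garbage collector while the forests are loaded and see whether the per-file times stay flat. This is only a guess about the cause; the timings above do not establish it. The sketch below assumes a hypothetical helper `load_bank` and small toy pickles in place of the real forests, with an import fallback so it also runs on Python 3.

```python
import gc
import os
import tempfile
from datetime import datetime

try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3


def load_bank(paths, pause_gc=True):
    """Hypothetical helper: load every pickle into a list, timing each load.

    With pause_gc=True the cyclic collector is switched off for the
    duration of the loop and restored to its previous state afterwards.
    """
    bank, timings = [], []
    was_enabled = gc.isenabled()
    if pause_gc:
        gc.disable()
    try:
        for p in paths:
            start = datetime.now()
            with open(p, 'rb') as f:
                bank.append(pickle.load(f))
            timings.append(datetime.now() - start)
    finally:
        # Only re-enable if we were the ones who turned it off.
        if pause_gc and was_enabled:
            gc.enable()
    return bank, timings


# Toy pickles standing in for forest0.pkl, forest1.pkl, ... (an assumption).
tmp_dir = tempfile.mkdtemp()
paths = []
for i in range(3):
    p = os.path.join(tmp_dir, 'forest%d.pkl' % i)
    with open(p, 'wb') as f:
        pickle.dump({'id': i, 'nodes': list(range(100))}, f,
                    pickle.HIGHEST_PROTOCOL)
    paths.append(p)

gc_state_before = gc.isenabled()
bank, timings = load_bank(paths)
```

Comparing `timings` from `load_bank(paths, pause_gc=True)` and `load_bank(paths, pause_gc=False)` on the real forest files would show whether the collector accounts for the growth.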

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general