On Oct 31, 1:37 pm, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> On Oct 31, 6:45 am, Aaron Watters <[EMAIL PROTECTED]> wrote:
>
> > I like to use
> > marshal a lot because it's the absolutely fastest
> > way to store and load data to/from Python....
>
> I believe this FUD is somewhat out-of-date. Marshalling
> became smarter about repeated and shared objects. The
> pickle module (using mode 2) has a similar implementation
> to marshal

Raymond: happy days! We are both right!

I just ran some tests from the test suite for
http://nucular.sourceforge.net with marshalling and pickling
switched in and out. To my surprise I didn't find much
difference on the "load" end (marshalling was about 10% faster),
but for "bigLtreeTest.py" I found that the build ("dump") process
was about 1/3 slower with cPickle (mode 2/python2.4).

For the more complex tests (mondial and gutenberg) I found that
the speedup from using marshal was in the 1-2% range (and sometimes
inverted, I think because of processor load on a shared hosting
machine).

I'm pretty sure things were much worse for cPickle many moons ago.
Nice to see that some things get better :). It makes sense that
the "dump" side would be slower, because that's where you need to
remember all the objects in case you see them again...

Anyway, since it's easy and makes sense, I think the next version
of nucular will have a switchable option between marshal and
cPickle for persistent storage.

Thanks!  -- Aaron Watters

===
The pursuit of hypothetical performance improvements
is the root of all evil.  -- Bill Tutt
http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=tutt
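
For anyone who wants to try this kind of comparison themselves, here is a minimal sketch of a switchable marshal/pickle option with separate dump and load timings. It is not the nucular code or its test suite: the names SERIALIZERS, get_serializer, time_round_trip and make_sample are hypothetical, the sample data is made up, and it assumes a modern Python 3 (where cPickle is simply pickle) rather than the Python 2.4 used above. It prints no "expected" numbers, since timings depend entirely on the machine and the data.

```python
import marshal
import pickle
import time

# Hypothetical registry: each name maps to a (dumps, loads) pair.
SERIALIZERS = {
    "marshal": (marshal.dumps, marshal.loads),
    # Protocol 2 corresponds to the "mode 2" mentioned in the thread.
    "pickle": (lambda obj: pickle.dumps(obj, 2), pickle.loads),
}


def get_serializer(name):
    """Return the (dumps, loads) pair for the chosen backend."""
    try:
        return SERIALIZERS[name]
    except KeyError:
        raise ValueError("unknown serializer: %r" % (name,))


def time_round_trip(name, data, repeat=5):
    """Return the best (dump_seconds, load_seconds) over `repeat` runs."""
    dumps, loads = get_serializer(name)
    best_dump = best_load = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        blob = dumps(data)
        best_dump = min(best_dump, time.perf_counter() - start)

        start = time.perf_counter()
        restored = loads(blob)
        best_load = min(best_load, time.perf_counter() - start)

    # Sanity check: the data must survive the round trip unchanged.
    assert restored == data
    return best_dump, best_load


def make_sample(n=10000):
    """Made-up test data using only types that marshal supports."""
    return {"term%d" % i: [i, str(i), (i, i * 0.5)] for i in range(n)}


if __name__ == "__main__":
    sample = make_sample()
    for name in ("marshal", "pickle"):
        dump_s, load_s = time_round_trip(name, sample)
        print("%-8s dump %.4fs  load %.4fs" % (name, dump_s, load_s))
```

Taking the best of several runs is one way to reduce the noise from processor load on a shared machine. Note also that marshal only handles a fixed set of built-in types and its format is not guaranteed to be stable across Python versions, so a switch like this only makes sense when the stored data is restricted to those types.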