Collin Winter <collinwinter <at> google.com> writes:
> I don't think it's possible in general to remove any PUTs if the
> pickle is being written to a file-like object. It is possible to reuse
> a single Pickler to pickle multiple objects: this causes the Pickler's
> memo dict to be shared between the objects being pickled. If you
> pickle foo, bar, and baz, foo may not have any GETs, but bar and baz
> may have GETs that reference data added to the memo by foo's PUT
> operations. Because you can't know what will be written to the
> file-like object later, you can't remove any of the PUT instructions
> in this scenario.
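
The shared-memo scenario quoted above can be reproduced with a small sketch. It assumes Python 3's `pickle` and `pickletools` modules rather than cPickle, and protocol 2, where PUT and GET appear as the BINPUT and BINGET opcodes. The names `foo` and `bar` follow the quoted example; `shared` and `opcode_names` are made up for illustration.

```python
import io
import pickle
import pickletools

# Hypothetical data: foo and bar both reference the same list object.
shared = ["large", "shared", "payload"]
foo = {"data": shared}
bar = {"again": shared}

# One Pickler, two dumps: the memo is shared across both dump() calls.
buf = io.BytesIO()
pickler = pickle.Pickler(buf, protocol=2)
pickler.dump(foo)
first_len = buf.tell()
pickler.dump(bar)

raw = buf.getvalue()
first, second = raw[:first_len], raw[first_len:]


def opcode_names(data):
    """Return the opcode names of the single pickle at the start of data."""
    return [op.name for op, _arg, _pos in pickletools.genops(data)]


first_ops = opcode_names(first)
second_ops = opcode_names(second)

# Read both pickles back with one Unpickler, whose memo persists
# across load() calls just as the Pickler's memo did across dump().
unpickler = pickle.Unpickler(io.BytesIO(raw))
loaded_foo = unpickler.load()
loaded_bar = unpickler.load()

# Stripping the "unused" PUTs from foo's pickle in isolation breaks bar,
# because bar's GET refers to a memo entry that foo's PUT created.
optimized_first = pickletools.optimize(first)
broken = pickle.Unpickler(io.BytesIO(optimized_first + second))
broken.load()  # foo on its own still loads
try:
    broken.load()
    second_load_failed = False
except (pickle.UnpicklingError, KeyError):
    second_load_failed = True

print("foo's pickle contains a GET:", "BINGET" in first_ops)
print("bar's pickle contains a GET:", "BINGET" in second_ops)
print("bar fails after optimizing foo alone:", second_load_failed)
```

Here foo's pickle contains only PUTs, so an optimizer looking at it in isolation would consider all of them removable, while bar's pickle depends on one of them.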

Hmm, that is a good point. A possible solution would be for the two-pass
optimizer to scan through the entire file, going right through '.' opcodes.
That would deal with the case you are describing, but not if the user
"maliciously" wrote some other stuff into the file in between pickle dumps,
all the while reusing the same pickler.

I think a better solution would be to make sure that the '.' is the last
thing in the file and die otherwise. This would at least ensure correctness
and detect the cases that the optimizer cannot handle.

> don't break cvs2svn, it's not fun
> to fix :). I added some basic tests for this support in cPython's
> Lib/test/pickletester.py.

Thanks for the warning :)

> There might be room for app-specific optimizations that do this, but
> I'm not sure it would work for a general-usage cPickle that needs to
> stay compatible with the current system.

That may well be true. Still, when trying to deal with large data you really
need something like this. Our situation was made worse because we had
extension types. As they were allocated they got interspersed with
temporaries generated by the spurious PUTs, and that is what really
fragmented the memory. However, it's probably not a stretch to assume that
if you are dealing with large stuff through Python you are going to have
extension types in the mix.

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com