We were having performance problems unpickling a large pickle file: loading took 170s (which was fine), but used 1100 MB of memory when about 300 MB should have sufficed. The culprit was memory fragmentation, caused by the many unnecessary "puts" in the pickle stream.
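The surplus of "puts" is easy to see with pickletools.genops: protocols up to 2 emit a memo PUT for every memoized object, whether or not any GET ever reads that slot back. A small illustration (the object graph here is made up for the demo):

```python
import pickle
import pickletools

# A made-up object graph with exactly one shared reference.
shared = ["shared"]
data = {"a": shared, "b": shared, "c": list(range(5))}
payload = pickle.dumps(data, protocol=2)

# Count memo stores (PUT/BINPUT/LONG_BINPUT) versus memo reads
# (GET/BINGET/LONG_BINGET) in the opcode stream.
puts = sum(1 for op, arg, pos in pickletools.genops(payload) if "PUT" in op.name)
gets = sum(1 for op, arg, pos in pickletools.genops(payload) if "GET" in op.name)

# Every put beyond what the gets need is pure overhead in the stream
# and in the unpickler's memo.
print(f"{puts} puts, {gets} gets")
```

On streams like ours, with millions of memoized objects and few shared references, almost all of those puts are dead weight.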
We made a tool, inspired by pickletools.optimize, that runs directly on a pickle file using pickletools.genops. That fixed the unpickling problem (84s, 382 MB), but the tool itself took too much time and memory (1100s, 470 MB), so I recoded it to scan the pickle stream directly, without going through pickletools.genops, which brought it down to 240s and 130 MB.

Other people who deal with large pickle files probably run into the same problem, and since it only shows up with large data, it is precisely in this situation that you probably can't afford pickletools.optimize or pickletools.genops. This feels like functionality that ought to be added to pickletools; is there some way I can contribute it?

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
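For reference, here is a minimal sketch of the kind of rewrite involved, modelled on pickletools.optimize: one pass collects the memo indices that GET opcodes actually read, a second pass copies the stream while dropping PUTs for indices that are never read. This sketch works on an in-memory bytes object and still goes through pickletools.genops, so it has exactly the cost problem described above; the function name and demo payload are made up.

```python
import pickle
import pickletools

def strip_unused_puts(data: bytes) -> bytes:
    """Drop PUT opcodes whose memo slot no GET ever reads.

    Sketch only; assumes protocol <= 2 (no MEMOIZE/FRAME opcodes).
    """
    # Pass 1: record every opcode's span and the memo ids GETs use.
    used = set()
    ops = []  # (opcode, arg, start_pos)
    for op, arg, pos in pickletools.genops(data):
        ops.append((op, arg, pos))
        if "GET" in op.name:        # GET / BINGET / LONG_BINGET
            used.add(arg)

    # Pass 2: copy every opcode except unused PUTs.  Surviving PUTs keep
    # their original memo indices, so existing GETs still resolve.
    out = bytearray()
    for i, (op, arg, pos) in enumerate(ops):
        end = ops[i + 1][2] if i + 1 < len(ops) else len(data)
        if "PUT" in op.name and arg not in used:
            continue                # skip the whole opcode + argument
        out += data[pos:end]
    return bytes(out)

# Demo on a made-up payload with one shared reference.
shared = ["x"]
obj = {"a": shared, "b": shared, "c": list(range(3))}
raw = pickle.dumps(obj, protocol=2)
slim = strip_unused_puts(raw)
print(len(raw), "->", len(slim))
```

The recoded tool does the same two passes, but over the file, stepping through opcodes by their known sizes instead of materializing genops tuples for every one.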