> The program works fine, but the memory load is huge. The size of
> the pickle file on disk is about 900 Meg so I would theoretically
> expect my program to consume about twice that (the dictionary
> contained in the pickle file plus its repackaging into other formats),
> but instead my program needs almost 5 Gig of memory to run.
> Am I being unrealistic in my memory expectations?

I would say so, yes. As you use 5GiB of memory, it seems you are
running a 64-bit system. On such a system, each pointer takes 8 bytes.
In addition, each object takes at least 16 bytes (reference count plus
type pointer); if it's variable-sized, it takes at least 24 bytes,
plus the actual data in the object.

OTOH, in a pickle, a pointer takes no space, unless it's a shared
pointer (i.e. a backwards reference to an object already pickled),
which takes as many digits as you need to encode the "object number"
in the pickle. Each primitive object takes only a single byte of
overhead (as opposed to 24), causing quite drastic space reductions.
Of course, non-primitive objects take more, as they need to encode
the class they are instances of.

> Is there a way to see how much memory is being consumed
> by a single data structure or variable? How can I go about
> debugging this problem?

In Python 2.6, there is the sys.getsizeof function. Note that it
reports the size of the object itself only, not of the objects it
refers to, so for a container you have to add up the contents
yourself. For earlier versions, the asizeof package gives similar
results.

Regards,
Martin
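
P.S. To make the comparison concrete, here is a minimal sketch that puts the pickled size of a dictionary next to a rough estimate of its in-memory size. The helper deep_getsizeof and the sample data are made up for illustration; the helper only follows dict, list, tuple, set and frozenset, so treat its result as an approximation, not an exact accounting. It is written for a current Python 3, not 2.6.

```python
import pickle
import sys


def deep_getsizeof(obj, seen=None):
    """Roughly total sys.getsizeof over obj and everything it contains.

    Hypothetical helper: it only descends into dict, list, tuple, set and
    frozenset, and counts each distinct object once (tracked by id).
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0  # shared object, already counted
    seen.add(id(obj))
    size = sys.getsizeof(obj)  # shallow size of this object only
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_getsizeof(key, seen) + deep_getsizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_getsizeof(item, seen)
    return size


# Made-up sample data standing in for the real dictionary.
data = {i: [float(i), str(i)] for i in range(10000)}

pickled_size = len(pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL))
shallow_size = sys.getsizeof(data)   # the dict's own table, no contents
deep_size = deep_getsizeof(data)     # the dict plus keys, lists, floats, strings

print("pickled bytes:", pickled_size)
print("shallow bytes:", shallow_size)
print("deep bytes:   ", deep_size)
```

On a 64-bit build, the deep figure should come out several times larger than the pickled one, which is the effect described above. Running the same helper over the real dictionary and over each repackaged copy should show which structure accounts for most of the 5 Gig.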