Re: Memory usage per top 10x usage per heapy

Junkshops Tue, 25 Sep 2012 13:30:25 -0700

On 9/25/2012 11:17 AM, Oscar Benjamin wrote:

On 25 September 2012 19:08, Junkshops <junksh...@gmail.com> wrote:
    In [38]: mpef._ustore._store
    Out[38]: defaultdict(<type 'dict'>, {'Measurement':
    {'8991c2dc67a49b909918477ee4efd767':
    <micropheno.exchangeformat.Exceptions.FileContext object at
    0x2f0fe90>, '7b38b429230f00fe4731e60419e92346':
    <micropheno.exchangeformat.Exceptions.FileContext object at
    0x2f0fad0>, 'b53531471b261c44d52f651add647544':
    <micropheno.exchangeformat.Exceptions.FileContext object at
    0x2f0f4d0>, '44ea6d949f7c8c8ac3bb4c0bf4943f82':
    <micropheno.exchangeformat.Exceptions.FileContext object at
    0x2f0f910>, '0de96f928dc471b297f8a305e71ae3e1':
    <micropheno.exchangeformat.Exceptions.FileContext object at
    0x2f0f550>}})
Have these exceptions been raised from somewhere before being stored? I wonder if you're inadvertently keeping execution frames alive. There are some problems in CPython with this that are related to storing exceptions.

FileContext objects aren't exceptions. They store information about where the stored object originally came from, so if there's an MD5 or ID clash with a later line in the file the code can report both the current line and the older clashing line to the user. I have an Exception subclass that takes a FileContext as an argument. There are no exceptions thrown in the file I processed to get the heapy results earlier in the thread.
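To make the role of FileContext concrete, here is a minimal sketch of the pattern described above. Every name and attribute in it (the `filename` and `lineno` fields, `DuplicateRecordError`, `register`) is a hypothetical stand-in, not the actual micropheno code; it only illustrates keeping a small provenance object per key so a later clash can report both lines.

```python
class FileContext(object):
    """Hypothetical stand-in: remembers where a stored record came from."""

    def __init__(self, filename, lineno):
        self.filename = filename
        self.lineno = lineno


class DuplicateRecordError(Exception):
    """Hypothetical exception carrying the contexts of both clashing lines."""

    def __init__(self, current, previous):
        Exception.__init__(
            self,
            "line %d of %s clashes with line %d of %s"
            % (current.lineno, current.filename,
               previous.lineno, previous.filename))
        self.current = current
        self.previous = previous


def register(store, md5, context):
    """Record where md5 was first seen; raise if it was already stored."""
    previous = store.get(md5)
    if previous is not None:
        raise DuplicateRecordError(context, previous)
    store[md5] = context


store = {}
register(store, '8991c2dc67a49b909918477ee4efd767', FileContext('data.txt', 1))
try:
    # Same MD5 on a later line: the error can name both lines.
    register(store, '8991c2dc67a49b909918477ee4efd767', FileContext('data.txt', 7))
except DuplicateRecordError as err:
    clash = err
```

Note that in this sketch the exception is only created when a clash occurs; the stored values themselves are plain objects, never exceptions.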

In [43]: mpef._ustore._idstore['Measurement']._SIDstore
Out[43]: defaultdict(<function <lambda> at 0x2ece7d0>, {'emailRemoved': defaultdict(<function <lambda> at 0x2c4caa0>, {'microPhenoShew2011': defaultdict(<type 'dict'>, {0: {'MLR_124572462': '8991c2dc67a49b909918477ee4efd767', 'MLR_124572161': '7b38b429230f00fe4731e60419e92346', 'SMMLR_12551352': 'b53531471b261c44d52f651add647544', 'SMMLR_12551051': '0de96f928dc471b297f8a305e71ae3e1', 'SMMLR_12550750': '44ea6d949f7c8c8ac3bb4c0bf4943f82'}})})})
Also I think lambda functions might be able to keep the frame alive. Are they by any chance being created in a function that is called in a loop?

Here's the context for the lambdas:

  def __init__(self):
      self._SIDstore = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

So the lambda is only being called when a new key is added to the top 3 levels of the data structure, which in the test case I've been discussing only happens once each.
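A small sketch that checks this claim: the two lambdas are replaced by named functions (`level1` and `level2`, names made up here) that log each call, and the five IDs from the output above are inserted under the same keys. This is an illustration of how defaultdict invokes its factory, not the real class.

```python
from collections import defaultdict

factory_calls = []  # records every time a default factory runs


def level2():
    # Stands in for the inner lambda: defaultdict(dict)
    factory_calls.append('level2')
    return defaultdict(dict)


def level1():
    # Stands in for the outer lambda: defaultdict(lambda: defaultdict(dict))
    factory_calls.append('level1')
    return defaultdict(level2)


sid_store = defaultdict(level1)

records = [
    ('MLR_124572462', '8991c2dc67a49b909918477ee4efd767'),
    ('MLR_124572161', '7b38b429230f00fe4731e60419e92346'),
    ('SMMLR_12551352', 'b53531471b261c44d52f651add647544'),
    ('SMMLR_12551051', '0de96f928dc471b297f8a305e71ae3e1'),
    ('SMMLR_12550750', '44ea6d949f7c8c8ac3bb4c0bf4943f82'),
]
for sid, md5 in records:
    sid_store['emailRemoved']['microPhenoShew2011'][0][sid] = md5

print(factory_calls)
```

Each factory runs only on the first lookup of a missing key, so five inserts under the same keys trigger each one once.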

Although the suggestion to change the hex strings to ints is a good one and I'll do it, what I'm really trying to understand is why there's such a large difference between the memory use per top (and the fact that the code appears to thrash swap) and per heapy and my calculations of how much memory the code should be using.
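For reference, a rough sketch of how the two kinds of numbers could be put side by side. It compares the per-object size of one MD5 as a hex string and as an int using sys.getsizeof, and reads the peak resident set size that the OS reports for the process, which is the kind of figure top shows. The helper name `peak_rss` is made up for illustration. sys.getsizeof counts only the object itself, so allocator overhead and anything outside Python objects will not appear in it.

```python
import sys

md5_hex = '8991c2dc67a49b909918477ee4efd767'
md5_int = int(md5_hex, 16)  # same 128-bit value, stored as an integer

hex_size = sys.getsizeof(md5_hex)
int_size = sys.getsizeof(md5_int)


def peak_rss():
    """Peak resident set size as reported by the OS, or None if unavailable.

    The unit is platform dependent (kilobytes on Linux, bytes on Mac OS X).
    """
    try:
        import resource  # Unix only
    except ImportError:
        return None
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


print("hex string: %d bytes, int: %d bytes" % (hex_size, int_size))
print("peak RSS reported by the OS: %r" % peak_rss())
```

The exact byte counts depend on the Python version and build, so they are not quoted here.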


Cheers, MrsEntity

-- 
http://mail.python.org/mailman/listinfo/python-list
