Christophe Combelles wrote:
Hello,

What should I do to have a data structure that scales well memory-wise?

Consider the following large btree:

$ ./debugzope

    >>> from BTrees.OOBTree import OOBTree
    >>> root['btree']=OOBTree()
    >>> for i in xrange(700000):
    ...   root['btree'][i] = tuple(range(i,i+30))
    ...
    >>> import transaction
    >>> transaction.commit()

Quit and restart ./debugzope

Now I just want to know if some value is in the btree:

    >>> 'value' in root['btree'].values()


OK, the story could be called "ZODB is great, but be careful what you do with persistence". There are three solutions to this problem: an ugly one, a workaround, and the correct one. I found the ugly one; thanks to Dennis and Chris for pointing out the workaround and the correct one.

The whole btree is loaded into memory, even when I do a simple loop such as:

    >>> for i in root['btree']:
    ...     pass

(It's the same with items(), iteritems(), values(), and itervalues().)


1) First, the *ugly* one: I abort the transaction every N iterations:

    >>> import transaction
    >>> a=0
    >>> for i in root['btree']:
    ...     a+=1
    ...     if not a % 5000:
    ...         transaction.abort()
    ...

That works, but it's definitely not the right thing to do: I suspect that by aborting the transaction in the middle of the read, someone else might be able to modify the btree before I've finished reading it. (ZODB experts, please confirm.)


2) Now a good *workaround* (which I will eventually use, because it's too late for me to change the data structure of my app, and it happens to be the fastest solution). It's almost the same, except that instead of aborting the transaction, we periodically minimize the cache of the ZODB connection:

    >>> a=0
    >>> for i in root['btree']:
    ...     a+=1
    ...     if not a % 5000:
    ...         root['btree']._p_jar.cacheMinimize()
    ...

This way, the maximum memory used corresponds to 5000 tuples.


3) The *correct* solution is to store real persistent objects in the btree
(i.e. objects that derive from persistent.Persistent).
That works, and eats practically no memory, but it's slower than tuples.
Non-persistent tuples still get persisted because they are part of a persistent object, but they are treated as an integral part of the btree's own records rather than as individual, separately loadable persistent objects.
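
For illustration, here is roughly what that looks like; the Entry class name is just an example, any class deriving from persistent.Persistent will do (and in real code it would live in an importable module of your application, not in the interpreter):

    >>> from persistent import Persistent
    >>> class Entry(Persistent):
    ...     def __init__(self, values):
    ...         self.values = values
    ...
    >>> from BTrees.OOBTree import OOBTree
    >>> root['btree'] = OOBTree()
    >>> for i in xrange(700000):
    ...     root['btree'][i] = Entry(tuple(range(i, i + 30)))
    ...
    >>> import transaction
    >>> transaction.commit()

Each Entry then has its own record in the database: iterating over the btree only loads the small bucket records plus whichever Entry objects you actually touch, and the connection cache can evict them individually.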

That's my understanding; however, it does not really explain why looping over non-persistent objects in a btree should necessarily load everything into memory.

And what about IIBTrees? (integers are not persistent by themselves)


Christophe



or compute the length:

    >>> len(root['btree'])

(I'm already using some separate lazy bookkeeping for the length, but even if len() is time-consuming for a btree, it should at least be possible from a memory point of view.)
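
For reference, one way to do that lazy bookkeeping is with BTrees.Length.Length, a small persistent counter with conflict resolution; the 'btree_length' key below is just an example name:

    >>> from BTrees.Length import Length
    >>> root['btree_length'] = Length(0)         # counter stored next to the btree
    >>> key = 700000
    >>> if key not in root['btree']:
    ...     root['btree'][key] = tuple(range(key, key + 30))
    ...     root['btree_length'].change(1)       # keep the counter in sync
    ...
    >>> root['btree_length']()                   # current count, no iteration needed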

This loads the whole btree into memory (~500MB), and that memory never gets released! If the btree grows (to >2GB), how will I be able to use it at all?

I've tried to scan the btree in slices with root['btree'].itervalues(min, max), doing transaction.abort()/commit()/savepoint()/whatever between the slices. But every slice I scan allocates yet more memory, and once the whole btree has been scanned slice by slice, it's as if the whole btree were in memory.
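
For concreteness, the kind of slice scan I tried looks roughly like this (the chunk size and the assumption that keys are consecutive integers are specific to this example):

    >>> import transaction
    >>> tree = root['btree']
    >>> chunk = 5000
    >>> for start in xrange(0, 700000, chunk):
    ...     for value in tree.itervalues(min=start, max=start + chunk - 1):
    ...         pass                             # process one slice
    ...     transaction.abort()                  # also tried commit()/savepoint() here
    ...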

I've also tried with lists; the result is the same, except the memory gets eaten even more quickly.

What I understand is that the ZODB wakes up everything, and the memory allocator of Python (2.4) never releases the memory. Is there a solution, or something I missed in the API of the ZODB, BTrees, or Python itself?

thanks,
Christophe



