[Chris Withers]
> This is with whatever ZODB ships with Zope 2.8.5...

Do:

    import ZODB
    print ZODB.__version__

to find out.

> I have a Stepper (zopectl run on steroids) job that deals with lots of
> big objects.

Can you quantify this?

> After processing each one, Stepper does a transaction.get().commit().

Note that "transaction.commit()" is a shortcut spelling.

> I thought this was enough to keep the object cache at a sane size,

It does not do cacheMinimize().  It tries to reduce the memory cache to the
target number of objects specified for that cache, which is not at all the
same as cache minimization (which latter shoots for a target size of 0).
Whether that's "sane" or not depends on the product of:

    the cache's target number of objects

times:

    "the average" byte size of an object

ZODB has no say of its own about either of those.

> however the job kept bombing out with MemoryErrors, and sure enough it
> was using 2 or 3 gigs of memory when that happened.
>
> I fiddled about with the gc module and found that, sure enough, object
> were being kept in memory. At a guess, I inserted something close to the
> following:
>
> obj._p_jar.db().cacheMinimize()
>
> ...after each 5,000 objects were processed (there are 60,000 objects in
> total)
>
> Lo and behold, memory usage became sane.
>
> Why is this step necessary? I thought transaction.get().commit() every so
> often was enough to sort out the cache...

See above.  For most people it works OK.  If `cn` is the Connection, then

    cn._cache.cache_size

is the target number of non-ghost objects

    cn._cache.ringlen()

is the current number of non-ghost objects

At a transaction boundary, the cache gc method run tries to make
ringlen() <= cache_size, and that's all.

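
The workaround described in the quoted text (commit after every object,
minimize the cache every 5,000 objects) could be sketched roughly as
below.  This is only an illustration, not Stepper's code: the name
process_in_batches and its parameters are made up here, and the commit
and minimize steps are passed in as plain callables so the sketch runs
without ZODB installed.

```python
def process_in_batches(objects, process, commit, minimize, every=5000):
    """Process each object, commit after each one, and call `minimize`
    after every `every` objects.

    All names here are hypothetical; `commit` and `minimize` are
    whatever callables the caller supplies.
    Returns the number of times `minimize` was called.
    """
    if every < 1:
        raise ValueError("every must be >= 1")
    minimized = 0
    for count, obj in enumerate(objects, 1):
        process(obj)
        # A commit only trims the cache toward its target object count.
        commit()
        if count % every == 0:
            # Minimizing shoots for a target of zero non-ghost objects.
            minimize()
            minimized += 1
    return minimized
```

With a real database, `commit` would presumably be transaction.commit and
`minimize` would be the obj._p_jar.db().cacheMinimize call from the quoted
text.  For 60,000 objects and every=5000, minimize gets called 12 times.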
For example, using all defaults:

>>> ZODB.__version__  # probably the version you're using
'3.4.2'

This loads a million-element OOBTree (the construction of which I won't
show here):

>>> len(t)
1000000

The number of non-ghost objects is then approximately 1e6/15 (the number of
leaf-node OOBuckets in that tree; there are more than that because of
non-leaf interior OOBTree nodes, but the leaf nodes account for the bulk of
it):

>>> cn._cache.cache_size, cn._cache.ringlen()
(400, 67067)

At a transaction boundary, a cache gc pass is run to try to reduce the
number of non-ghost objects to cache_size:

>>> transaction.commit()
>>> cn._cache.cache_size, cn._cache.ringlen()
(400, 400)

So it booted 67067 - 400 = 66667 non-ghost objects.
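
To make the difference between the two operations concrete, here is a toy
model that replays the numbers from the session above.  It is emphatically
not ZODB's implementation: ToyCache, incremental_gc, minimize, and
estimated_cache_bytes are names invented for this sketch, and the model
assumes every non-ghost object can be evicted.

```python
class ToyCache(object):
    """Toy stand-in for a connection's object cache (NOT ZODB code)."""

    def __init__(self, cache_size=400):
        self.cache_size = cache_size  # target number of non-ghost objects
        self._non_ghosts = 0          # current number of non-ghost objects

    def load(self, n):
        """Pretend n more objects were loaded from the database."""
        self._non_ghosts += n

    def ringlen(self):
        return self._non_ghosts

    def incremental_gc(self):
        """Transaction-boundary pass: trim down to cache_size, no further."""
        evicted = max(0, self._non_ghosts - self.cache_size)
        self._non_ghosts -= evicted
        return evicted

    def minimize(self):
        """Minimization: shoot for zero non-ghost objects."""
        evicted = self._non_ghosts
        self._non_ghosts = 0
        return evicted


def estimated_cache_bytes(target_objects, average_object_bytes):
    """The product described earlier: target count times average size."""
    return target_objects * average_object_bytes


cache = ToyCache(cache_size=400)
cache.load(67067)
print(cache.incremental_gc(), cache.ringlen())  # 66667 400
print(cache.minimize(), cache.ringlen())        # 400 0

# Hypothetical figure: if objects averaged 5 MB each, a 400-object target
# would still allow roughly 2 GB of cached state.
print(estimated_cache_bytes(400, 5 * 1024 * 1024))  # 2097152000
```

The point of the last line is only that a small object-count target says
nothing about bytes; the 5 MB average is a made-up number, not a
measurement of anyone's data.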