On Sat, 2008-02-02 at 22:10 +0100, Dieter Maurer wrote:
> Roché Compaan wrote at 2008-2-1 21:17 +0200:
> >I have completed my first round of benchmarks on the ZODB and welcome
> >any criticism and advise. I summarised our earlier discussion and
> >additional findings in this blog entry:
> >http://www.upfrontsystems.co.za/Members/roche/where-im-calling-from/zodb-benchmarks
>
> In your insertion test: when do you do commits?
> One per insertion? Or one per n insertions (for which "n")?
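For concreteness, the shape of my insertion loop is roughly the
following stdlib-only sketch. `FakeStore` and its `commit()` are
stand-ins of my own invention for the OOBTree under the ZODB root and
`transaction.commit()`; the real test uses the ZODB API:

```python
# Stdlib-only sketch of the benchmark's commit batching.
# FakeStore/commit() are hypothetical stand-ins for an OOBTree
# plus transaction.commit() -- not the actual benchmark code.

class FakeStore(dict):
    """Dict standing in for the BTree; counts commits for illustration."""

    def __init__(self):
        super().__init__()
        self.commits = 0

    def commit(self):
        self.commits += 1


def insert_all(store, items, commit_interval=100):
    """Insert key/value pairs, committing every commit_interval inserts."""
    pending = 0
    for key, value in items:
        store[key] = value
        pending += 1
        if pending == commit_interval:
            store.commit()
            pending = 0
    if pending:
        store.commit()  # flush the trailing partial batch
```

With `commit_interval=100`, inserting 250 items issues three commits:
two full batches of 100 plus the trailing partial batch of 50.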
I have tried different commit intervals. The published results are for
a commit interval of 100, in other words 100 inserts per commit.

> Your profile looks very surprising:
>
> I would expect that for a single insertion, typically one persistent
> object (the bucket where the insertion takes place) is changed. About
> every 15 inserts, 3 objects are changed (the bucket is split); about
> every 15*125 inserts, 5 objects are changed (split of a bucket and
> its container). But the mean value of objects changed in a
> transaction is 20 in your profile. The changed objects typically have
> about 65 subobjects. This fits with "OOBucket"s.

It was very surprising to me too, since the insertion is so basic: I
simply assign a Persistent object with one string attribute of 1K in
size to a key in an OOBTree. I mentioned this earlier on the list, and
I thought Jim's explanation was sufficient when he said that the
persistent_id method is called for all objects, including simple types
like strings, ints, etc. I don't know if that explains all the calls
that add up to a mean value of 20, though. I guess the calls are being
made by the cPickle module, but I don't have the experience to
investigate this.

> Lookup times:
>
> 0.23 s would be 230 ms, not 23 ms.

Oops, my multiplier broke ;-)

> The reason for the dramatic drop from 10**6 to 10**7 cannot lie in
> the BTree implementation itself. Lookup time is proportional to the
> tree depth, which ideally would be O(log(n)). While BTrees are not
> necessarily balanced (and therefore the depth may be larger than
> logarithmic), it is not easy to obtain a severely unbalanced tree by
> insertions only. Other factors must have contributed to this drop:
> swapping, cache too small, garbage collections...

The cache size was set to 100000 objects, so I doubt that this was the
cause.
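As a sanity check on Jim's explanation above, the effect is easy to
reproduce with a quick experiment of my own: subclass the stdlib
pickler (here Python's pickle, standing in for cPickle, which ZODB
drives through a persistent_id hook) and count how often persistent_id
is consulted while pickling one bucket-sized mapping:

```python
# Count persistent_id calls while pickling a mapping shaped roughly
# like a full OOBucket: 65 keys, each value a 1K string.
# CountingPickler is my own illustration, not ZODB's serializer.
import io
import pickle


class CountingPickler(pickle.Pickler):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.calls = 0

    def persistent_id(self, obj):
        self.calls += 1
        return None  # None means "no persistent id -- pickle it inline"


size = 1024  # computed at runtime so the 65 values are distinct objects
bucket = {"key%d" % i: "x" * size for i in range(65)}

buf = io.BytesIO()
pickler = CountingPickler(buf, protocol=2)
pickler.dump(bucket)
print(pickler.calls)  # consulted for the dict and each key and value
```

The hook is consulted for the container and every key and value, i.e.
well over a hundred times for a single bucket, which shows how the
per-subobject cost Jim described adds up even for "simple" inserts.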
I do the lookup test right after I populate the BTree, so it might be
that the cache and memory are full, but I take care to commit after the
BTree is populated, so even this is unlikely. The keys that I look up
are completely random, so it is probably the case that the lookups hit
the disk all the time. If this is the case, is 230 ms not still too
slow?

> Furthermore, the lookup times for your smaller BTrees are far too
> good -- fetching any object from disk takes in the order of several
> ms (2 to 20, depending on your disk). This means that the lookups for
> your smaller BTrees have typically been served directly from the
> cache (no disk lookups). With your large BTree disk lookups probably
> became necessary.

I accept that these lookups are all served from cache. I am going to
modify the lookup test so that I close the database after population
and re-open it when starting the test, to make sure nothing is cached,
and see what the results look like.

Thanks for your insightful comments!

--
Roché Compaan
Upfront Systems                   http://www.upfrontsystems.co.za

_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/
ZODB-Dev mailing list - ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev