On Fri, 07 Nov 2003 12:02:07 +0000
Seb Bacon <[EMAIL PROTECTED]> wrote:

> Casey Duncan wrote:
> > On Thu, 06 Nov 2003 19:11:55 +0000
> > Seb Bacon <[EMAIL PROTECTED]> wrote:
> >>A simple query for ["A" or "B" or "C"] against a KeywordIndex containing 
> >>27k objects is taking about 7 seconds on a Celeron 1.6Ghz, which seems 
> >>an absurdly long time to me.
> > 
> > <guess>
> > This time may be caused by fetching from the database. If so, then the
> > only way to speed it up is increase the ZODB cache or get faster disks.
> > Try the former and see if it helps. </guess>
> 
> Yup, absolutely right.  Upping the cache speeds it up to something sane. 
>   However, I don't understand why.  The code does something like:
> 
>   set1 = self.index.get(1)
>   set2 = self.index.get(2)
>   sets = [set1, set2]
> 
> ...so the sets will have come from the ZODB.  But the bit which takes 
> the time is the following line:

These are most likely TreeSets. Getting one from the index only loads its top-level 
object. The actual members of the set are stored in separate persistent objects, so 
that large sets can be fetched in chunks rather than all at once.

The ZODB tries to be lazy about fetching objects. If an object is very large, it often 
makes sense to split it up between many persistent objects so that each part can be 
loaded and unloaded separately. This is what BTrees and TreeSets do. When you fetch a 
value from a BTree or test for an element in a set, it only needs to load the part (in 
this case a Bucket) that contains the element, rather than the whole enchilada, which 
could be huge (in reality it's not quite that simple, but that's the general idea).

This is why BTrees and TreeSets are used when you need to store and manipulate 
arbitrarily large numbers of elements.
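
To make the bucket idea concrete, here is a rough toy model in plain Python. It is 
not the real BTrees code: ToyBucketedSet, its bucket_size parameter and the loads 
counter are all made up for illustration, and a real TreeSet has more levels than 
this. It only shows that a membership test touches one bucket while a full iteration 
touches all of them.

```python
import bisect


class ToyBucketedSet:
    """Toy stand-in for a TreeSet: sorted keys split across fixed-size buckets.

    Hypothetical illustration only; this is not how BTrees is implemented.
    """

    def __init__(self, keys, bucket_size=4):
        keys = sorted(set(keys))
        # "On disk" state: each bucket stands for a separate persistent object.
        self._stored = [keys[i:i + bucket_size]
                        for i in range(0, len(keys), bucket_size)]
        # The small top-level object only knows the first key of each bucket.
        self._first_keys = [bucket[0] for bucket in self._stored]
        self._loaded = {}  # bucket index -> keys, playing the role of the cache
        self.loads = 0     # number of simulated database fetches

    def _bucket(self, i):
        # Fetch a bucket on first use only; later uses hit the "cache".
        if i not in self._loaded:
            self.loads += 1
            self._loaded[i] = self._stored[i]
        return self._loaded[i]

    def __contains__(self, key):
        # Find the single bucket that could hold the key and load just that one.
        i = bisect.bisect_right(self._first_keys, key) - 1
        return i >= 0 and key in self._bucket(i)

    def __iter__(self):
        # Walking the whole set forces every bucket to be loaded.
        for i in range(len(self._stored)):
            yield from self._bucket(i)


s = ToyBucketedSet(range(100), bucket_size=10)

found = 42 in s
loads_after_lookup = s.loads      # only the bucket holding 42 was fetched

total = sum(1 for _ in s)
loads_after_iteration = s.loads   # now every bucket has been fetched
```

After the lookup only one of the ten buckets has been loaded. After the iteration all 
ten have, which is the situation a union over whole sets ends up in.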

>   result = multiunion(sets)

This iterates over the sets, which loads all of their buckets from the database if 
they are not already in the cache.
 
> At which point the sets have already been fetched, no?
> 
> looking forward to the day I understand ZODB caches...;-)

Actually this is not so much a function of the cache but a function of the 
organization of the set objects themselves.

-Casey

_______________________________________________
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope )
