Hi Gary,

Thanks for your comprehensive answer. Yes, my extents aren't really as small as in my examples. Seems like a reasonable idea to wait with optimizations, not sure they are even needed, at least not within a year or so :)

Cheers

On 10/28/07, Gary Poster <[EMAIL PROTECTED]> wrote:
>
> Hi Jesper.
>
> Extents have a primary use case in the zc.catalog package of defining
> the extent of a catalog--a set of indexes. This is more efficient
> both in terms of programmer time and computer time than filtering out
> objects per-index. It also allows asking indexes questions that would
> otherwise be impossible, e.g., "what objects do *not* match this
> particular search?", and a couple of others.
>
> I'm not sure hurry.query leverages all aspects of extents, and indexes
> that know how to deal with them. I seem to recall that it didn't, but
> I could have been wrong and it was a while ago.
>
> So, the primary use case is different than yours.
>
> Extents can be used in the way that you describe--intersecting against
> a larger search of a larger catalog. What you described is a
> reasonable first cut, and a reasonable use of extents.
>
> Depending on your use cases and the time available, you may want to
> explore optimizations. I wouldn't be surprised if you eventually wanted
> to roll your own catalog to do the set operations in the ways that
> make the most sense for your application. A few quick thoughts:
>
> - If your common extents are really as small as in your examples, one
> thing to realize is that the time for an intersection in BTree code
> pretty much always is determined by the size of the smaller set.
> Therefore, given three sets that need to be intersected (say, your
> extent and the result of the search of two indexes) of relative sizes
> Small, Medium, and Large, you want to intersect in this way:
> intersect(intersect(Small, Medium), Large). See
> http://svn.zope.org/zc.relation/trunk/src/zc/relation/timeit/manual_intersection.py?view=auto
> for timeit fun, if you like.
>
> - there are two primary costs of a big catalog, IMO/IME: write time
> and load time. If necessary for your app, consider ways to try to
> keep smaller catalogs (e.g., does the value of some information
> diminish over time? Does it make sense to have separate catalogs,
> divided across some boundary or boundaries?); and consider ways to
> keep the catalog in memory (in the object cache).
>
> - if you typically only need the first X of a result set, doing
> something like Dieter Maurer's incremental search Zope 2 code would be
> interesting to research and might be appreciated by the community if
> it worked out well.
>
> Finally IMO/IME, only pursue these sometimes risky optimizations if
> they are really necessary and if you have some pretty concrete
> research or knowledge (your own or others') to back up your plan. If I
> were you I'd just start out with the "do a search and then intersect
> with the extent" approach you mentioned, and only worry about it more
> when your app needs it.
>
> HTH
>
> Gary
>
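
The smallest-first ordering described in the quoted message, intersect(intersect(Small, Medium), Large), can be sketched roughly as below. This is only an illustration under stated assumptions: it uses Python's built-in sets as a stand-in for BTrees sets, the helper name intersect_smallest_first is made up for this example, and the sample data (extent, index_a, index_b) is invented. With real BTrees sets one would presumably swap the `&` for the matching module-level function, such as BTrees.IFBTree.intersection.

```python
from functools import reduce


def intersect_smallest_first(*sets):
    """Intersect any number of sets, starting with the smallest.

    Hypothetical helper: built-in sets stand in for BTrees sets here.
    Sorting by size keeps every intermediate result no larger than the
    smallest input, so the large sets are only probed, never copied.
    """
    if not sets:
        return set()
    ordered = sorted(sets, key=len)
    # Start from a copy of the smallest set and fold the rest in,
    # from next-smallest to largest.
    return reduce(lambda acc, s: acc & s, ordered[1:], set(ordered[0]))


# Invented sample data: a small extent and two larger index results.
extent = set(range(0, 1000, 100))      # Small: 10 ids
index_a = set(range(0, 100_000, 2))    # Medium: 50,000 ids
index_b = set(range(0, 200_000))       # Large: 200,000 ids

# Argument order does not matter; the helper reorders by size.
result = intersect_smallest_first(index_b, extent, index_a)
```

Sorting by len() is cheap for built-in sets; for persistent BTrees sets, len() can itself require loading buckets, so whether an automatic sort pays off there is something to measure rather than assume.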
_______________________________________________
Zope3-users mailing list
Zope3-users@zope.org
http://mail.zope.org/mailman/listinfo/zope3-users