Re: [Zope3-dev] Re: Zope3 Standalone Page Templates

2007-03-26 Thread Wichert Akkerman
Previously Duncan McGreggor wrote:
 On 3/13/07, Tres Seaver [EMAIL PROTECTED] wrote:
 I still get a crazy amount of packages installed when I do this (see
 below). Either I'm doing something wrong (any ideas?) or I've got a
 very different definition of stand-alone ;-)

Personally I, and from what I can see most people, use SimpleTAL
(http://www.owlfish.com/software/simpleTAL/) if I need to use TAL
outside of Zope. It's very small and has no dependencies.

Wichert.

-- 
Wichert Akkerman [EMAIL PROTECTED]    It is simple to make things.
http://www.wiggy.net/                   It is hard to make things simple.
___
Zope3-dev mailing list
Zope3-dev@zope.org
Unsub: http://mail.zope.org/mailman/options/zope3-dev/archive%40mail-archive.com



Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-26 Thread Jim Fulton


On Mar 26, 2007, at 3:28 PM, Dieter Maurer wrote:


Jim Fulton wrote at 2007-3-25 09:53 -0400:


On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote:

MF I think one of the main limitations of the current catalog (and
MF hurry.query) is efficient support for sorting and batching the query
MF results. The Zope 3 catalog returns all matching results, which can then
MF be sorted and batched. This will stop being scalable for large
MF collections. A relational database is able to do this internally, and is
MF potentially able to use optimizations there.


What evidence do you have to support this assertion?  We did some
literature search on this a few years ago and found no special trick
to avoid sorting costs.

I know of 2 approaches to reducing sort cost:

1. Sort your results based on the primary key and therefore, pick
your primary key to match your sort results.  In terms of the Zope
catalog framework, the primary keys are the document IDs, which are
traditionally chosen randomly.  You can pick your primary keys based
on a desired sort order instead. A variation on this theme is to use
multiple sets of document ids,  storing multiple sets of ids in each
index.  Of course, this approach doesn't help with something like
relevance ranks.

2. Use an N-best algorithm.  If N is the size of the batch and M is
the corpus size, then this is O(M*ln(N)) rather than O(M*ln(M)) which
is a significant improvement if N << M, but still quite expensive.
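
A minimal sketch of the N-best idea from point 2, in plain Python rather than the actual catalog code. The names n_best, sort_keys and matches are hypothetical and only for illustration; the standard library's heapq.nsmallest stands in for whatever N-best selection a catalog would use. Note that it still has to look up one sort key per matching document, so the linear cost over M remains.

```python
import heapq
import random


def n_best(doc_ids, sort_key, n):
    """Return the n document ids with the smallest sort keys, in sorted order.

    heapq.nsmallest keeps at most n candidates at a time, so the work is
    roughly O(M * log(n)) for M ids instead of the O(M * log(M)) of a full
    sort. Every id is still visited once to fetch its sort key.
    """
    return heapq.nsmallest(n, doc_ids, key=sort_key)


# Hypothetical data: document id -> sort key (assumption, not catalog data).
random.seed(0)
sort_keys = {doc_id: random.random() for doc_id in range(10000)}
matches = list(sort_keys)  # pretend every document matched the query

# First batch of 20 results via N-best selection.
first_batch = n_best(matches, sort_keys.__getitem__, 20)

# Reference result: sort everything, then slice off the first batch.
full_sort = sorted(matches, key=sort_keys.__getitem__)[:20]
```

Both computations give the same first batch; the difference is only in how much ordering work is done on the documents that never make it into the batch.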


The major costs in sorting are usually not the log(n) but
the very high linear costs fetching the sort keys (although for huge n,
we will reach the asymptotic limits).


Right. The problem is the N not the log(N). :)


Under normal conditions, a relational database can be far more efficient
at fetching values, either from index structures or from the data records,
than Zope -- as
  * its data representation is much more compact

  * it often supports direct access

  * the server itself can access and process all data.


With the ZODB, the data is hidden in pickles (less compact), there is
no direct access (instead the complete pickle needs to be decoded)


The catalog sort index mechanism uses the un-index data structure in  
the sort index to get sort keys. This is a pretty compact data  
structure.



and all operations are done in the client (rather than in the server).


Which is often fine if the desired data are in the client cache.  It  
avoids making the storage a bottleneck.


Jim

--
Jim Fulton           mailto:[EMAIL PROTECTED]    Python Powered!
CTO                  (540) 361-1714              http://www.python.org
Zope Corporation     http://www.zope.com         http://www.zope.org


