Re: [Zope-dev] ZCatalog caching with memcached

2008-10-31 Thread Chris Withers
Hedley Roos wrote:
> Since memcached is distributed only a single Zope client needs to
> perform that query and the result is available to all other Zope
> clients.

This is where you'll get the big win: there's no need to load all the
catalog-related objects into the ZODB cache on every client, which has
the twin drawbacks of being work that has to be done and of trashing
your ZODB cache...

> And the cache is persistent as long as memcached runs, so
> you can merrily restart Zope instances and have a warm cache. I didn't
> even realise this until Roche pointed it out to me.

Coool :-)
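
To make the point about sharing concrete, here is a minimal sketch, not
the actual catalogcache code. It assumes a dict-backed FakeMemcached
class standing in for a real memcached client, and the names query_key,
CachingSearcher and slow_search are all made up for illustration. It
only shows that one client pays for the query while the others, and any
restarted client, read the cached record ids.

```python
import hashlib
import pickle


class FakeMemcached:
    """Dict-backed stand-in for a memcached client (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def query_key(catalog_id, query):
    """Build a stable key from the catalog id and the query mapping."""
    canonical = repr(sorted(query.items()))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "%s:%s" % (catalog_id, digest)


class CachingSearcher:
    """Represents one Zope client; every client shares the same cache."""

    def __init__(self, cache, search):
        self.cache = cache
        self.search = search  # the expensive, uncached catalog search
        self.misses = 0

    def search_rids(self, catalog_id, query):
        key = query_key(catalog_id, query)
        cached = self.cache.get(key)
        if cached is not None:
            return pickle.loads(cached)
        self.misses += 1
        rids = sorted(self.search(query))  # plain integers, not brains
        self.cache.set(key, pickle.dumps(rids))
        return rids


def slow_search(query):
    # Stands in for a real catalog search returning record ids.
    return {3, 1, 2}


shared = FakeMemcached()
query = {"portal_type": "Document", "review_state": "published"}

client_a = CachingSearcher(shared, slow_search)
client_b = CachingSearcher(shared, slow_search)
first = client_a.search_rids("portal_catalog", query)   # computed once
second = client_b.search_rids("portal_catalog", query)  # read from the cache

# A "restarted" client starts warm because the cache outlived it.
restarted = CachingSearcher(shared, slow_search)
third = restarted.search_rids("portal_catalog", query)
```

Only record ids are stored, which is in line with caching result sets
rather than brains; invalidation is deliberately left out of the sketch.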

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk
___
Zope-Dev maillist  -  Zope-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope )


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-27 Thread Roché Compaan
On Sun, 2008-10-26 at 14:07 -0400, Tres Seaver wrote:
> Before (or while) we work on caching, can we try to improve the
> underlying indexes, and the way that applications use them?  I'm pretty
> sure that there is a lot of room for improvement:
>
>  - Plone uses too many indexes, and in particular, uses multiple text
>    indexes.  Having extra indexes around just in case is a sure lose
>    at write time, and may even be expensive at query time (depending on
>    the query).
>
>  - Particular indexes have performance characteristics based on their
>    designed purpose:  for instance, the stock FieldIndex implementation
>    assumes that the number of documents indexed will be greater than the
>    number of discrete indexable values.  Using such an index in an
>    application domain with a very large set of indexable values probably
>    loses, and in ways which don't show up in early / small-scale testing.
>
>  - I'm pretty sure that we haven't yet found the best data structure for
>    hierarchy indexes (e.g., the Plone EPI index, or the stock Zope2
>    PathIndex, etc.).  Something like a 'trie' might be optimal for
>    pure prefix searching of hierarchies.
>
>  - I am confident that the TopicIndex is underutilized:  it does *all*
>    the work for a given query at write time, and can thus be blindingly
>    fast at query time.
>
>  - Other special-purpose indexes (e.g., a recent items index) would
>    be worth a look, especially for applications with large volumes of
>    content.

I agree that one should look at improving performance without caching as
well. But this is a lot harder and takes significantly more development
and debugging time than introducing some form of caching. So I'm not
convinced that it needs to happen in a certain order. If caching gives
you lots of performance with little effort now, then why shouldn't you
use it?

-- 
Roché Compaan
Upfront Systems   http://www.upfrontsystems.co.za



Re: [Zope-dev] ZCatalog caching with memcached

2008-10-27 Thread Jens Vagelpohl


On Oct 27, 2008, at 13:08 , Roché Compaan wrote:

> I agree that one should look at improving performance without caching
> as well. But this is a lot harder and takes significantly more
> development and debugging time than introducing some form of caching.
> So I'm not convinced that it needs to happen in a certain order. If
> caching gives you lots of performance with little effort now, then why
> shouldn't you use it?

It's the typical trade-off. One course is expedient and fast for your
use case now. The other requires more resources, but benefits
everyone, including those who don't want to depend on yet another
package, like memcached, for performance.

When it comes to integrating anything in Zope itself I'd choose the  
latter.

jens




Re: [Zope-dev] ZCatalog caching with memcached

2008-10-27 Thread Roché Compaan
On Mon, 2008-10-27 at 13:23 +0100, Jens Vagelpohl wrote:
> It's the typical trade-off. One course is expedient and fast for your
> use case now. The other requires more resources, but benefits
> everyone. Including those who don't want to depend on yet another
> package, like memcached, for performance.

I'm not tied to memcached. We started out using module level caches like
zope.cache.ram but that has obvious problems when using ZEO, since every
ZEO client process keeps its own separate copy of the cache.

> When it comes to integrating anything in Zope itself I'd choose the
> latter.

Sure, we're not trying to get this into Zope, we're just sharing our
experience and exploring the territory so that one can produce a third
party package that really helps people with the same use case (which I
suspect is quite a common one).

-- 
Roché Compaan
Upfront Systems   http://www.upfrontsystems.co.za



Re: [Zope-dev] ZCatalog caching with memcached

2008-10-27 Thread Jens Vagelpohl


On Oct 27, 2008, at 13:32 , Roché Compaan wrote:

> On Mon, 2008-10-27 at 13:23 +0100, Jens Vagelpohl wrote:
>> When it comes to integrating anything in Zope itself I'd choose the
>> latter.
>
> Sure, we're not trying to get this into Zope, we're just sharing our
> experience and exploring the territory so that one can produce a third
> party package that really helps people with the same use case (which I
> suspect is quite a common one).

Right, it's perfectly valid to create such a third party package. The  
discussion just highlights a greater issue. Personally, I don't think  
it's good practice to focus on the expediency of working around a  
problem as opposed to tackling the problem directly. The Zope world is  
littered with add-ons that act as band-aids on real or perceived  
shortcomings in Zope itself.

jens




Re: [Zope-dev] ZCatalog caching with memcached

2008-10-27 Thread Roché Compaan
On Mon, 2008-10-27 at 13:41 +0100, Jens Vagelpohl wrote:
> Right, it's perfectly valid to create such a third party package. The
> discussion just highlights a greater issue. Personally, I don't think
> it's good practice to focus on the expediency of working around a
> problem as opposed to tackling the problem directly. The Zope world is
> littered with add-ons that act as band-aids on real or perceived
> shortcomings in Zope itself.

Improving the performance of indexes is really, really hard. In this case
I really don't think caching is a band-aid; it is a good solution. Even
with optimised indexes, you will find that you need caching to get
reasonable performance if you have a catalog with close to a million or
more documents indexed. Given a large enough catalog, I would argue that
caching is as necessary as having a large cache for a ZEO client.

But caches expire and results get invalidated, and therefore we should
continue to optimise indexes. With some help we should be able to
contribute at this level too.

-- 
Roché Compaan
Upfront Systems   http://www.upfrontsystems.co.za



Re: [Zope-dev] ZCatalog caching with memcached

2008-10-27 Thread Rudá Porto Filgueiras
On Mon, Oct 27, 2008 at 12:33 PM, Andreas Jung [EMAIL PROTECTED] wrote:
> On 27.10.2008 16:28 Uhr, Rudá Porto Filgueiras wrote:
>
>> I will suggest a package called zope.memcached (like zope.sqlalchemy
>> does for SQLAlchemy integration).
>> That way any application that needs to talk to memcached can do so
>> without losing atomicity.
>
> I don't see a particular reason for creating a new package here.
> Extend lovely.memcached and you're done. There is not much need for
> scattering tiny functionalities into two modules here. The module world
> is already complicated enough.

If lovely.memcached is already safe when an exception is raised,
discard my suggestion.
Is it also compatible with Zope 2?

> Andreas




-- 
=
Rudá Porto Filgueiras
Weimar Consultoria

http://python-blog.blogspot.com

Plone, Django, Zope 3, Grok... hosting
http://www.pytown.com
=


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-27 Thread Andreas Jung

On 27.10.2008 17:18 Uhr, Rudá Porto Filgueiras wrote:

> If lovely.memcached is already safe when an exception is raised,
> discard my suggestion.
> Is it also compatible with Zope 2?


We are using it together with our cache tool I mentioned earlier
with Zope 2.8.1.

Andreas


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-26 Thread Andreas Jung

On 26.10.2008 18:43 Uhr, Roché Compaan wrote:

> On Sat, 2008-10-25 at 09:20 +0200, Hedley Roos wrote:
>
> I suspect specific indexes are just performing suboptimally and need to
> be improved. ExtendedPathIndex in Plone seems to be one of them.


Path indexes and fulltext indexes have a much more complicated 
implementation compared to field or keyword indexes.


Andreas


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-26 Thread Fabio Rizzo Matos
Hi Roché,

Can I see your funkload profile?

-- 
Fábio Rizzo Matos
ThreePointsWeb
[EMAIL PROTECTED]
http://www.threepointsweb.com
+55 61 3202-6480

Python, Zope and Plone from people who know the subject!


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-26 Thread Roché Compaan
Hi Fabio

The funkload tests were project specific. I plan to write up my findings
and to do benchmarks on a standard Plone instance and blog about it.
This will unfortunately have to wait since I'm on holiday this week :-)

-- 
Roché Compaan
Upfront Systems   http://www.upfrontsystems.co.za



Re: [Zope-dev] ZCatalog caching with memcached

2008-10-26 Thread Roché Compaan
On Sun, 2008-10-26 at 18:50 +0100, Andreas Jung wrote:
> On 26.10.2008 18:43 Uhr, Roché Compaan wrote:
>> On Sat, 2008-10-25 at 09:20 +0200, Hedley Roos wrote:
>>
>> I suspect specific indexes are just performing suboptimally and need to
>> be improved. ExtendedPathIndex in Plone seems to be one of them.
>
> Path indexes and fulltext indexes have a much more complicated
> implementation compared to field or keyword indexes.

I know, and this alone makes a good argument for caching at catalog
level. In our case we used membrane, which makes an excessive number of
catalog queries when looking up users, so some level of caching was
essential.

-- 
Roché Compaan
Upfront Systems   http://www.upfrontsystems.co.za



Re: [Zope-dev] ZCatalog caching with memcached

2008-10-26 Thread Tres Seaver

Roché Compaan wrote:
> On Sat, 2008-10-25 at 09:20 +0200, Hedley Roos wrote:
>>> Have you measured the time needed for some standard ZCatalog queries
>>> used with a Plone site, including the communication overhead with
>>> memcached? Generally speaking, I think the ZCatalog is fast. Queries
>>> are known to be more expensive if they use a fulltext index, or if you
>>> have to deal with large result sets or complex queries.
>>
>> No, I haven't. Roche Compaan has done extensive benchmarking using
>> funkload, testing plain catalog vs module level cache vs memcached, but
>> the tests are more about page serving than catalog query time. I'll
>> ask him to comment more on that.
>
> I actually did some profiling as well and catalog searches were just too
> damn slow. The average execution time for searchResults was 100
> milliseconds and this is why I told Hedley we should do some caching at
> query level in the first place. I experimented with this idea a couple
> of years back but wasn't successful due to inexperience. I was trying to
> cache brains, which obviously leads to persistency bugs. This time around
> it was obvious to me that we should cache the IISet result sets.
>
> I suspect specific indexes are just performing suboptimally and need to
> be improved. ExtendedPathIndex in Plone seems to be one of them.
>
> The effect on performance is really awesome, now we just need to fine
> tune the implementation.

Before (or while) we work on caching, can we try to improve the
underlying indexes, and the way that applications use them?  I'm pretty
sure that there is a lot of room for improvement:

 - Plone uses too many indexes, and in particular, uses multiple text
   indexes.  Having extra indexes around just in case is a sure lose
   at write time, and may even be expensive at query time (depending on
   the query).

 - Particular indexes have performance characteristics based on their
   designed purpose:  for instance, the stock FieldIndex implementation
   assumes that the number of documents indexed will be greater than the number of
   discrete indexable values.  Using such an index in an application
   domain with a very large set of indexable values probably loses, and
   in ways which don't show up in early / small-scale testing.

 - I'm pretty sure that we haven't yet found the best data structure for
   hierarchy indexes (e.g., the Plone EPI index, or the stock Zope2
   PathIndex, etc.).  Something like a 'trie' might be optimal for
   pure prefix searching of hierarchies.

 - I am confident that the TopicIndex is underutilized:  it does *all*
   the work for a given query at write time, and can thus be blindingly
   fast at query time.

 - Other special-purpose indexes (e.g., a recent items index) would
   be worth a look, especially for applications with large volumes of
   content.
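
To illustrate the 'trie' idea for hierarchy indexes, here is a rough
sketch. It is not how PathIndex or the Plone EPI index work today; the
class name PathTrie and its methods are invented for this example. Each
node records the document ids at or below it, so a prefix query only
walks as many nodes as the prefix has segments, at the price of extra
work and storage at write time.

```python
class PathTrie:
    """Trie keyed on path segments (hypothetical sketch).

    Every node keeps the ids of all documents at or below that node.
    """

    def __init__(self):
        self.children = {}
        self.docids = set()

    @staticmethod
    def _segments(path):
        # "/plone/news/" -> ["plone", "news"]; empty segments are dropped.
        return [segment for segment in path.split("/") if segment]

    def index(self, docid, path):
        """Register docid under path, doing the work at write time."""
        node = self
        node.docids.add(docid)
        for segment in self._segments(path):
            node = node.children.setdefault(segment, PathTrie())
            node.docids.add(docid)

    def search(self, prefix):
        """Return the ids of all documents whose path starts with prefix."""
        node = self
        for segment in self._segments(prefix):
            node = node.children.get(segment)
            if node is None:
                return set()
        return set(node.docids)


trie = PathTrie()
trie.index(1, "/plone/news/item-1")
trie.index(2, "/plone/news/item-2")
trie.index(3, "/plone/events/party")

news = trie.search("/plone/news")
everything = trie.search("/plone")
nothing = trie.search("/plone/missing")
partial = trie.search("/plo")  # matching is per segment, not per character
```

A persistent version would need BTree-based containers instead of plain
dicts and sets, and unindexing is not shown.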


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  [EMAIL PROTECTED]
Palladion Software   Excellence by Design   http://palladion.com


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-26 Thread Fabio Rizzo Matos
Very Nice.

Have a nice holiday :-)


-- 
Fábio Rizzo Matos
ThreePointsWeb
[EMAIL PROTECTED]
http://www.threepointsweb.com
+55 61 3202-6480

Python, Zope and Plone from people who know the subject!


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-26 Thread Andreas Jung

On 26.10.2008 19:05 Uhr, Roché Compaan wrote:

> I know, and this alone makes a good argument for caching at catalog
> level. In our case we used membrane, which makes an excessive number of
> catalog queries when looking up users, so some level of caching was
> essential.



First, caching is a good thing :-)
But how about the following issue: CMF/Plone inject additional
subqueries for expires/effective/typesAndRoles. At least the
security-related subqueries make a cached catalog result very specific
to a particular user. That seems to be quite OK for a site with lots of
anonymous users, but it might be an issue with lots of authenticated
users. It might be necessary to add some kind of intelligence to decide
what to cache and what not. I don't think it makes sense to cache the
result of a fulltext search. I am just wondering whether it would make
sense to cache at the index level instead of at the catalog level. That
way you could, for example, cache expensive index queries (path index)
and combine them with uncached indexes which are supposed to be fast.
However, such decisions require detailed measurements on real systems.
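
A rough sketch of the index-level idea, under loud assumptions: plain
Python sets stand in for IISets, a dict stands in for the cache, and the
function names and sample data are made up. The expensive path lookup
does not depend on the user, so it is cached once; the security lookup
is user-specific and stays uncached, and the two are intersected per
request.

```python
def cached_index_lookup(cache, index_name, value, lookup):
    """Return the cached result for one index query, computing it on a miss."""
    key = (index_name, value)
    if key not in cache:
        cache[key] = frozenset(lookup(value))
    return cache[key]


def search(cache, path, user_roles, path_lookup, allowed_lookup):
    """Combine a cached path result with an uncached security result."""
    path_rids = cached_index_lookup(cache, "path", path, path_lookup)
    allowed_rids = set()
    for role in user_roles:
        allowed_rids |= allowed_lookup(role)  # cheap, so not cached
    return set(path_rids & allowed_rids)


# Hypothetical sample data: record ids per path and per role.
PATHS = {"/site/news": {1, 2, 3}}
ALLOWED = {"Anonymous": {1}, "Member": {1, 2}}
path_calls = []


def path_lookup(path):
    path_calls.append(path)  # count how often the expensive index is hit
    return PATHS.get(path, set())


def allowed_lookup(role):
    return ALLOWED.get(role, set())


cache = {}
anonymous = search(cache, "/site/news", ["Anonymous"],
                   path_lookup, allowed_lookup)
member = search(cache, "/site/news", ["Anonymous", "Member"],
                path_lookup, allowed_lookup)
```

Both users get different results, yet the path index is only queried
once. Whether this beats caching whole catalog results would still have
to be measured.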

One other thing concerning memcached: there is obviously a limit of 1 MB
for the data you can store as a value. We have not found an obvious way
of increasing this limit other than by patching the memcached sources.
We came up with an implementation where data larger than 1 MB is split
up into individual chunks (we have dedicated set_huge() and get_huge()
implementations).
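
The chunking idea can be sketched as follows. This is not the ZOPYX implementation, which is not shown in the thread; only the names set_huge() and get_huge() come from the message. The key layout, the chunk size, and the DictClient stand-in (used in place of a real memcached client, of which only get and set are assumed) are illustrative assumptions.

```python
import pickle

CHUNK_SIZE = 1000 * 1000  # assumed: stay below the 1 MB value limit


def set_huge(client, key, value, chunk_size=CHUNK_SIZE):
    """Pickle value and store it as numbered chunks plus a count entry."""
    data = pickle.dumps(value)
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)]
    for index, chunk in enumerate(chunks):
        client.set("%s:%d" % (key, index), chunk)
    # The count is written after the chunks it refers to.
    client.set("%s:count" % key, len(chunks))


def get_huge(client, key):
    """Return the stored value, or None if any part is missing."""
    count = client.get("%s:count" % key)
    if count is None:
        return None
    parts = []
    for index in range(count):
        chunk = client.get("%s:%d" % (key, index))
        if chunk is None:  # a chunk was evicted: treat as a cache miss
            return None
        parts.append(chunk)
    return pickle.loads(b"".join(parts))


class DictClient(object):
    """In-memory stand-in offering the get/set calls used above."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)
```

Because memcached may evict any single chunk on its own, a reader has to treat a missing chunk as a miss for the whole value, as get_huge() does here.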


Andreas

--
ZOPYX Ltd. & Co. KG - Charlottenstr. 37/1 - 72070 Tübingen - Germany
Web: www.zopyx.com - Email: [EMAIL PROTECTED] - Phone +49 - 7071 - 793376
Registergericht: Amtsgericht Stuttgart, Handelsregister A 381535
Geschäftsführer/Gesellschafter: ZOPYX Limited, Birmingham, UK

E-Publishing, Python, Zope & Plone development, Consulting



Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Hedley Roos
The usual Plone catalogs (portal_catalog, uid_catalog,
reference_catalog and membrane_tool) all run above 90% hit rate if the
server is up to it. portal_catalog is invalidated the most so it
fluctuates the most.

If the server is severely underpowered then catalogcache is much less
effective. portal_catalog hit rates will degrade over time. This is
the situation I'm currently facing on one site, but more servers
will fix that.

It's quite easy to benchmark / load test with funkload. What I've
found is that memcached is very light on CPU, but if the Zope
processes are constantly using all CPU it is starved and runs into
trouble. As long as you avoid that case (which would be fatal without
catalogcache in any case) then everything works perfectly.

Run a few tests and let me know please.

Hedley


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Andreas Jung

On 25.10.2008 8:48, Hedley Roos wrote:

 The usual Plone catalogs (portal_catalog, uid_catalog,
 reference_catalog and membrane_tool) all run above 90% hit rate if the
 server is up to it. portal_catalog is invalidated the most so it
 fluctuates the most.

 If the server is severely underpowered then catalogcache is much less
 effective. portal_catalog hit rates will degrade over time. This is
 the situation I'm currently facing on one site, but more servers
 will fix that.

 It's quite easy to benchmark / load test with funkload. What I've
 found is that memcached is very light on CPU, but if the Zope
 processes are constantly using all CPU it is starved and runs into
 trouble. As long as you avoid that case (which would be fatal without
 catalogcache in any case) then everything works perfectly.



Have you measured the time needed for some standard ZCatalog queries
on a Plone site, including the communication overhead with memcached?
Generally speaking, I think the ZCatalog is fast. Queries become more
expensive when they use a fulltext index, return large result sets, or
are complex.


Andreas


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Hedley Roos
 Have you measured the time needed for some standard ZCatalog queries
 on a Plone site, including the communication overhead with memcached?
 Generally speaking, I think the ZCatalog is fast. Queries become more
 expensive when they use a fulltext index, return large result sets, or
 are complex.


No I haven't. Roche Compaan has done extensive benchmarking using
funkload testing plain catalog vs module level cache vs memcached, but
the tests are more about page serving than catalog query time. I'll
ask him to comment more on that.

As for standard queries on a Plone site the typical folder contents
query is a good example. The query will be fast unless it sorts on
sortable_title (a ZCTextIndex) right? Not sure right now.

Since memcached is distributed, only a single Zope client needs to
perform that query and the result is available to all other Zope
clients. And the cache is persistent as long as memcached runs, so
you can merrily restart Zope instances and have a warm cache. I didn't
even realise this until Roche pointed it out to me. To answer the
question: I believe catalogcache will win every time since the return
time of a cached query is not dependent on the complexity of the
query.
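
The sharing behaviour described above can be sketched with a look-aside cache. SharedCache is an in-memory stand-in for a memcached server, and ZopeClient, expensive_query, and the key "folder-contents" are hypothetical names for illustration, not part of catalogcache.

```python
class SharedCache(object):
    """Stand-in for one memcached server shared by several Zope clients."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


class ZopeClient(object):
    """Runs a query only when the shared cache has no entry for it."""

    def __init__(self, cache, run_query):
        self.cache = cache
        self.run_query = run_query
        self.queries_run = 0

    def search(self, key):
        result = self.cache.get(key)
        if result is None:
            self.queries_run += 1
            result = self.run_query(key)
            self.cache.set(key, result)
        return result


def expensive_query(key):
    # Placeholder for a real catalog search.
    return ["doc-1", "doc-2"]


memcached = SharedCache()
client_a = ZopeClient(memcached, expensive_query)
client_b = ZopeClient(memcached, expensive_query)
client_a.search("folder-contents")  # miss: runs the query, fills the cache
client_b.search("folder-contents")  # hit: served from the shared cache

# A "restarted" client starts with no local state but still gets a hit,
# because the cache lives outside the Zope process.
restarted = ZopeClient(memcached, expensive_query)
restarted.search("folder-contents")
```

On a hit the cost is one lookup plus the transfer of the stored result, so it depends on the size of the result rather than on how complex the query was.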

We should get a few benchmarks running at query level. I'll have a bit
of time next week.

Hedley


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Martin Aspeli
Hedley Roos wrote:

 As for standard queries on a Plone site the typical folder contents
 query is a good example. The query will be fast unless it sorts on
 sortable_title (a ZCTextIndex) right? Not sure right now.

sortable_title is a field index and shouldn't be slower than any other 
index.

This all sounds very cool, by the way. :)

Martin
-- 
Author of `Professional Plone Development`, a book for developers who
want to work with Plone. See http://martinaspeli.net/plone-book



Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Christian Theune
Hi,

On Fri, 2008-10-24 at 15:41 +0200, Hedley Roos wrote:
 The product is a monkey patch to Catalog.py. I'd love some feedback and 
 suggestions.

I'd love if this wouldn't be a monkey patch.

Also, there is nothing that makes this integrate correctly with
transactions. Your cache will happily deliver never-committed data and
also it will not isolate transactions from each other.

Christian

-- 
Christian Theune · [EMAIL PROTECTED]
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 7 · fax +49 345 1229889 1
Zope and Plone consulting and development




Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Hedley Roos
 I'd love if this wouldn't be a monkey patch.

So would I, but I couldn't find another way in this case.


 Also, there is nothing that makes this integrate correctly with
 transactions. Your cache will happily deliver never-committed data and
 also it will not isolate transactions from each other.

I patched 4 methods - clear, search, catalogObject, uncatalogObject.

Method clear is the simplest one - I simply flush the cache.

Methods catalogObject and uncatalogObject both invalidate the cache.
Should the transaction fail later the only drawback is that you threw
a few things out of the cache. They'll soon be re-entered by
subsequent searches.

Method search just inspects queries and stores results to memcache.
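
The behaviour of the four patched methods, as described above, can be sketched like this. It is not the catalogcache code: ToyCatalog is a hypothetical stand-in with simplified signatures, a plain dict plays the role of memcached, and "invalidate" is assumed to mean dropping every cached entry.

```python
class ToyCatalog(object):
    """Minimal stand-in for a catalog: maps uid -> dict of indexed values."""

    def __init__(self):
        self.docs = {}

    def clear(self):
        self.docs.clear()

    def catalogObject(self, obj, uid):
        self.docs[uid] = obj

    def uncatalogObject(self, uid):
        self.docs.pop(uid, None)

    def search(self, query):
        return sorted(uid for uid, doc in self.docs.items()
                      if all(doc.get(k) == v for k, v in query.items()))


class CachedCatalog(ToyCatalog):
    """Wraps the four methods named in the message with cache handling."""

    def __init__(self, cache):
        ToyCatalog.__init__(self)
        self.cache = cache  # dict standing in for memcached

    def clear(self):
        self.cache.clear()  # flush the cache
        ToyCatalog.clear(self)

    def catalogObject(self, obj, uid):
        self.cache.clear()  # invalidate (assumed: drop everything)
        ToyCatalog.catalogObject(self, obj, uid)

    def uncatalogObject(self, uid):
        self.cache.clear()  # invalidate (assumed: drop everything)
        ToyCatalog.uncatalogObject(self, uid)

    def search(self, query):
        key = repr(sorted(query.items()))
        if key not in self.cache:
            # Stored immediately, before the transaction has committed.
            self.cache[key] = ToyCatalog.search(self, query)
        return self.cache[key]
```

Note that search() writes to the shared cache straight away, regardless of whether the surrounding transaction later commits or aborts.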

Can you give me an example where the cache would deliver non-committed data?


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Hedley Roos

 In addition, you need to include a serial in your cache keys to avoid
 dirty reads.

The cache invalidation code actively removes items from the cache. Am
I understanding you correctly?

H


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Andreas Jung

On 25.10.2008 14:53, Hedley Roos wrote:

  I'd love if this wouldn't be a monkey patch.

 So would I, but I couldn't find another way in this case.

  Also, there is nothing that makes this integrate correctly with
  transactions. Your cache will happily deliver never-committed data and
  also it will not isolate transactions from each other.

 I patched 4 methods - clear, search, catalogObject, uncatalogObject.

 Method clear is the simplest one - I simply flush the cache.

 Methods catalogObject and uncatalogObject both invalidate the cache.
 Should the transaction fail later the only drawback is that you threw
 a few things out of the cache. They'll soon be re-entered by
 subsequent searches.


Using a DataManager is likely the better and safer choice.

Andreas


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Christian Theune
On Sat, 2008-10-25 at 14:53 +0200, Hedley Roos wrote:
  I'd love if this wouldn't be a monkey patch.
 
 So would I, but I couldn't find another way in this case.
 
 
  Also, there is nothing that makes this integrate correctly with
  transactions. Your cache will happily deliver never-committed data and
  also it will not isolate transactions from each other.
 
 I patched 4 methods - clear, search, catalogObject, uncatalogObject.
 
 Method clear is the simplest one - I simply flush the cache.

This is probably harmless but will cause unnecessary cache flushes for
other clients.

 Methods catalogObject and uncatalogObject both invalidate the cache.
 Should the transaction fail later the only drawback is that you threw
 a few things out of the cache. They'll soon be re-entered by
 subsequent searches.

Right. This is the same as clear.

 Method search just inspects queries and stores results to memcache.

That's the issue.

If you catalog an object, then search for it and then abort the
transaction, your cache will have data in it that isn't committed.

Additionally when another transaction is already running in parallel, it
will see cache inserts from other transactions.
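
The failure scenario can be reproduced in a few lines. Everything here is a simplified model: a dict stands in for memcached, a set stands in for committed catalog state, and the function and key names are hypothetical.

```python
shared_cache = {}   # stands in for memcached, visible to every client
committed = set()   # stands in for the committed catalog state


def search(pending):
    """Search as seen inside a transaction: committed data plus the
    transaction's own uncommitted additions."""
    key = "all-documents"
    if key not in shared_cache:
        shared_cache[key] = sorted(committed | pending)
    return shared_cache[key]


# Transaction 1 catalogs "draft", searches, then aborts.
txn1_pending = set(["draft"])
seen_by_txn1 = search(txn1_pending)
txn1_pending.clear()  # abort: "draft" never reaches the committed state

# Transaction 2 has no pending changes, yet it is handed the aborted
# object because the result was cached before the abort.
seen_by_txn2 = search(set())
```

After the abort, seen_by_txn2 still contains "draft" even though nothing was ever committed.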

Christian



Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Hedley Roos
On Sat, Oct 25, 2008 at 2:57 PM, Andreas Jung [EMAIL PROTECTED] wrote:
 On 25.10.2008 14:53, Hedley Roos wrote:

 I'd love if this wouldn't be a monkey patch.

 So would I, but I couldn't find another way in this case.

 Also, there is nothing that makes this integrate correctly with
 transactions. Your cache will happily deliver never-committed data and
 also it will not isolate transactions from each other.

 I patched 4 methods - clear, search, catalogObject, uncatalogObject.

 Method clear is the simplest one - I simply flush the cache.

 Methods catalogObject and uncatalogObject both invalidate the cache.
 Should the transaction fail later the only drawback is that you threw
 a few things out of the cache. They'll soon be re-entered by
 subsequent searches.

 Using a DataManager is likely the better and more safe choice.

 Andreas


Thanks Andreas. I'll have a look at your code when available.

Christian, I do have a mistake in my reasoning. If an object is added
to the catalog in a transaction, and I cache that object as the result
of a query in that same transaction, and then the transaction fails,
I'll have a bad cache.

H


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Hedley Roos
 If you catalog an object, then search for it and then abort the
 transaction, your cache will have data in it that isn't committed.


Kind of like how I came to the same conclusion in parallel to you and
stuffed up this thread :)

 Additionally when another transaction is already running in parallel, it
 will see cache inserts from other transactions.

So this is the area I have to focus on right now.

H


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Hedley Roos
 Additionally when another transaction is already running in parallel, it
 will see cache inserts from other transactions.


A possible solution is to keep a module-level cache which can be
committed to memcached on transaction boundaries. That way I'll
incur no performance penalty.
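
One way to read this proposal is a local buffer that is published only on commit. The sketch below is an assumption about how that could look, not Hedley's eventual code: TransactionalCache and its methods are hypothetical, a dict stands in for memcached, and the wiring of commit() and abort() to Zope's transaction machinery (for example through a data manager) is left out.

```python
class TransactionalCache(object):
    """Buffer cache writes locally and publish them only on commit."""

    def __init__(self, shared):
        self.shared = shared  # dict standing in for memcached
        self.pending = {}     # writes made inside the current transaction
        self.dirty = False    # True once this transaction changed the catalog

    def get(self, key):
        if key in self.pending:
            return self.pending[key]
        if self.dirty:
            # Shared entries predate our uncommitted changes: ignore them.
            return None
        return self.shared.get(key)

    def set(self, key, value):
        self.pending[key] = value

    def mark_dirty(self):
        """To be called when the transaction catalogs or uncatalogs."""
        self.dirty = True
        self.pending.clear()

    def commit(self):
        if self.dirty:
            # The catalog changed: older shared entries may be stale.
            self.shared.clear()
        self.shared.update(self.pending)
        self._reset()

    def abort(self):
        # Nothing was published, so there is nothing to undo.
        self._reset()

    def _reset(self):
        self.pending = {}
        self.dirty = False
```

An aborted transaction leaves the shared cache untouched, which addresses the never-committed-data case. The sketch does not address a transaction that reads from an older database snapshot and publishes results that are already out of date.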

H


Re: [Zope-dev] ZCatalog caching with memcached

2008-10-25 Thread Christian Theune
On Sat, 2008-10-25 at 14:55 +0200, Hedley Roos wrote:
 
  In addition, you need to include a serial in your cache keys to avoid
  dirty reads.
 
 The cache invalidation code actively removes items from the cache. Am
 I understanding you correctly?

I wasn't even talking about invalidation as your cache wouldn't see
'invalidations' anyways.

It's memcached's task to forget stuff: it's a cache anyway.
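
A minimal sketch of what a serial in the cache key could look like. The thread does not say which value should serve as the serial, so a plain integer that increases with each committed catalog change is assumed here, and versioned_key is a hypothetical helper.

```python
import hashlib


def versioned_key(catalog_id, serial, query):
    """Key that changes whenever the catalog's serial changes."""
    raw = "%s|%s|%r" % (catalog_id, serial, sorted(query.items()))
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


cache = {}  # stands in for memcached
query = {"portal_type": "Document"}

# A transaction that sees catalog state 41 stores its result under 41.
cache[versioned_key("portal_catalog", 41, query)] = ["a"]

# After a commit the serial is 42, so the same query maps to a new key
# and the old entry is simply never looked up again.
stale_hit = cache.get(versioned_key("portal_catalog", 42, query))
```

With this scheme nothing has to be deleted explicitly: entries stored under old serials become unreachable and are left for memcached to evict.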
