Re: [Zope-dev] ZCatalog caching with memcached
Hedley Roos wrote:
> Since memcached is distributed only a single Zope client needs to perform that query and the result is available to all other Zope clients.

This is where you'll get the big win: no need to load all the catalog-related objects into the ZODB cache on all the clients, which has the twin drawbacks of needing to be done and trashing your ZODB cache...

> And the cache is persistent as long as memcached runs, so you can merrily restart Zope instances and have a warm cache. I didn't even realise this until Roche pointed it out to me.

Coool :-)

cheers,

Chris

--
Simplistix - Content Management, Zope Python Consulting
- http://www.simplistix.co.uk
___
Zope-Dev maillist - Zope-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] ZCatalog caching with memcached
On Sun, 2008-10-26 at 14:07 -0400, Tres Seaver wrote:
> Roché Compaan wrote:
> > [...]
> > I suspect specific indexes are just performing suboptimally and need to be improved. ExtendedPathIndex in Plone seems to be one of them.
> >
> > The effect on performance is really awesome, now we just need to fine tune the implementation.
>
> Before (or while) we work on caching, can we try to improve the underlying indexes, and the way that applications use them? I'm pretty sure that there is a lot of room for improvement:
>
> - Plone uses too many indexes, and in particular, uses multiple text indexes. Having extra indexes around "just in case" is a sure loss at write time, and may even be expensive at query time (depending on the query).
>
> - Particular indexes have performance characteristics based on their designed purpose: for instance, the stock FieldIndex implementation assumes that the number of documents indexed will be much greater than the number of discrete indexable values. Using such an index in an application domain with a very large set of indexable values probably loses, and in ways which don't show up in early / small-scale testing.
>
> - I'm pretty sure that we haven't yet found the best data structure for hierarchy indexes (e.g., the Plone EPI index, or the stock Zope2 PathIndex, etc.). Something like a 'trie' might be optimal for pure prefix searching of hierarchies.
>
> - I am confident that the TopicIndex is underutilized: it does *all* the work for a given query at write time, and can thus be blindingly fast at query time.
>
> - Other special-purpose indexes (e.g., a "recent items" index) would be worth a look, especially for applications with large volumes of content.

I agree that one should look at improving performance without caching as well. But this is a lot harder and takes significantly more development and debugging time than introducing some form of caching. So I'm not convinced that it needs to happen in a certain order. If caching gives you lots of performance with little effort now, then why shouldn't you use it?

--
Roché Compaan
Upfront Systems http://www.upfrontsystems.co.za
Re: [Zope-dev] ZCatalog caching with memcached
On Oct 27, 2008, at 13:08, Roché Compaan wrote:
> On Sun, 2008-10-26 at 14:07 -0400, Tres Seaver wrote:
> > [...]
>
> I agree that one should look at improving performance without caching as well. But this is a lot harder and takes significantly more development and debugging time than introducing some form of caching. So I'm not convinced that it needs to happen in a certain order. If caching gives you lots of performance with little effort now, then why shouldn't you use it?

It's the typical trade-off. One course is expedient and fast for your use case now. The other requires more resources, but benefits everyone, including those who don't want to depend on yet another package, like memcached, for performance.

When it comes to integrating anything in Zope itself I'd choose the latter.

jens
Re: [Zope-dev] ZCatalog caching with memcached
On Mon, 2008-10-27 at 13:23 +0100, Jens Vagelpohl wrote:
> On Oct 27, 2008, at 13:08, Roché Compaan wrote:
> > [...]
> > If caching gives you lots of performance with little effort now, then why shouldn't you use it?
>
> It's the typical trade-off. One course is expedient and fast for your use case now. The other requires more resources, but benefits everyone, including those who don't want to depend on yet another package, like memcached, for performance.

I'm not tied to memcached. We started out using module level caches like zope.cache.ram but that has obvious problems when using ZEO.

> When it comes to integrating anything in Zope itself I'd choose the latter.

Sure, we're not trying to get this into Zope, we're just sharing our experience and exploring the territory so that one can produce a third party package that really helps people with the same use case (which I suspect is quite a common one).

--
Roché Compaan
Upfront Systems http://www.upfrontsystems.co.za
Re: [Zope-dev] ZCatalog caching with memcached
On Oct 27, 2008, at 13:32, Roché Compaan wrote:
> On Mon, 2008-10-27 at 13:23 +0100, Jens Vagelpohl wrote:
> > When it comes to integrating anything in Zope itself I'd choose the latter.
>
> Sure, we're not trying to get this into Zope, we're just sharing our experience and exploring the territory so that one can produce a third party package that really helps people with the same use case (which I suspect is quite a common one).

Right, it's perfectly valid to create such a third party package. The discussion just highlights a greater issue. Personally, I don't think it's good practice to focus on the expediency of working around a problem as opposed to tackling the problem directly. The Zope world is littered with add-ons that act as band-aids on real or perceived shortcomings in Zope itself.

jens
Re: [Zope-dev] ZCatalog caching with memcached
On Mon, 2008-10-27 at 13:41 +0100, Jens Vagelpohl wrote:
> On Oct 27, 2008, at 13:32, Roché Compaan wrote:
> > On Mon, 2008-10-27 at 13:23 +0100, Jens Vagelpohl wrote:
> > > When it comes to integrating anything in Zope itself I'd choose the latter.
> >
> > Sure, we're not trying to get this into Zope, we're just sharing our experience and exploring the territory so that one can produce a third party package that really helps people with the same use case (which I suspect is quite a common one).
>
> Right, it's perfectly valid to create such a third party package. The discussion just highlights a greater issue. Personally, I don't think it's good practice to focus on the expediency of working around a problem as opposed to tackling the problem directly. The Zope world is littered with add-ons that act as band-aids on real or perceived shortcomings in Zope itself.

Improving the performance of indexes is really, really hard. In this case I really don't think caching is a band-aid; it is a good solution. Even with optimised indexes, you will find that you need caching to get reasonable performance if you have a catalog with close to a million or more documents indexed. Given a large enough catalog, I would argue that caching is as necessary as having a large cache for a ZEO client.

But caches expire and results get invalidated, and therefore we should continue to optimise indexes. With some help we should be able to contribute at this level too.

--
Roché Compaan
Upfront Systems http://www.upfrontsystems.co.za
Re: [Zope-dev] ZCatalog caching with memcached
On Mon, Oct 27, 2008 at 12:33 PM, Andreas Jung wrote:
> On 27.10.2008 16:28, Rudá Porto Filgueiras wrote:
> > I would suggest a package called zope.memcached (like zope.sqlalchemy does for SQLAlchemy integration). That way any application that needs to talk to memcached can do so without losing atomicity.
>
> I don't see a particular reason for creating a new package here. Extend lovely.memcached and you're done. There is not much need for scattering tiny pieces of functionality into two modules here. The module world is already complicated enough.

If lovely.memcached is already safe when an exception is raised, discard my suggestion. Is it also compatible with Zope 2?

> Andreas

--
Rudá Porto Filgueiras
Weimar Consultoria
http://python-blog.blogspot.com
Hosting for Plone, Django, Zope 3, Grok... http://www.pytown.com
Re: [Zope-dev] ZCatalog caching with memcached
On 27.10.2008 17:18, Rudá Porto Filgueiras wrote:
> On Mon, Oct 27, 2008 at 12:33 PM, Andreas Jung wrote:
> > On 27.10.2008 16:28, Rudá Porto Filgueiras wrote:
> > > I would suggest a package called zope.memcached (like zope.sqlalchemy does for SQLAlchemy integration). That way any application that needs to talk to memcached can do so without losing atomicity.
> >
> > I don't see a particular reason for creating a new package here. Extend lovely.memcached and you're done. There is not much need for scattering tiny pieces of functionality into two modules here. The module world is already complicated enough.
>
> If lovely.memcached is already safe when an exception is raised, discard my suggestion. Is it also compatible with Zope 2?

We are using it together with our cache tool I mentioned earlier with Zope 2.8.1.

Andreas
Re: [Zope-dev] ZCatalog caching with memcached
On 26.10.2008 18:43, Roché Compaan wrote:
> On Sat, 2008-10-25 at 09:20 +0200, Hedley Roos wrote:
> [...]
> I suspect specific indexes are just performing suboptimally and need to be improved. ExtendedPathIndex in Plone seems to be one of them.

Path indexes and fulltext indexes have a much more complicated implementation compared to field or keyword indexes.

Andreas
Re: [Zope-dev] ZCatalog caching with memcached
Hi Roché, can I see your funkload profile?

On Sun, Oct 26, 2008 at 3:43 PM, Roché Compaan wrote:
> On Sat, 2008-10-25 at 09:20 +0200, Hedley Roos wrote:
> > > Have you measured the time needed for some standard ZCatalog queries used with a Plone site with the communication overhead with memcached? Generally speaking, I think the ZCatalog is fast. Queries are known to be more expensive if they use a fulltext index, or if you have to deal with large result sets or complex queries.
> >
> > No I haven't. Roche Compaan has done extensive benchmarking using funkload, testing plain catalog vs module level cache vs memcached, but the tests are more about page serving than catalog query time. I'll ask him to comment more on that.
>
> I actually did some profiling as well and catalog searches were just too damn slow. The average execution time for searchResults was 100 milliseconds and this is why I told Hedley we should do some caching at query level in the first place.
>
> I experimented with this idea a couple of years back but wasn't successful due to inexperience. I was trying to cache brains, which obviously leads to persistency bugs. This time around it was obvious to me that we should cache the IISet result sets.
>
> I suspect specific indexes are just performing suboptimally and need to be improved. ExtendedPathIndex in Plone seems to be one of them.
>
> The effect on performance is really awesome, now we just need to fine tune the implementation.
>
> --
> Roché Compaan
> Upfront Systems http://www.upfrontsystems.co.za

--
Fábio Rizzo Matos
ThreePointsWeb http://www.threepointsweb.com
Python, Zope and Plone with people who know the subject!
Re: [Zope-dev] ZCatalog caching with memcached
Hi Fabio

The funkload tests were project specific. I plan to write up my findings, run benchmarks on a standard Plone instance, and blog about it. This will unfortunately have to wait since I'm on holiday this week :-)

--
Roché Compaan
Upfront Systems http://www.upfrontsystems.co.za

On Sun, 2008-10-26 at 15:54 -0200, Fabio Rizzo Matos wrote:
> Hi Roché, can I see your funkload profile?
>
> [...]
Re: [Zope-dev] ZCatalog caching with memcached
On Sun, 2008-10-26 at 18:50 +0100, Andreas Jung wrote:
> On 26.10.2008 18:43, Roché Compaan wrote:
> > I suspect specific indexes are just performing suboptimally and need to be improved. ExtendedPathIndex in Plone seems to be one of them.
>
> Path indexes and fulltext indexes have a much more complicated implementation compared to field or keyword indexes.

I know, and this alone makes a good argument for caching at catalog level. In our case we used membrane, which makes an excessive number of catalog queries when looking up users, so some level of caching was essential.

--
Roché Compaan
Upfront Systems http://www.upfrontsystems.co.za
Re: [Zope-dev] ZCatalog caching with memcached
Roché Compaan wrote:
> On Sat, 2008-10-25 at 09:20 +0200, Hedley Roos wrote:
> > > Have you measured the time needed for some standard ZCatalog queries used with a Plone site with the communication overhead with memcached? Generally speaking, I think the ZCatalog is fast. Queries are known to be more expensive if they use a fulltext index, or if you have to deal with large result sets or complex queries.
> >
> > No I haven't. Roche Compaan has done extensive benchmarking using funkload, testing plain catalog vs module level cache vs memcached, but the tests are more about page serving than catalog query time. I'll ask him to comment more on that.
>
> I actually did some profiling as well and catalog searches were just too damn slow. The average execution time for searchResults was 100 milliseconds and this is why I told Hedley we should do some caching at query level in the first place.
>
> I experimented with this idea a couple of years back but wasn't successful due to inexperience. I was trying to cache brains, which obviously leads to persistency bugs. This time around it was obvious to me that we should cache the IISet result sets.
>
> I suspect specific indexes are just performing suboptimally and need to be improved. ExtendedPathIndex in Plone seems to be one of them.
>
> The effect on performance is really awesome, now we just need to fine tune the implementation.

Before (or while) we work on caching, can we try to improve the underlying indexes, and the way that applications use them? I'm pretty sure that there is a lot of room for improvement:

- Plone uses too many indexes, and in particular, uses multiple text indexes. Having extra indexes around "just in case" is a sure loss at write time, and may even be expensive at query time (depending on the query).

- Particular indexes have performance characteristics based on their designed purpose: for instance, the stock FieldIndex implementation assumes that the number of documents indexed will be much greater than the number of discrete indexable values. Using such an index in an application domain with a very large set of indexable values probably loses, and in ways which don't show up in early / small-scale testing.

- I'm pretty sure that we haven't yet found the best data structure for hierarchy indexes (e.g., the Plone EPI index, or the stock Zope2 PathIndex, etc.). Something like a 'trie' might be optimal for pure prefix searching of hierarchies.

- I am confident that the TopicIndex is underutilized: it does *all* the work for a given query at write time, and can thus be blindingly fast at query time.

- Other special-purpose indexes (e.g., a "recent items" index) would be worth a look, especially for applications with large volumes of content.

Tres.
--
Tres Seaver +1 540-429-0999
Palladion Software "Excellence by Design" http://palladion.com
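Editor's illustration of the 'trie' suggestion above. This is a minimal sketch, not the PathIndex or ExtendedPathIndex implementation: the class name PathTrie and its methods are hypothetical, document ids are plain integers, and a real index would need persistent (BTree-based) nodes and unindexing. It only shows why prefix search over a hierarchy maps naturally onto a trie keyed by path segment.

```python
class PathTrie:
    """Hypothetical trie keyed by path segment; each node holds doc ids."""

    def __init__(self):
        self.children = {}    # path segment -> PathTrie
        self.docids = set()   # ids of documents indexed exactly at this node

    @staticmethod
    def _segments(path):
        # "/site/news/" -> ["site", "news"]; "/" -> []
        return [part for part in path.split("/") if part]

    def index(self, path, docid):
        """Record that the document docid lives at path."""
        node = self
        for segment in self._segments(path):
            node = node.children.setdefault(segment, PathTrie())
        node.docids.add(docid)

    def search(self, prefix):
        """Return the ids of all documents at or below prefix."""
        node = self
        for segment in self._segments(prefix):
            node = node.children.get(segment)
            if node is None:
                return set()
        # Collect every id in the subtree rooted at the matched node.
        found = set()
        stack = [node]
        while stack:
            current = stack.pop()
            found |= current.docids
            stack.extend(current.children.values())
        return found
```

Walking down to the prefix costs one dictionary lookup per path segment, independent of how many documents are indexed; only the subtree collection grows with the size of the result.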
Re: [Zope-dev] ZCatalog caching with memcached
Very nice. Have a nice holiday :-)

On Sun, Oct 26, 2008 at 3:58 PM, Roché Compaan wrote:
> Hi Fabio
>
> The funkload tests were project specific. I plan to write up my findings, run benchmarks on a standard Plone instance, and blog about it. This will unfortunately have to wait since I'm on holiday this week :-)
>
> [...]

--
Fábio Rizzo Matos
ThreePointsWeb http://www.threepointsweb.com
Python, Zope and Plone with people who know the subject!
Re: [Zope-dev] ZCatalog caching with memcached
On 26.10.2008 19:05, Roché Compaan wrote:
> On Sun, 2008-10-26 at 18:50 +0100, Andreas Jung wrote:
> > On 26.10.2008 18:43, Roché Compaan wrote:
> > > I suspect specific indexes are just performing suboptimally and need to be improved. ExtendedPathIndex in Plone seems to be one of them.
> >
> > Path indexes and fulltext indexes have a much more complicated implementation compared to field or keyword indexes.
>
> I know, and this alone makes a good argument for caching at catalog level. In our case we used membrane, which makes an excessive number of catalog queries when looking up users, so some level of caching was essential.

First: caching is a good thing :-)

But how about the following issue: CMF/Plone inject additional subqueries for expires/effective/typesAndRoles. At least the security-related subqueries make a cached catalog result very specific to a particular user. That seems to be fine for a site with lots of anonymous users, but it might be an issue with lots of authenticated users.

It might be necessary to add some kind of intelligence to decide what to cache and what not. I don't think it makes sense to cache the result of a fulltext search.

I am just wondering whether it would make sense to cache at the index level instead of the catalog level. You could, for example, cache expensive index queries (path index) and combine them with uncached indexes which are supposed to be fast. However, such decisions require detailed measurements on real systems.

One other thing concerning memcached: there is a limit of 1 MB on the data you can store as a value. We have not found an obvious way of increasing this limit other than by patching the memcached sources. We came up with an implementation where data larger than 1 MB is split up into individual chunks (we have a dedicated set_huge()/get_huge() implementation).

Andreas

--
ZOPYX Ltd. Co. KG - Charlottenstr. 37/1 - 72070 Tübingen - Germany
Web: www.zopyx.com
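Editor's illustration of the chunking approach Andreas describes. The thread names set_huge() and get_huge() but does not show them, so the signatures, key layout and chunk size below are assumptions, and DictCache is an in-process stand-in for a real memcached client so that the sketch runs on its own.

```python
CHUNK_SIZE = 1000 * 1000  # assumed: stay a little under the 1 MB value limit


class DictCache:
    """In-process stand-in for a memcached client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value
        return True


def set_huge(cache, key, data, chunk_size=CHUNK_SIZE):
    """Store bytes of any length as numbered chunks plus a count entry."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for index, chunk in enumerate(chunks):
        cache.set("%s:%d" % (key, index), chunk)
    # Written last, so a reader never sees a count without its chunks.
    cache.set("%s:count" % key, str(len(chunks)).encode("ascii"))


def get_huge(cache, key):
    """Reassemble a value stored by set_huge; return None on any miss."""
    count = cache.get("%s:count" % key)
    if count is None:
        return None
    parts = []
    for index in range(int(count)):
        chunk = cache.get("%s:%d" % (key, index))
        if chunk is None:
            # One chunk was evicted: treat the whole value as a miss.
            return None
        parts.append(chunk)
    return b"".join(parts)
```

Because memcached may evict any chunk independently, the reader has to treat a partial hit as a miss and recompute the value.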
Re: [Zope-dev] ZCatalog caching with memcached
The usual Plone catalogs (portal_catalog, uid_catalog, reference_catalog and membrane_tool) all run above a 90% hit rate if the server is up to it. portal_catalog is invalidated the most, so it fluctuates the most.

If the server is severely underpowered then catalogcache is much less effective. portal_catalog hit rates will degrade over time. This is the situation I'm currently facing on one site, but more servers will fix that.

It's quite easy to benchmark / load test with funkload. What I've found is that memcached is very light on CPU, but if the Zope processes are constantly using all CPU it is starved and runs into trouble. As long as you avoid that case (which would be fatal without catalogcache in any case) then everything works perfectly.

Run a few tests and let me know please.

Hedley
Re: [Zope-dev] ZCatalog caching with memcached
On 25.10.2008 8:48, Hedley Roos wrote:
> The usual Plone catalogs (portal_catalog, uid_catalog, reference_catalog and membrane_tool) all run above a 90% hit rate if the server is up to it. portal_catalog is invalidated the most, so it fluctuates the most.
>
> If the server is severely underpowered then catalogcache is much less effective: portal_catalog hit rates will degrade over time. This is the situation I'm currently facing on one site, but more servers will fix that.
>
> It's quite easy to benchmark / load test with funkload. What I've found is that memcached is very light on CPU, but if the Zope processes are constantly using all the CPU, memcached is starved and runs into trouble. As long as you avoid that case (which would be fatal even without catalogcache) then everything works perfectly.

Have you measured the time needed for some standard ZCatalog queries in a Plone site, including the communication overhead with memcached? Generally speaking, I think the ZCatalog is fast. Queries are known to be more expensive when they use a fulltext index, return large result sets, or are complex.

Andreas
Re: [Zope-dev] ZCatalog caching with memcached
> Have you measured the time needed for some standard ZCatalog queries in a Plone site, including the communication overhead with memcached? Generally speaking, I think the ZCatalog is fast. Queries are known to be more expensive when they use a fulltext index, return large result sets, or are complex.

No, I haven't. Roché Compaan has done extensive benchmarking using funkload, testing plain catalog vs module-level cache vs memcached, but the tests are more about page serving than catalog query time. I'll ask him to comment more on that.

As for standard queries on a Plone site, the typical folder contents query is a good example. The query will be fast unless it sorts on sortable_title (a ZCTextIndex), right? Not sure right now.

Since memcached is distributed, only a single Zope client needs to perform that query and the result is available to all other Zope clients. And the cache is persistent as long as memcached runs, so you can merrily restart Zope instances and have a warm cache. I didn't even realise this until Roché pointed it out to me.

To answer the question: I believe catalogcache will win every time, since the return time of a cached query does not depend on the complexity of the query. We should get a few benchmarks running at query level. I'll have a bit of time next week.

Hedley
Re: [Zope-dev] ZCatalog caching with memcached
Hedley Roos wrote:
> As for standard queries on a Plone site, the typical folder contents query is a good example. The query will be fast unless it sorts on sortable_title (a ZCTextIndex), right? Not sure right now.

sortable_title is a field index and shouldn't be slower than any other index.

This all sounds very cool, by the way. :)

Martin
--
Author of `Professional Plone Development`, a book for developers who want to work with Plone. See http://martinaspeli.net/plone-book
Re: [Zope-dev] ZCatalog caching with memcached
Hi,

On Fri, 2008-10-24 at 15:41 +0200, Hedley Roos wrote:
> The product is a monkey patch to Catalog.py. I'd love some feedback and suggestions.

I'd love it if this weren't a monkey patch.

Also, there is nothing that makes this integrate correctly with transactions. Your cache will happily deliver never-committed data, and it will not isolate transactions from each other.

Christian
--
Christian Theune · gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 7 · fax +49 345 1229889 1
Zope and Plone consulting and development
Re: [Zope-dev] ZCatalog caching with memcached
> I'd love it if this weren't a monkey patch.

So would I, but I couldn't find another way in this case.

> Also, there is nothing that makes this integrate correctly with transactions. Your cache will happily deliver never-committed data, and it will not isolate transactions from each other.

I patched four methods: clear, search, catalogObject and uncatalogObject.

Method clear is the simplest one: I simply flush the cache.

Methods catalogObject and uncatalogObject both invalidate the cache. Should the transaction fail later, the only drawback is that you threw a few things out of the cache. They'll soon be re-entered by subsequent searches.

Method search just inspects queries and stores results to memcached.

Can you give me an example where the cache would deliver non-committed data?
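The four patched methods can be illustrated with a small wrapper. This is a sketch under loud assumptions, not the catalogcache code: ToyCatalog is a hypothetical stand-in for the real catalog, a plain dict stands in for memcached, and "invalidate" is modelled as dropping every cached entry, which is probably cruder than what the product does.

```python
class ToyCatalog:
    """Hypothetical stand-in for a catalog: maps uid -> dict of index values."""

    def __init__(self):
        self.docs = {}
        self.searches_run = 0  # counts real (uncached) searches

    def catalogObject(self, uid, values):
        self.docs[uid] = dict(values)

    def uncatalogObject(self, uid):
        self.docs.pop(uid, None)

    def clear(self):
        self.docs.clear()

    def search(self, **query):
        self.searches_run += 1
        return sorted(
            uid for uid, values in self.docs.items()
            if all(values.get(k) == v for k, v in query.items())
        )


class CachingCatalog:
    """Wraps the four methods named in the message with cache handling."""

    def __init__(self, catalog, cache=None):
        self.catalog = catalog
        # Assumption: a dict plays the role of the shared cache.
        self.cache = {} if cache is None else cache

    @staticmethod
    def _key(query):
        # Sort the items so that argument order does not change the key.
        return repr(sorted(query.items()))

    def search(self, **query):
        key = self._key(query)
        if key in self.cache:
            return list(self.cache[key])
        result = self.catalog.search(**query)
        self.cache[key] = list(result)
        return result

    def catalogObject(self, uid, values):
        self.catalog.catalogObject(uid, values)
        self.cache.clear()  # invalidate: any cached result may now be stale

    def uncatalogObject(self, uid):
        self.catalog.uncatalogObject(uid)
        self.cache.clear()

    def clear(self):
        self.catalog.clear()
        self.cache.clear()  # flush
```

Note that search() writes to the shared cache immediately, with no knowledge of whether the surrounding transaction will commit.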
Re: [Zope-dev] ZCatalog caching with memcached
> In addition, you need to include a serial in your cache keys to avoid dirty reads.

The cache invalidation code actively removes items from the cache. Am I understanding you correctly?

H
Re: [Zope-dev] ZCatalog caching with memcached
On 25.10.2008 14:53, Hedley Roos wrote:
>> I'd love it if this weren't a monkey patch.
>
> So would I, but I couldn't find another way in this case.
>
>> Also, there is nothing that makes this integrate correctly with transactions. Your cache will happily deliver never-committed data, and it will not isolate transactions from each other.
>
> I patched four methods: clear, search, catalogObject and uncatalogObject. Method clear is the simplest one: I simply flush the cache. Methods catalogObject and uncatalogObject both invalidate the cache. Should the transaction fail later, the only drawback is that you threw a few things out of the cache. They'll soon be re-entered by subsequent searches.

Using a DataManager is likely the better and safer choice.

Andreas
Re: [Zope-dev] ZCatalog caching with memcached
On Sat, 2008-10-25 at 14:53 +0200, Hedley Roos wrote:
>> I'd love it if this weren't a monkey patch.
>
> So would I, but I couldn't find another way in this case.
>
>> Also, there is nothing that makes this integrate correctly with transactions. Your cache will happily deliver never-committed data, and it will not isolate transactions from each other.
>
> I patched four methods: clear, search, catalogObject and uncatalogObject. Method clear is the simplest one: I simply flush the cache.

This is probably harmless but will cause unnecessary cache flushes for other clients.

> Methods catalogObject and uncatalogObject both invalidate the cache. Should the transaction fail later, the only drawback is that you threw a few things out of the cache. They'll soon be re-entered by subsequent searches.

Right. This is the same as clear.

> Method search just inspects queries and stores results to memcached.

That's the issue. If you catalog an object, then search for it and then abort the transaction, your cache will contain data that was never committed. Additionally, a transaction that is already running in parallel will see cache inserts from other transactions.

Christian
Re: [Zope-dev] ZCatalog caching with memcached
On Sat, Oct 25, 2008 at 2:57 PM, Andreas Jung wrote:
> On 25.10.2008 14:53, Hedley Roos wrote:
>>> I'd love it if this weren't a monkey patch.
>>
>> So would I, but I couldn't find another way in this case.
>>
>>> Also, there is nothing that makes this integrate correctly with transactions. Your cache will happily deliver never-committed data, and it will not isolate transactions from each other.
>>
>> I patched four methods: clear, search, catalogObject and uncatalogObject. Method clear is the simplest one: I simply flush the cache. Methods catalogObject and uncatalogObject both invalidate the cache. Should the transaction fail later, the only drawback is that you threw a few things out of the cache. They'll soon be re-entered by subsequent searches.
>
> Using a DataManager is likely the better and safer choice.
>
> Andreas

Thanks Andreas. I'll have a look at your code when it is available.

Christian, I do have a mistake in my reasoning. If an object is added to the catalog in a transaction, I cache that object as the result of a query in that same transaction, and then the transaction fails, I'll have a bad cache.

H
Re: [Zope-dev] ZCatalog caching with memcached
> If you catalog an object, then search for it and then abort the transaction, your cache will contain data that was never committed.

Kind of like how I came to the same conclusion in parallel to you and stuffed up this thread :)

> Additionally, a transaction that is already running in parallel will see cache inserts from other transactions.

So this is the area I have to focus on right now.

H
Re: [Zope-dev] ZCatalog caching with memcached
> Additionally, a transaction that is already running in parallel will see cache inserts from other transactions.

A possible solution is to keep a module-level cache which can be committed to memcached on transaction boundaries. That way I'll incur no performance penalty.

H
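The idea of buffering cache writes locally and publishing them only at a transaction boundary could be sketched as follows. Everything here is an assumption made for illustration: the class name, the dict standing in for memcached, and the explicit commit() and abort() calls, which in Zope would presumably be driven by the transaction machinery (for example through a data manager, as suggested earlier in the thread).

```python
class TransactionalCache:
    """Buffers cache writes per transaction and publishes them only on commit."""

    def __init__(self, shared):
        self.shared = shared   # stand-in for memcached (a plain dict here)
        self.pending = {}      # results cached during the current transaction
        self.dirty = False     # True once this transaction changed the catalog

    def get(self, key):
        if key in self.pending:
            return self.pending[key]
        if self.dirty:
            # Shared entries predate our own uncommitted changes; skip them.
            return None
        return self.shared.get(key)

    def set(self, key, value):
        # Never touch the shared cache before the transaction commits.
        self.pending[key] = value

    def invalidate(self):
        """Call when the transaction catalogs or uncatalogs an object."""
        self.pending.clear()
        self.dirty = True

    def commit(self):
        if self.dirty:
            # Our changes are now committed, so older shared entries are stale.
            self.shared.clear()
        self.shared.update(self.pending)
        self._reset()

    def abort(self):
        # Discard everything; the shared cache never saw uncommitted data.
        self._reset()

    def _reset(self):
        self.pending = {}
        self.dirty = False
```

With this arrangement an aborted transaction leaves the shared cache untouched, and a parallel transaction never sees results that were computed against uncommitted catalog changes. It does not by itself protect a long-running transaction from reading entries published by a later commit.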
Re: [Zope-dev] ZCatalog caching with memcached
On Sat, 2008-10-25 at 14:55 +0200, Hedley Roos wrote:
>> In addition, you need to include a serial in your cache keys to avoid dirty reads.
>
> The cache invalidation code actively removes items from the cache. Am I understanding you correctly?

I wasn't even talking about invalidation, as your cache wouldn't see 'invalidations' anyway. It's memcached's task to forget stuff: it's a cache, after all.

--
Christian Theune
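One way to read the suggestion of putting a serial into the cache keys is sketched below. The thread does not say where the serial would come from, so the integer serial, the function name cache_key and the key layout are all hypothetical. The point is only that a result stored under one serial is never looked up by a reader working at a different serial, and that old entries are simply left for memcached to evict.

```python
import hashlib


def cache_key(catalog_id, serial, query):
    """Build a key that is unique per catalog, catalog state and query.

    `serial` is a hypothetical value that changes whenever the committed
    state of the catalog changes.
    """
    # Sort the items so that equal queries always produce the same key.
    canonical = repr(sorted(query.items()))
    # Hashing keeps the key short and free of whitespace.
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return "%s:%s:%s" % (catalog_id, serial, digest)


# A dict stands in for memcached in this usage example.
cache = {}
query = {"portal_type": "Document", "path": "/site"}

# A reader at serial 41 stores its result.
cache[cache_key("portal_catalog", 41, query)] = ["doc-1"]

# After a commit the serial moves on, so the old entry is no longer found.
stale_hit = cache.get(cache_key("portal_catalog", 42, query))
```

No explicit invalidation is needed in this scheme: entries written under an old serial are never read again and eventually fall out of the cache on their own.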