Re: [Framework-Team] Plone roadmap
If I'm allowed to get one suggestion in there:

On Thu, Sep 2, 2010 at 3:39 PM, Laurence Rowe wrote:
> WSGI
>
> Various components should be moved out of Plone and into the WSGI
> pipeline. This should allow us to share code with other projects.
> Prime contenders would be:
>
> * Authentication
> * Resource registries

One thing I would like to see, and which would likely be a small (though effective) improvement, especially for Plone, would be:

* Flushing the Document Early (as described by: http://www.stevesouders.com/blog/2009/05/18/flushing-the-document-early/)

I *think* we should be able to get the whole (or most of the) <head> tag flushed out to the browser (maybe with Transfer-Encoding: chunked, by way of RESPONSE.write() or similar WSGI majik). If you think about it, for the great majority of Plone sites that part of the page is fairly static, except maybe for the <title> tag and some metadata.

If we could get it far enough so that the browser starts fetching the CSS and JS resources while Plone does its thing to render the rest of the page, it would be a great win already.

-- Sidnei
___
Framework-Team mailing list
Framework-Team@lists.plone.org
http://lists.plone.org/mailman/listinfo/framework-team
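As a rough illustration of the early-flush idea above, here is a minimal WSGI sketch. It is not Plone or Zope code: `render_body`, the stylesheet and script URLs, and the markup are all made up for the example. The application is a generator that yields the static head first and the slow part afterwards. Whether the first chunk really reaches the browser early depends on the server and on any buffering middleware in the pipeline.

```python
# Minimal sketch of "flush the head early" as a plain WSGI application.
# All names and URLs here are illustrative assumptions, not Plone APIs.

# The mostly static part of the page: resource links the browser can
# start fetching while the rest of the page is still being rendered.
HEAD = (
    b"<!DOCTYPE html><html><head>"
    b'<link rel="stylesheet" href="/styles.css">'
    b'<script src="/scripts.js"></script>'
    b"</head>"
)


def render_body(environ):
    """Hypothetical stand-in for the slow, dynamic part of the page."""
    return b"<body><p>Rendered later.</p></body></html>"


def app(environ, start_response):
    # No Content-Length header is set, so a server is free to stream
    # the chunks (for example with chunked transfer encoding).
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    yield HEAD                  # first chunk: handed to the server at once
    yield render_body(environ)  # second chunk: the expensive rendering
```

The design point is only that the response body is produced in more than one chunk, with the cheap part first; everything else is left to the server.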
[Framework-Team] PLIP for Plone 4.1 - plone.app.caching
Hi folks, Eric asked me to submit a PLIP for 4.1 to include plone.app.caching: http://dev.plone.org/plone/ticket/11065 I haven't added a buildout yet, but you can test with the KGS at http://good-py.appspot.com/release/plone.app.caching/1.0b1 If people agree with the idea of including it, I'll set up a buildout (I'd like to fix a test failure first, which was caused by some recent changes to use plone.app.registry a bit more efficiently). Cheers, Martin
Re: [Framework-Team] Plone roadmap
On Sun, 2010-09-05 at 15:35 -0400, Chris McDonough wrote: > > > > I'm tempted to say that the login form should be a separate > > application end point to the CMS. Authentication is something that can > > and should be shared across multiple applications - it's much easier > > to maintain a number of smaller focussed applications than one large > > monolithic one. > > Dunno. Not really even sure what that means. We tried to get > generality by putting a login form handler inside the "r.who > application", which arbitrary application code ("another application") > could post to. It failed because at the end of the day, login form > integration for arbitrary customers requires a lot of control over the > process from end to end; nobody is willing to give up any features (such > as login logging, arbitrarily skinned login form, arbitrarily complex > error messages, etc) to service the more general goal of > cross-application login (especially when they only have one application > anyway, understandably). I should also note that we had our own "login application" which didn't require any help at all from an external form (the normal "FormPlugin" in r.who, which both renders a login form and posts to itself). Literally zero people wanted this (I mean, literally no one). They all had integration requirements that they felt more comfortable servicing within the framework that they were creating their app in. - C ___ Framework-Team mailing list Framework-Team@lists.plone.org http://lists.plone.org/mailman/listinfo/framework-team
Re: [Framework-Team] Plone roadmap
On Sun, 2010-09-05 at 19:49 +0200, Laurence Rowe wrote: > On 5 September 2010 18:47, Chris McDonough wrote: > > On Sun, 2010-09-05 at 18:18 +0200, Hanno Schlichting wrote: > >> On Sun, Sep 5, 2010 at 5:46 PM, Wichert Akkerman wrote: > >> > On 2010-9-5 17:29, Hanno Schlichting wrote: > >> >> PluggableAuthService > >> >> -- > >> >> > >> >> There's tons of code based on this. I imagine we can first move the > >> >> authentication API's into a WSGI middleware querying PAS as the > >> >> backend. > >> > > >> > This sounds like the mistake repoze.who 1 made. It turns out that for > >> > almost every use case you want more control over handling login > >> > behaviour than WSGI middleware can provide. It is much simpler to have a > >> > simple API to an AAA service and use that than to try pushing this into > >> > middleware. > >> > >> Right, I'm aware of the repoze.who lessons. Authorization is always > >> going to be a WSGI framework component ("endware") and not an isolated > >> middleware. But there should be some subpart of the API, which allows > >> you to share the same authorization information across multiple WSGI > >> applications. Or deal with some of the external authorization > >> handling, when you offload things to Apache or other SSO approaches. > >> > >> But I'm not familiar enough with this topic to know what exact subpart > >> this is. It might come down to just the userid. > > > > r.who 2 actually allows you to dial responsibilities up and down. You > > can use "full stack" middleware that lends it effectively the same > > responsibilities as r.who 1, or you can use only the r.who "API" portion > > in an app or you can combine the two approaches as necessary. See also > > http://docs.repoze.org/who/2.0/narr.html#using-repoze-who-without-wsgi-middleware > > > > The particular pain point you should never run into, because it is truly > > horrible: don't try to do any login form post handling in an > > "identifier". 
Just allow the application to render a self-posting login > > form and use "the API" to check credentials and set headers and so on, > > rather than putting the login form handling itself into middleware > > machinery. In particular, never do anything remotely like the > > "RedirectingForm" plugin within > > http://svn.repoze.org/repoze.who/trunk/repoze/who/plugins/form.py (it > > wants to be the target for a login form post). > > > > Aside from that (which is a problem for people of any competence level), > > most of the problems with the middleware approach stem from needing to > > explain how the middleware approach works to integrators of widely > > varying competence levels. Each has his own slightly varying > > requirements, and each needs the middleware approach to be explained to > > him within that context. This has been a truly painful task for me, but > > that's more an indictment of my level of patience than it is of r.who or > > things like it. > > I'm tempted to say that the login form should be a separate > application end point to the CMS. Authentication is something that can > and should be shared across multiple applications - it's much easier > to maintain a number of smaller focussed applications than one large > monolithic one. Dunno. Not really even sure what that means. We tried to get generality by putting a login form handler inside the "r.who application", which arbitrary application code ("another application") could post to. It failed because at the end of the day, login form integration for arbitrary customers requires a lot of control over the process from end to end; nobody is willing to give up any features (such as login logging, arbitrarily skinned login form, arbitrarily complex error messages, etc) to service the more general goal of cross-application login (especially when they only have one application anyway, understandably). 
- C
Re: [Framework-Team] Plone roadmap
> > Right, I'm aware of the repoze.who lessons. Authorization is always > going to be a WSGI framework component ("endware") and not an isolated > middleware. But there should be some subpart of the API, which allows > you to share the same authorization information across multiple WSGI > applications. Or deal with some of the external authorization > handling, when you offload things to Apache or other SSO approaches. > > But I'm not familiar enough with this topic to know what exact subpart > this is. It might come down to just the userid. > > Hanno > ___ > Framework-Team mailing list > Framework-Team@lists.plone.org > http://lists.plone.org/mailman/listinfo/framework-team > Realistically this is what Oauth[1] already does so that one doesn't need to concentrate on worrying about the intricacies of passing or sharing that information. PAS could use OAuth to pipe the required data back to Plone. Right now the Openid stuff is a step in the right direction but realistically it creates a virtual like user in Plone. This could possibly be extended whilst i'm doing work on my plip ticket with some prototypal code on how it would work. [1]: http://oauth.net/ -- Christopher Warner http://cwarner.kernelcode.com ___ Framework-Team mailing list Framework-Team@lists.plone.org http://lists.plone.org/mailman/listinfo/framework-team
Re: [Framework-Team] Plone roadmap
On 5 September 2010 20:01, Laurence Rowe wrote:
> On 5 September 2010 19:17, Martin Aspeli wrote:
>> On 5 September 2010 15:29, Hanno Schlichting wrote:
>>> - Once we have intid's we can change the internal unique id of the
>>> catalog from the physical path over to an intid.
>>
>> Perhaps we should consider using UUIDs instead of intids?
>
> We want to use intids because it is more efficient to intersect sets
> of integers. They are only an implementation detail though, and it
> should be possible to zap and rebuild your catalogue (assigning
> different intids) without problems.
>
>>> - Once we have parent pointers we can probably ditch storing metadata
>>> in the catalog and load objects directly.
>>
>> Why do __parent__ pointers help here?
>
> With __parent__ pointers you can pull an object directly out of the
> ZODB complete with its location context. That means fetching the
> title and description for an item is usually just an object load.
>
> What's not so clear about this is how we index an object's path and
> its allowed roles and users for the view permission. We should be
> able to learn from Zope3 here though.
>
> There are balances to be struck between read and write efficiency here.

It's worth noting that constructing the full location chain from a content item to the application root is much cheaper when following __parent__ pointers up than when traversing down the hierarchy. At each level of traversal you load the content object itself and search the BTree for the child (loading several BTree objects). With __parent__ pointers you can directly load each parent.

I think this means that we probably won't have to worry about providing a cached absolute url metadata lookup or even cached roles and users as metadata, as these will only be calculated for a page's worth of content items.

We will of course need an index on allowed roles and users, and probably a descendants index (which zc.relation might provide), though only for those particular 'sections' to which searches are restricted.

Laurence
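To make the point about following __parent__ pointers upwards concrete, here is a small self-contained sketch. The `Node` class and `physical_path` function are made up for illustration and are not Zope or ZODB code; they only assume that every object carries `__name__` and `__parent__` attributes and that the root has no parent.

```python
# Sketch: building an object's path by walking __parent__ pointers up
# to the root, instead of traversing down from the root.
# Node is a hypothetical stand-in for a persistent content object.


class Node:
    def __init__(self, name, parent=None):
        self.__name__ = name
        self.__parent__ = parent


def physical_path(obj):
    """Collect names from obj up to the root, then join them."""
    parts = []
    while obj is not None:
        parts.append(obj.__name__)
        obj = obj.__parent__  # one direct reference per level, no lookup
    return "/".join(reversed(parts))


root = Node("")
site = Node("site", root)
folder = Node("folder", site)
doc = Node("doc", folder)

print(physical_path(doc))  # /site/folder/doc
```

Each step up is a single attribute access on an object that is already loaded, which is the contrast with downward traversal drawn in the message above.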
Re: [Framework-Team] Plone roadmap
On Sun, Sep 5, 2010 at 7:17 PM, Martin Aspeli wrote: > On 5 September 2010 15:29, Hanno Schlichting wrote: >> >> - With Zope 2.13 / Plone 4.1 we are cleaning up the "query" interface >> a bit. The catalog's search methods now all want a simple dictionary >> as the query specification and issue deprecation warnings for >> everything else. > > Can I ask why we need to do this? It seems like an obvious use case for > keyword arguments, and this is a pattern we've been documenting and > promoting for a long, long time. The various search methods in the catalog all have actual keyword arguments which govern part of their behavior. I want a clear distinction between "the query" and the different keyword arguments on the search methods. We also do a couple of operations on the query dict, rewrite it to some degree and pass it on between methods. Currently there's a good deal of code to first merge all the different ways of passing the query and bringing it into a canonical dict form (see Zope 2.13 Products.ZCatalog.Catalog at the beginning of searchResults, the CatalogSearchArgumentsMap class and the make_query method). All that code can go away once we just get a dict in the first place. This will also make collective.solr simpler, as it has to do the exact same on its own so far. It needs to rewrite the catalog query syntax into the Solr syntax. To my knowledge none of the various forms of constructing the query is more common or more advertised. The oldest one was using a request object, which some people figured out could just as well be a simple dict instead. The simple dict approach is what I've seen the most. Passing keyword arguments was another way, but is not more common or advertised to my knowledge. >> - Once we have intid's we can change the internal unique id of the >> catalog from the physical path over to an intid. > > Perhaps we should consider using UUIDs instead of intids? 
In order for merging of result sets across catalogs to be efficient, it needs to be integers and not strings. It's likely that querying one catalog only does a minimal restriction and gets you 100.000+ results and only merging it with the restriction from a second catalog will bring it down to a smaller number. At this scale the difference between IISets and sets of strings is significant. >> - Once we have parent pointers we can probably ditch storing metadata >> in the catalog and load objects directly. > > Why do __parent__ pointers help here? Once we have parent pointers, we don't need to traverse to the object anymore to construct its Acquisition chain. We can just do the moral equivalent of "connection.load(poid)" and use the result. As long as we have all the traversal overhead (like with five.intid), there's still a large overhead cost for this traversal. >> This >> should get us out of the business of maintaining a web server, but >> will also likely mean the loss of FTP and WebDAV support. > > I don't think that's a good option. We may not need to support both, but > supporting one is probably quite important. For one thing, it'd kill Enfold > Desktop and similar integrations. WebDAV is also very useful for bulk > movement of images and documents. I haven't ever seen an actual good and working WebDAV client for normal content editors. The WebDAV standard is dead and the big operating systems have no interest in fixing it or their implementations. FTP is even less user friendly and I've only seen WebFTP implementations that work for mortals. I think we should focus on better web-based upload and batch functionality and give up on those other protocols. As I said, there's some customers that want this, but it's a tiny minority and thus best served by an add-on. Just because FTP and WebDAV have been cool in 1998 doesn't mean we need to keep them in 2010. With HTML5 and AJAX UI's we have better answers to these use-cases now. 
>> - For Plone 4.2 I want to get rid of CMFDefault (there's too much UI >> code in it, which is useless to Plone) > > Is there anything that people regularly import from CMFDefault? I can't > think of anything other than perhaps Document being used in some tests. I'm not aware of anything. There's some bits of DublinCore base classes in it, some base parts of the Portal object and some site creation logic we use. Once we have plone.app.discussion we don't need DiscussionItem anymore. Other than that there's bits of tools code for the registration tool, membership tool and metadata tool. We probably overwrite 80% of the code somewhere in our own versions anyways for these. I did this removal once already for my Plone trunk work and it's not been too difficult. >> Remaining are CMFCore, DCWorkflow and GenericSetup. I don't think we >> can get rid of any of them in the foreseeable future. > > Do we have to? I quite like those bits. :-) > Well, CMFCore is maybe a bit bigger than we want it to be, but still. I want to completely get rid of Acquisition before we move to Python 3. You cannot
Re: [Framework-Team] Plone roadmap
On 5 September 2010 19:17, Martin Aspeli wrote:
> On 5 September 2010 15:29, Hanno Schlichting wrote:
>> - Once we have intid's we can change the internal unique id of the
>> catalog from the physical path over to an intid.
>
> Perhaps we should consider using UUIDs instead of intids?

We want to use intids because it is more efficient to intersect sets of integers. They are only an implementation detail though, and it should be possible to zap and rebuild your catalogue (assigning different intids) without problems.

>> - Once we have parent pointers we can probably ditch storing metadata
>> in the catalog and load objects directly.
>
> Why do __parent__ pointers help here?

With __parent__ pointers you can pull an object directly out of the ZODB complete with its location context. That means fetching the title and description for an item is usually just an object load.

What's not so clear about this is how we index an object's path and its allowed roles and users for the view permission. We should be able to learn from Zope3 here though.

There are balances to be struck between read and write efficiency here.

Laurence
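The intid argument above comes down to merging result sets by intersection. The sketch below shows only the shape of that operation, using plain Python sets as a stand-in; the catalog itself uses specialised integer set types (the thread mentions IISets), and the ids and variable names here are invented for the example.

```python
# Sketch: merging result sets from two catalogs by intersecting the
# integer ids they return. Plain Python sets stand in for the catalog's
# integer set types; the ids are made up.

# A loose restriction on the first catalog matches 100,000 objects.
results_a = set(range(0, 200000, 2))

# A tight restriction on the second catalog matches only a handful.
results_b = {4, 5, 6, 100001, 199998}

# Only ids present in both result sets survive the merge.
merged = sorted(results_a & results_b)

print(merged)  # [4, 6, 199998]
```

Because intids are only an implementation detail, nothing outside the catalogs needs to see these numbers; they just have to be consistent across the catalogs being merged.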
Re: [Framework-Team] Plone roadmap
On 5 September 2010 18:47, Chris McDonough wrote: > On Sun, 2010-09-05 at 18:18 +0200, Hanno Schlichting wrote: >> On Sun, Sep 5, 2010 at 5:46 PM, Wichert Akkerman wrote: >> > On 2010-9-5 17:29, Hanno Schlichting wrote: >> >> PluggableAuthService >> >> -- >> >> >> >> There's tons of code based on this. I imagine we can first move the >> >> authentication API's into a WSGI middleware querying PAS as the >> >> backend. >> > >> > This sounds like the mistake repoze.who 1 made. It turns out that for >> > almost every use case you want more control over handling login >> > behaviour than WSGI middleware can provide. It is much simpler to have a >> > simple API to an AAA service and use that than to try pushing this into >> > middleware. >> >> Right, I'm aware of the repoze.who lessons. Authorization is always >> going to be a WSGI framework component ("endware") and not an isolated >> middleware. But there should be some subpart of the API, which allows >> you to share the same authorization information across multiple WSGI >> applications. Or deal with some of the external authorization >> handling, when you offload things to Apache or other SSO approaches. >> >> But I'm not familiar enough with this topic to know what exact subpart >> this is. It might come down to just the userid. > > r.who 2 actually allows you to dial responsibilities up and down. You > can use "full stack" middleware that lends it effectively the same > responsibilities as r.who 1, or you can use only the r.who "API" portion > in an app or you can combine the two approaches as necessary. See also > http://docs.repoze.org/who/2.0/narr.html#using-repoze-who-without-wsgi-middleware > > The particular pain point you should never run into, because it is truly > horrible: don't try to do any login form post handling in an > "identifier". 
Just allow the application to render a self-posting login > form and use "the API" to check credentials and set headers and so on, > rather than putting the login form handling itself into middleware > machinery. In particular, never do anything remotely like the > "RedirectingForm" plugin within > http://svn.repoze.org/repoze.who/trunk/repoze/who/plugins/form.py (it > wants to be the target for a login form post). > > Aside from that (which is a problem for people of any competence level), > most of the problems with the middleware approach stem from needing to > explain how the middleware approach works to integrators of widely > varying competence levels. Each has his own slightly varying > requirements, and each needs the middleware approach to be explained to > him within that context. This has been a truly painful task for me, but > that's more an indictment of my level of patience than it is of r.who or > things like it. I'm tempted to say that the login form should be a separate application end point to the CMS. Authentication is something that can and should be shared across multiple applications - it's much easier to maintain a number of smaller focussed applications than one large monolithic one. Laurence ___ Framework-Team mailing list Framework-Team@lists.plone.org http://lists.plone.org/mailman/listinfo/framework-team
Re: [Framework-Team] Plone roadmap
On 5 September 2010 15:29, Hanno Schlichting wrote: > > - With Zope 2.13 / Plone 4.1 we are cleaning up the "query" interface > a bit. The catalog's search methods now all want a simple dictionary > as the query specification and issue deprecation warnings for > everything else. Can I ask why we need to do this? It seems like an obvious use case for keyword arguments, and this is a pattern we've been documenting and promoting for a long, long time. +1 to removing the REQUEST magic, though. > Quite a bit of code will need to be updated to avoid > using keyword arguments, passing in requests objects or mixtures of > all of these. This will make it easier to switch to a different > implementation later on, as we can keep the query syntax intact. We could do that with a query syntax based on kwargs as well, of course. ;-) > We > have also deprecated the insane "empty query" behavior of the catalog. > So far any query which didn't result in any index restriction returned > the entire catalog content. Starting in Zope 2.14 you get an empty > result instead. > +lots to this and everything else > - Get a new plone.indexing package or extend plone.indexer to do the > job of collective.indexing. Especially create indexing events which > are managed by an index manager instead of relying on mix-in classes. > This will replace object.reindexObject() and similar calls, CMF's > CatalogAware and the various catalog multiplex solutions. This will > need a rather long deprecation period as these calls are all over the > place. > They are all over the place, but rarely customised/subclassed, so one thing we can probably do is make some of these no-ops. - Once we have intid's we can change the internal unique id of the > catalog from the physical path over to an intid. Perhaps we should consider using UUIDs instead of intids? > - Once we have parent pointers we can probably ditch storing metadata > in the catalog and load objects directly. Why do __parent__ pointers help here? 
> > The Publisher > > - > > > > The Zope2 publisher has become incredibly complex, with numerous > > different hooks. In the long run (Plone 6?) we should replace it with > > our own simplified publisher which runs only in a WSGI pipeline. There > > will be a lot to learn from BFG here, though that is probably too > > simplistic for Plone. > > Concrete things I'd like to see: > > - Get a new round of community experimentation with WSGI now that's > inside Plone 4.1. We have seen some good interest while repoze.zope2 > was new and shiny. +1 - the downfall of repoze.zope2 was that it kept drifting out of sync with Zope 2 and Plone. > Hopefully wrap these things up and have Plone 4.2 > come with an official documented WSGI story. If things go well we can > make WSGI the only supported deployment option for Plone 5. This > should get us out of the business of maintaining a web server, but > will also likely mean the loss of FTP and WebDAV support. I don't think that's a good option. We may not need to support both, but supporting one is probably quite important. For one thing, it'd kill Enfold Desktop and similar integrations. WebDAV is also very useful for bulk movement of images and documents. Note that Dexterity has a pretty sane, pretty well documented WebDAV approach. I can't see why it has to be incompatible with WSGI. Very little of the DAV stuff is actually in the web server or publisher. Once we have > a good TTW multi-upload functionality this becomes possible (even > though some people will complain, but they will need to maintain and > evolve the code on their own - it's a niche requirement best done as > an add-on, much like CMIS will be for quite a while). > TTW multi-upload certainly helps, but WebDAV has for a long time been our answer for desktop integration, and I think jettisoning it would be a shame. Multi-upload doesn't help you uploading a large folder tree with images or files, downloading the same, opening and saving from desktop apps, etc. 
> - Experiment again with a cleaned up request/response object based on > WebOb. There's some insanities like the support for both getattr and > getitem to access request values, the whole DTML automagic quote > behavior and a ton more. > I think we can let Plone opt into a "simple" request object that doesn't support getattr and DTML quoting and various stupid variables (_steps anyone?). > I think we have done some steps towards it, for example with > supporting the standard dict API for accessing containers instead of > objectIds and friends. If we can move the copy/cut/paste code over to > a new approach we can avoid most of the manage_ hooks. Once we figure > out a new pattern to do object construction (the new "invokeFactory") > things will get easier. You can use the ZTK createObject() method, which just looks up and calls an IFactory utility by name. This is what Dexterity does, and it's pretty straightforward. An IFactory is just a call
Re: [Framework-Team] Plone roadmap
On Sun, 2010-09-05 at 18:18 +0200, Hanno Schlichting wrote: > On Sun, Sep 5, 2010 at 5:46 PM, Wichert Akkerman wrote: > > On 2010-9-5 17:29, Hanno Schlichting wrote: > >> PluggableAuthService > >> -- > >> > >> There's tons of code based on this. I imagine we can first move the > >> authentication API's into a WSGI middleware querying PAS as the > >> backend. > > > > This sounds like the mistake repoze.who 1 made. It turns out that for > > almost every use case you want more control over handling login > > behaviour than WSGI middleware can provide. It is much simpler to have a > > simple API to an AAA service and use that than to try pushing this into > > middleware. > > Right, I'm aware of the repoze.who lessons. Authorization is always > going to be a WSGI framework component ("endware") and not an isolated > middleware. But there should be some subpart of the API, which allows > you to share the same authorization information across multiple WSGI > applications. Or deal with some of the external authorization > handling, when you offload things to Apache or other SSO approaches. > > But I'm not familiar enough with this topic to know what exact subpart > this is. It might come down to just the userid. r.who 2 actually allows you to dial responsibilities up and down. You can use "full stack" middleware that lends it effectively the same responsibilities as r.who 1, or you can use only the r.who "API" portion in an app or you can combine the two approaches as necessary. See also http://docs.repoze.org/who/2.0/narr.html#using-repoze-who-without-wsgi-middleware The particular pain point you should never run into, because it is truly horrible: don't try to do any login form post handling in an "identifier". Just allow the application to render a self-posting login form and use "the API" to check credentials and set headers and so on, rather than putting the login form handling itself into middleware machinery. 
In particular, never do anything remotely like the "RedirectingForm" plugin within http://svn.repoze.org/repoze.who/trunk/repoze/who/plugins/form.py (it wants to be the target for a login form post). Aside from that (which is a problem for people of any competence level), most of the problems with the middleware approach stem from needing to explain how the middleware approach works to integrators of widely varying competence levels. Each has his own slightly varying requirements, and each needs the middleware approach to be explained to him within that context. This has been a truly painful task for me, but that's more an indictment of my level of patience than it is of r.who or things like it. - C ___ Framework-Team mailing list Framework-Team@lists.plone.org http://lists.plone.org/mailman/listinfo/framework-team
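The advice above (let the application render a self-posting login form and call an API to check credentials and set headers) can be sketched as follows. This is not repoze.who code: `check_credentials`, `USERS`, and the cookie format are hypothetical stand-ins for whatever authentication API the application uses, and a real deployment would issue a signed ticket rather than the bare user id.

```python
# Sketch: the application owns the login form and posts to itself; an
# "API" function only checks credentials and returns response headers.
# Everything here is an illustrative assumption, not repoze.who's API.
from urllib.parse import parse_qs

USERS = {"admin": "secret"}  # hypothetical credential store

FORM = (
    b'<form method="post" action="">'
    b'<input name="login"><input name="password" type="password">'
    b'<input type="submit" value="Log in"></form>'
)


def check_credentials(login, password):
    """Hypothetical auth API: return (userid, headers) or (None, [])."""
    if USERS.get(login) == password:
        # Placeholder cookie only; not a real signed ticket.
        return login, [("Set-Cookie", "auth=%s; Path=/" % login)]
    return None, []


def login_view(environ, start_response):
    if environ.get("REQUEST_METHOD") == "POST":
        size = int(environ.get("CONTENT_LENGTH") or 0)
        form = parse_qs(environ["wsgi.input"].read(size).decode("utf-8"))
        userid, headers = check_credentials(
            form.get("login", [""])[0], form.get("password", [""])[0]
        )
        if userid is not None:
            # The application decides where to go and what to log.
            start_response("302 Found", [("Location", "/")] + headers)
            return [b""]
        # The application also controls the error message and the skin.
        start_response("401 Unauthorized", [("Content-Type", "text/html")])
        return [b"<p>Login failed.</p>" + FORM]
    start_response("200 OK", [("Content-Type", "text/html")])
    return [FORM]
```

Keeping the form handling in the view leaves login logging, skinning, and error messages in the application's hands, which is the control the thread says integrators were unwilling to give up.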
Re: [Framework-Team] Plone roadmap
On Sun, Sep 5, 2010 at 5:46 PM, Wichert Akkerman wrote: > On 2010-9-5 17:29, Hanno Schlichting wrote: >> PluggableAuthService >> -- >> >> There's tons of code based on this. I imagine we can first move the >> authentication API's into a WSGI middleware querying PAS as the >> backend. > > This sounds like the mistake repoze.who 1 made. It turns out that for > almost every use case you want more control over handling login > behaviour than WSGI middleware can provide. It is much simpler to have a > simple API to an AAA service and use that than to try pushing this into > middleware. Right, I'm aware of the repoze.who lessons. Authorization is always going to be a WSGI framework component ("endware") and not an isolated middleware. But there should be some subpart of the API, which allows you to share the same authorization information across multiple WSGI applications. Or deal with some of the external authorization handling, when you offload things to Apache or other SSO approaches. But I'm not familiar enough with this topic to know what exact subpart this is. It might come down to just the userid. Hanno ___ Framework-Team mailing list Framework-Team@lists.plone.org http://lists.plone.org/mailman/listinfo/framework-team
Re: [Framework-Team] Plone roadmap
On 2010-9-5 17:29, Hanno Schlichting wrote:
> PluggableAuthService
> --
>
> There's tons of code based on this. I imagine we can first move the
> authentication API's into a WSGI middleware querying PAS as the
> backend.

This sounds like the mistake repoze.who 1 made. It turns out that for almost every use case you want more control over handling login behaviour than WSGI middleware can provide. It is much simpler to have a simple API to an AAA service and use that than to try pushing this into middleware.

Wichert.

-- 
Wichert Akkerman         It is simple to make things.
http://www.wiggy.net/    It is hard to make things simple.
Re: [Framework-Team] Plone roadmap
Hi. Some more lengthy points :) On Thu, Sep 2, 2010 at 9:39 PM, Laurence Rowe wrote: > Catalogue and References > > > Once all content has __parent__ pointers to the root, we will be able > to use standard ZTK catalog components. Except that the standard ZTK catalog components are much too simplistic for our needs. The ZCatalog, PluginIndexes and various catalog add-ons we have would need to be reimplemented. I think we can use zope.intid and zope.keyreference (once we have parent pointers), but will need to write everything else from scratch. As this is a big task, I think we need to do some smaller steps: - With Zope 2.13 / Plone 4.1 we are cleaning up the "query" interface a bit. The catalog's search methods now all want a simple dictionary as the query specification and issue deprecation warnings for everything else. Quite a bit of code will need to be updated to avoid using keyword arguments, passing in requests objects or mixtures of all of these. This will make it easier to switch to a different implementation later on, as we can keep the query syntax intact. We have also deprecated the insane "empty query" behavior of the catalog. So far any query which didn't result in any index restriction returned the entire catalog content. Starting in Zope 2.14 you get an empty result instead. - With Zope 2.13 and going forward I'll extend and improve the ZCatalog some more. queryplan, slow query logging with ZMI reporting are already in. Next up are things like specialized indexes for boolean values (so the ATCT boolean criteria doesn't have to query for "[0, '', False, '0', 'False', None, (), [], {}, Missing.MV]" anymore), a "unique value" index to efficiently hold data for the UID index, incorporating fixes from enfold.fixes to avoid the conflict error hotspots in the catalog and finally merging in a cleaned up version of the unimr.compositeindex (a multi-column index). I also want to put batch handling into the catalog in the same way collective.solr does this. 
Instead of querying for all content and then wrapping things in a batch class to access just a particular batch of 20 items, we should pass the batch information directly into the catalog and allow it to just give us the 20 objects we are interested in. I think there's some more we can do here and currently it's much easier to improve these things inside Zope2.

- Clean up the default indexes and metadata used in Plone, for example move the Title and Description index away from being ZCTextIndexes.

- Get a new plone.indexing package or extend plone.indexer to do the job of collective.indexing. Especially create indexing events which are managed by an index manager instead of relying on mix-in classes. This will replace object.reindexObject() and similar calls, CMF's CatalogAware and the various catalog multiplex solutions. This will need a rather long deprecation period as these calls are all over the place.

- Define the actual contracts to re/index only part of an object and implement this in the new event approach. For example imagine you change the value of the "title" via setTitle, the field is called title, the accessor is Title and there's both an index named Title and one called SearchableText that indexes this value - what name do you actually pass into the equivalent "reindexObject" event and whose responsibility is it to notify all indexes that are interested in this value? How do you deal with custom methods tracking the title field or plone.indexer wrappers? Currently we always have to reindex the entire object, as these dependencies are all unclear or there's no way to specify them (well, you can pass in index names right now, but the index name doesn't have to be the same as the actual indexed attribute and you have no idea if there are other indexes that also index "your data", making the whole thing rather pointless).

- Once we have intids we can change the internal unique id of the catalog from the physical path over to an intid. This will allow us to rename or move "top level" folders without losing all index information and reindexing the entire subtree from scratch. In combination with the above "partial reindex" contracts, we should be able to update only the indexes which are dependent on the physical path. It will also allow us to do queries across multiple catalogs with reasonable speed.

- Once we have parent pointers we can probably ditch storing metadata in the catalog and load objects directly. This depends on having content with a sane persistent structure. Having blobs for large data already helps a lot, but Archetypes BaseUnit indirection is certainly not sane enough for this to work.

Next to these things we'll probably do more work with Solr for indexing textual data, which will get us further on things like faceting, spell correction, auto-suggest and binary data extraction. Maybe someone might also finish "NOT" support in ZCatalog (there's some half-finished patch for it in the Zope2