On Wed, Oct 6, 2010 at 1:52 AM, Ian Booth <[email protected]> wrote: > Hi > > I took the opportunity to look at one of the code team top 10's (both > for duration and statement count). I thought I'd share what I found - it
\o/ > *Question:* does the "Repeated SQL Statements" section on the oops page > collate the statements with or without taking into account the > parameters being substituted? If without, should it? OOPs discard the bind parameters, so they detect queries with the same template and different parameters as repeated (which is desirable - exact repeats are rare, the usual repeat we see is an associated table being looked up just-in-time rather than en masse). > *Analysis* > > 1. Duplicate sql in populating view items (which btw are count(*) which > may be expensive in postgres I've gathered from others' posts). Yes. The best way to think of count(*) cost is 'the same as doing the query'. > Here's one particular issue which consumes, for this particular oops, > 20% of the sql time. The root cause leads to a design issue which I'll > pose as a question at the end. Very briefly, the > PersonSubscribedBranchesView causes the instantiation of 2 copies of a > PersonBranchesMenu object - one for the menu links under the page title > and one for the links on the right hand side of the page. Each of these > uses a property cache, but a new, different cache instance is used for > each PersonBranchesMenu object, even though the underlying domain object > whose properties are being cached represents the same real world entity. > Hence as each PersonBranchesMenu attribute is populated, there is one > cache miss per attribute per cache instance - it should be only one miss > per attribute IMHO. Can you put the cache on the domain object? > The other issue with the way this page is populated is that the > construction of both PersonBranchesMenu instances results in every > single one of the defined "extra_attributes" being retrieved, even > though one of those instances is only used to render the "Add Branch" > link on the right of the page and none of the extra_attributes from that > instance are actually used. Thats certainly an issue too; asking for a narrower report from the context may help? > So we're executing potentially expensive sql to get attributes which > have already been fetched in a different instance and which aren't in > fact required for the instance in question. For this page, one of the > PersonBranchesMenu instances comes from a reference directly in the page > template, the other comes from an included macro. > > *Questions* > > a. Should the property caching infrastructure be made smarter so that > for a given view rendering cycle, each domain object representation only > uses a single cache instance? It already does; you probably have two domain object instances, not one instance with two caches. There are a few ways this could occur; what particular object and cached attribute are you looking at? (so that I can look at the code). > b. Can anything be done to configure the tales/zope infrastructure so > that entities involved in rendering the view be accessed as singletons > where appropriate? So the basic problem is that we have multiple 'templates' involved in rendering, and each template has a 'view' and the view has a 'context'. If you provide a single context for all the views you can get a single instance; 'model' objects will have this behaviour intrinsically due to the storm cache, other objects won't. Alternatively, juidicious caching on the model objects can let things traverse more sensibly - I did that two weeks back on bugtask:+index, if you want an example. > 2. Other duplicate sqls > > According to the oops report, there's a large number of duplicate sqls > which don't seem to be used to directly populate any view items (but I > haven't done a complete analysis yet). > > 17 x select ... from Person LP_DEBUG_SQL_EXTRA=1 is your friend: run interactively with this, and you can see whats triggering the call. My money is on permission checking. > 17 x select ... from ValidPersonCache This is checking 'is_valid_person' - see Person._validity_clauses for eager-loading support for this. Examples clients are in blueprints and person.py. > 14 x select .... FROM LibraryFileAlias Probably team branding metadata lookups, a prime candidate for eager loading - just get the tupleset and use DRS on it; these three sets of queries probably should all be deleted and be either: a single result set for the main object driving their retrieval or three queries done in the pre_iter_hook of a DRS for that main object which makes most sense will depend on the number of NULLs and the width of the resulting query. > These don't take up too much sql time but to me the high execution > counts (> 1/2 the total) suggest there's something somewhere that should > be cached. Besides, these will add additional I/O load to the database > server which all adds up if the system is under load. Perhaps there will > be a similar sert of sqls executed for other similar views and a > common/shared optimisation approach can be used? I don't yet know enough > about the origin of these to make a call. Anyone else have any ideas? > > If you've managed to read this far without hitting delete, well done and > go get a life :-) Can't, sorry... I'm lifeless :) -Rob _______________________________________________ Mailing list: https://launchpad.net/~launchpad-dev Post to : [email protected] Unsubscribe : https://launchpad.net/~launchpad-dev More help : https://help.launchpad.net/ListHelp

