On Thu, 2008-07-10 at 14:59 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > On Thu, 2008-06-26 at 12:50 -0400, Tom Lane wrote:
> >> I think you need to move up a level, and perhaps refactor some of the
> >> existing code to make it easier to inject made-up stats.
> 
> > I've looked at ways of refactoring this, but the hooks provided here
> > complement the existing functionality fairly well. 
> 
> The hooks proposed here are completely useless --- it's impossible
> to return anything except actual catalog data, because whatever is
> returned is going to be subjected to ReleaseSysCache() later on.
> I'd think you at least need to be able to turn that into a no-op,
> or perhaps a pfree() call.  Also it's clear that some calls of
> examine_variable would still return regular syscache entries, so
> the cleanup behavior has to be determinable on a per-call basis.

I've introduced an additional plugin function 

typedef void (*release_relation_stats_hook_type) (HeapTuple statstup);

Simply pfree-ing things would prevent the plugin from sometimes
returning a syscache entry or keeping its own cache internally.

> I also still think that you need to reconsider the level at which
> the hook gets control.  It's impossible given these hook definitions
> to create "fake" data for an appendrel or sub-SELECT, which seems to
> me to be an important requirement, at least till such time as the
> core planner handles those cases better.

I agree with this, especially with regard to join relations.

However, I'm having difficulty seeing how to identify these structures
externally to allow any meaningful lookups of data. So if we have a 7
table join, then how do we specifically identify the join between tables
4 and 7, say.

We can say, its "foo = foo", but that can easily occur multiple times in
a many way join, especially since join columns are frequently named the
same. How should we tell them apart? AFAICS we can't and yet we need to
if this is to be useful.

Overriding that is already reasonably easily possible anyway using
operators, so I'm not seeing this plugin as a way to introduce hints.
This is purely to provide access to alternate stats.

> My feeling is that the hooks ought to be at more or less the same
> semantic level as examine_variable and ReleaseVariableStats, and that
> rather than making special-case hooks for the other couple of direct
> accesses to STATRELATT, the right thing is to make sure that the
> examine_variable hook will work for them too.

In general, I agree, though this does cause more work in the plugin and
in other areas of selfuncs.c

btcostestimate() searches syscache directly rather than going through
examine_variable. I suspect refactoring that would cause changes to
internals of examine_variable, possibly also changing the API to cater
for index stats.

I have tried to follow this through as you suggest, but the thought that
examine_variable is the right level doesn't seem as clean as it sounded
when you first suggested.

With those problems in mind, can we stick to the original plan? It's
reasonably clear how it works, and it does work correctly now with the
new release call mentioned above.

Or can you suggest the refactoring you would like to see in that area?
Or better still commit something for me to use as a base.

> I think it might be a good idea to prepare an actual working example
> of a plug-in that does something with the hook in order to verify
> that you've got a useful definition.

I've tested this new patch with the plugin I showed you before, rewired
so it makes syscache calls via patch, as a "null" test case. 
All works, no problems.

I'll submit the fully working plugin once we've stabilised the API. It's
designed as a contrib module, so it can go in pgfoundry or contrib.

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
Index: src/backend/utils/adt/selfuncs.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/utils/adt/selfuncs.c,v
retrieving revision 1.250
diff -c -r1.250 selfuncs.c
*** src/backend/utils/adt/selfuncs.c	7 Jul 2008 20:24:55 -0000	1.250
--- src/backend/utils/adt/selfuncs.c	6 Aug 2008 09:06:26 -0000
***************
*** 103,108 ****
--- 103,111 ----
  #include "utils/selfuncs.h"
  #include "utils/syscache.h"
  
+ /* Hook for plugins to get control when we ask for stats */
+ get_relation_stats_hook_type get_relation_stats_hook = NULL;
+ release_relation_stats_hook_type release_relation_stats_hook = NULL;
  
  static double var_eq_const(VariableStatData *vardata, Oid operator,
  			 Datum constval, bool constisnull,
***************
*** 3769,3775 ****
  		}
  		else if (rte->rtekind == RTE_RELATION)
  		{
! 			vardata->statsTuple = SearchSysCache(STATRELATT,
  												 ObjectIdGetDatum(rte->relid),
  												 Int16GetDatum(var->varattno),
  												 0, 0);
--- 3772,3783 ----
  		}
  		else if (rte->rtekind == RTE_RELATION)
  		{
! 			if (get_relation_stats_hook)
! 				vardata->statsTuple = (*get_relation_stats_hook) 
! 												(ObjectIdGetDatum(rte->relid),
! 												 Int16GetDatum(var->varattno));
! 			else
! 				vardata->statsTuple = SearchSysCache(STATRELATT,
  												 ObjectIdGetDatum(rte->relid),
  												 Int16GetDatum(var->varattno),
  												 0, 0);
***************
*** 3889,3898 ****
  							index->indpred == NIL)
  							vardata->isunique = true;
  						/* Has it got stats? */
! 						vardata->statsTuple = SearchSysCache(STATRELATT,
  										   ObjectIdGetDatum(index->indexoid),
! 													  Int16GetDatum(pos + 1),
! 															 0, 0);
  						if (vardata->statsTuple)
  							break;
  					}
--- 3897,3911 ----
  							index->indpred == NIL)
  							vardata->isunique = true;
  						/* Has it got stats? */
! 						if (get_relation_stats_hook)
! 							vardata->statsTuple = (*get_relation_stats_hook) 
! 											(ObjectIdGetDatum(index->indexoid),
! 														 Int16GetDatum(pos + 1));
! 						else
! 							vardata->statsTuple = SearchSysCache(STATRELATT,
  										   ObjectIdGetDatum(index->indexoid),
! 													  	 Int16GetDatum(pos + 1),
! 														 0, 0);
  						if (vardata->statsTuple)
  							break;
  					}
***************
*** 5527,5536 ****
  		colnum = 1;
  	}
  
! 	tuple = SearchSysCache(STATRELATT,
! 						   ObjectIdGetDatum(relid),
! 						   Int16GetDatum(colnum),
! 						   0, 0);
  
  	if (HeapTupleIsValid(tuple))
  	{
--- 5540,5554 ----
  		colnum = 1;
  	}
  
! 	if (get_relation_stats_hook)
! 		tuple = (*get_relation_stats_hook) (
! 							ObjectIdGetDatum(relid),
! 							Int16GetDatum(colnum));
! 	else
! 		tuple = SearchSysCache(STATRELATT,
! 							ObjectIdGetDatum(relid),
! 							Int16GetDatum(colnum),
! 							0, 0);
  
  	if (HeapTupleIsValid(tuple))
  	{
Index: src/backend/utils/cache/lsyscache.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/utils/cache/lsyscache.c,v
retrieving revision 1.157
diff -c -r1.157 lsyscache.c
*** src/backend/utils/cache/lsyscache.c	13 Apr 2008 20:51:21 -0000	1.157
--- src/backend/utils/cache/lsyscache.c	6 Aug 2008 09:10:41 -0000
***************
*** 27,32 ****
--- 27,33 ----
  #include "catalog/pg_proc.h"
  #include "catalog/pg_statistic.h"
  #include "catalog/pg_type.h"
+ #include "optimizer/plancat.h"
  #include "miscadmin.h"
  #include "nodes/makefuncs.h"
  #include "utils/array.h"
***************
*** 35,40 ****
--- 36,43 ----
  #include "utils/lsyscache.h"
  #include "utils/syscache.h"
  
+ /* Hook for plugins to get control in get_attavgwidth() */
+ get_attavgwidth_hook_type get_attavgwidth_hook = NULL;
  
  /*				---------- AMOP CACHES ----------						 */
  
***************
*** 2451,2466 ****
   *
   *	  Given the table and attribute number of a column, get the average
   *	  width of entries in the column.  Return zero if no data available.
   */
  int32
  get_attavgwidth(Oid relid, AttrNumber attnum)
  {
  	HeapTuple	tp;
  
! 	tp = SearchSysCache(STATRELATT,
  						ObjectIdGetDatum(relid),
  						Int16GetDatum(attnum),
  						0, 0);
  	if (HeapTupleIsValid(tp))
  	{
  		int32		stawidth = ((Form_pg_statistic) GETSTRUCT(tp))->stawidth;
--- 2454,2480 ----
   *
   *	  Given the table and attribute number of a column, get the average
   *	  width of entries in the column.  Return zero if no data available.
+  *
+  *	  Calling a hook at this point looks somewhat strange, but is required
+  * 	  because the optimizer handles inheritance relations by calling for
+  *	  the avg width later in the planner than get_relation_info_hook().
+  *	  So the APIs and call points of hooks must match the optimizer.
   */
  int32
  get_attavgwidth(Oid relid, AttrNumber attnum)
  {
  	HeapTuple	tp;
  
! 	if (get_attavgwidth_hook)
! 		return (*get_attavgwidth_hook) (
! 						ObjectIdGetDatum(relid),
! 						Int16GetDatum(attnum));
! 	else
! 		tp = SearchSysCache(STATRELATT,
  						ObjectIdGetDatum(relid),
  						Int16GetDatum(attnum),
  						0, 0);
+ 
  	if (HeapTupleIsValid(tp))
  	{
  		int32		stawidth = ((Form_pg_statistic) GETSTRUCT(tp))->stawidth;
Index: src/include/optimizer/plancat.h
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/include/optimizer/plancat.h,v
retrieving revision 1.50
diff -c -r1.50 plancat.h
*** src/include/optimizer/plancat.h	19 Jun 2008 00:46:06 -0000	1.50
--- src/include/optimizer/plancat.h	6 Aug 2008 09:05:19 -0000
***************
*** 14,19 ****
--- 14,20 ----
  #ifndef PLANCAT_H
  #define PLANCAT_H
  
+ #include "access/htup.h"
  #include "nodes/relation.h"
  #include "utils/relcache.h"
  
***************
*** 24,29 ****
--- 25,41 ----
  														 RelOptInfo *rel);
  extern PGDLLIMPORT get_relation_info_hook_type get_relation_info_hook;
  
+ /* Hooks for plugins to get control in lsyscache.c and selfuncs.c */
+ typedef HeapTuple (*get_relation_stats_hook_type) (Datum relid, Datum attnum);
+ extern PGDLLIMPORT get_relation_stats_hook_type get_relation_stats_hook;
+ 
+ typedef void (*release_relation_stats_hook_type) (HeapTuple statstup);
+ extern PGDLLIMPORT release_relation_stats_hook_type release_relation_stats_hook;
+ 
+ typedef int32 (*get_attavgwidth_hook_type) (Datum relid, Datum attnum);
+ extern PGDLLIMPORT get_attavgwidth_hook_type get_attavgwidth_hook;
+ 
+ 
  
  extern void get_relation_info(PlannerInfo *root, Oid relationObjectId,
  				  bool inhparent, RelOptInfo *rel);
Index: src/include/utils/selfuncs.h
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/include/utils/selfuncs.h,v
retrieving revision 1.44
diff -c -r1.44 selfuncs.h
*** src/include/utils/selfuncs.h	9 Mar 2008 00:32:09 -0000	1.44
--- src/include/utils/selfuncs.h	6 Aug 2008 05:07:09 -0000
***************
*** 79,85 ****
  #define ReleaseVariableStats(vardata)  \
  	do { \
  		if (HeapTupleIsValid((vardata).statsTuple)) \
! 			ReleaseSysCache((vardata).statsTuple); \
  	} while(0)
  
  
--- 79,90 ----
  #define ReleaseVariableStats(vardata)  \
  	do { \
  		if (HeapTupleIsValid((vardata).statsTuple)) \
! 		{ \
! 			if (release_relation_stats_hook) \
! 				(* release_relation_stats_hook)((vardata).statsTuple); \
! 			else \
! 				ReleaseSysCache((vardata).statsTuple); \
! 		} \
  	} while(0)
  
  
-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

Reply via email to