Ashish wrote:
> I am thinking about starting with the following TODO item:
> 
> --> Have EXPLAIN ANALYZE issue NOTICE messages when the estimated
> and actual row counts differ by a specified percentage.
> 
> I picked this because it is somewhat related to query processing
> which is what I am most interested in. It also <seems> like a
> good start up project for a newbie like me. Before I start
> looking into what this would involve and start a conversation
> on designing a solution - I wanted to know what you guys think
> about this particular TODO, and it suitability to a newbie.
> Looking forward to your comments...

I even have a sample patch you can use as a start, attached.

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/ref/explain.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/explain.sgml,v
retrieving revision 1.38
diff -c -c -r1.38 explain.sgml
*** doc/src/sgml/ref/explain.sgml	18 Sep 2006 19:54:01 -0000	1.38
--- doc/src/sgml/ref/explain.sgml	22 Dec 2006 17:09:05 -0000
***************
*** 64,72 ****
    <para>
     The <literal>ANALYZE</literal> option causes the statement to be actually executed, not only
     planned.  The total elapsed time expended within each plan node (in
!    milliseconds) and total number of rows it actually returned are added to
!    the display.  This is useful for seeing whether the planner's estimates
!    are close to reality.
    </para>
  
    <important>
--- 64,72 ----
    <para>
     The <literal>ANALYZE</literal> option causes the statement to be actually executed, not only
     planned.  The total elapsed time expended within each plan node (in
!    milliseconds) and total number of rows it actually returned and variance are added to
!    the display.  A sign of the variance indicates whether the estimate was too high or too low.
!    This is useful for seeing how close the planner's estimates are to reality.
    </para>
  
    <important>
***************
*** 222,229 ****
  
                                                         QUERY PLAN                                                        
  -------------------------------------------------------------------------------------------------------------------------
!  HashAggregate  (cost=39.53..39.53 rows=1 width=8) (actual time=0.661..0.672 rows=7 loops=1)
!    -&gt;  Index Scan using test_pkey on test  (cost=0.00..32.97 rows=1311 width=8) (actual time=0.050..0.395 rows=99 loops=1)
           Index Cond: ((id &gt; $1) AND (id &lt; $2))
   Total runtime: 0.851 ms
  (4 rows)
--- 222,229 ----
  
                                                         QUERY PLAN                                                        
  -------------------------------------------------------------------------------------------------------------------------
!  HashAggregate  (cost=39.53..39.53 rows=1 width=8) (actual time=0.661..0.672 rows=7 var=-6.00 loops=1)
!    -&gt;  Index Scan using test_pkey on test  (cost=0.00..32.97 rows=1311 width=8) (actual time=0.050..0.395 rows=99 var=+12.24 loops=1)
           Index Cond: ((id &gt; $1) AND (id &lt; $2))
   Total runtime: 0.851 ms
  (4 rows)
Index: src/backend/commands/explain.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/explain.c,v
retrieving revision 1.152
diff -c -c -r1.152 explain.c
*** src/backend/commands/explain.c	4 Oct 2006 00:29:51 -0000	1.152
--- src/backend/commands/explain.c	22 Dec 2006 17:09:09 -0000
***************
*** 57,62 ****
--- 57,63 ----
  static void show_sort_keys(Plan *sortplan, int nkeys, AttrNumber *keycols,
  			   const char *qlabel,
  			   StringInfo str, int indent, ExplainState *es);
+ static double ExplainVariance(double estimate, double actual);
  
  /*
   * ExplainQuery -
***************
*** 704,713 ****
  	{
  		double		nloops = planstate->instrument->nloops;
  
! 		appendStringInfo(str, " (actual time=%.3f..%.3f rows=%.0f loops=%.0f)",
  						 1000.0 * planstate->instrument->startup / nloops,
  						 1000.0 * planstate->instrument->total / nloops,
  						 planstate->instrument->ntuples / nloops,
  						 planstate->instrument->nloops);
  	}
  	else if (es->printAnalyze)
--- 705,716 ----
  	{
  		double		nloops = planstate->instrument->nloops;
  
! 		appendStringInfo(str, " (actual time=%.3f..%.3f rows=%.0f var=%+.2f loops=%.0f)",
  						 1000.0 * planstate->instrument->startup / nloops,
  						 1000.0 * planstate->instrument->total / nloops,
  						 planstate->instrument->ntuples / nloops,
+ 						 ExplainVariance(plan->plan_rows,
+ 									planstate->instrument->ntuples / nloops),
  						 planstate->instrument->nloops);
  	}
  	else if (es->printAnalyze)
***************
*** 1205,1207 ****
--- 1208,1225 ----
  
  	appendStringInfo(str, "\n");
  }
+ 
+ 
+ static double ExplainVariance(double estimate, double actual)
+ {
+ 	if (estimate == actual)
+ 		return 0;
+ 	else if (actual == 0)
+ 		return estimate;
+ 	else if (estimate == 0)
+ 		return -actual;
+ 	else if (estimate > actual)
+ 		return (estimate / actual) - 1;
+ 	else
+ 		return -(actual / estimate - 1);
+ }	
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to