Re: [HACKERS] Delaying the planning of unnamed statements until Bind

2004-05-22 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes:

> Greg Stark [EMAIL PROTECTED] writes:
> 
> > If the planner had the expected value as well as the variance of the cost
> > distribution then it might realize that in this case, for instance, the
> > penalty for guessing wrong with an index scan is only going to be a small
> > slowdown factor, perhaps 2-4x slower, whereas the penalty for guessing wrong
> > with a sequential scan could be a factor in the thousands or more.
> 
> Au contraire --- a full-table index scan can be vastly slower than a
> full-table seqscan.

Sure, but how much slower? 10x? 20x? 100x? 1,000x? The old rule of thumb was
that the break-even point was somewhere around 15% selectivity, which would
make it about 7x slower. But whatever it is, it's going to be some
substantial but still linear slowdown. And the query will be slow but still
usable.

A mistaken sequential scan used when an index scan was needed could be
millions of times slower. The sequential scan time bears no relationship to
the index scan time, so there is no upper bound on the slowdown at all.
In an OLTP environment, guessing wrong in this direction makes the
difference between the query working and it effectively failing to work at
all.

> I think it's wishful thinking to assume that picking an indexscan is the
> right thing when we don't know any better.

If we don't know any better then any solution is going to be wishful thinking.
It's kind of like a bridge game: if you don't know where a card is but you
need it in a particular place in order to win, then you have to assume it's
there and hope for the best. If the index is wrong then the query was going to
be slow either way. If the index was right and you chose not to use it, you
risk making the query slow unnecessarily, potentially when it was absolutely
required to be fast.

As further evidence I'll mention that Oracle falls back to the rules-based
optimizer if it has no statistics. The rules-based optimizer -- which for
many years was the ONLY optimizer it had -- always uses an index if it can.


-- 
greg


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Delaying the planning of unnamed statements until Bind

2004-05-22 Thread Greg Stark

Greg Stark [EMAIL PROTECTED] writes:

> Tom Lane [EMAIL PROTECTED] writes:
> 
> > 	select * from mytable where entry_time = $1;
> 
> > The planner will take a seqscan when it sees this because it is worried
> > about the downside if a large fraction of the table is being selected.

I'll mention another factor that's hidden here: the type of application that
cares enough about parse/optimization time and plan stability to reuse plans
would be an OLTP application. The type of application that would want to
delay optimization until the parameters are known, in order to get the best
plan, would be a DSS application or an ad-hoc query.

So the very fact that the user is using placeholders like this is evidence
that the optimizer should err in the direction of assuming quick, very
selective queries.

If the specific value of $1 would help a large ad-hoc batch report run
quicker, the user always has the option of providing it inline. The OLTP
application running thousands of queries per second doesn't have the option
of doing that without a serious performance hit.
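The trade-off is easy to see with the quoted query (table and column names as in Tom's example; the `timestamptz` parameter type and the literal value are assumptions for illustration):

```sql
-- OLTP style: plan once, reuse many times.  The planner sees only the
-- placeholder, never a concrete value.
PREPARE q(timestamptz) AS
	SELECT * FROM mytable WHERE entry_time = $1;
EXECUTE q('2004-05-22');

-- Ad-hoc style: the literal is visible, so the planner can consult the
-- statistics for this specific value.
SELECT * FROM mytable WHERE entry_time = '2004-05-22';
```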

-- 
greg




Re: [HACKERS] Database variables when copying a database

2004-05-22 Thread Christopher Kings-Lynne
Has anyone thought about this at all yet?  Is it possible to have
per-database variables that refer to the current database in some way
that would need to be altered to refer to the new db?

Chris
Christopher Kings-Lynne wrote:
> Hi,
> 
> When you do this:
> 
> 	CREATE DATABASE test TEMPLATE master;
> 
> it doesn't copy any per-database GUC variables created by the ALTER
> DATABASE master SET var TO val; command.
> 
> Should it?
> 
> Chris


[HACKERS] pg_autovacuum fixes

2004-05-22 Thread Matthew T. O'Connor
This weekend I am trying to fix up all the known pg_autovacuum issues
that should be resolved for 7.4.3.  I am aware of only two: temp
table issues, and unchecked send_query() calls.  If I am forgetting
something, please let me know.

1) temp table issue:
I was not able to reproduce the crash associated with temp tables.  I
spent a while creating tables, doing updates, and dropping them, trying
without success to get pg_autovacuum to crash.  Since I couldn't trigger
the problem, I will need someone else to test whether I have fixed it.
Anyway, I have modified the query to exclude temp tables from the list of
tables to work with, so we should no longer be dealing with temp tables
at all, which should sidestep any temp-table-related problem we might
have been having.
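For reference, a sketch (not the actual patch text) of how a stats query could exclude temp tables, which in 7.4 live in per-backend pg_temp_N schemas:

```sql
-- Sketch only: exclude tables in the per-backend temp schemas from
-- the list pg_autovacuum works with.
SELECT c.oid, c.relname
FROM pg_class c
     JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT LIKE 'pg_temp%';
```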

2) Unchecked send_query() function calls:
As best I can tell, this is mostly a non-issue, but I went ahead and
added a check to every section that does anything with the result of
send_query(), so if this was an issue, it should be fixed now.  BTW, this
might have been the cause of the temp table related crash, but that is
just a guess.


Matthew O'Connor


*** ./pg_autovacuum.c.orig	2004-05-22 02:56:09.0 -0400
--- ./pg_autovacuum.c	2004-05-22 03:36:01.152691850 -0400
***************
*** 225,294 ****
  		 * tables to the list that are new
  		 */
  		res = send_query((char *) TABLE_STATS_QUERY, dbi);
! 		t = PQntuples(res);
! 
! 		/*
! 		 * First: use the tbl_list as the outer loop and the result set as
! 		 * the inner loop, this will determine what tables should be
! 		 * removed
! 		 */
! 		while (tbl_elem != NULL)
! 		{
! 			tbl = ((tbl_info *) DLE_VAL(tbl_elem));
! 			found_match = 0;
! 
! 			for (i = 0; i < t; i++)
! 			{	/* loop through result set looking for a
!  * match */
! if (tbl->relid == atooid(PQgetvalue(res, i, PQfnumber(res, "oid"))))
! {
! 	found_match = 1;
! 	break;
! }
! 			}
! 			if (found_match == 0)
! 			{	/* then we didn't find this tbl_elem in
!  * the result set */
! Dlelem	   *elem_to_remove = tbl_elem;
! 
! tbl_elem = DLGetSucc(tbl_elem);
! remove_table_from_list(elem_to_remove);
! 			}
! 			else
! tbl_elem = DLGetSucc(tbl_elem);
! 		}		/* Done removing dropped tables from the
!  * table_list */
! 
! 		/*
! 		 * Then loop use result set as outer loop and tbl_list as the
! 		 * inner loop to determine what tables are new
! 		 */
! 		for (i = 0; i < t; i++)
  		{
! 			tbl_elem = DLGetHead(dbi->table_list);
! 			found_match = 0;
  			while (tbl_elem != NULL)
  			{
  tbl = ((tbl_info *) DLE_VAL(tbl_elem));
! if (tbl->relid == atooid(PQgetvalue(res, i, PQfnumber(res, "oid"))))
! {
! 	found_match = 1;
! 	break;
  }
! tbl_elem = DLGetSucc(tbl_elem);
! 			}
! 			if (found_match == 0)		/* then we didn't find this result
! 		 * now in the tbl_list */
  			{
! DLAddTail(dbi->table_list, DLNewElem(init_table_info(res, i, dbi)));
! if (args->debug >= 1)
  {
! 	sprintf(logbuffer, "added table: %s.%s", dbi->dbname,
! 			((tbl_info *) DLE_VAL(DLGetTail(dbi->table_list)))->table_name);
! 	log_entry(logbuffer);
  }
! 			}
! 		}		/* end of for loop that adds tables */
  		fflush(LOGOUTPUT);
  		PQclear(res);
  		res = NULL;
--- 225,297 ----
  		 * tables to the list that are new
  		 */
  		res = send_query((char *) TABLE_STATS_QUERY, dbi);
! 		if (res != NULL)
  		{
! 			t = PQntuples(res);
! 			
! 			/*
! 			* First: use the tbl_list as the outer loop and the result set as
! 			* the inner loop, this will determine what tables should be
! 			* removed
! 			*/
  			while (tbl_elem != NULL)
  			{
  tbl = ((tbl_info *) DLE_VAL(tbl_elem));
! found_match = 0;
! 
! for (i = 0; i < t; i++)
! {	/* loop through result set looking for a
! 	* match */
! 	if (tbl->relid == atooid(PQgetvalue(res, i, PQfnumber(res, "oid"))))
! 	{
! 		found_match = 1;
! 		break;
! 	}
  }
! if (found_match == 0)
! {	/* then we didn't find this tbl_elem in
! 	* the result set */
! 	Dlelem	   *elem_to_remove = tbl_elem;
! 	
! 	tbl_elem = DLGetSucc(tbl_elem);
! 	remove_table_from_list(elem_to_remove);
! }
! else
! 	tbl_elem = DLGetSucc(tbl_elem);
! 			}		/* Done removing dropped tables from the
! 	* table_list */
! 			
! 			/*
! 			* Then loop use result set as outer loop and tbl_list as the
! 			* inner loop to determine what tables are new
! 			*/
! 			for (i = 0; i < t; i++)
  			{
! tbl_elem = DLGetHead(dbi->table_list);
! found_match = 0;
! while (tbl_elem != NULL)
  {
! 	tbl = ((tbl_info *) DLE_VAL(tbl_elem));
! 	if (tbl->relid == atooid(PQgetvalue(res, i, PQfnumber(res, "oid"))))
! 	{
! 		found_match = 1;
! 		break;
! 	}
! 	tbl_elem = DLGetSucc(tbl_elem);
  }
! if (found_match == 0)		/* then we didn't find this result
! 			* now in the tbl_list */
! 

Re: [HACKERS] postgresql extension API proof of concept

2004-05-22 Thread Fabien COELHO

Dear Carl,

> What is the purpose of the extension api in simple terms; what is it FOR?
> What does it DO?

Thanks for this perfectly legitimate question, as my mail was not very
easy to understand without the context.

The aim is to provide a compilation infrastructure for external modules
(that is, things that are not distributed with the core of postgresql but
must be integrated with it, such as new types or new functions which deal
with backend information).

The idea is that any external module could have a very simple makefile and
reuse all of postgresql's build infrastructure (makefiles, scripts...) to
be built against an already-installed postgresql.
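A sketch of what such a module makefile might look like; the variable names, file names, and include path below are purely hypothetical, since the actual interface is exactly what the proof-of-concept patch proposes:

```makefile
# Hypothetical module makefile.  The module lists only what it builds
# and delegates all rules to the installed postgresql infrastructure.
MODULES = myfuncs          # would build myfuncs.so from myfuncs.c
DATA_built = myfuncs.sql   # install script for the new functions

# Reuse the build infrastructure of the installed postgresql
# (illustrative path, not the proposed one).
include /usr/local/pgsql/lib/makefiles/module.mk
```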

I hope this is clearer,

have a nice day,

-- 
Fabien Coelho - [EMAIL PROTECTED]



Re: [HACKERS] add server include files to default installation?

2004-05-22 Thread Fabien COELHO

Dear Tom,

> > Agreed.  If we are pushing things out, it seems it is our duty to make
> > it easy for outside things to integrate and build properly.
> 
> It does not thereby follow that we should try to merge devel and base
> packages (to express it in RPM terms).

They are not necessarily merged, and it is quite easy to separate them
from the rpm maintainer's point of view if s/he desires to do so. This is
already done for apache, for instance, where the default installation
includes all build utilities. There is no problem separating out the build
utils, or even making that easier than it already is, if you want.


> Compiling extension packages is not and probably never will be something
> that the average user does, and there's no reason to burden him with the
> disk footprint to support something he's not gonna do.

We're arguing about the public that does compile postgresql: guys or
girls who downloaded the source, ran configure, and so on. That is indeed
not the average user, but occasional sysadmins like me, and I like my life
to be easy. I'm ready to pay 0.002 EUR of disk space for that,
especially as the 0.002 EUR (about $0.002 for you) are my employer's ;-)

We can discuss whether that price is too high for the average sysadmin,
but counting the cost of my time and yours, I think we have already spent
enough money arguing about this issue to pay for all the disk space in
the world that is needed ;-)


> We do need to work out what our story is for compiling extensions
> without the original source/build trees.

We agree ;-) I've made suggestions and sent a proof-of-concept patch,
which is waiting for your comments and better ideas. I'm not sure it is
the best possible way to allow extensions, but it works and is reasonable
enough IMHO, so it is at least a base for discussion.


> But the needed files should be an install option, not the default.

The previous point can be solved independently of this issue, as you
pointed out. Let's begin with that.

-- 
Fabien Coelho - [EMAIL PROTECTED]



[HACKERS] CVS HEAD: make install broken for vpath

2004-05-22 Thread Neil Conway
With current sources, it appears that vpath builds (i.e. separate source 
and build trees) are broken. make succeeds, but make install produces:

[neilc:/Users/neilc/build-pgsql]% make install
make -C doc install
make[1]: Nothing to be done for `install'.
make -C src install
make -C port install
make[2]: Nothing to be done for `install'.
make -C timezone install
make -C ../../src/port all
make[3]: Nothing to be done for `all'.
./zic -d /Users/neilc/local/pgsql-cvs/share/timezone data/africa 
data/antarctica data/asia data/australasia data/europe data/northamerica 
data/southamerica data/pacificnew data/etcetera data/factory 
data/backward data/systemv data/solar87 data/solar88 data/solar89
./zic: Can't open data/africa: No such file or directory
make[2]: *** [install] Error 1
make[1]: *** [install] Error 2
make: *** [install] Error 2

-Neil


Re: [HACKERS] Call for 7.5 feature completion

2004-05-22 Thread David Garamond
Robert Treat wrote:
> Given that the cygwin version is currently labeled as not ready for
> production I would say you are right. The truth is that many will never
> declare win32 good for production simply because of the OS it runs on,
> but we still want to make it as solid as possible.
People _do_ use postgresql+cygwin in production environments though (see 
the pgsql-cygwin archive).

And I suspect people _will_ use 7.5 for win32 in production, despite the 
release notes and the website clearly saying it's not production ready. Why?

1) The version number is 7.5 and many people will presume the ports 
are more or less equal in quality/maturity since they have the same 
version number;

2) People don't read release notes. See the various reviews of the
recently released Fedora Core 2, complaining that it doesn't support MP3
or DVD playback, despite the [legal] issues having been known and
documented since Red Hat 8. Strangely enough, these people (who don't
read release notes) _do_ write public reviews. They will badmouth
PostgreSQL, saying it's unstable, crashes a lot, that MySQL is much, much
more rock solid, etc. etc.

I suggest we label the win32 port as 7.5 ALPHA or 7.5 DANGEROUS :-)
--
dave