[PATCHES] Synchronized Scan WIP patch

2007-03-14 Thread Jeff Davis
This is my latest revision of the Sync Scan patch, and it implements the
observability as discussed with Simon.

Changes:
 * ss_report_loc() called once per hundred pages rather than once per
page
 * DEBUG messages are a little cleaner and easier to parse, for the sake
of analysis after the fact.
 * DEBUG2 reports a sync scan starting, the relation size in pages, and
the location at which the scan starts.
 * DEBUG2 reports the location of a scan every 50k pages, DEBUG3 every
5k pages (before it was 100k/10k at DEBUG3/DEBUG4, respectively).
Numbers are aligned along 5k boundaries to make analysis easier.
 * GUCs:
   * sync_seqscan_threshold: fraction of NBuffers for the threshold
   * sync_seqscan_offset: fraction of NBuffers for the offset
   * trace_sync_seqscan: will be used in final version of patch to
control DEBUG output

Sync_scan_offset may be eliminated completely if it's not shown to be
useful enough in conjunction with Simon's patch. Sync Scans are still a
big win without sync_seqscan_offset.

Sync_scan_threshold=real may be turned into sync_seqscan=boolean
with a fixed activation threshold (NBuffers/2 per Simon's suggestion).
The reason is that synchronized scans should activate at the same
threshold as Simon's scan_recycle_buffers feature. Should we make a
#define BIG_SCAN_THRESHOLD NBuffers/2 to use for both sync_seqscan and
for scan_recycle_buffers?

Regards,
Jeff Davis
diff -cr postgresql-8.2.3/src/backend/access/heap/heapam.c postgresql-8.2.3-syncscan/src/backend/access/heap/heapam.c
*** postgresql-8.2.3/src/backend/access/heap/heapam.c	2007-02-04 12:00:49.0 -0800
--- postgresql-8.2.3-syncscan/src/backend/access/heap/heapam.c	2007-03-13 23:21:27.0 -0700
***
*** 65,70 
--- 65,279 
   * 
   */
  
+ static BlockNumber ss_init(HeapScanDesc);
+ static int ss_store_hint(HeapScanDesc,BlockNumber);
+ static int ss_hash(HeapScanDesc);
+ bool Trace_sync_seqscan = false;
+ double sync_seqscan_threshold = DEFAULT_SYNC_SCAN_THRESHOLD;
+ double sync_seqscan_offset = DEFAULT_SYNC_SCAN_OFFSET;
+ 
+ /*
+  * ss_init: 
+  *
+  * This function reads the Sync Scan Hint Table 
+  * (creating it if it doesn't already exist) to 
+  * find a possible location for an already running 
+  * sequential scan on this relation.
+  *
+  * By starting a sequential scan near the location
+  * of an already running scan, we improve the chance
+  * of finding pages in cache.
+  *
+  * Also, depending on SYNC_SCAN_START_OFFSET, this
+  * function will subtract from the hint before
+  * starting the scan, in order to pick up pages that
+  * are likely to already be in cache.
+  *
+  * This function assumes that scan-rs_nblocks is 
+  * already properly set, and sets scan-rs_start_page
+  * to a value based on the hint found. Also, it sets
+  * scan-rs_hint to point to the location of the hint
+  * in the hint table.
+  */
+ static BlockNumber ss_init(HeapScanDesc scan)
+ {
+ 	ss_hint_t *hint_table;
+ 	int table_offset;
+ 	bool found;
+ 	int threshold = sync_seqscan_threshold * NBuffers;
+ 	int offset = sync_seqscan_offset * NBuffers;
+ 
+ 	/*
+ 	 * If the table is not large compared to effective_cache_size,
+ 	 * don't Sync Scan.
+ 	 */
+ 	if(scan-rs_nblocks  threshold)
+ 	{
+ 		elog(DEBUG2,SYNC_SCAN: Table too small to sync scan);
+ 		scan-rs_start_page = 0;
+ 		return 0;
+ 	}
+ 
+ 	table_offset = ss_hash(scan);
+ 	hint_table = (ss_hint_t*)ShmemInitStruct(Sync Scan Hint Table,
+ 		SYNC_SCAN_TABLE_SIZE*sizeof(ss_hint_t),found);
+ 			
+ 	scan-rs_hint = hint_table[table_offset];
+ 
+ 	/*
+ 	 * If we just created the hint table for the first time,
+ 	 * initialize the table to zero and start the scan at page 0.
+ 	 */
+ 	if(!found) {
+ 		elog(DEBUG2,SYNC_SCAN: Created Hint Table);
+ 		memset(hint_table,0,sizeof(ss_hint_t)*SYNC_SCAN_TABLE_SIZE);
+ 		scan-rs_start_page = 0;
+ 		return 0;
+ 	}
+ 
+ 	/*
+ 	 * If the hint's relid is 0, that means
+ 	 * we have not previously created a hint
+ 	 * at this location in the table.
+ 	 */
+ 	if(scan-rs_hint-relid == 0) {
+ 		elog(DEBUG2, SYNC_SCAN: Hint empty);
+ 		scan-rs_start_page = 0;
+ 		return 0;
+ 	}
+ 
+ 	/*
+ 	 * If the relid doesn't match the one in the hint,
+ 	 * we have a hash collision.
+ 	 */
+ 	if(RelationGetRelid(scan-rs_rd) != scan-rs_hint-relid)
+ 	{
+ 		elog(DEBUG1,SYNC_SCAN: Hash collision);
+ 		scan-rs_start_page = 0;
+ 		return 0;
+ 	}
+ 
+ 	/*
+ 	 * If the hint is not a valid block number
+ 	 * for this relation, start at 0.
+ 	 *
+ 	 * This can happen if, for instance, someone
+ 	 * TRUNCATEd the table between when the hint 
+ 	 * was set and now.
+ 	 */
+ 	if(scan-rs_hint-location  0 || 
+ 		scan-rs_hint-location = scan-rs_nblocks) 
+ 	{
+ 		elog(DEBUG2,SYNC_SCAN: Hint %d out of range. \
+  Relation has %d pages.,
+ 			scan-rs_hint-location,scan-rs_nblocks);
+ 		scan-rs_start_page = 0;
+ 		return 0;
+ 	}
+ 
+ 	scan-rs_start_page = 

Re: [PATCHES] LIMIT/SORT optimization

2007-03-14 Thread Gregory Stark

Bruce Momjian [EMAIL PROTECTED] writes:

 Is there a newer version of this patch?

As requested, I've cut an updated version of this patch against CVS HEAD:

 http://community.enterprisedb.com/sort-limit-v5.patch.gz

The open issues I see are:

 Adjusting the costing of the sort to take into account the optimization

 Whether the communication between the Limit node and the Sort node is kosher
 or whether something more abstract is needed.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] LIMIT/SORT optimization

2007-03-14 Thread Gregory Stark

Gregory Stark [EMAIL PROTECTED] writes:

 Bruce Momjian [EMAIL PROTECTED] writes:

 Is there a newer version of this patch?

 As requested, I've cut an updated version of this patch against CVS HEAD:

  http://community.enterprisedb.com/sort-limit-v5.patch.gz

Someone asked why I've been posting links rather than attachments. The only
reason was because when I posted patches in the past they were dropped by the
mailing list. I would say refused except I never received a bounce, the
messages just never appeared on list or in the archive.

I'll try attaching this patch again, which is relatively small compared to the
recursive query patches and packed varlena patches which disappeared into the
ether. Also, this one is gzipped whereas in the past I usually attached
patches uncompressed so people could read them without saving and
uncompressing them. Perhaps one of those differences is the source of the
problem?

Do people prefer receiving attachments or downloadable links? 
Does the answer change if the patches are quite large?



sort-limit-v5.patch.gz
Description: Binary data

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] LIMIT/SORT optimization

2007-03-14 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Do people prefer receiving attachments or downloadable links? 
 Does the answer change if the patches are quite large?

Links suck from an archival standpoint; but at the same time you need
to pay some attention to the size of your email.  I think the current
threshold for moderator approval is somewhere between 50K and 100K
(on patches; less on our other lists).  gzipping large patches is
encouraged --- if people's mail readers need help in viewing such
an attachment, that's not your problem.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[PATCHES] Code-Cleanup: function declarations (void, and kr style)

2007-03-14 Thread Stefan Huehner
Hi,

attached patch fixes some function declarations:

- () - (void)
- kr style - ansi C

Regards,
Stefan

Index: src/pl/plpgsql/src/pl_funcs.c
===
RCS file: /projects/cvsroot/pgsql/src/pl/plpgsql/src/pl_funcs.c,v
retrieving revision 1.57
diff -u -u -r1.57 pl_funcs.c
--- src/pl/plpgsql/src/pl_funcs.c	28 Jan 2007 17:58:13 -	1.57
+++ src/pl/plpgsql/src/pl_funcs.c	14 Mar 2007 16:17:22 -
@@ -474,7 +474,7 @@
  **/
 static int	dump_indent;
 
-static void dump_ind();
+static void dump_ind(void);
 static void dump_stmt(PLpgSQL_stmt *stmt);
 static void dump_block(PLpgSQL_stmt_block *block);
 static void dump_assign(PLpgSQL_stmt_assign *stmt);
Index: src/port/qsort.c
===
RCS file: /projects/cvsroot/pgsql/src/port/qsort.c,v
retrieving revision 1.12
diff -u -u -r1.12 qsort.c
--- src/port/qsort.c	19 Oct 2006 20:56:22 -	1.12
+++ src/port/qsort.c	14 Mar 2007 16:17:22 -
@@ -73,11 +73,7 @@
 	(es) % sizeof(long) ? 2 : (es) == sizeof(long)? 0 : 1;
 
 static void
-swapfunc(a, b, n, swaptype)
-char	   *a,
-		   *b;
-size_t		n;
-int			swaptype;
+swapfunc(char *a, char *b, size_t n, int swaptype)
 {
 	if (swaptype = 1)
 		swapcode(long, a, b, n);
Index: src/port/qsort_arg.c
===
RCS file: /projects/cvsroot/pgsql/src/port/qsort_arg.c,v
retrieving revision 1.3
diff -u -u -r1.3 qsort_arg.c
--- src/port/qsort_arg.c	12 Oct 2006 15:04:55 -	1.3
+++ src/port/qsort_arg.c	14 Mar 2007 16:17:22 -
@@ -73,11 +73,7 @@
 	(es) % sizeof(long) ? 2 : (es) == sizeof(long)? 0 : 1;
 
 static void
-swapfunc(a, b, n, swaptype)
-char	   *a,
-		   *b;
-size_t		n;
-int			swaptype;
+swapfunc(char *a, char *b, size_t n, int swaptype)
 {
 	if (swaptype = 1)
 		swapcode(long, a, b, n);

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[PATCHES] Code-Cleanup: char* - const char*

2007-03-14 Thread Stefan Huehner
Hi,

attached patches marks several char* variables as const where they are
initialized with constant strings.

Regards,
Stefan

Index: src/backend/regex/regc_locale.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/regex/regc_locale.c,v
retrieving revision 1.8
diff -u -u -r1.8 regc_locale.c
--- src/backend/regex/regc_locale.c	22 Nov 2005 18:17:19 -	1.8
+++ src/backend/regex/regc_locale.c	14 Mar 2007 16:17:12 -
@@ -54,7 +54,7 @@
 
 static struct cname
 {
-	char	   *name;
+	const char	*name;
 	char		code;
 }	cnames[] =
 
@@ -604,7 +604,7 @@
 {
 	size_t		len;
 	struct cvec *cv = NULL;
-	char	  **namePtr;
+	const char **namePtr;
 	int			i,
 index;
 
@@ -612,7 +612,7 @@
 	 * The following arrays define the valid character class names.
 	 */
 
-	static char *classNames[] = {
+	static const char *classNames[] = {
 		alnum, alpha, ascii, blank, cntrl, digit, graph,
 		lower, print, punct, space, upper, xdigit, NULL
 	};
Index: src/backend/regex/regerror.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/regex/regerror.c,v
retrieving revision 1.27
diff -u -u -r1.27 regerror.c
--- src/backend/regex/regerror.c	29 Nov 2003 19:51:55 -	1.27
+++ src/backend/regex/regerror.c	14 Mar 2007 16:17:12 -
@@ -40,8 +40,8 @@
 static struct rerr
 {
 	int			code;
-	char	   *name;
-	char	   *explain;
+	const char	   *name;
+	const char	   *explain;
 }	rerrs[] =
 
 {
@@ -63,7 +63,7 @@
 			size_t errbuf_size) /* available space in errbuf, can be 0 */
 {
 	struct rerr *r;
-	char	   *msg;
+	const char   *msg;
 	char		convbuf[sizeof(unk) + 50];		/* 50 = plenty for int */
 	size_t		len;
 	int			icode;
Index: src/backend/utils/adt/ruleutils.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/ruleutils.c,v
retrieving revision 1.252
diff -u -u -r1.252 ruleutils.c
--- src/backend/utils/adt/ruleutils.c	27 Feb 2007 23:48:08 -	1.252
+++ src/backend/utils/adt/ruleutils.c	14 Mar 2007 16:17:14 -
@@ -106,9 +106,9 @@
  * --
  */
 static void *plan_getrulebyoid = NULL;
-static char *query_getrulebyoid = SELECT * FROM pg_catalog.pg_rewrite WHERE oid = $1;
+static const char *query_getrulebyoid = SELECT * FROM pg_catalog.pg_rewrite WHERE oid = $1;
 static void *plan_getviewrule = NULL;
-static char *query_getviewrule = SELECT * FROM pg_catalog.pg_rewrite WHERE ev_class = $1 AND rulename = $2;
+static const char *query_getviewrule = SELECT * FROM pg_catalog.pg_rewrite WHERE ev_class = $1 AND rulename = $2;
 
 
 /* --
Index: src/bin/initdb/initdb.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/initdb/initdb.c,v
retrieving revision 1.134
diff -u -u -r1.134 initdb.c
--- src/bin/initdb/initdb.c	20 Feb 2007 23:49:38 -	1.134
+++ src/bin/initdb/initdb.c	14 Mar 2007 16:17:14 -
@@ -727,7 +727,7 @@
 struct encoding_match
 {
 	enum pg_enc pg_enc_code;
-	char	   *system_enc_name;
+	const char *system_enc_name;
 };
 
 struct encoding_match encoding_match_list[] = {
@@ -1481,8 +1481,8 @@
 setup_auth(void)
 {
 	PG_CMD_DECL;
-	char	  **line;
-	static char *pg_authid_setup[] = {
+	const char **line;
+	static const char *pg_authid_setup[] = {
 		/*
 		 * Create triggers to ensure manual updates to shared catalogs will be
 		 * reflected into their flat file copies.
@@ -1623,8 +1623,8 @@
 setup_depend(void)
 {
 	PG_CMD_DECL;
-	char	  **line;
-	static char *pg_depend_setup[] = {
+	const char	  **line;
+	static const char *pg_depend_setup[] = {
 		/*
 		 * Make PIN entries in pg_depend for all objects made so far in the
 		 * tables that the dependency code handles.  This is overkill (the
@@ -1990,8 +1990,8 @@
 make_template0(void)
 {
 	PG_CMD_DECL;
-	char	  **line;
-	static char *template0_setup[] = {
+	const char **line;
+	static const char *template0_setup[] = {
 		CREATE DATABASE template0;\n,
 		UPDATE pg_database SET 
 			datistemplate = 't', 
@@ -2045,8 +2045,8 @@
 make_postgres(void)
 {
 	PG_CMD_DECL;
-	char	  **line;
-	static char *postgres_setup[] = {
+	const char **line;
+	static const char *postgres_setup[] = {
 		CREATE DATABASE postgres;\n,
 		NULL
 	};
Index: src/bin/pg_controldata/pg_controldata.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/pg_controldata/pg_controldata.c,v
retrieving revision 1.33
diff -u -u -r1.33 pg_controldata.c
--- src/bin/pg_controldata/pg_controldata.c	3 Mar 2007 20:02:27 -	1.33
+++ src/bin/pg_controldata/pg_controldata.c	14 Mar 2007 16:17:14 -
@@ -72,7 +72,7 @@
 	char		pgctime_str[128];
 	char		ckpttime_str[128];
 	char		sysident_str[32];
-	char	   *strftime_fmt = %c;
+	const char *strftime_fmt = %c;
 	const char *progname;
 
 	set_pglocale_pgservice(argv[0], pg_controldata);
Index: src/bin/pg_ctl/pg_ctl.c

Re: [PATCHES] LIMIT/SORT optimization

2007-03-14 Thread Magnus Hagander
Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
 Do people prefer receiving attachments or downloadable links? 
 Does the answer change if the patches are quite large?
 
 Links suck from an archival standpoint; but at the same time you need
 to pay some attention to the size of your email.  I think the current
 threshold for moderator approval is somewhere between 50K and 100K
 (on patches; less on our other lists).  gzipping large patches is
 encouraged --- if people's mail readers need help in viewing such
 an attachment, that's not your problem.

IIRC, when a patch gets held back, you get a notification. The problem
has been with mails that are silently dropped. AFAIK, that has happened
outside of mailman, somewhere else in the mail system. For example, we
used to drop anything that was a .tar (including .tar.gz), and last I
checked we still do that. But admittedly that was some time ago, but
I've seen no statement that it has been fixed.

(plain gzip usually worked fine, but .tar.gz to include a couple of new
files got silently dropped. For example, I tried sending my vcbuild
patches at least 4 times before I realized they were silently dropped)

So I'd avoid anything other than plaintext or plaintext.gz.

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PATCHES] Have \da in psql show return type

2007-03-14 Thread Greg Sabino Mullane
Using \da in psql should show the return type.


Index: doc/src/sgml/ref/psql-ref.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/ref/psql-ref.sgml,v
retrieving revision 1.188
diff -c -r1.188 psql-ref.sgml
*** doc/src/sgml/ref/psql-ref.sgml	3 Mar 2007 17:19:11 -	1.188
--- doc/src/sgml/ref/psql-ref.sgml	14 Mar 2007 19:18:42 -
***
*** 875,882 
  
  listitem
  para
! Lists all available aggregate functions, together with the data
! types they operate on. If replaceable
  class=parameterpattern/replaceable
  is specified, only aggregates whose names match the pattern are shown.
  /para
--- 875,882 
  
  listitem
  para
! Lists all available aggregate functions, together with their
! return type, and the data types they operate on. If replaceable
  class=parameterpattern/replaceable
  is specified, only aggregates whose names match the pattern are shown.
  /para
Index: src/bin/psql/describe.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/psql/describe.c,v
retrieving revision 1.152
diff -c -r1.152 describe.c
*** src/bin/psql/describe.c	20 Feb 2007 10:23:38 -	1.152
--- src/bin/psql/describe.c	14 Mar 2007 19:18:42 -
***
*** 62,82 
  	printfPQExpBuffer(buf,
  	  SELECT n.nspname as \%s\,\n
  	p.proname AS \%s\,\n
  	CASE WHEN p.pronargs = 0\n
  	  THEN CAST('*' AS pg_catalog.text)\n
  	  ELSE\n
  	  pg_catalog.array_to_string(ARRAY(\n
  	SELECT\n
!  pg_catalog.format_type(p.proargtypes[s.i], NULL)\n
  	FROM\n
  	  pg_catalog.generate_series(0, pg_catalog.array_upper(p.proargtypes, 1)) AS s(i)\n
  	  ), ', ')\n
  	END AS \%s\,\n
!    pg_catalog.obj_description(p.oid, 'pg_proc') as \%s\\n
  	  FROM pg_catalog.pg_proc p\n
! 	LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace\n
  	  WHERE p.proisagg\n,
! 	  _(Schema), _(Name),
  	  _(Argument data types), _(Description));
  
  	processSQLNamePattern(pset.db, buf, pattern, true, false,
--- 62,83 
  	printfPQExpBuffer(buf,
  	  SELECT n.nspname as \%s\,\n
  	p.proname AS \%s\,\n
+ 	pg_catalog.format_type(p.prorettype, NULL) AS \%s\,\n 
  	CASE WHEN p.pronargs = 0\n
  	  THEN CAST('*' AS pg_catalog.text)\n
  	  ELSE\n
  	  pg_catalog.array_to_string(ARRAY(\n
  	SELECT\n
! 	  pg_catalog.format_type(p.proargtypes[s.i], NULL)\n
  	FROM\n
  	  pg_catalog.generate_series(0, pg_catalog.array_upper(p.proargtypes, 1)) AS s(i)\n
  	  ), ', ')\n
  	END AS \%s\,\n
! 	pg_catalog.obj_description(p.oid, 'pg_proc') as \%s\\n
  	  FROM pg_catalog.pg_proc p\n
! 	   LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace\n
  	  WHERE p.proisagg\n,
! 	  _(Schema), _(Name), _(Result data type),
  	  _(Argument data types), _(Description));
  
  	processSQLNamePattern(pset.db, buf, pattern, true, false,

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] LIMIT/SORT optimization

2007-03-14 Thread Simon Riggs
On Wed, 2007-03-14 at 15:16 +, Gregory Stark wrote:

 Do people prefer receiving attachments or downloadable links? 

Attachments are very clearly submissions to the project.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend