Re: [HACKERS] Removal of useless include references
* Bruce Momjian wrote: The attached patch removes unneeded include references, and marks some includes as needing to be skipped by pgrminclude. There are several unrelated changes to pg_upgrade in that patch, too. -- Christian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch to slightly improve clarity of a comment in postgresql.conf.sample
Sure, I like that more. On Thu, Aug 25, 2011 at 9:29 AM, Robert Haas wrote: > On Wed, Aug 24, 2011 at 7:33 AM, Dougal Sutherland > wrote: > > The attached change to postgresql.conf.sample makes it more clear at a > > glance that the default value of listen_addresses is 'localhost', not > > 'localhost, *'. This would have saved a friend an hour or two of fiddling > > tonight. > > How about: > ># defaults to 'localhost'; use '*' for all > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
Re: [HACKERS] tsvector concatenation - backend crash
On 2011-08-26 05:28, Tom Lane wrote: Jesper Krogh writes: The attached SQL files give (at least in my hands) a reliable backend crash with this stacktrace .. reproduced on both 9.0.4 and HEAD. I'm sorry I cannot provide a more trimmed-down set of vectors that reproduces the bug, thus the "obfuscated" dataset. But even deleting single terms in the vectors makes the bug go away. Hm ... I can reproduce this on one of my usual machines, but not another. What platform are you on exactly? 64-bit Ubuntu Lucid (amd64). -- Jesper -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Buffering GiST leaf pages too
On 26.08.2011 00:45, Jim Nasby wrote: I've often wondered about the per-tuple overhead of all kinds of operations, not just GiST index builds. For example, if you're doing a seqscan, ISTM it would be a lot more efficient to memcpy an entire page into backend-local memory and operate off of that lock-free. What we currently do is even better than that. We take the lock once, and hold it while we do all the visibility checks. Then the lock is released, but the page is kept pinned so that it doesn't get evicted from the buffer cache. No memcpy() required. Similarly for an index scan, you'd want to copy a full leaf page if you think you'll be hitting it more than once or twice. We more or less do that too already. When an index scan steps on a leaf page, it scans the page for all matches, and copies them to backend-local memory. The page lock is then released. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
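In bufmgr terms, the pattern described above looks roughly like this - a simplified sketch of the heap scan's page-at-a-time mode (written from memory, not the actual heapam.c code; rel, blkno, snapshot and vistuples stand in for the scan state, and error handling is omitted):

/*
 * Lock the page once, remember which tuples are visible, then drop the
 * lock but keep the pin so the page stays in shared_buffers for the
 * rest of the scan.
 */
Buffer		buffer = ReadBuffer(rel, blkno);
Page		page;
OffsetNumber lineoff,
			maxoff;
int			ntup = 0;

LockBuffer(buffer, BUFFER_LOCK_SHARE);		/* one lock for the whole page */
page = BufferGetPage(buffer);
maxoff = PageGetMaxOffsetNumber(page);

for (lineoff = FirstOffsetNumber; lineoff <= maxoff; lineoff = OffsetNumberNext(lineoff))
{
	ItemId		lpp = PageGetItemId(page, lineoff);
	HeapTupleData loctup;

	if (!ItemIdIsNormal(lpp))
		continue;
	loctup.t_data = (HeapTupleHeader) PageGetItem(page, lpp);
	loctup.t_len = ItemIdGetLength(lpp);
	if (HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer))
		vistuples[ntup++] = lineoff;	/* remember the visible line pointers */
}

LockBuffer(buffer, BUFFER_LOCK_UNLOCK);		/* release the lock ... */
/* ... but not the pin; ReleaseBuffer(buffer) happens only when the scan
 * moves on to the next page */

All subsequent fetches from the page just read the remembered line pointers with no lock at all, which is why an extra memcpy() of the whole page would only add work.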
Re: [HACKERS] Removal of useless include references
Bruce Momjian writes: > It has been years since I ran src/tools/pginclude/pgrminclude to remove > unnecessary include files. (I have already fixed things so include > files can be compiled on their own.) > The attached patch removes unneeded include references, and marks some > includes as needing to be skipped by pgrminclude. > I am sure applying this patch will break builds on some platforms and > some option combinations so I will monitor the buildfarm when I apply it > and make adjustments. The last time you did this was in July 2006. It took us two weeks to mostly recover, but we were still dealing with some fallout in December, cf http://archives.postgresql.org/pgsql-hackers/2006-12/msg00491.php We had the buildfarm then, had had it for a couple years. The notion that watching the buildfarm is enough is fully disproven by history. Unless you have a better test plan than last time (which this isn't), I don't think this should be done at all. The benefits are microscopic and the pain real. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] pg_upgrade problem
OK, this was very helpful. I found out that there is a bug in current 9.0.X, 9.1.X, and HEAD that I introduced recently when I excluded temp tables. (The bug is not in any released version of pg_upgrade.) The attached, applied patches should fix it for you. I assume you are running 9.0.X, and not 9.0.4. --- hubert depesz lubaczewski wrote: > On Thu, Aug 25, 2011 at 04:43:02PM -0400, Bruce Momjian wrote: > > Please check the old cluster. > > Sure: > > =# SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; > > > > reltoastrelid > --- > 82510395 > 71637071 > (2 rows) > > =# SELECT oid::regclass, reltoastrelid FROM pg_class WHERE relname = > 'actions'; > > > oid | reltoastrelid > ---+--- > x.actions | 82510395 > y.actions | 71637071 > (2 rows) > > =# select oid, relfilenode from pg_class where oid in (SELECT reltoastrelid > FROM pg_class WHERE relname = 'actions'); >oid| relfilenode > --+- > 82510395 |82510395 > 71637071 |71637071 > (2 rows) > > =# select oid from pg_database where datname = current_database(); >oid > -- > 71635381 > (1 row) > > $ ls -l /base/71635381/{71637071,82510395} > -rw--- 1 postgres postgres 0 2009-10-12 06:49 /base/71635381/71637071 > -rw--- 1 postgres postgres 0 2010-08-19 14:02 /base/71635381/82510395 > > > > > > One more thing - one of earlier tests actually worked through > > > > > pg_upgrade, but when running vacuumdb -az on newly started 9.0.4, I > > > > > got > > > > > error about missing transaction/clog - don't remember exactly what it > > > > > was, though. > > > > THere was a bug in how how pg_upgrade worked in pre-9.0.4 --- could it > > > > have been that? > > > It was done definitely using 9.0.4. > > Good. > > Not sure if it's good, since it was after the clog error was fixed, and > I still got it :/ > > but anyway - the problem with 71637071 is more important now. > > Best regards, > > depesz -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/contrib/pg_upgrade/info.c b/contrib/pg_upgrade/info.c new file mode 100644 index ca357e7..1f5b7ae *** a/contrib/pg_upgrade/info.c --- b/contrib/pg_upgrade/info.c *** get_rel_infos(migratorContext *ctx, cons *** 328,336 " ON c.reltablespace = t.oid " "WHERE (( " /* exclude pg_catalog and pg_temp_ (could be orphaned tables) */ ! " n.nspname !~ '^pg_' " ! " AND n.nspname != 'information_schema' " ! " AND c.oid >= %u " " ) OR ( " " n.nspname = 'pg_catalog' " " AND relname IN " --- 328,338 " ON c.reltablespace = t.oid " "WHERE (( " /* exclude pg_catalog and pg_temp_ (could be orphaned tables) */ ! "n.nspname != 'pg_catalog' " ! "AND n.nspname !~ '^pg_temp_' " ! "AND n.nspname !~ '^pg_toast_temp_' " ! " AND n.nspname != 'information_schema' " ! " AND c.oid >= %u " " ) OR ( " " n.nspname = 'pg_catalog' " " AND relname IN " diff --git a/contrib/pg_upgrade/version_old_8_3.c b/contrib/pg_upgrade/version_old_8_3.c new file mode 100644 index 930f76d..6fcd61b *** a/contrib/pg_upgrade/version_old_8_3.c --- b/contrib/pg_upgrade/version_old_8_3.c *** old_8_3_check_for_name_data_type_usage(m *** 61,69 " NOT a.attisdropped AND " " a.atttypid = 'pg_catalog.name'::pg_catalog.regtype AND " " c.relnamespace = n.oid AND " ! /* exclude pg_catalog and pg_temp_ (could be orphaned tables) */ ! " n.nspname !~ '^pg_' AND " ! " n.nspname != 'information_schema'"); ntups = PQntuples(res); i_nspname = PQfnumber(res, "nspname"); --- 61,71 " NOT a.attisdropped AND " " a.atttypid = 'pg_catalog.name'::pg_catalog.regtype AND " " c.relnamespace = n.oid AND " ! 
/* exclude possibly orphaned temp tables */ ! " n.nspname != 'pg_catalog' AND " ! " n.nspname !~ '^pg_temp_' AND " ! " n.nspname !~ '^pg_toast_temp_' AND " ! " n.nspname != 'information_schema' "); ntups = PQntuples(res); i_nspname = PQfnumber(res, "nspname"); *** old_8_3_check_for_tsquery_usage(migrator *** 152,160 " NOT a.attisdropped AND " " a.atttypid = 'pg_catalog.tsquery'::pg_catalog.regtype AND " " c.relnamespace
[HACKERS] dropdb and dropuser: IF EXISTS
I noticed a few places where it would be handy if dropdb took a flag like "--if-exists" which would basically just add in the 'IF EXISTS' clause to the DROP DATABASE statement. For example, scripts like find_static or mbregress.sh use dropdb && createdb, but they generate noisy errors from dropdb when run for the first time since there's no --if-exists flag. (They could just pipe 'DROP DATABASE IF EXISTS ...' to psql, but what's the point of having dropdb if it's not used?) Attached is a very quick patch implementing the "--if-exists" or "-X" option for dropdb and dropuser. I didn't bother adding in a check to make sure the server version was 8.2+ since we're not even supporting 8.1 nowadays, though that'd be easy enough to add in. Josh diff --git a/doc/src/sgml/ref/dropdb.sgml b/doc/src/sgml/ref/dropdb.sgml index e20bcdb..2092bb6 100644 *** a/doc/src/sgml/ref/dropdb.sgml --- b/doc/src/sgml/ref/dropdb.sgml *** PostgreSQL documentation *** 87,92 --- 87,102 + -X + --if-exists + + +Don't report an error if the specified database does not exist. + + + + + -V --version diff --git a/doc/src/sgml/ref/dropuser.sgml b/doc/src/sgml/ref/dropuser.sgml index c158103..22580a4 100644 *** a/doc/src/sgml/ref/dropuser.sgml --- b/doc/src/sgml/ref/dropuser.sgml *** PostgreSQL documentation *** 89,94 --- 89,104 + -X + --if-exists + + + Don't report an error if the specified user does not exist. + + + + + -V --version diff --git a/src/bin/scripts/dropdb.c b/src/bin/scripts/dropdb.c index 4cec63e..187bf6c 100644 *** a/src/bin/scripts/dropdb.c --- b/src/bin/scripts/dropdb.c *** main(int argc, char *argv[]) *** 29,34 --- 29,35 {"password", no_argument, NULL, 'W'}, {"echo", no_argument, NULL, 'e'}, {"interactive", no_argument, NULL, 'i'}, + {"if-exists", no_argument, NULL, 'X'}, {NULL, 0, NULL, 0} }; *** main(int argc, char *argv[]) *** 43,48 --- 44,50 enum trivalue prompt_password = TRI_DEFAULT; bool echo = false; bool interactive = false; + bool if_exists = false; PQExpBufferData sql; *** main(int argc, char *argv[]) *** 54,60 handle_help_version_opts(argc, argv, "dropdb", help); ! while ((c = getopt_long(argc, argv, "h:p:U:wWei", long_options, &optindex)) != -1) { switch (c) { --- 56,62 handle_help_version_opts(argc, argv, "dropdb", help); ! while ((c = getopt_long(argc, argv, "h:p:U:wWeiX", long_options, &optindex)) != -1) { switch (c) { *** main(int argc, char *argv[]) *** 79,84 --- 81,89 case 'i': interactive = true; break; + case 'X': + if_exists = true; + break; default: fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname); exit(1); *** main(int argc, char *argv[]) *** 110,117 initPQExpBuffer(&sql); ! appendPQExpBuffer(&sql, "DROP DATABASE %s;\n", ! fmtId(dbname)); /* * Connect to the 'postgres' database by default, except have the --- 115,122 initPQExpBuffer(&sql); ! appendPQExpBuffer(&sql, "DROP DATABASE %s%s;\n", ! (if_exists ? 
"IF EXISTS " : ""), fmtId(dbname)); /* * Connect to the 'postgres' database by default, except have the *** help(const char *progname) *** 146,151 --- 151,157 printf(_("\nOptions:\n")); printf(_(" -e, --echoshow the commands being sent to the server\n")); printf(_(" -i, --interactive prompt before deleting anything\n")); + printf(_(" -X, --if-exists don't report error if database doesn't exist\n")); printf(_(" --helpshow this help, then exit\n")); printf(_(" --version output version information, then exit\n")); printf(_("\nConnection options:\n")); diff --git a/src/bin/scripts/dropuser.c b/src/bin/scripts/dropuser.c index 0949a5e..bf5196f 100644 *** a/src/bin/scripts/dropuser.c --- b/src/bin/scripts/dropuser.c *** main(int argc, char *argv[]) *** 29,34 --- 29,35 {"password", no_argument, NULL, 'W'}, {"echo", no_argument, NULL, 'e'}, {"interactive", no_argument, NULL, 'i'}, + {"if-exists", no_argument, NULL, 'X'}, {NULL, 0, NULL, 0} }; *** main(int argc, char *argv[]) *** 43,48 --- 44,50 enum trivalue prompt_password = TRI_DEFAULT; bool echo = false; bool interactive = false; + bool if_exists = false; PQExpBufferData sql; *** main(int argc, char *argv[]) *** 54,60 handle_help_version_opts(argc, argv, "dropuser", help); ! while ((c = getopt_long(argc, argv, "h:p:U:wWei", long_options, &optindex)) !
Re: [HACKERS] tsvector concatenation - backend crash
Jesper Krogh writes: > The attached SQL files give (at least in my hands) a reliable backend crash > with this stacktrace .. reproduced on both 9.0.4 and HEAD. I'm sorry > I cannot provide a more trimmed-down set of vectors that reproduces the > bug, thus > the "obfuscated" dataset. But even deleting single terms in the vectors makes > the bug go away. Hm ... I can reproduce this on one of my usual machines, but not another. What platform are you on exactly? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
Robert Haas writes: > On Thu, Aug 25, 2011 at 5:12 PM, Tom Lane wrote: >> ISTM the general rule ought to be that we attempt to substitute for a >> colon-construct regardless of where it appears within an argument, as >> long as it's not within quotes. > My main thought is that I remember this code being pretty awful - > especially with respect to error handling - when I looked at it. A > lot of dubious behaviors were more or less compelled by the > impossibility of bailing out at an arbitrary point a la ereport(). At > least, barring a drastic refactoring of the code, which might not be a > bad idea either. What I had in mind to do was just to rearrange the flex rules --- the issues that I called out have to do with dubious choices about when to transition between different lexer states. I agree that the error handling isn't terribly friendly in unexpected cases like there not being a connection available to determine the literal-quoting rules, but that's not what I'm on about here. I'm just after consistent variable-expansion behavior in normal operation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why doesn't psql use the information schema to get ACL description ?
On Thu, Aug 25, 2011 at 5:51 PM, Jim Nasby wrote: > On Aug 23, 2011, at 1:57 PM, Peter Eisentraut wrote: >>> Ultimately my ulterior motive is to make sure the information schema >>> is actually useful and ensure that it maintains that status. >> >> The information schema only exposes privileges defined in the SQL >> standard on objects defined in the SQL standard. So psql or any tool >> that wants to provide information on PostgreSQL-specific features can't >> use that. But perhaps the JDBC interfaces only expose certain standard >> information anyway, so it could be useful. Depends on the details, >> though. > > Related to this is the newsysviews project, which was intended to present > more human-friendly info ala info_schema, but to also provide > PostgreSQL-specific information (which info_schema can't). I'll have to have a look at this when I get a moment, sounds promising. Will be even better if we get all clients using it. Dave Cramer dave.cramer(at)credativ(dot)ca http://www.credativ.ca -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_upgrade problem (fwd)
I am sending this report to hackers so all subscribers can read the original bug report. --- - Forwarded message from hubert depesz lubaczewski - hi I have an 8.3.11 database, ~600GB in size. I want to upgrade it to 9.0. First, I tried with 9.0.4, and when I hit the problem (the same one) I tried git, head of the 9.0 branch. So. I did pg_upgrade with -c, and it looked like this: $ time pg_upgrade -c -v -b /opt/pgsql-8.3.11-int/bin/ -B /opt/pgsql-9.0.5a-int/bin/ -d /var/postgresql// -D /var/postgresql/-9.0 -k -l pg_upgrade.log -p -P 4329 Running in verbose mode Performing Consistency Checks - Checking old data directory (/var/postgresql/) ok Checking old bin directory (/opt/pgsql-8.3.11-int/bin) ok Checking new data directory (/var/postgresql/-9.0) ok Checking new bin directory (/opt/pgsql-9.0.5a-int/bin) ok "/opt/pgsql-8.3.11-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/" -o "-p -c autovacuum=off -c autovacuum_freeze_max_age=20" start >> "pg_upgrade.log" 2>&1 Checking for reg* system oid user data typesok Checking for /contrib/isn with bigint-passing mismatch ok Checking for invalid 'name' user columnsok Checking for tsquery user columns ok Checking for tsvector user columns ok Checking for hash and gin indexes warning | Your installation contains hash and/or gin | indexes. These indexes have different | internal formats between your old and new | clusters so they must be reindexed with the | REINDEX command. After migration, you will | be given REINDEX instructions. Checking for bpchar_pattern_ops indexes ok Checking for large objects ok "/opt/pgsql-8.3.11-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/" stop >> "pg_upgrade.log" 2>&1 "/opt/pgsql-9.0.5a-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/-9.0" -o "-p 4329 -c autovacuum=off -c autovacuum_freeze_max_age=20" start >> "pg_upgrade.log" 2>&1 Checking for presence of required libraries ok *Clusters are compatible* "/opt/pgsql-9.0.5a-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/-9.0" stop >> "pg_upgrade.log" 2>&1 real0m6.417s user0m0.040s sys 0m0.060s All looks ok.
So I ran the upgrade without -c: $ time pg_upgrade -v -b /opt/pgsql-8.3.11-int/bin/ -B /opt/pgsql-9.0.5a-int/bin/ -d /var/postgresql// -D /var/postgresql/-9.0 -k -l pg_upgrade.log -p -P 4329 Running in verbose mode Performing Consistency Checks - Checking old data directory (/var/postgresql/) ok Checking old bin directory (/opt/pgsql-8.3.11-int/bin) ok Checking new data directory (/var/postgresql/-9.0) ok Checking new bin directory (/opt/pgsql-9.0.5a-int/bin) ok "/opt/pgsql-8.3.11-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/" -o "-p -c autovacuum=off -c autovacuum_freeze_max_age=20" start >> "pg_upgrade.log" 2>&1 Checking for reg* system oid user data typesok Checking for /contrib/isn with bigint-passing mismatch ok Checking for invalid 'name' user columnsok Checking for tsquery user columns ok Creating script to adjust sequences ok Checking for large objects ok Creating catalog dump "/opt/pgsql-9.0.5a-int/bin/pg_dumpall" --port --username "postgres" --schema-only --binary-upgrade > "/var/postgresql/pg_upgrade_dump_all.sql" ok "/opt/pgsql-8.3.11-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/" stop >> "pg_upgrade.log" 2>&1 "/opt/pgsql-9.0.5a-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/-9.0" -o "-p 4329 -c autovacuum=off -c autovacuum_freeze_max_age=20" start >> "pg_upgrade.log" 2>&1 Checking for presence of required libraries ok | If pg_upgrade fails after this point, you must | re-initdb the new cluster before continuing. | You will also need to remove the ".old" suffix | from /var/postgresql//global/pg_control.old. Performing Migration Adding ".old" suffix to old global/pg_control ok Analyzing all rows in the new cluster "/opt/pgsql-9.0.5a-int/bin/vacuumdb" --port 4329 --username "postgres" --all --analyze >> "pg_upgrade.log" 2>&1 ok Freezing all rows on the new cluster "/opt/pgsql-9.0.5a-int/bin/vacuumdb" --port 4329 --username "postgres" --all --freeze >> "pg_upgrade.log" 2>&1 ok "/opt/pgsql-9.0.5a-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/-9.0" stop >> "pg_upgrade.log" 2>&1 Deleting new commit clogs
Re: [HACKERS] PATCH: regular logging of checkpoint progress
On 26 August 2011, 0:39, Tomas Vondra wrote: > On 26 August 2011, 0:18, Josh Berkus wrote: >> Tomas, >> >>> I'd like to propose a small patch that allows better checkpoint >>> progress >>> monitoring. The patch is quite simple - it adds a new integer GUC >>> "checkpoint_update_limit" and every time checkpoint writes this number >>> of >>> buffers, it does two things: >> >> I'd rather not have a new GUC if we can avoid it. What about just >> making this some reasonable value (like 1000 buffers) if log_checkpoints >> = on? > > I was thinking about that too, but I think no value can fit all cases > because the systems may have very different I/O subsystems. > > With one 7.2k drive I usually get about 25MB/s on average, with big arrays > / good controllers / fast drives you can write much faster. So either the > value will be too low (and the log will be infested with those messages) > or too high (so it won't be updated very often). Hmm, maybe we could use time instead of number of buffers? Something like "every 5 seconds, log the checkpoint progress" instead of "every time 1000 buffers are written ..." That should work regardless of the system's I/O performance. But although using time instead of number of buffers seems like a good idea, I don't think it eliminates the need for a new GUC. (a) Even with a time limit, I find it useful to be able to set the limits differently. (b) In some cases it may be useful to enable just basic (current behaviour) checkpoint logging using log_checkpoints, so using it for this new patch may not be a good idea. Although this could be fixed by allowing three values no/basic/detailed instead of just true/false. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
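For concreteness, a time-based check could be built from the existing timestamp helpers. A rough sketch (checkpoint_log_interval is a hypothetical GUC name used only for illustration - this is not code from the posted patch):

/* In BufferSync(), before the write loop: */
TimestampTz	last_report = GetCurrentTimestamp();

/* ... and inside the loop, after each buffer is written: */
if (checkpoint_log_interval > 0 &&
	TimestampDifferenceExceeds(last_report, GetCurrentTimestamp(),
							   checkpoint_log_interval))
{
	elog(LOG, "checkpoint status: %d of %d buffers written (%.1f%%)",
		 num_written, num_to_write,
		 100.0 * num_written / num_to_write);
	pgstat_send_bgwriter();		/* push the bgwriter stats out early, too */
	last_report = GetCurrentTimestamp();
}

The logging rate is then bounded by wall-clock time no matter how fast or slow the I/O subsystem is, which is exactly the property a buffer-count threshold lacks.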
Re: [HACKERS] PATCH: regular logging of checkpoint progress
On 26 August 2011, 0:18, Josh Berkus wrote: > Tomas, > >> I'd like to propose a small patch that allows better checkpoint progress >> monitoring. The patch is quite simple - it adds a new integer GUC >> "checkpoint_update_limit" and every time checkpoint writes this number >> of >> buffers, it does two things: > > I'd rather not have a new GUC if we can avoid it. What about just > making this some reasonable value (like 1000 buffers) if log_checkpoints > = on? I was thinking about that too, but I think no value can fit all cases because the systems may have very different I/O subsystems. With one 7.2k drive I usually get about 25MB/s on average, with big arrays / good controllers / fast drives you can write much faster. So either the value will be too low (and the log will be infested with those messages) or too high (so it won't be updated very often). And it depends on the number of shared buffers too. I thought it could be something like 5% of shared buffers but that really does not solve the issues. So I think there should be a new GUC. OTOH, it's probably a good idea to log this only when log_checkpoints is enabled, as those who tune checkpoints will enable log_checkpoints in the first place. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cheaper snapshots redux
On Aug 22, 2011, at 6:22 PM, Robert Haas wrote: > With respect to a general-purpose shared memory allocator, I think > that there are cases where that would be useful to have, but I don't > think there are as many of them as many people seem to think. I > wouldn't choose to implement this using a general-purpose allocator > even if we had it, both because it's undesirable to allow this or any > subsystem to consume an arbitrary amount of memory (nor can it fail... > especially in the abort path) and because a ring buffer is almost > certainly faster than a general-purpose allocator. We have enough > trouble with palloc overhead already. That having been said, I do > think there are cases where it would be nice to have... and it > wouldn't surprise me if I end up working on something along those > lines in the next year or so. It turns out that memory management is > a major issue in lock-free programming; you can't assume that it's > safe to recycle an object once the last pointer to it has been removed > from shared memory - because someone may have fetched the pointer just > before you removed it and still be using it to examine the object. An > allocator with some built-in capabilities for handling such problems > seems like it might be very useful. Actually, I wasn't thinking about the system dynamically sizing shared memory on its own... I was only thinking of providing the ability for a user to change something like shared_buffers and allow that change to take effect with a SIGHUP instead of requiring a full restart. I agree that we'd have to be very careful with allowing the code to start changing shared memory size on its own... -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
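To make the ring-buffer argument concrete, here is a minimal standalone sketch of the kind of structure being described (hypothetical code, not from any posted patch; locking and memory-ordering concerns are only noted in the comments):

#include <stdint.h>
#include <stdbool.h>

#define RING_SIZE 6500			/* on the order of 64 * MaxBackends */

typedef struct SnapshotRing
{
	uint64_t	head;			/* total entries ever written */
	uint32_t	xids[RING_SIZE];	/* TransactionId, in PostgreSQL terms */
} SnapshotRing;

/* Writer side; assumed to run under an exclusive lock or with barriers. */
static void
ring_write(SnapshotRing *ring, uint32_t xid)
{
	ring->xids[ring->head % RING_SIZE] = xid;
	ring->head++;
}

/*
 * Reader side: copy n entries starting at logical position 'start'.
 * Returns false if the writer lapped us meanwhile - the copy is then
 * torn and the caller must retry or fall back to a slower path.
 */
static bool
ring_read(const SnapshotRing *ring, uint64_t start, int n, uint32_t *dst)
{
	for (int i = 0; i < n; i++)
		dst[i] = ring->xids[(start + i) % RING_SIZE];
	return ring->head - start <= RING_SIZE;
}

Allocation is just advancing a counter, and the object-recycling problem Robert mentions reduces to one arithmetic check after the copy - no free lists, no per-chunk headers.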
Re: [HACKERS] cheaper snapshots redux
On Aug 25, 2011, at 8:24 AM, Robert Haas wrote: > My hope (and it might turn out that I'm an optimist) is that even with > a reasonably small buffer it will be very rare for a backend to > experience a wraparound condition. For example, consider a buffer > with ~6500 entries, approximately 64 * MaxBackends, the approximate > size of the current subxip arrays taken in aggregate. I hypothesize > that a typical snapshot on a running system is going to be very small > - a handful of XIDs at most - because, on the average, transactions > are going to commit in *approximately* increasing XID order and, if > you take the regression tests as representative of a real workload, > only a small fraction of transactions will have more than one XID. So BTW, there's a way to actually gather some data on this by using PgQ (part of Skytools and used by Londiste). PgQ works by creating "ticks" at regular intervals, where a tick is basically just a snapshot of committed XIDs. Presumably Slony does something similar. I can provide you with sample data from our production systems if you're interested. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inputting relative datetimes
> Yes, but what I am trying to achieve is a way of entering such > relative timestamps using a single input value, so that absolute and > relative timestamps can both be bound to a SQL query using just one > variable. I think adding a function would be the way to go then. Maybe extending to_timestamp with a "relative" keyword. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: regular logging of checkpoint progress
Tomas, > I'd like to propose a small patch that allows better checkpoint progress > monitoring. The patch is quite simple - it adds a new integer GUC > "checkpoint_update_limit" and every time checkpoint writes this number of > buffers, it does two things: I'd rather not have a new GUC if we can avoid it. What about just making this some reasonable value (like 1000 buffers) if log_checkpoints = on? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_restore --no-post-data and --post-data-only
On 08/25/2011 06:05 PM, Jim Nasby wrote: On Aug 24, 2011, at 7:43 PM, Josh Berkus wrote: On 8/23/11 1:30 PM, Andrew Dunstan wrote: Attached is an undocumented patch that allows pg_restore to omit post-data items or omit all but post-data items. This has been discussed before, and Simon sent in a patch back on 2008, which has bitrotted some. I'm not sure why it was dropped at the time, but I think it's time to do this. This patch relies on some infrastructure that was added since Simon's patch, so it works a bit differently (and more simply). If it's not clear from Andrew's description, the purpose of this patch is to allow dividing your pgdump into 3 portions: 1. schema 2. data 3. constraints/indexes This allows users to implement a number of custom solutions for ad-hoc parallel dump, conditional loading, data munging and sampled databases. While doing so was possible before using the manifest from pg_restore -l, the manifest approach has been complex to automate and relies on obscure knowledge. I have immediate production use for this patch and may be backporting it. FWIW, I got around this by writing a perl script that calls pg_dump -s and watches for the end of table create statements (IIRC it specifically looks for the first CREATE INDEX). The advantage to that approach is that you don't have to first create a custom format dump and then run pg_restore against that. Well, notwithstanding my well known love of perl, that strikes me as spending a pound to save a penny. And custom format dumps rock ;-) Also, your recipe above is buggy, BTW. A CREATE INDEX statement might well not be the first item in the post-data section. But we could also add these switches to pg_dump too if people feel it's worthwhile. I haven't looked but the logic should not be terribly hard. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_restore --no-post-data and --post-data-only
On Aug 24, 2011, at 7:43 PM, Josh Berkus wrote: > On 8/23/11 1:30 PM, Andrew Dunstan wrote: >> >> Attached is an undocumented patch that allows pg_restore to omit >> post-data items or omit all but post-data items. This has been discussed >> before, and Simon sent in a patch back on 2008, which has bitrotted >> some. I'm not sure why it was dropped at the time, but I think it's time >> to do this. This patch relies on some infrastructure that was added >> since Simon's patch, so it works a bit differently (and more simply). > > If it's not clear from Andrew's description, the purpose of this patch > is to allow dividing your pgdump into 3 portions: > > 1. schema > 2. data > 3. constraints/indexes > > This allows users to implement a number of custom solutions for ad-hoc > parallel dump, conditional loading, data munging and sampled databases. > While doing so was possible before using the manifest from pg_restore > -l, the manifest approach has been complex to automate and relies on > obscure knowledge. > > I have immediate production use for this patch and may be backporting it. FWIW, I got around this by writing a perl script that calls pg_dump -s and watches for the end of table create statements (IIRC it specifically looks for the first CREATE INDEX). The advantage to that approach is that you don't have to first create a custom format dump and then run pg_restore against that. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why doesn't psql use the information schema to get ACL description ?
On Aug 23, 2011, at 1:57 PM, Peter Eisentraut wrote: >> Ultimately my ulterior motive is to make sure the information schema >> is actually useful and ensure that it maintains that status. > > The information schema only exposes privileges defined in the SQL > standard on objects defined in the SQL standard. So psql or any tool > that wants to provide information on PostgreSQL-specific features can't > use that. But perhaps the JDBC interfaces only expose certain standard > information anyway, so it could be useful. Depends on the details, > though. Related to this is the newsysviews project, which was intended to present more human-friendly info ala info_schema, but to also provide PostgreSQL-specific information (which info_schema can't). -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Buffering GiST leaf pages too
On Aug 23, 2011, at 2:03 AM, Heikki Linnakangas wrote: > While looking at Alexander's GiST fastbuild patch, which adds buffers to > internal nodes to avoid random I/O during index build, it occurred to me that > inserting the tuples to the leaf pages one at a time is quite inefficient > too, even if the leaf pages are in cache. There's still the overhead of > locking and WAL-logging each insertion separately. I think we could get a > nice further speedup if we attach a small buffer (one block or so) to every > leaf page we're currently writing tuples to, and update the leaf page in > bulk. Conveniently, the code to insert multiple tuples to a page already > exists in GiST code (because inserting a tuple sometimes splits the page into > more than two parts, so you need to insert multiple downlinks to the parent), > so this requires no changes to the low-level routines and WAL-logging. > > Let's finish off the main fastbuild patch first, but I wanted to get the idea > out there. I've often wondered about the per-tuple overhead of all kinds of operations, not just GiST index builds. For example, if you're doing a seqscan, ISTM it would be a lot more efficient to memcpy an entire page into backend-local memory and operate off of that lock-free. Similarly for an index scan, you'd want to copy a full leaf page if you think you'll be hitting it more than once or twice. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inputting relative datetimes
On Thu, Aug 25, 2011 at 5:08 AM, Dean Rasheed wrote: > On 25 August 2011 10:43, Vik Reykja wrote: >> On Thu, Aug 25, 2011 at 11:39, Dean Rasheed >> wrote: >>> >>> My first thought was to have some general way of adding or subtracting >>> an interval at the end of an input timestamp, eg. by adding another >>> couple of special values - "plus " and "minus ". >>> This would allow things like: >>> >>> TIMESTAMPTZ 'today minus 5 days' >>> TIMESTAMPTZ 'now plus 2 hours' >> >> Funny you should mention intervals... >> >> timestamptz 'today' - interval '5 days' >> timestamptz 'now' + interval '2 hours' >> > > Yes, but what I am trying to achieve is a way of entering such > relative timestamps using a single input value, so that absolute and > relative timestamps can both be bound to a SQL query using just one > variable. not gonna happen. as noted, intervals already solve the issue quite well and timestamp string parsing is already complicated enough as it is. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
On 08/25/2011 02:45 PM, Tom Lane wrote: I wrote: Andrew Dunstan writes: We could just add -b unconditionally to the flex flags and then count the number of lines in lex.backup. If it's greater that 1 whine loudly, or even fail, otherwise remove lex.backup. Would that avoid locale dependencies? Hm, yeah, seems like that ought to work. Done in HEAD, but only for the Makefile-based build mechanism. Anybody want to add the comparable logic to the MSVC scripts? Done. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
On Thu, Aug 25, 2011 at 5:12 PM, Tom Lane wrote: > While I'm looking at this ... the current implementation has got a > number of very inconsistent behaviors with respect to when it will > expand a variable reference within a psql meta-command argument. > Observe: > > regression=# \set foo 'value of foo' > regression=# \set bar 'value of bar' > regression=# \echo :foo > value of foo > regression=# \echo :foo@bar > value of foo @bar > > (there shouldn't be a space before the @, IMO --- there is because this > gets treated as two separate arguments, which seems bizarre) > > regression=# \echo :foo:bar > value of foo value of bar > > (again, why is this two arguments not one?) > > regression=# \echo :foo@:bar > value of foo @:bar > > (why isn't :bar expanded here, when it is in the previous case?) > > regression=# \echo foo:foo@:bar > foo:foo@:bar > > (and now neither one gets expanded) > > ISTM the general rule ought to be that we attempt to substitute for a > colon-construct regardless of where it appears within an argument, as > long as it's not within quotes. > > Thoughts? My main thought is that I remember this code being pretty awful - especially with respect to error handling - when I looked at it. A lot of dubious behaviors were more or less compelled by the impossibility of bailing out at an arbitrary point a la ereport(). At least, barring a drastic refactoring of the code, which might not be a bad idea either. No objection if you want to clean some of it up, but you may find it's a larger sinkhole than you anticipate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
While I'm looking at this ... the current implementation has got a number of very inconsistent behaviors with respect to when it will expand a variable reference within a psql meta-command argument. Observe: regression=# \set foo 'value of foo' regression=# \set bar 'value of bar' regression=# \echo :foo value of foo regression=# \echo :foo@bar value of foo @bar (there shouldn't be a space before the @, IMO --- there is because this gets treated as two separate arguments, which seems bizarre) regression=# \echo :foo:bar value of foo value of bar (again, why is this two arguments not one?) regression=# \echo :foo@:bar value of foo @:bar (why isn't :bar expanded here, when it is in the previous case?) regression=# \echo foo:foo@:bar foo:foo@:bar (and now neither one gets expanded) ISTM the general rule ought to be that we attempt to substitute for a colon-construct regardless of where it appears within an argument, as long as it's not within quotes. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PATCH: regular logging of checkpoint progress
Hello, I'd like to propose a small patch that allows better checkpoint progress monitoring. The patch is quite simple - it adds a new integer GUC "checkpoint_update_limit" and every time checkpoint writes this number of buffers, it does two things: (a) logs a "checkpoint status" message into the server log, with info about total number of buffers to write, number of already written buffers, current and average write speed and estimate of remaining time (b) sends bgwriter stats (so that the buffers_checkpoint is updated) I believe this will make checkpoint tuning easier, especially with large shared buffers and when there's other write activity (so that it's difficult to see the checkpoint I/O). The default value (0) means this continuous logging is disabled. Tomas diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml new file mode 100644 index 67e722f..64d84b0 *** a/doc/src/sgml/config.sgml --- b/doc/src/sgml/config.sgml *** SET ENABLE_SEQSCAN TO OFF; *** 1863,1868 --- 1863,1885 + + checkpoint_update_limit (integer) + +checkpoint_update_limit configuration parameter + + + + Number of buffers written during a checkpoint between logging a status + (with total number of buffers to write, number of already written buffers, + average/current write and estimate of the remaining time) and updates bgwriter + stats at the same time. The default value 0 disables the continuous updates so + the stats are updated only at the end of the checkpoint. This parameter can only + be set in the postgresql.conf file or on the server command line. + + + + diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c new file mode 100644 index 4c7cfb0..b24ec93 *** a/src/backend/storage/buffer/bufmgr.c --- b/src/backend/storage/buffer/bufmgr.c *** *** 66,71 --- 66,72 bool zero_damaged_pages = false; int bgwriter_lru_maxpages = 100; doublebgwriter_lru_multiplier = 2.0; + int checkpoint_update_limit = 0; /* * How many buffers PrefetchBuffer callers should try to stay ahead of their *** BufferSync(int flags) *** 1175,1180 --- 1176,1192 int num_to_write; int num_written; int mask = BM_DIRTY; + + int num_since_update; + + longcurr_secs, + total_secs; + int curr_usecs, + total_usecs; + float curr_time, + total_time; + + TimestampTz startTimestamp, lastTimestamp; /* Make sure we can handle the pin inside SyncOneBuffer */ ResourceOwnerEnlargeBuffers(CurrentResourceOwner); *** BufferSync(int flags) *** 1238,1243 --- 1250,1260 buf_id = StrategySyncStart(NULL, NULL); num_to_scan = NBuffers; num_written = 0; + num_since_update = 0; + + startTimestamp = GetCurrentTimestamp(); + lastTimestamp = startTimestamp; + while (num_to_scan-- > 0) { volatile BufferDesc *bufHdr = &BufferDescriptors[buf_id]; *** BufferSync(int flags) *** 1261,1266 --- 1278,1327 TRACE_POSTGRESQL_BUFFER_SYNC_WRITTEN(buf_id); BgWriterStats.m_buf_written_checkpoints++; num_written++; + num_since_update++; + + /* +* Every time we write enough buffers (checkpoint_update_limit), +* we log a checkpoint status message and update the bgwriter +* stats (so that the pg_stat_bgwriter table may be updated). +* +* The log message contains info about total number of buffers to +* write, how many buffers are already written, average and current +* write speed and an estimate remaining time. +*/ + if ((checkpoint_update_limit > 0) && (num_since_update >= checkpoint_update_limit)) + { + + TimestampDifference(lastTimestamp, + GetCurrentTimestamp(), + &curr_secs, &curr_usecs); + +
Re: [HACKERS] [GENERAL] pg_upgrade problem
On Thu, Aug 25, 2011 at 04:43:02PM -0400, Bruce Momjian wrote: > Please check the old cluster. Sure: =# SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; reltoastrelid --- 82510395 71637071 (2 rows) =# SELECT oid::regclass, reltoastrelid FROM pg_class WHERE relname = 'actions'; oid | reltoastrelid ---+--- x.actions | 82510395 y.actions | 71637071 (2 rows) =# select oid, relfilenode from pg_class where oid in (SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'); oid| relfilenode --+- 82510395 |82510395 71637071 |71637071 (2 rows) =# select oid from pg_database where datname = current_database(); oid -- 71635381 (1 row) $ ls -l /base/71635381/{71637071,82510395} -rw--- 1 postgres postgres 0 2009-10-12 06:49 /base/71635381/71637071 -rw--- 1 postgres postgres 0 2010-08-19 14:02 /base/71635381/82510395 > > > > One more thing - one of the earlier tests actually worked through > > > > pg_upgrade, but when running vacuumdb -az on newly started 9.0.4, I got > > > > an error about missing transaction/clog - don't remember exactly what it > > > > was, though. > > > There was a bug in how pg_upgrade worked in pre-9.0.4 --- could it > > > have been that? > > It was done definitely using 9.0.4. > Good. Not sure if it's good, since it was after the clog error was fixed, and I still got it :/ but anyway - the problem with 71637071 is more important now. Best regards, depesz -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI 2PC coverage
Excerpts from Tom Lane's message of Wed Aug 24 22:11:58 -0300 2011: > Alvaro Herrera writes: > > After having to play with this, I didn't like it very much, because > > regression.diffs gets spammed with the (rather massive and completely > > useless) diff in that test. For the xml tests, rather than ignoring the > > failure on an installation without libxml, we use an alternative output. > > > Unless there are objections, I will commit the alternative file proposed > > by Dan. > > +1 ... "ignore" is a pretty ugly hack here. Done. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] pg_upgrade problem
hubert depesz lubaczewski wrote: > On Thu, Aug 25, 2011 at 04:33:07PM -0400, Bruce Momjian wrote: > > The problem appears to be that the Postgres catalogs think there is a > > toast table for 'actions', while the file system doesn't seem to have > > such a file. Can you look in pg_class and verify that? > > > > SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; > > $ SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; > reltoastrelid > --- > (0 rows) > > This was done not on the pg from backup, but on normal production, as the test > pg instance doesn't work anymore. > > I can reset the test instance, but extracting from backup, and making it apply > all xlogs usually takes 2-3 days. If you remove the .old extension on pg_control, you can start the old cluster and check it. This is explained by the pg_upgrade output: | If pg_upgrade fails after this point, you must | re-initdb the new cluster before continuing. | You will also need to remove the ".old" suffix | from /var/postgresql//global/pg_control.old. Please check the old cluster. > > > One more thing - one of the earlier tests actually worked through > > > pg_upgrade, but when running vacuumdb -az on newly started 9.0.4, I got > > > an error about missing transaction/clog - don't remember exactly what it > > > was, though. > > > > There was a bug in how pg_upgrade worked in pre-9.0.4 --- could it > > have been that? > > It was done definitely using 9.0.4. Good. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] pg_upgrade problem
On Thu, Aug 25, 2011 at 04:33:07PM -0400, Bruce Momjian wrote: > The problem appears to be that the Postgres catalogs think there is a > toast table for 'actions', while the file system doesn't seem to have > such a file. Can you look in pg_class and verify that? > > SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; $ SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; reltoastrelid --- (0 rows) This was done not on the pg from backup, but on normal production, as the test pg instance doesn't work anymore. I can reset the test instance, but extracting from backup, and making it apply all xlogs usually takes 2-3 days. > > One more thing - one of the earlier tests actually worked through > > pg_upgrade, but when running vacuumdb -az on newly started 9.0.4, I got > > an error about missing transaction/clog - don't remember exactly what it > > was, though. > > There was a bug in how pg_upgrade worked in pre-9.0.4 --- could it > have been that? It was done definitely using 9.0.4. Best regards, depesz -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] pg_upgrade problem
hubert depesz lubaczewski wrote: > hi > > I have an 8.3.11 database, ~600GB in size. > > I want to upgrade it to 9.0. > > First, I tried with 9.0.4, and when I hit the problem (the same one) I tried > git, head of the 9.0 branch. Good. > pg_upgrade_dump_db.sql- > pg_upgrade_dump_db.sql--- For binary upgrade, must preserve relfilenodes > pg_upgrade_dump_db.sql-SELECT > binary_upgrade.set_next_heap_relfilenode('71637068'::pg_catalog.oid); > pg_upgrade_dump_db.sql:SELECT > binary_upgrade.set_next_toast_relfilenode('71637071'::pg_catalog.oid); > pg_upgrade_dump_db.sql-SELECT > binary_upgrade.set_next_index_relfilenode('71637073'::pg_catalog.oid); > pg_upgrade_dump_db.sql- > pg_upgrade_dump_db.sql-CREATE TABLE actions ( > -- > pg_upgrade.log-linking /var/postgresql//base/113953649/2613 to > /var/postgresql/-9.0/base/11826/11790 > pg_upgrade.log- /var/postgresql//base/113953649/2683 > pg_upgrade.log-linking /var/postgresql//base/113953649/2683 to > /var/postgresql/-9.0/base/11826/11792 > pg_upgrade.log:Could not find 71637071 in old cluster The problem appears to be that the Postgres catalogs think there is a toast table for 'actions', while the file system doesn't seem to have such a file. Can you look in pg_class and verify that? SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; Then look in the file system to see if there is a matching file. > One more thing - one of the earlier tests actually worked through > pg_upgrade, but when running vacuumdb -az on newly started 9.0.4, I got > an error about missing transaction/clog - don't remember exactly what it > was, though. There was a bug in how pg_upgrade worked in pre-9.0.4 --- could it have been that? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache
On Fri, Aug 5, 2011 at 2:36 PM, Kohei KaiGai wrote: > BTW, what is the current status of this patch? > The status of the contrib/sepgsql part is unclear to me, although we agreed that > syscache is a suitable mechanism for security labels. Sorry it's taken me a while to get around to looking at this. Reviewing away... For me, the line you removed from dml.out causes the regression tests to fail. I don't understand what this is going for: + /* +* To boost up trusted procedure checks on db_procedure object +* class, we also confirm the decision when user calls a procedure +* labeled as 'tcontext'. +*/ Can you explain? sepgsql_avc_check_perms_label has a formatting error on the line that says "result = false". It's not indented correctly. Several functions do this: sepgsql_avc_check_valid(); do { ... } while (!sepgsql_avc_check_valid()); I don't understand why we need a loop there. The comment for sepgsql_avc_check_perms_label uses the word "elsewhere" when it really means "otherwise". Changing the calling sequence of sepgsql_get_label() would perhaps be better separated out into its own patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] tsvector concatenation - backend crash
Hi Attached SQL files gives (at least in my hands) a reliable backend crash with this stacktrace .. reproduced on both 9.0.4 and HEAD. I'm sorry I cannot provide a more trimmed down set of vectors the reproduces the bug, thus the "obsfucated" dataset. But even deleting single terms in the vectors make the bug go away. *** glibc detected *** postgres: jk jk [local] SELECT: corrupted double-linked list: 0x02279f80 *** === Backtrace: = /lib/libc.so.6(+0x775b6)[0x7fe4db4b25b6] /lib/libc.so.6(+0x7aa25)[0x7fe4db4b5a25] /lib/libc.so.6(cfree+0x73)[0x7fe4db4b8e83] postgres: jk jk [local] SELECT[0x710de5] postgres: jk jk [local] SELECT(MemoryContextReset+0x2a)[0x71119a] postgres: jk jk [local] SELECT(ExecScan+0x4a)[0x57887a] postgres: jk jk [local] SELECT(ExecProcNode+0x238)[0x571708] postgres: jk jk [local] SELECT(standard_ExecutorRun+0xd2)[0x5705e2] postgres: jk jk [local] SELECT[0x63c627] postgres: jk jk [local] SELECT(PortalRun+0x248)[0x63d948] postgres: jk jk [local] SELECT[0x639fdb] postgres: jk jk [local] SELECT(PostgresMain+0x547)[0x63af97] postgres: jk jk [local] SELECT[0x5fb959] postgres: jk jk [local] SELECT(PostmasterMain+0xa97)[0x5fe137] postgres: jk jk [local] SELECT(main+0x490)[0x59f4d0] /lib/libc.so.6(__libc_start_main+0xfd)[0x7fe4db459c4d] postgres: jk jk [local] SELECT[0x45d569] === Memory map: 0040-008d6000 r-xp 08:01 4071141 /tmp/pgsql/bin/postgres 00ad5000-00ad6000 r--p 004d5000 08:01 4071141 /tmp/pgsql/bin/postgres 00ad6000-00ae2000 rw-p 004d6000 08:01 4071141 /tmp/pgsql/bin/postgres 00ae2000-00b43000 rw-p 00:00 0 0215d000-0227e000 rw-p 00:00 0 [heap] 7fe4d400-7fe4d4021000 rw-p 00:00 0 7fe4d4021000-7fe4d800 ---p 00:00 0 7fe4d908f000-7fe4d90a5000 r-xp 08:01 4194383 /lib/libgcc_s.so.1 7fe4d90a5000-7fe4d92a4000 ---p 00016000 08:01 4194383 /lib/libgcc_s.so.1 7fe4d92a4000-7fe4d92a5000 r--p 00015000 08:01 4194383 /lib/libgcc_s.so.1 7fe4d92a5000-7fe4d92a6000 rw-p 00016000 08:01 4194383 /lib/libgcc_s.so.1 7fe4d92c1000-7fe4d9342000 rw-p 00:00 0 7fe4d9342000-7fe4db22e000 rw-s 00:04 8716337 /SYSV0052ea91 (deleted) 7fe4db22e000-7fe4db23a000 r-xp 08:01 4194415 /lib/libnss_files-2.11.1.so 7fe4db23a000-7fe4db439000 ---p c000 08:01 4194415 /lib/libnss_files-2.11.1.so 7fe4db439000-7fe4db43a000 r--p b000 08:01 4194415 /lib/libnss_files-2.11.1.so 7fe4db43a000-7fe4db43b000 rw-p c000 08:01 4194415 /lib/libnss_files-2.11.1.so 7fe4db43b000-7fe4db5b5000 r-xp 08:01 4194349 /lib/libc-2.11.1.so 7fe4db5b5000-7fe4db7b4000 ---p 0017a000 08:01 4194349 /lib/libc-2.11.1.so 7fe4db7b4000-7fe4db7b8000 r--p 00179000 08:01 4194349 /lib/libc-2.11.1.so 7fe4db7b8000-7fe4db7b9000 rw-p 0017d000 08:01 4194349 /lib/libc-2.11.1.so 7fe4db7b9000-7fe4db7be000 rw-p 00:00 0 7fe4db7be000-7fe4db84 r-xp 08:01 4194398 /lib/libm-2.11.1.so 7fe4db84-7fe4dba3f000 ---p 00082000 08:01 4194398 /lib/libm-2.11.1.so 7fe4dba3f000-7fe4dba4 r--p 00081000 08:01 4194398 /lib/libm-2.11.1.so 7fe4dba4-7fe4dba41000 rw-p 00082000 08:01 4194398 /lib/libm-2.11.1.so 7fe4dba41000-7fe4dba43000 r-xp 08:01 4194363 /lib/libdl-2.11.1.so 7fe4dba43000-7fe4dbc43000 ---p 2000 08:01 4194363 /lib/libdl-2.11.1.so 7fe4dbc43000-7fe4dbc44000 r--p 2000 08:01 4194363 /lib/libdl-2.11.1.so 7fe4dbc44000-7fe4dbc45000 rw-p 3000 08:01 4194363 /lib/libdl-2.11.1.so 7fe4dbc45000-7fe4dbc65000 r-xp 08:01 4194325 /lib/ld-2.11.1.so 7fe4dbc85000-7fe4dbce7000 rw-p 00:00 0 7fe4dbce7000-7fe4dbd26000 r--p 08:01 5512971 /usr/lib/locale/en_DK.utf8/LC_CTYPE 7fe4dbd26000-7fe4dbe44000 r--p 08:01 5512650 /usr/lib/locale/en_DK.utf8/LC_COLLATE 7fe4dbe44000-7fe4dbe47000 rw-p 00:00 0 
7fe4dbe58000-7fe4dbe59000 r--p 08:01 5515083 /usr/lib/locale/en_DK.utf8/LC_TIME 7fe4dbe59000-7fe4dbe5a000 r--p 08:01 5515084 /usr/lib/locale/en_DK.utf8/LC_MONETARY 7fe4dbe5a000-7fe4dbe5b000 r--p 08:01 5640299 /usr/lib/locale/en_DK.utf8/LC_MESSAGES/SYS_LC_MESSAGES 7fe4dbe5b000-7fe4dbe62000 r--s 08:01 5511621 /usr/lib/gconv/gconv-modules.cache 7fe4dbe62000-7fe4dbe64000 rw-p 00:00 0 7fe4dbe64000-7fe4dbe65000 r--p 0001f000 08:01 4194325 /lib/ld-2.11.1.so 7
Re: [HACKERS] Backup's from standby
On Fri, Aug 19, 2011 at 9:38 AM, Valentine Gogichashvili wrote: >> > What issue we may face if you take a backups(includes data dir + wal >> > files) >> > at standby without LVM snapshot? >> >> The backup might be corrupted in arbitrary ways. > > And what will happen, if one issues a pg_start_backup() on the master, then > takes a file-backup on slave, and issues pg_stop_backup() on master again? > As far as I remember this approach was working for me, considering, that all > needed WAL files are transferred to the newly created DB copy as well. Well, I think you would need to verify a few more things: - The pg_start_backup() will need to have been replayed on the standby before you start the file copy. - You will need all the WAL segments beginning at the position where pg_start_backup() was executed on the master and ending after pg_stop_backup() was executed on the master. Assuming you do that, it seems like it ought to work, although I have not tested it and am not at all sure I'm not missing something. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Fast GiST index build
On 24.08.2011 16:57, Alexander Korotkov wrote: I've added some testing results to the wiki page: http://wiki.postgresql.org/wiki/Fast_GiST_index_build_GSoC_2011 Not all of the results I planned for this first chunk are there yet, because it is taking more time than I expected. Some notes about it. I now see two causes which accelerate regular build of GiST indexes: 1) As noted before, regular index build of a pretty well ordered dataset is fast. 2) I found that a worse index is faster to build; by worse I mean an index with higher overlap. The gistchoose function selects the first index tuple with zero penalty, if any. Thus, with higher overlap in the root page, only a few of its index tuples will be chosen for inserts. And, recursively, only a small part of the tree will be used for actual inserts, and that part of the tree can fit into the cache more easily. Thus, high overlap makes inserts cheaper to the same extent that it makes searches more expensive. As an extreme case, a trivial penalty function that just always returns 0 will make index build fast - but the index will be useless for querying. In the tests of the first version of the patch I found the index quality of the regular build much better than that of the buffering build (without neighbor relocation). Now they are similar, though that is because the index quality of the regular build has become worse; that is also why the regular index build is faster in the current tests than in the previous ones. I see the following possible causes: 1) I didn't save the source random data, so this run uses new random data. 2) Some environment parameters of my test setup may have changed, though I doubt it. Despite these possible explanations it seems quite strange to me. That's pretty surprising. Assuming the data is truly random, I wouldn't expect a big difference in the index quality of one random data set over another. If the index quality depends so much on, say, the distribution of the first few tuples inserted into it, that's a quite interesting find on its own, and merits some further research. In order to compare index build methods on higher-quality indexes, I've tried to build indexes with my double sorting split method (see: http://syrcose.ispras.ru/2011/files/SYRCoSE2011_Proceedings.pdf#page=36). On a uniform dataset, search is about 10 times faster! And, as expected, regular index build becomes much slower: it has been running for more than 60 hours and only about 50% of the index is complete (estimated by file sizes). Also, automatic switching to buffering build shows better index quality results in all the tests, though it's hard for me to explain why. Hmm, it makes me a bit uneasy that we're testing with a modified page splitting algorithm. But if the new algorithm is that good, could you post it as a separate patch, please? That said, I don't see any new evidence that the buffering build algorithm would be significantly worse. There's the case of ordered data that we already knew about, and will have to just accept for now. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
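To make the extreme case above concrete - this is purely illustrative C, not code from the patch under discussion - a GiST penalty support function that always reports zero funnels every insert down the first subtree gistchoose() examines:

#include "postgres.h"
#include "fmgr.h"
#include "access/gist.h"

PG_FUNCTION_INFO_V1(degenerate_penalty);

/*
 * Illustrative only: with a constant zero penalty, gistchoose() always
 * descends at the first tuple it inspects, so a small, cache-resident
 * slice of the tree absorbs all inserts. The build is fast, but the
 * overlap it creates makes the index useless for searching.
 */
Datum
degenerate_penalty(PG_FUNCTION_ARGS)
{
	/* args 0 and 1, the existing and new GISTENTRY, are ignored */
	float	   *penalty = (float *) PG_GETARG_POINTER(2);

	*penalty = 0.0f;
	PG_RETURN_POINTER(penalty);
}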
Re: [HACKERS] A couple of issues with psql variable substitution
I wrote: > Andrew Dunstan writes: >> We could just add -b unconditionally to the flex flags and then count >> the number of lines in lex.backup. If it's greater than 1, whine loudly >> or even fail; otherwise remove lex.backup. Would that avoid locale >> dependencies? > Hm, yeah, seems like that ought to work. Done in HEAD, but only for the Makefile-based build mechanism. Anybody want to add the comparable logic to the MSVC scripts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
Andrew Dunstan writes: > We could just add -b unconditionally to the flex flags and then count > the number of lines in lex.backup. If it's greater than 1, whine loudly > or even fail; otherwise remove lex.backup. Would that avoid locale > dependencies? Hm, yeah, seems like that ought to work. I'm tempted to add "-p -p" also, even though that only results in some whinging on stderr. It would still probably get noticed by anyone who was changing the lexer. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
On 08/25/2011 01:16 PM, Alvaro Herrera wrote: Excerpts from Tom Lane's message of Thu Aug 25 14:00:57 -0300 2011: Robert Haas writes: On Thu, Aug 25, 2011 at 12:47 PM, Tom Lane wrote: 1. Somebody broke the no-backtracking property back in 9.0 while adding quoted variable substitution. According to the flex manual, use of backtracking creates a performance penalty. We once measured the backend's lexer as being about a third faster with backtrack avoidance, and presumably it's about the same for psql's. This is not hard to fix, but should I consider it a bug fix and back-patch? We've not had complaints about psql getting slower as of 9.0. That may well have been me. [ checks "git blame" ] Well, you committed the patch anyway: d0cfc018. How would I have known that I broke it? Per the header comments in the backend lexer, you should run flex with "-b" switch and verify that the resulting lex.backup file says "no backing up". I've occasionally thought about automating that, but I'm not sure if the output is entirely locale- and flex-version-independent. It is locale dependent, though of course for the automated check you could just run flex under LC_ALL=C. $ /usr/bin/flex -Cfe -b /pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l $ cat lex.backup Sin retroceso. $ LC_ALL=C /usr/bin/flex -Cfe -b /pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l $ cat lex.backup No backing up. We could just add -b unconditionally to the flex flags and then count the number of lines in lex.backup. If it's greater than 1, whine loudly or even fail; otherwise remove lex.backup. Would that avoid locale dependencies? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
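In shell terms, the check Andrew proposes might look something like this - a hedged sketch, not the rule that was actually committed:

# Run flex with -b under the C locale so lex.backup's wording is
# predictable, then fail if it says anything beyond "No backing up."
LC_ALL=C flex -Cfe -b -o psqlscan.c psqlscan.l
if [ "$(wc -l < lex.backup)" -gt 1 ]; then
    echo "backtracking introduced in psqlscan.l; see lex.backup" >&2
    exit 1
fi
rm -f lex.backup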
Re: [HACKERS] A couple of issues with psql variable substitution
Excerpts from Tom Lane's message of Thu Aug 25 14:00:57 -0300 2011: > Robert Haas writes: > > On Thu, Aug 25, 2011 at 12:47 PM, Tom Lane wrote: > >> 1. Somebody broke the no-backtracking property back in 9.0 while adding > >> quoted variable substitution. According to the flex manual, use of > >> backtracking creates a performance penalty. We once measured the > >> backend's lexer as being about a third faster with backtrack avoidance, > >> and presumably it's about the same for psql's. This is not hard to fix, > >> but should I consider it a bug fix and back-patch? We've not had > >> complaints about psql getting slower as of 9.0. > > > That may well have been me. > > [ checks "git blame" ] Well, you committed the patch anyway: d0cfc018. > > > How would I have known that I broke it? > > Per the header comments in the backend lexer, you should run flex with > "-b" switch and verify that the resulting lex.backup file says "no > backing up". I've occasionally thought about automating that, but I'm > not sure if the output is entirely locale- and flex-version-independent. It is locale dependent, though of course for the automated check you could just run flex under LC_ALL=C. $ /usr/bin/flex -Cfe -b /pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l $ cat lex.backup Sin retroceso. $ LC_ALL=C /usr/bin/flex -Cfe -b /pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l $ cat lex.backup No backing up. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
Robert Haas writes: > On Thu, Aug 25, 2011 at 12:47 PM, Tom Lane wrote: >> 1. Somebody broke the no-backtracking property back in 9.0 while adding >> quoted variable substitution. According to the flex manual, use of >> backtracking creates a performance penalty. We once measured the >> backend's lexer as being about a third faster with backtrack avoidance, >> and presumably it's about the same for psql's. This is not hard to fix, >> but should I consider it a bug fix and back-patch? We've not had >> complaints about psql getting slower as of 9.0. > That may well have been me. [ checks "git blame" ] Well, you committed the patch anyway: d0cfc018. > How would I have known that I broke it? Per the header comments in the backend lexer, you should run flex with "-b" switch and verify that the resulting lex.backup file says "no backing up". I've occasionally thought about automating that, but I'm not sure if the output is entirely locale- and flex-version-independent. > Also, how invasive is the fix? We need to add a couple more rules that will match an unterminated quoted variable and do something reasonable (probably just throw back everything but the colon with yyless). I've not coded it but I think it can't be more than a dozen lines or so. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
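For what it's worth, such a rule might look roughly like the following flex fragment - a hedged sketch only; the actual fix and the patterns in psqlscan.l may well differ:

/* Unterminated quoted variable, e.g. :'foo with no closing quote.
 * Match it outright so flex needs no backtracking state, then throw
 * back everything but the colon with yyless() and emit the colon as
 * ordinary text. (A longer rule for the properly terminated form
 * still wins by longest-match when the closing quote is present.) */
:['"][^'"]*	{
			yyless(1);
			ECHO;
		}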
Re: [HACKERS] A couple of issues with psql variable substitution
On Thu, Aug 25, 2011 at 12:47 PM, Tom Lane wrote: > On my way to do something else entirely, I came across a couple of > things that are not very nice about psql's lexer: > > 1. Somebody broke the no-backtracking property back in 9.0 while adding > quoted variable substitution. According to the flex manual, use of > backtracking creates a performance penalty. We once measured the > backend's lexer as being about a third faster with backtrack avoidance, > and presumably it's about the same for psql's. This is not hard to fix, > but should I consider it a bug fix and back-patch? We've not had > complaints about psql getting slower as of 9.0. That may well have been me. How would I have known that I broke it? Also, how invasive is the fix? > 2. The lexer rules associated with variable substitution think that > variable names can consist only of ASCII letters and digits (and > underscores). The psql manual is noncommittal about whether non-ASCII > characters are allowed, but a reasonable person would think that the > rules ought to be the same as the backend's idea of what an identifier > is. Does anybody have a problem with improving that? Nope. Or at least, I don't. > (I'm not > proposing this part as a bug fix, because it does look a little bit > more invasive to fix, because of the way psql deals with unsafe > multibyte encodings.) +1 for not back-patching this part. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change format of FDW options used in \d* commands
2011/8/19 Shigeru Hanada : > Oops, good catch. I've fixed psql to use quote_ident() for option_name, > and modified the regression tests to use special characters in option names. > Please try the revised version of the patch. This part looks good to me, too. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] A couple of issues with psql variable substitution
On my way to do something else entirely, I came across a couple of things that are not very nice about psql's lexer: 1. Somebody broke the no-backtracking property back in 9.0 while adding quoted variable substitution. According to the flex manual, use of backtracking creates a performance penalty. We once measured the backend's lexer as being about a third faster with backtrack avoidance, and presumably it's about the same for psql's. This is not hard to fix, but should I consider it a bug fix and back-patch? We've not had complaints about psql getting slower as of 9.0. 2. The lexer rules associated with variable substitution think that variable names can consist only of ASCII letters and digits (and underscores). The psql manual is noncommittal about whether non-ASCII characters are allowed, but a reasonable person would think that the rules ought to be the same as the backend's idea of what an identifier is. Does anybody have a problem with improving that? (I'm not proposing this part as a bug fix, because it does look a little bit more invasive to fix, because of the way psql deals with unsafe multibyte encodings.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change format of FDW options used in \d* commands
2011/8/19 Shigeru Hanada : > BTW, I noticed that pg_dump has had the same issue since 8.4, the initial release > of the SQL/MED infrastructure. If an FDW option is defined on any FDW object > with a name that contains special characters such as spaces or > parentheses, pg_dump generates an invalid OPTIONS clause such as "OPTIONS > (separated name 'value')". > Perhaps this issue has been overlooked because dblink was practically the > only use case for FDW options before 9.1. Since 9.1, users might get > various FDWs, and some of those might use special characters in option > names. ISTM that this fix should be back-patched, at least to 9.1. > Please find attached patches for each STABLE branch. Good catch, committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
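To make the failure mode concrete - a hedged illustration using an invented wrapper with no validator, so that arbitrary option names are accepted:

-- An FDW with no validator accepts any option name, including ones
-- that need quoting when dumped.
CREATE FOREIGN DATA WRAPPER dummy_fdw;
CREATE SERVER dummy_server FOREIGN DATA WRAPPER dummy_fdw
    OPTIONS ("separated name" 'value');

-- Before the fix, pg_dump would emit OPTIONS (separated name 'value'),
-- which is not valid syntax; quoting the name via quote_ident() makes
-- the dump round-trip.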
Re: [HACKERS] cheaper snapshots redux
On Thu, Aug 25, 2011 at 11:15 AM, Markus Wanner wrote: > On 08/25/2011 04:59 PM, Tom Lane wrote: >> That's a good point. If the ring buffer size creates a constraint on >> the maximum number of sub-XIDs per transaction, you're going to need a >> fallback path of some sort. > > I think Robert envisions the same fallback path we already have: > subxids.overflowed. I have a slightly more nuanced idea, but basically yes. The trouble is that if you're keeping the snapshot around and updating it (rather than scanning the ProcArray each time) you need some sort of mechanism for the snapshot to eventually un-overflow. Otherwise, the first overflow leaves you in the soup for the entire lifetime of the cluster. What I have in mind is to store the highest subxid that has been removed from the snapshot, or InvalidTransactionId if we know the snapshot is complete. Whenever the highest removed subxid falls behind xmin, we can reset it to InvalidTransactionId. It would be sensible for clients to store the exact value of highest_removed_subxid in their snapshots as well, instead of just a Boolean flag. A pg_subtrans lookup is needed only for XIDs which are greater than xmin and less than or equal to highest_removed_subxid. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
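A minimal C sketch of that rule; highest_removed_subxid is the invented field described above, not something that exists in the PostgreSQL tree:

#include "postgres.h"
#include "access/transam.h"

/*
 * Per the scheme above, a pg_subtrans lookup is needed only for XIDs in
 * the window (xmin, highest_removed_subxid]; everything else can be
 * answered from the snapshot itself.
 */
static bool
need_subtrans_lookup(TransactionId xid, TransactionId xmin,
					 TransactionId highest_removed_subxid)
{
	/* once the removal horizon falls behind xmin, the window is empty */
	if (!TransactionIdIsValid(highest_removed_subxid) ||
		TransactionIdPrecedes(highest_removed_subxid, xmin))
		return false;

	return TransactionIdFollows(xid, xmin) &&
		   !TransactionIdFollows(xid, highest_removed_subxid);
}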
Re: [HACKERS] cheaper snapshots redux
Tom, On 08/25/2011 04:59 PM, Tom Lane wrote: > That's a good point. If the ring buffer size creates a constraint on > the maximum number of sub-XIDs per transaction, you're going to need a > fallback path of some sort. I think Robert envisions the same fallback path we already have: subxids.overflowed. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cheaper snapshots redux
Robert, On 08/25/2011 04:48 PM, Robert Haas wrote: > What's a typical message size for imessages? Most message types in Postgres-R are just a couple of bytes in size. Others, especially change sets, can be up to 8k. However, I think you'll have an easier job guaranteeing that backends "consume" their portions of the ring-buffer in time. Plus wrap-around isn't that much of a problem in your case. (I couldn't drop imessages, but had to let senders wait.) > Well, one long-running transaction that only has a single XID is not > really a problem: the snapshot is still small. But one very old > transaction that also happens to have a large number of > subtransactions all of which have XIDs assigned might be a good way to > stress the system. Ah, right, that's why it's a list of transactions in progress and not a list of completed transactions in SnapshotData... good. > Each reader decides which data he needs to copy from the buffer, and > then copies it, and then checks whether any of it got overwritten > before the copy was completed. So there's a lively possibility that > the snapshot that was current when the reader began copying it will no > longer be current by the time he finishes copying it, because a commit > has intervened. That's OK: it just means that, effectively, the > snapshot is taken at the moment the start and stop pointers are read, > and won't take into account any commits that happen later, which is > exactly what a snapshot is supposed to do anyway. Agreed, that makes sense. Thanks for explaining. > There is a hopefully quite small possibility that by the time the > reader finishes copying it so much new data will have been written to > the buffer that it will have wrapped around and clobbered the portion > the reader was interested in. That needs to be rare. Yeah. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cheaper snapshots redux
Robert Haas writes: > Well, one long-running transaction that only has a single XID is not > really a problem: the snapshot is still small. But one very old > transaction that also happens to have a large number of > subtransactions all of which have XIDs assigned might be a good way to > stress the system. That's a good point. If the ring buffer size creates a constraint on the maximum number of sub-XIDs per transaction, you're going to need a fallback path of some sort. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cheaper snapshots redux
On Thu, Aug 25, 2011 at 10:19 AM, Markus Wanner wrote: > Note, however, that for imessages, I've also had the policy in place > that a backend *must* consume its message before sending any. And that > I took great care for all receivers to consume their messages as early > as possible. None the less, I kept incrementing the buffer size (to > multiple megabytes) to make this work. Maybe I'm overcautious because > of that experience. What's a typical message size for imessages? >> - a handful of XIDs at most - because, on the average, transactions >> are going to commit in *approximately* increasing XID order > > This assumption quickly turns false if you happen to have just one > long-running transaction, I think. Or in general, if transaction > duration varies a lot. Well, one long-running transaction that only has a single XID is not really a problem: the snapshot is still small. But one very old transaction that also happens to have a large number of subtransactions all of which have XIDs assigned might be a good way to stress the system. >> So the backend taking a snapshot only needs >> to be able to copy < ~64 bytes of information from the ring buffer >> before other backends write ~27k of data into that buffer, likely >> requiring hundreds of other commits. > > You said earlier that "only the latest snapshot" is required. It takes > only a single commit for such a snapshot to not be the latest anymore. > > Instead, if you keep around older snapshots for some time - as your > description here implies - readers are free to copy from those older > snapshots while other backends are able to make progress concurrently > (writers or readers of other snapshots). > > However, that either requires keeping track of readers of a certain > snapshot (reference counting) or - as I understand your description - > you simply invalidate all concurrent readers upon wrap-around, or something. Each reader decides which data he needs to copy from the buffer, and then copies it, and then checks whether any of it got overwritten before the copy was completed. So there's a lively possibility that the snapshot that was current when the reader began copying it will no longer be current by the time he finishes copying it, because a commit has intervened. That's OK: it just means that, effectively, the snapshot is taken at the moment the start and stop pointers are read, and won't take into account any commits that happen later, which is exactly what a snapshot is supposed to do anyway. There is a hopefully quite small possibility that by the time the reader finishes copying it so much new data will have been written to the buffer that it will have wrapped around and clobbered the portion the reader was interested in. That needs to be rare. >> Now, as the size of the snapshot gets bigger, things will eventually >> become less good. > > Also keep configurations with increased max_connections in mind. With > that, not only do the snapshots get bigger, but more processes have to > share CPU time, on average making memcpy slower for a single process. Right. I'm imagining making the default buffer size proportional to max_connections. >> Of course even if wraparound turns out not to be a problem there are >> other things that could scuttle this whole approach, but I think the >> idea has enough potential to be worth testing. If the whole thing >> crashes and burns I hope I'll at least learn enough along the way to >> design something better... > > That's always a good motivation. In that sense: happy hacking! 
Thanks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cheaper snapshots redux
Robert, On 08/25/2011 03:24 PM, Robert Haas wrote: > My hope (and it might turn out that I'm an optimist) is that even with > a reasonably small buffer it will be very rare for a backend to > experience a wraparound condition. It certainly seems less likely than with the ring-buffer for imessages, yes. Note, however, that for imessages, I've also had the policy in place that a backend *must* consume its message before sending any. And that I took great care for all receivers to consume their messages as early as possible. None the less, I kept incrementing the buffer size (to multiple megabytes) to make this work. Maybe I'm overcautious because of that experience. > - a handful of XIDs at most - because, on the average, transactions > are going to commit in *approximately* increasing XID order This assumption quickly turns false if you happen to have just one long-running transaction, I think. Or in general, if transaction duration varies a lot. > So the backend taking a snapshot only needs > to be able to copy < ~64 bytes of information from the ring buffer > before other backends write ~27k of data into that buffer, likely > requiring hundreds of other commits. You said earlier that "only the latest snapshot" is required. It takes only a single commit for such a snapshot to not be the latest anymore. Instead, if you keep around older snapshots for some time - as your description here implies - readers are free to copy from those older snapshots while other backends are able to make progress concurrently (writers or readers of other snapshots). However, that either requires keeping track of readers of a certain snapshot (reference counting) or - as I understand your description - you simply invalidate all concurrent readers upon wrap-around, or something. > That seems vanishingly unlikely; Agreed. > Now, as the size of the snapshot gets bigger, things will eventually > become less good. Also keep configurations with increased max_connections in mind. With that, not only do the snapshots get bigger, but more processes have to share CPU time, on average making memcpy slower for a single process. > Of course even if wraparound turns out not to be a problem there are > other things that could scuttle this whole approach, but I think the > idea has enough potential to be worth testing. If the whole thing > crashes and burns I hope I'll at least learn enough along the way to > design something better... That's always a good motivation. In that sense: happy hacking! Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch to slightly improve clarity of a comment in postgresql.conf.sample
On Thu, Aug 25, 2011 at 9:30 AM, Dougal Sutherland wrote: > Sure, I like that more. OK, done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Short document fix
On Tue, Aug 23, 2011 at 4:53 PM, Hitoshi Harada wrote: > In the CREATE DOMAIN reference page of the current HEAD, it says > > --- > CREATE DOMAIN us_postal_code AS TEXT > CHECK( > VALUE ~ '^\\d{5}$' > OR VALUE ~ '^\\d{5}-\\d{4}$' > ); > --- > > but I believe it should conform to the standard string style now that the > default is standard_conforming_strings = on. I agree. Done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
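With standard_conforming_strings = on, the backslashes are no longer doubled, so the corrected example reads:

CREATE DOMAIN us_postal_code AS TEXT
CHECK(
   VALUE ~ '^\d{5}$'
   OR VALUE ~ '^\d{5}-\d{4}$'
);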
Re: [HACKERS] patch to slightly improve clarity of a comment in postgresql.conf.sample
On Wed, Aug 24, 2011 at 7:33 AM, Dougal Sutherland wrote: > The attached change to postgresql.conf.sample makes it more clear at a > glance that the default value of listen_addresses is 'localhost', not > 'localhost, *'. This would have saved a friend an hour or two of fiddling > tonight. How about: # defaults to 'localhost'; use '*' for all -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
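The resulting sample-file entry would then read roughly as follows (a sketch; the exact committed wording may differ):

#listen_addresses = 'localhost'		# what IP address(es) to listen on;
					# comma-separated list of addresses;
					# defaults to 'localhost'; use '*' for all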
Re: [HACKERS] cheaper snapshots redux
On Thu, Aug 25, 2011 at 1:55 AM, Markus Wanner wrote: >> One difference with snapshots is that only the latest snapshot is of >> any interest. > > Theoretically, yes. But as far as I understood, you proposed the > backends copy that snapshot to local memory. And copying takes some > amount of time, possibly being interrupted by other backends which add > newer snapshots... Or do you envision the copying to restart whenever a > new snapshot arrives? My hope (and it might turn out that I'm an optimist) is that even with a reasonably small buffer it will be very rare for a backend to experience a wraparound condition. For example, consider a buffer with ~6500 entries, approximately 64 * MaxBackends, the approximate size of the current subxip arrays taken in aggregate. I hypothesize that a typical snapshot on a running system is going to be very small - a handful of XIDs at most - because, on the average, transactions are going to commit in *approximately* increasing XID order and, if you take the regression tests as representative of a real workload, only a small fraction of transactions will have more than one XID. So it seems believable to think that the typical snapshot on a machine with max_connections=100 might only be ~10 XIDs, even if none of the backends are read-only. So the backend taking a snapshot only needs to be able to copy < ~64 bytes of information from the ring buffer before other backends write ~27k of data into that buffer, likely requiring hundreds of other commits. That seems vanishingly unlikely; memcpy() is very fast. If it does happen, you can recover by retrying, but it should be a once-in-a-blue-moon kind of thing. I hope. Now, as the size of the snapshot gets bigger, things will eventually become less good. For example if you had a snapshot with 6000 XIDs in it then every commit would need to write over the previous snapshot and things would quickly deteriorate. But you can cope with that situation using the same mechanism we already use to handle big snapshots: toss out all the subtransaction IDs, mark the snapshot as overflowed, and just keep the toplevel XIDs. Now you've got at most ~100 XIDs to worry about, so you're back in the safety zone. That's not ideal in the sense that you will cause more pg_subtrans lookups, but that's the price you pay for having a gazillion subtransactions floating around, and any system is going to have to fall back on some sort of mitigation strategy at some point. There's no useful limit on the number of subxids a transaction can have, so unless you're prepared to throw an unbounded amount of memory at the problem you're going to eventually have to punt. It seems to me that the problem case is when you are just on the edge. Say you have 1400 XIDs in the snapshot. If you compact the snapshot down to toplevel XIDs, most of those will go away and you won't have to worry about wraparound - but you will pay a performance penalty in pg_subtrans lookups. On the other hand, if you don't compact the snapshot, it's not that hard to imagine a wraparound occurring - four snapshot rewrites could wrap the buffer. You would still hope that memcpy() could finish in time, but if you're rewriting 1400 XIDs with any regularity, it might not take that many commits to throw a spanner into the works. If the system is badly overloaded and the backend trying to take a snapshot gets descheduled for a long time at just the wrong moment, it doesn't seem hard to imagine a wraparound happening. Now, it's not hard to recover from a wraparound. 
In fact, we can pretty easily guarantee that any given attempt to take a snapshot will suffer a wraparound at most once. The writers (who are committing) have to be serialized anyway, so anyone who suffers a wraparound can just grab the same lock in shared mode and retry its snapshot. Now concurrency decreases significantly, because no one else is allowed to commit until that guy has got his snapshot, but right now that's true *every time* someone wants to take a snapshot, so falling back to that strategy occasionally doesn't seem prohibitively bad. However, you don't want it to happen very often, because even leaving aside the concurrency hit, it's double work: you have to try to take a snapshot, realize you've had a wraparound, and then retry. It seems pretty clear that with a big enough ring buffer the wraparound problem will become so infrequent as to be not worth worrying about. I'm theorizing that even with a quite small ring buffer the problem will still be infrequent enough not to worry about, but that might be optimistic. I think I'm going to need some kind of test case that generates very large, frequently changing snapshots. Of course even if wraparound turns out not to be a problem there are other things that could scuttle this whole approach, but I think the idea has enough potential to be worth testing. If the whole thing crashes and burns I hope I'll at least learn enough along the way to design something better...
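A minimal C sketch of the lock-free reader side described in this mail; every name here is invented, memory barriers and the writer are omitted, and the validation scheme is just one way to detect the wraparound case:

#include <stdint.h>
#include <stdbool.h>

#define RINGBUF_SIZE 65536		/* illustrative; e.g. ~64 * MaxBackends XIDs */

/* Writers append under their existing commit lock and advance the
 * monotonically increasing counters; readers take no lock at all. */
typedef struct
{
	volatile uint64_t written;		/* total bytes ever appended */
	volatile uint64_t snap_start;	/* offset of the latest snapshot */
	volatile uint64_t snap_len;		/* its length in bytes */
	char		data[RINGBUF_SIZE];
} SnapshotRing;

/*
 * Copy the latest snapshot out of the ring. Returns false if writers
 * lapped us mid-copy, in which case the caller retries - eventually
 * under the writers' lock, as described above.
 */
static bool
copy_latest_snapshot(SnapshotRing *ring, char *dest, uint64_t *len)
{
	uint64_t	start = ring->snap_start;
	uint64_t	n = ring->snap_len;
	uint64_t	i;

	for (i = 0; i < n; i++)
		dest[i] = ring->data[(start + i) % RINGBUF_SIZE];

	*len = n;

	/* If commits have advanced more than a full buffer past our start,
	 * part of what we copied was overwritten mid-copy: report failure. */
	return ring->written - start <= RINGBUF_SIZE;
}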
Re: [HACKERS] Inputting relative datetimes
On 25 August 2011 10:43, Vik Reykja wrote: > On Thu, Aug 25, 2011 at 11:39, Dean Rasheed > wrote: >> >> My first thought was to have some general way of adding or subtracting >> an interval at the end of an input timestamp, eg. by adding another >> couple of special values - "plus <interval>" and "minus <interval>". >> This would allow things like: >> >> TIMESTAMPTZ 'today minus 5 days' >> TIMESTAMPTZ 'now plus 2 hours' > > Funny you should mention intervals... > > timestamptz 'today' - interval '5 days' > timestamptz 'now' + interval '2 hours' > Yes, but what I am trying to achieve is a way of entering such relative timestamps using a single input value, so that absolute and relative timestamps can both be bound to a SQL query using just one variable. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
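Pending any native support, the single-value requirement can be met application-side. Here is a hedged PL/pgSQL sketch; the function name and the "plus"/"minus" grammar are invented for illustration:

CREATE FUNCTION relative_timestamptz(input text) RETURNS timestamptz
LANGUAGE plpgsql STABLE AS $$
DECLARE
    m text[];
BEGIN
    -- Split a trailing "plus <interval>" or "minus <interval>" off the
    -- timestamp part; fall back to a plain cast when neither appears.
    m := regexp_matches(input, '^(.*?) (plus|minus) (.*)$');
    IF m IS NULL THEN
        RETURN input::timestamptz;
    ELSIF m[2] = 'plus' THEN
        RETURN m[1]::timestamptz + m[3]::interval;
    ELSE
        RETURN m[1]::timestamptz - m[3]::interval;
    END IF;
END;
$$;

-- e.g. SELECT relative_timestamptz('today minus 5 days');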
Re: [HACKERS] Inputting relative datetimes
On Thu, Aug 25, 2011 at 11:39, Dean Rasheed wrote: > My first thought was to have some general way of adding or subtracting > an interval at the end of an input timestamp, eg. by adding another > couple of special values - "plus <interval>" and "minus <interval>". > This would allow things like: > > TIMESTAMPTZ 'today minus 5 days' > TIMESTAMPTZ 'now plus 2 hours' > Funny you should mention intervals... timestamptz 'today' - interval '5 days' timestamptz 'now' + interval '2 hours'
[HACKERS] Inputting relative datetimes
As background, I have an app that accepts user text input and casts it to a timestamp in order to produce reports. I use PostgreSQL's timestamp input conversion for this, since it gives a lot of flexibility, and can parse pretty much anything the users throw at it. It is also handy that it recognizes special case values like "now", "today", "tomorrow" and "yesterday". However, I can't see any way of entering more general relative timestamps like "5 days ago" or "2 hours from now". Obviously I can enhance my app by writing my own input function to support relative timestamps, but I wonder if this is something that would be more generally useful if PostgreSQL supported it natively. If so, what should the syntax be? My first thought was to have some general way of adding or subtracting an interval at the end of an input timestamp, eg. by adding another couple of special values - "plus <interval>" and "minus <interval>". This would allow things like: TIMESTAMPTZ 'today minus 5 days' TIMESTAMPTZ 'now plus 2 hours' It seems a bit clunky to have to spell out "plus" and "minus", but I think that using the symbols + and - would be impossible to parse because of the ambiguity with timezones. Thoughts? Better ideas? Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers