Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-28 Thread Kyotaro HORIGUCHI
This is the new version of the patch.
It is not rebased onto HEAD because of a build error.

 It's better to restore old two-path error handling.

I restored the OOM and save-result routes. But it seems necessary to
get back some amount of memory on REAL OOM, as the comment in the
original code says.

So I restored the meaning of rp == 0 && errMsg == NULL as REAL
OOM, which throws the async result away; the result is preserved
if errMsg is not NULL. The `unknown error' message has been
removed.

As a result, if the row processor returns 0 the parser skips to the
end of the rows and returns either the working result or an error
result, according to whether errMsg was set in the row processor.
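
A hedged sketch of a row processor under this convention (the signature
matches pqAddRow in the patch; store_row() is a hypothetical helper and
the PQresultSetErrMsg signature is assumed from the exports list):

/*
 * Returning 0 with a message set makes the parser skip the remaining
 * rows and return an error result carrying the message.  Returning 0
 * with no message set signals REAL OOM, and the async result itself
 * is thrown away.
 */
static int
my_row_processor(PGresult *res, PGrowValue *columns, void *param)
{
	if (!store_row(param, columns))		/* hypothetical storage helper */
	{
		PQresultSetErrMsg(res, "out of memory for row storage");
		return 0;
	}
	return 1;							/* row consumed successfully */
}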


 I don't think that should be required.  Just use a dummy msg.

Considering the above, pqAddRow is also restored to leave errMsg
NULL on OOM.

 There is still one EOF in v3 getAnotherTuple() -
 pqGetInt(tupnfields), please turn that one also to
 protocolerror.

pqGetInt() returns EOF only when it needs additional data from the
network, provided the parameter `bytes' is appropriate. A non-zero
return from it should therefore be handled as EOF, not as a protocol
error. The one point I had modified wrongly is also restored. The
so-called 'protocol error' has vanished entirely.
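
As a sketch, the remaining handling in getAnotherTuple() is then just
this (illustrative fragment; pqGetInt(int *, size_t, PGconn *) is the
existing fe-misc.c helper):

	int			tupnfields;		/* # of fields in the incoming row */

	/* non-zero merely means not enough bytes are buffered yet */
	if (pqGetInt(&tupnfields, 2, conn))
		return EOF;				/* suspend until more data arrives */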

 Instead use ("%s", errmsg) as argument there.  libpq code
 is noisy enough, no need to add more.

done

Is there something left undone?


By the way, I noticed that dblink always says that the current
connection is 'unnamed' in the error messages from
dblink_record_internal@dblink.  I could see that
dblink_record_internal defines the local variable conname = NULL
and passes it to dblink_res_error to display the error message, but
there is no assignment to it anywhere in the function.

It seemed to be shown properly when I added code to set conname
from PG_GETARG_TEXT_PP(0) if available, in other words doing that
just after the DBLINK_GET_CONN/DBLINK_GET_NAMED_CONN calls, which
seems to be dblink's convention...  This is not included in this patch.
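
A sketch of that change (illustrative, following how other dblink
functions read their connection-name argument; this is not part of the
attached patch):

	/* let dblink_res_error report the real connection name */
	if (PG_NARGS() > 0 && !PG_ARGISNULL(0))
		conname = text_to_cstring(PG_GETARG_TEXT_PP(0));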

Furthermore, dblink_res_error looks only at the returned PGresult
to build the message, and so always says only `Error occurred on
dblink connection..: could not execute query'.

Is it right to understand this as follows?

 - dblink is wrong in its error handling. A client of libpq should
   look at the PGconn via PQerrorMessage() if (or regardless of
   whether?) PGresult says nothing about the error. See the sketch
   below.
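
For illustration, a hedged sketch of that fallback (standard libpq
calls; the surrounding code is hypothetical):

	const char *msg = PQresultErrorMessage(res);

	/* fall back to the connection-level message, e.g. on libpq OOM */
	if (msg == NULL || *msg == '\0')
		msg = PQerrorMessage(conn);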


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1af8df6..239edb8 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -160,3 +160,7 @@ PQconnectStartParams  157
 PQping158
 PQpingParams  159
 PQlibVersion  160
+PQsetRowProcessor	  161
+PQgetRowProcessor	  162
+PQresultSetErrMsg	  163
+PQskipResult		  164
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 27a9805..4605e49 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2693,6 +2693,9 @@ makeEmptyPGconn(void)
 	conn->wait_ssl_try = false;
 #endif
 
+	/* set default row processor */
+	PQsetRowProcessor(conn, NULL, NULL);
+
 	/*
 	 * We try to send at least 8K at a time, which is the usual size of pipe
 	 * buffers on Unix systems.  That way, when we are sending a large amount
@@ -2711,8 +2714,13 @@ makeEmptyPGconn(void)
 	initPQExpBuffer(conn->errorMessage);
 	initPQExpBuffer(conn->workBuffer);
 
+	/* set up initial row buffer */
+	conn->rowBufLen = 32;
+	conn->rowBuf = (PGrowValue *)malloc(conn->rowBufLen * sizeof(PGrowValue));
+
 	if (conn->inBuffer == NULL ||
 		conn->outBuffer == NULL ||
+		conn->rowBuf == NULL ||
 		PQExpBufferBroken(conn->errorMessage) ||
 		PQExpBufferBroken(conn->workBuffer))
 	{
@@ -2814,6 +2822,8 @@ freePGconn(PGconn *conn)
 		free(conn->inBuffer);
 	if (conn->outBuffer)
 		free(conn->outBuffer);
+	if (conn->rowBuf)
+		free(conn->rowBuf);
 	termPQExpBuffer(conn->errorMessage);
 	termPQExpBuffer(conn->workBuffer);
 
@@ -5078,3 +5088,4 @@ PQregisterThreadLock(pgthreadlock_t newhandler)
 
 	return prev;
 }
+
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b743566..ce58778 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -66,6 +66,7 @@ static PGresult *PQexecFinish(PGconn *conn);
 static int PQsendDescribe(PGconn *conn, char desc_type,
 			   const char *desc_target);
 static int	check_field_number(const PGresult *res, int field_num);
+static int	pqAddRow(PGresult *res, PGrowValue *columns, void *param);
 
 
 /* 
@@ -701,7 +702,6 @@ pqClearAsyncResult(PGconn *conn)
 	if (conn->result)
 		PQclear(conn->result);
 	conn->result = NULL;
-	conn->curTuple = NULL;
 }
 
 /*
@@ -756,7 +756,6 @@ pqPrepareAsyncResult(PGconn *conn)
 	 */
 	res = conn->result;
 	conn->result = NULL;		/* handing over ownership to caller */
-	conn->curTuple = NULL;		/* just in case */
 	if (!res)
 		res = PQmakeEmptyPGresult(conn, 

Re: [HACKERS] pg_basebackup -x stream from the standby gets stuck

2012-02-28 Thread Fujii Masao
On Thu, Feb 23, 2012 at 1:02 AM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Feb 7, 2012 at 12:30, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 http://www.depesz.com/2012/02/03/waiting-for-9-2-pg_basebackup-from-slave/
 =$ time pg_basebackup -D /home/pgdba/slave2/ -F p -x stream -c fast -P -v 
 -h 127.0.0.1 -p 5921 -U replication
 xlog start point: 2/AC4E2600
 pg_basebackup: starting background WAL receiver
 692447/692447 kB (100%), 1/1 tablespace
 xlog end point: 2/AC4E2600
 pg_basebackup: waiting for background process to finish streaming...
 pg_basebackup: base backup completed

 real    3m56.237s
 user    0m0.224s
 sys     0m0.936s

 (time is long because this is only test database with no traffic, so I had 
 to make some inserts for it to finish)

 The above article points out the problem of pg_basebackup from the standby:
 when -x stream is specified, pg_basebackup from the standby gets stuck if
 there is no traffic in the database.

 When -x stream is specified, pg_basebackup forks the background process
 for receiving WAL records during backup, takes an online backup and waits for
 the background process to end. The forked background process keeps receiving
 WAL records, and whenever it reaches end of WAL file, it checks whether it has
 already received all WAL files required for the backup, and exits if yes.
 Which means that at least one WAL segment switch is required for pg_basebackup
 with -x stream option to end.

 In the backup from the master, WAL file switch always occurs at both start
 and end of backup (i.e., in do_pg_start_backup() and do_pg_stop_backup()),
 so the above logic works fine even if there is no traffic. OTOH, in the
 backup from the standby, while there is no traffic, WAL file switch is not
 performed at all. So in that case, there is no chance for the background
 process to reach end of WAL file, check whether all required WAL has
 arrived, and exit. In the end, pg_basebackup gets stuck.

 To fix the problem, I'd propose to change the background process so that it
 checks whether all required WAL has arrived every time data is received,
 even if end of WAL file is not reached. Patch attached. Comments?

 This seems like a good thing in general.

 Why does it need to modify pg_receivexlog, though? I thought only
 pg_basebackup had this issue?

 I guess it is because of the change of the API to
 stream_continue_callback only?

Yes, that's the reason why I changed continue_streaming() in pg_receivexlog.c.

But the reason why I changed segment_callback() in pg_receivexlog.c is not
the same. I did that because previously segment_finish_callback was called
only at the end of a WAL segment, but with the patch it can be called in
the middle of a segment. OTOH, segment_callback() must emit a verbose
message only when the current WAL segment is complete. So I had to add a
check of whether the current WAL segment is partial or complete into
segment_callback().
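
A hedged sketch of that check (names and signature are illustrative,
not the actual pg_receivexlog.c code):

static bool
segment_callback(XLogRecPtr segendpos, uint32 timeline, bool segment_complete)
{
	/* with the patch this can fire mid-segment, so test completeness */
	if (verbose && segment_complete)
		fprintf(stderr, "finished segment at %X/%X (timeline %u)\n",
				segendpos.xlogid, segendpos.xrecoff, timeline);
	return false;				/* false = keep streaming */
}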

 Looking at it after your patch,
 stream_continue_callback and segment_finish_callback are the same.
 Should we perhaps just fold them into a single
 stream_continue_callback? Since you had to move the detect segment
 end to the caller anyway?

No. I think we cannot do that because in pg_receivexlog they are not the same.

 Another question related to this - since we clearly don't need the
 xlog switch in this case, should we make it conditional on the master
 as well, so we don't switch unnecessarily there as well?

Maybe. At the end of backup, we force a WAL segment switch to ensure that
all required WAL files have been archived. So theoretically, if WAL
archiving is not enabled, we can skip the WAL segment switch. But some
backup tools might depend on this behavior.

In a standby-only backup, we always skip the WAL segment switch. So there
is no guarantee that all WAL files required for the backup are archived at
the end of the backup. This limitation is documented.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW system columns

2012-02-28 Thread Shigeru Hanada
(2012/02/27 12:35), Robert Haas wrote:
 On Sat, Feb 25, 2012 at 3:56 PM, Thom Brown t...@linux.com  wrote:
 If there seems to be a consensus on removing system column from foreign
 tables, I'd like to work on this issue.  Attached is a halfway patch,
 and ISTM there is no problem so far.


 I can say that at least PgAdmin doesn't use these columns.

 So we still have all of these columns for foreign tables.  I've tested
 Hanada-san's patch and it removes all of the system columns.  Could we
 consider applying it, or has a use-case for them since been
 discovered?
 
 Not to my knowledge, but Hanada-san described his patch as a halfway
 patch, implying that it wasn't done.

Sorry for the long absence.

I've used the word halfway because I didn't have enough time to
examine that patch at that time.  I tested the patch, and now I think
it's OK to apply.  One concern is that there is no mention of the
unavailable system columns anywhere in the documentation.  ddl.sgml has
the main description of system columns, but it just says:

<quote>
Every table has several system columns that are implicitly defined by
the system.
</quote>

Since this doesn't mention specific relation types, such as VIEW and
COMPOSITE TYPE, IMO we can leave this paragraph as is.

BTW, I still think that tableoid is useful if foreign tables can inherit
other tables.  With such a feature, the tableoid of a foreign table is
necessary to determine the actual source table.  Once we want to support
that feature, IMO we should revive the tableoid system column for foreign
tables.

-- 
Shigeru Hanada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] incorrect handling of the timeout in pg_receivexlog

2012-02-28 Thread Fujii Masao
On Wed, Feb 8, 2012 at 1:33 AM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Feb 7, 2012 at 17:29, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 On 07.02.2012 16:55, Tom Lane wrote:

 (The integer vs float TimestampTz issue is a kind of portability
 problem, but we've already bought into the assumption that sender and
 receiver must be built with the same choice, no?)


 Hmm, true. In hindsight, I think that was a bad choice, but it's a bit late
 to change that. pg_basebackup doesn't otherwise care about the integer/float
 timestamps, but it does send a timestamp back to the server. You won't be
 able to actually start up the database if the config options don't match,
 but I think it would be good if pg_basebackup still worked across platforms
 and versions. For example, you might have a central backup server that calls
 pg_basebackup on several database servers, running on different platforms.

 In 9.0, the only field in the protocol that depends on timestamp format is
 WalDataMessageHeader-sendTime. That goes from server to client, and
 pg_basebackup/pg_receivexlog don't care about that. In 9.1 we introduced
 StandbyReplyMessage-sendTime, which is sent from client to server, but
 looking at the code it looks like the server doesn't use it for anything. In
 9.2, we added WalSndrMessage-sendTime, which is used by a standby server to
 calculate how far behind the standby is.

 I'm tempted to just change all of those TimestampTz fields to something
 that's independent of integer/float timestamp setting, in 9.2. At a quick
 glance, it seems that it wouldn't break anything.

Agreed. If we don't push such a change into 9.2, we will break
something later.
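
For what it's worth, a hedged sketch of one format-independent
encoding (always integer microseconds on the wire; the function name is
illustrative):

static int64
timestamptz_to_wire(TimestampTz ts)
{
#ifdef HAVE_INT64_TIMESTAMP
	return ts;							/* already int64 microseconds */
#else
	return (int64) (ts * 1000000.0);	/* float seconds to microseconds */
#endif
}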

 In general, I think that would work. Since we can't replicate across
 versions anyway.

 Will it break using pg_basebackup 9.2 on a 9.1 server, though? that
 would also be very useful in the scenario of the central server...

No, unless I'm missing something, because pg_basebackup doesn't use
any message defined in walprotocol.h if the -x stream option is not
specified.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW system columns

2012-02-28 Thread Thom Brown
2012/2/28 Shigeru Hanada shigeru.han...@gmail.com:
 (2012/02/27 12:35), Robert Haas wrote:
 On Sat, Feb 25, 2012 at 3:56 PM, Thom Brown t...@linux.com  wrote:
 If there seems to be a consensus on removing system column from foreign
 tables, I'd like to work on this issue.  Attached is a halfway patch,
 and ISTM there is no problem so far.


 I can say that at least PgAdmin doesn't use these columns.

 So we still have all of these columns for foreign tables.  I've tested
 Hanada-san's patch and it removes all of the system columns.  Could we
 consider applying it, or has a use-case for them since been
 discovered?

 Not to my knowledge, but Hanada-san described his patch as a halfway
 patch, implying that it wasn't done.

 Sorry for long absence.

 I've used the word halfway because I didn't have enough time to
 examine that patch at that time.  I tested the patch, and now I think
 it's OK to apply.  One concern is that there is no mention about
 unavailable system columns in any document.  ddl.sgml has main
 description of system columns, but it just says:

 <quote>
 Every table has several system columns that are implicitly defined by
 the system.
 </quote>

 Since this doesn't mention detailed type of relation, such as VIEW and
 COMPOSITE TYPE, IMO we can leave this paragraph as is.

 BTW, I still think that tableoid is useful if foreign tables can inherit
 other tables.  With such feature, tableoid of foreign table is necessary
 to determine actual source table.  Once we want to support that feature,
 IMO we should revive tableoid system column for foreign tables.

I'm not familiar with foreign table inheritance, or how it would work.
If that's something that will likely be introduced in future, then
surely we'd want to keep the tableoid column rather than removing it
now and re-introducing it later?

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Documentation: remove confusing paragraph about backslash escaping

2012-02-28 Thread Hannes Frederic Sowa

On 02/28/2012 12:10 AM, Tom Lane wrote:

I suggest replacing the first and third cases with something along the
lines of

Note: if you have standard_conforming_strings turned off, any
backslashes you write in literal string constants will need to be
doubled.  See Section 4.1.2.1 for more information.

The second case is just a parenthetical comment and perhaps could be
removed.


Definitely OK by me. Thanks for looking into this!

--
Hannes Sowa   hs...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Finer Extension dependencies

2012-02-28 Thread Hitoshi Harada
On Fri, Feb 24, 2012 at 2:09 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Hitoshi Harada umi.tan...@gmail.com writes:
 I confirmed DROP EXTENSION is fixed now.  In turn, it seems to me
 that requires doesn't work.  My test ext2.control looks like:

 I'm very sorry about that. It's all about playing with pg_depend and
 I've failed to spend enough time on that very topic to send a patch that
 just works, it seems.

 I'm going to fix that over the week-end.  Thanks for your reviewing so
 far.

Quickly reviewed the patch and found some issues.

- There is some mixture of pg_extension_feature and pg_extension_features.
- The doc says pg_extension_features has four columns, but that's not true.
- Line 608 is bad. In the loop, provides_itself is repeatedly changed
to true and false, and I guess that's not what you meant.
- Line 854+: you can fold the two blocks into one.  The two blocks are
similar, and by giving the provides list with list_make1 when
control->provides == NIL you can do it in one block.
- s/trak/track/
- Line 960: you missed updating classId for the dependency.

That's pretty much it from me.  I just looked at the patch and have no
idea about the grand architecture.  Marking Waiting on Author.

Thanks,
-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-02-28 Thread Thom Brown
On 27 February 2012 19:37, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Thom Brown t...@linux.com writes:
 CREATE COMMAND TRIGGER test_cmd_trg
 BEFORE CREATE SCHEMA,
   CREATE OPERATOR,
   CREATE COLLATION,
   CREATE CAST
 EXECUTE PROCEDURE my_func();

 I couldn't drop it completely unless I specified all of those commands.  Why?

 Because I couldn't find a nice enough way to implement that given the
 shared infrastructure Robert and KaiGai put into place to handle the
 dropping of objects.  It could be that I didn't look hard enough; I'll
 be happy to get that feature back.

Well the problem is that you can add commands to a trigger en masse,
but you can only remove them one at a time.  Couldn't we at least
allow the removal of multiple commands at the same time?  The docs you
wrote suggest you can do this, but you can't.

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW system columns

2012-02-28 Thread Shigeru Hanada
(2012/02/28 18:08), Thom Brown wrote:
   If that's something that will likely be introduced in future, then
 surely we'd want to keep the tableoid column rather than removing it
 then re-introducing it later?

As background knowledge, currently (9.1 and 9.2dev) foreign tables have
all system columns, but they return meaningless values except for tableoid.
For instance, a foreign table pgbench_branches with 3 rows will return
results like below (bid, the rightmost column, is a user column):

postgres=# select ctid, xmin, cmin, xmax, cmax, tableoid,
postgres-# bid from pgbench_branches;
      ctid      | xmin | cmin | xmax | cmax | tableoid | bid
----------------+------+------+------+------+----------+-----
 (4294967295,0) |    0 |    0 |    0 |    0 |    16400 |   2
 (4294967295,0) |    0 |    0 |    0 |    0 |    16400 |   3
 (4294967295,0) |    0 |    0 |    0 |    0 |    16400 |   1
(3 rows)

In this example, 16400 is the correct oid of the pg_class record for the
relation pgbench_branches.

I don't have any idea how to use system columns other than tableoid of
foreign tables, because it seems difficult to define a common meaning for
them across various FDWs.  One possible idea about the ctid column is
using it as a virtual tuple id (location information of the remote data)
for update support, if the FDW can pack location information into the
ItemPointerData area.
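
For example, a hedged sketch of such packing (hypothetical FDW code
squeezing a 48-bit remote row id into the block and offset fields; the
resulting tid is opaque to the core code):

#include "postgres.h"
#include "storage/itemptr.h"

static void
pack_remote_rowid(ItemPointer tid, uint64 rowid)
{
	/* 32 bits into the block id, low 16 bits into the offset */
	BlockIdSet(&tid->ip_blkid, (BlockNumber) (rowid >> 16));
	tid->ip_posid = (OffsetNumber) (rowid & 0xFFFF);
}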

We have three options:

a) remove all system columns (posted patch)
b) remove system columns other than tableoid
c) leave all system columns as is (current 9.2dev)

Incidentally, views, which are a very similar object type to foreign
tables, have no system columns.

Thoughts?

-- 
Shigeru Hanada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-02-28 Thread Thom Brown
On 28 February 2012 11:43, Thom Brown t...@linux.com wrote:
 On 27 February 2012 19:37, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Thom Brown t...@linux.com writes:
 CREATE COMMAND TRIGGER test_cmd_trg
 BEFORE CREATE SCHEMA,
   CREATE OPERATOR,
   CREATE COLLATION,
   CREATE CAST
 EXECUTE PROCEDURE my_func();

  I couldn't drop it completely unless I specified all of those commands.  Why?

 Because I couldn't find a nice enough way to implement that given the
 shared infrastructure Robert and KaiGai put into place to handle the
 dropping of objects.  It could be that I didn't look hard enough; I'll
 be happy to get that feature back.

 Well the problem is that you can add commands to a trigger en masse,
 but you can only remove them one at a time.  Couldn't we at least
 allow the removal of multiple commands at the same time?  The docs you
 wrote suggest you can do this, but you can't.

Also note that as of commit 66f0cf7da8eeaeca4b9894bfafd61789b514af4a
(Remove useless const qualifier) the patch no longer applies due to
changes to src/backend/commands/typecmds.c.

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW system columns

2012-02-28 Thread Robert Haas
On Tue, Feb 28, 2012 at 7:00 AM, Shigeru Hanada
shigeru.han...@gmail.com wrote:
 We have three options:

 a) remove all system columns (posted patch)
 b) remove system columns other than tableoid
 c) leave all system columns as is (current 9.2dev)

 Incidentally, views, which are a very similar object type to foreign
 tables, have no system columns.

 Thoughts?

I vote against (c).  I'm not sure which of (a) or (b) is better.
We've talked about allowing foreign tables to inherit from regular
tables and vice versa, and certainly, in that situation, tableoid
would be useful.  But I don't think we've made a definitive decision
about that.  I stripped that functionality out of the original patch
because it introduced a bunch of warts that we didn't have time to
figure out how to fix, and it's not clear to me that anyone's spent
any time thinking about that since then.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW system columns

2012-02-28 Thread Kohei KaiGai
On 28 February 2012 12:00, Shigeru Hanada shigeru.han...@gmail.com wrote:
 (2012/02/28 18:08), Thom Brown wrote:
   If that's something that will likely be introduced in future, then
 surely we'd want to keep the tableoid column rather than removing it
 then re-introducing it later?

 As background knowledge, currently (9.1 and 9.2dev) foreign tables have
 all system columns, but they return meaningless values except for tableoid.
 For instance, a foreign table pgbench_branches with 3 rows will return
 results like below (bid, the rightmost column, is a user column):

 postgres=# select ctid, xmin, cmin, xmax, cmax, tableoid,
 postgres-# bid from pgbench_branches;
       ctid      | xmin | cmin | xmax | cmax | tableoid | bid
 ----------------+------+------+------+------+----------+-----
  (4294967295,0) |    0 |    0 |    0 |    0 |    16400 |   2
  (4294967295,0) |    0 |    0 |    0 |    0 |    16400 |   3
  (4294967295,0) |    0 |    0 |    0 |    0 |    16400 |   1
 (3 rows)

 In this example, 16400 is the correct oid of the pg_class record for the
 relation pgbench_branches.

 I don't have any idea how to use system columns other than tableoid of
 foreign tables, because it seems difficult to define a common meaning for
 them across various FDWs.  One possible idea about the ctid column is
 using it as a virtual tuple id (location information of the remote data)
 for update support, if the FDW can pack location information into the
 ItemPointerData area.

 We have three options:

 a) remove all system columns (posted patch)
 b) remove system columns other than tableoid
 c) leave all system columns as is (current 9.2dev)

 Incidentally, views, which are a very similar object type to foreign
 tables, have no system columns.

 Thoughts?

What is the expected behavior in the case where a foreign table
is constructed as a child table of a particular regular table?
In that case, would the child foreign tables lack columns
that exist on the parent table?
(Although it does not matter when a regular table is a child of
a foreign table...)

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-02-28 Thread Tom Lane
Thom Brown t...@linux.com writes:
 Well the problem is that you can add commands to a trigger en masse,
 but you can only remove them one at a time.  Couldn't we at least
 allow the removal of multiple commands at the same time?  The docs you
 wrote suggest you can do this, but you can't.

This seems over-complicated.  Triggers on tables do not have alterable
properties, why should command triggers?  I vote for

CREATE COMMAND TRIGGER name ... properties ...;

DROP COMMAND TRIGGER name;

full stop.  If you want to run the same trigger function on some more
commands, add another trigger name.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-02-28 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 This seems over-complicated.  Triggers on tables do not have
 alterable properties, why should command triggers?  I vote for
 
   CREATE COMMAND TRIGGER name ... properties ...;
 
   DROP COMMAND TRIGGER name;
 
 full stop.  If you want to run the same trigger function on some
 more commands, add another trigger name.
 
+1
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW system columns

2012-02-28 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Feb 28, 2012 at 7:00 AM, Shigeru Hanada
 shigeru.han...@gmail.com wrote:
 We have three options:
 
 a) remove all system columns (posted patch)
 b) remove system columns other than tableoid
 c) leave all system columns as is (current 9.2dev)
 
 Incidentally, views, which are a very similar object type to foreign
 tables, have no system columns.
 
 Thoughts?

 I vote against (c).  I'm not sure which of (a) or (b) is better.
 We've talked about allowing foreign tables to inherit from regular
 tables and vice versa, and certainly, in that situation, tableoid
 would be useful.

I think it is a mistake to imagine that tableoid is only useful in
inheritance contexts.  As one counterexample, pg_dump selects tableoid
from quite a lot of system catalogs, just as a convenient and uniform
way of remembering each object's type (of course, the fact that it needs
to match them up against pg_depend entries has something to do with
that).  More generally, if we exclude tableoid from foreign tables,
that just introduces an arbitrary behavioral difference between foreign
and regular tables, thus complicating any code that has use for the
feature.

So I believe that (a) is a pretty bad choice.  I would hold still for
(b) but I am not convinced that the case has been made for that either.
I think it would be wise to avoid introducing behavioral churn until
after we have designed and implemented update capabilities for foreign
tables.  If we end up putting back ctid to support that, we'll look
pretty silly.

In short, (c) looks like the most reasonable choice for now, with the
expectation of revisiting the question after we have foreign update
working.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-02-28 Thread Thom Brown
On 28 February 2012 15:03, Tom Lane t...@sss.pgh.pa.us wrote:
 Thom Brown t...@linux.com writes:
 Well the problem is that you can add commands to a trigger en masse,
 but you can only remove them one at a time.  Couldn't we at least
 allow the removal of multiple commands at the same time?  The docs you
 wrote suggest you can do this, but you can't.

 This seems over-complicated.  Triggers on tables do not have alterable
 properties, why should command triggers?  I vote for

        CREATE COMMAND TRIGGER name ... properties ...;

        DROP COMMAND TRIGGER name;

 full stop.  If you want to run the same trigger function on some more
 commands, add another trigger name.

This would make more sense, particularly since dropping a command
trigger, as it stands, doesn't necessarily drop the trigger.

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-02-28 Thread Robert Haas
On Fri, Feb 24, 2012 at 5:31 PM, Peter Eisentraut pete...@gmx.net wrote:
 Could we name this postgresql_fdw instead?  We already have several
 ${productname}_fdw out there, and I don't want to get in the business of
 having to guess variant spellings.

If you don't like variant spellings, having anything to do with
PostgreSQL, aka Postgres, and usually discussed on the pgsql-* mailing
lists, is probably a bad idea.

Go Postgre!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-02-28 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Feb 24, 2012 at 5:31 PM, Peter Eisentraut pete...@gmx.net wrote:
 Could we name this postgresql_fdw instead?  We already have several
 ${productname}_fdw out there, and I don't want to get in the business of
 having to guess variant spellings.

 If you don't like variant spellings, having anything to do with
 PostgreSQL, aka Postgres, and usually discussed on the pgsql-* mailing
 lists, is probably a bad idea.

[ snicker ]  But still, Peter has a point: pgsql is not a name for the
product, it's at best an abbreviation.  We aren't calling the other
thing orcl_fdw or ora_fdw.

I think either postgres_fdw or postgresql_fdw would be fine.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-02-28 Thread Robert Haas
On Tue, Feb 28, 2012 at 11:02 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Feb 24, 2012 at 5:31 PM, Peter Eisentraut pete...@gmx.net wrote:
 Could we name this postgresql_fdw instead?  We already have several
 ${productname}_fdw out there, and I don't want to get in the business of
 having to guess variant spellings.

 If you don't like variant spellings, having anything to do with
 PostgreSQL, aka Postgres, and usually discussed on the pgsql-* mailing
 lists, is probably a bad idea.

 [ snicker ]  But still, Peter has a point: pgsql is not a name for the
 product, it's at best an abbreviation.  We aren't calling the other
 thing orcl_fdw or ora_fdw.

 I think either postgres_fdw or postgresql_fdw would be fine.

I liked the shorter name, myself, but I'm not going to make a big deal about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade --logfile option documentation

2012-02-28 Thread Bruce Momjian
On Wed, Feb 22, 2012 at 03:37:29PM -0500, Bruce Momjian wrote:
 On Wed, Feb 22, 2012 at 05:22:29PM -0300, Alvaro Herrera wrote:
  Not sure about this.  If the upgrade completes successfully but the
  file is not renamed at the last minute due to some error, that would be
  a problem as well, because now the old cluster would happily run and
  perhaps corrupt the data files from under the new cluster.
 
 Well, the basic problem is that the user, before pg_upgrade started,
 installed a new cluster that works.  If we rename the old control file, but
 rename it back on failure, there are cases we will miss, like kill -9 or
 a server crash, and it will not be obvious to them that the control file
 was renamed.
 
 Of course, if we only rename on success, and there is kill -9 or server
 crash, the old cluster is still start-able, like the new one.
 
 One good argument for renaming early is that on a server crash, the
 system is probably going to restart the database automatically, and that
 means the old server.
 
 Right now we have a clear message that they need to rename the control
 file to start the old server.  Not sure what the new wording would look
 like --- let me try.

I have thought about this, and feel that it would be odd to lock the old
cluster at the start of the upgrade and then unlock it on a failure,
particularly because we can't always unlock it, e.g. after an operating
system crash.

A cleaner solution would be to lock it when we complete the upgrade,
which I have done in the attached patch.  I have also added a warning
about restarting the old server when link mode is used, and updated the
documentation to match the new behavior.
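
The locking step itself is just a rename of the old cluster's
pg_control; roughly (a sketch of the idea, not the exact patch code):

	char		old_path[MAXPGPATH];
	char		new_path[MAXPGPATH];

	snprintf(old_path, sizeof(old_path), "%s/global/pg_control",
			 old_cluster.pgdata);
	snprintf(new_path, sizeof(new_path), "%s.old", old_path);
	if (rename(old_path, new_path) != 0)
		pg_log(PG_FATAL, "Unable to rename %s to %s.\n", old_path, new_path);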

Patch attached.  I would like to apply this to 9.2/HEAD.

---

Performing Consistency Checks
-
Checking current, bin, and data directories                 ok
Checking cluster versions   ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok
Checking for reg* system OID user data typesok
Checking for contrib/isn with bigint-passing mismatch   ok
Creating catalog dump   ok
Checking for prepared transactions  ok
Checking for presence of required libraries ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
--
Analyzing all rows in the new cluster   ok
Freezing all rows on the new clusterok
Deleting new commit clogs   ok
Copying old commit clogs to new server  ok
Setting next transaction ID for new cluster ok
Resetting WAL archives  ok
Setting frozenxid counters in new cluster   ok
Creating databases in the new cluster   ok
Adding support functions to new cluster ok
Restoring database schema to new clusterok
Removing support functions from new cluster ok
Linking user relation files
ok
Setting next OID for new clusterok
Creating script to delete old cluster   ok
Adding .old suffix to old global/pg_control   ok

If you want to start the old cluster, you will need to remove
the .old suffix from /u/pgsql.old/data/global/pg_control.old.
Because link mode was used, the old cluster cannot be safely
started once the new cluster has been started.

Upgrade complete

Optimizer statistics are not transferred by pg_upgrade so
consider running:
vacuumdb --all --analyze-only
on the newly-upgraded cluster.

Running this script will delete the old cluster's data files:
/usr/local/pgdev/pg_upgrade/delete_old_cluster.sh

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index 891eb9a..a5f63eb
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** report_clusters_compatible(void)
*** 148,156 
  	}
  
 	pg_log(PG_REPORT, "\n"
! 		   "If pg_upgrade fails after this point, you must re-initdb the new cluster\n"
! 		   "before continuing.  You will also need to remove the \".old\" suffix from\n"
! 		   "%s/global/pg_control.old.\n", old_cluster.pgdata);
  }
  
  
--- 148,155 
  	}
  
 	pg_log(PG_REPORT, "\n"
! 		   "If pg_upgrade fails after this point, you must re-initdb the\n"
! 		   "new cluster before continuing.\n");
  }
  
  
*** output_completion_banner(char *deletion_
*** 198,205 
  	/* Did 

Re: [HACKERS] [v9.2] Add GUC sepgsql.client_label

2012-02-28 Thread Kohei KaiGai
2012/2/24 Yeb Havinga yebhavi...@gmail.com:
 On 2012-02-24 15:17, Yeb Havinga wrote:

 I don't know what's fishy about the mgrid user and root that causes
 c0.c1023 to be absent.


 more info:

 In shells started in an X environment under Xvnc, id -Z shows system_u
 and c0.c1023 absent.

 In shells started from the ssh daemon, the id -Z matches what it should be
 according to the seusers file: unconfined_u and c0.c1023 present.

That seems to be the reason why the security label on your bash is
different from the expected default one.
Unlike the sshd daemon, vncserver does not assign a security label to
itself according to /etc/selinux/targeted/seusers; thus it inherits the
label of the system startup script. That is also the reason why you saw
system_u at the head of the security context.

I'll report this topic to the selinux community to discuss a preferable
solution. Anyway, please use an ssh connection for testing purposes.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial 9.2 pgbench write results

2012-02-28 Thread Jeff Janes
On Thu, Feb 23, 2012 at 3:17 AM, Greg Smith g...@2ndquadrant.com wrote:

 I think an even bigger factor now is that the BGW writes can disturb write
 ordering/combining done at the kernel and storage levels.  It's painfully
 obvious now how much PostgreSQL relies on that to get good performance.  All
 sorts of things break badly if we aren't getting random writes scheduled to
 optimize seek times, in as many contexts as possible.  It doesn't seem
 unreasonable that background writer writes can introduce some delay into the
 checkpoint writes, just by adding more random components to what is already
 a difficult-to-handle write/sync series.  That's what I think these
 results are showing: background writer writes can deoptimize other
 forms of write.

How hard would it be to dummy up a bgwriter which, every time it wakes
up, it forks off a child process to actually do the write, and then
the real one just waits for the child to exit?  If it didn't have to
correctly handle signals, SINVAL, and such, it should be just a few
lines of code, but I don't know how much we can ignore signals and
such even just for testing purposes.  My thought here is that the
kernel is getting in a snit over one process doing all the writing on
the system, and is punishing that process in a way that ruins things
for everyone.
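
A hedged sketch of the shape of such a test (generic C, not actual
bgwriter code; signal and SINVAL handling are ignored, as suggested):

#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* run one wakeup's worth of writes in a throwaway child, so the kernel
 * cannot pin all of the I/O on the long-lived parent process */
static int
write_round_in_child(void (*do_writes) (void))
{
	pid_t		pid = fork();

	if (pid < 0)
		return -1;				/* fork failed */
	if (pid == 0)
	{
		do_writes();			/* the actual buffer writes */
		_exit(0);
	}
	while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
		;						/* retry if interrupted by a signal */
	return 0;
}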



 A second fact that's visible from the TPS graphs over the test run, and
 obvious if you think about it, is that BGW writes force data to physical
 disk earlier than they otherwise might go there.

On a busy system like you are testing, the BGW should only be writing
out data a fraction of a second before the backends would otherwise be
doing it, unless the two minutes to circle the buffer pool logic is in
control rather than the bgwriter_lru_multiplier and
bgwriter_lru_maxpages logic.  (From the data reported, we can see how
many buffer allocations there are, but not how many circuits of the pool
it took to find them.)

It doesn't seem likely that small shifts in timing are having that
effect, compared to the possible effect of who is doing the writing.
If timing truly is the issue, lowering bgwriter_delay might smooth
the timing out and bring it closer to what the backends would do for
themselves.

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Failover Scenario

2012-02-28 Thread Greg Smith

On 02/27/2012 10:05 PM, Lucky Haryadi wrote:

3. When Master A fails to service, the database will failovered to Slave
B by triggering with trigger file.


As soon as you trigger a standby, it changes to a new timeline.  At 
that point, the series of WAL files diverges.  It's no longer possible 
to apply them to a system that is still on the original timeline, such 
as your original master A in this situation.  There's a good reason for 
that.  Let's say that A committed an additional transaction before it 
went down, but that commit wasn't replicated to B.  You can't just move 
records from B over anymore in that case.  The only way to make sure A 
is in sync again is to do a new base backup, which you can potentially 
accelerate using rsync to only copy what has changed.  I see a lot of 
people try to bypass one of the steps recommended in the manual using 
various schemes like yours, and they usually have a bug like this in 
there--sometimes obvious like this, sometimes subtle.  Trying to get too 
clever here is dangerous to your database.


Warning:  pgsql-hackers is the mailing list for people to discuss the 
development of PostgreSQL, not how to use it.  Questions like this 
should be asked on either the pgsql-admin or pgsql-general mailing list. 
 I'm not going to answer additional questions like this from you here 
on this list, and I doubt anyone else will either.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial 9.2 pgbench write results

2012-02-28 Thread Robert Haas
On Tue, Feb 28, 2012 at 1:15 AM, Ants Aasma ants.aa...@eesti.ee wrote:
 My hypothesis for the TPS regression is that it is due to write combining.
 When the workload is mainly bound by I/O, every little bit that can be saved
 helps the bottomline. Larger scalefactors don't get the benefit because
 there is less write combining going on overall.

This is an interesting hypothesis which I think we can test.  I'm
thinking of writing a quick patch (just for testing, not for commit)
to set a new buffer flag BM_BGWRITER_CLEANED on every buffer the
background writer cleans.  Then we can keep a count of how often such
buffers are dirtied before they're evicted, vs. how often they're
evicted before they're dirtied.  If any significant percentage of them
are redirtied before they're evicted, that would confirm this
hypothesis.  At any rate I think the numbers would be interesting to
see.
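
For concreteness, a hedged sketch of the bookkeeping (the flag name is
from above; the counters and hook points are guesses at what such a test
patch could look like, not its actual contents):

	/* in the background writer, after cleaning a buffer */
	bufHdr->flags |= BM_BGWRITER_CLEANED;

	/* where a backend marks a buffer dirty */
	if (bufHdr->flags & BM_BGWRITER_CLEANED)
	{
		bufHdr->flags &= ~BM_BGWRITER_CLEANED;
		bgw_dirty_before_evict++;		/* hypothetical counter */
	}

	/* where a buffer is evicted for reuse */
	if (bufHdr->flags & BM_BGWRITER_CLEANED)
	{
		bufHdr->flags &= ~BM_BGWRITER_CLEANED;
		bgw_evict_before_dirty++;		/* hypothetical counter */
	}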

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] swapcache-style cache?

2012-02-28 Thread karavelov
- Quote from Andrea Suisani (sick...@opinioni.net), on 28.02.2012 at 09:54 -

 On 02/28/2012 04:52 AM, Rob Wultsch wrote:
 On Wed, Feb 22, 2012 at 2:31 PM, james ja...@mansionfamily.plus.com  wrote:
 Has anyone considered managing a system like the DragonFLY swapcache for a
 DBMS like PostgreSQL?


 https://www.facebook.com/note.php?note_id=388112370932

 
 in the same vein:
 
 http://bcache.evilpiepirate.org/
 
 from the main page:
 
 Bcache is a patch for the Linux kernel to use SSDs to cache other block 
 devices. It's analogous to L2Arc for ZFS,
 but Bcache also does writeback caching, and it's filesystem agnostic. It's 
 designed to be switched on with a minimum
 of effort, and to work well without configuration on any setup. By default it 
 won't cache sequential IO, just the random
 reads and writes that SSDs excel at. It's meant to be suitable for desktops, 
 servers, high end storage arrays, and perhaps
 even embedded.
 
 it was submitted to the linux kernel mailing list a bunch of times; the last one:
 
 https://lkml.org/lkml/2011/9/10/13
 
 
 Andrea
 

I am successfully using facebook's flashcache in write-through mode, so it
speeds up only reads. I have seen a 3x increase in TPS for databases that
do not fit in RAM. I am using an Intel X-25E over a RAID10 of 4 SAS disks.
I have also tested writeback mode, but the gain is not so large and there
is a considerable risk of losing all your data if/when the SSD fails.

Best regards


--
Luben Karavelov

Re: [HACKERS] review: CHECK FUNCTION statement

2012-02-28 Thread Alvaro Herrera


I have a few comments about this patch:

I didn't like the fact that the checker calling infrastructure uses
SPI instead of just a FunctionCallN to call the checker function.  I
think this should be easily avoidable.

Second, I see that functioncmds.c reaches quite far into trigger internals
just to be able to figure out the function starting from a trigger name.  I
think it'd be saner to have a new function in trigger.c that returns the
required function OID.

I think CheckFunction would be clearer if the code to check multiple
objects is split out into a separate subroutine.

After CheckFunction there is a leftover function comment without any
following function.  There are other spurious hunks that add or remove
single lines too (once in an otherwise untouched file).

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] strange plan - PostgreSQL 9.2

2012-02-28 Thread Pavel Stehule
Hello

I tried to look at one slow query with a correlated subquery:

create table xx(a int primary key);
create table yy(a int);

insert into xx select generate_series(1,1000000);
insert into yy select (random()*1000000)::int from generate_series(1,100000);

create index on yy(a);

Query A
select a, (select true from yy where xx.a = yy.a limit 1) from xx
limit 10 offset 0;

postgres=> explain select a, (select true from yy where xx.a = yy.a
limit 1) from xx;
                          QUERY PLAN
--------------------------------------------------------------
 Seq Scan on xx  (cost=0.00..4392325.00 rows=1000000 width=4)
   SubPlan 1
     ->  Limit  (cost=0.00..4.38 rows=1 width=0)
           ->  Index Only Scan using yy_a_idx on yy  (cost=0.00..4.38
rows=1 width=0)
                 Index Cond: (a = xx.a)
(5 rows)

plan for this query is expected

But when I rewrote the query, I got a strange plan (query B):

postgres=> explain select a, exists(select 1 from yy where xx.a =
yy.a)  from xx limit 10 offset 0;
                            QUERY PLAN
------------------------------------------------------------------
 Limit  (cost=0.00..43.92 rows=10 width=4)
   ->  Seq Scan on xx  (cost=0.00..4392325.00 rows=1000000 width=4)
         SubPlan 1
           ->  Index Only Scan using yy_a_idx on yy  (cost=0.00..4.38
rows=1 width=0)
                 Index Cond: (a = xx.a)
         SubPlan 2
           ->  Seq Scan on yy  (cost=0.00..1443.00 rows=100000 width=4)
(7 rows)

Why is there a SubPlan 2?

But query B is two times faster than query A.

 public | xx   | table | pavel | 35 MB   |
 public | yy   | table | pavel | 3576 kB |

regards

Pavel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial 9.2 pgbench write results

2012-02-28 Thread Robert Haas
On Tue, Feb 28, 2012 at 11:36 AM, Jeff Janes jeff.ja...@gmail.com wrote:
 How hard would it be to dummy up a bgwriter which, every time it wakes
 up, it forks off a child process to actually do the write, and then
 the real one just waits for the child to exit?  If it didn't have to
 correctly handle signals, SINVAL, and such, it should be just a few
 lines of code, but I don't know how much we can ignore signals and
 such even just for testing purposes.  My thought here is that the
 kernel is getting in a snit over one process doing all the writing on
 the system, and is punishing that process in a way that ruins things
 for everyone.

I would assume the only punishment that the kernel would inflict would
be to put the bgwriter to sleep.  That would make the bgwriter less
effective, of course, but it shouldn't make it worse than no bgwriter
at all.  Unless it does it in some really stupid way, like making the
bgwriter sleep while it holds some lock.

But maybe I'm missing something - what do you have in mind?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial 9.2 pgbench write results

2012-02-28 Thread karavelov
- Quote from Robert Haas (robertmh...@gmail.com), on 28.02.2012 at 19:25 -

 On Tue, Feb 28, 2012 at 11:36 AM, Jeff Janes jeff.ja...@gmail.com wrote:
 How hard would it be to dummy up a bgwriter which, every time it wakes
 up, it forks off a child process to actually do the write, and then
 the real one just waits for the child to exit?  If it didn't have to
 correctly handle signals, SINVAL, and such, it should be just a few
 lines of code, but I don't know how much we can ignore signals and
 such even just for testing purposes.  My thought here is that the
 kernel is getting in a snit over one process doing all the writing on
 the system, and is punishing that process in a way that ruins things
 for everyone.
 
 I would assume the only punishment that the kernel would inflict would
 be to put the bgwriter to sleep.  That would make the bgwriter less
 effective, of course, but it shouldn't make it worse than no bgwriter
 at all.  Unless it does it some really stupid way, like making
 bgwriter sleep while it holds some lock.
 
 But maybe I'm missing something - what do you have in mind?
 
 -- 
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers
 
 

--
Luben Karavelov

Re: [HACKERS] Initial 9.2 pgbench write results

2012-02-28 Thread Robert Haas
On Tue, Feb 28, 2012 at 11:46 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Feb 28, 2012 at 1:15 AM, Ants Aasma ants.aa...@eesti.ee wrote:
 My hypothesis for the TPS regression is that it is due to write combining.
 When the workload is mainly bound by I/O, every little bit that can be saved
 helps the bottomline. Larger scalefactors don't get the benefit because
 there is less write combining going on overall.

 This is an interesting hypothesis which I think we can test.  I'm
 thinking of writing a quick patch (just for testing, not for commit)
 to set a new buffer flag BM_BGWRITER_CLEANED on every buffer the
 background writer cleans.   Then we can keep a count of how often such
 buffers are dirtied before they're evicted, vs. how often they're
 evicted before they're dirtied.  If any significant percentage of them
 are redirtied before they're evicted, that would confirm this
 hypothesis.  At any rate I think the numbers would be interesting to
 see.

Patch attached.

I tried it on my laptop with a 60-second pgbench run at scale factor
100, and got this:

LOG:  bgwriter_clean: 1387 evict-before-dirty, 10 dirty-before-evict
LOG:  bgwriter_clean: 1372 evict-before-dirty, 10 dirty-before-evict
LOG:  bgwriter_clean: 1355 evict-before-dirty, 10 dirty-before-evict
LOG:  bgwriter_clean: 1344 evict-before-dirty, 8 dirty-before-evict
LOG:  bgwriter_clean: 1418 evict-before-dirty, 8 dirty-before-evict
LOG:  bgwriter_clean: 1345 evict-before-dirty, 7 dirty-before-evict
LOG:  bgwriter_clean: 1339 evict-before-dirty, 6 dirty-before-evict
LOG:  bgwriter_clean: 1362 evict-before-dirty, 9 dirty-before-evict

That doesn't look bad at all.  Then I reset the stats, tried it again,
and got this:

LOG:  bgwriter_clean: 3863 evict-before-dirty, 198 dirty-before-evict
LOG:  bgwriter_clean: 3861 evict-before-dirty, 199 dirty-before-evict
LOG:  bgwriter_clean: 3978 evict-before-dirty, 218 dirty-before-evict
LOG:  bgwriter_clean: 3928 evict-before-dirty, 204 dirty-before-evict
LOG:  bgwriter_clean: 3956 evict-before-dirty, 207 dirty-before-evict
LOG:  bgwriter_clean: 3906 evict-before-dirty, 222 dirty-before-evict
LOG:  bgwriter_clean: 3912 evict-before-dirty, 197 dirty-before-evict
LOG:  bgwriter_clean: 3853 evict-before-dirty, 200 dirty-before-evict

OK, that's not so good, but I don't know why it's different.  I'm not
sure I can reproduce the exact same scenario Greg is seeing - this is
totally different hardware - but I'll play around with it a little bit
more.  Greg, if you happen to feel like testing this on one of your
problem cases I'd be interested in seeing what it spits out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


bgwriter_clean_stats.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial 9.2 pgbench write results

2012-02-28 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 How hard would it be to dummy up a bgwriter which, every time it wakes
 up, it forks off a child process to actually do the write, and then
 the real one just waits for the child to exit?  If it didn't have to
 correctly handle signals, SINVAL, and such, it should be just a few
 lines of code, but I don't know how much we can ignore signals and
 such even just for testing purposes.  My thought here is that the
 kernel is getting in a snit over one process doing all the writing on
 the system, and is punishing that process in a way that ruins things
 for everyone.

If that is the case (which I don't know one way or the other), I'm not
sure that having subprocesses do the work would be an improvement.
We know for a fact that the OOM killer knows enough to blame memory
consumed by a child process on the parent.  Resource limitations of
other sorts might work similarly.

But a bigger problem is that fork() is very very far from being zero
cost.  I think doing one per write() would swamp any improvement you
could hope to get.

It does make sense that the bgwriter would get hit by niceness penalties
after it'd run up sufficient runtime.  If we could get some hard numbers
about how long it takes for that to happen, we could consider letting
the bgwriter exit (and the postmaster spawn a new one) every so often.
This is just gaming the scheduler, and so one would like to think
there's a better way to do it, but I don't know of any
non-root-privilege way to stave off getting niced.
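
A hedged sketch of how that respawn trigger could look (purely
illustrative; getpriority() is POSIX, proc_exit() is the backend's exit
routine):

#include <errno.h>
#include <sys/resource.h>

	/* somewhere in the bgwriter main loop (sketch) */
	int			prio;

	errno = 0;
	prio = getpriority(PRIO_PROCESS, 0);
	if (errno == 0 && prio > 0)
		proc_exit(0);		/* postmaster starts a fresh, un-niced one */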

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CLOG contention, part 2

2012-02-28 Thread Robert Haas
On Mon, Feb 27, 2012 at 4:03 AM, Simon Riggs si...@2ndquadrant.com wrote:
 So please use a scale factor that the hardware can cope with.

OK.  I tested this out on Nate Boley's 32-core AMD machine, using
scale factor 100 and scale factor 300. I initialized it with Simon's
patch, which should have the effect of rendering the entire table
unhinted and giving each row a different XID.  I used my usual
configuration settings for that machine, which are: shared_buffers =
8GB, maintenance_work_mem = 1GB, synchronous_commit = off,
checkpoint_segments = 300, checkpoint_timeout = 15min,
checkpoint_completion_target = 0.9, wal_writer_delay = 20ms.  I did
three runs on master, as of commit
9bf8603c7a9153cada7e32eb0cf7ac1feb1d3b56, and three runs with the
clog_history_v4 patch applied.  The command to initialize the database
was:

~/install/clog-contention/bin/pgbench -i -I -s $scale

The command to run the test was:

~/install/clog-contention/bin/pgbench -l -T 1800 -c 32 -j 32 -n

Executive Summary: The patch makes things way slower at scale factor
300, and possibly slightly slower at scale factor 100.

Detailed Results:

resultslp.clog_history_v4.32.100.1800:tps = 14286.049637 (including
connections establishing)
resultslp.clog_history_v4.32.100.1800:tps = 13532.814984 (including
connections establishing)
resultslp.clog_history_v4.32.100.1800:tps = 13972.987301 (including
connections establishing)
resultslp.clog_history_v4.32.300.1800:tps = 5061.650470 (including
connections establishing)
resultslp.clog_history_v4.32.300.1800:tps = 4871.126457 (including
connections establishing)
resultslp.clog_history_v4.32.300.1800:tps = 5861.124177 (including
connections establishing)
resultslp.master.32.100.1800:tps = 13420.777222 (including connections
establishing)
resultslp.master.32.100.1800:tps = 14912.336257 (including connections
establishing)
resultslp.master.32.100.1800:tps = 14505.718977 (including connections
establishing)
resultslp.master.32.300.1800:tps = 14766.984548 (including connections
establishing)
resultslp.master.32.300.1800:tps = 14783.026190 (including connections
establishing)
resultslp.master.32.300.1800:tps = 14567.504887 (including connections
establishing)

I don't know whether this is just a bug or whether there's some more
fundamental problem with the approach.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] pg_upgrade --logfile option documentation

2012-02-28 Thread Robert Haas
On Tue, Feb 28, 2012 at 11:21 AM, Bruce Momjian br...@momjian.us wrote:
 On Wed, Feb 22, 2012 at 03:37:29PM -0500, Bruce Momjian wrote:
 On Wed, Feb 22, 2012 at 05:22:29PM -0300, Alvaro Herrera wrote:
 Not sure about this.  If the upgrade completes successfully and the
  file is not renamed at the last minute due to some error, that would be
  a problem as well, because now the old cluster would happily run and
  perhaps corrupt the data files from under the new cluster.

 Well, the basic problem is that the user, before pg_upgrade started,
 installed a new cluster that works.  If we rename the old control, but
 rename it back on failure, there are cases we will miss, like kill -9 or
 a server crash, and it will not be obvious to them that the control file
 was renamed.

 Of course, if we only rename on success, and there is kill -9 or server
 crash, the old cluster is still start-able, like the new one.

 One good argument for the rename early is that on a server crash, the
 system is probably going to restart the database automatically, and that
 means the old server.

 Right now we have a clear message that they need to rename the control
 file to start the old server.  Not sure what the new wording would look
 like --- let me try.

 I have thought about this, and feel that it would be odd to lock the old
 cluster at the start of the upgrade, and then unlock it on a failure,
 particularly because we can't always unlock it, e.g. operating system
 crash.

 A cleaner solution would be to lock it when we complete the upgrade,
 which I have done in the attached patch.  I have also added a warning
 about restarting the old server when link mode is used, and updated the
 documentation to match the new behavior.

 Patch attached.  I would like to apply this to 9.2/HEAD.

 ---

 Performing Consistency Checks
 -----------------------------
 Checking current, bin, and data directories                 ok
 Checking cluster versions                                   ok
 Checking database user is a superuser                       ok
 Checking for prepared transactions                          ok
 Checking for reg* system OID user data types                ok
 Checking for contrib/isn with bigint-passing mismatch       ok
 Creating catalog dump                                       ok
 Checking for prepared transactions                          ok
 Checking for presence of required libraries                 ok

 If pg_upgrade fails after this point, you must re-initdb the
 new cluster before continuing.

 Performing Upgrade
 ------------------
 Analyzing all rows in the new cluster                       ok
 Freezing all rows on the new cluster                        ok
 Deleting new commit clogs                                   ok
 Copying old commit clogs to new server                      ok
 Setting next transaction ID for new cluster                 ok
 Resetting WAL archives                                      ok
 Setting frozenxid counters in new cluster                   ok
 Creating databases in the new cluster                       ok
 Adding support functions to new cluster                     ok
 Restoring database schema to new cluster                    ok
 Removing support functions from new cluster                 ok
 Linking user relation files
                                                            ok
 Setting next OID for new cluster                            ok
 Creating script to delete old cluster                       ok
 Adding .old suffix to old global/pg_control               ok

 If you want to start the old cluster, you will need to remove
 the .old suffix from /u/pgsql.old/data/global/pg_control.old.
 Because link mode was used, the old cluster cannot be safely
 started once the new cluster has been started.

 Upgrade complete
 ----------------
 Optimizer statistics are not transferred by pg_upgrade so
 consider running:
    vacuumdb --all --analyze-only
 on the newly-upgraded cluster.

 Running this script will delete the old cluster's data files:
    /usr/local/pgdev/pg_upgrade/delete_old_cluster.sh

I think you should rename the old control file just before the step
that says linking user relation files.  That's the point after which
it becomes unsafe to start the old cluster, right?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] WIP: URI connection string support for libpq

2012-02-28 Thread Alexander Shulgin

On 02/24/2012 03:18 PM, Florian Weimer wrote:


* Alex Shulgin:


It's ugly, but it's standard practice, and seems better than a separate
-d parameter (which sort of defeats the purpose of URIs).


Hm, do you see anything wrong with ?dbname=other if you don't
like a separate -d?


It's not nice URI syntax, but it's better than an out-of-band mechanism.


Attached is v5 of the patch, adding support for local Unix socket 
directory specification w/o the need to percent-encode path separators. 
 The path to the directory must start with a forward slash, like so:


postgres:///path/to/socket/dir

To specify non-default dbname use URI query parameters:

postgres:///path/to/socket/dir?dbname=other

Username and password should also be specified as query parameters in
this case, as opposed to the user:pw@host syntax supported by host URIs.
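
For instance (assuming this v5 patch is applied; the socket directory,
database, and user below are made up), a libpq client could connect
through a non-default socket directory like this:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	/*
	 * dbname and user go in the query string, since the user:pw@host
	 * syntax is only supported by host-style URIs.
	 */
	PGconn	   *conn = PQconnectdb(
		"postgres:///var/run/postgresql?dbname=other&user=alice");

	if (PQstatus(conn) != CONNECTION_OK)
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));

	PQfinish(conn);
	return 0;
}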


--
Alex
*** a/src/interfaces/libpq/fe-connect.c
--- b/src/interfaces/libpq/fe-connect.c
***
*** 282,287  static const PQEnvironmentOption EnvironmentOptions[] =
--- 282,290 
  	}
  };
  
+ /* The connection URI must start with either of the following designators: */
+ static const char uri_designator[] = "postgresql://";
+ static const char short_uri_designator[] = "postgres://";
  
  static bool connectOptions1(PGconn *conn, const char *conninfo);
  static bool connectOptions2(PGconn *conn);
***
*** 297,303  static PQconninfoOption *conninfo_parse(const char *conninfo,
  static PQconninfoOption *conninfo_array_parse(const char *const * keywords,
  	 const char *const * values, PQExpBuffer errorMessage,
  	 bool use_defaults, int expand_dbname);
! static char *conninfo_getval(PQconninfoOption *connOptions,
  const char *keyword);
  static void defaultNoticeReceiver(void *arg, const PGresult *res);
  static void defaultNoticeProcessor(void *arg, const char *message);
--- 300,333 
  static PQconninfoOption *conninfo_array_parse(const char *const * keywords,
  	 const char *const * values, PQExpBuffer errorMessage,
  	 bool use_defaults, int expand_dbname);
! static PQconninfoOption *conninfo_uri_parse(const char *uri,
! PQExpBuffer errorMessage);
! static bool conninfo_uri_parse_options(PQconninfoOption *options,
! const char *uri, char *buf,
! PQExpBuffer errorMessage);
! static char *conninfo_uri_parse_local_socket_dir(PQconninfoOption *options,
! const char *uri,
! char *buf, char *lastc,
! PQExpBuffer errorMessage);
! static char *conninfo_uri_parse_remote_host(PQconninfoOption *options,
! const char *uri,
! char *buf, char *lastc,
! PQExpBuffer errorMessage);
! static bool conninfo_uri_parse_params(char *params,
! PQconninfoOption *connOptions,
! PQExpBuffer errorMessage);
! static char *conninfo_uri_decode(const char *str, PQExpBuffer errorMessage);
! static bool get_hexdigit(char digit, int *value);
! static const char *conninfo_getval(PQconninfoOption *connOptions,
! const char *keyword);
! static PQconninfoOption *conninfo_storeval(PQconninfoOption *connOptions,
! const char *keyword, const char *value,
! PQExpBuffer errorMessage, bool ignoreMissing);
! static PQconninfoOption *conninfo_store_uri_encoded_value(
! PQconninfoOption *connOptions,
! const char *keyword, const char *encoded_value,
! PQExpBuffer errorMessage, bool ignoreMissing);
! static PQconninfoOption *conninfo_find(PQconninfoOption *connOptions,
  const char *keyword);
  static void defaultNoticeReceiver(void *arg, const PGresult *res);
  static void defaultNoticeProcessor(void *arg, const char *message);
***
*** 580,586  PQconnectStart(const char *conninfo)
  static void
  fillPGconn(PGconn *conn, PQconninfoOption *connOptions)
  {
! 	char	   *tmp;
  
  	/*
  	 * Move option values into conn structure
--- 610,616 
  static void
  fillPGconn(PGconn *conn, PQconninfoOption *connOptions)
  {
! 	const char	   *tmp;
  
  	/*
  	 * Move option values into conn structure
***
*** 3739,3745  ldapServiceLookup(const char *purl, PQconninfoOption *options,
  static int
  parseServiceInfo(PQconninfoOption *options, PQExpBuffer errorMessage)
  {
! 	char	   *service = conninfo_getval(options, "service");
  	char		serviceFile[MAXPGPATH];
  	char	   *env;
  	bool		group_found = false;
--- 3769,3775 
  static int
  parseServiceInfo(PQconninfoOption *options, PQExpBuffer errorMessage)
  {
! 	const char *service = conninfo_getval(options, "service");
  	char		serviceFile[MAXPGPATH];
  	char	   *env;
  	bool		group_found = false;
***
*** 4129,4161  conninfo_parse(const char *conninfo, PQExpBuffer errorMessage,
  		}
  
  		/*
! 		 * Now we have the name and the value. Search for the param record.
! 		 */
! 		for (option = options; option->keyword != NULL; option++)
! 		{
! 			if (strcmp(option->keyword, pname) == 0)
! 				break;
! 		}
! 		if (option->keyword == NULL)
! 		{
! 			printfPQExpBuffer(errorMessage,
! 							  libpq_gettext("invalid connection 

Re: [HACKERS] pg_upgrade --logfile option documentation

2012-02-28 Thread Alvaro Herrera

Excerpts from Robert Haas's message of mar feb 28 15:24:45 -0300 2012:

 I think you should rename the old control file just before the step
 that says linking user relation files.  That's the point after which
 it becomes unsafe to start the old cluster, right?

Also, if it's not using link mode, what is the point in doing the rename
in the first place?  I was using copy mode when I did my tests and yet
it got renamed (unless link mode is now the default, which I doubt?)

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [HACKERS] Runtime SHAREDIR for testing CREATE EXTENSION

2012-02-28 Thread Daniel Farina
On Sun, Feb 26, 2012 at 7:36 AM, Peter Eisentraut pete...@gmx.net wrote:
 On lör, 2012-02-25 at 14:21 +0100, Christoph Berg wrote:
 Well, I'm trying to invoke the extension's make check target at
 extension build time. I do have a temporary installation I own
 somewhere in my $HOME, but that is still trying to find extensions in
 /usr/share/postgresql/9.1/extension/*.control, because I am using the
 system's postgresql version. The build process is not running as root,
 so I cannot do an install of the extension to its final location.
 Still it would be nice to run regression tests. All that seems to be
 missing is the ability to put

 extension_control_path = /home/buildd/tmp/extension

 into the postgresql.conf of the temporary PG installation, or some
 other way like CREATE EXTENSION foobar WITH CONTROL
 '/home/buildd/...'.

 Yeah, of course, the extension path is not related to the data
 directory.  So we do need some kind of path setting, just like
 dynamic_library_path.

A thought that may or may not be related:

I'd really like to support libraries (C or otherwise) of multiple
versions at the same time, when the underlying library permits.  The
critical use case is contribs that iterate on their storage format,
which right now is really painful.  While perhaps contribs should use
a version byte or something and be backwards-compatible forever, to
some extent that is clearly impossible and sometimes undesirable for
contribs that emit small on-disk representations, and would rather
use, say, the type system to disambiguate what operators fit with what
data.

I've been struggling to think of a more graceful way to load the
correct version of extra dependencies (including non-PG intrinsic
things like libgeos or libproj) when new database replicas are
created, as well as helping people with upgrade issues.  Is this
tunable possibly related to a solution for this?

-- 
fdr


Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-02-28 Thread Peter Eisentraut
On tis, 2012-02-28 at 11:20 -0500, Robert Haas wrote:
  [ snicker ]  But still, Peter has a point: pgsql is not a name for the
  product, it's at best an abbreviation.  We aren't calling the other
  thing orcl_fdw or ora_fdw.
 
  I think either postgres_fdw or postgresql_fdw would be fine.
 
 I liked the shorter name, myself, but I'm not going to make a big deal
 about it.

Let's at least be clear about the reasons here.  The fact that
postgresql_fdw_validator exists means (a) there is a possible naming
conflict that has not been discussed yet, and/or (b) the name is already
settled and we need to think of a way to make postgresql_fdw_validator
work with the new actual FDW.




Re: [HACKERS] pg_upgrade --logfile option documentation

2012-02-28 Thread Bruce Momjian
On Tue, Feb 28, 2012 at 01:24:45PM -0500, Robert Haas wrote:
  Running this script will delete the old cluster's data files:
     /usr/local/pgdev/pg_upgrade/delete_old_cluster.sh
 
 I think you should rename the old control file just before the step
 that says linking user relation files.  That's the point after which
 it becomes unsafe to start the old cluster, right?

Yes, it is true that that is the danger point, and also it is much less
likely to fail at that point --- failures usually happen during schema
creation.  I would have to add some more conditional wording without
being able to clearly state whether the .old suffix is present.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: [HACKERS] pg_upgrade --logfile option documentation

2012-02-28 Thread Bruce Momjian
On Tue, Feb 28, 2012 at 03:48:03PM -0300, Alvaro Herrera wrote:
 
 Excerpts from Robert Haas's message of mar feb 28 15:24:45 -0300 2012:
 
  I think you should rename the old control file just before the step
  that says linking user relation files.  That's the point after which
  it becomes unsafe to start the old cluster, right?
 
 Also, if it's not using link mode, what is the point in doing the rename
 in the first place?  I was using copy mode when I did my tests and yet
 it got renamed (unless link mode is now the default, which I doubt?)

You are right that we lock unconditionally.  We can certainly lock
only in link mode.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: [HACKERS] CLOG contention, part 2

2012-02-28 Thread Simon Riggs
On Tue, Feb 28, 2012 at 6:11 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Feb 27, 2012 at 4:03 AM, Simon Riggs si...@2ndquadrant.com wrote:
 So please use a scale factor that the hardware can cope with.

 OK.  I tested this out on Nate Boley's 32-core AMD machine, using
 scale factor 100 and scale factor 300. I initialized it with Simon's
 patch, which should have the effect of rendering the entire table
 unhinted and giving each row a different XID.

Thanks for making the test.

I think this tells me the only real way to do this kind of testing is
not at arm's length from a test machine.

So time to get my hands on a machine, but not for this release.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] review: CHECK FUNCTION statement

2012-02-28 Thread Pavel Stehule
Hello

On 28 February 2012 at 17:48, Alvaro Herrera alvhe...@commandprompt.com wrote:


 I have a few comments about this patch:

 I didn't like the fact that the checker calling infrastructure uses
 SPI instead of just a FunctionCallN to call the checker function.  I
 think this should be easily avoidable.


It is not possible - or at least it has no simple solution (I don't
know how to do it). PLpgSQL_checker is an SRF function, and SPI is used
for processing the returned result set. I looked through the pg source
code, and I didn't find any pattern other than using SPI for an SRF
function call. It is probably possible, but it would mean some code
duplication too. I invite any ideas.
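
For reference, the SPI pattern in question looks roughly like this (a
simplified, invented sketch -- the checker function name and query are
placeholders, and error handling is omitted):

#include "postgres.h"
#include "executor/spi.h"

static void
run_checker_via_spi(void)
{
	SPI_connect();

	/* call the SRF checker and walk its result set via SPI */
	if (SPI_execute("SELECT * FROM plpgsql_check_function('f()')",
					true, 0) == SPI_OK_SELECT)
	{
		int			i;

		for (i = 0; i < (int) SPI_processed; i++)
		{
			char	   *msg = SPI_getvalue(SPI_tuptable->vals[i],
										   SPI_tuptable->tupdesc, 1);

			elog(NOTICE, "checker: %s", msg ? msg : "(null)");
		}
	}

	SPI_finish();
}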

 Second, I see that functioncmds.c gets a lot into trigger internals just
 to be able to figure out the function starting from a trigger name.  I
 think it'd be saner to have a new function in trigger.c that returns the
 required function OID.

done


 I think CheckFunction would be clearer if the code to check multiple
 objects is split out into a separate subroutine.

done


 After CheckFunction there is a leftover function comment without any
 following function.  There are other spurious hunks that add or remove
 single lines too (once in an otherwise untouched file).

fixed



I refreshed the patch for the current git repository.

Regards

Pavel

 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support


check_function-2012-02-28-2.diff.gz
Description: GNU Zip compressed data


Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-02-28 Thread David E. Wheeler
On Feb 28, 2012, at 8:20 AM, Robert Haas wrote:

 I liked the shorter name, myself, but I'm not going to make a big deal about 
 it.

pg_ is used quite a bit.  What about pg_fdw?

David


Re: [HACKERS] review: CHECK FUNCTION statement

2012-02-28 Thread Alvaro Herrera

Excerpts from Pavel Stehule's message of mar feb 28 16:30:58 -0300 2012:

 I refreshed the patch for the current git repository.

Thanks, I'll have a look.

Oh, another thing -- you shouldn't patch the 1.0 version of the plpgsql
extension.  Rather I think you should produce a 1.1 version.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [HACKERS] review: CHECK FUNCTION statement

2012-02-28 Thread Pavel Stehule
2012/2/28 Alvaro Herrera alvhe...@commandprompt.com:

 Excerpts from Pavel Stehule's message of mar feb 28 16:30:58 -0300 2012:

 I refreshed the patch for the current git repository.

 Thanks, I'll have a look.

 Oh, another thing -- you shouldn't patch the 1.0 version of the plpgsql
 extension.  Rather I think you should produce a 1.1 version.


There is a patch with the updated tag. I am not sure whether some
changes in pg_upgrade are also necessary?

Regards and thank you

Pavel

 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support


check_function-2012-02-28-3.diff.gz
Description: GNU Zip compressed data


Re: [HACKERS] review: CHECK FUNCTION statement

2012-02-28 Thread Alvaro Herrera


In gram.y we have a new check_option_list nonterminal.  This is mostly
identical to explain_option_list, except that the option args do not
take a NumericOnly (only opt_boolean_or_string and empty).  I wonder if
it's really worthwhile having a bunch of separate productions for this;
how about we just use the existing explain_option_list instead and get
rid of those extra productions?

elog() is used in many user-facing messages (errors and notices).  Full
ereport() calls should be used there, so that messages are marked for
translation and so on.
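
For instance, the usual conversion looks roughly like this (the message
text and error code here are just illustrative):

/* instead of: */
elog(ERROR, "function \"%s\" does not exist", fname);

/* use: */
ereport(ERROR,
		(errcode(ERRCODE_UNDEFINED_FUNCTION),
		 errmsg("function \"%s\" does not exist", fname)));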

Does the patched pg_dump work with older servers?

I don't like CheckFunction being declared in defrem.h.  It seems
completely out of place there.  I don't see any better place though, so
I'm thinking maybe we should have a new header file for it (say
commands/functions.h; but we already have executor/functions.h so
perhaps it's better to find another name).  This addition means that
there's a distressingly large number of .c files that are now getting
dest.h, which was previously pretty confined.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [HACKERS] strange plan - PostgreSQL 9.2

2012-02-28 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 Why is there a SubPlan 2?

http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=bd3daddaf232d95b0c9ba6f99b0170a0147dd8af

regards, tom lane


Re: [HACKERS] strange plan - PostgreSQL 9.2

2012-02-28 Thread Pavel Stehule
2012/2/28 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 Why is there a SubPlan 2?

 http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=bd3daddaf232d95b0c9ba6f99b0170a0147dd8af

                        regards, tom lane

Thank you - I can verify that it works well, but the EXPLAIN result is
really strange.

Pavel


Re: [HACKERS] review: CHECK FUNCTION statement

2012-02-28 Thread Pavel Stehule
2012/2/28 Alvaro Herrera alvhe...@commandprompt.com:


 In gram.y we have a new check_option_list nonterminal.  This is mostly
 identical to explain_option_list, except that the option args do not
 take a NumericOnly (only opt_boolean_or_string and empty).  I wonder if
 it's really worthwhile having a bunch of separate productions for this;
 how about we just use the existing explain_option_list instead and get
 rid of those extra productions?

I will have to look at it.


 elog() is used in many user-facing messages (errors and notices).  Full
 ereport() calls should be used there, so that messages are marked for
 translation and so on.

yes, I'll fix it


 Does the patched pg_dump work with older servers?


yes, it should.

 I don't like CheckFunction being declared in defrem.h.  It seems
 completely out of place there.  I don't see any better place though, so
 I'm thinking maybe we should have a new header file for it (say
 commands/functions.h; but we already have executor/functions.h so
 perhaps it's better to find another name).  This addition means that
 there's a distressingly large number of .c files that are now getting
 dest.h, which was previously pretty confined.


you have much better knowledge about headers than I do, so please
propose a solution.

Regards

Pavel


 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [HACKERS] NULL's support in SP-GiST

2012-02-28 Thread Jaime Casanova
On Thu, Feb 2, 2012 at 4:26 PM, Oleg Bartunov o...@sai.msu.su wrote:
 Hi there,

 attached patch introduces NULLs indexing for SP-GiST. With this patch
 SP-GiST supports IS NULL and IS NOT NULL clauses, as well as full index scans.


I was looking at this.
It passes all regression tests, and seems to work fine.

What I don't like about it is that spgnull.c actually calls GIN
functions and even uses GIN flags. I don't know how bad it is, but IMO
there is a modularity violation here.

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


[HACKERS] Parameterized-path cost comparisons need some work

2012-02-28 Thread Tom Lane
I was experimenting today with a test case sent to me by somebody at Red
Hat.  The details aren't too important, except that it involves an inner
join on the inside of a left join (where it cannot commute with the left
join).  I can reproduce the behavior with a standard regression test
table, if I crank random_page_cost up a bit:

regression=# set random_page_cost TO 5;
SET
regression=# explain select * from tenk1 t1 left join (tenk1 t2 join tenk1 t3 
on t2.thousand = t3.unique2) on t1.hundred = t2.hundred where t1.unique1 = 1;
                              QUERY PLAN
----------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..507.16 rows=100 width=732)
   ->  Index Scan using tenk1_unique1 on tenk1 t1  (cost=0.00..10.28 rows=1 width=244)
         Index Cond: (unique1 = 1)
   ->  Nested Loop  (cost=0.00..495.63 rows=100 width=488)
         ->  Index Scan using tenk1_hundred on tenk1 t2  (cost=0.00..446.98 rows=100 width=244)
               Index Cond: (t1.hundred = hundred)
         ->  Index Scan using tenk1_unique2 on tenk1 t3  (cost=0.00..0.48 rows=1 width=244)
               Index Cond: (unique2 = t2.thousand)
(8 rows)

regression=# set random_page_cost TO 6;
SET
regression=# explain select * from tenk1 t1 left join (tenk1 t2 join tenk1 t3 
on t2.thousand = t3.unique2) on t1.hundred = t2.hundred where t1.unique1 = 1;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Hash Right Join  (cost=928.30..2573.80 rows=100 width=732)
   Hash Cond: (t2.hundred = t1.hundred)
   ->  Hash Join  (cost=916.00..2523.00 rows=1 width=488)
         Hash Cond: (t2.thousand = t3.unique2)
         ->  Seq Scan on tenk1 t2  (cost=0.00..458.00 rows=1 width=244)
         ->  Hash  (cost=458.00..458.00 rows=1 width=244)
               ->  Seq Scan on tenk1 t3  (cost=0.00..458.00 rows=1 width=244)
   ->  Hash  (cost=12.29..12.29 rows=1 width=244)
         ->  Index Scan using tenk1_unique1 on tenk1 t1  (cost=0.00..12.29 rows=1 width=244)
               Index Cond: (unique1 = 1)
(10 rows)

WTF?  How did it suddenly fail to find the double-nested-loop plan, and
have to settle for a much worse plan?

The reason seems to be that for large enough random_page_cost, the
seqscan on t2 is costed as cheaper than an indexscan that selects a
significant fraction of the table.  (Notice the t2 scans are nearly the
same cost in these two examples.)  That causes add_path to decide that
the unparameterized seqscan is unconditionally better than the
parameterized indexscan, and throw out the latter.  Now it is impossible
to form the parameterized nestloop subplan.

The flaw in this logic, of course, is that the seqscan might be cheaper
than the parameterized indexscan, but *it produces a whole lot more
rows*, meaning that any subsequent join will be a lot more expensive.
Previously add_path didn't have to worry about that, because all
ordinary paths for a given relation produce the same number of rows
(and we studiously kept things like inner indexscan paths out of
add_path's view of the world).

The most obvious thing to do about this is to consider that one path can
dominate another on cost only if it is both cheaper *and* produces no
more rows.  But I'm concerned about the cost of inserting yet another
test condition into add_path, which is slow enough already.  Has anyone
got an idea for a different formulation that would avoid that?
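
(To make the proposed rule concrete, an invented and heavily simplified
sketch -- the real add_path also weighs startup cost and pathkeys, uses
fuzzy cost comparisons, and this assumes each path carries its own row
estimate:)

static bool
path_dominates(Path *a, Path *b)
{
	/* a displaces b only if it is no more expensive AND no larger */
	return a->total_cost <= b->total_cost && a->rows <= b->rows;
}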

regards, tom lane


Re: [HACKERS] Should we add crc32 in libpgport?

2012-02-28 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 Thinking unnecessary.  Motion is progress.  Here is a patch that uses
 this exact plan: pgport for the tables, broken out into a header file
 that is included in the building of libpgport.  I have confirmed by
 objdump -t that multiple copies of the table are not included in the
 postgres binary and the bloat has not occurred.

Applied with minor adjustments; mainly, I cleaned up some additional
traces of the old way of building pg_resetxlog and pg_controldata.

regards, tom lane


Re: [HACKERS] 3rd Cluster Hackers Summit, May 15th in Ottawa

2012-02-28 Thread Josh Berkus
All,

First, I've added a wiki page:

http://wiki.postgresql.org/wiki/PgCon2012CanadaClusterSummit

Second, please RSVP!  At this point I've heard from a total of 3 people.
 I need to know who/how many are coming so that I can book the space,
and lunch.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: [HACKERS] [pgsql-cluster-hackers] 3rd Cluster Hackers Summit, May 15th in Ottawa

2012-02-28 Thread Koichi Suzuki
Hi,

This is Koichi.  I'm joining the summit.  I've heard at least two or
more people will join from NTT.  I'm expecting a couple of people
from EDB too.

Regards;
--
Koichi Suzuki



2012/2/29 Josh Berkus j...@agliodbs.com:
 All,

 First, I've added a wiki page:

 http://wiki.postgresql.org/wiki/PgCon2012CanadaClusterSummit

 Second, please RSVP!  At this point I've heard from a total of 3 people.
  I need to know who/how many are coming so that I can book the space,
 and lunch.

 --
 Josh Berkus
 PostgreSQL Experts Inc.
 http://pgexperts.com


[HACKERS] Re: [pgsql-cluster-hackers] 3rd Cluster Hackers Summit, May 15th in Ottawa

2012-02-28 Thread Michael Paquier
Hi Josh,

 Second, please RSVP!  At this point I've heard from a total of 3 people.
  I need to know who/how many are coming so that I can book the space,
 and lunch.

Sorry for my late reply.
I will join the cluster meeting as a member of Postgres-XC.

Regards,
-- 
Michael Paquier
http://michael.otacoo.com


[HACKERS] a typo in json.h

2012-02-28 Thread Haifeng Liu
Seems there is a trivial typo in json.h

diff --git a/src/include/utils/json.h b/src/include/utils/json.h
index 415787b..fa0d4fc 100644
--- a/src/include/utils/json.h
+++ b/src/include/utils/json.h
@@ -26,4 +26,4 @@ extern Datum row_to_json(PG_FUNCTION_ARGS);
 extern Datum row_to_json_pretty(PG_FUNCTION_ARGS);
 extern void  escape_json(StringInfo buf, const char *str);

-#endif   /* XML_H */
+#endif   /* JSON_H */


Re: [HACKERS] pg_upgrade --logfile option documentation

2012-02-28 Thread Bruce Momjian
On Tue, Feb 28, 2012 at 02:15:30PM -0500, Bruce Momjian wrote:
 On Tue, Feb 28, 2012 at 01:24:45PM -0500, Robert Haas wrote:
   Running this script will delete the old cluster's data files:
      /usr/local/pgdev/pg_upgrade/delete_old_cluster.sh
  
  I think you should rename the old control file just before the step
  that says linking user relation files.  That's the point after which
  it becomes unsafe to start the old cluster, right?
 
 Yes, it is true that that is the danger point, and also it is much less
 likely to fail at that point --- failures usually happen during schema
 creation.  I would have to add some more conditional wording without
 being able to clearly state whether the .old suffix is present.

OK, I have implemented both Robert's and Álvaro's ideas in my patch.
I only add the .old suffix to pg_controldata when link mode is used, and
I now do it after the schema has been created (the most common failure
case for pg_upgrade), and just before we actually link files --- both
very good ideas.

Patch attached;  new pg_upgrade output with link mode below.

---

Performing Consistency Checks
-----------------------------
Checking current, bin, and data directories ok
Checking cluster versions   ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok
Checking for reg* system OID user data typesok
Checking for contrib/isn with bigint-passing mismatch   ok
Creating catalog dump   ok
Checking for prepared transactions  ok
Checking for presence of required libraries ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster   ok
Freezing all rows on the new clusterok
Deleting new commit clogs   ok
Copying old commit clogs to new server  ok
Setting next transaction ID for new cluster ok
Resetting WAL archives  ok
Setting frozenxid counters in new cluster   ok
Creating databases in the new cluster   ok
Adding support functions to new cluster ok
Restoring database schema to new clusterok
Removing support functions from new cluster ok
Adding .old suffix to old global/pg_control   ok

If you want to start the old cluster, you will need to remove
the .old suffix from /u/pgsql.old/data/global/pg_control.old.
Because link mode was used, the old cluster cannot be safely
started once the new cluster has been started.

Linking user relation files
ok
Setting next OID for new clusterok
Creating script to delete old cluster   ok

Upgrade complete
----------------
Optimizer statistics are not transferred by pg_upgrade so
consider running:
vacuumdb --all --analyze-only
on the newly-upgraded cluster.

Running this script will delete the old cluster's data files:
/usr/local/pgdev/pg_upgrade/delete_old_cluster.sh


-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index 891eb9a..a5f63eb
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** report_clusters_compatible(void)
*** 148,156 
  	}
  
  	pg_log(PG_REPORT, "\n"
! 		   "If pg_upgrade fails after this point, you must re-initdb the new cluster\n"
! 		   "before continuing.  You will also need to remove the \".old\" suffix from\n"
! 		   "%s/global/pg_control.old.\n", old_cluster.pgdata);
  }
  
  
--- 148,155 
  	}
  
  	pg_log(PG_REPORT, "\n"
! 		   "If pg_upgrade fails after this point, you must re-initdb the\n"
! 		   "new cluster before continuing.\n");
  }
  
  
*** output_completion_banner(char *deletion_
*** 198,205 
  	/* Did we copy the free space files? */
  	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804)
  		pg_log(PG_REPORT,
! 			   "Optimizer statistics are not transferred by pg_upgrade so consider\n"
! 			   "running:\n"
  			   "       vacuumdb --all --analyze-only\n"
  			   "on the newly-upgraded cluster.\n\n");
  	else
--- 197,204 
  	/* Did we copy the free space files? */
  	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804)
  		pg_log(PG_REPORT,
! 			   "Optimizer statistics are not transferred by pg_upgrade so\n"
! 			   "consider running:\n"
  			   "       vacuumdb --all --analyze-only\n"
  			   "on the newly-upgraded cluster.\n\n");
  	else
diff 

Re: [HACKERS] a typo in json.h

2012-02-28 Thread Alvaro Herrera

Excerpts from Haifeng Liu's message of mar feb 28 23:31:26 -0300 2012:
 Seems there is a trivial typo in json.h

Thanks, fixed.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [HACKERS] FDW system columns

2012-02-28 Thread Shigeru Hanada
(2012/02/28 23:37), Kohei KaiGai wrote:
 On 2012/02/28 12:00, Shigeru Hanada shigeru.han...@gmail.com wrote:
 We have three options:

 a) remove all system columns (posted patch)
 b) remove system columns other than tableoid
 c) leave all system columns as is (current 9.2dev)

 Incidentally, views, which is very similar object type to foreign
 tables, have no system columns.

 Thoughts?

 What is the expected behavior in the case where a foreign table
 is constructed as a child table of a particular regular table?
 In this case, do child foreign tables not have columns
 that exist on the parent table?
 (Although it does not matter when a regular table is a child of
 a foreign table...)

If we support table inheritance by foreign tables, foreign tables should
return something for all system columns, because a child table MUST have
all columns held by all parent tables.  I'm not sure that foreign tables
should have system columns physically, as in option c).

For now, c) seems most reasonable to me.

-- 
Shigeru Hanada
