Re: [HACKERS] Bugs in planner's equivalence-class processing

2012-10-18 Thread Simon Riggs
On 16 October 2012 16:56, Tom Lane t...@sss.pgh.pa.us wrote:

 Is anybody concerned about the compatibility implications of fixing this
 bug in the back branches?  I'm worried about people complaining that we
 broke their application in a minor release.  Maybe they were depending
 on incorrect behavior, but they might complain anyway.  On the other
 hand, the fact that this hasn't been reported from the field in nine
 years suggests that not many people write queries like this.

Thanks for investigating this. My experience is that people seldom
check or understand the output of a query; they probably just figure
they didn't understand SQL and rewrite it a different way, so it's hard
to gauge the impact.

I think we need to see the cure before we can decide whether it's worse
than the disease. And it is especially important that we fix this just
once, so I suggest we fix it now and back-patch deeply later.

This type of thing is handled in other products by having a
compatibility level, so you can decide whether you want it or not. I'm
not suggesting that here, yet, but it's one way of mitigating the change.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Global Sequences

2012-10-18 Thread Simon Riggs
On 17 October 2012 11:21, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 16 October 2012 15:15, Tom Lane t...@sss.pgh.pa.us wrote:
 What you really want is something vaguely like nextval but applied to
 a distinct type of object.  That is, I think we first need a different
 kind of object called a global sequence with its own DDL operations.

 hence a different solution. CREATE SEQUENCE is SQL Standard and used
 by SERIAL, many people's SQL, SQL generation tools etc. My objective
 is to come up with something that makes the standard code work
 correctly in a replicated environment.

 I think we still can have both. I like Tom's suggestion better, as it
 provides for a cleaner implementation in the long run, I think.

Not sure how it is cleaner when we have to have trigger stuff hanging
around to make one object pretend to be another. That also creates a
chain of dependency which puts this into the future, rather than now.

The goal is make-sequences-work, not to invent something new that
might be cooler or more useful. If we create something new, then we
need to consider the references Daniel described, but that is a whole
different thing and already accessible if you need/want that.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Global Sequences

2012-10-18 Thread Simon Riggs
On 17 October 2012 10:19, Markus Wanner mar...@bluegap.ch wrote:

 On 10/17/2012 10:34 AM, Simon Riggs wrote:
 IMHO an API is required for "give me the next allocation of numbers",
 essentially a bulk equivalent of nextval().

 Agreed. That pretty exactly matches what I described (and what's
 implemented in Postgres-R). The API then only needs to be called every N
 invocations of nextval(), because otherwise nextval() can simply return
 a cached number previously allocated in a single step, eliminating a lot
 of the communication overhead.

 You realize an API at that level doesn't allow for an implementation of
 options 1 and 2? (Which I'm convinced we don't need, so that's fine with
 me).

 Anything lower level is going to depend upon implementation details
 that I don't think we should expose.

 Exactly. Just like we shouldn't expose other implementation details,
 like writing to system catalogs or WAL.

 I'm sure there will be much commonality between 2 similar
 implementations, just as there is similar code in each index type. But
 maintaining modularity is important, and ahead of us actually seeing 2
 implementations, trying to prejudge that is going to slow us all down
 and likely screw us up.

 Agreed. Let me add that modularity only serves a purpose if the
 boundaries between the modules are chosen wisely. It sounds like we are
 on the same page, though.

 To illustrate this: IMHO an API for setval() is required to invalidate
 all nodes' caches and re-set an initial value, as a starting point for
 the next bulk of numbers that nextval() will return.

 currval() doesn't need to be changed or hooked at all, because it's a
 read-only operation.

Agreed.

The API calls would be sam_nextval_alloc() and sam_setval(),
using Tom's Sequence Access Method naming.
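To make the cached bulk-allocation idea concrete, here is a minimal C sketch. The sam_* names just follow the Sequence Access Method naming used in this thread; the global allocator is a stand-in for whatever cluster-wide mechanism a real implementation would call, so everything below is illustrative rather than a committed API:

```c
#include <assert.h>
#include <stdint.h>

typedef struct SeqCache
{
	int64_t next;   /* next value to hand out locally */
	int64_t limit;  /* first value beyond the cached allocation */
} SeqCache;

/* Stand-in for the cluster-wide allocator: grants the range [start, start+n). */
static int64_t global_cursor = 1;

static int64_t
sam_alloc_range(int64_t n)
{
	int64_t start = global_cursor;

	global_cursor += n;
	return start;
}

/* nextval(): only contacts the allocator once every 'chunk' invocations. */
static int64_t
sam_nextval_alloc(SeqCache *cache, int64_t chunk)
{
	if (cache->next >= cache->limit)
	{
		/* local cache exhausted: fetch a new bulk of numbers */
		cache->next = sam_alloc_range(chunk);
		cache->limit = cache->next + chunk;
	}
	return cache->next++;
}

/* setval(): invalidate the node's cache and re-seed the allocator. */
static void
sam_setval(SeqCache *cache, int64_t value)
{
	global_cursor = value;
	cache->next = cache->limit = 0;	/* force a fresh allocation next time */
}
```

With a chunk size of N, only one cluster-wide round trip happens per N nextval() calls, which is the communication saving Markus describes.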

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Global Sequences

2012-10-18 Thread Simon Riggs
On 16 October 2012 18:29, Tom Lane t...@sss.pgh.pa.us wrote:

 Or maybe better, invent a level of indirection like a sequence access
 method (comparable to index access methods) that provides a compatible
 set of substitute functions for sequence operations.  If you want to
 override nextval() for a sequence, don't you likely also need to
 override setval(), currval(), etc?  Not to mention overriding ALTER
 SEQUENCE's behavior.

Agreed, though with the exact API as discussed in the portion of the thread with Markus.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility

2012-10-18 Thread Simon Riggs
On 17 October 2012 21:25, Josh Berkus j...@agliodbs.com wrote:

 It is not meant to be a full implementation of an application-level
 queuing system though, but just the capture, persisting and
 distribution parts.

 Using this as an application-level queue needs a set of interface
 functions to extract the events and also to keep track of the processed
 events. As there is no general consensus on what these should be (like
 whether processing the same event twice is allowed), this part is left
 for specific queue consumer implementations.

 Well, but AFAICT, you've already prohibited features through your design
 which are essential to application-level queues, and are implemented by,
 for example, pgQ.

 1. your design only allows the queue to be read on replicas, not on the
 node where the item was inserted.

 2. if you can't UPDATE or DELETE queue items -- or LOCK them -- how on
 earth would a client know which items they have executed and which they
 haven't?

 3. Double-down on #2 in a multithreaded environment.

It's hard to work out how to reply to this because it's just so off
base. I don't agree with the restrictions you think you see at all,
saying it politely rather than giving a one-word answer.

The problem here is that you phrase these things with too much
certainty, seeing only barriers. The "how on earth?" vibe is not
appropriate at all. It's perfectly fine to ask for answers to those
difficult questions, but don't presume that there are no answers, or
that you know with certainty that they are even hard ones. By phrasing
things in such a closed way, the only way forward is through you,
which does not help.

All we're discussing is moving a successful piece of software into
core, which has been discussed for years at the international
technical meetings we've both been present at. I think an open
viewpoint on the feasibility of that would be reasonable, especially
when it comes from one of the original designers.

I apologise for making a personal comment, but this does affect the
technical discussion.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Database object names and libpq in UTF-8 locale on Windows

2012-10-18 Thread Sebastien FLAESCH

Hello,

Using PostgreSQL 9.2.1 on Windows, I am facing a strange character set issue
with a UTF-8 database.

Maybe this is expected, but I want to be sure that I am not missing something.

On Windows, I have created a database with:

  ENCODING = 'UTF-8'
  LC_COLLATE = 'English_United States.1252'
  LC_CTYPE = 'English_United States.1252'

I am using libpq with a simple C program.
The program handles UTF-8 strings.

While it's not a traditional UNICODE Windows application using WCHAR, it
should be possible to send and get back UTF-8 data with PostgreSQL.

Interacting with the Windows system in the system locale is another story, but
from a pure C / SQL / libpq point of view, as long as the PostgreSQL client
encoding is properly defined as UTF-8, it should work, and it mostly does:

- I can use UTF-8 string constants in my queries.
- I can pass UTF-8 data to the database with parameterized queries.
- I can fetch UTF-8 data from the database.
- I can create db object names with UTF-8 characters.

But the db object names must be specified with double quotes:
When I do not use quoted db object names, I get a strange problem.
The table is created, I can use it in my program, I can even use it in the
pgAdmin query tool, but in the pgAdmin db browser, there is no table name
displayed in the treeview...

Further, when selecting schema information from pg_class, I can see that the
table exists, but there is nothing displayed in the relname attribute...

It appears that the problem disappears when using a C collation and char type:

  ENCODING = 'UTF-8'
  LC_COLLATE = 'C'
  LC_CTYPE = 'C'

I suspect this has something to do with the fact that non-quoted identifiers
are converted to lowercase, and because my LC_CTYPE is
English_United States.1252, the conversion to lowercase fails...
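This suspicion can be simulated. The sketch below is illustrative only (it is not PostgreSQL's actual folding code, and the identifier bytes are a made-up example): it folds UTF-8 bytes one at a time the way a single-byte Western ctype such as Windows-1252 would, then checks whether the result is still valid UTF-8:

```c
#include <ctype.h>
#include <stddef.h>

/* Byte-wise "tolower" as a single-byte Western locale would apply it:
 * uppercase Latin-1 letters 0xC0..0xDE (except 0xD7, the multiply sign)
 * fold to the corresponding lowercase byte at +0x20. */
static unsigned char
sb_tolower(unsigned char b)
{
	if (b >= 0xC0 && b <= 0xDE && b != 0xD7)
		return (unsigned char) (b + 0x20);
	return (unsigned char) tolower(b);
}

/* Minimal UTF-8 validity check, enough for this demonstration. */
static int
utf8_valid(const unsigned char *s, size_t n)
{
	size_t i = 0;

	while (i < n)
	{
		size_t len, j;

		if (s[i] < 0x80) len = 1;
		else if ((s[i] & 0xE0) == 0xC0) len = 2;
		else if ((s[i] & 0xF0) == 0xE0) len = 3;
		else if ((s[i] & 0xF8) == 0xF0) len = 4;
		else return 0;          /* stray continuation or invalid lead byte */
		if (i + len > n) return 0;
		for (j = 1; j < len; j++)
			if ((s[i + j] & 0xC0) != 0x80) return 0;
		i += len;
	}
	return 1;
}
```

Folding the UTF-8 identifier "Übung" (bytes C3 9C 62 75 6E 67) this way turns 0xC3 into 0xE3, so the stored name is no longer valid UTF-8, which would explain a relname that tools can no longer display.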

But:

- why does PostgreSQL accept creating the table with invalid UTF-8 characters?

- why can I make queries in the query tool with the UTF-8 table name?
(note that "show client_encoding" in the query tool gives me UNICODE - is this
the same as UTF-8?)


So is there a bug, or do I have to use a C collation and char type for UTF-8
databases on Windows?

I know that UTF-8 is not supported by the Windows C/POSIX library (setlocale,
and co).

Does it mean that it's not realistic to use UTF-8 encoding for PostgreSQL
databases on Windows...?

Thanks for reading.
Sebastien FLAESCH
Four Js Development Tools




Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility

2012-10-18 Thread Simon Riggs
On 17 October 2012 11:26, Hannu Krosing ha...@2ndquadrant.com wrote:

 LOGGED ONLY TABLE is a very technical description of the realisation -
 I'd prefer it to work as much like a table as possible, similar to how
 VIEW currently works - for all usages that make sense, you can simply
 substitute it for a TABLE.

 QUEUE emphasizes the aspect of a logged-only table that it accepts
 records in a certain order, persists these, and then guarantees
 that they can be read out in exactly the same order - all this being
 guaranteed by existing WAL mechanisms.

 It is not meant to be a full implementation of an application-level
 queuing system though, but just the capture, persisting and
 distribution parts.

 Using this as an application-level queue needs a set of interface
 functions to extract the events and also to keep track of the processed
 events. As there is no general consensus on what these should be (like
 whether processing the same event twice is allowed), this part is left
 for specific queue consumer implementations.

The two halves of the queue are the TAIL/entry point and the HEAD/exit
point. As you point out, these could be on different servers, wherever
the logical changes flow to, but could also be on the same server.
When the head and tail are on the same server, the MESSAGE QUEUE
syntax seems appropriate, but I agree that calling it that when it's
just a head or just a tail seems slightly misleading.

I guess the question is whether we provide a full implementation or
just the first half.

We do, I think, want a full queue implementation in core. We also want
to allow other queue implementations to interface with Postgres, so we
probably want to allow the first half on its own as well. Meaning we
want both head and tail separately in core code. The question is
whether we require both head and tail in core before we allow a
commit; I would say that adding the tail first is OK, and adding the
head later once we know the exact design.

Having said that, the LOGGING ONLY syntax makes me shiver. Better name?

I should also add that this is a switchable synchronous/asynchronous
transactional queue, whereas LISTEN/NOTIFY is a synchronous
transactional queue.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





[HACKERS] [BUG] False indication in pg_stat_replication.sync_state

2012-10-18 Thread Kyotaro HORIGUCHI
Hello. My colleague found that pg_stat_replication.sync_state
shows a false state under some conditions.

This prevents Pacemaker from completing a fail-over that could
safely be done.

The problem is in walsender.c, in pg_stat_get_wal_senders() below (as
of REL9_2_1):

  1555:    if (walsnd->pid != 0)
  1556:    {
  1557:        /*
  1558:         * Treat a standby such as a pg_basebackup background process
  1559:         * which always returns an invalid flush location, as an
  1560:         * asynchronous standby.
  1561:         */
! 1562:        sync_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ?
  1563:            0 : walsnd->sync_standby_priority;

Here, XLogRecPtrIsInvalid(walsnd->flush) is defined as
(walsnd->flush.xrecoff == 0), which routinely becomes true at every
WAL 'file' (not segment) boundary. xrecoff == 0 is certainly
invalid as the start point of a WAL *record*, but should be
considered valid in a replication stream. This check was introduced
in 9.2.0 and in the update from 9.1.4 to 9.1.5.

 | DEBUG:  write 4/0 flush 3/FEFFEC68 apply 3/FEFFEC68
 | DEBUG:  write 4/0 flush 4/0 apply 3/FEFFEC68
 | DEBUG:  HOGE: flush = 3/FEFFEC68 sync_priority[0] = 1
 | DEBUG:  write 4/111C0 flush 4/0 apply 3/FEFFECC0
!| DEBUG:  HOGE: flush = 4/0 sync_priority[0] = 0

This zero value of sync_priority[0] makes sync_status 'async'
erroneously and confuses Pacemaker.

# The log lines marked with 'HOGE' above are printed by applying the
# patch at the bottom of this message and invoking 'select
# sync_state from pg_stat_replication' periodically. To increase
# the chance of seeing the symptom, we sleep 1 second at 'file'
# boundaries :-)

Heikki's recent(?) commit
0ab9d1c4b31622e9176472b4276f3e9831e3d6ba, which changes the format
of XLogRecPtr from a logid:xrecoff struct to a 64-bit linear address,
would fix the false indication. But I suppose this patch won't be
applied to the existing 9.1.x and 9.2.x branches because of the
modification to the streaming protocol.

As far as I can see from the patch, it wouldn't change the meaning of
XLogRecPtr to change XLogRecPtrIsInvalid from (xrecoff == 0) to
(xrecoff == 0 && xlogid == 0). But that change affects a rather
wide portion of the WAL-handling code, whereas all that is needed here
is to stop the false indication.

On the other hand, pg_basebackup seems to return 0/0 as the flush and
apply positions, so it seems enough to add xlogid == 0 to the
condition only here. The attached patch for REL9_2_1 does this and
yields the following result.

 | DEBUG:  write 2/FEFFFD48 flush 2/FEFFFD48 apply 2/FEFF7AB0
 | DEBUG:  write 3/0 flush 2/FEFFFD48 apply 2/FEFF7E88
 | DEBUG:  write 3/0 flush 3/0 apply 2/FEFFFD48
 | DEBUG:  HOGE: flush = 2/FEFFFD48 sync_priority[0] = 1
 | DEBUG:  write 3/E338 flush 3/0 apply 2/FE80
!| DEBUG:  HOGE: flush = 3/0 sync_priority[0] = 1

I think this patch should be applied for 9.2.2 and 9.1.7.
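The difference between the 9.2.1 check and the proposed one can be sketched with the pre-9.3 two-field XLogRecPtr; old_is_invalid/fixed_is_invalid are illustrative names for the two behaviours, not the actual macros:

```c
#include <assert.h>
#include <stdint.h>

/* Pre-9.3 WAL position: a (xlogid, xrecoff) pair rather than a 64-bit value. */
typedef struct XLogRecPtr
{
	uint32_t	xlogid;		/* log file number */
	uint32_t	xrecoff;	/* byte offset within the log file */
} XLogRecPtr;

/* The 9.2.1 behaviour: any position with xrecoff == 0 counts as invalid,
 * which wrongly fires at every log-file boundary in a replication stream. */
static int
old_is_invalid(XLogRecPtr p)
{
	return p.xrecoff == 0;
}

/* The proposed behaviour: only 0/0 (the true InvalidXLogRecPtr) is invalid. */
static int
fixed_is_invalid(XLogRecPtr p)
{
	return p.xrecoff == 0 && p.xlogid == 0;
}
```

The flush position 4/0 from the log above is exactly such a boundary value: a real, flushed position that the old check misclassifies as invalid.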

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


= The patch for this test.

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 064ddd5..19f79d1 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -618,6 +618,10 @@ ProcessStandbyReplyMessage(void)
 			 reply.flush.xlogid, reply.flush.xrecoff,
 			 reply.apply.xlogid, reply.apply.xrecoff);
 
+	if (reply.write.xrecoff == 0 ||
+		reply.flush.xrecoff == 0)
+		sleep(1);
+
 	/*
 	 * Update shared state for this WalSender process based on reply data from
 	 * standby.
@@ -1561,7 +1565,10 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
 			 */
 			sync_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ?
 				0 : walsnd->sync_standby_priority;
-
+			elog(DEBUG1, "HOGE: flush = %X/%X sync_priority[%d] = %d",
+				 walsnd->flush.xlogid, walsnd->flush.xrecoff,
+				 i, sync_priority[i]);
+
 			if (walsnd->state == WALSNDSTATE_STREAMING &&
 				walsnd->sync_standby_priority > 0 &&
 				(priority == 0 ||
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 064ddd5..1d4cbc4 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1555,11 +1555,11 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
 		if (walsnd-pid != 0)
 		{
 			/*
-			 * Treat a standby such as a pg_basebackup background process
-			 * which always returns an invalid flush location, as an
+			 * Treat a standby such as a pg_basebackup background process which
+			 * always returns 0/0 (InvalidXLogRecPtr) as flush location, as an
 			 * asynchronous standby.
 			 */
-			sync_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ?
+			

Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-10-18 Thread Andres Freund
On Thursday, October 18, 2012 06:12:02 AM Kevin Grittner wrote:
 Kevin Grittner wrote:
  Hmm. The comment is probably better now, but I've been re-checking
  the code, and I think my actual code change is completely wrong.
  Give me a bit to sort this out.
 
 I'm having trouble seeing a way to make this work without rearranging
 the code for concurrent drop to get to a state where it has set
 indisvalid = false, made that visible to all processes, and ensured
 that all scans of the index are complete -- while indisready is still
 true. That is the point where TransferPredicateLocksToHeapRelation()
 could be safely called. Then we would need to set indisready = false,
 make that visible to all processes, and ensure that all access to the
 index is complete. I can't see where it works to set both flags at
 the same time. I want to sleep on it to see if I can come up with any
 other way, but right now that's the only way I'm seeing to make DROP
 INDEX CONCURRENTLY compatible with SERIALIZABLE transactions. :-(

For a nearby bug I had to restructure the code in a way that's similar
to this anyway, so that seems fine. Maybe you can fix the bug on top of
the two attached patches?

Greetings,

Andres
-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From 037daa464d3fc63dbc943b13dd90f477a4fc9aba Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 25 Sep 2012 01:41:29 +0200
Subject: [PATCH 1/2] Fix concurrency issues in concurrent index drops

Previously a DROP INDEX CONCURRENTLY started by unsetting indisvalid *and*
indisready. That's problematic if some transaction is still looking at the index
while another transaction makes changes. See the example below.

Now we do the drop in three stages, mirroring a concurrent index build. First
unset indisvalid, wait, then unset indisready, wait, then drop the index.

Example:

Session 1:
CREATE TABLE test_drop_concurrently(id serial primary key, data int);
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10);
CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
BEGIN;
EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)

Session 2:
BEGIN;
SELECT * FROM test_drop_concurrently WHERE data = 34343;

Session 3:
DROP INDEX CONCURRENTLY test_drop_concurrently_data;
(in-progress)

Session 2:
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10);
COMMIT;

Session 1:
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)
SET enable_bitmapscan = false;
SET enable_indexscan = false;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(2 rows)
---
 src/backend/catalog/index.c |   99 ---
 1 file changed, 84 insertions(+), 15 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 464950b..b39536e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1316,6 +1316,10 @@ index_drop(Oid indexId, bool concurrent)
 	 * table lock strong enough to prevent all queries on the table from
 	 * proceeding until we commit and send out a shared-cache-inval notice
 	 * that will make them update their index lists.
+	 *
+	 * In the concurrent case we make sure that nobody can be looking at the
+	 * indexes by dropping the index in multiple steps, so we don't need a
+	 * full-fledged AccessExclusiveLock yet.
 	 */
 	heapId = IndexGetRelation(indexId, false);
 	if (concurrent)
@@ -1336,7 +1340,19 @@ index_drop(Oid indexId, bool concurrent)
 
 	/*
 	 * Drop Index concurrently is similar in many ways to creating an index
-	 * concurrently, so some actions are similar to DefineIndex()
+	 * concurrently, so some actions are similar to DefineIndex() just in the
+	 * reverse order.
+	 *
+	 * First we unset indisvalid so queries starting afterwards don't use the
+	 * index to answer queries anymore. We have to keep indisready = true
+	 * though so transactions that are still using the index can continue to
+	 * see valid index contents. E.g. when they are using READ COMMITTED mode,
+	 * and another transaction that started later makes changes and commits,
+	 * they need to see those new tuples in the index.
+	 *
+	 * After all transactions that could possibly have used it for queries have
+	 * ended, we can unset indisready and wait till nobody could be updating it
+	 * anymore.
 	 */
 	if (concurrent)
 	{
@@ -1355,21 +1371,14 @@ index_drop(Oid indexId, bool concurrent)
 			elog(ERROR, cache lookup failed for index %u, indexId);
 		indexForm = (Form_pg_index) GETSTRUCT(tuple);
 
-		indexForm-indisvalid = false;	/* make unusable for queries */
-		indexForm-indisready = false;	/* make invisible to changes */
+		indexForm-indisvalid = false;	/* make unusable for new queries */
+		/* we keep indisready == true so it still gets updated */
 
 		

Re: [HACKERS] [PATCH] pg_dump: Sort overloaded functions in deterministic order

2012-10-18 Thread Joel Jacobson
On Wed, Oct 17, 2012 at 11:43 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Uh, the patch you posted keeps the pg_get_function_identity_arguments
 call in dumpFunc, but there is now also a new one in getFuncs.  Do we
 need to remove the second one?

It could be done, but unfortunately we cannot use the value computed in
dumpFunc(), because getFuncs() is called before dumpFunc().

The patch currently only affects getFuncs(); it doesn't touch dumpFunc().

What could be done is to keep the changes in getFuncs() and also change
dumpFunc() to use the value computed in getFuncs(), but I think the gain
is small in relation to the complexity of changing dumpFunc(), as we
would still need to make the two other function calls in the SQL query
in dumpFunc(), to pg_get_function_arguments() and
pg_get_function_result().


 Here's an updated patch for your consideration.  I was about to push
 this when I noticed the above.  The only change here is that the extra
 code that tests for new remoteVersions in the second else if branch of
 getFuncs and getAggregates has been removed, since it cannot ever be
 reached.

Looks really good.




Re: [HACKERS] Global Sequences

2012-10-18 Thread Dimitri Fontaine
Simon Riggs si...@2ndquadrant.com writes:
 Not sure how it is cleaner when we have to have trigger stuff hanging
 around to make one object pretend to be another. That also creates a
 chain of dependency which puts this into the future, rather than now.

Yes, that part isn't cleaner at all. The part where we have a separate
object to deal with I like better, and I tried to reconcile the two
viewpoints.

Note that the event trigger would come installed and disabled; the user
would only have to activate it:

   ALTER EVENT TRIGGER distribute_my_cluster ENABLE;

Still the same issue.

 The goal is make-sequences-work, not to invent something new that
 might be cooler or more useful. If we create something new, then we
 need to consider the references Daniel described, but that is a whole
 different thing and already accessible if you need/want that.

So ok, I withdraw my consensus proposal. I tried.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




[HACKERS] Deprecations in authentication

2012-10-18 Thread Magnus Hagander
Since Simon stirred up a hornet's nest by suggesting deprecation of a
number of features, I figured I'd take it one step further and suggest
removal of some previously deprecated features :)

In particular, we made a couple of changes several releases back
in the authentication config that we should perhaps consider
finishing by removing the old stuff now.

1. krb5 authentication. We've had gssapi since 8.3 (which means in all
supported versions). krb5 has been deprecated, also since 8.3. Time to
remove it?

2. ident-over-unix-sockets was renamed to peer in 9.1, with the old
syntax deprecated but still mapping to the new one. Has it been there
long enough that we should start throwing an error for ident on unix?
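For reference, the 9.1 rename means these two pg_hba.conf entries currently behave identically for Unix-socket connections (illustrative lines, not taken from any real configuration):

```
# TYPE  DATABASE  USER  METHOD
local   all       all   ident   # pre-9.1 spelling, deprecated on sockets
local   all       all   peer    # the 9.1+ spelling of the same thing
```

Removing the old mapping would presumably turn the first line into an error at server startup.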


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] Deprecations in authentication

2012-10-18 Thread Simon Riggs
On 18 October 2012 12:20, Magnus Hagander mag...@hagander.net wrote:

 2. ident-over-unix-sockets was renamed to peer in 9.1, with the old
 syntax deprecated but still mapping to the new one. Has it been there
 long enough that we should start throwing an error for ident on unix?

Any reason to remove? Having two names for the same thing is a happy
place for users with bad/fond memories. It costs little, and no errors
are associated with using the old name (are there?).

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Deprecations in authentication

2012-10-18 Thread Magnus Hagander
On Thu, Oct 18, 2012 at 1:32 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 18 October 2012 12:20, Magnus Hagander mag...@hagander.net wrote:

 2. ident-over-unix-sockets was renamed to peer in 9.1, with the old
 syntax deprecated but still mapping to the new one. Has it been there
 long enough that we should start throwing an error for ident on unix?

 Any reason to remove? Having two names for same thing is a happy place
 for users with bad/fond memories. It costs little and no errors are
 associated with using the old name (are there?).

The only real reason for that one would be confusion, e.g. using ident
over TCP is for most people very insecure, whereas ident over Unix
sockets is very secure. There are exceptions to both of those, but for
the majority of cases we are using the same name for one thing that
has very good security and one that has very bad. And confusion when
it comes to security is usually not a good thing.

The krb5 one is more about maintaining code, but there is not much
cost to keeping ident-over-unix, that's true.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] Deprecations in authentication

2012-10-18 Thread Simon Riggs
On 18 October 2012 12:37, Magnus Hagander mag...@hagander.net wrote:
 On Thu, Oct 18, 2012 at 1:32 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 18 October 2012 12:20, Magnus Hagander mag...@hagander.net wrote:

 2. ident-over-unix-sockets was renamed to peer in 9.1, with the old
 syntax deprecated but still mapping to the new one. Has it been there
 long enough that we should start throwing an error for ident on unix?

 Any reason to remove? Having two names for the same thing is a happy place
 for users with bad/fond memories. It costs little, and no errors are
 associated with using the old name (are there?).

 The only real reason for that one would be confusion. E.g. using ident
 over TCP is for most people very insecure, whereas ident over Unix
 sockets is very secure. There are exceptions to both of those, but in
 the majority of cases we are using the same name for one thing that
 has very good security and one that has very bad security. And confusion
 when it comes to security is usually not a good thing.

I'll go with that.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Deprecations in authentication

2012-10-18 Thread Simon Riggs
On 18 October 2012 12:20, Magnus Hagander mag...@hagander.net wrote:

 Since Simon stirred up a hornet's nest suggesting deprecation of a
 number of features, I figured I'd take it one step further and suggest
 removal of some previously deprecated features :)

I'm laughing at the analogy that angry and unintelligent agents
responded to my proposals, but there was no stirring action from me.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Re: DROP INDEX CONCURRENTLY is not really concurrency safe & leaves around undroppable indexes

2012-10-18 Thread Abhijit Menon-Sen
At 2012-09-25 01:46:18 +0200, and...@2ndquadrant.com wrote:

 The attached patch fixes this issue. Haven't looked at the other one
 in detail yet.

Here are tests for both bugs. They currently fail with HEAD.

Note that the first test now uses PREPARE instead of the SELECTs in the
original example, which no longer works after commit #96cc18 because of
the re-added AcceptInvalidationMessages calls (archaeology by Andres).

-- Abhijit
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 75e33bc..d964aaf 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -14,3 +14,5 @@ test: fk-contention
 test: fk-deadlock
 test: fk-deadlock2
 test: eval-plan-qual
+test: drop-index-concurrently-1
+test: drop-index-concurrently-2

diff --git a/src/test/isolation/specs/drop-index-concurrently-1.spec b/src/test/isolation/specs/drop-index-concurrently-1.spec
new file mode 100644
index 000..83c44ab
--- /dev/null
+++ b/src/test/isolation/specs/drop-index-concurrently-1.spec
@@ -0,0 +1,33 @@
+setup
+{
+	CREATE TABLE test_dc(id serial primary key, data int);
+	INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100);
+	CREATE INDEX test_dc_data ON test_dc(data);
+}
+
+teardown
+{
+	DROP TABLE test_dc;
+}
+
+session "s1"
+step "noseq" { SET enable_seqscan = false; }
+step "prepi" { PREPARE getrow_idx AS SELECT * FROM test_dc WHERE data=34; }
+step "preps" { PREPARE getrow_seq AS SELECT * FROM test_dc WHERE data::text=34::text; }
+step "begin" { BEGIN; }
+step "explaini" { EXPLAIN (COSTS OFF) EXECUTE getrow_idx; }
+step "explains" { EXPLAIN (COSTS OFF) EXECUTE getrow_seq; }
+step "selecti" { EXECUTE getrow_idx; }
+step "selects" { EXECUTE getrow_seq; }
+step "end" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "select2" { SELECT * FROM test_dc WHERE data=34; }
+step "insert2" { INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100); }
+step "end2" { COMMIT; }
+
+session "s3"
+step "drop" { DROP INDEX CONCURRENTLY test_dc_data; }
+
+permutation "noseq" "prepi" "preps" "begin" "explaini" "explains" "select2" "drop" "insert2" "end2" "selecti" "selects" "end"

diff --git a/src/test/isolation/expected/drop-index-concurrently-1.out b/src/test/isolation/expected/drop-index-concurrently-1.out
new file mode 100644
index 000..c42ac91
--- /dev/null
+++ b/src/test/isolation/expected/drop-index-concurrently-1.out
@@ -0,0 +1,36 @@
+Parsed test spec with 3 sessions
+
+starting permutation: noseq prepi preps begin explaini explains select2 drop insert2 end2 selecti selects end
+step noseq: SET enable_seqscan = false;
+step prepi: PREPARE getrow_idx AS SELECT * FROM test_dc WHERE data=34;
+step preps: PREPARE getrow_seq AS SELECT * FROM test_dc WHERE data::text=34::text;
+step begin: BEGIN;
+step explaini: EXPLAIN (COSTS OFF) EXECUTE getrow_idx;
+QUERY PLAN 
+
+Index Scan using test_dc_data on test_dc
+  Index Cond: (data = 34)
+step explains: EXPLAIN (COSTS OFF) EXECUTE getrow_seq;
+QUERY PLAN 
+
+Seq Scan on test_dc
+  Filter: ((data)::text = '34'::text)
+step select2: SELECT * FROM test_dc WHERE data=34;
+id data   
+
+34 34 
+step drop: DROP INDEX CONCURRENTLY test_dc_data; waiting ...
+step insert2: INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100);
+step end2: COMMIT;
+step selecti: EXECUTE getrow_idx;
+id data   
+
+34 34 
+134 34 
+step selects: EXECUTE getrow_seq;
+id data   
+
+34 34 
+134 34 
+step end: COMMIT;
+step drop: ... completed

diff --git a/src/test/isolation/specs/drop-index-concurrently-2.spec b/src/test/isolation/specs/drop-index-concurrently-2.spec
new file mode 100644
index 000..273f735
--- /dev/null
+++ b/src/test/isolation/specs/drop-index-concurrently-2.spec
@@ -0,0 +1,21 @@
+setup
+{
+	CREATE TABLE test_dc(id serial primary key, data int);
+	CREATE INDEX test_dc_data ON test_dc(data);
+}
+
+session "s1"
+setup { BEGIN; }
+step "explain" { EXPLAIN (COSTS OFF) SELECT * FROM test_dc WHERE data=34343; }
+step "rollback" { ROLLBACK; }
+step "droptab" { DROP TABLE test_dc; }
+step "selecti" { SELECT indexrelid::regclass, indisvalid, indisready FROM pg_index WHERE indexrelid = 'test_dc_data'::regclass; }
+step "dropi" { DROP INDEX test_dc_data; }
+
+session "s2"
+step "drop" { DROP INDEX CONCURRENTLY test_dc_data; }
+
+session "s3"
+step "cancel" { SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE query = 'DROP INDEX CONCURRENTLY test_dc_data;'; }
+
+permutation "explain" "drop" "cancel" "rollback" "droptab" "selecti" "dropi"

diff --git a/src/test/isolation/expected/drop-index-concurrently-2.out b/src/test/isolation/expected/drop-index-concurrently-2.out
new file mode 100644
index 000..4802777
--- /dev/null
+++ b/src/test/isolation/expected/drop-index-concurrently-2.out
@@ -0,0 +1,24 @@
+Parsed test spec with 3 sessions
+
+starting permutation: explain drop cancel rollback droptab 

Re: [HACKERS] Re: DROP INDEX CONCURRENTLY is not really concurrency safe & leaves around undroppable indexes

2012-10-18 Thread Simon Riggs
On 18 October 2012 12:56, Abhijit Menon-Sen a...@2ndquadrant.com wrote:
 At 2012-09-25 01:46:18 +0200, and...@2ndquadrant.com wrote:

 The attached patch fixes this issue. Haven't looked at the other one
 in detail yet.

 Here are tests for both bugs. They currently fail with HEAD.

 Note that the first test now uses PREPARE instead of the SELECTs in the
 original example, which no longer works after commit #96cc18 because of
 the re-added AcceptInvalidationMessages calls (archaeology by Andres).

Thanks, I'll apply these now.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-10-18 Thread Kevin Grittner
Andres Freund wrote:
 On Thursday, October 18, 2012 06:12:02 AM Kevin Grittner wrote:
 
 I'm having trouble seeing a way to make this work without
 rearranging the code for concurrent drop to get to a state where
 it has set indisvalid = false, made that visible to all processes,
 and ensured that all scans of the index are complete -- while
 indisready is still true. That is the point where
 TransferPredicateLocksToHeapRelation() could be safely called.
 Then we would need to set indisready = false, make that visible to
 all processes, and ensure that all access to the index is
 complete. I can't see where it works to set both flags at the same
 time.

 In a nearby bug I had to restructure that code in a way that's
 similar to this anyway, so that seems fine. Maybe you can fix the
 bug on top of the two attached patches?

Perfect; these two patches provide a spot in the code which is
exactly right for handling the predicate lock adjustments. Attached
is a patch which applies on top of the two you sent.

Thanks!

-Kevin
*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
***
*** 1320,1325  index_drop(Oid indexId, bool concurrent)
--- 1320,1337 
  	 * In the concurrent case we make sure that nobody can be looking at the
  	 * indexes by dropping the index in multiple steps, so we don't need a full
  	 * fledged AccessExlusiveLock yet.
+ 	 *
+ 	 * All predicate locks on the index are about to be made invalid. Promote
+ 	 * them to relation locks on the heap. For correctness the index must not
+ 	 * be seen with indisvalid = true during query planning after the move
+ 	 * starts, so that the index will not be used for a scan after the
+ 	 * predicate lock move, as this could create new predicate locks on the
+ 	 * index which would not ensure a heap relation lock. Also, the index must
+ 	 * not be seen during execution of a heap tuple insert with indisready =
+ 	 * false before the move is complete, since the conflict with the
+ 	 * predicate lock on the index gap could be missed before the lock on the
+ 	 * heap relation is in place to detect a conflict based on the heap tuple
+ 	 * insert.
  	 */
  	heapId = IndexGetRelation(indexId, false);
  	if (concurrent)
***
*** 1439,1444  index_drop(Oid indexId, bool concurrent)
--- 1451,1464 
  		}
  
  		/*
+ 		 * No more predicate locks will be acquired on this index, and we're
+ 		 * about to stop doing inserts into the index which could show
+ 		 * conflicts with existing predicate locks, so now is the time to move
+ 		 * them to the heap relation.
+ 		 */
+ 		TransferPredicateLocksToHeapRelation(userIndexRelation);
+ 
+ 		/*
  		 * now we are sure that nobody uses the index for queries, they just
  		 * might have it opened for updating it. So now we can unset
  		 * ->indisready and wait till nobody could update the index anymore.
***
*** 1507,1518  index_drop(Oid indexId, bool concurrent)
  		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
  		userIndexRelation = index_open(indexId, AccessExclusiveLock);
  	}
! 
! 	/*
! 	 * All predicate locks on the index are about to be made invalid. Promote
! 	 * them to relation locks on the heap.
! 	 */
! 	TransferPredicateLocksToHeapRelation(userIndexRelation);
  
  	/*
  	 * Schedule physical removal of the files
--- 1527,1534 
  		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
  		userIndexRelation = index_open(indexId, AccessExclusiveLock);
  	}
! 	else
! 		TransferPredicateLocksToHeapRelation(userIndexRelation);
  
  	/*
  	 * Schedule physical removal of the files



Re: [HACKERS] Deprecations in authentication

2012-10-18 Thread Alvaro Herrera
Simon Riggs wrote:
 On 18 October 2012 12:20, Magnus Hagander mag...@hagander.net wrote:
 
  Since Simon stirred up a hornet's nest suggesting deprecation of a
  number of features, I figured I'd take it one step further and suggest
  removal of some previously deprecated features :)
 
 I'm laughing at the analogy that angry and unintelligent agents
 responded to my proposals, but there was no stirring action from me.

We may all be stupid individually, but it's the swarm that matters.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Deprecating RULES

2012-10-18 Thread Andrew Dunstan


On 10/17/2012 07:25 PM, Tom Lane wrote:



I'm fairly annoyed by the entire tenor of this conversation, because
the people who are hollering the loudest seem to be people who have
never actually touched any of the rules code, but nonetheless seem
prepared to tell those of us who have what to spend our time on.


+1

I too have been quite annoyed.



Now having said that, I would definitely like to see rules in their
current form go away eventually.  But not without a substitute.
Triggers are not a complete replacement, and no amount of wishful
thinking makes them so.

Perhaps it would be more profitable to try to identify the pain points
that make people so eager to get rid of rules, and then see if we could
alleviate them.  One big problem I know about offhand is the
multiple-evaluation risk, which seems at least in principle fixable.
What others are there?





Yeah. That's by far the best approach. It has the merit of being 
positive rather than just taking something away that people do use, even 
if it's only a relatively small number of users.


The biggest pain people have mentioned is that they don't work with 
COPY.  I am in fact about to start working on a project which will 
probably alleviate that pain point. I'm not going to say much more, and 
I would not have said anything right now except that there is this 
sudden rush to deprecate rules, or announce a future removal of the 
feature. However, I hope to have a proposal to put to the community by 
about the end of November.


On that point, it's also worth noting that FDWs provide a nice 
workaround, instead of doing a straight


COPY mytable FROM 'myfile'

you set up the source as a foreign table using file_fdw or my 
file_text_array_fdw, and then do


INSERT INTO mytable
SELECT ...
FROM my_foreign_table;

Maybe this too would be worth mentioning in the docs, maybe in the rules 
section with an xref from the copy section.


So, please, hold off for little bit. I don't mind putting warnings in 
the docs, but I'd really rather we waited on any announcement of a 
future possible deprecation, or log warnings that using rules will cause 
zombies to eat your brainz.


cheers

andrew




Re: [HACKERS] [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)

2012-10-18 Thread Peter Geoghegan
On 16 October 2012 15:26, Jan Wieck janwi...@yahoo.com wrote:
 This means that the transition time from the existing, trigger based
 approach to the new WAL based mechanism will see both technologies in
 parallel, which is no small thing to support.

So, you're talking about a shim between the two in order to usefully
support inter-version replication, or are you just thinking about
making a clean break in compatibility for Postgres versions prior to
9.3 in a new release branch?

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] hash_search and out of memory

2012-10-18 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 If OOM happens during expand_table() in hash_search_with_hash_value()
 for RelationCacheInsert,

What OOM?  expand_table is supposed to return without doing anything
if it can't expand the table.  If that's not happening, that's a bug
in the hash code.

regards, tom lane




Re: [HACKERS] September 2012 commitfest

2012-10-18 Thread Alvaro Herrera
Amit Kapila wrote:

 For the Patch, Trim trailing NULL columns, I have provided the performance
 data required
 and completed the review. There are only few review comments which can be
 addressed.
 So is it possible that I complete them and mark it as Ready For Committer
 or what else can be the way to proceed for this patch
 if author doesn't respond.

Sure, you can do that.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] [BUG] False indication in pg_stat_replication.sync_state

2012-10-18 Thread Fujii Masao
On Thu, Oct 18, 2012 at 5:42 PM, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp wrote:
 Hello. My colleague found that pg_stat_replication.sync_state
 shows false state for some condition.

 This prevents Pacemaker from completing fail-over that could
 safely be done.

 The point is in walsender.c, pg_stat_get_wal_senders() below, (as
 of REL9_2_1)

  1555:if (walsnd->pid != 0)
   1556:{
   1557:  /*
   1558:   * Treat a standby such as a pg_basebackup background process
   1559:   * which always returns an invalid flush location, as an
   1560:   * asynchronous standby.
   1561:   */
! 1562:   sync_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ?
  1563:  0 : walsnd->sync_standby_priority;

 Here, XLogRecPtrIsInvalid(walsnd->flush) is defined as
 (walsnd->flush.xrecoff == 0), which routinely becomes true at every
 WAL 'file' (not segment) boundary. xrecoff == 0 is certainly
 invalid as the start point of a WAL *RECORD*, but should be
 considered valid in a replication stream. This check was introduced
 in 9.2.0 and in the minor-version upgrade from 9.1.4 to 9.1.5.

  | DEBUG:  write 4/0 flush 3/FEFFEC68 apply 3/FEFFEC68
  | DEBUG:  write 4/0 flush 4/0 apply 3/FEFFEC68
  | DEBUG:  HOGE: flush = 3/FEFFEC68 sync_priority[0] = 1
  | DEBUG:  write 4/111C0 flush 4/0 apply 3/FEFFECC0
 !| DEBUG:  HOGE: flush = 4/0 sync_priority[0] = 0

 This value zero of sync_priority[0] erroneously makes sync_status
 'async' and confuses Pacemaker.

 # The log line marked with 'HOGE' above printed by applying the
 # patch at the bottom of this message and invoking 'select
 # sync_state from pg_stat_replication' periodically. To increase
 # the chance to see the symptom, sleep 1 second for 'file'
 # boundaries :-)

 The Heikki's recent(?) commit
 0ab9d1c4b31622e9176472b4276f3e9831e3d6ba which changes the format
 of XLogRecPtr from logid:xrecoff struct to 64 bit linear address
 would fix the false indication. But I suppose this patch won't be
 applied to existing 9.1.x and 9.2.x because of the modification
 onto streaming protocol.

 As far as I can see from the patch, changing XLogRecPtrIsInvalid
 from (xrecoff == 0) to (xrecoff == 0 && xlogid == 0) wouldn't change
 the meaning of XLogRecPtr. But that change would affect a rather
 wide portion of the WAL-handling code, whereas all that is needed
 here is to stop the false indication.

 On the other hand, pg_basebackup seems to return 0/0 as its flush and
 apply positions, so it seems enough to just add xlogid == 0 to
 the condition. The attached patch for REL9_2_1 does this and
 yields the following result.

  | DEBUG:  write 2/FEFFFD48 flush 2/FEFFFD48 apply 2/FEFF7AB0
  | DEBUG:  write 3/0 flush 2/FEFFFD48 apply 2/FEFF7E88
  | DEBUG:  write 3/0 flush 3/0 apply 2/FEFFFD48
  | DEBUG:  HOGE: flush = 2/FEFFFD48 sync_priority[0] = 1
  | DEBUG:  write 3/E338 flush 3/0 apply 2/FE80
 !| DEBUG:  HOGE: flush = 3/0 sync_priority[0] = 1

 I think this patch should be applied for 9.2.2 and 9.1.7.

Looks good to me, though I don't think the source code comment needs
to be updated in the way the patch does.

Regards,

-- 
Fujii Masao




Re: [HACKERS] First draft of snapshot snapshot building design document

2012-10-18 Thread Robert Haas
On Tue, Oct 16, 2012 at 7:30 AM, Andres Freund and...@2ndquadrant.com wrote:
 On Thursday, October 11, 2012 01:02:26 AM Peter Geoghegan wrote:
 The design document [2] really just explains the problem (which is the
 need for catalog metadata at a point in time to make sense of heap
 tuples), without describing the solution that this patch offers with
 any degree of detail. Rather, [2] says How we build snapshots is
 somewhat intricate and complicated and seems to be out of scope for
 this document, which is unsatisfactory. I look forward to reading the
 promised document that describes this mechanism in more detail.

 Here's the first version of the promised document. I hope it answers most of
 the questions.

 Input welcome!

I haven't grokked all of this in its entirety, but I'm kind of
uncomfortable with the relfilenode -> OID mapping stuff.  I'm
wondering if we should, when logical replication is enabled, find a
way to cram the table OID into the XLOG record.  It seems like that
would simplify things.

If we don't choose to do that, it's worth noting that you actually
need 16 bytes of data to generate a unique identifier for a relation,
as in database OID + tablespace OID + relfilenode# + backend ID.
Backend ID might be ignorable because WAL-based logical replication is
going to ignore temporary relations anyway, but you definitely need
the other two.  There's nothing, for example, to keep you from having
two relations with the same value in pg_class.relfilenode in the same
database but in different tablespaces.  It's unlikely to happen,
because for new relations we set OID = relfilenode, but a subsequent
rewrite can bring it about if the stars align just right.  (Such
situations are, of course, a breeding ground for bugs, which might
make you question whether our current scheme for assigning
relfilenodes has much of anything to recommend it.)

Another thing to think about is that, like catalog snapshots,
relfilenode mappings have to be time-relativized; that is, you need to
know what the mapping was at the proper point in the WAL sequence, not
what it is now.  In practice, the risk here seems to be minimal,
because it takes a while to churn through 4 billion OIDs.  However, I
suspect it pays to think about this fairly carefully because if we do
ever run into a situation where the OID counter wraps during a time
period comparable to the replication lag, the bugs will be extremely
difficult to debug.

Anyhow, adding the table OID to the WAL header would chew up a few
more bytes of WAL space, but it seems like it might be worth it to
avoid having to think very hard about all of these issues.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Deprecations in authentication

2012-10-18 Thread Simon Riggs
On 18 October 2012 12:43, Simon Riggs si...@2ndquadrant.com wrote:
 On 18 October 2012 12:20, Magnus Hagander mag...@hagander.net wrote:

 Since Simon stirred up a hornet's nest suggesting deprecation of a
 number of features, I figured I'd take it one step further and suggest
 removal of some previously deprecated features :)

 I'm laughing at the analogy that angry and unintelligent agents
 responded to my proposals, but there was no stirring action from me.

Hmm, this looks like a stirring action in itself, so I withdraw and apologise.

You are right that some people are angry and so IMHO it was wrong of
me to try to joke about that. My point was only that I had acted in
good faith, rather than to deliberately cause annoyance.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Global Sequences

2012-10-18 Thread Robert Haas
On Tue, Oct 16, 2012 at 1:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 In particular, the reason proposing a hook first seems backwards is that
 if we have a catalog-level representation that some sequences are local
 and others not, we should be using that to drive the determination of
 whether to call a substitute function --- and maybe which one to call.
 For instance, I could see attaching a function OID to each sequence
 and then having nextval() call that function, instead of a hook per se.

Yeah, I like that.  That makes it easy to configure your database so
that some sequences have special behavior (which the database designer
can set up however they like) and others can be just vanilla, and the
plugin doesn't have to try to figure out which ones are which (which
was my first concern in reading Simon's original proposal).  To make
it even better, add some generic options that can be passed through to
the underlying handler.

So something like:

ALTER SEQUENCE wump
SET HANDLER (nextval my_magical_nextval, setval my_magical_setval)
OPTIONS (any_label_you_want_the_handlers_to_get
'some_text_associated_with_the_label', another_label
'some_more_text');

That way you could say, for example, that sequence wump should get its
values from coordinator node 172.24.16.93 and that the global
identifier for this sequence is UUID
e15ea6e6-43d5-4f65-8efd-cf28a14a2d70.  That way you can avoid having
to make any assumptions about how local sequence names on particular
nodes are mapped onto global names.

 Or maybe better, invent a level of indirection like a sequence access
 method (comparable to index access methods) that provides a compatible
 set of substitute functions for sequence operations.  If you want to
 override nextval() for a sequence, don't you likely also need to
 override setval(), currval(), etc?  Not to mention overriding ALTER
 SEQUENCE's behavior.

This might be better, but it's also possibly more mechanism than we
truly need here.  But then again, if we're going to end up with more
than a handful of handlers, we probably do want to do this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Global Sequences

2012-10-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Oct 16, 2012 at 1:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Or maybe better, invent a level of indirection like a sequence access
 method (comparable to index access methods) that provides a compatible
 set of substitute functions for sequence operations.  If you want to
 override nextval() for a sequence, don't you likely also need to
 override setval(), currval(), etc?  Not to mention overriding ALTER
 SEQUENCE's behavior.

 This might be better, but it's also possibly more mechanism than we
 truly need here.  But then again, if we're going to end up with more
 than a handful of handlers, we probably do want to do this.

It's definitely a lot of mechanism, and if we can get away with
something simpler that's fine with me.  But I'd want to see a pretty
bulletproof argument why overriding *only* nextval is sufficient
(and always will be) before accepting a hook for just nextval.  If we
build an equivalent amount of functionality piecemeal it's going to
be a lot uglier than if we recognize we need this type of concept
up front.

regards, tom lane




Re: [HACKERS] [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)

2012-10-18 Thread Christopher Browne
On Thu, Oct 18, 2012 at 9:49 AM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 16 October 2012 15:26, Jan Wieck janwi...@yahoo.com wrote:
 This means that the transition time from the existing, trigger based
 approach to the new WAL based mechanism will see both technologies in
 parallel, which is no small thing to support.

 So, you're talking about a shim between the two in order to usefully
 support inter-version replication, or are you just thinking about
 making a clean break in compatibility for Postgres versions prior to
 9.3 in a new release branch?

It's too early to assume either.

In Slony 2.0, we accepted that we were breaking compatibility with
versions of Postgres before 8.3; we accepted that because there were
considerable 'manageability' benefits (e.g. - system catalogues no
longer hacked up, so pg_dump works against all nodes, and some
dramatically reduced locking).

But that had the attendant cost that we have had to continue fixing
bugs on 1.2, to a degree, even until now, because people on Postgres
versions earlier than 8.3 have no way to use version 2.0.

Those merits and demerits apply pretty clearly to this.

It would be somewhat attractive for a version 2.3 (or, more likely,
3.0, to indicate the break from earlier versions) to make the clean
break to the new-in-PG-9.3 facilities.  It is attractive in that we
could:
a) Safely remove the trigger-based log capture apparatus (or, at
least, I'm assuming so), and
b) Consciously upgrade to take advantage of all the latest cool stuff
found in Postgres 9.3.  (I haven't got any particular features in
mind; perhaps we add RANGE comparators for xid to 9.3, and make
extensive use of xid_range types?  That would be something that
couldn't reasonably get hacked to work in anything before 9.2...)
c) Drop out any special cases having to do with support of versions
8.3, 8.4, 9.0, 9.1, and 9.2.

But, of course, we'd be leaving everyone running 8.3 thru 9.2 behind,
if we did so, and would correspondingly shackle ourselves to the need to
support the 2.x branches for still longer.  And this would mean that
this Slony 3.0 would expressly NOT support one of our intended use
cases, namely to support upgrading from elder versions of Postgres.

A shim adds complexity, but retains the upgrade across versions
use case, and reduces the need to keep supporting elder versions of
Slony.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"




Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-10-18 Thread Fujii Masao
On Wed, Oct 17, 2012 at 8:46 PM, Amit Kapila amit.kap...@huawei.com wrote:
 On Monday, October 15, 2012 3:43 PM Heikki Linnakangas wrote:
 On 13.10.2012 19:35, Fujii Masao wrote:
  On Thu, Oct 11, 2012 at 11:52 PM, Heikki Linnakangas
  hlinnakan...@vmware.com  wrote:
  Ok, thanks. Committed.
 
  I found one typo. The attached patch fixes that typo.

 Thanks, fixed.

  ISTM you need to update the protocol.sgml because you added
  the field 'replyRequested' to WalSndrMessage and StandbyReplyMessage.



  Is it worth adding the same mechanism (send back the reply immediately
  if walsender request a reply) into pg_basebackup and pg_receivexlog?

 Good catch. Yes, they should be taught about this too. I'll look into
 doing that too.

 If you have not started and you don't have objection, I can pickup this to
 complete it.

 For both (pg_basebackup and pg_receivexlog), we need to get a timeout
 parameter from the user on the command line, as
 there is no conf file here. The new option can be -t (parameter name can be
 recvtimeout).

 The main changes will be in function ReceiveXlogStream(); it is a common
 function for both
 pg_basebackup and pg_receivexlog. Handling will be done in the same way as we
 have done in walreceiver.

 Suggestions/Comments?

Before implementing the timeout parameter, I think that it's better to change
both the pg_basebackup background process and pg_receivexlog so that they
send back the reply message immediately when they receive a keepalive
message requesting a reply. Currently, they always ignore such keepalive
messages, so the status interval parameter (-s) must always be set to
a value less than the replication timeout. We can avoid this troublesome
parameter setting by introducing the same logic of walreceiver into both
the pg_basebackup background process and pg_receivexlog.

Regards,

-- 
Fujii Masao




Re: [HACKERS] [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)

2012-10-18 Thread Peter Geoghegan
On 18 October 2012 16:18, Christopher Browne cbbro...@gmail.com wrote:
 A shim adds complexity, but retains the "upgrade across versions"
 use case, and reduces the need to keep supporting elder versions of
 Slony.

Right. Upgrading across major versions is likely to continue to remain
a very important use-case for Slony.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] [PATCH] pg_dump: Sort overloaded functions in deterministic order

2012-10-18 Thread Alvaro Herrera
Joel Jacobson wrote:
 On Wed, Oct 17, 2012 at 11:43 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  Uh, the patch you posted keeps the pg_get_function_identity_arguments
  call in dumpFunc, but there is now also a new one in getFuncs.  Do we
  need to remove the second one?
 
 It could be done, but unfortunately we cannot use the value computed
 in dumpFunc(),
 because getFuncs() is called before dumpFunc().

Right, I got that from the discussion.

 What could be done is to keep the changes in getFuncs(), and also
 change dumpFunc()
 to use the value computed in getFuncs(), but I think the gain is small
 in relation
 to the complexity of changing dumpFunc(), as we would still need to
 make the two other
 function calls in the SQL query in dumpFunc() to pg_get_function_arguments() 
 and
 pg_get_function_result().

Changing pg_dump is complex enough whatever the change, yes.  I have not
touched this.

  Here's an updated patch for your consideration.  I was about to push
  this when I noticed the above.  The only change here is that the extra
  code that tests for new remoteVersions in the second else if branch of
  getFuncs and getAggregates has been removed, since it cannot ever be
  reached.
 
 Looks really good.

Thanks, pushed it.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Review for pg_dump: Sort overloaded functions in deterministic order

2012-10-18 Thread Alvaro Herrera
Joel Jacobson wrote:
 Hi Joachim,
 
 Attached, please find new patch. Test unchanged.

This was committed, as discussed in the original patch's thread.

It would be great if reviewers could reply to the email that submits the
patch, instead of creating a thread of their own.  It helps keep things
better organized.

Thanks for the review.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] hash_search and out of memory

2012-10-18 Thread Tom Lane
I wrote:
 Hitoshi Harada umi.tan...@gmail.com writes:
 If OOM happens during expand_table() in hash_search_with_hash_value()
 for RelationCacheInsert,

 What OOM?  expand_table is supposed to return without doing anything
 if it can't expand the table.  If that's not happening, that's a bug
 in the hash code.

Oh, wait, I take that back --- the palloc-based allocator does throw
errors.  I think that when that was designed, we were thinking that
palloc-based hash tables would be thrown away anyway after an error,
but of course that's not true for long-lived tables such as the relcache
hash table.

I'm not terribly comfortable with trying to use a PG_TRY block to catch
an OOM error - there are too many ways that could break, and this code
path is by definition not very testable.  I think moving up the
expand_table action is probably the best bet.  Will you submit a patch?
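Tom's suggested ordering — perform any allocation *before* mutating the long-lived table, so an out-of-memory failure leaves it untouched — can be illustrated with a toy, self-contained table (this is not the dynahash code itself, just a sketch of the invariant):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy growable array standing in for a long-lived hash table. */
typedef struct
{
    long   *keys;
    int     nused;
    int     nalloc;
} ToyTable;

static bool alloc_should_fail = false;   /* test hook simulating OOM */

static void *
toy_alloc(size_t sz)
{
    return alloc_should_fail ? NULL : malloc(sz);
}

/* Returns true on success; on OOM returns false with the table intact,
 * because the expansion happens before any shared state is changed. */
static bool
toy_insert(ToyTable *tab, long key)
{
    if (tab->nused == tab->nalloc)
    {
        int   newalloc = tab->nalloc ? tab->nalloc * 2 : 4;
        long *newkeys = toy_alloc(newalloc * sizeof(long));

        if (newkeys == NULL)
            return false;           /* fail *before* touching the table */
        if (tab->nused > 0)
            memcpy(newkeys, tab->keys, tab->nused * sizeof(long));
        free(tab->keys);
        tab->keys = newkeys;
        tab->nalloc = newalloc;
    }
    tab->keys[tab->nused++] = key;  /* only now mutate the table */
    return true;
}
```

A failed insert then simply propagates an error, and the table (here standing in for something like the relcache hash) remains usable afterward.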

regards, tom lane




Re: [HACKERS] First draft of snapshot snapshot building design document

2012-10-18 Thread Andres Freund
On Thursday, October 18, 2012 04:47:12 PM Robert Haas wrote:
 On Tue, Oct 16, 2012 at 7:30 AM, Andres Freund and...@2ndquadrant.com 
wrote:
  On Thursday, October 11, 2012 01:02:26 AM Peter Geoghegan wrote:
  The design document [2] really just explains the problem (which is the
  need for catalog metadata at a point in time to make sense of heap
  tuples), without describing the solution that this patch offers with
  any degree of detail. Rather, [2] says How we build snapshots is
  somewhat intricate and complicated and seems to be out of scope for
  this document, which is unsatisfactory. I look forward to reading the
  promised document that describes this mechanism in more detail.
  
  Here's the first version of the promised document. I hope it answers most
  of the questions.
  
  Input welcome!
 
 I haven't grokked all of this in its entirety, but I'm kind of
 uncomfortable with the relfilenode -> OID mapping stuff.  I'm
 wondering if we should, when logical replication is enabled, find a
 way to cram the table OID into the XLOG record.  It seems like that
 would simplify things.
 
 If we don't choose to do that, it's worth noting that you actually
 need 16 bytes of data to generate a unique identifier for a relation,
 as in database OID + tablespace OID + relfilenode# + backend ID.
 Backend ID might be ignorable because WAL-based logical replication is
 going to ignore temporary relations anyway, but you definitely need
 the other two.  ...

Hm. I should take a look at the way temporary tables are represented. As you say 
it is not going to matter for WAL decoding, but still...

 Another thing to think about is that, like catalog snapshots,
 relfilenode mappings have to be time-relativized; that is, you need to
 know what the mapping was at the proper point in the WAL sequence, not
 what it is now.  In practice, the risk here seems to be minimal,
 because it takes a while to churn through 4 billion OIDs.  However, I
 suspect it pays to think about this fairly carefully because if we do
 ever run into a situation where the OID counter wraps during a time
 period comparable to the replication lag, the bugs will be extremely
 difficult to debug.

I think with rollbacks + restarts we might even be able to see the same 
relfilenode earlier.

 Anyhow, adding the table OID to the WAL header would chew up a few
 more bytes of WAL space, but it seems like it might be worth it to
 avoid having to think very hard about all of these issues.

I don't think it's necessary to change WAL logging here. The relfilenode mapping 
is now looked up using the timetravel snapshot we've built, using (spcNode, 
relNode) as the key, so the time-relativized lookup is built in. If we screw 
that up, much more is broken anyway.
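For illustration, the lookup key described here, and the fuller 16-byte identifier Robert mentions, can be pictured as plain structs. The field names mirror PostgreSQL's RelFileNode but this is only a sketch, not the actual catalog structures:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Oid;

/* Key for the time-relativized (spcNode, relNode) lookup described above. */
typedef struct
{
    Oid spcNode;   /* tablespace OID */
    Oid relNode;   /* relfilenode */
} RelFileKey;

/* Robert's point: a globally unique relation identifier needs more fields,
 * 16 bytes in total; backendId matters only for temporary relations. */
typedef struct
{
    Oid      dbNode;     /* database OID */
    Oid      spcNode;    /* tablespace OID */
    Oid      relNode;    /* relfilenode */
    uint32_t backendId;  /* backend ID, temp relations only */
} FullRelIdent;
```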

Two problems are left:

1) (reltablespace, relfilenode) is not unique in pg_class because InvalidOid is 
stored for relfilenode if it's a shared or nailed table. That's not a problem for 
the lookup because we've already checked the relmapper before that, so we never 
look those up anyway. But it violates documented requirements of syscache.c. 
Even after some looking I haven't found any problem that that could cause.

2) We need to decide whether a HEAP[1-2]_* record did catalog changes when 
building/updating snapshots. Unfortunately we also need to do this *before* we 
have built the first snapshot. For now, treating all tables as catalog-modifying 
before we have built the snapshot seems to work fine.
I think encoding the oid in the xlog header wouldn't help all that much here, 
because I am pretty sure we want the set of catalog tables to be 
extensible at some point...


Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Bug in -c CLI option of pg_dump/pg_restore

2012-10-18 Thread Alvaro Herrera
Robert Haas escribió:
 On Tue, Oct 16, 2012 at 10:31 AM, Guillaume Lelarge
 guilla...@lelarge.info wrote:
  Any comments on this?
 
 I'm not sure I'd want to back-patch this, since it is a behavior
 change, but I do think it's probably a good idea to change it for 9.3.

Hm, but the bug is said to happen only in 9.2, so if we don't backpatch
we would leave 9.2 alone exhibiting this behavior.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Global Sequences

2012-10-18 Thread Simon Riggs
On 18 October 2012 16:08, Robert Haas robertmh...@gmail.com wrote:

 To make
 it even better, add some generic options that can be passed through to
 the underlying handler.

Agreed

 Or maybe better, invent a level of indirection like a sequence access
 method (comparable to index access methods) that provides a compatible
 set of substitute functions for sequence operations.  If you want to
 override nextval() for a sequence, don't you likely also need to
 override setval(), currval(), etc?  Not to mention overriding ALTER
 SEQUENCE's behavior.

 This might be better, but it's also possibly more mechanism than we
 truly need here.  But then again, if we're going to end up with more
 than a handful of handlers, we probably do want to do this.

Let me have a play and see what comes out simplest. Somewhere in the
middle seems about right.
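A rough sketch of the "somewhere in the middle" shape being discussed — a per-sequence table of callbacks, analogous to index access methods — might look like the following. The struct and function names are hypothetical, not a proposed API; the point is that nextval, setval and currval must all dispatch through the same handler table:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "sequence access method": a dispatch table of callbacks. */
typedef struct SeqAm
{
    int64_t (*nextval) (void *state);
    void    (*setval)  (void *state, int64_t value);
    int64_t (*currval) (void *state);
} SeqAm;

/* A trivial "local" implementation, standing in for the built-in one;
 * a global-sequence plugin would supply its own callbacks instead. */
static int64_t
local_nextval(void *state) { return ++*(int64_t *) state; }

static void
local_setval(void *state, int64_t v) { *(int64_t *) state = v; }

static int64_t
local_currval(void *state) { return *(int64_t *) state; }

static const SeqAm local_seq_am = {
    local_nextval, local_setval, local_currval
};
```

Overriding only nextval while letting setval bypass the handler would let the two views of the sequence diverge, which is Tom's objection above.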

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Global Sequences

2012-10-18 Thread Simon Riggs
On 18 October 2012 16:15, Tom Lane t...@sss.pgh.pa.us wrote:

 But I'd want to see a pretty
 bulletproof argument why overriding *only* nextval is sufficient
 (and always will be) before accepting a hook for just nextval.  If we
 build an equivalent amount of functionality piecemeal it's going to
 be a lot uglier than if we recognize we need this type of concept
 up front.

We discussed that we need only nextval() and setval() elsewhere, but
adding others is fairly cheap so we can chew on that when we have a
patch to discuss.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-10-18 Thread Alvaro Herrera
Greg Stark escribió:
 On Tue, Oct 16, 2012 at 9:47 PM, Peter Geoghegan pe...@2ndquadrant.com 
 wrote:
  The patch will now been marked ready for committer. Does this need
  doc changes, in light of what is arguably a behavioural difference?
  You only mentioned release notes.
 
 I'm happy to look at this one, probably next week at pgconf.eu. It seems
 like a reasonable-size patch to get back into things.
 
 That's assuming my committer bits haven't lapsed and people are ok
 with me stepping back into things?

Feel free.  Your committer bits should still work.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-10-18 Thread Alvaro Herrera
Tomas Vondra wrote:
 Hi,
 
 thanks for the review. I'll look into that in ~2 weeks, once the
 pgconf.eu
 is over.

Excellent.  Please submit the updated version to the upcoming commitfest
when you have it.  I'm marking this patch Returned with Feedback.
Many thanks to Shigeru Hanada for the review and benchmark.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PATCH: pgbench - aggregation of info written into log

2012-10-18 Thread Alvaro Herrera
Tomas Vondra wrote:

 Also, I've realized the last interval may not be logged at all - I'll
 take a look into this in the next version of the patch.

I didn't see any later version of this patch posted anywhere.  I guess
it'll have to wait until the next commitfest.  Please fix the remaining
issues and resubmit.

Thanks to Robert Haas, Jeff Janes and Pavel Stehule for the reviews.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Deprecations in authentication

2012-10-18 Thread Robert Haas
On Thu, Oct 18, 2012 at 7:20 AM, Magnus Hagander mag...@hagander.net wrote:
 Since Simon stirred up a hornet's nest suggesting deprecation of a
 number of features, I figured I'd take it one step further and suggest
 removal of some previously deprecated features :)

 In particular, we made a couple of changes over several releases back
 in the authentication config, that we should perhaps consider
 finishing by removing the old stuff now?

 1. krb5 authentication. We've had gssapi since 8.3 (which means in all
 supported versions). krb5 has been deprecated, also since 8.3. Time to
 remove it?

That seems like a sufficiently long deprecation window, but is gssapi
a full substitute for krb5?  I don't really have a strong opinion on
this, not being a user myself.

 2. ident-over-unix-sockets was renamed to peer in 9.1, with the old
 syntax deprecated but still mapping to the new one. Has it been there
 long enough that we should start throwing an error for ident on unix?

Definitely not.  I see no reason to change that, well, really ever.
But certainly not after just two releases.  It seems to me like a
useful convenience that does no real harm.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-10-18 Thread Peter Geoghegan
On 16 October 2012 21:47, Peter Geoghegan pe...@2ndquadrant.com wrote:
 The same basic strategy for sizing the tuplesort memtuples array in
 also exists in tuplestore. I wonder if we should repeat this there? I
 suppose that that could follow later.

Incidentally, the basis of this remark is commit 2689abf0, where Tom
decided to keep the two in sync. That's a precedent for what we need
to do here, I suppose.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




[HACKERS] Re: [PATCH 3/8] Add support for a generic wal reading facility dubbed XLogReader

2012-10-18 Thread Alvaro Herrera
This patch doesn't seem to be going anywhere, sadly.  Since we're a bit
late in the commitfest and this patch hasn't seen any activity for a
long time, I'll mark it as returned-with-feedback.  I hope one or both
versions are resubmitted (with additional fixes?) for the next
commitfest, and that the discussion continues to determine which of the
two approaches is the best.

Thanks.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] xlog filename formatting functions in recovery

2012-10-18 Thread Alvaro Herrera
Daniel, I assume you are submitting an updated version based on the
feedback that has been provided.  I will mark this patch returned with
feedback in the current CF; please submit the next version to CF3.

Thanks to the (rather numerous!) reviewers.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Very minor feature suggestion

2012-10-18 Thread Murphy, Kevin
It might be nice for psql to have an 'htmlcaption' boolean pset option that 
would wrap the provided title/caption, if any, in a caption tag in the HTML 
report output, when using html format.

Motivation:

When I use:

\pset title 'Some title'

or

\C 'Some title'

psql emits the text:

Title is "Some title".

even when using html format.  This seems more like a diagnostic/annotation 
(like 'Title is unset', which is what you get from a plain \pset title) than 
report output.  For casual use, even \echo Some title is more pleasant, 
except for post-processing by scripts, which people are no doubt doing.  When 
using html format, it would arguably be better for the title to be wrapped in a 
caption tag inside the table itself.  You couldn't change the default behavior, 
but the html caption would be a nice option.

Regards,
Kevin Murphy





Re: [HACKERS] Deprecations in authentication

2012-10-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Oct 18, 2012 at 7:20 AM, Magnus Hagander mag...@hagander.net wrote:
 2. ident-over-unix-sockets was renamed to peer in 9.1, with the old
 syntax deprecated but still mapping to the new one. Has it been there
 long enough that we should start throwing an error for ident on unix?

 Definitely not.  I see no reason to change that, well, really ever.
 But certainly not after just two releases.  It seems to me like a
 useful convenience that does no real harm.

I think the argument that it causes user confusion is a fairly strong
one, though.

regards, tom lane




Re: [HACKERS] xlog filename formatting functions in recovery

2012-10-18 Thread Daniel Farina
On Thu, Oct 18, 2012 at 9:22 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Daniel, I assume you are submitting an updated version based on the
 feedback that has been provided.  I will mark this patch returned with
 feedback in the current CF; please submit the next version to CF3.

Thank you for reminding me, so the approach that seems basically
reasonable is to add an overload for these functions to accept a
timeline number, if I read this feedback correctly?

-- 
fdr




Re: [HACKERS] Deprecating RULES

2012-10-18 Thread Steve Crawford

On 10/17/2012 04:25 PM, Tom Lane wrote:
...Now having said that, I would definitely like to see rules in their 
current form go away eventually. But not without a substitute. 
Triggers are not a complete replacement, and no amount of wishful 
thinking makes them so.

...
Perhaps it would be more profitable to try to identify the pain points 
that make people so eager to get rid of rules, and then see if we 
could alleviate them.


Alternately/additionally identify the deficiencies in triggers that 
drive users to prefer rules. For example, a common need is to update a 
log table whenever updates are made to a main table.


Using rules to accomplish this is very easy to understand and write, 
even for most beginners. (Understanding it properly, including limitations 
and dangers, is another issue, of course.) It is also easy to maintain. If 
you drop the table, the rule is cleaned up as well.


With triggers you need to select from a variety of available languages, 
write a function in that language and write a trigger that calls that 
function. Dropping the table will remove the trigger but the user 
must remember to delete the function as well, if desired. Nothing 
insurmountable but inconvenient compared to the use of a rule.


Per the documentation, "PostgreSQL only allows the execution of a 
user-defined function for the triggered action. The standard allows the 
execution of a number of other SQL commands..."


There may be valid reasons why implementing that part of the SQL 
standard in PostgreSQL is difficult or unwise but removing that 
limitation on triggers would eliminate one annoyance that pushes users 
toward rules.


Cheers,
Steve





Re: [HACKERS] Bug in -c CLI option of pg_dump/pg_restore

2012-10-18 Thread Guillaume Lelarge
On Thu, 2012-10-18 at 12:19 -0300, Alvaro Herrera wrote:
 Robert Haas escribió:
  On Tue, Oct 16, 2012 at 10:31 AM, Guillaume Lelarge
  guilla...@lelarge.info wrote:
   Any comments on this?
  
  I'm not sure I'd want to back-patch this, since it is a behavior
  change, but I do think it's probably a good idea to change it for 9.3.
 
 Hm, but the bug is said to happen only in 9.2, so if we don't backpatch
 we would leave 9.2 alone exhibiting this behavior.
 

Yeah, Alvaro got it right. The behaviour changed in 9.2. This patch
needs to be applied on 9.2 and master, nothing else. If the patch is
good enough, though...


-- 
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com





Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-10-18 Thread Fujii Masao
On Tue, Oct 16, 2012 at 9:31 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 15.10.2012 19:31, Fujii Masao wrote:

 On Mon, Oct 15, 2012 at 11:27 PM, Heikki Linnakangas
 hlinnakan...@vmware.com  wrote:

 On 15.10.2012 13:13, Heikki Linnakangas wrote:


 Oh, I didn't remember that we've documented the specific structs that we
 pass around. It's quite bogus anyway to explain the messages the way we
 do currently, as they are actually dependent on the underlying
 architecture's endianness and padding. I think we should refactor the
 protocol to not transmit raw structs, but use pq_sendint and friends to
 construct the messages. This was discussed earlier (see


 http://archives.postgresql.org/message-id/4fe2279c.2070...@enterprisedb.com),
 I think there's consensus that 9.3 would be a good time to do that as we
 changed the XLogRecPtr format anyway.



 This is what I came up with. The replication protocol is now
 architecture-independent. The WAL format itself is still
 architecture-independent, of course, but this is useful if you want to
 e.g
 use pg_receivexlog to back up a server that runs on a different platform.

 I chose the int64 format to transmit timestamps, even when compiled with
 --disable-integer-datetimes.

 Please review if you have the time..


 Thanks for the patch!

 When I ran pg_receivexlog, I encountered the following error.


 Yeah, clearly I didn't test this near enough...

 I fixed the bugs you bumped into, new version attached.

Thanks for updating the patch!

Shouldn't we remove the check of integer_datetime by the pg_basebackup
background process and pg_receivexlog? Currently, they always check
it, and then if its setting value is not the same between a client and
server, they fail. Thanks to the patch, ISTM this check is no longer
required.

+   pq_sendint64(reply_message, GetCurrentIntegerTimestamp());

In XLogWalRcvSendReply() and XLogWalRcvSendHSFeedback(),
GetCurrentTimestamp() is called twice. I think that we can skip the
latter call if integer-datetime is enabled because the return value of
GetCurrentTimestamp() and GetCurrentIntegerTimestamp() is in the
same format. It's worth reducing the number of GetCurrentTimestamp()
calls, I think.
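The pq_sendint-style encoding discussed earlier in this thread — writing explicit network-byte-order fields instead of raw structs — is what makes the message bytes independent of host endianness and padding. A self-contained sketch of the idea (these are not the actual libpq routines):

```c
#include <assert.h>
#include <stdint.h>

/* Write an int64 most-significant-byte first, so the wire format is the
 * same on little- and big-endian hosts. */
static void
put_int64_be(uint8_t *buf, int64_t val)
{
    for (int i = 0; i < 8; i++)
        buf[i] = (uint8_t) ((uint64_t) val >> (56 - 8 * i));
}

/* Read it back, reassembling from the most significant byte down. */
static int64_t
get_int64_be(const uint8_t *buf)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | buf[i];
    return (int64_t) v;
}
```

Transmitting timestamps as an int64 in this form is also why the receiver no longer needs the client and server integer_datetimes settings to match.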

elog(DEBUG2, "sending write %X/%X flush %X/%X apply %X/%X",
-(uint32) (reply_message.write >> 32), (uint32) reply_message.write,
-(uint32) (reply_message.flush >> 32), (uint32) reply_message.flush,
-(uint32) (reply_message.apply >> 32), (uint32) reply_message.apply);
+(uint32) (writePtr >> 32), (uint32) writePtr,
+(uint32) (flushPtr >> 32), (uint32) flushPtr,
+(uint32) (applyPtr >> 32), (uint32) applyPtr);

elog(DEBUG2, "write %X/%X flush %X/%X apply %X/%X",
-(uint32) (reply.write >> 32), (uint32) reply.write,
-(uint32) (reply.flush >> 32), (uint32) reply.flush,
-(uint32) (reply.apply >> 32), (uint32) reply.apply);
+(uint32) (writePtr >> 32), (uint32) writePtr,
+(uint32) (flushPtr >> 32), (uint32) flushPtr,
+(uint32) (applyPtr >> 32), (uint32) applyPtr);

Isn't it worth logging not only the WAL location but also the replyRequested
flag in these debug messages?
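Incidentally, the %X/%X convention in those elog calls splits a 64-bit WAL position into its high and low 32-bit words. A standalone rendering of that formatting, for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format a 64-bit WAL location the way the server logs it: high and low
 * 32-bit words separated by a slash, each in uppercase hex. */
static void
format_lsn(uint64_t lsn, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%X/%X",
             (uint32_t) (lsn >> 32),   /* high word */
             (uint32_t) lsn);          /* low word  */
}
```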

The remaining of the patch looks good to me.

 +   hdrlen = sizeof(int64) + sizeof(int64) +
 sizeof(int64);
 +   hdrlen = sizeof(int64) + sizeof(int64) +
 sizeof(char);

 These should be macro, to avoid calculation overhead?


 The compiler will calculate this at compilation time, it's going to be a
 constant at runtime.

Yes, you're right.

Regards,

-- 
Fujii Masao




Re: [HACKERS] Deprecating RULES

2012-10-18 Thread Daniel Farina
On Thu, Oct 18, 2012 at 6:46 AM, Andrew Dunstan and...@dunslane.net wrote:

 On 10/17/2012 07:25 PM, Tom Lane wrote:


 I'm fairly annoyed by the entire tenor of this conversation, because
 the people who are hollering the loudest seem to be people who have
 never actually touched any of the rules code, but nonetheless seem
 prepared to tell those of us who have what to spend our time on.


 +1

 I too have been quite annoyed.

Sorry that I'm an offender. I also did not like the way the
conversation was going for some time; for me, I felt like I didn't
understand a lot of the terse rejections that materialized immediately
on behalf of users that I personally cannot identify, and I felt those
rejections weren't phrased in neutral language that encouraged
clarification, either.  I'm glad things have moved beyond that.

 The biggest pain people have mentioned is that they don't work with COPY.  I
 am in fact about to start working on a project which will probably alleviate
 that pain point. I'm not going to say much more, and I would not have said
 anything right now except that there is this sudden rush to deprecate rules,
 or announce a future removal of the feature. However, I hope to have a
 proposal to put to the community by about the end of November.

I have encountered this as a papercut.

Here's another use case that in my history with RULES that didn't seem
to pan out so well: In my recollection, one way to use rules is to
retarget operations that happen against a view and move them to a
table, and as I recall to make this work as one expected one had to
have a very wordy RULE (for UPDATEs) with a litany of (fairly simple)
equality and not-null conditions to make it work as one would expect
(to not under-constrain the UPDATE).  This became a maintenance
headache whenever attributes were added to the underlying relation.

It was also quite complex, as I recall, when one wanted to maintain an
interface but normalize the underlying table and split writes into two
or more places.

It has been quite some time, does that sound like a correct rendering
of a problem?

-- 
fdr




Re: [HACKERS] Deprecations in authentication

2012-10-18 Thread Joshua D. Drake


On 10/18/2012 04:43 AM, Simon Riggs wrote:


On 18 October 2012 12:20, Magnus Hagander mag...@hagander.net wrote:


Since Simon stirred up a hornets nest suggesting deprecation of a
number of features, I figured I'd take it one step further and suggest
removal of some previously deprecated features :)


I'm laughing at the analogy that angry and unintelligent agents
responded to my proposals, but there was no stirring action from me.


I believe the stirring occurred when you dropped the idea in the 
proverbial bucket. It is not possible to drop even the tiniest pebble 
into any ideology of our community without some plague-causing flying 
insects swarming, just in case. You and I included.


JD






--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579




Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-18 Thread Josh Berkus
Simon,


 It's hard to work out how to reply to this because it's just so off
 base. I don't agree with the restrictions you think you see at all,
 saying it politely rather than giving a one-word answer.

You have inside knowledge of Hannu's design. I am merely going from his
description *on this list*, because that's all I have to go on.

He requested comments, so here I am, commenting.  I'm *hoping* that it's
merely the description which is poor and not the conception of the
feature.  *As Hannu described the feature* it sounds useless and
obscure, and miles away from powering any kind of general queueing
mechanism.  Or anything we discussed at the clustering meetings.

And, again, if you didn't want comments, you shouldn't have posted an RFC.

 All we're discussing is moving a successful piece of software into
 core, which has been discussed for years at the international
 technical meetings we've both been present at. I think an open
 viewpoint on the feasibility of that would be reasonable, especially
 when it comes from one of the original designers.

When I ask you for technical clarification or bring up potential
problems with a 2Q feature, you consistently treat it as a personal
attack and are emotionally defensive instead of answering my technical
questions.  This, in turn, frustrates the heck out of me (and others)
because we can't get the technical questions answered.  I don't want you
to justify yourself, I want a clear technical spec.

I'm asking these questions because I'm excited about ReplicationII, and
I want it to be the best feature it can possibly be.

Or, as we tell many new contributors, "We wouldn't bring up potential
problems and ask lots of questions if we weren't interested in the feature."

Now, on to the technical questions:

 QUEUE emphasizes the aspect of a logged-only table that it accepts
 records in a certain order, persists these, and then guarantees
 that they can be read out in exactly the same order - all this being
 guaranteed by existing WAL mechanisms.

 It is not meant to be a full implementation of an application-level
 queuing system though, but just the capture, persistence and
 distribution parts.

 Using this as an application-level queue needs a set of interface
 functions to extract the events and also to keep track of the processed
 events. As there is no general consensus on what these should be (like
 whether processing the same event twice is allowed) this part is left
 for specific queue consumer implementations.

While implementations vary, I think you'll find that the set of
operations required for a full-featured application queue is remarkably
similar across projects.  Personally, I've worked with celery, Redis,
AMQ, and RabbitMQ, as well as a custom solution on top of pgQ.  The
design, as you've described it, makes several of these requirements
unreasonably convoluted to implement.
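As a rough sketch of how much those operation sets overlap, a minimal consumer-facing queue API tends to reduce to enqueue/dequeue/ack plus redelivery of unacknowledged events (all names here are hypothetical, not any particular product's API):

```python
import itertools

class SimpleQueue:
    """The operations most application queues (pgQ, AMQP brokers,
    Redis-based queues) end up providing in some form."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}   # event_id -> payload, kept in insertion order
        self._unacked = {}   # delivered but not yet acknowledged

    def enqueue(self, payload):
        """Append an event; its id also defines the delivery order."""
        event_id = next(self._ids)
        self._pending[event_id] = payload
        return event_id

    def dequeue(self):
        """Deliver the oldest pending event; it stays unacked until ack()."""
        if not self._pending:
            return None
        event_id = next(iter(self._pending))
        self._unacked[event_id] = self._pending.pop(event_id)
        return event_id, self._unacked[event_id]

    def ack(self, event_id):
        """Mark an event as processed so it is never redelivered."""
        del self._unacked[event_id]

    def requeue_unacked(self):
        """Put unacked events back in id order (at-least-once delivery)."""
        self._pending = dict(sorted({**self._unacked, **self._pending}.items()))
        self._unacked = {}
```

Whether something like ack() exists at all, i.e. whether processing an event twice is allowed, is exactly the kind of policy point the quoted text says has no consensus.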

It sounds to me like the needs of internal queueing and application
queueing may be hopelessly divergent.  That was always possible, and
maybe the answer is to forget about application queueing and focus on
making this mechanism work for replication and for matviews, the two
features we *know* we want it for.  Which don't need the application
queueing features I described AFAIK.

 The two halves of the queue are the TAIL/entry point and the HEAD/exit
 point. As you point out these could be on the different servers,
 wherever the logical changes flow to, but could also be on the same
 server. When the head and tail are on the same server, the MESSAGE
 QUEUE syntax seems appropriate, but I agree that calling it that when
 it's just a head or just a tail seems slightly misleading.

Yeah, that's why I was asking for clarification; the way Hannu described
it, it sounded like it *couldn't* be read on the insert node, but only
on a replica.

 We do, I think, want a full queue implementation in core. We also want
 to allow other queue implementations to interface with Postgres, so we
 probably want to allow first half only as well. Meaning we want both
 head and tail separately in core code. The question is whether we
 require both head and tail in core before we allow commit, to which I
 would say I think adding the tail first is OK, and adding the head
 later when we know the exact design.

I'm just pointing out that some of the requirements of the design for
the replication queue may conflict with a design for a full-featured
application queue.

I don't quite follow you on what you mean by head vs. tail.  Explain?

 Having said that, the LOGGING ONLY syntax makes me shiver. Better name?

I suck at names.  Sorry.

 I should also add that this is a switchable sync/asynchronous
 transactional queue, whereas LISTEN/NOTIFY is a synchronous
 transactional queue.

Thanks for explaining.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-10-18 Thread Alvaro Herrera
Boszormenyi Zoltan escribió:
 Hi,
 
 this is the latest one, fixing a bug in the accounting
 of per-statement lock timeout handling and tweaking
 some comments.

Tom, are you able to give this patch some more time on this commitfest?

(If not, I think it would be fair to boot it to CF3; this is final in a
series, there's nothing that depends on it, and there's been good
movement on it; there's plenty of time before the devel cycle closes.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-10-18 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Boszormenyi Zoltan escribió:
 this is the latest one, fixing a bug in the accounting
 of per-statement lock timeout handling and tweaking
 some comments.

 Tom, are you able to give this patch some more time on this commitfest?

I'm still hoping to get to it, but I've been spending a lot of time on
bug fixing rather than patch review lately :-(.  If you're hoping to
close out the current CF soon, maybe we should just slip it to the next
one.

regards, tom lane




Re: [HACKERS] gistchoose vs. bloat

2012-10-18 Thread Alvaro Herrera
Alexander Korotkov escribió:

  4. It looks like the randomization is happening while trying to compare
  the penalties. I think it may be more readable to separate those two
  steps; e.g.
 
  /* create a mapping whether randomization is on or not */
  for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
      offsets[i - FirstOffsetNumber] = i;

  if (randomization)
      /* randomize offsets array */

  for (i = 0; i < maxoff; i++)
  {
      offset = offsets[i];
      ...
  }
 
  That's just an idea; if you think it's more readable as-is (or if I am
  misunderstanding) then let me know.
 
 Actually, the current implementation comes from the idea of creating as
 little overhead as possible when randomization is off. I'll try to measure
 the overhead in the worst case. If it is low enough then your proposal
 looks reasonable to me.

Were you able to do these measurements?  If not, I'll defer to your and
Jeff's judgement on what's the best next step here.

Jeff, do you think we need more review of this patch?
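For reference, the map-then-shuffle structure proposed in the review comment above can be sketched like this (a Python stand-in for the patch's C code; the function name is made up):

```python
import random

def candidate_offsets(first_offset, max_offset, randomize, rng=None):
    """Always build the identity mapping of page offsets, then shuffle
    it only when randomization is on, mirroring the sketch above."""
    offsets = list(range(first_offset, max_offset + 1))
    if randomize:
        (rng or random).shuffle(offsets)  # Fisher-Yates shuffle
    return offsets
```

With randomization off, the only extra cost is filling the identity array, which is the worst-case overhead being discussed.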

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-10-18 Thread Alvaro Herrera
Heikki, would you be able to give this patch a look and perhaps commit
it?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SP-GiST for ranges based on 2d-mapping and quad-tree

2012-10-18 Thread Alvaro Herrera
Jeff Davis escribió:
 On Sat, 2012-08-18 at 18:10 +0400, Alexander Korotkov wrote:
  
  Thanks! There is a separate patch for adjacent. I've reworked adjacent
  check in order to make it more clear.
 
 I am taking a look at this patch now. A few quick comments:

 * I tried some larger tests and they seemed to work. I haven't reviewed
 the spgist code changes in detail though.

Jeff, Heikki,

Any input on the subsequent version of this patch?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [v9.3] Row-Level Security

2012-10-18 Thread Alvaro Herrera
Kohei KaiGai escribió:
 The revised patch fixes the problem that Daen pointed out.

Robert, would you be able to give this latest version of the patch a
look?

(KaiGai, does it still apply cleanly? If not, please submit a rebased
version.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-18 Thread Claudio Freire
On Thu, Oct 18, 2012 at 2:33 PM, Josh Berkus j...@agliodbs.com wrote:
 I should also add that this is a switchable sync/asynchronous
 transactional queue, whereas LISTEN/NOTIFY is a synchronous
 transactional queue.

 Thanks for explaining.

New here, and I missed half the conversation, but since it's been brought
up and (to my mind wrongly) dismissed, I'd like to propose:

NOTIFY [ALL|ONE] [REMOTE|LOCAL|CLUSTER|DOWNSTREAM] ASYNCHRONOUSLY
LISTEN [REMOTE|LOCAL|CLUSTER|UPSTREAM] too for good measure.

That ought to work out fine as SQL constructs go, implementation aside.

That's not enough for matviews, but it is IMO a good starting point.
All you need after that are triggers for notifying automatically upon
insert, and some mechanism to attach triggers to a channel for the
receiving side.

Since channels are limited to short strings, maybe a different kind of
object (but with similar manipulation syntax) ought to be created. The
CREATE QUEUE command, in fact, could be creating such a channel. The
channel itself won't be WAL-only, just the messages going through it.
This (I think) would solve locking issues.




Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-10-18 Thread Simon Riggs
On 18 October 2012 10:20, Andres Freund and...@2ndquadrant.com wrote:
 On Thursday, October 18, 2012 06:12:02 AM Kevin Grittner wrote:
 Kevin Grittner wrote:
  Hmm. The comment is probably better now, but I've been re-checking
  the code, and I think my actual code change is completely wrong.
  Give me a bit to sort this out.

 I'm having trouble seeing a way to make this work without rearranging
 the code for concurrent drop to get to a state where it has set
 indisvalid = false, made that visible to all processes, and ensured
 that all scans of the index are complete -- while indisready is still
 true. That is the point where TransferPredicateLocksToHeapRelation()
 could be safely called. Then we would need to set indisready = false,
 make that visible to all processes, and ensure that all access to the
 index is complete. I can't see where it works to set both flags at
 the same time. I want to sleep on it to see if I can come up with any
 other way, but right now that's the only way I'm seeing to make DROP
 INDEX CONCURRENTLY compatible with SERIALIZABLE transactions. :-(

 In a nearby bug I had to restructure the code in a way that's similar to
 this anyway, so that seems fine. Maybe you can fix the bug on top of the two
 attached patches?

First patch and first test committed.

Working on second patch/test.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-18 Thread Hannu Krosing

On 10/18/2012 07:33 PM, Josh Berkus wrote:

Simon,



It's hard to work out how to reply to this because it's just so off
base. I don't agree with the restrictions you think you see at all,
saying it politely rather than giving a one-word answer.

You have inside knowledge of Hannu's design.

Actually Simon has currently no more knowledge of this specific
design than you do - I posted this on this list as soon as I had figured
it out as a possible solution to a specific problem: supporting
full pgQ/Londiste functionality in WAL-based logical replication
with minimal overhead.

(Well, actually I let it settle for a few weeks, but I did not discuss
this off-list before.)

Simon may have a better grasp of it thanks to having done work
on the BDR/Logical Replication design, and thus having a better, or
at least more recent, understanding of the issues involved in logical
replication.

When mapping londiste/Slony message capture onto logical WAL,
the WAL already _is_ the event queue for replication.
NOT LOGGED tables make it also usable for non-replication
purposes using the same mechanisms. (The equivalent in a trigger-based
system would be a log trigger which captures the insert event and then
cancels the insert.)


I am merely going from his
description *on this list*, because that's all I have to go on.

He requested comments, so here I am, commenting.  I'm *hoping* that it's
merely the description which is poor and not the conception of the
feature.  *As Hannu described the feature* it sounds useless and
obscure, and miles away from powering any kind of general queueing
mechanism.

If we describe a queue as something you put stuff in at one end and
get it out in the same or some other specific order at the other end, then
the WAL _is_ a queue when you use it for replication (if you just write
to it, it is a log; if you write and read, it is a queue).

That is, the WAL already is a form of persistent, ordered (that is how
WAL works) stream of messages (WAL records) that are generated on the
master and replayed on one or more consumers (called slaves in the case
of simple replication).

All it takes to make this scenario work is keeping track of the LSN, or
simply the log position, on the slave side.

What you seem to want is support for cooperative consumers,
that is, multiple consumers on the same queue working together and
sharing the work of processing the incoming events.

This can be easily achieved using a single ordered event stream and
extra bookkeeping structures on the consumer side (look at the
cooperative consumer samples in skytools).
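A minimal sketch of that consumer-side bookkeeping, assuming a shared claim table for distributing events between cooperating consumers (hypothetical names, not skytools' actual API):

```python
class OrderedEventLog:
    """Append-only, ordered event stream (a stand-in for the WAL)."""
    def __init__(self):
        self.events = []   # index in the list plays the role of an LSN

    def append(self, payload):
        self.events.append(payload)

class CooperativeConsumer:
    """Tracks its own read position; a shared claim table ensures no
    two cooperating consumers process the same event."""
    def __init__(self, name, log, claims):
        self.name, self.log, self.claims = name, log, claims
        self.position = 0      # consumer-side bookkeeping, not on the master
        self.processed = []

    def poll(self):
        while self.position < len(self.log.events):
            pos = self.position
            self.position += 1
            # first consumer to reach this position claims it
            if self.claims.setdefault(pos, self.name) == self.name:
                self.processed.append(self.log.events[pos])
```

Note that the master-side structure stays a plain ordered log; all the sharing logic lives with the consumers.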

What I suggested was an optimisation for the case where you know that you
will never need the data on the master side and are only interested in it
on the slave side.

By writing rows/events/messages only to the log (or stream, or queue), you
avoid the need to later clean them up on the master by either DELETE or
TRUNCATE or rotating tables.

For both physical and logical streaming the WAL _is_ the queue of events
that were recorded on the master and need to be replayed on the slave.

Thanks to the introduction of logical replication, it now makes sense to
have actions recorded _only_ in this queue, and this is what the whole
RFC was about.


I recommend that you introduce yourself a bit to skytools/pgQ to get a
better feel for the things I am talking about. Londiste is just one
application built on a general event logging, transport and
transform/replay (that is what I'd call queueing :) ) system, pgQ.

pgQ has its roots in Slony (and earlier replication systems), but it
is by no means _only_ a replication system.

The LOG ONLY tables are _not_ needed for pure replication (like Slony), but
they make replication + queueing type solutions like skytools/pgQ much more
efficient, as they do away with the need to maintain the queued data on the
master side, where it will never be needed (just to repeat this once more).


Or anything we discussed at the clustering meetings.

And, again, if you didn't want comments, you shouldn't have posted an RFC.

I did want comments and as far as I know I do not see you as hostile :)

I do understand that what you mean by QUEUE (and especially as a
MESSAGE QUEUE) is different from what I described.
You seem to want specifically an implementation of cooperative
consumers for a generic queue.

The answer is yes, it is possible to build this on WAL, or on the
table-based event logs/queues of londiste/Slony. It just takes a little
extra management on the receiving side to do the record locking and
distribution between cooperating consumers.

All we're discussing is moving a successful piece of software into
core, which has been discussed for years at the international
technical meetings we've both been present at. I think an open
viewpoint on the feasibility of that would be reasonable, especially
when it comes from one of the original designers.

When I ask you for technical clarification or bring up potential
problems with a 2Q feature, you consistently treat it as a personal
attack and are 

Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-18 Thread Hannu Krosing

On 10/18/2012 08:36 PM, Claudio Freire wrote:
The CREATE QUEUE command, in fact, could be creating 
such a channel. The channel itself won't be WAL-only, just
the messages going through it. This (I think) would solve locking issues. 


Hmm. Maybe we should think of implementing this as a REMOTE TABLE, that
is, a table which gets no real data stored locally, but all inserts go
through WAL and are replayed as real inserts on the slave side.

Then if you want matviews or a partitioned table, you just attach triggers
to the table on the slave side to do them.

This would be tangential to their use as pure queues which would happen
at the level of plugins to logical replication.

--
Hannu





Re: [HACKERS] Truncate if exists

2012-10-18 Thread Robert Haas
On Tue, Oct 16, 2012 at 2:12 PM, Stafford, David x78061
david.staff...@broadridge.com wrote:
 On Mon, 15 Oct 2012 10:21:18 -0700, Robert Haas robertmh...@gmail.com wrote:
 Yeah, I think the functionality that we need is pretty much there
 already today.  What we need to do is to get the syntax to a point
 where people can write the code they want to write without getting
 tangled up by it.

 I think the invention of DO was a big step in the right direction
 ...
 With DO, you can write the logic you want
 as an SQL statement, it's just a clunky and awkward SQL statement.  In
 my view the goal ought to be to refine that mechanism to remove the
 clunkiness and awkwardness, rather than to invent something completely
 new.

 As someone who has worked with a number of databases now, none of them really 
 get this DDL integration completely right.  What I'd like to see is 1) a 
 predicate to easily test things about the schema (does this table, column, 
 index, schema, etc. exist?  does it have the right type?) and 2) a way to 
 conditionally execute DDL (and DML, which should fall right out, but it isn't 
 really what this discussion is covering).  I would propose extending the 
 current EXISTS / NOT EXISTS predicate as follows:

 [NOT] EXISTS TABLE tab [ ( col [type] [ , col [type]]... ) ]
 [NOT] EXISTS COLUMN tab.col [type]
 [NOT] EXISTS INDEX tab.idx [ ( col [ , col]... ) ]
 [NOT] EXISTS CONSTRAINT tab.cstrt -- not sure what else might be useful here
 [NOT] EXISTS ( select ) -- this is the current EXISTS predicate, of course
 [NOT] EXISTS ANY ROWS FROM tab [join] [WHERE predicate] [GROUP BY col [ , col]...]
  -- exactly the same as (select 1 FROM etc.), only because I like it better
 (the latter [which by no means am I nuts over; it's just that when extending 
 EXISTS I can't stop neatening it up to my personal preferences] could be 
 extended with [NOT] EXISTS MORE THAN n ROWS FROM and [NOT] EXISTS EXACTLY n 
 ROWS FROM.)

 There is a new SQL statement: IF predicate true-statement [ELSE 
 false-statement].

 To actually execute this new IF statement, the executor would need an IF node 
 that evaluates the predicate (with ANDs and ORs, just like all SQL 
 predicates) and then executes the rest only if the predicate came out TRUE 
 (or NOT FALSE; I forget which is usually used, and the difference with NULL 
 could be useful, as long as it matches other predicates).  This moves one 
 more bit of procedural logic into the executor.

 Another wrinkle is that the dependent statement might not compile, due to 
 missing tables or whatnot.  Actually executing it while it doesn't compile is 
 an error, but we want to defer that error until we actually decide we need to 
 execute it.  Also, it's probably good to try compiling it again at that 
 point.  So my thought would be to try planning the dependent statement(s); if 
 they compile, hook them to the IF node; if not, hook a DEFERRED node to the 
 IF node.  The DEFERRED node has the parse tree (or raw string, whatever makes 
 sense) of the statement; on execution it tries again to plan that statement; 
 if it succeeds, run it; if not, error out.
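The plan-now-or-defer behavior described above can be sketched as follows (a toy model; compile_truncate is a made-up stand-in for the planner, not PostgreSQL code):

```python
def plan_or_defer(compile_fn, source):
    """Plan eagerly when possible; otherwise return a DEFERRED stand-in
    that retries planning only when executed, and errors only then."""
    try:
        return compile_fn(source)        # a ready plan: call it to run
    except Exception:
        def deferred():
            return compile_fn(source)()  # re-plan, then run; may raise
        return deferred

# Toy "planner": fails while the referenced table does not exist yet.
tables = set()

def compile_truncate(table):
    if table not in tables:
        raise LookupError("relation does not exist: " + table)
    return lambda: "TRUNCATE " + table
```

This captures the key property of the proposal: a statement whose objects are created earlier in the same sequence plans successfully by execution time, and the error for a still-missing object is raised only if the branch actually runs.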

 I'd also add a SEQUENCE node to the executor.  It just runs its children in 
 order (could be n-ary, or if fixed arity nodes are what is in the 
 planner/executor today, could be binary, first left, then right, and right 
 could be another SEQUENCE).  The DEFERRED node means that a CREATE statement 
 could precede use of what is created in the same sequence and all could get 
 planned (with some deferral to execution time) in advance and run in one 
 lump.  This implements DO at the executor level.

 The biggest concepts left from plpgsql are looping and variables.  Most 
 variables could be modeled as a single row value; SQL already can update a 
 row, so the planning of assignments and calculations of scalars (and arrays, 
 I think) already fits into things the planner knows about.  Table variables 
 (which I don't know that plpgsql supports, but someday it should) are less 
 defined.  Adding plpgsql's loops to the executor would let whole functions 
 run under one trip through the executor.  This is beyond just improving the 
 DDL support for scripts.

 I have written a number of database upgrade scripts.  Over time we've made 
 them less fragile, by checking for the existence of tables, indexes, and most 
 recently, columns.  The usual sequence is:
 1) check the existence of an index; check that the first few columns are 
 correct; if not, drop the index
 2) repeat for other indexes that have changed definition over time
 3) check the existence of the table; create with current layout if it is 
 missing
 4) check for the presence of a column; if missing, 

Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-18 Thread Christopher Browne
On Thu, Oct 18, 2012 at 2:56 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
 * works as a table on INSERTs up to inserting the logical WAL record
 describing the insert, but no data is inserted locally.

 with all the things that follow from the local table having no data:
   - unique constraints don't make sense
   - indexes make no sense
   - updates and deletes hit no data
   - etc.

Yep, I think I was understanding those aspects.

I think I disagree that indexes make no sense.

I think that it would be meaningful to have an index type for this,
one that is a pointer at WAL records, to enable efficiently jumping to
the right WAL log to start accessing a data stream, given an XID.
That's a fundamentally different sort of index than we have today
(much the way that hash indexes, GiST indexes, and BTrees differ from
one another).

I'm having a hard time thinking about what happens if you have
cascaded replication, and want to carry records downstream.  In that
case, the XIDs from the original system aren't miscible with the XIDs
in a message queue on a downstream database, and I'm not sure what
we'd want to do.  Keep the original XIDs in a side attribute, maybe?
It seems weird, at any rate.  Or perhaps data from foreign sources has
got to go into a separate queue/'sorta-table', and thereby have two
XIDs: the source-system XID and the "when we loaded it in locally"
XID.
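The "two XIDs" idea can be sketched as a message that keeps its origin identity as it is relayed downstream (a hypothetical structure, not an existing format):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class QueuedMessage:
    """A queue entry that keeps its origin identity when relayed:
    the XID on the system that first captured it, plus the XID under
    which the relaying node loaded it locally."""
    origin_node: str
    origin_xid: int
    local_xid: Optional[int]   # None on the system of origin
    payload: str

def relay(msg, local_xid):
    """Re-enqueue on a downstream node: local identity changes, origin
    identity is preserved so cascaded consumers can order and dedupe."""
    return QueuedMessage(msg.origin_node, msg.origin_xid, local_xid,
                         msg.payload)
```

Keeping (origin_node, origin_xid) immutable through the cascade is what makes the foreign XIDs usable despite not being miscible with local ones.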
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?




Re: [HACKERS] Incorrect behaviour when using a GiST index on points

2012-10-18 Thread Noah Misch
On Thu, Oct 11, 2012 at 07:17:28AM -0400, Noah Misch wrote:
 On Tue, Oct 02, 2012 at 01:58:40PM -0400, Noah Misch wrote:
On Mon, Aug 27, 2012 at 7:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
There's also the big-picture question of whether we should just get rid
of fuzzy comparisons in the geometric types instead of trying to hack
indexes to work around them.
 
  In any event, I think we should entertain a patch to make the GiST operator
  class methods bug-compatible with corresponding operators.  Even if we decide
  to change operator behavior in HEAD, the back branches could use it.
 
 We have broad agreement that the specific implementation of fuzz in geometric
 comparison operators is shoddy, but nobody has voiced interest in designing a
 concrete improvement.  I propose adding a TODO item Remove or improve
 rounding in geometric comparison operators, endorsing Alexander's design, and
 reviewing his patch.  Objections?

TODO added, and here's a review:

The patch adds no regression tests; it should add tests illustrating the
problems it fixes.

I audited the other indexable geometric operators for similar problems.  This
passage in gist_point_consistent_internal(), which handles (point,point)
operators, caught my suspicion:

	case RTSameStrategyNumber:
		if (isLeaf)
		{
			result = FPeq(key->low.x, query->x) &&
					 FPeq(key->low.y, query->y);
		}
		else
		{
			result = (query->x <= key->high.x && query->x >= key->low.x &&
					  query->y <= key->high.y && query->y >= key->low.y);
		}
		break;

A leaf entry reachable from an internal entry may fall exactly on the
internal-entry bounding box.  Since we would accept a fuzzy match at the leaf
level, I think we must also accept a fuzzy match at the internal level.
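The hazard can be illustrated numerically with the geometric code's fuzzy tolerance (EPSILON is 1.0E-06 there; fpeq below mirrors, but is not, the C FPeq macro): a query point within EPSILON of an internal entry's boundary passes the fuzzy leaf test yet fails an exact internal test, so the leaf would be wrongly pruned.

```python
EPSILON = 1.0e-06   # the tolerance used by the geometric FP* macros

def fpeq(a, b):
    """Fuzzy equality, as the leaf-level same-point check uses."""
    return abs(a - b) <= EPSILON

# A query point just outside, but within EPSILON of, the internal
# entry's bounding-box edge:
internal_high_x = 1.0
query_x = 1.0 + 5e-07

leaf_matches = fpeq(internal_high_x, query_x)        # fuzzy: accepted
internal_matches_exact = query_x <= internal_high_x  # exact: rejected
```

Since the subtree is rejected before the leaf is ever visited, the internal-level test must be at least as permissive as the fuzzy leaf test.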

 *** a/src/backend/access/gist/gistproc.c
 --- b/src/backend/access/gist/gistproc.c

 ***
 *** 1326,1331  gist_point_consistent(PG_FUNCTION_ARGS)
 --- 1327,1333 
   bool   *recheck = (bool *) PG_GETARG_POINTER(4);
   boolresult;
   StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
  + BOX	   *query, *key;

This function now has query variables within subsidiary blocks redundant
with and masking this one.  Avoid doing that.

   
   switch (strategyGroup)
   {
 ***
 *** 1337,1348  gist_point_consistent(PG_FUNCTION_ARGS)
   *recheck = false;
   break;
   case BoxStrategyNumberGroup:
  ! 			result = DatumGetBool(DirectFunctionCall5(
  ! 										gist_box_consistent,
  ! 										PointerGetDatum(entry),
  ! 										PG_GETARG_DATUM(1),
  ! 										Int16GetDatum(RTOverlapStrategyNumber),
  ! 										0, PointerGetDatum(recheck)));
   break;
   case PolygonStrategyNumberGroup:
   {
 --- 1339,1356 
   *recheck = false;
   break;
   case BoxStrategyNumberGroup:
  ! 			/*
  ! 			 * This code repeats logic of on_ob which uses simple comparison
  ! 			 * rather than FP* functions.
  ! 			 */
  ! 			query = PG_GETARG_BOX_P(1);
  ! 			key = DatumGetBoxP(entry->key);
  !
  ! 			*recheck = false;
  ! 			result = key->high.x >= query->low.x &&
  ! 					 key->low.x <= query->high.x &&
  ! 					 key->high.y >= query->low.y &&
  ! 					 key->low.y <= query->high.y;

For leaf entries, this correctly degenerates to on_pb().  For internal
entries, it must, but does not, implement box_overlap().  (The fuzzy
box_overlap() would be fine.)  I recommend making gist_point_consistent()'s
treatment of boxes resemble its treatment of circles and polygons; that eases
verifying their correctness.  Call gist_box_consistent.  Then, for leaf
entries, call box_contain_pt().


GiST consistent functions often validate the strategy number, but the
circle, polygon and box branches of gist_point_consistent silently assume
strategy % GeoStrategyNumberOffset == RTContainedByStrategyNumber.  Should
they verify that assumption?  I 

Re: [HACKERS] [BUGS] BUG #7521: Cannot disable WAL log while using pg_dump

2012-10-18 Thread Gezeala M . Bacuño II
You may disable full_page_writes, but as you can see from my previous
post, disabling it did not do the trick. My zfs' USED property
continues to increase.

On Wed, Oct 17, 2012 at 3:55 PM, ichbinrene rene.romer...@gmail.com wrote:
 I'm experiencing the exact same issue:

 PostgreSQL 9.1.6 on i686-pc-linux-gnu, compiled by gcc (GCC) 4.4.6 20120305
 (Red Hat 4.4.6-4), 32-bit
 CentOS release 6.3 (Final)

 I might also turn full_page_writes off but I fear for data integrity in case
 of a crash.







Re: [HACKERS] pg_stat_lwlocks view - lwlocks statistics, round 2

2012-10-18 Thread Robert Haas
On Tue, Oct 16, 2012 at 11:31 AM, Satoshi Nagayasu sn...@uptime.jp wrote:
 A flight-recorder must not be disabled. Collecting
 performance data must be top priority for DBA.

This analogy is inapposite, though, because a flight recorder rarely
crashes the aircraft.  If it did, people might have second thoughts
about the never disable the flight recorder rule.  I have had a
couple of different excuses to look into the overhead of timing
lately, and it does indeed seem that on many modern Linux boxes even
extremely frequent gettimeofday calls produce only very modest amounts
of overhead.  Sadly, the situation on Windows doesn't look so good.  I
don't remember the exact numbers but I think it was something like 40
or 60 or 80 times slower on the Windows box one of my colleagues
tested than it is on Linux.  And it turns out that that overhead
really is measurable and does matter if you do it in a code path that
gets run frequently.  Of course I am enough of a Linux geek that I
don't use Windows myself and curse my fate when I do have to use it,
but the reality is that we have a huge base of users who only use
PostgreSQL at all because it runs on Windows, and we can't just throw
those people under the bus.  I think that older platforms like HP/UX
likely have problems in this area as well although I confess to not
having tested.

That having been said, if we're going to do this, this is probably the
right approach, because it only calls gettimeofday() in the case where
the lock acquisition is contended, and that is a lot cheaper than
calling it in all cases.  Maybe it's worth finding a platform where
pg_test_timing reports that timing is very slow and then measuring how
much impact this has on something like a pgbench or pgbench -S
workload.  We might find that it is in fact negligible.  I'm pretty
certain that it will be almost if not entirely negligible on Linux but
that's not really the case we need to worry about.
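The per-call cost in question is easy to probe from userland; a rough analogue of pg_test_timing (results vary enormously by platform, which is exactly the point):

```python
import time

def timing_overhead(n=100_000):
    """Average cost of one clock read, measured by timing a loop of
    clock reads: a crude userland analogue of pg_test_timing."""
    start = time.perf_counter()
    for _ in range(n):
        time.perf_counter()
    return (time.perf_counter() - start) / n   # seconds per call
```

On a platform where this comes out in the tens of nanoseconds, per-contended-acquisition timing is likely lost in the noise; at microseconds per call it will show up in a pgbench run.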

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Deprecating RULES

2012-10-18 Thread Robert Haas
On Wed, Oct 17, 2012 at 7:25 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Josh Berkus j...@agliodbs.com writes:
 I would tend to say well, they're not hurting anyone, why not keep
 them? Except that we're gathering an increasing number of features
 (RETURNING, FDWs, CTEs, Command triggers) which don't work well together
 with RULEs.

 Really?  On what do you base that claim?  The only one of those that I
 might believe is command triggers, but AFAIK we only have/plan command
 triggers for DDL, so there's no overlap.

 I'm fairly annoyed by the entire tenor of this conversation, because
 the people who are hollering the loudest seem to be people who have
 never actually touched any of the rules code, but nonetheless seem
 prepared to tell those of us who have what to spend our time on.

 Now having said that, I would definitely like to see rules in their
 current form go away eventually.  But not without a substitute.
 Triggers are not a complete replacement, and no amount of wishful
 thinking makes them so.

 Perhaps it would be more profitable to try to identify the pain points
 that make people so eager to get rid of rules, and then see if we could
 alleviate them.  One big problem I know about offhand is the
 multiple-evaluation risk, which seems at least in principle fixable.
 What others are there?

Well, it'd be nice to be able to rewrite a query referring to a table
to still refer to that same table, but you can't, because you get
infinite recursion.
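For readers who haven't hit it, the recursion is easy to reproduce — a minimal sketch (table and rule names made up; ON INSERT used for brevity):

```sql
CREATE TABLE t (x int);
-- Try to rewrite inserts into t as (modified) inserts into t itself:
CREATE RULE t_ins AS ON INSERT TO t
    DO INSTEAD INSERT INTO t VALUES (NEW.x + 1);
INSERT INTO t VALUES (1);
-- ERROR:  infinite recursion detected in rules for relation "t"
```

The rewriter expands the rule's action, which again refers to t, which triggers the rule again, with no way to say "this reference means the real table."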

If you could do that, it'd presumably be a WHOLE lot faster than
triggers.  Because frankly, if triggers were infinitely fast, I don't
think we'd be having this conversation.  But they're not.  They're
slow; really slow, and while we may be able to improve that somewhat
by some means, they're basically always going to be slow.  Being able
to rewrite queries inside the server is useful, and rules are not a
very good solution to that problem, but right now they're the only
thing we've got.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] embedded list

2012-10-18 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Here's the final version.  I think this is ready to go in.

I got around to reviewing this today.  I'm pretty seriously annoyed at
the definition of dlist_delete: it should *not* require the list header.
The present coding simply throws away one of the primary advantages of
a doubly-linked list over a singly-linked list, namely that you don't
have to have your hands on the list header in order to unlink a node.
This isn't merely academic either, as I see that the patch to catcache
code actually added a field to struct catctup to support making the
list header available.  That's a complete waste of 8 bytes (on a 64-bit
machine) per catalog cache entry.  The only thing it buys for us is
the ability to run dlist_check, which is something that isn't even
compiled (not even in an Assert build), and which doesn't actually do
that much useful even if it is compiled --- for instance, there's no way
to verify that the nodes were actually in the list claimed.

I think we should remove the head argument at least from dlist_delete,
and probably also dlist_insert_after and dlist_insert_before.
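The property being defended — that a doubly-linked node can be unlinked using only its neighbours, with no reference to the list header — can be sketched like this (a Python stand-in for the C ilist code, not the actual implementation):

```python
class DNode:
    """Node of a circular doubly-linked list; the header is itself a
    node, in the embedded-list style under discussion."""
    def __init__(self, value=None):
        self.value = value
        self.prev = self
        self.next = self

def dlist_insert_after(pos, node):
    node.prev = pos
    node.next = pos.next
    pos.next.prev = node
    pos.next = node

def dlist_delete(node):
    # The neighbours alone suffice to unlink -- no head argument.
    node.prev.next = node.next
    node.next.prev = node.prev

head = DNode()
a, b = DNode("a"), DNode("b")
dlist_insert_after(head, a)       # head <-> a
dlist_insert_after(a, b)          # head <-> a <-> b
dlist_delete(a)                   # unlink a without touching head
assert head.next is b and b.prev is head
```

Requiring the header anyway, as the patch does, forces callers that don't naturally have it (like the catcache entries mentioned above) to store an extra pointer per node just to satisfy the API.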

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Skip checkpoint on promoting from streaming replication

2012-10-18 Thread Alvaro Herrera
This patch seems to have been neglected by both its submitter and the
reviewer.  Also, Simon said he was going to set it
returned-with-feedback on his last reply, but I see it as needs-review
still in the CF app.  Is this something that is going to be reconsidered
and resubmitted for the next commitfest?  If so, please close it up in
the current one.

Thanks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Prefetch index pages for B-Tree index scans

2012-10-18 Thread Claudio Freire
I've noticed, doing some reporting queries once, that index scans fail
to saturate server resources on compute-intensive queries.

Problem is, just after fetching a page, postgres starts computing
stuff before fetching the next. This results in I/O - compute - I/O -
compute alternation that results in idle CPU and disk, both.

I've also noticed that earlier patches attempted to implement prefetch
of index pages, yet they don't seem to have been committed. I'm wondering why.

The patches themselves were quite complex, attempting to prefetch heap
tuples in addition to index tuples. I can see how this would be
beneficial, but heap prefetch is notoriously more complicated, and
with the advent of index-only scans, maybe index-page-only prefetching
would be of use.

To test this hypothesis, I wrote the attached patch. pgbench doesn't
seem to see any performance impact (expected, since pgbench uses
really tiny queries that touch a single page). Using pgbench's data
with a scale of 1000, I tried some other queries that were more taxing
of index-only scans. This was done on my personal computer, with a
really lame I/O subsystem (compared to database servers), and with
9.2.1 rather than git, but I think it should be significant anyway. I
will try to verify on RAID-ed disks, but I'm in short supply of test
systems at the moment.

Pgbench's biggest index (on pgbench_accounts.aid) is, unsurprisingly,
quite sequential, since it has just been created. So, I
tested both forward and backward index scans, to get an idea of how
the patch impacts on sequential and non-sequential workloads. I
haven't been able to test truly random ones yet - I'm trying to load
up a dump from a production database that's both big and messy to
test.

A few things worth noting: the patch avoids doing prefetch on
single-page index accesses, and attempts it only when block numbers
aren't sequentially increasing (only when scanning the index
nonsequentially will prefetch be attempted), since I noticed the
fadvise call was being counterproductive in the sequential cases.
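That heuristic — hint the kernel only when the next block is not sequentially adjacent — can be sketched as follows (Python and a plain file standing in for the actual C patch and buffer manager; `maybe_prefetch` is a made-up name, and `os.posix_fadvise` requires a Unix platform):

```python
import os
import tempfile

BLCKSZ = 8192   # PostgreSQL's default block size

def maybe_prefetch(fd, last_block, next_block):
    """Hint the kernel about an upcoming read, but skip the hint when
    block numbers increase sequentially -- there, OS readahead already
    covers us and the extra fadvise syscall is counterproductive."""
    if next_block == last_block + 1:
        return False                     # sequential: rely on readahead
    os.posix_fadvise(fd, next_block * BLCKSZ, BLCKSZ,
                     os.POSIX_FADV_WILLNEED)
    return True

with tempfile.TemporaryFile() as f:
    f.write(b"\0" * (BLCKSZ * 16))
    f.flush()
    sequential = maybe_prefetch(f.fileno(), 3, 4)    # no hint issued
    jump = maybe_prefetch(f.fileno(), 4, 11)         # hint issued
assert sequential is False
assert jump is True
```

The effective_io_concurrency GUC then controls how many such hints are kept in flight ahead of the scan position.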

So, here we go:

The base query I used, is:

explain analyze select aid, count(*) from pgbench_accounts where aid
between 1000 and 2 group by aid order by aid;

For backward scans, just order by aid desc.

The full command used is
sudo bash -c 'echo 3 > /proc/sys/vm/drop_caches' ; pg_ctl start -l
${PGLOG} ; sleep 5 ; ( sleep 10 ; echo 'set effective_io_concurrency
to 0;' ; echo 'explain analyze select aid, count(*) from
pgbench_accounts where aid between 1000 and 2 group by aid
order by aid;' ) | psql -h localhost -p 5433 pgbench ; sleep 5 ;
pg_ctl stop
server starting

The server is started and stopped every time to make sure the shared
cache is empty, and the sleeps are there to keep backlash I/O from
dropping the caches from influencing the benchmark.

Results:

With effective_io_concurrency set to 0 (patch disabled):

Forward:

  QUERY PLAN

 GroupAggregate  (cost=0.00..4149039.04 rows=90257289 width=4) (actual
time=47.552..31113.353 rows=9001 loops=1)
   ->  Index Only Scan using pgbench_accounts_pkey on pgbench_accounts
 (cost=0.00..2795179.71 rows=90257289 width=4) (actual
time=47.542..13197.982 rows=9001 loops=1)
         Index Cond: ((aid >= 1000) AND (aid <= 2))
 Heap Fetches: 0
 Total runtime: 33648.500 ms
I/O throughput averages 60MB/s (clearly sequential)
I/O utilization averages around 30%

Backward:

   QUERY PLAN
--
 GroupAggregate  (cost=0.00..4149039.04 rows=90257289 width=4) (actual
time=10.279..159704.590 rows=9001 loops=1)
   ->  Index Only Scan Backward using pgbench_accounts_pkey on
pgbench_accounts  (cost=0.00..2795179.71 rows=90257289 width=4)
(actual time=10.266..132853.382 rows=9001 loops=1)
         Index Cond: ((aid >= 1000) AND (aid <= 2))
 Heap Fetches: 0
 Total runtime: 163202.869 ms
I/O throughput averages 11MB/s (clearly not fully random, but not sequential either)
I/O utilization averages 68%


With effective_io_concurrency set to 1 (patch enabled):

Forward:

  QUERY PLAN

 GroupAggregate  (cost=0.00..4149039.04 rows=90257289 width=4) (actual
time=47.582..30474.222 rows=9001 loops=1)
   ->  Index Only Scan using pgbench_accounts_pkey on pgbench_accounts
 (cost=0.00..2795179.71 rows=90257289 width=4) (actual
time=47.571..12208.340 rows=9001 loops=1)
         Index Cond: ((aid >= 1000) AND (aid <= 2))
 Heap Fetches: 0
 Total runtime: 32875.695 ms
I/O 

Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-10-18 Thread Boszormenyi Zoltan

On 2012-10-18 20:08, Tom Lane wrote:

Alvaro Herrera alvhe...@2ndquadrant.com writes:

Boszormenyi Zoltan escribió:

this is the latest one, fixing a bug in the accounting
of per-statement lock timeout handling and tweaking
some comments.

Tom, are you able to give this patch some more time on this commitfest?

I'm still hoping to get to it, but I've been spending a lot of time on
bug fixing rather than patch review lately :-(.  If you're hoping to
close out the current CF soon, maybe we should just slip it to the next
one.


Fine by me. Thanks.

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-18 Thread Andrew Dunstan


On 10/18/2012 01:11 PM, Daniel Farina wrote:


Here's another use case from my history with RULES that didn't seem
to pan out so well: In my recollection, one way to use rules is to
retarget operations that happen against a view and move them to a
table, and as I recall to make this work as one expected one had to
have a very wordy RULE (for UPDATEs) with a litany of (fairly simple)
equality and not-null conditions to make it work as one would expect
(to not under-constrain the UPDATE).  This became a maintenance
headache whenever attributes were added to the underlying relation.



Yes, but you also get a similar headache with a trigger. Unless you're 
VERY careful you can get a trigger failure by adding an attribute, and 
an almost guaranteed one by removing an attribute. It's true that the 
language for specifying the operations is more expressive, but no matter 
what mechanism you use, changing the shape of the objects can get you 
into trouble.


I've never said that rules are perfect, nor that they should be used 
whenever possible. What I have said is that there are known cases where 
they are the best solution currently available. I still think that.


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] embedded list

2012-10-18 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Oops.  I mentioned this explicitly somewhere in the discussion.  I
 assumed you had seen that, and that you would have complained had you
 found it objectionable.

Sorry, I've been too busy to pay very much attention to this patch.

 I think we should remove the head argument at least from dlist_delete,
 and probably also dlist_insert_after and dlist_insert_before.

 There are more functions that get the list head just to run the check.
 Can I assume that you don't propose removing the argument from those?
 (dlist_next_node, dlist_prev_node I think are the only ones).

Yeah, I wondered whether to do the same for those.  But it's less of an
issue there, because in practice the caller is almost certainly going to
also need to do dlist_has_next or dlist_has_prev respectively, and those
require the list header.

On the other hand, applying the same principle to slists, you could
argue that slist_has_next and slist_next_node should not require the
head pointer (since that's throwing away an advantage of slists).
If we wanted to remove the head pointer from those, there would be some
value in not having the head argument in dlist_next_node/dlist_prev_node
for symmetry with slist_next_node.

I'm not as excited about these since it seems relatively less likely to
matter.  What do you think?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-18 Thread Любен Каравелов

Well, it'd be nice to be able to rewrite a query referring to a table
to still refer to that same table, but you can't, because you get
infinite recursion.


If that were possible, it would be quite easy to express any row/column-level
security policies with it.


If you could do that, it'd presumably be a WHOLE lot faster than
triggers. Because frankly, if triggers were infinitely fast, I don't
think we'd be having this conversation. But they're not. They're
slow; really slow, and while we may be able to improve that somewhat
by some means, they're basically always going to be slow. Being able
to rewrite queries inside the server is useful, and rules are not a
very good solution to that problem, but right now they're the only
thing we've got.


Moreover, there are no triggers fired on SELECTs.


Best regards


--


Luben Karavelov

Re: [HACKERS] Skip checkpoint on promoting from streaming replication

2012-10-18 Thread Simon Riggs
On 18 October 2012 21:22, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 This patch seems to have been neglected by both its submitter and the
 reviewer.  Also, Simon said he was going to set it
 returned-with-feedback on his last reply, but I see it as needs-review
 still in the CF app.  Is this something that is going to be reconsidered
 and resubmitted for the next commitfest?  If so, please close it up in
 the current one.

I burned time on the unlogged table problems, so haven't got round to
this yet. I'm happier than I was with this.

I'm also conscious this is very important and there are no later patch
dependencies, so there's no rush to commit it and every reason to make
sure it happens without any mistakes. It will be there for 9.3.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-10-18 Thread Claudio Freire
On Thu, Oct 18, 2012 at 5:30 PM, Claudio Freire klaussfre...@gmail.com wrote:
 Backward:

QUERY PLAN
 --
  GroupAggregate  (cost=0.00..4149039.04 rows=90257289 width=4) (actual
 time=28.190..157708.405 rows=9001 loops=1)
    ->  Index Only Scan Backward using pgbench_accounts_pkey on
 pgbench_accounts  (cost=0.00..2795179.71 rows=90257289 width=4)
 (actual time=28.178..135282.317 rows=9001 loops=1)
   Index Cond: ((aid >= 1000) AND (aid <= 2))
  Heap Fetches: 0
  Total runtime: 160735.539 ms
 I/O throughput averages 12MB/s (a small increase), and the 3-second
 difference seems related to it (it's consistent).
 I/O utilization averages 88% (important increase)

 This last result makes me think deeper prefetching could be
 potentially beneficial (it would result in read merges), but it's
 rather hard to implement without a skiplist of leaf pages. Maybe the
 backward-sequential pattern could be detected. I'll have to tinker
 with that.

Fun. That didn't take long.

With the attached anti-sequential scan patch, and effective_io_concurrency=8:


  QUERY PLAN
-
 GroupAggregate  (cost=0.00..4149039.04 rows=90257289 width=4) (actual
time=26.964..84299.789 rows=9001 loops=1)
   ->  Index Only Scan Backward using pgbench_accounts_pkey on
pgbench_accounts  (cost=0.00..2795179.71 rows=90257289 width=4)
(actual time=26.955..62761.774 rows=9001 loops=1)
         Index Cond: ((aid >= 1000) AND (aid <= 2))
 Heap Fetches: 0
 Total runtime: 87170.355 ms
I/O throughput 22MB/s (twice as fast)
I/O utilization 95% (I was expecting 100% but... hey... good enough)

With e_i_c=24, it gets to 100% utilization and 30MB/s (that's 3 times
faster). So, I'd like to know what you think, but maybe for
back-sequential scans, prefetch should be set to a multiple (ie: x24)
of e_i_c, in order to exploit read request merges.


postgresql-git-bt_prefetch_backseq.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecations in authentication

2012-10-18 Thread Peter Eisentraut
On Thu, 2012-10-18 at 13:20 +0200, Magnus Hagander wrote:
 In particular, we made a couple of changes over several releases back
 in the authentication config, that we should perhaps consider
 finishing by removing the old stuff now?
 
 1. krb5 authentication. We've had gssapi since 8.3 (which means in all
 supported versions). krb5 has been deprecated, also since 8.3. Time to
 remove it?
 
 2. ident-over-unix-sockets was renamed to peer in 9.1, with the old
 syntax deprecated but still mapping to the new one. Has it been there
 long enough that we should start throwing an error for ident on unix?
 
The hba syntax changes between 8.3 and 8.4 continue to annoy me to this
day, so I'd like to avoid these in the future, especially if they are
for mostly cosmetic reasons.  I think any change should be backward
compatible to all supported versions, or alternatively to 8.4, since
that's incompatible with 8.3 anyway.  (Those two will be the same before
9.3 goes out.)

So, in my opinion, krb5 could be removed, assuming that gssapi is a full
substitute.  But ident-over-unix-sockets should stay, at least until 9.0
is EOL.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecations in authentication

2012-10-18 Thread Peter Eisentraut
On Thu, 2012-10-18 at 12:38 -0400, Tom Lane wrote:
 I think the argument that it causes user confusion is a fairly strong
 one, though.

What is confusing, IMO, is changing the hba syntax all the time.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_lwlocks view - lwlocks statistics, round 2

2012-10-18 Thread Ants Aasma
On Thu, Oct 18, 2012 at 10:36 PM, Robert Haas robertmh...@gmail.com wrote:
 Sadly, the situation on Windows doesn't look so good.  I
 don't remember the exact numbers but I think it was something like 40
 or 60 or 80 times slower on the Windows box one of my colleagues
 tested than it is on Linux.

Do you happen to know the hardware and Windows version? Windows
QueryPerformanceCounter that instr_time.h uses should use RDTSC based
timing when the hardware can support it, just like Linux. I don't know
if Windows can avoid syscall overhead though.

 Maybe it's worth finding a platform where
 pg_test_timing reports that timing is very slow and then measuring how
 much impact this has on something like a pgbench or pgbench -S
 workload.

This can easily be tested on Linux by changing to the hpet or acpi_pm
clocksource. There probably still are platforms that can do worse than
this, but probably not by orders of magnitude.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-18 Thread Daniel Farina
On Thu, Oct 18, 2012 at 1:55 PM, Andrew Dunstan and...@dunslane.net wrote:

 On 10/18/2012 01:11 PM, Daniel Farina wrote:

  Here's another use case from my history with RULES that didn't seem
 to pan out so well: In my recollection, one way to use rules is to
 retarget operations that happen against a view and move them to a
 table, and as I recall to make this work as one expected one had to
 have a very wordy RULE (for UPDATEs) with a litany of (fairly simple)
 equality and not-null conditions to make it work as one would expect
 (to not under-constrain the UPDATE).  This became a maintenance
 headache whenever attributes were added to the underlying relation.



 Yes, but you also get a similar headache with a trigger. Unless you're VERY
 careful you can get a trigger failure by adding an attribute, and an almost
 guaranteed one by removing an attribute. It's true that the language for
 specifying the operations is more expressive, but no matter what mechanism
 you use, changing the shape of the objects can get you into trouble.

 I've never said that rules are perfect, nor that they should be used
 whenever possible. What I have said is that there are known cases where they
 are the best solution currently available. I still think that.

I'm not going to disagree with that. I only feel it's reasonable to
ask those who react so strongly against deprecation why they think
what they do, and to receive a clinical response, because not everyone
has seen those use cases.  My interest in deprecation goes only as far
as this: if those who have to deal with the RULES implementation don't
want to work on it anymore in favor of other things, then the pain of
deprecation to users is, from my vantage point, manageable if given
some time.

I also want to be very clear that I know my vantage point is skewed,
but I feel like exposing what assessment of user activity I can to
-hackers is important, and the best I can do when it comes to
considering topics like these.

-- 
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-18 Thread Josh Berkus
Daniel,

 I'm not going to disagree with that. I only feel it's reasonable to
 ask those who react so strongly against deprecation why they think
 what they do, and to receive a clinical response, because not everyone
 has seen those use cases.  My interest in deprecation goes only as far
 as this: if those who have to deal with the RULES implementation don't
 want to work on it anymore in favor of other things, then the pain of
 deprecation to users is, from my vantage point, manageable if given
 some time.

Note that you have heard from one of the people maintaining RULES, who
doesn't find them problematic to maintain (Tom).  Note that the original
hackers calling for deprecation do not work on RULEs except where they
touch other features.

And I'll say again: if you want a full list of use-cases for RULEs, you
need to go further than the -hackers list.  This is a small, insular
community which does not represent the majority of PostgreSQL users.
You have gone to Heroku's users, but given the nature of your user base,
they seem like the least likely group of people to use RULEs.  That's
like me polling a bunch of Data Warehousing geeks and then declaring
that we don't really need SERIALIZABLE.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-18 Thread Daniel Farina
On Thu, Oct 18, 2012 at 6:10 PM, Josh Berkus j...@agliodbs.com wrote:
 Daniel,

  I'm not going to disagree with that. I only feel it's reasonable to
  ask those who react so strongly against deprecation why they think
  what they do, and to receive a clinical response, because not everyone
  has seen those use cases.  My interest in deprecation goes only as far
  as this: if those who have to deal with the RULES implementation don't
  want to work on it anymore in favor of other things, then the pain of
  deprecation to users is, from my vantage point, manageable if given
  some time.

 Note that you have heard from one of the people maintaining RULES, who
 doesn't find them problematic to maintain (Tom).  Note that the original
 hackers calling for deprecation do not work on RULEs except where they
 touch other features.

 And I'll say again: if you want a full list of use-cases for RULEs, you
 need to go further than the -hackers list.  This is a small, insular
 community which does not represent the majority of PostgreSQL users.
 You have gone to Heroku's users, but given the nature of your user base,
 they seem like the least likely group of people to use RULEs.  That's
 like me polling a bunch of Data Warehousing geeks and then declaring
 that we don't really need SERIALIZABLE.

I have tried very assiduously to avoid generalizing, even though
perhaps I have failed.  I made a false assumption that nobody wanted
to work with RULES that I am very sorry for.

What I do not like, and stand by, is that I did not like the form of
terse dismissal of even the idea of deprecation, with a feeling that
it is entirely unnecessary to explain that assessment in any detail
for dozens of emails.  I don't think it's a healthy thing, especially
for a community where (traditionally) deprecation comes up so seldom.
I don't think this took place on the other deprecation threads so much
that spawned in this one's wake.

I can only offer the data I have.  Please do not over-read in what I
have been trying to communicate, or conflate my position with those of
other individuals, if you feel there is a chance of that.

Let us please consider the matter resolved unless you feel you have
more pointers for me -- anyone can send them to me individually,
perhaps. I am not soldiering for the deprecation of RULES; my concern
is rather the community's approach to a mere suggestion of certain
kinds of change. Perhaps this is a one-off, because discussion in the
other threads has seemed healthier.

-- 
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-10-18 Thread Tom Lane
Marco Nenciarini marco.nenciar...@2ndquadrant.it writes:
 Please find the attached refreshed patch (v2) which fixes the loose ends
 you found.

I've started looking at this patch, and the first thing I notice is that
the syntax doesn't work.  It's ambiguous, and this:

  %left JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL
  /* kluge to keep xml_whitespace_option from causing shift/reduce conflicts */
  %right    PRESERVE STRIP_P
+ %nonassoc ELEMENT
  
  %%

is not in any way an acceptable fix.  All that that will do is cause an
arbitrary choice to be made when it's not clear what to do.  Half the
time the arbitrary choice will be wrong.  Consider for example

regression=# create table t1 (f1 int[] default 4! element references t2);
ERROR:  column element does not exist

The parser has resolved the ambiguity about whether ! is an infix or
postfix operator by assuming it's infix.  (Yeah, I realize we've fixed
some similar cases with precedence hacks, but they are cases that were
forced on us by brain-dead syntax choices in the SQL standard.  We don't
need to go there for syntax we're making up ourselves.)

We could get around that by making ELEMENT a fully reserved word, but
I don't think that's a really acceptable solution.  ELEMENT is reserved
according to more recent versions of the SQL standard, but only as a
built-in function name, and in any case reserving it is very likely to
break people's existing applications.

Another possibility is to forget about the column constraint ELEMENT
REFERENCES syntax, and only support the table-constraint syntax with
ELEMENT inside the column list --- I've not checked, but I think that
syntax doesn't have any ambiguity problems.

Or we could go back to using ARRAY here --- that should be safe since
ARRAY is already fully reserved.

Or we could choose some other syntax.  I'm wondering about dropping the
use of a keyword entirely, and instead using '[]' decoration.  This
wouldn't work too badly in the table constraint case:

FOREIGN KEY (foo, bar[]) REFERENCES t (x,y)

but I'm less sure where to put the decoration for the column constraint
case.

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-18 Thread Ants Aasma
On Thu, Oct 18, 2012 at 10:03 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
 Hmm. Maybe we should think of implementing this as REMOTE TABLE, that
 is a table which gets no real data stored locally but all insert got through
 WAL
 and are replayed as real inserts on slave side.

FWIW, MySQL calls this exact concept the black hole storage engine.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-10-18 Thread Tom Lane
I wrote:
 Or we could go back to using ARRAY here --- that should be safe since
 ARRAY is already fully reserved.

Ugh ... no, that doesn't work, because ARRAY[...] is allowed in c_expr
and hence b_expr.  So the ambiguity would still be there.  We'd need a
different fully-reserved keyword to go this way.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hash_search and out of memory

2012-10-18 Thread Hitoshi Harada
On Thu, Oct 18, 2012 at 8:35 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 Hitoshi Harada umi.tan...@gmail.com writes:
 If OOM happens during expand_table() in hash_search_with_hash_value()
 for RelationCacheInsert,

 the palloc-based allocator does throw
 errors.  I think that when that was designed, we were thinking that
 palloc-based hash tables would be thrown away anyway after an error,
 but of course that's not true for long-lived tables such as the relcache
 hash table.

 I'm not terribly comfortable with trying to use a PG_TRY block to catch
 an OOM error - there are too many ways that could break, and this code
 path is by definition not very testable.  I think moving up the
 expand_table action is probably the best bet.  Will you submit a patch?

Here it is. I factored out the bucket finding code to re-calculate it
after expansion.
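The ordering fix being described — grow the table before linking the new entry, and find the bucket only afterwards, so that an out-of-memory failure during expansion cannot leave a half-inserted entry behind — might be sketched like this (a toy Python analogue, not the actual dynahash code; all names are made up):

```python
class ToyHashTable:
    """Separate-chaining table illustrating expand-before-insert:
    an allocation failure in _maybe_expand propagates before the new
    entry exists, leaving the table fully consistent."""
    def __init__(self, nbuckets=4, fillfactor=1.0):
        self.buckets = [[] for _ in range(nbuckets)]
        self.nentries = 0
        self.fillfactor = fillfactor

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def _maybe_expand(self):
        # A MemoryError raised anywhere in here leaves self.buckets
        # either untouched or wholly rebuilt -- never half-inserted.
        if self.nentries + 1 > len(self.buckets) * self.fillfactor:
            entries = [e for b in self.buckets for e in b]
            self.buckets = [[] for _ in range(len(self.buckets) * 2)]
            for k, v in entries:
                self._bucket(k).append((k, v))

    def insert(self, key, value):
        self._maybe_expand()                     # grow first
        self._bucket(key).append((key, value))   # bucket found afterwards
        self.nentries += 1

t = ToyHashTable()
for i in range(10):
    t.insert(i, str(i))
assert len(t.buckets) == 16                      # expanded 4 -> 8 -> 16
```

Recomputing the bucket after expansion matters because the expansion changes which bucket the key hashes into.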

Thanks,
-- 
Hitoshi Harada


hashoom.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers