Re: [HACKERS] Re: PostgreSql - access modified rows in prepare transaction command

2013-02-21 Thread Heikki Linnakangas

On 20.02.2013 13:39, pierpaolo.cincilla wrote:

Thank you Heikki for your reply. As you suggest, I will explain better what
I'm trying to accomplish.

I'm writing a distributed two-phase-commit termination protocol that
works in this manner:

1) Each site has a replica of the database. Site A performs a transaction
t1 and prepares it (PREPARE TRANSACTION 't1'). Then it atomically broadcasts
a certification request for transaction t1 *along with its writeset*
(the values updated by t1) to the other sites.

2) When a site receives the certification request for transaction t1, it
performs the certification (checks that there are no concurrent conflicting
transactions). If the certification succeeds, then
2a) if the transaction is local (i.e. originated at that site), it commits
the transaction (COMMIT PREPARED 't1');
2b) if the transaction is remote (i.e. prepared at another site), *it applies
the writeset of transaction t1 locally* to reflect the modifications in its
local replica of the database (UPDATE commands).


The usual way to keep two identical databases in sync using two-phase 
commit is to just run all the statements in both databases. That assumes 
that the statements always produce identical results in both databases, 
though.



The problem is that if I can't fetch the writeset of a transaction in phase
1 (before the commit request), then when I certify the transaction at another
site I can't apply the updates performed by the remote transaction right
away; I have to wait for the originating site to commit the transaction and
send back its writeset (which becomes visible only then). This would be very
bad because it adds an extra round to the algorithm.


You could fetch the writeset in the same connection just before 
calling PREPARE TRANSACTION. While the transaction is still active, its 
changes are visible to the transaction itself.
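
A minimal sketch of that approach (the table name and values are hypothetical,
and 't1' stands for the global transaction identifier used by the protocol):

```sql
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- The transaction is still active here, so its own uncommitted changes
-- are visible to it: fetch the writeset now and broadcast it.
SELECT id, balance FROM accounts WHERE id = 1;

PREPARE TRANSACTION 't1';

-- Later, once certification succeeds at this (originating) site:
COMMIT PREPARED 't1';
```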


Aside from any extra round-trips, the bigger reason you can't commit 
first and then fetch the writeset is that you can't roll back the 
transaction anymore, if the writeset can't be applied on the other node. 
If you could live with that, and the problem is just the latency, then 
you don't need two-phase commit to begin with.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 The way I was thinking about it, whatever the command is named, you
 might be able to tell the database to drop the storage associated with
 the view but that would make the view invalid until it was refreshed.
 It wouldn't make it appear to be empty.

Actually, that seems like a pretty key point to me.  TRUNCATE TABLE
results in a table that is perfectly valid, you just deleted all the
rows that used to be in it.  Throwing away an MV's contents should
not result in an MV that is considered valid.  That being the case,
lumping them as being the same operation feels like the wrong thing,
and so we should choose a different name for the MV operation.

regards, tom lane




Re: [HACKERS] Unarchived WALs deleted after crash

2013-02-21 Thread Heikki Linnakangas

On 21.02.2013 02:59, Daniel Farina wrote:

On Fri, Feb 15, 2013 at 9:29 AM, Simon Riggs si...@2ndquadrant.com wrote:

On 15 February 2013 17:07, Heikki Linnakangas hlinnakan...@vmware.com wrote:


Unfortunately in HEAD, the xxx.done file is not created when restoring an
archived file because of the absence of the patch. We need to implement
that first.



Ah yeah, that thing again..
(http://www.postgresql.org/message-id/50df5ba7.6070...@vmware.com) I'm going
to forward-port that patch now, before it's forgotten again. It's not clear
to me what the holdup was on this, but whatever the bigger patch we've been
waiting for is, it can just as well be done on top of the forward-port.


Agreed. I wouldn't wait for a better version now.


Related to this, how is this going to affect point releases, and are
there any lingering doubts about the mechanism of the fix?


Are you talking about the patch to avoid restored WAL segments from 
being re-archived (commit 6f4b8a4f4f7a2d683ff79ab59d3693714b965e3d), or 
the bug where unarchived WALs were deleted after a crash (commit 
b5ec56f664fa20d80fe752de494ec96362eff520)? The former was included in 
9.2.0 already, and the latter will be included in the next point release.


I have no lingering doubts about this. There were some plans to do bigger 
changes for the re-archiving issue 
(6f4b8a4f4f7a2d683ff79ab59d3693714b965e3d), which is why it was 
initially left out from master. But that didn't happen, and I believe 
everyone is happy with the current state of things.



This is
quite serious given my reliance on archiving, so unless the thinking
for point releases is 'real soon' I must backpatch and release it on
my own accord until then.


I don't know what the release schedule is. I take that to be a request 
to put out a new minor release ASAP.


- Heikki




Re: [HACKERS] Unarchived WALs deleted after crash

2013-02-21 Thread Daniel Farina
On Thu, Feb 21, 2013 at 12:39 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 21.02.2013 02:59, Daniel Farina wrote:

 On Fri, Feb 15, 2013 at 9:29 AM, Simon Riggs si...@2ndquadrant.com
 wrote:

 On 15 February 2013 17:07, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 Unfortunately in HEAD, the xxx.done file is not created when restoring an
 archived file because of the absence of the patch. We need to implement
 that first.



 Ah yeah, that thing again..
 (http://www.postgresql.org/message-id/50df5ba7.6070...@vmware.com) I'm
 going
 to forward-port that patch now, before it's forgotten again. It's not
 clear
 to me what the holdup was on this, but whatever the bigger patch we've
 been
 waiting for is, it can just as well be done on top of the forward-port.


 Agreed. I wouldn't wait for a better version now.


 Related to this, how is this going to affect point releases, and are
 there any lingering doubts about the mechanism of the fix?


 Are you talking about the patch to avoid restored WAL segments from being
 re-archived (commit 6f4b8a4f4f7a2d683ff79ab59d3693714b965e3d), or the bug
 where unarchived WALs were deleted after a crash (commit
 b5ec56f664fa20d80fe752de494ec96362eff520)? The former was included in 9.2.0
 already, and the latter will be included in the next point release.

Unarchived WALs being deleted after a crash is the one that worries
me.  I actually presume re-archivals will happen anyway because I may
lose connection to archive storage after the WAL has already been
committed, hence b5ec56f664fa20d80fe752de494ec96362eff520.

 This is
 quite serious given my reliance on archiving, so unless the thinking
 for point releases is 'real soon' I must backpatch and release it on
 my own accord until then.


 I don't know what the release schedule is. I take that to be a request to
 put out a new minor release ASAP.

Perhaps, but it's more of a concrete evaluation of how important
archiving is to me and my affiliated operation.  An acceptable answer
might be "yeah, backpatch if you feel it's that much of a rush."
Clearly, my opinion is that a gap in the archives is pretty
cringe-inducing.  I hit it from an out-of-disk case, and you'd be
surprised (or perhaps not?) how many people like to kill -9 processes
on a whim.

I already maintain other backpatches (not related to fixes), and this
one is only temporary, so it's not too much trouble for me.

--
fdr




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Andres Freund
On 2013-02-21 04:14:09 +, Greg Stark wrote:
 On Wed, Feb 20, 2013 at 9:25 PM, Peter Eisentraut pete...@gmx.net wrote:
  More generally, I would consider the invalidation of a materialized view
  a DDL command, whereas truncating a table is a DML command.
 
 That's not entirely true. From the database's point of view, TRUNCATE
 is in many ways actually DDL.
 
 I actually don't really dislike using TRUNCATE for this command. I
 was more asking about whether this meant people were thinking of the
 view as a thing where you could control the data in it by hand and
 could have the view be empty rather than just not valid.

It also might get confusing when we get materialized views that are
auto-updateable. I am not suggesting forwarding TRUNCATE to the internal
storage in that case, but raising an error instead, so that there is an
easy-to-find distinction from a normal table, seems like a good idea.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] FDW for PostgreSQL

2013-02-21 Thread Tom Lane
Shigeru Hanada shigeru.han...@gmail.com writes:
 [ postgres_fdw.v5.patch ]

Applied with a lot of revisions.

There are still a number of loose ends and things that need to be
discussed:

I'm not at all happy with the planner support functions --- as designed,
that code is basically incapable of thinking about more than one remote
access path.  It needs to be restructured and then extended to be able
to generate parameterized remote paths.  The local estimation mode is
pretty bogus as well.  I thought the patch could be committed anyway,
but I'm going to go back and work on that part later.

I doubt that deparseFuncExpr is doing the right thing by printing
casts as their underlying function calls.  The comment claims that
this way is more robust but I rather think it's less so, because it's
effectively assuming that the remote server implements casts exactly
like the local one, which might be incorrect if the remote is a different
Postgres version.  I think we should probably change that, but would like
to know what the argument was for coding it like this.  Also, if this
is to be the approach to printing casts, why is RelabelType handled
differently?

As I mentioned earlier, I think it would be better to force the remote
session's search_path setting to just pg_catalog and then reduce the
amount of explicit schema naming in the queries --- any opinions about
that?

I took out the checks on collations of operators because I thought they
were thoroughly broken.  In the first place, looking at operator names
to deduce semantics is unsafe (if we were to try to distinguish equality,
looking up btree opclass membership would be the way to do that).  In the
second place, restricting only collation-sensitive operators and not
collation-sensitive functions seems just about useless for guaranteeing
safety.  But we don't have any very good handle on which functions might
be safe to send despite having collatable input types, so taking that
approach would greatly restrict our ability to send function calls at all.

The bigger picture here though is that we're already relying on the user
to make sure that remote tables have column data types matching the local
definition, so why can't we say that they've got to make sure collations
match too?  So I think this is largely a documentation issue and we don't
need any automated enforcement mechanism, or at least it's silly to try
to enforce this when we're not enforcing column type matching (yet?).

What might make sense is to try to determine whether a WHERE clause uses
any collations different from those of the contained foreign-column Vars,
and send it over only if not.  That would prevent us from sending clauses
that explicitly use collations that might not exist on the remote server.
I didn't try to code this though.

Another thing that I find fairly suspicious in this connection is that
deparse.c isn't bothering to print collations attached to Const nodes.
That may be a good idea to avoid needing the assumption that the remote
server uses the same collation names we do, but if we're going to do it
like this, those Const collations need to be considered when deciding
if the expression is safe to send at all.

A more general idea that follows on from that is that if we're relying on
the user to be sure the semantics are the same, maybe we don't need to be
quite so tight about what we'll send over.  In particular, if the user has
declared a foreign-table column of a non-built-in type, the current code
will never send any constraints for that column at all, which seems overly
conservative if you believe he's matched the type correctly.  I'm not sure
exactly how to act on that thought, but I think there's room for
improvement there.

A related issue is that as coded, is_builtin() is pretty flaky, because
what's built-in on our server might not exist at all on the remote side,
if it's a different major Postgres version.  So what we've got is that
the code is overly conservative about what it will send and yet still
perfectly capable of sending remote queries that will fail outright,
which is not a happy combination.  I have no very good idea how to deal
with that though.

Another thing I was wondering about, but did not change, is that if we're
having the remote transaction inherit the local transaction's isolation
level, shouldn't it inherit the READ ONLY property as well?

regards, tom lane




Re: [HACKERS] posix_fadvise missing in the walsender

2013-02-21 Thread Robert Haas
On Wed, Feb 20, 2013 at 9:49 PM, Joachim Wieland j...@mcknight.de wrote:
 So given the above, I think it's possible to come up with benchmarks
 that prove whatever you want to prove :-)

Yeah.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Robert Haas
On Wed, Feb 20, 2013 at 11:14 PM, Greg Stark st...@mit.edu wrote:
 On Wed, Feb 20, 2013 at 9:25 PM, Peter Eisentraut pete...@gmx.net wrote:
 More generally, I would consider the invalidation of a materialized view
 a DDL command, whereas truncating a table is a DML command.

 That's not entirely true. From the database's point of view, TRUNCATE
 is in many ways actually DDL.

 I actually don't really dislike using TRUNCATE for this command.

Me neither.  I'm astonished that we're seriously considering adding
new keywords for this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Peter Eisentraut
On 2/20/13 5:03 PM, Kevin Grittner wrote:
 Peter Eisentraut pete...@gmx.net wrote:
 On 2/20/13 2:30 PM, Kevin Grittner wrote:
 Are there TRUNCATE triggers on materialized views?
 No.  Nor SELECT, INSERT, UPDATE, or DELETE triggers.  You can't
 create a trigger of any type on a materialized view.  I don't
 think that would interfere with event triggers, though.

 More generally, I would consider the invalidation of a
 materialized view a DDL command, whereas truncating a table is a
 DML command.
 
 The force of that assertion is somewhat undercut by the fact that
 the ExecuteTruncate() function does exactly what needs to be done
 to discard the data in a materialized view and make it appear as
 non-scannable.  Even if we dress it up with different syntax, it's
 not clear that we wouldn't build a TruncateStmt in the parser and
 pass it through exactly the same execution path.  We would just
 need to look at the relkind to generate a different command tag.

This is a fall-out of the implementation, and that's fine (although I'd
personally still be in favor of putting that state in the catalog, not
into the block count on disk, effectively), but I'm talking about the
external interfaces we present.

 This has various implications with triggers, logging,
 permissions.  I think it's not good to mix those two.
 
 Could you give a more concrete example of where you see a problem?

* Logging: You can set things to log DDL commands only.  I would want a
MV invalidation to be logged.

* Permissions: There is a TRUNCATE permission, would that apply here?
There is no refresh permission.

* Triggers: There are TRUNCATE triggers, but they don't apply here.

* Triggers: I don't know how event triggers work, but I'd like
materialized view events to be grouped together somehow.

* Don't know the opinion of sepgsql on all this.

I think what this all comes down to, as I've mentioned before, is that
the opposite of this proposed truncate operation is the refresh
operation, and that is a DDL command under ALTER MATERIALIZED VIEW.
Both of these fundamental operations -- truncate/refresh,
invalidate/validate, empty/refill, whatever -- should be grouped
together somehow, as far as syntax, as well logging, permissions,
trigger handling, and so on are concerned.  You don't need a new command
or key word for this.  How about ALTER MATERIALIZED VIEW DISCARD?




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Peter Eisentraut
On 2/20/13 11:14 PM, Greg Stark wrote:
 That's not entirely true. From the database's point of view, TRUNCATE
 is in many ways actually DDL.

Whether something is DDL or DML or a read operation (query) is not an
implementation detail, it's a user-exposed category.  Since TRUNCATE is
logically equivalent to DELETE, it's a DML operation, as far as the user
is concerned.





Re: [HACKERS] 9.2.3 crashes during archive recovery

2013-02-21 Thread Heikki Linnakangas

On 15.02.2013 15:49, Heikki Linnakangas wrote:

Attached is a patch for git master. The basic idea is to split
InArchiveRecovery into two variables, InArchiveRecovery and
ArchiveRecoveryRequested. ArchiveRecoveryRequested is set when
recovery.conf exists. But if we don't know how far we need to recover,
we first perform crash recovery with InArchiveRecovery=false. When we
reach the end of WAL in pg_xlog, InArchiveRecovery is set, and we
continue with normal archive recovery.


New version of this attached, with a few bugs fixed.

I'm thinking that this should be back-patched to 9.2, but not to earlier 
branches. Before 9.2, we don't PANIC at a reference to a non-existent 
page until end of recovery, even if we've already reached consistency. 
The same basic issue still exists in earlier versions, though: if you 
have hot_standby=on, the system will open for read-only queries too 
early, before the database is consistent. But this patch is invasive 
enough that I'm wary of back-patching it further, when the worst that 
can happen is that there's a small window right after startup when you 
can see an inconsistent database in hot standby mode. Maybe after we get 
some more testing of this in 9.2 and master. Opinions on that?


- Heikki
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index abc525a..29d1f96 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -189,7 +189,18 @@ static bool LocalHotStandbyActive = false;
  */
 static int	LocalXLogInsertAllowed = -1;
 
-/* Are we recovering using offline XLOG archives? (only valid in the startup process) */
+/*
+ * When ArchiveRecoveryRequested is set, archive recovery was requested,
+ * ie. recovery.conf file was present. When InArchiveRecovery is set, we are
+ * currently recovering using offline XLOG archives. These variables are only
+ * valid in the startup process.
+ *
+ * When ArchiveRecoveryRequested is true, but InArchiveRecovery is false, we're
+ * currently performing crash recovery using only XLOG files in pg_xlog, but
+ * will switch to using offline XLOG archives as soon as we reach the end of
+ * WAL in pg_xlog.
+ */
+static bool ArchiveRecoveryRequested = false;
 bool InArchiveRecovery = false;
 
 /* Was the last xlog file restored from archive, or local? */
@@ -207,10 +218,13 @@ static TimestampTz recoveryTargetTime;
 static char *recoveryTargetName;
 
 /* options taken from recovery.conf for XLOG streaming */
-bool StandbyMode = false;
+static bool StandbyModeRequested = false;
 static char *PrimaryConnInfo = NULL;
 static char *TriggerFile = NULL;
 
+/* are we currently in standby mode? */
+bool StandbyMode = false;
+
 /* whether request for fast promotion has been made yet */
 static bool fast_promote = false;
 
@@ -3217,10 +3231,10 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
 	private->emode = emode;
 	private->randAccess = (RecPtr != InvalidXLogRecPtr);
 
-	/* This is the first try to read this page. */
+	/* This is the first attempt to read this page. */
 	lastSourceFailed = false;
 
-	do
+	for (;;)
 	{
 		char   *errormsg;
 
@@ -3229,8 +3243,6 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
 		EndRecPtr = xlogreader->EndRecPtr;
 		if (record == NULL)
 		{
-			lastSourceFailed = true;
-
 			if (readFile >= 0)
 			{
 close(readFile);
@@ -3247,22 +3259,16 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
 				ereport(emode_for_corrupt_record(emode,
 								 RecPtr ? RecPtr : EndRecPtr),
 						(errmsg_internal("%s", errormsg) /* already translated */));
-
-			/* Give up, or retry if we're in standby mode. */
-			continue;
 		}
-
 		/*
 		 * Check page TLI is one of the expected values.
 		 */
-		if (!tliInHistory(xlogreader->latestPageTLI, expectedTLEs))
+		else if (!tliInHistory(xlogreader->latestPageTLI, expectedTLEs))
 		{
 			char		fname[MAXFNAMELEN];
 			XLogSegNo segno;
 			int32 offset;
 
-			lastSourceFailed = true;
-
 			XLByteToSeg(xlogreader->latestPagePtr, segno);
 			offset = xlogreader->latestPagePtr % XLogSegSize;
 			XLogFileName(fname, xlogreader->readPageTLI, segno);
@@ -3273,11 +3279,73 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
 			fname,
 			offset)));
 			record = NULL;
-			continue;
 		}
-	} while (StandbyMode && record == NULL && !CheckForStandbyTrigger());
 
-	return record;
+		if (record)
+		{
+			/* Great, got a record */
+			return record;
+		}
+		else
+		{
+			/* No valid record available from this source */
+			lastSourceFailed = true;
+
+			/*
+			 * If archive recovery was requested, but we were still doing crash
+			 * recovery, switch to archive recovery and retry using the offline
+			 * archive. We have now replayed all the valid WAL in pg_xlog, so
+			 * we are presumably now consistent.
+			 *
+			 * We require that there's at least some valid WAL present in
+			 * pg_xlog, however (!fetch_ckpt). We could recover using the 

Re: [HACKERS] [PATCH] Add PQconninfoParseParams and PQconninfodefaultsMerge to libpq

2013-02-21 Thread Amit Kapila
On Thursday, February 21, 2013 12:46 AM Heikki Linnakangas wrote:
 On 20.02.2013 11:42, Amit Kapila wrote:
  The patch for providing connection string for pg_basebackup,
  pg_receivexlog, pg_dump and pg_dumpall is attached with this mail.
 
 Thanks. Now that I look at this patch, I realize that we don't actually
 need these new functions for pg_basebackup and friends after all. We
 already have PQconninfoParse(), we can just use that.
 
 pg_dump can already take a connection string:
 
 pg_dump "dbname=postgres port=5432"
 
 For consistency with psql and other tools, perhaps we should add a -d
 option to pg_dump, so that you could do:
 
 pg_dump -d "dbname=postgres port=5432"
 
 It'd be nice to call the option -d or --dbname in all the tools. That's
 a bit confusing for pg_basebackup and pg_receivexlog, as it can *not*
 actually be a database name, but it would be otherwise consistent with
 the other commands.
 
 
 I came up with the attached three patches. The first adds -d/--dbname
 option to pg_basebackup and pg_receivexlog. 

Patch-1 is working fine.

The second adds it to
 pg_dump, per above. The third adds it to pg_dumpall.
 
 The third patch is a bit complicated. It first parses the user-
 specified connection string using PQconninfoParse, so that it can merge
 in some extra keywords: user, host, password, dbname and
 fallback_application_name. It then calls PQconnectdbParams with the
 keyword/value pairs. After making the initial connection to postgres or
 template1 database, it calls PQconninfo() to again extract the
 keyword/value pairs in effect in the connection, and constructs a new
 connection string from them. That new connection string is then passed
 to pg_dump on the command line, with the database name appended to it.
 
 That seems to work, although it's perhaps a bit Rube Goldbergian. One
 step of deparsing and parsing could be avoided by keeping the
 keyword/value pairs from the first PQconninfoParse() call, instead of
 constructing them again with PQconninfo(). I'll experiment with that
 tomorrow.

I think this whole new logic in pg_dumpall is needed because in pg_dump
(Patch-2) we have used dbname for the new option.
If, in pg_dump, we instead used a new variable conn_str for the new option
(-d), as we do at other places, and changed ConnectDatabase() in
pg_backup_db.c to work like GetConnection() in pg_basebackup, we might not
need the new deparsing-and-parsing logic in pg_dumpall.

Am I missing something, or do you feel this would be more cumbersome than
the current Patch-2 & Patch-3?
 
With Regards,
Amit Kapila.





Re: [HACKERS] FDW for PostgreSQL

2013-02-21 Thread Albe Laurenz
Tom Lane wrote:
 Applied with a lot of revisions.

I am thrilled.

 As I mentioned earlier, I think it would be better to force the remote
 session's search_path setting to just pg_catalog and then reduce the
 amount of explicit schema naming in the queries --- any opinions about
 that?

I think that would make the remote query much more readable.
It would also improve the EXPLAIN VERBOSE output, which is a user-visible
improvement.

 I took out the checks on collations of operators because I thought they
 were thoroughly broken.  In the first place, looking at operator names
 to deduce semantics is unsafe (if we were to try to distinguish equality,
 looking up btree opclass membership would be the way to do that).  In the
 second place, restricting only collation-sensitive operators and not
 collation-sensitive functions seems just about useless for guaranteeing
 safety.  But we don't have any very good handle on which functions might
 be safe to send despite having collatable input types, so taking that
 approach would greatly restrict our ability to send function calls at all.
 
 The bigger picture here though is that we're already relying on the user
 to make sure that remote tables have column data types matching the local
 definition, so why can't we say that they've got to make sure collations
 match too?  So I think this is largely a documentation issue and we don't
 need any automated enforcement mechanism, or at least it's silly to try
 to enforce this when we're not enforcing column type matching (yet?).

I think that the question what to push down is a different question
from checking column data types, because there we can rely on the
type input functions to reject bad values.

Being permissive on collation issues would lead to user problems
along the lines of "my query results are different when I
select from a remote table on a different operating system".
In my experience many users are blissfully ignorant of issues like
collation and encoding.

What about the following design principle:
Only push down conditions which are sure to return the correct
result, provided that the PostgreSQL system objects have not
been tampered with.

Would it be reasonable to push down operators and functions
only if
a) they are in the pg_catalog schema
b) they have been around for a couple of releases
c) they are not collation sensitive?

It makes me uncomfortable to think of a FDW that would happily
push down conditions that may lead to wrong query results.

 Another thing I was wondering about, but did not change, is that if we're
 having the remote transaction inherit the local transaction's isolation
 level, shouldn't it inherit the READ ONLY property as well?

That seems to me like it would be the right thing to do.

Yours,
Laurenz Albe




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Kevin Grittner
Peter Eisentraut pete...@gmx.net wrote:
 On 2/20/13 11:14 PM, Greg Stark wrote:
 That's not entirely true. From the database's point of view,
 TRUNCATE is in many ways actually DDL.

 Whether something is DDL or DML or a read operation (query) is
 not an implementation detail, it's a user-exposed category. 
 Since TRUNCATE is logically equivalent to DELETE, it's a DML
 operation, as far as the user is concerned.

Not really.  It doesn't follow the same MVCC behavior as DML.  This
is user-visible, documented behavior.
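
A sketch of that user-visible difference, assuming a populated table t
(the table and row count are hypothetical): under REPEATABLE READ, rows
removed by a concurrent DELETE remain visible to an older snapshot, but
TRUNCATE does not honor the snapshot.

```sql
-- Session 1
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM t;    -- takes a snapshot; suppose it returns 100

-- Session 2
TRUNCATE t;                -- requires ACCESS EXCLUSIVE lock; not MVCC-safe

-- Session 1 again
SELECT count(*) FROM t;    -- now returns 0, despite the older snapshot;
                           -- a concurrent DELETE would still have shown 100
COMMIT;
```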

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] FDW for PostgreSQL

2013-02-21 Thread Andres Freund
On 2013-02-21 14:23:35 +, Albe Laurenz wrote:
 Tom Lane wrote:
  Another thing I was wondering about, but did not change, is that if we're
  having the remote transaction inherit the local transaction's isolation
  level, shouldn't it inherit the READ ONLY property as well?
 
 That seems to me like it would be the right thing to do.

I am not 100% convinced of that. There might be valid use cases where a
standby executes queries on the primary that do DML. And there would be
no way out of it, I think?

Greetings,

Andres Freund




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark st...@mit.edu writes:
 The way I was thinking about it, whatever the command is named, you
 might be able to tell the database to drop the storage associated with
 the view but that would make the view invalid until it was refreshed.
 It wouldn't make it appear to be empty.

 Actually, that seems like a pretty key point to me.  TRUNCATE TABLE
 results in a table that is perfectly valid, you just deleted all the
 rows that used to be in it.  Throwing away an MV's contents should
 not result in an MV that is considered valid.

It doesn't.  That was one of the more contentious points in the
earlier bikeshedding phases.  Some felt that throwing away the
contents was a form of making the MV out of date and as such
didn't require any special handling.  Others, including myself,
felt that "data not present" was a distinct state from "generated
zero rows" and that attempting to scan a materialized view for
which data had not been generated must result in an error.  The
latter property has been maintained from the beginning -- or at
least that has been the intent.

test=# CREATE TABLE t (id int NOT NULL PRIMARY KEY, type text NOT NULL, amt 
numeric NOT NULL);
CREATE TABLE
test=# CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t 
GROUP BY type WITH NO DATA;
SELECT 0
test=# SELECT pg_relation_is_scannable('tm'::regclass);
 pg_relation_is_scannable
--------------------------
 f
(1 row)

test=# SELECT * FROM tm;
ERROR:  materialized view tm has not been populated
HINT:  Use the REFRESH MATERIALIZED VIEW command.
test=# REFRESH MATERIALIZED VIEW tm;
REFRESH MATERIALIZED VIEW
test=# SELECT pg_relation_is_scannable('tm'::regclass);
 pg_relation_is_scannable
--------------------------
 t
(1 row)

test=# TRUNCATE MATERIALIZED VIEW tm;
TRUNCATE TABLE
test=# SELECT * FROM tm;
ERROR:  materialized view tm has not been populated
HINT:  Use the REFRESH MATERIALIZED VIEW command.
test=# SELECT pg_relation_is_scannable('tm'::regclass);
 pg_relation_is_scannable
--------------------------
 f
(1 row)
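The transcript above demonstrates a small state machine: a matview is either populated (scannable, possibly with zero rows) or not populated (scanning is an error), and TRUNCATE returns it to the error state. A toy sketch of those semantics, purely illustrative (the class and names here are invented for the example, not PostgreSQL internals):

```python
class NotPopulatedError(Exception):
    """Scanning a matview whose contents have not been generated."""


class ToyMatView:
    """Toy model of the semantics discussed in this thread:
    "data not present" is a distinct state from "zero rows"."""

    def __init__(self, query):
        self.query = query   # callable producing the view's rows
        self.rows = None     # None means "not populated" (WITH NO DATA)

    def is_scannable(self):
        return self.rows is not None

    def refresh(self):
        # REFRESH MATERIALIZED VIEW: regenerate contents, become scannable
        self.rows = list(self.query())

    def truncate(self):
        # Discarding contents returns to the *error* state; it does not
        # leave behind an empty-but-valid relation the way TRUNCATE TABLE does.
        self.rows = None

    def scan(self):
        if self.rows is None:
            raise NotPopulatedError("materialized view has not been populated")
        return self.rows
```

An empty-but-refreshed view scans as zero rows, while a truncated one raises, mirroring the psql session above.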

 That being the case, lumping them as being the same operation
 feels like the wrong thing, and so we should choose a different
 name for the MV operation.

There is currently no truncation of MV data without rendering the
MV unscannable.  Do you still feel it needs a different command
name?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Kevin Grittner
Greg Stark st...@mit.edu wrote:

 I actually don't really dislike using TRUNCATE for this
 command.  I was more asking about whether this meant people were
 thinking of the view as a thing where you could control the data
 in it by hand and could have the view be empty rather than just
 not valid.

You can either populate the MV in the CREATE command or by REFRESH,
and it will be scannable.  If it is created WITH NO DATA or
TRUNCATEd it is not scannable, generating an error on an attempt to
reference it.

test=# select * from tm;
 type | totamt
------+--------
 y    | 12
 z    | 24
 x    |  5
(3 rows)

test=# truncate tm;
TRUNCATE TABLE
test=# select * from tm;
ERROR:  materialized view tm has not been populated
HINT:  Use the REFRESH MATERIALIZED VIEW command.

 The way I was thinking about it, whatever the command is named,
 you might be able to tell the database to drop the storage
 associated with the view but that would make the view invalid
 until it was refreshed.  It wouldn't make it appear to be empty.

I think we're on the same page after all.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] FDW for PostgreSQL

2013-02-21 Thread Tom Lane
Albe Laurenz laurenz.a...@wien.gv.at writes:
 Tom Lane wrote:
 As I mentioned earlier, I think it would be better to force the remote
 session's search_path setting to just pg_catalog and then reduce the
 amount of explicit schema naming in the queries --- any opinions about
 that?

 I think that that would make the remote query much more readable.
 That would improve EXPLAIN VERBOSE output, which is a user visible
 improvement.

Yeah, that's really the main point.  OPERATOR() is tremendously ugly...

 The bigger picture here though is that we're already relying on the user
 to make sure that remote tables have column data types matching the local
 definition, so why can't we say that they've got to make sure collations
 match too?  So I think this is largely a documentation issue and we don't
 need any automated enforcement mechanism, or at least it's silly to try
 to enforce this when we're not enforcing column type matching (yet?).

 I think that the question what to push down is a different question
 from checking column data types, because there we can rely on the
 type input functions to reject bad values.

Unfortunately, that's a very myopic view of the situation: there
are many cases where datatype semantics can vary without the I/O
functions having any idea that anything is wrong.  To take one example,
what if the underlying column is type citext but the user wrote text
in the foreign table definition?  postgres_fdw would see no reason not
to push col = 'foo' across, but that clause would behave quite
differently on the remote side.  Another example is that float8 and
numeric will have different opinions about the truth of
1.1 = 1.2, so you're going
to get into trouble if you declare an FT column as one when the
underlying column is the other, even though the I/O functions for these
types will happily take each other's output.
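As an aside, the float8-vs-numeric disagreement is easy to reproduce outside SQL: Python's float and decimal.Decimal exhibit the same mismatch about what "1.1" means. This is only an illustration of the binary-vs-decimal representation issue, not of postgres_fdw behavior:

```python
from decimal import Decimal

# The binary float closest to 1.1 is actually
# 1.100000000000000088817841970012523233890533447265625, so an exact
# decimal and a float disagree about equality even though both
# display as "1.1".
exact = Decimal('1.1')    # exact decimal value 1.1
binary = Decimal(1.1)     # the float's true binary value

print(exact == 1.1)       # False: decimal "1.1" != binary "1.1"
print(binary == 1.1)      # True: same underlying value
```

Pushing a comparison across to a column of the other type is exactly this kind of silent semantic shift.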

So I think (and have written in the committed docs) that users had
better be careful to ensure that FT columns are declared as the same
type as the underlying columns, even though we can't readily enforce
that, at least not for non-builtin types.

And there's really no difference between that situation and the
collation situation, though I agree with you that the latter is a lot
more likely to bite careless users.

 What about the following design principle:
 Only push down conditions which are sure to return the correct
 result, provided that the PostgreSQL system objects have not
 been tampered with.

That's a nice, simple, and useless-in-practice design principle,
because it will toss out many situations that users will want to work;
situations that in fact *would* work as long as the users adhere to
safe coding conventions.  I do not believe that when people ask "why
does performance of LIKE suck on my foreign table", they will accept an
answer of "we don't allow that to be pushed across because we think
you're too stupid to make the remote collation match."

If you want something provably correct, the way to get there is
to work out a way to check if the remote types and collations
really match.  But that's a hard problem AFAICT, so if we want
something usable in the meantime, we are going to have to accept
some uncertainty about what will happen if the user messes up.

 Would it be reasonable to push down operators and functions
 only if
 a) they are in the pg_catalog schema
 b) they have been around for a couple of releases
 c) they are not collation sensitive?

We don't track (b) nor (c), so this suggestion is entirely
unimplementable in the 9.3 time frame; nor is it provably safe,
unless by (b) you mean "must have been around since 7.4 or so".

On a longer time horizon this might be doable, but it would be a lot
of work to implement a solution that most people will find far too
restrictive.  I'd rather see the long-term focus be on doing
type/collation matching, so that we can expand not restrict the set
of things we can push across.

regards, tom lane




Re: [HACKERS] FDW for PostgreSQL

2013-02-21 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-02-21 14:23:35 +, Albe Laurenz wrote:
 Tom Lane wrote:
 Another thing I was wondering about, but did not change, is that if we're
 having the remote transaction inherit the local transaction's isolation
 level, shouldn't it inherit the READ ONLY property as well?

 That seems to me like it would be the right thing to do.

 I am not 100% convinced of that. There might be valid use cases where a
 standby executes queries on the primary that do DML. And there would be
 no way out of it, I think?

How exactly would it do that via an FDW?  Surely if the user tries to
execute INSERT/UPDATE/DELETE against a foreign table, the command would
get rejected in a read-only transaction, long before we even figure out
that the target is a foreign table?

Even granting that there's some loophole that lets the command get sent
to the foreign server, why's it a good idea to allow that?  I rather
thought the idea of READ ONLY was to prevent the transaction from making
any permanent changes.  It's not clear why changes on a remote database
would be exempted from that.

(Doubtless you could escape the restriction anyway with dblink, but that
doesn't mean that postgres_fdw should be similarly ill-defined.)

regards, tom lane




Re: [HACKERS] FDW for PostgreSQL

2013-02-21 Thread Andres Freund
On 2013-02-21 09:58:57 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-02-21 14:23:35 +, Albe Laurenz wrote:
  Tom Lane wrote:
  Another thing I was wondering about, but did not change, is that if we're
  having the remote transaction inherit the local transaction's isolation
  level, shouldn't it inherit the READ ONLY property as well?
 
  That seems to me like it would be the right thing to do.
 
  I am not 100% convinced of that. There might be valid use cases where a
  standby executes queries on the primary that do DML. And there would be
  no way out of it, I think?
 
 How exactly would it do that via an FDW?  Surely if the user tries to
 execute INSERT/UPDATE/DELETE against a foreign table, the command would
 get rejected in a read-only transaction, long before we even figure out
 that the target is a foreign table?

I was thinking of querying a remote table that's actually a view. Which
might be using a function that does caching into a table or something.
Not a completely unreasonable design.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Kevin Grittner
Peter Eisentraut pete...@gmx.net wrote:

I'll need more time to ponder your other points, but...

 You don't need a new command or key word for this.  How about
 ALTER MATERIALIZED VIEW DISCARD?

I don't like this because we don't have ALTER TABLE REINDEX.  But
the fact that DISCARD is a keyword does open up some interesting
syntax possibilities without adding more keywords.  Maybe:

DISCARD MATERIALIZED VIEW DATA tm;

Or something like that.  Paint buckets are over there by the
bikeshed.  Have at it.  :-)
 
-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Heikki Linnakangas

On 21.02.2013 16:38, Kevin Grittner wrote:

Tom Lane t...@sss.pgh.pa.us wrote:

Greg Stark st...@mit.edu writes:

The way I was thinking about it, whatever the command is named, you
might be able to tell the database to drop the storage associated with
the view but that would make the view invalid until it was refreshed.
It wouldn't make it appear to be empty.


Actually, that seems like a pretty key point to me.  TRUNCATE TABLE
results in a table that is perfectly valid, you just deleted all the
rows that used to be in it.  Throwing away an MV's contents should
not result in an MV that is considered valid.


It doesn't.  That was one of the more contentious points in the
earlier bikeshedding phases.  Some felt that throwing away the
contents was a form of making the MV out of date and as such
didn't require any special handling.  Others, including myself,
felt that "data not present" was a distinct state from "generated
zero rows" and that attempting to scan a materialized view for
which data had not been generated must result in an error.  The
latter property has been maintained from the beginning -- or at
least that has been the intent.


Yeah, "data not present" is clearly different from "0 rows". I'm 
surprised there isn't an explicit boolean column somewhere for that, but 
I guess you can use the size of the heap for that too, as long as you're 
careful to not truncate it to 0 blocks when it's empty but scannable.


There's at least one bug left in that area:

postgres=# create table t (id int4);
CREATE TABLE
postgres=# create materialized view tm as select * from t where id < 0;
SELECT 0

postgres=# select * from tm;
 id
----
(0 rows)

postgres=# create index i_tm on tm(id);
CREATE INDEX
postgres=# cluster tm using i_tm;
CLUSTER
postgres=# select * from tm;
ERROR:  materialized view tm has not been populated
HINT:  Use the REFRESH MATERIALIZED VIEW command.

Clustering a materialized view invalidates it.

- Heikki




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 giving an error so it's an easy to find distinction to a normal
 table seems like a good idea.

I'm not sure I understood your concerns entirely, but wonder
whether this helps?:

test=# \d
  List of relations
 Schema | Name  |       Type        |  Owner  
--------+-------+-------------------+---------
 public | bb    | materialized view | kgrittn
 public | t | table | kgrittn
 public | tm    | materialized view | kgrittn
 public | tmm   | materialized view | kgrittn
 public | tv    | view  | kgrittn
 public | tvmm  | materialized view | kgrittn
 public | tvv   | view  | kgrittn
 public | tvvm  | materialized view | kgrittn
 public | tvvmv | view  | kgrittn
(9 rows)

test=# truncate table tm;
ERROR:  "tm" is not a table
test=# truncate materialized view t;
ERROR:  "t" is not a materialized view
test=# truncate materialized view tm;
TRUNCATE TABLE
test=# truncate table t;
TRUNCATE TABLE

Well, maybe those command tags could use a tweak.

Then there's this, if you don't specify an object type:

test=# truncate t, tm;
TRUNCATE TABLE

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 That being the case, lumping them as being the same operation
 feels like the wrong thing, and so we should choose a different
 name for the MV operation.

 There is currently no truncation of MV data without rendering the
 MV unscannable.  Do you still feel it needs a different command
 name?

You didn't say anything that changed my opinion: it doesn't feel like
a TRUNCATE to me.  It's not changing the object to a different but
entirely valid state, which is what TRUNCATE does.

Peter claimed upthread that REFRESH is a subcommand of ALTER MATERIALIZED
VIEW and that this operation should be another one.  That sounds pretty
reasonable from here.

regards, tom lane




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Kevin Grittner
Heikki Linnakangas hlinnakan...@vmware.com wrote:
 On 21.02.2013 16:38, Kevin Grittner wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark st...@mit.edu writes:
 The way I was thinking about it, whatever the command is named, you
 might be able to tell the database to drop the storage associated with
 the view but that would make the view invalid until it was refreshed.
 It wouldn't make it appear to be empty.

 Actually, that seems like a pretty key point to me.  TRUNCATE TABLE
 results in a table that is perfectly valid, you just deleted all the
 rows that used to be in it.  Throwing away an MV's contents should
 not result in an MV that is considered valid.

 It doesn't.  That was one of the more contentious points in the
 earlier bikeshedding phases.  Some felt that throwing away the
 contents was a form of making the MV out of date and as such
 didn't require any special handling.  Others, including myself,
 felt that "data not present" was a distinct state from "generated
 zero rows" and that attempting to scan a materialized view for
 which data had not been generated must result in an error.  The
 latter property has been maintained from the beginning -- or at
 least that has been the intent.

 Yeah, "data not present" is clearly different from "0 rows".
 I'm surprised there isn't an explicit boolean column somewhere
 for that,

There was, in earlier versions of the patch: pg_class.relisvalid.
The problem is that we needed some way to determine from the heap
that it was invalid to support UNLOGGED MVs.  Several people were
offended by my attempt to use relisvalid as the primary indicator
and transfer the information from the heap state to pg_class and
relcache.  There were some pretty big technical challenges to that.
So I caved on that one and went with the pg_relation_is_scannable()
function based on the heap as reported by relcache.  That being one
of the newer parts of the patch, it is probably not as solid as the
parts which haven't changed much in the last three months.

 but I guess you can use the size of the heap for that too, as
 long as you're careful to not truncate it to 0 blocks when it's
 empty but scannable.

 There's at least one bug left in that area:

 postgres=# create table t (id int4);
 CREATE TABLE
 postgres=# create materialized view tm as select * from t where id < 0;
 SELECT 0
 postgres=# select * from tm;
 id
 
 (0 rows)

 postgres=# create index i_tm on tm(id);
 CREATE INDEX
 postgres=# cluster tm using i_tm;
 CLUSTER
 postgres=# select * from tm;
 ERROR:  materialized view tm has not been populated
 HINT:  Use the REFRESH MATERIALIZED VIEW command.

 Clustering a materialized view invalidates it.

Good spot.  That should be easy enough to fix.

Thanks.

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Andres Freund
On 2013-02-21 07:10:09 -0800, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
  giving an error so it's an easy to find distinction to a normal
  table seems like a good idea.
 
 I'm not sure I understood your concerns entirely, but wonder
 whether this helps?:

To explain it a bit:

I assume that at some point matviews will get (auto-)updateable, just as
normal views recently got. In that case application programmers might
not be aware anymore that something is a view either because they just
don't know or because a table got converted into a matview after the
code was written.

Because of the potential wish for transparency (with security views as a
potential user) at least normal views might get the capability to be
TRUNCATEd directly, so it might be that matviews do as well.

 test=# \d
   List of relations
  Schema | Name  |       Type        |  Owner  
 --------+-------+-------------------+---------
  public | bb    | materialized view | kgrittn
  public | t | table | kgrittn
  public | tm    | materialized view | kgrittn
  public | tmm   | materialized view | kgrittn
  public | tv    | view  | kgrittn
  public | tvmm  | materialized view | kgrittn
  public | tvv   | view  | kgrittn
  public | tvvm  | materialized view | kgrittn
  public | tvvmv | view  | kgrittn
 (9 rows)
 
 test=# truncate table tm;
 ERROR:  "tm" is not a table
 test=# truncate materialized view t;
 ERROR:  "t" is not a materialized view
 test=# truncate materialized view tm;
 TRUNCATE TABLE
 test=# truncate table t;
 TRUNCATE TABLE

That's not bad.

But what if we allow TRUNCATE on views someday (possibly only if a
truncate trigger is defined). For consistency we might also want that on
matviews. Having a difference between TRUNCATE view; and TRUNCATE
MATERIALIZED VIEW; in that case sounds ugly to me.

What about DISABLE? DISCARD or DEALLOCATE would also be nice but it
seems hard to fit that into existing syntax.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] FDW for PostgreSQL

2013-02-21 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-02-21 09:58:57 -0500, Tom Lane wrote:
 How exactly would it do that via an FDW?  Surely if the user tries to
 execute INSERT/UPDATE/DELETE against a foreign table, the command would
 get rejected in a read-only transaction, long before we even figure out
 that the target is a foreign table?

 I was thinking of querying a remote table that's actually a view. Which
 might be using a function that does caching into a table or something.
 Not a completely unreasonable design.

Yeah, referencing a remote view is something that should work fine, but
it's not clear to me why it should work differently than it does on the
remote server.  If you select from that same view in a READ ONLY
transaction on the remote, won't it fail?  If so, why should that work
if it's selected from via a foreign table?

regards, tom lane




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kgri...@ymail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 That being the case, lumping them as being the same
 operation feels like the wrong thing, and so we should choose a
 different name for the MV operation.

 There is currently no truncation of MV data without rendering
 the MV unscannable.  Do you still feel it needs a different
 command name?

 You didn't say anything that changed my opinion: it doesn't feel
 like a TRUNCATE to me.  It's not changing the object to a
 different but entirely valid state, which is what TRUNCATE does.

 Peter claimed upthread that REFRESH is a subcommand of ALTER
 MATERIALIZED VIEW

It's not, nor do I think it should be.

 and that this operation should be another one.  That sounds
 pretty reasonable from here.

That feels completely wrong to me.  For one thing, I can't think of
any ALTER commands to populate or remove data.  What did you think
of the idea of something like DISCARD MATERIALIZED VIEW DATA as a
new statement?  Or maybe RESET MATERIALIZED VIEW?

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] FDW for PostgreSQL

2013-02-21 Thread Andres Freund
On 2013-02-21 10:21:34 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-02-21 09:58:57 -0500, Tom Lane wrote:
  How exactly would it do that via an FDW?  Surely if the user tries to
  execute INSERT/UPDATE/DELETE against a foreign table, the command would
  get rejected in a read-only transaction, long before we even figure out
  that the target is a foreign table?
 
  I was thinking of querying a remote table that's actually a view. Which
  might be using a function that does caching into a table or something.
  Not a completely unreasonable design.
 
 Yeah, referencing a remote view is something that should work fine, but
 it's not clear to me why it should work differently than it does on the
 remote server.  If you select from that same view in a READ ONLY
 transaction on the remote, won't it fail?  If so, why should that work
 if it's selected from via a foreign table?

Sure, it might fail if you use READ ONLY explicitly. Or the code might
check it. The point is that one might not have a choice about the READ
ONLY state of the local transaction if it's a HS standby, as all
transactions are READ ONLY there.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] FDW for PostgreSQL

2013-02-21 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 Sure, it might fail if you use READ ONLY explicitly. Or the code might
 check it. The point is that one might not have a choice about the READ
 ONLY state of the local transaction if it's a HS standby, as all
 transactions are READ ONLY there.

[ shrug... ]  If you want to use a remote DB to cheat on READ ONLY,
there's always dblink.  It's not apparent to me that the FDW
implementation should try to be complicit in such cheating.
(Not that it would work anyway given the command-level checks.)

regards, tom lane




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:

 I assume that at some point matviews will get (auto-)updateable,
 just as normal views recently got.

I'm dubious about that.  Every use case I've seen for MVs involves
aggregation, although they are a generalized feature, so that won't
always be true.  But if you have a view like:

CREATE MATERIALIZED VIEW tm AS
 SELECT t.type,
    sum(t.amt) AS totamt
   FROM t
  GROUP BY t.type;

... I don't see how that can be updateable.  If I add 5 to totamt
for some row, what do you do?  I expect that 99% of MVs will be
updated asynchronously from changes to the underlying data -- what
do you do if someone updates a row that no longer exists in the
underlying data?  This area just seems fraught with peril and out of
sync with the usual uses of MVs.

 What about DISABLE? DISCARD or DEALLOCATE would also be nice but
 it seems hard to fit that into existing syntax.

Thanks for the suggestions.

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] FDW for PostgreSQL

2013-02-21 Thread Albe Laurenz
Tom Lane wrote:
 I think that the question what to push down is a different question
 from checking column data types, because there we can rely on the
 type input functions to reject bad values.
 
 Unfortunately, that's a very myopic view of the situation: there
 are many cases where datatype semantics can vary without the I/O
 functions having any idea that anything is wrong.  To take one example,
 what if the underlying column is type citext but the user wrote text
 in the foreign table definition?  postgres_fdw would see no reason not
 to push col = 'foo' across, but that clause would behave quite
 differently on the remote side.  Another example is that float8 and
 numeric will have different opinions about the truth of
 1.1 = 1.2, so you're going
 to get into trouble if you declare an FT column as one when the
 underlying column is the other, even though the I/O functions for these
 types will happily take each other's output.

You are right.

 So I think (and have written in the committed docs) that users had
 better be careful to ensure that FT columns are declared as the same
 type as the underlying columns, even though we can't readily enforce
 that, at least not for non-builtin types.
 
 And there's really no difference between that situation and the
 collation situation, though I agree with you that the latter is a lot
 more likely to bite careless users.

That's what I am worried about.

 What about the following design principle:
 Only push down conditions which are sure to return the correct
 result, provided that the PostgreSQL system objects have not
 been tampered with.
 
 That's a nice, simple, and useless-in-practice design principle,
 because it will toss out many situations that users will want to work;
 situations that in fact *would* work as long as the users adhere to
 safe coding conventions.  I do not believe that when people ask "why
 does performance of LIKE suck on my foreign table", they will accept an
 answer of "we don't allow that to be pushed across because we think
 you're too stupid to make the remote collation match."

I think that it will be pretty hard to get both reliability
and performance to an optimum.

I'd rather hear complaints about bad performance than
about bad results, but that's just my opinion.

 Would it be reasonable to push down operators and functions
 only if
 a) they are in the pg_catalog schema
 b) they have been around for a couple of releases
 c) they are not collation sensitive?
 
 We don't track (b) nor (c), so this suggestion is entirely
 unimplementable in the 9.3 time frame; nor is it provably safe,
 unless by (b) you mean "must have been around since 7.4 or so".

My idea was to have a (hand picked) list of functions and operators
that are considered safe, but I understand that that is rather
ugly.  There could of course be no generic method because,
as you say, b) and c) are not tracked.

 On a longer time horizon this might be doable, but it would be a lot
 of work to implement a solution that most people will find far too
 restrictive.  I'd rather see the long-term focus be on doing
 type/collation matching, so that we can expand not restrict the set
 of things we can push across.

I like that vision, and of course my above idea does not
go well with it.

Yours,
Laurenz Albe




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Peter claimed upthread that REFRESH is a subcommand of ALTER
 MATERIALIZED VIEW

 It's not, nor do I think it should be.

Oh, never mind then.

 and that this operation should be another one.  That sounds
 pretty reasonable from here.

 That feels completely wrong to me.  For one thing, I can't think of
 any ALTER commands to populate or remove data.  What did you think
 of the idea of something like DISCARD MATERIALIZED VIEW DATA as a
 new statement?  Or maybe RESET MATERIALIZED VIEW?

I could live with either DISCARD or RESET.

regards, tom lane




Re: [HACKERS] JSON Function Bike Shedding

2013-02-21 Thread Merlin Moncure
On Wed, Feb 20, 2013 at 9:42 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Feb 19, 2013 at 10:00 AM, Merlin Moncure mmonc...@gmail.com wrote:
 Anyways, as to overloading in general, well, SQL is heavily
 overloaded.  We don't have int_max, float_max, etc. and it would be
 usability reduction if we did.

 That's true, but max(int) and max(float) are doing pretty much the
 same logical operation - they are taking the maximum of a group of
 numbers.  Overloading in cases where the semantics vary - e.g. + for
 both integer addition and string concatenation - is something else
 altogether, and I have not generally observed it to be a very good
 idea.  Sometimes it works in cases where it's part of the core
 language design, but we don't have the luxury of knowing what other
 data types we'll want to add in the future, and I'm very wary of
 allowing JSON to engage in uncontrolled namespace pollution.

Sure: but that's another straw man:  abuse of the + operator is a case of
combining arbitrarily different behaviors (concatenation and
arithmetic aggregation) into uniform syntax.   This is bad, but a
different thing.   The right way to do it is to globally define the
behavior and map it to types if and only if it makes sense.  Again,
you want clean separation of 'what you're doing' vs 'what you're doing
it over'.


 But that's not even the point; the
 driving philosophy of SQL is that your data structures (and types) are
 to be strongly decoupled from the manipulation you do -- this keeps
 the language very general. That philosophy, while not perfect, should
 be adhered to when possible.

 Perhaps, but that goal seems unlikely to be met in this case.  The
 JSON functions and operators are being named by one group of people
 with one set of sensibilities, and the hstore functions and operators
 were named by a different group of people with a different set of
 sensibilities (and therefore don't match), and the next type that
 comes along will be named according to yet another group of people
 with another set of sensibilities.  So we're unlikely to end up with a
 coherent set of primitives that operate on underlying data of a
 variety of types; we are instead likely to end up with an incoherent
 jumble.

json and hstore have overlap in the sense that you can use them to
define a tuple that is independent from the database type system and
therefore free from its restrictions (this is why 9.0+ hstore was a
complete game changer for trigger development).  Also, a json object is
for all intents and purposes an hstore++ -- json is more general, and
if json gets the ability to be manipulated it would probably displace
hstore for most usages.

So I'm not buying that: if the truly overlapping behaviors were
syntactically equivalent then you would be able to swap out the
implementation changing only the type without refactoring all your
code.   C++ STL works this way and that principle, at least, is good
despite all the C++ baggage headaches.

 Although we now have a JSON type in core, we should not pretend that
 it's in the same league as text or int4.  If those data types claim
 common function names like max and abs and common operator names like
 + and ||, it can be justified on the grounds that the appeal of those
 data types is pretty near universal.  JSON is a very popular format
 right now and I completely support adding more support for it, but I
 cheerfully submit that if you think it falls into the same category as
 text or int4, you've gotten way too caught up in the hype.  It's
 completely appropriate to apply stricter criteria for namespace
 pollution to JSON than to a basic data type whose semantics are
 dictated by the SQL standard, the behavior of other database products,
 and fourth-grade math class.

I'm not buying into the hype at all.  I've been arguing (without much
success) for years that throwing arcane type specific functions into
the public namespace is incoherent, not the other way around.
array_upper()?  How about length() or count()?

Well, we need to decide what to do here -- I'd call the vote about
even, and there are plausible arguments to do it either way -- so how do
we resolve this?

merlin




Re: [HACKERS] is it bug? - printing boolean domains in sql/xml function

2013-02-21 Thread Pavel Stehule
Hello

here is patch

where it should be pushed - 9.3 or 9.4 ?

I vote 9.3 - I know of users who have to do workarounds (they have to
avoid domains) now, so earlier is better. And this patch is a
one-line patch

Regards

Pavel


2013/2/16 Pavel Stehule pavel.steh...@gmail.com:
 2013/2/16 Noah Misch n...@leadboat.com:
 On Sun, Jan 13, 2013 at 07:54:11AM +0100, Pavel Stehule wrote:
 related to 
 http://www.postgresql.org/message-id/cafj8prdtavfnrazwet+ewmfrbdzffva8w17kk_e12fb6t-z...@mail.gmail.com

 boolean domains are serialised to a string differently than plain boolean

 postgres=# CREATE DOMAIN booldomain as bool;
 CREATE DOMAIN

 -- fully expected behavior
 postgres=# select true, true::booldomain;
  bool | booldomain
 ------+------------
  t    | t
 (1 row)

 postgres=# select true::text, true::booldomain::text;
  text | text
 ------+------
  true | true
 (1 row)

 -- unexpected behavior
 postgres=# select xmlforest(true as bool, true::booldomain as booldomain);
                   xmlforest
 ----------------------------------------------
  <bool>true</bool><booldomain>t</booldomain>
 (1 row)

 is this the expected behavior?

 There is a bug here.  map_sql_type_to_xmlschema_type() has special treatment
 for domains, but map_sql_value_to_xml_value() and its callers have no
 corresponding logic.  In the same vein, this yields a schema that does not
 validate its corresponding document:

 set datestyle = 'sql, dmy';
 create domain datedom as date;
 create table t as select current_date AS a, current_date::datedom AS b;
 select table_to_xml('t', true, true, '');
 select table_to_xmlschema('t', true, true, '');

 One could debate whether the schema generation or the data generation should
 be the one to change, but I tentatively vote for the latter.

 yes, I think it is a bug too.

 if we agree it should be fixed, I'll write a fix

 Regards

 Pavel




 Thanks,
 nm

 --
 Noah Misch
 EnterpriseDB http://www.enterprisedb.com


fix-xmlmap.patch
Description: Binary data



Re: [HACKERS] is it bug? - printing boolean domains in sql/xml function

2013-02-21 Thread Pavel Stehule
ping Peter as XML feature committer.

Regards

Pavel

2013/2/21 Pavel Stehule pavel.steh...@gmail.com:
 Hello

 here is patch

 where it should be pushed - 9.3 or 9.4 ?

 I vote 9.3 - I know of users who have to do workarounds (they have to
 avoid domains) now, so earlier is better. And this patch is a
 one-line patch

 Regards

 Pavel


 2013/2/16 Pavel Stehule pavel.steh...@gmail.com:
 2013/2/16 Noah Misch n...@leadboat.com:
 On Sun, Jan 13, 2013 at 07:54:11AM +0100, Pavel Stehule wrote:
 related to 
 http://www.postgresql.org/message-id/cafj8prdtavfnrazwet+ewmfrbdzffva8w17kk_e12fb6t-z...@mail.gmail.com

 boolean domains are serialised to a string differently than plain boolean

 postgres=# CREATE DOMAIN booldomain as bool;
 CREATE DOMAIN

 -- fully expected behavior
 postgres=# select true, true::booldomain;
  bool | booldomain
 ------+------------
  t    | t
 (1 row)

 postgres=# select true::text, true::booldomain::text;
  text | text
 ------+------
  true | true
 (1 row)

 -- unexpected behavior
 postgres=# select xmlforest(true as bool, true::booldomain as booldomain);
                   xmlforest
 ----------------------------------------------
  <bool>true</bool><booldomain>t</booldomain>
 (1 row)

 is this the expected behavior?

 There is a bug here.  map_sql_type_to_xmlschema_type() has special treatment
 for domains, but map_sql_value_to_xml_value() and its callers have no
 corresponding logic.  In the same vein, this yields a schema that does not
 validate its corresponding document:

 set datestyle = 'sql, dmy';
 create domain datedom as date;
 create table t as select current_date AS a, current_date::datedom AS b;
 select table_to_xml('t', true, true, '');
 select table_to_xmlschema('t', true, true, '');

 One could debate whether the schema generation or the data generation should
 be the one to change, but I tentatively vote for the latter.

 yes, I think it is a bug too.

 if we agree it should be fixed, I'll write a fix

 Regards

 Pavel




 Thanks,
 nm

 --
 Noah Misch
 EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] JSON Function Bike Shedding

2013-02-21 Thread Robert Haas
On Thu, Feb 21, 2013 at 10:51 AM, Merlin Moncure mmonc...@gmail.com wrote:
 Sure: but that's another straw man:  abuse of the + operator is a case of
 combining arbitrarily different behaviors (concatenation and
 arithmetic aggregation) into uniform syntax.   This is bad, but a
 different thing.   The right way to do it is to globally define the
 behavior and map it to types if and only if it makes sense.  Again,
 you want clean separation of 'what you're doing' vs 'what you're doing
 it over'.

I'll buy that.  So what's the globally defined behavior of a ~
operator or a function called get() or even vals()?  The problem is
that I don't know how we can be sure any definition we choose now
based on one example will be forward-compatible with things we want to
do later, perhaps involving completely unrelated data types with very
different semantics.  It's not like there are an infinite number of
short, high-quality operator/function names.

 I'm not buying into the hype at all.  I've been arguing (without much
 success) for years that throwing arcane type specific functions into
 the public namespace is incoherent, not the other way around.
 array_upper()?  How about length() or count()?

Not sure I follow.  array_upper() is annoying because its semantics
are kinda confusing and idiosyncratic, but that's more the fault of
the type itself than the accessor function.  length() and count() are
admittedly very common English words, but it's hard to imagine what
we'd want to use those names for that would be more common/important
than what they're used for already.  It's not at all hard to imagine
that with some of the other names that have been proposed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] posix_fadvise missing in the walsender

2013-02-21 Thread Jeff Janes
On Wed, Feb 20, 2013 at 7:54 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Feb 19, 2013 at 5:48 PM, Simon Riggs si...@2ndquadrant.com wrote:
 I agree with Merlin and Joachim - if we have the call in one place, we
 should have it in both.

 We might want to assess whether we even want to have it one place.
 I've seen cases where the existing call hurts performance, because of
 WAL file recycling.  If we don't flush the WAL file blocks out of
 cache, then they're still there when we recycle the WAL file and we
 can overwrite them without further I/O.  But if we tell the OS to blow
 them away, then it has to reread them when we try to overwrite the old
 files, and so we stall waiting for the I/O.

Does the kernel really read a data block from disk into memory in
order to immediately overwrite it?  I would have thought it would
optimize that away, at least if the writes are sized and aligned to
512- or 1024-byte blocks (which WAL should be).  Well, stranger things
than that happen, I guess.  (For example on ext4, when a file with
dirty pages goes away due to another file getting renamed over the top
of it, the disappearing file automatically gets fsynced, or the
equivalent.)

Cheers,

Jeff




Re: [HACKERS] Unarchived WALs deleted after crash

2013-02-21 Thread Jeff Janes
On Thu, Feb 21, 2013 at 12:39 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

 Are you talking about the patch to avoid restored WAL segments from being
 re-archived (commit 6f4b8a4f4f7a2d683ff79ab59d3693714b965e3d), or the bug
 that that unarchived WALs were deleted after crash (commit
 b5ec56f664fa20d80fe752de494ec96362eff520)? The former was included in 9.2.0
 already, and the latter will be included in the next point release.

...

 I don't know what the release schedule is. I take that to be a request to
 put out a new minor release ASAP.

+1 from me.  I'm rather uncomfortable running a system with this
unarchived deletion bug.

Cheers,

Jeff




Re: [HACKERS] sql_drop Event Trigger

2013-02-21 Thread Robert Haas
On Wed, Feb 20, 2013 at 10:48 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 I have added some protections so that these do not fire on undesirable
 events (such as dropping an event trigger); also event triggers and
 other object types are filtered out of pg_dropped_objects, in case
 something like DROP OWNED happens.  (So for instance you could get an
 empty pg_dropped_objects if you did DROP OWNED and the mentioned user
 only owns an event trigger).  Maybe this needs to be reconsidered.

This is tricky.  The whole point of skipping the
ddl_command_start/ddl_command_end triggers when the command is one
that operates on event triggers is that we don't want a user to
install an event trigger that prevents said user from dropping that
event trigger afterwards.  We're currently safe from that, but with
the behavior you describe, we won't be: an event trigger that
unconditionally throws an error will block all drops, even of event
triggers.

I am inclined to suggest that we go back to an earlier suggestion I
made, which Dimitri didn't like, but which I still think might be the
best way forward: add a SUSET GUC that disables event triggers.  If we
do that, then our answer to that problem, for the current event
triggers and all we might add in the future, can be: well, if that
happens, set the disable_event_triggers GUC and retry the drop.  OTOH,
if we DON'T do that, then this problem is going to come back up every
time we add a new firing point, and it may be really tough to make
sure we're secure against this in all cases.

If you don't like that suggestion I'm open to other options, but I
think we should try hard to preserve the invariant that a superuser
can always drop an event trigger without interference from the event
trigger system itself.

 There's also code to re-obtain the list of objects to drop after the
 event trigger functions have run; the second list is compared to the
 first one, and if they differ, an error is raised.

This is definitely an improvement.  I am not 100% clear on whether
this is sufficient, but it sure seems a lot better than punting.

I think there might be a possible failure mode if somebody creates a
new object that depends on one of the objects to be dropped while the
event trigger is running.  Since system catalogs are normally read
with SnapshotNow, I think it might be possible to create a situation
where the first and second searches return different results even
though the trigger hasn't done anything naughty.  I believe that I
added some locking in 9.2 that will prevent this in the case where you
create a table that depends on a concurrently-dropped schema, but
there are other cases that have this problem, such as a function being
added to a concurrently-dropped schema.

Now, IMHO, that isn't necessarily a show-stopper for this patch,
because that same phenomenon can cause catalog corruption as things
stand.  So in the scenario where an event trigger causes a transaction
to fail because of this issue, it will actually be *preventing*
catalog corruption.  However, if this goes in, it might bump up the
priority of fixing some of those locking issues, since it may make
them more visible.

 It's rather annoying that performMultipleDeletions and performDeletion
 contain almost exactly the same code.  I think maybe it's time to morph
 the latter into a thin layer on top of the former.

Yeah, or at least factor out the code you need for this into its own function.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] posix_fadvise missing in the walsender

2013-02-21 Thread Robert Haas
On Thu, Feb 21, 2013 at 12:16 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 Does the kernel really read a data block from disk into memory in
 order to immediately overwrite it?  I would have thought it would
 optimize that away, at least if the writes are sized and aligned to
 512- or 1024-byte blocks (which WAL should be).

Now that you mention that I agree it seems strange, but that's what I saw.

/me scratches head

It does seem pretty odd, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] relpath() to src/common

2013-02-21 Thread Alvaro Herrera
Just a heads up: I intend to push this shortly, and after it has seen
some BF activity, pg_xlogdump as well.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/src/backend/Makefile
--- b/src/backend/Makefile
***
*** 35,44  LOCALOBJS += utils/probes.o
  endif
  endif
  
! OBJS = $(SUBDIROBJS) $(LOCALOBJS) $(top_builddir)/src/port/libpgport_srv.a
  
! # We put libpgport into OBJS, so remove it from LIBS; also add libldap
! LIBS := $(filter-out -lpgport, $(LIBS)) $(LDAP_LIBS_BE)
  
  # The backend doesn't need everything that's in LIBS, however
  LIBS := $(filter-out -lz -lreadline -ledit -ltermcap -lncurses -lcurses, $(LIBS))
--- 35,46 
  endif
  endif
  
! OBJS = $(SUBDIROBJS) $(LOCALOBJS) $(top_builddir)/src/port/libpgport_srv.a \
!$(top_builddir)/src/common/libpgcommon_srv.a
  
! # We put libpgport and libpgcommon into OBJS, so remove it from LIBS; also add
! # libldap
! LIBS := $(filter-out -lpgport -lpgcommon, $(LIBS)) $(LDAP_LIBS_BE)
  
  # The backend doesn't need everything that's in LIBS, however
  LIBS := $(filter-out -lz -lreadline -ledit -ltermcap -lncurses -lcurses, $(LIBS))
*** a/src/backend/access/rmgrdesc/smgrdesc.c
--- b/src/backend/access/rmgrdesc/smgrdesc.c
***
*** 16,21 
--- 16,22 
  
  #include catalog/catalog.h
  #include catalog/storage_xlog.h
+ #include common/relpath.h
  
  
  void
*** a/src/backend/access/rmgrdesc/xactdesc.c
--- b/src/backend/access/rmgrdesc/xactdesc.c
***
*** 16,21 
--- 16,22 
  
  #include access/xact.h
  #include catalog/catalog.h
+ #include common/relpath.h
  #include storage/sinval.h
  #include utils/timestamp.h
  
*** a/src/backend/access/transam/xlogutils.c
--- b/src/backend/access/transam/xlogutils.c
***
*** 20,25 
--- 20,26 
  #include access/xlog.h
  #include access/xlogutils.h
  #include catalog/catalog.h
+ #include common/relpath.h
  #include storage/smgr.h
  #include utils/guc.h
  #include utils/hsearch.h
*** a/src/backend/catalog/catalog.c
--- b/src/backend/catalog/catalog.c
***
*** 37,42 
--- 37,43 
  #include catalog/pg_shseclabel.h
  #include catalog/pg_tablespace.h
  #include catalog/toasting.h
+ #include common/relpath.h
  #include miscadmin.h
  #include storage/fd.h
  #include utils/fmgroids.h
***
*** 44,64 
  #include utils/tqual.h
  
  
- #define FORKNAMECHARS	4		/* max chars for a fork name */
- 
- /*
-  * Lookup table of fork name by fork number.
-  *
-  * If you add a new entry, remember to update the errhint below, and the
-  * documentation for pg_relation_size(). Also keep FORKNAMECHARS above
-  * up-to-date.
-  */
- const char *forkNames[] = {
- 	"main",		/* MAIN_FORKNUM */
- 	"fsm",		/* FSM_FORKNUM */
- 	"vm",		/* VISIBILITYMAP_FORKNUM */
- 	"init"		/* INIT_FORKNUM */
- };
  
  /*
   * forkname_to_number - look up fork number by name
--- 45,50 
***
*** 80,209  forkname_to_number(char *forkName)
  }
  
  /*
-  * forkname_chars
-  *		We use this to figure out whether a filename could be a relation
-  *		fork (as opposed to an oddly named stray file that somehow ended
-  *		up in the database directory).	If the passed string begins with
-  *		a fork name (other than the main fork name), we return its length,
-  *		and set *fork (if not NULL) to the fork number.  If not, we return 0.
-  *
-  * Note that the present coding assumes that there are no fork names which
-  * are prefixes of other fork names.
-  */
- int
- forkname_chars(const char *str, ForkNumber *fork)
- {
- 	ForkNumber	forkNum;
- 
- 	for (forkNum = 1; forkNum = MAX_FORKNUM; forkNum++)
- 	{
- 		int			len = strlen(forkNames[forkNum]);
- 
- 		if (strncmp(forkNames[forkNum], str, len) == 0)
- 		{
- 			if (fork)
- *fork = forkNum;
- 			return len;
- 		}
- 	}
- 	return 0;
- }
- 
- /*
-  * relpathbackend - construct path to a relation's file
-  *
-  * Result is a palloc'd string.
-  */
- char *
- relpathbackend(RelFileNode rnode, BackendId backend, ForkNumber forknum)
- {
- 	int			pathlen;
- 	char	   *path;
- 
- 	if (rnode.spcNode == GLOBALTABLESPACE_OID)
- 	{
- 		/* Shared system relations live in {datadir}/global */
- 		Assert(rnode.dbNode == 0);
- 		Assert(backend == InvalidBackendId);
- 		pathlen = 7 + OIDCHARS + 1 + FORKNAMECHARS + 1;
- 		path = (char *) palloc(pathlen);
- 		if (forknum != MAIN_FORKNUM)
- 			snprintf(path, pathlen, "global/%u_%s",
- 					 rnode.relNode, forkNames[forknum]);
- 		else
- 			snprintf(path, pathlen, "global/%u", rnode.relNode);
- 	}
- 	else if (rnode.spcNode == DEFAULTTABLESPACE_OID)
- 	{
- 		/* The default tablespace is {datadir}/base */
- 		if (backend == InvalidBackendId)
- 		{
- 			pathlen = 5 + OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1;
- 			path = (char *) palloc(pathlen);
- 			if (forknum != MAIN_FORKNUM)
- 				snprintf(path, pathlen, "base/%u/%u_%s",
- 		 rnode.dbNode, rnode.relNode,
- 		 

Re: [HACKERS] sql_drop Event Trigger

2013-02-21 Thread Robert Haas
On Thu, Feb 21, 2013 at 12:25 PM, Robert Haas robertmh...@gmail.com wrote:
 There's also code to re-obtain the list of objects to drop after the
 event trigger functions have run; the second list is compared to the
 first one, and if they differ, an error is raised.

 This is definitely an improvement.  I am not 100% clear on whether
 this is sufficient, but it sure seems a lot better than punting.

What if the object that gets whacked around is one of the named
objects rather than some dependency thereof?  Suppose for example that
the event trigger drops the same object that the user tried to drop.
We need to error out cleanly in that case, not blindly proceed with
the drop.

(In the worst case, somebody could create an unrelated object with the
same OID and we could latch onto and drop that one.  Seems remote
unless the user has a way to fiddle with the OID counter, but if it
happened it would be bad.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [DOCS] Contrib module xml2 status

2013-02-21 Thread Robert Haas
On Wed, Feb 20, 2013 at 11:58 AM, Ian Lawrence Barwick
barw...@gmail.com wrote:
 I'm not sure if this is a documentation or hackers issue, but the
 documentation page for contrib module xml2 refers to PostgreSQL 8.4 in the
 future tense:

It is planned that this module will be removed in PostgreSQL 8.4 in
 favor of the newer standard API

 http://www.postgresql.org/docs/devel/static/xml2.html

 Are there any plans to remove this module by a foreseeable date?

Nope.  I have repeatedly been voted down on removing it, and I've also
been repeatedly voted down on removing the deprecation text.  Could we
at least agree on changing the deprecation text to say "This module is
deprecated and may be removed in a future release"?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] sql_drop Event Trigger

2013-02-21 Thread Alvaro Herrera
Robert Haas escribió:
 On Wed, Feb 20, 2013 at 10:48 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  I have added some protections so that these do not fire on undesirable
  events (such as dropping an event trigger); also event triggers and
  other object types are filtered out of pg_dropped_objects, in case
  something like DROP OWNED happens.  (So for instance you could get an
  empty pg_dropped_objects if you did DROP OWNED and the mentioned user
  only owns an event trigger).  Maybe this needs to be reconsidered.
 
 This is tricky.  The whole point of skipping the
 ddl_command_start/ddl_command_end triggers when the command is one
 that operates on event triggers is that we don't want a user to
 install an event trigger that prevents said user from dropping that
 event trigger afterwards.  We're currently safe from that, but with
 the behavior you describe, we won't be: an event trigger that
 unconditionally throws an error will block all drops, even of event
 triggers.

You're misunderstanding.  If you do DROP EVENT TRIGGER, the DDL_DROP
event won't fire at all.  So no matter how messed up your system is, you
can always fix it by simply dropping the event trigger.

What I was saying is that if you have some command other than DROP EVENT
TRIGGER, which happens to drop an event trigger, said event trigger will
not be present in the pg_dropped_objects results.


-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] sql_drop Event Trigger

2013-02-21 Thread Alvaro Herrera
Robert Haas escribió:
 On Thu, Feb 21, 2013 at 12:25 PM, Robert Haas robertmh...@gmail.com wrote:
  There's also code to re-obtain the list of objects to drop after the
  event trigger functions have run; the second list is compared to the
  first one, and if they differ, an error is raised.
 
  This is definitely an improvement.  I am not 100% clear on whether
  this is sufficient, but it sure seems a lot better than punting.
 
 What if the object that gets whacked around is one of the named
 objects rather than some dependency thereof?  Suppose for example that
 the event trigger drops the same object that the user tried to drop.
 We need to error out cleanly in that case, not blindly proceed with
 the drop.

An error is raised, which I think is sane.  I think this peculiar
situation warrants its own few lines in the new regression test.

One funny thing I noticed is that if I add a column to a table being
dropped, the targetObjects list does not change after the trigger has
run.  The reason for this is that the table's attributes are not present
in the targetObjects list; instead they are dropped manually by calling
DeleteAttributeTuples().  I saw that you can end up with lingering
pg_attribute entries that way.

 (In the worst case, somebody could create an unrelated object with the
 same OID and we could latch onto and drop that one.  Seems remote
 unless the user has a way to fiddle with the OID counter, but if it
 happened it would be bad.)

Hmm.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [DOCS] Contrib module xml2 status

2013-02-21 Thread Magnus Hagander
On Thu, Feb 21, 2013 at 6:39 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Feb 20, 2013 at 11:58 AM, Ian Lawrence Barwick
 barw...@gmail.com wrote:
 I'm not sure if this is a documentation or hackers issue, but the
 documentation page for contrib module xml2 refers to PostgreSQL 8.4 in the
 future tense:

It is planned that this module will be removed in PostgreSQL 8.4 in
 favor of the newer standard API

 http://www.postgresql.org/docs/devel/static/xml2.html

  Are there any plans to remove this module by a foreseeable date?

 Nope.  I have repeatedly been voted down on removing it, and I've also
 been repeatedly voted down on removing the deprecation text.  Could we
 at least agree on changing the deprecation text to say "This module is
 deprecated and may be removed in a future release"?

Not reopening the actual discussion about removing it, but assuming
we're not, strong +1 on changing the deprecation message. And don't
forget to backpatch the change so it shows up in the old versions of
the docs as well.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] [DOCS] Contrib module xml2 status

2013-02-21 Thread Andrew Dunstan


On 02/21/2013 12:56 PM, Magnus Hagander wrote:

On Thu, Feb 21, 2013 at 6:39 PM, Robert Haas robertmh...@gmail.com wrote:

On Wed, Feb 20, 2013 at 11:58 AM, Ian Lawrence Barwick
barw...@gmail.com wrote:

I'm not sure if this is a documentation or hackers issue, but the
documentation page for contrib module xml2 refers to PostgreSQL 8.4 in the
future tense:

It is planned that this module will be removed in PostgreSQL 8.4 in
favor of the newer standard API

http://www.postgresql.org/docs/devel/static/xml2.html

Are there any plans to remove this module by a foreseeable date?

Nope.  I have repeatedly been voted down on removing it, and I've also
been repeatedly voted down on removing the deprecation text.  Could we
at least agree on changing the deprecation text to say "This module is
deprecated and may be removed in a future release"?

Not reopening the actual discussion about rmeoving it, but assuming
we're not, strong +1 on changing the deprecation message. And don't
forget to backpatch the change so it shows up in the old versions of
the docs as well.




Yes, we should change it to remove the reference to 8.4. The point is we 
can remove the module when someone fixes and replaces the functionality 
that's left in there that some people rely on. So far nobody has stepped 
up to the plate, although now that we have LATERAL, a sane replacement
for xpath_table might well be a lot easier to achieve.  If someone is
interested in working on this I'd be happy to hear about it. Maybe it 
would be a good Google SOC project.


cheers

andrew






Re: [HACKERS] JSON Function Bike Shedding

2013-02-21 Thread Merlin Moncure
On Thu, Feb 21, 2013 at 11:02 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Feb 21, 2013 at 10:51 AM, Merlin Moncure mmonc...@gmail.com wrote:
 Sure: but that's another straw man:  abuse of the + operator is a case of
 combining arbitrarily different behaviors (concatenation and
 arithmetic aggregation) into uniform syntax.   This is bad, but a
 different thing.   The right way to do it is to globally define the
 behavior and map it to types if and only if it makes sense.  Again,
 you want clean separation of 'what you're doing' vs 'what you're doing
 it over'.

 I'll buy that.  So what's the globally defined behavior of a ~
 operator or a function called get() or even vals()?  The problem is
 that I don't know how we can be sure any definition we choose now
 based on one example will be forward-compatible with things we want to
 do later, perhaps involving completely unrelated data types with very
 different semantics.  It's not like there are an infinite number of
 short, high-quality operator/function names.

Well, for the case of an operator, it means whatever we reserve it to
mean.  Very much agreed on the limitations of symbolic representation
of behaviors (especially since some of the best symbols were reserved
by SQL or other actors), so I think there is growing consensus that
such things should get moved to functions.  But functions are a lot
less terse than operators, so function names describing clearly
defined behaviors are appreciated.

So, get() means whatever we *define it to mean*, but the definition
should be consistent.  If it's shorthand for "get from some multiple
key/value container" then fine.  If get() is just not specific enough,
let's at least try to go for something behavior-specific (such as
getMember or some such) before punting and settling on type-specific
function names.

In fact, an awful lot of $proposal's behaviors are direct proxies for
hstore behaviors, and a superficial look suggests that around 90% of
the hstore API would make sense in JSON terms (even though Andrew
didn't implement all those behaviors and we're not going to ask him
to).  That suggests to me that tuple manipulation is a pretty general
problem (hstore AKA tuple) and json only brings a couple of things to
the table that aren't already covered there.  Isn't it nice that you
can document functions like avals/svals ONCE and not have to rewrite
your triggers when you swap out hstore for json to get a couple of
extra behavior bits?
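To sketch that last point (the json call below is hypothetical; only the hstore function exists today):

```sql
-- Existing hstore behavior: svals() returns the set of values.
SELECT svals('a=>1, b=>2'::hstore);

-- Hypothetical json counterpart with the same name and semantics,
-- so a trigger written against svals() wouldn't need rewriting:
-- SELECT svals('{"a": 1, "b": 2}'::json);
```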

 I'm not buying into the hype at all.  I've been arguing (without much
 success) for years that throwing arcane type specific functions into
 the public namespace is incoherent, not the other way around.
 array_upper()?  How about length() or count()?

 Not sure I follow.  array_upper() is annoying because its semantics
 are kinda confusing and idiosyncratic, but that's more the fault of
 the type itself than the accessor function.  length() and count() are
 admittedly very common English words, but it's hard to imagine what
 we'd want to use those names for that would be more common/important
 than what they're used for already.  It's not at all hard to imagine
 that with some of the other names that have been proposed.

yeah.

merlin




Re: [HACKERS] bugfix: --echo-hidden is not supported by \sf statements

2013-02-21 Thread Pavel Stehule
2013/2/20 Josh Kupershmidt schmi...@gmail.com:
 On Wed, Jan 23, 2013 at 12:06 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 2013/1/14 Tom Lane t...@sss.pgh.pa.us:
 Well, fine, but then it should fix both of them and remove
 minimal_error_message altogether.  I would however suggest eyeballing
 what happens when you try \ef nosuchfunction (with or without -E).
 I'm pretty sure that the reason for the special error handling in these
 functions is that the default error reporting was unpleasant for this
 use-case.

 so I rewrote the patch
 
 it is still simple
 
 In the end I didn't use PSQLexec - I just write the hidden queries
 directly from the related functions - it is less invasive

 Sorry for the delay, but I finally had a chance to review this patch.
 I think the patch does a good job of bringing the behavior of \sf and
 \ef in-line with the other meta-commands as far as --echo-hidden is
 concerned.

 About the code changes:

 The bulk of the code change comes from factoring TraceQuery() out of
 PSQLexec(), so that \ef and \sf may make use of this query-printing
 without going through PSQLexec(). And not using PSQLexec() lets us
 avoid a somewhat uglier error message like:

   ERROR:  function nonexistent does not exist
   LINE 1: SELECT 'nonexistent'::pg_catalog.regproc::pg_catalog.oid

 Tom suggested removing minimal_error_message() entirely, which would
 be nice if possible. It is a shame that \sf nonexistent produces a
 scary-looking ERROR:  message (and would cause any in-progress
 transaction to need a rollback) while the other informational
 metacommands do not. Actually, the other metacommands are not entirely
 consistent with each other; compare \dt nonexistent and \df
 nonexistent.

 It would be nice if the case of a matching function not found by \sf
 or \ef could be handled in the same fashion as:

   # \d nonexistent
    Did not find any relation named "nonexistent".

 though I realize that's not trivial due to the implementation of
 lookup_function_oid(). At any rate, this nitpicking about the error
 handling in the case that the function is not found is quibbling about
 behavior unchanged by the patch. I don't have any complaints about the
 patch itself, so I think it can be applied as-is, and we can attack
 the error handling issue separately.

I looked there - and the mentioned issue needs a little bit different
strategy - fault-tolerant function searching based on the function
signature.  This code can be very simple, although I am not sure if we
would want it.  We cannot remove minimal_error_message() because there
are two SQL queries, and even if we do a fault-tolerant oid lookup,
pg_get_functiondef can still raise an exception.  We can only
significantly reduce how often this error is shown.  It is not a hard
task, but it needs a new builtin SQL function, and then the patch
could be several times larger than the current one.

Opinions, notes?

Regards

Pavel




 Josh




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Peter Eisentraut
On 2/21/13 9:25 AM, Kevin Grittner wrote:
 Peter Eisentraut pete...@gmx.net wrote:
 On 2/20/13 11:14 PM, Greg Stark wrote:
 That's not entirely true. From the database's point of view,
 TRUNCATE is in many ways actually DDL.

 Whether something is DDL or DML or a read operation (query) is
 not an implementation detail, it's a user-exposed category. 
 Since TRUNCATE is logically equivalent to DELETE, it's a DML
 operation, as far as the user is concerned.
 
 Not really.  It doesn't follow the same MVCC behavior as DML.  This
 is user-visible, documented behavior.

MVCC behavior does not determine whether something is considered DDL or DML.





Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kgri...@ymail.com writes:

 What did you think of the idea of something like DISCARD
 MATERIALIZED VIEW DATA as a new statement?  Or maybe RESET
 MATERIALIZED VIEW?

 I could live with either DISCARD or RESET.

I figured this was worth a pass through the keyword list to look
for all imperative verbs suitable for this, which could support the
needed syntax without adding a new keyword.  Here are the
possibilities I came up with, along with a note about why they are
keywords already.

DISABLE MATERIALIZED VIEW mv;  -- ALTER clause for constraints
DISCARD MATERIALIZED VIEW DATA mv;  -- session state
RELEASE MATERIALIZED VIEW DATA mv;  -- savepoint
RESET MATERIALIZED VIEW DATA mv;  -- run-time parameter

I think any of these could work.  I'm personally most inclined
toward DISABLE MATERIALIZED VIEW.  It seems to convey the semantics
better, especially if you leave out DATA as an additional word.
Given that a materialized view will retain its query, tablespace,
indexes, statistics targets, etc. with this operation, and will
just not be available for scanning, some of the above seem
downright misleading without DATA thrown in.

Opinions?

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Greg Stark
On Thu, Feb 21, 2013 at 2:38 PM, Kevin Grittner kgri...@ymail.com wrote:
 It doesn't.  That was one of the more contentious points in the
 earlier bikeshedding phases.  Some felt that throwing away the
 contents was a form of making the MV out of date and as such
 didn't require any special handling.  Others, including myself,
 felt that "data not present" was a distinct state from "generated
 zero rows" and that attempting to scan a materialized view for
 which data had not been generated must result in an error.  The
 latter property has been maintained from the beginning -- or at
 least that has been the intent.

Actually this sounds like exactly what I was saying. I withdraw my
concern entirely.

-- 
greg




Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Kevin Grittner
Greg Stark st...@mit.edu wrote:
 Kevin Grittner kgri...@ymail.com wrote:
 It doesn't.  That was one of the more contentious points in the
 earlier bikeshedding phases.  Some felt that throwing away the
 contents was a form of making the MV out of date and as such
 didn't require any special handling.  Others, including myself,
 felt that "data not present" was a distinct state from
 "generated zero rows" and that attempting to scan a materialized
 view for which data had not been generated must result in an
 error.  The latter property has been maintained from the
 beginning -- or at least that has been the intent.

 Actually this sounds like exactly what I was saying. I withdraw
 my concern entirely.

Reviewing your concerns and discussions of freshness in general
got me thinking -- while it is clear that not having generated
values in the MV based on its query clearly should make the view
non-scannable, and will be the only criterion for that in this
initial patch; later versions will almost certainly support
age-based conditions for whether the MV is scannable.  So in the
next release the MV may become non-scannable based on the passage
of time since DML was run on a source table without the MV having
been refreshed or incrementally updated to reflect that DML.  Which
makes me wonder why DML making the MV non-scannable is such a bad
thing in the case of TRUNCATE.  Granted there is a difference in
that it is run *on the MV* rather than *on a source relation*; but
still, I'm having some second thoughts about that being a problem.

The problem with TRUNCATE MATERIALIZED VIEW from a logical
perspective doesn't seem to me so much that it makes the MV
non-scannable, as that it is the only DML which would be allowed
directly on an MV -- which is kind of a weird exception.  It is
pretty much a given that when we can get to implementing it, DML
statements will render MVs unscannable under various conditions. 
Josh Berkus and Greg Stark have been the most explicit about that,
but I think most of us who are interested in the feature take it as
a given.

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Call for Google Summer of Code mentors, admins

2013-02-21 Thread Fabrízio de Royes Mello
On Tue, Feb 19, 2013 at 5:43 PM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 19.02.2013 20:07, Fabrízio de Royes Mello wrote:

 I would like to propose implementing a way to track creation times of
 database objects.  This was discussed before in this thread [1].

 This was discussed before in this thread [1] but we didn't reach a
 consensus on what we'll do, so I propose we discuss it some more, and
 I can implement it in GSoC 2013 if my proposal is accepted.


 I don't think that's a good GSoC project. There's no consensus on what to
 do, if anything, so 95% of the work is going to be arguing over what we want,
 and 5% coding. I'd suggest finding something more well-defined.


You are right that we don't have a consensus on what to do, but maybe
this can be an opportunity for us to reach one.

I know a lot of people (friends, customers, users, ...) who would love to
have this feature in future versions of PostgreSQL.

Anyway, is there another well-defined feature that you would recommend
as a good GSoC project?

Best regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] [DOCS] Contrib module xml2 status

2013-02-21 Thread Ian Lawrence Barwick
2013/2/22 Andrew Dunstan and...@dunslane.net:

 On 02/21/2013 12:56 PM, Magnus Hagander wrote:

 On Thu, Feb 21, 2013 at 6:39 PM, Robert Haas robertmh...@gmail.com
 wrote:

 On Wed, Feb 20, 2013 at 11:58 AM, Ian Lawrence Barwick
 barw...@gmail.com wrote:

 I'm not sure if this is a documentation or hackers issue, but the
 documentation page for contrib module xml2 refers to PostgreSQL 8.4 in
 the future tense:

 "It is planned that this module will be removed in PostgreSQL 8.4 in
 favor of the newer standard API"

 http://www.postgresql.org/docs/devel/static/xml2.html

 Are there any plans to remove this module by a forseeable date?

 Nope.  I have repeatedly been voted down on removing it, and I've also
 been repeatedly voted down on removing the deprecation text.  Could we
 at least agree on changing the deprecation text to say "This module is
 deprecated and may be removed in a future release"?

 Not reopening the actual discussion about removing it, but assuming
 we're not, strong +1 on changing the deprecation message. And don't
 forget to backpatch the change so it shows up in the old versions of
 the docs as well.



 Yes, we should change it to remove the reference to 8.4.

Documentation patch attached.

 The point is we can
 remove the module when someone fixes and replaces the functionality that's
 left in there that some people rely on. So far nobody has stepped up to the
 plate, although now that we have LATERAL, a sane replacement for xpath_table
 might well be a lot easier to achieve.  If someone is interested in working
 on this I'd be happy to hear about it. Maybe it would be a good Google SOC
 project.

It might be worth adding an explicit entry in the TODO list for removing this
and summarising what needs to be done.

https://wiki.postgresql.org/wiki/Todo#XML

Regards

Ian Barwick


doc-contrib-xml2-2013-02-22.patch
Description: Binary data



Re: [HACKERS] Materialized views WIP patch

2013-02-21 Thread Andres Freund
On 2013-02-21 14:11:10 -0800, Kevin Grittner wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
  Kevin Grittner kgri...@ymail.com writes:
 
  What did you think of the idea of something like DISCARD
  MATERIALIZED VIEW DATA as a new statement?  Or maybe RESET
  MATERIALIZED VIEW?
 
  I could live with either DISCARD or RESET.
 
 I figured this was worth a pass through the keyword list to look
 for all imperative verbs suitable for this, which could support the
 needed syntax without adding a new keyword.  Here are the
 possibilities I came up with, along with a note about why they are
 keywords already.
 
 DISABLE MATERIALIZED VIEW mv;  -- ALTER clause for constraints
 DISCARD MATERIALIZED VIEW DATA mv;  -- session state
 RELEASE MATERIALIZED VIEW DATA mv;  -- savepoint
 RESET MATERIALIZED VIEW DATA mv;  -- run-time parameter
 
 I think any of these could work.  I'm personally most inclined
 toward DISABLE MATERIALIZED VIEW.  It seems to convey the semantics
 better, especially if you leave out DATA as an additional word.

 Given that a materialized view will retain its query, tablespace,
 indexes, statistics targets, etc. with this operation, and will
 just not be available for scanning, some of the above seem
 downright misleading without DATA thrown in.

I vote for RESET or DISCARD. DISABLE sounds more like you disable
automatic refreshes or somesuch.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] 9.2.3 crashes during archive recovery

2013-02-21 Thread Michael Paquier
On Thu, Feb 21, 2013 at 11:09 PM, Heikki Linnakangas 
hlinnakan...@vmware.com wrote:

 On 15.02.2013 15:49, Heikki Linnakangas wrote:

 Attached is a patch for git master. The basic idea is to split
 InArchiveRecovery into two variables, InArchiveRecovery and
 ArchiveRecoveryRequested. ArchiveRecoveryRequested is set when
 recovery.conf exists. But if we don't know how far we need to recover,
 we first perform crash recovery with InArchiveRecovery=false. When we
 reach the end of WAL in pg_xlog, InArchiveRecovery is set, and we
 continue with normal archive recovery.


 New version of this attached, with a few bugs fixed.

 I'm thinking that this should be back-patched to 9.2, but not to earlier
 branches. Before 9.2, we don't PANIC at a reference to a non-existent page
 until end of recovery, even if we've already reached consistency. The same
 basic issue still exists in earlier versions, though: if you have
 hot_standby=on, the system will open for read-only queries too early,
 before the database is consistent. But this patch is invasive enough that
 I'm wary of back-patching it further, when the worst that can happen is
 that there's a small window right after startup when you can see an
 inconsistent database in hot standby mode. Maybe after we get some more
 testing of this in 9.2 and master. Opinions on that?

People have not yet complained about this problem with versions prior to
9.1. Is it worth backpatching in this case?
-- 
Michael


[HACKERS] use_remote_explain missing in docs of postgres_fdw

2013-02-21 Thread Michael Paquier
Hi all,

While testing this feature a bit, I noticed that use_remote_explain is
available in the list of options for FOREIGN TABLE and SERVER, but it
is not specified in the docs:
http://www.postgresql.org/docs/devel/static/postgres-fdw.html

postgres=# CREATE FOREIGN TABLE foo (a int, b int) server postgres_server
options (table_name 'aa', foo 'true');
ERROR:  invalid option "foo"
HINT:  Valid options in this context are: schema_name, table_name,
use_remote_explain
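(For what it's worth, a usage sketch, assuming the option ends up settable like the other postgres_fdw options; the server and table names are the ones from the example above:)

```sql
-- Assumed usage, mirroring other postgres_fdw options:
ALTER SERVER postgres_server
  OPTIONS (ADD use_remote_explain 'true');

ALTER FOREIGN TABLE foo
  OPTIONS (ADD use_remote_explain 'false');
```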

Regards,
-- 
Michael


[HACKERS] OSSP UUID present but cannot be compiled

2013-02-21 Thread David E. Wheeler
While building 9.2.3 on OS X 10.8.2 today:

checking ossp/uuid.h usability... no
checking ossp/uuid.h presence... yes
configure: WARNING: ossp/uuid.h: present but cannot be compiled
configure: WARNING: ossp/uuid.h: check for missing prerequisite headers?
configure: WARNING: ossp/uuid.h: see the Autoconf documentation
configure: WARNING: ossp/uuid.h: section Present But Cannot Be Compiled
configure: WARNING: ossp/uuid.h: proceeding with the preprocessor's result
configure: WARNING: ossp/uuid.h: in the future, the compiler will take 
precedence
configure: WARNING: ##  ##
configure: WARNING: ## Report this to pgsql-b...@postgresql.org ##
configure: WARNING: ##  ##
checking for ossp/uuid.h... yes

So you asked for it. Here's how I configured Postgres:

./configure --with-bonjour --with-perl PERL=$PERL \
--with-openssl --with-pam --with-krb5 --with-libxml \
--with-ossp-uuid --with-includes=/usr/local/include \
--enable-integer-datetimes --with-zlib \
--with-libs=/usr/local/lib

And here's how I configured OSSP UUID:

  ./configure --prefix=/usr/local --with-perl=/usr/local/bin/perl \
  --includedir=/usr/local/include/ossp 

I think I have reported this before. Maybe it's not worth worrying about? I 
seem to be able to install the uuid-ossp extension and it works. So ignore?

Thanks,

David


