Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-06-26 Thread Amit Langote

Hi,

On 2015-06-26 AM 12:49, Sawada Masahiko wrote:
 On Thu, Jun 25, 2015 at 7:32 AM, Simon Riggs si...@2ndquadrant.com wrote:

 Let's start with a complex, fully described use case then work out how to
 specify what we want.

 I'm nervous of it would be good ifs because we do a ton of work only to
 find a design flaw.

 
 I'm not sure specific implementation yet, but I came up with solution
 for this case.
 
 For example,
 - s_s_name = '1(a, b), c, d'
 The priority of both 'a' and 'b' are 1, and 'c' is 2, 'd' is 3.
 i.g, 'b' and 'c' are potential sync node, and the quorum commit is
 enable only between 'a' and 'b'.
 
 - s_s_name = 'a, 1(b,c), d'
 priority of 'a' is 1, 'b' and 'c' are 2, 'd' is 3.
 So the quorum commit with 'b' and 'c' will be enabled after 'a' down.
 

Do we really need to add a number like '1' in '1(a, b), c, d'?

The order of writing names already implies priorities like 2  3 for c  d,
respectively, like in your example. Having to write '1' for the group '(a, b)'
seems unnecessary, IMHO. Sorry if I have missed any previous discussion where
its necessity was discussed.

So, the order of writing standby names in the list should declare their
relative priorities and parentheses (possibly nested) should help inform about
the grouping (for quorum?)

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] WIP: ALTER TABLE ... ALTER CONSTRAINT ... SET DEFERRABLE on UNIQUE or PK

2015-06-26 Thread Craig Ringer
Hi all

Attached is a patch to implement ALTER TABLE ... ALTER CONSTRAINT ...
SET DEFERRABLE on UNIQUE or PRIMARY KEY constraints.

Currently only FOREIGN KEY constraints are supported. Others are rejected with:

constraint \%s\ of relation \%s\ is not a foreign key constraint

The patch also adds some regression tests for DEFERRABLE constraints.

The ALTER doesn't take effect in the session it's run in, which makes
me suspect I need to do additional cache invalidations - maybe the
index backing the constraint? Anyway, posted here as-is because I'm
out of time for now and it might be useful for someone who's looking
for info on this.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From c0a041f0ca5d884842820538b56d82472a701c3c Mon Sep 17 00:00:00 2001
From: Craig Ringer cr...@2ndquadrant.com
Date: Fri, 26 Jun 2015 11:59:30 +0800
Subject: [PATCH] Allow ALTER TABLE...ALTER CONSTRAINT on PK and UNIQUE

We presently support DEFERRABLE constraints on PRIMARY KEY and UNIQUE
constraints, but do not permit ALTER TABLE to modify their deferrable
state.

Fix that and add some regression test coverage.
---
 src/backend/commands/tablecmds.c | 72 +++-
 src/test/regress/sql/alter_table.sql | 37 ++
 2 files changed, 76 insertions(+), 33 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d394713..5875987 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -6647,22 +6647,20 @@ ATExecAlterConstraint(Relation rel, AlterTableCmd *cmd,
  errmsg(constraint \%s\ of relation \%s\ does not exist,
 		cmdcon-conname, RelationGetRelationName(rel;
 
-	if (currcon-contype != CONSTRAINT_FOREIGN)
+	if (currcon-contype != CONSTRAINT_FOREIGN 
+		currcon-contype != CONSTRAINT_PRIMARY 
+		currcon-contype != CONSTRAINT_UNIQUE)
 		ereport(ERROR,
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg(constraint \%s\ of relation \%s\ is not a foreign key constraint,
+ errmsg(constraint \%s\ of relation \%s\ must be FOREIGN KEY, PRIMARY KEY or UNIQUE,
 		cmdcon-conname, RelationGetRelationName(rel;
 
 	if (currcon-condeferrable != cmdcon-deferrable ||
 		currcon-condeferred != cmdcon-initdeferred)
 	{
 		HeapTuple	copyTuple;
-		HeapTuple	tgtuple;
 		Form_pg_constraint copy_con;
 		List	   *otherrelids = NIL;
-		ScanKeyData tgkey;
-		SysScanDesc tgscan;
-		Relation	tgrel;
 		ListCell   *lc;
 
 		/*
@@ -6682,44 +6680,52 @@ ATExecAlterConstraint(Relation rel, AlterTableCmd *cmd,
 
 		/*
 		 * Now we need to update the multiple entries in pg_trigger that
-		 * implement the constraint.
+		 * implement the constraint if it's a foreign key constraint.
 		 */
-		tgrel = heap_open(TriggerRelationId, RowExclusiveLock);
+		if (currcon-contype == CONSTRAINT_FOREIGN)
+		{
+			HeapTuple	tgtuple;
+			ScanKeyData tgkey;
+			SysScanDesc tgscan;
+			Relation	tgrel;
 
-		ScanKeyInit(tgkey,
-	Anum_pg_trigger_tgconstraint,
-	BTEqualStrategyNumber, F_OIDEQ,
-	ObjectIdGetDatum(HeapTupleGetOid(contuple)));
+			tgrel = heap_open(TriggerRelationId, RowExclusiveLock);
 
-		tgscan = systable_beginscan(tgrel, TriggerConstraintIndexId, true,
-	NULL, 1, tgkey);
+			ScanKeyInit(tgkey,
+		Anum_pg_trigger_tgconstraint,
+		BTEqualStrategyNumber, F_OIDEQ,
+		ObjectIdGetDatum(HeapTupleGetOid(contuple)));
 
-		while (HeapTupleIsValid(tgtuple = systable_getnext(tgscan)))
-		{
-			Form_pg_trigger copy_tg;
+			tgscan = systable_beginscan(tgrel, TriggerConstraintIndexId, true,
+		NULL, 1, tgkey);
 
-			copyTuple = heap_copytuple(tgtuple);
-			copy_tg = (Form_pg_trigger) GETSTRUCT(copyTuple);
+			while (HeapTupleIsValid(tgtuple = systable_getnext(tgscan)))
+			{
+Form_pg_trigger copy_tg;
 
-			/* Remember OIDs of other relation(s) involved in FK constraint */
-			if (copy_tg-tgrelid != RelationGetRelid(rel))
-otherrelids = list_append_unique_oid(otherrelids,
-	 copy_tg-tgrelid);
+copyTuple = heap_copytuple(tgtuple);
+copy_tg = (Form_pg_trigger) GETSTRUCT(copyTuple);
 
-			copy_tg-tgdeferrable = cmdcon-deferrable;
-			copy_tg-tginitdeferred = cmdcon-initdeferred;
-			simple_heap_update(tgrel, copyTuple-t_self, copyTuple);
-			CatalogUpdateIndexes(tgrel, copyTuple);
+/* Remember OIDs of other relation(s) involved in FK constraint */
+if (copy_tg-tgrelid != RelationGetRelid(rel))
+	otherrelids = list_append_unique_oid(otherrelids,
+		 copy_tg-tgrelid);
 
-			InvokeObjectPostAlterHook(TriggerRelationId,
-	  HeapTupleGetOid(tgtuple), 0);
+copy_tg-tgdeferrable = cmdcon-deferrable;
+copy_tg-tginitdeferred = cmdcon-initdeferred;
+simple_heap_update(tgrel, copyTuple-t_self, copyTuple);
+CatalogUpdateIndexes(tgrel, copyTuple);
 
-			heap_freetuple(copyTuple);
-		}
+InvokeObjectPostAlterHook(TriggerRelationId,
+		  HeapTupleGetOid(tgtuple), 

Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-06-26 Thread Amit Langote
On 2015-06-26 PM 02:59, Amit Langote wrote:
 On 2015-06-26 AM 12:49, Sawada Masahiko wrote:

 For example,
 - s_s_name = '1(a, b), c, d'
 The priority of both 'a' and 'b' are 1, and 'c' is 2, 'd' is 3.
 i.g, 'b' and 'c' are potential sync node, and the quorum commit is
 enable only between 'a' and 'b'.

 - s_s_name = 'a, 1(b,c), d'
 priority of 'a' is 1, 'b' and 'c' are 2, 'd' is 3.
 So the quorum commit with 'b' and 'c' will be enabled after 'a' down.

 
 Do we really need to add a number like '1' in '1(a, b), c, d'?
 
 The order of writing names already implies priorities like 2  3 for c  d,
 respectively, like in your example. Having to write '1' for the group '(a, b)'
 seems unnecessary, IMHO. Sorry if I have missed any previous discussion where
 its necessity was discussed.
 
 So, the order of writing standby names in the list should declare their
 relative priorities and parentheses (possibly nested) should help inform about
 the grouping (for quorum?)
 

Oh, I missed Michael's latest message that describes its necessity. So, the
number is essentially the quorum for a group.

Sorry about the noise.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-06-26 Thread Michael Paquier
On Fri, Jun 26, 2015 at 2:59 PM, Amit Langote wrote:
 Do we really need to add a number like '1' in '1(a, b), c, d'?
 The order of writing names already implies priorities like 2  3 for c  d,
 respectively, like in your example. Having to write '1' for the group '(a, b)'
 seems unnecessary, IMHO. Sorry if I have missed any previous discussion where
 its necessity was discussed.

'1' is implied if no number is specified. That's the idea as written
here, not something decided of course :)

 So, the order of writing standby names in the list should declare their
 relative priorities and parentheses (possibly nested) should help inform about
 the grouping (for quorum?)

Yes.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: ALTER TABLE ... ALTER CONSTRAINT ... SET DEFERRABLE on UNIQUE or PK

2015-06-26 Thread Dean Rasheed
On 26 June 2015 at 07:20, Craig Ringer cr...@2ndquadrant.com wrote:
 Hi all

 Attached is a patch to implement ALTER TABLE ... ALTER CONSTRAINT ...
 SET DEFERRABLE on UNIQUE or PRIMARY KEY constraints.

 Currently only FOREIGN KEY constraints are supported. Others are rejected 
 with:


+1
I was disappointed that this wasn't part of the patch that added
support for it for FKs. What about exclusion constraints? I think
making them work should be more-or-less identical, and then we'll have
support for the full set, since CHECK and NOT NULL constraints can't
currently be deferred.


 constraint \%s\ of relation \%s\ is not a foreign key constraint

 The patch also adds some regression tests for DEFERRABLE constraints.

 The ALTER doesn't take effect in the session it's run in, which makes
 me suspect I need to do additional cache invalidations - maybe the
 index backing the constraint? Anyway, posted here as-is because I'm
 out of time for now and it might be useful for someone who's looking
 for info on this.


If you add it to the next commitfest, I'll review it.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-06-26 Thread Amit Langote

Hi,

On 2015-06-25 PM 01:01, Michael Paquier wrote:
 On Thu, Jun 25, 2015 at 12:57 PM, Fujii Masao wrote:
 On Thu, Jun 25, 2015 at 12:15 PM, Michael Paquier wrote:
 and that's actually equivalent to that in
 the grammar: 1(AAA,BBB,CCC).

 I don't think that they are the same. In the case of 1(AAA,BBB,CCC), while
 two servers AAA and BBB are running, the master server may return a success
 of the transaction to the client just after it receives the ACK from BBB.
 OTOH, in the case of AAA,BBB, that never happens. The master must wait for
 the ACK from AAA to arrive before completing the transaction. And then,
 if AAA goes down, BBB should become synchronous standby.
 
 Ah. Right. I missed your point, that's a bad day... We could have
 multiple separators to define group types then:
 - () where the order of acknowledgement does not matter
 - [] where it does not.

For '[]', I guess you meant where it does.

 You would find the old grammar with:
 1[AAA,BBB,CCC]
 

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-06-26 Thread Michael Paquier
On Fri, Jun 26, 2015 at 5:04 PM, Amit Langote
langote_amit...@lab.ntt.co.jp wrote:

 Hi,

 On 2015-06-25 PM 01:01, Michael Paquier wrote:
 On Thu, Jun 25, 2015 at 12:57 PM, Fujii Masao wrote:
 On Thu, Jun 25, 2015 at 12:15 PM, Michael Paquier wrote:
 and that's actually equivalent to that in
 the grammar: 1(AAA,BBB,CCC).

 I don't think that they are the same. In the case of 1(AAA,BBB,CCC), while
 two servers AAA and BBB are running, the master server may return a success
 of the transaction to the client just after it receives the ACK from BBB.
 OTOH, in the case of AAA,BBB, that never happens. The master must wait for
 the ACK from AAA to arrive before completing the transaction. And then,
 if AAA goes down, BBB should become synchronous standby.

 Ah. Right. I missed your point, that's a bad day... We could have
 multiple separators to define group types then:
 - () where the order of acknowledgement does not matter
 - [] where it does not.

 For '[]', I guess you meant where it does.

Yes, thanks :p
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] thread_test's sched_yield requires -lrt on solaris

2015-06-26 Thread Oskari Saarenmaa
I configured the dingo and binturong Solaris 10 animals to build 9.3
some time ago but apparently they always failed the configure phase.
Turns out this is caused by thread_test's usage of sched_yield which is
in librt on Solaris but which is not pulled in by anything on 9.3 and
earlier on my box.

Apparently the other Solaris animal (castoroides) requires librt for
fdatasync, but that's not required on my system.  On 9.4 and master
librt is required for shm_open so the check doesn't fail there.

Attached a patch to check for sched_yield in configure, the patch only
applies against 9.0 - 9.3 which are using autoconf 2.63.  We should
probably check for sched_yield anyway on all branches even if it's not
strictly required on 9.4+ at the moment.

/ Oskari
From b5a7400bdfad10fcb78a371f29fbde5dff52b40d Mon Sep 17 00:00:00 2001
From: Oskari Saarenmaa o...@ohmu.fi
Date: Fri, 26 Jun 2015 09:36:29 +0300
Subject: [PATCH] configure: we need -lrt for sched_yield on solaris

thread_test.c uses sched_yield which is in librt on Solaris.
Previously we did not check for sched_yield in configure and would not pull
in librt in all cases.  On some Solaris versions librt was required by
fdatasync, but that's not the case anymore on recent versions.

On 9.4 and master librt is again required for shm_open, but in 9.3 and
earlier there's nothing else using librt causing the thread_test check to
fail.

The configure.in diff in this patch applies against 9.0 - master, the
configure diff only applies against 9.0 - 9.3 which use autoconf 2.63; 9.4
and master require an `autoreconf` run.

---
 configure| 88 
 configure.in |  2 ++
 2 files changed, 90 insertions(+)

diff --git a/configure b/configure
index 1e95ab4..170e42a 100755
--- a/configure
+++ b/configure
@@ -8512,6 +8512,94 @@ if test $ac_res != no; then
 
 fi
 
+# Required for thread_test.c on Solaris
+{ $as_echo $as_me:$LINENO: checking for library containing sched_yield 5
+$as_echo_n checking for library containing sched_yield...  6; }
+if test ${ac_cv_search_sched_yield+set} = set; then
+  $as_echo_n (cached)  6
+else
+  ac_func_search_save_LIBS=$LIBS
+cat conftest.$ac_ext _ACEOF
+/* confdefs.h.  */
+_ACEOF
+cat confdefs.h conftest.$ac_ext
+cat conftest.$ac_ext _ACEOF
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+#ifdef __cplusplus
+extern C
+#endif
+char sched_yield ();
+int
+main ()
+{
+return sched_yield ();
+  ;
+  return 0;
+}
+_ACEOF
+for ac_lib in '' rt; do
+  if test -z $ac_lib; then
+ac_res=none required
+  else
+ac_res=-l$ac_lib
+LIBS=-l$ac_lib  $ac_func_search_save_LIBS
+  fi
+  rm -f conftest.$ac_objext conftest$ac_exeext
+if { (ac_try=$ac_link
+case (($ac_try in
+  *\* | *\`* | *\\*) ac_try_echo=\$ac_try;;
+  *) ac_try_echo=$ac_try;;
+esac
+eval ac_try_echo=\\$as_me:$LINENO: $ac_try_echo\
+$as_echo $ac_try_echo) 5
+  (eval $ac_link) 2conftest.er1
+  ac_status=$?
+  grep -v '^ *+' conftest.er1 conftest.err
+  rm -f conftest.er1
+  cat conftest.err 5
+  $as_echo $as_me:$LINENO: \$? = $ac_status 5
+  (exit $ac_status); }  {
+	 test -z $ac_c_werror_flag ||
+	 test ! -s conftest.err
+   }  test -s conftest$ac_exeext  {
+	 test $cross_compiling = yes ||
+	 $as_test_x conftest$ac_exeext
+   }; then
+  ac_cv_search_sched_yield=$ac_res
+else
+  $as_echo $as_me: failed program was: 5
+sed 's/^/| /' conftest.$ac_ext 5
+
+
+fi
+
+rm -rf conftest.dSYM
+rm -f core conftest.err conftest.$ac_objext conftest_ipa8_conftest.oo \
+  conftest$ac_exeext
+  if test ${ac_cv_search_sched_yield+set} = set; then
+  break
+fi
+done
+if test ${ac_cv_search_sched_yield+set} = set; then
+  :
+else
+  ac_cv_search_sched_yield=no
+fi
+rm conftest.$ac_ext
+LIBS=$ac_func_search_save_LIBS
+fi
+{ $as_echo $as_me:$LINENO: result: $ac_cv_search_sched_yield 5
+$as_echo $ac_cv_search_sched_yield 6; }
+ac_res=$ac_cv_search_sched_yield
+if test $ac_res != no; then
+  test $ac_res = none required || LIBS=$ac_res $LIBS
+
+fi
+
 # Required for thread_test.c on Solaris 2.5:
 # Other ports use it too (HP-UX) so test unconditionally
 { $as_echo $as_me:$LINENO: checking for library containing gethostbyname_r 5
diff --git a/configure.in b/configure.in
index 222e3e0..b964644 100644
--- a/configure.in
+++ b/configure.in
@@ -892,6 +892,8 @@ AC_SEARCH_LIBS(getopt_long, [getopt gnugetopt])
 AC_SEARCH_LIBS(crypt, crypt)
 # Solaris:
 AC_SEARCH_LIBS(fdatasync, [rt posix4])
+# Required for thread_test.c on Solaris
+AC_SEARCH_LIBS(sched_yield, rt)
 # Required for thread_test.c on Solaris 2.5:
 # Other ports use it too (HP-UX) so test unconditionally
 AC_SEARCH_LIBS(gethostbyname_r, nsl)
-- 
2.4.3


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash index creation warning

2015-06-26 Thread Robert Haas
On Wed, Jun 24, 2015 at 4:53 AM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, Jun 24, 2015 at 1:45 AM, Craig Ringer cr...@2ndquadrant.com wrote:
 WARNING: hash indexes are not crash-safe, not replicated, and their
 use is discouraged

 +1

I'm not wild about this rewording; I think that if users don't know
what WAL is, they probably need to know that in order to make good
decisions about whether to use hash indexes.  But I don't feel
super-strongly about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Semantics of pg_file_settings view

2015-06-26 Thread Tom Lane
I looked into bug #13471, which states that we gripe about multiple
occurrences of the same variable in postgresql.conf + related files.
Now, that had clearly been fixed some time ago:

Author: Fujii Masao fu...@postgresql.org
Branch: master [e3da0d4d1] 2014-08-06 14:49:43 +0900
Branch: REL9_4_STABLE Release: REL9_4_0 [cf6a9c374] 2014-08-06 14:50:30 +0900

Change ParseConfigFp() so that it doesn't process unused entry of each 
parameter.

... however, it seems I removed that again in a cleanup commit awhile
later :-(.  I think I'd meant to move it somewhere else, or maybe even
fix it to not be O(N^2), but clearly forgot while dealing with assorted
unrelated fallout from the ALTER SYSTEM patch.

However putting it back now would be problematic, because of the recent
introduction of the pg_file_settings view, which purports to display
all entries in the config files, even ones which got overridden by later
duplicate entries.  If we just put back Fujii-san's code then only the
last duplicate entry will be visible in pg_file_settings, which seems to
destroy the rationale for having that view at all.

What we evidently need to do is fix things so that the pg_file_settings
data gets captured before we suppress duplicates.

In view of that, I am wondering whether the current placement of that
data-capturing code was actually designed intentionally, or if it was
chosen by throwing a dart at the source code.  The latter seems more
likely, because we don't capture the data until after we've decided
that all the entries seem provisionally valid, ie the stanza headed

 * Check if all the supplied option names are valid, as an additional
 * quasi-syntactic check on the validity of the config file.  It is

in guc-file.l.  ISTM that there is a good argument for capturing the data
before that, so that it's updated by any SIGHUP, whether or not we
conclude that the config file(s) are valid enough to apply data from.
This would mean that the view might contain data about settings that
aren't valid GUC variables.  But I fail to see why that's a bad thing,
if the main use-case is to debug problems with the config files.

The simplest change would be to move the whole thing to around line 208 in
guc-file.l, just after the stanza that loads PG_AUTOCONF_FILENAME.  Or you
could argue that the approach is broken altogether, and that we should
capture the data while we read the files, so that you have some useful
data in the view even if ParseConfigFile later decides there's a syntax
error.  I'm actually thinking maybe we should flush that data-capturing
logic altogether in favor of just not deleting the ConfigVariable list
data structure, and generating the view directly from that data structure.
(You could even imagine being able to finger syntax errors in the view
that way, by having ParseConfigFile attach a dummy ConfigVariable entry
when it bails out.)

The reason I started looking at this is that the loop where we set
GUC_IS_IN_FILE seems like the most natural place to deal with removing
duplicates, since it can notice more or less for free whether there are
any.  But as the code stands, that's still too early.

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release notes

2015-06-26 Thread Robert Haas
On Wed, Jun 24, 2015 at 6:47 PM, Peter Geoghegan p...@heroku.com wrote:
 On Sun, Jun 14, 2015 at 2:08 PM, Peter Geoghegan p...@heroku.com wrote:
 Really? The pre-check thing wasn't too complex for Magnus to have a
 couple of bullet points on it *specifically* in his high level NYC
 talk on Postgres 9.5 features [1]. I don't think it's hard to
 understand at all.

 Also, it's simply incorrect to describe abbreviation as: Improve the
 speed of sorting character and numeric fields. Character fields
 presumably include character(n), and as I pointed out character(n)
 lacks abbreviation support.

 Where are we on this? Bruce mentioned that he'd revisit this during pgCon.

 Aside from the issue of whether or not the pre-check thing is
 mentioned, and aside from the issue of correctly identifying which
 types the abbreviation optimization applies to, I have another
 concern: I cannot imagine why we'd fail to mention a totally
 independent speed up of about 10% [1] for CREATE INDEX on integer
 columns. This speed-up has nothing to do with abbreviation or anything
 else mentioned in the 9.5 release notes currently; it's down to commit
 5ea86e6e, which extended sortsupport to work with cases like CREATE
 INDEX and CLUSTER.

Can you put your suggestions here in the form of a patch to the release notes?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release notes

2015-06-26 Thread Robert Haas
On Thu, Jun 25, 2015 at 12:09 AM, Amit Langote
langote_amit...@lab.ntt.co.jp wrote:
 On 2015-06-11 PM 01:15, Bruce Momjian wrote:
 I have committed the first draft of the 9.5 release notes.  You can view
 the output here:

   http://momjian.us/pgsql_docs/release-9-5.html

 and it will eventually appear here:

   http://www.postgresql.org/docs/devel/static/release.html

 I am ready to make suggested adjustments, though I am traveling for
 conferences for the next ten days so there might a delay in my replies.


 Is it intentional that the following items are separate?

   listitem
para
 Move applicationpg_upgrade/ from filenamecontrib/ to
 filenamesrc/bin/ (Peter Eisentraut)
/para
   /listitem


  listitem
   para
Move applicationpg_upgrade_support/ code into backend and
remove the module (Peter Eisentraut)
   /para
  /listitem

 Or could they made into one item?

I think one item would be fine.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Nitpicking: unnecessary NULL-pointer check in pg_upgrade's controldata.c

2015-06-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jun 26, 2015 at 10:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I agree that the correct handling of this particular case is to mark it
 as not-a-bug.  We have better things to do.

 Well, I find that a disappointing conclusion, but I'm not going to
 spend a lot of time arguing against both of you.  But, for what it's
 worth: it's not as if somebody is going to modify the code in that
 function to make output == NULL a plausible option, so I think the
 change could easily be justified on code clean-up grounds if nothing
 else.  There's not much point calling fgets on a FILE unconditionally
 and then immediately thereafter allowing for the possibility that
 output might be NULL.  That's not easing the work of anyone who might
 want to modify that code in the future; it just makes the code more
 confusing.

Well, if you find this to be good code cleanup on its own merits,
you have a commit bit, you can go commit it.  I'm just saying that
Coverity is not a good judge of code readability and even less of
a judge of likely future changes.  So we should not let it determine
whether we approve of unnecessary tests.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Redesigning checkpoint_segments

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 7:08 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
 I'm not sure what to do about this. With the attached patch, you get the
 same leisurely pacing with restartpoints as you get with checkpoints, but
 you exceed max_wal_size during recovery, by the amount determined by
 checkpoint_completion_target. Alternatively, we could try to perform
 restartpoints faster then checkpoints, but then you'll get nasty checkpoint
 I/O storms in recovery.

 A bigger change would be to write a WAL record at the beginning of a
 checkpoint. It wouldn't do anything else, but it would be a hint to recovery
 that there's going to be a checkpoint record later whose redo-pointer will
 point to that record. We could then start the restartpoint at that record
 already, before seeing the checkpoint record itself.

 I think the attached is better than nothing, but I'll take a look at that
 beginning-of-checkpoint idea. It might be too big a change to do at this
 point, but I'd really like to fix this properly for 9.5, since we've changed
 with the way checkpoints are scheduled anyway.

I agree.  Actually, I've seen a number of presentations indicating
that the pacing of checkpoints is already too aggressive near the
beginning, because as soon as we initiate the checkpoint we have a
storm of full page writes.  I'm sure we can come up with arbitrarily
complicated systems to compensate for this, but something simple might
be to calculate progress done+adjust/total+adjust rather than
done/total.  If you let adjust=total/9, for example, then you
essentially start the progress meter at 10% instead of 0%.  Even
something that simple might be an improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PGXS check target forcing an install ?

2015-06-26 Thread Robert Haas
On Tue, Jun 23, 2015 at 1:31 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 I tracked the dangerous -rf to come from Makefile.global and it's empty
 string being due to abs_top_builddir not being define in my own Makefile.
 But beside that, which I can probably fix, it doesn't sound correct
 that a check rule insists in finding an install rule.

 Oops, this is a regression, and a dangerous one indeed. This is caused
 by dcae5fac.

 One fix is to use NO_TEMP_INSTALL=yes in Makefile.global in the
 context of PGXS, like in the patch attached, this variable needing to
 be set before Makefile.global is loaded. We could as well use directly
 PGXS in the section Testing, but that does not sound appealing for
 Makefile.global's readability.

Gulp.  I certainly agree that emitting rm -rf /tmp_install is a scary
thing for a PostgreSQL Makefile to be doing.  Fortunately, people
aren't likely to have a directory under / by that name, and maybe not
permissions on it even if they did, but all the same it's not good.  I
propose trying to guard against that a bit more explicitly, as in the
attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index c583b44..081ed5d 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -304,12 +304,14 @@ check: temp-install
 .PHONY: temp-install
 temp-install:
 ifndef NO_TEMP_INSTALL
+ifneq ($(abs_top_builddir),)
 ifeq ($(MAKELEVEL),0)
 	rm -rf '$(abs_top_builddir)'/tmp_install
 	$(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install install
 endif
 	$(if $(EXTRA_INSTALL),for extra in $(EXTRA_INSTALL); do $(MAKE) -C '$(top_builddir)'/$$extra DESTDIR='$(abs_top_builddir)'/tmp_install install || exit; done)
 endif
+endif
 
 PROVE = @PROVE@
 PG_PROVE_FLAGS = -I $(top_srcdir)/src/test/perl/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schedule for 9.5alpha1

2015-06-26 Thread Robert Haas
On Thu, Jun 25, 2015 at 11:55 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Jun 25, 2015 at 6:25 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 I have a serious open item reported 1.5 month ago then reminded
 several times has not been fixed up yet.

 9a28c8860f777e439aa12e8aea7694f8010f3...@bpxm15gp.gisp.nec.co.jp

 Patch is less than 100 lines, entirely designed according to Tom's 
 suggestion.

 The problem is, commit 1a8a4e5cde2b7755e11bde2ea7897bd650622d3e reverted
 create_plan_recurse() to static function, thus, extension lost way to
 transform Path node to Plan node if it wants to takes underlying child
 nodes, like SeqScan, HashJoin and so on.

 Tom's suggestion is to add a list of Path nodes on CustomPath structure,
 to be transformed by createplan.c, instead of public create_plan_recurse().

 It is nearly obvious problem, and bugfix patch already exists.

 Yes, I am quite unhappy with this situation.  Tom promised me at PGCon
 that he would look at this soon, but there is no sign that he has, and
 he said the same thing weeks ago.  I think it can't be right to let
 this sit for another month or three.  Even if the API you've
 implemented is worse than something Tom can design, it is certainly
 better than the status quo.  I would rather have a working but
 imperfect API and have to break compatibility later in beta than have
 a non-working API.

...given which, I have committed this.  I did not like the use of
custom_children as a generic monicker, so I changed it to
custom_paths, custom_plans, or custom_ps, as appropriate in each case.
I did a bit of cosmetic cleanup as well.

This does seem much nicer than giving custom plans direct access to
create_plan_recurse().  The fact that you found various other places
of using those lists demonstrates that nicely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Conflict between REL9_4_STABLE and master branch.

2015-06-26 Thread David G. Johnston
On Fri, Jun 26, 2015 at 11:22 AM, Uriy Zhuravlev u.zhurav...@postgrespro.ru
 wrote:

 Hello hackers.

 I found a strange thing. I hope it's not on purpose.

 Example:
 git clone git://git.postgresql.org/git/postgresql.git
 cd postgresql
 git checkout -b remotes/origin/REL9_4_STABLE
 git merge master
 MANY CONFLICTS


​It is intentional: t
he release branches operate exclusively in parallel to master and each
other.​  While any two share a common ancestor they are never intended to
be joined again in the future.  cherry-picking w/ tweaks is used to apply
fixes made against master to the back-branches, if applicable.  Not being
an actual code -hacker on this project I am lacking on specifics and maybe
over-simplifying.

Think of this as a multi-master repository.

David J.


Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 1:46 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Jun 25, 2015 at 8:32 PM, Simon Riggs  wrote:
 Let's start with a complex, fully described use case then work out how to
 specify what we want.

 Well, one of the most simple cases where quorum commit and this
 feature would be useful for is that, with 2 data centers:
 - on center 1, master A and standby B
 - on center 2, standby C and standby D
 With the current synchronous_standby_names, what we can do now is
 ensuring that one node has acknowledged the commit of master. For
 example synchronous_standby_names = 'B,C,D'. But you know that :)
 What this feature would allow use to do is for example being able to
 ensure that a node on the data center 2 has acknowledged the commit of
 master, meaning that even if data center 1 completely lost for a
 reason or another we have at least one node on center 2 that has lost
 no data at transaction commit.

 Now, regarding the way to express that, we need to use a concept of
 node group for each element of synchronous_standby_names. A group
 contains a set of elements, each element being a group or a single
 node. And for each group we need to know three things when a commit
 needs to be acknowledged:
 - Does my group need to acknowledge the commit?
 - If yes, how many elements in my group need to acknowledge it?
 - Does the order of my elements matter?

 That's where the micro-language idea makes sense to use. For example,
 we can define a group using separators and like (elt1,...eltN) or
 [elt1,elt2,eltN]. Appending a number in front of a group is essential
 as well for quorum commits. Hence for example, assuming that '()' is
 used for a group whose element order does not matter, if we use that:
 - k(elt1,elt2,eltN) means that we need for the k elements in the set
 to return true (aka commit confirmation).
 - k[elt1,elt2,eltN] means that we need for the first k elements in the
 set to return true.

 When k is not defined for a group, k = 1. Using only elements
 separated by commas for the upper group means that we wait for the
 first element in the set (for backward compatibility), hence:
 1(elt1,elt2,eltN) = elt1,elt2,eltN

Nice design.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash index creation warning

2015-06-26 Thread Robert Haas
On Wed, Jun 24, 2015 at 3:06 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Wed, Jun 24, 2015 at 12:27 AM, Robert Haas wrote:
 I think you should be testing RelationNeedsWAL(), not the
 relpersistence directly.  The same point applies for temporary
 indexes.

 Indeed. Patch updated attached.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN: Implementing triConsistent and strategy number

2015-06-26 Thread Jeff Janes
On Fri, Jun 26, 2015 at 7:13 AM, Heikki Linnakangas hlinn...@iki.fi wrote:

 On 06/24/2015 11:11 PM, Jeff Janes wrote:

 Is there a way to implement triConsistent for only some of the strategy
 numbers?


 No.

  It looks like I would have to internally re-implement something like
 shimTriConsistentFn for each strategy number for which I don't want to
 implement the ternary system in full.  Am I missing a trick?


 Hmm. It didn't occur to me that you might want to implement tri-consistent
 for some strategy numbers and fall back to the shim-implementation for
 others.


I didn't really want to fall back to shim-implementation for executing the
query, but I didn't see other general options.  I could define a new
operator class with just the operators I want to support triconsistent on,
but then I think I would need to build the same index once for the old
class and once for the new class.

If it could learn that triconsistent is not supported for a certain
strategy, it could set
key-nrequired = key-nentries
And then use the binary consistent function.

 Do you have a real-world example of where that'd be useful?

I wanted to make pg_trgm support triconsistent for LIKE queries, to see if
that would speed things up on a few cases, but didn't want to mess around
with regexp or similarity until I was sure it was worthwhile.  I can make
them throw an error if invoked for now.   In this case if it is worth it to
do for LIKE, it will probably be worth it for the other ones as well, so it
is more an issue of incremental development and testing rather than the
final state.

Cheers,

Jeff


Re: [HACKERS] Nitpicking: unnecessary NULL-pointer check in pg_upgrade's controldata.c

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 9:49 AM, Andres Freund and...@anarazel.de wrote:
 On 2015-06-26 09:44:14 -0400, Robert Haas wrote:
 I don't mind committing patches for this kind of thing if it makes the
 Coverity reports easier to deal with, which I gather that it does.

 It takes about three seconds to mark it as ignored which will hide it
 going forward.

So what?  That doesn't help if someone *else* sets up a Coverity run
on this code base, or if say Salesforce sets up such a run on their
fork of the code base.  It's much better to fix the problem at the
root.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN: Implementing triConsistent and strategy number

2015-06-26 Thread Heikki Linnakangas

On 06/24/2015 11:11 PM, Jeff Janes wrote:

Is there a way to implement triConsistent for only some of the strategy
numbers?


No.


It looks like I would have to internally re-implement something like
shimTriConsistentFn for each strategy number for which I don't want to
implement the ternary system in full.  Am I missing a trick?


Hmm. It didn't occur to me that you might want to implement 
tri-consistent for some strategy numbers and fall back to the 
shim-implementation for others. Do you have a real-world example of 
where that'd be useful?


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Nitpicking: unnecessary NULL-pointer check in pg_upgrade's controldata.c

2015-06-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jun 26, 2015 at 9:49 AM, Andres Freund and...@anarazel.de wrote:
 It takes about three seconds to mark it as ignored which will hide it
 going forward.

 So what?  That doesn't help if someone *else* sets up a Coverity run
 on this code base, or if say Salesforce sets up such a run on their
 fork of the code base.  It's much better to fix the problem at the
 root.

The problem with that is allowing Coverity, which in the end is not magic
but just another piece of software with many faults, to define what is a
problem.  In this particular case, the only effect of the change that
I can see is to make the code less flexible, and less robust against a
fairly obvious type of future change.  So I'm not on board with removing
if-guards just because Coverity thinks they are unnecessary.

I agree that the correct handling of this particular case is to mark it
as not-a-bug.  We have better things to do.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Are we sufficiently clear that jsonb containment is nested?

2015-06-26 Thread Robert Haas
On Wed, Jun 24, 2015 at 5:53 PM, Peter Geoghegan p...@heroku.com wrote:
 I worry that 8.14.3. jsonb Containment and Existence is not
 sufficiently clear in explaining that jsonb containment is nested.
 I've seen anecdata suggesting that this is unclear to users. We do
 say:

 
 The general principle is that the contained object must match the
 containing object as to structure and data contents, possibly after
 discarding some non-matching array elements or object key/value pairs
 from the containing object.
 

 I think that we could still do with an example showing *nested*
 containment, where many non-matching elements/pairs at each of several
 nesting levels are discarded. This could be back-patched to 9.4.
 Something roughly like the delicious sample data, where queries like
 the following are possible and useful:

I would be fine with adding a *compact* example of this kind to the
table that begins section 8.14.3.  I probably would not back-patch it,
because the absence of that example is not an error in the
documentation, but I will not complain if someone else does.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Conflict between REL9_4_STABLE and master branch.

2015-06-26 Thread Uriy Zhuravlev
Hello hackers.

I found a strange thing. I hope it's not on purpose.

Example:
git clone git://git.postgresql.org/git/postgresql.git
cd postgresql
git checkout -b remotes/origin/REL9_4_STABLE
git merge master
MANY CONFLICTS

Why? 

Thanks. 
-- 
Uriy Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Conflict between REL9_4_STABLE and master branch.

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 11:22 AM, Uriy Zhuravlev
u.zhurav...@postgrespro.ru wrote:
 Hello hackers.

 I found a strange thing. I hope it's not on purpose.

 Example:
 git clone git://git.postgresql.org/git/postgresql.git
 cd postgresql
 git checkout -b remotes/origin/REL9_4_STABLE

This is an awfully strange thing to do.  git checkout -b creates a new
branch.  Why would you create a new branch with the name
remotes/origin/EDBAS9_4_STABLE?  I would just do git checkout
REL9_4_STABLE if that's what you want to do.

 git merge master
 MANY CONFLICTS

Well, the stuff we back-patch is going to conflict if it's been
adjusted at all.  I think trying to do this is a bad idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #13473: VACUUM FREEZE mistakenly cancel standby sessions

2015-06-26 Thread Marco Nenciarini
Il 26/06/15 15:43, marco.nenciar...@2ndquadrant.it ha scritto:
 The following bug has been logged on the website:
 
 Bug reference:  13473
 Logged by:  Marco Nenciarini
 Email address:  marco.nenciar...@2ndquadrant.it
 PostgreSQL version: 9.4.4
 Operating system:   all
 Description:
 
 = Symptoms
 
 Let's have a simple master - standby setup, with hot_standby_feedback
 activated,
 if a backend on standby is holding the cluster xmin and the master runs a
 VACUUM FREEZE
 on the same database of the standby's backend, it will generate a conflict
 and the query
 running on standby will be canceled.
 
 = How to reproduce it
 
 Run the following operation on an idle cluster.
 
 1) connect to the standby and simulate a long running query:
 
select pg_sleep(3600);
 
 2) connect to the master and run the following script
 
create table t(id int primary key);
insert into t select generate_series(1, 1);
vacuum freeze verbose t;
drop table t;
 
 3) after 30 seconds the pg_sleep query on standby will be canceled.
 
 = Expected output
 
 The hot standby feedback should have prevented the query cancellation
 
 = Analysis
 
 Ive run postgres at DEBUG2 logging level, and I can confirm that the vacuum
 correctly see the OldestXmin propagated by the standby through the hot
 standby feedback.
 The issue is in heap_xlog_freeze function, which calls
 ResolveRecoveryConflictWithSnapshot as first thing, passing the cutoff_xid
 value as first argument.
 The cutoff_xid is the OldestXmin active when the vacuum, so it represents a
 running xid. 
 The issue is that the function ResolveRecoveryConflictWithSnapshot expects
 as first argument of is latestRemovedXid, which represent the higher xid
 that has been actually removed, so there is an off-by-one error.
 
 I've been able to reproduce this issue for every version of postgres since
 9.0 (9.0, 9.1, 9.2, 9.3, 9.4 and current master)
 
 = Proposed solution
 
 In the heap_xlog_freeze we need to subtract one to the value of cutoff_xid
 before passing it to ResolveRecoveryConflictWithSnapshot.
 
 
 

Attached a proposed patch that solves the issue.

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index caacc10..28edb17 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -7571,9 +7571,12 @@ heap_xlog_freeze_page(XLogReaderState *record)
if (InHotStandby)
{
RelFileNode rnode;
+   TransactionId latestRemovedXid = cutoff_xid;
+
+   TransactionIdRetreat(latestRemovedXid);
 
XLogRecGetBlockTag(record, 0, rnode, NULL, NULL);
-   ResolveRecoveryConflictWithSnapshot(cutoff_xid, rnode);
+   ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rnode);
}
 
if (XLogReadBufferForRedo(record, 0, buffer) == BLK_NEEDS_REDO)


signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Should we back-patch SSL renegotiation fixes?

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 9:59 AM, Andres Freund and...@anarazel.de wrote:
 Generally I'd agree that that is a bad thing. But there's really not
 much of a observable behaviour change in this case? Except that
 connections using ssl break less often.

Well, SSL renegotiation exists for a reason: to improve security.
It's not awesome that we're being forced to shut off features that are
designed to improve security.  But it seems we have little choice, at
least until we can support some other SSL implementation (and maybe
not even then).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we back-patch SSL renegotiation fixes?

2015-06-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jun 26, 2015 at 9:59 AM, Andres Freund and...@anarazel.de wrote:
 Generally I'd agree that that is a bad thing. But there's really not
 much of a observable behaviour change in this case? Except that
 connections using ssl break less often.

 Well, SSL renegotiation exists for a reason: to improve security.

That was the theory, yes, but the CVEs that have come out of it indicate
that whether it improves security *in practice* is a pretty debatable
topic.  The fact that the new TLS draft drops it altogether tells us
something about the conclusion the standards community has arrived at.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we back-patch SSL renegotiation fixes?

2015-06-26 Thread Andres Freund
On 2015-06-26 10:26:58 -0400, Robert Haas wrote:
 On Fri, Jun 26, 2015 at 9:59 AM, Andres Freund and...@anarazel.de wrote:
  Generally I'd agree that that is a bad thing. But there's really not
  much of a observable behaviour change in this case? Except that
  connections using ssl break less often.
 
 Well, SSL renegotiation exists for a reason: to improve security.

Well, except that even if it were implemented correctly it's far from
clear cut that it's a win:

If your argument is that key-rotation is beneficial because it gives an
attacker less encrypted material to analyze: That's not a good argument,
you're just giving him more information about the assymetric crypto side
of things instead about the session key which is ephemeral anyway.

I think they only real argument for it is that you want to limit the
amount of data you could decrypt if you gain access to the current
symmetric key via the client's memory . But that's not a particularly
large benefit.

 But it seems we have little choice, at least until we can support some
 other SSL implementation (and maybe not even then).

I read through one other SSL implementation (NSS), and I don't think
it's substantially better handled there. At least one other
implementations is ripping out support entirely already.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we back-patch SSL renegotiation fixes?

2015-06-26 Thread Robert Haas
On Thu, Jun 25, 2015 at 8:03 AM, Andres Freund and...@anarazel.de wrote:
 I don't accept the argument that there are not ways to tell users
 about things they might want to do.

 We probably could do that. But why would we want to? It's just as much
 work, and puts the onus on more people?

Because it doesn't force a behavior change down everyone's throat.

If it were just a question of back-porting fixes, even someone
invasive ones, well, maybe that's what we have to do; that's pretty
much exactly what we are planning to do for the MultiXact case, but
according to Heikki, this is broken even in master and can't really be
fixed unless and until OpenSSL gets their act together.  That's a hard
argument to argue with, and I think I'm on board with it.

But as a general point, we should be very reluctant to force behavior
changes on our users in released branches, because users don't like
that.  When there are reasonable alternatives to doing that, we should
choose them.  If we have no other reasonable choice here, so be it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we back-patch SSL renegotiation fixes?

2015-06-26 Thread Andres Freund
On 2015-06-26 09:53:30 -0400, Robert Haas wrote:
 On Thu, Jun 25, 2015 at 8:03 AM, Andres Freund and...@anarazel.de wrote:
  I don't accept the argument that there are not ways to tell users
  about things they might want to do.
 
  We probably could do that. But why would we want to? It's just as much
  work, and puts the onus on more people?
 
 Because it doesn't force a behavior change down everyone's throat.

Generally I'd agree that that is a bad thing. But there's really not
much of a observable behaviour change in this case? Except that
connections using ssl break less often.

 If it were just a question of back-porting fixes, even someone
 invasive ones, well, maybe that's what we have to do; that's pretty
 much exactly what we are planning to do for the MultiXact case

There's no way we can reasonably disable multixacts, so I don't think
these situations are comparable.

 but according to Heikki, this is broken even in master and can't really be
 fixed unless and until OpenSSL gets their act together.

Yes, that's my conclusion as well, leading to my position in this
thread.

 That's a hard argument to argue with, and I think I'm on board with
 it.

Given that reported bugs for openssl around this have existed since
about 2002 without any effort at fixing, I think it's seriously unlikely
that they ever will.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Enhanced ALTER OPERATOR

2015-06-26 Thread Robert Haas
On Wed, Jun 24, 2015 at 7:30 AM, Uriy Zhuravlev
u.zhurav...@postgrespro.ru wrote:
 Because change the commutator and negator raised questions I suggest 3 version
 of the patch without them. In addition, for us now is much more important
 restrict and join (for Selectivity estimation for intarray patch).

Seems broadly sensible.  You will need to write docs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we back-patch SSL renegotiation fixes?

2015-06-26 Thread Heikki Linnakangas

On 06/26/2015 04:53 PM, Robert Haas wrote:

On Thu, Jun 25, 2015 at 8:03 AM, Andres Freund and...@anarazel.de wrote:

I don't accept the argument that there are not ways to tell users
about things they might want to do.


We probably could do that. But why would we want to? It's just as much
work, and puts the onus on more people?


Because it doesn't force a behavior change down everyone's throat.


It's arguable whether this is a change in behaviour or not. SSL 
renegotiation is (supposed to be) completely transparent to the user.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Redesigning checkpoint_segments

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 9:47 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
 On 06/26/2015 03:40 PM, Robert Haas wrote:
 Actually, I've seen a number of presentations indicating
 that the pacing of checkpoints is already too aggressive near the
 beginning, because as soon as we initiate the checkpoint we have a
 storm of full page writes.  I'm sure we can come up with arbitrarily
 complicated systems to compensate for this, but something simple might
 be to calculate progress done+adjust/total+adjust rather than
 done/total.  If you let adjust=total/9, for example, then you
 essentially start the progress meter at 10% instead of 0%.  Even
 something that simple might be an improvement.

 Yeah, but that's an unrelated issue. This was most recently discussed at
 http://www.postgresql.org/message-id/CAKHd5Ce-bnD=geedtxit2_ay7shqutkd0yhxxk5f4zvekrp...@mail.gmail.com.
 I posted a simple patch there - review and testing is welcome ;-).

Ah, thanks for the pointer - I had forgotten about that thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Nitpicking: unnecessary NULL-pointer check in pg_upgrade's controldata.c

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 10:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Jun 26, 2015 at 9:49 AM, Andres Freund and...@anarazel.de wrote:
 It takes about three seconds to mark it as ignored which will hide it
 going forward.

 So what?  That doesn't help if someone *else* sets up a Coverity run
 on this code base, or if say Salesforce sets up such a run on their
 fork of the code base.  It's much better to fix the problem at the
 root.

 The problem with that is allowing Coverity, which in the end is not magic
 but just another piece of software with many faults, to define what is a
 problem.  In this particular case, the only effect of the change that
 I can see is to make the code less flexible, and less robust against a
 fairly obvious type of future change.  So I'm not on board with removing
 if-guards just because Coverity thinks they are unnecessary.

 I agree that the correct handling of this particular case is to mark it
 as not-a-bug.  We have better things to do.

Well, I find that a disappointing conclusion, but I'm not going to
spend a lot of time arguing against both of you.  But, for what it's
worth: it's not as if somebody is going to modify the code in that
function to make output == NULL a plausible option, so I think the
change could easily be justified on code clean-up grounds if nothing
else.  There's not much point calling fgets on a FILE unconditionally
and then immediately thereafter allowing for the possibility that
output might be NULL.  That's not easing the work of anyone who might
want to modify that code in the future; it just makes the code more
confusing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] less log level for success dynamic background workers for 9.5

2015-06-26 Thread Robert Haas
On Wed, Jun 24, 2015 at 7:39 PM, Jim Nasby jim.na...@bluetreble.com wrote:
 I think it's a whole separate topicto Pavel's original proposal
 though, and really merits a separate thread. For Pavel's issue I'm all
 in favour of just changing the log message, I think LOG is way too
 high for something that's internal implementation detail.

 +1.

OK, I have committed Pavel's patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PANIC in GIN code

2015-06-26 Thread Jeff Janes
Under high load against f7bb7f0625771bc71869cda, I occasionally get:

PANIC:  XLogBeginInsert was not called

It seems to only come from vacuuming.

Here is an example back-trace:

#0  0x003dcb632625 in raise () from /lib64/libc.so.6
#1  0x003dcb633e05 in abort () from /lib64/libc.so.6
#2  0x0079a39c in errfinish (dummy=value optimized out) at
elog.c:551
#3  0x0079b0e4 in elog_finish (elevel=value optimized out,
fmt=value optimized out) at elog.c:1368
#4  0x004dd1fe in XLogInsert (rmid=13 '\r', info=48 '0') at
xloginsert.c:412
#5  0x00478e13 in ginPlaceToPage (btree=0x7fff1fbb7f60,
stack=0x2878760, insertdata=value optimized out, updateblkno=value
optimized out,
childbuf=0, buildStats=value optimized out) at ginbtree.c:582
#6  0x00479a33 in ginInsertValue (btree=0x7fff1fbb7f60,
stack=0x2878760, insertdata=0x7fff1fbb7fe0, buildStats=0x0) at
ginbtree.c:751
#7  0x0047364a in ginEntryInsert (ginstate=0x7fff1fbb8280,
attnum=54624, key=1, category=value optimized out, items=0x287d3c0,
nitem=1,
buildStats=0x0) at gininsert.c:234
#8  0x0047ef4b in ginInsertCleanup (ginstate=0x7fff1fbb8280,
vac_delay=value optimized out, stats=0x2868d90) at ginfast.c:843
#9  0x0047e024 in ginbulkdelete (fcinfo=value optimized out) at
ginvacuum.c:547


From the code, it seems obvious that XLogBeginInsert is getting called at
ginbtree.c line 373, so I think that some obscure code path of
btree-placeToPage must be either consuming or resetting that
XLogBeginInsert before it returns control to ginbtree.c


Cheers,

Jeff


Re: [HACKERS] 9.5 release notes

2015-06-26 Thread Peter Geoghegan
On Fri, Jun 26, 2015 at 11:29 AM, Robert Haas robertmh...@gmail.com wrote:

 OK, but how about this wording instead?

That seems fine.

 BTW, shouldn't Andrew also be credited here, since he did the work on
 datum sorts?

Andrew's work was entirely confined to making datum sorts work with
abbreviation, which seems entirely orthogonal (but was enough to make
me want to credit him as an author of abbreviated keys, even after
breaking out his work on numeric support into a separate item). This
particular piece of work has nothing to do with the datum sort case,
though.

Datum sorts always supported SortSupport. This commit, 5ea86e6e6,
really should have been in 9.2 (especially since it had a net-negative
code footprint and clearly simplified tuplesort), and had nothing to
do with abbreviation -- it went in before abbreviation, and before it
was 100% clear that abbreviation would ever land.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 feature count

2015-06-26 Thread Alvaro Herrera
Bruce Momjian wrote:
 I have run a script to count the number of listitem items in the
 major release notes of each major version of Postgres back to 7.4:
 
   7.4280
   8.0238
   8.1187
   8.2230
   8.3237
   8.4330
   9.0252
   9.1213
   9.2250
   9.3187
   9.4217
   9.5176
 
 The 9.5 number will only change a little by 9.5 final.

I think doing this kind of analysis can lead to bad incentives; should
we split two items that are unrelated but touch similarly-sounding parts
of the code, should we merge items that are actually pretty much the
same thing?  It's either pointless, because people in-the-know actually
realizes that it doesn't actually mean anything, or confusing because
people think that some releases are bigger than others because they have
more features.

Maybe there's a reasonable way to measure releases (my 8.0 is bigger
than your 9.1!), but I don't think this is it.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release notes

2015-06-26 Thread Peter Geoghegan
On Fri, Jun 26, 2015 at 9:29 AM, Robert Haas robertmh...@gmail.com wrote:
 Can you put your suggestions here in the form of a patch to the release notes?

The attached patch generalizes from the 9.2 release note wording. I
use the word inlined here too, even though commit 5ea86e6e6 did not
do too much with inlining of C code (unlike the 9.2 stuff that went in
a little after SortSupport itself).

Seems better to be consistent with the earlier item, and it is still
probably in some sense true, because the new SortSupport-wise inlined
comparator probably benefits from inlining more than the historic
scanKey-wise inlined comparator, due to the removal of indirection.

As I'm fond of pointing out, inlining is mostly useful as an enabling
transformation these days.
-- 
Peter Geoghegan
diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
index 4e2ea45..44a9d88 100644
--- a/doc/src/sgml/release-9.5.sgml
+++ b/doc/src/sgml/release-9.5.sgml
@@ -221,6 +221,15 @@
 
   listitem
para
+Allow sorting to be performed by inlined,
+non-acronymSQL/acronym-callable comparison functions for
+commandCREATE INDEX/, commandREINDEX/, and
+commandCLUSTER/ (Peter Geoghegan)
+   /para
+  /listitem
+
+  listitem
+   para
 Improve in-memory hash performance (Tomas Vondra, Robert Haas)
/para
   /listitem

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 1:12 PM, Josh Berkus j...@agliodbs.com wrote:
 This really feels like we're going way beyond what we want a single
 string GUC.  I feel that this feature, as outlined, is a terrible hack
 which we will regret supporting in the future.  You're taking something
 which was already a fast hack because we weren't sure if anyone would
 use it, and building two levels on top of that.

 If we're going to do quorum, multi-set synchrep, then we need to have a
 real management interface.  Like, we really ought to have a system
 catalog and some built in functions to manage this instead, e.g.

 pg_add_synch_set(set_name NAME, quorum INT, set_members VARIADIC)

 pg_add_synch_set('bolivia', 1, 'bsrv-2,'bsrv-3','bsrv-5')

 pg_modify_sync_set(quorum INT, set_members VARIADIC)

 pg_drop_synch_set(set_name NAME)

 For users who want the new functionality, they just set
 synchronous_standby_names='catalog' in pg.conf.

 Having a function interface for this would make it worlds easier for the
 DBA to reconfigure in order to accomodate network changes as well.
 Let's face it, a DBA with three synch sets in different geos is NOT
 going to want to edit pg.conf by hand and reload when the link to Brazil
 goes down.  That's a really sucky workflow, and near-impossible to automate.

I think your proposal is worth considering, but you would need to fill
in a lot more details and explain how it works in detail, rather than
just via a set of example function calls.  The GUC-based syntax
proposal covers cases like multi-level rules and, now, prioritization,
and it's not clear how those would be reflected in what you propose.

 Finally, while I'm raining on everyone's parade: the mechanism of
 identifying synchronous replicas by setting the application_name on the
 replica is confusing and error-prone; if we're building out synchronous
 replication into a sophisticated system, we ought to think about
 replacing it.

I'm not averse to replacing it with something we all agree is better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-06-26 Thread Josh Berkus
On 06/26/2015 11:32 AM, Robert Haas wrote:
 I think your proposal is worth considering, but you would need to fill
 in a lot more details and explain how it works in detail, rather than
 just via a set of example function calls.  The GUC-based syntax
 proposal covers cases like multi-level rules and, now, prioritization,
 and it's not clear how those would be reflected in what you propose.

So what I'm seeing from the current proposal is:

1. we have several defined synchronous sets
2. each set requires a quorum of k  (defined per set)
3. within each set, replicas are arranged in priority order.

One thing which the proposal does not implement is *names* for
synchronous sets.  I would also suggest that if I lose this battle and
we decide to go with a single stringy GUC, that we at least use JSON
instead of defining our out, proprietary, syntax?

Point 3. also seems kind of vaguely defined.  Are we still relying on
the idea that multiple servers have the same application_name to make
them equal, and that anything else is a proritization?  That is, if we have:

replica1: appname=group1
replica2: appname=group2
replica3: appname=group1
replica4: appname=group2
replica5: appname=group1
replica6: appname=group2

And the definition:

synchset: A
quorum: 2
members: [ group1, group2 ]

Then the desired behavior would be: we must get acks from at least 2
servers in group1, but if group1 isn't responding, then from group2?

What if *one* server in group1 responds?  What do we do?  Do we fail the
whole group and try for 2 out of 3 in group2?  Or do we only need one in
group2?  In which case, what prioritization is there?  Who could
possibly use anything so complex?

I'm personally not convinced that quorum and prioritization are
compatible.  I suggest instead that quorum and prioritization should be
exclusive alternatives, that is that a synch set should be either a
quorum set (with all members as equals) or a prioritization set (if rep1
fails, try rep2).  I can imagine use cases for either mode, but not one
which would involve doing both together.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release notes

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 2:01 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Robert Haas wrote:
 On Thu, Jun 25, 2015 at 12:09 AM, Amit Langote
 langote_amit...@lab.ntt.co.jp wrote:

  Is it intentional that the following items are separate?
  [...]
  Or could they made into one item?

 I think one item would be fine.

 I suggested the same a couple of weeks ago.

Great, done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Nitpicking: unnecessary NULL-pointer check in pg_upgrade's controldata.c

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 10:54 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, if you find this to be good code cleanup on its own merits,
 you have a commit bit, you can go commit it.  I'm just saying that
 Coverity is not a good judge of code readability and even less of
 a judge of likely future changes.  So we should not let it determine
 whether we approve of unnecessary tests.

Yes, it might not be right in every case, but this one seems like a
good change to me, so committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?

2015-06-26 Thread Fabien COELHO



This needs more performance testing.


Definitely. I may do that some day. However I'm not sure that this is 
currently the main issue in the checkpointer.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-06-26 Thread Josh Berkus
On 06/26/2015 09:42 AM, Robert Haas wrote:
 On Fri, Jun 26, 2015 at 1:46 AM, Michael Paquier
 That's where the micro-language idea makes sense to use. For example,
 we can define a group using separators and like (elt1,...eltN) or
 [elt1,elt2,eltN]. Appending a number in front of a group is essential
 as well for quorum commits. Hence for example, assuming that '()' is
 used for a group whose element order does not matter, if we use that:
 - k(elt1,elt2,eltN) means that we need for the k elements in the set
 to return true (aka commit confirmation).
 - k[elt1,elt2,eltN] means that we need for the first k elements in the
 set to return true.

 When k is not defined for a group, k = 1. Using only elements
 separated by commas for the upper group means that we wait for the
 first element in the set (for backward compatibility), hence:
 1(elt1,elt2,eltN) = elt1,elt2,eltN

This really feels like we're going way beyond what we want a single
string GUC.  I feel that this feature, as outlined, is a terrible hack
which we will regret supporting in the future.  You're taking something
which was already a fast hack because we weren't sure if anyone would
use it, and building two levels on top of that.

If we're going to do quorum, multi-set synchrep, then we need to have a
real management interface.  Like, we really ought to have a system
catalog and some built in functions to manage this instead, e.g.

pg_add_synch_set(set_name NAME, quorum INT, set_members VARIADIC)

pg_add_synch_set('bolivia', 1, 'bsrv-2,'bsrv-3','bsrv-5')

pg_modify_sync_set(quorum INT, set_members VARIADIC)

pg_drop_synch_set(set_name NAME)

For users who want the new functionality, they just set
synchronous_standby_names='catalog' in pg.conf.

Having a function interface for this would make it worlds easier for the
DBA to reconfigure in order to accomodate network changes as well.
Let's face it, a DBA with three synch sets in different geos is NOT
going to want to edit pg.conf by hand and reload when the link to Brazil
goes down.  That's a really sucky workflow, and near-impossible to automate.

We'll also want a new system view, pg_stat_syncrep:

pg_stat_synchrep
standby_name
client_addr
replication_status
synch_set
synch_quorum
synch_status

Alternately, we could overload those columns onto pg_stat_replication,
but that seems messy.

Finally, while I'm raining on everyone's parade: the mechanism of
identifying synchronous replicas by setting the application_name on the
replica is confusing and error-prone; if we're building out synchronous
replication into a sophisticated system, we ought to think about
replacing it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release notes

2015-06-26 Thread Alvaro Herrera
Robert Haas wrote:
 On Thu, Jun 25, 2015 at 12:09 AM, Amit Langote
 langote_amit...@lab.ntt.co.jp wrote:

  Is it intentional that the following items are separate?
  [...]
  Or could they made into one item?
 
 I think one item would be fine.

I suggested the same a couple of weeks ago.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release notes

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 2:25 PM, Peter Geoghegan p...@heroku.com wrote:
 On Fri, Jun 26, 2015 at 9:29 AM, Robert Haas robertmh...@gmail.com wrote:
 Can you put your suggestions here in the form of a patch to the release 
 notes?

 The attached patch generalizes from the 9.2 release note wording. I
 use the word inlined here too, even though commit 5ea86e6e6 did not
 do too much with inlining of C code (unlike the 9.2 stuff that went in
 a little after SortSupport itself).

 Seems better to be consistent with the earlier item, and it is still
 probably in some sense true, because the new SortSupport-wise inlined
 comparator probably benefits from inlining more than the historic
 scanKey-wise inlined comparator, due to the removal of indirection.

OK, but how about this wording instead?

diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
index 4086c6b..5e8cc15 100644
--- a/doc/src/sgml/release-9.5.sgml
+++ b/doc/src/sgml/release-9.5.sgml
@@ -221,6 +221,15 @@

   listitem
para
+Extend the infrastructure allow sorting to be performed by inlined,
+non-acronymSQL/acronym-callable comparison functions to cover
+commandCREATE INDEX/, commandREINDEX/, and
+commandCLUSTER/ (Peter Geoghegan)
+   /para
+  /listitem
+
+  listitem
+   para
 Improve in-memory hash performance (Tomas Vondra, Robert Haas)
/para
   /listitem

BTW, shouldn't Andrew also be credited here, since he did the work on
datum sorts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index bug due to WAL refactoring

2015-06-26 Thread Alvaro Herrera
Jeff Janes wrote:

 On replica:
 
 set enable_seqscan TO off;
 explain (analyze) select count(*) from foobar ;
 ERROR:  corrupted BRIN index: inconsistent range map

Nice.  As I understand it, the problem is that the replay is using the
block number of the revmap page as target blkno of the revmap entry,
when it should be using the block number of the data page.  This should
fix it.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services
commit 622a903cf88931878cbaba9bdd25c9835af0d055
Author: Alvaro Herrera alvhe...@alvh.no-ip.org
Date:   Fri Jun 26 15:33:19 2015 -0300

Fix BRIN xlog replay

diff --git a/src/backend/access/brin/brin_xlog.c b/src/backend/access/brin/brin_xlog.c
index 49261aa..e68f623 100644
--- a/src/backend/access/brin/brin_xlog.c
+++ b/src/backend/access/brin/brin_xlog.c
@@ -47,6 +47,7 @@ brin_xlog_insert_update(XLogReaderState *record,
 {
 	XLogRecPtr	lsn = record-EndRecPtr;
 	Buffer		buffer;
+	BlockNumber	blkno;
 	Page		page;
 	XLogRedoAction action;
 
@@ -66,6 +67,8 @@ brin_xlog_insert_update(XLogReaderState *record,
 		action = XLogReadBufferForRedo(record, 0, buffer);
 	}
 
+	blkno = BufferGetBlockNumber(buffer);
+
 	/* insert the index item into the page */
 	if (action == BLK_NEEDS_REDO)
 	{
@@ -97,7 +100,6 @@ brin_xlog_insert_update(XLogReaderState *record,
 	if (action == BLK_NEEDS_REDO)
 	{
 		ItemPointerData tid;
-		BlockNumber blkno = BufferGetBlockNumber(buffer);
 
 		ItemPointerSet(tid, blkno, xlrec-offnum);
 		page = (Page) BufferGetPage(buffer);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PANIC in GIN code

2015-06-26 Thread Heikki Linnakangas

On 06/26/2015 08:02 PM, Jeff Janes wrote:

Under high load against f7bb7f0625771bc71869cda, I occasionally get:

PANIC:  XLogBeginInsert was not called

It seems to only come from vacuuming.

Here is an example back-trace:

#0  0x003dcb632625 in raise () from /lib64/libc.so.6
#1  0x003dcb633e05 in abort () from /lib64/libc.so.6
#2  0x0079a39c in errfinish (dummy=value optimized out) at
elog.c:551
#3  0x0079b0e4 in elog_finish (elevel=value optimized out,
fmt=value optimized out) at elog.c:1368
#4  0x004dd1fe in XLogInsert (rmid=13 '\r', info=48 '0') at
xloginsert.c:412
#5  0x00478e13 in ginPlaceToPage (btree=0x7fff1fbb7f60,
stack=0x2878760, insertdata=value optimized out, updateblkno=value
optimized out,
 childbuf=0, buildStats=value optimized out) at ginbtree.c:582
#6  0x00479a33 in ginInsertValue (btree=0x7fff1fbb7f60,
stack=0x2878760, insertdata=0x7fff1fbb7fe0, buildStats=0x0) at
ginbtree.c:751
#7  0x0047364a in ginEntryInsert (ginstate=0x7fff1fbb8280,
attnum=54624, key=1, category=value optimized out, items=0x287d3c0,
nitem=1,
 buildStats=0x0) at gininsert.c:234
#8  0x0047ef4b in ginInsertCleanup (ginstate=0x7fff1fbb8280,
vac_delay=value optimized out, stats=0x2868d90) at ginfast.c:843
#9  0x0047e024 in ginbulkdelete (fcinfo=value optimized out) at
ginvacuum.c:547


 From the code, it seems obvious that XLogBeginInsert is getting called at
ginbtree.c line 373, so I think that some obscure code path of
btree-placeToPage must be either consuming or resetting that
XLogBeginInsert before it returns control to ginbtree.c


Here's a theory:

First of all, you have checksums or wal_log_hints enabled.

The page is being split (that's evident from info=48 above). 
ginPlaceToPage calls GinNewBuffer, which calls GetFreeIndexPage(). That 
finds a page that can be recycled, and marks it as used. 
RecordUsedIndexPage calls MarkBufferDirtyHint(), which in turn calls 
XLogSaveBufferForHint() to create a full-page image record of the page. 
That calls XLogBeginInsert() + XLogInsert(), and leaves the 
begininsert_called == false.


If you had assertions enabled, you'd see the assertion in 
XLogBeginInsert() to fail.


I'll look into that over the weekend..

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rework the way multixact truncations work

2015-06-26 Thread Alvaro Herrera
Andres Freund wrote:

 Rework the way multixact truncations work.

I spent some time this morning reviewing this patch and had some
comments that I relayed over IM to Andres.  The vast majority is
cosmetic, but there are two larger things:

1. I think this part of PerformMembersTruncation() is very confusing:

/* verify whether the current segment is to be deleted */
if (segment != startsegment  segment != endsegment)
SlruDeleteSegment(MultiXactMemberCtl, segment);

I think this works correctly in that it preserves both endpoint files,
but the files in between are removed ... which is a confusing interface,
IMO.  I think this merits a longer explanation.


2. We set PGXACT-delayChkpt while the truncation is executed.  This
seems reasonable, and there's a good reason for it, but all the other
users of this facility only do small operations with this thing grabbed,
while the multixact truncation could take a long time because a large
number of files might be deleted.  Maybe it's not a problem to have
checkpoints be delayed by several seconds, or who knows maybe even a
minute in a busy system.  (We will have checkpointer sleeping in 10ms
intervals until the truncation is complete).

Maybe this is fine, not sure.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 feature count

2015-06-26 Thread Peter Geoghegan
On Fri, Jun 26, 2015 at 11:09 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Maybe there's a reasonable way to measure releases (my 8.0 is bigger
 than your 9.1!), but I don't think this is it.


I agree with the sentiment, but I don't think that anyone actually
thinks of it that way. Most people tend to think of a release in terms
of the big, exciting features, or the smaller features that happened
to scratch their particular itch.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Nitpicking: unnecessary NULL-pointer check in pg_upgrade's controldata.c

2015-06-26 Thread Peter Geoghegan
On Fri, Jun 26, 2015 at 7:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I agree that the correct handling of this particular case is to mark it
 as not-a-bug.  We have better things to do.

+1

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing SSL renegotiation (Was: Should we back-patch SSL renegotiation fixes?)

2015-06-26 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On 2015-06-24 16:41:48 +0200, Andres Freund wrote:
 I, by now, have come to a different conclusion. I think it's time to
 entirely drop the renegotiation support.

 I think by now we essentially concluded that we should do that. What I'm
 not sure yet is how: Do we want to rip it out in master and just change
 the default in the backbranches, or do we want to rip it out in all
 branches and leave a faux guc in place in the back branches. I vote for
 the latter, but would be ok with both variants.

I think the former is probably the saner answer.  It is less likely to
annoy people who dislike back-branch changes.  And it will be
significantly less work, considering that that code has changed enough
that you won't be able to just cherry-pick a removal patch.  I also fear
there's a nonzero chance of breaking stuff if you're careless about doing
the removal in one or more of the five active back branches ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index bug due to WAL refactoring

2015-06-26 Thread Alvaro Herrera
Jeff Janes wrote:

 The way the blkno and buffer refer to different block/buffers is pretty
 confusing.  Could we rename a variable to make it clearer which buffer's
 number is contained in blkno?

Sure thing.  Pushed that way.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing SSL renegotiation (Was: Should we back-patch SSL renegotiation fixes?)

2015-06-26 Thread Andres Freund
On 2015-06-24 16:41:48 +0200, Andres Freund wrote:
 I, by now, have come to a different conclusion. I think it's time to
 entirely drop the renegotiation support.

I think by now we essentially concluded that we should do that. What I'm
not sure yet is how: Do we want to rip it out in master and just change
the default in the backbranches, or do we want to rip it out in all
branches and leave a faux guc in place in the back branches. I vote for
the latter, but would be ok with both variants.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

2015-06-26 Thread Robert Haas
On Mon, Jun 15, 2015 at 2:52 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 Attached patch provides a fix as per above discussion.

I think we should emit some LOG messages here.  When we detect the
file is there:

LOG: ignoring tablespace_map file because no backup_label file exists

If the rename fails:

LOG: could not rename file %s to %s: %m

If it works:

LOG: renamed file %s to %s

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git push hook to check for outdated timestamps

2015-06-26 Thread Peter Eisentraut
On 6/25/15 8:08 PM, Robert Haas wrote:
 Because I don't want to have to do git log --format=fuller to see when
 the thing was committed, basically.

Then I suggest to you the following configuration settings:

[format]
pretty=cmedium
[pretty]
cmedium=format:%C(auto,yellow)commit %H%C(reset)%nCommit: %cn 
%ce%nCommitDate: %cd%n%n%w(80,4,4)%B

 I don't particularly think that having separate AuthorDate and
 CommitterDate fields has any value.  At least, it doesn't to me.  But
 a linear-looking commit history does have value to me.  Someone else
 might have different priorities; those are mine.

Sure, that's why Git is configurable to particular preferences.  But we
don't have to force everyone to use Git in a nonstandard way.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewind failure by file deletion in source server

2015-06-26 Thread Robert Haas
On Wed, Jun 24, 2015 at 9:41 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Wed, Jun 24, 2015 at 3:43 PM, Michael Paquier wrote:
 I just realized another problem: We recently learned the hard way that some
 people have files in the data directory that are not writeable by the
 'postgres' user
 (http://www.postgresql.org/message-id/20150523172627.ga24...@msg.df7cb.de).
 pg_rewind will try to overwrite all files it doesn't recognize as relation
 files, so it's going to fail on those. A straightforward fix would be to
 first open the destination file in read-only mode, and compare its contents,
 and only open the file in write mode if it has changed. It would still fail
 when the files really differ, but I think that's acceptable.

 If I am missing nothing, two code paths need to be patched here:
 copy_file_range and receiveFileChunks. copy_file_range is
 straight-forward. Now wouldn't it be better to write the contents into
 a temporary file, compare their content, and then switch if necessary
 for receiveFileChunks?

 After sleeping on it, I have been looking at this issue again and came
 up with the patch attached. Instead of checking if the content of the
 target and the source file are the same, meaning that we would still
 need to fetch chunk content from the server in stream mode, I think
 that it is more robust to check if the target file can be opened and
 check for EACCES on failure, bypassing it if process does not have
 permissions on it. the patch contains a test case as well, and is
 independent on the rest sent upthread.
 Thoughts?

That seems scary as heck to me.  Suppose that you run pg_rewind and
SELinux decides, due to some labeling problem, to deny access to some
files.  So we just skip those.  Then the user tries to start the
server and maybe it works (since the postgres executable is labeled
differently) or maybe it fails and the user runs restorecon and then
it works.  Kaboom, your database is corrupt.

I realize that the recent fsync fiasco demonstrated that people keep
files not readable by PG in the data directory, but that still seems
loopy to me.  I realize that we can't de-support that in the back
branches because people have existing configurations that we can't
just break.  But maybe we should say, look, that's just not compatible
with pg_rewind, because where the integrity of your data is concerned
oops, I can't read this, that's probably OK just does not seem good
enough.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewind failure by file deletion in source server

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 3:10 PM, Andres Freund and...@anarazel.de wrote:
 On 2015-06-26 15:07:59 -0400, Robert Haas wrote:
 I realize that the recent fsync fiasco demonstrated that people keep
 files not readable by PG in the data directory

 It wasn't unreadable files that were the primary problem, it was files
 with read only permissions, no?

Yes, that's true.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewind failure by file deletion in source server

2015-06-26 Thread Andres Freund
On 2015-06-26 15:07:59 -0400, Robert Haas wrote:
 I realize that the recent fsync fiasco demonstrated that people keep
 files not readable by PG in the data directory

It wasn't unreadable files that were the primary problem, it was files
with read only permissions, no?

 oops, I can't read this, that's probably OK just does not seem good
 enough.

Agreed.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing SSL renegotiation (Was: Should we back-patch SSL renegotiation fixes?)

2015-06-26 Thread David G. Johnston
On Fri, Jun 26, 2015 at 3:09 PM, Andres Freund and...@anarazel.de wrote:

 On 2015-06-24 16:41:48 +0200, Andres Freund wrote:
  I, by now, have come to a different conclusion. I think it's time to
  entirely drop the renegotiation support.

 I think by now we essentially concluded that we should do that. What I'm
 not sure yet is how: Do we want to rip it out in master and just change
 the default in the backbranches, or do we want to rip it out in all
 branches and leave a faux guc in place in the back branches. I vote for
 the latter, but would be ok with both variants.


​3. ​Change the default and make the guc impotent - in the back
branches.  Its minimally invasive and accomplishes the same user-facing
goal as ripping it out.
​
​  Leaving dead code around in master seems undesirable so ripping it out
from there would still make sense.  This does provide an easy fall-back in
the back-branches if we are accused of being overly parental.

David J.
​


Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2015-06-26 Thread Peter Eisentraut
On 6/25/15 11:39 PM, Robert Haas wrote:
 Could that also cover waiting on network?
 
 Possibly.  My approach requires that the number of wait states be kept
 relatively small, ideally fitting in a single byte.  And it also
 requires that we insert pgstat_report_waiting() calls around the thing
 that is notionally blocking.  So, if there are a small number of
 places in the code where we do network I/O, we could stick those calls
 around those places, and this would work just fine.  But if a foreign
 data wrapper, or any other piece of code, does network I/O - or any
 other blocking operation - without calling pgstat_report_waiting(), we
 just won't know about it.

That sounds doable, assuming that extension authors play along.

I see that network problems because of connection poolers, foreign-data
connections, and so on, are a significant cause of session hangs, so
it would be good if they could be covered.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Semantics of pg_file_settings view

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 4:02 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Combining this with my idea about preserving the ConfigVariable list,
 I'm thinking that it would be a good idea for ProcessConfigFile() to
 run in a context created for the purpose of processing the config files,
 rather than blindly using the caller's context, which is likely to be
 a process-lifespan context and thus not a good place to leak in.
 We could keep this context around until the next SIGHUP event, so that
 the ConfigVariable list remains available, and then destroy it and
 replace it with the next ProcessConfigFile's instance of the context.
 In this way, any leakage in the processing code could not accumulate
 over multiple SIGHUPs, and so it would be certain to remain fairly
 negligible.

That seems like a nice idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] thread_test's sched_yield requires -lrt on solaris

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 3:52 AM, Oskari Saarenmaa o...@ohmu.fi wrote:
 I configured the dingo and binturong Solaris 10 animals to build 9.3
 some time ago but apparently they always failed the configure phase.
 Turns out this is caused by thread_test's usage of sched_yield which is
 in librt on Solaris but which is not pulled in by anything on 9.3 and
 earlier on my box.

 Apparently the other Solaris animal (castoroides) requires librt for
 fdatasync, but that's not required on my system.  On 9.4 and master
 librt is required for shm_open so the check doesn't fail there.

 Attached a patch to check for sched_yield in configure, the patch only
 applies against 9.0 - 9.3 which are using autoconf 2.63.  We should
 probably check for sched_yield anyway on all branches even if it's not
 strictly required on 9.4+ at the moment.

Thanks for working on support for this obscure platform; testing such
platforms helps to improve portability and reliability.

You might want to add your patch to the next CommitFest:

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index bug due to WAL refactoring

2015-06-26 Thread Jeff Janes
On Fri, Jun 26, 2015 at 11:35 AM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

 Jeff Janes wrote:

  On replica:
 
  set enable_seqscan TO off;
  explain (analyze) select count(*) from foobar ;
  ERROR:  corrupted BRIN index: inconsistent range map

 Nice.  As I understand it, the problem is that the replay is using the
 block number of the revmap page as target blkno of the revmap entry,
 when it should be using the block number of the data page.  This should
 fix it.



Thanks, that worked.

The way the blkno and buffer refer to different block/buffers is pretty
confusing.  Could we rename a variable to make it clearer which buffer's
number is contained in blkno?

Thanks,

Jeff


Re: [HACKERS] Removing SSL renegotiation (Was: Should we back-patch SSL renegotiation fixes?)

2015-06-26 Thread Andres Freund
On 2015-06-26 15:36:53 -0400, David G. Johnston wrote:
 On Fri, Jun 26, 2015 at 3:09 PM, Andres Freund and...@anarazel.de wrote:
 
  On 2015-06-24 16:41:48 +0200, Andres Freund wrote:
   I, by now, have come to a different conclusion. I think it's time to
   entirely drop the renegotiation support.
 
  I think by now we essentially concluded that we should do that. What I'm
  not sure yet is how: Do we want to rip it out in master and just change
  the default in the backbranches, or do we want to rip it out in all
  branches and leave a faux guc in place in the back branches. I vote for
  the latter, but would be ok with both variants.
 
 
 ​3. ​Change the default and make the guc impotent - in the back
 branches.  Its minimally invasive and accomplishes the same user-facing
 goal as ripping it out.

What would be the point of that? The code is pretty localized, so
leaving it in, but unreachable, seems to have no benefits.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] checkpointer continuous flushing

2015-06-26 Thread Andres Freund
On 2015-06-26 21:47:30 +0200, Fabien COELHO wrote:
 tps  stddev full speed:

 HEAD OFF/OFF
 
  tiny 1 client  727 +- 227 221 +- 246

Huh?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

2015-06-26 Thread Robert Haas
On Wed, Jun 10, 2015 at 3:34 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 Okay, I have updated the patch to destroy_tablespace_directories() code
 as well in the attached patch. I have tried to modify
 remove_tablespace_symlink(), so that it can be called from
 destroy_tablespace_directories(), but that is making it more complex,
 especially due to the reason that destroy_tablespace_directories()
 treats ENOENT as warning rather than ignoring it.

This pretty obviously doesn't follow style guidelines.  You've got it
started with a capital letter, and there are two spaces between a
and directory.

 errmsg(Not a  directory or symbolic link: \%s\,

But it looks OK otherwise, so committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Semantics of pg_file_settings view

2015-06-26 Thread Tom Lane
I wrote:
 The simplest change would be to move the whole thing to around line 208 in
 guc-file.l, just after the stanza that loads PG_AUTOCONF_FILENAME.  Or you
 could argue that the approach is broken altogether, and that we should
 capture the data while we read the files, so that you have some useful
 data in the view even if ParseConfigFile later decides there's a syntax
 error.  I'm actually thinking maybe we should flush that data-capturing
 logic altogether in favor of just not deleting the ConfigVariable list
 data structure, and generating the view directly from that data structure.
 (You could even imagine being able to finger syntax errors in the view
 that way, by having ParseConfigFile attach a dummy ConfigVariable entry
 when it bails out.)

While snouting around in the same area, I noticed that
ParseConfigDirectory() leaks memory: it neglects to pfree the file names
it's collected.  I'm not sure that it's worth fixing in the back branches,
because you'd need to have SIGHUP'd the postmaster a few million times
before the leak would amount to anything worth noticing.  However, this
does demonstrate that all the functionality we've loaded into the GUC code
of late is likely to have some memory leaks in it.

Combining this with my idea about preserving the ConfigVariable list,
I'm thinking that it would be a good idea for ProcessConfigFile() to
run in a context created for the purpose of processing the config files,
rather than blindly using the caller's context, which is likely to be
a process-lifespan context and thus not a good place to leak in.
We could keep this context around until the next SIGHUP event, so that
the ConfigVariable list remains available, and then destroy it and
replace it with the next ProcessConfigFile's instance of the context.
In this way, any leakage in the processing code could not accumulate
over multiple SIGHUPs, and so it would be certain to remain fairly
negligible.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release notes

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 2:37 PM, Peter Geoghegan p...@heroku.com wrote:
 On Fri, Jun 26, 2015 at 11:29 AM, Robert Haas robertmh...@gmail.com wrote:

 OK, but how about this wording instead?

 That seems fine.

 BTW, shouldn't Andrew also be credited here, since he did the work on
 datum sorts?

 Andrew's work was entirely confined to making datum sorts work with
 abbreviation, which seems entirely orthogonal (but was enough to make
 me want to credit him as an author of abbreviated keys, even after
 breaking out his work on numeric support into a separate item). This
 particular piece of work has nothing to do with the datum sort case,
 though.

 Datum sorts always supported SortSupport. This commit, 5ea86e6e6,
 really should have been in 9.2 (especially since it had a net-negative
 code footprint and clearly simplified tuplesort), and had nothing to
 do with abbreviation -- it went in before abbreviation, and before it
 was 100% clear that abbreviation would ever land.

OK, understood, and thanks for the clarification.  I've committed the
version I proposed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] checkpointer continuous flushing

2015-06-26 Thread Fabien COELHO


Note that I'm not comparing to HEAD in the above tests, but with the new 
options desactivated, which should be more or less comparable to current 
HEAD, i.e. there is no sorting nor flushing done, but this is not strictly 
speaking HEAD behavior. Probably I should get some figures with HEAD as well 
to check the more or less assumption.


Just for answering myself on this point, I tried current HEAD vs patch v4 
with sort OFF + flush OFF: the figures are indeed quite comparable (see 
below), so although the internal implementation is different, the 
performance when both options are off is still a reasonable approximation 
of the performance without the patch, as I was expecting. What patch v4 
still does with OFF/OFF which is not done by HEAD is balancing writes 
among tablespaces, but there is only one disk on these tests so it does 
not matter.


tps  stddev full speed:

HEAD OFF/OFF

 tiny 1 client  727 +- 227 221 +- 246
 small 1 client 158 +- 316 158 +- 325
 medium 1 client148 +- 285 157 +- 326
 tiny 4 clients2088 +- 7862074 +- 699
 small 4 clients192 +- 648 188 +- 560
 medium 4 clients   220 +- 654 220 +- 648

percent of late transactions:

HEAD   OFF/OFF

 tiny 4 clients 100 tps  6.316.67
 small 4c 100 tps   35.68   35.23
 medium 4c 100 tps  37.38   38.00
 tiny 4c 200 tps 9.069.10
 small 4c 200 tps   51.65   51.16
 medium 4c 200 tps  51.35   50.20
 tiny 4 clients 400 tps 11.410.5
 small 4 clients 400 tps66.467.6

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PANIC in GIN code

2015-06-26 Thread Jeff Janes
On Fri, Jun 26, 2015 at 11:40 AM, Heikki Linnakangas hlinn...@iki.fi
wrote:

 On 06/26/2015 08:02 PM, Jeff Janes wrote:

 Under high load against f7bb7f0625771bc71869cda, I occasionally get:

 PANIC:  XLogBeginInsert was not called

 It seems to only come from vacuuming.

 Here is an example back-trace:

 #0  0x003dcb632625 in raise () from /lib64/libc.so.6
 #1  0x003dcb633e05 in abort () from /lib64/libc.so.6
 #2  0x0079a39c in errfinish (dummy=value optimized out) at
 elog.c:551
 #3  0x0079b0e4 in elog_finish (elevel=value optimized out,
 fmt=value optimized out) at elog.c:1368
 #4  0x004dd1fe in XLogInsert (rmid=13 '\r', info=48 '0') at
 xloginsert.c:412
 #5  0x00478e13 in ginPlaceToPage (btree=0x7fff1fbb7f60,
 stack=0x2878760, insertdata=value optimized out, updateblkno=value
 optimized out,
  childbuf=0, buildStats=value optimized out) at ginbtree.c:582
 #6  0x00479a33 in ginInsertValue (btree=0x7fff1fbb7f60,
 stack=0x2878760, insertdata=0x7fff1fbb7fe0, buildStats=0x0) at
 ginbtree.c:751
 #7  0x0047364a in ginEntryInsert (ginstate=0x7fff1fbb8280,
 attnum=54624, key=1, category=value optimized out, items=0x287d3c0,
 nitem=1,
  buildStats=0x0) at gininsert.c:234
 #8  0x0047ef4b in ginInsertCleanup (ginstate=0x7fff1fbb8280,
 vac_delay=value optimized out, stats=0x2868d90) at ginfast.c:843
 #9  0x0047e024 in ginbulkdelete (fcinfo=value optimized out) at
 ginvacuum.c:547


  From the code, it seems obvious that XLogBeginInsert is getting called at
 ginbtree.c line 373, so I think that some obscure code path of
 btree-placeToPage must be either consuming or resetting that
 XLogBeginInsert before it returns control to ginbtree.c


 Here's a theory:

 First of all, you have checksums or wal_log_hints enabled.


Correct, checksums are on.  I forgot all about that.



 The page is being split (that's evident from info=48 above).
 ginPlaceToPage calls GinNewBuffer, which calls GetFreeIndexPage(). That
 finds a page that can be recycled, and marks it as used.
 RecordUsedIndexPage calls MarkBufferDirtyHint(), which in turn calls
 XLogSaveBufferForHint() to create a full-page image record of the page.
 That calls XLogBeginInsert() + XLogInsert(), and leaves the
 begininsert_called == false.

 If you had assertions enabled, you'd see the assertion in
 XLogBeginInsert() to fail.

 I'll look into that over the weekend..


Should the assertion in XLogBeginInsert() be a elog(FATAL,...) instead?
The performance impact should be negligible (touches process-local memory
which is already being touched a few instructions later) and it will
produce better bug reports, and backtraces, should there be more issues.

I thought it would takes days to reproduce this with a enable-cassert
build, but it actually reproduced much faster that way.  Your theory looks
completely correct.


#0  0x003dcb632625 in raise () from /lib64/libc.so.6
#1  0x003dcb633e05 in abort () from /lib64/libc.so.6
#2  0x007d18bd in ExceptionalCondition (conditionName=value
optimized out, errorType=value optimized out, fileName=value optimized
out,
lineNumber=value optimized out) at assert.c:54
#3  0x004f8067 in XLogBeginInsert () at xloginsert.c:125
#4  0x004f83d5 in XLogSaveBufferForHint (buffer=241,
buffer_std=value optimized out) at xloginsert.c:907
#5  0x006b64af in MarkBufferDirtyHint (buffer=241,
buffer_std=value optimized out) at bufmgr.c:3085
#6  0x006be859 in fsm_set_and_search (rel=value optimized out,
addr=..., slot=1160, newValue=0 '\000', minValue=0 '\000') at
freespace.c:621
#7  0x006be8f6 in RecordPageWithFreeSpace (rel=0x7f01e7fe0560,
heapBlk=1160, spaceAvail=value optimized out) at freespace.c:188
#8  0x006bf105 in GetFreeIndexPage (rel=0x7f01e7fe0560) at
indexfsm.c:43
#9  0x004750b2 in GinNewBuffer (index=0x7f01e7fe0560) at
ginutil.c:218
#10 0x0047d5c4 in ginPlaceToPage (btree=0x7fffe3a55840,
stack=0x1307b78, insertdata=value optimized out, updateblkno=value
optimized out,
childbuf=0, buildStats=0x0) at ginbtree.c:442
#11 0x0047f2dd in ginInsertValue (btree=0x7fffe3a55840,
stack=0x1307b78, insertdata=0x7fffe3a558c0, buildStats=0x0) at
ginbtree.c:751
#12 0x00475b7b in ginEntryInsert (ginstate=0x7fffe3a55b80,
attnum=58216, key=1, category=value optimized out, items=0x7f01e7f61120,
nitem=11,
buildStats=0x0) at gininsert.c:234
#13 0x00485dfc in ginInsertCleanup (ginstate=0x7fffe3a55b80,
vac_delay=value optimized out, stats=0x12bbbc8) at ginfast.c:843
#14 0x00484d83 in ginbulkdelete (fcinfo=value optimized out) at
ginvacuum.c:547

Cheers,

Jeff


Re: [HACKERS] checkpointer continuous flushing

2015-06-26 Thread Fabien COELHO


Hello Andres,


HEAD OFF/OFF

 tiny 1 client  727 +- 227 221 +- 246


Huh?


Indeed, just to check that someone was reading this magnificent mail:-)

Just a typo because I reformated the figures for simpler comparison. 221 
is really 721, quite close to 727.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git push hook to check for outdated timestamps

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 4:11 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 6/25/15 8:08 PM, Robert Haas wrote:
 Because I don't want to have to do git log --format=fuller to see when
 the thing was committed, basically.

 Then I suggest to you the following configuration settings:

 [format]
 pretty=cmedium
 [pretty]
 cmedium=format:%C(auto,yellow)commit %H%C(reset)%nCommit: %cn 
 %ce%nCommitDate: %cd%n%n%w(80,4,4)%B

 I don't particularly think that having separate AuthorDate and
 CommitterDate fields has any value.  At least, it doesn't to me.  But
 a linear-looking commit history does have value to me.  Someone else
 might have different priorities; those are mine.

 Sure, that's why Git is configurable to particular preferences.  But we
 don't have to force everyone to use Git in a nonstandard way.

I respect your opinion, but I think you are out-voted, unless some of
the people who previously favored this proposal have changed their
minds based on this discussion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] open items list cleanup

2015-06-26 Thread Robert Haas
I spent much of today going through here:

https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items

Here's what I did:

* Committed patches for four of the items, hopefully resolving those items.
* Moved three items from open to either resolved or a new section
don't need fixing.
* Added several new items based on recent posts to pgsql-hackers.
* Split up the remaining open items into sections.
* Added a comment with current status to many, but not all, of the
items.  (I would have done them all, but I'm pretty much out of time
to work on this for today.)

There are a sizeable number of items on that list for which no patches
exist yet.  It would be great if the people involved in those issues
could work on producing those patches.  There are a few issues on that
list that require a committer's attention to a patch that has already
been produced.  I believe that the onus is on the committers who
committed those features to look at those patches - or if not, they
should be up-front about the fact that they no longer have time to
work on the features they committed, so that other committers know
that they need to and should step in.

Further improvements to the wiki page itself are also welcome and will
be appreciated, at least, by me.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release notes

2015-06-26 Thread Peter Geoghegan
On Fri, Jun 26, 2015 at 3:39 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, Jun 10, 2015 at 9:15 PM, Bruce Momjian br...@momjian.us wrote:
 I am ready to make suggested adjustments

Also, I attach a new description of the UPSERT feature. For me, UPSERT
was always about guarantees that make statements work robustly under
standard operating conditions -- a novice user should be able to write
a simple UPSERT statement and forget about it. It's really hard to
explain how that can be guaranteed, but the basic guarantee itself is
simple, and matters a lot.

I think that the proposed wording conveys that. I also think that it's
important that this message is prominently conveyed, because these
issues cause no end of confusion.

-- 
Peter Geoghegan
diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
index 5b0d109..94cecd7 100644
--- a/doc/src/sgml/release-9.5.sgml
+++ b/doc/src/sgml/release-9.5.sgml
@@ -620,16 +620,23 @@
 
   listitem
para
-Allow link linkend=sql-on-conflictcommandINSERTS//
-that would generate constraint conflicts to be turned into
-commandUPDATE/s or ignored (Peter Geoghegan, Heikki
-Linnakangas, Andres Freund)
+Allow commandINSERT/s to specify an alternative
+literalDO UPDATE/ or literalDO NOTHING/ action that is
+executed upon encountering an literallink
+linkend=sql-on-conflictON CONFLICT// condition (Peter
+Geoghegan, Heikki Linnakangas, Andres Freund)
/para
 
para
-The syntax is commandINSERT ... ON CONFLICT DO NOTHING/UPDATE/.
-This is the Postgres implementation of the popular
-commandUPSERT/ command.
+The commandINSERT/ syntax is revised to accept an
+literalON CONFLICT DO NOTHING/UPDATE/ clause.  literalON
+CONFLICT/ conditions are conditions that ordinarily result
+in a uniqueness violation or exclusion violation error.  In
+the absence of any independent error, literalON CONFLICT DO
+UPDATE/ guarantees either inserting any row proposed for
+insertion, or, alternatively, resolving a conflict by updating
+an existing, conflicting row.  This capability is generally
+known as quoteUPSERT/.
/para
   /listitem
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

2015-06-26 Thread Amit Kapila
On Sat, Jun 27, 2015 at 1:32 AM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Jun 10, 2015 at 3:34 AM, Amit Kapila amit.kapil...@gmail.com
wrote:
  Okay, I have updated the patch to destroy_tablespace_directories() code
  as well in the attached patch. I have tried to modify
  remove_tablespace_symlink(), so that it can be called from
  destroy_tablespace_directories(), but that is making it more complex,
  especially due to the reason that destroy_tablespace_directories()
  treats ENOENT as warning rather than ignoring it.

 This pretty obviously doesn't follow style guidelines.  You've got it
 started with a capital letter, and there are two spaces between a
 and directory.

  errmsg(Not a  directory or symbolic link: \%s\,


Sorry, I think this is left over due to multiple versions exchange
between me and Andrew.

 But it looks OK otherwise, so committed.


Thanks.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] open items list cleanup

2015-06-26 Thread Noah Misch
On Fri, Jun 26, 2015 at 04:23:28PM -0400, Robert Haas wrote:
 https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items
 
 Here's what I did:

 * Split up the remaining open items into sections.
 * Added a comment with current status to many, but not all, of the
 items.  (I would have done them all, but I'm pretty much out of time
 to work on this for today.)

I like these changes.  It's now much easier to see trends and to see which
items are ready for which kinds of actions.  Thanks.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_*_columns?

2015-06-26 Thread Tomas Vondra



On 06/27/2015 12:30 AM, Jim Nasby wrote:

On 6/24/15 6:41 PM, Tomas Vondra wrote:

Were the stories (or the experience which lead to the stories) on
9.3 or later? Do they have a good way to reproduce it for testing
purposes?


The per-db split can only improve things if there actually are
multiple databases, and if the objects are somehow evenly
distributed among them. If there's a single database, or if most of
the objects are in a single database (but stored in multiple
schemas, for example), it does not help much. So it's trivially to
reproduce the previous issues.


Right, and a single large database is a pretty common scenario.


FWIW I doubt this scenario is so common.

Most of the cases of issues with stats file I've heard of were about a 
database server shared by many applications, separated into databases. 
That's basically the scenario we've been facing with our PostgreSQL 
machines, and the motivation for the split.


Maybe there are many clusters with a single database, containing many 
objects, but I don't remember any recent reports of problems with stats 
files from them. Those pretty much disappeared after 9.3, which is when 
the split happened.


That is not to say we can just design it in a way that will cause OOM 
issues or crashes on such machines ...


regards

--
Tomas Vondra   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] object_classes array is broken, again

2015-06-26 Thread Jim Nasby

On 6/24/15 2:11 PM, Robert Haas wrote:

Fixing this is easy, but ISTM that we need to insert some sort of a
guard to prevent people from continuing to forget this, because it's
apparently quite easy to do.  Perhaps add_object_address should
Assert(OidIsValid(object_classes[oclass])), plus a (static?) assert
someplace checking that OidIsValid(object_classes[MAX_OCLASS - 1])?


I tried doing this and I'm getting a static_assert expression is not an 
integral constant expression error, even when I reduce it to a simple 
constant comparison. Maybe I'm just doing something dumb...


If I replace the StaticAssert with 
Assert(OidIsValid(object_classes[MAX_OCLASS - 1])) it works find and 
initdb will fail if that assert trips.


I've attached the broken StaticAssert version. Also added a warning 
comment to the enum.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index c1212e9..7adc216 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -2037,6 +2037,15 @@ add_object_address(ObjectClass oclass, Oid objectId, 
int32 subId,
 {
ObjectAddress *item;
 
+   /*
+* Ensure object_classes is properly sized. Can't use OidIsValid here
+* because it's not marked as static (nor should it be).
+*/
+   StaticAssertStmt(object_classes[MAX_OCLASS - 1] != 0,
+   last OID in object_classes[] is invalid (did you 
forget to add a new entry to it?));
+
+   Assert(OidIsValid(object_classes[oclass]));
+
/* enlarge array if needed */
if (addrs-numrefs = addrs-maxrefs)
{
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 5da18c2..515417d 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -149,6 +149,8 @@ typedef enum ObjectClass
OCLASS_EVENT_TRIGGER,   /* pg_event_trigger */
OCLASS_POLICY,  /* pg_policy */
OCLASS_TRANSFORM,   /* pg_transform */
+   
+   /* New classes need to be added to object_classes[] too. */
MAX_OCLASS  /* MUST BE LAST */
 } ObjectClass;
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_*_columns?

2015-06-26 Thread Jim Nasby

On 6/24/15 6:41 PM, Tomas Vondra wrote:

Were the stories (or the experience which lead to the stories) on 9.3 or
later?  Do they have a good way to reproduce it for testing purposes?


The per-db split can only improve things if there actually are multiple
databases, and if the objects are somehow evenly distributed among them.
If there's a single database, or if most of the objects are in a single
database (but stored in multiple schemas, for example), it does not help
much. So it's trivially to reproduce the previous issues.


Right, and a single large database is a pretty common scenario.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPSERT on partition

2015-06-26 Thread Jim Nasby

On 6/24/15 1:03 PM, Peter Geoghegan wrote:

On Wed, Jun 24, 2015 at 11:02 AM, Peter Geoghegan p...@heroku.com wrote:

I think that the real way to fix this is, as you say, to make it so
that it isn't necessary in general to write trigger functions like
this to make inheritance work.


Excuse me -- I mean to make *partitioning* work.


It occurs to me that we could run the BEFORE triggers and then pass the 
new tuple to a user-supplied function that returns a regclass. That 
would allow us to know exactly which child/partition the UPSERT should 
be attempted in.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Oh, this is embarrassing: init file logic is still broken

2015-06-26 Thread Jim Nasby

On 6/25/15 12:51 PM, Josh Berkus wrote:

On 06/25/2015 10:47 AM, Peter Geoghegan wrote:

On Wed, Jun 24, 2015 at 2:52 PM, Josh Berkus j...@agliodbs.com wrote:

OK, this is pretty bad in its real performance effects.  On a workload
which is dominated by new connection creation, we've lost about 17%
throughput.


Mistakes happen, but this is the kind of regression that automated
performance testing could have caught. That's another area of testing
that needs considerable improvement IMV.


Well, I have a pgbench test script for it now.  If only I had some place
to check it in ...


Well, there is https://github.com/slux/PostgreSQL-Performance-Farm; but 
it hasn't been touched in 5 years. :(

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #13473: VACUUM FREEZE mistakenly cancel standby sessions

2015-06-26 Thread Jim Nasby

On 6/26/15 8:50 AM, Marco Nenciarini wrote:

In the heap_xlog_freeze we need to subtract one to the value of cutoff_xid
before passing it to ResolveRecoveryConflictWithSnapshot.




Attached a proposed patch that solves the issue.


FWIW, I've reviewed the usage and the new logic looks correct.

It'd actually be nice if we also logged the most recent XID that 
actually got frozen. That's the XID that Hot Standby would actually care 
about (not the freeze limit).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: psql: show proper row count in \x mode for zero-column output

2015-06-26 Thread Jim Nasby

On 6/25/15 9:42 AM, Peter Eisentraut wrote:

On 3/24/15 9:04 PM, Bruce Momjian wrote:

psql:  show proper row count in \x mode for zero-column output

Also, fix pager enable selection for such cases, and other cleanups for
zero-column output.

Report by Thom Brown

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/376a0c4547fe98c45476647596ce9c9b394f8415


This change added an extra blank line to the output of a zero-row result.

Compare:

[9.4]
$ psql -X -d postgres -c 'select * from pg_class where false' -x
(No rows)
$

[9.5]
$ psql -X -d postgres -c 'select * from pg_class where false' -x
(0 rows)

$


Was that intentional?


That's consistent with what  0 rows does, so it seems the correct thing 
to do. Going from (No rows) to (0 rows) is going to break things 
anyway, so I don't see a backwards compatibility issue here.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release notes

2015-06-26 Thread Peter Geoghegan
On Wed, Jun 10, 2015 at 9:15 PM, Bruce Momjian br...@momjian.us wrote:
 I am ready to make suggested adjustments

I attach a compatibility note that is clearly needed; adding this is
an open item of mine for 9.5. This concerns foreign data wrappers and
UPSERT.

-- 
Peter Geoghegan
diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
index 5b0d109..9e857bc 100644
--- a/doc/src/sgml/release-9.5.sgml
+++ b/doc/src/sgml/release-9.5.sgml
@@ -125,6 +125,34 @@
  /para
 /listitem
 
+listitem
+ para
+  Foreign data wrappers that support updating foreign tables must
+  consider the possible presence of an literalON CONFLICT DO
+  NOTHING/ clause on commandINSERT/ statements (Peter
+  Geoghegan)
+ /para
+
+ para
+  When an literalON CONFLICT DO NOTHING/ clause is present,
+  foreign data wrappers should either perform a
+  productnamePostgreSQL/productname-analogous action on the
+  foreign table, or reject the query outright.
+ /para
+
+ para
+  productnamePostgreSQL/productname currently lacks support
+  for unique index inference against foreign tables;  the
+  optimizer will always reject commandINSERT/ statements that
+  attempt literalON CONFLICT/ inference on the basis of the
+  system having no information about quoteforeign unique
+  indexes/.  Since, in general, an inference clause is mandatory
+  for literalON CONFLICT DO UPDATE/, the literalDO UPDATE/
+  variant is in effect not currently supported with foreign
+  tables.
+ /para
+/listitem
+
/itemizedlist
 
   /sect2

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPSERT on partition

2015-06-26 Thread Simon Riggs
On 24 June 2015 at 15:05, Fujii Masao masao.fu...@gmail.com wrote:


 How should we treat this problem for 9.5? If we want to fix this problem
 completely, probably we would need to make constraint_exclusion work with
 even UPSERT. Which sounds difficult to do at least for 9.5. Any other idea?
 Or we should just treat it as a limitation of UPSERT and add that document?


+1

There are many problems that cannot be resolved for 9.5.

UPSERT works fine with tables with BRIN indexes.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] Nitpicking: unnecessary NULL-pointer check in pg_upgrade's controldata.c

2015-06-26 Thread Robert Haas
On Fri, Jun 26, 2015 at 9:06 AM, Andres Freund and...@anarazel.de wrote:
 On 2015-06-26 22:03:05 +0900, Michael Paquier wrote:
 Hi,

 Coverity is nitpickingly pointing out the following thing:
 --- a/src/bin/pg_upgrade/controldata.c
 +++ b/src/bin/pg_upgrade/controldata.c
 @@ -402,8 +402,7 @@ get_control_data(ClusterInfo *cluster, bool live_check)
 }
 }

 -   if (output)
 -   pclose(output);
 +   pclose(output);
 The thing is that output can never be NULL, pg_upgrade leaving with
 pg_fatal before coming to this code path. Hence this NULL check could
 be simply removed.

 FWIW, I think these type of coverity items should just be discarded with
 prejudice. Fixing them doesn't seem to buy anything and there's enough
 actually worthwhile stuff going on.

I don't mind committing patches for this kind of thing if it makes the
Coverity reports easier to deal with, which I gather that it does.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Redesigning checkpoint_segments

2015-06-26 Thread Heikki Linnakangas

On 05/27/2015 12:26 AM, Jeff Janes wrote:

On Thu, May 21, 2015 at 8:40 AM, Fujii Masao masao.fu...@gmail.com wrote:


On Thu, May 21, 2015 at 3:53 PM, Jeff Janes jeff.ja...@gmail.com wrote:

One of the points of max_wal_size and its predecessor is to limit how big
pg_xlog can grow.  But running out of disk space on pg_xlog is no more

fun

during archive recovery than it is during normal operations.  So why
shouldn't max_wal_size be active during recovery?


The following message of commit 7181530 explains why.

 In standby mode, respect checkpoint_segments in addition to
 checkpoint_timeout to trigger restartpoints. We used to deliberately
only
 do time-based restartpoints, because if checkpoint_segments is small we
 would spend time doing restartpoints more often than really necessary.
 But now that restartpoints are done in bgwriter, they're not as
 disruptive as they used to be. Secondly, because streaming replication
 stores the streamed WAL files in pg_xlog, we want to clean it up more
 often to avoid running out of disk space when checkpoint_timeout is
large
 and checkpoint_segments small.

Previously users were more likely to fall into this trouble (i.e., too
frequent
occurrence of restartpoints) because the default value of
checkpoint_segments
was very small, I guess. But we increased the default of max_wal_size, so
now
the risk of that trouble seems to be smaller than before, and maybe we can
allow max_wal_size to trigger restartpoints.


I see.  The old behavior was present for the same reason we decided to split
checkpoint_segments into max_wal_size and min_wal_size.

That is, the default checkpoint_segments was small, and it had to be small
because increasing it would cause more space to be used even when that
extra space was not helpful.

So perhaps we can consider this change a completion of the max_wal_size
work, rather than a new feature?


Yeah, I'm inclined to change the behaviour. Ignoring checkpoint_segments 
made sense when we initially did that, but it has gradually become less 
and less sensible after that, as we got streaming replication, and as we 
started to keep all restored segments in pg_xlog even in archive recovery.



It seems to be a trivial change to implement that, although I might be
overlooking something subtle (pasted below, also attached)

--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10946,7 +10946,7 @@ XLogPageRead(XLogReaderState *xlogreader,
XLogRecPtr targetPagePtr, int reqLen,
 * Request a restartpoint if we've replayed too much xlog
since the
 * last one.
 */
-   if (StandbyModeRequested  bgwriterLaunched)
+   if (bgwriterLaunched)
{
if (XLogCheckpointNeeded(readSegNo))
{

This keeps pg_xlog at about 67% of max_wal_size during archive recovery
(because checkpoint_completion_target is accounted for but goes unused)


Hmm. checkpoint_completion_target is used when determining progress 
against checkpoint_timeout just fine, but the problem is that if you do 
just the above, IsCheckpointOnSchedule() still won't consider consumed 
WAL when it determines whether the restartpoint is on time. So the 
error is in the other direction: if you set max_wal_size to a small 
value, and checkpoint_timeout to a large value, the restartpoint would 
think that it has plenty of time to complete, and exceed max_wal_size. 
We need to fix IsCheckpointOnSchedule() to also track progress against 
max_wal_size during recovery.


I came up with the attached patch as a first attempt. It enables the 
same logic to calculate if the checkpoint is on schedule to be used in 
recovery. But there's a little problem (also explained in a comment in 
the patch):


There is a large gap between a checkpoint's redo-pointer, and the 
checkpoint record itself (determined by checkpoint_completion_target). 
When we're not in recovery, we set the redo-pointer for the current 
checkpoint first, then start flushing data, and finally write the 
checkpoint record. The logic in IsCheckpointOnSchedule() calculates a) 
how much WAL has been generated since the beginning of the checkpoint, 
i.e its redo-pointer, and b) what fraction of shared_buffers has been 
flushed to disk. But in recovery, we only start the restartpoint after 
replaying the checkpoint record, so at the beginning of a restartpoint, 
we're actually already behind schedule by the amount of WAL between the 
redo-pointer and the record itself.


I'm not sure what to do about this. With the attached patch, you get the 
same leisurely pacing with restartpoints as you get with checkpoints, 
but you exceed max_wal_size during recovery, by the amount determined by 
checkpoint_completion_target. Alternatively, we could try to perform 
restartpoints faster then checkpoints, but then you'll get nasty 
checkpoint I/O storms in recovery.


A bigger 

[HACKERS] Nitpicking: unnecessary NULL-pointer check in pg_upgrade's controldata.c

2015-06-26 Thread Michael Paquier
Hi,

Coverity is nitpickingly pointing out the following thing:
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -402,8 +402,7 @@ get_control_data(ClusterInfo *cluster, bool live_check)
}
}

-   if (output)
-   pclose(output);
+   pclose(output);
The thing is that output can never be NULL, pg_upgrade leaving with
pg_fatal before coming to this code path. Hence this NULL check could
be simply removed.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Nitpicking: unnecessary NULL-pointer check in pg_upgrade's controldata.c

2015-06-26 Thread Andres Freund
Hi,

On 2015-06-26 22:03:05 +0900, Michael Paquier wrote:
 Hi,
 
 Coverity is nitpickingly pointing out the following thing:
 --- a/src/bin/pg_upgrade/controldata.c
 +++ b/src/bin/pg_upgrade/controldata.c
 @@ -402,8 +402,7 @@ get_control_data(ClusterInfo *cluster, bool live_check)
 }
 }
 
 -   if (output)
 -   pclose(output);
 +   pclose(output);
 The thing is that output can never be NULL, pg_upgrade leaving with
 pg_fatal before coming to this code path. Hence this NULL check could
 be simply removed.

FWIW, I think these type of coverity items should just be discarded with
prejudice. Fixing them doesn't seem to buy anything and there's enough
actually worthwhile stuff going on.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Redesigning checkpoint_segments

2015-06-26 Thread Heikki Linnakangas

On 06/26/2015 03:40 PM, Robert Haas wrote:

Actually, I've seen a number of presentations indicating
that the pacing of checkpoints is already too aggressive near the
beginning, because as soon as we initiate the checkpoint we have a
storm of full page writes.  I'm sure we can come up with arbitrarily
complicated systems to compensate for this, but something simple might
be to calculate progress done+adjust/total+adjust rather than
done/total.  If you let adjust=total/9, for example, then you
essentially start the progress meter at 10% instead of 0%.  Even
something that simple might be an improvement.


Yeah, but that's an unrelated issue. This was most recently discussed at 
http://www.postgresql.org/message-id/CAKHd5Ce-bnD=geedtxit2_ay7shqutkd0yhxxk5f4zvekrp...@mail.gmail.com. 
I posted a simple patch there - review and testing is welcome ;-).


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2015-06-26 Thread Robert Haas
On Thu, Jun 25, 2015 at 11:57 PM, Amit Kapila amit.kapil...@gmail.com wrote:
  3. Add new view 'pg_stat_wait_event' with following info:
  pid   - process id of this backend
  waiting - true for any form of wait, false otherwise
  wait_event_type - Heavy Weight Lock, Light Weight Lock, I/O wait, etc
  wait_event - Lock (Relation), Lock (Relation Extension), etc
 I am pretty unconvinced that it's a good idea to try to split up the
 wait event into two columns.  I'm only proposing ~20 wait states, so
 there's something like 5 bits of information here.  Spreading that
 over two text columns is a lot, and note that Amit's text would
 basically recapitulate the contents of the first column in the second
 one, which I cannot get excited about.
 There is an advantage in splitting the columns which is if wait_event_type
 column indicates Heavy Weight Lock, then user can go and check further
 details in pg_locks, I think he can do that even by seeing wait_event
 column, but that might not be as straightforward as with wait_event_type
 column.

It's just a matter of writing event_type LIKE 'Lock %' instead of
event_type = 'Lock'.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?

2015-06-26 Thread Heikki Linnakangas
This needs more performance testing.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Nitpicking: unnecessary NULL-pointer check in pg_upgrade's controldata.c

2015-06-26 Thread Andres Freund
On 2015-06-26 09:44:14 -0400, Robert Haas wrote:
 I don't mind committing patches for this kind of thing if it makes the
 Coverity reports easier to deal with, which I gather that it does.

It takes about three seconds to mark it as ignored which will hide it
going forward.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers