[HACKERS] [sqlsmith] Missing CHECK_FOR_INTERRUPTS in tsquery_rewrite

2016-10-29 Thread Andreas Seltenreich
Hi,

testing with sqlsmith yielded an uncancellable backend hogging CPU time.
Gdb showed it was busy in findeq() of tsquery_rewrite.c.  This function
appears to have exponential complexity wrt. the size of the involved
tsqueries.  The following query runs for 12s on my machine with no way
to cancel it and incrementing the length of the first argument by 1
doubles this time.

select ts_rewrite(
  (select string_agg(i::text, '&')::tsquery from generate_series(1,32) g(i)),
  (select string_agg(i::text, '&')::tsquery from generate_series(1,19) g(i)),
  'foo');

The attached patch adds a CHECK_FOR_INTERRUPTS to make it cancellable.

regards,
Andreas

>From d9910a96c9bd73c16e29ecaa0577945d5e1c091c Mon Sep 17 00:00:00 2001
From: Andreas Seltenreich 
Date: Sun, 30 Oct 2016 03:25:55 +0100
Subject: [PATCH] Add CHECK_FOR_INTERRUPTS in tsquery_rewrite loop.

The loop in findeq() appears to have exponential complexity and
runtime becomes excessive for more than about 30 tokens in the
tsvectors.  Add a CHECK_FOR_INTERRUPTS to make it cancellable.
---
 src/backend/utils/adt/tsquery_rewrite.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/backend/utils/adt/tsquery_rewrite.c b/src/backend/utils/adt/tsquery_rewrite.c
index 28f328d..ef6444f 100644
--- a/src/backend/utils/adt/tsquery_rewrite.c
+++ b/src/backend/utils/adt/tsquery_rewrite.c
@@ -95,6 +95,10 @@ findeq(QTNode *node, QTNode *ex, QTNode *subs, bool *isfind)
 
 			do
 			{
+/* This loop is rather heavyweight, it better be
+ * cancellable. */
+CHECK_FOR_INTERRUPTS();
+
 tnode->sign = 0;
 for (i = 0; i < ex->nchild; i++)
 {
-- 
2.9.3


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make coverage-html on OS X

2016-10-29 Thread Jim Nasby
tl;dr: It's critical that you actually do a make install, or at least it 
is if you've set --prefix with configure. If you don't, then even if you 
do make check you'le going to get the *installed* libpq, and not the 
*built* libpq.


Also, looks like there's a race between the .info and .c.gcov targets 
with parallel make; I'm wondering if there's a way to fix that by not 
allowing parallelism in each directory...? (Presumably the issue is the 
rm *.gcov). It'd be nice to fix this because -j speeds up coverage-html 
a lot, even with just 2 cores.


On 10/27/16 9:38 PM, Peter Eisentraut wrote:

On 10/27/16 1:27 PM, Jim Nasby wrote:

Well, that got me closer, but it's still blowing up on libpq:

genhtml: ERROR: no valid records found in tracefile
./src/interfaces/libpq/lcov.info


I have seen similar problems when I use a gcov that does not match the gcc.


I switched back to macports gcc6, verified version correctness, and got 
further (though still broken). Looking through the build log:


Processing fe-lobj.gcda
/Users/decibel/pgsql/HEAD/src/interfaces/libpq/fe-lobj.gcda:stamp 
mismatch with notes file
geninfo: WARNING: gcov did not create any files for 
/Users/decibel/pgsql/HEAD/src/interfaces/libpq/fe-lobj.gcda!

...
gcov -b -f -p -o .  fe-lobj.c >fe-lobj.c.gcov.out
./fe-lobj.gcda:stamp mismatch with notes file

First hit[1] on "google:gcov mismatch with notes file" led me to this 
hexdump command (picking a libpq file at random...)


hexdump -e '"%x\n"' -s8 -n4 fe-lobj.gcda
6929786
hexdump -e '"%x\n"' -s8 -n4 fe-lobj.gcno
d93a160

Second hit[2] gives a better explaination: the files were rebuilt a 
second time. While I don't see that happening in the log, his example is 
from building a shared library, so I'm wondering if that's got something 
to do with this.


I went into src/interfaces/libpq, did rm *.gc* lcov.info. After doing 
that, a second run of make coverage-html worked fine.


I'm wondering if there's some magic involved in coverage with shared 
libraries...


Actually, after a bunch of other experiments I ran those hexdump 
commands again; the value for .gcno has changed, but the value for .gcda 
hasn't. So I'm wondering if the base system libpq is getting pulled in. 
Sure enough, if I also do make install the timestamp matches. Presumably 
this is all due to having set --prefix with configure.



I was able to run it successfully using CC=gcc-6 and GCOV=gcov-6 from
Homebrew.


I tried that as well; it didn't work either. I didn't run a diff, but it 
appeared to be doing the same thing that macports gcc6 was.


1: 
http://stackoverflow.com/questions/27276155/gcov-generates-empty-coverage-for-c

2: http://stackoverflow.com/a/26061575/4196282
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-10-29 Thread Gilles Darold
Le 29/10/2016 à 14:38, Karl O. Pinc a écrit :
> On Fri, 28 Oct 2016 10:03:37 +0200
> Gilles Darold  wrote:
>
>> ...
>> v9 of the patch, attached here.
> Attached are 2 more documentation patchs to apply on
> top of your v9 patch.
>
>
> patch_pg_current_logfile-v9.diff.doc_current_logfiles
>
> Explains the current_logfiles file in the
> narrative documentation.  It's not like I want
> to toot our horn here.  I'm afraid that otherwise
> no one will notice the feature.
>
>
> patch_pg_current_logfile-v9.diff.doc_indexes
>
> Fixes an index entry and add more.
>
> Regards,
>
> Karl 
> Free Software:  "You don't pay back, you pay forward."
>  -- Robert A. Heinlein

The attached v10 of the current_logfiles patch include your last changes
on documentation but not the patch on v9 about the user-supplied GUC
value. I think the v10 path is ready for committers and that the
additional patch to add src/include/utils/guc_values.h to define user
GUC values is something that need to be taken outside this one. Imo,
thoses GUC values (stderr, csvlog) are not expected to change so often
to require a global definition, but why not, if committers think this
must be done I can add it to a v11 patch.

Best regards,

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index adab2f8..645cbb9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4143,6 +4143,12 @@ SELECT * FROM parent WHERE key = 2400;
  server log
 
 
+When logs are written to the file-system their paths, names, and
+types are recorded in
+the  file.  This provides
+a convenient way to find and access log content without establishing a
+database connection.
+
 
  Where To Log
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2e64cc4..f5bfe59 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15443,6 +15443,19 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
   
 
   
+   pg_current_logfile()
+   text
+   primary log file name in use by the logging collector
+  
+
+  
+   pg_current_logfile(text)
+   text
+   log file name, of log in the requested format, in use by the
+   logging collector
+  
+
+  
pg_my_temp_schema()
oid
OID of session's temporary schema, or 0 if none
@@ -15660,6 +15673,34 @@ SET search_path TO schema , schema, ..
 the time when the postmaster process re-read the configuration files.)

 
+
+pg_current_logile
+   
+
+   
+Logging
+pg_current_logfile function
+   
+
+   
+pg_current_logfile returns, as text,
+the path of either the csv or stderr log file currently in use by the
+logging collector.  This is a path including the
+ directory and the log file name.
+Log collection must be active or the return value
+is NULL.  When multiple logfiles exist, each in a
+different format, pg_current_logfile called
+without arguments returns the path of the file having the first format
+found in the ordered
+list: stderr, csvlog.
+NULL is returned when no log file has any of these
+formats.  To request a specific file format supply,
+as text, either csvlog
+or stderr as the value of the optional parameter. The
+return value is NULL when the log format requested
+is not a configured .
+   
+

 pg_my_temp_schema

diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index fddb69b..1cfb9da 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -60,6 +60,28 @@ Item
  Subdirectory containing per-database subdirectories
 
 
+
+ 
+  
+   current_logfiles
+  
+  
+   Logging
+   current_logfiles file
+  
+  current_logfiles
+ 
+ 
+  A file recording the current log file(s) used by the syslogger and
+  its log format, stderr or csvlog. Each line
+  of the file is a space separated list of two elements: the log format and
+  the full path to the log file including the value
+  of . The log format must be present
+  in  to be listed in the file. This file
+  is present only when  is
+  activated.
+
+
 
  global
  Subdirectory containing cluster-wide tables, such as
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index fd62d66..83afc34 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -146,6 +146,7 @@ static char *logfile_getname(pg_time_t timestamp, const char *suffix);
 static void set_next_rotation_time(void);
 static void sigHupHandler(SIGNAL_ARGS);
 static void sigUsr1Handler(SIGNAL_ARGS);
+static void logfile_writename(char *filename, char *csvfilename);
 
 
 /*
@@ -348,6 +349,13 @@ SysLoggerMain(int argc, char *argv[])
 rotation_disabled = false;
 rotation_requested = true;
 			}
+
+			/*
+			 * Force rewriting last log filename when reloading configuration,
+			 * l

Re: [HACKERS] Speed of user-defined aggregates using array_append as transfn

2016-10-29 Thread Tom Lane
I wrote:
> I'm still worried about chewing up a kilobyte-at-least for each transition
> value, but maybe that's something we could leave to fix later.  Another
> idea is that we could teach the planner to know about that in its hash
> table size estimates.

Here's a complete proposed patch for this.  I decided to hard-wire the
planner adjustment to apply to array_append specifically.  One could
consider doing it whenever the aggregate transtype is an array type,
but that seems likely to be casting too wide a net for now.  We can
revisit it in the future if necessary.  In any case, the estimate
can be overridden per-aggregate using the aggtransspace parameter.

Barring objections, I intend to back-patch this as far as 9.5.

regards, tom lane

diff --git a/doc/src/sgml/xaggr.sgml b/doc/src/sgml/xaggr.sgml
index fa98572..d432c9a 100644
*** a/doc/src/sgml/xaggr.sgml
--- b/doc/src/sgml/xaggr.sgml
*** if (AggCheckCallContext(fcinfo, NULL))
*** 626,632 
 function, the first input
 must be a temporary state value and can therefore safely be modified
 in-place rather than allocating a new copy.
!See int8inc() for an example.
 (This is the only
 case where it is safe for a function to modify a pass-by-reference input.
 In particular, final functions for normal aggregates must not
--- 626,632 
 function, the first input
 must be a temporary state value and can therefore safely be modified
 in-place rather than allocating a new copy.
!See int8inc() for an example.
 (This is the only
 case where it is safe for a function to modify a pass-by-reference input.
 In particular, final functions for normal aggregates must not
*** if (AggCheckCallContext(fcinfo, NULL))
*** 635,640 
--- 635,654 

  

+The second argument of AggCheckCallContext can be used to
+retrieve the memory context in which aggregate state values are being kept.
+This is useful for transition functions that wish to use expanded
+objects (see ) as their transition values.
+On first call, the transition function should return an expanded object
+whose memory context is a child of the aggregate state context, and then
+keep returning the same expanded object on subsequent calls.  See
+array_append() for an example.  (array_append()
+is not the transition function of any built-in aggregate, but it is written
+to behave efficiently when used as transition function of a custom
+aggregate.)
+   
+ 
+   
 Another support routine available to aggregate functions written in C
 is AggGetAggref, which returns the Aggref
 parse node that defines the aggregate call.  This is mainly useful
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index b06e1c1..28c15ba 100644
*** a/src/backend/executor/nodeAgg.c
--- b/src/backend/executor/nodeAgg.c
***
*** 91,100 
   *	  transition value or a previous function result, and in either case its
   *	  value need not be preserved.  See int8inc() for an example.  Notice that
   *	  advance_transition_function() is coded to avoid a data copy step when
!  *	  the previous transition value pointer is returned.  Also, some
!  *	  transition functions want to store working state in addition to the
!  *	  nominal transition value; they can use the memory context returned by
!  *	  AggCheckCallContext() to do that.
   *
   *	  Note: AggCheckCallContext() is available as of PostgreSQL 9.0.  The
   *	  AggState is available as context in earlier releases (back to 8.1),
--- 91,103 
   *	  transition value or a previous function result, and in either case its
   *	  value need not be preserved.  See int8inc() for an example.  Notice that
   *	  advance_transition_function() is coded to avoid a data copy step when
!  *	  the previous transition value pointer is returned.  It is also possible
!  *	  to avoid repeated data copying when the transition value is an expanded
!  *	  object: to do that, the transition function must take care to return
!  *	  an expanded object that is in a child context of the memory context
!  *	  returned by AggCheckCallContext().  Also, some transition functions want
!  *	  to store working state in addition to the nominal transition value; they
!  *	  can use the memory context returned by AggCheckCallContext() to do that.
   *
   *	  Note: AggCheckCallContext() is available as of PostgreSQL 9.0.  The
   *	  AggState is available as context in earlier releases (back to 8.1),
*** advance_transition_function(AggState *ag
*** 791,798 
  
  	/*
  	 * If pass-by-ref datatype, must copy the new value into aggcontext and
! 	 * pfree the prior transValue.  But if transfn returned a pointer to its
! 	 * first input, we don't need to do anything.
  	 */
  	if (!pertrans->transtypeByVal &&
  		DatumGetPointer(newVal) != DatumGetPointer(pergroupstate->transValue))
--- 794,803 
  
  

Re: [HACKERS] Streaming basebackups vs pg_stat_tmp

2016-10-29 Thread Michael Paquier
On Fri, Oct 28, 2016 at 9:57 PM, David Steele  wrote:
> On 10/28/16 3:49 PM, Magnus Hagander wrote:
> The change from 10 to 11 increases the tests that are skipped on Windows,
> which is necessary because one extra symlink test is added.
>
> I think you need:
>
> [...]
>
> The rest of the tests are for exclusions.

Indeed, giving the attached for REL9_6_STABLE. You could as well have
a test for pg_stat_tmp but honestly that's not worth it. One thing I
have noticed is that the patch does not use _tarWriteDir() for
pg_xlog. I think it should even if that's not addressing directly a
bug...
-- 
Michael
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index da9b7a6..426f6c8 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -57,6 +57,8 @@ static bool sendFile(char *readfilename, char *tarfilename,
 static void sendFileWithContent(const char *filename, const char *content);
 static void _tarWriteHeader(const char *filename, const char *linktarget,
 struct stat * statbuf);
+static int64 _tarWriteDir(const char *pathbuf, int basepathlen, struct stat * statbuf,
+			 bool sizeonly);
 static void send_int8_string(StringInfoData *buf, int64 intval);
 static void SendBackupHeader(List *tablespaces);
 static void base_backup_cleanup(int code, Datum arg);
@@ -969,9 +971,7 @@ sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
 		if ((statrelpath != NULL && strcmp(pathbuf, statrelpath) == 0) ||
 		  strncmp(de->d_name, PG_STAT_TMP_DIR, strlen(PG_STAT_TMP_DIR)) == 0)
 		{
-			if (!sizeonly)
-_tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf);
-			size += 512;
+			size += _tarWriteDir(pathbuf, basepathlen, &statbuf, sizeonly);
 			continue;
 		}
 
@@ -981,9 +981,7 @@ sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
 		 */
 		if (strcmp(de->d_name, "pg_replslot") == 0)
 		{
-			if (!sizeonly)
-_tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf);
-			size += 512;		/* Size of the header just added */
+			size += _tarWriteDir(pathbuf, basepathlen, &statbuf, sizeonly);
 			continue;
 		}
 
@@ -994,18 +992,8 @@ sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
 		 */
 		if (strcmp(pathbuf, "./pg_xlog") == 0)
 		{
-			if (!sizeonly)
-			{
-/* If pg_xlog is a symlink, write it as a directory anyway */
-#ifndef WIN32
-if (S_ISLNK(statbuf.st_mode))
-#else
-if (pgwin32_is_junction(pathbuf))
-#endif
-	statbuf.st_mode = S_IFDIR | S_IRWXU;
-_tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf);
-			}
-			size += 512;		/* Size of the header just added */
+			/* If pg_xlog is a symlink, write it as a directory anyway */
+			size += _tarWriteDir(pathbuf, basepathlen, &statbuf, sizeonly);
 
 			/*
 			 * Also send archive_status directory (by hackishly reusing
@@ -1248,6 +1236,30 @@ _tarWriteHeader(const char *filename, const char *linktarget,
 }
 
 /*
+ * Write tar header for a directory.  If the entry in statbuf is a link then
+ * write it as a directory anyway.
+ */
+static int64
+_tarWriteDir(const char *pathbuf, int basepathlen, struct stat * statbuf,
+			 bool sizeonly)
+{
+	if (sizeonly)
+		/* Directory headers are always 512 bytes */
+		return 512;
+
+	/* If symlink, write it as a directory anyway */
+#ifndef WIN32
+	if (S_ISLNK(statbuf->st_mode))
+#else
+	if (pgwin32_is_junction(pathbuf))
+#endif
+		statbuf->st_mode = S_IFDIR | S_IRWXU;
+
+	_tarWriteHeader(pathbuf + basepathlen + 1, NULL, statbuf);
+	return 512;
+}
+
+/*
  * Increment the network transfer counter by the given number of bytes,
  * and sleep if necessary to comply with the requested network transfer
  * rate.
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 6c33936..a83f3af 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -4,7 +4,7 @@ use Cwd;
 use Config;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 51;
+use Test::More tests => 52;
 
 program_help_ok('pg_basebackup');
 program_version_ok('pg_basebackup');
@@ -102,7 +102,17 @@ unlink "$pgdata/$superlongname";
 # skip on Windows.
 SKIP:
 {
-	skip "symlinks not supported on Windows", 10 if ($windows_os);
+	skip "symlinks not supported on Windows", 11 if ($windows_os);
+
+	# Move pg_replslot out of $pgdata and create a symlink to it.
+	$node->stop;
+
+	rename("$pgdata/pg_replslot", "$tempdir/pg_replslot")
+		   or BAIL_OUT "could not move $pgdata/pg_replslot";
+	symlink("$tempdir/pg_replslot", "$pgdata/pg_replslot")
+		   or BAIL_OUT "could not symlink to $pgdata/pg_replslot";
+
+	$node->start;
 
 	# Create a temporary directory in the system location and symlink it
 	# to our physical temp location.  That way we can use shorter names
@@ -140,6 +150,8 @@ SKIP:
 		"tablespace symlink was updated");
 	closedir $dh;
 
+	ok(-d "$tempdir/backup1/pg_replslot", 'pg_replslot symlink copied as direc

Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-10-29 Thread Karl O. Pinc
On Fri, 28 Oct 2016 10:03:37 +0200
Gilles Darold  wrote:

> ...
> v9 of the patch, attached here.

Attached are 2 more documentation patchs to apply on
top of your v9 patch.


patch_pg_current_logfile-v9.diff.doc_current_logfiles

Explains the current_logfiles file in the
narrative documentation.  It's not like I want
to toot our horn here.  I'm afraid that otherwise
no one will notice the feature.


patch_pg_current_logfile-v9.diff.doc_indexes

Fixes an index entry and add more.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein


patch_pg_current_logfile-v9.diff.doc_current_logfiles
Description: Binary data


patch_pg_current_logfile-v9.diff.doc_indexes
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers