date:20201110

RE: POC: postgres_fdw insert batching

2020-11-10 Thread tsunakawa.ta...@fujitsu.com

From: Tomas Vondra 
> I see the patch builds the "bulk" query in execute_foreign_modify. IMO
> that's something we should do earlier, when we're building the simple
> query (for 1-row inserts). I'd understand if you were concerned about
> overhead in case of 1-row inserts, trying to not plan the bulk query
> until necessary, but I'm not sure this actually helps.
> 
> Or was the goal to build a query for every possible number of slots? I
> don't think that's really useful, considering it requires deallocating
> the old plan, preparing a new one, etc. IMO it should be sufficient to
> have two queries - one for 1-row inserts, one for the full batch. The
> last incomplete batch can be inserted using a loop of 1-row queries.
> 
> That's what my patch was doing, but I'm not insisting on that - it just
> seems like a better approach to me. So feel free to argue why this is
> better.

Don't be concerned, the processing is not changed for 1-row inserts: the INSERT 
query string is built in PlanForeignModify(), and the remote statement is 
prepared in execute_foreign_modify() during the first call to 
ExecForeignInsert() and it's reused for subsequent ExecForeignInsert() calls.

The re-creation of INSERT query string and its corresponding PREPARE happen 
when the number of tuples to be inserted is different from the previous call to 
ExecForeignInsert()/ExecForeignBulkInsert().  That's because we don't know how 
many tuples will be inserted during planning (PlanForeignModify) or execution 
(until the scan ends for SELECT).  For example, if we insert 10,030 rows with 
the bulk size 100, the flow is:

  PlanForeignModify():
build the INSERT query string for 1 row
  ExecForeignBulkInsert(100):
drop the INSERT query string and prepared statement for 1 row
build the query string and prepare statement for 100 row INSERT
execute it
  ExecForeignBulkInsert(100):
reuse the prepared statement for 100 row INSERT and execute it
...
  ExecForeignBulkInsert(30):
drop the INSERT query string and prepared statement for 100 row
build the query string and prepare statement for 30 row INSERT
execute it


> I think it'd be good to have such GUC, even if only for testing and
> development. We should probably have a way to disable the batching,
> which the GUC could also do, I think. So +1 to have the GUC.

OK, I'll add it.  The name would be max_bulk_insert_tuples, because a) it might 
cover bulk insert for local relations in the future, and b) "tuple" is used in 
cpu_(index_)tuple_cost and parallel_tuple_cost, while "row" or "record" is not 
used in GUC (except for row_security).

The valid range would be between 1 and 1,000 (I said 10,000 previously, but I 
think it's overreaction and am a bit worried about unforseen trouble too many 
tuples might cause.)  1 disables the bulk processing and uses the traditonal 
ExecForeignInsert().  The default value is 100 (would 1 be sensible as a 
default value to avoid surprising users by increased memory usage?)


> > 2. Should we accumulate records per relation in ResultRelInfo
> > instead? That is, when inserting into a partitioned table that has
> > foreign partitions, delay insertion until a certain number of input
> > records accumulate, and then insert accumulated records per relation
> > (e.g., 50 records to relation A, 30 records to relation B, and 20
> > records to relation C.)  If we do that,
> >
> 
> I think there's a chunk of text missing here? If we do that, then what?

Sorry, the two bullets below there are what follows.  Perhaps I should have 
written ":" instead of ",".


> Anyway, I don't see why accumulating the records in ResultRelInfo would
> be better than what the patch does now. It seems to me like fairly
> specific to FDWs, so keeping it int FDW state seems appropriate. What
> would be the advantage of stashing it in ResultRelInfo?

I thought of distributing input records to their corresponding partitions' 
ResultRelInfos.  For example, input record for partition 1 comes, store it in 
the ResultRelInfo for partition 1, then input record for partition 2 comes, 
store it in the ResultRelInfo for partition 2.  When a ResultRelInfo 
accumulates some number of rows, insert the accumulated rows therein into the 
partition.  When the input endds, perform bulk inserts for ResultRelInfos that 
have accumulated rows.



> I think that's OK for most use cases, and if it's not (e.g. when there's
> something requiring the exact order of writes) then it's not possible to
> use batching. That's one of the reasons why I think we should have a GUC
> to disable the batching.

Agreed.


> > * Should the maximum count of accumulated records be applied per
> > relation or the query? When many foreign partitions belong to a
> > partitioned table, if the former is chosen, it may use much memory in
> > total.  If the latter is chosen, the records per relation could be
> > few and thus the benefit of bulk insert could be small.
> >
> 
> I think it needs to be

Re: proposal: possibility to read dumped table's name from file

2020-11-10 Thread Pavel Stehule

st 11. 11. 2020 v 6:32 odesílatel Pavel Stehule 
napsal:

> Hi
>
> út 10. 11. 2020 v 21:09 odesílatel Stephen Frost 
> napsal:
>
>> Greetings,
>>
>> * Pavel Stehule (pavel.steh...@gmail.com) wrote:
>> > rebase + minor change - using pg_get_line_buf instead pg_get_line_append
>>
>> I started looking at this and went back through the thread and while I
>> tend to agree that JSON may not be a good choice for this, it's not the
>> only possible alternative.  There is no doubt that pg_dump is already a
>> sophisticated data export tool, and likely to continue to gain new
>> features, such that having a configuration file for it would be very
>> handy, but this clearly isn't really going in a direction that would
>> allow for that.
>>
>> Perhaps this feature could co-exist with a full blown configuration for
>> pg_dump, but even then there's certainly issues with what's proposed-
>> how would you handle explicitly asking for a table which is named
>> "  mytable" to be included or excluded?  Or a table which has a newline
>> in it?  Using a standardized format which supports the full range of
>> what we do in a table name, explicitly and clearly, would address these
>> issues and also give us the flexibility to extend the options which
>> could be used through the configuration file beyond just the filters in
>> the future.
>>
>
> This is the correct argument - I will check a possibility to use strange
> names, but there is the same possibility and functionality like we allow
> from the command line. So you can use double quoted names. I'll check it.
>
>
>> Unlike for the pg_basebackup manifest, which we generate and read
>> entirely programatically, a config file for pg_dump would almost
>> certainly be updated manually (or, at least, parts of it would be and
>> perhaps other parts generated), which means it'd really be ideal to have
>> a proper way to support comments in it (something that the proposed
>> format also doesn't really get right- # must be the *first* character,
>> and you can only have whole-line comments..?), avoid extra unneeded
>> punctuation (or, at times, allow it- such as trailing commas in lists),
>> cleanly handle multi-line strings (consider the oft discussed idea
>> around having pg_dump support a WHERE clause for exporting data from
>> tables...), etc.
>>
>
> I think the proposed feature is very far to be the config file for pg_dump
> (it implements a option "--filter"). This is not the target. It is not
> designed for this. This is just an alternative for options like -t, -T, ...
> and I am sure so nobody will generate this file manually. Main target of
> this patch is eliminating problems with the max length of the command line.
> So it is really not designed to be the config file for pg_dump.
>
>
>>
>> Overall, -1 from me on this approach.  Maybe it could be fixed up to
>> handle all the different names of objects that we support today
>> (something which, imv, is really a clear requirement for this feature to
>> be committed), but I suspect you'd end up half-way to yet another
>> configuration format when we could be working to support something like
>> TOML or maybe YAML... but if you want my 2c, TOML seems closer to what
>> we do for postgresql.conf and getting that over to something that's
>> standardized, while a crazy long shot, is a general nice idea, imv.
>>
>
> I have nothing against TOML, but I don't see a sense of usage in this
> patch. This patch doesn't implement a config file for pg_dump, and I don't
> see any sense or benefits of it. The TOML is designed for different
> purposes. TOML is good for manual creating, but it is not this case.
> Typical usage of this patch is some like, and TOML syntax (or JSON) is not
> good for this.
>
> psql -c "select '+t' || quote_ident(relname) from pg_class where relname
> ..." | pg_dump --filter=/dev/stdin
>
> I can imagine some benefits of saved configure files for postgres
> applications - but it should be designed generally and implemented
> generally. Probably you would use one for pg_dump, psql, pg_restore, 
> But it is a different feature with different usage. This patch doesn't
> implement option "--config", it implements option "--filter".
>

Some generic configuration for postgres binary applications is an
interesting idea.  And TOML language can be well for this purpose. We can
parametrize applications by command line and by system variables. But
filtering objects is a really different case - although there is some small
intersection, and it will be used very differently, and I don't think so
one language can be practical for both cases. The object filtering is an
independent feature, and both features can coexist together.

Regards

Pavel


> Regards
>
> Pavel
>
>
>
>> Thanks,
>>
>> Stephen
>>
>

Re: Skip ExecCheckRTPerms in CTAS with no data

2020-11-10 Thread Michael Paquier

On Mon, Nov 09, 2020 at 10:48:09PM +0300, Anastasia Lubennikova wrote:
> I see Tom's objection above. Still, I tend to agree that if 'WITH NO DATA'
> was specified explicitly, CREATE AS should behave more like a utility
> statement rather than a regular query. So I think that this patch can be
> useful in some use-cases and I definitely don't see any harm it could cause.
> Even the comment in the current code suggests that it is an option.

I agree with Tom's point to leave this stuff alone, and just remove
this XXX comment.  An extra issue I can see is that you would bypass
ExecutorCheckPerms_hook_type when using WITH NO DATA.  This could
silently break the users of this hook.
--
Michael


signature.asc
Description: PGP signature

Sloppiness around failure handling of parsePGArray in pg_dump

2020-11-10 Thread Michael Paquier

Hi all,

Following the report of Coverity that led to 3636efa, I have reviewed
the existing callers of parsePGArray() in pg_dump and some of its
error handling is a bit sloppy.

It could theoretically be possible to reach an OOM in parsePGArray()
with a dump able to finish.  This is very unlikely going to matter in
practice as an OOM when parsing an array is most likely going to
trigger a fatal failure in one of the follow-up allocations, but if
the dump is able to go through we could finish with a valid dump that
lacks some information:
- Statistics for indexes.
- Run-time configuration of functions.
- Configuration of extensions.
- Publication list for a subscription.

I would like to propose the attached to tighten the error handling in
the area, generating a fatal error if an array cannot be parsed.  I
did not see the point of changing the assumptions we use for the
parsing of function args or such when it comes to pre-8.4 dumps.  This
issue is unlikely going to matter in practice, so I don't propose a
backpatch.

Thoughts?
--
Michael
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c68db75b97..dc1d41dd8d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4357,13 +4357,7 @@ dumpSubscription(Archive *fout, SubscriptionInfo *subinfo)
 
 	/* Build list of quoted publications and append them to query. */
 	if (!parsePGArray(subinfo->subpublications, , ))
-	{
-		pg_log_warning("could not parse subpublications array");
-		if (pubnames)
-			free(pubnames);
-		pubnames = NULL;
-		npubnames = 0;
-	}
+		fatal("could not parse subpublications array");
 
 	publications = createPQExpBuffer();
 	for (i = 0; i < npubnames; i++)
@@ -12128,13 +12122,12 @@ dumpFunc(Archive *fout, FuncInfo *finfo)
 	if (proconfig && *proconfig)
 	{
 		if (!parsePGArray(proconfig, , ))
-		{
-			pg_log_warning("could not parse proconfig array");
-			if (configitems)
-free(configitems);
-			configitems = NULL;
-			nconfigitems = 0;
-		}
+			fatal("could not parse proconfig array");
+	}
+	else
+	{
+		configitems = NULL;
+		nconfigitems = 0;
 	}
 
 	if (funcargs)
@@ -16453,8 +16446,8 @@ dumpIndex(Archive *fout, IndxInfo *indxinfo)
 		char	   *indstatvals = indxinfo->indstatvals;
 		char	  **indstatcolsarray = NULL;
 		char	  **indstatvalsarray = NULL;
-		int			nstatcols;
-		int			nstatvals;
+		int			nstatcols = 0;
+		int			nstatvals = 0;
 
 		if (dopt->binary_upgrade)
 			binary_upgrade_set_pg_class_oids(fout, q,
@@ -16483,12 +16476,17 @@ dumpIndex(Archive *fout, IndxInfo *indxinfo)
 		 * If the index has any statistics on some of its columns, generate
 		 * the associated ALTER INDEX queries.
 		 */
-		if (parsePGArray(indstatcols, , ) &&
-			parsePGArray(indstatvals, , ) &&
-			nstatcols == nstatvals)
+		if (strlen(indstatcols) != 0 || strlen(indstatvals) != 0)
 		{
 			int			j;
 
+			if (!parsePGArray(indstatcols, , ))
+fatal("could not parse index statistic columns");
+			if (!parsePGArray(indstatvals, , ))
+fatal("could not parse index statistic values");
+			if (nstatcols != nstatvals)
+fatal("mismatched number of columns and values for index stats");
+
 			for (j = 0; j < nstatcols; j++)
 			{
 appendPQExpBuffer(q, "ALTER INDEX %s ", qqindxname);
@@ -17938,15 +17936,20 @@ processExtensionTables(Archive *fout, ExtensionInfo extinfo[],
 		char	   *extcondition = curext->extcondition;
 		char	  **extconfigarray = NULL;
 		char	  **extconditionarray = NULL;
-		int			nconfigitems;
-		int			nconditionitems;
+		int			nconfigitems = 0;
+		int			nconditionitems = 0;
 
-		if (parsePGArray(extconfig, , ) &&
-			parsePGArray(extcondition, , ) &&
-			nconfigitems == nconditionitems)
+		if (strlen(extconfig) != 0 || strlen(extcondition) != 0)
 		{
 			int			j;
 
+			if (!parsePGArray(extconfig, , ))
+fatal("could not parse extension configuration array");
+			if (!parsePGArray(extcondition, , ))
+fatal("could not parse extension condition array");
+			if (nconfigitems != nconditionitems)
+fatal("mismatched number of configurations and conditions for extension");
+
 			for (j = 0; j < nconfigitems; j++)
 			{
 TableInfo  *configtbl;


signature.asc
Description: PGP signature

Re: proposal: possibility to read dumped table's name from file

2020-11-10 Thread Pavel Stehule

Hi

čt 24. 9. 2020 v 19:47 odesílatel Pavel Stehule 
napsal:

> Hi
>
> rebase + minor change - using pg_get_line_buf instead pg_get_line_append
>
>
fresh rebase


Regards
>

> Pavel
>
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 0aa35cf0c3..068821583f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -751,6 +751,56 @@ PostgreSQL documentation
   
  
 
+ 
+  --filter=filename
+  
+   
+Read objects filters from the specified file.
+If you use "-" as a filename, the filters are read from stdin.
+The lines of this file must have the following format:
+
+(+|-)[tnfd] objectname
+
+   
+
+   
+The first character specifies whether the object is to be included
+(+) or excluded (-), and the
+second character specifies the type of object to be filtered:
+t (table),
+n (schema),
+f (foreign server),
+d (table data).
+   
+
+   
+With the following filter file, the dump would include table
+mytable1 and data from foreign tables of
+some_foreign_server foreign server, but exclude data
+from table mytable2.
+
++t mytable1
++f some_foreign_server
+-d mytable2
+
+   
+
+   
+The lines starting with symbol # are ignored.
+Previous white chars (spaces, tabs) are not allowed. These
+lines can be used for comments, notes. Empty lines are ignored too.
+   
+
+   
+The --filter option works just like the other
+options to include or exclude tables, schemas, table data, or foreign
+tables, and both forms may be combined.  Note that there are no options
+to exclude a specific foreign table or to include a specific table's
+data.
+   
+  
+ 
+
  
   --if-exists
   
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c68db75b97..f1f9e8321e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -54,9 +54,11 @@
 #include "catalog/pg_trigger_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/connect.h"
+#include "common/string.h"
 #include "dumputils.h"
 #include "fe_utils/string_utils.h"
 #include "getopt_long.h"
+#include "lib/stringinfo.h"
 #include "libpq/libpq-fs.h"
 #include "parallel.h"
 #include "pg_backup_db.h"
@@ -294,6 +296,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 static char *get_synchronized_snapshot(Archive *fout);
 static void setupDumpWorker(Archive *AHX);
 static TableInfo *getRootTableInfo(TableInfo *tbinfo);
+static void read_patterns_from_file(char *filename, DumpOptions *dopt);
 
 
 int
@@ -367,6 +370,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, _row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"extra-float-digits", required_argument, NULL, 8},
+		{"filter", required_argument, NULL, 12},
 		{"if-exists", no_argument, _exists, 1},
 		{"inserts", no_argument, NULL, 9},
 		{"lock-wait-timeout", required_argument, NULL, 2},
@@ -390,6 +394,7 @@ main(int argc, char **argv)
 		{"rows-per-insert", required_argument, NULL, 10},
 		{"include-foreign-data", required_argument, NULL, 11},
 		{"index-collation-versions-unknown", no_argument, _unknown, 1},
+		{"include-foreign-data-file", required_argument, NULL, 17},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -607,6 +612,10 @@ main(int argc, char **argv)
 		  optarg);
 break;
 
+			case 12:			/* filter implementation */
+read_patterns_from_file(optarg, );
+break;
+
 			default:
 fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 exit_nicely(1);
@@ -1035,6 +1044,8 @@ help(const char *progname)
 			 "   access to)\n"));
 	printf(_("  --exclude-table-data=PATTERN do NOT dump data for the specified table(s)\n"));
 	printf(_("  --extra-float-digits=NUM override default setting for extra_float_digits\n"));
+	printf(_("  --filter=FILENAMEdump objects and data based on the filter expressions\n"
+			 "   from the filter file\n"));
 	printf(_("  --if-exists  use IF EXISTS when dropping objects\n"));
 	printf(_("  --include-foreign-data=PATTERN\n"
 			 "   include data of foreign tables on foreign\n"
@@ -18632,3 +18643,160 @@ appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 	if (!res)
 		pg_log_warning("could not parse reloptions array");
 }
+
+/*
+ * Print error message and exit.
+ */
+static void
+exit_invalid_filter_format(FILE *fp, char *filename, char *message, char *line, int lineno)
+{
+	pg_log_error("invalid format of filter file \"%s\": %s",
+ filename,
+ message);
+
+	fprintf(stderr, "%d: %s\n", lineno, line);
+
+	if (fp != stdin)
+		fclose(fp);
+
+	exit_nicely(-1);
+}
+
+/*
+ * Read dumped object specification from file
+ */
+static

Re: proposal: possibility to read dumped table's name from file

2020-11-10 Thread Pavel Stehule

Hi

Perhaps this feature could co-exist with a full blown configuration for
>> pg_dump, but even then there's certainly issues with what's proposed-
>> how would you handle explicitly asking for a table which is named
>> "  mytable" to be included or excluded?  Or a table which has a newline
>> in it?  Using a standardized format which supports the full range of
>> what we do in a table name, explicitly and clearly, would address these
>> issues and also give us the flexibility to extend the options which
>> could be used through the configuration file beyond just the filters in
>> the future.
>>
>
> This is the correct argument - I will check a possibility to use strange
> names, but there is the same possibility and functionality like we allow
> from the command line. So you can use double quoted names. I'll check it.
>

I checked

echo "+t \"bad Name\"" | /usr/local/pgsql/master/bin/pg_dump
--filter=/dev/stdin

It is working without any problem

Regards

Pavel

Re: proposal: possibility to read dumped table's name from file

2020-11-10 Thread Pavel Stehule

Hi

út 10. 11. 2020 v 21:09 odesílatel Stephen Frost 
napsal:

> Greetings,
>
> * Pavel Stehule (pavel.steh...@gmail.com) wrote:
> > rebase + minor change - using pg_get_line_buf instead pg_get_line_append
>
> I started looking at this and went back through the thread and while I
> tend to agree that JSON may not be a good choice for this, it's not the
> only possible alternative.  There is no doubt that pg_dump is already a
> sophisticated data export tool, and likely to continue to gain new
> features, such that having a configuration file for it would be very
> handy, but this clearly isn't really going in a direction that would
> allow for that.
>
> Perhaps this feature could co-exist with a full blown configuration for
> pg_dump, but even then there's certainly issues with what's proposed-
> how would you handle explicitly asking for a table which is named
> "  mytable" to be included or excluded?  Or a table which has a newline
> in it?  Using a standardized format which supports the full range of
> what we do in a table name, explicitly and clearly, would address these
> issues and also give us the flexibility to extend the options which
> could be used through the configuration file beyond just the filters in
> the future.
>

This is the correct argument - I will check a possibility to use strange
names, but there is the same possibility and functionality like we allow
from the command line. So you can use double quoted names. I'll check it.

> Unlike for the pg_basebackup manifest, which we generate and read
> entirely programatically, a config file for pg_dump would almost
> certainly be updated manually (or, at least, parts of it would be and
> perhaps other parts generated), which means it'd really be ideal to have
> a proper way to support comments in it (something that the proposed
> format also doesn't really get right- # must be the *first* character,
> and you can only have whole-line comments..?), avoid extra unneeded
> punctuation (or, at times, allow it- such as trailing commas in lists),
> cleanly handle multi-line strings (consider the oft discussed idea
> around having pg_dump support a WHERE clause for exporting data from
> tables...), etc.
>

I think the proposed feature is very far to be the config file for pg_dump
(it implements a option "--filter"). This is not the target. It is not
designed for this. This is just an alternative for options like -t, -T, ...
and I am sure so nobody will generate this file manually. Main target of
this patch is eliminating problems with the max length of the command line.
So it is really not designed to be the config file for pg_dump.

>
> Overall, -1 from me on this approach.  Maybe it could be fixed up to
> handle all the different names of objects that we support today
> (something which, imv, is really a clear requirement for this feature to
> be committed), but I suspect you'd end up half-way to yet another
> configuration format when we could be working to support something like
> TOML or maybe YAML... but if you want my 2c, TOML seems closer to what
> we do for postgresql.conf and getting that over to something that's
> standardized, while a crazy long shot, is a general nice idea, imv.
>

I have nothing against TOML, but I don't see a sense of usage in this
patch. This patch doesn't implement a config file for pg_dump, and I don't
see any sense or benefits of it. The TOML is designed for different
purposes. TOML is good for manual creating, but it is not this case.
Typical usage of this patch is some like, and TOML syntax (or JSON) is not
good for this.

psql -c "select '+t' || quote_ident(relname) from pg_class where relname
..." | pg_dump --filter=/dev/stdin

I can imagine some benefits of saved configure files for postgres
applications - but it should be designed generally and implemented
generally. Probably you would use one for pg_dump, psql, pg_restore, 
But it is a different feature with different usage. This patch doesn't
implement option "--config", it implements option "--filter".

Regards

Pavel

> Thanks,
>
> Stephen
>

Re: logical streaming of xacts via test_decoding is broken

2020-11-10 Thread Dilip Kumar

On Tue, Nov 10, 2020 at 7:20 PM Amit Kapila  wrote:
>
> On Tue, Nov 10, 2020 at 2:25 PM Dilip Kumar  wrote:
> >
> > On Tue, Nov 10, 2020 at 11:18 AM Dilip Kumar  wrote:
> > >
> > > On Tue, Nov 10, 2020 at 10:52 AM Amit Kapila  
> > > wrote:
> > > > For this case, users can use skip_empty_xacts = true and
> > > > skip_empty_streams = false. I am just asking if the user has only used
> > > > skip_empty_xacts = true and didn't use the 'skip_empty_streams'
> > > > option.
> > >
> > > Ok, thanks for the clarification.
> > >
> >
> > I have prepared a patch for the same.
> >
>
> Few comments:
> 1.
> + else if (strcmp(elem->defname, "skip-empty-streams") == 0)
> + {
> + if (elem->arg == NULL)
> + data->skip_empty_streams = true;
> + else if (!parse_bool(strVal(elem->arg), >skip_empty_streams))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("could not parse value \"%s\" for parameter \"%s\"",
> + strVal(elem->arg), elem->defname)));
> + if (!data->skip_empty_xacts && data->skip_empty_streams)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("the skip-empty-streams can not be true if skip-empty-xacts
> is false")));
>   }
>
> You can probably add a comment as to why we are disallowing this case.
> I thought of considering 'stream-changes' parameter here because it
> won't make sense to give this parameter without it, however, it seems
> that is not necessary but maybe adding a comment
> here in that regard would be a good idea.

Should we also consider the case that if the user just passed
skip_empty_streams to true then we should automatically set
skip_empty_xacts to true?
And we will allow the 'skip_empty_streams' parameter only if
stream-changes' is true.

> 2.
> pg_decode_begin_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
>  {
>   TestDecodingData *data = ctx->output_plugin_private;
> + TestDecodingTxnData *txndata =
> + MemoryContextAllocZero(ctx->context, sizeof(TestDecodingTxnData));
> +
>
> Shall we free this memory at commit time for the sake of consistency,
> otherwise also it would be freed with decoding context?

Yeah, actually I have freed in the stream commit but missed here.  I
will do that.

> 3. Can you please prepare a separate patch for test case changes so
> that it would be easier to verify that it fails without the patch and
> passed after the patch?

ok

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: PATCH: Report libpq version and configuration

2020-11-10 Thread Craig Ringer

On Tue, Nov 10, 2020 at 2:22 PM Craig Ringer 
wrote:

>
> The main things I'd really like to get in place are a way to get the
> version as an ELF data symbol, and a simple way to ID the binary.
>
> So the minimal change would be to declare:
>
> const char LIBPQ_VERSION_STR[] = PG_VERSION_STR;
> const int LIBPQ_VERSION_NUM = PG_VERSION_NUM;
>
> then change PQgetVersion() to return LIBPQ_VERSION_NUM and add a
> PQgetVersionStr() that returns LIBPQ_VERSION_STR.
>
> That OK with you?
>

Proposed minimal patch attached.
From 137fa6f0ee8a13e56025af4adf7707b8ec8e4739 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 11 Nov 2020 11:45:05 +0800
Subject: [PATCH v2] Add PQlibVersionString() to libpq

Provide applications with a way to get the full postgres version string
associated with the libpq they are dynamically linked against. This might
not be the same libpq they were compiled against, so checking PG_VERSION_STR
is not the same thing.

Expose LIBPQ_VERSION_NUM and LIBPQ_VERSION_STR symbols in the libpq export
symbol table  for examination in situations like core file analysis where
functions cannot be called. Debuggers can't inspect PG_VERSION_NUM etc unless
-ggdb3 debuginfo is available, wheras this will will work even with a stripped
binary since they're in the export symbol table. Also useful for use in trace
tooling (dtrace, systemtap, perf, etc) and for identifying a stray libpq binary
found lying around:

$ gdb -batch \
	  -ex 'p (int)LIBPQ_VERSION_NUM' \
	  -ex 'p (const char*)_VERSION_STR' \
  build/src/interfaces/libpq/libpq.so
$1 = 14
$2 = 0x269c0  "PostgreSQL 14devel ..."

or

$ strings build/src/interfaces/libpq/libpq.so | egrep '^PostgreSQL [0-9]'
PostgreSQL 14devel ...
---
 doc/src/sgml/libpq.sgml  | 49 
 src/interfaces/libpq/exports.txt |  3 ++
 src/interfaces/libpq/fe-misc.c   | 21 +-
 src/interfaces/libpq/libpq-fe.h  |  5 +++-
 src/interfaces/libpq/libpq-int.h |  8 ++
 5 files changed, 84 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 9ce32fb39b..1b7f57c549 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -6248,6 +6248,55 @@ int PQlibVersion(void);
on version 9.1 or later.
   
  
+
+ 
+  The value returned by PQlibVersion() is also exposed as the
+  exported symbol LIBPQ_VERSION_NUM for use by debuggers and
+  trace tooling. Applications should not reference this directly; call the function
+  instead.
+ 
+
+
+   
+
+   
+
+  PQlibVersionString
+  
+PQlibVersionString
+PQserverVersion
+PQlibVersion
+  
+
+
+
+ 
+  Return a string describing the PostgreSQL version that the currently loaded
+  libpq binary was compiled against.
+
+const char * PQlibVersion(void);
+
+ 
+
+ 
+  The return value is a statically allocated string. The caller must not free it.
+ 
+
+ 
+  Applications should prefer to call PQlibVersion()
+  to determine the major and minor versions. Do not parse this version string
+  to get the numeric version information. Use PQlibVersionString
+  when displaying a detailed version in diagnostic output or debug logs where
+  a more detailed identification of the exact libpq binary may be desirable.
+ 
+
+ 
+  The value returned by PQlibVersionString() is also exposed as the
+  exported symbol LIBPQ_VERSION_STR for use by debuggers and
+  trace tooling. Applications should not reference this directly; call the function
+  instead.
+ 
+
 

 
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index bbc1f90481..76dfdc1d5f 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -179,3 +179,6 @@ PQgetgssctx   176
 PQsetSSLKeyPassHook_OpenSSL 177
 PQgetSSLKeyPassHook_OpenSSL 178
 PQdefaultSSLKeyPassHook_OpenSSL 179
+PQlibVersionString  180
+LIBPQ_VERSION_STR   181
+LIBPQ_VERSION_NUM   182
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index 4ffc7f33fb..c57ef85a4d 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -59,15 +59,34 @@ static int	pqSocketCheck(PGconn *conn, int forRead, int forWrite,
 		  time_t end_time);
 static int	pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time);
 
+/*
+ * Storage for version information. These are deliberately not static, so they
+ * appear in the symbol table where they can be inspected by debuggers and
+ * trace tools in contexts where a function call isn't possible. They aren't
+ * declared as externs in libpq-fe.h because we want programs to use the
+ * function interfaces instead.
+ */
+const char LIBPQ_VERSION_STR[] = PG_VERSION_STR;
+const int LIBPQ_VERSION_NUM = PG_VERSION_NUM;
+
 /*

Re: Misuse of TimestampDifference() in the autoprewarm feature of pg_prewarm

2020-11-10 Thread Tom Lane

Alexey Kondratov  writes:
> After looking on the autoprewarm code more closely I have realised that 
> this 'double dump' issues was not an issues at all. I have just 
> misplaced a debug elog(), so its second output in the log was only 
> indicating that we calculated delay_in_ms one more time.

Ah --- that explains why I couldn't see a problem.

I've pushed 0001+0002 plus some followup work to fix other places
that could usefully use TimestampDifferenceMilliseconds().  I have
not done anything with 0003 (the TAP test for pg_prewarm), and will
leave that to the judgment of somebody who's worked with pg_prewarm
before.  To me it looks like it's not really testing things very
carefully at all; on the other hand, we have exactly zero test
coverage of that module today, so maybe something is better than
nothing.

regards, tom lane

RE: Disable WAL logging to speed up data loading

2020-11-10 Thread osumi.takami...@fujitsu.com

Horiguchi-San


On Tuesday, November 10, 2020 10:00 AM  Kyotaro Horiguchi 
 wrote:
> FWIW, the following is that, I think it works not only when wal_level=minimal
> for SET UNLOGGED, and only works when minimal for SET LOGGED.
> 
> https://www.postgresql.org/message-id/20201002.100621.166891875652013
> 6893.horikyota@gmail.com
> 
> | For fuel(?) of the discussion, I tried a very-quick PoC for in-place
> | ALTER TABLE SET LOGGED/UNLOGGED and resulted as attached. After
> some
> | trials of several ways, I drifted to the following way after poking
> | several ways.
> |
> | 1. Flip BM_PERMANENT of active buffers 2. adding/removing init fork 3.
> | sync files, 4. Flip pg_class.relpersistence.
> |
> | It always skips table copy in the SET UNLOGGED case, and only when
> | wal_level=minimal in the SET LOGGED case.  Crash recovery seems
> | working by some brief testing by hand.
> I agree to this aspect of the in-place flipping of UNLOGGED.
> 
> Couldn't we have something like the following?
>  ALTER TABLE table1, table2, table3 SET UNLOGGED;
> 
> That is, multiple target object specification in ALTER TABLE sttatement.
I *really* appreciate the 1st patch.
Did you register this patch with the current or next commitfest ?
I cannot find it. We should avoid that this patch misses people's attention.
When I'm available, I'd like to join the review for the patch as well.

Also, I supposed that as long as
something impossible to implement for wal_level=none is not founded,
we could develop both patches in parallel, because both have good points
described in the previous e-mails of this thread.

Best,
Takamichi Osumi

Re: Rethinking LOCK TABLE's behavior on views

2020-11-10 Thread Noah Misch

On Mon, Nov 09, 2020 at 11:42:33AM -0300, Alvaro Herrera wrote:
> On 2020-Nov-07, Noah Misch wrote:
> > On Sat, Nov 07, 2020 at 11:57:20AM -0500, Tom Lane wrote:
> > > A completely different approach we could consider is to weaken the
> > > permissions requirements for LOCK on a view, say "allow it if either
> > > the calling user or the view owner has the needed permission".  This
> > > seems generally pretty messy and so I don't much like it, but we
> > > should consider as many solutions as we can think of.
> > 
> > This is the best of what you've listed by a strong margin, and I don't know 
> > of
> > better options you've not listed.  +1 for it.  Does it work for you?
> 
> It does sound attractive from a user complexity perspective, even if it
> does sound messy form an implementation perspective.
> 
> > I think
> > the mess arises from LOCK TABLE serving "get locks sufficient for $ACTIONS" 
> > as
> > a family of use cases.  For views only, different $ACTIONS want different
> > behavior.  $ACTIONS==SELECT wants today's behavior; pg_get_viewdef() wants
> > shallower recursion and caller permissions; DROP VIEW wants no recursion.
> 
> Maybe we can tackle this problem directly, by adding a clause to LOCK
> TABLE to indicate a purpose for the lock that the server can use to
> determine the level of recursion.  For example
>   LOCK TABLE xyz IN  FOR 
> where  can be READ, DROP, DEFINE.

Possible.  Regrettably, we're not set up for it; running pg_get_viewdef() to
completion is today's way to determine what it will lock.  Each of these modes
probably would have condensed copies of the operation they mimic, which I'd
find sadder than locking somewhat more than pg_dump needs (via today's "LOCK
TABLE viewname" behavior).  Is it plausible to do without that duplication?

Re: pg_upgrade analyze script

2020-11-10 Thread Michael Paquier

On Tue, Nov 10, 2020 at 10:21:04AM +0900, Michael Paquier wrote:
> On Mon, Nov 09, 2020 at 03:47:22PM +0100, Peter Eisentraut wrote:
>> You should just remove those calls.  There is no need to replace them with
>> vacuumdb calls.  The reason those calls were there is that they were testing
>> the generated script itself.  If the script is gone, there is no more need.
>> There are already separate tests for testing vacuumdb.
> 
> True, 102_vacuumdb_stages.pl already does all that.

Let's fix that.  From what I can see it only involves the attached,
and as a bobus it also reduces the number of extra characters showing
on my terminal after running pg_upgrade tests.
--
Michael
diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index 551ac8a5c2..04aa7fd9f5 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -243,8 +243,6 @@ esac
 
 pg_ctl start -l "$logdir/postmaster2.log" -o "$POSTMASTER_OPTS" -w
 
-vacuumdb --all --analyze-in-stages
-
 pg_dumpall --no-sync -f "$temp_root"/dump2.sql || pg_dumpall2_status=$?
 pg_ctl -m fast stop
 
diff --git a/src/tools/msvc/vcregress.pl b/src/tools/msvc/vcregress.pl
index 9799c5f54b..266098e193 100644
--- a/src/tools/msvc/vcregress.pl
+++ b/src/tools/msvc/vcregress.pl
@@ -606,9 +606,6 @@ sub upgradecheck
 	print "\nStarting new cluster\n\n";
 	@args = ('pg_ctl', '-l', "$logdir/postmaster2.log", 'start');
 	system(@args) == 0 or exit 1;
-	print "\nSetting up stats on new cluster\n\n";
-	@args = ('vacuumdb', '--all', '--analyze-in-stages');
-	system(@args) == 0 or exit 1;
 	print "\nDumping new cluster\n\n";
 	@args = ('pg_dumpall', '-f', "$tmp_root/dump2.sql");
 	system(@args) == 0 or exit 1;


signature.asc
Description: PGP signature

Re: Windows regress fails (latest HEAD)

2020-11-10 Thread Michael Paquier

On Tue, Nov 10, 2020 at 04:22:37PM -0500, Russell Foster wrote:
> Never claimed they were the same, but they are both Windows x64. Here
> are some more details:
> 
> Test Passes:
> VM machine (Build Server)
> Microsoft Windows 10 Pro
> Version 10.0.18363 Build 18363
> Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29112 for x64
> 
> Compile args:
> C:\Program Files (x86)\Microsoft Visual
> Studio\2019\Enterprise\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\CL.exe
> /c /Isrc/include /Isrc/include/port/win32
> /Isrc/include/port/win32_msvc /Iexternals/zlib
> /Iexternals/openssl\include /Isrc/backend /Zi /nologo /W3 /WX-
> /diagnostics:column /Ox /D WIN32 /D _WINDOWS /D __WINDOWS__ /D
> __WIN32__ /D EXEC_BACKEND /D WIN32_STACK_RLIMIT=4194304 /D
> _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D BUILDING_DLL
> /D _MBCS /GF /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope
> /Zc:inline /Fo".\Release\postgres\\" /Fd".\Release\postgres\vc142.pdb"
> /Gd /TC /wd4018 /wd4244 /wd4273 /wd4102 /wd4090 /wd4267 /FC
> /errorReport:queue /MP src/backend/catalog/partition.c

drongo is the closest thing we have in the buildfarm for this setup.
Here is the boring option part when it comes to Postgres compilation:
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=drongo=2020-11-10%2018%3A59%3A20=make
  C:\\Program Files (x86)\\Microsoft Visual
  
Studio\\2019\\BuildTools\\VC\\Tools\\MSVC\\14.23.28105\\bin\\HostX86\\x64\\CL.exe
  /c /Isrc/include /Isrc/include/port/win32
  /Isrc/include/port/win32_msvc /Ic:\\prog\\3p64\\include
  /I"C:\\Program Files\\OpenSSL-Win64\\include" /Isrc/backend /Zi
  /nologo /W3 /WX- /diagnostics:column /Ox /D WIN32 /D _WINDOWS /D
  __WINDOWS__ /D __WIN32__ /D WIN32_STACK_RLIMIT=4194304 /D
  _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D
  BUILDING_DLL /D _MBCS /GF /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t
  /Zc:forScope /Zc:inline /Fo".\\Release\\postgres"
  /Fd".\\Release\\postgres\\vc142.pdb" /Gd /TC /wd4018 /wd4244 /wd4273
  /wd4102 /wd4090 /wd4267 /FC /errorReport:queue /MP [ source files ]
 [...]

So except if I am missing something we have an exact match here.

> Test Fails:
> Laptop machine (Development)
> Microsoft Windows 10 Enterprise
> Version 10.0.19041 Build 19041
> Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29112 for x64
> 
> Compile args:
> C:\Program Files (x86)\Microsoft Visual
> Studio\2019\Enterprise\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\CL.exe
> /c /Isrc/include /Isrc/include/port/win32
> /Isrc/include/port/win32_msvc /Iexternals/zlib
> /Iexternals/openssl\include /Isrc/backend /Zi /nologo /W3 /WX-
> /diagnostics:column /Ox /D WIN32 /D _WINDOWS /D __WINDOWS__ /D
> __WIN32__ /D EXEC_BACKEND /D WIN32_STACK_RLIMIT=4194304 /D
> _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D BUILDING_DLL
> /D _MBCS /GF /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope
> /Zc:inline /Fo".\Release\postgres\\" /Fd".\Release\postgres\vc142.pdb"
> /Gd /TC /wd4018 /wd4244 /wd4273 /wd4102 /wd4090 /wd4267 /FC
> /errorReport:queue /MP src/backend/catalog/partition.c

And this configuration matches exactly what you have with the host
where the test passed.

Now I do see a difference in the Windows 10 build involved, 10.0.19041
fails but 10.0.18363 passes.  I find rather hard to buy that this is
directly a Postgres bug.  The compiler version is the same, so the
issue seems to be related to the way the code compiled is
interpreted.
--
Michael


signature.asc
Description: PGP signature

Re: Reduce the number of special cases to build contrib modules on windows

2020-11-10 Thread Michael Paquier

On Wed, Nov 11, 2020 at 11:01:57AM +1300, David Rowley wrote:
> I'm still working through some small differences in some of the
> .vcxproj files.  I've been comparing these by copying *.vcxproj out to
> another directory with patched and unpatched then diffing the
> directory. See attached txt file with those diffs. Here's a summary of
> some of them:

Thanks.  It would be good to not have those diffs if not necessary.

> 1. There are a few places that libpq gets linked where it previously did not.

It seems to me that your patch is doing the right thing for adminpack
and that its Makefile has no need to include a reference to libpq
source path, no?

For dblink and postgres_fdw, the duplication comes from PG_CPPFLAGS.
It does not matter much in practice, but it would be nice to not have
unnecessary data in the project files.  One thing that could be done
is to make Project.pm more aware of the uniqueness of the elements
included.  But, do we really need -I$(libpq_srcdir) there anyway?
From what I can see, we have all the paths in -I we'd actually need
with or without USE_PGXS.

> 2. REFINT_VERBOSE gets defined in a few more places than it did
> previously. This makes it closer to what happens on Linux anyway, if
> you look at the Make output from contrib/spi/Makefile you'll see
> -DREFINT_VERBOSE in there for autoinc.

Indeed.

> 3. LOWER_NODE gets defined in ltree now where it wasn't before.  It's
> defined on Linux. Unsure why it wasn't before on Windows.

Your patch is grabbing the value of PG_CPPFLAGS from ltree's
Makefile, which is fine.  We may be able to remove this flag and rely
on pg_tolower() instead in the long run?  I am not sure about
FLG_CANLOOKSIGN() though.
--
Michael

signature.asc
Description: PGP signature

Re: Prefer TG_TABLE_NAME over TG_RELNAME in tests

2020-11-10 Thread Michael Paquier

On Tue, Nov 10, 2020 at 09:50:15AM +0100, Andreas Karlsson wrote:
> I submitted patches to pgformatter, bucardo and ledgersmb. Both davical and
> sql-ledger only seems to have them in old upgrade scripts.

Thanks, Andreas!
--
Michael


signature.asc
Description: PGP signature

Re: cutting down the TODO list thread

2020-11-10 Thread Bruce Momjian

On Tue, Nov  3, 2020 at 02:06:13PM -0400, John Naylor wrote:
> I was thinking of not having the next updates during commitfest, but it could
> also be argued this is a type of review, and the things here will be returned
> with feedback or rejected, in a way. Ultimately, it comes down to "when time
> permits".

I don't understand what this is referring to.  Thanks for the rest of
the work.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Support for NSS as a libpq TLS backend

2020-11-10 Thread Daniel Gustafsson

> On 10 Nov 2020, at 21:11, Jacob Champion  wrote:
> On Nov 6, 2020, at 3:11 PM, Daniel Gustafsson  wrote:

>> The attached switches to SSL_ConfigServerSessionIDCacheWithOpt
>> with which one can explicitly make the cache non-shared, which in turn backs
>> the mutexes with NSPR locks rather than the missing sem_init.  Can you test
>> this version and see if that makes it work?
> 
> Yep, I get much farther through the tests with that patch.

Great, thanks for confirming.

> I'm currently
> diving into another assertion failure during socket disconnection:
> 
>Assertion failure: fd->secret == NULL, at prlayer.c:45
> 
> cURL has some ominously vague references to this [1], though I'm not
> sure that we should work around it in the same way without knowing what
> the cause is...

Digging through the archives from when this landed in curl, the assertion
failure was never fully identified back then but happened spuriously.  Which
version of NSPR is this happening with?

cheers ./daniel

Re: Reduce the number of special cases to build contrib modules on windows

2020-11-10 Thread David Rowley

On Tue, 10 Nov 2020 at 03:07, Alvaro Herrera  wrote:
>
> On 2020-Nov-06, David Rowley wrote:
>
> > +# Handle makefile rules for when file to be added to the project
> > +# does not exist.  Returns 1 when the original file add should be
> > +# skipped.
> > +sub FindAndAddAdditionalFiles
> > +{
> > + my $self = shift;
> > + my $fname = shift;
> > + my ($ext) = $fname =~ /(\.[^.]+)$/;
> > +
> > + # For .c files, check if a .l file of the same name exists and add 
> > that
> > + # too.
> > + if ($ext eq ".c")
> > + {
> > + my $filenoext = $fname;
> > + $filenoext =~ s{\.[^.]+$}{};
>
> I think you can make this simpler by capturing both the basename and the
> extension in one go.  For example,
>
>   $fname =~ /(.*)(\.[^.]+)$/;
>   $filenoext = $1;
>   $ext = $2;
>
> so you avoid the second =~ statement.

Thanks. That's neater.

> > + if (-e "$filenoext.l")
> > + {
> > + AddFileConditional($self, "$filenoext.l");
> > + return 1;
> > + }
> > + if (-e "$filenoext.y")
> > + {
> > + AddFileConditional($self, "$filenoext.y");
>
> Maybe DRY like
>
> for my $ext (".l", ".y") {
>   my $file = $filenoext . $ext;
>   AddFileConditional($self, $file) if -f $file;
>   return 1;
> }

I did adapt that part of the code, but not exactly to what's above.
The return there would cause us to return from the function after the
first iteration.

> Note: comment says "check if a .l file" and then checks both .l and .y.
> Probably want to update the comment ...

Updated.

I've attached the v3 patch.

I'm still working through some small differences in some of the
.vcxproj files.  I've been comparing these by copying *.vcxproj out to
another directory with patched and unpatched then diffing the
directory. See attached txt file with those diffs. Here's a summary of
some of them:

1. There are a few places that libpq gets linked where it previously did not.
2. REFINT_VERBOSE gets defined in a few more places than it did
previously. This makes it closer to what happens on Linux anyway, if
you look at the Make output from contrib/spi/Makefile you'll see
-DREFINT_VERBOSE in there for autoinc
3. LOWER_NODE gets defined in ltree now where it wasn't before.  It's
defined on Linux. Unsure why it wasn't before on Windows.

David
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 90594bd41b..a97edeb6eb 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -32,16 +32,13 @@ my $libpq;
 my @unlink_on_exit;
 
 # Set of variables for modules in contrib/ and src/test/modules/
-my $contrib_defines = { 'refint' => 'REFINT_VERBOSE' };
-my @contrib_uselibpq = ('dblink', 'oid2name', 'postgres_fdw', 'vacuumlo');
-my @contrib_uselibpgport   = ('oid2name', 'pg_standby', 'vacuumlo');
-my @contrib_uselibpgcommon = ('oid2name', 'pg_standby', 'vacuumlo');
+my $contrib_defines = {};
+my @contrib_uselibpq = ();
+my @contrib_uselibpgport   = ('pg_standby');
+my @contrib_uselibpgcommon = ('pg_standby');
 my $contrib_extralibs  = undef;
 my $contrib_extraincludes = { 'dblink' => ['src/backend'] };
-my $contrib_extrasource = {
-   'cube' => [ 'contrib/cube/cubescan.l', 'contrib/cube/cubeparse.y' ],
-   'seg'  => [ 'contrib/seg/segscan.l',   'contrib/seg/segparse.y' ],
-};
+my $contrib_extrasource = {};
 my @contrib_excludes = (
'bool_plperl',  'commit_ts',
'hstore_plperl','hstore_plpython',
@@ -951,6 +948,7 @@ sub AddContrib
my $subdir = shift;
my $n  = shift;
my $mf = Project::read_file("$subdir/$n/Makefile");
+   my @projects;
 
if ($mf =~ /^MODULE_big\s*=\s*(.*)$/mg)
{
@@ -958,6 +956,7 @@ sub AddContrib
my $proj = $solution->AddProject($dn, 'dll', 'contrib', 
"$subdir/$n");
$proj->AddReference($postgres);
AdjustContribProj($proj);
+   push @projects, $proj;
}
elsif ($mf =~ /^MODULES\s*=\s*(.*)$/mg)
{
@@ -969,18 +968,90 @@ sub AddContrib
$proj->AddFile("$subdir/$n/$filename");
$proj->AddReference($postgres);
AdjustContribProj($proj);
+   push @projects, $proj;
}
}
elsif ($mf =~ /^PROGRAM\s*=\s*(.*)$/mg)
{
my $proj = $solution->AddProject($1, 'exe', 'contrib', 
"$subdir/$n");
AdjustContribProj($proj);
+   push @projects, $proj;
}
else
{
croak "Could not determine contrib module type for $n\n";
}
 
+   # Process custom compiler flags
+   if ($mf =~ /^PG_CPPFLAGS\s*=\s*(.*)$/mg)
+   {
+   foreach my $flag (split /\s+/, $1)
+   {
+   if ($flag =~ /^-D(.*)$/)
+   {
+

Re: Additional Chapter for Tutorial

2020-11-10 Thread David G. Johnston

On Sun, Nov 8, 2020 at 8:56 AM Jürgen Purtz  wrote:

> Good catches. Everything applied.
>

MVCC Section

The first paragraph and example in the MVCC section is a good example but
seems misplaced - its relationship to MVCC generally is tenuous, rather I
would expect a discussion of the serializable isolation mode to follow.

I'm not sure how much detail this section wants to get into given the
coverage of concurrency elsewhere in the documentation.  "Not much" would
be my baseline.

I would suggest spelling out what "OLTP" stands for and ideally pointing
the user to the glossary for the term.

Tending more toward a style gripe but the amount of leader phrases and
redundancy are at a level that I am noticing them when I read this but do
not have the same impression having read large portions of documentation.
In particular:

"When we speak about transaction IDs, you need to know that xids are like
sequences."

"But keep in mind that xids are independent of any time measurement — in
milliseconds or otherwise. If you dive deeper into PostgreSQL, you will
recognize parameters with names such as 'xxx_age'. Despite their names,
these '_age' parameters do not specify a period of time but represent a
certain number of transactions, e.g., 100 million."

Could just be:  xids are sequences and age computations involving them
measure a transaction count as opposed to a time interval.

Then I would consider adding a bit more detail/context here.

xids are 32bit sequences, with a reserved value to handle wrap-around.
There are 4 billion values in the sequence but wrap-around handling must
occur every 2 billion transactions. Age computations involving xids measure
a transaction count as opposed to a time interval.

I would move the mentioning of "vacuum" to the main paragraph about delete
and not solely as a "keep in mind" note.

The part before the diagram seems like it should be much shorter, concise,
and provide links to the excellent documentation.  The part after the
image, and the image itself, are good material, though possibly should be
in a main administration chapter instead of an internals chapter.

The first bullet of "keep in mind" is both wordy and wrong - in particular
"as xids grow old row versions get out of scope over time" doesn't make
sense (or rather it only does in the context of wrap-around, not normal
visibility).  Having the only mention of bloat be here is also not ideal,
it too should be weaved into the main narrative.  The "keep in mind"
section here should be a recap of already covered material in a succinct
form, nothing should be new to someone who just read the entire section.

I don't think that usage of exclamation marks (!) is warranted here, though
emphasis on the key phrase wouldn't hurt.

Vacuum Section

avoid -> prevent (continued growth)

Autovacuum is enabled by default.  The whole note needs commas.

I'd try to get rid of "at arbitrary point in time"

"Instance." we've already described where instances are previously ("on the
server")

The other sections - these seem misplaced for the tutorial, update the main
documentation if this information is wholly missing or lacking.  The MVCC
chapter can incorporate overview information as it is a strict consequence
of that implementation.

Statistics belong elsewhere - the tutorial should not use poor command
implementation choices as a guide for user education.

In short, this whole section should not exist and its content moved to more
appropriate areas (mainly MVCC).  Vacuum is a tool that one must use but
the narrative should be about the system generally.

David J.

Re: Windows regress fails (latest HEAD)

2020-11-10 Thread Russell Foster

On Tue, Nov 10, 2020 at 1:52 PM Tomas Vondra
 wrote:
>
>
>
>
> On 11/10/20 7:15 PM, Tom Lane wrote:
> > Tomas Vondra  writes:
> >> That's unlikely, I think. The regression tests are constructed so that
> >> the estimates are stable. It's more likely this is some difference in
> >> rounding behavior, for example.
> >
> > The reported delta is in the actual row count, not an estimate.
> > How could that be subject to roundoff issues?  And there's no
> > delta in the Append's inputs, so this seems like it's a flat-out
> > miss of a row count in EXPLAIN ANALYZE.
> >
>
> My bad. I've not noticed it's EXPLAIN ANALYZE (COSTS OFF) so I thought
> it's estimates. You're right this can't be a roundoff error.
>
> >> I wonder which msvc builds are used on the machines that fail/pass
> >> the tests, and if the compiler flags are the same.
> >
> > Yeah, this might be a fruitful way to figure out "what's different
> > from the buildfarm".
> >
>
> Yeah. Also Russell claims to have two "same" machines out of which one
> works and the other one fails.

Never claimed they were the same, but they are both Windows x64. Here
are some more details:

Test Passes:
VM machine (Build Server)
Microsoft Windows 10 Pro
Version 10.0.18363 Build 18363
Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29112 for x64

Compile args:
C:\Program Files (x86)\Microsoft Visual
Studio\2019\Enterprise\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\CL.exe
/c /Isrc/include /Isrc/include/port/win32
/Isrc/include/port/win32_msvc /Iexternals/zlib
/Iexternals/openssl\include /Isrc/backend /Zi /nologo /W3 /WX-
/diagnostics:column /Ox /D WIN32 /D _WINDOWS /D __WINDOWS__ /D
__WIN32__ /D EXEC_BACKEND /D WIN32_STACK_RLIMIT=4194304 /D
_CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D BUILDING_DLL
/D _MBCS /GF /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope
/Zc:inline /Fo".\Release\postgres\\" /Fd".\Release\postgres\vc142.pdb"
/Gd /TC /wd4018 /wd4244 /wd4273 /wd4102 /wd4090 /wd4267 /FC
/errorReport:queue /MP src/backend/catalog/partition.c

Test Fails:
Laptop machine (Development)
Microsoft Windows 10 Enterprise
Version 10.0.19041 Build 19041
Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29112 for x64

Compile args:
C:\Program Files (x86)\Microsoft Visual
Studio\2019\Enterprise\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\CL.exe
/c /Isrc/include /Isrc/include/port/win32
/Isrc/include/port/win32_msvc /Iexternals/zlib
/Iexternals/openssl\include /Isrc/backend /Zi /nologo /W3 /WX-
/diagnostics:column /Ox /D WIN32 /D _WINDOWS /D __WINDOWS__ /D
__WIN32__ /D EXEC_BACKEND /D WIN32_STACK_RLIMIT=4194304 /D
_CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D BUILDING_DLL
/D _MBCS /GF /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope
/Zc:inline /Fo".\Release\postgres\\" /Fd".\Release\postgres\vc142.pdb"
/Gd /TC /wd4018 /wd4244 /wd4273 /wd4102 /wd4090 /wd4267 /FC
/errorReport:queue /MP src/backend/catalog/partition.c

>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Compiler versions are the same and the build args seem the same. Also,
tried with the latest REL_12_STABLE and REL_13_STABLE, and they both
fail on my dev machine as well.

I'm not going to pretend to know why the explain should be equal in
this case, but I wonder what difference it would make if the output
for both is equal and/or correct? From parttion_prune.out on the
failing machine, line 2810:

-- Multiple partitions
insert into tbl1 values (1001), (1010), (1011);
explain (analyze, costs off, summary off, timing off)
select * from tbl1 inner join tprt on tbl1.col1 > tprt.col1;
QUERY PLAN
--
 Nested Loop (actual rows=23 loops=1)
   ->  Seq Scan on tbl1 (actual rows=5 loops=1)
   ->  Append (actual rows=5 loops=5)
 ->  Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=5)
   Index Cond: (col1 < tbl1.col1)
 ->  Index Scan using tprt2_idx on tprt_2 (actual rows=3 loops=4)
   Index Cond: (col1 < tbl1.col1)
 ->  Index Scan using tprt3_idx on tprt_3 (actual rows=1 loops=2)
   Index Cond: (col1 < tbl1.col1)
 ->  Index Scan using tprt4_idx on tprt_4 (never executed)
   Index Cond: (col1 < tbl1.col1)
 ->  Index Scan using tprt5_idx on tprt_5 (never executed)
   Index Cond: (col1 < tbl1.col1)
 ->  Index Scan using tprt6_idx on tprt_6 (never executed)
   Index Cond: (col1 < tbl1.col1)
(15 rows)

explain (analyze, costs off, summary off, timing off)
select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
QUERY PLAN
--
 Nested Loop (actual rows=3 loops=1)
   ->  Seq Scan on tbl1 (actual rows=5 loops=1)
   ->  Append (actual rows=0 loops=5)
 ->  Index Scan using tprt1_idx on tprt_1

Re: Windows regress fails (latest HEAD)

2020-11-10 Thread Ranier Vilela

Em ter., 10 de nov. de 2020 às 14:16, Russell Foster <
russell.foster.cod...@gmail.com> escreveu:

> On Tue, Nov 10, 2020 at 12:03 PM Ranier Vilela 
> wrote:
> >
> > Em sex., 26 de jun. de 2020 às 08:21, Ranier Vilela 
> escreveu:
> >>
> >> Em qui., 11 de jun. de 2020 às 10:28, Ranier Vilela <
> ranier...@gmail.com> escreveu:
> >>>
> >>> Em qui., 11 de jun. de 2020 às 10:01, Andrew Dunstan <
> andrew.duns...@2ndquadrant.com> escreveu:
> 
> 
>  On 6/11/20 8:52 AM, Ranier Vilela wrote:
>  > Hi,
>  > Latest HEAD, fails with windows regress tests.
>  >
>  >  float8   ... FAILED  517 ms
>  >  partition_prune  ... FAILED 3085 ms
>  >
>  >
> 
>  The first thing you should do when you find this is to see if there
> is a
>  buildfarm report of the failure. If there isn't then try to work out
> why.
> >>>
> >>> Sorry, I will have to research the buildfarm, I have no reference to
> it.
> >>>
> 
> 
>  Also, when making a report like this, it is essential to let us know
>  things like:
> 
>    * which commit is causing the failure (git bisect is good for
> finding
>  this)
> >>>
> >>> Thanks for hit (git bisect).
> >>>
> 
>    * what Windows version you're testing on
> >>>
> >>> Windows 10 (2004)
> >>>
>    * which compiler you're using
> >>>
> >>> msvc 2019 (64 bits)
> >>
> >>
> >> Only for registry, if anyone else is using msvc 2019.
> >> I'm using latest msvc 2019 64 bits (16.6.0)
> >> Problably this is a compiler optimization bug.
> >> vcregress check with build DEBUG, pass all 200 tests.
> >
> > With the current HEAD, the regression float8 in release mode (msvc 2019
> 64 bits) is gone.
> > Maybe it's this commit:
> >
> https://github.com/postgres/postgres/commit/0aa8f764088ea0f36620ae2955fa6c54ec736c46
> >
> > But (partition_prune) persists.
> > partition_prune  ... FAILED
> >
> > regards,
> > Ranier Vilela
>
> I am also experiencing this issue on one of my Windows machines (x64)
> using 12.4. I believe this is new, possibly since 12.2. It doesn't
> occur on another machine though, which is strange. It appears to be
> the same diff output. Is it possible that the given result is also
> valid for this test?
>
Hi Russel,
In DEBUG mode, the issue is gone (all 202 tests pass).
You can be sure yourself.
I think that compiler code generation bug...

Ranier Vilela

Re: Support for NSS as a libpq TLS backend

2020-11-10 Thread Jacob Champion

On Nov 6, 2020, at 3:11 PM, Daniel Gustafsson  wrote:
> 
> The attached switches to SSL_ConfigServerSessionIDCacheWithOpt
> with which one can explicitly make the cache non-shared, which in turn backs
> the mutexes with NSPR locks rather than the missing sem_init.  Can you test
> this version and see if that makes it work?

Yep, I get much farther through the tests with that patch. I'm currently
diving into another assertion failure during socket disconnection:

Assertion failure: fd->secret == NULL, at prlayer.c:45

cURL has some ominously vague references to this [1], though I'm not
sure that we should work around it in the same way without knowing what
the cause is...

--Jacob

[1] https://github.com/curl/curl/blob/4d2f800/lib/vtls/nss.c#L1266

Re: Allow some recovery parameters to be changed with reload

2020-11-10 Thread Sergei Kornilov

Hello

> Even if PITR is commanded, crash recovery can run before starting
> archive recovery if the server was not gracefully shut down.

Hmm... Still not sure how it's possible. Both readRecoverySignalFile and 
validateRecoveryParameters are called early in StartupXLOG. If PITR was 
commanded - we follow PITR logic. If requested recovery stop point is before 
consistent recovery point we shutdown the database with another FATAL.
I mean such place:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/transam/xlog.c;h=9d3f1c12fc56f61da4d2b9bf08c54d31b9757ef7;hb=29be9983a64c011eac0b9ee29895cce71e15ea77#l6891
If we start recovery by any reason and a archive recovery was requested - we 
start archive recovery instead of crash recovery.

> I don't know. I just think that it is not proper that "ALTER SYSTEM" +
> config-reload causes server stop.

I got your point. How about pause the recovery process? Like proposed in 
https://commitfest.postgresql.org/30/2489/
For example,
* restore_command become empty on SIGHUP while PITR was requested
* we set recovery to pause
* if user call pg_wal_replay_resume and restore_command is still empty - we 
shutdown the database
* if user fix restore_command - we continue restore.

But it seems complicated if we just don't need special handling here. We still 
require restore_command to be set to start recovery. In case the user later 
wants to set the restore_command to empty - let's assume that's correct (FATAL 
if PITR target is after the end of local pg_wal, promote otherwise).

>>  Why not use local pg_wal? There may be already enough WAL.
>
> Mmm. If the file to read is in pg_wal, restore_command won't be
> executed in the first place?

Startup process will call restore_command in any case regardless of pg_wal 
content. (xlogarchive.c, RestoreArchivedFile)

> * When doing archive recovery, we always prefer an archived log file even
> * if a file of the same name exists in XLOGDIR.  The reason is that the
> * file in XLOGDIR could be an old, un-filled or partly-filled version
> * that was copied and restored as part of backing up $PGDATA.

regards, Sergei

Re: Windows regress fails (latest HEAD)

2020-11-10 Thread Russell Foster

On Tue, Nov 10, 2020 at 1:15 PM Tom Lane  wrote:
>
> Tomas Vondra  writes:
> > That's unlikely, I think. The regression tests are constructed so that
> > the estimates are stable. It's more likely this is some difference in
> > rounding behavior, for example.
>
> The reported delta is in the actual row count, not an estimate.
> How could that be subject to roundoff issues?  And there's no
> delta in the Append's inputs, so this seems like it's a flat-out
> miss of a row count in EXPLAIN ANALYZE.
>
> > I wonder which msvc builds are used on
> > the machines that fail/pass the tests, and if the compiler flags are the
> > same.
>
> Yeah, this might be a fruitful way to figure out "what's different
> from the buildfarm".
>
> regards, tom lane

Hmm..anyway I can help here? I don't believe I am using any special
compile options. I am using VS 2019. The thing is, both systems I have
use the same build. I call msvcvars to set up the environment:

"%ProgramFiles(x86)%\Microsoft Visual
Studio\2019\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" amd64

I also saw some recent changes have been made around these tests, so I
can try the latest too.

Re: proposal: possibility to read dumped table's name from file

2020-11-10 Thread Stephen Frost

Greetings,

* Pavel Stehule (pavel.steh...@gmail.com) wrote:
> rebase + minor change - using pg_get_line_buf instead pg_get_line_append

I started looking at this and went back through the thread and while I
tend to agree that JSON may not be a good choice for this, it's not the
only possible alternative.  There is no doubt that pg_dump is already a
sophisticated data export tool, and likely to continue to gain new
features, such that having a configuration file for it would be very
handy, but this clearly isn't really going in a direction that would
allow for that.

Perhaps this feature could co-exist with a full blown configuration for
pg_dump, but even then there's certainly issues with what's proposed-
how would you handle explicitly asking for a table which is named 
"  mytable" to be included or excluded?  Or a table which has a newline
in it?  Using a standardized format which supports the full range of
what we do in a table name, explicitly and clearly, would address these
issues and also give us the flexibility to extend the options which
could be used through the configuration file beyond just the filters in
the future.

Unlike for the pg_basebackup manifest, which we generate and read
entirely programatically, a config file for pg_dump would almost
certainly be updated manually (or, at least, parts of it would be and
perhaps other parts generated), which means it'd really be ideal to have
a proper way to support comments in it (something that the proposed
format also doesn't really get right- # must be the *first* character,
and you can only have whole-line comments..?), avoid extra unneeded
punctuation (or, at times, allow it- such as trailing commas in lists),
cleanly handle multi-line strings (consider the oft discussed idea
around having pg_dump support a WHERE clause for exporting data from
tables...), etc.

Overall, -1 from me on this approach.  Maybe it could be fixed up to
handle all the different names of objects that we support today
(something which, imv, is really a clear requirement for this feature to
be committed), but I suspect you'd end up half-way to yet another
configuration format when we could be working to support something like
TOML or maybe YAML... but if you want my 2c, TOML seems closer to what
we do for postgresql.conf and getting that over to something that's
standardized, while a crazy long shot, is a general nice idea, imv.

Thanks,

Stephen


signature.asc
Description: PGP signature

Re: [PATCH] Support negative indexes in split_part

2020-11-10 Thread Jacob Champion

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

Patch looks good to me. Seems like a useful feature, and I agree that the 
two-pass implementation makes the change very easy to review.

Quick note on test coverage: gcov marks the "needle not found" branch (the one 
marked `/* special case if fldsep not found at all */`) as being completely 
uncovered. I don't think that needs to gate this patch; it looks like it was 
uncovered before this feature was added.

Doc builds are currently failing due to what appears to be an xmllint failure:

/usr/bin/xmllint --path . --noout --valid postgres.sgml
error : Unknown IO error
postgres.sgml:21: warning: failed to load external entity 
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd;

but that doesn't have anything to do with this patch. Marking Ready for 
Committer. (I'm a little new to this myself, so someone please let me know if 
I'm jumping the gun.)

Thanks!
--Jacob

The new status of this patch is: Ready for Committer

Re: Strange behavior with polygon and NaN

2020-11-10 Thread Tom Lane

I spent some time looking this over, and have a few thoughts:

1. I think it's useful to split the test changes into two patches,
as I've done below: first, just add the additional row in point_tbl
and let the fallout from that happen, and then in the second patch
make the code changes.  This way, it's much clearer what the actual
behavioral changes are.  Some of them don't look right, either.
For instance, in the very first hunk in geometry.out, we have
this:

- (Infinity,1e+300) | {1,0,5}   |
NaN |NaN
+ (Infinity,1e+300) | {1,0,5}   |   
Infinity |   Infinity

which seems right, and also this:

- (1e+300,Infinity) | {1,-1,0}  |   
Infinity |   Infinity
- (1e+300,Infinity) | {-0.4,-1,-6}  |   
Infinity |   Infinity
- (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} |   
Infinity |   Infinity
+ (1e+300,Infinity) | {1,-1,0}  |
NaN |NaN
+ (1e+300,Infinity) | {-0.4,-1,-6}  |
NaN |NaN
+ (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} |
NaN |NaN

which does not.  Why aren't these distances infinite as well?
For instance, {1,-1,0} is the line "x = y".  We could argue about
whether it'd be sensible to return zero for the distance between that
and the point (inf,inf), but surely any point with one inf and one
finite coordinate must be an infinite distance away from that line.
There's nothing ill-defined about that situation.

2. Rather than coding around undesirable behavior of float8_min,
it seems like it's better to add a primitive to float.h that
does what you want, ie "NaN if either input is NaN, else the
smaller input".  This is more readable, and possibly more efficient
(depending on whether the compiler is smart enough to optimize
away redundant isnan checks).  I did that in the attached.

3. Looking for other calls of float8_min, I wonder why you did not
touch the bounding-box calculations in box_interpt_lseg() or
boxes_bound_box().

4. The line changes feel a bit weird, like there's no clear idea
of what a "valid" or "invalid" line is.  For instance the first
hunk in line_construct():

+   /* Avoid creating a valid line from an invalid point */
+   if (unlikely(isnan(pt->y)))
+   result->C = get_float8_nan();

Why's it appropriate to set C and only C to NaN?

5. But actually there's a bigger issue with that particular hunk.
This code branch is dealing with "draw a vertical line through this
point", so why should we care what the point's y coordinate is --- that
is, why is this particular change appropriate at all?  The usual rule as
I understand it is that if a function's result is determined by some of
its arguments independently of what another argument's value is, then it
doesn't matter if that one is NaN, you can still return the same result.

6. I'm a bit uncomfortable with the use of "bool isnan" in a couple
of places.  I think it's confusing to use that alongside the isnan()
macro.  Moreover, it's at least possible that some platforms implement
isnan() in a way that would break this usage.  The C spec specifically
says that isnan() is a macro not a function ... but it doesn't commit
to it being a macro-with-arguments.  I think "anynan" or something
like that would be a better choice of name.

[ a bit later... ] Indeed, I get a compile failure on gaur:

geo_ops.c: In function 'lseg_closept_lseg':
geo_ops.c:2906:17: error: called object 'isnan' is not a function
geo_ops.c:2906:32: error: called object 'isnan' is not a function
geo_ops.c:2916:16: error: called object 'isnan' is not a function
geo_ops.c:2924:16: error: called object 'isnan' is not a function
geo_ops.c: In function 'box_closept_point':
geo_ops.c:2989:16: error: called object 'isnan' is not a function
geo_ops.c:2992:16: error: called object 'isnan' is not a function
geo_ops.c:3004:16: error: called object 'isnan' is not a function
geo_ops.c:3014:16: error: called object 'isnan' is not a function
make: *** [geo_ops.o] Error 1

So that scenario isn't hypothetical.  Please rename the variables.

regards, tom lane

diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 93a8736a3f..76679bae8d 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -157,7 +157,7 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 << '(0.0, 0.0)';
 SELECT count(*) FROM point_tbl p WHERE p.f1 >> '(0.0, 0.0)';
  count 
 ---
- 3
+ 4
 (1 row)
 
 SELECT count(*) FROM point_tbl p WHERE p.f1 <^ '(0.0, 0.0)';
@@ -169,7 +169,7 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 <^ '(0.0, 0.0)';
 SELECT count(*) FROM point_tbl p WHERE

Re: SQL-standard function body

2020-11-10 Thread Georgios Kokolatos

Hi,

I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.

Cheers,
//Georgios

The new status of this patch is: Waiting on Author

Re: Error on failed COMMIT

2020-11-10 Thread Georgios Kokolatos

Hi,

I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.

Cheers,
//Georgios

The new status of this patch is: Waiting on Author

Re: WIP: System Versioned Temporal Table

2020-11-10 Thread Georgios Kokolatos

Hi,

just a quick comment that this patch fails on the cfbot.

Cheers,
//Georgios

Re: Allow an alias to be attached directly to a JOIN ... USING

2020-11-10 Thread Georgios Kokolatos

Hi,

I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.

Cheers,
//Georgios

The new status of this patch is: Waiting on Author

Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-11-10 Thread Georgios Kokolatos

Hi,

I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.

Cheers,
//Georgios

The new status of this patch is: Waiting on Author

Re: Get memory contexts of an arbitrary backend process

2020-11-10 Thread Georgios Kokolatos

Hi,

I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.

Cheers,
//Georgios

The new status of this patch is: Waiting on Author

Re: Display individual query in pg_stat_activity

2020-11-10 Thread Georgios Kokolatos

Hi,

I noticed that this patch is failing on the cfbot.
For this, I changed the status to: 'Waiting on Author'

Cheers,
Georgios

The new status of this patch is: Waiting on Author

Re: Add session statistics to pg_stat_database

2020-11-10 Thread Georgios Kokolatos

Hi,

I noticed that the cfbot fails for this patch.
For this, I am setting the status to: 'Waiting on Author'.

Cheers,
//Georgios

The new status of this patch is: Waiting on Author

Re: Windows regress fails (latest HEAD)

2020-11-10 Thread Tomas Vondra





On 11/10/20 7:15 PM, Tom Lane wrote:
> Tomas Vondra  writes:
>> That's unlikely, I think. The regression tests are constructed so that
>> the estimates are stable. It's more likely this is some difference in
>> rounding behavior, for example.
> 
> The reported delta is in the actual row count, not an estimate.
> How could that be subject to roundoff issues?  And there's no
> delta in the Append's inputs, so this seems like it's a flat-out
> miss of a row count in EXPLAIN ANALYZE.
> 

My bad. I've not noticed it's EXPLAIN ANALYZE (COSTS OFF) so I thought
it's estimates. You're right this can't be a roundoff error.

>> I wonder which msvc builds are used on the machines that fail/pass
>> the tests, and if the compiler flags are the same.
> 
> Yeah, this might be a fruitful way to figure out "what's different
> from the buildfarm".
> 

Yeah. Also Russell claims to have two "same" machines out of which one
works and the other one fails.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: proposal: schema variables

2020-11-10 Thread Pavel Stehule

čt 1. 10. 2020 v 7:08 odesílatel Pavel Stehule 
napsal:

>
>
> čt 1. 10. 2020 v 5:38 odesílatel Michael Paquier 
> napsal:
>
>> On Thu, Sep 24, 2020 at 08:49:50PM +0200, Pavel Stehule wrote:
>> > fixed patch attached
>>
>> It looks like there are again conflicts within setrefs.c.
>>
>
> fresh patch
>

rebase



> Regards
>
> Pavel
>
> --
>> Michael
>>
>


schema-variables-20201110.patch.gz
Description: application/gzip

Re: MultiXact\SLRU buffers configuration

2020-11-10 Thread Thomas Munro

On Wed, Nov 11, 2020 at 7:07 AM Tomas Vondra
 wrote:
> Seems we haven't made much progress in reproducing the issue :-( I guess
> we'll need to know more about the machine where this happens. Is there
> anything special about the hardware/config? Are you monitoring size of
> the pg_multixact directory?

Which release was the original problem seen on?

Re: [POC] Fast COPY FROM command for the table with foreign partitions

2020-11-10 Thread Tomas Vondra

Hi,

I needed to look at this patch while working on something related, and I
found it got broken by 6973533650c a couple days ago. So here's a fixed
version, to keep cfbot happy. I haven't done any serious review yet.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
>From 75ad201a09238c8fb69a22a85b2cde72838c2c84 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" 
Date: Tue, 10 Nov 2020 18:54:05 +0100
Subject: [PATCH 1/2] Move multi-insert decision logic into executor

When 0d5f05cde introduced support for using multi-insert mode when
copying into partitioned tables, it introduced single variable of
enum type CopyInsertMethod shared across all potential target
relations (partitions) that, along with some target relation
proprties, dictated whether to engage multi-insert mode for a given
target relation.

Move that decision logic into InitResultRelInfo which now sets a new
boolean field ri_usesMultiInsert of ResultRelInfo when a target
relation is first initialized.  That prevents repeated computation
of the same information in some cases, especially for partitions,
and the new arrangement results in slightly more readability.
---
 src/backend/commands/copy.c  | 152 +++
 src/backend/executor/execMain.c  |  52 +
 src/backend/executor/execPartition.c |   7 ++
 src/include/executor/execPartition.h |   2 +
 src/include/executor/executor.h  |   2 +
 src/include/nodes/execnodes.h|   9 +-
 6 files changed, 109 insertions(+), 115 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 115860a9d4..d882396d6f 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -85,16 +85,6 @@ typedef enum EolType
 	EOL_CRNL
 } EolType;
 
-/*
- * Represents the heap insert method to be used during COPY FROM.
- */
-typedef enum CopyInsertMethod
-{
-	CIM_SINGLE,	/* use table_tuple_insert or fdw routine */
-	CIM_MULTI,	/* always use table_multi_insert */
-	CIM_MULTI_CONDITIONAL		/* use table_multi_insert only if valid */
-} CopyInsertMethod;
-
 /*
  * This struct contains all the state variables used throughout a COPY
  * operation. For simplicity, we use the same struct for all variants of COPY,
@@ -2717,12 +2707,10 @@ CopyFrom(CopyState cstate)
 	CommandId	mycid = GetCurrentCommandId(true);
 	int			ti_options = 0; /* start with default options for insert */
 	BulkInsertState bistate = NULL;
-	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
 	uint64		processed = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
-	bool		leafpart_use_multi_insert = false;
 
 	Assert(cstate->rel);
 	Assert(list_length(cstate->range_table) == 1);
@@ -2832,6 +2820,30 @@ CopyFrom(CopyState cstate)
 	resultRelInfo = target_resultRelInfo = makeNode(ResultRelInfo);
 	ExecInitResultRelation(estate, resultRelInfo, 1);
 
+	Assert(target_resultRelInfo->ri_usesMultiInsert == false);
+
+	/*
+	 * It's generally more efficient to prepare a bunch of tuples for
+	 * insertion, and insert them in bulk, for example, with one
+	 * table_multi_insert() call than call table_tuple_insert() separately for
+	 * every tuple. However, there are a number of reasons why we might not be
+	 * able to do this.  For example, if there any volatile expressions in the
+	 * table's default values or in the statement's WHERE clause, which may
+	 * query the table we are inserting into, buffering tuples might produce
+	 * wrong results.  Also, the relation we are trying to insert into itself
+	 * may not be amenable to buffered inserts.
+	 *
+	 * Note: For partitions, this flag is set considering the target table's
+	 * flag that is being set here and partition's own properties which are
+	 * checked by calling ExecRelationAllowsMultiInsert().  It does not matter
+	 * whether partitions have any volatile default expressions as we use the
+	 * defaults from the target of the COPY command.
+	 */
+	if (!cstate->volatile_defexprs &&
+		!contain_volatile_functions(cstate->whereClause))
+		target_resultRelInfo->ri_usesMultiInsert =
+	ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL);
+
 	/* Verify the named relation is a valid target for INSERT */
 	CheckValidResultRel(resultRelInfo, CMD_INSERT);
 
@@ -2847,6 +2859,10 @@ CopyFrom(CopyState cstate)
 	mtstate->operation = CMD_INSERT;
 	mtstate->resultRelInfo = resultRelInfo;
 
+	/*
+	 * Init COPY into foreign table. Initialization of copying into foreign
+	 * partitions will be done later.
+	 */
 	if (resultRelInfo->ri_FdwRoutine != NULL &&
 		resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
 		resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
@@ -2879,83 +2895,9 @@ CopyFrom(CopyState cstate)
 		cstate->qualexpr = ExecInitQual(castNode(List, cstate->whereClause),
 		>ps);
 
-	/*
-	 * It's generally more efficient to prepare a bunch of tuples for
-	 * insertion,

Re: Windows regress fails (latest HEAD)

2020-11-10 Thread Tom Lane

Tomas Vondra  writes:
> That's unlikely, I think. The regression tests are constructed so that
> the estimates are stable. It's more likely this is some difference in
> rounding behavior, for example.

The reported delta is in the actual row count, not an estimate.
How could that be subject to roundoff issues?  And there's no
delta in the Append's inputs, so this seems like it's a flat-out
miss of a row count in EXPLAIN ANALYZE.

> I wonder which msvc builds are used on
> the machines that fail/pass the tests, and if the compiler flags are the
> same.

Yeah, this might be a fruitful way to figure out "what's different
from the buildfarm".

regards, tom lane

Re: MultiXact\SLRU buffers configuration

2020-11-10 Thread Tomas Vondra




On 11/10/20 7:16 AM, Andrey Borodin wrote:
> 
> 
>> 10 нояб. 2020 г., в 05:13, Tomas Vondra  
>> написал(а):
>> After the issue reported in [1] got fixed, I've restarted the multi-xact
>> stress test, hoping to reproduce the issue. But so far no luck :-(
> 
> 
> Tomas, many thanks for looking into this. I figured out that to make 
> multixact sets bigger transactions must hang for a while and lock large set 
> of tuples. But not continuous range to avoid locking on buffer_content.
> I did not manage to implement this via pgbench, that's why I was trying to 
> hack on separate go program. But, essentially, no luck either.
> I was observing something resemblant though
> 
> пятница,  8 мая 2020 г. 15:08:37 (every 
> 1s)
> 
>   pid  | wait_event | wait_event_type | state  |  
>  query
> ---++-++
>  41344 | ClientRead | Client  | idle   | insert into 
> t1 select generate_series(1,100,1)
>  41375 | MultiXactOffsetControlLock | LWLock  | active | select * 
> from t1 where i = ANY ($1) for share
>  41377 | MultiXactOffsetControlLock | LWLock  | active | select * 
> from t1 where i = ANY ($1) for share
>  41378 || | active | select * 
> from t1 where i = ANY ($1) for share
>  41379 | MultiXactOffsetControlLock | LWLock  | active | select * 
> from t1 where i = ANY ($1) for share
>  41381 || | active | select * 
> from t1 where i = ANY ($1) for share
>  41383 | MultiXactOffsetControlLock | LWLock  | active | select * 
> from t1 where i = ANY ($1) for share
>  41385 | MultiXactOffsetControlLock | LWLock  | active | select * 
> from t1 where i = ANY ($1) for share
> (8 rows)
> 
> but this picture was not stable.
> 

Seems we haven't made much progress in reproducing the issue :-( I guess
we'll need to know more about the machine where this happens. Is there
anything special about the hardware/config? Are you monitoring size of
the pg_multixact directory?

> How do you collect wait events for aggregation? just insert into some table 
> with cron?
> 

No, I have a simple shell script (attached) sampling data from
pg_stat_activity regularly. Then I load it into a table and aggregate to
get a summary.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


collect-wait-events.sh
Description: application/shellscript

Re: Windows regress fails (latest HEAD)

2020-11-10 Thread Tomas Vondra



On 11/10/20 6:16 PM, Russell Foster wrote:
> On Tue, Nov 10, 2020 at 12:03 PM Ranier Vilela  wrote:
>>
>> Em sex., 26 de jun. de 2020 às 08:21, Ranier Vilela  
>> escreveu:
>>>
>>> Em qui., 11 de jun. de 2020 às 10:28, Ranier Vilela  
>>> escreveu:

 Em qui., 11 de jun. de 2020 às 10:01, Andrew Dunstan 
  escreveu:
>
>
> On 6/11/20 8:52 AM, Ranier Vilela wrote:
>> Hi,
>> Latest HEAD, fails with windows regress tests.
>>
>>  float8   ... FAILED  517 ms
>>  partition_prune  ... FAILED 3085 ms
>>
>>
>
> The first thing you should do when you find this is to see if there is a
> buildfarm report of the failure. If there isn't then try to work out why.

 Sorry, I will have to research the buildfarm, I have no reference to it.

>
>
> Also, when making a report like this, it is essential to let us know
> things like:
>
>   * which commit is causing the failure (git bisect is good for finding
> this)

 Thanks for hit (git bisect).

>
>   * what Windows version you're testing on

 Windows 10 (2004)

>   * which compiler you're using

 msvc 2019 (64 bits)
>>>
>>>
>>> Only for registry, if anyone else is using msvc 2019.
>>> I'm using latest msvc 2019 64 bits (16.6.0)
>>> Problably this is a compiler optimization bug.
>>> vcregress check with build DEBUG, pass all 200 tests.
>>
>> With the current HEAD, the regression float8 in release mode (msvc 2019 64 
>> bits) is gone.
>> Maybe it's this commit:
>> https://github.com/postgres/postgres/commit/0aa8f764088ea0f36620ae2955fa6c54ec736c46
>>
>> But (partition_prune) persists.
>> partition_prune  ... FAILED
>>
>> regards,
>> Ranier Vilela
> 
> I am also experiencing this issue on one of my Windows machines (x64)
> using 12.4. I believe this is new, possibly since 12.2. It doesn't
> occur on another machine though, which is strange. It appears to be
> the same diff output. Is it possible that the given result is also
> valid for this test?
> 

That's unlikely, I think. The regression tests are constructed so that
the estimates are stable. It's more likely this is some difference in
rounding behavior, for example. I wonder which msvc builds are used on
the machines that fail/pass the tests, and if the compiler flags are the
same.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Windows regress fails (latest HEAD)

2020-11-10 Thread Russell Foster

On Tue, Nov 10, 2020 at 12:03 PM Ranier Vilela  wrote:
>
> Em sex., 26 de jun. de 2020 às 08:21, Ranier Vilela  
> escreveu:
>>
>> Em qui., 11 de jun. de 2020 às 10:28, Ranier Vilela  
>> escreveu:
>>>
>>> Em qui., 11 de jun. de 2020 às 10:01, Andrew Dunstan 
>>>  escreveu:


 On 6/11/20 8:52 AM, Ranier Vilela wrote:
 > Hi,
 > Latest HEAD, fails with windows regress tests.
 >
 >  float8   ... FAILED  517 ms
 >  partition_prune  ... FAILED 3085 ms
 >
 >

 The first thing you should do when you find this is to see if there is a
 buildfarm report of the failure. If there isn't then try to work out why.
>>>
>>> Sorry, I will have to research the buildfarm, I have no reference to it.
>>>


 Also, when making a report like this, it is essential to let us know
 things like:

   * which commit is causing the failure (git bisect is good for finding
 this)
>>>
>>> Thanks for hit (git bisect).
>>>

   * what Windows version you're testing on
>>>
>>> Windows 10 (2004)
>>>
   * which compiler you're using
>>>
>>> msvc 2019 (64 bits)
>>
>>
>> Only for registry, if anyone else is using msvc 2019.
>> I'm using latest msvc 2019 64 bits (16.6.0)
>> Problably this is a compiler optimization bug.
>> vcregress check with build DEBUG, pass all 200 tests.
>
> With the current HEAD, the regression float8 in release mode (msvc 2019 64 
> bits) is gone.
> Maybe it's this commit:
> https://github.com/postgres/postgres/commit/0aa8f764088ea0f36620ae2955fa6c54ec736c46
>
> But (partition_prune) persists.
> partition_prune  ... FAILED
>
> regards,
> Ranier Vilela

I am also experiencing this issue on one of my Windows machines (x64)
using 12.4. I believe this is new, possibly since 12.2. It doesn't
occur on another machine though, which is strange. It appears to be
the same diff output. Is it possible that the given result is also
valid for this test?

Russell


regression.diffs
Description: Binary data

Re: POC: postgres_fdw insert batching

2020-11-10 Thread Tomas Vondra




On 11/10/20 4:05 PM, Tomas Vondra wrote:
> Hi,
> 
> Thanks for working on this!
> 
> On 11/10/20 1:45 AM, tsunakawa.ta...@fujitsu.com wrote:
>> Hello,
>>
>>
>> The attached patch implements the new bulk insert routine for
>> postgres_fdw and the executor utilizing it.  It passes make
>> check-world.
>>
> 
> I haven't done any testing yet, just a quick review.
> 
> I see the patch builds the "bulk" query in execute_foreign_modify. IMO
> that's something we should do earlier, when we're building the simple
> query (for 1-row inserts). I'd understand if you were concerned about
> overhead in case of 1-row inserts, trying to not plan the bulk query
> until necessary, but I'm not sure this actually helps.
> 
> Or was the goal to build a query for every possible number of slots? I
> don't think that's really useful, considering it requires deallocating
> the old plan, preparing a new one, etc. IMO it should be sufficient to
> have two queries - one for 1-row inserts, one for the full batch. The
> last incomplete batch can be inserted using a loop of 1-row queries.
> 
> That's what my patch was doing, but I'm not insisting on that - it just
> seems like a better approach to me. So feel free to argue why this is
> better.
> 
> 
>> I measured performance in a basic non-partitioned case by modifying
>> Tomas-san's scripts.  They perform an INSERT SELECT statement that
>> copies one million records.  The table consists of two integer
>> columns, with a primary key on one of those them.  You can run the
>> attached prepare.sql to set up once.  local.sql inserts to the table
>> directly, while fdw.sql inserts through a foreign table.
>>
>> The performance results, the average time of 5 runs,  were as follows
>> on a Linux host where the average round-trip time of "ping localhost"
>> was 34 us:
>>
>> master, local: 6.1 seconds master, fdw: 125.3 seconds patched, fdw:
>> 11.1 seconds (11x improvement)
>>
> 
> Nice. I think we can't really get much closer to local master, so 6.1
> vs. 11.1 seconds look quite acceptable.
> 
>>
>> The patch accumulates at most 100 records in ModifyTableState before
>> inserting in bulk.  Also, when an input record is targeted for a
>> different relation (= partition) than that for already accumulated
>> records, insert the accumulated records and store the new record for
>> later insert.
>>
>> [Issues]
>>
>> 1. Do we want a GUC parameter, say, max_bulk_insert_records =
>> (integer), to control the number of records inserted at once? The
>> range of allowed values would be between 1 and 1,000.  1 disables
>> bulk insert. The possible reason of the need for this kind of
>> parameter would be to limit the amount of memory used for accumulated
>> records, which could be prohibitively large if each record is big.  I
>> don't think this is a must, but I think we can have it.
>>
> 
> I think it'd be good to have such GUC, even if only for testing and
> development. We should probably have a way to disable the batching,
> which the GUC could also do, I think. So +1 to have the GUC.
> 
>> 2. Should we accumulate records per relation in ResultRelInfo
>> instead? That is, when inserting into a partitioned table that has
>> foreign partitions, delay insertion until a certain number of input
>> records accumulate, and then insert accumulated records per relation
>> (e.g., 50 records to relation A, 30 records to relation B, and 20
>> records to relation C.)  If we do that,
>>
> 
> I think there's a chunk of text missing here? If we do that, then what?
> 
> Anyway, I don't see why accumulating the records in ResultRelInfo would
> be better than what the patch does now. It seems to me like fairly
> specific to FDWs, so keeping it int FDW state seems appropriate. What
> would be the advantage of stashing it in ResultRelInfo?
> 
>> * The order of insertion differs from the order of input records.  Is
>> it OK?
>>
> 
> I think that's OK for most use cases, and if it's not (e.g. when there's
> something requiring the exact order of writes) then it's not possible to
> use batching. That's one of the reasons why I think we should have a GUC
> to disable the batching.
> 
>> * Should the maximum count of accumulated records be applied per
>> relation or the query? When many foreign partitions belong to a
>> partitioned table, if the former is chosen, it may use much memory in
>> total.  If the latter is chosen, the records per relation could be
>> few and thus the benefit of bulk insert could be small.
>>
> 
> I think it needs to be applied per relation, because that's the level at
> which we can do it easily and consistently. The whole point is to send
> data in sufficiently large chunks to minimize the communication overhead
> (latency etc.), but if you enforce it "per query" that seems hard.
> 
> Imagine you're inserting data into a table with many partitions - how do
> you pick the number of rows to accumulate? The table may have 10 or 1000
> partitions, we may be inserting into all partitions or just a

Re: [PATCH] Add features to pg_stat_statements

2020-11-10 Thread Fujii Masao





On 2020/11/09 18:02, Seino Yuki wrote:

2020-11-09 15:39 に Seino Yuki さんは書きました:


However, let me confirm the following.

Is this information really useful?
If there is no valid use case for this, I'd like to drop it.
Thought?


I thought it would be easy for users to see at a glance that if there is a case 
I assumed,
if the last modified date and time is old, there is no need to adjust at all, 
and if the
last modified date and time is recent, it would be easy for users to understand 
that the
parameters need to be adjusted.
What do you think?


Even without the last deallocation timestamp, you can presume
when the deallocation of entries happened by periodically monitoring
the total number of deallocations and seeing those history. Or IMO
it's better to log whenever the deallocation happens as proposed upthread.
Then you can easily check the history of occurrences of deallocations
from the log file.

Regards,


+1.I think you should output the deallocation history to the log as well.

In light of the above, I've posted a patch that reflects the points made.

Regards,


Sorry. There was a typo in the documentation correction.
I'll post a patch to fix it.


Thanks for updating the patch!

+   pgss_info->dealloc = 0;
+   SpinLockInit(_info->mutex);
+   Assert(pgss_info->dealloc == 0);

Why is this assertion check necessary? It seems not necessary.

+   {
+   Assert(found == found_info);

Having pgssSharedState and pgssInfoCounters separately might make
the code a bit more complicated like the above? If this is true, what about
including pgssInfoCounters in pgssSharedState?

PGSS_FILE_HEADER needs to be changed since the patch changes
the format of pgss file?

+   /* Read pgss_info */
+   if (feof(file) == 0)
+   if (fread(pgss_info, sizeof(pgssInfoCounters), 1, file) != 1)
+   goto read_error;

Why does feof(file) need to be called here?

+pgss_info_update(void)
+{
+   {

Why is the second "{" necessary? It seems redundant.

+pgss_info_reset(void)
+{
+   {

Same as above.

+pg_stat_statements_info(PG_FUNCTION_ARGS)
+{
+   int64   d_count = 0;
+   {

Same as above.

+   SpinLockAcquire(>mutex);
+   d_count = Int64GetDatum(c->dealloc);
+   SpinLockRelease(>mutex);

Why does Int64GetDatum() need to be called here? It seems not necessary.

+   
+
+ pg_stat_statements_info() returns bigint
+ 
+  pg_stat_statements_info
+ 
+

Isn't it better not to expose pg_stat_statements_info() function in the
document because pg_stat_statements_info view is enough and there
seems no use case for the function?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: Misuse of TimestampDifference() in the autoprewarm feature of pg_prewarm

2020-11-10 Thread Alexey Kondratov


On 2020-11-09 23:25, Tom Lane wrote:

Alexey Kondratov  writes:

On 2020-11-09 21:53, Tom Lane wrote:
0002 seems like a pretty clear bug fix, though I wonder if this is 
exactly
what we want to do going forward.  It seems like a very large 
fraction of
the callers of TimestampDifference would like to have the value in 
msec,

which means we're doing a whole lot of expensive and error-prone
arithmetic to break down the difference to sec/usec and then put it
back together again.  Let's get rid of that by inventing, say
TimestampDifferenceMilliseconds(...).



Yeah, I get into this problem after a bug in another extension —
pg_wait_sampling. I have attached 0002, which implements
TimestampDifferenceMilliseconds(), so 0003 just uses this new function
to solve the initial issues. If it looks good to you, then we can 
switch

all similar callers to it.


Yeah, let's move forward with that --- in fact, I'm inclined to
back-patch it.  (Not till the current release cycle is done, though.
I don't find this important enough to justify a last-moment patch.)

BTW, I wonder if we shouldn't make TimestampDifferenceMilliseconds
round any fractional millisecond up rather than down.  Rounding down
seems to create a hazard of uselessly waking just before the delay is
completed.  Better to wake just after.



Yes, it make sense. I have changed TimestampDifferenceMilliseconds() to 
round result up if there is a reminder.


After looking on the autoprewarm code more closely I have realised that 
this 'double dump' issues was not an issues at all. I have just 
misplaced a debug elog(), so its second output in the log was only 
indicating that we calculated delay_in_ms one more time. Actually, even 
with wrong calculation of delay_in_ms the only problem was that we were 
busy looping with ~1 second interval instead of waiting on latch.


It is still a buggy behaviour, but much less harmful than I have 
originally thought.



Regards
--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres CompanyFrom ce09103d9d58b611728b66366cd24e8a4069f7ac Mon Sep 17 00:00:00 2001
From: Alexey Kondratov 
Date: Mon, 9 Nov 2020 19:04:10 +0300
Subject: [PATCH v3 3/3] pg_prewarm: add tap test for autoprewarm feature

---
 contrib/pg_prewarm/Makefile |  2 +
 contrib/pg_prewarm/t/001_autoprewarm.pl | 59 +
 2 files changed, 61 insertions(+)
 create mode 100644 contrib/pg_prewarm/t/001_autoprewarm.pl

diff --git a/contrib/pg_prewarm/Makefile b/contrib/pg_prewarm/Makefile
index b13ac3c813..9cfde8c4e4 100644
--- a/contrib/pg_prewarm/Makefile
+++ b/contrib/pg_prewarm/Makefile
@@ -10,6 +10,8 @@ EXTENSION = pg_prewarm
 DATA = pg_prewarm--1.1--1.2.sql pg_prewarm--1.1.sql pg_prewarm--1.0--1.1.sql
 PGFILEDESC = "pg_prewarm - preload relation data into system buffer cache"
 
+TAP_TESTS = 1
+
 ifdef USE_PGXS
 PG_CONFIG = pg_config
 PGXS := $(shell $(PG_CONFIG) --pgxs)
diff --git a/contrib/pg_prewarm/t/001_autoprewarm.pl b/contrib/pg_prewarm/t/001_autoprewarm.pl
new file mode 100644
index 00..f55b2a5352
--- /dev/null
+++ b/contrib/pg_prewarm/t/001_autoprewarm.pl
@@ -0,0 +1,59 @@
+#
+# Check that pg_prewarm can dump blocks from shared buffers
+# to PGDATA/autoprewarm.blocks.
+#
+
+use strict;
+use Test::More;
+use TestLib;
+use Time::HiRes qw(usleep);
+use warnings;
+
+use PostgresNode;
+
+plan tests => 3;
+
+# Wait up to 180s for pg_prewarm to dump blocks.
+sub wait_for_dump
+{
+	my $path = shift;
+
+	foreach my $i (0 .. 1800)
+	{
+		last if -e $path;
+		usleep(100_000);
+	}
+}
+
+my $node = get_new_node("node");
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq(
+shared_preload_libraries = 'pg_prewarm'
+pg_prewarm.autoprewarm = 'on'
+pg_prewarm.autoprewarm_interval = 1
+));
+$node->start;
+
+my $blocks_path = $node->data_dir . '/autoprewarm.blocks';
+
+# Check that we can dump blocks on timeout.
+wait_for_dump($blocks_path);
+ok(-e $blocks_path, 'file autoprewarm.blocks should be present in the PGDATA');
+
+# Check that we can dump blocks on shutdown.
+$node->stop;
+$node->append_conf(
+	'postgresql.conf', qq(
+pg_prewarm.autoprewarm_interval = 0
+));
+
+# Remove autoprewarm.blocks
+unlink($blocks_path) || die "$blocks_path: $!";
+ok(!-e $blocks_path, 'sanity check, dump on timeout is turned off');
+
+$node->start;
+$node->stop;
+
+wait_for_dump($blocks_path);
+ok(-e $blocks_path, 'file autoprewarm.blocks should be present in the PGDATA after clean shutdown');
-- 
2.19.1

From fba212ed765c8c411db1ca19c2ac991662109d99 Mon Sep 17 00:00:00 2001
From: Alexey Kondratov 
Date: Mon, 9 Nov 2020 19:12:00 +0300
Subject: [PATCH v3 2/3] pg_prewarm: fix autoprewarm_interval behaviour

Previously it misused seconds from TimestampDifference() as
milliseconds, so it was busy looping with ~1 second interval
instead of waiting on latch with default autoprewarm_interval = 300s.
---
 contrib/pg_prewarm/autoprewarm.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git

Re: Error on failed COMMIT

2020-11-10 Thread Dave Cramer

On Mon, 9 Nov 2020 at 16:26, Dave Cramer  wrote:

>
>
> On Wed, 30 Sep 2020 at 18:14, Andrew Dunstan <
> andrew.duns...@2ndquadrant.com> wrote:
>
>>
>> On 8/4/20 12:19 PM, Dave Cramer wrote:
>> > Attached is the rebased patch for consideration.
>> >
>> >
>>
>>
>> It's a bit sad this has been hanging around so long without attention.
>>
>>
>> The previous discussion seems to give the patch a clean bill of health
>> for most/all of the native drivers. Are there any implications for libpq
>> based drivers such as DBD::Pg and psycopg2? How about for ecpg?
>>
>>
>> cheers
>>
>>
>> andrew
>>
>>
>> --
>> Andrew Dunstanhttps://www.2ndQuadrant.com
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>
>
> Attached is a rebased patch with fixes for the isolation tests
>
>

>
> Dave Cramer
> www.postgres.rocks
>


0001-Throw-error-and-rollback-on-a-failed-transaction-ins.patch
Description: Binary data

Re: ModifyTable overheads in generic plans

2020-11-10 Thread Heikki Linnakangas


On 10/11/2020 13:12, Amit Langote wrote:

On Wed, Nov 4, 2020 at 11:32 AM Amit Langote  wrote:

On Tue, Nov 3, 2020 at 9:05 PM Heikki Linnakangas  wrote:

A) We could change FDW API so that BeginDirectModify takes the same
arguments as BeginForeignModify(). That avoids the assumption that it's
a ForeignScan node, because BeginForeignModify() doesn't take
ForeignScanState as argument. That would be consistent, which is nice.
But I think we'd somehow still need to pass the ResultRelInfo to the
corresponding ForeignScan, and I'm not sure how.


Maybe ForeignScan doesn't need to contain any result relation info
then?  ForeignScan.operation != CMD_SELECT is enough to tell it to
call IterateDirectModify() as today.


Hmm, I misspoke.   We do still need ForeignScanState.resultRelInfo,
because the IterateDirectModify() API uses it to return the remotely
inserted/updated/deleted tuple for the RETURNING projection performed
by ExecModifyTable().


B) Look up the ResultRelInfo, and call BeginDirectModify(), on the first
call to ForeignNext().

C) Accept the Assertion. And add an elog() check in the planner for that
with a proper error message.

I'm leaning towards B), but maybe there's some better solution I didn't
think of?   Perhaps changing the API would make sense in any case, it is a
bit weird as it is. Backwards-incompatible API changes are not nice, but
I don't think there are many FDWs out there that implement the
DirectModify functions. And those functions are pretty tightly coupled
with the executor and ModifyTable node details anyway, so I don't feel
like we can, or need to, guarantee that they stay unchanged across major
versions.


B is not too bad, but I tend to prefer doing A too.


On second thought, it seems A would amount to merely a cosmetic
adjustment of the API, nothing more.  B seems to get the job done for
me and also doesn't unnecessarily break compatibility, so I've updated
0001 to implement B.  Please give it a look.


Looks good at a quick glance. It is a small API break that 
BeginDirectModify() is now called during execution, not at executor 
startup, but I don't think that's going to break FDWs in practice. One 
could argue, though, that if we're going to change the API, we should do 
it more loudly. So changing the arguments might be a good thing.


The BeginDirectModify() and BeginForeignModify() interfaces are 
inconsistent, but that's not this patch's fault. I wonder if we could 
move the call to BeginForeignModify() also to ForeignNext(), though? And 
BeginForeignScan() too, while we're at it.


Overall, this is probably fine as it is though. I'll review more 
thorougly tomorrow.


- Heikki

Re: Multi Inserts in CREATE TABLE AS - revived patch

2020-11-10 Thread Justin Pryzby

On Tue, Nov 10, 2020 at 10:17:15AM +, Paul Guo wrote:
> Raw insert could be used on CTAS & Create MatView. For Refresh MatView the 
> code is a bit
> different. I did not spend more time on this so not sure raw insert could be 
> used for that.
> 
> But I think the previous multi insert work could be still used in at least 
> "INSERT tbl SELECT…” (if the INSERT
> is a simple one, e.g. no trigger, no index, etc).

Note that I've started that idea on another thread:
https://commitfest.postgresql.org/30/2553/
 - should INSERT SELECT use a BulkInsertState? (and multi_insert)

There's also this one:
https://commitfest.postgresql.org/30/2818/
 - split copy.c, Heikki

-- 
Justin

Re: POC: postgres_fdw insert batching

2020-11-10 Thread Tomas Vondra

Hi,

Thanks for working on this!

On 11/10/20 1:45 AM, tsunakawa.ta...@fujitsu.com wrote:
> Hello,
> 
> 
> The attached patch implements the new bulk insert routine for
> postgres_fdw and the executor utilizing it.  It passes make
> check-world.
> 

I haven't done any testing yet, just a quick review.

I see the patch builds the "bulk" query in execute_foreign_modify. IMO
that's something we should do earlier, when we're building the simple
query (for 1-row inserts). I'd understand if you were concerned about
overhead in case of 1-row inserts, trying to not plan the bulk query
until necessary, but I'm not sure this actually helps.

Or was the goal to build a query for every possible number of slots? I
don't think that's really useful, considering it requires deallocating
the old plan, preparing a new one, etc. IMO it should be sufficient to
have two queries - one for 1-row inserts, one for the full batch. The
last incomplete batch can be inserted using a loop of 1-row queries.

That's what my patch was doing, but I'm not insisting on that - it just
seems like a better approach to me. So feel free to argue why this is
better.

> I measured performance in a basic non-partitioned case by modifying
> Tomas-san's scripts.  They perform an INSERT SELECT statement that
> copies one million records.  The table consists of two integer
> columns, with a primary key on one of those them.  You can run the
> attached prepare.sql to set up once.  local.sql inserts to the table
> directly, while fdw.sql inserts through a foreign table.
> 
> The performance results, the average time of 5 runs,  were as follows
> on a Linux host where the average round-trip time of "ping localhost"
> was 34 us:
> 
> master, local: 6.1 seconds master, fdw: 125.3 seconds patched, fdw:
> 11.1 seconds (11x improvement)
> 

Nice. I think we can't really get much closer to local master, so 6.1
vs. 11.1 seconds look quite acceptable.

> 
> The patch accumulates at most 100 records in ModifyTableState before
> inserting in bulk.  Also, when an input record is targeted for a
> different relation (= partition) than that for already accumulated
> records, insert the accumulated records and store the new record for
> later insert.
> 
> [Issues]
> 
> 1. Do we want a GUC parameter, say, max_bulk_insert_records =
> (integer), to control the number of records inserted at once? The
> range of allowed values would be between 1 and 1,000.  1 disables
> bulk insert. The possible reason of the need for this kind of
> parameter would be to limit the amount of memory used for accumulated
> records, which could be prohibitively large if each record is big.  I
> don't think this is a must, but I think we can have it.
> 

I think it'd be good to have such GUC, even if only for testing and
development. We should probably have a way to disable the batching,
which the GUC could also do, I think. So +1 to have the GUC.

> 2. Should we accumulate records per relation in ResultRelInfo
> instead? That is, when inserting into a partitioned table that has
> foreign partitions, delay insertion until a certain number of input
> records accumulate, and then insert accumulated records per relation
> (e.g., 50 records to relation A, 30 records to relation B, and 20
> records to relation C.)  If we do that,
> 

I think there's a chunk of text missing here? If we do that, then what?

Anyway, I don't see why accumulating the records in ResultRelInfo would
be better than what the patch does now. It seems to me like fairly
specific to FDWs, so keeping it int FDW state seems appropriate. What
would be the advantage of stashing it in ResultRelInfo?

> * The order of insertion differs from the order of input records.  Is
> it OK?
> 

I think that's OK for most use cases, and if it's not (e.g. when there's
something requiring the exact order of writes) then it's not possible to
use batching. That's one of the reasons why I think we should have a GUC
to disable the batching.

> * Should the maximum count of accumulated records be applied per
> relation or the query? When many foreign partitions belong to a
> partitioned table, if the former is chosen, it may use much memory in
> total.  If the latter is chosen, the records per relation could be
> few and thus the benefit of bulk insert could be small.
> 

I think it needs to be applied per relation, because that's the level at
which we can do it easily and consistently. The whole point is to send
data in sufficiently large chunks to minimize the communication overhead
(latency etc.), but if you enforce it "per query" that seems hard.

Imagine you're inserting data into a table with many partitions - how do
you pick the number of rows to accumulate? The table may have 10 or 1000
partitions, we may be inserting into all partitions or just a small
subset, not all partitions may be foreign, etc. It seems pretty
difficult to pick and enforce a reliable limit at the query level. But
maybe I'm missing something and it's easier than I think?

Re: Disable WAL logging to speed up data loading

2020-11-10 Thread Stephen Frost

Greetings,

* Kyotaro Horiguchi (horikyota@gmail.com) wrote:
> For fuel(?) of the discussion, I tried a very-quick PoC for in-place
> ALTER TABLE SET LOGGED/UNLOGGED and resulted as attached. After some
> trials of several ways, I drifted to the following way after poking
> several ways.
> 
> 1. Flip BM_PERMANENT of active buffers
> 2. adding/removing init fork
> 3. sync files,
> 4. Flip pg_class.relpersistence.
> 
> It always skips table copy in the SET UNLOGGED case, and only when
> wal_level=minimal in the SET LOGGED case.  Crash recovery seems
> working by some brief testing by hand.

Somehow missed that this patch more-or-less does what I was referring to
down-thread, but I did want to mention that it looks like it's missing a
necessary FlushRelationBuffers() call before the sync, otherwise there
could be dirty buffers for the relation that's being set to LOGGED (with
wal_level=minimal), which wouldn't be good.  See the comments above
smgrimmedsync().

> Of course, I haven't performed intensive test on it.

Reading through the thread, it didn't seem very clear, but we should
definitely make sure that it does the right thing on replicas when going
between unlogged and logged (and between logged and unlogged too), of
course.

Thanks,

Stephen

signature.asc
Description: PGP signature

Re: PATCH: Attempt to make dbsize a bit more consistent

2020-11-10 Thread gkokolatos

‐‐‐ Original Message ‐‐‐
On Monday, November 9, 2020 7:50 PM, Soumyadeep Chakraborty 
 wrote:

> Hey Georgios,
>
> Thanks for looking for more avenues to invoke tableAM APIS! Please find
> my review below:

A great review Soumyadeep, it is much appreciated.
Please remember to add yourself as a reviewer in the commitfest
[https://commitfest.postgresql.org/30/2701/]

>
> On Tue, Oct 13, 2020 at 6:28 AM gkokola...@pm.me wrote:
>
> 1.
>
> > /*
> >
> > -   -   heap size, including FSM and VM
> >
> > -   -   table size, including FSM and VM
> > */
> >
>
> We should not mention FSM and VM in dbsize.c at all as these are
> heap AM specific. We can say:
> table size, excluding TOAST relation

Yeah, I was thinking that the notion that FSM and VM are still taken
into account should be stated. We are iterating over ForkNumber
after all.

How about phrasing it as:

+ table size, including all implemented forks from the AM (e.g. FSM, VM)
+ excluding TOAST relations

Thoughts?

>
> 2.
>
> > /*
> >
> > -   Size of toast relation
> > */
> > if (OidIsValid(rel->rd_rel->reltoastrelid))
> >
> >
> > -   size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> >
> > -   {
> > -   Relation toastRel;
> > -
> > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
>
> We can replace the OidIsValid check with relation_needs_toast_table()
> and then have the OidIsValid() check in an Assert. Perhaps, that will
> make things more readable.

Please, allow me to kindly disagree.

Relation is already open at this stage. Even create_toast_table(), the
internal workhorse for creating toast relations, does check reltoastrelid
to test if the relation is already toasted.

Furthermore, we do call:

+ toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);

and in order to avoid elog() errors underneath us, we ought to have
verified the validity of reltoastrelid.

In short, I think that the code in the proposal is not too unreadable
nor that it breaks the coding patterns throughout the codebase.

Am I too wrong?

>
> 3.
>
> > -   if (rel->rd_rel->relkind == RELKIND_RELATION ||
> > -   rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> > -   rel->rd_rel->relkind == RELKIND_MATVIEW)
> > -   size = calculate_table_size(rel);
> > -   else
> > -   {
> > -   relation_close(rel, AccessShareLock);
> > -   PG_RETURN_NULL();
> > -   }
>
> This leads to behavioral changes:
>
> I am talking about the case where one calls pg_table_size() on an index.
> W/o your patch, it returns the size of the index. W/ your patch it
> returns NULL. Judging by the documentation, this function should not
> ideally apply to indexes but it does! I have a sinking feeling that lots
> of users use it for this purpose, as there is no function to calculate
> the size of a single specific index (except pg_relation_size()).
> The same argument I have made above applies to sequences. Users may have
> trial-and-errored their way into finding that pg_table_size() can tell
> them the size of a specific index/sequence! I don't know how widespread
> the use is in the user community, so IMO maybe we should be conservative
> and not introduce this change? Alternatively, we could call out that
> pg_table_size() is only for tables by throwing an error if anything
> other than a table is passed in.
>
> If we decide to preserve the existing behavior of the pg_table_size():
> It seems that for things not backed by the tableAM (indexes and
> sequences), they should still go through calculate_relation_size().
> We can call table_relation_size() based on if relkind is
> RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
> might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
> capture these three cases (See RELKIND_HAS_STORAGE). This also ensures
> that we return 0 for things that don't qualify as RELKIND_HAS_STORAGE,
> such as a partitioned table (Currently w/ the patch applied, we return
> NULL for those cases, which is another behavior change)

Excellent point. This is the discussion I was longing to have.

I stand by the decision coded in the patch, that pg_table_size() should
return NULL for other kinds of relations, such as indexes, sequences
etc.

It is a conscious decision based on the following:

 * Supported by the documentation, pg_table_size() applies to tables only.
For other uses the higher-level functions pg_total_relation_size() or
pg_relation_size() should be used.
 * Commit fa352d662e taught pg_relation_size() and friends to return NULL if 
the object doesn't exist. This makes perfect sense for the
scenarios described in the commit:

   That avoids errors when the functions are used in queries like
  "SELECT pg_relation_size(oid) FROM pg_class",
   and a table is dropped concurrently.

IMHO: It is more consistent to return NULL when the relation does exist
OR it is not a table kind.
* Returning 0 for things that do not have storage, is nonsensical because
it implies that it

Re: Disable WAL logging to speed up data loading

2020-11-10 Thread Stephen Frost

Greetings,

* Kyotaro Horiguchi (horikyota@gmail.com) wrote:
> At Mon, 9 Nov 2020 10:18:08 -0500, Stephen Frost  wrote 
> in 
> > Greetings,
> > 
> > * osumi.takami...@fujitsu.com (osumi.takami...@fujitsu.com) wrote:
> > > When I consider the use case is the system of data warehouse
> > > as described upthread, the size of each table can be large.
> > > Thus, changing the status from unlogged to logged (recoverable)
> > > takes much time under the current circumstances, which was discussed 
> > > before.
> > 
> > Ok- so the issue is that, today, we dump all of the table into the WAL
> > when we go from unlogged to logged, but as I outlined previously,
> > perhaps that's because we're missing a trick there when
> > wal_level=minimal.  If wal_level=minimal, then it would seem like we
> > could lock the table, then sync it and then mark is as logged, which is
> 
> FWIW, the following is that, I think it works not only when
> wal_level=minimal for SET UNLOGGED, and only works when minimal for
> SET LOGGED.
> 
> https://www.postgresql.org/message-id/20201002.100621.1668918756520136893.horikyota@gmail.com

Oh, nice!  Very cool that this was already being worked on, apologies
for not seeing that thread.

> | For fuel(?) of the discussion, I tried a very-quick PoC for in-place
> | ALTER TABLE SET LOGGED/UNLOGGED and resulted as attached. After some
> | trials of several ways, I drifted to the following way after poking
> | several ways.
> | 
> | 1. Flip BM_PERMANENT of active buffers
> | 2. adding/removing init fork
> | 3. sync files,
> | 4. Flip pg_class.relpersistence.
> | 
> | It always skips table copy in the SET UNLOGGED case, and only when
> | wal_level=minimal in the SET LOGGED case.  Crash recovery seems
> | working by some brief testing by hand.

Looking over that patch- isn't it missing a necessary FlushBuffers(), to
make sure that any dirty buffers have been written out before the sync
is being done?

Will try to find that thread and review/comment on it there too, if I
can, buf figured I'd point it out here while I have your attention too.

:)

> > more-or-less what you're asking to have be effectively done with the
> > proposed wal_level=none, but this would be an optimization for all
> > existing users of wal_level=minimal who have unlogged tables that they
> > want to change to logged, and this works on a per-table basis instead,
> > which seems like a better approach than a cluster-wide setting.
> > 
> > > By having the limited window of time,
> > > during wal_level=none, I'd like to make wal_level=none work to
> > > localize and minimize the burden to guarantee all commands are
> > > repeatable. To achieve this, after switching wal_level from none to 
> > > higher ones,
> > > the patch must ensure crash recovery, though.
> > 
> > Perhaps a helper command could be added to ALTER TABLE ALL IN TABLESPACE
> > to marked a bunch of unlogged tables over to being logged would be good
> > to add too.
> 
> I agree to this aspect of the in-place flipping of UNLOGGED.

Yeah, seems like it'd be useful and probably not hard to implement.

> > > Sorry that my current patch doesn't complete this aspect fully at present
> > > but, may I have your opinion about this ?
> > 
> > Presently, my feeling is that we could address this use-case without
> > having to introduce a new cluster-wide WAL level, and that's the
> > direction I'd want to see this going.  Perhaps I'm missing something
> > about why the approach I've set forth above wouldn't work, and
> > wal_level=none would, but I've not seen it yet.
> 
> Couldn't we have something like the following?
> 
>  ALTER TABLE table1, table2, table3 SET UNLOGGED;
> 
> That is, multiple target object specification in ALTER TABLE sttatement.

That requires knowing all the tables ahead of time though, or writing
some pl/pgsql to build up that list.  Not hard in general, but having
the 'all in tablespace' would make it a bit easier if you wanted to do
this for effectively all tables in a given database.  Having ALTER TABLE
support being given an explicit list of tables does seem like it could
be useful though, so certainly not against that if someone wanted to
implement it.

Thanks,

Stephen


signature.asc
Description: PGP signature

Re: Parallel copy

2020-11-10 Thread Amit Kapila

On Tue, Nov 10, 2020 at 7:12 PM vignesh C  wrote:
>
> On Tue, Nov 3, 2020 at 2:28 PM Amit Kapila  wrote:
> >
>
> I have worked to provide a patch for the parallel safety checks. It
> checks if parallely copy can be performed, Parallel copy cannot be
> performed for the following a) If relation is temporary table b) If
> relation is foreign table c) If relation has non parallel safe index
> expressions d) If relation has triggers present whose type is of non
> before statement trigger type e) If relation has check constraint
> which are not parallel safe f) If relation has partition and any
> partition has the above type. This patch has the checks for it. This
> patch will be used by parallel copy implementation.
>

How did you ensure that this is sufficient? For parallel-insert's
patch we have enabled parallel-mode for Inserts and ran the tests with
force_parallel_mode to see if we are not missing anything. Also, it
seems there are many common things here w.r.t parallel-insert patch,
is it possible to prepare this atop that patch or do you have any
reason to keep this separate?

-- 
With Regards,
Amit Kapila.

Re: logical streaming of xacts via test_decoding is broken

2020-11-10 Thread Amit Kapila

On Tue, Nov 10, 2020 at 2:25 PM Dilip Kumar  wrote:
>
> On Tue, Nov 10, 2020 at 11:18 AM Dilip Kumar  wrote:
> >
> > On Tue, Nov 10, 2020 at 10:52 AM Amit Kapila  
> > wrote:
> > > For this case, users can use skip_empty_xacts = true and
> > > skip_empty_streams = false. I am just asking if the user has only used
> > > skip_empty_xacts = true and didn't use the 'skip_empty_streams'
> > > option.
> >
> > Ok, thanks for the clarification.
> >
>
> I have prepared a patch for the same.
>

Few comments:
1.
+ else if (strcmp(elem->defname, "skip-empty-streams") == 0)
+ {
+ if (elem->arg == NULL)
+ data->skip_empty_streams = true;
+ else if (!parse_bool(strVal(elem->arg), >skip_empty_streams))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("could not parse value \"%s\" for parameter \"%s\"",
+ strVal(elem->arg), elem->defname)));
+ if (!data->skip_empty_xacts && data->skip_empty_streams)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("the skip-empty-streams can not be true if skip-empty-xacts
is false")));
  }

You can probably add a comment as to why we are disallowing this case.
I thought of considering 'stream-changes' parameter here because it
won't make sense to give this parameter without it, however, it seems
that is not necessary but maybe adding a comment
here in that regard would be a good idea.

2.
pg_decode_begin_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
 {
  TestDecodingData *data = ctx->output_plugin_private;
+ TestDecodingTxnData *txndata =
+ MemoryContextAllocZero(ctx->context, sizeof(TestDecodingTxnData));
+

Shall we free this memory at commit time for the sake of consistency,
otherwise also it would be freed with decoding context?

3. Can you please prepare a separate patch for test case changes so
that it would be easier to verify that it fails without the patch and
passed after the patch?

-- 
With Regards,
Amit Kapila.

Re: Parallel copy

2020-11-10 Thread vignesh C

On Tue, Nov 3, 2020 at 2:28 PM Amit Kapila  wrote:
>
> On Mon, Nov 2, 2020 at 12:40 PM Heikki Linnakangas  wrote:
> >
> > On 02/11/2020 08:14, Amit Kapila wrote:
> > > On Fri, Oct 30, 2020 at 10:11 PM Heikki Linnakangas  
> > > wrote:
> > >>
> > >> In this design, you don't need to keep line boundaries in shared memory,
> > >> because each worker process is responsible for finding the line
> > >> boundaries of its own block.
> > >>
> > >> There's a point of serialization here, in that the next block cannot be
> > >> processed, until the worker working on the previous block has finished
> > >> scanning the EOLs, and set the starting position on the next block,
> > >> putting it in READY state. That's not very different from your patch,
> > >> where you had a similar point of serialization because the leader
> > >> scanned the EOLs,
> > >
> > > But in the design (single producer multiple consumer) used by the
> > > patch the worker doesn't need to wait till the complete block is
> > > processed, it can start processing the lines already found. This will
> > > also allow workers to start much earlier to process the data as it
> > > doesn't need to wait for all the offsets corresponding to 64K block
> > > ready. However, in the design where each worker is processing the 64K
> > > block, it can lead to much longer waits. I think this will impact the
> > > Copy STDIN case more where in most cases (200-300 bytes tuples) we
> > > receive line-by-line from client and find the line-endings by leader.
> > > If the leader doesn't find the line-endings the workers need to wait
> > > till the leader fill the entire 64K chunk, OTOH, with current approach
> > > the worker can start as soon as leader is able to populate some
> > > minimum number of line-endings
> >
> > You can use a smaller block size.
> >
>
> Sure, but the same problem can happen if the last line in that block
> is too long and we need to peek into the next block. And then there
> could be cases where a single line could be greater than 64K.
>
> > However, the point of parallel copy is
> > to maximize bandwidth.
> >
>
> Okay, but this first-phase (finding the line boundaries) can anyway be
> not done in parallel and we have seen in some of the initial
> benchmarking that this initial phase is a small part of work
> especially when the table has indexes, constraints, etc. So, I think
> it won't matter much if this splitting is done in a single process or
> multiple processes.
>
> > If the workers ever have to sit idle, it means
> > that the bottleneck is in receiving data from the client, i.e. the
> > backend is fast enough, and you can't make the overall COPY finish any
> > faster no matter how you do it.
> >
> > > The other point is that the leader backend won't be used completely as
> > > it is only doing a very small part (primarily reading the file) of the
> > > overall work.
> >
> > An idle process doesn't cost anything. If you have free CPU resources,
> > use more workers.
> >
> > > We have discussed both these approaches (a) single producer multiple
> > > consumer, and (b) all workers doing the processing as you are saying
> > > in the beginning and concluded that (a) is better, see some of the
> > > relevant emails [1][2][3].
> > >
> > > [1] - 
> > > https://www.postgresql.org/message-id/20200413201633.cki4nsptynq7blhg%40alap3.anarazel.de
> > > [2] - 
> > > https://www.postgresql.org/message-id/20200415181913.4gjqcnuzxfzbbzxa%40alap3.anarazel.de
> > > [3] - 
> > > https://www.postgresql.org/message-id/78C0107E-62F2-4F76-BFD8-34C73B716944%40anarazel.de
> >
> > Sorry I'm late to the party. I don't think the design I proposed was
> > discussed in that threads.
> >
>
> I think something close to that is discussed as you have noticed in
> your next email but IIRC, because many people (Andres, Ants, myself
> and author) favoured the current approach (single reader and multiple
> consumers) we decided to go with that. I feel this patch is very much
> in the POC stage due to which the code doesn't look good and as we
> move forward we need to see what is the better way to improve it,
> maybe one of the ways is to split it as you are suggesting so that it
> can be easier to review. I think the other important thing which this
> patch has not addressed properly is the parallel-safety checks as
> pointed by me earlier. There are two things to solve there (a) the
> lower-level code (like heap_* APIs, CommandCounterIncrement, xact.c
> APIs, etc.) have checks which doesn't allow any writes, we need to see
> which of those we can open now (or do some additional work to prevent
> from those checks) after some of the work done for parallel-writes in
> PG-13[1][2], and (b) in which all cases we can parallel-writes
> (parallel copy) is allowed, for example need to identify whether table
> or one of its partitions has any constraint/expression which is
> parallel-unsafe.
>

I have worked to provide a patch for the parallel safety checks. It
checks if parallely copy

Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module

2020-11-10 Thread Bharath Rupireddy

Hi,

Attaching a patch that replaces custom signal handlers for SIGHUP and
SIGTERM in worker_spi.c.

Thoughts?


With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From a212163b64bc3ab4b8d4493e6d53f32979e3e9bf Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Tue, 10 Nov 2020 18:27:00 +0530
Subject: [PATCH v1] Use-standard-SIGHUP-SIGTERM-handlers-in-worker_spi.c

---
 src/test/modules/worker_spi/worker_spi.c | 47 +++-
 1 file changed, 6 insertions(+), 41 deletions(-)

diff --git a/src/test/modules/worker_spi/worker_spi.c b/src/test/modules/worker_spi/worker_spi.c
index 258237f9bf..5598d2e850 100644
--- a/src/test/modules/worker_spi/worker_spi.c
+++ b/src/test/modules/worker_spi/worker_spi.c
@@ -25,6 +25,7 @@
 /* These are always necessary for a bgworker */
 #include "miscadmin.h"
 #include "postmaster/bgworker.h"
+#include "postmaster/interrupt.h"
 #include "storage/ipc.h"
 #include "storage/latch.h"
 #include "storage/lwlock.h"
@@ -48,10 +49,6 @@ PG_FUNCTION_INFO_V1(worker_spi_launch);
 void		_PG_init(void);
 void		worker_spi_main(Datum) pg_attribute_noreturn();
 
-/* flags set by signal handlers */
-static volatile sig_atomic_t got_sighup = false;
-static volatile sig_atomic_t got_sigterm = false;
-
 /* GUC variables */
 static int	worker_spi_naptime = 10;
 static int	worker_spi_total_workers = 2;
@@ -64,38 +61,6 @@ typedef struct worktable
 	const char *name;
 } worktable;
 
-/*
- * Signal handler for SIGTERM
- *		Set a flag to let the main loop to terminate, and set our latch to wake
- *		it up.
- */
-static void
-worker_spi_sigterm(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	got_sigterm = true;
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
-/*
- * Signal handler for SIGHUP
- *		Set a flag to tell the main loop to reread the config file, and set
- *		our latch to wake it up.
- */
-static void
-worker_spi_sighup(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	got_sighup = true;
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
 /*
  * Initialize workspace for a worker process: create the schema if it doesn't
  * already exist.
@@ -179,8 +144,8 @@ worker_spi_main(Datum main_arg)
 	table->name = pstrdup("counted");
 
 	/* Establish signal handlers before unblocking signals. */
-	pqsignal(SIGHUP, worker_spi_sighup);
-	pqsignal(SIGTERM, worker_spi_sigterm);
+	pqsignal(SIGHUP, SignalHandlerForConfigReload);
+	pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
 
 	/* We're now ready to receive signals */
 	BackgroundWorkerUnblockSignals();
@@ -221,7 +186,7 @@ worker_spi_main(Datum main_arg)
 	/*
 	 * Main loop: do this until the SIGTERM handler tells us to terminate
 	 */
-	while (!got_sigterm)
+	while (!ShutdownRequestPending)
 	{
 		int			ret;
 
@@ -242,9 +207,9 @@ worker_spi_main(Datum main_arg)
 		/*
 		 * In case of a SIGHUP, just reload the configuration.
 		 */
-		if (got_sighup)
+		if (ConfigReloadPending)
 		{
-			got_sighup = false;
+			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 		}
 
-- 
2.25.1

Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module

2020-11-10 Thread Bharath Rupireddy

On Tue, Nov 10, 2020 at 3:04 PM Fujii Masao  wrote:
>
> >> The main reason for having SetLatch() in
> >> SignalHandlerForConfigReload() is to wake up the calling process if
> >> waiting in WaitLatchOrSocket() or WaitLatch() and reload the new
> >> config file and use the reloaded config variables. Maybe we should
> >> give a thought on the scenarios in which the walreceiver process
> >> waits, and what happens in case the latch is set when SIGHUP is
> >> received.
> >
> > The difference is whether the config file is processed at the next
> > wakeup (by force-reply-request or SIGTERM) of walreceiver or
> > immediately. If server-reload happened frequently, say, several times
> > per second(?), we should consider to reduce the useless reloading, but
> > actually that's not the case.
>
> So, attached is the patch that makes walreceiver use both standard
> SIGTERM and SIGHUP handlers. Currently I've not found any actual
> issues by making walreceiver use standard SIGHUP handler, yet.
>

I think it makes sense to replace WalRcv->latch with
MyProc->procLatch(as they point to the same latch) in the functions
that are called in the walreceiver process. However, we are using
walrcv->latch with spinlock, say in WalRcvForceReply() and
RequestXLogStreaming() both are called from the startup process. See
commit 45f9d08684, that made the access to the walreceiver's
latch(WalRcv->latch) by the other process(startup) spinlock protected

And looks like, in general it's a standard practice to set latch to
wake up the process if waiting in case a SIGHUP signal is received and
reload the relevant config variables.

Going by the above analysis, the v3 patch looks good to me.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: -O switch

2020-11-10 Thread Magnus Hagander

On Mon, Nov 9, 2020 at 4:58 PM Tom Lane  wrote:
>
> Magnus Hagander  writes:
> > On Wed, Nov 4, 2020 at 2:10 AM Tom Lane  wrote:
> >> ... looking at this again, BackendRun certainly looks ridiculously
> >> over-engineered for what it still does.
>
> > Yeah, looking at it again, I agree. PFA an updated patch, which I'll
> > go ahead and push shortly.
>
> LGTM.

Pushed.


> > I do noticed when looking through this -- the comment before the function 
> > says:
>
> >  * returns:
> >  * Shouldn't return at all.
> >  * If PostgresMain() fails, return status.
>
> > I'm pretty sure that's incorrect in the current branches as well,
> > since it's a void function it will never return anything. Pretty sure
> > it should just have the first point and not the second one there, or
> > is this trying to convey some meaning I'm just not getting?
>
> Looking at old versions, BackendRun and PostgresMain used to be
> declared to return int.  Whoever changed that to void evidently
> missed updating this comment.
>
> I'd reduce the whole thing to "Doesn't return."  If you were feeling
> really ambitious you could start plastering pg_attribute_noreturn() on
> these functions ... but since that would be placed on the declarations,
> a comment here would still be in order probably.

They're already marked pg_attribute_noreturn() in the declarations.
It's just the comment that was a bit out of date.

I'll go fix that one.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/

Re: Allow some recovery parameters to be changed with reload

2020-11-10 Thread Kyotaro Horiguchi

Hello.

At Tue, 10 Nov 2020 14:13:17 +0300, Sergei Kornilov  wrote in 
> Hello
> 
> > If someone changes restore_command to '' then reload while crash
> > recovery is running, the server stops for no valid reason.
> 
> While *crash* recovery is running? It's possible only during Point-in-Time 
> Recovery, no?

Even if PITR is commanded, crash recovery can run before starting
archive recovery if the server was not gracefully shut down.
Parameter reload can happen while crash recovery. And
validateRecoveryParameters() calls "ereport(FATAL" in that case.

> At the beginning of validateRecoveryParameters we check 
> ArchiveRecoveryRequested, which can be set in two cases:

That does not prevent crash recovery from running.

> * if recovery.signal found - same check on recovery start. Otherwise it is 
> possible to early end recovery because of empty restore_command. So we want 
> to protect the user from such misconfiguration? I am fine if we decide that 
> no additional handling is needed.
> * if standby.signal found - this FATAL is not reachable because 
> StandbyModeRequested is also set.
>
> During crash recovery validateRecoveryParameters does nothing.



> > If restore_command is set to 'hoge' (literally:p, that is, anything
> > unexecutable) and send SIGHUP while archive recovery is running, the
> > server stops. I think we need to handle these cases more gracefully,
> 
> I think we can not perform such check reliable. As in my example earlier:
> 
> > restore_command = '. /etc/wal-g/WALG_AWS_ENV; wal-g wal-fetch "%f" "%p"'
> 
> How do we find the commands first? For any shell? And even: we learned that 
> the binary is unexecutable. But what to do next?

I don't suggest to check if the command actually works, I suggested to
avoid server stop even if the parameters failed to run after a
config-reload.

> > If someone changes restore_command by mistake to something executable
> > but fails to offer the specfied file even if it exists, the running
> > archive recovery finishes then switches timeline unexpectedly.
> 
> Or executable file was just removed. Which is clearly a pilot error. Is this 
> differs from changing restore_command?

I don't know. I just think that it is not proper that "ALTER SYSTEM" +
config-reload causes server stop.

> >>  I do not know the history of this fatal ereport. It looks like "must 
> >> specify restore_command when standby mode is not enabled" check is only 
> >> intended to protect the user from misconfiguration and the rest code will 
> >> treat empty restore_command correctly, just like /bin/false. Did not 
> >> notice anything around StandbyMode conditions.
> >
> > If restore_command is not changable after server-start, it would be
> > valid for startup to stop for inexecutable content for the variable
> > since there's no way to proceed recovery.
> 
> Why not use local pg_wal? There may be already enough WAL.

Mmm. If the file to read is in pg_wal, restore_command won't be
executed in the first place?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Autovacuum on partitioned table (autoanalyze)

2020-11-10 Thread Kyotaro Horiguchi

At Thu, 5 Nov 2020 16:03:12 +0900, yuzuko  wrote in 
> Hi Justin,
> 
> Thank you for your comments.
> I attached the latest patch(v11) to the previous email.
> 
> >
> > +* Get its all ancestors to propagate changes_since_analyze 
> > count.
> > +* However, when ANALYZE inheritance tree, we get ancestors 
> > of
> > +* toprel_oid to avoid needless counting.
> >
> > => I don't understand that comment.
> >
> I fixed that comment.

+* Get its all ancestors to propagate changes_since_analyze 
count.
+* However, when we have a valid toprel_oid, that is ANALYZE 
inheritance
+* tree, if we propagate the number to all ancestors, the next 
analyze
+* on partitioned tables in the tree could happen shortly 
expected.
+* So we get ancestors of toprel_oid which are not analyzed 
this time.

In second thought about the reason for the "toprel_oid". It is perhaps
to avoid "wrongly" propagated values to ancestors after a manual
ANALYZE on a partitioned table.  But the same happens after an
autoanalyze iteration if some of the ancestors of a leaf relation are
analyzed before the leaf relation in a autoanalyze iteration.  That
can trigger an unnecessary analyzing for some of the ancestors.
So we need to do a similar thing for autovacuum, However..

  [1(root):analzye]-[2:DONT analyze]-[3:analyze]-[leaf]

In this case topre_oid is invalid (since it's autoanalyze) but we
should avoid propagating the count to 1 and 3 if it is processed
*before* the leaf, but should propagate to 2. toprel_oid doesn't work
in that case.

So, to propagate the count properly, we need to analyze relations
leaf-to-root order, or propagate the counter only to anscestors that
haven't been processed in the current iteration.  It seems a bit too
complex to sort analyze relations in that order.  The latter would be
relatively simple.  See the attached for how it looks like.

Anyway, either way we take, it is not pgstat.c's responsibility to do
that since the former need to heavily reliant to what analyze does,
and the latter need to know what anlyze is doing.


> > Also, you called SearchSysCacheCopy1, but didn't free the tuple.  I don't 
> > think
> > you need to copy it anyway - just call ReleaseSysCache().
> >
> Fixed it.

Mmm. Unfortunately, that fix leaks cache reference when
!RELKIND_HAS_STORAGE.

> > Regarding the counters in pg_stat_all_tables: maybe some of these should be
> > null rather than zero ?  Or else you should make an 0001 patch to fully
> > implement this view, with all relevant counters, not just 
> > n_mod_since_analyze,
> > last_*analyze, and *analyze_count.  These are specifically misleading:
> >
> > last_vacuum |
> > last_autovacuum |
> > n_ins_since_vacuum  | 0
> > vacuum_count| 0
> > autovacuum_count| 0
> >
> I haven't modified this part yet, but you meant that we should set
> null to counters
> about vacuum because partitioned tables are not vacuumed?

Perhaps bacause partitioned tables *cannot* be vacuumed.  I'm not sure
what is the best way here.  Showing null seems reasonable but I'm not
sure that doesn't break anything.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 4d8ad754f8..50b55f5d01 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -816,6 +816,18 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
 since the last ANALYZE.

 
+   
+For declaratively partitioned tables, only analyze is supported.
+The same analyze threshold defined above is used,
+but the number of tuples is sum of their childrens'
+pg_class.reltuples.
+Also, total number of tuples inserted, updated, or deleted since the last
+ANALYZE compared with the threshold is calculated by adding up
+childrens' number of tuples analyzed in the previous ANALYZE.
+This is because partitioned tables don't have any data.  So analyze on partitioned
+tables are one lap behind their children.
+   
+

 Temporary tables cannot be accessed by autovacuum.  Therefore,
 appropriate vacuum and analyze operations should be performed via
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index bc59a2d77d..d94caa4b7e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1323,8 +1323,6 @@ WITH ( MODULUS numeric_literal, REM
 If a table parameter value is set and the
 equivalent toast. parameter is not, the TOAST table
 will use the table's parameter value.
-Specifying these parameters for partitioned tables is not supported,
-but you may specify them for individual leaf partitions.

 

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 8ccc228a8c..35bc2e5bdb 100644
---

Add Nullif case for eval_const_expressions_mutator

2020-11-10 Thread Hou, Zhijie

Hi

I notice that there are no Nullif case in eval_const_expression.
Since Nullif is similar to Opexpr and is easy to implement,
I try add this case in eval_const_expressions_mutator.

Best regards,
houzj




0001-patch-eval-NULLIF.patch
Description: 0001-patch-eval-NULLIF.patch

Re: Allow some recovery parameters to be changed with reload

2020-11-10 Thread Sergei Kornilov

Hello

> If someone changes restore_command to '' then reload while crash
> recovery is running, the server stops for no valid reason.

While *crash* recovery is running? It's possible only during Point-in-Time 
Recovery, no?
At the beginning of validateRecoveryParameters we check 
ArchiveRecoveryRequested, which can be set in two cases:

* if recovery.signal found - same check on recovery start. Otherwise it is 
possible to early end recovery because of empty restore_command. So we want to 
protect the user from such misconfiguration? I am fine if we decide that no 
additional handling is needed.
* if standby.signal found - this FATAL is not reachable because 
StandbyModeRequested is also set.

During crash recovery validateRecoveryParameters does nothing.

> If restore_command is set to 'hoge' (literally:p, that is, anything
> unexecutable) and send SIGHUP while archive recovery is running, the
> server stops. I think we need to handle these cases more gracefully,

I think we can not perform such check reliable. As in my example earlier:

> restore_command = '. /etc/wal-g/WALG_AWS_ENV; wal-g wal-fetch "%f" "%p"'

How do we find the commands first? For any shell? And even: we learned that the 
binary is unexecutable. But what to do next?

> If someone changes restore_command by mistake to something executable
> but fails to offer the specfied file even if it exists, the running
> archive recovery finishes then switches timeline unexpectedly.

Or executable file was just removed. Which is clearly a pilot error. Is this 
differs from changing restore_command?

>>  I do not know the history of this fatal ereport. It looks like "must 
>> specify restore_command when standby mode is not enabled" check is only 
>> intended to protect the user from misconfiguration and the rest code will 
>> treat empty restore_command correctly, just like /bin/false. Did not notice 
>> anything around StandbyMode conditions.
>
> If restore_command is not changable after server-start, it would be
> valid for startup to stop for inexecutable content for the variable
> since there's no way to proceed recovery.

Why not use local pg_wal? There may be already enough WAL.

regards, Sergei

Re: ModifyTable overheads in generic plans

2020-11-10 Thread Amit Langote

On Wed, Nov 4, 2020 at 11:32 AM Amit Langote  wrote:
> On Tue, Nov 3, 2020 at 9:05 PM Heikki Linnakangas  wrote:
> > On 03/11/2020 10:27, Amit Langote wrote:
> > > Please check the attached if that looks better.
> >
> > Great, thanks! Yeah, I like that much better.
> >
> > This makes me a bit unhappy:
> >
> > >
> > >   /* Also let FDWs init themselves for foreign-table result 
> > > rels */
> > >   if (resultRelInfo->ri_FdwRoutine != NULL)
> > >   {
> > >   if (resultRelInfo->ri_usesFdwDirectModify)
> > >   {
> > >   ForeignScanState *fscan = (ForeignScanState 
> > > *) mtstate->mt_plans[i];
> > >
> > >   /*
> > >* For the FDW's convenience, set the 
> > > ForeignScanState node's
> > >* ResultRelInfo to let the FDW know which 
> > > result relation it
> > >* is going to work with.
> > >*/
> > >   Assert(IsA(fscan, ForeignScanState));
> > >   fscan->resultRelInfo = resultRelInfo;
> > >   
> > > resultRelInfo->ri_FdwRoutine->BeginDirectModify(fscan, eflags);
> > >   }
> > >   else if 
> > > (resultRelInfo->ri_FdwRoutine->BeginForeignModify != NULL)
> > >   {
> > >   List   *fdw_private = (List *) 
> > > list_nth(node->fdwPrivLists, i);
> > >
> > >   
> > > resultRelInfo->ri_FdwRoutine->BeginForeignModify(mtstate,
> > >   
> > >  resultRelInfo,
> > >   
> > >  fdw_private,
> > >   
> > >  i,
> > >   
> > >  eflags);
> > >   }
> > >   }
> >
> > If you remember, I was unhappy with a similar assertion in the earlier
> > patches [1]. I'm not sure what to do instead though. A few options:
> >
> > A) We could change FDW API so that BeginDirectModify takes the same
> > arguments as BeginForeignModify(). That avoids the assumption that it's
> > a ForeignScan node, because BeginForeignModify() doesn't take
> > ForeignScanState as argument. That would be consistent, which is nice.
> > But I think we'd somehow still need to pass the ResultRelInfo to the
> > corresponding ForeignScan, and I'm not sure how.
>
> Maybe ForeignScan doesn't need to contain any result relation info
> then?  ForeignScan.operation != CMD_SELECT is enough to tell it to
> call IterateDirectModify() as today.

Hmm, I misspoke.   We do still need ForeignScanState.resultRelInfo,
because the IterateDirectModify() API uses it to return the remotely
inserted/updated/deleted tuple for the RETURNING projection performed
by ExecModifyTable().

> > B) Look up the ResultRelInfo, and call BeginDirectModify(), on the first
> > call to ForeignNext().
> >
> > C) Accept the Assertion. And add an elog() check in the planner for that
> > with a proper error message.
> >
> > I'm leaning towards B), but maybe there's some better solution I didn't
> > think of?   Perhaps changing the API would make sense in any case, it is a
> > bit weird as it is. Backwards-incompatible API changes are not nice, but
> > I don't think there are many FDWs out there that implement the
> > DirectModify functions. And those functions are pretty tightly coupled
> > with the executor and ModifyTable node details anyway, so I don't feel
> > like we can, or need to, guarantee that they stay unchanged across major
> > versions.
>
> B is not too bad, but I tend to prefer doing A too.

On second thought, it seems A would amount to merely a cosmetic
adjustment of the API, nothing more.  B seems to get the job done for
me and also doesn't unnecessarily break compatibility, so I've updated
0001 to implement B.  Please give it a look.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


v8-0001-Set-ForeignScanState.resultRelInfo-lazily.patch
Description: Binary data


v8-0002-Initialize-result-relation-information-lazily.patch
Description: Binary data

Re: [HACKERS] logical decoding of two-phase transactions

2020-11-10 Thread Ajin Cherian

I was doing some testing, and I found some issues. Two issues. The
first one, seems to be a behaviour that might be acceptable, the
second one not so much.
I was using test_decoding, not sure how this might behave with the
pg_output plugin.

Test 1:
A transaction that is immediately rollbacked after the prepare.

SET synchronous_commit = on;
SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot',
'test_decoding');
CREATE TABLE stream_test(data text);
-- consume DDL
SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');

BEGIN;
INSERT INTO stream_test SELECT repeat('a', 10) || g.i FROM
generate_series(1, 20) g(i);
PREPARE TRANSACTION 'test1';
ROLLBACK PREPARED 'test1';
SELECT data FROM pg_logical_slot_get_changes('regression_slot',
NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
'skip-empty-xacts', '1', 'stream-changes', '1');
==

Here, what is seen is that while the transaction was not decoded at
all  since it was rollbacked before it could get decoded, the ROLLBACK
PREPARED is actually decoded.
The result being that the standby could get a spurious ROLLBACK
PREPARED. The current code in worker.c does handle this silently. So,
this might not be an issue.

Test 2:
A transaction that is partially streamed , is then prepared.
'
BEGIN;
INSERT INTO stream_test SELECT repeat('a', 10) || g.i FROM
generate_series(1,800) g(i);
SELECT data FROM pg_logical_slot_get_changes('regression_slot',
NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
'skip-empty-xacts', '1', 'stream-changes', '1');
SELECT data FROM pg_logical_slot_get_changes('regression_slot',
NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
'skip-empty-xacts', '1', 'stream-changes', '1');
PREPARE TRANSACTION 'test1';
SELECT data FROM pg_logical_slot_get_changes('regression_slot',
NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
'skip-empty-xacts', '1', 'stream-changes', '1');
ROLLBACK PREPARED 'test1';
==

Here, what is seen is that the transaction is streamed twice, first
when it crosses the memory threshold and is streamed (usually only in
the 2nd pg_logical_slot_get_changes call)
and then the same transaction is streamed again after the prepare.
This cannot be right, as it would result in duplication of data on the
standby.

I will be debugging the second issue and try to arrive at a fix.

regards,
Ajin Cherian
Fujitsu Australia.

On Tue, Nov 10, 2020 at 4:47 PM Peter Smith  wrote:
>
> FYI - I have cross-checked all the v18 patch code against the v18 code
> coverage [1] resulting from running the tests.
>
> The purpose of this study was to identify where there may be any gaps
> in the testing of this patch - e.g is there some v18 code not
> currently getting executed by the tests?
>
> I found almost all of the normal (not error) code paths are getting executed.
>
> For details please see attached the study results. (MS Excel file)
>
> ===
>
> [1] 
> https://www.postgresql.org/message-id/CAHut%2BPu4BpUr0GfCLqJjXc%3DDcaKSvjDarSN89-4W2nxBeae9hQ%40mail.gmail.com
>
> Kind Regards,
> Peter Smith.
> Fujitsu Australia

Re: Multi Inserts in CREATE TABLE AS - revived patch

2020-11-10 Thread Paul Guo

> On Nov 9, 2020, at 6:41 PM, Bharath Rupireddy 
>  wrote:
>
> On Tue, Nov 3, 2020 at 4:54 PM Bharath Rupireddy
>  wrote:
>>
>> If the approach followed in the patch looks okay, I can work on a separate 
>> patch for multi inserts in refresh materialized view cases.
>>
>
> Hi, I'm attaching a v2 patch that has multi inserts for CTAS as well
> as REFRESH MATERIALiZED VIEW.
>
> I did some testing: exec time in seconds.
>
> Use case 1: 1 int and 1 text column. each row size 129 bytes, size of
> 1 text column 101 bytes, number of rows 100million, size of heap file
> 12.9GB.
> HEAD: 220.733, 220.428
> Patch: 151.923, 152.484
>
> Thoughts?
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: 
> https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.enterprisedb.com%2Fdata=04%7C01%7Cguopa%40vmware.com%7C2471a90558ce4bf0af5b08d8849c03bb%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C1%7C637405152899337347%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=QKeRMGQjOlOL%2FlQv%2BuEAb2ocLVq6zqXESKoNOaJ6YCo%3Dreserved=0
> 

Thanks for doing this. There might be another solution - use raw insert 
interfaces (i.e. raw_heap_insert()).
Attached is the test (not formal) patch that verifies this idea. 
raw_heap_insert() writes the page into the
table files directly and also write the FPI xlog when the tuples filled up the 
whole page. This seems be
more efficient.

In addition, those raw write interfaces call smgrimmedsync() when finishing raw 
inserting, this is because
the write bypasses the shared buffer so a CHECKPOINT plus crash might cause 
data corruption since
some FPI xlogs cannot be replayed and those table files are not fsync-ed during 
crash. It seems that a sync
request could be forwarded to the checkpointer for each table segment file and 
then we do not need to call
smgrimmedsync(). If the theory is correct this should be in a separate patch. 
Anyway I tested this idea
also by simply commenting out the smgrimmedsync() call in heap_raw_insert_end() 
(a new function in
the attached patch) since forwarding fsync request is light-weight.

I did a quick and simple testing. The test environment is a centos6 vm with 7G 
memory on my Mac laptop.
-O3 gcc compiler option; shared_buffers as 2GB. Did not check if parallel 
scanning is triggered by the test
query and the data volume is not large so test time is not long.

Here are the test script.
 create table t1 (a int, b int, c int, d int);
  insert into t1 select i,i,i,i from generate_series(1,1000) i;
  show shared_buffers;
  \timing on
  create table t2 as select * from t1;
  \timing off

Here are the results:

HEAD (37d2ff380312):
Time: 5143.041 ms (00:05.143)
Multi insert patch:
Time: 4456.461 ms (00:04.456)
Raw insert (attached):
Time: 2317.453 ms (00:02.317)
Raw insert + no smgrimmedsync():
Time: 2103.610 ms (00:02.104).

From the above data raw insert is better; also forwarding sync should be able 
to improve further
(Note my laptop is with SSD so on machine with SATA/SAS, I believe forwarding 
sync should
be able to help more.)

I tested removing smgrimmedsync in "vacuum full” code that uses raw insert 
also. FYI.
HEAD:
  Time: 3567.036 ms (00:03.567)
no smgrimmedsync:
  Time: 3023.487 ms (00:03.023)

Raw insert could be used on CTAS & Create MatView. For Refresh MatView the code 
is a bit
different. I did not spend more time on this so not sure raw insert could be 
used for that.

But I think the previous multi insert work could be still used in at least 
"INSERT tbl SELECT…” (if the INSERT
is a simple one, e.g. no trigger, no index, etc).

Regards,
Paul

v1-0001-ctas-using-raw-insert.patch
Description: v1-0001-ctas-using-raw-insert.patch

Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module

2020-11-10 Thread Fujii Masao




On 2020/11/10 16:17, Kyotaro Horiguchi wrote:

At Sat, 7 Nov 2020 19:31:21 +0530, Bharath Rupireddy 
 wrote in

The main reason for having SetLatch() in
SignalHandlerForConfigReload() is to wake up the calling process if
waiting in WaitLatchOrSocket() or WaitLatch() and reload the new
config file and use the reloaded config variables. Maybe we should
give a thought on the scenarios in which the walreceiver process
waits, and what happens in case the latch is set when SIGHUP is
received.


The difference is whether the config file is processed at the next
wakeup (by force-reply-request or SIGTERM) of walreceiver or
immediately. If server-reload happened frequently, say, several times
per second(?), we should consider to reduce the useless reloading, but
actually that's not the case.


So, attached is the patch that makes walreceiver use both standard
SIGTERM and SIGHUP handlers. Currently I've not found any actual
issues by making walreceiver use standard SIGHUP handler, yet.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/backend/replication/walreceiver.c 
b/src/backend/replication/walreceiver.c
index bb1d44ccb7..babee386c4 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -104,13 +104,6 @@ static int recvFile = -1;
 static TimeLineID recvFileTLI = 0;
 static XLogSegNo recvSegNo = 0;
 
-/*
- * Flags set by interrupt handlers of walreceiver for later service in the
- * main loop.
- */
-static volatile sig_atomic_t got_SIGHUP = false;
-static volatile sig_atomic_t got_SIGTERM = false;
-
 /*
  * LogstreamResult indicates the byte positions that we have already
  * written/fsynced.
@@ -135,11 +128,6 @@ static void XLogWalRcvSendReply(bool force, bool 
requestReply);
 static void XLogWalRcvSendHSFeedback(bool immed);
 static void ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime);
 
-/* Signal handlers */
-static void WalRcvSigHupHandler(SIGNAL_ARGS);
-static void WalRcvShutdownHandler(SIGNAL_ARGS);
-
-
 /*
  * Process any interrupts the walreceiver process may have received.
  * This should be called any time the process's latch has become set.
@@ -164,7 +152,7 @@ ProcessWalRcvInterrupts(void)
 */
CHECK_FOR_INTERRUPTS();
 
-   if (got_SIGTERM)
+   if (ShutdownRequestPending)
{
ereport(FATAL,
(errcode(ERRCODE_ADMIN_SHUTDOWN),
@@ -267,9 +255,10 @@ WalReceiverMain(void)
on_shmem_exit(WalRcvDie, 0);
 
/* Properly accept or ignore signals the postmaster might send us */
-   pqsignal(SIGHUP, WalRcvSigHupHandler);  /* set flag to read config file 
*/
+   pqsignal(SIGHUP, SignalHandlerForConfigReload); /* set flag to read 
config
+   
 * file */
pqsignal(SIGINT, SIG_IGN);
-   pqsignal(SIGTERM, WalRcvShutdownHandler);   /* request shutdown */
+   pqsignal(SIGTERM, SignalHandlerForShutdownRequest); /* request shutdown 
*/
/* SIGQUIT handler was already set up by InitPostmasterChild */
pqsignal(SIGALRM, SIG_IGN);
pqsignal(SIGPIPE, SIG_IGN);
@@ -441,9 +430,9 @@ WalReceiverMain(void)
/* Process any requests or signals received 
recently */
ProcessWalRcvInterrupts();
 
-   if (got_SIGHUP)
+   if (ConfigReloadPending)
{
-   got_SIGHUP = false;
+   ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
XLogWalRcvSendHSFeedback(true);
}
@@ -510,7 +499,7 @@ WalReceiverMain(void)
 * avoiding some system calls.
 */
Assert(wait_fd != PGINVALID_SOCKET);
-   rc = WaitLatchOrSocket(walrcv->latch,
+   rc = WaitLatchOrSocket(MyLatch,
   
WL_EXIT_ON_PM_DEATH | WL_SOCKET_READABLE |
   
WL_TIMEOUT | WL_LATCH_SET,
   
wait_fd,
@@ -518,7 +507,7 @@ WalReceiverMain(void)
   
WAIT_EVENT_WAL_RECEIVER_MAIN);
if (rc & WL_LATCH_SET)
{
-   ResetLatch(walrcv->latch);
+   ResetLatch(MyLatch);

Re: A problem about partitionwise join

2020-11-10 Thread Richard Guo

On Fri, Nov 6, 2020 at 11:26 PM Anastasia Lubennikova <
a.lubennik...@postgrespro.ru> wrote:

> Status update for a commitfest entry.
>
> According to CFbot this patch fails to apply. Richard, can you send an
> update, please?
>
> Also, I see that the thread was inactive for a while.
> Are you going to continue this work? I think it would be helpful, if you
> could write a short recap about current state of the patch and list open
> questions for reviewers.
>
> The new status of this patch is: Waiting on Author
>

Thanks Anastasia. I've rebased the patch with latest master.

To recap, the problem we are fixing here is when generating join clauses
from equivalence classes, we only select the joinclause with the 'best
score', or the first joinclause with a score of 3. This may cause us to
miss some joinclause on partition keys and thus fail to generate
partitionwise join.

The initial idea for the fix is to create all the RestrictInfos from ECs
in order to check whether there exist equi-join conditions involving
pairs of matching partition keys of the relations being joined for all
partition keys. And then Tom proposed a much better idea which leverages
function exprs_known_equal() to tell whether the partkeys can be found
in the same eclass, which is the current implementation in the latest
patch.

Thanks
Richard

v4-0001-Fix-up-partitionwise-join.patch
Description: Binary data

Re: Corner-case bug in pg_rewind

2020-11-10 Thread Pavel Borisov

I did some effort to review your patch which seems legit to me.
I think some minor things are better to be improved i.e.

1. Comment regarding
--
347  * Check for the possibility that the target is in fact a direct
348  * ancestor of the source. In that case, there is no divergent
history
349  * in the target that needs rewinding.
--
are better be reformulated as overall block contents are mostly cover vice
versa case of a target NOT being a direct ancestor of the source. Maybe:
"We need rewind in cases when  and don't need only if the target is a
direct ancestor of the source." I think it will be more understandable if
it would be a commentary with descriptions of all cases in the block or no
commentary before the block at all with commentaries of these cases on each
if/else inside the block instead.

2. When I do the test with no patching of pg_rewind.c I get the output:
-
#   Failed test 'pg_rewind detects rewind needed stderr /(?^:rewinding from
last common checkpoint at)/'
#   at t/007_min_recovery_point.pl line 107.
#   'pg_rewind: servers diverged at WAL location 0/3000180
on timeline 2
# pg_rewind: no rewind required
# '
# doesn't match '(?^:rewinding from last common checkpoint at)'
# Looks like you failed 1 test of 2.
t/007_min_recovery_point.pl .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/2 subtests

Test Summary Report
---
t/007_min_recovery_point.pl (Wstat: 256 Tests: 2 Failed: 1)
  Failed test:  2
  Non-zero exit status: 1
---
Maybe it can just give "failed" without so many details?

Also, I think Heikki's notion could be fulfilled.

Apart from this I consider the patch clean, clear, tests are passes and I'd
recommend it to commit after a minor improvement described.
Thank you!

-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: logical streaming of xacts via test_decoding is broken

2020-11-10 Thread Dilip Kumar

On Tue, Nov 10, 2020 at 11:18 AM Dilip Kumar  wrote:
>
> On Tue, Nov 10, 2020 at 10:52 AM Amit Kapila  wrote:
> >
> > On Tue, Nov 10, 2020 at 10:26 AM Dilip Kumar  wrote:
> > >
> > > On Tue, Nov 10, 2020 at 8:14 AM Amit Kapila  
> > > wrote:
> > > >
> > > > On Mon, Nov 9, 2020 at 6:00 PM Dilip Kumar  
> > > > wrote:
> > > > >
> > > > > On Mon, Nov 9, 2020 at 5:37 PM Dilip Kumar  
> > > > > wrote:
> > > > > >
> > > > > > On Mon, Nov 9, 2020 at 4:21 PM Amit Kapila 
> > > > > >  wrote:
> > > > > >
> > > > > >  The bigger question is do we want to give users an option
> > > > > > > for skip_empty_streams similar to skip_empty_xacts? I would again
> > > > > > > prefer to give a separate option to the user as well. What do you
> > > > > > > think?
> > > > > >
> > > > > > Yeah, I think giving an option would be better.
> > > > >
> > > > > I think we should also think about the combinations of the
> > > > > skip_empty_xacts and skip_empty_streams.  For example, if the user
> > > > > passes the skip_empty_xacts to false and skip_empty_streams to true
> > > > > then what should be the behavior, if the complete transaction
> > > > > option1: It should not print any stream_start/stream_stop and just
> > > > > print commit stream because skip_empty_xacts is false and
> > > > > skip_empty_streams is true.
> > > > > option2: It should print the stream_start message for the very first
> > > > > stream because it is the first stream if the txn and skip_empty_xacts
> > > > > is false so print it and later it will print-stream commit.
> > > > > option3: Or for the first stream we first put the BEGIN message i.e
> > > > > stream begin
> > > > > stream start
> > > > > stream stop
> > > > > stream commit
> > > > > option4: the user should not be allowed to pass skip_empty_xacts =
> > > > > false with skip_empty_streams to true.  Because if the streaming mode
> > > > > is on then we can not print the xact without printing streams.
> > > > >
> > > > > What is your opinion on this?
> > > > >
> > > >
> > > > I would prefer option-4 and in addition to that we can ensure that if
> > > > skip_empty_xacts = true then by default skip_empty_streams is also
> > > > true.
> > >
> > > But then it will behave as a single option only, right? because if
> > > 1. skip_empty_xacts = true, then we set skip_empty_streams = true
> > >
> >
> > For this case, users can use skip_empty_xacts = true and
> > skip_empty_streams = false. I am just asking if the user has only used
> > skip_empty_xacts = true and didn't use the 'skip_empty_streams'
> > option.
>
> Ok, thanks for the clarification.
>

I have prepared a patch for the same.


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v1-0001-Bug-fix-skip-empty-xacts-in-streaming-mode.patch
Description: Binary data

Re: Prefer TG_TABLE_NAME over TG_RELNAME in tests

2020-11-10 Thread Andreas Karlsson


On 11/3/20 10:22 AM, Magnus Hagander wrote:

On Thu, Sep 24, 2020 at 5:17 AM Michael Paquier  wrote:

No objections from here to remove that from the core tests.  It is
worth noting that Debian Code Search hints that this is used in some
extensions:
https://codesearch.debian.net/search?q=TG_RELNAME=1

These are pgformatter, bucardo, sql-ledger, ledgersmb and davical.


That's interesting, but I think irrelevant to this patch in itself of
course. But it might be worth reaching out to some of those projects
and notifying them they're using the deprecated ones..


I submitted patches to pgformatter, bucardo and ledgersmb. Both davical 
and sql-ledger only seems to have them in old upgrade scripts.


Andreas

Re: enable pg_stat_statements to track rows processed by REFRESH MATERIALIZED VIEW

2020-11-10 Thread Fujii Masao




On 2020/11/05 23:54, Seino Yuki wrote:

2020-11-02 20:01 に Fujii Masao さんは書きました:

On 2020/11/02 14:02, Yuki Seino wrote:

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:    tested, passed

+1.
I checked the patch and there were no problems.


+    PG_END_TRY();
+    SetQueryCompletion(qc, CMDTAG_REFRESH_MATERIALIZED_VIEW, processed);

Isn't it better to call SetQueryCompletion() in ExecRefreshMatView()
instead of ProcessUtilitySlow() (e.g., ExecCreateTableAs() does)?

Regards,



Sorry. I missed it.
I've incorporated your point into this patch.
So the changes to "matview.h" and "utility.c" have been canceld.

We also confirmed that the new patch passed the regression test.


Thanks for updating the patch!

+   /* save the rowcount if we're given a qc to fill */
+   SetQueryCompletion(qc, CMDTAG_REFRESH_MATERIALIZED_VIEW, processed);

I added the check "if (qc)" into the above. I also added the following
comments about that we don't display the rowcount in the command
completion tag output though we save it in qc. There is the discussion
related to this topic, at [1]. Thought?

+* Save the rowcount so that pg_stat_statements can track the total 
number
+* of rows processed by REFRESH MATERIALIZED VIEW command. Note that we
+* still don't display the rowcount in the command completion tag 
output,
+* i.e., the display_rowcount flag of CMDTAG_REFRESH_MATERIALIZED_VIEW
+* command tag is left false in cmdtaglist.h. Otherwise, the change of
+* completion tag output might break applications using it.

Attached is the updated version of the patch.
Barring no objection, I will commit that.

Regards,

[1]
https://postgr.es/m/aadbfba9-e4bb-9531-6b3a-d13c31c8f...@oss.nttdata.com


--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out 
b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index e0edb134f3..2a303a7f07 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -530,8 +530,8 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY 
query COLLATE "C";
 
 --
 -- Track the total number of rows retrieved or affected by the utility
--- commands of COPY, FETCH, CREATE TABLE AS, CREATE MATERIALIZED VIEW
--- and SELECT INTO
+-- commands of COPY, FETCH, CREATE TABLE AS, CREATE MATERIALIZED VIEW,
+-- REFRESH MATERIALIZED VIEW and SELECT INTO
 --
 SELECT pg_stat_statements_reset();
  pg_stat_statements_reset 
@@ -543,6 +543,7 @@ CREATE TABLE pgss_ctas AS SELECT a, 'ctas' b FROM 
generate_series(1, 10) a;
 SELECT generate_series(1, 10) c INTO pgss_select_into;
 COPY pgss_ctas (a, b) FROM STDIN;
 CREATE MATERIALIZED VIEW pgss_matv AS SELECT * FROM pgss_ctas;
+REFRESH MATERIALIZED VIEW pgss_matv;
 BEGIN;
 DECLARE pgss_cursor CURSOR FOR SELECT * FROM pgss_matv;
 FETCH NEXT pgss_cursor;
@@ -586,10 +587,11 @@ SELECT query, plans, calls, rows FROM pg_stat_statements 
ORDER BY query COLLATE
  FETCH FORWARD 5 pgss_cursor   
  | 0 | 1 |5
  FETCH FORWARD ALL pgss_cursor 
  | 0 | 1 |7
  FETCH NEXT pgss_cursor
  | 0 | 1 |1
+ REFRESH MATERIALIZED VIEW pgss_matv   
  | 0 | 1 |   13
  SELECT generate_series(1, 10) c INTO pgss_select_into 
  | 0 | 1 |   10
  SELECT pg_stat_statements_reset() 
  | 0 | 1 |1
  SELECT query, plans, calls, rows FROM pg_stat_statements ORDER BY query 
COLLATE "C" | 1 | 0 |0
-(12 rows)
+(13 rows)
 
 --
 -- Track user activity and reset them
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c 
b/contrib/pg_stat_statements/pg_stat_statements.c
index 1eac9edaee..dd963c4644 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1171,13 +1171,14 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char 
*queryString,
INSTR_TIME_SUBTRACT(duration, start);
 
/*
-* Track the total number of rows retrieved or affected by
-* the utility statements of COPY, FETCH, CREATE TABLE AS,
-* CREATE MATERIALIZED VIEW and SELECT INTO.
+* Track the total number of rows retrieved or affected by the 
utility
+* statements of COPY, FETCH, CREATE TABLE AS, CREATE 
MATERIALIZED
+* VIEW, REFRESH MATERIALIZED VIEW and SELECT INTO.

Re: list of extended statistics on psql

2020-11-10 Thread Tatsuro Yamada


Hi,

 

2) The test is failing intermittently because it's executed in parallel
with stats_ext test, which is also creating extended statistics. So
depending on the timing the \dX may list some of the stats_ext stuff.
I'm not sure what to do about this. Either this part needs to be moved
to a separate test executed in a different group, or maybe we should
simply move it to stats_ext.


I thought all tests related to meta-commands exist in psql.sql, but I
realize it's not true. For example, the test of \dRp does not exist in
psql.sql. Therefore, I moved the regression test of \dX to stats_ext.sql
to avoid the test failed in parallel.

Attached patches is following:
  - 0001-v8-Add-dX-command-on-psql.patch
  - 0002-Add-regression-test-of-dX-to-stats_ext.sql.patch

However, I feel the test of \dX is not elegant, so I'm going to try
creating another one since it would be better to be aware of the context
of existing extended stats tests.


I tried to create another version of the regression test (0003).
"\dX" was added after ANALYZE command or SELECT... from pg_statistic_ext.

Please find the attached file:
  - 0003-Add-regression-test-of-dX-to-stats_ext.sql-another-ver

Both regression tests 0002 and 0003 are okay for me, I think.
Could you choose one?

Regards,
Tatsuro Yamada

From b22ebf34fc09e246f8d4cf408f76a6753f3d6bcb Mon Sep 17 00:00:00 2001
From: Tatsuro Yamada 
Date: Tue, 10 Nov 2020 16:47:45 +0900
Subject: [PATCH] Add regression test of \dX to stats_ext.sql (another version)

---
 src/test/regress/sql/stats_ext.sql | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/src/test/regress/sql/stats_ext.sql 
b/src/test/regress/sql/stats_ext.sql
index 9781e590a3..c590dcc90a 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -49,6 +49,7 @@ CREATE STATISTICS regress_schema_2.ab1_a_b_stats ON a, b FROM 
ab1;
 
 -- Let's also verify the pg_get_statisticsobjdef output looks sane.
 SELECT pg_get_statisticsobjdef(oid) FROM pg_statistic_ext WHERE stxname = 
'ab1_a_b_stats';
+\dX
 
 DROP STATISTICS regress_schema_2.ab1_a_b_stats;
 
@@ -60,9 +61,10 @@ ALTER TABLE ab1 DROP COLUMN a;
 \d ab1
 -- Ensure statistics are dropped when table is
 SELECT stxname FROM pg_statistic_ext WHERE stxname LIKE 'ab1%';
+\dX
 DROP TABLE ab1;
 SELECT stxname FROM pg_statistic_ext WHERE stxname LIKE 'ab1%';
-
+\dX
 -- Ensure things work sanely with SET STATISTICS 0
 CREATE TABLE ab1 (a INTEGER, b INTEGER);
 ALTER TABLE ab1 ALTER a SET STATISTICS 0;
@@ -73,13 +75,16 @@ ALTER TABLE ab1 ALTER a SET STATISTICS -1;
 -- setting statistics target 0 skips the statistics, without printing any 
message, so check catalog
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS 0;
 \d ab1
+\dX+ ab1_a_b_stats
 ANALYZE ab1;
 SELECT stxname, stxdndistinct, stxddependencies, stxdmcv
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxname = 'ab1_a_b_stats'
AND d.stxoid = s.oid;
+\dX+ ab1_a_b_stats
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS -1;
 \d+ ab1
+\dX+ ab1_a_b_stats
 -- partial analyze doesn't build stats either
 ANALYZE ab1 (a);
 ANALYZE ab1;
@@ -93,6 +98,7 @@ CREATE TABLE ab1c () INHERITS (ab1);
 INSERT INTO ab1 VALUES (1,1);
 CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
+\dX+ ab1_a_b_stats
 DROP TABLE ab1 CASCADE;
 
 -- Verify supported object types for extended statistics
@@ -171,6 +177,7 @@ SELECT s.stxkind, d.stxdndistinct
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxrelid = 'ndistinct'::regclass
AND d.stxoid = s.oid;
+\dX+ s10
 
 -- minor improvement, make sure the ctid does not break the matching
 SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct GROUP BY 
ctid, a, b');
@@ -202,6 +209,7 @@ SELECT s.stxkind, d.stxdndistinct
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxrelid = 'ndistinct'::regclass
AND d.stxoid = s.oid;
+\dX+ s10
 
 -- correct estimates
 SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct GROUP BY a, 
b');
@@ -220,6 +228,7 @@ SELECT s.stxkind, d.stxdndistinct
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxrelid = 'ndistinct'::regclass
AND d.stxoid = s.oid;
+\dX+
 
 -- dropping the statistics results in under-estimates
 SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct GROUP BY a, 
b');
@@ -261,6 +270,7 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
functional_dependencies WHERE
 CREATE STATISTICS func_deps_stat (dependencies) ON a, b, c FROM 
functional_dependencies;
 
 ANALYZE functional_dependencies;
+\dX+ func_deps_stat
 
 SELECT * FROM check_estimated_rows('SELECT * FROM functional_dependencies 
WHERE a = 1 AND b = ''1''');
 
@@ -274,6 +284,7 @@ INSERT INTO functional_dependencies (a, b, c, filler1)
  SELECT mod(i,100), mod(i,50), mod(i,25), i FROM generate_series(1,5000) 
s(i);
 
 ANALYZE functional_dependencies;
+\dX+ func_deps_stat
 
 SELECT * FROM check_estimated_rows('SELECT

74 matches

Mail list logo