Re: cleaning perl code

2020-04-11 Thread Peter Eisentraut

On 2020-04-11 06:30, Noah Misch wrote:

In summary, among those warnings, I see non-negative value in "Code before
warnings are enabled" only.


Now that you put it like this, that was also my impression when I first 
introduced the level 5 warnings and then decided to stop there.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




relcache leak warnings vs. errors

2020-04-11 Thread Peter Eisentraut
I just fixed a relcache leak that I accidentally introduced 
(5a1d0c9925).  Because it was a TAP test involving replication workers, 
you don't see the usual warning anywhere unless you specifically check 
the log files manually.


How about a compile-time option to turn all the warnings in resowner.c 
into errors?  This could be enabled automatically by --enable-cassert, 
similar to other defines that that option enables.
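
To illustrate, a rough sketch of what I mean (RESOWNER_LEAK_ELEVEL is an
invented name, not a concrete proposal for the final spelling):

/*
 * Hypothetical sketch only.  --enable-cassert defines USE_ASSERT_CHECKING,
 * which would flip the reporting level used by resowner.c from WARNING
 * to ERROR.
 */
#ifdef USE_ASSERT_CHECKING
#define RESOWNER_LEAK_ELEVEL	ERROR
#else
#define RESOWNER_LEAK_ELEVEL	WARNING
#endif

/* the existing leak reports would then become, e.g. */
elog(RESOWNER_LEAK_ELEVEL,
	 "relcache reference leak: relation \"%s\" not closed",
	 RelationGetRelationName(rel));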


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: proposal - psql - possibility to redirect only tabular output

2020-04-11 Thread Pavel Stehule
On Sat, Apr 11, 2020 at 8:54 AM, Pavel Stehule wrote:

> Hi
>
> Now, the content of the redirected output has two parts
>
> 1. tabular output
> 2. cmd tags
>
> There is a problem with the command tags, because they are a specific kind
> of information, and it would be nice if they could be redirected to stdout
> every time, like \h output. There could be a new psql variable like
> REDIRECTED_OUTPUT with the possibilities (all, tabular).
>
> What do you think about this?
>

or a different method - set the target of the status row - to result
(default) or stdout (terminal)

patch attached

When I pin the status rows just to stdout, the redirected output contains
only the query results

Regards

Pavel


> Pavel
>
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 621a33f7e8..2b722eb2de 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1072,17 +1072,20 @@ static void
 PrintQueryStatus(PGresult *results)
 {
 	char		buf[16];
+	FILE	   *fp;
+
+	fp = pset.status_target == PSQL_STATUS_RESULT ? pset.queryFout : stdout;
 
 	if (!pset.quiet)
 	{
 		if (pset.popt.topt.format == PRINT_HTML)
 		{
-			fputs("<p>", pset.queryFout);
-			html_escaped_print(PQcmdStatus(results), pset.queryFout);
-			fputs("</p>\n", pset.queryFout);
+			fputs("<p>", fp);
+			html_escaped_print(PQcmdStatus(results), fp);
+			fputs("</p>\n", fp);
 		}
 		else
-			fprintf(pset.queryFout, "%s\n", PQcmdStatus(results));
+			fprintf(fp, "%s\n", PQcmdStatus(results));
 	}
 
 	if (pset.logfile)
diff --git a/src/bin/psql/settings.h b/src/bin/psql/settings.h
index 97941aa10c..2299cb2b6d 100644
--- a/src/bin/psql/settings.h
+++ b/src/bin/psql/settings.h
@@ -70,6 +70,12 @@ typedef enum
 	hctl_ignoreboth = hctl_ignorespace | hctl_ignoredups
 } HistControl;
 
+typedef enum
+{
+	PSQL_STATUS_STDOUT,
+	PSQL_STATUS_RESULT
+} PSQL_STATUS_TARGET;
+
 enum trivalue
 {
 	TRI_DEFAULT,
@@ -135,6 +141,7 @@ typedef struct _psqlSettings
 	PSQL_ECHO_HIDDEN echo_hidden;
 	PSQL_ERROR_ROLLBACK on_error_rollback;
 	PSQL_COMP_CASE comp_case;
+	PSQL_STATUS_TARGET status_target;
 	HistControl histcontrol;
 	const char *prompt1;
 	const char *prompt2;
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 3302bd4dd3..e6b0ea91aa 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -1076,6 +1076,30 @@ histcontrol_hook(const char *newval)
 	return true;
 }
 
+static char *
+status_target_substitute_hook(char *newVal)
+{
+	if (newVal == NULL)
+		newVal = pg_strdup("result");
+	return newVal;
+}
+
+static bool
+status_target_hook(const char *newVal)
+{
+	Assert(newVal != NULL);
+	if (pg_strcasecmp(newVal, "stdout") == 0)
+		pset.status_target = PSQL_STATUS_STDOUT;
+	else if (pg_strcasecmp(newVal, "result") == 0)
+		pset.status_target = PSQL_STATUS_RESULT;
+	else
+	{
+		PsqlVarEnumError("STATUS_TARGET", newVal, "result, stdout");
+		return false;
+	}
+	return true;
+}
+
 static bool
 prompt1_hook(const char *newval)
 {
@@ -1228,4 +1252,7 @@ EstablishVariableSpace(void)
 	SetVariableHooks(pset.vars, "HIDE_TABLEAM",
 	 bool_substitute_hook,
 	 hide_tableam_hook);
+	SetVariableHooks(pset.vars, "STATUS_TARGET",
+	 status_target_substitute_hook,
+	 status_target_hook);
 }
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 0e7a373caf..882baf553e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3879,6 +3879,8 @@ psql_completion(const char *text, int start, int end)
 			COMPLETE_WITH_CS("on", "off", "interactive");
 		else if (TailMatchesCS("SHOW_CONTEXT"))
 			COMPLETE_WITH_CS("never", "errors", "always");
+		else if (TailMatchesCS("STATUS_TARGET"))
+			COMPLETE_WITH_CS("result", "stdout");
 		else if (TailMatchesCS("VERBOSITY"))
 			COMPLETE_WITH_CS("default", "verbose", "terse", "sqlstate");
 	}


Re: relcache leak warnings vs. errors

2020-04-11 Thread Julien Rouhaud
On Sat, Apr 11, 2020 at 10:09:59AM +0200, Peter Eisentraut wrote:
> I just fixed a relcache leak that I accidentally introduced (5a1d0c9925).
> Because it was a TAP test involving replication workers, you don't see the
> usual warning anywhere unless you specifically check the log files manually.
> 
> How about a compile-time option to turn all the warnings in resowner.c into
> errors?  This could be enabled automatically by --enable-cassert, similar to
> other defines that that option enables.
> 

+1.  Would it be worthwhile to do the same in e.g. aset.c (for
MEMORY_CONTEXT_CHECKING case), or more generally for stuff in
src/backend/utils?




Re: proposal - psql - possibility to redirect only tabular output

2020-04-11 Thread Erik Rijkers

On 2020-04-11 10:21, Pavel Stehule wrote:
On Sat, Apr 11, 2020 at 8:54 AM, Pavel Stehule wrote:



[psql-status-target.patch]


Hi Pavel,

This looks interesting, and I built an instance with the patch to try 
it, but I can't figure out how to use it.


Can you perhaps give a few or even just one example?

thanks!

Erik Rijkers




Re: proposal - psql - possibility to redirect only tabular output

2020-04-11 Thread Pavel Stehule
On Sat, Apr 11, 2020 at 11:04 AM, Erik Rijkers wrote:

> On 2020-04-11 10:21, Pavel Stehule wrote:
> > On Sat, Apr 11, 2020 at 8:54 AM, Pavel Stehule wrote:
>
> > [psql-status-target.patch]
>
> Hi Pavel,
>
> This looks interesting, and I built an instance with the patch to try
> it, but I can't figure out how to use it.
>
> Can you perhaps give a few or even just one example?
>

The main motivation for this patch is working with psql for writing and
editing queries while browsing the result in a second terminal with pspg or
any other similar tool (tail, ...). The advantage of this setup is the
possibility to see the SQL and the query result together. I use a terminal
multiplexer (Tilix), but it can be used without one.

So, an example with pspg (it should be a fresh release):

1. create a fifo used for communication - mkfifo ~/pipe

2. run pspg in one terminal - pspg -f ~/pipe --hold-stream=2

3. run psql in the other terminal

psql
\o ~/pipe
CREATE TABLE foo(a int);
INSERT INTO foo VALUES(10);
-- in the default case, the status rows "CREATE", "INSERT" are redirected to
the "browser terminal", which doesn't look good (and is not user friendly).

with the patched version you can do

\set STATUS_TARGET stdout
-- after this setting, the status row will be displayed in the psql terminal

Regards

Pavel





> thanks!
>
> Erik Rijkers
>


Re: Index Skip Scan

2020-04-11 Thread Dilip Kumar
On Wed, Apr 8, 2020 at 1:10 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> > On Mon, Apr 06, 2020 at 06:31:08PM +, Floris Van Nee wrote:
> >
> > > Hm, I wasn't aware about this one, thanks for bringing this up. Btw, 
> > > Floris, I
> > > would appreciate if in the future you can make it more visible that 
> > > changes you
> > > suggest contain some fixes. E.g. it wasn't clear for me from your 
> > > previous email
> > > that that's the case, and it doesn't make sense to pull into different 
> > > direction
> > > when we're trying to achieve the same goal :)
> >
> > I wasn't aware that this particular case could be triggered before I saw 
> > Dilip's email, otherwise I'd have mentioned it here of course. It's just 
> > that because my patch handles filter conditions in general, it works for 
> > this case too.
>
> Oh, then fortunately I've got a wrong impression, sorry and thanks for
> clarification :)
>
> > > > > In the patch I posted a week ago these cases are all handled
> > > > > correctly, as it introduces this extra logic in the Executor.
> > > >
> > > > Okay, So I think we can merge those fixes in Dmitry's patch set.
> > >
> > > I'll definitely take a look at suggested changes in filtering part.
> >
> > It may be possible to just merge the filtering part into your patch, but 
> > I'm not entirely sure. Basically you have to pull the information about 
> > skipping one level up, out of the node, into the generic IndexNext code.
>
> I was actually thinking more about just preventing skip scan in this
> situation, which, if I'm not mistaken, could be solved by inspecting
> qual conditions to figure out if they're covered in the index -
> something like in attachments (this implementation is actually too
> restrictive in this sense and will not allow e.g. expressions, that's
> why I haven't bumped patch set version for it - soon I'll post an
> extended version).

Some more comments...

+ so->skipScanKey->nextkey = ScanDirectionIsForward(dir);
+ _bt_update_skip_scankeys(scan, indexRel);
+
...
+ /*
+ * We haven't found scan key within the current page, so let's scan from
+ * the root. Use _bt_search and _bt_binsrch to get the buffer and offset
+ * number
+ */
+ so->skipScanKey->nextkey = ScanDirectionIsForward(dir);
+ stack = _bt_search(scan->indexRelation, so->skipScanKey,
+&buf, BT_READ, scan->xs_snapshot);

Why do we need to set so->skipScanKey->nextkey =
ScanDirectionIsForward(dir); multiple times?  I think it is fine to
just set it once?

+static inline bool
+_bt_scankey_within_page(IndexScanDesc scan, BTScanInsert key,
+ Buffer buf, ScanDirection dir)
+{
+ OffsetNumber low, high;
+ Page page = BufferGetPage(buf);
+ BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);
+
+ low = P_FIRSTDATAKEY(opaque);
+ high = PageGetMaxOffsetNumber(page);
+
+ if (unlikely(high < low))
+ return false;
+
+ return (_bt_compare(scan->indexRelation, key, page, low) > 0 &&
+ _bt_compare(scan->indexRelation, key, page, high) < 1);
+}

I think the high key condition should be changed to
_bt_compare(scan->indexRelation, key, page, high) < 0 ?  Because if the
prefix qual is equal to the high key, then there is also no point in
searching the current page, so we can directly skip.


+ /* Check if an index skip scan is possible. */
+ can_skip = enable_indexskipscan & index->amcanskip;
+
+ /*
+ * Skip scan is not supported when there are qual conditions, which are not
+ * covered by index. The reason for that is that those conditions are
+ * evaluated later, already after skipping was applied.
+ *
+ * TODO: This implementation is too restrictive, and doesn't allow e.g.
+ * index expressions. For that we need to examine index_clauses too.
+ */
+ if (root->parse->jointree != NULL)
+ {
+ ListCell *lc;
+
+ foreach(lc, (List *)root->parse->jointree->quals)
+ {
+ Node *expr, *qual = (Node *) lfirst(lc);
+ Var *var;

I think we can avoid checking for expression if can_skip is already false.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Add A Glossary

2020-04-11 Thread Jürgen Purtz

On 2020-Apr-05, Jürgen Purtz wrote:


a) Some rearrangements of the sequence of terms to meet alphabetical order.

Thanks, will get this pushed.


b)   -->   in
two cases. Or should it be a ?

Ah, yeah, those should be linkend.

Term 'relation': A sequence is internally a table with one row - right? 
Shall we extend the list of concrete relations by 'sequence'? Or is this 
not necessary because 'table' is already there?


Kind regards, Jürgen






execExprInterp() questions / How to improve scalar array op expr eval?

2020-04-11 Thread James Coleman
Short version:

In what I'm currently working on I had a few questions about arrays
and the execExpr/execExprInterp framework that didn't seem obviously
answered in the code or README.

- Does the execExpr/execExprInterp framework allow a scalar array op
to get an already expanded array (unless I'm missing something we
can't easily lookup a given index in a flattened array)?
- If not, is there a way in that framework to know if the array expr
has stayed the same through multiple evaluations of the expression
tree (i.e., so you could expand and sort it just once)?

Long version/background:

I've been looking at how a query like:
select * from t where i in (<long list of values>);
executes: as a seq scan, the execution time increases linearly with
respect to the length of the array. That's because in
execExprInterp.c's ExecEvalScalarArrayOp() we do a linear search
through the array.

In contrast, when setting up a btree scan with a similar saop, we
first sort the array, remove duplicates, and remove nulls. Of course
with btree scans this has other values, like allowing us to return
results in array order.

I've been considering approaches to improve the seq scan case. We might:
- At plan time rewrite as a hash join to a deduped array values
relation (gross, but included here since that's an approach we can
take rewriting the SQL itself as a proof of concept).
- At execution time build a hash and lookup.
- At execution time sort the array and binary search through it.

I've been working on the last approach to see what results I might get
(it seemed like the easiest to hack together). Putting that together
left me with the questions mentioned above in the "short version".
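
To make that last approach concrete, the core of the hack looks roughly
like this (a sketch only: SortedSaopState and saop_binary_search are
invented names, and collation handling is omitted):

#include "postgres.h"
#include "fmgr.h"

/*
 * Sketch of "sort once, binary search per row" for a constant IN-list.
 * Assumes the array has already been deconstructed, de-duplicated, and
 * sorted once (at expression initialization) with the element type's
 * btree comparison proc.
 */
typedef struct SortedSaopState
{
	Datum	   *elems;			/* sorted, deduped, nulls removed */
	int			nelems;
	FmgrInfo	cmpfinfo;		/* btree comparison function */
} SortedSaopState;

static bool
saop_binary_search(SortedSaopState *state, Datum value)
{
	int			lo = 0;
	int			hi = state->nelems - 1;

	while (lo <= hi)
	{
		int			mid = lo + (hi - lo) / 2;
		int32		cmp = DatumGetInt32(FunctionCall2(&state->cmpfinfo,
													  value,
													  state->elems[mid]));

		if (cmp == 0)
			return true;		/* found a match */
		else if (cmp < 0)
			hi = mid - 1;
		else
			lo = mid + 1;
	}
	return false;
}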

Obviously if anyone has thoughts on the above approaches I'd be
interested in that too.

Side question: when we do:
arr = DatumGetArrayTypeP(*op->resvalue);
in ExecEvalScalarArrayOp() is that going to be expensive each time
through a seq scan? Or is it (likely) going to resolve to an already
in-memory array and effectively be the cost of retrieving that
pointer?

James




Re: WAL usage calculation patch

2020-04-11 Thread Julien Rouhaud
On Fri, Apr 10, 2020 at 9:37 PM Julien Rouhaud  wrote:
>
> On Fri, Apr 10, 2020 at 8:17 AM Amit Kapila  wrote:
> > Would you like to send a consolidated patch that includes Euler's
> > suggestion and Justin's patch (by making changes for points we
> > discussed.)?  I think we can keep the point related to number of
> > spaces before each field open?
>
> Sure, I'll take care of that tomorrow!

I tried to take into account all that have been discussed, but I have
to admit that I'm absolutely not sure of what was actually decided
here.  I went with those changes:

- rename wal_num_fpw to wal_fpw for consistency, both in the pgss view
field name but also everywhere in the code
- change comments to consistently mention "full page writes generated"
- changed pgss and explain documentation to mention "full page images
generated", from Justin's patch on another thread
- kept "amount" of WAL bytes
- no change to the explain output as I have no idea what is the
consensus (one or two spaces, use semicolon or equal, show unit or
not)
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql b/contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql
index 30566574ab..d0a6c3b4fc 100644
--- a/contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql
@@ -43,7 +43,7 @@ CREATE FUNCTION pg_stat_statements(IN showtext boolean,
 OUT blk_read_time float8,
 OUT blk_write_time float8,
 OUT wal_records int8,
-OUT wal_num_fpw int8,
+OUT wal_fpw int8,
 OUT wal_bytes numeric
 )
 RETURNS SETOF record
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 04abdab904..90bc6fd013 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -189,7 +189,7 @@ typedef struct Counters
 	double		blk_write_time; /* time spent writing, in msec */
 	double		usage;			/* usage factor */
 	int64		wal_records;	/* # of WAL records generated */
-	int64		wal_num_fpw;	/* # of WAL full page image records generated */
+	int64		wal_fpw;	/* # of WAL full page writes generated */
 	uint64		wal_bytes;		/* total amount of WAL bytes generated */
 } Counters;
 
@@ -1432,7 +1432,7 @@ pgss_store(const char *query, uint64 queryId,
 		e->counters.blk_write_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_write_time);
 		e->counters.usage += USAGE_EXEC(total_time);
 		e->counters.wal_records += walusage->wal_records;
-		e->counters.wal_num_fpw += walusage->wal_num_fpw;
+		e->counters.wal_fpw += walusage->wal_fpw;
 		e->counters.wal_bytes += walusage->wal_bytes;
 
 		SpinLockRelease(&e->mutex);
@@ -1824,7 +1824,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
 			Datum		wal_bytes;
 
 			values[i++] = Int64GetDatumFast(tmp.wal_records);
-			values[i++] = Int64GetDatumFast(tmp.wal_num_fpw);
+			values[i++] = Int64GetDatumFast(tmp.wal_fpw);
 
 			snprintf(buf, sizeof buf, UINT64_FORMAT, tmp.wal_bytes);
 
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index 5a962feb39..f0b769fad8 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -283,11 +283,11 @@
  
 
  
-  wal_num_fpw
+  wal_fpw
   bigint
   
   
-Total count of WAL full page writes generated by the statement
+Total count of WAL full page images generated by the statement
   
  
 
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 024ede4a8d..1e12715a03 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -198,8 +198,8 @@ ROLLBACK;
 
  
   Include information on WAL record generation. Specifically, include the
-  number of records, number of full page image records and amount of WAL
-  bytes generated.  In text format, only non-zero values are printed.  This
+  number of records, number of full page images and amount of WAL bytes
+  generated.  In text format, only non-zero values are printed.  This
   parameter may only be used when ANALYZE is also
   enabled.  It defaults to FALSE.
  
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index f3382d37a4..d8bc06fe0b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -676,7 +676,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 			 _("WAL usage: %ld records, %ld full page writes, "
 			   UINT64_FORMAT " bytes"),
 			 walusage.wal_records,
-			 walusage.wal_num_fpw,
+			 walusage.wal_fpw,
 			 walusage.wal_bytes);
 
 			ereport(LOG,
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c38bc1412d..11e32733c4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1258,7 +1258,7 @@ XLogInsertRecord(XLogRecData *rdata,
 	{
 		pgWalUsage.w

Re: relcache leak warnings vs. errors

2020-04-11 Thread Tom Lane
Peter Eisentraut  writes:
> How about a compile-time option to turn all the warnings in resowner.c 
> into errors?  This could be enabled automatically by --enable-cassert, 
> similar to other defines that that option enables.

[ itch... ]  Those calls occur post-commit; throwing an error there
is really a mess, which is why it's only WARNING now.

I guess you could make them PANICs, but it would be an option that nobody
could possibly want to have enabled in anything resembling production.
So I"m kind of -0.5 on making --enable-cassert do it automatically.
Although I suppose that it's not really worse than other assertion
failures.

regards, tom lane




Re: cleaning perl code

2020-04-11 Thread Tom Lane
Noah Misch  writes:
> In summary, among those warnings, I see non-negative value in "Code before
> warnings are enabled" only.  While we're changing this, I propose removing
> Subroutines::RequireFinalReturn.

If it's possible to turn off just that warning, then +several.
It's routinely caused buildfarm failures, yet I can detect exactly
no value in it.  If there were sufficient cross-procedural analysis
backing it to detect whether any caller examines the subroutine's
result value, then it'd be worth having.  But there isn't, so those
extra returns are just pedantic verbosity.

regards, tom lane




Re: cleaning perl code

2020-04-11 Thread Andrew Dunstan

On 4/11/20 12:30 AM, Noah Misch wrote:
> On Thu, Apr 09, 2020 at 11:44:11AM -0400, Andrew Dunstan wrote:
>>  39 Always unpack @_ first
> Requiring a "my @args = @_" does not improve this code:
>
> sub CreateSolution
> {
> ...
>   if ($visualStudioVersion eq '12.00')
>   {
>   return new VS2013Solution(@_);
>   }
>
>>  30 Code before warnings are enabled
> Sounds good.  We already require "use strict" before code.  Requiring "use
> warnings" in the exact same place does not impose much burden.
>
>>  12 Subroutine "new" called using indirect syntax
> No, thanks.  "new VS2013Solution(@_)" and "VS2013Solution->new(@_)" are both
> fine; enforcing the latter is an ongoing waste of effort.
>
>>   9 Multiple "package" declarations
> This is good advice if you're writing for CPAN, but it would make PostgreSQL
> code worse by having us split affiliated code across multiple files.
>
>>   9 Expression form of "grep"
> No, thanks.  I'd be happier with the opposite, requiring grep(/x/, $arg)
> instead of grep { /x/ } $arg.  Neither is worth enforcing.
>
>>   7 Symbols are exported by default
> This is good advice if you're writing for CPAN.  For us, it just adds typing.
>
>>   5 Warnings disabled
>>   4 Magic variable "$/" should be assigned as "local"
>>   4 Comma used to separate statements
>>   2 Readline inside "for" loop
>>   2 Pragma "constant" used
>>   2 Mixed high and low-precedence booleans
>>   2 Don't turn off strict for large blocks of code
>>   1 Magic variable "@a" should be assigned as "local"
>>   1 Magic variable "$|" should be assigned as "local"
>>   1 Magic variable "$\" should be assigned as "local"
>>   1 Magic variable "$?" should be assigned as "local"
>>   1 Magic variable "$," should be assigned as "local"
>>   1 Magic variable "$"" should be assigned as "local"
>>   1 Expression form of "map"
> I looked less closely at the rest, but none give me a favorable impression.



I don't have a problem with some of this. OTOH, it's nice to know what
we're ignoring and what we're not.


What I have prepared is first a patch that lowers the severity level to
3 but implements policy exceptions so that nothing is broken. Then 3
patches. One fixes the missing warnings pragma and removes shebang -w
switches, so we are quite consistent about how we do this. I gather we
are agreed about that one. The next one fixes those magic variable
errors. That includes using some more idiomatic perl, and in one case
just renaming a couple of variables that are fairly opaque anyway. The
last one fixes the mixture of high and low precedence boolean operators,
the inefficient readline inside a foreach loop, and the use of commas to
separate statements, and relaxes the policy about large blocks with 'no
strict'.


Since I have written them they are attached, for posterity if nothing
else. :-)



>
>
> In summary, among those warnings, I see non-negative value in "Code before
> warnings are enabled" only.  While we're changing this, I propose removing
> Subroutines::RequireFinalReturn.  Implicit return values were not a material
> source of PostgreSQL bugs, yet we've allowed this to litter our code:
>

That doesn't mean it won't be a source of problems in future, I've
actually been bitten by this in the past.


cheers


andrew


-- 
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index da34124595..25b18e84c5 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -590,7 +590,7 @@ EOM
 
 		# Special hack to generate OID symbols for pg_type entries
 		# that lack one.
-		if ($catname eq 'pg_type' and !exists $bki_values{oid_symbol})
+		if ($catname eq 'pg_type' && !exists $bki_values{oid_symbol})
 		{
 			my $symbol = form_pg_type_symbol($bki_values{typname});
 			$bki_values{oid_symbol} = $symbol
diff --git a/src/backend/parser/check_keywords.pl b/src/backend/parser/check_keywords.pl
index 68d1f517b7..6b27fbf1be 100644
--- a/src/backend/parser/check_keywords.pl
+++ b/src/backend/parser/check_keywords.pl
@@ -44,12 +44,12 @@ line: while (my $S = <$gram>)
 	my $s;
 
 	# Make sure any braces are split
-	$s = '{', $S =~ s/$s/ { /g;
-	$s = '}', $S =~ s/$s/ } /g;
+	$s = '{'; $S =~ s/$s/ { /g;
+	$s = '}'; $S =~ s/$s/ } /g;
 
 	# Any comments are split
-	$s = '[/][*]', $S =~ s#$s# /* #g;
-	$s = '[*][/]', $S =~ s#$s# */ #g;
+	$s = '[/][*]'; $S =~ s#$s# /* #g;
+	$s = '[*][/]'; $S =~ s#$s# */ #g;
 
 	if (!($kcat))
 	{
diff --git a/src/common/unicode/generate-unicode_combining_table.pl b/src/common/unicode/generate-unicode_combining_table.pl
index e468a5f8c9..c984a903ee 100644
--- a/src/common/unicode/generate-unicode_combining_table.pl
+++ b/src/common/unicode/generate-unicode_comb

Re: where should I stick that backup?

2020-04-11 Thread Jose Luis Tallon

On 10/4/20 15:49, Robert Haas wrote:

On Thu, Apr 9, 2020 at 6:44 PM Bruce Momjian  wrote:

Good point, but if there are multiple APIs, it makes shell script
flexibility even more useful.

[snip]

One thing I do think would be realistic would be to invent a set of
tools that are perform certain local filesystem operations in a
"hardened" way.

+10

  Maybe a single tool with subcommands and options. So
you could say, e.g. 'pgfile cp SOURCE TARGET' and it would create a
temporary file in the target directory, write the contents of the
source into that file, fsync the file, rename it into place, and do
more fsyncs to make sure it's all durable in case of a crash. You
could have a variant of this that instead of using the temporary file
and rename in place approach, does the thing where you open the target
file with O_CREAT|O_EXCL, writes the bytes, and then closes and fsyncs
it.
Behaviour might be decided in the same way as the default for 
'wal_sync_method' gets chosen, as the most appropriate for a particular 
system.
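
For concreteness, a minimal sketch of that create-temp / write / fsync /
rename / fsync-the-directory dance in plain POSIX C (durable_copy is an
invented name, and error handling is abbreviated):

#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <unistd.h>

/* Sketch only: copy src to dst so that dst is durable after a crash. */
static int
durable_copy(const char *src, const char *dst)
{
	char		tmp[1024], dir[1024], buf[8192];
	ssize_t		n;
	int			sfd, tfd, dfd;

	snprintf(tmp, sizeof(tmp), "%s.tmp", dst);
	snprintf(dir, sizeof(dir), "%s", dst);

	if ((sfd = open(src, O_RDONLY)) < 0)
		return -1;
	if ((tfd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0600)) < 0)
		return -1;

	while ((n = read(sfd, buf, sizeof(buf))) > 0)
		if (write(tfd, buf, n) != n)
			return -1;

	if (fsync(tfd) < 0)			/* contents durable before the rename */
		return -1;
	close(tfd);
	close(sfd);

	if (rename(tmp, dst) < 0)	/* atomically move into place */
		return -1;

	/* fsync the directory so the rename itself survives a crash */
	if ((dfd = open(dirname(dir), O_RDONLY)) < 0)
		return -1;
	if (fsync(dfd) < 0)
		return -1;
	close(dfd);
	return 0;
}

The O_CREAT|O_EXCL variant mentioned above would simply write dst directly
and skip the temporary file and rename steps.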

And you could have other things too, like 'pgfile mkdir DIR' to
create a directory and fsync it for durability. A toolset like this
would probably help people write better archive commands


Definitely, "mkdir" and "create-exclusive" (along with cp) would be a 
great addition and simplify the kind of tasks properly (i.e. with 
risking data loss every time)

[excerpted]

pg_basebackup -Ft --pipe-output 'bzip | pgfile create-exclusive - %f.bz2'

[]

pg_basebackup -Ft --pipe-output 'bzip | gpg -e | ssh someuser@somehost
pgfile create-exclusive - /backups/tuesday/%f.bz2'

Yep. Would also fit the case for non-synchronous NFS mounts for backups...

It is of course not impossible to teach pg_basebackup to do all of
that stuff internally, but I have a really difficult time imagining us
ever getting it done. There are just too many possibilities, and new
ones arise all the time.


Indeed. The beauty of Unix-like OSs is precisely this.


A 'pgfile' utility wouldn't help at all for people who are storing to
S3 or whatever. They could use 'aws s3' as a target for --pipe-output,
[snip]
(Incidentally, pg_basebackup already has an option to output the
entire backup as a tarfile on standard output, and a user can already
pipe that into any tool they like. However, it doesn't work with
tablespaces. So you could think of this proposal as extending the
existing functionality to cover that case.)


Been there already :S  Having pg_basebackup output multiple tarballs
(one per tablespace), ideally separated via something so that splitting
can be trivially done on the receiving end, would be great.


...but that's probably matter for another thread.


Thanks,

    / J.L.






Re: cleaning perl code

2020-04-11 Thread Mark Dilger



> On Apr 11, 2020, at 9:13 AM, Andrew Dunstan  
> wrote:

Hi Andrew.  I appreciate your interest and efforts here.  I hope you don't mind 
a few questions/observations about this effort:

> 
>  The
> last one fixes the mixture of high and low precedence boolean operators,

I did not spot examples of this in your diffs, but I assume you mean to 
prohibit conditionals like:

if ($a || $b and $c || $d)

As I understand it, perl introduced low precedence operators precisely to allow 
this.  Why disallow it?

> and the use of commas to separate statements

I don't understand the prejudice against commas used this way.  What is wrong 
with:

$i++, $j++ if defined $k;

rather than:

if (defined $k)
{
$i++;
$j++;
}

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: range_agg

2020-04-11 Thread Paul A Jungwirth
On Sat, Apr 11, 2020 at 9:36 AM Paul A Jungwirth
 wrote:
> Btw I'm working on typanalyze + selectivity, and it seems like the
> test suite doesn't run those things?

Nevermind, I just had to add `analyze numrange_test` to
src/test/regress/sql/rangetypes.sql. :-) Do you want a separate patch
for that? Or maybe it should go in sql/vacuum.sql?

Regards,
Paul




Re: cleaning perl code

2020-04-11 Thread Andrew Dunstan


On 4/11/20 12:28 PM, Mark Dilger wrote:
>
>> On Apr 11, 2020, at 9:13 AM, Andrew Dunstan  
>> wrote:
> Hi Andrew.  I appreciate your interest and efforts here.  I hope you don't 
> mind a few questions/observations about this effort:


Not at all.


>
>>  The
>> last one fixes the mixture of high and low precedence boolean operators,
> I did not spot examples of this in your diffs, but I assume you mean to 
> prohibit conditionals like:
>
> if ($a || $b and $c || $d)
>
> As I understand it, perl introduced low precedence operators precisely to 
> allow this.  Why disallow it?


The docs say:


    Conway advises against combining the low-precedence booleans ( and
    or not ) with the high-precedence boolean operators ( && || ! )
    in the same expression. Unless you fully understand the differences
    between the high and low-precedence operators, it is easy to
    misinterpret expressions that use both. And even if you do
    understand them, it is not always clear if the author actually
    intended it.

    next if not $foo || $bar;   #not ok
    next if !$foo || $bar;      #ok
    next if !( $foo || $bar );  #ok


I don't feel terribly strongly about it, but personally I just about
never use the low precedence operators, and mostly prefer to resolve
precedence issues with parentheses.


>
>> and the use of commas to separate statements
> I don't understand the prejudice against commas used this way.  What is wrong 
> with:
>
> $i++, $j++ if defined $k;
>
> rather than:
>
> if (defined $k)
> {
> $i++;
> $j++;
> }
>


I don't think the example is terribly clear. I have to look at it and
think "Does it do $i++ if $k isn't defined?"

In the cases we actually have there isn't even any shorthand advantage
like this. There are only a couple of cases.


cheers


andrew

-- 
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: cleaning perl code

2020-04-11 Thread Tom Lane
Andrew Dunstan  writes:
> On 4/11/20 12:30 AM, Noah Misch wrote:
>> In summary, among those warnings, I see non-negative value in "Code before
>> warnings are enabled" only.  While we're changing this, I propose removing
>> Subroutines::RequireFinalReturn.  Implicit return values were not a material
>> source of PostgreSQL bugs, yet we've allowed this to litter our code:

> That doesn't mean it won't be a source of problems in future, I've
> actually been bitten by this in the past.

Yeah, as I recall, the reason for the restriction is that if you fall out
without a "return", what's returned is the side-effect value of the last
statement, which might be fairly surprising.  Adding explicit "return;"
guarantees an undef result.  So when this does prevent a bug it could
be a pretty hard-to-diagnose one.  The problem is that it's a really
verbose/pedantic requirement for subs that no one ever examines the
result value of.

Is there a way to modify the test so that it only complains when
the final return is missing and there are other return(s) with values?
That would seem like a more narrowly tailored check.

regards, tom lane




Re: range_agg

2020-04-11 Thread Alvaro Herrera
On 2020-Apr-11, Paul A Jungwirth wrote:

> On Sat, Apr 11, 2020 at 9:36 AM Paul A Jungwirth
>  wrote:
> > Btw I'm working on typanalyze + selectivity, and it seems like the
> > test suite doesn't run those things?
> 
> Nevermind, I just had to add `analyze numrange_test` to
> src/test/regress/sql/rangetypes.sql. :-) Do you want a separate patch
> for that? Or maybe it should go in sql/vacuum.sql?

Dunno, it seems fine in rangetypes.sql.

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: cleaning perl code

2020-04-11 Thread Andrew Dunstan


On 4/11/20 12:48 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 4/11/20 12:30 AM, Noah Misch wrote:
>>> In summary, among those warnings, I see non-negative value in "Code before
>>> warnings are enabled" only.  While we're changing this, I propose removing
>>> Subroutines::RequireFinalReturn.  Implicit return values were not a material
>>> source of PostgreSQL bugs, yet we've allowed this to litter our code:
>> That doesn't mean it won't be a source of problems in future, I've
>> actually been bitten by this in the past.
> Yeah, as I recall, the reason for the restriction is that if you fall out
> without a "return", what's returned is the side-effect value of the last
> statement, which might be fairly surprising.  Adding explicit "return;"
> guarantees an undef result.  So when this does prevent a bug it could
> be a pretty hard-to-diagnose one.  The problem is that it's a really
> verbose/pedantic requirement for subs that no one ever examines the
> result value of.
>
> Is there a way to modify the test so that it only complains when
> the final return is missing and there are other return(s) with values?
> That would seem like a more narrowly tailored check.
>
>   



Not AFAICS:



That would probably require writing a replacement module. Looking at the
source if this module I think it might be possible, although I don't
know much of the internals of perlcritic.


cheers


andrew


-- 
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: cleaning perl code

2020-04-11 Thread Mark Dilger



> On Apr 11, 2020, at 9:47 AM, Andrew Dunstan  
> wrote:
> 
> 
> On 4/11/20 12:28 PM, Mark Dilger wrote:
>> 
>>> On Apr 11, 2020, at 9:13 AM, Andrew Dunstan 
>>>  wrote:
>> Hi Andrew.  I appreciate your interest and efforts here.  I hope you don't 
>> mind a few questions/observations about this effort:
> 
> 
> Not at all.
> 
> 
>> 
>>> The
>>> last one fixes the mixture of high and low precedence boolean operators,
>> I did not spot examples of this in your diffs, but I assume you mean to 
>> prohibit conditionals like:
>> 
>>if ($a || $b and $c || $d)
>> 
>> As I understand it, perl introduced low precedence operators precisely to 
>> allow this.  Why disallow it?
> 
> 
> The docs say:
> 
> 
>    Conway advises against combining the low-precedence booleans ( and
>    or not ) with the high-precedence boolean operators ( && || ! )
>    in the same expression. Unless you fully understand the differences
>    between the high and low-precedence operators, it is easy to
>    misinterpret expressions that use both. And even if you do
>    understand them, it is not always clear if the author actually
>    intended it.
> 
>    next if not $foo || $bar;   #not ok
>    next if !$foo || $bar;      #ok
>    next if !( $foo || $bar );  #ok

I don't think any of those three are ok, from a code review perspective, but 
it's not because high and low precedence operators were intermixed.

>> 
>>> and the use of commas to separate statements
>> I don't understand the prejudice against commas used this way.  What is 
>> wrong with:
>> 
>>$i++, $j++ if defined $k;
>> 
>> rather than:
>> 
>>if (defined $k)
>>{
>>$i++;
>>$j++;
>>}
>> 
> 
> 
> I don't think the example is terribly clear. I have to look at it and
> think "Does it do $i++ if $k isn't defined?"

It works like the equivalent C-code:

if (k)
i++, j++;

which to my eyes is also fine.

I'm less concerned with which perlcritic features you enable than I am with 
accidentally submitting perl which looks fine to me but breaks the build.  I 
mostly use perl from within TAP tests, which I run locally before submission to 
the project.  Can your changes be integrated into the TAP_TESTS makefile target 
so that I get local errors about this stuff and can fix it before submitting a 
regression test to -hackers?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: cleaning perl code

2020-04-11 Thread Tom Lane
Andrew Dunstan  writes:
> On 4/11/20 12:48 PM, Tom Lane wrote:
>> Is there a way to modify the test so that it only complains when
>> the final return is missing and there are other return(s) with values?
>> That would seem like a more narrowly tailored check.

> Not AFAICS:
> 

Yeah, the list of all policies in the parent page doesn't offer any
promising alternatives either :-(

BTW, this bit in the policy's man page seems pretty disheartening:

Be careful when fixing problems identified by this Policy; don't
blindly put a return; statement at the end of every subroutine.

since I'd venture that's *exactly* what we've done every time perlcritic
moaned about this.  I wonder what else the author expected would happen.

> That would probably require writing a replacement module. Looking at the
> source if this module I think it might be possible, although I don't
> know much of the internals of perlcritic.

I doubt we want to go maintaining our own perlcritic policies; aside from
the effort involved, it'd become that much harder for anyone to reproduce
the results.

regards, tom lane




Re: Complete data erasure

2020-04-11 Thread Tomas Vondra

On Fri, Apr 10, 2020 at 08:23:32AM +, asaba.takan...@fujitsu.com wrote:

Hello,

I was off the point.
I want to organize the discussion and suggest feature design.

There are two opinions.
1. COMMIT should not take a long time because errors are more likely to occur.


I don't think it's a matter of commit duration but a question what to do
in response to errors in the data erasure code - which is something we
can't really rule out if we allow custom scripts to perform the erasure.
If the erasure took very long but couldn't possibly fail, it'd be much
easier to handle than fast erasure failing often.

The difficulty of error-handling is why adding new stuff to commit may
be tricky. Which is why I proposed not to do the failure-prone code in
commit itself, but move it to a separate process.


2. The data area should be released when COMMIT is completed because COMMIT has 
to be an atomic action.



I don't think "commit is atomic" really implies "data should be released
at commit". This is precisely what makes the feature extremely hard to
implement, IMHO.

Why wouldn't it be acceptable to do something like this?

BEGIN;
...
DROP TABLE x ERASE;
...
COMMIT;  <-- Don't do data erasure, just add "x" to queue.

-- wait for another process to complete the erasure
SELECT pg_wait_for_erasure();

That means we're not running any custom commands / code during commit,
which should (hopefully) make it easier to handle errors.



These opinions are correct.
But it is difficult to satisfy them at the same time.
So I suggest that users have the option to choose.
DROP TABLE works as following two patterns:

1. Rename the data file to "...del" instead of ftruncate(fd,0).
 After that, a bgworker scans the directory and runs erase_command.
 (erase_command is a command set by the user, like archive_command.
  For example, shred on Linux.)

2. Run erase_command for the data file immediately before ftruncate(fd,0).
 Wait until it completes, then reply COMMIT to the client.
 After that, it is the same as normal processing.

If an error of erase_command occurs, it issues a WARNING and doesn't
request unlink from the CheckPointer.
It's not a security failure because I think that there is a risk when the
data area is returned to the OS.



I think it was already disicussed why doing file renames and other
expensive stuff that could fail is a bad idea. And I'm not sure just
ignoring erase_command failures (because that's what WARNING does) is
really appropriate for this feature.


I will implement pattern 2 first because it's more similar to the current
user experience than pattern 1.
This method has been pointed out as follows.

From Stephen

We certainly can't run external commands during transaction COMMIT, so
this can't be part of a regular DROP TABLE.


I think it means that errors of external commands can't be handled.
If so, it's no problem because I have determined the behavior after an error.
Are there any other problems?


I'm not sure what you mean by "determined behavior after error"? You
essentially propose to just print a warning and be done with it. But
that means we can simply leave data files with sensitive data on the
disk, which seems ... not great.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: cleaning perl code

2020-04-11 Thread Tom Lane
Mark Dilger  writes:
> I'm less concerned with which perlcritic features you enable than I am with 
> accidentally submitting perl which looks fine to me but breaks the build.  I 
> mostly use perl from within TAP tests, which I run locally before submission 
> to the project.  Can your changes be integrated into the TAP_TESTS makefile 
> target so that I get local errors about this stuff and can fix it before 
> submitting a regression test to -hackers?

As far as that goes, I think crake is just running

src/tools/perlcheck/pgperlcritic

which you can do for yourself as long as you've got perlcritic
installed.

regards, tom lane




Re: Complete data erasure

2020-04-11 Thread Tom Lane
Tomas Vondra  writes:
> I don't think "commit is atomic" really implies "data should be released
> at commit". This is precisely what makes the feature extremely hard to
> implement, IMHO.

> Why wouldn't it be acceptable to do something like this?

>  BEGIN;
>  ...
>  DROP TABLE x ERASE;
>  ...
>  COMMIT;  <-- Don't do data erasure, just add "x" to queue.

>  -- wait for another process to complete the erasure
>  SELECT pg_wait_for_erasure();

> That means we're not running any custom commands / code during commit,
> which should (hopefully) make it easier to handle errors.

Yeah, adding actions-that-could-fail to commit is a very hard sell,
so something like this API would probably have a better chance.

However ... the whole concept of erasure being a committable action
seems basically misguided from here.  Consider this scenario:

begin;

create table full_o_secrets (...);

... manipulate secret data in full_o_secrets ...

drop table full_o_secrets erase;

... do something that unintentionally fails, causing xact abort ...

commit;

Now what?  Your secret data is all over the disk and you have *no*
recourse to get rid of it; that's true even at a very low level,
because we unlinked the file when rolling back the transaction.
If the error occurred before getting to "drop table full_o_secrets
erase" then there isn't even any way in principle for the server
to know that you might not be happy about leaving that data lying
around.

And I haven't even spoken of copies that may exist in WAL, or
have been propagated to standby servers by now.

I have no idea what an actual solution that accounted for those
problems would look like.  But as presented, this is a toy feature
offering no real security gain, if you ask me.

regards, tom lane




Re: execExprInterp() questions / How to improve scalar array op expr eval?

2020-04-11 Thread Andres Freund
Hi,


Tom, CCing you because of expanded datum question at bottom.


On 2020-04-11 08:58:46 -0400, James Coleman wrote:
> - Does the execExpr/execExprInterp framework allow a scalar array op
> to get an already expanded array (unless I'm missing something we
> can't easily lookup a given index in a flattened array)?

Well, I'm not quite sure what you're thinking of. If the input is
constant, then expression initialization can do just about everything it
wants. Including preprocessing the array into whatever form it wants.
But there's no logic for doing preprocessing whenever values change.


> - If not, is there a way in that framework to know if the array expr
> has stayed the same through multiple evaluations of the expression
> tree (i.e., so you could expand and sort it just once)?

No.


> Long version/background:
> 
> I've been looking at how a query like:
> select * from t where i in (<long list of values>);
> executes as a seq scan the execution time increases linearly with
> respect to the length of the array. That's because in
> execExprInterp.c's ExecEvalScalarArrayOp() we do a linear search
> through the array.

Is "" constant in the cases you're interested in? Because
it's a heck of a lot easier to add an optimization for that case, than
adding runtime tracking the array values by comparing the whole array
for equality with the last - the comparisons of the whole array could
easily end up adding more cost than what's being saved.


> I've been considering approaches to improve the seq scan case. We might:
> - At plan time rewrite as a hash join to a deduped array values
> relation (gross, but included here since that's an approach we can
> take rewriting the SQL itself as a proof of concept).
> - At execution time build a hash and lookup.
> - At execution time sort the array and binary search through it.
> 
> > I've been working on the last approach to see what results I might get
> (it seemed like the easiest to hack together). Putting that together
> left me with the questions mentioned above in the "short version".
> 
> Obviously if anyone has thoughts on the above approaches I'd be
> interested in that too.

If you're content with optimizing constant arrays, I'd go for detecting
that case in the T_ScalarArrayOpExpr case in ExecInitExprRec(), and
preprocessing the array into an optimized form. Probably with a separate
opcode for execution.


> Side question: when we do:
> arr = DatumGetArrayTypeP(*op->resvalue);
> in ExecEvalScalarArrayOp() is that going to be expensive each time
> through a seq scan? Or is it (likely) going to resolve to an already
> in-memory array and effectively be the cost of retrieving that
> pointer?

It Depends TM.  For the constant case it's likely going to be cheap-ish,
because it'll not be toasted. For the case where it's the return value
from a subquery or something, you cannot assume it won't change between
calls.

I think the worst case here is something like a nestloop, where the
inner side does foo IN (outer.column). If I recall the code correctly,
we'll currently end up detoasting the array value every single
iteration.

I wonder if it would be a good idea to change ExecEvalParamExec and
ExecEvalParamExtern to detoast the to-be-returned value. If we change
the value that's stored in econtext->ecxt_param_exec_vals /
econtext->ecxt_param_list_info, we'd avoid repeated detoasting.
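
Something along these lines, perhaps (a sketch only, for the
EEOP_PARAM_EXEC case; it ignores pass-by-value parameter types, i.e. it
assumes a varlena-typed param, and the question of which memory context
should own the detoasted copy):

	ParamExecData *prm = &(econtext->ecxt_param_exec_vals[op->d.param.paramid]);

	/* detoast once and write back, so later evaluations see the result */
	if (!prm->isnull &&
		VARATT_IS_EXTENDED(DatumGetPointer(prm->value)))
		prm->value = PointerGetDatum(PG_DETOAST_DATUM(prm->value));

	*op->resvalue = prm->value;
	*op->resnull = prm->isnull;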

It'd be nice if we somehow could make the expanded datum machinery work
here. I'm not quite seeing how though?

Crazy idea: I have a patch to make execExprInterp largely use
NullableDatum. Tom and I had theorized a while ago about adding
additional fields in the padding that currently exists in it. I wonder
if we could utilize a bit in there to allow to expand in-place?

Greetings,

Andres Freund




Re: Complete data erasure

2020-04-11 Thread Tomas Vondra

On Sat, Apr 11, 2020 at 01:56:10PM -0400, Tom Lane wrote:

Tomas Vondra  writes:

I don't think "commit is atomic" really implies "data should be released
at commit". This is precisely what makes the feature extremely hard to
implement, IMHO.



Why wouldn't it be acceptable to do something like this?



 BEGIN;
 ...
 DROP TABLE x ERASE;
 ...
 COMMIT;  <-- Don't do data erasure, just add "x" to queue.



 -- wait for another process to complete the erasure
 SELECT pg_wait_for_erasure();



That means we're not running any custom commands / code during commit,
which should (hopefully) make it easier to handle errors.


Yeah, adding actions-that-could-fail to commit is a very hard sell,
so something like this API would probably have a better chance.

However ... the whole concept of erasure being a committable action
seems basically misguided from here.  Consider this scenario:

begin;

create table full_o_secrets (...);

... manipulate secret data in full_o_secrets ...

drop table full_o_secrets erase;

... do something that unintentionally fails, causing xact abort ...

commit;

Now what?  Your secret data is all over the disk and you have *no*
recourse to get rid of it; that's true even at a very low level,
because we unlinked the file when rolling back the transaction.
If the error occurred before getting to "drop table full_o_secrets
erase" then there isn't even any way in principle for the server
to know that you might not be happy about leaving that data lying
around.

And I haven't even spoken of copies that may exist in WAL, or
have been propagated to standby servers by now.

I have no idea what an actual solution that accounted for those
problems would look like.  But as presented, this is a toy feature
offering no real security gain, if you ask me.



Yeah, unfortunately the feature as proposed has these weaknesses.

This is why I proposed that a solution based on encryption and throwing
away a key might be more reliable - if you don't have a key, who cares
if the encrypted data file (or parts of it) is still on disk?

It has issues too, though - a query might need a temporary file to do a
sort, hash join spills to disk, or something like that. And those won't
be encrypted without some executor changes (e.g. we might propagate
"needs erasure" to temp files, and do erasure when necessary).

I suspect a perfect solution would be so complex that it's not feasible in
practice, especially in v1. So maybe the best thing we can do is
documenting those limitations, but I'm not sure where to draw the line
between acceptable and unacceptable limitations.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Support for DATETIMEOFFSET

2020-04-11 Thread Neil


> On Apr 10, 2020, at 6:10 PM, Jeremy Morton  wrote:
> 
> Neil wrote:
>>> On Apr 10, 2020, at 8:19 AM, Jeremy Morton  wrote:
>>> 
>>> Oh well.  Guess I keep using SQL Server then.  datetimeoffset makes it 
>>> impossible for developers to make the mistake of forgetting to use UTC 
>>> instead of local datetime, and for that reason alone it makes it invaluable 
>>> in my opinion.  It should be used universally instead of datetime.
>> 1. Not sure I understand. I’ve never used datetimeoffset so please bear with 
>> me. How does storing a time zone with the date time “make it impossible for 
>> developers to make the mistake….”
> 
> At just about every development shop I've worked for, I've seen developers 
> use methods to get a local DateTime - both in the DB and in the code - such 
> as DateTime.Now, and throw it at a DateTime field. Heck, even I've 
> occasionally forgotten to use .UtcNow.  With DateTimeOffset.Now, you can't go 
> wrong.  You get the UTC time, and the offset.  I've taken to using it 100% of 
> the time.  It’s just really handy.
> 

In PostgreSQL there are two types; timestamp and timestamptz.  If you use 
timestamptz then all time stamps coming into the database with time zones will 
be converted to and stored in UTC in the database and all times coming out of 
the database will have the local time zone of the server unless otherwise 
requested.

Not sure how that is error prone.  Maybe you are working around a problem that 
does not exist in PostgreSQL.

If you use timestamp type (not timestamptz) then all input output time zone 
conversions are ignored (time zone is truncated) and sure problems can occur.  
That is why there is very little use of the timestamp type.

Neil
https://www.fairwindsoft.com



Re: execExprInterp() questions / How to improve scalar array op expr eval?

2020-04-11 Thread Tom Lane
Andres Freund  writes:
> On 2020-04-11 08:58:46 -0400, James Coleman wrote:
>> - Does the execExpr/execExprInterp framework allow a scalar array op
>> to get an already expanded array (unless I'm missing something we
>> can't easily lookup a given index in a flattened array)?

> Well, I'm not quite sure what you're thinking of. If the input is
> constant, then expression initialization can do just about everything it
> wants. Including preprocessing the array into whatever form it wants.
> But there's no logic for doing preprocessing whenever values change.

For the most part it seems like this is asking the question at the wrong
level.  It's not execExpr's job to determine the form of either values
coming in from "outside" (Vars from table rows, or Params from elsewhere)
or the results of intermediate functions/operators.

In the specific case of an array-valued (or record-valued) Const node,
you could imagine having a provision to convert the array/record to an
expanded datum at execution start, or maybe better on first use.  I'm
not sure how to tell whether that's actually a win though.  It could
easily be a net loss if the immediate consumer of the value wants a
flat datum.

It seems like this might be somewhat related to the currently-moribund
patch to allow caching of the values of stable subexpressions from
one execution to the next.  If we had that infrastructure you could
imagine extending it to allow caching the expanded not flat form of
some datums.  Again I'm not entirely sure what would drive the choice.

> I wonder if it would be a good idea to change ExecEvalParamExec and
> ExecEvalParamExtern to detoast the to-be-returned value. If we change
> the value that's stored in econtext->ecxt_param_exec_vals /
> econtext->ecxt_param_list_info, we'd avoid repeated detaosting.

I'd think about attaching that to the nestloop param mechanism not
ExecEvalParam in general.

regards, tom lane




Re: Add A Glossary

2020-04-11 Thread Corey Huinker
>
>
> Term 'relation': A sequence is internally a table with one row - right?
> Shall we extend the list of concrete relations by 'sequence'? Or is this
> not necessary because 'table' is already there?
>

I wrote one for sequence, it was a bit math-y for Alvaro's taste, so we're
going to try again.


Re: execExprInterp() questions / How to improve scalar array op expr eval?

2020-04-11 Thread James Coleman
On Sat, Apr 11, 2020 at 2:01 PM Andres Freund  wrote:
>
> Hi,
>
>
> Tom, CCing you because of expanded datum question at bottom.
>
>
> On 2020-04-11 08:58:46 -0400, James Coleman wrote:
> > - Does the execExpr/execExprInterp framework allow a scalar array op
> > to get an already expanded array (unless I'm missing something we
> > can't easily lookup a given index in a flattened array)?
>
> Well, I'm not quite sure what you're thinking of. If the input is
> constant, then expression initialization can do just about everything it
> wants. Including preprocessing the array into whatever form it wants.
> But there's no logic for doing preprocessing whenever values change.
>
>
> > - If not, is there a way in that framework to know if the array expr
> > has stayed the same through multiple evaluations of the expression
> > tree (i.e., so you could expand and sort it just once)?
>
> No.

Ok. Seems like it'd be likely to be interesting (maybe in other places
too?) to know if:
- Something is actually a param that can change, and,
- When (perhaps by some kind of flag or counter) it has changed.

> > Long version/background:
> >
> > I've been looking at how a query like:
> > select * from t where i in (<long list of values>);
> > executes as a seq scan the execution time increases linearly with
> > respect to the length of the array. That's because in
> > execExprInterp.c's ExecEvalScalarArrayOp() we do a linear search
> > through the array.
>
> Is "" constant in the cases you're interested in? Because
> it's a heck of a lot easier to add an optimization for that case, than
> adding runtime tracking the array values by comparing the whole array
> for equality with the last - the comparisons of the whole array could
> easily end up adding more cost than what's being saved.

In the simplest case, yes, it's a constant, though it'd be obviously
better if it weren't limited to that. There are many cases where a
long array can come from a subplan but we can easily tell by looking
at the SQL that it will only ever execute once. An unimaginative case
is something like:
select * from t where a in (select i from generate_series(0,1) n(i));

> > I've been considering approaches to improve the seq scan case. We might:
> > - At plan time rewrite as a hash join to a deduped array values
> > relation (gross, but included here since that's an approach we can
> > take rewriting the SQL itself as a proof of concept).
> > - At execution time build a hash and lookup.
> > - At execution time sort the array and binary search through it.
> >
> > I've been working other last approach to see what results I might get
> > (it seemed like the easiest to hack together). Putting that together
> > left me with the questions mentioned above in the "short version".
> >
> > Obviously if anyone has thoughts on the above approaches I'd be
> > interested in that too.
>
> If you're content with optimizing constant arrays, I'd go for detecting
> that case in the T_ScalarArrayOpExpr case in ExecInitExprRec(), and
> preprocessing the array into an optimized form. Probably with a separate
> opcode for execution.

At minimum constants are a good first place to try it out. Thanks for
the pointers.
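
For the archives, the shape I've been hacking on is roughly the
following -- just a sketch, assuming a non-null constant array ("con"
being the Const node), with the comparator setup and error handling
elided; datum_compare and its sort-support argument "ssup" are
placeholders:

/* At expression init: flatten the constant array and sort it once, so a
 * new ExprEvalStep opcode can binary-search it at execution time. */
ArrayType  *arr = DatumGetArrayTypeP(con->constvalue);
int16		typlen;
bool		typbyval;
char		typalign;
Datum	   *elems;
bool	   *nulls;
int			nelems;

get_typlenbyvalalign(ARR_ELEMTYPE(arr), &typlen, &typbyval, &typalign);
deconstruct_array(arr, ARR_ELEMTYPE(arr), typlen, typbyval, typalign,
				  &elems, &nulls, &nelems);
qsort_arg(elems, nelems, sizeof(Datum), datum_compare, &ssup);
/* ...stash elems/nelems in the ExprEvalStep for the new opcode... */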

> > Side question: when we do:
> > arr = DatumGetArrayTypeP(*op->resvalue);
> > in ExecEvalScalarArrayOp() is that going to be expensive each time
> > through a seq scan? Or is it (likely) going to resolve to an already
> > in-memory array and effectively be the cost of retrieving that
> > pointer?
>
> It Depends TM.  For the constant case it's likely going to be cheap-ish,
> because it'll not be toasted. For the case where it's the return value
> from a subquery or something, you cannot assume it won't change between
> calls.

Back to what I was saying earlier: perhaps some kind of mechanism that
lets us know when a value has changed is a better place to start.
Perhaps something from the patch Tom referenced would help kickstart
that. I'll take a look.

> I think the worst case here is something like a nestloop, where the
> inner side does foo IN (outer.column). If I recall the code correctly,
> we'll currently end up detoasting the array value every single
> iteration.

Ouch. Seems like that could be a significant cost in some queries?

> I wonder if it would be a good idea to change ExecEvalParamExec and
> ExecEvalParamExtern to detoast the to-be-returned value. If we change
> the value that's stored in econtext->ecxt_param_exec_vals /
> econtext->ecxt_param_list_info, we'd avoid repeated detoasting.
>
> It'd be nice if we somehow could make the expanded datum machinery work
> here. I'm not quite seeing how though?

I'm not yet familiar enough with it to comment.

> Crazy idea: I have a patch to make execExprInterp largely use
> NullableDatum. Tom and I had theorized a while ago about adding
> additional fields in the padding that currently exists in it. I wonder
> if we could utilize a bit in there to allow to expand in-place?

Effectively to store the pointers to, for example, the expanded array?

J

Re: execExprInterp() questions / How to improve scalar array op expr eval?

2020-04-11 Thread James Coleman
On Sat, Apr 11, 2020 at 3:33 PM Tom Lane  wrote:
>
> Andres Freund  writes:
> > On 2020-04-11 08:58:46 -0400, James Coleman wrote:
> >> - Does the execExpr/execExprInterp framework allow a scalar array op
> >> to get an already expanded array (unless I'm missing something we
> >> can't easily lookup a given index in a flattened array)?
>
> > Well, I'm not quite sure what you're thinking of. If the input is
> > constant, then expression initialization can do just about everything it
> > wants. Including preprocessing the array into whatever form it wants.
> > But there's no logic for doing preprocessing whenever values change.
>
> For the most part it seems like this is asking the question at the wrong
> level.  It's not execExpr's job to determine the form of either values
> coming in from "outside" (Vars from table rows, or Params from elsewhere)
> or the results of intermediate functions/operators.

Right, though I didn't know if the expr interpretation by any chance
had expanded arrays already available in some cases that we could take
advantage of. A bit of a shot in the dark as I try to grok how this
all fits together.

> In the specific case of an array-valued (or record-valued) Const node,
> you could imagine having a provision to convert the array/record to an
> expanded datum at execution start, or maybe better on first use.  I'm
> not sure how to tell whether that's actually a win though.  It could
> easily be a net loss if the immediate consumer of the value wants a
> flat datum.
>
> It seems like this might be somewhat related to the currently-moribund
> patch to allow caching of the values of stable subexpressions from
> one execution to the next.  If we had that infrastructure you could
> imagine extending it to allow caching the expanded, not flat, form of
> some datums.  Again I'm not entirely sure what would drive the choice.

I'll have to look into that patch to see if it sparks any ideas (or if
it's worth working on for its own merits).

> > I wonder if it would be a good idea to change ExecEvalParamExec and
> > ExecEvalParamExtern to detoast the to-be-returned value. If we change
> > the value that's stored in econtext->ecxt_param_exec_vals /
> > econtext->ecxt_param_list_info, we'd avoid repeated detoasting.
>
> I'd think about attaching that to the nestloop param mechanism not
> ExecEvalParam in general.

Revealing my ignorance here, but is nestloop the only case where we
have params like that?

James




Re: execExprInterp() questions / How to improve scalar array op expr eval?

2020-04-11 Thread Andres Freund
Hi,

On 2020-04-11 15:33:09 -0400, Tom Lane wrote:
> For the most part it seems like this is asking the question at the wrong
> level.  It's not execExpr's job to determine the form of either values
> coming in from "outside" (Vars from table rows, or Params from elsewhere)
> or the results of intermediate functions/operators.

We don't really have a good place to do optimizations that transform an
expression into a form that's better for an "upper" expression node.


> In the specific case of an array-valued (or record-valued) Const node,
> you could imagine having a provision to convert the array/record to an
> expanded datum at execution start, or maybe better on first use.  I'm
> not sure how to tell whether that's actually a win though.  It could
> easily be a net loss if the immediate consumer of the value wants a
> flat datum.

By "execution start" do you mean ExecInitExpr()? Or later?  I see little
reason not to do such an optimization during expression
initialization. Changing a Const array into a different form isn't
going to add much, if any, overhead compared to all the rest of the
initialization costs.


> It seems like this might be somewhat related to the currently-moribund
> patch to allow caching of the values of stable subexpressions from
> one execution to the next.  If we had that infrastructure you could
> imagine extending it to allow caching the expanded, not flat, form of
> some datums.  Again I'm not entirely sure what would drive the choice.

> > I wonder if it would be a good idea to change ExecEvalParamExec and
> > ExecEvalParamExtern to detoast the to-be-returned value. If we change
> > the value that's stored in econtext->ecxt_param_exec_vals /
> > econtext->ecxt_param_list_info, we'd avoid repeated detoasting.
> 
> I'd think about attaching that to the nestloop param mechanism not
> ExecEvalParam in general.

That was my first thought too - but especially if there are multiple
variables, detoasting all of them might be a waste if the params are
never dereferenced. So doing it the first time a parameter is used
seemed much more likely to be beneficial.

But even there it could be a waste, because e.g. a length comparison
alone is enough to determine inequality. Which is why I was wondering
about somehow being able to detoast the parameter "in place" in the
params arrays.
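
Concretely, something like this in ExecEvalParamExec -- a sketch only:
it is valid just for varlena-typed params, and it hand-waves over which
memory context owns the detoasted copy:

ParamExecData *prm = &(econtext->ecxt_param_exec_vals[op->d.param.paramid]);

/* Sketch: detoast on first use and write the result back, so later
 * uses of the same param see the already-detoasted value.  Only
 * meaningful when the param's type is a varlena. */
if (!prm->isnull &&
	VARATT_IS_EXTENDED(DatumGetPointer(prm->value)))
	prm->value = PointerGetDatum(PG_DETOAST_DATUM(prm->value));

*op->resvalue = prm->value;
*op->resnull = prm->isnull;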

Greetings,

Andres Freund




Re: pg_validatebackup -> pg_verifybackup?

2020-04-11 Thread Robert Haas
On Fri, Apr 10, 2020 at 5:24 PM Tom Lane  wrote:
> Meh.  I would argue that that's an actively BAD idea.  The use-cases
> are entirely different, the implementation is going to be quite a lot
> different, the relevant options are going to be quite a lot different.
> It will not be better for either implementors or users to force those
> into the same executable.

I think Andres has a point, but on balance I am more inclined to agree
with you. It may be that in the future it will make sense to organize
things differently, but I would rather arrange them according to what
makes sense now, and then change it later, even though that makes for
some user-visible churn. The thing is that we don't really know what's
going to happen in future releases, and our track record when we try
to guess is very poor. Creating stuff that we hope will get extended
to do this or that in a future release can end up looking really
half-baked if the code doesn't get extended in the anticipated
direction.

I *would* like to find a way to address the proliferation of binaries,
because I've got other things I'd like to do that would require
creating still more of them, and until we come up with a scalable
solution that makes everybody happy, there's going to be progressively
more complaining every time. One possible solution is to adopt the
'git' approach and decide we're going to have one 'pg' command (or
whatever we call it). I think the way that 'git' does it is that all
of the real binaries are stored in a directory that users are not
expected to have in their path, and the 'git' wrapper just looks for
one based on the name of the subcommand. So, if you say 'git thunk',
it goes and searches the private bin directory for an executable
called 'git-thunk'. We could easily do this for nearly everything 'pg'
related, so:

clusterdb -> pg clusterdb
pg_ctl -> pg ctl
pg_resetwal -> pg resetwal
etc.
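
The wrapper itself could be nearly trivial. A sketch, with the private
directory name invented for illustration (real code would also handle
--help, --version, and PATH setup):

/* pg.c -- dispatch "pg foo ..." to "<private bindir>/pg_foo ..." */
#include <stdio.h>
#include <unistd.h>

#define PRIVATE_BINDIR "/usr/lib/postgresql/private-bin"	/* invented */

int
main(int argc, char **argv)
{
	char		path[4096];

	if (argc < 2)
	{
		fprintf(stderr, "usage: pg <command> [arguments]\n");
		return 1;
	}
	snprintf(path, sizeof(path), "%s/pg_%s", PRIVATE_BINDIR, argv[1]);
	execv(path, argv + 1);		/* only returns on failure */
	perror(path);
	return 1;
}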

I think we'd want psql to still be separate, but I'm not sure we'd
need any other exceptions. In a lot of cases it won't lead to any more
typing because the current command is pg_whatever and with this
approach you'd just type a space instead of an underscore. The
"legacy" cases that don't start with "pg" would get a bit longer, but
I wouldn't lose a lot of sleep over that myself.

There are other possibilities too. We could try to pick out individual
groups of commands to merge, rather than having a unified framework
for everything. For instance, we could turn
{cluster,create,drop,reindex,vacuum}db into one utility,
{create,drop}user into another, pg_dump{,all} into a third, and so on.
But this seems like it would require making a lot more awkward policy
decisions, so I don't think it's a great plan.

Still, we need to agree on something. It won't do to tell people that
we're not going to commit patches to add more functionality to
PostgreSQL because it would involve adding more binaries. Any
particular piece of functionality may draw substantive objections, and
that's fine, but we shouldn't stifle development categorically because
we can't agree on how to clean up the namespace pollution.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: where should I stick that backup?

2020-04-11 Thread Robert Haas
On Fri, Apr 10, 2020 at 3:38 PM Andres Freund  wrote:
> Wouldn't there be state like an S3/ssh/https/... connection? And perhaps
> a 'backup_id' in the backup metadata DB that one would want to update
> at the end?

Good question. I don't know that there would be but, uh, maybe? It's
not obvious to me why all of that would need to be done using the same
connection, but if it is, the idea I proposed isn't going to work very
nicely.

More generally, can you think of any ideas for how to structure an API
here that are easier to use than "write some C code"? Or do you think
we should tell people to write some C code if they want to
compress/encrypt/relocate their backup in some non-standard way?
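
To make the comparison concrete, the C-level interface I have in mind
is a small table of callbacks along these lines -- purely a strawman,
every name here is invented:

#include <stddef.h>

/* Strawman sketch of a pluggable backup-target API; all names invented.
 * A provider supplies one of these, and the server drives it. */
typedef struct BackupSink BackupSink;

typedef struct BackupSinkOps
{
	/* called once per backup; "where" is a user-supplied target string
	 * (S3 bucket, ssh destination, ...), so connection state can live
	 * inside the BackupSink for the whole run */
	BackupSink *(*open_target) (const char *where);
	void		(*close_target) (BackupSink *sink);	/* update metadata, etc. */

	/* one begin/write.../end cycle per tar file produced */
	void		(*begin_file) (BackupSink *sink, const char *filename);
	void		(*write_data) (BackupSink *sink, const void *data, size_t len);
	void		(*end_file) (BackupSink *sink);
} BackupSinkOps;

A shell-script interface could then be just one built-in provider that
forks the user's command per file.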

For the record, I'm not against eventually having more than one way to
do this, maybe a shell-script interface for simpler things and some
kind of API for more complex needs (e.g. NetBackup integration,
perhaps). And I did wonder if there was some other way we could do
this. For instance, we could add an option --tar-everything that
sticks all the things that would have been returned by the backup
inside another level of tar file and sends the result to stdout. Then
you can pipe it into a single command that gets invoked only once for
all the data, rather than once per tablespace. That might be better,
but I'm not sure it's better. It's better if you want to do
complicated things that involve steps that happen before and after and
persistent connections and so on, but it seems worse for simple things
like piping through a non-default compressor.

Larry Wall somewhat famously commented that a good programming
language should (and I paraphrase) make simple things simple and
complex things possible. My hesitation in going straight to a C API is
that it does not make simple things simple; and I'd like to be really
sure that there is no way of achieving that valuable goal before we
give up on it. However, there is no doubt that a C API is potentially
more powerful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: execExprInterp() questions / How to improve scalar array op expr eval?

2020-04-11 Thread James Coleman
On Sat, Apr 11, 2020 at 3:57 PM James Coleman  wrote:
> ..
> > It seems like this might be somewhat related to the currently-moribund
> > patch to allow caching of the values of stable subexpressions from
> > one execution to the next.  If we had that infrastructure you could
> > imagine extending it to allow caching the expanded, not flat, form of
> > some datums.  Again I'm not entirely sure what would drive the choice.
>
> I'll have to look into that patch to see if it sparks any ideas (or if
> it's worth working on for its own merits).

Is this the patch [1] you're thinking of?

James

[1]: https://www.postgresql.org/message-id/flat/da87bb6a014e029176a04f6e50033cfb%40postgrespro.ru




Re: execExprInterp() questions / How to improve scalar array op expr eval?

2020-04-11 Thread Tom Lane
James Coleman  writes:
>>> It seems like this might be somewhat related to the currently-moribund
>>> patch to allow caching of the values of stable subexpressions from
>>> one execution to the next.

> Is this the patch [1] you're thinking of?
> [1]: https://www.postgresql.org/message-id/flat/da87bb6a014e029176a04f6e50033cfb%40postgrespro.ru

Yeah.  I was just digging for that in the archives, and also came across
this older patch:

https://www.postgresql.org/message-id/CABRT9RBdRFS8sQNsJHxZOhC0tJe1x2jnomiz%3DFOhFkS07yRwQA%40mail.gmail.com

which doesn't seem to have gone anywhere but might still contain
useful ideas.

regards, tom lane




Re: execExprInterp() questions / How to improve scalar array op expr eval?

2020-04-11 Thread Andres Freund
Hi,

On 2020-04-11 15:53:11 -0400, James Coleman wrote:
> On Sat, Apr 11, 2020 at 2:01 PM Andres Freund  wrote:
> > > - If not, is there a way in that framework to know if the array expr
> > > has stayed the same through multiple evaluations of the expression
> > > tree (i.e., so you could expand and sort it just once)?
> >
> > No.
> 
> Ok. Seems like it'd be likely to be interesting (maybe in other places
> too?) to know if:
> - Something is actually a param that can change, and,
> - When (perhaps by some kind of flag or counter) it has changed.

We do have the param logic inside the executor, which does signal which
params have changed. It's just independent of expression evaluation.

I'm not convinced (if anything, I'm doubtful) that this is something we
want to have at the expression evaluation level.

Greetings,

Andres Freund




Re: sqlsmith crash incremental sort

2020-04-11 Thread Justin Pryzby
Adding -hackers, originally forgotten.

On Sat, Apr 11, 2020 at 10:26:39PM +0200, Tomas Vondra wrote:
> Thanks! I'll investigate.
> 
> On Sat, Apr 11, 2020 at 02:19:52PM -0500, Justin Pryzby wrote:
> > frequent crash looks like:
> > 
> > #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> > #1  0x7fb0a0cda801 in __GI_abort () at abort.c:79
> > #2  0x7fb0a21ec425 in ExceptionalCondition 
> > (conditionName=conditionName@entry=0x7fb0a233a2ed "relid > 0", 
> > errorType=errorType@entry=0x7fb0a224701d "FailedAssertion",
> >fileName=fileName@entry=0x7fb0a2340ce8 "relnode.c", 
> > lineNumber=lineNumber@entry=379) at assert.c:67
> > #3  0x7fb0a2010d3a in find_base_rel (root=root@entry=0x7fb0a2de2d00, 
> > relid=<optimized out>) at relnode.c:379
> > #4  0x7fb0a2199666 in examine_variable (root=root@entry=0x7fb0a2de2d00, 
> > node=node@entry=0x7fb0a2e65eb8, varRelid=varRelid@entry=0, 
> > vardata=vardata@entry=0x7ffe7b549e60) at selfuncs.c:4600
> > #5  0x7fb0a219e2ed in estimate_num_groups 
> > (root=root@entry=0x7fb0a2de2d00, groupExprs=0x7fb0a2e69118, 
> > input_rows=input_rows@entry=2, pgset=pgset@entry=0x0) at selfuncs.c:3279
> > #6  0x7fb0a1fc198b in cost_incremental_sort 
> > (path=path@entry=0x7fb0a2e69080, root=root@entry=0x7fb0a2de2d00, 
> > pathkeys=pathkeys@entry=0x7fb0a2e68b28, 
> > presorted_keys=presorted_keys@entry=3,
> >input_startup_cost=103.73424154497742, input_total_cost=<optimized out>, 
> > input_tuples=2, width=480, comparison_cost=comparison_cost@entry=0, 
> > sort_mem=4096, limit_tuples=-1) at costsize.c:1832
> > #7  0x7fb0a2007f63 in create_incremental_sort_path 
> > (root=root@entry=0x7fb0a2de2d00, rel=rel@entry=0x7fb0a2e67a38, 
> > subpath=subpath@entry=0x7fb0a2e681a0, pathkeys=0x7fb0a2e68b28,
> >presorted_keys=3, limit_tuples=limit_tuples@entry=-1) at pathnode.c:2793
> > #8  0x7fb0a1fe97cb in create_ordered_paths (limit_tuples=-1, 
> > target_parallel_safe=true, target=0x7fb0a2e65568, input_rel=<optimized 
> > out>, root=0x7fb0a2de2d00) at planner.c:5029
> > #9  grouping_planner (root=root@entry=0x7fb0a2de2d00, 
> > inheritance_update=inheritance_update@entry=false, 
> > tuple_fraction=<optimized out>, tuple_fraction@entry=0) at planner.c:2254
> > #10 0x7fb0a1fecd5c in subquery_planner (glob=<optimized out>, 
> > parse=parse@entry=0x7fb0a2db7840, 
> > parent_root=parent_root@entry=0x7fb0a2dad650, 
> > hasRecursion=hasRecursion@entry=false,
> >tuple_fraction=0) at planner.c:1015
> > #11 0x7fb0a1fbc286 in set_subquery_pathlist (rte=<optimized out>, 
> > rti=<optimized out>, rel=0x7fb0a2db3598, root=0x7fb0a2dad650) at 
> > allpaths.c:2303
> > #12 set_rel_size (root=root@entry=0x7fb0a2dad650, 
> > rel=rel@entry=0x7fb0a2db1670, rti=rti@entry=2, rte=<optimized out>) at 
> > allpaths.c:422
> > #13 0x7fb0a1fbecad in set_base_rel_sizes (root=<optimized out>) at 
> > allpaths.c:323
> > #14 make_one_rel (root=root@entry=0x7fb0a2dad650, 
> > joinlist=joinlist@entry=0x7fb0a2db76b8) at allpaths.c:185
> > #15 0x7fb0a1fe4a2b in query_planner (root=root@entry=0x7fb0a2dad650, 
> > qp_callback=qp_callback@entry=0x7fb0a1fe52c0 <standard_qp_callback>, 
> > qp_extra=qp_extra@entry=0x7ffe7b54a510)
> >at planmain.c:269
> > #16 0x7fb0a1fea0b8 in grouping_planner (root=root@entry=0x7fb0a2dad650, 
> > inheritance_update=inheritance_update@entry=false, 
> > tuple_fraction=<optimized out>, tuple_fraction@entry=0)
> >at planner.c:2058
> > #17 0x7fb0a1fecd5c in subquery_planner (glob=glob@entry=0x7fb0a2dab480, 
> > parse=parse@entry=0x7fb0a2d48410, parent_root=parent_root@entry=0x0, 
> > hasRecursion=hasRecursion@entry=false,
> >tuple_fraction=tuple_fraction@entry=0) at planner.c:1015
> > #18 0x7fb0a1fee1df in standard_planner (parse=0x7fb0a2d48410, 
> > query_string=<optimized out>, cursorOptions=256, boundParams=<optimized 
> > out>) at planner.c:405
> > 
> > Minimal query like:
> > 
> > explain SELECT * FROM information_schema.transforms AS ref_1 RIGHT JOIN 
> > (SELECT 1 FROM pg_catalog.pg_namespace TABLESAMPLE SYSTEM (7.2))AS sample_2 
> > ON (ref_1.specific_name is NULL);
> > 
> > -- 
> > Justin




Re: pg_validatebackup -> pg_verifybackup?

2020-04-11 Thread Alvaro Herrera
On 2020-Apr-11, Robert Haas wrote:

> I *would* like to find a way to address the proliferation of binaries,
> because I've got other things I'd like to do that would require
> creating still more of them, and until we come up with a scalable
> solution that makes everybody happy, there's going to be progressively
> more complaining every time. One possible solution is to adopt the
> 'git' approach and decide we're going to have one 'pg' command (or
> whatever we call it). I think the way that 'git' does it is that all
> of the real binaries are stored in a directory that users are not
> expected to have in their path, and the 'git' wrapper just looks for
> one based on the name of the subcommand.

I like this idea so much that I already proposed it in the past[1], so +1.

[1] https://postgr.es/m/20160826202911.GA320593@alvherre.pgsql

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Race condition in SyncRepGetSyncStandbysPriority

2020-04-11 Thread Tom Lane
Masahiko Sawada  writes:
> On Tue, 31 Mar 2020 at 23:16, Masahiko Sawada
>  wrote:
>>> Therefore, the band-aid fix seems to be to set the lowest priority to a
>>> very large number at the beginning of SyncRepGetSyncStandbysPriority().

>> I think we can use max_wal_senders.

> Sorry, that's not true. We need another number large enough.

The buildfarm had another three failures of this type today, so that
motivated me to look at it some more.  I don't think this code needs
a band-aid fix; I think "nuke from orbit" is more nearly the right
level of response.

The point that I was trying to make originally is that it seems quite
insane to imagine that a walsender's sync_standby_priority value is
somehow more stable than the very existence of the process.  Yet we
only require a walsender to lock its own mutex while claiming or
disowning its WalSnd entry (by setting or clearing the pid field).
So I think it's nuts to define those fields as being protected by
the global SyncRepLock.

Even without considering the possibility that a walsender has just
started or stopped, we have the problem Fujii-san described that after
a change in the synchronous_standby_names GUC setting, different
walsenders will update their values of sync_standby_priority at
different instants.  (BTW, I now notice that Noah had previously
identified this problem at [1].)

Thus, even while holding SyncRepLock, we do not have a guarantee that
we'll see a consistent set of sync_standby_priority values.  In fact
we don't even know that the walsnd array entries still belong to the
processes that last set those values.  This is what is breaking
SyncRepGetSyncStandbysPriority, and what it means is that there's
really fundamentally no chance of that function producing trustworthy
results.  The "band aid" fixes discussed here might avoid crashing on
the Assert, but they won't fix the problems that (a) the result is
possibly wrong and (b) it can become stale immediately even if it's
right when returned.

Now, there are only two callers of SyncRepGetSyncStandbys:
SyncRepGetSyncRecPtr and pg_stat_get_wal_senders.  The latter is
mostly cosmetic (which is a good thing, because to add insult to
injury, it continues to use the list after releasing SyncRepLock;
not that continuing to hold that lock would make things much safer).
If I'm reading the code correctly, the former doesn't really care
exactly which walsenders are sync standbys: all it cares about is
to collect their WAL position pointers.

What I think we should do about this is, essentially, to get rid of
SyncRepGetSyncStandbys.  Instead, let's have each walsender advertise
whether *it* believes that it is a sync standby, based on its last
evaluation of the relevant GUCs.  This would be a bool that it'd
compute and set alongside sync_standby_priority.  (Hm, maybe we'd not
even need to have that field anymore?  Not sure.)  We should also
redefine that flag, and sync_standby_priority if it survives, as being
protected by the per-walsender mutex not SyncRepLock.  Then, what
SyncRepGetSyncRecPtr would do is just sweep through the walsender
array and collect WAL position pointers from the walsenders that
claim to be sync standbys at the instant that it's inspecting them.
pg_stat_get_wal_senders could also use those flags instead of the
list from SyncRepGetSyncStandbys.
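
In rough pseudo-C, the collection loop in SyncRepGetSyncRecPtr would
then be about this simple (is_sync_standby being the new self-reported
flag proposed above; how the positions get folded into the result is
elided):

for (int i = 0; i < max_wal_senders; i++)
{
	WalSnd	   *walsnd = &WalSndCtl->walsnds[i];
	XLogRecPtr	write;
	XLogRecPtr	flush;
	bool		is_sync;

	SpinLockAcquire(&walsnd->mutex);
	is_sync = (walsnd->pid != 0 && walsnd->is_sync_standby);
	write = walsnd->write;
	flush = walsnd->flush;
	SpinLockRelease(&walsnd->mutex);

	if (!is_sync)
		continue;
	/* fold write/flush into the candidate sync-rep LSNs */
}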

It's likely that this definition would have slightly different
behavior from the current implementation during the period where
the system is absorbing a change in the set of synchronous
walsenders.  However, since the current implementation is visibly
wrong during that period anyway, I'm not sure how this could be
worse.  And at least we can be certain that SyncRepGetSyncRecPtr
will not include WAL positions from already-dead walsenders in
its calculations, which *is* a hazard in the code as it stands.

I also estimate that this would be noticeably more efficient than
the current code, since the logic to decide who's a sync standby
would only run when we're dealing with walsender start/stop or
SIGHUP, rather than every time SyncRepGetSyncRecPtr runs.

Don't especially want to code this myself, though.  Anyone?

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/20200206074552.GB3326097%40rfd.leadboat.com




Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)

2020-04-11 Thread Justin Pryzby
On Sat, Mar 28, 2020 at 04:17:21PM +0100, Julien Rouhaud wrote:
> On Sat, Mar 28, 2020 at 02:38:27PM +0100, Julien Rouhaud wrote:
> > On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote:
> > > 
> > > I see some basic problems with the patch.  The way it tries to compute
> > > WAL usage for parallel stuff doesn't seem right to me.  Can you share
> > > or point me to any test done where we have computed WAL for parallel
> > > operations like Parallel Vacuum or Parallel Create Index?
> > 
> > Ah, that's indeed a good point and AFAICT WAL records from parallel utility
> > workers won't be accounted for.  That being said, I think that an argument
> > could be made that proper infrastructure should have been added in the 
> > original
> > parallel utility patches, as pg_stat_statements is already broken wrt. buffer
> > usage in parallel utility commands, unless I'm missing something.
> 
> Just to be sure I did a quick test with pg_stat_statements behavior using
> parallel/non-parallel CREATE INDEX and VACUUM, and unsurprisingly buffer usage
> doesn't reflect parallel workers' activity.
> 
> I added an open item for that, and am adding Robert in Cc, as 9da0cc352 is
> the first commit adding parallel maintenance.

I believe this is resolved for parallel vacuum in master and parallel create
index back to PG11.

I marked this as closed.
https://wiki.postgresql.org/index.php?title=PostgreSQL_13_Open_Items&diff=34802&oldid=34781

-- 
Justin




Re: sqlsmith crash incremental sort

2020-04-11 Thread Tomas Vondra

Hi,

I've looked into this a bit, and at first I thought that maybe the issue 
is in how cost_incremental_sort picks the EC members. It simply does this:


    EquivalenceMember *member = (EquivalenceMember *)
        linitial(key->pk_eclass->ec_members);

so I was speculating that maybe there are multiple EC members and the
one we need is not the first one. That would have been easy to fix.

But that doesn't seem to be the case - in this example the EC only has a
single EC member anyway.

(gdb) p key->pk_eclass->ec_members
$14 = (List *) 0x12eb958
(gdb) p *key->pk_eclass->ec_members
$15 = {type = T_List, length = 1, max_length = 5, elements = 0x12eb970, 
initial_elements = 0x12eb970}

and the member is a Var with varno=0 (with a RelabelType on top, but 
that's irrelevant).


(gdb) p *(Var*)((RelabelType*)member->em_expr)->arg
$12 = {xpr = {type = T_Var}, varno = 0, varattno = 1, vartype = 12445, 
vartypmod = -1, varcollid = 950, varlevelsup = 0, varnosyn = 0, varattnosyn = 
1, location = -1}

which then triggers the assert in find_base_rel. When looking for other
places calling estimate_num_groups I found this in prepunion.c:

 * XXX you don't really want to know about this: we do the estimation
 * using the subquery's original targetlist expressions, not the
 * subroot->processed_tlist which might seem more appropriate.  The
 * reason is that if the subquery is itself a setop, it may return a
 * processed_tlist containing "varno 0" Vars generated by
 * generate_append_tlist, and those would confuse estimate_num_groups
 * mightily.  We ought to get rid of the "varno 0" hack, but that
 * requires a redesign of the parsetree representation of setops, so
 * that there can be an RTE corresponding to each setop's output.

which seems pretty similar to the issue at hand, because the subpath is
T_UpperUniquePath (not sure if that passes as setop, but the symptoms 
match nicely).


Not sure what to do about it in cost_incremental_sort, though :-(
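
The closest thing to a band-aid I can come up with is to detect the
case and punt to a default estimate -- an entirely untested sketch
(assuming a new unknown_varno flag in the pathkey loop, and I'm not
sure pull_varnos() is even the right test here):

/* Sketch: if the pathkey expression contains a "varno 0" Var generated
 * by the setop tlist hack, don't call estimate_num_groups() on it and
 * fall back to a default group-count estimate instead. */
if (bms_is_member(0, pull_varnos((Node *) member->em_expr)))
{
	unknown_varno = true;
	break;
}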


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pg_basebackup, manifests and backends older than ~12

2020-04-11 Thread Michael Paquier
On Fri, Apr 10, 2020 at 04:44:34PM -0400, David Steele wrote:
> On 4/10/20 4:41 PM, Stephen Frost wrote:
>> It's only the default in v13..  Surely when we connect to a v12 or
>> earlier system we should just keep working and accept that we don't get
>> a manifest as part of that.
> 
> Yeah, OK. It's certainly better than forcing the user to disable manifests,
> which might also disable them for v13 clusters.

Exactly, that's my point.  The current code would force
users maintaining scripts with pg_basebackup to use --no-manifest if
such a script runs with older versions of Postgres, but we should
encourage users not to do that, because we want them to use manifests
with backend versions where they are supported.
--
Michael




Re: pg_validatebackup -> pg_verifybackup?

2020-04-11 Thread Michael Paquier
On Fri, Apr 10, 2020 at 02:48:05PM -0700, Andres Freund wrote:
> I don't agree with any of that. Combining the manifest validation with
> checksum validation halves the IO. It allows to offload some of the
> expense of verifying page level checksums from the primary.
> 
> And all of the operations require iterating through data directories,
> classify files that are part / not part of a normal data directory, etc.

The last time we used _verify_ in a tool name, the tool got renamed one
year later as we found new use cases for it -- that's how we ended up
with pg_checksums.  Couldn't the same happen with pg_validatebackup?
It seems to me that it could be interesting for some users to build a
manifest after a backup is taken, using something like a --build
option with pg_validatebackup.
--
Michael




Re: pg_validatebackup -> pg_verifybackup?

2020-04-11 Thread Michael Paquier
On Sat, Apr 11, 2020 at 05:50:56PM -0400, Alvaro Herrera wrote:
> On 2020-Apr-11, Robert Haas wrote:
>> I *would* like to find a way to address the proliferation of binaries,
>> because I've got other things I'd like to do that would require
>> creating still more of them, and until we come up with a scalable
>> solution that makes everybody happy, there's going to be progressively
>> more complaining every time. One possible solution is to adopt the
>> 'git' approach and decide we're going to have one 'pg' command (or
>> whatever we call it). I think the way that 'git' does it is that all
>> of the real binaries are stored in a directory that users are not
>> expected to have in their path, and the 'git' wrapper just looks for
>> one based on the name of the subcommand.
> 
> I like this idea so much that I already proposed it in the past[1], so +1.
> 
> [1] https://postgr.es/m/20160826202911.GA320593@alvherre.pgsql

Yeah, their stuff is nice.  Another nice thing is that git can also
pick up custom scripts, as long as they respect the naming convention:
a custom script called "git-foo" can be invoked as "git foo".
--
Michael




Re: pg_validatebackup -> pg_verifybackup?

2020-04-11 Thread Isaac Morland
On Sat, 11 Apr 2020 at 19:36, Michael Paquier  wrote:


> Yeah, their stuff is nice.  Another nice thing is that git has the
> possibility to scan as well for custom scripts as long as they respect
> the naming convention, like having a custom script called "git-foo",
> that can be called as "git foo".
>

… which could be installed by an extension perhaps. Wait, that doesn't
quite work because it's usually one bin directory per version, not per
database. Still maybe something can be done with that idea.


Re: where should I stick that backup?

2020-04-11 Thread Jose Luis Tallon

On 10/4/20 21:38, Andres Freund wrote:
> Hi,
>
> On 2020-04-10 12:20:01 -0400, Robert Haas wrote:
>> - We're only talking about writing a handful of tar files, and that's
>> in the context of a full-database backup, which is a much
>> heavier-weight operation than a query.
>> - There is not really any state that needs to be maintained across calls.
>> - The expected result is that a file gets written someplace, which is
>> not an in-memory data structure but something that gets written to a
>> place outside of PostgreSQL.
>
> Wouldn't there be state like an S3/ssh/https/... connection?

...to try and save opening a new connection in the context of a
(potentially) multi-TB backup? :S

> And perhaps
> a 'backup_id' in the backup metadata DB that one would want to update
> at the end?


This is, indeed, material for external tools, each catering to a
particular set of end-user requirements.


We already have many examples, most of them co-authored by this list's
regulars... and IMHO none is suitable for ALL use-cases.



BUT I agree that shipping better tools with Postgres itself, ready to
use --- that is, uncomment the default "archive_command" and you have a
very basic starting point --- would be a big step in the right
direction. More importantly (IMO): including the call to "pgfile" or
equivalent clearly signals to unsuspecting users that there is more to
safely archiving WAL segments than just doing "cp -a" blindly and
hoping that the tool magically performs all the steps needed to ensure
data safety. It's probably more effective than just amending the
existing comments to point users to a (new?) section within the
documentation.



This comment comes from experience: I've lost count of how many times I
have had to "fix" the default command for WAL archiving --- precisely
because it had been blindly copied from the default, without thinking
through the implications of any deviation from expected behaviour
during WAL archiving --- only for the problem to be noticed at
(attempted) recovery time :\



HTH.

Thanks,

    J.L.






Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-04-11 Thread Justin Pryzby
On Tue, Apr 07, 2020 at 03:44:06PM -0500, Justin Pryzby wrote:
> > I mean to literally use vac_analyze_option_list for reindex and cluster as
> > well. Please, check attached 0007. Now, vacuum, reindex and cluster filter
> > the option list and reject everything that is not supported, so it seems
> > completely fine to just reuse vac_analyze_option_list, doesn't it?
> 
> It's fine with me :)
> 
> Possibly we could rename vac_analyze_option_list as generic_option_list.
> 
> I'm resending the patchset like that, and fixed cluster/index to handle not
> just "VERBOSE" but "verbose OFF", rather than just ignoring the argument.
> 
> That's the last known issue with the patch.  I doubt anyone will elect to pick
> it up in the next 8 hours, but I think it's in very good shape for v14 :)

I tweaked some comments and docs and plan to mark this RfC.

-- 
Justin
>From 7798d2292b41e039853a38577bb5db504c698bba Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 2 Apr 2020 14:20:11 -0500
Subject: [PATCH v20 1/5] tab completion for reindex(verbose)..

..which was added at ecd222e77 for v9.5.

This handles "verbose" itself as well as the following word.

Separate commit as this could be backpatched to v12 (but backpatching further
is less trivial, due to improvements added at 121213d9d).
---
 src/bin/psql/tab-complete.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 0e7a373caf..e49463f11b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3414,7 +3414,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("DATA");
 
 /* REINDEX */
-	else if (Matches("REINDEX"))
+	else if (Matches("REINDEX") || Matches("REINDEX", "(*)"))
 		COMPLETE_WITH("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches("REINDEX", "TABLE"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables,
@@ -3436,6 +3436,17 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
 	else if (Matches("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
+	else if (HeadMatches("REINDEX", "(*") &&
+			 !HeadMatches("REINDEX", "(*)"))
+	{
+		/*
+		 * This fires if we're in an unfinished parenthesized option list.
+		 * get_previous_words treats a completed parenthesized option list as
+		 * one word, so the above test is correct.
+		 */
+		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
+			COMPLETE_WITH("VERBOSE");
+	}
 
 /* SECURITY LABEL */
 	else if (Matches("SECURITY"))
-- 
2.17.0

>From 60709901923b0f51ca187fd2fa298dc45910013c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 27 Mar 2020 17:50:46 -0500
Subject: [PATCH v20 2/5] Change REINDEX/CLUSTER to accept an option list..

..like EXPLAIN (..), VACUUM (..), and ANALYZE (..).

Change docs in the style of VACUUM.  See also: 52dcfda48778d16683c64ca4372299a099a15b96
---
 doc/src/sgml/ref/cluster.sgml| 29 ++---
 doc/src/sgml/ref/reindex.sgml| 43 +---
 src/backend/commands/cluster.c   | 23 -
 src/backend/commands/indexcmds.c | 41 --
 src/backend/nodes/copyfuncs.c|  2 ++
 src/backend/nodes/equalfuncs.c   |  2 ++
 src/backend/parser/gram.y| 35 +++---
 src/backend/tcop/utility.c   | 36 +++---
 src/bin/psql/tab-complete.c  | 23 +
 src/include/commands/cluster.h   |  3 ++-
 src/include/commands/defrem.h|  7 +++---
 src/include/nodes/parsenodes.h   |  2 ++
 12 files changed, 180 insertions(+), 66 deletions(-)

diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 4da60d8d56..a6df8a3d81 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -21,8 +21,13 @@ PostgreSQL documentation
 
  
 
-CLUSTER [VERBOSE] table_name [ USING index_name ]
-CLUSTER [VERBOSE]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ] table_name [ USING index_name ]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ]
+
+where option can be one of:
+
+VERBOSE [ boolean ]
+
 
  
 
@@ -81,6 +86,16 @@ CLUSTER [VERBOSE]
   Parameters
 
   
+   
+VERBOSE
+
+ 
+  Prints a progress report as each table is clustered.
+
+ 
+
+   
+

 table_name
 
@@ -100,13 +115,19 @@ CLUSTER [VERBOSE]

 

-VERBOSE
+boolean
 
  
-  Prints a progress report as each table is clustered.
+  Specifies whether the selected option should be turned on or off.
+  You can write TRUE, ON, or
+  1 to enable the option, and FALSE,
+  OFF, or 0 to disable it.  The
+  boolean value can also
+  be omitted, in which case TRUE is assumed.
  
 

+
   
  
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index c54a7c420d..71941e52e3 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/do

Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-04-11 Thread Michael Paquier
On Sat, Apr 11, 2020 at 08:33:52PM -0500, Justin Pryzby wrote:
> On Tue, Apr 07, 2020 at 03:44:06PM -0500, Justin Pryzby wrote:
>> That's the last known issue with the patch.  I doubt anyone will elect to 
>> pick
>> it up in the next 8 hours, but I think it's in very good shape for v14 :)
> 
> I tweaked some comments and docs and plan to mark this RfC.

Yeah, unfortunately this will have to wait at least until v14 opens
for business :(
--
Michael

