Re: corruption of WAL page header is never reported

2021-09-12 Thread Fujii Masao




On 2021/09/13 11:00, Kyotaro Horiguchi wrote:

The point here is "retry this page, not this record", so "we don't need
to retry immediately" looks a bit ambiguous. So how about something
like this?

Note that we don't do this while not in standby mode because we don't
need to avoid retrying this entire record even if the page header is
not valid. Instead, ReadPageInternal() is responsible for validating
the page header in that case.


You mean that, while not in standby mode, we need to retry reading
the entire record if the page header is invalid? I was thinking that
we basically give up replaying further records in that case because
we're not in standby mode.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: PG Docs - CREATE SUBSCRIPTION option list order

2021-09-12 Thread Amit Kapila
On Thu, Sep 9, 2021 at 9:50 AM Amit Kapila  wrote:
>
> On Wed, Sep 8, 2021 at 12:24 PM Peter Smith  wrote:
> >
> > v2 --> v3
> >
> > The subscription_parameter names are now split into 2 groups using
> > Amit's suggestion [1] on how to categorise them.
> >
> > I also made some grammar improvements to their descriptions.
> >
>
> I have made minor edits to your first patch and it looks good to me.
>

Pushed the first patch. I am not so sure about the second one, so I
won't do anything about it. I'll close this CF entry in a day or
two unless there is interest in the second patch.

With Regards,
Amit Kapila.




Re: document the need to analyze partitioned tables

2021-09-12 Thread Zhihong Yu
On Sun, Sep 12, 2021 at 8:54 PM Justin Pryzby  wrote:

> Adding -hackers, sorry for the duplicate.
>
> This seems to be deficient, citing
>
> https://www.postgresql.org/message-id/flat/0d1b394b-bec9-8a71-a336-44df7078b295%40gmail.com
>
> I'm proposing something like the attached.  Ideally, there would be a
> central
> place to put details, and the other places could refer to that.
>
> Since the autoanalyze patch was reverted, this should be easily applied to
> backbranches, which is probably most of its value.
>
> commit 4ad2c8f6fd8eb26d76b226e68d3fdb8f0658f113
> Author: Justin Pryzby 
> Date:   Thu Jul 22 16:06:18 2021 -0500
>
> documentation deficiencies for ANALYZE of partitioned tables
>
> This is partially extracted from
> 1b5617eb844cd2470a334c1d2eec66cf9b39c41a,
> which was reverted.
>
> diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
> index 36f975b1e5..decfabff5d 100644
> --- a/doc/src/sgml/maintenance.sgml
> +++ b/doc/src/sgml/maintenance.sgml
> @@ -290,6 +290,14 @@
>  to meaningful statistical changes.
> 
>
> +   
> +Tuples changed in partitions and inheritance children do not count
> towards
> +analyze on the parent table.  If the parent table is empty or rarely
> +changed, it may never be processed by autovacuum.  It is necessary to
> +periodically run a manual ANALYZE to keep the
> statistics
> +of the table hierarchy up to date.
> +   
> +
> 
>  As with vacuuming for space recovery, frequent updates of statistics
>  are more useful for heavily-updated tables than for seldom-updated
> @@ -347,6 +355,18 @@
>   ANALYZE commands on those tables on a suitable
> schedule.
>  
> 
> +
> +   
> +
> + The autovacuum daemon does not issue ANALYZE
> commands for
> + partitioned tables.  Inheritance parents will only be analyzed if the
> + parent is changed - changes to child tables do not trigger
> autoanalyze on
> + the parent table.  It is necessary to periodically run a manual
> + ANALYZE to keep the statistics of the table
> hierarchy up to
> + date.
> +
> +   
> +
>
>
>
> @@ -817,6 +837,18 @@ analyze threshold = analyze base threshold + analyze
> scale factor * number of tu
>  
>  is compared to the total number of tuples inserted, updated, or
> deleted
>  since the last ANALYZE.
> +
> +Partitioned tables are not processed by autovacuum, and their
> statistics
> +should be updated by manually running ANALYZE when
> the
> +table is first populated, and whenever the distribution of data in its
> +partitions changes significantly.
> +   
> +
> +   
> +Partitioned tables are not processed by autovacuum.  Statistics
> +should be collected by running a manual ANALYZE
> when it is
> +first populated, and updated whenever the distribution of data in its
> +partitions changes significantly.
> 
>
> 
> diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
> index 89ff58338e..b84853fd6f 100644
> --- a/doc/src/sgml/perform.sgml
> +++ b/doc/src/sgml/perform.sgml
> @@ -1765,9 +1765,11 @@ SELECT * FROM x, y, a, b, c WHERE something AND
> somethingelse;
> Run ANALYZE Afterwards
>
> 
> +
>  Whenever you have significantly altered the distribution of data
>  within a table, running  linkend="sql-analyze">ANALYZE is strongly
> recommended. This
>  includes bulk loading large amounts of data into the table.  Running
> +
>  ANALYZE (or VACUUM ANALYZE)
>  ensures that the planner has up-to-date statistics about the
>  table.  With no statistics or obsolete statistics, the planner might
> diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
> index c423aeeea5..20ffbc2d7a 100644
> --- a/doc/src/sgml/ref/analyze.sgml
> +++ b/doc/src/sgml/ref/analyze.sgml
> @@ -250,22 +250,33 @@ ANALYZE [ VERBOSE ] [  class="parameter">table_and_columns
>
>
> -If the table being analyzed has one or more children,
> -ANALYZE will gather statistics twice: once on the
> -rows of the parent table only, and a second time on the rows of the
> -parent table with all of its children.  This second set of statistics
> -is needed when planning queries that traverse the entire inheritance
> -tree.  The autovacuum daemon, however, will only consider inserts or
> -updates on the parent table itself when deciding whether to trigger an
> -automatic analyze for that table.  If that table is rarely inserted
> into
> -or updated, the inheritance statistics will not be up to date unless
> you
> -run ANALYZE manually.
> +If the table being analyzed is partitioned, ANALYZE
> +will gather statistics by sampling blocks randomly from its
> partitions;
> +in addition, it will recurse into each partition and update its
> statistics.
> +(However, in multi-level partitioning scenarios, each leaf partition
> +will only be analyzed once.)
> +By contrast, 

Re: TAP test for recovery_end_command

2021-09-12 Thread Amul Sul
On Mon, Sep 13, 2021 at 8:44 AM Michael Paquier  wrote:
>
> On Sun, Sep 12, 2021 at 09:25:32PM -0300, Euler Taveira wrote:
> > On Thu, Sep 9, 2021, at 8:18 AM, Amul Sul wrote:
> >> Also, we don't have a good test for archive_cleanup_command as well, I
> >> am not sure how we could test that which executes with every
> >> restart-point.
> >
> > Setup a replica and stop it. It triggers a restartpoint during the shutdown.
>
> +$node_standby2->append_conf('postgresql.conf',
> +   "recovery_end_command='echo recovery_ended > 
> $recovery_end_command_file'");
> This is not going to work on Windows.

Unfortunately, I don't have Windows, but my colleague Neha Sharma has
confirmed it works there.

Regards,
Amul




document the need to analyze partitioned tables

2021-09-12 Thread Justin Pryzby
Adding -hackers, sorry for the duplicate.

This seems to be deficient, citing
https://www.postgresql.org/message-id/flat/0d1b394b-bec9-8a71-a336-44df7078b295%40gmail.com

I'm proposing something like the attached.  Ideally, there would be a central
place to put details, and the other places could refer to that.

Since the autoanalyze patch was reverted, this should be easily applied to
backbranches, which is probably most of its value.

commit 4ad2c8f6fd8eb26d76b226e68d3fdb8f0658f113
Author: Justin Pryzby 
Date:   Thu Jul 22 16:06:18 2021 -0500

documentation deficiencies for ANALYZE of partitioned tables

This is partially extracted from 1b5617eb844cd2470a334c1d2eec66cf9b39c41a,
which was reverted.

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 36f975b1e5..decfabff5d 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -290,6 +290,14 @@
 to meaningful statistical changes.

 
+   
+Tuples changed in partitions and inheritance children do not count towards
+analyze on the parent table.  If the parent table is empty or rarely
+changed, it may never be processed by autovacuum.  It is necessary to
+periodically run a manual ANALYZE to keep the 
statistics
+of the table hierarchy up to date.
+   
+

 As with vacuuming for space recovery, frequent updates of statistics
 are more useful for heavily-updated tables than for seldom-updated
@@ -347,6 +355,18 @@
  ANALYZE commands on those tables on a suitable 
schedule.
 

+
+   
+
+ The autovacuum daemon does not issue ANALYZE commands 
for
+ partitioned tables.  Inheritance parents will only be analyzed if the
+ parent is changed - changes to child tables do not trigger autoanalyze on
+ the parent table.  It is necessary to periodically run a manual
+ ANALYZE to keep the statistics of the table hierarchy 
up to
+ date.
+
+   
+
   
 
   
@@ -817,6 +837,18 @@ analyze threshold = analyze base threshold + analyze scale 
factor * number of tu
 
 is compared to the total number of tuples inserted, updated, or deleted
 since the last ANALYZE.
+
+Partitioned tables are not processed by autovacuum, and their statistics
+should be updated by manually running ANALYZE when the
+table is first populated, and whenever the distribution of data in its
+partitions changes significantly.
+   
+
+   
+Partitioned tables are not processed by autovacuum.  Statistics
+should be collected by running a manual ANALYZE when it 
is
+first populated, and updated whenever the distribution of data in its
+partitions changes significantly.

 

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index 89ff58338e..b84853fd6f 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -1765,9 +1765,11 @@ SELECT * FROM x, y, a, b, c WHERE something AND 
somethingelse;
Run ANALYZE Afterwards
 

+
 Whenever you have significantly altered the distribution of data
 within a table, running ANALYZE is strongly 
recommended. This
 includes bulk loading large amounts of data into the table.  Running
+
 ANALYZE (or VACUUM ANALYZE)
 ensures that the planner has up-to-date statistics about the
 table.  With no statistics or obsolete statistics, the planner might
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index c423aeeea5..20ffbc2d7a 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -250,22 +250,33 @@ ANALYZE [ VERBOSE ] [ table_and_columns
 
   
-If the table being analyzed has one or more children,
-ANALYZE will gather statistics twice: once on the
-rows of the parent table only, and a second time on the rows of the
-parent table with all of its children.  This second set of statistics
-is needed when planning queries that traverse the entire inheritance
-tree.  The autovacuum daemon, however, will only consider inserts or
-updates on the parent table itself when deciding whether to trigger an
-automatic analyze for that table.  If that table is rarely inserted into
-or updated, the inheritance statistics will not be up to date unless you
-run ANALYZE manually.
+If the table being analyzed is partitioned, ANALYZE
+will gather statistics by sampling blocks randomly from its partitions;
+in addition, it will recurse into each partition and update its statistics.
+(However, in multi-level partitioning scenarios, each leaf partition
+will only be analyzed once.)
+By contrast, if the table being analyzed has inheritance children,
+ANALYZE will gather statistics for it twice:
+once on the rows of the parent table only, and a second time on the
+rows of the parent table with all of its children.  This second set of
+statistics is needed when planning queries that traverse the entire
+inheritance 

Re: Remove duplicate static function check_permissions in slotfuncs.c and logicalfuncs.c

2021-09-12 Thread Bharath Rupireddy
On Mon, Sep 13, 2021 at 8:07 AM Michael Paquier  wrote:
>
> On Sun, Sep 12, 2021 at 10:14:36PM -0300, Euler Taveira wrote:
> > On Sun, Sep 12, 2021, at 8:02 PM, Bossart, Nathan wrote:
> >> nitpick: It looks like there's an extra set of parentheses around
> >> errmsg().
> >
> > Indeed. Even the requirement for extra parentheses around auxiliary function
> > calls was removed in v12 (e3a87b4991cc2d00b7a3082abb54c5f12baedfd1).
>
> Yes.  The patch makes sense.  I am not seeing any other places that
> could be grouped, so that looks fine as-is.

Thanks all for taking a look at the patch. Here's the CF entry -
https://commitfest.postgresql.org/35/3319/

Regards,
Bharath Rupireddy.




Re: Remove duplicate static function check_permissions in slotfuncs.c and logicalfuncs.c

2021-09-12 Thread Bharath Rupireddy
On Mon, Sep 13, 2021 at 6:45 AM Euler Taveira  wrote:
>
> On Sun, Sep 12, 2021, at 8:02 PM, Bossart, Nathan wrote:
>
> nitpick: It looks like there's an extra set of parentheses around
> errmsg().
>
> Indeed. Even the requirement for extra parentheses around auxiliary function
> calls was removed in v12 (e3a87b4991cc2d00b7a3082abb54c5f12baedfd1).

The same commit says that the new code can be written in any way.
Having said that, I will leave it to the committer to take a call on
whether or not to remove the extra parentheses.
   "
While new code can be written either way, code intended to be
back-patched will need to use extra parens for awhile yet.
"

Regards,
Bharath Rupireddy.




Re: TAP test for recovery_end_command

2021-09-12 Thread Michael Paquier
On Sun, Sep 12, 2021 at 09:25:32PM -0300, Euler Taveira wrote:
> On Thu, Sep 9, 2021, at 8:18 AM, Amul Sul wrote:
>> Also, we don't have a good test for archive_cleanup_command as well, I
>> am not sure how we could test that which executes with every
>> restart-point.
>
> Setup a replica and stop it. It triggers a restartpoint during the shutdown.

+$node_standby2->append_conf('postgresql.conf',
+   "recovery_end_command='echo recovery_ended > 
$recovery_end_command_file'");
This is not going to work on Windows.
--
Michael




Re: brin multi minmax crash for inet value

2021-09-12 Thread Michael Paquier
On Sun, Sep 12, 2021 at 08:23:44PM -0500, Justin Pryzby wrote:
> Could you check what HEAD your server is compiled from?

That works on HEAD for me.
--
Michael




Re: Remove duplicate static function check_permissions in slotfuncs.c and logicalfuncs.c

2021-09-12 Thread Michael Paquier
On Sun, Sep 12, 2021 at 10:14:36PM -0300, Euler Taveira wrote:
> On Sun, Sep 12, 2021, at 8:02 PM, Bossart, Nathan wrote:
>> nitpick: It looks like there's an extra set of parentheses around
>> errmsg().
>
> Indeed. Even the requirement for extra parentheses around auxiliary function
> calls was removed in v12 (e3a87b4991cc2d00b7a3082abb54c5f12baedfd1).

Yes.  The patch makes sense.  I am not seeing any other places that
could be grouped, so that looks fine as-is.
--
Michael




Re: Add jsonlog log_destination for JSON server logs

2021-09-12 Thread Michael Paquier
On Fri, Sep 10, 2021 at 03:56:18PM +0900, Michael Paquier wrote:
> And this part leads me to the attached, where the addition of the JSON
> format would result in the addition of a couple of lines.

Okay, I have worked through the first half of the patch set, and
applied the improved versions of 0001 (refactoring of the chunk
protocol) and 0002 (addition of the tests for csvlog).

I have not looked in details at 0003 and 0004 yet.  Still, here are
some comments after a quick scan.

+ * elog-internal.h
I'd rather avoid the hyphen, and use elog_internal.h.

+#define FORMATTED_TS_LEN 128
+extern char formatted_start_time[FORMATTED_TS_LEN];
+extern char formatted_log_time[FORMATTED_TS_LEN];
+
+void setup_formatted_log_time(void);
+void setup_formatted_start_time(void);
We could just use a static buffer in each one of those routines, and
return back a pointer to the caller.

+   else if ((Log_destination & LOG_DESTINATION_JSONLOG) &&
+   (jsonlogFile == NULL ||
+time_based_rotation || (size_rotation_for & LOG_DESTINATION_JSONLOG)))
[...]
+   /* Write to JSON log if enabled */
+   else if (Log_destination & LOG_DESTINATION_JSONLOG)
+   {
Those bits in 0004 are wrong.  They should be two "if" clauses.  This
is leading to issues when setting log_destination to multiple values
with jsonlog in the set of values and logging_collector = on, and the
logs are not getting redirected properly to the .json file.  We would 
get the data for the .log and .csv files though.  This issue also
happens with the original patch set applied on top of e757080.   I
think that we should also be more careful with the way we free
StringInfoData in send_message_to_server_log(), or we are going to
finish with unwelcome leaks.

The same coding pattern is now repeated three times in logfile_rotate():
- Check if a logging type is enabled.
- Optionally open new file, with logfile_open().
- Business with ENFILE and EMFILE.
- pfree() and save of the static FILE* and file name for each type.
I think that we have enough material for a wrapper routine that does
this work, where we pass down LOG_DESTINATION_* and pointers to the
static FILE* and the static last file names.  That would have avoided
the bug I found above.

The rebased patch set, which has the tests reworked to be in line with
HEAD, also fails those tests, though it compiles.

Sehrope, could you adjust that?
--
Michael
From a08e9df54c5960225055a0c999090dad8cf839be Mon Sep 17 00:00:00 2001
From: Sehrope Sarkuni 
Date: Wed, 1 Sep 2021 09:06:15 -0400
Subject: [PATCH v3 1/2] Split csv handling in elog.c into separate csvlog.c

Split out csvlog to its own file and centralize common elog internals
and helpers into its own file as well.
---
 src/include/utils/elog-internal.h |  78 
 src/backend/utils/error/Makefile  |   1 +
 src/backend/utils/error/csvlog.c  | 270 ++
 src/backend/utils/error/elog.c| 313 ++
 4 files changed, 365 insertions(+), 297 deletions(-)
 create mode 100644 src/include/utils/elog-internal.h
 create mode 100644 src/backend/utils/error/csvlog.c

diff --git a/src/include/utils/elog-internal.h 
b/src/include/utils/elog-internal.h
new file mode 100644
index 00..ac08b6f12f
--- /dev/null
+++ b/src/include/utils/elog-internal.h
@@ -0,0 +1,78 @@
+/*-
+ *
+ * elog-internal.h
+ *   POSTGRES error reporting/logging internal definitions.
+ *
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * src/include/utils/elog-internal.h
+ *
+ *-
+ */
+#ifndef ELOG_INTERNAL_H
+#define ELOG_INTERNAL_H
+
+#include "postgres.h"
+
+#include "utils/elog.h"
+#include "miscadmin.h"
+#include "postmaster/postmaster.h"
+#include "postmaster/bgworker.h"
+
+const char * error_severity(int elevel);
+void write_syslogger(char *data, int len, int dest);
+
+void write_csvlog(ErrorData *edata);
+
+/*
+ * is_log_level_output -- is elevel logically >= log_min_level?
+ *
+ * We use this for tests that should consider LOG to sort out-of-order,
+ * between ERROR and FATAL.  Generally this is the right thing for testing
+ * whether a message should go to the postmaster log, whereas a simple >=
+ * test is correct for testing whether the message should go to the client.
+ */
+static inline bool
+is_log_level_output(int elevel, int log_min_level)
+{
+   if (elevel == LOG || elevel == LOG_SERVER_ONLY)
+   {
+   if (log_min_level == LOG || log_min_level <= ERROR)
+   return true;
+   }
+   else if (elevel == WARNING_CLIENT_ONLY)
+   {
+   /* never sent to log, regardless of log_min_level */
+   return false;
+   }
+   else if (log_min_level == LOG)
+   {
+   /* elevel != LOG */
+   if (elevel >= FATAL)
+   return true;
+   }
+   /* Neither is LOG 

Re: Added missing invalidations for all tables publication

2021-09-12 Thread Amit Kapila
On Sat, Sep 11, 2021 at 11:58 PM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > On Wed, Sep 8, 2021 at 7:57 AM houzj.f...@fujitsu.com
> >  wrote:
> >> I found that the patch cannot be applied to back-branches(v10-v14) cleanly,
> >> so, I generate the patches for back-branches. Attached, all the patches 
> >> have
> >> passed regression test.
>
> > Pushed!
>
> Shouldn't the CF entry for this be closed? [1]
>

Yes, and I have done that now.

-- 
With Regards,
Amit Kapila.




Re: corruption of WAL page header is never reported

2021-09-12 Thread Kyotaro Horiguchi
At Fri, 10 Sep 2021 10:38:39 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2021/09/07 2:02, Fujii Masao wrote:
> > Even if we do this while NOT in standby mode, ISTM that this function
> > doesn't
> > return with a valid errmsg_buf because it's reset. So probably the
> > comment
> > should be updated as follows?
> > -
> > We don't do this while not in standby mode because we don't need to
> > retry
> > immediately if the page header is not valid. Instead, XLogReadRecord()
> > is
> > responsible to check the page header.
> > -
> 
> I updated the comment as above. Patch attached.
> 
> -  * it's not valid. This may seem unnecessary, because XLogReadRecord()
> + * it's not valid. This may seem unnecessary, because
> ReadPageInternal()
>* validates the page header anyway, and would propagate the failure up 
> to
> 
> I also applied this change because ReadPageInternal() not
> XLogReadRecord()
> calls XLogReaderValidatePageHeader().

Yeah, good catch.


+* Note that we don't do this while not in standby mode because we don't
+* need to retry immediately if the page header is not valid. Instead,
+* ReadPageInternal() is responsible for validating the page header.

The point here is "retry this page, not this record", so "we don't need
to retry immediately" looks a bit ambiguous. So how about something
like this?

Note that we don't do this while not in standby mode because we don't
need to avoid retrying this entire record even if the page header is
not valid. Instead, ReadPageInternal() is responsible for validating
the page header in that case.

Everything else looks fine to me.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




walsender timeout on logical replication set

2021-09-12 Thread Kyotaro Horiguchi
Hello.

As reported in [1] it seems that walsender can suffer timeout in
certain cases.  It is not clearly confirmed, but I suspect that
there's the case where LogicalRepApplyLoop keeps running the innermost
loop without receiving keepalive packet for longer than
wal_sender_timeout (not wal_receiver_timeout).  Of course that can be
resolved by giving sufficient processing power to the subscriber if
not. But if that happens between the servers with the equal processing
power, it is reasonable to "fix" this.  Theoretically I think this can
happen with equally-powered servers if the connecting network is
sufficiently fast, because sending reordered changes is relatively
simpler and faster than applying the changes on the subscriber.

I think we don't want to call GetCurrentTimestamp every iteration of
the innermost loop.  Even if we call it every N iterations, I can't
come up with a proper N that fits any workload. So one possible
solution would be using SIGALRM.  Is it worth doing?  Or is there any
other way?

Even if we won't fix this, we might need to add a description about
this restriction in the documentation?

Any thoughts?

[1] 
https://www.postgresql.org/message-id/CAEDsCzhBtkNDLM46_fo_HirFYE2Mb3ucbZrYqG59ocWqWy7-xA%40mail.gmail.com

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Remove redundant initializations

2021-09-12 Thread Noah Misch
I'm +1 for the $SUBJECT concept, mostly because I take longer to read code
where immaterial zero-initialization lines are diluting the code.  A quick
scan of the patch content is promising.  If there's a decision to move
forward, I'm happy to review it more closely.

On Wed, Jun 30, 2021 at 09:28:17AM -0400, Tom Lane wrote:
> David Rowley  writes:
> > On Tue, 29 Jun 2021 at 02:15, Tom Lane  wrote:
> >> The primary case where I personally rely on that style is when adding a
> >> new field to a struct.  Currently it's possible to grep for some existing
> >> field and add the new one beside it.  Leaving out initializations by
> >> relying on side-effects of makeNode makes that far riskier.
> 
> > FWIW, I mostly grep for makeNode(NameOfNode) as I'm a bit mistrusting
> > of if the random existing field name that I pick to grep for will
> > properly showing me all the locations I should touch.
> 
> I tend to do that too, but it's not a foolproof thing either, since
> some places use random memset's for the purpose.

I checked the first five matches of "git grep ' = T_'" to get a sense of code
sites that skip makeNode().  Just one of those five initializes every field:

recordDependencyOnSingleRelExpr() builds RangeTblEntry, subset of fields
EventTriggerCommonSetup() builds EventTriggerData, full fields
validateForeignKeyConstraint() builds TriggerData, subset of fields
ExecBSInsertTriggers() builds TriggerData, subset of fields [many similar 
examples in trigger.c]
ExecBuildProjectionInfo() builds ExprState, subset of fields

Hence, I find we're already too inconsistent about "explicitly initialize
every field" to recommend "grep for some existing field".  (Two participants
in the 2018 thread made similar observations[1][2].)  grepping T_NameOfNode
and then makeNode(NameOfNode) is more reliable today, and $SUBJECT will not
decrease its reliability.

[1] https://postgr.es/m/20180830045736.p3mrugcq2j367...@alap3.anarazel.de
[2] 
https://postgr.es/m/ca+tgmoypw3y8zkofsetpvbb8avy7v7jbjmg6bme7cc+eod7...@mail.gmail.com




Re: brin multi minmax crash for inet value

2021-09-12 Thread Justin Pryzby
On Sun, Sep 12, 2021 at 07:44:47PM -0500, Jaime Casanova wrote:
> Hi Tomas,
> 
> Just noted that this query crashes the server. Execute it in the
> regression database:

If I'm not wrong, this is the crash fixed by e1fbe1181 in April.

Could you check what HEAD your server is compiled from?

-- 
Justin




Re: Remove duplicate static function check_permissions in slotfuncs.c and logicalfuncs.c

2021-09-12 Thread Euler Taveira
On Sun, Sep 12, 2021, at 8:02 PM, Bossart, Nathan wrote:
> nitpick: It looks like there's an extra set of parentheses around
> errmsg().
Indeed. Even the requirement for extra parentheses around auxiliary function
calls was removed in v12 (e3a87b4991cc2d00b7a3082abb54c5f12baedfd1).


--
Euler Taveira
EDB   https://www.enterprisedb.com/


brin multi minmax crash for inet value

2021-09-12 Thread Jaime Casanova
Hi Tomas,

Just noted that this query crashes the server. Execute it in the
regression database:

"""
update brintest_multi set inetcol = '192.168.204.50/0'::inet;
"""

Attached is the backtrace. Let me know if you need something else to
track it.

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
set = {__val = {4194304, 140730302333136, 2, 6, 435, 
94894241509376, 
4611686018427388799, 140219046775462, 0, 281470681751456, 0, 0, 0, 
0, 0, 0}}
pid = 
tid = 
ret = 
#1  0x7f874a40f535 in __GI_abort () at abort.c:79
save_stage = 1
act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, 
sa_mask = {__val = {
  0, 0, 0, 0, 0, 140219044532213, 2, 7292513299284754848, 
7003713384994060596, 
  94894241509376, 7003719963994494672, 0, 3387215930523559936, 
14073030276, 0, 
  140730302334240}}, sa_flags = 1234071552, sa_restorer = 0x0}
sigs = {__val = {32, 0 }}
#2  0x564e49f91849 in ExceptionalCondition (
conditionName=0x564e4a00f6e4 "(delta >= 0) && (delta <= 1)", 
errorType=0x564e4a00ef74 "FailedAssertion", fileName=0x564e4a00ef60 
"brin_minmax_multi.c", 
lineNumber=2368) at assert.c:69
No locals.
#3  0x564e498f16db in brin_minmax_multi_distance_inet 
(fcinfo=0x7ffe53ae05c0)
at brin_minmax_multi.c:2368
delta = -1.1641532182693481e-08
i = -1
len = 4
addra = 0x564e4b6ee950 ""
addrb = 0x564e4b6ee970 ""
ipa = 0x564e4b6c8828
ipb = 0x564e4b6c5a78
lena = 0
lenb = 0
#4  0x564e49f9bcc4 in FunctionCall2Coll (flinfo=0x564e4b60c640, 
collation=0, 
arg1=94894272841768, arg2=94894272830072) at fmgr.c:1160
fcinfodata = {fcinfo = {flinfo = 0x564e4b60c640, context = 0x0, 
resultinfo = 0x0, 
fncollation = 0, isnull = false, nargs = 2, args = 0x7ffe53ae05e0}, 
  fcinfo_data = "@\306`KNV", '\000' , 
"\002\000(\210lKNV\000\000\000\000\000\000\000\000\000\000xZlKNV\000\000\000\350nKNV\000"}
fcinfo = 0x7ffe53ae05c0
result = 94894272996608
__func__ = "FunctionCall2Coll"
#5  0x564e498ef621 in build_distances (distanceFn=0x564e4b60c640, 
colloid=0, 
eranges=0x564e4b6ee620, neranges=11) at brin_minmax_multi.c:1352
a1 = 94894272841768
a2 = 94894272830072
r = 94894241543955
i = 0
ndistances = 10
distances = 0x564e4b6ee838
#6  0x564e498f0199 in compactify_ranges (bdesc=0x564e4b60ca58, 
ranges=0x564e4b6da6c0, 
max_values=32) at brin_minmax_multi.c:1822
cmpFn = 0x564e4b60c678
distanceFn = 0x564e4b60c640
eranges = 0x564e4b6ee620
neranges = 11
distances = 0x564e0008
ctx = 0x564e4b6ee500
oldctx = 0x564e4b6c4b80
#7  0x564e498f1735 in brin_minmax_multi_serialize (bdesc=0x564e4b60ca58, 
src=94894272915136, 
dst=0x564e4b6c5030) at brin_minmax_multi.c:2386
ranges = 0x564e4b6da6c0
s = 0x564e4b6c6110
#8  0x564e498f795b in brin_form_tuple (brdesc=0x564e4b60ca58, blkno=0, 
tuple=0x564e4b6c4ca0, 
size=0x7ffe53ae0858) at brin_tuple.c:165
datumno = 1
values = 0x564e4b6c5298
nulls = 0x564e4b6c5ad0
anynulls = true
rettuple = 0x741231600
keyno = 9
idxattno = 9
phony_infomask = 0
phony_nullbitmap = 0x564e4b6c5b28 "\177\177\177~\177\177\177\177"
len = 94894241579157
hoff = 1024
data_len = 94894272830256
i = 32766
untoasted_values = 0x564e4b6c53b0
nuntoasted = 0
#9  0x564e498e79d1 in brininsert (idxRel=0x7f87412316b0, 
values=0x7ffe53ae09b0, 
nulls=0x7ffe53ae0990, heaptid=0x564e4b6b49d0, heapRel=0x7f874122f288, 
checkUnique=UNIQUE_CHECK_NO, indexUnchanged=false, 
indexInfo=0x564e4b6b8710) at brin.c:281
lp = 0x7f8741a3cfb8
origsz = 952
newsz = 94894244461288
page = 0x7f8741a3cf80 ""
origtup = 0x564e4b6c5b48
newtup = 0x7ffe53ae0890
samepage = false
need_insert = true
off = 9
brtup = 0x7f8741a3d108
dtup = 0x564e4b6c4ca0
pagesPerRange = 1
origHeapBlk = 0
heapBlk = 0
bdesc = 0x564e4b60ca58
revmap = 0x564e4b6b8aa8
buf = 114
tupcxt = 0x564e4b6c4b80
oldcxt = 0x564e4b6b2bc0
autosummarize = false
__func__ = "brininsert"
#10 0x564e49989165 in index_insert (indexRelation=0x7f87412316b0, 
values=0x7ffe53ae09b0, 
isnull=0x7ffe53ae0990, heap_t_ctid=0x564e4b6b49d0, 
heapRelation=0x7f874122f288, 
checkUnique=UNIQUE_CHECK_NO, indexUnchanged=false, 
indexInfo=0x564e4b6b8710) at indexam.c:193
__func__ = "index_insert"
#11 0x564e49ba4b23 in ExecInsertIndexTuples (resultRelInfo=0x564e4b6b3168, 
slot=0x564e4b6b49a0, 

Re: TAP test for recovery_end_command

2021-09-12 Thread Euler Taveira
On Thu, Sep 9, 2021, at 8:18 AM, Amul Sul wrote:
> The attached patch adds a small test for recovery_end_command execution.
Additional coverage is always a good thing.

> Currently, patch tests execution of recovery_end_command by creating
> dummy file, I am not wedded only to this approach, other suggestions
> also welcome.
This test file is for archiving only. It seems 020_archive_status.pl is more
appropriate for testing this parameter.

> Also, we don't have a good test for archive_cleanup_command as well, I
> am not sure how we could test that which executes with every
> restart-point.
Setup a replica and stop it. It triggers a restartpoint during the shutdown.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Remove duplicate static function check_permissions in slotfuncs.c and logicalfuncs.c

2021-09-12 Thread Bossart, Nathan
On 9/11/21, 1:31 AM, "Bharath Rupireddy" 
 wrote:
> We have two static check_permissions functions (one in slotfuncs.c
> another in logicalfuncs.c) with the same name and same code for
> checking the privileges for using replication slots. Why can't we have
> a single function CheckReplicationSlotPermissions in slot.c? This way,
> we can get rid of redundant code. Attaching a patch for it.

+1

+/*
+ * Check whether the user has privilege to use replication slots.
+ */
+void
+CheckReplicationSlotPermissions(void)
+{
+   if (!superuser() && !has_rolreplication(GetUserId()))
+   ereport(ERROR,
+   (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+(errmsg("must be superuser or replication role to use replication slots"))));
+}

nitpick: It looks like there's an extra set of parentheses around
errmsg().
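For the archives, here is how the helper would look with that nitpick applied (a sketch only — final wording and formatting are up to the committer):

```c
/*
 * Check whether the current user has privilege to use replication slots.
 * Sketch of the deduplicated helper with the redundant parentheses around
 * errmsg() removed; otherwise identical to the proposed patch.
 */
void
CheckReplicationSlotPermissions(void)
{
	if (!superuser() && !has_rolreplication(GetUserId()))
		ereport(ERROR,
				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
				 errmsg("must be superuser or replication role to use replication slots")));
}
```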

Nathan



Re: CLUSTER on partitioned index

2021-09-12 Thread Justin Pryzby
On Tue, Jul 20, 2021 at 08:27:02PM -0400, Alvaro Herrera wrote:
> I have to wonder if there really *is* a use case for CLUSTER in the
> first place on regular tables, let alone on partitioned tables, which
> are likely to be large and thus take a lot of time.

Clustering is now done one partition at a time, so it might take a long time,
but doesn't lock the whole partition hierarchy.  Same as VACUUM (since v10) and
(since v14) REINDEX.

The patch series would be simpler if partitioned indexes weren't allowed to be
marked CLUSTERED ON.  Then, "USING " would be required, which is a step
forward from not supporting cluster on partitioned index at all.  As attached.
It's arguably true that the follow-up patches supporting indisclustered on
partitioned indexes aren't worth the trouble.

For sure CLUSTER is useful, see eg.
https://github.com/bucardo/check_postgres/issues/29

It's sometimes important that the table is clustered to allow index scan to
work well (or be chosen at all).

If a table is scanned by an index, and isn't well-clustered, then a larger
fraction (multiple) of the table will be read than what's optimal.   That
requires more IO, and more cache space.
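For the archives, one way to see how well-clustered a table currently is — this just reads the planner's statistics; table and column names here are made up:

```sql
-- "correlation" is the planner's estimate of how well the physical row
-- order matches the column's value order; values near 1.0 (or -1.0) mean
-- an index scan touches close to the minimal number of heap pages.
SELECT schemaname, tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'mytable' AND attname = 'mycol';
```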

A year ago, I partitioned one of our previously-unpartitioned tables, and ended
up clustering the partitions on their partition key (and indexed column) using
\gexec.  This was preferable to doing INSERT .. SELECT .. ORDER BY, which
would've made the initial process slower - maybe unjustifiably slower for some
customers.  Cluster (using \gexec) was something I was able to do afterward,
for completeness, since I expect the partitions to be mostly-clustered
automatically, so it was bothering me that the existing data was unordered, and
that it might behave differently in the future.

> What justifies spending so much time on this implementation?

Actually, I don't use partitioned indexes at all here, so this is not for us..
I worked on this after Adger asked about CIC on partitioned tables (for which I
have a patch in the queue).  Isn't it worth supporting that (or should we
include an example about how to use format() with %I and \gexec) ?
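In case such an example is useful, here is a sketch of the format()/%I/\gexec approach (the table name is hypothetical, and it assumes each leaf partition has an index previously marked with ALTER TABLE ... CLUSTER ON):

```sql
-- Generate and run one CLUSTER statement per leaf partition of table "t".
SELECT format('CLUSTER %I.%I USING %I', n.nspname, c.relname, i.relname)
FROM pg_partition_tree('t') p
JOIN pg_class c ON c.oid = p.relid
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_index x ON x.indrelid = c.oid
JOIN pg_class i ON i.oid = x.indexrelid
WHERE p.isleaf AND x.indisclustered
\gexec
```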

VACUUM [FULL] has recursed into partitions since v10 (f0e44751d).
REINDEX supports partitioned tables in v14 (a6642b3ae).  
Partitioned indexes have existed since v11 (as you well know), so it's somewhat odd
that CLUSTER isn't supported, and it seems increasingly weird as fewer and fewer
DDL commands remain unsupported.  Supporting DDL on partitioned tables
reinforces the idea that the physical partitions can be seen as an implementation
detail by the DBA, which I understand was the intent since v10.

You're right that I wouldn't plan to *routinely* re-cluster a partitioned
table.  Rather, I'd cluster only its "recent" *partitions*, and leave the old
ones alone.  Or cluster the partitions, a single time, once they're no longer
recent.  I don't think the feature is marginal just because I don't use it
routinely.

> My impression is that CLUSTER is pretty much a fringe command nowadays,
> because of the access exclusive lock required.

A step forward would be to integrate something like pg_repack/reorg/squeeze.
I used pg_repack --index until v12 got REINDEX CONCURRENTLY.  The goal there
was to improve index scans on some large, append-only partitions where the
planner gave an index scan, but performance was poor (now, we use BRIN so it
works well without reindex).  I tested that this would still be an issue by
creating a non-brin index for a single day's table (even with v13 deduplication
and v12 TID tiebreak).

As I see it, support for partitioned cluster is orthogonal to an
online/concurrent cluster, which is a job for another patch.

-- 
Justin
>From d2e7daf9c05cd3c20c60f5e35c0d6b2612d75d1b Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 7 Jun 2020 16:58:42 -0500
Subject: [PATCH v11] Implement CLUSTER of partitioned table..

For now, partitioned indexes cannot be marked clustered, so clustering requires
specification of a partitioned index on the partitioned table.

VACUUM (including vacuum full) has recursed into partitioned tables since
partitioning was introduced in v10 (3c3bb9933).  See expand_vacuum_rel().

See also a556549d7 and 19de0ab23.
---
 doc/src/sgml/ref/cluster.sgml |   6 +
 src/backend/commands/cluster.c| 170 +++---
 src/bin/psql/tab-complete.c   |   1 +
 src/include/commands/cluster.h|   1 +
 src/test/regress/expected/cluster.out |  46 ++-
 src/test/regress/sql/cluster.sql  |  22 +++-
 6 files changed, 197 insertions(+), 49 deletions(-)

diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 86f5fdc469..b3463ae5c4 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -196,6 +196,12 @@ CLUSTER [VERBOSE]
 in the pg_stat_progress_cluster view. See
  for details.
   
+
+   
+Clustering a partitioned table clusters each of its partitions using the
+partition of the 

Re: Numeric x^y for negative x

2021-09-12 Thread Dean Rasheed
On Thu, Sep 02, 2021 at 07:27:09AM +0100, Dean Rasheed wrote:
>
> It's mostly done, but there is one more corner case where it loses
> precision. I'll post an update shortly.
>

I spent some more time looking at this, testing a variety of edge
cases, and the only case I could find that produces inaccurate results
was the one I noted previously -- computing x^y when x is very close
to 1 (less than around 1e-1000 away from it, so that ln_dweight is
less than around -1000). In this case, it loses precision due to the
way local_rscale is set for the initial low-precision calculation:

local_rscale = 8 - ln_dweight;
local_rscale = Max(local_rscale, NUMERIC_MIN_DISPLAY_SCALE);
local_rscale = Min(local_rscale, NUMERIC_MAX_DISPLAY_SCALE);

This needs to be allowed to be greater than NUMERIC_MAX_DISPLAY_SCALE
(1000), otherwise the approximate result will lose all precision,
leading to a poor choice of scale for the full-precision calculation.

So the fix is just to remove the upper bound on this local_rscale, as
we do for the full-precision calculation. This doesn't impact
performance, because it's only computing the logarithm to 8
significant digits at this stage, and when x is very close to 1 like
this, ln_var() has very little work to do -- there is no argument
reduction to do, and the Taylor series terminates on the second term,
since 1-x is so small.

Coming up with a test case that doesn't have thousands of digits is a
bit fiddly, so I chose one where most of the significant digits of the
result are a long way after the decimal point and shifted them up,
which makes the loss of precision in HEAD more obvious. The expected
result can be verified using bc with a scale of 2000.
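The leading digits can also be cross-checked with Python's decimal module (a sketch; it uses ln(1 − ε) ≈ −ε, whose relative error here is around 1e-997, far below the digits shown):

```python
# Approximate (1 - eps)^y as exp(-eps * y) for eps ~ 1.5e-1000, and compare
# the leading digits against the expected regression-test output above.
from decimal import Decimal, getcontext

getcontext().prec = 60  # enough precision for the leading digits

eps = Decimal("1.500012345678e-1000")
y = Decimal("1.45e1003")

# ((1 - eps) ** y) * 1e1000, using ln(1 - eps) ~= -eps
approx = (-eps * y).exp() * Decimal("1e1000")
print(approx)
```

The coefficient agrees with the 25218976308958387188... value in the expected output.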

Regards,
Dean
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
new file mode 100644
index 796f517..65dc3bd
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -10266,9 +10266,9 @@ power_var(const NumericVar *base, const
 	 */
 	ln_dweight = estimate_ln_dweight(base);
 
+	/* scale for initial low-precision calculation (8 significant digits) */
 	local_rscale = 8 - ln_dweight;
 	local_rscale = Max(local_rscale, NUMERIC_MIN_DISPLAY_SCALE);
-	local_rscale = Min(local_rscale, NUMERIC_MAX_DISPLAY_SCALE);
 
 	ln_var(base, &ln_base, local_rscale);
 
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
new file mode 100644
index efbb22a..e224eff
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -2483,6 +2483,12 @@ select coalesce(nullif(0.99 ^ 23
   0
 (1 row)
 
+select round(((1 - 1.500012345678e-1000) ^ 1.45e1003) * 1e1000);
+  round   
+--
+ 25218976308958387188077465658068501556514992509509282366
+(1 row)
+
 -- cases that used to error out
 select 0.12 ^ (-25);
  ?column?  
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
new file mode 100644
index 0418ff0..eeca63d
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -1148,6 +1148,7 @@ select 1.2 ^ 345;
 select 0.12 ^ (-20);
 select 1.0123 ^ (-2147483648);
 select coalesce(nullif(0.99 ^ 233000, 0), 0) as rounds_to_zero;
+select round(((1 - 1.500012345678e-1000) ^ 1.45e1003) * 1e1000);
 
 -- cases that used to error out
 select 0.12 ^ (-25);


Re: pg_upgrade test for binary compatibility of core data types

2021-09-12 Thread Andrew Dunstan


On 9/11/21 8:51 PM, Justin Pryzby wrote:
>
> @Andrew: did you have any comment on this part ?
>
> |Subject: buildfarm xversion diff
> |Forking 
> https://www.postgresql.org/message-id/20210328231433.gi15...@telsasoft.com
> |
> |I gave suggestion how to reduce the "lines of diff" metric almost to nothing,
> |allowing a very small "fudge factor", and which I think makes this a pretty
> |good metric rather than a passable one.
>

Somehow I missed that. Looks like some good suggestions. I'll
experiment. (Note: we can't assume the presence of sed, especially on
Windows).


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: BUG #15293: Stored Procedure Triggered by Logical Replication is Unable to use Notification Events

2021-09-12 Thread Tom Lane
Artur Zakirov  writes:
> [ v2-0001-signal-backends-on-commit.patch ]

I had an epiphany while looking at this.  Now that PostgresMain
calls ProcessNotifyInterrupt at the same place it calls
ProcessCompletedNotifies (which it does since 790026972), we don't
actually need ProcessCompletedNotifies to read self-notifies either.
If we merely set notifyInterruptPending, ProcessNotifyInterrupt will
handle that just fine.  With the other changes already under discussion,
this means ProcessCompletedNotifies can go away entirely.

That's not only less code, but fewer cycles: in cases where we have both
self-notifies and inbound notify signals, the existing code starts two
transactions and runs asyncQueueReadAllNotifications twice, but there's
no need to do it more than once.  Self-notifies become less of a special
case on the sending side too, since we can just treat that as signalling
ourselves --- though it still seems worthwhile to optimize that by
setting notifyInterruptPending directly instead of invoking kill().

Hence, I present the attached, which also tweaks things to avoid an
extra pq_flush in the end-of-command code path, and improves the
comments to discuss the issue of NOTIFYs sent by procedures.

There is still a loose end we ought to think about: what to do when
someone issues LISTEN in a background worker.  With the code as
it stands, or with this patch, the worker will block cleanout of
the async SLRU since it will never read any messages.  (With
the code as it stands, a bgworker author can ameliorate that by
calling ProcessCompletedNotifies, but this patch is going to either
eliminate ProcessCompletedNotifies or turn it into a no-op.  In
any case, we still have a problem if an ill-considered trigger
issues LISTEN in a replication worker.)

I'm inclined to think we should flat-out reject LISTEN in any process
that is not attached to a frontend, at least until somebody takes the
trouble to add infrastructure that would let it be useful.  I've not
done that here though; I'm not quite sure what we should test for.

regards, tom lane

diff --git a/doc/src/sgml/bgworker.sgml b/doc/src/sgml/bgworker.sgml
index c0811935a1..73207f72fe 100644
--- a/doc/src/sgml/bgworker.sgml
+++ b/doc/src/sgml/bgworker.sgml
@@ -280,15 +280,13 @@ typedef struct BackgroundWorker
   
 
   
-   If a background worker sends asynchronous notifications with the
-   NOTIFY command via the Server Programming Interface
-   (SPI), it should call
-   ProcessCompletedNotifies explicitly after committing
-   the enclosing transaction so that any notifications can be delivered.  If a
-   background worker registers to receive asynchronous notifications with
-   the LISTEN through SPI, the worker
-   will log those notifications, but there is no programmatic way for the
-   worker to intercept and respond to those notifications.
+   Background workers can send asynchronous notification messages, either by
+   using the NOTIFY command via SPI,
+   or directly via Async_Notify().  Such notifications
+   will be sent at transaction commit.
+   Background workers should not register to receive asynchronous
+   notifications with the LISTEN command, as there is no
+   infrastructure for a worker to consume such notifications.
   
 
   
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 387f80419a..6597ec45a9 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -2269,7 +2269,17 @@ CommitTransaction(void)
 	 */
 	smgrDoPendingDeletes(true);
 
+	/*
+	 * Send out notification signals to other backends (and do other
+	 * post-commit NOTIFY cleanup).  This must not happen until after our
+	 * transaction is fully done from the viewpoint of other backends.
+	 */
 	AtCommit_Notify();
+
+	/*
+	 * Everything after this should be purely internal-to-this-backend
+	 * cleanup.
+	 */
 	AtEOXact_GUC(true, 1);
 	AtEOXact_SPI(true);
 	AtEOXact_Enum();
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 4b16fb5682..8557008545 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -68,17 +68,27 @@
  *	  CommitTransaction() which will then do the actual transaction commit.
  *
  *	  After commit we are called another time (AtCommit_Notify()). Here we
- *	  make the actual updates to the effective listen state (listenChannels).
- *
- *	  Finally, after we are out of the transaction altogether, we check if
- *	  we need to signal listening backends.  In SignalBackends() we scan the
- *	  list of listening backends and send a PROCSIG_NOTIFY_INTERRUPT signal
- *	  to every listening backend (we don't know which backend is listening on
- *	  which channel so we must signal them all). We can exclude backends that
- *	  are already up to date, though, and we can also exclude backends that
- *	  are in other databases (unless they are way behind and should be kicked
- *	  to make them advance their pointers).  We don't 

Re: Schema variables - new implementation for Postgres 15

2021-09-12 Thread Pavel Stehule
Hi,
>
> Thanks, will test rebased version.
> BTW, that is not the temp variable. You can note it because of the
> schema or the lack of a "Transaction end action". That is a normal
> non-temp variable that has been created before. A TEMP variable with an
> ON COMMIT DROP created outside an explicit transaction will disappear
> immediately, like a cursor does in the same situation.
>

Unfortunately, I don't see it - or I don't understand your example from the
morning mail well

"""
regression=# create temp variable random_number numeric ;
CREATE VARIABLE
regression=# \dV
                                             List of variables
  Schema   |     Name      |  Type   | Is nullable | Is mutable | Default |  Owner   | Transactional end action
-----------+---------------+---------+-------------+------------+---------+----------+--------------------------
 pg_temp_4 | random_number | numeric | t           | t          |         | jcasanov | 
(1 row)

regression=# select public.random_number;
ERROR:  missing FROM-clause entry for table "public"
LINE 1: select public.random_number;
   ^
"""


>
>
> --
> Jaime Casanova
> Director de Servicios Profesionales
> SystemGuards - Consultores de PostgreSQL
>


Re: Private Information Retrieval (PIR) as a C/C++ Aggregate Extension

2021-09-12 Thread Andrey Borodin
Hi!

> On 12 Sept 2021, at 18:02, Private Information Retrieval (PIR)
>  wrote:
> 
> I've created a Postgresql C/C++ Aggregate Extension implementing Private 
> Information Retrieval (PIR) using Homomorphic Encryption. The open sourced 
> version can be found here: https://github.com/ReverseControl/MuchPIR .
> 
> In essence, with PIR we can retrieve data from any row in a table without 
> revealing to the server doing the search which row data was retrieved, or 
> whether the data was found at all. 
> 
> I am seeking feedback from the postgres community on this extension. Is it 
> something of interest? Is it something anyone would like to contribute to and 
> make better? Is there similar work already publicly available? Any reference 
> would be greatly appreciated.

PIR seem to be interesting functionality.
As far as I understand, in database terms PIR is a special kind of
aggregate, which extracts some part of the data without the server knowing which part.

One question came to my mind: can we limit the amount of extracted data? It
makes sense to protect the database from being copied wholesale.

Also, you may be interested in differential-privacy data exploration [0,1]. This
is a kind of data aggregation which protects individual rows from being deduced
through aggregate results. An implementation could resemble MuchPIR.

Thanks!

Best regards, Andrey Borodin.

[0] https://en.wikipedia.org/wiki/Differential_privacy
[1] https://cs.uwaterloo.ca/~ilyas/papers/GeSIGMOD2019.pdf 



Re: Remove duplicate static function check_permissions in slotfuncs.c and logicalfuncs.c

2021-09-12 Thread Euler Taveira
On Sat, Sep 11, 2021, at 5:28 AM, Bharath Rupireddy wrote:
> We have two static check_permissions functions (one in slotfuncs.c
> another in logicalfuncs.c) with the same name and same code for
> checking the privileges for using replication slots. Why can't we have
> a single function CheckReplicationSlotPermissions in slot.c? This way,
> we can get rid of redundant code. Attaching a patch for it.
Good catch! Your patch looks good to me.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Schema variables - new implementation for Postgres 15

2021-09-12 Thread Jaime Casanova
On Sun, Sep 12, 2021 at 05:38:42PM +0200, Pavel Stehule wrote:
> Hi
> 
> > """
> > regression=# create temp variable random_number numeric on commit drop;
> > CREATE VARIABLE
> > regression=# \dV
> > Did not find any schema variables.
> > regression=# declare q cursor  for select 1;
> > ERROR:  DECLARE CURSOR can only be used in transaction blocks
> > """
> >
> 
> I have different result
> 
> postgres=# create temp variable random_number numeric on commit drop;
> CREATE VARIABLE
> postgres=# \dV
>  List of variables
> ┌────────┬───────────────┬─────────┬─────────────┬────────────┬─────────┬───────┬──────────────────────────┐
> │ Schema │     Name      │  Type   │ Is nullable │ Is mutable │ Default │ Owner │ Transactional end action │
> ╞════════╪═══════════════╪═════════╪═════════════╪════════════╪═════════╪═══════╪══════════════════════════╡
> │ public │ random_number │ numeric │ t           │ t          │         │ tom2  │                          │
> └────────┴───────────────┴─────────┴─────────────┴────────────┴─────────┴───────┴──────────────────────────┘
> (1 row)
> 
> 
> 

Hi, 

Thanks, will test rebased version.
BTW, that is not the temp variable. You can note it because of the
schema or the lack of a "Transaction end action". That is a normal
non-temp variable that has been created before. A TEMP variable with an
ON COMMIT DROP created outside an explicit transaction will disappear
immediately, like a cursor does in the same situation.


-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: WIP: System Versioned Temporal Table

2021-09-12 Thread Simon Riggs
On Fri, 10 Sept 2021 at 19:30, Jaime Casanova
 wrote:
>
> On Tue, Aug 10, 2021 at 01:20:14PM +0100, Simon Riggs wrote:
> > On Wed, 14 Jul 2021 at 12:48, vignesh C  wrote:
> >
> > > The patch does not apply on Head anymore, could you rebase and post a
> > > patch. I'm changing the status to "Waiting for Author".
> >
> > OK, so I've rebased the patch against current master to take it to v15.
> >
> > I've then worked on the patch some myself to make v16 (attached),
> > adding these things:
> >
>
> Hi Simon,
>
> This one doesn't apply nor compile anymore.
> Can we expect a rebase soon?

Hi Jaime,

Sorry for not replying.

Yes, I will rebase again to assist the design input I have requested.
Please expect that on Sep 15.

Cheers

-- 
Simon Riggs    http://www.EnterpriseDB.com/




Re: Schema variables - new implementation for Postgres 15

2021-09-12 Thread Pavel Stehule
ne 12. 9. 2021 v 17:38 odesílatel Pavel Stehule 
napsal:

> Hi
>
>
>
>> Just noted that there is no support for REASSIGN OWNED BY:
>>
>> """
>> regression=# create variable random_number numeric;
>> CREATE VARIABLE
>> regression=# alter variable random_number owner to jcm;
>> ALTER VARIABLE
>> regression=# reassign owned by jcm to jaime;
>> ERROR:  unexpected classid 9222
>> """
>>
>>
> should be fixed by the attached patch, please check.
>
>
>
>> TEMP variables are not schema variables? at least not attached to the
>> schema one expects:
>>
>
> temp variables are schema variables like any other. But they are created
> in temp schema - like temp tables.
> I designed it in consistency with temporary tables.
>
>
>> """
>> regression=# create temp variable random_number numeric ;
>> CREATE VARIABLE
>> regression=# \dV
>>                                              List of variables
>>   Schema   |     Name      |  Type   | Is nullable | Is mutable | Default |  Owner   | Transactional end action
>> -----------+---------------+---------+-------------+------------+---------+----------+--------------------------
>>  pg_temp_4 | random_number | numeric | t           | t          |         | jcasanov | 
>> (1 row)
>>
>> regression=# select public.random_number;
>> ERROR:  missing FROM-clause entry for table "public"
>> LINE 1: select public.random_number;
>>^
>> """
>>
>> There was a comment that TEMP variables should be DECLAREd instead of
>> CREATEd, i guess that is because those have similar behaviour. At least,
>> I would like to see similar messages when using the ON COMMIT DROP
>> option in a TEMP variable:
>>
>
> I don't remember this comment. When I talked about similarity with the
> DECLARE statement, I thought about semantic similarity with T-SQL
> (Microsoft SQL) DECLARE command. Unfortunately, DECLARE command is pretty
> messy - it exists in SQL, it exists in SQL/PSM and it exists in T-SQL - and
> every time has similar syntax, but partially different semantics. For me -
> CREATE TEMP VARIABLE creates session's life limited variable (by default),
> similarly like DECLARE @localvariable command from T-SQL.
>

Any value of a schema variable has a session (or transaction) life cycle,
but the schema variable itself is persistent. A temp schema variable is an
exception: it is limited to the session (and the value stored in the variable
is limited to the session too).


>
>> """
>> regression=# create temp variable random_number numeric on commit drop;
>> CREATE VARIABLE
>> regression=# \dV
>> Did not find any schema variables.
>> regression=# declare q cursor  for select 1;
>> ERROR:  DECLARE CURSOR can only be used in transaction blocks
>> """
>>
>
> I have different result
>
> postgres=# create temp variable random_number numeric on commit drop;
> CREATE VARIABLE
> postgres=# \dV
>  List of variables
>
> ┌────────┬───────────────┬─────────┬─────────────┬────────────┬─────────┬───────┬──────────────────────────┐
> │ Schema │     Name      │  Type   │ Is nullable │ Is mutable │ Default │ Owner │ Transactional end action │
> ╞════════╪═══════════════╪═════════╪═════════════╪════════════╪═════════╪═══════╪══════════════════════════╡
> │ public │ random_number │ numeric │ t           │ t          │         │ tom2  │                          │
> └────────┴───────────────┴─────────┴─────────────┴────────────┴─────────┴───────┴──────────────────────────┘
> (1 row)
>
>
>
>> About that, why are you not using syntax ON COMMIT RESET instead on
>> inventing ON TRANSACTION END RESET? seems better because you already use
>> ON COMMIT DROP.
>>
>
> I thought about this question for a very long time, and I think the
> introduction of a new clause is better, and I will try to explain why.
>
> One part of this patch is DDL statements - and all of them are
> consistent with other DDL statements in Postgres. Schema variable DDL
> commands are transactional, and for TEMP variables we can specify a scope -
> session or transaction - in which case the ON COMMIT DROP clause is used. You
> do not need to specify an ON ROLLBACK action, because in this case removal
> from the system catalog is the only possible action.
>
> The second part of this patch is holding a value in a schema variable, or
> initialization with a default expression. The default behaviour is not
> transactional, and the value is kept for the whole session by default. But I
> think it can be very useful to enforce initialization at some specific
> times - currently only the end of the transaction can be specified. In the
> future there could be transaction end, transaction start, rollback, commit,
> top query start, top query end, ... This logic is different from the logic
> of DDL commands.  For DDL commands I need to specify behaviour just for the
> COMMIT end. But to reset non-transactional schema variables I need to
> specify any possible end of transaction - COMMIT, ROLLBACK or COMMIT or
> 

Re: Schema variables - new implementation for Postgres 15

2021-09-12 Thread Pavel Stehule
IT RESET" is not exact. "ON
COMMIT OR ROLLBACK RESET" sounds a little bit strange for me, but we use
something similar in trigger definition "ON INSERT OR UPDATE OR DELETE ..."
My opinion is not too strong on whether "ON TRANSACTION END RESET" or "ON COMMIT
OR ROLLBACK RESET" is better, and I can change it if people have
different preferences, but I am sure that "ON COMMIT RESET" is not correct for
the implemented case. And from the perspective of a PLpgSQL developer, I would
initialize the variable at any transaction start, so I need to reset
it at any end.

Regards

Pavel



> I will test more this patch tomorrow. Great work, very complete.
>
> --
> Jaime Casanova
> Director de Servicios Profesionales
> SystemGuards - Consultores de PostgreSQL
>


schema-variables-20210912.patch.gz
Description: application/gzip


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-09-12 Thread Joel Jacobson
On Thu, Sep 2, 2021, at 00:03, Daniel Gustafsson wrote:
> I can see value in a function like this one, and the API is AFAICT fairly
> aligned with what I as a user would expect it to be given what we already 
> have.

Good to hear and thanks for looking at this patch.

I've fixed the problem caused by the recent change in setup_regexp_matches(),
which added a new int parameter "start_search".
I pass 0 as start_search, which I think should give the same behaviour as 
before.

I also changed the assigned oid values in pg_proc.dat for the two new 
regexp_positions() catalog functions.

$ make check

===
All 209 tests passed.
===

/Joel





0005-regexp-positions.patch
Description: Binary data


Private Information Retrieval (PIR) as a C/C++ Aggregate Extension

2021-09-12 Thread Private Information Retrieval(PIR)
Hello,

I've created a Postgresql C/C++ Aggregate Extension implementing Private 
Information Retrieval (PIR) using Homomorphic Encryption. The open sourced 
version can be found here: https://github.com/ReverseControl/MuchPIR .

In essence, with PIR we can retrieve data from any row in a table without 
revealing to the server doing the search which row data was retrieved, or 
whether the data was found at all.

I am seeking feedback from the postgres community on this extension. Is it 
something of interest? Is it something anyone would like to contribute to and 
make better? Is there similar work already publicly available? Any reference 
would be greatly appreciated.

Thank you.

Sent with [ProtonMail](https://protonmail.com/) Secure Email.

Doc: Extra type info on postgres-fdw option import_generated in back branches

2021-09-12 Thread Etsuro Fujita
In commit aa769f80e, I back-patched the same postgres-fdw.sgml change,
including $SUBJECT, to v12, but I noticed the type info on each FDW
option is present in HEAD only.  :-(  Here is a patch to remove
$SUBJECT from the back branches for consistency.
Best regards,
Etsuro Fujita


remove-extra-type-info.patch
Description: Binary data


RE: drop tablespace failed when location contains .. on win32

2021-09-12 Thread wangsh.f...@fujitsu.com
Hi,

> -Original Message-
> From: Andrew Dunstan 
> Sent: Thursday, September 9, 2021 8:30 PM

> I think I would say that we should remove any "." or ".." element in the
> path except at the beginning, and in the case of ".." also remove the
> preceding element, unless someone can convince me that there's a problem
> with that.

These WIP patches try to remove all the '.' or '..' in the path except at
the beginning.

0001 is a small fix, because I found that is_absolute_path is not appropriate;
see the comment in skip_drive:
> * On Windows, a path may begin with "C:" or "//network/".

But this modification will lead to a regress test failure on Windows:
> -- Will fail with bad path
> CREATE TABLESPACE regress_badspace LOCATION '/no/such/location';
> -ERROR:  directory "/no/such/location" does not exist
> +ERROR:  tablespace location must be an absolute path

Do you think this modification is necessary ?

The rest of the modification is in 0002. I think this patch needs more testing
and review.

0003 is a test extension for me to check the behavior of canonicalize_path.
Do you think it is necessary to add this test extension (and some test scripts)
to master?
If so, maybe I can use a TAP test to check the behavior of canonicalize_path
on Linux and Windows.


I will add this to the next commitfest after further testing.

Regards.
Shenhao Wang


0001-WIP-change-is_absolute_path-s-action.patch
Description: 0001-WIP-change-is_absolute_path-s-action.patch


0002-WIP-make-canonicalize_path-remove-all-.-in-path.patch
Description:  0002-WIP-make-canonicalize_path-remove-all-.-in-path.patch


0003-WIP-add-a-canonicalize_path-test.patch
Description: 0003-WIP-add-a-canonicalize_path-test.patch


Re: Confusing messages about index row size

2021-09-12 Thread David G. Johnston
On Sunday, September 12, 2021, Jaime Casanova 
wrote:
>
>
> So, what is it? The index row size could be up to 8191, or cannot be
> greater than 2704?
>

The wording doesn’t change between the two: the size cannot be greater than
8191 regardless of the index type being used.  This check is first,
probably because it is cheaper, and just normal abstraction layering, but
it doesn’t preclude individual indexes imposing their own constraint, as
evidenced by the lower maximum of 2704 in this specific setup.

It may be non-ideal from a UX perspective to have a moving target in the
error messages, but they are consistent and accurate, and doesn’t seem
worthwhile to expend much effort on usability since the errors should
themselves be rare.

David J.