Re: Asynchronous Append on postgres_fdw nodes.

2021-04-26 Thread Andrey V. Lepikhov

On 4/23/21 8:12 AM, Etsuro Fujita wrote:

On Thu, Apr 22, 2021 at 12:30 PM Etsuro Fujita  wrote:
I have committed the patch.


One more question. Append chooses its async subplans at the stage of 
Append plan creation.
Later, the planner performs further optimizations, such as eliminating 
trivial Subquery nodes. As a result, an async Append is impossible in 
some situations, for example:


(SELECT * FROM f1 WHERE a < 10)
  UNION ALL
(SELECT * FROM f2 WHERE a < 10);

But it works for this query:

SELECT *
  FROM (SELECT * FROM f1 UNION ALL SELECT * FROM f2) AS q1
WHERE a < 10;

As far as I understand, this is not a hard limit: we could instead choose 
async subplans at the beginning of the execution stage.

As a demo, I prepared a patch (attached).
It solves the problem and passes the regression tests.
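The executor-side part of the attached patch amounts to clearing
async_capable when the Append has fewer than two subplans. A minimal
standalone sketch of that check (DemoPlan is a hypothetical stand-in for
the real Plan node, not server code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the planner's Plan node. */
typedef struct DemoPlan
{
	bool		async_capable;
} DemoPlan;

/*
 * Mirror of the patch's ExecInitAppend logic: async execution is only
 * worthwhile when an Append has at least two subplans, so clear the
 * async_capable flag on every subplan otherwise.
 */
static void
demo_init_append(DemoPlan **subplans, int nplans)
{
	bool		consider_async = (nplans > 1);

	if (!consider_async)
	{
		for (int i = 0; i < nplans; i++)
			subplans[i]->async_capable = false;
	}
}
```

With more than one subplan, the flags set at plan-creation time are left
untouched; with a single subplan the flag is cleared at executor startup.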

--
regards,
Andrey Lepikhov
Postgres Professional
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index a960ada441..655e743c6e 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1246,6 +1246,7 @@ postgresGetForeignPlan(PlannerInfo *root,
 	bool		has_final_sort = false;
 	bool		has_limit = false;
 	ListCell   *lc;
+	ForeignScan *fsplan;
 
 	/*
 	 * Get FDW private data created by postgresGetForeignUpperPaths(), if any.
@@ -1430,7 +1431,7 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * field of the finished plan node; we can't keep them in private state
 	 * because then they wouldn't be subject to later planner processing.
 	 */
-	return make_foreignscan(tlist,
+	fsplan = make_foreignscan(tlist,
 			local_exprs,
 			scan_relid,
 			params_list,
@@ -1438,6 +1439,13 @@ postgresGetForeignPlan(PlannerInfo *root,
 			fdw_scan_tlist,
 			fdw_recheck_quals,
 			outer_plan);
+
+	/* If appropriate, consider participation in async operations */
+	fsplan->scan.plan.async_capable = (enable_async_append &&
+	   best_path->path.pathkeys == NIL &&
+	   !fsplan->scan.plan.parallel_safe &&
+	   is_async_capable_path((Path *)best_path));
+	return fsplan;
 }
 
 /*
diff --git a/src/backend/executor/execAmi.c b/src/backend/executor/execAmi.c
index b3726a54f3..4e70f4eb54 100644
--- a/src/backend/executor/execAmi.c
+++ b/src/backend/executor/execAmi.c
@@ -524,6 +524,9 @@ ExecSupportsBackwardScan(Plan *node)
 	if (node->parallel_aware)
 		return false;
 
+	if (node->async_capable)
+		return false;
+
 	switch (nodeTag(node))
 	{
 		case T_Result:
@@ -536,10 +539,6 @@ ExecSupportsBackwardScan(Plan *node)
 			{
 				ListCell   *l;
 
-				/* With async, tuples may be interleaved, so can't back up. */
-				if (((Append *) node)->nasyncplans > 0)
-					return false;
-
 				foreach(l, ((Append *) node)->appendplans)
 				{
 					if (!ExecSupportsBackwardScan((Plan *) lfirst(l)))
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 3c1f12adaf..363cf9f4a5 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -117,6 +117,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	int			firstvalid;
 	int			i,
 				j;
+	ListCell   *l;
+	bool		consider_async = false;
 
 	/* check for unsupported flags */
 	Assert(!(eflags & EXEC_FLAG_MARK));
@@ -197,6 +199,23 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendplanstates = (PlanState **) palloc(nplans *
 			 sizeof(PlanState *));
 
+	/* If appropriate, consider async append */
+	consider_async = (list_length(node->appendplans) > 1);
+
+	if (!consider_async)
+	{
+		foreach(l, node->appendplans)
+		{
+			Plan *subplan = (Plan *) lfirst(l);
+
+			/* Check to see if subplan can be executed asynchronously */
+			if (subplan->async_capable)
+			{
+				subplan->async_capable = false;
+			}
+		}
+	}
+
 	/*
 	 * call ExecInitNode on each of the valid plans to be executed and save
 	 * the results into the appendplanstates array.
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 632cc31a04..f7302ccf28 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -242,7 +242,6 @@ _copyAppend(const Append *from)
 	 */
 	COPY_BITMAPSET_FIELD(apprelids);
 	COPY_NODE_FIELD(appendplans);
-	COPY_SCALAR_FIELD(nasyncplans);
 	COPY_SCALAR_FIELD(first_partial_plan);
 	COPY_NODE_FIELD(part_prune_info);
 
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index c723f6d635..665cdf3add 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -432,7 +432,6 @@ _outAppend(StringInfo str, const Append *node)
 
 	WRITE_BITMAPSET_FIELD(apprelids);
 	WRITE_NODE_FIELD(appendplans);
-	WRITE_INT_FIELD(nasyncplans);
 	WRITE_INT_FIELD(first_partial_plan);
 	WRITE_NODE_FIELD(part_prune_info);
 }
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3746668f52..9e3822f7db 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1716,7 +1716,6 @@ _readAppend(void)
 
 	READ_BITMAP

Re: Skip temporary table schema name from explain-verbose output.

2021-04-26 Thread Amul Sul
On Tue, Apr 27, 2021 at 11:07 AM Bharath Rupireddy
 wrote:
>
> On Tue, Apr 27, 2021 at 10:51 AM Amul Sul  wrote:
> >
> > Hi,
> >
> > Temporary tables usually get a unique schema name, see this:
> >
> > postgres=# create temp table foo(i int);
> > CREATE TABLE
> > postgres=# explain verbose select * from foo;
> >QUERY PLAN
> > -
> >  Seq Scan on pg_temp_3.foo  (cost=0.00..35.50 rows=2550 width=4)
> >Output: i
> > (2 rows)
> >
> > The problem is that explain-verbose regression test output becomes
> > unstable when several concurrently running tests operate on temporary
> > tables.
> >
> > I was wondering: can we simply skip the temporary schema name in the
> > explain-verbose output, or print the "pg_temp" schema name instead?
> >
> > Thoughts/Suggestions?
>
> How about using an explain filter to replace the unstable text
> pg_temp_3 to pg_temp_N instead of changing it in the core? Following
> are the existing explain filters: explain_filter,
> explain_parallel_append, explain_analyze_without_memory,
> explain_resultcache, explain_parallel_sort_stats, explain_sq_limit.
>

Well, yes, eventually that will be the kludge. I was wondering: if the
table is accessible in a query via the pg_temp schema, why should we
bother printing the pg_temp_N schema name, which serves an internal
purpose only?

> Looks like some of the test cases already replace pg_temp_nn with pg_temp:
> -- \dx+ would expose a variable pg_temp_nn schema name, so we can't use it 
> here
> select regexp_replace(pg_describe_object(classid, objid, objsubid),
>   'pg_temp_\d+', 'pg_temp', 'g') as "Object description"
>

This \dx+ workaround could be one more example in favor of simply
showing pg_temp instead of pg_temp_N.

Regards,
Amul




Re: [BUG] "FailedAssertion" reported when streaming in logical replication

2021-04-26 Thread Dilip Kumar
On Tue, Apr 27, 2021 at 12:05 PM Amit Kapila  wrote:
> > > Can't we use 'txns_by_base_snapshot_lsn' list for this purpose? It is
> > > ensured in ReorderBufferSetBaseSnapshot that we always assign
> > > base_snapshot to a top-level transaction if the current is a known
> > > subxact. I think that will be true because we always form xid-subxid
> > > relation before processing each record in
> > > LogicalDecodingProcessRecord.
> >
> > Yeah, we can do that, but here we are only interested in top
> > transactions and this list will give us sub-transaction as well so we
> > will have to skip it in the below if condition.
> >
>
> I am not so sure about this point. I have explained above why I think
> there won't be any subtransactions in this. Can you please let me know
> what am I missing if anything?

Got your point; yeah, this will only have top-level transactions, so we
can use it.  I will change this in the next patch.  In fact, we could
add an assert that it is not a sub-transaction?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [BUG] "FailedAssertion" reported when streaming in logical replication

2021-04-26 Thread Amit Kapila
On Tue, Apr 27, 2021 at 11:50 AM Dilip Kumar  wrote:
>
> On Tue, Apr 27, 2021 at 11:43 AM Amit Kapila  wrote:
> >
> > On Mon, Apr 26, 2021 at 7:52 PM Dilip Kumar  wrote:
> > >
> > > On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila  
> > > wrote:
> > > >
> > > > On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar  
> > > > wrote:
> > > > >
> > > > > I am able to reproduce this and I think I have done the initial 
> > > > > investigation.
> > > > >
> > > > > The cause of the issue is that, this transaction has only one change
> > > > > and that change is XLOG_HEAP2_NEW_CID, which is added through
> > > > > SnapBuildProcessNewCid.  Basically, when we add any changes through
> > > > > SnapBuildProcessChange we set the base snapshot but when we add
> > > > > SnapBuildProcessNewCid this we don't set the base snapshot, because
> > > > > there is nothing to be done for this change.  Now, this transaction is
> > > > > identified as the biggest transaction with non -partial changes, and
> > > > > now in ReorderBufferStreamTXN, it will return immediately because the
> > > > > base_snapshot is NULL.
> > > > >
> > > >
> > > > Your analysis sounds correct to me.
> > > >
> > >
> > > Thanks, I have attached a patch to fix this.
> > >
> >
> > Can't we use 'txns_by_base_snapshot_lsn' list for this purpose? It is
> > ensured in ReorderBufferSetBaseSnapshot that we always assign
> > base_snapshot to a top-level transaction if the current is a known
> > subxact. I think that will be true because we always form xid-subxid
> > relation before processing each record in
> > LogicalDecodingProcessRecord.
>
> Yeah, we can do that, but here we are only interested in top
> transactions and this list will give us sub-transaction as well so we
> will have to skip it in the below if condition.
>

I am not so sure about this point. I have explained above why I think
there won't be any subtransactions in this. Can you please let me know
what am I missing if anything?

-- 
With Regards,
Amit Kapila.




RE: Truncate in synchronous logical replication failed

2021-04-26 Thread tanghy.f...@fujitsu.com
On Tuesday, April 27, 2021 1:17 PM, Amit Kapila  wrote

>Seeing no other suggestions, I have pushed this in HEAD only. Thanks!

Sorry for the late reply on this issue.
I tested with the latest HEAD; the issue is fixed now. Thanks a lot.

Regards
Tang


Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Julien Rouhaud
On Mon, Apr 26, 2021 at 11:37:45AM -0700, Andres Freund wrote:
> Hi,
> 
> On 2021-04-26 14:21:00 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > That's sounding like a pretty sane design, actually.  Not sure about
> > the shared-library-name-with-fixed-function-name detail, but certainly
> > it seems to be useful to separate "I need a query-id" from the details
> > of the ID calculation.
> > 
> > Rather than a GUC per se for the ID provider, maybe we could have a
> > function hook that defaults to pointing at the in-core computation,
> > and then a module wanting to override that just gets into the hook.
> 
> I have a preference to determining the provider via GUC instead of a
> hook because it is both easier to introspect and easier to configure.

In any case, having a separate provider would greatly simplify life for
third-party queryid library authors and users.  For now the core queryid is
computed before post_parse_analyze_hook, but any third-party plugin has to do
its computation in a post_parse_analyze_hook, so you have to make sure that
the library is at the right position in shared_preload_libraries for it to
work, e.g. [1], depending on how pg_stat_statements and other similar modules
call prev_post_parse_analyze_hook, which is a pretty bad thing.
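The ordering hazard described here can be seen in a toy model of hook
chaining (all names are illustrative, not the real server API):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a PostgreSQL-style hook chain. */
typedef void (*demo_hook_type)(int *query_id);

static demo_hook_type demo_hook = NULL;

/* "pg_stat_statements": consumes whatever query id is set by now. */
static demo_hook_type pgss_prev = NULL;
static int	pgss_seen_id = 0;

static void
pgss_hook(int *query_id)
{
	if (pgss_prev)
		pgss_prev(query_id);
	pgss_seen_id = *query_id;
}

/* A third-party queryid provider: sets the query id. */
static demo_hook_type provider_prev = NULL;

static void
provider_hook(int *query_id)
{
	if (provider_prev)
		provider_prev(query_id);
	*query_id = 42;
}

/* Each "library" saves the previous hook and installs its own. */
static void
load_pgss(void)
{
	pgss_prev = demo_hook;
	demo_hook = pgss_hook;
}

static void
load_provider(void)
{
	provider_prev = demo_hook;
	demo_hook = provider_hook;
}
```

With the load order reversed (pgss first, provider second), provider_hook
would run pgss_hook before setting the id and pgss would see 0 -- which
is exactly why the position in shared_preload_libraries matters.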

> If the provider is loaded via a hook, and the shared library is loaded
> via shared_preload_libraries, one can't easily just turn that off in a
> single session, but needs to restart or explicitly load a different
> library (that can't already be loaded).

On the other hand we *don't* want to dynamically change the provider.
Temporarily enabling/disabling queryid calculation is OK, but generating
different hashes for the same query isn't.

> We also don't have any way to show what's hooking into a hook.

If we had a dedicated query_id hook, then plugins should error out if users
configured multiple plugins to calculate a query_id, so it should be easy to
know which plugin is responsible for it without knowing who hooked into the
hook.

[1] https://github.com/rjuju/pg_queryid/blob/master/pg_queryid.c#L116-L117




Small issues with CREATE TABLE COMPRESSION

2021-04-26 Thread Michael Paquier
Hi all,

I have been looking at and testing the patch set for CREATE TABLE
COMPRESSION, and spotted a couple of things in parallel of some work
done by Jacob (added in CC).

The behavior around CREATE TABLE AS and matviews is a bit confusing,
and not documented.  First, at the grammar level, it is not possible
to specify which compression option is used per column when creating
the relation.  So, all the relation columns would just set a column's
compression to be default_toast_compression for all the toastable
columns of the relation.  That's not enforceable at column level when
the relation is created, except with a follow-up ALTER TABLE.  That's
similar to STORAGE when it comes to matviews, but these are at least
documented.

And so, ALTER MATERIALIZED VIEW supports SET COMPRESSION but this is
not mentioned in its docs:
https://www.postgresql.org/docs/devel/sql-altermaterializedview.html
psql could have tab completion support for that.

There are no tests in pg_dump to make sure that some ALTER
MATERIALIZED VIEW or ALTER TABLE commands are generated when the
compression of a matview's or table's column is changed.  This depends
on the value of default_toast_compression, but that would be nice to
have something, and get at least some coverage with
--no-toast-compression.  You would need to make the tests conditional
here, for example with check_pg_config() (see for example what's done
with channel binding in ssl/t/002_scram.pl).

Another thing is the handling of the per-value compression that could
be confusing to the user.  As no materialization of the data is done
for a CTAS or a matview, and the regression tests of compression.sql
track that AFAIK, there can be a mix of toast values compressed with
lz4 or pglz, with pg_attribute.attcompression being one or the other.

Now, we don't really document any of that, and the per-column
compression value would be set to default_toast_compression while the
stored values may use a mix of the compression methods, depending on
where the toasted values come from.  If this behavior is intended, it
makes me wonder what the ability to set the compression for a
materialized view column is useful for, except for a logical
dump/restore?  As of HEAD we'd just insert the toasted value from the
origin as-is so the compression of the column has no effect at all.
Another thing here is the inconsistency that this brings with pg_dump.
For example, as the dumped values are decompressed, we could have
values compressed with pglz at the origin, with a column using lz4
within its definition that would make everything compressed with lz4
once the values are restored.  This choice may be fine, but I think
that it would be good to document all that.  That would be less
surprising to the user.

Similarly, we may want to document that COMPRESSION does not trigger a
table rewrite, but that it is effective only for the new toast values
inserted if a tuple is rebuilt and rewritten?

Would it be better to document that pg_column_compression() returns
NULL if the column is not a toastable type or if the column's value is
not compressed?

The flexibility with allow_system_table_mods allows one to change the
compression method of catalogs; for example, switching rolpassword to
a SCRAM verifier large enough to be toasted would lock out access to
the cluster if the server is restarted without lz4 support.  I shouldn't
have done that but I did, and I like it :)

The design used by this feature is pretty cool, as long as you don't
read the compressed values, physical replication can work out of the
box even across nodes that are built with or without lz4.

Thanks,
--
Michael




Re: [BUG] "FailedAssertion" reported when streaming in logical replication

2021-04-26 Thread Dilip Kumar
On Tue, Apr 27, 2021 at 11:43 AM Amit Kapila  wrote:
>
> On Mon, Apr 26, 2021 at 7:52 PM Dilip Kumar  wrote:
> >
> > On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila  wrote:
> > >
> > > On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar  wrote:
> > > >
> > > > I am able to reproduce this and I think I have done the initial 
> > > > investigation.
> > > >
> > > > The cause of the issue is that, this transaction has only one change
> > > > and that change is XLOG_HEAP2_NEW_CID, which is added through
> > > > SnapBuildProcessNewCid.  Basically, when we add any changes through
> > > > SnapBuildProcessChange we set the base snapshot but when we add
> > > > SnapBuildProcessNewCid this we don't set the base snapshot, because
> > > > there is nothing to be done for this change.  Now, this transaction is
> > > > identified as the biggest transaction with non-partial changes, and
> > > > now in ReorderBufferStreamTXN, it will return immediately because the
> > > > base_snapshot is NULL.
> > > >
> > >
> > > Your analysis sounds correct to me.
> > >
> >
> > Thanks, I have attached a patch to fix this.
> >
>
> Can't we use 'txns_by_base_snapshot_lsn' list for this purpose? It is
> ensured in ReorderBufferSetBaseSnapshot that we always assign
> base_snapshot to a top-level transaction if the current is a known
> subxact. I think that will be true because we always form xid-subxid
> relation before processing each record in
> LogicalDecodingProcessRecord.

Yeah, we can do that, but here we are only interested in top-level
transactions, and this list will give us sub-transactions as well, so we
would have to skip them in the if condition below.  So I think using
toplevel_by_lsn and skipping the txns without a base_snapshot in that
condition will be cheaper than processing all the transactions with a
base snapshot (i.e. txns_by_base_snapshot_lsn) and skipping the
sub-transactions there.  What are your thoughts on this?


> Few other minor comments:
> 1. I think we can update the comments atop function 
> ReorderBufferLargestTopTXN.
> 2. minor typo in comments atop ReorderBufferLargestTopTXN "...There is
> a scope of optimization here such that we can select the largest
> transaction which has complete changes...". In this 'complete' should
> be incomplete. This is not related to this patch but I think we can
> fix it along with this because anyway we are going to change
> surrounding comments.

I will work on these in the next version.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [BUG] "FailedAssertion" reported when streaming in logical replication

2021-04-26 Thread Amit Kapila
On Mon, Apr 26, 2021 at 7:52 PM Dilip Kumar  wrote:
>
> On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila  wrote:
> >
> > On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar  wrote:
> > >
> > > I am able to reproduce this and I think I have done the initial 
> > > investigation.
> > >
> > > The cause of the issue is that, this transaction has only one change
> > > and that change is XLOG_HEAP2_NEW_CID, which is added through
> > > SnapBuildProcessNewCid.  Basically, when we add any changes through
> > > SnapBuildProcessChange we set the base snapshot but when we add
> > > SnapBuildProcessNewCid this we don't set the base snapshot, because
> > > there is nothing to be done for this change.  Now, this transaction is
> > > identified as the biggest transaction with non-partial changes, and
> > > now in ReorderBufferStreamTXN, it will return immediately because the
> > > base_snapshot is NULL.
> > >
> >
> > Your analysis sounds correct to me.
> >
>
> Thanks, I have attached a patch to fix this.
>

Can't we use 'txns_by_base_snapshot_lsn' list for this purpose? It is
ensured in ReorderBufferSetBaseSnapshot that we always assign
base_snapshot to a top-level transaction if the current is a known
subxact. I think that will be true because we always form xid-subxid
relation before processing each record in
LogicalDecodingProcessRecord.

Few other minor comments:
1. I think we can update the comments atop function ReorderBufferLargestTopTXN.
2. minor typo in comments atop ReorderBufferLargestTopTXN "...There is
a scope of optimization here such that we can select the largest
transaction which has complete changes...". In this 'complete' should
be incomplete. This is not related to this patch but I think we can
fix it along with this because anyway we are going to change
surrounding comments.

-- 
With Regards,
Amit Kapila.




Re: TRUNCATE on foreign table

2021-04-26 Thread Bharath Rupireddy
On Tue, Apr 27, 2021 at 11:19 AM Fujii Masao
 wrote:
> > In docs v4 patch, I think we can combine below two lines into a single line:
> > +   supported by the foreign data wrapper,
> >  see .
>
> You mean "supported by the foreign data wrapper  linkend="postgres-fdw"/>"?
>
> I was thinking that it's better to separate them because postgres_fdw
> is just an example of the foreign data wrapper supporting TRUNCATE.
> This makes me think again; isn't it better to add "for example" or
> "for instance" into after "data wrapper"? That is,
>
>  TRUNCATE can be used for foreign tables if
>  supported by the foreign data wrapper, for instance,
>  see .

+1.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: TRUNCATE on foreign table

2021-04-26 Thread Fujii Masao




On 2021/04/26 13:52, Bharath Rupireddy wrote:

On Fri, Apr 23, 2021 at 9:50 PM Fujii Masao  wrote:

Thanks for the review! I fixed this.


Thanks for the updated patches.

In docs v4 patch, I think we can combine below two lines into a single line:
+   supported by the foreign data wrapper,
 see .


You mean "supported by the foreign data wrapper "?

I was thinking that it's better to separate them because postgres_fdw
is just an example of the foreign data wrapper supporting TRUNCATE.
This makes me think again; isn't it better to add "for example" or
"for instance" into after "data wrapper"? That is,

TRUNCATE can be used for foreign tables if
supported by the foreign data wrapper, for instance,
see .



Other than the above minor change, both patches look good to me, I
have no further comments.


Thanks! I pushed the patch 
truncate_foreign_table_dont_pass_only_clause_xx.patch, at first.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: [PATCH] Re: pg_identify_object_as_address() doesn't support pg_event_trigger oids

2021-04-26 Thread Japin Li


On Tue, 27 Apr 2021 at 13:16, Joel Jacobson  wrote:
> On Mon, Apr 26, 2021, at 10:30, Michael Paquier wrote:
>> On Fri, Apr 23, 2021 at 09:33:36AM +0200, Joel Jacobson wrote:
>> > Also, since this is a problem also in v13 maybe this should also be
>> > back-ported?  I think it's a bug since both
>> > pg_identify_object_as_address() and event triggers exists in v13, so
>> > the function should work there as well, otherwise users would need
>> > to do work-arounds for event triggers.  
>> 
>> No objections from here to do something in back-branches.  We cannot
>> have a test for event triggers in object_address.sql and it would be
>> better to keep it in a parallel set (see 676858b for example).  Could
>> you however add a small test for that in event_trigger.sql?  It would
>> be good to check after all three functions pg_identify_object(),
>> pg_identify_object_as_address() and pg_get_object_address().
>> --
>> Michael
>
> Thanks for the guidance in how to test.
>
> I've added a test at the end of event_trigger.sql,
> reusing the three event triggers already in existence,
> just before they are dropped.
>
> New patch attached.

IMO we should add a space between the parameters to keep the code
style consistent.

+SELECT
+  evtname,
+  pg_describe_object('pg_event_trigger'::regclass,oid,0),
+  pg_identify_object('pg_event_trigger'::regclass,oid,0),
+  pg_identify_object_as_address('pg_event_trigger'::regclass,oid,0)
+FROM pg_event_trigger
+WHERE evtname IN ('start_rls_command','end_rls_command','sql_drop_command')
+ORDER BY evtname;

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Skip temporary table schema name from explain-verbose output.

2021-04-26 Thread Bharath Rupireddy
On Tue, Apr 27, 2021 at 10:51 AM Amul Sul  wrote:
>
> Hi,
>
> Temporary tables usually get a unique schema name, see this:
>
> postgres=# create temp table foo(i int);
> CREATE TABLE
> postgres=# explain verbose select * from foo;
>QUERY PLAN
> -
>  Seq Scan on pg_temp_3.foo  (cost=0.00..35.50 rows=2550 width=4)
>Output: i
> (2 rows)
>
> The problem is that explain-verbose regression test output becomes
> unstable when several concurrently running tests operate on temporary
> tables.
>
> I was wondering: can we simply skip the temporary schema name in the
> explain-verbose output, or print the "pg_temp" schema name instead?
>
> Thoughts/Suggestions?

How about using an explain filter to replace the unstable text
pg_temp_3 to pg_temp_N instead of changing it in the core? Following
are the existing explain filters: explain_filter,
explain_parallel_append, explain_analyze_without_memory,
explain_resultcache, explain_parallel_sort_stats, explain_sq_limit.

Looks like some of the test cases already replace pg_temp_nn with pg_temp:
-- \dx+ would expose a variable pg_temp_nn schema name, so we can't use it here
select regexp_replace(pg_describe_object(classid, objid, objsubid),
  'pg_temp_\d+', 'pg_temp', 'g') as "Object description"
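For what it's worth, the normalization the regexp_replace() call above
performs can also be sketched as a standalone filter (an illustrative
sketch only, not server code):

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrite every pg_temp_<digits> occurrence in "in" as plain "pg_temp",
 * writing the result into "out" -- the same stabilization the SQL-level
 * regexp_replace() does for test output.
 */
static void
normalize_temp_schema(const char *in, char *out, size_t outlen)
{
	regex_t		re;
	regmatch_t	m;
	size_t		used = 0;

	if (regcomp(&re, "pg_temp_[0-9]+", REG_EXTENDED) != 0)
	{
		snprintf(out, outlen, "%s", in);
		return;
	}
	while (regexec(&re, in, 1, &m, 0) == 0)
	{
		/* copy the text before the match, then the replacement */
		used += snprintf(out + used, outlen - used, "%.*s%s",
						 (int) m.rm_so, in, "pg_temp");
		in += m.rm_eo;
	}
	snprintf(out + used, outlen - used, "%s", in);
	regfree(&re);
}
```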

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-26 Thread Masahiko Sawada
On Tue, Apr 27, 2021 at 8:07 AM Andres Freund  wrote:
>
> Hi,
>
> On 2021-04-26 23:59:17 +0200, Tomas Vondra wrote:
> > On 4/26/21 9:27 PM, Andres Freund wrote:
> > > On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
> > > > I'm not sure what to do about this :-( I don't have any ideas about how 
> > > > to
> > > > eliminate this overhead, so the only option I see is reverting the 
> > > > changes
> > > > in heap_insert. Unfortunately, that'd mean inserts into TOAST tables 
> > > > won't
> > > > be frozen ...
> > >
> > > ISTM that the fundamental issue here is not that we acquire pins that we
> > > shouldn't, but that we do so at a much higher frequency than needed.
> > >
> > > It's probably too invasive for 14, but I think it might be worth exploring
> > > passing down a BulkInsertState in nodeModifyTable.c's 
> > > table_tuple_insert() iff
> > > the input will be more than one row.
> > >
> > > And then add the vm buffer of the target page to BulkInsertState, so that
> > > hio.c can avoid re-pinning the buffer.
> > >
> >
> > Yeah. The question still is what to do about 14, though. Shall we leave the
> > code as it is now, or should we change it somehow? It seem a bit unfortunate
> > that a COPY FREEZE optimization should negatively influence other (more)
> > common use cases, so I guess we can't just keep the current code ...
>
> I'd suggest prototyping the use of BulkInsertState in nodeModifyTable.c
> and see whether that fixes the regression.

Is the idea to have RelationGetBufferForTuple() skip re-pinning the
vmbuffer? If so, is this essentially the same as the approach in the v3
patch?
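The re-pinning issue under discussion can be modeled abstractly (all
names here are hypothetical; the real mechanism involves BulkInsertState
and the visibility-map buffer handling in hio.c):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a bulk-insert state carrying a cached pin. */
typedef struct DemoBulkState
{
	bool		vm_buffer_pinned;
} DemoBulkState;

static int	pin_calls = 0;		/* counts simulated buffer pins */

/*
 * Without a shared state (bistate == NULL), every row insertion re-pins
 * the visibility-map buffer; with one, the pin is taken once and reused.
 */
static void
demo_insert_row(DemoBulkState *bistate)
{
	if (bistate == NULL || !bistate->vm_buffer_pinned)
	{
		pin_calls++;			/* simulate pinning the vm buffer */
		if (bistate)
			bistate->vm_buffer_pinned = true;
	}
	/* ... insert the row ... */
}
```

The point of passing a BulkInsertState down from nodeModifyTable.c would
be to get the second behavior for multi-row inserts.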

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: [BUG] "FailedAssertion" reported when streaming in logical replication

2021-04-26 Thread Dilip Kumar
On Mon, Apr 26, 2021 at 7:52 PM Dilip Kumar  wrote:
>
> On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila  wrote:
> >
> > On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar  wrote:
> > >
> > > I am able to reproduce this and I think I have done the initial 
> > > investigation.
> > >
> > > The cause of the issue is that, this transaction has only one change
> > > and that change is XLOG_HEAP2_NEW_CID, which is added through
> > > SnapBuildProcessNewCid.  Basically, when we add any changes through
> > > SnapBuildProcessChange we set the base snapshot but when we add
> > > SnapBuildProcessNewCid this we don't set the base snapshot, because
> > > there is nothing to be done for this change.  Now, this transaction is
> > > identified as the biggest transaction with non-partial changes, and
> > > now in ReorderBufferStreamTXN, it will return immediately because the
> > > base_snapshot is NULL.
> > >
> >
> > Your analysis sounds correct to me.
> >
>
> Thanks, I have attached a patch to fix this.

There is also one very silly mistake in the condition below: once we
have selected any transaction, every following transaction is selected
unconditionally, without comparing sizes, because the largest != NULL
test is wrong.  It should be largest == NULL: only when we haven't
selected any transaction yet can we accept the next one without
comparing sizes.

if ((largest != NULL || txn->total_size > largest_size) &&
(txn->base_snapshot != NULL) && (txn->total_size > 0) &&
!(rbtxn_has_incomplete_tuple(txn)))
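A toy version of the selection loop shows why the == test matters
(DemoTxn is a hypothetical stand-in, not the real ReorderBufferTXN):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a reorder-buffer transaction. */
typedef struct DemoTxn
{
	long		total_size;
	int			has_base_snapshot;
} DemoTxn;

/* Pick the largest streamable transaction, mirroring the fixed logic. */
static const DemoTxn *
pick_largest(const DemoTxn *txns, int n)
{
	const DemoTxn *largest = NULL;
	long		largest_size = 0;

	for (int i = 0; i < n; i++)
	{
		const DemoTxn *txn = &txns[i];

		/* corrected test: "largest == NULL", not "largest != NULL" */
		if ((largest == NULL || txn->total_size > largest_size) &&
			txn->has_base_snapshot && txn->total_size > 0)
		{
			largest = txn;
			largest_size = txn->total_size;
		}
	}
	return largest;
}
```

With the buggy `largest != NULL` test, every transaction after the first
selection would overwrite it regardless of size.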


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 16d47947002357cc37fdc3debcdf8c376e370188 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Mon, 26 Apr 2021 18:19:27 +0530
Subject: [PATCH v1] Don't select the transaction without base snapshot for
 streaming

While selecting the largest top transaction, currently we don't check
whether the transaction has the base snapshot or not, but if the
transaction doesn't have the base snapshot then we can not stream that so
skip such transactions.
---
 src/backend/replication/logical/reorderbuffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 5cb484f..981619f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3388,7 +3388,8 @@ ReorderBufferLargestTopTXN(ReorderBuffer *rb)
 		txn = dlist_container(ReorderBufferTXN, node, iter.cur);
 
 		if ((largest != NULL || txn->total_size > largest_size) &&
-			(txn->total_size > 0) && !(rbtxn_has_incomplete_tuple(txn)))
+			(txn->base_snapshot != NULL) && (txn->total_size > 0) &&
+			!(rbtxn_has_incomplete_tuple(txn)))
 		{
 			largest = txn;
 			largest_size = txn->total_size;
-- 
1.8.3.1

From c3327f9eda4a880a960afb67ad2ad65b30fdbd32 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Tue, 27 Apr 2021 10:53:44 +0530
Subject: [PATCH v2 2/2] Fix thinko while selecting the largest transaction for
 streaming

---
 src/backend/replication/logical/reorderbuffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 981619f..3e53144 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3387,7 +3387,7 @@ ReorderBufferLargestTopTXN(ReorderBuffer *rb)
 
 		txn = dlist_container(ReorderBufferTXN, node, iter.cur);
 
-		if ((largest != NULL || txn->total_size > largest_size) &&
+		if ((largest == NULL || txn->total_size > largest_size) &&
 			(txn->base_snapshot != NULL) && (txn->total_size > 0) &&
 			!(rbtxn_has_incomplete_tuple(txn)))
 		{
-- 
1.8.3.1



Re: Replication slot stats misgivings

2021-04-26 Thread vignesh C
On Tue, Apr 27, 2021 at 9:48 AM vignesh C  wrote:
>
> On Tue, Apr 27, 2021 at 9:43 AM Amit Kapila  wrote:
> >
> > On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Tue, Apr 27, 2021 at 11:31 AM vignesh C  wrote:
> > > >
> > > > > > And I think there is
> > > > > > also a risk to increase shared memory when we want to add other
> > > > > > statistics in the future.
> > > > > >
> > > > >
> > > > > Yeah, so do you think it is not a good idea to store stats in
> > > > > ReplicationSlot? Actually storing them in a slot makes it easier to
> > > > > send them during ReplicationSlotRelease which is quite helpful if the
> > > > > replication is interrupted due to some reason. Or the other idea was
> > > > > that we send stats every time we stream or spill changes.
> > > >
> > > > We use around 64 bytes of shared memory to store the statistics
> > > > information per slot, I'm not sure if this is a lot of memory. If this
> > > > memory is fine, then I felt the approach to store stats seems fine. If
> > > > that memory is too much then we could use the other approach to update
> > > > stats when we stream or spill the changes as suggested by Amit.
> > >
> > > I agree that makes it easier to send slot stats during
> > > ReplicationSlotRelease() but I'd prefer to avoid storing data that
> > > doesn't need to be shared in the shared buffer if possible.
> > >
> >
> > Sounds reasonable and we might add some stats in the future so that
> > will further increase the usage of shared memory.
> >
> > > And those
> > > counters are not used by physical slots at all. If sending slot stats
> > > every time we stream or spill changes doesn't affect the system much,
> > > I think it's better than having slot stats in the shared memory.
> > >
> >
> > As the minimum size of logical_decoding_work_mem is 64KB, so in the
> > worst case, we will send stats after decoding that many changes. I
> > don't think it would impact too much considering that we need to spill
> > or stream those many changes.  If it concerns any users they can
> > always increase logical_decoding_work_mem. The default value is 64MB
> > at which point, I don't think it will matter sending the stats.
>
> Sounds good to me, I will rebase my previous patch and send a patch for this.
>

The attached patch updates the statistics during spill/stream, which
prevents them from being lost if decoding is interrupted.
Thoughts?
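The idea can be sketched in a standalone form (the names here are illustrative mocks; in the patch the flush is done by UpdateDecodingStats(), which sends the counters to the stats collector):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch: accumulate per-slot counters during decoding and flush them
 * each time changes are spilled or streamed, so an interrupted walsender
 * does not lose what was counted so far.
 */
typedef struct DemoSlotStats
{
	int64_t		spillBytes;		/* pending, not yet reported */
	int64_t		streamBytes;	/* pending, not yet reported */
	int64_t		flushedBytes;	/* stands in for what the collector has seen */
} DemoSlotStats;

static void
demo_flush_stats(DemoSlotStats *s)
{
	/* in the patch this is UpdateDecodingStats() */
	s->flushedBytes += s->spillBytes + s->streamBytes;
	s->spillBytes = 0;
	s->streamBytes = 0;
}

static void
demo_spill(DemoSlotStats *s, int64_t nbytes)
{
	s->spillBytes += nbytes;
	demo_flush_stats(s);		/* flush after every spill */
}
```

Since a spill only happens after at least logical_decoding_work_mem worth of changes, the flush frequency is bounded by that setting, as discussed above.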

Regards,
Vignesh
From a1fd6f76c955482efa7e63ea6c25ae6350160b4d Mon Sep 17 00:00:00 2001
From: vignesh 
Date: Tue, 27 Apr 2021 10:56:02 +0530
Subject: [PATCH v3] Update replication statistics after every stream/spill.

Currently, replication slot statistics are updated at prepare, commit, and
rollback. So, if decoding is interrupted before that point, the stats can
be lost. Fix this by updating the replication statistics after every
stream/spill.
---
 src/backend/replication/logical/decode.c| 6 +++---
 src/backend/replication/logical/logical.c   | 6 +++---
 src/backend/replication/logical/reorderbuffer.c | 3 +++
 src/include/replication/logical.h   | 2 +-
 4 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 7924581cdc..2c009d51ec 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -750,7 +750,7 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
 	 * not clear that sending more or less frequently than this would be
 	 * better.
 	 */
-	UpdateDecodingStats(ctx);
+	UpdateDecodingStats(ctx->reorder);
 }
 
 /*
@@ -832,7 +832,7 @@ DecodePrepare(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
 	 * not clear that sending more or less frequently than this would be
 	 * better.
 	 */
-	UpdateDecodingStats(ctx);
+	UpdateDecodingStats(ctx->reorder);
 }
 
 
@@ -889,7 +889,7 @@ DecodeAbort(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
 	}
 
 	/* update the decoding stats */
-	UpdateDecodingStats(ctx);
+	UpdateDecodingStats(ctx->reorder);
 }
 
 /*
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index 00543ede45..0c3ef0f93f 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -1770,10 +1770,10 @@ ResetLogicalStreamingState(void)
  * Report stats for a slot.
  */
 void
-UpdateDecodingStats(LogicalDecodingContext *ctx)
+UpdateDecodingStats(ReorderBuffer *rb)
 {
-	ReorderBuffer *rb = ctx->reorder;
 	PgStat_StatReplSlotEntry repSlotStat;
+	ReplicationSlot *slot = MyReplicationSlot;
 
 	/* Nothing to do if we don't have any replication stats to be sent. */
 	if (rb->spillBytes <= 0 && rb->streamBytes <= 0 && rb->totalBytes <= 0)
@@ -1790,7 +1790,7 @@ UpdateDecodingStats(LogicalDecodingContext *ctx)
 		 (long long) rb->totalTxns,
 		 (long long) rb->totalBytes);
 
-	namestrcpy(&repSlotStat.slotname, NameStr(ctx->slot->data.name));
+	namestr

Skip temporary table schema name from explain-verbose output.

2021-04-26 Thread Amul Sul
Hi,

Temporary tables usually get a unique schema name, see this:

postgres=# create temp table foo(i int);
CREATE TABLE
postgres=# explain verbose select * from foo;
                           QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on pg_temp_3.foo  (cost=0.00..35.50 rows=2550 width=4)
   Output: i
(2 rows)

The problem is that explain-verbose regression test output becomes
unstable when several concurrently running tests operate on temporary
tables.

I was wondering: can we simply omit the temporary schema name from the
explain-verbose output, or print the generic "pg_temp" schema name instead?

Thoughts/Suggestions?
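For the second option, one possible direction is a small helper that maps any backend-specific pg_temp_N schema to the stable name "pg_temp" before printing. This is only a hypothetical sketch (stable_schema_name is not an existing function; inside the server one would presumably test the namespace OID rather than the name):

```c
#include <string.h>

/*
 * Sketch: map any per-backend temp schema name (pg_temp_1, pg_temp_2, ...)
 * to the stable alias "pg_temp", so EXPLAIN VERBOSE output does not depend
 * on which backend created the temp table.
 */
static const char *
stable_schema_name(const char *nspname)
{
	if (strncmp(nspname, "pg_temp_", 8) == 0)
		return "pg_temp";
	return nspname;
}
```

With something like this, the plan line would read "Seq Scan on pg_temp.foo" regardless of the backend, which would keep regression output stable.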

Regards,
Amul




Re: [PATCH] Re: pg_identify_object_as_address() doesn't support pg_event_trigger oids

2021-04-26 Thread Joel Jacobson
On Mon, Apr 26, 2021, at 10:30, Michael Paquier wrote:
> On Fri, Apr 23, 2021 at 09:33:36AM +0200, Joel Jacobson wrote:
> > Also, since this is a problem also in v13 maybe this should also be
> > back-ported?  I think it's a bug since both
> > pg_identify_object_as_address() and event triggers exists in v13, so
> > the function should work there as well, otherwise users would need
> > to do work-arounds for event triggers.  
> 
> No objections from here to do something in back-branches.  We cannot
> have a test for event triggers in object_address.sql and it would be
> better to keep it in a parallel set (see 676858b for example).  Could
> you however add a small test for that in event_trigger.sql?  It would
> be good to check after all three functions pg_identify_object(),
> pg_identify_object_as_address() and pg_get_object_address().
> --
> Michael

Thanks for the guidance in how to test.

I've added a test at the end of event_trigger.sql,
reusing the three event triggers already in existence,
just before they are dropped.

New patch attached.

/Joel

fix_event_trigger_pg_identify_object_as_address2.patch
Description: Binary data


Re: Replication slot stats misgivings

2021-04-26 Thread Masahiko Sawada
On Tue, Apr 27, 2021 at 1:18 PM vignesh C  wrote:
>
> On Tue, Apr 27, 2021 at 9:43 AM Amit Kapila  wrote:
> >
> > On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Tue, Apr 27, 2021 at 11:31 AM vignesh C  wrote:
> > > >
> > > > > > And I think there is
> > > > > > also a risk to increase shared memory when we want to add other
> > > > > > statistics in the future.
> > > > > >
> > > > >
> > > > > Yeah, so do you think it is not a good idea to store stats in
> > > > > ReplicationSlot? Actually storing them in a slot makes it easier to
> > > > > send them during ReplicationSlotRelease which is quite helpful if the
> > > > > replication is interrupted due to some reason. Or the other idea was
> > > > > that we send stats every time we stream or spill changes.
> > > >
> > > > We use around 64 bytes of shared memory to store the statistics
> > > > information per slot, I'm not sure if this is a lot of memory. If this
> > > > memory is fine, then I felt the approach to store stats seems fine. If
> > > > that memory is too much then we could use the other approach to update
> > > > stats when we stream or spill the changes as suggested by Amit.
> > >
> > > I agree that makes it easier to send slot stats during
> > > ReplicationSlotRelease() but I'd prefer to avoid storing data that
> > > doesn't need to be shared in the shared buffer if possible.
> > >
> >
> > Sounds reasonable and we might add some stats in the future so that
> > will further increase the usage of shared memory.
> >
> > > And those
> > > counters are not used by physical slots at all. If sending slot stats
> > > every time we stream or spill changes doesn't affect the system much,
> > > I think it's better than having slot stats in the shared memory.
> > >
> >
> > As the minimum size of logical_decoding_work_mem is 64KB, so in the
> > worst case, we will send stats after decoding that many changes. I
> > don't think it would impact too much considering that we need to spill
> > or stream those many changes.  If it concerns any users they can
> > always increase logical_decoding_work_mem. The default value is 64MB
> > at which point, I don't think it will matter sending the stats.
>
> Sounds good to me, I will rebase my previous patch and send a patch for this.

+1. Thanks!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Replication slot stats misgivings

2021-04-26 Thread vignesh C
On Tue, Apr 27, 2021 at 9:43 AM Amit Kapila  wrote:
>
> On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada  wrote:
> >
> > On Tue, Apr 27, 2021 at 11:31 AM vignesh C  wrote:
> > >
> > > > > And I think there is
> > > > > also a risk to increase shared memory when we want to add other
> > > > > statistics in the future.
> > > > >
> > > >
> > > > Yeah, so do you think it is not a good idea to store stats in
> > > > ReplicationSlot? Actually storing them in a slot makes it easier to
> > > > send them during ReplicationSlotRelease which is quite helpful if the
> > > > replication is interrupted due to some reason. Or the other idea was
> > > > that we send stats every time we stream or spill changes.
> > >
> > > We use around 64 bytes of shared memory to store the statistics
> > > information per slot, I'm not sure if this is a lot of memory. If this
> > > memory is fine, then I felt the approach to store stats seems fine. If
> > > that memory is too much then we could use the other approach to update
> > > stats when we stream or spill the changes as suggested by Amit.
> >
> > I agree that makes it easier to send slot stats during
> > ReplicationSlotRelease() but I'd prefer to avoid storing data that
> > doesn't need to be shared in the shared buffer if possible.
> >
>
> Sounds reasonable and we might add some stats in the future so that
> will further increase the usage of shared memory.
>
> > And those
> > counters are not used by physical slots at all. If sending slot stats
> > every time we stream or spill changes doesn't affect the system much,
> > I think it's better than having slot stats in the shared memory.
> >
>
> As the minimum size of logical_decoding_work_mem is 64KB, so in the
> worst case, we will send stats after decoding that many changes. I
> don't think it would impact too much considering that we need to spill
> or stream those many changes.  If it concerns any users they can
> always increase logical_decoding_work_mem. The default value is 64MB
> at which point, I don't think it will matter sending the stats.

Sounds good to me, I will rebase my previous patch and send a patch for this.

Regards,
Vignesh




Re: Truncate in synchronous logical replication failed

2021-04-26 Thread Amit Kapila
On Mon, Apr 26, 2021 at 12:37 PM Japin Li  wrote:
>
> On Mon, 26 Apr 2021 at 12:49, Amit Kapila  wrote:
> > On Fri, Apr 23, 2021 at 7:18 PM osumi.takami...@fujitsu.com
> >  wrote:
> >>
> >
> > The latest patch looks good to me. I have made a minor modification
> > and added a commit message in the attached. I would like to once again
> > ask whether anybody else thinks we should backpatch this? Just a
> > summary for anybody not following this thread:
> >
> > This patch fixes the Logical Replication of Truncate in synchronous
> > commit mode. The Truncate operation acquires an exclusive lock on the
> > target relation and indexes and waits for logical replication of the
> > operation to finish at commit. Now because we are acquiring the shared
> > lock on the target index to get index attributes in pgoutput while
> > sending the changes for the Truncate operation, it leads to a
> > deadlock.
> >
> > Actually, we don't need to acquire a lock on the target index as we
> > build the cache entry using a historic snapshot and all the later
> > changes are absorbed while decoding WAL. So, we wrote a special
> > purpose function for logical replication to get a bitmap of replica
> > identity attribute numbers where we get that information without
> > locking the target index.
> >
> > We are planning not to backpatch this as there doesn't seem to be any
> > field complaint about this issue since it was introduced in commit
> > 5dfd1e5a in v11.
>
> +1 for apply only on HEAD.
>

Seeing no other suggestions, I have pushed this in HEAD only. Thanks!

-- 
With Regards,
Amit Kapila.




Re: Replication slot stats misgivings

2021-04-26 Thread Amit Kapila
On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada  wrote:
>
> On Tue, Apr 27, 2021 at 11:31 AM vignesh C  wrote:
> >
> > > > And I think there is
> > > > also a risk to increase shared memory when we want to add other
> > > > statistics in the future.
> > > >
> > >
> > > Yeah, so do you think it is not a good idea to store stats in
> > > ReplicationSlot? Actually storing them in a slot makes it easier to
> > > send them during ReplicationSlotRelease which is quite helpful if the
> > > replication is interrupted due to some reason. Or the other idea was
> > > that we send stats every time we stream or spill changes.
> >
> > We use around 64 bytes of shared memory to store the statistics
> > information per slot, I'm not sure if this is a lot of memory. If this
> > memory is fine, then I felt the approach to store stats seems fine. If
> > that memory is too much then we could use the other approach to update
> > stats when we stream or spill the changes as suggested by Amit.
>
> I agree that makes it easier to send slot stats during
> ReplicationSlotRelease() but I'd prefer to avoid storing data that
> doesn't need to be shared in the shared buffer if possible.
>

Sounds reasonable and we might add some stats in the future so that
will further increase the usage of shared memory.

> And those
> counters are not used by physical slots at all. If sending slot stats
> every time we stream or spill changes doesn't affect the system much,
> I think it's better than having slot stats in the shared memory.
>

As the minimum size of logical_decoding_work_mem is 64KB, so in the
worst case, we will send stats after decoding that many changes. I
don't think it would impact too much considering that we need to spill
or stream those many changes.  If it concerns any users they can
always increase logical_decoding_work_mem. The default value is 64MB
at which point, I don't think it will matter sending the stats.

> Also, not sure it’s better but another idea would be to make the slot
> stats a global variable like pgBufferUsage and use it during decoding.
>

Hmm, I think it is better to avoid global variables if possible.

> Or we can set a proc-exit callback? But to be honest, I'm not sure
> which approach we should go with. Those approaches have pros and cons.
>

I think we can try the first approach listed here which is to send
stats each time we spill or stream.

-- 
With Regards,
Amit Kapila.




Re: [bug?] Missed parallel safety checks, and wrong parallel safety

2021-04-26 Thread Greg Nancarrow
On Sat, Apr 24, 2021 at 12:53 PM Amit Kapila  wrote:
>
> On Fri, Apr 23, 2021 at 6:45 PM Tom Lane  wrote:
> >
> > Greg Nancarrow  writes:
> > > I'm curious. The FmgrBuiltin struct includes the "strict" flag, so
> > > that would "lock down the value" of the strict flag, wouldn't it?
> >
> > It does, but that's much more directly a property of the function's
> > C code than parallel-safety is.
> >
>
> Isn't parallel safety also a property of the C code? I mean unless someone
> changes the built-in function code, changing that property would be
> dangerous. The other thing is even if a user is allowed to change one
> function's property, how will they know which other functions are
> called by that function and whether they are parallel-safe or not. For
> example, say if the user wants to change the parallel safe property of
> a built-in function brin_summarize_new_values, unless she changes its
> code and the functions called by it like brin_summarize_range, it
> would be dangerous. So, isn't it better to disallow changing parallel
> safety for built-in functions?
>
> Also, if the strict property of built-in functions is fixed
> internally, why we allow users to change it and is that of any help?
>

Yes, I'd like to know too.
I think it would make more sense to disallow changing properties like
strict/parallel-safety on built-in functions.
Also, with sufficient privileges, a built-in function can be
redefined, yet the original function (whose info is cached in
FmgrBuiltins[], from build-time) is always invoked, not the
newly-defined version.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: logical replication empty transactions

2021-04-26 Thread Ajin Cherian
On Mon, Apr 26, 2021 at 4:29 PM Peter Smith  wrote:

> The v4 patch applied cleanly.
>
> make check-world completed successfully.
>
> So this patch v4 looks LGTM, apart from the following 2 nitpick comments:
>
> ==
>
> 1. Suggest to add a blank line after the (void)txn; ?
>
> @@ -345,10 +345,29 @@ pgoutput_startup(LogicalDecodingContext *ctx,
> OutputPluginOptions *opt,
>  static void
>  pgoutput_begin_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
>  {
> + PGOutputData *data = (PGOutputData *) ctx->output_plugin_private;
> +
> + (void)txn; /* keep compiler quiet */
> + /*
> + * Don't send BEGIN message here. Instead, postpone it until the first
>
>

Fixed.

> ==
>
> 2. Unnecessary statement blocks?
>
> AFAIK those { } are not the usual PG code-style when there is only one
> statement, so I suggest removing them.
>
> Applies to 3 places:
>
> @@ -551,6 +576,12 @@ pgoutput_change(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
>   Assert(false);
>   }
>
> + /* output BEGIN if we haven't yet */
> + if (!data->sent_begin_txn && !in_streaming)
> + {
> + pgoutput_begin(ctx, txn);
> + }
>
> @@ -693,6 +724,12 @@ pgoutput_truncate(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
>
>   if (nrelids > 0)
>   {
> + /* output BEGIN if we haven't yet */
> + if (!data->sent_begin_txn && !in_streaming)
> + {
> + pgoutput_begin(ctx, txn);
> + }
>
> @@ -725,6 +762,12 @@ pgoutput_message(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
>   if (in_streaming)
>   xid = txn->xid;
>
> + /* output BEGIN if we haven't yet, avoid for streaming and
> non-transactional messages */
> + if (!data->sent_begin_txn && !in_streaming && transactional)
> + {
> + pgoutput_begin(ctx, txn);
> + }
>

Fixed.
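For reference, the deferred-BEGIN pattern in the patch can be sketched in a standalone form (names are illustrative mocks, not the actual pgoutput structures; message sending is reduced to a counter):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of skipping empty transactions: BEGIN is not sent when the
 * transaction starts; a flag records that it is pending, and the first
 * real change sends it.  An empty transaction therefore emits nothing.
 */
typedef struct DemoTxnState
{
	bool		sent_begin_txn;
	int			messages_sent;
} DemoTxnState;

static void
demo_begin_txn(DemoTxnState *st)
{
	st->sent_begin_txn = false;	/* postpone: just remember BEGIN is pending */
}

static void
demo_send_begin(DemoTxnState *st)
{
	st->messages_sent++;		/* stands in for the real BEGIN message */
	st->sent_begin_txn = true;
}

static void
demo_change(DemoTxnState *st)
{
	if (!st->sent_begin_txn)	/* output BEGIN if we haven't yet */
		demo_send_begin(st);
	st->messages_sent++;		/* the change itself */
}

static void
demo_commit_txn(DemoTxnState *st)
{
	if (st->sent_begin_txn)		/* skip COMMIT for empty transactions */
		st->messages_sent++;
}
```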

regards,
Ajin Cherian
Fujitsu Australia


v5-0001-Skip-empty-transactions-for-logical-replication.patch
Description: Binary data


Re: Replication slot stats misgivings

2021-04-26 Thread Masahiko Sawada
On Tue, Apr 27, 2021 at 11:31 AM vignesh C  wrote:
>
> On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila  wrote:
> >
> > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila  
> > > wrote:
> > > >
> > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C  wrote:
> > > > >
> > > > > I have made the changes to update the replication statistics at
> > > > > replication slot release. Please find the patch attached for the same.
> > > > > Thoughts?
> > > > >
> > > >
> > > > Thanks, the changes look mostly good to me. The slot stats need to be
> > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > > > StartupDecodingContext. Apart from that, I have moved the declaration
> > > > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > > > added/edited a few comments. Please check and let me know what do you
> > > > think of the attached?
> > >
> > > The patch moves slot stats to the ReplicationSlot data that is on the
> > > shared memory. If we have a space to store the statistics in the
> > > shared memory can we simply accumulate the stats there and make them
> > > persistent without using the stats collector?
> > >
> >
> > But for that, we need to write to file at every commit/abort/prepare
> > (decode of commit) which I think will incur significant overhead.
> > Also, if we try to write after a few commits, then there is a danger of
> > losing them, and there could still be a visible overhead for small
> > transactions.
> >
>
> I would prefer not to persist this information to file; let's have the
> stats collector handle persisting the stats.
>
> > > And I think there is
> > > also a risk to increase shared memory when we want to add other
> > > statistics in the future.
> > >
> >
> > Yeah, so do you think it is not a good idea to store stats in
> > ReplicationSlot? Actually storing them in a slot makes it easier to
> > send them during ReplicationSlotRelease which is quite helpful if the
> > replication is interrupted due to some reason. Or the other idea was
> > that we send stats every time we stream or spill changes.
>
> We use around 64 bytes of shared memory to store the statistics
> information per slot, I'm not sure if this is a lot of memory. If this
> memory is fine, then I felt the approach to store stats seems fine. If
> that memory is too much then we could use the other approach to update
> stats when we stream or spill the changes as suggested by Amit.

I agree that makes it easier to send slot stats during
ReplicationSlotRelease() but I'd prefer to avoid storing data that
doesn't need to be shared in the shared buffer if possible. And those
counters are not used by physical slots at all. If sending slot stats
every time we stream or spill changes doesn't affect the system much,
I think it's better than having slot stats in the shared memory.

Also, not sure it’s better but another idea would be to make the slot
stats a global variable like pgBufferUsage and use it during decoding.
Or we can set a proc-exit callback? But to be honest, I'm not sure
which approach we should go with. Those approaches have pros and cons.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Bharath Rupireddy
On Tue, Apr 27, 2021 at 8:12 AM Amit Kapila  wrote:
> On Tue, Apr 27, 2021 at 7:45 AM Greg Nancarrow  wrote:
> >
> > On Tue, Apr 27, 2021 at 10:51 AM Bharath Rupireddy
> >  wrote:
> > >
> > >
> > > I still wonder why we shouldn't limit the declarative approach to
> > > partitioned tables only? And for normal tables, possibly with a
> > > minimal cost(??), the server can do the safety checking. I know this
> > > feels a little inconsistent. In the planner we will have different
> > > paths like: if (partitioned_table) { check the parallel safety tag
> > > associated with the table } else { perform the parallel safety of the
> > > associated objects }.
> > >
> >
> > Personally I think the simplest and best approach is just do it
> > consistently, using the declarative approach across all table types.
> >
>
> Yeah, if we decide to go with a declarative approach then that sounds
> like a better approach.

Agree.

> > > Then, running the pg_get_parallel_safety will have some overhead if
> > > there are many partitions associated with a table. And, this is the
> > > overhead planner would have had to incur without the declarative
> > > approach which we are trying to avoid with this design.
> > >
> >
> > The big difference is that pg_get_parallel_safety() is intended to be
> > used during development, not as part of runtime parallel-safety checks
> > (which are avoided using the declarative approach).
> > So there is no runtime overhead associated with pg_get_parallel_safety().
> >
> > >
> > > I'm thinking that when users say ALTER TABLE partitioned_table SET
> > > PARALLEL TO 'safe';, we check all the partitions' and their associated
> > > objects' parallel safety? If all are parallel safe, then only we set
> > > partitioned_table as parallel safe. What should happen if the parallel
> > > safety of any of the associated objects/partitions changes after
> > > setting the partitioned_table safety?
> > >
> >
> > With the declarative approach, there is no parallel-safety checking on
> > either the CREATE/ALTER when the parallel-safety is declared/set.
> > It's up to the user to get it right. If it's actually wrong, it will
> > be detected at runtime.
> >
>
> OTOH, even if we want to verify at DDL time, we won't be able to
> maintain it at the later point of time say if user changed the
> parallel-safety of some function used in check constraint. I think the
> function pg_get_parallel_safety() will help the user to decide whether
> it can declare table parallel-safe. Now, it is quite possible that the
> user can later change the parallel-safe property of some function then
> that should be caught at runtime.

Yeah, this is an unavoidable problem with the declarative approach.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] logical decoding of two-phase transactions

2021-04-26 Thread vignesh C
On Wed, Apr 21, 2021 at 12:13 PM Peter Smith  wrote:
>
> On Tue, Apr 20, 2021 at 3:45 PM Peter Smith  wrote:
> >
> > Please find attached the latest patch set v73`*
> >
> > Differences from v72* are:
> >
> > * Rebased to HEAD @ today (required because v72-0001 no longer applied 
> > cleanly)
> >
> > * Minor documentation correction for protocol messages for Commit Prepared 
> > ('K')
> >
> > * Non-functional code tidy (mostly proto.c) to reduce overloading
> > different meanings to same member names for prepare/commit times.
>
>
> Please find attached a re-posting of patch set v73*

A few comments from having a look at the tests added:
1) Can the below:
+# check inserts are visible. 22 should be rolled back. 21 should be committed.
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(*)
FROM tab_full where a IN (21);");
+is($result, qq(1), 'Rows committed are on the subscriber');
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(*)
FROM tab_full where a IN (22);");
+is($result, qq(0), 'Rows rolled back are not on the subscriber');

be changed to:
$result = $node_subscriber->safe_psql('postgres', "SELECT a FROM
tab_full where a IN (21,22);");
is($result, qq(21), 'Rows committed are on the subscriber');

And the test count needs to be reduced to "use Test::More tests => 19"

2) We can change "tx" to "transaction":
+# check the tx state is prepared on subscriber(s)
+$result = $node_B->safe_psql('postgres', "SELECT count(*) FROM
pg_prepared_xacts;");
+is($result, qq(1), 'transaction is prepared on subscriber B');
+$result = $node_C->safe_psql('postgres', "SELECT count(*) FROM
pg_prepared_xacts;");
+is($result, qq(1), 'transaction is prepared on subscriber C');

3) There are a few more instances in the same file; those can also be
changed.

4) Can the below:
# check inserts are visible at subscriber(s).
# 22 should be rolled back.
# 21 should be committed.
$result = $node_B->safe_psql('postgres', "SELECT count(*) FROM
tab_full where a IN (21);");
is($result, qq(1), 'Rows committed are present on subscriber B');
$result = $node_B->safe_psql('postgres', "SELECT count(*) FROM
tab_full where a IN (22);");
is($result, qq(0), 'Rows rolled back are not present on subscriber B');
$result = $node_C->safe_psql('postgres', "SELECT count(*) FROM
tab_full where a IN (21);");
is($result, qq(1), 'Rows committed are present on subscriber C');
$result = $node_C->safe_psql('postgres', "SELECT count(*) FROM
tab_full where a IN (22);");
is($result, qq(0), 'Rows rolled back are not present on subscriber C');

be changed to:
$result = $node_B->safe_psql('postgres', "SELECT a FROM tab_full where
a IN (21,22);");
is($result, qq(21), 'Rows committed are on the subscriber');
$result = $node_C->safe_psql('postgres', "SELECT a FROM tab_full where
a IN (21,22);");
is($result, qq(21), 'Rows committed are on the subscriber');

And the test count needs to be reduced to "use Test::More tests => 27"

5) Should we change "Two phase commit" to "Two phase commit state":
+       /*
+        * Binary, streaming, and two_phase are only supported in v14 and
+        * higher
+        */
        if (pset.sversion >= 14)
            appendPQExpBuffer(&buf,
                              ", subbinary AS \"%s\"\n"
-                             ", substream AS \"%s\"\n",
+                             ", substream AS \"%s\"\n"
+                             ", subtwophasestate AS \"%s\"\n",
                              gettext_noop("Binary"),
-                             gettext_noop("Streaming"));
+                             gettext_noop("Streaming"),
+                             gettext_noop("Two phase commit"));

Regards,
Vignesh




Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Bharath Rupireddy
On Tue, Apr 27, 2021 at 7:45 AM Greg Nancarrow  wrote:
>
> On Tue, Apr 27, 2021 at 10:51 AM Bharath Rupireddy
>  wrote:
> >
> >
> > I still wonder why we shouldn't limit the declarative approach to
> > partitioned tables only? And for normal tables, possibly with a
> > minimal cost(??), the server can do the safety checking. I know this
> > feels a little inconsistent. In the planner we will have different
> > paths like: if (partitioned_table) { check the parallel safety tag
> > associated with the table } else { perform the parallel safety of the
> > associated objects }.
> >
>
> Personally I think the simplest and best approach is just do it
> consistently, using the declarative approach across all table types.

Agree.

> > Then, running the pg_get_parallel_safety will have some overhead if
> > there are many partitions associated with a table. And, this is the
> > overhead planner would have had to incur without the declarative
> > approach which we are trying to avoid with this design.
> >
>
> The big difference is that pg_get_parallel_safety() is intended to be
> used during development, not as part of runtime parallel-safety checks
> (which are avoided using the declarative approach).
> So there is no runtime overhead associated with pg_get_parallel_safety().

Yes, while we avoid the runtime overhead, we run the risk of the parallel
safety of the underlying objects/functions/partitions changing later.
This risk will anyway be unavoidable with the declarative approach.

> > I'm thinking that when users say ALTER TABLE partitioned_table SET
> > PARALLEL TO 'safe';, we check all the partitions' and their associated
> > objects' parallel safety? If all are parallel safe, then only we set
> > partitioned_table as parallel safe. What should happen if the parallel
> > safety of any of the associated objects/partitions changes after
> > setting the partitioned_table safety?
> >
>
> With the declarative approach, there is no parallel-safety checking on
> either the CREATE/ALTER when the parallel-safety is declared/set.
> It's up to the user to get it right. If it's actually wrong, it will
> be detected at runtime.

As I said upthread, we can provide the parallel safety check of
associated objects/partitions as an option with default as false. I'm
not sure if this is a good thing to do at all. Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Bharath Rupireddy
On Tue, Apr 27, 2021 at 7:39 AM houzj.f...@fujitsu.com
 wrote:
> > I'm thinking that when users say ALTER TABLE partitioned_table SET PARALLEL
> > TO 'safe';, we check all the partitions' and their associated objects' 
> > parallel
> > safety? If all are parallel safe, then only we set partitioned_table as 
> > parallel safe.
> > What should happen if the parallel safety of any of the associated
> > objects/partitions changes after setting the partitioned_table safety?
>
> Currently, nothing happens if any of the associated objects/partitions 
> change after setting the partitioned_table safety.
> That is because we do not have a really cheap way to catch the change. The 
> existing relcache does not work because ALTER FUNCTION
> does not invalidate the relcache of the tables the function belongs to. And 
> it would bring some other overhead (locking, systable scans, ...)
> to find the tables the objects belong to.

Makes sense. Anyway, the user is responsible for such changes; otherwise
the executor can catch them at run time, and if not, users will see
unintended consequences.

> > My understanding was that: the command ALTER TABLE ... SET PARALLEL TO
> > 'safe' work will check the parallel safety of all the objects associated 
> > with the
> > table. If the objects are all parallel safe, then the table will be set to 
> > safe. If at
> > least one object is parallel unsafe or restricted, then the command will 
> > fail.
>
> I think this idea makes sense. Some detail of the designed can be improved.
> I agree with you that we can try to check check all the partitions' and their 
> associated objects' parallel safety when ALTER PARALLEL.
> Because it's a separate new command, add some overhead to it seems not too 
> bad.
> If there are no other objections, I plan to add safety check in the ALTER 
> PARALLEL command.

Maybe we can make the parallel safety check of the associated
objects/partitions optional for CREATE/ALTER DDLs, with the default
being no checks performed. Both Greg and Amit agree that we don't have
to perform any parallel safety checks during CREATE/ALTER DDLs.

> > also thinking that how will the design cope with situations such as the 
> > parallel
> > safety of any of the associated objects changing after setting the table to
> > parallel safe. The planner would have relied on the outdated parallel 
> > safety of
> > the table and chosen parallel inserts and the executor will catch such 
> > situations.
> > Looks like my understanding was wrong.
>
> Currently, we assume user is responsible for its correctness.
> Because, from our research, when the parallel safety of some of these objects 
> is changed,
> it's costly to reflect it on the parallel safety of tables that depend on 
> them.
> (we need to scan the pg_depend,pg_inherit,pg_index to find the target 
> table)

Agree.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Replication slot stats misgivings

2021-04-26 Thread Masahiko Sawada
On Tue, Apr 27, 2021 at 11:45 AM Amit Kapila  wrote:
>
> On Tue, Apr 27, 2021 at 8:01 AM vignesh C  wrote:
> >
> > On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila  wrote:
> > >
> > > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada  
> > > wrote:
> > > >
> > > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila  
> > > > wrote:
> > > > >
> > > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C  wrote:
> > > > > >
> > > > > > I have made the changes to update the replication statistics at
> > > > > > replication slot release. Please find the patch attached for the 
> > > > > > same.
> > > > > > Thoughts?
> > > > > >
> > > > >
> > > > > Thanks, the changes look mostly good to me. The slot stats need to be
> > > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > > > > StartupDecodingContext. Apart from that, I have moved the declaration
> > > > > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > > > > added/edited a few comments. Please check and let me know what do you
> > > > > think of the attached?
> > > >
> > > > The patch moves slot stats to the ReplicationSlot data that is on the
> > > > shared memory. If we have a space to store the statistics in the
> > > > shared memory can we simply accumulate the stats there and make them
> > > > persistent without using the stats collector?
> > > >
> > >
> > > But for that, we need to write to file at every commit/abort/prepare
> > > (decode of commit) which I think will incur significant overhead.
> > > Also, we try to write after few commits then there is a danger of
> > > losing them and still there could be a visible overhead for small
> > > transactions.
> > >
> >
> > I preferred not to persist this information to file, let's have stats
> > collector handle the stats persisting.
> >
>
> Sawada-San, I would like to go ahead with your
> "Use-HTAB-for-replication-slot-statistics" unless you think otherwise?

I agree that it's better to use the stats collector. So please go ahead.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Replication slot stats misgivings

2021-04-26 Thread Amit Kapila
On Tue, Apr 27, 2021 at 8:01 AM vignesh C  wrote:
>
> On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila  wrote:
> >
> > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila  
> > > wrote:
> > > >
> > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C  wrote:
> > > > >
> > > > > I have made the changes to update the replication statistics at
> > > > > replication slot release. Please find the patch attached for the same.
> > > > > Thoughts?
> > > > >
> > > >
> > > > Thanks, the changes look mostly good to me. The slot stats need to be
> > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > > > StartupDecodingContext. Apart from that, I have moved the declaration
> > > > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > > > added/edited a few comments. Please check and let me know what do you
> > > > think of the attached?
> > >
> > > The patch moves slot stats to the ReplicationSlot data that is on the
> > > shared memory. If we have a space to store the statistics in the
> > > shared memory can we simply accumulate the stats there and make them
> > > persistent without using the stats collector?
> > >
> >
> > But for that, we need to write to file at every commit/abort/prepare
> > (decode of commit) which I think will incur significant overhead.
> > Also, we try to write after few commits then there is a danger of
> > losing them and still there could be a visible overhead for small
> > transactions.
> >
>
> I preferred not to persist this information to file, let's have stats
> collector handle the stats persisting.
>

Sawada-San, I would like to go ahead with your
"Use-HTAB-for-replication-slot-statistics" unless you think otherwise?

-- 
With Regards,
Amit Kapila.




Re: SQL-standard function body

2021-04-26 Thread Justin Pryzby
On Thu, Apr 22, 2021 at 04:04:18PM -0400, Jeff Janes wrote:
> This commit break line continuation prompts for unbalanced parentheses in
> the psql binary.  Skimming through this thread, I don't see that this is
> intentional or has been noticed before.
> 
> with psql -X
> 
> Before:
> 
> jjanes=# asdf (
> jjanes(#
> 
> Now:
> 
> jjanes=# asdf (
> jjanes-#
> 
> I've looked through the parts of the commit that change psql, but didn't
> see an obvious culprit.

I haven't studied it in detail, but it probably needs something like this.

diff --git a/src/fe_utils/psqlscan.l b/src/fe_utils/psqlscan.l
index 991b7de0b5..0fab48a382 100644
--- a/src/fe_utils/psqlscan.l
+++ b/src/fe_utils/psqlscan.l
@@ -1098,23 +1098,23 @@ psql_scan(PsqlScanState state,
{
case LEXRES_EOL:/* end of input */
switch (state->start_state)
{
case INITIAL:
case xqs:   /* we treat this like INITIAL */
if (state->paren_depth > 0)
{
result = PSCAN_INCOMPLETE;
*prompt = PROMPT_PAREN;
}
-   if (state->begin_depth > 0)
+   else if (state->begin_depth > 0)
{
result = PSCAN_INCOMPLETE;
*prompt = PROMPT_CONTINUE;
}
else if (query_buf->len > 0)
{
result = PSCAN_EOL;
*prompt = PROMPT_CONTINUE;
}
else
{




Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Amit Kapila
On Tue, Apr 27, 2021 at 7:45 AM Greg Nancarrow  wrote:
>
> On Tue, Apr 27, 2021 at 10:51 AM Bharath Rupireddy
>  wrote:
> >
> >
> > I still feel that why we shouldn't limit the declarative approach to
> > only partitioned tables? And for normal tables, possibly with a
> > minimal cost(??), the server can do the safety checking. I know this
> > feels a little inconsistent. In the planner we will have different
> > paths like: if (partitioned_table) { check the parallel safety tag
> > associated with the table } else { perform the parallel safety of the
> > associated objects }.
> >
>
> Personally I think the simplest and best approach is just do it
> consistently, using the declarative approach across all table types.
>

Yeah, if we decide to go with a declarative approach then that sounds
like a better approach.

> >
> > Then, running the pg_get_parallel_safety will have some overhead if
> > there are many partitions associated with a table. And, this is the
> > overhead planner would have had to incur without the declarative
> > approach which we are trying to avoid with this design.
> >
>
> The big difference is that pg_get_parallel_safety() is intended to be
> used during development, not as part of runtime parallel-safety checks
> (which are avoided using the declarative approach).
> So there is no runtime overhead associated with pg_get_parallel_safety().
>
> >
> > I'm thinking that when users say ALTER TABLE partioned_table SET
> > PARALLEL TO 'safe';, we check all the partitions' and their associated
> > objects' parallel safety? If all are parallel safe, then only we set
> > partitioned_table as parallel safe. What should happen if the parallel
> > safety of any of the associated objects/partitions changes after
> > setting the partitioned_table safety?
> >
>
> With the declarative approach, there is no parallel-safety checking on
> either the CREATE/ALTER when the parallel-safety is declared/set.
> It's up to the user to get it right. If it's actually wrong, it will
> be detected at runtime.
>

OTOH, even if we verify at DDL time, we won't be able to maintain it at
a later point in time, say, if the user changes the parallel safety of
some function used in a check constraint. I think the function
pg_get_parallel_safety() will help the user decide whether it can
declare a table parallel-safe. Now, it is quite possible that the user
later changes the parallel-safe property of some function; that should
be caught at runtime.

-- 
With Regards,
Amit Kapila.




Re: Replication slot stats misgivings

2021-04-26 Thread vignesh C
On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila  wrote:
>
> On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada  wrote:
> >
> > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila  wrote:
> > >
> > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C  wrote:
> > > >
> > > > I have made the changes to update the replication statistics at
> > > > replication slot release. Please find the patch attached for the same.
> > > > Thoughts?
> > > >
> > >
> > > Thanks, the changes look mostly good to me. The slot stats need to be
> > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > > StartupDecodingContext. Apart from that, I have moved the declaration
> > > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > > added/edited a few comments. Please check and let me know what do you
> > > think of the attached?
> >
> > The patch moves slot stats to the ReplicationSlot data that is on the
> > shared memory. If we have a space to store the statistics in the
> > shared memory can we simply accumulate the stats there and make them
> > persistent without using the stats collector?
> >
>
> But for that, we need to write to file at every commit/abort/prepare
> (decode of commit) which I think will incur significant overhead.
> Also, we try to write after few commits then there is a danger of
> losing them and still there could be a visible overhead for small
> transactions.
>

I preferred not to persist this information to file, let's have stats
collector handle the stats persisting.

> > And I think there is
> > also a risk to increase shared memory when we want to add other
> > statistics in the future.
> >
>
> Yeah, so do you think it is not a good idea to store stats in
> ReplicationSlot? Actually storing them in a slot makes it easier to
> send them during ReplicationSlotRelease which is quite helpful if the
> replication is interrupted due to some reason. Or the other idea was
> that we send stats every time we stream or spill changes.

We use around 64 bytes of shared memory to store the statistics
information per slot; I'm not sure if this is a lot of memory. If this
memory is fine, then I feel the approach of storing stats in the slot
seems fine. If that memory is too much, then we could use the other
approach of updating stats when we stream or spill the changes, as
suggested by Amit.

Regards,
Vignesh




Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Greg Nancarrow
On Tue, Apr 27, 2021 at 10:51 AM Bharath Rupireddy
 wrote:
>
>
> I still feel that why we shouldn't limit the declarative approach to
> only partitioned tables? And for normal tables, possibly with a
> minimal cost(??), the server can do the safety checking. I know this
> feels a little inconsistent. In the planner we will have different
> paths like: if (partitioned_table) { check the parallel safety tag
> associated with the table } else { perform the parallel safety of the
> associated objects }.
>

Personally I think the simplest and best approach is just do it
consistently, using the declarative approach across all table types.

>
> Then, running the pg_get_parallel_safety will have some overhead if
> there are many partitions associated with a table. And, this is the
> overhead planner would have had to incur without the declarative
> approach which we are trying to avoid with this design.
>

The big difference is that pg_get_parallel_safety() is intended to be
used during development, not as part of runtime parallel-safety checks
(which are avoided using the declarative approach).
So there is no runtime overhead associated with pg_get_parallel_safety().

>
> I'm thinking that when users say ALTER TABLE partioned_table SET
> PARALLEL TO 'safe';, we check all the partitions' and their associated
> objects' parallel safety? If all are parallel safe, then only we set
> partitioned_table as parallel safe. What should happen if the parallel
> safety of any of the associated objects/partitions changes after
> setting the partitioned_table safety?
>

With the declarative approach, there is no parallel-safety checking on
either the CREATE/ALTER when the parallel-safety is declared/set.
It's up to the user to get it right. If it's actually wrong, it will
be detected at runtime.

Regards,
Greg Nancarrow
Fujitsu Australia




RE: Parallel INSERT SELECT take 2

2021-04-26 Thread houzj.f...@fujitsu.com
> > Currently, index expression and predicate are stored in text format.
> > We need to use stringToNode(expression/predicate) to parse it.
> > Some committers think doing this twice does not look good, unless we
> > found some ways to pass parsed info to the executor to avoid the second
> parse.
> 
> How much is the extra cost that's added if we do stringToNode twiice?
> Maybe, we can check for a few sample test cases by just having
> stringToNode(expression/predicate) in the planner and see if it adds much to
> the total execution time of the query.


OK, I will do some tests on it.

> > Yes, because both parent table and child table will be inserted and
> > the parallel related objects on them will be executed. If users want
> > to make sure the parallel insert succeed, they need to check all the 
> > objects.
> 
> Then, running the pg_get_parallel_safety will have some overhead if there are
> many partitions associated with a table. And, this is the overhead planner
> would have had to incur without the declarative approach which we are trying
> to avoid with this design.

Yes, I think putting such overhead in a separate function is better than
in a common path (the planner).

> > Foreign table itself is considered as parallel restricted, because we
> > do not support parallel insert fdw api for now.
> 
> Maybe, the ALTER TABLE ... SET PARALLEL on a foreign table should default to
> parallel restricted always and emit a warning saying the reason?

Thanks for the comment, I agree.
I will change this in the next version of the patches.

> > But the ALTER PARALLEL command itself does not check all the partition's
> safety flag.
> > The function pg_get_parallel_safety can return all the partition's not
> > safe objects, user should set the parallel safety based the function's 
> > result.
> 
> I'm thinking that when users say ALTER TABLE partioned_table SET PARALLEL
> TO 'safe';, we check all the partitions' and their associated objects' 
> parallel
> safety? If all are parallel safe, then only we set partitioned_table as 
> parallel safe.
> What should happen if the parallel safety of any of the associated
> objects/partitions changes after setting the partitioned_table safety?

Currently, nothing happens if any of the associated objects/partitions
changes after setting the partitioned table's safety, because we do not
have a really cheap way to catch the change. The existing relcache does
not work because ALTER FUNCTION does not invalidate the relcache of the
tables the function belongs to. And it would bring some other overhead
(locking, systable scans, ...) to find the tables the objects belong to.

> My understanding was that: the command ALTER TABLE ... SET PARALLEL TO
> 'safe' work will check the parallel safety of all the objects associated with 
> the
> table. If the objects are all parallel safe, then the table will be set to 
> safe. If at
> least one object is parallel unsafe or restricted, then the command will fail.

I think this idea makes sense, though some details of the design can be
improved. I agree with you that we can try to check all the partitions'
and their associated objects' parallel safety in ALTER PARALLEL. Because
it's a separate new command, adding some overhead to it seems not too
bad. If there are no other objections, I plan to add the safety check to
the ALTER PARALLEL command.

> also thinking that how will the design cope with situations such as the 
> parallel
> safety of any of the associated objects changing after setting the table to
> parallel safe. The planner would have relied on the outdated parallel safety 
> of
> the table and chosen parallel inserts and the executor will catch such 
> situations.
> Looks like my understanding was wrong.

Currently, we assume the user is responsible for its correctness,
because, from our research, when the parallel safety of some of these
objects is changed, it's costly to reflect that in the parallel safety
of the tables that depend on them.
(We need to scan pg_depend, pg_inherit, and pg_index to find the target
table.)

> So, the ALTER TABLE ... SET PARALLEL TO command just sets the target table
> safety, doesn't bother what the associated objects' safety is.
> It just believes the user. If at all there are any parallel unsafe objects it 
> will be
> caught by the executor. Just like, setting parallel safety of the
> functions/aggregates, the docs caution users about accidentally/intentionally
> tagging parallel unsafe functions/aggregates as parallel safe.

Yes, thanks for looking into this.

Best regards,
houzj


Re: Addition of authenticated ID to pg_stat_activity

2021-04-26 Thread Michael Paquier
On Tue, Apr 27, 2021 at 09:26:11AM +0800, Julien Rouhaud wrote:
> -1.  It's already annoying enough to have to type "WHERE pid !=
> pg_backend_pid()" to exclude my own backend, and I usually need it quite 
> often.
> Unless we add some new view which integrate that, something like
> pg_stat_activity_except_me with a better name.  I also don't see how we could
> join a new dedicated view with the old one without that information.

Err, sorry for the confusion.  What I meant here is to keep the PID in
pg_stat_activity, but also add it to this new view to be able to join
things across the board.
--
Michael


signature.asc
Description: PGP signature


Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-26 Thread Masahiko Sawada
On Mon, Apr 26, 2021 at 10:31 PM Tomas Vondra
 wrote:
>
> Hi,
>
> I took a look at this today, as I committed 39b66a91b back in January. I
> can reproduce the issue, with just 1M rows the before/after timings are
> roughly 480ms and 620ms on my hardware.
>
> Unfortunately, the v3 patch does not really fix the issue for me. The
> timing with it applied is ~610ms so the improvement is only minimal.

Since reading the vmbuffer is likely to hit shared buffers while
inserting frozen tuples, I think the improvement would not be visible
with a few million tuples, depending on the hardware. But it might not
be as fast as before commit 39b66a91b, since we read the vmbuffer at
least once per insertion.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Addition of authenticated ID to pg_stat_activity

2021-04-26 Thread Julien Rouhaud
On Tue, Apr 27, 2021 at 09:59:18AM +0900, Michael Paquier wrote:
> 
> I am wondering if we should take this as an occasion to move some data
> out of pg_stat_activity into a separate biew, dedicated to the data
> related to the connection that remains set to the same value for the
> duration of a backend's life, as of the following set:
> - the backend PID

-1.  It's already annoying enough to have to type "WHERE pid !=
pg_backend_pid()" to exclude my own backend, and I usually need it quite often.
Unless we add some new view which integrate that, something like
pg_stat_activity_except_me with a better name.  I also don't see how we could
join a new dedicated view with the old one without that information.

> - application_name?  (well, this one could change on reload, so I am
> lying).

No, it can change at any time.  And the fact that it's not transactional
makes it quite convenient for poor man's progress reporting.  For
instance, in powa I use it to report what the bgworker is currently
working on, and this has already proven to be useful.




Re: Transactions involving multiple postgres foreign servers, take 2

2021-04-26 Thread Masahiro Ikeda



On 2021/03/17 12:03, Masahiko Sawada wrote:
> I've attached the updated version patch set.

Thanks for updating the patches! I'm now restarting the review of 2PC
because I'd like to use this feature in PG15.


I think the following logic, by which the transaction resolver resolves
and removes the fdwxact entries, needs to be fixed.

1. check if pending fdwxact entries exist

HoldInDoubtFdwXacts() checks if there are entries whose backend is
InvalidBackendId and so on, and then collects the indexes of those
entries in the fdwxacts array. The fdwXactLock is released at the end of
this phase.

2. resolve and remove the entries held in the 1st phase

ResolveFdwXacts() resolves the status of each fdwxact entry using the
indexes. At the end of resolving, the transaction resolver removes the
entry from the fdwxacts array via remove_fdwact().

The entry is removed as follows. Since this compacts the array, the
indexes obtained in the 1st phase are no longer meaningful.

/* Remove the entry from active array */
FdwXactCtl->num_fdwxacts--;
FdwXactCtl->fdwxacts[i] = FdwXactCtl->fdwxacts[FdwXactCtl->num_fdwxacts];

This seems to lead to resolving unexpected fdwxacts, and it can cause
the following assertion failure, which is how I noticed. For example,
there is a case where a backend inserts a new fdwxact entry into the
free space from which the resolver removed an entry just before, and the
resolver then accesses the new entry, which doesn't need to be resolved
yet, because it uses the indexes collected in the 1st phase.

Assert(fdwxact->locking_backend == MyBackendId);



The simple solution is to hold fdwXactLock exclusively all the way from
the beginning of the 1st phase to the end of the 2nd phase. But I worry
that the performance impact would be too big...

I came up with two solutions although there may be better solutions.

A. remove resolved entries at once, after resolution of all held entries
is finished

If so, we don't need to take the exclusive lock for a long time. But
this has other problems: pg_remove_foreign_xact() can still remove
entries, and we need to handle failures of resolution.

I wondered if we can solve the first problem by introducing a new lock,
like a "removing lock", so that only the processes holding that lock can
remove entries. The performance impact is limited since the insertion of
fdwxact entries is not blocked by this lock. And the second problem can
be solved using a try-catch block.


B. merge the 1st and 2nd phases

Now, the resolver resolves the entries together; that's the reason why
it's difficult to remove them. So it seems the problem can be solved by
executing checking, resolving, and removing per entry. I think this is
better since it's simpler than A. If I'm missing something, please let
me know.


Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION




Re: Addition of authenticated ID to pg_stat_activity

2021-04-26 Thread Michael Paquier
On Mon, Apr 26, 2021 at 03:21:46PM -0400, Stephen Frost wrote:
> * Andres Freund (and...@anarazel.de) wrote:
>> I'm getting a bit worried about the incremental increase in
>> pg_stat_activity width - it's probably by far the view that's most
>> viewed interactively. I think we need to be careful not to add too niche
>> things to it. This is especially true for columns that may be wider.
>> 
>> It'd be bad for discoverability, but perhaps something like this, that's
>> not that likely to be used interactively, would be better done as a
>> separate function that would need to be used explicitly?
> 
> I mean.. we already have separate functions and views for this, though
> they're auth-method-specific currently, but also provide more details,
> since it isn't actually a "one size fits all" kind of thing like this
> entire approach is imagining it to be.

Referring to pg_stat_ssl and pg_stat_gssapi here, right?  Yes, that
would be very limited, as it gives no visibility into LDAP, all
password-based authentication methods, and more.

I am wondering if we should take this as an occasion to move some data
out of pg_stat_activity into a separate view, dedicated to the data
related to the connection that remains set to the same value for the
duration of a backend's life, as of the following set:
- the backend PID
- client_addr
- client_hostname
- client_port
- authenticated ID
- application_name?  (well, this one could change on reload, so I am
lying).

It would be tempting to move the database name and the username, but
these are popular fields for monitoring.  Maybe we could name that
pg_stat_connection_status, pg_stat_auth_status or just
pg_stat_connection?
--
Michael


signature.asc
Description: PGP signature


Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Bharath Rupireddy
On Mon, Apr 26, 2021 at 9:10 PM Bharath Rupireddy
 wrote:
>
> I found another problem with collationcmds.c is that it doesn't error
> out if some of the options are specified more than once, something
> like below. I think the option checking "for loop" in DefineCollation
> needs to be reworked.
> CREATE COLLATION case_insensitive (provider = icu, provider =
> someother locale = '@colStrength=secondary', deterministic = false,
> deterministic = true);

I'm thinking that the above problem should be discussed separately. I
will start a new thread soon on this.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Bharath Rupireddy
On Mon, Apr 26, 2021 at 9:24 PM Alvaro Herrera  wrote:
>
> On 2021-Apr-26, Bharath Rupireddy wrote:
>
> > I agree that we can just be clear about the problem. Looks like the
> > majority of the errors "conflicting or redundant options" are for
> > redundant options. So, wherever "conflicting or redundant options"
> > exists: 1) change the message to "option \"%s\" specified more than
> > once" and remove parser_errposition if it's there because the option
> > name in the error message would give the info with which user can
> > point to the location
>
> Hmm, I would keep the parser_errposition() even if the option name is
> mentioned in the error message.  There's no harm in being a little
> redundant, with both the option name and the error cursor showing the
> same thing.

Agreed.

> > 2) change the message to something like "option \"%s\" is conflicting
> > with option \"%s\"".
>
> Maybe, but since these would all be special cases, I think we'd need to
> discuss them individually.  I would suggest that in order not to stall
> this patch, these cases should all stay as "redundant or conflicting
> options" -- that is, avoid any further change apart from exactly the
> thing you came here to change.  You can submit a 0002 patch to change
> those other errors.  That way, even if those changes end up rejected for
> whatever reason, you still got your 0001 done (which would change the
> bulk of "conflicting or redundant" error to the "option %s already
> specified" error).  Some progress is better than none.

+1 to having all the conflicting-options error message changes in a 0002
patch, or I'm okay even if we discuss those changes after the 0001 patch
goes in.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Bharath Rupireddy
On Mon, Apr 26, 2021 at 4:56 PM houzj.f...@fujitsu.com
 wrote:
> > > The parallel attributes in table means the parallel safety when user does 
> > > some
> > data-modification operations on it.
> > > So, It only limit the use of parallel plan when using 
> > > INSERT/UPDATE/DELETE.
> >
> > In that case, isn't it better to use the terminology "PARALLEL DML
> > SAFE/UNSAFE/RESTRICTED" in the code and docs? This way, it will be clear 
> > that
> > these tags don't affect parallel selects.
>
> Makes sense, I recalled I have heart similar suggestion before.
> If there are no other objections, I plan to change command to
> PARALLEL DML in my next version patches.

+1 from me. Let's hear from others.

> > Will the planner parse and check parallel safety of index((where
> > clause) expressions in case of SELECTs? I'm not sure of this. But if it 
> > does, maybe
> > we could do the same thing for parallel DML as well for normal tables?
>
> The planner does not check the index expression, because the expression will 
> not be used in SELECT.
> I think the expression is only used when a tuple inserted into the index.

Oh.

> > the overhead of parsing index expressions? If the cost is heavy for checking
> > index expressions parallel safety in case of normal tables, then the current
> > design i.e. attributing parallel safety tag to all the tables makes sense.
>
> Currently, index expression and predicate are stored in text format.
> We need to use stringToNode(expression/predicate) to parse it.
> Some committers think doing this twice does not look good,
> unless we found some ways to pass parsed info to the executor to avoid the 
> second parse.

How much extra cost is added if we do stringToNode twice? Maybe we can
check a few sample test cases by just having
stringToNode(expression/predicate) in the planner and seeing if it adds
much to the total execution time of the query.

> > I was actually thinking that we will have the declarative approach only for
> > partitioned tables as it is the main problem we are trying to solve with 
> > this
> > design. Something like: users will run pg_get_parallel_safety to see the 
> > parallel
> > unsafe objects associated with a partitioned table by looking at all of its
> > partitions and be able to set a parallel dml safety tag to only partitioned 
> > tables.
>
> We originally want to make the declarative approach for both normal and 
> partitioned table.
> In this way, it will not bring any overhead to planner and looks consistency.
> But, do you think we should put some really cheap safety check to the planner 
> ?

I still feel: why shouldn't we limit the declarative approach to only
partitioned tables? And for normal tables, possibly with a minimal
cost(??), the server can do the safety checking. I know this feels a
little inconsistent. In the planner we would have different paths like:
if (partitioned_table) { check the parallel safety tag associated with
the table } else { perform the parallel safety checks of the associated
objects }.

Others may have better thoughts on this.

> > >0001: provide a utility function pg_get_parallel_safety('table_name').
> > >
> > >The function returns records of (objid, classid, parallel_safety) that
> > >represent the parallel safety of objects that determine the parallel 
> > >safety of
> > the specified table.
> > >Note: The function only outputs objects that are not parallel safe.
> >
> > If it returns only parallel "unsafe" objects and not "safe" or "restricted" 
> > objects,
> > how about naming it to pg_get_table_parallel_unsafe_objects("table_name")?
>
> Currently, the function returns both unsafe and restricted objects (I think 
> restricted is also not safe),
> because we thought users only care about the objects that affect the use of 
> a parallel plan.

Hm.

> > This way we could get rid of parallel_safety in the output record? If at 
> > all users
> > want to see parallel restricted or safe objects, we can also have the
> > counterparts pg_get_table_parallel_safe_objects and
> > pg_get_table_parallel_restricted_objects. Of course, we can caution the user
> > that execution of these functions might take longer and "might consume
> > excessive memory while accumulating the output".
> >
> > Otherwise, we can have a single function pg_get_parallel_safety("table_name"
> > IN, "parallel_safety" IN, "objid"
> > OUT, "classid" OUT)? If required, we could name it
> > pg_get_parallel_safety_of_table_objects.
> >
> > Thoughts?
>
> I am not sure users will want to get the safe objects; do you have some use cases?
> If users only care about objects of a particular safety level, I think they can use
> "SELECT * FROM pg_get_parallel_safety() WHERE parallel_safety = 'specified 
> safety'" to get the specified objects.

I don't know any practical scenarios, but if I were a user, at least I
would be tempted to see the parallel safe objects associated with a
particular table along with the unsafe and restricted ones. Others may
have better thoughts on this.

> > Although

Bug fix for tab completion of ALTER TABLE ... VALIDATE CONSTRAINT ...

2021-04-26 Thread David Fetter
Folks,

I noticed that $subject completes with already valid constraints,
please find attached a patch that fixes it. I noticed that there are
other places constraints can be validated, but didn't check whether
similar bugs exist there yet.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
>From e191928111a7500c3a3bc84558797fe86451c24b Mon Sep 17 00:00:00 2001
From: David Fetter 
Date: Mon, 26 Apr 2021 17:17:15 -0700
Subject: [PATCH v1] tab-completion of ALTER TABLE...VALIDATE CONSTRAINT
To: hackers

...now chooses only among constraints that aren't already valid.

diff --git src/bin/psql/tab-complete.c src/bin/psql/tab-complete.c
index 7c493b..2252b7d3ac 100644
--- src/bin/psql/tab-complete.c
+++ src/bin/psql/tab-complete.c
@@ -785,6 +785,15 @@ static const SchemaQuery Query_for_list_of_collations = {
 "   and pg_catalog.quote_ident(c1.relname)='%s'"\
 "   and pg_catalog.pg_table_is_visible(c1.oid)"
 
+/* the silly-looking length condition is just to eat up the current word */
+#define Query_for_nonvalid_constraint_of_table \
+"SELECT pg_catalog.quote_ident(conname) "\
+"  FROM pg_catalog.pg_class c1, pg_catalog.pg_constraint con "\
+" WHERE c1.oid=conrelid and not convalidated"\
+"   and (%d = pg_catalog.length('%s'))"\
+"   and pg_catalog.quote_ident(c1.relname)='%s'"\
+"   and pg_catalog.pg_table_is_visible(c1.oid)"
+
 #define Query_for_all_table_constraints \
 "SELECT pg_catalog.quote_ident(conname) "\
 "  FROM pg_catalog.pg_constraint c "\
@@ -2118,11 +2127,16 @@ psql_completion(const char *text, int start, int end)
 	 * If we have ALTER TABLE  ALTER|DROP|RENAME|VALIDATE CONSTRAINT,
 	 * provide list of constraints
 	 */
-	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER|DROP|RENAME|VALIDATE", "CONSTRAINT"))
+	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER|DROP|RENAME", "CONSTRAINT"))
 	{
 		completion_info_charp = prev3_wd;
 		COMPLETE_WITH_QUERY(Query_for_constraint_of_table);
 	}
+	else if (Matches("ALTER", "TABLE", MatchAny, "VALIDATE", "CONSTRAINT"))
+	{
+		completion_info_charp = prev3_wd;
+		COMPLETE_WITH_QUERY(Query_for_nonvalid_constraint_of_table);
+	}
 	/* ALTER TABLE ALTER [COLUMN]  */
 	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER", "COLUMN", MatchAny) ||
 			 Matches("ALTER", "TABLE", MatchAny, "ALTER", MatchAny))





Do we have a way to dynamically make a callback stub?

2021-04-26 Thread Chapman Flack
Hi,

A lot of the APIs in PostgreSQL that accept a callback follow the
familiar idiom of an extra void* argument allowing a single static
callback address to be multiplexed. But not all of them do. For example,
if I wanted to expose the possibility of defining GUCs from code written
in my PL, I would run into the fact that none of GUC's check/assign/show
hooks have the void* extra arg.

(The GUC machinery allows a way for extra info to be passed from a check
hook to an assign hook, which for a moment I thought might be abusable to
multiplex hooks, but I don't believe it is.)

Making all such APIs follow the void *extra convention might have
the virtue of consistency, but that might not be worth disturbing APIs
that have been stable for many years, and an effort to do so
might not be guaranteed to catch every such instance anyway.

A more general solution might be a function that generates a callback
stub: "please give me a void (*foo)() at a distinct address that I can
pass into this API, and when called it will call bar(baz), passing
this value for baz."

In olden days I wasn't above writing C to just slam those instructions
on the stack and return their address, but that was without multiple
architectures to think about, and non-executable stacks, and so on.
Now, there'd be a bit more magic required. Maybe some such ability is
already present in LLVM and could be exposed in jit.c?
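For what it's worth, the trampoline idea can be demonstrated today from user space with ctypes (Python purely as an illustration of the upcall-stub concept, not a proposal for core): each CFUNCTYPE instance wraps a freshly allocated C-callable stub, so two stubs made for different extra-arguments get distinct addresses. libffi's closure API offers the same shape at the C level.

```python
import ctypes

def make_stub(bar, baz):
    # CFUNCTYPE allocates a new C-callable trampoline per instance; the
    # closure smuggles baz in without the API needing a void *extra arg.
    CB = ctypes.CFUNCTYPE(None)
    return CB(lambda: bar(baz))

calls = []
stub_a = make_stub(calls.append, "a")   # keep references alive: the stub
stub_b = make_stub(calls.append, "b")   # is freed when the object is GC'd
stub_a()
stub_b()
```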

I see that Java is currently incubating such a feature [0], so if I wait
for that I will have another option that serves my specific purposes, but
I wonder if it would be useful for PostgreSQL itself to have such
a capability available (or if it already does, and I haven't found it).

Regards,
-Chap



[0]
https://docs.oracle.com/en/java/javase/16/docs/api/jdk.incubator.foreign/jdk/incubator/foreign/CLinker.html#upcallStub(java.lang.invoke.MethodHandle,jdk.incubator.foreign.FunctionDescriptor)




Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2021-04-26 Thread Alvaro Herrera
Sorry, I forgot to update some comments in that version.  Fixed here.

-- 
Álvaro Herrera39°49'30"S 73°17'W
>From cb6d9e026624656e826ea880716ee552b15203a8 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 26 Apr 2021 14:53:04 -0400
Subject: [PATCH v2] Allow a partdesc-omitting-partitions to be cached

Makes partition descriptor acquisition faster during the transient
period in which a partition is in the process of being detached.

Per gripe from Amit Langote
Discussion: https://postgr.es/m/ca+hiwqfgpp1lxjzobygt9rpvtjxxkg5qg2+xch2z1q7krqz...@mail.gmail.com
---
 src/backend/catalog/pg_inherits.c | 52 +-
 src/backend/commands/tablecmds.c  | 66 +++--
 src/backend/commands/trigger.c|  3 +-
 src/backend/partitioning/partdesc.c   | 96 ---
 src/include/catalog/pg_inherits.h |  6 +-
 src/include/utils/rel.h   |  3 +
 .../detach-partition-concurrently-3.out   | 57 ++-
 .../detach-partition-concurrently-3.spec  |  9 +-
 8 files changed, 219 insertions(+), 73 deletions(-)

diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c
index 6447b52854..c373faf2d6 100644
--- a/src/backend/catalog/pg_inherits.c
+++ b/src/backend/catalog/pg_inherits.c
@@ -52,6 +52,22 @@ typedef struct SeenRelsEntry
  * then no locks are acquired, but caller must beware of race conditions
  * against possible DROPs of child relations.
  *
+ * Partitions marked as being detached are omitted; see
+ * find_inheritance_children_extended for details.
+ */
+List *
+find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
+{
+	return find_inheritance_children_extended(parentrelId, true, lockmode,
+			  NULL, NULL);
+}
+
+/*
+ * find_inheritance_children_extended
+ *
+ * As find_inheritance_children, with more options regarding detached
+ * partitions.
+ *
  * If a partition's pg_inherits row is marked "detach pending",
  * *detached_exist (if not null) is set true.
  *
@@ -60,11 +76,13 @@ typedef struct SeenRelsEntry
  * marked "detach pending" is visible to that snapshot, then that partition is
  * omitted from the output list.  This makes partitions invisible depending on
  * whether the transaction that marked those partitions as detached appears
- * committed to the active snapshot.
+ * committed to the active snapshot.  In addition, *detached_xmin (if not null)
+ * is set to the xmin of the row of the detached partition.
  */
 List *
-find_inheritance_children(Oid parentrelId, bool omit_detached,
-		  LOCKMODE lockmode, bool *detached_exist)
+find_inheritance_children_extended(Oid parentrelId, bool omit_detached,
+   LOCKMODE lockmode, bool *detached_exist,
+   TransactionId *detached_xmin)
 {
 	List	   *list = NIL;
 	Relation	relation;
@@ -132,7 +150,32 @@ find_inheritance_children(Oid parentrelId, bool omit_detached,
 snap = GetActiveSnapshot();
 
 if (!XidInMVCCSnapshot(xmin, snap))
+{
+	if (detached_xmin)
+	{
+		/*
+		 * Two detached partitions should not occur (see
+		 * checks in MarkInheritDetached), but if they do,
+		 * track the newer of the two.  Make sure to warn the
+		 * user, so that they can clean up.  Since this is
+		 * just a cross-check against potentially corrupt
+		 * catalogs, we don't make it a full-fledged error
+		 * message.
+		 */
+		if (*detached_xmin != InvalidTransactionId)
+		{
+			elog(WARNING, "more than one partition pending detach found for table with OID %u",
+ parentrelId);
+			if (TransactionIdFollows(xmin, *detached_xmin))
+*detached_xmin = xmin;
+		}
+		else
+			*detached_xmin = xmin;
+	}
+
+	/* Don't add the partition to the output list */
 	continue;
+}
 			}
 		}
 
@@ -247,8 +290,7 @@ find_all_inheritors(Oid parentrelId, LOCKMODE lockmode, List **numparents)
 		ListCell   *lc;
 
 		/* Get the direct children of this rel */
-		currentchildren = find_inheritance_children(currentrel, true,
-	lockmode, NULL);
+		currentchildren = find_inheritance_children(currentrel, lockmode);
 
 		/*
 		 * Add to the queue only those children not already seen. This avoids
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7d00f4eb25..ab3c61898c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3507,7 +3507,7 @@ renameatt_internal(Oid myrelid,
 		 * expected_parents will only be 0 if we are not already recursing.
 		 */
 		if (expected_parents == 0 &&
-			find_inheritance_children(myrelid, true, NoLock, NULL) != NIL)
+			find_inheritance_children(myrelid, NoLock) != NIL)
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
 	 errmsg("inherited column \"%s\" must be renamed in child tables too",
@@ -3706,7 +3706,7 @@ rename_constraint_internal(Oid myrelid,
 		else
 		{
 			if (expected

Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 23:59:17 +0200, Tomas Vondra wrote:
> On 4/26/21 9:27 PM, Andres Freund wrote:
> > On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
> > > I'm not sure what to do about this :-( I don't have any ideas about how to
> > > eliminate this overhead, so the only option I see is reverting the changes
> > > in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
> > > be frozen ...
> > 
> > ISTM that the fundamental issue here is not that we acquire pins that we
> > shouldn't, but that we do so at a much higher frequency than needed.
> > 
> > It's probably too invasive for 14, but I think it might be worth exploring
> > passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() 
> > iff
> > the input will be more than one row.
> > 
> > And then add the vm buffer of the target page to BulkInsertState, so that
> > hio.c can avoid re-pinning the buffer.
> > 
> 
> Yeah. The question still is what to do about 14, though. Shall we leave the
> code as it is now, or should we change it somehow? It seem a bit unfortunate
> that a COPY FREEZE optimization should negatively influence other (more)
> common use cases, so I guess we can't just keep the current code ...

I'd suggest prototyping the use of BulkInsertState in nodeModifyTable.c
and see whether that fixes the regression. If it does, then we can
analyze whether that's possibly the best way forward. Or whether we
revert, live with the regression or find yet another path.

Greetings,

Andres Freund




Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-26 Thread Tomas Vondra




On 4/26/21 9:27 PM, Andres Freund wrote:

Hi,

On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:

I'm not sure what to do about this :-( I don't have any ideas about how to
eliminate this overhead, so the only option I see is reverting the changes
in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
be frozen ...


ISTM that the fundamental issue here is not that we acquire pins that we
shouldn't, but that we do so at a much higher frequency than needed.

It's probably too invasive for 14, but I think it might be worth exploring
passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
the input will be more than one row.

And then add the vm buffer of the target page to BulkInsertState, so that
hio.c can avoid re-pinning the buffer.



Yeah. The question still is what to do about 14, though. Shall we leave 
the code as it is now, or should we change it somehow? It seems a bit 
unfortunate that a COPY FREEZE optimization should negatively influence 
other (more) common use cases, so I guess we can't just keep the current 
code ...


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: tab-complete for ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2021-04-26 Thread Alvaro Herrera
On 2021-Apr-26, Justin Pryzby wrote:

> On Thu, Apr 22, 2021 at 04:40:35PM -0400, Alvaro Herrera wrote:
> > Would anyone oppose me pushing this for tab-completing the new keywords
> > of ALTER TABLE ..  DETACH PARTITION?
> 
> +1 to apply tab completion for v14

Pushed.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are."  -- Charles J. Sykes' advice to teenagers




Re: Use simplehash.h instead of dynahash in SMgr

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 22:44:13 +0300, Yura Sokolov wrote:
> Even for Robin Hood hashing 0.9 fill factor is too high. It leads to too
> many movements on insertion/deletion and a longer average collision chain.

That's true for modification heavy cases - but most hash tables in PG,
including the smgr one, are quite read heavy. For workloads where
there's a lot of smgr activity, the other overheads in relation
creation/drop handling are magnitudes more expensive than the collision
handling.

Note that simplehash.h also grows when the distance gets too big and
when there are too many elements to move, not just based on the fill
factor.


I kinda wish we had a chained hashtable implementation with the same
interface as simplehash. It's very use-case dependent which approach is
better, and right now we might be forcing some users to choose linear
probing because simplehash.h is still faster than dynahash, even though
chaining would actually be more appropriate.


> Note that Robin Hood doesn't optimize average case.

Right.


> See the discussion on the Rust hash table fill factor from when it was Robin Hood:
> https://github.com/rust-lang/rust/issues/38003

The first sentence actually confirms my point above, about it being a
question of read vs write heavy.

Greetings,

Andres Freund




Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2021-04-26 Thread Alvaro Herrera
On 2021-Apr-26, Alvaro Herrera wrote:

> > Please allow me to study the patch a bit more closely and get back tomorrow.
> 
> Sure, thanks!

Here's a more polished version.

After trying the version with the elog(ERROR) when two detached
partitions are present, I decided against it; it is unhelpful because
it doesn't let you build partition descriptors for anything.  So I made
it an elog(WARNING) (not an ereport, note), and keep the most recent
pg_inherits.xmin value.  This is not great, but it lets you out of the
situation by finalizing one of the detaches.

The other check (at ALTER TABLE .. DETACH time) is an ereport(ERROR) and
should make the first one unreachable.

-- 
Álvaro Herrera39°49'30"S 73°17'W
>From 89e8a7d170fc0c6701e536b9c5830e3a58f303cc Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 26 Apr 2021 14:53:04 -0400
Subject: [PATCH] Allow a partdesc-with-omitted-partitions to be cached

Makes partdesc acquisition faster during the transient period during
which a partition is in the process of being dropped.

Per gripe from Amit Langote
Discussion: https://postgr.es/m/ca+hiwqfgpp1lxjzobygt9rpvtjxxkg5qg2+xch2z1q7krqz...@mail.gmail.com
---
 src/backend/catalog/pg_inherits.c | 52 +++--
 src/backend/commands/tablecmds.c  | 66 +---
 src/backend/commands/trigger.c|  3 +-
 src/backend/partitioning/partdesc.c   | 75 ---
 src/include/catalog/pg_inherits.h |  6 +-
 src/include/utils/rel.h   |  3 +
 .../detach-partition-concurrently-3.out   | 57 +-
 .../detach-partition-concurrently-3.spec  |  9 ++-
 8 files changed, 205 insertions(+), 66 deletions(-)

diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c
index 6447b52854..c373faf2d6 100644
--- a/src/backend/catalog/pg_inherits.c
+++ b/src/backend/catalog/pg_inherits.c
@@ -52,6 +52,22 @@ typedef struct SeenRelsEntry
  * then no locks are acquired, but caller must beware of race conditions
  * against possible DROPs of child relations.
  *
+ * Partitions marked as being detached are omitted; see
+ * find_inheritance_children_extended for details.
+ */
+List *
+find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
+{
+	return find_inheritance_children_extended(parentrelId, true, lockmode,
+			  NULL, NULL);
+}
+
+/*
+ * find_inheritance_children_extended
+ *
+ * As find_inheritance_children, with more options regarding detached
+ * partitions.
+ *
  * If a partition's pg_inherits row is marked "detach pending",
  * *detached_exist (if not null) is set true.
  *
@@ -60,11 +76,13 @@ typedef struct SeenRelsEntry
  * marked "detach pending" is visible to that snapshot, then that partition is
  * omitted from the output list.  This makes partitions invisible depending on
  * whether the transaction that marked those partitions as detached appears
- * committed to the active snapshot.
+ * committed to the active snapshot.  In addition, *detached_xmin (if not null)
+ * is set to the xmin of the row of the detached partition.
  */
 List *
-find_inheritance_children(Oid parentrelId, bool omit_detached,
-		  LOCKMODE lockmode, bool *detached_exist)
+find_inheritance_children_extended(Oid parentrelId, bool omit_detached,
+   LOCKMODE lockmode, bool *detached_exist,
+   TransactionId *detached_xmin)
 {
 	List	   *list = NIL;
 	Relation	relation;
@@ -132,7 +150,32 @@ find_inheritance_children(Oid parentrelId, bool omit_detached,
 snap = GetActiveSnapshot();
 
 if (!XidInMVCCSnapshot(xmin, snap))
+{
+	if (detached_xmin)
+	{
+		/*
+		 * Two detached partitions should not occur (see
+		 * checks in MarkInheritDetached), but if they do,
+		 * track the newer of the two.  Make sure to warn the
+		 * user, so that they can clean up.  Since this is
+		 * just a cross-check against potentially corrupt
+		 * catalogs, we don't make it a full-fledged error
+		 * message.
+		 */
+		if (*detached_xmin != InvalidTransactionId)
+		{
+			elog(WARNING, "more than one partition pending detach found for table with OID %u",
+ parentrelId);
+			if (TransactionIdFollows(xmin, *detached_xmin))
+*detached_xmin = xmin;
+		}
+		else
+			*detached_xmin = xmin;
+	}
+
+	/* Don't add the partition to the output list */
 	continue;
+}
 			}
 		}
 
@@ -247,8 +290,7 @@ find_all_inheritors(Oid parentrelId, LOCKMODE lockmode, List **numparents)
 		ListCell   *lc;
 
 		/* Get the direct children of this rel */
-		currentchildren = find_inheritance_children(currentrel, true,
-	lockmode, NULL);
+		currentchildren = find_inheritance_children(currentrel, lockmode);
 
 		/*
 		 * Add to the queue only those children not already seen. This avoids
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7d00f4eb25..ab3c61898c

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-26 Thread Yura Sokolov

Andres Freund писал 2021-04-26 21:46:

Hi,

On 2021-04-25 01:27:24 +0300, Yura Sokolov wrote:

It is quite interesting result. Simplehash being open-addressing with
linear probing is friendly for cpu cache. I'd recommend to define
SH_FILLFACTOR with value lower than default (0.9). I believe 0.75 is
suitable most for such kind of hash table.


It's not a "plain" linear probing hash table (although it is on the
lookup side). During insertions collisions are reordered so that the
average distance from the "optimal" position is ~even ("robin hood
hashing"). That allows a higher load factor than a plain linear probed
hash table (for which IIRC there's data to show 0.75 to be a good
default load factor).


Even for Robin Hood hashing 0.9 fill factor is too high. It leads to
too many movements on insertion/deletion and a longer average collision
chain.

Note that Robin Hood doesn't optimize the average case. Indeed, Robin
Hood usually has a worse (longer) average collision chain than simple
linear probing. Robin Hood hashing optimizes the worst case, i.e. it
guarantees there is no unnecessarily long collision chain.

See the discussion on the Rust hash table fill factor from when it was
Robin Hood: https://github.com/rust-lang/rust/issues/38003


There of course may still be a benefit in lowering the load factor, but
I'd not start there.

David's tests aren't really suited to benchmarking the load factor, but
to me the stats he showed didn't highlight a need to lower the load
factor. Lowering the fill factor does influence the cache hit ratio...

Greetings,

Andres Freund


regards,
Yura.




multi-version capable PostgresNode.pm

2021-04-26 Thread Andrew Dunstan


Just to complete the circle on this topic, which I intend to take up
again during the next dev cycle, I have captured the current state of my
work in a public git repo at
. This can be cloned and
used without having to change the core Postgres code, as shown in the
README.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
> I'm not sure what to do about this :-( I don't have any ideas about how to
> eliminate this overhead, so the only option I see is reverting the changes
> in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
> be frozen ...

ISTM that the fundamental issue here is not that we acquire pins that we
shouldn't, but that we do so at a much higher frequency than needed.

It's probably too invasive for 14, but I think it might be worth exploring
passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
the input will be more than one row.

And then add the vm buffer of the target page to BulkInsertState, so that
hio.c can avoid re-pinning the buffer.
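The saving can be illustrated with a toy model (Python, hypothetical names; in the real system the cached pin would live in BulkInsertStateData and be consulted by hio.c): pinning work becomes proportional to the number of page switches rather than the number of rows.

```python
class VMPinCache:
    """Toy stand-in for carrying the visibility-map buffer pin in a
    bulk-insert state instead of re-pinning it for every inserted row."""
    def __init__(self):
        self.pinned_block = None
        self.pin_ops = 0

    def pin(self, block):
        if self.pinned_block != block:   # re-pin only when the page changes
            self.pinned_block = block
            self.pin_ops += 1

def pins_without_cache(blocks):
    return len(blocks)                   # one pin per row, as in the regression

def pins_with_cache(blocks):
    cache = VMPinCache()
    for blk in blocks:
        cache.pin(blk)
    return cache.pin_ops

rows = [0] * 50 + [1] * 50               # 100 rows landing on two heap pages
```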

Greetings,

Andres Freund




Re: Addition of authenticated ID to pg_stat_activity

2021-04-26 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2021-04-26 11:34:16 +0900, Michael Paquier wrote:
> > 9afffcb has added the concept of authenticated identity to the
> > information provided in log_connections for audit purposes, with this
> > data stored in each backend's port.  One extra thing that can be
> > really useful for monitoring is the capability to track this
> > information directly in pg_stat_activity.
> 
> I'm getting a bit worried about the incremental increase in
> pg_stat_activity width - it's probably by far the view that's most
> viewed interactively. I think we need to be careful not to add too niche
> things to it. This is especially true for columns that may be wider.
> 
> It'd be bad for discoverability, but perhaps something like this, that's
> not that likely to be used interactively, would be better done as a
> separate function that would need to be used explicitly?

I mean.. we already have separate functions and views for this, though
they're auth-method-specific currently, but also provide more details,
since it isn't actually a "one size fits all" kind of thing like this
entire approach is imagining it to be.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Addition of authenticated ID to pg_stat_activity

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 11:34:16 +0900, Michael Paquier wrote:
> 9afffcb has added the concept of authenticated identity to the
> information provided in log_connections for audit purposes, with this
> data stored in each backend's port.  One extra thing that can be
> really useful for monitoring is the capability to track this
> information directly in pg_stat_activity.

I'm getting a bit worried about the incremental increase in
pg_stat_activity width - it's probably by far the view that's most
viewed interactively. I think we need to be careful not to add too niche
things to it. This is especially true for columns that may be wider.

It'd be bad for discoverability, but perhaps something like this, that's
not that likely to be used interactively, would be better done as a
separate function that would need to be used explicitly?


A select * from pg_stat_activity on a plain installation, connecting
over unix socket, with nothing running, is 411 chars wide for me...

Greetings,

Andres Freund




Re: Use simplehash.h instead of dynahash in SMgr

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-25 01:27:24 +0300, Yura Sokolov wrote:
> It is quite interesting result. Simplehash being open-addressing with
> linear probing is friendly for cpu cache. I'd recommend to define
> SH_FILLFACTOR with value lower than default (0.9). I believe 0.75 is
> suitable most for such kind of hash table.

It's not a "plain" linear probing hash table (although it is on the lookup
side). During insertions collisions are reordered so that the average distance
from the "optimal" position is ~even ("robin hood hashing"). That allows a
higher load factor than a plain linear probed hash table (for which IIRC
there's data to show 0.75 to be a good default load factor).
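The reordering can be seen in a toy Robin Hood table (Python for brevity; simplehash.h does the equivalent in C): an arriving key that is already farther from its home slot than the resident entry steals that slot, and the displaced resident keeps probing. The same invariant lets lookups stop early once they pass an entry closer to its home than the probe is.

```python
CAP = 16                                  # toy power-of-two table

def home(key):
    return key % CAP                      # trivial hash, to stage collisions

def insert(table, key):
    pos, dist = home(key), 0
    while True:
        if table[pos] is None:
            table[pos] = (key, dist)
            return
        res_key, res_dist = table[pos]
        if res_dist < dist:               # resident is "richer": steal its slot
            table[pos] = (key, dist)
            key, dist = res_key, res_dist
        pos = (pos + 1) % CAP
        dist += 1

def lookup(table, key):
    pos, dist = home(key), 0
    # Robin Hood invariant: past a resident closer to home than our probe,
    # the key cannot exist, so we can stop early.
    while table[pos] is not None and table[pos][1] >= dist:
        if table[pos][0] == key:
            return True
        pos = (pos + 1) % CAP
        dist += 1
    return False

table = [None] * CAP
for k in (0, 16, 1, 17, 2, 32):           # colliding keys trigger stealing
    insert(table, k)
```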

There of course may still be a benefit in lowering the load factor, but I'd
not start there.

David's tests aren't really suited to benchmarking the load factor, but to me
the stats he showed didn't highlight a need to lower the load factor. Lowering
the fill factor does influence the cache hit ratio...

Greetings,

Andres Freund




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Magnus Hagander
On Mon, Apr 26, 2021 at 8:14 PM Andres Freund  wrote:
>
> Hi,
>
> On 2021-04-26 13:43:31 -0400, Alvaro Herrera wrote:
> > I think it's straightforward, if we decouple the tri-valued enum used
> > for guc.c purposes from a separate boolean that actually enables the
> > feature.  GUC sets the boolean to "off" initially when it sees the enum
> > as "auto", and then pg_stat_statement's _PG_init modifies it during its
> > own startup as needed.

That's pretty much exactly my original suggestion, yes :)


> > So the user can turn the GUC off, and then pg_stat_statement does
> > nothing and there is no performance drawback; or leave it "auto" and
> > it'll only compute query_id if pg_stat_statement is loaded; or leave it
> > on if they want the query_id for other purposes.
>
> I think that's the right direction. I wonder though if we shouldn't go a
> bit further. Have one guc that determines the "query id provider" (NULL
> or a shared library), and one GUC that configures whether query-id is
> computed (never, on-demand/auto, always). For the provider GUC load the
> .so and look up a function with some well known name.

On Mon, Apr 26, 2021 at 8:37 PM Andres Freund  wrote:
> I have a preference to determining the provider via GUC instead of a
> hook because it is both easier to introspect and easier to configure.

+1 in general. Though we could of course also have a read-only
internal GUC that would show what we ended up with, and still
configure it with shared_preload_libraries, or loaded in some other
way. In a way it'd be cleaner to "always load modules with
shared_preload_libraries", but I can certainly see the arguments in
either direction..

But whichever way it's configured, having a well exposed way to know
what it actually is would be important.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 14:21:00 -0400, Tom Lane wrote:
> Andres Freund  writes:
> That's sounding like a pretty sane design, actually.  Not sure about
> the shared-library-name-with-fixed-function-name detail, but certainly
> it seems to be useful to separate "I need a query-id" from the details
> of the ID calculation.
> 
> Rather than a GUC per se for the ID provider, maybe we could have a
> function hook that defaults to pointing at the in-core computation,
> and then a module wanting to override that just gets into the hook.

I have a preference to determining the provider via GUC instead of a
hook because it is both easier to introspect and easier to configure.

If the provider is loaded via a hook, and the shared library is loaded
via shared_preload_libraries, one can't easily just turn that off in a
single session, but needs to restart or explicitly load a different
library (that can't already be loaded).

We also don't have any way to show what's hooking into a hook.

Greetings,

Andres Freund




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Tom Lane
Andres Freund  writes:
> I think that's the right direction. I wonder though if we shouldn't go a
> bit further. Have one guc that determines the "query id provider" (NULL
> or a shared library), and one GUC that configures whether query-id is
> computed (never, on-demand/auto, always). For the provider GUC load the
> .so and look up a function with some well known name.

That's sounding like a pretty sane design, actually.  Not sure about
the shared-library-name-with-fixed-function-name detail, but certainly
it seems to be useful to separate "I need a query-id" from the details
of the ID calculation.

Rather than a GUC per se for the ID provider, maybe we could have a
function hook that defaults to pointing at the in-core computation,
and then a module wanting to override that just gets into the hook.

regards, tom lane




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 13:43:31 -0400, Alvaro Herrera wrote:
> I think it's straightforward, if we decouple the tri-valued enum used
> for guc.c purposes from a separate boolean that actually enables the
> feature.  GUC sets the boolean to "off" initially when it sees the enum
> as "auto", and then pg_stat_statement's _PG_init modifies it during its
> own startup as needed.

> So the user can turn the GUC off, and then pg_stat_statement does
> nothing and there is no performance drawback; or leave it "auto" and
> it'll only compute query_id if pg_stat_statement is loaded; or leave it
> on if they want the query_id for other purposes.

I think that's the right direction. I wonder though if we shouldn't go a
bit further. Have one guc that determines the "query id provider" (NULL
or a shared library), and one GUC that configures whether query-id is
computed (never, on-demand/auto, always). For the provider GUC load the
.so and look up a function with some well known name.

Greetings,

Andres Freund




Re: Issue in recent pg_stat_statements?

2021-04-26 Thread Andres Freund
On 2021-04-26 12:53:30 -0500, David Christensen wrote:
> On Mon, Apr 26, 2021 at 12:18 PM Julien Rouhaud  wrote:
> > Using pg_stat_statements with a different query_id semantics without
> > having to fork pg_stat_statements.
> >
> 
> I can see that argument for allowing alternatives, but the current default
> of nothing seems to be particularly non-useful, so some sensible default
> value would seem to be in order, or I can predict a whole mess of future
> user complaints.

+1




Re: Issue in recent pg_stat_statements?

2021-04-26 Thread David Christensen
On Mon, Apr 26, 2021 at 12:18 PM Julien Rouhaud  wrote:

> On Mon, Apr 26, 2021 at 11:40 PM David Christensen
>  wrote:
> >>
> >> > Is this an expected change, or is this in fact broken?  In previous
> revisions, this was showing the INSERT and SELECT at the very least.  I'm
> unclear as to why the regression test is still passing, so want to verify
> that I'm not doing something wrong in the testing.
> >>
> >> Yes, you want to look into the queryid functionality. See
> >>
> https://www.postgresql.org/message-id/flat/35457b09-36f8-add3-1d07-6034fa585ca8%40oss.nttdata.com
> >>
> >> Interface changes may still be coming in 14 for that. Or warnings.
> >
> >
> > Hmm, I'm unclear as to why you would potentially want to use
> pg_stat_statements *without* this functionality.
>
> Using pg_stat_statements with a different query_id semantics without
> having to fork pg_stat_statements.
>

I can see that argument for allowing alternatives, but the current default
of nothing seems to be particularly non-useful, so some sensible default
value would seem to be in order, or I can predict a whole mess of future
user complaints.


Re: pg_amcheck contrib application

2021-04-26 Thread Mark Dilger


> On Apr 23, 2021, at 3:01 PM, Mark Dilger  wrote:
> 
> I'll try to post something that accomplishes the changes to the reports that 
> you are looking for.

The attached patch changes amcheck corruption reports as discussed upthread.  
This patch is submitted for the v14 development cycle as a bug fix, per your 
complaint that the committed code generates reports sufficiently confusing to a 
user as to constitute a bug.

All other code refactoring and additional checks discussed upthread are 
reserved for the v15 development cycle and are not included here.

The minimal patch (not attached) that does not rename any variables is 135 
lines.  Your patch was 159 lines.  The patch (attached) which includes your 
variable renaming is 174 lines.



v23-0001-amcheck-adjusting-corruption-report-output.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Stephen Frost
Greetings,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:
> On 2021-Apr-26, Tom Lane wrote:
> 
> > Stephen Frost  writes:
> > > * Magnus Hagander (mag...@hagander.net) wrote:
> > >> That's why I suggested the three value one. Default to a mode where
> > >> it's automatic, which is what the majority is going to want, but have
> > >> a way to explicitly turn it on.
> > 
> > > This is certainly fine with me too, though it seems a bit surprising to
> > > me that we couldn't just figure out what the user actually wants based
> > > on what's installed/running for any given combination.
> > 
> > I'd be on board with having pg_stat_statement's pg_init function do
> > something to adjust the setting, if we can figure out how to do that
> > in a way that's not confusing in itself.  I'm not sure though that
> > the GUC engine offers a good way.
> 
> I think it's straightforward, if we decouple the tri-valued enum used
> for guc.c purposes from a separate boolean that actually enables the
> feature.  GUC sets the boolean to "off" initially when it sees the enum
> as "auto", and then pg_stat_statement's _PG_init modifies it during its
> own startup as needed.
> 
> So the user can turn the GUC off, and then pg_stat_statement does
> nothing and there is no performance drawback; or leave it "auto" and
> it'll only compute query_id if pg_stat_statement is loaded; or leave it
> on if they want the query_id for other purposes.

Yeah, this is more-or-less the same as what I was just proposing in an
email that crossed this one.  Using a separate boolean would certainly
be fine.

Thanks,

Stephen




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Alvaro Herrera
On 2021-Apr-26, Tom Lane wrote:

> Stephen Frost  writes:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> >> That's why I suggested the three value one. Default to a mode where
> >> it's automatic, which is what the majority is going to want, but have
> >> a way to explicitly turn it on.
> 
> > This is certainly fine with me too, though it seems a bit surprising to
> > me that we couldn't just figure out what the user actually wants based
> > on what's installed/running for any given combination.
> 
> I'd be on board with having pg_stat_statement's pg_init function do
> something to adjust the setting, if we can figure out how to do that
> in a way that's not confusing in itself.  I'm not sure though that
> the GUC engine offers a good way.

I think it's straightforward, if we decouple the tri-valued enum used
for guc.c purposes from a separate boolean that actually enables the
feature.  GUC sets the boolean to "off" initially when it sees the enum
as "auto", and then pg_stat_statement's _PG_init modifies it during its
own startup as needed.

So the user can turn the GUC off, and then pg_stat_statement does
nothing and there is no performance drawback; or leave it "auto" and
it'll only compute query_id if pg_stat_statement is loaded; or leave it
on if they want the query_id for other purposes.

-- 
Álvaro Herrera               39°49'30"S 73°17'W
"Early and provident fear is the mother of safety" (E. Burke)




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> >> That's why I suggested the three value one. Default to a mode where
> >> it's automatic, which is what the majority is going to want, but have
> >> a way to explicitly turn it on.
> 
> > This is certainly fine with me too, though it seems a bit surprising to
> > me that we couldn't just figure out what the user actually wants based
> > on what's installed/running for any given combination.
> 
> I'd be on board with having pg_stat_statement's pg_init function do
> something to adjust the setting, if we can figure out how to do that
> in a way that's not confusing in itself.  I'm not sure though that
> the GUC engine offers a good way.

Both of the extensions are getting loaded via pg_stat_statements and
both can have pg_init functions which work together to come up with the
right answer, no?

That is- can't pg_stat_statements, when it's loaded, enable
compute_query_id if it's not already enabled, and then the pg_queryid
module simply disable it when it gets loaded in its pg_init()?  Telling
people who are using pg_queryid to have it loaded *after*
pg_stat_statements certainly seems reasonable to me, but if folks don't
like that then maybe have a tri-state which is 'auto', 'on', and 'off',
where pg_stat_statements would set it to 'on' if it's set to 'auto', but
not do anything if it starts as 'off'.  pg_queryid would then set it to
'off' when it's loaded and it wouldn't matter if it's loaded before or
after.

Thanks,

Stephen




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Tom Lane
Stephen Frost  writes:
> * Magnus Hagander (mag...@hagander.net) wrote:
>> That's why I suggested the three value one. Default to a mode where
>> it's automatic, which is what the majority is going to want, but have
>> a way to explicitly turn it on.

> This is certainly fine with me too, though it seems a bit surprising to
> me that we couldn't just figure out what the user actually wants based
> on what's installed/running for any given combination.

I'd be on board with having pg_stat_statement's pg_init function do
something to adjust the setting, if we can figure out how to do that
in a way that's not confusing in itself.  I'm not sure though that
the GUC engine offers a good way.

regards, tom lane




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Stephen Frost
Greetings,

* Magnus Hagander (mag...@hagander.net) wrote:
> On Mon, Apr 26, 2021 at 6:56 PM Tom Lane  wrote:
> > Stephen Frost  writes:
> > > * Bruce Momjian (br...@momjian.us) wrote:
> > >> Technically, pg_stat_statements can turn on compute_query_id when it is
> > >> loaded, even if it is 'off' in postgresql.conf, right?  And
> > >> pg_stat_statements would know if an alternate hash method is being used,
> > >> right?
> >
> > > +1 on this approach.
> >
> > That'd make it impossible to turn off or adjust afterwards, wouldn't it?
> > I'm afraid the confusion stemming from that would outweigh any simplicity.

I don't know that it actually would, but ...

> That's why I suggested the three value one. Default to a mode where
> it's automatic, which is what the majority is going to want, but have
> a way to explicitly turn it on.

This is certainly fine with me too, though it seems a bit surprising to
me that we couldn't just figure out what the user actually wants based
on what's installed/running for any given combination.

> > In the end, it's not like this is the first time we've ever made an
> > incompatible change in configuration needs; and it won't be the last
> > either.  I don't buy the argument that pg_stat_statements users can't
> > cope with adding the additional setting.  (Of course, we should be
> > careful to call it out as an incompatible change in the release notes.)
> 
> The fact that we've made changes before that complicated our users
> experience isn't in itself an argument for doing it again though...

I'm generally a proponent of sensible changes across major versions even
if it means that the user has to adjust things, but this seems like a
case where we're punting on something that we really should just be able
to figure out the right answer to and that seems like a step backwards.

Thanks,

Stephen




Re: Issue in recent pg_stat_statements?

2021-04-26 Thread Julien Rouhaud
On Mon, Apr 26, 2021 at 11:40 PM David Christensen
 wrote:
>>
>> > Is this an expected change, or is this in fact broken?  In previous 
>> > revisions, this was showing the INSERT and SELECT at the very least.  I'm 
>> > unclear as to why the regression test is still passing, so want to verify 
>> > that I'm not doing something wrong in the testing.
>>
>> Yes, you want to look into the queryid functionality. See
>> https://www.postgresql.org/message-id/flat/35457b09-36f8-add3-1d07-6034fa585ca8%40oss.nttdata.com
>>
>> Interface changes may still be coming in 14 for that. Or warnings.
>
>
> Hmm, I'm unclear as to why you would potentially want to use 
> pg_stat_statements *without* this functionality.

Using pg_stat_statements with a different query_id semantics without
having to fork pg_stat_statements.




Re: ERROR: relation "sql_features" does not exist

2021-04-26 Thread Pavel Stehule
On Mon, 26 Apr 2021 at 19:10, Tom Lane  wrote:

> Pavel Stehule  writes:
> > I tried to write a query that does lateral join between
> > information_schema.tables and pgstattuple function.
>
> > select * from information_schema.tables, lateral(select * from
> > pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';
>
> > The query finished by strange error
>
> > postgres=# select * from information_schema.tables, lateral(select * from
> > pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';
> > ERROR:  relation "sql_features" does not exist
>
> > When I set search_path to information_schema, then the query is running.
> > But there is not any reason why it should be necessary.
>
> Nope, this is classic user error, nothing else.  "table_name::name"
> is entirely inadequate as a way to reference a table that isn't
> visible in your search path.  You have to incorporate the schema
> name as well.
>
> Ideally you'd just pass the table OID to the OID-accepting version of
> pgstattuple(), but of course the information_schema schema views
> don't expose OIDs.  So basically you need something like
>
>
> pgstattuple((quote_ident(table_schema)||'.'||quote_ident(table_name))::regclass)
>
> although perhaps format() could help a little here.
>

I understand now. Thank you for the explanation.

select * from information_schema.tables, lateral(select * from
pgstattuple(format('%I.%I', table_schema, table_name))) s where table_type
= 'BASE TABLE';

This is working

Regards

Pavel

>
> regards, tom lane
>


Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Julien Rouhaud
On Tue, Apr 27, 2021 at 1:04 AM Magnus Hagander  wrote:
>
> That's why I suggested the three value one. Default to a mode where
> it's automatic, which is what the majority is going to want, but have
> a way to explicitly turn it on.

Agreed, that also sounds like a sensible default.




Re: ERROR: relation "sql_features" does not exist

2021-04-26 Thread Tom Lane
Pavel Stehule  writes:
> I tried to write a query that does lateral join between
> information_schema.tables and pgstattuple function.

> select * from information_schema.tables, lateral(select * from
> pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';

> The query finished by strange error

> postgres=# select * from information_schema.tables, lateral(select * from
> pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';
> ERROR:  relation "sql_features" does not exist

> When I set search_path to information_schema, then the query is running.
> But there is not any reason why it should be necessary.

Nope, this is classic user error, nothing else.  "table_name::name"
is entirely inadequate as a way to reference a table that isn't
visible in your search path.  You have to incorporate the schema
name as well.

Ideally you'd just pass the table OID to the OID-accepting version of
pgstattuple(), but of course the information_schema schema views
don't expose OIDs.  So basically you need something like

pgstattuple((quote_ident(table_schema)||'.'||quote_ident(table_name))::regclass)

although perhaps format() could help a little here.

regards, tom lane




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Magnus Hagander
On Mon, Apr 26, 2021 at 7:00 PM Bruce Momjian  wrote:
>
> On Mon, Apr 26, 2021 at 12:56:13PM -0400, Tom Lane wrote:
> > Stephen Frost  writes:
> > > * Bruce Momjian (br...@momjian.us) wrote:
> > >> Technically, pg_stat_statements can turn on compute_query_id when it is
> > >> loaded, even if it is 'off' in postgresql.conf, right?  And
> > >> pg_stat_statements would know if an alternate hash method is being used,
> > >> right?
> >
> > > +1 on this approach.
> >
> > That'd make it impossible to turn off or adjust afterwards, wouldn't it?
> > I'm afraid the confusion stemming from that would outweigh any simplicity.
> >
> > I would be in favor of logging a message at startup to the effect of
> > "this is misconfigured" (as per upthread discussion), although whether
> > people would see that is uncertain.
>
> I think a user-visible warning at CREATE EXTENSION would help too.

It would help a bit, but actually logging it would probably help more.
Most people don't run the CREATE EXTENSION commands manually, it's all
done as part of either system install scripts or of application
migrations.

But that doesn't mean it wouldn't be useful to do it for those that
*do* run things manually, it just wouldn't be sufficient in itself.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Magnus Hagander
On Mon, Apr 26, 2021 at 6:56 PM Tom Lane  wrote:
>
> Stephen Frost  writes:
> > * Bruce Momjian (br...@momjian.us) wrote:
> >> Technically, pg_stat_statements can turn on compute_query_id when it is
> >> loaded, even if it is 'off' in postgresql.conf, right?  And
> >> pg_stat_statements would know if an alternate hash method is being used,
> >> right?
>
> > +1 on this approach.
>
> That'd make it impossible to turn off or adjust afterwards, wouldn't it?
> I'm afraid the confusion stemming from that would outweigh any simplicity.

That's why I suggested the three value one. Default to a mode where
it's automatic, which is what the majority is going to want, but have
a way to explicitly turn it on.


> I would be in favor of logging a message at startup to the effect of
> "this is misconfigured" (as per upthread discussion), although whether
> people would see that is uncertain.

Some people would. Many wouldn't, and sadly many hours would be spent
on debugging things before they got there -- based on experience of
how many people actually read the logs..

> In the end, it's not like this is the first time we've ever made an
> incompatible change in configuration needs; and it won't be the last
> either.  I don't buy the argument that pg_stat_statements users can't
> cope with adding the additional setting.  (Of course, we should be
> careful to call it out as an incompatible change in the release notes.)

The fact that we've made changes before that complicated our users
experience isn't in itself an argument for doing it again though...

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Julien Rouhaud
On Tue, Apr 27, 2021 at 12:56 AM Tom Lane  wrote:
>
> Stephen Frost  writes:
> > * Bruce Momjian (br...@momjian.us) wrote:
> >> Technically, pg_stat_statements can turn on compute_query_id when it is
> >> loaded, even if it is 'off' in postgresql.conf, right?  And
> >> pg_stat_statements would know if an alternate hash method is being used,
> >> right?
>
> > +1 on this approach.
>
> That'd make it impossible to turn off or adjust afterwards, wouldn't it?

I think so, which would also make it impossible to use an external
query_id plugin.

Enabling compute_query_id by default or raising a WARNING in
pg_stat_statements' PG_INIT seem like the only two sensible options.




Re: Synchronous commit behavior during network outage

2021-04-26 Thread Ondřej Žižka

Hello Andrey,

I went through the thread for your patch, and it seems to me to be an 
acceptable solution...


> The only case patch does not handle is sudden backend crash - 
Postgres will recover without a restart.


We also use an HA tool (Patroni). If the whole machine fails, it will 
find a new master and it should be OK. We use a 4-node setup (2 sync 
replicas and 1 async from every replica). If there is an issue just with 
a sync replica (the async operating normally) and the master fails completely 
in this situation, Patroni will solve it (the async replica 
becomes another sync), but if it is just the backend process, the master 
will not fail over and the changes will still be visible...


If the sync replica outage is temporary, it will resolve itself when the 
node establishes a replication slot again... If the outage is "long", 
Patroni will remove the "old" sync replica from the cluster and the 
async replica reading from the master will become the new sync. So yes... In a 
2-node setup this can be an issue, but in a 4-node setup this seems to me 
like a solution.
The only situation I can imagine is a situation when the client 
connections use a different network than the replication network and the 
replication network would be down completely, but the client network 
will be up. In that case, the master can be an "isolated island" and if 
it fails, we can lose the changed data.
Is this situation also covered in your model: "transaction effects 
should not be observable on primary until requirements of 
synchronous_commit are satisfied"?


Do you agree with my thoughts?

Maybe it would be possible to implement it in PostgreSQL with a note in the 
documentation that a multinode (>=3 nodes) cluster is necessary.


Regards
Ondrej

On 22/04/2021 05:55, Andrey Borodin wrote:


Hi Ondrej!


On Apr 19, 2021, at 22:19, Ondřej Žižka  wrote:

Do you think, that would be possible to implement a process that would solve 
this use case?
Thank you
Ondrej


Feel free to review patch fixing this at [0]. It's classified as "Server 
Features", but I'm sure it's a bug fix.

Yandex.Cloud PG runs with this patch for more than half a year, because we 
cannot afford losing data in HA clusters.

It's a somewhat incomplete solution, because a PG restart or crash recovery will 
make waiting transactions visible. But we protect against this on the HA tool's side.

Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/33/2402/





Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Bruce Momjian
On Mon, Apr 26, 2021 at 12:56:13PM -0400, Tom Lane wrote:
> Stephen Frost  writes:
> > * Bruce Momjian (br...@momjian.us) wrote:
> >> Technically, pg_stat_statements can turn on compute_query_id when it is
> >> loaded, even if it is 'off' in postgresql.conf, right?  And
> >> pg_stat_statements would know if an alternate hash method is being used,
> >> right?
> 
> > +1 on this approach.
> 
> That'd make it impossible to turn off or adjust afterwards, wouldn't it?
> I'm afraid the confusion stemming from that would outweigh any simplicity.
> 
> I would be in favor of logging a message at startup to the effect of
> "this is misconfigured" (as per upthread discussion), although whether
> people would see that is uncertain.

I think a user-visible warning at CREATE EXTENSION would help too.

> In the end, it's not like this is the first time we've ever made an
> incompatible change in configuration needs; and it won't be the last
> either.  I don't buy the argument that pg_stat_statements users can't
> cope with adding the additional setting.  (Of course, we should be
> careful to call it out as an incompatible change in the release notes.)

Agreed.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





ERROR: relation "sql_features" does not exist

2021-04-26 Thread Pavel Stehule
Hi

I tried to write a query that does lateral join between
information_schema.tables and pgstattuple function.

select * from information_schema.tables, lateral(select * from
pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';

The query finished by strange error

postgres=# select * from information_schema.tables, lateral(select * from
pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';
ERROR:  relation "sql_features" does not exist

When I set search_path to information_schema, then the query is running.
But there is not any reason why it should be necessary.

I found this issue on pg 11.11, but the same behavior is on master branch.

Regards

Pavel


Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Tom Lane
Stephen Frost  writes:
> * Bruce Momjian (br...@momjian.us) wrote:
>> Technically, pg_stat_statements can turn on compute_query_id when it is
>> loaded, even if it is 'off' in postgresql.conf, right?  And
>> pg_stat_statements would know if an alternate hash method is being used,
>> right?

> +1 on this approach.

That'd make it impossible to turn off or adjust afterwards, wouldn't it?
I'm afraid the confusion stemming from that would outweigh any simplicity.

I would be in favor of logging a message at startup to the effect of
"this is misconfigured" (as per upthread discussion), although whether
people would see that is uncertain.

In the end, it's not like this is the first time we've ever made an
incompatible change in configuration needs; and it won't be the last
either.  I don't buy the argument that pg_stat_statements users can't
cope with adding the additional setting.  (Of course, we should be
careful to call it out as an incompatible change in the release notes.)

regards, tom lane




Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

2021-04-26 Thread Dean Rasheed
On Mon, 26 Apr 2021 at 15:55, Tom Lane  wrote:
>
> I checked into the commit history (how'd we ever survive without "git
> blame"?) and found that my argument above is actually wrong in detail.
> Before cab5dc5da of 2013-10-18, rewriteTargetListIU expanded non-updated
> columns for all views not only trigger-updatable ones.  However, that
> behavior itself goes back only to 2ec993a7c of 2010-10-10, which added
> triggers on views; before that there was indeed no such expansion.

Ah, that makes sense. Before cab5dc5da, expanding non-updated columns
of auto-updatable views was safe because until then a view was only
auto-updatable if all its columns were. It was still unnecessary work
though, and with 20/20 hindsight, when triggers on views were first
added in 2ec993a7c, it probably should have only expanded the
targetlist for trigger-updatable views.

> Of course the view rewrite mechanisms are ten or so years older than
> that, so the conclusion that they weren't designed to need this still
> stands.

Yeah, I think that conclusion is right. The trickiest part I found was
deciding whether any product queries from conditional rules would do
the right thing if the main trigger-updatable query no longer expands
its targetlist. But I think that has to be OK, because even before
trigger-updatable views were added, it was possible to have product
queries from conditional rules together with an unconditional
do-nothing rule, so the product queries don't rely on the expanded
targetlist, and never have.

Regards,
Dean




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Mon, Apr 26, 2021 at 05:34:30PM +0200, Christoph Berg wrote:
> > Re: Peter Eisentraut
> > > > Agreed.  If pg_stat_statements were zero-configuration today then
> > > > this would be an annoying new burden, but it isn't.
> > > 
> > > I think people can understand "add pg_stat_statements to
> > > shared_preload_libraries" and "install the extension".  You have to turn 
> > > it
> > > on somehow after all.
> > 
> > Fwiw, I'd claim that pg_stat_statements *is* zero-configuration today.
> > You just have to load the module (= shared_preload_libraries), and it
> > will start working. Later you can run CREATE EXTENSION to actually see
> > the stats, but they are already being collected in the background.
> > 
> > > Now there is the additional burden of turning on this weird setting that 
> > > no
> > > one understands.  That's a 50% increase in burden.
> > > 
> > > And almost no one will want to use a nondefault setting.
> > > 
> > > pg_stat_statements is pretty popular.  I think leaving in this requirement
> > > will lead to widespread confusion and complaints.
> > 
> > Ack, please make the default config (i.e. after setting 
> > shared_preload_libraries)
> > do something sensible. Having to enable some "weird" internal other setting
> > will be very hard to explain to users.
> > 
> > Fwiw, I'd even want to have pg_stat_statements enabled in core by
> > default. That would be awesome UX. (And turning off could be as simple as
> > setting compute_query_id=off.)
> 
> Technically, pg_stat_statements can turn on compute_query_id when it is
> loaded, even if it is 'off' in postgresql.conf, right?  And
> pg_stat_statements would know if an alternate hash method is being used,
> right?

+1 on this approach.  I agree that we should avoid having to make every
new user and every user who is upgrading with pg_stat_statements
installed have to go twiddle this parameter.

Thanks,

Stephen




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Bruce Momjian
On Mon, Apr 26, 2021 at 05:34:30PM +0200, Christoph Berg wrote:
> Re: Peter Eisentraut
> > > Agreed.  If pg_stat_statements were zero-configuration today then
> > > this would be an annoying new burden, but it isn't.
> > 
> > I think people can understand "add pg_stat_statements to
> > shared_preload_libraries" and "install the extension".  You have to turn it
> > on somehow after all.
> 
> Fwiw, I'd claim that pg_stat_statements *is* zero-configuration today.
> You just have to load the module (= shared_preload_libraries), and it
> will start working. Later you can run CREATE EXTENSION to actually see
> the stats, but they are already being collected in the background.
> 
> > Now there is the additional burden of turning on this weird setting that no
> > one understands.  That's a 50% increase in burden.
> > 
> > And almost no one will want to use a nondefault setting.
> > 
> > pg_stat_statements is pretty popular.  I think leaving in this requirement
> > will lead to widespread confusion and complaints.
> 
> Ack, please make the default config (i.e. after setting 
> shared_preload_libraries)
> do something sensible. Having to enable some "weird" internal other setting
> will be very hard to explain to users.
> 
> Fwiw, I'd even want to have pg_stat_statements enabled in core by
> default. That would be awesome UX. (And turning off could be as simple as
> setting compute_query_id=off.)

Technically, pg_stat_statements can turn on compute_query_id when it is
loaded, even if it is 'off' in postgresql.conf, right?  And
pg_stat_statements would know if an alternate hash method is being used,
right?

This is closer to Magnus's idea of having a three-value
compute_query_id, except it is more controlled by pg_stat_statements. 
Another idea would be to throw a user-visible warning if the
pg_stat_statements extension is loaded and compute_query_id is off.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Alvaro Herrera
On 2021-Apr-26, Bharath Rupireddy wrote:

> I agree that we can just be clear about the problem. Looks like the
> majority of the errors "conflicting or redundant options" are for
> redundant options. So, wherever "conflicting or redundant options"
> exists: 1) change the message to "option \"%s\" specified more than
> once" and remove parser_errposition if it's there because the option
> name in the error message would give the info with which user can
> point to the location

Hmm, I would keep the parser_errposition() even if the option name is
mentioned in the error message.  There's no harm in being a little
redundant, with both the option name and the error cursor showing the
same thing.

> 2) change the message to something like "option \"%s\" is conflicting
> with option \"%s\"".

Maybe, but since these would all be special cases, I think we'd need to
discuss them individually.  I would suggest that in order not to stall
this patch, these cases should all stay as "redundant or conflicting
options" -- that is, avoid any further change apart from exactly the
thing you came here to change.  You can submit a 0002 patch to change
those other errors.  That way, even if those changes end up rejected for
whatever reason, you still got your 0001 done (which would change the
bulk of "conflicting or redundant" error to the "option %s already
specified" error).  Some progress is better than none.

-- 
Álvaro Herrera39°49'30"S 73°17'W
 It does it in a really, really complicated way
 why does it need to be complicated?
 Because it's MakeMaker.




Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Bharath Rupireddy
On Mon, Apr 26, 2021 at 8:06 PM Alvaro Herrera  wrote:
>
> On 2021-Apr-26, Bharath Rupireddy wrote:
>
> > Thanks! IMO, it is better to change the error message to "option
> > \"%s\" specified more than once" instead of adding an error detail.
> > Let's hear other hackers' opinions.
>
> Many other places have the message "conflicting or redundant options",
> and then parser_errposition() shows the problem option.  That seems
> pretty unhelpful, so whenever the problem is the redundancy I would have
> the message be explicit about that, and about which option is the
> problem:
>   errmsg("option \"%s\" specified more than once", "someopt")
> Do note that wording it this way means only one translatable message,
> not dozens.

I agree that we can just be clear about the problem. It looks like the
majority of the "conflicting or redundant options" errors are for
redundant options. So, wherever "conflicting or redundant options"
exists: 1) change the message to "option \"%s\" specified more than
once" and remove parser_errposition if it's there, because the option
name in the error message would give the user enough information to
find the location; 2) change the message to something like "option
\"%s\" is conflicting with option \"%s\"".

> In some cases it is possible that you'd end up with two messages, one
> for "redundant" and one for the other ways for options to conflict with
> others; for example collationcmds.c has one that's not as obvious, and

And yes, we need to divide up the message for conflicting and
redundant options on a case-by-case basis.

In createdb: we just need to modify the error message to "conflicting
options" or we could just get rid of errdetail and have the error
message directly saying "LOCALE cannot be specified together with
LC_COLLATE or LC_CTYPE". Redundant options are just caught in the
above for loop in createdb.
if (dlocale && (dcollate || dctype))
    ereport(ERROR,
            (errcode(ERRCODE_SYNTAX_ERROR),
             errmsg("conflicting or redundant options"),
             errdetail("LOCALE cannot be specified together with LC_COLLATE or LC_CTYPE.")));

In AlterDatabase: we can remove parser_errposition because the option
name in the error message would give the right information.
if (list_length(stmt->options) != 1)
    ereport(ERROR,
            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
             errmsg("option \"%s\" cannot be specified with other options",
                    dtablespace->defname),
             parser_errposition(pstate, dtablespace->location)));

In compute_common_attribute: we can remove the goto duplicate_error
and change the message to "option \"%s\" specified more than once".

In DefineType: we need to rework the for loop.
Another problem I found with collationcmds.c is that it doesn't error
out if some of the options are specified more than once, as in the
statement below. I think the option checking "for loop" in
DefineCollation needs to be reworked.
CREATE COLLATION case_insensitive (provider = icu, provider =
someother, locale = '@colStrength=secondary', deterministic = false,
deterministic = true);

> force_quote/force_quote_all in COPY have their own thing too.

We can remove the errhint for force_not_null and force_null, along
with changing the error message wording to "option \"%s\" specified
more than once".

Looking at the instances of the "conflicting or redundant options"
error, doing the above requires a lot of code changes; I'm not sure
that will be acceptable.

One thing all the option-checking for loops have in common is that
they: 1) fetch the bool, int, float, or string values of the options;
2) check for redundancy. I feel we need to invent a common API to
which we pass 1) a list of allowed options for a particular command
(we can have these as a static data structure {allowed_option_name,
data_type}), and 2) a list of user-specified options; 3) the API will
return a list of fetched, i.e. parsed, values
{user_specified_option_name, data_type, value}. Maybe the API can
return a hash table of these values so that the callers can look up
the required option faster. The advantage of this API is that we
don't need many option-checking for loops in the code.
I'm not sure it is worth doing though. Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Issue in recent pg_stat_statements?

2021-04-26 Thread David Christensen
>
> > Is this an expected change, or is this in fact broken?  In previous
> revisions, this was showing the INSERT and SELECT at the very least.  I'm
> unclear as to why the regression test is still passing, so want to verify
> that I'm not doing something wrong in the testing.
>
> Yes, you want to look into the queryid functionality. See
>
> https://www.postgresql.org/message-id/flat/35457b09-36f8-add3-1d07-6034fa585ca8%40oss.nttdata.com
>
> Interface changes may still be coming in 14 for that. Or warnings.
>

Hmm, I'm unclear as to why you would potentially want to use
pg_stat_statements *without* this functionality.  At the very least, it
violates POLA — I spent the better part of a day thinking this was a bug
due to the expected behavior being so obvious I wouldn't have expected any
different.

In any case, this discussion is better had on a different thread.  Thanks
at least for explaining what I was seeing.

Best,

David


Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Christoph Berg
Re: Peter Eisentraut
> > Agreed.  If pg_stat_statements were zero-configuration today then
> > this would be an annoying new burden, but it isn't.
> 
> I think people can understand "add pg_stat_statements to
> shared_preload_libraries" and "install the extension".  You have to turn it
> on somehow after all.

Fwiw, I'd claim that pg_stat_statements *is* zero-configuration today.
You just have to load the module (= shared_preload_libraries), and it
will start working. Later you can run CREATE EXTENSION to actually see
the stats, but they are already being collected in the background.

> Now there is the additional burden of turning on this weird setting that no
> one understands.  That's a 50% increase in burden.
> 
> And almost no one will want to use a nondefault setting.
> 
> pg_stat_statements is pretty popular.  I think leaving in this requirement
> will lead to widespread confusion and complaints.

Ack, please make the default config (i.e. after setting 
shared_preload_libraries)
do something sensible. Having to enable some "weird" internal other setting
will be very hard to explain to users.

Fwiw, I'd even want to have pg_stat_statements enabled in core by
default. That would be awesome UX. (And turning it off could be as simple as
setting compute_query_id=off.)

Christoph




Re: Issue in recent pg_stat_statements?

2021-04-26 Thread Magnus Hagander
On Mon, Apr 26, 2021 at 5:15 PM David Christensen
 wrote:
>
> -hackers,
>
> So in doing some recent work on pg_stat_statements, I notice that while the 
> regression test still passes on HEAD, it appears that 4f0b096 (per git 
> bisect) changed/broke how this works compared to historical versions.
>
> Essentially, when doing a fresh install of pg_stat_statements on a new fresh 
> db (outside of the regression framework), it's not returning any rows from 
> the view.  I didn't see any related documentation changes, so as far as I 
> know, this should still be recording all statements as per normal.
>
> My full steps to reproduce from a clean Centos 7 install are attached.  I 
> have also been able to reproduce this on OS X and Fedora 33.  The TL;DR is:
>
> CREATE EXTENSION pg_stat_statements;
> CREATE TABLE foo (a int, b text);
> INSERT INTO foo VALUES (1,'a');
> SELECT * FROM foo;
> SELECT * FROM pg_stat_statements; -- returns nothing
>
> Settings for pg_stat_statements:
> postgres=# select name, setting from pg_settings where name like 
> 'pg_stat_statements%';
>                name                | setting
> -----------------------------------+---------
>  pg_stat_statements.max            | 5000
>  pg_stat_statements.save           | on
>  pg_stat_statements.track          | top
>  pg_stat_statements.track_planning | off
>  pg_stat_statements.track_utility  | on
> (5 rows)
>
> Is this an expected change, or is this in fact broken?  In previous 
> revisions, this was showing the INSERT and SELECT at the very least.  I'm 
> unclear as to why the regression test is still passing, so want to verify 
> that I'm not doing something wrong in the testing.

Yes, you want to look into the queryid functionality. See
https://www.postgresql.org/message-id/flat/35457b09-36f8-add3-1d07-6034fa585ca8%40oss.nttdata.com

Interface changes may still be coming in 14 for that. Or warnings.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Issue in recent pg_stat_statements?

2021-04-26 Thread David Christensen
-hackers,

So in doing some recent work on pg_stat_statements, I notice that while the
regression test still passes on HEAD, it appears that 4f0b096 (per git
bisect) changed/broke how this works compared to historical versions.

Essentially, when doing a fresh install of pg_stat_statements on a new
fresh db (outside of the regression framework), it's not returning any rows
from the view.  I didn't see any related documentation changes, so as far
as I know, this should still be recording all statements as per normal.

My full steps to reproduce from a clean Centos 7 install are attached.  I
have also been able to reproduce this on OS X and Fedora 33.  The TL;DR is:

CREATE EXTENSION pg_stat_statements;
CREATE TABLE foo (a int, b text);
INSERT INTO foo VALUES (1,'a');
SELECT * FROM foo;
SELECT * FROM pg_stat_statements; -- returns nothing

Settings for pg_stat_statements:
postgres=# select name, setting from pg_settings where name like
'pg_stat_statements%';
               name                | setting
-----------------------------------+---------
 pg_stat_statements.max            | 5000
 pg_stat_statements.save           | on
 pg_stat_statements.track          | top
 pg_stat_statements.track_planning | off
 pg_stat_statements.track_utility  | on
(5 rows)

Is this an expected change, or is this in fact broken?  In previous
revisions, this was showing the INSERT and SELECT at the very least.  I'm
unclear as to why the regression test is still passing, so want to verify
that I'm not doing something wrong in the testing.

Best,

David
#!/bin/bash -e

sudo yum install -y git
sudo yum groupinstall -y "Development Tools"
sudo yum install -y readline-devel zlib-devel 'perl(IPC::Run)' 'perl(Test::More)' 'perl(Time::HiRes)'

mkdir -p ~/Checkouts

cd ~/Checkouts
git clone https://github.com/postgres/postgres postgresql
cd ~/Checkouts/postgresql

git checkout 4f0b096
make distclean || true
./configure --enable-cassert --enable-debug CFLAGS="-ggdb -O0 -g3 -fno-omit-frame-pointer" --enable-tap-tests
make -j10 && \
( \
  cd contrib/pg_stat_statements/; \
  make check \
)

cd tmp_install/usr/local/pgsql
export PGUSER=postgres
export PGHOST=/tmp/
export LD_LIBRARY_PATH=./lib:$LD_LIBRARY_PATH

PATH=$HOME/tmp_install/usr/local/pgsql:$PATH
bin/initdb -D data -U postgres
bin/pg_ctl -D data -l logfile start
bin/psql -c 'ALTER SYSTEM SET shared_preload_libraries = $$pg_stat_statements$$'
bin/pg_ctl -D data -l logfile restart
bin/psql <

Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

2021-04-26 Thread Tom Lane
Dean Rasheed  writes:
> On Mon, 26 Apr 2021 at 15:09, Tom Lane  wrote:
>> Thanks for looking at that.  On reflection I think this must be so,
>> because those rewriter mechanisms were designed long before we had
>> trigger-updatable views, and rewriteTargetListIU has never added
>> tlist items like this for any other sort of view.  So the ability
>> to insert the original view output column has necessarily been there
>> from the beginning.  This is just getting rid of a weird implementation
>> difference between trigger-updatable views and other views.

> FWIW, I had a look at this too and came to much the same conclusion,
> so I think this is a safe change that makes the code a little neater
> and more efficient.

Again, thanks for looking!

I checked into the commit history (how'd we ever survive without "git
blame"?) and found that my argument above is actually wrong in detail.
Before cab5dc5da of 2013-10-18, rewriteTargetListIU expanded non-updated
columns for all views, not only trigger-updatable ones.  However, that
behavior itself goes back only to 2ec993a7c of 2010-10-10, which added
triggers on views; before that there was indeed no such expansion.
Of course the view rewrite mechanisms are ten or so years older than
that, so the conclusion that they weren't designed to need this still
stands.

regards, tom lane




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Peter Eisentraut

On 24.04.21 19:43, Tom Lane wrote:

Bruce Momjian  writes:

That's a pretty weird API.  I think we just need people to turn it on
like they are doing when they configure pg_stat_statements anyway.
pg_stat_statements already requires configuration anyway.


Agreed.  If pg_stat_statements were zero-configuration today then
this would be an annoying new burden, but it isn't.


I think people can understand "add pg_stat_statements to 
shared_preload_libraries" and "install the extension".  You have to turn 
it on somehow after all.


Now there is the additional burden of turning on this weird setting that 
no one understands.  That's a 50% increase in burden.


And almost no one will want to use a nondefault setting.

pg_stat_statements is pretty popular.  I think leaving in this 
requirement will lead to widespread confusion and complaints.





Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Alvaro Herrera
On 2021-Apr-26, Bharath Rupireddy wrote:

> Thanks! IMO, it is better to change the error message to "option
> \"%s\" specified more than once" instead of adding an error detail.
> Let's hear other hackers' opinions.

Many other places have the message "conflicting or redundant options",
and then parser_errposition() shows the problem option.  That seems
pretty unhelpful, so whenever the problem is the redundancy I would have
the message be explicit about that, and about which option is the
problem:
  errmsg("option \"%s\" specified more than once", "someopt")
Do note that wording it this way means only one translatable message,
not dozens.

In some cases it is possible that you'd end up with two messages, one
for "redundant" and one for the other ways for options to conflict with
others; for example collationcmds.c has one that's not as obvious, and
force_quote/force_quote_all in COPY have their own thing too.

I think we should do parser_errposition() wherever possible, in
addition to the wording change.

-- 
Álvaro Herrera   Valdivia, Chile
 really, I see PHP as like a strange amalgamation of C, Perl, Shell
 inflex: you know that "amalgam" means "mixture with mercury",
   more or less, right?
 i.e., "deadly poison"




Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

2021-04-26 Thread Dean Rasheed
On Mon, 26 Apr 2021 at 15:09, Tom Lane  wrote:
>
> Thanks for looking at that.  On reflection I think this must be so,
> because those rewriter mechanisms were designed long before we had
> trigger-updatable views, and rewriteTargetListIU has never added
> tlist items like this for any other sort of view.  So the ability
> to insert the original view output column has necessarily been there
> from the beginning.  This is just getting rid of a weird implementation
> difference between trigger-updatable views and other views.
>

FWIW, I had a look at this too and came to much the same conclusion,
so I think this is a safe change that makes the code a little neater
and more efficient.

Regards,
Dean




Re: tab-complete for ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2021-04-26 Thread Justin Pryzby
On Thu, Apr 22, 2021 at 04:40:35PM -0400, Alvaro Herrera wrote:
> Would anyone oppose me pushing this for tab-completing the new keywords
> of ALTER TABLE ..  DETACH PARTITION?

+1 to apply tab completion for v14

-- 
Justin




Re: [BUG] "FailedAssertion" reported when streaming in logical replication

2021-04-26 Thread Dilip Kumar
On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila  wrote:
>
> On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar  wrote:
> >
> > I am able to reproduce this and I think I have done the initial 
> > investigation.
> >
> > The cause of the issue is that this transaction has only one change,
> > and that change is XLOG_HEAP2_NEW_CID, which is added through
> > SnapBuildProcessNewCid.  Basically, when we add any change through
> > SnapBuildProcessChange we set the base snapshot, but when we add one
> > through SnapBuildProcessNewCid we don't set the base snapshot, because
> > there is nothing to be done for this change.  Now, this transaction is
> > identified as the biggest transaction with non-partial changes, and
> > now in ReorderBufferStreamTXN, it will return immediately because the
> > base_snapshot is NULL.
> >
>
> Your analysis sounds correct to me.
>

Thanks, I have attached a patch to fix this.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 16d47947002357cc37fdc3debcdf8c376e370188 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Mon, 26 Apr 2021 18:19:27 +0530
Subject: [PATCH v1] Don't select the transaction without base snapshot for
 streaming

While selecting the largest top transaction, we currently don't check
whether the transaction has a base snapshot, but if the transaction
doesn't have a base snapshot then we cannot stream it, so skip such
transactions.
---
 src/backend/replication/logical/reorderbuffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 5cb484f..981619f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3388,7 +3388,8 @@ ReorderBufferLargestTopTXN(ReorderBuffer *rb)
 		txn = dlist_container(ReorderBufferTXN, node, iter.cur);
 
 		if ((largest != NULL || txn->total_size > largest_size) &&
-			(txn->total_size > 0) && !(rbtxn_has_incomplete_tuple(txn)))
+			(txn->base_snapshot != NULL) && (txn->total_size > 0) &&
+			!(rbtxn_has_incomplete_tuple(txn)))
 		{
 			largest = txn;
 			largest_size = txn->total_size;
-- 
1.8.3.1



Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

2021-04-26 Thread Tom Lane
Amit Langote  writes:
> On Mon, Apr 26, 2021 at 9:40 AM Tom Lane  wrote:
>> I would think that this is a totally straightforward improvement,
>> but there's one thing in the comments for rewriteTargetListIU that
>> gives me a little pause: it says
>> 
>> * We must do items 1,2,3 before firing rewrite rules, else rewritten
>> * references to NEW.foo will produce wrong or incomplete results.
>> 
>> As far as I can tell, though, references to NEW values still do the
>> right thing.  I'm not certain whether any of the existing regression
>> tests really cover this point, but experimenting with the scenario shown
>> in the attached SQL file says that the DO ALSO rule gets the right
>> results.  It's possible that the expansion sequence is a bit different
>> than before, but we still end up with the right answer.

> I also checked what the rewriter and the planner do for the following
> DO ALSO insert:
> create rule logit as on update to v1 do also
> insert into log values(old.a1, new.a1, old.b2, new.b2);
> and can see that the insert ends up with the right targetlist
> irrespective of whether or not rewriteTargetListIU() adds an item for
> NEW.b2.

Thanks for looking at that.  On reflection I think this must be so,
because those rewriter mechanisms were designed long before we had
trigger-updatable views, and rewriteTargetListIU has never added
tlist items like this for any other sort of view.  So the ability
to insert the original view output column has necessarily been there
from the beginning.  This is just getting rid of a weird implementation
difference between trigger-updatable views and other views.

regards, tom lane




Re: How to test Postgres for any unaligned memory accesses?

2021-04-26 Thread Bharath Rupireddy
On Fri, Apr 23, 2021 at 7:25 PM Tom Lane  wrote:
> > I'm not sure this is the right way. I would like to know whether there
> > is a standard way of testing Postgres code for any unaligned memory
> > accesses. Thanks. Any help would be appreciated.
>
> Per c.h, late-model compilers have options for this:
>
>  * Testing can be done with "-fsanitize=alignment -fsanitize-trap=alignment"
>  * on clang, or "-fsanitize=alignment -fno-sanitize-recover=alignment" on gcc.

Thanks Tom!

I used the above gcc compiler flags to see if they catch memory
alignment issues. The way I tested on my dev system (an x86_64
platform with Ubuntu OS) was that I commented out the MAXALIGN of
specialSize in PageInit, then compiled the source code with and
without the alignment flags. make check failed with the
alignment-checking flags and passed without them.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




RE: use pg_strncasecmp to replace strncmp when compare "pg_"

2021-04-26 Thread tanghy.f...@fujitsu.com
On Friday, April 23, 2021 2:06 PM, Tom Lane  wrote

>>Kyotaro Horiguchi  writes:
>> That doesn't matter at all for now since we match schema identifiers
>> case-sensitively.  Maybe it should be a part of the patch in [1].
>
>Yeah --- maybe this'd make sense as part of a full patch to improve
>tab-complete.c's handling of case folding, but I'm suspicious that
>applying it on its own would just make things less consistent.

Thanks for your reply. Merged this patch to [1]. Any further comment on [1] is 
very welcome.

[1] 
https://www.postgresql.org/message-id/OS0PR01MB6113CA04E06D5BF221BC4FE2FB429%40OS0PR01MB6113.jpnprd01.prod.outlook.com

Regards,
Tang




RE: Support tab completion for upper character inputs in psql

2021-04-26 Thread tanghy.f...@fujitsu.com
Hi 

I've updated the patch to V7 based on the following comments. 

On Friday, April 23, 2021 11:58 AM, Kyotaro Horiguchi  
wrote
>All usages of pg_string_tolower don't need a copy.
>So don't we change the function to in-place converter?

Referring to your later discussion with Tom, I kept the code as it is.

>   if (completion_info_charp)
>+  {
>   e_info_charp = escape_string(completion_info_charp);
>+  if(e_info_charp[0] == '"')
>+  completion_case_sensitive = true;
>+  else
>+  {
>+  le_str = pg_string_tolower(e_info_charp);
>
>It seems right to lower completion_info_charp and ..2 but it is not
>right that change completion_case_sensitive here, which only affects
>the returned candidates.  

Agreed, code " completion_case_sensitive = true;" removed.

>By the way COMP_KEYWORD_CASE suggests that *keywords* are completed
>following the setting. However, they are not keywords, but
>identifiers. And some people (including me) might dislike that
>keywords and identifiers follow the same setting.  Specifically I
>sometimes want keywords to be upper-cased but identifiers (always) be
>lower-cased.

Changed my design based on your suggestion. Now upper-case input for
identifiers will always be turned to lower case (regardless of
COMP_KEYWORD_CASE), which I think can be accepted by most PG users.
  Eg: SET BYT / SET Byt
  output when apply V6 patch: SET BYTEA_OUTPUT
  output when apply V7 patch: SET bytea_output

On Friday, April 23, 2021 12:26 PM, Kyotaro Horiguchi  
wrote
>Oh, I accidentally found a dubious behavior.
>
>=# alter table public.
>public.c1public.d1public."t"   public.t public."tt"  
>
>The "t" and "tt" are needlessly lower-cased.

Good catch. I didn't think of the schema stuff before.
Bug fixed. Added TAP tests for this scenario.

Please let me know if you find any further issues in the patch. Any
suggestions are very welcome.

Regards,
Tang


V7-0001-Support-tab-completion-with-a-query-result-for-upper.patch
Description:  V7-0001-Support-tab-completion-with-a-query-result-for-upper.patch

