Re: Implementing Incremental View Maintenance

2021-04-26 Thread Yugo NAGATA
On Mon, 26 Apr 2021 15:46:21 +0900
Yugo NAGATA  wrote:

> On Tue, 20 Apr 2021 09:51:34 +0900
> Yugo NAGATA  wrote:
> 
> > On Mon, 19 Apr 2021 17:40:31 -0400
> > Tom Lane  wrote:
> > 
> > > Andrew Dunstan  writes:
> > > > This patch (v22c) just crashed for me with an assertion failure on
> > > > Fedora 31. Here's the stack trace:
> > > 
> > > > #2  0x0094a54a in ExceptionalCondition
> > > > (conditionName=conditionName@entry=0xa91dae "queryDesc->sourceText !=
> > > > NULL", errorType=errorType@entry=0x99b468 "FailedAssertion",
> > > > fileName=fileName@entry=0xa91468
> > > > "/home/andrew/pgl/pg_head/src/backend/executor/execMain.c",
> > > > lineNumber=lineNumber@entry=199) at
> > > > /home/andrew/pgl/pg_head/src/backend/utils/error/assert.c:69
> > > 
> > > That assert just got added a few days ago, so that's why the patch
> > > seemed OK before.
> > 
> > Thank you for letting me know. I'll fix it.
> 
> Attached is the fixed patch.
> 
> queryDesc->sourceText is no longer allowed to be NULL after commit b2668d8,
> so we now pass an empty string "" to refresh_matview_datafill() instead of
> NULL when maintaining views incrementally.

I am sorry, I forgot to include a fix for 8aba9322511.
Attached is the fixed version.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 


IVM_patches_v22e.tar.gz
Description: application/gzip


Re: Truncate in synchronous logical replication failed

2021-04-26 Thread Japin Li


On Mon, 26 Apr 2021 at 12:49, Amit Kapila  wrote:
> On Fri, Apr 23, 2021 at 7:18 PM osumi.takami...@fujitsu.com
>  wrote:
>>
>
> The latest patch looks good to me. I have made a minor modification
> and added a commit message in the attached. I would like to once again
> ask whether anybody else thinks we should backpatch this? Just a
> summary for anybody not following this thread:
>
> This patch fixes the Logical Replication of Truncate in synchronous
> commit mode. The Truncate operation acquires an exclusive lock on the
> target relation and indexes and waits for logical replication of the
> operation to finish at commit. Now because we are acquiring the shared
> lock on the target index to get index attributes in pgoutput while
> sending the changes for the Truncate operation, it leads to a
> deadlock.
>
> Actually, we don't need to acquire a lock on the target index as we
> build the cache entry using a historic snapshot and all the later
> changes are absorbed while decoding WAL. So, we wrote a special
> purpose function for logical replication to get a bitmap of replica
> identity attribute numbers where we get that information without
> locking the target index.
>
> We are planning not to backpatch this as there doesn't seem to be any
> field complaint about this issue since it was introduced in commit
> 5dfd1e5a in v11.

+1 for applying this only on HEAD.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: [BUG] "FailedAssertion" reported when streaming in logical replication

2021-04-26 Thread Dilip Kumar
On Mon, 26 Apr 2021 at 12:45 PM, tanghy.f...@fujitsu.com <
tanghy.f...@fujitsu.com> wrote:

> Hi
>
> I think I may have found a bug when using streaming in logical replication.
> Could anyone please take a look at this?
>
> Here's what I did to produce the problem.
> I set logical_decoding_work_mem and created multiple publications at
> publisher, created multiple subscriptions with "streaming = on" at
> subscriber.
> However, an assertion failed at publisher when I COMMIT and ROLLBACK
> multiple transactions at the same time.
>
> The log reported a FailedAssertion:
> TRAP: FailedAssertion("txn->size == 0", File: "reorderbuffer.c", Line:
> 3465, PID: 911730)
>
> The problem happens both in synchronous mode and asynchronous mode. When
> there are only one or two publications, it doesn't seem to happen. (In my
> case, there are 8 publications and the failure always happens.)
>
> The scripts and the log are attached. It took me about 4 minutes to run
> the script on my machine.
> Please contact me if you need more specific info for the problem.



Thanks for reporting. I will look into it.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: [PATCH] add concurrent_abort callback for output plugin

2021-04-26 Thread Amit Kapila
On Mon, Apr 26, 2021 at 7:05 AM Peter Smith  wrote:
>
> Hi,
>
> While testing another WIP patch [1] a clashing GID problem was found,
> which gives us apply worker errors like:
>
> 2021-04-26 10:07:12.883 AEST [22055] ERROR:  transaction identifier
> "pg_gid_16403_608" is already in use
> 2021-04-26 10:08:05.149 AEST [22124] ERROR:  transaction identifier
> "pg_gid_16403_757" is already in use
>
> These GID clashes were traced back to a problem of the
> concurrent-abort logic: when "streaming" is enabled the
> concurrent-abort logic was always sending "prepare" even though a
> "stream_prepare" had already been sent.
>
> PSA a patch to correct this.
>

Your patch looks good to me, so pushed! Thanks for the report and patch.

-- 
With Regards,
Amit Kapila.




Re: [PATCH] Re: pg_identify_object_as_address() doesn't support pg_event_trigger oids

2021-04-26 Thread Michael Paquier
On Fri, Apr 23, 2021 at 09:33:36AM +0200, Joel Jacobson wrote:
> Also, since this is a problem in v13 as well, maybe this should be
> back-ported?  I think it's a bug since both
> pg_identify_object_as_address() and event triggers exist in v13, so
> the function should work there as well; otherwise users would need
> to do work-arounds for event triggers.

No objections from here to do something in the back-branches.  We cannot
have a test for event triggers in object_address.sql, and it would be
better to keep it in a parallel set (see 676858b for example).  Could you
however add a small test for that in event_trigger.sql?  It would be
good to check all three functions: pg_identify_object(),
pg_identify_object_as_address() and pg_get_object_address().
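Something along these lines, for example (object names are hypothetical,
and it assumes the fix is applied so that event trigger OIDs are
supported):

CREATE FUNCTION evttrig_fn() RETURNS event_trigger
    LANGUAGE plpgsql AS $$ BEGIN NULL; END; $$;
CREATE EVENT TRIGGER evttrig_test ON ddl_command_start
    EXECUTE FUNCTION evttrig_fn();

SELECT pg_identify_object('pg_event_trigger'::regclass, e.oid, 0),
       pg_identify_object_as_address('pg_event_trigger'::regclass, e.oid, 0)
FROM pg_event_trigger e WHERE e.evtname = 'evttrig_test';

-- round-trip the address back through pg_get_object_address()
SELECT pg_get_object_address(a.type, a.object_names, a.object_args)
FROM pg_event_trigger e,
     LATERAL pg_identify_object_as_address('pg_event_trigger'::regclass,
                                            e.oid, 0) a
WHERE e.evtname = 'evttrig_test';

DROP EVENT TRIGGER evttrig_test;
DROP FUNCTION evttrig_fn();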
--
Michael


signature.asc
Description: PGP signature


Re: Performance Evaluation of Result Cache by using TPC-DS

2021-04-26 Thread Yuya Watari
Hello David,

Thank you for running experiments on your machine and I really
appreciate your deep analysis.

Your results are very interesting. In 5 queries, the result cache is
cheaper but slower. In particular, for query 88, although the cost with
the result cache is cheaper, there is a 34.23% degradation in query
execution time. This is a big regression.

> What can be done?
> ===
>
> I'm not quite sure. The biggest problem is add_path's fuzziness.  I
> could go and add some penalty cost to Result Cache paths so that
> they're picked less often.  If I make that penalty more than 1% of the
> cost, then that should get around add_path rejecting the other join
> method that is not fuzzily good enough.  Adding some sort of penalty
> might also help the 5 of 23 queries that were cheaper and slower than
> the alternative.

Based on your idea, I have implemented a penalty for the cost of the
result cache. I attached the patch to this e-mail. Please note that
this patch is experimental, so it lacks comments, documentation,
tests, etc. This patch adds a new GUC, resultcache_cost_factor. The
cost of the result cache is multiplied by this factor. If the factor
is greater than 1, we impose a penalty on the result cache.

The cost calculation has been modified as follows.

=
@@ -2541,6 +2542,13 @@ cost_resultcache_rescan(PlannerInfo *root, ResultCachePath *rcpath,
      */
     startup_cost += cpu_tuple_cost;

+    /*
+     * We multiply the costs by resultcache_cost_factor to control the
+     * aggressiveness of result cache.
+     */
+    startup_cost *= resultcache_cost_factor;
+    total_cost *= resultcache_cost_factor;
=
@@ -1618,9 +1618,14 @@ create_resultcache_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
      * Add a small additional charge for caching the first entry.  All the
      * harder calculations for rescans are performed in
      * cost_resultcache_rescan().
+     *
+     * We multiply the costs by resultcache_cost_factor to control the
+     * aggressiveness of result cache.
      */
-    pathnode->path.startup_cost = subpath->startup_cost + cpu_tuple_cost;
-    pathnode->path.total_cost = subpath->total_cost + cpu_tuple_cost;
+    pathnode->path.startup_cost =
+        (subpath->startup_cost + cpu_tuple_cost) * resultcache_cost_factor;
+    pathnode->path.total_cost =
+        (subpath->total_cost + cpu_tuple_cost) * resultcache_cost_factor;
     pathnode->path.rows = subpath->rows;

return pathnode;
=
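For example, with the experimental patch applied, one could do something
like the following (the GUC exists only with the patch applied; the
tables and query are purely hypothetical):

SET resultcache_cost_factor = 5.0;   -- values > 1 penalize Result Cache paths
EXPLAIN (COSTS OFF)
SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;
RESET resultcache_cost_factor;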

As this factor increases, the result cache becomes less and less
likely to be chosen. I conducted an experiment to determine the
threshold of the factor. I ran the EXPLAIN (not EXPLAIN ANALYZE)
command with different factors. The threshold is defined as the factor
at which the result cache no longer appears in the query plan; a factor
above the threshold means the planner does not select the result
cache.

This experiment was conducted on my machine, so the results may differ
from those on your machine.

I attached the thresholds as Excel and PDF files. The thresholds vary
from 1.1 to 9.6. The threshold of 9.6 indicates that a penalty of 860%
must be imposed to avoid the result cache.

The Excel and PDF files also contain a chart showing the relationship
between speedup ratio and threshold. Unfortunately, there is no clear
correlation. If we set the factor to 5, we can avoid the 11%
degradation of query 62 because that query's threshold is 4.7.
However, we then lose the 62% speedup of query 61 with this factor.
Therefore, this factor does not work well and should be reconsidered.

In this patch, I impose a penalty on the result cache node. An
alternative way is to increase the cost of a nested loop that contains
a result cache. If so, there is no need to impose a penalty of 860%,
but a penalty of about 1% is enough.

Introducing resultcache_cost_factor is not a fundamental solution.
However, it is reasonable to offer a way of controlling the
aggressiveness of the result cache.

Again, this patch is experimental, so it needs feedback and modifications.

Best regards,
Yuya Watari


experimental-resultcache-factor.patch
Description: Binary data


experimental-result.xlsx
Description: MS-Excel 2007 spreadsheet


experimental-result.pdf
Description: Adobe PDF document


Re: Table refer leak in logical replication

2021-04-26 Thread Amit Langote
On Mon, Apr 26, 2021 at 3:27 PM Michael Paquier  wrote:
> On Fri, Apr 23, 2021 at 09:38:01PM +0900, Amit Langote wrote:
> > On Thu, Apr 22, 2021 at 1:45 PM Michael Paquier  wrote:
> >> On the other hand, the tests for partitions have much more value IMO,
> >> but looking closely I think that we can do better:
> >> - Create triggers on more relations of the partition tree,
> >> particularly to also check when a trigger does not fire.
> >
> > It seems you're suggesting to adopt 003's trigger firing tests for
> > partitions in 013, but would we gain much by that?
>
> I was suggesting the opposite, aka apply the trigger design that you
> are introducing in 013 within 003.  But that may not be necessary
> either :)

You mentioned adding "triggers on more relations of the partition
trees", so I thought you were talking about 013; 003 doesn't test
partitioning at all at the moment.

> >> - Create some extra BEFORE triggers perhaps?
> >
> > Again, that seems separate from what we're trying to do here.  AIUI,
> > our aim here is to expand coverage for after triggers, and not really
> > that of the trigger functionality proper, because nothing seems broken
> > about it, but that of the trigger target relation opening/closing.
> > ISTM that's what you're talking about below...
>
> Exactly.  My review of the worker code is make me feeling that
> reworking this code could easily lead to some incorrect behavior, so
> I'd rather be careful with a couple of extra triggers created across
> the partition tree, down to the partitions on which the triggers are
> fired actually.

Ah, okay.  You are talking about improving the coverage in general,
NOT in the context of the fix committed in f3b141c482552.

However, note that BEFORE triggers work the same no matter whether the
target relation is directly mentioned in the apply message or found as
a result of tuple routing.  That's because the routines in
execReplication.c (like ExecSimpleRelationInsert) and in
nodeModifyTable.c (like ExecInsert) pass the ResultRelInfo *directly*
to the BR trigger.c routines.  So, there's no need for the complexity
of the code and data structures for looking up trigger target
relations, such as what AFTER triggers need --
ExecGetTargetResultRel().  Given that, it's understandable to have
more coverage for the AFTER trigger case like that added by the patch
you just committed.

> >> similarly to what issues_sql_like() or connect_{fails,ok}() do
> >> already, but that would mean increasing the log level and we don't do
> >> that to ease the load of the nodes.
> >
> > ...sorry, I am not very familiar with our Perl testing infra.  Is
> > there some script that already does something like this?  For example,
> > checking in the logs generated by a "node" that no instance of a
> > certain WARNING is logged?
>
> See for example how we test for SQL patterns with the backend logs
> using issues_sql_like(), or the more recent connect_ok() and
> connect_fails().  This functions match regexps with the logs of the
> backend for patterns.

So I assume those pattern-matching functions would catch, for example,
relation leak warnings in case they get introduced later, right?  If
so, I can see the merit of trying the idea.

>  I am not sure if that's worth the extra cycles
> though.  I also feel that we may want a more centralized place as well
> to check such things, with more matching patterns, like at the end of
> one run on a set of log files?

I guess that makes sense.

> So, after a set of edits related to the format of the SQL queries, the
> object names and some indentation (including a perltidy run), I have
> applied this patch to close the loop.

Thanks a lot.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Addition of authenticated ID to pg_stat_activity

2021-04-26 Thread Michael Paquier
On Sun, Apr 25, 2021 at 10:14:43PM -0500, Justin Pryzby wrote:
> That part looks to be a copy+paste error.

Sorry about that.  I have fixed that on my own branch.

>> +   
>> pg_stat_activity.authenticated_id
>>  field.
>> +   If this value is specified without units, it is taken as bytes.
>> +   The default value is 128 bytes.
>> +   This parameter can only be set at server start.
>> +   
>> +  
>> + 
> 
> I think many/most things in log/CSV should also go in PSA, and vice versa.
>
> It seems like there should be a comment about this - in both places - to avoid
> forgetting it in the future.

I am not sure what you mean here, nor do I see how this is related to
what is proposed on this thread.
--
Michael


signature.asc
Description: PGP signature


Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Bharath Rupireddy
On Mon, Apr 26, 2021 at 7:00 AM houzj.f...@fujitsu.com
 wrote:
>
> > > Based on the above, we plan to move forward with approach 2) (the declarative
> > > idea).
> >
> > IIUC, the declarative behaviour idea attributes parallel 
> > safe/unsafe/restricted
> > tags to each table with default being the unsafe. Does it mean for a 
> > parallel
> > unsafe table, no parallel selects, inserts (may be updates) will be picked 
> > up? Or
> > is it only the parallel inserts? If both parallel inserts, selects will be 
> > picked, then
> > the existing tables need to be adjusted to set the parallel safety tags 
> > while
> > migrating?
>
> Thanks for looking into this.

Thanks for the responses.

> The parallel attributes on a table describe its parallel safety when a user
> does some data-modification operations on it.
> So, it only limits the use of parallel plans for INSERT/UPDATE/DELETE.

In that case, isn't it better to use the terminology "PARALLEL DML
SAFE/UNSAFE/RESTRICTED" in the code and docs? This way, it will be
clear that these tags don't affect parallel selects.

> > Another point, what does it mean a table being parallel restricted?
> > What should happen if it is present in a query of other parallel safe 
> > tables?
>
> If a table is parallel restricted, it means the table contains some parallel
> restricted objects (such as parallel restricted functions in index
> expressions).
> And in the planner, it means a parallel insert plan will not be chosen, but
> it can use a parallel select (with a serial insert).

Makes sense. I assume that when there is a parallel restricted
function associated with a table, the current design doesn't force the
planner to choose a parallel select; that is left up to the planner.
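
For concreteness, a rough example of a table that would count as
parallel restricted under the proposed design (names are hypothetical),
because its index expression calls a PARALLEL RESTRICTED function:

CREATE FUNCTION f_restricted(int) RETURNS int
    IMMUTABLE PARALLEL RESTRICTED LANGUAGE sql AS 'SELECT $1 + 1';
CREATE TABLE t_pr (a int);
CREATE INDEX t_pr_expr_idx ON t_pr ((f_restricted(a)));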

> > I may be wrong here: IIUC, the main problem we are trying to solve with the
> > declarative approach is to let the user decide parallel safety for 
> > partition tables
> > as it may be costlier for postgres to determine it. And for the normal 
> > tables we
> > can perform parallel safety checks without incurring much cost. So, I think 
> > we
> > should restrict the declarative approach to only partitioned tables?
>
> Yes, we are trying to avoid overhead when checking parallel safety.
> The cost of checking all the partitions' parallel safety is the biggest one.
> Another is the safety check of index expressions.
> Currently, for INSERT, the planner does not open the target table's indexinfo
> and does not parse the expressions of its indexes. We need to parse the
> expressions in the planner if we want to do a parallel safety check for them,
> which can bring some overhead (it will open the index and do the parse in the
> executor again).
> So, we plan to skip all of the extra checks and let the user take
> responsibility for the safety.
> Of course, maybe we can try to pass the indexinfo to the executor, but that
> needs some further refactoring and I will take a look into it.

Will the planner parse and check parallel safety of index (where
clause) expressions in the case of SELECTs? I'm not sure of this. But
if it does, maybe we could do the same thing for parallel DML as well
for normal tables? What is the overhead of parsing index expressions?
If the cost of checking index expressions' parallel safety is heavy in
the case of normal tables, then the current design, i.e. attributing a
parallel safety tag to all the tables, makes sense.

I was actually thinking that we will have the declarative approach
only for partitioned tables as it is the main problem we are trying to
solve with this design. Something like: users will run
pg_get_parallel_safety to see the parallel unsafe objects associated
with a partitioned table by looking at all of its partitions and be
able to set a parallel dml safety tag to only partitioned tables.
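
To make the intended workflow concrete, a purely hypothetical sketch
(neither the exact function signature nor the ALTER TABLE syntax is
settled; this only illustrates the idea being discussed):

SELECT * FROM pg_get_parallel_safety('measurement');  -- list unsafe/restricted objects
ALTER TABLE measurement PARALLEL DML SAFE;             -- user declares the table safe
INSERT INTO measurement SELECT * FROM staging;         -- may now use a parallel plan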

> > While reading the design, I came across this "erroring out during execution 
> > of a
> > query when a parallel unsafe function is detected". If this is correct, 
> > isn't it
> > warranting users to run pg_get_parallel_safety to know the parallel unsafe
> > objects, set parallel safety to all of them if possible, otherwise disable
> > parallelism to run the query? Isn't this burdensome?
>
> How about:
> if parallel unsafe objects are detected in the executor, then alter the table
> to parallel unsafe internally.
> So, the user does not need to alter it manually.

I don't think this is a good idea, because if there are multiple
tables involved in the query, do you alter all the tables? Usually, we
error out on finding the first such unsafe object.

> > Instead, how about
> > postgres retries the query upon detecting the error that came from a 
> > parallel
> > unsafe function during execution, disable parallelism and run the query? I 
> > think
> > this kind of retry query feature can be built outside of the core postgres, 
> > but
> > IMO it will be good to have inside (of course configurable). IIRC, the 
> > Teradata
> > database has a Query Retry feature.
> >
>
> Thanks for the suggestion.
> The retry query feature sounds lik

Re: Asynchronous Append on postgres_fdw nodes.

2021-04-26 Thread Andrey V. Lepikhov

On 4/23/21 8:12 AM, Etsuro Fujita wrote:

I have committed the patch.
Small mistake I found: if no tuples were received from a foreign
partition, EXPLAIN shows that we never executed the node. For example,
if we have 0 tuples in f1 and 100 tuples in f2:

Query:
EXPLAIN (ANALYZE, VERBOSE, TIMING OFF, COSTS OFF)
SELECT * FROM (SELECT * FROM f1 UNION ALL SELECT * FROM f2) AS q1
LIMIT 101;

Explain:
 Limit (actual rows=100 loops=1)
   Output: f1.a
   ->  Append (actual rows=100 loops=1)
 ->  Async Foreign Scan on public.f1 (never executed)
   Output: f1.a
   Remote SQL: SELECT a FROM public.l1
 ->  Async Foreign Scan on public.f2 (actual rows=100 loops=1)
   Output: f2.a
   Remote SQL: SELECT a FROM public.l2

The patch in the attachment fixes this.

--
regards,
Andrey Lepikhov
Postgres Professional
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index e201b5404e..a960ada441 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -6857,8 +6857,13 @@ produce_tuple_asynchronously(AsyncRequest *areq, bool fetch)
 		}
 		else
 		{
-			/* There's nothing more to do; just return a NULL pointer */
-			result = NULL;
+			/*
+			 * There's nothing more to do; just check it and get an empty slot
+			 * from the child node.
+			 */
+			result = ExecProcNode((PlanState *) node);
+			Assert(TupIsNull(result));
+
 			/* Mark the request as complete */
 			ExecAsyncRequestDone(areq, result);
 		}


Re: [HACKERS] Cached plans and statement generalization

2021-04-26 Thread Юрий Соколов
On Sun, 1 Mar 2020 at 22:26, Tom Lane  wrote:
>
> Konstantin Knizhnik  writes:
> > [ autoprepare-extended-4.patch ]
>
> The cfbot is showing that this doesn't apply anymore; there's
> some doubtless-trivial conflict in prepare.c.
>
> However ... TBH I've been skeptical of this whole proposal from the
> beginning, on the grounds that it would probably hurt more use-cases
> than it helps.  The latest approach doesn't really do anything to
> assuage that fear, because now that you've restricted it to extended
> query mode, the feature amounts to nothing more than deliberately
> overriding the application's choice to use unnamed rather than named
> (prepared) statements.  How often is that going to be a good idea?
> And when it is, wouldn't it be better to fix the app?  The client is
> likely to have a far better idea of which statements would benefit
> from this treatment than the server will; and in this context,
> the client-side changes needed would really be trivial.  The original
> proposal, scary as it was, at least supported the argument that you
> could use it to improve applications that were too dumb/hoary to
> parameterize commands for themselves.

The theme of this thread:
- it is not possible to reliably "fix the app" when pgbouncer or internal
  driver connection multiplexing is used.
- another widespread case is frameworks (ORMs and the like).
  There is no way to prepare a concrete query because it is buried under
  levels of abstraction.

The whole "autoprepare" thing is a workaround for the absence of "really
trivial client-side changes" under these conditions.

regards,
Yura




Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

2021-04-26 Thread Arne Roland
Hi,


thanks for looking into it!


For some reason the patch doesn't apply at my end; could you repost one
based on current master?


> 1) If get_cheapest_fractional_path_for_pathkeys returns NULL, it's not
> clear whether to default to cheapest_startup or cheapest_total. We might
> also consider an incremental sort, in which case the startup cost
> matters (for Sort it does not). So we may need an enhanced version of
> get_cheapest_fractional_path_for_pathkeys that generates such paths.
>
> 2) Same for the cheapest_total - maybe there's a partially sorted path,
> and using it with incremental sort on top would be better than using
> cheapest_total_path + sort.

> I'd argue that this patch does not need to add the Incremental Sort
> capabilities in (1) and (2) - it's just another place where we decided
> not to consider this sort variant yet.


I'd say your reasoning is sound. If I wanted to get better fractional costs for
incremental sorts, I'd look at get_cheapest_fractional_path first. That sounds
more important than generate_orderedappend_paths. Either way, I'd say that is a
completely separate issue and should be looked at separately.


>3) Not sure if get_cheapest_fractional_path_for_pathkeys should worry

> about require_parallel_safe, just like the other functions nearby.

I think it should. We have a ParallelAppend node after all.
I'm not really familiar with the way get_cheapest_fractional_path_for_pathkeys
is used, but a quick search suggests that build_minmax_path was thus far the
only caller. And minmax paths are never parallel safe anyway. I think that is
the reason it doesn't do that already.


Regards

Arne


From: Tomas Vondra 
Sent: Saturday, April 17, 2021 1:52:19 AM
To: pgsql-hackers
Subject: PATCH: generate fractional cheapest paths in 
generate_orderedappend_path

Hi,

This patch is a WIP fix for the issue described in [1], where the
planner picks a more expensive plan with partition-wise joins enabled,
and disabling this option produces a cheaper plan. That's a bit strange
because with the option disabled we consider *fewer* plans, so we should
not be able to generate a cheaper plan.

The problem lies in generate_orderedappend_paths(), which builds two
types of append paths - with minimal startup cost, and with minimal
total cost. That however does not work for queries with LIMIT, which
also need to consider at fractional cost, but the path interesting from
this perspective may be eliminated by other paths.

Consider three paths (this comes from the reproducer shared in [1]):

  A: nestloop_path   startup 0.585000     total 35708.292500
  B: nestloop_path   startup 0.292500     total 150004297.292500
  C: mergejoin_path  startup 9748.112737  total 14102.092737

With some reasonable LIMIT value (e.g. 5% of the data), we really want
to pick path A. But that path is dominated both in startup cost (by B)
and total cost (path C). Hence generate_orderedappend_paths() will
ignore it, and we'll generate a more expensive LIMIT plan.
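
To make that concrete: comparing the paths on fractional cost, roughly
startup + fraction * (total - startup), with fraction = 0.05 gives

  A: 0.585000    + 0.05 * (35708.292500     - 0.585000)    ~ 1786.0
  B: 0.292500    + 0.05 * (150004297.292500 - 0.292500)    ~ 7500215.1
  C: 9748.112737 + 0.05 * (14102.092737     - 9748.112737) ~ 9965.8

so path A is clearly the best choice for this fraction, even though it
wins on neither startup nor total cost.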

In [2] Tom proposed to modify generate_orderedappend_paths() to also
consider the fractional cheapest_path, just like we do for startup and
total costs. The attached patch does exactly that, and it does seem to
do the trick.

There are some loose ends, though:

1) If get_cheapest_fractional_path_for_pathkeys returns NULL, it's not
clear whether to default to cheapest_startup or cheapest_total. We might
also consider an incremental sort, in which case the startup cost
matters (for Sort it does not). So we may need an enhanced version of
get_cheapest_fractional_path_for_pathkeys that generates such paths.

2) Same for the cheapest_total - maybe there's a partially sorted path,
and using it with incremental sort on top would be better than using
cheapest_total_path + sort.

3) Not sure if get_cheapest_fractional_path_for_pathkeys should worry
about require_parallel_safe, just like the other functions nearby.

I'd argue that this patch does not need to add the Incremental Sort
capabilities in (1) and (2) - it's just another place where we decided
not to consider this sort variant yet.

I'm not sure how much this depends on partition-wise join, and why
disabling it generates the right plan. The reproducer uses that, but
AFAICS generate_orderedappend_paths() works like this since 2010 (commit
11cad29c915). I'd bet the issue exists since then and it's possible to
get similar cases even without partition-wise join.

I can reproduce it on PostgreSQL 12, though (which however supports
partition-wise join).

Not sure whether this should be backpatched. We didn't get any reports
until now, so it doesn't seem like a pressing issue. OTOH most users
won't actually notice this, they'll just get worse plans without
realizing there's a better option.


regards

[1]
https://www.postgresql.org/message-id/011937a3-7427-b99f-13f1-c07a127cf94c%40enterprisedb.com

[2] https://www.postgresql.org/message-id/4006636.1618577893%40sss.pg

Re: [HACKERS] logical decoding of two-phase transactions

2021-04-26 Thread vignesh C
On Wed, Apr 21, 2021 at 12:13 PM Peter Smith  wrote:
>
> On Tue, Apr 20, 2021 at 3:45 PM Peter Smith  wrote:
> >
> > Please find attached the latest patch set v73`*
> >
> > Differences from v72* are:
> >
> > * Rebased to HEAD @ today (required because v72-0001 no longer applied 
> > cleanly)
> >
> > * Minor documentation correction for protocol messages for Commit Prepared 
> > ('K')
> >
> > * Non-functional code tidy (mostly proto.c) to reduce overloading
> > different meanings to same member names for prepare/commit times.
>
>
> Please find attached a re-posting of patch set v73*
>
> This is the same as yesterday's v73 but with a contrib module compile
> error fixed.

Thanks for the updated patch, a few comments:
1) Should "final_lsn not set in begin message" be "prepare_lsn not set
in begin message"
+logicalrep_read_begin_prepare(StringInfo in,
LogicalRepPreparedTxnData *begin_data)
+{
+   /* read fields */
+   begin_data->prepare_lsn = pq_getmsgint64(in);
+   if (begin_data->prepare_lsn == InvalidXLogRecPtr)
+   elog(ERROR, "final_lsn not set in begin message");

2) Should "These commands" be "ALTER SUBSCRIPTION ... REFRESH
PUBLICATION and ALTER SUBSCRIPTION ... SET/ADD PUBLICATION ..." as
copy_data cannot be specified with alter subscription .. drop
publication.
+   These commands also cannot be executed with copy_data =
true
+   when the subscription has two_phase commit enabled. See
+   column subtwophasestate of
+to know the actual
two-phase state.

3) Byte1('A') should be Byte1('r') as we
have defined LOGICAL_REP_MSG_ROLLBACK_PREPARED as r.
+Rollback Prepared
+
+
+
+
+
+
+Byte1('A')
+
+Identifies this message as the rollback of a
two-phase transaction message.
+
+

4) Should "Check if the prepared transaction with the given GID and
lsn is around." be
"Check if the prepared transaction with the given GID, lsn & timestamp
is around."
+/*
+ * LookupGXact
+ * Check if the prepared transaction with the given GID
and lsn is around.
+ *
+ * Note that we always compare with the LSN where prepare ends because that is
+ * what is stored as origin_lsn in the 2PC file.
+ *
+ * This function is primarily used to check if the prepared transaction
+ * received from the upstream (remote node) already exists. Checking only GID
+ * is not sufficient because a different prepared xact with the same GID can
+ * exist on the same node. So, we are ensuring to match origin_lsn and
+ * origin_timestamp of prepared xact to avoid the possibility of a match of
+ * prepared xact from two different nodes.
+ */

5) Should we change "The LSN of the prepare." to "The LSN of the begin prepare."
+Begin Prepare
+
+
+
+
+
+
+Byte1('b')
+
+Identifies this message as the beginning of a
two-phase transaction message.
+
+
+
+
+Int64
+
+The LSN of the prepare.
+
+


6) Similarly in cases of "Commit Prepared" and "Rollback Prepared"

7) No need to initialize has_subrels as we will always assign the
value returned by HeapTupleIsValid
+HasSubscriptionRelations(Oid subid)
+{
+   Relationrel;
+   int nkeys = 0;
+   ScanKeyData skey[2];
+   SysScanDesc scan;
+   boolhas_subrels = false;
+
+   rel = table_open(SubscriptionRelRelationId, AccessShareLock);

8) We could include errhint, like errhint("Option \"two_phase\"
specified more than once") to specify a more informative error
message.
+   else if (strcmp(defel->defname, "two_phase") == 0)
+   {
+   if (two_phase_option_given)
+   ereport(ERROR,
+   (errcode(ERRCODE_SYNTAX_ERROR),
+errmsg("conflicting
or redundant options")));
+   two_phase_option_given = true;
+
+   data->two_phase = defGetBoolean(defel);
+   }

9) We have a lot of function parameters for
parse_subscription_options; should we change it to use a struct?
@@ -69,7 +69,8 @@ parse_subscription_options(List *options,
   char **synchronous_commit,
   bool *refresh,
   bool *binary_given, bool *binary,
-  bool *streaming_given, bool *streaming)
+  bool *streaming_given, bool *streaming,
+  bool *twophase_given, bool *twophase)

10) Should we change " errhint("Use ALTER SUBSCRIPTION ...SET
PUBLICATION with refresh = false, or with copy_data = false, or use
DROP/CREATE SUBSCRIPTION.")" to  "errhint("Use ALTER SUBSCRIPTION
...SET/ADD PUBLICATION with refresh = false, or with copy_data =
false.")" as we don't support copy_data in ALTER subscription ... DROP
publication.
+  

RE: Parallel INSERT SELECT take 2

2021-04-26 Thread houzj.f...@fujitsu.com
> > The parallel attributes in table means the parallel safety when user does 
> > some
> data-modification operations on it.
> > So, It only limit the use of parallel plan when using INSERT/UPDATE/DELETE.
> 
> In that case, isn't it better to use the terminology "PARALLEL DML
> SAFE/UNSAFE/RESTRICTED" in the code and docs? This way, it will be clear that
> these tags don't affect parallel selects.

Makes sense; I recall I have heard a similar suggestion before.
If there are no other objections, I plan to change the terminology to
PARALLEL DML in my next version of the patches.

> > > I may be wrong here: IIUC, the main problem we are trying to solve
> > > with the declarative approach is to let the user decide parallel
> > > safety for partition tables as it may be costlier for postgres to
> > > determine it. And for the normal tables we can perform parallel
> > > safety checks without incurring much cost. So, I think we should restrict 
> > > the
> declarative approach to only partitioned tables?
> >
> > Yes, we are tring to avoid overhead when checking parallel safety.
> > The cost to check all the partition's parallel safety is the biggest one.
> > Another is the safety check of index's expression.
> > Currently, for INSERT, the planner does not open the target table's
> > indexinfo and does not parse the expression of the index. We need to
> > parse the expression in planner if we want to do parallel safety check for 
> > it
> which can bring some overhead(it will open the index the do the parse in
> executor again).
> > So, we plan to skip all of the extra check and let user take responsibility 
> > for
> the safety.
> > Of course, maybe we can try to pass the indexinfo to the executor but it 
> > need
> some further refactor and I will take a look into it.
> 
> Will the planner parse and check parallel safety of index((where
> clause) expressions in case of SELECTs? I'm not sure of this. But if it does, 
> maybe
> we could do the same thing for parallel DML as well for normal tables? 

The planner does not check the index expression, because the expression will
not be used in a SELECT.
I think the expression is only used when a tuple is inserted into the index.

> the overhead of parsing index expressions? If the cost is heavy for checking
> index expressions parallel safety in case of normal tables, then the current
> design i.e. attributing parallel safety tag to all the tables makes sense.

Currently, index expressions and predicates are stored in text format.
We need to use stringToNode(expression/predicate) to parse them.
Some committers think doing this twice does not look good,
unless we find some way to pass the parsed info to the executor to avoid the
second parse.

> I was actually thinking that we will have the declarative approach only for
> partitioned tables as it is the main problem we are trying to solve with this
> design. Something like: users will run pg_get_parallel_safety to see the 
> parallel
> unsafe objects associated with a partitioned table by looking at all of its
> partitions and be able to set a parallel dml safety tag to only partitioned 
> tables.

We originally wanted the declarative approach to cover both normal and
partitioned tables.
In this way, it will not bring any overhead to the planner and looks consistent.
But do you think we should put some really cheap safety checks into the planner?

> 
> >0001: provide a utility function pg_get_parallel_safety('table_name').
> >
> >The function returns records of (objid, classid, parallel_safety) that
> >represent the parallel safety of objects that determine the parallel safety 
> >of
> the specified table.
> >Note: The function only outputs objects that are not parallel safe.
> 
> If it returns only parallel "unsafe" objects and not "safe" or "restricted" 
> objects,
> how about naming it to pg_get_table_parallel_unsafe_objects("table_name")?

Currently, the function returns both unsafe and restricted objects (I thought
restricted is also not safe),
because we thought users only care about the objects that affect the use of a
parallel plan.

> This way we could get rid of parallel_safety in the output record? If at all 
> users
> want to see parallel restricted or safe objects, we can also have the
> counterparts pg_get_table_parallel_safe_objects and
> pg_get_table_parallel_restricted_objects. Of course, we can caution the user
> that execution of these functions might take longer and "might consume
> excessive memory while accumulating the output".
> 
> Otherwise, we can have a single function pg_get_parallel_safety("table_name"
> IN, "parallel_safety" IN, "objid"
> OUT, "classid" OUT)? If required, we could name it
> pg_get_parallel_safety_of_table_objects.
> 
> Thoughts?

I am not sure users will want to get the safe objects; do you have some use cases?
If users do not care about the safe objects, I think they can use
"SELECT * FROM pg_get_parallel_safety() where parallel_safety = 'specified
safety' " to get the specified objects.

> Although, I have not 

Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread vignesh C
Hi,

While reviewing one of the logical replication patches, I found that
we do not include a hint message naming the actual option that has
been specified more than once in the case of a redundant-option error.
I felt including this will help in easily identifying the error; users
will not have to search through the statement to figure out where the
actual error is. Attached is a patch for the same.
Thoughts?
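
For example, with the patch applied, a duplicated option would be
reported roughly along these lines (the command and option here are only
illustrative):

CREATE PUBLICATION mypub FOR ALL TABLES
    WITH (publish = 'insert', publish = 'update');
ERROR:  conflicting or redundant options
HINT:  Option "publish" specified more than once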

Regards,
Vignesh
From 73f670b9206e3f0240d56435f838c67f3e9fb681 Mon Sep 17 00:00:00 2001
From: vignesh 
Date: Mon, 26 Apr 2021 09:44:23 +0530
Subject: [PATCH] Enhance error message.

Enhanced error message, so that the user can easily identify the error.
---
 src/backend/commands/copy.c |  3 +-
 src/backend/commands/foreigncmds.c  |  6 ++--
 src/backend/commands/functioncmds.c |  6 ++--
 src/backend/commands/publicationcmds.c  |  6 ++--
 src/backend/commands/subscriptioncmds.c | 28 +++--
 src/backend/commands/tablecmds.c|  3 +-
 src/backend/commands/typecmds.c | 18 +++
 src/backend/commands/user.c | 33 ++---
 src/backend/parser/parse_utilcmd.c  |  3 +-
 src/backend/replication/pgoutput/pgoutput.c | 15 ++
 src/backend/replication/walsender.c |  9 --
 src/test/regress/expected/copy2.out |  2 ++
 src/test/regress/expected/foreign_data.out  |  2 ++
 src/test/regress/expected/publication.out   |  1 +
 14 files changed, 92 insertions(+), 43 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 8265b981eb..77a50652cc 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -469,7 +469,8 @@ ProcessCopyOptions(ParseState *pstate,
 			if (opts_out->force_null)
 ereport(ERROR,
 		(errcode(ERRCODE_SYNTAX_ERROR),
-		 errmsg("conflicting or redundant options")));
+		 errmsg("conflicting or redundant options"),
+		 parser_errposition(pstate, defel->location)));
 			if (defel->arg && IsA(defel->arg, List))
 opts_out->force_null = castNode(List, defel->arg);
 			else
diff --git a/src/backend/commands/foreigncmds.c b/src/backend/commands/foreigncmds.c
index eb7103fd3b..1239b0cafd 100644
--- a/src/backend/commands/foreigncmds.c
+++ b/src/backend/commands/foreigncmds.c
@@ -536,7 +536,8 @@ parse_func_options(List *func_options,
 			if (*handler_given)
 ereport(ERROR,
 		(errcode(ERRCODE_SYNTAX_ERROR),
-		 errmsg("conflicting or redundant options")));
+		 errmsg("conflicting or redundant options"),
+		 errhint("Option \"handler\" specified more than once")));
 			*handler_given = true;
 			*fdwhandler = lookup_fdw_handler_func(def);
 		}
@@ -545,7 +546,8 @@ parse_func_options(List *func_options,
 			if (*validator_given)
 ereport(ERROR,
 		(errcode(ERRCODE_SYNTAX_ERROR),
-		 errmsg("conflicting or redundant options")));
+		 errmsg("conflicting or redundant options"),
+		 errhint("Option \"validator\" specified more than once")));
 			*validator_given = true;
 			*fdwvalidator = lookup_fdw_validator_func(def);
 		}
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index 9548287217..6772bdbce9 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2065,7 +2065,8 @@ ExecuteDoStmt(DoStmt *stmt, bool atomic)
 			if (as_item)
 ereport(ERROR,
 		(errcode(ERRCODE_SYNTAX_ERROR),
-		 errmsg("conflicting or redundant options")));
+		 errmsg("conflicting or redundant options"),
+		 errhint("Option \"as\" specified more than once")));
 			as_item = defel;
 		}
 		else if (strcmp(defel->defname, "language") == 0)
@@ -2073,7 +2074,8 @@ ExecuteDoStmt(DoStmt *stmt, bool atomic)
 			if (language_item)
 ereport(ERROR,
 		(errcode(ERRCODE_SYNTAX_ERROR),
-		 errmsg("conflicting or redundant options")));
+		 errmsg("conflicting or redundant options"),
+		 errhint("Option \"language\" specified more than once")));
 			language_item = defel;
 		}
 		else
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c8e0..804082116b 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -87,7 +87,8 @@ parse_publication_options(List *options,
 			if (*publish_given)
 ereport(ERROR,
 		(errcode(ERRCODE_SYNTAX_ERROR),
-		 errmsg("conflicting or redundant options")));
+		 errmsg("conflicting or redundant options"),
+		 errhint("Option \"publish\" specified more than once")));
 
 			/*
 			 * If publish option was given only the explicitly listed actions
@@ -130,7 +131,8 @@ parse_publication_options(List *options,
 			if (*publish_via_partition_root_given)
 ereport(ERROR,
 		(errcode(ERRCODE_SYNTAX_ERROR),
-		 errmsg("conflicting or redundant options")));
+		 errmsg("conflicting or redundant options"),
+		 errhint("Option \"publish_via_parti

Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2021-04-26 Thread Amit Langote
Hi Alvaro,

On Sat, Apr 24, 2021 at 8:31 AM Alvaro Herrera  wrote:
> On 2021-Apr-23, Alvaro Herrera wrote:
> > I think the patch I posted was too simple.  I think a real fix requires
> > us to keep track of exactly in what way the partdesc is outdated, so
> > that we can compare to the current situation in deciding to use that
> > partdesc or build a new one.  For example, we could keep track of a list
> > of OIDs of detached partitions (and it simplifies things a lot if allow
> > only a single partition in this situation, because it's either zero OIDs
> > or one OID); or we can save the Xmin of the pg_inherits tuple for the
> > detached partition (and we can compare that Xmin to our current active
> > snapshot and decide whether to use that partdesc or not).
> >
> > I'll experiment with this a little more and propose a patch later today.
>
> This (POC-quality) seems to do the trick.

Thanks for the patch.

> (I restored the API of find_inheritance_children, because it was getting
> a little obnoxious.  I haven't thought this through but I think we
> should do something like it.)

+1.

> > I don't think it's too much of a problem to state that you need to
> > finalize one detach before you can do the next one.  After all, with
> > regular detach, you'd have the tables exclusively locked so you can't do
> > them in parallel anyway.  (It also increases the chances that people
> > will finalize detach operations that went unnoticed.)

That sounds reasonable.

> I haven't added a mechanism to verify this; but with asserts on, this
> patch will crash if you have more than one.  I think the behavior is not
> necessarily sane with asserts off, since you'll get an arbitrary
> detach-Xmin assigned to the partdesc, depending on catalog scan order.

Maybe this is an ignorant question but is the plan to add an elog() in
this code path or a check (and an ereport()) somewhere in
ATExecDetachPartition() to prevent more than one partition ending up
in detach-pending state?

Please allow me to study the patch a bit more closely and get back tomorrow.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Dumping/restoring fails on inherited generated column

2021-04-26 Thread Peter Eisentraut

On 05.02.21 15:18, Peter Eisentraut wrote:
Anyway, I figured out how to take account of generation expressions with 
different column orders.  I used the same approach that we use for check 
constraints.  The attached patch is good to go from my perspective.


Dusting this off ... this patch should go into the next minor releases. 
The attached patch is for master but backpatches without manual 
intervention to PG13 and PG12.
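
As a rough illustration of the case this patch is about (table names are
hypothetical), the following previously slipped through ALTER TABLE ...
INHERIT and is now rejected:

CREATE TABLE parent (a int, b int GENERATED ALWAYS AS (a * 2) STORED);
CREATE TABLE child (a int, b int);   -- b is a plain, non-generated column
ALTER TABLE child INHERIT parent;
-- ERROR:  column "b" in child table must be a generated column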
From 5a24ea0dc89c82a84efe59ef82557b8cd017def6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 26 Apr 2021 13:54:34 +0200
Subject: [PATCH v3] Fix ALTER TABLE / INHERIT with generated columns

When running ALTER TABLE t2 INHERIT t1, we must check that columns in
t2 that correspond to a generated column in t1 are also generated and
have the same generation expression.  Otherwise, this would allow
creating setups that a normal CREATE TABLE sequence would not allow.

Discussion: 
https://www.postgresql.org/message-id/22de27f6-7096-8d96-4619-7b882932c...@2ndquadrant.com
---
 src/backend/commands/tablecmds.c| 60 +
 src/test/regress/expected/generated.out | 21 +
 src/test/regress/sql/generated.sql  | 14 ++
 3 files changed, 95 insertions(+)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7d00f4eb25..1b7a4282bf 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14398,6 +14398,66 @@ MergeAttributesIntoExisting(Relation child_rel, Relation parent_rel)
                          errmsg("column \"%s\" in child table must be marked NOT NULL",
                                 attributeName)));
 
+            /*
+             * If parent column is generated, child column must be, too.
+             */
+            if (attribute->attgenerated && !childatt->attgenerated)
+                ereport(ERROR,
+                        (errcode(ERRCODE_DATATYPE_MISMATCH),
+                         errmsg("column \"%s\" in child table must be a generated column",
+                                attributeName)));
+
+            /*
+             * Check that both generation expressions match.
+             *
+             * The test we apply is to see whether they reverse-compile to the
+             * same source string.  This insulates us from issues like whether
+             * attributes have the same physical column numbers in parent and
+             * child relations.  (See also constraints_equivalent().)
+             */
+            if (attribute->attgenerated && childatt->attgenerated)
+            {
+                TupleConstr *child_constr = child_rel->rd_att->constr;
+                TupleConstr *parent_constr = parent_rel->rd_att->constr;
+                char       *child_expr = NULL;
+                char       *parent_expr = NULL;
+
+                Assert(child_constr != NULL);
+                Assert(parent_constr != NULL);
+
+                for (int i = 0; i < child_constr->num_defval; i++)
+                {
+                    if (child_constr->defval[i].adnum == childatt->attnum)
+                    {
+                        child_expr =
+                            TextDatumGetCString(DirectFunctionCall2(pg_get_expr,
+                                                                    CStringGetTextDatum(child_constr->defval[i].adbin),
+                                                                    ObjectIdGetDatum(child_rel->rd_id)));
+                        break;
+                    }
+                }
+                Assert(child_expr != NULL);
+
+                for (int i = 0; i < parent_constr->num_defval; i++)
+                {
+                    if (parent_constr->defval[i].adnum == attribute->attnum)
+                    {
+                        parent_expr =
+                            TextDatumGetCString(DirectFunctionCall2(pg_get_expr,
+                                                                    CStringGetTextDatum(parent_constr->defval[i].adbin),

Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Bharath Rupireddy
On Mon, Apr 26, 2021 at 5:29 PM vignesh C  wrote:
>
> Hi,
>
> While reviewing one of the logical replication patches, I found that
> we do not include hint messages to display the actual option which has
> been specified more than once in case of redundant option error. I
> felt including this will help in easily identifying the error, users
> will not have to search through the statement to identify where the
> actual error is present. Attached a patch for the same.
> Thoughts?

+1. A more detailed error will be useful in the rare scenario where users
have specified duplicate options along with a lot of other options; they
will know for which option the error is thrown by the server.

Instead of adding errhint or errdetail, how about just changing the
error message itself to something like "option \"%s\" specified more
than once" or "parameter \"%s\" specified more than once" like we have
in other places in the code?

Comments on the patch:

1) generally errhint and errdetail messages should end with a period
".", please see their usage in other places.
+ errhint("Option \"streaming\" specified more
than once")));

2) I think it should be errdetail instead of errhint, because you are
giving more information about the error,  but not hinting user how to
overcome it. If you had to say something like "Remove duplicate
\"password\" option." or "The \"password\" option is specified more
than once. Remove all the duplicates.", then it makes sense to use
errhint.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: [BUG] "FailedAssertion" reported when streaming in logical replication

2021-04-26 Thread Dilip Kumar
On Mon, Apr 26, 2021 at 1:26 PM Dilip Kumar  wrote:
>
> On Mon, 26 Apr 2021 at 12:45 PM, tanghy.f...@fujitsu.com 
>  wrote:
>>
>> Hi
>>
>> I think I may found a bug when using streaming in logical replication. Could 
>> anyone please take a look at this?
>>
>> Here's what I did to produce the problem.
>> I set logical_decoding_work_mem and created multiple publications at 
>> publisher, created multiple subscriptions with "streaming = on" at 
>> subscriber.
>> However, an assertion failed at publisher when I COMMIT and ROLLBACK 
>> multiple transactions at the same time.
>>
>> The log reported a FailedAssertion:
>> TRAP: FailedAssertion("txn->size == 0", File: "reorderbuffer.c", Line: 3465, 
>> PID: 911730)
>>
>> The problem happens both in synchronous mode and asynchronous mode. When 
>> there are only one or two publications, It doesn't seem to happen. (In my 
>> case, there are 8 publications and the failure always happened).
>>
>> The scripts and the log are attached. It took me about 4 minutes to run the 
>> script on my machine.
>> Please contact me if you need more specific info for the problem.
>
>
>
> Thanks for reporting. I will look into it.

I am able to reproduce this and I think I have done the initial investigation.

The cause of the issue is that this transaction has only one change,
and that change is XLOG_HEAP2_NEW_CID, which is added through
SnapBuildProcessNewCid.  Basically, when we add any change through
SnapBuildProcessChange we set the base snapshot, but when we add one
through SnapBuildProcessNewCid we don't set the base snapshot, because
there is nothing to be done for this change.  Now, this transaction is
identified as the biggest transaction with non-partial changes, and in
ReorderBufferStreamTXN it will return immediately because the
base_snapshot is NULL.  I think the fix should be that, while selecting
the largest transaction in ReorderBufferLargestTopTXN, we should check
that the base_snapshot is not NULL.

I will think more about this and post the patch.

From the core dump, we can see that base_snapshot is 0x0 and
ntuplecids = 1, and txn_flags = 1 also proves that it has a new
command id change.  The size of the txn also shows that it has
only one change and that it is REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID,
because in that case the change size will be just
sizeof(ReorderBufferChange), which is 80.

(gdb) p *txn
$4 = {txn_flags = 1, xid = 1115, toplevel_xid = 0, gid = 0x0,
first_lsn = 1061159120, final_lsn = 0, end_lsn = 0, toptxn = 0x0,
restart_decoding_lsn = 958642624,
  origin_id = 0, origin_lsn = 0, commit_time = 0, base_snapshot = 0x0,
base_snapshot_lsn = 0, base_snapshot_node = {prev = 0x0, next = 0x0},
snapshot_now = 0x0,
  command_id = 4294967295, nentries = 1, nentries_mem = 1, changes =
{head = {prev = 0x3907c18, next = 0x3907c18}}, tuplecids = {head =
{prev = 0x39073d8,
  next = 0x39073d8}}, ntuplecids = 1, tuplecid_hash = 0x0,
toast_hash = 0x0, subtxns = {head = {prev = 0x30f1cd8, next =
0x30f1cd8}}, nsubtxns = 0,
  ninvalidations = 0, invalidations = 0x0, node = {prev = 0x30f1a98,
next = 0x30c64f8}, size = 80, total_size = 80, concurrent_abort =
false,
  output_plugin_private = 0x0}

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2021-04-26 Thread Alvaro Herrera
Hello Amit,

On 2021-Apr-26, Amit Langote wrote:

> On Sat, Apr 24, 2021 at 8:31 AM Alvaro Herrera  
> wrote:

> > I haven't added a mechanism to verify this; but with asserts on, this
> > patch will crash if you have more than one.  I think the behavior is not
> > necessarily sane with asserts off, since you'll get an arbitrary
> > detach-Xmin assigned to the partdesc, depending on catalog scan order.
> 
> Maybe this is an ignorant question but is the plan to add an elog() in
> this code path or a check (and an ereport()) somewhere in
> ATExecDetachPartition() to prevent more than one partition ending up
> in detach-pending state?

Yeah, that's what I'm planning to do.

> Please allow me to study the patch a bit more closely and get back tomorrow.

Sure, thanks!

-- 
Álvaro Herrera   Valdivia, Chile
"But static content is just dynamic content that isn't moving!"
http://smylers.hates-software.com/2007/08/15/fe244d0c.html




Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Dilip Kumar
On Mon, Apr 26, 2021 at 5:49 PM Bharath Rupireddy
 wrote:
>
> On Mon, Apr 26, 2021 at 5:29 PM vignesh C  wrote:
> >
> > Hi,
> >
> > While reviewing one of the logical replication patches, I found that
> > we do not include hint messages to display the actual option which has
> > been specified more than once in case of redundant option error. I
> > felt including this will help in easily identifying the error, users
> > will not have to search through the statement to identify where the
> > actual error is present. Attached a patch for the same.
> > Thoughts?
>

+1 for improving the error

> Comments on the patch:
>
> 1) generally errhint and errdetail messages should end with a period
> ".", please see their usage in other places.
> + errhint("Option \"streaming\" specified more
> than once")));
>
> 2) I think it should be errdetail instead of errhint, because you are
> giving more information about the error,  but not hinting user how to
> overcome it. If you had to say something like "Remove duplicate
> \"password\" option." or "The \"password\" option is specified more
> than once. Remove all the duplicates.", then it makes sense to use
> errhint.

I agree this should be errdetail.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [BUG] "FailedAssertion" reported when streaming in logical replication

2021-04-26 Thread Amit Kapila
On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar  wrote:
>
> On Mon, Apr 26, 2021 at 1:26 PM Dilip Kumar  wrote:
> >
> > On Mon, 26 Apr 2021 at 12:45 PM, tanghy.f...@fujitsu.com 
> >  wrote:
> >>
> >> Hi
> >>
> >> I think I may found a bug when using streaming in logical replication. 
> >> Could anyone please take a look at this?
> >>
> >> Here's what I did to produce the problem.
> >> I set logical_decoding_work_mem and created multiple publications at 
> >> publisher, created multiple subscriptions with "streaming = on" at 
> >> subscriber.
> >> However, an assertion failed at publisher when I COMMIT and ROLLBACK 
> >> multiple transactions at the same time.
> >>
> >> The log reported a FailedAssertion:
> >> TRAP: FailedAssertion("txn->size == 0", File: "reorderbuffer.c", Line: 
> >> 3465, PID: 911730)
> >>
> >> The problem happens both in synchronous mode and asynchronous mode. When 
> >> there are only one or two publications, It doesn't seem to happen. (In my 
> >> case, there are 8 publications and the failure always happened).
> >>
> >> The scripts and the log are attached. It took me about 4 minutes to run 
> >> the script on my machine.
> >> Please contact me if you need more specific info for the problem.
> >
> >
> >
> > Thanks for reporting. I will look into it.
>
> I am able to reproduce this and I think I have done the initial investigation.
>
> The cause of the issue is that, this transaction has only one change
> and that change is XLOG_HEAP2_NEW_CID, which is added through
> SnapBuildProcessNewCid.  Basically, when we add any changes through
> SnapBuildProcessChange we set the base snapshot but when we add
> SnapBuildProcessNewCid this we don't set the base snapshot, because
> there is nothing to be done for this change.  Now, this transaction is
> identified as the biggest transaction with non-partial changes, and
> now in ReorderBufferStreamTXN, it will return immediately because the
> base_snapshot is NULL.
>

Your analysis sounds correct to me.

-- 
With Regards,
Amit Kapila.




Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-26 Thread Tomas Vondra

Hi,

I took a look at this today, as I committed 39b66a91b back in January. I 
can reproduce the issue; with just 1M rows, the before/after timings are 
roughly 480ms and 620ms on my hardware.


Unfortunately, the v3 patch does not really fix the issue for me. The 
timing with it applied is ~610ms so the improvement is only minimal.


I'm not sure what to do about this :-( I don't have any ideas about how 
to eliminate this overhead, so the only option I see is reverting the 
changes in heap_insert. Unfortunately, that'd mean inserts into TOAST 
tables won't be frozen ...



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread vignesh C
On Mon, Apr 26, 2021 at 5:49 PM Bharath Rupireddy
 wrote:
>
> On Mon, Apr 26, 2021 at 5:29 PM vignesh C  wrote:
> >
> > Hi,
> >
> > While reviewing one of the logical replication patches, I found that
> > we do not include hint messages to display the actual option which has
> > been specified more than once in case of redundant option error. I
> > felt including this will help in easily identifying the error, users
> > will not have to search through the statement to identify where the
> > actual error is present. Attached a patch for the same.
> > Thoughts?
>
> +1. A more detailed error will be useful in a rare scenario like users
> have specified duplicate options along with a lot of other options,
> they will know for which option the error is thrown by the server.
>
> Instead of adding errhint or errdetail, how about just changing the
> error message itself to something like "option \"%s\" specified more
> than once" or "parameter \"%s\" specified more than once" like we have
> in other places in the code?
>

Both seemed fine, but I preferred using errdetail as I felt it is
slightly better for the details to appear on a new line.

> Comments on the patch:
>
> 1) generally errhint and errdetail messages should end with a period
> ".", please see their usage in other places.
> + errhint("Option \"streaming\" specified more
> than once")));
>

Modified it.

> 2) I think it should be errdetail instead of errhint, because you are
> giving more information about the error,  but not hinting user how to
> overcome it. If you had to say something like "Remove duplicate
> \"password\" option." or "The \"password\" option is specified more
> than once. Remove all the duplicates.", then it makes sense to use
> errhint.

Modified it.

Attached patch for the same.

Regards,
Vignesh
From 19b33693873a9c6201a139370649f605978863f8 Mon Sep 17 00:00:00 2001
From: vignesh 
Date: Mon, 26 Apr 2021 18:40:36 +0530
Subject: [PATCH v2] Enhance error message.

Enhanced error message, so that the user can easily identify the error.
---
 src/backend/commands/copy.c |  3 +-
 src/backend/commands/foreigncmds.c  |  6 ++--
 src/backend/commands/functioncmds.c |  6 ++--
 src/backend/commands/publicationcmds.c  |  6 ++--
 src/backend/commands/subscriptioncmds.c | 28 +++--
 src/backend/commands/tablecmds.c|  3 +-
 src/backend/commands/typecmds.c | 18 +++
 src/backend/commands/user.c | 33 ++---
 src/backend/parser/parse_utilcmd.c  |  3 +-
 src/backend/replication/pgoutput/pgoutput.c | 15 ++
 src/backend/replication/walsender.c |  9 --
 src/test/regress/expected/copy2.out |  2 ++
 src/test/regress/expected/foreign_data.out  |  2 ++
 src/test/regress/expected/publication.out   |  1 +
 14 files changed, 92 insertions(+), 43 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 8265b981eb..77a50652cc 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -469,7 +469,8 @@ ProcessCopyOptions(ParseState *pstate,
 			if (opts_out->force_null)
 ereport(ERROR,
 		(errcode(ERRCODE_SYNTAX_ERROR),
-		 errmsg("conflicting or redundant options")));
+		 errmsg("conflicting or redundant options"),
+		 parser_errposition(pstate, defel->location)));
 			if (defel->arg && IsA(defel->arg, List))
 opts_out->force_null = castNode(List, defel->arg);
 			else
diff --git a/src/backend/commands/foreigncmds.c b/src/backend/commands/foreigncmds.c
index eb7103fd3b..83b969a5f8 100644
--- a/src/backend/commands/foreigncmds.c
+++ b/src/backend/commands/foreigncmds.c
@@ -536,7 +536,8 @@ parse_func_options(List *func_options,
 			if (*handler_given)
 ereport(ERROR,
 		(errcode(ERRCODE_SYNTAX_ERROR),
-		 errmsg("conflicting or redundant options")));
+		 errmsg("conflicting or redundant options"),
+		 errdetail("Option \"handler\" specified more than once.")));
 			*handler_given = true;
 			*fdwhandler = lookup_fdw_handler_func(def);
 		}
@@ -545,7 +546,8 @@ parse_func_options(List *func_options,
 			if (*validator_given)
 ereport(ERROR,
 		(errcode(ERRCODE_SYNTAX_ERROR),
-		 errmsg("conflicting or redundant options")));
+		 errmsg("conflicting or redundant options"),
+		 errdetail("Option \"validator\" specified more than once.")));
 			*validator_given = true;
 			*fdwvalidator = lookup_fdw_validator_func(def);
 		}
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index 9548287217..5508fad48f 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2065,7 +2065,8 @@ ExecuteDoStmt(DoStmt *stmt, bool atomic)
 			if (as_item)
 ereport(ERROR,
 		(errcode(ERRCODE_SYNTAX_ERROR),
-		 errmsg("conflicting or redundant options")));
+		 errmsg("conflicting or redundant options"),
+		 errdetail("

Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread vignesh C
On Mon, Apr 26, 2021 at 6:18 PM Dilip Kumar  wrote:
>
> On Mon, Apr 26, 2021 at 5:49 PM Bharath Rupireddy
>  wrote:
> >
> > On Mon, Apr 26, 2021 at 5:29 PM vignesh C  wrote:
> > >
> > > Hi,
> > >
> > > While reviewing one of the logical replication patches, I found that
> > > we do not include hint messages to display the actual option which has
> > > been specified more than once in case of redundant option error. I
> > > felt including this will help in easily identifying the error, users
> > > will not have to search through the statement to identify where the
> > > actual error is present. Attached a patch for the same.
> > > Thoughts?
> >
>
> +1 for improving the error
>
> > Comments on the patch:
> >
> > 1) generally errhint and errdetail messages should end with a period
> > ".", please see their usage in other places.
> > + errhint("Option \"streaming\" specified more
> > than once")));
> >
> > 2) I think it should be errdetail instead of errhint, because you are
> > giving more information about the error,  but not hinting user how to
> > overcome it. If you had to say something like "Remove duplicate
> > \"password\" option." or "The \"password\" option is specified more
> > than once. Remove all the duplicates.", then it makes sense to use
> > errhint.
>
> I agree this should be errdetail.

Thanks for the comment, I have modified and shared the v2 patch
attached in the previous mail.

Regards,
Vignesh




Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Bharath Rupireddy
On Mon, Apr 26, 2021 at 7:02 PM vignesh C  wrote:
>
> On Mon, Apr 26, 2021 at 5:49 PM Bharath Rupireddy
>  wrote:
> >
> > On Mon, Apr 26, 2021 at 5:29 PM vignesh C  wrote:
> > >
> > > Hi,
> > >
> > > While reviewing one of the logical replication patches, I found that
> > > we do not include hint messages to display the actual option which has
> > > been specified more than once in case of redundant option error. I
> > > felt including this will help in easily identifying the error, users
> > > will not have to search through the statement to identify where the
> > > actual error is present. Attached a patch for the same.
> > > Thoughts?
> >
> > +1. A more detailed error will be useful in a rare scenario like users
> > have specified duplicate options along with a lot of other options,
> > they will know for which option the error is thrown by the server.
> >
> > Instead of adding errhint or errdetail, how about just changing the
> > error message itself to something like "option \"%s\" specified more
> > than once" or "parameter \"%s\" specified more than once" like we have
> > in other places in the code?
> >
>
> Both seemed fine but I preferred using errdetail as I felt it is
> slightly better for the details to appear in a new line.

Thanks! IMO, it is better to change the error message to "option
\"%s\" specified more than once" instead of adding an error detail.
Let's hear other hackers' opinions.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




RE: Support tab completion for upper character inputs in psql

2021-04-26 Thread tanghy.f...@fujitsu.com
Hi 

I've updated the patch to V7 based on the following comments. 

On Friday, April 23, 2021 11:58 AM, Kyotaro Horiguchi  
wrote
>All usages of pg_string_tolower don't need a copy.
>So don't we change the function to in-place converter?

Per your later discussion with Tom, I kept the code as it is.

>   if (completion_info_charp)
>+  {
>   e_info_charp = escape_string(completion_info_charp);
>+  if(e_info_charp[0] == '"')
>+  completion_case_sensitive = true;
>+  else
>+  {
>+  le_str = pg_string_tolower(e_info_charp);
>
>It seems right to lower completion_info_charp and ..2 but it is not
>right that change completion_case_sensitive here, which only affects
>the returned candidates.  

Agreed, code " completion_case_sensitive = true;" removed.

>By the way COMP_KEYWORD_CASE suggests that *keywords* are completed
>following the setting. However, they are not keywords, but
>identifiers. And some people (including me) might dislike that
>keywords and identifiers follow the same setting.  Specifically I
>sometimes want keywords to be upper-cased but identifiers (always) be
>lower-cased.

I changed my design based on your suggestion. Now upper-case input for 
identifiers will always be turned to lower case (regardless of 
COMP_KEYWORD_CASE), which I think will be acceptable to most PG users. 
  Eg: SET BYT / SET Byt
  output when apply V6 patch: SET BYTEA_OUTPUT
  output when apply V7 patch: SET bytea_output

On Friday, April 23, 2021 12:26 PM, Kyotaro Horiguchi  
wrote
>Oh, I accidentally found a dubious behavior.
>
>=# alter table public.
>public.c1public.d1public."t"   public.t public."tt"  
>
>The "t" and "tt" are needlessly lower-cased.

Good catch. I hadn't thought of the schema case before. 
Bug fixed. Added TAP tests for this scenario.

Please let me know if you find any further issues in the patch. Any 
further suggestions are very welcome.

Regards,
Tang


V7-0001-Support-tab-completion-with-a-query-result-for-upper.patch
Description:  V7-0001-Support-tab-completion-with-a-query-result-for-upper.patch


RE: use pg_strncasecmp to replace strncmp when compare "pg_"

2021-04-26 Thread tanghy.f...@fujitsu.com
On Friday, April 23, 2021 2:06 PM, Tom Lane  wrote

>>Kyotaro Horiguchi  writes:
>> That doesn't matter at all for now since we match schema identifiers
>> case-sensitively.  Maybe it should be a part of the patch in [1].
>
>Yeah --- maybe this'd make sense as part of a full patch to improve
>tab-complete.c's handling of case folding, but I'm suspicious that
>applying it on its own would just make things less consistent.

Thanks for your reply. I merged this patch into [1]. Any further comments 
on [1] are very welcome.

[1] 
https://www.postgresql.org/message-id/OS0PR01MB6113CA04E06D5BF221BC4FE2FB429%40OS0PR01MB6113.jpnprd01.prod.outlook.com

Regards,
Tang




Re: How to test Postgres for any unaligned memory accesses?

2021-04-26 Thread Bharath Rupireddy
On Fri, Apr 23, 2021 at 7:25 PM Tom Lane  wrote:
> > I'm not sure this is the right way. I would like to know whether there
> > is a standard way of testing Postgres code for any unaligned memory
> > accesses. Thanks. Any help would be appreciated.
>
> Per c.h, late-model compilers have options for this:
>
>  * Testing can be done with "-fsanitize=alignment -fsanitize-trap=alignment"
>  * on clang, or "-fsanitize=alignment -fno-sanitize-recover=alignment" on gcc.

Thanks Tom!

I used the above gcc flags to see whether they catch memory alignment
issues. The way I tested on my dev system (x86_64 platform with Ubuntu)
was to comment out the MAXALIGN of specialSize in PageInit and compile
the source code with and without the alignment flags: make check failed
with the alignment-checking flags and passed without them.
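
For anyone who wants a self-contained check that the flags behave as
expected, a tiny standalone program along these lines (purely
illustrative, not taken from the PostgreSQL sources) aborts under the
sanitizer while running silently on x86_64 without it:

/*
 * align_test.c
 * Build and run, e.g.:
 *   gcc -fsanitize=alignment -fno-sanitize-recover=alignment align_test.c && ./a.out
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
	uint64_t	storage[2] = {0, 0};	/* suitably aligned by definition */
	char	   *bytes = (char *) storage;
	uint64_t	v = 42;
	uint64_t   *misaligned;

	/* memcpy to an unaligned address is fine; it has no alignment requirement */
	memcpy(bytes + 1, &v, sizeof(v));

	/*
	 * Dereferencing a pointer that is not suitably aligned is undefined
	 * behavior.  On x86_64 it usually "works" silently, which is why the
	 * sanitizer (or a strict-alignment platform) is needed to catch it.
	 */
	misaligned = (uint64_t *) (bytes + 1);
	printf("%llu\n", (unsigned long long) *misaligned);

	return 0;
}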

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

2021-04-26 Thread Tom Lane
Amit Langote  writes:
> On Mon, Apr 26, 2021 at 9:40 AM Tom Lane  wrote:
>> I would think that this is a totally straightforward improvement,
>> but there's one thing in the comments for rewriteTargetListIU that
>> gives me a little pause: it says
>> 
>> * We must do items 1,2,3 before firing rewrite rules, else rewritten
>> * references to NEW.foo will produce wrong or incomplete results.
>> 
>> As far as I can tell, though, references to NEW values still do the
>> right thing.  I'm not certain whether any of the existing regression
>> tests really cover this point, but experimenting with the scenario shown
>> in the attached SQL file says that the DO ALSO rule gets the right
>> results.  It's possible that the expansion sequence is a bit different
>> than before, but we still end up with the right answer.

> I also checked what the rewriter and the planner do for the following
> DO ALSO insert:
> create rule logit as on update to v1 do also
> insert into log values(old.a1, new.a1, old.b2, new.b2);
> and can see that the insert ends up with the right targetlist
> irrespective of whether or not rewriteTargetListIU() adds an item for
> NEW.b2.

Thanks for looking at that.  On reflection I think this must be so,
because those rewriter mechanisms were designed long before we had
trigger-updatable views, and rewriteTargetListIU has never added
tlist items like this for any other sort of view.  So the ability
to insert the original view output column has necessarily been there
from the beginning.  This is just getting rid of a weird implementation
difference between trigger-updatable views and other views.

regards, tom lane




Re: [BUG] "FailedAssertion" reported when streaming in logical replication

2021-04-26 Thread Dilip Kumar
On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila  wrote:
>
> On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar  wrote:
> >
> > I am able to reproduce this and I think I have done the initial 
> > investigation.
> >
> > The cause of the issue is that, this transaction has only one change
> > and that change is XLOG_HEAP2_NEW_CID, which is added through
> > SnapBuildProcessNewCid.  Basically, when we add any changes through
> > SnapBuildProcessChange we set the base snapshot but when we add
> > SnapBuildProcessNewCid this we don't set the base snapshot, because
> > there is nothing to be done for this change.  Now, this transaction is
> > identified as the biggest transaction with non -partial changes, and
> > now in ReorderBufferStreamTXN, it will return immediately because the
> > base_snapshot is NULL.
> >
>
> Your analysis sounds correct to me.
>

Thanks, I have attached a patch to fix this.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 16d47947002357cc37fdc3debcdf8c376e370188 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Mon, 26 Apr 2021 18:19:27 +0530
Subject: [PATCH v1] Don't select the transaction without base snapshot for
 streaming

While selecting the largest top transaction, we currently don't check
whether the transaction has a base snapshot.  A transaction without a
base snapshot cannot be streamed, so skip such transactions.
---
 src/backend/replication/logical/reorderbuffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 5cb484f..981619f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3388,7 +3388,8 @@ ReorderBufferLargestTopTXN(ReorderBuffer *rb)
 		txn = dlist_container(ReorderBufferTXN, node, iter.cur);
 
 		if ((largest != NULL || txn->total_size > largest_size) &&
-			(txn->total_size > 0) && !(rbtxn_has_incomplete_tuple(txn)))
+			(txn->base_snapshot != NULL) && (txn->total_size > 0) &&
+			!(rbtxn_has_incomplete_tuple(txn)))
 		{
 			largest = txn;
 			largest_size = txn->total_size;
-- 
1.8.3.1



Re: tab-complete for ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2021-04-26 Thread Justin Pryzby
On Thu, Apr 22, 2021 at 04:40:35PM -0400, Alvaro Herrera wrote:
> Would anyone oppose me pushing this for tab-completing the new keywords
> of ALTER TABLE ..  DETACH PARTITION?

+1 to apply tab completion for v14

-- 
Justin




Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

2021-04-26 Thread Dean Rasheed
On Mon, 26 Apr 2021 at 15:09, Tom Lane  wrote:
>
> Thanks for looking at that.  On reflection I think this must be so,
> because those rewriter mechanisms were designed long before we had
> trigger-updatable views, and rewriteTargetListIU has never added
> tlist items like this for any other sort of view.  So the ability
> to insert the original view output column has necessarily been there
> from the beginning.  This is just getting rid of a weird implementation
> difference between trigger-updatable views and other views.
>

FWIW, I had a look at this too and came to much the same conclusion,
so I think this is a safe change that makes the code a little neater
and more efficient.

Regards,
Dean




Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Alvaro Herrera
On 2021-Apr-26, Bharath Rupireddy wrote:

> Thanks! IMO, it is better to change the error message to "option
> \"%s\" specified more than once" instead of adding an error detail.
> Let's hear other hackers' opinions.

Many other places have the message "conflicting or redundant options",
and then parser_errposition() shows the problem option.  That seems
pretty unhelpful, so whenever the problem is the redundancy I would have
the message be explicit about that, and about which option is the
problem:
  errmsg("option \"%s\" specified more than once", "someopt")
Do note that wording it this way means only one translatable message,
not dozens.

In some cases it is possible that you'd end up with two messages, one
for "redundant" and one for the other ways for options to conflict with
others; for example collationcmds.c has one that's not as obvious, and
force_quote/force_quote_all in COPY have their own thing too.

I think we should do parser_errposition() wherever possible, in
addition to the wording change.
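
For illustration, a typical redundant-option check would then look
roughly like this (a sketch only; "defel" and "pstate" are the variables
already used in the existing option-parsing loops):

	if (streaming_given)
		ereport(ERROR,
				(errcode(ERRCODE_SYNTAX_ERROR),
				/* one translatable message for every such site */
				 errmsg("option \"%s\" specified more than once", defel->defname),
				 parser_errposition(pstate, defel->location)));
	streaming_given = true;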

-- 
Álvaro Herrera   Valdivia, Chile
 really, I see PHP as like a strange amalgamation of C, Perl, Shell
 inflex: you know that "amalgam" means "mixture with mercury",
   more or less, right?
 i.e., "deadly poison"




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Peter Eisentraut

On 24.04.21 19:43, Tom Lane wrote:
> Bruce Momjian  writes:
>> That's a pretty weird API.  I think we just need people to turn it on
>> like they are doing when they configure pg_stat_statements anyway.
>> pg_stat_statements already requires configuration anyway.
>
> Agreed.  If pg_stat_statements were zero-configuration today then
> this would be an annoying new burden, but it isn't.

I think people can understand "add pg_stat_statements to 
shared_preload_libraries" and "install the extension".  You have to turn 
it on somehow after all.

Now there is the additional burden of turning on this weird setting that 
no one understands.  That's a 50% increase in burden.

And almost no one will want to use a nondefault setting.

pg_stat_statements is pretty popular.  I think leaving in this 
requirement will lead to widespread confusion and complaints.





Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

2021-04-26 Thread Tom Lane
Dean Rasheed  writes:
> On Mon, 26 Apr 2021 at 15:09, Tom Lane  wrote:
>> Thanks for looking at that.  On reflection I think this must be so,
>> because those rewriter mechanisms were designed long before we had
>> trigger-updatable views, and rewriteTargetListIU has never added
>> tlist items like this for any other sort of view.  So the ability
>> to insert the original view output column has necessarily been there
>> from the beginning.  This is just getting rid of a weird implementation
>> difference between trigger-updatable views and other views.

> FWIW, I had a look at this too and came to much the same conclusion,
> so I think this is a safe change that makes the code a little neater
> and more efficient.

Again, thanks for looking!

I checked into the commit history (how'd we ever survive without "git
blame"?) and found that my argument above is actually wrong in detail.
Before cab5dc5da of 2013-10-18, rewriteTargetListIU expanded non-updated
columns for all views not only trigger-updatable ones.  However, that
behavior itself goes back only to 2ec993a7c of 2010-10-10, which added
triggers on views; before that there was indeed no such expansion.
Of course the view rewrite mechanisms are ten or so years older than
that, so the conclusion that they weren't designed to need this still
stands.

regards, tom lane




Issue in recent pg_stat_statements?

2021-04-26 Thread David Christensen
-hackers,

So in doing some recent work on pg_stat_statements, I noticed that while
the regression test still passes on HEAD, it appears that 4f0b096 (per
git bisect) changed/broke how this works compared to historical versions.

Essentially, when doing a fresh install of pg_stat_statements on a new
fresh db (outside of the regression framework), it's not returning any rows
from the view.  I didn't see any related documentation changes, so as far
as I know, this should still be recording all statements as per normal.

My full steps to reproduce from a clean CentOS 7 install are attached.  I
have also been able to reproduce this on OS X and Fedora 33.  The TL;DR is:

CREATE EXTENSION pg_stat_statements;
CREATE TABLE foo (a int, b text);
INSERT INTO foo VALUES (1,'a');
SELECT * FROM foo;
SELECT * FROM pg_stat_statements; -- returns nothing

Settings for pg_stat_statements:
postgres=# select name, setting from pg_settings where name like
'pg_stat_statements%';
   name| setting
---+-
 pg_stat_statements.max| 5000
 pg_stat_statements.save   | on
 pg_stat_statements.track  | top
 pg_stat_statements.track_planning | off
 pg_stat_statements.track_utility  | on
(5 rows)

Is this an expected change, or is this in fact broken?  In previous
revisions, this was showing the INSERT and SELECT at the very least.  I'm
unclear as to why the regression test is still passing, so want to verify
that I'm not doing something wrong in the testing.

Best,

David
#!/bin/bash -e

sudo yum install -y git
sudo yum groupinstall -y "Development Tools"
sudo yum install -y readline-devel zlib-devel 'perl(IPC::Run)' 'perl(Test::More)' 'perl(Time::HiRes)'

mkdir -p ~/Checkouts

cd ~/Checkouts
git clone https://github.com/postgres/postgres postgresql
cd ~/Checkouts/postgresql

git checkout 4f0b096
make distclean || true
./configure --enable-cassert --enable-debug CFLAGS="-ggdb -O0 -g3 -fno-omit-frame-pointer" --enable-tap-tests
make -j10 && \
( \
  cd contrib/pg_stat_statements/; \
  make check \
)

cd tmp_install/usr/local/pgsql
export PGUSER=postgres
export PGHOST=/tmp/
export LD_LIBRARY_PATH=./lib:$LD_LIBRARY_PATH

PATH=$HOME/tmp_install/usr/local/pgsql:$PATH
bin/initdb -D data -U postgres
bin/pg_ctl -D data -l logfile start
bin/psql -c 'ALTER SYSTEM SET shared_preload_libraries = $$pg_stat_statements$$'
bin/pg_ctl -D data -l logfile restart
bin/psql <

Re: Issue in recent pg_stat_statements?

2021-04-26 Thread Magnus Hagander
On Mon, Apr 26, 2021 at 5:15 PM David Christensen
 wrote:
>
> -hackers,
>
> So in doing some recent work on pg_stat_statements, I notice that while the 
> regression test still passes on HEAD, it appears that 4f0b096 (per git 
> bisect) changed/broke how this works compared to historical versions.
>
> Essentially, when doing a fresh install of pg_stat_statements on a new fresh 
> db (outside of the regression framework), it's not returning any rows from 
> the view.  I didn't see any related documentation changes, so as far as I 
> know, this should still be recording all statements as per normal.
>
> My full steps to reproduce from a clean Centos 7 install are attached.  I 
> have also been able to reproduce this on OS X and Fedora 33.  The TL;DR is:
>
> CREATE EXTENSION pg_stat_statements;
> CREATE TABLE foo (a int, b text);
> INSERT INTO foo VALUES (1,'a');
> SELECT * FROM foo;
> SELECT * FROM pg_stat_statements; -- returns nothing
>
> Settings for pg_stat_statements:
> postgres=# select name, setting from pg_settings where name like 
> 'pg_stat_statements%';
>name| setting
> ---+-
>  pg_stat_statements.max| 5000
>  pg_stat_statements.save   | on
>  pg_stat_statements.track  | top
>  pg_stat_statements.track_planning | off
>  pg_stat_statements.track_utility  | on
> (5 rows)
>
> Is this an expected change, or is this in fact broken?  In previous 
> revisions, this was showing the INSERT and SELECT at the very least.  I'm 
> unclear as to why the regression test is still passing, so want to verify 
> that I'm not doing something wrong in the testing.

Yes, you want to look into the queryid functionality. See
https://www.postgresql.org/message-id/flat/35457b09-36f8-add3-1d07-6034fa585ca8%40oss.nttdata.com

Interface changes may still be coming in 14 for that. Or warnings.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Christoph Berg
Re: Peter Eisentraut
> > Agreed.  If pg_stat_statements were zero-configuration today then
> > this would be an annoying new burden, but it isn't.
> 
> I think people can understand "add pg_stat_statements to
> shared_preload_libraries" and "install the extension".  You have to turn it
> on somehow after all.

Fwiw, I'd claim that pg_stat_statements *is* zero-configuration today.
You just have to load the module (= shared_preload_libraries), and it
will start working. Later you can run CREATE EXTENSION to actually see
the stats, but they are already being collected in the background.

> Now there is the additional burden of turning on this weird setting that no
> one understands.  That's a 50% increase in burden.
> 
> And almost no one will want to use a nondefault setting.
> 
> pg_stat_statements is pretty popular.  I think leaving in this requirement
> will lead to widespread confusion and complaints.

Ack, please make the default config (i.e. after setting
shared_preload_libraries) do something sensible. Having to enable some
"weird" other internal setting will be very hard to explain to users.

Fwiw, I'd even want to have pg_stat_statements enabled in core by
default. That would be awesome UX. (And turning it off could be as simple
as setting compute_query_id=off.)

Christoph




Re: Issue in recent pg_stat_statements?

2021-04-26 Thread David Christensen
>
> > Is this an expected change, or is this in fact broken?  In previous
> revisions, this was showing the INSERT and SELECT at the very least.  I'm
> unclear as to why the regression test is still passing, so want to verify
> that I'm not doing something wrong in the testing.
>
> Yes, you want to look into the queryid functionality. See
>
> https://www.postgresql.org/message-id/flat/35457b09-36f8-add3-1d07-6034fa585ca8%40oss.nttdata.com
>
> Interface changes may still be coming in 14 for that. Or warnings.
>

Hmm, I'm unclear as to why you would potentially want to use
pg_stat_statements *without* this functionality.  At the very least, it
violates POLA — I spent the better part of a day thinking this was a bug
due to the expected behavior being so obvious I wouldn't have expected any
different.

In any case, this discussion is better had on a different thread.  Thanks
at least for explaining what I was seeing.

Best,

David


Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Bharath Rupireddy
On Mon, Apr 26, 2021 at 8:06 PM Alvaro Herrera  wrote:
>
> On 2021-Apr-26, Bharath Rupireddy wrote:
>
> > Thanks! IMO, it is better to change the error message to "option
> > \"%s\" specified more than once" instead of adding an error detail.
> > Let's hear other hackers' opinions.
>
> Many other places have the message "conflicting or redundant options",
> and then parser_errposition() shows the problem option.  That seems
> pretty unhelpful, so whenever the problem is the redundancy I would have
> the message be explicit about that, and about which option is the
> problem:
>   errmsg("option \"%s\" specified more than once", "someopt")
> Do note that wording it this way means only one translatable message,
> not dozens.

I agree that we can just be clear about the problem. It looks like the
majority of the "conflicting or redundant options" errors are for
redundant options. So, wherever "conflicting or redundant options"
exists: 1) change the message to "option \"%s\" specified more than
once" and remove parser_errposition if it's there, because the option
name in the error message gives the user enough information to locate
the problem; 2) change the message to something like "option \"%s\"
conflicts with option \"%s\"".

> In some cases it is possible that you'd end up with two messages, one
> for "redundant" and one for the other ways for options to conflict with
> others; for example collationcmds.c has one that's not as obvious, and

And yes, we need to divide up the message for conflicting and
redundant options on a case-to-case basis.

In createdb: we just need to modify the error message to "conflicting
options" or we could just get rid of errdetail and have the error
message directly saying "LOCALE cannot be specified together with
LC_COLLATE or LC_CTYPE". Redundant options are just caught in the
above for loop in createdb.
if (dlocale && (dcollate || dctype))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
 errmsg("conflicting or redundant options"),
 errdetail("LOCALE cannot be specified together with
LC_COLLATE or LC_CTYPE.")));

In AlterDatabase: we can remove parser_errposition because the option
name in the error message would give the right information.
if (list_length(stmt->options) != 1)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("option \"%s\" cannot be specified with
other options",
dtablespace->defname),
 parser_errposition(pstate, dtablespace->location)));

In compute_common_attribute: we can remove goto duplicate_error and
have the message change to "option \"%s\" specified more than once".

In DefineType: we need to rework the for loop.
Another problem I found is that collationcmds.c doesn't error out if
some of the options are specified more than once, as in the example
below. I think the option-checking "for loop" in DefineCollation
needs to be reworked.
CREATE COLLATION case_insensitive (provider = icu, provider =
someother, locale = '@colStrength=secondary', deterministic = false,
deterministic = true);

> force_quote/force_quote_all in COPY have their own thing too.

We can remove the errhint for force_not_null and force_null along with
the error message wording change to "option \"%s\" specified more than
once".

Looking at the "conflicting or redundant options" instances, doing the
above requires a lot of code changes; I'm not sure that will be
acceptable.

One thing is that all the option-checking for loops do the same things:
1) fetching the option values (bool, int, float, string) and
2) checking for redundant options. I feel we need to invent a common
API to which we pass 1) a list of allowed options for a particular
command, which we can have as a static data structure
{allowed_option_name, data_type}, and 2) a list of user-specified
options; 3) the API would return a list of fetched (i.e. parsed) values
{user_specified_option_name, data_type, value}. Maybe the API can
return a hash table of these values so that callers can look up the
required option faster. The advantage of this API is that we wouldn't
need so many option-checking for loops in the code.
I'm not sure it is worth doing though. Thoughts?
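
To make the shape of that idea concrete, a very rough sketch (all type
and function names here are hypothetical, not existing backend APIs)
could look like:

typedef enum OptionValueType
{
	OPTVAL_BOOL,
	OPTVAL_INT,
	OPTVAL_FLOAT,
	OPTVAL_STRING
} OptionValueType;

/* static per-command table: which options are allowed and how to parse them */
typedef struct AllowedOption
{
	const char *name;			/* e.g. "streaming" */
	OptionValueType type;
} AllowedOption;

/* one entry per user-specified option, as returned by the common code */
typedef struct ParsedOption
{
	const char *name;
	OptionValueType type;
	Datum		value;			/* parsed value */
	int			location;		/* for parser_errposition() */
} ParsedOption;

/*
 * Walks the user's DefElem list once, rejects options not in "allowed",
 * raises errmsg("option \"%s\" specified more than once", ...) with an
 * error cursor on duplicates, and returns a hash table of ParsedOption
 * entries keyed by option name so callers can look up what they need.
 */
extern HTAB *ParseGenericOptions(ParseState *pstate,
								 const AllowedOption *allowed, int nallowed,
								 List *userOptions);

Each command would then keep only a small allowed-options table plus a
few hash lookups instead of its own option-checking for loop.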

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Alvaro Herrera
On 2021-Apr-26, Bharath Rupireddy wrote:

> I agree that we can just be clear about the problem. Looks like the
> majority of the errors "conflicting or redundant options" are for
> redundant options. So, wherever "conflicting or redundant options"
> exists: 1) change the message to "option \"%s\" specified more than
> once" and remove parser_errposition if it's there because the option
> name in the error message would give the info with which user can
> point to the location

Hmm, I would keep the parser_errposition() even if the option name is
mentioned in the error message.  There's no harm in being a little
redundant, with both the option name and the error cursor showing the
same thing.

> 2) change the message to something like "option \"%s\" is conflicting
> with option \"%s\"".

Maybe, but since these would all be special cases, I think we'd need to
discuss them individually.  I would suggest that in order not to stall
this patch, these cases should all stay as "redundant or conflicting
options" -- that is, avoid any further change apart from exactly the
thing you came here to change.  You can submit a 0002 patch to change
those other errors.  That way, even if those changes end up rejected for
whatever reason, you still got your 0001 done (which would change the
bulk of "conflicting or redundant" error to the "option %s already
specified" error).  Some progress is better than none.

-- 
Álvaro Herrera39°49'30"S 73°17'W
 It does it in a really, really complicated way
 why does it need to be complicated?
 Because it's MakeMaker.




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Bruce Momjian
On Mon, Apr 26, 2021 at 05:34:30PM +0200, Christoph Berg wrote:
> Re: Peter Eisentraut
> > > Agreed.  If pg_stat_statements were zero-configuration today then
> > > this would be an annoying new burden, but it isn't.
> > 
> > I think people can understand "add pg_stat_statements to
> > shared_preload_libraries" and "install the extension".  You have to turn it
> > on somehow after all.
> 
> Fwiw, I'd claim that pg_stat_statements *is* zero-configuration today.
> You just have to load the module (= shared_preload_libraries), and it
> will start working. Later you can run CREATE EXTENSION to actually see
> the stats, but they are already being collected in the background.
> 
> > Now there is the additional burden of turning on this weird setting that no
> > one understands.  That's a 50% increase in burden.
> > 
> > And almost no one will want to use a nondefault setting.
> > 
> > pg_stat_statements is pretty popular.  I think leaving in this requirement
> > will lead to widespread confusion and complaints.
> 
> Ack, please make the default config (i.e. after setting 
> shared_preload_libraries)
> do something sensible. Having to enable some "weird" internal other setting
> will be very hard to explain to users.
> 
> Fwiw, I'd even want to have pg_stat_statements enabled in core by
> default. That would awesome UX. (And turning off could be as simple as
> setting compute_query_id=off.)

Technically, pg_stat_statements can turn on compute_query_id when it is
loaded, even if it is 'off' in postgresql.conf, right?  And
pg_stat_statements would know if an alternate hash method is being used,
right?
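
Roughly, and only as a sketch (whether an unconditional override like
this is even desirable is exactly what's being discussed), the module's
_PG_init() could do:

/* in pg_stat_statements.c; needs utils/guc.h and miscadmin.h */
void
_PG_init(void)
{
	if (!process_shared_preload_libraries_in_progress)
		return;

	/* ... existing GUC definitions and hook installation ... */

	/*
	 * Force query-id computation on, regardless of what postgresql.conf
	 * says.  As noted downthread, this would also make it impossible to
	 * turn off or replace afterwards.
	 */
	SetConfigOption("compute_query_id", "on",
					PGC_POSTMASTER, PGC_S_OVERRIDE);
}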

This is closer to Magnus's idea of having a three-value
compute_query_id, except it is more controlled by pg_stat_statements. 
Another idea would be to throw a user-visible warning if the
pg_stat_statements extension is loaded and compute_query_id is off.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Mon, Apr 26, 2021 at 05:34:30PM +0200, Christoph Berg wrote:
> > Re: Peter Eisentraut
> > > > Agreed.  If pg_stat_statements were zero-configuration today then
> > > > this would be an annoying new burden, but it isn't.
> > > 
> > > I think people can understand "add pg_stat_statements to
> > > shared_preload_libraries" and "install the extension".  You have to turn 
> > > it
> > > on somehow after all.
> > 
> > Fwiw, I'd claim that pg_stat_statements *is* zero-configuration today.
> > You just have to load the module (= shared_preload_libraries), and it
> > will start working. Later you can run CREATE EXTENSION to actually see
> > the stats, but they are already being collected in the background.
> > 
> > > Now there is the additional burden of turning on this weird setting that 
> > > no
> > > one understands.  That's a 50% increase in burden.
> > > 
> > > And almost no one will want to use a nondefault setting.
> > > 
> > > pg_stat_statements is pretty popular.  I think leaving in this requirement
> > > will lead to widespread confusion and complaints.
> > 
> > Ack, please make the default config (i.e. after setting 
> > shared_preload_libraries)
> > do something sensible. Having to enable some "weird" internal other setting
> > will be very hard to explain to users.
> > 
> > Fwiw, I'd even want to have pg_stat_statements enabled in core by
> > default. That would awesome UX. (And turning off could be as simple as
> > setting compute_query_id=off.)
> 
> Techically, pg_stat_statements can turn on compute_query_id when it is
> loaded, even if it is 'off' in postgresql.conf, right?  And
> pg_stat_statements would know if an alternate hash method is being used,
> right?

+1 on this approach.  I agree that we should avoid making every new
user, and every upgrading user with pg_stat_statements installed, have
to go twiddle this parameter.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

2021-04-26 Thread Dean Rasheed
On Mon, 26 Apr 2021 at 15:55, Tom Lane  wrote:
>
> I checked into the commit history (how'd we ever survive without "git
> blame"?) and found that my argument above is actually wrong in detail.
> Before cab5dc5da of 2013-10-18, rewriteTargetListIU expanded non-updated
> columns for all views not only trigger-updatable ones.  However, that
> behavior itself goes back only to 2ec993a7c of 2010-10-10, which added
> triggers on views; before that there was indeed no such expansion.

Ah, that makes sense. Before cab5dc5da, expanding non-updated columns
of auto-updatable views was safe because until then a view was only
auto-updatable if all its columns were. It was still unnecessary work
though, and with 20/20 hindsight, when triggers on views were first
added in 2ec993a7c, it probably should have only expanded the
targetlist for trigger-updatable views.

> Of course the view rewrite mechanisms are ten or so years older than
> that, so the conclusion that they weren't designed to need this still
> stands.

Yeah, I think that conclusion is right. The trickiest part I found was
deciding whether any product queries from conditional rules would do
the right thing if the main trigger-updatable query no longer expands
its targetlist. But I think that has to be OK, because even before
trigger-updatable views were added, it was possible to have product
queries from conditional rules together with an unconditional
do-nothing rule, so the product queries don't rely on the expanded
targetlist, and never have.

Regards,
Dean




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Tom Lane
Stephen Frost  writes:
> * Bruce Momjian (br...@momjian.us) wrote:
>> Techically, pg_stat_statements can turn on compute_query_id when it is
>> loaded, even if it is 'off' in postgresql.conf, right?  And
>> pg_stat_statements would know if an alternate hash method is being used,
>> right?

> +1 on this approach.

That'd make it impossible to turn off or adjust afterwards, wouldn't it?
I'm afraid the confusion stemming from that would outweigh any simplicity.

I would be in favor of logging a message at startup to the effect of
"this is misconfigured" (as per upthread discussion), although whether
people would see that is uncertain.

In the end, it's not like this is the first time we've ever made an
incompatible change in configuration needs; and it won't be the last
either.  I don't buy the argument that pg_stat_statements users can't
cope with adding the additional setting.  (Of course, we should be
careful to call it out as an incompatible change in the release notes.)

regards, tom lane




ERROR: relation "sql_features" does not exist

2021-04-26 Thread Pavel Stehule
Hi

I tried to write a query that does a lateral join between
information_schema.tables and the pgstattuple function.

select * from information_schema.tables, lateral(select * from
pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';

The query finished with a strange error:

postgres=# select * from information_schema.tables, lateral(select * from
pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';
ERROR:  relation "sql_features" does not exist

When I set search_path to information_schema, the query runs.
But there is no reason why that should be necessary.

I found this issue on PG 11.11, but the same behavior is present on the master branch.

Regards

Pavel


Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Bruce Momjian
On Mon, Apr 26, 2021 at 12:56:13PM -0400, Tom Lane wrote:
> Stephen Frost  writes:
> > * Bruce Momjian (br...@momjian.us) wrote:
> >> Techically, pg_stat_statements can turn on compute_query_id when it is
> >> loaded, even if it is 'off' in postgresql.conf, right?  And
> >> pg_stat_statements would know if an alternate hash method is being used,
> >> right?
> 
> > +1 on this approach.
> 
> That'd make it impossible to turn off or adjust afterwards, wouldn't it?
> I'm afraid the confusion stemming from that would outweigh any simplicity.
> 
> I would be in favor of logging a message at startup to the effect of
> "this is misconfigured" (as per upthread discussion), although whether
> people would see that is uncertain.

I think a user-visible warning at CREATE EXTENSION would help too.

> In the end, it's not like this is the first time we've ever made an
> incompatible change in configuration needs; and it won't be the last
> either.  I don't buy the argument that pg_stat_statements users can't
> cope with adding the additional setting.  (Of course, we should be
> careful to call it out as an incompatible change in the release notes.)

Agreed.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Synchronous commit behavior during network outage

2021-04-26 Thread Ondřej Žižka

Hello Andrey,

I went through the thread for your patch and it seems to me an 
acceptable solution...


> The only case patch does not handle is sudden backend crash - 
> Postgres will recover without a restart.


We also use an HA tool (Patroni). If the whole machine fails, it will 
find a new master and it should be OK. We use a 4-node setup (2 sync 
replicas and 1 async from every replica). If there is an issue with just 
a sync replica (async operating normally) and the master fails completely 
in this situation, it will be handled by Patroni (the async replica 
becomes another sync); but if it is just the backend process, the master 
will not fail over and the changes will still be visible...


If the sync replica outage is temporary, it will resolve itself when the 
node establishes a replication slot again... If the outage is "long", 
Patroni will remove the "old" sync replica from the cluster and the 
async replica reading from the master would become the new sync. So 
yes... In a 2-node setup this can be an issue, but in a 4-node setup 
this seems to me like a solution.
The only situation I can imagine is when the client connections use a 
different network than the replication network, the replication network 
is down completely, but the client network is up. In that case, the 
master can be an "isolated island" and if it fails, we can lose the 
changed data.
Is this situation also covered by your model: "transaction effects 
should not be observable on the primary until the requirements of 
synchronous_commit are satisfied"?


Do you agree with my thoughts?

Maybe it would be possible to implement this in PostgreSQL with a note 
in the documentation that a multi-node (>=3 nodes) cluster is necessary.


Regards
Ondrej

On 22/04/2021 05:55, Andrey Borodin wrote:
> Hi Ondrej!
>
>> On 19 Apr 2021, at 22:19, Ondřej Žižka  wrote:
>>
>> Do you think, that would be possible to implement a process that would
>> solve this use case?
>> Thank you
>> Ondrej
>
> Feel free to review the patch fixing this at [0]. It's classified as
> "Server Features", but I'm sure it's a bug fix.
>
> Yandex.Cloud PG runs with this patch for more than half a year, because
> we cannot afford losing data in HA clusters.
>
> It's a somewhat incomplete solution, because PG restart or crash recovery
> will make waiting transactions visible. But we protect from this on the
> HA tool's side.
>
> Best regards, Andrey Borodin.
>
> [0] https://commitfest.postgresql.org/33/2402/





Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Julien Rouhaud
On Tue, Apr 27, 2021 at 12:56 AM Tom Lane  wrote:
>
> Stephen Frost  writes:
> > * Bruce Momjian (br...@momjian.us) wrote:
> >> Techically, pg_stat_statements can turn on compute_query_id when it is
> >> loaded, even if it is 'off' in postgresql.conf, right?  And
> >> pg_stat_statements would know if an alternate hash method is being used,
> >> right?
>
> > +1 on this approach.
>
> That'd make it impossible to turn off or adjust afterwards, wouldn't it?

I think so, which would also make it impossible to use an external
query_id plugin.

Enabling compute_query_id by default or raising a WARNING in
pg_stat_statements' PG_INIT seem like the only two sensible options.




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Magnus Hagander
On Mon, Apr 26, 2021 at 6:56 PM Tom Lane  wrote:
>
> Stephen Frost  writes:
> > * Bruce Momjian (br...@momjian.us) wrote:
> >> Techically, pg_stat_statements can turn on compute_query_id when it is
> >> loaded, even if it is 'off' in postgresql.conf, right?  And
> >> pg_stat_statements would know if an alternate hash method is being used,
> >> right?
>
> > +1 on this approach.
>
> That'd make it impossible to turn off or adjust afterwards, wouldn't it?
> I'm afraid the confusion stemming from that would outweigh any simplicity.

That's why I suggested the three-value one. Default to a mode where
it's automatic, which is what the majority is going to want, but have
a way to explicitly turn it on.


> I would be in favor of logging a message at startup to the effect of
> "this is misconfigured" (as per upthread discussion), although whether
> people would see that is uncertain.

Some people would. Many wouldn't, and sadly many hours would be spent
on debugging things before they got there -- based on experience of
how many people actually read the logs..

> In the end, it's not like this is the first time we've ever made an
> incompatible change in configuration needs; and it won't be the last
> either.  I don't buy the argument that pg_stat_statements users can't
> cope with adding the additional setting.  (Of course, we should be
> careful to call it out as an incompatible change in the release notes.)

The fact that we've made changes before that complicated our users'
experience isn't in itself an argument for doing it again though...

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Magnus Hagander
On Mon, Apr 26, 2021 at 7:00 PM Bruce Momjian  wrote:
>
> On Mon, Apr 26, 2021 at 12:56:13PM -0400, Tom Lane wrote:
> > Stephen Frost  writes:
> > > * Bruce Momjian (br...@momjian.us) wrote:
> > >> Techically, pg_stat_statements can turn on compute_query_id when it is
> > >> loaded, even if it is 'off' in postgresql.conf, right?  And
> > >> pg_stat_statements would know if an alternate hash method is being used,
> > >> right?
> >
> > > +1 on this approach.
> >
> > That'd make it impossible to turn off or adjust afterwards, wouldn't it?
> > I'm afraid the confusion stemming from that would outweigh any simplicity.
> >
> > I would be in favor of logging a message at startup to the effect of
> > "this is misconfigured" (as per upthread discussion), although whether
> > people would see that is uncertain.
>
> I think a user-visible warning at CREATE EXNTENSION would help too.

It would help a bit, but actually logging it would probably help more.
Most people don't run the CREATE EXTENSION commands manually; it's all
done as part of either system install scripts or application
migrations.

But that doesn't mean it wouldn't be useful to do it for those that
*do* run things manually, it just wouldn't be sufficient in itself.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: ERROR: relation "sql_features" does not exist

2021-04-26 Thread Tom Lane
Pavel Stehule  writes:
> I tried to write a query that does lateral join between
> information_schema.tables and pgstattuple function.

> select * from information_schema.tables, lateral(select * from
> pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';

> The query finished by strange error

> postgres=# select * from information_schema.tables, lateral(select * from
> pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';
> ERROR:  relation "sql_features" does not exist

> When I set search_path to information_schema, then the query is running.
> But there is not any reason why it should be necessary.

Nope, this is classic user error, nothing else.  "table_name::name"
is entirely inadequate as a way to reference a table that isn't
visible in your search path.  You have to incorporate the schema
name as well.

Ideally you'd just pass the table OID to the OID-accepting version of
pgstattuple(), but of course the information_schema schema views
don't expose OIDs.  So basically you need something like

pgstattuple((quote_ident(table_schema)||'.'||quote_ident(table_name))::regclass)

although perhaps format() could help a little here.

regards, tom lane




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Julien Rouhaud
On Tue, Apr 27, 2021 at 1:04 AM Magnus Hagander  wrote:
>
> Thatäs why I suggested the three value one. Default to a mode where
> it's automatic, which is what the majority is going to want, but have
> a way to explicitly turn it on.

Agreed, that also sounds like a sensible default.




Re: ERROR: relation "sql_features" does not exist

2021-04-26 Thread Pavel Stehule
po 26. 4. 2021 v 19:10 odesílatel Tom Lane  napsal:

> Pavel Stehule  writes:
> > I tried to write a query that does lateral join between
> > information_schema.tables and pgstattuple function.
>
> > select * from information_schema.tables, lateral(select * from
> > pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';
>
> > The query finished by strange error
>
> > postgres=# select * from information_schema.tables, lateral(select * from
> > pgstattuple(table_name::name)) s where table_type = 'BASE TABLE';
> > ERROR:  relation "sql_features" does not exist
>
> > When I set search_path to information_schema, then the query is running.
> > But there is not any reason why it should be necessary.
>
> Nope, this is classic user error, nothing else.  "table_name::name"
> is entirely inadequate as a way to reference a table that isn't
> visible in your search path.  You have to incorporate the schema
> name as well.
>
> Ideally you'd just pass the table OID to the OID-accepting version of
> pgstattuple(), but of course the information_schema schema views
> don't expose OIDs.  So basically you need something like
>
>
> pgstattuple((quote_ident(table_schema)||'.'||quote_ident(table_name))::regclass)
>
> although perhaps format() could help a little here.
>

I understand now. Thank you for the explanation.

select * from information_schema.tables, lateral(select * from
pgstattuple(format('%I.%I', table_schema, table_name))) s where table_type
= 'BASE TABLE';

This is working

Regards

Pavel

>
> regards, tom lane
>


Re: Issue in recent pg_stat_statements?

2021-04-26 Thread Julien Rouhaud
On Mon, Apr 26, 2021 at 11:40 PM David Christensen
 wrote:
>>
>> > Is this an expected change, or is this in fact broken?  In previous 
>> > revisions, this was showing the INSERT and SELECT at the very least.  I'm 
>> > unclear as to why the regression test is still passing, so want to verify 
>> > that I'm not doing something wrong in the testing.
>>
>> Yes, you want to look into the queryid functionality. See
>> https://www.postgresql.org/message-id/flat/35457b09-36f8-add3-1d07-6034fa585ca8%40oss.nttdata.com
>>
>> Interface changes may still be coming in 14 for that. Or warnings.
>
>
> Hmm, I'm unclear as to why you would potentially want to use 
> pg_stat_statements *without* this functionality.

Using pg_stat_statements with a different query_id semantics without
having to fork pg_stat_statements.




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Stephen Frost
Greetings,

* Magnus Hagander (mag...@hagander.net) wrote:
> On Mon, Apr 26, 2021 at 6:56 PM Tom Lane  wrote:
> > Stephen Frost  writes:
> > > * Bruce Momjian (br...@momjian.us) wrote:
> > >> Techically, pg_stat_statements can turn on compute_query_id when it is
> > >> loaded, even if it is 'off' in postgresql.conf, right?  And
> > >> pg_stat_statements would know if an alternate hash method is being used,
> > >> right?
> >
> > > +1 on this approach.
> >
> > That'd make it impossible to turn off or adjust afterwards, wouldn't it?
> > I'm afraid the confusion stemming from that would outweigh any simplicity.

I don't know that it actually would, but ...

> Thatäs why I suggested the three value one. Default to a mode where
> it's automatic, which is what the majority is going to want, but have
> a way to explicitly turn it on.

This is certainly fine with me too, though it seems a bit surprising to
me that we couldn't just figure out what the user actually wants based
on what's installed/running for any given combination.

> > In the end, it's not like this is the first time we've ever made an
> > incompatible change in configuration needs; and it won't be the last
> > either.  I don't buy the argument that pg_stat_statements users can't
> > cope with adding the additional setting.  (Of course, we should be
> > careful to call it out as an incompatible change in the release notes.)
> 
> The fact that we've made changes before that complicated our users
> experience isn't in itself an argument for doing it again though...

I'm generally a proponent of sensible changes across major versions even
if it means that the user has to adjust things, but this seems like a
case where we're punting on something that we really should just be able
to figure out the right answer to, and that seems like a step backwards.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Tom Lane
Stephen Frost  writes:
> * Magnus Hagander (mag...@hagander.net) wrote:
>> That's why I suggested the three value one. Default to a mode where
>> it's automatic, which is what the majority is going to want, but have
>> a way to explicitly turn it on.

> This is certainly fine with me too, though it seems a bit surprising to
> me that we couldn't just figure out what the user actually wants based
> on what's installed/running for any given combination.

I'd be on board with having pg_stat_statement's pg_init function do
something to adjust the setting, if we can figure out how to do that
in a way that's not confusing in itself.  I'm not sure though that
the GUC engine offers a good way.

regards, tom lane




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> >> That's why I suggested the three value one. Default to a mode where
> >> it's automatic, which is what the majority is going to want, but have
> >> a way to explicitly turn it on.
> 
> > This is certainly fine with me too, though it seems a bit surprising to
> > me that we couldn't just figure out what the user actually wants based
> > on what's installed/running for any given combination.
> 
> I'd be on board with having pg_stat_statement's pg_init function do
> something to adjust the setting, if we can figure out how to do that
> in a way that's not confusing in itself.  I'm not sure though that
> the GUC engine offers a good way.

Both of the extensions are getting loaded via pg_stat_statements and
both can have pg_init functions which work together to come up with the
right answer, no?

That is- can't pg_stat_statements, when it's loaded, enable
compute_query_id if it's not already enabled, and then the pg_queryid
module simply disable it when it gets loaded in its _PG_init()?  Telling
people who are using pg_queryid to have it loaded *after*
pg_stat_statements certainly seems reasonable to me, but if folks don't
like that then maybe have a tri-state which is 'auto', 'on', and 'off',
where pg_stat_statements would set it to 'on' if it's set to 'auto', but
not do anything if it starts as 'off'.  pg_queryid would then set it to
'off' when it's loaded and it wouldn't matter if it's loaded before or
after.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Alvaro Herrera
On 2021-Apr-26, Tom Lane wrote:

> Stephen Frost  writes:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> >> That's why I suggested the three value one. Default to a mode where
> >> it's automatic, which is what the majority is going to want, but have
> >> a way to explicitly turn it on.
> 
> > This is certainly fine with me too, though it seems a bit surprising to
> > me that we couldn't just figure out what the user actually wants based
> > on what's installed/running for any given combination.
> 
> I'd be on board with having pg_stat_statement's pg_init function do
> something to adjust the setting, if we can figure out how to do that
> in a way that's not confusing in itself.  I'm not sure though that
> the GUC engine offers a good way.

I think it's straightforward, if we decouple the tri-valued enum used
for guc.c purposes from a separate boolean that actually enables the
feature.  GUC sets the boolean to "off" initially when it sees the enum
as "auto", and then pg_stat_statement's _PG_init modifies it during its
own startup as needed.

So the user can turn the GUC off, and then pg_stat_statement does
nothing and there is no performance drawback; or leave it "auto" and
it'll only compute query_id if pg_stat_statement is loaded; or leave it
on if they want the query_id for other purposes.
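
To make that concrete, a rough sketch of the decoupling could look like the
following; the variable and function names are illustrative only, not a
committed design:

/* Sketch only: tri-valued enum GUC plus the boolean the backend consults. */
typedef enum ComputeQueryIdType
{
    COMPUTE_QUERY_ID_OFF,
    COMPUTE_QUERY_ID_ON,
    COMPUTE_QUERY_ID_AUTO
} ComputeQueryIdType;

int     compute_query_id = COMPUTE_QUERY_ID_AUTO;  /* user-visible GUC */
bool    query_id_enabled = false;   /* what executor/parse code checks */

/* A module that needs query ids (e.g. pg_stat_statement's _PG_init) calls: */
void
EnableQueryId(void)
{
    if (compute_query_id != COMPUTE_QUERY_ID_OFF)
        query_id_enabled = true;
}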

-- 
Álvaro Herrera39°49'30"S 73°17'W
"El miedo atento y previsor es la madre de la seguridad" (E. Burke)




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Stephen Frost
Greetings,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:
> On 2021-Apr-26, Tom Lane wrote:
> 
> > Stephen Frost  writes:
> > > * Magnus Hagander (mag...@hagander.net) wrote:
> > >> That's why I suggested the three value one. Default to a mode where
> > >> it's automatic, which is what the majority is going to want, but have
> > >> a way to explicitly turn it on.
> > 
> > > This is certainly fine with me too, though it seems a bit surprising to
> > > me that we couldn't just figure out what the user actually wants based
> > > on what's installed/running for any given combination.
> > 
> > I'd be on board with having pg_stat_statement's pg_init function do
> > something to adjust the setting, if we can figure out how to do that
> > in a way that's not confusing in itself.  I'm not sure though that
> > the GUC engine offers a good way.
> 
> I think it's straightforward, if we decouple the tri-valued enum used
> for guc.c purposes from a separate boolean that actually enables the
> feature.  GUC sets the boolean to "off" initially when it sees the enum
> as "auto", and then pg_stat_statement's _PG_init modifies it during its
> own startup as needed.
> 
> So the user can turn the GUC off, and then pg_stat_statement does
> nothing and there is no performance drawback; or leave it "auto" and
> it'll only compute query_id if pg_stat_statement is loaded; or leave it
> on if they want the query_id for other purposes.

Yeah, this is more-or-less the same as what I was just proposing in an
email that crossed this one.  Using a separate boolean would certainly
be fine.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pg_amcheck contrib application

2021-04-26 Thread Mark Dilger


> On Apr 23, 2021, at 3:01 PM, Mark Dilger  wrote:
> 
> I'll try to post something that accomplishes the changes to the reports that 
> you are looking for.

The attached patch changes amcheck corruption reports as discussed upthread.  
This patch is submitted for the v14 development cycle as a bug fix, per your 
complaint that the committed code generates reports sufficiently confusing to a 
user as to constitute a bug.

All other code refactoring and additional checks discussed upthread are 
reserved for the v15 development cycle and are not included here.

The minimal patch (not attached) that does not rename any variables is 135 
lines.  Your patch was 159 lines.  The patch (attached) which includes your 
variable renaming is 174 lines.



v23-0001-amcheck-adjusting-corruption-report-output.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: Issue in recent pg_stat_statements?

2021-04-26 Thread David Christensen
On Mon, Apr 26, 2021 at 12:18 PM Julien Rouhaud  wrote:

> On Mon, Apr 26, 2021 at 11:40 PM David Christensen
>  wrote:
> >>
> >> > Is this an expected change, or is this in fact broken?  In previous
> revisions, this was showing the INSERT and SELECT at the very least.  I'm
> unclear as to why the regression test is still passing, so want to verify
> that I'm not doing something wrong in the testing.
> >>
> >> Yes, you want to look into the queryid functionality. See
> >>
> https://www.postgresql.org/message-id/flat/35457b09-36f8-add3-1d07-6034fa585ca8%40oss.nttdata.com
> >>
> >> Interface changes may still be coming in 14 for that. Or warnings.
> >
> >
> > Hmm, I'm unclear as to why you would potentially want to use
> pg_stat_statements *without* this functionality.
>
> Using pg_stat_statements with a different query_id semantics without
> having to fork pg_stat_statements.
>

I can see that argument for allowing alternatives, but the current default
of nothing seems to be particularly non-useful, so some sensible default
value would seem to be in order, or I can predict a whole mess of future
user complaints.


Re: Issue in recent pg_stat_statements?

2021-04-26 Thread Andres Freund
On 2021-04-26 12:53:30 -0500, David Christensen wrote:
> On Mon, Apr 26, 2021 at 12:18 PM Julien Rouhaud  wrote:
> > Using pg_stat_statements with a different query_id semantics without
> > having to fork pg_stat_statements.
> >
> 
> I can see that argument for allowing alternatives, but the current default
> of nothing seems to be particularly non-useful, so some sensible default
> value would seem to be in order, or I can predict a whole mess of future
> user complaints.

+1




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 13:43:31 -0400, Alvaro Herrera wrote:
> I think it's straightforward, if we decouple the tri-valued enum used
> for guc.c purposes from a separate boolean that actually enables the
> feature.  GUC sets the boolean to "off" initially when it sees the enum
> as "auto", and then pg_stat_statement's _PG_init modifies it during its
> own startup as needed.

> So the user can turn the GUC off, and then pg_stat_statement does
> nothing and there is no performance drawback; or leave it "auto" and
> it'll only compute query_id if pg_stat_statement is loaded; or leave it
> on if they want the query_id for other purposes.

I think that's the right direction. I wonder though if we shouldn't go a
bit further. Have one guc that determines the "query id provider" (NULL
or a shared library), and one GUC that configures whether query-id is
computed (never, on-demand/auto, always). For the provider GUC load the
.so and look up a function with some well known name.
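
A very rough sketch of that, just to illustrate the shape; the GUC assign
hook, the provider function name and its signature are all hypothetical here:

/* Sketch only: resolve a provider library named by a string GUC and look up
 * a function with a well-known name.  Nothing here is existing code. */
typedef uint64 (*queryid_compute_fn) (Query *query, const char *query_text);

static queryid_compute_fn queryid_provider = NULL;

static void
assign_queryid_provider(const char *newval, void *extra)
{
    if (newval == NULL || newval[0] == '\0')
        queryid_provider = NULL;    /* fall back to the in-core computation */
    else
        queryid_provider = (queryid_compute_fn)
            load_external_function(newval, "compute_query_id", true, NULL);
}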

Greetings,

Andres Freund




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Tom Lane
Andres Freund  writes:
> I think that's the right direction. I wonder though if we shouldn't go a
> bit further. Have one guc that determines the "query id provider" (NULL
> or a shared library), and one GUC that configures whether query-id is
> computed (never, on-demand/auto, always). For the provider GUC load the
> .so and look up a function with some well known name.

That's sounding like a pretty sane design, actually.  Not sure about
the shared-library-name-with-fixed-function-name detail, but certainly
it seems to be useful to separate "I need a query-id" from the details
of the ID calculation.

Rather than a GUC per se for the ID provider, maybe we could have a
function hook that defaults to pointing at the in-core computation,
and then a module wanting to override that just gets into the hook.

regards, tom lane




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 14:21:00 -0400, Tom Lane wrote:
> Andres Freund  writes:
> That's sounding like a pretty sane design, actually.  Not sure about
> the shared-library-name-with-fixed-function-name detail, but certainly
> it seems to be useful to separate "I need a query-id" from the details
> of the ID calculation.
> 
> Rather than a GUC per se for the ID provider, maybe we could have a
> function hook that defaults to pointing at the in-core computation,
> and then a module wanting to override that just gets into the hook.

I have a preference for determining the provider via GUC instead of a
hook, because it is both easier to introspect and easier to configure.

If the provider is loaded via a hook, and the shared library is loaded
via shared_preload_libraries, one can't easily just turn that off in a
single session, but needs to restart or explicitly load a different
library (that can't already be loaded).

We also don't have any way to show what's hooking into a hook.

Greetings,

Andres Freund




Re: compute_query_id and pg_stat_statements

2021-04-26 Thread Magnus Hagander
On Mon, Apr 26, 2021 at 8:14 PM Andres Freund  wrote:
>
> Hi,
>
> On 2021-04-26 13:43:31 -0400, Alvaro Herrera wrote:
> > I think it's straightforward, if we decouple the tri-valued enum used
> > for guc.c purposes from a separate boolean that actually enables the
> > feature.  GUC sets the boolean to "off" initially when it sees the enum
> > as "auto", and then pg_stat_statement's _PG_init modifies it during its
> > own startup as needed.

That's pretty much exactly my original suggestion, yes :)


> > So the user can turn the GUC off, and then pg_stat_statement does
> > nothing and there is no performance drawback; or leave it "auto" and
> > it'll only compute query_id if pg_stat_statement is loaded; or leave it
> > on if they want the query_id for other purposes.
>
> I think that's the right direction. I wonder though if we shouldn't go a
> bit further. Have one guc that determines the "query id provider" (NULL
> or a shared library), and one GUC that configures whether query-id is
> computed (never, on-demand/auto, always). For the provider GUC load the
> .so and look up a function with some well known name.

On Mon, Apr 26, 2021 at 8:37 PM Andres Freund  wrote:
> I have a preference to determining the provider via GUC instead of a
> hook because it is both easier to introspect and easier to configure.

+1 in general. Though we could of course also have a read-only
internal GUC that would show what we ended up with, and still
configure it with shared_preload_libraries, or loaded in some other
way. In a way it'd be cleaner to "always load modules with
shared_preload_libraries", but I can certainly see the arguments in
either direction..

But whichever way it's configured, having a well exposed way to know
what it actually is would be important.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Use simplehash.h instead of dynahash in SMgr

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-25 01:27:24 +0300, Yura Sokolov wrote:
> It is quite interesting result. Simplehash being open-addressing with
> linear probing is friendly for cpu cache. I'd recommend to define
> SH_FILLFACTOR with value lower than default (0.9). I believe 0.75 is
> suitable most for such kind of hash table.

It's not a "plain" linear probing hash table (although it is on the lookup
side). During insertions collisions are reordered so that the average distance
from the "optimal" position is ~even ("robin hood hashing"). That allows a
higher load factor than a plain linear probed hash table (for which IIRC
there's data to show 0.75 to be a good default load factor).
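
For readers who haven't seen the technique, a toy version of that insertion
rule looks roughly like this (illustrative only, not the actual simplehash.h
code; it assumes the table always has a free slot and that newentry.used is
set by the caller):

#include <stdint.h>
#include <stdbool.h>

typedef struct Entry { uint32_t hash; bool used; } Entry;

/* probe distance from the entry's optimal bucket, with wrap-around */
static uint32_t
probe_distance(uint32_t optimal, uint32_t bucket, uint32_t mask)
{
    return (bucket - optimal) & mask;
}

static void
rh_insert(Entry *table, uint32_t mask, Entry newentry)
{
    uint32_t bucket = newentry.hash & mask;
    uint32_t insertdist = 0;

    for (;;)
    {
        Entry *cur = &table[bucket];

        if (!cur->used)
        {
            *cur = newentry;        /* free slot: store and stop */
            return;
        }

        /* If the resident entry is closer to home than the one being
         * inserted, displace it and carry the displaced entry onward;
         * this keeps probe distances even across the table. */
        uint32_t curdist = probe_distance(cur->hash & mask, bucket, mask);

        if (insertdist > curdist)
        {
            Entry tmp = *cur;

            *cur = newentry;
            newentry = tmp;
            insertdist = curdist;
        }

        bucket = (bucket + 1) & mask;   /* otherwise plain linear probing */
        insertdist++;
    }
}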

There of course may still be a benefit in lowering the load factor, but I'd
not start there.

David's tests aren't really suited to benchmarking the load factor, but to me
the stats he showed didn't highlight a need to lower the load factor. Lowering
the fill factor does influence the cache hit ratio...

Greetings,

Andres Freund




Re: Addition of authenticated ID to pg_stat_activity

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 11:34:16 +0900, Michael Paquier wrote:
> 9afffcb has added the concept of authenticated identity to the
> information provided in log_connections for audit purposes, with this
> data stored in each backend's port.  One extra thing that can be
> really useful for monitoring is the capability to track this
> information directly in pg_stat_activity.

I'm getting a bit worried about the incremental increase in
pg_stat_activity width - it's probably by far the view that's most
viewed interactively. I think we need to be careful not to add too niche
things to it. This is especially true for columns that may be wider.

It'd be bad for discoverability, but perhaps something like this, that's
not that likely to be used interactively, would be better done as a
separate function that would need to be used explicitly?


A select * from pg_stat_activity on a plain installation, connecting
over unix socket, with nothing running, is 411 chars wide for me...

Greetings,

Andres Freund




Re: Addition of authenticated ID to pg_stat_activity

2021-04-26 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2021-04-26 11:34:16 +0900, Michael Paquier wrote:
> > 9afffcb has added the concept of authenticated identity to the
> > information provided in log_connections for audit purposes, with this
> > data stored in each backend's port.  One extra thing that can be
> > really useful for monitoring is the capability to track this
> > information directly in pg_stat_activity.
> 
> I'm getting a bit worried about the incremental increase in
> pg_stat_activity width - it's probably by far the view that's most
> viewed interactively. I think we need to be careful not to add too niche
> things to it. This is especially true for columns that may be wider.
> 
> It'd be bad for discoverability, but perhaps something like this, that's
> not that likely to be used interactively, would be better done as a
> separate function that would need to be used explicitly?

I mean.. we already have separate functions and views for this, though
they're auth-method-specific currently, but also provide more details,
since it isn't actually a "one size fits all" kind of thing like this
entire approach is imagining it to be.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
> I'm not sure what to do about this :-( I don't have any ideas about how to
> eliminate this overhead, so the only option I see is reverting the changes
> in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
> be frozen ...

ISTM that the fundamental issue here is not that we acquire pins that we
shouldn't, but that we do so at a much higher frequency than needed.

It's probably too invasive for 14, but I think it might be worth exploring
passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
the input will be more than one row.

And then add the vm buffer of the target page to BulkInsertState, so that
hio.c can avoid re-pinning the buffer.
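
A very rough sketch of that idea, purely for illustration (the vmbuffer field
and the surrounding logic are invented here, not existing code):

typedef struct BulkInsertStateData
{
    BufferAccessStrategy strategy;  /* access strategy, as today */
    Buffer      current_buf;        /* current insertion target page */
    Buffer      vmbuffer;           /* pinned VM page, or InvalidBuffer */
} BulkInsertStateData;

/* in hio.c, roughly, where a pin on the visibility map page is needed: */
if (bistate != NULL)
{
    /* visibilitymap_pin() keeps an already-valid pin covering the right
     * map page and only re-pins when the target block maps elsewhere */
    visibilitymap_pin(relation, targetBlock, &bistate->vmbuffer);
    vmbuffer = bistate->vmbuffer;
}
else
    visibilitymap_pin(relation, targetBlock, &vmbuffer);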

Greetings,

Andres Freund




multi-version capable PostgresNode.pm

2021-04-26 Thread Andrew Dunstan


Just to complete the circle on this topic, which I intend to take up
again during the next dev cycle, I have captured the current state of my
work in a public git repo at
. This can be cloned and
used without having to change the core Postgres code, as shown in the
README.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Use simplehash.h instead of dynahash in SMgr

2021-04-26 Thread Yura Sokolov

Andres Freund wrote on 2021-04-26 21:46:
> Hi,
> 
> On 2021-04-25 01:27:24 +0300, Yura Sokolov wrote:
>> It is quite interesting result. Simplehash being open-addressing with
>> linear probing is friendly for cpu cache. I'd recommend to define
>> SH_FILLFACTOR with value lower than default (0.9). I believe 0.75 is
>> suitable most for such kind of hash table.
> 
> It's not a "plain" linear probing hash table (although it is on the lookup
> side). During insertions collisions are reordered so that the average
> distance from the "optimal" position is ~even ("robin hood hashing"). That
> allows a higher load factor than a plain linear probed hash table (for
> which IIRC there's data to show 0.75 to be a good default load factor).

Even for Robin Hood hashing a 0.9 fill factor is too high. It leads to too
many movements on insertion/deletion and a longer average collision chain.

Note that Robin Hood doesn't optimize the average case. Indeed, Robin Hood
usually has a worse (longer) average collision chain than simple linear
probing. Robin Hood hashing optimizes the worst case, i.e. it guarantees
there are no unnecessarily long collision chains.

See the discussion on the Rust hash table fill factor when it was Robin Hood:
https://github.com/rust-lang/rust/issues/38003

> There of course may still be a benefit in lowering the load factor, but I'd
> not start there.
> 
> David's tests aren't really suited to benchmarking the load factor, but to
> me the stats he showed didn't highlight a need to lower the load factor.
> Lowering the fill factor does influence the cache hit ratio...
> 
> Greetings,
> 
> Andres Freund

regards,
Yura.




Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2021-04-26 Thread Alvaro Herrera
On 2021-Apr-26, Alvaro Herrera wrote:

> > Please allow me to study the patch a bit more closely and get back tomorrow.
> 
> Sure, thanks!

Here's a more polished version.

After trying the version with the elog(ERROR) when two detached
partitions are present, I decided against it; it is unhelpful because
it doesn't let you build partition descriptors for anything.  So I made
it an elog(WARNING) (not an ereport, note), and keep the most recent
pg_inherits.xmin value.  This is not great, but it lets you out of the
situation by finalizing one of the detaches.

The other check (at ALTER TABLE .. DETACH time) is an ereport(ERROR) and
should make the first one unreachable.

-- 
Álvaro Herrera39°49'30"S 73°17'W
>From 89e8a7d170fc0c6701e536b9c5830e3a58f303cc Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 26 Apr 2021 14:53:04 -0400
Subject: [PATCH] Allow a partdesc-with-omitted-partitions to be cached

Makes partdesc acquisition faster during the transient period during
which a partition is in the process of being dropped.

Per gripe from Amit Langote
Discussion: https://postgr.es/m/ca+hiwqfgpp1lxjzobygt9rpvtjxxkg5qg2+xch2z1q7krqz...@mail.gmail.com
---
 src/backend/catalog/pg_inherits.c | 52 +++--
 src/backend/commands/tablecmds.c  | 66 +---
 src/backend/commands/trigger.c|  3 +-
 src/backend/partitioning/partdesc.c   | 75 ---
 src/include/catalog/pg_inherits.h |  6 +-
 src/include/utils/rel.h   |  3 +
 .../detach-partition-concurrently-3.out   | 57 +-
 .../detach-partition-concurrently-3.spec  |  9 ++-
 8 files changed, 205 insertions(+), 66 deletions(-)

diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c
index 6447b52854..c373faf2d6 100644
--- a/src/backend/catalog/pg_inherits.c
+++ b/src/backend/catalog/pg_inherits.c
@@ -52,6 +52,22 @@ typedef struct SeenRelsEntry
  * then no locks are acquired, but caller must beware of race conditions
  * against possible DROPs of child relations.
  *
+ * Partitions marked as being detached are omitted; see
+ * find_inheritance_children_extended for details.
+ */
+List *
+find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
+{
+	return find_inheritance_children_extended(parentrelId, true, lockmode,
+			  NULL, NULL);
+}
+
+/*
+ * find_inheritance_children_extended
+ *
+ * As find_inheritance_children, with more options regarding detached
+ * partitions.
+ *
  * If a partition's pg_inherits row is marked "detach pending",
  * *detached_exist (if not null) is set true.
  *
@@ -60,11 +76,13 @@ typedef struct SeenRelsEntry
  * marked "detach pending" is visible to that snapshot, then that partition is
  * omitted from the output list.  This makes partitions invisible depending on
  * whether the transaction that marked those partitions as detached appears
- * committed to the active snapshot.
+ * committed to the active snapshot.  In addition, *detached_xmin (if not null)
+ * is set to the xmin of the row of the detached partition.
  */
 List *
-find_inheritance_children(Oid parentrelId, bool omit_detached,
-		  LOCKMODE lockmode, bool *detached_exist)
+find_inheritance_children_extended(Oid parentrelId, bool omit_detached,
+   LOCKMODE lockmode, bool *detached_exist,
+   TransactionId *detached_xmin)
 {
 	List	   *list = NIL;
 	Relation	relation;
@@ -132,7 +150,32 @@ find_inheritance_children(Oid parentrelId, bool omit_detached,
 snap = GetActiveSnapshot();
 
 if (!XidInMVCCSnapshot(xmin, snap))
+{
+	if (detached_xmin)
+	{
+		/*
+		 * Two detached partitions should not occur (see
+		 * checks in MarkInheritDetached), but if they do,
+		 * track the newer of the two.  Make sure to warn the
+		 * user, so that they can clean up.  Since this is
+		 * just a cross-check against potentially corrupt
+		 * catalogs, we don't make it a full-fledged error
+		 * message.
+		 */
+		if (*detached_xmin != InvalidTransactionId)
+		{
+			elog(WARNING, "more than one partition pending detach found for table with OID %u",
+ parentrelId);
+			if (TransactionIdFollows(xmin, *detached_xmin))
+*detached_xmin = xmin;
+		}
+		else
+			*detached_xmin = xmin;
+	}
+
+	/* Don't add the partition to the output list */
 	continue;
+}
 			}
 		}
 
@@ -247,8 +290,7 @@ find_all_inheritors(Oid parentrelId, LOCKMODE lockmode, List **numparents)
 		ListCell   *lc;
 
 		/* Get the direct children of this rel */
-		currentchildren = find_inheritance_children(currentrel, true,
-	lockmode, NULL);
+		currentchildren = find_inheritance_children(currentrel, lockmode);
 
 		/*
 		 * Add to the queue only those children not already seen. This avoids
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7d00f4eb25..ab3c61898c

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 22:44:13 +0300, Yura Sokolov wrote:
> Even for Robin Hood hashing 0.9 fill factor is too high. It leads to too
> much movements on insertion/deletion and longer average collision chain.

That's true for modification heavy cases - but most hash tables in PG,
including the smgr one, are quite read heavy. For workloads where
there's a lot of smgr activity, the other overheads in relation
creation/drop handling are magnitudes more expensive than the collision
handling.

Note that simplehash.h also grows when the distance gets too big and
when there are too many elements to move, not just based on the fill
factor.


I kinda wish we had a chained hashtable implementation with the same
interface as simplehash. It's very use-case dependent which approach is
better, and right now we might be forcing some users to choose linear
probing because simplehash.h is still faster than dynahash, even though
chaining would actually be more appropriate.


> Note that Robin Hood doesn't optimize average case.

Right.


> See discussion on Rust hash table fill factor when it were Robin Hood:
> https://github.com/rust-lang/rust/issues/38003

The first sentence actually confirms my point above, about it being a
question of read vs write heavy.

Greetings,

Andres Freund




Re: tab-complete for ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2021-04-26 Thread Alvaro Herrera
On 2021-Apr-26, Justin Pryzby wrote:

> On Thu, Apr 22, 2021 at 04:40:35PM -0400, Alvaro Herrera wrote:
> > Would anyone oppose me pushing this for tab-completing the new keywords
> > of ALTER TABLE ..  DETACH PARTITION?
> 
> +1 to apply tab completion for v14

Pushed.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are."  -- Charles J. Sykes' advice to teenagers




Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-26 Thread Tomas Vondra




On 4/26/21 9:27 PM, Andres Freund wrote:
> Hi,
> 
> On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
>> I'm not sure what to do about this :-( I don't have any ideas about how to
>> eliminate this overhead, so the only option I see is reverting the changes
>> in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
>> be frozen ...
> 
> ISTM that the fundamental issue here is not that we acquire pins that we
> shouldn't, but that we do so at a much higher frequency than needed.
> 
> It's probably too invasive for 14, but I think it might be worth exploring
> passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
> the input will be more than one row.
> 
> And then add the vm buffer of the target page to BulkInsertState, so that
> hio.c can avoid re-pinning the buffer.

Yeah. The question still is what to do about 14, though. Shall we leave
the code as it is now, or should we change it somehow? It seems a bit
unfortunate that a COPY FREEZE optimization should negatively influence
other (more) common use cases, so I guess we can't just keep the current
code ...


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-26 Thread Andres Freund
Hi,

On 2021-04-26 23:59:17 +0200, Tomas Vondra wrote:
> On 4/26/21 9:27 PM, Andres Freund wrote:
> > On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
> > > I'm not sure what to do about this :-( I don't have any ideas about how to
> > > eliminate this overhead, so the only option I see is reverting the changes
> > > in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
> > > be frozen ...
> > 
> > ISTM that the fundamental issue here is not that we acquire pins that we
> > shouldn't, but that we do so at a much higher frequency than needed.
> > 
> > It's probably too invasive for 14, but I think it might be worth exploring
> > passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() 
> > iff
> > the input will be more than one row.
> > 
> > And then add the vm buffer of the target page to BulkInsertState, so that
> > hio.c can avoid re-pinning the buffer.
> > 
> 
> Yeah. The question still is what to do about 14, though. Shall we leave the
> code as it is now, or should we change it somehow? It seem a bit unfortunate
> that a COPY FREEZE optimization should negatively influence other (more)
> common use cases, so I guess we can't just keep the current code ...

I'd suggest prototyping the use of BulkInsertState in nodeModifyTable.c
and see whether that fixes the regression. If it does, then we can
analyze whether that's possibly the best way forward. Or whether we
revert, live with the regression or find yet another path.

Greetings,

Andres Freund




Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2021-04-26 Thread Alvaro Herrera
Sorry, I forgot to update some comments in that version.  Fixed here.

-- 
Álvaro Herrera39°49'30"S 73°17'W
>From cb6d9e026624656e826ea880716ee552b15203a8 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 26 Apr 2021 14:53:04 -0400
Subject: [PATCH v2] Allow a partdesc-omitting-partitions to be cached

Makes partition descriptor acquisition faster during the transient
period in which a partition is in the process of being detached.

Per gripe from Amit Langote
Discussion: https://postgr.es/m/ca+hiwqfgpp1lxjzobygt9rpvtjxxkg5qg2+xch2z1q7krqz...@mail.gmail.com
---
 src/backend/catalog/pg_inherits.c | 52 +-
 src/backend/commands/tablecmds.c  | 66 +++--
 src/backend/commands/trigger.c|  3 +-
 src/backend/partitioning/partdesc.c   | 96 ---
 src/include/catalog/pg_inherits.h |  6 +-
 src/include/utils/rel.h   |  3 +
 .../detach-partition-concurrently-3.out   | 57 ++-
 .../detach-partition-concurrently-3.spec  |  9 +-
 8 files changed, 219 insertions(+), 73 deletions(-)

diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c
index 6447b52854..c373faf2d6 100644
--- a/src/backend/catalog/pg_inherits.c
+++ b/src/backend/catalog/pg_inherits.c
@@ -52,6 +52,22 @@ typedef struct SeenRelsEntry
  * then no locks are acquired, but caller must beware of race conditions
  * against possible DROPs of child relations.
  *
+ * Partitions marked as being detached are omitted; see
+ * find_inheritance_children_extended for details.
+ */
+List *
+find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
+{
+	return find_inheritance_children_extended(parentrelId, true, lockmode,
+			  NULL, NULL);
+}
+
+/*
+ * find_inheritance_children_extended
+ *
+ * As find_inheritance_children, with more options regarding detached
+ * partitions.
+ *
  * If a partition's pg_inherits row is marked "detach pending",
  * *detached_exist (if not null) is set true.
  *
@@ -60,11 +76,13 @@ typedef struct SeenRelsEntry
  * marked "detach pending" is visible to that snapshot, then that partition is
  * omitted from the output list.  This makes partitions invisible depending on
  * whether the transaction that marked those partitions as detached appears
- * committed to the active snapshot.
+ * committed to the active snapshot.  In addition, *detached_xmin (if not null)
+ * is set to the xmin of the row of the detached partition.
  */
 List *
-find_inheritance_children(Oid parentrelId, bool omit_detached,
-		  LOCKMODE lockmode, bool *detached_exist)
+find_inheritance_children_extended(Oid parentrelId, bool omit_detached,
+   LOCKMODE lockmode, bool *detached_exist,
+   TransactionId *detached_xmin)
 {
 	List	   *list = NIL;
 	Relation	relation;
@@ -132,7 +150,32 @@ find_inheritance_children(Oid parentrelId, bool omit_detached,
 snap = GetActiveSnapshot();
 
 if (!XidInMVCCSnapshot(xmin, snap))
+{
+	if (detached_xmin)
+	{
+		/*
+		 * Two detached partitions should not occur (see
+		 * checks in MarkInheritDetached), but if they do,
+		 * track the newer of the two.  Make sure to warn the
+		 * user, so that they can clean up.  Since this is
+		 * just a cross-check against potentially corrupt
+		 * catalogs, we don't make it a full-fledged error
+		 * message.
+		 */
+		if (*detached_xmin != InvalidTransactionId)
+		{
+			elog(WARNING, "more than one partition pending detach found for table with OID %u",
+ parentrelId);
+			if (TransactionIdFollows(xmin, *detached_xmin))
+*detached_xmin = xmin;
+		}
+		else
+			*detached_xmin = xmin;
+	}
+
+	/* Don't add the partition to the output list */
 	continue;
+}
 			}
 		}
 
@@ -247,8 +290,7 @@ find_all_inheritors(Oid parentrelId, LOCKMODE lockmode, List **numparents)
 		ListCell   *lc;
 
 		/* Get the direct children of this rel */
-		currentchildren = find_inheritance_children(currentrel, true,
-	lockmode, NULL);
+		currentchildren = find_inheritance_children(currentrel, lockmode);
 
 		/*
 		 * Add to the queue only those children not already seen. This avoids
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7d00f4eb25..ab3c61898c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3507,7 +3507,7 @@ renameatt_internal(Oid myrelid,
 		 * expected_parents will only be 0 if we are not already recursing.
 		 */
 		if (expected_parents == 0 &&
-			find_inheritance_children(myrelid, true, NoLock, NULL) != NIL)
+			find_inheritance_children(myrelid, NoLock) != NIL)
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
 	 errmsg("inherited column \"%s\" must be renamed in child tables too",
@@ -3706,7 +3706,7 @@ rename_constraint_internal(Oid myrelid,
 		else
 		{
 			if (expected

Do we have a way to dynamically make a callback stub?

2021-04-26 Thread Chapman Flack
Hi,

A lot of the APIs in PostgreSQL that accept a callback follow the
familiar idiom of an extra void* argument allowing a single static
callback address to be multiplexed. But not all of them do. For example,
if I wanted to expose the possibility of defining GUCs from code written
in my PL, I would run into the fact that none of GUC's check/assign/show
hooks have the void* extra arg.
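
For illustration, the extra-void*-argument idiom in miniature; every name
below is made up:

#include <stdio.h>

/* A hypothetical hook type that does carry an extra pointer ... */
typedef void (*my_check_hook) (int newval, void *extra);

/* ... so one static C function can serve many registrations. */
typedef struct PLGucState { const char *pl_function; } PLGucState;

static void
pl_check_hook(int newval, void *extra)
{
    PLGucState *st = (PLGucState *) extra;

    printf("would call %s(%d) in the PL\n", st->pl_function, newval);
}

int
main(void)
{
    PLGucState a = {"check_guc_a"};
    PLGucState b = {"check_guc_b"};
    my_check_hook hook = pl_check_hook;

    /* the same C entry point, multiplexed purely via the extra argument */
    hook(1, &a);
    hook(2, &b);
    return 0;
}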

(The GUC machinery allows a way for extra info to be passed from a check
hook to an assign hook, which for a moment I thought might be abusable to
multiplex hooks, but I don't believe it is.)

Making all such APIs follow the void *extra convention might have
the virtue of consistency, but that might not be worth disturbing APIs
that have been stable for many years, and an effort to do so
might not be guaranteed to catch every such instance anyway.

A more general solution might be a function that generates a callback
stub: "please give me a void (*foo)() at a distinct address that I can
pass into this API, and when called it will call bar(baz), passing
this value for baz."

In olden days I wasn't above writing C to just slam those instructions
on the stack and return their address, but that was without multiple
architectures to think about, and non-executable stacks, and so on.
Now, there'd be a bit more magic required. Maybe some such ability is
already present in LLVM and could be exposed in jit.c?

I see that Java is currently incubating such a feature [0], so if I wait
for that I will have another option that serves my specific purposes, but
I wonder if it would be useful for PostgreSQL itself to have such
a capability available (or if it already does, and I haven't found it).

Regards,
-Chap



[0]
https://docs.oracle.com/en/java/javase/16/docs/api/jdk.incubator.foreign/jdk/incubator/foreign/CLinker.html#upcallStub(java.lang.invoke.MethodHandle,jdk.incubator.foreign.FunctionDescriptor)




Bug fix for tab completion of ALTER TABLE ... VALIDATE CONSTRAINT ...

2021-04-26 Thread David Fetter
Folks,

I noticed that $subject completes with already valid constraints,
please find attached a patch that fixes it. I noticed that there are
other places constraints can be validated, but didn't check whether
similar bugs exist there yet.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
>From e191928111a7500c3a3bc84558797fe86451c24b Mon Sep 17 00:00:00 2001
From: David Fetter 
Date: Mon, 26 Apr 2021 17:17:15 -0700
Subject: [PATCH v1] tab-completion of ALTER TABLE...VALIDATE CONSTRAINT
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="2.30.2"

This is a multi-part message in MIME format.
--2.30.2
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


...now chooses only among constraints that aren't already valid.

diff --git src/bin/psql/tab-complete.c src/bin/psql/tab-complete.c
index 7c493b..2252b7d3ac 100644
--- src/bin/psql/tab-complete.c
+++ src/bin/psql/tab-complete.c
@@ -785,6 +785,15 @@ static const SchemaQuery Query_for_list_of_collations = {
 "   and pg_catalog.quote_ident(c1.relname)='%s'"\
 "   and pg_catalog.pg_table_is_visible(c1.oid)"
 
+/* the silly-looking length condition is just to eat up the current word */
+#define Query_for_nonvalid_constraint_of_table \
+"SELECT pg_catalog.quote_ident(conname) "\
+"  FROM pg_catalog.pg_class c1, pg_catalog.pg_constraint con "\
+" WHERE c1.oid=conrelid and not convalidated"\
+"   and (%d = pg_catalog.length('%s'))"\
+"   and pg_catalog.quote_ident(c1.relname)='%s'"\
+"   and pg_catalog.pg_table_is_visible(c1.oid)"
+
 #define Query_for_all_table_constraints \
 "SELECT pg_catalog.quote_ident(conname) "\
 "  FROM pg_catalog.pg_constraint c "\
@@ -2118,11 +2127,16 @@ psql_completion(const char *text, int start, int end)
 	 * If we have ALTER TABLE  ALTER|DROP|RENAME|VALIDATE CONSTRAINT,
 	 * provide list of constraints
 	 */
-	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER|DROP|RENAME|VALIDATE", "CONSTRAINT"))
+	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER|DROP|RENAME", "CONSTRAINT"))
 	{
 		completion_info_charp = prev3_wd;
 		COMPLETE_WITH_QUERY(Query_for_constraint_of_table);
 	}
+	else if (Matches("ALTER", "TABLE", MatchAny, "VALIDATE", "CONSTRAINT"))
+	{
+		completion_info_charp = prev3_wd;
+		COMPLETE_WITH_QUERY(Query_for_nonvalid_constraint_of_table);
+	}
 	/* ALTER TABLE ALTER [COLUMN]  */
 	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER", "COLUMN", MatchAny) ||
 			 Matches("ALTER", "TABLE", MatchAny, "ALTER", MatchAny))

--2.30.2--




Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Bharath Rupireddy
On Mon, Apr 26, 2021 at 4:56 PM houzj.f...@fujitsu.com
 wrote:
> > > The parallel attributes in table means the parallel safety when user does 
> > > some
> > data-modification operations on it.
> > > So, It only limit the use of parallel plan when using 
> > > INSERT/UPDATE/DELETE.
> >
> > In that case, isn't it better to use the terminology "PARALLEL DML
> > SAFE/UNSAFE/RESTRICTED" in the code and docs? This way, it will be clear 
> > that
> > these tags don't affect parallel selects.
>
> Makes sense, I recall I have heard a similar suggestion before.
> If there are no other objections, I plan to change command to
> PARALLEL DML in my next version patches.

+1 from me. Let's hear from others.

> > Will the planner parse and check parallel safety of index (where
> > clause) expressions in case of SELECTs? I'm not sure of this. But if it 
> > does, maybe
> > we could do the same thing for parallel DML as well for normal tables?
>
> The planner does not check the index expression, because the expression will 
> not be used in SELECT.
> I think the expression is only used when a tuple inserted into the index.

Oh.

> > the overhead of parsing index expressions? If the cost is heavy for checking
> > index expressions parallel safety in case of normal tables, then the current
> > design i.e. attributing parallel safety tag to all the tables makes sense.
>
> Currently, index expression and predicate are stored in text format.
> We need to use stringToNode(expression/predicate) to parse it.
> Some committers think doing this twice does not look good,
> unless we found some ways to pass parsed info to the executor to avoid the 
> second parse.

How much is the extra cost that's added if we do stringToNode twice?
Maybe, we can check for a few sample test cases by just having
stringToNode(expression/predicate) in the planner and see if it adds
much to the total execution time of the query.
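
For context, the double parse being discussed is roughly what the backend
already does when it reads index expressions; a sketch of reading and
re-parsing pg_index.indexprs, assuming indexTuple is the index's pg_index
syscache tuple:

Datum       exprsDatum;
bool        isnull;
List       *indexprs = NIL;

/* fetch pg_index.indexprs and re-parse the stored text with stringToNode() */
exprsDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
                             Anum_pg_index_indexprs, &isnull);
if (!isnull)
{
    char       *exprsString = TextDatumGetCString(exprsDatum);

    indexprs = (List *) stringToNode(exprsString);
    pfree(exprsString);
}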

> > I was actually thinking that we will have the declarative approach only for
> > partitioned tables as it is the main problem we are trying to solve with 
> > this
> > design. Something like: users will run pg_get_parallel_safety to see the 
> > parallel
> > unsafe objects associated with a partitioned table by looking at all of its
> > partitions and be able to set a parallel dml safety tag to only partitioned 
> > tables.
>
> We originally wanted to take the declarative approach for both normal and
> partitioned tables.
> In this way, it brings no overhead to the planner and looks consistent.
> But do you think we should put some really cheap safety check in the
> planner?

I still wonder why we shouldn't limit the declarative approach to
only partitioned tables. And for normal tables, possibly with a
minimal cost(??), the server can do the safety checking. I know this
feels a little inconsistent. In the planner we will have different
paths like: if (partitioned_table) { check the parallel safety tag
associated with the table } else { check the parallel safety of the
associated objects }.

Others may have better thoughts on this.

> > >0001: provide a utility function pg_get_parallel_safety('table_name').
> > >
> > >The function returns records of (objid, classid, parallel_safety) that
> > >represent the parallel safety of objects that determine the parallel 
> > >safety of
> > the specified table.
> > >Note: The function only outputs objects that are not parallel safe.
> >
> > If it returns only parallel "unsafe" objects and not "safe" or "restricted" 
> > objects,
> > how about naming it to pg_get_table_parallel_unsafe_objects("table_name")?
>
> Currently, the function returns both unsafe and restricted objects(I thought 
> restricted is also not safe),
> because we thought users only care about the objects that affect the use of 
> parallel plan.

Hm.

> > This way we could get rid of parallel_safety in the output record? If at 
> > all users
> > want to see parallel restricted or safe objects, we can also have the
> > counterparts pg_get_table_parallel_safe_objects and
> > pg_get_table_parallel_restricted_objects. Of course, we can caution the user
> > that execution of these functions might take longer and "might consume
> > excessive memory while accumulating the output".
> >
> > Otherwise, we can have a single function pg_get_parallel_safety("table_name"
> > IN, "parallel_safety" IN, "objid"
> > OUT, "classid" OUT)? If required, we could name it
> > pg_get_parallel_safety_of_table_objects.
> >
> > Thoughts?
>
> I am not sure users will want to get the safe objects; do you have some use cases?
> If users do not care about the safe objects, I think they can use
> "SELECT * FROM pg_get_parallel_safety() where parallel_safety = 'specified 
> safety' " to get the specified objects.

I don't know of any practical scenarios, but if I'm a user, at least I
will be tempted to see the parallel safe objects associated with a
particular table along with unsafe and restricted ones. Others may
have better thoughts on this.

> > Although

Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Bharath Rupireddy
On Mon, Apr 26, 2021 at 9:24 PM Alvaro Herrera  wrote:
>
> On 2021-Apr-26, Bharath Rupireddy wrote:
>
> > I agree that we can just be clear about the problem. Looks like the
> > majority of the errors "conflicting or redundant options" are for
> > redundant options. So, wherever "conflicting or redundant options"
> > exists: 1) change the message to "option \"%s\" specified more than
> > once" and remove parser_errposition if it's there because the option
> > name in the error message would give the info with which user can
> > point to the location
>
> Hmm, I would keep the parser_errposition() even if the option name is
> mentioned in the error message.  There's no harm in being a little
> redundant, with both the option name and the error cursor showing the
> same thing.

Agreed.
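
For the record, the pattern being agreed on would look roughly like this,
assuming a ParseState and the duplicated DefElem are available at the call
site:

ereport(ERROR,
        (errcode(ERRCODE_SYNTAX_ERROR),
         errmsg("option \"%s\" specified more than once", defel->defname),
         parser_errposition(pstate, defel->location)));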

> > 2) change the message to something like "option \"%s\" is conflicting
> > with option \"%s\"".
>
> Maybe, but since these would all be special cases, I think we'd need to
> discuss them individually.  I would suggest that in order not to stall
> this patch, these cases should all stay as "redundant or conflicting
> options" -- that is, avoid any further change apart from exactly the
> thing you came here to change.  You can submit a 0002 patch to change
> those other errors.  That way, even if those changes end up rejected for
> whatever reason, you still got your 0001 done (which would change the
> bulk of "conflicting or redundant" error to the "option %s already
> specified" error).  Some progress is better than none.

+1 to have all the conflicting options error message changes as 0002
patch or I'm okay even if we discuss those changes after the 0001
patch goes in.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Enhanced error message to include hint messages for redundant options error

2021-04-26 Thread Bharath Rupireddy
On Mon, Apr 26, 2021 at 9:10 PM Bharath Rupireddy
 wrote:
>
> Another problem I found with collationcmds.c is that it doesn't error
> out if some of the options are specified more than once, something
> like below. I think the option checking "for loop" in DefineCollation
> needs to be reworked.
> CREATE COLLATION case_insensitive (provider = icu, provider =
> someother, locale = '@colStrength=secondary', deterministic = false,
> deterministic = true);

I'm thinking that the above problem should be discussed separately. I
will start a new thread soon on this.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Addition of authenticated ID to pg_stat_activity

2021-04-26 Thread Michael Paquier
On Mon, Apr 26, 2021 at 03:21:46PM -0400, Stephen Frost wrote:
> * Andres Freund (and...@anarazel.de) wrote:
>> I'm getting a bit worried about the incremental increase in
>> pg_stat_activity width - it's probably by far the view that's most
>> viewed interactively. I think we need to be careful not to add too niche
>> things to it. This is especially true for columns that may be wider.
>> 
>> It'd be bad for discoverability, but perhaps something like this, that's
>> not that likely to be used interactively, would be better done as a
>> separate function that would need to be used explicitly?
> 
> I mean.. we already have separate functions and views for this, though
> they're auth-method-specific currently, but also provide more details,
> since it isn't actually a "one size fits all" kind of thing like this
> entire approach is imagining it to be.

Referring to pg_stat_ssl and pg_stat_gssapi here, right?  Yes, that
would be very limited as this leads to no visibility for LDAP, all
password-based authentications and more.

I am wondering if we should take this as an occasion to move some data
out of pg_stat_activity into a separate view, dedicated to the
connection-related data that remains set to the same value for the
duration of a backend's life, such as the following set:
- the backend PID
- client_addr
- client_hostname
- client_port
- authenticated ID
- application_name?  (well, this one could change on reload, so I am
lying).

It would be tempting to move the database name and the username but
these are popular fields with monitoring.  Maybe we could name that
pg_stat_connection_status, pg_stat_auth_status or just
pg_stat_connection?
--
Michael


signature.asc
Description: PGP signature


Re: Transactions involving multiple postgres foreign servers, take 2

2021-04-26 Thread Masahiro Ikeda



On 2021/03/17 12:03, Masahiko Sawada wrote:
> I've attached the updated version patch set.

Thanks for updating the patches! I'm now restarting the review of 2PC because
I'd like to use this feature in PG15.


I think the following logic of resolving and removing the fdwxact entries
by the transaction resolver needs to be fixed.

1. check if pending fdwxact entries exist

HoldInDoubtFdwXacts() checks whether there are entries that meet the
conditions (the backend ID is InvalidBackendId, and so on), and collects the
indexes of those entries in the fdwxacts array. The fdwXactLock is released
at the end of this phase.

2. resolve and remove the entries held in the 1st phase.

ResolveFdwXacts() resolves the status of each fdwxact entry using those
indexes. At the end of resolving, the transaction resolver removes the entry
from the fdwxacts array via remove_fdwact().

The entry is removed as follows. Since the array is accessed by index, the
indexes collected in the 1st phase are no longer meaningful after a removal.

/* Remove the entry from active array */
FdwXactCtl->num_fdwxacts--;
FdwXactCtl->fdwxacts[i] = FdwXactCtl->fdwxacts[FdwXactCtl->num_fdwxacts];

This seems to lead to resolving unexpected fdwxacts, and it can trigger the
following assertion failure (which is how I noticed). For example, a backend
can insert a new fdwxact entry into the slot that the resolver freed just
before, and the resolver then accesses that new entry, which does not need to
be resolved yet, because it is still using the indexes collected in the 1st
phase.

Assert(fdwxact->locking_backend == MyBackendId);



The simple solution is to hold fdwXactLock exclusively all the way from the
beginning of the 1st phase to the end of the 2nd phase. But I worry that the
performance impact would be too big...

I came up with two solutions although there may be better solutions.

A. remove the resolved entries all at once, after resolution of all held
entries is finished

If so, we don't need to take the exclusive lock for a long time. But this has
other problems: pg_remove_foreign_xact() can still remove entries, and we need
to handle failures of resolution.

I wondered whether we could solve the first problem by introducing a new lock,
like a "removing lock", so that only processes holding that lock can remove
entries. The performance impact is limited since insertion of fdwxact entries
is not blocked by this lock. And the second problem can be solved using a
try-catch block.


B. merge the 1st and 2nd phases

Currently, the resolver resolves the held entries together, and that is what
makes removing them difficult. So it seems to solve the problem to perform
checking, resolving and removing per entry. I think this is better since it
is simpler than A. If I'm missing something, please let me know.


Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION




Re: Addition of authenticated ID to pg_stat_activity

2021-04-26 Thread Julien Rouhaud
On Tue, Apr 27, 2021 at 09:59:18AM +0900, Michael Paquier wrote:
> 
> I am wondering if we should take this as an occasion to move some data
> out of pg_stat_activity into a separate biew, dedicated to the data
> related to the connection that remains set to the same value for the
> duration of a backend's life, as of the following set:
> - the backend PID

-1.  It's already annoying enough to have to type "WHERE pid !=
pg_backend_pid()" to exclude my own backend, and I usually need it quite often.
Unless we add some new view which integrate that, something like
pg_stat_activity_except_me with a better name.  I also don't see how we could
join a new dedicated view with the old one without that information.

> - application_name?  (well, this one could change on reload, so I am
> lying).

No, it can change at any time.  And the fact that it's not transactional makes
it quite convenient for poor man's progress reporting.  For instance in powa I
use that to report what the bgworker is currently working on, and this has
already proven to be useful.




Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-26 Thread Masahiko Sawada
On Mon, Apr 26, 2021 at 10:31 PM Tomas Vondra
 wrote:
>
> Hi,
>
> I took a look at this today, as I committed 39b66a91b back in January. I
> can reproduce the issue, with just 1M rows the before/after timings are
> roughly 480ms and 620ms on my hardware.
>
> Unfortunately, the v3 patch does not really fix the issue for me. The
> timing with it applied is ~610ms so the improvement is only minimal.

Since reading the vmbuffer is likely to hit shared buffers while inserting
frozen tuples, I think the improvement would not be visible with a few
million tuples, depending on hardware. But it might not be as fast as before
commit 39b66a91b since we read the vmbuffer at least once per insertion.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Addition of authenticated ID to pg_stat_activity

2021-04-26 Thread Michael Paquier
On Tue, Apr 27, 2021 at 09:26:11AM +0800, Julien Rouhaud wrote:
> -1.  It's already annoying enough to have to type "WHERE pid !=
> pg_backend_pid()" to exclude my own backend, and I usually need it quite 
> often.
> Unless we add some new view which integrate that, something like
> pg_stat_activity_except_me with a better name.  I also don't see how we could
> join a new dedicated view with the old one without that information.

Err, sorry for the confusion.  What I meant here is to also keep the
PID in pg_stat_activity, but also add it to this new view to be able
to join things across the board.
--
Michael


signature.asc
Description: PGP signature


RE: Parallel INSERT SELECT take 2

2021-04-26 Thread houzj.f...@fujitsu.com
> > Currently, index expression and predicate are stored in text format.
> > We need to use stringToNode(expression/predicate) to parse it.
> > Some committers think doing this twice does not look good, unless we
> > found some ways to pass parsed info to the executor to avoid the second
> parse.
> 
> How much is the extra cost that's added if we do stringToNode twice?
> Maybe, we can check for a few sample test cases by just having
> stringToNode(expression/predicate) in the planner and see if it adds much to
> the total execution time of the query.


OK, I will do some tests on it.

> > Yes, because both parent table and child table will be inserted and
> > the parallel related objects on them will be executed. If users want
> > to make sure the parallel insert succeed, they need to check all the 
> > objects.
> 
> Then, running the pg_get_parallel_safety will have some overhead if there are
> many partitions associated with a table. And, this is the overhead planner
> would have had to incur without the declarative approach which we are trying
> to avoid with this design.

Yes, I think putting such overhead in a separate function is better than in a
common path (the planner).

> > Foreign table itself is considered as parallel restricted, because we
> > do not support parallel insert fdw api for now.
> 
> Maybe, the ALTER TABLE ... SET PARALLEL on a foreign table should default to
> parallel restricted always and emit a warning saying the reason?

Thanks for the comment, I agree.
I will change this in the next version of the patches.

> > But the ALTER PARALLEL command itself does not check all the partition's
> safety flag.
> > The function pg_get_parallel_safety can return all the partition's not
> > safe objects, user should set the parallel safety based the function's 
> > result.
> 
> I'm thinking that when users say ALTER TABLE partioned_table SET PARALLEL
> TO 'safe';, we check all the partitions' and their associated objects' 
> parallel
> safety? If all are parallel safe, then only we set partitioned_table as 
> parallel safe.
> What should happen if the parallel safety of any of the associated
> objects/partitions changes after setting the partitioned_table safety?

Currently, nothing happens if any of the associated objects/partitions changes
after setting the partitioned_table safety.
That is because we do not have a really cheap way to catch the change. The
existing relcache does not work because altering a function does not
invalidate the relcache of the tables the function belongs to. And it would
bring some other overhead (locking, systable scans, ...) to find the tables
the objects belong to.

> My understanding was that: the command ALTER TABLE ... SET PARALLEL TO
> 'safe' work will check the parallel safety of all the objects associated with 
> the
> table. If the objects are all parallel safe, then the table will be set to 
> safe. If at
> least one object is parallel unsafe or restricted, then the command will fail.

I think this idea makes sense. Some details of the design can be improved.
I agree with you that we can try to check all the partitions' and their
associated objects' parallel safety when ALTER PARALLEL is run.
Because it's a separate new command, adding some overhead to it seems not too bad.
If there are no other objections, I plan to add a safety check to the ALTER
PARALLEL command.

> also thinking that how will the design cope with situations such as the 
> parallel
> safety of any of the associated objects changing after setting the table to
> parallel safe. The planner would have relied on the outdated parallel safety 
> of
> the table and chosen parallel inserts and the executor will catch such 
> situations.
> Looks like my understanding was wrong.

Currently, we assume the user is responsible for its correctness.
That is because, from our research, when the parallel safety of some of these
objects is changed, it's costly to reflect it on the parallel safety of the
tables that depend on them.
(We would need to scan pg_depend, pg_inherits, and pg_index to find the target
table.)
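
For example, just resolving which tables use a given function in a CHECK
constraint would need a catalog scan along these lines (a simplified sketch,
with my_func as a placeholder; the real dependency walk is more involved):

SELECT c.conrelid::regclass AS tablename, c.conname
FROM pg_depend d
     JOIN pg_constraint c ON c.oid = d.objid
WHERE d.classid = 'pg_constraint'::regclass
  AND d.refclassid = 'pg_proc'::regclass
  AND d.refobjid = 'my_func'::regproc;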

> So, the ALTER TABLE ... SET PARALLEL TO command just sets the target table
> safety, doesn't bother what the associated objects' safety is.
> It just believes the user. If at all there are any parallel unsafe objects it 
> will be
> caught by the executor. Just like, setting parallel safety of the
> functions/aggregates, the docs caution users about accidentally/intentionally
> tagging parallel unsafe functions/aggregates as parallel safe.

Yes, thanks for looking into this.

Best regards,
houzj


Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Greg Nancarrow
On Tue, Apr 27, 2021 at 10:51 AM Bharath Rupireddy
 wrote:
>
>
> I still feel that why we shouldn't limit the declarative approach to
> only partitioned tables? And for normal tables, possibly with a
> minimal cost(??), the server can do the safety checking. I know this
> feels a little inconsistent. In the planner we will have different
> paths like: if (partitioned_table) { check the parallel safety tag
> associated with the table } else { perform the parallel safety of the
> associated objects }.
>

Personally I think the simplest and best approach is just to do it
consistently, using the declarative approach across all table types.

>
> Then, running the pg_get_parallel_safety will have some overhead if
> there are many partitions associated with a table. And, this is the
> overhead planner would have had to incur without the declarative
> approach which we are trying to avoid with this design.
>

The big difference is that pg_get_parallel_safety() is intended to be
used during development, not as part of runtime parallel-safety checks
(which are avoided using the declarative approach).
So there is no runtime overhead associated with pg_get_parallel_safety().
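
To illustrate the intended development-time workflow under the proposed
design (the function and the ALTER syntax are from this proposal; the
argument and the exact shape of the function's output are assumptions here):

-- list any parallel-unsafe or parallel-restricted objects associated with
-- the table, so the user can decide whether to declare it safe
SELECT * FROM pg_get_parallel_safety('target_table'::regclass);

-- if nothing problematic is reported, the user declares the table safe
ALTER TABLE target_table SET PARALLEL TO 'safe';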

>
> I'm thinking that when users say ALTER TABLE partioned_table SET
> PARALLEL TO 'safe';, we check all the partitions' and their associated
> objects' parallel safety? If all are parallel safe, then only we set
> partitioned_table as parallel safe. What should happen if the parallel
> safety of any of the associated objects/partitions changes after
> setting the partitioned_table safety?
>

With the declarative approach, there is no parallel-safety checking on
either the CREATE/ALTER when the parallel-safety is declared/set.
It's up to the user to get it right. If it's actually wrong, it will
be detected at runtime.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Replication slot stats misgivings

2021-04-26 Thread vignesh C
On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila  wrote:
>
> On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada  wrote:
> >
> > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila  wrote:
> > >
> > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C  wrote:
> > > >
> > > > I have made the changes to update the replication statistics at
> > > > replication slot release. Please find the patch attached for the same.
> > > > Thoughts?
> > > >
> > >
> > > Thanks, the changes look mostly good to me. The slot stats need to be
> > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > > StartupDecodingContext. Apart from that, I have moved the declaration
> > > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > > added/edited a few comments. Please check and let me know what do you
> > > think of the attached?
> >
> > The patch moves slot stats to the ReplicationSlot data that is on the
> > shared memory. If we have a space to store the statistics in the
> > shared memory can we simply accumulate the stats there and make them
> > persistent without using the stats collector?
> >
>
> But for that, we need to write to file at every commit/abort/prepare
> (decode of commit) which I think will incur significant overhead.
> Also, we try to write after few commits then there is a danger of
> losing them and still there could be a visible overhead for small
> transactions.
>

I would prefer not to persist this information to a file; let's have the stats
collector handle persisting the stats.

> > And I think there is
> > also a risk to increase shared memory when we want to add other
> > statistics in the future.
> >
>
> Yeah, so do you think it is not a good idea to store stats in
> ReplicationSlot? Actually storing them in a slot makes it easier to
> send them during ReplicationSlotRelease which is quite helpful if the
> replication is interrupted due to some reason. Or the other idea was
> that we send stats every time we stream or spill changes.

We use around 64 bytes of shared memory to store the statistics per slot;
I'm not sure if that is a lot of memory. If that memory usage is fine, then
the approach of storing the stats in the slot seems fine to me. If it is too
much, then we could use the other approach of updating the stats when we
stream or spill the changes, as suggested by Amit.
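
For reference, the per-slot counters being discussed correspond to what
pg_stat_replication_slots already exposes: eight 64-bit counters, which is
roughly where the ~64 bytes per slot comes from.

SELECT slot_name,
       spill_txns, spill_count, spill_bytes,
       stream_txns, stream_count, stream_bytes,
       total_txns, total_bytes
FROM pg_stat_replication_slots;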

Regards,
Vignesh




Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Amit Kapila
On Tue, Apr 27, 2021 at 7:45 AM Greg Nancarrow  wrote:
>
> On Tue, Apr 27, 2021 at 10:51 AM Bharath Rupireddy
>  wrote:
> >
> >
> > I still feel that why we shouldn't limit the declarative approach to
> > only partitioned tables? And for normal tables, possibly with a
> > minimal cost(??), the server can do the safety checking. I know this
> > feels a little inconsistent. In the planner we will have different
> > paths like: if (partitioned_table) { check the parallel safety tag
> > associated with the table } else { perform the parallel safety of the
> > associated objects }.
> >
>
> Personally I think the simplest and best approach is just do it
> consistently, using the declarative approach across all table types.
>

Yeah, if we decide to go with a declarative approach then that sounds
like a better approach.

> >
> > Then, running the pg_get_parallel_safety will have some overhead if
> > there are many partitions associated with a table. And, this is the
> > overhead planner would have had to incur without the declarative
> > approach which we are trying to avoid with this design.
> >
>
> The big difference is that pg_get_parallel_safety() is intended to be
> used during development, not as part of runtime parallel-safety checks
> (which are avoided using the declarative approach).
> So there is no runtime overhead associated with pg_get_parallel_safety().
>
> >
> > I'm thinking that when users say ALTER TABLE partioned_table SET
> > PARALLEL TO 'safe';, we check all the partitions' and their associated
> > objects' parallel safety? If all are parallel safe, then only we set
> > partitioned_table as parallel safe. What should happen if the parallel
> > safety of any of the associated objects/partitions changes after
> > setting the partitioned_table safety?
> >
>
> With the declarative approach, there is no parallel-safety checking on
> either the CREATE/ALTER when the parallel-safety is declared/set.
> It's up to the user to get it right. If it's actually wrong, it will
> be detected at runtime.
>

OTOH, even if we want to verify at DDL time, we won't be able to
maintain it at a later point in time, say if the user changes the
parallel-safety of some function used in a check constraint. I think the
function pg_get_parallel_safety() will help the user decide whether the
table can be declared parallel-safe. Now, it is quite possible that the
user later changes the parallel-safe property of some function; that
should then be caught at runtime.

-- 
With Regards,
Amit Kapila.




Re: SQL-standard function body

2021-04-26 Thread Justin Pryzby
On Thu, Apr 22, 2021 at 04:04:18PM -0400, Jeff Janes wrote:
> This commit break line continuation prompts for unbalanced parentheses in
> the psql binary.  Skimming through this thread, I don't see that this is
> intentional or has been noticed before.
> 
> with psql -X
> 
> Before:
> 
> jjanes=# asdf (
> jjanes(#
> 
> Now:
> 
> jjanes=# asdf (
> jjanes-#
> 
> I've looked through the parts of the commit that change psql, but didn't
> see an obvious culprit.

I haven't studied it in detail, but it probably needs something like this.

diff --git a/src/fe_utils/psqlscan.l b/src/fe_utils/psqlscan.l
index 991b7de0b5..0fab48a382 100644
--- a/src/fe_utils/psqlscan.l
+++ b/src/fe_utils/psqlscan.l
@@ -1098,23 +1098,23 @@ psql_scan(PsqlScanState state,
         {
             case LEXRES_EOL:    /* end of input */
                 switch (state->start_state)
                 {
                     case INITIAL:
                     case xqs:   /* we treat this like INITIAL */
                         if (state->paren_depth > 0)
                         {
                             result = PSCAN_INCOMPLETE;
                             *prompt = PROMPT_PAREN;
                         }
-                        if (state->begin_depth > 0)
+                        else if (state->begin_depth > 0)
                         {
                             result = PSCAN_INCOMPLETE;
                             *prompt = PROMPT_CONTINUE;
                         }
                         else if (query_buf->len > 0)
                         {
                             result = PSCAN_EOL;
                             *prompt = PROMPT_CONTINUE;
                         }
                         else
                         {




Re: Replication slot stats misgivings

2021-04-26 Thread Amit Kapila
On Tue, Apr 27, 2021 at 8:01 AM vignesh C  wrote:
>
> On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila  wrote:
> >
> > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila  
> > > wrote:
> > > >
> > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C  wrote:
> > > > >
> > > > > I have made the changes to update the replication statistics at
> > > > > replication slot release. Please find the patch attached for the same.
> > > > > Thoughts?
> > > > >
> > > >
> > > > Thanks, the changes look mostly good to me. The slot stats need to be
> > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > > > StartupDecodingContext. Apart from that, I have moved the declaration
> > > > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > > > added/edited a few comments. Please check and let me know what do you
> > > > think of the attached?
> > >
> > > The patch moves slot stats to the ReplicationSlot data that is on the
> > > shared memory. If we have a space to store the statistics in the
> > > shared memory can we simply accumulate the stats there and make them
> > > persistent without using the stats collector?
> > >
> >
> > But for that, we need to write to file at every commit/abort/prepare
> > (decode of commit) which I think will incur significant overhead.
> > Also, we try to write after few commits then there is a danger of
> > losing them and still there could be a visible overhead for small
> > transactions.
> >
>
> I preferred not to persist this information to file, let's have stats
> collector handle the stats persisting.
>

Sawada-San, I would like to go ahead with your
"Use-HTAB-for-replication-slot-statistics" patch unless you think otherwise?

-- 
With Regards,
Amit Kapila.




Re: Replication slot stats misgivings

2021-04-26 Thread Masahiko Sawada
On Tue, Apr 27, 2021 at 11:45 AM Amit Kapila  wrote:
>
> On Tue, Apr 27, 2021 at 8:01 AM vignesh C  wrote:
> >
> > On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila  wrote:
> > >
> > > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada  
> > > wrote:
> > > >
> > > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila  
> > > > wrote:
> > > > >
> > > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C  wrote:
> > > > > >
> > > > > > I have made the changes to update the replication statistics at
> > > > > > replication slot release. Please find the patch attached for the 
> > > > > > same.
> > > > > > Thoughts?
> > > > > >
> > > > >
> > > > > Thanks, the changes look mostly good to me. The slot stats need to be
> > > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in
> > > > > StartupDecodingContext. Apart from that, I have moved the declaration
> > > > > of UpdateDecodingStats from slot.h back to logical.h. I have also
> > > > > added/edited a few comments. Please check and let me know what do you
> > > > > think of the attached?
> > > >
> > > > The patch moves slot stats to the ReplicationSlot data that is on the
> > > > shared memory. If we have a space to store the statistics in the
> > > > shared memory can we simply accumulate the stats there and make them
> > > > persistent without using the stats collector?
> > > >
> > >
> > > But for that, we need to write to file at every commit/abort/prepare
> > > (decode of commit) which I think will incur significant overhead.
> > > Also, we try to write after few commits then there is a danger of
> > > losing them and still there could be a visible overhead for small
> > > transactions.
> > >
> >
> > I preferred not to persist this information to file, let's have stats
> > collector handle the stats persisting.
> >
>
> Sawada-San, I would like to go ahead with your
> "Use-HTAB-for-replication-slot-statistics" unless you think otherwise?

I agree that it's better to use the stats collector. So please go ahead.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Bharath Rupireddy
On Tue, Apr 27, 2021 at 7:39 AM houzj.f...@fujitsu.com
 wrote:
> > I'm thinking that when users say ALTER TABLE partioned_table SET PARALLEL
> > TO 'safe';, we check all the partitions' and their associated objects' 
> > parallel
> > safety? If all are parallel safe, then only we set partitioned_table as 
> > parallel safe.
> > What should happen if the parallel safety of any of the associated
> > objects/partitions changes after setting the partitioned_table safety?
>
> Currently, nothing happened if any of the associated objects/partitions 
> changes after setting the partitioned_table safety.
> Because , we do not have a really cheap way to catch the change. The existing 
> relcache does not work because alter function
> does not invalid the relcache which the function belongs to. And it will 
> bring some other overhead(locking, systable scan,...)
> to find the table the objects belong to.

Makes sense. Anyway, the user is responsible for such changes; otherwise the
executor can catch them at run time, and if it doesn't, the user will see
unintended consequences.

> > My understanding was that: the command ALTER TABLE ... SET PARALLEL TO
> > 'safe' work will check the parallel safety of all the objects associated 
> > with the
> > table. If the objects are all parallel safe, then the table will be set to 
> > safe. If at
> > least one object is parallel unsafe or restricted, then the command will 
> > fail.
>
> I think this idea makes sense. Some detail of the designed can be improved.
> I agree with you that we can try to check check all the partitions' and their 
> associated objects' parallel safety when ALTER PARALLEL.
> Because it's a separate new command, add some overhead to it seems not too 
> bad.
> If there are no other objections, I plan to add safety check in the ALTER 
> PARALLEL command.

Maybe we can make the parallel safety check of the associated
objects/partitions optional for CREATE/ALTER DDLs, with the default
being no checks performed. Both Greg and Amit agree that we don't have
to perform any parallel safety checks during CREATE/ALTER DDLs.

> > also thinking that how will the design cope with situations such as the 
> > parallel
> > safety of any of the associated objects changing after setting the table to
> > parallel safe. The planner would have relied on the outdated parallel 
> > safety of
> > the table and chosen parallel inserts and the executor will catch such 
> > situations.
> > Looks like my understanding was wrong.
>
> Currently, we assume user is responsible for its correctness.
> Because, from our research, when the parallel safety of some of these objects 
> is changed,
> it's costly to reflect it on the parallel safety of tables that depend on 
> them.
> (we need to scan the pg_depend,pg_inherit,pg_index to find the target 
> table)

Agree.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Parallel INSERT SELECT take 2

2021-04-26 Thread Bharath Rupireddy
On Tue, Apr 27, 2021 at 7:45 AM Greg Nancarrow  wrote:
>
> On Tue, Apr 27, 2021 at 10:51 AM Bharath Rupireddy
>  wrote:
> >
> >
> > I still feel that why we shouldn't limit the declarative approach to
> > only partitioned tables? And for normal tables, possibly with a
> > minimal cost(??), the server can do the safety checking. I know this
> > feels a little inconsistent. In the planner we will have different
> > paths like: if (partitioned_table) { check the parallel safety tag
> > associated with the table } else { perform the parallel safety of the
> > associated objects }.
> >
>
> Personally I think the simplest and best approach is just do it
> consistently, using the declarative approach across all table types.

Agree.

> > Then, running the pg_get_parallel_safety will have some overhead if
> > there are many partitions associated with a table. And, this is the
> > overhead planner would have had to incur without the declarative
> > approach which we are trying to avoid with this design.
> >
>
> The big difference is that pg_get_parallel_safety() is intended to be
> used during development, not as part of runtime parallel-safety checks
> (which are avoided using the declarative approach).
> So there is no runtime overhead associated with pg_get_parallel_safety().

Yes, while we avoid runtime overhead, we run the risk that the parallel safety
of any of the underlying objects/functions/partitions has changed in the
meantime. This risk will anyway be unavoidable with the declarative approach.

> > I'm thinking that when users say ALTER TABLE partioned_table SET
> > PARALLEL TO 'safe';, we check all the partitions' and their associated
> > objects' parallel safety? If all are parallel safe, then only we set
> > partitioned_table as parallel safe. What should happen if the parallel
> > safety of any of the associated objects/partitions changes after
> > setting the partitioned_table safety?
> >
>
> With the declarative approach, there is no parallel-safety checking on
> either the CREATE/ALTER when the parallel-safety is declared/set.
> It's up to the user to get it right. If it's actually wrong, it will
> be detected at runtime.

As I said upthread, we can provide the parallel safety check of
associated objects/partitions as an option, with the default being false. I'm
not sure if this is a good thing to do at all. Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] logical decoding of two-phase transactions

2021-04-26 Thread vignesh C
On Wed, Apr 21, 2021 at 12:13 PM Peter Smith  wrote:
>
> On Tue, Apr 20, 2021 at 3:45 PM Peter Smith  wrote:
> >
> > Please find attached the latest patch set v73`*
> >
> > Differences from v72* are:
> >
> > * Rebased to HEAD @ today (required because v72-0001 no longer applied 
> > cleanly)
> >
> > * Minor documentation correction for protocol messages for Commit Prepared 
> > ('K')
> >
> > * Non-functional code tidy (mostly proto.c) to reduce overloading
> > different meanings to same member names for prepare/commit times.
>
>
> Please find attached a re-posting of patch set v73*

A few comments from when I was having a look at the added tests:
1) Can the below:
+# check inserts are visible. 22 should be rolled back. 21 should be committed.
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(*)
FROM tab_full where a IN (21);");
+is($result, qq(1), 'Rows committed are on the subscriber');
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(*)
FROM tab_full where a IN (22);");
+is($result, qq(0), 'Rows rolled back are not on the subscriber');

be changed to:
$result = $node_subscriber->safe_psql('postgres', "SELECT a FROM
tab_full where a IN (21,22);");
is($result, qq(21), 'Rows committed are on the subscriber');

And the test count needs to be reduced to "use Test::More tests => 19"

2) We can change "tx" to "transaction":
+# check the tx state is prepared on subscriber(s)
+$result = $node_B->safe_psql('postgres', "SELECT count(*) FROM
pg_prepared_xacts;");
+is($result, qq(1), 'transaction is prepared on subscriber B');
+$result = $node_C->safe_psql('postgres', "SELECT count(*) FROM
pg_prepared_xacts;");
+is($result, qq(1), 'transaction is prepared on subscriber C');

3) There are a few more instances present in the same file; those can also
be changed.

4) Can the below:
# check inserts are visible at subscriber(s).
# 22 should be rolled back.
# 21 should be committed.
$result = $node_B->safe_psql('postgres', "SELECT count(*) FROM
tab_full where a IN (21);");
is($result, qq(1), 'Rows committed are present on subscriber B');
$result = $node_B->safe_psql('postgres', "SELECT count(*) FROM
tab_full where a IN (22);");
is($result, qq(0), 'Rows rolled back are not present on subscriber B');
$result = $node_C->safe_psql('postgres', "SELECT count(*) FROM
tab_full where a IN (21);");
is($result, qq(1), 'Rows committed are present on subscriber C');
$result = $node_C->safe_psql('postgres', "SELECT count(*) FROM
tab_full where a IN (22);");
is($result, qq(0), 'Rows rolled back are not present on subscriber C');

be changed to:
$result = $node_B->safe_psql('postgres', "SELECT a FROM tab_full where
a IN (21,22);");
is($result, qq(21), 'Rows committed are on the subscriber');
$result = $node_C->safe_psql('postgres', "SELECT a FROM tab_full where
a IN (21,22);");
is($result, qq(21), 'Rows committed are on the subscriber');

And the test count needs to be reduced to "use Test::More tests => 27"

5) Should we change "Two phase commit" to "Two phase commit state"?
+       /*
+        * Binary, streaming, and two_phase are only supported in v14 and
+        * higher
+        */
        if (pset.sversion >= 14)
                appendPQExpBuffer(&buf,
                                  ", subbinary AS \"%s\"\n"
-                                 ", substream AS \"%s\"\n",
+                                 ", substream AS \"%s\"\n"
+                                 ", subtwophasestate AS \"%s\"\n",
                                  gettext_noop("Binary"),
-                                 gettext_noop("Streaming"));
+                                 gettext_noop("Streaming"),
+                                 gettext_noop("Two phase commit"));

Regards,
Vignesh



