date:20220713

Re: [PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

2022-07-13 Thread Antonin Houska

Shouldn't the patch status be set to "Waiting on Author"?

(I was curious if this is a patch that I can review.)

Julien Rouhaud  wrote:

> On Wed, Jun 22, 2022 at 11:05:54PM +, Imseih (AWS), Sami wrote:
> > >Can you describe how it's kept in sync, and how it makes sure that the 
> > > property
> > >is maintained over restart / gc?  I don't see any change in the code 
> > > for the
> > >2nd part so I don't see how it could work (and after a quick test it 
> > > indeed
> > >doesn't).
> >
> > There is a routine called qtext_hash_sync which removed all entries from
> > the qtext_hash and reloads it will all the query ids from pgss_hash.
> > [...]
> > All the points when it's called, an exclusive lock is held.this allows or 
> > syncing all
> > The present queryid's in pgss_hash with qtext_hash.
> 
> So your approach is to let the current gc / file loading behavior happen as
> before and construct your mapping hash using the resulting query text / offset
> info.  That can't work.
> 
> > >2nd part so I don't see how it could work (and after a quick test it 
> > > indeed
> > >doesn't).
> >
> > Can you tell me what test you used to determine it is not in sync?
> 
> What test did you use to determine it is in sync?  Have you checked how the 
> gc/
> file loading actually work?
> 
> In my case I just checked the size of the query text file after running some
> script that makes sure that there are the same few queries ran by multiple
> different roles, then:
> 
> Size of $PGDATA/pg_stat_tmp/pgss_query_texts.stat: 559B
> pg_ctl restart
> Size of $PGDATA/pg_stat_tmp/pgss_query_texts.stat: 2383B
> 
> > >Can you share more details on the benchmarks you did?  Did you also run
> > >benchmark on workloads that induce entry eviction, with and without 
> > > need for
> > >gc?  Eviction in pgss is already very expensive, and making things 
> > > worse just
> > >to save a bit of disk space doesn't seem like a good compromise.
> >
> > Sorry this was poorly explained by me. I went back and did some benchmarks. 
> > Attached is
> > The script and results. But here is a summary:
> > On a EC2 r5.2xlarge. The benchmark I performed is:
> > 1. create 10k tables
> > 2. create 5 users
> > 3. run a pgbench script that performs per transaction a select on
> > A randomly chosen table for each of the 5 users.
> > 4. 2 variants of the test executed . 1 variant is with the default 
> > pg_stat_statements.max = 5000
> > and one test with a larger pg_stat_statements.max = 1.
> 
> But you wrote:
> 
> > ##
> > ## pg_stat_statements.max = 15000
> > ##
> 
> So which one is it?
> 
> >
> > So 10-15% is not accurate. I originally tested on a less powered machine. 
> > For this
> > Benchmark I see a 6% increase in TPS (732k vs  683k) when we have a larger 
> > sized
> > pg_stat_statements.max is used and less gc/deallocations.
> > Both tests show a drop in gc/deallocations and a net increase
> > In tps. Less GC makes sense since the external file has less duplicate SQLs.
> 
> On the other hand you're rebuilding the new query_offset hashtable every time
> there's an entry eviction, which seems quite expensive.
> 
> Also, as I mentioned you didn't change any of the heuristic for
> need_gc_qtexts().  So if the query texts are indeed deduplicated, doesn't it
> mean that  gc  will artificially
> be called less often?  The wanted target of "50% bloat" will become "50%
> bloat assuming no deduplication is done" and the average query text file size
> will stay the same whether the query texts are deduplicated or not.
> 
> I'm wondering the improvements you see due to the patch or simply due to
> artificially calling gc less often?  What are the results if instead of using
> vanilla pg_stat_statements you patch it to perform roughly the same number of
> gc as your version does?
> 
> Also your benchmark workload is very friendly with your feature, what are the
> results with other workloads?  Having the results with query texts that aren't
> artificially long would be interesting for instance, after fixing the problems
> mentioned previously.
> 
> Also, you said that if you run that benchmark with a single user you don't see
> any regression.   I don't see how rebuilding an extra hashtable in
> entry_dealloc(), so when holding an exclusive lwlock, while not saving any
> other work elsewhere could be free?
> 
> Looking at the script, you have:
> echo "log_min_messages=debug1" >> $PGDATA/postgresql.conf; \
> 
> Is that really necessary?  Couldn't you upgrade the gc message to a higher
> level for your benchmark need, or expose some new counter in
> pg_stat_statements_info maybe?  Have you done the benchmark using a debug 
> build
> or normal build?
> 
> 

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com

Re: i.e. and e.g.

2022-07-13 Thread Kyotaro Horiguchi

At Thu, 14 Jul 2022 09:40:25 +0700, John Naylor  
wrote in 
> On Wed, Jul 13, 2022 at 4:13 PM Kyotaro Horiguchi 
> wrote:
> >
> > At Wed, 13 Jul 2022 18:09:43 +0900 (JST), Kyotaro Horiguchi <
> horikyota@gmail.com> wrote in
> > > So, "e.g." (for example) in the message sounds like "that is", which I
> > > think is "i.e.".  It should be fixed if this is correct.  I'm not sure
> > > whether to keep using Latin-origin acronyms like this, but in the
> > > attached I used "i.e.".
> 
> I did my own quick scan and found one use of i.e. that doesn't really fit,
> in a sentence that has other grammatical issues:
> 
> -Due to the differences how ECPG works compared to Informix's
> ESQL/C (i.e., which steps
> +Due to differences in how ECPG works compared to Informix's ESQL/C
> (namely, which steps
>  are purely grammar transformations and which steps rely on the

Oh!

> I've pushed that in addition to your changes, thanks!

Thanks!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: First draft of the PG 15 release notes

2022-07-13 Thread John Naylor

Regarding this item:

"Allow hash lookup for NOT IN clauses with many constants (David Rowley,
James Coleman)
Previously the code always sequentially scanned the list of values."

The todo list has an entry titled "Planning large IN lists", which links to

https://www.postgresql.org/message-id/1178821226.6034.63.camel@goldbach

Did we already have a hash lookup for IN clauses with constants and the
above commit adds NOT IN? If so, maybe we have enough to remove this todo
item.

-- 
John Naylor
EDB: http://www.enterprisedb.com

Re: Handle infinite recursion in logical replication setup

2022-07-13 Thread Dilip Kumar

On Wed, Jul 13, 2022 at 4:49 PM Dilip Kumar  wrote:
>
> On Tue, Jul 12, 2022 at 2:58 PM vignesh C  wrote:
> >
> > On Tue, Jul 12, 2022 at 9:51 AM Amit Kapila  wrote:
>
> I find one thing  confusing about this patch.  Basically, this has two
> option 'local' and 'any', so I would assume that all the local server
> changes should be covered under the 'local' but now if we set some
> origin using 'select pg_replication_origin_session_setup('aa');' then
> changes from that session will be ignored because it has an origin id.
> I think actually the name is creating confusion, because by local it
> seems like a change which originated locally and the document is also
> specifying the same.
>
> +   If local, the subscription will request the 
> publisher
> +   to only send changes that originated locally. If 
> any,
>
> I think if we want to keep the option local then we should look up all
> the origin in the replication origin catalog and identify whether it
> is a local origin id or remote origin id and based on that filter out
> the changes.

On the other hand if we are interested in receiving the changes which
are generated without any origin then I think we should change 'local'
to 'none' and then in future we can provide a new option which can
send the changes generated by all the local origin?  I think other
than this the patch LGTM.


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [PATCH] Completed unaccent dictionary with many missing characters

2022-07-13 Thread Michael Paquier

On Tue, Jul 05, 2022 at 09:24:49PM +0200, Przemysław Sztoch wrote:
> I do not add more, because they probably concern older languages.
> An alternative might be to rely entirely on Unicode decomposition ...
> However, after the change, only one additional Ukrainian letter with an
> accent was added to the rule file.

Hmm.  I was wondering about the decomposition part, actually.  How
much would it make things simpler if we treat the full range of the
cyrillic characters, aka from U+0400 to U+4FF, scanning all of them
and building rules only if there are decompositions?  Is it worth
considering the Cyrillic supplement, as of U+0500-U+052F?

I was also thinking about the regression tests, and as unaccent
characters are more spread than for Latin and Greek, it could be a
good thing to have a complete coverage.  We could for example use a
query like that to check if a character is treated properly or not:
SELECT chr(i.a) = unaccent(chr(i.a))
  FROM generate_series(1024, 1327) AS i(a); -- range of Cyrillic.
--
Michael

signature.asc
Description: PGP signature

Re: proposal: possibility to read dumped table's name from file

2022-07-13 Thread Pavel Stehule

st 13. 7. 2022 v 22:49 odesílatel Andrew Dunstan 
napsal:

>
> On 2022-04-25 Mo 13:39, Pavel Stehule wrote:
> > Hi
> >
> > fresh rebase
> >
> >
>
>
> If we're going to do this for pg_dump's include/exclude options,
> shouldn't we also provide an equivalent facility for pg_dumpall's
> --exclude-database option?
>
>
It has sense

Regards

Pavel




>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>

Re: pg_parameter_aclcheck() and trusted extensions

2022-07-13 Thread Tom Lane

John Naylor  writes:
> The RMT has discussed this item further, and we agree an ABI break is
> acceptable for resolving this issue.

Cool, I'll produce a patch soon.

regards, tom lane

Re: pg_parameter_aclcheck() and trusted extensions

2022-07-13 Thread John Naylor

On Fri, Jul 8, 2022 at 1:09 PM Michael Paquier  wrote:
>
> On Thu, Jul 07, 2022 at 03:43:03PM -0400, Joe Conway wrote:
> > On 7/7/22 15:00, Tom Lane wrote:
> >> The aspect that is a bit more debatable is whether to trouble with
> >> a set_config_option() wrapper to avoid the API break in v15.
> >> I think we'd still be making people deal with an API break in v16,
> >> so making them do it this year rather than next doesn't seem like
> >> a big deal ... but maybe someone wants to argue it's too late
> >> for API breaks in v15?
> >
> > Well there are other API breaks that affect me in v15, and to be honest
I
> > have done little except keep an eye out for the ones likely to affect
> > extensions I maintain so far, so may as well inflict the pain now as
later
> > ¯\_(ツ)_/¯
>
> With my RMT and hacker hat on, I see no reason to not break ABI or
> APIs while we are still in beta, as long as the GA result is as best
> as we can make it.  I have not looked at the reasoning behind the
> issue, but if you think that this feature will work better in the long
> term by having an extra field to track the role OID in one of the GUC
> structs or in one of its API arguments, that's fine by me.
>
> If this requires more work, a revert can of course be discussed, but I
> am not getting that this is really necessary here.  This would be the
> last option to consider.

The RMT has discussed this item further, and we agree an ABI break is
acceptable for resolving this issue.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: Data is copied twice when specifying both child and parent table in publication

2022-07-13 Thread Peter Smith

Here are some review comments for the v6 patch (HEAD only):


HEAD_v6-0001


1. Commit message

If there are two publications that publish the parent table and the child table
separately, and both specify the option PUBLISH_VIA_PARTITION_ROOT, subscribing
to both publications from one subscription causes initial copy twice. What we
expect is to be copied only once.

~

I don’t think the parameter even works in uppercase, so maybe better to say:
PUBLISH_VIA_PARTITION_ROOT -> 'publish_via_partition_root'

~~~

2.

What we expect is to be copied only once.

SUGGESTION
It should only be copied once.

~~~

3.

To fix this, we extend the API of the function pg_get_publication_tables.
Now, the function pg_get_publication_tables could receive the publication list.
And then, if we specify option viaroot, we could exclude the partitioned table
whose ancestor belongs to the publication list when getting the table
informations.

~

Don't you mean "partition table" instead of "partitioned table"?

SUGGESTION (also reworded)
To fix this, the API function pg_get_publication_tables has been
extended to take a publication list. Now, when getting the table
information, if the publish_via_partition_root is true, the function
can exclude a partition table whose ancestor is also published by the
same publication list.

==

4. src/backend/catalog/pg_publication.c - pg_get_publication_tables

- publication = GetPublicationByName(pubname, false);
+ arr = PG_GETARG_ARRAYTYPE_P(0);
+ deconstruct_array(arr, TEXTOID, -1, false, TYPALIGN_INT,
+   &elems, NULL, &nelems);

Maybe should have some comment to describe that this function
parameter is now an array of publications names.

~~~

5.

+ /* get Oids of tables from each publication */

Uppercase comment

~~~

6.

+ ArrayType  *arr;
+ Datum*elems;
+ int nelems,
+ i;
+ Publication *publication;
+ bool viaroot = false;
+ List*pub_infos = NIL;
+ ListCell   *lc1,
+*lc2;

The 'publication' should be declared only in the loop that uses it.
It's also not good that this is shadowing the same variable name in a
later declaration.

~~~

7.

+ * Publications support partitioned tables, although all changes
+ * are replicated using leaf partition identity and schema, so we
+ * only need those.
  */
+ if (publication->alltables)
+ current_tables = GetAllTablesPublicationRelations(publication->pubviaroot);
+ else
+ {
+ List*relids,
+*schemarelids;
+
+ relids = GetPublicationRelations(publication->oid,
+ publication->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ schemarelids = GetAllSchemaPublicationRelations(publication->oid,
+ publication->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ current_tables = list_concat(relids, schemarelids);
+ }

Somehow I was confused by this comment because it says you only need
the LEAF tables but then the subsequent code is getting ROOT relations
anyway... Can you clarify the comment some more?

~~~

8.

+ bool viaroot = false;

I think that should have a comment something like:
/* At least one publication is using publish_via_partition_root */

~~~

9.

+ /*
+ * Record the published table and the corresponding publication so
+ * that we can get row filters and column list later.
+ */
+ foreach(lc1, tables)
+ {
+ Oid relid = lfirst_oid(lc1);
+
+ foreach(lc2, pub_infos)
+ {
+ pub_info   *pubinfo = (pub_info *) lfirst(lc2);
+
+ if (list_member_oid(pubinfo->table_list, relid))
+ {
+ Oid*result = (Oid *) malloc(sizeof(Oid) * 2);
+
+ result[0] = relid;
+ result[1] = pubinfo->pubid;
+
+ results = lappend(results, result);
+ }
+ }
  }

I felt a bit uneasy about the double-looping here. I wonder if these
'results' could have been accumulated within the existing loop over
all publications. Then the results would need to be filtered to remove
the ones associated with removed partitions. Otherwise with 1
tables and also many publications this (current) double-looping seems
like it might be quite expensive.

==

10. src/backend/commands/subscriptioncmds.c - fetch_table_list

+ if (check_columnlist && server_version >= 16)

This condition does not make much sense to me. Isn’t it effectively
same as saying
if (server_version >= 15 && server_version >= 16)

???

~~~

11.

+ /*
+ * Get the list of tables from publisher, the partitioned table whose
+ * ancestor is also in this list should be ignored, otherwise the
+ * initial date in the partitioned table would be replicated twice.
+ */

11.a
Isn't this comment all backwards? I think you mean to say "partition"
or "partition table" (not partitioned table) because partitions have
ancestors but partition-ED tables don't.


11.b
"initial date" -> "initial data"

==

12. src/test/subscription/t/013_partition.pl

-# Note: We create two separate tables, not a partitioned one, so that we can
-# easily identity through which relation were the changes replicated.
+# Note: We only create one table for the partition table (ta

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-13 Thread John Naylor

On Tue, Jul 12, 2022 at 8:16 AM Masahiko Sawada  wrote:

> > > I think that at this stage it's better to define the design first. For
> > > example, key size and value size, and these sizes are fixed or can be
> > > set the arbitary size?
> >
> > I don't think we need to start over. Andres' prototype had certain
> > design decisions built in for the intended use case (although maybe
> > not clearly documented as such). Subsequent patches in this thread
> > substantially changed many design aspects. If there were any changes
> > that made things wonderful for vacuum, it wasn't explained, but Andres
> > did explain how some of these changes were not good for other uses.
> > Going to fixed 64-bit keys and values should still allow many future
> > applications, so let's do that if there's no reason not to.
>
> I thought Andres pointed out that given that we store BufferTag (or
> part of that) into the key, the fixed 64-bit keys might not be enough
> for buffer mapping use cases. If we want to use wider keys more than
> 64-bit, we would need to consider it.

It sounds like you've answered your own question, then. If so, I'm
curious what your current thinking is.

If we *did* want to have maximum flexibility, then "single-value
leaves" method would be the way to go, since it seems to be the
easiest way to have variable-length both keys and values. I do have a
concern that the extra pointer traversal would be a drag on
performance, and also require lots of small memory allocations. If we
happened to go that route, your idea upthread of using a bitmapset of
item offsets in the leaves sounds like a good fit for that.

I also have some concerns about also simultaneously trying to design
for the use for buffer mappings. I certainly want to make this good
for as many future uses as possible, and I'd really like to preserve
any optimizations already fought for. However, to make concrete
progress on the thread subject, I also don't think it's the most
productive use of time to get tied up about the fine details of
something that will not likely happen for several years at the
earliest.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: Add connection active, idle time to pg_stat_activity

2022-07-13 Thread torikoshia


Rafia, Sergey,

+1 for adding the total_active_time and total_idle_in_transaction_time 
to pg_stat_activity.


I reviewed the patch and here are some comments.

+  
+   total_active_time double 
precision

+  
+  
+   Time in milliseconds this backend spent in 
active and

+   fastpath states.

Is 'fastpath' an abbreviation of 'fastpath function call'?
If so, I feel it's clearer 'fastpath function call' 
than 'fastpath'.



+extern uint64 pgstat_get_my_active_time(void);
+extern uint64 pgstat_get_my_transaction_idle_time(void);

Are these functions necessary?
It seems they are not called from anywhere, doesn't it?

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-13 Thread Masahiko Sawada

On Thu, Jul 14, 2022 at 11:16 AM shiy.f...@fujitsu.com
 wrote:
>
> On Tue, Jul 12, 2022 5:23 PM Masahiko Sawada  wrote:
> >
> > On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com
> >  wrote:
> > >
> > > It happened when executing the following code because it tried to free a
> > NULL
> > > pointer (catchange_xip).
> > >
> > > /* be tidy */
> > > if (ondisk)
> > > pfree(ondisk);
> > > +   if (catchange_xip)
> > > +   pfree(catchange_xip);
> > >  }
> > >
> > > It seems to be related to configure option. I could reproduce it when 
> > > using
> > > `./configure --enable-debug`.
> > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og -
> > ggdb"`.
> >
> > Hmm, I could not reproduce this problem even if I use ./configure
> > --enable-debug. And it's weird that we checked if catchange_xip is not
> > null but we did pfree for it:
> >
> > #1  pfree (pointer=0x0) at mcxt.c:1177
> > #2  0x0078186b in SnapBuildSerialize (builder=0x1fd5e78,
> > lsn=25719712) at snapbuild.c:1792
> >
> > Is it reproducible in your environment?
>
> Thanks for your reply! Yes, it is reproducible. And I also reproduced it on 
> the
> v4 patch you posted [1].

Thank you for testing!

>
> [1] 
> https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com
>
> > If so, could you test it again
> > with the following changes?
> >
> > diff --git a/src/backend/replication/logical/snapbuild.c
> > b/src/backend/replication/logical/snapbuild.c
> > index d015c06ced..a6e76e3781 100644
> > --- a/src/backend/replication/logical/snapbuild.c
> > +++ b/src/backend/replication/logical/snapbuild.c
> > @@ -1788,7 +1788,7 @@ out:
> > /* be tidy */
> > if (ondisk)
> > pfree(ondisk);
> > -   if (catchange_xip)
> > +   if (catchange_xip != NULL)
> > pfree(catchange_xip);
> >  }
> >
>
> I tried this and could still reproduce the problem.

Does the backtrace still show we attempt to pfree a null-pointer?

>
> Besides, I tried the suggestion from Amit [2],  it could be fixed by checking
> the value of catchange_xcnt instead of catchange_xip before pfree.

Could you check if this problem occurred when we reached there via
goto pass, i.e., did we call ReorderBufferGetCatalogChangesXacts() or
not?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Remove support for Visual Studio 2013

2022-07-13 Thread Michael Paquier

On Fri, Jul 08, 2022 at 07:38:23AM +0900, Michael Paquier wrote:
> And with 495ed0e now in place, attached is a rebased version.

Hearing nothing about this one, and because it is a nice cleanup
overall, I have gone ahead and applied it:
14 files changed, 24 insertions(+), 177 deletions(-)

This removes the dependency on the value of _MSC_VER.
--
Michael


signature.asc
Description: PGP signature

Re: i.e. and e.g.

2022-07-13 Thread John Naylor

On Wed, Jul 13, 2022 at 4:13 PM Kyotaro Horiguchi 
wrote:
>
> At Wed, 13 Jul 2022 18:09:43 +0900 (JST), Kyotaro Horiguchi <
horikyota@gmail.com> wrote in
> > I happened to see the message below.
> >
> > > WARNING:  new data directory should not be inside the old data
directory, e.g. %s
> >
> > The corresponding code is
> >
> > > ... the old data directory, e.g. %s", old_cluster_pgdata);
> >
> > So, "e.g." (for example) in the message sounds like "that is", which I
> > think is "i.e.".  It should be fixed if this is correct.  I'm not sure
> > whether to keep using Latin-origin acronyms like this, but in the
> > attached I used "i.e.".

I did my own quick scan and found one use of i.e. that doesn't really fit,
in a sentence that has other grammatical issues:

-Due to the differences how ECPG works compared to Informix's
ESQL/C (i.e., which steps
+Due to differences in how ECPG works compared to Informix's ESQL/C
(namely, which steps
 are purely grammar transformations and which steps rely on the
underlying run-time library)
 there is no FREE cursor_name statement in ECPG.
This is because in ECPG,
 DECLARE CURSOR doesn't translate to a function
call into

I've pushed that in addition to your changes, thanks!

--
John Naylor
EDB: http://www.enterprisedb.com

RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-13 Thread shiy.f...@fujitsu.com

On Tue, Jul 12, 2022 5:23 PM Masahiko Sawada  wrote:
> 
> On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com
>  wrote:
> >
> > It happened when executing the following code because it tried to free a
> NULL
> > pointer (catchange_xip).
> >
> > /* be tidy */
> > if (ondisk)
> > pfree(ondisk);
> > +   if (catchange_xip)
> > +   pfree(catchange_xip);
> >  }
> >
> > It seems to be related to configure option. I could reproduce it when using
> > `./configure --enable-debug`.
> > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og -
> ggdb"`.
> 
> Hmm, I could not reproduce this problem even if I use ./configure
> --enable-debug. And it's weird that we checked if catchange_xip is not
> null but we did pfree for it:
> 
> #1  pfree (pointer=0x0) at mcxt.c:1177
> #2  0x0078186b in SnapBuildSerialize (builder=0x1fd5e78,
> lsn=25719712) at snapbuild.c:1792
> 
> Is it reproducible in your environment?

Thanks for your reply! Yes, it is reproducible. And I also reproduced it on the
v4 patch you posted [1].

[1] 
https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com

> If so, could you test it again
> with the following changes?
> 
> diff --git a/src/backend/replication/logical/snapbuild.c
> b/src/backend/replication/logical/snapbuild.c
> index d015c06ced..a6e76e3781 100644
> --- a/src/backend/replication/logical/snapbuild.c
> +++ b/src/backend/replication/logical/snapbuild.c
> @@ -1788,7 +1788,7 @@ out:
> /* be tidy */
> if (ondisk)
> pfree(ondisk);
> -   if (catchange_xip)
> +   if (catchange_xip != NULL)
> pfree(catchange_xip);
>  }
> 

I tried this and could still reproduce the problem.

Besides, I tried the suggestion from Amit [2],  it could be fixed by checking
the value of catchange_xcnt instead of catchange_xip before pfree.

[2] 
https://www.postgresql.org/message-id/CAA4eK1%2BXPdm8G%3DEhUJA12Pi1YvQAfcz2%3DkTd9a4BjVx4%3Dgk-MA%40mail.gmail.com

diff --git a/src/backend/replication/logical/snapbuild.c 
b/src/backend/replication/logical/snapbuild.c
index c482e906b0..68b9c4ef7d 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1573,7 +1573,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
Sizeneeded_length;
SnapBuildOnDisk *ondisk = NULL;
TransactionId   *catchange_xip = NULL;
-   size_t  catchange_xcnt;
+   size_t  catchange_xcnt = 0;
char   *ondisk_c;
int fd;
chartmppath[MAXPGPATH];
@@ -1788,7 +1788,7 @@ out:
/* be tidy */
if (ondisk)
pfree(ondisk);
-   if (catchange_xip)
+   if (catchange_xcnt != 0)
pfree(catchange_xip);
 }


Regards,
Shi yu

Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-13 Thread Masahiko Sawada

On Tue, Jul 12, 2022 at 12:40 PM shiy.f...@fujitsu.com
 wrote:
>
> On Tue, Jul 12, 2022 8:49 AM Masahiko Sawada  wrote:
> >
> > I've attached an updated patch.
> >
> > While trying this idea, I noticed there is no API to get the length of
> > dlist, as we discussed offlist. Alternative idea was to use List
> > (T_XidList) but I'm not sure it's a great idea since deleting an xid
> > from the list is O(N), we need to implement list_delete_xid, and we
> > need to make sure allocating list node in the reorder buffer context.
> > So in the patch, I added a variable, catchange_ntxns, to keep track of
> > the length of the dlist. Please review it.
> >
>
> Thanks for your patch. Here are some comments on the master patch.

Thank you for the comments.

>
> 1.
> In catalog_change_snapshot.spec, should we use "RUNNING_XACTS record" instead 
> of
> "RUNNING_XACT record" / "XACT_RUNNING record" in the comment?
>
> 2.
> +* Since catchange.xip is sorted, we find the lower bound of
> +* xids that sill are interesting.
>
> Typo?
> "sill" -> "still"
>
> 3.
> +* This array is set once when restoring the snapshot, xids are 
> removed
> +* from the array when decoding xl_running_xacts record, and then 
> eventually
> +* becomes an empty.
>
> +   /* catchange list becomes an empty */
> +   pfree(builder->catchange.xip);
> +   builder->catchange.xip = NULL;
>
> Should "becomes an empty" be modified to "becomes empty"?
>
> 4.
> + * changes that are smaller than ->xmin. Those won't ever get checked via
> + * the ->committed array and ->catchange, respectively. The committed xids 
> will
>
> Should we change
> "the ->committed array and ->catchange"
> to
> "the ->committed or ->catchange array"
> ?

Agreed with all the above comments. These are incorporated in the
latest v4 patch I just sent[1].

Regards,

[1] 
https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-13 Thread Masahiko Sawada

On Tue, Jul 12, 2022 at 5:52 PM Amit Kapila  wrote:
>
> On Tue, Jul 12, 2022 at 1:13 PM Masahiko Sawada  wrote:
> >
> > On Tue, Jul 12, 2022 at 3:25 PM Amit Kapila  wrote:
> > >
> > > On Tue, Jul 12, 2022 at 11:38 AM Masahiko Sawada  
> > > wrote:
> > > >
> > > > On Tue, Jul 12, 2022 at 10:28 AM Masahiko Sawada 
> > > >  wrote:
> > > > >
> > > > >
> > > > > I'm doing benchmark tests and will share the results.
> > > > >
> > > >
> > > > I've done benchmark tests to measure the overhead introduced by doing
> > > > bsearch() every time when decoding a commit record. I've simulated a
> > > > very intensified situation where we decode 1M commit records while
> > > > keeping builder->catchange.xip array but the overhead is negilible:
> > > >
> > > > HEAD: 584 ms
> > > > Patched: 614 ms
> > > >
> > > > I've attached the benchmark script I used. With increasing
> > > > LOG_SNAPSHOT_INTERVAL_MS to 9, the last decoding by
> > > > pg_logicla_slot_get_changes() decodes 1M commit records while keeping
> > > > catalog modifying transactions.
> > > >
> > >
> > > Thanks for the test. We should also see how it performs when (a) we
> > > don't change LOG_SNAPSHOT_INTERVAL_MS,
> >
> > What point do you want to see in this test? I think the performance
> > overhead depends on how many times we do bsearch() and how many
> > transactions are in the list.
> >
>
> Right, I am not expecting any visible performance difference in this
> case. This is to ensure that we are not incurring any overhead in the
> more usual scenarios (or default cases). As per my understanding, the
> purpose of increasing the value of LOG_SNAPSHOT_INTERVAL_MS is to
> simulate a stress case for the changes made by the patch, and keeping
> its value default will test the more usual scenarios.

Agreed.

I've done simple benchmark tests to decode 100k pgbench transactions:

HEAD: 10.34 s
Patched: 10.29 s

I've attached an updated patch that incorporated comments from Amit and Shi.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/
From 28ca92c9d95cd05a26a7db6e54704f92b1846943 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada 
Date: Wed, 6 Jul 2022 12:53:36 +0900
Subject: [PATCH v4] Add catalog modifying transactions to logical decoding
 serialized snapshot.

Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION
records to know if the transaction has modified the catalog, and that
information is not serialized to snapshot. Therefore, if the logical
decoding decodes only the commit record of the transaction that
actually has modified a catalog, we missed adding its XID to the
snapshot. We ended up looking at catalogs with the wrong snapshot.

To fix this problem, this change adds the list of transaction IDs and
sub-transaction IDs, that have modified catalogs and are running when
snapshot serialization, to the serialized snapshot. When decoding a
COMMIT record, we check both the list and the ReorderBuffer to see if
if the transaction has modified catalogs.

Since this adds additional information to the serialized snapshot, we
cannot backpatch it. For back branches, we take another approach;
remember the last-running-xacts list of the first decoded
RUNNING_XACTS record and check if the transaction whose commit record
has XACT_XINFO_HAS_INVALS and whose XID is in the list. This doesn't
require any file format changes but the transaction will end up being
added to the snapshot even if it has only relcache invalidations.

This commit bumps SNAPBUILD_VERSION because of change in SnapBuild.
---
 contrib/test_decoding/Makefile|   2 +-
 .../expected/catalog_change_snapshot.out  |  44 
 .../specs/catalog_change_snapshot.spec|  39 +++
 .../replication/logical/reorderbuffer.c   |  69 -
 src/backend/replication/logical/snapbuild.c   | 235 --
 src/include/replication/reorderbuffer.h   |  12 +
 6 files changed, 317 insertions(+), 84 deletions(-)
 create mode 100644 contrib/test_decoding/expected/catalog_change_snapshot.out
 create mode 100644 contrib/test_decoding/specs/catalog_change_snapshot.spec

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index b220906479..c7ce603706 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -8,7 +8,7 @@ REGRESS = ddl xact rewrite toast permissions decoding_in_xact \
 	spill slot truncate stream stats twophase twophase_stream
 ISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \
 	oldest_xmin snapshot_transfer subxact_without_top concurrent_stream \
-	twophase_snapshot slot_creation_error
+	twophase_snapshot slot_creation_error catalog_change_snapshot
 
 REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
 ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
diff --git a/contrib/test_decoding/expected/catalog_change_snapshot.out b/contrib/test_decoding/expected/catalog_change_snapshot.out
new file mode 100644
index 000

Re: automatically generating node support functions

2022-07-13 Thread Tom Lane

Just one more thing here ... I really don't like the fact that
gen_node_support.pl's response to unparseable input is to silently
ignore it.  That's maybe tolerable outside a node struct, but
I think we need a higher standard inside.  I experimented with
promoting the commented-out "warn" to "die", and soon learned
that there are two shortcomings:

* We can't cope with the embedded union inside A_Const.
Simplest fix is to move it outside.

* We can't cope with function-pointer fields.  The only real
problem there is that some of them spread across multiple lines,
but really that was a shortcoming we'd have to fix sometime
anyway.

Proposed patch attached.

regards, tom lane

diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 96af17516a..35af4e231f 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -213,15 +213,34 @@ foreach my $infile (@ARGV)
 	}
 	$file_content .= $raw_file_content;
 
-	my $lineno = 0;
+	my $lineno   = 0;
+	my $prevline = '';
 	foreach my $line (split /\n/, $file_content)
 	{
+		# per-physical-line processing
 		$lineno++;
 		chomp $line;
 		$line =~ s/\s*$//;
 		next if $line eq '';
 		next if $line =~ /^#(define|ifdef|endif)/;
 
+		# within a node struct, don't process until we have whole logical line
+		if ($in_struct && $subline > 1)
+		{
+			if ($line =~ m/;$/)
+			{
+# found the end, re-attach any previous line(s)
+$line = $prevline . $line;
+$prevline = '';
+			}
+			else
+			{
+# set it aside for a moment
+$prevline .= $line . ' ';
+next;
+			}
+		}
+
 		# we are analyzing a struct definition
 		if ($in_struct)
 		{
@@ -394,7 +413,7 @@ foreach my $infile (@ARGV)
 			}
 			# normal struct field
 			elsif ($line =~
-/^\s*(.+)\s*\b(\w+)(\[\w+\])?\s*(?:pg_node_attr\(([\w(), ]*)\))?;/
+/^\s*(.+)\s*\b(\w+)(\[[\w\s+]+\])?\s*(?:pg_node_attr\(([\w(), ]*)\))?;/
 			  )
 			{
 if ($is_node_struct)
@@ -441,13 +460,46 @@ foreach my $infile (@ARGV)
 	$my_field_attrs{$name} = \@attrs;
 }
 			}
-			else
+			# function pointer field
+			elsif ($line =~
+/^\s*([\w\s*]+)\s*\(\*(\w+)\)\s*\((.*)\)\s*(?:pg_node_attr\(([\w(), ]*)\))?;/
+			  )
 			{
 if ($is_node_struct)
 {
-	#warn "$infile:$lineno: could not parse \"$line\"\n";
+	my $type  = $1;
+	my $name  = $2;
+	my $args  = $3;
+	my $attrs = $4;
+
+	my @attrs;
+	if ($attrs)
+	{
+		@attrs = split /,\s*/, $attrs;
+		foreach my $attr (@attrs)
+		{
+			if (   $attr !~ /^copy_as\(\w+\)$/
+&& $attr !~ /^read_as\(\w+\)$/
+&& !elem $attr,
+qw(equal_ignore read_write_ignore))
+			{
+die
+  "$infile:$lineno: unrecognized attribute \"$attr\"\n";
+			}
+		}
+	}
+
+	push @my_fields, $name;
+	$my_field_types{$name} = 'function pointer';
+	$my_field_attrs{$name} = \@attrs;
 }
 			}
+			else
+			{
+# We're not too picky about what's outside structs,
+# but we'd better understand everything inside.
+die "$infile:$lineno: could not parse \"$line\"\n";
+			}
 		}
 		# not in a struct
 		else
@@ -709,6 +761,12 @@ _equal${n}(const $n *a, const $n *b)
   unless $equal_ignore;
 			}
 		}
+		elsif ($t eq 'function pointer')
+		{
+			# we can copy and compare as a scalar
+			print $cff "\tCOPY_SCALAR_FIELD($f);\n"unless $copy_ignore;
+			print $eff "\tCOMPARE_SCALAR_FIELD($f);\n" unless $equal_ignore;
+		}
 		# node type
 		elsif ($t =~ /(\w+)\*/ and elem $1, @node_types)
 		{
@@ -980,6 +1038,12 @@ _read${n}(void)
   unless $no_read;
 			}
 		}
+		elsif ($t eq 'function pointer')
+		{
+			# We don't print these, and we can't read them either
+			die "cannot read function pointer in struct \"$n\" field \"$f\"\n"
+			  unless $no_read;
+		}
 		# Special treatments of several Path node fields
 		elsif ($t eq 'RelOptInfo*' && elem 'write_only_relids', @a)
 		{
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index b0c9c5f2ef..98fe1abaa2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -303,26 +303,26 @@ typedef struct A_Expr
 
 /*
  * A_Const - a literal constant
+ *
+ * Value nodes are inline for performance.  You can treat 'val' as a node,
+ * as in IsA(&val, Integer).  'val' is not valid if isnull is true.
  */
+union ValUnion
+{
+	Node		node;
+	Integer		ival;
+	Float		fval;
+	Boolean		boolval;
+	String		sval;
+	BitString	bsval;
+};
+
 typedef struct A_Const
 {
 	pg_node_attr(custom_copy_equal, custom_read_write, no_read)
 
 	NodeTag		type;
-
-	/*
-	 * Value nodes are inline for performance.  You can treat 'val' as a node,
-	 * as in IsA(&val, Integer).  'val' is not valid if isnull is true.
-	 */
-	union ValUnion
-	{
-		Node		node;
-		Integer		ival;
-		Float		fval;
-		Boolean		boolval;
-		String		sval;
-		BitString	bsval;
-	}			val;
+	union ValUnion val;
 	bool		isnull;			/* SQL NULL constant */

Re: fix stats_fetch_consistency value in postgresql.conf.sample

2022-07-13 Thread Justin Pryzby

On Thu, Jul 14, 2022 at 08:46:02AM +0900, Michael Paquier wrote:
> On Wed, Jul 13, 2022 at 12:30:00PM -0500, Justin Pryzby wrote:
> > How did you make this list ?  Was it by excluding things that failed for 
> > you ?
> > 
> > cfbot is currently failing due to io_concurrency on windows.
> > I think there are more GUC which should be included here.
> > 
> > http://cfbot.cputube.org/kyotaro-horiguchi.html
> 
> FWIW, I am not really a fan of making this test depend on a hardcoded
> list of GUCs.  The design strength of the existing test is that we
> don't have such a dependency now, making less to think about in terms
> of maintenance in the long-term, even if this is now run
> automatically.

It doesn't really need to be stated that an inclusive list wouldn't be useful.

That's a list of GUCs to be excluded.
Which is hardly different from the pre-existing list of exceptions.

# Ignore some exceptions.
next if $param_name eq "include";
next if $param_name eq "include_dir";
next if $param_name eq "include_if_exists";

-- Exceptions are transaction_*.
SELECT name FROM tab_settings_flags
  WHERE NOT no_show_all AND no_reset_all
  ORDER BY 1;
  name  

 transaction_deferrable
 transaction_isolation
 transaction_read_only
(3 rows)

How else do you propose to make this work for guc whose defaults vary by
platform in guc.c or in initdb ?

-- 
Justin

Re: fix stats_fetch_consistency value in postgresql.conf.sample

2022-07-13 Thread Andres Freund

Hi,

On 2022-07-14 08:46:02 +0900, Michael Paquier wrote:
> On Wed, Jul 13, 2022 at 12:30:00PM -0500, Justin Pryzby wrote:
> > How did you make this list ?  Was it by excluding things that failed for 
> > you ?
> > 
> > cfbot is currently failing due to io_concurrency on windows.
> > I think there are more GUC which should be included here.
> > 
> > http://cfbot.cputube.org/kyotaro-horiguchi.html
> 
> FWIW, I am not really a fan of making this test depend on a hardcoded
> list of GUCs.

I wonder if we should add flags indicating platform dependency etc to guc.c?
That should allow to remove most of them?


> The design strength of the existing test is that we
> don't have such a dependency now, making less to think about in terms
> of maintenance in the long-term, even if this is now run
> automatically.

There's no existing test for things covered by these exceptions, unless I am
missing something?

Greetings,

Andres Freund

Re: fix stats_fetch_consistency value in postgresql.conf.sample

2022-07-13 Thread Michael Paquier

On Wed, Jul 13, 2022 at 12:30:00PM -0500, Justin Pryzby wrote:
> How did you make this list ?  Was it by excluding things that failed for you ?
> 
> cfbot is currently failing due to io_concurrency on windows.
> I think there are more GUC which should be included here.
> 
> http://cfbot.cputube.org/kyotaro-horiguchi.html

FWIW, I am not really a fan of making this test depend on a hardcoded
list of GUCs.  The design strength of the existing test is that we
don't have such a dependency now, making less to think about in terms
of maintenance in the long-term, even if this is now run
automatically.
--
Michael


signature.asc
Description: PGP signature

Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-13 Thread Peter Smith

On Wed, Jul 13, 2022 at 7:55 PM vignesh C  wrote:
>
> On Wed, Jul 13, 2022 at 1:13 PM Michael Paquier  wrote:
> >
> > On Wed, Jul 13, 2022 at 12:22:06PM +0530, vignesh C wrote:
> > > Most of the code is common between GetSubscriptionRelations and
> > > GetSubscriptionNotReadyRelations. Added a parameter to
> > > GetSubscriptionRelations which could provide the same functionality as
> > > the existing GetSubscriptionRelations and
> > > GetSubscriptionNotReadyRelations. Attached patch has the changes for
> > > the same. Thoughts?
> >
> > Right.  Using all_rels to mean that we'd filter relations that are not
> > ready is a bit confusing, though.  Perhaps this could use a bitmask as
> > argument.
>
> The attached v2 patch has the modified version which includes the
> changes to make the argument as bitmask.
>

By using a bitmask I think there is an implication that the flags can
be combined...

Perhaps it is not a problem today, but later you may want more flags. e.g.
#define SUBSCRIPTION_REL_STATE_READY 0x02 /* READY relations */

then the bitmask idea falls apart because IIUC you have no intentions
to permit things like:
(SUBSCRIPTION_REL_STATE_NOT_READY | SUBSCRIPTION_REL_STATE_READY)

IMO using an enum might be a better choice for that parameter.

--
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: proposal: Allocate work_mem From Pool

2022-07-13 Thread Justin Pryzby

On Tue, Jul 12, 2022 at 08:49:10PM -0700, Joseph D Wagner wrote:
> > > Before I try to answer that, I need to know how the scheduler works.
> 
> > As I understand the term used, there is no scheduler inside Postgres
> > for user connections -- they're handled by the OS kernel.
> 
> Then, I'm probably using the wrong term. Right now, I have
> max_worker_processes set to 16. What happens when query #17
> wants some work done? What do you call the thing that handles
> that? What is its algorithm for allocating work to the processes?
> Or am I completely misunderstanding the role worker processes
> play in execution?

max_connections limits the number of client connections (queries).
Background workers are a relatively new thing - they didn't exist until v9.3.

There is no scheduler, unless you run a connection pooler between the
application and the DB.  Which you should probably do, since you've set
work_mem measured in GB on a server with 10s of GB of RAM, and while using
partitioning.

https://www.postgresql.org/docs/current/runtime-config-connection.html#GUC-MAX-CONNECTIONS
|max_connections (integer)
|
|Determines the maximum number of concurrent connections to the database 
server. The default is typically 100 connections, but might be less if your 
kernel settings will not support it (as determined during initdb). This 
parameter can only be set at server start.

https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES
|max_worker_processes (integer)
|Sets the maximum number of background processes that the system can 
support. This parameter can only be set at server start. The default is 8.
|When running a standby server, you must set this parameter to the same or 
higher value than on the primary server. Otherwise, queries will not be allowed 
in the standby server.
|When changing this value, consider also adjusting max_parallel_workers, 
max_parallel_maintenance_workers, and max_parallel_workers_per_gather.

https://www.postgresql.org/docs/current/connect-estab.html
|PostgreSQL implements a “process per user” client/server model. In this model, 
every client process connects to exactly one backend process. As we do not know 
ahead of time how many connections will be made, we have to use a “supervisor 
process” that spawns a new backend process every time a connection is 
requested. This supervisor process is called postmaster and listens at a 
specified TCP/IP port for incoming connections. Whenever it detects a request 
for a connection, it spawns a new backend process. Those backend processes 
communicate with each other and with other processes of the instance using 
semaphores and shared memory to ensure data integrity throughout concurrent 
data access.
|
|The client process can be any program that understands the PostgreSQL protocol 
described in Chapter 53. Many clients are based on the C-language library 
libpq, but several independent implementations of the protocol exist, such as 
the Java JDBC driver.
|
|Once a connection is established, the client process can send a query to the 
backend process it's connected to. The query is transmitted using plain text, 
i.e., there is no parsing done in the client. The backend process parses the 
query, creates an execution plan, executes the plan, and returns the retrieved 
rows to the client by transmitting them over the established connection.

https://www.postgresql.org/docs/current/tutorial-arch.html
| The PostgreSQL server can handle multiple concurrent connections from 
clients. To achieve this it starts (“forks”) a new process for each connection. 
From that point on, the client and the new server process communicate without 
intervention by the original postgres process. Thus, the supervisor server 
process is always running, waiting for client connections, whereas client and 
associated server processes come and go. (All of this is of course invisible to 
the user. We only mention it here for completeness.)

https://wiki.postgresql.org/wiki/FAQ#How_does_PostgreSQL_use_CPU_resources.3F
|The PostgreSQL server is process-based (not threaded). Each database session 
connects to a single PostgreSQL operating system (OS) process. Multiple 
sessions are automatically spread across all available CPUs by the OS. The OS 
also uses CPUs to handle disk I/O and run other non-database tasks. Client 
applications can use threads, each of which connects to a separate database 
process. Since version 9.6, portions of some queries can be run in parallel, in 
separate OS processes, allowing use of multiple CPU cores. Parallel queries are 
enabled by default in version 10 (max_parallel_workers_per_gather), with 
additional parallelism expected in future releases. 

BTW, since this is amply documented, I have to point out that it's not on-topic
for the development list.

-- 
Justin

Re: The "char" type versus non-ASCII characters

2022-07-13 Thread Tom Lane

I wrote:
> Peter Eisentraut  writes:
>> I think we could consider char to be a single-byte bytea and use the 
>> escape format of bytea for char.  That way there is some precedent and 
>> we don't add yet another encoding or escape format.

> Do you want to take that as far as changing backslash to print
> as '\\' ?

This came up again today [1], so here's a concrete proposal.
Let's use \ooo for high-bit-set chars, but keep backslash as just
backslash (so it's only semi-compatible with bytea).

regards, tom lane

[1] 
https://www.postgresql.org/message-id/CAFM5RapGbBQm%2BdH%3D7K80HcvBvEWiV5Tm7N%3DNRaYURfm98YWc8A%40mail.gmail.com

diff --git a/src/backend/utils/adt/char.c b/src/backend/utils/adt/char.c
index 0df41c2253..e50293bf14 100644
--- a/src/backend/utils/adt/char.c
+++ b/src/backend/utils/adt/char.c
@@ -20,6 +20,11 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 
+#define ISOCTAL(c)   (((c) >= '0') && ((c) <= '7'))
+#define TOOCTAL(c)   ((c) + '0')
+#define FROMOCTAL(c) ((unsigned char) (c) - '0')
+
+
 /*
  *	 USER I/O ROUTINES		 *
  */
@@ -27,31 +32,53 @@
 /*
  *		charin			- converts "x" to 'x'
  *
- * Note that an empty input string will implicitly be converted to \0.
+ * This accepts the formats charout produces.  If we have multibyte input
+ * that is not in the form '\ooo', then we take its first byte as the value
+ * and silently discard the rest; this is a backwards-compatibility provision.
  */
 Datum
 charin(PG_FUNCTION_ARGS)
 {
 	char	   *ch = PG_GETARG_CSTRING(0);
 
+	if (strlen(ch) == 4 && ch[0] == '\\' &&
+		ISOCTAL(ch[1]) && ISOCTAL(ch[2]) && ISOCTAL(ch[3]))
+		PG_RETURN_CHAR((FROMOCTAL(ch[1]) << 6) +
+	   (FROMOCTAL(ch[2]) << 3) +
+	   FROMOCTAL(ch[3]));
+	/* This will do the right thing for a zero-length input string */
 	PG_RETURN_CHAR(ch[0]);
 }
 
 /*
  *		charout			- converts 'x' to "x"
  *
- * Note that if the char value is \0, the resulting string will appear
- * to be empty (null-terminated after zero characters).  So this is the
- * inverse of the charin() function for such data.
+ * The possible output formats are:
+ * 1. 0x00 is represented as an empty string.
+ * 2. 0x01..0x7F are represented as a single ASCII byte.
+ * 3. 0x80..0xFF are represented as \ooo (backslash and 3 octal digits).
+ * Case 3 is meant to match the traditional "escape" format of bytea.
  */
 Datum
 charout(PG_FUNCTION_ARGS)
 {
 	char		ch = PG_GETARG_CHAR(0);
-	char	   *result = (char *) palloc(2);
+	char	   *result = (char *) palloc(5);
 
-	result[0] = ch;
-	result[1] = '\0';
+	if (IS_HIGHBIT_SET(ch))
+	{
+		result[0] = '\\';
+		result[1] = TOOCTAL(((unsigned char) ch) >> 6);
+		result[2] = TOOCTALunsigned char) ch) >> 3) & 07);
+		result[3] = TOOCTAL(((unsigned char) ch) & 07);
+		result[4] = '\0';
+	}
+	else
+	{
+		/* This produces acceptable results for 0x00 as well */
+		result[0] = ch;
+		result[1] = '\0';
+	}
 	PG_RETURN_CSTRING(result);
 }
 
@@ -176,15 +203,20 @@ Datum
 text_char(PG_FUNCTION_ARGS)
 {
 	text	   *arg1 = PG_GETARG_TEXT_PP(0);
+	char	   *ch = VARDATA_ANY(arg1);
 	char		result;
 
 	/*
-	 * An empty input string is converted to \0 (for consistency with charin).
-	 * If the input is longer than one character, the excess data is silently
-	 * discarded.
+	 * Conversion rules are the same as in charin(), but here we need to
+	 * handle the empty-string case honestly.
 	 */
-	if (VARSIZE_ANY_EXHDR(arg1) > 0)
-		result = *(VARDATA_ANY(arg1));
+	if (VARSIZE_ANY_EXHDR(arg1) == 4 && ch[0] == '\\' &&
+		ISOCTAL(ch[1]) && ISOCTAL(ch[2]) && ISOCTAL(ch[3]))
+		result = (FROMOCTAL(ch[1]) << 6) +
+			(FROMOCTAL(ch[2]) << 3) +
+			FROMOCTAL(ch[3]);
+	else if (VARSIZE_ANY_EXHDR(arg1) > 0)
+		result = ch[0];
 	else
 		result = '\0';
 
@@ -195,13 +227,21 @@ Datum
 char_text(PG_FUNCTION_ARGS)
 {
 	char		arg1 = PG_GETARG_CHAR(0);
-	text	   *result = palloc(VARHDRSZ + 1);
+	text	   *result = palloc(VARHDRSZ + 4);
 
 	/*
-	 * Convert \0 to an empty string, for consistency with charout (and
-	 * because the text stuff doesn't like embedded nulls all that well).
+	 * Conversion rules are the same as in charout(), but here we need to be
+	 * honest about converting 0x00 to an empty string.
 	 */
-	if (arg1 != '\0')
+	if (IS_HIGHBIT_SET(arg1))
+	{
+		SET_VARSIZE(result, VARHDRSZ + 4);
+		(VARDATA(result))[0] = '\\';
+		(VARDATA(result))[1] = TOOCTAL(((unsigned char) arg1) >> 6);
+		(VARDATA(result))[2] = TOOCTALunsigned char) arg1) >> 3) & 07);
+		(VARDATA(result))[3] = TOOCTAL(((unsigned char) arg1) & 07);
+	}
+	else if (arg1 != '\0')
 	{
 		SET_VARSIZE(result, VARHDRSZ + 1);
 		*(VARDATA(result)) = arg1;

Re: Bug: Reading from single byte character column type may cause out of bounds memory reads.

2022-07-13 Thread Tom Lane

Andrew Dunstan  writes:
> On 2022-07-13 We 11:11, Tom Lane wrote:
>> I complained about this in [1], but that thread died off before reaching a
>> clear consensus about exactly what to do.
>> [1] 
>> https://www.postgresql.org/message-id/flat/2318797.1638558730%40sss.pgh.pa.us

> Looks like the main controversy was about the output format. Make an
> executive decision and pick one.

Done, see other thread.

regards, tom lane

Re: proposal: possibility to read dumped table's name from file

2022-07-13 Thread Andrew Dunstan



On 2022-04-25 Mo 13:39, Pavel Stehule wrote:
> Hi
>
> fresh rebase
>
>


If we're going to do this for pg_dump's include/exclude options,
shouldn't we also provide an equivalent facility for pg_dumpall's
--exclude-database option?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2022-07-13 Thread Zhihong Yu

On Wed, Jul 13, 2022 at 1:05 PM Dmitry Koval  wrote:

> Thanks you!
> I've fixed all things mentioned.
>
> --
> With best regards,
> Dmitry Koval
>
> Postgres Professional: http://postgrespro.com

Hi,
Toward the end of ATExecSplitPartition():

+   /* Unlock new partition. */
+   table_close(newPartRel, NoLock);

 Why is NoLock passed (instead of AccessExclusiveLock) ?

Cheers

Re: allow building trusted languages without the untrusted versions

2022-07-13 Thread Tom Lane

Nathan Bossart  writes:
> Given the discussion in this thread, I intend to mark the commitfest entry
> as Withdrawn shortly.  Before I do, I thought I'd first check whether 0001
> [0] might be worthwhile independent of $SUBJECT.  This change separates the
> [un]trusted handler and validator functions for PL/Perl so that we no
> longer need to inspect pg_language to determine whether to use the trusted
> or untrusted code path.  I was surprised to learn that you can end up with
> PL/PerlU even if you've specified the trusted handler/validator functions.
> Besides bringing things more in line with how PL/Tcl does things, this
> change simplifies function lookup in plperl_proc_hash.  I suppose such a
> change might introduce a compatibility break for users who are depending on
> this behavior, but I don't know if that's worth worrying about.

Meh.  Avoiding the potential repeat hashtable lookup is worth something,
but I'm not sure I buy that this is a semantic improvement.  ISTM that
lanpltrusted *should* be the ultimate source of truth on this point.

My feelings about it are not terribly strong either way, though.

regards, tom lane

Re: Bug: Reading from single byte character column type may cause out of bounds memory reads.

2022-07-13 Thread Andrew Dunstan



On 2022-07-13 We 11:11, Tom Lane wrote:
> Aleksander Alekseev  writes:
>> Although the bug is easy to fix for this particular case (see the
>> patch) I'm not sure if this solution is general enough. E.g. is there
>> something that generally prevents pg_mblen() from doing out of bound
>> reading in cases similar to this one? Should we prevent such an INSERT
>> from happening instead?
> This is ultimately down to char_text() generating a string that's alleged
> to be a valid "text" type value, but it contains illegally-encoded data.
> Where we need to fix it is there: if we try to make every single
> text-using function be 100% bulletproof against wrongly-encoded data,
> we'll still be fixing bugs at the heat death of the universe.
>
> I complained about this in [1], but that thread died off before reaching a
> clear consensus about exactly what to do.
>
>   regards, tom lane
>
> [1] 
> https://www.postgresql.org/message-id/flat/2318797.1638558730%40sss.pgh.pa.us
>
>


Looks like the main controversy was about the output format. Make an
executive decision and pick one.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: make update-po@master stops at pg_upgrade

2022-07-13 Thread Tom Lane

Peter Eisentraut  writes:
> On 13.07.22 20:19, Tom Lane wrote:
>> Hmm.  We could list built files explicitly, perhaps, and still be
>> a good step ahead on the maintenance burden.  Does xgettext get
>> upset if the same input file is mentioned twice, ie would we have
>> to filter sql_help.c out of the wildcard result?

> It seems it would be ok with that.

Actually, we can get rid of those easily enough anyway with $(sort).
Here's a draft that might solve these problems.  The idea is to use
$(wildcard) for files in the srcdir, and manually enumerate only
built files.

regards, tom lane

diff --git a/src/bin/initdb/nls.mk b/src/bin/initdb/nls.mk
index 19c9136849..54d6d034b5 100644
--- a/src/bin/initdb/nls.mk
+++ b/src/bin/initdb/nls.mk
@@ -1,5 +1,7 @@
 # src/bin/initdb/nls.mk
 CATALOG_NAME = initdb
-GETTEXT_FILES= $(FRONTEND_COMMON_GETTEXT_FILES) findtimezone.c initdb.c ../../common/exec.c ../../common/fe_memutils.c ../../common/file_utils.c ../../common/pgfnames.c ../../common/restricted_token.c ../../common/rmtree.c ../../common/username.c ../../common/wait_error.c ../../port/dirmod.c
+GETTEXT_FILES= $(FRONTEND_COMMON_GETTEXT_FILES) \
+   $(notdir $(wildcard $(srcdir)/*.c)) \
+   ../../common/exec.c ../../common/fe_memutils.c ../../common/file_utils.c ../../common/pgfnames.c ../../common/restricted_token.c ../../common/rmtree.c ../../common/username.c ../../common/wait_error.c ../../port/dirmod.c
 GETTEXT_TRIGGERS = $(FRONTEND_COMMON_GETTEXT_TRIGGERS) simple_prompt
 GETTEXT_FLAGS= $(FRONTEND_COMMON_GETTEXT_FLAGS)
diff --git a/src/bin/pg_amcheck/nls.mk b/src/bin/pg_amcheck/nls.mk
index 5e6171952c..1f70ed4c70 100644
--- a/src/bin/pg_amcheck/nls.mk
+++ b/src/bin/pg_amcheck/nls.mk
@@ -1,7 +1,7 @@
 # src/bin/pg_amcheck/nls.mk
 CATALOG_NAME = pg_amcheck
 GETTEXT_FILES= $(FRONTEND_COMMON_GETTEXT_FILES) \
-   pg_amcheck.c \
+   $(notdir $(wildcard $(srcdir)/*.c)) \
../../fe_utils/cancel.c \
../../fe_utils/connect_utils.c \
../../fe_utils/option_utils.c \
diff --git a/src/bin/pg_archivecleanup/nls.mk b/src/bin/pg_archivecleanup/nls.mk
index 801cf1c51e..8b4b64de8f 100644
--- a/src/bin/pg_archivecleanup/nls.mk
+++ b/src/bin/pg_archivecleanup/nls.mk
@@ -1,5 +1,6 @@
 # src/bin/pg_archivecleanup/nls.mk
 CATALOG_NAME = pg_archivecleanup
-GETTEXT_FILES= $(FRONTEND_COMMON_GETTEXT_FILES) pg_archivecleanup.c
+GETTEXT_FILES= $(FRONTEND_COMMON_GETTEXT_FILES) \
+   $(notdir $(wildcard $(srcdir)/*.c))
 GETTEXT_TRIGGERS = $(FRONTEND_COMMON_GETTEXT_TRIGGERS)
 GETTEXT_FLAGS= $(FRONTEND_COMMON_GETTEXT_FLAGS)
diff --git a/src/bin/pg_basebackup/nls.mk b/src/bin/pg_basebackup/nls.mk
index 8d28e249de..04af63ac3c 100644
--- a/src/bin/pg_basebackup/nls.mk
+++ b/src/bin/pg_basebackup/nls.mk
@@ -1,18 +1,7 @@
 # src/bin/pg_basebackup/nls.mk
 CATALOG_NAME = pg_basebackup
 GETTEXT_FILES= $(FRONTEND_COMMON_GETTEXT_FILES) \
-   bbstreamer_file.c \
-   bbstreamer_gzip.c \
-   bbstreamer_inject.c \
-   bbstreamer_lz4.c \
-   bbstreamer_tar.c \
-   bbstreamer_zstd.c \
-   pg_basebackup.c \
-   pg_receivewal.c \
-   pg_recvlogical.c \
-   receivelog.c \
-   streamutil.c \
-   walmethods.c \
+   $(notdir $(wildcard $(srcdir)/*.c)) \
../../common/compression.c \
../../common/fe_memutils.c \
../../common/file_utils.c \
diff --git a/src/bin/pg_checksums/nls.mk b/src/bin/pg_checksums/nls.mk
index f7cd2a5ee9..0b08f46f4d 100644
--- a/src/bin/pg_checksums/nls.mk
+++ b/src/bin/pg_checksums/nls.mk
@@ -1,7 +1,7 @@
 # src/bin/pg_checksums/nls.mk
 CATALOG_NAME = pg_checksums
 GETTEXT_FILES= $(FRONTEND_COMMON_GETTEXT_FILES) \
-   pg_checksums.c \
+   $(notdir $(wildcard $(srcdir)/*.c)) \
../../fe_utils/option_utils.c
 GETTEXT_TRIGGERS = $(FRONTEND_COMMON_GETTEXT_TRIGGERS)
 GETTEXT_FLAGS= $(FRONTEND_COMMON_GETTEXT_FLAGS)
diff --git a/src/bin/pg_config/nls.mk b/src/bin/pg_config/nls.mk
index d86c28c404..8642be47e4 100644
--- a/src/bin/pg_config/nls.mk
+++ b/src/bin/pg_config/nls.mk
@@ -1,3 +1,4 @@
 # src/bin/pg_config/nls.mk
 CATALOG_NAME = pg_config
-GETTEXT_FILES= pg_config.c ../../common/config_info.c ../../common/exec.c
+GETTEXT_FILES= $(notdir $(wildcard $(srcdir)/*.c)) \
+   ../../common/config_info.c ../../common/exec.c
diff --git a/src/bin/pg_controldata/nls.mk b/src/bin/pg_controldata/nls.mk
index ab34205b96..805085a56b 100644
--- a/src/bin/pg_controldata/nls.mk
+++ b/src/bin/pg_controldata/nls.mk
@@ -1,5 +1,6 @@
 # src/bin/pg_controldata/nls.mk
 CATALOG_NAME = pg_controldata
-GETTEXT_FILES

Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2022-07-13 Thread Zhihong Yu

On Wed, Jul 13, 2022 at 11:28 AM Dmitry Koval 
wrote:

> Hi!
>
> Patch stop applying due to changes in upstream.
> Here is a rebased version.
>
> --
> With best regards,
> Dmitry Koval
>
> Postgres Professional: http://postgrespro.com

Hi,

+attachPartTable(List **wqueue, Relation rel, Relation partition,
PartitionBoundSpec *bound)

I checked naming of existing methods, such as AttachPartitionEnsureIndexes.
I think it would be better if the above method is
named attachPartitionTable.

+   if (!defaultPartCtx && OidIsValid(defaultPartOid))
+   {
+   pc = createSplitPartitionContext(table_open(defaultPartOid,
AccessExclusiveLock));

Since the value of pc would be passed to defaultPartCtx, there is no need
to assign to pc above. You can assign directly to defaultPartCtx.

+   /* Drop splitted partition. */

splitted -> split

+   /* Rename new partition if it is need. */

need -> needed.

Cheers

Re: allow building trusted languages without the untrusted versions

2022-07-13 Thread Nathan Bossart

Given the discussion in this thread, I intend to mark the commitfest entry
as Withdrawn shortly.  Before I do, I thought I'd first check whether 0001
[0] might be worthwhile independent of $SUBJECT.  This change separates the
[un]trusted handler and validator functions for PL/Perl so that we no
longer need to inspect pg_language to determine whether to use the trusted
or untrusted code path.  I was surprised to learn that you can end up with
PL/PerlU even if you've specified the trusted handler/validator functions.
Besides bringing things more in line with how PL/Tcl does things, this
change simplifies function lookup in plperl_proc_hash.  I suppose such a
change might introduce a compatibility break for users who are depending on
this behavior, but I don't know if that's worth worrying about.

[0] 
https://www.postgresql.org/message-id/attachment/133940/v1-0001-Do-not-use-pg_language-to-determine-whether-PL-Pe.patch

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Issue with recovery_target = 'immediate'

2022-07-13 Thread David Steele


Hackers,

We have noticed an issue when performing recovery with recovery_target = 
'immediate' when the latest timeline cannot be replayed to from the 
backup (current) timeline.


First, create two backups:

$ pgbackrest --stanza=demo --type=full --start-fast backup
$ pgbackrest --stanza=demo --type=full --start-fast backup
$ pgbackrest info


full backup: 20220713-175710F
timestamp start/stop: 2022-07-13 17:57:10 / 2022-07-13 17:57:14
wal start/stop: 00010003 / 
00010003

database size: 23.2MB, database backup size: 23.2MB
repo1: backup set size: 2.8MB, backup size: 2.8MB

full backup: 20220713-175748F
timestamp start/stop: 2022-07-13 17:57:48 / 2022-07-13 17:57:52
wal start/stop: 00010005 / 
00010005

database size: 23.2MB, database backup size: 23.2MB
repo1: backup set size: 2.8MB, backup size: 2.8MB

Restore the first backup:

$ pg_ctlcluster 13 demo stop
$ pgbackrest --stanza=demo --delta --set=20220713-175710F 
--type=immediate --target-action=promote restore


Recovery settings:

$ cat /var/lib/postgresql/13/demo/postgresql.auto.conf


restore_command = 'pgbackrest --stanza=demo archive-get %f "%p"'
recovery_target = 'immediate'
recovery_target_action = 'promote'

Starting PostgreSQL performs recovery as expected:

$ pg_ctlcluster 13 demo start
$ cat /var/log/postgresql/postgresql-13-demo.log

LOG:  database system was interrupted; last known up at 2022-07-13 
17:57:10 UTC

LOG:  starting point-in-time recovery to earliest consistent point
LOG:  restored log file "00010003" from archive
LOG:  redo starts at 0/328
LOG:  consistent recovery state reached at 0/3000138
LOG:  recovery stopping after reaching consistency
LOG:  redo done at 0/3000138
LOG:  database system is ready to accept read only connections
LOG:  selected new timeline ID: 2
LOG:  archive recovery complete
LOG:  database system is ready to accept connections

Now restore the second backup (recovery settings are identical):

$ pg_ctlcluster 13 demo stop
$ pgbackrest --stanza=demo --delta --set=20220713-175748F 
--type=immediate --target-action=promote restore


Recovery now fails:

$ pg_ctlcluster 13 demo start
$ cat /var/log/postgresql/postgresql-13-demo.log

LOG:  database system was interrupted; last known up at 2022-07-13 
17:57:48 UTC

LOG:  restored log file "0002.history" from archive
LOG:  starting point-in-time recovery to earliest consistent point
LOG:  restored log file "0002.history" from archive
LOG:  restored log file "00010005" from archive
FATAL:  requested timeline 2 is not a child of this server's history
DETAIL:  Latest checkpoint is at 0/560 on timeline 1, but in the 
history of the requested timeline, the server forked off from that 
timeline at 0/3000138.

LOG:  startup process (PID 511) exited with exit code 1
LOG:  aborting startup due to startup process failure
LOG:  database system is shut down

While it is certainly true that timeline 2 cannot be replayed to from 
timeline 1, it should not matter for an immediate recovery that stops at 
consistency. No timeline switch will occur until promotion. Of course 
the cluster could be shut down before promotion and the target changed, 
but in that case couldn't timeline be adjusted at that point?


This works by default for PostgreSQL < 12 because the default timeline 
is current. Since PostgreSQL 12 the default has been latest, which does 
not work by default.


When a user does a number of recoveries it is pretty common for the 
timelines to get in a state that will make most subsequent recoveries 
fail. We think it makes sense for recovery_target = 'immediate' to be a 
fail safe that always works no matter the state of the latest timeline.


Our solution has been to force recovery_target_timeline = 'current' when 
recovery_target = 'immediate', but it seems like this is something that 
should be done in PostgreSQL instead.


Thoughts?

Regards,
-David

Re: Add --{no-,}bypassrls flags to createuser

2022-07-13 Thread Nathan Bossart

On Wed, Jul 13, 2022 at 08:14:28AM +, Shinoda, Noriyoshi (PN Japan FSIP) 
wrote:
> The attached small patch fixes the message in "createuser --help" command. 
> The patch has changed to specify a time stamp for the --valid-for option. I 
> don't think the SGML description needs to be modified.

Good catch.  Apart from a nitpick about the indentation, your patch looks
reasonable to me.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: minor change for create_list_bounds()

2022-07-13 Thread Nathan Bossart

On Wed, Jul 13, 2022 at 05:07:53PM +1200, David Rowley wrote:
> While I agree that the gains on making this change are small. It just
> accounts to saving a call to bms_add_member() when we've already found
> the partition to be interleaved due to interleaved Datum values,  I
> just disagree with not doing anything about it. My reasons are:
> 
> 1. This code is new to PG15. We have the opportunity now to make a
> meaningful improvement and backpatch it.  When PG15 is out, the bar is
> set significantly higher for fixing this type of thing due to having
> to consider the additional cost of backpatching conflicts with other
> future fixes in that area.
> 2. I think the code as I just pushed it is easier to understand than
> what was there before.

Fair enough.

> 3. I'd like to encourage people to look at and critique our newly
> added code. Having a concern addressed seems like a good reward for
> the work.

+1

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: make update-po@master stops at pg_upgrade

2022-07-13 Thread Peter Eisentraut


On 13.07.22 20:19, Tom Lane wrote:

Hmm.  We could list built files explicitly, perhaps, and still be
a good step ahead on the maintenance burden.  Does xgettext get
upset if the same input file is mentioned twice, ie would we have
to filter sql_help.c out of the wildcard result?


It seems it would be ok with that.

Re: make update-po@master stops at pg_upgrade

2022-07-13 Thread Tom Lane

Peter Eisentraut  writes:
> Note that we have this in nls-global.mk which tries to avoid having the
> vpath details sneak into the output:

> po/$(CATALOG_NAME).pot: $(GETTEXT_FILES) $(MAKEFILE_LIST)
> # Change to srcdir explicitly, don't rely on $^.  That way we get
> # consistent #: file references in the po files.
> ...
>  $(XGETTEXT) -D $(srcdir) -D . -n $(addprefix -k, $(GETTEXT_TRIGGERS)) 
> $(addprefix --flag=, $(GETTEXT_FLAGS)) $(GETTEXT_FILES)

Hmm ... so how does that work with built files in a VPATH build today?

Anyway, I'll go revert the patch for now, since it's clearly got
at least a couple problems that need sorting.

regards, tom lane

Re: make update-po@master stops at pg_upgrade

2022-07-13 Thread Peter Eisentraut


On 13.07.22 19:41, Tom Lane wrote:

Alvaro Herrera  writes:

Hmm, I got this failure:
/usr/bin/xgettext: no se especificó el fichero de entrada


Hmm ... are you doing this in a VPATH setup?  Does it help
if you make the entry be

GETTEXT_FILES= $(wildcard $(srcdir)/*.c)

I'd supposed we didn't need to be careful about that, because
I saw uses of $(wildcard) without it ... but I now see other uses
with.


Note that we have this in nls-global.mk which tries to avoid having the
vpath details sneak into the output:

po/$(CATALOG_NAME).pot: $(GETTEXT_FILES) $(MAKEFILE_LIST)
# Change to srcdir explicitly, don't rely on $^.  That way we get
# consistent #: file references in the po files.
...
$(XGETTEXT) -D $(srcdir) -D . -n $(addprefix -k, $(GETTEXT_TRIGGERS)) 
$(addprefix --flag=, $(GETTEXT_FLAGS)) $(GETTEXT_FILES)

Re: make update-po@master stops at pg_upgrade

2022-07-13 Thread Tom Lane

Peter Eisentraut  writes:
> In some cases, the listed files are build output from another rule, for 
> example sql_help.c.  By using a wildcard, you just take whatever files 
> happen to be there, not observing proper make dependencies. 

Hmm.  We could list built files explicitly, perhaps, and still be
a good step ahead on the maintenance burden.  Does xgettext get
upset if the same input file is mentioned twice, ie would we have
to filter sql_help.c out of the wildcard result?

regards, tom lane

Re: make update-po@master stops at pg_upgrade

2022-07-13 Thread Peter Eisentraut


On 13.07.22 20:09, Peter Eisentraut wrote:
In any case, someone should check that this creates identical output 
before and after.


A quick check shows differences in

pg_rewind.pot
psql.pot
ecpg.pot
libpq.pot
plpgsql.pot

Re: make update-po@master stops at pg_upgrade

2022-07-13 Thread Peter Eisentraut


On 13.07.22 18:07, Tom Lane wrote:

Still, wildcarding the local *.c references seems like a clear step
forward.  I'll go push that part.


In some cases, the listed files are build output from another rule, for 
example sql_help.c.  By using a wildcard, you just take whatever files 
happen to be there, not observing proper make dependencies. 
(Conversely, you could also include files that are not part of the 
build, maybe leftover from something aborted.)  So I'm not sure this is 
such a clear win.  In any case, someone should check that this creates 
identical output before and after.

Re: strings: ".. (compression)? is not supported by this build"

2022-07-13 Thread Robert Haas

On Wed, Jul 13, 2022 at 10:33 AM Justin Pryzby  wrote:
> $ git grep 'is not supported by this build' '*c'
> src/backend/access/transam/xloginsert.c:  
>   elog(ERROR, "LZ4 is not supported by this build");
> src/backend/access/transam/xloginsert.c:  
>   elog(ERROR, "zstd is not supported by this build");
> src/backend/access/transam/xloginsert.c:elog(ERROR, 
> "LZ4 is not supported by this build");
> src/backend/access/transam/xloginsert.c:elog(ERROR, 
> "zstd is not supported by this build");
> ...
> src/backend/replication/basebackup_gzip.c:   errmsg("gzip 
> compression is not supported by this build")));
> src/backend/replication/basebackup_lz4.c:errmsg("lz4 
> compression is not supported by this build")));
> src/backend/replication/basebackup_zstd.c:   errmsg("zstd 
> compression is not supported by this build")));
>
> Should the word "compression" be removed from basebackup, for consistency with
> the use in xloginsert.c ?  And "lz4" capitalization changed for consistency 
> (in
> one direction or the other).  See 4035cd5d4, e9537321a7, 7cf085f07.  Maybe 
> zstd
> should also be changed to Zstandard per 586955ddd.
>
> To avoid the extra translation, and allow the compiler to merge strings.

Translation isn't an issue here because the first group of messages
are reported using elog(), not ereport(). There is something to be
said for being consistent, though.

I feel like it's kind of awkward that this thing is named Zstandard
but the library is libzstd and the package name is probably also zstd.
I worry a bit that if we make the messages say zstandard instead of
zstd it makes it harder to figure out. It's probably not a huge issue,
though.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: make update-po@master stops at pg_upgrade

2022-07-13 Thread Tom Lane

Alvaro Herrera  writes:
> Hmm, I got this failure:
> /usr/bin/xgettext: no se especificó el fichero de entrada

Hmm ... are you doing this in a VPATH setup?  Does it help
if you make the entry be

GETTEXT_FILES= $(wildcard $(srcdir)/*.c)

I'd supposed we didn't need to be careful about that, because
I saw uses of $(wildcard) without it ... but I now see other uses
with.

regards, tom lane

Re: logical replication restrictions

2022-07-13 Thread Melih Mutlu

Hi Euler,

I've some comments/questions about the latest version (v4) of your patch.

Firstly, I think the patch needs a rebase. CI currently cannot apply it [1].

22. src/test/subscription/t/032_apply_delay.pl
>
> I received the following error when trying to run these 'subscription'
> tests:
>
> t/032_apply_delay.pl ... No such class log_location at
> t/032_apply_delay.pl line 49, near "my log_location"
> syntax error at t/032_apply_delay.pl line 49, near "my log_location ="
>

I'm having these errors too. Seems like some declarations are missing.


+  specified amount of time. If this value is specified without
>> units,
>
> +  it is taken as milliseconds. The default is zero, adding no
>> delay.
>
> + 
>
> I'm also having an issue when I give min_apply_delay parameter without
units.
I expect that if I set  min_apply_delay to 5000 (without any unit), it will
be interpreted as 5000 ms.

I tried:
postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'dbname=postgres
port=5432' PUBLICATION testpub WITH (min_apply_delay=5000);

And logs showed:
2022-07-13 20:26:52.231 +03 [5422] LOG:  logical replication apply delay:
499 ms
2022-07-13 20:26:52.231 +03 [5422] CONTEXT:  processing remote data for
replication origin "pg_18126" during "BEGIN" in transaction 3152 finished
at 0/465D7A0

Looks like it starts from 500 ms instead of 5000 ms for me. If I state
the unit as ms, then it works correctly.


Lastly, I have a question about this delay during tablesync.
It's stated here that apply delays are not for initial tablesync.

 
>
> +  The delay occurs only on WAL records for transaction begins and
>> after
>
> +  the initial table synchronization. It is possible that the
>
> +  replication delay between publisher and subscriber exceeds the
>> value
>
> +  of this parameter, in which case no delay is added. Note that
>> the
>
> +  delay is calculated between the WAL time stamp as written on
>
> +  publisher and the current time on the subscriber. Delays in
>> logical
>
> +  decoding and in transfer the transaction may reduce the actual
>> wait
>
> +  time. If the system clocks on publisher and subscriber are not
>
> +  synchronized, this may lead to apply changes earlier than
>> expected.
>
> +  This is not a major issue because a typical setting of this
>> parameter
>
> +  are much larger than typical time deviations between servers.
>
> + 
>
>
There might be a case where tablesync workers are in SYNCWAIT state and
waiting for apply worker to tell them to CATCHUP.
And if apply worker is waiting in apply_delay function, tablesync workers
will be stuck at SYNCWAIT state and this might delay tablesync at least
"min_apply_delay" amount of time or more.
Is it something we would want? What do you think?


[1] http://cfbot.cputube.org/patch_38_3581.log


Best,
Melih

Re: fix stats_fetch_consistency value in postgresql.conf.sample

2022-07-13 Thread Justin Pryzby

> +# The following parameters are defaultly set with
> +# environment-dependent values at run-time which may not match the
> +# default values written in the sample config file.
> +my %ignore_parameters = 
> +  map { $_ => 1 } (
> +   'data_directory',
> +   'hba_file',
> +   'ident_file',
> +   'krb_server_keyfile',
> +   'max_stack_depth',
> +   'bgwriter_flush_after',
> +   'wal_sync_method',
> +   'checkpoint_flush_after',
> +   'timezone_abbreviations',
> +   'lc_messages',
> +   'wal_buffers');

How did you make this list ?  Was it by excluding things that failed for you ?

cfbot is currently failing due to io_concurrency on windows.
I think there are more GUC which should be included here.

http://cfbot.cputube.org/kyotaro-horiguchi.html

-- 
Justin

Re: make update-po@master stops at pg_upgrade

2022-07-13 Thread Alvaro Herrera

On 2022-Jul-13, Tom Lane wrote:

> I had to recreate the patch almost from scratch, because 88dad06b4
> touched adjacent lines in most of these files, scaring patch(1)
> away from applying the changes.  That being the case, I decided
> to use $(wildcard *.c) everywhere, including the places where there's
> currently just one *.c file.  It seems to work for me, but please check.

Hmm, I got this failure:

make[4]: se entra en el directorio 
'/home/alvherre/Code/pgsql-build/master/src/interfaces/ecpg/ecpglib'
/usr/bin/xgettext -ctranslator --copyright-holder='PostgreSQL Global 
Development Group' --msgid-bugs-address=pgsql-b...@lists.postgresql.org 
--no-wrap --sort-by-file --package-name='ecpglib (PostgreSQL)' 
--package-version='16' -D /pgsql/source/master/src/interfaces/ecpg/ecpglib -D . 
-n -kecpg_gettext -k_ --flag=ecpg_gettext:1:pass-c-format 
--flag=_:1:pass-c-format 
/usr/bin/xgettext: no se especificó el fichero de entrada
Pruebe '/usr/bin/xgettext --help' para más información.
make[4]: *** [/pgsql/source/master/src/nls-global.mk:108: po/ecpglib.pot] Error 
1

The error from xgettext, of course, is
/usr/bin/xgettext: no input file given

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Puedes vivir sólo una vez, pero si lo haces bien, una vez es suficiente"

Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2022-07-13 Thread Melanie Plageman

Attached patch set is substantially different enough from previous
versions that I kept it as a new patch set.
Note that local buffer allocations are now correctly tracked.

On Tue, Jul 12, 2022 at 1:01 PM Andres Freund  wrote:

> Hi,
>
> On 2022-07-12 12:19:06 -0400, Melanie Plageman wrote:
> > > > I also realized that I am not differentiating between IOPATH_SHARED
> and
> > > > IOPATH_STRATEGY for IOOP_FSYNC. But, given that we don't know what
> type
> > > > of buffer we are fsync'ing by the time we call
> register_dirty_segment(),
> > > > I'm not sure how we would fix this.
> > >
> > > I think there scarcely happens flush for strategy-loaded buffers.  If
> > > that is sensible, IOOP_FSYNC would not make much sense for
> > > IOPATH_STRATEGY.
> > >
> >
> > Why would it be less likely for a backend to do its own fsync when
> > flushing a dirty strategy buffer than a regular dirty shared buffer?
>
> We really just don't expect a backend to do many segment fsyncs at
> all. Otherwise there's something wrong with the forwarding mechanism.
>

When a dirty strategy buffer is written out, if pendingOps sync queue is
full and the backend has to fsync the segment itself instead of relying
on the checkpointer, this will show in the statistics as an IOOP_FSYNC
for IOPATH_SHARED not IOPATH_STRATEGY.
IOPATH_STRATEGY + IOOP_FSYNC will always be 0 for all BackendTypes.
Does this seem right?

>
> It'd be different if we tracked WAL fsyncs more granularly - which would be
> quite interesting - but that's something for another day^Wpatch.
>
>
I do have a question about this.
So, if we were to start tracking WAL IO would it fit within this
paradigm to have a new IOPATH_WAL for WAL or would it add a separate
dimension?

I was thinking that we might want to consider calling this view
pg_stat_io_data because we might want to have a separate view,
pg_stat_io_wal and then, maybe eventually, convert pg_stat_slru to
pg_stat_io_slru (or a subset of what is in pg_stat_slru).
And maybe then later add pg_stat_io_[archiver/other]

Is pg_stat_io_data a good name that gives us flexibility to
introduce views which expose per-backend IO operation stats (maybe that
goes in pg_stat_activity, though [or maybe not because it wouldn't
include exited backends?]) and per query IO operation stats?

I would like to add roughly the same additional columns to all of
these during AIO development (basically the columns from iostat):
- average block size (will usually be 8kB for pg_stat_io_data but won't
necessarily for the others)
- IOPS/BW
- avg read/write wait time
- demand rate/completion rate
- merges
- maybe queue depth

And I would like to be able to see all of these per query, per backend,
per relation, per BackendType, per IOPath, per SLRU type, etc.

Basically, what I'm asking is
1) what can we name the view to enable these future stats to exist with
the least confusing/wordy view names?
2) will the current view layout and column titles work with minimal
changes for future stats extensions like what I mention above?

>
> > > > > Wonder if it's worth making the lock specific to the backend type?
> > > > >
> > > >
> > > > I've added another Lock into PgStat_IOPathOps so that each
> BackendType
> > > > can be locked separately. But, I've also kept the lock in
> > > > PgStatShared_BackendIOPathOps so that reset_all and snapshot could be
> > > > done easily.
> > >
> > > Looks fine about the lock separation.
> > >
> >
> > Actually, I think it is not safe to use both of these locks. So for
> > picking one method, it is probably better to go with the locks in
> > PgStat_IOPathOps, it will be more efficient for flush (and not for
> > fetching and resetting), so that is probably the way to go, right?
>
> I think it's good to just use one kind of lock, and efficiency of
> snapshotting
> / resetting is nearly irrelevant. But I don't see why it's not safe to use
> both kinds of locks?
>
>
The way I implemented it was not safe because I didn't use both locks
when resetting the stats.

In this new version of the patch, I've done the following: In shared
memory I've put the lock in PgStatShared_IOPathOps -- the data structure
which contains an array of PgStat_IOOpCounters for all IOOp types for
all IOPaths. Thus, different BackendType + IOPath combinations can be
updated concurrently without contending for the same lock.

To make this work, I made two versions of the PgStat_IOPathOps -- one
that has the lock, PgStatShared_IOPathOps, and one without,
PgStat_IOPathOps, so that I can persist it to the stats file without
writing and reading the LWLock and can have a local and snapshot version
of the data structure without the lock.

This also necessitated two versions of the data structure wrapping
PgStat_IOPathOps, PgStat_BackendIOPathOps, which contains an array with
a PgStat_IOPathOps for each BackendType, and
PgStatShared_BackendIOPathOps, containing an array of
PgStatShared_IOPathOps.

>
> > > Looks fine, but I think pgstat_flush_io_ops() need more comment

optimize lookups in snapshot [sub]xip arrays

2022-07-13 Thread Nathan Bossart

Hi hackers,

A few years ago, there was a proposal to create hash tables for long
[sub]xip arrays in snapshots [0], but the thread seems to have fizzled out.
I was curious whether this idea still showed measurable benefits, so I
revamped the patch and ran the same test as before [1].  Here are the
results for 60₋second runs on an r5d.24xlarge with the data directory on
the local NVMe storage:

 writers  HEAD  patch  diff

 16   659   664+1%
 32   645   663+3%
 64   659   692+5%
 128  641   716+12%
 256  619   610-1%
 512  530   702+32%
 768  469   582+24%
 1000 367   577+57%

As before, the hash table approach seems to provide a decent benefit at
higher client counts, so I felt it was worth reviving the idea.

The attached patch has some key differences from the previous proposal.
For example, the new patch uses simplehash instead of open-coding a new
hash table.  Also, I've bumped up the threshold for creating hash tables to
128 based on the results of my testing.  The attached patch waits until a
lookup of [sub]xip before generating the hash table, so we only need to
allocate enough space for the current elements in the [sub]xip array, and
we avoid allocating extra memory for workloads that do not need the hash
tables.  I'm slightly worried about increasing the number of memory
allocations in this code path, but the results above seemed encouraging on
that front.

Thoughts?

[0] https://postgr.es/m/35960b8af917e9268881cd8df3f88320%40postgrespro.ru
[1] https://postgr.es/m/057a9a95-19d2-05f0-17e2-f46ff20e9b3e%402ndquadrant.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From a7e63198030d8c77df1720a85f9eca6d1d5078b2 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 12 Jul 2022 11:39:41 -0700
Subject: [PATCH v1 1/1] Optimize lookups in snapshot "transactions in
 progress" arrays.

Presently, XidInMVCCSnapshot() performs linear searches through the
xip and subxip arrays.  This is ordinarily not too bad, but this
O(n) behavior may degrade performance for certain workloads at
higher client counts.  This change teaches XidInMVCCSnapshot() to
generate hash tables when the [sub]xip array is large, thereby
achieving O(1) lookup times at the expense of some extra memory.
These hash tables are regarded as ephemeral; they only live in
process-local memory and are never rewritten, copied, or
serialized.  This means that we only need to allocate enough room
for the current elements in [sub]xip, which should usually save
memory (but increase the number of allocations).  Another benefit
of this approach is that the hash tables are not allocated for
workloads that do not generate snapshots with long [sub]xip arrays.

A synthetic workload that generates snapshots with many transaction
IDs showed no regression in TPS at lower client counts, and it
provided over 50% improvement at higher client counts (i.e., 1000
connections).
---
 src/backend/storage/ipc/procarray.c |   7 +-
 src/backend/utils/time/snapmgr.c| 111 ++--
 src/include/utils/snapshot.h|  19 +
 3 files changed, 114 insertions(+), 23 deletions(-)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index dadaa958a8..e126eb72bb 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2544,12 +2544,15 @@ GetSnapshotData(Snapshot snapshot)
 	snapshot->curcid = GetCurrentCommandId(false);
 
 	/*
-	 * This is a new snapshot, so set both refcounts are zero, and mark it as
-	 * not copied in persistent memory.
+	 * This is a new snapshot, so set both refcounts are zero, mark it as not
+	 * copied in persistent memory, and mark the xip and subxip hash tables as
+	 * not built.
 	 */
 	snapshot->active_count = 0;
 	snapshot->regd_count = 0;
 	snapshot->copied = false;
+	snapshot->xiph = NULL;
+	snapshot->subxiph = NULL;
 
 	GetSnapshotDataInitOldSnapshot(snapshot);
 
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 5bc2a15160..c5fcfd254c 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -53,6 +53,7 @@
 #include "access/xact.h"
 #include "access/xlog.h"
 #include "catalog/catalog.h"
+#include "common/hashfn.h"
 #include "datatype/timestamp.h"
 #include "lib/pairingheap.h"
 #include "miscadmin.h"
@@ -626,6 +627,8 @@ CopySnapshot(Snapshot snapshot)
 	newsnap->active_count = 0;
 	newsnap->copied = true;
 	newsnap->snapXactCompletionCount = 0;
+	newsnap->xiph = NULL;
+	newsnap->subxiph = NULL;
 
 	/* setup XID array */
 	if (snapshot->xcnt > 0)
@@ -667,6 +670,10 @@ FreeSnapshot(Snapshot snapshot)
 	Assert(snapshot->active_count == 0);
 	Assert(snapshot->copied);
 
+	if (snapshot->xiph)
+		xip_hash_destroy(snapshot->xiph);
+	if (snapshot->subxiph)
+		xip_hash_destroy(snapshot->subxiph);
 	pfree(snapshot);
 }
 
@@ -2233,6 +2240,8 @@ Rest

Re: make update-po@master stops at pg_upgrade

2022-07-13 Thread Tom Lane

I wrote:
> Kyotaro Horiguchi  writes:
>> Since backend does that way, I think we can do that the same way
>> also for the tools. Attached second does that except for tools that
>> have only one *.c.  The patch doesn't make a difference in the result
>> of make update-po.

> Still, wildcarding the local *.c references seems like a clear step
> forward.  I'll go push that part.

I had to recreate the patch almost from scratch, because 88dad06b4
touched adjacent lines in most of these files, scaring patch(1)
away from applying the changes.  That being the case, I decided
to use $(wildcard *.c) everywhere, including the places where there's
currently just one *.c file.  It seems to work for me, but please check.

regards, tom lane

Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-13 Thread Alvaro Herrera

Not a review, just a preparatory rebase across some trivially
conflicting changes.  I also noticed that
src/test/recovery/t/031_recovery_conflict.pl, which was added two days
after v23 was sent, and which uses allow_in_place_tablespaces, bails out
because of the checks introduced by this patch, so I made the check
routine do nothing in that case.

Anyway, here's v24.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"La conclusión que podemos sacar de esos estudios es que
no podemos sacar ninguna conclusión de ellos" (Tanenbaum)
>From de2c77f5e2c31c911674377a51359ba8fe662c96 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Wed, 13 Jul 2022 18:14:18 +0200
Subject: [PATCH v24] Fix replay of create database records on standby

Crash recovery on standby may encounter missing directories when
replaying create database WAL records.  Prior to this patch, the
standby would fail to recover in such a case.  However, the
directories could be legitimately missing.  Consider a sequence of WAL
records as follows:

CREATE DATABASE
DROP DATABASE
DROP TABLESPACE

If, after replaying the last WAL record and removing the tablespace
directory, the standby crashes and has to replay the create database
record again, the crash recovery must be able to move on.

This patch allows missing tablespaces to be created during recovery
before reaching consistency.  The tablespaces are created as real
directories that should not exists but will be removed until reaching
consistency. CheckRecoveryConsistency is responsible to make sure they
have disappeared.

Similar to log_invalid_page mechanism, the GUC ignore_invalid_pages
turns into PANIC errors detected by this patch into WARNING, which
allows continueing recovery.

Diagnosed-by: Paul Guo 
Author: Paul Guo 
Author: Kyotaro Horiguchi 
Author: Asim R Praveen 
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27dn5xwj2p9-roh16e4...@mail.gmail.com
---
 doc/src/sgml/config.sgml|   5 +-
 src/backend/access/transam/xlogrecovery.c   |  59 
 src/backend/commands/dbcommands.c   |  71 +
 src/backend/commands/tablespace.c   |  28 +---
 src/backend/utils/misc/guc.c|   8 +-
 src/include/access/xlogutils.h  |   2 +
 src/test/recovery/t/029_replay_tsp_drops.pl | 155 
 7 files changed, 296 insertions(+), 32 deletions(-)
 create mode 100644 src/test/recovery/t/029_replay_tsp_drops.pl

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 37fd80388c..1e1c8c1cb7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -11363,11 +11363,12 @@ LOG:  CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1)
   

 If set to off (the default), detection of
-WAL records having references to invalid pages during
+WAL records having references to invalid pages or
+WAL records resulting in invalid directory operations during
 recovery causes PostgreSQL to
 raise a PANIC-level error, aborting the recovery. Setting
 ignore_invalid_pages to on
-causes the system to ignore invalid page references in WAL records
+causes the system to ignore invalid actions caused by such WAL records
 (but still report a warning), and continue the recovery.
 This behavior may cause crashes, data loss,
 propagate or hide corruption, or other serious problems.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5e46..ae81244e06 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2008,6 +2008,57 @@ xlogrecovery_redo(XLogReaderState *record, TimeLineID replayTLI)
 	}
 }
 
+/*
+ * Makes sure that ./pg_tblspc directory doesn't contain a real directory.
+ *
+ * This is intended to be called after reaching consistency.
+ * ignore_invalid_pages=on turns into the PANIC error into WARNING so that
+ * recovery can continue.
+ *
+ * This can't be checked in allow_in_place_tablespaces mode, so skip it in
+ * that case.
+ */
+static void
+CheckTablespaceDirectory(void)
+{
+	char *tblspc_path = "./pg_tblspc";
+	DIR		   *dir;
+	struct dirent *de;
+
+	/* Do not check for this when test tablespaces are in use */
+	if (allow_in_place_tablespaces)
+		return;
+
+	dir = AllocateDir(tblspc_path);
+	while ((de = ReadDir(dir, tblspc_path)) != NULL)
+	{
+		char	path[MAXPGPATH];
+		char   *p;
+#ifndef WIN32
+		struct stat st;
+#endif
+
+		/* Skip entries of non-oid names */
+		for (p = de->d_name; *p && isdigit(*p); p++);
+		if (*p)
+			continue;
+
+		snprintf(path, MAXPGPATH, "%s/%s", tblspc_path, de->d_name);
+
+#ifndef WIN32
+		if (lstat(path, &st) < 0)
+			ereport(ERROR, errcode_for_file_access(),
+	errmsg("could not stat file \"%s\": %m", path));
+
+		if (!S_ISLNK(st.st_mode))
+#else
+		if (!pgwin32_is_junction(path))
+#endif
+			elog(ignore_

Re: Bug: Reading from single byte character column type may cause out of bounds memory reads.

2022-07-13 Thread Isaac Morland

On Wed, 13 Jul 2022 at 09:15, Aleksander Alekseev 
wrote:

I can confirm the bug exists in the `master` branch as well and
> doesn't depend on the platform.
>
> Although the bug is easy to fix for this particular case (see the
> patch) I'm not sure if this solution is general enough. E.g. is there
> something that generally prevents pg_mblen() from doing out of bound
> reading in cases similar to this one? Should we prevent such an INSERT
> from happening instead?
>

Not just INSERTs, I would think: the implicit cast is already invalid,
since the "char" type can only hold characters that can be represented in 1
byte. A comparable example in the numeric types might be:

odyssey=> select (2.0 ^ 80)::double precision::integer;
ERROR:  integer out of range

By comparison:

odyssey=> select '🀆'::"char";
 char
──

(1 row)

I think this should give an error, perhaps 'ERROR: "char" out of range'.

Incidentally, if I apply ascii() to the result, I get sometimes 0 and
sometimes 90112, neither of which should be a possible value for ascii ()
of a "char" value and neither of which is 126982, the actual value of that
character.

odyssey=> select ascii ('🀆'::"char");
 ascii
───
 90112
(1 row)

odyssey=> select ascii ('🀆'::"char");
 ascii
───
 0
(1 row)

odyssey=> select ascii ('🀆');
 ascii

 126982
(1 row)

Re: make update-po@master stops at pg_upgrade

2022-07-13 Thread Tom Lane

Kyotaro Horiguchi  writes:
> I find it annoying that make update-po stops at pg_upgrade on master.
> The cause is that a file is renamed from relfilenode.c to
> relfilenumber.c so just fixing the name works. (attached first).

Ooops.

> I wonder if we can use $(wildcard *.c) instead of explicitly
> enumerating every *.c file in the directory then maintaining the
> list.

+1.  We've repeatedly forgotten to update the nls.mk files when
adding/removing files; they're just not on most hackers' radar.
Anything we can do to make that more automatic seems like a win.

> Since backend does that way, I think we can do that the same way
> also for the tools. Attached second does that except for tools that
> have only one *.c.  The patch doesn't make a difference in the result
> of make update-po.

I wonder if there's anything we can do about the manual cross-references
to src/common.  We could wildcard that, but then every FE program would
have to contain translations for all messages in src/common/, even for
modules it doesn't use.

Still, wildcarding the local *.c references seems like a clear step
forward.  I'll go push that part.

regards, tom lane

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-07-13 Thread Pavel Borisov

>
> Is there any reason to continue with two separate threads and CF entries ?
> The original reason was to have a smaller patch for considerate late in
> v15.
>
> But right now, it just seems to cause every update to imply two email
> messages
> rather than one.
>
> Since the patch is split into 0001, 0002, 0003, 0004+, both can continue
> in the
> main thread.  The early patches can still be applied independently from
> each
> later patch (the same as with any other patch series).
>

I see the main goal of this split is to make discussion of this (easier)
thread separate to the discussion of a whole patchset which is expected to
be more thorough.

Also I see the chances of this thread to be committed into v16 to be much
higher than of a main patch, which will be for v17 then.

Thanks for the advice to add git thread instead of patch posting. Will try
to do this.
-- 
Best regards,
Pavel Borisov

Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-13 Thread Andres Freund

Hi,

On 2022-07-13 09:23:00 -0400, Jonathan S. Katz wrote:
> On 7/13/22 12:13 AM, David Rowley wrote:
> > On Tue, 12 Jul 2022 at 17:15, David Rowley  wrote:
> > > So far only Robert has raised concerns with this regression for PG15
> > > (see [2]). Tom voted for leaving things as they are for PG15 in [3].
> > > John agrees, as quoted above. Does anyone else have any opinion?
> > 
> > Let me handle this slightly differently.  I've moved the open item for
> > this into the "won't fix" section.  If nobody shouts at me for that
> > then I'll let that end the debate.  Otherwise, we can consider the
> > argument when it arrives.
> 
> The RMT discussed this issue at its meeting today (and a few weeks back --
> apologies for not writing sooner). While we agree with your analysis that 1/
> this issue does appear to be a corner case and 2/ the benefits outweigh the
> risks, we still don't know how prevalent it may be in the wild and the
> general impact to user experience.
> 
> The RMT suggests that you make one more pass at attempting to solve it.

I think without a more concrete analysis from the RMT that's not really
actionable. What risks are we willing to accept to resolve this? This is
mostly a question of tradeoffs.

Several "senior" postgres hackers looked at this and didn't find any solution
that makes sense to apply to 15. I don't think having David butt his head
further against this code is likely to achieve much besides a headache.  Note
that David already has a patch to address this for 16.

> If there does not appear to be a clear path forward, we should at least
> document how a user can detect and resolve the issue.

To me that doesn't really make sense. We have lots of places were performance
changes once you cross some threshold, and lots of those are related to
work_mem.

We don't, e.g., provide tooling to detect when performance in aggregation
regresses due to crossing work_mem and could be fixed by a tiny increase in
work_mem.

Greetings,

Andres Freund

Re: Bug: Reading from single byte character column type may cause out of bounds memory reads.

2022-07-13 Thread Tom Lane

Aleksander Alekseev  writes:
> Although the bug is easy to fix for this particular case (see the
> patch) I'm not sure if this solution is general enough. E.g. is there
> something that generally prevents pg_mblen() from doing out of bound
> reading in cases similar to this one? Should we prevent such an INSERT
> from happening instead?

This is ultimately down to char_text() generating a string that's alleged
to be a valid "text" type value, but it contains illegally-encoded data.
Where we need to fix it is there: if we try to make every single
text-using function be 100% bulletproof against wrongly-encoded data,
we'll still be fixing bugs at the heat death of the universe.

I complained about this in [1], but that thread died off before reaching a
clear consensus about exactly what to do.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/flat/2318797.1638558730%40sss.pgh.pa.us

Re: [PATCH] Optimize json_lex_string by batching character copying

2022-07-13 Thread Andrew Dunstan



On 2022-06-24 Fr 20:18, Andres Freund wrote:
> Hi,
>
> On 2022-06-24 08:47:09 +, Jelte Fennema wrote:
>> To test performance of this change I used COPY BINARY from a JSONB table
>> into another, containing fairly JSONB values of ~15kB.
> This will have a lot of other costs included (DML is expensive). I'd suggest
> storing the json in a text column and casting it to json[b], with a filter
> ontop of the json[b] result that cheaply filters it away. That should end up
> spending nearly all the time somewhere around json parsing.
>
> It's useful for things like this to include a way for others to use the same
> benchmark...
>
> I tried your patch with:
>
> DROP TABLE IF EXISTS json_as_text;
> CREATE TABLE json_as_text AS SELECT (SELECT json_agg(row_to_json(pd)) as t 
> FROM pg_description pd) FROM generate_series(1, 100);
> VACUUM FREEZE json_as_text;
>
> SELECT 1 FROM json_as_text WHERE jsonb_typeof(t::jsonb) = 'not me';


I've been doing some other work related to json parsing and John
referred me to this. But it's actually not the best test for pure json
parsing - casting to jsonb involves some extra work besides pure
parsing. Instead I've been using this query with the same table, which
should be almost all json parsing:


select 1 from json_as_text where t::json is null;


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: make install-world fails sometimes in Mac M1

2022-07-13 Thread Alvaro Herrera

On 2022-Jul-11, Gaddam Sai Ram wrote:

>       Even we don't have any problem when we run commands via
> terminal. Problem occurs only when we run as a part of script.

It must be a problem induced by the shell used to run the script, then.
What is it?  The script itself doesn't say.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"¿Cómo puedes confiar en algo que pagas y que no ves,
y no confiar en algo que te dan y te lo muestran?" (Germán Poo)

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-07-13 Thread Justin Pryzby

On Fri, May 13, 2022 at 04:21:29PM +0300, Maxim Orlov wrote:
> We have posted an updated version v34 of the whole patchset in [1].
> Changes of patches 0001-0003 there are identical to v33. So, no update is
> needed in this thread.

Is there any reason to continue with two separate threads and CF entries ?
The original reason was to have a smaller patch for considerate late in v15.

But right now, it just seems to cause every update to imply two email messages
rather than one.

Since the patch is split into 0001, 0002, 0003, 0004+, both can continue in the
main thread.  The early patches can still be applied independently from each
later patch (the same as with any other patch series).

Also, since this patch series is large, and expects a lot of conflicts, it
seems better to update the cfapp with a "git link" [0] where you can maintain a
rebased branch.  It avoids the need to mail new patches to the list several
times more often than it's being reviewed.

[0] Note that cirrusci will run the same checks as cfbot if you push a branch
to github.

strings: ".. (compression)? is not supported by this build"

2022-07-13 Thread Justin Pryzby

$ git grep 'is not supported by this build' '*c'
src/backend/access/transam/xloginsert.c:
elog(ERROR, "LZ4 is not supported by this build");
src/backend/access/transam/xloginsert.c:
elog(ERROR, "zstd is not supported by this build");
src/backend/access/transam/xloginsert.c:elog(ERROR, 
"LZ4 is not supported by this build");
src/backend/access/transam/xloginsert.c:elog(ERROR, 
"zstd is not supported by this build");
...
src/backend/replication/basebackup_gzip.c:   errmsg("gzip 
compression is not supported by this build")));
src/backend/replication/basebackup_lz4.c:errmsg("lz4 
compression is not supported by this build")));
src/backend/replication/basebackup_zstd.c:   errmsg("zstd 
compression is not supported by this build")));

Should the word "compression" be removed from basebackup, for consistency with
the use in xloginsert.c ?  And "lz4" capitalization changed for consistency (in
one direction or the other).  See 4035cd5d4, e9537321a7, 7cf085f07.  Maybe zstd
should also be changed to Zstandard per 586955ddd.

To avoid the extra translation, and allow the compiler to merge strings.

The "binary size" argument wouldn't apply, but note that pg_dump uses this
language:

src/bin/pg_dump/compress_io.c:  pg_fatal("not built with zlib support");

See also some other string messages I mentioned here:
https://www.postgresql.org/message-id/20210622001927.ge29...@telsasoft.com
|+#define NO_LZ4_SUPPORT() \
|+   ereport(ERROR, \
|+   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), \
|+errmsg("unsupported LZ4 compression method"), \
|+errdetail("This functionality requires the server to 
be built with lz4 support."), \
|+errhint("You need to rebuild PostgreSQL using 
--with-lz4.")))
|
|src/bin/pg_dump/pg_backup_archiver.c:   fatal("cannot 
restore from compressed archive (compression not supported in this 
installation)");
|src/bin/pg_dump/pg_backup_archiver.c:   pg_log_warning("archive is 
compressed, but this installation does not support compression -- no data will 
be available");
|src/bin/pg_dump/pg_dump.c:  pg_log_warning("requested compression 
not available in this installation -- archive will be uncompressed");

-- 
Justin

Re: Making Vars outer-join aware

2022-07-13 Thread Tom Lane

Richard Guo  writes:
> But I'm not sure which is better, to evaluate the expression below or
> above the outer join. It seems to me that if the size of base rel is
> large and somehow the size of the joinrel is small, evaluation above the
> outer join would win. And in the opposite case evaluation below the
> outer join would be better.

Reasonable question.  But I think for the purposes of this patch,
it's better to keep the old behavior as much as we can.  People
have probably relied on it while tuning queries.  (I'm not saying
it has to be *exactly* bug-compatible, but simple cases like your
example probably should work the same.)

regards, tom lane

Re: Building PostgreSQL in external directory is broken?

2022-07-13 Thread Tom Lane

Aleksander Alekseev  writes:
>> Could you give an example of when this can be useful?

> And now I can answer my own question. I can move all shell scripts I
> typically use for the development from the repository and be sure they
> are not going to be deleted by accident (by `git clean`, for
> instance).

FWIW, I gather that the upcoming switch to the meson build system
will result in *requiring* use of outside-the-source-tree builds.

regards, tom lane

Re: Support for grabbing multiple consecutive values with nextval()

2022-07-13 Thread Ronan Dunklau

> It is Friday here, so I would easily miss something..  It is possible
> to use COPY FROM with a list of columns, so assuming that you could
> use a default expression with nextval() or just a SERIAL column not
> listed in the COPY FROM query to do the job, what do we gain with this
> feature?  In which aspect does the preallocation of a range handled
> on the client side after being allocated in the backend make things
> better?

The problem the author wants to solve is the fact they don't have a way of 
returning the ids when using COPY FROM. Pre-allocating them and assigning them 
to the individual records before sending them via COPY FROM would solve that 
for them.

-- 
Ronan Dunklau

Re: Building PostgreSQL in external directory is broken?

2022-07-13 Thread Aleksander Alekseev

Alvaro, Alexander,

> Please, check Alvaro's advise to run "git clean -dfx".  Helped to me.

Thanks, `git clean -dfx` did the trick!

> Could you give an example of when this can be useful?

And now I can answer my own question. I can move all shell scripts I
typically use for the development from the repository and be sure they
are not going to be deleted by accident (by `git clean`, for
instance).

Very convenient.

-- 
Best regards,
Aleksander Alekseev

Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-13 Thread Jonathan S. Katz


Hi David,

On 7/13/22 12:13 AM, David Rowley wrote:

On Tue, 12 Jul 2022 at 17:15, David Rowley  wrote:

So far only Robert has raised concerns with this regression for PG15
(see [2]). Tom voted for leaving things as they are for PG15 in [3].
John agrees, as quoted above. Does anyone else have any opinion?


Let me handle this slightly differently.  I've moved the open item for
this into the "won't fix" section.  If nobody shouts at me for that
then I'll let that end the debate.  Otherwise, we can consider the
argument when it arrives.


The RMT discussed this issue at its meeting today (and a few weeks back 
-- apologies for not writing sooner). While we agree with your analysis 
that 1/ this issue does appear to be a corner case and 2/ the benefits 
outweigh the risks, we still don't know how prevalent it may be in the 
wild and the general impact to user experience.


The RMT suggests that you make one more pass at attempting to solve it. 
If there does not appear to be a clear path forward, we should at least 
document how a user can detect and resolve the issue.


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature

Re: Bug: Reading from single byte character column type may cause out of bounds memory reads.

2022-07-13 Thread Aleksander Alekseev

Hi Spyridon,

> The column "single_byte_col" is supposed to store only 1 byte. Nevertheless, 
> the INSERT command implicitly casts the '🀆' text into "char". This means that 
> only the first byte of '🀆' ends up stored in the column.
> gdb reports that "pg_mblen(p) = 4" (line 1046), which is expected since the 
> pg_mblen('🀆') is indeed 4. Later at line 1050, the memcpy will copy 4 bytes 
> instead of 1, hence an out of bounds memory read happens for pointer 's', 
> which effectively copies random bytes.

Many thanks for reporting this!

> - OS: Ubuntu 20.04
> - PSQL version 14.4

I can confirm the bug exists in the `master` branch as well and
doesn't depend on the platform.

Although the bug is easy to fix for this particular case (see the
patch) I'm not sure if this solution is general enough. E.g. is there
something that generally prevents pg_mblen() from doing out of bound
reading in cases similar to this one? Should we prevent such an INSERT
from happening instead?

-- 
Best regards,
Aleksander Alekseev


v1-0001-Fix-out-of-bounds-memory-reads-in-text_substring.patch
Description: Binary data

Re: Building PostgreSQL in external directory is broken?

2022-07-13 Thread Alexander Korotkov

On Wed, Jul 13, 2022 at 3:12 PM Aleksander Alekseev
 wrote:
> > That's the short version. The longer version² does claim it's supported:
>
> You are right, I missed this. Thanks!
>
> Regarding these errors:
>
> > ar: cryptohash.o: No such file or directory
> > ar: hmac.o: No such file or directory
> > ...
>
> This has something to do with the particular choice of the ./configure
> flags on a given platform. If I omit:
>
> > --disable-debug --disable-cassert --enable-tap-tests
>
> I get completely different errors:
>
> > clang: error: no such file or directory: 'replication/backup_manifest.o'
> > clang: error: no such file or directory: 'replication/basebackup.o'
>
> Apparently this should be checked carefully with different configure
> flags combinations if we are going to continue maintaining this.

Please, check Alvaro's advise to run "git clean -dfx".  Helped to me.

--
Regards,
Alexander Korotkov

Re: Building PostgreSQL in external directory is broken?

2022-07-13 Thread Alexander Korotkov

On Wed, Jul 13, 2022 at 3:19 PM Alvaro Herrera  wrote:
> On 2022-Jul-13, Alexander Korotkov wrote:
>
> > results in an error
> >
> > .../src/pgbld/../postgresql/src/include/utils/elog.h:73:10: fatal
> > error: utils/errcodes.h: No such file or directory
> >73 | #include "utils/errcodes.h"
> >
> >   |  ^~
>
> Probably what is happening here is that you have build artifacts in the
> source tree after having built there, and that confuses make so not
> everything is rebuilt correctly when you call it from the external build
> dit.  I suggest to "git clean -dfx" your source tree, then you can rerun
> configure/make from the external builddir.
>
> FWIW building in external dirs works fine.  I use it all the time.

You are right.  I made "make distclean" which appears to be not
enough.  With "git clean -dfx" correct symlink is generated.  Sorry
for the noise.

--
Regards,
Alexander Korotkov

Re: Building PostgreSQL in external directory is broken?

2022-07-13 Thread Alvaro Herrera

On 2022-Jul-13, Alexander Korotkov wrote:

> results in an error
> 
> .../src/pgbld/../postgresql/src/include/utils/elog.h:73:10: fatal
> error: utils/errcodes.h: No such file or directory
>73 | #include "utils/errcodes.h"
> 
>   |  ^~

Probably what is happening here is that you have build artifacts in the
source tree after having built there, and that confuses make so not
everything is rebuilt correctly when you call it from the external build
dit.  I suggest to "git clean -dfx" your source tree, then you can rerun
configure/make from the external builddir.

FWIW building in external dirs works fine.  I use it all the time.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Once again, thank you and all of the developers for your hard work on
PostgreSQL.  This is by far the most pleasant management experience of
any database I've worked on." (Dan Harris)
http://archives.postgresql.org/pgsql-performance/2006-04/msg00247.php

Re: Building PostgreSQL in external directory is broken?

2022-07-13 Thread Aleksander Alekseev

Hi Ilmari,

> That's the short version. The longer version² does claim it's supported:

You are right, I missed this. Thanks!

Regarding these errors:

> ar: cryptohash.o: No such file or directory
> ar: hmac.o: No such file or directory
> ...

This has something to do with the particular choice of the ./configure
flags on a given platform. If I omit:

> --disable-debug --disable-cassert --enable-tap-tests

I get completely different errors:

> clang: error: no such file or directory: 'replication/backup_manifest.o'
> clang: error: no such file or directory: 'replication/basebackup.o'

Apparently this should be checked carefully with different configure
flags combinations if we are going to continue maintaining this.

-- 
Best regards,
Aleksander Alekseev

Bug: Reading from single byte character column type may cause out of bounds memory reads.

2022-07-13 Thread Spyridon Dimitrios Agathos

Hi hackers,

While I was writing a test for PSQL, I faced a weird scenario. Depending on
how I build PSQL (enabling or not debug options), I saw different results
for the following query.

Steps to reproduce:
- OS: Ubuntu 20.04
- PSQL version 14.4

CREATE TABLE test (single_byte_col "char");
INSERT INTO test (single_byte_col) VALUES ('🀆');

In case where the following query runs in debug mode, I got this result:
SELECT left(single_byte_col, 1) l FROM test;
  l
---
 ~\x7F\x7F
(1 row)

Once I turn off debug mode, I got:
SELECT left(single_byte_col, 1) l FROM test;
 l
---

(1 row)

That triggered me to use Valgrind, which reported the following error:

==00:00:03:22.867 1171== VALGRINDERROR-BEGIN
==00:00:03:22.867 1171== Invalid read of size 1
==00:00:03:22.867 1171== at 0x4C41209: memmove (vg_replace_strmem.c:1382)
==00:00:03:22.868 1171== by 0xC8D8D0: text_substring
(../src/backend/utils/adt/varlena.c:1050)
==00:00:03:22.868 1171== by 0xC9AC78: text_left
(../src/backend/utils/adt/varlena.c:5614)
==00:00:03:22.868 1171== by 0x7FBB75: ExecInterpExpr
(../src/backend/executor/execExprInterp.c:749)
==00:00:03:22.868 1171== by 0x7FADA6: ExecInterpExprStillValid
(../src/backend/executor/execExprInterp.c:1824)
==00:00:03:22.868 1171== by 0x81C0AA: ExecEvalExprSwitchContext
(../src/include/executor/executor.h:339)
==00:00:03:22.868 1171== by 0x81BD96: ExecProject
(../src/include/executor/executor.h:373)
==00:00:03:22.868 1171== by 0x81B8BE: ExecScan
(../src/backend/executor/execScan.c:238)
==00:00:03:22.868 1171== by 0x8638BA: ExecSeqScan
(../src/backend/executor/nodeSeqscan.c:112)
==00:00:03:22.868 1171== by 0x8170D4: ExecProcNodeFirst
(../src/backend/executor/execProcnode.c:465)
==00:00:03:22.868 1171== by 0x80F291: ExecProcNode
(../src/include/executor/executor.h:257)
==00:00:03:22.868 1171== by 0x80A8F0: ExecutePlan
(../src/backend/executor/execMain.c:1551)
==00:00:03:22.868 1171== by 0x80A7A3: standard_ExecutorRun
(../src/backend/executor/execMain.c:361)
==00:00:03:22.868 1171== by 0x136231BC: pgss_ExecutorRun
(../contrib/pg_stat_statements/pg_stat_statements.c:1001)
==00:00:03:22.868 1171== by 0x80A506: ExecutorRun
(../src/backend/executor/execMain.c:303)
==00:00:03:22.868 1171== by 0xAD71B0: PortalRunSelect
(../src/backend/tcop/pquery.c:921)
==00:00:03:22.868 1171== by 0xAD6B75: PortalRun
(../src/backend/tcop/pquery.c:765)
==00:00:03:22.868 1171== by 0xAD1B3C: exec_simple_query
(../src/backend/tcop/postgres.c:1214)
==00:00:03:22.868 1171== by 0xAD0D7F: PostgresMain
(../src/backend/tcop/postgres.c:4496)
==00:00:03:22.868 1171== by 0x9D6C79: BackendRun
(../src/backend/postmaster/postmaster.c:4530)
==00:00:03:22.868 1171== by 0x9D61A3: BackendStartup
(../src/backend/postmaster/postmaster.c:4252)
==00:00:03:22.868 1171== by 0x9D4F56: ServerLoop
(../src/backend/postmaster/postmaster.c:1745)
==00:00:03:22.868 1171== by 0x9D23A4: PostmasterMain
(../src/backend/postmaster/postmaster.c:1417)
==00:00:03:22.868 1171== by 0x8AA0D2: main (../src/backend/main/main.c:209)
==00:00:03:22.868 1171== Address 0x3ed1f3c5 is 293 bytes inside a block of
size 8,192 alloc'd
==00:00:03:22.868 1171== at 0x4C37135: malloc (vg_replace_malloc.c:381)
==00:00:03:22.868 1171== by 0xD1C3C0: AllocSetContextCreateInternal
(../src/backend/utils/mmgr/aset.c:469)
==00:00:03:22.868 1171== by 0x821CB4: CreateExprContextInternal
(../src/backend/executor/execUtils.c:253)
==00:00:03:22.868 1171== by 0x821BE2: CreateExprContext
(../src/backend/executor/execUtils.c:303)
==00:00:03:22.868 1171== by 0x822078: ExecAssignExprContext
(../src/backend/executor/execUtils.c:482)
==00:00:03:22.868 1171== by 0x8637D6: ExecInitSeqScan
(../src/backend/executor/nodeSeqscan.c:147)
==00:00:03:22.868 1171== by 0x816B2D: ExecInitNode
(../src/backend/executor/execProcnode.c:211)
==00:00:03:22.868 1171== by 0x80A336: InitPlan
(../src/backend/executor/execMain.c:936)
==00:00:03:22.868 1171== by 0x809BE7: standard_ExecutorStart
(../src/backend/executor/execMain.c:263)
==00:00:03:22.868 1171== by 0x1362303C: pgss_ExecutorStart
(../contrib/pg_stat_statements/pg_stat_statements.c:963)
==00:00:03:22.868 1171== by 0x809881: ExecutorStart
(../src/backend/executor/execMain.c:141)
==00:00:03:22.868 1171== by 0xAD636A: PortalStart
(../src/backend/tcop/pquery.c:514)
==00:00:03:22.868 1171== by 0xAD1A31: exec_simple_query
(../src/backend/tcop/postgres.c:1175)
==00:00:03:22.868 1171== by 0xAD0D7F: PostgresMain
(../src/backend/tcop/postgres.c:4496)
==00:00:03:22.868 1171== by 0x9D6C79: BackendRun
(../src/backend/postmaster/postmaster.c:4530)
==00:00:03:22.868 1171== by 0x9D61A3: BackendStartup
(../src/backend/postmaster/postmaster.c:4252)
==00:00:03:22.868 1171== by 0x9D4F56: ServerLoop
(../src/backend/postmaster/postmaster.c:1745)
==00:00:03:22.868 1171== by 0x9D23A4: PostmasterMain
(../src/backend/postmaster/postmaster.c:1417)
==00:00:03:22.868 1171== by 0x8AA0D2: main (../src/backend/main/main.c:209)
==00:00:03:22.868 1171==
==00:00:03:22.868 1171== VALGRINDERROR-END

Then

Re: Building PostgreSQL in external directory is broken?

2022-07-13 Thread Dagfinn Ilmari Mannsåker

Aleksander Alekseev  writes:

> Hi Alexander,
>
> To be honest, this is the first time I see anyone trying to build a
> project that is using Autotools from an external directory :)  I
> checked the documentation [1] and it doesn't seem that we claim to
> support this.
>
> [1]: https://www.postgresql.org/docs/current/install-short.html

That's the short version. The longer version² does claim it's supported:

You can also run configure in a directory outside the source tree,
and then build there, if you want to keep the build directory
separate from the original source files. This procedure is called a
VPATH build. Here's how:

mkdir build_dir
cd build_dir
/path/to/source/tree/configure [options go here]
make

- ilmari

[2] https://www.postgresql.org/docs/current/install-procedure.html

Re: Eliminating SPI from RI triggers - take 2

2022-07-13 Thread Amit Langote

On Sat, Jul 9, 2022 at 1:15 AM Robert Haas  wrote:
> On Fri, Jul 1, 2022 at 2:23 AM Amit Langote  wrote:
> > So, I hacked together a patch (attached 0001) that invents an "RI
> > plan" construct (struct RIPlan) to replace the use of an "SPI plan"
> > (struct _SPI_plan).
> >
> > With that in place, I decided to rebase my previous patch [1] to use
> > this new interface and the result is attached 0002.
>

Thanks for taking a look at this.  I'll try to respond to other points
in a separate email, but I wanted to clarify something about below:

> I find my ego slightly wounded by the comment that "the partition
> descriptor machinery has a hack that assumes that the queries
> originating in this module push the latest snapshot in the
> transaction-snapshot mode." It's true that the partition descriptor
> machinery gives different answers depending on the active snapshot,
> but, err, is that a hack, or just a perfectly reasonable design
> decision?

I think my calling it a hack of "partition descriptor machinery" is
not entirely fair (sorry), because it's talking about the following
comment in find_inheritance_children_extended(), which describes it as
being a hack, so I mentioned the word "hack" in my comment too:

/*
 * Cope with partitions concurrently being detached.  When we see a
 * partition marked "detach pending", we omit it from the returned set
 * of visible partitions if caller requested that and the tuple's xmin
 * does not appear in progress to the active snapshot.  (If there's no
 * active snapshot set, that means we're not running a user query, so
 * it's OK to always include detached partitions in that case; if the
 * xmin is still running to the active snapshot, then the partition
 * has not been detached yet and so we include it.)
 *
 * The reason for this hack is that we want to avoid seeing the
 * partition as alive in RI queries during REPEATABLE READ or
 * SERIALIZABLE transactions: such queries use a different snapshot
 * than the one used by regular (user) queries.
 */

That bit came in to make DETACH CONCURRENTLY produce sane answers for
RI queries in some cases.

I guess my comment should really have said something like:

HACK: find_inheritance_children_extended() has a hack that assumes
that the queries originating in this module push the latest snapshot
in transaction-snapshot mode.

> An alternative might be for PartitionDirectoryLookup to take
> a snapshot as an explicit argument rather than relying on the global
> variable to get that information from context. I generally feel that
> we rely too much on global variables where we should be passing around
> explicit parameters, so if you're just arguing that explicit
> parameters would be better here, then I agree and just didn't think of
> it. If you're arguing that making the answer depend on the snapshot is
> itself a bad idea, I don't agree with that.

No, I'm not arguing that using a snapshot there is wrong and haven't
really thought hard about an alternative.

I tend to agree passing a snapshot explicitly might be better than
using ActiveSnapshot stuff for this.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

Re: [RFC] building postgres with meson -v9

2022-07-13 Thread Peter Eisentraut


On 06.07.22 15:21, Andres Freund wrote:

bda6a45bae meson: prereq: Refactor PG_TEST_EXTRA logic in autoconf build

I understand the intention behind this, but I think it changes the
behavior in an undesirable way.  Before this patch, you can go into
src/test/ssl/ and run make check manually.  This was indeed the only
way to do it before PG_TEST_EXTRA.  With this patch, this would now
skip all the tests unless you set PG_TEST_EXTRA, even if you run the
specific test directly.

It's not a free lunch, I agree. But I think the downsides outweigh the upsides
by far. Not seeing that tests were skipped in the test output is quite
problematic imo. And with meson's testrunner we're going to need something
that deals with skipping these tests - and it's more important to have them
skipped, so that they show up in the test summary.

It's not like it's hard to set PG_TEST_EXTRA for a single command invocation?


It's probably ok.  I have it set in my environment all the time anyway, 
so it wouldn't affect me.  But it's the sort of thing people might be 
particular about, so I thought it was worth pointing out.

Re: Building PostgreSQL in external directory is broken?

2022-07-13 Thread Aleksander Alekseev

Hi Alexander,

> Assuming postgres sources located in postgresql directory, the
> following sequence of commands
>
> mkdir -p pgbld
> cd pgbld
> ../postgresql/configure --disable-debug --disable-cassert --enable-tap-tests
> make -j4
>
> ...
>
> It seems strange to me that I'm the first one discovering this.  Am I
> missing something?

To be honest, this is the first time I see anyone trying to build a
project that is using Autotools from an external directory :)  I
checked the documentation [1] and it doesn't seem that we claim to
support this.

Also I tried your patch on MacOS Monterey 12.4 and it didn't work. I
get the following error:

```
...
ar: cryptohash.o: No such file or directory
ar: hmac.o: No such file or directory
ar: sha1.o: No such file or directory
ar: sha2.o: No such file or directory
```

... with or without the patch.

My guess would be that the reason no one discovered this before is
that this is in fact not supported and/or tested on CI.

Could you give an example of when this can be useful?

[1]: https://www.postgresql.org/docs/current/install-short.html

-- 
Best regards,
Aleksander Alekseev

Re: Handle infinite recursion in logical replication setup

2022-07-13 Thread Dilip Kumar

On Tue, Jul 12, 2022 at 2:58 PM vignesh C  wrote:
>
> On Tue, Jul 12, 2022 at 9:51 AM Amit Kapila  wrote:

I find one thing  confusing about this patch.  Basically, this has two
option 'local' and 'any', so I would assume that all the local server
changes should be covered under the 'local' but now if we set some
origin using 'select pg_replication_origin_session_setup('aa');' then
changes from that session will be ignored because it has an origin id.
I think actually the name is creating confusion, because by local it
seems like a change which originated locally and the document is also
specifying the same.

+   If local, the subscription will request the publisher
+   to only send changes that originated locally. If any,

I think if we want to keep the option local then we should look up all
the origin in the replication origin catalog and identify whether it
is a local origin id or remote origin id and based on that filter out
the changes.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Building PostgreSQL in external directory is broken?

2022-07-13 Thread Alexander Korotkov

Hackers,

I usually build the PostgreSQL from the sources directory.  But I've
heard a complaint that PostgreSQL can't be built in the external
directory.

Assuming postgres sources located in postgresql directory, the
following sequence of commands

mkdir -p pgbld
cd pgbld
../postgresql/configure --disable-debug --disable-cassert --enable-tap-tests
make -j4

results in an error

.../src/pgbld/../postgresql/src/include/utils/elog.h:73:10: fatal
error: utils/errcodes.h: No such file or directory
   73 | #include "utils/errcodes.h"

  |  ^~

I've discovered this a bit.  It appears that Makefile generates
src/backend/utils/errcodes.h in the build directory, but symlinks
src/include/utils/errcodes.h to the sources directory.  That is
src/include/utils/errcodes.h appears to be a broken symlink.  I've
written a short patch to fix that.

It seems strange to me that I'm the first one discovering this.  Am I
missing something?

--
Regards,
Alexander Korotkov


fix_errcodes_external_build.patch
Description: Binary data

Re: CREATE TABLE ( .. STORAGE ..)

2022-07-13 Thread Aleksander Alekseev

Hi,

> Committed.
>
> I thought the removal of the documentation details of SET COMPRESSION
> and SET STORAGE from the ALTER TABLE ref page was a bit excessive, since
> that material actually contained useful information about what happens
> when you change compression or storage on a table with existing data.
> So I left that in.  Maybe there is room to deduplicate that material a
> bit, but it would need to be more fine-grained than just removing one
> side of it.

Thanks, Peter!

-- 
Best regards,
Aleksander Alekseev

Re: CREATE TABLE ( .. STORAGE ..)

2022-07-13 Thread Peter Eisentraut


On 12.07.22 12:10, Aleksander Alekseev wrote:

Hi Peter,


The "safety check: do not allow toasted storage modes unless column
datatype is TOAST-aware" could be moved into GetAttributeStorage(), so
it doesn't have to be repeated.  (Note that GetAttributeCompression()
does similar checking.)


Good point. Fixed.


ATExecSetStorage() currently doesn't do any such check, and your patch
isn't adding one.  Is there a reason for that?


ATExecSetStorage() does this, but the check is a bit below [1]. In v7
I moved the check to GetAttributeStorage() as well.

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/tablecmds.c#l8312


Committed.

I thought the removal of the documentation details of SET COMPRESSION 
and SET STORAGE from the ALTER TABLE ref page was a bit excessive, since 
that material actually contained useful information about what happens 
when you change compression or storage on a table with existing data. 
So I left that in.  Maybe there is room to deduplicate that material a 
bit, but it would need to be more fine-grained than just removing one 
side of it.

Re: [PATCH] Completed unaccent dictionary with many missing characters

2022-07-13 Thread Przemysław Sztoch


Dear Michael P.,
3. The matter is not that simple. When I change priorities (ie 
Latin-ASCII.xml is less important than Unicode decomposition),

then "U + 33D7" changes not to pH but to PH.
In the end, I left it like it was before ...

If you decide what to do with point 3, I will correct it and send new 
patches.

What is your decision?
Option 1: We leave x as in Latin-ASCII.xml and we also have full 
compatibility with previous PostgreSQL versions.

If they fix Latin-ASCII.xml at Unicode, it will fix itself.

Option 2:  We choose a lower priority for entries in Latin-ASCII.xml

I would choose option 1.

P.S. I will be going on vacation and it would be nice to close this 
patch soon. TIA.


--
Przemysław Sztoch | Mobile +48 509 99 00 66

Collect ObjectAddress for ATTACH DETACH PARTITION to use in event trigger

2022-07-13 Thread houzj.f...@fujitsu.com

Hi hackers,

I noticed that we didn't collect the ObjectAddress returned by
ATExec[Attach|Detach]Partition. I think collecting this information can make it
easier for users to get the partition OID of the attached or detached table in
the event trigger. So how about collecting it like the attached patch ?

Best regards,
Hou zhijie



0001-Collect-ObjectAddress-for-ATTACH-DETACH-PARTITION.patch
Description:  0001-Collect-ObjectAddress-for-ATTACH-DETACH-PARTITION.patch

Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-13 Thread vignesh C

On Wed, Jul 13, 2022 at 1:13 PM Michael Paquier  wrote:
>
> On Wed, Jul 13, 2022 at 12:22:06PM +0530, vignesh C wrote:
> > Most of the code is common between GetSubscriptionRelations and
> > GetSubscriptionNotReadyRelations. Added a parameter to
> > GetSubscriptionRelations which could provide the same functionality as
> > the existing GetSubscriptionRelations and
> > GetSubscriptionNotReadyRelations. Attached patch has the changes for
> > the same. Thoughts?
>
> Right.  Using all_rels to mean that we'd filter relations that are not
> ready is a bit confusing, though.  Perhaps this could use a bitmask as
> argument.

The attached v2 patch has the modified version which includes the
changes to make the argument as bitmask.

Regards,
Vignesh
From 1f67e4b5ff2d304323f262e8acf74d48893ceb06 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Wed, 13 Jul 2022 11:54:59 +0530
Subject: [PATCH v2] Refactor to make use of a common function for
 GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

Most of the code is common between GetSubscriptionRelations and
GetSubscriptionNotReadyRelations. Added a parameter to GetSubscriptionRelations
which could provide the same functionality for GetSubscriptionRelations
and GetSubscriptionNotReadyRelations.
---
 src/backend/catalog/pg_subscription.c   | 71 -
 src/backend/commands/subscriptioncmds.c |  5 +-
 src/backend/replication/logical/tablesync.c |  3 +-
 src/include/catalog/pg_subscription.h   |  3 +
 src/include/catalog/pg_subscription_rel.h   |  3 +-
 5 files changed, 20 insertions(+), 65 deletions(-)

diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 8856ce3b50..cd7dc00c79 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -525,65 +525,15 @@ HasSubscriptionRelations(Oid subid)
 }
 
 /*
- * Get all relations for subscription.
+ * Get the relations for subscription.
  *
- * Returned list is palloc'ed in current memory context.
+ * If subrel_options is not set, return all the relations for subscription. If
+ * SUBSCRIPTION_REL_STATE_NOT_READY bit is set in subrel_options, return all
+ * the relations for subscription that are not in a ready state. Returned list
+ * is palloc'ed in current memory context.
  */
 List *
-GetSubscriptionRelations(Oid subid)
-{
-	List	   *res = NIL;
-	Relation	rel;
-	HeapTuple	tup;
-	ScanKeyData skey[1];
-	SysScanDesc scan;
-
-	rel = table_open(SubscriptionRelRelationId, AccessShareLock);
-
-	ScanKeyInit(&skey[0],
-Anum_pg_subscription_rel_srsubid,
-BTEqualStrategyNumber, F_OIDEQ,
-ObjectIdGetDatum(subid));
-
-	scan = systable_beginscan(rel, InvalidOid, false,
-			  NULL, 1, skey);
-
-	while (HeapTupleIsValid(tup = systable_getnext(scan)))
-	{
-		Form_pg_subscription_rel subrel;
-		SubscriptionRelState *relstate;
-		Datum		d;
-		bool		isnull;
-
-		subrel = (Form_pg_subscription_rel) GETSTRUCT(tup);
-
-		relstate = (SubscriptionRelState *) palloc(sizeof(SubscriptionRelState));
-		relstate->relid = subrel->srrelid;
-		relstate->state = subrel->srsubstate;
-		d = SysCacheGetAttr(SUBSCRIPTIONRELMAP, tup,
-			Anum_pg_subscription_rel_srsublsn, &isnull);
-		if (isnull)
-			relstate->lsn = InvalidXLogRecPtr;
-		else
-			relstate->lsn = DatumGetLSN(d);
-
-		res = lappend(res, relstate);
-	}
-
-	/* Cleanup */
-	systable_endscan(scan);
-	table_close(rel, AccessShareLock);
-
-	return res;
-}
-
-/*
- * Get all relations for subscription that are not in a ready state.
- *
- * Returned list is palloc'ed in current memory context.
- */
-List *
-GetSubscriptionNotReadyRelations(Oid subid)
+GetSubscriptionRelations(Oid subid, bits32 subrel_state_options)
 {
 	List	   *res = NIL;
 	Relation	rel;
@@ -599,10 +549,11 @@ GetSubscriptionNotReadyRelations(Oid subid)
 BTEqualStrategyNumber, F_OIDEQ,
 ObjectIdGetDatum(subid));
 
-	ScanKeyInit(&skey[nkeys++],
-Anum_pg_subscription_rel_srsubstate,
-BTEqualStrategyNumber, F_CHARNE,
-CharGetDatum(SUBREL_STATE_READY));
+	if ((subrel_state_options & SUBSCRIPTION_REL_STATE_NOT_READY) != 0)
+		ScanKeyInit(&skey[nkeys++],
+	Anum_pg_subscription_rel_srsubstate,
+	BTEqualStrategyNumber, F_CHARNE,
+	CharGetDatum(SUBREL_STATE_READY));
 
 	scan = systable_beginscan(rel, InvalidOid, false,
 			  NULL, nkeys, skey);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index bdc1208724..d9fb235ce5 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -785,7 +785,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 		pubrel_names = fetch_table_list(wrconn, sub->publications);
 
 		/* Get local table list. */
-		subrel_states = GetSubscriptionRelations(sub->oid);
+		subrel_states = GetSubscriptionRelations(sub->oid, 0);
 
 		/*
 		 * Build qsorted array of local table oids for faster lookup. This can
@@ -1457,7 +1457,8 @@ DropSubscription(DropSubsc

Re: i.e. and e.g.

2022-07-13 Thread Aleksander Alekseev

Hi Kyotaro,

> Oops! There's another use of that word in the same context.
> Attached contains two fixes.

Good catch. I did a quick search for similar messages and apparently
there are no others to fix.

-- 
Best regards,
Aleksander Alekseev

Re: i.e. and e.g.

2022-07-13 Thread Junwang Zhao

make sense.

+1

On Wed, Jul 13, 2022 at 5:14 PM Kyotaro Horiguchi
 wrote:
>
> At Wed, 13 Jul 2022 18:09:43 +0900 (JST), Kyotaro Horiguchi 
>  wrote in
> > I happened to see the message below.
> >
> > > WARNING:  new data directory should not be inside the old data directory, 
> > > e.g. %s
> >
> > The corresponding code is
> >
> > > ... the old data directory, e.g. %s", old_cluster_pgdata);
> >
> > So, "e.g." (for example) in the message sounds like "that is", which I
> > think is "i.e.".  It should be fixed if this is correct.  I'm not sure
> > whether to keep using Latin-origin acronyms like this, but in the
> > attached I used "i.e.".
>
> Oops! There's another use of that word in the same context.
> Attached contains two fixes.
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center



-- 
Regards
Junwang Zhao

Re: i.e. and e.g.

2022-07-13 Thread Kyotaro Horiguchi

At Wed, 13 Jul 2022 18:09:43 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> I happened to see the message below.
> 
> > WARNING:  new data directory should not be inside the old data directory, 
> > e.g. %s
> 
> The corresponding code is
> 
> > ... the old data directory, e.g. %s", old_cluster_pgdata);
> 
> So, "e.g." (for example) in the message sounds like "that is", which I
> think is "i.e.".  It should be fixed if this is correct.  I'm not sure
> whether to keep using Latin-origin acronyms like this, but in the
> attached I used "i.e.".

Oops! There's another use of that word in the same context.
Attached contains two fixes.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fda744911d..01ede1e4ad 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -540,7 +540,7 @@ create_script_for_old_cluster_deletion(char **deletion_script_file_name)
 	if (path_is_prefix_of_path(old_cluster_pgdata, new_cluster_pgdata))
 	{
 		pg_log(PG_WARNING,
-			   "\nWARNING:  new data directory should not be inside the old data directory, e.g. %s", old_cluster_pgdata);
+			   "\nWARNING:  new data directory should not be inside the old data directory, i.e. %s", old_cluster_pgdata);
 
 		/* Unlink file in case it is left over from a previous run. */
 		unlink(*deletion_script_file_name);
@@ -564,7 +564,7 @@ create_script_for_old_cluster_deletion(char **deletion_script_file_name)
 		{
 			/* reproduce warning from CREATE TABLESPACE that is in the log */
 			pg_log(PG_WARNING,
-   "\nWARNING:  user-defined tablespace locations should not be inside the data directory, e.g. %s", old_tablespace_dir);
+   "\nWARNING:  user-defined tablespace locations should not be inside the data directory, i.e. %s", old_tablespace_dir);
 
 			/* Unlink file in case it is left over from a previous run. */
 			unlink(*deletion_script_file_name);

i.e. and e.g.

2022-07-13 Thread Kyotaro Horiguchi

I happened to see the message below.

> WARNING:  new data directory should not be inside the old data directory, 
> e.g. %s

The corresponding code is

> ... the old data directory, e.g. %s", old_cluster_pgdata);

So, "e.g." (for example) in the message sounds like "that is", which I
think is "i.e.".  It should be fixed if this is correct.  I'm not sure
whether to keep using Latin-origin acronyms like this, but in the
attached I used "i.e.".

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fda744911d..65cec7bf2b 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -540,7 +540,7 @@ create_script_for_old_cluster_deletion(char **deletion_script_file_name)
 	if (path_is_prefix_of_path(old_cluster_pgdata, new_cluster_pgdata))
 	{
 		pg_log(PG_WARNING,
-			   "\nWARNING:  new data directory should not be inside the old data directory, e.g. %s", old_cluster_pgdata);
+			   "\nWARNING:  new data directory should not be inside the old data directory, i.e. %s", old_cluster_pgdata);
 
 		/* Unlink file in case it is left over from a previous run. */
 		unlink(*deletion_script_file_name);

Re: Add connection active, idle time to pg_stat_activity

2022-07-13 Thread Aleksander Alekseev

Hi again,

> 57033 (master) =# select * from pg_stat_activity where pid = 57033;
> ...
> total_active_time  | 2514.635
> total_idle_in_transaction_time | 2314.703
>
> 57033 (master) =# COMMIT;
> 57033 (master) =# select * from pg_stat_activity where pid = 57033;
> ...
> total_active_time  | 22.048
> total_idle_in_transaction_time | 7300.911
> ```

My previous message was wrong, total_active_time doesn't track
seconds. I got confused by the name of this column. Still I'm pretty
confident it shouldn't decrease.

-- 
Best regards,
Aleksander Alekseev

Re: Add connection active, idle time to pg_stat_activity

2022-07-13 Thread Aleksander Alekseev

Rafia, Sergey,

Many thanks for working on this!

> I have incorporated most of the suggestions into the patch. I have also 
> rebased and tested the patch on top of the current master

I noticed that this patch is marked as "Needs Review" and decided to
take a look.

I believe there is a bug in the implementation. Here is what I did:

```
57033 (master) =# select * from pg_stat_activity where pid = 57033;
...
total_active_time  | 9.128
total_idle_in_transaction_time | 0

57033 (master) =# select * from pg_stat_activity where pid = 57033;
...
total_active_time  | 10.626
total_idle_in_transaction_time | 0

57033 (master) =# BEGIN;
57033 (master) =# select * from pg_stat_activity where pid = 57033;
...
total_active_time  | 17.443
total_idle_in_transaction_time | 2314.703

57033 (master) =# select * from pg_stat_activity where pid = 57033;
...
total_active_time  | 2514.635
total_idle_in_transaction_time | 2314.703

57033 (master) =# COMMIT;
57033 (master) =# select * from pg_stat_activity where pid = 57033;
...
total_active_time  | 22.048
total_idle_in_transaction_time | 7300.911
```

So it looks like total_active_time tracks seconds when a user executes
single expressions and milliseconds when running a transaction. It
should always track milliseconds.

Please use `git format-patch` for the next patch and provide a commit
message, as it was previously pointed out by Bharath. Please specify
the list of the authors and reviewers and add a note about
incrementing the catalog version.

-- 
Best regards,
Aleksander Alekseev

Re: "ERROR: latch already owned" on gharial

2022-07-13 Thread Alvaro Herrera

On 2022-Jul-13, Sandeep Thakkar wrote:

> Thanks Robert.
> 
> We are receiving the alerts from buildfarm-admins for anole and gharial not
> reporting. Who can help to stop these? Thanks

Probably Andrew knows how to set buildsystems.no_alerts for these
animals.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)

RE: Add --{no-,}bypassrls flags to createuser

2022-07-13 Thread Shinoda, Noriyoshi (PN Japan FSIP)

Hi,
Thanks to the developers and reviewers.
The attached small patch fixes the message in "createuser --help" command. The 
patch has changed to specify a time stamp for the --valid-for option. I don't 
think the SGML description needs to be modified.

Regards,
Noriyoshi Shinoda
-Original Message-
From: Michael Paquier  
Sent: Wednesday, July 13, 2022 12:25 PM
To: Kyotaro Horiguchi 
Cc: shinya11.k...@oss.nttdata.com; nathandboss...@gmail.com; 
przemys...@sztoch.pl; david.g.johns...@gmail.com; robertmh...@gmail.com; 
dan...@yesql.se; pgsql-hack...@postgresql.org
Subject: Re: Add --{no-,}bypassrls flags to createuser

On Thu, May 26, 2022 at 04:47:46PM +0900, Kyotaro Horiguchi wrote:
> FWIW, the "fancy" here causes me to think about something likely to 
> cause syntax breakage of the query to be sent.
> 
> createuser -a 'user"1' -a 'user"2' 'user"3'
> createuser -v "2023-1-1'; DROP TABLE public.x; select '" hoge

That would be mostly using spaces here, to make sure that quoting is correctly 
applied.

> BUT, thses should be prevented by the functions enumerated above. So, 
> I don't think we need them.

Mostly.  For example, the test for --valid-until can use a timestamp with 
spaces to validate the use of appendStringLiteralConn().  A second thing is 
that --member was checked, but not --admin, so I have renamed regress_user2 to 
"regress user2" for that to apply a maximum of coverage, and applied the patch.

One thing that I found annoying is that this made the list of options of 
createuser much harder to follow.  That's not something caused by this patch as 
many options have accumulated across the years and there is a kind pattern 
where the connection options were listed first, but I have cleaned up that 
while on it.  A second area where this could be done is createdb, as it could 
be easily expanded if the backend query gains support for more stuff, but that 
can happen when it makes more sense.
--
Michael

createuser_help_v1.diff
Description: createuser_help_v1.diff

Re: Making Vars outer-join aware

2022-07-13 Thread Richard Guo

On Tue, Jul 12, 2022 at 9:37 PM Tom Lane  wrote:

> Richard Guo  writes:
> > Note that the evaluation of expression 'b.j + 1' now occurs below the
> > outer join. Is this something we need to be concerned about?
>
> It seems more formally correct to me, but perhaps somebody would
> complain about possibly-useless expression evals.  We could likely
> re-complicate the logic that inserts PHVs during pullup so that it
> looks for Vars it can apply the nullingrels to.  Maybe there's an
> opportunity to share code with flatten_join_alias_vars?

Yeah, maybe we can extend and leverage the codes in
adjust_standard_join_alias_expression() to do that?

But I'm not sure which is better, to evaluate the expression below or
above the outer join. It seems to me that if the size of base rel is
large and somehow the size of the joinrel is small, evaluation above the
outer join would win. And in the opposite case evaluation below the
outer join would be better.

Thanks
Richard

Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-13 Thread Peter Smith

On Wed, Jul 13, 2022 at 5:43 PM Michael Paquier  wrote:
>
> On Wed, Jul 13, 2022 at 12:22:06PM +0530, vignesh C wrote:
> > Most of the code is common between GetSubscriptionRelations and
> > GetSubscriptionNotReadyRelations. Added a parameter to
> > GetSubscriptionRelations which could provide the same functionality as
> > the existing GetSubscriptionRelations and
> > GetSubscriptionNotReadyRelations. Attached patch has the changes for
> > the same. Thoughts?
>
> Right.  Using all_rels to mean that we'd filter relations that are not
> ready is a bit confusing, though.  Perhaps this could use a bitmask as
> argument.

+1

(or some enum)

--
Kind Regards,
Peter Smith.
Fujitsu Australia.

Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-13 Thread Michael Paquier

On Wed, Jul 13, 2022 at 12:22:06PM +0530, vignesh C wrote:
> Most of the code is common between GetSubscriptionRelations and
> GetSubscriptionNotReadyRelations. Added a parameter to
> GetSubscriptionRelations which could provide the same functionality as
> the existing GetSubscriptionRelations and
> GetSubscriptionNotReadyRelations. Attached patch has the changes for
> the same. Thoughts?

Right.  Using all_rels to mean that we'd filter relations that are not
ready is a bit confusing, though.  Perhaps this could use a bitmask as
argument.
--
Michael


signature.asc
Description: PGP signature

make update-po@master stops at pg_upgrade

2022-07-13 Thread Kyotaro Horiguchi

I'm not sure if it fits -hackers, but seems better than -translators.

I find it annoying that make update-po stops at pg_upgrade on master.
The cause is that a file is renamed from relfilenode.c to
relfilenumber.c so just fixing the name works. (attached first).


On the other hand, basically, every nls.mk contain all *.c files in the
tool's directory plus some in other directories with some exceptions:

a) nls.mk of pg_dump and psql contain some *.h files but they don't
 contain a translatable string.

b) nls.mk of ecpglib is missing some files that contain translatable
 strings. (adding theses files doesn't change the result of make
 update-po, but I'll send another mail for this issue.)

c) nls.mk of pg_waldump, psql, libpq and preproc excludes some *.c
 files which don't contain a translatable string.

At least for (a) above, removing them from nls.mk is rather good.  For
(b), adding them is also good.  For (c), I think few extra files
doesn't make difference.

I wonder if we can use $(wildcard *.c) instead of explicitly
enumerating every *.c file in the directory then maintaining the
list. Since backend does that way, I think we can do that the same way
also for the tools. Attached second does that except for tools that
have only one *.c.  The patch doesn't make a difference in the result
of make update-po.

Either will do for me but at least I'd like (a) applied.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/pg_upgrade/nls.mk b/src/bin/pg_upgrade/nls.mk
index bfaacfbb7c..2488b8c406 100644
--- a/src/bin/pg_upgrade/nls.mk
+++ b/src/bin/pg_upgrade/nls.mk
@@ -2,7 +2,7 @@
 CATALOG_NAME = pg_upgrade
 AVAIL_LANGUAGES  = cs de es fr ja ko ru sv tr uk zh_CN
 GETTEXT_FILES= check.c controldata.c dump.c exec.c file.c function.c \
-   info.c option.c parallel.c pg_upgrade.c relfilenode.c \
+   info.c option.c parallel.c pg_upgrade.c relfilenumber.c \
server.c tablespace.c util.c version.c
 GETTEXT_TRIGGERS = pg_fatal pg_log:2 prep_status prep_status_progress report_status:2
 GETTEXT_FLAGS= \
diff --git a/src/bin/initdb/nls.mk b/src/bin/initdb/nls.mk
index fe7bdfc04a..df3287e95b 100644
--- a/src/bin/initdb/nls.mk
+++ b/src/bin/initdb/nls.mk
@@ -1,6 +1,6 @@
 # src/bin/initdb/nls.mk
 CATALOG_NAME = initdb
 AVAIL_LANGUAGES  = cs de el es fr he it ja ko pl pt_BR ru sv tr uk vi zh_CN
-GETTEXT_FILES= $(FRONTEND_COMMON_GETTEXT_FILES) findtimezone.c initdb.c ../../common/exec.c ../../common/fe_memutils.c ../../common/file_utils.c ../../common/pgfnames.c ../../common/restricted_token.c ../../common/rmtree.c ../../common/username.c ../../common/wait_error.c ../../port/dirmod.c
+GETTEXT_FILES= $(FRONTEND_COMMON_GETTEXT_FILES) $(wildcard *.c) ../../common/exec.c ../../common/fe_memutils.c ../../common/file_utils.c ../../common/pgfnames.c ../../common/restricted_token.c ../../common/rmtree.c ../../common/username.c ../../common/wait_error.c ../../port/dirmod.c
 GETTEXT_TRIGGERS = $(FRONTEND_COMMON_GETTEXT_TRIGGERS) simple_prompt
 GETTEXT_FLAGS= $(FRONTEND_COMMON_GETTEXT_FLAGS)
diff --git a/src/bin/pg_basebackup/nls.mk b/src/bin/pg_basebackup/nls.mk
index ec7393d321..1750e7ba94 100644
--- a/src/bin/pg_basebackup/nls.mk
+++ b/src/bin/pg_basebackup/nls.mk
@@ -2,18 +2,7 @@
 CATALOG_NAME = pg_basebackup
 AVAIL_LANGUAGES  = cs de es fr he it ja ko pl pt_BR ru sv tr uk vi zh_CN
 GETTEXT_FILES= $(FRONTEND_COMMON_GETTEXT_FILES) \
-   bbstreamer_file.c \
-   bbstreamer_gzip.c \
-   bbstreamer_inject.c \
-   bbstreamer_lz4.c \
-   bbstreamer_tar.c \
-   bbstreamer_zstd.c \
-   pg_basebackup.c \
-   pg_receivewal.c \
-   pg_recvlogical.c \
-   receivelog.c \
-   streamutil.c \
-   walmethods.c \
+   $(wildcard *.c) \
../../common/compression.c \
../../common/fe_memutils.c \
../../common/file_utils.c \
diff --git a/src/bin/pg_dump/nls.mk b/src/bin/pg_dump/nls.mk
index dc7360db48..059c862ffb 100644
--- a/src/bin/pg_dump/nls.mk
+++ b/src/bin/pg_dump/nls.mk
@@ -2,12 +2,7 @@
 CATALOG_NAME = pg_dump
 AVAIL_LANGUAGES  = cs de el es fr he it ja ko pl pt_BR ru sv tr uk zh_CN
 GETTEXT_FILES= $(FRONTEND_COMMON_GETTEXT_FILES) \
-   pg_backup_archiver.c pg_backup_db.c pg_backup_custom.c \
-   pg_backup_null.c pg_backup_tar.c \
-   pg_backup_directory.c dumputils.c compress_io.c \
-   pg_dump.c common.c pg_dump_sort.c \
-   pg_restore.c pg_dumpall.c \
-   parallel.c parallel.h pg_backup_utils.c pg_backup_utils.h \
+   $(wildcard *.c) \
../../common/exec.c ../../common/fe_memutils.c \
../../common/wait_err

Re: generic plans and "initial" pruning

2022-07-13 Thread Amit Langote

On Wed, Jul 13, 2022 at 3:40 PM Amit Langote  wrote:
> Rebased over 964d01ae90c.

Sorry, left some pointless hunks in there while rebasing.  Fixed in
the attached.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


v19-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch
Description: Binary data


v19-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch
Description: Binary data

94 matches

Mail list logo