Re: [PoC] Improve dead tuple storage for lazy vacuum

2023-07-03 Thread Masahiko Sawada
On Tue, Jun 27, 2023 at 5:20 PM Masahiko Sawada  wrote:
>
> On Fri, Jun 23, 2023 at 6:54 PM John Naylor
>  wrote:
> >
> >
> > I wrote:
> > > I cleaned up a few things and attached v34 so you can do that if you like.
> >
> > Of course, "clean" is a relative term. While making a small bit of progress 
> > working in tidbitmap.c earlier this week, I thought it useful to prototype 
> > some things in the tidstore, at which point I was reminded it no longer 
> > compiles because of my recent work. I put in the necessary incantations so 
> > that the v32 tidstore compiles and passes tests, so here's a patchset for 
> > that (but no vacuum changes). I thought it was a good time to also condense 
> > it down to look more similar to previous patches, as a basis for future 
> > work.
> >
>
> Thank you for updating the patch set. I'll look at updates closely
> early next week.
>

I've run several benchmarks for v32, where before your recent change
starting, and v35 patch. Overall the numbers are better than the
previous version. Here is the test result where I used 1-byte value:

"select * from bench_load_random(10_000_000)"

* v35
  radix tree leaves: 192 total in 0 blocks; 0 empty blocks; 0 free (0
chunks); 192 used
  radix tree node 256: 13697472 total in 205 blocks; 0 empty blocks;
52400 free (25 chunks); 13645072 used
  radix tree node 125: 86630592 total in 2115 blocks; 0 empty blocks;
7859376 free (6102 chunks); 78771216 used
  radix tree node 32: 94912 total in 0 blocks; 10 empty blocks; 0 free
(0 chunks); 94912 used
  radix tree node 15: 9269952 total in 1136 blocks; 0 empty blocks;
168 free (1 chunks); 9269784 used
  radix tree node 3: 1915502784 total in 233826 blocks; 0 empty
blocks; 6560 free (164 chunks); 1915496224 used
 mem_allocated | load_ms
---+-
2025194752 |3011
(1 row)

* v32
  radix tree node 256: 192 total in 0 blocks; 0 empty blocks; 0 free
(0 chunks); 192 used
  radix tree node 256: 13487552 total in 205 blocks; 0 empty blocks;
51600 free (25 chunks); 13435952 used
  radix tree node 125: 192 total in 0 blocks; 0 empty blocks; 0 free
(0 chunks); 192 used
  radix tree node 125: 86630592 total in 2115 blocks; 0 empty blocks;
7859376 free (6102 chunks); 78771216 used
  radix tree node 32: 192 total in 0 blocks; 0 empty blocks; 0 free (0
chunks); 192 used
  radix tree node 32: 94912 total in 0 blocks; 10 empty blocks; 0 free
(0 chunks); 94912 used
  radix tree node 15: 192 total in 0 blocks; 0 empty blocks; 0 free (0
chunks); 192 used
  radix tree node 15: 9269952 total in 1136 blocks; 0 empty blocks;
168 free (1 chunks); 9269784 used
  radix tree node 3: 241597002 total in 29499 blocks; 0 empty blocks;
3864 free (161 chunks); 241593138 used
  radix tree node 3: 1809039552 total in 221696 blocks; 0 empty
blocks; 5280 free (110 chunks); 1809034272 used
 mem_allocated | load_ms
---+-
2160118410 |3069
(1 row)

As you mentioned, the 1-byte value is embedded into 8 byte so 7 bytes
are unused, but we use less memory since we use less slab contexts and
save fragmentations.

I've also tested some large value cases (e.g. the value is 80-bytes)
and got a similar result.

Regarding the codes, there are many todo and fixme comments so it
seems to me that your recent work is still in-progress. What is the
current status? Can I start reviewing the code or should I wait for a
while until your recent work completes?

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: pgsql: Fix search_path to a safe value during maintenance operations.

2023-07-03 Thread Noah Misch
On Mon, Jul 03, 2023 at 11:19:14AM -0700, Nathan Bossart wrote:
> On Sun, Jul 02, 2023 at 08:57:31PM -0700, Noah Misch wrote:
> > Another dimension of compromise could be to make MAINTAIN affect fewer
> > commands in v16.  Un-revert the part of commit 05e1737 affecting just the
> > commands it still affects.  For example, limit MAINTAIN and the 05e1737
> > behavior change to VACUUM, ANALYZE, and REINDEX.  Don't worry about VACUUM 
> > or
> > ANALYZE failing under commit 05e1737, since they would have been failing 
> > under
> > autovacuum since 2018.  A problem index expression breaks both autoanalyze 
> > and
> > REINDEX, hence the inclusion of REINDEX.  The already-failing argument 
> > doesn't
> > apply to CLUSTER or REFRESH MATERIALIZED VIEW, so those commands could join
> > MAINTAIN in v17.
> 
> I'm open to compromise if others are, but I'm skeptical that folks will be
> okay with too much fancy footwork this late in the game.

Got it.

> Anyway, IMO your argument could extend to CLUSTER and REFRESH, too.  If
> we're willing to change behavior under the assumption that autovacuum
> would've been failing since 2018, then why wouldn't we be willing to change
> it everywhere?  I suppose someone could have been manually vacuuming with a
> special search_path for 5 years to avoid needing to schema-qualify their
> index expressions (and would then be surprised that CLUSTER/REFRESH no
> longer work), but limiting MAINTAIN to VACUUM, etc. would still break their
> use-case, right?

Yes, limiting MAINTAIN to VACUUM would still break a site that has used manual
VACUUM to work around associated loss of autovacuum.  I'm not sympathetic to a
user who neglected to benefit from the last five years of prep time on this
issue as it affects VACUUM and ANALYZE.  REFRESH runs more than index
expressions, e.g. function calls in the targetlist of the materialized view
query.  Those targetlist expressions haven't been putting ERRORs in the log
during autovacuum, so REFRESH hasn't had the sort of advance warning that
VACUUM and ANALYZE got.




Re: pg_stat_statements and "IN" conditions

2023-07-03 Thread Nathan Bossart
On Sun, Mar 19, 2023 at 01:27:34PM +0100, Dmitry Dolgov wrote:
> +If this parameter is on, two queries with an array will get the same
> +query identifier if the only difference between them is the number of
> +constants, both numbers is of the same order of magnitude and 
> greater or
> +equal 10 (so the order of magnitude is greather than 1, it is not 
> worth
> +the efforts otherwise).

IMHO this adds way too much complexity to something that most users would
expect to be an on/off switch.  If I understand Álvaro's suggestion [0]
correctly, he's saying that in addition to allowing "on" and "off", it
might be worth allowing something like "powers" to yield roughly the
behavior described above.  I don't think he's suggesting that this "powers"
behavior should be the only available option.  Also, it seems
counterintuitive that queries with fewer than 10 constants are not merged.

In the interest of moving this patch forward, I would suggest making it a
simple on/off switch in 0002 and moving the "powers" functionality to a new
0003 patch.  I think separating out the core part of this feature might
help reviewers.  As you can see, I got distracted by the complicated
threshold logic and ended up focusing my first round of review there.

[0] https://postgr.es/m/20230209172651.cfgrebpyyr72h7fv%40alvherre.pgsql

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: DSA failed to allocate memory

2023-07-03 Thread Thomas Munro
Pushed.

I wasn't sure it was worth keeping the test in the tree.  It's here in
the mailing list archives for future reference.




Re: [PATCH] Extend the length of BackgroundWorker.bgw_library_name

2023-07-03 Thread Nathan Bossart
On Mon, Jul 03, 2023 at 06:00:12PM -0700, Yurii Rashkovskii wrote:
> Great, thank you! The reason I was leaving the other constant in place to
> make upgrading extensions trivial (so that they don't need to adjust for
> this), but if you think this is a better way, I am fine with it.

Sorry, I'm not following.  Which constant are you referring to?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Is a pg_stat_force_next_flush() call sufficient for regression tests?

2023-07-03 Thread Kyotaro Horiguchi
At Mon, 3 Jul 2023 15:45:52 +0200, Tomas Vondra  
wrote in 
> So I'm wondering if pg_stat_force_next_flush() is enough - AFAICS this
> only sets some flag for the *next* pgstat_report_stat() call, but how do
> we know that happens before the query execution?
> 
> Shouldn't there be something like pg_stat_flush() that actually does the
> flushing, instead of just setting the flag?

The reason for the function is that pg_stat_flush() is supposed not to
be called within a transaction.  AFAICS pg_stat_force_next_flush()
takes effect after a successfull transaction end and before the next
command execution.

To verify this, I put in an assertion to check that the flag gets
consumed before reading of pg_stat_io (a.diff), then ran pgbench with
the attached custom script. As expected, it didn't fire at all during
several trials. When I wrapped all lines in t.sql within a
begin-commit block, the assertion fired off immediately as a matter of
course.

Is there any chance concurrent backends or some other things can
actually hinder the backend from reusing buffers?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index d743fc0b28..6529e737ea 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -225,7 +225,7 @@ static dlist_head pgStatPending = DLIST_STATIC_INIT(pgStatPending);
  * Force the next stats flush to happen regardless of
  * PGSTAT_MIN_INTERVAL. Useful in test scripts.
  */
-static bool pgStatForceNextFlush = false;
+bool pgStatForceNextFlush = false;
 
 /*
  * Force-clear existing snapshot before next use when stats_fetch_consistency
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 2a4c8ef87f..7f60b2834a 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -1352,7 +1352,9 @@ pg_stat_get_io(PG_FUNCTION_ARGS)
 	ReturnSetInfo *rsinfo;
 	PgStat_IO  *backends_io_stats;
 	Datum		reset_time;
+	extern bool pgStatForceNextFlush;
 
+	Assert(!pgStatForceNextFlush);
 	InitMaterializedSRF(fcinfo, 0);
 	rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
 
--begin;
select pg_stat_force_next_flush();
select sum(reuses) from pg_stat_io;
--commit;


Re: Commitfest manager for July

2023-07-03 Thread Ibrar Ahmed
On Tue, Jul 4, 2023 at 5:03 AM Michael Paquier  wrote:

> On Mon, Jul 03, 2023 at 12:26:44PM +0200, Daniel Gustafsson wrote:
> > Since this didn't get any takers, and we are in July AoE since a few
> days ago,
> > I guess I'll assume the role this time in the interest of moving things
> along.
> > I've switched the 2023-07 CF to in-progress and 2023-09 to open, let's
> try to
> > close patches!
>
> Thanks, Daniel!
> --
> Michael
>
If nobody taking that, I can take the responsibility.--
Ibrar Ahmed


Re: pgindent (probably my missing something obvious)

2023-07-03 Thread Tom Lane
James Coleman  writes:
> My heuristic for what pgindent changes must be wrong. The long
> function calls (and 'if' conditions) seem obviously out of place to my
> eyes with the surrounding code. Does that mean the surrounding code
> was just hand-prettified?

pgindent won't usually editorialize on line breaks within C
statements.  (It *will* re-flow comment text, if the comment block
isn't at the left margin.)  It seems to feel free to play with
horizontal whitespace, but not to add or remove newlines within a
statement.  I do know that it will move curly braces around to meet
formatting rules, but I've not seen it do similar changes within a
function call or if-condition.  So it's up to you to break the lines
in a reasonable way.

regards, tom lane




Re: pgindent (probably my missing something obvious)

2023-07-03 Thread James Coleman
On Mon, Jul 3, 2023 at 9:20 PM Tom Lane  wrote:
>
> James Coleman  writes:
> > This is the first time I've run pgindent on my current machine, and it
> > doesn't seem to be making any modifications to source files. For
> > example this command:
>
> > ./src/tools/pgindent/pgindent src/backend/optimizer/path/allpaths.c
>
> > leaves the allpaths.c file unchanged despite my having some very long
> > function calls.
>
> "Long function calls" aren't necessarily something pgindent would
> change.  Have you tried intentionally misindenting some lines?
>
> regards, tom lane

Hmm, yeah, that works.

My heuristic for what pgindent changes must be wrong. The long
function calls (and 'if' conditions) seem obviously out of place to my
eyes with the surrounding code. Does that mean the surrounding code
was just hand-prettified?

Thanks,
James Coleman




RE: [PGdocs] fix description for handling pf non-ASCII characters

2023-07-03 Thread Hayato Kuroda (Fujitsu)
Dear Jian,

> On Thu, Jun 29, 2023 at 3:51 PM Hayato Kuroda (Fujitsu)
>  wrote:
> >
> > Dear Jian,
> >
> > Thank you for checking my patch!
> >
> > >
> > > in your patch:
> > > > printable ASCII characters will be replaced with a hex escape.
> > >
> > > My wording is not good. I think the result will be: ASCII characters
> > > will be as is, non-ASCII characters will be replaced with "a hex
> > > escape".
> >
> > Yeah, your point was right. I have already said:
> > "anything other than printable ASCII characters will be replaced with a hex
> escape"
> > IIUC They have same meaning.
> >
> > You might want to say the line was not good, so reworded like
> > "non-ASCII characters will be replaced with hexadecimal strings." How do you
> think?
> >
> > > set application_name to 'abc漢字Abc';
> > > SET
> > > test16=# show application_name;
> > > application_name
> > > 
> > >  abc\xe6\xbc\xa2\xe5\xad\x97Abc
> > > (1 row)
> > >
> > > I see multi escape, so I am not sure "a hex escape".
> >
> > Not sure what you said, but I could not find word "hex escape" in the 
> > document.
> > So I used "hexadecimal string" instead. Is it acceptable?
> >
> > > to properly render it back to  'abc漢字Abc'
> > > here is how i do it:
> > > select 'abc' || convert_from(decode(' e6bca2e5ad97','hex'), 'UTF8') || 
> > > 'Abc';
> >
> > Yeah, your approach seems right, but I'm not sure it is related with us.
> > Just to confirm, I don't have interest the method for rendering non-ASCII
> characters.
> > My motivation of the patch was to document the the incompatibility noted in 
> > [1]:
> >
> > >
> > Changed the conversion rules when non-ASCII characters are specified for
> ASCII-only
> > strings such as parameters application_name and cluster_name. Previously, it
> was
> > converted in byte units with a question mark (?), but in PostgreSQL 16, it 
> > is
> > converted to a hexadecimal string.
> > >
> >
> > > I guess it's still painful if your application_name has non-ASCII chars.
> >
> > I agreed that, but no one has recommended to use non-ASCII.
> >
> > [1]:
> https://h50146.www5.hpe.com/products/software/oe/linux/mainstream/suppo
> rt/lcc/pdf/PostgreSQL16Beta1_New_Features_en_20230528_1.pdf
> >
> > Best Regards,
> > Hayato Kuroda
> > FUJITSU LIMITED
> 
> looks fine. Do you need to add to commitfest?

Thank you for your confirmation. ! I registered to following:

https://commitfest.postgresql.org/44/4437/


Best Regards,
Hayato Kuroda
FUJITSU LIMITED
 


Re: Parallelize correlated subqueries that execute within each worker

2023-07-03 Thread James Coleman
On Sun, Jun 11, 2023 at 10:23 PM James Coleman  wrote:
>
> ...
> > And while trying the v9 patch I came across a crash with the query
> > below.
> >
> > set min_parallel_table_scan_size to 0;
> > set parallel_setup_cost to 0;
> > set parallel_tuple_cost to 0;
> >
> > explain (costs off)
> > select * from pg_description t1 where objoid in
> > (select objoid from pg_description t2 where t2.description = 
> > t1.description);
> >QUERY PLAN
> > 
> >  Seq Scan on pg_description t1
> >Filter: (SubPlan 1)
> >SubPlan 1
> >  ->  Gather
> >Workers Planned: 2
> >->  Parallel Seq Scan on pg_description t2
> >  Filter: (description = t1.description)
> > (7 rows)
> >
> > select * from pg_description t1 where objoid in
> > (select objoid from pg_description t2 where t2.description = 
> > t1.description);
> > WARNING:  terminating connection because of crash of another server process
> >
> > Seems something is wrong when extracting the argument from the Param in
> > parallel worker.
>
> With what I'm trying to change I don't think this plan should ever be
> generated since it means we'd have to pass a param from the outer seq
> scan into the parallel subplan, which we can't do (currently).
>
> I've attached the full backtrace to the email, but as you hinted at
> the parallel worker is trying to get the param (in this case
> detoasting it), but the param doesn't exist on the worker, so it seg
> faults. Looking at this further I think there's an existing test case
> that exposes the misplanning here (the one right under the comment
> "Parallel Append is not to be used when the subpath depends on the
> outer param" in select_parallel.sql), but it doesn't seg fault because
> the param is an integer, doesn't need to be detoasted, and therefore
> (I think) we skate by (but probably with wrong results in depending on
> the dataset).
>
> Interestingly this is one of the existing test queries my original
> patch approach didn't change, so this gives me something specific to
> work with improving the path. Thanks for testing this and bringing
> this to my attention!

Here's what I've found debugging this:

There's only a single gather path ever created when planning this
query, making it easy to know which one is the problem. That gather
path is created with this stacktrace:

frame #0: 0x000105291590
postgres`create_gather_path(root=0x00013081ae78,
rel=0x00013080c8e8, subpath=0x00013081c080,
target=0x00013081c8c0, required_outer=0x,
rows=0x) at pathnode.c:1971:2
frame #1: 0x000105208e54
postgres`generate_gather_paths(root=0x00013081ae78,
rel=0x00013080c8e8, override_rows=false) at allpaths.c:3097:4
frame #2: 0x0001052090ec
postgres`generate_useful_gather_paths(root=0x00013081ae78,
rel=0x00013080c8e8, override_rows=false) at allpaths.c:3241:2
frame #3: 0x000105258754
postgres`apply_scanjoin_target_to_paths(root=0x00013081ae78,
rel=0x00013080c8e8, scanjoin_targets=0x00013081c978,
scanjoin_targets_contain_srfs=0x,
scanjoin_target_parallel_safe=true, tlist_same_exprs=true) at
planner.c:7696:3
frame #4: 0x0001052533cc
postgres`grouping_planner(root=0x00013081ae78, tuple_fraction=0.5)
at planner.c:1611:3
frame #5: 0x000105251e9c
postgres`subquery_planner(glob=0x0001308188d8,
parse=0x00013080caf8, parent_root=0x00013080cc38,
hasRecursion=false, tuple_fraction=0.5) at planner.c:1062:2
frame #6: 0x00010526b134
postgres`make_subplan(root=0x00013080cc38,
orig_subquery=0x00013080ff58, subLinkType=ANY_SUBLINK,
subLinkId=0, testexpr=0x00013080d848, isTopQual=true) at
subselect.c:221:12
frame #7: 0x000105268b8c
postgres`process_sublinks_mutator(node=0x00013080d6d8,
context=0x00016b0998f8) at subselect.c:1950:10
frame #8: 0x000105268ad8
postgres`SS_process_sublinks(root=0x00013080cc38,
expr=0x00013080d6d8, isQual=true) at subselect.c:1923:9
frame #9: 0x0001052527b8
postgres`preprocess_expression(root=0x00013080cc38,
expr=0x00013080d6d8, kind=0) at planner.c:1169:10
frame #10: 0x000105252954
postgres`preprocess_qual_conditions(root=0x00013080cc38,
jtnode=0x00013080d108) at planner.c:1214:14
frame #11: 0x000105251580
postgres`subquery_planner(glob=0x0001308188d8,
parse=0x000137010d68, parent_root=0x,
hasRecursion=false, tuple_fraction=0) at planner.c:832:2
frame #12: 0x00010525042c
postgres`standard_planner(parse=0x000137010d68,
query_string="explain (costs off)\nselect * from pg_description t1
where objoid in\n(select objoid from pg_description t2 where
t2.description = t1.description);", cursorOptions=2048,
boundParams=0x) at planner.c:411:9

There aren't any lateral markings on the rels. Additionally the

Re: pgindent (probably my missing something obvious)

2023-07-03 Thread Tom Lane
James Coleman  writes:
> This is the first time I've run pgindent on my current machine, and it
> doesn't seem to be making any modifications to source files. For
> example this command:

> ./src/tools/pgindent/pgindent src/backend/optimizer/path/allpaths.c

> leaves the allpaths.c file unchanged despite my having some very long
> function calls.

"Long function calls" aren't necessarily something pgindent would
change.  Have you tried intentionally misindenting some lines?

regards, tom lane




pgindent (probably my missing something obvious)

2023-07-03 Thread James Coleman
Hello,

This is the first time I've run pgindent on my current machine, and it
doesn't seem to be making any modifications to source files. For
example this command:

./src/tools/pgindent/pgindent src/backend/optimizer/path/allpaths.c

leaves the allpaths.c file unchanged despite my having some very long
function calls. I've downloaded the latest typedefs list, but I
haven't added any types anyway.

What obvious thing am I missing?

Thanks,
James Coleman




Re: [PATCH] Extend the length of BackgroundWorker.bgw_library_name

2023-07-03 Thread Yurii Rashkovskii
Nathan,

On Mon, Jul 3, 2023 at 3:08 PM Nathan Bossart 
wrote:

> On Sun, Jul 02, 2023 at 04:37:52PM -0700, Yurii Rashkovskii wrote:
> > Thank you for revising the patch. While this is relatively minor, I think
> > it should be set to MAXPGPATH directly to clarify their relationship.
>
> Committed.  I set the size to MAXPGPATH directly instead of inventing a new
> macro with the same value.
>

Great, thank you! The reason I was leaving the other constant in place to
make upgrading extensions trivial (so that they don't need to adjust for
this), but if you think this is a better way, I am fine with it.

-- 
Y.


Re: Improve list manipulation in several places

2023-07-03 Thread Richard Guo
On Mon, Jul 3, 2023 at 5:41 PM Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:

> On 09.05.23 05:13, Richard Guo wrote:
> > Yeah, maybe this is the reason I failed to devise a query that shows any
> > performance gain.  I tried with a query which makes the 'all_pathkeys'
> > in sort_inner_and_outer being length of 500 and still cannot see any
> > notable performance improvements gained by list_copy_move_nth_to_head.
> > Maybe the cost of other parts of planning swamps the performance gain
> > here?  Now I agree that maybe 0002 is not worthwhile to do.
>
> I have committed patch 0001.  Since you have withdrawn 0002, this closes
> the commit fest item.


Thanks for pushing it and closing the item!

Thanks
Richard


Re: Making empty Bitmapsets always be NULL

2023-07-03 Thread David Rowley
On Mon, 3 Jul 2023 at 18:10, Yuya Watari  wrote:
> Thank you for your reply. I am +1 to your change. I think these
> assertions will help someone who changes the Bitmapset implementations
> in the future.

I've now pushed the patch.

Thanks for all your reviews and detailed benchmarks.

David




Re: Consider \v to the list of whitespace characters in the parser

2023-07-03 Thread Michael Paquier
On Mon, Jul 03, 2023 at 08:15:03PM -0400, Tom Lane wrote:
> Assuming we don't want to change either of these distinctions,
> the v2 patch looks about right to me.

Yeah, thanks.  Peter, what's your take?
--
Michael


signature.asc
Description: PGP signature


Re: Consider \v to the list of whitespace characters in the parser

2023-07-03 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Jul 03, 2023 at 12:17:10PM +0200, Peter Eisentraut wrote:
>> In scan.l, you might want to ponder horiz_space: Even though \v is clearly
>> not "horizontal space", horiz_space already includes \f, which is also not
>> horizontal IMO.  I think horiz_space is really all space characters except
>> newline characters.  Maybe this should be rephrased.

> And a few lines above, there is a comment from 2000 (3cfdd8f)
> pondering if \f should be handled as a newline, which is kind of
> incorrect anyway?

It looks to me like there are two places where these distinctions
actually matter:

1. Which characters terminate a "--" comment.  Currently that's only
[\n\r] (see {non_newline}).

2. Which characters satisfy the SQL spec's requirement that there be a
newline in the whitespace separating string literals that are to be
concatenated.  Currently, that's also only [\n\r].

Assuming we don't want to change either of these distinctions,
the v2 patch looks about right to me.

regards, tom lane




Re: Commitfest manager for July

2023-07-03 Thread Michael Paquier
On Mon, Jul 03, 2023 at 12:26:44PM +0200, Daniel Gustafsson wrote:
> Since this didn't get any takers, and we are in July AoE since a few days ago,
> I guess I'll assume the role this time in the interest of moving things along.
> I've switched the 2023-07 CF to in-progress and 2023-09 to open, let's try to
> close patches!

Thanks, Daniel!
--
Michael


signature.asc
Description: PGP signature


Re: Consider \v to the list of whitespace characters in the parser

2023-07-03 Thread Michael Paquier
On Mon, Jul 03, 2023 at 12:17:10PM +0200, Peter Eisentraut wrote:
> SQL has "whitespace", which includes any Unicode character with the
> White_Space property (which includes \v), and , which is
> implementation-defined.
> 
> So nothing there speaks against treating \v as a (white)space character in
> the SQL scanner.

Okay, thanks for confirming.  

> In scan.l, you might want to ponder horiz_space: Even though \v is clearly
> not "horizontal space", horiz_space already includes \f, which is also not
> horizontal IMO.  I think horiz_space is really all space characters except
> newline characters.  Maybe this should be rephrased.

And a few lines above, there is a comment from 2000 (3cfdd8f)
pondering if \f should be handled as a newline, which is kind of
incorrect anyway?

FWIW, I agree that horiz_space is confusing in this context because it
does not completely reflect the reality, and \v is not that so adding
it to the existing list felt wrong to me.  Form feed is also not a
newline, from what I understand..  From what the parser tells, there
are two things we want to track to handle comments:
- All space characters, which would be \t\n\r\f\v.
- All space characters that are not newlines, \t\f\v.

I don't really have a better idea this morning than using the
following terms in the parser, changing the surroundings with similar
terms:
-space  [ \t\n\r\f]
-horiz_space[ \t\f]
+space  [ \t\n\r\f\v]
+non_newline_space  [ \t\f\v]

Perhaps somebody has a better idea of split?
--
Michael
diff --git a/src/backend/parser/parse_type.c b/src/backend/parser/parse_type.c
index be75dc6ab0..63b4e96962 100644
--- a/src/backend/parser/parse_type.c
+++ b/src/backend/parser/parse_type.c
@@ -742,7 +742,7 @@ typeStringToTypeName(const char *str, Node *escontext)
 	ErrorContextCallback ptserrcontext;
 
 	/* make sure we give useful error for empty input */
-	if (strspn(str, " \t\n\r\f") == strlen(str))
+	if (strspn(str, " \t\n\r\f\v") == strlen(str))
 		goto fail;
 
 	/*
diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l
index b2216a9eac..0708ba6540 100644
--- a/src/backend/parser/scan.l
+++ b/src/backend/parser/scan.l
@@ -213,16 +213,16 @@ extern void core_yyset_column(int column_no, yyscan_t yyscanner);
  * versions of Postgres failed to recognize -- as a comment if the input
  * did not end with a newline.
  *
- * XXX perhaps \f (formfeed) should be treated as a newline as well?
+ * non_newline_space tracks all the other space characters except newlines.
  *
  * XXX if you change the set of whitespace characters, fix scanner_isspace()
  * to agree.
  */
 
-space			[ \t\n\r\f]
-horiz_space		[ \t\f]
-newline			[\n\r]
-non_newline		[^\n\r]
+space[ \t\n\r\f\v]
+non_newline_space	[ \t\f\v]
+newline[\n\r]
+non_newline			[^\n\r]
 
 comment			("--"{non_newline}*)
 
@@ -236,8 +236,8 @@ whitespace		({space}+|{comment})
  */
 
 special_whitespace		({space}+|{comment}{newline})
-horiz_whitespace		({horiz_space}|{comment})
-whitespace_with_newline	({horiz_whitespace}*{newline}{special_whitespace}*)
+non_newline_whitespace	({non_newline_space}|{comment})
+whitespace_with_newline	({non_newline_whitespace}*{newline}{special_whitespace}*)
 
 quote			'
 /* If we see {quote} then {quotecontinue}, the quoted string continues */
@@ -1414,6 +1414,8 @@ unescape_single_char(unsigned char c, core_yyscan_t yyscanner)
 			return '\r';
 		case 't':
 			return '\t';
+		case 'v':
+			return '\v';
 		default:
 			/* check for backslash followed by non-7-bit-ASCII */
 			if (c == '\0' || IS_HIGHBIT_SET(c))
diff --git a/src/backend/parser/scansup.c b/src/backend/parser/scansup.c
index ed67f5f5fe..4f0005a114 100644
--- a/src/backend/parser/scansup.c
+++ b/src/backend/parser/scansup.c
@@ -121,6 +121,7 @@ scanner_isspace(char ch)
 		ch == '\t' ||
 		ch == '\n' ||
 		ch == '\r' ||
+		ch == '\v' ||
 		ch == '\f')
 		return true;
 	return false;
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index cb467ca46f..1cc7fb858c 100644
--- a/src/backend/replication/repl_scanner.l
+++ b/src/backend/replication/repl_scanner.l
@@ -73,7 +73,7 @@ static void addlitchar(unsigned char ychar);
 %x xd
 %x xq
 
-space			[ \t\n\r\f]
+space			[ \t\n\r\f\v]
 
 quote			'
 quotestop		{quote}
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 9000f83a83..4359dbd83d 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -24,6 +24,7 @@
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
+#include "parser/scansup.h"
 #include "port/pg_bitutils.h"
 #include "utils/array.h"
 #include "utils/arrayaccess.h"
@@ -89,7 +90,6 @@ typedef struct ArrayIteratorData
 	int			current_item;	/* the item # we're at in the array */
 }			ArrayIteratorData;
 
-static bool array_isspace(char ch);
 static int	ArrayCount(const char *str, int *dim, char typdelim,
 	   Node *escontext);
 

Re: Deleting prepared statements from libpq.

2023-07-03 Thread Michael Paquier
On Mon, Jul 03, 2023 at 02:33:55PM +0200, Jelte Fennema wrote:
> @Michael is anything else needed from my side? If not, I'll mark the
> commitfest entry as "Ready For Committer".

Sure, feel free.  I was planning to look at and play more with it.
--
Michael


signature.asc
Description: PGP signature


Re: Does a cancelled REINDEX CONCURRENTLY need to be messy?

2023-07-03 Thread Michael Paquier
On Mon, Jul 03, 2023 at 07:46:27PM +0200, Alvaro Herrera wrote:
> On 2023-Jul-01, Thom Brown wrote:
>> On Thu, 29 Jun 2023, 14:45 Álvaro Herrera,  wrote:
>>> ALTER TABLE DETACH CONCURRENTLY had to deal with this also, and it did it
>>> by having a COMPLETE option you can run later in case things got stuck the
>>> first time around. I suppose we could do something similar, where the
>>> server automatically does the needful, whatever that is.

I could imagine a code path for manual and automatic operations for
REINDEX (?) at table level and at database level, but using this
keyword would be strange, as well.  CONCURRENTLY cannot work on system
indexes so SYSTEM does not make sense, and index level is no different
than a DROP.

> Well, I definitely agree that it would be useful to have *something*
> that automatically removes debris  (I'm not sure VACUUM is the best place
> to do it.  Perhaps we could have autovacuum check for it, and do it
> separately of vacuum proper.)

Being able to reuse some of the worker/launcher parts from autovacuum
could make things easier for a bgworker implementation, perhaps?
--
Michael


signature.asc
Description: PGP signature


Re: Large files for relations

2023-07-03 Thread Thomas Munro
On Mon, Jun 12, 2023 at 8:53 PM David Steele  wrote:
> +   if (strcmp(endptr, "kB") == 0)
>
> Why kB here instead of KB to match MB, GB, TB below?

Those are SI prefixes[1], and we use kB elsewhere too.  ("K" was used
for kelvins, so they went with "k" for kilo.  Obviously these aren't
fully SI, because B is supposed to mean bel.  A gigabel would be
pretty loud... more than "sufficient power to create a black hole"[2],
hehe.)

> +   int64   relseg_size;/* blocks per segment of large 
> relation */
>
> This will require PG_CONTROL_VERSION to be bumped -- but you are
> probably waiting until commit time to avoid annoying conflicts, though I
> don't think it is as likely as with CATALOG_VERSION_NO.

Oh yeah, thanks.

> > Another
> > idea would be to make it static in md.c and call smgrsetsegmentsize(),
> > or something like that.  That could be a nice place to compute the
> > "shift" value up front, instead of computing it each time in
> > blockno_to_segno(), but that's probably not worth bothering with (?).
> > BSR/LZCNT/CLZ instructions are pretty fast on modern chips.  That's
> > about the only place where someone could say that this change makes
> > things worse for people not interested in the new feature, so I was
> > careful to get rid of / and % operations with no-longer-constant RHS.
>
> Right -- not sure we should be troubling ourselves with trying to
> optimize away ops that are very fast, unless they are computed trillions
> of times.

This obviously has some things in common with David Christensen's
nearby patch for block sizes[3], and we should be shifting and masking
there too if that route is taken (as opposed to a specialise-the-code
route or somethign else).  My binary-log trick is probably a little
too cute though... I should probably just go and set a shift variable.

Thanks for looking!

[1] https://en.wikipedia.org/wiki/Metric_prefix
[2] https://en.wiktionary.org/wiki/gigabel
[3] 
https://www.postgresql.org/message-id/flat/CAOxo6XKx7DyDgBkWwPfnGSXQYNLpNrSWtYnK6-1u%2BQHUwRa1Gg%40mail.gmail.com




Re: ProcessStartupPacket(): database_name and user_name truncation

2023-07-03 Thread Tom Lane
Nathan Bossart  writes:
> Thanks, Bertrand.  I chickened out and ended up committing v1 for now
> (i.e., simply removing the truncation code).

WFM.

> If anyone disagrees and wants to see the FATALs emitted from
> ProcessStartupPacket() directly, please let me know and we can work on
> adding them in a follow-up patch.

I think the new behavior is fine.

regards, tom lane




Re: Add information about command path and version of flex in meson output

2023-07-03 Thread Michael Paquier
On Mon, Jul 03, 2023 at 10:17:40AM +0200, Peter Eisentraut wrote:
> Yes, if you want two separate lines, then doing it like bison makes sense.

Okay, I have applied v2 that uses two separate lines and two separate
variables, then, to be like bison.
--
Michael


signature.asc
Description: PGP signature


brininsert optimization opportunity

2023-07-03 Thread Soumyadeep Chakraborty
Hello hackers,

My colleague, Ashwin, pointed out to me that brininsert's per-tuple init
of the revmap access struct can have non-trivial overhead.

Turns out he is right. We are saving 24 bytes of memory per-call for
the access struct, and a bit on buffer/locking overhead, with the
attached patch.

The implementation ties the revmap cleanup as a MemoryContext callback
to the IndexInfo struct's MemoryContext, as there is no teardown
function provided by the index AM for end-of-insert-command.

Test setup (local Ubuntu workstation):

# Drop caches and restart between each run:
sudo sh -c "sync; echo 3 > /proc/sys/vm/drop_caches;"
pg_ctl -D /usr/local/pgsql/data/ -l /tmp/logfile restart

\timing
DROP TABLE heap;
CREATE TABLE heap(i int);
CREATE INDEX ON heap USING brin(i) WITH (pages_per_range=1);
INSERT INTO heap SELECT 1 FROM generate_series(1, 2);

Results:
We see an improvement for 100M tuples and an even bigger improvement for
200M tuples.

Master (29cf61ade3f245aa40f427a1d6345287ef77e622):

test=# INSERT INTO heap SELECT 1 FROM generate_series(1, 1);
INSERT 0 1
Time: 222762.159 ms (03:42.762)

-- 3 runs
test=# INSERT INTO heap SELECT 1 FROM generate_series(1, 2);
INSERT 0 2
Time: 471168.181 ms (07:51.168)
Time: 457071.883 ms (07:37.072)
TimeL 486969.205 ms (08:06.969)

Branch:

test2=# INSERT INTO heap SELECT 1 FROM generate_series(1, 1);
INSERT 0 1
Time: 200046.519 ms (03:20.047)

-- 3 runs
test2=# INSERT INTO heap SELECT 1 FROM generate_series(1, 2);
INSERT 0 2
Time: 369041.832 ms (06:09.042)
Time: 365483.382 ms (06:05.483)
Time: 375506.144 ms (06:15.506)

# Profiled backend running INSERT of 1 rows
sudo perf record -p 11951 --call-graph fp sleep 180

Please see attached perf diff between master and branch. We see that we
save on a bit of overhead from brinRevmapInitialize(),
brinRevmapTerminate() and lock routines.

Regards,
Soumyadeep (VMware)


perf_diff.out
Description: Binary data
From 5d11219e413fe6806e00dd738b051c3948dffcab Mon Sep 17 00:00:00 2001
From: Soumyadeep Chakraborty 
Date: Mon, 3 Jul 2023 10:47:48 -0700
Subject: [PATCH v1 1/1] Reuse revmap and brin desc in brininsert

brininsert() used to have code that performed per-tuple initialization
of the revmap. That had some overhead.
---
 src/backend/access/brin/brin.c | 70 --
 1 file changed, 59 insertions(+), 11 deletions(-)

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 3c6a956eaa3..e27bee980f1 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -58,6 +58,17 @@ typedef struct BrinBuildState
 	BrinMemTuple *bs_dtuple;
 } BrinBuildState;
 
+/*
+ * We use a BrinInsertState to capture running state spanning multiple
+ * brininsert invocations, within the same command.
+ */
+typedef struct BrinInsertState
+{
+	BrinRevmap 	*bs_rmAccess;
+	BrinDesc   	*bs_desc;
+	BlockNumber bs_pages_per_range;
+} BrinInsertState;
+
 /*
  * Struct used as "opaque" during index scans
  */
@@ -72,6 +83,7 @@ typedef struct BrinOpaque
 
 static BrinBuildState *initialize_brin_buildstate(Relation idxRel,
   BrinRevmap *revmap, BlockNumber pagesPerRange);
+static BrinInsertState *initialize_brin_insertstate(Relation idxRel, IndexInfo *indexInfo);
 static void terminate_brin_buildstate(BrinBuildState *state);
 static void brinsummarize(Relation index, Relation heapRel, BlockNumber pageRange,
 		  bool include_partial, double *numSummarized, double *numExisting);
@@ -140,6 +152,42 @@ brinhandler(PG_FUNCTION_ARGS)
 	PG_RETURN_POINTER(amroutine);
 }
 
+static void
+brininsertCleanupCallback(void *arg)
+{
+	BrinInsertState *bistate = (BrinInsertState *) arg;
+	/*
+	 * Clean up the revmap. Note that the brinDesc has already been cleaned up
+	 * as part of its own memory context.
+	 */
+	brinRevmapTerminate(bistate->bs_rmAccess);
+}
+
+static BrinInsertState *
+initialize_brin_insertstate(Relation idxRel, IndexInfo *indexInfo)
+{
+	BrinInsertState 		*bistate;
+	MemoryContextCallback 	*cb;
+	MemoryContext 			oldcxt;
+
+	oldcxt = MemoryContextSwitchTo(indexInfo->ii_Context);
+	bistate = MemoryContextAllocZero(indexInfo->ii_Context,
+	 sizeof(BrinInsertState));
+
+	bistate->bs_desc = brin_build_desc(idxRel);
+	cb = palloc(sizeof(MemoryContextCallback));
+	cb->arg = bistate;
+	cb->func = brininsertCleanupCallback;
+	MemoryContextRegisterResetCallback(indexInfo->ii_Context, cb);
+	MemoryContextSwitchTo(oldcxt);
+
+	bistate->bs_rmAccess = brinRevmapInitialize(idxRel,
+>bs_pages_per_range,
+NULL);
+
+	return bistate;
+}
+
 /*
  * A tuple in the heap is being inserted.  To keep a brin index up to date,
  * we need to obtain the relevant index tuple and compare its stored values
@@ -162,14 +210,23 @@ brininsert(Relation idxRel, Datum *values, bool *nulls,
 	BlockNumber pagesPerRange;
 	BlockNumber origHeapBlk;
 	BlockNumber heapBlk;
-	BrinDesc   *bdesc = 

Re: Size vs size_t or, um, PgSize?

2023-07-03 Thread Tom Lane
Daniel Gustafsson  writes:
> On 3 Jul 2023, at 20:32, Yurii Rashkovskii  wrote:
>> I couldn't find any rationale as to why we might want to have this alias and 
>> not use size_t. Any insight on this would be appreciated.

> This used to be a typedef for unsigned int a very long time ago.

I'm fairly sure that Size dates from before we could expect the system
headers to provide size_t everywhere.

> This has been discussed a number of times in the past, and the conclusion from
> last time IIRC was to use size_t for new code and only change the existing
> instances when touched for other reasons to avoid churn.

Yeah.  The code-churn costs of s/Size/size_t/g outweigh the possible
gain, at least from our admittedly project-centric point of view.
But I don't have a whole lot of sympathy for arguments about "this
other code I'd like to also use has its own definition for Size",
because you could potentially make that complaint about just about
every typedef we've got.  If you have conflicts like that, you have
to resolve them by methods like #define hacks or factoring your code
so it doesn't need to include Postgres headers in the same files that
include $other-project-headers.

regards, tom lane




Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2023-07-03 Thread Michael Paquier
On Tue, Jul 04, 2023 at 06:40:49AM +1200, Thomas Munro wrote:
> curculio (OpenBSD 5.9) is failing with "undefined reference to
> `X509_get_signature_nid'", but that's OK, Mikael already supplied a
> modern OpenBSD system to replace it (schnauzer, which is green) and he
> was planning to shut curculio down (see Direct I/O thread where that
> came up because its GCC 4.2 compiler doesn't understand our stack
> alignment directives; it will also break comprehensively when I push
> the nearby all-supported-computers-have-locale_t patch from the
> check_strxfrm_bug thread).

The second and third animals to fail are skate and snapper, both using
Debian 7 Wheezy.  As far as I know, it was an LTS supported until
2018.  The owner of both machines is added in CC.  I guess that we
this stuff could just remove --with-openssl from the configure
switches.
--
Michael


signature.asc
Description: PGP signature


Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2023-07-03 Thread Michael Paquier
On Mon, Jul 03, 2023 at 10:23:02PM +0200, Mikael Kjellström wrote:
> On 2023-07-03 20:53, Daniel Gustafsson wrote:
>>> curculio (OpenBSD 5.9) is failing with "undefined reference to
>>> `X509_get_signature_nid'", but that's OK, Mikael already supplied a
>>> modern OpenBSD system to replace it
>> 
>> Thanks for the report!  OpenBSD 5.9 was released in 2016 and is thus well 
>> over
>> 5 years EOL, so I agree that it doesn't warrant a code change from us to
>> support this.

OpenBSD 5.9 was EOL in 2017 as far as I know.

> I have retired curculio now.  So it will stop reporting in from now on.

Thanks Mikael!
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] Extend the length of BackgroundWorker.bgw_library_name

2023-07-03 Thread Nathan Bossart
On Sun, Jul 02, 2023 at 04:37:52PM -0700, Yurii Rashkovskii wrote:
> Thank you for revising the patch. While this is relatively minor, I think
> it should be set to MAXPGPATH directly to clarify their relationship.

Committed.  I set the size to MAXPGPATH directly instead of inventing a new
macro with the same value.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Why is DATESTYLE, ordering ignored for output but used for input ?

2023-07-03 Thread Matthias van de Meent
On Mon, 3 Jul 2023 at 20:06, Dave Cramer  wrote:
>
> Greetings,
>
> For ISO and German dates the order DMY is completely ignored on output but 
> used for input.
>
> test=# set datestyle to 'ISO,DMY';
> SET
> select '7-8-2023'::date
> test-# ;
> date
> 
>  2023-08-07
> (1 row)
>
> test=# set datestyle to 'ISO,MDY';
> SET
> test=# select '7-8-2023'::date
> ;
> date
> 
>  2023-07-08
> (1 row)
>
> Note regardless of  how the ordering is specified it is always output as
> YMD

Wouldn't that be because ISO only has one correct ordering of the day
and month fields? I fail to see why we'd output non-ISO-formatted date
strings when ISO format is requested. I believe the reason is the same
for German dates - Germany's official (or most common?) date
formatting has a single ordering of these fields, which is also the
ordering that we supply.

The code comments also seem to hint to this:

> case USE_ISO_DATES:
> case USE_XSD_DATES:
>  /* compatible with ISO date formats */

> case USE_GERMAN_DATES:
> /* German-style date format */

This has been this way since the code for ISO was originally committed
in July of '97 with 8507ddb9 and the GERMAN formatting which was added
in December of '97 as D.M/Y with 352b3687 (and later that month was
updated to D.M.Y with ca23837a).
Sadly, the -hackers archives don't seem to have any mails from that
time period, so I couldn't find much info on the precise rationale
around this behavior.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)

PS. That was some interesting digging into the history of the date
formatting module.




Re: ProcessStartupPacket(): database_name and user_name truncation

2023-07-03 Thread Nathan Bossart
On Sat, Jul 01, 2023 at 04:02:06PM +0200, Drouvot, Bertrand wrote:
> Please find V2 attached where it's failing as soon as the database name or
> user name are detected as overlength.

Thanks, Bertrand.  I chickened out and ended up committing v1 for now
(i.e., simply removing the truncation code).  I didn't like the idea of
trying to keep the new error messages consistent with code in faraway
files, and the startup packet length limit is already pretty aggressive, so
I'm a little less concerned about lugging around long names.  Plus, I think
v2 had some subtle interactions with db_user_namespace (maybe for the
better), but I didn't spend too much time looking at that since
db_user_namespace will likely be removed soon.

If anyone disagrees and wants to see the FATALs emitted from
ProcessStartupPacket() directly, please let me know and we can work on
adding them in a follow-up patch.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2023-07-03 Thread Mikael Kjellström




On 2023-07-03 20:53, Daniel Gustafsson wrote:


curculio (OpenBSD 5.9) is failing with "undefined reference to
`X509_get_signature_nid'", but that's OK, Mikael already supplied a
modern OpenBSD system to replace it


Thanks for the report!  OpenBSD 5.9 was released in 2016 and is thus well over
5 years EOL, so I agree that it doesn't warrant a code change from us to
support this.


I have retired curculio now.  So it will stop reporting in from now on.

/Mikael





Re: Size vs size_t or, um, PgSize?

2023-07-03 Thread Yurii Rashkovskii
Daniel,

On Mon, Jul 3, 2023 at 12:20 PM Daniel Gustafsson  wrote:

> > On 3 Jul 2023, at 21:14, Yurii Rashkovskii  wrote:
>
> > That being said, going ahead with the global renaming of Size to size_t
> will mostly eliminate this clash in, say, five years when old versions will
> be gone. At least it'll be fixed then. Otherwise, it'll never be fixed at
> all. To me, having the problem gone in the future beats having the problem
> forever.
>
> I would also like all Size instances gone, but the cost during backpatching
> will likely be very high.  There are ~1300 or so of them in the code, and
> that's a lot of potential conflicts during the coming 5 years of
> backpatches.
>
>
I understand. How about a workaround for extension builders? Something like

```
/* Use this if you run into Size type redefinition */
#ifdef DONT_TYPEDEF_SIZE
#define Size size_t
#else
typedef size_t Size;
#endif
```
This way, extension developers can specify DONT_TYPEDEF_SIZE. However, this
would have to be backported, but to minimal/no effect if I am not missing
anything.

Not beautiful, but better than freezing the status quo forever?

-- 
Y.


Re: Size vs size_t or, um, PgSize?

2023-07-03 Thread Daniel Gustafsson
> On 3 Jul 2023, at 21:14, Yurii Rashkovskii  wrote:

> That being said, going ahead with the global renaming of Size to size_t will 
> mostly eliminate this clash in, say, five years when old versions will be 
> gone. At least it'll be fixed then. Otherwise, it'll never be fixed at all. 
> To me, having the problem gone in the future beats having the problem forever.

I would also like all Size instances gone, but the cost during backpatching
will likely be very high.  There are ~1300 or so of them in the code, and
that's a lot of potential conflicts during the coming 5 years of backpatches.

--
Daniel Gustafsson





Re: Size vs size_t or, um, PgSize?

2023-07-03 Thread Yurii Rashkovskii
Hi Thomas,

On Mon, Jul 3, 2023 at 12:03 PM Thomas Munro  wrote:

> On Tue, Jul 4, 2023 at 6:46 AM Daniel Gustafsson  wrote:
> > > On 3 Jul 2023, at 20:32, Yurii Rashkovskii  wrote:
> > > If there's a willingness to try this out, I am happy to prepare a
> patch.
> >
> > This has been discussed a number of times in the past, and the
> conclusion from
> > last time IIRC was to use size_t for new code and only change the
> existing
> > instances when touched for other reasons to avoid churn.
>
> One such earlier discussion:
>
>
> https://www.postgresql.org/message-id/flat/CAEepm%3D1eA0vsgA7-2oigKzqg10YeXoPWiS-fCuQRDLwwmgMXag%40mail.gmail.com
>
> I personally wouldn't mind if we just flipped to standard types
> everywhere, but I guess it wouldn't help with your problem with
> extensions on macOS as you probably also want to target released
> branches, not just master/17+.  But renaming in the back branches
> doesn't sound like something we'd do...
>

Of course, it would have been great to have it backported in the ideal
world, but it isn't realistic, as you say.

That being said, going ahead with the global renaming of Size to size_t
will mostly eliminate this clash in, say, five years when old versions will
be gone. At least it'll be fixed then. Otherwise, it'll never be fixed at
all. To me, having the problem gone in the future beats having the problem
forever.


-- 
Y.


Re: Size vs size_t or, um, PgSize?

2023-07-03 Thread Thomas Munro
On Tue, Jul 4, 2023 at 6:46 AM Daniel Gustafsson  wrote:
> > On 3 Jul 2023, at 20:32, Yurii Rashkovskii  wrote:
> > If there's a willingness to try this out, I am happy to prepare a patch.
>
> This has been discussed a number of times in the past, and the conclusion from
> last time IIRC was to use size_t for new code and only change the existing
> instances when touched for other reasons to avoid churn.

One such earlier discussion:

https://www.postgresql.org/message-id/flat/CAEepm%3D1eA0vsgA7-2oigKzqg10YeXoPWiS-fCuQRDLwwmgMXag%40mail.gmail.com

I personally wouldn't mind if we just flipped to standard types
everywhere, but I guess it wouldn't help with your problem with
extensions on macOS as you probably also want to target released
branches, not just master/17+.  But renaming in the back branches
doesn't sound like something we'd do...




Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2023-07-03 Thread Daniel Gustafsson
> On 3 Jul 2023, at 20:40, Thomas Munro  wrote:
> 
> On Mon, Jul 3, 2023 at 4:26 PM Michael Paquier  wrote:
>> I have not gone back to this part yet, though I plan to do so.  As we
>> are at the beginning of the development cycle, I have applied the
>> patch to remove support for 1.0.1 for now on HEAD.  Let's see what the
>> buildfarm tells.
> 
> curculio (OpenBSD 5.9) is failing with "undefined reference to
> `X509_get_signature_nid'", but that's OK, Mikael already supplied a
> modern OpenBSD system to replace it

Thanks for the report!  OpenBSD 5.9 was released in 2016 and is thus well over
5 years EOL, so I agree that it doesn't warrant a code change from us to
support this.

--
Daniel Gustafsson





Re: Size vs size_t or, um, PgSize?

2023-07-03 Thread Daniel Gustafsson
> On 3 Jul 2023, at 20:32, Yurii Rashkovskii  wrote:

> I couldn't find any rationale as to why we might want to have this alias and 
> not use size_t. Any insight on this would be appreciated.

This used to be a typedef for unsigned int a very long time ago.

> Would there be any sense in changing it all to size_t or renaming it to 
> something else?
> 
> I understand that they will break some extensions, so if we don't want them 
> to have to go through with the renaming, can we enable backward compatibility 
> with a macro?
> 
> If there's a willingness to try this out, I am happy to prepare a patch.

This has been discussed a number of times in the past, and the conclusion from
last time IIRC was to use size_t for new code and only change the existing
instances when touched for other reasons to avoid churn.

--
Daniel Gustafsson





Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2023-07-03 Thread Thomas Munro
On Mon, Jul 3, 2023 at 4:26 PM Michael Paquier  wrote:
> I have not gone back to this part yet, though I plan to do so.  As we
> are at the beginning of the development cycle, I have applied the
> patch to remove support for 1.0.1 for now on HEAD.  Let's see what the
> buildfarm tells.

curculio (OpenBSD 5.9) is failing with "undefined reference to
`X509_get_signature_nid'", but that's OK, Mikael already supplied a
modern OpenBSD system to replace it (schnauzer, which is green) and he
was planning to shut curculio down (see Direct I/O thread where that
came up because its GCC 4.2 compiler doesn't understand our stack
alignment directives; it will also break comprehensively when I push
the nearby all-supported-computers-have-locale_t patch from the
check_strxfrm_bug thread).




Re: pgbnech: allow to cancel queries during benchmark

2023-07-03 Thread Fabien COELHO



Yugo-san,

Some feedback about v1 of this patch.

Patch applies cleanly, compiles.

There are no tests, could there be one? ISTM that one could be done with a 
"SELECT pg_sleep(...)" script??


The global name "all_state" is quite ambiguous, what about "client_states" 
instead? Or maybe it could be avoided, see below.


Instead of renaming "state" to "all_state" (or client_states as suggested 
above), I'd suggest to minimize the patch by letting "state" inside the 
main and adding a "client_states = state;" just after the allocation, or 
another approach, see below.


Should PQfreeCancel be called on deconnections, in finishCon? I think that 
there may be a memory leak with the current implementation??


Maybe it should check that cancel is not NULL before calling PQcancel?

After looking at the code, I'm very unclear whether they may be some 
underlying race conditions, or not, depending on when the cancel is 
triggered. I think that some race conditions are still likely given the 
current thread 0 implementation, and dealing with them with a barrier or 
whatever is not desirable at all.


In order to work around this issue, ISTM that we should go away from the 
simple and straightforward thread 0 approach, and the only clean way is 
that the cancelation should be managed by each thread for its own client.


I'd suggest to have the advanceState to call PQcancel when CancelRequested 
is set and switch to CSTATE_ABORTED to end properly. This means that there 
would be no need for the global client states pointer, so the patch should 
be smaller and simpler. Possibly there would be some shortcuts added here 
and there to avoid lingering after the control-C, in threadRun.


--
Fabien.




Including a sample Table Access Method with core code

2023-07-03 Thread Hannu Krosing
At PgCon 2023 in Ottawa we had an Unconference session on Table Access
Methods [1]

One thing that was briefly mentioned (but is missing from the notes)
is need to have a sample API client in contrib/ , both for having a
2nd user for API to make it more likely that non-heap AMs are doable
and also to serve as an easy starting point for someone interested in
developing a new AM.

There are a few candidates which could be lightweight enough for this

* in-memory temp tables, especially if you specify max table size at
creation and/or limit data types which can be used.

* "overlay tables" - tables which "overlay" another - possibly
read-only - table and store only changed rows and tombstones for
deletions. (this likely would make more sense as a FDW itself as Table
AM currently knows nothing about Primary Keys and these are likely
needed for overlays)

* Table AM as a (pl/)Python Class - this is inspired by the amazing
Multicorn [2] FDW-in-Python tool which made it ridiculously easy to
expose anything (mailbox, twitter feed, git commit history,
you-name-it) as a Foreign Table


Creating any of these seems to be a project of size suitable for a
student course project or maybe Google Summer of Code [3].


Included Mark Dilger directly to this mail as he mentioned he has a
Perl script that makes a functional copy of heap AM that can be
compiled as installed as custom AM.

@mark - maybe you can create 3 boilerplate Table AMs for the above
named `mem_am`, `overlay_am` and `py3_am` and we could put them
somewhere for interested parties to play with ?

[1] https://wiki.postgresql.org/wiki/PgCon_2023_Developer_Unconference#Table_AMs
[2] https://multicorn.org/ - unfortunately unmaintained since 2016,
but there are some forks supporting later PostgreSQL versions
[3] https://wiki.postgresql.org/wiki/GSoC - Google Summer of Code

---
Best Regards
Hannu




Size vs size_t or, um, PgSize?

2023-07-03 Thread Yurii Rashkovskii
Hi,

I've run into an issue of a name clash with system libraries. Specifically,
the `Size` type seems to be just an alias for `size_t` and, at least on
macOS, it clashes with the core SDK, as it is also defined by MacTypes.h,
which is used by some of the libraries one may want to use from within a
Postgres extension.

While in my case, I believe I have a workaround, I couldn't find any
rationale as to why we might want to have this alias and not use size_t.
Any insight on this would be appreciated.

Would there be any sense in changing it all to size_t or renaming it to
something else?

I understand that they will break some extensions, so if we don't want them
to have to go through with the renaming, can we enable backward
compatibility with a macro?

If there's a willingness to try this out, I am happy to prepare a patch.

-- 
Y.


Re: pgsql: Fix search_path to a safe value during maintenance operations.

2023-07-03 Thread Nathan Bossart
On Sun, Jul 02, 2023 at 08:57:31PM -0700, Noah Misch wrote:
> Another dimension of compromise could be to make MAINTAIN affect fewer
> commands in v16.  Un-revert the part of commit 05e1737 affecting just the
> commands it still affects.  For example, limit MAINTAIN and the 05e1737
> behavior change to VACUUM, ANALYZE, and REINDEX.  Don't worry about VACUUM or
> ANALYZE failing under commit 05e1737, since they would have been failing under
> autovacuum since 2018.  A problem index expression breaks both autoanalyze and
> REINDEX, hence the inclusion of REINDEX.  The already-failing argument doesn't
> apply to CLUSTER or REFRESH MATERIALIZED VIEW, so those commands could join
> MAINTAIN in v17.

I'm open to compromise if others are, but I'm skeptical that folks will be
okay with too much fancy footwork this late in the game.

Anyway, IMO your argument could extend to CLUSTER and REFRESH, too.  If
we're willing to change behavior under the assumption that autovacuum
would've been failing since 2018, then why wouldn't we be willing to change
it everywhere?  I suppose someone could have been manually vacuuming with a
special search_path for 5 years to avoid needing to schema-qualify their
index expressions (and would then be surprised that CLUSTER/REFRESH no
longer work), but limiting MAINTAIN to VACUUM, etc. would still break their
use-case, right?

> From my reading of the objections, I think they're saying that commit 05e1737
> arrived too late and that MAINTAIN is unacceptable without commit 05e1737.  I
> think you'd conform to all objections by pushing the revert to v16 and pushing
> a roll-forward of commit 05e1737 to master.

Okay, I'll adjust my plans accordingly.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Why is DATESTYLE, ordering ignored for output but used for input ?

2023-07-03 Thread Dave Cramer
Greetings,

For ISO and German dates the order DMY is completely ignored on output but
used for input.

test=# set datestyle to 'ISO,DMY';
SET
select '7-8-2023'::date
test-# ;
date

 2023-08-07
(1 row)

test=# set datestyle to 'ISO,MDY';
SET
test=# select '7-8-2023'::date
;
date

 2023-07-08
(1 row)

Note regardless of  how the ordering is specified it is always output as
YMD

Dave Cramer


Re: Optionally using a better backtrace library?

2023-07-03 Thread Tristan Partin
On Mon Jul 3, 2023 at 12:43 PM CDT, Andres Freund wrote:
> On 2023-07-03 11:58:25 +0200, Alvaro Herrera wrote:
> > On 2023-Jul-02, Andres Freund wrote:
> > > Nice things about libbacktrace are that the generation of stack traces is
> > > documented to be async signal safe on most platforms (with a #define to 
> > > figure
> > > that out, and a more minimal safe version always available) and that it
> > > supports a wide range of platforms:
> > 
> > Sadly, it looks like the library is seldom distributed.
>
> It's often distributed as part of gcc.
>
>
> > For example, Debian seems to only have a package called android-libbacktrace
> > which I imagine is not what we want.
>
> Indeed not.
>
>
> > On my system I see a static library only -- is that enough?  That file is
> > part of package libgcc-10-dev, which tells me that we can't depend on that
> > for packaging purposes.
>
> We should be able to depend on that gcc-NN depends on libgcc-NN-dev, it
> contains all the compiler version specific stuff. It's where the intrinsics
> headers, C runtime initialization, sanitizer libraries all live.  clang will
> typically also depend on libgcc-NN-dev on unixoid systems.
>
> And since it's statically linked (and needs to be apparently), you don't need
> libgcc-NN-dev installed at runtime.
>
>
> > I think it's pretty much the same in the RPM side of the world.
>
> I don't know much about that side of the world...

I could not find this packaged in Fedora. I did find it in FreeBSD
however. We could add libbacktrace as a Meson subproject.

-- 
Tristan Partin
Neon (https://neon.tech)




Re: Should heapam_estimate_rel_size consider fillfactor?

2023-07-03 Thread Tomas Vondra
On 7/3/23 11:40, Tomas Vondra wrote:
> ...
>
> FWIW the reason why the integer division is intentional is most likely
> that we want "floor" semantics - if there's 10.23 rows per page, that
> really means 10 rows per page.
> 
> I doubt it makes a huge difference in this particular place, considering
> we're calculating the estimate from somewhat unreliable values, and then
> use it for rough estimate of relation size.
> 
> But from this POV, I think it's more correct to do it "my" way:
> 
>   density = (usable_bytes_per_page * fillfactor / 100) / tuple_width;
> 
> because that's doing *two* separate integer divisions, with floor
> semantics. First we calculate "usable bytes" (rounded down), then
> average number of rows per page (also rounded down).
> 
> Corey's formula would do just one integer division. I don't think it
> makes a huge difference, though. I mean, it's just an estimate and so we
> can't expect to be 100% accurate.
> 

Pushed, using the formula with two divisions (as in the original patch).

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Does a cancelled REINDEX CONCURRENTLY need to be messy?

2023-07-03 Thread Álvaro Herrera
On 2023-Jul-01, Thom Brown wrote:

> On Thu, 29 Jun 2023, 14:45 Álvaro Herrera,  wrote:
> 
> > ALTER TABLE DETACH CONCURRENTLY had to deal with this also, and it did it
> > by having a COMPLETE option you can run later in case things got stuck the
> > first time around. I suppose we could do something similar, where the
> > server automatically does the needful, whatever that is.
> 
> So there doesn't appear to be provision for deferred activities.

There is not.

> Could, perhaps, the fact that it is an invalid index that has no locks on
> it, and is dependent on the table mean it could be removed by a VACUUM?

Well, I definitely agree that it would be useful to have *something*
that automatically removes debris  (I'm not sure VACUUM is the best place
to do it.  Perhaps we could have autovacuum check for it, and do it
separately of vacuum proper.)

On the whole, the reason we don't have such a mechanism AFAIK is that
nobody has presented a credible implementation for it.  There was a push
to use UNDO to remove orphan files; if we had that, we could also use it
to implement cleanup of dead indexes and partially-detached partitions.
However, that project crashed and burned a long time ago and has seen no
resurrection as yet.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program
will work forever" (Oliver Silfridge)




Re: Optionally using a better backtrace library?

2023-07-03 Thread Andres Freund
Hi,

On 2023-07-03 11:58:25 +0200, Alvaro Herrera wrote:
> On 2023-Jul-02, Andres Freund wrote:
> > I like that we now have a builtin backtrace ability. Unfortunately I think 
> > the
> > backtraces are often not very useful, because only externally visible
> > functions are symbolized.
> 
> Agreed, these backtraces are pretty close to useless.  Not completely,
> but I haven't found a practical way to use them for actual debugging
> of production problems.

Yea. And I've grown pretty tired asking people to break out gdb in production
scenarios :/


> > Nice things about libbacktrace are that the generation of stack traces is
> > documented to be async signal safe on most platforms (with a #define to 
> > figure
> > that out, and a more minimal safe version always available) and that it
> > supports a wide range of platforms:
> 
> Sadly, it looks like the library is seldom distributed.

It's often distributed as part of gcc.


> For example, Debian seems to only have a package called android-libbacktrace
> which I imagine is not what we want.

Indeed not.


> On my system I see a static library only -- is that enough?  That file is
> part of package libgcc-10-dev, which tells me that we can't depend on that
> for packaging purposes.

We should be able to depend on that gcc-NN depends on libgcc-NN-dev, it
contains all the compiler version specific stuff. It's where the intrinsics
headers, C runtime initialization, sanitizer libraries all live.  clang will
typically also depend on libgcc-NN-dev on unixoid systems.

And since it's statically linked (and needs to be apparently), you don't need
libgcc-NN-dev installed at runtime.


> I think it's pretty much the same in the RPM side of the world.

I don't know much about that side of the world...

Greetings,

Andres Freund




Re: Fdw batch insert error out when set batch_size > 65535

2023-07-03 Thread Tomas Vondra
On 7/2/23 15:50, Tomas Vondra wrote:
> On 7/2/23 15:23, Tom Lane wrote:
>> Andres Freund  writes:
>>> On 2021-06-11 18:44:28 -0400, Tom Lane wrote:
 I suggest what we do is leave it in place for long enough to get
 a round of reports from those slow animals, and then (assuming
 those reports are positive) drop the test.
>>
>>> I think two years later is long enough to have some confidence in this being
>>> fixed?
>>
>> +1, time to drop it (in the back branches too).
>>
> 
> OK, will do (unless someone else wants to handle this) on Monday.
> 

FWIW I've removed the test from all branches where it was present.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Add <> support to sepgsql_restorecon

2023-07-03 Thread Daniel Gustafsson
> On 20 Mar 2023, at 21:17, Ted Toth  wrote:
> 
> Not this month unfortunately other work has taken precedence. I'll need to 
> look at what it's going to take to create a test. Hopefully I can piggyback 
> on an existing test.

This patch has been marked Waiting on Author since January, I'm marking it
Returned with Feedback.  Please feel free to resubmit when there is time and
interest to resume work on this.

--
Daniel Gustafsson





Re: Inconsistent results with libc sorting on Windows

2023-07-03 Thread Juan José Santamaría Flecha
El lun, 3 jul 2023, 17:42, Alvaro Herrera 
escribió:

> On 2023-Jun-19, Juan José Santamaría Flecha wrote:
>
> > Ok, let's see where the report goes:
> >
> >
> https://developercommunity.visualstudio.com/t/CompareStringEx-non-transitive/10393003?q=comparestringex
>
> Hm, so this appears to have been marked as solved by Microsoft.  Can you
> recheck?  Also, what does the resolution mean for Postgres, in practical
> terms?
>


It's not really solved, they have just pushed the issue to another team
that's only reachable through the Windows feedback hub. I've already
provided the feedback, but it's only visible through the proprietary app.

I would say there haven't been any real progress on that front so far.

Regards,

Juan José Santamaría Flecha

>


Re: explain analyze rows=%.0f

2023-07-03 Thread Daniel Gustafsson
> On 8 Jun 2023, at 19:49, Ibrar Ahmed  wrote:
> On Mon, Mar 20, 2023 at 7:56 PM Gregory Stark (as CFM)  > wrote:

> This patch was marked Returned with Feedback and then later Waiting on
> Author. And it hasn't had any updates since January. What is the state
> on this feedback? If it's already done we can set the patch to Ready
> for Commit and if not do you disagree with the proposed changes?
> 
> If there is a consensus to modify the test cases' output, I am willing to
> make the necessary changes and adjust the patch accordingly. However,
> if there is a preference to keep the output of certain test cases unchanged,
> I can rebase and modify the patch accordingly to accommodate those 
> preferences.

As there hasn't been any other comments I suggest updating your patch to
address Tom's comments to see if we can make progress here.

--
Daniel Gustafsson





Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG

2023-07-03 Thread Joe Conway

On 7/3/23 12:25, Tristan Partin wrote:

On Sat Jun 24, 2023 at 8:09 AM CDT, Joe Conway wrote:
Although I have not looked yet, presumably we could have similar 
problems with plpython. I would like to get agreement on this approach 
against plperl before diving into that though.


Thoughts?


I don't see anything immediately wrong with this. I think doing a
similar thing for plpython would make sense. Might make sense to CC any
other pl* maintainers too.


In our tree there are only plperl and plpython to worry about.

"other pl* maintainers" is a fuzzy concept since other pl's are 
scattered far and wide.


I think it is reasonable to expect such maintainers to be paying 
attention to hackers and pick up on it themselves (I say that as a pl 
maintainer myself -- plr)


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG

2023-07-03 Thread Joe Conway

On 7/3/23 12:17, Tristan Partin wrote:

The Reply-To header in your email is pointing at joe@cd, fyi. Pretty
strange.



I noticed that -- it happened only the one time, and I am not sure why. 
Seems fine now though.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG

2023-07-03 Thread Tristan Partin
On Sat Jun 24, 2023 at 8:09 AM CDT, Joe Conway wrote:
> Although I have not looked yet, presumably we could have similar 
> problems with plpython. I would like to get agreement on this approach 
> against plperl before diving into that though.
>
> Thoughts?

I don't see anything immediately wrong with this. I think doing a
similar thing for plpython would make sense. Might make sense to CC any
other pl* maintainers too.

-- 
Tristan Partin
Neon (https://neon.tech)




Re: Proposal: %T Prompt parameter for psql for current time (like Oracle has)

2023-07-03 Thread Daniel Gustafsson
> On 1 Mar 2023, at 10:29, Jim Jones  wrote:
> 
> On 01.03.23 01:47, Kirk Wolak wrote:
>> Patch Posted with one edit, for line editings (Thanks Jim!)
> The patch didn't pass the SanityCheck:
> 
> https://cirrus-ci.com/task/5445242183221248?logs=build#L1337
> 
> missing a header perhaps?
> 
> #include "time.h"

This patch spent the March commitfest not building and still doesn't build, so
I'm marking this returned with feedback.  Please feel free to resubmit to the
next commitfest if there is renewed interest in the patch.

--
Daniel Gustafsson





Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG

2023-07-03 Thread Tristan Partin
Joe,

The Reply-To header in your email is pointing at joe@cd, fyi. Pretty
strange.

-- 
Tristan Partin
Neon (https://neon.tech)




Re: Sampling-based timing for EXPLAIN ANALYZE

2023-07-03 Thread Daniel Gustafsson
This thread has been stale since January with no movement at all during the
March CF, and according to the CFBot it stopped building at all ~ 14 weeks ago.

I'm marking this returned with feedback, it can be resubmitted for a future CF
if someone decides to pick it up.

--
Daniel Gustafsson





Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-03 Thread jian he
On Mon, Jul 3, 2023 at 4:26 PM Palak Chaturvedi
 wrote:
>
> Hi Thomas,
> Thank you for your suggestions. I have added the sql in the meson
> build as well.
>
> On Sat, 1 Jul 2023 at 03:39, Thomas Munro  wrote:
> >
> > On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
> >  wrote:
> > > pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
> > > pg_buffercache where relfilenode =
> > > pg_relation_filenode('pgbench_accounts'::regclass);
> >
> > Hi Palak,
> >
> > Thanks for working on this!  I think this will be very useful for
> > testing existing workloads but also for testing future work on
> > prefetching with AIO (and DIO), work on putting SLRUs (or anything
> > else) into the buffer pool, nearby proposals for caching buffer
> > mapping information, etc etc.
> >
> > Palak and I talked about this idea a bit last week (stimulated by a
> > recent thread[1], but the topic has certainly come up before), and we
> > discussed some different ways one could specify which pages are
> > dropped.  For example, perhaps the pg_prewarm extension could have an
> > 'unwarm' option instead.  I personally thought the buffer ID-based
> > approach was quite good because it's extremely simple, while giving
> > the user the full power of SQL to say which buffers.   Half a table?
> > Visibility map?  Everything?  Root page of an index?  I think that's
> > probably better than something that requires more code and
> > complication but is less flexible in the end.  It feels like the right
> > level of rawness for something primarily of interest to hackers and
> > advanced users.  I don't think it matters that there is a window
> > between selecting a buffer ID and invalidating it, for the intended
> > use cases.  That's my vote, anyway, let's see if others have other
> > ideas...
> >
> > We also talked a bit about how one might control the kernel page cache
> > in more fine-grained ways for testing purposes, but it seems like the
> > pgfincore project has that covered with its pgfadvise_willneed() and
> > pgfadvise_dontneed().  IMHO that project could use more page-oriented
> > operations (instead of just counts and coarse grains operations) but
> > that's something that could be material for patches to send to the
> > extension maintainers.  This work, in contrast, is more tangled up
> > with bufmgr.c internals, so it feels like this feature belongs in a
> > core contrib module.
> >
> > Some initial thoughts on the patch:
> >
> > I wonder if we should include a simple exercise in
> > contrib/pg_buffercache/sql/pg_buffercache.sql.  One problem is that
> > it's not guaranteed to succeed in general.  It doesn't wait for pins
> > to go away, and it doesn't retry cleaning dirty buffers after one
> > attempt, it just returns false, which I think is probably the right
> > approach, but it makes the behaviour too non-deterministic for simple
> > tests.  Perhaps it's enough to include an exercise where we call it a
> > few times to hit a couple of cases, but not verify what effect it has.
> >
> > It should be restricted by role, but I wonder which role it should be.
> > Testing for superuser is now out of fashion.
> >
> > Where the Makefile mentions 1.4--1.5.sql, the meson.build file needs
> > to do the same.  That's because PostgreSQL is currently in transition
> > from autoconf/gmake to meson/ninja[2], so for now we have to maintain
> > both build systems.  That's why it fails to build in some CI tasks[3].
> > You can enable CI in your own GitHub account if you want to run test
> > builds on several operating systems, see [4] for info.
> >
> > [1] 
> > https://www.postgresql.org/message-id/flat/CAFSGpE3y_oMK1uHhcHxGxBxs%2BKrjMMdGrE%2B6HHOu0vttVET0UQ%40mail.gmail.com
> > [2] https://wiki.postgresql.org/wiki/Meson
> > [3] http://cfbot.cputube.org/palak-chaturvedi.html
> > [4] 
> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/ci/README;hb=HEAD

newbie question:
quote from: https://www.interdb.jp/pg/pgsql08.html
>
> Pinned: When the corresponding buffer pool slot stores a page and any 
> PostgreSQL processes are accessing the page (i.e. refcount and usage_count 
> are greater than or equal to 1), the state of this buffer descriptor is 
> pinned.
> Unpinned: When the corresponding buffer pool slot stores a page but no 
> PostgreSQL processes are accessing the page (i.e. usage_count is greater than 
> or equal to 1, but refcount is 0), the state of this buffer descriptor is 
> unpinned.


So do you need to check BUF_STATE_GET_REFCOUNT(buf_state) and
BUF_STATE_GET_USAGECOUNT(state)?




Re: Inconsistent results with libc sorting on Windows

2023-07-03 Thread Alvaro Herrera
On 2023-Jun-19, Juan José Santamaría Flecha wrote:

> Ok, let's see where the report goes:
> 
> https://developercommunity.visualstudio.com/t/CompareStringEx-non-transitive/10393003?q=comparestringex

Hm, so this appears to have been marked as solved by Microsoft.  Can you
recheck?  Also, what does the resolution mean for Postgres, in practical
terms?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"If you have nothing to say, maybe you need just the right tool to help you
not say it."   (New York Times, about Microsoft PowerPoint)




Re: pg_basebackup check vs Windows file path limits

2023-07-03 Thread Daniel Gustafsson
> On 3 Jul 2023, at 17:18, Andrew Dunstan  wrote:
> On 2023-07-03 Mo 10:16, Daniel Gustafsson wrote:
>>> On 3 Jul 2023, at 16:12, Andrew Dunstan 
>>>  wrote:
>>> 
>>> I've pushed a better solution, which creates the file via a short symlink. 
>>> Experimentation on fairywren showed this working.
>>> 
>> The buildfarm seems a tad upset after this?
> 
> Yeah :-( 
> 
> I think it should be fixing itself now.

Yeah, thanks for speedy fix!

--
Daniel Gustafsson





Re: pg_basebackup check vs Windows file path limits

2023-07-03 Thread Andrew Dunstan


On 2023-07-03 Mo 10:16, Daniel Gustafsson wrote:

On 3 Jul 2023, at 16:12, Andrew Dunstan  wrote:
I've pushed a better solution, which creates the file via a short symlink. 
Experimentation on fairywren showed this working.

The buildfarm seems a tad upset after this?



Yeah :-(

I think it should be fixing itself now.


cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: check_strxfrm_bug()

2023-07-03 Thread Tristan Partin
On Sun Jul 2, 2023 at 8:49 PM CDT, Thomas Munro wrote:
> On Sun, Jul 2, 2023 at 4:25 AM Noah Misch  wrote:
> > On Wed, Jun 28, 2023 at 01:02:21PM +1200, Thomas Munro wrote:
> > > On Wed, Jun 28, 2023 at 11:03 AM Thomas Munro  
> > > wrote:
> > > > The GCC build farm has just received some SPARC hardware new enough to
> > > > run modern Solaris (hostname gcc106), so if wrasse were moved over
> > > > there we could finally assume all systems have POSIX 2008 (AKA
> > > > SUSv4)'s locale_t.
> > >
> > > That would look something like this.
> >
> > This removes thirty-eight ifdefs, most of them located in the middle of
> > function bodies.  That's far more beneficial than most proposals to raise
> > minimum requirements.  +1 for revoking support for wrasse's OS version.
> > (wrasse wouldn't move, but it would stop testing v17+.)
>
> Great.  It sounds like I should wait a few days for any other feedback
> and then push this patch.  Thanks for looking.

The patch looks good to me as well. Happy to rebase my other patch on
this one.

-- 
Tristan Partin
Neon (https://neon.tech)




Re: Make uselocale protection more consistent

2023-07-03 Thread Tristan Partin
On Mon Jul 3, 2023 at 9:21 AM CDT, Peter Eisentraut wrote:
> On 03.07.23 15:21, Tristan Partin wrote:
> >>> I think it would be better to keep HAVE_LOCALE_T as encompassing any of
> >>> the various locale_t-using functions, rather than using HAVE_USELOCALE
> >>> as a proxy for them.  Otherwise you create weird situations like having
> >>> #ifdef HAVE_WCSTOMBS_L inside #ifdef HAVE_USELOCALE, which doesn't make
> >>> sense, I think.
> >> I propose[1] that we get rid of HAVE_LOCALE_T completely and make
> >> "libc" provider support unconditional.  It's standardised, and every
> >> target system has it, even Windows.  But Windows doesn't have
> >> uselocale().
> >>
> >> [1]https://www.postgresql.org/message-id/flat/CA%2BhUKGL7CmmzeRhoirzjECmOdABVFTn8fo6gEOaFRF1Oxey6Hw%40mail.gmail.com#aef2f2274b28ff8a36f9b8a598e3cec0
> > I think keeping HAVE_USELOCALE is important for the Windows case as
> > mentioned. I need it for my localization work where I am ripping out
> > setlocale() on non-Windows.
>
> The current code is structured
>
> #ifdef HAVE_LOCALE_T
> #ifdef HAVE_WCSTOMBS_L
>  wcstombs_l(...);
> #else
>  uselocale(...);
> #endif
> #else
>  elog(ERROR);
> #endif
>
> If you just replace HAVE_LOCALE_T with HAVE_USELOCALE, then this would 
> penalize a platform that has wcstombs_l(), but not uselocale().  I think 
> the correct structure would be
>
> #if defined(HAVE_WCSTOMBS_L)
>  wcstombs_l(...);
> #elif defined(HAVE_USELOCALE)
>  uselocale(...);
> #else
>  elog(ERROR);
> #endif

That makes sense to me. I gave it some more thought. Maybe it makes more
sense to just completely drop HAVE_USELOCALE as mentioned, and protect
calls to it with #ifdef WIN32 or whatever the macro is. HAVE_USELOCALE
might be more descriptive, but I don't really care that much either way.

-- 
Tristan Partin
Neon (https://neon.tech)




Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG

2023-07-03 Thread Tristan Partin
On Fri Jun 30, 2023 at 7:13 AM CDT, Joe Conway wrote:
> On 6/29/23 22:13, Tristan Partin wrote:
> > On Mon Jun 5, 2023 at 11:00 AM CDT, Heikki Linnakangas wrote:
> >> I think the uselocale() call renders ineffective the setlocale() calls 
> >> that we make later. Maybe we should replace our setlocale() calls with 
> >> uselocale(), too.
> > 
> > For what it's worth to everyone else in the thread (especially Joe), I
> > have a patch locally that fixes the mentioned bug using uselocale(). I
> > am not sure that it is worth committing for v16 given how _large_ (the
> > patch is actually quite small, +216 -235) of a change it is. I am going
> > to spend tomorrow combing over it a bit more and evaluating other
> > setlocale uses in the codebase.
>
> (moving thread to hackers)
>
> I don't see a patch attached -- how is it different than what I posted a 
> week ago and added to the commitfest here?
>
>   https://commitfest.postgresql.org/43/4413/
>
> FWIW, if you are proposing replacing all uses of setlocale() with 
> uselocale() as Heikki suggested:
>
> 1/ I don't think that is pg16 material, and almost certainly not 
> back-patchable to earlier.

I am in agreement.

> 2/ It probably does not solve all of the identified issues caused by the 
> newer perl libraries by itself, i.e. I believe the patch posted to the 
> CF is still needed.

Perhaps. I do think your patch is still valuable regardless. Works for
backpatching and is just good defensive programming. I have added myself
as a reviewer.

> 3/ I believe it is probably the right way to go for pg17+, but I would 
> love to hear opinions from Jeff Davis, Peter Eisentraut, and/or Thomas 
> Munroe (the locale code "usual suspects" ;-)), and others, about that.

Thanks for your patience. Attached is a patch that should cover all the
problematic use cases of setlocale(). There are some setlocale() calls in
tests, initdb, and ecpg left. I plan to get to ecpglib before the final
version of this patch after I abstract over Windows not having
uselocale(). I think leaving initdb and tests as is would be fine, but I
am also happy to just permanently purge setlocale() from the codebase
if people see value in that. We could also poison[0] setlocale() at that
point.

[0]: https://gcc.gnu.org/onlinedocs/cpp/Pragmas.html

-- 
Tristan Partin
Neon (https://neon.tech)
From 5688bc2b2c6331f437a72b6a429199c5416ccd76 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Fri, 30 Jun 2023 09:31:04 -0500
Subject: [PATCH v1 1/3] Skip checking for uselocale on Windows

Windows doesn't have uselocale, so skip it.
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index fbec997947..df76f10822 100644
--- a/meson.build
+++ b/meson.build
@@ -2439,7 +2439,7 @@ func_checks = [
   ['strsignal'],
   ['sync_file_range'],
   ['syncfs'],
-  ['uselocale'],
+  ['uselocale', {'skip': host_system == 'windows'}],
   ['wcstombs_l'],
 ]
 
-- 
Tristan Partin
Neon (https://neon.tech)

From d52ab7851e9638e36cd99859014d7aa31154e600 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Fri, 30 Jun 2023 11:13:29 -0500
Subject: [PATCH v1 2/3] Add locale_is_c function

In some places throughout the codebase, there are string comparisons for
locales matching C or POSIX. Encapsulate this logic into a single
function and use it.
---
 src/backend/utils/adt/pg_locale.c | 37 ++-
 src/backend/utils/init/postinit.c |  4 +---
 src/backend/utils/mb/mbutils.c|  5 ++---
 src/include/utils/pg_locale.h |  1 +
 4 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 514d4a9eeb..b455ef3aff 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -154,6 +154,21 @@ static void icu_set_collation_attributes(UCollator *collator, const char *loc,
 		 UErrorCode *status);
 #endif
 
+/*
+ * Check if a locale is the C locale
+ *
+ * POSIX is an alias for C. Passing ingore_case as true will use
+ * pg_strcasecmp() instead of strcmp().
+ */
+bool
+locale_is_c(const char *locale, bool ignore_case)
+{
+	if (!ignore_case)
+		return strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0;
+
+	return pg_strcasecmp(locale, "C") == 0 || pg_strcasecmp(locale, "POSIX") == 0;
+}
+
 /*
  * pg_perm_setlocale
  *
@@ -1239,10 +1254,8 @@ lookup_collation_cache(Oid collation, bool set_flags)
 			datum = SysCacheGetAttrNotNull(COLLOID, tp, Anum_pg_collation_collctype);
 			collctype = TextDatumGetCString(datum);
 
-			cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
-		 (strcmp(collcollate, "POSIX") == 0));
-			cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
-	   (strcmp(collctype, "POSIX") == 0));
+			cache_entry->collate_is_c = locale_is_c(collcollate, false);
+			cache_entry->ctype_is_c = locale_is_c(collctype, false);
 		}
 		else
 		{
@@ -1290,12 +1303,8 @@ lc_collate_is_c(Oid collation)
 		

Re: Optionally using a better backtrace library?

2023-07-03 Thread Peter Eisentraut

On 02.07.23 20:31, Andres Freund wrote:

A lot of platforms have "libbacktrace" available, e.g. as part of gcc. I think
we should consider using it, when available, to produce more useful
backtraces.

I hacked it up for ereport() to debug something, and the backtraces are
considerably better:


Makes sense.  When we first added backtrace support, we considered 
libunwind, which didn't really give better backtraces than the built-in 
stuff, so it wasn't worth dealing with an additional dependency.






Re: Add support for AT LOCAL

2023-07-03 Thread Vik Fearing

On 7/3/23 15:42, Daniel Gustafsson wrote:

On 6 Jun 2023, at 05:13, Vik Fearing  wrote:



Patch against 3f1180 attached.


This patch fails to compile, the declaration of variables in the switch block
needs to be scoped within a { } block.



Interesting.  It compiles for me.



I've fixed this trivial error in the
attached v2 and also reflowed the comments which now no longer fit.



Thank you.
--
Vik Fearing





Re: Make uselocale protection more consistent

2023-07-03 Thread Peter Eisentraut

On 03.07.23 15:21, Tristan Partin wrote:

I think it would be better to keep HAVE_LOCALE_T as encompassing any of
the various locale_t-using functions, rather than using HAVE_USELOCALE
as a proxy for them.  Otherwise you create weird situations like having
#ifdef HAVE_WCSTOMBS_L inside #ifdef HAVE_USELOCALE, which doesn't make
sense, I think.

I propose[1] that we get rid of HAVE_LOCALE_T completely and make
"libc" provider support unconditional.  It's standardised, and every
target system has it, even Windows.  But Windows doesn't have
uselocale().

[1]https://www.postgresql.org/message-id/flat/CA%2BhUKGL7CmmzeRhoirzjECmOdABVFTn8fo6gEOaFRF1Oxey6Hw%40mail.gmail.com#aef2f2274b28ff8a36f9b8a598e3cec0

I think keeping HAVE_USELOCALE is important for the Windows case as
mentioned. I need it for my localization work where I am ripping out
setlocale() on non-Windows.


The current code is structured

#ifdef HAVE_LOCALE_T
#ifdef HAVE_WCSTOMBS_L
wcstombs_l(...);
#else
uselocale(...);
#endif
#else
elog(ERROR);
#endif

If you just replace HAVE_LOCALE_T with HAVE_USELOCALE, then this would 
penalize a platform that has wcstombs_l(), but not uselocale().  I think 
the correct structure would be


#if defined(HAVE_WCSTOMBS_L)
wcstombs_l(...);
#elif defined(HAVE_USELOCALE)
uselocale(...);
#else
elog(ERROR);
#endif





Re: logicalrep_message_type throws an error

2023-07-03 Thread Euler Taveira
On Mon, Jul 3, 2023, at 10:57 AM, Ashutosh Bapat wrote:
> On Mon, Jul 3, 2023 at 6:52 PM Ashutosh Bapat
>  wrote:
> >
> > The switch is on action which is an enum. So without default it will
> > provide a compilation warning for missing enums. Adding "default" case
> > defeats that purpose. I think we should just return "???" from outside
> > switch block.
> >

Yeah. I don't think any gcc -Wswitch options have effect if a default is used
so your suggestion is good for wrong/missing message types in the future.

> PFA patch.

WFM.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: pg_basebackup check vs Windows file path limits

2023-07-03 Thread Daniel Gustafsson
> On 3 Jul 2023, at 16:12, Andrew Dunstan  wrote:

> I've pushed a better solution, which creates the file via a short symlink. 
> Experimentation on fairywren showed this working.

The buildfarm seems a tad upset after this?

--
Daniel Gustafsson





Re: Make uselocale protection more consistent

2023-07-03 Thread Peter Eisentraut

On 03.07.23 08:24, Thomas Munro wrote:

I propose[1] that we get rid of HAVE_LOCALE_T completely and make
"libc" provider support unconditional.  It's standardised, and every
target system has it, even Windows.  But Windows doesn't have
uselocale().

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGL7CmmzeRhoirzjECmOdABVFTn8fo6gEOaFRF1Oxey6Hw%40mail.gmail.com#aef2f2274b28ff8a36f9b8a598e3cec0


Ok, it appears your patch is imminent, so let's wait for that and see if 
this patch here is still required afterwards.






Re: pg_basebackup check vs Windows file path limits

2023-07-03 Thread Andrew Dunstan


On 2023-07-02 Su 09:15, Andrew Dunstan wrote:


The buildfarm animal fairywren has been failing the tests for 
pg_basebackup because it can't create a file with a path longer than 
255 chars. This has just been tripped because for release 16 it's 
running TAP tests, and the branch name is part of the file path, and 
"REL_16_STABLE" is longer than "HEAD". I did think of chdir'ing into 
the directory to create the file, but experimentation shows that 
doesn't solve matters. I also adjusted the machine's settings related 
to long file names, but to no avail, so for now I propose to reduce 
slightly the name of the long file so it still exercises the check for 
file names longer than 100 but doesn't trip this up on fairywren. But 
that's a bandaid. I don't have a good solution for now.




I've pushed a better solution, which creates the file via a short 
symlink. Experimentation on fairywren showed this working.



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Mark a transaction uncommittable

2023-07-03 Thread Daniel Gustafsson
> On 5 Jun 2023, at 09:22, Gurjeet Singh  wrote:

> Please see attached the patch that introduces this new feature.

This patch fails the guc_check test due to something missing in the sample
configuration.  Please send a rebased version with this test fixed.

[08:31:26.680] --- stdout ---
[08:31:26.680] # executing test in 
/tmp/cirrus-ci-build/build/testrun/test_misc/003_check_guc group test_misc test 
003_check_guc
[08:31:26.680] not ok 1 - no parameters missing from postgresql.conf.sample
[08:31:26.680] ok 2 - no parameters missing from guc_tables.c
[08:31:26.680] ok 3 - no parameters marked as NOT_IN_SAMPLE in 
postgresql.conf.sample
[08:31:26.680] 1..3
[08:31:26.680] # test failed

--
Daniel Gustafsson





Re: cataloguing NOT NULL constraints

2023-07-03 Thread Peter Eisentraut

On 30.06.23 13:44, Alvaro Herrera wrote:

OK, so here's a new attempt to get this working correctly.


Attached is a small fixup patch for the documentation.

Furthermore, there are a few outdated comments that are probably left 
over from previous versions of this patch set.



* src/backend/catalog/pg_constraint.c

Outdated comment:

+   /* only tuples for CHECK constraints should be given */
+   Assert(((Form_pg_constraint) GETSTRUCT(constrTup))->contype == 
CONSTRAINT_NOTNULL);



* src/backend/parser/gram.y

Should processCASbits() set >skip_validation, like in the CHECK
case?  _outConstraint() looks at the field, so it seems relevant.


* src/backend/parser/parse_utilcmd.c

The updated comment says

List   *ckconstraints;  /* CHECK and NOT NULL constraints */

but it seems to me that NOT NULL constraints are not actually added
there but in nnconstraints instead.

It would be nice if the field nnconstraints was listed after
ckconstraints consistently throughout the file.  It's a bit random
right now.

This comment is outdated:

+   /*
+* For NOT NULL declarations, we need to mark the column as
+* not nullable, and set things up to have a CHECK 
constraint

+* created.  Also, duplicate NOT NULL declarations are not
+* allowed.
+*/

About this:

case CONSTR_CHECK:
cxt->ckconstraints = lappend(cxt->ckconstraints, 
constraint);

+
+   /*
+* XXX If the user says CHECK (IS NOT NULL), should we turn
+* that into a regular NOT NULL constraint?
+*/
break;

I think this was decided against.

+   /*
+* Copy NOT NULL constraints, too (these do not require any option 
to have

+* been given).
+*/

Shouldn't that be governed by the INCLUDING CONSTRAINTS option?

Btw., there is some asymmetry here between check constraints and
not-null constraints: Check constraints are in the tuple descriptor,
but not-null constraints are not.  Should that be unified?  Or at
least explained?


* src/include/nodes/parsenodes.h

This comment appears to be outdated:

+ * intermixed in tableElts, and constraints and notnullcols are NIL.  After
+ * parse analysis, tableElts contains just ColumnDefs, notnullcols has been
+ * filled with not-nullable column names from various sources, and 
constraints

+ * contains just Constraint nodes (in fact, only CONSTR_CHECK nodes, in the
+ * present implementation).

There is no "notnullcolls".


* src/test/regress/parallel_schedule

This change appears to be correct, but unrelated to this patch, so I
suggest committing this separately.


* src/test/regress/sql/create_table.sql

-SELECT conislocal, coninhcount FROM pg_constraint WHERE conrelid = 
'part_b'::regclass;
+SELECT conislocal, coninhcount FROM pg_constraint WHERE conrelid = 
'part_b'::regclass ORDER BY coninhcount DESC, conname;


Maybe add conname to the select list here as well, for consistency with 
nearby queries.
From c481b58c3e48ff0ab4738e5cf2440d2ad6fc3e47 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 3 Jul 2023 15:44:59 +0200
Subject: [PATCH] fixup! Add pg_constraint rows for NOT NULL constraints

---
 doc/src/sgml/catalogs.sgml| 2 +-
 doc/src/sgml/ref/alter_table.sgml | 9 -
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 9137b1bc58..36460126f4 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2552,7 +2552,7 @@ pg_constraint 
Columns
   
c = check constraint,
f = foreign key constraint,
-   n = not null constraint,
+   n = not-null constraint,
p = primary key constraint,
u = unique constraint,
t = constraint trigger,
diff --git a/doc/src/sgml/ref/alter_table.sgml 
b/doc/src/sgml/ref/alter_table.sgml
index 0b65731b1f..2c4138e4e9 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -113,13 +113,12 @@
 
 [ CONSTRAINT constraint_name ]
 { CHECK ( expression ) [ NO 
INHERIT ] |
+  NOT NULL column_name [ NO 
INHERIT ] |
   UNIQUE [ NULLS [ NOT ] DISTINCT ] ( column_name [, ... ] ) index_parameters |
   PRIMARY KEY ( column_name [, 
... ] ) index_parameters |
   EXCLUDE [ USING index_method ] 
( exclude_element WITH 
operator [, ... ] ) index_parameters [ WHERE ( predicate ) ] |
   FOREIGN KEY ( column_name [, 
... ] ) REFERENCES reftable [ ( 
refcolumn [, ... ] ) ]
-[ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ] [ ON DELETE referential_action ] [ ON UPDATE referential_action ] |
-  NOT NULL column_name [ NO 
INHERIT ]
-}
+[ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ] [ ON DELETE referential_action ] [ ON UPDATE referential_action ] }
 [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
 
 and table_constraint_using_index is:
@@ -1765,7 +1764,7 @@ Examples
   Compatibility
 

Re: logicalrep_message_type throws an error

2023-07-03 Thread Ashutosh Bapat
On Mon, Jul 3, 2023 at 6:52 PM Ashutosh Bapat
 wrote:
>
> Thanks Euler for the patch.
>
> On Mon, Jul 3, 2023 at 6:32 PM Euler Taveira  wrote:
> >
> > Masahiko, since abc0910e2e0 is your patch maybe you want to take a look at 
> > it.
> >
>
> A couple of comments.
>
> -char *
> +const char *
>
> Nice improvement.
>
>  logicalrep_message_type(LogicalRepMsgType action)
>  {
>  switch (action)
> @@ -1256,9 +1256,7 @@ logicalrep_message_type(LogicalRepMsgType action)
>  return "STREAM ABORT";
>  case LOGICAL_REP_MSG_STREAM_PREPARE:
>  return "STREAM PREPARE";
> +default:
> +return "???";
>  }
> -
> -elog(ERROR, "invalid logical replication message type \"%c\"", action);
> -
> -return NULL;/* keep compiler quiet */
>
> The switch is on action which is an enum. So without default it will
> provide a compilation warning for missing enums. Adding "default" case
> defeats that purpose. I think we should just return "???" from outside
> switch block.
>

PFA patch.

-- 
Best Wishes,
Ashutosh Bapat
From e117a1ea78327ebcb85d55fedb3b5e7e3b5c7b27 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Mon, 3 Jul 2023 08:54:10 -0300
Subject: [PATCH] uncover logical change details

The commit abc0910e2e0 adds logical change details to error context.
However, the function logicalrep_message_type() introduces an
elog(ERROR) that can hide these details. Instead, avoid elog() and use
??? (that is a synonym for unknown). Spotted by Ashutosh Bapat.

Discussion: https://postgr.es/m/CAExHW5suAEDW-mBZt_qu4RVxWZ1vL54-L%2Bci2zreYWebpzxYsA%40mail.gmail.com
---
 src/backend/replication/logical/proto.c | 11 +++
 src/include/replication/logicalproto.h  |  2 +-
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index f308713275..fe6ed9f94b 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1213,7 +1213,7 @@ logicalrep_read_stream_abort(StringInfo in,
 /*
  * Get string representing LogicalRepMsgType.
  */
-char *
+const char *
 logicalrep_message_type(LogicalRepMsgType action)
 {
 	switch (action)
@@ -1258,7 +1258,10 @@ logicalrep_message_type(LogicalRepMsgType action)
 			return "STREAM PREPARE";
 	}
 
-	elog(ERROR, "invalid logical replication message type \"%c\"", action);
-
-	return NULL;/* keep compiler quiet */
+	/*
+	 * This function is called to provide context in the error raised when
+	 * applying a logical message. So we can't throw an error here. Return an
+	 * unknown indicator value so that the original error is still reported.
+	 */
+	return "???";
 }
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 0ea2df5088..c5be981eae 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -269,6 +269,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
 extern void logicalrep_read_stream_abort(StringInfo in,
 		 LogicalRepStreamAbortData *abort_data,
 		 bool read_abort_info);
-extern char *logicalrep_message_type(LogicalRepMsgType action);
+extern const char *logicalrep_message_type(LogicalRepMsgType action);
 
 #endif			/* LOGICAL_PROTO_H */
-- 
2.25.1



Re: Introduce "log_connection_stages" setting.

2023-07-03 Thread Daniel Gustafsson
> On 16 May 2023, at 20:51, Sergey Dudoladov  wrote:

> I have attached the fourth version of the patch.

This version fails the ldap_password test on all platforms on pg_ctl failing to 
start:

[14:48:10.544] --- stdout ---
[14:48:10.544] # executing test in 
/tmp/cirrus-ci-build/build/testrun/ldap_password_func/001_mutated_bindpasswd 
group ldap_password_func test 001_mutated_bindpasswd
[14:48:10.544] # waiting for slapd to accept requests...
[14:48:10.544] # setting up PostgreSQL instance
[14:48:10.544] Bail out! pg_ctl start failed
[14:48:10.544] # test failed

Updating src/test/modules/ldap_password_func/t/001_mutated_bindpasswd.pl with
the new GUC might solve the problem from skimming this.

Please send a fixed version, I'm marking the patch Waiting on Author in the
meantime.

--
Daniel Gustafsson





Is a pg_stat_force_next_flush() call sufficient for regression tests?

2023-07-03 Thread Tomas Vondra
Hi,

I noticed eelpout failed with this in stats.sql, in the pg_stat_io tests
added a couple months ago [1]:

@@ -1415,7 +1415,7 @@
:io_sum_vac_strategy_after_reuses >
:io_sum_vac_strategy_before_reuses;
  ?column? | ?column?
 --+--
- t| t
+ t| f
 (1 row)

The failure seems completely unrelated to the new commit, so this seems
like some randomness / timing issue. The failing bit does this:


VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) test_io_vac_strategy;
SELECT pg_stat_force_next_flush();
SELECT sum(reuses) AS reuses, sum(reads) AS reads
  FROM pg_stat_io WHERE context = 'vacuum' \gset io_sum_vac_strategy_after_
SELECT :io_sum_vac_strategy_after_reads > :io_sum_vac_strategy_before_reads,
   :io_sum_vac_strategy_after_reuses >
:io_sum_vac_strategy_before_reuses;


So I'm wondering if pg_stat_force_next_flush() is enough - AFAICS this
only sets some flag for the *next* pgstat_report_stat() call, but how do
we know that happens before the query execution?

Shouldn't there be something like pg_stat_flush() that actually does the
flushing, instead of just setting the flag?


regards


[1]
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=eelpout=2023-07-03%2011%3A09%3A13

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Add support for AT LOCAL

2023-07-03 Thread Daniel Gustafsson
> On 6 Jun 2023, at 05:13, Vik Fearing  wrote:

> Patch against 3f1180 attached.

This patch fails to compile, the declaration of variables in the switch block
needs to be scoped within a { } block.  I've fixed this trivial error in the
attached v2 and also reflowed the comments which now no longer fit.

--
Daniel Gustafsson



v2-0001-Add-support-for-AT-LOCAL.patch
Description: Binary data


Re: Track Oldest Initialized WAL Buffer Page

2023-07-03 Thread Heikki Linnakangas

On 07/02/2023 16:00, Bharath Rupireddy wrote:

Hi,

While working on [1], I was looking for a quick way to tell if a WAL
record is present in the WAL buffers array without scanning but I
couldn't find one.


/* The end-ptr of the page that contains the record */
expectedEndPtr += XLOG_BLCKSZ - recptr % XLOG_BLCKSZ;

/* get the buffer where the record is, if it's in WAL buffers at all */
idx = XLogRecPtrToBufIdx(recptr);

/* prevent the WAL buffer from being evicted while we look at it */
LWLockAcquire(WALBufMappingLock, LW_SHARED);

/* Check if the page we're interested in is in the buffer */
found = XLogCtl->xlblocks[idx] == expectedEndPtr;

LWLockRelease(WALBufMappingLock, LW_SHARED);


Hence, I put up a patch that basically tracks the
oldest initialized WAL buffer page, named OldestInitializedPage, in
XLogCtl. With OldestInitializedPage, we can easily illustrate WAL
buffers array properties:

1) At any given point of time, pages in the WAL buffers array are
sorted in an ascending order from OldestInitializedPage till
InitializedUpTo. Note that we verify this property for assert-only
builds, see IsXLogBuffersArraySorted() in the patch for more details.

2) OldestInitializedPage is monotonically increasing (by virtue of how
postgres generates WAL records), that is, its value never decreases.
This property lets someone read its value without a lock. There's no
problem even if its value is slightly stale i.e. concurrently being
updated. One can still use it for finding if a given WAL record is
available in WAL buffers. At worst, one might get false positives
(i.e. OldestInitializedPage may tell that the WAL record is available
in WAL buffers, but when one actually looks at it, it isn't really
available). This is more efficient and performant than acquiring a
lock for reading. Note that we may not need a lock to read
OldestInitializedPage but we need to update it holding
WALBufMappingLock.


You actually hint at the above solution here, so I'm confused. If you're 
OK with slightly stale results, you can skip the WALBufferMappingLock 
above too, and perform an atomic read of xlblocks[idx] instead.



3) One can start traversing WAL buffers from OldestInitializedPage
till InitializedUpTo to list out all valid WAL records and stats, and
expose them via SQL-callable functions to users, for instance, as
pg_walinspect functions.

4) WAL buffers array is inherently organized as a circular, sorted and
rotated array with OldestInitializedPage as pivot/first element of the
array with the property where LSN of previous buffer page (if valid)
is greater than OldestInitializedPage and LSN of the next buffer page
(if
valid) is greater than OldestInitializedPage.


These properties are true, maybe we should document them explicitly in a 
comment. But I don't see the point of tracking OldestInitializedPage. It 
seems cheap enough that we could, if there's a need for it, but I don't 
see the need.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: generic plans and "initial" pruning

2023-07-03 Thread Daniel Gustafsson
> On 8 Jun 2023, at 16:23, Amit Langote  wrote:
> 
> Here is a new version.

The local planstate variable in the hunk below is shadowing the function
parameter planstate which cause a compiler warning:

@@ -1495,18 +1556,15 @@ ExecEndPlan(PlanState *planstate, EState *estate)
ListCell   *l;
 
/*
-* shut down the node-type-specific query processing
-*/
-   ExecEndNode(planstate);
-
-   /*
-* for subplans too
+* Shut down the node-type-specific query processing for all nodes that
+* were initialized during InitPlan(), both in the main plan tree and 
those
+* in subplans (es_subplanstates), if any.
 */
-   foreach(l, estate->es_subplanstates)
+   foreach(l, estate->es_inited_plannodes)
{
-   PlanState  *subplanstate = (PlanState *) lfirst(l);
+   PlanState  *planstate = (PlanState *) lfirst(l);

--
Daniel Gustafsson





Re: logicalrep_message_type throws an error

2023-07-03 Thread Ashutosh Bapat
Thanks Euler for the patch.

On Mon, Jul 3, 2023 at 6:32 PM Euler Taveira  wrote:
>
> Masahiko, since abc0910e2e0 is your patch maybe you want to take a look at it.
>

A couple of comments.

-char *
+const char *

Nice improvement.

 logicalrep_message_type(LogicalRepMsgType action)
 {
 switch (action)
@@ -1256,9 +1256,7 @@ logicalrep_message_type(LogicalRepMsgType action)
 return "STREAM ABORT";
 case LOGICAL_REP_MSG_STREAM_PREPARE:
 return "STREAM PREPARE";
+default:
+return "???";
 }
-
-elog(ERROR, "invalid logical replication message type \"%c\"", action);
-
-return NULL;/* keep compiler quiet */

The switch is on action which is an enum. So without default it will
provide a compilation warning for missing enums. Adding "default" case
defeats that purpose. I think we should just return "???" from outside
switch block.

-- 
Best Wishes,
Ashutosh Bapat




Re: Incremental sort for access method with ordered scan support (amcanorderbyop)

2023-07-03 Thread Miroslav Bendik
Thanks, for suggestions.

On Sun 02. 07. 2023 at 10:18 Richard Guo  wrote:
> 1. For comment "On success, the result list is ordered by pathkeys.", I
> think it'd be more accurate if we say something like "On success, the
> result list is ordered by pathkeys or a prefix list of pathkeys."
> considering the possibility of incremental sort.
>
> 2. The comment below is not true anymore.
>
>/*
> * Note: for any failure to match, we just return NIL immediately.
> * There is no value in matching just some of the pathkeys.
> */
> We should either remove it or change it to emphasize that we may return
> a prefix of the pathkeys for incremental sort.

Comments are updated now.

> BTW, would you please add the patch to the CF to not lose track of it?

Submitted 

-- 
Best regards
Miroslav
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 0065c8992b..a1daa31d07 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -978,10 +978,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		match_pathkeys_to_index(index, root->query_pathkeys,
 ,
 );
-		if (orderbyclauses)
-			useful_pathkeys = root->query_pathkeys;
-		else
-			useful_pathkeys = NIL;
+		useful_pathkeys = list_copy_head(root->query_pathkeys,
+		 list_length(orderbyclauses));
 	}
 	else
 	{
@@ -3059,16 +3057,14 @@ expand_indexqual_rowcompare(PlannerInfo *root,
  * index column numbers (zero based) that each clause would be used with.
  * NIL lists are returned if the ordering is not achievable this way.
  *
- * On success, the result list is ordered by pathkeys, and in fact is
- * one-to-one with the requested pathkeys.
+ * On success, the result list is ordered by pathkeys or a prefix list of
+ * pathkeys.  Result can be shorter than pathkeys on partial prefix match.
  */
 static void
 match_pathkeys_to_index(IndexOptInfo *index, List *pathkeys,
 		List **orderby_clauses_p,
 		List **clause_columns_p)
 {
-	List	   *orderby_clauses = NIL;
-	List	   *clause_columns = NIL;
 	ListCell   *lc1;
 
 	*orderby_clauses_p = NIL;	/* set default results */
@@ -3085,8 +3081,8 @@ match_pathkeys_to_index(IndexOptInfo *index, List *pathkeys,
 		ListCell   *lc2;
 
 		/*
-		 * Note: for any failure to match, we just return NIL immediately.
-		 * There is no value in matching just some of the pathkeys.
+		 * Note: for any failure to match, we just stop matching.  Current
+		 * longest match is already accumulated.
 		 */
 
 		/* Pathkey must request default sort order for the target opfamily */
@@ -3133,8 +3129,8 @@ match_pathkeys_to_index(IndexOptInfo *index, List *pathkeys,
    pathkey->pk_opfamily);
 if (expr)
 {
-	orderby_clauses = lappend(orderby_clauses, expr);
-	clause_columns = lappend_int(clause_columns, indexcol);
+	*orderby_clauses_p = lappend(*orderby_clauses_p, expr);
+	*clause_columns_p = lappend_int(*clause_columns_p, indexcol);
 	found = true;
 	break;
 }
@@ -3144,12 +3140,9 @@ match_pathkeys_to_index(IndexOptInfo *index, List *pathkeys,
 break;
 		}
 
-		if (!found)/* fail if no match for this pathkey */
+		if (!found)/* end after first not matching expression */
 			return;
 	}
-
-	*orderby_clauses_p = orderby_clauses;	/* success! */
-	*clause_columns_p = clause_columns;
 }
 
 /*
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 0c3433f8e5..41c62e0268 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -1660,3 +1660,33 @@ order by 1, 2;
->  Function Scan on generate_series
 (7 rows)
 
+-- Incremental sort for not ordered indexes, but with AM order operator
+set enable_incremental_sort = on;
+-- table with GiST index supports closest points retrieval,
+-- but has not sorted index
+create table t (a point, b int);
+create index on t using gist(a);
+insert into t select point(mod(i, 7), mod(i, 11)), i from generate_series(1, 1000) s(i);
+analyze t;
+explain (costs off) select a, b, a <-> point(5, 5) dist from t order by dist, b limit 1;
+   QUERY PLAN
+-
+ Limit
+   ->  Incremental Sort
+ Sort Key: ((a <-> '(5,5)'::point)), b
+ Presorted Key: ((a <-> '(5,5)'::point))
+ ->  Index Scan using t_a_idx on t
+   Order By: (a <-> '(5,5)'::point)
+(6 rows)
+
+explain (costs off) select a, b, a <-> point(5, 5) dist from t order by dist, b desc limit 1;
+ QUERY PLAN 
+
+ Limit
+   ->  Incremental Sort
+ Sort Key: ((a <-> '(5,5)'::point)), b DESC
+ Presorted Key: ((a <-> '(5,5)'::point))
+ ->  Index Scan using t_a_idx on t
+   Order By: (a <-> 

Re: Make uselocale protection more consistent

2023-07-03 Thread Tristan Partin
On Mon Jul 3, 2023 at 1:24 AM CDT, Thomas Munro wrote:
> On Mon, Jul 3, 2023 at 6:13 PM Peter Eisentraut  wrote:
> > On 27.06.23 17:02, Tristan Partin wrote:
> > > This is a patch which implements an issue discussed in bug #17946[0]. It
> > > doesn't fix the overarching issue of the bug, but merely a consistency
> > > issue which was found while analyzing code by Heikki. I had originally
> > > submitted the patch within that thread, but for visibility and the
> > > purposes of the commitfest, I have re-sent it in its own thread.
> > >
> > > [0]: 
> > > https://www.postgresql.org/message-id/49dfcad8-90fa-8577-008f-d142e61af...@iki.fi
> >
> > I notice that HAVE_USELOCALE was introduced much later than
> > HAVE_LOCALE_T, and at the time the code was already using uselocale(),
> > so perhaps the introduction of HAVE_USELOCALE was unnecessary and should
> > be reverted.
> >
> > I think it would be better to keep HAVE_LOCALE_T as encompassing any of
> > the various locale_t-using functions, rather than using HAVE_USELOCALE
> > as a proxy for them.  Otherwise you create weird situations like having
> > #ifdef HAVE_WCSTOMBS_L inside #ifdef HAVE_USELOCALE, which doesn't make
> > sense, I think.
>
> I propose[1] that we get rid of HAVE_LOCALE_T completely and make
> "libc" provider support unconditional.  It's standardised, and every
> target system has it, even Windows.  But Windows doesn't have
> uselocale().
>
> [1] 
> https://www.postgresql.org/message-id/flat/CA%2BhUKGL7CmmzeRhoirzjECmOdABVFTn8fo6gEOaFRF1Oxey6Hw%40mail.gmail.com#aef2f2274b28ff8a36f9b8a598e3cec0

I think keeping HAVE_USELOCALE is important for the Windows case as
mentioned. I need it for my localization work where I am ripping out
setlocale() on non-Windows.

-- 
Tristan Partin
Neon (https://neon.tech)




Re: Flush SLRU counters in checkpointer process

2023-07-03 Thread Daniel Gustafsson
> On 3 Mar 2023, at 09:06, Anthonin Bonnefoy  
> wrote:
> 
> Here's the patch rebased with Andres' suggestions. 
> Happy to update it if there's any additionalj change required.

This patch crashes 031_recovery_conflict with a SIGInvalid on Windows, can you
please investigate and see what might be going on there?  The test passed about
4 days ago on Windows so unless it's the CI being flaky it should be due to a
recent change.

If you don't have access to a Windows environment you can run your own
instrumented builds in your Github account with the CI files in the postgres
repo.

--
Daniel Gustafsson





Re: cataloguing NOT NULL constraints

2023-07-03 Thread Alvaro Herrera
On 2023-Jun-30, Andres Freund wrote:

> On 2023-06-30 13:44:03 +0200, Alvaro Herrera wrote:
> 
> > The main novelty in this version of the patch, is that we now emit
> > "throwaway" NOT NULL constraints when a column is part of the primary
> > key.  Then, after the PK is created, we run a DROP for that constraint.
> > That lets us create the PK without having to scan the table during
> > pg_upgrade.
> 
> Have you considered extending the DDL statement for this purpose? We have
>   ALTER TABLE ... ADD CONSTRAINT ... PRIMARY KEY USING INDEX ...;
> we could just do something similar for the NOT NULL constraint?  Which would
> then delete the separate constraint NOT NULL constraint.

Hmm, I hadn't.  I think if we have to explicitly list the constraint
that we want dropped, then it's pretty much the same than as if we used
a comma-separated list of subcommands, like 

ALTER TABLE ... ADD CONSTRAINT .. PRIMARY KEY (a,b),
   DROP CONSTRAINT pgdump_throwaway_notnull_0,
   DROP CONSTRAINT pgdump_throwaway_notnull_1;

However, I think it would be ideal if we *don't* have to specify the
list of constraints: we would do this on any ALTER TABLE .. ADD
CONSTRAINT PRIMARY KEY, without having any additional clause.

But how to distinguish which NOT NULL markings to drop?  Maybe we would
have to specify a flag at NOT NULL constraint creation time.  So pg_dump
would emit something like

CREATE TABLE foo (a int CONSTRAINT NOT NULL THROWAWAY);
... (much later) ...
ALTER TABLE foo ADD CONSTRAINT .. PRIMARY KEY;

and by the time this second command is run, those throwaway constraints
are removed.  The problems now are 1) how to make this CREATE statement
more SQL-conformant (answer: make pg_dump emit a separate ALTER TABLE
command for the constraint addition; it already knows how to do this, so
it'd be very little code); but also 2) where to store the flag
server-side flag that says this constraint has this property.  I think
it'd have to be a new pg_constraint column, and I don't like to add one
for such a minor issue.

On 2023-Jun-30, Alvaro Herrera wrote:

> Scanning this thread, I think I left one reported issue unfixed related
> to tables created LIKE others.  I'll give it a look later.  Other than
> that I think all bases are covered, but I intend to leave the patch open
> until near the end of the CF, in case someone wants to play with it.

So it was [1] that I meant, where this example was provided:

# create table t1 (c int primary key null unique);  

  
# create table t2 (like t1);

  
# alter table t2 alter c drop not null; 

  
ERROR:  no NOT NULL constraint found to drop

The problem here is that because we didn't give INCLUDING INDEXES in the
LIKE clause, we end up with a column marked NOT NULL for which we have
no pg_constraint row.  Okay, I thought, we can just make sure *not* to
mark that case as not null; that works fine and looks reasonable.
However, it breaks the following use case, which is already in use in
the regression tests and possibly by users:

 CREATE TABLE pk (a int PRIMARY KEY) PARTITION BY RANGE (a);
 CREATE TABLE pk4 (LIKE pk);
 ALTER TABLE pk ATTACH PARTITION pk4 FOR VALUES FROM (3000) TO (4000);
+ERROR:  column "a" in child table must be marked NOT NULL

The problem here is that we were assuming, by the time the third command
is run, that the column had been marked NOT NULL by the second command.
So my solution above is simply not acceptable.  What we must do, in
order to handle this backward-compatibly, is to ensure that a column
part of a PK automatically gets a NOT NULL constraint for all the PK
columns, for the case where INCLUDING INDEXES is not given.  This is the
same we do for regular INHERITS children and PKs.

I'll go write this code now; should be simple enough.

[1] 
https://postgr.es/m/CAMbWs48astPDb3K+L89wb8Yju0jM_Czm8svmU=uzd+wm61c...@mail.gmail.com

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
 It does it in a really, really complicated way
 why does it need to be complicated?
 Because it's MakeMaker.




Re: logicalrep_message_type throws an error

2023-07-03 Thread Euler Taveira
On Mon, Jul 3, 2023, at 7:30 AM, Ashutosh Bapat wrote:
> logicalrep_message_type() is used to convert logical message type code
> into name while prepared error context or details. Thus when this
> function is called probably an ERROR is already raised. If
> logicalrep_message_type() gets an unknown message type, it will throw
> an error, which will suppress the error for which we are building
> context or details. That's not useful. I think instead
> logicalrep_message_type() should return "unknown" when it encounters
> an unknown message type and let the original error message be thrown
> as is.

Hmm. Good catch. The current behavior is:

ERROR:  invalid logical replication message type "X" 
LOG:  background worker "logical replication worker" (PID 71800) exited with 
exit code 1

... that hides the details. After providing a default message type:

ERROR:  invalid logical replication message type "X"
CONTEXT:  processing remote data for replication origin "pg_16638" during 
message type "???" in transaction 796, finished at 0/16266F8

Masahiko, since abc0910e2e0 is your patch maybe you want to take a look at it.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From aaeb2d7474a30a363b69441397b2d7dd91bfba30 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Mon, 3 Jul 2023 08:54:10 -0300
Subject: [PATCH] uncover logical change details

The commit abc0910e2e0 adds logical change details to error context.
However, the function logicalrep_message_type() introduces an
elog(ERROR) that can hide these details. Instead, avoid elog() and use
??? (that is a synonym for unknown). Spotted by Ashutosh Bapat.

Discussion: https://postgr.es/m/CAExHW5suAEDW-mBZt_qu4RVxWZ1vL54-L%2Bci2zreYWebpzxYsA%40mail.gmail.com
---
 src/backend/replication/logical/proto.c | 8 +++-
 src/include/replication/logicalproto.h  | 2 +-
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index f308713275..572ef0a1aa 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1213,7 +1213,7 @@ logicalrep_read_stream_abort(StringInfo in,
 /*
  * Get string representing LogicalRepMsgType.
  */
-char *
+const char *
 logicalrep_message_type(LogicalRepMsgType action)
 {
 	switch (action)
@@ -1256,9 +1256,7 @@ logicalrep_message_type(LogicalRepMsgType action)
 			return "STREAM ABORT";
 		case LOGICAL_REP_MSG_STREAM_PREPARE:
 			return "STREAM PREPARE";
+		default:
+			return "???";
 	}
-
-	elog(ERROR, "invalid logical replication message type \"%c\"", action);
-
-	return NULL;/* keep compiler quiet */
 }
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 0ea2df5088..c5be981eae 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -269,6 +269,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
 extern void logicalrep_read_stream_abort(StringInfo in,
 		 LogicalRepStreamAbortData *abort_data,
 		 bool read_abort_info);
-extern char *logicalrep_message_type(LogicalRepMsgType action);
+extern const char *logicalrep_message_type(LogicalRepMsgType action);
 
 #endif			/* LOGICAL_PROTO_H */
-- 
2.30.2



Re: Creation of an empty table is not fsync'd at checkpoint

2023-07-03 Thread Daniel Gustafsson
This patch required a trivial rebase after conflicting with bfcf1b3480 so I've
attached a v2 to get the CFBot to run this.

--
Daniel Gustafsson



v2-0001-Ensure-that-creation-of-an-empty-relfile-is-fsync.patch
Description: Binary data


Re: Deleting prepared statements from libpq.

2023-07-03 Thread Jelte Fennema
@Michael is anything else needed from my side? If not, I'll mark the
commitfest entry as "Ready For Committer".


Allow specifying a dbname in pg_basebackup connection string

2023-07-03 Thread Jelte Fennema
Normally it doesn't really matter which dbname is used in the connection
string that pg_basebackup and other physical replication CLI tools use.
The reason being, that physical replication does not work at the
database level, but instead at the server level. So you will always get
the data for all databases.

However, when there's a proxy, such as PgBouncer, in between the client
and the server, then it might very well matter. Because this proxy might
want to route the connection to a different server depending on the
dbname parameter in the startup packet.

This patch changes the creation of the connection string key value
pairs, so that the following command will actually include
dbname=postgres in the startup packet to the server:

pg_basebackup --dbname 'dbname=postgres port=6432' -D dump

This also applies to other physical replication CLI tools like
pg_receivewal.

To address the security issue described in CVE-2016-5424 it
now only passes expand_dbname=true when the tool did not
receive a connection_string argument.

I tested that the change worked on this PgBouncer PR of mine:
https://github.com/pgbouncer/pgbouncer/pull/876


v1-0001-Allow-specifying-a-dbname-in-pg_basebackup-connec.patch
Description: Binary data


Re: [PGdocs] fix description for handling pf non-ASCII characters

2023-07-03 Thread jian he
On Thu, Jun 29, 2023 at 3:51 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> Dear Jian,
>
> Thank you for checking my patch!
>
> >
> > in your patch:
> > > printable ASCII characters will be replaced with a hex escape.
> >
> > My wording is not good. I think the result will be: ASCII characters
> > will be as is, non-ASCII characters will be replaced with "a hex
> > escape".
>
> Yeah, your point was right. I have already said:
> "anything other than printable ASCII characters will be replaced with a hex 
> escape"
> IIUC They have same meaning.
>
> You might want to say the line was not good, so reworded like
> "non-ASCII characters will be replaced with hexadecimal strings." How do you 
> think?
>
> > set application_name to 'abc漢字Abc';
> > SET
> > test16=# show application_name;
> > application_name
> > 
> >  abc\xe6\xbc\xa2\xe5\xad\x97Abc
> > (1 row)
> >
> > I see multi escape, so I am not sure "a hex escape".
>
> Not sure what you said, but I could not find word "hex escape" in the 
> document.
> So I used "hexadecimal string" instead. Is it acceptable?
>
> > to properly render it back to  'abc漢字Abc'
> > here is how i do it:
> > select 'abc' || convert_from(decode(' e6bca2e5ad97','hex'), 'UTF8') || 
> > 'Abc';
>
> Yeah, your approach seems right, but I'm not sure it is related with us.
> Just to confirm, I don't have interest the method for rendering non-ASCII 
> characters.
> My motivation of the patch was to document the the incompatibility noted in 
> [1]:
>
> >
> Changed the conversion rules when non-ASCII characters are specified for 
> ASCII-only
> strings such as parameters application_name and cluster_name. Previously, it 
> was
> converted in byte units with a question mark (?), but in PostgreSQL 16, it is
> converted to a hexadecimal string.
> >
>
> > I guess it's still painful if your application_name has non-ASCII chars.
>
> I agreed that, but no one has recommended to use non-ASCII.
>
> [1]: 
> https://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL16Beta1_New_Features_en_20230528_1.pdf
>
> Best Regards,
> Hayato Kuroda
> FUJITSU LIMITED

looks fine. Do you need to add to commitfest?




Re: pgbnech: allow to cancel queries during benchmark

2023-07-03 Thread Fabien COELHO



Hello Yugo-san,


In thread #0, setup_cancel_handler is called before the loop, the
CancelRequested flag is set when Ctrl+C is issued. In the loop, cancel
requests are sent when the flag is set only in thread #0. SIGINT is
blocked in other threads, but the threads will exit after their query
are cancelled. If thread safety is disabled or OS is Windows, the signal
is not blocked because pthread_sigmask cannot be used.
(I didn't test the patch on WIndows yet, though.)

I choose the design that the signal handler and the query cancel are
performed only in thread #0 because I wanted to make the behavior as
predicable as possible. However, another design that all thread can
received SIGINT and that the first thread that catches the signal is
responsible to sent cancel requests for all connections may also work.

Also, the array of CState that contains all clients state is changed to
a global variable so that all connections can be accessed within a thread.



As a side note, actually I thought another design to create a special thread
for handling query cancelling in which SIGINT would be catched by sigwait,
I quit the idea because it would add complexity due to code for Windows and
--disabled-thread-safe.


I agree that the simpler the better.


I would appreciate it if you could kindly review and test the patch!


I'll try to have a look at it.

--
Fabien.




Re: Cleaning up array_in()

2023-07-03 Thread jian he
based on Nikhil Benesch idea.
The attached diff is  based on
v1-0002-Rewrite-ArrayCount-to-make-dimensionality-checks.patch.

diff compare v1-0002:
select '{{1,{2}},{2,3}}'::text[];
 ERROR:  malformed array literal: "{{1,{2}},{2,3}}"
 LINE 1: select '{{1,{2}},{2,3}}'::text[];
^
-DETAIL:  Unexpected "{" character.
+DETAIL:  Multidimensional arrays must have sub-arrays with matching dimensions.

 select E'{{1,2},\\{2,3}}'::text[];
 ERROR:  malformed array literal: "{{1,2},\{2,3}}"
 LINE 1: select E'{{1,2},\\{2,3}}'::text[];
^
-DETAIL:  Unexpected "\" character.
+DETAIL:  Multidimensional arrays must have sub-arrays with matching dimensions.

---
new errors details kind of make sense.
From 34055df96fcacc35c61be22d0fdc2a86a7dd854d Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Mon, 3 Jul 2023 19:00:13 +0800
Subject: [PATCH 3/3] changed based on Nikhil Benesch idea

---
 src/backend/utils/adt/arrayfuncs.c   | 35 ++--
 src/test/regress/expected/arrays.out |  4 ++--
 2 files changed, 14 insertions(+), 25 deletions(-)

diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 8c5d35f1..a1ebc1cc 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -59,9 +59,9 @@ typedef enum
 	ARRAY_ELEM_STARTED,
 	ARRAY_QUOTED_ELEM_STARTED,
 	ARRAY_QUOTED_ELEM_COMPLETED,
-	ARRAY_ELEM_DELIMITED,
 	ARRAY_LEVEL_COMPLETED,
-	ARRAY_LEVEL_DELIMITED
+	ARRAY_DELIMITED,
+	ARRAY_END
 } ArrayParseState;
 
 /* Working state for array_iterate() */
@@ -473,8 +473,6 @@ ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 	int			ndim = 1,
 nelems[MAXDIM];
 	bool		ndim_frozen = false;
-	bool		in_quotes = false;
-	bool		eoArray = false;
 	bool		empty_array = true;
 	const char *ptr;
 	ArrayParseState parse_state = ARRAY_NO_LEVEL;
@@ -485,7 +483,7 @@ ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 
 	/* Scan string until we reach closing brace */
 	ptr = str;
-	while (!eoArray)
+	while (parse_state != ARRAY_END)
 	{
 		bool		new_element = false;
 
@@ -507,7 +505,7 @@ ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 switch (parse_state)
 {
 	case ARRAY_LEVEL_STARTED:
-	case ARRAY_ELEM_DELIMITED:
+	case ARRAY_DELIMITED:
 		/* start new unquoted element */
 		parse_state = ARRAY_ELEM_STARTED;
 		new_element = true;
@@ -542,17 +540,11 @@ ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 switch (parse_state)
 {
 	case ARRAY_LEVEL_STARTED:
-	case ARRAY_ELEM_DELIMITED:
-		/* start new quoted element */
-		Assert(!in_quotes);
-		in_quotes = true;
+	case ARRAY_DELIMITED:
 		parse_state = ARRAY_QUOTED_ELEM_STARTED;
 		new_element = true;
 		break;
 	case ARRAY_QUOTED_ELEM_STARTED:
-		/* already in element, end it */
-		Assert(in_quotes);
-		in_quotes = false;
 		parse_state = ARRAY_QUOTED_ELEM_COMPLETED;
 		break;
 	default:
@@ -563,7 +555,7 @@ ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 }
 break;
 			case '{':
-if (!in_quotes)
+if (parse_state != ARRAY_QUOTED_ELEM_STARTED)
 {
 	/*
 	 * A left brace can occur if no nesting has occurred yet,
@@ -571,7 +563,7 @@ ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 	 */
 	if (parse_state != ARRAY_NO_LEVEL &&
 		parse_state != ARRAY_LEVEL_STARTED &&
-		parse_state != ARRAY_LEVEL_DELIMITED)
+		parse_state != ARRAY_DELIMITED)
 		ereturn(escontext, -1,
 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
  errmsg("malformed array literal: \"%s\"", str),
@@ -602,7 +594,7 @@ ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 }
 break;
 			case '}':
-if (!in_quotes)
+if (parse_state != ARRAY_QUOTED_ELEM_STARTED)
 {
 	/*
 	 * A right brace can occur after an element start, an
@@ -656,11 +648,11 @@ ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 	}
 	/* Done if this is the outermost level's '}' */
 	if (nest_level == 0)
-		eoArray = true;
+		parse_state	= ARRAY_END;
 }
 break;
 			default:
-if (!in_quotes)
+if (parse_state != ARRAY_QUOTED_ELEM_STARTED)
 {
 	if (*ptr == typdelim)
 	{
@@ -676,10 +668,7 @@ ArrayCount(const char *str, int *dim, char typdelim, Node *escontext)
 	 errmsg("malformed array literal: \"%s\"", str),
 	 errdetail("Unexpected \"%c\" character.",
 			   typdelim)));
-		if (parse_state == ARRAY_LEVEL_COMPLETED)
-			parse_state = ARRAY_LEVEL_DELIMITED;
-		else
-			parse_state = ARRAY_ELEM_DELIMITED;
+		parse_state	= ARRAY_DELIMITED;	
 	}
 	else if (!array_isspace(*ptr))
 	{
@@ -696,7 +685,7 @@ ArrayCount(const char *str, int *dim, char 

Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-07-03 Thread vignesh C
On Wed, 28 Jun 2023 at 12:02, Hayato Kuroda (Fujitsu)
 wrote:
>
> Dear Amit,
>
> > > > This actually makes sense. I quickly try to do that without adding any
> > > > new replication message. As you would expect, it did not work.
> > > > I don't really know what's needed to make a connection to last for
> > > > more than one iteration. Need to look into this. Happy to hear any
> > > > suggestions and thoughts.
> > >
> >
> > It is not clear to me what exactly you tried here which didn't work.
> > Can you please explain a bit more?
>
> Just to confirm, this is not my part. Melih can answer this...
>
> > > I have analyzed how we handle this. Please see attached the patch (0003) 
> > > which
> > > allows reusing connection.
> > >
> >
> > Why did you change the application name during the connection?
>
> It was because the lifetime of tablesync worker is longer than slots's one and
> tablesync worker creates temporary replication slots many times, per the 
> target
> relation. The name of each slots has relid, so I thought that it was not 
> suitable.
> But in the later patch the tablesync worker tries to reuse the slot during the
> synchronization, so in this case the application_name should be same as 
> slotname.
>
> I added comment in 0003, and new file 0006 file to use slot name as 
> application_name
> again. Note again that the separation was just for specifying changes, Melih 
> can
> include them to one part of files if needed.

Few comments:
1) Should these error messages say "Could not create a snapshot by
replication slot":
+   if (!pubnames_str)
+   ereport(ERROR,
+   (errcode(ERRCODE_OUT_OF_MEMORY),
 /* likely guess */
+errmsg("could not start WAL streaming: %s",
+
pchomp(PQerrorMessage(conn->streamConn);
+   pubnames_literal = PQescapeLiteral(conn->streamConn, pubnames_str,
+
strlen(pubnames_str));
+   if (!pubnames_literal)
+   ereport(ERROR,
+   (errcode(ERRCODE_OUT_OF_MEMORY),
 /* likely guess */
+errmsg("could not start WAL streaming: %s",
+
pchomp(PQerrorMessage(conn->streamConn);
+   appendStringInfo(, ", publication_names %s", pubnames_literal);
+   PQfreemem(pubnames_literal);
+   pfree(pubnames_str);

2) These checks are present in CreateReplicationSlot too, can we have
a common function to check these for both CreateReplicationSlot and
CreateReplicationSnapshot:
+   if (!IsTransactionBlock())
+   ereport(ERROR,
+   (errmsg("%s must be called inside a
transaction",
+
"CREATE_REPLICATION_SNAPSHOT ...")));
+
+   if (XactIsoLevel != XACT_REPEATABLE_READ)
+   ereport(ERROR,
+   (errmsg("%s must be called in
REPEATABLE READ isolation mode transaction",
+
"CREATE_REPLICATION_SNAPSHOT ...")));
+
+   if (!XactReadOnly)
+   ereport(ERROR,
+   (errmsg("%s must be called in a read
only transaction",
+
"CREATE_REPLICATION_SNAPSHOT ...")));
+
+   if (FirstSnapshotSet)
+   ereport(ERROR,
+   (errmsg("%s must be called before any query",
+
"CREATE_REPLICATION_SNAPSHOT ...")));
+
+   if (IsSubTransaction())
+   ereport(ERROR,
+   (errmsg("%s must not be called in a
subtransaction",
+
"CREATE_REPLICATION_SNAPSHOT ...")));

3) Probably we can add the function header at this point of time:
+/*
+ * TODO
+ */
+static void
+libpqrcv_slot_snapshot(WalReceiverConn *conn,
+  char *slotname,
+  const WalRcvStreamOptions *options,
+  XLogRecPtr *lsn)

4) Either or relation name or relid should be sufficient here, no need
to print both:
   StartTransactionCommand();
+   ereport(LOG,
+   (errmsg("%s
for subscription \"%s\" has moved to sync table \"%s\" with relid
%u.",
+
 get_worker_name(),
+
 MySubscription->name,
+
 get_rel_name(MyLogicalRepWorker->relid),
+
 MyLogicalRepWorker->relid)));
+   CommitTransactionCommand();

5) Why is this check of logicalrep_worker_find is required required,
will it not be sufficient to pick the relations that are in
SUBREL_STATE_INIT state?
+   /*
+   * Pick the table for the next run if
it is not already picked up
+   * by another worker.
+   *
+   * Take exclusive lock to prevent any
other sync worker from picking
+   * the same table.
+   */
+   LWLockAcquire(LogicalRepWorkerLock,

Re: Refactor ssl tests to avoid using internal PostgreSQL::Test::Cluster methods

2023-07-03 Thread Heikki Linnakangas

On 31/05/2023 15:47, Daniel Gustafsson wrote:

The SSL tests for pg_ctl restarts with an incorrect key passphrase run pg_ctl
manually and use the internal method _update_pid to set the server PID file
accordingly.  This is needed since $node->restart will BAIL in case the restart
fails, which clearly isn't useful to anyone wanting to test restarts.  This is
the only use of _update_pid outside of Cluster.pm.

To avoid this, the attached adds fail_ok functionality to restart() which makes
it easier to use it in tests, and aligns it with how stop() and start() works.
The resulting SSL tests are also more readable IMO.


Makes sense.


diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 76442de063..e33f648aae 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -85,10 +85,8 @@ switch_server_cert(
passphrase_cmd => 'echo wrongpassword',
restart => 'no');
 
-command_fails(

-   [ 'pg_ctl', '-D', $node->data_dir, '-l', $node->logfile, 'restart' ],
-   'restart fails with password-protected key file with wrong password');
-$node->_update_pid(0);
+$result = $node->restart(fail_ok => 1);
+is($result, 0, 'restart fails with password-protected key file with wrong 
password');
 
 switch_server_cert(

$node,


In principle, this makes the tests more lenient. If "pg_ctl restart" 
fails because of a timeout, for example, the PID file could be present. 
Previously, the _update_pid(0) would error out on that, but now it's 
accepted. I think that's fine, but just wanted to point it out.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Avoid unncessary always true test (src/backend/storage/buffer/bufmgr.c)

2023-07-03 Thread Ranier Vilela
Em sex., 30 de jun. de 2023 às 15:48, Karina Litskevich <
litskevichkar...@gmail.com> escreveu:

> Hi,
>
> Good catch. Indeed, eb.rel shouldn't be NULL there and the tests should be
> unnecessary.
>
Thanks for the confirmation.


> However, it doesn't follow from the code of these functions.
> From what I can see eb.rel can be NULL in both of these functions. There is
> the following Assert in the beginning of the ExtendBufferedRelTo()
> function:
>
> Assert((eb.rel != NULL) != (eb.smgr != NULL));
>
> And ExtendBufferedRelShared() is only called from ExtendBufferedRelCommon()
> which can be called from ExtendBufferedRelTo() or ExtendBufferedRelBy()
> that
> also has the same Assert(). And none of these functions assigns eb.rel, so
> it can be NULL from the very beginning and it stays the same.
>
>
> And there is the following call in xlogutils.c, which is exactly the case
> when
> eb.rel is NULL:
>
> buffer = ExtendBufferedRelTo(EB_SMGR(smgr, RELPERSISTENCE_PERMANENT),
>  forknum,
>  NULL,
>  EB_PERFORMING_RECOVERY |
>  EB_SKIP_EXTENSION_LOCK,
>  blkno + 1,
>  mode);
>
EB_SMGR and EB_REL are macros for making new structs.
IMO these are buggy, once make new structs without initializing all fields.
Attached a patch to fix this and make more clear when rel or smgr is NULL.


>
>
>
> So as for me calling LockRelationForExtension() and
> UnlockRelationForExtension()
> without testing eb.rel first looks more like a bug here. However, they are
> never
> actually called with eb.rel=NULL because of the EB_* flags, so there is no
> bug
> here. I believe we should add Assert checking that when eb.rel is NULL,
> flags
> are such that we won't use eb.rel. And yes, we can remove unnecessary
> checks
> where the flags value guaranty us that eb.rel is not NULL.
>
Not against these Asserts, but It is very confusing and difficult to
understand them without some comment.


>
> And another minor observation. It seems to me that we don't need a "could
> have
> been closed while waiting for lock" in ExtendBufferedRelShared(), because I
> believe the comment above already explains why updating eb.smgr:
>
>  * Note that another backend might have extended the relation by the time
>  * we get the lock.
>
Ok, but the first comment still ambiguous, I think that could be:
"/* eb.smgr could have been closed while waiting for lock */"

best regards,
Ranier Vilela


0002-avoid-new-struct-with-undefined-fields-ExtendedBufferedWhat.patch
Description: Binary data


Re: pg_decode_message vs skip_empty_xacts and xact_wrote_changes

2023-07-03 Thread vignesh C
On Fri, 30 Jun 2023 at 09:55, Amit Kapila  wrote:
>
> On Thu, Jun 29, 2023 at 9:40 PM vignesh C  wrote:
> >
> > On Thu, 29 Jun 2023 at 09:58, Zhijie Hou (Fujitsu)
> >  wrote:
> > >
> > > On Thursday, June 29, 2023 12:06 PM vignesh C  wrote:
> > > >
> > >
> > > Thanks for the patches.
> > >
> > > I tried to understand the following check:
> > >
> > > /*
> > >  * If asked to skip empty transactions, we'll emit BEGIN at the 
> > > point
> > >  * where the first operation is received for this transaction.
> > >  */
> > > -   if (data->skip_empty_xacts)
> > > +   if (!(last_write ^ data->skip_empty_xacts) || 
> > > txndata->xact_wrote_changes)
> > > return;
> > >
> > > I might miss something, but would you mind elaborating on why we use 
> > > "last_write" in this check?
> >
> > last_write is used to indicate if it is begin/"begin
> > prepare"(last_write is true) or change/truncate/message(last_write is
> > false).
> >
> > We have specified logical XNOR which will be true for the following 
> > conditions:
> > Condition1: last_write && data->skip_empty_xacts  -> If it is
> > begin/begin prepare and user has specified skip empty transactions, we
> > will return from here, so that the begin message can be appended at
> > the point where the first operation is received for this transaction.
> > Condition2: !last_write && !data->skip_empty_xacts -> If it is
> > change/truncate or message and user has not specified skip empty
> > transactions, we will return from here as we would have appended the
> > begin earlier itself.
> > The txndata->xact_w6rote_changes will be set after the first operation
> > is received for this transaction during which we would have outputted
> > the begin message, this condition is to skip outputting begin message
> > if the begin message was already outputted.
> >
>
> I feel the use of last_write has reduced the readability of this part
> of the code. It may be that we can add comments to make it clear but I
> feel your previous version was much easier to understand.

+1 for the first version patch, I also felt the first version is
easily understandable.

Regards,
Vignesh




logicalrep_message_type throws an error

2023-07-03 Thread Ashutosh Bapat
Hi All,
logicalrep_message_type() is used to convert logical message type code
into name while prepared error context or details. Thus when this
function is called probably an ERROR is already raised. If
logicalrep_message_type() gets an unknown message type, it will throw
an error, which will suppress the error for which we are building
context or details. That's not useful. I think instead
logicalrep_message_type() should return "unknown" when it encounters
an unknown message type and let the original error message be thrown
as is.

-- 
Best Wishes,
Ashutosh Bapat




Re: Commitfest manager for July

2023-07-03 Thread Daniel Gustafsson
> On 28 Jun 2023, at 09:45, Daniel Gustafsson  wrote:
> 
> A quick scan of the archives doesn't turn up anyone who has volunteered in
> advance to run the upcoming commitfest.  Is anyone keen at trying their hand 
> at
> this very important community work?  The July CF is good for anyone doing this
> for the first time IMHO as it's usually less stressful than the ones later in
> the cycle.

Since this didn't get any takers, and we are in July AoE since a few days ago,
I guess I'll assume the role this time in the interest of moving things along.
I've switched the 2023-07 CF to in-progress and 2023-09 to open, let's try to
close patches!

--
Daniel Gustafsson





Re: Optionally using a better backtrace library?

2023-07-03 Thread David Steele

On 7/3/23 11:58, Alvaro Herrera wrote:



Nice things about libbacktrace are that the generation of stack traces is
documented to be async signal safe on most platforms (with a #define to figure
that out, and a more minimal safe version always available) and that it
supports a wide range of platforms:


Sadly, it looks like the library is seldom distributed.  For example,
Debian seems to only have a package called android-libbacktrace which I
imagine is not what we want.  On my system I see a static library only
-- is that enough?  That file is part of package libgcc-10-dev, which
tells me that we can't depend on that for packaging purposes.


It would be a pretty big win even if the improved backtrace is only 
available in dev environments -- this is what pgBackRest currently does.


We are also considering adding this library to production builds but 
have not pulled the trigger on that yet since we are a bit worried about 
possible performance impact and have not had time to benchmark.


Regards,
-David




  1   2   >