pltcl crashes due to a syntax error

2024-05-31 Thread a.kozhemyakin

Hello hackers,

When executing the following queries on the master branch:

CREATE EXTENSION pltcl;
CREATE or replace PROCEDURE test_proc5(INOUT a text)
    LANGUAGE pltcl
    AS $$
    set aa [concat $1 "+" $1]
    return [list $aa $aa])
    $$;

CALL test_proc5('abc');
CREATE EXTENSION
CREATE PROCEDURE
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.



Core was generated by `postgres: postgres postgres [loca'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/strlen-sse2.S:142
142 ../sysdeps/x86_64/multiarch/strlen-sse2.S: No such file or 
directory.

(gdb) bt
#0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/strlen-sse2.S:142
#1  0x7f5f1353ba6a in utf_u2e (src=0x0) at pltcl.c:77
#2  0x7f5f1353c9f7 in throw_tcl_error (interp=0x55ec24bdaf70, 
proname=0x55ec24b6b140 "test_proc5") at pltcl.c:1373
#3  0x7f5f1353ed64 in pltcl_func_handler 
(fcinfo=fcinfo@entry=0x7ffdbfb407a0, 
call_state=call_state@entry=0x7ffdbfb405d0, 
pltrusted=pltrusted@entry=true) at pltcl.c:1029
#4  0x7f5f1353ee8d in pltcl_handler (fcinfo=0x7ffdbfb407a0, 
pltrusted=pltrusted@entry=true) at pltcl.c:765
#5  0x7f5f1353f1ef in pltcl_call_handler (fcinfo=<optimized out>) at 
pltcl.c:698
#6  0x55ec239ec64a in ExecuteCallStmt 
(stmt=stmt@entry=0x55ec24a9a940, params=params@entry=0x0, 
atomic=atomic@entry=false, dest=dest@entry=0x55ec24a6ea18) at 
functioncmds.c:2285
#7  0x55ec23c103a7 in standard_ProcessUtility (pstmt=0x55ec24a9a9d8, 
queryString=0x55ec24a99e68 "CALL test_proc5('abc');", 
readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, 
params=0x0, queryEnv=0x0, dest=0x55ec24a6ea18,

    qc=0x7ffdbfb40f40) at utility.c:851
#8  0x55ec23c1081b in ProcessUtility 
(pstmt=pstmt@entry=0x55ec24a9a9d8, queryString=<optimized out>, 
readOnlyTree=<optimized out>, 
context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=<optimized out>, 
queryEnv=<optimized out>,

    dest=0x55ec24a6ea18, qc=0x7ffdbfb40f40) at utility.c:523
#9  0x55ec23c0e04e in PortalRunUtility 
(portal=portal@entry=0x55ec24b18108, pstmt=0x55ec24a9a9d8, 
isTopLevel=isTopLevel@entry=true, 
setHoldSnapshot=setHoldSnapshot@entry=true, 
dest=dest@entry=0x55ec24a6ea18, qc=qc@entry=0x7ffdbfb40f40)

    at pquery.c:1158
#10 0x55ec23c0e3b7 in FillPortalStore 
(portal=portal@entry=0x55ec24b18108, isTopLevel=isTopLevel@entry=true) 
at pquery.c:1031
#11 0x55ec23c0e6ee in PortalRun (portal=portal@entry=0x55ec24b18108, 
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, 
run_once=run_once@entry=true, dest=dest@entry=0x55ec24a9aec8, 
altdest=altdest@entry=0x55ec24a9aec8,

    qc=0x7ffdbfb41130) at pquery.c:763
#12 0x55ec23c0acca in exec_simple_query 
(query_string=query_string@entry=0x55ec24a99e68 "CALL 
test_proc5('abc');") at postgres.c:1274
#13 0x55ec23c0caad in PostgresMain (dbname=<optimized out>, 
username=<optimized out>) at postgres.c:4680
#14 0x55ec23c0687a in BackendMain (startup_data=<optimized out>, 
startup_data_len=<optimized out>) at backend_startup.c:105
#15 0x55ec23b766bf in postmaster_child_launch 
(child_type=child_type@entry=B_BACKEND, 
startup_data=startup_data@entry=0x7ffdbfb41354 "", 
startup_data_len=startup_data_len@entry=4, 
client_sock=client_sock@entry=0x7ffdbfb41390)

    at launch_backend.c:265
#16 0x55ec23b7ab36 in BackendStartup 
(client_sock=client_sock@entry=0x7ffdbfb41390) at postmaster.c:3593

#17 0x55ec23b7adb0 in ServerLoop () at postmaster.c:1674
#18 0x55ec23b7c20c in PostmasterMain (argc=argc@entry=3, 
argv=argv@entry=0x55ec24a54030) at postmaster.c:1372

#19 0x55ec23aacf9f in main (argc=3, argv=0x55ec24a54030) at main.c:197
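For reference, the crash is triggered by the stray closing parenthesis (a
Tcl syntax error) in the procedure body: frames #1 and #2 above show
throw_tcl_error() handing a NULL string to utf_u2e(), so it is the
error-reporting path that segfaults. A minimal sketch of the procedure
without the stray parenthesis - assuming PL/Tcl's usual name/value-list
convention for output parameters - should then execute without crashing:

CREATE OR REPLACE PROCEDURE test_proc5(INOUT a text)
    LANGUAGE pltcl
    AS $$
    set aa [concat $1 "+" $1]
    return [list a $aa]
    $$;

CALL test_proc5('abc');   -- expected: a = 'abc + abc'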





Re: Add memory context type to pg_backend_memory_contexts view

2024-05-31 Thread Michael Paquier
On Fri, May 31, 2024 at 12:35:58PM +1200, David Rowley wrote:
> This follows what we do in other places.  If you look at explain.c,
> you'll see lots of "???"s.
> 
> I think if you're worried about corrupted memory, then garbled output
> in pg_get_backend_memory_contexts wouldn't be that high on the list of
> concerns.

+   const char *type;
[...]
+   switch (context->type)
+   {
+   case T_AllocSetContext:
+   type = "AllocSet";
+   break;
+   case T_GenerationContext:
+   type = "Generation";
+   break;
+   case T_SlabContext:
+   type = "Slab";
+   break;
+   case T_BumpContext:
+   type = "Bump";
+   break;
+   default:
+   type = "???";
+   break;
+   }

Yeah, it's a common practice to use that as a fallback.  What you are
doing is OK, and it is not possible to remove the default case so as to
get compiler warnings when a new value needs to be added, because these
are generic node tags.
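For a quick smoke test once the patch is applied, something like this
should show the new column alongside the existing ones (assuming the
column is named "type", as in the patch):

SELECT name, parent, type, total_bytes
  FROM pg_backend_memory_contexts
 ORDER BY total_bytes DESC
 LIMIT 5;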

This patch looks like a good idea, so +1 from here.  (PS: catversion
bump).  
--
Michael


signature.asc
Description: PGP signature


Re: To what extent should tests rely on VACUUM ANALYZE?

2024-05-31 Thread Alexander Lakhin

29.03.2024 11:59, Alexander Lakhin wrote:

But it looks like subselect is not the only test that can fail due to
vacuum instability. I see that create_index also suffers from cranky
ConditionalLockBufferForCleanup() (+if (rand() % 10 == 0)  ...


Just for the record, I think I've reproduced the same failure as:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2024-03-17%2003%3A03%3A57
not ok 66    + create_index    27509 ms
...

and the similar occurrences:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2024-01-02%2007%3A09%3A09
not ok 66    + create_index    25830 ms

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2023-11-15%2006%3A16%3A15
not ok 66    + create_index    38508 ms

by running 8 027_stream_regress instances in parallel on a slow ARMv7
device like this:
for i in `seq 10`; do echo "I $i"; parallel --halt now,fail=1  -j8 \
 --linebuffer --tag NO_TEMP_INSTALL=1 make -s check -C \
 src/test/recovery_{}/ PROVE_TESTS="t/027*" ::: `seq 8` || break; done
5
5   #   Failed test 'regression tests pass'
5   #   at t/027_stream_regress.pl line 92.
5   #  got: '256'
5   # expected: '0'
5   t/027_stream_regress.pl ..
5   Dubious, test returned 1 (wstat 256, 0x100)
5   Failed 1/6 subtests

not ok 66    + create_index   152995 ms
...
=== dumping .../src/test/recovery_5/tmp_check/regression.diffs ===
diff -U3 .../src/test/regress/expected/create_index.out 
.../src/test/recovery_5/tmp_check/results/create_index.out
--- .../src/test/regress/expected/create_index.out  2024-05-30 
15:30:34.523048633 +0000
+++ .../src/test/recovery_5/tmp_check/results/create_index.out 2024-05-31 
13:07:56.359001362 +0000
@@ -1916,11 +1916,15 @@
 SELECT unique1 FROM tenk1
 WHERE unique1 IN (1,42,7)
 ORDER BY unique1;
-  QUERY PLAN
-----------------------------------------------
- Index Only Scan using tenk1_unique1 on tenk1
-   Index Cond: (unique1 = ANY ('{1,42,7}'::integer[]))
-(2 rows)
+    QUERY PLAN
+-----------------------------------------------
+ Sort
+   Sort Key: unique1
+   ->  Bitmap Heap Scan on tenk1
+ Recheck Cond: (unique1 = ANY ('{1,42,7}'::integer[]))
+ ->  Bitmap Index Scan on tenk1_unique1
+   Index Cond: (unique1 = ANY ('{1,42,7}'::integer[]))
+(6 rows)

 SELECT unique1 FROM tenk1
 WHERE unique1 IN (1,42,7)
@@ -1936,12 +1940,13 @@
 SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
 ORDER BY thousand;
-  QUERY PLAN
--------------------------------------------------------
- Index Only Scan using tenk1_thous_tenthous on tenk1
-   Index Cond: (thousand < 2)
-   Filter: (tenthous = ANY ('{1001,3000}'::integer[]))
-(3 rows)
+  QUERY PLAN
+--
+ Sort
+   Sort Key: thousand
+   ->  Index Only Scan using tenk1_thous_tenthous on tenk1
+ Index Cond: ((thousand < 2) AND (tenthous = ANY 
('{1001,3000}'::integer[])))
+(4 rows)

 SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
=== EOF ===

I got failures on iterations 2, 3, 7, 1.

But with the repeated VACUUM ANALYZE:
--- a/src/test/regress/sql/test_setup.sql
+++ b/src/test/regress/sql/test_setup.sql
@@ -163,6 +163,8 @@ CREATE TABLE tenk1 (
 \set filename :abs_srcdir '/data/tenk.data'
 COPY tenk1 FROM :'filename';
 VACUUM ANALYZE tenk1;
+VACUUM ANALYZE tenk1;
+VACUUM ANALYZE tenk1;

20 iterations succeeded in the same environment.

So I think that IOS plan change can be explained by the issue
discussed here.

Best regards,
Alexander




Re: Explicit specification of index ensuring uniqueness of foreign columns

2024-05-31 Thread David G. Johnston
On Friday, May 31, 2024, Tom Lane  wrote:

> Kaiting Chen  writes:
> > I'd like to resurrect a subset of my proposal in [1], specifically that:
> >   The FOREIGN KEY constraint syntax gains a [ USING INDEX index_name ]
> clause
> >   optionally following the referenced column list.
> > ...
> > While, in this minimal reproduction, the two indexes are
> interchangeable, there
> > are situations that may reasonably occur in the course of ordinary use
> in which
> > they aren't. For example, a redundant unique index with different storage
> > parameters may exist during the migration of an application schema.
>
> I agree that there's a hazard there, but I question if the case is
> sufficiently real-world to justify the costs of introducing a
> non-SQL-standard clause in foreign key constraints.
>
> One such cost is that pg_dump output would become less able to be
> loaded into other DBMSes, or even into older PG versions.
>
> I also wonder if this wouldn't just trade one fragility for another.
> Specifically, I am not sure that we guarantee that the names of
> indexes underlying constraints remain the same across dump/reload.
> If they don't, the USING INDEX clause might fail unnecessarily.
>
> As against that, I'm not sure I've ever seen a real-world case with
> intentionally-duplicate unique indexes.
>
> So on the whole I'm unconvinced that this is worth changing.


Seems like most of those issues could be avoided if we only supply "alter
table" syntax (or a function…); i.e., give the DBA a tool to modify their
system when our default choices fail them, but continue on with the
defaults as they exist today.
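For illustration, such a tool might look roughly like this (purely
hypothetical syntax and object names; nothing like this exists today):

-- hypothetical: repoint an existing FK constraint at a specific unique index
ALTER TABLE orders
    ALTER CONSTRAINT orders_customer_id_fkey
    USING INDEX customers_id_migration_idx;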

David J.


Re: Explicit specification of index ensuring uniqueness of foreign columns

2024-05-31 Thread Tom Lane
Kaiting Chen  writes:
> I'd like to resurrect a subset of my proposal in [1], specifically that:
>   The FOREIGN KEY constraint syntax gains a [ USING INDEX index_name ] clause
>   optionally following the referenced column list.
> ...
> While, in this minimal reproduction, the two indexes are interchangeable, 
> there
> are situations that may reasonably occur in the course of ordinary use in 
> which
> they aren't. For example, a redundant unique index with different storage
> parameters may exist during the migration of an application schema.

I agree that there's a hazard there, but I question if the case is
sufficiently real-world to justify the costs of introducing a
non-SQL-standard clause in foreign key constraints.

One such cost is that pg_dump output would become less able to be
loaded into other DBMSes, or even into older PG versions.

I also wonder if this wouldn't just trade one fragility for another.
Specifically, I am not sure that we guarantee that the names of
indexes underlying constraints remain the same across dump/reload.
If they don't, the USING INDEX clause might fail unnecessarily.

As against that, I'm not sure I've ever seen a real-world case with
intentionally-duplicate unique indexes.

So on the whole I'm unconvinced that this is worth changing.

regards, tom lane




Re: Ambiguous description on new columns

2024-05-31 Thread Peter Smith
On Wed, May 29, 2024 at 8:04 PM vignesh C  wrote:
>
> On Wed, 22 May 2024 at 14:26, Peter Smith  wrote:
> >
> > On Tue, May 21, 2024 at 8:40 PM PG Doc comments form
> >  wrote:
> > >
> > > The following documentation comment has been logged on the website:
> > >
> > > Page: 
> > > https://www.postgresql.org/docs/16/logical-replication-col-lists.html
> > > Description:
> > >
> > > The documentation on this page mentions:
> > >
> > > "If no column list is specified, any columns added later are automatically
> > > replicated."
> > >
> > > It feels ambiguous what this could mean. Does it mean:
> > >
> > > 1/ That if you alter the table on the publisher and add a new column, it
> > > will be replicated
> > >
> > > 2/ If you add a column list later and add a column to it, it will be
> > > replicated
> > >
> > > In both cases, does the subscriber automatically create this column if it
> > > wasn't there before?
> >
> > No, the subscriber will not automatically create the column. That is
> > already clearly said at the top of the same page you linked "The table
> > on the subscriber side must have at least all the columns that are
> > published."
> >
> > All that "If no column list..." paragraph was trying to say is:
> >
> > CREATE PUBLICATION pub FOR TABLE T;
> >
> > is not quite the same as:
> >
> > CREATE PUBLICATION pub FOR TABLE T(a,b,c);
> >
> > The difference is, in the 1st case if you then ALTER the TABLE T to
> > have a new column 'd' then that will automatically start replicating
> > the 'd' data without having to do anything to either the PUBLICATION
> > or the SUBSCRIPTION. Of course, if TABLE T at the subscriber side does
> > not have a column 'd' then you'll get an error because your subscriber
> > table needs to have *at least* all the replicated columns. (I
> > demonstrate this error below)
> >
> > Whereas in the 2nd case, even though you ALTER'ed the TABLE T to have
> > a new column 'd' then that won't be replicated because 'd' was not
> > named in the PUBLICATION's column list.
> >
> > 
> >
> > Here's an example where you can see this in action
> >
> > Here is an example of the 1st case -- it shows 'd' is automatically
> > replicated and also shows the subscriber-side error caused by the
> > missing column:
> >
> > test_pub=# CREATE TABLE T(a int,b int, c int);
> > test_pub=# CREATE PUBLICATION pub FOR TABLE T;
> >
> > test_sub=# CREATE TABLE T(a int,b int, c int);
> > test_sub=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=test_pub' PUBLICATION 
> > pub;
> >
> > See the replication happening
> > test_pub=# INSERT INTO T VALUES (1,2,3);
> > test_sub=# SELECT * FROM t;
> >  a | b | c
> > ---+---+---
> >  1 | 2 | 3
> > (1 row)
> >
> > Now alter the publisher table T and insert some new data
> > test_pub=# ALTER TABLE T ADD COLUMN d int;
> > test_pub=# INSERT INTO T VALUES (5,6,7,8);
> >
> > This will cause subscription errors like:
> > 2024-05-22 11:53:19.098 AEST [16226] ERROR:  logical replication
> > target relation "public.t" is missing replicated column: "d"
> >
> > 
> >
> > I think the following small change will remove any ambiguity:
> >
> > BEFORE
> > If no column list is specified, any columns added later are
> > automatically replicated.
> >
> > SUGGESTION
> > If no column list is specified, any columns added to the table later
> > are automatically replicated.
> >
> > ~~
> >
> > I attached a small patch to make the above change.
>
> A small recommendation:
> It would enhance clarity to include a line break following "If no
> column list is specified, any columns added to the table later are":
> -   If no column list is specified, any columns added later are automatically
> +   If no column list is specified, any columns added to the table
> later are automatically
> replicated. This means that having a column list which names all columns

Hi Vignesh,

IIUC you're saying my v1 patch *content* and rendering is OK, but you
only wanted the SGML text to have better wrapping for < 80 chars
lines. So I have attached a patch v2 with improved wrapping. If you
meant something different then please explain.

==
Kind Regards,
Peter Smith.
Fujitsu Australia


v2-0001-Fix-minor-ambiguity.patch
Description: Binary data


meson and check-tests

2024-05-31 Thread Ashutosh Bapat
Hi Tristan,
Using make I can run only selected tests under src/test/regress using
TESTS=... make check-tests. I didn't find any way to do that with meson.
meson test --suite regress runs all the regression tests.

We talked about this off-list at the conference. It seems we have to
somehow avoid passing the pg_regress --schedule argument and instead pass
the list of tests. Any idea how to do that?

-- 
Best Wishes,
Ashutosh Bapat


Re: meson "experimental"?

2024-05-31 Thread Andres Freund
Hi, 

On May 30, 2024 8:03:33 AM PDT, Andrew Dunstan  wrote:
>On Thu, May 30, 2024 at 6:32 AM Aleksander Alekseev <
>aleksan...@timescale.com> wrote:
>
>>
>>
>> By a quick look on the buildfarm we seem to use Ninja >= 1.11.1.
>> However since Meson can use both Ninja and VS as a backend I'm not
>> certain which section would be most appropriate for naming the minimal
>> required version of Ninja.
>>
>>
>When I tried building with the VS backend it blew up, I don't recall the
>details. I think we should just use ninja everywhere. That keeps things
>simple. 

VS should work, and if not, we should fix it. It's slow, so I'd not use it for 
scheduled builds, but for people developing using visual studio. 

Andres 

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Pavel Stehule
On Fri, May 31, 2024 at 15:49, Wolfgang Walther wrote:

> Pavel Stehule:
> > When you write RAISE NOTICE '%', x, then the PL/pgSQL parser rewrites
> > it to RAISE NOTICE '%', SELECT $1
> >
> > There is no parser just for expressions.
>
> That's why my suggestion in [1] already made a difference between:
>
> SELECT var;
>
> and
>
> SELECT col, var FROM table, var;
>
> So the "only require variable-in-FROM if FROM is used" should extend to
> the SQL level.
>
> That should be possible, right?
>

1. You need to implement an extra path - data from the FROM clause are
processed differently than parameters - which is much more code (and the
current code has to stay if you want to keep supporting it)

2. The current default behavior is implicit unpacking of composites when
they are used in the FROM clause, so it is a problem when you want to use
a composite in a query without unpacking

3. If I support SELECT var and SELECT var FROM var together, it will
raise a collision with itself, which has to be solved

4. There is no benefit unless variables and tables share a catalog, but
session variables require an LSN, and it can be a problem to store that
in the table catalog

5. Identifying when a variable does or doesn't need the FROM clause isn't
easy

there can be lots of combinations like SELECT (SELECT var), c FROM tab or
SELECT var, (SELECT c) FROM c, and if c is a variable, then FROM is not
necessary.

If somebody writes SELECT (SELECT var OFFSET 0) FROM ..., then the
subselect can know nothing about the outer query - so it means at minimum
one check over all nodes

It is possible, but it is several times more complex than the current code
(and I am not sure that storing LSNs in pg_class will ever be possible)

6. I think the PL/pgSQL CASE statement accepts a list of expressions, so
you can write

CASE WHEN x = 1, (SELECT count(*) FROM tab) THEN ...

It is synthetic, but we are talking about what is possible.

and although it looks correct, and will work if x is a PL/pgSQL variable,
it will not work if x is a session variable

and then you need to fix it like

CASE WHEN (SELECT x=1 FROM x), (SELECT count(*) FROM tab) THEN

so it is possible, but it is clean only in trivial cases, and can be pretty
messy

Personally, I cannot imagine explaining to any user that the following
behavior (proposed by you) is intuitive and friendly:

CREATE VARIABLE a as int;
CREATE TABLE test(id int);

SELECT a; --> ok
SELECT * FROM test WHERE id = a; --> error: column "a" does not exist



Best,
>
> Wolfgang
>
> [1]:
>
> https://www.postgresql.org/message-id/e7faf42f-62b8-47f4-af5c-cb8efa3e0e20%40technowledgy.de
>


Re: Add memory context type to pg_backend_memory_contexts view

2024-05-31 Thread David Christensen


> On May 30, 2024, at 5:36 PM, David Rowley  wrote:
> 
> On Fri, 31 May 2024 at 07:21, David Christensen  wrote:
>> Giving this a once-through, this seems straightforward and useful.  I
>> have a slight preference for keeping "name" the first field in the
>> view and moving "type" to the second, but that's minor.
> 
> Not having it first make sense, but I don't think putting it between
> name and ident is a good idea. I think name and ident belong next to
> each other. parent likely should come after those since that also
> relates to the name.
> 
> How about putting it after "parent"?

That works for me. I skimmed the new patch and it seems fine, but I'm on
my phone so didn't do any build tests.

>> Just confirming that the allocator types are not extensible without a
>> recompile, since it's using a specific node tag to switch on, so there
>> are no concerns with not properly displaying the output of something
>> else.
> 
> They're not extensible.

Good to confirm. 

>> The "???" text placeholder might be more appropriate as "<unknown>",
>> or perhaps stronger, include a WARNING in the logs, since an unknown
>> tag at this point would be an indication of some sort of memory
>> corruption.
> 
> This follows what we do in other places.  If you look at explain.c,
> you'll see lots of "???"s.
> 
> I think if you're worried about corrupted memory, then garbled output
> in pg_get_backend_memory_contexts wouldn't be that high on the list of
> concerns.

Heh, indeed. +1 for precedent. 

>> Since there are only four possible values, I think there would be
>> utility in including them in the docs for this field.
> 
> I'm not sure about this. We do try not to expose too much internal
> detail in the docs.  I don't know all the reasons for that, but at
> least one reason is that it's easy for things to get outdated as code
> evolves. I'm also unsure how much value there is in listing 4 possible
> values unless we were to document the meaning of each of those values,
> and doing that puts us even further down the path of detailing
> Postgres internals in the documents. I don't think that's a
> maintenance burden that's often worth the trouble.

I can see that and it's consistent with what we do; I was just thinking as
a user that it may be useful, but if you're using this view you likely
already know what it means.

>> I also think it
>> would be useful to have some sort of comments at least in mmgr/README
>> to indicate that if a new type of allocator is introduced that you
>> will also need to add the node to the function for this type, since
>> it's not an automatic conversion.
> 
> I don't think it's sustainable to do this.  If we have to maintain
> documentation that lists everything you must do in order to add some
> new node types then I feel it's just going to get outdated as soon as
> someone adds something new that needs to be done.  I'm only one
> developer, but for me, I'd not even bother looking there if I was
> planning to add a new memory context type.
> 
> What I would be doing is searching the entire code base for where
> special handling is done for the existing types and ensuring I
> consider if I need to include a case for the new node type. In this
> case, I'd probably choose to search for "T_AllocSetContext", and I'd
> quickly land on PutMemoryContextsStatsTupleStore() and update it. This
> method doesn't get outdated, provided you do "git pull" occasionally.

Fair. 

>> (For that matter, instead of
>> switching on node type and outputting a given string, is there a
>> generic function that could just give us the string value for node
>> type so we don't need to teach anything else about it anyway?)
> 
> There isn't.  nodeToString() does take some node types as inputs and
> serialise those to something JSON-like, but that includes serialising
> each field of the node type too. The memory context types are not
> handled by those functions. I think it's fine to copy what's done in
> explain.c.  "git grep \"???\" -- *.c | wc -l" gives me 31 occurrences,
> so I'm not doing anything new.
> 
> I've attached an updated patch which changes the position of the new
> column in the view.

+1





Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:
When you write RAISE NOTICE '%', x, then the PL/pgSQL parser rewrites it
to RAISE NOTICE '%', SELECT $1


There is no parser just for expressions.


That's why my suggestion in [1] already made a difference between:

SELECT var;

and

SELECT col, var FROM table, var;

So the "only require variable-in-FROM if FROM is used" should extend to 
the SQL level.


That should be possible, right?
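Spelled out with the examples above (hypothetical behavior under this
proposal; CREATE VARIABLE is the syntax from the patch being discussed):

CREATE VARIABLE var AS int;

SELECT var;                     -- no FROM clause at all: variable resolves
SELECT col, var FROM tab, var;  -- FROM present: the variable must be listed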

Best,

Wolfgang

[1]: 
https://www.postgresql.org/message-id/e7faf42f-62b8-47f4-af5c-cb8efa3e0e20%40technowledgy.de





Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Pavel Stehule
>
>
>
>
>> In this case you could still write:
>>
>> RAISE NOTICE '% %', x, (SELECT a,b FROM y);
>>
>> (assuming only x is a variable here)
>>
>
no - y was a composite variable.

When you write RAISE NOTICE '%', x, then the PL/pgSQL parser rewrites it
to RAISE NOTICE '%', SELECT $1

There is no parser just for expressions.



>
>> Best,
>>
>> Wolfgang
>>
>


Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Pavel Stehule
pá 31. 5. 2024 v 15:29 odesílatel Wolfgang Walther 
napsal:

> Pavel Stehule:
> > The session variables can be used in queries, but should be used in
> > PL/pgSQL expressions, and then the mandatory usage in the FROM clause
> > will cause a lot of problems and unreadable code like
> >
> > DO $$
> > BEGIN
> >RAISE NOTICE '% %', (SELECT x FROM x), (SELECT a,b FROM y);
> >
> > END
> > $$
> >
> > This requirement makes variables unusable in PL
>
> I already proposed earlier to only require listing them in FROM when
> there is actually a related FROM.
>

but there is a technical problem - PL/pgSQL expressions are internally SQL
queries. It isn't possible to cleanly parse queries and expressions
differently.



>
> In this case you could still write:
>
> RAISE NOTICE '% %', x, (SELECT a,b FROM y);
>
> (assuming only x is a variable here)
>
> Best,
>
> Wolfgang
>


Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:
The session variables can be used in queries, but should be used in 
PL/pgSQL expressions, and then the mandatory usage in the FROM clause will 
cause a lot of problems and unreadable code like


DO $$
BEGIN
   RAISE NOTICE '% %', (SELECT x FROM x), (SELECT a,b FROM y);

END
$$

This requirement makes variables unusable in PL


I already proposed earlier to only require listing them in FROM when 
there is actually a related FROM.


In this case you could still write:

RAISE NOTICE '% %', x, (SELECT a,b FROM y);

(assuming only x is a variable here)

Best,

Wolfgang




Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Pavel Stehule
On Fri, May 31, 2024 at 15:02, Pavel Stehule wrote:

>
>
> On Fri, May 31, 2024 at 13:37, Wolfgang Walther <
> walt...@technowledgy.de> wrote:
>
>> Pavel Stehule:
>> > But in this case you could make variables and tables share the same
>> > namespace, i.e. forbid creating a variable with the same name as an
>> > already existing table.
>> >
>> >
> > It helps, but not 100% - there is the search path
>>
>
>> I think we can ignore the search_path for this discussion. That's not a
>> problem of variables vs tables, but just a search path related problem.
>> It is exactly the same thing right now, when you create a new table x(x)
>> in a schema which happens to be earlier in your search path.
>>
>
> I don't think it is a valid argument - the search_path is there, and we
> cannot ignore it, because that covers just one case.
>
> And the need to use a variable in the FROM clause introduces implicit
> unpacking, or an inconsistency with how composite types currently work, so
> I am even more sure this way is not good.
>

The session variables can be used in queries, but should be used in
PL/pgSQL expressions, and then the mandatory usage in the FROM clause will
cause a lot of problems and unreadable code like

DO $$
BEGIN
  RAISE NOTICE '% %', (SELECT x FROM x), (SELECT a,b FROM y);

END
$$

This requirement makes variables unusable in PL



>
>
>
>>
>> The objection to the proposed approach for variables was that it would
>> introduce *new* ambiguities, which Alvaro's suggestion avoids.
>>
>> Best,
>>
>> Wolfgang
>>
>


Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Pavel Stehule
On Fri, May 31, 2024 at 13:37, Wolfgang Walther wrote:

> Pavel Stehule:
> > But in this case you could make variables and tables share the same
> > namespace, i.e. forbid creating a variable with the same name as an
> > already existing table.
> >
> >
> > It helps, but not 100% - there is the search path
>

> I think we can ignore the search_path for this discussion. That's not a
> problem of variables vs tables, but just a search path related problem.
> It is exactly the same thing right now, when you create a new table x(x)
> in a schema which happens to be earlier in your search path.
>

I don't think it is a valid argument - the search_path is there, and we
cannot ignore it, because that covers just one case.

And the need to use a variable in the FROM clause introduces implicit
unpacking, or an inconsistency with how composite types currently work, so
I am even more sure this way is not good.




>
> The objection to the proposed approach for variables was that it would
> introduce *new* ambiguities, which Alvaro's suggestion avoids.
>
> Best,
>
> Wolfgang
>


Re: Improve the connection failure error messages

2024-05-31 Thread Nisha Moond
On Fri, Apr 26, 2024 at 1:10 PM Daniel Gustafsson  wrote:
>
> > On 22 Mar 2024, at 11:42, Nisha Moond  wrote:
>
> > Here is the v4 patch with changes required in slotfuncs.c and slotsync.c 
> > files.
>
> -   errmsg("could not connect to the primary server: %s", err));
> +   errmsg("\"%s\" could not connect to the primary server: %s", 
> app_name.data, err));
>
> Messages like this should perhaps have translator comments to indicate what 
> the
> leading "%s" will contain?

Attached v5 patch with the translator comments as suggested.

--
Thanks,
Nisha


v5-0001-Improve-the-connection-failure-error-messages.patch
Description: Binary data


Re: 回复: An implementation of multi-key sort

2024-05-31 Thread Yao Wang
I added two optimizations to mksort which already exist in qsort_tuple():

1. When selecting a pivot, always pick the item in the middle of the array
rather than at random. Theoretically it has the same effect as the old
approach, but it eliminates some unstable perf test results, plus a small
perf benefit from removing the random value generator.
2. Always check whether the array is already ordered, and return
immediately if it is. The pre-ordered check adds extra cost and hurts
perf numbers on some data sets, but can improve perf significantly on
other data sets.

By now, mksort has perf results equal to or better than qsort on all data
sets I have used.

I also updated the test case. Please see the v3 code attached.

Perf test results:

Data set 1 (with mass duplicate values):
-

create table t1 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t1 values (generate_series(1,49), 0, 0, 0, 0,
'aaabbb');
update t1 set c2 = c1 % 100, c3 = c1 % 50, c4 = c1 % 10, c5 = c1 % 3;
update t1 set c6 = 'aaabbb'
|| (c1 % 5)::text;

Query 1:

explain analyze select c1 from t1 order by c6, c5, c4, c3, c2, c1;

Disable Mksort

3021.636 ms
3014.669 ms
3033.588 ms

Enable Mksort

1688.590 ms
1686.956 ms
1688.567 ms

The improvement is 78.9%, down from the previous version (129%). Most of
the added cost should be the pre-ordered check.

Query 2:

create index idx_t1_mk on t1 (c6, c5, c4, c3, c2, c1);

Disable Mksort

1674.648 ms
1680.608 ms
1681.373 ms

Enable Mksort

1143.341 ms
1143.462 ms
1143.894 ms

The improvement is ~47%, also down a bit (from 52%).

Data set 2 (with distinct values):
--

create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t2 values (generate_series(1,49), 0, 0, 0, 0, '');
update t2 set c2 = 90 - c1, c3 = 91 - c1, c4 = 92 - c1, c5
= 93 - c1;
update t2 set c6 = 'aaabbb'
|| (94 - c1)::text;

Query 1:

explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;

Disable Mksort

12199.963 ms
12197.068 ms
12191.657 ms

Enable Mksort

9538.219 ms
9571.681 ms
9536.335 ms

The improvement is 27.9%, which is much better than the old approach (-6.2%).

Query 2 (the data is pre-ordered):

explain analyze select c1 from t2 order by c6 desc, c5, c4, c3, c2, c1;

Enable Mksort

768.191 ms
768.079 ms
767.026 ms

Disable Mksort

768.757 ms
766.166 ms
766.149 ms

They are almost the same since no actual sort was performed, and much
better than the old approach (-1198.1%).


Thanks,

Yao Wang

On Fri, May 24, 2024 at 8:50 PM Yao Wang  wrote:
>
> When all leading keys are different, mksort will finish the entire sort at the
> first sort key and never touch the other keys. In that case, mksort
> effectively falls back to a kind of qsort.
>
> I created another data set with distinct values in all sort keys:
>
> create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
> insert into t2 values (generate_series(1,49), 0, 0, 0, 0, '');
> update t2 set c2 = 90 - c1, c3 = 91 - c1, c4 = 92 - c1, c5
> = 93 - c1;
> update t2 set c6 = 'aaabbb'
>   || (94 - c1)::text;
> explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;
>
> Results:
>
> MKsort:
> 12374.427 ms
> 12528.068 ms
> 12554.718 ms
>
> qsort:
> 12251.422 ms
> 12279.938 ms
> 12280.254 ms
>
> MKsort is a bit slower than qsort, which can be explained by extra
> checks of MKsort.
>
> Yao Wang
>
> On Fri, May 24, 2024 at 8:36 PM Wang Yao  wrote:
> >
> >
> >
> > Get Outlook for Android
> > 
> > From: Heikki Linnakangas 
> > Sent: Thursday, May 23, 2024 8:47:29 PM
> > To: Wang Yao ; PostgreSQL Hackers 
> > 
> > Cc: inte...@outlook.com 
> > Subject: Re: 回复: An implementation of multi-key sort
> >
> > On 23/05/2024 15:39, Wang Yao wrote:
> > > No obvious perf regression is expected because PG will follow original
> > > qsort code path when mksort is disabled. For the case, the only extra
> > > cost is the check in tuplesort_sort_memtuples() to enter mksort code path.
> >
> > And what about the case the mksort is enabled, but it's not effective
> > because all leading keys are different?
> >
> > --
> > Heikki Linnakangas
> > Neon (https://neon.tech)
> >


Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:

But in this case you could make variables and tables share the same
namespace, i.e. forbid creating a variable with the same name as an
already existing table.


It helps, but not 100% - there is the search path


I think we can ignore the search_path for this discussion. That's not a 
problem of variables vs tables, but just a search path related problem. 
It is exactly the same thing right now, when you create a new table x(x) 
in a schema which happens to be earlier in your search path.


The objection to the proposed approach for variables was that it would 
introduce *new* ambiguities, which Alvaro's suggestion avoids.


Best,

Wolfgang




Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Pavel Stehule
On Fri, May 31, 2024 at 13:10, Wolfgang Walther wrote:

> Pavel Stehule:
> > 2. But my main argument is, it is not really safe - it solves Peter's
> > use case, but if I use a reverse example of Peter's case, I still have a
> > problem.
> >
> > I can have a variable x, and then I can write query like `SELECT x FROM
> x`;
> >
> > but if somebody creates table x(x int), then the query `SELECT x FROM x`
> > will be correct, but it is surely something else. So the requirement of
> > the usage variable inside FROM clause doesn't help. It doesn't work.
>
> But in this case you could make variables and tables share the same
> namespace, i.e. forbid creating a variable with the same name as an
> already existing table.
>

It helps, but not 100% - there is the search path



>
> Best,
>
> Wolfgang
>


Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:
2. But my main argument is, it is not really safe - it solves Peter's 
use case, but if I use a reverse example of Peter's case, I still have a 
problem.


I can have a variable x, and then I can write query like `SELECT x FROM x`;

but if somebody creates table x(x int), then the query `SELECT x FROM x` 
will be correct, but it is surely something else. So the requirement of 
using the variable inside the FROM clause doesn't help. It doesn't work.


But in this case you could make variables and tables share the same 
namespace, i.e. forbid creating a variable with the same name as an 
already existing table.


Best,

Wolfgang




Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Pavel Stehule
On Fri, May 31, 2024 at 11:46, Dmitry Dolgov <9erthali...@gmail.com>
wrote:

> > On Tue, May 28, 2024 at 05:18:02PM GMT, Pavel Stehule wrote:
> >
> > I propose other variants. First we can introduce a pseudo-function
> > VAR(). The argument should be a session variable. The name of this
> > function can be pgvar, globvar, ... We can talk about a good name, it
> > should not be too long, but it is not important now. The VAR() function
> > will be a pseudo-function like COALESCE, so we can easily set the
> > correct result type.
>
> So, the purpose of the function would be only to verify that the argument
> is a
> session variable? That seems to be a very light payload, which looks a bit
> awkward.
>

No, it just reduces the catalog search to variables. With this function
there is no possibility of a collision between variables and other
objects: the argument can only be a variable and nothing else, so a
conflict is not possible. When somebody tries to specify a table or
column, it fails, because that object will not be found. So inside this
function, tables and columns cannot shadow variables, and variables
cannot be replaced by columns.

So the proposed function is not just an assert; it is designed as a
catalog filter.
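As a sketch of how that would read, reusing the earlier example (the
function name is only a placeholder, as noted above):

-- hypothetical: only variables are searched inside VAR(), so tables and
-- columns can never shadow the variable
SELECT * FROM test WHERE id = VAR(a);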


> Out of those options you propose I think the first one is the
> most straightforward one, but...
>
> > Alvaro Herrera:
> > > Perhaps the solution to all this is to avoid having the variables be
> > > implicitly present in the range table of all queries.  Instead, if you
> > > need a variable's value, then you need to add the variable to the FROM
> > > clause;
>
> The more I think about this, the more I like this solution. Marking
> which variables are available to the query this way, and using established
> patterns for resolving ambiguity actually looks intuitive to me. Now I
> know,
> you've got strong objections:
>

I still don't like this - mainly for two reasons

1. It doesn't look user friendly - you need to maintain two different
places in one query for one object. I can imagine usage there in the case
of composite variables with unpacking (and then it can be consistent with
other objects). I can imagine optionally allowing variables there for the
possibility of re-aliasing - like functions - and if we support that,
then with unpacking of composite values.

(2024-05-31 12:33:57) postgres=# create type t as (a int, b int);
CREATE TYPE
(2024-05-31 12:35:26) postgres=# create function fx() returns t as $$
select 1, 2 $$ language sql;
CREATE FUNCTION
(2024-05-31 12:35:44) postgres=# select fx();
┌───┐
│  fx   │
╞═══╡
│ (1,2) │
└───┘
(1 row)

(2024-05-31 12:35:47) postgres=# select * from fx();
┌───┬───┐
│ a │ b │
╞═══╪═══╡
│ 1 │ 2 │
└───┴───┘
(1 row)

2. But my main argument is, it is not really safe - it solves Peter's use
case, but if I use a reverse example of Peter's case, I still have a
problem.

I can have a variable x, and then I can write query like `SELECT x FROM x`;

but if somebody creates table x(x int), then the query `SELECT x FROM x`
will be correct, but it is surely something else. So the requirement of
using the variable inside the FROM clause doesn't help. It doesn't work.







> > I don't like this. Sure, this fixes the problem with collisions, but then
> > we cannot talk about variables. When something is used like a table, it
> > should be a table. I can imagine memory tables, but it is a different
> type
> > of object. Table is relation, variable is just value. Variables should
> not
> > have columns, so using the same patterns for tables and variables has no
> > sense. Using the same catalog for variables and tables. Variables just
> hold
> > a value, and then you can use it inside a query without necessity to
> write
> > JOIN. Variables are not tables, and then it is not too confusing so they
> > are not transactional and don't support more rows, more columns.
>
> A FROM clause could contain a function returning a single value; nobody
> finds it confusing. And at least to me it's not much different from having
> a session variable as well, what do you think?
>

but there is a difference between when a function returns a composite and
when it does not - if I use a function in the FROM clause, I get unpacked
columns; when I use a function in the column list, I get a composite.

Using a variable in the FROM clause can make sense on a similar principle
as functions - for the possibility to use an alias at the same level of
the query, and for one common syntax for composite unpacking. But it
doesn't help with safety against collisions.


>
> > c) using variables with the necessity to define them in the FROM
> > clause. It is safe, but it can be less readable when you use more
> > variables, and it is not too readable or user friendly, because you need
> > to write FROM. And it can be messy, because you usually will use
> > variables in queries, and it introduces non-relations into the FROM
> > clause. But I can imagine this mode as alternative syntax, but it is
> > very unfriendly and not intuitive (I think).

Re: meson "experimental"?

2024-05-31 Thread Aleksander Alekseev
Hi,

>> By a quick look on the buildfarm we seem to use Ninja >= 1.11.1.
>> However since Meson can use both Ninja and VS as a backend I'm not
>> certain which section would be most appropriate for naming the minimal
>> required version of Ninja.
>
> When I tried building with the VS backend it blew up, I don't recall the 
> details. I think we should just use ninja everywhere. That keeps things 
> simple. On Windows I just install python and then do "pip install meson ninja"

If we know that it doesn't work, I suggest removing the mention of the
--backend option from [1] until it does, in order to avoid any
confusion.

[1]: https://www.postgresql.org/docs/devel/install-meson.html

-- 
Best regards,
Aleksander Alekseev




Re: Switch background worker on/off in runtime.

2024-05-31 Thread Kashif Zeeshan
Hi ISHAN



On Fri, May 31, 2024 at 2:28 PM ISHAN CHHANGANI . <
f20200...@hyderabad.bits-pilani.ac.in> wrote:

> Hi,
>
> Is it possible to switch on/off a background worker at runtime?
>
As per my understanding there is no such way to do it at runtime, but you
can kill it using the following command:

SELECT pg_terminate_backend(pid);  -- pid of the bgworker
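If the worker sets a name when it registers, its PID can be looked up
first, for example (assuming the worker's bgw_type appears as backend_type
in pg_stat_activity; 'my worker' is a placeholder):

SELECT pid, backend_type
  FROM pg_stat_activity
 WHERE backend_type = 'my worker';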

Regards
Kashif Zeeshan
Bitnine Global

>
> worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
>
> worker.bgw_start_time = BgWorkerStart_PostmasterStart;
>
>
>
> I want to switch off the worker based on some flag value, etc, either from
> the main process or the worker itself.
>
>
> Are there any already existing examples?
>
> Thanks,
>
> Ishan.
>
>


Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Dmitry Dolgov
> On Tue, May 28, 2024 at 05:18:02PM GMT, Pavel Stehule wrote:
>
> I propose other variants. First we can introduce a pseudo-function VAR().
> The argument should be a session variable. The name of this function can be
> pgvar, globvar, ... We can talk about a good name, it should not be too
> long, but it is not important now. The VAR() function will be a
> pseudo-function like COALESCE, so we can easily set the correct result type.

So, the purpose of the function would be only to verify that the argument is a
session variable? That seems to be a very light payload, which looks a bit
awkward.

Out of those options you propose I think the first one is the
most straightforward one, but...

> Alvaro Herrera:
> > Perhaps the solution to all this is to avoid having the variables be
> > implicitly present in the range table of all queries.  Instead, if you
> > need a variable's value, then you need to add the variable to the FROM
> > clause;

The more I think about this, the more I like this solution. Marking
which variables are available to the query this way, and using established
patterns for resolving ambiguity actually looks intuitive to me. Now I know,
you've got strong objections:

> I don't like this. Sure, this fixes the problem with collisions, but then
> we cannot talk about variables. When something is used like a table, it
> should be a table. I can imagine memory tables, but it is a different type
> of object. Table is relation, variable is just value. Variables should not
> have columns, so using the same patterns for tables and variables has no
> sense. Using the same catalog for variables and tables. Variables just hold
> a value, and then you can use it inside a query without necessity to write
> JOIN. Variables are not tables, and then it is not too confusing so they
> are not transactional and don't support more rows, more columns.

A FROM clause could contain a function returning a single value, nobody
finds it confusing. And at least to me it's not much different from having a
session variable as well, what do you think?

> c) using variables with the necessity to define them in the FROM clause.
> It is safe, but it can be less readable when you use more variables, and
> it is not too readable or user friendly, because you need to write FROM.
> And it can be messy, because you usually will use variables in queries,
> and it introduces non-relations into the FROM clause. But I can imagine
> this mode as alternative syntax, but it is very unfriendly and not
> intuitive (I think).

The proposal from Wolfgang to have a short-cut and not add FROM in case there
is no danger of ambiguity seems to resolve that.

> More probably it prevents fast execution in simple expression execution mode.

Could you elaborate more, what do you mean by that? If the performance
overhead is not prohibitive (which I would expect is the case), having better
UX for a new feature usually beats having better performance.

> It looks odd - It is not intuitive, it introduces new inconsistency inside
> Postgres, or with solutions in other databases. No other database has a
> similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird will be
> confused. Users that use PL/pgSQL will be confused.

Session variables are not part of the SQL standard, and maintaining
consistency with other databases is a questionable goal. Since it's a new
feature, I'm not sure what you mean by inconsistency inside Postgres itself.

I see that the main driving case behind this patch is to help with
migrating from other databases that do have session variables. Going with
variables in the FROM clause will not make a migration much harder -- some of the
queries would have to modify the FROM part, and that's it, right? I could
imagine it would be even easier than adding VAR() everywhere.




Switch background worker on/off in runtime.

2024-05-31 Thread ISHAN CHHANGANI .
Hi,
Is it possible to switch on/off a background worker at runtime?

worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
worker.bgw_start_time = BgWorkerStart_PostmasterStart;

I want to switch off the worker based on some flag value, etc, either from the 
main process or the worker itself.

Are there any already existing examples?

Thanks,
Ishan.



Re: 001_rep_changes.pl fails due to publisher stuck on shutdown

2024-05-31 Thread Peter Smith
On Wed, May 29, 2024 at 9:00 PM Alexander Lakhin  wrote:
>
> Hello hackers,
>
> As a recent buildfarm test failure [1] shows:
> [14:33:02.374](0.333s) ok 23 - update works with dropped subscriber column
> ### Stopping node "publisher" using mode fast
> # Running: pg_ctl -D
> /home/bf/bf-build/adder/HEAD/pgsql.build/testrun/subscription/001_rep_changes/data/t_001_rep_changes_publisher_data/pgdata
> -m fast stop
> waiting for server to shut down.. ... ... ... .. failed
> pg_ctl: server does not shut down
> # pg_ctl stop failed: 256
> # Postmaster PID for node "publisher" is 549
> [14:39:04.375](362.001s) Bail out!  pg_ctl stop failed
>
> 001_rep_changes_publisher.log
> 2024-05-16 14:33:02.907 UTC [2238704][client backend][4/22:0] LOG: statement: 
> DELETE FROM tab_rep
> 2024-05-16 14:33:02.925 UTC [2238704][client backend][:0] LOG: disconnection: 
> session time: 0:00:00.078 user=bf
> database=postgres host=[local]
> 2024-05-16 14:33:02.939 UTC [549][postmaster][:0] LOG:  received fast 
> shutdown request
> 2024-05-16 14:33:03.000 UTC [549][postmaster][:0] LOG:  aborting any 
> active transactions
> 2024-05-16 14:33:03.049 UTC [549][postmaster][:0] LOG: background worker 
> "logical replication launcher" (PID
> 2223110) exited with exit code 1
> 2024-05-16 14:33:03.062 UTC [901][checkpointer][:0] LOG: shutting down
> 2024-05-16 14:39:04.377 UTC [549][postmaster][:0] LOG:  received 
> immediate shutdown request
> 2024-05-16 14:39:04.382 UTC [549][postmaster][:0] LOG:  database system 
> is shut down
>
> the publisher node may hang on stopping.
>
> I reproduced the failure (with aggressive autovacuum) locally and
> discovered that it happens because:
> 1) checkpointer calls WalSndInitStopping() (which sends
>   PROCSIG_WALSND_INIT_STOPPING to walsender), and then spins inside
>   WalSndWaitStopping() indefinitely, because:
> 2) walsender receives the signal, sets got_STOPPING = true, but can't exit
> WalSndLoop():
> 3) it never sets got_SIGUSR2 (to get into WalSndDone()) in
>   XLogSendLogical():
> 4) it never sets WalSndCaughtUp to true:
> 5) logical_decoding_ctx->reader->EndRecPtr can't reach flushPtr in
>   XLogSendLogical():
> 6) EndRecPtr doesn't advance in XLogNextRecord():
> 7) XLogDecodeNextRecord() fails to decode a record that crosses a page
>   boundary:
> 8) ReadPageInternal() (commented "Wait for the next page to become
>   available") constantly returns XLREAD_FAIL:
> 9) state->routine.page_read()/logical_read_xlog_page() constantly returns
>   -1:
> 10) flushptr = WalSndWaitForWal() stops advancing, because
>   got_STOPPING == true (see 2).
>
> That is, walsender doesn't let itself to catch up, if it gets the stop
> signal when it's lagging behind and decoding a record requires reading
> the next wal page.
>
> Please look at the reproducing test (based on 001_rep_changes.pl) attached.
> It fails for me as below:
> # 17
> Bailout called.  Further testing stopped:  pg_ctl stop failed
> FAILED--Further testing stopped: pg_ctl stop failed
> make: *** [Makefile:21: check] Error 255
>
> [1] 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&dt=2024-05-16%2014%3A22%3A38
> [2] 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dikkop&dt=2024-04-24%2014%3A38%3A35
>  (apparently the same)
>

Hi Alexander,

FYI, by injecting a lot of logging, I’ve confirmed your findings that
for the failing scenario, the ‘got_SIGUSR2’ flag never gets set to
true, meaning the WalSndLoop() cannot finish. Furthermore, I agree
with your step 8 finding that when it fails the ReadPageInternal
function call (the one in XLogDecodeNextRecord with the comment "Wait
for the next page to become available") constantly returns -1.

I will continue digging next week...

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Fix possible dereference null pointer (PQprint)

2024-05-31 Thread Daniel Gustafsson
> On 27 May 2024, at 16:52, Ranier Vilela  wrote:

> In the function *PQprint*, the variable po->fieldName can be NULL.

Yes.

> See the checks a few lines up.

Indeed, let's check it.

    for (numFieldName = 0;
         po->fieldName && po->fieldName[numFieldName];
         numFieldName++)
        ;
    for (j = 0; j < nFields; j++)
    {
        int         len;
        const char *s = (j < numFieldName && po->fieldName[j][0]) ?
            po->fieldName[j] : PQfname(res, j);

If po->fieldName is NULL then numFieldName won't be incremented and will
remain zero.  In the check you reference, (j < numFieldName) then compares
each j in the range 0..nFields-1 against zero, which is always false, so
po->fieldName[j] is never dereferenced.  The code thus does seem quite
correct to me.

--
Daniel Gustafsson