Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-27 Thread vignesh C
On Fri, 26 Jan 2024 at 22:22, Jelte Fennema-Nio  wrote:
>
> On Fri, 26 Jan 2024 at 13:11, Alvaro Herrera  wrote:
> > I wonder, would it make sense to put all these new functions in a
> > separate file fe-cancel.c?
>
>
> Okay I tried doing that. I think the end result is indeed quite nice,
> having all the cancellation related functions together in a file. But
> it did require making a bunch of static functions in fe-connect
> extern, and adding them to libpq-int.h. On one hand that seems fine to
> me, on the other maybe that indicates that this cancellation logic
> makes sense to be in the same file as the other connection functions
> (in a sense, connecting is all that a cancel request does).

CFBot shows that the patch has a few compilation errors, as in [1]:
[17:07:07.621] /usr/bin/ld:
../../../src/fe_utils/libpgfeutils.a(cancel.o): in function
`handle_sigint':
[17:07:07.621] cancel.c:(.text+0x50): undefined reference to `PQcancel'
[17:07:07.621] /usr/bin/ld:
../../../src/fe_utils/libpgfeutils.a(cancel.o): in function
`SetCancelConn':
[17:07:07.621] cancel.c:(.text+0x10c): undefined reference to `PQfreeCancel'
[17:07:07.621] /usr/bin/ld: cancel.c:(.text+0x114): undefined
reference to `PQgetCancel'
[17:07:07.621] /usr/bin/ld:
../../../src/fe_utils/libpgfeutils.a(cancel.o): in function
`ResetCancelConn':
[17:07:07.621] cancel.c:(.text+0x148): undefined reference to `PQfreeCancel'
[17:07:07.621] /usr/bin/ld:
../../../src/fe_utils/libpgfeutils.a(connect_utils.o): in function
`disconnectDatabase':
[17:07:07.621] connect_utils.c:(.text+0x2fc): undefined reference to
`PQcancelConn'
[17:07:07.621] /usr/bin/ld: connect_utils.c:(.text+0x307): undefined
reference to `PQcancelSend'
[17:07:07.621] /usr/bin/ld: connect_utils.c:(.text+0x30f): undefined
reference to `PQcancelFinish'
[17:07:07.623] /usr/bin/ld: ../../../src/interfaces/libpq/libpq.so:
undefined reference to `PQcancelPoll'
[17:07:07.626] collect2: error: ld returned 1 exit status
[17:07:07.626] make[3]: *** [Makefile:31: pg_amcheck] Error 1
[17:07:07.626] make[2]: *** [Makefile:45: all-pg_amcheck-recurse] Error 2
[17:07:07.626] make[2]: *** Waiting for unfinished jobs
[17:07:08.126] /usr/bin/ld: ../../../src/interfaces/libpq/libpq.so:
undefined reference to `PQcancelPoll'
[17:07:08.130] collect2: error: ld returned 1 exit status
[17:07:08.131] make[3]: *** [Makefile:42: initdb] Error 1
[17:07:08.131] make[2]: *** [Makefile:45: all-initdb-recurse] Error 2
[17:07:08.492] /usr/bin/ld: ../../../src/interfaces/libpq/libpq.so:
undefined reference to `PQcancelPoll'
[17:07:08.495] collect2: error: ld returned 1 exit status
[17:07:08.496] make[3]: *** [Makefile:50: pg_basebackup] Error 1
[17:07:08.496] make[2]: *** [Makefile:45: all-pg_basebackup-recurse] Error 2
[17:07:09.060] /usr/bin/ld: parallel.o: in function `sigTermHandler':
[17:07:09.060] parallel.c:(.text+0x1aa): undefined reference to `PQcancel'

Please post an updated version addressing these.

[1] - https://cirrus-ci.com/task/6210637211107328

Regards,
Vignesh




Re: proposal: psql: show current user in prompt

2024-01-27 Thread Pavel Stehule
On Sat, 27 Jan 2024 at 10:24, Jelte Fennema-Nio  wrote:

> On Sat, 27 Jan 2024 at 08:35, Pavel Stehule 
> wrote:
> > Until now, it has not been possible to disable reporting, so clients can
> > expect that reporting works.
>
> Sure, so if we leave the default as is that's fine. It's not as if
> this GUC would be changed without the client knowing, the client would
> be the one changing the GUC and thus disabling the sending of
> reporting for the default GUCs. If it doesn't want to disable the
> reporting, then it simply should not send such a request.
>
> > Do you have a use case where disabling some of the default-reported
> > GUCs makes sense?
>
> Mostly if the client doesn't actually use them, e.g. I expect many
> clients don't care what the current application_name is. But I do
> agree this is not a very strong use case, so I'd be okay with always
> sending the ones that we sent as default for now. But that does make
> the implementation more difficult, since we'd have to special case
> these GUCs instead of having the same consistent behaviour for all
> GUCs.
>

client_encoding, standard_conforming_strings, server_version,
default_transaction_read_only, in_hot_standby and scram_iterations
are used by libpq directly, so it would be wrong to introduce the
possibility of breaking them.
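
For illustration only (nothing here is part of the patches in this thread) -
this is roughly how a libpq client consumes those reported values through
PQparameterStatus(). If reporting for these GUCs could be switched off, calls
like the ones below would start returning NULL or stale values:

/* minimal sketch: read two of the always-reported parameters */
#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("");	/* settings taken from the environment */
	const char *ver;
	const char *scs;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "%s", PQerrorMessage(conn));
		PQfinish(conn);
		return 1;
	}

	ver = PQparameterStatus(conn, "server_version");
	scs = PQparameterStatus(conn, "standard_conforming_strings");
	printf("server_version = %s\n", ver ? ver : "(not reported)");
	printf("standard_conforming_strings = %s\n", scs ? scs : "(not reported)");

	PQfinish(conn);
	return 0;
}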



> > yes, inside a gradual connect you can enhance the list of custom reported
> > GUCs easily.
> >
> > but for use cases like the prompt in psql, I need to enable and disable
> > reporting - and this use case should be independent of "protected"
> > connection-related GUC reporting.
> >
> > For example - when I disable %N, I can disable reporting of "role" and
> > disable showing the role in the prompt. But when "role" is necessary for
> > pgbouncer, then surely the disabling of reporting should be ignored. The
> > user should not be able to break communication by setting a prompt.  And
> > it can be ignored without any issue - there is no performance cost,
> > because "role" is still needed by the pgbouncer used for the connection.
> > Without the described behaviour we should not implement control over
> > reporting in psql, because there could be a lot of unhappy side effects
> > depending on whether the user sets or unsets a custom prompt or some
> > other future feature.
>
> Maybe I'm misunderstanding what you're saying, but it's not clear to
> me why you are seeing two different use cases here. To me if we have
> the ParameterSet message then they are both the same. When you enable
> %N you would send a ParameterSet message for _pq_.report_parameters
> and set it to a list of GUCs including "role". And when you disable %N
> you would set _pq_.report_parameters to a list of GUCs without "role".
> Then if any proxy still needed "role" even if the list it receives in
> _pq_.report_parameters doesn't contain it, then this proxy would
> simply prepend "role" to the list of GUCs before forwarding the
> ParameterSet message.
>

Your scenario can work, but it looks too fragile. I checked - GUC names
currently cannot contain special characters, so writing a parser should not
be hard. But your proposal means the proxy has to be smart about this and
check every change of _pq_.report_parameters, and that can be fragile and a
source of hard-to-find bugs.


>
> Also to be clear, having a "protected" connection-start-only GUC is
> problematic for proxies, because they need to update this setting
> while the connection is active when they hand off one server connection
> to another client.
>

This is true, but how common is this situation? Probably every client that
uses a given proxy will use the same default set of reported GUCs.  Reporting
has no extra overhead; only the notifications are reduced. When a client
needs a different set of reported GUCs, the proxy can try to find another
connection with the same set, or it can reconnect.  I think there is still an
open question about what the correct behaviour should be when the client
executes RESET ALL or DISCARD ALL. Without special protection the
communication with the proxy can be broken - and if we use a GUC for the
reported variables, then in my case the prompt in psql will be broken too.
Inside psql I have no callback on changes of reported GUCs. This is why
reporting based on a mutable GUC is fragile :-/

Pavel


Re: logical decoding and replication of sequences, take 2

2024-01-27 Thread Tomas Vondra
On 1/26/24 15:39, Robert Haas wrote:
> On Wed, Jan 24, 2024 at 12:46 PM Tomas Vondra
>  wrote:
>> I did try to explain how this works (and why) in a couple places:
>>
>> 1) the commit message
>> 2) reorderbuffer header comment
>> 3) ReorderBufferSequenceIsTransactional comment (and nearby)
>>
>> It's possible this does not meet your expectations, ofc. Maybe there
>> should be a separate README for this - I haven't found anything like
>> that for logical decoding in general, which is why I did (1)-(3).
> 
> I read over these and I do think they answer a bunch of questions, but
> I don't think they answer all of the questions.
> 
> Suppose T1 creates a sequence and commits. Then T2 calls nextval().
> Then T3 drops the sequence. According to the commit message, T2's
> change will be "replayed immediately after decoding". But it's
> essential to replay T2's change after we replay T1 and before we
> replay T3, and the comments don't explain why that's guaranteed.
> 
> The answer might be "locks". If we always replay a transaction
> immediately when we see its commit record then in the example above
> we're fine, because the commit record for the transaction that creates
> the sequence must precede the nextval() call, since the sequence won't
> be visible until the transaction commits, and also because T1 holds a
> lock on it at that point sufficient to lock out nextval. And the
> nextval record must precede the point where T3 takes an exclusive lock
> on the sequence.
> 

Right, locks + apply in commit order gives us this guarantee (I can't
think of a case where it wouldn't be the case).

> Note, however, that this chain of reasoning critically depends on us
> never delaying application of a transaction. If we might reach T1's
> commit record and say "hey, let's hold on to this for a bit and replay
> it after we've decoded some more," everything immediately breaks,
> unless we also delay application of T2's non-transactional update in
> such a way that it's still guaranteed to happen after T1. I wonder if
> this kind of situation would be a problem for a future parallel-apply
> feature. It wouldn't work, for example, to hand T1 and T3 off (in that
> order) to a separate apply process but handle T2's "non-transactional"
> message directly, because it might handle that message before the
> application of T1 got completed.
> 

Doesn't the whole of logical replication critically depend on the commit
order? If you decide to arbitrarily reorder/delay the transactions, all
kinds of really bad things can happen. That's a generic problem; it
applies to all kinds of objects, not just sequences - a parallel apply
would need to detect this sort of dependency (e.g. INSERT + DELETE of
the same key) and do something about it.

Similar for sequences, where the important event is allocation of a new
relfilenode.

If anything, it's easier for sequences, because the relfilenode tracking
gives us an explicit (and easy) way to detect these dependencies between
transactions.

> This also seems to depend on every transactional operation that might
> affect a future non-transactional operation holding a lock that would
> conflict with that non-transactional operation. For example, if ALTER
> SEQUENCE .. RESTART WITH didn't take a strong lock on the sequence,
> then you could have: T1 does nextval, T2 does ALTER SEQUENCE RESTART
> WITH, T1 does nextval again, T1 commits, T2 commits. It's unclear what
> the semantics of that would be -- would T1's second nextval() see the
> sequence restart, or what? But if the effect of T1's second nextval
> does depend in some way on the ALTER SEQUENCE operation which precedes
> it in the WAL stream, then we might have some trouble here, because
> both nextvals precede the commit of T2. Fortunately, this sequence of
> events is foreclosed by locking.
> 

I don't quite follow :-(

AFAIK this theory hinges on not having the right lock, but I believe
ALTER SEQUENCE does obtain the lock (at least in cases that assign a new
relfilenode). Which means such reordering should not be possible,
because nextval() in other transactions will then wait until commit. And
all nextval() calls in the same transaction will be treated as
transactional.

So I think this works OK. If something does not lock the sequence in a
way that would prevent other xacts from doing nextval() on it, it's not a
change that assigns a new relfilenode - and so it does not switch the
sequence into transactional mode.

> But I did find one somewhat-similar case in which that's not so.
> 
> S1: create table withseq (a bigint generated always as identity);
> S1: begin;
> S2: select nextval('withseq_a_seq');
> S1: alter table withseq set unlogged;
> S2: select nextval('withseq_a_seq');
> 
> I think this is a bug in the code that supports owned sequences rather
> than a problem that this patch should have to do something about. When
> a sequence is flipped between logged and unlogged directly, we take a
> stronger lock than we do here when 

PG versus libxml2 2.12.x

2024-01-27 Thread Tom Lane
Buildfarm member caiman has been failing build for a couple weeks now.
The reason turns out to be that recent libxml2 has decided to throw
a "const" into the signature required for custom error handlers.
(API compatibility?  What's that?)

I don't mind adopting the "const" --- it's a good idea in isolation.
The trouble is in fixing our code to work with both old and new
libxml2 versions.  We could thrash around with a configure test or
something, but I think the most expedient answer is just to insert
some explicit casts, as shown in the attached.  It's possible though
that some compilers will throw a cast-away-const warning.  I'm
not seeing any, but ...
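
Another cast-free possibility, sketched here untested, would be to key the
declaration off LIBXML_VERSION (PgXmlErrorPtr is an illustrative name, and
21200 is my reading of the MMmmpp encoding for 2.12.0):

#include <libxml/xmlversion.h>
#include <libxml/xmlerror.h>

/* pick the handler signature based on the libxml2 version */
#if LIBXML_VERSION >= 21200
typedef const xmlError *PgXmlErrorPtr;
#else
typedef xmlErrorPtr PgXmlErrorPtr;
#endif

static void xml_errorHandler(void *data, PgXmlErrorPtr error);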

Also, I'm seeing a deprecation warning in contrib/xml2/xpath.c
for

xmlLoadExtDtdDefaultValue = 1;

I'm not sure why that's still there, given that we disabled external
DTD access ages ago.  I propose we just remove it.

In short, I suggest the attached.

regards, tom lane
diff --git a/contrib/xml2/xpath.c b/contrib/xml2/xpath.c
index a967257546..b999b1f706 100644
--- a/contrib/xml2/xpath.c
+++ b/contrib/xml2/xpath.c
@@ -74,8 +74,6 @@ pgxml_parser_init(PgXmlStrictness strictness)
 	/* Initialize libxml */
 	xmlInitParser();
 
-	xmlLoadExtDtdDefaultValue = 1;
-
 	return xmlerrcxt;
 }
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index f869c680af..a6734f3550 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -124,7 +124,7 @@ static xmlParserInputPtr xmlPgEntityLoader(const char *URL, const char *ID,
 		   xmlParserCtxtPtr ctxt);
 static void xml_errsave(Node *escontext, PgXmlErrorContext *errcxt,
 		int sqlcode, const char *msg);
-static void xml_errorHandler(void *data, xmlErrorPtr error);
+static void xml_errorHandler(void *data, const xmlError *error);
 static int	errdetail_for_xml_code(int code);
 static void chopStringInfoNewlines(StringInfo str);
 static void appendStringInfoLineSeparator(StringInfo str);
@@ -1196,7 +1196,8 @@ pg_xml_init(PgXmlStrictness strictness)
 	errcxt->saved_errcxt = xmlGenericErrorContext;
 #endif
 
-	xmlSetStructuredErrorFunc((void *) errcxt, xml_errorHandler);
+	xmlSetStructuredErrorFunc((void *) errcxt,
+			  (xmlStructuredErrorFunc) xml_errorHandler);
 
 	/*
 	 * Verify that xmlSetStructuredErrorFunc set the context variable we
@@ -2024,7 +2025,7 @@ xml_errsave(Node *escontext, PgXmlErrorContext *errcxt,
  * Error handler for libxml errors and warnings
  */
 static void
-xml_errorHandler(void *data, xmlErrorPtr error)
+xml_errorHandler(void *data, const xmlError *error)
 {
 	PgXmlErrorContext *xmlerrcxt = (PgXmlErrorContext *) data;
 	xmlParserCtxtPtr ctxt = (xmlParserCtxtPtr) error->ctxt;
@@ -4803,7 +4804,8 @@ XmlTableFetchRow(TableFuncScanState *state)
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableFetchRow");
 
 	/* Propagate our own error context to libxml2 */
-	xmlSetStructuredErrorFunc((void *) xtCxt->xmlerrcxt, xml_errorHandler);
+	xmlSetStructuredErrorFunc((void *) xtCxt->xmlerrcxt,
+			  (xmlStructuredErrorFunc) xml_errorHandler);
 
 	if (xtCxt->xpathobj == NULL)
 	{
@@ -4857,7 +4859,8 @@ XmlTableGetValue(TableFuncScanState *state, int colnum,
 		   xtCxt->xpathobj->nodesetval != NULL);
 
 	/* Propagate our own error context to libxml2 */
-	xmlSetStructuredErrorFunc((void *) xtCxt->xmlerrcxt, xml_errorHandler);
+	xmlSetStructuredErrorFunc((void *) xtCxt->xmlerrcxt,
+			  (xmlStructuredErrorFunc) xml_errorHandler);
 
 	*isnull = false;
 
@@ -5000,7 +5003,8 @@ XmlTableDestroyOpaque(TableFuncScanState *state)
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableDestroyOpaque");
 
 	/* Propagate our own error context to libxml2 */
-	xmlSetStructuredErrorFunc((void *) xtCxt->xmlerrcxt, xml_errorHandler);
+	xmlSetStructuredErrorFunc((void *) xtCxt->xmlerrcxt,
+			  (xmlStructuredErrorFunc) xml_errorHandler);
 
 	if (xtCxt->xpathscomp != NULL)
 	{


Re: Segmentation fault on FreeBSD with GSSAPI authentication

2024-01-27 Thread Tom Lane
Michał Kłeczek  writes:
> On 27 Jan 2024, at 15:43, Tom Lane  wrote:
>> Which FreeBSD version?  Which krb5 package version?
>> Is this x86_64, or something else?

> Right, sorry:

> 1. X86_64
> 2. FreeBSD 13.2 (It is in a TrueNAS Jail if it makes any difference)
> 3. krb5-1.21.2

Hmm.  I made a desultory attempt at replicating this, using
a fully-updated FreeBSD 13.2 image.  I can't find anything wrong,
but it's not clear to me that I duplicated your configuration.

To make sure we're on the same page: do the tests in src/test/kerberos
in our source tree pass on your system?  Read the README there for
caveats, then do

make check PG_TEST_EXTRA=kerberos

regards, tom lane




Re: cleanup patches for incremental backup

2024-01-27 Thread Nathan Bossart
On Sat, Jan 27, 2024 at 11:00:01AM +0300, Alexander Lakhin wrote:
> 24.01.2024 20:46, Robert Haas wrote:
>> This is weird. There's a little more detail in the log file,
>> regress_log_002_blocks, e.g. from the first failure you linked:
>> 
>> [11:18:20.683](96.787s) # before insert, summarized TLI 1 through 0/14E09D0
>> [11:18:21.188](0.505s) # after insert, summarized TLI 1 through 0/14E0D08
>> [11:18:21.326](0.138s) # examining summary for TLI 1 from 0/14E0D08 to 
>> 0/155BAF0
>> # 1
>> ...
>> [11:18:21.349](0.000s) #  got: 'pg_walsummary: error: could
>> not open file 
>> "/home/nm/farm/gcc64/HEAD/pgsql.build/src/bin/pg_walsummary/tmp_check/t_002_blocks_node1_data/pgdata/pg_wal/summaries/0001014E0D08155BAF0
>> # 1.summary": No such file or directory'
>> 
>> The "examining summary" line is generated based on the output of
>> pg_available_wal_summaries(). The way that works is that the server
>> calls readdir(), disassembles the filename into a TLI and two LSNs,
>> and returns the result.
> 
> I'm discouraged by "\n1" in the file name and in the
> "examining summary..." message.
> regress_log_002_blocks from the following successful test run on the same
> sungazer node contains:
> [15:21:58.924](0.106s) # examining summary for TLI 1 from 0/155BAE0 to 
> 0/155E750
> [15:21:58.925](0.001s) ok 1 - WAL summary file exists

Ah, I think this query:

SELECT tli, start_lsn, end_lsn from pg_available_wal_summaries()
WHERE tli = $summarized_tli AND end_lsn > '$summarized_lsn'

is returning more than one row in some cases.  I attached a quick sketch of
an easy way to reproduce the issue as well as one way to fix it.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
diff --git a/src/bin/pg_walsummary/t/002_blocks.pl b/src/bin/pg_walsummary/t/002_blocks.pl
index 40908da8cb..5609b1bd14 100644
--- a/src/bin/pg_walsummary/t/002_blocks.pl
+++ b/src/bin/pg_walsummary/t/002_blocks.pl
@@ -64,10 +64,16 @@ SELECT EXISTS (
 )
 EOM
 
+$node1->safe_psql('postgres', <<EOM);
+...
+EOM
+
 my $details = $node1->safe_psql('postgres', <<EOM);
 SELECT tli, start_lsn, end_lsn from pg_available_wal_summaries()
-	WHERE tli = $summarized_tli AND end_lsn > '$summarized_lsn'
+	WHERE tli = $summarized_tli AND end_lsn > '$summarized_lsn' ORDER BY end_lsn LIMIT 1
 EOM
 my ($tli, $start_lsn, $end_lsn) = split(/\|/, $details);
 note("examining summary for TLI $tli from $start_lsn to $end_lsn");


Re: Segmentation fault on FreeBSD with GSSAPI authentication

2024-01-27 Thread Michał Kłeczek



> On 27 Jan 2024, at 15:43, Tom Lane  wrote:
> 
> Michał Kłeczek  writes:
>> I am trying to use GSSAPI (Kerberos auth) on FreeBSD.
> 
>> After several attempts I was able to build PostgreSQL 16 from FreeBSD ports 
>> with GSSAPI support. The steps were:
>> 1. Uninstall Heimdal
>> 2. Install krb5 (MIT Kerberos)
>> 3. Build Pg with GSSAPI support
> 
> Which FreeBSD version?  Which krb5 package version?
> Is this x86_64, or something else?

Right, sorry:

1. X86_64
2. FreeBSD 13.2 (It is in a TrueNAS Jail if it makes any difference)
3. krb5-1.21.2

—
Michal



Re: Segmentation fault on FreeBSD with GSSAPI authentication

2024-01-27 Thread Tom Lane
Michał Kłeczek  writes:
> I am trying to use GSSAPI (Kerberos auth) on FreeBSD.

> After several attempts I was able to build PostgreSQL 16 from FreeBSD ports 
> with GSSAPI support. The steps were:
> 1. Uninstall Heimdal
> 2. Install krb5 (MIT Kerberos)
> 3. Build Pg with GSSAPI support

Which FreeBSD version?  Which krb5 package version?
Is this x86_64, or something else?

regards, tom lane




Segmentation fault on FreeBSD with GSSAPI authentication

2024-01-27 Thread Michał Kłeczek
Hi All,

I am trying to use GSSAPI (Kerberos auth) on FreeBSD.

After several attempts I was able to build PostgreSQL 16 from FreeBSD ports 
with GSSAPI support. The steps were:
1. Uninstall Heimdal
2. Install krb5 (MIT Kerberos)
3. Build Pg with GSSAPI support

I configured pg_hba.conf to require GSSAPI authentication for one of the users.

When trying to connect using “psql -h postgres” (postgres is the name of the 
host) I get:

psql: error: connection to server at “postgres.[domainname]” ([ipaddress]), 
port 5432 failed: connection to server at “postgres.[domainname]” 
([ipaddress]), port 5432 failed: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

Configured Pg to debug log and here are the relevant syslog entries:


Jan 27 13:33:45 postgres postgres[61605]: [36-1] 2024-01-27 13:33:45.779 CET 
[61605] DEBUG:  forked new backend, pid=61626 socket=10
Jan 27 13:33:45 postgres postgres[61605]: [37-1] 2024-01-27 13:33:45.786 CET 
[61605] DEBUG:  forked new backend, pid=61627 socket=10
Jan 27 13:33:45 postgres postgres[61605]: [38-1] 2024-01-27 13:33:45.786 CET 
[61605] DEBUG:  reaping dead processes
Jan 27 13:33:45 postgres postgres[61605]: [39-1] 2024-01-27 13:33:45.787 CET 
[61605] DEBUG:  server process (PID 61626) was terminated by signal 11: 
Segmentation fault
Jan 27 13:33:45 postgres postgres[61605]: [40-1] 2024-01-27 13:33:45.787 CET 
[61605] LOG:  server process (PID 61626) was terminated by signal 11: 
Segmentation fault
Jan 27 13:33:45 postgres postgres[61605]: [41-1] 2024-01-27 13:33:45.787 CET 
[61605] LOG:  terminating any other active server processes
Jan 27 13:33:45 postgres postgres[61605]: [42-1] 2024-01-27 13:33:45.787 CET 
[61605] DEBUG:  sending SIGQUIT to process 61611
Jan 27 13:33:45 postgres postgres[61605]: [43-1] 2024-01-27 13:33:45.787 CET 
[61605] DEBUG:  sending SIGQUIT to process 61627
Jan 27 13:33:45 postgres postgres[61605]: [44-1] 2024-01-27 13:33:45.787 CET 
[61605] DEBUG:  sending SIGQUIT to process 61607
Jan 27 13:33:45 postgres postgres[61605]: [45-1] 2024-01-27 13:33:45.787 CET 
[61605] DEBUG:  sending SIGQUIT to process 61606
Jan 27 13:33:45 postgres postgres[61605]: [46-1] 2024-01-27 13:33:45.787 CET 
[61605] DEBUG:  sending SIGQUIT to process 61609
Jan 27 13:33:45 postgres postgres[61605]: [47-1] 2024-01-27 13:33:45.787 CET 
[61605] DEBUG:  sending SIGQUIT to process 61610
Jan 27 13:33:45 postgres postgres[61605]: [48-1] 2024-01-27 13:33:45.789 CET 
[61605] DEBUG:  reaping dead processes
Jan 27 13:33:45 postgres postgres[61605]: [49-1] 2024-01-27 13:33:45.789 CET 
[61605] DEBUG:  server process (PID 61627) exited with exit code 2



I am not familiar enough with FreeBSD to be able to gather any crash dumps or 
backtraces, I am afraid.

--
Michal



Re: proposal: psql: show current user in prompt

2024-01-27 Thread Jelte Fennema-Nio
On Sat, 27 Jan 2024 at 08:35, Pavel Stehule  wrote:
> Until now, it has not been possible to disable reporting, so clients can
> expect that reporting works.

Sure, so if we leave the default as is that's fine. It's not as if
this GUC would be changed without the client knowing, the client would
be the one changing the GUC and thus disabling the sending of
reporting for the default GUCs. If it doesn't want to disable the
reporting, then it simply should not send such a request.

> Do you have a use case where disabling some of the default-reported GUCs
> makes sense?

Mostly if the client doesn't actually use them, e.g. I expect many
clients don't care what the current application_name is. But I do
agree this is not a very strong use case, so I'd be okay with always
sending the ones that we sent as default for now. But that does make
the implementation more difficult, since we'd have to special case
these GUCs instead of having the same consistent behaviour for all
GUCs.

> yes, inside a gradual connect you can enhance the list of custom reported
> GUCs easily.
>
> but for use cases like the prompt in psql, I need to enable and disable
> reporting - and this use case should be independent of "protected"
> connection-related GUC reporting.
>
> For example - when I disable %N, I can disable reporting of "role" and
> disable showing the role in the prompt. But when "role" is necessary for
> pgbouncer, then surely the disabling of reporting should be ignored. The
> user should not be able to break communication by setting a prompt.  And it
> can be ignored without any issue - there is no performance cost, because
> "role" is still needed by the pgbouncer used for the connection. Without the
> described behaviour we should not implement control over reporting in psql,
> because there could be a lot of unhappy side effects depending on whether
> the user sets or unsets a custom prompt or some other future feature.

Maybe I'm misunderstanding what you're saying, but it's not clear to
me why you are seeing two different use cases here. To me if we have
the ParameterSet message then they are both the same. When you enable
%N you would send a ParameterSet message for _pq_.report_parameters
and set it to a list of GUCs including "role". And when you disable %N
you would set _pq_.report_parameters to a list of GUCs without "role".
Then if any proxy still needed "role" even if the list it receives in
_pq_.report_parameters doesn't contain it, then this proxy would
simply prepend "role" to the list of GUCs before forwarding the
ParameterSet message.
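
As a toy illustration of that last step (ensure_role_reported() is made up
for this sketch and not part of any existing proxy), the proxy-side fixup
could be as small as:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Before forwarding a ParameterSet for _pq_.report_parameters, make sure
 * "role" stays in the comma-separated list even if the client removed it.
 */
static char *
ensure_role_reported(const char *guc_list)
{
	size_t		len;
	char	   *result;

	/* crude check; a real proxy would split and compare list elements */
	if (strstr(guc_list, "role") != NULL)
		return strdup(guc_list);

	len = strlen("role,") + strlen(guc_list) + 1;
	result = malloc(len);
	if (result != NULL)
		snprintf(result, len, "role,%s", guc_list);
	return result;
}

int
main(void)
{
	char	   *fixed = ensure_role_reported("application_name,client_encoding");

	if (fixed != NULL)
		printf("%s\n", fixed);	/* role,application_name,client_encoding */
	free(fixed);
	return 0;
}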

Also to be clear, having a "protected" connection-start-only GUC is
problematic for proxies, because they need to update this setting
while the connection is active when they hand off one server connection
to another client.




Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-01-27 Thread Bharath Rupireddy
On Sat, Jan 27, 2024 at 1:04 AM Jeff Davis  wrote:
>
> All of these things are functionally equivalent -- the same thing is
> happening at the end. This is just a discussion about API style and how
> that will interact with hypothetical callers that don't exist today.
> And it can also be easily changed later, so we aren't stuck with
> whatever decision happens here.

I'll leave that up to you. I'm okay either way - 1) ensure the caller
doesn't use XLogReadFromBuffers, or 2) have XLogReadFromBuffers return
as if nothing was read when in recovery or on a different timeline.

> > Imagine, implementing an extension (may be for fun or learning or
> > educational or production purposes) to read unflushed WAL directly
> > from WAL buffers using XLogReadFromBuffers as page_read callback with
> > xlogreader facility.
>
> That makes sense, I didn't realize you intended to use this from an
> extension. I'm fine considering that as a separate patch that could
> potentially be committed soon after this one.

Yes, I've turned that into 0002 patch.

> I'd like some more details, but can I please just commit the basic
> functionality now-ish?

+1.

> > Tried to keep wal_writer quiet with wal_writer_delay=1ms and
> > wal_writer_flush_after = 1GB so as not to flush WAL in the background.
> > Also, disabled autovacuum, and set checkpoint_timeout to a higher
> > value. All of this is done to generate minimal WAL so that WAL
> > buffers
> > don't get overwritten. Do you see any problems with it?
>
> Maybe check it against pg_current_wal_lsn(), and see if the Write
> pointer moved ahead? Perhaps even have a (limited) loop that tries
> again to catch it at the right time?

Adding a loop seems reasonable here, and it's done in v21-0003. Also,
I've added wal_level = minimal per
src/test/recovery/t/039_end_of_wal.pl introduced by commit bae868caf22
which also tries to keep WAL activity to minimum.

> > Can the WAL summarizer ever read the WAL on current TLI? I'm not so
> > sure about it, I haven't explored it in detail.
>
> Let's just not call XLogReadFromBuffers from there.

Removed.

PSA v21 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v21-0001-Add-XLogReadFromBuffers.patch
Description: Binary data


v21-0002-Allow-XLogReadFromBuffers-to-wait-for-in-progres.patch
Description: Binary data


v21-0003-Add-test-module-for-verifying-WAL-read-from-WAL-.patch
Description: Binary data


v21-0004-Use-XLogReadFromBuffers-in-more-places.patch
Description: Binary data


Re: cleanup patches for incremental backup

2024-01-27 Thread Alexander Lakhin

24.01.2024 20:46, Robert Haas wrote:

This is weird. There's a little more detail in the log file,
regress_log_002_blocks, e.g. from the first failure you linked:

[11:18:20.683](96.787s) # before insert, summarized TLI 1 through 0/14E09D0
[11:18:21.188](0.505s) # after insert, summarized TLI 1 through 0/14E0D08
[11:18:21.326](0.138s) # examining summary for TLI 1 from 0/14E0D08 to 0/155BAF0
# 1
...
[11:18:21.349](0.000s) #  got: 'pg_walsummary: error: could
not open file 
"/home/nm/farm/gcc64/HEAD/pgsql.build/src/bin/pg_walsummary/tmp_check/t_002_blocks_node1_data/pgdata/pg_wal/summaries/0001014E0D08155BAF0
# 1.summary": No such file or directory'

The "examining summary" line is generated based on the output of
pg_available_wal_summaries(). The way that works is that the server
calls readdir(), disassembles the filename into a TLI and two LSNs,
and returns the result.


I'm discouraged by "\n1" in the file name and in the
"examining summary..." message.
regress_log_002_blocks from the following successful test run on the same
sungazer node contains:
[15:21:58.924](0.106s) # examining summary for TLI 1 from 0/155BAE0 to 0/155E750
[15:21:58.925](0.001s) ok 1 - WAL summary file exists

Best regards,
Alexander