Re: --forkbackend process

2024-06-27 Thread Michael Paquier
On Thu, Jun 27, 2024 at 09:19:45AM +0500, Kashif Zeeshan wrote:
> It's hard to figure out the issue by just looking at the process list; to
> figure it out you need to share the DB server logs, and that's the way to
> find the exact issue.

Note that it is equally hard for anybody reading your two emails to
provide any hints about what could be going on, because you are
providing zero details about what's happening in the server, and
nobody would be able to help you with only a screenshot of running
processes.  There should be at least some logs from which you could
begin some sort of investigation, to know what is happening on your
system and figure out whether something is even going wrong.

Thanks,
--
Michael


signature.asc
Description: PGP signature


Re: ERROR: could not attach to dynamic shared area

2024-06-26 Thread Michael Paquier
On Wed, Jun 26, 2024 at 02:43:30PM +, Andrew Longwill wrote:
> We’re using Postgresql 15.4 in AWS RDS. Since yesterday we have seen
> two occurrences where our PHP application becomes unable to connect
> to our RDS replicas. In the application logs we see the error
> "FATAL: could not attach to dynamic shared area". The RDS replica
> remains unusable in this state unless it’s rebooted.

I suspect that you have been hit with this issue:
https://www.postgresql.org/message-id/CAO6_XqqJbJBL=m7ym13tcb4xnq58vra2jcc+gwepbgbada6...@mail.gmail.com

A fix has been committed for it today by me, down to v15.  It will
show up in the next round of minor releases.
--
Michael




Re: recovery.signal not being removed when recovery complete

2024-04-03 Thread Michael Paquier
On Tue, Mar 26, 2024 at 06:22:32PM -0400, Isaac Morland wrote:
> I use a script to restore a backup to create a testing copy of the
> database. I set the following in postgresql.auto.conf:
> 
> recovery_target = 'immediate'
> recovery_target_action = 'promote'

Why not, after a pg_basebackup -R I assume.  Would you mind sharing
your commands?

> In the logs I get "recovery stopping after reaching consistency" then a
> moment later "database system is ready to accept read-only connections",
> then some entries about restoring log files, then "database system is ready
> to accept connections".

If you have some logs, that could help as well.

> I am able to make changes (e.g. CREATE TABLE), yet recovery.signal is still
> present. My understanding is that recovery.signal should be removed when
> recovery is finished (i.e., more or less when "database system is ready to
> accept connections" is logged?), unless recovery_target_action is set to
> 'shutdown'.
> 
> Any ideas? Even just confirming/denying I understand the above correctly
> would help.

Not removing the two .signal files when promotion is achieved would be
a problem, because we would re-enter recovery at a follow-up startup.
ArchiveRecoveryRequested should be set if either recovery.signal or
standby.signal was found at startup, meaning that we should see a TLI
jump at promotion, with a physical removal of both files and a LOG
line about a "selected new timeline ID".
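
As a quick illustration of the startup check described above, here is
a toy shell model of the ArchiveRecoveryRequested decision (the
function name and scratch directory are illustrative, not the server's
actual code):

```shell
# Toy model of the startup-time check: archive recovery is requested
# when either recovery.signal or standby.signal exists in the data
# directory; after promotion both files should be gone.
archive_recovery_requested() {
    pgdata="$1"
    if [ -f "$pgdata/recovery.signal" ] || [ -f "$pgdata/standby.signal" ]; then
        echo "yes"
    else
        echo "no"
    fi
}

# Demo against a scratch directory standing in for PGDATA.
demo=$(mktemp -d)
touch "$demo/recovery.signal"
archive_recovery_requested "$demo"    # prints: yes
rm "$demo/recovery.signal"
archive_recovery_requested "$demo"    # prints: no
rm -rf "$demo"
```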
--
Michael




Re: support fix query_id for temp table

2024-02-01 Thread Michael Paquier
On Thu, Feb 01, 2024 at 07:37:32AM +, ma lz wrote:
> session 1:
> create temp table ttt ( a int );
> insert into ttt values(3); -- query_id is XXX from pg_stat_activity
> 
> session 2:
> create temp table ttt ( a int );
> insert into ttt values(3); -- query_id is YYY from pg_stat_activity
> 
> I know temp table has different oid, so query_id is different, is
> there a way to use table name for temp table  instead of oid?

The CREATE TABLE statements do indeed have the same query ID (in 16~),
and the inserts have different ones as they touch different schemas
and relations.  That's quite an old problem, one that depends on the
RangeVar attached to an InsertStmt.  I don't quite see a way to
directly handle that except by using a custom implementation in query
jumbling for this node and its RangeVar, so there is no "easy" way to
tackle that :/
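
The effect can be illustrated with a toy model of query jumbling.
This is purely hypothetical code, not PostgreSQL's actual algorithm,
and the OID values are made up; it only shows why hashing the resolved
OID makes the ID session-dependent for temp tables while hashing the
name would not:

```python
import hashlib

def jumble(*features):
    """Toy query-ID: hash a tuple of query features.
    Purely illustrative -- not PostgreSQL's actual jumbling code."""
    h = hashlib.sha256()
    for f in features:
        h.update(repr(f).encode())
    return h.hexdigest()[:16]

# Each session's temp table "ttt" resolves to a different OID (values
# here are invented), so jumbling by OID yields different query IDs...
assert jumble("InsertStmt", 16384) != jumble("InsertStmt", 16407)

# ...while jumbling by relation name would make them match.
assert jumble("InsertStmt", "ttt") == jumble("InsertStmt", "ttt")
print("ok")
```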
--
Michael




Re: Logical replication breaks: "unexpected duplicate for tablespace 0, relfilenode 2774069304"

2024-01-02 Thread Michael Paquier
On Thu, Dec 28, 2023 at 02:03:12PM +0200, Kouber Saparev wrote:
>> The first problem that we have here is that we've lost track of the
>> patch proposed, so I have added a CF entry for now:
>> https://commitfest.postgresql.org/46/4720/
> 
> Thank you. Is there a bug report or should we file one? It looks like
> something that compromises the reliability of the logical replication as a
> whole.

Having a CF entry means that it is already tracked, so no need to do
more here for the moment.  The next step would be to look at the
proposed patch in more details, and work on fixing the issue.
--
Michael




Re: Logical replication breaks: "unexpected duplicate for tablespace 0, relfilenode 2774069304"

2023-12-23 Thread Michael Paquier
On Fri, Dec 22, 2023 at 10:55:24AM +0200, Kouber Saparev wrote:
> The table for this file node is not even included in any of the
> publications we have. I've found a similar issue described [1] before, so I
> was wondering whether this patch is applied? Our subscriber database is
> PostgreSQL 16.1 and the publisher - PostgreSQL 15.4.

Or just this link using the community archives based on the
message-ID:
https://www.postgresql.org/message-id/tycpr01mb83731ade7fd7c7cf5d335bceed...@tycpr01mb8373.jpnprd01.prod.outlook.com

> What quick solution would fix the replication? Repack of the table? Reload
> of the database? Killing some backends?

There may be something you could do as a short-term solution, but it
does not solve the actual root of the problem, because the error you
are seeing is not something users should be able to face.

The first problem that we have here is that we've lost track of the
patch proposed, so I have added a CF entry for now:
https://commitfest.postgresql.org/46/4720/
--
Michael




Re: pg_checksums?

2023-10-29 Thread Michael Paquier
On Sun, Oct 29, 2023 at 11:49:11AM +0100, Peter J. Holzer wrote:
> On 2023-10-29 10:11:07 +0100, Paul Förster wrote:
>> On Oct 29, 2023, at 02:43, Peter J. Holzer  wrote:
>>> I don't think so. AFAIK Replication keeps the data files in sync on a
>>> bit-for-bit level and turning on checksums changes the data layout.
>>> Running a cluster where one node has checksums and the other doesn't
>>> would result in a complete mess.
>> 
>> I agree with the last sentence. This is why I asked if it is safe to
>> enable checksums on a replica, switch over and then do it again on the
>> ex primary, i.e. now new replica without doing a reinit.
> 
> It *might* work if there are zero writes on the primary during the
> downtime of the replica (because those writes couldn't be replicated),
> but that seems hard to ensure. Even if you could get away with making
> the primary read-only (is this even possible?) I wouldn't have much
> confidence in the result and reinit the (new) replica anyway.

Hm?  Page checksums are written when a page is flushed to disk, we
don't set them for dirty buffers or full-page writes included in WAL,
so it should be OK to do something like the following:
- Stop cleanly a standby.
- Run pg_checksums on the standby to enable them.
- Restart the standby.
- Catch up with the latest changes.
- Stop cleanly the primary, letting the shutdown checkpoint be
replicated to the standby.
- Promote the standby.
- Enable checksums on the previous primary.
- Start the previous primary to be a standby of the node you failed
over to.
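
The sequence above could be scripted roughly as follows.  The data
directory paths and the exact pg_ctl/pg_checksums invocations are
assumptions to adapt to your installation, and this sketch only prints
the intended plan rather than executing anything:

```shell
# Prints the intended switchover plan; nothing is executed here.
checksum_switchover_plan() {
    standby="$1"; primary="$2"
    cat <<EOF
pg_ctl -D $standby stop -m fast
pg_checksums --enable -D $standby
pg_ctl -D $standby start
# ...wait for the standby to catch up...
pg_ctl -D $primary stop -m fast
pg_ctl -D $standby promote
pg_checksums --enable -D $primary
# reconfigure $primary to follow the promoted node, then:
pg_ctl -D $primary start
EOF
}

checksum_switchover_plan /pg/standby /pg/primary
```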
--
Michael




Re: REINDEX in tables

2023-10-25 Thread Michael Paquier
On Wed, Oct 25, 2023 at 11:33:11AM +0200, Andreas Kretschmer wrote:
> Am 25.10.23 um 11:24 schrieb Matthias Apitz:
>> We have a client who run REINDEX in certain tables of the database of
>> our application (on Linux with PostgreSQL 13.x):
>> 
>> REINDEX TABLE CONCURRENTLY d83last;
>> REINDEX TABLE CONCURRENTLY d86plz;
>> REINDEX TABLE CONCURRENTLY ig_memtable;
>> REINDEX TABLE CONCURRENTLY ig_dictionary;
>> REINDEX TABLE CONCURRENTLY ig_dictionary;
>> REINDEX TABLE CONCURRENTLY d50zweig ;
>> REINDEX TABLE CONCURRENTLY d50zweig ;
>> 
>> We as the software vendor and support, do not use or recommend this
>> procedure, because we have own SQL files for creating or deleting
>> indices in the around 400 tables.
>> 
>> The client is now concerned about the issue that the number of
>> rows in some of the above tables has increased. Is this possible?
> 
> In principle, there is nothing wrong with doing this in a maintenance
> window, for example.

If you have a maintenance window where your production server is not
going to be active, you may want to just do a more aggressive REINDEX
without CONCURRENTLY, as that's going to be cheaper and faster.
CONCURRENTLY is useful outside maintenance windows because it allows
concurrent reads and writes on the index while the operation runs.
It is much slower, of course, as it needs to wait twice for older
snapshots held by concurrent sessions: once when the index build is
finished and once when the built index gets validated.
--
Michael




Re: Change error code severity for syslog?

2023-10-11 Thread Michael Paquier
On Thu, Oct 12, 2023 at 09:42:47AM +0900, Abhishek Bhola wrote:
> For most of the Postgres errors, translating it to WARNING level in syslog
> works, however, I wanted to translate *ERROR codes XX000/1/2* in Postgres
> to be ERROR in syslog as well, so that it triggers the alert system and I
> can be notified.
> 
> Is there any way to change the severity level for these specific error
> codes in Postgres?

You could, hypothetically, use the elog hook to dynamically switch the
error data reported to something else depending on the types of
filters you would like to apply.  I have an example of this kind of
idea in one of my plugin modules here:
https://github.com/michaelpq/pg_plugins/tree/main/jsonlog
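
As a toy illustration of the kind of filtering such a hook could
apply (this is not the jsonlog module's actual code, just a sketch of
the mapping), forwarding internal-error SQLSTATEs at a higher syslog
severity might look like:

```python
# Toy model of the filtering such a hook could perform: internal
# errors (SQLSTATE class XX) are forwarded as syslog "err", while
# everything else is demoted to "warning".  Illustration only; the
# real hook manipulates ErrorData inside the server.
INTERNAL_ERRORS = {"XX000", "XX001", "XX002"}

def syslog_severity(sqlstate):
    return "err" if sqlstate in INTERNAL_ERRORS else "warning"

assert syslog_severity("XX000") == "err"      # internal_error
assert syslog_severity("XX001") == "err"      # data_corrupted
assert syslog_severity("23505") == "warning"  # unique_violation, demoted
```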
--
Michael




Re: How to force "re-TOAST" after changing STORAGE or COMPRESSION?

2023-10-03 Thread Michael Paquier
On Tue, Oct 03, 2023 at 09:08:49AM +0200, Dominique Devienne wrote:
> In my case, it's OK not to be transactional, for these experiments.
> Is there a way to lock the table and do the rewriting w/o generating
> any WAL? I don't have any experience with unlogged tables, but should
> I take an exclusive lock on the table, switch it to unlogged, rewrite,
> and switch it back to logged?

Switching a table back to logged requires all of its 8kB blocks to be
WAL-logged, so that would cost roughly the same as a plain UPDATE.
--
Michael




Re: How to force "re-TOAST" after changing STORAGE or COMPRESSION?

2023-10-02 Thread Michael Paquier
On Tue, Oct 03, 2023 at 06:31:27AM +0200, Laurenz Albe wrote:
> On Tue, 2023-10-03 at 12:33 +1100, rob stone wrote:
>> Would running CLUSTER on the table use the new parameters for the re-
>> write?
> 
> No, as far as I know.

Note that under the hood, VACUUM FULL and CLUSTER use the same code
paths when doing their stuff.

> You'd need something like
> 
>   -- rewrite all tuples
>   UPDATE tab SET id = id;
>   -- get rid of the bloat
>   VACUUM (FULL) tab;

I'm afraid so, and get ready for a burst of WAL whose size depends on
the size of your relation if you are too aggressive with the updates.
You could also do that in periodic steps.
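
A sketch of the "periodic steps" idea: generate the rewrite as a
series of key-range UPDATEs instead of one big statement, so each
transaction produces a smaller WAL burst.  The table and column names
here are placeholders:

```python
def batched_rewrite_statements(table, key, max_id, batch_size):
    """Yield UPDATE statements that rewrite rows one key range at a
    time, spreading the WAL burst over several transactions.  Table
    and column names are placeholders."""
    for lo in range(1, max_id + 1, batch_size):
        hi = min(lo + batch_size - 1, max_id)
        yield (f"UPDATE {table} SET {key} = {key} "
               f"WHERE {key} BETWEEN {lo} AND {hi};")

stmts = list(batched_rewrite_statements("tab", "id", 10, 4))
assert stmts[0] == "UPDATE tab SET id = id WHERE id BETWEEN 1 AND 4;"
assert stmts[-1] == "UPDATE tab SET id = id WHERE id BETWEEN 9 AND 10;"
assert len(stmts) == 3
```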
--
Michael




Re: How to force "re-TOAST" after changing STORAGE or COMPRESSION?

2023-10-02 Thread Michael Paquier
On Mon, Oct 02, 2023 at 04:42:15PM +0200, Dominique Devienne wrote:
> According to the doc, the table is NOT changed.
> In my case, I DO want to have the bytea column rewritten
> according to the new STORAGE and/or COMPRESSION.
> The doc doesn't mention how to achieve that.

Yes, the compression type of existing toasted values is not changed
after an ALTER SET, and that's not something one would do on a daily
basis, either.

> VACUUM?
> VACUUM FULL?
> Else?

VACUUM was considered as an option to force a rewrite, but it was
removed because we were not sure this is the correct path, or whether
forcing the user's hand in some cases would be incorrect.  It was also
creating a penalty in some of the hot loops of this area:
commit: dbab0c07e5ba1f19a991da2d72972a8fe9a41bda
committer: Michael Paquier 
date: Mon, 14 Jun 2021 09:25:50 +0900
Remove forced toast recompression in VACUUM FULL/CLUSTER

Related thread:
https://postgr.es/m/20210527003144.xxqppojoiwurc...@alap3.anarazel.de
--
Michael




Re: backup_manifest rename to backup_manifest.old after successful postgres start up

2023-07-28 Thread Michael Paquier
On Tue, Jul 18, 2023 at 01:53:23AM +0200, Gert Cuykens wrote:
> Hi, suggest to automatically rename backup_manifest to backup_manifest.old
> like backup_label when postgres start up successfully because it has no use
> anymore for pg_verifybackup after postgres has been started.

Yes, I would agree with you that it is not especially useful to keep
it around once the cluster has been recovered from a base backup.  It
would actually lead to various errors if one attempted to run
pg_verifybackup on the data folder, for instance.
--
Michael




Re: Query on Primary_conninfo

2023-07-26 Thread Michael Paquier
On Wed, Jul 26, 2023 at 05:33:50PM +0530, Praneel Devisetty wrote:
> Standy is not picking this change even after a reload.
> 
> Restarting standby is working out, but as PG13 supports reload of
> primary_connfino is there a reliable way to achieve this without restarting
> standby database.

primary_conninfo should be able to dynamically switch the upstream
server it streams WAL from on a reload.  I've quickly checked with a
simple configuration and I was able to get it done without a restart
here.  Perhaps you have found a bug, but it's hard to say without
having more details about what's happening.  For example, what do the
logs of the standby tell you?  Are you sure that the reload was done
on the correct node?  Did you check with a SHOW command that the new
value on your standby reflects what you want it to be?
--
Michael




Re: fsync data directory after DB crash

2023-07-18 Thread Michael Paquier
On Tue, Jul 18, 2023 at 04:50:25PM +0800, Pandora wrote:
> I found that starting from version 9.5, PostgreSQL will do fsync on
> the entire data directory after DB crash. Here's a question: if I
> have FPW = on, why is this step still necessary?

Yes, see around the call of SyncDataDirectory() in xlog.c:
 * - There might be data which we had written, intending to fsync it, but
 *   which we had not actually fsync'd yet.  Therefore, a power failure in
 *   the near future might cause earlier unflushed writes to be lost, even
 *   though more recent data written to disk from here on would be
 *   persisted.  To avoid that, fsync the entire data directory.
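
A simplified model of what SyncDataDirectory() does, for illustration
only (the real code also fsyncs directories, follows tablespace
symlinks, and tolerates errors):

```python
import os
import tempfile

def sync_data_directory(path):
    """Simplified model of SyncDataDirectory(): open and fsync every
    file under the directory so earlier unflushed writes survive a
    later crash.  Returns the number of files synced."""
    synced = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fd = os.open(os.path.join(root, name), os.O_RDONLY)
            try:
                os.fsync(fd)
                synced += 1
            finally:
                os.close(fd)
    return synced

# Demo on a scratch directory standing in for PGDATA.
pgdata = tempfile.mkdtemp()
with open(os.path.join(pgdata, "000000010000000000000001"), "wb") as f:
    f.write(b"\x00" * 8192)
print(sync_data_directory(pgdata))  # 1
```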
--
Michael




Re: ECPG Semantic Analysis

2023-06-22 Thread Michael Paquier
On Fri, Jun 23, 2023 at 12:21:48AM -0400, Juan Rodrigo Alejandro Burgos Mella wrote:
> I have a modified version of ECPG, to which I gave the ability to do
> semantic analysis of SQL statements. Where can you share it or with whom
> can I discuss it?

I cannot say what kind of problem this solves and/or whether it is
useful as a feature of the ECPG driver in PostgreSQL core itself, but
you could consider submitting a patch for integration into core.  See
for example here:
https://wiki.postgresql.org/wiki/Submitting_a_Patch

Note that things are pretty old-school around here: development
discussions are done by email, mostly on the pgsql-hackers mailing
list.
--
Michael




Re: FIPS-related Error: Password Must Be at Least 112 Bits on Postgres 14, Unlike in Postgres 11

2023-06-22 Thread Michael Paquier
On Thu, Jun 22, 2023 at 07:16:21PM +0530, Abhishek Dasgupta wrote:
> I am puzzled as to why this error occurs only with PostgreSQL 14 and not
> with PostgreSQL 11.

This error is specific to the Postgres JDBC driver, which relies on
its own application layer for FIPS and SCRAM because it speaks the
protocol directly and has no dependency on libpq.  Are there any
specific failures you are seeing in the PostgreSQL backend that you
find confusing?
--
Michael




Re: "PANIC: could not open critical system index 2662" - twice

2023-05-08 Thread Michael Paquier
On Mon, May 08, 2023 at 06:04:23PM -0400, Tom Lane wrote:
> Andres seems to think it's a problem with aborting a DROP DATABASE.
> Adding more data might serve to make the window wider, perhaps.

And the odds get indeed much better once I use these two toys:
CREATE OR REPLACE FUNCTION create_tables(num_tables int)
  RETURNS VOID AS
  $func$
  BEGIN
  FOR i IN 1..num_tables LOOP
EXECUTE format('
  CREATE TABLE IF NOT EXISTS %I (id int)', 't_' || i);
  END LOOP;
END
$func$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION drop_tables(num_tables int)
  RETURNS VOID AS
  $func$
  BEGIN
  FOR i IN 1..num_tables LOOP
EXECUTE format('
  DROP TABLE IF EXISTS %I', 't_' || i);
  END LOOP;
END
$func$ LANGUAGE plpgsql;

And then use this loop with the others I have mentioned upthread (just
create origindb and the functions in them):
while true;
  do psql -c 'select create_tables(1000)' origindb > /dev/null 2>&1 ;
  psql testdb -c "select 1" > /dev/null 2>&1 ;
  psql -c 'select drop_tables(1000)' origindb > /dev/null 2>&1 ;
  psql testdb -c "select 1" > /dev/null 2>&1 ;
done;

On top of that, I have also been able to reproduce the issue on HEAD,
and luckily one pg_class file remains around, full of zeros:
$ hexdump ./base/199634/1259
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*

The contents of 2662, though, looked OK.

Echoing Alvaro..  Could we, err, revisit the choice of making WAL_LOG
the default strategy even for this set of minor releases?  FWIW, I've
mentioned that this choice was too aggressive in the thread of
8a86618..
--
Michael




Re: "PANIC: could not open critical system index 2662" - twice

2023-05-08 Thread Michael Paquier
On Mon, May 08, 2023 at 07:15:20PM +0530, Dilip Kumar wrote:
> I am able to reproduce this using the steps given above, I am also
> trying to analyze this further.  I will send the update once I get
> some clue.

Have you been able to reproduce this on HEAD or at the top of
REL_15_STABLE, or is that 15.2-ish without fa5dd46?

One thing I was wondering about, to improve the odds of a hit, is to
be more aggressive with the number of relations created at once, so
that many more pages get extended in pg_class in the origin database.
--
Michael




Re: "PANIC: could not open critical system index 2662" - twice

2023-05-08 Thread Michael Paquier
On Mon, May 08, 2023 at 02:46:37PM +1200, Thomas Munro wrote:
> That sounds like good news, but I'm still confused: do you see all 0s
> in the target database (popo)'s catalogs, as reported (and if so can
> you explain how they got there?), or is it regression that is
> corrupted in more subtle ways also involving 1s?

Nope, I have not been able to confirm that yet without 8a86618.  The
test runs at a high frequency, so that's kind of hard to catch.  I
have not been able to get things in a state where I could look at a
FPW for pg_class or its index, either, in a way similar to Evgeny.
--
Michael




Re: "PANIC: could not open critical system index 2662" - twice

2023-05-07 Thread Michael Paquier
On Sun, May 07, 2023 at 10:30:52PM +1200, Thomas Munro wrote:
> Bug-in-PostgreSQL explanations could include that we forgot it was
> dirty, or some backend wrote it out to the wrong file; but if we were
> forgetting something like permanent or dirty, would there be a more
> systematic failure?  Oh, it could require special rare timing if it is
> similar to 8a8661828's confusion about permanence level or otherwise
> somehow not setting BM_PERMANENT, but in the target blocks, so I think
> that'd require a checkpoint AND a crash.  It doesn't reproduce for me,
> but perhaps more unlucky ingredients are needed.
> 
> Bug-in-OS/FS explanations could include that a whole lot of writes
> were mysteriously lost in some time window, so all those files still
> contain the zeroes we write first in smgrextend().  I guess this
> previously rare (previously limited to hash indexes?) use of sparse
> file hole-punching could be a factor in an it's-all-ZFS's-fault
> explanation:

Yes, you would need a bit of all that.

I can reproduce the same backtrace here.  That's just my usual laptop
with ext4, so this would be a Postgres bug.  First, here are the four
things running in parallel so that I can get a failure in loading a
critical index when connecting:
1) Create and drop a database with WAL_LOG as strategy and the
regression database as template:
while true; do
  createdb --template=regression --strategy=wal_log testdb;
  dropdb testdb;
done
2) Feeding more data to pg_class in the middle, while testing the
connection to the database created:
while true;
  do psql -c 'create table popo as select 1 as a;' regression > /dev/null 2>&1 ;
  psql testdb -c "select 1" > /dev/null 2>&1 ;
  psql -c 'drop table popo' regression > /dev/null 2>&1 ;
  psql testdb -c "select 1" > /dev/null 2>&1 ;
done;
3) Force some checkpoints:
while true; do psql -c 'checkpoint' > /dev/null 2>&1; sleep 4; done
4) Force a few crashes and recoveries:
while true ; do pg_ctl stop -m immediate ; pg_ctl start ; sleep 4 ; done

And here is one backtrace:
(gdb) bt
#0  __pthread_kill_implementation (threadid=&lt;optimized out&gt;, 
signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x7f8a395cad2f in __pthread_kill_internal (signo=6, threadid=&lt;optimized out&gt;) at ./nptl/pthread_kill.c:78
#2  0x7f8a3957bef2 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/posix/raise.c:26
#3  0x7f8a39566472 in __GI_abort () at ./stdlib/abort.c:79
#4  0x55f0b36b95ac in errfinish (filename=0x55f0b38d4f68 "relcache.c", 
lineno=4335, funcname=0x55f0b38d6500 <__func__.12> "load_critical_index") at 
elog.c:604
#5  0x55f0b36a9c64 in load_critical_index (indexoid=2662, heapoid=1259) at 
relcache.c:4335
#6  0x55f0b36a9712 in RelationCacheInitializePhase3 () at relcache.c:4110
(gdb) up 5
#5  0x55f0b36a9c64 in load_critical_index (indexoid=2662, heapoid=1259) at 
relcache.c:4335
4335elog(PANIC, "could not open critical system index %u", 
indexoid);

You can also get failures with btree deduplication because of the CTAS
running above, by the way, but that's just the tip of the iceberg once
the state is corrupted:
#0  __pthread_kill_implementation (threadid=&lt;optimized out&gt;, 
signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x7fcae38abd2f in __pthread_kill_internal (signo=6, threadid=&lt;optimized out&gt;) at ./nptl/pthread_kill.c:78
#2  0x7fcae385cef2 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/posix/raise.c:26
#3  0x7fcae3847472 in __GI_abort () at ./stdlib/abort.c:79
#4  0x5556a809f584 in ExceptionalCondition (conditionName=0x5556a812b639 
"false", fileName=0x5556a812abc8 "heapam.c", lineNumber=7882) at assert.c:66
#5  0x5556a79e13db in index_delete_sort_cmp (deltid1=0x5556a9682158, 
deltid2=0x7ffdb62c7088) at heapam.c:7882
#6  0x5556a79e14f6 in index_delete_sort (delstate=0x7ffdb62c8350) at 
heapam.c:7923
#7  0x5556a79e0cd0 in heap_index_delete_tuples (rel=0x7fcae0f1a9a8, 
delstate=0x7ffdb62c8350) at heapam.c:7580
#8  0x5556a7a07be4 in table_index_delete_tuples (rel=0x7fcae0f1a9a8, 
delstate=0x7ffdb62c8350) at ../../../../src/include/access/tableam.h:1337
#9  0x5556a7a0a7a3 in _bt_delitems_delete_check (rel=0x7fcae0f231c8, 
buf=179, heapRel=0x7fcae0f1a9a8, delstate=0x7ffdb62c8350) at nbtpage.c:1541
#10 0x5556a7a07545 in _bt_simpledel_pass (rel=0x7fcae0f231c8, buffer=179, 
heapRel=0x7fcae0f1a9a8, deletable=0x7ffdb62c8430, ndeletable=160, 
newitem=0x5556a9689260, 

Anyway, you would be able to see all that on b3e184a (pretty much the same as
15.2), without 8a86618 included.  Once 8a86618 is included, all these
steps are stable in the backend, at least here.  Or do we have some
low-hanging fruit with the WAL_LOG strategy?  That could always be
possible, of course, but that looks like the same issue to me, just
with a different symptom showing up.
--
Michael




Re: pg_basebackup / recovery

2023-04-12 Thread Michael Paquier
On Wed, Apr 12, 2023 at 01:45:56PM +0300, Achilleas Mantzios - cloud wrote:
> On 4/12/23 12:32, Fabrice Chapuis wrote:
>> During recovery process of a self contained backup, how postgres know to
>> stop reading wal when consistency is reached?
> 
> Because it knows the full backup info. It will observe the
> 
> STOP WAL LOCATION: 3BC7/4B000130 (file 00023BC7004B)
> 
> inside the backup file

There is a bit more to that in the recovery logic, depending mostly
on the presence of a backup_label file in the data folder when
recovery begins.  Once the backup_label is found at the beginning of
recovery, its information is stored in the control file and the file
is renamed to backup_label.old, hence stopping the server before
recovery has reached its expected point relies on the control file
contents later on.  Then, the startup process and its WAL redo make
sure that WAL replays until it finds the WAL record marking the end
of the backup.  Grepping for XLOG_BACKUP_END (the WAL record type in
this case) shows all the areas that rely on that, and xlogrecovery.c
covers the most relevant bits.
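
For what it's worth, a STOP WAL LOCATION line like the one quoted
above lives in the backup history file; a minimal parser for that
field could look like this (the sample content is invented and does
not refer to any real backup):

```python
import re

# Invented sample of a backup history file; the field names follow
# the real format, the LSNs and file names are made up.
sample = """\
START WAL LOCATION: 3BC7/22000028 (file 00000002000030C700000022)
STOP WAL LOCATION: 3BC7/4B000130 (file 00000002000030C70000004B)
BACKUP METHOD: streamed
"""

def stop_wal_location(text):
    """Extract the stop LSN, or None if the field is absent."""
    m = re.search(r"^STOP WAL LOCATION: ([0-9A-F]+/[0-9A-F]+)", text, re.M)
    return m.group(1) if m else None

assert stop_wal_location(sample) == "3BC7/4B000130"
assert stop_wal_location("no such field") is None
```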
--
Michael




Re: "PANIC: could not open critical system index 2662" - twice

2023-04-11 Thread Michael Paquier
On Tue, Apr 11, 2023 at 04:44:54PM +, Evgeny Morozov wrote:
> We have data_checksums=on. (It must be on by default, since I cannot
> find that in our config files anywhere.)

initdb does not enable checksums by default, requiring
-k/--data-checksums, so likely this addition comes from your
environment.

> However, the docs say "Only
> data pages are protected by checksums; internal data structures and
> temporary files are not.", so I guess pg_class_oid_index might be an
> "internal data structure"?

pg_class_oid_index is a btree index that relies on 8k on-disk pages
(default size), so it is subject to the same rules as normal relations
regarding checksums for the pages flushed to disk, even if it is on a
catalog.
--
Michael




Re: "PANIC: could not open critical system index 2662" - twice

2023-04-07 Thread Michael Paquier
On Fri, Apr 07, 2023 at 01:04:34PM +0200, Laurenz Albe wrote:
> On Thu, 2023-04-06 at 16:41 +, Evgeny Morozov wrote:
>> Could this be a PG bug?
> 
> It could be, but data corruption caused by bad hardware is much more likely.

There is no way to be completely sure here, unless we were able to
get our hands on a reproducible test case that breaks the cluster
badly enough to reach this state.  I don't recall seeing this error
pattern in recent history, though.
--
Michael




Re: Binding Postgres to port 0 for testing

2023-03-26 Thread Michael Paquier
On Sun, Mar 26, 2023 at 10:49:33PM -0600, Markus Pilman wrote:
> I somehow didn't consider looking at the postgres tests, though it makes
> sense that they need to solve this problem. If I read the perl code
> correctly though it seems that this could, in theory, cause a race? The
> script checks first whether the port has been assigned to a test, then
> binds a socket to check whether it is used by someone else, closes this
> test socker, and then starts a server process. I guess it's unlikely
> enough, but isn't there a risk that some other process (that isn't
> controlled by this perl script) binds to the found port right after this
> test bind but right before postgres calls bind? I guess it should be rare
> enough so that it wouldn't cause flaky tests.

In theory, yes, I recall that's possible in the scripts.  But only on
Windows, where we cannot use socket directories and rely on SSPI
authentication while binding all the nodes to 127.0.0.1.  On all the
other platforms, each test creates its own directory that is used as
the path for unix_socket_directories.  Its mode is 0700 and it is
owned by the OS user running the perl tests, so it is as isolated as
it can be.  Using the same port should actually be fine as far as I
recall, as long as the unix socket paths are different.
--
Michael




Re: Binding Postgres to port 0 for testing

2023-03-26 Thread Michael Paquier
On Sat, Mar 25, 2023 at 11:01:33AM -0600, Markus Pilman wrote:
> Now the problem is that I need to find a TCP port for each running postgres
> instance. There's multiple ways to do this, but by far the easiest one I
> know is to bind to port 0. So my plan was to start postgres with "-p 0" and
> then parse stdout to figure out which port it actually uses. But that
> doesn't seem to work:

Note that you can find some inspiration about that in the code tree
within src/test/perl/PostgreSQL/Test/Cluster.pm, particularly
get_free_port(), where we have now accumulated a couple of years of
experience in designing something that's rather safe, even if it comes
with its own limits.  It is in perl so perhaps you could just reuse it
rather than reinvent the wheel?  Of course, still it should not be
complicated to translate that in a different language, but there may
be no need to reinvent the wheel.  And seeing your first message with
the requirements you list, this does what you are looking for:
- Create an empty cluster.
- Freely create databases, tablespaces, queries, etc.
- Wipe out the whole.

The test cases around src/test/recovery/t/ could be a good starting
point, as well.
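
The bind-to-port-0 idea from the original question can be sketched in
a few lines; note that it lives with the same small race discussed
upthread, since the port is released before the server binds it:

```python
import socket

def find_free_port():
    """Ask the kernel for a free TCP port by binding to port 0, then
    release it.  The race remains: another process can take the port
    between the close() here and the server's own bind()."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

port = find_free_port()
assert 1024 <= port <= 65535
print(port)
```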
--
Michael




Re: Libpq linked to LibreSSL

2022-12-04 Thread Michael Paquier
On Sun, Dec 04, 2022 at 09:02:07AM +0100, Marco Bambini wrote:
> After several attempts (and runtime crashes), I am asking for help
> with how to compile libpq with LibreSSL support (both dynamic and
> static links would be OK to me).
> I know how to compile libpq with OpenSSL support, but I need to
> force it to link to LibreSSL. 

LibreSSL is rather close to OpenSSL in terms of API compatibility as
far as I recall (OpenBSD uses it).  Note that we have two buildfarm
members that I guess are doing so (adding Mikael in CC to confirm
here): morepork and plover.

The configure option --with-openssl would be the one to use, but you'd
better be sure that CFLAGS points at the path(s) of the LibreSSL
headers and that LDFLAGS points to the LibreSSL libraries if you have
an environment mixing both.  Then at run time, you could force
LD_LIBRARY_PATH to load the right thing.

There may be trickier things, like building with OpenLDAP: Postgres
built with --with-ldap could link to a version of OpenLDAP that itself
links to OpenSSL, while you'd want LibreSSL for full consistency, so
be careful.  The same could happen with Python, but without knowing
what kind of environment you are using, it is hard to reach a clear
conclusion without at least more details.
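
Putting those flags together, and assuming a purely hypothetical
LibreSSL installation under /opt/libressl (adjust paths to your own
layout), a build could look like:

```shell
# Hypothetical layout: LibreSSL installed under /opt/libressl.
# Point the compiler and linker at it at build time, and make sure
# the same libraries are resolved at run time.
./configure --with-openssl \
    CFLAGS="-I/opt/libressl/include" \
    LDFLAGS="-L/opt/libressl/lib"
make && make install
LD_LIBRARY_PATH=/opt/libressl/lib /usr/local/pgsql/bin/psql --version
```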
--
Michael




Re: Q: pg_hba.conf separate database names file format

2022-11-09 Thread Michael Paquier
On Wed, Nov 09, 2022 at 04:02:43AM -0600, Ron wrote:
> Are these "include" files supposed to solve the problem of having a *lot* of
> databases (or users) that you want to allow access to?

Yes, splitting the lists of users and databases eases the maintenance
and readability of pg_hba.conf as each HBA entry could get quite
long depending on the connection policy you may want.  My take would
be to use one entry per line in an @ file in this case.
--
Michael


signature.asc
Description: PGP signature


Re: Q: pg_hba.conf separate database names file format

2022-11-08 Thread Michael Paquier
On Tue, Nov 08, 2022 at 02:16:03PM +0100, Albrecht Dreß wrote:
> However, I could not find a specification of the format for this
> file…  It appears as if simply giving each database name on a
> separate line does the job.  Is this correct?  May the file contain
> comments (i.e. lines starting with “#”) or empty lines?  May the
> file be specified including a path (e.g. “@/some/path/databases”)?

I have been playing with this code for the last couple of days, and
the answer is that you can use an absolute path or a relative path.
In the case of a relative path, the code considers the base directory
as the parent directory of the file this is included in.  For example,
/data/pg/pg_hba.conf including a @databases.conf resolves as
/data/pg/databases.conf, and a @conf/databases.conf resolves as
/data/pg/conf/databases.conf.

The parsing of these files uses the same rules as what's done for
pg_hba.conf and pg_ident.conf, so you can specify a list of names
separated by commas or even spaces, or put one name per line.
Comments beginning with '#' are ignored.
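To make the resolution rules concrete, here is a sketch reusing the
/data/pg paths from the example below (the database names and the
users.conf file are made up):

```
# /data/pg/pg_hba.conf
# TYPE  DATABASE              USER              ADDRESS       METHOD
host    @conf/databases.conf  @conf/users.conf  10.0.0.0/24   scram-sha-256

# /data/pg/conf/databases.conf -- one name per line, or a comma/space
# separated list; '#' starts a comment
db_sales
db_hr, db_inventory
```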

If you want to play with your file and see the results, I would
recommend tweaking the files and then looking at the contents generated
in the system view pg_hba_file_rules.  Querying pg_hba_file_rules
loads directly the configuration files from disk, so there is no need
to reload or restart the server to see the effects any modifications
would have.

The documentation has already some descriptions, that you've missed,
perhaps:
https://www.postgresql.org/docs/15/auth-pg-hba-conf.html
"Files included by @ constructs are read as lists of names, which can
be separated by either whitespace or commas. Comments are introduced
by #, just as in pg_hba.conf, and nested @ constructs are
allowed. Unless the file name following @ is an absolute path, it is
taken to be relative to the directory containing the referencing
file."
--
Michael




Re: empty pg_stat_progress_vacuum

2022-11-01 Thread Michael Paquier
On Fri, Oct 21, 2022 at 10:21:23PM +, senor wrote:
> I'm mainly wanting to understand why I'm not seeing processes in
> pg_stat_progress_vacuum. If I rapidly refresh I occasionally see an
> entry for a very small table. A manually started vacuum didn't show
> up either.

It may be possible that the run is short enough that it did not get
captured, as pg_stat_progress_vacuum is a snapshot of the current
point in time.
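One way to reduce the odds of missing a short run from psql is to
re-execute the query on a tight interval, for example:

```sql
SELECT pid, relid::regclass AS relation, phase,
       heap_blks_scanned, heap_blks_total
FROM pg_stat_progress_vacuum;
\watch 0.5
```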

> Pg version 11.4

Hard to say, but I think that you should update your binaries, at
least.  11.4 was released in June 2019, and the latest release
available is 11.17, meaning that you are missing more than three years
worth of bug fixes.  Based on the roadmap in [1], 11.18 should be out
next week.

[1]: https://www.postgresql.org/developer/roadmap/
--
Michael




Re: Zheap Tech Problem

2022-10-14 Thread Michael Paquier
On Fri, Oct 14, 2022 at 07:53:17PM -0700, Adrian Klaver wrote:
> On 10/14/22 18:59, jacktby wrote:
>> What's Zheap tech? Can you give me some details or stuff to study? and
>> which version will be realized in ?

There are a few videos on youtube that can provide some insight about
all that, mainly from EDB:
https://www.youtube.com/watch?v=ZbdWOuTTWrw
https://www.youtube.com/watch?v=7jVz4TTB5-4

The community wiki has a page with few references to the code of the
project:
https://wiki.postgresql.org/wiki/Zheap
--
Michael




Re: Incorrect resource manager data checksum in record with zfs and compression

2022-10-03 Thread Michael Paquier
On Mon, Oct 03, 2022 at 12:41:23PM -0700, John Bolliger wrote:
> Our architecture is similar but all of the servers are now on ZFS now and
> Postgres 13.8 with Ubuntu 18.04+ and still doing streaming replication, all
> with ECC memory and 26-64 cores with 192gb ram+ on top of a ZPOOL made out
> of NVMe PCIe SSDs.
> 
> A101 (primary) -> A201 (replica) -> B101(primary) -> B201 (replica).

replica -> primary does not really make sense for physical
replication.  Or do you mean that B101 is itself a standby doing
streaming from A201?
--
Michael




Re: Streaming wal from primiry terminating

2022-09-29 Thread Michael Paquier
On Thu, Sep 29, 2022 at 01:04:08PM +, Lahnov, Igor wrote:
> Primary_conninfo is set in postgresql.conf
> The reason is definitely something else, we do not know what. This is fatal 
> in this case.

Which version of PostgreSQL are you using?  There has been a few
commits around continuation records lately, so it is hard to say if
you are pointing at a new issue or if this is a past one already
addressed.
--
Michael




Re: PostgreSQL 14.4 ERROR: out of memory issues

2022-07-12 Thread Michael Paquier
On Mon, Jul 11, 2022 at 10:50:23AM +0200, Aleš Zelený wrote:
> So far, it has happened three times (during a single week) from the 14.3 ->
> 14.4 upgrade, before 14.4 we haven't suffered from such an issue.
> 
> Questions:
> 1)  Can we safely downgrade from 14.4 to 14.3 by shutting down the instance
> and reinstalling 14.3 PG packages (to prove, that the issue disappear)?
> 2) What is the best way to diagnose what is the root cause?

Hmm.  14.4 has nothing in its release notes that would point to a
change in the vacuum or autovacuum's code paths:
https://www.postgresql.org/docs/14/release-14-4.html#id-1.11.6.5.4

There is nothing specific after a look at the changes, and I am
not seeing anything that would imply a change in memory context
handling either:
`git log --stat REL_14_3..REL_14_4`
`git diff REL_14_3..REL_14_4 -- *.c`

Saying that, you should be able to downgrade safely as there are no
changes in WAL format or such that would break things.  However, the
index corruption issue with CONCURRENTLY fixed in 14.4 is something
you'd still have to face.

> 2022-07-02 14:48:07 CEST [3930]: [3-1] user=,db=,host=,app= ERROR:  out of
> memory
> 2022-07-02 14:48:07 CEST [3930]: [4-1] user=,db=,host=,app= DETAIL:  Failed
> on request of size 152094068 in memory context "TopTransactionContext".
> 2022-07-02 14:48:07 CEST [3930]: [5-1] user=,db=,host=,app= CONTEXT:
>  automatic vacuum of table "prematch.replication.tab_queue_tmp"

This is the interesting part.  Do you happen to use logical
replication in a custom C++ plugin?

> 2022-07-02 14:48:47 CEST [4476]: [43-1] user=,db=,host=,app= LOG:  could
> not fork worker process: Cannot allocate memory
> terminate called after throwing an instance of 'std::bad_alloc'
>   what():  std::bad_alloc
>
> DETAIL: parameters: $1 = '1', $2 = '1748010445', $3 = '0', $4 = '1000'
> terminate
> called after throwing an instance of 'std::bad_alloc' terminate called
> after throwing an instance of 'std::bad_alloc' what(): what():
> std::bad_allocstd::bad_alloc 2022-07-08 14:54:23 CEST [4476]: [49-1]
> user=,db=,host=,app= LOG: background worker "parallel worker" (PID 25251)
> was terminated by signal 6: Aborted
> 2022-07-08 14:54:23 CEST [4476]: [51-1] user=,db=,host=,app= LOG:
>  terminating any other active server processes

Looks like something is going wrong in the memory handling of one of
your C++ extensions here.  If you can isolate an issue using a query
without any custom code, that would be a Postgres problem, but I think
that you are missing a trick in it.
--
Michael




Re: Postgresql error : PANIC: could not locate a valid checkpoint record

2022-06-26 Thread Michael Paquier
On Fri, Jun 24, 2022 at 01:03:57PM +, Mahendrakar, Prabhakar - Dell Team 
wrote:
> Is it possible to explicitly issue a checkpoint before we move on to
> the pg_upgrade command? 
> so that in the circumstances of the Upgrade issues (like PANIC:
> could not locate a valid checkpoint record), we still  have this
> last explicit checkpoint available. 
>
> Please let us know your thoughts on this.

Well, you have mentioned the use of pg_upgrade, but you are giving
zero details about what kind of command you used, how you handled
the clusters before and after that were upgraded, or what kind of
environment is getting used.  With this little amount of details,
nobody will be able to guess what's happening.  This issue could also
be caused by the environment.  For example, it is possible in some
carelessly set-up environments that a flush is issued and recognized as
completed by the OS, and thought as completed by Postgres, but an
application layer between the OS and the actual hardware did not issue
the flush (be it an OS, FS, disk or a VM-related thing), which would
make this issue reachable.
--
Michael




Re: multiple entries for synchronous_standby_names

2022-06-13 Thread Michael Paquier
On Fri, Jun 10, 2022 at 05:04:30PM +0100, Nitesh Nathani wrote:
> Trying to achieve sync streaming to barman server and i need to add an
> entry to postgresql.conf for this parameter, which already has an entry and
> tried a few variations but does not work. Any ideas? Also tried '&&' but in
> vain
> 
> synchronous_standby_names='ANY 1 (*)',barman-wal-archive

This grammar flavor is not supported (see also syncrep_gram.y for the
code):
https://www.postgresql.org/docs/devel/runtime-config-replication.html

And here is the actual list of grammars supported:
[FIRST] num_sync ( standby_name [, ...] )
ANY num_sync ( standby_name [, ...] )
standby_name [, ...]

In short, you can specify a list of node names within one ANY or FIRST
clause, but you cannot specify a list made of ANY/FIRST items.
Without knowing what kind of priority policy you are trying to
achieve, it is hard to recommend one method over the others.  What we
support now has proven to be hard enough to implement and to make
robust, and supporting sub-groups of nodes was also on the table back
in the day, but the lack of cases did not justify the extra
implementation complexity, as far as I recall this matter.
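To illustrate the supported grammar (s1, s2 and s3 are made-up standby
names):

```
# postgresql.conf -- supported forms
synchronous_standby_names = 'FIRST 2 (s1, s2, s3)'  # priority-based
synchronous_standby_names = 'ANY 1 (s1, s2)'        # quorum-based
synchronous_standby_names = 's1, s2'                # legacy, same as FIRST 1 (s1, s2)

# Not supported: a list made of ANY/FIRST items, as in the original attempt
# synchronous_standby_names = 'ANY 1 (*)', barman-wal-archive
```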
--
Michael




Re: What do you guys use for issue tracking, CI/CD and team management? any nice open source options?

2022-04-15 Thread Michael Paquier
On Thu, Apr 14, 2022 at 06:19:44PM +0300, Achilleas Mantzios wrote:
> What issue/bug tracking is PostgreSQL itself using?
> What continuous build system (CI/CD) is PostgreSQL itself using?
> Any tool that you ppl or the PostgreSQL infrastructure use that
> links people/committers with bugs/issues, automatically closes
> issues upon commit/push, manages issues/p rojects/people ?

The community does not have a bug tracker, everything is mail-based as
of today.  There is what we call the commit fest app:
https://commitfest.postgresql.org/

This is more a patch tracker for the development of new features
though, still there is a category for published patches that aim at
fixing bugs (see "Bug Fixes").  This does not act as a bug tracker to
be able to see all the live issues reported.
--
Michael




Re: Can you install/run postgresql on a FIPS enabled host?

2022-03-22 Thread Michael Paquier
On Mon, Mar 21, 2022 at 06:33:29PM -0400, Tom Lane wrote:
> It sounds like something thinks that scram-sha-256 encryption is
> disallowed by FIPS.  That may or may not be accurate.  If it's
> supposed to be allowed, you'd need to poke a little harder to
> narrow down where the problem is.
> 
> (Digging in our commit logs, it looks like version 14.2 has some
> changes that might make this work better than it did in older
> versions; but I can't tell from the log messages whether the
> issue being fixed was new-in-14 or not.)

I guess that 3a0cced is the commit you are talking about here.  Please
note that it has been reverted in ad5b6f2 due to ABI concerns with
some of the MD5 hashing routines.
--
Michael




Re: could not open relation with OID

2022-01-26 Thread Michael Paquier
On Wed, Jan 26, 2022 at 05:30:01PM -0800, Ben Chobot wrote:
> We do a lot of queries per day, over a lot of hosts, all of which are on
> 12.9. We've recently started doing a better job at analyzing our db logs and
> have found that, a few times a day, every day, we see some of our queries
> fail with errors like:
> 
> could not open relation with OID 201940279
> 
> In the cases we've examined so far, the failed query succeeds just fine when
> we run it manually. The failed query also had run on an async streaming
> replica, and the primary has completed at least one autovacuum since the
> failure. I don't know if either of those two facts are relevant, but I'm not
> sure what else to blame. The internet seems to want to blame issues like
> this on temp tables, which makes sense, but in our case, most of the queries
> that are failing this way are simple PK scans, which then fall back to the
> table to pull all the columns. The tables themselves are small in row count
> - although some values are likely TOASTed - so I would be surprised if
> anything is spilling to disk for sorting, which might have counted as a temp
> table enough to give such an error.

Do those OIDs point to some specific relations?  It should be easy
enough to guess which pg_class entry they point to, especially if
you have a persistent schema, and whether these are indeed temporary
entries or not depending on their pg_class.relnamespace.
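A minimal way to check that, using the OID from the report (run on the
node where the error showed up):

```sql
SELECT relname,
       relnamespace::regnamespace AS schema,
       relpersistence             -- 't' would mean a temporary relation
FROM pg_class
WHERE oid = 201940279;
```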

> This is a minuscule failure percentage, so replicating it is going to be
> hard, but it is still breaking for reasons I don't understand, and so I'd
> like to fix it. Has anybody else seen this, or have an ideas of what to look
> at?

I don't recall seeing such reports recently.

> Other things we've considered:
>     - we run pg_repack, which certainly seems like it could make an error
> like this, but we see this error in places and times that pg_repack isn't
> currently running

It could also take time for the issue to show up, depending on the
state of the relcache.
--
Michael




Re: How are md5.h: pg_md5_hash() function and below functions working?

2022-01-21 Thread Michael Paquier
On Sat, Jan 08, 2022 at 08:12:50AM -0800, Adrian Klaver wrote:
> On 1/8/22 05:21, Ali Koca wrote:
>> I can't understand functions in md5.h, these are seemingly little bit
>> weird. Such as:
>>  /* Utilities common to all the MD5 implementations,
>>  as of md5_common.c */
>>  extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum);
>>  extern bool pg_md5_binary(const void *buff, size_t len, void *outbuf);
>>  extern bool pg_md5_encrypt(const char *passwd, const char *salt,
>>     size_t salt_len, char *buf);
> 
> What they do is explained in md5_common.c.

pg_md5_hash() and pg_md5_binary() do the same thing, computing an
MD5 hash.  pg_md5_hash() computes it as a string in hex format, meaning
that it is 33 characters long with the final trailing '\0'.
pg_md5_binary() computes 16 raw bytes.

pg_md5_encrypt() is a utility wrapper that does the operation of
pg_md5_hash() based on a password and a salt, used at authentication
time for MD5.  Its result is a 36-byte long string, prefixed with
"md5" and a 33-byte long hex string.  Those routine names are
historic.
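For illustration, the formats described above can be sketched in Python
with the standard hashlib module (the helper names below are made up;
only the "md5" prefix and the lengths mirror the C routines):

```python
import hashlib

def md5_hex(data: bytes) -> str:
    # Counterpart of pg_md5_hash(): 32 hex characters (the C buffer is
    # 33 bytes to hold the trailing '\0').
    return hashlib.md5(data).hexdigest()

def md5_encrypt(password: str, salt: str) -> str:
    # Counterpart of pg_md5_encrypt(): md5(password || salt) prefixed
    # with "md5", giving 35 characters (36 bytes with the C terminator).
    return "md5" + md5_hex(password.encode() + salt.encode())

print(md5_hex(b"abc"))                 # 900150983cd24fb0d6963f7d28e17f72
print(md5_encrypt("secret", "alice"))  # md5... (35 characters)
```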
--
Michael




Re: recording of INDEX creation in tables

2022-01-21 Thread Michael Paquier
On Fri, Jan 21, 2022 at 01:38:59PM +0100, Matthias Apitz wrote:
> Does the PostgreSQL (11.4 or 13.1) record somewhere in system tables
> the creation of INDEXes (or other objects)?

Hard to say what you are looking for with such a general question.

Would pg_index or pg_indexes be enough?  There are a bunch of system
catalogs and views, like what's in information_schema, that provide
this information.  If you are looking for a record of the timestamp
when objects have been created, there is no such tracking in Postgres
itself, but you could use an event trigger for this purpose.

Here are some links:
https://www.postgresql.org/docs/devel/catalog-pg-index.html
https://www.postgresql.org/docs/devel/view-pg-indexes.html
https://www.postgresql.org/docs/devel/event-triggers.html
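As a sketch of the event-trigger approach (all names here are
hypothetical), something like this records a timestamp for each index
creation:

```sql
CREATE TABLE ddl_history (
    created_at  timestamptz NOT NULL DEFAULT now(),
    command_tag text,
    object_id   text
);

CREATE FUNCTION log_ddl() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        INSERT INTO ddl_history (command_tag, object_id)
            VALUES (r.command_tag, r.object_identity);
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER track_index_creation ON ddl_command_end
    WHEN TAG IN ('CREATE INDEX')
    EXECUTE FUNCTION log_ddl();
```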
--
Michael




Re: could not accept SSL connection: Success

2022-01-19 Thread Michael Paquier
On Wed, Jan 19, 2022 at 08:06:30PM -0500, Tom Lane wrote:
> Michael Paquier  writes:
> > Leaving things in their current state is fine by me.  Would it be
> > better to add a note about the business with 3.0 though?
> 
> What do you envision saying?  "We don't need to do anything here
> for 3.0" doesn't seem helpful.

Nope, but the idea would be to keep around a note that we may want to
revisit this area of the code based on the state of upstream, because
our code is currently shaped based on problems that OpenSSL has dealt
with.  I am not completely sure, but something along the lines of:
"OpenSSL 1.1.1 and older versions return nothing on an unexpected EOF,
and errno may not be set.  3.0 reports SSL_ERROR_SSL with a
meaningful error set on the stack, so this could be reworked once
support for older versions is removed."

Perhaps that's just nannyism from my side; this is really minor in
the end.
--
Michael




Re: could not accept SSL connection: Success

2022-01-19 Thread Michael Paquier
On Wed, Jan 19, 2022 at 07:58:43PM -0500, Tom Lane wrote:
> Personally I'm satisfied to leave it as-is, since this issue apparently
> occurs only in a minority of OpenSSL versions, and not the newest.

Leaving things in their current state is fine by me.  Would it be
better to add a note about the business with 3.0 though?  My gut is
telling me that we'd better revisit those code paths in a couple of
years when support for legacy OpenSSL is removed, and most likely we
would have forgotten about all those details.
--
Michael




Re: could not accept SSL connection: Success

2022-01-19 Thread Michael Paquier
On Thu, Jan 20, 2022 at 09:05:35AM +1300, Thomas Munro wrote:
> Good news, I'm glad they nailed that down.  I recall that this
> behaviour was a bit of a moving target in earlier versions:
> 
> https://www.postgresql.org/message-id/CAEepm%3D3cc5wYv%3DX4Nzy7VOUkdHBiJs9bpLzqtqJWxdDUp5DiPQ%40mail.gmail.com

Ahh..  So you saw the same problem a couple of years back.  This
thread did not catch much attention.

I don't think that it makes much sense to leave this unchecked as the
message is confusing as it stands.  Perhaps we could do something like
the attached by adding a note about OpenSSL 3.0 to revisit this code
once we unplug support for 1.1.1 and avoiding the errno==0 case?  The
frontend has its own ideas of socket failures as it requires thread
support, making things different with the backend, but it seems to me
that we could see cases where SOCK_ERRNO is also 0.  That's mostly
what you suggested on the other thread.

The error handling changes are really cosmetic, so I'd rather leave
the back-branches out of that.  Thoughts?
--
Michael
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 3d0168a369..12ddc6f82f 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -493,7 +493,13 @@ aloop:
 		 WAIT_EVENT_SSL_OPEN_SERVER);
 goto aloop;
 			case SSL_ERROR_SYSCALL:
-if (r < 0)
+/*
+ * OpenSSL 1.1.1 and older versions return nothing on
+ * an unexpected EOF, and errno may not be set.  Note that
+ * in OpenSSL 3.0, an unexpected EOF would result in
+ * SSL_ERROR_SSL with a meaningful error set on the stack.
+ */
+if (r < 0 && errno != 0)
 	ereport(COMMERROR,
 			(errcode_for_socket_access(),
 			 errmsg("could not accept SSL connection: %m")));
diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index 9f735ba437..b47ea87b43 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -314,6 +314,12 @@ pgtls_write(PGconn *conn, const void *ptr, size_t len)
 			n = 0;
 			break;
 		case SSL_ERROR_SYSCALL:
+			/*
+			 * OpenSSL 1.1.1 and older versions return nothing on
+			 * an unexpected EOF, and errno may not be set.  Note that
+			 * in OpenSSL 3.0, an unexpected EOF would result in
+			 * SSL_ERROR_SSL with a meaningful error set on the stack.
+			 */
 			if (n < 0)
 			{
 result_errno = SOCK_ERRNO;
@@ -322,11 +328,14 @@ pgtls_write(PGconn *conn, const void *ptr, size_t len)
 		 libpq_gettext("server closed the connection unexpectedly\n"
 	   "\tThis probably means the server terminated abnormally\n"
 	   "\tbefore or while processing the request.\n"));
-else
+else if (result_errno != 0)
 	appendPQExpBuffer(&conn->errorMessage,
 	  libpq_gettext("SSL SYSCALL error: %s\n"),
 	  SOCK_STRERROR(result_errno,
 	sebuf, sizeof(sebuf)));
+else
+	appendPQExpBuffer(&conn->errorMessage,
+	  libpq_gettext("SSL SYSCALL error: internal failure\n"));
 			}
 			else
 			{




Re: could not accept SSL connection: Success

2022-01-18 Thread Michael Paquier
On Mon, Jan 17, 2022 at 05:05:52PM +0100, Carla Iriberri wrote:
> I saw previous discussions where different errors were logged with the
> "Success"
> message and this was corrected/treated as a bug, but I couldn't find similar
> reports specific to "could not accept SSL connection". Is this a known
> issue or
> case?

Not based on my recent mailing list memories, but I may be running short.
The error comes from the backend as you say, where this log would
expect something in saved_errno to feed %m.

And, upstream documentation tells that:
https://www.openssl.org/docs/manmaster/man3/SSL_get_error.html

"On an unexpected EOF, versions before OpenSSL 3.0 returned
SSL_ERROR_SYSCALL, nothing was added to the error stack, and errno was
0. Since OpenSSL 3.0 the returned error is SSL_ERROR_SSL with a
meaningful error on the error stack."

This would mean that relying on %m would be wrong for this case.  And
I guess that you are using a version of OpenSSL older than 3.0?
--
Michael




Re: md5 issues Postgres14 on OL7

2022-01-10 Thread Michael Paquier
On Sat, Jan 08, 2022 at 02:00:16PM -0500, Tom Lane wrote:
> This is looking pretty solid to me.  Just a couple of nitpicks:
> 
> * In most places you initialize variables holding error strings to NULL:
> 
> + const char *logdetail = NULL;
> 
> but there are three or so spots that don't, eg PerformRadiusTransaction.
> They should be consistent.  (Even if there's no actual bug, I'd be
> unsurprised to see Coverity carp about the inconsistency.)

Hmm.  I have spotted five of them, with one in passwordcheck.

> * The comments for md5_crypt_verify and plain_crypt_verify claim that
> the error string is "optionally" stored, but I don't see anything
> optional about it.  To me, "optional" would imply coding like
> 
>   if (logdetail)
>   *logdetail = errstr;
> 
> which we don't have here, and I don't think we need it.  But the
> comments need adjustment.  (They were wrong before too, but no
> time like the present to clean them up.)

Makes sense.

> * I'd be inclined to just drop the existing comments like
> 
> -  * We do not bother setting logdetail for any pg_md5_encrypt failure
> -  * below: the only possible error is out-of-memory, which is unlikely, 
> and
> -  * if it did happen adding a psprintf call would only make things worse.
> 
> rather than modify them.   Neither the premise nor the conclusion
> of these comments is accurate anymore.  (I think that the psprintf
> they are talking about is the one that will happen inside elog.c
> to construct an errdetail string.  Yeah, there's some risk there,
> but I think it's minimal because of the fact that we preallocate
> some space in ErrorContext.)

Okay, that's fine by me.

> Other than those things, I think v3 is good to go.

I have done an extra pass on all that, and the result seemed fine to
me, so applied.  I have changed the non-OpenSSL code path of pgcrypto
to deal with that in 14 (does not exist on HEAD).  Thanks a lot for
the successive reviews!

The patch was invasive enough, but we could do more here:
- Add the same facility for HMAC.  That's not worth it on REL_14_STABLE
based on the existing set of callers, but I'd like to do something
about that on HEAD as that could be helpful in the future.
- The error areas related to checksum_helper.c and backup_manifest.c
could be improved more.  Now these refer only to scenarios unlikely
to happen in the field, so I have left that out.
--
Michael




Re: md5 issues Postgres14 on OL7

2022-01-07 Thread Michael Paquier
On Fri, Jan 07, 2022 at 05:40:09PM -0500, Tom Lane wrote:
> Hm, you still have cast-away-const in md5_crypt_verify and
> plain_crypt_verify.  Can we adjust their APIs to make them
> return const char * as well (and then their API spec is that
> the caller must never free the string, rather than being
> vague about it)?

Okay.  Hmm.  This requires a couple of extra const markers in the area
of authentication and SASL for the backend, but not much actually.
I thought first that it would be more invasive.

> The other thing that bothers me slightly is that it looks like
> some code paths could end up passing a NULL string pointer to
> ereport or sprintf, since you don't positively guarantee that
> an error will return a string there.  I suppose this is safe
> since 3779ac62d, but I don't really want to start making API
> specs depend on it.

Sounds fair to me in the long term, even for non-assert builds.  I
have added a small-ish wrapper routine in cryptohash_openssl.c for this
purpose, with a name copied from {be,fe}-secure-openssl.c to ease any
future refactoring lookups if that proves to be worth it.
--
Michael
From 0e7ceca421ab309ce64d8f6fea7a714da08df56c Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Sat, 8 Jan 2022 14:56:39 +0900
Subject: [PATCH v3] Improve error reporting for cryptohashes

---
 src/include/common/cryptohash.h   |  1 +
 src/include/common/md5.h  |  9 ++-
 src/include/libpq/crypt.h |  7 +-
 src/include/libpq/sasl.h  |  4 +-
 src/backend/commands/user.c   |  4 +-
 src/backend/libpq/auth-sasl.c |  2 +-
 src/backend/libpq/auth-scram.c|  5 +-
 src/backend/libpq/auth.c  | 33 
 src/backend/libpq/crypt.c | 46 ++-
 src/backend/replication/backup_manifest.c |  9 ++-
 src/backend/utils/adt/cryptohashfuncs.c   | 25 +++---
 src/common/cryptohash.c   | 57 +-
 src/common/cryptohash_openssl.c   | 93 +++
 src/common/md5_common.c   | 20 +++--
 src/interfaces/libpq/fe-auth.c| 22 +-
 contrib/uuid-ossp/uuid-ossp.c | 18 +++--
 16 files changed, 282 insertions(+), 73 deletions(-)

diff --git a/src/include/common/cryptohash.h b/src/include/common/cryptohash.h
index ea1300d5d4..8e339c83ad 100644
--- a/src/include/common/cryptohash.h
+++ b/src/include/common/cryptohash.h
@@ -34,5 +34,6 @@ extern int	pg_cryptohash_init(pg_cryptohash_ctx *ctx);
 extern int	pg_cryptohash_update(pg_cryptohash_ctx *ctx, const uint8 *data, size_t len);
 extern int	pg_cryptohash_final(pg_cryptohash_ctx *ctx, uint8 *dest, size_t len);
 extern void pg_cryptohash_free(pg_cryptohash_ctx *ctx);
+extern const char *pg_cryptohash_error(pg_cryptohash_ctx *ctx);
 
 #endif			/* PG_CRYPTOHASH_H */
diff --git a/src/include/common/md5.h b/src/include/common/md5.h
index bbd2ec0165..942ca4242c 100644
--- a/src/include/common/md5.h
+++ b/src/include/common/md5.h
@@ -26,9 +26,12 @@
 #define MD5_PASSWD_LEN	35
 
 /* Utilities common to all the MD5 implementations, as of md5_common.c */
-extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum);
-extern bool pg_md5_binary(const void *buff, size_t len, void *outbuf);
+extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum,
+		const char **errstr);
+extern bool pg_md5_binary(const void *buff, size_t len, void *outbuf,
+		  const char **errstr);
 extern bool pg_md5_encrypt(const char *passwd, const char *salt,
-		   size_t salt_len, char *buf);
+		   size_t salt_len, char *buf,
+		   const char **errstr);
 
 #endif			/* PG_MD5_H */
diff --git a/src/include/libpq/crypt.h b/src/include/libpq/crypt.h
index ee60772e94..3238cf66d3 100644
--- a/src/include/libpq/crypt.h
+++ b/src/include/libpq/crypt.h
@@ -35,12 +35,13 @@ extern PasswordType get_password_type(const char *shadow_pass);
 extern char *encrypt_password(PasswordType target_type, const char *role,
 			  const char *password);
 
-extern char *get_role_password(const char *role, char **logdetail);
+extern char *get_role_password(const char *role, const char **logdetail);
 
 extern int	md5_crypt_verify(const char *role, const char *shadow_pass,
 			 const char *client_pass, const char *md5_salt,
-			 int md5_salt_len, char **logdetail);
+			 int md5_salt_len, const char **logdetail);
 extern int	plain_crypt_verify(const char *role, const char *shadow_pass,
-			   const char *client_pass, char **logdetail);
+			   const char *client_pass,
+			   const char **logdetail);
 
 #endif
diff --git a/src/include/libpq/sasl.h b/src/include/libpq/sasl.h
index 7ba3f5f5bc..71cc0dc251 100644
--- a/src/include/libpq/sasl.h
+++ b/src/include/libpq/sasl.h
@@ -126,11 +126,11 @@ typedef struct pg_be_sasl_mech
 	int			(*exchange) (void *state,
 			 const char *input, int inputlen,
 			 char **outp

Re: md5 issues Postgres14 on OL7

2022-01-06 Thread Michael Paquier
On Thu, Jan 06, 2022 at 11:40:04AM -0500, Tom Lane wrote:
> 1. It draws a cast-away-const warning.  We'd have to make the result
> of pg_cryptohash_error be "const char *", which would be better
> practice anyway, but that propagates into some other APIs and I didn't
> take the trouble to chase it to the end.

Yeah.  I wanted to switch all those routines to use a const here
anyway, just did not have the time to tackle that yesterday with the
rest of the issues I could think about.  Looking at that today, I
don't see any problems in switching to const in all those places, so
done this way (two places in crypt.c are more picky, though, for
logdetail).

> 2. It feels a bit bogus to be fetching ERR_get_error() at this point.
> Maybe it's all right to assume that the OpenSSL error stack won't
> change state before we get to pg_cryptohash_error, but I don't like
> the idea much.  I think it'd be better to capture ERR_get_error()
> sooner and store it in an additional field in pg_cryptohash_ctx.

Right, I forgot about the ERR_get_error() piece of it.  Thanks!  I'd
also rather have the check done just after the OpenSSL call.  If hash
computations are split across multiple code paths, this could lead to
issues.  We don't have any problems currently in the core code, but I
see no reason to not make that safer in the long run.  And the
structure is flexible enough, so that's not an issue.

> Also, I wonder if this shouldn't be unified with the SSLerrmessage()
> support found in be-secure-openssl.c and fe-secure-openssl.c.

Guess so.  HEAD could be poked at for this part.  I recall looking at
that once but that did not seem worth the complications.

I have also looked at the ABI part of the patch.  I cannot spot any
public trace of pg_md5_hash() and pg_md5_binary().  pgbouncer and
pgpool2 have each a copy of pg_md5_encrypt(), but they just share the
API name with PG, nothing more.  So that looks reasonably safe to
change.

The last thing I have in my notes is to assign logdetail in
md5_crypt_verify() and plain_crypt_verify() to feed back a LOG entry
to the postmaster on those failures, and saw that it is safe to assign
directly the error returned by the cryptohash APIs, avoiding the
extra psprintf call that could become an issue under memory pressure.

What do you think?
--
Michael
From 62ecfb1a77a58f6e056822e3a7a0b3052fcf9e46 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Fri, 7 Jan 2022 11:08:17 +0900
Subject: [PATCH v2] Improve error reporting for cryptohashes

---
 src/include/common/cryptohash.h   |  1 +
 src/include/common/md5.h  |  9 ++-
 src/backend/libpq/auth.c  | 14 ++--
 src/backend/libpq/crypt.c | 40 +++-
 src/backend/replication/backup_manifest.c |  9 ++-
 src/backend/utils/adt/cryptohashfuncs.c   | 25 ---
 src/common/cryptohash.c   | 57 +++-
 src/common/cryptohash_openssl.c   | 80 +++
 src/common/md5_common.c   | 20 --
 src/interfaces/libpq/fe-auth.c| 22 +--
 contrib/uuid-ossp/uuid-ossp.c | 18 +++--
 11 files changed, 244 insertions(+), 51 deletions(-)

diff --git a/src/include/common/cryptohash.h b/src/include/common/cryptohash.h
index 541dc844c8..c62c350d57 100644
--- a/src/include/common/cryptohash.h
+++ b/src/include/common/cryptohash.h
@@ -34,5 +34,6 @@ extern int	pg_cryptohash_init(pg_cryptohash_ctx *ctx);
 extern int	pg_cryptohash_update(pg_cryptohash_ctx *ctx, const uint8 *data, size_t len);
 extern int	pg_cryptohash_final(pg_cryptohash_ctx *ctx, uint8 *dest, size_t len);
 extern void pg_cryptohash_free(pg_cryptohash_ctx *ctx);
+extern const char *pg_cryptohash_error(pg_cryptohash_ctx *ctx);
 
 #endif			/* PG_CRYPTOHASH_H */
diff --git a/src/include/common/md5.h b/src/include/common/md5.h
index 62a31e6ed4..cc43fc6606 100644
--- a/src/include/common/md5.h
+++ b/src/include/common/md5.h
@@ -26,9 +26,12 @@
 #define MD5_PASSWD_LEN	35
 
 /* Utilities common to all the MD5 implementations, as of md5_common.c */
-extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum);
-extern bool pg_md5_binary(const void *buff, size_t len, void *outbuf);
+extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum,
+		const char **errstr);
+extern bool pg_md5_binary(const void *buff, size_t len, void *outbuf,
+		  const char **errstr);
 extern bool pg_md5_encrypt(const char *passwd, const char *salt,
-		   size_t salt_len, char *buf);
+		   size_t salt_len, char *buf,
+		   const char **errstr);
 
 #endif			/* PG_MD5_H */
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 7bcf52523b..eea933f41e 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -3085,6 +3085,8 @@ PerformRadiusTransaction(const char *server, const char *secret, const char *por
 	md5trailer = packet->vector;
 	for (i = 0; i < encryptedpassword

Re: md5 issues Postgres14 on OL7

2022-01-06 Thread Michael Paquier
On Wed, Jan 05, 2022 at 04:09:12PM +0900, Michael Paquier wrote:
> In order to make things portable with 14 in cryptohash.c, we don't
> have any need to change the existing cryptohash APIs.  We could just
> store in each implementation context a location to a static string,
> and add a new routine to extract it if there is an error, defaulting
> to OOM.

I have been looking at that, and finished with the attached.  It is
close to the end of the day, so this needs an extra lookup, but I
settled on the idea of an extra routine, called pg_cryptohash_error(),
able to grab the error saved in the private contexts, so that any
caller from the backend or the frontend can rely on it.  This way, it
is possible to distinguish between several classes of errors: OOMs, a
too-short destination buffer, OpenSSL internal errors, etc.

There are a couple of things worth noting here:
- Two code paths of src/backend/libpq/crypt.c rely on the result of
pg_md5_encrypt() always being an OOM, which skips one psprintf().
This has been left as-is in the backend for now, but we don't pfree()
the *logdetail strings passed across the various layers, so we could
just pass down the cryptohash error as-is.  We'd better mention that
logdetail may not be palloc'd all the time, once we do that.  libpq is
able to use that properly.
- The routines of md5_common.c need to pass down an extra *errstr for
their respective callers.  That's an ABI breakage but I'd like to
think that nobody uses that out-of-core.  (I need to double-check this
part, as well).
- HMAC (hmac_openssl.c and hmac.c) could use the same infra, but I did
not see a use for that yet.  It is possible to compute HMACs with MD5,
but we don't have any in-core callers, and failures are just part of
the SCRAM workflow with dedicated error messages.

I am still not sure about the FIPS part, given that OpenSSL does
things differently in 3.0.0 and that Fedora patches upstream in its
own way, but this could be extended in cryptohash_openssl.c to
provide even more context.  For now, this allows OpenSSL internal
failures to be reported, which takes care of the worst of the
disturbance.

Thoughts?
--
Michael
From 064ef97fd2a171485ff408550380515493ce0072 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 6 Jan 2022 17:13:55 +0900
Subject: [PATCH] Improve error reporting for cryptohashes

---
 src/include/common/cryptohash.h   |  1 +
 src/include/common/md5.h  |  8 ++-
 src/backend/libpq/auth.c  | 14 +++--
 src/backend/libpq/crypt.c | 20 ---
 src/backend/replication/backup_manifest.c |  9 ++-
 src/backend/utils/adt/cryptohashfuncs.c   | 25 ++---
 src/common/cryptohash.c   | 57 ++-
 src/common/cryptohash_openssl.c   | 68 +++
 src/common/md5_common.c   | 20 +--
 src/interfaces/libpq/fe-auth.c| 22 ++--
 contrib/uuid-ossp/uuid-ossp.c | 18 --
 11 files changed, 219 insertions(+), 43 deletions(-)

diff --git a/src/include/common/cryptohash.h b/src/include/common/cryptohash.h
index 541dc844c8..61d74e1195 100644
--- a/src/include/common/cryptohash.h
+++ b/src/include/common/cryptohash.h
@@ -34,5 +34,6 @@ extern int	pg_cryptohash_init(pg_cryptohash_ctx *ctx);
 extern int	pg_cryptohash_update(pg_cryptohash_ctx *ctx, const uint8 *data, size_t len);
 extern int	pg_cryptohash_final(pg_cryptohash_ctx *ctx, uint8 *dest, size_t len);
 extern void pg_cryptohash_free(pg_cryptohash_ctx *ctx);
+extern char *pg_cryptohash_error(pg_cryptohash_ctx *ctx);
 
 #endif			/* PG_CRYPTOHASH_H */
diff --git a/src/include/common/md5.h b/src/include/common/md5.h
index 62a31e6ed4..1db2855d95 100644
--- a/src/include/common/md5.h
+++ b/src/include/common/md5.h
@@ -26,9 +26,11 @@
 #define MD5_PASSWD_LEN	35
 
 /* Utilities common to all the MD5 implementations, as of md5_common.c */
-extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum);
-extern bool pg_md5_binary(const void *buff, size_t len, void *outbuf);
+extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum,
+		char **errstr);
+extern bool pg_md5_binary(const void *buff, size_t len, void *outbuf,
+		  char **errstr);
 extern bool pg_md5_encrypt(const char *passwd, const char *salt,
-		   size_t salt_len, char *buf);
+		   size_t salt_len, char *buf, char **errstr);
 
 #endif			/* PG_MD5_H */
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 7bcf52523b..9f118ec546 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -3085,6 +3085,8 @@ PerformRadiusTransaction(const char *server, const char *secret, const char *por
 	md5trailer = packet->vector;
 	for (i = 0; i < encryptedpasswordlen; i += RADIUS_VECTOR_LENGTH)
 	{
+		char	   *errstr = NULL;
+
 		memcpy(cryptvector + strlen(secret), md5trailer, RADIUS_VECTOR_LENGTH);
 
 		/*
@@ -3093,10 +3095,12 @@ PerformRadiusT

Re: md5 issues Postgres14 on OL7

2022-01-04 Thread Michael Paquier
On Wed, Jan 05, 2022 at 01:08:53AM -0500, Tom Lane wrote:
> I think it's very important that the error message in this case
> mention "FIPS mode" explicitly.  Otherwise, people will have no
> idea that that's where the problem originates, and they'll be
> frustrated and we'll get bug reports.  (They may be frustrated
> anyway, but it was their choice, or their corporate policy's
> choice, to cut off their access to MD5.  Not our place to dodge
> that decision.)

I am not completely sure how to detect that in 1.1.1 in the context of
Fedora, and portability may become a tricky thing.  FIPS_mode() and
FIPS_mode_set() are legacy APIs that should not be used, and upstream
just disables them in 1.1.1.

[... digs a bit ...]

Ugh.  Fedora patches upstream's 1.1.1 to check and react on
/proc/sys/crypto/fips_enabled.  Their code is here, see particularly
0009-Add-Kernel-FIPS-mode-flag-support.patch:
https://src.fedoraproject.org/rpms/openssl.git

So that's why you are able to use it with 1.1.1.  Well, we could do
something similar to that, but in 3.0.0 things are done very
differently: one has to set to alg_sect fips=yes with fips = fips_sect
in the OpenSSL configuration to load the FIPS provider.  Providing
more error context is going to be hairy here..

In order to make things portable with 14 in cryptohash.c, we don't
have any need to change the existing cryptohash APIs.  We could just
store in each implementation context a location to a static string,
and add a new routine to extract it if there is an error, defaulting
to OOM.
--
Michael


signature.asc
Description: PGP signature


Re: md5 issues Postgres14 on OL7

2022-01-04 Thread Michael Paquier
On Tue, Jan 04, 2022 at 12:54:35PM -0500, Tom Lane wrote:
> I reproduced this on Fedora 35 with FIPS mode enabled.  The problem
> is that OpenSSL treats MD5 as a disallowed cipher type under FIPS
> mode, so this call in pg_cryptohash_init fails:

Is that 3.0.0 or 1.1.1?  I can see the following, telling that Fedora
35 uses OpenSSL 1.1.1:
https://packages.fedoraproject.org/pkgs/openssl/openssl/

And there is FIPS 2.0 for 1.0.1 and 1.0.2 (funnily, I recall that
1.0.2+FIPS allows MD5 to work with the EVP routines), but the module
of FIPS 3.0 does not work with 1.1.1 AFAIK, so I may be confused.  Or
perhaps this is OpenSSL 1.1.1 with a separate module?  The upstream
code does nothing special with EVP_DigestInit_ex() in 1.1.1 (see
digest.c), contrary to 3.0.0 where there is specific knowledge of
FIPS.

> status = EVP_DigestInit_ex(ctx->evpctx, EVP_md5(), NULL);
> 
> and then we come back to this in md5_text():
> 
> /* get the hash result */
> if (pg_md5_hash(VARDATA_ANY(in_text), len, hexsum) == false)
> ereport(ERROR,
> (errcode(ERRCODE_OUT_OF_MEMORY),
>  errmsg("out of memory")));
> 
> So there's nothing actually misbehaving, but our error reportage sucks:
> the hash functions have no way to report a specific failure code,
> and the caller(s) think the only possible failure mode is OOM.

Indeed, this error is a pilot error with the cryptohash integration of
14.  In ~13, the custom MD5 implementation would only fail on OOM, but
more failure modes are possible now.

> I suppose we could get around the error by using our own MD5 code
> even in OpenSSL-enabled builds, but that'd violate both the spirit
> and the letter of FIPS certification.

I don't think we should go back to an integration of our custom MD5 if
we have external libraries that provide support for it.  That's
against the set of FIPS policies.

> I think the right response is
> to upgrade the error-reporting API in this area, so that the message
> could look more like "MD5 is disallowed in FIPS mode".

Hmm.  I am not sure how much we should try to make the backend, or
even the frontend, FIPS-aware (remember for example 0182438 where we
avoided this kind of complexity), and not all SSL libraries we would
potentially add support for may care about such error states.  The
cleanest approach may be to extend the APIs to store an **errstr so
that implementations are free to assign the error string they want to
send back for error reporting, rather than using more error codes.
If we want to improve things in this area with FIPS (aka allow
check-world to pass in this case), we would need more in terms of
alternate test output, and extra tweaks for the authentication tests,
as well.  Perhaps the best thing to do in the long term would be to
drop MD5, but we are not there yet IMO even if password_encryption
default has changed, and upgrade scenarios get hairy.

In the end, I agree that we should improve the error message in these
two cases.  However, I would stick to simplicity by not assuming that
those two code paths fail only on OOM, and reword things in md5_text()
and md5_bytea() with a simple "could not compute MD5 hash".  Any code
path calling the routines of md5_common.c has done just that for ages
when the computation fails, and that's what we care about here.
--
Michael




Re: md5 issues Postgres14 on OL7

2022-01-04 Thread Michael Paquier
On Mon, Dec 20, 2021 at 03:22:31PM +0100, Christoph Moench-Tegeder wrote:
> Active FIPS mode (/proc/sys/crypto/fips_enabled => 1) on the server does
> produce this behaviour.

Most likely, this is a build linked with OpenSSL?  The way MD5 hashes
are computed in Postgres has largely changed in 14, and the code has
been refactored so as we rely on the EVP APIs from OpenSSL when
building with --with-ssl=openssl, having as direct consequence to
allocate a bit more memory every time a hash is computed.  My guess is
that this comes from pg_cryptohash_create() in cryptohash_openssl.c,
with a complaint coming from OpenSSL's EVP_MD_CTX_create(), but there
are other palloc() calls in this area as well.
--
Michael




Re: Reindex "locked" standby database

2021-12-14 Thread Michael Paquier
On Wed, Dec 15, 2021 at 12:15:27AM -0300, Martín Fernández wrote:
> The reindex went fine in the primary database and in one of our
> standby. The other standby that we also operate for some reason
> ended up in a state where all transactions were locked by the WAL
> process and the WAL process was not able to make any progress. In
> order to solve this issue we had to move traffic from the “bad”
> standby to the healthy one and then kill all transactions that were
> running in the “bad” standby. After that, replication was able to
> resume successfully.

You are referring to the startup process that replays WAL, right?
Without knowing the type of workload your primary and/or standbys are
facing, or the configuration you are using on both
(hot_standby_feedback, for one), I have no direct idea, but that could
be a conflict caused by a concurrent vacuum.

Seeing where things got stuck could also be useful, perhaps with a
backtrace of the area where it happens and some information around
it.

> I’m just trying to understand what could have caused this issue. I
> was not able to identify any queries in the standby that would be
> locking the WAL process. Any insight would be more than welcome!

That's not going to be easy without more information, I am afraid.
--
Michael




Re: log shipping with pg_receivewal

2021-12-13 Thread Michael Paquier
On Tue, Dec 14, 2021 at 05:25:04AM +0100, Marc Millas wrote:
> My question: as the synchronous option is supposed to make pg_receivewal
> write transaction immediately in the wal files, is there a way to ask the
> standby to apply them on the fly ie. without waiting a wal file change ?

Nope, there is no such option.  First, pg_receivewal strongly relies
on renaming a WAL segment to its final name once it is completed.  If
there is any failure, we would start back at the beginning of the last
partial segment found.

Saying that, you could use a restore_command that checks for a
.partial file directly in the path used by pg_receivewal to store the
archives, and Postgres would repeatedly ask for the segment until it
is complete.  But this is going to be really expensive, as a
restore_command would copy a full file each time, so that's a lot of
bandwidth wasted for nothing.  Streaming replication would likely be
your best, and cheapest, option here.
--
Michael




Re: split postgresql logfile

2021-11-29 Thread Michael Paquier
On Mon, Nov 29, 2021 at 12:12:11PM +0100, Paul van Rixel wrote:
> Now the postgresql.logs are polluted with log connections/disconnection,
> Agent monitoring entries and in between some entries which need some
> attention but you missed them. I know the option exists to disable
> log_(dis)connections but because of auditing this option must be enabled.
> We don't use a tool like Grafana yet and with pgBadger we are experimenting
> but skipping "unnessary" lines makes the manually reading of the logs much
> more clear.

The logging collector is an all-or-nothing operation, so your options
for filtering are limited when using it, I am afraid.  If you wish to
split the log entries, something you could consider is using the elog
hook to redirect those logs into a dedicated file, but that means
being careful about concurrency as well: you cannot send those entries
through the logging collector, which serializes the writes and decides
which log file is suited (.csv, .log, etc.) depending on the server
configuration.

Another option with more filtering flexibility is syslog.  However,
contrary to the logging collector, which is designed to never lose
messages, syslog can drop messages that cannot be written :/

For auditing compliance, the latter is never a good option.

(FWIW, another thing that's been asked across the years is to have
JSON as logging format.  There is a patch floating around to add this
support, but that won't apply to already-released versions.)
--
Michael




Re: Pause streaming replication

2021-11-10 Thread Michael Paquier
On Wed, Nov 10, 2021 at 08:36:45PM -0500, Rita wrote:
> Yes, I have read the manual and seen this. It pauses the replication
> (select pg_is_wal_replay_paused()). But on the primary, when I look at
> pg_stat_replication, it still says 'streaming' in the state column. My
> question was how do I get it from 'streaming'  to anything else?  (
> https://www.postgresql.org/docs/11/monitoring-stats.html#PG-STAT-REPLICATION-VIEW)
> ?

pg_wal_replay_pause() pauses WAL *replay* at recovery
(pg_is_wal_replay_paused() only reports that state), but it does
not stop the stream of WAL from the primary to its standbys.

> I want to trigger an alert when 'streaming' isn't there. Or is there a
> better way to do it?

If you want to have some automated test to check a given state of the
replication, you could use a combination of a SIGSTOP on the WAL
receiver of the standby and/or the WAL sender of the primary, with
some pg_terminate_backend() calls, just to throw one idea in the
bucket.
--
Michael




Re: Force re-compression with lz4

2021-10-18 Thread Michael Paquier
On Mon, Oct 18, 2021 at 08:01:04AM -0700, Adrian Klaver wrote:
> Not sure how much this applies to the Postgres usage of lz4. As I understand
> it, this is only used internally for table compression. When using pg_dump
> compression gzip is used. Unless you pipe plain text output through some
> other program.

More precisely, LZ4 applies to the compression of toastable values in
14~.  In 15~, we can already use it for WAL and the compression of
full-page writes.

It is worth noting that there are extra patches floating around to add
more LZ4 pluggability to pg_dump (I think this has not been published
yet), pg_receivewal (published) and base backups through the
replication protocol (published).  I have seen rather good numbers
when it came to WAL, FWIW.  Even if the compression ratio was a bit
lower than pglz's, it was much faster.
--
Michael




Re: Force re-compression with lz4

2021-10-18 Thread Michael Paquier
On Mon, Oct 18, 2021 at 09:57:11AM +0300, Florents Tselai wrote:
> Oh, that’s good to know then. So besides ALTER COMPRESSION for
> future inserts there’s not much one can do for pre-existing values

The posting style of the mailing list is to not top-post, so if you
could avoid breaking the logic of the thread, that would be nice :)

> I think it makes sense to update/ add more info to the docs on this
> as well, since other people in the thread expected this to work that
> way too.

There is some documentation, as changing the compression for an
existing table is part of ALTER TABLE:
https://www.postgresql.org/docs/current/sql-altertable.html

"This does not cause the table to be rewritten, so existing data may
still be compressed with other compression methods. If the table is
restored with pg_restore, then all values are rewritten with the
configured compression method."

> Maybe at some point, even allow an explicit option to be defined during 
> VACUUM ? 

That's a part where we disagreed, as it should not be VACUUM's work to
do that.  The option would have a limited impact, as it concerns users
doing a one-time operation, most likely as part of an upgrade, so I
don't think it would be appropriate to have anyway.
--
Michael




Re: Force re-compression with lz4

2021-10-17 Thread Michael Paquier
On Sun, Oct 17, 2021 at 10:13:48PM +0300, Florents Tselai wrote:
> I did look into VACUUM(full) for it’s PROCESS_TOAST option which
> makes sense, but the thing is I already had a cron-ed VACUUM (full)
> which I ended up disabling a while back; exactly because of the
> double-space requirement.

Please note that VACUUM FULL does not enforce a recompression of
existing values.  See commit dbab0c0, which disabled this choice as it
introduced a noticeable performance penalty in some cases when looking
at the compression type of the vacuumed table's attributes:
=# CREATE TABLE cmdata(f1 text COMPRESSION pglz);
CREATE TABLE
=# INSERT INTO cmdata VALUES(repeat('1234567890', 1000));
INSERT 0 1
=# SELECT pg_column_compression(f1) FROM cmdata;
 pg_column_compression
---
  pglz
(1 row)
=# ALTER TABLE cmdata ALTER COLUMN f1 SET COMPRESSION lz4;
ALTER TABLE
=# VACUUM FULL cmdata;
VACUUM
=# SELECT pg_column_compression(f1) FROM cmdata;
 pg_column_compression
---
  pglz
(1 row)
--
Michael




Re: Force re-compression with lz4

2021-10-17 Thread Michael Paquier
On Sun, Oct 17, 2021 at 10:33:52PM +0200, Daniel Verite wrote:
> However lz4 appears to be much faster to compress than pglz, so its
> benefit is clear in terms of CPU usage for future insertions.

CPU-speaking, LZ4 is *much* faster than pglz when it comes to
compression or decompression with its default options.  The
compression ratio is comparable between the two, though on average LZ4
compresses a bit less than pglz.
--
Michael




Re: How postgres is refreshing TLS certificates

2021-07-29 Thread Michael Paquier
On Wed, Jul 28, 2021 at 06:51:22AM +, M Tarkeshwar Rao wrote:
> We are working on a activity in which I need to refresh the TLS
> certificate without restarting the my application pod. 
> This feature is already there in Postgres. Can anyone please suggest
> us how postgres is implemented the same?

Hard to answer with so little detail, but if you are referring to the
backend server, aren't you looking for the fact that SSL contexts,
and the certificate files behind them, can be reloaded?  That would
apply after a simple pg_ctl "reload", for example.
--
Michael




Re: pg_upgrade and wraparound

2021-05-08 Thread Michael Paquier
On Mon, May 03, 2021 at 11:10:44AM -0400, Jan Wieck wrote:
> Not yet, but I will enter it so that we can get it into 15 for sure.

I may be missing something but this is not listed:
https://commitfest.postgresql.org/33/

Could you add it to the CF app, please?  There are so many patches and
discussions that this would easily get lost if you don't register it.
And from what I can see, having a discussion on this matter looks
warranted to me.
--
Michael




Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-22 Thread Michael Paquier
On Mon, Mar 22, 2021 at 09:25:26AM +0200, Andrus wrote:
> Errors in pg_wal directory seems not to occur in patched version. Errors in
> pg_stat_tmp still occur. Yesterdays log introduces new error message
> 
> using stale statistics instead of current ones because stats collector is
> not responding
> 
> 2021-03-21 23:51:25 EET autovacuum worker LOG:  using stale statistics
> instead of current ones because stats collector is not responding

The renaming of stats files involves just pgrename(), which is a
completely separate code path from the one for WAL segments.  This
requires a separate investigation.
--
Michael




Re: PostgreSQL occasionally unable to rename WAL files (NTFS)

2021-03-21 Thread Michael Paquier
On Tue, Feb 16, 2021 at 12:22:36PM +0100, Thomas Kellerer wrote:
> My first suspect is always the anti-virus on Windows when things like
> that happen with Postgres.

Or maybe not.  13 has introduced a regression in this area AFAIK, and
909b449 should have taken care of it (available in 13.3~).
--
Michael




Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-19 Thread Michael Paquier
On Fri, Mar 19, 2021 at 09:00:10AM +0200, Andrus wrote:
> I replaced files in 13.1 server with ones from your patched version. There
> are no errors in log file now for 8 hours.

Yippee.  Thanks.

Have you tested the unpatched builds?  And did you see some errors
with them?
--
Michael




Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-17 Thread Michael Paquier
On Thu, Mar 18, 2021 at 12:25:45PM +1300, Guy Burgess wrote:
> FWIW, this looks the same issue I am getting (reported last month:
> https://www.postgresql.org/message-id/f444a84e-2d29-55f9-51a6-a5dcea3bc253%40burgess.co.nz)

Yep.

> I get the same Process Monitor output, including BUFFER OVERFLOW entries.
> No sign of any process other than postgres.exe touching the WAL files.

Guy, do you have an environment where this is still happening and
where you could test a potential fix?  We are not sure yet what's
causing that, but one code path has changed in this area, involving
CreateHardLinkA()+_unlink() instead of a single rename when attempting
to recycle a segment.  And I am just in the mood to build things
myself and send links so that people can download and test them, so
one more is fine..
--
Michael




Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-17 Thread Michael Paquier
On Wed, Mar 17, 2021 at 10:45:28AM +0200, Andrus wrote:
> In this server hopefully no. Application code contains xml parsing it but
> probably those queries are never running in this server.

Okay, cool.  I am going to send you privately two links to the builds
I am going to produce, 13.2 unpatched and 13.2 patched.
--
Michael




Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-17 Thread Michael Paquier
On Wed, Mar 17, 2021 at 11:44:48AM -0400, Tom Lane wrote:
> Oh!  That's an interesting theory; it'd explain why this broke recently,
> because we didn't use to use that function.  But how do you draw that
> conclusion from this stack trace?
> 
> Anyway, if you've diagnosed this correctly, I bet the fault is in
> this bit in win32stat.c:
> 
> #if _WIN32_WINNT < 0x0600
>   IO_STATUS_BLOCK ioStatus;
>   FILE_STANDARD_INFORMATION standardInfo;
> #else
>   FILE_STANDARD_INFO standardInfo;
> #endif
> 
> Maybe the version cutoff is wrong?  Maybe we have to do this with
> a run-time version check, instead of statically compiling it?

All the reports received are on 13.1 and 13.2.  This code is new as of
bed9075, no?
--
Michael




Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-17 Thread Michael Paquier
On Wed, Mar 17, 2021 at 09:25:00AM +0200, Andrus wrote:
> pg_config --configure outputs
> 
> --enable-thread-safety --enable-nls --with-ldap --with-openssl --with-uuid
> --with-libxml --with-libxslt --with-icu --with-tcl --with-perl --with-python

Thanks.  Do you actually use OpenSSL, LDAP, uuid-ossp, xml2, PL/Perl,
PL/Python, or the XML datatype for your applications there?  It may be
better if those custom builds have a minimal set of dependencies
enabled, while still being compatible with what you do on those
servers, so that they can still have some load applied.
--
Michael




Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-16 Thread Michael Paquier
On Wed, Mar 17, 2021 at 01:09:24AM +0200, Andrus wrote:
> Should I try install Visual C++ , compile and replace postgres.exe file in
> AMD server.

Mostly.  That's the annoying part:
https://www.postgresql.org/docs/devel/install-windows-full.html

It is also possible to compile the code on a first machine, then zip
it and unzip it at the same location as your installation.

I am not completely sure which flags your installation has, but
another possibility is that I directly send to you two compiled
builds, one with the patch and one without it that you could directly
test.  I would not send that to the lists as an installation is rather
large, but I could just send you links from where you could download
both of them.  Then you would just need to stop the Postgres service,
do a drop-in replacement of the binaries, and start the Postgres
service again.
--
Michael




Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-16 Thread Michael Paquier
On Tue, Mar 16, 2021 at 01:59:07PM +0200, Andrus wrote:
> I have two Windows 2019 servers. In Intel Xeon Cold 6226R server it occurs
> after every 10 seconds. Last logs:
> 
> 2021-03-16 13:48:12 EET checkpointer LOG:  could not rename file
> "pg_wal/000100110097": Permission denied
> 2021-03-16 13:48:22 EET checkpointer LOG:  could not rename file
> "pg_wal/000100110098": Permission denied
> 2021-03-16 13:48:32 EET checkpointer LOG:  could not rename file
> "pg_wal/000100110099": Permission denied
> 2021-03-16 13:48:42 EET checkpointer LOG:  could not rename file
> "pg_wal/00010011009A": Permission denied
> 2021-03-16 13:48:52 EET checkpointer LOG:  could not rename file
> "pg_wal/00010011009D": Permission denied
> 2021-03-16 13:49:02 EET checkpointer LOG:  could not rename file
> "pg_wal/0001001100A0": Permission denied
> 
> So It should be probably reproducible in any Windows 2019 server.

Those ten seconds are coming from RemoveXlogFile(), where pgrename()
loops 100 times, sleeping 100ms between attempts, before giving up.
So something holding the file's handle open prevents the removal from
happening.  Attached is the
patch that should be tested, based on the suspected commit.  There are
actually two scenarios to worry about:
- Check that the code of 13.2 compiled manually is enough to see the
failure.
- Check that, once the attached patch is applied, the failure goes
away.

I am trying on my side to reproduce the problem in a more reliable
way.  One thing I saw breaking in my setup is archive_command, where
it was not able to archive a segment with a simple copy, failing with
the same error as yours.

In one of those servers, do you have in pg_wal/ some files named
xlogtemp.N?  N is an integer that would be the PID of the process that
generated it.
--
Michael
From 961f9a03d4c27220c33e88402d5ef274424a0ab2 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 17 Mar 2021 07:12:35 +0900
Subject: [PATCH] Revert "Remove HAVE_WORKING_LINK"

This reverts commit aaa3aeddee51dd0058d38469907865052706a590.
---
 src/include/pg_config_manual.h|  7 +++
 src/include/storage/fd.h  |  2 +-
 src/backend/access/transam/timeline.c |  4 ++--
 src/backend/access/transam/xlog.c |  4 ++--
 src/backend/storage/file/fd.c | 21 -
 5 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index 8f3ec6bde1..966da99742 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -135,6 +135,13 @@
 #define EXEC_BACKEND
 #endif
 
+/*
+ * Define this if your operating system supports link()
+ */
+#if !defined(WIN32) && !defined(__CYGWIN__)
+#define HAVE_WORKING_LINK 1
+#endif
+
 /*
  * USE_POSIX_FADVISE controls whether Postgres will attempt to use the
  * posix_fadvise() kernel call.  Usually the automatic configure tests are
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index 8cd125d7df..2085c62b41 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -157,7 +157,7 @@ extern void fsync_fname(const char *fname, bool isdir);
 extern int	fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel);
 extern int	durable_rename(const char *oldfile, const char *newfile, int loglevel);
 extern int	durable_unlink(const char *fname, int loglevel);
-extern int	durable_rename_excl(const char *oldfile, const char *newfile, int loglevel);
+extern int	durable_link_or_rename(const char *oldfile, const char *newfile, int loglevel);
 extern void SyncDataDirectory(void);
 extern int	data_sync_elevel(int elevel);
 
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index e6a29d9a9b..27d70ff869 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -446,7 +446,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 	 * Perform the rename using link if available, paranoidly trying to avoid
 	 * overwriting an existing file (there shouldn't be one).
 	 */
-	durable_rename_excl(tmppath, path, ERROR);
+	durable_link_or_rename(tmppath, path, ERROR);
 
 	/* The history file can be archived immediately. */
 	if (XLogArchivingActive())
@@ -524,7 +524,7 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
 	 * Perform the rename using link if available, paranoidly trying to avoid
 	 * overwriting an existing file (there shouldn't be one).
 	 */
-	durable_rename_excl(tmppath, path, ERROR);
+	durable_link_or_rename(tmppath, path, ERROR);
 }
 
 /*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7daa7c43ad..82e070e431 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3624,1

Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-16 Thread Michael Paquier
Hi Andrus,

On Wed, Mar 10, 2021 at 03:20:47PM +0200, Andrus wrote:
> After re-starting postgres service problem persists.

Where are you getting the Postgres binaries from?  If we provide a patch,
could you test it?  This would require that you do your own build,
unfortunately, but having an environment where this is easily
reproducible is a key thing.
--
Michael


signature.asc
Description: PGP signature


Re: how to best remove version 10 (and keep version 9.5)

2021-03-14 Thread Michael Paquier
On Sat, Mar 13, 2021 at 12:03:04PM -0800, Adrian Klaver wrote:
> So, the 10 instance is not running and the 9.5 instance is listening on the
> default port. At this point I would leave things as they are.

Robert, you may want to know that 9.5 has been EOL'd by the community.
Just saying.
--
Michael




Re: pg_upgrade and wraparound

2021-03-12 Thread Michael Paquier
Hi Jan,

On Fri, Mar 12, 2021 at 06:13:33PM -0500, Jan Wieck wrote:
> One of the things in my way is that when using pg_resetwal to put the
> NextXID way into the future (to push the old cluster close to wraparound for
> example), the postmaster won't start because it doesn't have the pg_xact
> files for that around. Should pg_resetwal create the files in the gap
> between the old NextXID and the new one?

I think that you should add this patch to the next commit fest to
track it properly:
https://commitfest.postgresql.org/33/
--
Michael




Re: Log files polluted with permission denied error messages after every 10 seconds

2021-03-07 Thread Michael Paquier
On Sun, Mar 07, 2021 at 11:45:26AM +0200, Andrus wrote:
> Should files with .deleted extension deleted manually to save disk space ?
> May of them have dates before today.

RemoveOldXlogFiles() would discard any of those .deleted files because
they don't match a legal WAL segment name, so checkpoints are not able
to work on them even in the future.  I would avoid meddling with
anything that a running backend may still be touching, but that
should not really matter here as they are just never chosen for
deletion.  Piling up those files is not a good thing, so while you
still need to figure out what's causing those files to remain around
on your side, perhaps we should improve the situation in the backend
itself.
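For manual cleanup in the meantime, something along these lines could
work as a sketch; the data directory path and the one-day age threshold
are assumptions, so verify what you are deleting before running it:

```shell
# Sketch: remove leftover ".deleted" WAL files that checkpoints will
# never pick up, since they do not match a legal segment name.
cleanup_deleted_wal() {
    # $1: data directory; removes *.deleted files older than one day
    find "$1/pg_wal" -maxdepth 1 -type f -name '*.deleted' \
        -mtime +1 -print -delete
}
# Example invocation (path is a placeholder):
# cleanup_deleted_wal /var/lib/postgresql/13/main
```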
--
Michael




Re: Log files polluted with permission denied error messages after every 10 seconds

2021-03-07 Thread Michael Paquier
On Sat, Mar 06, 2021 at 07:53:11PM +0200, Andrus wrote:
> I changed wal_recycle to off. So checkpointer should no more try to rename
> wal files. Iit still tries to rename files. No idea way it does not use this
> setting:

On Windows, RemoveXlogFile() would still rename a given WAL segment
file with a ".deleted" suffix, with or without wal_recycle, in the case
where recycling of a WAL segment is not necessary, for example
if max_wal_size is already reached.  So this setting has no effect.

> Should chekpointer process terminated to force it to use new setting. Is it
> safe to kill it during database usage.

I don't understand what you mean here.
--
Michael




Re: Log files polluted with permission denied error messages after every 10 seconds

2021-03-05 Thread Michael Paquier
On Fri, Mar 05, 2021 at 07:36:37PM +0200, Andrus wrote:
> Then turned real-time protection off:
> 
> Problem persists. New entry is written after every 10 seconds.

On which files are those complaints?  It seems to me that you may have
more going on in this system that interacts with your data folder than
you think.

> pg_wal also contains files with .deleted extension like
> 
> 0001000500B2.deleted

These are generated on Windows when removing a past WAL segment, where
the process involves a rename followed by durable_unlink() that would
generate some LOG entries in the logs if the internal unlink() failed
(see RemoveXlogFile() in xlog.c). 
--
Michael




Re: ransomware

2021-02-01 Thread Michael Paquier
On Mon, Feb 01, 2021 at 03:38:35PM +0100, Marc Millas wrote:
> there are various ways to do those checks but I was wandering if any
> ""standard''" solution exist within postgres ecosystem, or someone do have
> any feedback on the topic.

It seems to me that you should first write down on a sheet of paper a
list of all the requirements you are trying to satisfy.  What you are
describing here is a rather general problem statement, so nobody can help
without knowing what you are trying to achieve, precisely.
--
Michael




Re: Does pg_ctl promote wait for pending WAL?

2021-01-28 Thread Michael Paquier
Hi Ishii-san,

On Fri, Jan 29, 2021 at 07:59:26AM +0100, Paul Förster wrote:
> On 29. Jan, 2021, at 03:51, Tatsuo Ishii  wrote:
>> 
>> Does anybody know whether a standby server waits for pending WAL
>> records/files while promotion is requested? I assume that no data
>> update is performed on the primary server while promotion.
>> 
>> I imagine that a standby server stops to replay WAL and promotes as
>> soon as SIGUSR1 signal is received.

To answer to your question based on the code, you can check for the
code paths calling CheckForStandbyTrigger() in xlog.c when it comes to
promotion detection in the WAL replay.  In short,
WaitForWALToBecomeAvailable() shows that after promotion is
detected by the startup process, recovery still tries to
replay as much WAL as possible from the archives or pg_wal before
failing over.

Equally, from the docs:
"Before failover, any WAL immediately available in the archive or in
pg_wal will be restored, but no attempt is made to connect to the
primary."
--
Michael




Re: Error messages on duplicate schema names

2021-01-19 Thread Michael Paquier
On Tue, Jan 19, 2021 at 05:37:51PM -0300, Alvaro Herrera wrote:
> I guess you could make the case that the CCI call should be in the
> callers where we actually loop (SetDefaultACLsInSchemas,
> RemoveRoleFromObjectACL), but it's hard to get excited about the added
> clutter.

Yeah, that matches my thoughts.  And I did not see any harm in putting
the CCI in SetDefaultACL() even for the case of pg_default_acl with
DROP OWNED BY.  So applied and backpatched.  Thanks, Alvaro and
Andrus.
--
Michael




Re: Error messages on duplicate schema names

2021-01-14 Thread Michael Paquier
On Wed, Jan 06, 2021 at 07:15:24PM +0200, Andrus wrote:
> Should duplicate schema names accepted or should their usage throw better
> error messages.

This means that we are one call of CommandCounterIncrement() short for
such queries, and similar safeguards already exist in this area for
GRANT/REVOKE.  The spot where I would put this new CCI is at the end
of SetDefaultACL(), like in the attached with a test added to
privileges.sql.

Any thoughts from others?
--
Michael
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 1a81e768ec..f3c1ca18ae 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -1365,6 +1365,9 @@ SetDefaultACL(InternalDefaultACL *iacls)
 		ReleaseSysCache(tuple);
 
 	table_close(rel, RowExclusiveLock);
+
+	/* prevent error when processing duplicate objects */
+	CommandCounterIncrement();
 }
 
 
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index 7754c20db4..5e5f98ac68 100644
--- a/src/test/regress/expected/privileges.out
+++ b/src/test/regress/expected/privileges.out
@@ -1649,7 +1649,8 @@ SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'INSERT'); -
  f
 (1 row)
 
-ALTER DEFAULT PRIVILEGES IN SCHEMA testns GRANT SELECT ON TABLES TO public;
+-- placeholder for test with duplicated schema and role names
+ALTER DEFAULT PRIVILEGES IN SCHEMA testns,testns GRANT SELECT ON TABLES TO public,public;
 SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'SELECT'); -- no
  has_table_privilege 
 -
diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
index 4911ad4add..fff76e0bd0 100644
--- a/src/test/regress/sql/privileges.sql
+++ b/src/test/regress/sql/privileges.sql
@@ -985,7 +985,8 @@ CREATE TABLE testns.acltest1 (x int);
 SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'SELECT'); -- no
 SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'INSERT'); -- no
 
-ALTER DEFAULT PRIVILEGES IN SCHEMA testns GRANT SELECT ON TABLES TO public;
+-- placeholder for test with duplicated schema and role names
+ALTER DEFAULT PRIVILEGES IN SCHEMA testns,testns GRANT SELECT ON TABLES TO public,public;
 
 SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'SELECT'); -- no
 SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'INSERT'); -- no




Re: Postgres C-API: How to get the Oid for a custom type defined in a schema outside of the current search path

2020-12-03 Thread Michael Paquier
On Thu, Dec 03, 2020 at 01:45:05PM +0100, Pavel Stehule wrote:
> When you write C extensions for Postgres, then PostgreSQL source code is
> the best source of inspiration.

One common source of inspiration for such cases is regproc.c.  For a
type, you can for example look at what to_regtype() uses for a
conversion from a name string to an OID, aka parseTypeString().
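As a quick illustration from SQL of the same resolution logic (the
schema and type names below are placeholders):

```sql
-- Resolve a schema-qualified type name to its OID without depending on
-- search_path; to_regtype() returns NULL instead of erroring when the
-- type does not exist.
SELECT to_regtype('myschema.mytype')::oid;
-- Compare with the cast, which raises an error for an unknown type:
SELECT 'myschema.mytype'::regtype::oid;
```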
--
Michael




Re: postgres-10 with FIPS

2020-12-03 Thread Michael Paquier
On Thu, Dec 03, 2020 at 05:57:04PM +0530, Aravindhan Krishnan wrote:
> Since postgres is linked against openssl we wanted to make sure we build
> postgres against the FIPS compliant openssl libraries. Does postgres
> provide a FIPS debian package that can be used. If not it would be of great
> help to help with the instructions to build the debian of postgres linked
> against the FIPS compliant openssl libraries.

There is no need for Postgres to do anything specific with FIPS at
runtime, as long as the OS takes care of enabling FIPS and that
OpenSSL is able to recognize that.  So normally, you could just use a
version of Postgres compiled with OpenSSL 1.0.2, and replace the
libraries of OpenSSL with a version that is compiled with FIPS enabled
as the APIs of OpenSSL used by Postgres are exactly the same for the
non-FIPS and FIPS cases.
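A rough way to check both sides of that on a server, sketched below; the
/proc path is a Linux-specific assumption and may not exist elsewhere:

```shell
# Sketch: report whether the kernel advertises FIPS mode.  OpenSSL's
# own build can then be checked separately with "openssl version".
fips_status() {
    if [ -r /proc/sys/crypto/fips_enabled ]; then
        cat /proc/sys/crypto/fips_enabled   # "1" means FIPS is enforced
    else
        echo "unknown"
    fi
}
```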
--
Michael




Re: meaning of (to prevent wraparound) ..??

2020-11-25 Thread Michael Paquier
On Wed, Nov 25, 2020 at 11:10:50PM -0700, Jessica Sharp wrote:
> On Wed, Nov 25, 2020 at 23:09 Atul Kumar  wrote: 
>> Thanks Jessica. Could help me out by sharing documents that can help me
>> understand “to prevent wraparound “ in simplest way, postgres doc is little
>> bit harder for a newbee like me.
>
> Absolutely. This may be helpful:
> https://youtu.be/bA1SgWn6Gv4

If you have any suggestions on how to improve this part of the docs,
please feel free:
https://www.postgresql.org/docs/devel/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

Note that due to the critical work that such vacuum jobs do, they
cannot be cancelled.
--
Michael




Re: archive file "00000001000000000000006F" has wrong size: 67118648 instead of 16777216

2020-11-24 Thread Michael Paquier
On Wed, Nov 25, 2020 at 11:01:37AM +0900, 江川潔 wrote:
> Hi,
> 
> WAL log recovery was failed on wrong log record size. Could you please
> advise me what is wrong in the setting ? Any suggestions will be highly
> appreciated.


> 2020-11-25 10:12:23.569 JST [7792] FATAL:  archive file
> "0001006F" has wrong size: 67118648 instead of 16777216

Something is fishy with your WAL archive partition, so you had better
check its state, for example by using pg_waldump on each segment
archived.  This error means that the file copied was basically 64MB in
size, while these are expected to be 16MB in this instance.
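Before reaching for pg_waldump, a quick size scan can narrow things
down; this is a sketch that assumes the default 16MB wal_segment_size
and that you point it at your archive directory:

```shell
# Sketch: print archived files whose size is not exactly 16MB
# (16777216 bytes), the default WAL segment size.
find_bad_segments() {
    # $1: archive directory
    find "$1" -maxdepth 1 -type f ! -size 16777216c -print
}
# Example invocation (path is a placeholder):
# find_bad_segments /mnt/wal_archive
```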
--
Michael




Re: initdb --data-checksums

2020-11-09 Thread Michael Paquier
On Mon, Nov 09, 2020 at 06:03:43PM +0100, Paul Förster wrote:
> indeed, it is. Have a look at:
> 
> https://www.postgresql.org/docs/12/app-pgchecksums.html
> 
> Make sure the database is cleanly shut down before doing it.

This tool is really useful with upgrades after pg_upgrade.  Please
note that there is a --progress option, so you can basically know how
long it is going to take until completion.
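A minimal sketch of that post-upgrade step; the data directory path is
a placeholder, and the cluster must be cleanly shut down before running
it:

```shell
# Sketch: enable data checksums offline on a cleanly shut-down cluster
# (pg_checksums ships with v12+).
enable_checksums() {
    # $1: data directory of a stopped cluster
    pg_checksums --enable --progress -D "$1"
}
# Example invocation (path is a placeholder):
# enable_checksums /var/lib/postgresql/12/main
```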
--
Michael




Re: PANIC: could not write to log file {} at offset {}, length {}: Invalid argument

2020-11-05 Thread Michael Paquier
On Wed, Nov 04, 2020 at 10:23:04PM -0500, Tom Lane wrote:
> The latter case would result in a LOG message "unrecognized win32 error
> code", so it would be good to know if any of those are showing up in
> the postmaster log.

Yeah.  Not sure which one it could be here:
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile

One possibility could also be ERROR_OPERATION_ABORTED, which is not in
the mapping table.  So that would map to EINVAL.

> Seems like maybe it wasn't a great idea for _dosmaperr's fallback
> errno to be something that is also a real error code.

You could say that of any fallback errno, as long as there is no LOG
entry showing that a DWORD did not map to anything in the table of
win32error.c, no?

(I have got to wonder whether it would be worth the complexity to show
more information when using _dosmaperr() on WIN32 for stuff like
elog(ERROR, "%m"), just a wild thought).
--
Michael




Re: PANIC: could not write to log file {} at offset {}, length {}: Invalid argument

2020-11-05 Thread Michael Paquier
On Thu, Nov 05, 2020 at 10:21:40AM +0100, Magnus Hagander wrote:
> The problem with AVs generally doesn't come from them opening files in
> non-share mode (I've, surprisingly enough, seen backup software that
> causes that problem for example). It might happen on scheduled scans
> for example, but the bigger problem with AV software has always been
> their filter driver software which intercepts both the open/close and
> the read/write calls an application makes and "does it's magic" on
> them before handing the actual call up to the operating system. It's
> completely independent of how the file is opened.

This one is a bit new to me.  I certainly saw my share of stat() or
open() calls failing on EPERM because of file handles taken
exclusively by external scanners, both around here and in
customer-related issues, and I did not expect that such dark magic
could be involved in a write.  It would indeed not be surprising to
see a PANIC depending on what gets done.
--
Michael




Re: PANIC: could not write to log file {} at offset {}, length {}: Invalid argument

2020-11-04 Thread Michael Paquier
On Wed, Nov 04, 2020 at 01:24:46PM +0100, Andreas Kretschmer wrote:
>> Any ideas about what is the problem? or anything else I need to check?
> 
> wild guess: Antivirus Software?

Perhaps not.  To bring more context in here, PostgreSQL opens any
files on WIN32 with shared writes and reads allowed to have an
equivalent of what we do on all *nix platforms.  Note here that the
problem comes from a WAL segment write, which is done after the file
handle is opened in shared mode.  As long as the fd is correctly
opened, any attempt by antivirus software to open the file with an
exclusive write would be blocked, no?
--
Michael




Re: PG 9.2 slave restarted - cache not impacted

2020-10-22 Thread Michael Paquier
On Fri, Oct 23, 2020 at 11:23:20AM +1300, Lucas Possamai wrote:
> I'm a bit confused about PG cache.
> 
> I have a PostgreSQL 9.2 cluster (yes, we're planning on upgrading it to 12)
> with a master and a slave database.
> 
> The application is sending all read requests to the slave, where the master
> processes the writes.
> 
> A while ago we had to restart the master database server, and the
> application was slow for a couple of days while PG cache was warmed up.
> Yesterday, we had to restart the slave database server and we did not have
> the same problem.
> 
> I would like to understand why?

Perhaps you misunderstood the effects of the OS cache and the Postgres
shared buffers?  On restart the cluster discards the shared buffers
internal to Postgres.  It does not mean that the OS cache is changed,
and it matters a lot in terms of performance.  By the way, if you want
to warm up your caches faster, an option you can consider with 9.2
(a version no longer officially supported by the community, so please
upgrade) would be to use pgfincore.
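A hedged sketch of what that warm-up could look like with pgfincore;
the extension must be installed, the relation name is a placeholder,
and on 9.4+ the in-core pg_prewarm is an alternative:

```sql
-- Pull a relation's blocks into the OS cache ahead of traffic
-- on the restarted server.
CREATE EXTENSION IF NOT EXISTS pgfincore;
SELECT * FROM pgfadvise_willneed('big_table'::regclass);
```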
--
Michael




Re: rum index supported on pg13?

2020-10-12 Thread Michael Paquier
On Mon, Oct 12, 2020 at 12:17:04PM -0500, John the Scott wrote:
> I am still new to github protocol, so i was not sure
> if asking about longer term support of rum was appropriate for
> the github issues posting.

Most of the original developers of rum are registered on this mailing
list so there would be some visibility, but I would guess that posting
an issue or a question on the actual project page does not hurt
either.

> the differences between pg12 and pg13 seem considerable.
> our internal use of rum has been spectacularly successful,
> we may be able to justify resources to fixing the compile issues with pg13,
> but the effort will be considerable.

Glad to hear that.
--
Michael




Re: rum index supported on pg13?

2020-10-11 Thread Michael Paquier
On Thu, Oct 08, 2020 at 09:29:31PM -0500, John the Scott wrote:
> will rum index from postgrespro be supported in pg13?
> numerous errors occur when compiling rum in pg13 and
> no replies from github.  the differences from pg12
> to pg13 seem to be significant
> 
>  https://github.com/postgrespro/rum

Did you ask directly this question to the authors of the extension on
the page of the project you are quoting above?
--
Michael




Re: Both type of replications from a single server?

2020-10-08 Thread Michael Paquier
On Thu, Oct 08, 2020 at 01:25:14PM +0530, Srinivasa T N wrote:
> For streaming replication, I need to set wal_level to replica in A whereas
> for logical_replication we need to set wal_level to replica in the same A
> server.  So, was wondering how to go about?

A logical replica needs wal_level = logical, a setting that also
allows to do streaming replication for a physical replica.  But the
opposite is not true, as using wal_level = replica will not work for
logical replicas.  So, assuming that you want to have both logical and
physical replicas that replicate from the same source server, you need
to set wal_level to logical on the primary server because it is a
system-wide configuration.
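A minimal sketch of that setup; the publication, subscription, and
connection string names below are all placeholders:

```sql
-- In postgresql.conf on A (restart required): wal_level = logical
-- B replicates physically as a standby via primary_conninfo.
-- C subscribes logically, for example:
-- on A:
CREATE PUBLICATION pub_all FOR ALL TABLES;
-- on C:
CREATE SUBSCRIPTION sub_from_a
    CONNECTION 'host=hostA dbname=appdb user=repuser'
    PUBLICATION pub_all;
```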
--
Michael




Re: Both type of replications from a single server?

2020-10-08 Thread Michael Paquier
On Thu, Oct 08, 2020 at 11:43:26AM +0530, Srinivasa T N wrote:
>Is it possible to have both type of replications (streaming and logical)
> from a single server?

Yes.

>If I have 3 servers A,B and C, then I want to have streaming replication
> from A to B whereas logical replication from A to C.  Is it possible?

And yes.
--
Michael




Re: PG 13 trusted extensions and pg_available_extensions

2020-09-23 Thread Michael Paquier
On Wed, Sep 23, 2020 at 03:28:45PM +, Daniel Westermann (DWE) wrote:
> I was playing a bit with trusted extensions and wondered if there is
> a reason that the "trusted" flag is not exposed in pg_available_extensions.  
> I believe that information would be quite useful so one can easily
> identify extensions that can be installed as "normal" user.

Adding the trusted flag makes sense for visibility.  There is a bit
more that we could consider though?  For example, what about
"relocatable" and "requires"? 
--
Michael




Re: Autovacuum of independent tables

2020-09-08 Thread Michael Paquier
On Tue, Sep 08, 2020 at 11:16:04AM +0300, Michael Holzman wrote:
> Autovacuum does not clean dead tuples of closed transactions in tableB
> while there is an open transaction on tableA.
> But the tables have nothing in common. They are handled by separate
> applications and there are no transactions that touch both tables
> simultaneously.
> Why does autovacuum create an artificial dependency on the tables?

This is called MVCC, which applies to a session as a whole.  The point
here is that even if your application knows that only tableA is used
by a given transaction, Postgres cannot know that, as data from
tableB could still be needed in this same transaction, so
old versions of the rows from tableB matching the snapshot held
by this long-running transaction still have to be kept around.
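To spot the sessions whose snapshots hold back that cleanup, something
like this can help; a sketch, noting that backend_xmin has been exposed
in pg_stat_activity since 9.4:

```sql
-- Sessions holding an xmin horizon, oldest transaction first; dead
-- tuples newer than the oldest backend_xmin cannot be vacuumed yet.
SELECT pid, state, backend_xmin, xact_start
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY xact_start;
```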
--
Michael




Re: When are largobject records TOASTed into pg_toast_2613?

2020-08-21 Thread Michael Paquier
On Fri, Aug 21, 2020 at 03:10:30PM +0200, Laurenz Albe wrote:
> Indeed, system tables have no TOAST tables in PostgreSQL, so I wonder
> how your "pg_largeobject" table could have grown one.

FWIW, src/include/catalog/toasting.h is giving me a list of 28 catalog
tables with a toast relation as of HEAD.
--
Michael




Re: [EXTERNAL] RE: PostgreSQL-12 replication. Check replication lag

2020-08-05 Thread Michael Paquier
On Wed, Aug 05, 2020 at 06:36:15PM +, Mariya Rampurawala wrote:
> What I want to understand is that, in case of replication link
> failure, there will still be inserts happening at the master
> node. In that case, how will the slave know if it is up-to-date?

It cannot do that by itself, which is why in common HA scenarios you
have a witness host that acts as a monitoring instance for the primary
and the standby so as you limit SPOF issues.  This depends of course
on your application requirements, but you can also monitor the
replication lag in bytes on the primary, and even couple
that with synchronous replication to check when commits are getting
blocked.
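As a sketch of that lag check, run on the primary; these are the
PostgreSQL 12 column names, so adjust for older releases:

```sql
-- Bytes of WAL generated on the primary but not yet replayed on each
-- connected standby.
SELECT application_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
```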

IMO, I think that you should try to not reinvent the wheel and use one
of the solutions provided by the community, pg_auto_failover coming
into my mind as one solution for example.
--
Michael




Re: How to rebuild index efficiently

2020-08-04 Thread Michael Paquier
On Mon, Aug 03, 2020 at 01:04:45PM -0500, Ron wrote:
> same definition, and when that is complete, drop the old index.  The
> locking that is required here is modest: CREATE INDEX CONCURRENTLY
> needs to lock the table briefly at a couple of points in the
> operation, and dropping the old index requires a brief lock on the
> table.   It is, however, much less overall lock time than REINDEX would be.
> 
> Of course, you need enough disk space... :)

A SHARE UPDATE EXCLUSIVE lock is taken during a CIC, meaning that
writes and reads are allowed on the parent table while the operation
works, but no DDLs are allowed (roughly).  The operation takes a
couple of transactions to complete, and there are two wait points
after building and validating the new index to make sure that there
are no transactions remaining around that may cause visibility issues
once the new index is ready to use and becomes valid.  So the
operation is longer, takes more resources, but it has the advantage to
be non-disruptive.
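The rebuild described above can be sketched as follows; the index and
table names are placeholders:

```sql
-- Build a replacement index without blocking writes, then swap it in.
CREATE INDEX CONCURRENTLY idx_orders_customer_new
    ON orders (customer_id);
DROP INDEX CONCURRENTLY idx_orders_customer;
ALTER INDEX idx_orders_customer_new RENAME TO idx_orders_customer;
```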
--
Michael




Re: how reliable is pg_rewind?

2020-08-02 Thread Michael Paquier
On Sat, Aug 01, 2020 at 10:35:37AM -0700, Curt Kolovson wrote:
> When trying to resync an old primary to become a new standby, I have found
> that pg_rewind only works occasionally. How reliable/robust is pg_rewind,
> and what are its limitations? We have observed that approx half our FPIs in
> the WALs are due to XLOG/FPI_FOR_HINT. The only reason we've set
> wal_log_hints=on is so that we can  use pg_rewind. But if pg_rewind is
> unreliable, we would rather turn off wal_log_hints. Any info on the
> reliability of pg_rewind and its limitations would be appreciated.

FWIW, we use it in production to accelerate the redeployment of
standbys in HA configuration for 4 years now in at least one product,
and it is present in upstream for since 9.5, for 5 years now.  So the
tool is rather baked at this stage of the game.
--
Michael



