Re: --forkbackend process
On Thu, Jun 27, 2024 at 09:19:45AM +0500, Kashif Zeeshan wrote:
> It's hard to figure out the issue by just looking at the process list; to
> figure out the issue you need to share the DB server logs, and that's the
> way to figure out the exact issue.

Note that it is equally hard for anybody reading your two emails to provide
any hints about what could be going on, because you are providing zero
details about what's happening in the server, and nobody would be able to
help you with only a screenshot of running processes. There should be at
least some logs from which you could begin an investigation, so as to know
what is happening in your system and figure out whether something is
actually wrong.

Thanks,
--
Michael

signature.asc Description: PGP signature
Re: ERROR: could not attach to dynamic shared area
On Wed, Jun 26, 2024 at 02:43:30PM +, Andrew Longwill wrote:
> We’re using PostgreSQL 15.4 in AWS RDS. Since yesterday we have seen
> two occurrences where our PHP application becomes unable to connect
> to our RDS replicas. In the application logs we see the error
> "FATAL: could not attach to dynamic shared area". The RDS replica
> remains unusable in this state unless it’s rebooted.

I suspect that you have been hit by this issue:
https://www.postgresql.org/message-id/CAO6_XqqJbJBL=m7ym13tcb4xnq58vra2jcc+gwepbgbada6...@mail.gmail.com

I committed a fix for it today, backpatched down to v15. It will show up in
the next round of minor releases.
--
Michael
Re: recovery.signal not being removed when recovery complete
On Tue, Mar 26, 2024 at 06:22:32PM -0400, Isaac Morland wrote:
> I use a script to restore a backup to create a testing copy of the
> database. I set the following in postgresql.auto.conf:
>
> recovery_target = 'immediate'
> recovery_target_action = 'promote'

Why not, after a pg_basebackup -R I assume. Would you mind sharing your
commands?

> In the logs I get "recovery stopping after reaching consistency" then a
> moment later "database system is ready to accept read-only connections",
> then some entries about restoring log files, then "database system is
> ready to accept connections".

If you have some logs, that could help as well.

> I am able to make changes (e.g. CREATE TABLE), yet recovery.signal is
> still present. My understanding is that recovery.signal should be removed
> when recovery is finished (i.e., more or less when "database system is
> ready to accept connections" is logged?), unless recovery_target_action
> is set to 'shutdown'.
>
> Any ideas? Even just confirming/denying I understand the above correctly
> would help.

Not removing the two .signal files once promotion is achieved would be a
problem, because we'd reenter recovery at a follow-up startup.
ArchiveRecoveryRequested should be set if either recovery.signal or
standby.signal was found at startup, meaning that we should see a TLI jump
at promotion, with a physical removal of both files and a LOG entry for a
"selected new timeline ID".
--
Michael
Re: support fix query_id for temp table
On Thu, Feb 01, 2024 at 07:37:32AM +, ma lz wrote:
> session 1:
> create temp table ttt ( a int );
> insert into ttt values(3);  -- query_id is XXX from pg_stat_activity
>
> session 2:
> create temp table ttt ( a int );
> insert into ttt values(3);  -- query_id is YYY from pg_stat_activity
>
> I know temp table has different oid, so query_id is different, is
> there a way to use table name for temp table instead of oid?

The CREATE TABLE statements indeed have the same query ID (in 16~), and the
inserts have different ones as they touch different schemas and relations.
That's quite an old problem that depends on the RangeVar attached to an
InsertStmt. I don't quite see a way to handle that directly except with a
custom implementation in the query jumbling for this node and its RangeVar,
so there is no "easy" way to tackle it :/
--
Michael
Re: Logical replication breaks: "unexpected duplicate for tablespace 0, relfilenode 2774069304"
On Thu, Dec 28, 2023 at 02:03:12PM +0200, Kouber Saparev wrote:
>> The first problem that we have here is that we've lost track of the
>> patch proposed, so I have added a CF entry for now:
>> https://commitfest.postgresql.org/46/4720/
>
> Thank you. Is there a bug report or should we file one? It looks like
> something that compromises the reliability of logical replication as a
> whole.

Having a CF entry means that it is already tracked, so there is no need to
do more here for the moment. The next step is to look at the proposed patch
in more detail and work on fixing the issue.
--
Michael
Re: Logical replication breaks: "unexpected duplicate for tablespace 0, relfilenode 2774069304"
On Fri, Dec 22, 2023 at 10:55:24AM +0200, Kouber Saparev wrote:
> The table for this file node is not even included in any of the
> publications we have. I've found a similar issue described [1] before, so
> I was wondering whether this patch is applied? Our subscriber database is
> PostgreSQL 16.1 and the publisher - PostgreSQL 15.4.

Or just this link using the community archives, based on the message-ID:
https://www.postgresql.org/message-id/tycpr01mb83731ade7fd7c7cf5d335bceed...@tycpr01mb8373.jpnprd01.prod.outlook.com

> What quick solution would fix the replication? Repack of the table?
> Reload of the database? Killing some backends?

There may be something you could do as a short-term solution, but it does
not solve the actual root of the problem, because the error you are seeing
is not something users should be able to face. The first problem that we
have here is that we've lost track of the proposed patch, so I have added a
CF entry for now:
https://commitfest.postgresql.org/46/4720/
--
Michael
Re: pg_checksums?
On Sun, Oct 29, 2023 at 11:49:11AM +0100, Peter J. Holzer wrote:
> On 2023-10-29 10:11:07 +0100, Paul Förster wrote:
>> On Oct 29, 2023, at 02:43, Peter J. Holzer wrote:
>>> I don't think so. AFAIK replication keeps the data files in sync on a
>>> bit-for-bit level and turning on checksums changes the data layout.
>>> Running a cluster where one node has checksums and the other doesn't
>>> would result in a complete mess.
>>
>> I agree with the last sentence. This is why I asked if it is safe to
>> enable checksums on a replica, switch over and then do it again on the
>> ex-primary, i.e. now new replica, without doing a reinit.
>
> It *might* work if there are zero writes on the primary during the
> downtime of the replica (because those writes couldn't be replicated),
> but that seems hard to ensure. Even if you could get away with making
> the primary read-only (is this even possible?) I wouldn't have much
> confidence in the result and would reinit the (new) replica anyway.

Hm? Page checksums are written when a page is flushed to disk; we don't set
them for dirty buffers or for full-page writes included in WAL, so it
should be OK to do something like the following:
- Stop the standby cleanly.
- Run pg_checksums on the standby to enable them.
- Restart the standby.
- Catch up with the latest changes.
- Stop the primary cleanly, letting the shutdown checkpoint be replicated
  to the standby.
- Promote the standby.
- Enable checksums on the previous primary.
- Start the previous primary as a standby of the node you failed over to.
--
Michael
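The sequence above can be sketched as a shell function. This is a hedged
sketch, not a vetted runbook: it assumes pg_ctl and pg_checksums are on
PATH, the data directory paths are hypothetical, the replication setup
already exists, and the catch-up wait is left as a placeholder.

```shell
# Sketch of enabling checksums across a primary/standby pair via switchover.
# All paths and the invocation at the bottom are hypothetical.
enable_checksums_with_switchover() {
  standby="$1"    # data directory of the standby
  primary="$2"    # data directory of the primary

  pg_ctl stop -D "$standby" -m fast     # stop the standby cleanly
  pg_checksums --enable -D "$standby"   # rewrite its pages with checksums
  pg_ctl start -D "$standby"            # restart, stream from the primary
  # ... wait here until the standby has replayed the latest changes ...
  pg_ctl stop -D "$primary" -m fast     # shutdown checkpoint replicates
  pg_ctl promote -D "$standby"          # failover to the standby
  pg_checksums --enable -D "$primary"   # enable on the old primary, offline
  # Reconfigure primary_conninfo, create standby.signal, then:
  pg_ctl start -D "$primary"            # old primary rejoins as a standby
}
# enable_checksums_with_switchover /var/lib/pg/standby /var/lib/pg/primary
```

The point of the ordering is that pg_checksums only runs on a cleanly
shut-down cluster, and the final shutdown checkpoint guarantees the standby
has everything before promotion.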
Re: REINDEX in tables
On Wed, Oct 25, 2023 at 11:33:11AM +0200, Andreas Kretschmer wrote:
> On 25.10.23 at 11:24, Matthias Apitz wrote:
>> We have a client who runs REINDEX on certain tables of the database of
>> our application (on Linux with PostgreSQL 13.x):
>>
>> REINDEX TABLE CONCURRENTLY d83last;
>> REINDEX TABLE CONCURRENTLY d86plz;
>> REINDEX TABLE CONCURRENTLY ig_memtable;
>> REINDEX TABLE CONCURRENTLY ig_dictionary;
>> REINDEX TABLE CONCURRENTLY ig_dictionary;
>> REINDEX TABLE CONCURRENTLY d50zweig;
>> REINDEX TABLE CONCURRENTLY d50zweig;
>>
>> We as the software vendor and support do not use or recommend this
>> procedure, because we have our own SQL files for creating or deleting
>> indexes in the around 400 tables.
>>
>> The client is now concerned about the issue that the number of
>> rows in some of the above tables has increased. Is this possible?
>
> In principle, there is nothing wrong with doing this in a maintenance
> window, for example.

If you have a maintenance window where your production server is not going
to be active, you may want to just do a more aggressive REINDEX without
CONCURRENTLY, as that's going to be cheaper and faster. CONCURRENTLY is
useful in non-maintenance cases as it allows concurrent reads and writes to
happen on the index while the operation is running. CONCURRENTLY is much
slower, of course, as it needs to wait twice for older snapshots held by
concurrent sessions: once when the index build is finished and once when
the new index gets validated.
--
Michael
Re: Change error code severity for syslog?
On Thu, Oct 12, 2023 at 09:42:47AM +0900, Abhishek Bhola wrote:
> For most of the Postgres errors, translating it to WARNING level in
> syslog works; however, I wanted to translate *ERROR codes XX000/1/2* in
> Postgres to be ERROR in syslog as well, so that it triggers the alert
> system and I can be notified.
>
> Is there any way to change the severity level for these specific error
> codes in Postgres?

You could, hypothetically, use the elog hook to dynamically switch the
error data reports to something else, depending on the types of filters you
would like to apply to them. I have an example of this kind of idea in one
of my plugin modules here:
https://github.com/michaelpq/pg_plugins/tree/main/jsonlog
--
Michael
Re: How to force "re-TOAST" after changing STORAGE or COMPRESSION?
On Tue, Oct 03, 2023 at 09:08:49AM +0200, Dominique Devienne wrote:
> In my case, it's OK not to be transactional for these experiments. Is
> there a way to lock the table and do the rewriting without generating any
> WAL? I don't have any experience with unlogged tables, but should I take
> an exclusive lock on the table, switch it to unlogged, rewrite, and
> switch it back to logged?

Switching a table back to logged requires all its 8kB blocks to be
WAL-logged, so that would be roughly the same amount of WAL as a plain
UPDATE.
--
Michael
Re: How to force "re-TOAST" after changing STORAGE or COMPRESSION?
On Tue, Oct 03, 2023 at 06:31:27AM +0200, Laurenz Albe wrote:
> On Tue, 2023-10-03 at 12:33 +1100, rob stone wrote:
>> Would running CLUSTER on the table use the new parameters for the
>> re-write?
>
> No, as far as I know.

Note that under the hood VACUUM FULL and CLUSTER use the same code paths
when doing their stuff.

> You'd need something like
>
> -- rewrite all tuples
> UPDATE tab SET id = id;
> -- get rid of the bloat
> VACUUM (FULL) tab;

I'm afraid so, and get ready for a burst of WAL that depends on the size of
your relation if you are too aggressive with the updates. You could also do
that in periodic steps.
--
Michael
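Doing the rewrite in periodic steps, as suggested, could look roughly like
the sketch below. Everything here is illustrative: the table name "tab",
its integer key "id", the key range, and the batch size are all
hypothetical, and psql connection settings are assumed to come from the
environment.

```shell
# Hypothetical batched rewrite: each slice commits on its own, spreading the
# WAL burst over time instead of emitting it in one huge transaction.
rewrite_in_batches() {
  step=100000                            # rows per batch (illustrative)
  for lo in $(seq 0 "$step" 900000); do  # key range is hypothetical
    hi=$((lo + step))
    psql -c "UPDATE tab SET id = id WHERE id >= $lo AND id < $hi"
    sleep 1                              # let checkpoints/archiving breathe
  done
  psql -c "VACUUM (FULL) tab"            # finally get rid of the bloat
}
# rewrite_in_batches   # needs a reachable server with the table in place
```

The sleep between slices is the knob that trades total duration against WAL
pressure on archiving and replication.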
Re: How to force "re-TOAST" after changing STORAGE or COMPRESSION?
On Mon, Oct 02, 2023 at 04:42:15PM +0200, Dominique Devienne wrote:
> According to the doc, the table is NOT changed.
> In my case, I DO want to have the bytea column rewritten
> according to the new STORAGE and/or COMPRESSION.
> The doc doesn't mention how to achieve that.

Yes, the compression type of the existing toasted values is not changed
after an ALTER SET, and that's not something one would do on a daily basis,
either.

> VACUUM?
> VACUUM FULL?
> Else?

VACUUM was considered as an option to force a rewrite, but it has been
removed because we were not sure that this was the correct path, or whether
forcing the hand of the user was incorrect in some cases. It was also
creating a penalty in some of the hot loops of this area:

commit: dbab0c07e5ba1f19a991da2d72972a8fe9a41bda
committer: Michael Paquier
date: Mon, 14 Jun 2021 09:25:50 +0900
Remove forced toast recompression in VACUUM FULL/CLUSTER

Related thread:
https://postgr.es/m/20210527003144.xxqppojoiwurc...@alap3.anarazel.de
--
Michael
Re: backup_manifest rename to backup_manifest.old after successful postgres start up
On Tue, Jul 18, 2023 at 01:53:23AM +0200, Gert Cuykens wrote:
> Hi, I suggest automatically renaming backup_manifest to
> backup_manifest.old, like backup_label, when postgres starts up
> successfully, because it has no use anymore for pg_verifybackup after
> postgres has been started.

Yes, I would agree with you that it is not especially useful to keep it
around once the cluster has been recovered from a base backup. It would
actually lead to various errors if attempting to run pg_verifybackup on the
data folder, for instance.
--
Michael
Re: Query on Primary_conninfo
On Wed, Jul 26, 2023 at 05:33:50PM +0530, Praneel Devisetty wrote:
> Standby is not picking up this change even after a reload.
>
> Restarting the standby works, but as PG13 supports a reload of
> primary_conninfo, is there a reliable way to achieve this without
> restarting the standby database?

primary_conninfo should be able to switch dynamically the upstream server
it streams WAL from on a reload. I've quickly checked with a simple
configuration, and I was able to get it done without a restart here.
Perhaps you have found a bug, but it's hard to say without more details
about what's happening. For example, what do the logs of the standby tell
you? Are you sure that the reload was done on the correct node? Did you
check with a SHOW command that the new value reflected on your standby is
what you want it to be?
--
Michael
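The checks suggested above could be run on the standby along these lines.
This is a sketch under assumptions: the connection string value and data
directory are hypothetical, and psql is assumed to connect to the standby.

```shell
# Change primary_conninfo on the standby, reload, and confirm the new value.
check_conninfo_reload() {
  datadir="$1"   # standby data directory (hypothetical path)
  psql -c "ALTER SYSTEM SET primary_conninfo = 'host=10.0.0.2 port=5432 user=replicator'"
  pg_ctl reload -D "$datadir"       # SIGHUP only, no restart
  psql -c "SHOW primary_conninfo"   # should report the new value
  # Then check the standby's log: the WAL receiver should restart and
  # reconnect to the new upstream after the reload.
}
# check_conninfo_reload /var/lib/pg/standby
```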
Re: fsync data directory after DB crash
On Tue, Jul 18, 2023 at 04:50:25PM +0800, Pandora wrote:
> I found that starting from version 9.5, PostgreSQL will do fsync on
> the entire data directory after a DB crash. Here's a question: if I
> have FPW = on, why is this step still necessary?

Yes, see around the call of SyncDataDirectory() in xlog.c:

 * - There might be data which we had written, intending to fsync it, but
 *   which we had not actually fsync'd yet. Therefore, a power failure in
 *   the near future might cause earlier unflushed writes to be lost, even
 *   though more recent data written to disk from here on would be
 *   persisted. To avoid that, fsync the entire data directory.
--
Michael
Re: ECPG Semantic Analysis
On Fri, Jun 23, 2023 at 12:21:48AM -0400, Juan Rodrigo Alejandro Burgos Mella wrote:
> I have a modified version of ECPG, to which I gave the ability to do
> semantic analysis of SQL statements. Where can I share it, or with whom
> can I discuss it?

I cannot say what kind of problem this solves and/or whether this would be
useful as a feature of the ECPG driver in PostgreSQL core itself, but you
could consider submitting a patch for integration into core. See for
example here:
https://wiki.postgresql.org/wiki/Submitting_a_Patch

Note that things are pretty old-school around here, as all the development
discussions happen by email, mostly on the pgsql-hackers mailing list.
--
Michael
Re: FIPS-related Error: Password Must Be at Least 112 Bits on Postgres 14, Unlike in Postgres 11
On Thu, Jun 22, 2023 at 07:16:21PM +0530, Abhishek Dasgupta wrote:
> I am puzzled as to why this error occurs only with PostgreSQL 14 and not
> with PostgreSQL 11.

This error is specific to the Postgres JDBC driver, which relies on its own
application layer for FIPS and SCRAM because it speaks the protocol
directly and has no dependency on libpq. Are there any specific failures
you are seeing in the PostgreSQL backend that you find confusing?
--
Michael
Re: "PANIC: could not open critical system index 2662" - twice
On Mon, May 08, 2023 at 06:04:23PM -0400, Tom Lane wrote:
> Andres seems to think it's a problem with aborting a DROP DATABASE.
> Adding more data might serve to make the window wider, perhaps.

And the odds indeed get much better once I use these two toys:

CREATE OR REPLACE FUNCTION create_tables(num_tables int)
RETURNS VOID AS
$func$
BEGIN
  FOR i IN 1..num_tables LOOP
    EXECUTE format('CREATE TABLE IF NOT EXISTS %I (id int)', 't_' || i);
  END LOOP;
END
$func$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION drop_tables(num_tables int)
RETURNS VOID AS
$func$
BEGIN
  FOR i IN 1..num_tables LOOP
    EXECUTE format('DROP TABLE IF EXISTS %I', 't_' || i);
  END LOOP;
END
$func$ LANGUAGE plpgsql;

And then use this loop with the others I have mentioned upthread (just
create origindb and the functions in it):

while true; do
  psql -c 'select create_tables(1000)' origindb > /dev/null 2>&1
  psql testdb -c "select 1" > /dev/null 2>&1
  psql -c 'select drop_tables(1000)' origindb > /dev/null 2>&1
  psql testdb -c "select 1" > /dev/null 2>&1
done

On top of that, I have also been able to reproduce the issue on HEAD, and
luckily some pg_class files remain around, full of zeros:

$ hexdump ./base/199634/1259
000

The contents of 2662, though, looked OK. Echoing Alvaro: could we, err,
revisit the choice of making WAL_LOG the default strategy, even for this
set of minor releases? FWIW, I've mentioned that this choice was too
aggressive in the thread of 8a86618.
--
Michael
Re: "PANIC: could not open critical system index 2662" - twice
On Mon, May 08, 2023 at 07:15:20PM +0530, Dilip Kumar wrote:
> I am able to reproduce this using the steps given above, and I am also
> trying to analyze this further. I will send an update once I get some
> clue.

Have you been able to reproduce this on HEAD or at the top of
REL_15_STABLE, or is that 15.2-ish without fa5dd46? One thing I was
wondering about to improve the odds of a hit is to be more aggressive with
the number of relations created at once, so as to be much more aggressive
with the number of pages extended in pg_class from the origin database.
--
Michael
Re: "PANIC: could not open critical system index 2662" - twice
On Mon, May 08, 2023 at 02:46:37PM +1200, Thomas Munro wrote:
> That sounds like good news, but I'm still confused: do you see all 0s
> in the target database (popo)'s catalogs, as reported (and if so can
> you explain how they got there?), or is it regression that is
> corrupted in more subtle ways also involving 1s?

Nope, I have not been able to confirm that yet without 8a86618. The test
runs at a high frequency, so that's kind of hard to catch. I have not been
able to get things into a state where I could look at an FPW for pg_class
or its index either, in a way similar to Evgeny.
--
Michael
Re: "PANIC: could not open critical system index 2662" - twice
On Sun, May 07, 2023 at 10:30:52PM +1200, Thomas Munro wrote:
> Bug-in-PostgreSQL explanations could include that we forgot it was
> dirty, or some backend wrote it out to the wrong file; but if we were
> forgetting something like permanent or dirty, would there be a more
> systematic failure? Oh, it could require special rare timing if it is
> similar to 8a8661828's confusion about permanence level or otherwise
> somehow not setting BM_PERMANENT, but in the target blocks, so I think
> that'd require a checkpoint AND a crash. It doesn't reproduce for me,
> but perhaps more unlucky ingredients are needed.
>
> Bug-in-OS/FS explanations could include that a whole lot of writes
> were mysteriously lost in some time window, so all those files still
> contain the zeroes we write first in smgrextend(). I guess this
> previously rare (previously limited to hash indexes?) use of sparse
> file hole-punching could be a factor in an it's-all-ZFS's-fault
> explanation:

Yes, you would need a bit of all that. I can reproduce the same backtrace
here. That's just my usual laptop with ext4, so this would be a Postgres
bug.
First, here are the four things running in parallel so that I can get a
failure in loading a critical index when connecting:

1) Create and drop a database with WAL_LOG as strategy and the regression
database as template:

while true; do
  createdb --template=regression --strategy=wal_log testdb
  dropdb testdb
done

2) Feed more data to pg_class in the middle, while testing the connection
to the database created:

while true; do
  psql -c 'create table popo as select 1 as a;' regression > /dev/null 2>&1
  psql testdb -c "select 1" > /dev/null 2>&1
  psql -c 'drop table popo' regression > /dev/null 2>&1
  psql testdb -c "select 1" > /dev/null 2>&1
done

3) Force some checkpoints:

while true; do psql -c 'checkpoint' > /dev/null 2>&1; sleep 4; done

4) Force a few crashes and recoveries:

while true; do pg_ctl stop -m immediate; pg_ctl start; sleep 4; done

And here is one backtrace:

(gdb) bt
#0  __pthread_kill_implementation (threadid=, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x7f8a395cad2f in __pthread_kill_internal (signo=6, threadid=) at ./nptl/pthread_kill.c:78
#2  0x7f8a3957bef2 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x7f8a39566472 in __GI_abort () at ./stdlib/abort.c:79
#4  0x55f0b36b95ac in errfinish (filename=0x55f0b38d4f68 "relcache.c", lineno=4335, funcname=0x55f0b38d6500 <__func__.12> "load_critical_index") at elog.c:604
#5  0x55f0b36a9c64 in load_critical_index (indexoid=2662, heapoid=1259) at relcache.c:4335
#6  0x55f0b36a9712 in RelationCacheInitializePhase3 () at relcache.c:4110

(gdb) up 5
#5  0x55f0b36a9c64 in load_critical_index (indexoid=2662, heapoid=1259) at relcache.c:4335
4335    elog(PANIC, "could not open critical system index %u", indexoid);

You can also get failures with btree deduplication because of the CTAS
running above, by the way, but that's just the tip of the iceberg once the
state is corrupted:

#0  __pthread_kill_implementation (threadid=, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x7fcae38abd2f in __pthread_kill_internal (signo=6, threadid=) at ./nptl/pthread_kill.c:78
#2  0x7fcae385cef2 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x7fcae3847472 in __GI_abort () at ./stdlib/abort.c:79
#4  0x5556a809f584 in ExceptionalCondition (conditionName=0x5556a812b639 "false", fileName=0x5556a812abc8 "heapam.c", lineNumber=7882) at assert.c:66
#5  0x5556a79e13db in index_delete_sort_cmp (deltid1=0x5556a9682158, deltid2=0x7ffdb62c7088) at heapam.c:7882
#6  0x5556a79e14f6 in index_delete_sort (delstate=0x7ffdb62c8350) at heapam.c:7923
#7  0x5556a79e0cd0 in heap_index_delete_tuples (rel=0x7fcae0f1a9a8, delstate=0x7ffdb62c8350) at heapam.c:7580
#8  0x5556a7a07be4 in table_index_delete_tuples (rel=0x7fcae0f1a9a8, delstate=0x7ffdb62c8350) at ../../../../src/include/access/tableam.h:1337
#9  0x5556a7a0a7a3 in _bt_delitems_delete_check (rel=0x7fcae0f231c8, buf=179, heapRel=0x7fcae0f1a9a8, delstate=0x7ffdb62c8350) at nbtpage.c:1541
#10 0x5556a7a07545 in _bt_simpledel_pass (rel=0x7fcae0f231c8, buffer=179, heapRel=0x7fcae0f1a9a8, deletable=0x7ffdb62c8430, ndeletable=160, newitem=0x5556a9689260,

Anyway, you should be able to see that with b3e184a (pretty much the same
as 15.2), without 8a86618 included. Once 8a86618 is included, all these
steps are stable in the backend, at least here. Or do we have some
low-hanging fruit with the WAL_LOG strategy? That could always be possible,
of course, but this looks like the same issue to me, just with a different
symptom showing up.
--
Michael
Re: pg_basebackup / recovery
On Wed, Apr 12, 2023 at 01:45:56PM +0300, Achilleas Mantzios - cloud wrote:
> On 4/12/23 12:32, Fabrice Chapuis wrote:
>> During the recovery process of a self-contained backup, how does
>> postgres know to stop reading WAL when consistency is reached?
>
> Because it knows the full backup info. It will observe the
>
> STOP WAL LOCATION: 3BC7/4B000130 (file 00023BC7004B)
>
> inside the backup file.

There is a bit more to that in the recovery logic, depending mostly on the
presence of a backup_label file in the data folder when recovery begins.
Once the backup_label is found at the beginning of recovery, its
information is stored in the control file and the file is renamed to
backup_label.old, so stopping the server before recovery has reached its
expected point relies on the control file contents later on. Then, the
startup process and its WAL redo make sure that WAL replays until it finds
the WAL record marking the end of the backup. Grepping for XLOG_BACKUP_END
(a WAL record type in this case) shows all the areas that rely on that, and
xlogrecovery.c covers the most relevant bits.
--
Michael
Re: "PANIC: could not open critical system index 2662" - twice
On Tue, Apr 11, 2023 at 04:44:54PM +, Evgeny Morozov wrote:
> We have data_checksums=on. (It must be on by default, since I cannot
> find that in our config files anywhere.)

initdb does not enable checksums by default, requiring -k/--data-checksums,
so this addition likely comes from your environment.

> However, the docs say "Only data pages are protected by checksums;
> internal data structures and temporary files are not.", so I guess
> pg_class_oid_index might be an "internal data structure"?

pg_class_oid_index is a btree index that relies on 8kB on-disk pages
(default size), so it is subject to the same rules as normal relations
regarding checksums for the pages flushed to disk, even if it is on a
catalog.
--
Michael
Re: "PANIC: could not open critical system index 2662" - twice
On Fri, Apr 07, 2023 at 01:04:34PM +0200, Laurenz Albe wrote:
> On Thu, 2023-04-06 at 16:41 +, Evgeny Morozov wrote:
>> Could this be a PG bug?
>
> It could be, but data corruption caused by bad hardware is much more
> likely.

There is no way to be completely sure here, unless we were able to put our
hands on a reproducible test case that breaks the cluster badly enough to
get into this state. I don't recall seeing this error pattern in recent
history, though.
--
Michael
Re: Binding Postgres to port 0 for testing
On Sun, Mar 26, 2023 at 10:49:33PM -0600, Markus Pilman wrote:
> I somehow didn't consider looking at the postgres tests, though it makes
> sense that they need to solve this problem. If I read the perl code
> correctly, though, it seems that this could, in theory, cause a race? The
> script first checks whether the port has been assigned to a test, then
> binds a socket to check whether it is used by someone else, closes this
> test socket, and then starts a server process. I guess it's unlikely
> enough, but isn't there a risk that some other process (that isn't
> controlled by this perl script) binds to the found port right after this
> test bind but right before postgres calls bind? I guess it should be
> rare enough that it wouldn't cause flaky tests.

In theory, yes, I recall that's possible in the scripts, but only on
Windows, where we cannot use socket directories and rely on SSPI
authentication while binding all the nodes to listen on 127.0.0.1. On all
the other platforms, each test creates its own directory that is used as
the path for unix_socket_directories. The umask is 0700 and the directory
is owned by the OS user running the perl tests, so it is as isolated as it
can be. Using the same port should actually be fine as far as I recall, as
long as the unix socket paths are different.
--
Michael
Re: Binding Postgres to port 0 for testing
On Sat, Mar 25, 2023 at 11:01:33AM -0600, Markus Pilman wrote:
> Now the problem is that I need to find a TCP port for each running
> postgres instance. There are multiple ways to do this, but by far the
> easiest one I know is to bind to port 0. So my plan was to start postgres
> with "-p 0" and then parse stdout to figure out which port it actually
> uses. But that doesn't seem to work.

Note that you can find some inspiration about that in the code tree within
src/test/perl/PostgreSQL/Test/Cluster.pm, particularly get_free_port(),
where we have accumulated a couple of years of experience in designing
something rather safe, even if it comes with its own limits. It is in perl,
so perhaps you could just reuse it, or translate it to a different
language, rather than reinvent the wheel. And seeing your first message
with the requirements you list, this does what you are looking for:
- Create an empty cluster.
- Freely create databases, tablespaces, queries, etc.
- Wipe out the whole thing.

The test cases around src/test/recovery/t/ could be a good starting point
as well.
--
Michael
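For reference, the bind-to-port-0 trick under discussion is easy to
demonstrate outside postgres. This sketch (using python3 for the socket
call) shows the kernel handing out a free TCP port, which is essentially
the step a test harness then races to reuse:

```shell
# Ask the kernel for a free TCP port by binding to port 0, then print it.
free_port=$(python3 -c '
import socket
s = socket.socket()
s.bind(("127.0.0.1", 0))   # port 0: the OS picks any free port
print(s.getsockname()[1])
s.close()
')
echo "free port: $free_port"
# The race get_free_port() mitigates: another process may grab this port
# between the close() above and the moment the server actually binds it.
```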
Re: Libpq linked to LibreSSL
On Sun, Dec 04, 2022 at 09:02:07AM +0100, Marco Bambini wrote:
> After several attempts (and runtime crashes), I am asking for help
> with how to compile libpq with LibreSSL support (both dynamic and
> static links would be OK for me).
> I know how to compile libpq with OpenSSL support, but I need to
> force it to link to LibreSSL.

LibreSSL is rather close to OpenSSL in terms of API compatibility as far as
I recall, OpenBSD using it. Note that we have two buildfarm members that I
guess are doing so (adding Mikael in CC to confirm): morepork and plover.
The configure option --with-openssl would be the one to use, but you'd
better make sure that CFLAGS points at the path(s) of the LibreSSL headers
and that LDFLAGS points to the LibreSSL libraries if you have an
environment mixing both. Then at run time, you could force LD_LIBRARY_PATH
to load the right thing. There may be trickier things, like building with
OpenLDAP: Postgres built with --with-ldap could link to a version of
OpenLDAP linking to OpenSSL, while you'd want LibreSSL for full
consistency, so be careful. The same could happen with Python, but without
knowing what kind of environment you are using it is hard to reach a clear
conclusion without at least more details.
--
Michael
Re: Q: pg_hba.conf separate database names file format
On Wed, Nov 09, 2022 at 04:02:43AM -0600, Ron wrote:
> Are these "include" files supposed to solve the problem of having a *lot*
> of databases (or users) that you want to allow access to?

Yes, splitting out the list of users and databases eases the maintenance
and readability of pg_hba.conf, as each HBA entry could get quite long
depending on the connection policy you want. My take would be to use one
entry per line in an @ file in this case.
--
Michael
Re: Q: pg_hba.conf separate database names file format
On Tue, Nov 08, 2022 at 02:16:03PM +0100, Albrecht Dreß wrote:
> However, I could not find a specification of the format for this
> file… It appears as if simply giving each database name on a
> separate line does the job. Is this correct? May the file contain
> comments (i.e. lines starting with “#”) or empty lines? May the
> file be specified including a path (e.g. “@/some/path/databases”)?

I have been playing with this code for the last couple of days, and the
answer is that you can use an absolute path or a relative path. In the case
of a relative path, the code considers the base directory to be the parent
directory of the file it is included in. For example, /data/pg/pg_hba.conf
including @databases.conf resolves to /data/pg/databases.conf, and
@conf/databases.conf resolves to /data/pg/conf/databases.conf.

The parsing of these files uses the same rules as pg_hba.conf and
pg_ident.conf, so you can specify a list of user names separated by commas
or even spaces, or put one name per line. Comments beginning with '#' are
ignored.

If you want to play with your file and see the results, I would recommend
tweaking the files and then looking at the contents generated in the system
view pg_hba_file_rules. Querying pg_hba_file_rules loads the configuration
files directly from disk, so there is no need to reload or restart the
server to see the effects of any modifications.

The documentation already has some description of this, which you may have
missed:
https://www.postgresql.org/docs/15/auth-pg-hba-conf.html

"Files included by @ constructs are read as lists of names, which can be
separated by either whitespace or commas. Comments are introduced by #,
just as in pg_hba.conf, and nested @ constructs are allowed. Unless the
file name following @ is an absolute path, it is taken to be relative to
the directory containing the referencing file."
--
Michael
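Putting the rules above together, a minimal illustrative setup could look
like this (all database names, users, and addresses are hypothetical):

```
# /data/pg/pg_hba.conf -- one HBA entry pulling its database list from a file
# TYPE  DATABASE         USER      ADDRESS       METHOD
host    @databases.conf  app_user  10.0.0.0/24   scram-sha-256

# /data/pg/databases.conf -- resolved relative to pg_hba.conf's directory
db_sales
db_hr, db_finance    # commas, whitespace, or newlines all separate names
```

After editing, SELECT * FROM pg_hba_file_rules; shows how the entry expands
without reloading the server.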
Re: empty pg_stat_progress_vacuum
On Fri, Oct 21, 2022 at 10:21:23PM +, senor wrote: > I'm mainly wanting to understand why I'm not seeing processes in > pg_stat_progress_vacuum. If I rapidly refresh I occasionally see an > entry for a very small table. A manually started vacuum didn't show > up either. It may be that the run is short enough that it did not get captured, as pg_stat_progress_vacuum is a snapshot of the current point in time. > Pg version 11.4 Hard to say, but I think that you should update your binaries, at least. 11.4 was released in June 2019, and the latest release available is 11.17, meaning that you are missing more than three years' worth of bug fixes. Based on the roadmap in [1], 11.18 should be out next week. [1]: https://www.postgresql.org/developer/roadmap/ -- Michael signature.asc Description: PGP signature
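For what it's worth, short-lived entries are easier to catch by polling the view from psql; a sketch (columns as documented for PostgreSQL 11):

```sql
-- Re-run every second with psql's \watch; only vacuums running at the
-- moment of the query will appear, so long vacuums of large tables are
-- the ones most likely to show up.
SELECT pid, datname, relid::regclass AS relation, phase,
       heap_blks_scanned, heap_blks_total
FROM pg_stat_progress_vacuum;
\watch 1
```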
Re: Zheap Tech Problem
On Fri, Oct 14, 2022 at 07:53:17PM -0700, Adrian Klaver wrote: > On 10/14/22 18:59, jacktby wrote: >> What's Zheap tech? Can you give me some details or stuff to study? and >> which version will be realized in ? There are a few videos on YouTube that can provide some insight into all that, mainly from EDB: https://www.youtube.com/watch?v=ZbdWOuTTWrw https://www.youtube.com/watch?v=7jVz4TTB5-4 The community wiki has a page with a few references to the code of the project: https://wiki.postgresql.org/wiki/Zheap -- Michael signature.asc Description: PGP signature
Re: Incorrect resource manager data checksum in record with zfs and compression
On Mon, Oct 03, 2022 at 12:41:23PM -0700, John Bolliger wrote: > Our architecture is similar but all of the servers are now on ZFS now and > Postgres 13.8 with Ubuntu 18.04+ and still doing streaming replication, all > with ECC memory and 26-64 cores with 192gb ram+ on top of a ZPOOL made out > of NVMe PCIe SSDs. > > A101 (primary) -> A201 (replica) -> B101(primary) -> B201 (replica). replica -> primary does not really make sense for physical replication. Or do you mean that B101 is itself a standby doing streaming from A201? -- Michael signature.asc Description: PGP signature
Re: Streaming wal from primiry terminating
On Thu, Sep 29, 2022 at 01:04:08PM +, Lahnov, Igor wrote: > Primary_conninfo is set in postgresql.conf > The reason is definitely something else, we do not know what. This is fatal > in this case. Which version of PostgreSQL are you using? There have been a few commits around continuation records lately, so it is hard to say if you are pointing at a new issue or at a past one already addressed. -- Michael signature.asc Description: PGP signature
Re: PostgreSQL 14.4 ERROR: out of memory issues
On Mon, Jul 11, 2022 at 10:50:23AM +0200, Aleš Zelený wrote: > So far, it has happened three times (during a single week) from the 14.3 -> > 14.4 upgrade, before 14.4 we haven't suffered from such an issue. > > Questions: > 1) Can we safely downgrade from 14.4 to 14.3 by shutting down the instance > and reinstalling 14.3 PG packages (to prove, that the issue disappear)? > 2) What is the best way to diagnose what is the root cause? Hmm. 14.4 has nothing in its release notes that would point to a change in the vacuum or autovacuum code paths: https://www.postgresql.org/docs/14/release-14-4.html#id-1.11.6.5.4 Nothing specific stands out after a look at the changes between the two versions, and I am not seeing anything that would imply a change in memory context handling either: `git log --stat REL_14_3..REL_14_4` `git diff REL_14_3..REL_14_4 -- *.c` That said, you should be able to downgrade safely, as there are no changes in the WAL format or the like that would break things. Note, however, that the index corruption issue caused by CREATE INDEX/REINDEX CONCURRENTLY, which 14.4 fixes, is something you would still have to face on 14.3. > 2022-07-02 14:48:07 CEST [3930]: [3-1] user=,db=,host=,app= ERROR: out of > memory > 2022-07-02 14:48:07 CEST [3930]: [4-1] user=,db=,host=,app= DETAIL: Failed > on request of size 152094068 in memory context "TopTransactionContext". > 2022-07-02 14:48:07 CEST [3930]: [5-1] user=,db=,host=,app= CONTEXT: > automatic vacuum of table "prematch.replication.tab_queue_tmp" This is the interesting part. Do you happen to use logical replication in a custom C++ plugin? 
> 2022-07-02 14:48:47 CEST [4476]: [43-1] user=,db=,host=,app= LOG: could > not fork worker process: Cannot allocate memory > terminate called after throwing an instance of 'std::bad_alloc' > what(): std::bad_alloc > > DETAIL: parameters: $1 = '1', $2 = '1748010445', $3 = '0', $4 = '1000' > terminate > called after throwing an instance of 'std::bad_alloc' terminate called > after throwing an instance of 'std::bad_alloc' what(): what(): > std::bad_allocstd::bad_alloc 2022-07-08 14:54:23 CEST [4476]: [49-1] > user=,db=,host=,app= LOG: background worker "parallel worker" (PID 25251) > was terminated by signal 6: Aborted > 2022-07-08 14:54:23 CEST [4476]: [51-1] > user=,db=,host=,app= LOG: > terminating any other active server processes Looks like something is going wrong in the memory handling of one of your C++ extensions here. If you can isolate an issue using a query without any custom code, that would be a Postgres problem, but my guess is that the problem lies in the custom code. -- Michael signature.asc Description: PGP signature
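As a side note for diagnosing this class of report on 14, the system view pg_backend_memory_contexts (new in PostgreSQL 14) shows where memory accumulates in a live session; a sketch:

```sql
-- Largest memory contexts of the backend running this query; run it
-- from the suspect session to see which context keeps growing.
SELECT name, parent, used_bytes, total_bytes
FROM pg_backend_memory_contexts
ORDER BY total_bytes DESC
LIMIT 10;
```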
Re: Postgresql error : PANIC: could not locate a valid checkpoint record
On Fri, Jun 24, 2022 at 01:03:57PM +, Mahendrakar, Prabhakar - Dell Team wrote: > Is it possible to explicitly issue a checkpoint before we move on to > the pg_upgrade command? > so that in the circumstances of the Upgrade issues (like PANIC: > could not locate a valid checkpoint record), we still have this > last explicit checkpoint available. > > Please let us know your thoughts on this. Well, you have mentioned the use of pg_upgrade, but you are giving zero details about the command you used, how you handled the clusters before and after the upgrade, or what kind of environment is involved. With so few details, nobody will be able to guess what's happening. This issue could also be caused by the environment. For example, it is possible in some carelessly set-up environments that a flush is issued, recognized as completed by the OS, and thought to be complete by Postgres, while a layer between the OS and the actual hardware (be it an OS, FS, disk or VM-related thing) never actually performed the flush, which would make this issue reachable. -- Michael signature.asc Description: PGP signature
Re: multiple entries for synchronous_standby_names
On Fri, Jun 10, 2022 at 05:04:30PM +0100, Nitesh Nathani wrote: > Trying to achieve sync streaming to barman server and i need to add an > entry to postgresql.conf for this parameter, which already has an entry and > tried a few variations but does not work. Any ideas? Also tried '&&' but in > vain > > synchronous_standby_names='ANY 1 (*)',barman-wal-archive This grammar flavor is not supported (see also syncrep_gram.y for the code): https://www.postgresql.org/docs/devel/runtime-config-replication.html And here is the actual list of supported syntaxes: [FIRST] num_sync ( standby_name [, ...] ) ANY num_sync ( standby_name [, ...] ) standby_name [, ...] In short, you can specify a list of node names within one ANY or FIRST clause, but you cannot specify a list made of ANY/FIRST items. Without knowing what kind of priority policy you are trying to achieve, it is hard to recommend one method over the others. What we support now has proven to be hard enough to implement and to make robust, and supporting sub-groups of nodes was also on the table back in the day, but the lack of use cases did not justify the extra implementation complexity, as far as I recall. -- Michael signature.asc Description: PGP signature
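To make the supported forms above concrete, here are examples of valid settings (standby names are illustrative); each line is one complete setting on its own, and ANY/FIRST groups cannot be combined in a list:

```
# postgresql.conf
# Names with characters other than letters, digits and underscores
# (such as '-') must be double-quoted inside the value.
synchronous_standby_names = 'ANY 1 ("barman-wal-archive", standby1)'
# synchronous_standby_names = 'FIRST 2 (standby1, standby2, standby3)'
# synchronous_standby_names = 'standby1, standby2'
```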
Re: What do you guys use for issue tracking, CI/CD and team management? any nice open source options?
On Thu, Apr 14, 2022 at 06:19:44PM +0300, Achilleas Mantzios wrote: > What issue/bug tracking is PostgreSQL itself using? > What continuous build system (CI/CD) is PostgreSQL itself using? > Any tool that you ppl or the PostgreSQL infrastructure use that > links people/committers with bugs/issues, automatically closes > issues upon commit/push, manages issues/projects/people ? The community does not have a bug tracker, everything is mail-based as of today. There is what we call the commit fest app: https://commitfest.postgresql.org/ This is more of a patch tracker for the development of new features, though there is a category for published patches that aim at fixing bugs (see "Bug Fixes"). This does not act as a bug tracker where one could see all the live issues reported. -- Michael signature.asc Description: PGP signature
Re: Can you install/run postgresql on a FIPS enabled host?
On Mon, Mar 21, 2022 at 06:33:29PM -0400, Tom Lane wrote: > It sounds like something thinks that scram-sha-256 encryption is > disallowed by FIPS. That may or may not be accurate. If it's > supposed to be allowed, you'd need to poke a little harder to > narrow down where the problem is. > > (Digging in our commit logs, it looks like version 14.2 has some > changes that might make this work better than it did in older > versions; but I can't tell from the log messages whether the > issue being fixed was new-in-14 or not.) I guess that 3a0cced is the commit you are talking about here. Please note that it has been reverted in ad5b6f2 due to ABI concerns with some of the MD5 hashing routines. -- Michael signature.asc Description: PGP signature
Re: could not open relation with OID
On Wed, Jan 26, 2022 at 05:30:01PM -0800, Ben Chobot wrote: > We do a lot of queries per day, over a lot of hosts, all of which are on > 12.9. We've recently started doing a better job at analyzing our db logs and > have found that, a few times a day, every day, we see some of our queries > fail with errors like: > > could not open relation with OID 201940279 > > In the cases we've examined so far, the failed query succeeds just fine when > we run it manually. The failed query also had run on an async streaming > replica, and the primary has completed at least one autovacuum since the > failure. I don't know if either of those two facts are relevant, but I'm not > sure what else to blame. The internet seems to want to blame issues like > this on temp tables, which makes sense, but in our case, most of the queries > that are failing this way are simple PK scans, which then fall back to the > table to pull all the columns. The tables themselves are small in row count > - although some values are likely TOASTed - so I would be surprised if > anything is spilling to disk for sorting, which might have counted as a temp > table enough to give such an error. Do those OIDs point to some specific relations? It should be easy enough to guess which pg_class entry they point to, especially if you have a persistent schema, and whether these are indeed temporary entries or not, based on their pg_class.relnamespace. > This is a minuscule failure percentage, so replicating it is going to be > hard, but it is still breaking for reasons I don't understand, and so I'd > like to fix it. Has anybody else seen this, or have an ideas of what to look > at? I don't recall seeing such reports recently. > Other things we've considered: > - we run pg_repack, which certainly seems like it could make an error > like this, but we see this error in places and times that pg_repack isn't > currently running It could also take time for the issue to show up, depending on the state of the relcache. 
-- Michael signature.asc Description: PGP signature
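A quick sketch of the lookup suggested above, using the OID from the reported error:

```sql
-- If this returns a row, the relation still exists; a pg_temp_N value
-- in the namespace column would point at a temporary table.
SELECT oid, relname, relnamespace::regnamespace AS namespace, relkind
FROM pg_class
WHERE oid = 201940279;
```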
Re: How are md5.h: pg_md5_hash() function and below functions working?
On Sat, Jan 08, 2022 at 08:12:50AM -0800, Adrian Klaver wrote: > On 1/8/22 05:21, Ali Koca wrote: >> I can't understand functions in md5.h, these are seemingly little bit >> weird. Such as: >> /* Utilities common to all the MD5 implementations, >> as of md5_common.c */ >> extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum); >> extern bool pg_md5_binary(const void *buff, size_t len, void *outbuf); >> extern bool pg_md5_encrypt(const char *passwd, const char *salt, >> size_t salt_len, char *buf); > > What they do is explained in md5_common.c. pg_md5_hash() and pg_md5_binary() do the same thing, by computing an MD5. pg_md5_hash() computes it as a string in hex format, meaning a 32-character hex string plus the trailing '\0', 33 bytes in total. pg_md5_binary() computes 16 raw bytes. pg_md5_encrypt() is a utility wrapper that does the operation of pg_md5_hash() based on a password and a salt, used at authentication time for MD5. Its result is a 36-byte string: the prefix "md5" followed by the 32-character hex digest and a trailing '\0'. Those routine names are historic. -- Michael signature.asc Description: PGP signature
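As a rough illustration of the formats described above — plain Python mimicking the behavior, not the C routines themselves; for MD5 password storage, PostgreSQL historically uses the role name as the salt:

```python
import hashlib

def md5_hex(data: bytes) -> str:
    # What pg_md5_hash() produces: 32 hex characters
    # (33 bytes in C, counting the trailing '\0').
    return hashlib.md5(data).hexdigest()

def md5_encrypt(password: str, salt: str) -> str:
    # Rough equivalent of pg_md5_encrypt(): MD5 of password || salt,
    # prefixed with "md5" -- 35 characters (36 bytes with the '\0').
    return "md5" + md5_hex((password + salt).encode())

print(len(md5_hex(b"abc")))                # -> 32
print(md5_encrypt("secret", "alice")[:3])  # -> md5
```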
Re: recording of INDEX creation in tables
On Fri, Jan 21, 2022 at 01:38:59PM +0100, Matthias Apitz wrote: > Does the PostgreSQL (11.4 or 13.1) record somewhere in system tables > the creation of INDEXes (or other objects)? Hard to say what you are looking for with such a general question. Would pg_index or pg_indexes be enough? There are a bunch of system catalogs and views, like what's in information_schema, that provide this information. If you are looking for a record of the timestamp when objects have been created, there is no such tracking in Postgres itself, but you could use an event trigger for this purpose. Here are some links: https://www.postgresql.org/docs/devel/catalog-pg-index.html https://www.postgresql.org/docs/devel/view-pg-indexes.html https://www.postgresql.org/docs/devel/event-triggers.html -- Michael signature.asc Description: PGP signature
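A minimal sketch of the event-trigger approach mentioned above (all names are illustrative):

```sql
-- Record the creation time and identity of indexes as they are created.
CREATE TABLE index_history (
    created_at timestamptz DEFAULT now(),
    identity   text
);

CREATE FUNCTION log_index_creation() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        INSERT INTO index_history (identity) VALUES (r.object_identity);
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER track_index_creation ON ddl_command_end
    WHEN TAG IN ('CREATE INDEX')
    EXECUTE FUNCTION log_index_creation();
```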
Re: could not accept SSL connection: Success
On Wed, Jan 19, 2022 at 08:06:30PM -0500, Tom Lane wrote: > Michael Paquier writes: > > Leaving things in their current state is fine by me. Would it be > > better to add a note about the business with 3.0 though? > > What do you envision saying? "We don't need to do anything here > for 3.0" doesn't seem helpful. Nope, but the idea would be to keep around a note that we may want to revisit this area of the code based on the state of upstream, because our code is currently shaped based on problems that OpenSSL has dealt with. I am not completely sure, but something along the lines of: "OpenSSL 1.1.1 and older versions return nothing on an unexpected EOF, and errno may not be set. 3.0 reports SSL_ERROR_SSL with a meaningful error set on the stack, so this could be reworked once support for older versions is removed." Perhaps that's just nannyism on my side; this is really minor in the end. -- Michael signature.asc Description: PGP signature
Re: could not accept SSL connection: Success
On Wed, Jan 19, 2022 at 07:58:43PM -0500, Tom Lane wrote: > Personally I'm satisfied to leave it as-is, since this issue apparently > occurs only in a minority of OpenSSL versions, and not the newest. Leaving things in their current state is fine by me. Would it be better to add a note about the business with 3.0 though? My gut is telling me that we'd better revisit those code paths in a couple of years when support for legacy OpenSSL is removed, and most likely we would have forgotten about all those details. -- Michael signature.asc Description: PGP signature
Re: could not accept SSL connection: Success
On Thu, Jan 20, 2022 at 09:05:35AM +1300, Thomas Munro wrote: > Good news, I'm glad they nailed that down. I recall that this > behaviour was a bit of a moving target in earlier versions: > > https://www.postgresql.org/message-id/CAEepm%3D3cc5wYv%3DX4Nzy7VOUkdHBiJs9bpLzqtqJWxdDUp5DiPQ%40mail.gmail.com Ahh.. So you saw the same problem a couple of years back. This thread did not catch much attention. I don't think that it makes much sense to leave this unchecked as the message is confusing as it stands. Perhaps we could do something like the attached by adding a note about OpenSSL 3.0 to revisit this code once we unplug support for 1.1.1 and avoiding the errno==0 case? The frontend has its own ideas of socket failures as it requires thread support, making things different with the backend, but it seems to me that we could see cases where SOCK_ERRNO is also 0. That's mostly what you suggested on the other thread. The error handling changes are really cosmetic, so I'd rather leave the back-branches out of that. Thoughts? -- Michael diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index 3d0168a369..12ddc6f82f 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -493,7 +493,13 @@ aloop: WAIT_EVENT_SSL_OPEN_SERVER); goto aloop; case SSL_ERROR_SYSCALL: -if (r < 0) +/* + * OpenSSL 1.1.1 and older versions return nothing on + * an unexpected EOF, and errno may not be set. Note that + * in OpenSSL 3.0, an unexpected EOF would result in + * SSL_ERROR_SSL with a meaningful error set on the stack. 
+ */ +if (r < 0 && errno != 0) ereport(COMMERROR, (errcode_for_socket_access(), errmsg("could not accept SSL connection: %m"))); diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c index 9f735ba437..b47ea87b43 100644 --- a/src/interfaces/libpq/fe-secure-openssl.c +++ b/src/interfaces/libpq/fe-secure-openssl.c @@ -314,6 +314,12 @@ pgtls_write(PGconn *conn, const void *ptr, size_t len) n = 0; break; case SSL_ERROR_SYSCALL: + /* + * OpenSSL 1.1.1 and older versions return nothing on + * an unexpected EOF, and errno may not be set. Note that + * in OpenSSL 3.0, an unexpected EOF would result in + * SSL_ERROR_SSL with a meaningful error set on the stack. + */ if (n < 0) { result_errno = SOCK_ERRNO; @@ -322,11 +328,14 @@ pgtls_write(PGconn *conn, const void *ptr, size_t len) libpq_gettext("server closed the connection unexpectedly\n" "\tThis probably means the server terminated abnormally\n" "\tbefore or while processing the request.\n")); -else +else if (result_errno != 0) appendPQExpBuffer(>errorMessage, libpq_gettext("SSL SYSCALL error: %s\n"), SOCK_STRERROR(result_errno, sebuf, sizeof(sebuf))); +else + appendPQExpBuffer(>errorMessage, + libpq_gettext("SSL SYSCALL error: internal failure\n")); } else { signature.asc Description: PGP signature
Re: could not accept SSL connection: Success
On Mon, Jan 17, 2022 at 05:05:52PM +0100, Carla Iriberri wrote: > I saw previous discussions where different errors were logged with the > "Success" > message and this was corrected/treated as a bug, but I couldn't find similar > reports specific to "could not accept SSL connection". Is this a known > issue or > case? Not based on my recent mailing list memories, but I may be running short. The error comes from the backend as you say, where this log would expect something in saved_errno to feed %m. And the upstream documentation says: https://www.openssl.org/docs/manmaster/man3/SSL_get_error.html "On an unexpected EOF, versions before OpenSSL 3.0 returned SSL_ERROR_SYSCALL, nothing was added to the error stack, and errno was 0. Since OpenSSL 3.0 the returned error is SSL_ERROR_SSL with a meaningful error on the error stack." This would mean that relying on %m would be wrong for this case. And I guess that you are using a version of OpenSSL older than 3.0? -- Michael signature.asc Description: PGP signature
Re: md5 issues Postgres14 on OL7
On Sat, Jan 08, 2022 at 02:00:16PM -0500, Tom Lane wrote: > This is looking pretty solid to me. Just a couple of nitpicks: > > * In most places you initialize variables holding error strings to NULL: > > + const char *logdetail = NULL; > > but there are three or so spots that don't, eg PerformRadiusTransaction. > They should be consistent. (Even if there's no actual bug, I'd be > unsurprised to see Coverity carp about the inconsistency.) Hmm. I have spotted five of them, with one in passwordcheck. > * The comments for md5_crypt_verify and plain_crypt_verify claim that > the error string is "optionally" stored, but I don't see anything > optional about it. To me, "optional" would imply coding like > > if (logdetail) > *logdetail = errstr; > > which we don't have here, and I don't think we need it. But the > comments need adjustment. (They were wrong before too, but no > time like the present to clean them up.) Makes sense. > * I'd be inclined to just drop the existing comments like > > - * We do not bother setting logdetail for any pg_md5_encrypt failure > - * below: the only possible error is out-of-memory, which is unlikely, > and > - * if it did happen adding a psprintf call would only make things worse. > > rather than modify them. Neither the premise nor the conclusion > of these comments is accurate anymore. (I think that the psprintf > they are talking about is the one that will happen inside elog.c > to construct an errdetail string. Yeah, there's some risk there, > but I think it's minimal because of the fact that we preallocate > some space in ErrorContext.) Okay, that's fine by me. > Other than those things, I think v3 is good to go. I have done an extra pass on all that, and the result seemed fine to me, so applied. I have changed the non-OpenSSL code path of pgcrypto to deal with that in 14 (does not exist on HEAD). Thanks a lot for the successive reviews! The patch was invasive enough, but we could do more here: - Add the same facility for HMAC. 
That's not worth it on REL_14_STABLE based on the existing set of callers, but I'd like to do something about it on HEAD, as that could be helpful in the future. - The error areas related to checksum_helper.c and backup_manifest.c could be improved more. For now these refer only to scenarios unlikely to happen in the field, so I have left that out. -- Michael signature.asc Description: PGP signature
Re: md5 issues Postgres14 on OL7
On Fri, Jan 07, 2022 at 05:40:09PM -0500, Tom Lane wrote: > Hm, you still have cast-away-const in md5_crypt_verify and > plain_crypt_verify. Can we adjust their APIs to make them > return const char * as well (and then their API spec is that > the caller must never free the string, rather than being > vague about it)? Okay. Hmm. This requires a couple of extra const markers in the area of authentication and SASL for the backend, but not much actually. I first thought that it would be more invasive. > The other thing that bothers me slightly is that it looks like > some code paths could end up passing a NULL string pointer to > ereport or sprintf, since you don't positively guarantee that > an error will return a string there. I suppose this is safe > since 3779ac62d, but I don't really want to start making API > specs depend on it. Sounds fair to me in the long term, even for non-assert builds. I have added a small-ish wrapper routine in cryptohash_openssl.c for this purpose, with a name copied from {be,fe}-secure-openssl.c to ease any future refactoring lookups if that proves worthwhile in the future. 
-- Michael From 0e7ceca421ab309ce64d8f6fea7a714da08df56c Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Sat, 8 Jan 2022 14:56:39 +0900 Subject: [PATCH v3] Improve error reporting for cryptohashes --- src/include/common/cryptohash.h | 1 + src/include/common/md5.h | 9 ++- src/include/libpq/crypt.h | 7 +- src/include/libpq/sasl.h | 4 +- src/backend/commands/user.c | 4 +- src/backend/libpq/auth-sasl.c | 2 +- src/backend/libpq/auth-scram.c| 5 +- src/backend/libpq/auth.c | 33 src/backend/libpq/crypt.c | 46 ++- src/backend/replication/backup_manifest.c | 9 ++- src/backend/utils/adt/cryptohashfuncs.c | 25 +++--- src/common/cryptohash.c | 57 +- src/common/cryptohash_openssl.c | 93 +++ src/common/md5_common.c | 20 +++-- src/interfaces/libpq/fe-auth.c| 22 +- contrib/uuid-ossp/uuid-ossp.c | 18 +++-- 16 files changed, 282 insertions(+), 73 deletions(-) diff --git a/src/include/common/cryptohash.h b/src/include/common/cryptohash.h index ea1300d5d4..8e339c83ad 100644 --- a/src/include/common/cryptohash.h +++ b/src/include/common/cryptohash.h @@ -34,5 +34,6 @@ extern int pg_cryptohash_init(pg_cryptohash_ctx *ctx); extern int pg_cryptohash_update(pg_cryptohash_ctx *ctx, const uint8 *data, size_t len); extern int pg_cryptohash_final(pg_cryptohash_ctx *ctx, uint8 *dest, size_t len); extern void pg_cryptohash_free(pg_cryptohash_ctx *ctx); +extern const char *pg_cryptohash_error(pg_cryptohash_ctx *ctx); #endif /* PG_CRYPTOHASH_H */ diff --git a/src/include/common/md5.h b/src/include/common/md5.h index bbd2ec0165..942ca4242c 100644 --- a/src/include/common/md5.h +++ b/src/include/common/md5.h @@ -26,9 +26,12 @@ #define MD5_PASSWD_LEN 35 /* Utilities common to all the MD5 implementations, as of md5_common.c */ -extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum); -extern bool pg_md5_binary(const void *buff, size_t len, void *outbuf); +extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum, + const char **errstr); +extern bool pg_md5_binary(const void 
*buff, size_t len, void *outbuf, + const char **errstr); extern bool pg_md5_encrypt(const char *passwd, const char *salt, - size_t salt_len, char *buf); + size_t salt_len, char *buf, + const char **errstr); #endif /* PG_MD5_H */ diff --git a/src/include/libpq/crypt.h b/src/include/libpq/crypt.h index ee60772e94..3238cf66d3 100644 --- a/src/include/libpq/crypt.h +++ b/src/include/libpq/crypt.h @@ -35,12 +35,13 @@ extern PasswordType get_password_type(const char *shadow_pass); extern char *encrypt_password(PasswordType target_type, const char *role, const char *password); -extern char *get_role_password(const char *role, char **logdetail); +extern char *get_role_password(const char *role, const char **logdetail); extern int md5_crypt_verify(const char *role, const char *shadow_pass, const char *client_pass, const char *md5_salt, - int md5_salt_len, char **logdetail); + int md5_salt_len, const char **logdetail); extern int plain_crypt_verify(const char *role, const char *shadow_pass, - const char *client_pass, char **logdetail); + const char *client_pass, + const char **logdetail); #endif diff --git a/src/include/libpq/sasl.h b/src/include/libpq/sasl.h index 7ba3f5f5bc..71cc0dc251 100644 --- a/src/include/libpq/sasl.h +++ b/src/include/libpq/sasl.h @@ -126,11 +126,11 @@ typedef struct pg_be_sasl_mech int (*exchange) (void *state, const char *input, int inputlen, char **outp
Re: md5 issues Postgres14 on OL7
On Thu, Jan 06, 2022 at 11:40:04AM -0500, Tom Lane wrote: > 1. It draws a cast-away-const warning. We'd have to make the result > of pg_cryptohash_error be "const char *", which would be better > practice anyway, but that propagates into some other APIs and I didn't > take the trouble to chase it to the end. Yeah. I wanted to switch all those routines to use a const here anyway, just did not have the time to tackle that yesterday with the rest of the issues I could think about. Looking at that today, I don't see any problems in switching to const in all those places, so done this way (two places in crypt.c are more picky, though, for logdetail). > 2. It feels a bit bogus to be fetching ERR_get_error() at this point. > Maybe it's all right to assume that the OpenSSL error stack won't > change state before we get to pg_cryptohash_error, but I don't like > the idea much. I think it'd be better to capture ERR_get_error() > sooner and store it in an additional field in pg_cryptohash_ctx. Right, I forgot about the ERR_get_error() piece of it. Thanks! I'd also rather have the check done just after the OpenSSL call. If hash computations are split across multiple code paths, this could lead to issues. We don't have any problems currently in the core code, but I see no reason to not make that safer in the long run. And the structure is flexible enough, so that's not an issue. > Also, I wonder if this shouldn't be unified with the SSLerrmessage() > support found in be-secure-openssl.c and fe-secure-openssl.c. Guess so. HEAD could be poked at for this part. I recall looking at that once but that did not seem worth the complications. I have also looked at the ABI part of the patch. I cannot spot any public trace of pg_md5_hash() and pg_md5_binary(). pgbouncer and pgpool2 each have a copy of pg_md5_encrypt(), but they just share the API name with PG, nothing more. So that looks reasonably safe to change. 
The last thing I have on my notes is to assign logdetail in md5_crypt_verify() and plain_crypt_verify() to feed back a LOG entry to the postmaster on those failures, and saw that it is safe to assign directly the error returned by the cryptohash APIs, avoiding the extra psprintf call that could become an issue under memory pressure. What do you think? -- Michael From 62ecfb1a77a58f6e056822e3a7a0b3052fcf9e46 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Fri, 7 Jan 2022 11:08:17 +0900 Subject: [PATCH v2] Improve error reporting for cryptohashes --- src/include/common/cryptohash.h | 1 + src/include/common/md5.h | 9 ++- src/backend/libpq/auth.c | 14 ++-- src/backend/libpq/crypt.c | 40 +++- src/backend/replication/backup_manifest.c | 9 ++- src/backend/utils/adt/cryptohashfuncs.c | 25 --- src/common/cryptohash.c | 57 +++- src/common/cryptohash_openssl.c | 80 +++ src/common/md5_common.c | 20 -- src/interfaces/libpq/fe-auth.c| 22 +-- contrib/uuid-ossp/uuid-ossp.c | 18 +++-- 11 files changed, 244 insertions(+), 51 deletions(-) diff --git a/src/include/common/cryptohash.h b/src/include/common/cryptohash.h index 541dc844c8..c62c350d57 100644 --- a/src/include/common/cryptohash.h +++ b/src/include/common/cryptohash.h @@ -34,5 +34,6 @@ extern int pg_cryptohash_init(pg_cryptohash_ctx *ctx); extern int pg_cryptohash_update(pg_cryptohash_ctx *ctx, const uint8 *data, size_t len); extern int pg_cryptohash_final(pg_cryptohash_ctx *ctx, uint8 *dest, size_t len); extern void pg_cryptohash_free(pg_cryptohash_ctx *ctx); +extern const char *pg_cryptohash_error(pg_cryptohash_ctx *ctx); #endif /* PG_CRYPTOHASH_H */ diff --git a/src/include/common/md5.h b/src/include/common/md5.h index 62a31e6ed4..cc43fc6606 100644 --- a/src/include/common/md5.h +++ b/src/include/common/md5.h @@ -26,9 +26,12 @@ #define MD5_PASSWD_LEN 35 /* Utilities common to all the MD5 implementations, as of md5_common.c */ -extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum); -extern bool 
pg_md5_binary(const void *buff, size_t len, void *outbuf); +extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum, + const char **errstr); +extern bool pg_md5_binary(const void *buff, size_t len, void *outbuf, + const char **errstr); extern bool pg_md5_encrypt(const char *passwd, const char *salt, - size_t salt_len, char *buf); + size_t salt_len, char *buf, + const char **errstr); #endif /* PG_MD5_H */ diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c index 7bcf52523b..eea933f41e 100644 --- a/src/backend/libpq/auth.c +++ b/src/backend/libpq/auth.c @@ -3085,6 +3085,8 @@ PerformRadiusTransaction(const char *server, const char *secret, const char *por md5trailer = packet->vector; for (i = 0; i < encryptedpassword
Re: md5 issues Postgres14 on OL7
On Wed, Jan 05, 2022 at 04:09:12PM +0900, Michael Paquier wrote: > In order to make things portable with 14 in cryptohash.c, we don't > have any need to change the existing cryptohash APIs. We could just > store in each implementation context a location to a static string, > and add a new routine to extract it if there is an error, defaulting > to OOM. I have been looking at that, and finished with the attached. It is close to the end of the day, so this needs an extra lookup, but I have finished by using the idea of an extra routine, called pg_cryptohash_error(), able to grab the error saved in the private contexts, so that any callers from the backend or the frontend can consume it. This way, it is possible to distinguish between several classes of errors: OOMs, a too-short destination buffer, an OpenSSL internal error, etc. There are a couple of things worth noting here: - Two code paths of src/backend/libpq/crypt.c rely on the result of pg_md5_encrypt() to always be an OOM, which skips one psprintf(). This has been left as-is in the backend for now, but we don't pfree() the *logdetail strings passed across the various layers, so we could just pass down the cryptohash error as-is. We'd better mention that logdetail may not be palloc'd all the time, once we do that. libpq is able to use that properly. - The routines of md5_common.c need to pass down an extra *errstr for their respective callers. That's an ABI breakage but I'd like to think that nobody uses that out-of-core. (I need to double-check this part, as well). - HMAC (hmac_openssl.c and hmac.c) could use the same infra, but I did not see a use for that yet. It is possible to compile HMACs with MD5s, but we don't have any in-core callers, and failures are just part of the SCRAM workflow with dedicated error messages. 
I am still not sure about the FIPS part, as per the argument of OpenSSL using something different in 3.0.0, and Fedora that patches upstream in its own way, but this could be extended in cryptohash_openssl.c to provide even more context. For now, this allows reports about OpenSSL internal failures, taking care of the disturbance. Thoughts? -- Michael From 064ef97fd2a171485ff408550380515493ce0072 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Thu, 6 Jan 2022 17:13:55 +0900 Subject: [PATCH] Improve error reporting for cryptohashes --- src/include/common/cryptohash.h | 1 + src/include/common/md5.h | 8 ++- src/backend/libpq/auth.c | 14 +++-- src/backend/libpq/crypt.c | 20 --- src/backend/replication/backup_manifest.c | 9 ++- src/backend/utils/adt/cryptohashfuncs.c | 25 ++--- src/common/cryptohash.c | 57 ++- src/common/cryptohash_openssl.c | 68 +++ src/common/md5_common.c | 20 +-- src/interfaces/libpq/fe-auth.c| 22 ++-- contrib/uuid-ossp/uuid-ossp.c | 18 -- 11 files changed, 219 insertions(+), 43 deletions(-) diff --git a/src/include/common/cryptohash.h b/src/include/common/cryptohash.h index 541dc844c8..61d74e1195 100644 --- a/src/include/common/cryptohash.h +++ b/src/include/common/cryptohash.h @@ -34,5 +34,6 @@ extern int pg_cryptohash_init(pg_cryptohash_ctx *ctx); extern int pg_cryptohash_update(pg_cryptohash_ctx *ctx, const uint8 *data, size_t len); extern int pg_cryptohash_final(pg_cryptohash_ctx *ctx, uint8 *dest, size_t len); extern void pg_cryptohash_free(pg_cryptohash_ctx *ctx); +extern char *pg_cryptohash_error(pg_cryptohash_ctx *ctx); #endif /* PG_CRYPTOHASH_H */ diff --git a/src/include/common/md5.h b/src/include/common/md5.h index 62a31e6ed4..1db2855d95 100644 --- a/src/include/common/md5.h +++ b/src/include/common/md5.h @@ -26,9 +26,11 @@ #define MD5_PASSWD_LEN 35 /* Utilities common to all the MD5 implementations, as of md5_common.c */ -extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum); -extern bool pg_md5_binary(const void *buff, 
size_t len, void *outbuf); +extern bool pg_md5_hash(const void *buff, size_t len, char *hexsum, + char **errstr); +extern bool pg_md5_binary(const void *buff, size_t len, void *outbuf, + char **errstr); extern bool pg_md5_encrypt(const char *passwd, const char *salt, - size_t salt_len, char *buf); + size_t salt_len, char *buf, char **errstr); #endif /* PG_MD5_H */ diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c index 7bcf52523b..9f118ec546 100644 --- a/src/backend/libpq/auth.c +++ b/src/backend/libpq/auth.c @@ -3085,6 +3085,8 @@ PerformRadiusTransaction(const char *server, const char *secret, const char *por md5trailer = packet->vector; for (i = 0; i < encryptedpasswordlen; i += RADIUS_VECTOR_LENGTH) { + char *errstr = NULL; + memcpy(cryptvector + strlen(secret), md5trailer, RADIUS_VECTOR_LENGTH); /* @@ -3093,10 +3095,12 @@ PerformRadiusT
Re: md5 issues Postgres14 on OL7
On Wed, Jan 05, 2022 at 01:08:53AM -0500, Tom Lane wrote: > I think it's very important that the error message in this case > mention "FIPS mode" explicitly. Otherwise, people will have no > idea that that's where the problem originates, and they'll be > frustrated and we'll get bug reports. (They may be frustrated > anyway, but it was their choice, or their corporate policy's > choice, to cut off their access to MD5. Not our place to dodge > that decision.) I am not completely sure how to detect that in 1.1.1 in the context of Fedora, and portability may become a tricky thing. FIPS_mode() and FIPS_mode_set() are legacy APIs that should not be used, and upstream just disables them in 1.1.1. [... digs a bit ...] Ugh. Fedora patches upstream's 1.1.1 to check and react to /proc/sys/crypto/fips_enabled. Their code is here, see particularly 0009-Add-Kernel-FIPS-mode-flag-support.patch: https://src.fedoraproject.org/rpms/openssl.git So that's why you are able to use it with 1.1.1. Well, we could do something similar to that, but in 3.0.0 things are done very differently: one has to set fips=yes in an alg_section and point at a fips = fips_sect provider section in the OpenSSL configuration to load the FIPS provider. Providing more error context is going to be hairy here. In order to make things portable with 14 in cryptohash.c, we don't have any need to change the existing cryptohash APIs. We could just store in each implementation context a location to a static string, and add a new routine to extract it if there is an error, defaulting to OOM. -- Michael signature.asc Description: PGP signature
Re: md5 issues Postgres14 on OL7
On Tue, Jan 04, 2022 at 12:54:35PM -0500, Tom Lane wrote: > I reproduced this on Fedora 35 with FIPS mode enabled. The problem > is that OpenSSL treats MD5 as a disallowed cipher type under FIPS > mode, so this call in pg_cryptohash_init fails: Is that 3.0.0 or 1.1.1? I can see the following, telling that Fedora 35 uses OpenSSL 1.1.1: https://packages.fedoraproject.org/pkgs/openssl/openssl/ And there is FIPS 2.0 for 1.0.1 and 1.0.2 (funnily, I recall that 1.0.2+FIPS allows MD5 to work with the EVP routines), but the module of FIPS 3.0 does not work with 1.1.1 AFAIK, so I may be confused. Or perhaps this is OpenSSL 1.1.1 with a separate module? The upstream code does nothing special with EVP_DigestInit_ex() in 1.1.1 (see digest.c), contrary to 3.0.0 where there is specific knowledge of FIPS. > status = EVP_DigestInit_ex(ctx->evpctx, EVP_md5(), NULL); > > and then we come back to this in md5_text(): > > /* get the hash result */ > if (pg_md5_hash(VARDATA_ANY(in_text), len, hexsum) == false) > ereport(ERROR, > (errcode(ERRCODE_OUT_OF_MEMORY), > errmsg("out of memory"))); > > So there's nothing actually misbehaving, but our error reportage sucks: > the hash functions have no way to report a specific failure code, > and the caller(s) think the only possible failure mode is OOM. Indeed, this error is a pilot error with the cryptohash integration of 14. In ~13, the custom MD5 implementation would only fail on OOM, but more failure modes are possible now. > I suppose we could get around the error by using our own MD5 code > even in OpenSSL-enabled builds, but that'd violate both the spirit > and the letter of FIPS certification. I don't think we should go back to an integration of our custom MD5 if we have external libraries that provide support for it. That's against the set of FIPS policies. > I think the right response is > to upgrade the error-reporting API in this area, so that the message > could look more like "MD5 is disallowed in FIPS mode". Hmm. 
I am not sure how much we should try to make the backend, or even the frontend, FIPS-aware (remember for example 0182438, where we avoided this kind of complexity), and not all the SSL libraries we could potentially add support for may care about such error states. The cleanest approach may be to extend the APIs to store an **errstr, so that implementations are free to assign the error string they want to send back for error reporting, rather than adding more error codes. If we want to improve things in this area with FIPS (that is, allow check-world to pass in this case), we would need more alternate test output, and extra tweaks for the authentication tests as well. Perhaps the best thing to do in the long term would be to drop MD5, but we are not there yet IMO, even if the password_encryption default has changed, and upgrade scenarios get hairy. At the end, I agree that we should improve the error message in these two cases. However, I would stick to simplicity by not assuming that those two code paths fail only on OOM, and reword things in md5_text() and md5_bytea() with a simple "could not compute MD5 hash". Any code paths calling the routines of md5_common.c have done just that for ages when the computation fails, and that's what we care about here. -- Michael signature.asc Description: PGP signature
Re: md5 issues Postgres14 on OL7
On Mon, Dec 20, 2021 at 03:22:31PM +0100, Christoph Moench-Tegeder wrote: > Active FIPS mode (/proc/sys/crypto/fips_enabled => 1) on the server does > produce this behaviour. Most likely, this is a build linked with OpenSSL? The way MD5 hashes are computed in Postgres has largely changed in 14, and the code has been refactored so that we rely on the EVP APIs from OpenSSL when building with --with-ssl=openssl, with the direct consequence of allocating a bit more memory every time a hash is computed. My guess is that this comes from pg_cryptohash_create() in cryptohash_openssl.c, with a complaint coming from OpenSSL's EVP_MD_CTX_create(), but there are other palloc() calls in this area as well. -- Michael signature.asc Description: PGP signature
Re: Reindex "locked" standby database
On Wed, Dec 15, 2021 at 12:15:27AM -0300, Martín Fernández wrote: > The reindex went fine in the primary database and in one of our > standby. The other standby that we also operate for some reason > ended up in a state where all transactions were locked by the WAL > process and the WAL process was not able to make any progress. In > order to solve this issue we had to move traffic from the “bad” > standby to the healthy one and then kill all transactions that were > running in the “bad” standby. After that, replication was able to > resume successfully. You are referring to the startup process that replays WAL, right? Without knowing the type of workload your primary and/or standbys are facing, or the configuration you are using on both (hot_standby_feedback, for one), I have no direct answer, but that could be a conflict caused by a concurrent vacuum. Seeing where things got stuck could also be useful, perhaps with a backtrace of the area where it happens and some information around it. > I’m just trying to understand what could have caused this issue. I > was not able to identify any queries in the standby that would be > locking the WAL process. Any insight would be more than welcome! That's not going to be easy without more information, I am afraid. -- Michael signature.asc Description: PGP signature
Re: log shipping with pg_receivewal
On Tue, Dec 14, 2021 at 05:25:04AM +0100, Marc Millas wrote: > My question: as the synchronous option is supposed to make pg_receivewal > write transaction immediately in the wal files, is there a way to ask the > standby to apply them on the fly ie. without waiting a wal file change ? Nope, there is no such option. First, pg_receivewal strongly relies on the rename of a WAL segment to its final name once it is completed. If there is any failure, we would start back at the beginning of the last partial segment found. That said, you could do things by using a restore_command that looks for a .partial file directly in the path used by pg_receivewal to store the archives, and Postgres would repeatedly ask for a segment until it is out. But really, this is going to be expensive, as a restore_command would copy a full file each time, so that's a lot of bandwidth wasted for nothing. Streaming replication would likely be your best, and cheapest, option here. -- Michael signature.asc Description: PGP signature
Re: split postgresql logfile
On Mon, Nov 29, 2021 at 12:12:11PM +0100, Paul van Rixel wrote: > Now the postgresql.logs are polluted with log connections/disconnection, > Agent monitoring entries and in between some entries which need some > attention but you missed them. I know the option exists to disable > log_(dis)connections but because of auditing this option must be enabled. > We don't use a tool like Grafana yet and with pgBadger we are experimenting > but skipping "unnessary" lines makes the manually reading of the logs much > more clear. The logging collector is an all-or-nothing operation, so your options when using it are limited when it comes to filtering, I am afraid. If you wish to split the log entries, something that you could consider is using the elog hook to redirect those entries into a dedicated file, but that would also mean being careful about concurrency, as you cannot send those entries to the logging collector, which serializes the logs it receives into whatever log file is suited depending on the server configuration (.csv, .log, etc.). Another option for more filtering flexibility is syslog, although, contrary to the logging collector, which is designed to never lose messages, it could drop messages that cannot be written :/ For auditing compliance, the latter is never a good option. (FWIW, another thing that's been asked for across the years is to have JSON as a logging format. There is a patch floating around to add this support, but that won't apply to already-released versions.) -- Michael signature.asc Description: PGP signature
Re: Pause streaming replication
On Wed, Nov 10, 2021 at 08:36:45PM -0500, Rita wrote: > Yes, I have read the manual and seen this. It pauses the replication > (select pg_is_wal_replay_paused()). But on the primary, when I look at > pg_stat_replication, it still says 'streaming' in the state column. My > question was how do I get it from 'streaming' to anything else? ( > https://www.postgresql.org/docs/11/monitoring-stats.html#PG-STAT-REPLICATION-VIEW) > ? pg_wal_replay_pause() pauses WAL *replay* at recovery, but it does not stop the stream of WAL from the primary to its standbys. > I want to trigger an alert when 'streaming' isn't there. Or is there a > better way to do it? If you want an automated test to check for a given state of the replication, you could use a combination of a SIGSTOP on the WAL receiver of the standby and/or the WAL sender of the primary, with some pg_terminate_backend() calls, just to throw one idea in the bucket. -- Michael signature.asc Description: PGP signature
Re: Force re-compression with lz4
On Mon, Oct 18, 2021 at 08:01:04AM -0700, Adrian Klaver wrote: > Not sure how much this applies to the Postgres usage of lz4. As I understand > it, this is only used internally for table compression. When using pg_dump > compression gzip is used. Unless you pipe plain text output through some > other program. More precisely, LZ4 applies to the compression of toastable values in 14~. In 15~, we can already use it for WAL, i.e. the compression of full-page writes. It is worth noting that there are extra patches floating around to add more LZ4 pluggability to pg_dump (I think this has not been published yet), pg_receivewal (published) and base backups through the replication protocol (published). I have seen rather good numbers when it came to WAL, FWIW: even if the compression ratio was a bit lower than with pglz, it was much faster. -- Michael signature.asc Description: PGP signature
Re: Force re-compression with lz4
On Mon, Oct 18, 2021 at 09:57:11AM +0300, Florents Tselai wrote: > Oh, that’s good to know then. So besides ALTER COMPRESSION for > future inserts there’s not much one can do for pre-existing values The posting style of the mailing list is to not top-post, so if you could avoid breaking the logic of the thread, that would be nice :) > I think it makes sense to update/ add more info to the docs on this > as well, since other people in the thread expected this to work that > way too. There is some documentation, as changing the compression for an existing table is part of ALTER TABLE: https://www.postgresql.org/docs/current/sql-altertable.html "This does not cause the table to be rewritten, so existing data may still be compressed with other compression methods. If the table is restored with pg_restore, then all values are rewritten with the configured compression method." > Maybe at some point, even allow an explicit option to be defined during > VACUUM ? That's a part where we disagreed, as it should not be VACUUM's job to do that. The option would have a limited impact, as it targets users doing a one-time operation, most likely as part of an upgrade, so I don't think it would be worth having anyway. -- Michael signature.asc Description: PGP signature
Re: Force re-compression with lz4
On Sun, Oct 17, 2021 at 10:13:48PM +0300, Florents Tselai wrote: > I did look into VACUUM(full) for it’s PROCESS_TOAST option which > makes sense, but the thing is I already had a cron-ed VACUUM (full) > which I ended up disabling a while back; exactly because of the > double-space requirement. Please note that VACUUM FULL does not enforce a recompression on existing values. See commit dbab0c0, which disabled this choice as it introduced a noticeable performance penalty in some cases when looking at the compression type of the vacuumed table attributes:

=# CREATE TABLE cmdata(f1 text COMPRESSION pglz);
CREATE TABLE
=# INSERT INTO cmdata VALUES(repeat('1234567890', 1000));
INSERT 0 1
=# SELECT pg_column_compression(f1) FROM cmdata;
 pg_column_compression
-----------------------
 pglz
(1 row)
=# ALTER TABLE cmdata ALTER COLUMN f1 SET COMPRESSION lz4;
ALTER TABLE
=# VACUUM FULL cmdata;
VACUUM
=# SELECT pg_column_compression(f1) FROM cmdata;
 pg_column_compression
-----------------------
 pglz
(1 row)

-- Michael signature.asc Description: PGP signature
Re: Force re-compression with lz4
On Sun, Oct 17, 2021 at 10:33:52PM +0200, Daniel Verite wrote: > However lz4 appears to be much faster to compress than pglz, so its > benefit is clear in terms of CPU usage for future insertions. CPU-speaking, LZ4 is *much* faster than pglz when it comes to compression or decompression with its default options. The compression ratio is comparable between the two, though on average LZ4 compresses a bit less than pglz. -- Michael signature.asc Description: PGP signature
Re: How postgres is refreshing TLS certificates
On Wed, Jul 28, 2021 at 06:51:22AM +, M Tarkeshwar Rao wrote: > We are working on an activity in which I need to refresh the TLS > certificate without restarting my application pod. > This feature is already there in Postgres. Can anyone please suggest > how postgres has implemented the same? Hard to answer with so little detail, but if you are referring to the backend server, aren't you looking for the fact that the SSL context and its surrounding configuration can be reloaded? That would apply after a simple pg_ctl reload, for example. -- Michael signature.asc Description: PGP signature
Re: pg_upgrade and wraparound
On Mon, May 03, 2021 at 11:10:44AM -0400, Jan Wieck wrote: > Not yet, but I will enter it so that we can get it into 15 for sure. I may be missing something, but this is not listed: https://commitfest.postgresql.org/33/ Could you add it to the CF app, please? There are so many patches and discussions that this would easily get lost if you don't register it. And from what I can see, having a discussion on this matter looks appropriate to me. -- Michael signature.asc Description: PGP signature
Re: SV: Log files polluted with permission denied error messages after every 10 seconds
On Mon, Mar 22, 2021 at 09:25:26AM +0200, Andrus wrote: > Errors in the pg_wal directory seem not to occur in the patched version. Errors in > pg_stat_tmp still occur. Yesterday's log introduces a new error message > > using stale statistics instead of current ones because stats collector is > not responding > > 2021-03-21 23:51:25 EET autovacuum worker LOG: using stale statistics > instead of current ones because stats collector is not responding The renaming of stats files involves just pgrename(), which is a completely separate code path from the one used for WAL segments. This requires a separate investigation. -- Michael signature.asc Description: PGP signature
Re: PostgreSQL occasionally unable to rename WAL files (NTFS)
On Tue, Feb 16, 2021 at 12:22:36PM +0100, Thomas Kellerer wrote: > My first suspect is always the anti-virus on Windows when things like > that happen with Postgres. Or maybe not. 13 has introduced a regression in this area AFAIK, and 909b449 should have taken care of it (available in 13.3~). -- Michael signature.asc Description: PGP signature
Re: SV: Log files polluted with permission denied error messages after every 10 seconds
On Fri, Mar 19, 2021 at 09:00:10AM +0200, Andrus wrote: > I replaced files in 13.1 server with ones from your patched version. There > are no errors in log file now for 8 hours. Yippee. Thanks. Have you tested the unpatched builds? And did you see some errors with them? -- Michael signature.asc Description: PGP signature
Re: SV: Log files polluted with permission denied error messages after every 10 seconds
On Thu, Mar 18, 2021 at 12:25:45PM +1300, Guy Burgess wrote: > FWIW, this looks the same issue I am getting (reported last month: > https://www.postgresql.org/message-id/f444a84e-2d29-55f9-51a6-a5dcea3bc253%40burgess.co.nz) Yep. > I get the same Process Monitor output, including BUFFER OVERFLOW entries. > No sign of any process other than postgres.exe touching the WAL files. Guy, do you have an environment where this is still happening and where you could test a potential fix? We are not sure yet what's causing that, but one code path has changed in this area, involving CreateHardLinkA()+_unlink() instead of a single rename when attempting to recycle a segment. And I am just in the mood to build things myself and send links to people so they can download and test them, so one more is fine. -- Michael signature.asc Description: PGP signature
Re: SV: Log files polluted with permission denied error messages after every 10 seconds
On Wed, Mar 17, 2021 at 10:45:28AM +0200, Andrus wrote: > In this server hopefully no. Application code contains xml parsing it but > probably those queries are never running in this server. Okay, cool. I am going to send you privately two links to the builds I am going to produce, 13.2 unpatched and 13.2 patched. -- Michael signature.asc Description: PGP signature
Re: SV: Log files polluted with permission denied error messages after every 10 seconds
On Wed, Mar 17, 2021 at 11:44:48AM -0400, Tom Lane wrote: > Oh! That's an interesting theory; it'd explain why this broke recently, > because we didn't use to use that function. But how do you draw that > conclusion from this stack trace? > > Anyway, if you've diagnosed this correctly, I bet the fault is in > this bit in win32stat.c: > > #if _WIN32_WINNT < 0x0600 > IO_STATUS_BLOCK ioStatus; > FILE_STANDARD_INFORMATION standardInfo; > #else > FILE_STANDARD_INFO standardInfo; > #endif > > Maybe the version cutoff is wrong? Maybe we have to do this with > a run-time version check, instead of statically compiling it? All the reports received are on 13.1 and 13.2. This code is new as of bed9075, no? -- Michael signature.asc Description: PGP signature
Re: SV: Log files polluted with permission denied error messages after every 10 seconds
On Wed, Mar 17, 2021 at 09:25:00AM +0200, Andrus wrote: > pg_config --configure outputs > > --enable-thread-safety --enable-nls --with-ldap --with-openssl --with-uuid > --with-libxml --with-libxslt --with-icu --with-tcl --with-perl --with-python Thanks. Do you actually use OpenSSL, LDAP, uuid-ossp, xml2, PL/Perl, PL/Python, or the XML datatype for your applications there? It may be better if those custom builds have a minimum number of dependencies, while still being compatible with what you do on those servers, so that they can still have some load applied. -- Michael signature.asc Description: PGP signature
Re: SV: Log files polluted with permission denied error messages after every 10 seconds
On Wed, Mar 17, 2021 at 01:09:24AM +0200, Andrus wrote: > Should I try to install Visual C++, compile, and replace the postgres.exe file on the > AMD server? Mostly. That's the annoying part: https://www.postgresql.org/docs/devel/install-windows-full.html It is also possible to compile the code on a first machine, then just zip it and unzip it at the same location as your installation. I am not completely sure which flags your installation has, but another possibility is that I directly send you two compiled builds, one with the patch and one without, that you could directly test. I would not send that to the lists, as an installation is rather large, but I could just send you links from which you could download both of them. Then you would just need to stop the Postgres service, do a drop-in replacement of the binaries, and start the Postgres service again. -- Michael signature.asc Description: PGP signature
Re: SV: Log files polluted with permission denied error messages after every 10 seconds
On Tue, Mar 16, 2021 at 01:59:07PM +0200, Andrus wrote: > I have two Windows 2019 servers. On the Intel Xeon Gold 6226R server it occurs > every 10 seconds. Last logs: > > 2021-03-16 13:48:12 EET checkpointer LOG: could not rename file > "pg_wal/000100110097": Permission denied > 2021-03-16 13:48:22 EET checkpointer LOG: could not rename file > "pg_wal/000100110098": Permission denied > 2021-03-16 13:48:32 EET checkpointer LOG: could not rename file > "pg_wal/000100110099": Permission denied > 2021-03-16 13:48:42 EET checkpointer LOG: could not rename file > "pg_wal/00010011009A": Permission denied > 2021-03-16 13:48:52 EET checkpointer LOG: could not rename file > "pg_wal/00010011009D": Permission denied > 2021-03-16 13:49:02 EET checkpointer LOG: could not rename file > "pg_wal/0001001100A0": Permission denied > > So it should probably be reproducible in any Windows 2019 server. Those ten seconds are coming from RemoveXlogFile(), where pgrename() loops up to 100 times, waiting 100ms each time, before giving up. So something holding the file's handle prevents the removal from happening. Attached is the patch that should be tested, based on the suspected commit. There are actually two scenarios to worry about:
- Check that the code of 13.2 compiled manually is enough to see the failure.
- Check that, once the attached patch is applied, the failure goes away.
I am trying on my side to reproduce the problem in a more reliable way. One thing I saw breaking in my setup is archive_command, where it was not able to archive a segment with a simple copy, failing with the same error as yours. In one of those servers, do you have in pg_wal/ some files named xlogtemp.N? N is an integer that would be the PID of the process that generated it. -- Michael From 961f9a03d4c27220c33e88402d5ef274424a0ab2 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Wed, 17 Mar 2021 07:12:35 +0900 Subject: [PATCH] Revert "Remove HAVE_WORKING_LINK" This reverts commit aaa3aeddee51dd0058d38469907865052706a590.
--- src/include/pg_config_manual.h| 7 +++ src/include/storage/fd.h | 2 +- src/backend/access/transam/timeline.c | 4 ++-- src/backend/access/transam/xlog.c | 4 ++-- src/backend/storage/file/fd.c | 21 - 5 files changed, 28 insertions(+), 10 deletions(-) diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h index 8f3ec6bde1..966da99742 100644 --- a/src/include/pg_config_manual.h +++ b/src/include/pg_config_manual.h @@ -135,6 +135,13 @@ #define EXEC_BACKEND #endif +/* + * Define this if your operating system supports link() + */ +#if !defined(WIN32) && !defined(__CYGWIN__) +#define HAVE_WORKING_LINK 1 +#endif + /* * USE_POSIX_FADVISE controls whether Postgres will attempt to use the * posix_fadvise() kernel call. Usually the automatic configure tests are diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h index 8cd125d7df..2085c62b41 100644 --- a/src/include/storage/fd.h +++ b/src/include/storage/fd.h @@ -157,7 +157,7 @@ extern void fsync_fname(const char *fname, bool isdir); extern int fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel); extern int durable_rename(const char *oldfile, const char *newfile, int loglevel); extern int durable_unlink(const char *fname, int loglevel); -extern int durable_rename_excl(const char *oldfile, const char *newfile, int loglevel); +extern int durable_link_or_rename(const char *oldfile, const char *newfile, int loglevel); extern void SyncDataDirectory(void); extern int data_sync_elevel(int elevel); diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c index e6a29d9a9b..27d70ff869 100644 --- a/src/backend/access/transam/timeline.c +++ b/src/backend/access/transam/timeline.c @@ -446,7 +446,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI, * Perform the rename using link if available, paranoidly trying to avoid * overwriting an existing file (there shouldn't be one). 
*/ - durable_rename_excl(tmppath, path, ERROR); + durable_link_or_rename(tmppath, path, ERROR); /* The history file can be archived immediately. */ if (XLogArchivingActive()) @@ -524,7 +524,7 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size) * Perform the rename using link if available, paranoidly trying to avoid * overwriting an existing file (there shouldn't be one). */ - durable_rename_excl(tmppath, path, ERROR); + durable_link_or_rename(tmppath, path, ERROR); } /* diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 7daa7c43ad..82e070e431 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -3624,1
Re: SV: Log files polluted with permission denied error messages after every 10 seconds
Hi Andrus, On Wed, Mar 10, 2021 at 03:20:47PM +0200, Andrus wrote: > After re-starting the postgres service the problem persists. Where are you getting the Postgres binaries from? If we provide a patch, could you test it? This would require that you do your own build, unfortunately, but having an environment where this is easily reproducible is a key thing. -- Michael signature.asc Description: PGP signature
Re: how to best remove version 10 (and keep version 9.5)
On Sat, Mar 13, 2021 at 12:03:04PM -0800, Adrian Klaver wrote: > So, the 10 instance is not running and the 9.5 instance is listening on the > default port. At this point I would leave things as they are. Robert, you may want to know that 9.5 has been EOL'd by the community. Just saying. -- Michael signature.asc Description: PGP signature
Re: pg_upgrade and wraparound
Hi Jan, On Fri, Mar 12, 2021 at 06:13:33PM -0500, Jan Wieck wrote: > One of the things in my way is that when using pg_resetwal to put the > NextXID way into the future (to push the old cluster close to wraparound for > example), the postmaster won't start because it doesn't have the pg_xact > files for that around. Should pg_resetwal create the files in the gap > between the old NextXID and the new one? I think that you should add this patch to the next commit fest to track it properly: https://commitfest.postgresql.org/33/ -- Michael signature.asc Description: PGP signature
Re: Log files polluted with permission denied error messages after every 10 seconds
On Sun, Mar 07, 2021 at 11:45:26AM +0200, Andrus wrote: > Should files with the .deleted extension be deleted manually to save disk space? > Many of them have dates before today. RemoveOldXlogFiles() ignores any of those .deleted files because they don't match a legal WAL segment name, so checkpoints are not able to work on them, even in the future. I would avoid meddling with anything that a backend may still be touching while running, but that should not really matter here, as these files are just never chosen for deletion. Piling up those files is not a good thing, so while you still need to figure out what's causing those files to remain around on your side, perhaps we should improve the situation in the backend itself. -- Michael signature.asc Description: PGP signature
Re: Log files polluted with permission denied error messages after every 10 seconds
On Sat, Mar 06, 2021 at 07:53:11PM +0200, Andrus wrote: > I changed wal_recycle to off, so the checkpointer should no longer try to rename > WAL files. It still tries to rename files. No idea why it does not use this > setting: On Windows, RemoveXlogFile() would still rename a given WAL segment file with a ".deleted" suffix, with or without wal_recycle, in the case where recycling of a WAL segment is not necessary, for example if max_wal_size is already full. So this setting has no effect here. > Should the checkpointer process be terminated to force it to use the new setting? Is it > safe to kill it during database usage? I don't understand what you mean here. -- Michael signature.asc Description: PGP signature
Re: Log files polluted with permission denied error messages after every 10 seconds
On Fri, Mar 05, 2021 at 07:36:37PM +0200, Andrus wrote: > Then turned real-time protection off: > > Problem persists. New entry is written after every 10 seconds. On which files are those complaints? It seems to me that you may have more going on in this system that interacts with your data folder than you think. > pg_wal also contains files with .deleted extension like > > 0001000500B2.deleted These are generated on Windows when removing a past WAL segment, where the process involves a rename followed by durable_unlink() that would generate some LOG entries in the logs if the internal unlink() failed (see RemoveXlogFile() in xlog.c). -- Michael signature.asc Description: PGP signature
Re: ransomware
On Mon, Feb 01, 2021 at 03:38:35PM +0100, Marc Millas wrote: > there are various ways to do those checks but I was wondering if any > "standard" solution exists within the postgres ecosystem, or if someone has > any feedback on the topic. It seems to me that you should first write down on a sheet of paper a list of all the requirements you are trying to satisfy. What you are describing here is a rather general problem statement, so nobody can help without knowing what you are trying to achieve, precisely. -- Michael signature.asc Description: PGP signature
Re: Does pg_ctl promote wait for pending WAL?
Hi Ishii-san, On Fri, Jan 29, 2021 at 07:59:26AM +0100, Paul Förster wrote: > On 29. Jan, 2021, at 03:51, Tatsuo Ishii wrote: >> >> Does anybody know whether a standby server waits for pending WAL >> records/files while promotion is requested? I assume that no data >> update is performed on the primary server during promotion. >> >> I imagine that a standby server stops replaying WAL and promotes as >> soon as the SIGUSR1 signal is received. To answer your question based on the code, you can check the code paths calling CheckForStandbyTrigger() in xlog.c when it comes to promotion detection during WAL replay. In short, WaitForWALToBecomeAvailable() tells us that after the promotion is detected in the startup process, recovery will still try to replay as much WAL as possible from the archives or pg_wal before failing over. Likewise, from the docs: "Before failover, any WAL immediately available in the archive or in pg_wal will be restored, but no attempt is made to connect to the primary." -- Michael signature.asc Description: PGP signature
Re: Error messages on duplicate schema names
On Tue, Jan 19, 2021 at 05:37:51PM -0300, Alvaro Herrera wrote: > I guess you could make the case that the CCI call should be in the > callers where we actually loop (SetDefaultACLsInSchemas, > RemoveRoleFromObjectACL), but it's hard to get excited about the added > clutter. Yeah, that matches my thoughts. And I did not see any harm in putting the CCI in SetDefaultACL() even for the case of pg_default_acl with DROP OWNED BY. So applied and backpatched. Thanks, Alvaro and Andrus. -- Michael signature.asc Description: PGP signature
Re: Error messages on duplicate schema names
On Wed, Jan 06, 2021 at 07:15:24PM +0200, Andrus wrote: > Should duplicate schema names be accepted, or should their usage throw better > error messages? This means that we are one call of CommandCounterIncrement() short for such queries, and similar safeguards already exist in this area for GRANT/REVOKE. The spot where I would put this new CCI is at the end of SetDefaultACL(), as in the attached, with a test added to privileges.sql. Any thoughts from others? -- Michael

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 1a81e768ec..f3c1ca18ae 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -1365,6 +1365,9 @@ SetDefaultACL(InternalDefaultACL *iacls)
 	ReleaseSysCache(tuple);
 
 	table_close(rel, RowExclusiveLock);
+
+	/* prevent error when processing duplicate objects */
+	CommandCounterIncrement();
 }

diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index 7754c20db4..5e5f98ac68 100644
--- a/src/test/regress/expected/privileges.out
+++ b/src/test/regress/expected/privileges.out
@@ -1649,7 +1649,8 @@ SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'INSERT'); --
  f
 (1 row)
 
-ALTER DEFAULT PRIVILEGES IN SCHEMA testns GRANT SELECT ON TABLES TO public;
+-- placeholder for test with duplicated schema and role names
+ALTER DEFAULT PRIVILEGES IN SCHEMA testns,testns GRANT SELECT ON TABLES TO public,public;
 SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'SELECT'); -- no
  has_table_privilege
 ---------------------

diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
index 4911ad4add..fff76e0bd0 100644
--- a/src/test/regress/sql/privileges.sql
+++ b/src/test/regress/sql/privileges.sql
@@ -985,7 +985,8 @@ CREATE TABLE testns.acltest1 (x int);
 SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'SELECT'); -- no
 SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'INSERT'); -- no
 
-ALTER DEFAULT PRIVILEGES IN SCHEMA testns GRANT SELECT ON TABLES TO public;
+-- placeholder for test with duplicated schema and role names
+ALTER DEFAULT PRIVILEGES IN SCHEMA testns,testns GRANT SELECT ON TABLES TO public,public;
 SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'SELECT'); -- no
 SELECT has_table_privilege('regress_priv_user1', 'testns.acltest1', 'INSERT'); -- no

signature.asc Description: PGP signature
Re: Postgres C-API: How to get the Oid for a custom type defined in a schema outside of the current search path
On Thu, Dec 03, 2020 at 01:45:05PM +0100, Pavel Stehule wrote: > When you write C extensions for Postgres, then PostgreSQL source code is > the best source of inspiration. One common source of inspiration for such cases is regproc.c. For a type, you can for example look at what to_regtype() uses for a conversion from a name string to an OID, aka parseTypeString(). -- Michael signature.asc Description: PGP signature
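From the SQL level, the same resolution can be checked quickly before wiring it into C code (a sketch; the schema and type names are placeholders):

```sql
-- Resolve a schema-qualified type name to an OID without depending on
-- search_path; to_regtype() returns NULL rather than raising an error
-- when the type does not exist.
SELECT to_regtype('myschema.mytype')::oid;
```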
Re: postgres-10 with FIPS
On Thu, Dec 03, 2020 at 05:57:04PM +0530, Aravindhan Krishnan wrote: > Since postgres is linked against openssl we wanted to make sure we build > postgres against the FIPS compliant openssl libraries. Does postgres > provide a FIPS debian package that can be used? If not, it would be of great > help to have instructions for building a debian package of postgres linked > against the FIPS compliant openssl libraries. There is no need for Postgres to do anything specific with FIPS at runtime, as long as the OS takes care of enabling FIPS and OpenSSL is able to recognize that. So normally, you could just use a version of Postgres compiled with OpenSSL 1.0.2, and replace the OpenSSL libraries with a version compiled with FIPS enabled, as the OpenSSL APIs used by Postgres are exactly the same for the non-FIPS and FIPS cases. -- Michael signature.asc Description: PGP signature
Re: meaning of (to prevent wraparound) ..??
On Wed, Nov 25, 2020 at 11:10:50PM -0700, Jessica Sharp wrote: > On Wed, Nov 25, 2020 at 23:09 Atul Kumar wrote: >> Thanks Jessica. Could you help me out by sharing documents that can help me >> understand "to prevent wraparound" in the simplest way? The postgres docs are a >> little hard for a newbie like me. > > Absolutely. This may be helpful: > https://youtu.be/bA1SgWn6Gv4 If you have any suggestions on how to improve this part of the docs, please feel free: https://www.postgresql.org/docs/devel/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND Note that due to the critical work that such vacuum jobs do, they cannot be cancelled. -- Michael signature.asc Description: PGP signature
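To see which databases are getting close to the point where such a vacuum triggers, a query like this helps (a sketch; the threshold is autovacuum_freeze_max_age, 200 million by default):

```sql
-- Databases ordered by how far their oldest unfrozen XID has aged;
-- anti-wraparound autovacuum starts once this exceeds
-- autovacuum_freeze_max_age.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```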
Re: archive file "00000001000000000000006F" has wrong size: 67118648 instead of 16777216
On Wed, Nov 25, 2020 at 11:01:37AM +0900, 江川潔 wrote: > Hi, > > WAL recovery failed on a wrong log record size. Could you please > advise me what is wrong in the setting? Any suggestions will be highly > appreciated. > 2020-11-25 10:12:23.569 JST [7792] FATAL: archive file > "00000001000000000000006F" has wrong size: 67118648 instead of 16777216 Something is fishy with your WAL archive partition, so you had better check its state, for example by using pg_waldump on each archived segment. This error means that the copied file was basically 64MB in size, while segments are expected to be 16MB in this instance. -- Michael signature.asc Description: PGP signature
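One way to spot the bad files without leaving SQL is to compare sizes across the archive directory (a sketch; the archive path is a placeholder and these functions are restricted to superusers by default):

```sql
-- Flag archived files whose size differs from the default 16MB WAL
-- segment size; adjust the path to your archive location.
SELECT f.name, s.size
FROM pg_ls_dir('/mnt/wal_archive') AS f(name),
     LATERAL pg_stat_file('/mnt/wal_archive/' || f.name) AS s
WHERE NOT s.isdir
  AND s.size <> 16 * 1024 * 1024;
```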
Re: initdb --data-checksums
On Mon, Nov 09, 2020 at 06:03:43PM +0100, Paul Förster wrote: > indeed, it is. Have a look at: > > https://www.postgresql.org/docs/12/app-pgchecksums.html > > Make sure the database is cleanly shut down before doing it. This tool is really useful with upgrades after pg_upgrade. Please note that there is a --progress option, so you can basically know how long it is going to take until completion. -- Michael signature.asc Description: PGP signature
Re: PANIC: could not write to log file {} at offset {}, length {}: Invalid argument
On Wed, Nov 04, 2020 at 10:23:04PM -0500, Tom Lane wrote: > The latter case would result in a LOG message "unrecognized win32 error > code", so it would be good to know if any of those are showing up in > the postmaster log. Yeah. Not sure which one it could be here: https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile One possibility could also be ERROR_OPERATION_ABORTED, which is not in the mapping table. So that would map to EINVAL. > Seems like maybe it wasn't a great idea for _dosmaperr's fallback > errno to be something that is also a real error code. You could say that for any fallback errno as long as you don't know if there's a LOG to show that a DWORD does not map with the table of win32error.c, no? (I got to wonder whether it would be worth the complexity to show more information when using _dosmaperr() for WIN32 on stuff like elog(ERROR, "%m"), just a wild thought). -- Michael signature.asc Description: PGP signature
Re: PANIC: could not write to log file {} at offset {}, length {}: Invalid argument
On Thu, Nov 05, 2020 at 10:21:40AM +0100, Magnus Hagander wrote: > The problem with AVs generally doesn't come from them opening files in > non-share mode (I've, surprisingly enough, seen backup software that > causes that problem for example). It might happen on scheduled scans > for example, but the bigger problem with AV software has always been > their filter driver software which intercepts both the open/close and > the read/write calls an application makes and "does its magic" on > them before handing the actual call up to the operating system. It's > completely independent of how the file is opened. This one is a bit new to me. I have certainly seen my share of stat() or open() calls failing with EPERM because of file handles taken exclusively by external scanners, both around here and in customer-related issues, but I did not expect that such dark magic could be involved in a write. It would indeed not be surprising to see a PANIC depending on what gets done. -- Michael signature.asc Description: PGP signature
Re: PANIC: could not write to log file {} at offset {}, length {}: Invalid argument
On Wed, Nov 04, 2020 at 01:24:46PM +0100, Andreas Kretschmer wrote: >> Any ideas about what is the problem? or anything else I need to check? > > wild guess: Antivirus Software? Perhaps not. To bring more context in here, PostgreSQL opens any files on WIN32 with shared writes and reads allowed to have an equivalent of what we do on all *nix platforms. Note here that the problem comes from a WAL segment write, which is done after the file handle is opened in shared mode. As long as the fd is correctly opened, any attempt for an antivirus software to open a file with an exclusive write would be blocked, no? -- Michael signature.asc Description: PGP signature
Re: PG 9.2 slave restarted - cache not impacted
On Fri, Oct 23, 2020 at 11:23:20AM +1300, Lucas Possamai wrote: > I'm a bit confused about PG cache. > > I have a PostgreSQL 9.2 cluster (yes, we're planning on upgrading it to 12) > with a master and a slave database. > > The application is sending all read requests to the slave, where the master > processes the writes. > > A while ago we had to restart the master database server, and the > application was slow for a couple of days while PG cache was warmed up. > Yesterday, we had to restart the slave database server and we did not have > the same problem. > > I would like to understand why? Perhaps you misunderstood the effects of the OS cache and the Postgres shared buffers? On restart the cluster discards the shared buffers internal to Postgres, but that does not mean that the OS cache is flushed, and the latter matters a lot in terms of performance. By the way, if you want to warm up your caches faster, an option you can consider with 9.2 (a version no longer officially supported by the community, by the way, so please upgrade) would be to use pgfincore. -- Michael signature.asc Description: PGP signature
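If upgrading is an option, note that the in-core pg_prewarm extension (shipped since 9.4, so not available on 9.2) covers a similar need as pgfincore (a sketch; the table name is a placeholder):

```sql
-- Load a relation into shared buffers after a restart, avoiding the
-- slow organic warm-up period.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('my_hot_table');
```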
Re: rum index supported on pg13?
On Mon, Oct 12, 2020 at 12:17:04PM -0500, John the Scott wrote: > I am still new to github protocol, so i was not sure > if asking about longer term support of rum was appropriate for > the github issues posting. Most of the original developers of rum are registered on this mailing list so there would be some visibility, but I would guess that posting an issue or a question from the actual project page does not hurt either. > the differences between pg12 and pg13 seem considerable. > our internal use of rum has been spectacularly successful, > we may be able to justify resources to fixing the compile issues with pg13, > but the effort will be considerable. Glad to hear that. -- Michael signature.asc Description: PGP signature
Re: rum index supported on pg13?
On Thu, Oct 08, 2020 at 09:29:31PM -0500, John the Scott wrote: > will rum index from postgrespro be supported in pg13? > numerous errors occur when compiling rum in pg13 and > no replies from github. the differences from pg12 > to pg13 seem to be significant > > https://github.com/postgrespro/rum Did you ask this question directly to the authors of the extension on the page of the project you are quoting above? -- Michael signature.asc Description: PGP signature
Re: Both type of replications from a single server?
On Thu, Oct 08, 2020 at 01:25:14PM +0530, Srinivasa T N wrote: > For streaming replication, I need to set wal_level to replica in A, whereas > for logical replication we need to set wal_level to logical in the same > server A. So, I was wondering how to go about it? A logical replica needs wal_level = logical, a setting that also allows streaming replication to a physical replica. But the opposite is not true, as using wal_level = replica will not work for logical replicas. So, assuming that you want to have both logical and physical replicas replicating from the same source server, you need to set wal_level to logical on the primary server, because it is a system-wide configuration. -- Michael signature.asc Description: PGP signature
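Concretely, on server A that could look like this (a sketch; changing wal_level requires a server restart to take effect):

```sql
-- Set the WAL level high enough for both physical and logical replicas.
ALTER SYSTEM SET wal_level = 'logical';
-- After restarting the server, verify the active value:
SHOW wal_level;
```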
Re: Both type of replications from a single server?
On Thu, Oct 08, 2020 at 11:43:26AM +0530, Srinivasa T N wrote: >Is it possible to have both type of replications (streaming and logical) > from a single server? Yes. >If I have 3 servers A,B and C, then I want to have streaming replication > from A to B whereas logical replication from A to C. Is it possible? And yes. -- Michael signature.asc Description: PGP signature
Re: PG 13 trusted extensions and pg_available_extensions
On Wed, Sep 23, 2020 at 03:28:45PM +, Daniel Westermann (DWE) wrote: > I was playing a bit with trusted extensions and wondered if there is > a reason that the "trusted" flag is not exposed in pg_available_extensions. > I believe that information would be quite useful so one can easily > identify extensions that can be installed as "normal" user. Adding the trusted flag makes sense for visibility. There is a bit more that we could consider though? For example, what about "relocatable" and "requires"? -- Michael signature.asc Description: PGP signature
Re: Autovacuum of independent tables
On Tue, Sep 08, 2020 at 11:16:04AM +0300, Michael Holzman wrote: > Autovacuum does not clean dead tuples of closed transactions in tableB > while there is an open transaction on tableA. > But the tables have nothing in common. They are handled by separate > applications and there are no transactions that touch both tables > simultaneously. > Why does autovacuum create an artificial dependency on the tables? This is called MVCC, which applies to a session as a whole. The point here is that even if your application knows that only tableA is used by a given transaction, Postgres cannot know that, as it could be possible that data from tableB is needed in this same transaction, so old versions of the rows from tableB matching the snapshot held by this long-running transaction still have to be kept around. -- Michael signature.asc Description: PGP signature
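A query like this helps to spot the long-running transactions holding back cleanup (a sketch; the one-hour threshold is arbitrary):

```sql
-- Sessions that have kept a transaction open for more than an hour;
-- any of these can prevent vacuum from removing dead tuples, even in
-- tables they never touch.
SELECT pid, usename, state, xact_start
FROM pg_stat_activity
WHERE xact_start < now() - interval '1 hour'
ORDER BY xact_start;
```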
Re: When are largobject records TOASTed into pg_toast_2613?
On Fri, Aug 21, 2020 at 03:10:30PM +0200, Laurenz Albe wrote: > Indeed, system tables have no TOAST tables in PostgreSQL, so I wonder > how your "pg_largeobject" table could have grown one. FWIW, src/include/catalog/toasting.h is giving me a list of 28 catalog tables with a toast relation as of HEAD. -- Michael signature.asc Description: PGP signature
Re: [EXTERNAL] RE: PostgreSQL-12 replication. Check replication lag
On Wed, Aug 05, 2020 at 06:36:15PM +, Mariya Rampurawala wrote: > What I want to understand is that, in case of replication link > failure, there will still be inserts happening at the master > node. In that case, how will the slave know if it is up-to-date? It cannot do that by itself, which is why in common HA scenarios you have a witness host that acts as a monitoring instance for the primary and the standby, so as to limit SPOF issues. This depends of course on your application requirements, but you can also monitor the replication lag in bytes on the primary, and even couple that with synchronous replication to check when commits are getting blocked. IMO, you should try not to reinvent the wheel and use one of the solutions provided by the community, pg_auto_failover coming to my mind as one example. -- Michael signature.asc Description: PGP signature
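For the lag-in-bytes approach mentioned above, run this on the primary (a sketch using the function and column names of v10 and newer):

```sql
-- Per-standby replication lag in bytes, measured from the primary's
-- current WAL position to each standby's last replayed position.
SELECT application_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
```

The write_lag, flush_lag, and replay_lag columns of the same view give time-based equivalents.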
Re: How to rebuild index efficiently
On Mon, Aug 03, 2020 at 01:04:45PM -0500, Ron wrote: > same definition, and when that is complete, drop the old index. The > locking that is required here is modest: CREATE INDEX CONCURRENTLY > needs to lock the table briefly at a couple of points in the > operation, and dropping the old index requires a brief lock on the > table. It is, however, much less overall lock time than REINDEX would be. > > Of course, you need enough disk space... :) A SHARE UPDATE EXCLUSIVE lock is taken during a CIC, meaning that writes and reads are allowed on the parent table while the operation works, but no DDL is allowed (roughly). The operation takes a couple of transactions to complete, and there are two wait points after building and validating the new index to make sure that there are no transactions remaining around that may cause visibility issues once the new index is ready to use and becomes valid. So the operation is longer and takes more resources, but it has the advantage of being non-disruptive. -- Michael signature.asc Description: PGP signature
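The steps described above boil down to something like this (a sketch; the index, table, and column names are placeholders):

```sql
-- Build a replacement index without blocking writes, then swap it in.
CREATE INDEX CONCURRENTLY my_index_new ON my_table (my_column);
DROP INDEX CONCURRENTLY my_index;    -- once the new index is marked valid
ALTER INDEX my_index_new RENAME TO my_index;
```

On v12 and newer, REINDEX CONCURRENTLY wraps this same dance into a single command.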
Re: how reliable is pg_rewind?
On Sat, Aug 01, 2020 at 10:35:37AM -0700, Curt Kolovson wrote: > When trying to resync an old primary to become a new standby, I have found > that pg_rewind only works occasionally. How reliable/robust is pg_rewind, > and what are its limitations? We have observed that approx half our FPIs in > the WALs are due to XLOG/FPI_FOR_HINT. The only reason we've set > wal_log_hints=on is so that we can use pg_rewind. But if pg_rewind is > unreliable, we would rather turn off wal_log_hints. Any info on the > reliability of pg_rewind and its limitations would be appreciated. FWIW, we have used it in production for 4 years now to accelerate the redeployment of standbys in HA configurations in at least one product, and it has been present upstream since 9.5, for 5 years now. So the tool is rather well-baked at this stage of the game. -- Michael signature.asc Description: PGP signature