Re: Something is wrong with wal_compression

2023-01-30 Thread Michael Paquier
On Mon, Jan 30, 2023 at 02:57:13PM +0900, Michael Paquier wrote:
> 1) means more test cycles, and perhaps we could enforce compression of
> WAL while on it? At the end, my vote would just go for 3) and drop
> the whole scenario, though there may be an argument in 1).

And actually I was under the

Re: Something is wrong with wal_compression

2023-01-29 Thread Michael Paquier
On Sat, Jan 28, 2023 at 12:02:23AM -0500, Tom Lane wrote:
> My thoughts were trending in that direction too. It's starting
> to sound like we aren't going to be able to make a fix that
> we'd be willing to risk back-patching, even if it were completely
> compatible at the user level.
>
> Still,

Re: Something is wrong with wal_compression

2023-01-27 Thread Tom Lane
Thomas Munro writes:
> Reading Andres's comments and realising how relatively young
> txid_status() is compared to txid_current(), I'm now wondering if we
> shouldn't just disclaim the whole thing in back branches.

My thoughts were trending in that direction too. It's starting to sound like we

Re: Something is wrong with wal_compression

2023-01-27 Thread Thomas Munro
On Sat, Jan 28, 2023 at 4:57 PM Andrey Borodin wrote:
> It's not trustworthy anyway. Xid wraparound might happen during
> reconnect. I suspect we can design a test that will show that it does
> not always show correct results during xid->2pc conversion (there is a
> point in time when xid is not

Re: Something is wrong with wal_compression

2023-01-27 Thread Andres Freund
Hi,

On 2023-01-27 19:57:35 -0800, Andrey Borodin wrote:
> On Fri, Jan 27, 2023 at 7:40 PM Tom Lane wrote:
> >
> > What are you using it for, that you don't care whether the answer
> > is trustworthy?
> >
>
> It's not trustworthy anyway. Xid wraparound might happen during
> reconnect.

I think

Re: Something is wrong with wal_compression

2023-01-27 Thread Andres Freund
Hi,

On 2023-01-27 19:49:17 -0800, Andres Freund wrote:
> It's quite commonly used as part of trigger based replication tools (IIRC
> that's its origin), monitoring, as part of client side logging, as part of
> snapshot management.

Forgot one: Queues. The way it's used for trigger based

Re: Something is wrong with wal_compression

2023-01-27 Thread Andrey Borodin
On Fri, Jan 27, 2023 at 7:40 PM Tom Lane wrote:
>
> What are you using it for, that you don't care whether the answer
> is trustworthy?
>

It's not trustworthy anyway. Xid wraparound might happen during reconnect. I suspect we can design a test that will show that it does not always show correct

Re: Something is wrong with wal_compression

2023-01-27 Thread Andres Freund
Hi,

On 2023-01-27 22:39:56 -0500, Tom Lane wrote:
> Andres Freund writes:
> > On 2023-01-28 11:38:50 +0900, Michael Paquier wrote:
> >> FWIW, my vote goes for a more expensive but reliable function even in
> >> stable branches.
>
> > I very strenuously object. If we make txid_current() (by way

Re: Something is wrong with wal_compression

2023-01-27 Thread Tom Lane
Andres Freund writes:
> On 2023-01-28 11:38:50 +0900, Michael Paquier wrote:
>> FWIW, my vote goes for a more expensive but reliable function even in
>> stable branches.

> I very strenuously object. If we make txid_current() (by way of
> pg_current_xact_id()) flush WAL, we'll cause outages.

Re: Something is wrong with wal_compression

2023-01-27 Thread Andres Freund
Hi,

On 2023-01-28 11:38:50 +0900, Michael Paquier wrote:
> On Fri, Jan 27, 2023 at 06:06:05AM +0100, Laurenz Albe wrote:
> > On Fri, 2023-01-27 at 16:15 +1300, Thomas Munro wrote:
> >> There is no
> >> doubt that the current situation is unacceptable, though, so maybe we
> >> really should just

Re: Something is wrong with wal_compression

2023-01-27 Thread Maciek Sakrejda
On Fri, Jan 27, 2023, 18:58 Andres Freund wrote:
> Hi,
>
> On 2023-01-27 16:15:08 +1300, Thomas Munro wrote:
> > It would be pg_current_xact_id() that would have to pay the cost of
> > the WAL flush, not pg_xact_status() itself, but yeah that's what the
> > patch does (with some optimisations).

Re: Something is wrong with wal_compression

2023-01-27 Thread Andres Freund
Hi,

On 2023-01-27 16:15:08 +1300, Thomas Munro wrote:
> It would be pg_current_xact_id() that would have to pay the cost of
> the WAL flush, not pg_xact_status() itself, but yeah that's what the
> patch does (with some optimisations). I guess one question is whether
> there are any other

Re: Something is wrong with wal_compression

2023-01-27 Thread Michael Paquier
On Fri, Jan 27, 2023 at 06:06:05AM +0100, Laurenz Albe wrote:
> On Fri, 2023-01-27 at 16:15 +1300, Thomas Munro wrote:
>> There is no
>> doubt that the current situation is unacceptable, though, so maybe we
>> really should just do it and make a faster one later.  Anyone else
>> want to vote on

Re: Something is wrong with wal_compression

2023-01-26 Thread Laurenz Albe
On Fri, 2023-01-27 at 16:15 +1300, Thomas Munro wrote:
> On Fri, Jan 27, 2023 at 3:04 PM Tom Lane wrote:
> > Thomas Munro writes:
> > > On Fri, Jan 27, 2023 at 1:30 PM Michael Paquier
> > > wrote:
> > > > My opinion would be to make this function more reliable, FWIW, even if
> > > > that

Re: Something is wrong with wal_compression

2023-01-26 Thread Tom Lane
Thomas Munro writes:
> On Fri, Jan 27, 2023 at 3:04 PM Tom Lane wrote:
>> I think we need to get the thing correct first and worry about
>> performance later. What's wrong with simply making pg_xact_status
>> write and flush a record of the XID's existence before returning it?
>> Yeah, it will

Re: Something is wrong with wal_compression

2023-01-26 Thread Thomas Munro
On Fri, Jan 27, 2023 at 3:04 PM Tom Lane wrote:
> Thomas Munro writes:
> > On Fri, Jan 27, 2023 at 1:30 PM Michael Paquier wrote:
> >> My opinion would be to make this function more reliable, FWIW, even if
> >> that involves a performance impact when called in a close loop by
> >> forcing more

Re: Something is wrong with wal_compression

2023-01-26 Thread Tom Lane
Thomas Munro writes:
> On Fri, Jan 27, 2023 at 1:30 PM Michael Paquier wrote:
>> My opinion would be to make this function more reliable, FWIW, even if
>> that involves a performance impact when called in a close loop by
>> forcing more WAL flushes to ensure its report durability and
>>

Re: Something is wrong with wal_compression

2023-01-26 Thread Thomas Munro
On Fri, Jan 27, 2023 at 1:30 PM Michael Paquier wrote:
> On Thu, Jan 26, 2023 at 04:14:57PM -0800, Andrey Borodin wrote:
> > If we agree that xid allocation is not something persistent, let's fix
> > the test? We can replace a check with select * from pg_class or,
> > maybe, add an amcheck run.
>

Re: Something is wrong with wal_compression

2023-01-26 Thread Michael Paquier
On Thu, Jan 26, 2023 at 04:14:57PM -0800, Andrey Borodin wrote:
> If we agree that xid allocation is not something persistent, let's fix
> the test? We can replace a check with select * from pg_class or,
> maybe, add an amcheck run.
> As far as I recollect, this test was introduced to test this

Re: Something is wrong with wal_compression

2023-01-26 Thread Tom Lane
Andrey Borodin writes:
> On Thu, Jan 26, 2023 at 3:04 PM Tom Lane wrote:
>> Indeed, it seems like this behavior makes pg_xact_status() basically
>> useless as things stand.

> If we agree that xid allocation is not something persistent, let's fix
> the test?

If we're not going to fix this

Re: Something is wrong with wal_compression

2023-01-26 Thread Andrey Borodin
On Thu, Jan 26, 2023 at 3:04 PM Tom Lane wrote:
>
> Indeed, it seems like this behavior makes pg_xact_status() basically
> useless as things stand.
>

If we agree that xid allocation is not something persistent, let's fix the test? We can replace a check with select * from pg_class or, maybe, add

Re: Something is wrong with wal_compression

2023-01-26 Thread Tom Lane
Thomas Munro writes:
> On Fri, Jan 27, 2023 at 11:14 AM Tom Lane wrote:
>> If any tuples made by that transaction had reached disk,
>> we'd have a problem.

> The problem is that the WAL wasn't flushed, allowing the same xid to
> be allocated again after crash recovery. But for any data pages

Re: Something is wrong with wal_compression

2023-01-26 Thread Thomas Munro
On Fri, Jan 27, 2023 at 11:14 AM Tom Lane wrote:
> Andrey Borodin writes:
> > On Thu, Jan 26, 2023 at 12:12 PM Tom Lane wrote:
> >> That test case is demonstrating fundamental
> >> database corruption after a crash.
>
> > Not exactly corruption. XID was not persisted and buffer data did not
> >

Re: Something is wrong with wal_compression

2023-01-26 Thread Tom Lane
Andrey Borodin writes:
> On Thu, Jan 26, 2023 at 12:12 PM Tom Lane wrote:
>> That test case is demonstrating fundamental
>> database corruption after a crash.

> Not exactly corruption. XID was not persisted and buffer data did not
> hit a disk. Database is in the correct state.

Really? I

Re: Something is wrong with wal_compression

2023-01-26 Thread Andrey Borodin
On Thu, Jan 26, 2023 at 12:12 PM Tom Lane wrote:
>
> That test case is demonstrating fundamental
> database corruption after a crash.
>

Not exactly corruption. XID was not persisted and buffer data did not hit a disk. Database is in the correct state. It was discussed long before WAL

Re: Something is wrong with wal_compression

2023-01-26 Thread Justin Pryzby
On Thu, Jan 26, 2023 at 02:08:27PM -0600, Justin Pryzby wrote:
> On Thu, Jan 26, 2023 at 02:43:29PM -0500, Tom Lane wrote:
> > The symptom being exhibited by Michael's new BF animal tanager
> > is perfectly reproducible elsewhere.
>
> I think these tests have always failed with wal_compression ?

Re: Something is wrong with wal_compression

2023-01-26 Thread Tom Lane
Justin Pryzby writes:
> On Thu, Jan 26, 2023 at 02:43:29PM -0500, Tom Lane wrote:
>> The symptom being exhibited by Michael's new BF animal tanager
>> is perfectly reproducible elsewhere.

> I think these tests have always failed with wal_compression ?

If that's a known problem, and we've done

Re: Something is wrong with wal_compression

2023-01-26 Thread Justin Pryzby
On Thu, Jan 26, 2023 at 02:43:29PM -0500, Tom Lane wrote:
> The symptom being exhibited by Michael's new BF animal tanager
> is perfectly reproducible elsewhere.

I think these tests have always failed with wal_compression ?

Something is wrong with wal_compression

2023-01-26 Thread Tom Lane
The symptom being exhibited by Michael's new BF animal tanager is perfectly reproducible elsewhere.

$ cat /home/postgres/tmp/temp_config
#default_toast_compression = lz4
wal_compression = lz4
$ export TEMP_CONFIG=/home/postgres/tmp/temp_config
$ cd ~/pgsql/src/test/recovery
$ make check
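
Illustrative note (not part of the original message): replies in this thread attribute the failure to WAL not being flushed, which allows the same xid to be allocated again after crash recovery. The Python sketch below is a toy model of that idea only. The class ToyServer, its methods, the flush parameter, and the starting value 100 are all hypothetical names chosen for illustration; none of this is PostgreSQL's actual XID or WAL code.

```python
class ToyServer:
    """Toy model of XID hand-out with buffered WAL (hypothetical, not PostgreSQL)."""

    FIRST_XID = 100  # arbitrary starting value for the illustration

    def __init__(self):
        self.next_xid = self.FIRST_XID
        self.wal_buffer = []  # records written but not yet flushed
        self.wal_disk = []    # records that would survive a crash

    def flush_wal(self):
        """Move every buffered record to durable storage."""
        self.wal_disk.extend(self.wal_buffer)
        self.wal_buffer.clear()

    def current_xid(self, flush=False):
        """Hand out an XID and log the assignment; optionally flush it."""
        xid = self.next_xid
        self.next_xid += 1
        self.wal_buffer.append(xid)
        if flush:
            self.flush_wal()
        return xid

    def crash_and_recover(self):
        """Lose the buffer, then rebuild the counter from durable records only."""
        self.wal_buffer.clear()
        self.next_xid = max(self.wal_disk, default=self.FIRST_XID - 1) + 1


# Without a flush, the XID reported before the crash is handed out again.
server = ToyServer()
before = server.current_xid()
server.crash_and_recover()
after = server.current_xid()
print(before, after)  # prints: 100 100

# With a flush, the assignment survives and the next XID is a fresh one.
server = ToyServer()
before = server.current_xid(flush=True)
server.crash_and_recover()
after = server.current_xid()
print(before, after)  # prints: 100 101
```

In this model the flush=True path loosely corresponds to the proposal debated above, where pg_current_xact_id() would pay the cost of a WAL flush; the thread also records a strong objection to that cost.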