Re: Orphan page in _bt_split

2025-09-04 Thread Michael Paquier
On Wed, Sep 03, 2025 at 09:32:41AM +0300, Konstantin Knizhnik wrote: > On 01/09/2025 10:18 PM, Peter Geoghegan wrote: >> Also rethinking this aspect: a checksum failure probably *isn't* going >> to make much difference. Since that'll also cause bigger problems for >> VACUUM than logging one of thes

Re: Orphan page in _bt_split

2025-09-04 Thread Konstantin Knizhnik
On 04/09/2025 3:55 AM, Peter Geoghegan wrote: On Wed, Sep 3, 2025 at 2:32 AM Konstantin Knizhnik wrote: But sooner or later vacuum will be called for this index and will traverse this page, will not it? There is not other way to reuse this page without deleting it or I am missing something?

Re: Orphan page in _bt_split

2025-09-03 Thread Peter Geoghegan
On Wed, Sep 3, 2025 at 2:32 AM Konstantin Knizhnik wrote: > But sooner or later vacuum will be called for this index and will > traverse this page, will not it? > There is not other way to reuse this page without deleting it or I am > missing something? That's true. But VACUUM won't even attempt

Re: Orphan page in _bt_split

2025-09-03 Thread Michael Paquier
On Mon, Sep 01, 2025 at 02:35:56PM -0400, Peter Geoghegan wrote: > I remember that when I worked on what became commit 9b42e71376 back in > 2019 (which fixed a similar problem caused by the INCLUDE index > patch), Tom suggested that we do things this way defensively (without > being specifically aw

Re: Orphan page in _bt_split

2025-09-02 Thread Konstantin Knizhnik
On 01/09/2025 10:18 PM, Peter Geoghegan wrote: On Mon, Sep 1, 2025 at 3:04 PM Peter Geoghegan wrote: There's just no reason to think that we'd ever be able to tie back one of these LOG messages from VACUUM to the problem within _bt_split. There's too many other forms of corruption that might

Re: Orphan page in _bt_split

2025-09-02 Thread Konstantin Knizhnik
On 02/09/2025 7:42 AM, Michael Paquier wrote: On Mon, Sep 01, 2025 at 02:35:56PM -0400, Peter Geoghegan wrote: I remember that when I worked on what became commit 9b42e71376 back in 2019 (which fixed a similar problem caused by the INCLUDE index patch), Tom suggested that we do things this way

Re: Orphan page in _bt_split

2025-09-01 Thread Michael Paquier
On Mon, Sep 01, 2025 at 03:04:58PM -0400, Peter Geoghegan wrote: > This hazard has existing since commit 8fa30f906b, from 2010. That's > the commit that introduced the general idea of making _bt_split zero > its rightpage in order to make it safe to throw an ERROR instead of just > PANICing. Thank

Re: Orphan page in _bt_split

2025-09-01 Thread Peter Geoghegan
On Mon, Sep 1, 2025 at 1:03 AM Konstantin Knizhnik wrote: > So we should zero this page in case of reporting error to "not confuse > vacuum. > It is done for all places where this function reports error before > entering critical section and wal-logging this page. Right. > _bp_getbuf just reads

Re: Orphan page in _bt_split

2025-09-01 Thread Peter Geoghegan
On Mon, Sep 1, 2025 at 1:35 AM Michael Paquier wrote: > Nice investigation and report, that I assume you have just guessed > from a read of the code and that there could be plenty of errors that > could happen in this code path. It indeed looks like some weak > coding assumption introduced in thi

Re: Orphan page in _bt_split

2025-09-01 Thread Peter Geoghegan
On Mon, Sep 1, 2025 at 3:04 PM Peter Geoghegan wrote: > There's just no reason to think that we'd ever be able to tie back one > of these LOG messages from VACUUM to the problem within _bt_split. > There's too many other forms of corruption that might result in VACUUM > logging this same error (e.

Orphan page in _bt_split

2025-09-01 Thread Konstantin Knizhnik
Hi hacker, The function _bt_split creates new (right) page to split into. The comment says:     /*      * Acquire a new right page to split into, now that left page has a new      * high key.  From here on, it's not okay to throw an error without      * zeroing rightpage first.  This coding rule

Re: Orphan page in _bt_split

2025-08-31 Thread Michael Paquier
On Mon, Sep 01, 2025 at 08:03:18AM +0300, Konstantin Knizhnik wrote: > _bp_getbuf just reads page in a shared buffer. But ReadBuffer actually can > invoke half of Postgres code (locating buffer, evicting victim, writing to > SMGR,..., So there are a lot of places where error can be thrown (includin