Michail Nikolaev <michail.nikol...@gmail.com> wrote:

> After some correspondence with Peter Geoghegan (1) and his ideas, I
> have reworked the patch a lot and now it is much more simple with even
> better performance (no new WAL or conflict resolution, hot standby
> feedback is unrelated).

My review that started in [1] continues here.

(Please note that code.patch does not apply to the current master branch.)

I think I understand your approach now and couldn't find a problem by reading
the code. What I consider worth improving is documentation, both code comments
and nbtree/README. Especially for the problem discussed in [1] it should be
explained what would happen if kill_prior_tuple_min_lsn was not checked.


Also, in IsIndexLpDeadAllowed() you say that invalid
deadness->latest_removed_xid means the following:

    /*
     * Looks like it is tuple cleared by heap_page_prune_execute,
     * we must be sure if LSN of XLOG_HEAP2_CLEAN (or any subsequent
     * updates) less than minRecoveryPoint to avoid MVCC failure
     * after crash recovery.
     */

However I think there's one more case: if heap_hot_search_buffer() considers
all tuples in the chain to be "surely dead", but
HeapTupleHeaderAdvanceLatestRemovedXid() skips them all for this reason:

    /*
     * Ignore tuples inserted by an aborted transaction or if the tuple was
     * updated/deleted by the inserting transaction.
     *
     * Look for a committed hint bit, or if no xmin bit is set, check clog.
     */

I think that the dead tuples produced this way should never be visible on the
standby (and even if they were, they would change the page LSN so your
algorithm would treat them correctly) so I see no correctness problem. But it
might be worth explaining better the meaning of invalid "latest_removed_xid"
in comments.


In the nbtree/README, you say

    "... if the commit record of latestRemovedXid is more ..."

but it's not clear to me what "latestRemovedXid" is. If you mean the
scan->kill_prior_tuple_min_lsn field, you probably need more words to explain
it.


* IsIndexLpDeadAllowed()

    /* It all always allowed on primary if *all_dead. */

should probably be

    /* It is always allowed on primary if *all_dead. */


* gistkillitems()

As the function is only called if (so->numKilled > 0), I think both
"killedsomething" and "dirty" variables should always have the same value, so
one variable should be enough. Assert(so->numKilled) would be appropriate in
that case.

The situation is similar for btree and hash indexes.


doc.patch:

"+applying the fill page write."



[1] https://www.postgresql.org/message-id/61470.1620647290%40antos

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com


Reply via email to