Michail Nikolaev <michail.nikol...@gmail.com> wrote: > After some correspondence with Peter Geoghegan (1) and his ideas, I > have reworked the patch a lot and now it is much more simple with even > better performance (no new WAL or conflict resolution, hot standby > feedback is unrelated).
My review that started in [1] continues here. (Please note that code.patch does not apply to the current master branch.) I think I understand your approach now and couldn't find a problem by reading the code. What I consider worth improving is documentation, both code comments and nbtree/README. Especially for the problem discussed in [1] it should be explained what would happen if kill_prior_tuple_min_lsn was not checked. Also, in IsIndexLpDeadAllowed() you say that invalid deadness->latest_removed_xid means the following: /* * Looks like it is tuple cleared by heap_page_prune_execute, * we must be sure if LSN of XLOG_HEAP2_CLEAN (or any subsequent * updates) less than minRecoveryPoint to avoid MVCC failure * after crash recovery. */ However I think there's one more case: if heap_hot_search_buffer() considers all tuples in the chain to be "surely dead", but HeapTupleHeaderAdvanceLatestRemovedXid() skips them all for this reason: /* * Ignore tuples inserted by an aborted transaction or if the tuple was * updated/deleted by the inserting transaction. * * Look for a committed hint bit, or if no xmin bit is set, check clog. */ I think that the dead tuples produced this way should never be visible on the standby (and even if they were, they would change the page LSN so your algorithm would treat them correctly) so I see no correctness problem. But it might be worth explaining better the meaning of invalid "latest_removed_xid" in comments. In the nbtree/README, you say "... if the commit record of latestRemovedXid is more ..." but it's not clear to me what "latestRemovedXid" is. If you mean the scan->kill_prior_tuple_min_lsn field, you probably need more words to explain it. * IsIndexLpDeadAllowed() /* It all always allowed on primary if *all_dead. */ should probably be /* It is always allowed on primary if *all_dead. */ * gistkillitems() As the function is only called if (so->numKilled > 0), I think both "killedsomething" and "dirty" variables should always have the same value, so one variable should be enough. Assert(so->numKilled) would be appropriate in that case. The situation is similar for btree and hash indexes. doc.patch: "+applying the fill page write." [1] https://www.postgresql.org/message-id/61470.1620647290%40antos -- Antonin Houska Web: https://www.cybertec-postgresql.com