On Mon, 10 Aug 2020 at 23:56, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
>
> The problem was simply that when a page is
> examined by a seqscan, we do HeapTupleSatisfiesVisibility of each tuple
> in isolation; and for each tuple we call SetHintBits().  And only the
> first time the FPI happens; by the time we get to the second tuple, the
> page is already dirty, so there's no need to emit an FPI.  But the FPI
> we sent only had the bit on the first tuple ... so the standby will not
> have the bit set for any subsequent tuple.  And on promotion, the
> standby will have to have the bits set for all those tuples, unless you
> happened to dirty the page again later for other reasons.
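
To make the sequence described above concrete, here is a toy model of it.
This is only a sketch and not PostgreSQL code: Page, seqscan_set_hints and
replay are hypothetical names, the WAL is modelled as a plain list of page
images, and a "page" is just one hint flag per tuple. It assumes, as in the
description, that the full-page image is taken right after the first hint bit
is set and that later hint updates on the already-dirty page are not logged.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Toy page: one hint flag per tuple plus a dirty flag (hypothetical)."""
    hints: List[bool]
    dirty: bool = False


def seqscan_set_hints(page: Page, wal: List[List[bool]]) -> None:
    """Visit each tuple in isolation and set its hint bit.

    Only the first change to a clean page logs a full-page image (FPI).
    Once the page is dirty, later hint bits are set without logging anything.
    """
    for i in range(len(page.hints)):
        if page.hints[i]:
            continue
        page.hints[i] = True
        if not page.dirty:
            # Snapshot of the page as it looks right now: only the bits
            # set so far are captured in the image.
            wal.append(list(page.hints))
            page.dirty = True


def replay(wal: List[List[bool]], ntuples: int) -> Page:
    """Build the standby's copy of the page by applying the logged images."""
    standby = Page(hints=[False] * ntuples)
    for image in wal:
        standby.hints = list(image)
    return standby


primary = Page(hints=[False] * 4)
wal: List[List[bool]] = []
seqscan_set_hints(primary, wal)
standby = replay(wal, 4)

print("primary:", primary.hints)
print("standby:", standby.hints)
```

In this model the primary ends up with every hint set while the standby only
sees the first one, which is the divergence under discussion.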

Which probably means that pg_rewind is broken because it won't be able
to rewind correctly.

> One simple idea to try to forestall this problem would be to modify the
> algorithm so that all tuples are scanned and hinted if the page is going
> to be dirtied -- then send a single FPI setting bits for all tuples,
> instead of just on the first tuple.

This would make latency much worse for non-seqscan cases.

Certainly for seqscans it would make sense to emit a message that sets
the hints for all tuples at once, or possibly emit an FPI and then follow
that with a second message that sets all other hints on the page.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
Mission Critical Databases