On Sun, May 1, 2016 at 11:54 PM, Bruce Momjian <br...@momjian.us> wrote:
> On Sat, Apr 23, 2016 at 10:20:19AM -0400, Bruce Momjian wrote:
>> On Sat, Apr 23, 2016 at 12:48:08PM +0530, Amit Kapila wrote:
>>> On Sat, Apr 23, 2016 at 8:34 AM, Bruce Momjian <br...@momjian.us> wrote:
>>>>
>>>> I kind of agreed with Tom about just aborting transactions that held
>>>> snapshots for too long, and liked the idea this could be set per
>>>> session, but the idea that we abort only if a backend actually touches
>>>> the old data is very nice.  I can see why the patch author worked hard
>>>> to do that.
>
> As I understand it, a transaction trying to access a shared buffer
> aborts if there was a cleanup on the page that removed rows it might be
> interested in.  How does this handle cases where vacuum removes _pages_
> from the table?

(1) When the "snapshot too old" feature is enabled
(old_snapshot_threshold >= 0), relations are not truncated, so pages
cannot be removed that way.  This mainly protects a seq scan from the
problem you describe.

(2) Other than a seq scan, you could miss a page when descending
through an index or following sibling pointers within an index.  In
either case, you can't remove a page without modifying the page that
points to it so that it no longer does, so the modified LSN on the
parent or sibling will trigger the error.

Note that a question has recently been raised regarding hash indexes
(which should perhaps be generalized to any non-WAL-logged index on a
permanent table).  Since that is a correctness issue rather than a
performance issue, I will add it to the release blockers list and
prioritize it ahead of the issues that affect only how many people
will find the feature useful.

> Does vacuum avoid this when there are running transactions?

I'm not sure I understand the question.

>> Also, it seems we have similar behavior already in applying WAL on the
>> standby --- we delay WAL replay when there is a long-running
>> transaction.  Once the time expires, we apply the WAL.  Do we cancel the
>> long-running transaction at that time, or wait for the long-running
>> transaction to touch some WAL we just applied?  If the former, does
>> Kevin's new code allow us to do the later?
>
> Is this a TODO item?

I'm not aware of any TODO items existing or needed here.  The feature
operates by adjusting the xmin used by vacuum and pruning, and leaving
all the other mechanisms functioning as they were.  That looked to me
like it should interact with replication streams correctly.  If
someone sees something that needs adjustment, please speak up Real
Soon Now.
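
To make the LSN argument in (2) concrete, here is a rough toy model of the check.  It is only a sketch under simplifying assumptions, not the actual PostgreSQL code: the names (Snapshot, Page, is_too_old, check_page) are made up for illustration, LSNs are plain integers, and the threshold is in seconds.  The point it shows is that the error depends on two things together: the snapshot has aged past the threshold, and the page being read was modified after the snapshot was taken.

```python
from dataclasses import dataclass


class SnapshotTooOld(Exception):
    """Raised when an aged-out snapshot reads a page modified after it was taken."""


@dataclass
class Snapshot:
    lsn: int         # WAL position at the time the snapshot was taken (toy integer)
    taken_at: float  # time the snapshot was taken, in seconds


@dataclass
class Page:
    lsn: int         # WAL position of the last modification to this page


def is_too_old(snapshot: Snapshot, page: Page, now: float, threshold: float) -> bool:
    """Return True if reading this page with this snapshot should fail.

    A negative threshold models the feature being disabled.
    """
    if threshold < 0:
        return False
    aged_out = (now - snapshot.taken_at) > threshold
    modified_since = page.lsn > snapshot.lsn
    return aged_out and modified_since


def check_page(snapshot: Snapshot, page: Page, now: float, threshold: float) -> None:
    """Raise SnapshotTooOld instead of returning possibly incomplete data."""
    if is_too_old(snapshot, page, now, threshold):
        raise SnapshotTooOld("snapshot too old")


if __name__ == "__main__":
    snap = Snapshot(lsn=100, taken_at=0.0)
    parent = Page(lsn=90)

    # Parent index page untouched since the snapshot: the read is allowed,
    # even though the snapshot itself is past the threshold.
    check_page(snap, parent, now=120.0, threshold=60.0)

    # Removing a child page requires rewriting the parent that points to it,
    # which advances the parent's LSN past the snapshot's LSN.
    parent.lsn = 150
    try:
        check_page(snap, parent, now=120.0, threshold=60.0)
    except SnapshotTooOld as exc:
        print("error:", exc)
```

In this model a scan never needs to see the removed page itself.  It is enough that the parent or sibling it has to pass through carries a newer LSN, which is why unlinking a page cannot be missed silently.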

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)