On Thu, 7 Mar 2024 at 19:37, Michail Nikolaev <michail.nikol...@gmail.com> wrote:
>
> Hello!
>
> > I'm not a fan of this approach. Changing visibility and cleanup
> > semantics to only benefit R/CIC sounds like a pain to work with in
> > essentially all visibility-related code. I'd much rather have to deal
> > with another index AM, even if it takes more time: the changes in
> > semantics will be limited to a new plug in the index AM system and a
> > behaviour change in R/CIC, rather than behaviour that changes in all
> > visibility-checking code.
>
> Technically, this does not affect the visibility logic, only the
> clearing semantics.
> All visibility related code remains untouched.

Yeah, correct. But it still needs to update the table relations'
information after finishing creating the indexes, which I'd rather not
have to do.

> But yes, still an inelegant and a little strange-looking option.
>
> At the same time, perhaps it can be dressed in luxury
> somehow - for example, add as a first class citizen in
> ComputeXidHorizonsResult
> a list of blocks to clear some relations.

Not sure what you mean here, but I don't think
ComputeXidHorizonsResult should have anything to do with actual
relations.

> > But regardless of second scan snapshots, I think we can worry about
> > that part at a later moment: The first scan phase is usually the most
> > expensive and takes the most time of all phases that hold snapshots,
> > and in the above discussion we agreed that we can already reduce the
> > time that a snapshot is held during that phase significantly. Sure, it
> > isn't great that we have to scan the table again with only a single
> > snapshot, but generally phase 2 doesn't have that much to do (except
> > when BRIN indexes are involved) so this is likely less of an issue.
> > And even if it is, we would still have reduced the number of
> > long-lived snapshots by half.
>
> Hmm, but it looks like we don't have the infrastructure to "update" xmin
> propagating to the horizon after the first snapshot in a transaction is taken.

We can just release the current snapshot, and get a new one, right? I
mean, we don't actually use the transaction for much else than
visibility during the first scan, and I don't think there is a need
for an actual transaction ID until we're ready to mark the index entry
with indisready.

> One option I know of is to reuse the
> d9d076222f5b94a85e0e318339cfc44b8f26022d (1) approach.
> But if this is the case, then there is no point in re-taking the
> snapshot again during the first
> phase - just apply this "if" only for the first phase - and you're done.

Not a fan of that, as it is too sensitive to abuse.
Note that extensions will also have access to these tools, and I think
we should build a system here that's not easy to break, rather than
one that is.

> Do you know any less-hacky way? Or is it a nice way to go?

I suppose we could be resetting the snapshot every so often? Or use
multiple successive TID range scans with a new snapshot each?

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)
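
To make the last suggestion (successive range scans, each under a fresh snapshot) a bit more concrete, here is a toy model. It is not PostgreSQL code and uses no PostgreSQL API: `ToyCluster`, `scan_in_ranges` and the "one concurrent transaction per block" rate are all assumptions invented for illustration. It only shows how far the oldest held xmin can lag behind the newest transaction when the scan is split into ranges, and says nothing about how results seen under different snapshots would have to be combined.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical stand-in for a server: an xid counter plus open snapshots."""
    next_xid: int = 100
    open_snapshots: list = field(default_factory=list)

    def take_snapshot(self) -> int:
        # A snapshot is modelled only by the xmin it pins.
        xmin = self.next_xid
        self.open_snapshots.append(xmin)
        return xmin

    def release_snapshot(self, xmin: int) -> None:
        self.open_snapshots.remove(xmin)

    def advance(self, n: int = 1) -> None:
        # Simulate n concurrent transactions finishing while we scan.
        self.next_xid += n

    def horizon(self) -> int:
        # Oldest xmin still pinned; cleanup cannot move past this point.
        return min(self.open_snapshots, default=self.next_xid)


def scan_in_ranges(cluster: ToyCluster, n_blocks: int, blocks_per_range: int) -> int:
    """Scan blocks [0, n_blocks) in successive ranges, one snapshot per range.

    Returns the largest lag (next_xid - horizon) observed during the scan.
    """
    max_lag = 0
    for start in range(0, n_blocks, blocks_per_range):
        end = min(start + blocks_per_range, n_blocks)
        snap = cluster.take_snapshot()
        try:
            for _block in range(start, end):
                cluster.advance()  # assumed rate: one transaction per block
                max_lag = max(max_lag, cluster.next_xid - cluster.horizon())
        finally:
            # Releasing before the next range lets the horizon move forward.
            cluster.release_snapshot(snap)
    return max_lag


if __name__ == "__main__":
    print("single snapshot:", scan_in_ranges(ToyCluster(), 1000, 1000))
    print("100-block ranges:", scan_in_ranges(ToyCluster(), 1000, 100))
```

In this model the worst-case lag is bounded by the range size rather than the table size, which is the property the "reset the snapshot every so often" idea is after.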