On 4/24/15 5:30 PM, Alvaro Herrera wrote:
Jim Nasby wrote:

It looks like the biggest complaint (aside from allowing a limited number of
tuples to be moved) is in [1] and [2], where Tom is saying that you can't
simply call heap_update() like this without holding an exclusive lock on the
table. Is that because we're not actually changing the tuple?

That's nonsense -- obviously UPDATE can do heap_update without an
exclusive lock on the table, so the explanation must be something else.
I think his actual complaint was that you can't remove the old tuple
until concurrent readers of the table have already finished scanning it,
or you get into a situation where they might need to read the page in
which the original version resided, but your mini-vacuum already removed
it.  So before removing it you need to wait until they are all finished.
This is the reason I mentioned CREATE INDEX CONCURRENTLY: if you wait
until those transactions are all gone (like CIC does), you are then free
to remove the old versions of the tuple, because you know that all
readers have a snapshot new enough to see the new version of the tuple.

Except I don't see anywhere in the patch that's actually removing the old tuple...

Another issue is both HOT and KeyUpdate; I think we need to completely
ignore/over-ride that stuff for this.

You don't need anything for HOT, because cross-page updates are never
HOT.  Not sure what you mean about KeyUpdate, but yeah you might need
something there -- obviously, you don't want to create multixacts
unnecessarily.

If I'm not mistaken, if there's enough room left on the page then HeapSatisfiesHOTandKeyUpdate() will say this tuple satisfies HOT. So we'd have to do something to over-ride that, and I don't think the current patch does that. (It might force it to a new page anyway, but it does nothing with satisfies_hot, which I suspect isn't safe.)

Instead of adding forcefsm, I think it would be more useful to accept a
target block number. That way we can actually control where the new tuple
goes.

Whatever makes the most sense, I suppose.  (Maybe we shouldn't consider
this a tweaked heap_update -- which is already complex enough -- but a
separate heapam entry point.)

Yeah, I thought about creating heap_move, but I suspect that would still have to worry about a lot of this other stuff anyway. Far more likely for a change to be missed in heap_move than heap_update too.

I am tempted to add a SQL heap_move function though, assuming it's not much extra work.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to