On Sat, May 21, 2011 at 8:50 PM, Kevin Grittner <kevin.gritt...@wicourts.gov> wrote: > Robert Haas wrote: >> How is an UPDATE different from a DELETE and an INSERT? > > That's the question Jeff asked which caused us to revisit this. > > There are two answers -- conceptual and implementation. > > Conceptually, an updated row is the same row, and a row inserted after a > delete is a new row. Note that READ COMMITTED doesn't treat them the > same on a write conflict. To give a practical example, police > departments in Wisconsin typically reassign a badge number to a new > officer after an existing officer leaves. Updating an officer record > keyed by badge number (say, with a new address or a name change) would > be qualitatively different from deleting an old officer record and > inserting a new one for a different person now getting the badge number. > (OK, so this is somewhat artificial, because they track who had the > number in what temporal ranges, but you get the idea.) > > In the implementation the only difference between an UPDATE in a table > and a DELETE and INSERT in the same table in the same transaction > (besides concurrency handling) is the ctid linkage, at least as far as I > can see.
I think the implementation is what matters here. I understand that there are certain situations in which the user might choose to UPDATE a row and other situations in which they might choose to DELETE and then INSERT: but the user's intent is basically irrelevant. If the system treats those operations in basically the same way, then it shouldn't be necessary to follow the CTID chain in one case given that there is no CTID chain in the other case. Or to put that another way, if it is necessary to follow the CTID chain, then we should be able to articulate a reason for that necessity -- something that is materially different in the UPDATE case. Otherwise, we're just following the chain "because it's there". It seems to me that we can actually state with some degree of precision what that "material difference" would need to be. The goal of SSI is to prevent serialization anomalies that would not be prevented by snapshot isolation. Let's suppose that it successfully does that in the DELETE/INSERT case. Suppose further that we change SSI so that it handles the UPDATE case in the same way that it handles the DELETE/INSERT case. This change will be incorrect only if there is a serialization anomaly that snapshot isolation *would have prevented* in the DELETE/INSERT case that *it does not prevent* in the update case. In other words, if SSI needs to be more rigorous in the UPDATE case, it can only be because snapshot isolation is less rigorous in that case, and the additional rigor that SSI must apply there must be exactly equal to whatever snapshot isolation isn't picking up (as compared with the DELETE/INSERT case). Does that make any sense? It seems logical to me, but IJWH. > So I think that you can't just treat the two things the same in SSI just > because the PostgreSQL implementation details make them similar; but I > think that you can treat the two things the same for the reasons I > worked out at the start of the thread. Your argument seems reasonable to me; but it would be nice if we could find a simpler one, because simpler arguments are less likely to be incorrect. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers