Re: [HACKERS] UPSERT

Simon Riggs Fri, 02 Mar 2007 10:06:12 -0800

On Fri, 2007-03-02 at 15:41 +0000, Heikki Linnakangas wrote:
> Tom Lane wrote:
> > Andrew Dunstan <[EMAIL PROTECTED]> writes:
> >> My instinct would be to follow your first strategy, i.e. detect which 
> >> path is needed rather than try one and then if it fails do the other.
> > 
> > The very first thing you need to think about is how to solve the race
> > condition problem, ie, two backends concurrently trying to insert
> > identical data.  Until you have a plausible mechanism for that, the
> > whole thing is pie-in-the-sky.
> 
> How about:
> 
> 1. Insert new heap tuple
> 2. Try to insert the index tuple. If there's a duplicate tuple, lock the 
> existing tuple instead of throwing an error.
> 3. If there was no duplicate, we're done.
> 
> 4. Otherwise, kill the new tuple inserted in step 1, by setting its 
> xmin to InvalidTransactionId.
> 5. Perform the UPDATE on the existing tuple.
> 
> This requires one change to the indexam api: a duplicate key violation 
> needs to lock the existing tuple instead of throwing an error.


So if the INSERT fails we will leave two dead copies of the tuples? Hmm.
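
To make the count concrete, here is a toy model of the five quoted steps. It is only a sketch: a Python list stands in for the heap and a dict for the unique index, and the names HeapTuple and upsert_insert_first are made up for illustration, not PostgreSQL internals. It also assumes the UPDATE in step 5 writes a new row version and leaves the old one dead, and it ignores locking entirely.

```python
from dataclasses import dataclass


@dataclass
class HeapTuple:
    key: int
    value: str
    dead: bool = False


def upsert_insert_first(heap, index, key, value):
    """Toy version of the quoted steps 1-5 (hypothetical names, no locking)."""
    # Step 1: insert the new heap tuple unconditionally.
    new = HeapTuple(key, value)
    heap.append(new)
    # Step 2: try the index insert; a duplicate hands back the existing tuple.
    existing = index.get(key)
    if existing is None:
        # Step 3: no duplicate, we're done.
        index[key] = new
        return "insert"
    # Step 4: kill the tuple inserted in step 1.
    new.dead = True
    # Step 5: UPDATE the existing row; assumed here to write a new version
    # and leave the old version dead.
    updated = HeapTuple(key, value)
    heap.append(updated)
    existing.dead = True
    index[key] = updated
    return "update"


heap, index = [], {}
first = upsert_insert_first(heap, index, 1, "a")
second = upsert_insert_first(heap, index, 1, "b")
dead_count = sum(t.dead for t in heap)
print(first, second, dead_count, len(heap))
```

Under those assumptions the second call leaves three tuples in the heap, two of them dead: the one killed in step 4 and the old version superseded in step 5.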

Seems like we should try to locate a row first, then INSERT if we cannot
find one. That's slower on INSERT but more balanced overall: sometimes
the input will generate all UPDATEs, sometimes all INSERTs, and we'll
never know in advance.
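
A minimal sketch of the locate-first ordering, using Python's sqlite3 module purely as a stand-in database. The table t, its columns k and v, and the function name upsert_lookup_first are assumptions for illustration. No locking is shown, so two concurrent sessions could still both miss the row; this does not address the race condition raised above.

```python
import sqlite3


def upsert_lookup_first(conn, key, value):
    """Look for the row first; UPDATE if found, INSERT otherwise."""
    row = conn.execute("SELECT 1 FROM t WHERE k = ?", (key,)).fetchone()
    if row is not None:
        conn.execute("UPDATE t SET v = ? WHERE k = ?", (value, key))
        return "update"
    # Nothing found: fall through to a plain INSERT.
    conn.execute("INSERT INTO t (k, v) VALUES (?, ?)", (key, value))
    return "insert"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k INTEGER PRIMARY KEY, v TEXT)")
first = upsert_lookup_first(conn, 1, "a")
second = upsert_lookup_first(conn, 1, "b")
rows = conn.execute("SELECT k, v FROM t").fetchall()
print(first, second, rows)
```

The extra SELECT is the cost paid on the INSERT path; in exchange the UPDATE path never creates a throwaway tuple.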


I'm a bit surprised the TODO didn't mention the MERGE statement, which
is the SQL:2003 syntax for specifying this as an atomic statement. There
are lots of other syntaxes, the simplest of which are MySQL's
REPLACE and Teradata's UPDATE ... ELSE INSERT. As seductive as they are,
I'd say that's all the more reason to go with the single approved
syntax. If MySQL are standards compliant, they will support that also,
so we get MySQL compatibility either way.

Another thought that really ought to be on the TODO is a MERGE FROM
(pick your syntax) that allows MERGE to act like a COPY, reading data
from an external data file. That would save effort, since the only way
of doing this currently is to do a COPY then an UPDATE and then an
INSERT. So the MERGE FROM would reduce three operations to just a single
command. 
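
For reference, the three-step workaround looks roughly like this. It is a sketch only, again using Python's sqlite3 module as a stand-in; the table names target and staging and the sample rows are made up, and executemany plays the role of COPY loading the external file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (k INTEGER PRIMARY KEY, v TEXT)")
conn.execute("CREATE TABLE staging (k INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO target VALUES (?, ?)", [(1, "old"), (2, "keep")])

# 1. "COPY": bulk-load the incoming data into a staging table.
conn.executemany("INSERT INTO staging VALUES (?, ?)", [(1, "new"), (3, "added")])

# 2. UPDATE the target rows that have a match in staging.
conn.execute(
    "UPDATE target SET v = (SELECT s.v FROM staging s WHERE s.k = target.k) "
    "WHERE k IN (SELECT k FROM staging)"
)

# 3. INSERT the staging rows that have no match in the target.
conn.execute(
    "INSERT INTO target (k, v) "
    "SELECT s.k, s.v FROM staging s "
    "WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.k = s.k)"
)

rows = conn.execute("SELECT k, v FROM target ORDER BY k").fetchall()
print(rows)
```

A MERGE FROM would fold the load, the UPDATE and the INSERT into one command and would not need the staging table.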

-- 
  Simon Riggs             
  EnterpriseDB   http://www.enterprisedb.com
