Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

Heikki Linnakangas Tue, 17 Mar 2015 12:13:40 -0700

On 03/05/2015 03:18 AM, Peter Geoghegan wrote:

Attached patch series forms what I'm calling V3.0 of the INSERT ... ON
CONFLICT IGNORE/UPDATE feature. (Sorry about all the threads. I feel
this development justifies a new thread, though.)

I'm still not sure the way the speculative locking works is the bestapproach. Instead of clearing xmin on super-deletion, a new flag on theheap tuple seems more straightforward. And we could put the speculativeinsertion token in t_ctid, instead of stashing it in the PGPROC array.That would again seem more straightforward.


A couple of quick random comments:

/*
 * plan_speculative_use_index
 *              Use the planner to decide speculative insertion arbiter index
 *
 * Among indexes on target of INSERT ... ON CONFLICT, decide which index to
 * use to arbitrate taking alternative path.  This should be called
 * infrequently in practice, because it's unusual for more than one index to
 * be available that can satisfy a user-specified unique index inference
 * specification.
 *
 * Note: caller had better already hold some type of lock on the table.
 */
Oid
plan_speculative_use_index(PlannerInfo *root, List *indexList)
{
        ...
        /* Locate cheapest IndexOptInfo for the target index */

If I'm reading this correctly, if there are multiple indexes that matchthe unique index inference specification, we pick the cheapest one.Isn't that unstable? Two backends running the same INSERT ON CONFLICTstatement might pick different indexes, and the decision might changeover time as the table is analyzed. I think we should have a more robustrule. Could we easily just use all matching indexes?

... Deferred unique constraints are not supported as
+   arbiters of whether an alternative <literal>ON CONFLICT</> path
+   should be taken.

We really need to find a shorter term for "arbiter of whether analternative path should be taken". Different variations of that term areused a lot, and it's tedious to read.

* There is still an unresolved semantics issue with unique index
inference and non-default opclasses. I think it's sufficient that the
available/defined unique indexes dictate our idea of a unique
violation (that necessitates taking the alternative path). Even in a
world where there exists a non-default opclass with an "equals"
operator that does not match that of the default opclass (that is not
really the world we live in, because the only counter-example known is
made that way specifically to *discourage* its use by users), this
seems okay to me. It seems okay to me because surely the relevant
definition of equality is the one actually chosen for the available
unique index. If there exists an ambiguity for some strange reason
(i.e. there are two unique indexes, on the same attribute(s), but with
different "equals" operators), then its a costing issue, so the
behavior given is essentially non-predictable (it could end up being
either...but hey, those are the semantics). I have a very hard time
imagining how that could ever be the case, even when we have (say)
case insensitive opclasses for the text type. A case insensitive
opclass is stricter than a case sensitive opclass.  Why would a user
ever want both on the same attribute(s) of the same table? Is the user
really more or less expecting to never get a unique violation on the
non-arbitrating unique index, despite all this?

If reviewers are absolutely insistent that this theoretical ambiguity
is a problem, we can add an optional CREATE INDEX style opclass
specification (I'm already using the IndexElems representation from
CREATE INDEX for the inference specification, actually, so that would
be easy enough). I really have a hard time believing that the ugliness
is a problem for those hypothetical users that eventually consider
"equals" operator ambiguity among opclasses among available unique
indexes to be a problem. I haven't just gone and implemented this
already because I didn't want to document that an opclass
specification will be accepted. Since there is literally no reason why
anyone would care today, I see no reason to add what IMV would really
just be cruft.

I've been thinking that it would be nice to be able to specify aconstraint name. Naming an index directly feels wrong, as in relationaland SQL philosophy, indexes are just an implementation detail, butnaming a constraint is a fair game. It would also be nice to be able tospecify "use the primary key".


Attached patch contains a few more things I saw at a quick read.

- Heikki

diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index 46b8db8..d6fa98c 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -1014,6 +1014,10 @@ GetForeignServerByName(const char *name, bool missing_ok);
      source provides.
     </para>
 
+    <!-- FIXME: If this is a hard limitation with the backend, the backend
+         should check and reject these cases. Otherwise, if it's possible
+         that a FDW would actually support this, this needs to be reworded
+      -->
     <para>
      <command>INSERT</> with an <literal>ON CONFLICT</> clause is not
      supported with a unique index inference specification.  When
diff --git a/doc/src/sgml/ref/create_view.sgml b/doc/src/sgml/ref/create_view.sgml
index 2b68121..24fb4b7 100644
--- a/doc/src/sgml/ref/create_view.sgml
+++ b/doc/src/sgml/ref/create_view.sgml
@@ -385,6 +385,7 @@ CREATE VIEW vista AS SELECT text 'Hello World' AS hello;
     <xref linkend="rules-privileges">).
    </para>
    <para>
+    <!-- FIXME: This paragraph needs to reworded, to make it more readable. -->
     <command>INSERT</command> with an <literal>ON CONFLICT IGNORE</>
     clause is only supported on updatable views under specific
     circumstances.  If a set of columns/expressions has been provided
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index 1395c48..4f3878b 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -33,9 +33,12 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
   <title>Description</title>
 
   <para>
-   <command>INSERT</command> inserts new rows into a table.  One can
-   insert one or more rows specified by value expressions, or zero or
-   more rows resulting from a query.  An alternative
+   <command>INSERT</command> inserts new rows into a table.
+   One can insert one or more rows specified by value expressions,
+   or zero or more rows resulting from a query.
+   <!-- FIXME: This needs to go somewhere later, not in the very first
+        introductory paragraph on INSERT in general -->
+   An alternative
    <literal>IGNORE</literal> path can optionally be specified, to be
    taken in the event of detecting that proceeding with insertion
    would result in a conflict (i.e. a conflicting tuple already
diff --git a/doc/src/sgml/ref/set_constraints.sgml b/doc/src/sgml/ref/set_constraints.sgml
index 5030f3c..b1e8f9b 100644
--- a/doc/src/sgml/ref/set_constraints.sgml
+++ b/doc/src/sgml/ref/set_constraints.sgml
@@ -69,7 +69,11 @@ SET CONSTRAINTS { ALL | <replaceable class="parameter">name</replaceable> [, ...
   <para>
    Currently, only <literal>UNIQUE</>, <literal>PRIMARY KEY</>,
    <literal>REFERENCES</> (foreign key), and <literal>EXCLUDE</>
-   constraints are affected by this setting.  Note that constraints
+   constraints are affected by this setting.
+   <!-- FIXME: I take that this is meant to say that DEFERRED constraints
+        cannot be used as arbiters. e.g.  SET CONSTRAINTS ALL IMMEDIATE is
+        OK. -->
+   Note that constraints
    that were created with this clause cannot be used as arbiters of
    whether or not to take the alternative path with an
    <command>INSERT</command> statement that includes an <literal>ON
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index ffe1d62..0dabd16 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1548,6 +1548,7 @@ retry:
 			if (DirtySnapshot.speculativeToken)
 				SpeculativeInsertionWait(DirtySnapshot.xmin,
 										 DirtySnapshot.speculativeToken);
+			/* FIXME: both arms of this if do the same thing. Huh? */
 			else if (violationOK)
 				XactLockTableWait(xwait, heap, &ctid_wait,
 								  XLTW_RecheckExclusionConstr);
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 99c8d6a..d31db33 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -4118,10 +4118,10 @@ string_to_const(const char *str, Oid datatype)
  * plan_speculative_use_index
  *		Use the planner to decide speculative insertion arbiter index
  *
- * Among indexes on target of INSERT ... ON CONFLICT,  decide which index to
+ * Among indexes on target of INSERT ... ON CONFLICT, decide which index to
  * use to arbitrate taking alternative path.  This should be called
- * infrequently in practice, because its unusual for more than one index to be
- * available that can satisfy a user-specified unique index inference
+ * infrequently in practice, because it's unusual for more than one index to
+ * be available that can satisfy a user-specified unique index inference
  * specification.
  *
  * Note: caller had better already hold some type of lock on the table.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

Reply via email to