This can be solved by chaining modifying CTEs.

Something like this (not tested)  that can work with multiple rows inserted:


WITH
  vals (bk1, bk2, other_t1_columns, other_t2_columns) AS
    ( VALUES (bk1val, bk2val, other_t1_values, other_t2_values),
             (bk1val, bk2val, other_t1_values, other_t2_values)
    ),
  ins_t1 AS
    ( INSERT INTO t1 (bk1, bk2, other columns)
      SELECT bk1, bk2, other_t1_columns
      FROM vals
      ON CONFLICT (bk1val, bk2val) DO NOTHING
      RETURNING id, bk1, bk2
    )
INSERT INTO t2 (t1_id, other_t2_columns)
SELECT
    COALESCE(t1.id, ins_t1,id),
    val.bk1, val.bk2, val.other_t2_columns
FROM vals
  LEFT JOIN ins_t1 ON (vals.bk1, vals.bk2) = (ins_t1.bk1, ins_t1.bk2)
  LEFT JOIN t1     ON (vals.bk1, vals.bk2) = (t1.bk1, t1.bk2)
 ;

On Wed, Oct 5, 2016 at 1:53 AM, Tom Dunstan <pg...@tomd.cc> wrote:

> Hi all
>
> We recently moved to using 9.5 and were hoping to use the new upsert
> functionality, but unfortunately it doesn’t quite do what we need.
>
> Our setup is something like this:
>
> CREATE TABLE t1 (
>   id BIGSERIAL NOT NULL PRIMARY KEY,
>   bk1 INT,
>   bk2 UUID
>   — other columns
> );
> CREATE UNIQUE INDEX t1_bk ON t1 (bk1, bk2);
>
> CREATE TABLE t2 (
>   t1_id BIGINT NOT NULL REFERENCES t1
>  — other stuff
> );
>
> Data comes in as inserts of one tuple each of t1 and t2. We expect inserts
> to t1 to be heavily duplicated. That is, for stuff coming in we expect a
> large number of rows to have duplicate (bk1, bk2), and we wish to discard
> those, but not discard the t2 tuple - those should always be inserted and
> reference the correct t1 record.
>
> So we currently have an insertion function that does this:
>
> BEGIN
>   INSERT INTO t1 (bk1, bk2, other columns)
>   VALUES (bk1val, bk2val, other values)
>   RETURNING id
>   INTO t1_id;
> EXCEPTION WHEN unique_violation THEN
>   SELECT id
>   FROM t1
>   WHERE bk1 = bk1val AND bk2 = bk2val
>   INTO t1_id;
> END;
>
> INSERT INTO t2(t1_id, other columns) VALUES(t1_id, other values);
>
> We were hoping that we’d be able to do something like this:
>
> INSERT INTO t1 (bk1, bk2, other columns)
>   VALUES (bk1val, bk2val, other values)
>   ON CONFLICT (bk1val, bk2val) DO NOTHING
>   RETURNING id
>   INTO t1_id;
> INSERT INTO t2(t1_id, other columns) VALUES(t1_id, other values);
>
> But unfortunately it seems that the RETURNING clause returns null when
> there’s a conflict, rather than the existing row’s value.
>
> I understand that there is ambiguity if there were multiple rows that were
> in conflict. I think this sort of functionality really only makes sense
> where the conflict target is a unique constraint, so IMO it would make
> sense to only support returning columns in that case.
>
> I imagine that this would be possible to do more efficiently than the
> subsequent query that we are currently doing given that postgres has
> already found the rows in question, in the index at least. I have no idea
> how hard it would actually be to implement though. FWIW my use-case would
> be supported even if this only worked for indexes where the to-be-returned
> columns were stored in the index using Anastasia’s covering + unique index
> patch, when that lands.
>
> Thoughts?
>
> Tom
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

Reply via email to