Re: [HACKERS] Writeable CTEs and side effects

Bruce Momjian Fri, 09 Oct 2009 13:32:31 -0700

Added to TODO:

        Allow INSERT/UPDATE/DELETE ... RETURNING in common table expressions


        *  http://archives.postgresql.org/pgsql-hackers/2009-10/msg00472.php

---------------------------------------------------------------------------

Marko Tiikkaja wrote:
> I've made progress in implementing writeable CTEs (repo at
> git://git.postgresql.org/git/writeable_cte.git , branch actually_write)
> and I've hit a few corner-cases which have lead me to think that we
> should be handling DML inside CTEs a bit differently.  Before I go on
> implementing this, I'd like to hear your input.
> 
> 1)    WITH t AS
>       (UPDATE foo SET bar = bar+1 RETURNING *)
>       SELECT * FROM t LIMIT 1;
> 
> What's problematic here is that only 1 row is read from the CTE, meaning
> also that only one row is updated which, at least how I see it, is not
> what we want.  The CTE should only store one row and return that after
> it has completely processed the UPDATE statement.
> 
> 2)    WITH t1 AS
>       (UPDATE foo SET bar=bar+1 RETURNING *),
>       t2 AS
>       (UPDATE foo SET bar=bar+1 RETURNING *)
>       SELECT * FROM t1
>       UNION ALL
>       SELECT * FROM t2;
> 
> This is probably not the most common scenario, but is still very
> surprising if you for some reason happen to hit it.  Both of the updates
> actually have the same transaction ID and command ID, so the rows are
> first updated by t1, but when t2 is processed, it looks at the rows and
> thinks that it already updated them.
> 
> 3)    WITH t1 AS
>       (UPDATE foo SET bar=bar+1 RETURNING *),
>       t2 AS
>       (UPDATE baz SET bat=bat+1 RETURNING *)
>       VALUES (true);
> 
> This isn't probably the most common situation either, but I think it's
> worth looking at; the user wants to update two different tables, but
> ignore the RETURNING data completely.  On IRC, this has been requested
> multiple times.  Even if we wouldn't agree that this feature is useful,
> it pretty much follows the semantics of example #1.
> 
> 
> Trying to tackle all of these at once, I've come up with this kind of
> execution strategy:
> 
> Before starting the execution of the main plan tree, for every CTE which
> is a DML query, do the following:
> 
>    1) Get a new CID
>    2a) If there are no references to the CTE (example #3), run the DML
> query to the end but ignore the results of the RETURNING query,
>    or
>    2b) If there are references, run the DML query to the end but store
> either as many as rows as you need to to answer the outer query (example
> #1) or if we can't determine the number of rows we need (most cases,
> example #2) run the query and store all of its results.
> 
> Then, if required, get a new CID for the main execution tree and execute
> it using the data we now have inside the CTEs.  This way we can avoid
> storing useless rows in memory without unexpected behaviour and caveats.
> 
> 
> Regards,
> Marko Tiikkaja
> 
> 
> 
> -- 
> Sent via pgsql-hackers mailing list ([email protected])
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  <[email protected]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Writeable CTEs and side effects

Reply via email to