Added to TODO: Allow INSERT/UPDATE/DELETE ... RETURNING in common table expressions
* http://archives.postgresql.org/pgsql-hackers/2009-10/msg00472.php --------------------------------------------------------------------------- Marko Tiikkaja wrote: > I've made progress in implementing writeable CTEs (repo at > git://git.postgresql.org/git/writeable_cte.git , branch actually_write) > and I've hit a few corner-cases which have lead me to think that we > should be handling DML inside CTEs a bit differently. Before I go on > implementing this, I'd like to hear your input. > > 1) WITH t AS > (UPDATE foo SET bar = bar+1 RETURNING *) > SELECT * FROM t LIMIT 1; > > What's problematic here is that only 1 row is read from the CTE, meaning > also that only one row is updated which, at least how I see it, is not > what we want. The CTE should only store one row and return that after > it has completely processed the UPDATE statement. > > 2) WITH t1 AS > (UPDATE foo SET bar=bar+1 RETURNING *), > t2 AS > (UPDATE foo SET bar=bar+1 RETURNING *) > SELECT * FROM t1 > UNION ALL > SELECT * FROM t2; > > This is probably not the most common scenario, but is still very > surprising if you for some reason happen to hit it. Both of the updates > actually have the same transaction ID and command ID, so the rows are > first updated by t1, but when t2 is processed, it looks at the rows and > thinks that it already updated them. > > 3) WITH t1 AS > (UPDATE foo SET bar=bar+1 RETURNING *), > t2 AS > (UPDATE baz SET bat=bat+1 RETURNING *) > VALUES (true); > > This isn't probably the most common situation either, but I think it's > worth looking at; the user wants to update two different tables, but > ignore the RETURNING data completely. On IRC, this has been requested > multiple times. Even if we wouldn't agree that this feature is useful, > it pretty much follows the semantics of example #1. > > > Trying to tackle all of these at once, I've come up with this kind of > execution strategy: > > Before starting the execution of the main plan tree, for every CTE which > is a DML query, do the following: > > 1) Get a new CID > 2a) If there are no references to the CTE (example #3), run the DML > query to the end but ignore the results of the RETURNING query, > or > 2b) If there are references, run the DML query to the end but store > either as many as rows as you need to to answer the outer query (example > #1) or if we can't determine the number of rows we need (most cases, > example #2) run the query and store all of its results. > > Then, if required, get a new CID for the main execution tree and execute > it using the data we now have inside the CTEs. This way we can avoid > storing useless rows in memory without unexpected behaviour and caveats. > > > Regards, > Marko Tiikkaja > > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian <br...@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers