On Sun, Oct 11, 2009 at 12:35 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > I'm fooling around with pushing FOR UPDATE locking into a new plan node > type, and I just noticed a behavior that seems a bit bogus. > Historically we have dealt with FOR UPDATE in sub-selects by flattening > the sub-select if we could, because the alternative was to fail > altogether. For example, consider > > select * from a join (select * from b for update) ss on a.x = ss.y; > > The FOR UPDATE effectively got hoisted to the top because that's where > we could implement it, making this equivalent to > > select * from a join b on a.x = b.y for update of b; > > It seems to me, though, that this is changing the semantics. In the > latter case it's clear that we should only lock b rows that have a join > partner in a (which indeed is what happens). In the former case, what > I think should be expected to happen is that *all* b rows get locked. > > With FOR UPDATE as a plan node, it's possible to fix this by treating > FOR UPDATE in a sub-select as an optimization fence that prevents > flattening of the sub-select, much like LIMIT has always done. The > FOR UPDATE node will end up at the top of the subplan and it will act > as the syntax would suggest. > > Of course the downside of changing it is that queries that worked fine > before might work differently (and much slower) now; first because not > flattening the sub-select might lead to a worse plan, and second because > locking more rows takes more time. > > The alternative would be to let it continue to flatten such sub-selects > when possible, and to tell anyone who doesn't want that to stick in > OFFSET 0 as an optimization fence. > > It's an entirely trivial code change either way. I'm inclined to think > that we should prevent flattening, on the grounds of least astonishment.
It seems like this is somewhat related to the question of embedding an {INSERT|UPDATE|DELETE}...RETURNING in some arbitrary part of a query versus only allowing it in a WITH clause. The argument for only allowing it in a WITH clause is that there is otherwise no guarantee that it is evaluated in its entirety but just once. ISTM we could contrariwise give it the handling you're proposing here: allow it anywhere in the query, but make it act as an optimization fence. For that reason, I think I'd be inclined to make it act as an optimization fence if used as a top-level CTE, but otherwise flatten it, so that the handling is consistent with what we've proposed to do elsewhere. But I'm not really familiar with how these constructs are being treated by the executor, so I might be creating a false parallel here. The other comment I have is that I *expect* subqueries to be pulled up. So my own personal POLA would not be violated by locking only the rows with a join partner; in fact it would be more likely to be violated by the reverse behavior. I might not be typical, though. My experience is that not pulling up subqueries tends to have disastrous effects on performance, so I'm somewhat biased against creating more situations where that will happen. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers