Good to know and I agree that it is not an urgent case. I think this practice might be more common in the POSTGIS community where there are plenty of set-returning-functions used in this way. My use was taking a random sample of a pointcloud distrubution.
I took the liberty to post your answer at stackexchange. thanks, Tom On Mon, 26 Sep 2016 at 21:38 Tom Lane <t...@sss.pgh.pa.us> wrote: > Tom van Tilburg <tom.van.tilb...@gmail.com> writes: > > I'm often using the WHERE clause random() > 0.5 to pick a random subset > of > > my data. Now I noticed that when using a set-returning function in a > > sub-query, I either get the whole set or none (meaning that the WHERE > > random() > 0.5 clause is interpreted *before* the set is being > generated). > > e.g.: > > > > SELECT num FROM ( > > SELECT unnest(Array[1,2,3,4,5,6,7,8,9,10]) num) AS foo WHERE > random() > 0.5; > > Hmm, I think this is an optimizer bug. There are two legitimate behaviors > here: > > SELECT * FROM unnest(ARRAY[1,2,3,4,5,6,7,8,9,10]) WHERE random() > 0.5; > > should (and does) re-evaluate the WHERE for every row output by unnest(). > > SELECT unnest(ARRAY[1,2,3,4,5,6,7,8,9,10]) WHERE random() > 0.5; > > should evaluate WHERE only once, since that happens before expansion of the > set-returning function in the targetlist. (If you're an Oracle user and > you imagine this query as having an implicit "FROM dual", the WHERE should > be evaluated for the single row coming out of the FROM clause.) > > In the case you've got here, given the placement of the WHERE in the outer > query, you'd certainly expect it to be evaluated for each row coming out > of the inner query. But the optimizer is deciding it can push the WHERE > clause down to become a WHERE of the sub-select. That is legitimate in a > lot of cases, but not when there are SRF(s) in the sub-select's > targetlist, because that pushes the WHERE to occur before the SRF(s), > analogously to the change between the two queries I wrote. > > I'm a bit hesitant to change this in existing releases. Given the lack > of previous complaints, it seems more likely to break queries that were > behaving as-expected than to make people happy. But we could change it > in v10 and up, especially since some other corner-case changes in > SRF-in-tlist behavior are afoot. > > In the meantime, you could force it to work as you wish by inserting the > all-purpose optimization fence "OFFSET 0" in the sub-select: > > =# SELECT num FROM ( > SELECT unnest(Array[1,2,3,4,5,6,7,8,9,10]) num OFFSET 0) AS foo WHERE > random() > 0.5; > num > ----- > 1 > 4 > 7 > 9 > (4 rows) > > > regards, tom lane >