Good to know and I agree that it is not an urgent case.
I think this practice might be more common in the POSTGIS community where
there are plenty of set-returning-functions used in this way. My use was
taking a random sample of a pointcloud distrubution.

I took the liberty to post your answer at stackexchange.

thanks,
 Tom

On Mon, 26 Sep 2016 at 21:38 Tom Lane <t...@sss.pgh.pa.us> wrote:

> Tom van Tilburg <tom.van.tilb...@gmail.com> writes:
> > I'm often using the WHERE clause random() > 0.5 to pick a random subset
> of
> > my data. Now I noticed that when using a set-returning function in a
> > sub-query, I either get the whole set or none (meaning that the WHERE
> > random() > 0.5 clause is interpreted *before* the set is being
> generated).
> > e.g.:
> >
> > SELECT num FROM (
> >     SELECT unnest(Array[1,2,3,4,5,6,7,8,9,10]) num) AS foo WHERE
> random() > 0.5;
>
> Hmm, I think this is an optimizer bug.  There are two legitimate behaviors
> here:
>
>     SELECT * FROM unnest(ARRAY[1,2,3,4,5,6,7,8,9,10]) WHERE random() > 0.5;
>
> should (and does) re-evaluate the WHERE for every row output by unnest().
>
>     SELECT unnest(ARRAY[1,2,3,4,5,6,7,8,9,10]) WHERE random() > 0.5;
>
> should evaluate WHERE only once, since that happens before expansion of the
> set-returning function in the targetlist.  (If you're an Oracle user and
> you imagine this query as having an implicit "FROM dual", the WHERE should
> be evaluated for the single row coming out of the FROM clause.)
>
> In the case you've got here, given the placement of the WHERE in the outer
> query, you'd certainly expect it to be evaluated for each row coming out
> of the inner query.  But the optimizer is deciding it can push the WHERE
> clause down to become a WHERE of the sub-select.  That is legitimate in a
> lot of cases, but not when there are SRF(s) in the sub-select's
> targetlist, because that pushes the WHERE to occur before the SRF(s),
> analogously to the change between the two queries I wrote.
>
> I'm a bit hesitant to change this in existing releases.  Given the lack
> of previous complaints, it seems more likely to break queries that were
> behaving as-expected than to make people happy.  But we could change it
> in v10 and up, especially since some other corner-case changes in
> SRF-in-tlist behavior are afoot.
>
> In the meantime, you could force it to work as you wish by inserting the
> all-purpose optimization fence "OFFSET 0" in the sub-select:
>
> =# SELECT num FROM (
>     SELECT unnest(Array[1,2,3,4,5,6,7,8,9,10]) num OFFSET 0) AS foo WHERE
> random() > 0.5;
>  num
> -----
>    1
>    4
>    7
>    9
> (4 rows)
>
>
>                         regards, tom lane
>

Reply via email to