Hi List,
Note beforehand: this question is a result of a stack-exchange that can be
seen here:
http://stackoverflow.com/questions/39624241/inconsistent-behaviour-of-set-returning-functions-in-sub-query-with-random
I'm often using the WHERE clause random() > 0.5 to pick a random subset of
my data. Now I noticed that when using a set-returning function in a
sub-query, I either get the whole set or none (meaning that the WHERE
random() > 0.5 clause is interpreted *before* the set is being generated).
e.g.:
SELECT num FROM (
SELECT unnest(Array[1,2,3,4,5,6,7,8,9,10]) num) AS foo WHERE random() > 0.5;
This seems inconsistent because the following query *does* take the whole
set into account:
SELECT num FROM (
SELECT * FROM unnest(Array[1,2,3,4,5,6,7,8,9,10]) num) AS foo
WHERE random() > 0.5;
So does this one:
WITH foo AS (
SELECT unnest(Array[1,2,3,4,5,6,7,8,9,10]) num)
SELECT num FROM foo WHERE random() > 0.5;
Could anyone reflect on the seeming inconsistency here? I do understand
that the planner sees the queries quite different (as can be seen from an
EXPLAIN) but I don't understand the rationale behind it.
Notes:
-
couldn't find another function to test apart from random(), but likely
there is some
-
I tested with generate_series and as well
-
My real use case works with postgis and pgpointcloud where a range of
set-returning functions is used in this manner
Thanks,
Tom