Hi all,

In PostgreSQL 9.1 and 9.2 (possibly also in earlier versions), the query
--8<--
WITH source AS (
  SELECT i FROM generate_series(1,10) AS i
)
SELECT i
FROM source,
     ( SELECT count(*) AS _n FROM source ) AS _stats
WHERE random() < 5::DOUBLE PRECISION/_n;
-->8--

translates into the following query plan:

--8<--
Nested Loop  (cost=35.00..65.03 rows=1000 width=4)
  CTE source
    ->  Function Scan on generate_series i  (cost=0.00..10.00 rows=1000 width=4)
  ->  Aggregate  (cost=25.00..25.02 rows=1 width=0)
        Filter: (random() < (5::double precision / (count(*))::double precision))
        ->  CTE Scan on source  (cost=0.00..20.00 rows=1000 width=0)
  ->  CTE Scan on source  (cost=0.00..20.00 rows=1000 width=4)
-->8--

In other words, the query returns either exactly 0 or exactly 10 rows, and both cases happen with probability 0.5. Naturally, I would instead have expected each row to be sampled independently with probability 0.5.

Since random() is volatile, so is the whole WHERE expression. So I wonder why the condition is pushed down to the lowest level, given that this changes the results. Is this behavior correct, i.e., specified somewhere? Or is this a bug?

Florian

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
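P.S. To make the difference concrete, here is a short Python simulation of the two behaviors (illustration only, not PostgreSQL internals; the function names are made up for this sketch):

```python
import random

ROWS = list(range(1, 11))   # stand-in for generate_series(1,10)
P = 5.0 / len(ROWS)         # the predicate 5::double precision/_n, here 0.5

def filter_pushed_down(rng):
    """What the plan above does: the volatile predicate ends up inside
    the one-row aggregate, so random() is evaluated once and the result
    is all-or-nothing (0 rows or all 10)."""
    keep = rng.random() < P
    return [r for r in ROWS if keep]

def filter_per_row(rng):
    """What I expected: the volatile predicate is re-evaluated for every
    joined row, so each row is sampled independently."""
    return [r for r in ROWS if rng.random() < P]
```

Running filter_pushed_down repeatedly only ever yields row counts of 0 or 10, whereas filter_per_row yields counts distributed binomially between 0 and 10.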