Tom Lane <t...@sss.pgh.pa.us> wrote: > florian.schoppm...@emc.com (Florian Schoppmann) writes: > > [VOLATILE function in WHERE clause *does* get optimized] > > I can't get excited about this. Any time you put a volatile function > into WHERE, you're playing with fire. The docs warn against it: > http://www.postgresql.org/docs/9.2/static/sql-expressions.html#SYNTAX- > EXPRESS-EVAL
All this section tells me is that one cannot rely on the evaluation order in expressions, and that side effects are dangerous in WHERE and HAVING clauses. I do not read in this section that VOLATILE functions are unsafe per se. After all, a VOLATILE function is not required to have side effects (suppose, e.g., somebody implemented a true random nubmer generator). However, <http://www.postgresql.org/docs/9.2/interactive/sql-createfunction.html> is actually very clear in its wording: | VOLATILE indicates that the function value can change even within a | single table scan, so no optimizations can be made. I therefore tend to see the behavior as a bug. I concede though that there is also good reason to keep the current behavior (VOLATILE is the default for UDFs, etc.). But then I think the documentation needs to be changed, and it has to be made explicit where the optimizer may change semantics and what, on the other hand, is defined behavior. E.g., users should need to know if the following rewrite causes defined or undefined behavior. Does it give correct results by accident, or are correct results guaranteed? --8<-- WITH source AS ( SELECT i FROM generate_series(1,10) AS i ) SELECT i FROM source AS _stats WHERE random() < 5::DOUBLE PRECISION / (SELECT count(*) FROM source); -->8-- > To do what you want, I'd suggest wrapping the join into a sub-select > with an "OFFSET 0" clause, which will serve as an optimization fence > that prevents the random() call from being pushed down. My interpretation so far is that VOLATILE functions in a WHERE clause can always be avoided: E.g., move the condition to the SELECT list and embed in an outer query that then filters on the condition column. Or is my assumption wrong, and the optimizer could theoretically interfere even here? Florian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers