"k...@rice.edu" <k...@rice.edu> wrote:
> On Wed, Sep 19, 2012 at 02:39:12PM -0500, Kevin Grittner wrote:
>> In another thread, Tom Lane <t...@sss.pgh.pa.us> wrote:
 
>>> 2. Apply the WHERE condition to each row from 1, and drop rows
>>> that don't pass it.
>>  
>> People expect that the results will be consistent with this
>> model, even if the implementation is optimized "under the
>> covers".  I think correct semantics should trump performance
>> here.
 
> It seems like this is what happens here except that the function
> is evaluated once for the WHERE and not once per ROW. Both of
> these meet the criterion for 2 above and Tom's earlier comments
> both hold.
 
There really needs to be some way to specify that when an expression
is evaluated for each row in a set, a function used within that
expression is not optimized away for some rows.  Fortunately we have
a way:
 
http://www.postgresql.org/docs/9.2/interactive/sql-createfunction.html
 
| VOLATILE indicates that the function value can change even within
| a single table scan, so no optimizations can be made. Relatively
| few database functions are volatile in this sense; some examples
| are random(), [...]
 
The behavior in the OP's query would certainly be sane if the
function were not VOLATILE; as it is, I have a hard time seeing this
as anything but a bug.

There is a workaround, if you don't mind ugly:
 
CREATE FUNCTION random_really_i_mean_it(dummy int)
  RETURNS double precision
  LANGUAGE plpgsql
  VOLATILE
AS $$
BEGIN
  -- no need to reference dummy parameter
  RETURN random();
END;
$$;

WITH source AS (
    SELECT i FROM generate_series(1,10) AS i
)
SELECT
    i
FROM
    source, (
        SELECT
            count(*) AS _n
        FROM source
    ) AS _stats
WHERE
    random_really_i_mean_it(i) < 5::DOUBLE PRECISION/_n;
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to