Re: [sqlite] Once again about random values appearance

Yuriy Kaminskiy Mon, 17 Feb 2014 02:24:13 -0800

James K. Lowden wrote:
> On Fri, 14 Feb 2014 08:32:02 +0400
> Max Vlasov <max.vla...@gmail.com> wrote:
> 
>> From: Max Vlasov <max.vla...@gmail.com>
>> To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
>> Reply-To: General Discussion of SQLite Database
>> <sqlite-users@sqlite.org> Date: Fri, 14 Feb 2014 08:32:02 +0400
>> Subject: Re: [sqlite] Once again about random values appearance
>>
>> On Fri, Feb 14, 2014 at 6:35 AM, James K. Lowden
>> <jklow...@schemamania.org>wrote:
>>
>>>>> select id, (select id from TestTable where id = abs(random() %
>>>>> 100)) as rndid from TestTable where id=rndid
>>> On Thu, 13 Feb 2014 07:26:55 -0500
>>> Richard Hipp <d...@sqlite.org> wrote:
>>>
>>>> It is undefined behavior, subject to change depending the specific
>>>> version of SQLite, compile-time options, optimization settings,
>>>> and the whim of the query planner.
>>> It should be defined.
>>>
>>> In the above query, random() should be evaluated once.  In the SQL
>>> model the user provides inputs once, and the system evaluates them
>>> once.
>>>
>>>
>> Once for the query or once for the row?
> 
> Once for the query.  
> 
> As a user you have no control how the system evaluates your query.
> The evaluation may change over time with different implementations, but
> the semantics of the query do not.  
> 
> Not long ago on this list we discussed 
> 
>       SELECT *, datetime('now') from T;
> 
> and the behavior was that the datetime function was called per-row,
> resulting in different times on different rows.  It was changed, the
> rationale IIRC to be compatible with the SQL standard and other DBMSs.  
> 
> Like datetime, random() is a nondeterministic function taking constant
> (but void) input.  It should be evaluated once, as provided. Where it
> appears in your query should not matter.  That would make it consistent
> with how other functions work, and with the SQL standard.


Good in theory, bad in practice.
CURRENT_TIME = CURRENT_TIME was /almost always/ true before change (when it
happened to be false, it was /surprising/ and /counter-intuitive/ exception),
and just become /always/ true after change (no surprising change in behavior!).

RANDOM() = RANDOM() is /always false/ now, it will become /always true/ after
your suggested change.
It have much higher potential to break existing code, e.g.
   SELECT * FROM t ORDER BY RANDOM() LIMIT 1
will be completely broken (well, sure, it is not best way to select random
row(s) from table - it is overly expensive and unportable - but still much more
likely to be used - and relied on - in existing code).

Besides, datetime function behavior is specified by SQL standard - and RANDOM()
is not.

Then again, I, too, don't quite like how OP's queries - such as
   SELECT r <> r FROM (SELECT RANDOM() AS r FROM t)
behavior depends on whether query flattening was used or not (even though one
can write it off as "documented undefined behavior").
Or, that `SELECT RANDOM() AS r FROM t ORDER BY r` gives quite unexpected result.
I just don't think that same approach as with datetime function would be good in
this case.

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Once again about random values appearance

Reply via email to