Re: [sqlite] racing with date('now') (was: Select with dates)

Yuriy Kaminskiy Sat, 14 Sep 2013 10:56:58 -0700

Keith Medcalf wrote:
>> On Saturday, 14 September, 2013 07:19, Yuriy Kaminskiy said:
>> Keith Medcalf wrote:
>>>> On Thu, 12 Sep 2013 14:01:04 +0100
>>>> Simon Davies <simon.james.dav...@gmail.com> wrote:
>>>>
>>>>> Why not
>>>>>     SELECT * FROM "entry" WHERE
>>>>>            bankdate >= date('now','start of month')
>>>>>           AND bankdate < date('now','start of month','+1 month')
>>>> The half-open interval strikes again!  :-)
>>> And you are using UTC ...
>>>
>>> Notwithstanding the timezone you want to use,
>>>
>>> explain select *
>>>           from entry
>>>          where bankdate >= date('now', 'start of month')
>>>            and bankdate < date('now', 'start of month', '-1 day');
>>>
>>> will generate the following code:
>> And there are another pitfall in this query: date('now') is *RACY*. That
>> is, if you execute this query *exactly* at midnight at month boundary, first
>> and second DATE() invocation may refer to *different* months.
>  
>> And if your query return several rows, and there will be month boundary
>> between sqlite3_step(), your query also can return data from two month.
>  
>> You can easily reproduce this problem if you switch unit from month to
>> millisecond, e.g.
>> SELECT * FROM t WHERE strftime('%f') <> strftime('%f');
>> will non-deterministically return rows.
>  
>> IMO, correct [= least surprise] behavior should be "timestamp used for
>> 'now' should cached on first row step, and reused in all following calls
>> [until SQLITE_OK/SQLITE_DONE returned]", but fixing that won't be easy :-|.
> 
> That would require a change to SQLite itself.  Without doing that I believe 
> this will solve those problems by using only one 'now', and also only running 
> each date() call once (even my previous solutions could suffer from race 
> conditions since they retrieved 'now' multiple times):
> 
> sqlite> explain select *
>    ...>           from entry,
>    ...>                (select bdate, (select date(bdate, '+1 month')) as 
> edate
>    ...>                   From (select date('now', 'localtime', 'start of 
> month') as bdate) as ttemp limit 1) as tstamp
>    ...>          where bankdate >= bdate
>    ...>            and bankdate < edate;


Unfortunately, it is not only extremely inconvenient to rewrite queries this
way, but also this workaround relies on current implementation of query
optimizer. Next version may decide to e.g. flatten query, and break assumption
that inner date() will be called only once.

> It is a bit complicated though.  It would be much simpler to do the date 
> calculation in the host language and pass the boundary values as parameters 
> to the query.

In other words, "The way it is implemented currently, sqlite DATE() function
considered harmful and should not be used." Huh.

> Of course, you could also do the following:
> 
> Select * from entry where bankdate like (select substr(date('now', 
> 'localtime'), 7) + '%');
> 
> But of course this will always do a table/index scan that cannot be optimized 
> away ...

And same problem: some next version of query optimizer may decide to change the
way this query will be executed.

PS from postgresql documentation:
=== cut
http://www.postgresql.org/docs/9.3/static/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT
===
Since these functions return the start time of the current transaction, their
values do not change during the transaction. This is considered a feature: the
intent is to allow a single transaction to have a consistent notion of the
"current" time, so that multiple modifications within the same transaction bear
the same time stamp.
=== cut ===
*That's* way to go.

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] racing with date('now') (was: Select with dates)

Reply via email to