On Sep 28, 2011, at 11:00 AM, Black, Michael (IS) wrote:
> Your change to numeric date/time may not take a long as you think.
>
>
>
Took an hour and a half.
Step 1: Alter all tables with datetime columns, converting those columns to
integer;
Step 2: Update all tables setting new datetime columns to unixtime
UPDATE table SET new_column = strftime('%s', old_column);
Step 3: CREATE INDEX uris_downloaded_on ON uris (u_downloaded_on);
Step 4: Run the following query
SELECT u.uri_id uri_id, u.uri uri, u.u_downloaded_on
FROM fts_uri f
JOIN uris u ON f.uri_id = u.uri_id
JOIN feed_history fh ON u.feed_history_id = fh.feed_history_id
JOIN feeds f ON fh.feed_id = f.feed_id
JOIN projects p ON f.project_id = p.project_id
WHERE p.project_id = 3
AND u.u_downloaded_on >= p.u_project_start
AND fts_uri MATCH 'education school'
ORDER BY u.uri_id, u_downloaded_on DESC;
Terrible time.
>> CPU Time: user 27.584849 sys 115.219293
Step 5: EXPLAIN QUERY PLAN (above SELECT query)
0|0|0|SCAN TABLE fts_uri AS f VIRTUAL TABLE INDEX 4: (~0 rows)
0|1|1|SEARCH TABLE uris AS u USING INTEGER PRIMARY KEY (rowid=?) (~1
rows)
0|2|2|SEARCH TABLE feed_history AS fh USING INTEGER PRIMARY KEY
(rowid=?) (~1 rows)
0|3|3|SEARCH TABLE feeds AS f USING INTEGER PRIMARY KEY (rowid=?) (~1
rows)
0|4|4|SEARCH TABLE projects AS p USING INTEGER PRIMARY KEY (rowid=?)
(~1 rows)
0|0|0|USE TEMP B-TREE FOR ORDER BY
CPU Time: user 0.000099 sys 0.000008
Step 6: ANALYZE;
Step 7: Run the above SELECT query again
Terrible time.
>> CPU Time: user 27.703538 sys 116.684390
This train is going nowhere. The times are actually worse than they were when I
was using a non-text column for date time.
By the way, have tried this on two machines -- the previous one was a dual-Xeon
Xserve with 12 GB RAM. The current machine of choice is the top of the line
iMac (quad core 3.4 GHz Intel Core i7) with 12 GB RAM and a 7200 RPM SATA drive.
> drop any indexes on project_start and downloaded_on;
>
>
>
> update projects set project_start=julianday(project_start);
>
> update uris set downloaded_on=julianday(downloaded_on);
>
>
>
> Recreate indexes.
>
>
>
> Modify your code to insert julianday('now','localtime') instead of taking the
> default current_timestamp.
>
>
>
> I wasn't really aware before the the datetime functions stored as text all
> the time...that's bad for searches...
>
>
>
>
>
> Michael D. Black
>
> Senior Scientist
>
> NG Information Systems
>
> Advanced Analytics Directorate
>
>
>
> ________________________________
> From: [email protected] [[email protected]] on
> behalf of Puneet Kishor [[email protected]]
> Sent: Wednesday, September 28, 2011 9:48 AM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] EXT : speeding up FTS4
>
>
> On Sep 28, 2011, at 9:44 AM, Simon Slavin wrote:
>
>>
>> On 28 Sep 2011, at 3:41pm, Puneet Kishor wrote:
>>
>>> WHERE Datetime(u.downloaded_on) >= Datetime(p.project_start)
>>
>> Why are you doing 'Datetime' here ? Not only does the conversion take time,
>> but it means you can't usefully index either of those two columns.
>>
>> Can you instead store your stamps in a format which is readily sortable ?
>> Either in text form or as julian days.
>
>
> Could I? Sure, if I had known better. Should I? I would be happy to create a
> new column, convert the values to julian days, and try that, but on a 27 GB
> db, that would take a bit of a while.
>
> But, if I understood [http://www.sqlite.org/datatype3.html] correctly, there
> really is no such thing as DATETIME value. Internally, it is stored as TEXT
> anyway.
>
>
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users