On Sep 28, 2011, at 11:00 AM, Black, Michael (IS) wrote:

> Your change to numeric date/time may not take a long as you think.
> 
> 
> 


Took an hour and a half.

Step 1: Alter all tables with datetime columns, converting those columns to 
integer;

Step 2: Update all tables setting new datetime columns to unixtime

UPDATE table SET new_column = strftime('%s', old_column);

Step 3: CREATE INDEX uris_downloaded_on ON uris (u_downloaded_on);

Step 4: Run the following query

        SELECT u.uri_id uri_id, u.uri uri, u.u_downloaded_on 
        FROM fts_uri f 
                JOIN uris u ON f.uri_id = u.uri_id 
                JOIN feed_history fh ON u.feed_history_id = fh.feed_history_id 
                JOIN feeds f ON fh.feed_id = f.feed_id 
                JOIN projects p ON f.project_id = p.project_id 
        WHERE p.project_id = 3
                AND u.u_downloaded_on >= p.u_project_start 
                AND fts_uri MATCH 'education school'
        ORDER BY u.uri_id, u_downloaded_on DESC;

Terrible time.
>> CPU Time: user 27.584849 sys 115.219293

Step 5: EXPLAIN QUERY PLAN (above SELECT query)

        0|0|0|SCAN TABLE fts_uri AS f VIRTUAL TABLE INDEX 4: (~0 rows)
        0|1|1|SEARCH TABLE uris AS u USING INTEGER PRIMARY KEY (rowid=?) (~1 
rows)
        0|2|2|SEARCH TABLE feed_history AS fh USING INTEGER PRIMARY KEY 
(rowid=?) (~1 rows)
        0|3|3|SEARCH TABLE feeds AS f USING INTEGER PRIMARY KEY (rowid=?) (~1 
rows)
        0|4|4|SEARCH TABLE projects AS p USING INTEGER PRIMARY KEY (rowid=?) 
(~1 rows)
        0|0|0|USE TEMP B-TREE FOR ORDER BY
        CPU Time: user 0.000099 sys 0.000008

Step 6: ANALYZE;

Step 7: Run the above SELECT query again

Terrible time.
>> CPU Time: user 27.703538 sys 116.684390



This train is going nowhere. The times are actually worse than they were when I 
was using a non-text column for date time.

By the way, have tried this on two machines -- the previous one was a dual-Xeon 
Xserve with 12 GB RAM. The current machine of choice is the top of the line 
iMac (quad core 3.4 GHz Intel Core i7) with 12 GB RAM and a 7200 RPM SATA drive.


> drop any indexes on project_start and downloaded_on;
> 
> 
> 
> update projects set project_start=julianday(project_start);
> 
> update uris set downloaded_on=julianday(downloaded_on);
> 
> 
> 
> Recreate indexes.
> 
> 
> 
> Modify your code to insert julianday('now','localtime') instead of taking the 
> default current_timestamp.
> 
> 
> 
> I wasn't really aware before the the datetime functions stored as text all 
> the time...that's bad for searches...
> 
> 
> 
> 
> 
> Michael D. Black
> 
> Senior Scientist
> 
> NG Information Systems
> 
> Advanced Analytics Directorate
> 
> 
> 
> ________________________________
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
> behalf of Puneet Kishor [punk.k...@gmail.com]
> Sent: Wednesday, September 28, 2011 9:48 AM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] EXT : speeding up FTS4
> 
> 
> On Sep 28, 2011, at 9:44 AM, Simon Slavin wrote:
> 
>> 
>> On 28 Sep 2011, at 3:41pm, Puneet Kishor wrote:
>> 
>>>  WHERE Datetime(u.downloaded_on) >= Datetime(p.project_start)
>> 
>> Why are you doing 'Datetime' here ?  Not only does the conversion take time, 
>> but it means you can't usefully index either of those two columns.
>> 
>> Can you instead store your stamps in a format which is readily sortable ?  
>> Either in text form or as julian days.
> 
> 
> Could I? Sure, if I had known better. Should I? I would be happy to create a 
> new column, convert the values to julian days, and try that, but on a 27 GB 
> db, that would take a bit of a while.
> 
> But, if I understood [http://www.sqlite.org/datatype3.html] correctly, there 
> really is no such thing as DATETIME value. Internally, it is stored as TEXT 
> anyway.
> 
> 

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to