On 2018-03-22 12:03, Richard Hipp wrote:
On 3/21/18, Jonathan Moules <jonathan-li...@lightpear.com> wrote:
I've spent the last ~90 minutes trying to build this but to no avail
The query planner enhancements are now available in the pre-release
snapshot on the https://sqlite.org/download.html page.

Well, after a lot of fiddling, I finally got something that seems to work.

One observation that's relevant to you; the suggested:

./configure; make sqlite3.c

results in a make error (I'm running it in a Linux Mint VM):

config.status: executing libtool commands
make: Warning: File 'Makefile' has modification time 2.4 s in the future
make: Nothing to be done for 'sqlite3.c'.
make: warning: Clock skew detected. Your build may be incomplete.

No idea what it's on about - I have no clock issues on the host machine and the time appears to be correct in the VM. Got around it in the end by running it as two commands.

----------

Back to the issue at hand using my modestly populated test database - all six variations of the query (Order BY ASC; ORDER BY DESC; No Order By or LIMIT; - each of those twice, once with JOINs once with LEFT JOINs) take the same amount of time in the 3.23.0 build - 0.33seconds. (I used the built in ".timer on").

This is the same speed as three of the queries, and faster than one of them in 3.15, BUT this is still several times slower than 3.15 on the same database (but in Windows) for two of the queries.

The below query, along with the no ORDER BY / LIMIT variant both take just 0.08s to run in 3.15.

    SELECT
        u.url_id, u.url, l.error_code
    FROM
        urls u
    left JOIN
        lookups l
        USING(url_id)
    left JOIN
        (select * from v_most_recent_lookup_per_url where url_id in (
select url_id from urls where url = 'http://catalogue.beta.data.wa.gov.au/api/3/action/resource_search?limit=100&offset=0&query=format%3AWFS'
            )) recent
       -- By definition url's can) recent
        -- This is the important bit
-- Here we use the most recent lookup url_id to link to the source_seed_id, so we only find its children
        -- Or alternatively itself
        ON u.source_seed_id = recent.url_id
            OR u.url_id = recent.url_id
    WHERE
-- For JSON-spider at least, Generic's are guaranteed to be generic pages.
        l.is_generic_flag = 1
        AND
-- Must be "or equal to" so we can get the lookup of the very base url.
        l.retrieval_datetime >= recent.retrieval_datetime
        AND
DATETIME(recent.retrieval_datetime) > DATETIME('now', '-14 days', 'start of day')
    ORDER BY
        u.url_id aSC
    LIMIT 1;


Although the two "fast" queries are much slower if a non-existent url is used (goes up to about 0.5s in 3.15 for the same query)

Cheers,
Jonathan

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to