On 28 Apr 2017, at 10:23pm, Jens Alfke <j...@mooseyard.com> wrote:

> On Apr 28, 2017, at 2:00 PM, Simon Slavin <slav...@bigfraud.org> wrote:
> 
>> What indexes do you have on these two tables ?  I can’t recommend one 
>> without knowing which columns belong to which tables.
> 
> I showed them at the end of my first message.

Sorry, I missed that.

> CREATE TABLE docs (doc_id INTEGER PRIMARY KEY, docid TEXT UNIQUE NOT NULL, 
> expiry_timestamp INTEGER);
> CREATE INDEX docs_docid ON docs(docid);

> CREATE TABLE revs (sequence INTEGER PRIMARY KEY AUTOINCREMENT, doc_id INTEGER 
> NOT NULL REFERENCES docs(doc_id) ON DELETE CASCADE, revid TEXT NOT NULL 
> COLLATE REVID, parent INTEGER REFERENCES revs(sequence) ON DELETE SET NULL,  
> current BOOLEAN,                deleted BOOLEAN DEFAULT 0, json BLOB, 
> no_attachments BOOLEAN, doc_type TEXT, UNIQUE (doc_id, revid));
> CREATE INDEX revs_parent ON revs(parent);
> CREATE INDEX revs_by_docid_revid ON revs(doc_id, revid desc, current, 
> deleted);
> CREATE INDEX revs_current ON revs(doc_id, current desc, deleted, revid desc);

This search

SELECT revs.doc_id, sequence, docid, revid, json, deleted FROM revs
        JOIN docs ON docs.doc_id = revs.doc_id
        WHERE revs.sequence>? AND revs.current!=0 AND revs.deleted=0
        ORDER BY revs.doc_id, revs.deleted, revs.revid DESC

can be improved if you know that revs.current is never negative.  If that’s the 
case then rephrase it

SELECT revs.doc_id, sequence, docid, revid, json, deleted FROM revs
        JOIN docs ON docs.doc_id = revs.doc_id
        WHERE revs.sequence>? AND revs.current>0 AND revs.deleted=0
        ORDER BY revs.doc_id, revs.deleted, revs.revid DESC

Either way, the search could benefit from a better index on revs.  Depending on 
your data it would be one of these four:

CREATE INDEX revs_xxxx1 ON revs(deleted,current,sequence,doc_id,revid);
CREATE INDEX revs_xxxx2 ON revs(deleted,current,doc_id,sequence,revid);
CREATE INDEX revs_xxxx3 ON revs(deleted,sequence,current,doc_id,revid);
CREATE INDEX revs_xxxx4 ON revs(deleted,sequence,doc_id,current,revid);

Create those four in a database with typical data in it.  Then do ANALYZE.  
Then run the query and see if time has improved.  Once you see which index 
SQLite decided to use, you can delete the other ones.

> The only difference is the explicit JOIN statement. I was under the 
> impression that using this, vs. the way I wrote it, is a matter of taste that 
> doesn’t affect the execution of the query.


SQLite computes two-table searches using nested loops.  Providing the JOIN you 
want tells SQLite which order you think the loops should be nested in.  
Expressing the searches with JOINs is also helps me figure out what SQLite is 
actually doing.  Before I did that it wasn’t obvious to me how little the table 
"docs" mattered for this query.

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to