Jeremy Zeiber wrote:
> This query runs in ~ 17 ms:
> SELECT COUNT(data) FROM detail AS outerdetail WHERE headerid=4 AND data
> NOT IN (SELECT DISTINCT data FROM detail WHERE headerid<4)

Here the subquery is not coordinated. It is run once, the results are stored in 
an ephemeral table, then the check in the main query is performed against that 
table.

> This query runs in ~ 15s:
> SELECT COUNT(data) FROM detail AS outerdetail WHERE headerid=4 AND data
> NOT IN (SELECT DISTINCT data FROM detail WHERE
> headerid<outerdetail.headerid)

Here the subquery is coordinated: since it depends on values in the current row 
of the main query, it has to be rerun for each row. So you end up with 
quadratic complexity.

> Why is there such a huge difference in the query time for two very
> similar queries?  Is there anything I can do with the second query to
> make it perform more like the first?

The second query is equivalent to the first, it's just written in the form that 
makes it difficult for SQLite to optimize. Why don't you just use the first 
query?

> This query is part of a larger
> query that aggregates detail data such as the following, so I can't hard
> code the headerid in the query.
> 
> SELECT headerid,
> (SELECT COUNT(data) FROM detail AS outerdetail WHERE
> headerid=header.headerid AND data NOT IN (SELECT DISTINCT data FROM
> detail WHERE headerid<outerdetail.headerid)) AS newcount
> FROM header GROUP BY headerid;

Try replacing outerdetail.headerid with header.headerid.

Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to