Re: [sqlite] Improving select distinct performance

Igor Tandetnik Thu, 08 Oct 2009 04:51:59 -0700

Jason Freshwater wrote:
> The problem I keep running into is with expressions of the form
>
>       select distinct c from t1
>
> For a 1 million row table with 8 distinct values of c the query
> time on my system (pretty typical modern laptop using precompiled
> sqlite 3.6.18) is about 2.8 seconds. For my requirements this is
> a bit too slow...


I believe SELECT DISTINCT is implemented by sorting. Internally, a resultset is 
created as if by ORDER BY, then a loop compares each 
record with the previous one.

In your case, something like this might work better:

create temp table d(c unique);
insert or ignore into d(c) select c from t1;
select c from d;
drop table d;

This should change the complexity from O(N log N) to O(N log M), where N is the 
total number of records and M is the number of 
distinct values. In your case, M << N.

Igor Tandetnik 


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Improving select distinct performance

Reply via email to