Re: [sqlite] EXT : Speed up count(distinct col)

Black, Michael (IS) Mon, 07 Feb 2011 13:10:29 -0800

Mabye one of our experts can explain why distinct takes to long.
Seems to me if you have an index you're just returning the values in the index, 
aren't you?
But it looks like it's comparing all keys to all keys.


Do you just want a count of distinct values?  
select count(distinct(tag)) from tags;
I think that will run faster.

Then again...if you normalize your tags you end up with just count(*) from the 
new tag table to find all unique keys
sqlite> create index tags_index on tags(tag);
sqlite> explain select distinct(tag) from tags;
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Trace          0     0     0                    00
1     Noop           0     0     0                    00
2     Integer        0     3     0                    00
3     Integer        0     2     0                    00
4     Gosub          5     28    0                    00
5     Goto           0     31    0                    00
6     OpenRead       2     4     0     keyinfo(1,BINARY)  00
7     Rewind         2     18    8     0              00
8     Column         2     0     7                    00
9     Compare        6     7     1     keyinfo(1,BINARY)  00
10    Jump           11    15    11                   00
11    Move           7     6     1                    00
12    Gosub          4     23    0                    00
13    IfPos          3     30    0                    00
14    Gosub          5     28    0                    00
15    Column         2     0     1                    00
16    Integer        1     2     0                    00
17    Next           2     8     0                    00
18    Close          2     0     0                    00
19    Gosub          4     23    0                    00
20    Goto           0     30    0                    00
21    Integer        1     3     0                    00
22    Return         4     0     0                    00
23    IfPos          2     25    0                    00
24    Return         4     0     0                    00
25    Copy           1     9     0                    00
26    ResultRow      9     1     0                    00
27    Return         4     0     0                    00
28    Null           0     1     0                    00
29    Return         5     0     0                    00
30    Halt           0     0     0                    00
31    Transaction    0     0     0                    00
32    VerifyCookie   0     2     0                    00
33    TableLock      0     2     0     tags           00
34    Goto           0     6     0                    00
sqlite> explain select count(*) from tags;
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Trace          0     0     0                    00
1     Goto           0     8     0                    00
2     OpenRead       1     4     0     keyinfo(1,BINARY)  00
3     Count          1     1     0                    00
4     Close          1     0     0                    00
5     Copy           1     2     0                    00
6     ResultRow      2     1     0                    00
7     Halt           0     0     0                    00
8     Transaction    0     0     0                    00
9     VerifyCookie   0     2     0                    00
10    TableLock      0     2     0     tags           00
11    Goto           0     2     0                    00
sqlite> explain select count(distinct(tag)) from tags;
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Trace          0     0     0                    00
1     Null           0     2     0                    00
2     Null           0     1     0                    00
3     OpenEphemeral  1     0     0     keyinfo(1,BINARY)  00
4     Goto           0     18    0                    00
5     OpenRead       0     2     0     2              00
6     Rewind         0     13    0                    00
7     Column         0     1     3                    00
8     Found          1     12    3     1              00
9     MakeRecord     3     1     4                    00
10    IdxInsert      1     4     0                    00
11    AggStep        0     3     1     count(1)       01
12    Next           0     7     0                    01
13    Close          0     0     0                    00
14    AggFinal       1     1     0     count(1)       00
15    Copy           1     5     0                    00
16    ResultRow      5     1     0                    00
17    Halt           0     0     0                    00
18    Transaction    0     0     0                    00
19    VerifyCookie   0     2     0                    00
20    TableLock      0     2     0     tags           00
21    Goto           0     5     0                    00

Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate



________________________________________
From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Yuzem [naujnit...@gmail.com]
Sent: Monday, February 07, 2011 2:13 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] EXT :  Speed up count(distinct col)

Black, Michael (IS) wrote:
>
> Test#1
> create index tags_index on tags(tag);
> You should have an index for any fields you query on like this.
>

Thanks Michael but I don't see any speed improvement:
create index test on tags(tag);
select count(distinct tag) from tags;

This is much faster:
select count(*) from tags;

Am I doing something wrong?
--
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30867275.html
Sent from the SQLite mailing list archive at Nabble.com.

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] EXT : Speed up count(distinct col)

Reply via email to