Re: [sqlite] Aggregate and join query very slow

2010-05-28 Thread Simon Slavin

On 28 May 2010, at 3:57pm, Michael Ash wrote:

> Per suggestions, I indexed year and media on the big table.  So I now
> have separate indexes for the key variable (releasenumber) and for
> year and for media.Would  it make more sense to have a single
> index for all three, thus:
> 
> CREATE UNIQUE INDEX r ON release_cl (year,media,releasenumber);

Correct.  Each SELECT command uses zero or one index.  It's pointless to create 
individual indexes for each column in your table.  Look at the columns, ORDER 
BY and WHERE of your SELECT query and make up one index which is ideal for that 
query.  For the ideal index in SQLite, you make the index up of

first, the columns mentioned in the WHERE clause
then, the columns mentioned in the ORDER BY clause
finally any other columns you want retrieved.

For each purpose you have to choose your preferred cutoff: it's usually not 
worth doing the last section of the above, and sometimes not even worth doing 
the ORDER BY section.  The more indexes, and the longer they are, the longer 
each WRITE operation takes for that table.  You could end up with a table which 
might be fast to search but need a very long time for each INSERT.  Also, 
increasing the length of each index will increase the database filesize, and 
therefore cause operations which change the database file to take longer.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Aggregate and join query very slow

2010-05-28 Thread Tim Romano
How many distinct media-types are there?
How many distinct facilities do you have?
How many rows are typically returned by your FacilityScore subquery?

SELECT facilitynumber,SUM(score_rev) AS score
   FROM release_cl
   WHERE media<3
   AND year=2006
  GROUP BY facilitynumber

Regards
Tim Romano



On Fri, May 28, 2010 at 10:57 AM, Michael Ash <m...@econs.umass.edu> wrote:

> Thank you very much.  Both replies pointed to indexes.  So I changed
> the indexes and markedly improved performance from 12 seconds to about
> 1.5 seconds for the faster variant (using nested SELECTS) and about
> 2.2 second for the slower variant.
>
> Per suggestions, I indexed year and media on the big table.  So I now
> have separate indexes for the key variable (releasenumber) and for
> year and for media.Would  it make more sense to have a single
> index for all three, thus:
>
> CREATE UNIQUE INDEX r ON release_cl (year,media,releasenumber);
>
> I remain concerned that I am not using indexes as well as possible.
> The query still takes many times longer than does the same query on
> MySQL.
>
> Thank you very much again for the helpful responses.
>
>
> Best,
> Michael Ash
>
>
>
>
> >   3. Re: Aggregate and join query very slow (Max Vlasov)
> > Message: 3
> > Date: Thu, 27 May 2010 17:26:10 +0400
> > From: Max Vlasov <max.vla...@gmail.com>
> > Subject: Re: [sqlite] Aggregate and join query very slow
> > To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
> > Message-ID:
> ><aanlktinqmk9csyaguoyllo2okpmsyq3di-tzbuahw...@mail.gmail.com>
> > Content-Type: text/plain; charset=ISO-8859-1
> >
> > On Thu, May 27, 2010 at 3:07 PM, Michael Ash <m...@econs.umass.edu>
> wrote:
> >
> >> ...These are large tables (52,355 records in facility and 4,085,137 in
> >> release_cl).
> >>
> >> ...
> >> sqlite> explain query plan
> >>   ...> SELECT name,score
> >>   ...> FROM facility f, (SELECT facilitynumber,SUM(score_rev) AS score
> >>   ...> FROM release_cl
> >>   ...> WHERE media<3
> >>   ...> AND year=2006
> >>   ...> GROUP BY facilitynumber) r
> >>
> >
> >
> > Michael, from what I see, if your release_cl table is not properly
> indexed
> > to be quickly aggregated (media and year field), this will lead to full
> > table reading (so all the data of your 4M records). If it's properly
> > indexed, and the result number of records of this select is big, consider
> > adding non-indexed fields to this (or brand-new) index since otherwise
> > sqlite quickly finds records with this index, but have to look up main
> data
> > tree to retrieve other fields.
> >
> > Max,
> > maxerist.net
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Aggregate and join query very slow

2010-05-28 Thread Black, Michael (IS)
You also need to increase your cache size to match the mysql performance
 
pragma cache 10;
 
Michael D. Black
Senior Scientist
Northrop Grumman Mission Systems
 



From: sqlite-users-boun...@sqlite.org on behalf of Michael Ash
Sent: Fri 5/28/2010 9:57 AM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Aggregate and join query very slow



Thank you very much.  Both replies pointed to indexes.  So I changed
the indexes and markedly improved performance from 12 seconds to about
1.5 seconds for the faster variant (using nested SELECTS) and about
2.2 second for the slower variant.

Per suggestions, I indexed year and media on the big table.  So I now
have separate indexes for the key variable (releasenumber) and for
year and for media.Would  it make more sense to have a single
index for all three, thus:

CREATE UNIQUE INDEX r ON release_cl (year,media,releasenumber);

I remain concerned that I am not using indexes as well as possible.
The query still takes many times longer than does the same query on
MySQL.

Thank you very much again for the helpful responses.


Best,
Michael Ash




>   3. Re: Aggregate and join query very slow (Max Vlasov)
> Message: 3
> Date: Thu, 27 May 2010 17:26:10 +0400
> From: Max Vlasov <max.vla...@gmail.com>
> Subject: Re: [sqlite] Aggregate and join query very slow
> To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
> Message-ID:
><aanlktinqmk9csyaguoyllo2okpmsyq3di-tzbuahw...@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Thu, May 27, 2010 at 3:07 PM, Michael Ash <m...@econs.umass.edu> wrote:
>
>> ...These are large tables (52,355 records in facility and 4,085,137 in
>> release_cl).
>>
>> ...
>> sqlite> explain query plan
>>   ...> SELECT name,score
>>   ...> FROM facility f, (SELECT facilitynumber,SUM(score_rev) AS score
>>   ...> FROM release_cl
>>   ...> WHERE media<3
>>   ...> AND year=2006
>>   ...> GROUP BY facilitynumber) r
>>
>
>
> Michael, from what I see, if your release_cl table is not properly indexed
> to be quickly aggregated (media and year field), this will lead to full
> table reading (so all the data of your 4M records). If it's properly
> indexed, and the result number of records of this select is big, consider
> adding non-indexed fields to this (or brand-new) index since otherwise
> sqlite quickly finds records with this index, but have to look up main data
> tree to retrieve other fields.
>
> Max,
> maxerist.net
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Aggregate and join query very slow

2010-05-28 Thread Michael Ash
Thank you very much.  Both replies pointed to indexes.  So I changed
the indexes and markedly improved performance from 12 seconds to about
1.5 seconds for the faster variant (using nested SELECTS) and about
2.2 second for the slower variant.

Per suggestions, I indexed year and media on the big table.  So I now
have separate indexes for the key variable (releasenumber) and for
year and for media.Would  it make more sense to have a single
index for all three, thus:

CREATE UNIQUE INDEX r ON release_cl (year,media,releasenumber);

I remain concerned that I am not using indexes as well as possible.
The query still takes many times longer than does the same query on
MySQL.

Thank you very much again for the helpful responses.


Best,
Michael Ash




>   3. Re: Aggregate and join query very slow (Max Vlasov)
> Message: 3
> Date: Thu, 27 May 2010 17:26:10 +0400
> From: Max Vlasov <max.vla...@gmail.com>
> Subject: Re: [sqlite] Aggregate and join query very slow
> To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
> Message-ID:
>        <aanlktinqmk9csyaguoyllo2okpmsyq3di-tzbuahw...@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Thu, May 27, 2010 at 3:07 PM, Michael Ash <m...@econs.umass.edu> wrote:
>
>> ...These are large tables (52,355 records in facility and 4,085,137 in
>> release_cl).
>>
>> ...
>> sqlite> explain query plan
>>   ...> SELECT name,score
>>   ...> FROM facility f, (SELECT facilitynumber,SUM(score_rev) AS score
>>   ...>         FROM release_cl
>>   ...>         WHERE media<3
>>   ...>         AND year=2006
>>   ...>         GROUP BY facilitynumber) r
>>
>
>
> Michael, from what I see, if your release_cl table is not properly indexed
> to be quickly aggregated (media and year field), this will lead to full
> table reading (so all the data of your 4M records). If it's properly
> indexed, and the result number of records of this select is big, consider
> adding non-indexed fields to this (or brand-new) index since otherwise
> sqlite quickly finds records with this index, but have to look up main data
> tree to retrieve other fields.
>
> Max,
> maxerist.net
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Aggregate and join query very slow

2010-05-27 Thread Max Vlasov
On Thu, May 27, 2010 at 3:07 PM, Michael Ash  wrote:

> ...These are large tables (52,355 records in facility and 4,085,137 in
> release_cl).
>
> ...
> sqlite> explain query plan
>   ...> SELECT name,score
>   ...> FROM facility f, (SELECT facilitynumber,SUM(score_rev) AS score
>   ...> FROM release_cl
>   ...> WHERE media<3
>   ...> AND year=2006
>   ...> GROUP BY facilitynumber) r
>


Michael, from what I see, if your release_cl table is not properly indexed
to be quickly aggregated (media and year field), this will lead to full
table reading (so all the data of your 4M records). If it's properly
indexed, and the result number of records of this select is big, consider
adding non-indexed fields to this (or brand-new) index since otherwise
sqlite quickly finds records with this index, but have to look up main data
tree to retrieve other fields.

Max,
maxerist.net
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Aggregate and join query very slow

2010-05-27 Thread Black, Michael (IS)
MySql has a much larger default cache than sqlite3.  That could be one rather 
large difference in performance.
 
Try increasing sqlite3 cache from it's default of 2000k
PRAGMA cache_size=10;
 
Or more...
 
Also...no indexes on media or year?  And what does MySql's explain say?
 
Michael D. Black
Senior Scientist
Northrop Grumman Mission Systems
 



From: sqlite-users-boun...@sqlite.org on behalf of Michael Ash
Sent: Thu 5/27/2010 6:07 AM
To: sqlite-users@sqlite.org
Subject: [sqlite] Aggregate and join query very slow



I am new to sqlite3, converting from mysql.

A query that  involves an aggregate function and a join is running
very slowly (taking about 15 seconds compared to mysql where it runs
in <1 second).  I've tried two variants of the query (each reprinted
below with the explain query plan), and both are roughly equally slow.
 These are large tables (52,355 records in facility and 4,085,137 in
release_cl).

Please let me know if you have advice; I am clearly not getting it.



sqlite> explain query plan
   ...> SELECT name,score
   ...> FROM facility f, (SELECT facilitynumber,SUM(score_rev) AS score
   ...> FROM release_cl
   ...> WHERE media<3
   ...> AND year=2006
   ...> GROUP BY facilitynumber) r
   ...> WHERE f.facilitynumber=r.facilitynumber
   ...> ORDER BY score DESC
   ...> LIMIT 10 ;
0|0|TABLE release_cl WITH INDEX facilitynumberRC ORDER BY
0|1|TABLE  AS r
1|0|TABLE facility AS f WITH INDEX FacilityNumberF


sqlite> explain query plan
   ...> SELECT name,city,state,SUM(score_rev) AS score
   ...> FROM release_cl r,facility f
   ...> WHERE r.facilitynumber=f.facilitynumber
   ...> AND media<3
   ...> AND year=2006
   ...> GROUP BY r.facilitynumber
   ...> ORDER BY score DESC
   ...> LIMIT 10 ;
0|0|TABLE release_cl AS r WITH INDEX facilitynumberRC ORDER BY
1|1|TABLE facility AS f WITH INDEX FacilityNumberF






--
Michael Ash, Associate Professor
  of Economics and Public Policy
Department of Economics and CPPA
University of Massachusetts
Amherst, MA 01003
Email m...@econs.umass.edu
Fax +1-413-545-2921
http://people.umass.edu/maash
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Aggregate and join query very slow

2010-05-27 Thread Michael Ash
I am new to sqlite3, converting from mysql.

A query that  involves an aggregate function and a join is running
very slowly (taking about 15 seconds compared to mysql where it runs
in <1 second).  I've tried two variants of the query (each reprinted
below with the explain query plan), and both are roughly equally slow.
 These are large tables (52,355 records in facility and 4,085,137 in
release_cl).

Please let me know if you have advice; I am clearly not getting it.



sqlite> explain query plan
   ...> SELECT name,score
   ...> FROM facility f, (SELECT facilitynumber,SUM(score_rev) AS score
   ...> FROM release_cl
   ...> WHERE media<3
   ...> AND year=2006
   ...> GROUP BY facilitynumber) r
   ...> WHERE f.facilitynumber=r.facilitynumber
   ...> ORDER BY score DESC
   ...> LIMIT 10 ;
0|0|TABLE release_cl WITH INDEX facilitynumberRC ORDER BY
0|1|TABLE  AS r
1|0|TABLE facility AS f WITH INDEX FacilityNumberF


sqlite> explain query plan
   ...> SELECT name,city,state,SUM(score_rev) AS score
   ...> FROM release_cl r,facility f
   ...> WHERE r.facilitynumber=f.facilitynumber
   ...> AND media<3
   ...> AND year=2006
   ...> GROUP BY r.facilitynumber
   ...> ORDER BY score DESC
   ...> LIMIT 10 ;
0|0|TABLE release_cl AS r WITH INDEX facilitynumberRC ORDER BY
1|1|TABLE facility AS f WITH INDEX FacilityNumberF






-- 
Michael Ash, Associate Professor
  of Economics and Public Policy
Department of Economics and CPPA
University of Massachusetts
Amherst, MA 01003
Email m...@econs.umass.edu
Fax +1-413-545-2921
http://people.umass.edu/maash
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users