Re: [sqlite] Slowdown when adding terms to query

2010-08-09 Thread Edward Hawke
  Thank you all for your responses,

Igor - using a UNION made a huge difference, thank you.

I have a quick query on from the suggestion of adding indices for 
anything that appears either side of an equals sign. I have many 
user-selectable terms that can be added to the query. Is it worth adding 
indices for all of these? I have heard having too many indices can slow 
searches down, so I want to know where to stop, or is there no 
hard-and-fast rule, just test and see?

Regards,

Ed

On 06/08/2010 13:22, Igor Tandetnik wrote:
> Edward Hawke  wrote:
>> I have a query that is working at an acceptable speed, and I need to add
>> something else to it. As soon as I add the extra terms it grinds to a halt
>> (taking well over 5 minutes to perform the query on a relatively small
>> dataset c.100,000 records).
> But because you repeat the same table three-four times in the FROM clause, 
> thus building a cross-product, you are effectively working with a dataset of 
> some 10^15 records. Anything that suppress the use of indexes by SQLite will 
> cause performance to tank.
>
>> My acceptably fast query is fairly complicated as it pulls information from
>> a number of different tables, linking them all using WHERE clauses (I've
>> read somewhere this is the fastest way to do this in SQLite). It is shown
>> below (apologies for it's length) though I have cut out the information
>> about which columns it is selecting as it makes it too long to comfortably
>> read.
>>
>> *SELECT*  *FROM* Delivery, Match, Tour,
>> Ground, Country, PlayerTeam BowlerPT, PlayerTeam BatsmanPT, PlayerTeam
>> PartnerPT, Player Bowler, Player Batsman, Player Partner, Team BowlingTeam,
>> Team BattingTeam, Team HomeTeam, Team AwayTeam *WHERE* (Delivery.MatchID =
>> Match.ID AND Match.TourID = Tour.ID AND Delivery.BowlerPlayerTeamID =
>> BowlerPT.ID AND BowlerPT.PlayerID = Bowler.ID AND BowlerPT.TeamID =
>> BowlingTeam.ID AND Delivery.BatsmanPlayerTeamID = BatsmanPT.ID AND
>> BatsmanPT.PlayerID = Batsman.ID AND BatsmanPT.TeamID = BattingTeam.ID AND
>> Delivery.PartnerPlayerTeamID = PartnerPT.ID AND PartnerPT.PlayerID =
>> Partner.ID AND Match.HomeTeamID = HomeTeam.ID AND Match.AwayTeamID =
>> AwayTeam.ID AND Match.GroundID = Ground.ID AND Tour.CountryID = Country.ID
>> AND Batsman.ID = 1234567890);
>>
>> There are often multiple IDs specified at the end of the WHERE clause, as
>> these are added depending upon selections a user has made from a GUI.
>>
>> In a very specific case I need to select the above plus one unrelated row.
>> Therefore my query changes to (for example):
>>
>> *SELECT*  *FROM* Delivery, Match, Tour,
>> Ground, Country, PlayerTeam BowlerPT, PlayerTeam BatsmanPT, PlayerTeam
>> PartnerPT, Player Bowler, Player Batsman, Player Partner, Team BowlingTeam,
>> Team BattingTeam, Team HomeTeam, Team AwayTeam *WHERE* (Delivery.MatchID =
>> Match.ID AND Match.TourID = Tour.ID AND Delivery.BowlerPlayerTeamID =
>> BowlerPT.ID AND BowlerPT.PlayerID = Bowler.ID AND BowlerPT.TeamID =
>> BowlingTeam.ID AND Delivery.BatsmanPlayerTeamID = BatsmanPT.ID AND
>> BatsmanPT.PlayerID = Batsman.ID AND BatsmanPT.TeamID = BattingTeam.ID AND
>> Delivery.PartnerPlayerTeamID = PartnerPT.ID AND PartnerPT.PlayerID =
>> Partner.ID AND Match.HomeTeamID = HomeTeam.ID AND Match.AwayTeamID =
>> AwayTeam.ID AND Match.GroundID = Ground.ID AND Tour.CountryID = Country.ID
>> AND Batsman.ID = 1234567890) *OR* (Delivery.MatchID = Match.ID AND
>> Match.TourID = Tour.ID AND Delivery.BowlerPlayerTeamID = BowlerPT.ID AND
>> BowlerPT.PlayerID = Bowler.ID AND BowlerPT.TeamID = BowlingTeam.ID AND
>> Delivery.BatsmanPlayerTeamID = BatsmanPT.ID AND BatsmanPT.PlayerID =
>> Batsman.ID AND BatsmanPT.TeamID = BattingTeam.ID AND
>> Delivery.PartnerPlayerTeamID = PartnerPT.ID AND PartnerPT.PlayerID =
>> Partner.ID AND Match.HomeTeamID = HomeTeam.ID AND Match.AwayTeamID =
>> AwayTeam.ID AND Match.GroundID = Ground.ID AND Tour.CountryID = Country.ID
>> AND Delivery.ID = 1987654321);
> Using OR pretty much kills optimization in SQLite. Run this as two separate 
> queries, or as a UNION query with two subqueries, each of which only uses AND.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Slowdown when adding terms to query

2010-08-06 Thread Igor Tandetnik
Edward Hawke  wrote:
> I have a query that is working at an acceptable speed, and I need to add
> something else to it. As soon as I add the extra terms it grinds to a halt
> (taking well over 5 minutes to perform the query on a relatively small
> dataset c.100,000 records).

But because you repeat the same table three-four times in the FROM clause, thus 
building a cross-product, you are effectively working with a dataset of some 
10^15 records. Anything that suppress the use of indexes by SQLite will cause 
performance to tank.

> My acceptably fast query is fairly complicated as it pulls information from
> a number of different tables, linking them all using WHERE clauses (I've
> read somewhere this is the fastest way to do this in SQLite). It is shown
> below (apologies for it's length) though I have cut out the information
> about which columns it is selecting as it makes it too long to comfortably
> read.
> 
> *SELECT*  *FROM* Delivery, Match, Tour,
> Ground, Country, PlayerTeam BowlerPT, PlayerTeam BatsmanPT, PlayerTeam
> PartnerPT, Player Bowler, Player Batsman, Player Partner, Team BowlingTeam,
> Team BattingTeam, Team HomeTeam, Team AwayTeam *WHERE* (Delivery.MatchID =
> Match.ID AND Match.TourID = Tour.ID AND Delivery.BowlerPlayerTeamID =
> BowlerPT.ID AND BowlerPT.PlayerID = Bowler.ID AND BowlerPT.TeamID =
> BowlingTeam.ID AND Delivery.BatsmanPlayerTeamID = BatsmanPT.ID AND
> BatsmanPT.PlayerID = Batsman.ID AND BatsmanPT.TeamID = BattingTeam.ID AND
> Delivery.PartnerPlayerTeamID = PartnerPT.ID AND PartnerPT.PlayerID =
> Partner.ID AND Match.HomeTeamID = HomeTeam.ID AND Match.AwayTeamID =
> AwayTeam.ID AND Match.GroundID = Ground.ID AND Tour.CountryID = Country.ID
> AND Batsman.ID = 1234567890);
> 
> There are often multiple IDs specified at the end of the WHERE clause, as
> these are added depending upon selections a user has made from a GUI.
> 
> In a very specific case I need to select the above plus one unrelated row.
> Therefore my query changes to (for example):
> 
> *SELECT*  *FROM* Delivery, Match, Tour,
> Ground, Country, PlayerTeam BowlerPT, PlayerTeam BatsmanPT, PlayerTeam
> PartnerPT, Player Bowler, Player Batsman, Player Partner, Team BowlingTeam,
> Team BattingTeam, Team HomeTeam, Team AwayTeam *WHERE* (Delivery.MatchID =
> Match.ID AND Match.TourID = Tour.ID AND Delivery.BowlerPlayerTeamID =
> BowlerPT.ID AND BowlerPT.PlayerID = Bowler.ID AND BowlerPT.TeamID =
> BowlingTeam.ID AND Delivery.BatsmanPlayerTeamID = BatsmanPT.ID AND
> BatsmanPT.PlayerID = Batsman.ID AND BatsmanPT.TeamID = BattingTeam.ID AND
> Delivery.PartnerPlayerTeamID = PartnerPT.ID AND PartnerPT.PlayerID =
> Partner.ID AND Match.HomeTeamID = HomeTeam.ID AND Match.AwayTeamID =
> AwayTeam.ID AND Match.GroundID = Ground.ID AND Tour.CountryID = Country.ID
> AND Batsman.ID = 1234567890) *OR* (Delivery.MatchID = Match.ID AND
> Match.TourID = Tour.ID AND Delivery.BowlerPlayerTeamID = BowlerPT.ID AND
> BowlerPT.PlayerID = Bowler.ID AND BowlerPT.TeamID = BowlingTeam.ID AND
> Delivery.BatsmanPlayerTeamID = BatsmanPT.ID AND BatsmanPT.PlayerID =
> Batsman.ID AND BatsmanPT.TeamID = BattingTeam.ID AND
> Delivery.PartnerPlayerTeamID = PartnerPT.ID AND PartnerPT.PlayerID =
> Partner.ID AND Match.HomeTeamID = HomeTeam.ID AND Match.AwayTeamID =
> AwayTeam.ID AND Match.GroundID = Ground.ID AND Tour.CountryID = Country.ID
> AND Delivery.ID = 1987654321);

Using OR pretty much kills optimization in SQLite. Run this as two separate 
queries, or as a UNION query with two subqueries, each of which only uses AND.
-- 
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Slowdown when adding terms to query

2010-08-06 Thread Richard Hipp
On Fri, Aug 6, 2010 at 5:23 AM, Edward Hawke wrote:

>
> I have done an EXPLAIN QUERY PLAN on the query, which gave me the
> following:
>
> "0","2","TABLE Tour"
> "1","4","TABLE Country"
> "2","1","TABLE Match"
> "3","11","TABLE Team AS BowlingTeam"
> "4","12","TABLE Team AS BattingTeam"
> "5","13","TABLE Team AS HomeTeam"
> "6","14","TABLE Team AS AwayTeam"
> "7","3","TABLE Ground"
> "8","8","TABLE Player AS Bowler"
> "9","9","TABLE Player AS Batsman"
> "10","10","TABLE Player AS Partner"
> "11","5","TABLE PlayerTeam AS BowlerPT"
> "12","6","TABLE PlayerTeam AS BatsmanPT"
> "13","0","TABLE Delivery VIA MULTI-INDEX UNION"
> "14","7","TABLE PlayerTeam AS PartnerPT"
> "0","0","TABLE Delivery WITH INDEX Delivery_BatsmanPTIdx"
> "0","0","TABLE Delivery WITH INDEX sqlite_autoindex_Delivery_1"
>
> But I am unsure on how to read this, and what information it is giving me
> about how to better select my indices.
>

What this shows us is that, except for the subqueries on the Delivery table,
you are not using any indices anywhere.  As far as this query is concerned,
you might as well not have any indices on any of your tables other than
Delivery.

You can start by adding indices on terms that appear on either side of an =
in your WHERE clause.  The whole story is rather more complex, but that will
be a good start.


-- 
-
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users