Re: [sqlite] Fine tuning table indexes
On 02/02/2014 11:16 PM, James K. Lowden wrote: On Sat, 1 Feb 2014 11:21:45 -0500 Stephen Chrzanowskiwrote: Would be interesting to see when and where that single index comes into play when multiple indexes are defined. create table T (t int primary key, a int , b int); create index Ta on T(a); create index Tb on T(b); select * from T where a < 1 or b < 1 Using one index, pick your poison. You could use Ta or Tb, but for the other half of the OR you'll have to scan the table. So you might as well scan the table and forget about the index. Using two indexes, you can find the matching values in Ta and Tb, and produce results from the union of the two. For 10 rows, the two-index algorithm is likely slower. For a million rows, depending on the cardinality of the data, it could be significantly faster. SQLite does that sort of thing already. sqlite> create table T (t int primary key, a int , b int); sqlite> create index Ta on T(a); sqlite> create index Tb on T(b); sqlite> EXPLAIN QUERY PLAN select * from T where a < 1 or b < 1; 0|0|0|SEARCH TABLE T USING INDEX Ta (a http://www.sqlite.org/optoverview.html#or_opt ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fine tuning table indexes
On Feb 2, 2014, at 5:55 PM, Keith Medcalfwrote: > Nevertheless, each traversal operation is only using one index at a time. One word: bitmap. As in bitmap index: http://www.oracle.com/technetwork/articles/sharma-indexes-093638.html http://en.wikipedia.org/wiki/Bitmap_index ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fine tuning table indexes
You mean, of course, doing a re-write as: select * from T where rowid in (select rowid from T where a < 1 UNION select rowid from T where b < 1) which uses four indices. one on a, one on b, one on rowid, and one for the union. Depending on the number of columns and rows, you could also do: select * from T where a < 1 UNION select * from T where b < 1 Nevertheless, each traversal operation is only using one index at a time. If you have an index containing on T(a,b) and/or and index on T(b,a) you also add a couple more possible ways to execute the query. Which one has the highest performance depends on the data in the table and the available indices. Choosing the optimal requires application knowledge beyond what would be available to mere statistical optimizers. >-Original Message- >From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- >boun...@sqlite.org] On Behalf Of James K. Lowden >Sent: Sunday, 2 February, 2014 09:16 >To: General Discussion of SQLite Database >Subject: Re: [sqlite] Fine tuning table indexes > >On Sat, 1 Feb 2014 11:21:45 -0500 >Stephen Chrzanowski <pontia...@gmail.com> wrote: > >> Would be interesting to see when and where that single index comes >> into play when multiple indexes are defined. > >create table T (t int primary key, a int , b int); >create index Ta on T(a); >create index Tb on T(b); > >select * from T where a < 1 or b < 1 > >Using one index, pick your poison. You could use Ta or Tb, but for the >other half of the OR you'll have to scan the table. So you might as >well scan the table and forget about the index. > >Using two indexes, you can find the matching values in Ta and Tb, and >produce results from the union of the two. > >For 10 rows, the two-index algorithm is likely slower. For a million >rows, depending on the cardinality of the data, it could be >significantly faster. > >--jkl >___ >sqlite-users mailing list >sqlite-users@sqlite.org >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fine tuning table indexes
On Sat, 1 Feb 2014 11:21:45 -0500 Stephen Chrzanowskiwrote: > Would be interesting to see when and where that single index comes > into play when multiple indexes are defined. create table T (t int primary key, a int , b int); create index Ta on T(a); create index Tb on T(b); select * from T where a < 1 or b < 1 Using one index, pick your poison. You could use Ta or Tb, but for the other half of the OR you'll have to scan the table. So you might as well scan the table and forget about the index. Using two indexes, you can find the matching values in Ta and Tb, and produce results from the union of the two. For 10 rows, the two-index algorithm is likely slower. For a million rows, depending on the cardinality of the data, it could be significantly faster. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fine tuning table indexes
On 2 Feb 2014, at 1:09am, Keith Medcalfwrote: > No access method (yet invented) in any product (yet invented) can use more > than a single traversal path through a collection of data at any given time > during a single traversal operation. Yeah, that's why I asked for a pointer elsewhere in my post. I can't figure out what it does or why it should be worth doing. There are ways to use one index for finding the right rows and another for sorting, but the only explanation of what MS SQL does I can find says it does something like a JOIN. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fine tuning table indexes
As I recall, I recorded the source file and line number location of the query, the query and its plan. No execution time, because the time each query ran would have been heavily influenced by the amount of time the caller used in processing each row. It could have been done, but the query plan was generally adequate to identify queries that needed some indexing assistance. On Feb 1, 2014 9:21 AM, "Stephen Chrzanowski"wrote: > What was the raw results of some of the queries? Obviously query and > explain, but did you also tag in time deltas pre and post queries? > > I'll have to read up more on EXPLAIN. I'm aware of how indexes work as > well as where and when you need to put them in, but, when I had my SQL > training back on MS SQL 2000, back then the engine was knowledgeable to use > multiple indexes versus SQLite which uses one. Would be interesting to see > when and where that single index comes into play when multiple indexes are > defined. > > > On Fri, Jan 31, 2014 at 11:01 PM, Scott Robison >wrote: > > > On Fri, Jan 31, 2014 at 8:49 PM, Stephen Chrzanowski < > pontia...@gmail.com> > > wrote: > > > > > I've not done anything to 'generate' a SQL statement, but I can see the > > use > > > of this just as a debugging tool to figure out why something is taking > so > > > long. I might just implement that in the wrapper I use and test for > > DEBUG > > > or RELEASE modes. > > > > > > > For my current project for my employer, I wrote an interface in front of > > SQLite so that the other developers on the team would not need to know as > > much about it. As a result, "less than optimal" SQL was abundant (and I'm > > not a SQL guru in the first place, so if I can find flaws in it ... well, > > yeah). > > > > In any case, as part of my interface I wrote some code to log every > unique > > query that came through along with the explain query plan output for the > > query. This gave me the benefit of finding queries that were manufactured > > without bound parameters *and* information that led to creation of ideal > > indexes (indices?). > > > > You might consider something similar (and I'm really hoping such > > functionality doesn't already exist in SQLite as shipped, because I'll > feel > > foolish for reinventing the wheel). > > > > -- > > Scott Robison > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fine tuning table indexes
>> Would be interesting to see >> when and where that single index comes into play when multiple indexes >> are >> defined. >MS SQL allows for many sorts of indexes, and SQLite has only one. MS SQL >has a Tuning Wizard, which will recommend indexes to create or drop, but >SQLite has a better query optimiser which is better at figuring out which >index (per table) to use, partly because it can use the results of >ANALYZE. No access method (yet invented) in any product (yet invented) can use more than a single traversal path through a collection of data at any given time during a single traversal operation. I believe this is a physical constraint of the universe in which we exist (or the rules of the computer simulation which we perceive as reality). ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fine tuning table indexes
On 1 Feb 2014, at 4:21pm, Stephen Chrzanowskiwrote: > What was the raw results of some of the queries? Obviously query and > explain, but did you also tag in time deltas pre and post queries? > > I'll have to read up more on EXPLAIN. I'm aware of how indexes work as > well as where and when you need to put them in, but, when I had my SQL > training back on MS SQL 2000, back then the engine was knowledgeable to use > multiple indexes versus SQLite which uses one. More useful than EXPLAIN for this purpose is EXPLAIN QUERY PLAN. Also make sure you read up on (and use) ANALYZE. ANALYZE does an equivalent analysis job to MS SQL's Tuning Wizard, though the results are stored for later use by SQLite rather than being presented to the user. > Would be interesting to see > when and where that single index comes into play when multiple indexes are > defined. MS SQL allows for many sorts of indexes, and SQLite has only one. MS SQL has a Tuning Wizard, which will recommend indexes to create or drop, but SQLite has a better query optimiser which is better at figuring out which index (per table) to use, partly because it can use the results of ANALYZE. My understanding of the use of multiple indexes for one query is that MS SQL can combine indexes (doing some sort of JOIN operation), but if someone can point me at a discussion of it I'll be grateful. They're the results of two different approaches to the problems. MS SQL runs on big computers which can sit in a nice cool server room; SQLite runs on user-machines and tiny handheld devices like mobile phones. MS SQL throws hardware and manpower at the problem: big files, big memory footprint, lots of processing, use all the power you want, provide training courses so people can become experts on your many complicated features. SQLite tries to make everything as simple as possible: precompute where possible, make choices simple and allow fewer of them, stay small and nimble, work well even when the programmer isn't an expert on your technology. They work differently because their functions are different. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fine tuning table indexes
What was the raw results of some of the queries? Obviously query and explain, but did you also tag in time deltas pre and post queries? I'll have to read up more on EXPLAIN. I'm aware of how indexes work as well as where and when you need to put them in, but, when I had my SQL training back on MS SQL 2000, back then the engine was knowledgeable to use multiple indexes versus SQLite which uses one. Would be interesting to see when and where that single index comes into play when multiple indexes are defined. On Fri, Jan 31, 2014 at 11:01 PM, Scott Robisonwrote: > On Fri, Jan 31, 2014 at 8:49 PM, Stephen Chrzanowski > wrote: > > > I've not done anything to 'generate' a SQL statement, but I can see the > use > > of this just as a debugging tool to figure out why something is taking so > > long. I might just implement that in the wrapper I use and test for > DEBUG > > or RELEASE modes. > > > > For my current project for my employer, I wrote an interface in front of > SQLite so that the other developers on the team would not need to know as > much about it. As a result, "less than optimal" SQL was abundant (and I'm > not a SQL guru in the first place, so if I can find flaws in it ... well, > yeah). > > In any case, as part of my interface I wrote some code to log every unique > query that came through along with the explain query plan output for the > query. This gave me the benefit of finding queries that were manufactured > without bound parameters *and* information that led to creation of ideal > indexes (indices?). > > You might consider something similar (and I'm really hoping such > functionality doesn't already exist in SQLite as shipped, because I'll feel > foolish for reinventing the wheel). > > -- > Scott Robison > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fine tuning table indexes
On Fri, Jan 31, 2014 at 8:49 PM, Stephen Chrzanowskiwrote: > I've not done anything to 'generate' a SQL statement, but I can see the use > of this just as a debugging tool to figure out why something is taking so > long. I might just implement that in the wrapper I use and test for DEBUG > or RELEASE modes. > For my current project for my employer, I wrote an interface in front of SQLite so that the other developers on the team would not need to know as much about it. As a result, "less than optimal" SQL was abundant (and I'm not a SQL guru in the first place, so if I can find flaws in it ... well, yeah). In any case, as part of my interface I wrote some code to log every unique query that came through along with the explain query plan output for the query. This gave me the benefit of finding queries that were manufactured without bound parameters *and* information that led to creation of ideal indexes (indices?). You might consider something similar (and I'm really hoping such functionality doesn't already exist in SQLite as shipped, because I'll feel foolish for reinventing the wheel). -- Scott Robison ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fine tuning table indexes
On Fri, Jan 31, 2014 at 12:05 PM, Simon Slavinwrote: > > However, you also mention that your app generates its own commands. > Unless you can predict things about these commands ("90% of the time users > are going to want to sort by date") you're not in a good place to figure > out what indexes are good. One solution in this is to add a routine to > your app to log any SQL commands it makes up, and how long the command > takes to execute: create a log table with a text column a numeric column. > You can look at the results at your leisure and try to improve on any > command that takes more than 5 seconds, or any command which is used a lot. > I've not done anything to 'generate' a SQL statement, but I can see the use of this just as a debugging tool to figure out why something is taking so long. I might just implement that in the wrapper I use and test for DEBUG or RELEASE modes. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fine tuning table indexes
On 31 Jan 2014, at 4:41pm, Andreas Hofmannwrote: > I would like to fine tune table indexes. I want to make sure I got indexes > for the columns or combined columns of all (most) where clauses. The issue > is that the application builds dynamic SQL strings all over the place and it > may not be easy to find them all without reading the code very closely. I > would rather find out what indexes are needed/suggested by profiling the > app. > > > > Is there a tool that can collect profiling information at run time and > suggest missing indexes? If not, I am thinking about parsing the SQL > strings myself, not a difficult thing to do, but I was wondering if I > reinvent the wheel here. One problem you're going to run into is that lots of optimizing 'wizards' start off by building indexes based on what the schema is. They look at tables and see what columns exist, and make up a few indexes which seem likely to help. However, a properly defined schema already has indexes in for primary key columns, FOREIGN KEYs, and for columns declared as unique. Sometimes this is done inherently by the SQL engine, other times the SQL engine will complain is such indexes aren't available. Those are vital indexes, but any decent SQL programmer will have ensured they already exist. So at best, automated index adders help only a little. And at worst they make database files huge and increase processing times because every change has to updates many unused indexes. I've seen some truly horrible examples, for various different SQL implementations including some expensive ones that should know better. And I once made a client's day simply by deleting a whole bunch of indexes which speeded up their time-critical INSERT time, because their other operations were run in an overnight batch and nobody cared how long they took. The way to efficiently figure out what indexes should be created is to look at WHERE clauses in all commands, and at JOINs and GROUP BY and ORDER BY clauses in SELECT commands. Those are the things that actually /use/ indexes. You can probably do that for some commands your app uses: look at each SELECT and UPDATE and DELETE FROM command and figure out, for each table used, a single index which would be 100% ideal for the command. Then eliminate or optimize cases where two commands could use the same index, or where a long index includes a shorter one. However, you also mention that your app generates its own commands. Unless you can predict things about these commands ("90% of the time users are going to want to sort by date") you're not in a good place to figure out what indexes are good. One solution in this is to add a routine to your app to log any SQL commands it makes up, and how long the command takes to execute: create a log table with a text column a numeric column. You can look at the results at your leisure and try to improve on any command that takes more than 5 seconds, or any command which is used a lot. Lastly, for SQLite specifically, don't forget to run the ANALYZE command before analysis and after creating or deleting indexes. This can yield dramatic improvements. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users