Re: [sqlite] Fine tuning table indexes

2014-02-02 Thread Dan Kennedy

On 02/02/2014 11:16 PM, James K. Lowden wrote:

On Sat, 1 Feb 2014 11:21:45 -0500
Stephen Chrzanowski  wrote:


Would be interesting to see when and where that single index comes
into play when multiple indexes are defined.

create table T (t int primary key, a int , b int);
create index Ta on T(a);
create index Tb on T(b);

select * from T where a < 1 or b < 1

Using one index, pick your poison.  You could use Ta or Tb, but for the
other half of the OR you'll have to scan the table.  So you might as
well scan the table and forget about the index.

Using two indexes, you can find the matching values in Ta and Tb, and
produce results from the union of the two.

For 10 rows, the two-index algorithm is likely slower.  For a million
rows, depending on the cardinality of the data, it could be
significantly faster.



SQLite does that sort of thing already.

  sqlite> create table T (t int primary key, a int , b int);
  sqlite> create index Ta on T(a);
  sqlite> create index Tb on T(b);
  sqlite> EXPLAIN QUERY PLAN select * from T where a < 1 or b < 1;
  0|0|0|SEARCH TABLE T USING INDEX Ta (a


  http://www.sqlite.org/optoverview.html#or_opt





___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fine tuning table indexes

2014-02-02 Thread Petite Abeille

On Feb 2, 2014, at 5:55 PM, Keith Medcalf  wrote:

> Nevertheless, each traversal operation is only using one index at a time.

One word: bitmap. As in bitmap index:

http://www.oracle.com/technetwork/articles/sharma-indexes-093638.html
http://en.wikipedia.org/wiki/Bitmap_index

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fine tuning table indexes

2014-02-02 Thread Keith Medcalf

You mean, of course, doing a re-write as:

select *
  from T 
 where rowid in (select rowid
   from T
  where a < 1
 UNION
 select rowid
   from T
  where b < 1)
 
which uses four indices.  one on a, one on b, one on rowid, and one for the 
union.  Depending on the number of columns and rows, you could also do:

select *
  from T
 where a < 1
UNION
select *
  from T
 where b < 1

Nevertheless, each traversal operation is only using one index at a time.

If you have an index containing on T(a,b) and/or and index on T(b,a) you also 
add a couple more possible ways to execute the query.  Which one has the 
highest performance depends on the data in the table and the available indices. 
 Choosing the optimal requires application knowledge beyond what would be 
available to mere statistical optimizers.

>-Original Message-
>From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
>boun...@sqlite.org] On Behalf Of James K. Lowden
>Sent: Sunday, 2 February, 2014 09:16
>To: General Discussion of SQLite Database
>Subject: Re: [sqlite] Fine tuning table indexes
>
>On Sat, 1 Feb 2014 11:21:45 -0500
>Stephen Chrzanowski <pontia...@gmail.com> wrote:
>
>> Would be interesting to see when and where that single index comes
>> into play when multiple indexes are defined.
>
>create table T (t int primary key, a int , b int);
>create index Ta on T(a);
>create index Tb on T(b);
>
>select * from T where a < 1 or b < 1
>
>Using one index, pick your poison.  You could use Ta or Tb, but for the
>other half of the OR you'll have to scan the table.  So you might as
>well scan the table and forget about the index.
>
>Using two indexes, you can find the matching values in Ta and Tb, and
>produce results from the union of the two.
>
>For 10 rows, the two-index algorithm is likely slower.  For a million
>rows, depending on the cardinality of the data, it could be
>significantly faster.
>
>--jkl
>___
>sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fine tuning table indexes

2014-02-02 Thread James K. Lowden
On Sat, 1 Feb 2014 11:21:45 -0500
Stephen Chrzanowski  wrote:

> Would be interesting to see when and where that single index comes
> into play when multiple indexes are defined.

create table T (t int primary key, a int , b int);
create index Ta on T(a);
create index Tb on T(b);

select * from T where a < 1 or b < 1

Using one index, pick your poison.  You could use Ta or Tb, but for the
other half of the OR you'll have to scan the table.  So you might as
well scan the table and forget about the index.  

Using two indexes, you can find the matching values in Ta and Tb, and
produce results from the union of the two.  

For 10 rows, the two-index algorithm is likely slower.  For a million
rows, depending on the cardinality of the data, it could be
significantly faster.  

--jkl
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fine tuning table indexes

2014-02-01 Thread Simon Slavin

On 2 Feb 2014, at 1:09am, Keith Medcalf  wrote:

> No access method (yet invented) in any product (yet invented) can use more 
> than a single traversal path through a collection of data at any given time 
> during a single traversal operation.

Yeah, that's why I asked for a pointer elsewhere in my post.  I can't figure 
out what it does or why it should be worth doing.  There are ways to use one 
index for finding the right rows and another for sorting, but the only 
explanation of what MS SQL does I can find says it does something like a JOIN.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fine tuning table indexes

2014-02-01 Thread Scott Robison
As I recall, I recorded the source file and line number location of the
query, the query and its plan. No execution time, because the time each
query ran would have been heavily influenced by the amount of time the
caller used in processing each row. It could have been done, but the query
plan was generally adequate to identify queries that needed some indexing
assistance.
 On Feb 1, 2014 9:21 AM, "Stephen Chrzanowski"  wrote:

> What was the raw results of some of the queries?  Obviously query and
> explain, but did you also tag in time deltas pre and post queries?
>
> I'll have to read up more on EXPLAIN.  I'm aware of how indexes work as
> well as where and when you need to put them in, but, when I had my SQL
> training back on MS SQL 2000, back then the engine was knowledgeable to use
> multiple indexes versus SQLite which uses one.  Would be interesting to see
> when and where that single index comes into play when multiple indexes are
> defined.
>
>
> On Fri, Jan 31, 2014 at 11:01 PM, Scott Robison  >wrote:
>
> > On Fri, Jan 31, 2014 at 8:49 PM, Stephen Chrzanowski <
> pontia...@gmail.com>
> >  wrote:
> >
> > > I've not done anything to 'generate' a SQL statement, but I can see the
> > use
> > > of this just as a debugging tool to figure out why something is taking
> so
> > > long.  I might just implement that in the wrapper I use and test for
> > DEBUG
> > > or RELEASE modes.
> > >
> >
> > For my current project for my employer, I wrote an interface in front of
> > SQLite so that the other developers on the team would not need to know as
> > much about it. As a result, "less than optimal" SQL was abundant (and I'm
> > not a SQL guru in the first place, so if I can find flaws in it ... well,
> > yeah).
> >
> > In any case, as part of my interface I wrote some code to log every
> unique
> > query that came through along with the explain query plan output for the
> > query. This gave me the benefit of finding queries that were manufactured
> > without bound parameters *and* information that led to creation of ideal
> > indexes (indices?).
> >
> > You might consider something similar (and I'm really hoping such
> > functionality doesn't already exist in SQLite as shipped, because I'll
> feel
> > foolish for reinventing the wheel).
> >
> > --
> > Scott Robison
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fine tuning table indexes

2014-02-01 Thread Keith Medcalf
>> Would be interesting to see
>> when and where that single index comes into play when multiple indexes
>> are
>> defined.

>MS SQL allows for many sorts of indexes, and SQLite has only one.  MS SQL
>has a Tuning Wizard, which will recommend indexes to create or drop, but
>SQLite has a better query optimiser which is better at figuring out which
>index (per table) to use, partly because it can use the results of
>ANALYZE.

No access method (yet invented) in any product (yet invented) can use more than 
a single traversal path through a collection of data at any given time during a 
single traversal operation.  I believe this is a physical constraint of the 
universe in which we exist (or the rules of the computer simulation which we 
perceive as reality).




___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fine tuning table indexes

2014-02-01 Thread Simon Slavin

On 1 Feb 2014, at 4:21pm, Stephen Chrzanowski  wrote:

> What was the raw results of some of the queries?  Obviously query and
> explain, but did you also tag in time deltas pre and post queries?
> 
> I'll have to read up more on EXPLAIN.  I'm aware of how indexes work as
> well as where and when you need to put them in, but, when I had my SQL
> training back on MS SQL 2000, back then the engine was knowledgeable to use
> multiple indexes versus SQLite which uses one.

More useful than EXPLAIN for this purpose is EXPLAIN QUERY PLAN.  Also make 
sure you read up on (and use) ANALYZE.  ANALYZE does an equivalent analysis job 
to MS SQL's Tuning Wizard, though the results are stored for later use by 
SQLite rather than being presented to the user.

> Would be interesting to see
> when and where that single index comes into play when multiple indexes are
> defined.

MS SQL allows for many sorts of indexes, and SQLite has only one.  MS SQL has a 
Tuning Wizard, which will recommend indexes to create or drop, but SQLite has a 
better query optimiser which is better at figuring out which index (per table) 
to use, partly because it can use the results of ANALYZE.

My understanding of the use of multiple indexes for one query is that MS SQL 
can combine indexes (doing some sort of JOIN operation), but if someone can 
point me at a discussion of it I'll be grateful.

They're the results of two different approaches to the problems.  MS SQL runs 
on big computers which can sit in a nice cool server room; SQLite runs on 
user-machines and tiny handheld devices like mobile phones.  MS SQL throws 
hardware and manpower at the problem: big files, big memory footprint, lots of 
processing, use all the power you want, provide training courses so people can 
become experts on your many complicated features.  SQLite tries to make 
everything as simple as possible: precompute where possible, make choices 
simple and allow fewer of them, stay small and nimble, work well even when the 
programmer isn't an expert on your technology.  They work differently because 
their functions are different.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fine tuning table indexes

2014-02-01 Thread Stephen Chrzanowski
What was the raw results of some of the queries?  Obviously query and
explain, but did you also tag in time deltas pre and post queries?

I'll have to read up more on EXPLAIN.  I'm aware of how indexes work as
well as where and when you need to put them in, but, when I had my SQL
training back on MS SQL 2000, back then the engine was knowledgeable to use
multiple indexes versus SQLite which uses one.  Would be interesting to see
when and where that single index comes into play when multiple indexes are
defined.


On Fri, Jan 31, 2014 at 11:01 PM, Scott Robison wrote:

> On Fri, Jan 31, 2014 at 8:49 PM, Stephen Chrzanowski 
>  wrote:
>
> > I've not done anything to 'generate' a SQL statement, but I can see the
> use
> > of this just as a debugging tool to figure out why something is taking so
> > long.  I might just implement that in the wrapper I use and test for
> DEBUG
> > or RELEASE modes.
> >
>
> For my current project for my employer, I wrote an interface in front of
> SQLite so that the other developers on the team would not need to know as
> much about it. As a result, "less than optimal" SQL was abundant (and I'm
> not a SQL guru in the first place, so if I can find flaws in it ... well,
> yeah).
>
> In any case, as part of my interface I wrote some code to log every unique
> query that came through along with the explain query plan output for the
> query. This gave me the benefit of finding queries that were manufactured
> without bound parameters *and* information that led to creation of ideal
> indexes (indices?).
>
> You might consider something similar (and I'm really hoping such
> functionality doesn't already exist in SQLite as shipped, because I'll feel
> foolish for reinventing the wheel).
>
> --
> Scott Robison
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fine tuning table indexes

2014-01-31 Thread Scott Robison
On Fri, Jan 31, 2014 at 8:49 PM, Stephen Chrzanowski 
 wrote:

> I've not done anything to 'generate' a SQL statement, but I can see the use
> of this just as a debugging tool to figure out why something is taking so
> long.  I might just implement that in the wrapper I use and test for DEBUG
> or RELEASE modes.
>

For my current project for my employer, I wrote an interface in front of
SQLite so that the other developers on the team would not need to know as
much about it. As a result, "less than optimal" SQL was abundant (and I'm
not a SQL guru in the first place, so if I can find flaws in it ... well,
yeah).

In any case, as part of my interface I wrote some code to log every unique
query that came through along with the explain query plan output for the
query. This gave me the benefit of finding queries that were manufactured
without bound parameters *and* information that led to creation of ideal
indexes (indices?).

You might consider something similar (and I'm really hoping such
functionality doesn't already exist in SQLite as shipped, because I'll feel
foolish for reinventing the wheel).

-- 
Scott Robison
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fine tuning table indexes

2014-01-31 Thread Stephen Chrzanowski
On Fri, Jan 31, 2014 at 12:05 PM, Simon Slavin  wrote:

>
> However, you also mention that your app generates its own commands.
>  Unless you can predict things about these commands ("90% of the time users
> are going to want to sort by date") you're not in a good place to figure
> out what indexes are good.  One solution in this is to add a routine to
> your app to log any SQL commands it makes up, and how long the command
> takes to execute: create a log table with a text column a numeric column.
>  You can look at the results at your leisure and try to improve on any
> command that takes more than 5 seconds, or any command which is used a lot.
>

I've not done anything to 'generate' a SQL statement, but I can see the use
of this just as a debugging tool to figure out why something is taking so
long.  I might just implement that in the wrapper I use and test for DEBUG
or RELEASE modes.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fine tuning table indexes

2014-01-31 Thread Simon Slavin

On 31 Jan 2014, at 4:41pm, Andreas Hofmann  wrote:

> I would like to fine tune table indexes. I want to make sure I got indexes
> for the columns or combined columns of all (most) where clauses.  The issue
> is that the application builds dynamic SQL strings all over the place and it
> may not be easy to find them all without reading the code very closely.   I
> would rather find out what indexes are needed/suggested by profiling the
> app.
> 
> 
> 
> Is there a tool that can collect profiling information at run time and
> suggest missing indexes?  If not, I am thinking about parsing the SQL
> strings myself, not a difficult thing to do, but I was wondering if I
> reinvent the wheel here.

One problem you're going to run into is that lots of optimizing 'wizards' start 
off by building indexes based on what the schema is.  They look at tables and 
see what columns exist, and make up a few indexes which seem likely to help.

However, a properly defined schema already has indexes in for primary key 
columns, FOREIGN KEYs, and for columns declared as unique.  Sometimes this is 
done inherently by the SQL engine, other times the SQL engine will complain is 
such indexes aren't available.  Those are vital indexes, but any decent SQL 
programmer will have ensured they already exist.  So at best, automated index 
adders help only a little.  And at worst they make database files huge and 
increase processing times because every change has to updates many unused 
indexes.  I've seen some truly horrible examples, for various different SQL 
implementations including some expensive ones that should know better.  And I 
once made a client's day simply by deleting a whole bunch of indexes which 
speeded up their time-critical INSERT time, because their other operations were 
run in an overnight batch and nobody cared how long they took.

The way to efficiently figure out what indexes should be created is to look at 
WHERE clauses in all commands, and at JOINs and GROUP BY and ORDER BY clauses 
in SELECT commands.  Those are the things that actually /use/ indexes.  You can 
probably do that for some commands your app uses: look at each SELECT and 
UPDATE and DELETE FROM command and figure out, for each table used, a single 
index which would be 100% ideal for the command.  Then eliminate or optimize 
cases where two commands could use the same index, or where a long index 
includes a shorter one.

However, you also mention that your app generates its own commands.  Unless you 
can predict things about these commands ("90% of the time users are going to 
want to sort by date") you're not in a good place to figure out what indexes 
are good.  One solution in this is to add a routine to your app to log any SQL 
commands it makes up, and how long the command takes to execute: create a log 
table with a text column a numeric column.  You can look at the results at your 
leisure and try to improve on any command that takes more than 5 seconds, or 
any command which is used a lot.

Lastly, for SQLite specifically, don't forget to run the ANALYZE command before 
analysis and after creating or deleting indexes.  This can yield dramatic 
improvements.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users