On Mon, Mar 21, 2011 at 9:48 AM, Simon Slavin <slav...@bigfraud.org> wrote:

>
> Note that if you provide good INDEXes as you should, then the results of
> ANALYZE will never be used at all because the query finder will find it has
> an excellent INDEX before it even starts trying to guess how to access the
> data by a slower searching method.
>

Not exactly.  The results of ANALYZE will be used if there are two or more
indices that SQLite needs to choose from.  So for example:

    CREATE TABLE t1(a,b,c);
    CREATE INDEX t1a ON t1(a);
    CREATE INDEX t1b ON t1(b);

Then if you do:

    SELECT * FROM t1 WHERE a=5 AND b=11;

In that case, SQLite can choose to use index t1a or t1b.  But it cannot
reasonably use both.  (It could, but the result would almost always be
slower than using just one or the other index, and so the case of using both
is not even considered.)

Now if the table T1 contains only 5 distinct values of A but contains 5000
distinct values for B, it makes since to use T1B as the index because that
index is more likely to narrow down the search to a small number of rows
more quickly.  But without running ANALYZE, SQLite has no way of knowing
that there are more distinct values of B than their are of A and so it will
choose arbitrarily.

Further suppose that you compile with -DSQLITE_ENABLE_STAT2 such that the
ANALYZE command will also record a histogram of the data in each index.
Then consider the query:

    SELECT * FROM t1 WHERE a=5;

Without ANALYZE, this second query will always use index T1A.  But if the
histogram data shows that (say) 95% of the entries in T1 have a=5, then it
would be cheaper to do a full table scan, and that is exactly what SQLite
will do in that case.

So it is important to have good indices.  But ANALYZE can still help.

Note, though, that the whole point of SQL (not just SQLite but all SQL
implementations) is that you really should need to worry about any of this
until your application is finished.  Then you can run performance
measurements using real data and add indices and/or ANALYZE as necessary to
address any performance problems.  I don't think the OP groks the
seriousness of this proverb:   "Premature optimization is the root of all
evil".



-- 
D. Richard Hipp
d...@sqlite.org
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to