On Sat, Feb 12, 2011 at 1:21 PM, Roger Binns <rog...@rogerbinns.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 02/12/2011 07:27 AM, Richard Hipp wrote:
> > ... if SQLite is compiled using SQLITE_ENABLE_STAT2 ...
>
> Is there any reason this is not turned on by default?
>
>
A key principle of SQL in general (SQLite and other SQL database engines) is
that the programmer does not have to pick the access algorithm - the
database engine does that for her.  The programmer specifies what she wants
to get from the database and the query planner component of the database
engine figures out exactly how to get it.  For a complex query, there are
typically many different alternative algorithms that the query planner has
to choose from.

A key feature of SQLite is that as long as you do not change the schema or
rerun ANALYZE, the database will always choose the same algorithm given the
same SQL.  In other words, the query planner in SQLite does not depend on
the data stored in the database or the values bound to query parameters.

This is an important guarantee for many applications, especially soft
real-time and safety-critical systems where SQLite is often used.  In
business applications, you want the best average performance, and in such
systems it makes since to adjust the query plan based on the database
content.  But for soft real-time and safety-critical systems, you don't want
an application that worked OK during testing to suddenly switch from a
logarithmic to a linear-time algorithm thinking that it is helping you out.
And so SQLite guarantees that it won't ever do that.

But SQLITE_ENABLE_STAT2 breaks that guarantee.  With STAT2, the ANALYZE
command stores a histogram of left-most column of each index in order to
help it make decisions about whether or not to use that index.  Consider a
query like this:

     SELECT * FROM tab WHERE x=5;

If tab.x is indexed, SQLite would normally use the index.  But if STAT2 is
enabled and the histogram shows that most of the entries in tab have x=5,
then a full table scan would be faster than doing an index lookup for each
entry, and so a full-scan is chosen and the index is not used.  If the query
is this:

     SELECT * FROM tab WHERE x=:param

Then SQLite will choose either the logarithmic-time index algorithm or the
linear-time table scan, depending on what value is bound to :param.  In a
soft real-time system, the product might only have been tested with :param
values that resulted in the use of the index.  But then the product might go
into the field and one day just the right value of :param comes along and a
linear scan is selected.  If the histogram is right, its all good since the
result will come out faster than if the index had been used.  But if the
histogram is wrong, the result could be a slow full table scan, which
exceeds query performance limits causing the product to fail.

This always-use-the-same algorithm guarantee of SQLite is obscure but it is
important to some SQLite users.  And that is the reason STAT2 is turned off
by default.



> If it is turned on and then the database is used by an earlier version of
> SQLite will there be any problems?
>

STAT2 and non-STAT2 versions of SQLite should be able to read and write each
others database without any problems.  But thank you for asking because your
question made me realize that this is something that we do not test as well
as we ought to.  I'll be testing this much more closely in future releases.



>
> Roger
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAk1Wz40ACgkQmOOfHg372QTtSQCfTUQW3FYJDn8wxzV8l5OSK3DF
> wioAn0NGzuCn5rD/k96W5HS33m6ww6Fp
> =/deb
> -----END PGP SIGNATURE-----
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to