On Fri, Mar 25, 2011 at 5:24 PM, Eric Minbiole <eminbi...@gmail.com> wrote:

> We have an application that has been using SQLite successfully for several
> years.  We use SQLite for various purposes within our product, with a few
> dozen tables overall.
>
> Recently, I started to upgrade from version 3.6.3 to 3.7.5.  During that
> time, I noticed that several previously fast indexed queries turned into
> slow full table scans.  For example, on a simple (hypothetical) example:
>
> CREATE TABLE tbl (
>  id INTEGER,
>  value INTEGER
> );
> CREATE INDEX idx_id ON tbl (id);
>
> The query "SELECT * FROM tbl WHERE id = 1" previously made use of idx_id.
> However, in 3.7.5, we noticed that it was doing a full table scan, causing
> significant performance issues.
>
> After some debugging, found that the issue is related to our (mis-)use of
> the "analyze" command:  Historically, our software runs analyze any time
> the
> db schema changes.  However, if our customer had not yet made use of a
> particular feature, the corresponding table(s) might be _empty_ when
> analyze
> is run.  On previous versions, this did not cause any obvious problems.  In
> the new version, the query planner reasonably assumes that a full table
> scan
> is faster than an index on a small/empty table.  However, when the customer
> later makes use of those features (populating the tables), the queries
> become quite slow, as it still does a full table scan.
>
> I'm trying to determine the best way to resolve this issue:
>

Suggestion:  After you run ANALYZE, go back and modify values in
sqlite_stat1 (and sqlite_stat2 if you are using SQLITE_ENABLE_STAT2) with
"typical" values that you would expect to see in most databases.  Yes, you
can read and write the sqlite_stat1 and sqlite_stat2 tables.  Maybe you can
come up with some prepackaged default values for sqlite_stat1 and
sqlite_stat2 based on experiments in your development lab, and then just put
your prepackaged defaults into the real sqlite_stat1 and sqlite_stat2 after
running ANALYZE.



>
> - Re-running analyze after adding data to the table is an obvious
> suggestion.  However, as we have lots of tables in use for various
> purposes,
> I'd need to sprinkle lots of "if (first time data added) analyze()" code
> around.
> - I could add a user triggered "maintenance" feature to manually
> re-analyze,
> but that takes away some of the "zero configuration" benefits of our
> product.
> - Even if I stop calling analyze at all going forward, there may be some
> existing (empty) tables that may have already been analyzed by previous
> code.
> - I could make use of the "INDEXED BY" clause, but this goes against the
> documented intent of this feature ("*not* intended for use in tuning the
> performance of a query"), and requires re-writing of many queries.
>
> I'm sure there are other good ideas-- I'm flexible, and open to suggestion.
> I'd appreciate any suggestions the group might have.
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to