On Fri, Mar 25, 2011 at 5:24 PM, Eric Minbiole <eminbi...@gmail.com> wrote:
> We have an application that has been using SQLite successfully for several > years. We use SQLite for various purposes within our product, with a few > dozen tables overall. > > Recently, I started to upgrade from version 3.6.3 to 3.7.5. During that > time, I noticed that several previously fast indexed queries turned into > slow full table scans. For example, on a simple (hypothetical) example: > > CREATE TABLE tbl ( > id INTEGER, > value INTEGER > ); > CREATE INDEX idx_id ON tbl (id); > > The query "SELECT * FROM tbl WHERE id = 1" previously made use of idx_id. > However, in 3.7.5, we noticed that it was doing a full table scan, causing > significant performance issues. > > After some debugging, found that the issue is related to our (mis-)use of > the "analyze" command: Historically, our software runs analyze any time > the > db schema changes. However, if our customer had not yet made use of a > particular feature, the corresponding table(s) might be _empty_ when > analyze > is run. On previous versions, this did not cause any obvious problems. In > the new version, the query planner reasonably assumes that a full table > scan > is faster than an index on a small/empty table. However, when the customer > later makes use of those features (populating the tables), the queries > become quite slow, as it still does a full table scan. > > I'm trying to determine the best way to resolve this issue: > Suggestion: After you run ANALYZE, go back and modify values in sqlite_stat1 (and sqlite_stat2 if you are using SQLITE_ENABLE_STAT2) with "typical" values that you would expect to see in most databases. Yes, you can read and write the sqlite_stat1 and sqlite_stat2 tables. Maybe you can come up with some prepackaged default values for sqlite_stat1 and sqlite_stat2 based on experiments in your development lab, and then just put your prepackaged defaults into the real sqlite_stat1 and sqlite_stat2 after running ANALYZE. > > - Re-running analyze after adding data to the table is an obvious > suggestion. However, as we have lots of tables in use for various > purposes, > I'd need to sprinkle lots of "if (first time data added) analyze()" code > around. > - I could add a user triggered "maintenance" feature to manually > re-analyze, > but that takes away some of the "zero configuration" benefits of our > product. > - Even if I stop calling analyze at all going forward, there may be some > existing (empty) tables that may have already been analyzed by previous > code. > - I could make use of the "INDEXED BY" clause, but this goes against the > documented intent of this feature ("*not* intended for use in tuning the > performance of a query"), and requires re-writing of many queries. > > I'm sure there are other good ideas-- I'm flexible, and open to suggestion. > I'd appreciate any suggestions the group might have. > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- D. Richard Hipp d...@sqlite.org _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users