Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
> -Original Message- > From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- > boun...@sqlite.org] On Behalf Of Marc L. Allen > Sent: dinsdag 3 september 2013 15:55 > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Inefficient covering index used for Subversion with > SQLite 3.8.0 > > Am I understanding that, in this example, the I_NODES_PARENT is being > chosen as the search index because... it's smaller and therefore faster to find > initial qualifying rows that you can then use in some sort of ordered lookup in > another index/table? > > I'm always in awe of some of the plans a good QA comes up with, and it > blows me away that there are cases when using a less-covering index would > be better than a more-covering index. It also happens to be the first index to be encountered that has wc_id as first component. I'm not sure which part (being first vs being smaller) is used to make the decision, but this is exactly why I expected this to be a simple corner case bug instead of part of the new query optimizer design. In Subversion we have queries that only apply on a single directory level (where the I_NODES_PARENT is used for) vs queries that apply to an entire tree (in most cases handled via the primary key index). Using the right index is critical for end-user performance. The I_NODES_MOVED index is only used when trees are moved, which is an uncommon operation, but obtaining information about moves is performance critical in certain scenarios. I hope we will start using the partial index support for this with the next release. That should also directly invalidate using this index for these optimizations . Bert ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
Am I understanding that, in this example, the I_NODES_PARENT is being chosen as the search index because... it's smaller and therefore faster to find initial qualifying rows that you can then use in some sort of ordered lookup in another index/table? I'm always in awe of some of the plans a good QA comes up with, and it blows me away that there are cases when using a less-covering index would be better than a more-covering index. This email and any attachments are only for use by the intended recipient(s) and may contain legally privileged, confidential, proprietary or otherwise private information. Any unauthorized use, reproduction, dissemination, distribution or other disclosure of the contents of this e-mail or its attachments is strictly prohibited. If you have received this email in error, please notify the sender immediately and delete the original. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
> -Original Message- > From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- > boun...@sqlite.org] On Behalf Of Richard Hipp > Sent: dinsdag 3 september 2013 02:12 > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Inefficient covering index used for Subversion with > SQLite 3.8.0 > > On Mon, Sep 2, 2013 at 10:58 AM, Bert Huijben > wrote: > > > We anticipate that the wc_id column will be used more in future versions > > and > > I had hoped that the assumption that a better index match (matching more > > usable columns) would always be preferable over one that uses less > columns. > > > > All indexes have the wc_id column first, so I don't see a good reason that > > Sqlite in this case prefers only this column of the specific (bad) index > > over any other index that can help with the rest of the query. > > > > This optimization added in 3.7.12 made it easier for us to write clearer > > well performing queries, without any hassle. . . We can't really assume our > > users to run a specific version of Sqlite (as most unix distributions > > compile sqlite themselves for products like ours)... > > > > As I've alluded to before, the issue here boils down to the "shape" of the > content of the database file. By "shape", I mean the distribution of > values in each table, and in particular how well each index is able to > narrow down a search. Knowledge of the shape of the data is critically > important in choosing the best query plan when there are two or more > potentially good query plans. It isn't possible to make good decisions > about query plans without knowing the shape of the data. I understand. A long long time ago I implemented similar indexing myself for a library system, where I couldn't even use the whole keys as an index caused by the storage limitations at the time. I'm glad we don't have that 1980's limits any more. At Subversion Our databases aren't well formed by the relational standards, but everything worked reasonably well with Sqlite 3.7 before. I then tuned all queries for Sqlite 3.7.12-3.7.18 where I could get all of our queries use the right indexes without much trouble. > When you run ANALYZE, SQLite records information in the sqlite_stat1 table > that gives it a good picture of the shape of the data. If you have not run > ANALYZE or if you have dropped or deleted the sqlite_stat1 table, then > SQLite has no information about the true shape of the data, so it has to > guess. The guess that SQLite has made for the the past decade is that the > left-most column of each index does a pretty good job of narrowing down > the > search and then subsequent columns merely refine that search slightly. > This is just a wild guess, of course. But it seems to have worked > remarkably well in a wide variety of databases in literally millions of > applications over the past ten years. I don't like that I have to do this, but this appears to be the best maintainable option for the long term future > The issue at hand is that the guess that SQLite is making on the shape of > unanalyzed databases is not working very well for SVN. The problem is that > the left-most column of many of the indices do a very poor job of narrowing > down the search, in violation of the assumption made for the standard > guess. > > As I've pointed out, one could just run ANALYZE to populate the > sqlite_stat1 table, then SQLite will know the database shape and will pick > good plans. Or one can manually populate the sqlite_stat1 table with > something close to the anticipated shape of the database. But Bert says he > doesn't want to do any of that. > > So I've come up with a new, experimental compile-time option: > > -DSQLITE_DEFAULT_INDEX_SHAPE=1 > > Rebuilding with the compile-time option above, using code from the > index-shape-1 branch (http://www.sqlite.org/src/timeline?r=index-shape-1) > causes SQLite to make a much more pessimistic guess about the > effectiveness > of unanalyzed indices. In particular, using the new default shape, SQLite > assumes that you need to use all columns of an index in order to narrow the > search significantly. The number of rows that need to be searched > increases geometrically based on the number of unused columns on the > right-hand side of the index. I can confirm that this patch updates all our queries to work with a query plan at least as efficient as with Sqlite 3.7.12-3.7.18. I will spend some time over the next few days to see if I can get more details on the difference over the different versions by using a standard install, a _stat1 table and this patch by extending our test infrastruc
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
I think this may help me significantly, as none of the application SELECTs use native tables (only the schema metadata analyzer/checker uses native tables) but quite a few virtual tables (which cannot be ANALYZED). -Ursprüngliche Nachricht- Von: Richard Hipp [mailto:d...@sqlite.org] Gesendet: Dienstag, 03. September 2013 02:12 An: General Discussion of SQLite Database Betreff: Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0 On Mon, Sep 2, 2013 at 10:58 AM, Bert Huijben wrote: > We anticipate that the wc_id column will be used more in future > versions and I had hoped that the assumption that a better index match > (matching more usable columns) would always be preferable over one > that uses less columns. > > All indexes have the wc_id column first, so I don't see a good reason > that Sqlite in this case prefers only this column of the specific > (bad) index over any other index that can help with the rest of the query. > > This optimization added in 3.7.12 made it easier for us to write > clearer well performing queries, without any hassle. . . We can't > really assume our users to run a specific version of Sqlite (as most > unix distributions compile sqlite themselves for products like ours)... > As I've alluded to before, the issue here boils down to the "shape" of the content of the database file. By "shape", I mean the distribution of values in each table, and in particular how well each index is able to narrow down a search. Knowledge of the shape of the data is critically important in choosing the best query plan when there are two or more potentially good query plans. It isn't possible to make good decisions about query plans without knowing the shape of the data. When you run ANALYZE, SQLite records information in the sqlite_stat1 table that gives it a good picture of the shape of the data. If you have not run ANALYZE or if you have dropped or deleted the sqlite_stat1 table, then SQLite has no information about the true shape of the data, so it has to guess. The guess that SQLite has made for the the past decade is that the left-most column of each index does a pretty good job of narrowing down the search and then subsequent columns merely refine that search slightly. This is just a wild guess, of course. But it seems to have worked remarkably well in a wide variety of databases in literally millions of applications over the past ten years. The issue at hand is that the guess that SQLite is making on the shape of unanalyzed databases is not working very well for SVN. The problem is that the left-most column of many of the indices do a very poor job of narrowing down the search, in violation of the assumption made for the standard guess. As I've pointed out, one could just run ANALYZE to populate the sqlite_stat1 table, then SQLite will know the database shape and will pick good plans. Or one can manually populate the sqlite_stat1 table with something close to the anticipated shape of the database. But Bert says he doesn't want to do any of that. So I've come up with a new, experimental compile-time option: -DSQLITE_DEFAULT_INDEX_SHAPE=1 Rebuilding with the compile-time option above, using code from the index-shape-1 branch (http://www.sqlite.org/src/timeline?r=index-shape-1) causes SQLite to make a much more pessimistic guess about the effectiveness of unanalyzed indices. In particular, using the new default shape, SQLite assumes that you need to use all columns of an index in order to narrow the search significantly. The number of rows that need to be searched increases geometrically based on the number of unused columns on the right-hand side of the index. Leave the SQLITE_DEFAULT_INDEX_SHAPE unset or set it to 0 to get the legacy assumption about unanalyzed indices. Set it to 1 for the new assumption that seems to work better for the test queries I've run on SVN. If I decide to include this compile-time option in a release, future releases might define other default index shapes using larger integers. I am personally skeptical of this idea of having a compile-time option to change the assumptions about the shape of unanalyzed content. After all, the legacy shape assumption has worked remarkably well up until now. And, more importantly, you can always run ANALYZE or otherwise populate the sqlite_stat1 table to define the shape of the data very precisely. It seems to me that if you want to tweak the data shape assumptions, you should use the mechanism that is already in place to do that (namely, the sqlite_stat1 table). But I'm open to including this new compile-time option if it seems like it might make things easier for developers. Please try this out (by downloading sources from http://www.sqlite.org/src/info/d8daaba7da or by applying the simple patch shown there to any 3.7.x
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
On Mon, Sep 2, 2013 at 10:58 AM, Bert Huijben wrote: > We anticipate that the wc_id column will be used more in future versions > and > I had hoped that the assumption that a better index match (matching more > usable columns) would always be preferable over one that uses less columns. > > All indexes have the wc_id column first, so I don't see a good reason that > Sqlite in this case prefers only this column of the specific (bad) index > over any other index that can help with the rest of the query. > > This optimization added in 3.7.12 made it easier for us to write clearer > well performing queries, without any hassle. . . We can't really assume our > users to run a specific version of Sqlite (as most unix distributions > compile sqlite themselves for products like ours)... > As I've alluded to before, the issue here boils down to the "shape" of the content of the database file. By "shape", I mean the distribution of values in each table, and in particular how well each index is able to narrow down a search. Knowledge of the shape of the data is critically important in choosing the best query plan when there are two or more potentially good query plans. It isn't possible to make good decisions about query plans without knowing the shape of the data. When you run ANALYZE, SQLite records information in the sqlite_stat1 table that gives it a good picture of the shape of the data. If you have not run ANALYZE or if you have dropped or deleted the sqlite_stat1 table, then SQLite has no information about the true shape of the data, so it has to guess. The guess that SQLite has made for the the past decade is that the left-most column of each index does a pretty good job of narrowing down the search and then subsequent columns merely refine that search slightly. This is just a wild guess, of course. But it seems to have worked remarkably well in a wide variety of databases in literally millions of applications over the past ten years. The issue at hand is that the guess that SQLite is making on the shape of unanalyzed databases is not working very well for SVN. The problem is that the left-most column of many of the indices do a very poor job of narrowing down the search, in violation of the assumption made for the standard guess. As I've pointed out, one could just run ANALYZE to populate the sqlite_stat1 table, then SQLite will know the database shape and will pick good plans. Or one can manually populate the sqlite_stat1 table with something close to the anticipated shape of the database. But Bert says he doesn't want to do any of that. So I've come up with a new, experimental compile-time option: -DSQLITE_DEFAULT_INDEX_SHAPE=1 Rebuilding with the compile-time option above, using code from the index-shape-1 branch (http://www.sqlite.org/src/timeline?r=index-shape-1) causes SQLite to make a much more pessimistic guess about the effectiveness of unanalyzed indices. In particular, using the new default shape, SQLite assumes that you need to use all columns of an index in order to narrow the search significantly. The number of rows that need to be searched increases geometrically based on the number of unused columns on the right-hand side of the index. Leave the SQLITE_DEFAULT_INDEX_SHAPE unset or set it to 0 to get the legacy assumption about unanalyzed indices. Set it to 1 for the new assumption that seems to work better for the test queries I've run on SVN. If I decide to include this compile-time option in a release, future releases might define other default index shapes using larger integers. I am personally skeptical of this idea of having a compile-time option to change the assumptions about the shape of unanalyzed content. After all, the legacy shape assumption has worked remarkably well up until now. And, more importantly, you can always run ANALYZE or otherwise populate the sqlite_stat1 table to define the shape of the data very precisely. It seems to me that if you want to tweak the data shape assumptions, you should use the mechanism that is already in place to do that (namely, the sqlite_stat1 table). But I'm open to including this new compile-time option if it seems like it might make things easier for developers. Please try this out (by downloading sources from http://www.sqlite.org/src/info/d8daaba7da or by applying the simple patch shown there to any 3.7.x or 3.8.x version of SQLite) and let me know what you think. Your feedback is appreciated. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
On 2 Sep 2013, at 9:54pm, Kees Nuyt wrote: > I would reverse 2) and 3), otherwise ANALYZE has nothing to analyze. Of course. I thought it right and wrote it wrong. Thanks for the correction. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
On Mon, 2 Sep 2013 17:32:31 +0100, Simon Slavin wrote: > >On 2 Sep 2013, at 3:58pm, Bert Huijben wrote: > >> We anticipate that the wc_id column will be used more in future versions and >> I had hoped that the assumption that a better index match (matching more >> usable columns) would always be preferable over one that uses less columns. > >If you want to test for good indexes and don't understand how indexing will >work on your data ... > >1) Put in a convincing set of data >2) Run ANALYZE >3) Make up a ton of indexes, indexing lots of tables in many different orders. >4) Find the SELECT and UPDATEs you run most often and use EXPLAIN QUERY PLAN >on them. >5) Keep the indexes the query plans mention, and drop the ones they don't >mention. I would reverse 2) and 3), otherwise ANALYZE has nothing to analyze. The query plans of 4) are only valid with results of ANALYZE. Without sqlite_stat1 , the SQLite optimizer may choose different indexes at runtime. So, this strategy only works if Bert decides to include a populated sqlite_stat1 table in the SVN init code. >This stuff can't be done by theory: there are too many >possibilities and it requires too good an understanding >of how the query planner works for non-experts. >There's no substitute for actual testing. > >Simon. -- Groet, Cordialement, Pozdrawiam, Regards, Kees Nuyt ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
On 2 Sep 2013, at 3:58pm, Bert Huijben wrote: > We anticipate that the wc_id column will be used more in future versions and > I had hoped that the assumption that a better index match (matching more > usable columns) would always be preferable over one that uses less columns. If you want to test for good indexes and don't understand how indexing will work on your data ... 1) Put in a convincing set of data 2) Run ANALYZE 3) Make up a ton of indexes, indexing lots of tables in many different orders. 4) Find the SELECT and UPDATEs you run most often and use EXPLAIN QUERY PLAN on them. 5) Keep the indexes the query plans mention, and drop the ones they don't mention. This stuff can't be done by theory: there are too many possibilities and it requires too good an understanding of how the query planner works for non-experts. There's no substitute for actual testing. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
On Mon, Sep 2, 2013 at 10:58 AM, Bert Huijben wrote: > We anticipate that the wc_id column will be used more in future versions > and > I had hoped that the assumption that a better index match (matching more > usable columns) would always be preferable over one that uses less columns. > > All indexes have the wc_id column first, so I don't see a good reason that > Sqlite in this case prefers only this column of the specific (bad) index > over any other index that can help with the rest of the query. > Thank you; I have your complaint. I will try to come up with some new mechanism to solve your issues without the use of ANALYZE. Please recognize that there do exists real-world databases where it is more efficient to use a single lookup on a single column of an index rather than doing multiple lookups on the same index using multiple columns, which is the choice that SQLite faces in your scenario. It is not clear to me how SQLite is suppose to tell one case from the other without access to ANALYZE-like meta-information that describes the shape of the database content. But I will investigate and see what I can come up with. Please also recognize that if I change SQLite to work better with your data shape, that will also likely make it slower on some other application where the data has a very different shape, for example where the left-most index column is distinctive but the second column is not. The question becomes whether or not this will be a worthwhile tradeoff. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
On Mon, Sep 2, 2013 at 10:58 AM, Bert Huijben wrote: > We can't really assume our > users to run a specific version of Sqlite (as most unix distributions > compile sqlite themselves for products like ours)... > Yes you can: statically link. Add "sqlite3.c" and "sqlite3.h" to your source tree so that people who build from sources do not have to download (a particular version of) sqlite. Mozilla works this way, as does Fossil. I think Chromium does too, though I'd need to check to be sure. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
> -Original Message- > From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- > boun...@sqlite.org] On Behalf Of Richard Hipp > Sent: maandag 2 september 2013 16:11 > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Inefficient covering index used for Subversion with > SQLite 3.8.0 > > On Mon, Sep 2, 2013 at 9:42 AM, Bert Huijben > wrote: > > > > > My question is more like: Should I see this as a light regression that most > > likely will be resolved in a future version, or as a buggy assumption in > > our > > code that I should work around? It should be possible via explicit 'INDEXED > > BY' clauses and/or breaking the query in separate parts. (Or for future > > versions by pre-filling the stat table) > > > > I think rules #3 and #4 (especially #4) of > http://www.sqlite.org/queryplanner-ng.html#howtofix are applicable here. > > Without the results of ANALYZE, SQLite assumes that the left-most column of > an index will narrow down a search to about 10 rows. That is clearly not > the case with your schema (where the left-most column of several indices, > wc_id, is always the same value). Hence, SQLite really needs the results > of ANALYZE to work efficiently. > > But that does not mean you have to run ANALYZE on each installation. > Instead, you can install a pre-computed sqlite3_stat1 table (the larger of > the two you emailed above would suffice) whenever you create a new > database. To do this, just run: > > ANALYZE sqlite_master; -- Create the sqlite_stat1 table > INSERT INTO sqlite_stat1 VALUES(...); -- repeat as necessary to fill > in the table. > ANALYZE sqlite_master; -- Load the sqlite_stat1 table into the > optimizer. > > The above steps only need to be done once, when the database is first > created, and can be part of the same script that creates all the other > tables, indices, triggers, and views in your database. You should never > need to run ANALYZE again (assuming the "shape" of your data is always > roughly the same). The sqlite_stat1 table created here gives SQLite all > the information it needs to be able to figure out the best way to handle > queries in your peculiar usage pattern. Thanks. We anticipate that the wc_id column will be used more in future versions and I had hoped that the assumption that a better index match (matching more usable columns) would always be preferable over one that uses less columns. All indexes have the wc_id column first, so I don't see a good reason that Sqlite in this case prefers only this column of the specific (bad) index over any other index that can help with the rest of the query. This optimization added in 3.7.12 made it easier for us to write clearer well performing queries, without any hassle. . . We can't really assume our users to run a specific version of Sqlite (as most unix distributions compile sqlite themselves for products like ours)... Most Subversion developers are not database experts, so adding this initialization will add more code that will have to be explained over and over again to developers that don't understand how a query optimizer (or a B-Tree) works. Thanks for the call ANALYZE twice trick. That avoids having to declare the table ourselves. Bert ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
On Mon, Sep 2, 2013 at 9:42 AM, Bert Huijben wrote: > > My question is more like: Should I see this as a light regression that most > likely will be resolved in a future version, or as a buggy assumption in > our > code that I should work around? It should be possible via explicit 'INDEXED > BY' clauses and/or breaking the query in separate parts. (Or for future > versions by pre-filling the stat table) > I think rules #3 and #4 (especially #4) of http://www.sqlite.org/queryplanner-ng.html#howtofix are applicable here. Without the results of ANALYZE, SQLite assumes that the left-most column of an index will narrow down a search to about 10 rows. That is clearly not the case with your schema (where the left-most column of several indices, wc_id, is always the same value). Hence, SQLite really needs the results of ANALYZE to work efficiently. But that does not mean you have to run ANALYZE on each installation. Instead, you can install a pre-computed sqlite3_stat1 table (the larger of the two you emailed above would suffice) whenever you create a new database. To do this, just run: ANALYZE sqlite_master; -- Create the sqlite_stat1 table INSERT INTO sqlite_stat1 VALUES(...); -- repeat as necessary to fill in the table. ANALYZE sqlite_master; -- Load the sqlite_stat1 table into the optimizer. The above steps only need to be done once, when the database is first created, and can be part of the same script that creates all the other tables, indices, triggers, and views in your database. You should never need to run ANALYZE again (assuming the "shape" of your data is always roughly the same). The sqlite_stat1 table created here gives SQLite all the information it needs to be able to figure out the best way to handle queries in your peculiar usage pattern. FWIW, please note that the I_NODES_MOVED index in your schema is completely useless, since all values of all columns are always the same. That index is dead weight. You might want to drop that index, if the shape of your data really is always similar to the examples you supplied us. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
> -Original Message- > From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- > boun...@sqlite.org] On Behalf Of Richard Hipp > Sent: vrijdag 30 augustus 2013 21:41 > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Inefficient covering index used for Subversion with > SQLite 3.8.0 > > On Fri, Aug 30, 2013 at 3:31 PM, Bert Huijben wrote: > > > The analyze on the very small database (which I used for the comparison > > between 3.7 and 3.8) is: > > > > Thanks for the data. I can share the/an actual database if that would help. (The wc.db describes a checkout of a 100% public subversion repository, so there is nothing secret in it) For myself it would be interesting to know how I should look at the OR optimization which was available for our use cases between Sqlite 3.7.12 and 3.7.18. My question is more like: Should I see this as a light regression that most likely will be resolved in a future version, or as a buggy assumption in our code that I should work around? It should be possible via explicit 'INDEXED BY' clauses and/or breaking the query in separate parts. (Or for future versions by pre-filling the stat table) Bert ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
Hi, Using the Subversion 1.7 / 1.8 wc.db schema I get quite different results from the query planner for several of our queries, where the difference in performance is quite huge. For this typical example using 3.8.0.1 only one index component is used: $ EXPLAIN QUERY PLAN DELETE FROM NODES WHERE wc_id = ?1 AND local_relpath = ?2; 0|0|0|SEARCH TABLE NODES USING COVERING INDEX I_NODES_PARENT (wc_id=?) While with 3.7.17 two components are used: $ EXPLAIN QUERY PLAN DELETE FROM NODES WHERE wc_id = ?1 AND local_relpath = ?2; 0|0|0|SEARCH TABLE NODES USING COVERING INDEX sqlite_autoindex_NODES_1 (wc_id=? AND local_relpath=?) (~9 rows) As currently wc_id is almost always 0 in our database, 3.8.0 will perform a full table scan while 3.7.17 (and older) just deletes the right record. If I perform a similar 'SELECT * FROM' instead of 'DELETE FROM', the right index is used (but not as COVERING of course), while a 'SELECT local_relpath FROM' shows the same problem as the delete. This problem appears specific to cases where multiple covering indexes are found. Bert -- Complete schema: [[ CREATE TABLE REPOSITORY ( id INTEGER PRIMARY KEY AUTOINCREMENT, root TEXT UNIQUE NOT NULL, uuid TEXT NOT NULL ); CREATE INDEX I_UUID ON REPOSITORY (uuid); CREATE INDEX I_ROOT ON REPOSITORY (root); CREATE TABLE WCROOT ( id INTEGER PRIMARY KEY AUTOINCREMENT, local_abspath TEXT UNIQUE ); CREATE UNIQUE INDEX I_LOCAL_ABSPATH ON WCROOT (local_abspath); CREATE TABLE PRISTINE ( checksum TEXT NOT NULL PRIMARY KEY, compression INTEGER, size INTEGER NOT NULL, refcount INTEGER NOT NULL, md5_checksum TEXT NOT NULL ); CREATE TABLE ACTUAL_NODE ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, parent_relpath TEXT, properties BLOB, conflict_old TEXT, conflict_new TEXT, conflict_working TEXT, prop_reject TEXT, changelist TEXT, text_mod TEXT, tree_conflict_data TEXT, conflict_data BLOB, older_checksum TEXT REFERENCES PRISTINE (checksum), left_checksum TEXT REFERENCES PRISTINE (checksum), right_checksum TEXT REFERENCES PRISTINE (checksum), PRIMARY KEY (wc_id, local_relpath) ); CREATE TABLE LOCK ( repos_id INTEGER NOT NULL REFERENCES REPOSITORY (id), repos_relpath TEXT NOT NULL, lock_token TEXT NOT NULL, lock_owner TEXT, lock_comment TEXT, lock_date INTEGER, PRIMARY KEY (repos_id, repos_relpath) ); CREATE TABLE WORK_QUEUE ( id INTEGER PRIMARY KEY AUTOINCREMENT, work BLOB NOT NULL ); CREATE TABLE WC_LOCK ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_dir_relpath TEXT NOT NULL, locked_levels INTEGER NOT NULL DEFAULT -1, PRIMARY KEY (wc_id, local_dir_relpath) ); CREATE TABLE NODES ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, op_depth INTEGER NOT NULL, parent_relpath TEXT, repos_id INTEGER REFERENCES REPOSITORY (id), repos_path TEXT, revision INTEGER, presence TEXT NOT NULL, moved_here INTEGER, moved_to TEXT, kind TEXT NOT NULL, properties BLOB, depth TEXT, checksum TEXT REFERENCES PRISTINE (checksum), symlink_target TEXT, changed_revision INTEGER, changed_date INTEGER, changed_author TEXT, translated_size INTEGER, last_mod_time INTEGER, dav_cache BLOB, file_external TEXT, inherited_props BLOB, PRIMARY KEY (wc_id, local_relpath, op_depth) ); CREATE VIEW NODES_CURRENT AS SELECT * FROM nodes AS n WHERE op_depth = (SELECT MAX(op_depth) FROM nodes AS n2 WHERE n2.wc_id = n.wc_id AND n2.local_relpath = n.local_relpath); CREATE VIEW NODES_BASE AS SELECT * FROM nodes WHERE op_depth = 0; CREATE TRIGGER nodes_insert_trigger AFTER INSERT ON nodes WHEN NEW.checksum IS NOT NULL BEGIN UPDATE pristine SET refcount = refcount + 1 WHERE checksum = NEW.checksum; END; CREATE TRIGGER nodes_delete_trigger AFTER DELETE ON nodes WHEN OLD.checksum IS NOT NULL BEGIN UPDATE pristine SET refcount = refcount - 1 WHERE checksum = OLD.checksum; END; CREATE TRIGGER nodes_update_checksum_trigger AFTER UPDATE OF checksum ON nodes WHEN NEW.checksum IS NOT OLD.checksum BEGIN UPDATE pristine SET refcount = refcount + 1 WHERE checksum = NEW.checksum; UPDATE pristine SET refcount = refcount - 1 WHERE checksum = OLD.checksum; END; CREATE TABLE EXTERNALS ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, parent_relpath TEXT NOT NULL, repos_id INTEGER NOT NULL REFERENCES REPOSITORY (id), presence TEXT NOT NULL, kind TEXT NOT NULL, def_local_relpath TEXT NOT NULL, def_repos_relpath TEXT NOT NULL, def_operational_revision TEXT, def_revision TEXT, PRIMARY KEY (wc_id, local_relpath) ); CREATE UNIQUE INDEX I_EXTERNALS_DEFINED ON EXTERNALS (wc_id, def_local_relpath, local_relpath); CREATE UNIQUE INDEX I_NODES_MOVED ON NODES (wc_id, mov
[sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
Hi, Using the Subversion 1.7 / 1.8 wc.db schema I get quite different results from the query planner for several of our queries, where the difference in performance is quite huge. For this typical example using 3.8.0.1 only one index component is used: $ EXPLAIN QUERY PLAN DELETE FROM NODES WHERE wc_id = ?1 AND local_relpath = ?2; 0|0|0|SEARCH TABLE NODES USING COVERING INDEX I_NODES_PARENT (wc_id=?) While with 3.7.17 two components are used: $ EXPLAIN QUERY PLAN DELETE FROM NODES WHERE wc_id = ?1 AND local_relpath = ?2; 0|0|0|SEARCH TABLE NODES USING COVERING INDEX sqlite_autoindex_NODES_1 (wc_id=? AND local_relpath=?) (~9 rows) As currently wc_id is almost always 0 in our database, 3.8.0 will perform a full table scan while 3.7.17 (and older) just deletes the right record. If I perform a similar 'SELECT * FROM' instead of 'DELETE FROM', the right index is used (but not as COVERING of course), while a 'SELECT local_relpath FROM' shows the same problem as the delete. This problem appears specific to cases where multiple covering indexes are found. Bert -- Complete schema: [[ CREATE TABLE REPOSITORY ( id INTEGER PRIMARY KEY AUTOINCREMENT, root TEXT UNIQUE NOT NULL, uuid TEXT NOT NULL ); CREATE INDEX I_UUID ON REPOSITORY (uuid); CREATE INDEX I_ROOT ON REPOSITORY (root); CREATE TABLE WCROOT ( id INTEGER PRIMARY KEY AUTOINCREMENT, local_abspath TEXT UNIQUE ); CREATE UNIQUE INDEX I_LOCAL_ABSPATH ON WCROOT (local_abspath); CREATE TABLE PRISTINE ( checksum TEXT NOT NULL PRIMARY KEY, compression INTEGER, size INTEGER NOT NULL, refcount INTEGER NOT NULL, md5_checksum TEXT NOT NULL ); CREATE TABLE ACTUAL_NODE ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, parent_relpath TEXT, properties BLOB, conflict_old TEXT, conflict_new TEXT, conflict_working TEXT, prop_reject TEXT, changelist TEXT, text_mod TEXT, tree_conflict_data TEXT, conflict_data BLOB, older_checksum TEXT REFERENCES PRISTINE (checksum), left_checksum TEXT REFERENCES PRISTINE (checksum), right_checksum TEXT REFERENCES PRISTINE (checksum), PRIMARY KEY (wc_id, local_relpath) ); CREATE TABLE LOCK ( repos_id INTEGER NOT NULL REFERENCES REPOSITORY (id), repos_relpath TEXT NOT NULL, lock_token TEXT NOT NULL, lock_owner TEXT, lock_comment TEXT, lock_date INTEGER, PRIMARY KEY (repos_id, repos_relpath) ); CREATE TABLE WORK_QUEUE ( id INTEGER PRIMARY KEY AUTOINCREMENT, work BLOB NOT NULL ); CREATE TABLE WC_LOCK ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_dir_relpath TEXT NOT NULL, locked_levels INTEGER NOT NULL DEFAULT -1, PRIMARY KEY (wc_id, local_dir_relpath) ); CREATE TABLE NODES ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, op_depth INTEGER NOT NULL, parent_relpath TEXT, repos_id INTEGER REFERENCES REPOSITORY (id), repos_path TEXT, revision INTEGER, presence TEXT NOT NULL, moved_here INTEGER, moved_to TEXT, kind TEXT NOT NULL, properties BLOB, depth TEXT, checksum TEXT REFERENCES PRISTINE (checksum), symlink_target TEXT, changed_revision INTEGER, changed_date INTEGER, changed_author TEXT, translated_size INTEGER, last_mod_time INTEGER, dav_cache BLOB, file_external TEXT, inherited_props BLOB, PRIMARY KEY (wc_id, local_relpath, op_depth) ); CREATE VIEW NODES_CURRENT AS SELECT * FROM nodes AS n WHERE op_depth = (SELECT MAX(op_depth) FROM nodes AS n2 WHERE n2.wc_id = n.wc_id AND n2.local_relpath = n.local_relpath); CREATE VIEW NODES_BASE AS SELECT * FROM nodes WHERE op_depth = 0; CREATE TRIGGER nodes_insert_trigger AFTER INSERT ON nodes WHEN NEW.checksum IS NOT NULL BEGIN UPDATE pristine SET refcount = refcount + 1 WHERE checksum = NEW.checksum; END; CREATE TRIGGER nodes_delete_trigger AFTER DELETE ON nodes WHEN OLD.checksum IS NOT NULL BEGIN UPDATE pristine SET refcount = refcount - 1 WHERE checksum = OLD.checksum; END; CREATE TRIGGER nodes_update_checksum_trigger AFTER UPDATE OF checksum ON nodes WHEN NEW.checksum IS NOT OLD.checksum BEGIN UPDATE pristine SET refcount = refcount + 1 WHERE checksum = NEW.checksum; UPDATE pristine SET refcount = refcount - 1 WHERE checksum = OLD.checksum; END; CREATE TABLE EXTERNALS ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, parent_relpath TEXT NOT NULL, repos_id INTEGER NOT NULL REFERENCES REPOSITORY (id), presence TEXT NOT NULL, kind TEXT NOT NULL, def_local_relpath TEXT NOT NULL, def_repos_relpath TEXT NOT NULL, def_operational_revision TEXT, def_revision TEXT, PRIMARY KEY (wc_id, local_relpath) ); CREATE UNIQUE INDEX I_EXTERNALS_DEFINED ON EXTERNALS (wc_id, def_local_relpath, local_relpath); CREATE UNIQUE INDEX I_NODES_MOVED ON NODES (wc_id, mov
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
On Fri, Aug 30, 2013 at 3:31 PM, Bert Huijben wrote: > The analyze on the very small database (which I used for the comparison > between 3.7 and 3.8) is: > Thanks for the data. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
> -Original Message- > From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- > boun...@sqlite.org] On Behalf Of Richard Hipp > Sent: vrijdag 30 augustus 2013 21:16 > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Inefficient covering index used for Subversion with > SQLite 3.8.0 > > On Fri, Aug 30, 2013 at 2:44 PM, wrote: > > > > > The query > > [[ > > DELETE FROM nodes WHERE wc_id = ?1 AND (local_relpath = ?2 OR > > (((local_relpath) > (CASE (?2) WHEN '' THEN '' ELSE (?2) || '/' END)) AND > > ((local_relpath) < CASE (?2) WHEN '' THEN X'' ELSE (?2) || '0' END))) > > AND op_depth = ?3 > > ]] > > > > > > Is handled by 3.7.17 as: > > 0|0|0|SEARCH TABLE nodes USING COVERING INDEX > sqlite_autoindex_NODES_1 > > (wc_id=? AND local_relpath=? AND op_depth=?) (~1 rows) > > 0|0|0|SEARCH TABLE nodes USING COVERING INDEX > sqlite_autoindex_NODES_1 > > (wc_id=? AND local_relpath>? AND local_relpath > > > Which I read as two separate operations, under the 'OR' optimization > > > > But 3.8.0.1 does: > > 0|0|0|SEARCH TABLE nodes USING COVERING INDEX I_NODES_PARENT > (wc_id=?) > > > > Which in our case is far worse than using the primary key on the normal > > table as wc_id is constant and local_relpath +- our primary key. > > > > But the query planner has know way of knowing that wc_id is always the > same value, unless you run ANALYZE. Can you do that, please: Run > ANALYZE, > then send in the content of the resulting "sqlite_stat1" table. The analyze on the very small database (which I used for the comparison between 3.7 and 3.8) is: [[ PRAGMA foreign_keys=OFF; BEGIN TRANSACTION; ANALYZE sqlite_master; INSERT INTO "sqlite_stat1" VALUES('REPOSITORY','I_ROOT','1 1'); INSERT INTO "sqlite_stat1" VALUES('REPOSITORY','I_UUID','1 1'); INSERT INTO "sqlite_stat1" VALUES('REPOSITORY','sqlite_autoindex_REPOSITORY_1','1 1'); INSERT INTO "sqlite_stat1" VALUES('WCROOT','I_LOCAL_ABSPATH','1 1'); INSERT INTO "sqlite_stat1" VALUES('WCROOT','sqlite_autoindex_WCROOT_1','1 1'); INSERT INTO "sqlite_stat1" VALUES('PRISTINE','I_PRISTINE_MD5','45 1'); INSERT INTO "sqlite_stat1" VALUES('PRISTINE','sqlite_autoindex_PRISTINE_1','45 1'); INSERT INTO "sqlite_stat1" VALUES('NODES','I_NODES_PARENT','7 7 4 1 1'); INSERT INTO "sqlite_stat1" VALUES('NODES','I_NODES_MOVED','7 7 7 7'); INSERT INTO "sqlite_stat1" VALUES('NODES','sqlite_autoindex_NODES_1','7 7 1 1'); COMMIT; ]] A larger Subversion working copy gets [[ PRAGMA foreign_keys=OFF; BEGIN TRANSACTION; ANALYZE sqlite_master; INSERT INTO "sqlite_stat1" VALUES('REPOSITORY','I_ROOT','2 1'); INSERT INTO "sqlite_stat1" VALUES('REPOSITORY','I_UUID','2 2'); INSERT INTO "sqlite_stat1" VALUES('REPOSITORY','sqlite_autoindex_REPOSITORY_1','2 1'); INSERT INTO "sqlite_stat1" VALUES('WCROOT','I_LOCAL_ABSPATH','1 1'); INSERT INTO "sqlite_stat1" VALUES('WCROOT','sqlite_autoindex_WCROOT_1','1 1'); INSERT INTO "sqlite_stat1" VALUES('PRISTINE','I_PRISTINE_MD5','9451 1'); INSERT INTO "sqlite_stat1" VALUES('PRISTINE','sqlite_autoindex_PRISTINE_1','9451 1'); INSERT INTO "sqlite_stat1" VALUES('NODES','I_NODES_PARENT','18420 18420 9 1 1'); INSERT INTO "sqlite_stat1" VALUES('NODES','I_NODES_MOVED','18420 18420 18420 18420'); INSERT INTO "sqlite_stat1" VALUES('NODES','sqlite_autoindex_NODES_1','18420 18420 1 1'); COMMIT; ]] But in general we don't run analyze in the working copies. We would most likely have to insert some initial analyze data for our use cases as our users always start from an empty database and there is no real time where we can run analyze As I guessed after your question 3.8.0 optimizes this correctly after analyzing; 3.7.17 also without. Bert ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
> -Original Message- > From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- > boun...@sqlite.org] On Behalf Of Richard Hipp > Sent: vrijdag 30 augustus 2013 19:37 > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Inefficient covering index used for Subversion with > SQLite 3.8.0 > > On Fri, Aug 30, 2013 at 12:39 PM, wrote: > > > Hi, > > > > Using the Subversion 1.7 / 1.8 wc.db schema I get quite different results > > from the query planner for several of our queries, where the difference in > > performance is quite huge. > > > > Thanks for the test case. An adjustment to the query planner so that it > works better for your test case has been checked in here: > http://www.sqlite.org/src/info/79e458ef7a Thanks. This fix appears to positively affect quite a few of our queries. There is another common pattern that changes between 3.7.17, which I hoped was caused by the same problem. The query [[ DELETE FROM nodes WHERE wc_id = ?1 AND (local_relpath = ?2 OR (((local_relpath) > (CASE (?2) WHEN '' THEN '' ELSE (?2) || '/' END)) AND ((local_relpath) < CASE (?2) WHEN '' THEN X'' ELSE (?2) || '0' END))) AND op_depth = ?3 ]] Which in our sourcecode we generate from the more readable form [[ DELETE FROM nodes WHERE wc_id = ?1 AND (local_relpath = ?2 OR IS_STRICT_DESCENDANT_OF(local_relpath, ?2)) AND op_depth = ?3 ]] (which is a valid transformation given the path rules in Subversion) Is handled by 3.7.17 as: 0|0|0|SEARCH TABLE nodes USING COVERING INDEX sqlite_autoindex_NODES_1 (wc_id=? AND local_relpath=? AND op_depth=?) (~1 rows) 0|0|0|SEARCH TABLE nodes USING COVERING INDEX sqlite_autoindex_NODES_1 (wc_id=? AND local_relpath>? AND local_relpathhttp://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
On Fri, Aug 30, 2013 at 2:44 PM, wrote: > > The query > [[ > DELETE FROM nodes WHERE wc_id = ?1 AND (local_relpath = ?2 OR > (((local_relpath) > (CASE (?2) WHEN '' THEN '' ELSE (?2) || '/' END)) AND > ((local_relpath) < CASE (?2) WHEN '' THEN X'' ELSE (?2) || '0' END))) > AND op_depth = ?3 > ]] > > Is handled by 3.7.17 as: > 0|0|0|SEARCH TABLE nodes USING COVERING INDEX sqlite_autoindex_NODES_1 > (wc_id=? AND local_relpath=? AND op_depth=?) (~1 rows) > 0|0|0|SEARCH TABLE nodes USING COVERING INDEX sqlite_autoindex_NODES_1 > (wc_id=? AND local_relpath>? AND local_relpath > Which I read as two separate operations, under the 'OR' optimization > > But 3.8.0.1 does: > 0|0|0|SEARCH TABLE nodes USING COVERING INDEX I_NODES_PARENT (wc_id=?) > > Which in our case is far worse than using the primary key on the normal > table as wc_id is constant and local_relpath +- our primary key. > > But the query planner has know way of knowing that wc_id is always the same value, unless you run ANALYZE. Can you do that, please: Run ANALYZE, then send in the content of the resulting "sqlite_stat1" table. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
On Fri, Aug 30, 2013 at 12:39 PM, wrote: > Hi, > > Using the Subversion 1.7 / 1.8 wc.db schema I get quite different results > from the query planner for several of our queries, where the difference in > performance is quite huge. > Thanks for the test case. An adjustment to the query planner so that it works better for your test case has been checked in here: http://www.sqlite.org/src/info/79e458ef7a -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Inefficient covering index used for Subversion with SQLite 3.8.0
Hi, Using the Subversion 1.7 / 1.8 wc.db schema I get quite different results from the query planner for several of our queries, where the difference in performance is quite huge. For this typical example using 3.8.0.1 only one index component is used: $ EXPLAIN QUERY PLAN DELETE FROM NODES WHERE wc_id = ?1 AND local_relpath = ?2; 0|0|0|SEARCH TABLE NODES USING COVERING INDEX I_NODES_PARENT (wc_id=?) While with 3.7.17 two components are used: $ EXPLAIN QUERY PLAN DELETE FROM NODES WHERE wc_id = ?1 AND local_relpath = ?2; 0|0|0|SEARCH TABLE NODES USING COVERING INDEX sqlite_autoindex_NODES_1 (wc_id=? AND local_relpath=?) (~9 rows) As currently wc_id is almost always 0 in our database, 3.8.0 will perform a full table scan while 3.7.17 (and older) just deletes the right record. If I perform a similar 'SELECT * FROM' instead of 'DELETE FROM', the right index is used (but not as COVERING of course), while a 'SELECT local_relpath FROM' shows the same problem as the delete. This problem appears specific to cases where multiple covering indexes are found. Bert -- Complete schema: [[ CREATE TABLE REPOSITORY ( id INTEGER PRIMARY KEY AUTOINCREMENT, root TEXT UNIQUE NOT NULL, uuid TEXT NOT NULL ); CREATE INDEX I_UUID ON REPOSITORY (uuid); CREATE INDEX I_ROOT ON REPOSITORY (root); CREATE TABLE WCROOT ( id INTEGER PRIMARY KEY AUTOINCREMENT, local_abspath TEXT UNIQUE ); CREATE UNIQUE INDEX I_LOCAL_ABSPATH ON WCROOT (local_abspath); CREATE TABLE PRISTINE ( checksum TEXT NOT NULL PRIMARY KEY, compression INTEGER, size INTEGER NOT NULL, refcount INTEGER NOT NULL, md5_checksum TEXT NOT NULL ); CREATE TABLE ACTUAL_NODE ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, parent_relpath TEXT, properties BLOB, conflict_old TEXT, conflict_new TEXT, conflict_working TEXT, prop_reject TEXT, changelist TEXT, text_mod TEXT, tree_conflict_data TEXT, conflict_data BLOB, older_checksum TEXT REFERENCES PRISTINE (checksum), left_checksum TEXT REFERENCES PRISTINE (checksum), right_checksum TEXT REFERENCES PRISTINE (checksum), PRIMARY KEY (wc_id, local_relpath) ); CREATE TABLE LOCK ( repos_id INTEGER NOT NULL REFERENCES REPOSITORY (id), repos_relpath TEXT NOT NULL, lock_token TEXT NOT NULL, lock_owner TEXT, lock_comment TEXT, lock_date INTEGER, PRIMARY KEY (repos_id, repos_relpath) ); CREATE TABLE WORK_QUEUE ( id INTEGER PRIMARY KEY AUTOINCREMENT, work BLOB NOT NULL ); CREATE TABLE WC_LOCK ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_dir_relpath TEXT NOT NULL, locked_levels INTEGER NOT NULL DEFAULT -1, PRIMARY KEY (wc_id, local_dir_relpath) ); CREATE TABLE NODES ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, op_depth INTEGER NOT NULL, parent_relpath TEXT, repos_id INTEGER REFERENCES REPOSITORY (id), repos_path TEXT, revision INTEGER, presence TEXT NOT NULL, moved_here INTEGER, moved_to TEXT, kind TEXT NOT NULL, properties BLOB, depth TEXT, checksum TEXT REFERENCES PRISTINE (checksum), symlink_target TEXT, changed_revision INTEGER, changed_date INTEGER, changed_author TEXT, translated_size INTEGER, last_mod_time INTEGER, dav_cache BLOB, file_external TEXT, inherited_props BLOB, PRIMARY KEY (wc_id, local_relpath, op_depth) ); CREATE VIEW NODES_CURRENT AS SELECT * FROM nodes AS n WHERE op_depth = (SELECT MAX(op_depth) FROM nodes AS n2 WHERE n2.wc_id = n.wc_id AND n2.local_relpath = n.local_relpath); CREATE VIEW NODES_BASE AS SELECT * FROM nodes WHERE op_depth = 0; CREATE TRIGGER nodes_insert_trigger AFTER INSERT ON nodes WHEN NEW.checksum IS NOT NULL BEGIN UPDATE pristine SET refcount = refcount + 1 WHERE checksum = NEW.checksum; END; CREATE TRIGGER nodes_delete_trigger AFTER DELETE ON nodes WHEN OLD.checksum IS NOT NULL BEGIN UPDATE pristine SET refcount = refcount - 1 WHERE checksum = OLD.checksum; END; CREATE TRIGGER nodes_update_checksum_trigger AFTER UPDATE OF checksum ON nodes WHEN NEW.checksum IS NOT OLD.checksum BEGIN UPDATE pristine SET refcount = refcount + 1 WHERE checksum = NEW.checksum; UPDATE pristine SET refcount = refcount - 1 WHERE checksum = OLD.checksum; END; CREATE TABLE EXTERNALS ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, parent_relpath TEXT NOT NULL, repos_id INTEGER NOT NULL REFERENCES REPOSITORY (id), presence TEXT NOT NULL, kind TEXT NOT NULL, def_local_relpath TEXT NOT NULL, def_repos_relpath TEXT NOT NULL, def_operational_revision TEXT, def_revision TEXT, PRIMARY KEY (wc_id, local_relpath) ); CREATE UNIQUE INDEX I_EXTERNALS_DEFINED ON EXTERNALS (wc_id, def_local_relpath, local_relpath); CREATE UNIQUE INDEX I_NODES_MOVED ON NODES (wc_id, moved_to, op_depth); CREATE INDEX I_PRISTINE_MD5 ON PRIS