Re: svn commit: r1296868 - /subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql
Trent Nelson wrote on Wed, Mar 14, 2012 at 15:05:20 -0700: > On 3/14/12 3:40 PM, "Daniel Shahaf" wrote: > > >Philip: I recalled last year's discussions about implied indexes, but > >between Trent's reported observations on IRC and a back-of-the-envelope > >test with sqlite3(1) I was led to believe that an implied index does not > >get created for in this case (due to the TEXT column, as my comment > >says). > > > >I'm more than happy to revert this on trunk (if it hasn't been already) > >assuming it's indeed superfluous. > > > >Trent -- have you looked into things from your end yet? Can you confirm > >or deny the hypothesis that the explicit INDEX was necessary in your > >environment? > > Try as I might, I can't reproduce the original behavior that set us off > down this superfluous index route. +1 to revert; I was wrong. > Done. Thanks to Philip for spotting the bug. Daniel > Trent. >
Re: svn commit: r1296868 - /subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql
On 3/14/12 3:40 PM, "Daniel Shahaf" wrote: >Philip: I recalled last year's discussions about implied indexes, but >between Trent's reported observations on IRC and a back-of-the-envelope >test with sqlite3(1) I was led to believe that an implied index does not >get created for in this case (due to the TEXT column, as my comment >says). > >I'm more than happy to revert this on trunk (if it hasn't been already) >assuming it's indeed superfluous. > >Trent -- have you looked into things from your end yet? Can you confirm >or deny the hypothesis that the explicit INDEX was necessary in your >environment? Try as I might, I can't reproduce the original behavior that set us off down this superfluous index route. +1 to revert; I was wrong. Trent.
Re: svn commit: r1296868 - /subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql
Philip: I recalled last year's discussions about implied indexes, but between Trent's reported observations on IRC and a back-of-the-envelope test with sqlite3(1) I was led to believe that an implied index does not get created for in this case (due to the TEXT column, as my comment says). I'm more than happy to revert this on trunk (if it hasn't been already) assuming it's indeed superfluous. Trent -- have you looked into things from your end yet? Can you confirm or deny the hypothesis that the explicit INDEX was necessary in your environment? Trent Nelson wrote on Tue, Mar 06, 2012 at 12:11:57 -0800: > > > On 3/6/12 5:35 PM, "Philip Martin" wrote: > > >Philip Martin writes: > > > >> It may be TEXT but it is also PRIMARY KEY and according to the SQLite > >> docs: > >> > >> http://sqlite.org/lang_createtable.html > >> > >>INTEGER PRIMARY KEY columns aside, both UNIQUE and PRIMARY KEY > >>constraints are implemented by creating an index in the database (in > >>the same way as a "CREATE UNIQUE INDEX" statement would). Such an > >>index is used like any other index in the database to optimize > >>queries. As a result, there often no advantage (but significant > >>overhead) in creating an index on a set of columns that are already > >>collectively subject to a UNIQUE or PRIMARY KEY constraint. > > > >If I create a repository using 1.7 and look at the rep-cache.db I see: > > > >$ sqlite3 rep-cache.db "select * from sqlite_master" | grep index > >index|sqlite_autoindex_rep_cache_1|rep_cache|4| > > > >An index has been created automatically and so adding another index can > >only slow things down. > > Yeah this is interesting; you've provided a wealth of information contrary > to everything I said to Daniel yesterday. I did some SQLite index tuning > a month ago and I could have sworn having a TEXT field as primary key > inhibited index creation -- but, as you've demonstrated with your explain > plan and sqlite_master query, that's clearly not the case. > > To rewind things a little: I was manually repairing my asf mirror > yesterday that happened to sync earlier last week at the wrong time and > picked up a dodgy revision. I manually deleted the affected rows from > rep-cache.db and noticed that all my select queries seemed to be taking an > inordinate amount of time to complete (5s+ at least). I explain plan the > problematic query, saw no indexes, created one manually, and wallah, > problem solved, all queries returned instantly and explain plan showed > index usage. > > I've got a bunch of zfs snapshots of the repo I can have a play around > with tomorrow to see if I can replicate. > > Plausible theories off the top of my head: > > a) There's something different with my env and indexes were not being > created automatically. (I do remember having a lot of trouble getting the > asf repo to load from dumps and then complete from subsequent syncs.) > b) There's something else going on. > > (Plausible theories? Yes. Good theories? Debatable ;-) > > > Trent. >
Re: svn commit: r1296868 - /subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql
On 3/6/12 5:35 PM, "Philip Martin" wrote: >Philip Martin writes: > >> It may be TEXT but it is also PRIMARY KEY and according to the SQLite >> docs: >> >> http://sqlite.org/lang_createtable.html >> >>INTEGER PRIMARY KEY columns aside, both UNIQUE and PRIMARY KEY >>constraints are implemented by creating an index in the database (in >>the same way as a "CREATE UNIQUE INDEX" statement would). Such an >>index is used like any other index in the database to optimize >>queries. As a result, there often no advantage (but significant >>overhead) in creating an index on a set of columns that are already >>collectively subject to a UNIQUE or PRIMARY KEY constraint. > >If I create a repository using 1.7 and look at the rep-cache.db I see: > >$ sqlite3 rep-cache.db "select * from sqlite_master" | grep index >index|sqlite_autoindex_rep_cache_1|rep_cache|4| > >An index has been created automatically and so adding another index can >only slow things down. Yeah this is interesting; you've provided a wealth of information contrary to everything I said to Daniel yesterday. I did some SQLite index tuning a month ago and I could have sworn having a TEXT field as primary key inhibited index creation -- but, as you've demonstrated with your explain plan and sqlite_master query, that's clearly not the case. To rewind things a little: I was manually repairing my asf mirror yesterday that happened to sync earlier last week at the wrong time and picked up a dodgy revision. I manually deleted the affected rows from rep-cache.db and noticed that all my select queries seemed to be taking an inordinate amount of time to complete (5s+ at least). I explain plan the problematic query, saw no indexes, created one manually, and wallah, problem solved, all queries returned instantly and explain plan showed index usage. I've got a bunch of zfs snapshots of the repo I can have a play around with tomorrow to see if I can replicate. Plausible theories off the top of my head: a) There's something different with my env and indexes were not being created automatically. (I do remember having a lot of trouble getting the asf repo to load from dumps and then complete from subsequent syncs.) b) There's something else going on. (Plausible theories? Yes. Good theories? Debatable ;-) Trent.
Re: svn commit: r1296868 - /subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql
Philip Martin writes: > If I create a repository using 1.7 and look at the rep-cache.db I see: > > $ sqlite3 rep-cache.db "select * from sqlite_master" | grep index > index|sqlite_autoindex_rep_cache_1|rep_cache|4| I'm using SQLite 3.7.8 on Linux. -- uberSVN: Apache Subversion Made Easy http://www.uberSVN.com
Re: svn commit: r1296868 - /subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql
Philip Martin writes: > It may be TEXT but it is also PRIMARY KEY and according to the SQLite > docs: > > http://sqlite.org/lang_createtable.html > >INTEGER PRIMARY KEY columns aside, both UNIQUE and PRIMARY KEY >constraints are implemented by creating an index in the database (in >the same way as a "CREATE UNIQUE INDEX" statement would). Such an >index is used like any other index in the database to optimize >queries. As a result, there often no advantage (but significant >overhead) in creating an index on a set of columns that are already >collectively subject to a UNIQUE or PRIMARY KEY constraint. If I create a repository using 1.7 and look at the rep-cache.db I see: $ sqlite3 rep-cache.db "select * from sqlite_master" | grep index index|sqlite_autoindex_rep_cache_1|rep_cache|4| An index has been created automatically and so adding another index can only slow things down. -- uberSVN: Apache Subversion Made Easy http://www.uberSVN.com
Re: svn commit: r1296868 - /subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql
danie...@apache.org writes: > Author: danielsh > Date: Sun Mar 4 20:14:01 2012 > New Revision: 1296868 > > URL: http://svn.apache.org/viewvc?rev=1296868&view=rev > Log: > * subversion/libsvn_fs_fs/rep-cache-db.sql > (I_HASH): New index. > > Suggested by: Trent Nelson > > Modified: > subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql > > Modified: subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql > URL: > http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql?rev=1296868&r1=1296867&r2=1296868&view=diff > == > --- subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql (original) > +++ subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql Sun Mar 4 > 20:14:01 2012 > @@ -33,6 +33,11 @@ CREATE TABLE rep_cache ( >expanded_size INTEGER NOT NULL >); > > +/* There isn't an implicit index on a TEXT column, so create an explicit > one. */ > +CREATE INDEX I_HASH on REP_CACHE (hash); It may be TEXT but it is also PRIMARY KEY and according to the SQLite docs: http://sqlite.org/lang_createtable.html INTEGER PRIMARY KEY columns aside, both UNIQUE and PRIMARY KEY constraints are implemented by creating an index in the database (in the same way as a "CREATE UNIQUE INDEX" statement would). Such an index is used like any other index in the database to optimize queries. As a result, there often no advantage (but significant overhead) in creating an index on a set of columns that are already collectively subject to a UNIQUE or PRIMARY KEY constraint. It appears creating the index is unnecessary and may be actively bad. This reminds me that shortly before 1.7 I was asking about the exact opposite of this change: whether removing existing indices from PRIMARY KEYs, and making indices UNIQUE where possible, would be a good idea. -- Philip