Re: [Monotone-devel] Monotone speedup by adding additional database indices?
On Thu, Oct 11, 2007 at 11:17:17AM +0200, Markus Schiltknecht wrote: > But I doubt very much that there are any gains. The index on > revision_certs(id) should be enough, as we have only few (in most cases > four) revision_certs per revision id. Measurement beats guessing :-). > I'm not an expert reading these plans, but for sure both variants use an > index scan and not a sequential scan. Thus I don't think it's worth > changing these revision_certs indices. My question was which index it was choosing to use; it is trying to satisfy a constraint like "A=1, B=2, C=3", and it can pick to either use an index on A then sequential scan for (B, C), or it can use an index on (B, C) and then sequential scan for A. (We do have an index on (name, value).) A plausible heuristic would be to use the index that lets you satisfy more parts of the constraint, but that heuristic would be wrong in this case, since in fact there are many certs with NAME=branch, VALUE=net.venge.monotone, but few certs with ID=da39... -- Nathaniel -- The best book on programming is still Strunk and White. ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Monotone speedup by adding additional database indices?
Hi, Ralf S. Engelschall wrote: Monotone showed just _warning_ messages but operated just fine. Strange, the following error message should be returned: mtn: misuse: /home/markus/.monotone.36.ind.db appears to be a monotone database, but this version of mtn: misuse: monotone does not recognize its schema. mtn: misuse: you probably need a newer version of monotone. In general, yes. The (id, name, value) index should be consulted by SQLite also in case only "id" has to be looked up for a query. As a quick test, I did add the new, combined index and dropped the old one (on id only). Doing that, my db has grown by about 2.4 percent: -rw-r--r-- 1 markus markus 617M 2007-10-11 10:33 .monotone.36.db -rw-r--r-- 1 markus markus 632M 2007-10-11 10:38 .monotone.36.ind.db But I doubt very much that there are any gains. The index on revision_certs(id) should be enough, as we have only few (in most cases four) revision_certs per revision id. And AFAICT sqlite optimizes properly, at least the explain result looks good: # sqlite3 .monotone.36.db "EXPLAIN SELECT id, name, value FROM revision_certs WHERE id = 'abc' AND name = 'def' AND value = 'ghi';" > e1.txt # sqlite3 .monotone.36.ind.db "EXPLAIN SELECT id, name, value FROM revision_certs WHERE id = 'abc' AND name = 'def' AND value = 'ghi';" > e2.txt The two execution plans differ only slightly: *** e1.txt 2007-10-11 10:53:15.492029559 +0200 --- e2.txt 2007-10-11 10:53:23.128009459 +0200 *** *** 1,26 0|Goto|0|22| 1|Integer|0|0| ! 2|OpenRead|1|28|keyinfo(5,BINARY,BINARY) ! 3|SetNumColumns|1|6| ! 4|String8|0|0|def 5|IsNull|-1|20| ! 6|String8|0|0|abc 7|IsNull|-2|20| 8|String8|0|0|ghi 9|IsNull|-3|20| ! 10|MakeRecord|3|0|b 11|MemStore|0|0| 12|MoveGe|1|20| 13|MemLoad|0|0| 14|IdxGE|1|20|+ ! 15|Column|1|1| ! 16|Column|1|0| 17|Column|1|2| 18|Callback|3|0| 19|Next|1|13| 20|Close|1|0| 21|Halt|0|0| 22|Transaction|0|0| ! 23|VerifyCookie|0|26| 24|Goto|0|1| 25|Noop|0|0| --- 1,26 0|Goto|0|22| 1|Integer|0|0| ! 2|OpenRead|1|39|keyinfo(3,BINARY,BINARY) ! 3|SetNumColumns|1|4| ! 4|String8|0|0|abc 5|IsNull|-1|20| ! 6|String8|0|0|def 7|IsNull|-2|20| 8|String8|0|0|ghi 9|IsNull|-3|20| ! 10|MakeRecord|3|0|bbb 11|MemStore|0|0| 12|MoveGe|1|20| 13|MemLoad|0|0| 14|IdxGE|1|20|+ ! 15|Column|1|0| ! 16|Column|1|1| 17|Column|1|2| 18|Callback|3|0| 19|Next|1|13| 20|Close|1|0| 21|Halt|0|0| 22|Transaction|0|0| ! 23|VerifyCookie|0|29| 24|Goto|0|1| 25|Noop|0|0| I'm not an expert reading these plans, but for sure both variants use an index scan and not a sequential scan. Thus I don't think it's worth changing these revision_certs indices. Regards Markus ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Monotone speedup by adding additional database indices?
On Wed, Oct 10, 2007, Nathaniel Smith wrote: > [...] > [Err... though... waittasec. Shouldn't the database upgrade machinery > be causing mtn to bomb out on your modified database ("unrecognized > schema version" or the like)? Did you disable that or something?] Monotone showed just _warning_ messages but operated just fine. > I do wonder where the benefit is coming from in this particular case. > The index on public_keys is almost certainly just irrelevant (though > it doesn't hurt and adds scalability), since you probably don't have > more than, say, 100 keys in there, and the whole table is almost > certainly cached. > > And on revision_certs we already have: > > CREATE INDEX revision_certs__id ON revision_certs (id); > CREATE INDEX revision_certs__name_value ON revision_certs (name, value); > > So if we add an index on (id, ) we should remove the index > on simple (id). In general, yes. The (id, name, value) index should be consulted by SQLite also in case only "id" has to be looked up for a query. > But also that index on simple (id) should be making > that search fast already, because once you've located the certs > for a particular rev, then you only have to do a sequential scan over > 4 of them (in most cases) to find any particular one. Perhaps > sqlite's optimizer has gotten *too* smart and is picking the wrong > index, doing a lookup by (name, value) and then sequential scan to > match the id? > [...] We can check this with a manual "EXPLAIN ", I think. Ralf S. Engelschall [EMAIL PROTECTED] www.engelschall.com ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Monotone speedup by adding additional database indices?
On Wed, Oct 10, 2007 at 08:07:26PM +0200, Ralf S. Engelschall wrote: > What do we think? Should we investigate further and especially add > additional indices like the above to the Monotone database schema? Or is > there consensus that this type of speed optimization is just the root of > furthcoming evil and at least at this time should be still ignored at > all... No way, 5x speedups for 1 line of code = ++good. And while once upon a time the database upgrade machinery was not savvy to indexes, that got fixed long ago (exactly to add some similar indexes, IIRC). We just don't revisit whether we have the right indexes very often :-). (This isn't to override any more specific problems people might raise in this thread, though I didn't see any obvious showstoppers so far.) [Err... though... waittasec. Shouldn't the database upgrade machinery be causing mtn to bomb out on your modified database ("unrecognized schema version" or the like)? Did you disable that or something?] I do wonder where the benefit is coming from in this particular case. The index on public_keys is almost certainly just irrelevant (though it doesn't hurt and adds scalability), since you probably don't have more than, say, 100 keys in there, and the whole table is almost certainly cached. And on revision_certs we already have: CREATE INDEX revision_certs__id ON revision_certs (id); CREATE INDEX revision_certs__name_value ON revision_certs (name, value); So if we add an index on (id, ) we should remove the index on simple (id). But also that index on simple (id) should be making that search fast already, because once you've located the certs for a particular rev, then you only have to do a sequential scan over 4 of them (in most cases) to find any particular one. Perhaps sqlite's optimizer has gotten *too* smart and is picking the wrong index, doing a lookup by (name, value) and then sequential scan to match the id? ...It may not be worth answering these questions as opposed to just adding the stupid index, though. -- Nathaniel -- Details are all that matters; God dwells there, and you never get to see Him if you don't struggle to get them right. -- Stephen Jay Gould ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Monotone speedup by adding additional database indices?
On Wed, Oct 10, 2007, Bruce Stephens wrote: > Chad Walstrom <[EMAIL PROTECTED]> writes: > > [...] > > Would there be a way to tell sqlite to ignore indices for given > > operations, such as pulls? > > That strikes me as a low-level question that should be ignored (at > least unless it causes some measurable problem). > > One would hope that SQLite will only (or mostly, anyway) use indexes > when they'll be beneficial. Index updates strike me as more likely to > be a problem, but I doubt it makes sense to suggest sometimes not > updating indexes. SQLite has a sufficient optimizer and uses indices only where it really makes sense. See http://www.sqlite.org/optoverview.html for some details. Ralf S. Engelschall [EMAIL PROTECTED] www.engelschall.com ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Monotone speedup by adding additional database indices?
On Wed, Oct 10, 2007, Justin Patrin wrote: > On 10/10/07, Ben Walton <[EMAIL PROTECTED]> wrote: > > Indexes speed up read operations but slow down writes. Which do we do > > more of? I'd optimize for the case that would benefit most. I > > suspect we do a fair amount of both. In this case, maybe adding > > indexes for mostly-read tables would be the way to go. > > > > The size increase on the DB should also be investigated here. How much > did your DB increase in size when the indexes were added? With the database where all n.v.m.* revisions stay: | $ sqlite mtn.db | -- Loading resources from /u/rse/.sqliterc | SQLite version 3.5.1 | Enter ".help" for instructions | sqlite> VACUUM; | $ ls -l | total 141344 | -rw-r--r-- 1 rse rse 144629760 Oct 11 07:52 mtn.db | $ sqlite mtn.db | -- Loading resources from /u/rse/.sqliterc | SQLite version 3.5.1 | Enter ".help" for instructions | sqlite> CREATE INDEX revision_certs__id_name_value ON revision_certs (id, name, value); | sqlite> CREATE INDEX public_keys__id ON public_keys (id); | sqlite> VACUUM; | $ ls -l | total 147024 | -rw-r--r-- 1 rse rse 150446080 Oct 11 07:53 mtn.db | $ bc | bc 1.06 | Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc. | This is free software with ABSOLUTELY NO WARRANTY. | For details type `warranty'. | scale = 2 | 150446080/144629760 | 1.04 So, the indices increased the database by just 4% and this IMHO is more or less negligible... Ralf S. Engelschall [EMAIL PROTECTED] www.engelschall.com ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Monotone speedup by adding additional database indices?
Chad Walstrom <[EMAIL PROTECTED]> writes: [...] > Would there be a way to tell sqlite to ignore indices for given > operations, such as pulls? That strikes me as a low-level question that should be ignored (at least unless it causes some measurable problem). One would hope that SQLite will only (or mostly, anyway) use indexes when they'll be beneficial. Index updates strike me as more likely to be a problem, but I doubt it makes sense to suggest sometimes not updating indexes. ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Monotone speedup by adding additional database indices?
Ralph wrote: > CREATE INDEX revision_certs__id_name_value ON >revision_certs (id, name, value); > CREATE INDEX public_keys__id ON >public_keys (id); > > This dropped down the total execution time of the mentioned "mtn update" > command by over 80%! Ben wrote: > Indexes speed up read operations but slow down writes. I can't imagine a lot of writes happening to public_keys. ;-) revision_certs would get four or more inserts per commit, and obviously sync operations would add a bunch. Commits have generally been pretty fast for me. Would there be a way to tell sqlite to ignore indices for given operations, such as pulls? Chad ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Monotone speedup by adding additional database indices?
On 10/10/07, Ben Walton <[EMAIL PROTECTED]> wrote: > Indexes speed up read operations but slow down writes. Which do we do > more of? I'd optimize for the case that would benefit most. I > suspect we do a fair amount of both. In this case, maybe adding > indexes for mostly-read tables would be the way to go. > The size increase on the DB should also be investigated here. How much did your DB increase in size when the indexes were added? > -Ben > > On 10/10/07, Ralf S. Engelschall <[EMAIL PROTECTED]> wrote: > > Some Monotone operations really operate slower than what one would > > expect in the first spot. Hence, I've today looked at the run-time of > > a simple "mtn update" in a workspace which *is already* at h:n.v.m. > > This "no-operation" command internally performs a dozend times the > > following SQL queries: > > > > SELECT id, name, value, keypair, signature > > FROM revision_certs WHERE id = ? AND name = ? AND value = ? > > SELECT keydata FROM public_keys WHERE id = ? > > SELECT id FROM public_keys WHERE id = ? > > > > The problem is that "revision_certs" and "public_keys" have not the > > proper indices for those queries and hence full-table scans seem to > > be performed. I did a quick test and added the following to indices > > manually: > > > > CREATE INDEX revision_certs__id_name_value ON > >revision_certs (id, name, value); > > CREATE INDEX public_keys__id ON > >public_keys (id); > > > > This dropped down the total execution time of the mentioned "mtn update" > > command by over 80%! A "time mtn update" showed 0.450s on average before > > and 0.080s on average afterwards. And this was really not any type of > > in-depth analysis of the situation. I just created two obvious indices > > for the most prominent queries which "mtn --debug update" showed me. > > > > What do we think? Should we investigate further and especially add > > additional indices like the above to the Monotone database schema? Or is > > there consensus that this type of speed optimization is just the root of > > furthcoming evil and at least at this time should be still ignored at > > all... > >Ralf S. Engelschall > >[EMAIL PROTECTED] > >www.engelschall.com > > > > > > > > ___ > > Monotone-devel mailing list > > Monotone-devel@nongnu.org > > http://lists.nongnu.org/mailman/listinfo/monotone-devel > > > > > -- > --- > Ben Walton <[EMAIL PROTECTED]> > > When one person suffers from a delusion, it is called insanity. When > many people suffer from a delusion it is called Religion. > Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance > > --- > > > ___ > Monotone-devel mailing list > Monotone-devel@nongnu.org > http://lists.nongnu.org/mailman/listinfo/monotone-devel > -- Justin Patrin ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Monotone speedup by adding additional database indices?
Indexes speed up read operations but slow down writes. Which do we do more of? I'd optimize for the case that would benefit most. I suspect we do a fair amount of both. In this case, maybe adding indexes for mostly-read tables would be the way to go. -Ben On 10/10/07, Ralf S. Engelschall <[EMAIL PROTECTED]> wrote: > Some Monotone operations really operate slower than what one would > expect in the first spot. Hence, I've today looked at the run-time of > a simple "mtn update" in a workspace which *is already* at h:n.v.m. > This "no-operation" command internally performs a dozend times the > following SQL queries: > > SELECT id, name, value, keypair, signature > FROM revision_certs WHERE id = ? AND name = ? AND value = ? > SELECT keydata FROM public_keys WHERE id = ? > SELECT id FROM public_keys WHERE id = ? > > The problem is that "revision_certs" and "public_keys" have not the > proper indices for those queries and hence full-table scans seem to > be performed. I did a quick test and added the following to indices > manually: > > CREATE INDEX revision_certs__id_name_value ON >revision_certs (id, name, value); > CREATE INDEX public_keys__id ON >public_keys (id); > > This dropped down the total execution time of the mentioned "mtn update" > command by over 80%! A "time mtn update" showed 0.450s on average before > and 0.080s on average afterwards. And this was really not any type of > in-depth analysis of the situation. I just created two obvious indices > for the most prominent queries which "mtn --debug update" showed me. > > What do we think? Should we investigate further and especially add > additional indices like the above to the Monotone database schema? Or is > there consensus that this type of speed optimization is just the root of > furthcoming evil and at least at this time should be still ignored at > all... >Ralf S. Engelschall >[EMAIL PROTECTED] >www.engelschall.com > > > > ___ > Monotone-devel mailing list > Monotone-devel@nongnu.org > http://lists.nongnu.org/mailman/listinfo/monotone-devel > -- --- Ben Walton <[EMAIL PROTECTED]> When one person suffers from a delusion, it is called insanity. When many people suffer from a delusion it is called Religion. Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance --- ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
[Monotone-devel] Monotone speedup by adding additional database indices?
Some Monotone operations really operate slower than what one would expect in the first spot. Hence, I've today looked at the run-time of a simple "mtn update" in a workspace which *is already* at h:n.v.m. This "no-operation" command internally performs a dozend times the following SQL queries: SELECT id, name, value, keypair, signature FROM revision_certs WHERE id = ? AND name = ? AND value = ? SELECT keydata FROM public_keys WHERE id = ? SELECT id FROM public_keys WHERE id = ? The problem is that "revision_certs" and "public_keys" have not the proper indices for those queries and hence full-table scans seem to be performed. I did a quick test and added the following to indices manually: CREATE INDEX revision_certs__id_name_value ON revision_certs (id, name, value); CREATE INDEX public_keys__id ON public_keys (id); This dropped down the total execution time of the mentioned "mtn update" command by over 80%! A "time mtn update" showed 0.450s on average before and 0.080s on average afterwards. And this was really not any type of in-depth analysis of the situation. I just created two obvious indices for the most prominent queries which "mtn --debug update" showed me. What do we think? Should we investigate further and especially add additional indices like the above to the Monotone database schema? Or is there consensus that this type of speed optimization is just the root of furthcoming evil and at least at this time should be still ignored at all... Ralf S. Engelschall [EMAIL PROTECTED] www.engelschall.com ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel