On 12/09/2013 7:12 PM, Simon Slavin wrote:
On 12 Sep 2013, at 11:35pm, Roger Binns <rog...@rogerbinns.com> wrote:
On 12/09/13 05:03, Gabor Grothendieck wrote:
Perhaps indexing the expression in question would be an alternative
that would keep the performance info separate from the select.
I'd rather just do 'ANALYZE query' and have SQLite go off and do whatever
it deems necessary to gather to make the query perform well in the future.
I think I prefer Roger's idea to that of messing with a vital expression.
Rather than modify a command you need to work, introduce a new command which,
even if it entirely fails, doesn't interfere with the operation of the SELECT.
If the ANALYZE fails the SELECT (or UPDATE, or whatever) will still work,
albeit more slowly. And this causes fewer problems for users who have to write
code which works with many SQL engines.
But because ANALYZE already means something in SQLite, perhaps it might be
better to introduce a new command
Would it really be so confusing to extend ANALYZE that way? It reads so
naturally, I bet the most likely source of list traffic would be people
who tried to use it in older versions of sqlite3 and were surprised it's
not there...
STORE LIKELIHOOD test, probability
or maybe
REMEMBER LIKELIHOOD OF test AS probability
which will store the fact that such-and-such a test has a certain probability as a new
row in a table somewhere. Could be a new row in sqlite_stat3 (or sqlite_stat4), or could
be in another sqlite_ table. Omitting the second parameter tells SQLite to do the
evaluation itself (like ANALYZE does) and store the result. Curious users could dump the
table just like people sometimes do "SELECT * FROM sqlite_stat3".
I think it's pretty important to examine predicates in the context of
specific queries (and to allow the same predicate to appear any number
of such queries). The predicate "c.name like '%bach%'" is going to
behave quite differently in these three queries, for example:
-- Vanilla predicate: Bach isn't a very common name
select c.name from composers c where c.name like '%bach%';
-- Join cardinality: Bach was a *very* prolific composer whose output
likely dwarfs the (surviving) output of his contemporaries
select p.title, c.name, p.year from composers c join pieces p on p.c_id
= c.id where c.name like '%bach%' and p.year between 1700 and 1750;
-- Correlated columns: Very few Brandenburg anythings were written by
composers other than J.S. Bach
select c.name, p.title from composers c join pieces p on p.c_id = c.id
where c.name like '%bach%' and p.title like '%brandenburg%';
(110% agree that any new information that changes query plans needs to
be in a stats table somewhere. It's a huge aid to performance debugging
when you can repro a problematic query plan using only the schema,
query, and a dump of the various stats tables.)
Ryan
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users