On 12/09/2013 7:12 PM, Simon Slavin wrote:
On 12 Sep 2013, at 11:35pm, Roger Binns <rog...@rogerbinns.com> wrote:

On 12/09/13 05:03, Gabor Grothendieck wrote:
Perhaps indexing the expression in question would be an alternative
that would keep the performance info separate from the select.
I'd rather just do 'ANALYZE query' and have SQLite go off and do whatever
it deems necessary to gather to make the query perform well in the future.
I think I prefer Roger's idea to that of messing with a vital expression.  
Rather than modify a command you need to work, introduce a new command which, 
even if it entirely fails, doesn't interfere with the operation of the SELECT.  
If the ANALYZE fails the SELECT (or UPDATE, or whatever) will still work, 
albeit more slowly.  And this causes fewer problems for users who have to write 
code which works with many SQL engines.

But because ANALYZE already means something in SQLite, perhaps it might be 
better to introduce a new command
Would it really be so confusing to extend ANALYZE that way? It reads so naturally, I bet the most likely source of list traffic would be people who tried to use it in older versions of sqlite3 and were surprised it's not there...

STORE LIKELIHOOD test, probability

or maybe

REMEMBER LIKELIHOOD OF test AS probability

which will store the fact that such-and-such a test has a certain probability as a new 
row in a table somewhere.  Could be a new row in sqlite_stat3 (or sqlite_stat4), or could 
be in another sqlite_ table.  Omitting the second parameter tells SQLite to do the 
evaluation itself (like ANALYZE does) and store the result.  Curious users could dump the 
table just like people sometimes do "SELECT * FROM sqlite_stat3".
I think it's pretty important to examine predicates in the context of specific queries (and to allow the same predicate to appear any number of such queries). The predicate "c.name like '%bach%'" is going to behave quite differently in these three queries, for example:

-- Vanilla predicate: Bach isn't a very common name
select c.name from composers c where c.name like '%bach%';

-- Join cardinality: Bach was a *very* prolific composer whose output likely dwarfs the (surviving) output of his contemporaries select p.title, c.name, p.year from composers c join pieces p on p.c_id = c.id where c.name like '%bach%' and p.year between 1700 and 1750;

-- Correlated columns: Very few Brandenburg anythings were written by composers other than J.S. Bach select c.name, p.title from composers c join pieces p on p.c_id = c.id where c.name like '%bach%' and p.title like '%brandenburg%';

(110% agree that any new information that changes query plans needs to be in a stats table somewhere. It's a huge aid to performance debugging when you can repro a problematic query plan using only the schema, query, and a dump of the various stats tables.)

Ryan

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to