On 2014/11/09 14:11, Tristan Van Berkom wrote:
A good SQL rule of thumb: if you can think of a way, so can the DBMS. "... no opportunity to make a good guess" is not true. In some sense, SQLite has had 10 years to make a good guess, and often does. A nested select need not be materialized as a "table", opaque or otherwise. It can be converted, or "flattened" under some circumstances. SQLite's query planner isn't the most sophisticated; no one suggests otherwise. It does not follow that every JOIN will outperform every EXISTS or vice versa.
Indeed this is a large misconception on my part, on the queries which
I *have* profiled, it did turn out that JOINs were more effective than
solutions which involve nesting select statements.

This seems like the appropriate place to interject with a Statistics / 
Sample-size comment, but I think the point is obvious.

Anyway, I appreciate the input and will try to accept that I should not
be in control of how the query is run - I was under the impression that
SQL engines can perform better when given more context about how the
query should run (i.e. being more explicit with JOINs), but I do agree
that, at least ideally, the planner should be able to make a better
guess as to how to plot a query with a more relaxed/vague statement,
than with a more explicit one.

Woah... this is not at all what James tried to say and it might just be that your choice of words is unfortunate and you did mean it correctly, but just to be sure, allow me to elaborate somewhat:

The query should in no way be "Vague" or "Relaxed", it should be precise, succint, exact and fully correct making the intended result impossible to misinterpret. What it should not be is convoluted with additional hot air that attempts to better explain to the QP how to do the query... that is where we trust the QP to do it's job as long as the question is not Vague and indeed very precise and concise.

Sometimes the Query planner is faced with situations where an algebraically correct result can be achieved via different ways (picking/using an Index (if any) is a good case-in-point) and sometimes the best one is not obvious from the query or table size. In these cases, providing hints to it - such as running Analyze (Stat3 or 4 tables) and providing direct hints (such as 'Likely' and 'Likelihood') - is a good way for a programmer (or "Query Author" I should say) to /help/ the QP decide on what is best - but these should be used sparingly and only where the improvement is substantial and the optimisation not premature.

Another consideration is that the QP gets updated almost every other month... you have no way of knowing whether your convoluted query (which may shave off a few milliseconds now) will still be the fastest next month when the QP is a bit smarter. The long-term best is to formulate the most precise and concise query possible and let the Engine do the work.


Cheers,
Ryan.


(PS: I don't mean to second-guess James, and the only reason I feel any comfort saying what he meant is that we just had an interesting debate about this elsewhere and I have come to appreciate the view above somewhat more).




_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to