Re: [sqlite] Slow SQL statements

Ludvig Strigeus Sun, 22 May 2005 23:00:54 -0700

Why doesn't SQL provide a utility function: maxv

Then you could (almost) write it like this:
SELECT maxv(sb, playerid) FROM batting WHERE playerid IN
       (SELECT player FROM fielding WHERE pos='3B' AND lgid='NL'));


The semantics of maxv(arg, value) would be that it finds the maximum
of arg, but instead of returning arg, it returns value for the record
with the maximum arg.

Maybe it could be possible to make it a multi argument function:
maxv(arg, value, value2)
which would return 2 columns, value and value2.

Or alternatively, make a function maxid(arg) that returns the rowid of
the maximum arg. (But this can be implemented as maxv(arg, rowid) )

How about adding such a function to SQLite? It seems useful.

/Ludvig


On 5/23/05, D. Richard Hipp <[EMAIL PROTECTED]> wrote:
> On Sun, 2005-05-22 at 22:02 -0400, D. Richard Hipp wrote:
> > On Sun, 2005-05-22 at 19:42 -0400, Al Danial wrote:
> > >
> > > sqlite baseball.db 'select playerid,sb from batting where sb = (select
> > > max(sb) from batting where playerid in (select playerid from fielding
> > > where pos = "3B" and lgid = "NL"));'
> > >
> >
> > The use of double-quotes instead of single-quotes is confusing
> > the optimizer somehow.  (I'm still looking why this is.)
> 
> I think I see what is going on now.  For clarity of presentation,
> allow me to reformat the query:
> 
>    SELECT playerid, sb FROM batting WHERE sb =
>      (SELECT max(sb) FROM batting WHERE playerid IN
>         (SELECT player FROM fielding WHERE pos="3B" AND lgid="NL"));
> 
> There are three nested queries.  The outer query uses a subquery in
> its WHERE clause.  The WHERE clause of the outer query is
> essentially "sb = <subquery>".  This subquery in turn as a subquery
> of its WHERE clause.
> 
> The problem arises when SQLite tries to resolve the name "NL".
> The rules of SQL require that a string in double-quotes should
> first try to resolve to a column name.  If that fails, then treat
> then string as a string literal.  This is what SQLite does:
> 
>    (1) First it tries to resolve the name "NL" in the context
>        of the inner-most query.  But NL is not a valid identifier
>        there so it fails.
>    (2) Next it tries to resolve the name NL in the middle query.
>        Once again, NL is not a valid identifier there so resolution
>        fails again.
>    (3) Then it tries to resolve the name NL in the outer query.
>        As before, this fails.
>    (4) Having not been able to resolve the name in any context,
>        SQLite uses "NL" as a string literal.
> 
> Because of steps 2 and 3 - because SQLite had to look into outer
> queries trying to resolve a name in the inner-most subquery, it
> assumes that the inner-most subquery was a correlated subquery
> and would hence need to be reevaluated once for each row in the
> middle and outer queries.  This results in O(N^3) performance,
> regardless of what indices you use.
> 
> By changing "NL" and "3B" to 'NL' and '3B', steps 1 through 3
> in the name resolution process are avoided.  SQLite immediately
> recognizes that the inner subquery is fixed, evaluates it exactly
> once and saves the result.  This results in O(N) runtime performance.
> Much, much faster.
> --
> D. Richard Hipp <[EMAIL PROTECTED]>
> 
>

Re: [sqlite] Slow SQL statements

Reply via email to