* Brannon King <[EMAIL PROTECTED]> [2006-05-23 01:40]:
> It seems that I yet need help with another query. This one is just too slow.
> I've included the "explain" and the table schema. I've been using the
> prepare/step model directly. What should I change on my indexing to make it
> faster?
> 
> The schema:
> CREATE TEMPORARY TABLE IF NOT EXISTS bounds (bi INTEGER PRIMARY KEY
> AUTOINCREMENT DEFAULT NULL, bqi INTEGER, bri INTEGER, bqis INTEGER, bris
> INTEGER);
> CREATE UNIQUE INDEX IF NOT EXISTS qisris ON bounds (bqis, bris);
> CREATE UNIQUE INDEX IF NOT EXISTS qiri ON bounds (bqi, bri);
> insert into bounds values(NULL,1,1,5880,5880);
> CREATE TABLE results_1 (qi INTEGER, ri INTEGER, drl INTEGER, score INTEGER,
> qis INTEGER, ris INTEGER);
> CREATE UNIQUE INDEX loc_1 ON results_1 (qi,ri); 
> 
> The queries (both of these run slow but I care about the second):
> "select count(*) from results_1 where qi = 5604 OR ri = 5468;"
> returns 102
> 
> So you can see the following query should only be doing a max over a 102
> pieces; that's not very many. 
> 
> "explain 
> select qi,ri,drl,max(score) as scr from results_1, bounds where (qi = 5604
> OR ri = 5468) AND (qi >= bqis
> AND qi <= 5604) AND (ri >= bris AND ri <= 5468) AND bi = 1 and qis = bqis
> AND ris = bris;"

You know, it would really, *REALLY* help someone else to figure
out what you’re doing it if you use table aliases everywhere so
one doesn’t need to crossreference the schema constantly when
trying to read the query in order to know what’s coming from
where. Together with the correction of your use of the aggregate
function as pointed out elsewhere, your query is as follows:

    SELECT
        r.qi,
        r.ri,
        r.drl,
        r.score
    FROM
        results_1 r
        INNER JOIN bounds b ON
            r.qis = b.bqis AND r.ris = b.bris
    WHERE
        (r.qi = 5604 OR r.ri = 5468)
        AND (r.qi >= b.bqis AND r.qi <= 5604)
        AND (r.ri >= b.bris AND r.ri <= 5468)
        AND b.bi = 1
    GROUP BY
        r.score DESC
    LIMIT 1

Now I realise after the fact that all your columns from `bounds`
start with `b`, but I had to unravel the query in order to pick
up on that.

It also makes obvious that your query returns no columns from
the `bounds` table, it just uses them to constrain the result
set. In that case, you might get better performance by checking a
correlated subquery with `EXISTS`.

Let’s see. Most of those parentheses in the `WHERE` clause are
unnecessary:

        (r.qi = 5604 OR r.ri = 5468)
        AND r.qi <= 5604
        AND r.ri <= 5468
        AND r.qi >= b.bqis
        AND r.ri >= b.bris
        AND b.bi = 1

Now we can group together the conditions which do not involve the
`bounds` table:

        (r.qi = 5604 AND r.ri <= 5468) OR (r.ri = 5468 AND r.qi <= 5604)
        AND r.qi >= b.bqis
        AND r.ri >= b.bris
        AND b.bi = 1

The rest can then be factored into a subquery:

    SELECT
        r.qi,
        r.ri,
        r.drl,
        r.score
    FROM
        results_1 r
    WHERE
        (r.qi = 5604 AND r.ri <= 5468) OR (r.ri = 5468 AND r.qi <= 5604)
        AND EXISTS (
            SELECT
                NULL
            FROM
                bounds b
            WHERE
                b.bi = 1
                AND b.bqis = r.qis
                AND b.bris = r.ris
                AND b.bqis <= r.qi
                AND b.bris <= r.ri
        )
    ORDER BY
        r.score DESC
    LIMIT 1

I can’t interpret the `EXPLAIN` output well enough to tell
whether this is likely to be faster, I’m afraid. (Actually I
don’t even understand how to tell whether/which indices are being
used; I tried creating a few and they didn’t seem to make a
discernible difference.)

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>

Reply via email to