Chris Kantarjiev wrote:

I'm looking at the stats on one of our servers and trying to understand
why Handler_read_rnd_next is so high. It's  256.5M right now, which is
about 10x the total number of reported queries.

The machine is being used, almost entirely, for queries of the form:

select * from crumb where link_id is null and latitude > 39 and longitude > -98 limit 10000;

link_id is indexed. There are about 8 million rows in the table,
and most of them have link_id = null right now. latitude and longitude
are not indexed - but my understanding is that mySQL will only
use one index, and link_id is the interesting one for us.

Are the latitude and longitude qualifiers the cause of the table scans?

Table scans happen when there is no *useful* index. To be useful, an index must both exist and return a relatively small number of rows, where relatively small means less than approximately 30% of the rows in the table. More than that, and the table scan is usually faster than using the index.

The optimizer considers the restrictions in your WHERE clause, looks for possible indexes, and eliminates non-useful indexes from consideration. If any indexes are left, it chooses the best one (the one which returns the fewest rows).

Devananda wrote:
> Since most of the rows have link_id = NULL, using that as the first
> condition in the WHERE clause does not help MySQL reduce the number of
> rows it needs to check (therefor MySQL prefers to do a full table scan).

Sort of. The order of requirements in the WHERE clause is irrelevant. The problem is that most rows have link_id set to NULL, so the index on link_id is not a useful index for this query. An index on latitude or longitude might be useful, if one of those conditions sufficiently restricted the number of rows to consider.

> My suggestion is create a composite index on (latitude, longitude,
> link_id) and change your queries to
>
> SELECT * FROM crumb WHERE latitude > 39 AND longitude > -98 AND link_id
> IS NULL LIMIT 10000;
>
> I think this will use the composite index, thus avoiding a full table scan.

No composite index will be fully used here. MySQL uses composite indexes from left to right, *stopping on the first key part used in a range* rather than to match a constant. "WHERE latitude > 39" is a range, so the composite index on (latitude, longitude, link_id) will be no better than a single column index on latitude.

> See http://dev.mysql.com/doc/mysql/en/mysql-indexes.html for in depth
> info on how MySQL uses indexes.
>
> As a side note, it's a good idea to run EXPLAIN on your queries to see
> what index they are using, what cardinality the index has, and pay close
> attention to the "Extra" field in the results of EXPLAIN. That field
> will tell you if MySQL thinks it has to do a full table scan, create a
> temporary table, etc.

Absolutely.  Run EXPLAIN.

> Regards,
> Devananda vdv

Bruce Dembecki wrote:
> Yes, they almost certainly are the cause of the problem.

Adding restrictions to the WHERE clause joined with AND cannot prevent the use of an index. Indeed, extra restrictions with matching indexes increase the chance an index can be used.

> A query may
> only use one index, but the table can have several, and the MySQL
> Optimizer will choose the most appropriate index for the query it is
> running. In a case such as this where most entries have a null  link_id
> you are hurting from having no index covering the other  columns. If
> this was all the table was going to do and all your  queries were of
> this form, you could make an index on link_ID,  latitude, longitude...
> as long as most everything is null it's the  equivalent of using an
> index on latitude, longitude... but as things  change (I assume you
> don't expect them to stay null) your one index  will accommodate that...

No composite index with longitude last will ever look at the longitude part of the index if the latitude part is used to satisfy a range restriction, as is the case here. Separate indexes on latitude and longitude are actually better in this case than a composite of the two, as that would roughly double the chance of an index being useful.

> However... when you say what's important to you is the link_id, I
> assume you mean that's what is important in the result... not what is
> important in the search itself (as it clearly isn't now if they are
> mostly null). The thing is to remember, while a query may use only  one
> index, a MySQL table can have many (don't go nuts here), so by  adding
> an index for latitude, longitude you are buying yourself a  bunch of
> performance.

It is entirely possible that this (type of) query will not benefit from any index, in which case adding indexes will only slow inserts without any speedup in selects.

> Beware of course... too many indexes or too complicated and they can  be
> a performance issue in their own right. The trick is to put in the
> right indexes for your data and for your queries, don't just add
> indexes for indexing sake.

Exactly.

> Best Regards, Bruce

Michael

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to