Re: Bad optimization with two conditions?

Fred van Engen Sat, 24 Mar 2001 15:49:22 -0800
Hi,

On Sat, Mar 24, 2001 at 02:04:57PM +0100, Federico Giannici wrote:
> Fred van Engen wrote:
> > On Fri, Mar 23, 2001 at 07:46:48PM +0100, Federico Giannici wrote:
> > > We have a table where we register the data of all the modem connections
> > > of the users of a small ISP.
> > >
> > > The table contains 2 million records and has the following structure:
> > >
> > 
> > ...
> > 
> > What indexes does the table have? Looking at your description below
> > I would guess at least an index on start and an index on login.
> > 
> > SHOW CREATE TABLE tblname;
> 
> CREATE TABLE `accountings` (
>   `start` int(10) unsigned NOT NULL default '0',
>   `duration` int(10) unsigned NOT NULL default '0',
>   `login` char(8) NOT NULL default '',
>   `nas` int(10) unsigned NOT NULL default '0',
>   `port` int(10) unsigned NOT NULL default '0',
>   `ip` int(10) unsigned NOT NULL default '0',
>   `bytesrx` int(10) unsigned NOT NULL default '0',
>   `bytestx` int(10) unsigned NOT NULL default '0',
>   `speedtx` int(10) unsigned NOT NULL default '0',
>   `speedrx` int(10) unsigned NOT NULL default '0',
>   `telto` char(15) NOT NULL default '',
>   `telfrom` char(15) NOT NULL default '',
>   `multilinkid` int(10) unsigned NOT NULL default '0',
>   `sessionid` char(8) binary NOT NULL default '',
>   KEY `login`(`login`),
>   PRIMARY KEY (`start`,`nas`,`port`)
> ) TYPE=MyISAM PACK_KEYS=1
> 
> 
> > Also, which version of MySQL do you use?
> 
> MySQL 3.23.32 on OpenBSD 2.7 i386.
> 

Maybe someone can comment on optimization errors in this version, but
it should be one of the best versions there is (I use 3.23.33 with a
patch).


> > > According to us there is a bad optimization with queries like the
> > > following one:
> > >
> > > SELECT * FROM accountings WHERE start >= UNIX_TIMESTAMP('2001-03-23')
> > > AND login = 'anyname'
> > >
> > 
> > What does EXPLAIN say? I.e.
> 
> +-------------+------+---------------+-------+---------+-------+------+------------+
> | table       | type | possible_keys | key   | key_len | ref   | rows |
> Extra      |
> +-------------+------+---------------+-------+---------+-------+------+------------+
> | accountings | ref  | login,PRIMARY | login |       8 | const | 2428 |
> where used |
> +-------------+------+---------------+-------+---------+-------+------+------------+
> 

This would seem to be the logical index to use if there are more than
2428 rows with start >= UNIX_TIMESTAMP('2001-03-23').

Maybe the following query will show that:

EXPLAIN SELECT * FROM accountings USE INDEX (PRIMARY)
 WHERE start >= UNIX_TIMESTAMP('2001-03-23') AND login = 'anyname'

Of course, we know that the logins are not stored close together in
the data file, so it takes more time to scan them than scanning from
a certain start time. MySQL would never take this into account I
think. Not much you can do then, besides 'USE INDEX'.


> Does the previous EXPLAIN shows the reason?
> 
> Perhaps because possible_keys lists "login" before "PRIMARY"?
> 

The order shouldn't matter unless both EXPLAINs are equal.


> > > I don't know why the second condition is much slower than the first one,
> > > even if both the columns are indexed. Perhaps because "login" is a MUL
> > > index?
> > >
> > 
> > Is there anything special about this login? Is it used more often
> > than average?
> 
> This behaviour occurs with all the logins that have a good presence in
> the table. The result is almost immediate with the logins that have
> almost no records, but they are not the average...
> 

I was asking because I once had a similar situation where I tested
with a signup account. This had many more entries than average and
made the case just more noticable (until I explicitly set specified
the index to use).


> > > Is it possible to optimize the current behavior?
> > >
> > 
> > Yes, you can use this:
> > 
> > SELECT * FROM accountings USE INDEX (myindex)
> >  WHERE start >= UNIX_TIMESTAMP('2001-03-23') AND login = 'anyname'
> > 
> > Replace myindex with the index on start.
> 
> Wow, it worked!
> The query time sensibly decreased specifing "USE INDEX(PRIMARY)".
> 
> I wasn't able to find the documentation about the "USE INDEX()"
> parameter of the SELECT command in the on-line manual...
> 

Yes, I once knew I had seen it, but only managed to find it through
the search function of the website. Now I remember, it's in the
explanation of JOINs.


Regards,

Fred.

-- 
Fred van Engen                              XO Communications B.V.
email: [EMAIL PROTECTED]             Televisieweg 2
tel: +31 36 5462400                         1322 AC  Almere
fax: +31 36 5462424                         The Netherlands

---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
Re: Bad optimization with two conditions?

Reply via email to