Dan,

At 03:02 PM 5/11/01 -0500, you wrote:
>In the last episode (May 11), Heikki Tuuri said:
>> Dan Nelson wrote:
>> >  No ANALYZE TABLE - correct key and subkey cardinality counts work
>> >  wonders for complex queries (probably easy to add).
>> 
>> InnoDB does the estimation of cardinalities required by the MySQL
>> optimizer. But the counts are not exact, only estimates based on a
>> few dives into the index tree. If analyze table would read the whole
>> table, then we could get exact figures.
>
>The myisam version of ANALYZE does scan the entire table; I have a
>table with a 4-field PRIMARY and after an ANALYZE TABLE, I get the
>following:
>
>mysql> show keys from ipsum;
>+-------+------------+----------+-----+-------------+-------------+
>| Table | Non_unique | Key_name | Seq | Column_name | Cardinality |
>+-------+------------+----------+-----+-------------+-------------+
>| ipsum |          0 | PRIMARY  |   1 | date        |        1979 |
>| ipsum |          0 | PRIMARY  |   2 | ourip       |       65321 |
>| ipsum |          0 | PRIMARY  |   3 | port        |      130643 |
>| ipsum |          0 | PRIMARY  |   4 | proto       |      130643 |
>+-------+------------+----------+-----+-------------+-------------+
>4 rows in set (0.00 sec)
>
>(there are 130643 records total).  If I do the same on an InnoDB table,
>I get:
>
>mysql> show keys from ipsum1;
>+--------+------------+----------+-----+-------------+-------------+
>| Table  | Non_unique | Key_name | Seq | Column_name | Cardinality |
>+--------+------------+----------+-----+-------------+-------------+
>| ipsum1 |          0 | PRIMARY  |   1 | date        |        NULL |
>| ipsum1 |          0 | PRIMARY  |   2 | ourip       |        NULL |
>| ipsum1 |          0 | PRIMARY  |   3 | port        |        NULL |
>| ipsum1 |          0 | PRIMARY  |   4 | proto       |      223145 |
>+--------+------------+----------+-----+-------------+-------------+
>4 rows in set (0.01 sec)
>
>So stats on key parts aren't calculated, and the stats on the index as
>a whole are off by 2x.  I don't know if this affects how well the
>upper-level SQL optimizer selects indexes.

I see, I should calculate the estimates also for the key prefixes of
a multicolumn index. I will look at this in ha_innobase.cc. I guess
the 2 x error in the key count cardinality does not affect the optimizer
much. Inherent errors in optimization are usually bigger.

>Interestingly enough, a "select count(distinct date)" takes 5 seconds
>with MyISAM but only 2 seconds with InnoDB, so exact key counts may not
>matter if the DB engine is twice as fast :)
>
>> >Suggested improvements would be the addition of COALESCE TABLESPACE
>> 
>> Do you mean reorganization and compaction of a tablespace? The way to
>> do it is to dump and reload all tables in the tablespace.
>
>During my tests I've found that a 100MB tablespace file gets filled up
>even if I'm doing work on a 30MB table, and I can't add another table
>without dropping and reloading the first.  It'd be nice if there was a
>way to do this wihtout taking tables offline.  The current workaround
>is just to allocate much more tablespace than you really need.

I see. The 30 MB table can grow to take up to 120 MB if the
fillfactor of index pages (also data pages are clustered index pages)
drops to 25 %. Also, if InnoDB has allocated a 64-page chunk to a table
the chunk will be freed to other tables' use only when it is completely
empty.

Possible improvements:
- We could change the page merge threshold from 25 % to 40 %. Then
the fillfactor of the tree would stay higher. A drawback is increased
CPU usage by merges in some circumstances.
- We could change file space allocation for secondary indexes so that
they would always allocate only individual pages, not 64-page chunks.
The drawback is that then inter-table fragmentation can take place: dropping
a table will not free complete 64-page chunks to other tables.
- We could write a background process to reorganize and compact tables.
A weakness is that if the database load takes 100 % of disk bandwidth,
the background process will not run, and the user may be surprised
by the size increase of tables.

I think putting the threshold to 40 % is the easiest improvement here.

It also occurred to me that ALTER TABLE can be used to reorg a table:
ALTER it to MyISAM and then back to InnoDB.

Regards,

Heikki
http://www.innodb.fi

>-- 
>       Dan Nelson
>       [EMAIL PROTECTED]
>
>---------------------------------------------------------------------
>Before posting, please check:
>   http://www.mysql.com/manual.php   (the manual)
>   http://lists.mysql.com/           (the list archive)
>
>To request this thread, e-mail <[EMAIL PROTECTED]>
>To unsubscribe, e-mail <[EMAIL PROTECTED]>
>Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
>
>


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to