Re: Indexing not working

SGreen Fri, 24 Jun 2005 07:20:30 -0700

Sajith A <[EMAIL PROTECTED]> wrote on 06/24/2005 02:07:22 AM:

> Thank you Mathias  once again


> > I suggest you to test after optimize table ... and analyze table ...
> > if you stay with the problem and you can give me a set of data, 
> i'll try to help
> > more.

> Initially while posting... i tried to reduce one table to avoid
> complexity from the actual query. I'm listing the tables below. I'm
> also attaching some sample data.
> 
-------------------------------------------------------------------------------------------------------------------------------------
> CREATE TABLE `qb_test_result` (
> `id` int(11) NOT NULL auto_increment,
> `testId` smallint(4) NOT NULL default '0',
> `studentId` int(11) NOT NULL default '0',
> `marks` smallint(4) NOT NULL default '0',
> `startTime` int(20) default NULL,
> `endTime` int(20) default NULL,
> `percentage` float NOT NULL default '0',
> `status` enum('FAIL','PASS','POST','UNCOMPLETE') NOT NULL default
> 'UNCOMPLETE',
> PRIMARY KEY  (`id`),
> KEY `testId` (`testId`),
> KEY `testStudent` (`id`,`testId`,`studentId`)
> ) TYPE=MyISAM;

> 
> CREATE TABLE `qb_test_result_details` (
> `sequenceId` int(20) NOT NULL default '0',
> `resultId` int(20) NOT NULL default '0',
> `questionId` int(20) NOT NULL default '0',
> `viewStatus` enum('NV','V','A') NOT NULL default 'NV',
> `bookMark` enum('Y','N') NOT NULL default 'N',
> `correct` enum('Y','N') NOT NULL default 'N',
> `postMarks` int(11) NOT NULL default '0',
> KEY `resultId` (`resultId`)
> ) TYPE=MyISAM;

> 
> CREATE TABLE `ums_user` (
> `id` bigint(255) NOT NULL auto_increment,
> `firstName` varchar(255) NOT NULL default '',
> `lastName` varchar(255) NOT NULL default '',
> `userName` varchar(50) NOT NULL default '',
> `password` varchar(50) NOT NULL default '',
> `email` varchar(100) NOT NULL default '',
> `companyId` int(11) NOT NULL default '0',
> PRIMARY KEY  (`id`),
> UNIQUE KEY `userName` (`userName`),
> KEY `roleId` (`id`)
> ) TYPE=MyISAM ;

> 
> CREATE TABLE `qb_question` (
> `id` int(11) NOT NULL auto_increment,
> `question` text NOT NULL,
> `url` varchar(255) NOT NULL default '',
> `file` varchar(255) NOT NULL default '',
> `marks` int(11) NOT NULL default '0',
> `detailedAnswer` text NOT NULL,
> `author` int(11) NOT NULL default '0',
> `testId` smallint(4) NOT NULL default '0',
> `loId` int(11) NOT NULL default '0',
> `needWrittenAnswer` enum('Y','N') NOT NULL default 'Y',
> `archive` enum('Y','N') NOT NULL default 'N',
> PRIMARY KEY  (`id`),
> KEY `testId` (`testId`,`archive`)
> ) TYPE=MyISAM;

> 
> CREATE TABLE `qb_test` (
> `id` smallint(4) NOT NULL auto_increment,
> `categoryId` int(11) NOT NULL default '0',
> `title` varchar(80) NOT NULL default '',
> `description` text NOT NULL,
> `instructions` text NOT NULL,
> `author` int(4) NOT NULL default '0',
> `type` enum('PUBLIC','POST','PRE','REV') default NULL,
> `duration` smallint(6) NOT NULL default '0',
> `passrate` float NOT NULL default '0',
> `showDetails` enum('Y','N') NOT NULL default 'Y',
> `showRandom` enum('Y','N') NOT NULL default 'Y',
> `showAssessment` enum('N','Y') NOT NULL default 'N',
> `noOfQuestions` int(11) NOT NULL default '0',
> `dateAvailable` datetime NOT NULL default '0000-00-00 00:00:00',
> `companyId` int(11) NOT NULL default '0',
> `archive` enum('Y','N') NOT NULL default 'N',
> PRIMARY KEY  (`id`),
> KEY `title` (`title`,`author`,`type`),
> KEY `categoryId` (`categoryId`),
> KEY `companyAuthor` (`companyId`,`author`)
> ) TYPE=MyISAM;

> 
-------------------------------------------------------------------------------------------------------------------------------------

> 
> Now the below query is not using the index "testStudent" in 
qb_test_result.
> 
-------------------------------------------------------------------------------------------------------------------------------------
> EXPLAIN
> SELECT ums_user.firstName, qb_test.title testName, SUM(
> qb_question.marks ) maxMarks, qb_test_result.percentage
> FROM qb_test_result, qb_test_result_details, qb_test, qb_question, 
ums_user
> WHERE qb_test_result.id = qb_test_result_details.resultId
> AND  qb_test_result.testId = qb_test.id
> AND qb_test_result.studentId = ums_user.id
> AND qb_test.id = qb_question.testId
> AND qb_question.id = qb_test_result_details.questionId
> AND qb_test.companyId =1
> AND qb_test.author = 2
> GROUP BY qb_test_result.id
> 
-------------------------------------------------------------------------------------------------------------------------------------

> How ever when i removed qb_test_result.percentage...it does..
> 
-------------------------------------------------------------------------------------------------------------------------------------
> EXPLAIN
> SELECT ums_user.firstName, qb_test.title testName, SUM(
> qb_question.marks ) maxMarks
> FROM qb_test_result, qb_test_result_details, qb_test, qb_question, 
ums_user
> WHERE qb_test_result.id = qb_test_result_details.resultId
> AND  qb_test_result.testId = qb_test.id
> AND qb_test_result.studentId = ums_user.id
> AND qb_test.id = qb_question.testId
> AND qb_question.id = qb_test_result_details.questionId
> AND qb_test.companyId =1
> AND qb_test.author = 2
> GROUP BY qb_test_result.id
> 
-------------------------------------------------------------------------------------------------------------------------------------

> Am i doing something wrong in the first query

> Thank you once again

I don't remember seeing anyone post this link in this thread so I add it 
here:

MySQL manual chapter: 7.2. Optimizing SELECT Statements and Other Queries
http://dev.mysql.com/doc/mysql/en/query-speed.html

In the first query, you ask for a piece of information that only exists on 
the table `qb_test_result` (the field `percentage`). The only method the 
database has available to it to retrieve that data is to go into the 
actual data file and get the data. In your second query, all of the data 
you need from the table `qb_test_result` exist in TWO places:
1) the index `testStudent`
2) the table file.

For the second query, the engine KNOWS it can get all of its information 
directly from the index file and skip the extra disk operations it takes 
to retrieve the information directly from the data file. One read takes 
less time as two or more so the engine uses the index as the source of 
your data. Why did the engine not use the index for your first query? That 
has everything to do with how many read operations would it take to 
physically get the information off of the disk.

WITH AN INDEX
-------------
1) read the index file into memory (disk operation - may require memory 
paging)
2) search the index file for particular records (memory operation - may 
require more memory paging)
3) locate a particular block (random access) of information within the 
data file (at least 2 disk operations)

WITHOUT AN INDEX
---------------
(both are disk operations)
1) Locate the beginning of the table file on disk
2) scan through the table looking for and retrieving matches (serial 
access) 

What benchmark testing has revealed is that if the index indicates that it 
will match more than about 30% (that's a rough number, the actual number 
is computed based on table statistics) of the total number of records in a 
particular table that it is actually faster to perform the serial scan of 
the table rather than to bounce around picking random records out of the 
data file based on index entry matches. This is due to minimizing the 
number of times the hard disk's read heads need to be repositioned in 
order to get to the data you want.  The simple physical quantity of time 
that it takes to randomly position the read heads over a particular place 
on the disk is what discourages the engine from using an index for your 
first query. It's simply physically faster to ignore the index and scan 
the table.

More details are in the readings I referenced at the top of my response.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Re: Indexing not working

Reply via email to