Re: General Questions About Indexes

2007-05-28 Thread Baron Schwartz

Hi John,

John Kebbel wrote:

INDEXES - A Science AND an Art

I've been continuing to look for answers to my own questions. I've found
a few ...


I meant to write back and try to help, but got busy with other things. 
You have found some good answers for yourself.



Q1. What good does it do to store the primary key or a unique key if
you're normally SELECTing columns that don't use that primary or unique
key?
As you can see, it only makes sense to index those fields you
use in the WHERE clause.

http://www.databasejournal.com/features/mysql/article.php/10897_1382791_2


Generally true, but there can be some advantages to indexing other 
columns too.  A common case is a covering index.  For example, SELECT 
col1 FROM tbl WHERE col2=11.  If you have an index on col2, it will 
help the RDBMS find entries quickly.  This process works as you would 
expect: it looks in the index B-tree for entries with value 11.  Now 
it has to do what's called a bookmark lookup, to find the actual row 
in the table, and retrieve col1 from it.


However, if you have an index on (col2, col1) you can retrieve the value 
from the index, without needing to go look for the row in the table. 
This index covers the query.  Order of columns is important here.


In practice, this can be an enormous advantage.  I wrote an article 
about this, with possibly helpful diagrams, here: 
http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/



Q2. Does a SELECT statement look at an index before it looks at a
table? 
Before we repair the table structure above, let me tell you

about a most important little secret for anyone serious about
optimizing their queries: EXPLAIN. EXPLAIN shows (explains!) how
your queries are being used. By putting it before a SELECT, you
can see whether indexes are being used properly, and what kind
of join is being performed...

http://www.databasejournal.com/features/mysql/article.php/10897_1382791_1


Yes, and it goes further than that; the server maintains statistics 
about the indexes, and uses them to estimate the cost of various query 
execution plans.  You could spend months learning how this works, all 
the fine points of various optimization strategies, etc.  (Not that I'm 
an expert on it, I just know there's a lot to it).



Q3. Are JOINs where the real timesaving occurs and SELECTs just a
peripheral issue muddying the water?
In MySQL, Paul DuBois writes: Index columns that you search
for, not columns you select ... [t]he best candidate columns for
indexing are the columns that appear in your WHERE clause or
columns named in join clauses.


Many kinds of operations may benefit from indexes.  Besides SELECT and 
JOIN, they can be used for GROUP BY, ORDER BY, DISTINCT, finding rows to 
UPDATE or DELETE, and are necessary for lots of other things like 
foreign keys.


The trade-off is cost and space to maintain them.  When you change data, 
the indexes have to be updated too.



Q4. What about non-unique indexes? Is the structure of a non-unique
index file similar to the index in the back of a book, the phrase you're
searching for plus a list of row numbers (page numbers for a book) where
that phrase is found?
I haven't found the answer to this question, but I did find:
Indexes work best for columns with unique values,  and most
poorly with columns that have many duplicate values Paul
DuBois, MySQL 


I think this might be two questions.

I'm not sure -- there may be more subtleties than this -- but I think a 
UNIQUE index is like any other index, except the server knows to check 
for duplicate values and throw an error.  Most indexes in MySQL are 
B-trees, though there are specialized indexes for some things (hash, 
spatial, RTREE, fulltext).


Another thing you're touching on here is the selectivity of the index. 
This is the degree to which a row in the table is uniquely identified by 
an index entry.  If you index a column with all one value, the 
selectivity is terrible; the opposite end of the spectrum is a UNIQUE 
index, which has perfect selectivity (each row is uniquely identified by 
an index entry).



Q5. Is an item in an index tied to a memory address (like a pointer in C
++) where the indexed data appears inside the larger memory area staked
out by the table?


This is implementation-dependent.  Hopefully the diagrams in the article 
I linked above will help explain.  There is always some kind of 
pointer from the index to the row, but how it works is different from 
one system to the next.  (There are some exotic kinds of storage 
engines that may not have conventional indexes, and will work completely 
differently, but the pointer idea applies well to all the 
MySQL-supplied engines, as far as I know).



Q6. As for memory, when you choose a database inside the mysql client,
are all the tables within that database read into 

Re: General Questions About Indexes

2007-05-27 Thread John Kebbel
INDEXES - A Science AND an Art

I've been continuing to look for answers to my own questions. I've found
a few ...

Q1. What good does it do to store the primary key or a unique key if
you're normally SELECTing columns that don't use that primary or unique
key?
As you can see, it only makes sense to index those fields you
use in the WHERE clause.

http://www.databasejournal.com/features/mysql/article.php/10897_1382791_2
Q2. Does a SELECT statement look at an index before it looks at a
table? 
Before we repair the table structure above, let me tell you
about a most important little secret for anyone serious about
optimizing their queries: EXPLAIN. EXPLAIN shows (explains!) how
your queries are being used. By putting it before a SELECT, you
can see whether indexes are being used properly, and what kind
of join is being performed...

http://www.databasejournal.com/features/mysql/article.php/10897_1382791_1
Q3. Are JOINs where the real timesaving occurs and SELECTs just a
peripheral issue muddying the water?
In MySQL, Paul DuBois writes: Index columns that you search
for, not columns you select ... [t]he best candidate columns for
indexing are the columns that appear in your WHERE clause or
columns named in join clauses. 
Q4. What about non-unique indexes? Is the structure of a non-unique
index file similar to the index in the back of a book, the phrase you're
searching for plus a list of row numbers (page numbers for a book) where
that phrase is found?
I haven't found the answer to this question, but I did find:
Indexes work best for columns with unique values,  and most
poorly with columns that have many duplicate values Paul
DuBois, MySQL 
Q5. Is an item in an index tied to a memory address (like a pointer in C
++) where the indexed data appears inside the larger memory area staked
out by the table?
?
Q6. As for memory, when you choose a database inside the mysql client,
are all the tables within that database read into memory from the hard
drive, or just the indexes?
?


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



General Questions About Indexes

2007-05-26 Thread John Kebbel
I have a few questions about indexes. I understand (1) what an index
is, and (2) why indexes are useful, but I don't have even a rough idea
about HOW they work. The internet resources I've been able to find don't
answer the questions I'm asking. I also tried
cat /var/lib/mysql/srms07/staff.MYI to see if I could glean some
information directly from an index file, but the MYI file wasn't
human-readable.

Q1. What good does it do to store the primary key or a unique key if
you're normally SELECTing columns that don't use that primary or unique
key? 

Q2. Does a SELECT statement look at an index before it looks at a
table? 

Q3. Are JOINs where the real timesaving occurs and SELECTs just a
peripheral issue muddying the water?

Q4. What about non-unique indexes? Is the structure of a non-unique
index file similar to the index in the back of a book, the phrase you're
searching for plus a list of row numbers (page numbers for a book) where
that phrase is found?

Q5. Is an item in an index tied to a memory address (like a pointer in
C++) where the indexed data appears inside the larger memory area staked
out by the table?

Q6. As for memory, when you choose a database inside the mysql client,
are all the tables within that database read into memory from the hard
drive, or just the indexes?

Thanks in advance for taking the time to read this, and even more
thanks if you take the time to respond to my questions with either an
explanatory URL or your words explaining the matter.



 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]