Peter Lauri wrote:
> Is there not a better way to do that? What will happen there is that a large
> result set will be created because when you just do "select * from
> customers c, issues i, customer_issues ci" it will be like the inner product
> from all these tables, and then just choosing the right ones.
>
> If the table C have 1000 records, issues 5000 and customer_issues 15000 you
> would end up with a 75,000,000,000 rows large results set, that would not be
> so sweet, would it?

Peter Lauri wrote:
> Yes, it cuts it down to that number of records in the end, so the final
> result set will just be a few rows that match the 'WHERE'. But the internal
> process of MySQL do merge all tables and then chooses the records that
> matches the 'WHERE' clause.

No database would work very well if that were the case. You are essentially asserting that mysql has no optimizer. That just isn't so. The job of the optimizer is to devise a plan to execute the query in a way that will examine the fewest possible rows. Eliminating rows before looking at them is always preferable to eliminating them afterwards.

For example, given the query

  SELECT *
  FROM customers c
  JOIN customer_issues ci ON c.customerid = ci.customerid
  JOIN issues i on ci.issueid = i.issueid
  WHERE c.customerid = 13;

mysql will use the index on customers.customerid to select the 1 row with customerid = 13, it will then use the index on customer_issues.customerid to find matching rows in customer_issues, then finally it will use the index on issues.issueid to find matching rows in issues. This is easily verified using EXPLAIN:

  EXPLAIN SELECT *
          FROM customers c
          JOIN customer_issues ci ON c.customerid = ci.customerid
          JOIN issues i on ci.issueid = i.issueid
          WHERE c.customerid = 13;

  +-------------+-------+--------+---------+---------------------+------+
  | select_type | table | type   | key     | ref                 | rows |
  +-------------+-------+--------+---------+---------------------+------+
  | SIMPLE      | c     | const  | PRIMARY | const               |    1 |
  | SIMPLE      | ci    | ref    | PRIMARY | const               |    4 |
  | SIMPLE      | i     | eq_ref | PRIMARY | test.ci.issueid     |    1 |
  +-------------+-------+--------+---------+---------------------+------+
  3 rows in set (0.01 sec)

(Note that I've pared the output of EXPLAIN down to a few relevant columns.)

The "rows" column tells the story. Mysql plans to use the primary key to find the 1 matching row in customers, then use the primary key to find the 4 matching rows in customer_issues for that 1 customer, then use the primary key to find the 1 matching row in issues for each row found in customer_issues. That is, it expects to produce 1x4x1 = 4 rows, not 'size of c' x 'size of ci' x 'size of i' rows!

You might want to read the optimization section of the manual for more on the subject <http://dev.mysql.com/doc/refman/4.1/en/optimization.html>.

Chris wrote:
> I don't know enough about mysql internals to debate that so I'll take
> your word for it.
>
> 'Explain' doesn't give enough information about what happens behind the
> scenes so I'm not sure how to prove/disprove that and I don't know of
> any tools that would show you that (if there is let me know!).
>
> Having said all of that I've never had a problem doing it the way I
> mentioned.. ;)

EXPLAIN is documented in the manual <http://dev.mysql.com/doc/refman/4.1/en/explain.html>.

Michael

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to