On Tue, 21 Sep 2004 08:57:21 -0400
[EMAIL PROTECTED] wrote:

<snip>

> There are up to three layers of record filtering that happen during
> any query. First is the JOIN filtering.  That is where the ON
> conditions are used with the table declarations to build a virtual
> table that consists of all columns from each of the participating
> tables and each combination of rows that meets the ON conditions. 
> If table A has 5 rows and table B has 50 rows and the ON conditions
> force a match of at most 2 records from table B to each record in
> table A, the virtual table will have at most 10 rows (not the 250
> that would be generated without the ON conditions). Second to be
> applied is the rest of the WHERE clause that could not be applied
> during the ON determinations. This is especially true with queries
> that contain OUTER JOINS. If it didn't happen in this order, we
> couldn't do an outer join of two tables and look for a null result
> in the outer table to determine non-matching rows. The third set of
> filters to be applied comes from the HAVING clause. HAVING
> conditions are applied after every other portion of the query has
> been analyzed except for the LIMIT clause. That is why HAVING works
> on the results of GROUP BY aggregate functions and can use column
> aliases declared in the SELECT clause.

Yes, this much I grasp as far as the ordering of filtering.  Cartesian
Product, then JOIN, Then WHERE.

> When you put a condition into the ON clause of a JOIN, it is going
> to be applied to the formation of the virtual table which gets
> computed _before_ the entire WHERE clause is applied. Under many
> conditions, some WHERE conditions can also be applied to table JOINs
> along with the ON restrictions. Luckily, the query optimizer handles
> that for us.

Perhaps another example would help.  I've been trying to re-write
another join query that's designed to produce an attendance record for
each student detailing which classes they've attended and which
they've missed.  Each occurance of a student (PK tech_id) attending a
class (PK c_id) is recorded in an intersection table class_attneded
(tech_id, c_id PK).  The absense of a record in this table indicates
the student missed the class.  So if tech_id 123123 was at classes 1
and 3, there would be records in the class_attended table:

tech_id - class_id
123123    1
123123    3

The query that worked pre 5.0.1 is:

SELECT *
  FROM student s
 INNER JOIN enrollment e ON e.tech_id = s.tech_id
 INNER JOIN class_attended ca ON ca.tech_id = s.tech_id
 RIGHT JOIN class c ON c.c_id = ca.c_id
 WHERE s.tech_id = '253542'
   AND c.term_id = '4'
   AND c.class_date < NOW()
 ORDER BY week_no;

In my mind I see this as get all the class records (the table on the
right side of the RIGHT JOIN) and if you can match up the
class_attended, enrollment and student information do so, otherwise
set those values to NULL.  In the example above this yeilds 3 rows -
the for for class id 2 has NULLs in the other table data.

I cannot seem to figure out the 5.0.1 equivalent because I seem to
have learned this the wrong way.

I tried this, starting with the class table since those are the rows
that I want to have displayed regardless of matches.

SELECT *
  FROM class c
  LEFT JOIN class_attended ca ON c.c_id = ca.c_id
 INNER JOIN student s ON ca.tech_id = s.tech_id AND ca.tech_id =
'253542'
 INNER JOIN enrollment e ON e.tech_id = s.tech_id
 WHERE c.term_id = '4'
   AND c.class_date < NOW()
 ORDER BY week_no;

This acts like an INNER JOIN though, I can only get two rows.  I've
tried mucking around with it, but I just cannot get the "outer" row
for class id = 2 to show.

This was so much easier with the bug!  :)

Thanks for your help,

Josh


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to