Hi Bill, (answers intermixed....) "Bill" <[EMAIL PROTECTED]> wrote on 12/18/2005 07:33:52 PM:
> Hi Shawn, > > > > * /-----\ /-----\ > * / \ / \ > * / X \ > * / / \ \ > * / / \ \ > * | A | B | C | > * \ \ / / > * \ \ / / > * \ X / > * \ / \ / > * \-----/ \-----/ > * > * > *Let's imagine for a second that you are a publisher and you have some > auteurs in your data who haven't written any libres and a few libres > *without auteurs (story ideas you haven't gotten someone to write, yet). If > we can imagine that the left circle of AB represents all of the auteurs *in > your database and the right circle of BC represents your table of libres, > you are ready to understand JOINs. The area B represents where *the two > overlap (where libre.auteur=auteur.id) > * > *If you want to know all of the authors and the books they have written > (ignoring those authors without books), you want to get those rows from > *area B where the rows of auteurs matches those of libres. This is when you > use an INNER JOIN > * > *SELECT a.nom, a.prenom, l.titre > *FROM auteurs a > *INNER JOIN libres l > *ON l.auter = a.id; > > > Ok so INNER JOIN (B) is actually A intersection of B (in maths language) > and A and C are peripherals (outer) > No, LEFT CIRCLE (intersect) RIGHT CIRCLE defines the set that should be returned by an INNER JOIN. That would be just Area B. It's only going to return those rows from either table that match the ON condition(s). > *If you want a list of all auteurs in your system, regardless of whether > they have written a libre or not, that would be represented by the area > *covered by AB (the whole circle).: > > *SELECT a.nom, a.prenom, l.titre > *FROM auteurs a > *LEFT JOIN libres l > * ON l.auter = a.id; > > > So LEFT JOIN = A > Hence RIGHT JOIN is C ?? Almost... It's my fault for rushing through this. a LEFT JOIN is: LEFT CIRCLE (union) (LEFT CIRCLE (intersect) RIGHT CIRCLE) That means that all of A and just that part of BC that matches the ON condition(s) (the area defined by B) will appear in the results. LEFT and RIGHT refer to which table's data is considered non-optional. If a query contained the following clause FROM `employees` LEFT JOIN `managers` on `managers`.`emp_id` = `employees`.`id` The table `employees` is on the LEFT side of the LEFT join with means that all of that table has the potential to be in the final results. The information that comes from the table `managers` may or may not have matches with every row of `employees` so for those rows where they do not match, there will be a NULL value in every column of the `managers` table where there would have been a gap. If `employees represents` the LEFT CIRCLE (areas A and B) and `managers` represents the RIGHT CIRCLE (areas C and B) then a LEFT JOIN would return all of the LEFT CIRCLE (areas A and B) plus that part of RIGHT CIRCLE where it overlaps (area B). Because some part of `managers` matches some part of `employees` there will be some part of that table available to your query. NULL values fill in for all unmatched rows. A RIGHT JOIN just reverses which table is considered as non-optional. Based on this clause FROM employees RIGHT JOIN managers on managers.emp_id = employees.id You end up with all of the RIGHT CIRCLE(areas B and C) being part of the potential results, not just C. Only those rows from the LEFT CIRCLE that fall in area B could appear in the results. Make better sense? > > *The LEFT part of the LEFT JOIN indicates that the entire table on the LEFT > of the predicate is to be returned. In this case the table auteurs is *to > the left of the predicate so all of the rows from that table will be part of > the result even if there aren't any rows in libres that match. > > *To view it the other way, to list all of the libres records without > regards to if they have auter or not we can use the SAME query with one > small *change. > > *SELECT a.nom, a.prenom, l.titre > *FROM auteurs a > *RIGHT JOIN libres l > * ON l.auter = a.id; > > *See the difference? Now, a special type of query would be to identify all > of the rows in one table that do not match any rows in the other. This > *would be what kind of query you would need to run if you wanted to find > auteurs without libres (or vice versa). > > *SELECT * > *FROM auteurs a > *LEFT JOIN libres l > * ON a.id = l.auteur > *WHERE l.id IS NULL; OK I see it's clear --> select auteurs that > don't have any reference in livres. > > *I am sure you are thinking "The ID field of my libres table could never be > NULL. He must be crazy". It's the design behavior of the outer joins to > *return NULLS for EVERY COLUMN in the "other" table if the ON condition is > not met. In this case, you would only see values for the columns *that come > from libres if the tables actually matched up. Wherever they don't, you will > see all NULLs. > > *There are other much more detailed explanations about this on the web but I > hope this can gets you started. > > A lot yes. Translated it gives > LEFT JOIN and RIGHT JOIN are sets with nothing in common where INNER JOIN is > the intersection of the two so, what is common to the two sets. Any > condition tying part or all of the two sets create a more or less big > intersection. We then choose to see one or the other part of the collection > (A,B,C or any subset or part of them). Almost. When you JOIN tables, you are (horizontally) adding all of the columns of all tables being joined into what is a new composite table. This composite's existence is normally hidden from the user. Imagine two tables, TableA and TableB. TableB contains information that can exist in a one-to-many relationship with the items of TableA (TableA is authors and TableB is books, for example). TableA has columns a1, a2, a3, and id. TableB has columns b1, b2, b3, id, and a_id. If I did the following query: SELECT * FROM TableA INNER JOIN TableB ON TableB.A_ID = TableA.ID I would get the following columns in my output which represent all of the columns and all of the rows of the intermediate composite table: a1, a2, a3, id, b1, b2, b3, id, a_id It doesn't matter if I had used an INNER, LEFT, RIGHT, or CROSS join, the same arrangement of columns will exist in some internal table at some point in time during the execution of the query. It's the SELECT clause that determines which columns (or combinations of columns) from the composite table will appear in the final results. It's the ON clause (along with the INNER, LEFT, or RIGHT modifier) that determines which rows from which source tables end up in this intermediate table. It's the WHERE clause that determines which rows make it out of this intermediate table into the final query. In order to compute the intermediate table, each row on the left side of the predicate is evaluated in combination with each and every row from the right side of the predicate. If the ON clause is missing, or is always true, what results is called a Cartesian product of the two data sets: each row of TableA will be paired with every row of TableB. If TableA had 5 rows and TableB had 3 rows, the intermediate table will have 15 rows in it. If an ON clause exists, all row combinations are evaluated against its conditions. Whether a row makes it into the intermediate table depends on the directionality of the predicate (INNER, LEFT, or RIGHT) and whether the row combination meets the ON conditions or not. That means that MySQL supports the means to determine what rows are in each area of our diagram based on the direction of the join (INNER, LEFT, or RIGHT), the conditions declared in the ON clause, and any filtering of the results based on the WHERE conditions. > Correct me if I'm wrong. > > Now UNION is sort of equivalent to GROUP BY or even WITH ROLLUP isn't it ? > > I don't think you have that quite right. GROUP BY does summations based on the occurrence of common values within certain columns. WITH ROLLUP takes the results of a GROUP BY and re-summarizes based on the columns defining the original summations. Let's straighten out your understanding of JOINs before we delve into GROUP BY and WITH ROLLUP much further, ok? UNION linearly (vertically) combines the results of two or more complete queries. The default behavior of a UNION is to filter out duplicate rows but that can be turned off. > Sandy, learning fast with skilfull teacher Shawn Green Database Administrator Unimin Corporation - Spruce Pine PS. Please don't forget to CC the list...