Hi.

On Wed 2002-08-14 at 15:06:02 -0700, [EMAIL PROTECTED] wrote:
> If you were here right now I would kiss you!  Well not really, but I can not
> believe what that did!!  It takes around 2-4 seconds now for the query to
> execute completely!  I wish I understood when to use what type of join since
> it is very obvious to me that it matters, and matters greatly! Thank you so
> much!

Let me try to explain.
(1) INNER JOIN is the join you want and the correct one to use.
(2) LEFT JOIN produces a different result (and therefore not what you want) 
    and usually takes a bit longer.
(3) Your query takes far too long, because MySQL chooses a wrong join order.
    (the order in which the tables are read in)
(4) LEFT JOIN implies a join order and one can (mis-)use it to get
    around (3)

Regarding (2), the wrong result is not a problem in your case, because
- as Keith already explained - the additional rows with NULLs for the
right table which you get (with a LEFT JOIN compared to an INNER
JOIN), are filtered out by other conditions in your query. And the
minor speed loss can be neglected, because you gain an order of
magnitude of speed at the same time.

You still should try to use INNER JOIN instead of the LEFT one. The
"official" way to influence the JOIN order is to use STRAIGHT_JOIN, as
Keith explained, i.e.

  SELECT STRAIGHT_JOIN DISTINCT ... FROM ...

Even better would be to try to figure out why MySQL has chosen a
suboptimal join order and try to "educate" it. For the slow query, it
guesses that it will need 56467 * 1 * 1 * 1 = 56467 rows, for the fast
one it guesses 7 * 80732 * 1 * 1, which are far more, which explains
MySQL's choice for the former one.

At least one of the two guesses is wrong. Either the slow query "looks
faster" than it is or the fast query looks slower. Most times it is
the latter case, but with 56467 rows needing a minute I presume the
former.

Run ANALYZE TABLE for all involved tables and see if it changes what
EXPLAIN tells (for either query).

Regardless, for the better join (with the higher guess), EXPLAIN tells
that it has to do an full table scan for JobsLocation and cannot use
an index (NULL in the "possible keys" column). You can further improve
your query by adding an index on JobsLocation.LocID, and if there is
already one, make sure JobsLocation.LocID and Location.LocID have
exactly the same type. (Btw, if JobsLocation indeed has no index yet,
what column is the PRIMARY KEY on?)

> Here is the new Explain (sorry about the formatting due to email program):

In the future, try

  EXPLAIN SELECT ... ORDER BY Loc ASC \G

\G will result in a column-oriented output which will be longer, but
easier to read.

Please inform us of any finding.


Last, an introduction into LEFT JOINs:

In case it is not clear yet, an INNER JOIN creates a cross product,
i.e. each row of one table is paired with each row of the other
table. The ON clause reduces the result to such pairs you want to see
(usally the "matching" ones). That is also the reason, why you get no
rows at all, if one of the tables is empty. There are no rows to be
paired in one (or the theoretical explanation via cross product:
0 x some_set = 0)

A join via "," is the same as INNER JOIN except that you specify the
condition in the WHERE clause with all other conditions instead of an
seperate ON clause, i.e.

  SELECT * FROM one, two WHERE one.two_id = two.id

has exactly the same meaning (and with MySQL even the same execution path) as

  SELECT * FROM one INNER JOIN two ON one.two_id = two.id

Okay, what does LEFT JOIN do? It takes each row of the left table and
pairs it with each row of the right table and chooses the ones you
want via ON clause (until here it's the same as INNER JOIN). But if
for one row of the left table, there is no row in the right one, so
that a pair would satisfy the ON clause, a new pair is created, which
contains the normal values of the left table and NULL for all columns
of the right table.

In other words: LEFT JOIN is an INNER JOIN, which adds (row,NULL)
pairs for each row of the left table which has no match (regarding the
ON clause) in the right table.


Well, when do you need a LEFT JOIN? You only need a LEFT JOIN, if you
are interested in non-matches, e.g. you want to join a table with
another, which is "incomplete" in some way and want to ignore that
fact, i.e. see all rows of the first table anyhow. In all other cases,
use an INNER JOIN. And at last an example:

If you have a table with persons and a table with pets. If you want to
who has which pet(s), you do

  SELECT * FROM person, pet WHERE person.pet = pet.id

If you want to see all persons and the pets, if any, you need a LEFT
JOIN:

  SELECT * FROM person LEFT JOIN pet ON person.pet = pet.id

which displays NULL values if someone has no pet. If you want see who
has no pet, you can use this fact:

  SELECT * FROM person LEFT JOIN pet ON person.pet = pet.id WHERE pet.id IS NULL

If you want to see how many pets each person has, you cannot use an
INNER JOIN, because you would not see the persons without pet, therefore:

  SELECT   person.*, COUNT(pet.id)
  FROM     person LEFT JOIN pet ON person.pet = pet.id
  GROUP BY person.id

Note that it is COUNT(pet.id), not COUNT(*). COUNT(*) would result in
1 for person having no pets, because it counts _rows_. COUNT(pet.id)
counts occurences of "pet.id", but only if pet.id is not NULL.

HTH,

        Benjamin.

> Location    range    LocName    LocName    55    NULL    7    where used;
> Using temporary; Using filesort
> JobsLocation    index    NULL    PRIMARY    8    NULL    80732    Using
> index
> Jobs    eq_ref    PRIMARY,EmpID    PRIMARY    4    JobsLocation.JobID    1
> where used
> Employers    eq_ref    PRIMARY    PRIMARY    4    Jobs.EmpID    1    where
> used
> 
> Rick
> 
> "Youth furnishes the materials and plans for the future. Maturity takes and
> cuts the stones; provided the so-called wisdom of old age doesn't destroy
> the genius of youth." - Unknown
> 
> > From: "Keith C. Ivey" <[EMAIL PROTECTED]>
> > 
> > On 14 Aug 2002, at 14:46, Richard Baskett wrote:
> > 
> >> Ok that saved me 5 seconds per query! So far so good!  I couldnąt use the
> >> STRAIGHT_JOIN though (received errors) so I made it an INNER JOIN.  This is
> >> the new query:
> > 
> > You say the EXPLAIN output looks the same, so the 5 second savings is
> > just random.  Nothing changed.
> > 
> > And sorry about the STRAIGHT_JOIN advice.  Looking at the
> > documentation again I see that it doesn't do what I thought it did.
> > 
> >> SELECT DISTINCT Location.SortID AS Loc, LocName, JobTitle AS Job, Company AS
> >> Comp, Jobs.JobID, Employers.EmpID
> >> FROM Location 
> >> INNER JOIN JobsLocation ON JobsLocation.LocID = Location.LocID
> >> INNER JOIN Jobs ON Jobs.JobID = JobsLocation.JobID
> >> INNER JOIN Employers ON Employers.EmpID = Jobs.EmpID
> >> WHERE Valid=1 AND JobTitle IS NOT NULL AND (LocName LIKE 'US-Alabama%')
> >> ORDER BY Loc ASC
> > 
> > One more suggestion:  Change the first INNER JOIN to a LEFT JOIN.
> > It's a little silly, since you don't really want those nulls, but it
> > should force Location to be selected first.  The records with nulls
> > should be eliminated by the later inner joins, or failing that by the
> > JobTitle criterion.
> > 
> > I thought there was a better way to control the process, but I don't
> > see it at the moment.
[...]

-- 
[EMAIL PROTECTED]

---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to