RE: Joins are slow

2003-01-22 Thread Jennifer Goodie
Try a straight join with the cases table first.  You have to play with your
queries to get the best results.

Creating a temporary, as you mentioned, table might be a very good option in
this case, but you need to make sure that your programming takes into
account that your page might be hit more than once at a single time so your
queries don't collide, i.e. someone is looking for 1999 cases and someone
else is looking for 2000 cases so both are in your temp table and you're
joining without a where or something, it seems obvious but I've seen it not
accounted for.  Another thing that might get you is populating your temp
table via select into can cause MySQL to use temp tables on disk instead of
memory if the data set is very large and your server is not optimised for
it, that will take speed way down.


-Original Message-
From: Steve Quezadas [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, January 21, 2003 7:19 PM
To: [EMAIL PROTECTED]
Subject: Joins are slow


Maybe I'm dumb for saying this, but sql joins seems expensive to do in
terms of performance (yes, I indexed the joined fields). If I do a query
search of a 2,600,000 record defendant table WITHOUT a join (SELECT
DISTINCT CaseNumber FROM Defendants WHERE Defendant LIKE "owen%" OR
Defendant LIKE "pitts%" OR Defendant LIKE "general%";). Performance is
generally zippy at .53 seconds (which ends up pulling about 34,000 rows).

HOWEVER, once I join this table with the much smaller [Cases] table
(about 140,000 rows), performance plummets to 8.79 seconds! Quite a
drop! The SQL statement is: 'SELECT Defendants.CaseNumber FROM
Defendants, Cases WHERE Cases.CaseNumber = Defendants.CaseNumber AND
Filed <= "1999-01-01" AND (Defendant LIKE "owen%" OR Defendant LIKE
"pitts%" OR Defendant LIKE "general%");'. I get about 10,500 rows
returned here, but man, what an increase in time.

So my main question is: Is this degradation in performance to be
expected with a join? What I am thinking about doing is perhaps doing
two smaller queries instead of one larger one. My first query can just
query the very large Defendants table without a join and create a
temporary table of CaseNumbers (takes about .53 seconds). Then I can
join that temporary table into the much smaller [Cases] table and then
run that part of the query (which takes about 1.23 seconds).

So bottom line:
Results of big query with one join: 8.79 seconds
Results of creating two smaller queries: 1.76 seconds (.53 seconds +
1.23 seconds).

So I am thinking about optimizing the query in my PHP program to make
two smaller queries instead of one bigger query. Is this a standard
thing to do to increase speed? Or am I doing something stupid and
inefficient in the join that I am not aware of?

- Steve

PS Here is some information about my tables and indexes:

([Defendant] Query WITHOUT a join) - .53 seconds.
EXPLAIN SELECT DISTINCT CaseNumber FROM Defendants WHERE Defendant LIKE
"owen%" OR Defendant LIKE "pitts%" OR Defendant LIKE "general%";

++---+---+---+-+--+---+-
+
| table  | type  | possible_keys | key   | key_len | ref  |
rows  | Extra   |
++---+---+---+-+--+---+-
+
| Defendants | range | Defendant | Defendant |  30 | NULL |
82756 | where used; Using temporary |
++---+---+---+-+--+---+-
+

([Defendant] Query WITH a join - 8.79 seconds!
EXPLAIN SELECT Defendants.CaseNumber FROM Defendants, Cases WHERE
Cases.CaseNumber = Defendants.CaseNumber AND Filed <= "1999-01-01" AND
(Defendant LIKE "owen%" OR Defendant LIKE "pitts%" OR Defendant LIKE
"general%");
+++--+-+-+--
-+---++
| table  | type   | possible_keys| key | key_len |
ref   | rows  | Extra  |
+++--+-+-+--
-+---++
| Defendants | range  | CaseNumber,Defendant | Defendant   |  30 |
NULL  | 82756 | where used |
| Cases  | eq_ref | CasesNumber,Filed| CasesNumber |  30 |
Defendants.CaseNumber | 1 | where used |
+++--+-+-+--
-+---++

Table and index information
Cases Table - 140,000 records
+---+---+--+-+-+---+
| Field | Type  | Null | Key | Default | Extra |
+---+---+--+-+-+---+
| CaseNumber| char(30)  |  | PRI | |   |
| County| char(30)  |  | | |   |
| CaseName  | char(120) |  | | |   |
| Court | char(30)  |  | | | 

Re: Joins are slow

2003-01-22 Thread Martin Waite
On Wed, 2003-01-22 at 03:18, Steve Quezadas wrote:
> 
> ([Defendant] Query WITH a join - 8.79 seconds!
> EXPLAIN SELECT Defendants.CaseNumber FROM Defendants, Cases WHERE 
> Cases.CaseNumber = Defendants.CaseNumber AND Filed <= "1999-01-01" AND 
> (Defendant LIKE "owen%" OR Defendant LIKE "pitts%" OR Defendant LIKE 
> "general%");
> 
>+++--+-+-+---+---++
> | table  | type   | possible_keys| key | key_len | 
> ref   | rows  | Extra  |
> 
>+++--+-+-+---+---++
> | Defendants | range  | CaseNumber,Defendant | Defendant   |  30 | 
> NULL  | 82756 | where used |
> | Cases  | eq_ref | CasesNumber,Filed| CasesNumber |  30 | 
> Defendants.CaseNumber | 1 | where used |
> 
>+++--+-+-+---+---++
> 
Would a combined index on CasesNumber and Filed help ?

eg.
create index CasesFiled on Cases (CasesNumber, Filed);

The query might work out the 'Filed <= "1999-01-01"' without consulting
the table then.

==
Martin


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php