Dear Stefan, 

Thanks for your help. I didn't know MySQL doesn't
automatically create index on primary key (I probably
should create UNIQUE index on them now). 

About not mixing char and varchar in one table, I
don't find that info in the on-line documents. I could
successfully create a test table
create table testTable(
     Seq_ID char(20),
     Title varchar(100)
);

Qunfeng

 
--- "Stefan Hinz, iConnect (Berlin)"
<[EMAIL PROTECTED]> wrote:
> Qunfeng,
> 
> > A simple left join on two big table took 5 mins to
> > finish.
> 
> These lines tell about the cause of the problem:
> 
> > | table | type   | possible_keys | key     |
> key_len |
> > | s     | index  | NULL          | PRIMARY |     
> 50 |
> 
> MySQL has no key (index) which it can use to speed
> up the search on the
> first table, newSequence (alias s). So, it has to
> scan all of the rows:
> 
> > ref      | rows    | Extra       |
> > NULL     | 2684094 | Using index |
> 
> MySQL will still use the primary key, _trying_ to be
> faster than without.
> 
> Does the Seq_ID have to be VARCHAR? This column type
> isn't very easy to
> index, especially without a length specification.
> 
> As you cannot have CHAR (> 3) and VARCHAR in one
> table, I would suggest you
> split up table newSequence into two tables (one
> fixed-length (i.e. without
> VARCHAR/TEXT columns), the other variable-length).
> This will speed up
> count() queries (and others) amazingly.
> 
> If you can use something like INT instead of CHAR,
> it's even faster.
> 
> If, for any reason, you have to stick to VARCHAR,
> you should index the
> column separately. Leave the primary key as is, but
> add another key (index)
> like that:
> 
>  CREATE INDEX make_it_fast ON newSequence
> (Seq_ID(10));
> 
> This will only make sense if the first 10 characters
> can tell the difference
> between different records. If not, you can
> experiment setting the index size
> to 20, 30, ...
> 
> I hope this will give you some ideas on how you can
> improve performance.
> 
> Regards,
> --
>   Stefan Hinz <[EMAIL PROTECTED]>
>   CEO / Geschäftsleitung iConnect GmbH
> <http://iConnect.de>
>   Heesestr. 6, 12169 Berlin (Germany)
>   Telefon: +49 30 7970948-0  Fax: +49 30 7970948-3
> 
> 
> ----- Original Message -----
> From: "Qunfeng Dong" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Monday, December 16, 2002 6:42 PM
> Subject: How can I speed up the Left Join on big
> tables?
> 
> 
> > Hi,
> >
> > A simple left join on two big table took 5 mins to
> > finish.
> >
> > Here is the "explain"
> > mysql> explain select count(*) from newSequence s
> left
> > join newSequence_Homolog h on s.Seq_ID = h.Seq_ID;
> >
>
+-------+--------+---------------+---------+---------+----------+---------+-
> ------------+
> > | table | type   | possible_keys | key     |
> key_len |
> > ref      | rows    | Extra       |
> >
>
+-------+--------+---------------+---------+---------+----------+---------+-
> ------------+
> > | s     | index  | NULL          | PRIMARY |     
> 50 |
> > NULL     | 2684094 | Using index |
> > | h     | eq_ref | PRIMARY       | PRIMARY |     
> 50 |
> > s.Seq_ID |       1 | Using index |
> >
>
+-------+--------+---------------+---------+---------+----------+---------+-
> ------------+
> > 2 rows in set (0.00 sec)
> >
> > here are the two tables' definitaion
> > mysql> describe newSequence;
> >
>
+-------------+-------------------------------------------------------------
>
-------------------------------------------+------+-----+------------+------
> -+
> > | Field       | Type
> >
> >          | Null | Key | Default    | Extra |
> >
>
+-------------+-------------------------------------------------------------
>
-------------------------------------------+------+-----+------------+------
> -+
> > | Seq_ID      | varchar(50)
> >
> >          |      | PRI |            |       |
> > | GenBank_Acc | varchar(10)
> >
> >          | YES  | MUL | NULL       |       |
> > | Organism    | varchar(50)
> >
> >          |      | MUL |            |       |
> > | Seq_Type    | enum('EST','GSS','EST Contig','EST
> > Singlet','GSS Contig','GSS Singlet','GSS Plasmid
> > Contig','Protein') |      | MUL | EST        |    
>   |
> > | Seq_Length  | int(11)
> >
> >          |      |     | 0          |       |
> > | Seq_Title   | text
> >
> >          |      | MUL |            |       |
> > | Comment     | text
> >
> >          | YES  | MUL | NULL       |       |
> > | Entry_Date  | date
> >
> >          |      |     | 0000-00-00 |       |
> >
>
+-------------+-------------------------------------------------------------
>
-------------------------------------------+------+-----+------------+------
> -+
> > 8 rows in set (0.00 sec)
> >
> > There are 2684094 records on this table.
> >
> > mysql> describe newSequence_Homolog;
> >
>
+------------------+-------------+------+-----+---------+-------+
> > | Field            | Type        | Null | Key |
> > Default | Extra |
> >
>
+------------------+-------------+------+-----+---------+-------+
> > | Seq_ID           | varchar(50) |      | PRI |
> >  |       |
> > | Homolog1_PID     | varchar(20) | YES  | MUL |
> NULL
> >  |       |
> > | Homolog1_Desc    | varchar(50) | YES  | MUL |
> NULL
> >  |       |
> > | Homolog1_Species | varchar(50) | YES  |     |
> NULL
> >  |       |
> > | Homolog2_PID     | varchar(20) | YES  | MUL |
> NULL
> >  |       |
> > | Homolog2_Desc    | varchar(50) | YES  | MUL |
> NULL
> >  |       |
> > | Homolog2_Species | varchar(50) | YES  |     |
> NULL
> >  |       |
> > | Homolog3_PID     | varchar(20) | YES  | MUL |
> NULL
> >  |       |
> > | Homolog3_Desc    | varchar(50) | YES  | MUL |
> NULL
> >  |       |
> > | Homolog3_Species | varchar(50) | YES  |     |
> NULL
> >  |       |
> >
>
+------------------+-------------+------+-----+---------+-------+
> > 10 rows in set (0.00 sec)
> > There are 357944 records in this tables.
> >
> > I've already copied
> > /usr/share/doc/mysql-server-3.23.49/my-huge.cnf as
> > /etc/my.cnf
> >
> > Is there any other thing I can do to improve the
> speed
> > of join? I really hate to merge the two tables
> > together.  I am running MySQL3.23.49 on redhat
> > linux7.3. My MySQL Server has 4 GB memory.
> >
> > Eventually, I need to do (select *) instead of the
> > above select count(*)
> 
=== message truncated ===


__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com

---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to