Thanks! Re: Can MySQL handle 120 million records? - Ok, If you guys really can handle tens of millions records, you have to help me to enjoy MySQL too :-)

Qunfeng Dong Thu, 19 Dec 2002 17:53:50 -0800

Well, thanks to all of your great help! I am able to
speed up the query {select count(*) from NEW_Sequence
s left join NEW_Sequence_Homolog h on s.Seq_ID =
h.Seq_ID;} from 1 min 52.61 sec down to 20.62 sec now.
The only thing I changed so far was the Seq_ID from
type varchar to bigint. The Seq_ID was not all
numerical for different type of Sequences; but I
managed to assign numerical code to those
non-numerical ones now.


Qunfeng

> > CREATE TABLE NewSequence
> > (
> >         Seq_ID          varchar(50)     NOT NULL,
> >         GenBank_Acc     varchar(10),
> >         Organism        varchar(50)     NOT NULL,
> >         Seq_Type  enum("EST","GSS","EST
> Contig","EST
> > Singlet","GSS Contig","GSS Singlet","GSS Plasmid
> > Contig","Protein") NOT NULL,
> >         Seq_Length      int             NOT NULL,
> >         Seq_Title       text            NOT NULL,
> >         Comment         text,
> >         Entry_Date      date            NOT NULL,
> >         PRIMARY KEY     (Seq_ID),
> >         UNIQUE  (GenBank_Acc),
> >         INDEX (Seq_Type),
> >         INDEX (Organism)
> > );
> >
> > This NewSequence table is used to track some
> general
> > info about sequence. Notice I have to use text
> > datatype to describe "Comment" and "Seq_Title"
> fields;
> > therefore I have to use varchar for other string
> > fields. In addition, the Seq_ID is not numerical.
> > BTW, I found indexing on Seq_Type. Organism which
> are
> > very repeative still helps with accessing. This
> table
> > has 2676711 rows.
> >
> >
> > CREATE TABLE NewSequence_Homolog
> > (
> >         Seq_ID          varchar(50)     NOT NULL,
> >         Homolog_PID     int             NOT NULL,
> >         Homolog_Desc    varchar(50)     NOT NULL,
> >         Homolog_Species varchar(50),
> >         PRIMARY KEY     (Seq_ID, Homolog_PID)
> > );
> >
> > This NewSequence_Homolog table is to track which
> > protein sequences (homolog) are similar to the
> > sequence I store in the NewSequence table. This
> table
> > has 997654 rows.
> >
> > mysql> select count(*) from NewSequence s left
> join
> > NewSequence_Homolog h on s.Seq_ID = h.Seq_ID;
> > +----------+
> > | count(*) |
> > +----------+
> > |  3292029 |
> > +----------+
> > 1 row in set (1 min 30.50 sec)
> >
> > So a simple left join took about 1 min and half.
> > First, is this slow or I am too picky?
> >
> > This is the "Explain".
> > mysql> explain select count(*) from NewSequence s
> left
> > join NewSequence_Homolog h on s.Seq_ID = h.Seq_ID;
> >
>
+-------+-------+---------------+---------+---------+----------+---------+--
> -----------+
> > | table | type  | possible_keys | key     |
> key_len |
> > ref      | rows    | Extra       |
> >
>
+-------+-------+---------------+---------+---------+----------+---------+--
> -----------+
> > | s     | index | NULL          | PRIMARY |     
> 50 |
> > NULL     | 2676711 | Using index |
> > | h     | ref   | PRIMARY       | PRIMARY |     
> 50 |
> > s.Seq_ID |    9976 | Using index |
> >
>
+-------+-------+---------------+---------+---------+----------+---------+--
> -----------+
> >
> >
> > I am running MySQL 3.23.49 on RedHat linux 7.3 on
> a
> > dedicated server with 4 GB memory. The only
> setting I
> > changed is to copy the my-huge.cnf into
> /etc/my.cnf.
> >
> > Qunfeng
> >
> > --- "Michael T. Babcock" <[EMAIL PROTECTED]>
> > wrote:
> > > Qunfeng Dong wrote:
> > >
> > > >not-so-good performance (join on tables much
> > > smaller
> > > >than yours takes minutes even using index) and
> I
> > > seem
> > > >to read all the docs I could find on the web
> about
> > > how
> > > >to optimize but they are not working for me (I
> am
> > > >
> > >
> > > Have you stored a slow query log to run them
> through
> > > 'explain' and see
> > > why they're slow?  Do you want to post some of
> them
> > > here so we can
> > > suggest what might be done to make them faster?
> 
=== message truncated ===


__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com

---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Thanks! Re: Can MySQL handle 120 million records? - Ok, If you guys really can handle tens of millions records, you have to help me to enjoy MySQL too :-)

Reply via email to