Why isn't Seq_ID not an unsigned int? Primary key should always be something generated by the system that has no other significance than being a primary key. If there actually is a seq_id piece of data that has some other significance, I wouldn't use it to link all your data. Then you don't have to have a compound index (Seq_ID, Homolog_PID) in your NewSequence_Homolog table.

I could understand why you may need to use text for the Comment, but would char(255) be good enough for the title? Then you could split Comment out to a separate table so you can get fixed length records. If comments should always be displayed with the data, then perhaps putting a flag field in the table to indicate there are comments. Then comments can be viewed individually.

Your simple left join does seem to be taking quite a long time. I wouldn't consider myself an expert in MySQL, but I would think that your index key length of 50 has to be slowing things down. The left join example you have is also scanning an entire table of 2676711 records. I don't know how big the table is in disk space, but I would guess the entire table is not cached in RAM.

A few weeks back I did read something about a company that was doing modeling on grain falling in a silo or something very complex like fluid dynamics. They were having severe performance issue where it would take 10 hours to model something. They used all the various Unix tools to determine where the bottleneck was (disk, memory, or CPU). It was RAM and disk I/O (due to low RAM) that was slowing things down. They started adding disks for scratch areas and virtual memory and made sure there were no hot disks. They knocked about two hours off of the time. They then added a whole bunch of RAM so the entire table could be loaded into RAM and got things down to something like 17 minutes.
Regardless of whether you are using Oracle or MySQL, you are still limited by the hardware you are running it on. Finding what the bottleneck is (disk, memory, cpu, or network) is the key.

On Wednesday, December 18, 2002, at 04:17 PM, Qunfeng Dong wrote:

This NewSequence table is used to track some general
info about sequence. Notice I have to use text
datatype to describe "Comment" and "Seq_Title" fields;
therefore I have to use varchar for other string
fields. In addition, the Seq_ID is not numerical.
BTW, I found indexing on Seq_Type. Organism which are
very repeative still helps with accessing. This table
has 2676711 rows.

--
Brent Baisley
Systems Architect
Landover Associates, Inc.
Search & Advisory Services for Advanced Technology Environments
p: 212.759.6400/800.759.0577


---------------------------------------------------------------------
Before posting, please check:
  http://www.mysql.com/manual.php   (the manual)
  http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to