If your contacts are all North American, and this is a typical query, you
may well profit significantly from normalizing the address data. To do so,
create a Cities table and a States table (perhaps call it Regions if you
need to account for Canadian provinces). The Cities table would look like:

CityID int auto-increment primary key
CityName char(30) or whatever
RegionID int (references Regions)

The Regions table would look like:

RegionID int auto-increment primary key
RegionName char(30) or whatever
CountryID int (references Countries if you need it)

The Countries table would look like:

CountryID int auto-increment primary key
CountryName char(30)

Now you can move all the city, state and country info out of your contacts
table, leaving only the CityID behind. Index the column.
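
A rough sketch of the DDL, using the column sizes above (the names and types
are just placeholders, adjust to taste):

CREATE TABLE Countries (
  CountryID   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  CountryName CHAR(30)
);

CREATE TABLE Regions (
  RegionID   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  RegionName CHAR(30),
  CountryID  INT                 -- references Countries, if you need it
);

CREATE TABLE Cities (
  CityID   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  CityName CHAR(30),
  RegionID INT,                  -- references Regions
  INDEX (RegionID)
);

-- Then drop the city/state/country columns from your contacts table,
-- keep just the new CityID column, and index it, e.g.:
ALTER TABLE usa_phone_list ADD COLUMN CityID INT, ADD INDEX (CityID);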

In this scheme your query would do a simple join and find the 'NJ' rows much
more quickly.
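
For example, something along these lines (assuming the table and column names
above, with your contacts in usa_phone_list):

SELECT COUNT(*)
FROM usa_phone_list c
JOIN Cities ci  ON ci.CityID  = c.CityID
JOIN Regions r  ON r.RegionID = ci.RegionID
WHERE r.RegionName = 'NJ';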

hth,
Arthur

-----Original Message-----
From: James Taylor [mailto:jtx@;hatesville.com]
Sent: Thursday, October 17, 2002 10:08 PM
To: [EMAIL PROTECTED]
Subject: Suggestions for breaking large table into smaller?


Curious if this is normal behaviour, or if my table is just not put together
very well.  I have a table of approximately 4.5 million contacts, listed by
company name/city/state/zip.  The system this is running on is a PowerEdge
1400sc (1 GHz P3, 1 GB RAM, 18 GB SCSI drive).  These are the results of the
following query:

SELECT COUNT(*) from usa_phone_list where state = 'NJ';
+----------+
| COUNT(*) |
+----------+
|   157104 |
+----------+
1 row in set (29.62 sec)

That seems like an awfully long time to me...  When I do an EXPLAIN on that
query, 'type' comes back as index and 'rows' reports 4886146.  Gr.
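
For reference, the EXPLAIN in question would be run like this; a join type of
'index' with roughly 4.9 million rows means MySQL is scanning the entire index
rather than narrowing down to just the matching rows:

EXPLAIN SELECT COUNT(*) FROM usa_phone_list WHERE state = 'NJ';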

Currently, I'm using one large table to store basically everything.  I'm
trying to figure out whether I should use some sort of normalization, and I
was also thinking of creating 50 different tables, one for each state...  Is
there any downside to having that many tables, or is that unnecessary?
Should a table with millions of entries be able to run this query just as
fast as one with a couple hundred thousand?  Has anyone done anything like
this before, and if so, what turned out to be the best route to take?  I've
never worked on a database this large before, and am not a professional DBA
in the least.  Just looking for some suggestions here on how to handle this.
Thanks for all your help.

James Taylor



---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php



