Use "iostat -x" while the query is running. You are likely I/O bound doing a table scan on the protected (BIG) table.
There has been alot of discussion about RAM and CPU on this thread regarding performance, but nothing regarding disk I/O. If you're going to put tens of millions of records in a database and expect fast random access to them (i.e. not just reading the more recently inserted records). Then you need lots of disks and a good volume manager. Don't try to compensate for lack of disk with lots of RAM. Eventually the DB will be too big to budget for that RAM. Plan the disk requirements before anything else. A good rule of thumb is that a 10K SCSI disk can do about 200 random I/O per second. You can verify your disks using bonnie (http://www.textuality.com/bonnie/) or the like (make sure you are looking at RANDOM I/O not sequential I/O). Next compute the rate of I/O needed. If you are using InnoDB (which you should, otherwise MyISAM's course grain locking will introduce non-linearities into the performance that are too hard to model) then you know I/O is done in 16K pages. How many such pages need to be read per second? Look at all your queries and determine the access plans. Focus on the ones that will dominate I/O requirements. Understanding how InnoDB lays out data is critical for modeling. Each secondary index is in its own B-Tree with leaves containing the value of the primary key. The data itself is in another B-Tree keyed off the primary key. (Very similar to Oracle w/ index organized tables.) Unless you can benefit from clustering around the primary key, it safest to assume one I/O per leaf-item (secondary or primary) as the tables and indexes get large. For back of the envelope calculations its okay to ignore the non-leaf pages and assume they are cached. Hence for a table with primary and secondary keys named PK and SK respectively: "SELECT * FROM FOO WHERE PK = ?" = 1 I/O "SELECT * FROM FOO WHERE SK = ?" = 2 I/O "SELECT FOO.* FROM FOO, BAR WHERE FOO.SK = BAR.PK AND BAR.SK = ?" = 3 I/O Now take all the queries and determine the number of their occurances for some unit of work. That unit of work should correspond to something externally observable event, like "user logins". Say you have 3 queries in the system like this: Query # I/O Occurances Per Unit Work Q1 3 3 Q2 4 2 Q3 1 3 Now you can say that on average it takes 20 I/Os per unit of work. With one disk you can do 200/20 = 10 Units of work / second / disk. To ensure that performance scales with disks, stripe across the disks. This is where volume management becomes key, MySQL/InnoDB fills each datafile sequentially so don't just put each datafile on its own disk, that will just create hotspots and you will be bound by the performance of a single disk. This is the general idea of planning with any database. The previous only covers equality match via an index but you can figure out the difference for tablescans (consider how many rows fit into a page) and range scans (consider the affect of the primary key as a cluster index). -Wayne -----Original Message----- I am not sure. Does anyone know any real examples of mysql handling huge database and still perform well? I am having problems with the performance with the MySQL Left join recently. A big table (about 2.5 million records) left join a small table (about 350K records) takes generally 2 mins to finish. I check the "explain" and primary key index on the small table was indeed used for the joining. My system is Redhat Linux 7.3 with 4 GB memory. I also tried replacing the default my.cnf with my-huge.cnf. It didn't help at all. Another thing, with some linux system, there is a size limit for file. MySQL seems to store each of its table as single file. You need to choose a file system without that limit. Qunfeng Dong --- "B.G. Mahesh" <[EMAIL PROTECTED]> wrote: > > hi > > We are evaluating few databases for developing an > application with > following specs, > > 1. OS not very important. Leaning towards Linux > > 2. Currently the database has about 5 million > records but it will grow > to 120 million records. > > 3. The tables will have billing information for a > telecom company. > Nothing complex. > > 4. Back office staff will use the data in the > database to create > invoices to be sent to customers. This data is not > connected to the > live telecom system [e.g. switches etc]. We get the > data every day > from the telecom company. > > 5. Staff may perform queries on the database to get > reports like > "busiest hour of the day" etc etc. I don't see too > many concurrent > users using the system, however the system needs to > be stable. > > 6. Need to create excel, pdf files from the data in > the database. This > I think has nothing to do with the database, however > this is a requirement. > > 7. Needless to say, good security is a must which > will also be built > into the front end application. > > We are considering the following databases, > > 1. MYSQL > 2. Postgres > 3. Oracle > 4. MSQL > > If MYSQL or Postgres can do the job I prefer not to > spend the money on > Oracle/MSQL. However, if Oracle/MSQL are required > for getting good > reports and scalability, so be it. We will use > Oracle/MSQL. > > Any pointers/advice is appreciated > > > -- > -- > B.G. Mahesh > mailto:[EMAIL PROTECTED] > http://www.indiainfo.com/ > India's first ISO certified portal > > --------------------------------------------------------------------- > Before posting, please check: > http://www.mysql.com/manual.php (the manual) > http://lists.mysql.com/ (the list > archive) > > To request this thread, e-mail > <[EMAIL PROTECTED]> > To unsubscribe, e-mail > <[EMAIL PROTECTED]> > Trouble unsubscribing? Try: > http://lists.mysql.com/php/unsubscribe.php > __________________________________________________ Do you Yahoo!? Yahoo! Mail Plus - Powerful. Affordable. Sign up now. http://mailplus.yahoo.com --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php