RE: MySQL/INNODB speed on large databases
For the first query below -- if you really run it often enough to be worth messing with indexes, and it really has a LIMIT 1 or a small limit -- an index on (VoidStatus, InstNum) ought to avoid having MySQL create a big temporary table and then sort it. In addition, you could add to the index any of the columns in the other AND clauses, if doing so would allow a lot of records to be skipped over during the index scan rather than read in their entirety.

> From: "Wendell Dingus" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Cc: <[EMAIL PROTECTED]>
> Subject: RE: MySQL/INNODB speed on large databases
> Date: Wed, 2 Jul 2003 11:51:05 -0400
>
> Thanks to everyone who has helped and/or made suggestions so far. I'll
> try to provide some answers to your further queries and report back on
> some testing I've done.
>
> Jeremy asked for EXPLAINs of some of the problem queries. Here is a
> particularly troublesome one that gets run quite a lot:
>
> mysql> SELECT InstNum FROM TBL_Transactions WHERE ((IndexStatus < '2' OR
> Scanned<>'Y') OR (MoneyStatus <> '1')) AND ((VoidStatus = 'N') AND
> (IndexType <> 'CP') AND (Year > '2001')) ORDER BY InstNum ASC LIMIT 1;
> +----------+
> | InstNum  |
> +----------+
> | 03128665 |
> +----------+
> 1 row in set (6.59 sec)
>
> mysql> explain SELECT InstNum FROM TBL_Transactions WHERE ((IndexStatus
> < '2' OR Scanned<>'Y') OR (MoneyStatus <> '1')) AND ((VoidStatus = 'N')
> AND (IndexType <> 'CP') AND (Year > '2001')) ORDER BY InstNum ASC LIMIT 1;
> +------------------+------+------------------------------------+------------+---------+-------+--------+----------------------------+
> | table            | type | possible_keys                      | key        | key_len | ref   | rows   | Extra                      |
> +------------------+------+------------------------------------+------------+---------+-------+--------+----------------------------+
> | TBL_Transactions | ref  | Year,VoidStatus,IndexStatus,Year_2 | VoidStatus | 2       | const | 150804 | where used; Using filesort |
> +------------------+------+------------------------------------+------------+---------+-------+--------+----------------------------+
> 1 row in set (0.00 sec)
>
> [...]
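A concrete sketch of the index suggested above (the index name is made up; verify with EXPLAIN after adding it). Because VoidStatus = 'N' is an equality condition, rows within that index prefix are already stored in InstNum order, so the ORDER BY InstNum ASC LIMIT 1 can stop at the first qualifying row instead of filesorting ~150,000 candidates:

```sql
-- Sketch only: idx_void_instnum is a hypothetical name;
-- table and column names come from the posted query.
ALTER TABLE TBL_Transactions
  ADD INDEX idx_void_instnum (VoidStatus, InstNum);
```

If the optimizer picks the new key, the EXPLAIN for this query should no longer show "Using filesort" in the Extra column.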
RE: MySQL/INNODB speed on large databases
Thanks to everyone who has helped and/or made suggestions so far. I'll try to provide some answers to your further queries and report back on some testing I've done.

Jeremy asked for EXPLAINs of some of the problem queries. Here is a particularly troublesome one that gets run quite a lot:

mysql> SELECT InstNum FROM TBL_Transactions WHERE ((IndexStatus < '2' OR
Scanned<>'Y') OR (MoneyStatus <> '1')) AND ((VoidStatus = 'N') AND
(IndexType <> 'CP') AND (Year > '2001')) ORDER BY InstNum ASC LIMIT 1;
+----------+
| InstNum  |
+----------+
| 03128665 |
+----------+
1 row in set (6.59 sec)

mysql> explain SELECT InstNum FROM TBL_Transactions WHERE ((IndexStatus
< '2' OR Scanned<>'Y') OR (MoneyStatus <> '1')) AND ((VoidStatus = 'N')
AND (IndexType <> 'CP') AND (Year > '2001')) ORDER BY InstNum ASC LIMIT 1;
+------------------+------+------------------------------------+------------+---------+-------+--------+----------------------------+
| table            | type | possible_keys                      | key        | key_len | ref   | rows   | Extra                      |
+------------------+------+------------------------------------+------------+---------+-------+--------+----------------------------+
| TBL_Transactions | ref  | Year,VoidStatus,IndexStatus,Year_2 | VoidStatus | 2       | const | 150804 | where used; Using filesort |
+------------------+------+------------------------------------+------------+---------+-------+--------+----------------------------+
1 row in set (0.00 sec)

Thanks to Joseph Bueno for suggesting the 4.x query cache: I took the above query and, on a test server running 4.0.13, I set up a 1MB query cache and tried it out. It took >6 seconds the first time and 0.00 seconds on subsequent runs. I'm assuming this cache is smart enough to re-run the query if any data pertaining to it changes, yeah surely... So on often-executed queries where the data is very cacheable, this will help.
After a few minutes of monitoring, this one floats to the top of a mytop output screen as taking the longest to run:

mysql> explain SELECT DISTINCT LastName, FirstName, PAName FROM
TBL_AllNames WHERE PAName LIKE 'WHITE%' AND NameType<'2' ORDER BY
LastName, FirstName;
+--------------+-------+-----------------+--------+---------+------+-------+-----------------------------+
| table        | type  | possible_keys   | key    | key_len | ref  | rows  | Extra                       |
+--------------+-------+-----------------+--------+---------+------+-------+-----------------------------+
| TBL_AllNames | range | PAName,NameType | PAName | 81      | NULL | 41830 | where used; Using temporary |
+--------------+-------+-----------------+--------+---------+------+-------+-----------------------------+
1 row in set (0.00 sec)

Running the actual query returned >4000 rows and took 58.20 sec. Here are some details of that table:

mysql> describe TBL_AllNames;
+---------------+-------------+------+-----+---------+-------+
| Field         | Type        | Null | Key | Default | Extra |
+---------------+-------------+------+-----+---------+-------+
| InstNum       | varchar(8)  |      | PRI |         |       |
| Year          | varchar(4)  |      | PRI |         |       |
| NameType      | char(2)     |      | PRI |         |       |
| NameClass     | char(1)     | YES  | MUL | NULL    |       |
| NameAP        | char(1)     | YES  |     | NULL    |       |
| Ncount        | int(11)     |      | PRI | 0       |       |
| LastName      | varchar(80) | YES  | MUL | NULL    |       |
| FirstName     | varchar(60) | YES  | MUL | NULL    |       |
| TypeofName    | varchar(20) | YES  |     | NULL    |       |
| PAName        | varchar(80) | YES  | MUL | NULL    |       |
| SoundKeyFirst | varchar(12) | YES  | MUL | NULL    |       |
| SoundKeyLast  | varchar(12) | YES  | MUL | NULL    |       |
| RecDate       | varchar(8)  |      | MUL |         |       |
| InstCode      | varchar(10) |      | MUL |         |       |
| IndexType     | varchar(4)  |      |     |         |       |
| XrefGroup     | varchar(8)  |      |     |         |       |
+---------------+-------------+------+-----+---------+-------+
16 rows in set (0.00 sec)

mysql> select count(*) from TBL_AllNames;
+----------+
| count(*) |
+----------+
|  6164129 |
+----------+
1 row in set (50.17 sec)

Thanks in advance!

PS. I'm still very interested in *paying* MySQL to help analyze and suggest ways we can make the queries faster. Again though, I just want to point the *soon* hardware upgrade purchases in the right direction and get that all settled down first. Opterons look nice, but with a database size topping 29GB today, I think enough RAM to cache a sizable portion of it will be cost prohibitive.
Could still be a possibility, though... I'm still leaning towards a load-balanced setup with backend/"real" servers having either 15K SCSI drives in RAID-0 or possibly 10K SATA drives, for cost reasons. Again, thanks!

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
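One possible attack on the slow DISTINCT query above (a sketch, untested against this schema; the index name is made up) is a covering index, so the 41,830-row range scan on PAName LIKE 'WHITE%' can be answered from the index alone instead of doing a row lookup per match:

```sql
-- Covering index sketch: PAName leads so the LIKE 'WHITE%' prefix can
-- still use a range scan; NameType lets the < '2' filter be checked in
-- the index, and LastName/FirstName cover the SELECT list so the base
-- rows never need to be touched.
ALTER TABLE TBL_AllNames
  ADD INDEX idx_paname_cover (PAName, NameType, LastName, FirstName);
```

The ORDER BY LastName, FirstName would still need a sort (the index is ordered by PAName, not LastName), but it would sort only the matching index entries rather than 4000+ full rows fetched one by one.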
RE: MySQL/INNODB speed on large databases
On Sat, Jun 28, 2003 at 01:47:28PM -0400, Wendell Dingus wrote:
> Hello, I've got a situation I'm hoping someone here can help me out with. We
> have a web server serving up data via some PHP programs pulling from MySQL
> (3.23.56 currently) in INNODB tables. There are 40+ (active) databases
> totalling a bit over 28 Gigabytes at this point. As we add more users speed
> is beginning to drop...
>
> Currently all this is on a single dual P3 1.26Ghz IBM eServer with 4GB of
> RAM. 4 x 15K RPM drives RAID5. It's hitting load averages >12 during peak
> times and database query speeds are really starting to drag...

P3s are an ancient architecture now, but I have found transactions and full-text searching to be very CPU intensive.

> I've talked some with the consulting folks at MySQL and have the intention
> of contracting them to help analyze our database layout, structure, and
> queries to help optimize things.

Hmm, the most common thing is to send the list an EXPLAIN and maybe the table layout; someone may help out. It could be an indexing problem.

> My quandary is this though.. The existing server has 4GB but I can't get
> innodb_buffer_pool_size above about 1.2GB. I'd buy tons more RAM and go that
> route if it could be utilized, I've not figured out how though and assume
> that's the deal with a 32-bit CPU. Reading some about 64-bit Opterons and a
> 64-bit address space sounds good.

Hey, funny you mentioned that. I had a read at Tom's Hardware, and I must recommend this as a worthy read: http://www17.tomshardware.com/cpu/20030422/index.html

Basically, in a Unix environment the Opteron blew all over the Xeon, especially in the MySQL benchmark, so Opteron/Linux is going to be the recommended cheaper alternative server to Sun/Solaris. Obviously the Xeon kicked ass in rendering situations on an XP desktop, as it's optimised for that, so it depends on the situation. I'm currently investigating the right Xeon chip and mobo for an XP rendering workstation for audio/video stuff, but if I were to build a Unix server: Opteron all the way.

> I have the boss's ear for hardware _now_ though and want to move
> forward before that opportunity passes. I'll post the database specs
> at the end of this. If you were hosting that much data without a ton
> of inserts all day long but lots of selects where would you
> concentrate your efforts? Speed of CPU or speed/quantity of RAM or
> speed of drives or what?

Maybe sending a grab of top or something would be good.

> Mytop output after about 28 hours of uptime so this is just Friday during
> business hours access primarily:
>
> MySQL on localhost (3.23.56-Max-log)  up 1+04:56:30 [13:29:15]
>  Queries Total: 1,783,317  Avg/Sec: 17.12  Now/Sec: 0.40  Slow: 1,046
>  Threads Total: 33  Active: 1  Cached: 0
>  Key Efficiency: 99.99%  Bytes in: 779,059,153  Bytes out: 543,489,706

Hmm, mytop is a good app.
Re: MySQL/INNODB speed on large databases
On Sat, Jun 28, 2003 at 01:47:28PM -0400, Wendell Dingus wrote:
> Hello, I've got a situation I'm hoping someone here can help me out with. We
> have a web server serving up data via some PHP programs pulling from MySQL
> (3.23.56 currently) in INNODB tables. There are 40+ (active) databases
> totalling a bit over 28 Gigabytes at this point. As we add more users speed
> is beginning to drop...
>
> Currently all this is on a single dual P3 1.26Ghz IBM eServer with 4GB of
> RAM. 4 x 15K RPM drives RAID5. It's hitting load averages >12 during peak
> times and database query speeds are really starting to drag...

Uh oh.

> I've talked some with the consulting folks at MySQL and have the intention
> of contracting them to help analyze our database layout, structure, and
> queries to help optimize things.

Smart move.

> My quandary is this though.. The existing server has 4GB but I can't get
> innodb_buffer_pool_size above about 1.2GB. I'd buy tons more RAM and go that
> route if it could be utilized, I've not figured out how though and assume
> that's the deal with a 32-bit CPU. Reading some about 64-bit Opterons and a
> 64-bit address space sounds good.

Yeah, if you want more stuff in memory, look seriously at the Opterons.

> I'm sure I can get the LVS approach with a load balancer and 3 or 4 back-end
> "real" servers going without too much trouble. Where do I spend the money
> most wisely though? More and more and more RAM in each backend server or is
> anything over 2GB going to mostly not be used (probably P4 2.4Ghz XEON
> boxes)? Drive speed? I'm thinking I'll test out using fast SCSI drives and
> maybe some SATA drives, RAID-0 striped versus non-striped, to see the speed
> difference. My whole goal is speed, speed, and more speed out of MySQL!

That's what I'd suggest. Get some SCSI disks with really low seek times (4ms) and stripe using RAID-0.

> The actual web applications place a very small load on the server; viewing top
> or mytop shows that database access is where all the time is going, and
> primarily on the 2 largest databases. A query against one of the other
> databases when the server is mainly idle happens in a few seconds or
> tenths. The same query (data layout is basically identical across all the
> databases) sometimes takes >2 minutes on either of the largest 2 databases.

Can you post the query and the result of running it through EXPLAIN?

> I have the boss's ear for hardware _now_ though and want to move
> forward before that opportunity passes. I'll post the database specs
> at the end of this. If you were hosting that much data without a ton
> of inserts all day long but lots of selects where would you
> concentrate your efforts? Speed of CPU or speed/quantity of RAM or
> speed of drives or what?

You haven't told us how busy the CPU gets, so I'm going to assume it's *not* the bottleneck. I'd focus on disk performance and having sufficient memory.

> Would you load balance a quantity of servers or lean more towards a
> couple of much larger 64-bit servers? Running entirely out of RAM
> is very appealing but somehow I suspect that will be cost
> prohibitive :) Just buy reasonable hardware, load-balance it, and
> then get the MySQL folks to help optimize things?

That's the approach we use at work and it works well. It keeps the costs down and makes [most] scaling pretty easy.

> Mytop output after about 28 hours of uptime so this is just Friday during
> business hours access primarily:
>
> MySQL on localhost (3.23.56-Max-log)  up 1+04:56:30 [13:29:15]
>  Queries Total: 1,783,317  Avg/Sec: 17.12  Now/Sec: 0.40  Slow: 1,046
>  Threads Total: 33  Active: 1  Cached: 0
>  Key Efficiency: 99.99%  Bytes in: 779,059,153  Bytes out: 543,489,706

Ugh. Only 17 queries/sec average? Hmm.

Jeremy
--
Jeremy D. Zawodny | Perl, Web, MySQL, Linux Magazine, Yahoo!
<[EMAIL PROTECTED]> | http://jeremy.zawodny.com/
MySQL 4.0.13: up 25 days, processed 813,104,760 queries (362/sec. avg)
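For reference, the ~1.2GB buffer-pool ceiling discussed in this thread is a 32-bit process address-space limit (roughly 2-3GB of virtual address space per process on Linux/x86), not a MySQL setting. A my.cnf fragment along these lines (values are illustrative only, not a recommendation for any particular workload) is about as far as a 4GB 32-bit box can be pushed, since the log buffer, key buffer, and per-connection buffers all share the same address space as the InnoDB buffer pool:

```ini
[mysqld]
# Illustrative values for a 4GB 32-bit box; tune to your workload.
innodb_buffer_pool_size         = 1200M
innodb_additional_mem_pool_size = 20M
innodb_log_buffer_size          = 8M
# Per-connection buffers are allocated per thread, so keep them modest:
# sort_buffer_size x max_connections also comes out of the same 2-3GB.
sort_buffer_size                = 2M
```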
Re: MySQL/INNODB speed on large databases
OK, the picture is now a little bit simpler:

1. Your performance problem is coming mainly from one BIG database.
2. Although your servers have a lot of RAM (at least 4GB), InnoDB can only use a small fraction of it because of the addressing limitations of 32-bit CPUs.

Some random ideas:

1. Switch to a 64-bit system so that you can efficiently use more RAM.
2. Do all tables in this BIG database need to be transaction safe? If not, you can convert some (all?) tables to MyISAM format. In that case, file caching is done by the kernel outside of the mysqld process, allowing more efficient use of RAM.
3. Sponsor Innobase so that InnoDB can use AWE memory on Linux/x86 (up to 64GB). This has already been discussed on this mailing list:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&th=cddc8bdca9e4304a&seekm=bbkmji%2417qt%241%40FreeBSD.csie.NCTU.edu.tw
4. Make sure that you are really limited by RAM and disk I/O, not CPU; otherwise you won't gain anything with more RAM.
5. If you have more selects than updates/inserts, you should consider replication (I use it, it rocks!).
6. If you have more selects than updates/inserts and identical selects are run several times, you could consider upgrading to MySQL 4.x. Its query cache may speed up your server (I have not tried it myself, but there have been positive reports on this list).

regards,
Joseph Bueno

Wendell Dingus wrote:
> Ahh, there's something I could have explained better. Each of the databases
> represents a remote office of a different customer. I specifically want/need
> to keep the data separate; it doubles as an offsite backup for these offices
> and would conflict if it were merged. [...]
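Suggestions 2 and 6 above can be sketched concretely. The table name below is a placeholder, TYPE= is the 3.23/4.0 spelling (later versions use ENGINE=), and the cache settings assume a 4.x server, since the query cache does not exist in 3.23:

```sql
-- Suggestion 2: move a table that needs no transactions to MyISAM, so
-- its data files are cached by the kernel's page cache, outside the
-- mysqld address space. Placeholder table name; pick real candidates.
ALTER TABLE some_read_mostly_table TYPE=MyISAM;
```

```ini
# Suggestion 6: enable the 4.x query cache (illustrative size).
query_cache_size = 32M
query_cache_type = 1
```

Note that the conversion takes a write lock and rewrites the whole table, so it is best done off-hours on tables this large.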
RE: MySQL/INNODB speed on large databases
Ahh, there's something I could have explained better. Each of the databases represents a remote office of a different customer. I specifically want/need to keep the data separate; it doubles as an offsite backup for these offices and would conflict if it were merged. I currently use replication to get the data from those offices up to my web server, but not exactly in the way it was intended. I just turn on binary logging, and a process every 30 minutes or so checks the latest /var/lib/mysql/systemname-bin.nnn to see if it contains anything, starts up a new binary log file, uses mysqlbinlog to grab the contents of that one, and sends them to the web server, which integrates them with its offsite copy. Works great, since some offices have live internet connections and others are dialup, etc...

Now then, I could divvy things up across multiple servers, except that that one largest database is almost as slow at the customer's office with live local (non-web) usage, and that server has over 5GB of RAM. Similar specs to the web server I described otherwise, and it only has that ONE database, not the ones of all the other customers. Anyway, beyond that, the LVS approach would still involve having 1 master write server and all the backend "real" servers being read-only (to the application) and kept updated via replication slaving from the master. Just across multiple actual databases, not one... From what I've read so far, that is! Thanks!

-----Original Message-----
From: Joseph Bueno [mailto:[EMAIL PROTECTED]
Sent: Saturday, June 28, 2003 2:45 PM
To: Wendell Dingus
Cc: [EMAIL PROTECTED]
Subject: Re: MySQL/INNODB speed on large databases

Maybe my question is too obvious, but since you have several databases instead of only one, have you considered moving some of them to a separate server? Also, have you checked MySQL's replication capabilities? It is a very efficient way to distribute the load across several servers, unless you do a lot of inserts/updates and very few selects.

Regards,
Joseph Bueno

Wendell Dingus wrote:
> Hello, I've got a situation I'm hoping someone here can help me out with. We
> have a web server serving up data via some PHP programs pulling from MySQL
> (3.23.56 currently) in INNODB tables. There are 40+ (active) databases
> totalling a bit over 28 Gigabytes at this point. As we add more users speed
> is beginning to drop...
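The 30-minute log-shipping cycle described above might look roughly like this as a shell sketch (host name, database name, credentials, and the binlog path are placeholders, not the poster's actual setup). FLUSH LOGS closes the current binary log and opens a new one, so the just-closed log can be replayed on the web server with mysqlbinlog:

```sh
#!/bin/sh
# Sketch of the described cycle; placeholder host/db/credentials.
# Rotate to a fresh binary log; the previous one is now complete.
mysqladmin --user=root --password=secret flush-logs

# The second-newest binlog file is the one just closed by flush-logs.
LAST=$(ls -1 /var/lib/mysql/systemname-bin.* | sort | tail -2 | head -1)

# Replay it against the web server's offsite copy of this office's db.
mysqlbinlog "$LAST" | mysql --host=webserver --user=repl --password=secret officedb
```

A real script would also need to handle empty logs, transfer over dialup links, and failed replays before deleting anything, as the post implies.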
Re: MySQL/INNODB speed on large databases
Maybe my question is too obvious, but since you have several databases instead of only one, have you considered moving some of them to a separate server? Also, have you checked MySQL's replication capabilities? It is a very efficient way to distribute the load across several servers, unless you do a lot of inserts/updates and very few selects.

Regards,
Joseph Bueno

Wendell Dingus wrote:

Hello, I've got a situation I'm hoping someone here can help me out with. We have a web server serving up data via some PHP programs pulling from MySQL (3.23.56 currently) in INNODB tables. There are 40+ (active) databases totalling a bit over 28 Gigabytes at this point. As we add more users, speed is beginning to drop...

Currently all this is on a single dual P3 1.26Ghz IBM eServer with 4GB of RAM and 4 x 15K RPM drives in RAID5. It's hitting load averages >12 during peak times, and database query speeds are really starting to drag...

I've talked some with the consulting folks at MySQL and have the intention of contracting them to help analyze our database layout, structure, and queries to help optimize things. Dealing with belt-tightening management on the other end, though, I want to get some new software and hardware rolled out before doing that. I've been spending time working with a test setup using the load-balancer routines at www.linuxvirtualserver.com and like what I see so far.

My quandary is this, though.. The existing server has 4GB, but I can't get innodb_buffer_pool_size above about 1.2GB. I'd buy tons more RAM and go that route if it could be utilized; I've not figured out how, though, and assume that's the deal with a 32-bit CPU. Reading some about 64-bit Opterons and a 64-bit address space sounds good.

I'm sure I can get the LVS approach with a load balancer and 3 or 4 back-end "real" servers going without too much trouble. Where do I spend the money most wisely, though? More and more and more RAM in each backend server, or is anything over 2GB going to mostly not be used (probably P4 2.4Ghz XEON boxes)? Drive speed? I'm thinking I'll test out using fast SCSI drives and maybe some SATA drives, RAID-0 striped versus non-striped, to see the speed difference. My whole goal is speed, speed, and more speed out of MySQL!

The actual web applications place a very small load on the server; viewing top or mytop shows that database access is where all the time is going, and primarily on the 2 largest databases. A query against one of the other databases when the server is mainly idle happens in a few seconds or tenths. The same query (data layout is basically identical across all the databases) sometimes takes >2 minutes on either of the largest 2 databases. So you see my concern? A 2-minute query on an _unloaded_ server is not going to improve no matter how many servers I put behind a load balancer. Again, I'll pay the MySQL folks to analyze and hopefully come up with lots of braindead things we're doing and help performance a lot.

I have the boss's ear for hardware _now_, though, and want to move forward before that opportunity passes. I'll post the database specs at the end of this. If you were hosting that much data without a ton of inserts all day long but lots of selects, where would you concentrate your efforts? Speed of CPU, speed/quantity of RAM, speed of drives, or what? Would you load balance a quantity of servers or lean more towards a couple of much larger 64-bit servers? Running entirely out of RAM is very appealing, but somehow I suspect that will be cost prohibitive :) Just buy reasonable hardware, load-balance it, and then get the MySQL folks to help optimize things? I sure don't want to spend a ton and then be told later that it would have been much better had I gone with a different setup, if you know what I mean :) Thanks in advance! If any additional statistics would help, let me know!

PS. Any pointers to any approaches to what I think I read about Google doing, _striping_ database data across multiple servers for speed? Also, I'm unfamiliar with them, but do PostgreSQL or Oracle or DB2 or something like that have anything that would specifically work better in this scenario?

Mytop output after about 28 hours of uptime, so this is just Friday during business hours access primarily:

MySQL on localhost (3.23.56-Max-log)  up 1+04:56:30 [13:29:15]
 Queries Total: 1,783,317  Avg/Sec: 17.12  Now/Sec: 0.40  Slow: 1,046
 Threads Total: 33  Active: 1  Cached: 0
 Key Efficiency: 99.99%  Bytes in: 779,059,153  Bytes out: 543,489,706

The databases themselves, from phpMyAdmin (this one is the main killer: 40 tables, 1.8 GB data, 5.5 GB indexes, 7.3 GB total):

Database   Table(s)   Data      Indexes    Total
xxx        0          0 Bytes   0 Bytes    0 Bytes
xx         44         88.4 KB   182.0 KB   270.4 KB
x          21         14.8 MB   10.0 MB    24.8 MB
mysql      61.5 KB