Re: Query optimization help
First, you might want to move the WHERE ... t3.int_a = some integer condition into the join condition for t3. You're not using anything from t4, so I'm not sure why you have that table in your query.

You can suggest or force mysql to use an index if it's using the wrong one: http://dev.mysql.com/doc/refman/5.0/en/index-hints.html On very rare occasions I've had to do this. What's happening is that mysql is analyzing the information it has about the data and indexes and coming to the wrong conclusion, perhaps even opting for a full table scan. You can run ANALYZE TABLE to force mysql to update the information it has about the data; this alone may solve your problem. Try SHOW INDEX FROM t1 to see what data mysql has about the indexes. Sometimes the CARDINALITY (uniqueness) column will be NULL, which can indicate a problem. Posting the result of your EXPLAIN would also be helpful.

Hope that helps.

Brent Baisley

On Aug 11, 2008, at 8:26 AM, Jonathan Terhorst wrote:

I have this query:

SELECT DISTINCT t1.string FROM t1
LEFT JOIN t2 ON t1.string=t2.string
LEFT JOIN t3 ON t1.int_a=t3.int_a
LEFT JOIN t4 ON t1.int_b=t4.int_b
WHERE t1.string != ''
AND t2.string IS NULL
AND t3.int_a = some integer
ORDER BY t1.string ASC

This query is executing slower than it should. EXPLAIN has it using temporary and using filesort. I have indexes on every column in the query, but I think the problem is the one-index-per-table limitation. According to EXPLAIN, there are two possible indices in use for t1 (int_a and string), but only int_a is being used. So I tried constructing a compound index on int_a and string. Although this new index appears in possible_keys, EXPLAIN still shows the key actually being used as int_a. I tried building the compound key in both orders and had the same results. How do I get mysql to use all possible keys on t1 when running the query? Thanks!
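To make that concrete, here is a rough sketch of the rewrite and maintenance statements Brent describes. This is an illustration only: the index name int_a_string is hypothetical, and 42 stands in for the poster's "some integer".

ANALYZE TABLE t1;          -- refresh the statistics the optimizer uses
SHOW INDEX FROM t1;        -- check the Cardinality column for NULLs

SELECT DISTINCT t1.string
FROM t1
LEFT JOIN t2 ON t1.string = t2.string
LEFT JOIN t3 ON t1.int_a = t3.int_a AND t3.int_a = 42  -- filter moved into the join; t4 dropped
WHERE t1.string != ''
  AND t2.string IS NULL
ORDER BY t1.string ASC;

-- If the optimizer still picks the wrong index, hint it, e.g.:
-- SELECT ... FROM t1 FORCE INDEX (int_a_string) LEFT JOIN ...

Whether the hint helps depends entirely on the data distribution, which is why it is a last resort after ANALYZE TABLE.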
Query optimization help
I have this query:

SELECT DISTINCT t1.string FROM t1
LEFT JOIN t2 ON t1.string=t2.string
LEFT JOIN t3 ON t1.int_a=t3.int_a
LEFT JOIN t4 ON t1.int_b=t4.int_b
WHERE t1.string != ''
AND t2.string IS NULL
AND t3.int_a = some integer
ORDER BY t1.string ASC

This query is executing slower than it should. EXPLAIN has it using temporary and using filesort. I have indexes on every column in the query, but I think the problem is the one-index-per-table limitation. According to EXPLAIN, there are two possible indices in use for t1 (int_a and string), but only int_a is being used. So I tried constructing a compound index on int_a and string. Although this new index appears in possible_keys, EXPLAIN still shows the key actually being used as int_a. I tried building the compound key in both orders and had the same results. How do I get mysql to use all possible keys on t1 when running the query? Thanks!
Ugly sql optimization help?
I have the following horrible sql. I need one result that has all the data in one row. I am currently using 3 subqueries and figure there must be a better way...

SELECT 'FS_DEV',
  ifnull(a.severity, 0) AS aseverity, ifnull(a.eventid, 0) AS aeventid,
  ifnull(a.description, 'None') AS adescription, ifnull(a.hostid, 0) AS ahostid,
  ifnull(a.color,'#E5E0DC') AS acolor, ifnull(a.fontcolor,0) AS afontcolor,
  ifnull(a.severitydesc,'None') AS aseveritydesc,
  ifnull(p.severity, 0) AS pseverity, ifnull(p.eventid, 0) AS peventid,
  ifnull(p.description, 'None') AS pdescription, ifnull(p.hostid, 0) AS phostid,
  ifnull(p.color,'#E5E0DC') AS pcolor, ifnull(p.fontcolor,0) AS pfontcolor,
  ifnull(p.severitydesc,'None') AS pseveritydesc,
  ifnull(s.severity, 0) AS sseverity, ifnull(s.eventid, 0) AS seventid,
  ifnull(s.description, 'None') AS sdescription, ifnull(s.hostid, 0) AS shostid,
  ifnull(s.color,'#E5E0DC') AS scolor, ifnull(s.fontcolor,0) AS sfontcolor,
  ifnull(s.severitydesc,'None') AS sseveritydesc
FROM
  (SELECT e.severity, e.eventid, e.description, e.hostid, fs.color, fs.fontcolor, fs.severitydesc
   FROM fs_events e, fs_severity fs
   WHERE hostid in (21,22,23,24,15,16) and e.severity = fs.severityid
   AND e.event_perf = 1
   AND e.time_stamp >= date_sub(now(), INTERVAL 30 DAY)
   AND e.acknowledged in (0,1)
   ORDER BY e.severity DESC, e.time_stamp DESC LIMIT 0, 1) p,
  (SELECT e.severity, e.eventid, e.description, e.hostid, fs.color, fs.fontcolor, fs.severitydesc
   FROM fs_events e, fs_severity fs
   WHERE hostid in (21,22,23,24,15,16) and e.severity = fs.severityid
   AND e.event_avail = 1
   AND e.time_stamp >= date_sub(now(), INTERVAL 30 DAY)
   AND e.acknowledged in (0,1)
   ORDER BY e.severity DESC, e.time_stamp DESC LIMIT 0, 1) a,
  (SELECT e.severity, e.eventid, e.description, e.hostid, fs.color, fs.fontcolor, fs.severitydesc
   FROM fs_events e, fs_severity fs
   WHERE hostid in (21,22,23,24,15,16) and e.severity = fs.severityid
   AND e.event_sec = 1
   AND e.time_stamp >= date_sub(now(), INTERVAL 30 DAY)
   AND e.acknowledged in (0,1)
   ORDER BY e.severity DESC, e.time_stamp DESC LIMIT 0, 1) s
optimization help
I have a table in my database (currently) that grows to be huge (and I need to keep the data). I'm in a redesign phase and I'm trying to do it right. So here are the relevant details. The table has several keys involved:

mysql> desc data__ProcessedDataFrames;
+------------------------+------------------+------+-----+---------+----------------+
| Field                  | Type             | Null | Key | Default | Extra          |
+------------------------+------------------+------+-----+---------+----------------+
| processed_id           | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| top_level_product_name | varchar(255)     | YES  | MUL | NULL    |                |
| test_id                | int(10) unsigned | YES  | MUL | NULL    |                |
| p_time                 | double           | YES  | MUL | NULL    |                |
| universal_time         | double           | YES  |     | NULL    |                |
| processed_data         | mediumblob       | YES  |     | NULL    |                |
+------------------------+------------------+------+-----+---------+----------------+
6 rows in set (0.00 sec)

This is the table that contains the data I'm interested in currently. Queries on this table when it gets large are slow as molasses. I'm thinking about making a new table for anything with a different test_id. Any opinions as to whether this is good or bad? Before you make fun of me for my questions, I'm a bit new to database programming.

If it is better design to break it into smaller tables (for speed anyway), then I would need to know how to query over multiple tables as though they were one table. Join will do this, but that takes forever (unless of course I may be doing this wrong), so that's not a good option. I need to be able to query over multiple test_ids, which will be multiple tables, for a specific top_level_product_name, within some time range (using p_time).

Any help would be appreciated. I will happily give more information if you need it to offer an educated opinion.

Thanks

David Godsey
Re: optimization help
On 6/27/07, Dave G [EMAIL PROTECTED] wrote: Queries on this table when it gets large are slow as molasses. I'm thinking about making a new table for anything with a different test_id. Any opinions as to whether this is good or bad?

Hi Dave G.,

We need to know: a) how large the table might grow, b) the queries you'll be executing on the table, and c) the anticipated size of the result sets (from the queries). Generally speaking, you are shooting for O(log N) on the queries. You need to be sure that the queries you are issuing, combined with the database design, will allow that. So, I need to know the specific queries.

Dave A.
Re: optimization help
Good Afternoon David

It sounds as if you have a number of non-unique indices (even possibly FTS!) slowing down queries. This should help you concentrate on the slower indices:

mysql> select TABLE_NAME,COLUMN_NAME,INDEX_NAME from INFORMATION_SCHEMA.STATISTICS where NON_UNIQUE=1;

Anyone else?

Martin

- Original Message -
From: Dave G [EMAIL PROTECTED]
To: mysql@lists.mysql.com
Sent: Wednesday, June 27, 2007 11:32 AM
Subject: optimization help

I have a table in my database (currently) that grows to be huge (and I need to keep the data). I'm in a redesign phase and I'm trying to do it right. [snip]
Re: optimization help
I think I can do that. I don't have any other indexes, just the keys.

mysql> show create table data__ProcessedDataFrames;

CREATE TABLE `data__ProcessedDataFrames` (
  `processed_id` int(10) unsigned NOT NULL auto_increment,
  `top_level_product_name` varchar(255) default NULL,
  `test_id` int(10) unsigned default NULL,
  `payload_time` double default NULL,
  `universal_time` double default NULL,
  `processed_data` mediumblob,
  PRIMARY KEY (`processed_id`),
  KEY `test_id` (`test_id`),
  KEY `payload_time` (`payload_time`),
  KEY `top_level_product_name` (`top_level_product_name`)
) ENGINE=MyISAM AUTO_INCREMENT=1392568 DEFAULT CHARSET=latin1

1 row in set (0.00 sec)

As for the amount of data, I expect it to grow without bound, really. Our setup: we have several boxes we are running tests from, where sql is the storage engine on each local box; then we will have a main storage area for all relevant tests. Based on past data, the table's size will be pushing 4G as it is (when we port the data), and I expect at least that much more over the life of this software; but since the payloads I will be getting the data from have not been developed yet, I can't be entirely sure.

One of the reasons I was inquiring as to whether breaking it up into several tables would be a good idea is that it would make it easier for me to merge the data from the different testers into the main data repository that way. Otherwise I will have to figure out a good way of redoing the test_id in each test that is stored in the main repository.

Slow queries will be a little hard to show without giving a full evaluation of my system, so I'll simplify it a little. I'm doing several joins to get the right parameters to query this table in a stored procedure, but when it comes down to it, the query on this table is the big one, and I can modify my other joins; just making the query on this table fast is my concern. Example query:

select payload_time, HEX(processed_data) from data__ProcessedDataFrames where test_id=18 AND top_level_product_name='DataProduct' AND payload_time > 11808.74704 AND payload_time < 1180564096.24967;

What I'm concerned about is, with how much data I will eventually have, even scanning over the KEYS will take a long time.

Thanks

Dave G.

BTW: here's the giant query that I use.

SELECT E.product_id, product_name, D.top_level_product_name, processed_id, product_offset, product_length, version_id, byte_order, ROUND(R.payload_time,6) as payload_time,
SUBSTR(BINARY(processed_data), FLOOR(product_offset/8)+1, CEIL(product_length/8)) as substring,
(SELECT HEX(substring)) as raw_data,
(SELECT toString(substring, round(char_length(raw_data)/2,0), data_type,
(SELECT attribute_value FROM config__DataProductAttributes WHERE attribute_name = 'FormatString' AND config__DataProductAttributes.product_id = E.product_id),
product_offset % 8, (product_length + (product_offset % 8)) % 8, product_length, byte_order,
Re: optimization help
I do, but I don't see any way around that with the data I have.

Dave G.

Good Afternoon David

It sounds as if you have a number of non-unique indices (even possibly FTS!) slowing down queries. This should help you concentrate on the slower indices:

mysql> select TABLE_NAME,COLUMN_NAME,INDEX_NAME from INFORMATION_SCHEMA.STATISTICS where NON_UNIQUE=1;

Anyone else?

Martin

[snip]
Re: optimization help
On 6/27/07, Dave G [EMAIL PROTECTED] wrote:

select payload_time, HEX(processed_data) from data__ProcessedDataFrames where test_id=18 AND top_level_product_name='DataProduct' AND payload_time > 11808.74704 AND payload_time < 1180564096.24967;

What I'm concerned about is, with how much data I will eventually have, even scanning over the KEYS will take a long time.

Hi Dave,

In the case above, you want to be sure that everything involved in the query is indexed or a key (probably the same thing). To give an example: test_id=18 ... if that isn't indexed, it will be an O(N) scan over all records in the table rather than an O(log N) retrieval based on some BTREE or similar. As a first step, be sure that everything involved in a typical query is indexed. For joins, the related columns should also be indexed.

Dave A.
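One step beyond Dave A.'s advice, since the example query filters on two equality columns plus one range column: a single compound index that lists the equality columns first and the range column last lets MySQL satisfy the whole WHERE clause with one index range scan. This is standard MySQL index behavior rather than something stated in the thread, and the index name below is made up:

ALTER TABLE data__ProcessedDataFrames
  ADD INDEX test_product_time (test_id, top_level_product_name, payload_time);

-- The example query can then walk one index range instead of
-- picking a single-column key and filtering the rest row by row:
SELECT payload_time, HEX(processed_data)
FROM data__ProcessedDataFrames
WHERE test_id = 18
  AND top_level_product_name = 'DataProduct'
  AND payload_time > 11808.74704
  AND payload_time < 1180564096.24967;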
Re: optimization help
That's quite a query. You may not be able to optimize it well with those nested selects. You may want to think about changing your query around a little, perhaps joining pieces of data using whatever programming language you're using on the front end. You have MySQL doing a lot of work and perhaps transferring a lot of data. If some of those selects are pulling data that is redundant across many rows, it may be more efficient to join them on the front end using arrays or something similar.

Another alternative would be to use a different table type, like a MERGE table. That allows you to query multiple tables as one, which is something you had asked about. You need to be aware of its limitations, like unique indexes not being enforced across tables. So if you want to use auto increment, you need to set the value when you create a new table to add it to the merge setup.

On Jun 27, 2007, at 12:16 PM, Dave G wrote:

I think I can do that. I don't have any other indexes, just the keys. [snip]
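A rough sketch of the MERGE approach Brent suggests, using the table definition Dave posted. The per-test table names are hypothetical, and the column and key definitions must match the underlying MyISAM tables exactly; note that processed_id is a plain column here because, as Brent says, uniqueness (including auto_increment) is not enforced across the set:

CREATE TABLE frames_all (
  `processed_id` int(10) unsigned NOT NULL,
  `top_level_product_name` varchar(255) default NULL,
  `test_id` int(10) unsigned default NULL,
  `payload_time` double default NULL,
  `universal_time` double default NULL,
  `processed_data` mediumblob,
  KEY `test_id` (`test_id`),
  KEY `payload_time` (`payload_time`)
) ENGINE=MERGE UNION=(frames_test17, frames_test18) INSERT_METHOD=LAST;

-- Queries against frames_all fan out to the per-test tables:
SELECT payload_time FROM frames_all WHERE test_id = 18;

Adding a new tester's table later means recreating or altering the MERGE table's UNION list.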
Need sql optimization help
I have the following sql that works for what I want to see, but is terribly slow due to the subquery. Any suggestions on how to get the same result faster? I have tried group by and can't seem to get the correct results that way. Also, is there a product that can help optimize sql and indexing?

SELECT DISTINCT e.severity, e.time_stamp, replace(e.description, '{HOSTNAME}', h.host) AS description, h.host, h.hostid, e.value, e.triggerid, s.color
FROM fs_events e, hosts h, fs_severity s
WHERE e.hostid = 10011
AND e.hostid = h.hostid
AND e.time_stamp = (SELECT max(time_stamp) FROM fs_events WHERE triggerid = e.triggerid)
and e.severity = s.severityid
ORDER BY e.time_stamp DESC
Re: Need sql optimization help
Bryan,

A 'Not Exists' query http://www.artfulsoftware.com/infotree/queries.php#41 is usually faster when coded as an exclusion join, eg for max-some-value per key: left join on the key variable and left.value < right.value, where right.key is null. So you would need something like ...

SELECT DISTINCT e.severity, e.time_stamp, replace(e.description,'{HOSTNAME}', h.host) AS description, h.host, h.hostid, e.value, e.triggerid, s.color
FROM fs_events e
JOIN hosts h USING (hostid)
JOIN fs_severity s ON (e.severity = s.severityid)
LEFT JOIN fs_events e2 ON e.triggerid=e2.triggerid AND e.MAX(time_stamp) < e2.MAX(time_stamp)
WHERE e.hostid = 10011
AND e2.triggerid IS NULL
ORDER BY e.time_stamp DESC

PB

Bryan Cantwell wrote: I have the following sql that works for what I want to see, but is terribly slow due to the subquery. [snip]
RE: Need sql optimization help
Gives me invalid use of group function:

SELECT DISTINCT e.severity, e.time_stamp, replace(e.description,'{HOSTNAME}', h.host) AS description, h.host, h.hostid, e.value, e.triggerid, s.color
FROM fs_events e
JOIN hosts h USING (hostid)
JOIN fs_severity s ON (e.severity = s.severityid)
LEFT JOIN fs_events e2 ON e.triggerid=e2.triggerid AND MAX(e.time_stamp) < MAX(e2.time_stamp)
WHERE e.hostid = 10011
AND e2.triggerid IS NULL
ORDER BY e.time_stamp DESC

From: Peter Brawley [mailto:[EMAIL PROTECTED]]
Sent: Saturday, March 03, 2007 1:40 PM
To: Bryan Cantwell
Cc: mysql@lists.mysql.com
Subject: Re: Need sql optimization help

[snip]
Re: Need sql optimization help
Your inner query guarantees that MySQL will have to test an awful lot of combinations: (# of records in fs_events)^2 * (# of records in hosts) * (# of records in severity). (SELECT max(time_stamp) FROM fs_events WHERE triggerid = e.triggerid) needs to be recalculated every time we try to match a row in the outer query... that is going to hurt a lot.

- michael dykman

On 3/3/07, Bryan Cantwell [EMAIL PROTECTED] wrote: Gives me invalid use of group function: [snip]

--
- michael dykman
- [EMAIL PROTECTED]
Re: Need sql optimization help
Sorry for the typo, lose the MAX! ...

SELECT DISTINCT e.severity, e.time_stamp, replace(e.description,'{HOSTNAME}', h.host) AS description, h.host, h.hostid, e.value, e.triggerid, s.color
FROM fs_events e
JOIN hosts h USING (hostid)
JOIN fs_severity s ON (e.severity = s.severityid)
LEFT JOIN fs_events e2 ON e.triggerid=e2.triggerid AND e.time_stamp < e2.time_stamp
WHERE e.hostid = 10011
AND e2.triggerid IS NULL
ORDER BY e.time_stamp DESC

PB

Michael Dykman wrote: Your inner query guarantees that MySQL will have to test an awful lot of combinations ... [snip]
Re: INNODB Questions and Optimization help
Hello.

Have a look here:
http://lists.mysql.com/mysql/194596
http://dev.mysql.com/doc/refman/5.0/en/innodb-tuning.html

If you feel uncomfortable with a 10G ibdata size, you may want to switch to per-file tablespaces:
http://dev.mysql.com/doc/refman/5.0/en/multiple-tablespaces.html
http://dev.mysql.com/doc/refman/5.0/en/adding-and-removing.html

Shaun Adams wrote:

I currently have a customer who we switched over to INNODB from MyISAM. It's a Dell Server with 4GB of Memory and RHEL4 64-bit. It's a fairly big database; the size of the MyISAM folders (before we converted to INNODB) was about 2-3Gigs.

Questions:

1. The ibdata1 file size is 10GB. Does that sound right? Should this file be this big?
2. Once a week, I have to perform HUGE insert imports into the database. What is the recommended procedure for doing this? Clearing out memory with a FLUSH before I start the import, setting innodb_flush_log_at_trx_commit to 0... any other suggestions?
3. Could you all recommend any tips you have used to increase performance using INNODB (from a system admin's perspective)?
4. Finally, could you take a look at my settings below as well as the system variables and recommend any changes?

[SHOW INNODB STATUS and SHOW VARIABLES output, quoted in full in the original post below; snip]
INNODB Questions and Optimization help
I currently have a customer who we switched over to INNODB from MyISAM. It's a Dell Server with 4GB of Memory and RHEL4 64-bit. It's a fairly big database; the size of the MyISAM folders (before we converted to INNODB) was about 2-3Gigs.

Questions:

1. The ibdata1 file size is 10GB. Does that sound right? Should this file be this big?

2. Once a week, I have to perform HUGE insert imports into the database. What is the recommended procedure for doing this? Clearing out memory with a FLUSH before I start the import, setting innodb_flush_log_at_trx_commit to 0... any other suggestions?

3. Could you all recommend any tips you have used to increase performance using INNODB (from a system admin's perspective)?

4. Finally, could you take a look at my settings below as well as the system variables and recommend any changes?

Thanks. I would appreciate as many responses as possible.

Shaun

=====================================
060207 19:04:03 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 16 seconds
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 11216, signal count 11214
Mutex spin waits 31985, rounds 62213, OS waits 135
RW-shared spins 21945, OS waits 10884; RW-excl spins 61, OS waits 14
------------
TRANSACTIONS
------------
Trx id counter 0 110703900
Purge done for trx's n:o 0 110703871 undo n:o 0 0
Total number of lock structs in row lock hash table 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0 0, not started, process no 6297, OS thread id 1147570528
MySQL thread id 66, query id 10060 localhost root
show INNODB status
---TRANSACTION 0 0, not started, process no 6297, OS thread id 1147771232
MySQL thread id 2, query id 2305 10.200.60.34 admin
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (write thread)
Pending normal aio reads: 0, aio writes: 0, ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
27717 OS file reads, 2528 OS file writes, 339 OS fsyncs
47.93 reads/s, 18477 avg bytes/read, 87.81 writes/s, 14.19 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf for space 0: size 1, free list len 68, seg size 70, 755 inserts, 793 merged recs, 733 merges
Hash table size 4980539, used cells 263653, node heap has 401 buffer(s)
6790.51 hash searches/s, 852.82 non-hash searches/s
---
LOG
---
Log sequence number 5 1118786578
Log flushed up to 5 1118786578
Last checkpoint at 5 1118786578
0 pending log writes, 0 pending chkp writes
553 log i/o's done, 9.31 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 2761248304; in additional pool allocated 6053120
Buffer pool size 153600
Free buffers 94978
Database pages 58221
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 58209, created 12, written 1996
55.00 reads/s, 0.00 creates/s, 76.31 writes/s
Buffer pool hit rate 998 / 1000
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
Main thread process no. 6297, id 1147169120, state: waiting for server activity
Number of rows inserted 171, updated 235, deleted 0, read 1793591
0.87 inserts/s, 1.00 updates/s, 0.00 deletes/s, 9788.39 reads/s
END OF INNODB MONITOR OUTPUT

SHOW VARIABLES

Variable_name Value
back_log 50
basedir /usr/local/mysql-standard-4.0.26-unknown-linux-gnu-x86_64-glibc23/
binlog_cache_size 32768
bulk_insert_buffer_size 8388608
character_set latin1
character_sets latin1 big5 czech euc_kr gb2312 gbk latin1_de sjis tis620 ujis dec8 dos german1 hp8 koi8_ru latin2 swe7 usa7 cp1251 danish hebrew win1251 estonia hungarian koi8_ukr win1251ukr greek win1250 croat cp1257 latin5
concurrent_insert ON
connect_timeout 5
convert_character_set
datadir /var/lib/mysql/
default_week_format 0
delay_key_write ON
delayed_insert_limit 100
delayed_insert_timeout 300
delayed_queue_size 1000
flush OFF
flush_time 0
ft_boolean_syntax + -()~*:|
ft_max_word_len 254
ft_max_word_len_for_sort 20
ft_min_word_len 4
ft_stopword_file (built-in)
have_bdb NO
have_crypt YES
have_innodb YES
have_isam YES
have_openssl NO
have_query_cache YES
have_raid NO
have_symlink YES
init_file
innodb_additional_mem_pool_size 8388608
innodb_autoextend_increment 8
innodb_buffer_pool_size 2516582400
innodb_data_file_path ibdata1:2G:autoextend
innodb_data_home_dir
innodb_fast_shutdown ON
innodb_file_io_threads 4
innodb_flush_log_at_trx_commit 2
innodb_flush_method O_DIRECT
innodb_force_recovery 0
Re: INNODB Questions and Optimization help
Shaun Adams wrote: 1. The ibdata1 file size is 10GB. Does that sound right? Should this file be this big?

That sounds right. Innodb seems to incur large space overheads, but with the cost of diskspace nowadays...

2. Once a week, I have to perform HUGE insert imports into the database. What is the recommended procedure for doing this? Clearing out memory with a FLUSH before I start the import, setting innodb_flush_log_at_trx_commit to 0... any other suggestions?

Yeah, the flush log at trx_commit is good. Remember that if the machine crashes you gotta retry the last second of txns before the crash. Also, turn off autocommit and surround your query(s) with SET AUTOCOMMIT=0 and COMMIT. If the db crashes while you're doing a lot of inserts, the rollback can take FOREVER, and I do mean forever. Check the mysql manual on how to avoid this. In general, try not to do mass inserts at any given time. Perhaps build a queue and stream rows into the database as you need them.

3. Could you all recommend any tips you have used to increase performance using INNODB (from a system admin's perspective)?

Look into replication if you need a lot of performance; otherwise just a lot of memory (if you have more than 4GB you'll need a 64bit OS). SCSI (or otherwise high rotational speed) drives tend to work faster. If the database is 99% reads and you need all the performance you can get, I think MyISAM is a better option.

4. Finally, could you take a look at my settings below as well as the system variables and recommend any changes?

Looks alright. I think the biggest thing is having a large innodb log file.
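A minimal sketch of the import wrapper described above. The table and column names are placeholders; innodb_flush_log_at_trx_commit=0 would go in my.cnf (with a restart) on servers of this vintage, since it predates dynamic setting:

SET AUTOCOMMIT = 0;      -- batch the whole import into one transaction

INSERT INTO big_table (col_a, col_b) VALUES (1, 'a');
INSERT INTO big_table (col_a, col_b) VALUES (2, 'b');
-- ... thousands more rows ...

COMMIT;                  -- one commit and one log flush instead of one per row

The single COMMIT is what avoids a per-row log flush; the trade-off, as noted above, is that a crash mid-import rolls back (or, with flush_log_at_trx_commit=0, loses) the in-flight work.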
RE: INNODB Questions and Optimization help
:~ -----Original Message-----
:~ From: Shaun Adams [mailto:[EMAIL PROTECTED]
:~ Sent: Tuesday, February 07, 2006 4:32 PM
:~ To: mysql@lists.mysql.com
:~ Subject: INNODB Questions and Optimization help
:~
:~ Questions:
:~
:~ 1. The ibdata1 file size is 10GB. Does that sound right? Should this file be this big?

Yes, the ibdata file contains the index and the data in the same space (unless you use 4.1+ and use separate table spaces). Text/blob data takes about double its actual size in most cases, so expect innodb to use a lot of disk space if you continue to store blob data in innodb.

:~ 2. Once a week, I have to perform HUGE insert imports into the database. What is the recommended procedure for doing this? Clearing out memory with a FLUSH before I start the import, setting innodb_flush_log_at_trx_commit to 0... any other suggestions

Use LOAD DATA INFILE, increase your INNODB BUFFER POOL to about 2-3GB if you can afford it, and disable keys on the table. Also LOAD DATA in the ORDER that the table is ordered. INNODB supports CLUSTERED indexes. That means if your table has a PRIMARY KEY of A,B, the data should be sorted by A,B before using LOAD DATA.

:~ 3. Could you all recommend any tips you have used to increase performance using INNODB (from a system admin's perspective).

The best perf improvement is to look at your indexes and see if queries are using them correctly. Other than that, from a sysadmin point of view: use O_DIRECT, turn on noatime, and build the drive using these options:

/sbin/mkfs.ext3 -i 131072 -m 1 -L MYSQL /dev/.../ your drive

Also set swappiness to 0 in /proc/sys/vm/

:~ 4. Finally, could you take a look at my settings below as well as the system variables and recommend any changes.

Set innodb_buffer_pool_size=2G
Set innodb_thread_concurrency=32 // this is good if you do a lot of small fast queries
Set innodb_log_file_size=512M

You're doing a bunch of table scans; check your sql, and make sure you're using indexes.

:~ Handler_read_rnd 2025997
:~ Handler_read_rnd_next 487643

This is how I was able to tell.
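A rough sketch of the LOAD DATA step described above. The table, file, and columns are hypothetical; it assumes a tab-delimited dump pre-sorted by the primary key (a, b), e.g. exported with an ORDER BY, so rows arrive in clustered-index order:

SET AUTOCOMMIT = 0;

LOAD DATA INFILE '/tmp/weekly_import.txt'
  INTO TABLE big_table
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n'
  (a, b, payload);

COMMIT;

Loading in key order keeps InnoDB appending to the clustered index instead of splitting pages all over the tree, which is where much of the import time goes otherwise.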
Re: Import File / Insert Optimization Help
Hello.

I'm not sure if it is suitable for your case, but sometimes it is better to import the data into a temporary table on the server and extract values from the fields of that table.

Scott Klarenbach wrote:

I am importing records from a text file into my DB. Each record in the text file corresponds to six tables in the DB due to normalization. So for example, an inventory record containing a part, vendor, contactPerson, mpn etc... in ONE line of a text file requires 6 inserts into the DB. [snip]

--
Gleb Paharenko
[EMAIL PROTECTED]
MySQL AB / Ensita.NET
www.mysql.com
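One way to realize Gleb's staging-table idea, sketched under assumptions: the staging table and file layout below are hypothetical, and INSERT IGNORE relies on vendor.name carrying a UNIQUE index so duplicates are silently skipped:

CREATE TABLE import_raw (
  part    varchar(100),
  vendor  varchar(100),
  contact varchar(100),
  mpn     varchar(100)
);

LOAD DATA LOCAL INFILE '/tmp/inventory.txt'
  INTO TABLE import_raw
  FIELDS TERMINATED BY '\t';

-- One set-based statement per lookup table instead of one
-- SELECT-then-INSERT round trip per line of the file:
INSERT IGNORE INTO vendor (name)
  SELECT DISTINCT vendor FROM import_raw;

The final inventory insert can then join import_raw against the lookup tables to pick up all the foreign keys in a single pass.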
Import File / Insert Optimization Help
I am importing records from a text file into my DB. Each record in the text file corresponds to six tables in the DB due to normalization. So for example, an inventory record containing a part, vendor, contactPerson, mpn etc... in ONE line of a text file requires 6 inserts into the DB. Furthermore, before each insert, I need to check for redundancy. So, if an inventory line in the text file has vendor 'Scott', BEFORE I add 'Scott' to the vendor table, I check the vendor table for whether 'Scott' exists. If 'Scott' does exist, then I just pull the ID and use that in the inventory insert; if 'Scott' DOESN'T exist yet, I insert 'Scott' into the Vendor Table, get the last_insert_id() and use that in the inventory table.

Each LINE in the text file can result in more than 20 Select/Insert statements on the underlying DB before I can insert the record with all the properly allocated foreign keys. Considering many of these text files have thousands of lines of inventory, as you can imagine, I have a massive performance problem. Each complete line of the text file requires about 1 full second to validate and insert into the underlying schema.

I'm using InnoDB tables, so a lot of the Insert Optimization techniques I found from MySQL don't seem to apply too well. I'm hoping for some experienced feedback on alternative techniques for performing this sort of import.

Some further info: In PHP 5, I get my connection object, and then in a loop I'm calling $connection->query() for each line of the text file. I'm assuming this is always using the same connection and that each query doesn't require a re-connect to the DB... if it does, that's a major bottleneck that could be avoided (with persistent connections?), ie

$conn = mysqli_init(); // null connection object
$conn->real_connect('host', 'user', 'pass', 'db') or die('connection');
foreach($file as $line) {
    buildQuery();
    $result = $conn->query($sql);
}

I hope each time I call $conn->query($sql) it's using the SAME connection resource and not having to reconnect to the DB.

Secondly, $sql involves a call to a Stored Procedure, which in turn ends up calling other stored procedures to facilitate all the transactions, validations and inserts... I assume that's not too much more inefficient than using sql insert statements directly.

Any help is appreciated.

Scott.
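As a side note on the check-then-insert step, MySQL has a one-statement idiom that fits this pattern. It is not from the thread; it assumes vendor.name has a UNIQUE index and id is the AUTO_INCREMENT primary key. LAST_INSERT_ID(expr) makes the existing row's id available whether the row was just inserted or was already there:

INSERT INTO vendor (name) VALUES ('Scott')
  ON DUPLICATE KEY UPDATE id = LAST_INSERT_ID(id);
SELECT LAST_INSERT_ID();  -- the vendor's id, new or pre-existing

This collapses the SELECT-then-conditional-INSERT pair into a single round trip per lookup value.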
Re: MySQL View Optimization Help
Hi!

Daniel Kasak wrote: [EMAIL PROTECTED] wrote: Views differ from tables in that they cannot be indexed. I've just started experimenting with derived tables under 4.1.14, and I had a hunch this was so. Is there any plan to include index support for views / derived tables?

An index is a separate data structure which must be maintained when the base table is changed (in the indexed columns). A view is a restricted (by rows and/or columns) look at a base table. IMO, having separate indexes for views is not in line with the relational approach at all.

The way to go is an efficient use of all indexes defined on a table, whether it is accessed as a base table or via a view. Any WHERE condition in the view definition may be evaluated via base table indexes, if suitable ones are defined on the base table, provided the optimizer chooses this access path. Improving the optimizer would be a feature change, so you should expect to see that in newer versions only.

Regards,
Joerg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com
Re: MySQL View Optimization Help
[EMAIL PROTECTED] wrote: Views differ from tables in that they cannot be indexed.

I've just started experimenting with derived tables under 4.1.14, and I had a hunch this was so. Is there any plan to include index support for views / derived tables?

--
Daniel Kasak
IT Developer
NUS Consulting Group
Level 5, 77 Pacific Highway
North Sydney, NSW, Australia 2060
T: (+61) 2 9922-7676 / F: (+61) 2 9922 7989
email: [EMAIL PROTECTED]
website: http://www.nusconsulting.com.au
Re: MySQL View Optimization Help
Thanks a lot Shawn. I didn't realize that views don't take advantage of indexing. This is the cause of my major performance hits.

I'm basically using views as a form of DB abstraction over the tables. So, many of my views pull all records from all tables they join, and it is up to the user to submit a WHERE clause to the view. In many cases, I'm getting 20-30 second queries, whereas the underlying (indexed) tables return results in .33 seconds. The views themselves aren't using criteria. This runs contrary to what I imagine to be a common use of views, ie, a vwSelectAllArizonaResidents sort of thing, where the view internally compiles the where criteria from the underlying table.

Scott Klarenbach

On 1/6/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Treat views as you would any other query. [snip]
MySQL View Optimization Help
Hello, I'm new to views and am discovering massive performance hits in the views I've created once the records start to climb above 20,000 or so. Does anyone know of a great primer/tutorial site for optimizing views in MySQL, or even generally? What are the best practices etc...? I find when I create the same view in SQL by joining the tables directly, it's much faster than the views which invariably are joining other views. Is there a recursion problem with this method? Should views only join underlying tables and not other views? Thanks. Scott.
Re: MySQL View Optimization Help
Scott Klarenbach [EMAIL PROTECTED] wrote on 01/06/2006 08:13:10 PM: Hello, I'm new to views and am discovering massive performance hits in the views I've created once the records start to climb above 20,000 or so. [snip]

Treat views as you would any other query. All of the optimizations that normally apply to SELECT query performance should also apply to view performance.

Views differ from tables in that they cannot be indexed. That is probably why you are getting performance hits by building views on views. Any query against a view (such as a second-tier derivative view) will end up performing the equivalent of a full table scan on any view it uses.

There is no hard and fast rule about building views based on other views or based on tables. Whichever solution works best for you is the one you should stick with. If you have millions of rows in a base table and a view can reduce that to about ten thousand rows of summary information, I would be very tempted to stick with the view as the basis of a future query. You still have to generate that view each time you want to use it, but its data may be sitting there in the query cache, so it has the potential to be very fast.

If I were you I would review the entire optimization chapter: http://dev.mysql.com/doc/refman/5.0/en/optimization.html It's loaded with useful information.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine
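One detail worth checking that the thread doesn't spell out: MySQL 5.0 processes a view either by merging its definition into the outer query (ALGORITHM = MERGE), which lets the caller's WHERE clause use base-table indexes, or by materializing it into an unindexed temporary table first (ALGORITHM = TEMPTABLE). A sketch with hypothetical table and view names:

CREATE ALGORITHM = MERGE VIEW vwResidents AS
  SELECT id, name, state FROM residents;

-- The WHERE below is merged into the query on `residents`,
-- so an index on residents.state can still be used:
SELECT name FROM vwResidents WHERE state = 'AZ';

Views using DISTINCT, GROUP BY, aggregates, or UNION are forced to TEMPTABLE regardless of the declared algorithm, which may explain some of the view-on-view slowdowns described here.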
Optimization Help
Hi, I need some optimization help. I currently have this query:

SELECT COUNT(*) as count, sum(p.amt) as total, p.batch, m.username, m.id, p.method, DATE_FORMAT(p.occur, '%M %d, %Y') as dateBought
FROM pay p
LEFT JOIN members m ON m.id=p.mid
WHERE p.method!='none' AND p.method!=''
GROUP BY p.batch
ORDER BY p.occur DESC

Now I have an index on batch,method, but that does not help because it's using filesort when ordering, because of occur. Also, I wanted to know if there is a way to optimize SUM, and does DATE_FORMAT slow down queries? Would it be faster to run my own php date function once I get the query results?

Thanks,
Andrew
Query optimization help needed
I asked for help with a version of this query a few months ago, and subsequently made some improvements to it, and also fooled around with other parts of my app that were in need of tuning. I've since done some more extensive benchmarking and realized that this query really is somewhat slow. Even though the data set is rather small and everything is (I think) properly indexed and the joins are sensible, I can't seem to get rid of the "using temporary" and "using filesort" in my EXPLAIN. I'd be grateful for any suggestions for improving this.

Here's the query (obviously I run it with different values for subject.name and different LIMIT values, but this is representative):

    SELECT citation.*, DATE_FORMAT(citation.last_modified, '%e %M, %Y') AS last_modified
    FROM citation, subject, citation_subject
    WHERE subject.name = 'History'
      AND citation_subject.subject_id = subject.id
      AND citation_subject.citation_id = citation.id
      AND citation.deleted = 0
    ORDER BY citation.stripped_word,
             FIELD(citation.part_of_speech, 'NOUN', 'ADJECTIVE', 'ADVERB', 'VERB'),
             citation.id
    LIMIT 150, 50

and EXPLAIN gives me this:

    *** 1. row ***
              table: subject
               type: ref
      possible_keys: PRIMARY,name
                key: name
            key_len: 50
                ref: const
               rows: 1
              Extra: Using where; Using temporary; Using filesort
    *** 2. row ***
              table: citation_subject
               type: ref
      possible_keys: citation_id,subject_id
                key: subject_id
            key_len: 4
                ref: subject.id
               rows: 169
              Extra: Using index
    *** 3. row ***
              table: citation
               type: eq_ref
      possible_keys: PRIMARY,deleted
                key: PRIMARY
            key_len: 4
                ref: citation_subject.citation_id
               rows: 1
              Extra: Using where

Finally, here are the three tables involved. I've trimmed out the irrelevant columns:

    CREATE TABLE `citation` (
      `id` int(11) NOT NULL auto_increment,
      `word` varchar(50) NOT NULL default '',
      `stripped_word` varchar(50) default NULL,
      `part_of_speech` enum('NOUN','VERB','ADJECTIVE','ADVERB') NOT NULL default 'NOUN',
      `last_modified` timestamp(14) NOT NULL,
      `deleted` datetime default '0000-00-00 00:00:00',
      PRIMARY KEY (`id`),
      KEY `deleted` (`deleted`),
      KEY `word` (`word`),
      KEY `stripped_word` (`stripped_word`)
    ) TYPE=MyISAM

    CREATE TABLE `citation_subject` (
      `id` int(11) NOT NULL auto_increment,
      `citation_id` int(11) NOT NULL default '0',
      `subject_id` int(11) NOT NULL default '0',
      PRIMARY KEY (`id`),
      KEY `citation_id` (`citation_id`,`subject_id`),
      KEY `subject_id` (`subject_id`,`citation_id`)
    ) TYPE=MyISAM

    CREATE TABLE `subject` (
      `id` int(11) NOT NULL auto_increment,
      `name` varchar(50) NOT NULL default '',
      `deleted` datetime default NULL,
      PRIMARY KEY (`id`),
      KEY `name` (`name`)
    ) TYPE=MyISAM

Thank you for any suggestions.

Jesse Sheidlower

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
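The FIELD() expression in the ORDER BY can never be satisfied by an index, and the sort columns come from a different table than the one driving the join, so some sorting is hard to avoid here. A hedged sketch of one workaround, assuming a schema change is acceptable: precompute the part-of-speech rank into its own column so the sort runs on plain columns (the pos_rank column name is made up for illustration):

    ALTER TABLE citation
        ADD COLUMN pos_rank TINYINT NOT NULL DEFAULT 0;

    UPDATE citation
       SET pos_rank = FIELD(part_of_speech, 'NOUN', 'ADJECTIVE', 'ADVERB', 'VERB');

    -- then sort on plain columns instead of the FIELD() expression:
    -- ORDER BY citation.stripped_word, citation.pos_rank, citation.id

This removes the per-row function call from the sort key; the temporary table may still appear, since the rows arrive through the citation_subject join rather than an index scan in sort order.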
Re: Query optimization help
Chuck Gadd wrote:

I've got a query that I can't seem to get optimized, so I'm hoping someone here can spot something I'm missing! Table has three columns: CoordID int unsigned, Zip_Lo char(9), Zip_Hi char(9). Table has 3 million records. Indexes: acg_lo (Zip_Lo), acg_hi (Zip_Hi), acg_combined (Zip_Lo, Zip_Hi). Here's the query:

    select * from acg
    where zip4_lo_pot <= '80128' and zip4_hi_pot >= '80128'

Explain shows: type: ALL, possible keys: acg_lo,acg_hi,acg_combined, rows: 3022309, extra: Using where

This kind of query cannot be efficiently optimized on a pre-4.1 version. With 4.1, if you are using MyISAM tables, you could turn (zip4_lo_pot, zip4_hi_pot) into a spatial column with a spatial index. See http://www.mysql.com/doc/en/Spatial_extensions_in_MySQL.html

-- Sasha Pachev Create online surveys at http://www.surveyz.com/

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
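A hedged sketch of the spatial approach Sasha describes, assuming MySQL 4.1+ with MyISAM and that the zip values can be treated numerically (the zip_seg column and acg_zip_seg index names are made up): each (lo, hi) pair becomes a horizontal line segment, and the containment test becomes an indexed bounding-box lookup.

    ALTER TABLE acg ADD COLUMN zip_seg GEOMETRY;

    UPDATE acg
       SET zip_seg = GeomFromText(
           CONCAT('LINESTRING(', zip4_lo_pot + 0, ' 0, ', zip4_hi_pot + 0, ' 0)'));

    -- spatial indexes on MyISAM require a NOT NULL column
    ALTER TABLE acg MODIFY zip_seg GEOMETRY NOT NULL;
    CREATE SPATIAL INDEX acg_zip_seg ON acg (zip_seg);

    -- the bounding box of the segment contains the point
    -- exactly when zip4_lo_pot <= 80128 <= zip4_hi_pot
    SELECT * FROM acg
    WHERE MBRContains(zip_seg, GeomFromText('POINT(80128 0)'));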
Re: Query optimization help
Without seeing the data, I am assuming that you are going over the 30% threshold with your less-than-or-equal/greater-than-or-equal WHERE clauses: once the optimizer estimates that a range matches more than roughly a third of the table, it abandons the index and scans instead. What sort of criteria are you asking the database engine to search for?

Original Message

On 2/25/04, 9:44:02 PM, [EMAIL PROTECTED] wrote regarding Re: Query optimization help:

Maybe I'm wrong here, someone correct me: if it's just ints you are going to use, set the field types to BIGINT, it may search faster. You are doing a character search; to get there quicker in a text-search scenario I'd suggest MySQL 4 and full-text searching (MATCH ... AGAINST).

I've got a query that I can't seem to get optimized, so I'm hoping someone here can spot something I'm missing! Table has three columns: CoordID int unsigned, Zip_Lo char(9), Zip_Hi char(9). Table has 3 million records. Indexes: acg_lo (Zip_Lo), acg_hi (Zip_Hi), acg_combined (Zip_Lo, Zip_Hi). Here's the query:

    select * from acg
    where zip4_lo_pot <= '80128' and zip4_hi_pot >= '80128'

Explain shows: type: ALL, possible keys: acg_lo,acg_hi,acg_combined, rows: 3022309, extra: Using where

So, how can I optimize this?

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
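One way to test the 30% theory (a hedged sketch; FORCE INDEX is available from MySQL 4.0.9, older versions would use USE INDEX instead) is to override the optimizer's choice and compare timings:

    SELECT * FROM acg FORCE INDEX (acg_combined)
    WHERE zip4_lo_pot <= '80128' AND zip4_hi_pot >= '80128';

If the forced version is slower, the optimizer was right to scan; if it is faster, the index statistics may be stale, and ANALYZE TABLE acg; is worth a try before anything more drastic.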
Query optimization help
I've got a query that I can't seem to get optimized, so I'm hoping someone here can spot something I'm missing!

Table has three columns: CoordID int unsigned, Zip_Lo char(9), Zip_Hi char(9). Table has 3 million records.

Indexes: acg_lo (Zip_Lo), acg_hi (Zip_Hi), acg_combined (Zip_Lo, Zip_Hi)

Here's the query:

    select * from acg
    where zip4_lo_pot <= '80128' and zip4_hi_pot >= '80128'

Explain shows:

    type: ALL
    possible keys: acg_lo,acg_hi,acg_combined
    rows: 3022309
    extra: Using where

So, how can I optimize this?

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
Re: Query optimization help
Maybe I'm wrong here, someone correct me: if it's just ints you are going to use, set the field types to BIGINT, it may search faster. You are doing a character search; to get there quicker in a text-search scenario I'd suggest MySQL 4 and full-text searching (MATCH ... AGAINST).

I've got a query that I can't seem to get optimized, so I'm hoping someone here can spot something I'm missing! Table has three columns: CoordID int unsigned, Zip_Lo char(9), Zip_Hi char(9). Table has 3 million records. Indexes: acg_lo (Zip_Lo), acg_hi (Zip_Hi), acg_combined (Zip_Lo, Zip_Hi). Here's the query:

    select * from acg
    where zip4_lo_pot <= '80128' and zip4_hi_pot >= '80128'

Explain shows: type: ALL, possible keys: acg_lo,acg_hi,acg_combined, rows: 3022309, extra: Using where

So, how can I optimize this?

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
Optimization help
Hi all - I have a speed problem that I don't understand. I've been pretty active with DBs for a few years, but I'm no expert, so let me know if I'm missing the obvious. I have Paul DuBois' MySQL book (New Riders edition) and Descartes and Bunce's Programming DBI book on my desk, so feel free to reference something there if that will help.

Here's the table I'm working from and its structure:

    CREATE TABLE DeltaPAF (
      Date     DATE NOT NULL,
      Type     VARCHAR(4) NOT NULL,
      Incident INT UNSIGNED NOT NULL,
      Mgr      VARCHAR(4) NOT NULL,
      Site     VARCHAR(40) NOT NULL,
      Task     ENUM('Proposed', 'Approved', 'Completed', 'Invoiced',
                    'Expired', 'Rejected', 'Cancelled') NOT NULL,
      Webpage  MEDIUMTEXT NOT NULL,
      Budget   DECIMAL(12, 2) DEFAULT 0.00,
      PRIMARY KEY (Date, Incident, Type, Task),
      INDEX (Type, Mgr, Site)
    );

I have about 125,000 records in the table and it's running on an older 400 MHz MacOS X 10.2.8 system. The MySQL version is 3.23.52. The following query comes back with 210 records in about 0.6 seconds:

    mysql> SELECT Date FROM DeltaPAF WHERE Date>='2003-12-11'
        -> AND Date<='2004-01-11' AND Incident=98996144;

However, this query comes back with 210 records in a little over 2 minutes:

    mysql> SELECT Budget FROM DeltaPAF WHERE Date>='2003-12-11'
        -> AND Date<='2004-01-11' AND Incident=98996144;

Can someone clue me in how I might get the SELECT Budget query to return in a time similar to the SELECT Date query? I tried adding an index for Budget, knowing it shouldn't help, and it didn't. FWIW, the Webpage fields average about 5K characters, but can be as much as 40K.

Mike Schienle, Custom Visuals http://www.customvisuals.com/

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
Re: Optimization help
I think... you don't have an index on the Incident field itself, just on (Date, Incident, Type, Task), which means that it concatenates those fields and orders the result - thus this index may be virtually useless if you're looking for a specific incident within a large date range. Since your query has a specific incident number, indexing that field would probably help a lot.

Do a

    SHOW INDEXES FROM DeltaPAF;

to see the indexes that are actually there, or

    EXPLAIN SELECT Date FROM DeltaPAF WHERE Date>='2003-12-11'
    AND Date<='2004-01-11' AND Incident=98996144;

to see which indexes MySQL is really using.

For example, in the table below, there are really only two indexes: the one primary-key index and the second name index. The Seq_in_index column shows the fields that are included in the index, but the ones that aren't listed first will be much harder to find. Like a telephone directory, which is ordered by lastname, firstname - both fields are indexed, but they are in the same index, so finding a specific firstname still means a full table scan.

Good luck!

    mysql> describe test1;
    +--------+-------------+------+-----+------------+-------+
    | Field  | Type        | Null | Key | Default    | Extra |
    +--------+-------------+------+-----+------------+-------+
    | name   | varchar(20) | YES  | MUL | NULL       |       |
    | mydate | date        |      | PRI | 0000-00-00 |       |
    | number | int(10)     |      | PRI | 0          |       |
    +--------+-------------+------+-----+------------+-------+
    3 rows in set (0.00 sec)

    mysql> show indexes from test1;
    +-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
    | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
    +-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
    | test1 |          0 | PRIMARY  |            1 | mydate      | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
    | test1 |          0 | PRIMARY  |            2 | number      | A         |           0 |     NULL | NULL   |      | BTREE      |         |
    | test1 |          1 | name     |            1 | name        | A         |        NULL |     NULL | NULL   | YES  | BTREE      |         |
    | test1 |          1 | name     |            2 | mydate      | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
    | test1 |          1 | name     |            3 | number      | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
    +-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
    5 rows in set (0.15 sec)

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
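Following Douglas's advice, the fix is a one-line index addition (a sketch of the statement Mike presumably ran; the index name idx_incident is made up):

    ALTER TABLE DeltaPAF ADD INDEX idx_incident (Incident);

With a specific Incident value in the WHERE clause, MySQL can then jump straight to the matching rows instead of walking the whole date range through the composite primary key.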
Re: Optimization help
On Jan 12, 2004, at 08:09 AM, Douglas Sims wrote: I think... you don't have an index on the Incident field itself, just on (Date, Incident, Type, Task) which means that it concatenates those fields and orders the result - thus this may be virtually useless if you're looking for a specific incident within a large date range. Since your query has a specific incident number, indexing that field would probably help a lot. Thanks for the help, Douglas. That was the ticket. We're back under a second for queries now. Mike Schienle, Custom Visuals http://www.customvisuals.com/ -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Query/Table optimization help needed
Hi, I have a perl script that loops through and executes 2 queries 50 times. I need to make sure that I have done all I can to make these queries and the indexing on the table as efficient as possible. Would someone do me the gargantuan favor of taking a peek at the info below and offering any suggestions that may improve things? Thanks! Bryan

The table: Note: imagequery_3 is actually generated by 'create table ... select' with a left outer join on two tables, but the selection criteria are the same (chrom and chrompos). I did this because I figured it was faster to avoid the left join and index a normal table on chrompos.

    mysql> describe imagequery_3;
    +-----------+----------+------+-----+---------+-------+
    | Field     | Type     | Null | Key | Default | Extra |
    +-----------+----------+------+-----+---------+-------+
    | subsnp_fk | int(11)  |      |     | 0       |       |
    | chrom     | char(5)  | YES  |     | NULL    |       |
    | locus     | char(15) | YES  |     | NULL    |       |
    | chrompos  | int(11)  |      | MUL | 0       |       |
    +-----------+----------+------+-----+---------+-------+
    4 rows in set (0.00 sec)

    mysql> select count(*) as n from imagequery_3;
    +---------+
    | n       |
    +---------+
    | 1762834 |
    +---------+
    1 row in set (0.00 sec)

    mysql> show index from imagequery_3;
    +--------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+---------+
    | Table        | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Comment |
    +--------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+---------+
    | imagequery_3 |          1 | chrom_key |            1 | chrompos    | A         |        NULL |     NULL | NULL   |         |
    +--------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+---------+
    1 row in set (0.00 sec)

The queries (chrompos increments by some precalculated offset for every loop in the perl script):

Query 1:

    select distinct c.subsnp_fk, locus, chrompos
    from chrom_position_3 c
    left outer join locus_anno_3 a on c.subsnp_fk = a.subsnp_fk
    where chrom = '01' and chrompos >= 1 and chrompos <= 5202881;

Query 2:

    select count(distinct locus) as n
    from imagequery_3
    where chrom = '01' and chrompos >= 1 and chrompos <= 5202881;

- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail [EMAIL PROTECTED] To unsubscribe, e-mail [EMAIL PROTECTED] Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
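A hedged suggestion based on the SHOW INDEX output above: both queries filter on chrom as well as chrompos, but the only index (chrom_key) covers chrompos alone, so every row in the matching position range must still be checked against chrom one by one. A composite index that leads with the equality column may help (the chrom_pos index name is made up; ANALYZE TABLE also refreshes the NULL Cardinality that SHOW INDEX reported, which can itself mislead the optimizer):

    ALTER TABLE imagequery_3 ADD INDEX chrom_pos (chrom, chrompos);
    ANALYZE TABLE imagequery_3;

With (chrom, chrompos), MySQL can satisfy chrom = '01' with an exact prefix match and then scan only the requested chrompos range inside that prefix.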