Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?
On Fri, Jun 19, 2009 at 11:27 AM, Brent Baisleybrentt...@gmail.com wrote: It sounds like you want to use spatial indexes, but they only became available in v4.1 http://dev.mysql.com/doc/refman/5.0/en/create-index.html http://dev.mysql.com/doc/refman/5.0/en/using-a-spatial-index.html That feels like the right thing (spatial calculations = spatial indexes?) but I looked at the docs and my head exploded. Can anyone recommend a good book that takes me through it gently? That said I'm intreged by the MBRContains and the Polygon functions... If I read those right I could create a simplified circle (probably just an octogon) to help eliminate false positives in the corners when using a plain square as the enclosure. You don't have to do any re-architecture to change you subquery to a join: SELECT custzip FROM customers JOIN (SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5) AS zips ON custzip=zip Will that work after a where clause? Multiple Times? For example... (pseudo-code...) SELECT * FROM customers WHERE saleslastyear 10 JOIN (SELECT zip FROM etc) AS zips ON custzip=zip JOIN (SELECT MAX(date) FROM phonecalls) AS LastCalledOn ON custid=custid Just from thinking about that... I assume that the only limitation is that in a subselect you can do something like WHERE NOT IN (select etc) but with a JOIN you are assuming a positive relationship? For example using the JOIN methods above there isn't a way to simply do AND custid NOT IN (SELECT custid FROM ordersplacedthisyear) other than doing exactly that and adding this clause to the saleslastyear clause. (In this particular case a column lastorderdate in customer that was programatically updated on ordering would also be useful but I'm thinking examples here... ;) ) I've never seen JOIN used outside of a traditional SELECT t1.*,t2.* FROM table1 AS t1 LEFT JOIN table2 AS t2 ON t1.id=t2.id type of structure so I kinda feel like I have a new toy... Thanks! -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?
In the last episode (Jun 18), Matt Neimeyer said: I'm converting a PHP app from using Visual FoxPro as the database backend to using MySQL as the backend. I'm testing on MySQL 4.1.22 on Mac OSX 10.4. The end application will be deployed cross platform and to both 4.x and 5.x MySQL servers. This query returned 21 records in .27 seconds: SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5 Ouch. You might want to calculate the rectange enclosing your target distance, add an index on lat (and/or long), and add the rectangle check to your where clause: WHERE latitude BETWEEN lt1 AND lt2 AND longitude BETWEEN ln2 AND ln2. That way mysql can use the index to pare down most of the rows without having to call all those trig functions for every zipcode. This query returned 21442 records in 1.08 seconds: SELECT custzip FROM customers This query is still running half an hour later, with a Time of 2167 and a State of Sending Data (according to the mysql process list) SELECT custzip FROM customers WHERE custzip IN (SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5) When I try to EXPLAIN the query it gives me the following... id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra 1,PRIMARY,customers,index,NULL,cw_custzip,30,NULL,21226,Using where; Using index 2,DEPENDENT SUBQUERY,zipcodes,ALL,NULL,NULL,NULL,NULL,42144,Using where Neither mysql 4 or 5 are very smart when it comes to subqueries. Unless your inner query is dead simple, mysql assumes it's a dependent subquery and runs it once per row in your outer query. You might want to try mysql 6 and see if it does any better. For example, here are explain plans for mysql 5 and 6 for the following query on the famous Oracle emp sample table: select ename from emp where mgr in (select empno from emp where ename in ('scott')); +---+ | ename | +---+ | ADAMS | +---+ 1 row in set (0.00 sec) Mysql 5.1.30: +++---+-+---+-+-+--+--+-+ | id | select_type| table | type| possible_keys | key | key_len | ref | rows | Extra | +++---+-+---+-+-+--+--+-+ | 1 | PRIMARY| emp | ALL | NULL | NULL| NULL| NULL | 14 | Using where | | 2 | DEPENDENT SUBQUERY | emp | unique_subquery | PRIMARY,ENAME | PRIMARY | 4 | func |1 | Using where | +++---+-+---+-+-+--+--+-+ Note that it didn't use an index on the outer query, and had to examine all 14 rows. It even used the wrong index on the inner query :) Mysql 6.0.11: ++-+---+--+---+---+-++--+---+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | ++-+---+--+---+---+-++--+---+ | 1 | PRIMARY | emp | ref | PRIMARY,ENAME | ENAME | 13 | const |1 | Using index condition | | 1 | PRIMARY | emp | ref | MGR | MGR | 5 | test.emp.EMPNO |2 | | ++-+---+--+---+---+-++--+---+ Note that the queries have flipped and aren't nested anymore (id is 1 on both queries). The first query uses the ename index and estimates it will return one row. The second query uses the mgr index based on the empno value returned by the first query and estimates it will return 2 rows. Much better :) -- Dan Nelson dnel...@allantgroup.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?
SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5 Ouch. You might want to calculate the rectange enclosing your target distance, add an index on lat (and/or long), and add the rectangle check to your where clause: WHERE latitude BETWEEN lt1 AND lt2 AND longitude BETWEEN ln2 AND ln2. That way mysql can use the index to pare down most of the rows without having to call all those trig functions for every zipcode. I like this idea the best (it always bothered me running a query that involved multiple mathmatical functions). So... Here's the scratch php code I ended up with... Anyone see any problems with it? The only problem I see is that I think the old code was more circular this will be a square (within the limits of a square on a non-spherical earth... etc.. etc..) ... so there will be more zip codes included in the corners. If there are too many complaints about that I might look at some sort of overlapping rectangle scheme instead of a square. function ChangeInLatitude($Miles) { return rad2deg($Miles/3960); } function ChangeInLongitude($Lat, $Miles) { return rad2deg($Miles/3960*cos(deg2rad($Lat))); } $Miles = 5; $OriginalLat = 39.0788994; $OriginalLon = -77.1227036; $ChangeInLat = ChangeInLatitude($Miles); $ChangeInLon = ChangeInLongitude($OriginalLat, $Miles); $MinLat = $OriginalLat-$ChangeInLat; $MaxLat = $OriginalLat+$ChangeInLat; $MinLon = $OriginalLon-$ChangeInLon; $MaxLon = $OriginalLon+$ChangeInLon; My only other question is... when I explained the new query... On the dependent subquery it says possible keys are zip, longitude and latitude but it used zip. It seems like a better index would be longitude or latitude? On the primary query, even though there is an index on custzip it doesn't say it's using ANY indexes. I should probably leave well enough alone... but I'm curious. Thanks again! Matt -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?
It sounds like you want to use spatial indexes, but they only became available in v4.1 http://dev.mysql.com/doc/refman/5.0/en/create-index.html http://dev.mysql.com/doc/refman/5.0/en/using-a-spatial-index.html You would need to switch your table type from InnoDB to MyISAM, which is fairly easy with ALTER TABLE. But that should allow you to drop all your calculations in the query. You don't have to do any re-architecture to change you subquery to a join: SELECT custzip FROM customers JOIN (SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5) AS zips ON custzip=zip Often times that simple change speeds things up considerably in MySQL. An explain should show it has a DERIVED TABLE if I recall correctly. Brent Baisley On Thu, Jun 18, 2009 at 9:06 PM, Matt Neimeyerm...@neimeyer.org wrote: I'm converting a PHP app from using Visual FoxPro as the database backend to using MySQL as the backend. I'm testing on MySQL 4.1.22 on Mac OSX 10.4. The end application will be deployed cross platform and to both 4.x and 5.x MySQL servers. This query returned 21 records in .27 seconds. SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5 This query returned 21442 records in 1.08 seconds. SELECT custzip FROM customers This query is still running half an hour later, with a Time of 2167 and a State of Sending Data (according to the mysql process list) SELECT custzip FROM customers WHERE custzip IN (SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5) When I try to EXPLAIN the query it gives me the following... id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra 1,PRIMARY,customers,index,NULL,cw_custzip,30,NULL,21226,Using where; Using index 2,DEPENDENT SUBQUERY,zipcodes,ALL,NULL,NULL,NULL,NULL,42144,Using where If it matters both tables are INNODB and both customers.custzip and zipcodes.zip are indexed. We used a program called DBConvert from DMSoft to convert the data so it's exactly the same on both the VFP side and the MySQL side. With all that in mind... VFP returns the exact same query in 5-10 seconds and that includes render time in the web browser. By comparison... the query WHERE id IN (SELECT id FROM phone WHERE phonedate = '2001-01-01' AND phonedate = '2009-06-18') returns almost instantly. I'm at a complete loss... The suggestions I've seen online for optimizing Dependent Subquery's basically revolve around changing it from a sub-query to a join but that would require more re-architecturing than I want to do... (Unless I'm forced) Especially since more than a few of those solutions suggested precalculating the distance between zipcodes which only works if the distances are known (only allow 10, 50 and 100 mile radi for example) Any ideas? Thanks in advance! Matt -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=brentt...@gmail.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?
Matt, This query is still running half an hour later, with a Time of 2167 and a State of Sending Data (according to the mysql process list) SELECT custzip FROM customers WHERE custzip IN ( ... For explanation alternatives see The unbearable slowness of IN() at http://localhost/artful/infotree/queries.php. PB - Matt Neimeyer wrote: I'm converting a PHP app from using Visual FoxPro as the database backend to using MySQL as the backend. I'm testing on MySQL 4.1.22 on Mac OSX 10.4. The end application will be deployed cross platform and to both 4.x and 5.x MySQL servers. This query returned 21 records in .27 seconds. SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5 This query returned 21442 records in 1.08 seconds. SELECT custzip FROM customers This query is still running half an hour later, with a Time of 2167 and a State of Sending Data (according to the mysql process list) SELECT custzip FROM customers WHERE custzip IN (SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5) When I try to EXPLAIN the query it gives me the following... id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra 1,PRIMARY,customers,index,NULL,cw_custzip,30,NULL,21226,Using where; Using index 2,DEPENDENT SUBQUERY,zipcodes,ALL,NULL,NULL,NULL,NULL,42144,Using where If it matters both tables are INNODB and both customers.custzip and zipcodes.zip are indexed. We used a program called DBConvert from DMSoft to convert the data so it's exactly the same on both the VFP side and the MySQL side. With all that in mind... VFP returns the exact same query in 5-10 seconds and that includes render time in the web browser. By comparison... the query WHERE id IN (SELECT id FROM phone WHERE phonedate = '2001-01-01' AND phonedate = '2009-06-18') returns almost instantly. I'm at a complete loss... The suggestions I've seen online for optimizing Dependent Subquery's basically revolve around changing it from a sub-query to a join but that would require more re-architecturing than I want to do... (Unless I'm forced) Especially since more than a few of those solutions suggested precalculating the distance between zipcodes which only works if the distances are known (only allow 10, 50 and 100 mile radi for example) Any ideas? Thanks in advance! Matt No virus found in this incoming message. Checked by AVG - www.avg.com Version: 8.5.364 / Virus Database: 270.12.78/2185 - Release Date: 06/18/09 05:53:00
Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?
Peter, On Thu, Jun 18, 2009 at 9:27 PM, Peter Brawleypeter.braw...@earthlink.net wrote: For explanation alternatives see The unbearable slowness of IN() at http://localhost/artful/infotree/queries.php. you prolly meant to not post a url pointing at your local copy of your website. This works better for most of us: http://www.artfulsoftware.com/infotree/queries.php ;) Walter -- Walter Heck, Engineer @ Open Query (http://openquery.com) Affordable Training and ProActive Support for MySQL related technologies Follow our blog at http://openquery.com/blog/ OurDelta: free enhanced builds for MySQL @ http://ourdelta.org -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?
I'm converting a PHP app from using Visual FoxPro as the database backend to using MySQL as the backend. I'm testing on MySQL 4.1.22 on Mac OSX 10.4. The end application will be deployed cross platform and to both 4.x and 5.x MySQL servers. This query returned 21 records in .27 seconds. SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5 This query returned 21442 records in 1.08 seconds. SELECT custzip FROM customers This query is still running half an hour later, with a Time of 2167 and a State of Sending Data (according to the mysql process list) SELECT custzip FROM customers WHERE custzip IN (SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5) When I try to EXPLAIN the query it gives me the following... id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra 1,PRIMARY,customers,index,NULL,cw_custzip,30,NULL,21226,Using where; Using index 2,DEPENDENT SUBQUERY,zipcodes,ALL,NULL,NULL,NULL,NULL,42144,Using where If it matters both tables are INNODB and both customers.custzip and zipcodes.zip are indexed. We used a program called DBConvert from DMSoft to convert the data so it's exactly the same on both the VFP side and the MySQL side. With all that in mind... VFP returns the exact same query in 5-10 seconds and that includes render time in the web browser. By comparison... the query WHERE id IN (SELECT id FROM phone WHERE phonedate = '2001-01-01' AND phonedate = '2009-06-18') returns almost instantly. I'm at a complete loss... The suggestions I've seen online for optimizing Dependent Subquery's basically revolve around changing it from a sub-query to a join but that would require more re-architecturing than I want to do... (Unless I'm forced) Especially since more than a few of those solutions suggested precalculating the distance between zipcodes which only works if the distances are known (only allow 10, 50 and 100 mile radi for example) Any ideas? Thanks in advance! Matt -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?
I often find doing the IN (subquery...) is really slow versus doing a join: SELECT cutzip FROM customers INNER JOIN zipcodes ON customers.zipcode=zipcodes.zip WHERE degrees(acos(sin(radians(39.0788994)) * sin(radians(latitude)) + cos(radians(39.0788994)) * cos(radians(latitude)) * cos(radians(-77.1227036-longitude)) ) ) *60 *1.1515 5 That query may have un-matched ()'s, not sure. hard to tell =) Try a join. -jw On Thu, Jun 18, 2009 at 8:06 PM, Matt Neimeyer m...@neimeyer.org wrote: I'm converting a PHP app from using Visual FoxPro as the database backend to using MySQL as the backend. I'm testing on MySQL 4.1.22 on Mac OSX 10.4. The end application will be deployed cross platform and to both 4.x and 5.x MySQL servers. This query returned 21 records in .27 seconds. SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5 This query returned 21442 records in 1.08 seconds. SELECT custzip FROM customers This query is still running half an hour later, with a Time of 2167 and a State of Sending Data (according to the mysql process list) SELECT custzip FROM customers WHERE custzip IN (SELECT zip FROM zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+ cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude*60*1.1515 5) When I try to EXPLAIN the query it gives me the following... id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra 1,PRIMARY,customers,index,NULL,cw_custzip,30,NULL,21226,Using where; Using index 2,DEPENDENT SUBQUERY,zipcodes,ALL,NULL,NULL,NULL,NULL,42144,Using where If it matters both tables are INNODB and both customers.custzip and zipcodes.zip are indexed. We used a program called DBConvert from DMSoft to convert the data so it's exactly the same on both the VFP side and the MySQL side. With all that in mind... VFP returns the exact same query in 5-10 seconds and that includes render time in the web browser. By comparison... the query WHERE id IN (SELECT id FROM phone WHERE phonedate = '2001-01-01' AND phonedate = '2009-06-18') returns almost instantly. I'm at a complete loss... The suggestions I've seen online for optimizing Dependent Subquery's basically revolve around changing it from a sub-query to a join but that would require more re-architecturing than I want to do... (Unless I'm forced) Especially since more than a few of those solutions suggested precalculating the distance between zipcodes which only works if the distances are known (only allow 10, 50 and 100 mile radi for example) Any ideas? Thanks in advance! Matt -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=joh...@pixelated.net -- - Johnny Withers 601.209.4985 joh...@pixelated.net