Can you create a second, indexed column in your feed_new temp table that includes the title without the year appended? That might allow you to get by with a single pass through the larger prod table and avoid reading rows from the feed_new table.
-Travis -----Original Message----- From: Jerry Schwartz [mailto:je...@gii.co.jp] Sent: Tuesday, August 10, 2010 3:39 PM To: mysql@lists.mysql.com Subject: Slow query using string operator I'm running a set of queries that look like this: =============== SET @PUBID = (SELECT pub_id FROM pub WHERE pub_code = 'DC'); DROP TEMPORARY TABLE IF EXISTS feed_new; CREATE TEMPORARY TABLE feed_new ( new_title VARCHAR(255), INDEX (new_title) ); INSERT INTO feed_new VALUES ('UK Investment Bonds 2010'), ('UK Protection 2010'), ('UK Personal Insurance Distribution 2010'), ('UK Private Medical Insurance 2010'), ... ('UK Private Motor Insurance 2010'), ('Wealth Management for Non-Resident Indians 2010'), ('Middle Eastern Cards Database 2010') ; SELECT feed_new.new_title AS `New Title FROM Feed`, prod.prod_pub_prod_id AS `Lib Code FROM DB`, prod.prod_title AS `Title FROM DB`, prod.prod_num AS `Prod Num`, prod.prod_published AS `Published FROM DB` FROM feed_new JOIN prod ON LEFT(feed_new.new_title, LENGTH(feed_new.new_title) - 5) = LEFT(prod.prod_title, LENGTH(prod.prod_title) - 5) WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0 ORDER BY feed_new.new_title; ============ With a relatively small number of rows in `feed_new`, this can take many seconds. With 163 rows in `feed_new`, compared against 11234 eligible rows in prod, it took about 28 seconds. Here's what an EXPLAIN looks like: *************************** 1. row *************************** id: 1 select_type: SIMPLE table: feed_new type: index possible_keys: NULL key: PRIMARY key_len: 767 ref: NULL rows: 1 Extra: Using index *************************** 2. row *************************** id: 1 select_type: SIMPLE table: prod type: ref possible_keys: pub_id,pub_id_2 key: pub_id key_len: 48 ref: const rows: 11040 Extra: Using where ========= prod.pub_id is an indexed VARCHAR(15). If I remove the string functions, I don't get what I want -- but the remaining query runs in .05 seconds. Here's an EXPLAIN of that one: =============== us-gii >EXPLAIN -> SELECT -> feed_new.new_title AS `New Title FROM Feed`, -> prod.prod_pub_prod_id AS `Lib Code FROM DB`, -> prod.prod_title AS `Title FROM DB`, -> prod.prod_num AS `Prod Num`, -> prod.prod_published AS `Published FROM DB` -> FROM feed_new JOIN prod -> ON feed_new.new_title = prod.prod_title -> WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0 -> ORDER BY feed_new.new_title\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: feed_new type: index possible_keys: PRIMARY key: PRIMARY key_len: 767 ref: NULL rows: 163 Extra: Using index *************************** 2. row *************************** id: 1 select_type: SIMPLE table: prod type: ref possible_keys: pub_id,prod_title,pub_id_2,prod_title_fulltext key: prod_title key_len: 768 ref: giiexpr_db.feed_new.new_title rows: 1 Extra: Using where ================ Obviously the string manipulation is keeping MySQL from using `prod_title` as a key, but I wouldn't have thought that using `pub_id` instead would be that horrific. Does anyone have any suggestions as to how to speed this business up? I can't get away without some string manipulation, because I'm looking for "near matches" by ignoring the year at the end of the title. Regards, Jerry Schwartz Global Information Incorporated 195 Farmington Ave. Farmington, CT 06032 860.674.8796 / FAX: 860.674.8341 E-mail: je...@gii.co.jp Web site: www.the-infoshop.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=travis_...@hotmail.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org