RE: Speed difference between boolean full-text searches and full-text searches
OK, I tried '+music +mix +2001' instead of 'music mix 2001' IN BOOLEAN MODE, and the query time is the same, ~21 sec.:

SELECT artists.name, cds.title, tracks.title
FROM artists, tracks, cds
WHERE MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE)
  AND MATCH (cds.title) AGAINST ('+music +mix +2001' IN BOOLEAN MODE)
  AND artists.artistid = cds.artistid
  AND artists.artistid = tracks.artistid
  AND cds.cdid = tracks.cdid

Do you have an explanation for why this is so much slower than the following query?

SELECT artists.name, cds.title, tracks.title
FROM artists, cds, tracks
WHERE artists.artistid = cds.artistid
  AND artists.artistid = tracks.artistid
  AND cds.cdid = tracks.cdid
  AND MATCH (artists.name) AGAINST ('madonna')
  AND MATCH (cds.title) AGAINST ('music')
  AND MATCH (cds.title) AGAINST ('mix')
  AND MATCH (cds.title) AGAINST ('2001')

Regards

-----Original Message-----
From: Chuck Gadd [mailto:[EMAIL PROTECTED]
Sent: Monday, December 08, 2003 21:50
To: Uros Kotnik; [EMAIL PROTECTED]
Subject: Re: Speed difference between boolean full-text searches and full-text searches

Uros Kotnik wrote:

> It makes sense, but Sergei G. said:
>
> > And are you sure the numbers are correct, the first query - the one
> > without IN BOOLEAN MODE - is faster? I would expect the opposite.
>
> I guess that for my DB I can't expect satisfactory times in boolean mode?
>
> Also, when searching without IN BOOLEAN MODE and including a search
> criterion on the TRACKS table (13,841,930 rows), like
>
> AND MATCH (tracks.title) AGAINST ('remix')
>
> I get times of ~10 sec. Am I doing something wrong, or are these results
> normal for this amount of data? I would be satisfied with times of
> 0.5 - 1 sec.

If I'm not mistaken, IN BOOLEAN MODE simply changes the parser logic. It tells MySQL to process the special characters, like +-*.

I don't think it's the IN BOOLEAN MODE that is causing the slow query, but the fact that you are looking for the phrase.

If you were to do

SELECT artists.name, cds.title, tracks.title
FROM artists, cds, tracks
WHERE artists.artistid = cds.artistid
  AND artists.artistid = tracks.artistid
  AND cds.cdid = tracks.cdid
  AND MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE)
  AND MATCH (cds.title) AGAINST ('+music +mix +2001' IN BOOLEAN MODE)

then you'd probably still get the fast search time, since the query simply requires all three words. MySQL can resolve this just using the index.

In your example, the BOOLEAN MODE in MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE) isn't doing anything special, since you aren't using any special characters to modify the search expression.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
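For reference, the three ways of writing the title condition in boolean mode do not mean the same thing, which may matter as much as the timing. The statements below are a sketch, assuming the cds table and its full-text index on title as described in this thread, and the boolean operator semantics given in the MySQL manual (worth checking against the manual for the exact 4.0.x version in use):

```sql
-- No operators: each word is optional, so a title containing ANY of the
-- three words matches. This is an OR-style search, not a phrase search.
SELECT title FROM cds
WHERE MATCH (title) AGAINST ('music mix 2001' IN BOOLEAN MODE);

-- Leading +: every word is required, in any order and any position.
SELECT title FROM cds
WHERE MATCH (title) AGAINST ('+music +mix +2001' IN BOOLEAN MODE);

-- Double quotes: the words must appear together as a phrase.
SELECT title FROM cds
WHERE MATCH (title) AGAINST ('"music mix 2001"' IN BOOLEAN MODE);
```

Running these against cds alone, before adding the joins, and comparing how many rows each returns would show whether the slow variant is simply matching far more titles than the others.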
RE: Speed difference between boolean full-text searches and full-text searches
OK, I will give you more details.

MySQL version: 4.0.16
CPU: 2 x Celeron 1000
RAM: 1 GB

Table CDS: 1,053,794 rows, FT index on title, Data 67,646 KB, Index 70,401 KB
Table ARTISTS: 292,330 rows, FT index on name, Data 8,096 KB, Index 17,218 KB
Table TRACKS: 13,841,930 rows, FT index on title, Data 625,360 KB, Index 646,672 KB

ft_min_word_len = 3
key_buffer_size = 786432000

EXPLAIN gives the same information for both queries:

| table   | type     | possible_keys                | key      | key_len | ref      | rows | Extra       |
| artists | fulltext | PRIMARY,ft_name              | ft_name  | 0       |          | 1    | Using where |
| cds     | fulltext | PRIMARY,artistIndex,ft_title | ft_title | 0       |          | 1    | Using where |
| tracks  | ref      | PRIMARY,artistIndex          | PRIMARY  | 4       | cds.cdId | 13   | Using where |

The last results I sent are not correct, because I forgot to include one more join condition, artists.artistid = cds.artistid. A bad oversight, I know. These are the new results.

Time for the first query: 21 sec.

SELECT artists.name, cds.title, tracks.title
FROM artists, cds, tracks
WHERE artists.artistid = cds.artistid
  AND artists.artistid = tracks.artistid
  AND cds.cdid = tracks.cdid
  AND MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE)
  AND MATCH (cds.title) AGAINST ('music mix 2001' IN BOOLEAN MODE)

Time for the second query: 1 sec.

SELECT artists.name, cds.title, tracks.title
FROM artists, cds, tracks
WHERE artists.artistid = cds.artistid
  AND artists.artistid = tracks.artistid
  AND cds.cdid = tracks.cdid
  AND MATCH (artists.name) AGAINST ('madonna')
  AND MATCH (cds.title) AGAINST ('music')
  AND MATCH (cds.title) AGAINST ('mix')
  AND MATCH (cds.title) AGAINST ('2001')

One more thing I noticed with the last query: when I change the order of the tables in the FROM clause to FROM artists, tracks, cds instead of FROM artists, cds, tracks, I get a time of 1.9 sec. instead of 1 sec. Why is that?

Regards

-----Original Message-----
From: Sergei Golubchik [mailto:[EMAIL PROTECTED]
Sent: Monday, December 08, 2003 00:02
To: Uros Kotnik
Cc: [EMAIL PROTECTED]
Subject: Re: Speed difference between boolean full-text searches and full-text searches

Hi!

On Nov 27, Uros Kotnik wrote:

> Executing this SQL takes ~5 sec.
>
> select artists.name, cds.title, tracks.title
> from artists, tracks, cds
> where artists.artistid = tracks.artistid
>   and cds.cdid = tracks.cdid
>   and MATCH (artists.name) AGAINST ('madonna')
>   and MATCH (cds.title) AGAINST ('music')
>   and MATCH (cds.title) AGAINST ('mix')
>   and MATCH (cds.title) AGAINST ('2001')
> limit 1001
>
> and this, ~40 sec.
>
> select artists.name, cds.title, tracks.title
> from artists, tracks, cds
> where artists.artistid = tracks.artistid
>   and cds.cdid = tracks.cdid
>   and MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE)
>   and MATCH (cds.title) AGAINST ('music mix 2001' IN BOOLEAN MODE)
> limit 1001
>
> Same result, but the speed difference is quite large. Why is that?

What does EXPLAIN show for both queries?

And are you sure the numbers are correct, the first query - the one without IN BOOLEAN MODE - is faster? I would expect the opposite.

Regards,
Sergei

--
Sergei Golubchik [EMAIL PROTECTED]
MySQL AB, Senior Software Developer
Osnabrueck, Germany
www.mysql.com
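On the FROM-order observation in this message: the optimizer normally picks the join order itself, but for testing the order can be pinned with STRAIGHT_JOIN, which makes MySQL read the tables in the order they are listed in the FROM clause. A sketch using the second query from this message; whether artists, cds, tracks really is the best order for this data is an assumption to confirm with EXPLAIN and repeated timings:

```sql
-- STRAIGHT_JOIN forces the join order artists -> cds -> tracks,
-- exactly as written in the FROM clause.
SELECT STRAIGHT_JOIN artists.name, cds.title, tracks.title
FROM artists, cds, tracks
WHERE artists.artistid = cds.artistid
  AND artists.artistid = tracks.artistid
  AND cds.cdid = tracks.cdid
  AND MATCH (artists.name) AGAINST ('madonna')
  AND MATCH (cds.title) AGAINST ('music')
  AND MATCH (cds.title) AGAINST ('mix')
  AND MATCH (cds.title) AGAINST ('2001');
```

If the timing then stays the same however the query is otherwise written, the 1 sec. versus 1.9 sec. difference most likely comes from the join order the optimizer picked rather than from the full-text conditions.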
RE: Speed difference between boolean full-text searches and full-text searches
It makes sense, but Sergei G. said:

> And are you sure the numbers are correct, the first query - the one
> without IN BOOLEAN MODE - is faster? I would expect the opposite.

I guess that for my DB I can't expect satisfactory times in boolean mode?

Also, when searching without IN BOOLEAN MODE and including a search criterion on the TRACKS table (13,841,930 rows), like

AND MATCH (tracks.title) AGAINST ('remix')

I get times of ~10 sec. Am I doing something wrong, or are these results normal for this amount of data? I would be satisfied with times of 0.5 - 1 sec.

-----Original Message-----
From: Chuck Gadd [mailto:[EMAIL PROTECTED]
Sent: Monday, December 08, 2003 13:17
To: Uros Kotnik; [EMAIL PROTECTED]
Subject: Re: Speed difference between boolean full-text searches and full-text searches

Uros Kotnik wrote:

> Time for first SQL: 21 sec.
>
> SELECT artists.name, cds.title, tracks.title
> FROM artists, cds, tracks
> WHERE artists.artistid = cds.artistid
>   AND artists.artistid = tracks.artistid
>   AND cds.cdid = tracks.cdid
>   AND MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE)
>   AND MATCH (cds.title) AGAINST ('music mix 2001' IN BOOLEAN MODE)

In this case, it cannot resolve the query JUST using indexes. After finding all records in the index where artists.name matches madonna and title contains all the words music, mix, 2001, it must then retrieve each record and examine the title field to see whether the three words are found together in the phrase.

In your other example, it only needs to use the fulltext indexes to know which records satisfy your query, resulting in a MUCH faster query time.
Once again, three queries, same result, huge speed difference
I posted this a few days ago but got no answer; I also posted it to the benchmark list.

Executing this SQL takes ~5 sec.

select artists.name, cds.title, tracks.title
from artists, tracks, cds
where artists.artistid = tracks.artistid
  and cds.cdid = tracks.cdid
  and MATCH (artists.name) AGAINST ('madonna')
  and MATCH (cds.title) AGAINST ('music')
  and MATCH (cds.title) AGAINST ('mix')
  and MATCH (cds.title) AGAINST ('2001')

and this, ~40 sec.

select artists.name, cds.title, tracks.title
from artists, tracks, cds
where artists.artistid = tracks.artistid
  and cds.cdid = tracks.cdid
  and MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE)
  and MATCH (cds.title) AGAINST ('music mix 2001' IN BOOLEAN MODE)

and executing this takes less than 1 sec.

select artists.name, cds.title, tracks.title
from artists, tracks, cds
where artists.artistid = tracks.artistid
  and cds.cdid = tracks.cdid
  and artists.name like '%madonna%'
  and cds.title like '%music mix 2001%'

Same result, but the speed difference is quite large. Why is that?

This is only on a test DB; I didn't try it on the real-life DB, where I have ~14 million rows in the tracks table.

Regards
RE: Once again, three queries, same result, huge speed difference
Another thing I noticed: this query takes less than a second:

SELECT artists.name, cds.title, tracks.title
FROM artists, tracks, cds
WHERE artists.artistid = tracks.artistid
  AND cds.cdid = tracks.cdid
  AND MATCH (name) AGAINST ('madonna')

But when I add one more AND, it takes more than 15 min.:

SELECT artists.name, cds.title, tracks.title
FROM artists, tracks, cds
WHERE artists.artistid = tracks.artistid
  AND cds.cdid = tracks.cdid
  AND MATCH (name) AGAINST ('madonna')
  AND MATCH (cds.title) AGAINST ('music')

-----Original Message-----
From: Tobias Asplund [mailto:[EMAIL PROTECTED]
Sent: Thursday, December 04, 2003 11:50
To: Uros Kotnik
Cc: [EMAIL PROTECTED]
Subject: Re: Once again, three queries, same result, huge speed difference

On Thu, 4 Dec 2003, Uros Kotnik wrote:

> I posted this a few days ago but got no answer; I also posted it to the
> benchmark list.
>
> Executing this SQL takes ~5 sec.
>
> select artists.name, cds.title, tracks.title
> from artists, tracks, cds
> where artists.artistid = tracks.artistid
>   and cds.cdid = tracks.cdid
>   and MATCH (artists.name) AGAINST ('madonna')
>   and MATCH (cds.title) AGAINST ('music')
>   and MATCH (cds.title) AGAINST ('mix')
>   and MATCH (cds.title) AGAINST ('2001')
>
> and this, ~40 sec.
>
> select artists.name, cds.title, tracks.title
> from artists, tracks, cds
> where artists.artistid = tracks.artistid
>   and cds.cdid = tracks.cdid
>   and MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE)
>   and MATCH (cds.title) AGAINST ('music mix 2001' IN BOOLEAN MODE)
>
> and executing this takes less than 1 sec.
>
> select artists.name, cds.title, tracks.title
> from artists, tracks, cds
> where artists.artistid = tracks.artistid
>   and cds.cdid = tracks.cdid
>   and artists.name like '%madonna%'
>   and cds.title like '%music mix 2001%'
>
> Same result, but the speed difference is quite large. Why is that?
>
> This is only on a test DB; I didn't try it on the real-life DB, where I
> have ~14 million rows in the tracks table.
>
> Regards

Can you post EXPLAIN SELECT of those queries as well, please?
RE: Once again, three queries, same result, huge speed difference
Hmmm, if I execute these 3 queries at any time, in any order, I get the same execution times.

Yes, here is the EXPLAIN output.

explain select artists.name, cds.title, tracks.title
from artists, tracks, cds
where artists.artistid = tracks.artistid
  and cds.cdid = tracks.cdid
  and MATCH (artists.name) AGAINST ('madonna')
  and MATCH (cds.title) AGAINST ('music')
  and MATCH (cds.title) AGAINST ('mix')
  and MATCH (cds.title) AGAINST ('2001')

| table   | type     | possible_keys          | key            | key_len | ref              | rows | Extra       |
| artists | fulltext | PRIMARY,name           | name           | 0       |                  | 1    | Using where |
| tracks  | ref      | PRIMARY,artistIndex    | artistIndex    | 5       | artists.artistId | 27   | Using where |
| cds     | fulltext | PRIMARY,fulltext_title | fulltext_title | 0       |                  | 1    | Using where |

explain select artists.name, cds.title, tracks.title
from artists, tracks, cds
where artists.artistid = tracks.artistid
  and cds.cdid = tracks.cdid
  and MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE)
  and MATCH (cds.title) AGAINST ('music mix 2001' IN BOOLEAN MODE)

| table   | type     | possible_keys          | key            | key_len | ref              | rows | Extra       |
| artists | fulltext | PRIMARY,name           | name           | 0       |                  | 1    | Using where |
| tracks  | ref      | PRIMARY,artistIndex    | artistIndex    | 5       | artists.artistId | 27   | Using where |
| cds     | fulltext | PRIMARY,fulltext_title | fulltext_title | 0       |                  | 1    | Using where |

explain select artists.name, cds.title, tracks.title
from artists, tracks, cds
where artists.artistid = tracks.artistid
  and cds.cdid = tracks.cdid
  and artists.name like '%madonna%'
  and cds.title like '%music mix 2001%'

| table   | type   | possible_keys       | key         | key_len | ref              | rows  | Extra       |
| artists | ALL    | PRIMARY             | NULL        | NULL    | NULL             | 23806 | Using where |
| tracks  | ref    | PRIMARY,artistIndex | artistIndex | 5       | artists.artistId | 27    | Using where |
| cds     | eq_ref | PRIMARY             | PRIMARY     | 4       | tracks.cdId      | 1     | Using where |

-----Original Message-----
From: Brent Baisley [mailto:[EMAIL PROTECTED]
Sent: Thursday, December 04, 2003 16:38
To: Uros Kotnik
Cc: [EMAIL PROTECTED]
Subject: Re: Once again, three queries, same result, huge speed difference

You need to take cache into consideration when doing your testing, both the MySQL cache and the OS cache. That means rebooting between each query that you run to clear the database and OS cache.
-or-
Run each query 3 or 4 times (or 5, or even 10) consecutively and either take the average or the fastest. Doing it this way will make sure that the cache is used equally for all queries.

You should also do an EXPLAIN to see how MySQL is executing each query.

On Dec 4, 2003, at 5:35 AM, Uros Kotnik wrote:

> Same result, but the speed difference is quite large. Why is that?
>
> This is only on a test DB; I didn't try it on the real-life DB, where I
> have ~14 million rows in the tracks table.

--
Brent Baisley
Systems Architect
Landover Associates, Inc.
Search Advisory Services for Advanced Technology Environments
p: 212.759.6400/800.759.0577
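The repeated-run advice above could look like this in practice. It is a sketch rather than a full benchmark, and it assumes a MySQL version with the query cache (4.0.1 or later), where SQL_NO_CACHE keeps the result out of that cache. The key buffer and the OS file cache stay warm regardless, so each statement still needs to be run several times and the fastest or average runs compared:

```sql
-- Run this statement several times in a row and note each time reported
-- by the client. SQL_NO_CACHE keeps the result out of the query cache,
-- so later runs measure execution rather than a stored result.
SELECT SQL_NO_CACHE artists.name, cds.title, tracks.title
FROM artists, tracks, cds
WHERE artists.artistid = tracks.artistid
  AND cds.cdid = tracks.cdid
  AND MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE)
  AND MATCH (cds.title) AGAINST ('music mix 2001' IN BOOLEAN MODE);

-- Then ask MySQL how it plans to execute the same statement.
EXPLAIN
SELECT artists.name, cds.title, tracks.title
FROM artists, tracks, cds
WHERE artists.artistid = tracks.artistid
  AND cds.cdid = tracks.cdid
  AND MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE)
  AND MATCH (cds.title) AGAINST ('music mix 2001' IN BOOLEAN MODE);
```

Doing the same for the other two queries gives timings that were all taken under comparable cache conditions.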
Speed difference between boolean full-text searches and full-text searches
Executing this SQL takes ~5 sec.

select artists.name, cds.title, tracks.title
from artists, tracks, cds
where artists.artistid = tracks.artistid
  and cds.cdid = tracks.cdid
  and MATCH (artists.name) AGAINST ('madonna')
  and MATCH (cds.title) AGAINST ('music')
  and MATCH (cds.title) AGAINST ('mix')
  and MATCH (cds.title) AGAINST ('2001')
limit 1001

and this, ~40 sec.

select artists.name, cds.title, tracks.title
from artists, tracks, cds
where artists.artistid = tracks.artistid
  and cds.cdid = tracks.cdid
  and MATCH (artists.name) AGAINST ('madonna' IN BOOLEAN MODE)
  and MATCH (cds.title) AGAINST ('music mix 2001' IN BOOLEAN MODE)
limit 1001

Same result, but the speed difference is quite large. Why is that?

Regards