You're still doing a full table scan with REGEX, so you'll never get it really fast. I was thinking it would be slightly faster because of less comparisons. It's the full table scan and no use of indexes that you want to get away from. Without doing that, the only way to get things faster is with faster disks and more RAM for caching.

I assume you created a full text index on the title field? Sorry, but have to ask. You can perform a full text search without full text index, but it's not going to be fast since no index is used.

Try adding a plus sign in front of the words or phrases you want to search on when using IN BOOLEAN MODE.
MATCH(emp.title) AGAINST('+"Vice President"' IN BOOLEAN MODE)

Although I don't know why it would be matching Vice Chairman, that's just not right. Are you patched to the latest version of MySQL? I remember getting some really weird results with full text searches with an older version of 4.1.

Regardless, you will have to adjust the full text settings and reindex to find words like VP, since that is shorter than the default 4 character word length minimum. I had to do this to search on things like "VB" (Visual Basic). MySQL still won't index "V.P." since periods aren't valid characters for words and you wouldn't want MySQL indexing single letter "words". I have a similar problem trying to search on "C", the programming language. I've figured out a work around for "C++", but not "C". I think you're stuck with doing at least a little data scrubbing and do cleanup for new data before saving.

On Oct 5, 2005, at 7:05 AM, Harini Raghavan wrote:

Hi Brent,

Using REGEXP did not really help with the performance. I need to do whole word matching sowould prefer not to do LIKE '%Vice President %' as it may return ome negative results. I separated out some of the text based columns in to a different table using MYISAM storage engine. Using FULLTEXT with IN BOOLEAN MODE was a bit slow, but without the IN BOOLEAN MODE, it seems fast. However I think the phrase searches are not working properly.

For example the below query returned records where the title was 'Vice Chairman': select emp.title from employment_title emp where MATCH(emp.title) AGAINST('"Vice President"')

I have verified the syntax of phrase query, and it seems to bve correct. Any idea why this is happening? Also if I have multiple phrases is the following query syntax correct? select emp.title from employment_title emp where MATCH(emp.title) AGAINST('"V.P." "VP" "Vice President" "Vice-President"')

Thanks,
Harini

Brent Baisley wrote:


Egads! That's a lot of OR's. You probably want to use REGEXP instead of all those OR's. REGEXP can be slow, but you'll be doing far less comparisons than what you have in your current query so it may be faster than what you have.

Something like this:
SELECT ...
WHERE ... emp.title REGEXP 'V[.]?P[.]?' OR emp.title REGEXP 'Vice [-]? President'

I think that will match everything you have. At the very least you don't have to check for periods, commas, space and everything else before and after what you are searching on. Searching on emp.title LIKE '%Vice President%', will find 'Vice President' anywhere in the text, regardless of what come before or after it.

One thing you should think about trying to do is breaking out the fields you need to do a full text search on into a separate table that you can make MYISAM. You'll be adding a JOIN to your queries that need to do the full text search, but it should be a lot quicker and your queries much simpler. You'll then have a mix of InnoDB and MYISAM tables, which is perfectly legal.

On Oct 4, 2005, at 11:17 AM, Harini Raghavan wrote:


Hi,

I am using MYSQL 4.1 database in my J2ee application. I am facing performance issues with some queries that are being run on text fields. Since MYISAM storage engine does not support transactions (and my application requires the database tables to support transaction), I have not been able to use FULL TEXT searches. I need to perform token matching against the text fields and so use a lot of LIKE expressions in the query. Below is one such query which is taking almost 2 mins to execute.

select count(emp.id) from executive as exec1 , company comp, target_company targetComp, employment as emp
where emp.executive_id = exec1.id
and emp.company_id = comp.id
and comp.id = targetComp.company_id and ((emp.title like 'V.P.' OR emp.title like 'V.P. %' OR emp.title like 'V.P., %' OR emp.title like 'V.P.. %' OR emp.title like '% V.P. %' OR emp.title like '% V.P., %' OR emp.title like '% V.P.. %' OR emp.title like '% V.P.' OR emp.title like '% V.P..') OR (emp.title like 'VP' OR emp.title like 'VP %' OR emp.title like 'VP, %' OR emp.title like 'VP. %' OR emp.title like '% VP %' OR emp.title like '% VP, %' OR emp.title like '% VP. %' OR emp.title like '% VP' OR emp.title like '% VP.') OR (emp.title like 'Vice President' OR emp.title like 'Vice President %' OR emp.title like 'Vice President, %' OR emp.title like 'Vice President. %' OR emp.title like '% Vice President %' OR emp.title like '% Vice President, %' OR emp.title like '% Vice President. %' OR emp.title like '% Vice President' OR emp.title like '% Vice President.') OR (emp.title like 'Vice-President' OR emp.title like 'Vice-President %' OR emp.title like 'Vice-President, %' OR emp.title like 'Vice- President. %' OR emp.title like '% Vice-President %' OR emp.title like '% Vice-President, %' OR emp.title like '% Vice- President. %' OR emp.title like '% Vice-President'
OR emp.title like '% Vice-President.'))
and emp.active = 1

Does MYSQL provide any other option to perform text based searches? Can someone suggest any tips for performance tuning the database in this scenario?

Thanks,
Harini

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? [EMAIL PROTECTED]









--
Brent Baisley
Systems Architect
Landover Associates, Inc.
Search & Advisory Services for Advanced Technology Environments
p: 212.759.6400/800.759.0577



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to