Harini Raghavan <[EMAIL PROTECTED]> wrote on 10/04/2005 
11:17:48 AM:

> Hi,
> 
> I am using MYSQL 4.1 database in my J2ee application. I am facing 
> performance issues with some queries that are being run on text fields. 
> Since MYISAM storage engine does not support transactions(and my 
> application requires the database tables to support transaction), I have 

> not been able to use FULL TEXT searches. I need to perform token 
> matching against the text fields and so use a lot of LIKE expressions 
> in the query. Below is one such query which is taking almost 2 mins to 
> execute.
> 
> select count(emp.id)  from executive as exec1 , company comp, 
> target_company targetComp, employment as emp
> where emp.executive_id = exec1.id
> and emp.company_id = comp.id
> and comp.id = targetComp.company_id 
> and ((emp.title like 'V.P.' OR emp.title like 'V.P. %' OR emp.title like 

> 'V.P., %' OR emp.title like 'V.P.. %' OR emp.title like '% V.P. %'
> OR emp.title like '% V.P., %' OR emp.title like '% V.P.. %' OR emp.title 

> like '% V.P.' OR emp.title like '% V.P..')
> OR (emp.title like 'VP' OR emp.title like 'VP %' OR emp.title like 'VP, 
> %' OR emp.title like 'VP. %' OR emp.title like '% VP %'
> OR emp.title like '% VP, %' OR emp.title like '% VP. %' OR emp.title 
> like '% VP' OR emp.title like '% VP.')
> OR (emp.title like 'Vice President' OR emp.title like 'Vice President %' 

> OR emp.title like 'Vice President, %' OR emp.title like 'Vice President. 
%'
> OR emp.title like '% Vice President %' OR emp.title like '% Vice 
> President, %' OR emp.title like '% Vice President. %'
> OR emp.title like '% Vice President' OR emp.title like '% Vice 
> President.') OR (emp.title like 'Vice-President' OR emp.title like 
> 'Vice-President %'
> OR emp.title like 'Vice-President, %' OR emp.title like 'Vice-President. 

> %' OR emp.title like '% Vice-President %'
> OR emp.title like '% Vice-President, %' OR emp.title like '% 
> Vice-President. %' OR emp.title like '% Vice-President'
> OR emp.title like '% Vice-President.'))
> and emp.active = 1
> 
> Does MYSQL provide any other option to perform text based searches? Can 
> someone suggest any tips for performance tuning the database in this 
> scenario?
> 
> Thanks,
> Harini
> 

It seems to me that you are trying to search on unscrubbed data. I can 
guess that your are collecting this data from a variety of sources and 
that those sources don't always use the same abbreviation or punctuation. 
However, it is better do deal with this kind of issue as the data arrives 
(before it enters your database) and not to deal with it during retrieval 
(as you are trying to do).

You need to scrub your data and standardize on certain names and 
abbreviations. Decide that "V.P." is going to be your standard for "Vice 
President" (and any of it's variants) and update all of your data to 
match.

Until then, you can use RLIKE or REGEXP to minimize how many search terms 
you are evaluating.
http://dev.mysql.com/doc/mysql/en/string-comparison-functions.html
http://dev.mysql.com/doc/mysql/en/regexp.html

For example, all of these terms:

(emp.title like 'V.P.' OR emp.title like 'V.P. %' OR emp.title like 
'V.P., %' OR emp.title like 'V.P.. %' OR emp.title like '% V.P. %'
OR emp.title like '% V.P., %' OR emp.title like '% V.P.. %' OR emp.title 
like '% V.P.' OR emp.title like '% V.P..')
OR (emp.title like 'VP' OR emp.title like 'VP %' OR emp.title like 'VP, 
%' OR emp.title like 'VP. %' OR emp.title like '% VP %'
OR emp.title like '% VP, %' OR emp.title like '% VP. %' OR emp.title 
like '% VP' OR emp.title like '% VP.')

can be simplified to just:

emp.title REGEXP '[:space:]*V.*P[ ,.]*'

Would that help?

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Reply via email to