Re: is there a way to optimize like '%..%' searches ?

2006-07-01 Thread Asif Lodhi

What about partitioning the database?


On 7/1/06, Dan Buettner [EMAIL PROTECTED] wrote:


Wow, that is a tough one.

My question would be, how often is older data really accessed?  Could
you start incorporating a default date range like past 3 months or
past 1 year into all searches, but allow people to override it if
needed?  Then if you add an index on the timestamp column it would
help any searches with a date clause.

Dan


On 6/30/06, Martin Jespersen [EMAIL PROTECTED] wrote:
 It's basically a log that people needs to be able to search with
 wildcards in... the log grows many thousand records per day and never
 gets smaller, so searches just gets slower and slower. There is a sort
 field, the timestamp which is used in the searches, but it only makes
 the searches lsower yet instead of helping in the query, since all that
 does is sort by timestamp desc


 basically the query works like this:

 some searches for foo bar baz and i create an sql that looks like:

 select * from table where  logline like '%foo%bar%baz%' order by
 timestamp desc. I have wrekced my brian plenty but have not come up with
 any otehr way of doing it that gives the needed flexibility in the
 searces. Since what is searched for is not words as such - most loglines
 are actually a single word on the form

somethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialchar
 and so on - the logline is varibale length and variable number of
 entities between the sepcial chars (even the special chars are very
 varied) and of no specific format, thus the needed flexibility in the
 searches.

 If i coud i would changes the log format, but that is not possible since
 this database has loglines going all the way back to the 1980's (with
 more old lines being added as well as new ones) and the format has
 changed many times since then...

 Basically i am stuck with a very crappy heap of data i need to be able
 to search in a smart manner.

 Fulltext seaching would have been ideal if i was able to do boolean
 macthes with leading wildcard, but without it is useless :/

 btw the result doesn't need scoring for relevance at all - what is
 searched for is always the newest matches to the searchterm, regardless
 of relevance (relevance could become handy at a later stage tho, but i
 dare not even think about it atm)


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]




Re: is there a way to optimize like '%..%' searches ?

2006-06-30 Thread Martin Jespersen
It's basically a log that people needs to be able to search with 
wildcards in... the log grows many thousand records per day and never 
gets smaller, so searches just gets slower and slower. There is a sort 
field, the timestamp which is used in the searches, but it only makes 
the searches lsower yet instead of helping in the query, since all that 
does is sort by timestamp desc



basically the query works like this:

some searches for foo bar baz and i create an sql that looks like:

select * from table where  logline like '%foo%bar%baz%' order by 
timestamp desc. I have wrekced my brian plenty but have not come up with 
any otehr way of doing it that gives the needed flexibility in the 
searces. Since what is searched for is not words as such - most loglines 
are actually a single word on the form 
somethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialchar 
and so on - the logline is varibale length and variable number of 
entities between the sepcial chars (even the special chars are very 
varied) and of no specific format, thus the needed flexibility in the 
searches.


If i coud i would changes the log format, but that is not possible since 
this database has loglines going all the way back to the 1980's (with 
more old lines being added as well as new ones) and the format has 
changed many times since then...


Basically i am stuck with a very crappy heap of data i need to be able 
to search in a smart manner.


Fulltext seaching would have been ideal if i was able to do boolean 
macthes with leading wildcard, but without it is useless :/


btw the result doesn't need scoring for relevance at all - what is 
searched for is always the newest matches to the searchterm, regardless 
of relevance (relevance could become handy at a later stage tho, but i 
dare not even think about it atm)



Dan Buettner wrote:

Bummer, Martin.

What more can you tell us about your specific application?  What are
you storing and searching on, exactly?  Any chance you can filter on
another column to reduce the number of records that need a string
search?

Dan


On 6/28/06, Martin Jespersen [EMAIL PROTECTED] wrote:

I was too fast there it seems

fulltext searches doesn't help after all since i can't use leading
wildcards to words :(

too bad i loved the speed :/







--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: is there a way to optimize like '%..%' searches ?

2006-06-30 Thread Dan Buettner

Wow, that is a tough one.

My question would be, how often is older data really accessed?  Could
you start incorporating a default date range like past 3 months or
past 1 year into all searches, but allow people to override it if
needed?  Then if you add an index on the timestamp column it would
help any searches with a date clause.

Dan


On 6/30/06, Martin Jespersen [EMAIL PROTECTED] wrote:

It's basically a log that people needs to be able to search with
wildcards in... the log grows many thousand records per day and never
gets smaller, so searches just gets slower and slower. There is a sort
field, the timestamp which is used in the searches, but it only makes
the searches lsower yet instead of helping in the query, since all that
does is sort by timestamp desc


basically the query works like this:

some searches for foo bar baz and i create an sql that looks like:

select * from table where  logline like '%foo%bar%baz%' order by
timestamp desc. I have wrekced my brian plenty but have not come up with
any otehr way of doing it that gives the needed flexibility in the
searces. Since what is searched for is not words as such - most loglines
are actually a single word on the form
somethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialchar
and so on - the logline is varibale length and variable number of
entities between the sepcial chars (even the special chars are very
varied) and of no specific format, thus the needed flexibility in the
searches.

If i coud i would changes the log format, but that is not possible since
this database has loglines going all the way back to the 1980's (with
more old lines being added as well as new ones) and the format has
changed many times since then...

Basically i am stuck with a very crappy heap of data i need to be able
to search in a smart manner.

Fulltext seaching would have been ideal if i was able to do boolean
macthes with leading wildcard, but without it is useless :/

btw the result doesn't need scoring for relevance at all - what is
searched for is always the newest matches to the searchterm, regardless
of relevance (relevance could become handy at a later stage tho, but i
dare not even think about it atm)



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: is there a way to optimize like '%..%' searches ?

2006-06-29 Thread Dan Buettner

Bummer, Martin.

What more can you tell us about your specific application?  What are
you storing and searching on, exactly?  Any chance you can filter on
another column to reduce the number of records that need a string
search?

Dan


On 6/28/06, Martin Jespersen [EMAIL PROTECTED] wrote:

I was too fast there it seems

fulltext searches doesn't help after all since i can't use leading
wildcards to words :(

too bad i loved the speed :/





--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: is there a way to optimize like '%..%' searches ?

2006-06-28 Thread Dan Buettner

Martin, currently there is not a way to optimize that particular type of query.

You might consider changing to fulltext indexes and searches instead,
as it could be faster, but it is a slightly different animal from LIKE
 so may not fit your needs.
http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html

If there is any way you can limit the search set using another column
that could be indexed, it could speed things up significantly.  Date,
type of document, author, etc.  Even if it were an optional criteria
for people it might speed up at least some of the searches.

Dan



On 6/28/06, Martin Jespersen [EMAIL PROTECTED] wrote:

Hey all i am running mysql 4.1.20.

I have a table with about 2.5 million records and i have to do queries
on it that looks something like:

select * from table where field1 like '%some%thing%' order by field2

This is ofcourse very slow since it refuses to use indexes...

i have fairly large read_rnd_buffer (32M) and sort_buffer (32M) but this
query is still slow as hell...

Any hints as to how i can optimize the query? or tune my server settings
to produce faster results?

Regards

Martin Jespersen


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: is there a way to optimize like '%..%' searches ?

2006-06-28 Thread Peter Van Dijck

Also, perhaps this is good enough for your situation:
like 'some%thing%'
as opposed to
like '%some%thing%'

in this case, mysql can use an index on that column and filter out
everything that doesn't start with some. It's a start at least.

Peter

On 6/28/06, Dan Buettner [EMAIL PROTECTED] wrote:

Martin, currently there is not a way to optimize that particular type of query.

You might consider changing to fulltext indexes and searches instead,
as it could be faster, but it is a slightly different animal from LIKE
 so may not fit your needs.
http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html

If there is any way you can limit the search set using another column
that could be indexed, it could speed things up significantly.  Date,
type of document, author, etc.  Even if it were an optional criteria
for people it might speed up at least some of the searches.

Dan



On 6/28/06, Martin Jespersen [EMAIL PROTECTED] wrote:
 Hey all i am running mysql 4.1.20.

 I have a table with about 2.5 million records and i have to do queries
 on it that looks something like:

 select * from table where field1 like '%some%thing%' order by field2

 This is ofcourse very slow since it refuses to use indexes...

 i have fairly large read_rnd_buffer (32M) and sort_buffer (32M) but this
 query is still slow as hell...

 Any hints as to how i can optimize the query? or tune my server settings
 to produce faster results?

 Regards

 Martin Jespersen

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]





--
find videoblogs: http://mefeedia.com
my blog: http://poorbuthappy.com/ease/
my job: http://petervandijck.net

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: is there a way to optimize like '%..%' searches ?

2006-06-28 Thread Martin Jespersen

Indeed fulltext searches was the cure i was looking for.

Queries went from over 116 seconds to less than half a second

thx for the tip ;)

Dan Buettner wrote:
Martin, currently there is not a way to optimize that particular type of 
query.


You might consider changing to fulltext indexes and searches instead,
as it could be faster, but it is a slightly different animal from LIKE
 so may not fit your needs.
http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html

If there is any way you can limit the search set using another column
that could be indexed, it could speed things up significantly.  Date,
type of document, author, etc.  Even if it were an optional criteria
for people it might speed up at least some of the searches.

Dan



On 6/28/06, Martin Jespersen [EMAIL PROTECTED] wrote:

Hey all i am running mysql 4.1.20.

I have a table with about 2.5 million records and i have to do queries
on it that looks something like:

select * from table where field1 like '%some%thing%' order by field2

This is ofcourse very slow since it refuses to use indexes...

i have fairly large read_rnd_buffer (32M) and sort_buffer (32M) but this
query is still slow as hell...

Any hints as to how i can optimize the query? or tune my server settings
to produce faster results?

Regards

Martin Jespersen




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: is there a way to optimize like '%..%' searches ?

2006-06-28 Thread Martin Jespersen

I was too fast there it seems

fulltext searches doesn't help after all since i can't use leading 
wildcards to words :(


too bad i loved the speed :/



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]