Re: is there a way to optimize like '%..%' searches ?
What about partitioning the database? On 7/1/06, Dan Buettner [EMAIL PROTECTED] wrote: Wow, that is a tough one. My question would be, how often is older data really accessed? Could you start incorporating a default date range like past 3 months or past 1 year into all searches, but allow people to override it if needed? Then if you add an index on the timestamp column it would help any searches with a date clause. Dan On 6/30/06, Martin Jespersen [EMAIL PROTECTED] wrote: It's basically a log that people needs to be able to search with wildcards in... the log grows many thousand records per day and never gets smaller, so searches just gets slower and slower. There is a sort field, the timestamp which is used in the searches, but it only makes the searches lsower yet instead of helping in the query, since all that does is sort by timestamp desc basically the query works like this: some searches for foo bar baz and i create an sql that looks like: select * from table where logline like '%foo%bar%baz%' order by timestamp desc. I have wrekced my brian plenty but have not come up with any otehr way of doing it that gives the needed flexibility in the searces. Since what is searched for is not words as such - most loglines are actually a single word on the form somethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialchar and so on - the logline is varibale length and variable number of entities between the sepcial chars (even the special chars are very varied) and of no specific format, thus the needed flexibility in the searches. If i coud i would changes the log format, but that is not possible since this database has loglines going all the way back to the 1980's (with more old lines being added as well as new ones) and the format has changed many times since then... Basically i am stuck with a very crappy heap of data i need to be able to search in a smart manner. Fulltext seaching would have been ideal if i was able to do boolean macthes with leading wildcard, but without it is useless :/ btw the result doesn't need scoring for relevance at all - what is searched for is always the newest matches to the searchterm, regardless of relevance (relevance could become handy at a later stage tho, but i dare not even think about it atm) -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: is there a way to optimize like '%..%' searches ?
It's basically a log that people needs to be able to search with wildcards in... the log grows many thousand records per day and never gets smaller, so searches just gets slower and slower. There is a sort field, the timestamp which is used in the searches, but it only makes the searches lsower yet instead of helping in the query, since all that does is sort by timestamp desc basically the query works like this: some searches for foo bar baz and i create an sql that looks like: select * from table where logline like '%foo%bar%baz%' order by timestamp desc. I have wrekced my brian plenty but have not come up with any otehr way of doing it that gives the needed flexibility in the searces. Since what is searched for is not words as such - most loglines are actually a single word on the form somethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialchar and so on - the logline is varibale length and variable number of entities between the sepcial chars (even the special chars are very varied) and of no specific format, thus the needed flexibility in the searches. If i coud i would changes the log format, but that is not possible since this database has loglines going all the way back to the 1980's (with more old lines being added as well as new ones) and the format has changed many times since then... Basically i am stuck with a very crappy heap of data i need to be able to search in a smart manner. Fulltext seaching would have been ideal if i was able to do boolean macthes with leading wildcard, but without it is useless :/ btw the result doesn't need scoring for relevance at all - what is searched for is always the newest matches to the searchterm, regardless of relevance (relevance could become handy at a later stage tho, but i dare not even think about it atm) Dan Buettner wrote: Bummer, Martin. What more can you tell us about your specific application? What are you storing and searching on, exactly? Any chance you can filter on another column to reduce the number of records that need a string search? Dan On 6/28/06, Martin Jespersen [EMAIL PROTECTED] wrote: I was too fast there it seems fulltext searches doesn't help after all since i can't use leading wildcards to words :( too bad i loved the speed :/ -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: is there a way to optimize like '%..%' searches ?
Wow, that is a tough one. My question would be, how often is older data really accessed? Could you start incorporating a default date range like past 3 months or past 1 year into all searches, but allow people to override it if needed? Then if you add an index on the timestamp column it would help any searches with a date clause. Dan On 6/30/06, Martin Jespersen [EMAIL PROTECTED] wrote: It's basically a log that people needs to be able to search with wildcards in... the log grows many thousand records per day and never gets smaller, so searches just gets slower and slower. There is a sort field, the timestamp which is used in the searches, but it only makes the searches lsower yet instead of helping in the query, since all that does is sort by timestamp desc basically the query works like this: some searches for foo bar baz and i create an sql that looks like: select * from table where logline like '%foo%bar%baz%' order by timestamp desc. I have wrekced my brian plenty but have not come up with any otehr way of doing it that gives the needed flexibility in the searces. Since what is searched for is not words as such - most loglines are actually a single word on the form somethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialcharsomethingspecialchar and so on - the logline is varibale length and variable number of entities between the sepcial chars (even the special chars are very varied) and of no specific format, thus the needed flexibility in the searches. If i coud i would changes the log format, but that is not possible since this database has loglines going all the way back to the 1980's (with more old lines being added as well as new ones) and the format has changed many times since then... Basically i am stuck with a very crappy heap of data i need to be able to search in a smart manner. Fulltext seaching would have been ideal if i was able to do boolean macthes with leading wildcard, but without it is useless :/ btw the result doesn't need scoring for relevance at all - what is searched for is always the newest matches to the searchterm, regardless of relevance (relevance could become handy at a later stage tho, but i dare not even think about it atm) -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: is there a way to optimize like '%..%' searches ?
Bummer, Martin. What more can you tell us about your specific application? What are you storing and searching on, exactly? Any chance you can filter on another column to reduce the number of records that need a string search? Dan On 6/28/06, Martin Jespersen [EMAIL PROTECTED] wrote: I was too fast there it seems fulltext searches doesn't help after all since i can't use leading wildcards to words :( too bad i loved the speed :/ -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: is there a way to optimize like '%..%' searches ?
Martin, currently there is not a way to optimize that particular type of query. You might consider changing to fulltext indexes and searches instead, as it could be faster, but it is a slightly different animal from LIKE so may not fit your needs. http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html If there is any way you can limit the search set using another column that could be indexed, it could speed things up significantly. Date, type of document, author, etc. Even if it were an optional criteria for people it might speed up at least some of the searches. Dan On 6/28/06, Martin Jespersen [EMAIL PROTECTED] wrote: Hey all i am running mysql 4.1.20. I have a table with about 2.5 million records and i have to do queries on it that looks something like: select * from table where field1 like '%some%thing%' order by field2 This is ofcourse very slow since it refuses to use indexes... i have fairly large read_rnd_buffer (32M) and sort_buffer (32M) but this query is still slow as hell... Any hints as to how i can optimize the query? or tune my server settings to produce faster results? Regards Martin Jespersen -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: is there a way to optimize like '%..%' searches ?
Also, perhaps this is good enough for your situation: like 'some%thing%' as opposed to like '%some%thing%' in this case, mysql can use an index on that column and filter out everything that doesn't start with some. It's a start at least. Peter On 6/28/06, Dan Buettner [EMAIL PROTECTED] wrote: Martin, currently there is not a way to optimize that particular type of query. You might consider changing to fulltext indexes and searches instead, as it could be faster, but it is a slightly different animal from LIKE so may not fit your needs. http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html If there is any way you can limit the search set using another column that could be indexed, it could speed things up significantly. Date, type of document, author, etc. Even if it were an optional criteria for people it might speed up at least some of the searches. Dan On 6/28/06, Martin Jespersen [EMAIL PROTECTED] wrote: Hey all i am running mysql 4.1.20. I have a table with about 2.5 million records and i have to do queries on it that looks something like: select * from table where field1 like '%some%thing%' order by field2 This is ofcourse very slow since it refuses to use indexes... i have fairly large read_rnd_buffer (32M) and sort_buffer (32M) but this query is still slow as hell... Any hints as to how i can optimize the query? or tune my server settings to produce faster results? Regards Martin Jespersen -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- find videoblogs: http://mefeedia.com my blog: http://poorbuthappy.com/ease/ my job: http://petervandijck.net -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: is there a way to optimize like '%..%' searches ?
Indeed fulltext searches was the cure i was looking for. Queries went from over 116 seconds to less than half a second thx for the tip ;) Dan Buettner wrote: Martin, currently there is not a way to optimize that particular type of query. You might consider changing to fulltext indexes and searches instead, as it could be faster, but it is a slightly different animal from LIKE so may not fit your needs. http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html If there is any way you can limit the search set using another column that could be indexed, it could speed things up significantly. Date, type of document, author, etc. Even if it were an optional criteria for people it might speed up at least some of the searches. Dan On 6/28/06, Martin Jespersen [EMAIL PROTECTED] wrote: Hey all i am running mysql 4.1.20. I have a table with about 2.5 million records and i have to do queries on it that looks something like: select * from table where field1 like '%some%thing%' order by field2 This is ofcourse very slow since it refuses to use indexes... i have fairly large read_rnd_buffer (32M) and sort_buffer (32M) but this query is still slow as hell... Any hints as to how i can optimize the query? or tune my server settings to produce faster results? Regards Martin Jespersen -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: is there a way to optimize like '%..%' searches ?
I was too fast there it seems fulltext searches doesn't help after all since i can't use leading wildcards to words :( too bad i loved the speed :/ -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]