Re: Lucene vs. MySQL Full-Text
I used the MySQL full text search to index about 70K business directory records. It became impossibly slow and I ended up creating my own text search engine similar in concept to Lucene but database driven. It worked much faster than the native MySQL full text search. Other limitations of MySQL MATCH syntax: - only 4 letter words and over are indexed (if you change this it searches VERY slowly) - the MATCH value figure returned is next to useless (it ranges wildly and is not normalized like Lucene values are) - cannot weight certain fields as more important than others. Really it is very limited. John. - Original Message - From: <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Friday, July 23, 2004 1:23 AM Subject: RE: Lucene vs. MySQL Full-Text I also question whether it could handle extreme volume with such good query speed. Has anyone done numbers with 1+ million documents? -Original Message- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 20, 2004 5:44 PM To: Lucene Users List Subject: Re: Lucene vs. MySQL Full-Text On Tuesday 20 July 2004 21:29, Tim Brennan wrote: > Does anyone out there have > anything more concrete they can add? Stemming is still on the MySQL TODO list: http://dev.mysql.com/doc/mysql/en/Fulltext_TODO.html Also, for most people it's easier to extend Lucene than MySQL (as MySQL is written in C(++?)) and there are more powerful queries in Lucene, e.g. fuzzy phrase search. Regards Daniel -- http://www.danielnaber.de - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: Lucene vs. MySQL Full-Text
I also question whether it could handle extreme volume with such good query speed. Has anyone done numbers with 1+ million documents? -Original Message- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 20, 2004 5:44 PM To: Lucene Users List Subject: Re: Lucene vs. MySQL Full-Text On Tuesday 20 July 2004 21:29, Tim Brennan wrote: > ÂDoes anyone out there have > anything more concrete they can add? Stemming is still on the MySQL TODO list: http://dev.mysql.com/doc/mysql/en/Fulltext_TODO.html Also, for most people it's easier to extend Lucene than MySQL (as MySQL is written in C(++?)) and there are more powerful queries in Lucene, e.g. fuzzy phrase search. Regards Daniel -- http://www.danielnaber.de - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Lucene vs. MySQL Full-Text
Interestingly (and ironically) enough, the project I'm currently working on requires full-text searching of Word and PDF resumes. SQL Server is already the required database as well, so we are leveraging the full-text indexing capabilities it has. There is a special trick to drop a BLOB into a table which also has a file extension and mime type columns, and have SQL Server index it with its Index Server capabilities. Lucene was not needed, and we made the pragmatic (simplest that worked well) choice. My recommendation would be to implement something rather than debate it - and if it is good enough, leave it alone, if not then try a different approach :) Erik On Jul 21, 2004, at 7:29 AM, Anson Lau wrote: Depending on what MySQL Full-text search support you probably will lose some of the advance things you get for free from Lucene, such as proximity search, wildcard search, search term and search field boosting, scoring of the documents, etc. Afterall it depends on what you need to do. In our dev team we are actually currently having a mini debate over whether to use lucene for our project or write something from scratch that's based on a DB. We need really good performance. I feel lucene can do our job very well, some of our guys feel using a DB based search can give us greater performance on the type of search we do. Anson -Original Message- From: Florian Sauvin [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 21, 2004 8:55 AM To: Lucene Users List Subject: Re: Lucene vs. MySQL Full-Text On Jul 20, 2004, at 12:29 PM, Tim Brennan wrote: Someone came into my office today and asked me about the project I am trying to Lucene for -- "why aren't you just using a MySQL full-text index to do that" -- after thinking about it for a few minutes, I realized I don't have a great answer. MySQL builds inverted indexes for (in theory) doing the same type of lookup that lucene does. You'd maybe have to build some kind of a layer on the front to mimic Lucene's analyzers, but that wouldn't be too hard My only experience with MySQLfulltext is trivial test apps -- but the MySQL world does have some significant advantages (its a known quantity from an operations perspective, etc). Does anyone out there have anything more concrete they can add? --tim I'd say that MySQL full text is much slower if you have a lot of data... that is one of the reasons we started using lucene (We had a mysql db to do the search), it's way faster! -- Florian - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: Lucene vs. MySQL Full-Text
Depending on what MySQL Full-text search support you probably will lose some of the advance things you get for free from Lucene, such as proximity search, wildcard search, search term and search field boosting, scoring of the documents, etc. Afterall it depends on what you need to do. In our dev team we are actually currently having a mini debate over whether to use lucene for our project or write something from scratch that's based on a DB. We need really good performance. I feel lucene can do our job very well, some of our guys feel using a DB based search can give us greater performance on the type of search we do. Anson -Original Message- From: Florian Sauvin [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 21, 2004 8:55 AM To: Lucene Users List Subject: Re: Lucene vs. MySQL Full-Text On Jul 20, 2004, at 12:29 PM, Tim Brennan wrote: > Someone came into my office today and asked me about the project I am > trying to Lucene for -- "why aren't you just using a MySQL full-text > index to do that" -- after thinking about it for a few minutes, I > realized I don't have a great answer. > > MySQL builds inverted indexes for (in theory) doing the same type of > lookup that lucene does. You'd maybe have to build some kind of a > layer > on the front to mimic Lucene's analyzers, but that wouldn't be too > hard > > My only experience with MySQLfulltext is trivial test apps -- but the > MySQL world does have some significant advantages (its a known quantity > from an operations perspective, etc). Does anyone out there have > anything more concrete they can add? > > --tim > > I'd say that MySQL full text is much slower if you have a lot of data... that is one of the reasons we started using lucene (We had a mysql db to do the search), it's way faster! -- Florian - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Lucene vs. MySQL Full-Text
On Jul 20, 2004, at 12:29 PM, Tim Brennan wrote: Someone came into my office today and asked me about the project I am trying to Lucene for -- "why aren't you just using a MySQL full-text index to do that" -- after thinking about it for a few minutes, I realized I don't have a great answer. MySQL builds inverted indexes for (in theory) doing the same type of lookup that lucene does. You'd maybe have to build some kind of a layer on the front to mimic Lucene's analyzers, but that wouldn't be too hard My only experience with MySQLfulltext is trivial test apps -- but the MySQL world does have some significant advantages (its a known quantity from an operations perspective, etc). Does anyone out there have anything more concrete they can add? --tim I'd say that MySQL full text is much slower if you have a lot of data... that is one of the reasons we started using lucene (We had a mysql db to do the search), it's way faster! -- Florian - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Lucene vs. MySQL Full-Text
On Tuesday 20 July 2004 21:29, Tim Brennan wrote: > ÂDoes anyone out there have > anything more concrete they can add? Stemming is still on the MySQL TODO list: http://dev.mysql.com/doc/mysql/en/Fulltext_TODO.html Also, for most people it's easier to extend Lucene than MySQL (as MySQL is written in C(++?)) and there are more powerful queries in Lucene, e.g. fuzzy phrase search. Regards Daniel -- http://www.danielnaber.de - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Lucene vs. MySQL Full-Text
Someone came into my office today and asked me about the project I am trying to Lucene for -- "why aren't you just using a MySQL full-text index to do that" -- after thinking about it for a few minutes, I realized I don't have a great answer. MySQL builds inverted indexes for (in theory) doing the same type of lookup that lucene does. You'd maybe have to build some kind of a layer on the front to mimic Lucene's analyzers, but that wouldn't be too hard My only experience with MySQLfulltext is trivial test apps -- but the MySQL world does have some significant advantages (its a known quantity from an operations perspective, etc). Does anyone out there have anything more concrete they can add? --tim