On 4/13/06, Ananth T. Sarathy <[EMAIL PROTECTED]> wrote: > > No we do have drop downs selects that would allow for the substitution, > but > we also have a free text fields to allow the user to search. That solution > would I think work for the DB query replacement, but you would need a > regular non underscored field to allow for free text. > > Well, as I say, you've solved that problem already. Somewhere, somehow, you have to decide what to do with the "free text" data. Somewhere, somehow, you've got to decide whether "stunt director trainee" means "stunt director" + trainee, stunt + "director trainee", or stunt + director + trainee. Or else you can't form your SQL in the first place. And the query doesn't produce reasonable results if you *do* form the query.
If you can form your SQL with distinct "Title = 'blah'" clauses, you can substitute underscores for spaces in the terms. If you can do that, you can ask Lucene to find the terms you indexed with underscores. And if you can't form your SQL queries in the first place, the question is irrelevant. All that said, perhaps a better question is "why is your SQL slow?". Relational databases are really good at this sort of thing. Many smart people have put many, many developer years into making relational databases deal with joins efficiently. Assuming you have the proper indexes etc. As much as I've been impressed with Lucene, I have to ask whether it's relevant to your problem. I have no clue what database you're using, how it's set up, or whether the examples you've given are simplified enough that I don't understand what the *real* problem is. But if your issue isn't really dealing with a full text search, your relational DB should be able to handle it, given the proper wherewithal. Have you done "explain plan" or its equivalent in your DB? Have you tried adding indexes to avoid full table scans? In short, have you fully convinced yourself that your RDB can't handle the problem? I'm *extremely* leery of introducing another "moving part" into a product without fully exhausting the current parts. It's *never* a good thing to add a new step into the process unless you can convince yourself that it solves more problems than it introduces. You've already alluded to keeping the DB and the Lucene indexes in synch. I *guarantee* that there will be other issues that rise up and bite you. *Count* on whatever you think you'll spend in introducing Lucene into your mix (say effort X) costing you *at least* 2X more time/energy than you think. I'd actually give it a multiplier closer to 4X. This is NOT a slam on lucene. But developers often miss the bigger picture. What processes are you going to put in place to keep the Lucene part of the product up to date? How much is it going to cost your company to troubleshoot the Lucene portion? How many company resources are going to be spent answering customer complaints? What is the ongoing maintenance requirement? I like Lucene. I've just persuaded my company to use it in our next product. I've been incredibly impressed with it's architeture and implementation. But it's a text search engine, and shouldn't be confused with a RDB. *Assuming* that the RDB is an integral part of your product, I'd spend a lot of time making that do what I needed before I'd introduce another moving part. All for what it's worth, from an old "C" programmer <G>.. Best Erick