Chantal:

bq: The problem with the wildcard searches is that the input is not analyzed.

As of 3.6/4.0, this is no longer entirely true. Some analysis is performed
for wildcard searches by default, and you can specify almost anything you
want if you really need to. See:
https://issues.apache.org/jira/browse/SOLR-2438
and
http://wiki.apache.org/solr/MultitermQueryAnalysis

Best
Erick

On Fri, Dec 30, 2011 at 4:33 PM, Devon Baumgarten <dbaumgar...@nationalcorp.com> wrote:
> Hoss,
>
> Thanks. You've answered my question. To clarify, what I should have asked for
> instead of 'exact' was 'not fuzzy'. For some reason it didn't occur to me
> that I didn't need n-grams to use the wildcard. You asking for me to clarify
> what I meant made me realize that the n-grams are the source of all my
> current problems. :)
>
> Thanks!
>
> Devon Baumgarten
>
>
> -----Original Message-----
> From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
> Sent: Thursday, December 29, 2011 7:00 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Solr, SQL Server's LIKE
>
>
> : Thanks. I know I'll be able to utilize some of Solr's free text
> : searching capabilities in other search types in this project. The
> : product manager wants this particular search to exactly mimic LIKE%.
> ...
> : Ex: If I search "Albatross" I want "Albert" to be excluded completely,
> : rather than having a low score.
>
> please be specific about the types of queries you want. ie: we need more
> than one example of the type of input you want to provide, the type of
> matches you want to see for that input, and the type of matches you want
> to get back.
>
> in your first message you said you need to match company titles "pretty
> exactly" but then seem to contradict yourself by saying the SQL's LIKE
> command fits the bill -- even though the SQL LIKE command exists
> specifically for inexact matches on field values.
>
> Based on your one example above of Albatross, you don't need anything
> special: don't use ngrams, don't use stemming, don't use fuzzy anything --
> just search for "Albatross" and it will match "Albatross" but not
> "Albert". if you want "Albatross" to match "Albatross Road" use some
> basic tokenization.
>
> If all you really care about is prefix searching (which seems suggested by
> your "LIKE%" comment above, which i'm guessing is shorthand for something
> similar to "LIKE 'ABC%'"), so that queries like "abc" and "abcd" both
> match "abcdef" and "abcdzzzz" but neither of them match "xxxxabcdyyyy"
> then just use prefix queries (ie: "abcd*") -- they should be plenty
> efficient for your purposes. you only need to worry about ngrams when you
> want to efficiently match in the middle of a string. (ie: "TITLE LIKE
> %ABC%")
>
>
> -Hoss
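
For anyone reading this thread later, the prefix-versus-infix distinction Hoss describes can be illustrated with a small Python sketch. It models only the matching semantics of "LIKE 'ABC%'" and "LIKE '%ABC%'" on plain strings; it is not how Solr or Lucene evaluate prefix queries or ngrams, and the function names are made up for illustration.

```python
# Hypothetical helpers that mimic the matching semantics only.
# Matching here is case-sensitive; SQL Server's LIKE may differ
# depending on the collation in use.

def matches_prefix(value: str, prefix: str) -> bool:
    """Behaves like SQL `LIKE 'prefix%'`, or a prefix query such as `abcd*`."""
    return value.startswith(prefix)


def matches_infix(value: str, fragment: str) -> bool:
    """Behaves like SQL `LIKE '%fragment%'` (match anywhere in the string)."""
    return fragment in value


# The example values from the message above.
titles = ["abcdef", "abcdzzzz", "xxxxabcdyyyy"]

# Prefix matching: "abc" and "abcd" hit the first two titles only.
prefix_hits = {
    query: [t for t in titles if matches_prefix(t, query)]
    for query in ("abc", "abcd")
}

# Infix matching: "abcd" also hits the title that has it in the middle.
infix_hits = [t for t in titles if matches_infix(t, "abcd")]
```

Only the infix case needs to find "abcd" in the middle of a value, which is the situation where Hoss suggests ngrams become worth considering.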