Confession: I haven't had occasion to use the n-gram approach myself, but here's the theory. And note that SOLR has n-gram tokenizers available.
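A rough sketch of that theory in plain Python, for illustration only: index every overlapping two-character slice of a term, split the query the same way, and require the query's grams to show up consecutively. This is not Solr or Lucene code; the helper names `ngrams` and `phrase_match` are made up here, and a real index stores token positions rather than comparing lists.

```python
def ngrams(text, n=2):
    """Return the overlapping n-character tokens of text, in order."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def phrase_match(query, term, n=2):
    """True if the query's n-grams occur consecutively among the term's n-grams.

    Hypothetical stand-in for a phrase query over an n-gram field.
    """
    q = ngrams(query, n)
    t = ngrams(term, n)
    if not q:
        return False  # query shorter than n characters: nothing to match on
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))


print(ngrams("sullivan"))                 # ['su', 'ul', 'll', 'li', 'iv', 'va', 'an']
print(ngrams("sulli"))                    # ['su', 'ul', 'll', 'li']
print(phrase_match("sulli", "sullivan"))  # True
print(phrase_match("sully", "sullivan"))  # False
```

Note that with this scheme a query shorter than the gram size produces no tokens at all, which is one reason real setups usually index a range of gram sizes.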
Using a 2-gram example for sullivan, the n-gram tokenizer would index these tokens: su, ul, ll, li, iv, va, an. Then at query time in your example, sulli would be broken up into su, ul, ll and li, which, when searched as a phrase, would match your field. The expense, of course, is that your index is larger (but surprisingly not as much as you'd think). But your queries are much faster. That's the theory anyway, the practice is "left as an exercise for the reader" <G>

But "the folks" generously provided quite an explication of what wildcards are all about on the *lucene* user's list, look for a thread titled "I just don't get wildcards at all" from around 2006. It's a nice background for what the underlying problem is; some of the SOLR tokenizers are realizing some of this, I think. And the state of the art has progressed considerably since then, but the underlying issues are still there.

Sorry I can't be more help here.

Erick

On Wed, Nov 25, 2009 at 8:18 AM, Joel Nylund <jnyl...@yahoo.com> wrote:

> Hi Erick,
>
> thanks for the links, I read both of them and I still have no idea what to
> do, lots of back and forth, but didn't see any solution on it.
>
> One person talked about indexing the field in reverse and doing and ON on
> it, this might work I guess.
>
> thanks
> Joel
>
> On Nov 24, 2009, at 9:12 PM, Erick Erickson wrote:
>
> copying from Eric Hatcher:
>>
>> See http://issues.apache.org/jira/browse/SOLR-218 - Solr currently
>> does not have leading wildcard support enabled.
>>
>> There's a pretty extensive recent exchange on this, see the
>> thread on the user's list titled
>>
>> "leading and trailing wildcard query"
>>
>> Best
>> Erick
>>
>> On Tue, Nov 24, 2009 at 7:51 PM, Joel Nylund <jnyl...@yahoo.com> wrote:
>>
>> Hi, I saw some older postings on this, but didnt see a resolution.
>>>
>>> I have a field called title, I would like to be able to find partial word
>>> matches within the title.
>>>
>>> For example:
>>>
>>> http://localhost:8983/solr/select?q=textTitle:%22*sulli*%22
>>>
>>> I would expect it to find:
>>> <str name="textTitle">the daily dish | by andrew sullivan</str>
>>>
>>> but it doesnt, it does find sully (which is fine with me also as a bonus),
>>> but doesnt seem to get any of the partial word stuff. Oddly enough before I
>>> lowercased the title, the wildcard matching seemed to work a bit better, it
>>> just didnt deal with the case sensitive query.
>>>
>>> At first I had mixed case titles and I read that the wildcard doesn't work
>>> with mixed case, so I created another field that is a lowered version of the
>>> title called "textTitle", it is of type text.
>>>
>>> Is it possible with solr to achieve what I am trying to do, if so how? If
>>> not, anything closer than what I have?
>>>
>>> thanks
>>> Joel