AnalyzingSuggester only matches by prefix, by design. You can try AnalyzingInfixSuggester, which currently exists only as two alternative patches on https://issues.apache.org/jira/browse/LUCENE-4845
And please post back any feedback you have on the issue ... as the issue
stands I don't think either approach will be committed any time soon.

Mike McCandless

http://blog.mikemccandless.com

On Tue, Mar 26, 2013 at 3:45 AM, Andres Garcia <hgar...@fi.upm.es> wrote:
> Hi all,
>
> My use case is very simple, given a string I would like to suggest all the
> possible urls that contain that string (given the limitations of the
> tokenizer and suggester). So far I have created a custom analyzer and
> tokenizer to parse urls, and that analyzer is used to create an
> AnalyzingSuggester object. When I look for a suggestion using a prefix of a
> url it works fine. However when I use an in between word I don’t get any
> suggestion.
>
> Let’s see my test case. I have a unique suggestion entry “www.google.com”
> in my TermFreq array. If I search a suggestion for “www” it returns the
> url. If I search a suggestion for “google” the result is empty.
>
> My tokenizer splits the suggestion entry into the following tuples
> (token,offset): (www,0:3),(google,4:10),(com,11:14). Please note that I’m
> getting rid of the dots
>
> The automaton created for this entry is:
>
> state 0 [reject]: w -> 1
> state 1 [reject]: w -> 2
> state 2 [reject]: w -> 3
> state 3 [reject]: \\U00000100 -> 4
> state 4 [reject]: g -> 5
> state 5 [reject]: o -> 6
> state 6 [reject]: o -> 7
> state 7 [reject]: g -> 8
> state 8 [reject]: l -> 9
> state 9 [reject]: e -> 10
> state 10 [reject]: \\U00000100 -> 11
> state 11 [reject]: c -> 12
> state 12 [reject]: o -> 13
> state 13 [reject]: m -> 14
> state 14 [accept]:
>
> When I print the fst I get this: “wwwgooglecom”
>
> The automaton created for “google”
>
> Initial state: 0
> state 0 [reject]: g -> 1
> state 1 [reject]: o -> 2
> state 2 [reject]: o -> 3
> state 3 [reject]: g -> 4
> state 4 [reject]: l -> 5
> state 5 [reject]: e -> 6
> state 6 [accept]:
>
> I think I have a problem with my tokenizer (I’m not an expert) and this is
> affecting the creation of the first automaton. I really don’t know how to
> get this fixed, any advice?
>
> best regards!
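
To make the prefix-only behaviour concrete, here is a minimal sketch in plain Java with no Lucene dependency. It is not how AnalyzingSuggester or the LUCENE-4845 patches are implemented. It only models the idea that the analyzed entry is one sequence (www, separator, google, separator, com) and that a lookup must match from the start of that sequence. The class InfixSuggestSketch, the SEP character, and the methods tokenize, buildIndex and lookup are all hypothetical names made up for this illustration. The "index every token suffix" workaround shown with indexSuffixes=true is an assumption added here, not something proposed in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names come from Lucene.
class InfixSuggestSketch {
    // Stand-in for the separator label between tokens in the analyzed form.
    static final char SEP = '\u001F';

    // Split on dots and drop them, like the custom tokenizer in the question.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\.")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Map analyzed keys to the URLs they came from. With indexSuffixes=false
    // each URL gets one key (all tokens). With indexSuffixes=true it also gets
    // one key per token suffix, e.g. "google|com" and "com".
    static TreeMap<String, Set<String>> buildIndex(List<String> urls, boolean indexSuffixes) {
        TreeMap<String, Set<String>> index = new TreeMap<>();
        for (String url : urls) {
            List<String> tokens = tokenize(url);
            if (tokens.isEmpty()) {
                continue;
            }
            int lastStart = indexSuffixes ? tokens.size() - 1 : 0;
            for (int start = 0; start <= lastStart; start++) {
                String key = String.join(String.valueOf(SEP),
                        tokens.subList(start, tokens.size()));
                index.computeIfAbsent(key, k -> new TreeSet<>()).add(url);
            }
        }
        return index;
    }

    // Prefix-only lookup: return URLs whose analyzed key starts with the
    // analyzed query. Keys sharing a prefix are contiguous in a sorted map.
    static Set<String> lookup(TreeMap<String, Set<String>> index, String query) {
        String q = String.join(String.valueOf(SEP), tokenize(query));
        Set<String> hits = new TreeSet<>();
        for (Set<String> urls
                : index.subMap(q, true, q + Character.MAX_VALUE, false).values()) {
            hits.addAll(urls);
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("www.google.com");
        TreeMap<String, Set<String>> prefixOnly = buildIndex(urls, false);
        TreeMap<String, Set<String>> withSuffixes = buildIndex(urls, true);

        System.out.println("prefix-only, www:     " + lookup(prefixOnly, "www"));
        System.out.println("prefix-only, google:  " + lookup(prefixOnly, "google"));
        System.out.println("with suffixes, google: " + lookup(withSuffixes, "google"));
        System.out.println("with suffixes, oogle:  " + lookup(withSuffixes, "oogle"));
    }
}
```

In this sketch "www" finds the entry in both indexes, while "google" finds it only once the token suffixes are indexed. A query such as "oogle" still returns nothing, because the match is still a prefix match, just starting at a token boundary. The cost of the workaround is one extra key per token, so the index grows with the number of tokens per URL.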