May I ask why your customer is so intent on indexing n-grams? Why not use Kuromoji and index words instead?
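
For anyone following the "n" discussion quoted below, here is a rough sketch of why mixing n=1 and n=2 in one field makes positions confusing. It is a toy model only: it assumes each gram's position is simply its starting character offset, which is my assumption for illustration and not necessarily how any Lucene filter assigns positions. The helper name `char_ngrams` is made up for this example.

```python
def char_ngrams(text, n):
    """Return (position, gram) pairs for every character n-gram of text.

    Position here is just the starting character offset (an assumption
    made for this illustration).
    """
    return [(i, text[i:i + n]) for i in range(len(text) - n + 1)]


text = "東京都"

unigrams = char_ngrams(text, 1)  # [(0, '東'), (1, '京'), (2, '都')]
bigrams = char_ngrams(text, 2)   # [(0, '東京'), (1, '京都')]

# Put both gram sizes into one "field", ordered by position.
mixed = sorted(unigrams + bigrams, key=lambda t: (t[0], len(t[1])))

# Everything sitting at position 1, i.e. "one position after" position 0.
at_pos_1 = [gram for pos, gram in mixed if pos == 1]

print(mixed)
print(at_pos_1)  # ['京', '京都']
```

In this model the unigram 東 at position 0 is followed by 京 at position 1, which is what "adjacent" should mean. The bigram 東京 also sits at position 0, but both tokens at position 1 overlap it; the first token that really comes after it is 都 at position 2. So "next position" means something different depending on n, which is the problem with using a single position stream for both.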

On Sun, Apr 20, 2014 at 2:00 PM, Robert Muir <rcm...@gmail.com> wrote:
> On Sun, Apr 20, 2014 at 1:53 PM, Shawn Heisey <s...@elyograg.org> wrote:
>> On 4/20/2014 11:10 AM, Robert Muir wrote:
>>> I think you need to use 2 separate fields here? (one for n=1 and one for
>>> n=2)
>>>
>>> You just cant really have "correct positions" for n=1 and n=2, its not
>>> possible.
>>
>> There may be details to this that I do not understand. I'm fairly
>> clueless about both CJK and writing Lucene code -- Solr does all of that
>> for me.
>>
>> What is "n" in what you wrote above?
>
> This is just the mathematics, its the "n" of the n-gram. You should
> only really ever have a fixed value of this for a field, otherwise the
> positions are confusing.
>
> There is nothing this filter can do to change this mathematical fact.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org