On 4/21/2014 12:47 PM, Robert Muir wrote:
> I think you misunderstand what the filter does. It does not "output unigrams".
>
> In the case you choose this option, the positions are from the
> unigrams emitted by your tokenizer (StandardTokenizer or whatever),
> and it just adds bigrams as synonyms to those. It cannot safely do
> anything else.
>
> There can be only one "n".

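To make sure I am reading the quoted description correctly, here is a rough model of it in plain Python. This is not the filter's code. The function name `bigrams_as_synonyms` and the `(term, position)` tuples are made up for illustration, and the model assumes that "synonym" means the bigram shares the position of its first unigram.

```python
from typing import List, Tuple

# Hypothetical token representation: (term, position). Not a Lucene type.
Token = Tuple[str, int]


def bigrams_as_synonyms(unigrams: List[Token]) -> List[Token]:
    """Keep every unigram at the position the tokenizer gave it, and add
    each bigram at the same position as its first unigram (assumption)."""
    out: List[Token] = []
    for i, (term, pos) in enumerate(unigrams):
        # The unigram passes through with its original position.
        out.append((term, pos))
        if i + 1 < len(unigrams):
            next_term, next_pos = unigrams[i + 1]
            # Only join tokens that sit at adjacent positions.
            if next_pos == pos + 1:
                out.append((term + next_term, pos))
    return out


tokens = bigrams_as_synonyms([("a", 0), ("b", 1), ("c", 2)])
for term, pos in tokens:
    print(pos, term)
```

In this model the last unigram ("c") stays alone at its own position, because there is no following token to pair it with.
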
I took a quick look at the code. I'm sure it's easy to grasp once you're really familiar with everything, but I'm having a hard time decoding exactly how the filter works. I don't have any more time to plow through it tonight.

Would it be possible to implement an option with a name similar to "lastUnigramAtPreviousPosition" so that I can optionally get the behavior I'm after when the input is two or more characters, without changing current behavior for anyone else? This would completely solve my current problem.

Thanks,
Shawn