An n-gram tokenizer/filter might also work for you: http://lucene.apache.org/core/7_3_1/analyzers-common/org/apache/lucene/analysis/ngram/NGramTokenizer.html
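
To show why n-grams help here, below is a small plain-Java sketch of the idea. It is not Lucene's NGramTokenizer, only an illustration of what character n-grams look like; the class name `NGramSketch` and the helpers `ngrams` and `shared` are made up for this mail, and the gram size of 3 is just an assumption for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class NGramSketch {
    // Collects every character n-gram of length minGram..maxGram from the text.
    public static List<String> ngrams(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= text.length(); len++) {
                grams.add(text.substring(start, start + len));
            }
        }
        return grams;
    }

    // Returns the distinct trigrams that both inputs have in common.
    public static Set<String> shared(String a, String b) {
        Set<String> common = new LinkedHashSet<>(ngrams(a, 3, 3));
        common.retainAll(ngrams(b, 3, 3));
        return common;
    }

    public static void main(String[] args) {
        // Most trigrams are identical whether or not the space is present,
        // so a query on one form still overlaps heavily with the other.
        System.out.println(shared("similarissues", "similar issues"));
    }
}
```

The two forms differ only in the grams that touch the space, so they still share most of their terms. As with shingles, expect the index to grow, since every token is split into many grams.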
Regards,
András

On Wed, Jun 20, 2018 at 11:53 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:

> Hi Egorlex,
>
> Set the tokenSeparator to "" and ShingleFilter will concatenate all
> shingles without whitespace. Keep in mind that this will greatly increase
> the size of the index, so it might not be a good idea to concatenate all
> pairs of words.
>
> If you want to find "similarissues" when searching for "similar issues"
> (and vice versa), you might want to check out
> DictionaryCompoundWordTokenFilter and/or
> HyphenationCompoundWordTokenFilter. Although English hardly uses
> compound words, the token filters still do their job quite nicely.
>
> Regards,
> Markus
>
> -----Original message-----
> > From: egorlex <egor...@gmail.com>
> > Sent: Wednesday 20th June 2018 11:42
> > To: java-user@lucene.apache.org
> > Subject: Re: Lucene same search result for worlds with and without spaces
> >
> > Thanks for the reply!
> >
> > Sorry, could you help a little more? According to the example:
> >
> > "given the phrase “Shingles is a viral disease”, a shingle filter might
> > produce:
> >
> > Shingles is
> > is a
> > a viral
> > viral disease
> > "
> >
> > I do not quite understand how this ShingleFilter can turn "similarissues"
> > into "similar issues".
> >
> > Thanks!
> >
> > --
> > Sent from: http://lucene.472066.n3.nabble.com/Lucene-Java-Users-f532864.html
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
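
On the quoted question about shingles: the filter does not split "similarissues". It works the other way around, joining "similar issues" into "similarissues" so that both forms end up as the same term. Below is a plain-Java sketch of that idea, not Lucene's ShingleFilter itself; the class `ShingleSketch` and the method `wordPairs` are hypothetical names used only for illustration, and the sketch emits word pairs only, whereas the real filter can also emit the single words.

```java
import java.util.ArrayList;
import java.util.List;

public class ShingleSketch {
    // Joins each pair of adjacent words with the given separator,
    // mimicking two-word shingles with a configurable token separator.
    public static List<String> wordPairs(String text, String separator) {
        String[] words = text.trim().split("\\s+");
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i + 1 < words.length; i++) {
            pairs.add(words[i] + separator + words[i + 1]);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // With a space separator this gives the pairs from the quoted example.
        System.out.println(wordPairs("Shingles is a viral disease", " "));
        // With an empty separator the two words collapse into one term.
        System.out.println(wordPairs("similar issues", ""));
    }
}
```

With the empty separator, text containing "similar issues" yields the term "similarissues", which is why a search for the joined form can then match it.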