ooh

On Fri, Sep 23, 2022 at 11:02 AM Adrien Grand <jpou...@gmail.com> wrote:
>
> We have a TruncateTokenFilter in lucene/analysis/common. :)
>
> On Fri, Sep 23, 2022 at 4:39 PM Michael Sokolov <msoko...@gmail.com> wrote:
>
> > I wonder if it would make sense to provide a TruncationFilter in
> > addition to the LengthFilter. That way long tokens in source text
> > could be better supported, albeit with some confusion if they share
> > the same very long prefix...
> >
> > On Fri, Sep 23, 2022 at 9:56 AM Scott Guthery <sguth...@gmail.com> wrote:
> > >
> > > Thanks much, Adrian. I hadn't realized that the size limit was on one
> > > token in the text as opposed to being a limit on the length of the entire
> > > text field. I'm loading patents, so I suspect that the very long word is a
> > > DNA sequence.
> > >
> > > Thanks also for your guidance with regard to setting maximums.
> > >
> > > Cheers, Scott
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >
>
> --
> Adrien
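
For anyone reading this thread later, here is a rough plain-Java sketch of the trade-off discussed above. It is not Lucene code: the class and method names (TokenLengthSketch, dropLong, truncate) and the sample tokens are made up for illustration, and the real LengthFilter and TruncateTokenFilter work on a TokenStream inside an analysis chain. It only shows the behaviour in question: a length filter drops over-long tokens entirely, while truncation keeps a prefix, so tokens that share a long prefix end up as the same term.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Lucene.
class TokenLengthSketch {

    // Length-filter style: tokens longer than maxLen are removed.
    static List<String> dropLong(List<String> tokens, int maxLen) {
        List<String> kept = new ArrayList<>();
        for (String t : tokens) {
            if (t.length() <= maxLen) {
                kept.add(t);
            }
        }
        return kept;
    }

    // Truncation style: every token is kept, cut down to at most maxLen chars.
    static List<String> truncate(List<String> tokens, int maxLen) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.length() > maxLen ? t.substring(0, maxLen) : t);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two made-up "sequences" that differ only after the 8th character.
        List<String> tokens = List.of("gene", "ACGTACGTAA", "ACGTACGTCC");

        // The long tokens disappear, so they cannot be searched at all.
        System.out.println(dropLong(tokens, 8));

        // The long tokens survive, but both collapse to the same prefix.
        System.out.println(truncate(tokens, 8));
    }
}
```

With truncation, a search for either long sequence would match both, which is the "confusion if they share the same very long prefix" mentioned above. Whether that is acceptable probably depends on how distinctive the prefixes are at the chosen length.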