Hi Dmitry,

Ideally, a token stream produces tokens whose startOffset is >= the startOffset of the previous token from the stream. Sometime in the past year or so, I think this began to be enforced at the indexing layer. There used to be TokenFilters that violated this contract; I think earlier versions of WordDelimiterFilter could. If my assumption that this is asserted at the indexing layer is correct, then I think TokenOrderingFilter is obsolete.
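To make the contract concrete, here is a minimal sketch of the check I have in mind. It is plain Java over an array of start offsets, not the actual Lucene code; the class OffsetOrderCheck and the method firstBackwardsOffset are hypothetical names made up for illustration.

```java
/** Hypothetical helper for illustration only; not part of Lucene or Solr. */
public class OffsetOrderCheck {

    /**
     * Returns the index of the first token whose start offset is smaller than
     * the start offset of the token before it, or -1 if offsets never go backwards.
     * Equal start offsets are allowed, matching the ">=" contract described above.
     */
    public static int firstBackwardsOffset(int[] startOffsets) {
        for (int i = 1; i < startOffsets.length; i++) {
            if (startOffsets[i] < startOffsets[i - 1]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] ordered = {0, 0, 4, 9};     // non-decreasing: satisfies the contract
        int[] outOfOrder = {0, 6, 4, 9};  // the third token starts before the second
        System.out.println(firstBackwardsOffset(ordered));    // prints -1
        System.out.println(firstBackwardsOffset(outOfOrder)); // prints 2
    }
}
```

If every stream that reaches the highlighter already passes a check like this, re-sorting tokens by startOffset would be a no-op, which would be consistent with what you saw when you removed the class.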
~ David

On Thu, Jun 4, 2015 at 7:48 AM Dmitry Kan <[email protected]> wrote:

> Hi guys,
>
> Sorry for sending questions to the dev list and not to the user one.
> Somehow I'm getting more luck here.
>
> We have found the class o.a.solr.highlight.TokenOrderingFilter
> with the following comment:
>
> > -/**
> > - * Orders Tokens in a window first by their startOffset ascending.
> > - * endOffset is currently ignored.
> > - * This is meant to work around fickleness in the highlighter only. It
> > - * can mess up token positions and should not be used for indexing or querying.
> > - */
> > -final class TokenOrderingFilter extends TokenFilter {
>
> In fact, removing this class didn't change the behaviour of the highlighter.
>
> Could anybody shed light on its necessity?
>
> Thanks,
>
> Dmitry Kan
