[ https://issues.apache.org/jira/browse/SOLR-461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560057#action_12560057 ]
Grant Ingersoll commented on SOLR-461:
--------------------------------------
I suppose it is similar, but I don't find counting characters all that
intuitive. A token-based approach doesn't cut off in the middle of a word, and
with a character limit it isn't clear to me whether whitespace characters are
counted, etc. Plus, a token limit is analogous to Lucene's Max Field Length,
which is token-based as well.
> Highlighting TokenStream Truncation capability
> ----------------------------------------------
>
> Key: SOLR-461
> URL: https://issues.apache.org/jira/browse/SOLR-461
> Project: Solr
> Issue Type: Improvement
> Components: highlighter
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
>
> It is sometimes the case when generating snippets that one need not
> fragment/analyze the whole document (especially for large documents) in order
> to show meaningful snippet highlights.
> Patch to follow that adds a counting TokenFilter that returns null after X
> number of Tokens have been seen. This filter will then be hooked into the
> SolrHighlighter and configurable via solrconfig.xml. The default value will
> be Integer.MAX_VALUE or, I suppose, it could be set to whatever Max Field
> Length is set to, as well.
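As a rough illustration of the counting filter described in the issue, here is
a minimal sketch in plain Java. It deliberately does not use the Lucene
TokenFilter API, and it is not the patch itself: SimpleTokenStream,
CountingFilter, fromList and drain are hypothetical names made up for this
example. The only behavior taken from the description is that the filter
returns null once a configured number of tokens has been seen, and that a limit
of Integer.MAX_VALUE effectively disables truncation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TokenCountSketch {

    /** Hypothetical stand-in for a token stream: next() returns null when exhausted. */
    interface SimpleTokenStream {
        String next();
    }

    /** Wraps a list of already-analyzed tokens as a stream (illustration only). */
    static SimpleTokenStream fromList(List<String> tokens) {
        Iterator<String> it = tokens.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Passes through at most maxTokens tokens, then reports end of stream. */
    static final class CountingFilter implements SimpleTokenStream {
        private final SimpleTokenStream input;
        private final int maxTokens;
        private int seen = 0;

        CountingFilter(SimpleTokenStream input, int maxTokens) {
            this.input = input;
            this.maxTokens = maxTokens;
        }

        @Override
        public String next() {
            if (seen >= maxTokens) {
                return null; // limit reached: stop without consuming more input
            }
            String token = input.next();
            if (token != null) {
                seen++;
            }
            return token;
        }
    }

    /** Collects every token the stream produces until it returns null. */
    static List<String> drain(SimpleTokenStream stream) {
        List<String> out = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("the", "quick", "brown", "fox", "jumps");
        // Only the first three tokens reach the consumer.
        System.out.println(drain(new CountingFilter(fromList(doc), 3)));
        // With the proposed default, nothing is truncated.
        System.out.println(drain(new CountingFilter(fromList(doc), Integer.MAX_VALUE)));
    }
}
```

Because the cut-off is counted in tokens, the truncated stream always ends on a
whole token, never in the middle of a word.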
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.