[
https://issues.apache.org/jira/browse/SOLR-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066207#comment-13066207
]
Jamie Johnson edited comment on SOLR-1954 at 7/15/11 9:12 PM:
--------------------------------------------------------------
I know it has been a while, but I needed something like this. Although I
implemented my own version (and added it to SOLR-1397), I figured I'd try this
patch out as well. After applying the patch I got the following response:
<lst name="highlighting">
<lst name="1">
<arr name="subject_phonetic">
<str><em>Test</em> subject message</str>
</arr>
<arr name="subject_phonetic_startPos"><int>0</int></arr>
<arr name="subject_phonetic_endPos"><int>29</int></arr>
</lst>
</lst>
It seems that startPos is always 0 and endPos is the length of the field value
including the highlighting start/end tags. Is this expected?
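To make the observation concrete, here is a small sketch (not part of the patch) that checks the numbers in the response above. The helper name match_offsets is made up for illustration, and it assumes the default <em>/</em> tags. It shows that 29 is the length of the snippet with tags, while the tag-free text is only 20 characters long.

```python
import re

snippet = "<em>Test</em> subject message"
plain = re.sub(r"</?em>", "", snippet)  # strip the highlighting tags

print(len(snippet))  # 29, the endPos reported in the response
print(len(plain))    # 20, the length of the text without tags


def match_offsets(highlighted, pre="<em>", post="</em>"):
    """Hypothetical helper: return (start, end) offsets of each highlighted
    term, measured in the tag-free text."""
    offsets = []
    pos = 0      # scan position in the highlighted string
    removed = 0  # number of tag characters skipped so far
    while True:
        start = highlighted.find(pre, pos)
        if start == -1:
            break
        end = highlighted.find(post, start + len(pre))
        if end == -1:
            break
        raw_start = start - removed
        removed += len(pre)
        raw_end = end - removed
        removed += len(post)
        offsets.append((raw_start, raw_end))
        pos = end + len(post)
    return offsets


print(match_offsets(snippet))  # [(0, 4)]
```

If the offsets were meant to address the original field value, one would expect an end position of at most 20 here, not 29.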
was (Author: jej2003):
I know it has been a while, but I needed something like this. Although I
implemented my own version (and added it to SOLR-1397), I figured I'd try this
patch out as well. After applying the patch I got the following response:
<lst name="highlighting">
<lst name="1">
<arr name="subject_phonetic">
<str><em>Test</em> subject message</str>
</arr>
<arr name="subject_phonetic_startPos"><int>0</int></arr>
<arr name="subject_phonetic_endPos"><int>29</int></arr>
</lst>
</lst>
It seems that endPos is the length of the field value including the
highlighting start/end tags. Is this expected?
> Highlighter component should expose snippet character offsets and the score.
> ----------------------------------------------------------------------------
>
> Key: SOLR-1954
> URL: https://issues.apache.org/jira/browse/SOLR-1954
> Project: Solr
> Issue Type: New Feature
> Components: highlighter
> Reporter: David Smiley
> Priority: Minor
> Attachments: SOLR-1954_start_and_end_offsets.patch
>
>
> The Highlighter Component currently exposes neither the snippet character
> offsets nor the score. There is a TODO in DefaultSolrHighlighter indicating
> the intention to add this eventually. This information is needed when doing
> highlighting on external content. The data is there, so it's pretty easy to
> output it in some way. The challenge is deciding on the output format and its
> ramifications for backwards compatibility. The current highlighter component
> response structure doesn't lend itself to adding any new data, unfortunately.
> I wish the original implementer had had some foresight. Unfortunately, all the
> highlighting tests assume this structure. Here is a snippet of the current
> response structure, from searching Solr's sample data for "sdram", for reference:
> {code:xml}
> <lst name="highlighting">
> <lst name="VS1GB400C3">
> <arr name="text">
> <str>CORSAIR ValueSelect 1GB 184-Pin DDR <em>SDRAM</em>
> Unbuffered DDR 400 (PC 3200) System Memory - Retail</str>
> </arr>
> </lst>
> </lst>
> {code}
> Perhaps, as a little hack, we could introduce a pseudo field such as
> text_startCharOffset, whose name is the concatenation of the matching field
> name and "_startCharOffset". This would be an array of ints. Likewise, there
> would be arrays for endCharOffset and score.
> Thoughts?
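For illustration only, the pseudo-field idea from the description could be sketched like this. The function names, the dict layout, and the score value are assumptions made for the example; only the "_startCharOffset" suffix comes from the issue text, and "_endCharOffset" and "_score" are extrapolated from it.

```python
def build_highlighting_entry(field, fragments):
    """Hypothetical sketch: build one document's highlighting entry with
    parallel pseudo-field arrays next to the existing snippet array."""
    return {
        field: [f["text"] for f in fragments],
        field + "_startCharOffset": [f["start"] for f in fragments],
        field + "_endCharOffset": [f["end"] for f in fragments],
        field + "_score": [f["score"] for f in fragments],
    }


def locate_snippets(external_text, entry, field):
    """Use the offset arrays to cut the matching spans out of content that
    is stored outside the index (the 'external content' use case)."""
    starts = entry[field + "_startCharOffset"]
    ends = entry[field + "_endCharOffset"]
    return [external_text[s:e] for s, e in zip(starts, ends)]


# Sample text from the issue, joined onto one line (an assumption).
external = ("CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM "
            "Unbuffered DDR 400 (PC 3200) System Memory - Retail")
entry = build_highlighting_entry("text", [{
    "text": external.replace("SDRAM", "<em>SDRAM</em>"),
    "start": 0,
    "end": len(external),  # offsets refer to the tag-free text
    "score": 1.0,          # placeholder value, not a real Solr score
}])
print(sorted(entry))
print(locate_snippets(external, entry, "text") == [external])  # True
```

Because the existing "text" array is left untouched, a client that ignores the extra keys would see the same structure as before, which seems to be the point of the hack.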