[ https://issues.apache.org/jira/browse/JCR-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12484794 ]

Marcel Reutegger commented on JCR-820:
--------------------------------------

Committed initial version: 523251

The query languages now support an excerpt function that returns highlighted 
fragments for the current node in a result row.

The excerpt is a simple XML fragment. For the query terms 'jackrabbit' and 
'query', an excerpt could look like this:

<excerpt>
     <fragment>
          <highlight>Jackrabbit</highlight> implements both the mandatory XPath
          and optional SQL <highlight>query</highlight> syntax.
     </fragment>
     <fragment>
          Before parsing the XPath <highlight>query</highlight> in
          <highlight>Jackrabbit</highlight>, the statement is surrounded
     </fragment>
</excerpt>
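
For client code that wants to render such an excerpt, here is a minimal Java 
sketch that parses the XML with the JDK's DOM parser. It assumes the excerpt 
has exactly the structure shown above (excerpt / fragment / highlight). The 
class name ExcerptReader and its methods are hypothetical helpers for 
illustration and are not part of Jackrabbit. Reading the excerpt string from 
a result row is not covered here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Jackrabbit: reads an excerpt XML string.
final class ExcerptReader {

    // Parses the excerpt; malformed XML is reported as an unchecked exception.
    static Document parse(String excerptXml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(excerptXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed excerpt", e);
        }
    }

    // Returns the text of every <highlight> element in document order.
    static List<String> highlightedTerms(String excerptXml) {
        NodeList nodes = parse(excerptXml).getElementsByTagName("highlight");
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            terms.add(nodes.item(i).getTextContent());
        }
        return terms;
    }

    // Renders each <fragment> as one HTML line, with highlights wrapped in <b>.
    static List<String> toHtml(String excerptXml) {
        NodeList fragments = parse(excerptXml).getElementsByTagName("fragment");
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < fragments.getLength(); i++) {
            StringBuilder sb = new StringBuilder();
            NodeList children = fragments.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                if (child.getNodeType() == Node.ELEMENT_NODE
                        && "highlight".equals(child.getNodeName())) {
                    sb.append("<b>").append(escape(child.getTextContent())).append("</b>");
                } else if (child.getNodeType() == Node.TEXT_NODE) {
                    sb.append(escape(child.getTextContent()));
                }
            }
            // Collapse the indentation and line breaks of the XML layout.
            lines.add(sb.toString().trim().replaceAll("\\s+", " "));
        }
        return lines;
    }

    // Escapes the characters that are significant in HTML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String xml = "<excerpt><fragment><highlight>Jackrabbit</highlight> implements"
                + " the <highlight>query</highlight> syntax.</fragment></excerpt>";
        System.out.println(highlightedTerms(xml));
        System.out.println(toHtml(xml));
    }
}
```

When highlighting is not enabled, the fragments contain no highlight 
elements, so highlightedTerms returns an empty list and toHtml returns the 
plain fragment text.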

Example queries:

//element(nt:resource)[jcr:contains(., 'jackrabbit')]/rep:excerpt(.)

select excerpt(.) from nt:resource where contains(., 'jackrabbit')

By default, the excerpt function returns only plain fragments without 
highlight elements, because highlighting requires additional token offset 
information to be indexed. To enable term highlighting, set the following 
configuration parameter:

<param name="supportHighlighting" value="true"/>

This parameter defaults to false for performance reasons. When it is set to 
true, the values of string properties and the text extracts of binary 
properties are stored in the Lucene index. Because Lucene loads all stored 
fields when a document is requested, this affects performance. With Lucene 2.1 
this behaviour can be controlled so that only specified fields are loaded. 
Once Jackrabbit switches to Lucene 2.1, the query handler should read the 
stored fulltext extract only when it is really needed.

Similarly, when switching to Lucene 2.1, Jackrabbit should get a custom field 
implementation that allows storing a field with a reader value. Currently, 
enabling highlighting effectively disables deferred text extraction. With a 
custom field implementation, deferred text extraction will work again even if 
highlighting is enabled.

> Add support for query result highlighting
> -----------------------------------------
>
>                 Key: JCR-820
>                 URL: https://issues.apache.org/jira/browse/JCR-820
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: query
>            Reporter: Marcel Reutegger
>            Priority: Minor
>
> Highlighting matches in a query result list is regularly needed for an 
> application. The query languages should support a pseudo property or function 
> that allows one to retrieve text fragments with highlighted matches from the 
> content of the matching node.
> To support this feature the following enhancements are required:
> - define a pseudo property or function that returns the text excerpt and can 
> be used in the select clause
> - the index needs to *store* the original text it used when the node was 
> indexed. This also includes extracted text from binary properties.
> - text fragments must be created based on the original text, the query and 
> index information

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
