Those who aren't subscribed to solr-dev may be interested to know that
the Lucene highlighter has been integrated into Solr for both the
standard request handler and the dismax handler.

See the highlight, highlightFields, and maxSnippets params documented here:
http://wiki.apache.org/solr/StandardRequestHandler
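
As a rough sketch of how a client might turn these params on, the
Python snippet below builds a query URL. Only the parameter names
(highlight, highlightFields, maxSnippets) come from this message; the
host, port, /select path, the "true" value, and the comma-separated
field list are assumptions for illustration, so check the wiki page
for the exact syntax.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, query, fields, max_snippets=1):
    """Build a Solr query URL with highlighting enabled (illustrative only)."""
    params = {
        "q": query,
        # Assumption: "true" switches highlighting on.
        "highlight": "true",
        # Assumption: multiple fields are given as a comma-separated list.
        "highlightFields": ",".join(fields),
        "maxSnippets": str(max_snippets),
    }
    return base_url + "?" + urlencode(params)


# Hypothetical local Solr instance; adjust to your own deployment.
url = build_highlight_url(
    "http://localhost:8983/solr/select", "ipod USB", ["name", "features"], 2
)
print(url)
```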

When highlighting is enabled, a new "highlighting" section is added to
responses.  Highlighted terms are marked up like so: <em>term</em>

Below is a sample for the query terms ipod and USB. The sample shows
the <em></em> markup tags unescaped for readability; in the real
response they are XML escaped, so each highlight snippet is a single
string.


<lst name="highlighting">
 <lst name="IW-02">
   <arr name="features">
     <str>car power adapter for <em>iPod</em>, white</str>
   </arr>
   <arr name="name">
     <str><em>iPod</em> & <em>iPod</em> Mini <em>USB</em> 2.0 Cable</str>
   </arr>
 </lst>
 <lst name="MA147LL/A">
   <arr name="features">
     <str>playback, Upgradeable firmware, <em>USB</em> 2.0 compatibility, Playback speed control, Rechargeable capability</str>
   </arr>
   <arr name="name">
     <str>Apple 60 GB <em>iPod</em> with Video Playback Black</str>
   </arr>
 </lst>
</lst>
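
For anyone consuming this section from a client, here is a minimal
Python sketch that collects the snippets per document and field. It
assumes the "highlighting" list sits somewhere under a single root
element (a <response> wrapper is used here as an assumption) and that
the <em> tags arrive XML escaped as described above; the function name
parse_highlighting is made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample with the <em> tags escaped, as in a real response.
SAMPLE = """<response>
<lst name="highlighting">
 <lst name="IW-02">
   <arr name="name">
     <str>&lt;em&gt;iPod&lt;/em&gt; &amp; &lt;em&gt;iPod&lt;/em&gt; Mini &lt;em&gt;USB&lt;/em&gt; 2.0 Cable</str>
   </arr>
 </lst>
</lst>
</response>"""


def parse_highlighting(xml_text):
    """Return {document id: {field name: [snippet, ...]}}."""
    root = ET.fromstring(xml_text)
    result = {}
    for section in root.iter("lst"):
        # Skip every <lst> except the one named "highlighting".
        if section.get("name") != "highlighting":
            continue
        for doc in section.findall("lst"):
            fields = {}
            for arr in doc.findall("arr"):
                # The parser unescapes the text, so each snippet comes
                # back as one string containing literal <em> tags.
                fields[arr.get("name")] = [s.text or "" for s in arr.findall("str")]
            result[doc.get("name")] = fields
    return result


snippets = parse_highlighting(SAMPLE)
print(snippets["IW-02"]["name"][0])
```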

The Lucene version in Solr was just updated to fix a highlighting bug
with overlapping tokens, so you may want to wait for the 7/16 nightly
Solr build if you use WordDelimiterFilter.

-Yonik
