[jira] [Updated] (SOLR-3110) Search result comes up with truncated words at the start of highlighted fragment

Shyam Bhaskaran (Updated) (JIRA) Tue, 07 Feb 2012 23:38:14 -0800

     [ 
https://issues.apache.org/jira/browse/SOLR-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Shyam Bhaskaran updated SOLR-3110:
----------------------------------

    Description: 
Words are truncated at the start of the displayed highlighter fragment. 
The following boundary scanner setting is configured in the solrconfig.xml 
file:

<str name="hl.bs.chars">.,!?</str> 

If I change the setting to

<str name="hl.bs.chars">.,!? /&#9;/&#10;/&#13;</str> 

the truncation goes away, but another issue appears: the highlighted 
fragment no longer starts at the beginning of the sentence.
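
One possible reading of the two symptoms is sketched below in Python. This is a simplified model of a backward-scanning boundary search, not Lucene's or Solr's actual code. The function name find_start_offset, the sample text, and the small max_scan of 10 are all hypothetical; the small window stands in for the case where none of the configured characters occurs within the scan range.

```python
def find_start_offset(text: str, start: int, boundary_chars: str, max_scan: int) -> int:
    """Scan backward from `start` for a boundary character (illustrative model).

    Returns the offset just after the nearest boundary character found within
    `max_scan` steps, 0 if the scan reaches the beginning of the text, or the
    unchanged `start` if no boundary character is found.
    """
    if start < 1 or start > len(text):
        return start
    offset = start
    remaining = max_scan
    while offset > 0 and remaining > 0:
        if text[offset - 1] in boundary_chars:
            return offset
        offset -= 1
        remaining -= 1
    return 0 if offset == 0 else start


# Hypothetical sample text; START is assumed to land in the middle of "fragments".
TEXT = "Solr is fast. The highlighter returns fragments around each matching term."
START = TEXT.index("gments")

# Punctuation only, nothing found inside the scan window: the offset is unchanged,
# so the fragment begins mid-word.
punct_only = find_start_offset(TEXT, START, ".,!?", 10)
print(repr(TEXT[punct_only:punct_only + 12]))

# Whitespace added to the set: the scan stops at the nearest space, so the fragment
# begins at a word start, but not at the start of the sentence.
with_space = find_start_offset(TEXT, START, ".,!? /\t\n\r", 10)
print(repr(TEXT[with_space:with_space + 12]))
```

Under this model, a punctuation-only set yields either a sentence start or an unchanged (possibly mid-word) offset, while a set that includes whitespace almost always stops at the nearest word boundary first.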

Below is the complete list of settings we are using for the boundary scanners.

   <boundaryScanner name="simple" class="solr.highlight.SimpleBoundaryScanner" 
default="true">
     <lst name="defaults">
       <str name="hl.bs.maxScan">200</str>
       <str name="hl.bs.chars">.,!? /&#9;/&#10;/&#13;</str>
     </lst>
   </boundaryScanner>

   <boundaryScanner name="breakIterator" 
class="solr.highlight.BreakIteratorBoundaryScanner">
     <lst name="defaults">
       <str name="hl.bs.type">SENTENCE</str>
       <str name="hl.bs.language">en</str>
       <str name="hl.bs.country">US</str>
     </lst>
   </boundaryScanner>
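
To make explicit which characters the hl.bs.chars value above contains once the XML character references are decoded, here is a small Python check using the standard library parser. It only shows what an XML parser yields for the element text; how Solr itself post-processes the value is not verified here.

```python
import xml.etree.ElementTree as ET

# The element exactly as it appears in the "simple" boundary scanner defaults.
fragment = '<str name="hl.bs.chars">.,!? /&#9;/&#10;/&#13;</str>'

element = ET.fromstring(fragment)
boundary_chars = set(element.text)

# Print each distinct character in escaped form, e.g. '\t' for a tab.
for ch in sorted(boundary_chars):
    print(repr(ch))
```

Besides the punctuation, space, tab, line feed, and carriage return, the decoded set also contains the literal "/" character, since the slashes are ordinary text inside the element.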



  was:
Words are truncated at the start of the displayed highlighter fragment. 
The following boundary scanner setting is configured in the solrconfig.xml 
file:

<str name="hl.bs.chars">.,!? \&#9;\&#10;\&#13;</str> 

If I change the setting to <str name="hl.bs.chars">.,!?</str> 

the truncation goes away, but another issue appears: the highlighted 
fragment no longer starts at the beginning of the sentence.

Below is the complete list of settings we are using for the boundary scanners.

   <boundaryScanner name="simple" class="solr.highlight.SimpleBoundaryScanner" 
default="true">
     <lst name="defaults">
       <str name="hl.bs.maxScan">200</str>
       <str name="hl.bs.chars">.,!? '&#9;''&#10;''&#13;'</str>
     </lst>
   </boundaryScanner>

   <boundaryScanner name="breakIterator" 
class="solr.highlight.BreakIteratorBoundaryScanner">
     <lst name="defaults">
       <str name="hl.bs.type">SENTENCE</str>
       <str name="hl.bs.language">en</str>
       <str name="hl.bs.country">US</str>
     </lst>
   </boundaryScanner>



    
> Search result comes up with truncated words at the start of highlighted 
> fragment
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-3110
>                 URL: https://issues.apache.org/jira/browse/SOLR-3110
>             Project: Solr
>          Issue Type: Bug
>          Components: highlighter
>    Affects Versions: 4.0
>         Environment: java Tomcat Solaris
>            Reporter: Shyam Bhaskaran
>              Labels: FastVectorHighlighter, boundaryScanner, highlighting, 
> solr
