[ 
https://issues.apache.org/jira/browse/LUCY-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marvin Humphrey updated LUCY-182:
---------------------------------

    Attachment: LUCY-182.patch

The problem arises because Highlighter_highlight_excerpt is being passed the
wrong array of Spans.  It is getting the raw Spans produced by the Compiler
object, which may contain overlapping Spans.  It should instead be using the
"flattened" array of Spans produced by the HeatMap, which has no overlaps and
was in fact purpose-built for this situation.
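
To illustrate why the flattened array matters, here is a minimal Python sketch.
It is not the code in the attached patch, nor Lucy's HeatMap implementation;
the names flatten_spans and highlight and the (offset, length, weight) tuple
layout are assumptions made up for this example.  It splits overlapping or
duplicate spans into non-overlapping ones, summing weights where they overlap,
and then wraps each flattened span in tags.

```python
def flatten_spans(spans):
    """Turn possibly overlapping (offset, length, weight) spans into
    non-overlapping ones, summing the weights of all spans that cover
    each resulting segment.  Hypothetical helper, for illustration only."""
    # Every span start and end is a potential segment boundary.
    bounds = sorted({b for off, length, _ in spans
                     for b in (off, off + length)})
    flat = []
    for start, end in zip(bounds, bounds[1:]):
        # Weights of all input spans that fully cover this segment.
        covering = [w for off, length, w in spans
                    if off <= start and end <= off + length]
        if covering:  # skip gaps that no span covers
            flat.append((start, end - start, sum(covering)))
    return flat


def highlight(text, spans, pre="<strong>", post="</strong>"):
    """Wrap each span in tags.  Assumes the spans do NOT overlap,
    which is why it should be fed flattened spans."""
    out = []
    pos = 0
    for off, length, _ in sorted(spans):
        out.append(text[pos:off])                      # untouched text
        out.append(pre + text[off:off + length] + post)
        pos = off + length
    out.append(text[pos:])
    return "".join(out)


title = "Companies, Products, Trade Leads, Business Marketplace"
start = title.index("Business")

# The query [business business] yields two identical spans for one word.
raw = [(start, 8, 1.0), (start, 8, 1.0)]
flat = flatten_spans(raw)
print(flat)
print(highlight(title, flat))
```

With the two duplicate spans collapsed into one, only "Business" ends up
inside the tags.  Feeding the raw, duplicated spans to a routine that assumes
no overlaps is what lets the output go wrong.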
                
> highlighter bug when searching for duplicate terms [wordX wordX]
> ----------------------------------------------------------------
>
>                 Key: LUCY-182
>                 URL: https://issues.apache.org/jira/browse/LUCY-182
>             Project: Lucy
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.2.0 (incubating)
>         Environment: Linux 2.6.18-238.19.1.el5 #1 SMP Fri Jul 15 07:31:24 EDT 
> 2011 x86_64 x86_64 x86_64 GNU/Linux
> Perl v5.8.8 built for x86_64-linux-thread-multi
> gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)
> glibc-2.5-58.el5_6.4
>            Reporter: gk
>            Assignee: Marvin Humphrey
>             Fix For: 0.2.0 (incubating)
>
>         Attachments: LUCY-182.patch
>
>
> I stumbled onto this one when searching for [business to business].
> Source <TITLE>: ...Companies, Products, Trade Leads, Business Marketplace
> 'to' is a stopword which is ignored - no problem.
> So the query then becomes [business business].  The highlighter then produces:
> ...Companies, Products, Trade Leads, <strong>Business</strong> <strong>Marketp</strong>lace
> I then spent some time chasing my tail trying to reduce things down to
> a small reproducible unit, and finally decided to try searching for
> any duplicate [wordX wordX], and sure enough it's reproducible with
> all my indexes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
