[ 
https://issues.apache.org/jira/browse/LUCY-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gk updated LUCY-182:
--------------------

    Comment: was deleted

(was: Just documenting an extra bit which might help towards debugging this, 
although it might not be related.

This bug might also account for the performance problem I've encountered with 
setting up the highlighter objects - or is that not related?

For example, I'm highlighting on title and body:

   my $body_highlighter = Lucy::Highlight::Highlighter->new(
       searcher       => $poly_searcher,
       query          => $query,
       field          => 'body',
       excerpt_length => 190,
   );

   my $title_highlighter = Lucy::Highlight::Highlighter->new(
       searcher       => $poly_searcher,
       query          => $query,
       field          => 'title',
       excerpt_length => 75,
   );

...basic stuff.

Each new() call takes 4s to complete.  Somehow I don't recall this
being the case before :/
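
A rough way to confirm where the time goes is to wrap each constructor
call in a timer.  This is only a sketch: timed() is a hypothetical helper
(not part of Lucy), and $poly_searcher and $query are assumed to be set up
as in the snippets above.

   use strict;
   use warnings;
   use Time::HiRes qw(gettimeofday tv_interval);

   # Hypothetical helper: run a code ref once and return
   # ( result, elapsed wall-clock seconds ).
   sub timed {
       my ($code) = @_;
       my $start  = [gettimeofday];
       my $result = $code->();
       return ( $result, tv_interval($start) );
   }

   # Assumes $poly_searcher and $query already exist, as above.
   my ( $body_highlighter, $secs ) = timed( sub {
       Lucy::Highlight::Highlighter->new(
           searcher       => $poly_searcher,
           query          => $query,
           field          => 'body',
           excerpt_length => 190,
       );
   } );
   printf "body highlighter new(): %.3fs\n", $secs;

Timing the 'title' constructor the same way would show whether both calls
are equally slow or whether one field dominates.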
)
    
> highlighter bug when searching for duplicate terms [wordX wordX]
> ----------------------------------------------------------------
>
>                 Key: LUCY-182
>                 URL: https://issues.apache.org/jira/browse/LUCY-182
>             Project: Lucy
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.1.0 (incubating), 0.2.0 (incubating), 0.2.1 (incubating)
>            Reporter: gk
>            Assignee: Marvin Humphrey
>             Fix For: 0.2.2 (incubating), 0.3.0 (incubating)
>
>         Attachments: LUCY-182.patch
>
>
> I stumbled onto this one when searching for [business to business].
> Source <TITLE>: ...Companies, Products, Trade Leads, Business Marketplace
> 'to' is a stopword which is ignored - no problem.
> So the query then becomes [business business].  The highlighter then produces:
> ...Companies, Products, Trade Leads, <strong>Business</strong> <strong>Marketp</strong>lace
> I then spent some time chasing my tail trying to reduce things down to
> a small reproducible unit, and finally decided to try searching for
> any duplicate [wordX wordX], and sure enough it's reproducible with
> all my indexes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
