Highlighting/excerpt on URLs 
-----------------------------

                 Key: LUCY-199
                 URL: https://issues.apache.org/jira/browse/LUCY-199
             Project: Lucy
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.2.2 (incubating)
         Environment: Linux
            Reporter: Henry


If I explicitly specify excerpt_length:

my $hl             = Lucy::Highlight::Highlighter->new(
   searcher       => $searcher,
   query          => $query_compiler,
   field          => 'site',
   excerpt_length => 60,
);

...and the field content is longer than 60 characters, then

$hl->create_excerpt($hit);

returns '...'.

Content shorter than 60 characters returns the highlighted excerpt as expected.

If I comment out "excerpt_length => 60," above, then it returns the full
non-truncated excerpt with highlighting as expected.

Some samples longer than 60 characters that return only an ellipsis
(… or "...") when searching for [iol.co.za] or [news24.com] (brackets
are mine):

[www.iol.co.za/tonight/books/what-the-dickens-gets-a-statue-1.1130220]
[http://www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSiteHome/0,,,00.html]
[www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSite_TravelClub/0,,,00.html]

The following return a double ellipsis ("......" or ……) when searching
for [adsl mweb.com]:

[http://www.mweb.co.za/helpcentre/ADSL/ADSLGeneralIdisagreewithyourusagereport.aspx]
[http://www.mweb.co.za/helpcentre/FrequentlyAskedQuestions/MWEBHelpCentreFAQsHowdoI/FAQHowdoIHowdoImigratemyADSL/tabid/661/Default.aspx]




        
