[ https://issues.apache.org/jira/browse/LUCENE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15470875#comment-15470875 ]

ASF GitHub Bot commented on LUCENE-7438:
----------------------------------------

GitHub user Timothy055 opened a pull request:

    https://github.com/apache/lucene-solr/pull/79

    LUCENE-7438 UnifiedHighlighter

    Initial pull request for [LUCENE-7438](https://issues.apache.org/jira/browse/LUCENE-7438)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Timothy055/lucene-solr master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/79.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #79
    
----
commit 02e932c4a6146363680b88f4947a693c6697c955
Author: Timothy Rodriguez <[email protected]>
Date:   2016-09-01T19:23:50Z

    Initial fork of PostingsHighlighter for UnifiedHighlighter

commit 9d88411b3985a98851384d78d681431dba710e89
Author: Timothy Rodriguez <[email protected]>
Date:   2016-09-01T23:17:06Z

    Initial commit of the UnifiedHighlighter for OSS contribution

commit e45e39bc4b07ea33e4423b264c2fefb9aa08777a
Author: David Smiley <[email protected]>
Date:   2016-09-02T12:45:49Z

    Fix misc issues; "ant test" now works. (#1)

commit 046a28ef31acf4cea7d255bbbb4b827e6a714e3d
Author: Timothy Rodriguez <[email protected]>
Date:   2016-09-02T20:58:31Z

    Minor refactoring of the AnalysisFieldHighlighter

commit ccd1a2280abd4b48cfef8122696e5d9cfd12920f
Author: David Smiley <[email protected]>
Date:   2016-09-03T12:55:20Z

    AbstractFieldHighlighter: order methods more sensibly; renamed a couple.

commit d4714a04a3e41d5e95bbe942b275c32ed69b9c2e
Author: David Smiley <[email protected]>
Date:   2016-09-04T01:03:29Z

    Improve javadocs and @lucene.external/internal labeling & scope.
    "ant precommit" now passes.

commit e0659f18a59bf2893076da6d7643ff30f2fa5a52
Author: David Smiley <[email protected]>
Date:   2016-09-04T01:25:55Z

    Analysis: remove dubious filter() method

commit ccd7ce707bff2c06da89b31853cca9aecea72008
Author: David Smiley <[email protected]>
Date:   2016-09-04T01:44:01Z

    getStrictPhraseHelper -> rm "Strict", getHighlightAccuracy -> getFlags, and only call filterExtractedTerms once.

commit ffc2a22c700b8abcbf87673d5d05bb3659d177c9
Author: David Smiley <[email protected]>
Date:   2016-09-04T15:21:08Z

    UnifiedHighlighter round 2 (#2)
    
    * AbstractFieldHighlighter: order methods more sensibly; renamed a couple.
    
    * Improve javadocs and @lucene.external/internal labeling & scope.
    "ant precommit" now passes.
    
    * Analysis: remove dubious filter() method
    
    * getStrictPhraseHelper -> rm "Strict", getHighlightAccuracy -> getFlags, and only call filterExtractedTerms once.

commit 5f95e05595db462d3ab5bffc68c2c92f70875072
Author: David Smiley <[email protected]>
Date:   2016-09-04T16:12:33Z

    Refactor: FieldOffsetStrategy

commit 86fb6265fbbdb955ead6d4baf944bf708175715e
Author: David Smiley <[email protected]>
Date:   2016-09-04T16:21:32Z

    stop passing maxPassages into highlightFieldForDoc()

commit f6fd80544eae9fab953b94b1e9346c0883f956eb
Author: David Smiley <[email protected]>
Date:   2016-09-04T16:12:33Z

    Refactor: FieldOffsetStrategy

commit b335a673c2ce45904890c1e9af7cbfda2bd27b0f
Author: David Smiley <[email protected]>
Date:   2016-09-04T16:21:32Z

    stop passing maxPassages into highlightFieldForDoc()

commit 478db9437b92214cbf459f82ba2e3a67c966a150
Author: David Smiley <[email protected]>
Date:   2016-09-04T18:29:44Z

    Rename subclasses of FieldOffsetStrategy.

commit dbf4280755c11420a5032445cd618fadb7444b61
Author: David Smiley <[email protected]>
Date:   2016-09-04T18:31:34Z

    Re-order and harmonize params on methods called by UH.getFieldHighlighter()

commit f0340e27e61dcda2e11992f08ec07a72fad6c24c
Author: David Smiley <[email protected]>
Date:   2016-09-04T18:53:51Z

    FieldHighlighter: harmonize field/param order. And don't apply maxNoHighlightPasses twice.

commit 817f63c1d48fd523c13b9c40a2ae9b8a4047209a
Author: Timothy Rodriguez <[email protected]>
Date:   2016-09-06T20:43:20Z

    Merge of renaming changes

commit 0f644a4f53c1ed4d41d562848f6fe51a87442a75
Author: Timothy Rodriguez <[email protected]>
Date:   2016-09-06T20:54:13Z

    add visibility tests

commit 9171f49e117085e7d086267bb73836831ff07f8e
Author: Timothy Rodriguez <[email protected]>
Date:   2016-09-07T14:26:59Z

    ADd additional extensibility test

----


> UnifiedHighlighter
> ------------------
>
>                 Key: LUCENE-7438
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7438
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/highlighter
>    Affects Versions: 6.2
>            Reporter: Timothy M. Rodriguez
>            Assignee: David Smiley
>
> The UnifiedHighlighter is an evolution of the PostingsHighlighter that is 
> able to highlight using offsets in either postings, term vectors, or from 
> analysis (a TokenStream). Lucene’s existing highlighters are mostly 
> demarcated along offset source lines, whereas here it is unified -- hence 
> this proposed name. In this highlighter, the offset source strategy is 
> separated from the core highlighting functionality. The UnifiedHighlighter 
> further improves on the PostingsHighlighter’s design by supporting accurate 
> phrase highlighting using an approach similar to the standard highlighter’s 
> WeightedSpanTermExtractor. The next major improvement is a hybrid offset 
> source strategy that utilizes postings and “light” term vectors (i.e. just the 
> terms) for highlighting multi-term queries (wildcards) without resorting to 
> analysis. Phrase highlighting and wildcard highlighting can both be disabled 
> if you’d rather highlight a little faster albeit not as accurately reflecting 
> the query.
> We’ve benchmarked an earlier version of this highlighter comparing it to the 
> other highlighters and the results were exciting! It’s tempting to share 
> those results but it’s definitely due for another benchmark, so we’ll work on 
> that. Performance was the main motivator for creating the UnifiedHighlighter, 
> as the standard Highlighter (the only one meeting Bloomberg Law’s accuracy 
> requirements) wasn’t fast enough, even with term vectors along with several 
> improvements we contributed back, and even after we forked it to highlight in 
> multiple threads.
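
The separation described above, where the offset source is decoupled from the
core highlighting logic, can be illustrated with a minimal, self-contained
sketch. It is not taken from the pull request: OffsetStrategySketch,
OffsetSource, ScanningSource, PrecomputedSource and highlight() are all
hypothetical names chosen for illustration, and none of this is the Lucene
API. The sketch assumes each source returns sorted, non-overlapping
[start, end) character offsets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: every name here is hypothetical, not Lucene's API.
final class OffsetStrategySketch {

    /** Hypothetical abstraction: supplies sorted, non-overlapping [start, end) offsets for a term. */
    interface OffsetSource {
        List<int[]> offsets(String text, String term);
    }

    /** Finds offsets by re-scanning the text, loosely analogous to re-analysis. */
    static final class ScanningSource implements OffsetSource {
        @Override
        public List<int[]> offsets(String text, String term) {
            List<int[]> spans = new ArrayList<>();
            if (term.isEmpty()) {
                return spans;
            }
            int at = text.indexOf(term);
            while (at >= 0) {
                spans.add(new int[] {at, at + term.length()});
                at = text.indexOf(term, at + term.length());
            }
            return spans;
        }
    }

    /** Returns offsets recorded ahead of time, loosely analogous to postings or term vectors. */
    static final class PrecomputedSource implements OffsetSource {
        private final Map<String, List<int[]>> offsetsByTerm;

        PrecomputedSource(Map<String, List<int[]>> offsetsByTerm) {
            this.offsetsByTerm = offsetsByTerm;
        }

        @Override
        public List<int[]> offsets(String text, String term) {
            return offsetsByTerm.getOrDefault(term, List.of());
        }
    }

    /** Core highlighting: wraps each span in <b> tags without knowing where the offsets came from. */
    static String highlight(String text, String term, OffsetSource source) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (int[] span : source.offsets(text, term)) {
            out.append(text, pos, span[0]);      // unhighlighted text before the match
            out.append("<b>").append(text, span[0], span[1]).append("</b>");
            pos = span[1];
        }
        out.append(text, pos, text.length());    // trailing text after the last match
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "fast highlighter, fast results";
        OffsetSource precomputed = new PrecomputedSource(
                Map.of("fast", List.of(new int[] {0, 4}, new int[] {18, 22})));
        System.out.println(highlight(text, "fast", new ScanningSource()));
        System.out.println(highlight(text, "fast", precomputed));
    }
}
```

Both calls in main() go through the same highlight() method; only the source
of the offsets differs, which is the design point the description makes.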



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
