[ https://issues.apache.org/jira/browse/LUCENE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nik Everett updated LUCENE-5274:
--------------------------------

    Attachment: LUCENE-5274.patch

Not done yet, but progress so far:
1.  Move MergedIterator to util.
2.  Add a mode to it that does not remove duplicates (one extra branch per 
call to next).
3.  Add a unit test for MergedIterator.
4.  Make FieldTermStack.TermInfo, FieldPhraseList.WeightedPhraseInfo, and 
FieldPhraseList.WeightedPhraseInfo.Toffs consistent under equals, hashCode, and 
compareTo.  I don't think any of them would make good hash keys, but I fixed up 
hashCode because I fixed up equals.
5.  Unit tests for point 4.
6.  Use the non-duplicate-removing mode of MergedIterator in FieldPhraseList's 
merge methods.
7.  More tests in FastVectorHighlighterTest - mostly around exactly equal 
matches and how they affect segment sorting.
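
For anyone following along without the patch open, here is a rough sketch of 
what the mode from point 2 amounts to.  This is not the code in the patch: the 
class name MergeSketch, its constructor, and the drain helper are all made up 
for illustration.  It merges already-sorted inputs through a priority queue, 
and the removeDuplicates flag costs one extra branch per call to next().

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical sketch only; this is NOT Lucene's MergedIterator.
// Merges several already-sorted iterators into one sorted stream.
final class MergeSketch<T extends Comparable<T>> implements Iterator<T> {

  // One cursor per input: the current head element plus the rest of that input.
  private static final class Cursor<T extends Comparable<T>>
      implements Comparable<Cursor<T>> {
    T head;
    final Iterator<T> rest;
    final int index; // input position, used to break ties deterministically

    Cursor(Iterator<T> rest, int index) {
      this.rest = rest;
      this.index = index;
      this.head = rest.next();
    }

    @Override
    public int compareTo(Cursor<T> other) {
      int cmp = head.compareTo(other.head);
      return cmp != 0 ? cmp : Integer.compare(index, other.index);
    }
  }

  private final PriorityQueue<Cursor<T>> queue = new PriorityQueue<>();
  private final boolean removeDuplicates;

  MergeSketch(boolean removeDuplicates, List<Iterator<T>> inputs) {
    this.removeDuplicates = removeDuplicates;
    int index = 0;
    for (Iterator<T> input : inputs) {
      if (input.hasNext()) {
        queue.add(new Cursor<>(input, index));
      }
      index++;
    }
  }

  @Override
  public boolean hasNext() {
    return !queue.isEmpty();
  }

  @Override
  public T next() {
    if (queue.isEmpty()) {
      throw new NoSuchElementException();
    }
    T result = popAndAdvance();
    if (removeDuplicates) { // the one extra branch per call
      // Skip every other head that compares equal to the element just taken.
      while (!queue.isEmpty() && queue.peek().head.compareTo(result) == 0) {
        popAndAdvance();
      }
    }
    return result;
  }

  // Takes the smallest head and re-queues its cursor if the input has more.
  private T popAndAdvance() {
    Cursor<T> top = queue.poll();
    T value = top.head;
    if (top.rest.hasNext()) {
      top.head = top.rest.next();
      queue.add(top);
    }
    return value;
  }

  // Convenience helper for tests: drains a merge into a list.
  static <E extends Comparable<E>> List<E> drain(
      boolean removeDuplicates, List<Iterator<E>> inputs) {
    List<E> out = new ArrayList<>();
    Iterator<E> merged = new MergeSketch<>(removeDuplicates, inputs);
    while (merged.hasNext()) {
      out.add(merged.next());
    }
    return out;
  }
}
```

Merging [1, 3, 5] with [1, 2, 5] in this sketch gives [1, 2, 3, 5] with 
removeDuplicates set and [1, 1, 2, 3, 5, 5] without it.  The second behavior is 
the one the FieldPhraseList merge methods need, since equal entries coming from 
different inputs must all survive the merge.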

At this point, this is what is left:
1.  Unit tests for equal matches in the same FieldPhraseList.
2.  Poke around with corner cases during merges.  Test them in 
FastVectorHighlighterTest if they reflect mockable real world cases.  Expand 
FieldPhraseListTest if they don't.
3.  Remove highlighter dependency on analyzer module.  Would it make sense to 
move PerFieldAnalyzerWrapper into core?  Something else?
4.  Anything else from review.

> Teach fast FastVectorHighlighter to highlight "child fields" with parent 
> fields
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-5274
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5274
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/other
>            Reporter: Nik Everett
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-5274.patch
>
>
> I've been messing around with the FastVectorHighlighter and it looks like I 
> can teach it to highlight matches on "child fields".  Like this query:
> foo:scissors foo_exact:running
> would highlight foo like this:
> <em>running</em> with <em>scissors</em>
> Where foo is stored WITH_POSITIONS_OFFSETS and foo_exact is an unstored copy 
> of foo with a different analyzer and its own WITH_POSITIONS_OFFSETS.
> This would make queries that perform weighted matches against different 
> analyzers much more convenient to highlight.
> I have working code and test cases but they are hacked into Elasticsearch.  
> I'd love to Lucene-ify if you'll take them.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
