[ https://issues.apache.org/jira/browse/LUCENE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709524#comment-15709524 ]
David Smiley commented on LUCENE-7575: -------------------------------------- Thanks for this contribution Jim! You're the first one to improve the UH outside of those who first worked on it. Overall patch looks pretty good. UnifiedHighlighter: * What do you think of adding this to the HighlightFlags instead? It's intended to be a single point to capture various boolean options. As an aside, I'm kinda wondering if the default* and DEFAULT* boolean fields shouldn't exist and instead simply have a highlightFlags enumSet field. * I think the results of filterExtractedTerms might now contain duplicated terms (BytesRefs)? (see my note later about testing the same term and varying the field). We could simply collect those bytes into a HashSet, then extract to an array and then sort. PhraseHelper: * You applied SingleFieldFilterLeafReader at the top of getTermToSpans but I think this should be done by the caller so it happens just once, not per SpanQuery. * FieldRewritingTermHashSet is so close to the other other one... hmm.... what if we had just one, remove "static" from class (thus has access to fieldName & requireFieldMatch), and then implement add() appropriately. Tests: * you used the same test input string for both the "field" and "field_require_field_match" fields. To make this more clear; can you vary them if even a little? * in no test queries do I see the same term BytesRef across more than one field. For example, maybe add a test incorporating something like {{field:test OR field_require_field_match:test}} -- granted the results might not be interesting but lets hope it doesn't puke. Do for phrase as well. I agree this requireFieldMatch=false should not be the default. It'll add some overhead -- especially for phrase and other position sensitive queries since we aren't de-duplicating them. Besides, it's more accurate as-is. As an aside... it'd be interesting if instead of a simple boolean toggle, if it were a {{Predicate<String>}} fieldMatchPredicate so that only some fields could be collected in the query but not all. Just an idea. > UnifiedHighlighter: add requireFieldMatch=false support > ------------------------------------------------------- > > Key: LUCENE-7575 > URL: https://issues.apache.org/jira/browse/LUCENE-7575 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/highlighter > Reporter: David Smiley > Assignee: David Smiley > Attachments: LUCENE-7575.patch > > > The UnifiedHighlighter (like the PostingsHighlighter) only supports > highlighting queries for the same fields that are being highlighted. The > original Highlighter and FVH support loosening this, AKA > requireFieldMatch=false. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org