A side note: I've been using a highlighter based on the Matches API for
quite some time now and it's been fantastic. It is very precise and handles
non-trivial queries (interval queries, for example) very well.

https://lucene.apache.org/core/9_2_0/highlighter/org/apache/lucene/search/matchhighlight/package-summary.html

Dawid

On Mon, Jun 27, 2022 at 1:10 PM Alan Woodward <romseyg...@gmail.com> wrote:
>
> Your approach is almost certainly more efficient, but it might give you false 
> matches in some cases - for example, if you have a complex query with many 
> nested MUST and SHOULD clauses, you can have a leaf TermScorer that is 
> positioned on the correct document, but which is part of a clause that 
> doesn’t actually match.  It also only works for term queries, so it won’t 
> match phrases or span/interval groups.  And Matches will work on points or 
> docvalues queries as well.  The reason I added Matches in the first place was 
> precisely to handle these weird corner cases - I had written highlighters 
> which more or less did the same thing you describe with a Collector and the 
> Scorable tree, and I would occasionally get bad highlights back.
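
The false-match case described above can be illustrated with a small toy
model. This is only a sketch in plain Java: Node, Term, Bool and the two
helper methods are hypothetical names made up for illustration, not Lucene
classes, and the "naive" method merely imitates inspecting leaf scorers that
happen to be positioned on the document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only: none of these types are Lucene classes. */
final class LeafVersusMatches {
  /** A query node; a document is modeled as field name -> terms in that field. */
  interface Node {
    boolean matches(Map<String, Set<String>> doc);
    void collectLeaves(List<Term> out);
    void collectMatchingFields(Map<String, Set<String>> doc, Set<String> out);
  }

  static final class Term implements Node {
    final String field;
    final String text;
    Term(String field, String text) { this.field = field; this.text = text; }
    public boolean matches(Map<String, Set<String>> doc) {
      return doc.getOrDefault(field, Collections.<String>emptySet()).contains(text);
    }
    public void collectLeaves(List<Term> out) { out.add(this); }
    public void collectMatchingFields(Map<String, Set<String>> doc, Set<String> out) {
      if (matches(doc)) out.add(field);
    }
  }

  /** All MUST children have to match; with no MUST children, one SHOULD child has to. */
  static final class Bool implements Node {
    final List<Node> must;
    final List<Node> should;
    Bool(List<Node> must, List<Node> should) { this.must = must; this.should = should; }
    public boolean matches(Map<String, Set<String>> doc) {
      for (Node n : must) if (!n.matches(doc)) return false;
      if (!must.isEmpty()) return true;
      for (Node n : should) if (n.matches(doc)) return true;
      return false;
    }
    public void collectLeaves(List<Term> out) {
      for (Node n : must) n.collectLeaves(out);
      for (Node n : should) n.collectLeaves(out);
    }
    public void collectMatchingFields(Map<String, Set<String>> doc, Set<String> out) {
      if (!matches(doc)) return; // prune clauses that do not match as a whole
      for (Node n : must) n.collectMatchingFields(doc, out);
      for (Node n : should) n.collectMatchingFields(doc, out);
    }
  }

  /** Imitates looking at every leaf that sits on the document, ignoring its parent clause. */
  static Set<String> naiveFields(Node query, Map<String, Set<String>> doc) {
    Set<String> fields = new TreeSet<>();
    if (!query.matches(doc)) return fields;
    List<Term> leaves = new ArrayList<>();
    query.collectLeaves(leaves);
    for (Term t : leaves) if (t.matches(doc)) fields.add(t.field);
    return fields;
  }

  /** Reports a field only when every enclosing clause matches as well. */
  static Set<String> treeAwareFields(Node query, Map<String, Set<String>> doc) {
    Set<String> fields = new TreeSet<>();
    query.collectMatchingFields(doc, fields);
    return fields;
  }

  /** Document with title:lucene and body:lucene, but no body:solr. */
  static Map<String, Set<String>> sampleDoc() {
    Map<String, Set<String>> doc = new HashMap<>();
    doc.put("title", Collections.singleton("lucene"));
    doc.put("body", Collections.singleton("lucene"));
    return doc;
  }

  /** title:lucene OR (+body:lucene +body:solr) */
  static Node sampleQuery() {
    Node nested = new Bool(
        Arrays.<Node>asList(new Term("body", "lucene"), new Term("body", "solr")),
        Collections.<Node>emptyList());
    return new Bool(
        Collections.<Node>emptyList(),
        Arrays.<Node>asList(new Term("title", "lucene"), nested));
  }

  public static void main(String[] args) {
    System.out.println(naiveFields(sampleQuery(), sampleDoc()));     // prints [body, title]
    System.out.println(treeAwareFields(sampleQuery(), sampleDoc())); // prints [title]
  }
}
```

In this model the document matches only through the title clause. The leaf
body:lucene is present in the document, so the naive walk reports "body" even
though its enclosing MUST clause fails because body:solr is missing.
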
>
> On 27 Jun 2022, at 10:51, Shai Erera <ser...@gmail.com> wrote:
>
> Out of curiosity and for education purposes, is the Collector approach I 
> proposed wrong/inefficient? Or less efficient than the matches() API?
>
> I'm thinking, if you want to both match/rank documents and as a side effect 
> know which fields matched, the Collector will perform better than 
> Weight.matches(), but I could be wrong.
>
> Shai
>
> On Mon, Jun 27, 2022 at 11:57 AM Dawid Weiss <dawid.we...@gmail.com> wrote:
>>
>> The matches API is awesome. Use it. You can also get a rough glimpse
>> into a superset of fields potentially matching the query via:
>>
>>     query.visit(
>>         new QueryVisitor() {
>>           @Override
>>           public boolean acceptField(String field) {
>>             affectedFields.add(field);
>>             return false;
>>           }
>>         });
>>
>> https://lucene.apache.org/core/9_2_0/core/org/apache/lucene/search/Query.html#visit(org.apache.lucene.search.QueryVisitor)
>>
>> I'd go with the Matches API though.
>>
>> Dawid
>>
>> On Mon, Jun 27, 2022 at 10:48 AM Alan Woodward <romseyg...@gmail.com> wrote:
>> >
>> > The Matches API will give you this information - it’s still likely to be 
>> > fairly slow, but it’s a lot easier to use than trying to parse Explain 
>> > output.
>> >
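>> > Query q = ….;
>> > // Rewrite first; scores are not needed just to find matching fields.
>> > Weight w = searcher.createWeight(searcher.rewrite(q),
>> >     ScoreMode.COMPLETE_NO_SCORES, 1.0f);
>> >
>> > // context is the LeafReaderContext of the segment holding the document,
>> > // and doc is the document ID relative to that segment.
>> > Matches m = w.matches(context, doc);
>> > List<String> matchingFields = new ArrayList<>();
>> > if (m != null) { // matches() returns null when the document does not match
>> >   for (String field : m) {
>> >     matchingFields.add(field);
>> >   }
>> > }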
>> >
>> > Bear in mind that `matches` doesn’t maintain any state between calls, so 
>> > calling it for every matching document is likely to be slow; for those 
>> > cases Shai’s suggestion of using a Collector and examining low-level 
>> > scorers will perform better, but it won’t work for every query type.
>> >
>> >
>> > > On 25 Jun 2022, at 04:14, Yichen Sun <yiche...@bu.edu> wrote:
>> > >
>> > > Hello!
>> > >
>> > > I’m an MSCS student at BU learning to use Lucene. Recently I have been 
>> > > trying to output the fields matched by a query. For example, a document 
>> > > has 10 fields and 2 of them match the query; I want to get the names of 
>> > > those fields.
>> > >
>> > > I have tried calling the explain() method and parsing its description 
>> > > with a regex. However, that takes too much time.
>> > >
>> > > I wonder what the efficient way to get the matched fields is. Could you 
>> > > please offer some help? Thank you so much!
>> > >
>> > > Best regards,
>> > > Yichen Sun
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>> >
>>
>
