Your approach is almost certainly more efficient, but it can give you false matches in some cases. For example, if you have a complex query with many nested MUST and SHOULD clauses, you can have a leaf TermScorer that is positioned on the correct document but is part of a clause that doesn’t actually match. It also only works for term queries, so it won’t match phrases or span/interval groups, whereas Matches will work on points or docvalues queries as well. The reason I added Matches in the first place was precisely to handle these weird corner cases: I had written highlighters which more or less did the same thing you describe with a Collector and the Scorable tree, and I would occasionally get bad highlights back.
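
To make the false-match case concrete, here is a toy sketch in plain Java. It does not use Lucene at all: Node, Term and Bool are hypothetical stand-ins for a query tree, and the document is just a map from field name to a set of terms. The query is title:lucene OR (body:lucene AND body:solr). Walking every leaf that hits the document (roughly what inspecting leaf scorers amounts to) reports body as well, even though the nested MUST clause it belongs to does not match. Checking each clause before descending into it reports only title.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class FalseMatchSketch {

  /** Toy query node. A hypothetical stand-in for illustration, not a Lucene class. */
  interface Node {
    /** True if this (sub)query matches the document. */
    boolean matches(Map<String, Set<String>> doc);

    /** Adds the field of every term leaf that hits the document, ignoring clause structure. */
    void leafHits(Map<String, Set<String>> doc, Set<String> out);

    /** Adds fields only from subtrees that themselves match the document. */
    void matchedFields(Map<String, Set<String>> doc, Set<String> out);
  }

  /** A single field:text leaf. */
  static final class Term implements Node {
    final String field;
    final String text;

    Term(String field, String text) {
      this.field = field;
      this.text = text;
    }

    public boolean matches(Map<String, Set<String>> doc) {
      Set<String> terms = doc.get(field);
      return terms != null && terms.contains(text);
    }

    public void leafHits(Map<String, Set<String>> doc, Set<String> out) {
      if (matches(doc)) out.add(field);
    }

    public void matchedFields(Map<String, Set<String>> doc, Set<String> out) {
      if (matches(doc)) out.add(field);
    }
  }

  /** A boolean node: must == true requires all clauses (MUST), otherwise any clause (SHOULD). */
  static final class Bool implements Node {
    final boolean must;
    final List<Node> clauses;

    Bool(boolean must, Node... clauses) {
      this.must = must;
      this.clauses = Arrays.asList(clauses);
    }

    public boolean matches(Map<String, Set<String>> doc) {
      for (Node clause : clauses) {
        boolean m = clause.matches(doc);
        if (must && !m) return false;
        if (!must && m) return true;
      }
      return must;
    }

    public void leafHits(Map<String, Set<String>> doc, Set<String> out) {
      // Visits every leaf, even under a clause that fails as a whole.
      for (Node clause : clauses) clause.leafHits(doc, out);
    }

    public void matchedFields(Map<String, Set<String>> doc, Set<String> out) {
      if (!matches(doc)) return; // prune clauses that do not match
      for (Node clause : clauses) clause.matchedFields(doc, out);
    }
  }

  /** title:lucene OR (body:lucene AND body:solr) */
  static Node demoQuery() {
    return new Bool(false,
        new Term("title", "lucene"),
        new Bool(true, new Term("body", "lucene"), new Term("body", "solr")));
  }

  /** A document whose body contains "lucene" but not "solr". */
  static Map<String, Set<String>> demoDoc() {
    Map<String, Set<String>> doc = new HashMap<>();
    doc.put("title", new HashSet<>(Arrays.asList("lucene")));
    doc.put("body", new HashSet<>(Arrays.asList("lucene", "search")));
    return doc;
  }

  public static void main(String[] args) {
    Node q = demoQuery();
    Map<String, Set<String>> doc = demoDoc();
    Set<String> naive = new TreeSet<>();
    q.leafHits(doc, naive);
    Set<String> exact = new TreeSet<>();
    q.matchedFields(doc, exact);
    System.out.println("leaf hits:      " + naive); // [body, title]
    System.out.println("matched fields: " + exact); // [title]
  }
}
```

The document matches only through title, so body in the first result is the kind of false match I mean. In real Lucene the pruning is done for you by Weight.matches(); the toy only mimics the idea.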

> On 27 Jun 2022, at 10:51, Shai Erera <ser...@gmail.com> wrote:
>
> Out of curiosity and for education purposes, is the Collector approach I
> proposed wrong/inefficient? Or less efficient than the matches() API?
>
> I'm thinking, if you want to both match/rank documents and as a side effect
> know which fields matched, the Collector will perform better than
> Weight.matches(), but I could be wrong.
>
> Shai
>
> On Mon, Jun 27, 2022 at 11:57 AM Dawid Weiss <dawid.we...@gmail.com> wrote:
> The matches API is awesome. Use it. You can also get a rough glimpse
> into a superset of fields potentially matching the query via:
>
> query.visit(
>     new QueryVisitor() {
>       @Override
>       public boolean acceptField(String field) {
>         affectedFields.add(field);
>         return false;
>       }
>     });
>
> https://lucene.apache.org/core/9_2_0/core/org/apache/lucene/search/Query.html#visit(org.apache.lucene.search.QueryVisitor)
>
> I'd go with the Matches API though.
>
> Dawid
>
> On Mon, Jun 27, 2022 at 10:48 AM Alan Woodward <romseyg...@gmail.com> wrote:
> >
> > The Matches API will give you this information - it’s still likely to be
> > fairly slow, but it’s a lot easier to use than trying to parse Explain
> > output.
> >
> > Query q = ….;
> > Weight w = searcher.createWeight(searcher.rewrite(q),
> >     ScoreMode.COMPLETE_NO_SCORES, 1.0f);
> >
> > // context is the LeafReaderContext holding doc; doc is relative to that segment
> > Matches m = w.matches(context, doc);
> > List<String> matchingFields = new ArrayList<>();
> > if (m != null) { // matches() returns null when the document does not match
> >   for (String field : m) {
> >     matchingFields.add(field);
> >   }
> > }
> >
> > Bear in mind that `matches` doesn’t maintain any state between calls, so
> > calling it for every matching document is likely to be slow; for those
> > cases Shai’s suggestion of using a Collector and examining low-level
> > scorers will perform better, but it won’t work for every query type.
> >
> >
> > > On 25 Jun 2022, at 04:14, Yichen Sun <yiche...@bu.edu> wrote:
> > >
> > > Hello!
> > >
> > > I’m an MSCS student at BU and am learning to use Lucene. Recently I have
> > > been trying to output the fields matched by a query. For example, a
> > > document has 10 fields and 2 of them match the query. I want to get the
> > > names of these fields.
> > >
> > > I have tried using the explain() method, getting the description and
> > > then applying a regex. However, it takes too much time.
> > >
> > > I wonder what an efficient way to get the matched fields is. Would you
> > > please offer some help? Thank you so much!
> > >
> > > Best regards,
> > > Yichen Sun
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org