Hi,

I've never done it myself, but from the docs:

From the Query (https://lucene.apache.org/core/8_6_3/core/org/apache/lucene/search/Query.html) you can retrieve the Weight (https://lucene.apache.org/core/8_6_3/core/org/apache/lucene/search/Weight.html), from which you can access the Matches (https://lucene.apache.org/core/8_6_3/core/org/apache/lucene/search/Matches.html). That should give you access to the token positions, and thus to the tokens that match.
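Something along these lines might do it. This is only a sketch, assuming Lucene 8.x; the class name `MatchedTerms`, the methods `matchedTerms` and `demo`, and the field name `content` are made up for illustration. It walks Query -> Weight -> Matches for one hit and reads the term from each matching leaf query. It only handles `TermQuery` leaves; for phrase or wildcard queries you would probably have to map the reported positions or offsets back to terms some other way.

```java
import java.io.IOException;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.ReaderUtil;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Matches;
import org.apache.lucene.search.MatchesIterator;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.ScoreMode;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.Weight;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class MatchedTerms {

    /** Collects the text of every TermQuery leaf that matched the given top-level doc id. */
    static Set<String> matchedTerms(IndexSearcher searcher, Query query, int docId)
            throws IOException {
        Set<String> terms = new TreeSet<>();
        // A Weight is built from the rewritten query; scores are not needed here.
        Weight weight = searcher.createWeight(
                searcher.rewrite(query), ScoreMode.COMPLETE_NO_SCORES, 1f);
        // Matches are reported per segment, so map the global doc id to its leaf.
        List<LeafReaderContext> leaves = searcher.getIndexReader().leaves();
        LeafReaderContext leaf = leaves.get(ReaderUtil.subIndex(docId, leaves));
        Matches matches = weight.matches(leaf, docId - leaf.docBase);
        if (matches == null) {
            return terms; // the document does not match the query
        }
        for (String field : matches) { // Matches iterates over the matching field names
            MatchesIterator it = matches.getMatches(field);
            if (it == null) {
                continue;
            }
            while (it.next()) {
                // it.startPosition() / it.endPosition() give the token positions.
                Query leafQuery = it.getQuery();
                if (leafQuery instanceof TermQuery) {
                    terms.add(((TermQuery) leafQuery).getTerm().text());
                }
            }
        }
        return terms;
    }

    /** Hypothetical demo: one in-memory document, a query for "fox" OR "dog". */
    static Set<String> demo() throws IOException {
        Directory dir = new ByteBuffersDirectory();
        try (IndexWriter writer =
                new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            Document doc = new Document();
            doc.add(new TextField("content", "the quick brown fox", Field.Store.NO));
            writer.addDocument(doc);
        }
        Query query = new BooleanQuery.Builder()
                .add(new TermQuery(new Term("content", "fox")), BooleanClause.Occur.SHOULD)
                .add(new TermQuery(new Term("content", "dog")), BooleanClause.Occur.SHOULD)
                .build();
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            Set<String> all = new TreeSet<>();
            for (ScoreDoc hit : searcher.search(query, 10).scoreDocs) {
                all.addAll(matchedTerms(searcher, query, hit.doc));
            }
            return all;
        }
    }
}
```

In the demo only "fox" occurs in the document, so the "dog" clause should contribute nothing. In your case you would call `matchedTerms` with your own searcher and query for each hit returned by the search.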
On Fri, 9 Jul 2021 at 14:05, Trevor Nicholls <tre...@castingthevoid.com> wrote:

> Problem: I have indexed the filepath and the content of thousands of
> documents and can successfully query the index on the text to return a
> collection of filepaths. Now I need to create a collection of the tokens in
> the index which matched the query.
>
> I can see that there are solutions to a related problem, which is how I
> could highlight the matching terms if I displayed relevant fragments of the
> document contents. But I don't want to do this; I just want a list of the
> tokens. The tokens are in the index, the tokens are matched by the query. It
> seems a lot of extra work to take the selected document, retokenize it,
> re-execute the query and replace the matching tokens when surely the tokens
> which match the query are accessible somewhere. (Besides, I can't use
> Lucene's highlighting to display the document with highlights, because the
> index is not built from the displayed document but from a pre-processed
> extract of it, and I don't want to just display fragments of it).
>
> I thought the Explanation class might be what I need to use but when I
> display the content of the explanation for each matching document I see only
> something like this:
>
> score=5.9498425
> 0.0 = No matching clauses
>
> which is no help at all.
>
> Is this a wild goose chase or is it achievable somehow?
>
> cheers
> T