Hello, Andrei. Docs are scored in doc-ID order (see Weight.scoreAll() and scoreRange()), simply because the underlying postings API iterates in that order. There are a few shortcuts/optimizations, but they only skip some iterations or segments, for example by checking whether a doc can still be competitive, and so on.
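To make the point concrete, here is a rough sketch in plain Java. It is not Lucene code: the class InOrderTopN, its topN() method and the Result holder are names I made up for illustration, and it only models the basic idea (walk docs in doc-ID order, keep a top-N queue, ignore non-competitive docs). The real shortcuts in Lucene are more involved than this.

```java
import java.util.PriorityQueue;

/** Toy model of in-order top-N collection. Hypothetical names, not Lucene API. */
final class InOrderTopN {

    /** Counters describing one simulated search. */
    static final class Result {
        final int visited;    // docs the iterator stepped over
        final int collected;  // docs that entered the top-N queue at some point
        final int[] topDocs;  // doc IDs of the N smallest values, ascending by value

        Result(int visited, int collected, int[] topDocs) {
            this.visited = visited;
            this.collected = collected;
            this.topDocs = topDocs;
        }
    }

    /**
     * Walks docs in doc-ID order, the way a postings iterator would,
     * and keeps the n docs with the smallest sort values.
     * valueByDoc[doc] is the sort value of that doc.
     */
    static Result topN(int[] valueByDoc, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        // Max-heap on value: the head is the weakest entry currently in the top N.
        PriorityQueue<int[]> queue =
            new PriorityQueue<>((a, b) -> Integer.compare(b[1], a[1]));
        int visited = 0;
        int collected = 0;
        for (int doc = 0; doc < valueByDoc.length; doc++) {
            visited++;
            int value = valueByDoc[doc];
            if (queue.size() < n) {
                // Queue not full yet: every doc is competitive.
                queue.add(new int[] {doc, value});
                collected++;
            } else if (value < queue.peek()[1]) {
                // Beats the current weakest entry: replace it.
                queue.poll();
                queue.add(new int[] {doc, value});
                collected++;
            }
            // Otherwise the doc is not competitive and is ignored,
            // but the iterator still had to step over it.
        }
        // Drain the max-heap from the back so the result is ascending by value.
        int[] top = new int[queue.size()];
        for (int i = top.length - 1; i >= 0; i--) {
            top[i] = queue.poll()[0];
        }
        return new Result(visited, collected, top);
    }
}
```

For example, topN(new int[] {5, 3, 8, 1, 9, 2}, 2) visits all 6 docs, puts 4 of them into the queue at some point, and returns docs 3 and 5. The order of the walk is fixed by doc ID; the sort only decides which of the visited docs are kept.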
On Sun, Nov 6, 2022 at 1:35 AM Solodin, Andrei (TR Technology) <andrei.solo...@thomsonreuters.com.invalid> wrote:

> One more thing. While the test case passes now, it still iterates in index
> order, which means that it still collects ~6.4K docs out of 10k matches.
> This is an improvement, but I am still wondering why it's not possible to
> iterate in the field order. Seems like that would provide substantial
> improvement.
>
> From: Solodin, Andrei (TR Technology)
> Sent: Saturday, November 5, 2022 5:18 PM
> To: java-user@lucene.apache.org
> Subject: RE: Efficient sort on SortedDocValues
>
> I just realized that the problem is that the field needs to be indexed as
> well. Now it works. But I noticed that this only works in Lucene 9. It does
> not work in Lucene 8 (specifically 8.11.2). This must be new functionality
> in Lucene 9?
>
> Thanks
>
> From: Solodin, Andrei (TR Technology)
> Sent: Saturday, November 5, 2022 1:07 PM
> To: java-user@lucene.apache.org
> Subject: Efficient sort on SortedDocValues
>
> Hello Lucene community, while looking into how to efficiently sort on a
> field value, I came across a couple of things that I don't quite
> understand. My assumption was that if I execute a search and sort on a
> SortedDocValues field, Lucene would only iterate over the docs in the order
> of the field values or at least collect only competitive docs (docs that
> made it into the topN queue). Neither of those things seems to be
> happening. Instead, the iteration is happening in index order and all
> matched docs are collected. Looking at the code, I see that the
> optimizations are only possible if the index is sorted in the field order
> to begin with, which is not possible for our use case. We may have dozens
> of such fields in our index, thus there isn't any one field that can be
> used to sort the index. So I guess my question is whether what I am trying
> to achieve is possible. I tried to look through the Solr codebase, but so
> far couldn't come up with anything. Code example is here:
> https://pastebin.com/i05E2wZy. I am using 9.4.1. Thanks in advance.
>
> Andrei

--
Sincerely yours
Mikhail Khludnev