Re: Request Highlighting only for the final set of rows

Erick Erickson Fri, 18 Aug 2017 07:24:19 -0700

I don't think you're reading it correctly. First of all, if you're
going to be doing deep paging you should be using cursorMark, see:
https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results.
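
For what it's worth, here is a rough sketch of a cursorMark paging loop in
Python. The `fetch` callable is hypothetical (whatever you use to send the
request to Solr and parse the JSON response), and I'm assuming the uniqueKey
field is called `id`. `cursorMark`, `sort`, `rows` and `nextCursorMark` are
the actual parameter and response names; everything else is illustration.

```python
def iter_pages(fetch, query, rows=200, sort="score desc, id asc"):
    """Yield pages of docs using cursorMark instead of start/rows.

    fetch: hypothetical callable taking a dict of Solr params and
           returning the parsed JSON response as a dict.
    sort:  must include the uniqueKey field (assumed here to be `id`)
           as a tie-breaker, otherwise Solr rejects the cursor request.
    """
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        resp = fetch({
            "q": query,
            "rows": rows,
            "sort": sort,
            "cursorMark": cursor,
        })
        docs = resp["response"]["docs"]
        if docs:
            yield docs
        next_cursor = resp["nextCursorMark"]
        # When the returned cursor equals the one we sent, we are done.
        if next_cursor == cursor:
            break
        cursor = next_cursor
```

You would call it as `for page in iter_pages(my_fetch, "text:foo"): ...`.
Note there is no `start` parameter at all, so no shard is ever asked for
start+rows documents.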

Second, it's a two-pass process if you don't use cursorMark. The first
pass gets the candidate docs from each shard. But all it returns is
the ID and sort criteria. Then the aggregator node gets the _true_ top
N after sorting all the lists from each shard and issues a second
request for _only_ those docs that have made the top N from each sub
shard, and those should be the only ones highlighted.
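
To make the two passes concrete, here is a toy simulation in Python. This is
not Solr code, just a sketch of the flow described above under simplifying
assumptions: docs are sorted by score descending with the id as tie-breaker,
and the names `first_pass`, `merge_window` and `two_pass` are made up for
illustration.

```python
def first_pass(shard_docs, start, rows):
    """Pass 1 on one shard: return only (score, id) for its top start+rows docs."""
    top = sorted(shard_docs, key=lambda d: (-d["score"], d["id"]))[: start + rows]
    return [(d["score"], d["id"]) for d in top]


def merge_window(candidates_per_shard, start, rows):
    """Aggregator: merge all candidate lists and keep only the requested window."""
    merged = []
    for shard, candidates in candidates_per_shard.items():
        merged.extend((score, doc_id, shard) for score, doc_id in candidates)
    merged.sort(key=lambda t: (-t[0], t[1]))
    return merged[start : start + rows]


def two_pass(shards, start, rows):
    """Return {shard: [ids]} that the second request would fetch and highlight."""
    candidates = {name: first_pass(docs, start, rows) for name, docs in shards.items()}
    by_shard = {}
    for _score, doc_id, shard in merge_window(candidates, start, rows):
        by_shard.setdefault(shard, []).append(doc_id)
    return by_shard


shards = {
    "s1": [{"id": "a", "score": 9}, {"id": "b", "score": 7}, {"id": "c", "score": 1}],
    "s2": [{"id": "d", "score": 8}, {"id": "e", "score": 6}, {"id": "f", "score": 5}],
}
# Only the ids in this dict would be part of the second, highlighted request.
second_request = two_pass(shards, start=2, rows=2)
```

The point is that the expensive work (fetching stored fields, highlighting)
is only needed for the ids in `second_request`, which holds at most `rows`
docs in total across all shards.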

Do you have any evidence to the contrary that they're all being
highlighted? Or are you misinterpreting the log message for the first
pass?

Best,
Erick

On Thu, Aug 17, 2017 at 5:43 PM, Nawab Zada Asad Iqbal <khi...@gmail.com> wrote:
> Hi,
>
> In a multi-node solr installation (without SolrCloud), during a paging
> scenario (e.g., start=1000, rows=200), the primary node asks for 1200 rows
> from each shard. If highlighting is ON, then the primary node is asking for
> highlighting all the 1200 results from each shard, which doesn't scale
> well. Is there a way to break the shard query in two steps e.g. ask for the
> 1200 rows and after sorting the 1200 responses from each shard and finding
> final rows to return (1001 to 1200) , issue another query to shards for
> asking highlighted response for the relevant docs?
>
>
>
> Thanks
> Nawab
