[
https://issues.apache.org/jira/browse/SOLR-17670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SOLR-17670:
----------------------------------
Labels: pull-request-available (was: )
> Fix unnecessary memory allocation caused by a large reRankDocs param
> --------------------------------------------------------------------
>
> Key: SOLR-17670
> URL: https://issues.apache.org/jira/browse/SOLR-17670
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: JiaBaoGao
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The reRank function has a reRankDocs parameter that specifies the number of
> documents to re-rank. While testing its performance impact, I observed that
> queries become progressively slower as this value increases. Even when the
> value exceeds the total number of documents in the index, further increases
> continue to slow down the query, which is counterintuitive.
>
> Therefore, I investigated the code:
>
> For a query containing re-ranking, such as:
> {code:java}
> {
> "start": "0",
> "rows": 10,
> "fl": "ID,score",
> "q": "*:*",
> "rq": "{!rerank reRankQuery='{!func} 100' reRankDocs=1000000000 reRankWeight=2}"
> } {code}
>
> The current execution logic is as follows:
> 1. Perform normal retrieval using the q parameter.
> 2. Re-score the documents retrieved in the q phase (up to reRankDocs of them) using the rq parameter.
>
> During the retrieval in phase 1 (using q), a TopScoreDocCollector is created.
> Underneath, this creates a PriorityQueue backed by an Object[]. The length of
> this Object[] grows with reRankDocs, with no upper bound.
>
> On my local test cluster with limited JVM memory, this can even trigger an
> OOM, causing the Solr node to crash. I can also reproduce the OOM situation
> using the SolrCloudTestCase unit test.
>
> I think capping the length of the Object[] array at
> searcher.getIndexReader().maxDoc() in ReRankCollector would resolve this
> issue. That way, once reRankDocs exceeds maxDoc, memory allocation stops
> growing with it.
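
The proposed cap can be sketched roughly as follows. This is a minimal, standalone illustration and not the actual patch: the class and method names (ReRankLimitSketch, effectiveReRankDocs, referenceArrayBytes) are hypothetical, maxDoc is passed in as a plain int instead of being read from searcher.getIndexReader().maxDoc(), and the "one extra slot" and bytes-per-reference figures are assumptions used only to show the order of magnitude.

```java
// Hypothetical sketch: none of these names come from Solr or Lucene.
final class ReRankLimitSketch {

    // Cap the requested reRankDocs at the number of documents in the index.
    // In Solr, maxDoc would come from searcher.getIndexReader().maxDoc().
    static int effectiveReRankDocs(int requestedReRankDocs, int maxDoc) {
        return Math.min(requestedReRankDocs, maxDoc);
    }

    // Rough size of the reference array behind a bounded priority queue.
    // Assumption: the queue allocates (capacity + 1) slots up front, and each
    // slot holds one reference of bytesPerReference bytes (commonly 4 with
    // compressed object pointers, 8 without). Array header is ignored.
    static long referenceArrayBytes(int capacity, int bytesPerReference) {
        return ((long) capacity + 1L) * bytesPerReference;
    }

    public static void main(String[] args) {
        int requested = 1_000_000_000; // reRankDocs from the example query
        int maxDoc = 5_000;            // hypothetical index size

        int capped = effectiveReRankDocs(requested, maxDoc);

        System.out.println("uncapped queue array bytes: "
                + referenceArrayBytes(requested, 4));
        System.out.println("capped queue array bytes:   "
                + referenceArrayBytes(capped, 4));
    }
}
```

Under these assumptions, the uncapped request reserves several gigabytes for references alone, while the capped one needs only a few kilobytes. An empty index (maxDoc of 0) would need separate handling, which this sketch does not attempt.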