Apache Solr Query Issue with huge data

prasad bezavada Fri, 05 Apr 2024 01:36:52 -0700

Dear Team,

I'm currently using Solr version 8.11.3 on a machine with 125 GB of
physical memory and a 64 GB heap. The collection comprises 4 shards
within the same node. Through our Java application (SolrJ), we
indexed approximately 8 million records from an RDBMS table into Solr.


Presently, my task is to query this index and export the results (5
million records matched by my Solr query) to PDF format via our Java
application. To avoid potential heap memory issues, I've implemented
pagination (3 lakh, i.e. 300,000, records per page) in the query using
the start and rows parameters (setStart and setRows in SolrJ).
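
For reference, the paging described above amounts to roughly the
following simplified sketch. PageFetcher is a hypothetical stand-in for
the SolrJ request (assumed to wrap a SolrQuery with setStart(start) and
setRows(rows)), the PDF writing is reduced to a counter, and main uses a
small in-memory fake instead of a real Solr collection.

```java
import java.util.ArrayList;
import java.util.List;

public class StartRowsPaging {

    /** Hypothetical stand-in for one Solr request; not part of SolrJ. */
    interface PageFetcher {
        // Assumed to wrap something like:
        //   client.query(collection, new SolrQuery(q).setStart(start).setRows(rows))
        // with SolrServerException / IOException handled inside.
        List<String> fetch(int start, int rows);
    }

    /** Walks the result set in fixed windows: start = 0, rows, 2 * rows, ... */
    static long exportAll(PageFetcher fetcher, int rows) {
        long exported = 0;
        int start = 0;
        while (true) {
            List<String> page = fetcher.fetch(start, rows);
            if (page.isEmpty()) {
                break;                      // nothing left to export
            }
            exported += page.size();        // the real application writes these to the PDF
            if (page.size() < rows) {
                break;                      // short page, so this was the last one
            }
            start += rows;                  // next window: 3 lakh, 6 lakh, 9 lakh, ...
        }
        return exported;
    }

    public static void main(String[] args) {
        // In-memory fake with 10 "documents", paged 3 at a time.
        final int total = 10;
        PageFetcher fake = (start, rows) -> {
            List<String> out = new ArrayList<>();
            for (int i = start; i < Math.min(total, start + rows); i++) {
                out.add("doc-" + i);
            }
            return out;
        };
        System.out.println(exportAll(fake, 3));
    }
}
```

In the real run, rows is 300,000 and each call to fetch is one request
to Solr with a larger start value than the previous one.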

However, I've encountered an issue where the response time for subsequent
queries to fetch the next set of results (e.g., 3 to 6 lakhs, 6 to 9 lakhs)
progressively increases, leading to socket timeout exceptions.
Additionally, Solr's physical memory consumption exceeds 90%, without
releasing it.

I have several questions regarding this situation:

1. Why does the query time in Solr increase with each pagination query?
2. What causes Solr to occupy over 90% of physical memory and fail to
   release it?
3. What would be the optimal approach for retrieving 5 million records
   from our Java application and exporting them to PDF or other file
   formats?

Your insights and suggestions on resolving these issues would be greatly
appreciated.


-- 
*Thanks & Regards*

*Prasad Bezavada*
