Some points based on my experience.

You can consider a SolrCloud deployment if you want to distribute your
index over multiple servers.
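
For example, a sharded collection can be created up front through the
Collections API; the collection name and shard/replica counts below are
placeholders, not recommendations:

    http://host:8983/solr/admin/collections?action=CREATE&name=myindex&numShards=4&replicationFactor=2

SolrCloud then routes each query to all shards and merges the results
for you, so you do not have to aggregate them programmatically.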

Use MMapDirectory locally for each Solr instance in the cluster.
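
A minimal solrconfig.xml sketch for this (note that on 64-bit JVMs the
default directory factory already memory-maps the index, so verify what
your version does before changing it):

    <directoryFactory name="DirectoryFactory"
                      class="solr.MMapDirectoryFactory"/>
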
Fire a warm-up query on server start-up, so that most of the index data
is already cached and you save on disk IO on subsequent requests.
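
One way to wire this up is a firstSearcher listener in solrconfig.xml;
the query below is only an illustration, so use something close to your
real traffic:

    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">*:*</str>
          <str name="rows">10</str>
        </lst>
      </arr>
    </listener>
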
E.g., if you have 4 Solr instances with 64GB RAM each, that is 256GB of
RAM across the cluster, so most of a 200GB index can stay in memory, and
this will give you better performance.

To take advantage of a multi-core system, you can increase the number of
searcher threads, ideally up to the number of cores you have on a single
instance.
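
Where this is tuned depends on your servlet container; for the Jetty
that ships with Solr, the request thread pool lives in etc/jetty.xml,
roughly as below (the numbers are placeholders to adapt to your
hardware):

    <Set name="ThreadPool">
      <New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
        <Set name="minThreads">8</Set>
        <Set name="maxThreads">200</Set>
      </New>
    </Set>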




On Fri, Jun 27, 2014 at 4:03 PM, Sandeep Khanzode <
sandeep_khanz...@yahoo.com.invalid> wrote:

> Hi,
>
> I have an index that runs into 200-300GB. It is not frequently updated.
>
> What are the best strategies to query on this index?
> 1.] Should I, at index time, split the content, like a hash based
> partition, into multiple separate smaller indexes and aggregate the results
> programmatically?
> 2.] Should I replicate this index and provide some sort of document ID,
> and search on each node for a specific range of document IDs?
> 3.] Is there any way I can split or move individual segments to different
> nodes and aggregate the results?
>
> I am not fully aware of the large scale query strategies. Can you please
> share your findings or experiences? Thanks,
>
> -----------------------
> Thanks n Regards,
> Sandeep Ramesh Khanzode
