Hi,

We're in the process of converting a Lucene deployment of ~500M documents (~60 GB) into a Solr distributed search deployment.
The primary reason for the change is instability in the Lucene deployment due to memory constraints. The existing infrastructure is deployed on 48 machines, with all indices on each machine, using the MultiSearcher. The new deployment will leverage Solr's distributed search model to deploy smaller index shards across several clusters. Under average load, the system should be able to easily handle 8-10 requests per second.

We're looking for some guidance on best practices for sizing the clusters correctly. Our current thought is to divide the indices into 4 equal parts and build several 4-machine clusters, so each machine would host ~15 GB.

Has anyone had experience with a similar-sized deployment? Any suggestions on the architectural strategy?

Thanks for the help.
Todd
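For what it's worth, here is a minimal sketch of one common way to do the split. In Solr 1.x, shard assignment is up to the indexer, so a typical approach is to hash each document's unique key modulo the shard count; the names (`NUM_SHARDS`, `shard_for`) are illustrative, not Solr API:

```python
# Sketch: route each document to one of 4 shards by hashing its unique key.
# Deterministic and roughly uniform, so re-indexing a doc always hits the
# same shard and the shards stay about equal in size.
import zlib

NUM_SHARDS = 4  # one shard per machine in a 4-machine cluster

def shard_for(doc_id: str) -> int:
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_SHARDS

# At query time, Solr fans the request out via the "shards" parameter, e.g.:
#   /solr/select?q=...&shards=host1:8983/solr,host2:8983/solr,...

# Quick distribution check with synthetic ids:
counts = [0] * NUM_SHARDS
for i in range(500_000):
    counts[shard_for(f"doc-{i}")] += 1
print(counts)  # four roughly equal buckets
```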