Yeah, coming up with a "perfect" machine for your use case is completely trial and error. For me personally, I found that one machine with 24 CPU cores and 148 GB of RAM handles a single Solr instance with 4 cores, whose indexes include 16 million records at 400 GB, 53 million records at 160 GB, and 108 million records at 300 GB. The data sets are all different (kind of), but one machine handles this with no issues, averaging under 0.4 seconds per query on only one Solr instance.
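To make the trial and error a bit more concrete, here is a rough Python sketch of the arithmetic behind the single-shard test that Emir describes in the quoted message below. The function names, the 20% headroom default, and the numbers in the usage note are my own assumptions for illustration; only the method itself (measure one shard, leave some room, then derive the shard count) comes from the thread.

```python
import math


def shards_needed(total_docs: int, docs_per_shard: int, headroom: float = 0.2) -> int:
    """Estimate the shard count from a measured single-shard capacity.

    docs_per_shard is the number of documents one shard held while still
    meeting your latency target in your own test (a measured value, not a
    rule of thumb). headroom is the fraction of that capacity held back for
    distributed query overhead; 0.2 is an assumed placeholder.
    """
    if total_docs <= 0 or docs_per_shard <= 0:
        raise ValueError("document counts must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_docs_per_shard = docs_per_shard * (1 - headroom)
    return math.ceil(total_docs / usable_docs_per_shard)


def per_replica_load(expected_qps: float, replicas: int) -> float:
    """Load to apply to one shard during the test.

    With N replicas each copy serves roughly 1/N of the expected queries,
    so the single-shard test only needs to sustain that share.
    """
    if replicas <= 0:
        raise ValueError("replicas must be positive")
    return expected_qps / replicas
```

As a usage example with made-up measurements: if a single shard stayed within the latency target up to 15 million documents, then `shards_needed(55_000_000, 15_000_000)` gives 5 shards for 50 million documents plus five years of growth at 1 million per year, and `per_replica_load(100, 2)` says to test that shard at 50 queries per second when 100 are expected across 2 replicas.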

On Thu, Mar 23, 2017 at 11:15 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> I've seen single nodes handle 10M docs using 64G of heap (using Zing).
> I've seen 300M in 12G of memory. There's absolutely no way to tell.
>
> See: https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
> for a methodology to answer the question with _your_ data and _your_
> query pattern...
>
> Best,
> Erick
>
> On Thu, Mar 23, 2017 at 5:08 AM, Emir Arnautovic
> <emir.arnauto...@sematext.com> wrote:
> > Hi Vrindavda,
> >
> > It is hard to tell anything without testing and details on what/how is
> > indexed, how it is going to be queried and what are latency/throughput
> > requirements.
> >
> > 25M or 12.5M documents per shard might be too much if you have strict
> > latency requirements, but testing is the only way to tell. I would suggest
> > that you set up index with single shard and see how many documents you can
> > put into it to meet latency requirements under expected load (if you plan to
> > have 2 replicas that is roughly the half of the expected load). Leave some
> > room for distributed query overhead. After you get that number you can see
> > how many shards you need to have.
> >
> > HTH,
> > Emir
> >
> > On 23.03.2017 09:46, vrindavda wrote:
> >>
> >> Hello,
> >>
> >> My production index is expected to contain 50 million documents, with
> >> addition of around 1 million every year.
> >>
> >> Should I go for 64GB RAM (4 Shards /4 Replicas) Or 128GB (2 Shards/ 2
> >> Replicas) ?
> >>
> >> Please suggest if above assumptions are incorrect. What all parameters
> >> should I consider ?
> >>
> >> Thank you,
> >> Vrinda Davda
> >>
> >> --
> >> View this message in context:
> >> http://lucene.472066.n3.nabble.com/Architecture-suggestions-tp4326436.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >
> > --
> > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > Solr & Elasticsearch Support * http://sematext.com/