Yeah, coming up with a "perfect" machine for your use case is completely trial
and error. Personally, I have one machine with 24 CPU cores and 148 GB of RAM
running a single Solr instance with 4 Solr cores, including one with 16 million
records at 400 GB, one with 53 million records at 160 GB, and one with
108 million records at 300 GB. The data sets are all (somewhat) different, but
that one machine handles them with no issues, averaging under 0.4 seconds per
query on a single Solr instance.
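
For what it's worth, the arithmetic behind Emir's suggestion quoted below
(fill a single shard until latency breaks, leave some room for distributed
query overhead, then work out the shard count) can be sketched roughly like
this. The function names, the 20% headroom, and the 15 million docs-per-shard
figure are all made up for illustration; the real per-shard limit has to come
from your own load test.

```python
import math


def per_replica_load(expected_qps: float, replicas: int) -> float:
    """Approximate query load each replica sees if traffic is spread evenly."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return expected_qps / replicas


def shards_needed(total_docs: int, max_docs_per_shard: int,
                  headroom: float = 0.2) -> int:
    """Shard count for total_docs, given a measured single-shard limit.

    max_docs_per_shard is the number of documents one shard held while
    still meeting latency requirements in your own test.
    headroom is the fraction of that limit held back for distributed
    query overhead (0.2 is an assumed value, not a recommendation).
    """
    if max_docs_per_shard <= 0:
        raise ValueError("max_docs_per_shard must be positive")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    usable_docs = max_docs_per_shard * (1.0 - headroom)
    return math.ceil(total_docs / usable_docs)


if __name__ == "__main__":
    # 50 million documents comes from the original question; 15 million
    # is a placeholder for whatever your single-shard test shows.
    print(shards_needed(50_000_000, 15_000_000, headroom=0.2))
    # With 2 replicas, test each shard at about half the expected load.
    print(per_replica_load(100.0, 2))
```

Rerun it with the projected document count (about 1 million more per year in
the original question) to see when another shard would be needed.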

On Thu, Mar 23, 2017 at 11:15 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> I've seen single nodes handle 10M docs using 64G of heap (using Zing).
> I've seen 300M in 12G of memory. There's absolutely no way to tell.
>
> See:
> https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
> for a methodology to answer the question with _your_ data and _your_
> query pattern...
>
> Best,
> Erick
>
> On Thu, Mar 23, 2017 at 5:08 AM, Emir Arnautovic
> <emir.arnauto...@sematext.com> wrote:
> > Hi Vrindavda,
> >
> > It is hard to tell anything without testing and details on what/how is
> > indexed, how it is going to be queried and what are latency/throughput
> > requirements.
> >
> > 25M or 12.5M documents per shard might be too much if you have strict
> > latency requirements, but testing is the only way to tell. I would suggest
> > that you set up index with single shard and see how many documents you can
> > put into it to meet latency requirements under expected load (if you plan to
> > have 2 replicas that is roughly the half of the expected load). Leave some
> > room for distributed query overhead. After you get that number you can see
> > how many shards you need to have.
> >
> > HTH,
> > Emir
> >
> >
> >
> > On 23.03.2017 09:46, vrindavda wrote:
> >>
> >> Hello,
> >>
> >> My production index is expected to contain 50 million documents, with
> >> addition of around 1 million every year.
> >>
> >> Should I go for 64GB RAM (4 Shards /4 Replicas) Or 128GB (2 Shards/ 2
> >> Replicas) ?
> >>
> >> Please suggest if above assumptions are incorrect. What all parameters
> >> should I consider ?
> >>
> >>
> >> Thank you,
> >> Vrinda Davda
> >>
> >>
> >>
> >> --
> >> View this message in context:
> >> http://lucene.472066.n3.nabble.com/Architecture-suggestions-tp4326436.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >
> >
> > --
> > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > Solr & Elasticsearch Support * http://sematext.com/
> >
>
