Hi,

We are planning to set up a SolrCloud cluster with 6 nodes for 3 million
records (expected to grow to 5 million in a year), with 150 fields. The
overall index would come to around 120GB.

We plan to use NRT with 5 sec soft commit and 1 min hard commit.

Expected query volume is 5000 selects per second and 7000 inserts /
updates per second.
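
To put those numbers in perspective, here is a rough back-of-the-envelope
sketch in Python. It only uses the figures quoted above; the even spread
across nodes is a simplifying assumption and ignores replication and
distributed query fan-out.

```python
# Figures taken from this message.
records = 3_000_000
updates_per_sec = 7_000
selects_per_sec = 5_000
nodes = 6
soft_commit_sec = 5

# Time needed to touch every record once if updates were spread evenly.
seconds_per_full_pass = records / updates_per_sec

# Updates that accumulate between two soft commits.
updates_per_soft_commit = updates_per_sec * soft_commit_sec

# Naive per-node share (assumption: perfectly even load, no replication).
selects_per_node = selects_per_sec / nodes
updates_per_node = updates_per_sec / nodes

print(f"full pass over all records: {seconds_per_full_pass:.0f} s")
print(f"updates per soft commit:    {updates_per_soft_commit}")
print(f"selects per node per sec:   {selects_per_node:.0f}")
print(f"updates per node per sec:   {updates_per_node:.0f}")
```

So at this rate the whole data set is effectively rewritten every few
minutes, and every soft commit makes tens of thousands of changes visible.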

Our records can be classified under 15 categories, but the categories will
not hold an even number of records; a few categories will have many more
records than the rest.

Queries will follow the same pattern, that is, categories with a high
number of records will receive a high volume of selects / updates.

Given this, we are unsure which type of sharding would give us better
performance for both selects and updates.

Composite or implicit: compositeId with 15 shards, or implicit with one
shard for each of the 15 categories?
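
To make the two options concrete, here is a minimal Python sketch that only
builds the Collections API CREATE URLs for each router (it does not send
them). The host, the collection name "records", the category names and the
routing field "category" are placeholders, not our real setup, and the
configset parameter is left out.

```python
from urllib.parse import urlencode

# Assumed values for illustration only.
BASE = "http://localhost:8983/solr/admin/collections"
CATEGORIES = [f"cat{i}" for i in range(1, 16)]  # placeholder category names


def composite_create_url(name, num_shards=15):
    """CREATE request for a compositeId collection with hash-range shards."""
    params = {
        "action": "CREATE",
        "name": name,
        "router.name": "compositeId",
        "numShards": num_shards,
    }
    return f"{BASE}?{urlencode(params)}"


def implicit_create_url(name, shards, route_field="category"):
    """CREATE request for an implicit collection with one named shard per
    category; router.field names the document field holding the shard name."""
    params = {
        "action": "CREATE",
        "name": name,
        "router.name": "implicit",
        "shards": ",".join(shards),
        "router.field": route_field,
    }
    return f"{BASE}?{urlencode(params)}"


def composite_doc_id(category, record_id):
    """compositeId routing key: documents sharing a prefix land on one shard."""
    return f"{category}!{record_id}"


composite_url = composite_create_url("records")
implicit_url = implicit_create_url("records", CATEGORIES)
print(composite_url)
print(implicit_url)
print(composite_doc_id("cat3", 42))
```

Note that with compositeId the prefix is hashed, so 15 shards do not map
one-to-one onto 15 categories; two categories can end up on the same shard.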

Our select queries will have a minimum of 15 filters in fq, with extensive
function queries used in sort.

Updates will touch 6 integer fields, 5 string fields and 4 multi-valued
string/integer fields.

If we choose implicit to boost select performance, our updates will be
heavy on a few shards (the major category shards). Will this be a problem?
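
As a rough illustration of that concern, the Python sketch below compares
the update rate per shard under category-based implicit routing with a
perfectly even spread over 15 shards. The category shares are purely
hypothetical (three large categories at 20% each, twelve small ones sharing
the remaining 40%); we do not have measured figures yet.

```python
updates_per_sec = 7_000  # from this message

# HYPOTHETICAL distribution, for illustration only.
shares = {f"cat{i}": 0.20 for i in range(1, 4)}
shares.update({f"cat{i}": 0.40 / 12 for i in range(4, 16)})

# Implicit routing: one shard per category, so load follows category share.
implicit_load = {cat: share * updates_per_sec for cat, share in shares.items()}

# Idealised even spread across the same number of shards.
even_load = updates_per_sec / len(shares)

hottest = max(implicit_load, key=implicit_load.get)
print(f"hottest shard {hottest}: {implicit_load[hottest]:.0f} updates/s")
print(f"even spread:          {even_load:.0f} updates/s per shard")
print(f"skew factor:          {implicit_load[hottest] / even_load:.1f}x")
```

Under these assumed shares, the largest category shards would take about
three times the update rate of an evenly loaded shard.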

For our situation, which replica type should we choose? All NRT, or NRT
with TLOG?

Thanks in advance!

Best,
Doss.
