In addition to Shawn's comments: if you create a new collection,
consider "oversharding". Say you calculate (more on that below) that
your collection fits in N shards, but you expect it to triple over
time. _Start out_ with 3N shards; many of them will be co-located on
the same nodes at first. As you get more docs, move the replicas to
other nodes with ADDREPLICA/DELETEREPLICA, as Shawn suggests.
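
The arithmetic above can be sketched as follows. This is only an
illustration, not a sizing recipe: the function name, the base URL, the
collection name "products" and the example numbers are all made up.
The CREATE parameters (numShards, replicationFactor, maxShardsPerNode)
are the Collections API ones from the Solr 6.x era; check the reference
guide for your version, since maxShardsPerNode is what lets several
shards share a node.

```python
import math
from urllib.parse import urlencode


def overshard_create_url(base_url, name, shards_needed_now,
                         growth_factor, replication_factor, node_count):
    """Build a Collections API CREATE URL for an "overshard" layout.

    All argument values are assumptions supplied by the caller; nothing
    here measures how many shards an index really needs.
    """
    # Start with enough shards for the expected eventual size (3N for 3x growth).
    num_shards = math.ceil(shards_needed_now * growth_factor)
    # Every shard has replication_factor replicas, so this many cores exist in total.
    total_replicas = num_shards * replication_factor
    # Allow co-location: each node must be able to host its share of replicas.
    max_per_node = math.ceil(total_replicas / node_count)
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_per_node,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical example: N=2 shards today, 3x growth expected, 2 replicas
# per shard, 4 nodes. That gives numShards=6 and maxShardsPerNode=3.
print(overshard_create_url("http://localhost:8983/solr", "products", 2, 3, 2, 4))
```

The URL is only built, not sent, so you can inspect it before issuing it
against a real cluster.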

Finally, you really have to do some serious work to figure out what
the correct eventual size will be; see:

https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Best,
Erick

On Tue, May 2, 2017 at 5:32 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 5/2/2017 4:24 AM, Venkateswarlu Bommineni wrote:
>> We have a Solr setup with the configuration below.
>>
>> 1) 1 collection with one shard
>> 2) 4 Solr nodes
>> 3) Replication factor 4, with one replica on each Solr node.
>>
>> As of now, it's working fine. But going forward the index size may grow
>> large and we would need to add a new node.
>>
>> Could you please suggest an approach?
>
> I'm assuming SolrCloud, because you said "collection" and "replication
> factor" which are SolrCloud concepts.
>
> As soon as you start the new node pointing at your zookeeper ensemble,
> it will be part of the cluster and will accept requests for any
> collection in the cluster.  No index data will end up on the new node
> until you take action with the Collections API, though.
>
> One way to put data on the new node is the ADDREPLICA action.  Another
> is to create a brand new collection with the shard and replication
> characteristics you want, and use the new collection instead of the old
> one, or create an alias to use whatever name you like.  You can use
> SPLITSHARD and then ADDREPLICA/DELETEREPLICA to put *some* of the data
> from an existing collection on the new node.
>
> https://cwiki.apache.org/confluence/display/solr/Collections+API
>
> I think the way I would proceed is to create a brand new collection set
> up with the correct number of shards and replicas to use the new node,
> populate that collection, delete the old collection, and set up a
> collection alias so that the new collection can be accessed with the old
> collection's name.
>
> Thanks,
> Shawn
>
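
The Collections API calls mentioned in this thread (ADDREPLICA,
DELETEREPLICA, DELETE, CREATEALIAS) can be sketched as URL builders.
This is a rough sketch under assumptions: the helper names, base URL,
collection names, node name "host5:8983_solr" and replica name
"core_node1" are hypothetical placeholders. Look up the real node and
replica names in your own cluster state before using anything like this.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL; the request itself is not sent."""
    query = {"action": action, **params}
    return f"{base_url}/admin/collections?{urlencode(query)}"


def move_replica_urls(base_url, collection, shard, new_node, old_replica):
    """Add a replica on the new node first, then drop the old copy."""
    return [
        collections_api_url(base_url, "ADDREPLICA",
                            collection=collection, shard=shard, node=new_node),
        collections_api_url(base_url, "DELETEREPLICA",
                            collection=collection, shard=shard,
                            replica=old_replica),
    ]


def swap_collection_urls(base_url, old_name, new_name):
    """Delete the old collection, then alias its name to the new one.

    Only do this after the new collection is fully populated.
    """
    return [
        collections_api_url(base_url, "DELETE", name=old_name),
        collections_api_url(base_url, "CREATEALIAS",
                            name=old_name, collections=new_name),
    ]


base = "http://localhost:8983/solr"  # assumed address of any live node
for url in move_replica_urls(base, "products", "shard1",
                             "host5:8983_solr", "core_node1"):
    print(url)
for url in swap_collection_urls(base, "products", "products_v2"):
    print(url)
```

Wait for the new replica to become active before issuing DELETEREPLICA,
otherwise the shard briefly runs with fewer copies than intended.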
