On 2/4/2016 9:27 AM, Zheng Lin Edwin Yeo wrote:
> Yes, I'm already on SolrCloud, so I'll probably stick to that.
>
> Regarding the network, I am just afraid that when the replica code copies
> the index over from the main node, it will use up all the available
> bandwidth and leave the search queries with little bandwidth, which
> will affect the performance of searches from the front-end.

Replicating the index in SolrCloud should be a VERY rare event, only
happening when there's a serious problem such as a server going down and
coming back up later, or after certain maintenance events.

Merges do not involve network traffic.  In SolrCloud, each replica
handles merging of its own index locally.

Even if a replication DOES happen, TCP makes room on the network for new
connections such as queries.  Sharing bandwidth between connections is
inherent in the design of the protocol, and it is particularly effective
on LAN connectivity.  If there's a WAN involved, then you might be right
to worry about bandwidth.
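
To put rough numbers on the LAN versus WAN point, here is a
back-of-the-envelope sketch in Python.  The 20 GB index size, the link
speeds, and the 90% efficiency factor are all assumptions picked for
illustration, not figures from this thread, and the function name is
made up for the example.

```python
def transfer_seconds(index_gb: float, link_mbps: float,
                     efficiency: float = 0.9) -> float:
    """Rough time to copy index_gb gigabytes over a link_mbps link.

    efficiency is an assumed fraction of the nominal link speed that
    is actually usable for the copy (protocol overhead, other traffic).
    """
    bits = index_gb * 8 * 1000**3            # decimal gigabytes to bits
    usable_bits_per_sec = link_mbps * 1_000_000 * efficiency
    return bits / usable_bits_per_sec


if __name__ == "__main__":
    # Hypothetical 20 GB index copied over three assumed link speeds.
    links = [("1 Gb/s LAN", 1000), ("10 Gb/s LAN", 10000),
             ("20 Mb/s WAN", 20)]
    for label, mbps in links:
        minutes = transfer_seconds(20, mbps) / 60
        print(f"{label}: roughly {minutes:.1f} minutes")
```

The estimate ignores disk speed and the bandwidth sharing described
above, so treat it only as an order-of-magnitude comparison between a
LAN and a slow WAN link.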

Regarding something you asked earlier in the thread: Assuming LAN
connectivity, I think the only thing you will achieve by using separate
network interfaces is configuration complexity.

It might be possible to separate the interfaces, even though I think
it's not required.  If you populate the hosts file on each server, or
use split DNS, you could have clients use a different address than the
Solr servers themselves use for inter-node communication.

In general there is no need for this, because high network bandwidth
utilization is only likely during a replication event or during bulk
indexing to rebuild collections.  For bulk indexing, the CPU and disk
I/O impact will almost certainly cause more of a slowdown than the
network, unless you're using a low-speed WAN, which is not recommended.
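
If you did want to try the hosts file idea, the point is that the same
host name resolves to a different address depending on who is asking.
The Python sketch below only prints the two views.  The node names and
IP addresses are hypothetical, and it assumes the Solr nodes refer to
each other by host name rather than by IP address.

```python
# Hypothetical names and addresses, for illustration only.
# View used by the Solr servers for inter-node traffic (back-end NIC).
INTER_NODE = {"solr1": "10.0.0.11", "solr2": "10.0.0.12"}
# View used by clients sending queries (front-end NIC).
CLIENT_FACING = {"solr1": "192.168.1.11", "solr2": "192.168.1.12"}


def hosts_lines(mapping: dict) -> list:
    """Format a name-to-address mapping as hosts-file style lines."""
    return [f"{addr} {name}" for name, addr in sorted(mapping.items())]


if __name__ == "__main__":
    print("# hosts entries on each Solr server")
    print("\n".join(hosts_lines(INTER_NODE)))
    print("# what clients would resolve, via their own hosts file or DNS")
    print("\n".join(hosts_lines(CLIENT_FACING)))
```

Keeping both views in sync on every machine is the kind of
configuration complexity mentioned above.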

Thanks,
Shawn
