On Fri, Jun 7, 2013, at 02:59 PM, Jack Krupansky wrote:
> AFAICT, SolrCloud addresses the use case of distributed update for a
> relatively smaller number of collections (dozens?) that have a relatively
> larger number of rows - billions over a modest to moderate number of nodes
> (a handful to a dozen or dozens). So, maybe dozens of collections (some
> people still call these "cores") that distribute hundreds of millions if
> not billions of rows over dozens (or potentially low hundreds) of nodes.
> Technically, ZK was designed for thousands of nodes, but I don't think
> that was for the use case of distributed query that constantly fans out
> to all shards.

Not sure I get what you're saying here. ZK was designed for thousands of
nodes, and it works by making sure that each node holds an active cache
of all the data relevant to it, so nodes don't need to poll ZK for that
data. As far as ZK is concerned, then, it is irrelevant how many hosts
are involved in any particular transaction: the node that is handling
the distribution consults its cached list of active nodes, decides which
one to hit, and off it goes, with no interaction with ZK required.
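
To make that concrete, here is a rough sketch in Python of the pattern
as I understand it. It does not use the real ZooKeeper or Solr APIs:
FakeCoordinator, QueryNode and pick_target are names made up for
illustration, and the one-shot watch is a simplified imitation of how a
ZK watch behaves. The point is only that the coordinator is read once
at startup and once per membership change, never per query.

```python
class FakeCoordinator:
    """Hypothetical stand-in for ZK: stores the live-node list and fires watches."""

    def __init__(self, live_nodes):
        self._live_nodes = list(live_nodes)
        self._watchers = []
        self.reads = 0  # number of times any client fetched the list

    def get_live_nodes(self, watcher):
        # Reading the list also registers a one-shot watch for the next change.
        self.reads += 1
        self._watchers.append(watcher)
        return list(self._live_nodes)

    def set_live_nodes(self, live_nodes):
        # A membership change notifies each registered watcher exactly once.
        self._live_nodes = list(live_nodes)
        fired, self._watchers = self._watchers, []
        for notify in fired:
            notify()


class QueryNode:
    """Hypothetical node that distributes requests using only its local cache."""

    def __init__(self, coordinator):
        self._coordinator = coordinator
        self._refresh()

    def _refresh(self):
        # Re-read the list and re-register the watch in one step.
        self._cache = self._coordinator.get_live_nodes(self._refresh)
        self._next = 0

    def pick_target(self):
        # Round-robin over the cached list; the coordinator is not contacted.
        target = self._cache[self._next % len(self._cache)]
        self._next += 1
        return target


zk = FakeCoordinator(["node1", "node2", "node3"])
node = QueryNode(zk)

targets = [node.pick_target() for _ in range(1000)]
reads_after_queries = zk.reads  # only the initial read has happened

zk.set_live_nodes(["node1", "node3"])  # node2 drops out
reads_after_change = zk.reads  # one extra read, triggered by the watch
```

A thousand picks cost the coordinator nothing beyond the initial read;
only the membership change causes a second one.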

Or am I missing something?

Upayavira
