Are distributed commits also done in parallel across shards?

Peter
On Tue, Apr 15, 2014 at 3:50 PM, Mark Miller <markrmil...@gmail.com> wrote:

> Inline responses below.
> --
> Mark Miller
> about.me/markrmiller
>
> On April 15, 2014 at 2:12:31 PM, Peter Keegan (peterlkee...@gmail.com)
> wrote:
>
> I have a SolrCloud index, 1 shard, with a leader and one replica, and 3
> ZKs. The Solr indexes are behind a load balancer. There is one
> CloudSolrServer client updating the indexes. The index schema includes 3
> ExternalFileFields. When the CloudSolrServer client issues a hard commit,
> I observe that the commits occur sequentially, not in parallel, on the
> leader and replica. The duration of each commit is about a minute. Most of
> this time is spent reloading the 3 ExternalFileField files. Because of the
> sequential commits, there is a period of time (1 minute+) when the index
> searchers will return different results, which can cause a bad user
> experience. This will get worse as replicas are added to handle
> auto-scaling. The goal is to keep all replicas in sync w.r.t. the user
> queries.
>
> My questions:
>
> 1. Is there a reason that the distributed commits are done in sequence,
> not in parallel? Is there a way to change this behavior?
>
> The reason is that updates are currently done this way - it’s the only
> safe way to do it without solving some more problems. I don’t think you
> can easily change this. I think we should probably file a JIRA issue to
> track a better solution for commit handling. I think there are some
> complications because of how commits can be added on update requests, but
> it’s something we probably want to try and solve before tackling *all*
> updates to replicas in parallel with the leader.
>
> 2. If instead, the commits were done in parallel by a separate client via
> a GET to each Solr instance, how would this client get the host/port
> values for each Solr instance from zookeeper? Are there any downsides to
> doing commits this way?
>
> Not really, other than the extra management.
>
> Thanks,
> Peter
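For reference, a rough sketch of what that separate commit client could do: read the replica addresses out of the collection's clusterstate (as published in ZooKeeper), then issue a commit with distrib=false to each replica in parallel so no node forwards the commit on. The field names (base_url, core, state) follow the Solr 4.x clusterstate.json format, but the sample cluster, host names, and helper functions below are made up for illustration:

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical clusterstate.json snippet, as a client might fetch it from
# ZooKeeper (sample hosts/cores are made up; field names follow Solr 4.x).
CLUSTERSTATE = json.loads("""
{
  "collection1": {
    "shards": {
      "shard1": {
        "replicas": {
          "core_node1": {"base_url": "http://host1:8983/solr",
                         "core": "collection1", "state": "active",
                         "leader": "true"},
          "core_node2": {"base_url": "http://host2:8983/solr",
                         "core": "collection1", "state": "active"}
        }
      }
    }
  }
}
""")

def commit_urls(clusterstate, collection):
    """Build one commit URL per active replica. distrib=false makes each
    node commit locally instead of forwarding the request to its shard."""
    urls = []
    for shard in clusterstate[collection]["shards"].values():
        for replica in shard["replicas"].values():
            if replica.get("state") == "active":
                urls.append("%s/%s/update?commit=true&distrib=false"
                            % (replica["base_url"], replica["core"]))
    return urls

def parallel_commit(urls, send):
    # 'send' would be an HTTP GET in practice (e.g. urllib2.urlopen);
    # it is injected here so the sketch runs without a live cluster.
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        return list(pool.map(send, urls))
```

The downside Mark mentions (extra management) shows up here: the client has to watch the clusterstate for replicas coming and going, and decide what to do when a commit fails on one replica but succeeds on the others.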