This got zero responses on the solr-user list, so I’ll raise the issue here.

Should circuit breakers only kill external search requests and not 
cluster-internal requests to shards?

Circuit breakers can kill any request, whether it is a client request from 
outside the cluster or an internal distributed request to a shard. Killing 
part of a distributed request affects the main request. I’m not sure whether 
a 503 from a shard kills the whole request or causes partial results, but 
neither is good.

We run with 8 shards. If a circuit breaker is killing 10% of requests on each 
host, that will hit 57% of all external requests, since only 0.9^8 = 43% of 
them get through all 8 shards untouched. That seems like “overkill” to me. If 
it only kills external requests, then 10% means 10%.
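
As a quick check of that arithmetic, here is a minimal Python sketch. It 
assumes each shard's breaker trips independently and that one killed shard 
request is enough to hurt the external request. The function name is made up 
for illustration and is not anything in Solr.

```python
def external_hit_rate(per_shard_kill_rate: float, num_shards: int) -> float:
    """Fraction of external requests that lose at least one shard request.

    Assumes independent breaker decisions on every shard (an assumption,
    not measured Solr behavior).
    """
    # Probability that all shard requests survive, then take the complement.
    all_survive = (1.0 - per_shard_kill_rate) ** num_shards
    return 1.0 - all_survive


if __name__ == "__main__":
    # 8 shards, each breaker killing 10% of the requests it sees.
    print(f"{external_hit_rate(0.10, 8):.0%}")  # 57%
    # With a single shard, 10% means 10%.
    print(f"{external_hit_rate(0.10, 1):.0%}")  # 10%
```

The hit rate grows quickly with shard count, which is why the same 10% 
breaker looks so much worse on a sharded collection.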

Killing only external requests requires that external requests go roughly 
equally to all hosts in the cluster, or at least all NRT or PULL replicas.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)
