"replication falls behind and then starts to recover which causes more usage"

I'm not quite sure what you mean by this. Are you using TLOG or PULL
replica types? Or stand-alone Solr? There shouldn't really be any
replication in the ideal state for NRT replicas.
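
If this is SolrCloud, one way to see what is actually happening is to look at replica states in the output of the Collections API CLUSTERSTATUS action. What follows is only a rough sketch: it assumes you have already fetched and parsed the JSON response, and the collection, shard, and replica names in the sample are made up for illustration.

```python
def non_active_replicas(cluster_status):
    """List (collection, shard, replica, state) for replicas that are not 'active'.

    `cluster_status` is assumed to be the parsed JSON body returned by
    /admin/collections?action=CLUSTERSTATUS&wt=json.
    """
    problems = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                state = replica.get("state", "unknown")
                if state != "active":
                    problems.append((coll_name, shard_name, replica_name, state))
    return problems


# Hypothetical, trimmed-down response used only to exercise the function.
sample_status = {
    "cluster": {
        "collections": {
            "mycollection": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"state": "active"},
                            "core_node2": {"state": "recovering"},
                        }
                    }
                }
            }
        }
    }
}

for problem in non_active_replicas(sample_status):
    print(problem)
```

Replicas that keep cycling through "recovering" and "down" while CPU is pegged would fit the behavior you describe.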

If you're using SolrCloud, the usual scaling approach if you're
index-heavy is to add more shards, and since you're CPU bound they'd
have to be on new AWS instances. Or, if you're running multiple
replicas on each instance, move some of the replicas to new instances.
This assumes NRT Solr replicas.
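
As a rough illustration of the two options, here is a sketch that only builds the Collections API URLs (SPLITSHARD to add shards, ADDREPLICA followed by DELETEREPLICA to move a replica) and does not send anything. The base URL, collection, shard, node, and replica names are placeholders I made up; substitute your own and check the parameters against the reference guide for your version before running the calls.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params, "wt": "json"})
    return f"{base_url}/admin/collections?{query}"


def split_shard_url(base_url, collection, shard):
    """URL that splits one shard into two sub-shards."""
    return collections_api_url(
        base_url, "SPLITSHARD", collection=collection, shard=shard
    )


def move_replica_urls(base_url, collection, shard, new_node, old_replica):
    """URLs to add a replica on a new node, then delete the old replica.

    Issue the second call only after the new replica shows up as active.
    """
    add = collections_api_url(
        base_url, "ADDREPLICA",
        collection=collection, shard=shard, node=new_node,
    )
    delete = collections_api_url(
        base_url, "DELETEREPLICA",
        collection=collection, shard=shard, replica=old_replica,
    )
    return [add, delete]


# Placeholder values, for illustration only.
BASE = "http://localhost:8983/solr"
print(split_shard_url(BASE, "mycollection", "shard1"))
for url in move_replica_urls(
    BASE, "mycollection", "shard1", "10.0.0.5:8983_solr", "core_node2"
):
    print(url)
```

Note that the sub-shards created by a split start out on the same node as the parent shard, so relieving the CPU still means moving replicas onto the new instances afterwards.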

Best,
Erick

On Mon, May 21, 2018 at 10:25 AM, Kelly, Frank <frank.ke...@here.com> wrote:
> Using Solr 5.3.1 - index
>
> We have an indexing heavy workload (we do more indexing than searching) and 
> for those searches we do perform we have very few cache hits (25% of our 
> index is in memory and the hit rate is < 0.1%)
>
> We are currently using r3.xlarge (memory optimized instances as we originally 
> thought we’d have a higher cache hit rate) with EBS optimization to IOPs 
> configurable EBS drives.
> Our EBS traffic bandwidth seems to work great so searches on disk are pretty 
> fast.
> Now though we seem CPU bound and if/ when Solr CPU gets pegged for too long 
> replication falls behind and then starts to recover which causes more usage 
> and then eventually shards go “Down”.
>
> Our key question: Scale up (fewer instances to manage) or Scale out (more 
> instances to manage) and
> do we switch to compute optimized instances (the answer given our usage I 
> assume is probably)
>
> Appreciate any thoughts folks have on this?
>
> Thanks!
>
> -Frank
