Re: Thoughts on scaling strategy for Solr deployed on AWS EC2 instances - Scale up / out and which instance type?

2018-05-21 Thread Erick Erickson
"replication falls behind and then starts to recover which causes more usage"

I'm not quite sure what you mean by this. Are you using TLOG or PULL
replica types? Or stand-alone Solr? There shouldn't really be any
replication in the ideal state for NRT replicas.

If you're using SolrCloud, the usual scaling approach if you're
index-heavy is to add more shards, and since you're CPU bound they'd
have to be on new AWS instances. Or, if you're running multiple
replicas on each instance, move some of the replicas to new instances.
Assuming NRT Solr replicas.
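
As a rough illustration of the two options above, the sketch below only
builds Collections API URLs for splitting a shard and for adding a replica
on a new node; it sends nothing. The host, collection name, shard name and
node name are placeholders, not values from this thread, and the helper
collections_api_url is hypothetical.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Solr Collections API URL. Nothing is sent over the network."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/collections?{query}"


# Placeholder values: substitute your own host, collection and node names.
BASE = "http://localhost:8983/solr"

# Option 1: split an existing shard into two sub-shards.
split_url = collections_api_url(
    BASE, "SPLITSHARD", collection="mycollection", shard="shard1"
)

# Option 2: add a replica of a shard on a newly started instance.
add_url = collections_api_url(
    BASE, "ADDREPLICA", collection="mycollection", shard="shard1",
    node="10.0.0.5:8983_solr",
)

print(split_url)
print(add_url)
```

Newer releases also offer a MOVEREPLICA action for relocating an existing
replica; check the reference guide for your Solr version before relying on it.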

Best,
Erick

On Mon, May 21, 2018 at 10:25 AM, Kelly, Frank  wrote:
> Using Solr 5.3.1 - index
>
> We have an indexing heavy workload (we do more indexing than searching) and 
> for those searches we do perform we have very few cache hits (25% of our 
> index is in memory and the hit rate is < 0.1%)
>
> We are currently using r3.xlarge (memory optimized instances as we originally 
> thought we’d have a higher cache hit rate) with EBS optimization to IOPs 
> configurable EBS drives.
> Our EBS traffic bandwidth seems to work great so searches on disk are pretty 
> fast.
> Now though we seem CPU bound and if/ when Solr CPU gets pegged for too long 
> replication falls behind and then starts to recover which causes more usage 
> and then eventually shards go “Down”.
>
> Our key question: Scale up (fewer instances to manage) or Scale out (more 
> instances to manage) and
> do we switch to compute optimized instances (the answer given our usage I 
> assume is probably)
>
> Appreciate any thoughts folks have on this?
>
> Thanks!
>
> -Frank


Re: Thoughts on scaling strategy for Solr deployed on AWS EC2 instances - Scale up / out and which instance type?

2018-05-21 Thread Kelly, Frank
Thanks Erick,

 I am using TLOG replicas in this SolrCloud cluster - 3 shards, each with
3 replicas.

Here's my decision logic, based on my (limited) understanding:
All shards seem to be equally used, so to improve performance by adding
shards I think I'd have to double from 3 shards to 6 (as indexing load is
distributed equally). And when I do that, should I also double the number
of AWS instances?
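
The arithmetic behind that question can be sketched as follows. The 3 shards
and 3 replicas per shard come from this message; the instance counts (9 and
18) are assumptions for illustration only, as is the even spread of replicas
across instances.

```python
import math


def replicas_per_instance(shards, replication_factor, instances):
    """Return (total replicas, max replicas on any one instance), assuming an even spread."""
    total = shards * replication_factor
    return total, math.ceil(total / instances)


# 3 replicas per shard is from the thread; the instance counts are assumed.
for shards, instances in [(3, 9), (6, 9), (6, 18)]:
    total, per_instance = replicas_per_instance(shards, 3, instances)
    print(f"{shards} shards on {instances} instances: "
          f"{total} replicas, up to {per_instance} per instance")
```

Under these assumptions, doubling the shards without adding instances puts
two replicas on each machine, so the CPU work per machine would likely stay
about the same.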

Thanks!

-Frank

 
Frank Kelly
Principal Software Engineer
AAA Identity Profile Team (SCBE / CDA)

HERE 
5 Wayside Rd, Burlington, MA 01803, USA
42° 29' 7" N 71° 11' 32" W




Re: Thoughts on scaling strategy for Solr deployed on AWS EC2 instances - Scale up / out and which instance type?

2018-05-21 Thread Shawn Heisey

On 5/21/2018 8:25 AM, Kelly, Frank wrote:

> We have an indexing heavy workload (we do more indexing than searching) and for
> those searches we do perform we have very few cache hits (25% of our index is in
> memory and the hit rate is < 0.1%)


Which cache are you looking at for that hit rate?  How are you looking at
it?  What precisely do you see?



> We are currently using r3.xlarge (memory optimized instances as we originally
> thought we’d have a higher cache hit rate) with EBS optimization to IOPs
> configurable EBS drives.
> Our EBS traffic bandwidth seems to work great so searches on disk are pretty
> fast.
> Now though we seem CPU bound and if/ when Solr CPU gets pegged for too long
> replication falls behind and then starts to recover which causes more usage and
> then eventually shards go “Down”.


A system with enough memory to fully cache the index would almost never 
need to actually read the disk. If there is a lot of disk activity, the 
machine may need more memory for the OS disk cache.  You've said that 25 
percent of your index is in memory.  I've seen good performance with 10 
percent, and terrible performance with 50 percent.  A lot of factors 
will affect what percentage is needed.


What precisely are you looking at to determine that the machine is 
CPU-bound?  Some of the things that people assume are evidence of CPU 
problems are actually evidence of I/O problems caused by not having 
enough memory.
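
One way to tell the two apart on Linux is to compare time spent in user code
with time spent waiting on I/O. The sketch below is illustrative only: it
takes two samples of the aggregate "cpu" line from /proc/stat and reports
the share of time in each state. The function name cpu_breakdown and the
sample values are made up for this example.

```python
def cpu_breakdown(before, after):
    """Percent of CPU time per state between two aggregate 'cpu' lines from /proc/stat."""
    names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]
    a = [int(x) for x in before.split()[1:9]]
    b = [int(x) for x in after.split()[1:9]]
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas) or 1  # avoid division by zero if the samples are identical
    return {name: 100.0 * d / total for name, d in zip(names, deltas)}


# Synthetic samples. On a real machine, read the first line of /proc/stat
# twice, a few seconds apart.
before = "cpu  1000 0 500 8000 500 0 0 0 0 0"
after = "cpu  1200 0 600 8100 1100 0 0 0 0 0"

shares = cpu_breakdown(before, after)
print(f"user={shares['user']:.0f}% iowait={shares['iowait']:.0f}% idle={shares['idle']:.0f}%")
```

In this made-up sample most of the time goes to iowait rather than user code,
which would point at disk I/O rather than a lack of CPU.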



> Our key question: Scale up (fewer instances to manage) or Scale out (more
> instances to manage) and
> do we switch to compute optimized instances (the answer given our usage I
> assume is probably)


Generally if you want to handle a higher request rate, you need more 
machines -- scale out -- and a way to load balance requests.  If each 
request takes too long when the request rate is low, that's probably an 
indication that you need to increase the resources available on each 
machine - scale up.


Memory is the most valuable resource in a Solr install. CPU is 
important, but adding CPU can't solve issues that require more memory.


Thanks,
Shawn



Re: Thoughts on scaling strategy for Solr deployed on AWS EC2 instances - Scale up / out and which instance type?

2018-05-21 Thread Deepak Goel
On Mon, May 21, 2018 at 7:55 PM, Kelly, Frank  wrote:

> Using Solr 5.3.1 - index
>
> We have an indexing heavy workload (we do more indexing than searching)
> and for those searches we do perform we have very few cache hits (25% of
> our index is in memory and the hit rate is < 0.1%)
>
> We are currently using r3.xlarge (memory optimized instances as we
> originally thought we’d have a higher cache hit rate) with EBS optimization
> to IOPs configurable EBS drives.
> Our EBS traffic bandwidth seems to work great so searches on disk are
> pretty fast.
> Now though we seem CPU bound and if/ when Solr CPU gets pegged for too
> long replication falls behind and then starts to recover which causes more
> usage and then eventually shards go “Down”.
>
CPU bound - What does your hardware configuration look like?

"Down" - What exactly happens? Can you please give a bit more detail about
this?


> Our key question: Scale up (fewer instances to manage) or Scale out (more
> instances to manage) and
> do we switch to compute optimized instances (the answer given our usage I
> assume is probably)
>
>
Is the load scaling linearly (25%, 50%, 75%, 100% CPU) on your current machine?
If it is, then scale-up would be a good choice. However, if it is not, I
would go for scale-out.
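
One possible reading of that check, as a small sketch: measure throughput at
several CPU utilization levels and see whether throughput per CPU percent
stays roughly constant. All numbers below are invented, and the 15% tolerance
and the function name scales_linearly are arbitrary choices for illustration.

```python
def scales_linearly(samples, tolerance=0.15):
    """samples: list of (cpu_percent, requests_per_second) pairs.

    Returns True if throughput per CPU percent stays within `tolerance`
    (relative) of the ratio seen in the first sample.
    """
    ratios = [rps / cpu for cpu, rps in samples]
    baseline = ratios[0]
    return all(abs(r - baseline) / baseline <= tolerance for r in ratios)


# Invented measurements at 25%, 50%, 75% and 100% CPU.
roughly_linear = [(25, 100), (50, 195), (75, 290), (100, 380)]
saturating = [(25, 100), (50, 170), (75, 210), (100, 230)]

print(scales_linearly(roughly_linear))
print(scales_linearly(saturating))
```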


> Appreciate any thoughts folks have on this?
>
> Thanks!
>
> -Frank
>



Deepak