tboeghk commented on PR #2783:
URL: https://github.com/apache/solr/pull/2783#issuecomment-3104512660

   In addition to the great summary by @ardatezcan1 above, here are my practical 
tips and real-world scenarios for running Solr in a high-RPM (requests per 
minute), small-to-medium-dataset environment (like e-commerce applications).
   
   
   __Best practices for using Solr in high-RPM environments__
   
   Before starting to optimize your Solr setup, make sure to have strong 
observability in place. In addition to the [Solr Prometheus and 
Grafana](https://solr.apache.org/guide/solr/latest/deployment-guide/monitoring-with-prometheus-and-grafana.html)
 setup I strongly recommend setting up the [Node 
Exporter](https://github.com/prometheus/node_exporter) to gather and correlate 
machine metrics.
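   
   As a starting sketch, the Prometheus exporter bundled with Solr and the node 
exporter can be launched roughly like this (paths, ports and the ZooKeeper 
address are illustrative and depend on your installation):
   
   ```shell
   # Solr metrics: the contrib prometheus-exporter shipped with Solr 9.x
   ./prometheus-exporter/bin/solr-exporter -p 9854 -z localhost:2181 \
     -f ./prometheus-exporter/conf/solr-exporter-config.xml
   # machine metrics on the same host, to correlate with Solr metrics
   ./node_exporter --web.listen-address=":9100"
   ```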
   
   * __Use Solr in cloud mode__: Running Solr in SolrCloud mode with a ZooKeeper 
ensemble is a prerequisite for the following best practices. Cloud mode enables 
easy addition and removal of Solr cluster nodes depending on the current 
traffic.
   * __Sharding__: Request processing in Solr is a single-threaded operation. 
The larger your dataset, the more latency you'll add to request processing. The 
only (sustainable) way to make query processing multi-threaded is to shard your 
index. Depending on your workload, you could simply run multiple Solr instances 
on the same machine, although I recommend a single Solr instance per machine.
       * __Sharding strategies__: If your query processing strategy uses 
[collapse (and expand or 
grouping)](https://solr.apache.org/guide/solr/latest/query-guide/collapse-and-expand-results.html),
 make sure to put all documents sharing a grouping key on the same shard. Adjust 
the [document 
routing](https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-shards-indexing.html#document-routing)
 by setting `router.field` to your grouping key.
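   
   As a sketch, creating a collection routed by a grouping key could look like 
this (the collection name, shard count and the field name `group_key` are 
illustrative):
   
   ```shell
   # route all documents with the same group_key to the same shard
   curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=products&numShards=4&router.name=compositeId&router.field=group_key"
   ```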
   * __Indexing and optimization strategies__: Indexing into a live collection 
adds significant latency to your search requests: each commit flushes the 
internal caches, and warm caches are what keep Solr running fast. Avoid any 
unnecessary cache flushes!
     * __Optimize your index__: Manually optimizing your index is generally 
discouraged, but it delivers the best query performance because deleted 
documents are pruned from the index.
     * __Rotate collections__: For small to medium datasets it might be a 
good strategy to periodically index your data into a new collection instead of 
updating an existing one. That way, request caches stay warm for the lifetime 
of a collection and a manual optimize becomes feasible. Use [collection 
aliases](https://solr.apache.org/guide/solr/latest/deployment-guide/aliases.html)
 to switch clients to the new collection.
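   
   One rotation cycle can be sketched with the Collections API like this 
(collection and alias names are illustrative):
   
   ```shell
   # 1. full reindex into a freshly created collection, e.g. products_v42
   # 2. optimize once, while the new collection receives no live traffic yet
   curl "http://localhost:8983/solr/products_v42/update?optimize=true&maxSegments=1"
   # 3. atomically switch clients querying the "products" alias to the new collection
   curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=products&collections=products_v42"
   ```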
   * __Use dedicated node setups__: In high-traffic environments, a separation 
of concerns becomes more important. Use dedicated node types and machine 
sizing for optimal performance tailored to each machine's role.
     * __Indexer__: Used solely for indexing products. Set up as the 
[`TLOG`](https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-shards-indexing.html#types-of-replicas)
 replica type. Must not be used for request processing: exclude `TLOG` replicas 
from request processing using the 
[`shards.preference`](https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#shards-preference-parameter)
 parameter configured in your request handlers.
     * __Data__: Set up as a `PULL` replica. Replicates its index from the 
indexer nodes via SolrCloud. Using `TLOG` and `PULL` replicas keeps indexing 
work off the data nodes (unlike `NRT` replicas, where every replica indexes on 
its own).
     * __Coordinator__: In sharded SolrCloud setups, these nodes coordinate 
the distributed request flow and assemble the final search result. This is a 
very CPU-intensive operation and is usually shared among the data nodes. 
Dedicated [coordinator 
nodes](https://solr.apache.org/guide/solr/latest/deployment-guide/node-roles.html#coordinator-role)
 move the compute overhead of coordinating distributed requests off the data 
nodes; adding them to a SolrCloud setup will drop resource usage on the data 
nodes significantly. To make full use of coordinator nodes, direct all incoming 
request traffic to them.
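   
   Pulling the node-type pieces together, a minimal sketch (collection name, 
replica counts and the ZooKeeper address are placeholders):
   
   ```shell
   # TLOG replicas for indexing, PULL replicas for serving; no NRT replicas
   curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=products&numShards=4&nrtReplicas=0&tlogReplicas=1&pullReplicas=2"
   # prefer PULL replicas at query time (also settable as a request handler default)
   curl "http://localhost:8983/solr/products/select?q=*:*&shards.preference=replica.type:PULL"
   # start a dedicated coordinator node (node roles, Solr 9.x)
   SOLR_OPTS="-Dsolr.node.roles=coordinator:on,data:off" bin/solr start -c -z zk1:2181
   ```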
   * __JVM tuning__: I highly recommend running Solr with the _G1 garbage 
collector_. Keep in mind the golden rule of leaving at least 50% of the 
machine's RAM to the OS disk cache on data and indexer nodes. As coordinator 
nodes are stateless, you can boost their performance significantly with the 
_ZGC garbage collector_: it slashes collection pauses from milliseconds to 
well under a millisecond.
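   
   In `solr.in.sh` this could look like the following (the heap size is an 
example for a 32 GB machine; pick one `GC_TUNE` per node type):
   
   ```shell
   # data/indexer nodes: G1, heap at most ~50% of RAM
   SOLR_HEAP="16g"    # leaves the other half of RAM to the OS disk cache
   GC_TUNE="-XX:+UseG1GC -XX:MaxGCPauseMillis=250"
   # coordinator nodes: ZGC instead
   # GC_TUNE="-XX:+UseZGC"
   ```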
   * __Cloud setup__: Most SolrCloud setups will run in some kind of cloud 
environment. Here are some tips for setting up an elastic Solr cloud 
environment.
     * __Autoscaling__: Use a dedicated autoscaling group for each node type 
and each shard. Use tags to mark which instance should replicate which shard. 
Derive your heap settings dynamically from the instance size and allow a wide 
range of instance types. Build a custom script to replicate data upon instance 
start, and use the Solr Collections API to [remove a node from the 
cluster](https://solr.apache.org/guide/solr/latest/deployment-guide/cluster-node-management.html#deletenode)
 during instance termination.
     * __Spot instances__: Coordinator and data nodes are great candidates for 
spot instances. This saves a significant chunk of your cloud spending.
     * __ARM instance types__: Utilize ARM instance types wherever possible. 
The Solr Docker image is also pre-built for ARM architectures. ARM CPUs offer 
the best bang for the buck and more consistent response latency (as they are 
typically not power-managed).
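   
   The termination-hook part of the autoscaling setup above can be sketched as 
follows (the node name follows Solr's `host:port_context` convention; port and 
context are placeholders):
   
   ```shell
   # drain and deregister the terminating instance from the cluster
   NODE="$(hostname -f):8983_solr"
   curl "http://localhost:8983/solr/admin/collections?action=DELETENODE&node=${NODE}"
   ```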
   
   If you need more information, or help compiling all of this into a single 
document, let me know!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

