Hi Dwane,

Thank you for sharing this great Solr/Docker user story.

Given your Solr/JVM memory requirements (heap size + Metaspace + off-heap
size), are you setting specific memory options in your docker-compose files
(mem_limit, mem_reservation, mem_swappiness, ...)?
I suppose you are limiting the total memory used by all the dockerised Solr
instances in order to keep free memory on the host for MMapDirectory?

In short, can you explain how you manage memory?
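
To make the question concrete, here is a rough sketch of the arithmetic I
have in mind. Only the 20 nodes per host comes from your message; the host
RAM, heap, Metaspace and off-heap figures are made-up placeholders, not
recommendations or your actual settings.

```python
# Hypothetical per-host memory budget for dockerised Solr nodes.
# Only NODES_PER_HOST comes from the thread; every size below is an
# assumed placeholder chosen purely for illustration.

NODES_PER_HOST = 20


def container_mem_limit_gb(heap_gb, metaspace_gb, offheap_gb):
    """Memory one Solr JVM may need: heap + Metaspace + other off-heap usage."""
    return heap_gb + metaspace_gb + offheap_gb


def page_cache_left_gb(host_ram_gb, nodes, per_container_gb):
    """RAM left on the host for the OS page cache, which MMapDirectory relies on."""
    return host_ram_gb - nodes * per_container_gb


# Candidate value for a per-container limit such as mem_limit (assumed sizes).
limit = container_mem_limit_gb(heap_gb=8, metaspace_gb=0.5, offheap_gb=1.5)

# What would remain for the page cache on a host with an assumed 512 GB of RAM.
left = page_cache_left_gb(host_ram_gb=512, nodes=NODES_PER_HOST, per_container_gb=limit)

print(f"per-container limit: {limit} GB, left for page cache: {left} GB")
```

If the second number is small compared with the index size on the host, I
would expect queries to hit disk more often, hence my question.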

Regards

Dominique




On Mon, 23 Dec 2019 at 00:17, Dwane Hall <dwaneh...@hotmail.com> wrote:

> Hey Walter,
>
> I recently migrated our Solr cluster to Docker and am very pleased I did
> so. We run relatively large servers with multiple Solr instances per
> physical host, and having managed Solr upgrades on bare metal installs since
> Solr 5, containerisation has been a blessing (currently Solr 7.7.2). In our
> case we run 20 Solr nodes per host over 5 hosts totalling 100 Solr
> instances. Here I host 3 collections of varying size. The first contains
> 60m docs (8 shards), the second 360m (12 shards), and the third 1.3b (30
> shards) all with 2 NRT replicas. The docs are primarily database sourced
> but are not tiny by any means.
>
> Here are some of my comments from our migration journey:
> - Running Solr on Docker should be no different to bare metal. You still
> need to test for your environment and conditions and follow the guides and
> best practices outlined in the excellent Lucidworks blog post
> https://lucidworks.com/post/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
> .
> - The recent Solr Docker images are built with Java 11 so if you store
> your indexes in HDFS you'll have to build your own Docker image as Hadoop
> is not yet certified with Java 11 (or use an older Solr version image built
> with Java 8)
> - As Docker will be responsible for quite a few Solr nodes it becomes
> important to make sure the Docker daemon is configured in systemd to
> restart after failure or reboot of the host. Additionally the Docker
> restart=always setting is useful for restarting failed containers
> automatically if a single container dies (e.g. JVM explosions). I've
> deliberately blown up the JVM in test conditions and found the
> containers/Solr recover really well under Docker.
> - I use Docker Compose to spin up our environment and it has been
> excellent for maintaining consistent settings across Solr nodes and hosts.
> Additionally using a .env file makes most of the Solr environment variables
> per node configurable in an external file.
> - I'd recommend Docker Swarm if you plan on running Solr over multiple
> physical hosts. Unfortunately we had an incompatible OS so I was unable to
> utilise this approach. The same incompatibility existed for K8s but
> Lucidworks has another great article on this approach if you're more
> fortunate with your environment than us
> https://lucidworks.com/post/running-solr-on-kubernetes-part-1/.
> - Our Solr instances are TLS secured and use the basic auth plugin and
> rule-based authorization plugin. There's nothing I have not been able
> to configure with the default Docker images using environment variables
> passed into the container. This makes upgrades to Solr versions really easy
> as you just need to grab the image and pass in your environment details to
> the container for any new Solr version.
> - If possible I'd start with the Solr 8 Docker image. The project
> underwent a large refactor to align it with the install script based on
> community feedback. If you start with an earlier version you'll need to
> refactor when you eventually move to Solr version 8. The Solr Docker page
> has more details on this.
> - Martijn Koster (the project lead) is excellent and very responsive to
> questions on the project page. Read through the Q&A page before reaching
> out; I found a lot of my questions already answered there. Additionally, he
> provides a number of example Docker configurations, from command line
> parameters to docker-compose files running multiple instances and ZooKeeper
> quorums.
> - The Docker extra hosts parameter is useful for adding extra hosts to
> your container's hosts file, particularly if you have multiple NICs with
> internal and external interfaces and you want to force communication over a
> specific one.
> - We use the Solr Prometheus exporter to collect node metrics. I've found
> I've needed to reduce the metrics to collect as having this many nodes
> overwhelmed it occasionally. From memory it had something to do with
> concurrent modification of Future objects the collector uses and it
> sometimes misses collection cycles. This is not Docker related but is down
> to the size of our Solr cluster and the exporter's ability to handle it.
> - We use the zkCli script a lot for updating configsets. As I did not want
> to have to copy them into a container to update them I just download a copy
> of the Solr binaries and use it entirely for this ZooKeeper script. It's
> not elegant but a number of our devs are not familiar with Docker and this
> was a nice compromise. Another alternative is to just use the REST API to
> do any configset manipulation.
> - We load balance all of these nodes to external clients using an HAProxy
> Docker image. This combined with the Docker restart policy and Solr
> replication and autoscaling capabilities provides a very stable environment
> for us.
>
> All in all migrating and running Solr on Docker has been brilliant. It was
> primarily driven by a need to scale our environment vertically on large
> hardware instances as running 100 nodes on bare metal was too big a
> maintenance and administrative burden for us with a small dev and support
> team. To date it's been very stable and reliable so I would recommend the
> approach if you are in a similar situation.
>
> Thanks,
>
> Dwane
>
>
>
>
>
>
> ________________________________
> From: Walter Underwood <wun...@wunderwood.org>
> Sent: Saturday, 14 December 2019 6:04 PM
> To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
> Subject: Solr Cloud on Docker?
>
> Does anyone have experience running a big Solr Cloud cluster on Docker
> containers? By “big”, I mean 35 million docs, 40 nodes, 8 shards, with 36
> CPU instances. We are running version 6.6.2 right now, but could upgrade.
>
> If people have specific things to do or avoid, I’d really appreciate it.
>
> I got a couple of responses on the Slack channel, but I’d love more
> stories from the trenches. This is a direction for our company architecture.
>
> We have a master/slave cluster (Solr 4.10.4) that is awesome. I can
> absolutely see running the slaves as containers. For Solr Cloud? Makes me
> nervous.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
