I interpret solr.xml as the node-local configuration for a single node. clusterprops.json is the cluster-wide configuration that applies to all nodes. solrconfig.xml is of course per core, etc.
solr.in.sh is the per-node ENV-VAR way of configuring a node, and many of those variables are picked up in solr.xml (others in bin/solr).

I think it is important to keep a node-local config file that can only be modified if you have shell access to that node; it provides an extra layer of security. And in certain cases a node may need a different configuration from another node, e.g. during an upgrade.

I put solr.xml in Zookeeper. It may have been a mistake, since it may not make all that much sense to load solr.xml, which is a node-level file, from ZK. But if it uses variable substitutions for all node-level stuff, it will still work, since those variables are pulled from local properties when the file is parsed anyway.

I’m also somewhat against hijacking clusterprops.json as a general-purpose JSON config file for the cluster. It was supposed to be for simple properties.

Jan

> On 28 Aug 2020, at 14:23, Erick Erickson <[email protected]> wrote:
>
> Solr.xml can also exist on Zookeeper, it doesn’t _have_ to exist locally. You
> do have to restart to have any changes take effect.
>
> Long ago in a Solr far away solr.xml was where all the cores were defined.
> This was before “core discovery” was put in. Since solr.xml had to be there
> anyway and was read at startup, other global information was added and it’s
> lived on...
>
> Then clusterprops.json came along as a place to put, well, cluster-wide
> properties so having solr.xml too seems awkward. Although if you do have
> solr.xml locally to each node, you could theoretically have different
> settings for different Solr instances. Frankly I consider this more of a bug
> than a feature.
>
> I know there has been some talk about removing solr.xml entirely, but I’m
> not sure what the thinking is about what to do instead. Whatever we do needs
> to accommodate standalone. We could do the same trick we do now, and
> essentially move all the current options in solr.xml to clusterprops.json (or
> other ZK node) and read it locally for stand-alone.
> The API could even be used to change it if it were stored locally.
>
> That still leaves the chicken-and-egg problem of connecting to ZK in the
> first place.
>
>> On Aug 28, 2020, at 7:43 AM, Ilan Ginzburg <[email protected]> wrote:
>>
>> I want to ramp-up/discuss/inventory configuration options in Solr. Here's my
>> understanding of what exists and what could/should be used depending on the
>> need. Please correct/complete as needed (or point to documentation I might
>> have missed).
>>
>> There are currently 3 sources of general configuration I'm aware of:
>> • Collection specific config bootstrapped by file solrconfig.xml and
>>   copied into the initial (_default) then subsequent Config Sets in
>>   Zookeeper.
>> • Cluster wide config in Zookeeper /clusterprops.json editable globally
>>   through Zookeeper interaction using an API. Not bootstrapped by anything
>>   (i.e. does not exist until the user explicitly creates it).
>> • Node config file solr.xml deployed with Solr on each node and loaded
>>   when Solr starts. Changes to this file are per node and require node
>>   restart to be taken into account.
>>
>> The Collection specific config (file solrconfig.xml then in Zookeeper
>> /configs/<config set name>/solrconfig.xml) allows Solr devs to set
>> reasonable defaults (the file is part of the Solr distribution). Content can
>> be changed by users as they create new Config Sets persisted in Zookeeper.
>>
>> Zookeeper's /clusterprops.json can be edited through the collection admin
>> API CLUSTERPROP. If users do not set anything there, the file doesn't even
>> exist in Zookeeper, therefore Solr devs cannot use it to set a default
>> cluster config: there's no clusterprops.json file in the Solr distrib like
>> there's a solrconfig.xml.
>>
>> File solr.xml is used by Solr devs to set some reasonable default
>> configuration (parametrized through property files or system properties).
>> There's no API to change that file, users would have to edit/redeploy the
>> file on each node and restart the Solr JVM on that node for the new config
>> to be taken into account.
>>
>> Based on the above, my vision (or mental model) of what to use depending on
>> the need:
>>
>> solrconfig.xml is the only per collection config. IMO it does its job
>> correctly: Solr devs can set defaults, users tailor the content to what they
>> need for new config sets. It's the only option for per collection config
>> anyway.
>>
>> The real hesitation could be between solr.xml and Zookeeper
>> /clusterprops.json. What should go where?
>>
>> For user configs (anything the user does to the Solr cluster AFTER it was
>> deployed and started), /clusterprops.json seems to be the obvious choice and
>> offers the right abstractions (global config, no need to worry about
>> individual nodes, all nodes pick up configs and changes to configs
>> dynamically).
>>
>> For configs that need to be available without requiring user intervention or
>> needed before the connection to ZK is established, there's currently no
>> other choice than using solr.xml. Such configuration obviously includes
>> parameters that are needed to connect to ZK (timeouts, credential provider
>> and hopefully one day an option to either use direct ZK interaction code or
>> Curator code), but also configuration of general features that should be the
>> default without requiring users to opt in yet allowing them to easily opt
>> out by editing solr.xml before deploying to their cluster (in the future,
>> this could include which Lucene version to load in Solr for example).
>>
>> To summarize:
>> • Collection specific config? --> solrconfig.xml
>> • User provided cluster config once SolrCloud is running? --> ZK
>>   /clusterprops.json
>> • Solr dev provided cluster config? --> solr.xml
>>
>> Going forward, some (but only some!) of the config that currently can only
>> live in solr.xml could be made to go to /clusterprops.json or another ZK
>> based config file. This would require adding code to create that ZK file
>> upon initial cluster start (to not force the user to push it) and devising a
>> mechanism (likely a script, could be tricky though) to update that file in
>> ZK when a new release of Solr is deployed and a previous version of that
>> file already exists. Not impossible tasks, but not trivial ones either.
>> Whatever the needs of such an approach are, it might be easier to keep the
>> existing solr.xml as a file and allow users to define overrides in Zookeeper
>> for the configuration parameters from solr.xml that make sense to be
>> overridden in ZK (obviously ZK credentials or connection timeout do not make
>> sense in that context, but defining the shard handler implementation class
>> does since it is likely loaded after a node managed to connect to ZK).
>>
>> Some config will have to stay in a local Node file system file and only
>> there no matter what: Zookeeper timeout definition or any node configuration
>> that is needed before the node connects to Zookeeper.
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
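
A minimal sketch of the override idea discussed in this thread: values from the local solr.xml act as defaults, and overrides stored in Zookeeper are applied only for settings that are not needed before the node connects to ZK. The function name, the LOCAL_ONLY set and the sample values are assumptions made for illustration; this is not Solr code and does not describe an existing Solr API.

```python
# Illustrative only: hypothetical names, not an actual Solr API.
# Settings required to reach Zookeeper in the first place must stay node-local,
# so overrides for them coming from ZK are ignored.
LOCAL_ONLY = {"zkClientTimeout", "zkCredentialsProvider"}


def effective_config(local_defaults, zk_overrides):
    """Merge node-local defaults with ZK overrides, skipping local-only keys."""
    merged = dict(local_defaults)
    for key, value in zk_overrides.items():
        if key in LOCAL_ONLY:
            continue  # needed before ZK is reachable, so it cannot be overridden there
        merged[key] = value
    return merged


# Sample values (assumed): what a node read from its local solr.xml ...
local_defaults = {
    "zkClientTimeout": 30000,
    "shardHandlerFactory": "local.DefaultShardHandlerFactory",
}
# ... and what a cluster admin stored as overrides in Zookeeper.
zk_overrides = {
    "zkClientTimeout": 5000,
    "shardHandlerFactory": "custom.ShardHandlerFactory",
}

config = effective_config(local_defaults, zk_overrides)
```

With these sample values, the ZK timeout stays at the node-local value while the shard handler class is taken from the ZK override, which matches the split suggested above between settings needed before and after the ZK connection is established.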
