Re: Solr configuration options

Ilan Ginzburg Fri, 28 Aug 2020 08:51:39 -0700

What I'm really looking for (and my current understanding is that solr.xml
is the only option) is *a cluster config a Solr dev can set as a default* when
introducing a new feature, for example, so that the config is picked up out of
the box in SolrCloud, while still allowing the end user to override it if they
so wish.
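
As a rough illustration of that "dev default, user override" behavior, here
is a minimal Java sketch. The class name, method name and config keys are
hypothetical and do not exist in Solr; it only shows the precedence I have in
mind.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not Solr code: dev-shipped defaults with user overrides on top. */
final class LayeredConfig {
    private LayeredConfig() {}

    /**
     * Returns the effective config: every default shipped by the devs,
     * replaced key by key with whatever the user chose to override.
     */
    static Map<String, String> effective(Map<String, String> devDefaults,
                                         Map<String, String> userOverrides) {
        Map<String, String> merged = new HashMap<>(devDefaults);
        merged.putAll(userOverrides); // user values win over shipped defaults
        return Collections.unmodifiableMap(merged);
    }
}
```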


But "cluster config" in this context comes *with a caveat*: when doing a rolling
upgrade, nodes running new code need the new cluster config, while nodes running
old code need the previous cluster config... Having a per-node solr.xml deployed
atomically with the code, as is currently the case, has disadvantages, but
solves this problem effectively in a very simple way. If we were to move to
a central cluster config, we'd likely need to introduce config versioning
or, as Noble suggested elsewhere, only write code that's backward compatible
(w.r.t. config), deploy that code everywhere, then, once no old code is
running, update the cluster config. I find this approach complicated from
both a dev and an operational perspective, with unclear added value.
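
To show what "config versioning" could mean during a rolling upgrade, here is
a minimal Java sketch under my own assumptions: the central store keeps one
config per format version, and each node asks for the newest version its code
understands. Nothing like this exists in Solr today; the class, methods and
keys are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical sketch, not Solr code: a central config kept per format version. */
final class VersionedClusterConfig {
    // Sorted by config format version so the closest lower version can be found.
    private final TreeMap<Integer, Map<String, String>> byVersion = new TreeMap<>();

    /** Stores (or replaces) the config written for the given format version. */
    void publish(int formatVersion, Map<String, String> config) {
        byVersion.put(formatVersion, Collections.unmodifiableMap(new HashMap<>(config)));
    }

    /**
     * Returns the newest config a node can read, given the highest format
     * version its code supports. Old nodes keep seeing the old config while
     * upgraded nodes already see the new one.
     */
    Optional<Map<String, String>> configFor(int maxSupportedVersion) {
        Map.Entry<Integer, Map<String, String>> entry = byVersion.floorEntry(maxSupportedVersion);
        return entry == null ? Optional.empty() : Optional.of(entry.getValue());
    }
}
```

Even in this toy form, someone has to decide when old versions can be dropped,
which is part of the operational cost I mention above.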

Ilan

PS. I've stumbled upon the loading of solr.xml from Zookeeper in the past
but couldn't find it as I wrote my message so I thought I imagined it...

It's in SolrDispatchFilter.loadNodeConfig(). It establishes a connection to
ZK for fetching solr.xml then closes it.
It relies on system property waitForZk as the connection timeout (in
seconds, defaults to 30) and system property zkHost as the Zookeeper host.
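
For reference, reading those two system properties looks roughly like the
sketch below. This is not the actual SolrDispatchFilter code, only an
illustration of the two inputs described above; the class and field names are
hypothetical.

```java
import java.util.Optional;

/** Hypothetical sketch, not the actual SolrDispatchFilter code. */
final class ZkBootstrapSettings {
    /** Default described above: 30 seconds. */
    static final int DEFAULT_WAIT_FOR_ZK_SECONDS = 30;

    final Optional<String> zkHost;
    final int waitForZkSeconds;

    private ZkBootstrapSettings(Optional<String> zkHost, int waitForZkSeconds) {
        this.zkHost = zkHost;
        this.waitForZkSeconds = waitForZkSeconds;
    }

    /** Reads -DzkHost and -DwaitForZk; an unset or blank zkHost yields an empty Optional. */
    static ZkBootstrapSettings fromSystemProperties() {
        Optional<String> host = Optional.ofNullable(System.getProperty("zkHost"))
                .map(String::trim)
                .filter(h -> !h.isEmpty());
        // Integer.getInteger falls back to the default when the property is unset or not a number.
        int wait = Integer.getInteger("waitForZk", DEFAULT_WAIT_FOR_ZK_SECONDS);
        return new ZkBootstrapSettings(host, wait);
    }
}
```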

I believe solr.xml can only end up in ZK through the use of ZkCLI. The
user is then on their own to manage SolrCloud version upgrades: if a new
solr.xml is included as part of a new version of SolrCloud, a user who has
pushed a previous version into ZK will not see the update.
I wonder if putting solr.xml in ZK is a common practice.

On Fri, Aug 28, 2020 at 4:58 PM Jan Høydahl <jan....@cominvent.com> wrote:

> I interpret solr.xml as the node-local configuration for a single node.
> clusterprops.json is the cluster-wide configuration applying to all nodes.
> solrconfig.xml is of course per core etc
>
> solr.in.sh is the per-node ENV-VAR way of configuring a node, and many of
> those are picked up in solr.xml (others in bin/solr).
>
> I think it is important to keep a file-local config file which can only be
> modified if you have shell access to that local node, it provides an extra
> layer of security.
> And in certain cases a node may need a different configuration from
> another node, e.g. during an upgrade.
>
> I put solr.xml in zookeeper. It may have been a mistake, since it may not
> make all that much sense to load solr.xml which is a node-level file, from
> ZK. But if it uses var substitutions for all node-level stuff, it will
> still work since those vars are pulled from local properties when parsed
> anyway.
>
> I’m also somewhat against hijacking clusterprops.json as a general purpose
> JSON config file for the cluster. It was supposed to be for simple
> properties.
>
> Jan
>
> > 28. aug. 2020 kl. 14:23 skrev Erick Erickson <erickerick...@gmail.com>:
> >
> > Solr.xml can also exist on Zookeeper, it doesn’t _have_ to exist
> locally. You do have to restart to have any changes take effect.
> >
> > Long ago in a Solr far away solr.xml was where all the cores were
> defined. This was before “core discovery” was put in. Since solr.xml had to
> be there anyway and was read at startup, other global information was added
> and it’s lived on...
> >
> > Then clusterprops.json came along as a place to put, well, cluster-wide
> properties so having solr.xml too seems awkward. Although if you do have
> solr.xml locally to each node, you could theoretically have different
> settings for different Solr instances. Frankly I consider this more of a
> bug than a feature.
> >
> > I know there has been some talk about removing solr.xml entirely, but
> I’m not sure what the thinking is about what to do instead. Whatever we do
> needs to accommodate standalone. We could do the same trick we do now, and
> essentially move all the current options in solr.xml to clusterprops.json
> (or other ZK node) and read it locally for stand-alone. The API could even
> be used to change it if it was stored locally.
> >
> > That still leaves the chicken-and-egg problem of connecting to ZK in the
> first place.
> >
> >> On Aug 28, 2020, at 7:43 AM, Ilan Ginzburg <ilans...@gmail.com> wrote:
> >>
> >> I want to ramp-up/discuss/inventory configuration options in Solr.
> Here's my understanding of what exists and what could/should be used
> depending on the need. Please correct/complete as needed (or point to
> documentation I might have missed).
> >>
> >> There are currently 3 sources of general configuration I'm aware of:
> >>      • Collection specific config bootstrapped by file solrconfig.xml
> and copied into the initial (_default) then subsequent Config Sets in
> Zookeeper.
> >>      • Cluster wide config in Zookeeper /clusterprops.json editable
> globally through Zookeeper interaction using an API. Not bootstrapped by
> anything (i.e. does not exist until the user explicitly creates it)
> >>      • Node config file solr.xml deployed with Solr on each node and
> loaded when Solr starts. Changes to this file are per node and require node
> restart to be taken into account.
> >> The Collection specific config (file solrconfig.xml then in Zookeeper
> /configs/<config set name>/solrconfig.xml) allows Solr devs to set
> reasonable defaults (the file is part of the Solr distribution). Content
> can be changed by users as they create new Config Sets persisted in
> Zookeeper.
> >>
> >> Zookeeper's /clusterprops.json can be edited through the collection
> admin API CLUSTERPROP. If users do not set anything there, the file doesn't
> even exist in Zookeeper, therefore Solr devs cannot use it to set a default
> cluster config: there's no clusterprops.json file in the Solr distrib like
> there's a solrconfig.xml.
> >>
> >> File solr.xml is used by Solr devs to set some reasonable default
> configuration (parametrized through property files or system properties).
> There's no API to change that file, users would have to edit/redeploy the
> file on each node and restart the Solr JVM on that node for the new config
> to be taken into account.
> >>
> >> Based on the above, my vision (or mental model) of what to use
> depending on the need:
> >>
> >> solrconfig.xml is the only per collection config. IMO it does its job
> correctly: Solr devs can set defaults, users tailor the content to what
> they need for new config sets. It's the only option for per collection
> config anyway.
> >>
> >> The real hesitation could be between solr.xml and Zookeeper
> /clusterprops.json. What should go where?
> >>
> >> For user configs (anything the user does to the Solr cluster AFTER it
> was deployed and started), /clusterprops.json seems to be the obvious
> choice and offers the right abstractions (global config, no need to worry
> about individual nodes, all nodes pick up configs and changes to configs
> dynamically).
> >>
> >> For configs that need to be available without requiring user
> intervention or needed before the connection to ZK is established, there's
> currently no other choice than using solr.xml. Such configuration obviously
> includes parameters that are needed to connect to ZK (timeouts, credential
> provider and hopefully one day an option to either use direct ZK
> interaction code or Curator code), but also configuration of general
> features that should be the default without requiring users to opt in yet
> allowing them to easily opt out by editing solr.xml before deploying to
> their cluster (in the future, this could include which Lucene version to
> load in Solr for example).
> >>
> >> To summarize:
> >>      • Collection specific config? --> solrconfig.xml
> >>      • User provided cluster config once SolrCloud is running? --> ZK
> /clusterprops.json
> >>      • Solr dev provided cluster config? --> solr.xml
> >>
> >> Going forward, some (but only some!) of the config that currently can
> only live in solr.xml could be made to go to /clusterprops.json or another
> ZK based config file. This would require adding code to create that ZK file
> upon initial cluster start (to not force the user to push it) and devising a
> mechanism (likely a script, could be tricky though) to update that file in
> ZK when a new release of Solr is deployed and a previous version of that
> file already exists. Not impossible tasks, but not trivial ones either.
> Whatever the needs of such an approach are, it might be easier to keep the
> existing solr.xml as a file and allow users to define overrides in
> Zookeeper for the configuration parameters from solr.xml that make sense to
> be overridden in ZK (obviously ZK credentials or connection timeout do not
> make sense in that context, but defining the shard handler implementation
> class does since it is likely loaded after a node managed to connect to ZK).
> >>
> >> Some config will have to stay in a local Node file system file and only
> there no matter what: Zookeeper timeout definition or any node
> configuration that is needed before the node connects to Zookeeper.
> >>
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
