It sounds like the issue is that we need both a "per node" config and a
"per collection" config. Both could live in ZooKeeper, with a clear,
well-documented precedence order (node wins) for any attributes that
overlap. It would even make sense to give nodes names that are not
literal machine URLs, so that one could move a node to a different
machine: the node goes down (and is listed as down by ZooKeeper), a node
comes up claiming that name, and if the name belongs to a down node, bingo,
the new node gets the same config as the old node. A new node that comes up
and finds the name taken by a live node could wait for N ticks before
giving up, or could fail immediately.
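
To make the name-claiming idea concrete, here is a minimal sketch. It is not Solr or ZooKeeper code: `Registry`, `claim_name`, `max_ticks` and `on_tick` are hypothetical names, and the in-memory dicts merely stand in for state that would live in ZooKeeper (liveness would presumably map to ephemeral znodes there).

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """In-memory stand-in (assumption) for node-name state kept in ZooKeeper."""
    live: dict = field(default_factory=dict)     # node name -> host currently holding it
    configs: dict = field(default_factory=dict)  # node name -> config, survives downtime

    def mark_down(self, name):
        # Equivalent to ZooKeeper listing the node as down.
        self.live.pop(name, None)


def claim_name(registry, name, host, max_ticks=0, on_tick=None):
    """Try to claim `name` for `host`.

    Returns the config stored under that name on success, or None if a live
    node still holds the name after `max_ticks` ticks. max_ticks=0 means
    "fail immediately".
    """
    for tick in range(max_ticks + 1):
        if name not in registry.live:
            registry.live[name] = host
            # The new holder inherits whatever config the name already had.
            return registry.configs.setdefault(name, {})
        if tick < max_ticks and on_tick is not None:
            on_tick(tick)  # caller-supplied wait, e.g. a sleep
    return None
```

With `max_ticks=0` a node that finds its name held by a live node gives up at once; a larger value gives the old holder time to be marked down first.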

Node names could be supplied at startup, or assigned automatically...

Probably want to have a default node config, and the ability to write
configs for node names that don't (yet) exist...
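
A rough sketch of how the precedence could resolve, under two assumptions of mine: the default node config applies only when no config exists for the node's name, and the attribute names are made up for illustration. `effective_config` is a hypothetical helper, not an existing Solr API.

```python
def effective_config(collection_cfg, node_cfgs, node_name, default_node_cfg=None):
    """Merge per-collection and per-node config; the node wins on overlap."""
    merged = dict(collection_cfg)  # copy so the collection config is not mutated
    # Use the named node's config if one was written, else the default node config.
    node_cfg = node_cfgs.get(node_name, default_node_cfg or {})
    merged.update(node_cfg)        # node attributes override overlapping ones
    return merged


collection = {"mergeFactor": 10, "cacheSize": 512}
nodes = {
    "node-a": {"cacheSize": 2048},
    "node-future": {"cacheSize": 64},  # written before any such node exists
}
default_node = {"cacheSize": 1024}

print(effective_config(collection, nodes, "node-a", default_node))
print(effective_config(collection, nodes, "node-x", default_node))
```

Because the node configs are keyed by name rather than by machine, an entry such as `node-future` can be written ahead of time and picked up by whichever node later claims that name.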

Just a thought. It sounds good to me because a view of ZK still shows you
all the configurations; ZK is still the one source of truth. What I don't
want is multiple sources of truthishness.

On Thu, Feb 4, 2021 at 12:23 PM Tomás Fernández Löbbe <tomasflo...@gmail.com>
wrote:

> > Ehh; I am not suggesting that configSets belong local, which would be a
> > step backwards -- we put them in ZK for a reason right now :-)  I'm
> > suggesting we have both for the same configSet, where the deployer can
> > choose which element is node resident vs cluster/ZK resident.  Thanks to
> > existing Solr features like configOverlay.json and/or XML xi:include plus
> > one small addition of fallback resolution of configSet files from ZK to the
> > local node, we'd get this ability.  (see my first email).
>
> To be clear, I didn't suggest we move all configsets to be local. I'm just
> saying that having a local configset has those issues I mentioned.
>
> The point I was trying to make is that having a single configset load
> from both local and ZK may be confusing for the user and cause issues that
> may be difficult to track: Which file is Solr really reading right now? Is
> it the local one or the remote one? Is there a local one on a node or not?
> Is it being correctly overridden? How do I ensure that I always have a
> local version of a file to override the remote?
>
> So, I'm thinking that if we want to support this feature, a cleaner
> approach could be to just have a type of configset that's defined as
> "local", and then it belongs to the local filesystem. We can just prevent a
> node from starting if it's supposed to have a configset that it doesn't have.
> It's 100% clear where a config file is being read from, etc. Maybe the
> "configOverlay.json" is an exception and should live in ZooKeeper (and
> never locally) for the config API to work, but having just "default to
> local when a file is not in ZooKeeper" just confuses things IMO.
>
> On Tue, Jan 26, 2021 at 8:38 PM David Smiley <dsmi...@apache.org> wrote:
>
>> On Tue, Jan 26, 2021 at 1:27 PM Tomás Fernández Löbbe <
>> tomasflo...@gmail.com> wrote:
>>
>>> Thanks for bringing this up, David. I thought about this same situation
>>> before, but I think I never convinced myself in one way or another :p. As I
>>> mentioned in many other emails, I think the infrastructure and the node
>>> configuration (such as solr.xml) needs to be local (at least, needs to be
>>> able to be local and not forced on ZooKeeper) for various reasons.
>>>
>>
>> I agree 100%.  I think the key part there is having *choice* for each
>> configuration element, and not one dictated by Solr as to what belongs
>> where.  The implementation of it needn't be complicated; it's a
>> straight-forward idea to have the same format with conceptual layer /
>> aggregation of them.
>>
>>
>>> The same reasons exist for configsets: safe upgrades, or possible
>>> node-specific configuration, as you mentioned. But Configsets have another
>>> layer of complexity in my mind, which is, you don't know where you'll need
>>> them... because you don't (necessarily) know where replicas of a collection
>>> are going to be created. True that this is not a problem in the Docker
>>> image situation you are describing, or if handled with care, but how can
>>> Solr make sure of it?
>>>
>>
>> Ehh; I am not suggesting that configSets belong local, which would be a
>> step backwards -- we put them in ZK for a reason right now :-)  I'm
>> suggesting we have *both* for the same configSet, where the deployer can
>> choose which element is node resident vs cluster/ZK resident.  Thanks to
>> existing Solr features like configOverlay.json and/or XML xi:include plus
>> one small addition of fallback resolution of configSet files from ZK to the
>> local node, we'd get this ability.  (see my first email).
>>
>> We have a very limited ability to accomplish the broad idea today -- Java
>> system properties with variable substitution in our files.  But of course
>> it's very limited what you can do with that, and it feels abusive to push
>> it too far.  It's fine for individual tunables (e.g. an integer) but not
>> more aggregate things like a complete MergePolicy configuration or an
>> analysis chain in a schema.
>>
>> We have another vaguely similar thing conceptually in Solr today --
>> ImplicitPlugins.json.  Probably only a few of you have heard of it.  It's
>> baked into solr-core's JAR.  Take a look at it.  What if it were a file
>> that a deployer could easily replace on the node, e.g. to reduce SolrCore
>> load time or for security or to add something that a company wants all
>> SolrCores to have?  That is along the lines of what this email thread is
>> about:  How can a Solr cluster deployer make settings changes (to include
>> registering new plugins) that are either specific to a node and/or should
>> be so for an entire cluster without each ZK resident configSet having the
>> config element?  *We can come up with ideas but most importantly I want
>> to validate the notion that this is a desirable thing.  *I think we
>> agree, Thomas, but I'm unsure about Eric & Gus and anyone else for that
>> matter.
>>
>>
>>> But I think it's a valuable feature to explore. Maybe the configset
>>> needs to exist in ZooKeeper and have some sort of flag (similar to
>>> secure=true) where it could say "local=true", and then fail Solr instances
>>> to start if the configset is not present or something? Otherwise the
>>> collection creation and replica addition operations may need to know where
>>> configsets are present, etc. I'm wondering if this mix you are proposing of
>>> some files in ZooKeeper and some files local wouldn't complicate things too
>>> much... not sure.
>>>
>>
>> I hope my answer above clarifies.  It seems you are exploring the ideas
>> of the latter part of my proposal that I started with "Probably secondary
>> related issue" (fully file system only configSets)... but I regret adding
>> this part because apparently it's too distracting to my primary discussion
>> point.
>>
>> ~ David
>>
>>>

-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
