Hi, I can see the need for such flexibility, but I'm also worried that we would complicate things and make debugging harder.
If I'm not mistaken, the current logic in ZkResourceLoader is to look in ZK first and, if the resource is not found there, to look on local disk(?). I'd prefer an explicit fallback or resolution order instead of hardcoded magic, i.e. being able to configure a configset search path such as ["local", "zk", "somethingelse"]. With that order, the resource loader would prefer local files even if they also exist in ZK.

Longer term it would be nice to isolate ZK away and make config sources fully pluggable. You could then address an overly large resource explicitly, such as <filter .. words="filestore://stopwords-en.txt"> or local:/stopwords-en.txt etc.

Jan

> On 27 Jan 2021, at 05:38, David Smiley <dsmi...@apache.org> wrote:
>
> On Tue, Jan 26, 2021 at 1:27 PM Tomás Fernández Löbbe
> <tomasflo...@gmail.com> wrote:
>
> Thanks for bringing this up, David. I thought about this same situation
> before, but I think I never convinced myself in one way or another :p. As I
> mentioned in many other emails, I think the infrastructure and the node
> configuration (such as solr.xml) needs to be local (at least, needs to be
> able to be local and not forced on ZooKeeper) for various reasons.
>
> I agree 100%. I think the key part there is having choice for each
> configuration element, and not one dictated by Solr as to what belongs where.
> The implementation of it needn't be complicated; it's a straight-forward
> idea to have the same format with conceptual layer / aggregation of them.
>
> The same reasons exist for configsets: safe upgrades, or possible
> node-specific configuration, as you mentioned. But Configsets have another
> layer of complexity in my mind, which is, you don't know where you'll need
> them... because you don't (necessarily) know where replicas of a collection
> are going to be created. True that this is not a problem in the Docker image
> situation you are describing, or if handled with care, but how can Solr make
> sure of it?
>
> Ehh; I am not suggesting that configSets belong local, which would be a step
> backwards -- we put them in ZK for a reason right now :-) I'm suggesting we
> have both for the same configSet, where the deployer can choose which element
> is node resident vs cluster/ZK resident. Thanks to existing Solr features
> like configOverlay.json and/or XML xi:include plus one small addition of
> fallback resolution of configSet files from ZK to the local node, we'd get
> this ability. (see my first email).
>
> We have a very limited ability to accomplish the broad idea today -- Java
> system properties with variable substitution in our files. But of course
> it's very limited what you can do with that, and it feels abusive to push it
> too far. It's fine for individual tunables (e.g. an integer) but not more
> aggregate things like a complete MergePolicy configuration or an analysis
> chain in a schema.
>
> We have another vaguely similar thing conceptually in Solr today --
> ImplicitPlugins.json. Probably only a few of you have heard of it. It's
> baked into solr-core's JAR. Take a look at it. What if it were a file that
> a deployer could easily replace on the node, e.g. to reduce SolrCore load
> time or for security or to add something that a company wants all SolrCores
> to have? That is along the lines of what this email thread is about: How
> can a Solr cluster deployer make settings changes (to include registering new
> plugins) that are either specific to a node and/or should be so for an entire
> cluster without each ZK resident configSet having the config element? We can
> come up with ideas but most importantly I want to validate the notion that
> this is a desirable thing. I think we agree, Thomas, but I'm unsure about
> Eric & Gus and anyone else for that matter.
>
> But I think it's a valuable feature to explore.
> Maybe the configset needs to exist in ZooKeeper and have some sort of flag
> (similar to secure=true) where it could say "local=true", and then fail Solr
> instances to start if the configset is not present or something? Otherwise
> the collection creation and replica addition operations may need to know
> where configsets are present, etc. I'm wondering if this mix you are
> proposing of some files in ZooKeeper and some files local wouldn't complicate
> things too much... not sure.
>
> I hope my answer above clarifies. It seems you are exploring the ideas of
> the latter part of my proposal that I started with "Probably secondary
> related issue" (fully file system only configSets)... but I regret adding
> this part because apparently it's too distracting to my primary discussion
> point.
>
> ~ David
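
To make the search-path idea from the top of this mail a bit more concrete, here is a rough sketch of what an explicit resolution order could look like. Everything in it is hypothetical: ConfigSource, resolve and the in-memory maps are illustration names only, not existing Solr classes or APIs, and the maps merely stand in for real local-disk and ZK backends.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class SearchPathSketch {

    /** Hypothetical abstraction of one config source (local disk, ZK, ...). */
    interface ConfigSource {
        Optional<String> read(String resource);
    }

    /** Tries each named source in the configured order; the first hit wins. */
    static Optional<String> resolve(List<String> searchPath,
                                    Map<String, ConfigSource> sources,
                                    String resource) {
        for (String name : searchPath) {
            ConfigSource source = sources.get(name);
            if (source == null) {
                // Fail loudly on a misconfigured search path instead of skipping it.
                throw new IllegalArgumentException("Unknown config source: " + name);
            }
            Optional<String> hit = source.read(resource);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // In-memory stand-ins for the real backends (assumption for the sketch).
        Map<String, String> local = Map.of("stopwords-en.txt", "local copy");
        Map<String, String> zk = Map.of(
                "stopwords-en.txt", "zk copy",
                "schema.xml", "zk schema");

        Map<String, ConfigSource> sources = new HashMap<>();
        sources.put("local", r -> Optional.ofNullable(local.get(r)));
        sources.put("zk", r -> Optional.ofNullable(zk.get(r)));

        // "local" first: the local file is preferred even though ZK has one too.
        System.out.println(resolve(List.of("local", "zk"), sources, "stopwords-en.txt"));
        // Only in ZK: resolution falls through to the second entry.
        System.out.println(resolve(List.of("local", "zk"), sources, "schema.xml"));
        // Reversed order gives today's ZK-first behaviour.
        System.out.println(resolve(List.of("zk", "local"), sources, "stopwords-en.txt"));
    }
}
```

The point of the sketch is only that the order becomes data (a list the deployer configures) rather than logic baked into the loader, which should also make it easier to log which source a given file was actually served from.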