On Wed, Nov 17, 2021 at 8:12 PM Jan Høydahl <jan....@cominvent.com> wrote:
> I think your VOTE is premature as several design decisions have obviously
> not landed. That may be the reason there are no votes yet, and I'm not
> going to vote either.
>
> 1. If "-Dsolr.node.roles" parameter is not passed, it is implicitly
> assumed to be "-Dsolr.node.roles=data" (due to backcompat reasons and also
> so that those who don't use the role feature don't need any extra
> parameters).
>
> I'm not sold on making such a complex rule for what roles are enabled and
> treating data role differently from other roles.
>

As I've said before, we can't say "all roles are on by default", since we
can't foresee the future and decide today whether a future role should be
enabled by default. As of today, we have two roles, "data" and "overseer":
"data" is enabled by default on all nodes, and "overseer" (which stands for
preferred overseer) is disabled by default on all nodes. Hence, I mentioned
that if a node isn't started with the "solr.node.roles" parameter, we should
assume solr.node.roles=data.

> It's fine to require certain upgrade steps for 9.0.
>

Forcing everyone to explicitly use the -Dsolr.node.roles=data parameter to
start their nodes, irrespective of whether they want to use the roles
feature, doesn't seem like a reasonable idea.

> We should keep role config 1:1 and dead simple, i.e. WYSIWYG and no roles
> means all roles. Then handle back-compat in more targeted ways like we have
> done for certain features before such as HTTP1 vs HTTP2.
>
> - If a coordinator node is started with "data" role also, it fails to
> start up with a message indicating a node cannot both be coordinator and
> data node.
>
> Such custom complex rules don't make sense to me. If you want a single
> node to handle both data, zookeeper, overseer, coordination,
> streaming-expressions, sql, foo and bar, then fine, why block it?
>

The coordinator role's implementation is outside the scope of this SIP. I
propose that any future role (zk, coordination, sql, foo, bar...)
be free to choose its own implementation or constraints. We can discuss this
when those roles are introduced.

> Users will start in that mode and then separate out certain nodes for
> certain workloads as they grow their clusters.
>
> Jan
>
> On 15 Nov 2021, at 16:36, Ishan Chattopadhyaya
> <ichattopadhy...@gmail.com> wrote:
>
> Thanks Jan, I've updated the SIP document with all the applicable changes
> with a link to this thread (which contains the summary at the end).
> I'll initiate the vote thread now. Thanks to everyone for contributing.
>
> On Mon, Nov 15, 2021 at 6:53 PM Jan Høydahl <jan....@cominvent.com> wrote:
>
>> Thanks for trying to summarize and drive the work Ishan.
>>
>> I'd like to add
>>
>> *Scope of SIP*
>> Ishan: Role API and config
>> Jan: Role API, config, and impact of one real role e.g. the "data" role,
>> to exemplify and justify the role infrastructure
>>
>> According to the SIP process the next step is not implementation, but
>> rather to iterate the SIP text to something you believe would pass a vote.
>> It's hard to stitch together all these emails and mini summaries into a
>> meaningful whole.
>>
>> Jan
>>
>> On 15 Nov 2021, at 05:28, Ishan Chattopadhyaya
>> <ichattopadhy...@gmail.com> wrote:
>>
>> Thanks to everyone for the feedback.
>>
>> Here's an attempt to summarize broad topics discussed.
>>
>> *No negative roles*
>> Everyone agrees
>>
>> *Roles on/off by default?*
>> Jason+(Ilan,Houston?): All roles to be on by default
>> Gus,Ishan,Noble: Only those roles to be on by default that are needed for
>> backcompat
>>
>> *Which branch to target?*
>> Jan,Ishan,Noble: New feature to be added to 9x branch
>>
>> *Need for roles?*
>> Tim: new concept of node roles unnecessary since everything that's
>> proposed can be achieved using changes to new autoscaling framework and
>> replica placement plugins.
>> Ishan,Noble: A first class concept of roles is important so that this
>> functionality can be expected to work, irrespective of whatever custom
>> placement plugins users deploy (since placement plugins don't support
>> chaining).
>>
>> *Roles for collections?*
>> Ilan: Role aware collections
>> Ishan: This can be implemented separately later using node roles and
>> placement plugins.
>>
>> *Configuration*
>> Sysprops vs solr.xml+sysprops vs envvars:
>> Shawn: Solr.xml and/or envvars
>> Houston,Ilan: Sysprops and/or envvars
>> Ishan,Noble: Sysprops
>> Jan: SIP-11
>>
>> *Outstanding issues*
>> Shawn: Color of the bikeshed ;-)
>>
>> Please let me know if I missed something here. If there are no further
>> strong objections, we can proceed to the implementation phase. There's
>> already a draft/WIP PR in the works:
>> https://github.com/apache/solr/pull/403
>>
>> Thanks,
>> Ishan
>>
>> On Fri, Nov 12, 2021 at 11:38 PM Gus Heck <gus.h...@gmail.com> wrote:
>>
>>> Yeah we should only be looking for and only be reporting (if we choose
>>> to report to the user) a specific set of env variables. Anything else
>>> should be ignored. There should be an enum or constants somewhere listing
>>> what Solr cares about, and we should ignore or be blind to anything else.
>>>
>>> Perhaps we'd like to have a ConfigParams (or whatever) enum that has
>>> methods returning the env, sysprop, bin/solr arg, configFile and
>>> zkLocation that can be used to provide each possible configuration option
>>> (for things that are single value or short list, obviously an entire
>>> schema probably would not be settable by sysprop :) )?
>>>
>>> The return type of those methods could be Optional<> since we won't
>>> have all of those for everything any time soon, and not all of them will
>>> make sense in all cases.
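
To make the ConfigParams idea above a bit more concrete, here is a minimal
sketch. It is not taken from the SIP or from the PR: the enum constants, the
SOLR_NODE_ROLES env var name, the example.sysprop.only name and the
"sysprop before env var" order are all assumptions made for illustration.
It only covers env and sysprop; bin/solr arg, configFile and zkLocation are
left out.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

/** One constant per setting that would be read; anything not listed is never looked up. */
enum ConfigParam {
  // Hypothetical entries: SOLR_NODE_ROLES and example.sysprop.only are made up for this sketch.
  NODE_ROLES("SOLR_NODE_ROLES", "solr.node.roles"),
  SYSPROP_ONLY_EXAMPLE(null, "example.sysprop.only");

  private final String envName;     // null when no env var is defined for this setting
  private final String syspropName; // null when no sysprop is defined for this setting

  ConfigParam(String envName, String syspropName) {
    this.envName = envName;
    this.syspropName = syspropName;
  }

  /** Name of the env var for this setting, if one is defined. */
  Optional<String> env() {
    return Optional.ofNullable(envName);
  }

  /** Name of the sysprop for this setting, if one is defined. */
  Optional<String> sysprop() {
    return Optional.ofNullable(syspropName);
  }

  /** Looks up the value: sysprop first, then env var (this order is an assumption). */
  Optional<String> resolve(Properties props, Map<String, String> environment) {
    Optional<String> fromProps = sysprop().map(props::getProperty);
    if (fromProps.isPresent()) {
      return fromProps;
    }
    return env().map(environment::get);
  }

  public static void main(String[] args) {
    // Prints what each known setting resolves to in the current JVM.
    for (ConfigParam p : ConfigParam.values()) {
      System.out.println(p + " -> " + p.resolve(System.getProperties(), System.getenv()));
    }
  }
}
```

Because lookups only go through the enum, env vars that are not listed are
never read, which is in line with the "ignore or be blind to anything else"
point.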
>>>
>>> zkLocation is a bit tricky and nebulous since it's probably a zk path
>>> and a JSON path or XPath combined and relative to the chroot, which
>>> itself is a potential config param. Some stuff to think through there.
>>>
>>> On Thu, Nov 11, 2021 at 3:49 PM Ilan Ginzburg <ilans...@gmail.com>
>>> wrote:
>>>
>>>> Houston made a very valid comment back then on the placement plugin
>>>> support of environment variables (dropped as a consequence).
>>>>
>>>> https://issues.apache.org/jira/browse/SOLR-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286680#comment-17286680
>>>>
>>>> It could be possible to unintentionally leak node data that should be
>>>> kept secret if Solr is allowed to freely access (random?) environment
>>>> variables as part of configuration.
>>>>
>>>> Something to keep in mind.
>>>>
>>>> Ilan
>>>>
>>>> On Thu 11 Nov 2021 at 20:12, Eric Pugh <ep...@opensourceconnections.com>
>>>> wrote:
>>>>
>>>>> Agreed!
>>>>>
>>>>> I've noticed that in the Play Framework, you can configure everything
>>>>> via a property based configuration file, however it makes it easy to
>>>>> override the property file via another one, or via ENV variables:
>>>>>
>>>>> db.default.username="smui"
>>>>> db.default.username=${?SMUI_DB_USER}
>>>>>
>>>>> Which turns out to be very liberating!
>>>>>
>>>>> On Nov 11, 2021, at 2:09 PM, Jan Høydahl <jan....@cominvent.com>
>>>>> wrote:
>>>>>
>>>>> +1 to a roundup of env and props across the board. I think SIP 11 is
>>>>> on the track of something. But can be done independent of this.
>>>>>
>>>>> Jan Høydahl
>>>>>
>>>>> On 11 Nov 2021, at 17:44, Gus Heck <gus.h...@gmail.com> wrote:
>>>>>
>>>>> I guess all I mean is that it shouldn't be only sysprops. Enabling
>>>>> sysprops, env vars etc seems fine but we need to clearly document
>>>>> precedence among any/all options.
>>>>> What is convenient varies from case to case and in a perfect world
>>>>> what I'd like to see is full support across each style (files, zk,
>>>>> props, env vars) with consistent and obvious naming and well documented
>>>>> resolution order.
>>>>>
>>>>> What I don't like is a little bit of env vars for some stuff, props
>>>>> for others, files for yet more stuff and some unclear aggregation of
>>>>> that showing up in zk... (or maybe some of it not showing up anywhere
>>>>> code could check it...)
>>>>>
>>>>> On Thu, Nov 11, 2021 at 11:07 AM Houston Putman
>>>>> <houstonput...@gmail.com> wrote:
>>>>>
>>>>>> I agree with Jan, when thinking about making Solr as cloud friendly
>>>>>> as possible EnvVars and (to a lesser extent) sysProps are much
>>>>>> preferable to having a setting in the solr.xml.
>>>>>> This is because it's easier to customize EnvVars per-node, while
>>>>>> customizing a config file is much harder, as those tend to be static
>>>>>> and shared across a whole environment.
>>>>>>
>>>>>> Also thanks for linking that SIP Jan, very applicable.
>>>>>>
>>>>>> - Houston
>>>>>>
>>>>>> On Fri, Nov 5, 2021 at 5:19 PM Jan Høydahl <jan....@cominvent.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thinking of these roles as labels, I think sysProps and envVars are
>>>>>>> the two universal methods, and nothing wrong with that.
>>>>>>> I keep trying to think cloud native and container, so having
>>>>>>> excellent 1st class support for env.vars for such configs is a
>>>>>>> priority to me.
>>>>>>> Most tools, CI-environments etc have built-in support for env.vars,
>>>>>>> and so it makes sense to me.
>>>>>>>
>>>>>>> See
>>>>>>> https://cwiki.apache.org/confluence/display/SOLR/SIP-11+Uniform+cluster-level+configuration+API
>>>>>>> for some interesting ideas around cluster/node level config.
>>>>>>>
>>>>>>> On 5 Nov 2021, at 15:04, Gus Heck <gus.h...@gmail.com> wrote:
>>>>>>>
>>>>>>> Agree, better to use something other than sysprops.
>>>>>>> An arg in the start script would be friendlier than -D props, which
>>>>>>> generally are irritatingly verbose and expose too much
>>>>>>> implementation.
>>>>>>>
>>>>>>> We lack a config file per level. solr.xml does double duty as global
>>>>>>> and per-node depending on how it's used (zk or filesystem).
>>>>>>>
>>>>>>> Config file names are confusing too. Our file names are legacy of
>>>>>>> non-cloud mode I think, and we really should at some point (10.x?)
>>>>>>> rework configs to be cluster.xml, node.xml, collection.xml (formerly
>>>>>>> solrconfig.xml) and schema.xml (and maybe support something other
>>>>>>> than xml, but that's not nearly as important as clarity in naming,
>>>>>>> and having features)
>>>>>>>
>>>>>>> But this is all straying way off topic and should have its own SIP
>>>>>>> if someone seems to have time for it :)
>>>>>>>
>>>>>>> On Thu, Nov 4, 2021 at 6:07 PM Shawn Heisey <elyog...@elyograg.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On 11/4/21 2:51 PM, Noble Paul wrote:
>>>>>>>> > The SIP can be boiled down to the following
>>>>>>>> >
>>>>>>>> > * Tag a node with a label (role) using a system property
>>>>>>>> > * Use the placement plugin to whitelist/block list certain nodes
>>>>>>>> > * Publish the roles through an API
>>>>>>>>
>>>>>>>> In general, for Solr, do we like the idea of having things
>>>>>>>> controlled by system properties?
>>>>>>>>
>>>>>>>> I would think solr.xml would be the right place to configure this,
>>>>>>>> except that people can and probably do put solr.xml in zookeeper,
>>>>>>>> which would mean every system would have the SAME solr.xml, and
>>>>>>>> we're back to system properties as a way to customize solr.xml on
>>>>>>>> each system.
>>>>>>>>
>>>>>>>> I have never used system properties to configure Solr. When I
>>>>>>>> customize the config, I will often remove property substitutions
>>>>>>>> from it and go with explicit settings.
>>>>>>>> My general opinion about system properties is that if they're
>>>>>>>> going to be used, they should DIRECTLY configure the application,
>>>>>>>> not be sent in via property substitution in a config file.
>>>>>>>> I've never liked the way our default configs use that paradigm. It
>>>>>>>> means you cannot look at the config and know exactly how things are
>>>>>>>> configured, without finding out whether system properties have been
>>>>>>>> set.
>>>>>>>>
>>>>>>>> What color do others think that bikeshed should be painted?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Shawn
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>>>>>>>> For additional commands, e-mail: dev-h...@solr.apache.org
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> http://www.needhamsoftware.com (work)
>>>>>>> http://www.the111shift.com (play)
>>>>>>>
>>>>>
>>>>> _______________________
>>>>> Eric Pugh | Founder & CEO | OpenSource Connections, LLC
>>>>> | http://www.opensourceconnections.com
>>>>>
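
The defaulting rule discussed at the top of this thread (a node started
without the "solr.node.roles" parameter is treated as if it had been started
with solr.node.roles=data) can be sketched roughly as below. This is only an
illustration, not the code in the PR: the class and method names are made up,
and the comma-separated value format is an assumption, since the thread does
not spell out the syntax.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper; the names here are not from the Solr code base. */
final class NodeRolesSketch {
  static final String ROLES_PROP = "solr.node.roles"; // parameter name used in this thread
  static final String DEFAULT_ROLES = "data";         // back-compat default proposed above

  /** Parses a comma-separated role list (assumed format); null or blank means the default. */
  static Set<String> parseRoles(String raw) {
    String value = (raw == null || raw.trim().isEmpty()) ? DEFAULT_ROLES : raw;
    Set<String> roles = new LinkedHashSet<>();
    for (String part : value.split(",")) {
      String role = part.trim();
      if (!role.isEmpty()) {
        roles.add(role);
      }
    }
    return roles;
  }

  public static void main(String[] args) {
    // Without -Dsolr.node.roles=... on the command line this prints the default role set.
    System.out.println(parseRoles(System.getProperty(ROLES_PROP)));
  }
}
```

With this shape, nodes that never pass the parameter keep behaving as data
nodes, and a node started with, say, -Dsolr.node.roles=overseer gets exactly
the roles it lists.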