Re: First class support for node roles

Jan Høydahl Mon, 01 Nov 2021 09:19:55 -0700

I think it is safe to assume that small clusters, say 1-5 nodes, will most often 
want all features on all nodes, since the cluster is too small to 
specialize; for them the default is perfect. 
For large clusters we should recommend explicitly specifying roles during the 
9.0 upgrade. So if you have 100 nodes, you would likely have assigned the 
overseer role to a handful of nodes when upgrading to 9.0.
And for every new feature in 9.x you will explicitly decide whether to use it 
and which nodes should have the role.


But I assume that a new feature in 9.x that introduces a new role can also 
define its own alternative back-compat logic to support rolling restarts, if 
that is needed.
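
To make the "no role == all roles, explicit list == exactly those roles" rule 
concrete, here is a minimal sketch. It is not Solr's implementation: the role 
names in KNOWN_ROLES and the function resolve_roles are hypothetical, and the 
argument only mimics the value passed via -Dnode.roles.

```python
# Hypothetical set of built-in roles, for illustration only. The thread
# mentions "data", "overseer" and a possible "coordinator" role.
KNOWN_ROLES = frozenset({"data", "overseer", "coordinator"})


def resolve_roles(node_roles_param):
    """Return the effective roles for a node.

    node_roles_param mimics the value of -Dnode.roles (None if unset).
    """
    # Split the comma-separated list, ignoring blanks and letter case.
    raw = node_roles_param or ""
    roles = {part.strip().lower() for part in raw.split(",") if part.strip()}

    if not roles:
        # No role specified == all roles.
        return set(KNOWN_ROLES)

    unknown = roles - KNOWN_ROLES
    if unknown:
        # Fail fast on typos instead of silently ignoring them.
        raise ValueError(f"Unknown node role(s): {sorted(unknown)}")

    # Explicit list of roles == exactly those roles.
    return roles
```

With this rule, an unconfigured node in a small cluster gets every role, while 
a node started with an explicit list gets exactly that list and nothing else.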

Jan

> On 1 Nov 2021 at 17:00, Ishan Chattopadhyaya <[email protected]> wrote:
> 
> > Ilan: A node not having node.roles defined should be assumed to have all 
> > roles. Not only data. I don't see a reason to special case this one or any 
> > role.
> > Gus: There should be no "assumptions". Nothing to figure out. A node has a 
> > role or not. For back compatibility reasons, all roles would be assumed on 
> > startup if none specified. 
> > Jan: No role == all roles. Explicit list of roles = exactly those roles.
> 
> The problem with this approach is mainly backcompat. 
> 
> 1. Overseer backcompat:
> If we don't make any modifications to how overseer works and adopt this 
> approach (as quoted), then imagine this situation:
> 
> Solr1-100: No roles param (assumed to be "data,overseer").
> Solr101: -Dnode.roles=overseer (intention: dedicated overseer)
> 
> The user wants Solr101 to be a dedicated overseer, but for that to 
> happen, they would need to restart all the data nodes with 
> -Dnode.roles=data. This causes unnecessary disruption to running clusters 
> where a dedicated overseer is needed. Keep in mind that a user who needs a 
> dedicated overseer is likely in an emergency situation, and restarting the 
> whole cluster might not be viable for them.
> 
> 2. Future roles might not be compatible with this "assumed to have all roles" 
> idea:
> Take the proposed "zookeeper" role for example. Today, regular nodes are not 
> supposed to have embedded ZK running on them. By introducing this artificial 
> limitation ("assumed to have all roles"), we constrain adoption of all future 
> roles to necessarily require a full cluster restart.
> 
> Keep in mind that newer Solr versions can introduce new capabilities and roles. 
> Imagine a role that is defined in a new Solr version (with 
> functionality to go with that role), and a user upgrades to that version. 
> However, all of their nodes were started with no node.roles param. Hence, if 
> those nodes are "assumed to have all roles", then just by virtue of upgrading 
> to this new version, new capabilities will be turned on for the entire 
> cluster, whether or not the user opted in to them. This is totally 
> undesirable.
> 
> > Gus: I actually don't want a coordinator to do more work, I would prefer 
> > small focused roles with names that accurately describe their function. In 
> > that light, COORDINATOR might be too nebulous. How about AGGREGATOR role? 
> > (what I was thinking of would better be called a QUERY_ANALYSIS role)
> 
> If you want to do specific things like query analysis, query aggregation, or 
> bulk indexing, all of those can be done on COORDINATOR nodes (as is the 
> case in Elasticsearch). Having tens of "small focused roles" defined as 
> first class concepts would be confusing to the user. As a remedy to your 
> situation where you want the coordinator role to also do query analysis for 
> shards, one possible solution is to send such a query to a coordinator node 
> with a parameter like "coordinator.query_analysis=true", and then the 
> coordinator, instead of blindly hitting remote shards, also does some extra 
> work on behalf of the shards.
> 
> 
> On Mon, Nov 1, 2021 at 9:01 PM Ishan Chattopadhyaya 
> <[email protected] <mailto:[email protected]>> wrote:
> > If we make collections role-aware for example (replicas of that collection 
> > can only be
> > placed on nodes with a specific role, in addition to the other role based 
> > constraints),
> > the set of roles should be user extensible and not fixed.
> > If collections are not role aware, the constraints introduced by roles 
> > apply to all collections
> > equally which might be insufficient if a user needs for example a heavily 
> > used collection to
> > only be placed on more powerful nodes.
> 
> I feel node roles and role-aware collections are orthogonal topics. What you 
> describe above can be achieved by the autoscaling+replica placement framework 
> where the placement plugins take the node roles as one of the inputs.
> 
> > It does impact the design from early on: the set of roles need to be 
> > expandable by a user
> > by creating a collection with new roles for example (consumed by placement 
> > plugins) and be
> > able to start nodes with new (arbitrary) roles. Should such roles follow 
> > some naming syntax to 
> > differentiate them from built in roles? To be able to fail on typos on 
> > roles - that otherwise can be
> > crippling and hard to debug. This implies in any case that the current 
> > design can't assume all
> > roles are known at compile time or define them in a Java enum.
> 
> I think this should be achieved by something different from roles. Something 
> like node labels (user defined) which can then be used in a replica placement 
> plugin to assign replicas. I see roles as more closely associated with kinds 
> of functionality a node is designated for. Therefore, I feel that replica 
> placements and user defined node labels are out of scope for this SIP. They can 
> be added later in a separate SIP, without being at odds with this proposal.
> 
> 
> On Mon, Nov 1, 2021 at 8:42 PM Jan Høydahl <[email protected] 
> <mailto:[email protected]>> wrote:
> 
> 
> > On 1 Nov 2021 at 14:46, Ilan Ginzburg <[email protected] 
> > <mailto:[email protected]>> wrote:
> > A node not having node.roles defined should be assumed to have all roles. 
> > Not only data. I don't see a reason to special case this one or any role.
> 
> +1, make it simple and transparent. No role == all roles. Explicit list of 
> roles = exactly those roles.
> 
> > (Gus) See my comment above, but maybe preference is something handled as a 
> > feature of the role rather than via role designation? 
> 
> Yea, we always need an overseer, so that feature can decide to use its list 
> of nodes as a preference if it so chooses.
> 
> 
> Aside: I think it makes it easier if we always prefix Solr env.vars and 
> sys.props with "SOLR_" or "solr.", i.e. -Dsolr.node.roles=foo. That way we 
> can get away from having to have explicit code in bin/solr, bin/solr.cmd and 
> SolrCLI to manage every single property. Instead we can parse all ENVs and 
> Props with the solr prefix in our bootstrap code. And we can by convention 
> allow e.g. docker run -e SOLR_NODE_ROLES=foo solr:9 and it would be the same 
> thing...
> 
> Jan
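
The prefix convention suggested in the aside above could be sketched roughly 
as follows. This is only an illustration, not Solr's bootstrap code: the 
function solr_props_from_env is hypothetical, and the mapping rule (lowercase 
the name, turn underscores into dots) is an assumption that would not 
round-trip property names that themselves contain underscores.

```python
import os


def solr_props_from_env(environ):
    """Map SOLR_* environment variables to solr.* property names.

    Assumed rule: SOLR_NODE_ROLES -> solr.node.roles. Variables without
    the SOLR_ prefix are ignored.
    """
    props = {}
    for name, value in environ.items():
        if name.startswith("SOLR_"):
            props[name.lower().replace("_", ".")] = value
    return props


if __name__ == "__main__":
    # Print whatever SOLR_* variables are set in the current environment.
    for key, value in sorted(solr_props_from_env(os.environ).items()):
        print(f"{key}={value}")
```

The point of the convention is that one generic loop like this replaces 
per-property handling in each startup script.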