Re: First class support for node roles

Ishan Chattopadhyaya Tue, 02 Nov 2021 17:14:19 -0700

Also, in a cluster where new collections/shards/replicas are continuously
being added, it would be pretty awkward to start a node (in regular mode),
briefly have it become eligible for replica assignment, and only then invoke
a replica placement rule/autoscaling policy to keep replicas off it. Starting
the node with a defined role (as a startup param) instead precludes that
brief period of eligibility for replica placement on such a node.
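
To make the timing argument concrete, here is a rough sketch. Python is used
purely for illustration; none of this is Solr code, and the names
`effective_roles`, `eligible_for_replicas` and `DEFAULT_ROLES` are made up.
It assumes the default argued for in this thread: a node started without
node.roles is treated as having only the "data" role.

```python
# Hypothetical sketch, not Solr code: a role fixed at startup decides
# replica eligibility from the moment the node joins the cluster.

DEFAULT_ROLES = frozenset({"data"})  # assumed default when node.roles is absent


def effective_roles(node_roles_param):
    """Parse a -Dnode.roles style value such as "overseer" or "data,overseer"."""
    if node_roles_param is None or not node_roles_param.strip():
        return DEFAULT_ROLES
    return frozenset(r.strip() for r in node_roles_param.split(",") if r.strip())


def eligible_for_replicas(node_roles_param):
    """A node may receive replicas only if it carries the "data" role."""
    return "data" in effective_roles(node_roles_param)


# Live nodes mapped to the roles param each was started with (None = not set).
live_nodes = {"solr1": None, "solr2": "data", "solr101": "overseer"}
placement_candidates = sorted(
    name for name, param in live_nodes.items() if eligible_for_replicas(param)
)
print(placement_candidates)
```

In this sketch solr101 never appears among the placement candidates, so there
is no window in which a replica could land on it before a placement rule is
applied.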


On Wed, Nov 3, 2021 at 5:39 AM Ishan Chattopadhyaya <
[email protected]> wrote:

> If we were to tell users how to do "scatter gather on an empty node", *how
> exactly* would you recommend users have an empty node to begin with?
> Wouldn't you say something like "for 8x you can do this (rule based replica
> placement) or do that (autoscaling), but for 9x you do this new thing".
> Having a node that doesn't have a data role seems like a consistent and an
> elegant way for users to invoke such a functionality and also easily relate
> to a broad concept, without having to deal with autoscaling frameworks of
> the ancient past, medieval past or the future.
>
> On Wed, Nov 3, 2021 at 5:29 AM Timothy Potter <[email protected]>
> wrote:
>
>> As opposed to what? Looking up the configset for the addressed
>> collection and pulling whatever information it needs from cached data.
>> I'm sure there are some nuances but I hardly think you need a node
>> role framework to deal with determining the unique key field to do
>> scatter gather on an empty node when you have easy access to
>> collection metadata.
>>
>> Doesn't seem like a hard thing to overcome to me.
>>
>> On Tue, Nov 2, 2021 at 5:49 PM Noble Paul <[email protected]> wrote:
>> >
>> >
>> >
>> > On Wed, Nov 3, 2021, 10:46 AM Timothy Potter <[email protected]>
>> wrote:
>> >>
>> >> I'm not missing the point of the query coordinator, but I actually
>> >> didn't realize that an empty Solr node would forward the top-level
>> >> request onward instead of just being the query controller itself? That
>> >> actually seems like a bug vs. a feature, IMO any node that receives
>> >> the top-level query should just be the coordinator, what stops it?
>> >
>> >
>> > To process a request there should be a core that uses the same
>> configset as the requested collection.
>> >>
>> >>
>> >> Anyway, it sounds to me like you guys have your minds made up
>> >> regardless of feedback.
>> >>
>> >> Btw ~ I only mentioned the Zookeeper part b/c it's in your SIP as a
>> >> specific role, not sure why you took that as me wanting to discuss the
>> >> embedded ZK in your SIP?
>> >>
>> >> On Tue, Nov 2, 2021 at 5:13 PM Ishan Chattopadhyaya
>> >> <[email protected]> wrote:
>> >> >
>> >> > Hi Tim,
>> >> > Here are my responses inline.
>> >> >
>> >> > On Wed, Nov 3, 2021 at 3:22 AM Timothy Potter <[email protected]>
>> wrote:
>> >> >>
>> >> >> I'm just not convinced this feature is even needed and the SIP is
>> not
>> >> >> convincing that "There is no proper alternative today."
>> >> >
>> >> >
>> >> > There are no proper alternatives today, just hacks. On 8x, we have
>> >> > two different deprecated frameworks to stop replicas from being placed
>> >> > on a node (1. rule based replica placement, 2. autoscaling framework).
>> >> > On 9x, we have a new autoscaling framework, which I don't even think is
>> >> > fully implemented. And there's definitely no way to have a node act as
>> >> > a query coordinator without having data on it.
>> >> >
>> >> >>
>> >> >>
>> >> >> 1) Just b/c Elastic and Vespa have a concept of node roles, doesn't
>> >> >> mean Solr needs this.
>> >> >
>> >> >
>> >> > Solr needs this. That Elastic has such concepts is a coincidence, and
>> >> > also means we have an opportunity to catch up with them; they have
>> >> > these concepts for a reason.
>> >> >
>> >> >>
>> >> >> Also, some of Elastic's roles overlap with
>> >> >> concepts Solr already has in a different form, i.e data_hot sounds
>> >> >> like NRT and data_warm sounds a lot like our Pull Replica Type
>> >> >
>> >> >
>> >> > I think that is beyond the scope of this SIP.
>> >> >
>> >> >>
>> >> >>
>> >> >> 2) You can achieve the "coordinator" role with auto-scaling rules
>> >> >> pre-9.x and with the AffinityPlacementPlugin (heck, it even has a
>> >> >> node type built in:
>> >> >> .requestNodeSystemProperty(AffinityPlacementConfig.NODE_TYPE_SYSPROP)).
>> >> >> Simply build your replica placement rules such that no replicas land
>> >> >> on "coordinator" nodes. And you can route queries using node.sysprop
>> >> >> already using shards.preference.
>> >> >
>> >> >
>> >> > I think you missed the whole point of the query coordinator. Please
>> refer to this https://issues.apache.org/jira/browse/SOLR-15715.
>> >> > Let me summarize the main difference between what (I think) you
>> refer to and what is proposed in SOLR-15715.
>> >> >
>> >> > With your suggestion, we'll have a node that doesn't host any
>> replicas. And you suggest queries landing on such nodes be routed using
>> shards.preference? Well, in such a case, these queries will be
>> forwarded/proxied to a random node hosting a replica of the collection and
>> that node then acts as the coordinator. This situation is no better than
>> sending the query directly to that particular node.
>> >> >
>> >> > What is proposed in SOLR-15715 is a query aggregation functionality.
>> >> > There will be pseudo replicas (aware of the configset) on this
>> >> > coordinator node that handle the request themselves, send shard
>> >> > requests to data hosting replicas, collect the responses, merge them,
>> >> > and send the result back to the user. This merge step is usually
>> >> > extremely memory intensive, and it would be good to serve these off
>> >> > stateless nodes (that host no data).
>> >> >
>> >> >>
>> >> >>
>> >> >> 3) Dedicated overseer role? I thought we were removing the
>> overseer?!?
>> >> >> Also, we already have the ability to run the overseer on specific
>> >> >> nodes w/o a new framework, so this doesn't really convince me we
>> need
>> >> >> a new framework.
>> >> >
>> >> >
>> >> > There's absolutely no change proposed to the "overseer" role. What
>> users need on production clusters are nodes dedicated for overseer
>> operations, and for that the current "overseer" role suffices, together
>> with some functionality to not place replicas on such nodes.
>> >> >
>> >> >>
>> >> >>
>> >> >> 4) We will indeed need to decide which nodes host embedded
>> >> >> Zookeepers
>> >> >> but I'd argue that solution hasn't been designed entirely and we
>> >> >> probably don't need a formal node role framework to determine which
>> >> >> nodes host embedded ZKs. Moreover, embedded ZK seems more like a
>> small
>> >> >> cluster thing and anyone running a large cluster will probably have
>> a
>> >> >> dedicated ZK ensemble as they do today. The node role thing seems
>> like
>> >> >> it's intended for large clusters and my gut says few will use
>> embedded
>> >> >> ZK for large clusters.
>> >> >
>> >> >
>> >> > This SIP is not the right place for this discussion. There's a
>> separate SIP for this.
>> >> >
>> >> >>
>> >> >>
>> >> >> 5) You can also achieve a lot of "node role" functionality in query
>> >> >> routing using the shards.preference parameter.
>> >> >>
>> >> >
>> >> > That doesn't solve the purpose behind
>> https://issues.apache.org/jira/browse/SOLR-15715.
>> >> >
>> >> >>
>> >> >> At the very least, the SIP needs to list specific use cases that
>> >> >> require this feature that are not achievable with the current
>> features
>> >> >> before getting bogged down in the impl. details.
>> >> >
>> >> >
>> >> > The coordinator role is the biggest motivation for introducing the
>> concept of roles. However, in addition to what is proposed in SOLR-15715, a
>> coordinator node can later on also be used as a node for users to run
>> streaming expressions on, do bulk indexing on (impl details for this to
>> come later, don't want distraction here).
>> >> >
>> >> >>
>> >> >>
>> >> >> Tim
>> >> >>
>> >> >> On Tue, Nov 2, 2021 at 3:20 PM Gus Heck <[email protected]> wrote:
>> >> >> >
>> >> >> > I think there are things not yet accounted for. Time I spent
>> yesterday is biting me today. Pls give a couple days.
>> >> >> >
>> >> >> > On Tue, Nov 2, 2021 at 11:28 AM Jason Gerlowski <
>> [email protected]> wrote:
>> >> >> >>
>> >> >> >> Hey Ishan,
>> >> >> >>
>> >> >> >> I appreciate you writing up the SIP!  Here's some
>> notes/questions I
>> >> >> >> had as I was reading through your writeup and this mail thread.
>> >> >> >> ("----" separators between thoughts, hopefully that helps.)
>> >> >> >>
>> >> >> >> ----
>> >> >> >>
>> >> >> >> I'll add my vote to what Jan, Gus, Ilan, and Houston already
>> >> >> >> suggested: roles should default to "all-on".  I see the downsides
>> >> >> >> you're worried about with that approach (esp. around
>> 'overseer'), but
>> >> >> >> they may be mitigatable, at least in part.
>> >> >> >>
>> >> >> >> > [mail thread] User wants this node Solr101 to be a dedicated
>> overseer, but for that to happen, he/she would need to restart all the data
>> nodes with -Dnode.roles=data
>> >> >> >>
>> >> >> >> Sure, if roles can only be specified at startup.  But that may
>> be a
>> >> >> >> self-imposed constraint.
>> >> >> >>
>> >> >> >> An API to change a node's roles would remove the need for a
>> restart
>> >> >> >> and make it easy for users to affect the semantics they want.
>> You
>> >> >> >> decided you want a dedicated overseer N nodes into your cluster
>> >> >> >> deployment?  Deploy node 'N' with the 'overseer', and toggle the
>> >> >> >> overseer role off on the remainder.
>> >> >> >>
>> >> >> >> Now, I understand that you don't want roles to change at
>> runtime, but
>> >> >> >> I haven't seen you get much into "why", beyond saying "it is very
>> >> >> >> risky to have nodes change roles while they are up and
>> running."  Can
>> >> >> >> you expand a bit on the risks you're worried about?  If you're
>> >> >> >> explicit about them here maybe someone can think of a clever way
>> to
>> >> >> >> address them?
>> >> >> >>
>> >> >> >> > Hence, if those nodes are "assumed to have all roles", then
>> just by virtue of upgrading to this new version, new capabilities will be
>> turned on for the entire cluster, whether or not the user opted for such a
>> capability. This is totally undesirable.
>> >> >> >>
>> >> >> >> Obviously "roles" refer to much bigger chunks of functionality
>> than
>> >> >> >> usual, so in a sense defaulting roles on is scarier.  But in a
>> sense
>> >> >> >> you're describing something that's an inherent part of software
>> >> >> >> releases.  Releases expose new features that are typically on by
>> >> >> >> default.  A new default-on role in 9.1 might hurt a user, but
>> there's
>> >> >> >> no fundamental difference between that and a change to backups or
>> >> >> >> replication or whatever in the same release.
>> >> >> >>
>> >> >> >> I don't mean to belittle the difference in scope - I get your
>> concern.
>> >> >> >> But IMO this is something to address with good release notes and
>> >> >> >> documentation.  Designing for admins who don't do even cursory
>> >> >> >> research before an upgrade ties both our hands behind our back
>> as a
>> >> >> >> project.
>> >> >> >>
>> >> >> >> ----
>> >> >> >>
>> >> >> >> > [SIP] Internal representation in ZK ... Implementation details
>> like these can be fleshed out in the PR
>> >> >> >>
>> >> >> >> IMO this is important enough to flesh out as part of the SIP, at
>> >> >> >> least in broad strokes.  It affects backcompat, SolrJ client design,
>> >> >> >> etc.
>> >> >> >>
>> >> >> >> ----
>> >> >> >>
>> >> >> >> > [SIP] GET /api/cluster/roles?node=node1
>> >> >> >>
>> >> >> >> Woohoo - way to include a v2 API definition!
>> >> >> >>
>> >> >> >> AFAIR, the v2 API has a /nodes path defined - I wonder whether
>> "GET
>> >> >> >> /nodes/someNode/roles" wouldn't be a more intuitive endpoint for
>> the
>> >> >> >> "get the roles this node has" functionality.  Though I leave
>> that for
>> >> >> >> your consideration.
>> >> >> >>
>> >> >> >> ----
>> >> >> >>
>> >> >> >> Looking forward to your responses and seeing the SIP progress!
>> It's a
>> >> >> >> really cool, promising idea IMO.
>> >> >> >>
>> >> >> >> Best,
>> >> >> >>
>> >> >> >> Jason
>> >> >> >>
>> >> >> >> On Tue, Nov 2, 2021 at 11:21 AM Ishan Chattopadhyaya
>> >> >> >> <[email protected]> wrote:
>> >> >> >> >
>> >> >> >> > Are there any unaddressed outstanding concerns that we should
>> hold up the SIP for?
>> >> >> >> >
>> >> >> >> > On Mon, 1 Nov, 2021, 10:31 pm Ishan Chattopadhyaya, <
>> [email protected]> wrote:
>> >> >> >> >>>
>> >> >> >> >>> >> Agree. However, I disagree with ideas where "query
>> analysis" has a role of its own. Where would that lead us to? Separate
>> roles for
>> >> >> >> >>>
>> >> >> >> >>> >> nodes that do "faceting" or "spell correction" etc.? But
>> anyway, that is for discussion when we add future roles. This is beyond
>> this SIP.
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> > I am not asking you to implement every possible role of
>> course :). As a note I know a company that is running an entire separate
>> >> >> >> >> > cluster to offload and better serve highlighting on a
>> subset of large docs, so YES I think there are people who may want such
>> fine grained control.
>> >> >> >> >>
>> >> >> >> >> Cool, I think we can discuss adding any additional roles (for
>> highlighting?) on a case by case basis at a later point.
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> On Mon, Nov 1, 2021 at 10:25 PM Ishan Chattopadhyaya <
>> [email protected]> wrote:
>> >> >> >> >>>
>> >> >> >> >>> > Boiling it down the idea I'm proposing is that roles
>> required for back compatibility get explicitly added on startup, if not by
>> the user then by the code. This is more flexible than assuming that no role
>> means every role, because then every new feature that has a role will end
>> up on legacy clusters which are also not back compatible.
>> >> >> >> >>>
>> >> >> >> >>> +1, I totally agree. I even said so, when I said: "This is
>> why I was advocating that 1) we assume the "data" as a default, 2) not
>> assume overseer to be implicitly defined (because of the way overseer role
>> is written today), 3) not assume any future roles to be true by default."
>> >> >> >> >>>
>> >> >> >> >>> So, basically, I'm proposing that the "roles required for
>> back compatibility" (that should be explicitly added on startup) be just
>> the ["data"] role, and not the "overseer" role (due to the way overseer
>> role is currently defined, i.e. it is "preferred overseer").
>> >> >> >> >>>
>> >> >> >> >>> On Mon, Nov 1, 2021 at 10:19 PM Gus Heck <[email protected]>
>> wrote:
>> >> >> >> >>>>
>> >> >> >> >>>> Very sorry, I don't mean to sound offended. Frustrated yes,
>> >> >> >> >>>> offended no :)... the most difficult thing about communication
>> >> >> >> >>>> is the illusion it has occurred :)
>> >> >> >> >>>>
>> >> >> >> >>>> If you read back just a few emails you'll see where I talk
>> about roles being applied on startup. Boiling it down the idea I'm
>> proposing is that roles required for back compatibility get explicitly
>> added on startup, if not by the user then by the code. This is more
>> flexible than assuming that no role means every role, because then every
>> new feature that has a role will end up on legacy clusters which are also
>> not back compatible.
>> >> >> >> >>>>
>> >> >> >> >>>> There are points where I said all roles rather than back
>> compatibility roles because I was thinking about back compatibility
>> specifically, but you can't know that if I don't say that can you :).
>> >> >> >> >>>>
>> >> >> >> >>>> On Mon, Nov 1, 2021 at 12:39 PM Ishan Chattopadhyaya <
>> [email protected]> wrote:
>> >> >> >> >>>>>
>> >> >> >> >>>>> > If you read more closely, my way can provide full back
>> compatibility. To say or imply it doesn't isn't helping. Perhaps you need
>> to re-read?
>> >> >> >> >>>>>
>> >> >> >> >>>>> I understand e-mails are frustrating, and I'm trying my
>> best. Please don't be offended, and kindly point me to the exact part you
>> want me to re-read.
>> >> >> >> >>>>>
>> >> >> >> >>>>> On Mon, Nov 1, 2021 at 10:05 PM Gus Heck <
>> [email protected]> wrote:
>> >> >> >> >>>>>>
>> >> >> >> >>>>>>
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> On Mon, Nov 1, 2021 at 12:22 PM Ishan Chattopadhyaya <
>> [email protected]> wrote:
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> >    Positive - They denote the existence of a capability
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> Agree, the SIP already reflects this.
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> >   Absolute - Absence/Presence binary identification of
>> a capability; no implications, no assumptions
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> Disagree, we need backcompat handling on nodes running
>> >> >> >> >>>>>>> without any roles. There has to be an implicit assumption as
>> >> >> >> >>>>>>> to what roles those nodes are assumed to have. My proposal is
>> >> >> >> >>>>>>> that only the "data" role be assumed, but not the "overseer"
>> >> >> >> >>>>>>> role. For any future roles ("coordinator", "zookeeper" etc.),
>> >> >> >> >>>>>>> the decision as to what the absence of any role implies
>> >> >> >> >>>>>>> should be left to the implementation of that future role.
>> >> >> >> >>>>>>> Documentation should clearly state these implicit assumptions.
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> If you read more closely, my way can provide full back
>> compatibility. To say or imply it doesn't isn't helping. Perhaps you need
>> to re-read?
>> >> >> >> >>>>>>
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> >    Focused - Do one thing per role
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> Agree. However, I disagree with ideas where "query
>> analysis" has a role of its own. Where would that lead us to? Separate
>> roles for nodes that do "faceting" or "spell correction" etc.? But anyway,
>> that is for discussion when we add future roles. This is beyond this SIP.
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> I am not asking you to implement every possible role of
>> course :). As a note I know a company that is running an entire separate
>> cluster to offload and better serve highlighting on a subset of large docs,
>> so YES I think there are people who may want such fine grained control.
>> >> >> >> >>>>>>
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> >    Accessible - It should be dead simple to determine
>> the members of a role, avoid parsing blobs of json, avoid calculating
>> implications, avoid consulting other resources after listing nodes with the
>> role
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> Agree. I'm open to any implementation details that make
>> it easy. There should be a reasonable API to return these node roles, with
>> ability to filter by role or filter by node.
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> >    Independent - One role should not require other
>> roles to be present
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> Do we need to have this hard and fast requirement
>> upfront? There might be situations where this is desirable. I feel we can
>> discuss on a case by case basis whenever a future role is added.
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> >    Persistent - roles should not be lost across reboot
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> Agree.
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> >    Immutable - roles should not change while the node
>> is running
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> Agree
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> >    Lively - A node with a capability may not be
>> presently providing that capability.
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> I don't understand, can you please elaborate?
>> >> >> >> >>>>>>
>> >> >> >> >>>>>>
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> Specifically imagine the case where there are 100 nodes:
>> >> >> >> >>>>>> 1-100 ==> DATA
>> >> >> >> >>>>>> 101-103 ==> OVERSEER
>> >> >> >> >>>>>> 104-106 ==> ZOOKEEPER
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> But you won't have 3 overseers... you'll want only one of
>> those to be providing overseer functionality and the other two to be
>> capable, but not providing (so that if the current overseer goes down a new
>> one can be assigned).
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> Then you decide you'd like 5 Zookeepers. You start nodes
>> >> >> >> >>>>>> 107-108 with that role, but you probably want to ensure that
>> >> >> >> >>>>>> zookeepers require some sort of command for them to actually
>> >> >> >> >>>>>> join the zookeeper cluster (i.e.
>> >> >> >> >>>>>> /admin?action=ZKADD&nodes=node107,node18) ... to do that the
>> >> >> >> >>>>>> nodes need to be up. But oh look, I typoed 108... we want that
>> >> >> >> >>>>>> to fail... how? Because 18 does not have the capability to
>> >> >> >> >>>>>> become a zookeeper.
>> >> >> >> >>>>>>
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> On Mon, Nov 1, 2021 at 9:30 PM Ishan Chattopadhyaya <
>> [email protected]> wrote:
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> > Ilan: A node not having node.roles defined should be
>> assumed to have all roles. Not only data. I don't see a reason to special
>> case this one or any role.
>> >> >> >> >>>>>>>> > Gus: There should be no "assumptions" Nothing to
>> figure out. A node has a role or not. For back compatibility reasons, all
>> roles would be assumed on startup if none specified.
>> >> >> >> >>>>>>>> > Jan: No role == all roles. Explicit list of roles =
>> exactly those roles.
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> Problem with this approach is mainly to do with
>> backcompat.
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> 1. Overseer backcompat:
>> >> >> >> >>>>>>>> If we don't make any modifications to how overseer
>> works and adopt this approach (as quoted), then imagine this situation:
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> Solr1-100: No roles param (assumed to be
>> "data,overseer").
>> >> >> >> >>>>>>>> Solr101: -Dnode.roles=overseer (intention: dedicated
>> overseer)
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> User wants this node Solr101 to be a dedicated
>> overseer, but for that to happen, he/she would need to restart all the data
>> nodes with -Dnode.roles=data. This will cause unnecessary disruption to
>> running clusters where a dedicated overseer is needed. Keep in mind, if a
>> user needs a dedicated overseer, he's likely in an emergency situation and
>> restarting the whole cluster might not be viable for him/her.
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> 2. Future roles might not be compatible with this
>> "assumed to have all roles" idea:
>> >> >> >> >>>>>>>> Take the proposed "zookeeper" role for example. Today,
>> regular nodes are not supposed to have embedded ZK running on them. By
>> introducing this artificial limitation ("assumed to have all roles"), we
>> constrain adoption of all future roles to necessarily require a full
>> cluster restart.
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> Keep in mind newer Solr versions can introduce new
>> capabilities and roles. Imagine we have a role that is defined in a new
>> Solr version (and there's functionality to go with that role), and user
>> upgrades to that version. However, his/her nodes all were started with no
>> node.roles param. Hence, if those nodes are "assumed to have all roles",
>> then just by virtue of upgrading to this new version, new capabilities will
>> be turned on for the entire cluster, whether or not the user opted for such
>> a capability. This is totally undesirable.
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> > Gus: I actually don't want a coordinator to do more
>> >> >> >> >>>>>>>> > work, I would prefer small focused roles with names that
>> >> >> >> >>>>>>>> > accurately describe their function. In that light,
>> >> >> >> >>>>>>>> > COORDINATOR might be too nebulous. How about AGGREGATOR
>> >> >> >> >>>>>>>> > role? (what I was thinking of would better be called a
>> >> >> >> >>>>>>>> > QUERY_ANALYSIS role)
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> If you want to do specific things like query analysis
>> >> >> >> >>>>>>>> or query aggregation or bulk indexing etc, all of those can
>> >> >> >> >>>>>>>> be done on COORDINATOR nodes (as is the case in
>> >> >> >> >>>>>>>> ElasticSearch). Having tens of "small focused roles"
>> >> >> >> >>>>>>>> defined as first class concepts would be confusing to the
>> >> >> >> >>>>>>>> user. As a remedy to your situation where you want the
>> >> >> >> >>>>>>>> coordinator role to also do query-analysis for shards, one
>> >> >> >> >>>>>>>> possible solution is to send such a query to a coordinator
>> >> >> >> >>>>>>>> node with a parameter like
>> >> >> >> >>>>>>>> "coordinator.query_analysis=true", and then the
>> >> >> >> >>>>>>>> coordinator, instead of blindly hitting remote shards, also
>> >> >> >> >>>>>>>> does some extra work on behalf of the shards.
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> On Mon, Nov 1, 2021 at 9:01 PM Ishan Chattopadhyaya <
>> [email protected]> wrote:
>> >> >> >> >>>>>>>>>
>> >> >> >> >>>>>>>>> > If we make collections role-aware for example
>> (replicas of that collection can only be
>> >> >> >> >>>>>>>>> > placed on nodes with a specific role, in addition to
>> the other role based constraints),
>> >> >> >> >>>>>>>>> > the set of roles should be user extensible and not
>> fixed.
>> >> >> >> >>>>>>>>> > If collections are not role aware, the constraints
>> introduced by roles apply to all collections
>> >> >> >> >>>>>>>>> > equally which might be insufficient if a user needs
>> for example a heavily used collection to
>> >> >> >> >>>>>>>>> > only be placed on more powerful nodes.
>> >> >> >> >>>>>>>>>
>> >> >> >> >>>>>>>>> I feel node roles and role-aware collections are
>> orthogonal topics. What you describe above can be achieved by the
>> autoscaling+replica placement framework where the placement plugins take
>> the node roles as one of the inputs.
>> >> >> >> >>>>>>>>>
>> >> >> >> >>>>>>>>> > It does impact the design from early on: the set of
>> roles need to be expandable by a user
>> >> >> >> >>>>>>>>> > by creating a collection with new roles for example
>> (consumed by placement plugins) and be
>> >> >> >> >>>>>>>>> > able to start nodes with new (arbitrary) roles.
>> Should such roles follow some naming syntax to
>> >> >> >> >>>>>>>>> > differentiate them from built in roles? To be able
>> to fail on typos on roles - that otherwise can be
>> >> >> >> >>>>>>>>> > crippling and hard to debug. This implies in any
>> case that the current design can't assume all
>> >> >> >> >>>>>>>>> > roles are known at compile time or define them in a
>> Java enum.
>> >> >> >> >>>>>>>>>
>> >> >> >> >>>>>>>>> I think this should be achieved by something different
>> >> >> >> >>>>>>>>> from roles. Something like node labels (user defined) which
>> >> >> >> >>>>>>>>> can then be used in a replica placement plugin to assign
>> >> >> >> >>>>>>>>> replicas. I see roles as more closely associated with kinds
>> >> >> >> >>>>>>>>> of functionality a node is designated for. Therefore, I
>> >> >> >> >>>>>>>>> feel that replica placements and user defined node labels
>> >> >> >> >>>>>>>>> are out of scope for this SIP. They can be added later in a
>> >> >> >> >>>>>>>>> separate SIP, without being at odds with this proposal.
>> >> >> >> >>>>>>>>>
>> >> >> >> >>>>>>>>>
>> >> >> >> >>>>>>>>>
>> >> >> >> >>>>>>>>>
>> >> >> >> >>>>>>>>>
>> >> >> >> >>>>>>>>>
>> >> >> >> >>>>>>>>> On Mon, Nov 1, 2021 at 8:42 PM Jan Høydahl <
>> [email protected]> wrote:
>> >> >> >> >>>>>>>>>>
>> >> >> >> >>>>>>>>>>
>> >> >> >> >>>>>>>>>>
>> >> >> >> >>>>>>>>>> > 1. nov. 2021 kl. 14:46 skrev Ilan Ginzburg <
>> [email protected]>:
>> >> >> >> >>>>>>>>>> > A node not having node.roles defined should be
>> assumed to have all roles. Not only data. I don't see a reason to special
>> case this one or any role.
>> >> >> >> >>>>>>>>>>
>> >> >> >> >>>>>>>>>> +1, make it simple and transparent. No role == all
>> roles. Explicit list of roles = exactly those roles.
>> >> >> >> >>>>>>>>>>
>> >> >> >> >>>>>>>>>> > (Gus) See my comment above, but maybe preference is
>> something handled as a feature of the role rather than via role designation?
>> >> >> >> >>>>>>>>>>
>> >> >> >> >>>>>>>>>> Yea, we always need an overseer, so that feature can
>> decide to use its list of nodes as a preference if it so chooses.
>> >> >> >> >>>>>>>>>>
>> >> >> >> >>>>>>>>>>
>> >> >> >> >>>>>>>>>> Aside: I think it makes it easier if we always prefix
>> >> >> >> >>>>>>>>>> Solr env.vars and sys.props with "SOLR_" or "solr.", i.e.
>> >> >> >> >>>>>>>>>> -Dsolr.node.roles=foo. That way we can get away from having
>> >> >> >> >>>>>>>>>> to have explicit code in bin/solr, bin/solr.cmd and SolrCLI
>> >> >> >> >>>>>>>>>> to manage every single property. Instead we can parse all
>> >> >> >> >>>>>>>>>> ENVs and Props with the solr prefix in our bootstrap code.
>> >> >> >> >>>>>>>>>> And we can by convention allow e.g. docker run -e
>> >> >> >> >>>>>>>>>> SOLR_NODE_ROLES=foo solr:9 and it would be the same thing...
>> >> >> >> >>>>>>>>>>
>> >> >> >> >>>>>>>>>> Jan
>> >> >> >> >>>>>>>>>>
>> ---------------------------------------------------------------------
>> >> >> >> >>>>>>>>>> To unsubscribe, e-mail:
>> [email protected]
>> >> >> >> >>>>>>>>>> For additional commands, e-mail:
>> [email protected]
>> >> >> >> >>>>>>>>>>
>> >> >> >> >>>>>>
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> --
>> >> >> >> >>>>>> http://www.needhamsoftware.com (work)
>> >> >> >> >>>>>> http://www.the111shift.com (play)
>> >> >> >> >>>>
>> >> >> >> >>>>
>> >> >> >> >>>>
>> >> >> >> >>>> --
>> >> >> >> >>>> http://www.needhamsoftware.com (work)
>> >> >> >> >>>> http://www.the111shift.com (play)
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> > http://www.needhamsoftware.com (work)
>> >> >> > http://www.the111shift.com (play)
>> >> >>
>> >> >>
>> >> >>
>> >>
>> >>
>>
>>
>>
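
For readers following the quoted SOLR-15715 discussion: the aggregation step
described there (send shard requests, collect the responses, merge them) can
be sketched roughly as below. This is an illustration only, not how Solr
implements it. The function name `merge_top_k` and the shape of a shard
response (a list of (score, doc_id) pairs already sorted by descending score)
are assumptions made for the example.

```python
import heapq
from itertools import islice


def merge_top_k(shard_responses, k):
    """Merge per-shard hit lists into one global top-k list.

    Each shard response is assumed to be a list of (score, doc_id) pairs
    that the shard has already sorted by descending score.
    """
    # heapq.merge expects ascending inputs, so order hits by negated score.
    merged = heapq.merge(*shard_responses, key=lambda hit: -hit[0])
    return list(islice(merged, k))


shard1 = [(9.1, "a"), (4.0, "b")]
shard2 = [(7.5, "c"), (6.2, "d")]
top3 = merge_top_k([shard1, shard2], 3)
print(top3)
```

Even in this toy form, the merging node has to hold every shard's response
before it can answer, which hints at why the thread describes the merge step
as memory intensive.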
