On Mon, Nov 29, 2021 at 2:40 PM Gus Heck <[email protected]> wrote:

>
>
>> CLARIFICATION: I do not like that we are storing node liveness in two
>> different places now. We have the live nodes and the node roles stored in
>> two different places in zookeeper, and it feels like this would lead to
>> race conditions, split brain, or other hard-to-diagnose bugs when those
>> two lists don't agree with each other. This also feels like it contradicts
>> the "single source of truth" idea later stated in the proposal. I see
>> Gus's arguments for decoupling these and am not strongly opposed, I just
>> get a lurking feeling about it. Even if we don't do this, I would like
>> this called out explicitly in the alternative approaches section as
>> something that we considered and rejected, with details why.
>>
>>
> Yes, I had that thought and reconciled it for myself by
> realizing/theorizing that the new structure does not represent liveness.
> It represents "roleness", which is a different bit of information. Using
> it in code as a check for liveness would be wrong. In any case, we always
> need to be prepared to handle the case where the node disappeared between
> when we checked the list (roles or live_nodes) and when we tried to
> contact the node. A power cord could have been unplugged in the interval.
>
>
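
Right. In code terms, what I think "being prepared" has to look like is
roughly the following (a sketch only; the NodeCall interface and callAny()
helper are invented for illustration, not proposed APIs): the caller walks
the candidate list and tolerates any entry having vanished after the list
was read.

import java.io.IOException;
import java.util.List;

public class RoleCallSketch {
  // Placeholder for whatever call the caller actually makes to the node.
  interface NodeCall {
    void apply(String node) throws IOException;
  }

  // Try each node advertising the role; tolerate any of them having
  // disappeared between reading the list and the attempt to contact it.
  static void callAny(List<String> nodesWithRole, NodeCall call) throws IOException {
    IOException last = null;
    for (String node : nodesWithRole) {
      try {
        call.apply(node);
        return; // reached one, done
      } catch (IOException e) {
        last = e; // the "power cord" case: move on to the next candidate
      }
    }
    throw new IOException("no node advertising the role was reachable", last);
  }
}
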
>> CHANGE REQUEST: The ZK structure also might not need that intermediate
>> "nodes" node.
>>
>
> I argued for the extra level for the following reason: Roles may want to
> coordinate additional information in zk (who IS overseer vs who COULD BE)
> or perhaps a pre-determined election order to speed up elections.
>

This brings up another point: I’d like to see if we can somehow collapse
overseer election with overseer role advertising. I would be strongly in
favor of that.
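
To sketch what I mean (the path is made up, and whether the election entry
should be ephemeral is exactly the kind of role-specific choice discussed
above): if the election children live under the overseer role's own
subtree, then writing your election znode *is* the role advertisement, so
there is no second list to drift out of sync.

import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class OverseerElectionSketch {
  // Hypothetical path; not what the SIP currently specifies.
  static final String ELECTION = "/node-roles/overseer/election";

  // One ephemeral-sequential znode both says "this node can be overseer"
  // and enrolls it in the leader election (assumes the parent path exists).
  static String advertiseAndEnroll(ZooKeeper zk, String nodeName) throws Exception {
    return zk.create(ELECTION + "/" + nodeName + "-n_",
        nodeName.getBytes(StandardCharsets.UTF_8),
        ZooDefs.Ids.OPEN_ACL_UNSAFE,
        CreateMode.EPHEMERAL_SEQUENTIAL);
  }
}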

> Or ZK nodes might want to record the desired redundancy for zk, and how
> long past zk nodes have been down, so as to bring up another zk from the
> pool of zk nodes if some timeout has been exceeded... And Ilan gave an
> example I don't recall at the moment that persuaded me that we can't
> really predict what each role wants to track, so my capable-vs-providing
> distinction got converted to a space in the structure (a peer to /nodes)
> to track any such info that a role needs. Thus namespacing role-related
> stuff, enabling a recursive watch if desired,
>

Yikes. Let’s figure out how to partition the data such that we don’t need
every node in the cluster watching every znode. I need to think about this
some more to convince myself we’re not in that space right now.
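
Concretely, what I would hope for is that a node only ever watches the
subtrees for roles it actually carries, e.g. with a persistent recursive
watch (ZooKeeper 3.6+) scoped to that role. Rough sketch, with the
/node-roles path taken from your example:

import org.apache.zookeeper.AddWatchMode;
import org.apache.zookeeper.ZooKeeper;

public class RoleWatchSketch {
  // Only a node that carries the given role sets this watch; a plain data
  // node would never watch /node-roles/zookeeper (or any other role) at all.
  static void watchOwnRole(ZooKeeper zk, String role) throws Exception {
    zk.addWatch("/node-roles/" + role,
        event -> System.out.println("role subtree changed: " + event),
        AddWatchMode.PERSISTENT_RECURSIVE);
  }
}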

> and generally keeping role-related coordination data organized together
> in zk. So for a concrete example you might have:
>
> /node-roles
>   /zookeeper
>     /nodes
>       host1_8983_solr {"rack":"A"}
>       host2_8983_solr {"rack":"A"}
>       host21_8983_solr {"rack":"B"}
>       host22_8983_solr {"rack":"B"}
>       host41_8983_solr {"rack":"C"}
>       host42_8983_solr {"rack":"C"}
>     /cluster { "redundancy":"3", "maxDownMin":"30", "rackAwareElections":"true" }
>     /current
>       host1_8983_solr
>       host21_8983_solr
>     /missing
>       host41_8983_solr { "since":"2021-12-23T12:34:56.7890"}
>     /election
>
> Note that this cluster could have 60 live nodes... 6 with the zk role.
> Just an example of course... we might not choose these features for zk
> nodes, but the point is to leave a spot in which to implement whatever we
> decide we want. Also the json at /cluster might simply be attached to
> /zookeeper instead... but at the moment we aren't specifying how roles
> handle their role-specific coordination data.
>

I think this is fair; we can leave the nodes subtree. Thank you.
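
For my own understanding, registration into that tree would then look
roughly like this (a sketch; whether the per-node entry is persistent or
ephemeral, and what JSON it carries, is exactly the role-specific detail
being left open):

import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class RoleRegistrationSketch {
  // e.g. register(zk, "zookeeper", "host1_8983_solr", "{\"rack\":\"A\"}")
  static void register(ZooKeeper zk, String role, String nodeName, String json)
      throws KeeperException, InterruptedException {
    String path = "/node-roles/" + role + "/nodes/" + nodeName;
    byte[] data = json.getBytes(StandardCharsets.UTF_8);
    try {
      zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    } catch (KeeperException.NodeExistsException e) {
      // already registered from an earlier start; just refresh the payload
      zk.setData(path, data, -1);
    }
  }
}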

>
>
>>
>> CLARIFICATION: What happens when a node gets a request that it can't
>> fulfil? An overseer node gets a query or an update. A data node gets a
>> collection creation request. Do they forward it on to an appropriate node,
>> or do they reject it? Should this be configurable? If not, then it seems
>> like lazy or poorly configured clients will defeat this isolation system
>> quite easily.
>>
>
> This seems like something for each role to decide and/or configure.
> Specifically, thinking of the request to be elected overseer, cluster-level
> config (at /cluster or attached to the role node as above?) could determine
> whether the existing (back-compatible) fallback to non-overseer nodes is
> desired or not...
>
>
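
Agreed that it belongs with the role. Just to illustrate the shape of it
(the flag name and its location are made up): a node could consult the
role's /cluster data to decide whether to fall back or reject.

import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.ZooKeeper;

public class RoleConfigSketch {
  // Hypothetical flag, e.g. /node-roles/overseer/cluster containing
  // {"fallbackToAnyNode":"true"}; real code would parse the JSON properly.
  static boolean fallbackAllowed(ZooKeeper zk, String role) throws Exception {
    byte[] data = zk.getData("/node-roles/" + role + "/cluster", false, null);
    return new String(data, StandardCharsets.UTF_8)
        .contains("\"fallbackToAnyNode\":\"true\"");
  }
}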
