[jira] [Commented] (IGNITE-4501) Improvement of connection in a cluster of new node

Alexander Menshikov (JIRA) Fri, 04 May 2018 06:48:36 -0700

    [ 
https://issues.apache.org/jira/browse/IGNITE-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463891#comment-16463891
 ]


Alexander Menshikov commented on IGNITE-4501:
---------------------------------------------

[~daradurvs]
{quote}what's the reason for increasing the number of messages?
{quote}
It was a long time ago, so I don't remember all the details. But currently the 
process works like this:

*One* pass around the ring to find the coordinator.

+

*One* pass around the ring to deliver the coordinator's decision (the new node 
is at the end of this pass, with the coordinator right behind it)

 

After the task is implemented, it will work like this:

*One* pass around the ring to find the coordinator.

+

*One* pass around the ring to deliver the coordinator's decision to every node 
except the new one (because other nodes can reject the new node)

+

*One* pass around the ring to find the new node (to deliver the final 
decision).

+

*One* pass around the ring to find the coordinator (to deliver the final 
decision).

 
{quote}Have you ever benchmarked the prepared solution? What are the results?
{quote}
No

> Improvement of connection in a cluster of new node
> --------------------------------------------------
>
>                 Key: IGNITE-4501
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4501
>             Project: Ignite
>          Issue Type: Improvement
>          Components: messaging
>    Affects Versions: 1.8
>            Reporter: Vyacheslav Daradur
>            Priority: Major
>              Labels: important
>
> h3. Main description:
> Cluster nodes connect in a ring.
> For example, we have 6 nodes: A, B, C, D, E, F. 
> They can form the ring in any possible order: A-B-C-D-E-F-A, or 
> A-F-B-E-C-D-A, etc.
> If a node leaves the topology, its adjacent nodes must reconnect. 
> If nodes A, B, C are in one physical location, nodes D, E, F are in another 
> location, and the two locations lose connectivity to each other, there are 
> many possible ways to reconnect.
> In the best case, if we had the ring A-B-CxD-E-FxA ('x' means disconnect), 
> then we have only one reconnect (C will be connected to A, or F will be 
> connected to D, depending on which part of the cluster stays alive).
> In a bad case, if the ring was not ordered by location, e.g. AxFxBxExCxDxA, 
> then we have a lot of reconnections (A to B, B to C, C to A: in general n/2 
> reconnections, where n is the number of nodes). 
> h3. Approach:
> It is necessary to develop an approach for inserting a node at the correct 
> place so that the correct ring topology is created.
> h3. Solutions:
> The main idea is to sort nodes according to latency.
> * group nodes into arcs by an ARC_ID (manually?).
> * implement NodeComparator (nodes on the same host : nodes on the same subnet 
> : other nodes). We will use it when we connect a new node.
> * [dev list 
> thread|http://mail-archives.apache.org/mod_mbox/ignite-dev/201612.mbox/%3CCAN+WSNyWYXSXEBpGErVt72zTgi2pTQzUWLv8JY=ke83-5-r...@mail.gmail.com%3E]
> Update Dec 29, Yakov Zhdanov:
> # Introduce a CLUSTER_REGION_ID node attribute. This can be done by adding a 
> public static final constant to TcpDiscoverySpi.
> # Alter 
> org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNodesRing#nextNode(java.util.Collection<org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNode>)
>  to order nodes based on the per-node attribute value.
> # Node comparison should be stable and consistent. E.g. if CLUSTER_REGION_IDs 
> are equal then we should compare nodes' IDs. This way we have a consistent 
> order on all nodes in the topology.
> # Also, nextNode() has to group nodes on the same host and in the same subnet. 
> This can be postponed and implemented after the other points are done.
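
The ordering idea from the description above can be sketched in plain Java. This is only an illustration under stated assumptions, not Ignite code: the class RingOrderSketch, the Node type, and the crossRegionLinks helper are hypothetical names, and the region is modelled as a plain string standing in for the proposed CLUSTER_REGION_ID attribute. The sketch sorts nodes by region and then by node ID (the stable tie-breaker suggested above) and counts how many ring links cross a region boundary, using the six-node example from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class RingOrderSketch {
    /** Hypothetical stand-in for a discovery node: an ID plus a region attribute. */
    static final class Node {
        private final String id;
        private final String regionId;

        Node(String id, String regionId) {
            this.id = id;
            this.regionId = regionId;
        }

        String id() { return id; }
        String regionId() { return regionId; }
    }

    /** Consistent order on every node: region first, node ID as the tie-breaker. */
    static final Comparator<Node> BY_REGION_THEN_ID =
        Comparator.comparing(Node::regionId).thenComparing(Node::id);

    /** Counts ring links, including the closing link, that join two different regions. */
    static int crossRegionLinks(List<Node> ring) {
        int count = 0;
        for (int i = 0; i < ring.size(); i++) {
            Node cur = ring.get(i);
            Node next = ring.get((i + 1) % ring.size());
            if (!cur.regionId().equals(next.regionId()))
                count++;
        }
        return count;
    }

    /** Returns a copy of the nodes ordered by region, then by ID. */
    static List<Node> sortedRing(List<Node> nodes) {
        List<Node> copy = new ArrayList<>(nodes);
        copy.sort(BY_REGION_THEN_ID);
        return copy;
    }

    /** The A-F-B-E-C-D-A ring from the description: A, B, C in one place, D, E, F in another. */
    static List<Node> interleavedRing() {
        return Arrays.asList(
            new Node("A", "region-1"), new Node("F", "region-2"),
            new Node("B", "region-1"), new Node("E", "region-2"),
            new Node("C", "region-1"), new Node("D", "region-2"));
    }

    public static void main(String[] args) {
        List<Node> ring = interleavedRing();
        // Every link of the interleaved ring crosses the region boundary.
        System.out.println("interleaved: " + crossRegionLinks(ring));           // prints 6
        // After sorting, only C-D and F-A cross the boundary.
        System.out.println("sorted: " + crossRegionLinks(sortedRing(ring)));    // prints 2
    }
}
```

With the sorted order, a split between the two locations breaks only two links, which matches the A-B-CxD-E-FxA best case in the description; the interleaved order loses all six.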



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
