Hi Vyacheslav,

Discovery logic is encapsulated in TcpDiscoverySpi.
TcpDiscoveryMulticastIpFinder is one of many implementations of the IP finder.
The only purpose of the IP finder is to provide a list of addresses where a
node can send its initial join request, and the fact that it sends this initial
request to node A doesn't actually mean that it will be connected to A
within the ring. Having said that, I doubt that the IP finder will be
affected if the discussed change is implemented.

The discovery protocol already maintains consistent information about the ring,
so any node in the topology already knows everything about the other nodes,
including their ordering in the ring. So at the discovery level it should not be
very difficult to customize where a joining node is placed on the ring.
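
To make the proposal quoted below a bit more concrete, here is a rough sketch of
a comparison that looks at a user-supplied key first and falls back to the
join order. Everything in it (NodeInfo, extraOrder, pack, RING_ORDER) is a
hypothetical name made up for illustration; none of it is Ignite API, and the
bit packing assumes DC numbers small enough to keep the packed int non-negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ExtraOrderSketch {
    /** Hypothetical stand-in for a cluster node; not an Ignite class. */
    static final class NodeInfo {
        final String name;
        final int extraOrder; // user-supplied key, like the proposed 'ExtraNodeOrder'
        final long order;     // order number assigned when the node joined

        NodeInfo(String name, int extraOrder, long order) {
            this.name = name;
            this.extraOrder = extraOrder;
            this.order = order;
        }
    }

    /** Packs a DC number into the high 16 bits and a rack number into the low 16 bits. */
    static int pack(int dc, int rack) {
        return ((dc & 0xFFFF) << 16) | (rack & 0xFFFF);
    }

    /** Compares by the user-supplied key first, then by join order. */
    static final Comparator<NodeInfo> RING_ORDER =
        Comparator.comparingInt((NodeInfo n) -> n.extraOrder)
                  .thenComparingLong(n -> n.order);

    /** Returns node names in the order they would appear on the ring. */
    static List<String> ring(List<NodeInfo> nodes) {
        List<NodeInfo> sorted = new ArrayList<>(nodes);
        sorted.sort(RING_ORDER);
        List<String> names = new ArrayList<>();
        for (NodeInfo n : sorted)
            names.add(n.name);
        return names;
    }

    public static void main(String[] args) {
        // Nodes join alternately from DC 1 and DC 2.
        List<NodeInfo> nodes = Arrays.asList(
            new NodeInfo("A", pack(1, 1), 1),
            new NodeInfo("F", pack(2, 1), 2),
            new NodeInfo("B", pack(1, 1), 3),
            new NodeInfo("E", pack(2, 1), 4),
            new NodeInfo("C", pack(1, 1), 5),
            new NodeInfo("D", pack(2, 1), 6));

        System.out.println(ring(nodes)); // prints [A, B, C, F, E, D]
    }
}
```

With join order alone the ring would be A, F, B, E, C, D; with the extra key the
nodes of each DC end up next to each other.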

However, here is the concern I have. Currently, when a new node joins, the
coordinator assigns an order number to this node (e.g. if we already have
nodes 1, 2 and 3, the new node will have order 4). This node will then be the
last one on the ring, i.e. nodes are always ordered in the ring by this
order number (1->2->3->4->1). If we change this, we will basically allow a
node to be placed anywhere else (something like 1->2->4->3->1). I'm not 100%
sure this is going to cause issues, but it sounds dangerous.
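
As a toy illustration of the current rule (not the actual TcpDiscoverySpi code;
the class and method names here are made up), the next node on the ring is
simply the node with the next higher order number, wrapping around at the end:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class RingByOrder {
    /** Returns the order number of the node that follows 'order' on the ring. */
    static long next(List<Long> orders, long order) {
        List<Long> sorted = new ArrayList<>(orders);
        sorted.sort(Comparator.naturalOrder());

        int idx = sorted.indexOf(order);
        if (idx < 0)
            throw new IllegalArgumentException("Unknown node order: " + order);

        // The last node links back to the first one, closing the ring.
        return sorted.get((idx + 1) % sorted.size());
    }

    public static void main(String[] args) {
        List<Long> ring = new ArrayList<>(Arrays.asList(1L, 2L, 3L));
        ring.add(4L); // the joining node gets the next order number

        System.out.println(next(ring, 3L)); // prints 4
        System.out.println(next(ring, 4L)); // prints 1
    }
}
```

Any custom placement would have to replace this single sort key, which is why
I'd like to understand what else relies on it.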

Yakov, can you please chime in and share your thoughts on this?

-Val

On Fri, Dec 23, 2016 at 2:46 AM, Vyacheslav Daradur <daradu...@gmail.com>
wrote:

> Thanks for the reply.
>
> I have some questions:
>
> 1. Where is the logic of Ignite cluster building implemented? DiscoverySpi and
> TcpDiscoveryMulticastIpFinder?
>
> 2. Which standard Ignite metrics can you recommend for
> node ordering?
>
> 2016-12-22 19:08 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
>
> > I think having some user-defined ordering can be beneficial. However, we
> > are only talking about the node discovery protocol here, which maintains the
> > cluster. All other communication between nodes happens directly (it does not
> > go through the ring).
> >
> > D.
> >
> > On Thu, Dec 22, 2016 at 6:32 AM, Vyacheslav Daradur <daradu...@gmail.com>
> > wrote:
> >
> > > Hello, Alex!
> > >
> > > I think it is a great idea.
> > >
> > > I suggest basing communication between nodes on a weight (or priority).
> > >
> > > For example, ordering on latency:
> > > - nodes on one host = 1
> > > - nodes in one rack-blade = 2
> > > - nodes in one server-rack = 3
> > > - nodes in one physical cluster = 4
> > > - nodes in one subnet = 5
> > > - etc.
> > >
> > > Maybe it would be better to use some metrics from the ClusterMetrics
> > > interface.
> > >
> > > The ordering algorithm can be implemented in a class such as a Comparator
> > > and used when we build a cluster or select a place for a new node.
> > >
> > > --
> > > With best regards,
> > > Vyacheslav Daradur
> > >
> > > 2016-12-22 13:59 GMT+03:00 Александр Меньшиков <sharple...@gmail.com>:
> > >
> > > > Hello everyone,
> > > >
> > > > As far as I know, nodes are connected in a ring. For example, if I
> > > > have 6 nodes named A, B, C, D, E, and F, they can connect in a ring
> > > > in any possible way: A-B-C-D-E-F-A, or A-F-B-E-C-D-A, etc. If some
> > > > node falls out of the topology, its neighboring nodes must reconnect.
> > > > If nodes A, B and C are located in one physical location and D, E and
> > > > F in another, and at some point one location becomes unreachable from
> > > > the other, we can get a different number of reconnections. The best
> > > > case is a ring like A-B-CxD-E-FxA ('x' means disconnect): then we get
> > > > only one reconnect (C reconnects to A, or F reconnects to D,
> > > > depending on which part of the cluster stays alive). But currently
> > > > the case AxFxBxExCxDxA is also possible: then we get a lot of
> > > > reconnections (A to B, B to C, C to A; in general n/2 reconnections,
> > > > where n is the number of nodes). So I propose adding something to
> > > > ensure that we always have a good ordering of node connections
> > > > (A-B-C-...-Z-A).
> > > >
> > > > Of course, in the real world we can have multiple levels of physical
> > > > closeness.
> > > >
> > > > In my opinion, it is enough to add one 'int' parameter to the
> > > > configuration (with a name like 'ExtraNodeOrder') and to change the
> > > > node comparison method so that it first compares 'ExtraNodeOrder' and
> > > > then falls back to the old criterion (as far as I know, Ignite uses
> > > > the topology version). So if a user has multiple levels of physical
> > > > closeness, they can use different bits. For example, use the high 16
> > > > bits for the DC number and the low 16 bits for racks.
> > > >
> > > > Alternatively, we can add an array of 'int' to the configuration and
> > > > compare nodes element by element, from the zero element to the last.
> > > >
> > >
> >
>
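
P.S. The A..F example from the original proposal quoted above can be checked
with a few lines of code. This is only a sketch with made-up names
(CrossLinks, crossLinks, twoSites); it counts the ring links that break when
the two locations lose contact. How many reconnects actually follow depends on
which side stays alive.

```java
import java.util.HashMap;
import java.util.Map;

class CrossLinks {
    /** Counts ring links whose two endpoints sit in different locations. */
    static int crossLinks(String ring, Map<Character, Integer> location) {
        int n = ring.length();
        int cnt = 0;

        for (int i = 0; i < n; i++) {
            char a = ring.charAt(i);
            char b = ring.charAt((i + 1) % n); // the last node links back to the first

            if (!location.get(a).equals(location.get(b)))
                cnt++;
        }

        return cnt;
    }

    /** A, B, C in location 1; D, E, F in location 2. */
    static Map<Character, Integer> twoSites() {
        Map<Character, Integer> loc = new HashMap<>();
        for (char c : "ABC".toCharArray())
            loc.put(c, 1);
        for (char c : "DEF".toCharArray())
            loc.put(c, 2);
        return loc;
    }

    public static void main(String[] args) {
        Map<Character, Integer> loc = twoSites();

        System.out.println(crossLinks("ABCDEF", loc)); // prints 2 (C-D and F-A)
        System.out.println(crossLinks("AFBECD", loc)); // prints 6 (every link)
    }
}
```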
