Alex, I checked the reproducer you presented. Please fix it according to the following rules:
1) Never use System.out.println() as part of a reproducer; use assertions
   if necessary.
2) The reproducer should be small. As small as possible.
3) Try to make the reproducer clear. As clear as possible.
   If code like the following can be simplified, it should be:

       if (flag && latch.getCount() > 0) {
           fut.onDone(this);

3.1) Never use variable names like 'flag'; this makes reviewers nervous.

Fri, 27 Apr 2018 at 17:40, Александр Меньшиков <sharple...@gmail.com>:

> Yakov, thank you for the advice.
>
> Thread.sleep alone is not enough, but a latch plus a future gave me a
> way to build the reproducer.
>
> I have created PR [1] against my master to show the test and the
> modification of ServerImpl that helps me slow down execution inside
> the dangerous section.
>
> The test code is a bit long, but basically it has two parts:
>
> In the first part, I randomly start and stop nodes to catch the moment
> when a server starts to execute the dangerous code which I described
> in the first message.
>
> In the second part, I wait until the first part produces this
> situation, and after that I call a public method of ServerImpl, which
> fails with an exception:
>
> java.lang.AssertionError: Invalid node order: TcpDiscoveryNode
> [id=f6bf048d-378b-4960-94cb-84e3d3300002, addrs=[127.0.0.1],
> sockAddrs=[/127.0.0.1:47502], discPort=47502, order=0, intOrder=2,
> lastExchangeTime=1524836605995, loc=false,
> ver=2.5.0#20180426-sha1:34e22396, isClient=false]
>     at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNodesRing$1.apply(TcpDiscoveryNodesRing.java:52)
>     at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNodesRing$1.apply(TcpDiscoveryNodesRing.java:49)
>     at org.apache.ignite.internal.util.lang.GridFunc.isAll(GridFunc.java:2014)
>     at org.apache.ignite.internal.util.IgniteUtils.arrayList(IgniteUtils.java:9679)
>     at org.apache.ignite.internal.util.IgniteUtils.arrayList(IgniteUtils.java:9652)
>     at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNodesRing.nodes(TcpDiscoveryNodesRing.java:590)
>     at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNodesRing.visibleRemoteNodes(TcpDiscoveryNodesRing.java:164)
>     at org.apache.ignite.spi.discovery.tcp.ServerImpl.getRemoteNodes(ServerImpl.java:304)
>
> As I said in the first message, the problem arises because the current
> code changes the local node's internal order and breaks the sorting in
> the TcpDiscoveryNodesRing.nodes collection.
>
> Is this reproducer convincing enough?
>
> [1] Reproducer: https://github.com/SharplEr/ignite/pull/10/files
>
> 2018-02-13 20:17 GMT+03:00 Yakov Zhdanov <yzhda...@apache.org>:
>
> > Alex, you can alter ServerImpl and insert a latch or
> > thread.sleep(xxx) anywhere you like to show the incorrect behavior
> > you describe.
> >
> > --Yakov
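
For illustration only, the latch approach discussed in this thread could be sketched roughly as below. This is not the code from the PR and it does not touch Ignite: the Ring class, its two fields, and the test hooks enteredDangerSection / leaveDangerSection are all hypothetical stand-ins for "a class that updates two related values non-atomically". The sketch pauses the updating thread inside the dangerous section with a latch and checks the result with assertions instead of System.out.println().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchReproducer {
    /** Hypothetical component that updates two related fields non-atomically. */
    static class Ring {
        volatile long order = 1;
        volatile long internalOrder = 1;

        // Hypothetical test hooks; both stay null outside of the test.
        volatile CountDownLatch enteredDangerSection;
        volatile CountDownLatch leaveDangerSection;

        void update(long newOrder) throws InterruptedException {
            internalOrder = newOrder;

            // Dangerous section: internalOrder is already changed, order is not.
            if (enteredDangerSection != null) {
                enteredDangerSection.countDown(); // Tell the test we are inside.
                leaveDangerSection.await();       // Wait until the test lets us go.
            }

            order = newOrder;
        }

        boolean consistent() {
            return order == internalOrder;
        }
    }

    /** Returns true if the half-applied update was observed from another thread. */
    static boolean observedInconsistentState() {
        Ring ring = new Ring();
        ring.enteredDangerSection = new CountDownLatch(1);
        ring.leaveDangerSection = new CountDownLatch(1);

        Thread updater = new Thread(() -> {
            try {
                ring.update(2);
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        updater.start();

        try {
            // Deterministically wait for the updater to reach the dangerous section.
            if (!ring.enteredDangerSection.await(5, TimeUnit.SECONDS))
                throw new AssertionError("Updater never reached the dangerous section");

            boolean inconsistent = !ring.consistent();

            ring.leaveDangerSection.countDown();
            updater.join();

            if (!ring.consistent())
                throw new AssertionError("State must be consistent after the update completes");

            return inconsistent;
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            throw new IllegalStateException("Test thread was interrupted", e);
        }
    }

    public static void main(String[] args) {
        if (!observedInconsistentState())
            throw new AssertionError("Expected to observe the half-applied update");
    }
}
```

The point of the two latches is that the test does not depend on timing: the first latch proves the updater is inside the section, and the second keeps it there while the assertion runs. In a real reproducer the hooks would live in the class under test, as suggested for ServerImpl above.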