[ 
https://issues.apache.org/jira/browse/IGNITE-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17456349#comment-17456349
 ] 

Sergey Chugunov commented on IGNITE-15996:
------------------------------------------

[~krybakova],

Thank you for the clarification!

It looks like all TCPDiscovery can do when the network malfunctions in this 
way is log additional messages. I can't say for sure, but we could try to 
detect this scenario and give the user more informative hints about what may 
be wrong.
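
To make the idea more concrete, here is a rough sketch of what such a check 
might look like. It is only an illustration, not existing Ignite code: the 
class DuplicateIdDiagnostics, the method buildMessage and the 
idGeneratedOnStart flag are hypothetical names, and the heuristic itself (an 
ID generated at this start is unlikely to collide for real) is an assumption.

```java
import java.util.UUID;

/** Hypothetical helper, not part of Ignite: builds a more informative duplicate-ID message. */
final class DuplicateIdDiagnostics {
    /**
     * @param localId ID of the node that is trying to join.
     * @param existingId ID reported as already known to the topology.
     * @param idGeneratedOnStart Assumed flag: true if the local ID was randomly generated at this start.
     * @return Message text that could be logged alongside the existing error.
     */
    static String buildMessage(UUID localId, UUID existingId, boolean idGeneratedOnStart) {
        String base = "Node with the same ID was found [localId=" + localId
            + ", existingId=" + existingId + ']';

        // Assumption: a freshly generated random ID should not already be in the topology,
        // so a match points at the environment rather than at the node configuration.
        if (localId.equals(existingId) && idGeneratedOnStart) {
            return base + ". The local ID was generated at this start, so this may indicate"
                + " a network problem rather than a misconfiguration.";
        }

        return base + ". Fix configuration and restart local node.";
    }

    public static void main(String[] args) {
        // ID taken from the stack trace in the issue description.
        UUID id = UUID.fromString("c62bc58e-102a-4928-8e54-ac8a56bf4d44");

        System.out.println(buildMessage(id, id, true));
    }
}
```

Where exactly such a check would live in the discovery code, and how reliably 
the scenario can be recognized, would still need to be investigated.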

Does this idea of additional logging make sense to you?

> Node fails with "Node with the same ID was found" while connecting to the 
> cluster in Docker container if previous container was stopped
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-15996
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15996
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.10
>         Environment: Windows 10, Docker+WSL2
>            Reporter: Ksenia Rybakova
>            Priority: Major
>         Attachments: ignite-47b5227b.0.log, ignite-c072978e.0.log, 
> ignite-c62bc58e.0.log
>
>
> A node in a Docker container fails to connect to an existing cluster if a 
> previously connected node (container) was stopped:
> {noformat}
> [11:27:38,272][SEVERE][main][IgniteKernal] Got exception while starting (will 
> rollback startup routine).
> class org.apache.ignite.IgniteCheckedException: Failed to start manager: 
> GridManagerAdapter [enabled=true, 
> name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
>     at 
> org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1990)
>     at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1331)
>     at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2141)
>     at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1787)
>     at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1172)
>     at 
> org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1066)
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:952)
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:851)
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:721)
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:690)
>     at org.apache.ignite.Ignition.start(Ignition.java:353)
>     at 
> org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:367)
> Caused by: class org.apache.ignite.IgniteCheckedException: Failed to start 
> SPI: TcpDiscoverySpi [addrRslvr=null, addressFilter=null, sockTimeout=5000, 
> ackTimeout=5000, marsh=JdkMarshaller 
> [clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@21f9277b], 
> reconCnt=10, reconDelay=2000, maxAckTimeout=600000, soLinger=0, 
> forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, 
> skipAddrsRandomization=false]
>     at 
> org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:281)
>     at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:980)
>     at 
> org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1985)
>     ... 11 more
> Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same 
> ID was found in node IDs history or existing node in topology has the same ID 
> (fix configuration and restart local node) [localNode=TcpDiscoveryNode 
> [id=c62bc58e-102a-4928-8e54-ac8a56bf4d44, 
> consistentId=127.0.0.1,172.17.0.4:47500, addrs=ArrayList [127.0.0.1, 
> 172.17.0.4], sockAddrs=HashSet [402b337a50dd/172.17.0.4:47500, 
> /127.0.0.1:47500], discPort=47500, order=0, intOrder=3, 
> lastExchangeTime=1637839658247, loc=true, ver=2.11.0#20210911-sha1:8f3f07d3, 
> isClient=false], existingNode=c62bc58e-102a-4928-8e54-ac8a56bf4d44]
>     at 
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.duplicateIdError(TcpDiscoverySpi.java:2083)
>     at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:1201)
>     at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:473)
>     at 
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2207)
>     at 
> org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:278)
>     ... 13 more{noformat}
> Steps to reproduce:
> 1) Download the Ignite Docker image
> {code:java}
> docker pull apacheignite/ignite:2.11.0{code}
> 2) Start node 1 (a local directory is mounted to save logs)
> {code:java}
> docker run -d -v ${PWD}/docker_ignite_w1:/opt/ignite/apache-ignite/work 
> apacheignite/ignite:2.11.0 
> c5219b095c93ec56731eec9fa871ffb722ddead987256198d76889f4a1a8ea3e{code}
> 3) Start node 2
> {code:java}
> docker run -d -v ${PWD}/docker_ignite_w2:/opt/ignite/apache-ignite/work 
> apacheignite/ignite:2.11.0 
> 65fdae68a40b2d3d17ab7e560320ef6757713d8efacbc25a26aecca03be6f975{code}
> 4) Stop container for node 2
> {code:java}
> docker stop 65fdae68a40b{code}
> 5) Start node 3
> {code:java}
> docker run -d -v ${PWD}/docker_ignite_w3:/opt/ignite/apache-ignite/work 
> apacheignite/ignite:2.11.0{code}
> Expected: node 3 joins the cluster successfully
> Actual: node 3 fails with "IgniteSpiException: Node with the same ID was 
> found in node IDs history or existing node in topology has the same ID." 
> although its ID appears to be unique.
> Logs are attached:
> node 1 - ignite-47b5227b.0.log,
> node 2 - ignite-c072978e.0.log,
> node 3 - ignite-c62bc58e.0.log.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
