[ 
https://issues.apache.org/jira/browse/IGNITE-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457295#comment-17457295
 ] 

Guilherme Momesso commented on IGNITE-15996:
--------------------------------------------

I'm facing the same issue while running on a Kubernetes cluster managed by 
Rancher.

As [~krybakova] pointed out, the error starts when the 3rd pod is launched. 
Then, after a while, all three pods keep restarting with the "Node with the 
same ID was found in node IDs history" error. After some time, one pod stays 
up and stable while the other two keep restarting.
I can only get 2 nodes running again if I stop all the nodes and start over.

I'm using TcpDiscoveryKubernetesIpFinder as the IP finder. The nodes are AWS 
EC2 Linux instances. I followed the Installation->Kubernetes steps of the 
documentation, and the only difference I can recall is that I configured the 
K8s Service type as "ClusterIP" instead of "LoadBalancer".

Unfortunately, I can't use the suggested workaround.

> Node fails with "Node with the same ID was found" while connecting to the 
> cluster in Docker container if previous container was stopped
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-15996
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15996
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.10
>         Environment: Windows 10, Docker+WSL2
>            Reporter: Ksenia Rybakova
>            Priority: Major
>         Attachments: ignite-47b5227b.0.log, ignite-c072978e.0.log, 
> ignite-c62bc58e.0.log
>
>
> A node in a Docker container fails to connect to an existing cluster if a 
> previously connected node (container) was stopped:
> {noformat}
> [11:27:38,272][SEVERE][main][IgniteKernal] Got exception while starting (will 
> rollback startup routine).
> class org.apache.ignite.IgniteCheckedException: Failed to start manager: 
> GridManagerAdapter [enabled=true, 
> name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
>     at 
> org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1990)
>     at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1331)
>     at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2141)
>     at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1787)
>     at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1172)
>     at 
> org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1066)
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:952)
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:851)
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:721)
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:690)
>     at org.apache.ignite.Ignition.start(Ignition.java:353)
>     at 
> org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:367)
> Caused by: class org.apache.ignite.IgniteCheckedException: Failed to start 
> SPI: TcpDiscoverySpi [addrRslvr=null, addressFilter=null, sockTimeout=5000, 
> ackTimeout=5000, marsh=JdkMarshaller 
> [clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@21f9277b], 
> reconCnt=10, reconDelay=2000, maxAckTimeout=600000, soLinger=0, 
> forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, 
> skipAddrsRandomization=false]
>     at 
> org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:281)
>     at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:980)
>     at 
> org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1985)
>     ... 11 more
> Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same 
> ID was found in node IDs history or existing node in topology has the same ID 
> (fix configuration and restart local node) [localNode=TcpDiscoveryNode 
> [id=c62bc58e-102a-4928-8e54-ac8a56bf4d44, 
> consistentId=127.0.0.1,172.17.0.4:47500, addrs=ArrayList [127.0.0.1, 
> 172.17.0.4], sockAddrs=HashSet [402b337a50dd/172.17.0.4:47500, 
> /127.0.0.1:47500], discPort=47500, order=0, intOrder=3, 
> lastExchangeTime=1637839658247, loc=true, ver=2.11.0#20210911-sha1:8f3f07d3, 
> isClient=false], existingNode=c62bc58e-102a-4928-8e54-ac8a56bf4d44]
>     at 
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.duplicateIdError(TcpDiscoverySpi.java:2083)
>     at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:1201)
>     at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:473)
>     at 
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2207)
>     at 
> org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:278)
>     ... 13 more{noformat}
> Steps to reproduce:
> 1) Download ignite Docker image
> {code:java}
> docker pull apacheignite/ignite:2.11.0{code}
> 2) Start node 1 (local directory is mounted to save logs)
> {code:java}
> docker run -d -v ${PWD}/docker_ignite_w1:/opt/ignite/apache-ignite/work apacheignite/ignite:2.11.0
> c5219b095c93ec56731eec9fa871ffb722ddead987256198d76889f4a1a8ea3e{code}
> 3) Start node 2
> {code:java}
> docker run -d -v ${PWD}/docker_ignite_w2:/opt/ignite/apache-ignite/work apacheignite/ignite:2.11.0
> 65fdae68a40b2d3d17ab7e560320ef6757713d8efacbc25a26aecca03be6f975{code}
> 4) Stop container for node 2
> {code:java}
> docker stop 65fdae68a40b{code}
> 5) Start node 3
> {code:java}
> docker run -d -v ${PWD}/docker_ignite_w3:/opt/ignite/apache-ignite/work apacheignite/ignite:2.11.0{code}
> Expected: node 3 joins the cluster successfully
> Actual: node 3 fails with "IgniteSpiException: Node with the same ID was 
> found in node IDs history or existing node in topology has the same ID." 
> although the ID seems unique. 
> Logs are attached:
> node 1 - ignite-47b5227b.0.log,
> node 2 - ignite-c072978e.0.log,
> node 3 - ignite-c62bc58e.0.log.
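
In the quoted trace, the ID reported as existingNode is the same as the id of 
the joining localNode. To check whether other logs show the same pattern, the 
two IDs can be pulled out of the exception text. This is only a minimal 
sketch, assuming the message format shown in the trace above; the class 
DuplicateIdMessage and its methods are hypothetical helpers, not part of 
Ignite, and the sample message in main is abbreviated from the trace.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not an Ignite class): reads node IDs from the duplicate-ID error text. */
class DuplicateIdMessage {
    // Textual UUID form, e.g. c62bc58e-102a-4928-8e54-ac8a56bf4d44.
    private static final String UUID_RE =
        "[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}";

    // Assumed layout, taken from the trace: "localNode=TcpDiscoveryNode [id=<uuid>, ...".
    private static final Pattern LOCAL =
        Pattern.compile("localNode=TcpDiscoveryNode\\s*\\[\\s*id=(" + UUID_RE + ")");

    // Assumed layout, taken from the trace: "existingNode=<uuid>".
    private static final Pattern EXISTING =
        Pattern.compile("existingNode=(" + UUID_RE + ")");

    /** ID of the node that tried to join, if the message contains one. */
    static Optional<String> localNodeId(String msg) {
        return firstGroup(LOCAL, msg);
    }

    /** ID the cluster reported as already known, if the message contains one. */
    static Optional<String> existingNodeId(String msg) {
        return firstGroup(EXISTING, msg);
    }

    /** True only when both IDs are present and identical. */
    static boolean collidesWithOwnId(String msg) {
        Optional<String> local = localNodeId(msg);
        return local.isPresent() && local.equals(existingNodeId(msg));
    }

    private static Optional<String> firstGroup(Pattern p, String msg) {
        Matcher m = p.matcher(msg);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Abbreviated copy of the message from the trace above.
        String msg = "Node with the same ID was found in node IDs history "
            + "[localNode=TcpDiscoveryNode [id=c62bc58e-102a-4928-8e54-ac8a56bf4d44, "
            + "discPort=47500], existingNode=c62bc58e-102a-4928-8e54-ac8a56bf4d44]";

        System.out.println("local:    " + localNodeId(msg).orElse("n/a"));
        System.out.println("existing: " + existingNodeId(msg).orElse("n/a"));
        System.out.println("same ID:  " + collidesWithOwnId(msg));
    }
}
```

A matching pair only shows that the cluster reported the joining node's own 
ID as already known; it does not by itself explain why that happened.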



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
