It looks like you have a communication problem: on address 172.16.100.57:47100 a node is seeing a node it does not expect. Please share the logs from all nodes; that will help us understand the problem.
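For reference, port 47100 is the default TcpCommunicationSpi port, so the error points at node-to-node communication rather than discovery traffic. In a Kubernetes deployment, discovery is usually configured with the Kubernetes IP finder so pods find each other through a headless service. A minimal sketch (assuming the ignite-kubernetes module is on the classpath; the namespace and service name values are placeholders for your deployment):

```xml
<!-- Sketch only: "default" and "ignite" are placeholders for your
     actual Kubernetes namespace and headless-service name. -->
<property name="discoverySpi">
  <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
    <property name="ipFinder">
      <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder">
        <!-- Namespace the Ignite pods run in. -->
        <property name="namespace" value="default"/>
        <!-- Kubernetes service that selects the Ignite pods. -->
        <property name="serviceName" value="ignite"/>
      </bean>
    </property>
  </bean>
</property>
```

If your lib-server.xml uses a static IP list instead, pod restarts will change pod IPs and nodes can end up trying to talk to stale addresses, which would match the symptom above.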
Evgenii

2017-12-13 13:45 GMT+03:00 golgoti <alaus...@hotmail.fr>:
> Hi there,
>
> We have an app with 4 Ignite nodes (2.1) deployed under K8s. Here are our
> pods with their IPs:
>
> NAME                         READY  STATUS   RESTARTS  AGE  IP              NODE
> ceph-2949468122-8mwmh        1/1    Running  0         56m  172.16.99.196   clt1-worker1
> ignite-ceph-545147792-9dx5b  1/1    Running  1         56m  172.16.100.57   clt1-worker3
> ignite-ceph-545147792-9gpjp  1/1    Running  1         56m  172.16.171.5    clt1-worker2
> ignite-ceph-545147792-hdl7p  1/1    Running  2         56m  172.16.99.225   clt1-worker1
> ignite-ceph-545147792-z84lk  1/1    Running  1         56m  172.16.194.239  clt1-worker4
>
> As you can see, some pods (Ignite server nodes) are restarting, sometimes
> without any warning in the logs, dying quietly (in fact, we just see the
> server count decrease).
>
> However, this time we were able to catch an IgniteCheckedException
> (see node_restart.log).
> Am I right in understanding that something is wrong with their inter-pod
> communication?
>
> I am also attaching our node configuration.
>
> Any help is very much appreciated.
>
> lib-server.xml
> <http://apache-ignite-users.70518.x6.nabble.com/file/t1292/lib-server.xml>
> node_restart.log
> <http://apache-ignite-users.70518.x6.nabble.com/file/t1292/node_restart.log>
>
> Thanks
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/