[ https://issues.apache.org/jira/browse/IGNITE-10926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742260#comment-16742260 ]
Amelchev Nikita commented on IGNITE-10926:
------------------------------------------

I have prepared a [PR|https://github.com/apache/ignite/pull/5820] with a reproducer to fix the issue. I used a watcher on the local alive node. It resolves the case where the node was deleted after several cluster restarts.

Another way to resolve the issue would be to use a special event to invoke a reconnect. But this requires creating a new discovery event, because a custom event will not be processed (rtState isn't joined).

[~sergey-chugunov], could you take a look at the issue, please?

> ZookeeperDiscoverySpi: client does not survive after several cluster restarts
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-10926
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10926
>             Project: Ignite
>          Issue Type: Bug
>          Components: zookeeper
>            Reporter: Amelchev Nikita
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{ZookeeperDiscoveryImpl#cleanupPreviousClusterData}} can delete the alive node
> of a client when the client has a low internal order.
> Steps to reproduce:
> 1. Start a server and a client.
> 2. Stop the server and wait for the client to disconnect.
> 3. Start and stop the server. The server does not have time to process the
> client's join request.
> 4. Start the server. It will delete the alive client node because the client
> has a low internal order. The client will never connect.
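
To illustrate the watcher idea from the comment above, here is a minimal in-memory sketch. It is a toy model under stated assumptions, not the code from the PR and not the Ignite or ZooKeeper API: {{AliveDir}}, {{Client}}, {{cleanupBelow}} and {{watchDelete}} are hypothetical names, and the rule "delete every alive node whose internal order is below a threshold" is a simplification of what the cleanup does. The sketch only shows why a client that watches its own alive node can rejoin, while a client without a watch stays deleted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy single-threaded model of the alive-node watcher idea (all names are hypothetical). */
public class AliveNodeWatcherSketch {
    /** Stand-in for the directory of alive nodes: path -> internal order. */
    static final class AliveDir {
        private final Map<String, Long> nodes = new HashMap<>();
        private final Map<String, Runnable> deleteWatches = new HashMap<>();
        private long nextOrder;

        /** Creates an alive node and assigns it the next internal order. */
        long create(String path) {
            long order = nextOrder++;
            nodes.put(path, order);
            return order;
        }

        boolean exists(String path) {
            return nodes.containsKey(path);
        }

        /** Registers a one-shot callback that fires when the node is deleted. */
        void watchDelete(String path, Runnable onDelete) {
            deleteWatches.put(path, onDelete);
        }

        /** Simplified cleanup: removes every node whose order is below minOrder. */
        void cleanupBelow(long minOrder) {
            nextOrder = Math.max(nextOrder, minOrder);
            // Collect first, so callbacks can re-create nodes without disturbing the iteration.
            List<String> stale = new ArrayList<>();
            for (Map.Entry<String, Long> e : nodes.entrySet()) {
                if (e.getValue() < minOrder)
                    stale.add(e.getKey());
            }
            for (String path : stale) {
                nodes.remove(path);
                Runnable watch = deleteWatches.remove(path);
                if (watch != null)
                    watch.run();
            }
        }
    }

    /** Client that optionally watches its own alive node and rejoins when it is deleted. */
    static final class Client {
        private final AliveDir dir;
        private final String path;
        private final boolean useWatcher;
        int reconnects;

        Client(AliveDir dir, String path, boolean useWatcher) {
            this.dir = dir;
            this.path = path;
            this.useWatcher = useWatcher;
        }

        void join() {
            dir.create(path);
            // The watch is one-shot, so it is registered again on every join.
            if (useWatcher)
                dir.watchDelete(path, () -> { reconnects++; join(); });
        }
    }

    public static void main(String[] args) {
        for (boolean useWatcher : new boolean[] {false, true}) {
            AliveDir dir = new AliveDir();
            Client client = new Client(dir, "/alive/client-1", useWatcher);
            client.join();
            dir.cleanupBelow(5); // the restarted server treats low orders as stale
            System.out.println("watcher=" + useWatcher
                + ", exists=" + dir.exists("/alive/client-1")
                + ", reconnects=" + client.reconnects);
        }
    }
}
```

In this model the client without a watch is never told that its node is gone, which corresponds to step 4 of the reproducer. The watching client is notified and creates a new alive node with a higher order.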