[
https://issues.apache.org/jira/browse/IGNITE-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16534681#comment-16534681
]
Denis Garus commented on IGNITE-8131:
-------------------------------------
[~sergey-chugunov],
If a connection failure occurs while updating the event's data, the thread
"zk-internal.ZookeeperDiscoverySpiTest1-EventThread" will retry the
operation (see ZookeeperClient#setData). The method
ZookeeperClient#onZookeeperError contains the logic that determines the number
of attempts.
When the number of attempts exceeds the allowed limit, the ZookeeperClient is
closed.
By this time, the WatchedEvent has already been created and passed to
EventThread#queueEvent (see SendThread#run in
org.apache.zookeeper.ClientCnxn).
However, this event can't be handled because the
"zk-internal.ZookeeperDiscoverySpiTest1-EventThread" thread is busy retrying
the setData operation.
This is why we don't see any log output from the ZookeeperClient for handling
the WatchedEvent.
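
The blocking effect described above can be reproduced in isolation. The
following is only a minimal sketch, not Ignite or ZooKeeper code: a
single-threaded executor stands in for the ZooKeeper EventThread, a sleeping
loop stands in for the setData retries, and all names
(EventThreadStarvationSketch, eventHandledWithin, the delay values) are
hypothetical and chosen for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class EventThreadStarvationSketch {
    /**
     * Simulates one event thread that is busy retrying an operation while
     * another event is queued behind it.
     *
     * @return true if the queued event was handled within waitMs.
     */
    public static boolean eventHandledWithin(int retries, long retryDelayMs, long waitMs) {
        // Stand-in for the single ZooKeeper EventThread (assumption: one worker).
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        CountDownLatch retryStarted = new CountDownLatch(1);
        CountDownLatch eventHandled = new CountDownLatch(1);

        try {
            // A callback running on the event thread keeps retrying a failing operation.
            eventThread.execute(() -> {
                retryStarted.countDown();
                for (int i = 0; i < retries; i++) {
                    try {
                        Thread.sleep(retryDelayMs); // pause between attempts
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });

            retryStarted.await();

            // Another thread (the SendThread in the real client) queues an event.
            eventThread.execute(eventHandled::countDown);

            // The queued event can only run after the retry loop has finished.
            return eventHandled.await(waitMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } finally {
            eventThread.shutdownNow(); // interrupts the retry loop, drops queued tasks
        }
    }

    public static void main(String[] args) {
        System.out.println("busy event thread, handled: " + eventHandledWithin(50, 100, 300));
        System.out.println("idle event thread, handled: " + eventHandledWithin(0, 0, 2000));
    }
}
```

With a long retry loop the queued event is not handled within the wait
interval, which matches the missing log record; with no retries it is handled
almost immediately.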
> ZookeeperDiscoverySpiTest#testClientReconnectSessionExpire* tests fail on TC
> ----------------------------------------------------------------------------
>
> Key: IGNITE-8131
> URL: https://issues.apache.org/jira/browse/IGNITE-8131
> Project: Ignite
> Issue Type: Bug
> Components: zookeeper
> Reporter: Sergey Chugunov
> Assignee: Denis Garus
> Priority: Major
> Labels: MakeTeamcityGreenAgain
> Fix For: 2.7
>
> Attachments: ZK_client_reconnect_failure.log,
> ZK_client_reconnect_success.log
>
>
> Two tests always fail on TC with the assertion
> {noformat}
> junit.framework.AssertionFailedError: Failed to wait for disconnect/reconnect event.
> at org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoverySpiTest.waitReconnectEvent(ZookeeperDiscoverySpiTest.java:4221)
> at org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoverySpiTest.reconnectClientNodes(ZookeeperDiscoverySpiTest.java:4183)
> at org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoverySpiTest.clientReconnectSessionExpire(ZookeeperDiscoverySpiTest.java:2231)
> at org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoverySpiTest.testClientReconnectSessionExpire1_1(ZookeeperDiscoverySpiTest.java:2206)
> {noformat}
> from the client disconnect/reconnect events check. Obviously, the client
> doesn't generate these events as it is supposed to.
> (TC runs can be found
> [here|https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_IgniteZooKeeperDiscovery&branch_IgniteTests24Java8=pull%2F3730%2Fhead&tab=buildTypeStatusDiv]).
> It is possible to reproduce the test failure locally as well, but with low
> probability: one failure per 50 or even 300 successful executions.