[jira] [Commented] (IGNITE-8699) ZookeeperDiscoverySpiTest#testDisconnectOnServersLeft flaky fails (rarely)

2018-06-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519561#comment-16519561
 ] 

ASF GitHub Bot commented on IGNITE-8699:


Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/4161


> ZookeeperDiscoverySpiTest#testDisconnectOnServersLeft flaky fails (rarely)
> --------------------------------------------------------------------------
>
> Key: IGNITE-8699
> URL: https://issues.apache.org/jira/browse/IGNITE-8699
> Project: Ignite
> Issue Type: Bug
> Reporter: Vitaliy Biryukov
> Assignee: Vitaliy Biryukov
> Priority: Major
> Labels: MakeTeamcityGreenAgain
> Fix For: 2.7
>
> Attachments: thread-dump-fail-before-local-join
>
>
> *Affected tests:*
> testDisconnectOnServersLeft_1
> testDisconnectOnServersLeft_2
> testDisconnectOnServersLeft_3
> testDisconnectOnServersLeft_4
> testDisconnectOnServersLeft_5
> {noformat}
> junit.framework.AssertionFailedError: Failed to wait for disconnect/reconnect event.
>   at junit.framework.Assert.fail(Assert.java:57)
>   at junit.framework.TestCase.fail(TestCase.java:227)
>   at org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoverySpiTest.waitReconnectEvent(ZookeeperDiscoverySpiTest.java:4685)
>   at org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoverySpiTest.disconnectOnServersLeft(ZookeeperDiscoverySpiTest.java:3541)
>   at org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoverySpiTest.testDisconnectOnServersLeft_4(ZookeeperDiscoverySpiTest.java:3476)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at junit.framework.TestCase.runTest(TestCase.java:176)
>   at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:2086)
>   at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:140)
>   at org.apache.ignite.testframework.junits.GridAbstractTest$5.run(GridAbstractTest.java:2001)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8699) ZookeeperDiscoverySpiTest#testDisconnectOnServersLeft flaky fails (rarely)

2018-06-21 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519560#comment-16519560
 ] 

Dmitriy Pavlov commented on IGNITE-8699:


[~VitaliyB], [~sergey-chugunov], I've made several code-style changes and 
merged the change.

The changes were IDEA inspection suggestions, such as renaming the ignored 
exception 'e' to 'ignored' and adding a blank line between semantic blocks.



[jira] [Commented] (IGNITE-8699) ZookeeperDiscoverySpiTest#testDisconnectOnServersLeft flaky fails (rarely)

2018-06-21 Thread Sergey Chugunov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519330#comment-16519330
 ] 

Sergey Chugunov commented on IGNITE-8699:
-

[~VitaliyB],

I triggered the ZooKeeper (Discovery) 2 suite, and the results look good to me: 
[TC link|https://ci.ignite.apache.org/viewLog.html?buildId=1374005&buildTypeId=IgniteTests24Java8_ZooKeeperDiscovery1&tab=buildResultsDiv]

We can proceed with merging.






[jira] [Commented] (IGNITE-8699) ZookeeperDiscoverySpiTest#testDisconnectOnServersLeft flaky fails (rarely)

2018-06-20 Thread Vitaliy Biryukov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518078#comment-16518078
 ] 

Vitaliy Biryukov commented on IGNITE-8699:
--

[~sergey-chugunov], 
Is the full *ZooKeeper (Discovery) 1* suite enough? 
[TC link|https://ci.ignite.apache.org/viewLog.html?buildId=1374005&buildTypeId=IgniteTests24Java8_ZooKeeperDiscovery1&tab=buildResultsDiv]

You are right about option#1. This case sometimes reproduces on my Linux machine.
A piece of the thread dump (the full thread dump is in the attachments):
{noformat}
Thread [name="disco-event-worker-#2605%internal.ZookeeperDiscoverySpiTest5%", id=3211, state=WAITING, blockCnt=2, waitCnt=6]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
    at o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
    at o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
    at o.a.i.i.managers.discovery.GridDiscoveryManager.localJoin(GridDiscoveryManager.java:2190)
    at o.a.i.spi.discovery.zk.internal.ZookeeperDiscoverySpiTest$2.apply(ZookeeperDiscoverySpiTest.java:315)
    - locked java.util.TreeMap@38081448
    at o.a.i.spi.discovery.zk.internal.ZookeeperDiscoverySpiTest$2.apply(ZookeeperDiscoverySpiTest.java:295)
    at o.a.i.i.managers.eventstorage.GridEventStorageManager$UserListenerWrapper.onEvent(GridEventStorageManager.java:1477)
    at o.a.i.i.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:873)
    at o.a.i.i.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:858)
    at o.a.i.i.managers.eventstorage.GridEventStorageManager.record0(GridEventStorageManager.java:341)
    at o.a.i.i.managers.eventstorage.GridEventStorageManager.record(GridEventStorageManager.java:307)
    at o.a.i.i.managers.discovery.GridDiscoveryManager$DiscoveryWorker.recordEvent(GridDiscoveryManager.java:2703)
    at o.a.i.i.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:2920)
    at o.a.i.i.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body(GridDiscoveryManager.java:2732)
    at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110)
    at java.lang.Thread.run(Thread.java:748)
{noformat}
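
The dump above appears to show a test listener, running on the discovery event worker thread, parked inside a future while waiting for the local join. A minimal sketch of why that can prevent later events from being delivered, using only java.util.concurrent rather than Ignite classes (BlockedListenerSketch, disconnectDelivered, and localJoinFut are hypothetical names chosen for illustration, and the single-threaded executor is an assumed stand-in for the event worker):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the situation in the dump; this is not Ignite code. */
class BlockedListenerSketch {
    /**
     * Queues two events on a single-threaded worker: a listener that blocks on
     * a "local join" future, then a "disconnect" notification.
     *
     * @param joinCompletes whether the local join future is completed up front.
     * @param timeoutMs how long to wait for the disconnect notification.
     * @return true if the disconnect notification ran within the timeout.
     */
    static boolean disconnectDelivered(boolean joinCompletes, long timeoutMs) {
        ExecutorService eventWorker = Executors.newSingleThreadExecutor();
        CompletableFuture<Void> localJoinFut = new CompletableFuture<>();
        CountDownLatch disconnected = new CountDownLatch(1);

        if (joinCompletes)
            localJoinFut.complete(null);

        try {
            // First event: the listener waits for the local join to finish.
            eventWorker.execute(() -> localJoinFut.join());

            // Second event: queued behind the listener on the same thread.
            eventWorker.execute(disconnected::countDown);

            return disconnected.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException ignored) {
            Thread.currentThread().interrupt();

            return false;
        }
        finally {
            // Release the listener so the worker thread can finish and exit.
            localJoinFut.complete(null);
            eventWorker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("join completed: " + disconnectDelivered(true, 2000));
        System.out.println("join never completed: " + disconnectDelivered(false, 200));
    }
}
```

In this sketch the second call times out, because the only worker thread is still parked in the first listener. That matches the symptom "Failed to wait for disconnect/reconnect event", though whether the real test fails for exactly this reason is an assumption.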







[jira] [Commented] (IGNITE-8699) ZookeeperDiscoverySpiTest#testDisconnectOnServersLeft flaky fails (rarely)

2018-06-20 Thread Sergey Chugunov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518042#comment-16518042
 ] 

Sergey Chugunov commented on IGNITE-8699:
-

[~VitaliyB],

The change looks reasonable to me as well, but I think we should run all 
ZooKeeper-related tests on it (right now we have a TC run for only 4 isolated 
tests).

I'm also curious about option#1 for the test failure. Could you share a stack 
trace showing how *DiscoveryWorker* hangs?
Do I understand correctly that it happens when the client node hasn't finished 
its local join procedure before the very last server dies, and the client 
enters some undefined state?






[jira] [Commented] (IGNITE-8699) ZookeeperDiscoverySpiTest#testDisconnectOnServersLeft flaky fails (rarely)

2018-06-18 Thread Pavel Pereslegin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515545#comment-16515545
 ] 

Pavel Pereslegin commented on IGNITE-8699:
--

[~VitaliyB], 
looks good to me.

> ZookeeperDiscoverySpiTest#testDisconnectOnServersLeft flaky fails (rarely)
> --
>
> Key: IGNITE-8699
> URL: https://issues.apache.org/jira/browse/IGNITE-8699
> Project: Ignite
> Issue Type: Bug
> Reporter: Vitaliy Biryukov
> Assignee: Vitaliy Biryukov
> Priority: Major
> Labels: MakeTeamcityGreenAgain
>
> *Affected tests:*
> testDisconnectOnServersLeft_1
> testDisconnectOnServersLeft_2
> testDisconnectOnServersLeft_3
> testDisconnectOnServersLeft_4
> testDisconnectOnServersLeft_5
> *Causes:*
> * Sometimes client nodes don't have time to join the topology.
> * Sometimes the communication failure resolver starts and waits for server nodes.
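
One common way to guard a test against the first cause listed above is to poll, with a timeout, until all clients have joined before stopping the servers. The thread does not say which approach the merged fix took, so the following is only a sketch under that assumption. TopologyWaitSketch, waitForCondition, and joinedClients are hypothetical names, and a counter stands in for the real topology:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; a counter stands in for the cluster topology. */
class TopologyWaitSketch {
    /**
     * Polls the condition every 10 ms until it holds or the timeout elapses.
     *
     * @return true if the condition became true before the timeout.
     */
    static boolean waitForCondition(BooleanSupplier cond, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;

        while (true) {
            if (cond.getAsBoolean())
                return true;

            if (System.nanoTime() - deadline >= 0)
                return false;

            try {
                Thread.sleep(10);
            }
            catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();

                return false;
            }
        }
    }

    public static void main(String[] args) {
        int expectedClients = 3;
        AtomicInteger joinedClients = new AtomicInteger();

        // Simulated clients that join one by one with a small delay.
        Thread joiner = new Thread(() -> {
            for (int i = 0; i < expectedClients; i++) {
                try {
                    Thread.sleep(20);
                }
                catch (InterruptedException ignored) {
                    return;
                }

                joinedClients.incrementAndGet();
            }
        });

        joiner.start();

        // Only after this returns true would the test go on to stop the servers.
        boolean allJoined = waitForCondition(() -> joinedClients.get() == expectedClients, 5000);

        System.out.println("all clients joined: " + allJoined);
    }
}
```

The timeout keeps the test from hanging if a client never joins; the test can then fail with a clear message instead of a missed disconnect/reconnect event.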


