[jira] [Updated] (IGNITE-9238) Test GridTaskFailoverAffinityRunTest.testNodeRestartClient hangs when coordinator forces client to reconnect on grid startup.

2018-10-31 Thread Pavel Pereslegin (JIRA)


[ https://issues.apache.org/jira/browse/IGNITE-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Pereslegin updated IGNITE-9238:
-------------------------------------
Ignite Flags:   (was: Docs Required)

> Test GridTaskFailoverAffinityRunTest.testNodeRestartClient hangs when 
> coordinator forces client to reconnect on grid startup.
> ------------------------------------------------------------------------
>
> Key: IGNITE-9238
> URL: https://issues.apache.org/jira/browse/IGNITE-9238
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.6
> Reporter: Pavel Pereslegin
> Assignee: Pavel Pereslegin
> Priority: Major
> Fix For: 2.7
>
> Attachments: Reproducer.java
>
>
> Example of such a hang on TeamCity: 
> https://ci.ignite.apache.org/viewLog.html?buildId=1605243&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_ComputeGrid
> Log output:
> {noformat}
> ...
> [2018-08-07 12:20:09,804][WARN 
> ][sys-#12799%internal.GridTaskFailoverAffinityRunTest1%][GridCachePartitionExchangeManager]
>  Client node tries to connect but its exchange info is cleaned up from 
> exchange history. Consider increasing 'IGNITE_EXCHANGE_HISTORY_SIZE' property 
> or start clients in  smaller batches. Current settings and versions: 
> [IGNITE_EXCHANGE_HISTORY_SIZE=1000, initVer=AffinityTopologyVersion 
> [topVer=3, minorTopVer=0], readyVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0]].
> [2018-08-07 12:20:09,804][INFO 
> ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][GridDhtPartitionsExchangeFuture]
>  Completed partition exchange 
> [localNode=511d5932-5f22-4919-807d-575c7f61, 
> exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion 
> [topVer=3, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode 
> [id=6b9a7a1d-07bf-4d20-882a-8462ada3, addrs=ArrayList [127.0.0.1], 
> sockAddrs=HashSet [/127.0.0.1:47502], discPort=47502, order=3, intOrder=3, 
> lastExchangeTime=1533644409739, loc=false, ver=2.7.0#20180807-sha1:e96616f5, 
> isClient=false], done=true], topVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0], durationFromInit=21]
> [2018-08-07 12:20:09,806][INFO 
> ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][time] 
> Finished exchange init [topVer=AffinityTopologyVersion [topVer=3, 
> minorTopVer=0], crd=true]
> [2018-08-07 12:20:09,807][INFO 
> ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][GridCachePartitionExchangeManager]
>  Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion 
> [topVer=4, minorTopVer=0], force=false, evt=NODE_JOINED, 
> node=6b9a7a1d-07bf-4d20-882a-8462ada3]
> [2018-08-07 12:20:09,811][INFO 
> ][sys-#12798%internal.GridTaskFailoverAffinityRunTest2%][GridDhtPartitionsExchangeFuture]
>  Finish exchange future [startVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0], resVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], 
> err=null]
> [2018-08-07 12:20:09,813][INFO 
> ][sys-#12798%internal.GridTaskFailoverAffinityRunTest2%][GridDhtPartitionsExchangeFuture]
>  Completed partition exchange 
> [localNode=a3206c1f-6d57-4fd6-8aa5-e22f3b42, 
> exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion 
> [topVer=4, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode 
> [id=a3206c1f-6d57-4fd6-8aa5-e22f3b42, addrs=ArrayList [127.0.0.1], 
> sockAddrs=HashSet [/127.0.0.1:47503], discPort=47503, order=4, intOrder=4, 
> lastExchangeTime=1533644409779, loc=true, ver=2.7.0#20180807-sha1:e96616f5, 
> isClient=false], done=true], topVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0], durationFromInit=41]
> [2018-08-07 12:20:09,814][INFO 
> ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] To 
> start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
> [2018-08-07 12:20:09,815][INFO 
> ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] 
> [2018-08-07 12:20:09,815][INFO 
> ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] 
> >>> +---+
> >>> Ignite ver. 
> >>> 2.7.0-SNAPSHOT#20180807-sha1:e96616f580930f267eab44f75d410fa29a876bcb
> >>> +---+
> >>> OS name: Linux 4.4.0-128-generic amd64
> >>> CPU(s): 5
> >>> Heap: 2.0GB
> >>> VM name: 20126@8790182f15a5
> >>> Ignite instance name: internal.GridTaskFailoverAffinityRunTest1
> >>> Local node [ID=511D5932-5F22-4919-807D-575C7F61, order=2, 
> >>> clientMode=false]
> >>> Local node addresses: [127.0.0.1]
> >>> Local ports: TCP:10801 TCP:45821 TCP:47501 
> [2018-08-07 12:20:09,816][INFO 
> ][grid-starter-testNodeRestartClient-1][GridDiscoveryManager] Topology 
> snapshot [ver=2, servers=1
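
The WARN entry in the log above carries three numbers that are easy to lose in the wrapped output: the configured IGNITE_EXCHANGE_HISTORY_SIZE, the client's initVer and the coordinator's readyVer. Below is a minimal sketch, assuming the log entry has been re-joined into one line, that extracts them so the version gap can be compared with the history size. The class ExchangeWarnParser and its methods are hypothetical helpers written for illustration; they are not part of Ignite, and the patterns are only derived from the message text quoted here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Ignite): pulls numbers out of the exchange-history warning. */
public class ExchangeWarnParser {
    // Patterns are based solely on the warning text quoted in this thread.
    private static final Pattern HISTORY_SIZE =
        Pattern.compile("IGNITE_EXCHANGE_HISTORY_SIZE=(\\d+)");
    private static final Pattern INIT_VER =
        Pattern.compile("initVer=AffinityTopologyVersion \\[topVer=(\\d+)");
    private static final Pattern READY_VER =
        Pattern.compile("readyVer=AffinityTopologyVersion \\[topVer=(\\d+)");

    /** Returns the first captured number for the given pattern or fails loudly. */
    private static long extract(Pattern p, String line) {
        Matcher m = p.matcher(line);
        if (!m.find())
            throw new IllegalArgumentException("Field not found: " + p.pattern());
        return Long.parseLong(m.group(1));
    }

    /** History size reported in the warning. */
    public static long historySize(String line) {
        return extract(HISTORY_SIZE, line);
    }

    /** Difference between the major parts of readyVer and initVer. */
    public static long versionGap(String line) {
        return extract(READY_VER, line) - extract(INIT_VER, line);
    }

    public static void main(String[] args) {
        // The warning from the log, with the e-mail line wraps removed.
        String warn = "Current settings and versions: "
            + "[IGNITE_EXCHANGE_HISTORY_SIZE=1000, "
            + "initVer=AffinityTopologyVersion [topVer=3, minorTopVer=0], "
            + "readyVer=AffinityTopologyVersion [topVer=4, minorTopVer=0]].";

        System.out.println("historySize=" + historySize(warn)
            + ", versionGap=" + versionGap(warn));
    }
}
```

For the entry quoted in this message the sketch reports a history size of 1000 and a gap of one major topology version, which may help when judging whether the remedy suggested by the warning (a larger history) is relevant to a given hang.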

[jira] [Updated] (IGNITE-9238) Test GridTaskFailoverAffinityRunTest.testNodeRestartClient hangs when coordinator forces client to reconnect on grid startup.

2018-08-13 Thread Pavel Pereslegin (JIRA)


[ https://issues.apache.org/jira/browse/IGNITE-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Pereslegin updated IGNITE-9238:
-------------------------------------
Attachment: Reproducer.java


[jira] [Updated] (IGNITE-9238) Test GridTaskFailoverAffinityRunTest.testNodeRestartClient hangs when coordinator forces client to reconnect on grid startup.

2018-08-08 Thread Pavel Pereslegin (JIRA)


[ https://issues.apache.org/jira/browse/IGNITE-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Pereslegin updated IGNITE-9238:
-------------------------------------
Labels:   (was: MakeTeamcityGreenAgain)

> Test GridTaskFailoverAffinityRunTest.testNodeRestartClient hangs when 
> coordinator forces client to reconnect on grid startup.
> ------------------------------------------------------------------------
>
> Key: IGNITE-9238
> URL: https://issues.apache.org/jira/browse/IGNITE-9238
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.6
> Reporter: Pavel Pereslegin
> Assignee: Pavel Pereslegin
> Priority: Major
> Fix For: 2.7
>
>
> Example of such a hang on TeamCity: 
> https://ci.ignite.apache.org/viewLog.html?buildId=1605243&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_ComputeGrid
> Log output:
> {noformat}
> [2018-08-07 12:20:09,804][WARN 
> ][sys-#12799%internal.GridTaskFailoverAffinityRunTest1%][GridCachePartitionExchangeManager]
>  Client node tries to connect but its exchange info is cleaned up from 
> exchange history. Consider increasing 'IGNITE_EXCHANGE_HISTORY_SIZE' property 
> or start clients in  smaller batches. Current settings and versions: 
> [IGNITE_EXCHANGE_HISTORY_SIZE=1000, initVer=AffinityTopologyVersion 
> [topVer=3, minorTopVer=0], readyVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0]].
> [2018-08-07 12:20:09,804][INFO 
> ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][GridDhtPartitionsExchangeFuture]
>  Completed partition exchange 
> [localNode=511d5932-5f22-4919-807d-575c7f61, 
> exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion 
> [topVer=3, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode 
> [id=6b9a7a1d-07bf-4d20-882a-8462ada3, addrs=ArrayList [127.0.0.1], 
> sockAddrs=HashSet [/127.0.0.1:47502], discPort=47502, order=3, intOrder=3, 
> lastExchangeTime=1533644409739, loc=false, ver=2.7.0#20180807-sha1:e96616f5, 
> isClient=false], done=true], topVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0], durationFromInit=21]
> [2018-08-07 12:20:09,806][INFO 
> ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][time] 
> Finished exchange init [topVer=AffinityTopologyVersion [topVer=3, 
> minorTopVer=0], crd=true]
> [2018-08-07 12:20:09,807][INFO 
> ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][GridCachePartitionExchangeManager]
>  Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion 
> [topVer=4, minorTopVer=0], force=false, evt=NODE_JOINED, 
> node=6b9a7a1d-07bf-4d20-882a-8462ada3]
> [2018-08-07 12:20:09,811][INFO 
> ][sys-#12798%internal.GridTaskFailoverAffinityRunTest2%][GridDhtPartitionsExchangeFuture]
>  Finish exchange future [startVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0], resVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], 
> err=null]
> [2018-08-07 12:20:09,813][INFO 
> ][sys-#12798%internal.GridTaskFailoverAffinityRunTest2%][GridDhtPartitionsExchangeFuture]
>  Completed partition exchange 
> [localNode=a3206c1f-6d57-4fd6-8aa5-e22f3b42, 
> exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion 
> [topVer=4, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode 
> [id=a3206c1f-6d57-4fd6-8aa5-e22f3b42, addrs=ArrayList [127.0.0.1], 
> sockAddrs=HashSet [/127.0.0.1:47503], discPort=47503, order=4, intOrder=4, 
> lastExchangeTime=1533644409779, loc=true, ver=2.7.0#20180807-sha1:e96616f5, 
> isClient=false], done=true], topVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0], durationFromInit=41]
> [2018-08-07 12:20:09,814][INFO 
> ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] To 
> start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
> [2018-08-07 12:20:09,815][INFO 
> ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] 
> [2018-08-07 12:20:09,815][INFO 
> ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] 
> >>> +---+
> >>> Ignite ver. 
> >>> 2.7.0-SNAPSHOT#20180807-sha1:e96616f580930f267eab44f75d410fa29a876bcb
> >>> +---+
> >>> OS name: Linux 4.4.0-128-generic amd64
> >>> CPU(s): 5
> >>> Heap: 2.0GB
> >>> VM name: 20126@8790182f15a5
> >>> Ignite instance name: internal.GridTaskFailoverAffinityRunTest1
> >>> Local node [ID=511D5932-5F22-4919-807D-575C7F61, order=2, 
> >>> clientMode=false]
> >>> Local node addresses: [127.0.0.1]
> >>> Local ports: TCP:10801 TCP:45821 TCP:47501 
> [2018-08-07 12:20:09,816][INFO 
> ][grid-starter-testNodeRestartClient-1][GridDiscoveryManager] Topology 
> snapshot [ver=2, servers=1, clients=1, CPUs=5, offheap=0.1GB, heap=2.0