[ https://issues.apache.org/jira/browse/IGNITE-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370436#comment-15370436 ]

Denis Magda commented on IGNITE-3412:
-------------------------------------

[~ascherbakov],

According to the logs and thread dumps, there shouldn't be any {{InterruptedIOException}}, and the socket should have been closed by {{SocketReader}}. After that, {{SocketWriter}} received a {{SocketException}}. The question is why the interrupted flag was reset after {{SocketWriter}} received this exception.

Please keep debugging by doing the following:
- enable {{DEBUG}} level logging for {{ClientImpl}} and {{TcpDiscoverySpi}};
- run the SocketsTest.zip test on the target machine: start the {{Second}} executable first, then the {{First}} executable.

It looks like Windows resets the interrupted flag on {{SocketException}}. This needs to be double-checked.

> Client instance hangs on close
> ------------------------------
>
>                 Key: IGNITE-3412
>                 URL: https://issues.apache.org/jira/browse/IGNITE-3412
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 1.6
>            Reporter: Alexei Scherbakov
>            Assignee: Denis Magda
>             Fix For: 1.7
>
>         Attachments: SocketsTest.zip, threadDump.txt
>
>
> In some cases, calling close on an Ignite client instance leads to a deadlock.
> The deadlock happens because of the following.
> The socket writer is waiting for new messages.
> {code}
> "tcp-client-disco-sock-writer-#2%null%" #100 prio=6 os_prio=0 tid=0x000000005fad2800 nid=0x13bc in Object.wait() [0x0000000067d0e000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:502)
>         at org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1051)
>         - locked <0x00000000863da2f8> (a java.lang.Object)
>         at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
> {code}
> The closing process hangs because TcpDiscoverySpi waits for this writer to terminate.
> {code}
> "Thread-6" #29 prio=6 os_prio=0 tid=0x000000005a740000 nid=0x17e8 in Object.wait() [0x000000006077e000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Thread.join(Thread.java:1245)
>         - locked <0x00000000863da010> (a org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter)
>         at java.lang.Thread.join(Thread.java:1319)
>         at org.apache.ignite.internal.util.IgniteUtils.join(IgniteUtils.java:4476)
>         at org.apache.ignite.spi.discovery.tcp.ClientImpl.spiStop(ClientImpl.java:295)
>         at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStop(TcpDiscoverySpi.java:1905)
>         at org.apache.ignite.internal.managers.GridManagerAdapter.stopSpi(GridManagerAdapter.java:325)
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.stop(GridDiscoveryManager.java:1336)
>         at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:1940)
>         at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:1812)
>         at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2248)
>         - locked <0x00000000858e77a8> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
>         at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2211)
>         at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:322)
>         at org.apache.ignite.Ignition.stop(Ignition.java:224)
>         at org.apache.ignite.internal.IgniteKernal.close(IgniteKernal.java:2921)
>         at ru.sbrf.ggcod.loader.job.MainLoader.run(MainLoader.java:123)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> There is some race that led to the situation where the writer hangs in the
> {{Object.wait}} method, ignoring the interrupted flag that was set at some
> point in time.
> The full thread dump is attached.
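
For illustration only, here is a minimal standalone sketch of the suspected failure mode. It is not Ignite code: the {{Writer}} class, its {{mux}} monitor, the {{stopped}} flag, and the {{terminates}} helper are all hypothetical names. The sketch assumes the interrupted flag gets cleared before the thread reaches {{Object.wait}} and simulates that with {{Thread.interrupted()}}. Under that assumption a shutdown that relies on the interrupt alone leaves the thread parked, while a shutdown that also sets an explicit flag and notifies the monitor lets it exit.

```java
final class WriterShutdownSketch {
    /** Hypothetical stand-in for a writer thread that waits for messages on a monitor. */
    static final class Writer extends Thread {
        private final Object mux = new Object();
        private volatile boolean stopped;

        Writer() {
            setDaemon(true); // lets the JVM exit even if this thread stays blocked
        }

        /** Shutdown that relies on the interrupted flag alone. */
        void interruptOnly() {
            interrupt();
        }

        /** Shutdown that also sets an explicit flag and wakes the waiter. */
        void stopAndInterrupt() {
            synchronized (mux) {
                stopped = true;
                mux.notifyAll();
            }
            interrupt();
        }

        @Override public void run() {
            // Simulate the suspected problem: the interrupt arrives, but the flag
            // is cleared (Thread.interrupted() resets it) before wait() is reached.
            while (!Thread.interrupted())
                Thread.yield();

            synchronized (mux) {
                while (!stopped) {
                    try {
                        mux.wait(); // blocks until notified or interrupted again
                    }
                    catch (InterruptedException e) {
                        return;
                    }
                }
            }
        }
    }

    /** Returns true if the writer terminated within timeoutMs after shutdown. */
    static boolean terminates(boolean useStopFlag, long timeoutMs) {
        Writer w = new Writer();
        w.start();

        if (useStopFlag)
            w.stopAndInterrupt();
        else
            w.interruptOnly();

        try {
            w.join(timeoutMs); // bounded join, unlike the unbounded join in the dump
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        return !w.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("interrupt only:   terminated=" + terminates(false, 500));
        System.out.println("stop flag + wake: terminated=" + terminates(true, 5000));
    }
}
```

Whether an explicit stop flag or a bounded join is the right fix for {{ClientImpl}} is a separate question; the sketch only shows why a lost interrupted flag is enough to produce the two thread states in the dumps above.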