[ https://issues.apache.org/jira/browse/IGNITE-15767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448157#comment-17448157 ]
Amelchev Nikita commented on IGNITE-15767:
------------------------------------------

Cherry-picked to 2.12.

> Need to workaround JDK bug JDK-8247750
> --------------------------------------
>
>                 Key: IGNITE-15767
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15767
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Assignee: Ivan Bessonov
>            Priority: Major
>             Fix For: 2.12
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> See [https://bugs.openjdk.java.net/browse/JDK-8247750]
> ServerSocket.accept() with no timeout may throw SocketTimeoutException when
> the process receives a signal. This can cause an unexpected exception in the
> discovery reader:
> {noformat}
> [09:52:26,301][SEVERE][tcp-disco-srvr-[:47500]-#3-#71][TcpDiscoverySpi] Failed to accept TCP connection.
> java.net.SocketTimeoutException: Accept timed out
>     at java.base/java.net.PlainSocketImpl.socketAccept(Native Method)
>     at java.base/java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:458)
>     at java.base/java.net.ServerSocket.implAccept(ServerSocket.java:565)
>     at java.base/java.net.ServerSocket.accept(ServerSocket.java:533)
>     at org.apache.ignite.spi.discovery.tcp.ServerImpl$TcpServer.body(ServerImpl.java:6750)
>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>     at org.apache.ignite.spi.discovery.tcp.ServerImpl$TcpServerThread.body(ServerImpl.java:6673)
>     at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:57)
> {noformat}
> There are mentions of this on StackOverflow and the user list, and we also see
> it when running Docker on Alpine 3.15. Speculation is that this is caused by a
> combination of a specific version of the Linux kernel and the environment.
> Sidenote: based on strace analysis on Alpine, the process doesn't even receive
> any signals; it may be that Alpine interrupts the syscall "as if" by a signal.
> The bug is not fixed in JDK 11 (surprisingly). The workaround is easy, though:
> wrap the accept in try-catch, and retry on the unexpected timeout exception.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
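
For illustration, the workaround described in the issue (wrap the accept in try-catch and retry) could look roughly like the sketch below. This is not the actual Ignite patch: the class name `AcceptRetry` and the method `acceptWithRetry` are hypothetical, and the check on `getSoTimeout()` is an assumption added here so that a timeout the caller really configured is still reported.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical helper for illustration only; not taken from the Ignite code base. */
final class AcceptRetry {
    private AcceptRetry() {
        // Static helper, no instances.
    }

    /**
     * Accepts a connection, retrying when accept() reports a timeout even though
     * no timeout is configured on the server socket (the JDK-8247750 symptom).
     */
    static Socket acceptWithRetry(ServerSocket srvSock) throws IOException {
        while (true) {
            try {
                return srvSock.accept();
            }
            catch (SocketTimeoutException e) {
                // Assumption: a non-zero SO_TIMEOUT means the caller asked for a
                // timeout, so the exception is legitimate and must be propagated.
                if (srvSock.getSoTimeout() != 0)
                    throw e;

                // SO_TIMEOUT is 0 (block forever), so this timeout is spurious: retry.
            }
        }
    }
}
```

Any other `IOException`, for example the `SocketException` thrown when the server socket is closed, still propagates to the caller, so the loop ends when the socket is shut down.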