[jira] [Updated] (FLINK-25316) BlobServer can get stuck during shutdown
[ https://issues.apache.org/jira/browse/FLINK-25316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn Visser updated FLINK-25316:
---
    Fix Version/s:     (was: 1.16.0)

> BlobServer can get stuck during shutdown
>
> Key: FLINK-25316
> URL: https://issues.apache.org/jira/browse/FLINK-25316
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.15.0
> Reporter: Robert Metzger
> Priority: Minor
>
> The cluster shutdown can get stuck
> {code}
> "AkkaRpcService-Supervisor-Termination-Future-Executor-thread-1" #89 daemon prio=5 os_prio=0 tid=0x004017d7 nid=0x2ec in Object.wait() [0x00402a9b5000]
>    java.lang.Thread.State: WAITING (on object monitor)
>     at java.lang.Object.wait(Native Method)
>     - waiting on <0xd6c48368> (a org.apache.flink.runtime.blob.BlobServer)
>     at java.lang.Thread.join(Thread.java:1252)
>     - locked <0xd6c48368> (a org.apache.flink.runtime.blob.BlobServer)
>     at java.lang.Thread.join(Thread.java:1326)
>     at org.apache.flink.runtime.blob.BlobServer.close(BlobServer.java:319)
>     at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.stopClusterServices(ClusterEntrypoint.java:406)
>     - locked <0xd5d27350> (a java.lang.Object)
>     at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$shutDownAsync$4(ClusterEntrypoint.java:505
> {code}
> because the BlobServer.run() method ignores interrupts:
> {code}
> "BLOB Server listener at 6124" #30 daemon prio=5 os_prio=0 tid=0x00401c929800 nid=0x2b4 runnable [0x0040263f9000]
>    java.lang.Thread.State: RUNNABLE
>     at java.net.PlainSocketImpl.socketAccept(Native Method)
>     at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
>     at java.net.ServerSocket.implAccept(ServerSocket.java:560)
>     at java.net.ServerSocket.accept(ServerSocket.java:528)
>     at org.apache.flink.util.NetUtils.acceptWithoutTimeout(NetUtils.java:143)
>     at org.apache.flink.runtime.blob.BlobServer.run(BlobServer.java:268)
> {code}
> This issue was introduced in FLINK-24156 and first mentioned in
> https://issues.apache.org/jira/browse/FLINK-24113?focusedCommentId=17459414&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17459414

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
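
The second thread dump shows the listener thread parked in java.net.ServerSocket.accept(). Independent of Flink's code, a platform thread blocked in that call does not react to Thread.interrupt(), whereas closing the socket from another thread ends the call with an IOException. The sketch below reproduces this with the plain JDK only. The class name AcceptShutdownDemo and its method are hypothetical and exist only for illustration. Closing the socket is shown as one common way to unblock accept(), not as the fix that was applied for this ticket.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical demo class, not Flink code. */
final class AcceptShutdownDemo {

    /**
     * Blocks a listener thread in accept(), interrupts it, then closes the socket.
     * Returns {aliveAfterInterrupt, terminatedAfterClose}.
     */
    static boolean[] run() {
        try {
            ServerSocket serverSocket = new ServerSocket(0); // ephemeral port
            Thread listener = new Thread(() -> {
                try {
                    serverSocket.accept(); // blocking call; interrupt() does not end it
                } catch (IOException e) {
                    // reached once the socket is closed by another thread
                }
            }, "demo-listener");
            listener.setDaemon(true);
            listener.start();

            listener.interrupt();
            listener.join(500); // bounded wait, so the demo itself cannot hang
            boolean aliveAfterInterrupt = listener.isAlive();

            serverSocket.close(); // unblocks accept() with a SocketException
            listener.join(5000);
            boolean terminatedAfterClose = !listener.isAlive();

            return new boolean[] {aliveAfterInterrupt, terminatedAfterClose};
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("alive after interrupt:  " + result[0]);
        System.out.println("terminated after close: " + result[1]);
    }
}
```

An untimed Thread.join() on such a listener, as in the first thread dump, can therefore only return after something closes the socket or a connection arrives.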
[jira] [Updated] (FLINK-25316) BlobServer can get stuck during shutdown
[ https://issues.apache.org/jira/browse/FLINK-25316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yun Gao updated FLINK-25316:
    Fix Version/s: 1.16.0

> BlobServer can get stuck during shutdown
>
> Key: FLINK-25316
> URL: https://issues.apache.org/jira/browse/FLINK-25316
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.15.0
> Reporter: Robert Metzger
> Priority: Minor
> Fix For: 1.15.0, 1.16.0
[jira] [Updated] (FLINK-25316) BlobServer can get stuck during shutdown
[ https://issues.apache.org/jira/browse/FLINK-25316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Metzger updated FLINK-25316:
---
    Priority: Minor  (was: Critical)

> BlobServer can get stuck during shutdown
>
> Key: FLINK-25316
> URL: https://issues.apache.org/jira/browse/FLINK-25316
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.15.0
> Reporter: Robert Metzger
> Priority: Minor
> Fix For: 1.15.0
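
The first thread dump in the issue description shows BlobServer.close() waiting in an untimed Thread.join(), so shutdown lasts as long as the listener thread does. As a general illustration, and not the change made for this ticket, the sketch below bounds that wait with Thread.join(long) and reports whether the worker actually stopped. All names (BoundedJoinDemo, stopWithTimeout, and the two worker factories) are hypothetical.

```java
/** Hypothetical demo class, not Flink code. */
final class BoundedJoinDemo {

    /**
     * Interrupts the worker and waits at most timeoutMillis for it to finish.
     * timeoutMillis must be positive: join(0) would wait forever.
     * Returns true if the worker has terminated.
     */
    static boolean stopWithTimeout(Thread worker, long timeoutMillis) {
        worker.interrupt(); // only a request; the worker may ignore it
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
        return !worker.isAlive();
    }

    /** Starts a daemon thread that swallows every interrupt, like a listener that ignores them. */
    static Thread startStubbornWorker() {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    Thread.sleep(60_000);
                } catch (InterruptedException ignored) {
                    // deliberately ignored to model the problematic behaviour
                }
            }
        }, "stubborn-worker");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Starts a daemon thread that exits as soon as it is interrupted. */
    static Thread startCooperativeWorker() {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // fall through and let the thread end
            }
        }, "cooperative-worker");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        System.out.println("stubborn stopped:    " + stopWithTimeout(startStubbornWorker(), 200));
        System.out.println("cooperative stopped: " + stopWithTimeout(startCooperativeWorker(), 5000));
    }
}
```

A timed join only keeps the caller from hanging. The worker keeps running until whatever it is blocked on is released, so the caller still has to decide what to do when the method returns false.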
[jira] [Updated] (FLINK-25316) BlobServer can get stuck during shutdown
[ https://issues.apache.org/jira/browse/FLINK-25316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Metzger updated FLINK-25316:
---
    Description: 
The cluster shutdown can get stuck
{code}
"AkkaRpcService-Supervisor-Termination-Future-Executor-thread-1" #89 daemon prio=5 os_prio=0 tid=0x004017d7 nid=0x2ec in Object.wait() [0x00402a9b5000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xd6c48368> (a org.apache.flink.runtime.blob.BlobServer)
    at java.lang.Thread.join(Thread.java:1252)
    - locked <0xd6c48368> (a org.apache.flink.runtime.blob.BlobServer)
    at java.lang.Thread.join(Thread.java:1326)
    at org.apache.flink.runtime.blob.BlobServer.close(BlobServer.java:319)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.stopClusterServices(ClusterEntrypoint.java:406)
    - locked <0xd5d27350> (a java.lang.Object)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$shutDownAsync$4(ClusterEntrypoint.java:505
{code}
because the BlobServer.run() method ignores interrupts:
{code}
"BLOB Server listener at 6124" #30 daemon prio=5 os_prio=0 tid=0x00401c929800 nid=0x2b4 runnable [0x0040263f9000]
   java.lang.Thread.State: RUNNABLE
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
    at java.net.ServerSocket.implAccept(ServerSocket.java:560)
    at java.net.ServerSocket.accept(ServerSocket.java:528)
    at org.apache.flink.util.NetUtils.acceptWithoutTimeout(NetUtils.java:143)
    at org.apache.flink.runtime.blob.BlobServer.run(BlobServer.java:268)
{code}
This issue was introduced in FLINK-24156 and first mentioned in https://issues.apache.org/jira/browse/FLINK-24113?focusedCommentId=17459414&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17459414

  was:
The cluster shutdown can get stuck
{code}
"AkkaRpcService-Supervisor-Termination-Future-Executor-thread-1" #89 daemon prio=5 os_prio=0 tid=0x004017d7 nid=0x2ec in Object.wait() [0x00402a9b5000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xd6c48368> (a org.apache.flink.runtime.blob.BlobServer)
    at java.lang.Thread.join(Thread.java:1252)
    - locked <0xd6c48368> (a org.apache.flink.runtime.blob.BlobServer)
    at java.lang.Thread.join(Thread.java:1326)
    at org.apache.flink.runtime.blob.BlobServer.close(BlobServer.java:319)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.stopClusterServices(ClusterEntrypoint.java:406)
    - locked <0xd5d27350> (a java.lang.Object)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$shutDownAsync$4(ClusterEntrypoint.java:505
{code}
because the BlobServer.run() method ignores interrupts:
{code}
"BLOB Server listener at 6124" #30 daemon prio=5 os_prio=0 tid=0x00401c929800 nid=0x2b4 runnable [0x0040263f9000]
   java.lang.Thread.State: RUNNABLE
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
    at java.net.ServerSocket.implAccept(ServerSocket.java:560)
    at java.net.ServerSocket.accept(ServerSocket.java:528)
    at org.apache.flink.util.NetUtils.acceptWithoutTimeout(NetUtils.java:143)
    at org.apache.flink.runtime.blob.BlobServer.run(BlobServer.java:268)
{code}
This issue was introduced in FLINK-24156.

> BlobServer can get stuck during shutdown
>
> Key: FLINK-25316
> URL: https://issues.apache.org/jira/browse/FLINK-25316
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.15.0
> Reporter: Robert Metzger
> Priority: Critical
> Fix For: 1.15.0