[
https://issues.apache.org/jira/browse/FLINK-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108104#comment-17108104
]
Stephan Ewen commented on FLINK-17645:
--------------------------------------
I see, fair point. This is a static shared thread, so we need to be careful
about leaks. Makes sense, +1
> REAPER_THREAD.start() in SafetyNetCloseableRegistry failed, causing the
> repeated failover.
> ------------------------------------------------------------------------------------------
>
> Key: FLINK-17645
> URL: https://issues.apache.org/jira/browse/FLINK-17645
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Task
> Affects Versions: 1.10.1, 1.11.0
> Reporter: Zakelly Lan
> Assignee: Lijie Wang
> Priority: Major
> Fix For: 1.11.0
>
>
> I'm running a modified version of Flink and encountered the exception below
> when a task starts:
> {code:java}
> 2020-05-12 00:46:19,037 ERROR [***] org.apache.flink.runtime.taskmanager.Task - Encountered an unexpected exception
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:802)
> at org.apache.flink.core.fs.SafetyNetCloseableRegistry.<init>(SafetyNetCloseableRegistry.java:73)
> at org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
> at java.lang.Thread.run(Thread.java:834)
> 2020-05-12 00:46:19,038 INFO [***] org.apache.flink.runtime.taskmanager.Task
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:802)
> at org.apache.flink.core.fs.SafetyNetCloseableRegistry.<init>(SafetyNetCloseableRegistry.java:73)
> at org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
> at java.lang.Thread.run(Thread.java:834)
> {code}
> REAPER_THREAD.start() fails with an OutOfMemoryError, and REAPER_THREAD is
> never reset to null afterwards. From then on, every SafetyNetCloseableRegistry
> initialization in this JVM throws an IllegalStateException:
> {code:java}
> java.lang.IllegalStateException
> at org.apache.flink.util.Preconditions.checkState(Preconditions.java:179)
> at org.apache.flink.core.fs.SafetyNetCloseableRegistry.<init>(SafetyNetCloseableRegistry.java:71)
> at org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
> at java.lang.Thread.run(Thread.java:834)
> {code}
> This may happen in very old versions of Flink as well.
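
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not Flink's actual code: the class ReaperStartSketch, its methods, and the injected `starter` callback are hypothetical, and the OutOfMemoryError is simulated rather than caused by real thread exhaustion. The guarded variant shows one possible way to avoid the stuck state (publish the thread in the static field only after start() has succeeded); it is not necessarily the fix that was merged for this issue.

```java
import java.util.function.Consumer;

final class ReaperStartSketch {

    // Shared state, analogous to a static reaper thread plus a count of users.
    private static final Object LOCK = new Object();
    private static Thread reaper = null;
    private static int users = 0;

    /** Buggy variant: the static field is set before start() is known to succeed. */
    public static void openBuggy(Consumer<Thread> starter) {
        synchronized (LOCK) {
            if (users == 0) {
                if (reaper != null) {
                    throw new IllegalStateException("reaper is already set");
                }
                reaper = new Thread(() -> { }, "reaper");
                starter.accept(reaper); // if this throws, 'reaper' stays non-null
            }
            users++;
        }
    }

    /** Guarded variant: the static field is set only after start() succeeded. */
    public static void openGuarded(Consumer<Thread> starter) {
        synchronized (LOCK) {
            if (users == 0) {
                if (reaper != null) {
                    throw new IllegalStateException("reaper is already set");
                }
                Thread candidate = new Thread(() -> { }, "reaper");
                starter.accept(candidate); // if this throws, nothing was published
                reaper = candidate;
            }
            users++;
        }
    }

    /** Fails the first attempt with a simulated OOM, then reports the second attempt. */
    public static String secondAttemptOutcome(boolean guarded) {
        synchronized (LOCK) { // start from a clean state for each run
            reaper = null;
            users = 0;
        }
        Consumer<Thread> failing = t -> {
            throw new OutOfMemoryError("simulated: unable to create new native thread");
        };
        Consumer<Thread> working = Thread::start;
        try {
            if (guarded) openGuarded(failing); else openBuggy(failing);
        } catch (OutOfMemoryError expected) {
            // The first attempt fails in both variants.
        }
        try {
            if (guarded) openGuarded(working); else openBuggy(working);
            return "started";
        } catch (IllegalStateException e) {
            return "IllegalStateException";
        }
    }

    public static void main(String[] args) {
        System.out.println("buggy:   " + secondAttemptOutcome(false)); // IllegalStateException
        System.out.println("guarded: " + secondAttemptOutcome(true));  // started
    }
}
```

In the buggy variant the second attempt never reaches start(), because the state check fails first. This matches the two stack traces above, where the IllegalStateException is raised two lines before the original start() call.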