[ https://issues.apache.org/jira/browse/FLINK-8856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen updated FLINK-8856:
--------------------------------
    Fix Version/s: 1.4.3

> Move all interrupt() calls to TaskCanceler
> ------------------------------------------
>
>                 Key: FLINK-8856
>                 URL: https://issues.apache.org/jira/browse/FLINK-8856
>             Project: Flink
>          Issue Type: Bug
>          Components: TaskManager
>    Affects Versions: 1.4.0, 1.4.1, 1.4.2
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>            Priority: Blocker
>             Fix For: 1.5.0, 1.6.0, 1.4.3
>
>
> We need this to work around the following JVM bug:
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8138622
> To circumvent this problem, the {{TaskCancelerWatchDog}} must not call
> {{interrupt()}} at all, but only join on the executing thread (with timeout)
> and cause a hard exit once cancellation takes too long.
> A user affected by this problem reported it in FLINK-8834.
> Personal note: The {{Thread.join(...)}} method is unfortunately not 100% reliable
> either, because it uses {{System.currentTimeMillis()}} rather than
> {{System.nanoTime()}}. Because of that, a timed join can take overly long when
> the clock is adjusted. I wonder why the JDK authors do not follow their own
> recommendations and use {{System.nanoTime()}} for all relative time
> measures...
> EDIT: I am not the only one wondering why:
> https://stackoverflow.com/questions/42544387/why-does-thread-join-use-currenttimemillis

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
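
For illustration, the join-only watchdog described in the issue could look roughly like the sketch below. This is not Flink's actual {{TaskCancelerWatchDog}}: the class name {{JoinOnlyWatchdog}}, its constructor parameters, and the {{onTimeout}} callback (standing in for the hard exit) are all assumptions made for this example. The sketch never calls {{interrupt()}} on the executing thread, and it tracks its own deadline with {{System.nanoTime()}} so that the total wait is bounded even if an individual timed join returns early or late.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch of a watchdog that only joins on the executing thread.
 * Names and structure are illustrative assumptions, not Flink's implementation.
 */
final class JoinOnlyWatchdog implements Runnable {

    private final Thread executer;     // thread running the task under cancellation
    private final long timeoutNanos;   // how long cancellation may take
    private final Runnable onTimeout;  // assumed hook, e.g. a hard exit of the process

    JoinOnlyWatchdog(Thread executer, long timeoutMillis, Runnable onTimeout) {
        this.executer = executer;
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        this.onTimeout = onTimeout;
    }

    @Override
    public void run() {
        // The deadline is based on the monotonic clock, not on wall-clock time.
        final long deadline = System.nanoTime() + timeoutNanos;
        try {
            while (executer.isAlive()) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    break;
                }
                // join(0) would wait forever, so always wait at least 1 ms.
                long waitMillis = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
                // Only join; the executing thread is never interrupted from here.
                executer.join(waitMillis);
            }
        } catch (InterruptedException e) {
            // The watchdog itself was interrupted: restore the flag and give up.
            Thread.currentThread().interrupt();
            return;
        }
        if (executer.isAlive()) {
            // Cancellation took too long: escalate.
            onTimeout.run();
        }
    }
}
```

In a real process the callback might call something like {{Runtime.getRuntime().halt(...)}}; passing it in as a {{Runnable}} keeps the sketch testable without terminating the JVM.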