[ https://issues.apache.org/jira/browse/FLINK-15687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133216#comment-17133216 ]
Till Rohrmann commented on FLINK-15687:
---------------------------------------

Thanks for fixing this problem [~xintongsong]. Let me know once you have the PR ready.

> Potential test instabilities due to concurrent access to TaskSlotTable.
> -----------------------------------------------------------------------
>
>                 Key: FLINK-15687
>                 URL: https://issues.apache.org/jira/browse/FLINK-15687
>             Project: Flink
>          Issue Type: Task
>          Components: Runtime / Coordination, Tests
>    Affects Versions: 1.10.0
>            Reporter: Kostas Kloudas
>            Assignee: Xintong Song
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>             Fix For: 1.11.0
>
>
> Working on [FLINK-14742|https://issues.apache.org/jira/browse/FLINK-14742]
> revealed that the cause of that test instability was the modification of
> the {{taskSlotTable}} of the {{TaskManager}} under test from multiple
> threads, namely the test thread and the main thread of the {{rpcEndpoint}}.
> This data structure is not thread-safe, so this should not happen.
> This anti-pattern seems to be repeated in multiple tests, such as most of the
> tests in the {{TaskExecutorSubmissionTest}} (look for calls to
> {{TaskSlotTable.allocateSlot()}}). There we seem to call
> {{taskSlotTable.allocateSlot()}} from the test thread and then
> {{tmGateway.submitTask()}}, which accesses the slot table from within the
> RPC endpoint's main thread.
> This issue is only to investigate whether this is also a problem in those
> tests.
> cc [~trohrmann], [~chesnay], [~yangwang166]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
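
For illustration only, the thread-confinement idea behind the issue description can be sketched as below. This is a minimal sketch under stated assumptions, not the fix from the PR and not Flink code: {{FakeSlotTable}} and {{FakeEndpoint}} are hypothetical stand-ins for a non-thread-safe slot table and an endpoint with a single main thread. The point is that every access to the table, including the allocation the test wants to perform, is routed through the endpoint's single-threaded executor instead of being done directly from the test thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MainThreadConfinementSketch {

    /** Hypothetical, deliberately non-thread-safe table; NOT Flink's TaskSlotTable. */
    static final class FakeSlotTable {
        private final Map<Integer, String> slots = new HashMap<>();

        /** Returns true if the slot was free and is now assigned to the job. */
        boolean allocateSlot(int index, String jobId) {
            return slots.putIfAbsent(index, jobId) == null;
        }

        boolean isAllocated(int index) {
            return slots.containsKey(index);
        }
    }

    /** Hypothetical endpoint that owns the table and confines it to one thread. */
    static final class FakeEndpoint implements AutoCloseable {
        // Stand-in for the endpoint's main thread: all table access runs here.
        private final ExecutorService mainThread = Executors.newSingleThreadExecutor();
        private final FakeSlotTable slotTable = new FakeSlotTable();

        /** Allocation is scheduled on the main thread, never run by the caller. */
        CompletableFuture<Boolean> allocateSlot(int index, String jobId) {
            return CompletableFuture.supplyAsync(
                    () -> slotTable.allocateSlot(index, jobId), mainThread);
        }

        /** Simplified "submit": succeeds only if the slot was allocated before. */
        CompletableFuture<Boolean> submitTask(int index) {
            return CompletableFuture.supplyAsync(
                    () -> slotTable.isAllocated(index), mainThread);
        }

        @Override
        public void close() {
            mainThread.shutdown();
        }
    }

    public static void main(String[] args) {
        try (FakeEndpoint endpoint = new FakeEndpoint()) {
            // The test thread only waits on futures; it never touches the table itself.
            boolean allocated = endpoint.allocateSlot(0, "job-1").join();
            boolean accepted = endpoint.submitTask(0).join();
            System.out.println(allocated + " " + accepted);
        }
    }
}
```

Because both operations are queued on the same single-threaded executor, they run one after the other on the same thread, so the table needs no locking. Whether the affected tests should take this route or use a different mechanism is left to the actual fix.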