Matthias Pohl created FLINK-32098:
-------------------------------------

             Summary: Dispatcher#submitJob calls 
Dispatcher#isInGloballyTerminalState up to three times which might be expensive 
due to IO
                 Key: FLINK-32098
                 URL: https://issues.apache.org/jira/browse/FLINK-32098
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.16.1, 1.17.0, 1.18.0
            Reporter: Matthias Pohl


{{Dispatcher#submitJob}} calls {{Dispatcher#isInGloballyTerminalState}} up to 
three times (once through {{Dispatcher#isDuplicateJob}} and twice directly), 
and each call invokes {{JobResultStore#hasJobResultEntry}}. 
{{hasJobResultEntry}} calls both {{hasDirtyJobResultEntry}} and 
{{hasCleanJobResultEntry}} if the underlying job hasn't completed globally yet. 
Both calls run {{FileSystem#exists}} on a non-existent file, which can be quite 
an expensive operation (depending on the {{FileSystem}} implementation for 
object storage) since it might require a full table scan.
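
To make the cost concrete, here is a minimal sketch of the worst case described 
above. It is not Flink code: {{CountingResultStore}}, its file paths and the job 
ID are hypothetical stand-ins, and the simulated {{exists}} always returns false 
to model a job that has not completed globally. It only counts how many 
existence checks a single submission would trigger under those assumptions.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustration only; these are hypothetical stand-ins, not Flink's classes. */
final class SubmitJobExistsCount {

    /** Mimics the call structure of the result store and counts exists() calls. */
    static final class CountingResultStore {
        final AtomicInteger existsCalls = new AtomicInteger();

        // Simulates FileSystem#exists on a file that is not there.
        private boolean exists(String path) {
            existsCalls.incrementAndGet();
            return false;
        }

        // The paths below are made up for the example.
        boolean hasDirtyJobResultEntry(String jobId) {
            return exists("dirty/" + jobId);
        }

        boolean hasCleanJobResultEntry(String jobId) {
            return exists("clean/" + jobId);
        }

        // Both lookups run when the job has no entry at all.
        boolean hasJobResultEntry(String jobId) {
            return hasDirtyJobResultEntry(jobId) || hasCleanJobResultEntry(jobId);
        }
    }

    /** Worst case from the description: three terminal-state checks per submission. */
    static int existsCallsForOneSubmission() {
        CountingResultStore store = new CountingResultStore();
        String jobId = "job-1";
        store.hasJobResultEntry(jobId); // once through the duplicate-job check
        store.hasJobResultEntry(jobId); // first direct check
        store.hasJobResultEntry(jobId); // second direct check
        return store.existsCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(existsCallsForOneSubmission()); // prints 6
    }
}
```

Under these assumptions, one submission of a not-yet-completed job results in 
3 x 2 = 6 existence checks against the file system.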

To be honest, nobody has complained so far. But we might want to either 
reconsider the {{FileSystemJobResultStore}}/{{JobResultStore#hasJobResultEntry}} 
implementation or, at least, reduce the number of {{isInGloballyTerminalState}} 
calls in the {{Dispatcher}} and document the performance issue in the JavaDoc.
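
One way the second option could look is sketched below: run the terminal-state 
check once per submission and pass the result on. This is a simplified, 
synchronous model and not the actual {{Dispatcher}} implementation; the class 
name, the {{lookups}} counter, the {{runningJobs}} set and the returned strings 
are all assumptions made for illustration. It also ignores that the job's state 
could change between the check and its later use.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustration only; a hypothetical, simplified model of job submission. */
final class SingleTerminalStateCheck {

    /** Number of (potentially expensive) terminal-state lookups performed. */
    int lookups = 0;

    private final Set<String> runningJobs = new HashSet<>();

    // Stand-in for the IO-backed check; here no job is ever globally terminal.
    boolean isInGloballyTerminalState(String jobId) {
        lookups++;
        return false;
    }

    // Takes the already computed result instead of repeating the lookup.
    boolean isDuplicateJob(String jobId, boolean globallyTerminal) {
        return globallyTerminal || runningJobs.contains(jobId);
    }

    String submitJob(String jobId) {
        // Single lookup per submission; the result is reused below.
        boolean globallyTerminal = isInGloballyTerminalState(jobId);
        if (isDuplicateJob(jobId, globallyTerminal)) {
            return globallyTerminal
                    ? "rejected: already finished"
                    : "rejected: still running";
        }
        runningJobs.add(jobId);
        return "accepted";
    }

    public static void main(String[] args) {
        SingleTerminalStateCheck dispatcher = new SingleTerminalStateCheck();
        System.out.println(dispatcher.submitJob("job-1"));
        System.out.println(dispatcher.submitJob("job-1"));
        System.out.println("lookups: " + dispatcher.lookups);
    }
}
```

In this model each submission performs exactly one lookup, regardless of which 
branch is taken.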



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
