[ https://issues.apache.org/jira/browse/FLINK-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626190#comment-16626190 ]
Ufuk Celebi commented on FLINK-10292:
-------------------------------------

I understand that non-determinism may be an issue when generating the {{JobGraph}}, but do we have some data about how common that is for applications? Would it be possible to keep a fixed JobGraph in the image instead of persisting one in the {{SubmittedJobGraphStore}}? I like our current approach, because it keeps the source of truth for the job in the image instead of the {{SubmittedJobGraphStore}}.

I'm wondering about the following scenario:
* A user creates a job cluster with high availability enabled (cluster ID for the logical application, e.g. myapp)
** This will persist the job with a fixed ID (after FLINK-10291) on first submission
* The user kills the application *without* cancelling
** This will leave all data in the high availability store(s), such as job graphs or checkpoints
* The user updates the image with a modified application and keeps the high availability configuration (e.g. cluster ID stays myapp)
** This will result in the job in the image being ignored, since we already have a job graph with the same (fixed) ID

I think in such a scenario it can be desirable to still have the checkpoints available, but it might be problematic if the job graph is recovered from the {{SubmittedJobGraphStore}} instead of using the job that is part of the image.

What do you think about this scenario? Is it the responsibility of the user to handle this? If so, I think that the approach outlined in this ticket makes sense. If not, we may want to consider alternatives or ignore potential non-determinism.

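
To make the scenario concrete, here is a minimal sketch of "recover from the store if present, otherwise generate once and persist". It is only an illustration under stated assumptions: {{JobGraphRecoverySketch}}, {{startCluster}}, the map standing in for the HA store, and the plain strings standing in for job graphs are all hypothetical and are not Flink classes or APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the entrypoint logic; not Flink code.
final class JobGraphRecoverySketch {

    /**
     * Returns the job graph the cluster would run.
     * haStore models the HA job graph store (assumption: a map keyed by job ID).
     * generateFromImage models building the job graph from the user code in the image.
     */
    static String startCluster(Map<String, String> haStore,
                               String fixedJobId,
                               Supplier<String> generateFromImage) {
        String recovered = haStore.get(fixedJobId);
        if (recovered != null) {
            // A graph with the same fixed ID already exists: the image is not consulted.
            return recovered;
        }
        // First submission: generate exactly once and persist the result.
        String generated = generateFromImage.get();
        haStore.put(fixedJobId, generated);
        return generated;
    }

    public static void main(String[] args) {
        Map<String, String> haStore = new HashMap<>();

        // First start with the original image.
        String first = startCluster(haStore, "myapp-job", () -> "graph-v1");

        // Application killed without cancelling: haStore keeps its contents.
        // Second start with an updated image and the same cluster/job ID.
        String second = startCluster(haStore, "myapp-job", () -> "graph-v2");

        System.out.println(first + " " + second); // prints "graph-v1 graph-v1"
    }
}
```

In this toy model the updated image's job ("graph-v2") is never generated, because the entry left behind under the fixed ID wins. Whether clearing that entry is the user's responsibility is exactly the open question above.
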
> Generate JobGraph in StandaloneJobClusterEntrypoint only once
> -------------------------------------------------------------
>
> Key: FLINK-10292
> URL: https://issues.apache.org/jira/browse/FLINK-10292
> Project: Flink
> Issue Type: Improvement
> Components: Distributed Coordination
> Affects Versions: 1.6.0, 1.7.0
> Reporter: Till Rohrmann
> Assignee: vinoyang
> Priority: Major
> Fix For: 1.7.0, 1.6.2
>
>
> Currently the {{StandaloneJobClusterEntrypoint}} generates the {{JobGraph}}
> from the given user code every time it starts/is restarted. This can be
> problematic if the {{JobGraph}} generation has side effects. Therefore,
> it would be better to generate the {{JobGraph}} only once, store it in HA
> storage, and retrieve it from there on later starts.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)