Hi, Yi. The FLIP is both interesting and highly promising for Flink users. Once implemented, it will enable powerful use cases—such as running a Jupyter Notebook kernel or SQL Gateway as a first-class application within the JobManager. This represents a significant step forward in usability and integration.
I’d like to share a few suggestions and clarifications that could help strengthen the proposal:

*Asynchronous REST API for Application Submission*

Given that launching such applications may involve complex initialization and take considerable time to complete, it would be beneficial to support an asynchronous submission mechanism via REST. A synchronous endpoint might lead to timeouts or poor user experience. An async API could return an application ID immediately, allowing users to poll or query the status of the deployment using that identifier.

*Clarification on "Pre-termination Cleanup"*

The term pre-termination cleanup is mentioned several times in the document. Could you please elaborate on what this entails? Specifically, which resources are expected to be released, and at what point in the life cycle does this occur? A clearer definition would help ensure consistent implementation and improve reliability.

*Potential Job Leak Prevention*

There appears to be a risk of job leaks if an application fails to properly cancel its associated Flink job upon termination. To mitigate this, we might consider introducing a background daemon thread (or a monitoring service) that periodically checks for orphaned jobs whose parent applications have already terminated, and automatically triggers cleanup. Alternatively, integrating with Flink’s existing lifecycle management mechanisms could help ensure robust resource cleanup.

*API Compatibility Considerations*

It would be helpful to clarify how the new application model aligns with existing APIs. Many external systems currently rely on job IDs to monitor or cancel jobs. Will these operations still be supported under the new model? For example, can users continue to use the existing REST endpoints to cancel a job or check its status using the job ID, even when the job was launched through this new application framework?
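
To make the job-leak suggestion above a bit more concrete, here is a rough sketch in Java of what a periodic orphan check could look like. Everything in it is hypothetical: the class `OrphanJobReaperSketch`, its methods, and the in-memory bookkeeping are illustrative stand-ins and not part of the FLIP or of any existing Flink API. A real implementation would presumably read job and application state from the Dispatcher and cancel jobs through it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Illustrative only: all names here are assumptions, not Flink APIs. */
public class OrphanJobReaperSketch {

    // Hypothetical bookkeeping: job ID -> ID of the application that submitted it.
    private final Map<String, String> jobToApplication = new ConcurrentHashMap<>();

    // Hypothetical set of application IDs that have reached a terminal state.
    private final Set<String> terminatedApplications = ConcurrentHashMap.newKeySet();

    public void registerJob(String jobId, String applicationId) {
        jobToApplication.put(jobId, applicationId);
    }

    public void markApplicationTerminated(String applicationId) {
        terminatedApplications.add(applicationId);
    }

    /** Returns the IDs of jobs whose parent application has already terminated, sorted. */
    public List<String> findOrphanedJobs() {
        List<String> orphans = new ArrayList<>();
        for (Map.Entry<String, String> entry : jobToApplication.entrySet()) {
            if (terminatedApplications.contains(entry.getValue())) {
                orphans.add(entry.getKey());
            }
        }
        Collections.sort(orphans);
        return orphans;
    }

    /**
     * Runs one cleanup pass and returns the job IDs it handled.
     * A real implementation would trigger job cancellation here; this sketch
     * only drops the bookkeeping entry as a stand-in for that call.
     */
    public List<String> reapOnce() {
        List<String> orphans = findOrphanedJobs();
        for (String jobId : orphans) {
            jobToApplication.remove(jobId);
        }
        return orphans;
    }

    /** Starts a daemon thread that runs reapOnce() at a fixed period. */
    public ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(runnable -> {
                    Thread thread = new Thread(runnable, "orphan-job-reaper");
                    thread.setDaemon(true); // must not keep the JVM alive on shutdown
                    return thread;
                });
        scheduler.scheduleAtFixedRate(
                this::reapOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

The check itself (`findOrphanedJobs`) is kept separate from the scheduling so that it can be tested or invoked on demand, for example from the application's own termination path, which might make the periodic thread a safety net rather than the primary cleanup mechanism.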

Best,
Shengkai

Yi Zhang <[email protected]> wrote on Tue, Sep 23, 2025 at 11:23:

> Hi everyone,
>
> I would like to start a discussion about FLIP-549: Support Application
> Management [1].
>
> Despite Flink’s widespread adoption, the existing model for running user
> logic limits observability and execution flexibility, which affects user
> experience. This FLIP introduces a new application management framework
> designed to close these gaps and provide a foundation for future
> improvements.
>
> Looking forward to your feedback and suggestions.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-549%3A+Support+Application+Management
>
> Best regards,
>
> Yi Zhang
