Hi All! With the growing number of Flink streaming applications, the current History Server (HS) implementation is starting to lose its value. Users running streaming applications mostly care about what is running on the cluster right now, and a centralised view on history alone is not very useful.
We have been experimenting with reworking the current HS into a Global Flink Dashboard that would show all running and completed/failed jobs across all the running Flink clusters the users have. In essence we would get a view similar to the current HS, but it would also show the running jobs, with links redirecting to the actual cluster-specific dashboards. This is how it looks now:

In this version we took a very simple approach: we introduce a cluster discovery abstraction to collect all the running Flink clusters (by listing YARN applications, for instance). The main pages that aggregate jobs from different clusters then simply make calls to all clusters and merge the responses, while job-specific endpoints are routed to the correct target cluster. This way the required changes are localised to the current HS implementation, and the cluster REST endpoints don't need to be changed.

In addition to getting a fully working global dashboard, this also gives us a fully functioning REST endpoint for accessing all jobs in all clusters without having to provide the clusterId (a YARN application id, for instance), which we can use to enhance the CLI experience in multi-cluster environments (e.g. lots of per-job clusters).

Please let us know what you think!

Gyula
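To make the idea a bit more concrete, here is a minimal sketch of the discovery-plus-aggregation shape described above. All names (`ClusterDiscovery`, `JobSummary`, `GlobalDashboard`) are hypothetical illustrations, not existing Flink classes, and the per-cluster REST call is stubbed out with an in-memory map:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction: an implementation could discover clusters
// by listing YARN applications running Flink, for example.
interface ClusterDiscovery {
    List<String> listClusterEndpoints();
}

// A job tagged with its origin cluster, so that job-specific requests
// can later be routed (or redirected) to the right dashboard.
class JobSummary {
    final String jobId;
    final String clusterEndpoint;

    JobSummary(String jobId, String clusterEndpoint) {
        this.jobId = jobId;
        this.clusterEndpoint = clusterEndpoint;
    }
}

class GlobalDashboard {
    private final ClusterDiscovery discovery;
    // Stand-in for the per-cluster REST call (e.g. GET /jobs/overview).
    private final Map<String, List<String>> jobsPerCluster;

    GlobalDashboard(ClusterDiscovery discovery,
                    Map<String, List<String>> jobsPerCluster) {
        this.discovery = discovery;
        this.jobsPerCluster = jobsPerCluster;
    }

    // Fan out to every discovered cluster and merge the responses into
    // one global job list, keeping the origin for later routing.
    List<JobSummary> aggregateJobs() {
        List<JobSummary> all = new ArrayList<>();
        for (String endpoint : discovery.listClusterEndpoints()) {
            for (String jobId : jobsPerCluster.getOrDefault(endpoint, List.of())) {
                all.add(new JobSummary(jobId, endpoint));
            }
        }
        return all;
    }
}
```

The point of the sketch is that only the HS side changes: the per-cluster REST API stays untouched, and the dashboard simply fans out, merges, and remembers which cluster each job came from.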
