Hi Gyula,

thanks for proposing this extension. I can see that such a feature could be
helpful.

However, I wouldn't consider the management of multiple clusters core to
Flink. Managing a single cluster is already complex enough, and given the
available community capacity I would rather concentrate on getting this
aspect right instead of adding more complexity and more code to maintain.

Maybe we could add this feature as a Flink package instead. That way it
would still be available to our users. If it gains enough traction then we
can also add it to Flink later. What do you think?

Cheers,
Till

On Wed, May 13, 2020 at 11:36 AM Gyula Fóra <gyula.f...@gmail.com> wrote:

> It seems that not everyone can see the screenshot in the email, so here is
> a link:
>
> https://drive.google.com/open?id=1abrlpI976NFqOZSX20k2FoiAfVhBbER9
>
> On Wed, May 13, 2020 at 11:29 AM Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Oops I forgot the screenshot, thanks Ufuk :D
> >
> >
> > @Jeff Zhang <zjf...@gmail.com> : Yes, we simply call the individual
> > clusters' rest endpoints, so it would work with multiple Flink versions.
> >
> > Gyula
> >
> >
> > On Wed, May 13, 2020 at 10:56 AM Jeff Zhang <zjf...@gmail.com> wrote:
> >
> >> Hi Gyula,
> >>
> >> Big +1 for this, it would be very helpful for Flink job and cluster
> >> operations. Do you call the Flink rest api to gather the job info? I hope
> >> this history server could work with multiple versions of Flink as long as
> >> the Flink rest api is compatible.
> >>
> >> Gyula Fóra <gyula.f...@gmail.com> wrote on Wed, May 13, 2020 at 4:13 PM:
> >>
> >> > Hi All!
> >> >
> >> > With the growing number of Flink streaming applications, the current HS
> >> > implementation is starting to lose its value. Users running streaming
> >> > applications mostly care about what is running right now on the cluster,
> >> > and a centralised view on history is not very useful.
> >> >
> >> > We have been experimenting with reworking the current HS into a Global
> >> > Flink Dashboard that would show all running and completed/failed jobs on
> >> > all the running Flink clusters the users have.
> >> >
> >> > In essence we would get a view similar to the current HS, but it would
> >> > also show the running jobs with a link redirecting to the actual
> >> > cluster-specific dashboard.
> >> >
> >> > This is how it looks now:
> >> >
> >> >
> >> > In this version we took a very simple approach of introducing a cluster
> >> > discovery abstraction to collect all the running Flink clusters (by
> >> > listing yarn apps for instance).
> >> >
> >> > The main pages aggregating jobs from different clusters would then
> >> > simply make calls to all clusters and aggregate the responses. Job
> >> > specific endpoints would simply be routed to the correct target
> >> > cluster. This way the changes required are localised to the current HS
> >> > implementation, and cluster rest endpoints don't need to be changed.
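The fan-out-and-merge step described above might be sketched as follows; again, the class and method names are hypothetical, and a real implementation would use Flink's rest client to fetch each cluster's job listing rather than taking it as input.

```java
// Illustrative sketch of the aggregation step: merge per-cluster job id
// listings into one view, tagging each job with its cluster id so that
// job-specific requests can later be routed back to the right cluster.
// Names are hypothetical, not part of the actual HS implementation.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class JobAggregator {
    /** Merges per-cluster job ids into one "clusterId/jobId" listing. */
    static List<String> aggregateJobs(Map<String, List<String>> jobsByCluster) {
        List<String> merged = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : jobsByCluster.entrySet()) {
            for (String jobId : entry.getValue()) {
                merged.add(entry.getKey() + "/" + jobId);
            }
        }
        return merged;
    }
}
```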
> >> >
> >> > In addition to getting a fully working global dashboard, this also gets
> >> > us a fully functioning rest endpoint for accessing all jobs in all
> >> > clusters without having to provide the clusterId (yarn app id for
> >> > instance), which we can use to enhance the CLI experience in
> >> > multi-cluster (lots of per-job clusters) environments.
> >> >
> >> > Please let us know what you think!
> >> >
> >> > Gyula
> >> >
> >>
> >>
> >> --
> >> Best Regards
> >>
> >> Jeff Zhang
> >>
> >
>
