Hi all,
> IIUC, there is only an OperatorCoordinatorMetricGroup interface without
implementations (it never worked). There were no coordinator metrics
implemented and registered. @Hang Ruan please correct me if I'm wrong.
Yes, we shouldn't have to worry about backwards-compatibility as Chesnay
said.
After the discussion, I think the FLIP should be modified as follow.
1. Add new metric group JobManagerOperatorMetricGroup, whose scope format
is JobManagerOperatorScopeFormat. The config key of
JobManagerOperatorScopeFormat
is metrics.scope.jm-operator, and the default value is
<host>.jobmanager.<job_name>.<task_name>.<operator_name>.
2. JobManagerOperatorMetricGroup will be sub components of
JobManagerJobMetricGroup.
3. InternalOperatorCoordinatorMetricGroup will extend ProxyMetricGroup,
which registered under JobManagerOperatorMetricGroup by
addGroup("coordinator").
I have updated the FLIP-274. @Chesnay Schepler <[email protected]> @Jark Wu
<[email protected]> , please help to take a look at the modified FLIP and
make sure that I do not miss something.
If there are no more problems, we plan to start the voting thread later.
Best,
Hang
Chesnay Schepler <[email protected]> 于2023年1月31日周二 20:09写道:
> > IIUC, there is only an OperatorCoordinatorMetricGroup interface without
> implementations (it never worked). There were no coordinator metrics
> implemented and registered. @Hang Ruan <[email protected]> please
> correct me if I'm wrong.
>
> Indeed. There are interfaces, and theoretically a metric group is exposed
> to user-code via the SplitEnumeratorContext, but the metric group is always
> null in the SourceCoordinatorContext. So we shouldn't have to worry about
> backwards-compatibility.
>
> > However, the config key "metrics.scope.jm.job" is deprecated and is
> replaced by "metrics.scope.jm-job". Similarly, "metrics.scope.tm.job" is
> replaced with "metrics.scope.tm-job". That's why I though
> "metrics.scope.jm-operator" might be more consistent.
>
> HA! Good catch. I only looked at the 1.16 docs where they weren't
> deprecated yet.
>
> Then it should be "jm-operator" indeed.
>
> On 31/01/2023 10:51, Jark Wu wrote:
>
> Hi Chesnay,
>
> > The new option must default to the configured value of
> metrics.scope.jm.job because otherwise it isn't backwards-compatible with
> existing coordinator metrics.
> IIUC, there is only an OperatorCoordinatorMetricGroup interface without
> implementations (it never worked). There were no coordinator metrics
> implemented and registered. @Hang Ruan <[email protected]> please
> correct me if I'm wrong.
>
> > Intuitively I think the config key should be metrics.scope.jm.operator.,
> similar to metrics.scope.jm.job.
> However, the config key "metrics.scope.jm.job" is deprecated and is
> replaced by "metrics.scope.jm-job". Similarly, "metrics.scope.tm.job" is
> replaced with "metrics.scope.tm-job". That's why I though
> "metrics.scope.jm-operator" might be more consistent.
>
> > We would only support operator_name, operator_id, task_name and task_id
> (+ everything else from the jm/job groups of course).
> +1
>
> Best,
> Jark
>
>
>
> On Mon, 30 Jan 2023 at 20:09, Chesnay Schepler <[email protected]> wrote:
>
>> The new option must default to the configured value of
>> metrics.scope.jm.job because otherwise it isn't backwards-compatible with
>> existing coordinator metrics.
>> According to some earlier response in this thread there are already
>> coordinator metrics.
>>
>> If that weren't the case you'd be correct, and we'd keep the default
>> consistent with the default for tm operator metrics.
>>
>> Intuitively I think the config key should be metrics.scope.jm.operator.,
>> similar to metrics.scope.jm.job.
>> For symmetry we should also add new metrics.scope.tm.operator/task and
>> eventually phase out metrics.scope.operator.
>>
>> > Regarding "coordinator metrics don't *belong to* any tasks/vertexes",
>> I mean the coordinator doesn't run on specific tasks, so it doesn't have
>> runtime information of tasks.
>>
>> That is true. I intentionally only referred to task id/name, as things
>> like subtask index or attempt ids don't make sense for coordinators as you
>> said.
>> We would only support operator_name, operator_id, task_name and task_id
>> (+ everything else from the jm/job groups of course).
>> (Maybe in the future attempt_num could be interesting though)
>>
>> On 30/01/2023 10:39, Jark Wu wrote:
>>
>> Hi Chesnay,
>>
>> IIUC, our only disagreement is whether to support task info in the metric
>> format. But we all agree with the default metric format
>> "metrics.scope.jm-operator=<host>.jobmanager.<job_name>.<operator_name>"
>> which doesn't include task info, right?
>>
>> Regarding "coordinator metrics don't *belong to* any tasks/vertexes", I
>> mean the coordinator doesn't run on specific tasks, so it doesn't have
>> runtime information of tasks. My concern about adding task info to the
>> coordinator metric is that
>> 1) if we register coordinator metrics for every subtask, the number of
>> coordinator metrics will be exploded when task parallelism is large. This
>> is an overhead to JM & metric systems, and the duplicated metrics are
>> useless for users.
>> 2) Correct me if I'm wrong, MetricGroup#counter only supports registering
>> one metric. It seems it can't register a metric reporting for multiple
>> metric names (e.g. multiple subtasks).
>>
>> Best,
>> Jark
>>
>>
>> On Mon, 30 Jan 2023 at 16:11, Chesnay Schepler <[email protected]>
>> wrote:
>>
>>> I hadn't looked in detail in to how the task info can be introduced, but
>>> given that the coordinates are created by the execution graph, where we
>>> only work on tasks, it should be possible and rather simple.
>>>
>>> The OperatorCoordinatorHolder gets access to the ExecutionJobVertex from
>>> which it can extract everything and create the task/operator groups.
>>> Exposure to the coordinator could happen via the coordinator context as
>>> originally planned.
>>>
>>> > As we know, coordinator metrics are just like JMJobMetricGroup, which
>>> don't belong to any tasks/vertexes.
>>>
>>> I don't think this is true, every coordinator is scoped to a specific
>>> operator, and every operator is associated with a particular vertex.
>>>
>>> From the javadocs:
>>>
>>> " A coordinator for runtime operators. The OperatorCoordinator runs on
>>> the master, associated with
>>> the job vertex of the operator. It communicates with operators via
>>> sending operator events."
>>>
>>> On 28/01/2023 09:15, Jark Wu wrote:
>>> > Hi all,
>>> >
>>> > IIUC, Chesnay means we should have a more general metric group for
>>> > **operator**
>>> > not for **coordinator** in JM, this would be useful to extend
>>> > other operator-specific
>>> > metrics in the future. That means the new scope format should be
>>> designed
>>> > for
>>> > the operator,
>>> > e.g.,
>>> metrics.scope.jm-operator=<host>.jobmanager.<job_name>.<operator_name>
>>> > The coordinator metric is a subgroup (a constant "coordinator" suffix)
>>> of
>>> > the JMOperatorMG.
>>> >
>>> > I think this is a nice design. However, I have a question about adding
>>> task
>>> > id/name to this list.
>>> > How to get the task id/name when reporting a coordinator metric? As we
>>> > know, coordinator metrics
>>> > are just like JMJobMetricGroup, which don't belong to any
>>> tasks/vertexes.
>>> > Do you mean reporting
>>> > coordinator metrics for every task id under the operator?
>>> >
>>> > Best,
>>> > Jark
>>> >
>>> >
>>> > On Thu, 19 Jan 2023 at 17:33, Chesnay Schepler <[email protected]>
>>> wrote:
>>> >
>>> >> > First, I do not understand why users have to configure the new
>>> scope
>>> >> format, which has a default value.
>>> >>
>>> >> If you don't use scope formats, sure. If you do use scope formats,
>>> e.g.
>>> >> to add a common prefix (which is the case for datadog users for
>>> >> example), then the current default in the FLIP is insufficient and
>>> >> requires the user to update the configuration.
>>> >>
>>> >> > I usually do not need to change these configuration of scope
>>> formats
>>> >> when submitting the flink job.
>>> >>
>>> >> Scope formats as a whole are quite a power-user feature, but that
>>> >> doesn't mean we should ignore it.
>>> >>
>>> >> > I try to let it extend the ComponentMetricGroup
>>> >>
>>> >> This isn't inherently required just because it is a "component".
>>> >> Component metric groups should _only_ be used for cases where such a
>>> >> component provides several bits of metadata.
>>> >> For example, tasks provide vertex ids, task names, attempted IDs etc.,
>>> >> and we'd like users to have the option on how this metadata ends up
>>> >> being used (via scope formats). This currently can't be built with the
>>> >> addGroup() methods, hence why the component groups exist.
>>> >> However, the operator coordinator _itself_does not provide several
>>> bits
>>> >> of metadata. Logically, the _operator_ does, not the coordinator.
>>> >>
>>> >> > This make me decide to add a new scope format.
>>> >>
>>> >> And this is fine; just don't add a new scope format for the
>>> coordinator
>>> >> but logically for operators on the JM side, such that we can extend
>>> the
>>> >> set of operator-specific metrics in the future without having to add
>>> yet
>>> >> another scope format.
>>> >>
>>> >> > The relationship between the Operator and OperatorCoordinator is
>>> >> maintained in the OperatorCoordinatorHolder. In the POC, the operator
>>> >> name/id could be found and the OperatorCoordinatorMetricGroup will be
>>> >> created here.
>>> >>
>>> >> That's all irrelevant.
>>> >>
>>> >> > The registered metrics of OperatorCoordinator always are those
>>> which
>>> >> can not be aggregated from the tasks, like the number of unassigned
>>> >> splits in the SourceCoordinator. Actually we have not encountered the
>>> >> scenario you mentioned. I think the OperatorCoordinatorMetricGroup
>>> >> should only contain the metrics for itself instead of its subtasks.
>>> >>
>>> >> You are misunderstanding the problem.
>>> >>
>>> >> This isn't about aggregating metrics on our side or exposing metrics
>>> >> from TMs on the JM side or anything like that.
>>> >>
>>> >> It's purely about the available metadata for operator metrics on the
>>> JM
>>> >> side. Currently you suggest operator name / id; what I'm proposing is
>>> to
>>> >> add task(==vertex!) id / name to this list.
>>> >>
>>> >> The benefit here is simple. I can look up all metrics where
>>> task_id==XXX
>>> >> and get _everything_ related to that vertex, including all metrics
>>> >> associated with operators that are part of that vertex, including the
>>> >> coordinator.
>>> >>
>>> >> > Using metrics.scope.jm.job is not enough to distinguish the
>>> different
>>> >> coordinators.
>>> >>
>>> >> I did not suggest just using metrics.scope.jm.job. I suggested that as
>>> >> the default for the jm operator scope format, that is used for the
>>> >> proposed JobManagerOperatorMetricGroup, which will have some plain
>>> >> metric groups as children for coordinators.
>>> >> Aka, you have the JMOperatorMG, then you call addGroup("coordinators")
>>> >> and then addGroup(coordinator_name) for each coordinator.
>>> >>
>>> >> > So there are two choice for the
>>> >> InternalOperatorCoordinatorMetricGroup: 1. add and improve the new
>>> scope
>>> >> format; 2. use the metrics.scope.jm.job and ProxyMetricGroup. Which
>>> one
>>> >> is better?
>>> >>
>>> >> There are more options than that.
>>> >> I feel like there are a lot of misunderstandings in this discussion.
>>> >> Please let me know if I could clear things up. If not I can also
>>> provide
>>> >> a PoC based on your PoC if that can speed things up. It's not that
>>> >> different anyway.
>>> >>
>>> >> On 19/01/2023 07:39, Hang Ruan wrote:
>>> >>> Hi, chesnay,
>>> >>>
>>> >>> Thanks for your reply. I still have some doubts about the questions
>>> >>> you raised.
>>> >>>
>>> >>> > Extending the set of ScopeFormats is problematic because it in
>>> >>> practice
>>> >>> it breaks the config if users actively rely on it, since
>>> there's now
>>> >>> another key that they _must_ set for it to be
>>> >>> consistent/compatible with
>>> >>> their existing setup.
>>> >>> Unfortunately due to how powerful scope formats are we can't
>>> derive a
>>> >>> default value that matches their existing setup.
>>> >>> Hence we should try to do this as rarely as possible.
>>> >>>
>>> >>>
>>> >>> > This FLIP does not adhere to that since it proposes a
>>> dedicated
>>> >>> format
>>> >>> for coordinators; next time we want to expose operator-specific
>>> >>> metrics
>>> >>> (e.g., in the scheduler) we'd have to add another one to
>>> support it.
>>> >>>
>>> >>>
>>> >>> First, I do not understand why users have to configure the new scope
>>> >>> format, which has a default value. I usually do not need to change
>>> >>> these configuration of scope formats when submitting the flink job.
>>> >>> IMO, the OperatorCoordinator is a component running at the JobMaster.
>>> >>> I try to let it extend the ComponentMetricGroup, but I cannot find
>>> >>> a suitable scope. This make me decide to add a new scope format.
>>> >>>
>>> >>> > Additionally, the configurable variables (operator name/id)
>>> are
>>> >>> logically not attached to the coordinator, but operators, so to
>>> me it
>>> >>> just doesn't make sense to structure it like this.
>>> >>>
>>> >>>
>>> >>> The relationship between the Operator and OperatorCoordinator is
>>> >>> maintained in the OperatorCoordinatorHolder. In the POC, the operator
>>> >>> name/id could be found and the OperatorCoordinatorMetricGroup will be
>>> >>> created here.
>>> >>>
>>> >>> > Another thing I'm concerned about is that, because we don't
>>> include
>>> >>> tasks in the hierarchy, users wishing to collect all metrics
>>> for a
>>> >>> particular task (in this case ==vertex) now have to go
>>> >>> significantly out
>>> >>> of their way to get them, since they can no longer just filter
>>> by the
>>> >>> task ID but have to be filter for _all_ operators that are part
>>> of
>>> >>> the task.
>>> >>>
>>> >>>
>>> >>> The registered metrics of OperatorCoordinator always are those which
>>> >>> can not be aggregated from the tasks, like the number of unassigned
>>> >>> splits in the SourceCoordinator. Actually we have not encountered the
>>> >>> scenario you mentioned. I think the OperatorCoordinatorMetricGroup
>>> >>> should only contain the metrics for itself instead of its subtasks.
>>> >>>
>>> >>> Using metrics.scope.jm.job is not enough to distinguish the different
>>> >>> coordinators. The operator id/name is necessary. Then
>>> >>> the implementation should be like this. But users can not change the
>>> >>> scope in this way. This is also acceptable.
>>> >>> @Internal public class InternalOperatorCoordinatorMetricGroup
>>> >>> extends ProxyMetricGroup<MetricGroup>
>>> >>> implements OperatorCoordinatorMetricGroup {
>>> >>> public InternalOperatorCoordinatorMetricGroup(
>>> >>> JobManagerJobMetricGroup parent, OperatorID operatorID,
>>> >> String operatorName) {
>>> >>> super(parent.addGroup(operatorID +
>>> >> operatorName).addGroup("coordinator")); }
>>> >>> }
>>> >>> So there are two choice for the
>>> >>> InternalOperatorCoordinatorMetricGroup: 1. add and improve the new
>>> >>> scope format; 2. use the metrics.scope.jm.job and ProxyMetricGroup.
>>> >>> Which one is better?
>>> >>>
>>> >>> Best,
>>> >>> Hang
>>> >>>
>>> >>>
>>> >>> Chesnay Schepler <[email protected]> 于2023年1月18日周三 17:03写道:
>>> >>>
>>> >>> You're misunderstanding the problem.
>>> >>>
>>> >>> Metric groups form a tree with each group providing certain
>>> metadata.
>>> >>>
>>> >>> E.g., on the taskmanager we have a TM metric group that
>>> provides info
>>> >>> about the TM, that has child task metric groups, that have child
>>> >>> operator metric groups etc. The operator metric group again has
>>> >>> children, where we are now mostly in the user/connector land,
>>> like a
>>> >>> kafka metric group providing partition info or something.
>>> >>>
>>> >>> What is being proposed is to create a coordinator metric group
>>> on
>>> >>> the JM
>>> >>> side that provides operator metadata.
>>> >>>
>>> >>> A more appropriate structure is to create an operator group on
>>> the JM
>>> >>> side that provides this info, with the coordinator metric group
>>> >>> being a
>>> >>> child.
>>> >>>
>>> >>> On 17/01/2023 19:56, Steven Wu wrote:
>>> >>> >> Additionally, the configurable variables (operator name/id)
>>> are
>>> >>> > logically not attached to the coordinator, but operators, so
>>> to
>>> >>> me it
>>> >>> > just doesn't make sense to structure it like this.
>>> >>> >
>>> >>> > Chesnay, maybe we should clarify the terminology. To me,
>>> >>> pperators (like
>>> >>> > FLIP-27 source) can have two parts (coordinator and
>>> >>> reader/subtask). I
>>> >>> > think it is fine to include operator name/id for coordinator
>>> >>> metrics.
>>> >>> >
>>> >>> > On Mon, Jan 16, 2023 at 2:13 AM Chesnay Schepler
>>> >>> <[email protected]> wrote:
>>> >>> >
>>> >>> >> Slight correction: Using metrics.scope.jm.job as the default
>>> >>> should be
>>> >>> >> safe.
>>> >>> >>
>>> >>> >> On 16/01/2023 10:18, Chesnay Schepler wrote:
>>> >>> >>> The proposed ScopeFormat is still problematic for a few
>>> reasons.
>>> >>> >>>
>>> >>> >>> Extending the set of ScopeFormats is problematic because it
>>> in
>>> >>> >>> practice it breaks the config if users actively rely on it,
>>> since
>>> >>> >>> there's now another key that they _must_ set for it to be
>>> >>> >>> consistent/compatible with their existing setup.
>>> >>> >>> Unfortunately due to how powerful scope formats are we can't
>>> >>> derive a
>>> >>> >>> default value that matches their existing setup.
>>> >>> >>> Hence we should try to do this as rarely as possible.
>>> >>> >>>
>>> >>> >>> This FLIP does not adhere to that since it proposes a
>>> >>> dedicated format
>>> >>> >>> for coordinators; next time we want to expose
>>> operator-specific
>>> >>> >>> metrics (e.g., in the scheduler) we'd have to add another
>>> one to
>>> >>> >>> support it.
>>> >>> >>>
>>> >>> >>> Additionally, the configurable variables (operator name/id)
>>> are
>>> >>> >>> logically not attached to the coordinator, but operators, so
>>> >>> to me it
>>> >>> >>> just doesn't make sense to structure it like this.
>>> >>> >>>
>>> >>> >>> Another thing I'm concerned about is that, because we don't
>>> >>> include
>>> >>> >>> tasks in the hierarchy, users wishing to collect all
>>> metrics for
>>> >> a
>>> >>> >>> particular task (in this case ==vertex) now have to go
>>> >>> significantly
>>> >>> >>> out of their way to get them, since they can no longer just
>>> >>> filter by
>>> >>> >>> the task ID but have to be filter for _all_ operators that
>>> are
>>> >>> part of
>>> >>> >>> the task.
>>> >>> >>>
>>> >>> >>> On 16/01/2023 03:09, Hang Ruan wrote:
>>> >>> >>>> Hi, @[email protected] <[email protected]> ,
>>> >>> >>>>
>>> >>> >>>> Do you have time to help to review this FLIP again after
>>> the
>>> >>> >>>> modification?
>>> >>> >>>> Looking forward to your reply.
>>> >>> >>>> This FLIP will add a new configuration for the
>>> >>> >>>> OperatorCoordinatorMetricGroup scope format. It provides an
>>> >>> internal
>>> >>> >>>> implementation and is added as a component to the
>>> >>> >>>> JobManagerJobMetricGroup.
>>> >>> >>>> If something doesn't make sense, could you provide some
>>> >>> advice? It
>>> >>> >>>> will be
>>> >>> >>>> very helpful. Thanks a lot for your help.
>>> >>> >>>>
>>> >>> >>>> Best,
>>> >>> >>>> Hang
>>> >>> >>>>
>>> >>> >>>> Martijn Visser <[email protected]> 于2023年1月11日周三
>>> >>> 16:34写道:
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>>> Hi Hang,
>>> >>> >>>>>
>>> >>> >>>>> I'm a bit surprised that this has gone to a vote, given
>>> that
>>> >>> Chesnay
>>> >>> >>>>> deliberately mentioned that he would vote against it
>>> as-is.
>>> >>> I would
>>> >>> >>>>> expect
>>> >>> >>>>> that before going to a vote, he has had the opportunity to
>>> >>> >>>>> participate in
>>> >>> >>>>> this discussion.
>>> >>> >>>>>
>>> >>> >>>>> Best regards,
>>> >>> >>>>>
>>> >>> >>>>> Martijn
>>> >>> >>>>>
>>> >>> >>>>> On Tue, Jan 3, 2023 at 12:53 PM Jark Wu <[email protected]
>>> >
>>> >>> wrote:
>>> >>> >>>>>
>>> >>> >>>>>> Hi Dong,
>>> >>> >>>>>>
>>> >>> >>>>>> Regarding “SplitEnumeratorContext#metricGroup”, my only
>>> >>> concern is
>>> >>> >>>>>> that
>>> >>> >>>>>> this is a core interface for the FLIP. It’s hard to tell
>>> how
>>> >>> >>>>>> sources use
>>> >>> >>>>>> metric group without mentioning this interface. Even if
>>> >>> this is an
>>> >>> >>>>> existing
>>> >>> >>>>>> API, I think it’s worth introducing the interface again
>>> and
>>> >>> declaring
>>> >>> >>>>> that
>>> >>> >>>>>> we will implement the interface instead of a no-op method
>>> >>> in this
>>> >>> >>>>>> FLIP.
>>> >>> >>>>>>
>>> >>> >>>>>> Anyway, this is a minor problem and shouldn’t block this
>>> >>> FLIP. I’m
>>> >>> >>>>>> +1 to
>>> >>> >>>>>> start a vote.
>>> >>> >>>>>>
>>> >>> >>>>>> Best,
>>> >>> >>>>>> Jark
>>> >>> >>>>>>
>>> >>> >>>>>>
>>> >>> >>>>>>> 2023年1月3日 10:03,Hang Ruan <[email protected]> 写道:
>>> >>> >>>>>>>
>>> >>> >>>>>>> Hi, Jark and Dong,
>>> >>> >>>>>>>
>>> >>> >>>>>>> Thanks for your comments. Sorry for my late reply.
>>> >>> >>>>>>>
>>> >>> >>>>>>> For suggestion 1, I plan to implement the
>>> >>> >>>>>>> SplitEnumeratorMetricGroup in
>>> >>> >>>>>>> another issue, and it is not contained in this FLIP. I
>>> >>> will add some
>>> >>> >>>>>>> description about this part.
>>> >>> >>>>>>> For suggestion 2, changes about
>>> >>> OperatorCoordinator#metricGroup has
>>> >>> >>>>>> already
>>> >>> >>>>>>> been documented in the proposed change section.
>>> >>> >>>>>>>
>>> >>> >>>>>>> Best,
>>> >>> >>>>>>> Hang
>>> >>> >>>>>>>
>>> >>> >>>>>>> Dong Lin <[email protected]> 于2023年1月1日周日 09:45写道:
>>> >>> >>>>>>>
>>> >>> >>>>>>>> Let me chime-in and add comments regarding the public
>>> >>> interface
>>> >>> >>>>> section.
>>> >>> >>>>>>>> Please see my comments inline.
>>> >>> >>>>>>>>
>>> >>> >>>>>>>> On Thu, Dec 29, 2022 at 6:08 PM Jark Wu
>>> >>> <[email protected]> wrote:
>>> >>> >>>>>>>>
>>> >>> >>>>>>>>> Hi Hang,
>>> >>> >>>>>>>>>
>>> >>> >>>>>>>>> Thanks for driving this discussion. I think this is a
>>> >>> very useful
>>> >>> >>>>>> feature
>>> >>> >>>>>>>>> for connectors.
>>> >>> >>>>>>>>>
>>> >>> >>>>>>>>> The FLIP looks quite good to me, and I just have two
>>> >>> suggestions.
>>> >>> >>>>>>>>>
>>> >>> >>>>>>>>> 1. In the "Public Interface" section, mention that the
>>> >>> >>>>>>>>> implementation
>>> >>> >>>>>>>>> behavior of "SplitEnumeratorContext#metricGroup" is
>>> >>> changed from
>>> >>> >>>>>>>> returning
>>> >>> >>>>>>>>> null to returning a concrete
>>> SplitEnumeratorMetricGroup
>>> >>> instance.
>>> >>> >>>>> Even
>>> >>> >>>>>>>>> though the API is already there, the behavior change
>>> can
>>> >>> also be
>>> >>> >>>>>>>> considered
>>> >>> >>>>>>>>> a public change.
>>> >>> >>>>>>>>>
>>> >>> >>>>>>>> SplitEnumeratorContext#metricGroup is an interface and
>>> >>> this FLIP
>>> >>> >>>>>>>> does
>>> >>> >>>>>> not
>>> >>> >>>>>>>> seem to change its semantics/behavior. The FLIP does
>>> >>> change the
>>> >>> >>>>>>>> implementation/behavior of
>>> >>> SourceCoordinatorContext#metricGroup,
>>> >>> >>>>>>>> which
>>> >>> >>>>>> is
>>> >>> >>>>>>>> marked @Internal.
>>> >>> >>>>>>>>
>>> >>> >>>>>>>> Thus it might seem a bit weird to add in the public
>>> >> interface
>>> >>> >>>>>>>> section
>>> >>> >>>>>>>> saying that we change the interface
>>> >>> >>>>>>>> SplitEnumeratorContext#metricGroup
>>> >>> >>>>>> from
>>> >>> >>>>>>>> returning null to non-null object.
>>> >>> >>>>>>>>
>>> >>> >>>>>>>> 2. Mention the newly added interface of
>>> >>> >>>>>> "OperatorCoordinator#metricGroup"
>>> >>> >>>>>>>>> in the "Proposed Changes" section or "Public
>>> Interface"
>>> >>> section. As
>>> >>> >>>>> the
>>> >>> >>>>>>>>> FLIP said, OperatorCoordinator is widely used in many
>>> >>> connectors.
>>> >>> >>>>>> Though
>>> >>> >>>>>>>> it
>>> >>> >>>>>>>>> is still an @Internal API, I think it is worth
>>> >>> mentioning the
>>> >>> >>>>>>>>> change
>>> >>> >>>>> in
>>> >>> >>>>>>>> the
>>> >>> >>>>>>>>> FLIP.
>>> >>> >>>>>>>> Since OperatorCoordinator is an internal API, it seems
>>> >>> reasonable to
>>> >>> >>>>>>>> explain it in the proposed change section. The FLIP
>>> seems
>>> >>> to have
>>> >>> >>>>>>>> documented this in point 5 of the proposed change
>>> section.
>>> >>> >>>>>>>>
>>> >>> >>>>>>>> BTW, if we think there are @internal classes that are
>>> >>> important
>>> >>> >>>>>>>> enough
>>> >>> >>>>>> to
>>> >>> >>>>>>>> be added in the public interface section, it might be
>>> >>> useful to
>>> >>> >>>>>> explicitly
>>> >>> >>>>>>>> discuss this topic and document it in the "*What are
>>> the
>>> >>> "public
>>> >>> >>>>>>>> interfaces" of the project*" in this
>>> >>> >>>>>>>> <
>>> >>> >>>>>>>>
>>> >>> >>
>>> >>>
>>> >>
>>> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
>>> >>> >>>>>>>> wiki.
>>> >>> >>>>>>>>
>>> >>> >>>>>>>>
>>> >>> >>>>>>>>> Best,
>>> >>> >>>>>>>>> Jark
>>> >>> >>>>>>>>>
>>> >>> >>>>>>>>>
>>> >>> >>>>>>>>> On Mon, 26 Dec 2022 at 18:06, Hang Ruan
>>> >>> <[email protected]>
>>> >>> >>>>>> wrote:
>>> >>> >>>>>>>>>> Hi, thanks for the feedback, Zhu Zhu and Qingsheng.
>>> >>> >>>>>>>>>> After combining everyone's comments, the main
>>> concerns and
>>> >>> >>>>>>>> corresponding
>>> >>> >>>>>>>>>> adjustments are as follows.
>>> >>> >>>>>>>>>>
>>> >>> >>>>>>>>>> Q1: Common metrics are not quite useful.
>>> >>> >>>>>>>>>> numEventsIn and numEventsOut counters will be removed
>>> >>> from the
>>> >>> >>>>>>>>>> OperatorCoordinatorMetricGroup. These common metrics
>>> do
>>> >> not
>>> >>> >>>>>>>>>> provide
>>> >>> >>>>>>>>> enough
>>> >>> >>>>>>>>>> information for users. The users are more willing to
>>> >>> get the
>>> >>> >>>>>>>>>> number
>>> >>> >>>>> of
>>> >>> >>>>>>>>>> events of the specified type instead of the total
>>> >>> number. And this
>>> >>> >>>>>>>> metric
>>> >>> >>>>>>>>>> is calculated differently. The implementation could
>>> >>> register the
>>> >>> >>>>>> metric
>>> >>> >>>>>>>>> by
>>> >>> >>>>>>>>>> themselves.
>>> >>> >>>>>>>>>>
>>> >>> >>>>>>>>>> Q2: This FLIP is overly complicated.
>>> >>> >>>>>>>>>> This FLIP will become concise after these
>>> modifications.
>>> >>> >>>>>>>>>> OperatorCoordinatorMetricGroup has already been
>>> >>> introduced into
>>> >>> >>>>> Flink
>>> >>> >>>>>>>> by
>>> >>> >>>>>>>>>> FLIP-179<
>>> >>> >>>>>>>>>>
>>> >>> >>>>>>>>>>
>>> >>> >>
>>> >>>
>>> >>
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-179%3A+Expose+Standardized+Operator+Metrics
>>> >>> >>>>>>>>>>> .
>>> >>> >>>>>>>>>> And
>>> >>> >>>>>>>>>> this FLIP will not change it. This FLIP only
>>> provides a
>>> >>> new metric
>>> >>> >>>>>>>> option
>>> >>> >>>>>>>>>> and a new metric group scope. The changes in proposed
>>> >>> changes
>>> >>> >>>>> provide
>>> >>> >>>>>>>> the
>>> >>> >>>>>>>>>> details about the modifications for the internal
>>> >>> classes, which
>>> >>> >>>>> might
>>> >>> >>>>>>>>> make
>>> >>> >>>>>>>>>> it look complicated.
>>> >>> >>>>>>>>>>
>>> >>> >>>>>>>>>> Thanks for all the comments again. If there are no
>>> further
>>> >>> >>>>>>>>>> comments,
>>> >>> >>>>>> we
>>> >>> >>>>>>>>>> plan to start the voting thread this week.
>>> >>> >>>>>>>>>>
>>> >>> >>>>>>>>>> Best,
>>> >>> >>>>>>>>>> Hang
>>> >>> >>>>>>>>>>
>>> >>> >>>>>>>>>> Qingsheng Ren <[email protected]> 于2022年12月26日周一
>>> >>> 16:48写道:
>>> >>> >>>>>>>>>>
>>> >>> >>>>>>>>>>
>>> >>> >>>>>>>>>>> Thanks for the FLIP, Hang!
>>> >>> >>>>>>>>>>>
>>> >>> >>>>>>>>>>> This FLIP overall looks good to me. Actually I share
>>> >>> the same
>>> >>> >>>>> concern
>>> >>> >>>>>>>>>> with
>>> >>> >>>>>>>>>>> Zhu that numEventsIn and numEventsOut counters are
>>> not
>>> >>> quite
>>> >>> >>>>>>>>>>> useful
>>> >>> >>>>>>>> to
>>> >>> >>>>>>>>>> end
>>> >>> >>>>>>>>>>> users. OperatorEvent is a quite low-level
>>> abstraction,
>>> >>> which
>>> >>> >>>>> requires
>>> >>> >>>>>>>>>>> instantialization in order to be practical to users
>>> and
>>> >>> >>>>>>>>>>> developers,
>>> >>> >>>>>>>> so
>>> >>> >>>>>>>>>>> maybe it's better to exclude them from the FLIP.
>>> >>> >>>>>>>>>>>
>>> >>> >>>>>>>>>>> Best,
>>> >>> >>>>>>>>>>> Qingsheng
>>> >>> >>>>>>>>>>> Ververica (Alibaba)
>>> >>> >>>>>>>>>>>
>>> >>> >>>>>>>>>>> On Mon, Dec 26, 2022 at 12:08 PM Zhu Zhu
>>> >>> <[email protected]>
>>> >>> >>>>> wrote:
>>> >>> >>>>>>>>>>>> Hi Hang,
>>> >>> >>>>>>>>>>>> I still see no strong reason why we need
>>> >>> >>>>>>>>>>>> numEventsIn/numEventsOut
>>> >>> >>>>>>>>>>> metrics.
>>> >>> >>>>>>>>>>>> In the discussion in FLINK-29801, I can see the
>>> same
>>> >>> concern
>>> >>> >>>>>>>>>>>> from
>>> >>> >>>>>>>>>> others.
>>> >>> >>>>>>>>>>>> So I prefer to exclude them from this FLIP to avoid
>>> >>> >>>>>>>>>>>> over-extending
>>> >>> >>>>>>>>> the
>>> >>> >>>>>>>>>>>> scope.
>>> >>> >>>>>>>>>>>>
>>> >>> >>>>>>>>>>>> Thanks,
>>> >>> >>>>>>>>>>>> Zhu
>>> >>> >>>>>>>>>>>>
>>> >>> >>>>>>>>>>>> Hang Ruan <[email protected]> 于2022年12月23日周五
>>> >>> >>>>>>>>>>>> 15:21写道:
>>> >>> >>>>>>>>>>>>> Hi everyone,
>>> >>> >>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>> As for the Zhu Zhu's problem, I think we should
>>> keep
>>> >>> the common
>>> >>> >>>>>>>>>>> metrics,
>>> >>> >>>>>>>>>>>> which will help to observe incoming and outgoing
>>> >>> events. What do
>>> >>> >>>>>>>> you
>>> >>> >>>>>>>>>>> think,
>>> >>> >>>>>>>>>>>> @Zhu Zhu ?
>>> >>> >>>>>>>>>>>>> And @Chesnay, are there any other issues you are
>>> >>> more concerned
>>> >>> >>>>>>>>>> about?
>>> >>> >>>>>>>>>>>> Looking forward to your reply.
>>> >>> >>>>>>>>>>>>> Thanks for all the comments. If there are no
>>> further
>>> >>> >>>>>>>>>>>>> comments, we
>>> >>> >>>>>>>>>> plan
>>> >>> >>>>>>>>>>>> to start the voting thread next week.
>>> >>> >>>>>>>>>>>>> Best,
>>> >>> >>>>>>>>>>>>> Hang
>>> >>> >>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>> Hang Ruan <[email protected]> 于2022年12月15日周四
>>> >>> >>>>>>>>>>>>> 16:49写道:
>>> >>> >>>>>>>>>>>>>> Hi, Zhu Zhu,
>>> >>> >>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>> Thanks for your feedback!
>>> >>> >>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>> The OperatorCoordinator implementations are
>>> >>> different. And
>>> >>> >>>>>>>>>>>>>> their
>>> >>> >>>>>>>>>>>> metrics are much different too. We try to find the
>>> >> common
>>> >>> >>>>>>>>>>>> metrics
>>> >>> >>>>>>>> and
>>> >>> >>>>>>>>>> put
>>> >>> >>>>>>>>>>>> them in the OperatorCoordinatorMetricGroup. If most
>>> >>> developers
>>> >>> >>>>>>>> think
>>> >>> >>>>>>>>> we
>>> >>> >>>>>>>>>>> do
>>> >>> >>>>>>>>>>>> not need these common metrics, removing the common
>>> >>> metrics is
>>> >>> >>>>>>>>>>>> also
>>> >>> >>>>>>>>>>>> acceptable.
>>> >>> >>>>>>>>>>>>>> Best,
>>> >>> >>>>>>>>>>>>>> Hang
>>> >>> >>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>> Zhu Zhu <[email protected]> 于2022年12月14日周三
>>> >>> 22:09写道:
>>> >>> >>>>>>>>>>>>>>> Hi Hang & MengYue,
>>> >>> >>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>> Thanks for creating this FLIP!
>>> >>> >>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>> I think it is very useful, mainly in two
>>> aspects:
>>> >>> >>>>>>>>>>>>>>> 1. Enables OperatorCoordinators to register
>>> metrics.
>>> >>> >>>>>>>>>>>>>>> Currently
>>> >>> >>>>>>>>>>>>>>> the coordinators has no way to do this. And
>>> operator
>>> >>> >>>>>>>> coordinator
>>> >>> >>>>>>>>>>>>>>> metric group further enables the SplitEnumerator
>>> >>> to have
>>> >>> >>>>>>>>>>>>>>> access
>>> >>> >>>>>>>>>>>>>>> to a registered metric group (via the existing
>>> public
>>> >>> >>>>>>>>>>>>>>> interface
>>> >>> >>>>>>>>>>>>>>> SplitEnumeratorContext#metricGroup()), which is
>>> >>> null at the
>>> >>> >>>>>>>>> moment.
>>> >>> >>>>>>>>>>>>>>> 2. Defines the scope of operator coordinator
>>> >>> metrics. A clear
>>> >>> >>>>>>>>>>>> definition
>>> >>> >>>>>>>>>>>>>>> makes it easy for users to find their wanted
>>> >>> metrics. The
>>> >>> >>>>>>>>>> definition
>>> >>> >>>>>>>>>>>>>>> also helps to avoid conflicts of metrics from
>>> >> multiple
>>> >>> >>>>>>>>>>>> OperatorCoordinators
>>> >>> >>>>>>>>>>>>>>> of the same kind. E.g. each SourceCoordinator
>>> may
>>> >>> have its
>>> >>> >>>>>>>>>>>>>>> own
>>> >>> >>>>>>>>>>>>>>> numSourceSplits metric, these metrics should not
>>> >>> be directly
>>> >>> >>>>>>>>>>> registered
>>> >>> >>>>>>>>>>>>>>> to the job metric group.
>>> >>> >>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>> What I'm a bit concerned is the necessity of the
>>> >>> introduced
>>> >>> >>>>>>>>> common
>>> >>> >>>>>>>>>>>> metrics
>>> >>> >>>>>>>>>>>>>>> numEventsInCounter & numEventsOutCounter. If
>>> there
>>> >>> any case
>>> >>> >>>>>>>> which
>>> >>> >>>>>>>>>>>> strongly
>>> >>> >>>>>>>>>>>>>>> requires them?
>>> >>> >>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>> Regarding the concerns of Chesnay,
>>> >>> >>>>>>>>>>>>>>>> A dedicated coordinator MG implementation is
>>> >> overkill
>>> >>> >>>>>>>>>>>>>>> Directly using the job metric group can result
>>> in
>>> >>> metric
>>> >>> >>>>>>>>> conflicts,
>>> >>> >>>>>>>>>>> as
>>> >>> >>>>>>>>>>>> mentioned
>>> >>> >>>>>>>>>>>>>>> in above #2.
>>> >>> >>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>> Thanks,
>>> >>> >>>>>>>>>>>>>>> Zhu
>>> >>> >>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>> Dong Lin <[email protected]> 于2022年12月10日周六
>>> >>> >>>>>>>>>>>>>>> 14:16写道:
>>> >>> >>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>> Hi Chesney,
>>> >>> >>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>> Just to double check with you,
>>> >>> >>>>>>>>>>>>>>>> OperatorCoordinatorMetricGroup
>>> >>> >>>>>>>>>>>> (annotated as
>>> >>> >>>>>>>>>>>>>>>> @PublicEvolving) has already been introduced
>>> into
>>> >>> Flink by
>>> >>> >>>>>>>>>> FLIP-179
>>> >>> >>>>>>>>>>>>>>>> <
>>> >>> >>
>>> >>>
>>> >>
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-179%3A+Expose+Standardized+Operator+Metrics
>>> >>> >>>>>>>>>>>>> .
>>> >>> >>>>>>>>>>>>>>>> And that FLIP has got you +1.. Do you mean we
>>> >>> should remove
>>> >>> >>>>>>>>> this
>>> >>> >>>>>>>>>>>>>>>> OperatorCoordinatorMetricGroup?
>>> >>> >>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>> Regards,
>>> >>> >>>>>>>>>>>>>>>> Dong
>>> >>> >>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>> On Sat, Dec 10, 2022 at 1:33 AM Chesnay
>>> Schepler <
>>> >>> >>>>>>>>>>> [email protected]>
>>> >>> >>>>>>>>>>>> wrote:
>>> >>> >>>>>>>>>>>>>>>>> As a whole I feel like this FLIP is overly
>>> >>> complicated. A
>>> >>> >>>>>>>>>>> dedicated
>>> >>> >>>>>>>>>>>>>>>>> coordinator MG implementation is overkill; it
>>> >>> could just
>>> >>> >>>>>>>>> re-use
>>> >>> >>>>>>>>>>> the
>>> >>> >>>>>>>>>>>>>>>>> existing Task/OperatorMGs to create the same
>>> >>> structure we
>>> >>> >>>>>>>>> have
>>> >>> >>>>>>>>>> on
>>> >>> >>>>>>>>>>>> TMs,
>>> >>> >>>>>>>>>>>>>>>>> similar to what we did with the Job MG.
>>> >>> >>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>> However, I'm not convinced that this is
>>> required
>>> >>> anyway,
>>> >>> >>>>>>>>>> because
>>> >>> >>>>>>>>>>>> all the
>>> >>> >>>>>>>>>>>>>>>>> example metrics you listed can be implemented
>>> on
>>> >>> the TM
>>> >>> >>>>>>>> side
>>> >>> >>>>>>>>> +
>>> >>> >>>>>>>>>>>>>>>>> aggregating them in the external metrics
>>> backend.
>>> >>> >>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>> Since I'm on holidays soon, just so no one
>>> tries
>>> >>> to pull a
>>> >>> >>>>>>>>> fast
>>> >>> >>>>>>>>>>>> one on
>>> >>> >>>>>>>>>>>>>>>>> me, if this were to go to a vote as-is I'd be
>>> >>> against it.
>>> >>> >>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>> On 09/12/2022 15:30, Dong Lin wrote:
>>> >>> >>>>>>>>>>>>>>>>>> Hi Hang,
>>> >>> >>>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>>> Thanks for the FLIP! The FLIP looks good and
>>> it
>>> >>> is pretty
>>> >>> >>>>>>>>>>>> informative.
>>> >>> >>>>>>>>>>>>>>>>>> I have just two minor comments regarding
>>> names:
>>> >>> >>>>>>>>>>>>>>>>>> - Would it be useful to rename the config
>>> key as
>>> >>> >>>>>>>>>>>>>>>>>> *metrics.scope.jm.job.operator-coordinator*
>>> for
>>> >>> >>>>>>>> consistency
>>> >>> >>>>>>>>>>> with
>>> >>> >>>>>>>>>>>>>>>>>> *metrics.scope.jm.job
>>> >>> >>>>>>>>>>>>>>>>>> *(which is not named as *jm-job)?
>>> >>> >>>>>>>>>>>>>>>>>> - Maybe rename the variable as
>>> >>> >>>>>>>>>>> SCOPE_NAMING_OPERATOR_COORDINATOR
>>> >>> >>>>>>>>>>>> for
>>> >>> >>>>>>>>>>>>>>>>>> simplicity and consistency with
>>> >>> SCOPE_NAMING_OPERATOR
>>> >>> >>>>>>>>> (which
>>> >>> >>>>>>>>>> is
>>> >>> >>>>>>>>>>>> not named
>>> >>> >>>>>>>>>>>>>>>>>> as SCOPE_NAMING_TM_JOB_OPERATOR)?
>>> >>> >>>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>>> Cheers,
>>> >>> >>>>>>>>>>>>>>>>>> Dong
>>> >>> >>>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>>> On Thu, Dec 8, 2022 at 3:28 PM Hang Ruan <
>>> >>> >>>>>>>>>>> [email protected]>
>>> >>> >>>>>>>>>>>> wrote:
>>> >>> >>>>>>>>>>>>>>>>>>> Hi all,
>>> >>> >>>>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>>>> MengYue and I created FLIP-274[1] Introduce
>>> >>> metric group
>>> >>> >>>>>>>>> for
>>> >>> >>>>>>>>>>>>>>>>>>> OperatorCoordinator. OperatorCoordinator is
>>> the
>>> >>> >>>>>>>>> coordinator
>>> >>> >>>>>>>>>>> for
>>> >>> >>>>>>>>>>>> runtime
>>> >>> >>>>>>>>>>>>>>>>>>> operators and running on Job Manager. The
>>> >>> coordination
>>> >>> >>>>>>>>>>>> mechanism is
>>> >>> >>>>>>>>>>>>>>>>>>> operator events between OperatorCoordinator
>>> >>> and its all
>>> >>> >>>>>>>>>>>> operators, the
>>> >>> >>>>>>>>>>>>>>>>>>> coordination is more and more using in
>>> Flink, for
>>> >>> >>>>>>>> example
>>> >>> >>>>>>>>>> many
>>> >>> >>>>>>>>>>>> Sources
>>> >>> >>>>>>>>>>>>>>>>> and
>>> >>> >>>>>>>>>>>>>>>>>>> Sinks depend on the mechanism to assign
>>> splits
>>> >> and
>>> >>> >>>>>>>>>> coordinate
>>> >>> >>>>>>>>>>>> commits to
>>> >>> >>>>>>>>>>>>>>>>>>> external systems. The OperatorCoordinator is
>>> >>> widely
>>> >>> >>>>>>>> using
>>> >>> >>>>>>>>> in
>>> >>> >>>>>>>>>>>> flink kafka
>>> >>> >>>>>>>>>>>>>>>>>>> connector, flink pulsar connector, flink cdc
>>> >>> connector,
>>> >>> >>>>>>>>>> flink
>>> >>> >>>>>>>>>>>> hudi
>>> >>> >>>>>>>>>>>>>>>>>>> connector and so on.
>>> >>> >>>>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>>>> But there is not a suitable metric group
>>> scope
>>> >>> for the
>>> >>> >>>>>>>>>>>>>>>>> OperatorCoordinator
>>> >>> >>>>>>>>>>>>>>>>>>> and not an implementation for the interface
>>> >>> >>>>>>>>>>>>>>>>> OperatorCoordinatorMetricGroup.
>>> >>> >>>>>>>>>>>>>>>>>>> These metrics in OperatorCoordinator could
>>> be
>>> >>> how many
>>> >>> >>>>>>>>>>>> splits/partitions
>>> >>> >>>>>>>>>>>>>>>>>>> have been assigned to source readers, how
>>> many
>>> >>> files
>>> >>> >>>>>>>> have
>>> >>> >>>>>>>>>> been
>>> >>> >>>>>>>>>>>> written
>>> >>> >>>>>>>>>>>>>>>>> out
>>> >>> >>>>>>>>>>>>>>>>>>> by sink writers, these metrics not only help
>>> >>> users to
>>> >>> >>>>>>>> know
>>> >>> >>>>>>>>>> the
>>> >>> >>>>>>>>>>>> job
>>> >>> >>>>>>>>>>>>>>>>> progress
>>> >>> >>>>>>>>>>>>>>>>>>> but also make big job maintaining easier.
>>> Thus we
>>> >>> >>>>>>>> propose
>>> >>> >>>>>>>>>> the
>>> >>> >>>>>>>>>>>> FLIP-274
>>> >>> >>>>>>>>>>>>>>>>> to
>>> >>> >>>>>>>>>>>>>>>>>>> introduce a new metric group scope for
>>> >>> >>>>>>>> OperatorCoordinator
>>> >>> >>>>>>>>>> and
>>> >>> >>>>>>>>>>>> provide
>>> >>> >>>>>>>>>>>>>>>>> an
>>> >>> >>>>>>>>>>>>>>>>>>> internal implementation for
>>> >>> >>>>>>>>> OperatorCoordinatorMetricGroup.
>>> >>> >>>>>>>>>>>>>>>>>>> Could you help review this FLIP when you get
>>> >>> time? Any
>>> >>> >>>>>>>>>>> feedback
>>> >>> >>>>>>>>>>>> is
>>> >>> >>>>>>>>>>>>>>>>>>> appreciated!
>>> >>> >>>>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>>>> Best,
>>> >>> >>>>>>>>>>>>>>>>>>> Hang
>>> >>> >>>>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>>>> [1]
>>> >>> >>>>>>>>>>>>>>>>>>>
>>> >>> >>>>>>>>>>>>>>>>>>>
>>> >>> >>
>>> >>>
>>> >>
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-274%3A+Introduce+metric+group+for+OperatorCoordinator
>>> >>> >>
>>> >>>
>>>
>>>
>>
>