Hi Hang,
I still see no strong reason why we need numEventsIn/numEventsOut metrics.
In the discussion in FLINK-29801, I can see the same concern from others.
So I prefer to exclude them from this FLIP to avoid over-extending the scope.

Thanks,
Zhu

Hang Ruan <ruanhang1...@gmail.com> 于2022年12月23日周五 15:21写道:
>
> Hi everyone,
>
> As for the Zhu Zhu's problem, I think we should keep the common metrics, 
> which will help to observe incoming and outgoing events. What do you think, 
> @Zhu Zhu ?
> And @Chesnay, are there any other issues you are more concerned about? 
> Looking forward to your reply.
>
> Thanks for all the comments. If there are no further comments, we plan to 
> start the voting thread next week.
>
> Best,
> Hang
>
> Hang Ruan <ruanhang1...@gmail.com> 于2022年12月15日周四 16:49写道:
>>
>> Hi, Zhu Zhu,
>>
>> Thanks for your feedback!
>>
>> The OperatorCoordinator implementations are different. And their metrics are 
>> much different too. We try to find the common metrics and put them in the 
>> OperatorCoordinatorMetricGroup. If most developers think we do not need 
>> these common metrics, removing the common metrics is also acceptable.
>>
>> Best,
>> Hang
>>
>> Zhu Zhu <reed...@gmail.com> 于2022年12月14日周三 22:09写道:
>>>
>>> Hi Hang & MengYue,
>>>
>>> Thanks for creating this FLIP!
>>>
>>> I think it is very useful, mainly in two aspects:
>>> 1. Enables OperatorCoordinators to register metrics. Currently
>>> the coordinators has no way to do this. And operator coordinator
>>> metric group further enables the SplitEnumerator to have access
>>> to a registered metric group (via the existing public interface
>>> SplitEnumeratorContext#metricGroup()), which is null at the moment.
>>>
>>> 2. Defines the scope of operator coordinator metrics. A clear definition
>>> makes it easy for users to find their wanted metrics. The definition
>>> also helps to avoid conflicts of metrics from multiple OperatorCoordinators
>>> of the same kind. E.g. each SourceCoordinator may have its own
>>> numSourceSplits metric, these metrics should not be directly registered
>>> to the job metric group.
>>>
>>> What I'm a bit concerned is the necessity of the introduced common metrics
>>> numEventsInCounter & numEventsOutCounter. If there any case which strongly
>>> requires them?
>>>
>>> Regarding the concerns of Chesnay,
>>> > A dedicated coordinator MG implementation is overkill
>>> Directly using the job metric group can result in metric conflicts, as 
>>> mentioned
>>> in above #2.
>>>
>>> Thanks,
>>> Zhu
>>>
>>> Dong Lin <lindon...@gmail.com> 于2022年12月10日周六 14:16写道:
>>>
>>> >
>>> > Hi Chesney,
>>> >
>>> > Just to double check with you, OperatorCoordinatorMetricGroup (annotated 
>>> > as
>>> > @PublicEvolving) has already been introduced into Flink by FLIP-179
>>> > <https://cwiki.apache.org/confluence/display/FLINK/FLIP-179%3A+Expose+Standardized+Operator+Metrics>.
>>> > And that FLIP has got you +1.. Do you mean we should remove this
>>> > OperatorCoordinatorMetricGroup?
>>> >
>>> > Regards,
>>> > Dong
>>> >
>>> > On Sat, Dec 10, 2022 at 1:33 AM Chesnay Schepler <ches...@apache.org> 
>>> > wrote:
>>> >
>>> > > As a whole I feel like this FLIP is overly complicated. A dedicated
>>> > > coordinator MG implementation is overkill; it could just re-use the
>>> > > existing Task/OperatorMGs to create the same structure we have on TMs,
>>> > > similar to what we did with the Job MG.
>>> > >
>>> > > However, I'm not convinced that this is required anyway, because all the
>>> > > example metrics you listed can be implemented on the TM side +
>>> > > aggregating them in the external metrics backend.
>>> > >
>>> > > Since I'm on holidays soon, just so no one tries to pull a fast one on
>>> > > me, if this were to go to a vote as-is I'd be against it.
>>> > >
>>> > >
>>> > > On 09/12/2022 15:30, Dong Lin wrote:
>>> > > > Hi Hang,
>>> > > >
>>> > > > Thanks for the FLIP! The FLIP looks good and it is pretty informative.
>>> > > >
>>> > > > I have just two minor comments regarding names:
>>> > > > - Would it be useful to rename the config key as
>>> > > > *metrics.scope.jm.job.operator-coordinator* for consistency with
>>> > > > *metrics.scope.jm.job
>>> > > > *(which is not named as *jm-job)?
>>> > > > - Maybe rename the variable as SCOPE_NAMING_OPERATOR_COORDINATOR for
>>> > > > simplicity and consistency with SCOPE_NAMING_OPERATOR (which is not 
>>> > > > named
>>> > > > as SCOPE_NAMING_TM_JOB_OPERATOR)?
>>> > > >
>>> > > > Cheers,
>>> > > > Dong
>>> > > >
>>> > > >
>>> > > >
>>> > > > On Thu, Dec 8, 2022 at 3:28 PM Hang Ruan <ruanhang1...@gmail.com> 
>>> > > > wrote:
>>> > > >
>>> > > >> Hi all,
>>> > > >>
>>> > > >> MengYue and I created FLIP-274[1] Introduce metric group for
>>> > > >> OperatorCoordinator. OperatorCoordinator is the coordinator for 
>>> > > >> runtime
>>> > > >> operators and running on Job Manager. The coordination mechanism is
>>> > > >> operator events between OperatorCoordinator and its all operators, 
>>> > > >> the
>>> > > >> coordination is more and more using in Flink, for example many 
>>> > > >> Sources
>>> > > and
>>> > > >> Sinks depend on the mechanism to assign splits and coordinate 
>>> > > >> commits to
>>> > > >> external systems. The OperatorCoordinator is widely using in flink 
>>> > > >> kafka
>>> > > >> connector, flink pulsar connector, flink cdc connector, flink hudi
>>> > > >> connector and so on.
>>> > > >>
>>> > > >> But there is not a suitable metric group scope for the
>>> > > OperatorCoordinator
>>> > > >> and not an implementation for the interface
>>> > > OperatorCoordinatorMetricGroup.
>>> > > >> These metrics in OperatorCoordinator could be how many 
>>> > > >> splits/partitions
>>> > > >> have been assigned to source readers, how many files have been 
>>> > > >> written
>>> > > out
>>> > > >> by sink writers, these metrics not only help users to know the job
>>> > > progress
>>> > > >> but also make big job maintaining easier. Thus we propose the 
>>> > > >> FLIP-274
>>> > > to
>>> > > >> introduce a new metric group scope for OperatorCoordinator and 
>>> > > >> provide
>>> > > an
>>> > > >> internal implementation for OperatorCoordinatorMetricGroup.
>>> > > >>
>>> > > >> Could you help review this FLIP when you get time? Any feedback is
>>> > > >> appreciated!
>>> > > >>
>>> > > >> Best,
>>> > > >> Hang
>>> > > >>
>>> > > >> [1]
>>> > > >>
>>> > > >>
>>> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-274%3A+Introduce+metric+group+for+OperatorCoordinator
>>> > > >>
>>> > >
>>> > >

Reply via email to