Hi Martijn,

By saying "Each connector defines their own metrics at the moment", what
FLIP-33 intends to avoid is that connectA defines a metric named
numRecordsIn and connectB defines a metric named numOfRecordsIn. In this
case, these two metrics would have the same semantic meaning but different
name in the dashboard, which complicates the monitoring for users. FLIP-33
intends to standard the convention of metric name/semantic, but not the
implementation.

It is mentioned in FLIP-33 that "A connector implementation does not have
to report all the defined metrics. But if a connector reports a metric of
the same semantic defined below, the implementation should follow the
convention".

What is proposed in this thread will only affect the connection
implementation, i.e. SourceReaderBase would no longer increment
numRecordsIn. But it would not change the name/semantic of numRecordsIn.

Note that even if we don't change SourceReaderBase, a subclass of
SourceReaderBase (or Source in general) can still update numRecordsIn in a
way that breaks the semantic of this metric mentioned in FLIP-33. There is
no way we can programmatically prevent that from happening. Developers of
respective connectors need to be aware of the convention and enforce it
properly.

Regards,
Dong



On Mon, Jan 9, 2023 at 6:06 AM Martijn Visser <martijnvis...@apache.org>
wrote:

> Hi Dong,
>
> In the proposal from Wencong is stated "If source reader want to report to
> metric by self, it can set source.reader.metric.num_records_in.override to
> true, which disables the registration of numRecordsIn in SourceReaderBase
> and let the actual implementation to report the metric instead."
>
> While in FLIP-33 is stated "Each connector defines their own metrics at the
> moment. This complicates operation and monitoring. Admittedly, different
> connectors may have different metrics, but some commonly used metrics can
> probably be standardized. This FLIP proposes a set of standard connector
> metrics that each connector should emit if applicable."
>
> With this proposal, the actual implementation can be disabled in
> SourceReaderBase and the connector can define its own metric again. With
> how I read FLIP-33, the purpose was exactly to avoid that. So you can have
> a connector that has its own definition for numRecordsIn, which it
> submits/sends as numRecordsIn, which basically breaks it being a default
> metric.
>
> Best regards,
>
> Martijn
>
> On Sun, Jan 8, 2023 at 3:26 PM Dong Lin <lindon...@gmail.com> wrote:
>
> > Hi Martijn,
> >
> > I think the change proposed in this thread would *not* break the idea in
> > FLIP-33. FLIP-33 standardized metrics such as numRecordsIn, by specifying
> > the name/type/semantics of those metrics so that all connectors can
> follow.
> > The name/type/semantics of numRecordsIn metric would be the same before
> and
> > after the proposed change.
> >
> > Does this make sense? Or could you explain which part of FLIP-33 would be
> > broken by the proposed change?
> >
> > Regards,
> > Dong
> >
> >
> > On Sun, Jan 8, 2023 at 9:33 PM Martijn Visser <martijnvis...@apache.org>
> > wrote:
> >
> > > Hi all,
> > >
> > > This feels like we purposely want to abandon the idea that was
> introduced
> > > with FLIP-33 on standardizing metrics [1]. From this proposal, I don't
> > see
> > > why we want to abandon that idea. Next to that, if you override
> > > the numRecordsIn logic, you also touch on the other metrics that rely
> on
> > > this value.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics
> > >
> > > On Sun, Jan 8, 2023 at 12:43 PM Wencong Liu <liuwencle...@163.com>
> > wrote:
> > >
> > > > Thanks for the discussion, Dong and John
> > > >
> > > >
> > > > Considering the extra cost of method calls, the approach of adding an
> > > > extra SourceReaderBase constructor
> > > > should be a better choice. If there is no further discussion, I will
> > > > follow this plan.
> > > >
> > > >
> > > > Best,
> > > > Wencong
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > At 2023-01-08 01:05:07, "John Roesler" <vvcep...@apache.org> wrote:
> > > > >Thanks, for the discussion, Dong.
> > > > >
> > > > >To answer your question: I was unclear if the desire was to
> deprecate
> > > the
> > > > metric itself, to be replaced with some other metric, or whether we
> > just
> > > > wanted to deprecate the way the superclass manages it. It’s clear now
> > > that
> > > > we want to keep the metric and only change the superclass logic.
> > > > >
> > > > >I think we’re on the same page now.
> > > > >
> > > > >Thanks!
> > > > >-John
> > > > >
> > > > >On Sat, Jan 7, 2023, at 07:21, Dong Lin wrote:
> > > > >> Hi John,
> > > > >>
> > > > >> Not sure if I understand the difference between "deprecate that
> > > metric"
> > > > and
> > > > >> "deprecate the private counter mechanism". I think what we want is
> > to
> > > > >> update SourceReaderBase's implementation so that it no longer
> > > explicitly
> > > > >> increments this metric. But we still need to expose this metric to
> > > user.
> > > > >> And methods such as RecordEmitter#emitRecord (which can be invoked
> > by
> > > > >> SourceReaderBase#pollNext(...)) can potentially increment this
> > metric.
> > > > >>
> > > > >> I like your approach of adding an extra SourceReaderBase
> > constructor.
> > > > That
> > > > >> appears simpler than adding a deprecated config. We can mark the
> > > > existing
> > > > >> SourceReaderBase constructor as @deprecated.
> > > > SourceReaderBase#pollNext(...)
> > > > >> will not increment the counter if a subclass uses the newly added
> > > > >> constructor.
> > > > >>
> > > > >> Cheers,
> > > > >> Dong
> > > > >>
> > > > >>
> > > > >> On Sat, Jan 7, 2023 at 4:47 AM John Roesler <vvcep...@apache.org>
> > > > wrote:
> > > > >>
> > > > >>> Thanks for the replies, Dong and Wencong!
> > > > >>>
> > > > >>> That’s a good point about the overhead of the extra method.
> > > > >>>
> > > > >>> Is the desire to actually deprecate that metric in a user-facing
> > way,
> > > > or
> > > > >>> just to deprecate the private counter mechanism?
> > > > >>>
> > > > >>> It seems like if the desire is to deprecate the existing private
> > > > counter,
> > > > >>> we can accomplish it by deprecating the current constructor and
> > > > offering
> > > > >>> another that is documented not to track the metric. This seems
> > better
> > > > than
> > > > >>> the config option, since the concern is purely about the division
> > of
> > > > >>> responsibilities between the sub- and super-class.
> > > > >>>
> > > > >>> Another option, which might be better if we wish to keep a
> > uniformly
> > > > named
> > > > >>> metric, would be to simply make the counter protected. That would
> > be
> > > > better
> > > > >>> if we’re generally happy with the metric and counter, but a few
> > > special
> > > > >>> source connectors need to emit records in other ways.
> > > > >>>
> > > > >>> And finally, if we really want to get rid of the metric itself,
> > then
> > > I
> > > > >>> agree, a config is the way to do it.
> > > > >>>
> > > > >>> Thanks,
> > > > >>> John
> > > > >>>
> > > > >>> On Fri, Jan 6, 2023, at 00:55, Dong Lin wrote:
> > > > >>> > Hi John and Wencong,
> > > > >>> >
> > > > >>> > Thanks for the reply!
> > > > >>> >
> > > > >>> > It is nice that optional-2 can address the problem without
> > > affecting
> > > > the
> > > > >>> > existing source connectors as far as functionality is
> concerned.
> > > One
> > > > >>> > potential concern with this approach is that it might increase
> > the
> > > > Flink
> > > > >>> > runtime overhead by adding one more virtual functional call to
> > the
> > > > >>> > per-record runtime call stack.
> > > > >>> >
> > > > >>> > Since Java's default MaxInlineLevel is 12-18, I believe it is
> > easy
> > > > for an
> > > > >>> > operator chain of 5+ operators to exceed this limit. In this
> > case.
> > > > And
> > > > >>> > option-2 would incur one more virtual table lookup to produce
> > each
> > > > >>> record.
> > > > >>> > It is not clear how much this overhead would show up for jobs
> > with
> > > a
> > > > >>> chain
> > > > >>> > of lightweight operators. I am recently working on FLINK-30531
> > > > >>> > <https://issues.apache.org/jira/browse/FLINK-30531> to reduce
> > > > runtime
> > > > >>> > overhead which might be related to this discussion.
> > > > >>> >
> > > > >>> > In comparison to option-2, the option-3 provided in my earlier
> > > email
> > > > >>> would
> > > > >>> > not add this extra overhead. I think it might be worthwhile to
> > > > invest in
> > > > >>> > the long-term performance (and simpler runtime infra) and pay
> for
> > > the
> > > > >>> > short-term cost of deprecating this metric in
> SourceOperatorBase.
> > > > What do
> > > > >>> > you think?
> > > > >>> >
> > > > >>> > Regards,
> > > > >>> > Dong
> > > > >>> >
> > > > >>> >
> > > > >>> > On Thu, Jan 5, 2023 at 10:10 PM Wencong Liu <
> > liuwencle...@163.com>
> > > > >>> wrote:
> > > > >>> >
> > > > >>> >> Hi, All
> > > > >>> >>
> > > > >>> >>
> > > > >>> >> Thanks for the reply!
> > > > >>> >>
> > > > >>> >>
> > > > >>> >> I think both John and Dong's opinions are reasonable. John's
> > > > Suggestion
> > > > >>> 2
> > > > >>> >> is a good implementation.
> > > > >>> >> It does not affect the existing source connectors, but also
> > > provides
> > > > >>> >> support
> > > > >>> >> for custom counter in the future implementation.
> > > > >>> >>
> > > > >>> >>
> > > > >>> >> WDYT?
> > > > >>> >>
> > > > >>> >>
> > > > >>> >> Best,
> > > > >>> >>
> > > > >>> >> Wencong Liu
> > > > >>>
> > > >
> > >
> >
>

Reply via email to