Hi Martijn,

No worries. Thanks for the discussion!

Suppose we agree to deprecate the existing SourceReaderBase public
constructor, it means we plan to remove this construct (which could be
backward incompatible) sooner or later. I think it makes sense to at least
update all the externalized connectors listed under
github.com/apache/flink-connector-* to use the new constructor, before we
remove the existing constructor.

Maybe Wencong can create a FLIP for this public API change and specify this
plan "Compatibility, Deprecation, and Migration Plan" section? The changes
are probably straightforward to make for each externalized connectors. I am
happy to make the change if it is not made after 2 Flink release cycles.

Cheers,
Dong




On Mon, Jan 9, 2023 at 4:27 PM Martijn Visser <martijnvis...@apache.org>
wrote:

> Hi Dong,
>
> I think the key part is indeed your sentence that "FLIP-33 intends to
> standard the convention of metric name/semantic, but not the
> implementation."
> When I read FLIP-33, that's not how I interpreted it. So I've been reading
> back in the FLIP-33 discussion thread, there were some more clarifications
> in that thread that it is indeed only meant for standardizing metric names.
> Should have done that earlier, sorry for that.
>
> As part of this proposal, does that also include updating the existing
> connectors (given that a lot of them have been externalized) to not use the
> to-be-deprecated SourceReaderBase constructor?
>
> Best regards,
>
> Martijn
>
> On Mon, Jan 9, 2023 at 1:39 AM Dong Lin <lindon...@gmail.com> wrote:
>
> > Hi Martijn,
> >
> > By saying "Each connector defines their own metrics at the moment", what
> > FLIP-33 intends to avoid is that connectA defines a metric named
> > numRecordsIn and connectB defines a metric named numOfRecordsIn. In this
> > case, these two metrics would have the same semantic meaning but
> different
> > name in the dashboard, which complicates the monitoring for users.
> FLIP-33
> > intends to standard the convention of metric name/semantic, but not the
> > implementation.
> >
> > It is mentioned in FLIP-33 that "A connector implementation does not have
> > to report all the defined metrics. But if a connector reports a metric of
> > the same semantic defined below, the implementation should follow the
> > convention".
> >
> > What is proposed in this thread will only affect the connection
> > implementation, i.e. SourceReaderBase would no longer increment
> > numRecordsIn. But it would not change the name/semantic of numRecordsIn.
> >
> > Note that even if we don't change SourceReaderBase, a subclass of
> > SourceReaderBase (or Source in general) can still update numRecordsIn in
> a
> > way that breaks the semantic of this metric mentioned in FLIP-33. There
> is
> > no way we can programmatically prevent that from happening. Developers of
> > respective connectors need to be aware of the convention and enforce it
> > properly.
> >
> > Regards,
> > Dong
> >
> >
> >
> > On Mon, Jan 9, 2023 at 6:06 AM Martijn Visser <martijnvis...@apache.org>
> > wrote:
> >
> > > Hi Dong,
> > >
> > > In the proposal from Wencong is stated "If source reader want to report
> > to
> > > metric by self, it can set source.reader.metric.num_records_in.override
> > to
> > > true, which disables the registration of numRecordsIn in
> SourceReaderBase
> > > and let the actual implementation to report the metric instead."
> > >
> > > While in FLIP-33 is stated "Each connector defines their own metrics at
> > the
> > > moment. This complicates operation and monitoring. Admittedly,
> different
> > > connectors may have different metrics, but some commonly used metrics
> can
> > > probably be standardized. This FLIP proposes a set of standard
> connector
> > > metrics that each connector should emit if applicable."
> > >
> > > With this proposal, the actual implementation can be disabled in
> > > SourceReaderBase and the connector can define its own metric again.
> With
> > > how I read FLIP-33, the purpose was exactly to avoid that. So you can
> > have
> > > a connector that has its own definition for numRecordsIn, which it
> > > submits/sends as numRecordsIn, which basically breaks it being a
> default
> > > metric.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > On Sun, Jan 8, 2023 at 3:26 PM Dong Lin <lindon...@gmail.com> wrote:
> > >
> > > > Hi Martijn,
> > > >
> > > > I think the change proposed in this thread would *not* break the idea
> > in
> > > > FLIP-33. FLIP-33 standardized metrics such as numRecordsIn, by
> > specifying
> > > > the name/type/semantics of those metrics so that all connectors can
> > > follow.
> > > > The name/type/semantics of numRecordsIn metric would be the same
> before
> > > and
> > > > after the proposed change.
> > > >
> > > > Does this make sense? Or could you explain which part of FLIP-33
> would
> > be
> > > > broken by the proposed change?
> > > >
> > > > Regards,
> > > > Dong
> > > >
> > > >
> > > > On Sun, Jan 8, 2023 at 9:33 PM Martijn Visser <
> > martijnvis...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > This feels like we purposely want to abandon the idea that was
> > > introduced
> > > > > with FLIP-33 on standardizing metrics [1]. From this proposal, I
> > don't
> > > > see
> > > > > why we want to abandon that idea. Next to that, if you override
> > > > > the numRecordsIn logic, you also touch on the other metrics that
> rely
> > > on
> > > > > this value.
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Martijn
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics
> > > > >
> > > > > On Sun, Jan 8, 2023 at 12:43 PM Wencong Liu <liuwencle...@163.com>
> > > > wrote:
> > > > >
> > > > > > Thanks for the discussion, Dong and John
> > > > > >
> > > > > >
> > > > > > Considering the extra cost of method calls, the approach of
> adding
> > an
> > > > > > extra SourceReaderBase constructor
> > > > > > should be a better choice. If there is no further discussion, I
> > will
> > > > > > follow this plan.
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > Wencong
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > At 2023-01-08 01:05:07, "John Roesler" <vvcep...@apache.org>
> > wrote:
> > > > > > >Thanks, for the discussion, Dong.
> > > > > > >
> > > > > > >To answer your question: I was unclear if the desire was to
> > > deprecate
> > > > > the
> > > > > > metric itself, to be replaced with some other metric, or whether
> we
> > > > just
> > > > > > wanted to deprecate the way the superclass manages it. It’s clear
> > now
> > > > > that
> > > > > > we want to keep the metric and only change the superclass logic.
> > > > > > >
> > > > > > >I think we’re on the same page now.
> > > > > > >
> > > > > > >Thanks!
> > > > > > >-John
> > > > > > >
> > > > > > >On Sat, Jan 7, 2023, at 07:21, Dong Lin wrote:
> > > > > > >> Hi John,
> > > > > > >>
> > > > > > >> Not sure if I understand the difference between "deprecate
> that
> > > > > metric"
> > > > > > and
> > > > > > >> "deprecate the private counter mechanism". I think what we
> want
> > is
> > > > to
> > > > > > >> update SourceReaderBase's implementation so that it no longer
> > > > > explicitly
> > > > > > >> increments this metric. But we still need to expose this
> metric
> > to
> > > > > user.
> > > > > > >> And methods such as RecordEmitter#emitRecord (which can be
> > invoked
> > > > by
> > > > > > >> SourceReaderBase#pollNext(...)) can potentially increment this
> > > > metric.
> > > > > > >>
> > > > > > >> I like your approach of adding an extra SourceReaderBase
> > > > constructor.
> > > > > > That
> > > > > > >> appears simpler than adding a deprecated config. We can mark
> the
> > > > > > existing
> > > > > > >> SourceReaderBase constructor as @deprecated.
> > > > > > SourceReaderBase#pollNext(...)
> > > > > > >> will not increment the counter if a subclass uses the newly
> > added
> > > > > > >> constructor.
> > > > > > >>
> > > > > > >> Cheers,
> > > > > > >> Dong
> > > > > > >>
> > > > > > >>
> > > > > > >> On Sat, Jan 7, 2023 at 4:47 AM John Roesler <
> > vvcep...@apache.org>
> > > > > > wrote:
> > > > > > >>
> > > > > > >>> Thanks for the replies, Dong and Wencong!
> > > > > > >>>
> > > > > > >>> That’s a good point about the overhead of the extra method.
> > > > > > >>>
> > > > > > >>> Is the desire to actually deprecate that metric in a
> > user-facing
> > > > way,
> > > > > > or
> > > > > > >>> just to deprecate the private counter mechanism?
> > > > > > >>>
> > > > > > >>> It seems like if the desire is to deprecate the existing
> > private
> > > > > > counter,
> > > > > > >>> we can accomplish it by deprecating the current constructor
> and
> > > > > > offering
> > > > > > >>> another that is documented not to track the metric. This
> seems
> > > > better
> > > > > > than
> > > > > > >>> the config option, since the concern is purely about the
> > division
> > > > of
> > > > > > >>> responsibilities between the sub- and super-class.
> > > > > > >>>
> > > > > > >>> Another option, which might be better if we wish to keep a
> > > > uniformly
> > > > > > named
> > > > > > >>> metric, would be to simply make the counter protected. That
> > would
> > > > be
> > > > > > better
> > > > > > >>> if we’re generally happy with the metric and counter, but a
> few
> > > > > special
> > > > > > >>> source connectors need to emit records in other ways.
> > > > > > >>>
> > > > > > >>> And finally, if we really want to get rid of the metric
> itself,
> > > > then
> > > > > I
> > > > > > >>> agree, a config is the way to do it.
> > > > > > >>>
> > > > > > >>> Thanks,
> > > > > > >>> John
> > > > > > >>>
> > > > > > >>> On Fri, Jan 6, 2023, at 00:55, Dong Lin wrote:
> > > > > > >>> > Hi John and Wencong,
> > > > > > >>> >
> > > > > > >>> > Thanks for the reply!
> > > > > > >>> >
> > > > > > >>> > It is nice that optional-2 can address the problem without
> > > > > affecting
> > > > > > the
> > > > > > >>> > existing source connectors as far as functionality is
> > > concerned.
> > > > > One
> > > > > > >>> > potential concern with this approach is that it might
> > increase
> > > > the
> > > > > > Flink
> > > > > > >>> > runtime overhead by adding one more virtual functional call
> > to
> > > > the
> > > > > > >>> > per-record runtime call stack.
> > > > > > >>> >
> > > > > > >>> > Since Java's default MaxInlineLevel is 12-18, I believe it
> is
> > > > easy
> > > > > > for an
> > > > > > >>> > operator chain of 5+ operators to exceed this limit. In
> this
> > > > case.
> > > > > > And
> > > > > > >>> > option-2 would incur one more virtual table lookup to
> produce
> > > > each
> > > > > > >>> record.
> > > > > > >>> > It is not clear how much this overhead would show up for
> jobs
> > > > with
> > > > > a
> > > > > > >>> chain
> > > > > > >>> > of lightweight operators. I am recently working on
> > FLINK-30531
> > > > > > >>> > <https://issues.apache.org/jira/browse/FLINK-30531> to
> > reduce
> > > > > > runtime
> > > > > > >>> > overhead which might be related to this discussion.
> > > > > > >>> >
> > > > > > >>> > In comparison to option-2, the option-3 provided in my
> > earlier
> > > > > email
> > > > > > >>> would
> > > > > > >>> > not add this extra overhead. I think it might be worthwhile
> > to
> > > > > > invest in
> > > > > > >>> > the long-term performance (and simpler runtime infra) and
> pay
> > > for
> > > > > the
> > > > > > >>> > short-term cost of deprecating this metric in
> > > SourceOperatorBase.
> > > > > > What do
> > > > > > >>> > you think?
> > > > > > >>> >
> > > > > > >>> > Regards,
> > > > > > >>> > Dong
> > > > > > >>> >
> > > > > > >>> >
> > > > > > >>> > On Thu, Jan 5, 2023 at 10:10 PM Wencong Liu <
> > > > liuwencle...@163.com>
> > > > > > >>> wrote:
> > > > > > >>> >
> > > > > > >>> >> Hi, All
> > > > > > >>> >>
> > > > > > >>> >>
> > > > > > >>> >> Thanks for the reply!
> > > > > > >>> >>
> > > > > > >>> >>
> > > > > > >>> >> I think both John and Dong's opinions are reasonable.
> John's
> > > > > > Suggestion
> > > > > > >>> 2
> > > > > > >>> >> is a good implementation.
> > > > > > >>> >> It does not affect the existing source connectors, but
> also
> > > > > provides
> > > > > > >>> >> support
> > > > > > >>> >> for custom counter in the future implementation.
> > > > > > >>> >>
> > > > > > >>> >>
> > > > > > >>> >> WDYT?
> > > > > > >>> >>
> > > > > > >>> >>
> > > > > > >>> >> Best,
> > > > > > >>> >>
> > > > > > >>> >> Wencong Liu
> > > > > > >>>
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to