Re: [DISCUSS] Reference operator from Flink Kubernetes deployment docs

2022-10-13 Thread Jing Ge
+1 good idea!

On Thu, Oct 13, 2022 at 2:42 PM Yang Wang  wrote:

> +1 for increasing the visibility of flink-kubernetes-operator.
>
> Best,
> Yang
>
> Thomas Weise wrote on Thursday, Oct 13, 2022 at 07:49:
>
> > +1
> >
> >
> > On Wed, Oct 12, 2022 at 5:03 PM Martijn Visser  >
> > wrote:
> >
> > > +1 from my end to include the operator in the related Kubernetes
> sections
> > > of the Flink docs
> > >
> > > On Wed, Oct 12, 2022 at 5:31 PM Chesnay Schepler 
> > > wrote:
> > >
> > > > I don't see a reason why we shouldn't at least mention the
> operator
> > > > in the Kubernetes docs.
> > > >
> > > > On 12/10/2022 16:25, Gyula Fóra wrote:
> > > > > Hi Devs!
> > > > >
> > > > > I would like to start a discussion about referencing the Flink
> > > Kubernetes
> > > > > Operator directly from the Flink Kubernetes deployment
> documentation.
> > > > >
> > > > > Currently the Flink deployment/resource provider docs provide some
> > > > > information for the Standalone and Native Kubernetes integration
> > > without
> > > > > any reference to the operator.
> > > > >
> > > > > I think we reached a point with the operator where we should
> provide
> > a
> > > > bit
> > > > > more visibility and value to the users by directly proposing to use
> > the
> > > > > operator when considering Flink on Kubernetes. We should definitely
> > > keep
> > > > > the current docs but make the point that for most users the easiest
> > way
> > > > to
> > > > > use Flink on Kubernetes is probably through the operator (where
> they
> > > can
> > > > > now benefit from both standalone and native integration under the
> > > hood).
> > > > > This should help us avoid cases where a new user completely misses
> > the
> > > > > existence of the operator when starting out based on the Flink
> docs.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Gyula
> > > > >
> > > >
> > > >
> > >
> >
>


dev@flink.apache.org

2022-10-13 Thread Jing Ge
+1, and I would suggest giving a concrete example to explain 4) support. The
text is still not easy to understand, and not many (future) connector
developers could join this discussion. It is better to make it as clear as
possible at the beginning than to spend more time explaining it multiple
times later. Just my two cents.

Best regards,
Jing

On Thu, Oct 13, 2022 at 2:02 PM Ryan Skraba 
wrote:

> +1 non-binding! I've been following (and generally agreeing with) the
> thread -- it's a perfectly reasonable way to start, and I'm sure we can
> adjust the process if it turns out to be unsuitable or unexpected as the
> connectors evolve in their external repositories.
>
> On Thu, Oct 13, 2022 at 12:37 PM Thomas Weise  wrote:
>
> > +1 (binding) for the vote and thanks for the explanation
> >
> > On Thu, Oct 13, 2022 at 5:58 AM Chesnay Schepler 
> > wrote:
> >
> > > @Thomas:
> > > Version-specific modules that either contain a connector or shims to
> > > support that Flink version.
> > > Alternatively, since the addition of such code (usually) goes beyond a
> > > patch release you'd create a new minor version and could have that only
> > > support the later version.
> > >
> > > On 13/10/2022 02:05, Thomas Weise wrote:
> > > > "Branches are not specific to a Flink version. (i.e., no v3.2-1.15)"
> > > >
> > > > Sorry for the late question. I could not find in the discussion
> thread
> > > how
> > > > a connector can make use of features of the latest Flink version that
> > > were
> > > > not present in the previous Flink version, when branches cannot be
> > Flink
> > > > version specific?
> > > >
> > > > Thanks,
> > > > Thomas
> > > >
> > > > On Wed, Oct 12, 2022 at 4:09 PM Ferenc Csaky
> >  > > >
> > > > wrote:
> > > >
> > > >> +1 from my side (non-binding)
> > > >>
> > > >> Best,
> > > >> F
> > > >>
> > > >>
> > > >> --- Original Message ---
> > > >> On Wednesday, October 12th, 2022 at 15:47, Martijn Visser <
> > > >> martijnvis...@apache.org> wrote:
> > > >>
> > > >>
> > > >>>
> > > >>> +1 (binding), I am indeed assuming that Chesnay meant the last two
> > > minor
> > > >>> versions as supported.
> > > >>>
> > > On Wed, Oct 12, 2022 at 20:18, Danny Cranmer
> > > >> dannycran...@apache.org wrote:
> > >  Thanks for the concise summary Chesnay.
> > > 
> > >  +1 from me (binding)
> > > 
> > >  Just one clarification, for "3.1) The Flink versions supported by
> > the
> > >  project (last 2 major Flink versions) must be supported.". Do we
> > > >> actually
> > >  mean major here, as in Flink 1.x.x and 2.x.x? Right now we would
> > only
> > >  support Flink 1.15.x and not 1.14.x? I would be inclined to
> support
> > > the
> > >  latest 2 minor Flink versions (major.minor.patch) given that we
> only
> > > >> have 1
> > >  active major Flink version.
> > > 
> > >  Danny
> > > 
> > >  On Wed, Oct 12, 2022 at 2:12 PM Chesnay Schepler
> ches...@apache.org
> > >  wrote:
> > > 
> > > > Since the discussion
> > > > (
> https://lists.apache.org/thread/mpzzlpob9ymkjfybm96vz2y2m5fjyvfo)
> > > >> has
> > > > stalled a bit but we need a conclusion to move forward I'm
> opening
> > a
> > > > vote.
> > > >
> > > > Proposal summary:
> > > >
> > > > 1) Branch model
> > > > 1.1) The default branch is called "main" and used for the next
> > major
> > > > iteration.
> > > > 1.2) Remaining branches are called "vmajor.minor". (e.g., v3.2)
> > > > 1.3) Branches are not specific to a Flink version. (i.e., no
> > > >> v3.2-1.15)
> > > > 2) Versioning
> > > > 2.1) Source releases: major.minor.patch
> > > > 2.2) Jar artifacts: major.minor.patch-flink-major.flink-minor
> > > > (This may imply releasing the exact same connector jar multiple
> > times
> > > > under different versions)
> > > >
> > > > 3) Flink compatibility
> > > > 3.1) The Flink versions supported by the project (last 2 major
> > Flink
> > > > versions) must be supported.
> > > > 3.2) How this is achieved is left to the connector, as long as it
> > > > conforms to the rest of the proposal.
> > > >
> > > > 4) Support
> > > > 4.1) The last 2 major connector releases are supported with only
> > the
> > > > latter receiving additional features, with the following
> > exceptions:
> > > > 4.1.a) If the older major connector version does not support any
> > > > currently supported Flink version, then it is no longer
> supported.
> > > > 4.1.b) If the last 2 major versions do not cover all supported
> > Flink
> > > > versions, then the latest connector version that supports the
> older
> > > > Flink version additionally gets patch support.
> > > > 4.2) For a given major connector version only the latest minor
> > > >> version
> > > > is supported.
> > > > (This means if 1.1.x is released there will be no more 1.0.x
> > release)
> > > >
> > > > I'd like to clarify that these w
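
To make the versioning scheme above concrete, a small, hedged illustration of
how 2.1) and 2.2) compose (the coordinates are hypothetical, not real
releases):

    // Illustrative only: how schemes 2.1 and 2.2 compose version strings.
    public class ConnectorVersioningSketch {
        public static void main(String[] args) {
            String sourceRelease = "3.0.0"; // 2.1) major.minor.patch (hypothetical)
            String flinkVersion = "1.17";   // flink-major.flink-minor (hypothetical)
            // 2.2) jar artifact version: "3.0.0-1.17"
            String jarVersion = sourceRelease + "-" + flinkVersion;
            // The same connector source may thus ship as 3.0.0-1.16 and 3.0.0-1.17.
            System.out.println(jarVersion);
        }
    }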

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-13 Thread Jing Ge
Hi Qingsheng,

Thanks for the clarification. +1, I like the idea. Pointing both numXXXOut
and numXXXSend to the same external data transfer metric does not really
break the new SinkV2 design, since there was no requirement to monitor the
internal traffic. So, I think both developers and users can live with it. It
might not be the perfect solution, but it is indeed the best trade-off
currently available given the backward compatibility constraints. I would
suggest filing a follow-up ticket after the PR to take care of a new
metric for the internal traffic in the future.
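
A minimal sketch of how such aliasing could look on a Flink MetricGroup,
assuming both names are meant to expose the external write count
(illustrative only, not the actual SinkWriter metric code):

    import org.apache.flink.metrics.Counter;
    import org.apache.flink.metrics.MetricGroup;

    // Sketch only: register one counter and expose it under both names so that
    // numRecordsSend always reports the same value as numRecordsOut.
    public class SendOutAliasSketch {
        public static Counter register(MetricGroup group) {
            Counter numRecordsOut = group.counter("numRecordsOut");
            return group.counter("numRecordsSend", numRecordsOut); // same backing counter
        }
    }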

Best regards,
Jing


On Thu, Oct 13, 2022 at 3:08 PM Qingsheng Ren  wrote:

> Hi Jing,
>
> Thanks for the reply!
>
> Let me rephrase my proposal: we’d like to use numXXXOut registered on
> SinkWriterOperator to reflect the traffic to the external system for
> compatibility with old versions before 1.15, and make numXXXSend have the
> same value as numXXXOut for compatibility within 1.15. That means both
> numXXXOut and numXXXSend are used for external data transfers, which end
> users care more about. As for the internal traffic within the sink, we
> could name a new metric for it because this is a _new_ feature in the _new_
> sink, and end users usually don’t pay attention to internal implementation.
> The name of the new metric could be discussed later after 1.16 release.
>
> > but it might end up with monitoring unexpected metrics, which is even
> worse for users, i.e. I didn't change anything, but something has been
> broken since the last update.
>
> Yeah this is exactly what we are trying to fix with this proposal. I
> believe users are more concerned with the output to the external system
> than the internal data delivery in the sink, so I think we’ll have more
> cases reporting like “I set up a panel on numRecordsOut in sink to monitor
> the output of the job, but after upgrading to 1.15 this value is extremely
> low and I didn’t change anything” if we stick to the current situation. I
> think only a few end users care about the number of committables sending to
> downstream as most of them don’t care how the sink works.
>
> We do need a re-design to fully distinguish the internal and external
> traffic on metrics, not only in sink but in all operators as it’s quite
> common for operators to make IO. This needs time to design, discuss, adjust
> and vote, but considering this is blocking 1.16, maybe it’s better to
> rescue the compatibility for now, and leave the huge reconstruction to
> future versions (maybe 2.0).
>
> Best,
> Qingsheng
>
> On Wed, Oct 12, 2022 at 7:21 PM Jing Ge  wrote:
>
>> Hi Qingsheng,
>>
>> Just want to make sure we are on the same page. Are you suggesting
>> switching the naming between "numXXXSend" and "numXXXOut" or reverting all
>> the changes we did with FLINK-26126 and FLINK-26492?
>>
>> For the naming switch, please note that the behaviour has changed since
>> we introduced SinkV2 [1]. So, please be aware of different numbers
>> (behaviour change) even with the same metric name. Sticking with the old
>> name but the new behaviour (a very bad idea, IMHO) might seem like it
>> saves effort at first, but it might end up with users monitoring
>> unexpected metrics, which is even worse, i.e. "I didn't change anything,
>> but something has been broken since the last update."
>>
>> For reverting, I am not sure how to fix the issue mentioned in
>> FLINK-26126 after reverting all changes. Like Chesnay has already pointed
>> out, with SinkV2 we have two different output lines - one with the external
>> system and the other with the downstream operator. In this case,
>> "numXXXSend" is rather a new metric than a replacement of "numXXXOut". The
>> "numXXXOut" metric can still be used, depending on what the user wants to
>> monitor.
>>
>>
>> Best regards,
>> Jing
>>
>> [1]
>> https://github.com/apache/flink/blob/51fc20db30d001a95de95b3b9993eeb06f558f6c/flink-metrics/flink-metrics-core/src/main/java/org/apache/flink/metrics/groups/SinkWriterMetricGroup.java#L48
>>
>>
>> On Wed, Oct 12, 2022 at 12:48 PM Qingsheng Ren  wrote:
>>
>>> As a supplement, considering it could be a big reconstruction
>>> redefining internal and external traffic and touching metric names in
>>> almost all operators, this requires a lot of discussions and we might
>>> do it finally in Flink 2.0. I think compatibility is a bigger blocker
>>> in front of us, as the output of sink is a metric that users care a
>>> lot about.
>>>
>>> Thanks,
>>> Qingsheng
>>

Re: [VOTE] Reverting sink metric name changes made in 1.15

2022-10-14 Thread Jing Ge
+1

The voting title might cause confusion. Technically speaking, it is a
further modification on top of the current status rather than a revert,
after considering backward compatibility.

Best regards,
Jing

On Fri, Oct 14, 2022 at 10:41 AM Qingsheng Ren  wrote:

> Thanks for the reply Chesnay!
>
> I made a POC [1] just now, and I created a draft PR [2] so that it's easier
> for everyone to leave comments on it.
>
> [1] https://github.com/PatrickRen/flink/tree/FLINK-29567-POC
> [2] https://github.com/apache/flink/pull/21065
>
> Best,
> Qingsheng
>
>
> On Fri, Oct 14, 2022 at 1:56 AM Chesnay Schepler 
> wrote:
>
> > Do we have a PoC that achieves this without re-introducing the bug where
> > the numRecordsOut was simply wrong because it counted both records
> > written to the external system and the downstream committer?
> > It's gonna be quite the dirty hack I assume.
> >
> > On 13/10/2022 19:24, Qingsheng Ren wrote:
> > > Hi devs,
> > >
> > > I'd like to start a vote about reverting sink metric name changes made
> in
> > > 1.15 considering compatibility issues. These metrics include:
> > >
> > > - numRecordsSend -> numRecordsOut
> > > - numRecordsSendPerSecond -> numRecordsOutPerSecond
> > > - numBytesSend -> numBytesOut
> > > - numBytesSendPerSecond -> numBytesOutPerSecond
> > > - numRecordsSendError -> numRecordsOutError
> > >
> > > which reflect the output of the sink to the external system. "send"
> > metric
> > > series will be kept with the same value as "out" metric series. This
> > change
> > > will be applied to 1.15 and 1.16. More details could be found in the
> > > discussion thread [1].
> > >
> > > The vote will open for at least 72 hours.
> > >
> > > Looking forward to your feedback!
> > >
> > > [1] https://lists.apache.org/thread/vxhty3q97s7pw2zn0jhkyd6sxwwodzbv
> > >
> > > Best,
> > > Qingsheng
> > >
> >
> >
>


dev@flink.apache.org

2022-10-14 Thread Jing Ge
Thanks for the comprehensive explanation. It is clear now.

Best regards,
Jing

On Fri, Oct 14, 2022 at 9:51 AM Matthias Pohl
 wrote:

> Ok, a bit of back-and-forth reading. :-D Thanks for the example. It sounds
> reasonable.
>
> +1 (binding)
>
> On Thu, Oct 13, 2022 at 8:33 PM Chesnay Schepler 
> wrote:
>
> > I will write this all down in the wiki once the vote is over, but here
> > are some examples.
> >
> >
> > Let's start out with a happy-case scenario with one connector supporting
> > the last 2 Flink versions.
> > This will commonly be the scenario for connectors when they have been
> > externalized:
> >
> > v1: 1.14-1.15
> >
> >
> > Now we create a v2 that only supports 1.15:
> >
> > v1: 1.14-1.15 (patch support)
> > v2: 1.15 (feature support)
> >
> > 4.1) kicks in: both versions get support, but only the latest
> > gets new features.
> >
> >
> > Now 1.16 is released, which v2 also supports.
> >
> > v1: 1.14-1.15 (patch support)
> > v2: 1.15-1.16 (feature support)
> >
> > Nothing changes.
> >
> >
> > Now 1.17 is released:
> >
> > v1: 1.14-1.15 (no support)
> > v2: 1.15-1.17
> >
> > Here 4.1.a kicks in; v1 supports no supported Flink version and loses
> > support.
> >
> >
> > Now we create v3 targeting 1.17, and shortly thereafter v4, also
> > targeting 1.17, because we messed something up or are just that excited
> > about finally having major releases.
> >
> > v2: 1.15-1.17 (patch support)
> > v3: 1.17 (patch support)
> > v4: 1.17 (feature support)
> >
> > Here 4.1.b kicks in, ensuring that v2 is still supported since we need
> > to support all Flink versions.
> >
> >
> > Now 1.18 is released, with v3 and v4 supporting it.
> >
> > v2: 1.15-1.17 (no support)
> > v3: 1.17 (patch support)
> > v4: 1.17 (feature support)
> >
> > General 4.1) rule kicks in, with only the last 2 major versions being
> > supported.
> >
> > On 13/10/2022 16:25, Jing Ge wrote:
> > > +1 and I would suggest giving a concrete example to explain 4) support.
> > It
> > > is still not quite easy to understand the text. Not many (future)
> > connector
> > > developers could join this discussion. It is better to make it as clear
> > as
> > > possible at the beginning than spend more time explaining multiple
> times.
> > > Just my two cents.
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Thu, Oct 13, 2022 at 2:02 PM Ryan Skraba
>  > >
> > > wrote:
> > >
> > >> +1 non-binding!  I've been following (and generally agreeing) with the
> > >> thread -- it's a perfectly reasonable way to start, and I'm sure we
> can
> > >> adjust the process if it turns out to be unsuitable or unexpected as
> the
> > >> connectors evolve in their external repositories.
> > >>
> > >> On Thu, Oct 13, 2022 at 12:37 PM Thomas Weise  wrote:
> > >>
> > >>> +1 (binding) for the vote and thanks for the explanation
> > >>>
> > >>> On Thu, Oct 13, 2022 at 5:58 AM Chesnay Schepler  >
> > >>> wrote:
> > >>>
> > >>>> @Thomas:
> > >>>> Version-specific modules that either contain a connector or shims to
> > >>>> support that Flink version.
> > >>>> Alternatively, since the addition of such code (usually) goes
> beyond a
> > >>>> patch release you'd create a new minor version and could have that
> > only
> > >>>> support the later version.
> > >>>>
> > >>>> On 13/10/2022 02:05, Thomas Weise wrote:
> > >>>>> "Branches are not specific to a Flink version. (i.e., no
> v3.2-1.15)"
> > >>>>>
> > >>>>> Sorry for the late question. I could not find in the discussion
> > >> thread
> > >>>> how
> > >>>>> a connector can make use of features of the latest Flink version
> that
> > >>>> were
> > >>>>> not present in the previous Flink version, when branches cannot be
> > >>> Flink
> > >>>>> version specific?
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Thomas
> > >>>>>
> > >>>>> On Wed, Oct 12, 2022 at 4:09

Re: [VOTE] Drop Gelly

2022-10-19 Thread Jing Ge
+1, and I second Yu Li and Yun Gao. The upcoming 1.17 release notes, which
include the change of dropping Gelly, should also point out that Flink will
continue to support graph processing with a new plan, so that more
developers and users are aware of it.

Best regards,
Jing

On Wed, Oct 19, 2022 at 10:32 AM David Anderson 
wrote:

> +1
>
> On Wed, Oct 12, 2022 at 10:59 PM Martijn Visser 
> wrote:
>
> > Hi everyone,
> >
> > I would like to open a vote for dropping Gelly, which was discussed a
> long
> > time ago but never put to a vote [1].
> >
> > Voting will be open for at least 72 hours.
> >
> > Best regards,
> >
> > Martijn
> > https://twitter.com/MartijnVisser82
> > https://github.com/MartijnVisser
> >
> > [1] https://lists.apache.org/thread/2m6wtgjvxcogbf9d5q7mqt4ofqjf2ojc
> >
>


Re: [VOTE] FLIP-265 Deprecate and remove Scala API support

2022-10-19 Thread Jing Ge
+1

On Wed, Oct 19, 2022 at 10:31 AM David Anderson 
wrote:

> +1
>
>
>
> On Mon, Oct 17, 2022 at 3:39 PM Martijn Visser 
> wrote:
>
> > Hi everyone,
> >
> > I'm hereby opening a vote for FLIP-265 Deprecate and remove Scala API
> > support. The related discussion can be found here [1].
> >
> > Voting will be open for at least 72 hours.
> >
> > Best regards,
> >
> > Martijn
> > https://twitter.com/MartijnVisser82
> > https://github.com/MartijnVisser
> >
> > [1] https://lists.apache.org/thread/d3borhdzj496nnggohq42fyb6zkwob3h
> >
>


Re: [DISCUSS] REST API to suspend & resume checkpointing

2022-10-20 Thread Jing Ge
Hi Saurabh,

In general, it is always good to add new features, but I am not sure I
understood your requirement. I guess resuming the job from a freshly
created savepoint in the new stand-by Flink cluster would take too long
for you. But if that were acceptable, you would not have the checkpoint
issue you mentioned. Speaking of checkpoints, if the checkpoint interval
is set properly, it should be fine even if, in some rare cases, the last
checkpoint was only partially completed and is not selected. Another
option could be to trigger a manual checkpoint and then use that one to
resume the job, to keep the downtime low.

Best regards,
Jing

On Thu, Oct 20, 2022 at 7:53 AM Saurabh Kaul  wrote:

> Hey everyone,
>
> I will create a FLIP, but wanted to gauge community opinion first. The
> motivation is that production Flink applications frequently need to go
> through node/image patching to update the software and AMI with the latest
> security fixes. These patching-related restarts do not involve application
> jar or parallelism updates and can therefore be done without costly
> savepoint completion and restore cycles, by relying on the last checkpoint
> state, to achieve minimum downtime. To achieve this, we
> currently rely on retained checkpoints and the following steps:
>
>- Create new stand-by Flink cluster and submit application jar
>- Delete Flink TM deployment to stop processing & checkpoints on old
>cluster(reduce duplicates)
>- Query last completed checkpoint from REST API on JM of old cluster
>- Submit new job using last available checkpoint in new cluster, delete
>old cluster
>
> We have observed that this process will sometimes not select the latest
> checkpoint as partially completed checkpoints race and finish after
> querying the JM. Alternatives are to rely on creating other sources for
> checkpoint info, but this has complications, as discussed in [2]. Waiting
> increases downtime, and force-deleting task managers doesn't guarantee
> TM process termination. To maintain low downtime, avoid
> duplicates, and solve this race, we can introduce an API to suspend
> checkpointing. Querying the latest available checkpoint after
> suspending checkpointing will guarantee that we can maintain exactly-once
> in such a scenario.
>
> This also acts as an extension to [1] where the feature to trigger
> checkpoints through a control plane has been discussed and added. It makes
> the checkpointing process flexible and gives the user more control in
> scenarios like migrating applications and letting data processing catch up
> temporarily.
> We can implement this similarly to [1] and expose a trigger to suspend and
> resume checkpointing via the CLI and REST API. We can add a parameter to
> suspend in 2 ways.
>
>1. Suspend scheduled checkpoint trigger, doesn’t cancel any still in
>progress checkpoints/savepoints but stops only future ones
>2. Suspend checkpoint coordinator, cancels in progress
>checkpoints/savepoints. Guarantees no racing checkpoint completion and
>could be used for canceling stuck checkpoints and help data processing
>
> [1] https://issues.apache.org/jira/browse/FLINK-27101
> [2] https://issues.apache.org/jira/browse/FLINK-26916
>
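
To make the proposed trigger concrete, here is a hypothetical sketch of such
a call; the endpoint path and the cancel-in-progress parameter are invented
for illustration and do not exist in Flink's REST API today:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Hypothetical endpoint sketch; "checkpointing/suspend" and the
    // "cancel-in-progress" flag are illustrative, not an existing Flink route.
    public class SuspendCheckpointingSketch {
        public static void main(String[] args) throws Exception {
            String jobId = "a1b2c3d4e5f60718293a4b5c6d7e8f90"; // placeholder job ID
            HttpRequest suspend = HttpRequest.newBuilder()
                    .uri(URI.create("http://jobmanager:8081/jobs/" + jobId
                            + "/checkpointing/suspend?cancel-in-progress=false"))
                    .POST(HttpRequest.BodyPublishers.noBody())
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(suspend, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }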


Re: [Discuss]- Donate Iceberg Flink Connector

2022-10-20 Thread Jing Ge
I agree with Steven Wu that those points are applicable to every
externalized connector. So those are actually concerns about externalizing
connector development in general, and there have already been discussions
about it, with consensus reached to do it.

Speaking of the 3x3 concern, I think the concept [1] proposed by Chesnay and
voted on at [2] could help you.

[1]
https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development
[2] https://lists.apache.org/thread/7qr8jc053y8xpygcwbhlqq4r7c7fj1p3

Best regards,
Jing

On Thu, Oct 20, 2022 at 3:46 PM Steven Wu  wrote:

> Yuxia, those are valid points. But they are applicable to every connector
> (not just Iceberg).
>
> I also had a similar concern expressed in the discussion thread of
> "Externalized connector release details&workflow". My main concern is the
> multiplication factor of two upstream projects (Flink & storage/Iceberg).
> If we limit both to two versions, it will be 2x2, which might still be OK,
> but if we need to do 3x3, that will probably be too many to manage.
>
> On Thu, Oct 20, 2022 at 5:27 AM yuxia  wrote:
>
> > Hi, abmo, Abid!
> > Thank you guys for driving it.
> >
> > As Iceberg is more and more popular and is an important
> > upstream/downstream system to Flink, I believe the Flink community has
> > paid much attention to Iceberg and hopes to be closer to the Iceberg
> > community. No matter whether it's moved under the Flink umbrella or not,
> > I believe Flink experts are glad to give feedback to Iceberg and take
> > part in the development of the Iceberg Flink connector.
> >
> >
> > Personally, as a Flink contributor and main maintainer of the Hive Flink
> > connector, I'm really glad to take part in the Iceberg community for the
> > maintenance and future development of the Iceberg Flink connector. I
> > think I can provide some views from the Flink side and bring some
> > feedback from the Iceberg community to the Flink community.
> >
> > But I have some concerns about moving the connector from the Iceberg
> > repository to a separate connector repository under the Flink umbrella:
> >
> > 1: If Iceberg develops new features, the Iceberg Flink connector has to
> > wait for Iceberg to be released before starting the development and
> > release that makes use of the new features. Users may need to wait much
> > longer before enjoying the new features of Iceberg through Flink.
> >
> > 2: If we move it to a separate repository, I'm afraid it'll lose
> > attention from both the Flink and Iceberg sides, which is definitely a
> > harm to both communities. What's more, whenever Flink or Iceberg
> > releases a version, we need to update the version in the separate
> > repository, which I think is tedious and may easily be forgotten.
> >
> > Sorry for raising a different voice in this discussion, but I think it
> > deserves further discussion on the dev mailing list; at least it will
> > help draw Flink developers' attention to Iceberg.
> >
> > Best regards,
> > Yuxia
> >
> > --- Original Message ---
> > From: "abmo work" 
> > To: "dev" 
> > Sent: Thursday, Oct 20, 2022, 6:33:40 AM
> > Subject: Re: [Discuss]- Donate Iceberg Flink Connector
> >
> > Hi Martijn,
> >
> > I created a FLIP for this, it's FLIP 267: Iceberg Connector  <
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP+267:+Iceberg+Connector
> > >
> > Please let me know if anything else is needed. My email on confluence is
> > abmo.w...@icloud.com.
> >
> > As 1.0 was released today, from the Iceberg perspective we need to
> > figure out what versions of Flink we will support and the release
> > timeline as to when the connector will be built and released off of the
> > new repo vs. Iceberg.
> >
> > Thanks
> > Abid
> >
> > > On Oct 19, 2022, at 12:43 PM, Martijn Visser  >
> > wrote:
> > >
> > > Hi Abid,
> > >
> > > We should have a FLIP as this would be a code contribution. If you
> > provide
> > > your Confluence user name, we can grant you access to create one.
> > >
> > > Is there also something from an Iceberg point of view needed to agree
> > with
> > > the code contribution?
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > On Wed, Oct 19, 2022 at 19:11, wrote:
> > >
> > >> Thanks Martijn!
> > >>
> > >> Thanks for all the support and positive responses. I will start a vote
> > >> thread and send it out to the dev list.
> > >>
> > >> Also, we need help with creation of a new repo for the Iceberg
> > Connector.
> > >>
> > >> Can someone help with the creation of a repo? Please let me know if I
> > need
> > >> to create an issue or flip for that.
> > >> Following similar naming for other connectors, I propose
> > >> https://github.com/apache/flink-connector-iceberg (doesn’t exist)
> > >>
> > >> Thanks
> > >> Abid
> > >>
> > >> On 2022/10/19 08:41:02 Martijn Visser wrote:
> > >>> Hi all,
> > >>>
> > >>> Thanks for the info and also thanks Peter and Steven for offering to
> > >>> volunteer. I think that's a great idea and a necessity.
> > >>>
> > >>> Overall +1 given the current ideas to make this contribution happ

Re: [DISCUSS] Planning Flink 1.17

2022-10-21 Thread Jing Ge
+1
The plan looks reasonable. Thanks for your effort!

Best regards,
Jing

On Fri, Oct 21, 2022 at 1:16 PM Congxian Qiu  wrote:

> Thanks for kicking off the 1.17 release, and volunteers as release
> managers.
>
> +1 to the feature freeze date
>
> Best,
> Congxian
>
>
> Yun Tang wrote on Friday, Oct 21, 2022 at 17:24:
>
> > Thanks, Qingsheng and Leonard, for kicking off the release plan for
> > Flink 1.17, which targets a feature freeze date of January 17th 🙂.
> >
> > +1 to include Martijn and Matthias as the release managers.
> >
> > Best,
> > Yun Tang
> > 
> > From: Piotr Nowojski 
> > Sent: Friday, October 21, 2022 16:22
> > To: dev@flink.apache.org 
> > Subject: Re: [DISCUSS] Planning Flink 1.17
> >
> > Hi,
> >
> > Thanks for kicking this discussion off Qingsheng and Leonard. I like the
> > January 17th, 2023 touch :)
> >
> > +1 for including Matthias Pohl and Martijn Visser as release managers.
> >
> > Best,
> > Piotrek
> > Always alright
> >
> > On Fri, Oct 21, 2022 at 05:55, Yuan Mei wrote:
> >
> > > Thanks, Qingsheng, for the kick-off efforts.
> > >
> > > 1. January 17th, 2023 as the feature freeze date sounds reasonable to me.
> > > 2. We will input our plan to the wiki link.
> > >
> > > Thanks
> > >
> > > Best
> > > Yuan
> > > Ververica (Alibaba)
> > >
> > >
> > > On Fri, Oct 21, 2022 at 10:38 AM Xingbo Huang  wrote:
> > >
> > > > Thanks Qingsheng, Leonard and Martijn for starting the discussion and
> > > > volunteering.
> > > > The timeline proposal sounds reasonable :+1:
> > > >
> > > > Best,
> > > > Xingbo
> > > >
> > > > > Jark Wu wrote on Friday, Oct 21, 2022 at 00:17:
> > > >
> > > > > Thanks for kicking off the 1.17 release.
> > > > >
> > > > > Targeting feature freeze on 1/17 for 1.17 release sounds pretty
> good
> > to
> > > > me.
> > > > > +1 for the volunteers as release managers.
> > > > >
> > > > > Best,
> > > > > Jark
> > > > > Ververica (Alibaba)
> > > > >
> > > > > On Thu, 20 Oct 2022 at 18:09, Matthias Pohl <
> matthias.p...@aiven.io
> > > > > .invalid>
> > > > > wrote:
> > > > >
> > > > > > Thanks for starting the discussion about Flink 1.17. I would be
> > > > > interested
> > > > > > in helping out around the release as well.
> > > > > >
> > > > > > Best,
> > > > > > Matthias
> > > > > >
> > > > > > On Thu, Oct 20, 2022 at 12:07 PM Xintong Song <
> > tonysong...@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for kicking this off.
> > > > > > >
> > > > > > > +1 for the proposed timeline.
> > > > > > >
> > > > > > > Also +1 for Qingsheng, Leonard and Martijn as the release
> > managers.
> > > > > > Thanks
> > > > > > > for volunteering.
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Xintong
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Oct 20, 2022 at 3:59 PM Martijn Visser <
> > > > > martijnvis...@apache.org
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Qingsheng,
> > > > > > > >
> > > > > > > > I'm definitely interested in participating as a release
> manager
> > > > > again.
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > >
> > > > > > > > Martijn
> > > > > > > >
> > > > > > > > On Thu, Oct 20, 2022 at 9:47 AM Qingsheng Ren <
> > re...@apache.org>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > As we are approaching the official release of Flink 1.16,
> > it’s
> > > a
> > > > > good
> > > > > > > > time
> > > > > > > > > to kick off some discussions and march toward 1.17.
> > > > > > > > >
> > > > > > > > > - Release managers
> > > > > > > > >
> > > > > > > > > Leonard Xu and I would like to volunteer as release
> managers
> > > for
> > > > > > 1.17,
> > > > > > > > and
> > > > > > > > > it would be great to have someone else working together on
> > this
> > > > > > > release.
> > > > > > > > > Please let us know if you have any interest!
> > > > > > > > >
> > > > > > > > > - Timeline
> > > > > > > > >
> > > > > > > > > With 1.16 being released in the next few days and the 4
> > > > months
> > > > > > > > release
> > > > > > > > > cycle after that, we propose to set the feature freeze
> date
> > > on
> > > > > > > *January
> > > > > > > > > 17th, 2023* (aligned with our version number 1.17 :-)), so
> > that
> > > > > > > everyone
> > > > > > > > > could enjoy the holiday season and Chinese new year.
> > > > > > > > >
> > > > > > > > > - What we’ll be focusing
> > > > > > > > >
> > > > > > > > > Similar to our previous releases, we’d like to keep an eye
> on
> > > the
> > > > > > > > > timeline, CI stability, release testing, and any
> > communication
> > > > and
> > > > > > > > > coordination across teams and developers. One thing we’d
> like
> > > to
> > > > > > > mention
> > > > > > > > in
> > > > > > > > > particular is compatibility, which is a frequent complaint
> > from
> > > > our
> > > > > > > > > ecosystem developers and users. We encourage all committers
> > to
> > > do
> > > > > an
> > > > > > > > extra
> > > > > > > > > manual

Re: [DISCUSS] Create a dedicated aws-base connector repository

2022-10-21 Thread Jing Ge
I agree with Jark. It would make further development and maintenance
easier if all AWS-related connectors and the base module were in the same
repo. It might make sense to turn the flink-connector-dynamodb repository
into flink-connector-aws and move the other modules, including
flink-connector-aws-base, into it. The AWS SDK could be managed in
flink-connector-aws-base. Any future common connector features could also
be developed in the base module.

Best regards,
Jing

On Fri, Oct 21, 2022 at 1:26 PM Jark Wu  wrote:

> How about creating a new repository flink-connector-aws and merging
> dynamodb, kinesis firehouse into it?
> This can reduce the maintenance for complex dependencies and make the
> release easy.
> I think the maintainers of aws-releated connectors are the same people.
>
> Best,
> Jark
>
> > 2022年10月21日 17:41,Chesnay Schepler  写道:
> >
> > I would not go with 2); I think it'd just be messy.
> >
> > Here's another option:
> >
> > Create another repository (aws-connector-base) (following the
> externalization model), add it as a sub-module to the downstream
> repositories, and make it part of the release process of said connector.
> >
> > I.e., we never create a release for aws-connector-base, but release it
> as part of the connector.
> > The main benefit here is that we'd always be able to make changes to
> the aws-base code without delaying connector releases.
> > I would assume that any added overhead due to _technically_ releasing
> the aws code multiple times is negligible.
> >
> >
> > On 20/10/2022 22:38, Danny Cranmer wrote:
> >> Hello all,
> >>
> >> Currently we have 2 AWS Flink connectors in the main Flink codebase
> >> (Kinesis Data Streams and Kinesis Data Firehose) and one new
> externalized
> >> connector in progress (DynamoDB). Currently all three of these use
> common
> >> AWS utilities from the flink-connector-aws-base module. Common code
> >> includes client builders, property keys, validation, utils etc.
> >>
> >> Once we externalize the connectors, leaving flink-connector-aws-base in
> the
> >> main Flink repository will restrict our ability to evolve the connectors
> >> quickly. For example, as part of the DynamoDB connector build we are
> >> considering adding a general retry strategy config that can be
> leveraged by
> >> all connectors. We would need to block on Flink 1.17 for this.
> >>
> >> In the past we have tried to keep the AWS SDK version consistent across
> >> connectors, with the externalization this is more likely to diverge.
> >>
> >> Option 1: I propose we create a new repository, flink-connector-aws,
> which
> >> we can move the flink-connector-aws-base module to and create a new
> >> flink-connector-aws-parent to manage SDK versions. Each of the
> externalized
> >> AWS connectors will depend on this new module and parent. Downside is an
> >> additional module to release per Flink version, however I will
> volunteer to
> >> manage this.
> >>
> >> Option 2: We can move the flink-connector-aws-base module and create
> >> flink-connector-parent within the flink-connector-shared-utils repo [2]
> >>
> >> Option 3: We do nothing.
> >>
> >> For option 1+2 we will follow the general externalized connector
> versioning
> >> strategy and rules.
> >>
> >> I am inclined towards option 1, and appreciate feedback from the
> community.
> >>
> >> [1]
> >>
> https://github.com/apache/flink/tree/master/flink-connectors/flink-connector-aws-base
> >> [2] https://github.com/apache/flink-connector-shared-utils
> >>
> >> Thanks,
> >> Danny
> >>
> >
>
>


Re: [DISCUSS] REST API to suspend & resume checkpointing

2022-10-24 Thread Jing Ge
napshot
> > > and
> > > > using CLAIM mode in re-running is not necessary here and adds
> latency.
> > > > Stop-with-checkpoint doesn't have these issues. It adds some downtime
> > in
> > > > waiting for a checkpoint to be completed but reduces replay time in
> the
> > > new
> > > > cluster which is a good trade-off. Since in this scenario of job
> > > migration
> > > > the job and/or job configuration is not changing; it should ideally
> be
> > as
> > > > fast as a regular failover scenario (like a TM going down).
> > > > 2. Taking complete ownership of triggering checkpoints and making
> them
> > > more
> > > > configurable could be feasible but are less effective comparatively
> in
> > > > terms of stopping the job for the primary purpose of low-downtime
> > > migration
> > > > of the job. Stop-with-checkpoint solves it more directly.
> > > >
> > > > Looking forward to hearing thoughts on this.
> > > >
> > > > On Thu, Oct 20, 2022 at 3:31 AM Piotr Nowojski  >
> > > > wrote:
> > > >
> > > > > Hi Saurabh,
> > > > >
> > > > > Thanks for reaching out with the proposal. I have some mixed
> feelings
> > > > about
> > > > > this for a couple of reasons:
> > > > >
> > > > > 1. It sounds like the core problem that you are describing is the
> > race
> > > > > condition between shutting down the cluster and completion of new
> > > > > checkpoints. My first thought would be as Jing's, why don't you use
> > > > > stop-with-savepoint? Especially the native savepoint? You can
> recover
> > > > from
> > > > > it using --claim mode, so the whole process should be quite fast
> > > > actually.
> > > > > 2. The same issue, not knowing the latest completed checkpoint id,
> > > > plagued
> > > > > us with some internal tests for quite a bit, so maybe this would
> also
> > > be
> > > > > worth considering to address instead? Like leaving in some text
> file
> > > the
> > > > > last completed checkpoint id? Or providing a way to read this from
> > some
> > > > > existing metadata files? However in our tests we actually
> > fixed/worked
> > > > > around that with manually triggering of checkpoints. The
> predecessor
> > of
> > > > > FLINK-27101 [1], FLINK-24280 [2], was implemented to address this
> > exact
> > > > > issue.  Which brings me to...
> > > > > 3. You could actually just use the REST API to trigger all
> > checkpoints
> > > > > manually. The idea behind FLINK-27101 [1] was to add full
> flexibility
> > > to
> > > > > the users, without adding much complexity to the system. If we
> start
> > > > adding
> > > > > more REST calls to control checkpointing behaviour it would
> > complicate
> > > > the
> > > > > system.
> > > > > 4. If at all, I would think more towards a more generic idea of
> > > > dynamically
> > > > > reconfiguring the system. We could provide a generic way to
> > dynamically
> > > > > change configuration options. We wouldn't be able to support all
> > > > > configurations, and furthermore, each "dynamic" option would have
> to
> > be
> > > > > handled/passed down to and through the system differently, BUT we
> > > > wouldn't
> > > > > have to do all of that at once. We could start with a very limited
> > set
> > > of
> > > > > dynamic options, for example just with the checkpointing interval.
> > This
> > > > > must have been considered/discussed before, so I might be missing
> > lots
> > > of
> > > > > things.
> > > > > 5. Another direction, if 1. is not an option for some reason, is to
> > > > provide
> > > > > a stop-with-checkpoint feature?
> > > > >
> > > > > Best Piotrek
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-27101
> > > > > [2] https://issues.apache.org/jira/browse/FLINK-24280
> > > > >
> > > > > On Thu, Oct 20, 2022 at 11:53, Jing Ge wrote:
> > > > >
> > > > > > Hi Saurabh,
> >

Re: [Cassandra] source connector

2022-10-24 Thread Jing Ge
Hi,

The plan was, and the ideal process is, to try the externalized connector
development and release with the Elasticsearch connector first and make it
stable before starting the migration of other connectors. There are already
many connectors, like the Iceberg and AWS connectors, trying
externalization. It is risky and might get out of control to externalize so
many connectors in parallel without a proven stable process. Please be
aware of the maintenance effort and consider a feasible time to start the
externalization.

Best regards,
Jing

On Fri, Oct 21, 2022 at 4:14 PM Ryan Skraba 
wrote:

> There's definitely consensus on externalizing the flink connectors!  I've
> been tracking the progress and I'd be happy to provide support on Cassandra
> if you'd like.
>
> There's some new information at
>
> https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development
>
> The first step to externalizing Cassandra would be for a PMC member to
> create the flink-connector-cassandra repository for the project.
>
> If I understand correctly, this shouldn't block your PR, since the core
> connector (inside apache/flink) and the external one
> (apache/flink-connector-cassandra) should be kept in sync for one major
> release cycle.  For my own benefit, I can start by reviewing your PR!
>
> All my best, Ryan
>
> On Fri, Oct 21, 2022 at 11:51 AM Etienne Chauchot 
> wrote:
>
> > Hi,
> >
> > Yes sure, if there is a consensus on moving, just tell me where to move
> > my PR to.
> >
> > Best
> >
> > Etienne
> >
> > Le 19/10/2022 à 18:20, Alexander Fedulov a écrit :
> > > Hi Etienne,
> > >
> > > thanks for your contribution. In light of the current efforts to
> > > externalize connectors, do you think we could maybe combine the new
> > > implementation with moving it into an external repository instead of
> > > merging into Flink main?
> > >
> > > Best,
> > > Alexander Fedulov
> > >
> > > On Fri, Oct 14, 2022 at 4:18 PM Etienne Chauchot  >
> > > wrote:
> > >
> > >> Hi all,
> > >>
> > >> As promised, I have developed the Cassandra source connector based on
> > >> FLIP-27. I've just submitted the
> > >> PR:https://github.com/apache/flink/pull/21073
> > >>
> > >>
> > >> Best
> > >>
> > >> Etienne
> > >>
> >
>


Re: [DISCUSS] Release Flink 1.15.3

2022-10-25 Thread Jing Ge
+1. The timing is good for a 1.15.3 release. Thanks Fabian for bringing
this to our attention.

I just checked the PRs and didn't find the 1.15 backport of FLINK-29567.
Please be aware of it. Thanks!

Best regards,
Jing

On Tue, Oct 25, 2022 at 11:44 AM Xintong Song  wrote:

> Thanks for bringing this up, Fabian.
>
> +1 for creating a 1.15.3 release. I've also seen users requiring this
> version [1].
>
> I can help with actions that require a PMC role, if needed.
>
> Best,
>
> Xintong
>
>
> [1] https://lists.apache.org/thread/501q4l1c6gs8hwh433bw3v1y8fs9cw2n
>
>
>
> On Tue, Oct 25, 2022 at 5:11 PM Fabian Paul  wrote:
>
> > Hi all,
> >
> > I want to start the discussion of creating a new 1.15 patch release
> > (1.15.3). The last 1.15 release is almost two months old, and since then,
> > ~60 tickets have been closed, targeting 1.15.3. It includes critical
> > changes to the sink architecture, including:
> >
> > - Reverting the sink metric naming [1]
> > - Recovery problems for sinks using the GlobalCommitter [2][3][4]
> >
> > If the community agrees to create a new patch release, I could volunteer
> as
> > the release manager.
> >
> > Best,
> > Fabian
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-29567
> > [2] https://issues.apache.org/jira/browse/FLINK-29509
> > [3] https://issues.apache.org/jira/browse/FLINK-29512
> > [4] https://issues.apache.org/jira/browse/FLINK-29627
> >
>


[DISCUSS] Remove FlinkKafkaConsumer and FlinkKafkaProducer in the master for 1.17 release

2022-10-27 Thread Jing Ge
Hi Dev,

I'd like to start a discussion about removing FlinkKafkaConsumer and
FlinkKafkaProducer in 1.17.

Back in the past, it was originally announced that they would be removed
with Flink 1.15, after Flink 1.14 had been released [1]. The removal was
then postponed to the release after 1.15, i.e. Flink 1.16, but the doc was
never updated [2]. I have created PRs to fix that. Since the 1.16 release
branch is under code freeze, it makes sense to, first of all, update the
doc to say that FlinkKafkaConsumer will be removed with Flink 1.17 [3][4],
and second, start the discussion about removing them on the current master
branch, i.e. for the coming 1.17 release. I'm all ears and looking forward
to your feedback. Thanks!

Best regards,
Jing

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/#kafka-sourcefunction
[2]
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kafka/#kafka-sourcefunction
[3] https://github.com/apache/flink/pull/21172
[4] https://github.com/apache/flink/pull/21171


Re: [ANNOUNCE] Apache Flink 1.16.0 released

2022-10-28 Thread Jing Ge
Congrats!

On Fri, Oct 28, 2022 at 1:22 PM Qingsheng Ren wrote:

> Congratulations and a big thanks to Chesnay, Martijn, Godfrey and Xingbo
> for the awesome work for 1.16!
>
> Best regards,
> Qingsheng Ren
>
> > On Oct 28, 2022, at 14:46, Xingbo Huang  wrote:
> >
> > The Apache Flink community is very happy to announce the release of
> Apache
> > Flink 1.16.0, which is the first release for the Apache Flink 1.16
> series.
> >
> > Apache Flink® is an open-source stream processing framework for
> > distributed, high-performing, always-available, and accurate data
> streaming
> > applications.
> >
> > The release is available for download at:
> > https://flink.apache.org/downloads.html
> >
> > Please check out the release blog post for an overview of the
> > improvements for this release:
> > https://flink.apache.org/news/2022/10/28/1.16-announcement.html
> >
> > The full release notes are available in Jira:
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351275
> >
> > We would like to thank all contributors of the Apache Flink community
> > who made this release possible!
> >
> > Regards,
> > Chesnay, Martijn, Godfrey & Xingbo
>


Re: [VOTE] Dedicated AWS externalized connector repo

2022-10-28 Thread Jing Ge
+1 (non-binding)

Thanks!

Best Regards,
Jing

On Fri, Oct 28, 2022 at 5:29 PM Samrat Deb  wrote:

> +1 (non binding)
>
> Thanks for driving Danny
>
> Bests
> Samrat
>
> On Fri, 28 Oct 2022 at 8:36 PM, Ahmed Hamdy  wrote:
>
> > +1 (non-binding)
> > Regards,
> > Ahmed
> >
> > On Thu, 27 Oct 2022 at 08:38, Teoh, Hong 
> > wrote:
> >
> > > +1 (non-binding)
> > >
> > > Thanks for driving this, Danny!
> > >
> > > Hong
> > >
> > > On 26/10/2022, 08:14, "Martijn Visser" 
> > wrote:
> > >
> > > +1 binding
> > >
> > > Thanks Danny!
> > >
> > > On Wed, Oct 26, 2022 at 8:48 AM Danny Cranmer <
> > dannycran...@apache.org
> > > >
> > > wrote:
> > >
> > > > Hello all,
> > > >
> > > > As discussed in the discussion thread [1], I propose to create a
> > > dedicated
> > > > repository for AWS connectors called flink-connector-aws. This
> will
> > > house
> > > > 3x connectors: Amazon Kinesis Data Streams, Amazon Kinesis Data
> > > Firehose
> > > > and Amazon DynamoDB and any future AWS connectors. We will also
> > > externalize
> > > > the AWS base module from the main Flink repository [2] and
> create a
> > > parent
> > > > pom for version management.
> > > >
> > > > All modules within this repository will share the same version,
> and
> > > be
> > > > released/evolved together. We will adhere to the common Flink
> rules
> > > [3] for
> > > > connector development.
> > > >
> > > > Motivation: grouping AWS connectors together will reduce the
> number
> > > of
> > > > connector releases, simplify development, dependency management
> and
> > > > versioning for users.
> > > >
> > > > Voting schema:
> > > > Consensus, committers have binding votes, open for at least 72
> > hours.
> > > >
> > > > [1]
> > https://lists.apache.org/thread/swp4bs8407gtsgn2gh0k3wx1m4o3kqqp
> > > > [2]
> > > >
> > > >
> > >
> >
> https://github.com/apache/flink/tree/master/flink-connectors/flink-connector-aws-base
> > > > [3]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development
> > > >
> > >
> > >
> >
>


Re: [DISCUSS] Remove FlinkKafkaConsumer and FlinkKafkaProducer in the master for 1.17 release

2022-10-31 Thread Jing Ge
Thanks Martijn. What you said makes a lot of sense. I figure we should do
it in 2 steps.

Step 1 (with 1.17):
- Remove FlinkKafkaConsumer.
- Graduate the Kafka Source from @PublicEvolving to @Public.
- Update the doc and leave hints for customers as a reference.

According to [1], the Kafka Sink should also be graduated with 1.17, i.e.
after the two release cycles 1.15 and 1.16. But since the design change
from SinkV1 to SinkV2 was significant and there have been many change
requests since then, we'd better give the sink one more release cycle to
become more stable. The other reason for giving the Sink more time is that
the experimental phase was covered by only one release cycle instead of the
two that [1] suggests.

Step 2 (with 1.18):
- Remove FlinkKafkaProducer.
- Graduate the Kafka Sink from @PublicEvolving to @Public (see the
  annotation sketch below).
- Update the doc and leave hints for customers as a reference.
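
Assuming the graduation in both steps amounts to an annotation swap, a
hedged sketch of what step 1 would look like on the source class (the class
body and interface list are elided; no method signatures change):

    import org.apache.flink.annotation.Public;

    // Sketch of the proposed graduation: only the stability annotation changes.
    @Public // was: @PublicEvolving
    public class KafkaSource<OUT> /* implements Source<OUT, ...> */ {
        // unchanged implementation
    }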

Best regards,
Jing

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process

On Thu, Oct 27, 2022 at 3:01 PM Martijn Visser 
wrote:

> Hi Jing,
>
> Thanks for opening the discussion. I see no issue with removing the
> FlinkKafkaConsumer, since it has been marked as deprecated and the Source
> API (which is used by the KafkaSource) is marked as @Public (at least the
> Base implementation)
>
> The successor of the FlinkKafkaProducer is the KafkaSink, which is using
> the Sink V2 API which is still marked as @PublicEvolving (Base
> implementation). I think that we should only remove the FlinkKafkaProducer
> if we also mark the Sink V2 as @Public. I don't think that should be a
> problem (since it's based on the first Sink implementation, which was
> Experimental in 1.14 and got replaced with Sink V2 as PublicEvolving in
> 1.15).
>
> Thanks,
>
> Martijn
>
> On Thu, Oct 27, 2022 at 2:06 PM Jing Ge  wrote:
>
> > Hi Dev,
> >
> > I'd like to start a discussion about removing FlinkKafkaConsumer and
> > FlinkKafkaProducer in 1.17.
> >
> > Back in the past, it was originally announced to remove it with Flink
> 1.15
> > after Flink 1.14 had been released[1]. And then postponed to the next
> 1.15
> > release which meant to remove it with Flink 1.16 but forgot to change the
> > doc[2]. I have created a PRs to fix it. Since the 1.16 release branch has
> > code freeze, it makes sense to, first of all, update the doc to say that
> > FlinkKafkaConsumer will be removed with Flink 1.17 [3][4] and second
> start
> > the discussion about removing them with the current master branch i.e.
> for
> > the coming 1.17 release. I'm all ears and looking forward to your
> feedback.
> > Thanks!
> >
> > Best regards,
> > Jing
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > [1]
> >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/#kafka-sourcefunction
> > [2]
> >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kafka/#kafka-sourcefunction
> > [3] https://github.com/apache/flink/pull/21172
> > [4] https://github.com/apache/flink/pull/21171
> >
>


Re: [ANNOUNCE] New Apache Flink PMC Member - Danny Cranmer

2022-11-01 Thread Jing Ge
Congrats, Danny!

Best regards,
Jing

On Tue, Nov 1, 2022 at 11:42 AM Hamdy, Ahmed 
wrote:

> Congratulations, Danny!
> Best,
> Ahmed
>
> On 01/11/2022, 09:51, "Yuxin Tan"  wrote:
>
> Congrats, Danny!
>
> Best,
> Yuxin
>
>
> Guowei Ma wrote on Tuesday, Nov 1, 2022 at 16:36:
>
> > Congratulations Danny!
> > Best,
> > Guowei
> >
> >
> > On Tue, Nov 1, 2022 at 2:20 PM weijie guo  >
> > wrote:
> >
> > > Congratulations Danny!
> > >
> > > Best regards,
> > >
> > > Weijie
> > >
> > >
> > > Maximilian Michels wrote on Thursday, Oct 13, 2022 at 21:41:
> > >
> > > > Congratulations Danny! Well deserved :)
> > > >
> > > > -Max
> > > >
> > > > On Thu, Oct 13, 2022 at 2:40 PM Yang Wang  >
> > wrote:
> > > >
> > > > > Congratulations Danny!
> > > > >
> > > > > Best,
> > > > > Yang
> > > > >
> > > > > > Hang Ruan wrote on Thursday, Oct 13, 2022 at 10:58:
> > > > >
> > > > > > Congratulations Danny!
> > > > > >
> > > > > > Best,
> > > > > > Hang
> > > > > >
> > > > > > > Yun Gao wrote on Thursday, Oct 13, 2022
> at 10:56:
> > > > > >
> > > > > > > Congratulations Danny!
> > > > > > > Best,
> > > > > > > Yun Gao
> > > > > > >
> > --
> > > > > > > From: yuxia 
> > > > > > > Send Time: 2022 Oct. 12 (Wed.) 09:49
> > > > > > > To: dev 
> > > > > > > Subject: Re: [ANNOUNCE] New Apache Flink PMC Member - Danny
> > Cranmer
> > > > > > > Congratulations Danny!
> > > > > > > Best regards,
> > > > > > > Yuxia
> > > > > > > --- Original Message ---
> > > > > > > From: "Xingbo Huang" 
> > > > > > > To: "dev" 
> > > > > > > Sent: Wednesday, Oct 12, 2022, 9:44:22 AM
> > > > > > > Subject: Re: [ANNOUNCE] New Apache Flink PMC Member - Danny
> Cranmer
> > > > > > > Congratulations Danny!
> > > > > > > Best,
> > > > > > > Xingbo
> > > > > > > Sergey Nuyanzin wrote on Wednesday, Oct 12, 2022
> at 01:26:
> > > > > > > > Congratulations, Danny
> > > > > > > >
> > > > > > > > On Tue, Oct 11, 2022, 15:18 Lincoln Lee <
> > lincoln.8...@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Congratulations Danny!
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Lincoln Lee
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Congxian Qiu wrote on Tuesday, Oct 11, 2022
> > at 19:42:
> > > > > > > > >
> > > > > > > > > > Congratulations Danny!
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Congxian
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Leonard Xu wrote on Tuesday, Oct 11, 2022
> at 18:03:
> > > > > > > > > >
> > > > > > > > > > > Congratulations Danny!
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Leonard
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>


Re: [DISCUSS] Remove FlinkKafkaConsumer and FlinkKafkaProducer in the master for 1.17 release

2022-11-01 Thread Jing Ge
Hi all,

I am totally fine with doing step 2 in 1.17, although doing it with 1.18
would give us more room for some breaking changes.

@Danny

Thanks for the hint. Commonly, all related interfaces/classes should be
graduated too. Classes like SingleThreadMultiplexSourceReaderBase and
RecordEmitter used within SourceReaderBase are a little bit different,
because SingleThreadMultiplexSourceReaderBase is extended by
KafkaSourceReader, which is marked as @Internal. It is not necessary to
graduate them just because KafkaSource needs to be graduated. However,
given that most of those classes have been marked as @PublicEvolving for
more than a year, it is a good opportunity to graduate those low-level APIs
along with KafkaSource. This will make the change a little bit bigger
than the original expectation of removing FlinkKafkaConsumer and
FlinkKafkaProducer and only graduating the related KafkaSource and
KafkaSink. I am unsure whether it makes sense to have a FLIP for it,
because the API itself will not be modified beyond upgrading the annotation
for graduation.

Best regards,
Jing


On Tue, Nov 1, 2022 at 9:52 AM Piotr Nowojski  wrote:

> Thanks Martijn and David,
>
> And what about the FlinkKafkaProducer? In order to migrate it, the user has
> to make sure that the transactions stored on the state are committed, and
> that there are no lingering/pending transactions (lingering transactions
> are blocking further reads to be readable until they timeout). The first
> one can be achieved by stop-with-savepoint. Regular savepoint or checkpoint
> is not good enough, because transactions are committed on
> notifyCheckpointCompleted RPC, not on the checkpoint itself
> (stop-with-savepoint is considered completed only once notifications are
> received).  But what about the latter? What about the pending/lingering
> transactions? Are they also closed in stop-with-savepoint?
>
> Best,
> Piotrek
>
> On Tue, Nov 1, 2022 at 08:51, Martijn Visser
> wrote:
>
> > Hi all,
> >
> > @Piotr
> > When FlinkKafkaConsumer was marked as deprecated, the migration steps
> were
> > included in the release notes:
> >
> >
> https://nightlies.apache.org/flink/flink-docs-master/release-notes/flink-1.14/#flink-24055httpsissuesapacheorgjirabrowseflink-24055
> > Scrolling up a bit shows how move from FlinkKafkaProducer to KafkaSink
> >
> > Given the importance of the Kafka connector, I think it's smart to write
> a
> > dedicated blog on how to migrate.
> > That could also include the differences between the two implementations,
> as
> > David just outlined.
> >
> > @Jing
> > SinkV2 was released with Flink 1.15, so with 1.16 released there will be
> 2
> > release cycles with that API at PublicEvolving. I don't think there were
> > breaking changes when the bug fixes were applied, because those bug fixes
> > were also applied in a patch version (1.15.3), which couldn't have been done
> > if they had broken the API. Unless there is work planned for 1.17
> > on the Sink, I think it can be moved to Public too.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Tue, Nov 1, 2022 at 7:21 AM David Anderson 
> > wrote:
> >
> > > >
> > > > [H]ow one can migrate from the
> > > > FlinkKafkaConsumer/FlinkKafkaProducer to KafkaSource/KafkaSink, while
> > > > preserving exactly-once guarantees etc?
> > >
> > >
> > > The responses from Fabian Paul in [1] and [2] address the question of
> how
> > > to handle the migration in terms of managing the state (where the short
> > > answer is "arrange for Kafka to be the source of truth").
> > >
> > > Those threads don't get into the differences in behavior between the
> two
> > > implementations. Here I'm thinking about (1) the loss of
> > > DeserializationSchema#isEndOfStream, and the fact that you can no
> longer
> > > dynamically determine when the input stream has finished, and (2) the
> > > change to how empty partitions are handled on startup (they used to be
> > > marked idle automatically, whereas now you must use withIdleness in the
> > > WatermarkStrategy).
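> > >
> > > For example (a sketch; the event type and the durations are made up), with
> > > the new source, empty partitions only become idle with something like:
> > >
> > > WatermarkStrategy
> > >         .<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(5))
> > >         .withIdleness(Duration.ofMinutes(1));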
> > >
> > > [1] https://www.mail-archive.com/user@flink.apache.org/msg44618.html
> > > [2] https://www.mail-archive.com/user@flink.apache.org/msg45864.html
> > >
> > > On Mon, Oct 31, 2022 at 7:32 PM Piotr Nowojski 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Maybe a stupid question, but how one can migrate from the
> > > > FlinkKafkaConsumer/FlinkKafkaProducer to KafkaSource/KafkaSink, whi

Re: [DISCUSS] Flink release retro

2022-11-02 Thread Jing Ge
Hi all,

I figure it is a good idea and +1 for the async retro. More developers will
learn what the release process looks like, which will give them
context to engage in future releases. It would be great if the conversation
could somehow follow the traditional retro pattern, e.g. tagged with
"Liked, Learned, Lacked, and Longed for".

Best regards,
Jing

On Wed, Nov 2, 2022 at 11:21 AM Martijn Visser 
wrote:

> Hi Matthias,
>
> I think it's a good idea to capture how this release cycle has progressed.
> I'm not sure that a classical "retrospective" is the best solution, since
> it would require multiple people in different timezones to attend a virtual
> meeting.
>
> So I would +1 an async retrospective, which could use the questions that you
> would normally ask during a retrospective, but now via a questionnaire.
> It probably makes sense to have a proposal of the questions that can be
> asked, discuss them and then send them out.
>
> WDYT?
>
> Thanks,
>
> Martijn
>
> On Wed, Nov 2, 2022 at 9:42 AM Qingsheng Ren  wrote:
>
> > Thanks for starting the discussion Matthias!
> >
> > I think having a retro after a release cycle would be quite helpful to
> > standardizing the procedure of the release, and also could avoid new
> > release managers getting stuck on the same issue that happened before. I
> > prefer the second option that RMs could open a discussion thread in ML at
> > the end of the release to collect feedback about the last release cycle
> and
> > add them to the release wiki page, which would be quite handy for further
> > RMs.
> >
> > Best,
> > Qingsheng
> > Ververica (Alibaba)
> >
> > On Mon, Oct 31, 2022 at 11:02 PM Matthias Pohl
> >  wrote:
> >
> > > Hi everyone,
> > > I want to bring up the idea of having a retrospective on the release
> from
> > > the release manager's perspective. The idea would be to collect
> feedback
> > on
> > > what went well and what could be improved for a specific minor release.
> > So
> > > far, I didn't find anything on that topic. Does the community find this
> > > useful? Or was this already done but not helpful?
> > >
> > > I see three options here:
> > > 1. Having an actual meeting where issues can be discussed and/or
> > > experiences can be shared between the release managers of the previous
> > > release and the release managers of the next minor release. Of course,
> > this
> > > could be open to other contributors as well. A summary could be
> provided
> > in
> > > the Flink wiki (the Flink release's wiki article).
> > > 2. The release manager(s) provide a summary on the Flink release's wiki
> > > article as part of the release process.
> > > 3. Leave the process as is without any additional retrospective but
> focus
> > > on improving the documentation if issues arose during the release.
> > >
> > > That might help people who consider contributing to the community
> through
> > > supporting the release efforts. Additionally, it might help in
> > > understanding what went wrong in past releases retroactively (e.g. the
> > > longer release cycle for 1.15).
> > >
> > > I'm curious about opinion's on that topic.
> > >
> > > Best,
> > > Matthias
> > >
> >
>


Re: [ANNOUNCE] Performance Daily Monitoring Moved from Ververica to Apache Flink Slack Channel

2022-11-03 Thread Jing Ge
Thanks Yanfei for driving this!

Looking forward to further discussion w.r.t. the workflow.

Best regards,
Jing

On Mon, Oct 31, 2022 at 6:04 PM Mason Chen  wrote:

> +1, thanks for driving this!
>
> On a side note, can we also ensure that a performance summary report for
> Flink major version upgrades is in release notes, once this infrastructure
> becomes mature? From the user perspective, it would be nice to know what
> the expected (or unexpected) regressions in a major version upgrade are.
> I've seen the community do something like this before (e.g. the major
> rocksdb version bump in 1.14?) and it was quite valuable to know that
> upfront!
>
> Best,
> Mason
>
> On Fri, Oct 28, 2022 at 1:46 AM weijie guo 
> wrote:
>
> > Thanks Yanfei for driving this.
> >
> > It allows us to easily find performance regression problems.
> > Especially recently, I have made some improvements to the scheduling
> > related parts, so your work is very important to ensure that these changes
> do
> > not cause some unexpected problems.
> >
> > Best regards,
> >
> > Weijie
> >
> >
> > > On Fri, Oct 28, 2022 at 16:03, Congxian Qiu  wrote:
> >
> > > Thanks for driving this and making the performance monitoring public,
> > this
> > > helps us detect and resolve performance problems quickly.
> > >
> > > Looking forward to the workflow and detailed descriptions of
> > > flink-dev-benchmarks.
> > >
> > > Best,
> > > Congxian
> > >
> > >
> > > On Thu, Oct 27, 2022 at 12:41, Yun Tang  wrote:
> > >
> > > > Thanks, Yanfei for driving this to monitor the performance in the
> > Apache
> > > > Flink Slack Channel.
> > > >
> > > > Look forward to the workflow and detailed descriptions of
> > > > flink-dev-benchmarks.
> > > >
> > > > Best
> > > > Yun Tang
> > > > 
> > > > From: Hangxiang Yu 
> > > > Sent: Thursday, October 27, 2022 10:59
> > > > To: dev@flink.apache.org 
> > > > Subject: Re: [ANNOUNCE] Performance Daily Monitoring Moved from
> > Ververica
> > > > to Apache Flink Slack Channel
> > > >
> > > > Hi, Yanfei.
> > > > Thanks for driving this.
> > > > It could help us to detect and resolve the regression problem quickly
> > and
> > > > officially.
> > > > I'd like to join as a maintainer.
> > > > Looking forward to the workflow.
> > > >
> > > > On Wed, Oct 26, 2022 at 5:18 PM Yuan Mei 
> > wrote:
> > > >
> > > > > Thanks, Yanfei, to drive this and make the performance monitoring
> > > > publicly
> > > > > available.
> > > > >
> > > > > Looking forward to seeing the workflow, and more details as Martijn
> > > > > mentioned.
> > > > >
> > > > > Best
> > > > > Yuan
> > > > >
> > > > > On Wed, Oct 26, 2022 at 2:59 PM Martijn Visser <
> > > martijnvis...@apache.org
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Yanfei Lei,
> > > > > >
> > > > > > Thanks for setting this up! It would be interesting to also know
> > > which
> > > > > > aspects of Flink are monitored for "performance". I'm assuming
> > there
> > > > are
> > > > > > specific pieces of functionality that are performance tested, but
> > it
> > > > > would
> > > > > > be great if this would be written down somewhere (next to a
> > procedure
> > > > how
> > > > > > to detect a regression and what the next steps should be).
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Martijn
> > > > > >
> > > > > > On Wed, Oct 26, 2022 at 8:21 AM Zakelly Lan <
> zakelly@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Hi yanfei,
> > > > > > >
> > > > > > > Thanks for driving this! It's a great help.
> > > > > > >
> > > > > > > I would like to join as a maintainer.
> > > > > > >
> > > > > > > Best,
> > > > > > > Zakelly
> > > > > > >
> > > > > > > On Wed, Oct 26, 2022 at 11:32 AM yanfei lei <
> fredia...@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > As discussed earlier, we plan to create a benchmark channel
> in
> > > > Apache
> > > > > > > Flink
> > > > > > > > slack[1], but the plan was shelved for a while[2]. So I went
> on
> > > > with
> > > > > > this
> > > > > > > > work, and created the #flink-dev-benchmarks channel for
> > > performance
> > > > > > > > regression notifications.
> > > > > > > >
> > > > > > > > We have a regression report script[3] that runs daily, and a
> > > > > > notification
> > > > > > > > would be sent to the slack channel when the last few
> benchmark
> > > > > results
> > > > > > > are
> > > > > > > > significantly worse than the baseline.
> > > > > > > > Note, regressions are detected by a simple script which may
> > have
> > > > > false
> > > > > > > > positives and false negatives. And all benchmarks are
> executed
> > on
> > > > one
> > > > > > > > physical machine[4] which is provided by
> Ververica(Alibaba)[5],
> > > it
> > > > > > might
> > > > > > > > happen that hardware issues affect performance, like
> > > "[FLINK-18614
> > > > > > > > ]
> > Performance
> > > > > > > regression
> > > > > > > > 2020.

Re: [blog article] Howto migrate a real-life batch pipeline from the DataSet API to the DataStream API

2022-11-10 Thread Jing Ge
Hi Etienne,

Nice blog! Thanks for sharing!

Best regards,
Jing


On Wed, Nov 9, 2022 at 5:49 PM Etienne Chauchot 
wrote:

> Hi Yun Gao,
>
> FYI I just updated the article after your review:
> https://echauchot.blogspot.com/2022/11/flink-howto-migrate-real-life-batch.html
>
> Best
>
> Etienne
> On 09/11/2022 at 10:04, Etienne Chauchot wrote:
>
> Hi Yun Gao,
>
> thanks for your email and your review !
>
> My comments are inline
> On 08/11/2022 at 06:51, Yun Gao wrote:
>
> Hi Etienne,
>
> Many thanks for the article! Flink is indeed continuously increasing its
> ability for unified batch/stream processing with the same API, and it's a
> great pleasure that more and more users are trying this functionality. But I
> also have some questions regarding some details.
>
> First IMO, as a whole for the long run Flink will have two unified APIs,
> namely Table / SQL
> API and DataStream API. Users could express the computation logic with
> these two APIs
> for both bounded and unbounded data processing.
>
>
> Yes that is what I understood also throughout the discussions and jiras.
> And I also think IMHO that reducing the number of APIs to 2 was a good
> move.
>
>
> Underlying Flink provides two
> execution modes:  the streaming mode works with both bounded and unbounded
> data,
> and it executes in a way of incremental processing based on state; the
> batch mode works
> only with bounded data, and it executes level-by-level in a way similar
> to traditional
> batch processing frameworks. Users could switch the execution mode via
> EnvironmentSettings.inBatchMode() or
> StreamExecutionEnvironment.setRuntimeMode().
>
> As recommended in the Flink docs(1), I have enabled the batch mode as I thought
> it would be more efficient on my bounded pipeline, but as a matter of fact
> the streaming mode seems to be more efficient for my use case. I'll test
> with higher volumes to confirm.
>
>
>
> Specially for DataStream, as implemented in FLIP-140, currently all the
> existing DataStream
> operations support the batch execution mode in a unified way[1]: data
> will be sorted for the
> keyBy() edges according to the key, then the following operations like
> reduce() could receive
> all the data belonging to the same key consecutively, and could then
> directly reduce the records
> of the same key without maintaining the intermediate states. In this way
> users could write the
> same code for both streaming and batch processing.
>
>
> Yes I have no doubt that my resulting Query3ViaFlinkRowDatastream pipeline
> will work with no modification if I plug an unbounded source to it.
>
>
>
> # Regarding the migration of Join / Reduce
>
> First I think Reduce is always supported and users could write
> dataStream.keyBy().reduce(xx)
> directly, and if batch execution mode is set, the reduce will not be
> executed in an incremental way;
> instead it acts much like sort-based aggregation in a traditional
> batch processing framework.
>
> Regarding Join, although the issue of FLINK-22587 indeed exists: current
> join has to be bound
> to a window and the GlobalWindow does not work properly, but with some
> more effort currently
> it does not require users to re-write the whole join from scratch: users
> could write a dedicated
> window assigner that assigns all the records to the same window instance
> and returns
> EventTimeTrigger.create() as the default event-time trigger [2]. Then the
> following works:
>
> source1.join(source2)
> .where(a -> a.f0)
> .equalTo(b -> b.f0)
> .window(new EndOfStreamWindows())
> .apply(new MyJoinFunction()); // any JoinFunction implementation (name hypothetical)
>
> It does not require records to have event time attached, since the trigger of
> the window relies only
> on the time range of the window, and the assignment does not need
> event time either.
>
> The behavior of the join is also similar to sort-based join if batch mode
> is enabled.
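>
> For reference, a minimal sketch of such an assigner (the class name
> EndOfStreamWindows and the omitted imports are illustrative only):
>
> public class EndOfStreamWindows extends WindowAssigner<Object, TimeWindow> {
>
>     private static final TimeWindow GLOBAL_WINDOW =
>             new TimeWindow(Long.MIN_VALUE, Long.MAX_VALUE);
>
>     @Override
>     public Collection<TimeWindow> assignWindows(
>             Object element, long timestamp, WindowAssignerContext context) {
>         // every record lands in the single window covering the whole stream
>         return Collections.singletonList(GLOBAL_WINDOW);
>     }
>
>     @Override
>     public Trigger<Object, TimeWindow> getDefaultTrigger(StreamExecutionEnvironment env) {
>         // fires once the watermark reaches the end of the bounded input
>         return EventTimeTrigger.create();
>     }
>
>     @Override
>     public TypeSerializer<TimeWindow> getWindowSerializer(ExecutionConfig executionConfig) {
>         return new TimeWindow.Serializer();
>     }
>
>     @Override
>     public boolean isEventTime() {
>         return true;
>     }
> }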
>
> Of course it is not user-friendly to require this workaround, and we'll
> try to fix this issue in 1.17.
>
>
> Yes, this is a better workaround than the manual state-based join that I
> proposed. I tried it and it works perfectly with similar performance.
> Thanks.
>
>
> # Regarding support of Sort / Limit
>
> Currently these two operators are indeed not supported in the DataStream
> API directly. One initial
> thought for these two operations is that users may convert the DataStream
> to Table API and use
> Table API for these two operators:
>
> DataStream<Row> xx = ... // keeps the customized logic in DataStream
> Table tableXX = tableEnv.fromDataStream(xx);
> tableXX.orderBy($("a").asc());
>
>
> Yes I knew that workaround but I decided not to use it because I have a
> special SQL based implementation (for comparison reasons) so I did not want
> to mix SQL and DataStream APIs in the same pipeline.
>
>
> What do you think about this option? We are also assessing if the
> combination of DataStream
> API / Table API is sufficient for all the batch u

Re: [DISCUSS] Remove FlinkKafkaConsumer and FlinkKafkaProducer in the master for 1.17 release

2022-11-11 Thread Jing Ge
Hi all,

Thank you all for the informative feedback. I figure there is a requirement
to improve the documentation wrt the migration from FlinkKafkaConsumer to
KafkaSource. I've filed a ticket[1] and connected it with [2]. This
shouldn't be the blocker for removing FlinkKafkaConsumer.

Given there will be some ongoing SinkV2 upgrades, I will start a vote
limited only to FlinkKafkaConsumer elimination and related API graduation. As a
follow-up task, I will sync with Yun Gao before the coding freeze of the 1.17
release to check if we can start the second vote to remove
FlinkKafkaProducer with 1.17.

Best regards,
Jing

[1] https://issues.apache.org/jira/browse/FLINK-2
[2] https://issues.apache.org/jira/browse/FLINK-28302


On Wed, Nov 2, 2022 at 11:39 AM Martijn Visser 
wrote:

> Hi David,
>
> I believe that for the DataStream this is indeed documented [1] but it
> might be missed given that there is a lot of documentation and you need to
> know that your problem is related to idleness. For the Table API I think
> this is never mentioned, so it should definitely be at least documented
> there.
>
> Thanks,
>
> Martijn
>
> [1]
>
> https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/datastream/kafka/#idleness
>
> On Wed, Nov 2, 2022 at 11:28 AM David Anderson 
> wrote:
>
> > >
> > > For the partition
> > > idleness problem could you elaborate more about it? I assume both
> > > FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to decide
> > > whether to mark the partition as idle.
> >
> >
> > As a matter of fact, no, that's not the case -- which is why I mentioned
> > it.
> >
> > The FlinkKafkaConsumer automatically treats all initially empty (or
> > non-existent) partitions as idle, while the KafkaSource only does this if
> > the WatermarkStrategy specifies that idleness handling is desired by
> > configuring withIdleness. This can be a source of confusion for folks
> > upgrading to the new connector. It most often shows up in situations
> where
> > the number of Kafka partitions is less than the parallelism of the
> > connector, which is a rather common occurrence in development and testing
> > environments.
> >
> > I believe this change in behavior was made deliberately, so as to create
> a
> > more consistent experience across all FLIP-27 connectors. This isn't
> > something that needs to be fixed, but does need to be communicated more
> > clearly. Unfortunately, the whole idleness mechanism remained
> significantly
> > broken until 1.16 (considering the impact of [1] and [2]), further
> > complicating the situation. Because of FLINK-28975 [2], users with
> > partitions that are initially empty may have problems with versions
> before
> > 1.15.3 (still unreleased) and 1.16.0. See [3] for an example of this
> > confusion.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-18934 (idleness didn't
> > work
> > with connected streams)
> > [2] https://issues.apache.org/jira/browse/FLINK-28975 (idle streams
> could
> > never become active again)
> > [3]
> >
> >
> https://stackoverflow.com/questions/70096166/parallelism-in-flink-kafka-source-causes-nothing-to-execute/70101290#70101290
> >
> > Best,
> > David
> >
> > On Wed, Nov 2, 2022 at 5:26 AM Qingsheng Ren  wrote:
> >
> > > Thanks Jing for starting the discussion.
> > >
> > > +1 for removing FlinkKafkaConsumer, as KafkaSource has evolved for many
> > > release cycles and should be stable enough. I have some concerns about
> > the
> > > new Kafka sink based on sink v2, as sink v2 still has some ongoing work
> > in
> > > 1.17 (maybe Yun Gao could provide some inputs). Also we found some
> issues
> > > of KafkaSink related to the internal mechanism of sink v2, like
> > > FLINK-29492.
> > >
> > > @David
> > > About the ability of DeserializationSchema#isEndOfStream, FLIP-208 is
> > > trying to complete this piece of the puzzle, and Hang Ruan (
> > > ruanhang1...@gmail.com) plans to work on it in 1.17. For the partition
> > > idleness problem could you elaborate more about it? I assume both
> > > FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to decide
> > > whether to mark the partition as idle.
> > >
> > > Best,
> > > Qingsheng
> > > Ververica (Alibaba)
> > >
> > > On Thu, Oct 27, 2022 at 8:06 PM Jing Ge  wrote:
> > >
> > > > Hi Dev,
> > > >
> > > > I'd like to start a discussion about removi

[Vote] Remove FlinkKafkaConsumer and graduate related Source APIs

2022-11-15 Thread Jing Ge
Hi,

As discussed on the mailing list[1], I'd like to start a vote to remove
FlinkKafkaConsumer and graduate related Source APIs.

Since there are some ongoing developments which will change SinkV2 APIs,
depending on how far it would go, I might start another vote to remove
FlinkKafkaProducer before the coding freeze of 1.17 release.

Voting will be open for at least 72 hours.

Best regards,
Jing

[1] https://lists.apache.org/thread/m3o48c2d8j9g5t9s89hqs6qvr924s71o


Re: [DISCUSS] Remove FlinkKafkaConsumer and FlinkKafkaProducer in the master for 1.17 release

2022-11-21 Thread Jing Ge
Hi Dong,

Thanks for the heads-up. It was really nice to be aware of this issue
before removing FlinkKafkaConsumer. I will check FLIP-208 and work on it.
Thanks!

Best regards,
Jing


On Thu, Nov 17, 2022 at 4:34 PM Dong Lin  wrote:

> Hi Jing,
>
> Thanks for opening the discussion. I am not sure we are ready to
> remove FlinkKafkaConsumer.
> The reason is that for existing users of FlinkKafkaConsumer who rely
> on KafkaDeserializationSchema::isEndOfStream(),
> there is currently no migration path for them to use KafkaSource.
>
> This issue was explained in FLIP-208
> <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-208%3A+Add+RecordEvaluator+to+dynamically+stop+source+based+on+de-serialized+records
> >.
> The design was pretty much ready. I didn't start the voting thread
> because Fabian said he wanted more time to explore alternative solutions.
> My priority changed recently and don't plan to get this FLIP done for 1.17.
> It will be great if someone can address this issue so that we can move
> forward to remove FlinkKafkaConsumer.
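>
> For readers of this thread, the capability in question looks like this on the
> legacy consumer (a sketch; imports are omitted and the end-marker convention
> is just an example):
>
> new KafkaDeserializationSchema<String>() {
>     @Override
>     public boolean isEndOfStream(String nextElement) {
>         // dynamically stop consuming based on a de-serialized record
>         return "END".equals(nextElement);
>     }
>
>     @Override
>     public String deserialize(ConsumerRecord<byte[], byte[]> record) {
>         return new String(record.value(), StandardCharsets.UTF_8);
>     }
>
>     @Override
>     public TypeInformation<String> getProducedType() {
>         return Types.STRING;
>     }
> };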
>
> Thanks,
> Dong
>
>
>
>
>
> On Fri, Nov 11, 2022 at 8:53 PM Jing Ge 
> wrote:
>
> > Hi all,
> >
> > Thank you all for the informative feedback. I figure there is a
> requirement
> > to improve the documentation wrt the migration from FlinkKafkaConsumer to
> > KafkaSource. I've filed a ticket[1] and connected it with [2]. This
> > shouldn't be the blocker for removing FlinkKafkaConsumer.
> >
> > Given there will be some ongoing SinkV2 upgrades, I will start a vote
> > limited only to FlinkKafkaConsumer elimination and related API graduation.
> > As a
> > follow-up task, I will sync with Yun Gao before the coding freeze of the 1.17
> > release to check if we can start the second vote to remove
> > FlinkKafkaProducer with 1.17.
> >
> > Best regards,
> > Jing
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-2
> > [2] https://issues.apache.org/jira/browse/FLINK-28302
> >
> >
> > On Wed, Nov 2, 2022 at 11:39 AM Martijn Visser  >
> > wrote:
> >
> > > Hi David,
> > >
> > > I believe that for the DataStream this is indeed documented [1] but it
> > > might be missed given that there is a lot of documentation and you need
> > to
> > > know that your problem is related to idleness. For the Table API I
> think
> > > this is never mentioned, so it should definitely be at least documented
> > > there.
> > >
> > > Thanks,
> > >
> > > Martijn
> > >
> > > [1]
> > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/datastream/kafka/#idleness
> > >
> > > On Wed, Nov 2, 2022 at 11:28 AM David Anderson 
> > > wrote:
> > >
> > > > >
> > > > > For the partition
> > > > > idleness problem could you elaborate more about it? I assume both
> > > > > FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to
> decide
> > > > > whether to mark the partition as idle.
> > > >
> > > >
> > > > As a matter of fact, no, that's not the case -- which is why I
> > mentioned
> > > > it.
> > > >
> > > > The FlinkKafkaConsumer automatically treats all initially empty (or
> > > > non-existent) partitions as idle, while the KafkaSource only does
> this
> > if
> > > > the WatermarkStrategy specifies that idleness handling is desired by
> > > > configuring withIdleness. This can be a source of confusion for folks
> > > > upgrading to the new connector. It most often shows up in situations
> > > where
> > > > the number of Kafka partitions is less than the parallelism of the
> > > > connector, which is a rather common occurrence in development and
> > testing
> > > > environments.
> > > >
> > > > I believe this change in behavior was made deliberately, so as to
> > create
> > > a
> > > > more consistent experience across all FLIP-27 connectors. This isn't
> > > > something that needs to be fixed, but does need to be communicated
> more
> > > > clearly. Unfortunately, the whole idleness mechanism remained
> > > significantly
> > > > broken until 1.16 (considering the impact of [1] and [2]), further
> > > > complicating the situation. Because of FLINK-28975 [2], users with
> > > > partitions that are initially empty may have proble

Re: [Vote] Remove FlinkKafkaConsumer and graduate related Source APIs

2022-11-21 Thread Jing Ge
Hi Dong,

Thanks again for the clarification. Since an important feature is missing
from the KafkaSource, this vote is cancelled and will be started again once
the feature has been implemented in the KafkaSource.

Best regards,
Jing

On Thu, Nov 17, 2022 at 4:37 PM Dong Lin  wrote:

> Hi Jing,
>
> I realized that there is a missing feature with KafkaSource that might
> prevent existing users of FlinkKafkaConsumer from migrating to
> KafkaSource.
> I have put more details in the discussion thread.
>
> Thanks,
> Dong
>
>
> On Wed, Nov 16, 2022 at 12:37 AM Jing Ge 
> wrote:
>
> > Hi,
> >
> > As discussed on the mailing list[1], I'd like to start a vote to remove
> > FlinkKafkaConsumer and graduate related Source APIs.
> >
> > Since there are some ongoing developments which will change SinkV2 APIs,
> > depending on how far it would go, I might start another vote to remove
> > FlinkKafkaProducer before the coding freeze of 1.17 release.
> >
> > Voting will be open for at least 72 hours.
> >
> > Best regards,
> > Jing
> >
> > [1] https://lists.apache.org/thread/m3o48c2d8j9g5t9s89hqs6qvr924s71o
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Matyas Orhidi

2022-11-21 Thread Jing Ge
Congrats Matyas!

Best Regards,
Jing

On Mon, Nov 21, 2022 at 3:43 PM Ferenc Csaky 
wrote:

> Congrats Matyas!
>
> Best,
> F
>
>
>
> --- Original Message ---
> On Monday, November 21st, 2022 at 15:17, Márton Balassi <
> mbala...@apache.org> wrote:
>
>
> >
> >
> > Hi everyone,
> >
> > On behalf of the PMC, I'm very happy to announce Matyas Orhidi as a new
> > Flink
> > committer.
> >
> > Matyas has over a decade of experience in the Big Data ecosystem and has
> > been working with Flink full time for the past 3 years. In the open
> source
> > community he is one of the key driving members of the Kubernetes Operator
> > subproject. He implemented multiple key features in the operator
> including
> > the metrics system and the ability to dynamically configure watched
> > namespaces. He enjoys spreading the word about Flink and regularly does
> so
> > via authoring blogposts and giving talks or interviews representing the
> > community.
> >
> > Please join me in congratulating Matyas for becoming a Flink committer!
> >
> > Best,
> > Marton
>


Re: [ANNOUNCE] New Apache Flink PMC Members - Godfrey He, Xingbo Huang

2022-11-23 Thread Jing Ge
Congrats, Godfrey! Congrats, Xingbo!

Best regards,
Jing

On Wed, Nov 23, 2022 at 6:11 PM Maximilian Michels  wrote:

> Welcome aboard Godfrey and Xingbo!
>
> -Max
>
> On Wed, Nov 23, 2022 at 5:51 PM Yun Tang  wrote:
>
> > Congratulations, Godfrey and Xingbo!
> >
> > Best
> > Yun Tang
> > 
> > From: Rui Fan <1996fan...@gmail.com>
> > Sent: Wednesday, November 23, 2022 22:31
> > To: dev@flink.apache.org 
> > Subject: Re: [ANNOUNCE] New Apache Flink PMC Members - Godfrey He, Xingbo
> > Huang
> >
> > Congratulations, well deserved!
> >
> > Rui Fan
> >
> > On Wed, Nov 23, 2022 at 9:53 PM Konstantin Knauf 
> > wrote:
> >
> > > Congrats to both.
> > >
> > > Am Mi., 23. Nov. 2022 um 10:45 Uhr schrieb yu zelin <
> > yuzelin@gmail.com
> > > >:
> > >
> > > > Congratulations,Godfrey and Xingbo!
> > > >
> > > > Best,
> > > > Yu Zelin
> > > > > On Nov 23, 2022, at 12:18, Dian Fu  wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > On behalf of the Apache Flink PMC, I'm very happy to announce that
> > > > Godfrey
> > > > > He and Xingbo Huang have joined the Flink PMC!
> > > > >
> > > > > Godfrey He became a Flink committer in Sep 2020. His contributions
> > are
> > > > > mainly focused on the Flink table module, covering almost all
> > important
> > > > > parts such as Client(SQL Client, SQL gateway, JDBC driver, etc),
> API,
> > > SQL
> > > > > parser, query optimization, query execution, etc. Especially in the
> > > query
> > > > > optimization part, he built the query optimization framework and
> the
> > > cost
> > > > > model, improved the statistics information and made a lot of
> > > > optimizations,
> > > > > e.g. dynamic partition pruning, join hint, multiple input rewrite,
> > etc.
> > > > In
> > > > > addition, he has done a lot of groundwork, such as refactoring the
> > > > > ExecNode(which is the basis for further DAG optimizations) and SQL
> > Plan
> > > > > JSON serialization (which is a big step to support SQL job version
> > > > > upgrade). Besides that, he's also helping the projects in other
> ways,
> > > > e.g.
> > > > > managing releases, giving talks, etc.
> > > > >
> > > > > Xingbo Huang became a Flink committer in Feb 2021. His
> contributions
> > > are
> > > > > mainly focused on the PyFlink module and he's the author of many
> > > > important
> > > > > features in PyFlink, e.g. Cython support, Python thread execution
> > mode,
> > > > > Python UDTF support, Python UDAF support in windowing, etc. He is
> > also
> > > > one
> > > > > of the main contributors in the Flink ML project. Besides that,
> he's
> > > also
> > > > > helping to manage releases, taking care of the build stability,
> etc.
> > > > >
> > > > > Congratulations and welcome them as Apache Flink PMC!
> > > > >
> > > > > Regards,
> > > > > Dian
> > > >
> > > >
> > >
> > > --
> > > https://twitter.com/snntrable
> > > https://github.com/knaufk
> > >
> >
>


Re: [VOTE] FLIP-273: Improve Catalog API to Support ALTER TABLE syntax

2022-12-05 Thread Jing Ge
+1 (non-binding)

On Mon, Dec 5, 2022 at 3:29 PM Zheng Yu Chen  wrote:

> +1 (non-binding)
>
> On Mon, Dec 5, 2022 at 11:00, Jark Wu  wrote:
>
> > +1 (binding)
> >
> > Best,
> > Jark
> >
> > On Fri, 2 Dec 2022 at 10:11, Paul Lam  wrote:
> >
> > > +1 (non-binding)
> > >
> > > Best,
> > > Paul Lam
> > >
> > > > On Dec 2, 2022, at 09:17, yuxia  wrote:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Best regards,
> > > > Yuxia
> > > >
> > > > ----- Original Message -----
> > > > From: "Yaroslav Tkachenko" 
> > > > To: "dev" 
> > > > Sent: Friday, December 2, 2022, 12:27:24 AM
> > > > Subject: Re: [VOTE] FLIP-273: Improve Catalog API to Support ALTER TABLE
> > > syntax
> > > >
> > > > +1 (non-binding).
> > > >
> > > > Looking forward to it!
> > > >
> > > > On Thu, Dec 1, 2022 at 5:06 AM Dong Lin  wrote:
> > > >
> > > >> +1 (binding)
> > > >>
> > > >> Thanks for the FLIP!
> > > >>
> > > >> On Thu, Dec 1, 2022 at 12:20 PM Shengkai Fang 
> > > wrote:
> > > >>
> > > >>> Hi All,
> > > >>>
> > > >>> Thanks for all the feedback so far. Based on the discussion[1] we
> > seem
> > > >>> to have a consensus, so I would like to start a vote on FLIP-273.
> > > >>>
> > > >>> The vote will last for at least 72 hours (Dec 5th at 13:00 GMT,
> > > >>> excluding weekend days) unless there is an objection or
> insufficient
> > > >> votes.
> > > >>>
> > > >>> Best,
> > > >>> Shengkai
> > > >>>
> > > >>> [1]
> > > >>>
> > > >>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-273%3A+Improve+the+Catalog+API+to+Support+ALTER+TABLE+syntax
> > > >>> [2]
> https://lists.apache.org/thread/2v4kh2bpzvk049zdxb687q7o1pcmnnnw
> > > >>>
> > > >>
> > >
> > >
> >
>
>
> --
> Best
>
> ConradJam
>


[DISCUSS] FLIP-279 Unified the max display column width for SqlClient and Table API in both Streaming and Batch execMode

2022-12-16 Thread Jing Ge
Hi Dev,

I'd like to start a discussion about FLIP-279: Unified the max display
column width for SqlClient and Table API in both Streaming and Batch
execMode[1].

On one side, currently there is only one ConfigOption,
SqlClientOptions.DISPLAY_MAX_COLUMN_WIDTH ('sql-client.display.max-column-width'),
that could be used to configure the value, and it only works in a very specific
case. Users need to truly understand it before using it correctly.
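
As a concrete illustration of the status quo (a sketch; as I understand it,
the option currently only takes effect in the SQL Client CLI):

// In the SQL Client CLI this works:
//   SET 'sql-client.display.max-column-width' = '40';
// But setting the same key via the Table API has no effect on printing:
TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
tEnv.getConfig().getConfiguration()
        .setInteger("sql-client.display.max-column-width", 40); // not picked up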

On the other side, users want to configure the max display column width
while:

   - using CLI
   - Using Table API
   - Running job in batch execution mode
   - Running job in streaming execution mode


FLIP-279 will try to provide a unified way to address this issue. Looking
forward to your feedback.

Best regards,
Jing

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-279+Unified+the+max+display+column+width+for+SqlClient+and+Table+APi+in+both+Streaming+and+Batch+execMode


Re: [DISCUSS] Update Flink Roadmap

2023-06-01 Thread Jing Ge
Hi Jark,

Fair enough. Let's do it like you suggested. Thanks!

Best regards,
Jing

On Thu, Jun 1, 2023 at 6:00 PM Jark Wu  wrote:

> Hi Jing,
>
> This thread is for discussing the roadmap for versions 1.18, 2.0, and even
> more.
> One of the outcomes of this discussion will be an updated version of the
> current roadmap.
> Let's work together on refining the roadmap in this thread.
>
> Best,
> Jark
>
> On Thu, 1 Jun 2023 at 23:25, Jing Ge  wrote:
>
> > Hi Jark,
> >
> > Thanks for driving it! For point 2, since we are developing 1.18 now,
> > does it make sense to update the roadmap this time while we are releasing
> > 1.18? This discussion thread will be focusing on the Flink 2.0 roadmap,
> as
> > you mentioned previously. WDYT?
> >
> > Best regards,
> > Jing
> >
> > On Thu, Jun 1, 2023 at 3:31 PM Jark Wu  wrote:
> >
> > > Hi all,
> > >
> > > Martijn and I would like to initiate a discussion on the Flink roadmap,
> > > which should cover the project's long-term roadmap and the regular
> update
> > > mechanism.
> > >
> > > Xintong has already started a discussion about Flink 2.0 planning. One
> of
> > > the points raised in that discussion is that we should have a
> high-level
> > > discussion of the roadmap to present where the project is heading
> (which
> > > doesn't necessarily need to block the Flink 2.0 planning). Moreover,
> the
> > > roadmap on the Flink website [1] hasn't been updated for half a year,
> and
> > > the last update was for the feature radar for the 1.15 release. It has
> > been
> > > 2 years since the community discussed Flink's overall roadmap.
> > >
> > > I would like to raise two topics for discussion:
> > >
> > > 1. The new roadmap. This should be an updated version of the current
> > > roadmap[1].
> > > 2. A mechanism to regularly discuss and update the roadmap.
> > >
> > > To make the first topic discussion more efficient, Martijn and I
> > volunteer
> > > to summarize the ongoing big things of different components and
> present a
> > > roadmap draft to the community in the next few weeks. This should be a
> > good
> > > starting point for a more detailed discussion.
> > >
> > > Regarding the regular update mechanism, there was a proposal in a
> thread
> > > [2] three years ago to make the release manager responsible for
> updating
> > > the roadmap. However, it appears that this was not documented as a
> > release
> > > management task [3], and the roadmap update wasn't performed for
> releases
> > > 1.16 and 1.17.
> > >
> > > In my opinion, making release managers responsible for keeping the
> > roadmap
> > > up to date is a good idea. Specifically, release managers of release X
> > can
> > > kick off the roadmap update at the beginning of release X, which can
> be a
> > > joint task with collecting a feature list [4]. Additionally, release
> > > managers of release X-1 can help verify and remove the accomplished
> items
> > > from the roadmap and update the feature radar.
> > >
> > > What do you think? Do you have other ideas?
> > >
> > > Best,
> > > Jark & Martijn
> > >
> > > [1]: https://flink.apache.org/roadmap.html
> > > [2]: https://lists.apache.org/thread/o0l3cg6yphxwrww0k7215jgtw3yfoybv
> > > [3]:
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Release+Management
> > > [4]: https://cwiki.apache.org/confluence/display/FLINK/1.18+Release
> > >
> >
>


Re: [DISCUSS] FLIP-313 Add support of User Defined AsyncTableFunction

2023-06-02 Thread Jing Ge
Hi Aitozi,

Thanks for the update. Just out of curiosity, what is the difference
between the RPC call or query you mentioned and a lookup, viewed in a very
general way? Since lateral join is used in the FLIP, is there any special
thought behind that? Sorry for asking so many questions. The FLIP contains
limited information for understanding the motivation.

Best regards,
Jing

On Fri, Jun 2, 2023 at 3:48 AM Aitozi  wrote:

> Hi Jing,
> I have updated the proposed changes to the FLIP. IMO, lookup has its
> clear
> async call requirement is due to its IO heavy operator. In our usage, sql
> users have
> logic to do some RPC call or query the third-party service which is also IO
> intensive.
> In these case, we'd like to leverage the async function to improve the
> throughput.
>
> Thanks,
> Aitozi.
>
> On Thu, Jun 1, 2023 at 22:55, Jing Ge  wrote:
>
> > Hi Aitozi,
> >
> > Sorry for the late reply. Would you like to update the proposed changes
> > with more details into the FLIP too?
> > I got your point. It looks like a rational idea. However, since lookup
> has
> > its clear async call requirement, are there any real use cases that
> > need this change? This will help us understand the motivation. After all,
> > lateral join and temporal lookup join[1] are quite different.
> >
> > Best regards,
> > Jing
> >
> >
> > [1]
> >
> >
> https://github.com/apache/flink/blob/d90a72da2fd601ca4e2a46700e91ec5b348de2ad/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/AsyncTableFunction.java#L54
> >
> > On Wed, May 31, 2023 at 8:53 AM Aitozi  wrote:
> >
> > > Hi Jing,
> > > What do you think about it? Can we move forward with this feature?
> > >
> > > Thanks,
> > > Aitozi.
> > >
> > > On Mon, May 29, 2023 at 09:56, Aitozi  wrote:
> > >
> > > > Hi Jing,
> > > > > "Do you mean to support the AyncTableFunction beyond the
> > > > LookupTableSource?"
> > > > Yes, I mean to support the AyncTableFunction beyond the
> > > LookupTableSource.
> > > >
> > > > The "AsyncTableFunction" is the function with ability to be executed
> > > async
> > > > (with AsyncWaitOperator).
> > > > The async lookup join is a one of usage of this. So, we don't have to
> > > bind
> > > > the AyncTableFunction with LookupTableSource.
> > > > If User-defined AsyncTableFunction is supported, user can directly
> use
> > > > lateral table syntax to perform async operation.
> > > >
> > > > > "It would be better if you could elaborate the proposed changes wrt
> > the
> > > > CorrelatedCodeGenerator with more details"
> > > >
> > > > In the proposal, we use lateral table syntax to support the async
> table
> > > > function. So the planner will also translate this statement into a
> > > > CommonExecCorrelate node, and the runtime code should be generated in
> > > > CorrelatedCodeGenerator.
> > > > In CorrelatedCodeGenerator, we will know whether the TableFunction's kind is
> > > > `FunctionKind.TABLE` or `FunctionKind.ASYNC_TABLE`.
> > > > For `FunctionKind.ASYNC_TABLE` we can generate an AsyncWaitOperator
> > to
> > > > execute the async table function.
> > > >
> > > >
> > > > Thanks,
> > > > Aitozi.
> > > >
> > > >
> > > > On Mon, May 29, 2023 at 03:22, Jing Ge  wrote:
> > > >
> > > >> Hi Aitozi,
> > > >>
> > > >> Thanks for the clarification. The naming "Lookup" might suggest
> using
> > it
> > > >> for table lookup. But conceptually what the eval() method will do
> is
> > to
> > > >> get a collection of results (Row, RowData) from the given keys. How
> it
> > > will
> > > >> be done depends on the implementation, i.e. you can implement your
> own
> > > >> Source[1][2]. The example in the FLIP should be able to be handled
> in
> > > this
> > > >> way.
> > > >>
> > > >> Do you mean to support the AyncTableFunction beyond the
> > > LookupTableSource?
> > > >> It would be better if you could elaborate the proposed changes wrt
> the
> > > >> CorrelatedCodeGenerator with more details. Thanks!
> > > >>
> > > >> Best regards,
> > > >> Jing
> > > >>
> > > >> [1]
> > > >>
> >

Re: [DISCUSS] FLIP-307: Flink connector Redshift

2023-06-02 Thread Jing Ge
Hi Samrat,

Excited to see your proposal. Supporting data warehouses is one of the
major tracks for Flink. Thanks for driving it! Happy to see that we reached
consensus to prioritize the Sink over Source in the previous discussion. Do
you already have any prototype? I'd like to join the reviews.

Just out of curiosity, speaking of JDBC mode, according to the FLIP, it
should be doable to directly use the jdbc connector with Redshift, if I am
not mistaken. Will the Redshift connector provide additional features
beyond the mediator/wrapper of the jdbc connector?

Best regards,
Jing

On Thu, Jun 1, 2023 at 8:22 PM Ahmed Hamdy  wrote:

> Hi Samrat
>
> Thanks for putting up this FLIP. I agree regarding the importance of the
> use case.
> please let me know If you need any collaboration regarding integration with
> AWS connectors credential providers or regarding FLIP-171 I would be more
> than happy to assist.
> I also like Leonard's proposal for starting with DataStreamSink and
> TableSink, It would be great to have some milestones delivered as soon as
> ready.
> best regards
> Ahmed Hamdy
>
>
> On Wed, 31 May 2023 at 11:15, Samrat Deb  wrote:
>
> > Hi Liu Ron,
> >
> > > 1. Regarding the  `read.mode` and `write.mode`, you say here provides
> two
> > modes, respectively, jdbc and `unload or copy`, What is the default value
> > for `read.mode` and `write.mode?
> >
> > I have made an effort to make the configuration options `read.mode` and
> > `write.mode` mandatory for the "flink-connector-redshift" according to
> > FLIP[1]. The rationale behind this decision is to empower users who are
> > familiar with their Redshift setup and have specific expectations for the
> > sink. By making these configurations mandatory, users can have more
> control
> > and flexibility in configuring the connector to meet their requirements.
> >
> > However, I am open to receiving feedback on whether it would be
> beneficial
> > to make the configuration options non-mandatory and set default values
> for
> > them. If you believe there are advantages to having default values or any
> > other suggestions, please share your thoughts. Your feedback is highly
> > appreciated.
> >
> > >  2. For Source, does it both support batch read and streaming read?
> >
> > Redshift currently does not provide native support for streaming reads,
> > although it does support streaming writes[2]. As part of the plan, I
> intend
> > to conduct a proof of concept and benchmarking to explore the
> possibilities
> > of implementing streaming reads using the Flink JDBC connector, as
> Redshift
> > is JDBC compatible.
> > However, it is important to note that, in the initial phase of
> > implementation, the focus will primarily be on supporting batch reads
> > rather than streaming reads. This approach will allow us to deliver a
> > robust and reliable solution for batch processing in phase 2 of the
> > implementation.
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift
> > [2]
> >
> >
> https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion.html
> >
> > Bests,
> > Samrat
> >
> > On Wed, May 31, 2023 at 8:03 AM liu ron  wrote:
> >
> > > Hi, Samrat
> > >
> > > Thanks for driving this FLIP. It looks like supporting
> > > flink-connector-redshift is very useful to Flink. I have two question:
> > > 1. Regarding the  `read.mode` and `write.mode`, you say here provides
> two
> > > modes, respectively, jdbc and `unload or copy`, What is the default
> value
> > > for `read.mode` and `write.mode?
> > > 2. For Source, does it both support batch read and streaming read?
> > >
> > >
> > > Best,
> > > Ron
> > >
> > > On Tue, May 30, 2023 at 17:15, Samrat Deb  wrote:
> > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift
> > > >
> > > > [note] Missed the trailing link for previous mail
> > > >
> > > >
> > > >
> > > > On Tue, May 30, 2023 at 2:43 PM Samrat Deb 
> > > wrote:
> > > >
> > > > > Hi Leonard,
> > > > >
> > > > > > and I’m glad to help review the design as well as the code
> review.
> > > > > Thank you so much. It would be really great and helpful to bring
> > > > > flink-connector-redshift for flink users :) .
> > > > >
> > > > > I have divided the implementation in 3 phases in the `Scope`
> > > Section[1].
> > > > > 1st phase is to
> > > > >
> > > > >- Integrate with Flink Sink API (*FLIP-171*
> > > > ><
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink
> > > >
> > > > >)
> > > > >
> > > > >
> > > > > > About the implementation phases, How about prioritizing support
> for
> > > the
> > > > > Datastream Sink API and TableSink API in the first phase?
> > > > > I can completely resonate with you to prioritize support for
> > Datastream
> > > > > Sink API and TableSink API in the first phase.
> > > > > I will update the FLIP[1] as you have suggested.
> > > > >
> > > > > > It

Re: [DISCUSS] FLIP-307: Flink connector Redshift

2023-06-05 Thread Jing Ge
Hi Samrat,

Thanks for the feedback. I would suggest adding that information into the
FLIP.

+1 Looking forward to your PR :-)

Best regards,
Jing

On Sat, Jun 3, 2023 at 9:19 PM Samrat Deb  wrote:

> Hi Jing Ge,
>
> >>> Do you already have any prototype? I'd like to join the reviews.
> The prototype is in progress. I will raise a dedicated PR for review soon and
> also notify this thread as well.
>
> >>> Will the Redshift connector provide additional features
> beyond the mediator/wrapper of the jdbc connector?
>
> Here are the additional features that the Flink connector for AWS Redshift
> can provide on top of using JDBC:
>
> 1. Integration with AWS Redshift Workload Management (WLM): AWS Redshift
> allows you to configure WLM[1] to manage query prioritization and resource
> allocation. The Flink connector for Redshift will be agnostic to the
> configured WLM and utilize it for scaling in and out for the sink. This
> means that the connector can leverage the WLM capabilities of Redshift to
> optimize the execution of queries and allocate resources efficiently based
> on your defined workload priorities.
>
> 2. Abstraction of AWS Redshift Quotas and Limits: AWS Redshift imposes
> certain quotas and limits[2] on various aspects such as the number of
> clusters, concurrent connections, queries per second, etc. The Flink
> connector for Redshift will provide an abstraction layer for users,
> allowing them to work with Redshift without having to worry about these
> specific limits. The connector will handle the management of connections
> and queries within the defined quotas and limits, abstracting away the
> complexity and ensuring compliance with Redshift's restrictions.
>
> These features aim to simplify the integration of Flink with AWS Redshift,
> providing optimized resource utilization and transparent handling of
> Redshift-specific limitations.
>
> Bests,
> Samrat
>
> [1]
>
> https://docs.aws.amazon.com/redshift/latest/dg/cm-c-implementing-workload-management.html
> [2]
>
> https://docs.aws.amazon.com/redshift/latest/mgmt/amazon-redshift-limits.html
>
> On Sat, Jun 3, 2023 at 11:40 PM Samrat Deb  wrote:
>
> > Hi Ahmed,
> >
> > >>> please let me know If you need any collaboration regarding
> integration
> > with
> > AWS connectors credential providers or regarding FLIP-171 I would be more
> > than happy to assist.
> >
> > Sure, I will reach out incase of any hands required.
> >
> >
> >
> > On Fri, Jun 2, 2023 at 6:12 PM Jing Ge 
> wrote:
> >
> >> Hi Samrat,
> >>
> >> Excited to see your proposal. Supporting data warehouses is one of the
> >> major tracks for Flink. Thanks for driving it! Happy to see that we
> >> reached
> >> consensus to prioritize the Sink over Source in the previous discussion.
> >> Do
> >> you already have any prototype? I'd like to join the reviews.
> >>
> >> Just out of curiosity, speaking of JDBC mode, according to the FLIP, it
> >> should be doable to directly use the jdbc connector with Redshift, if I
> am
> >> not mistaken. Will the Redshift connector provide additional features
> >> beyond the mediator/wrapper of the jdbc connector?
> >>
> >> Best regards,
> >> Jing
> >>
> >> On Thu, Jun 1, 2023 at 8:22 PM Ahmed Hamdy 
> wrote:
> >>
> >> > Hi Samrat
> >> >
> >> > Thanks for putting up this FLIP. I agree regarding the importance of
> the
> >> > use case.
> >> > please let me know If you need any collaboration regarding integration
> >> with
> >> > AWS connectors credential providers or regarding FLIP-171 I would be
> >> more
> >> > than happy to assist.
> >> > I also like Leonard's proposal for starting with DataStreamSink and
> >> > TableSink, It would be great to have some milestones delivered as soon
> >> as
> >> > ready.
> >> > best regards
> >> > Ahmed Hamdy
> >> >
> >> >
> >> > On Wed, 31 May 2023 at 11:15, Samrat Deb 
> wrote:
> >> >
> >> > > Hi Liu Ron,
> >> > >
> >> > > > 1. Regarding the  `read.mode` and `write.mode`, you say here
> >> provides
> >> > two
> >> > > modes, respectively, jdbc and `unload or copy`, What is the default
> >> value
> >> > > for `read.mode` and `write.mode?
> >> > >
> >> > > I have made an effort to make the configuration options `read.m

Re: [DISCUSS] FLIP-294: Support Customized Job Meta Data Listener

2023-06-05 Thread Jing Ge
Hi Shammon,

Thanks for driving it! It is a really interesting proposal. Looking forward
to the follow-up FLIP for the lineage feature; users will love it :-)

There are some inconsistencies in the content. In the example at the very bottom,
listener.onEvent(CatalogModificationEvent) is called, while in the
CatalogModificationListener interface definition, only
onEvent(CatalogModificationEvent, CatalogModificationContext) has been
defined. I was wondering (NIT):

1. should there be another overloading method
onEvent(CatalogModificationEvent) alongside
onEvent(CatalogModificationEvent, CatalogModificationContext) ?
2. Since onEvent(CatalogModificationEvent) could be used, do we really need
CatalogModificationContext? API design example as reference: [1]
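
To illustrate point 1, I had the following shape in mind (a sketch; only the
two-argument variant is defined in the FLIP, and the default-method wiring
below is my assumption):

public interface CatalogModificationListener {

    /* Convenience overload for listeners that do not need the context. */
    void onEvent(CatalogModificationEvent event);

    /* Full variant; by default the context is ignored. */
    default void onEvent(CatalogModificationEvent event, CatalogModificationContext context) {
        onEvent(event);
    }
}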

Best regards,
Jing


[1]
http://www.java2s.com/example/java-src/pkg/java/awt/event/actionlistener-add27.html

On Tue, Jun 6, 2023 at 7:43 AM Shammon FY  wrote:

> Hi devs:
>
> Thanks for all the feedback, and if there are no more comments, I will
> start a vote on FLIP-294 [1] later. Thanks again.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Job+Meta+Data+Listener
>
> Best,
> Shammon FY
>
> On Tue, Jun 6, 2023 at 1:40 PM Shammon FY  wrote:
>
> > Hi Martijn,
> >
> > Thanks for your attention, I will soon initiate a discussion about
> > FLIP-314.
> >
> > Best,
> > Shammon FY
> >
> >
> > On Fri, Jun 2, 2023 at 2:55 AM Martijn Visser 
> > wrote:
> >
> >> Hi Shammon,
> >>
> >> Just wanted to chip-in that I like the overall FLIP. Will be interesting
> >> to
> >> see the follow-up discussion on FLIP-314.
> >>
> >> Best regards,
> >>
> >> Martijn
> >>
> >> On Thu, Jun 1, 2023 at 5:45 AM yuxia 
> wrote:
> >>
> >> > Thanks for explanation. Make sense to me.
> >> >
> >> > Best regards,
> >> > Yuxia
> >> >
> >> > ----- Original Message -----
> >> > From: "Shammon FY" 
> >> > To: "dev" 
> >> > Sent: Thursday, June 1, 2023, 10:45:12 AM
> >> > Subject: Re: [DISCUSS] FLIP-294: Support Customized Job Meta Data Listener
> >> >
> >> > Thanks yuxia, you're right and I'll add the new database to
> >> > AlterDatabaseEvent.
> >> >
> >> > I added `ignoreIfNotExists` for AlterDatabaseEvent because it is a
> >> > parameter in the `Catalog.alterDatabase` method. Although this value
> is
> >> > currently always false in `AlterDatabaseOperation`, I think it's
> better
> >> > to stay consistent with `Catalog.alterDatabase`. What do you think?
> >> >
> >> > Best,
> >> > Shammon FY
> >> >
> >> > On Thu, Jun 1, 2023 at 10:25 AM yuxia 
> >> wrote:
> >> >
> >> > > Hi, Shammon.
> >> > > I mean do we need to contain the new database after alter in
> >> > > AlterDatabaseEvent?  So that the listener can know what has been
> >> modified
> >> > > for the database. Or the listener don't need to care about the
> actual
> >> > > modification.
> >> > > Also, I'm wondering whether AlterDatabaseEvent needs to include the
> >> > > ignoreIfNotExists method since the alter database operation doesn't have
> >> such
> >> > > syntax like 'alter database if exists xxx'.
> >> > >
> >> > > Best regards,
> >> > > Yuxia
> >> > >
> >> > > ----- Original Message -----
> >> > > From: "Shammon FY" 
> >> > > To: "dev" 
> >> > > Sent: Wednesday, May 31, 2023, 2:55:26 PM
> >> > > Subject: Re: [DISCUSS] FLIP-294: Support Customized Job Meta Data
> Listener
> >> > >
> >> > > Hi yuxia
> >> > >
> >> > > Thanks for your input. The `AlterDatabaseEvent` extends
> >> > > `DatabaseModificationEvent` which has the original database.
> >> > >
> >> > > Best,
> >> > > Shammon FY
> >> > >
> >> > > On Wed, May 31, 2023 at 2:24 PM yuxia 
> >> > wrote:
> >> > >
> >> > > > Thanks Shammon for driving it.
> >> > > > The FLIP generally looks good to me. I only have one question.
> >> > > > WRT AlterDatabaseEvent, IIUC, it'll contain the origin database
> name
> >> > and
> >> > > > the new CatalogDatabase after modified. Is it enough only pass the
> >> > origin
> >> > > > database name? Will it be better to contain the origin
> >> CatalogDatabase
> >> > so
> >> > > > that listener have ways to know what changes?
> >> > > >
> >> > > > Best regards,
> >> > > > Yuxia
> >> > > >
> >> > > > ----- Original Message -----
> >> > > > From: "ron9 liu" 
> >> > > > To: "dev" 
> >> > > > Sent: Wednesday, May 31, 2023, 11:36:04 AM
> >> > > > Subject: Re: [DISCUSS] FLIP-294: Support Customized Job Meta Data
> >> Listener
> >> > > >
> >> > > > Hi, Shammon
> >> > > >
> >> > > > Thanks for driving this FLIP, It will enforce the Flink metadata
> >> > > capability
> >> > > > from the platform produce perspective. The overall design looks
> >> good to
> >> > > me,
> >> > > > I just have some small question:
> >> > > > 1. Regarding CatalogModificationListenerFactory#createListener
> >> method,
> >> > I
> >> > > > think it would be better to pass Context as its parameter instead
> of
> >> > two
> >> > > > specific Objects. In this way, we can easily extend it in the
> future
> >> and
> >> > > > there will be no compatibility problems. Refer to
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.co

Re: [DISCUSS] FLIP-294: Support Customized Job Meta Data Listener

2023-06-06 Thread Jing Ge
Hi Shammon,

Thanks for the clarification. Just out of curiosity, if the context is not
part of the event, why should it be the input parameter of each onEvent
call?

Best regards,
Jing

On Tue, Jun 6, 2023 at 11:58 AM Leonard Xu  wrote:

> Thanks Shammon for the timely update, the updated FLIP looks good to me.
>
> Hope to see the vote thread and following FLIP-314 discussion thread.
>
> Best,
> Leonard
>
> > On Jun 6, 2023, at 5:04 PM, Shammon FY  wrote:
> >
> > Hi,
> >
> > Thanks for all the feedback.
> >
> > For @Jing Ge,
> > I forgot to update the demo code in the FLIP; the method is
> > `onEvent(CatalogModificationEvent, CatalogModificationContext)` and there
> > is no `onEvent(CatalogModificationEvent)`. I have updated the code.
> Context
> > contains some additional information that is not part of an Event, but
> > needs to be used in the listener, so we separate it from the event.
> >
> > For @Panagiotis,
> > I think `ioExecutor` makes sense to me and I have added it in
> > `CatalogModificationContext`, thanks
> >
> > For @Leonard,
> > Thanks for your input.
> > 1. I have updated `CatalogModificationContext` as an interface, as well
> as
> > Context in CatalogModificationListenerFactory
> > 2. Configuration sounds good to me, I have updated the method name and
> > getConfiguration in Context
> >
> > For @David,
> > Yes, you're right. The listener will only be used on the client side and
> > won't introduce a new code path for running per-job/per-session jobs. The
> > listener will be created in `TableEnvironment` and `SqlGateway` which
> can a
> > `CatalogManager` with the listener.
> >
> >
> > Best,
> > Shammon FY
> >
> >
> > On Tue, Jun 6, 2023 at 3:33 PM David Morávek 
> > wrote:
> >
> >> Hi,
> >>
> >> Thanks for the FLIP! Data lineage is an important problem to tackle.
> >>
> >> Can you please expand on how this is planned to be wired into the
> >> JobManager? As I understand, the listeners will be configured globally
> (per
> >> cluster), so this won't introduce a new code path for running per-job /
> >> per-session user code. Is that correct?
> >>
> >> Best,
> >> D
> >>
> >> On Tue, Jun 6, 2023 at 9:17 AM Leonard Xu  wrote:
> >>
> >>> Thanks Shammon for driving this FLIP forward, I’ve several comments
> about
> >>> the updated FLIP.
> >>>
> >>> 1. CatalogModificationContext is introduced as a class instead of an
> >>> interface, is it a typo?
> >>>
> >>> 2. The FLIP defined multiple  Map config();  methods in
> >>> some Context classes, Could we use  Configuration
> >> getConfiguration();Class
> >>> org.apache.flink.configuration.Configuration is recommend as it’s
> public
> >>> API and offers more useful methods as well.
> >>>
> >>> 3. The Context of CatalogModificationListenerFactory should be an
> >>> interface too, and getUserClassLoder()
> >>> would be more aligned with flink’s naming style.
> >>>
> >>>
> >>> Best,
> >>> Leonard
> >>>
> >>>> On May 26, 2023, at 4:08 PM, Shammon FY  wrote:
> >>>>
> >>>> Hi devs,
> >>>>
> >>>> We would like to bring up a discussion about FLIP-294: Support
> >> Customized
> >>>> Job Meta Data Listener[1]. We have had several discussions with Jark
> >> Wu,
> >>>> Leonard Xu, Dong Lin, Qingsheng Ren and Poorvank about the functions
> >> and
> >>>> interfaces, and thanks for their valuable advice.
> >>>> The overall job and connector information is divided into metadata and
> >>>> lineage, this FLIP focuses on metadata and lineage will be discussed
> in
> >>>> another FLIP in the future. In this FLIP we want to add a customized
> >>>> listener in Flink to report catalog modifications to external metadata
> >>>> systems such as datahub[2] or atlas[3]. Users can view the specific
> >>>> information of connectors such as source and sink for Flink jobs in
> >> these
> >>>> systems, including fields, watermarks, partitions, etc.
> >>>>
> >>>> Looking forward to hearing from you, thanks.
> >>>>
> >>>>
> >>>> [1]
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Job+Meta+Data+Listener
> >>>> [2] https://datahub.io/
> >>>> [3] https://atlas.apache.org/#/
> >>>
> >>>
> >>
>
>


Re: [DISCUSS] FLIP-246: Multi Cluster Kafka Source

2023-06-06 Thread Jing Ge
Hi Mason,

It is a very practical feature that many users are keen to use. Thanks to
the previous discussion, the FLIP now looks informative. Thanks for your
proposal. One small suggestion: the attached images are too small to read
unless we click and enlarge them. Besides that, it is difficult to read the
text on the current sequence diagram because it has a transparent
background. Would you like to replace it with a white background?

Exactly-once is one of the key features of the Kafka connector. I have the
same concern as Qingsheng. Since you have answered questions about it
previously, would you like to create an extra section in your FLIP that
explicitly describes the scenarios in which exactly-once is supported and
those in which it is not?

Best regards,
Jing

On Mon, Jun 5, 2023 at 11:41 PM Mason Chen  wrote:

> Hi all,
>
> I'm working on FLIP-246 again, for the Multi Cluster Kafka Source
> contribution. The document has been updated with some more context about
> how it can solve the Kafka topic removal scenario and a sequence diagram to
> illustrate how the components interact.
>
> Looking forward to any feedback!
>
> Best,
> Mason
>
> On Wed, Oct 12, 2022 at 11:12 PM Mason Chen 
> wrote:
>
> > Hi Ryan,
> >
> > Thanks for the additional context! Yes, the offset initializer would need
> > to take a cluster as a parameter and the MultiClusterKafkaSourceSplit can
> > be exposed in an initializer.
> >
> > Best,
> > Mason
> >
> > On Thu, Oct 6, 2022 at 11:00 AM Ryan van Huuksloot <
> > ryan.vanhuuksl...@shopify.com> wrote:
> >
> >> Hi Mason,
> >>
> >> Thanks for the clarification! In regards to the addition to the
> >> OffsetInitializer of this API - this would be an awesome addition and I
> >> think this entire FLIP would be a great addition to Flink.
> >>
> >> To provide more context as to why we need particular offsets, we use
> >> Hybrid Source to currently backfill from buckets prior to reading from
> >> Kafka. We have a service that will tell us what offset has last been
> loaded
> >> into said bucket which we will use to initialize the KafkaSource
> >> OffsetsInitializer. We couldn't use a timestamp here and the offset
> would
> >> be different for each Cluster.
> >>
> >> In pseudocode, we'd want the ability to do something like this with
> >> HybridSources - if this is possible.
> >>
> >> ```scala
> >> val offsetsMetadata: Map[TopicPartition, Long] = // Get current offsets
> >> from OffsetReaderService
> >> val multiClusterArchiveSource: MultiBucketFileSource[T] = // Data is
> read
> >> from different buckets (multiple topics)
> >> val multiClusterKafkaSource: MultiClusterKafkaSource[T] =
> >> MultiClusterKafkaSource.builder()
> >>   .setKafkaMetadataService(new KafkaMetadataServiceImpl())
> >>   .setStreamIds(List.of("my-stream-1", "my-stream-2"))
> >>   .setGroupId("myConsumerGroup")
> >>
> >>
> .setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
> >>   .setStartingOffsets(offsetsMetadata)
> >>   .setProperties(properties)
> >>   .build()
> >> val source =
> >>
> HybridSource.builder(multiClusterArchiveSource).addSource(multiClusterKafkaSource).build()
> >> ```
> >>
> >> Few notes:
> >> - TopicPartition won't work because the topic may be the same name as
> >> this is something that is supported IIRC
> >> - I chose to pass a map into starting offsets just for demonstrative
> >> purposes, I would be fine with whatever data structure would work best
> >>
> >> Ryan van Huuksloot
> >> Data Developer | Production Engineering | Streaming Capabilities
> >>
> >>
> >> On Mon, Oct 3, 2022 at 11:29 PM Mason Chen 
> >> wrote:
> >>
> >>> Hi Ryan,
> >>>
> >>> Just copying your message over to the email chain.
> >>>
> >>> Hi Mason,
>  First off, thanks for putting this FLIP together! Sorry for the delay.
>  Full disclosure Mason and I chatted a little bit at Flink Forward
> 2022 but
>  I have tried to capture the questions I had for him then.
>  I'll start the conversation with a few questions:
>  1. The concept of streamIds is not clear to me in the proposal and
>  could use some more information. If I understand correctly, they will
> be
>  used in the MetadataService to link KafkaClusters to ones you want to
> use?
>  If you assign stream ids using `setStreamIds`, how can you dynamically
>  increase the number of clusters you consume if the list of StreamIds
> is
>  static? I am basing this off of your example .setStreamIds(List.of(
>  "my-stream-1", "my-stream-2")) so I could be off base with my
>  assumption. If you don't mind clearing up the intention, that would be
>  great!
>  2. How would offsets work if you wanted to use this
>  MultiClusterKafkaSource with a file based backfill? In the case I am
>  thinking of, you have a bucket backed archive of Kafka data per
> cluster.
>  and you want to pick up from

Re: [DISCUSS] FLIP-315: Support Operator Fusion Codegen for Flink SQL

2023-06-07 Thread Jing Ge
Hi Ron,

Thanks for raising the proposal. It is a very attractive idea! Since the
FLIP is a relatively complex one that references three papers and a design
doc, it deserves more time for discussion to make sure everyone is on the
same page. I have a NIT question which will not block your voting process.
Previously, it took the community a lot of effort to make Flink more or
less Scala-free [1]. Since the code base of the table module is too big to
port to Java, all Scala code there has been hidden instead. Furthermore,
there are ongoing efforts to remove Scala code from Flink. As you can see,
the community tries to limit (i.e. get rid of) Scala code as much as
possible. I was wondering if it is possible for you to implement the FLIP
with Scala-free code?

Best regards,
Jing

[1] https://flink.apache.org/2022/02/22/scala-free-in-one-fifteen/

On Wed, Jun 7, 2023 at 5:33 PM Aitozi  wrote:

> Hi Ron:
> Sorry for the late reply after the voting process. I just want to ask
>
> > Traverse the ExecNode DAG and create a FusionExecNode  for physical
> operators that can be fused together.
> which kind of operators can be fused together ? are the operators in an
> operator chain? Is this optimization aligned to spark's whole stage codegen
> ?
>
> > If any member operator does not support codegen, generate a
> Transformation DAG based on the topological relationship of member ExecNode
>  and jump to step 8.
> step8: Generate a FusionTransformation, setting the parallelism and managed
> memory for the fused operator.
>
> does the "support codegen" means fusion codegen? but why we generate a
> FusionTransformation when the member operator does not support codegen, IMO
> it should
> fallback to the current behavior.
>
> In the end, I share the same idea with Lincoln about performance benchmark.
> Currently flink community's flink-benchmark only covers like schedule,
> state, datastream operator's performance.
> A good benchmark harness for sql operator will benefit the sql optimizer
> topic and observation
>
> Thanks,
> Atiozi.
>
>
> On Tue, Jun 6, 2023 at 7:30 PM liu ron  wrote:
>
> > Hi dev
> >
> > Thanks for all the feedback, it seems that here are no more comments, I
> > will
> > start a vote on FLIP-315 [1] later. Thanks again.
> >
> > [1]:
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-315+Support+Operator+Fusion+Codegen+for+Flink+SQL
> >
> > Best,
> > Ron
> >
> > On Mon, Jun 5, 2023 at 4:01 PM liu ron  wrote:
> >
> > > Hi, Yun, Jinsong, Benchao
> > >
> > > Thanks for your valuable input about this FLIP.
> > >
> > > First of all, let me emphasize that from the technical implementation
> > > point of view, this design is feasible in both stream and batch
> > scenarios,
> > > so I consider both stream and batch mode in FLIP. In the stream
> scenario,
> > > for stateful operator, according to our business experience, basically
> > the
> > > bottleneck is on the state access, so the optimization effect of OFCG
> for
> > > the stream will not be particularly obvious, so we will not give
> priority
> > > to support it currently. On the contrary, in the batch scenario, where
> > CPU
> > > is the bottleneck, this optimization is gainful.
> > >
> > > Taking the above into account, we are able to support both stream and
> > > batch mode optimization in this design, but we will give priority to
> > > supporting batch operators. As benchao said, when we find a suitable
> > > streaming business scenario in the future, we can consider doing this
> > > optimization. Back to Yun issue, the design will break state
> > compatibility
> > > in stream mode as[1] and the version upgrade will not support this
> OFCG.
> > As
> > > mentioned earlier, we will not support this feature in stream mode in
> the
> > > short term.
> > >
> > > Also thanks to Benchao's suggestion, I will state the current goal of
> > that
> > > optimization in the FLIP, scoped to batch mode.
> > >
> > > Best,
> > > Ron
> > >
> > > On Mon, Jun 5, 2023 at 3:04 PM liu ron  wrote:
> > >
> > >> Hi, Lincoln
> > >>
> > >> Thanks for your appreciation of this design. Regarding your question:
> > >>
> > >> > do we consider adding a benchmark for the operators to intuitively
> > >> understand the improvement brought by each improvement?
> > >>
> > >> I think it makes sense to add a benchmark, Spark also has this
> benchmark
> > >> framework. But I think it is another story to introduce a benchmark
> > >> framework in Flink, we need to start a new discussion to this work.
> > >>
> > >> > for the implementation plan, mentioned in the FLIP that 1.18 will
> > >> support Calc, HashJoin and HashAgg, then what will be the next step?
> and
> > >> which operators do we ultimately expect to cover (all or specific
> ones)?
> > >>
> > >> Our ultimate goal is to support all operators in batch mode, but we
> > >> prioritize them according to their usage. Operators like Calc,
> HashJoin,
> > >> HashAgg, etc. are more commonly used, so we will support them first.
> > Later
> > >> we support the rest of the operators step

Re: [VOTE] FLIP-315: Support Operator Fusion Codegen for Flink SQL

2023-06-07 Thread Jing Ge
+1

Best Regards,
Jing

On Wed, Jun 7, 2023 at 10:52 AM weijie guo 
wrote:

> +1 (binding)
>
> Best regards,
>
> Weijie
>
>
> On Wed, Jun 7, 2023 at 3:59 PM Jingsong Li  wrote:
>
> > +1
> >
> > On Wed, Jun 7, 2023 at 3:03 PM Benchao Li  wrote:
> > >
> > > +1, binding
> > >
> > > On Wed, Jun 7, 2023 at 2:44 PM Jark Wu  wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > > On Jun 7, 2023, at 2:20 PM, liu ron  wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > Thanks for all the feedback about FLIP-315: Support Operator Fusion
> > > > Codegen
> > > > > for Flink SQL[1].
> > > > > [2] is the discussion thread.
> > > > >
> > > > > I'd like to start a vote for it. The vote will be open for at least
> > 72
> > > > > hours (until June 12th, 12:00AM GMT) unless there is an objection
> or
> > an
> > > > > insufficient number of votes.
> > > > >
> > > > > [1]:
> > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-315+Support+Operator+Fusion+Codegen+for+Flink+SQL
> > > > > [2]:
> > https://lists.apache.org/thread/9cnqhsld4nzdr77s2fwf00o9cb2g9fmw
> > > > >
> > > > > Best,
> > > > > Ron
> > > >
> > > >
> > >
> > > --
> > >
> > > Best,
> > > Benchao Li
> >
>


Re: [DISCUSS] FLIP-294: Support Customized Job Meta Data Listener

2023-06-08 Thread Jing Ge
Hi Shammon,

If we take a look at the JDK Event design as a reference, we can even add
an Object into the event [1]. Back to the CatalogModificationEvent:
everything related to the event could be defined in the Event. If we want
to group some information into the Context, we could also consider adding
the CatalogModificationContext into the Event and making the onEvent()
method cleaner, with only one input parameter, CatalogModificationEvent,
since CatalogModificationListener is the interface users will work with
most often. Just my two cents.
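
For illustration, a minimal sketch of that alternative. Only the type names
are taken from the FLIP; the single-parameter listener variant is purely my
hypothetical suggestion, not what the FLIP currently proposes:

```java
// Hypothetical variant: the event carries its context, like java.util.EventObject.
public interface CatalogModificationListener {
    void onEvent(CatalogModificationEvent event);
}

public abstract class CatalogModificationEvent {
    private final CatalogModificationContext context;

    protected CatalogModificationEvent(CatalogModificationContext context) {
        this.context = context;
    }

    /** Additional information, e.g. the catalog factory identifier, travels with the event. */
    public CatalogModificationContext context() {
        return context;
    }
}
```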

Best regards,
Jing

[1]
http://www.java2s.com/example/java-src/pkg/java/util/eventobject-85298.html

On Thu, Jun 8, 2023 at 7:50 AM Shammon FY  wrote:

> Hi,
>
> To @Jing Ge
> > Thanks for the clarification. Just out of curiosity, if the context is
> not part of the event, why should it be the input parameter of each onEvent
> call?
>
> I think it's quite strange to put some information in an Event, such as a
> factory identifier for catalog, but they will be used by the listener.  I
> place it in the context class and I think it is more suitable than directly
> placing it in the event class.
>
> To @Mason
> > 1. I'm also curious about default implementations. Would atlas/datahub be
> supported by default?
>
> We won't do that and external systems such as atlas/datahub need to
> implement the listener themselves.
>
> > 2. The FLIP title is confusing to me, especially in distinguishing it
> from FLIP-314. Would a better FLIP title be "Support Catalog Metadata
> Listener" or something alike?
>
> Thanks, I think  "Support Catalog Modification Listener" will be
> more suitable, I'll update the title to it.
>
>
> Best,
> Shammon FY
>
>
> On Thu, Jun 8, 2023 at 12:25 PM Mason Chen  wrote:
>
> > Hi Shammon,
> >
> > FLIP generally looks good and I'm excited to see this feature.
> >
> > 1. I'm also curious about default implementations. Would atlas/datahub be
> > supported by default?
> > 2. The FLIP title is confusing to me, especially in distinguishing it
> from
> > FLIP-314. Would a better FLIP title be "Support Catalog Metadata
> Listener"
> > or something alike?
> >
> > Best,
> > Mason
> >
> > On Tue, Jun 6, 2023 at 3:33 AM Jing Ge 
> wrote:
> >
> > > Hi Shammon,
> > >
> > > Thanks for the clarification. Just out of curiosity, if the context is
> > not
> > > part of the event, why should it be the input parameter of each onEvent
> > > call?
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Tue, Jun 6, 2023 at 11:58 AM Leonard Xu  wrote:
> > >
> > > > Thanks Shammon for the timely update, the updated FLIP looks good to
> > me.
> > > >
> > > > Hope to see the vote thread and following FLIP-314 discussion thread.
> > > >
> > > > Best,
> > > > Leonard
> > > >
> > > > > On Jun 6, 2023, at 5:04 PM, Shammon FY  wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > Thanks for all the feedback.
> > > > >
> > > > > For @Jing Ge,
> > > > > I forget to update the demo code in the FLIP, the method is
> > > > > `onEvent(CatalogModificationEvent, CatalogModificationContext)` and
> > > there
> > > > > is no `onEvent(CatalogModificationEvent)`. I have updated the code.
> > > > Context
> > > > > contains some additional information that is not part of an Event,
> > but
> > > > > needs to be used in the listener, so we separate it from the event.
> > > > >
> > > > > For @Panagiotis,
> > > > > I think `ioExecutor` make sense to me and I have added it in
> > > > > `ContextModificationContext`, thanks
> > > > >
> > > > > For @Leonard,
> > > > > Thanks for your input.
> > > > > 1. I have updated `CatalogModificationContext` as an interface, as
> > well
> > > > as
> > > > > Context in CatalogModificationListenerFactory
> > > > > 2. Configuration sounds good to me, I have updated the method name
> > and
> > > > > getConfiguration in Context
> > > > >
> > > > > For @David,
> > > > > Yes, you're right. The listener will only be used on the client
> side
> > > and
> > > > > won't introduce a new code path for running per-job/per-session
> jobs.
> > > The
>

Re: [DISCUSS] FLIP-315: Support Operator Fusion Codegen for Flink SQL

2023-06-08 Thread Jing Ge
Hi Ron,

Thanks for sharing the insight. Agree that it is not doable to rewrite the
entire planner module with Java. That was the reason why it has been hidden
instead of replaced. I thought, since the community decided to walk away
from scala, we should at least not add any more new scala code. According
to your suggestion, it is not the fact. I think the community should
reconsider how to handle scala, since the more features we are developing
in those areas the more scala code we will have, which makes it even harder
(impossible) to migrate to java. This is beyond the scope of this
discussion. I will start a new thread to address it.

Best regards,
Jing


On Thu, Jun 8, 2023 at 5:20 AM liu ron  wrote:

> Hi, Jing
>
> Thanks for your valuable input about scala free.
>
> Firstly, to reply to your question: using Java to implement codegen is
> possible, but we need to utilize some tools. The first alternative is to
> update our JDK version to 13, which provides the text block feature[1] that
> makes string formatting easier and improves multi-line String readability
> and writability. However, we won't update the JDK version to 13 in the
> short-term future. The second alternative is to use a third-party library
> such as FreeMarker or StringTemplate, but this is not easy work: we would
> need to introduce an extra dependency in the table planner, which makes our
> implementation more complicated.
>
> We use a lot of scala code in the planner module, one of the main purposes
> is that codegen is more friendly, and many of the operators are also
> implemented through codegen. In the foreseeable future, we do not have the
> time and manpower to remove the scala code from the planner module, so
> scala-free is unlikely. From the point of view of development friendliness
> and development cost, scala is currently a relatively better solution for
> codegen. Suppose we need to completely rewrite the planner module in java
> in the future, I think it is better to consider what tools are used to
> support codegen in a unified way at that time, and I can't give a suitable
> tool at the moment.
>
> In summary, I don't think it is feasible to implement my FLIP with
> scala-free at this time.
>
> [1]: https://openjdk.org/jeps/378
>
> Best,
> Ron
>
>
> On Thu, Jun 8, 2023 at 10:51 AM liu ron  wrote:
>
> > Hi, Atiozi
> >
> > Thanks for your feedback.
> >
> > > Traverse the ExecNode DAG and create a FusionExecNode  for physical
> > operators that can be fused together.
> > which kind of operators can be fused together ? are the operators in an
> > operator chain? Is this optimization aligned to spark's whole stage
> codegen
> > ?
> > In theory, all kinds of operators can be fused together; our final goal
> > is to support all operators in batch mode, and OperatorChain is just one
> > case. Because this work effort is relatively large, we need to complete
> > it step by step. Our OFCG not only achieves the ability of Spark's
> > whole-stage codegen, but also does better than it.
> >
> > > does the "support codegen" means fusion codegen? but why we generate a
> > FusionTransformation when the member operator does not support codegen,
> IMO
> > it should
> > fallback to the current behavior.
> >
> > yes, it means the fusion codegen. In FLIP, I propose two operator fusion
> > mechanisms, one is like OperatorChain for single input operator, another
> is
> > MultipleInput fusion. For the former, our design mechanism is to fuse all
> > operators together at the ExecNode layer only if they all support fusion
> > codegen, or else go over the default OperatorChain. For the latter, in
> > order not to break the existing MultipleInput optimization purpose, so
> when
> > there are member operators that do not support fusion codegen,  we will
> > fall back to the current behavior[1], which means that a
> > FusionTransformation is created. here FusionTransformation is just a
> > surrogate for MultipleInput case, it actually means
> > MultipleInputTransformation, which fuses multiple physical operators.
> > Sorry, the description in the flow is not very clear and caused your
> > confusion.
> >
> > > In the end, I share the same idea with Lincoln about performance
> > benchmark.
> > Currently flink community's flink-benchmark only covers like schedule,
> > state, datastream operator's performance.
> > A good benchmark harness for sql operator will benefit the sql optimizer
> > topic and observation
> >
> > For the performance benchmark, I agree with you. As I stated earlier, I
> > think this is a new scope of work, we sh

Call for Presentations: Flink Forward Seattle 2023

2023-06-08 Thread Jing Ge
Dear Flink developers & users,

We hope this email finds you well. We are excited to announce the Call for
Presentations for the upcoming Flink Forward Seattle 2023, the premier
event dedicated to Apache Flink and stream processing technologies. As a
prominent figure in the field, we invite you to submit your innovative
research, insightful experiences, and cutting-edge use cases for
consideration as a speaker at the conference.

Flink Forward Conference 2023 Details:
Date: November 6-7 (training), November 8 (conference)
Location: Seattle United States

Flink Forward is a conference dedicated to the Apache Flink® community. In
2023 we shall have a full conference day following a two-day training
session. The conference gathers an international audience of CTOs/CIOs,
developers, data architects, data scientists, Apache Flink® core
committers, and the stream processing community, to share experiences,
exchange ideas and knowledge, and receive hands-on training sessions led by
Flink experts. We are seeking compelling presentations and
thought-provoking talks that cover a broad range of topics related to
Apache Flink, including but not limited to:

Flink architecture and internals
Flink performance optimization
Advanced Flink features and enhancements
Real-world use cases and success stories
Flink ecosystem and integrations
Stream processing at scale
Best practices for Flink application development

If you have an inspiring story, valuable insights, real-world application,
research breakthroughs, use case, best practice, or compelling vision of
the future for Flink, we encourage you to present it to a highly skilled
and enthusiastic community. We welcome submissions from both industry
professionals and academic researchers.

To submit your proposal, please visit the Flink Forward Conference website
at https://www.flink-forward.org/seattle-2023/call-for-presentations. The
submission form will require you to provide an abstract of your talk, along
with a brief biography and any supporting materials. The deadline for
submissions is July 12th 11:59 pm PDT.

We believe your contribution will greatly enrich the Flink Forward
Conference and provide invaluable insights to our attendees. This is an
excellent opportunity to connect with a diverse community of Flink
enthusiasts, network with industry experts, and gain recognition for your
expertise. We look forward to receiving your submission and welcoming you
as a speaker at the Flink Forward Conference.

Thank you for your time and consideration.

Best regards,

-- 

Jing Ge | Head of Engineering

j...@ververica.com

<https://www.ververica.com/>

Follow us @VervericaData

--

Join Flink Forward <https://flink-forward.org/> - The Apache Flink
Conference - Tickets on SALE now!
<https://eu.eventscloud.com/ereg/newreg.php?eventid=200259741&#>

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Chausseestrasse 20, 10115 Berlin, Germany

--

Ververica GmbH

Registered at Amtsgericht Charlottenburg: HRB 158244 B

Managing Directors: Alexander Walden, Karl Anton Wehner, Yip Park Tung
Jason, Jinwei (Kevin) Zhang


Re: [DISCUSS] FLIP-313 Add support of User Defined AsyncTableFunction

2023-06-08 Thread Jing Ge
 process as an async RPC call process.
> > > >
> > > > Let's see how we can perform an async RPC call with lateral join:
> > > >
> > > > (1) Implement an AsyncTableFunction with RPC call logic.
> > > > (2) Run query with
> > > >
> > > > Create function f1 as '...' ;
> > > >
> > > > SELECT o.order_id, o.total, c.country, c.zip FROM Orders  lateral
> table
> > > > (f1(order_id)) as T(...);
> > > >
> > > > As you can see, the lateral join version is more simple and intuitive
> > to
> > > > users. Users do not have to wrap a
> > > > LookupTableSource for the purpose of using async udtf.
> > > >
> > > > In the end, We can also see the user defined async table function is
> an
> > > > enhancement of the current lateral table join
> > > > which only supports sync lateral join now.
> > > >
> > > > Best,
> > > > Aitozi.
> > > >
> > > >
> > > > On Fri, Jun 2, 2023 at 7:37 PM Jing Ge  wrote:
> > > >
> > > >> Hi Aitozi,
> > > >>
> > > >> Thanks for the update. Just out of curiosity, what is the difference
> > > >> between the RPC call or query you mentioned and the lookup in a very
> > > >> general way? Since Lateral join is used in the FLIP. Is there any
> > > special
> > > >> thought for that? Sorry for asking so many questions. The FLIP
> > contains
> > > >> limited information to understand the motivation.
> > > >>
> > > >> Best regards,
> > > >> Jing
> > > >>
> > > >> On Fri, Jun 2, 2023 at 3:48 AM Aitozi  wrote:
> > > >>
> > > >> > Hi Jing,
> > > >> > I have updated the proposed changes to the FLIP. IMO, lookup
> has
> > > its
> > > >> > clear
> > > >> > async call requirement is due to its IO heavy operator. In our
> > usage,
> > > >> sql
> > > >> > users have
> > > >> > logic to do some RPC call or query the third-party service which
> is
> > > >> also IO
> > > >> > intensive.
> > > >> > In these case, we'd like to leverage the async function to improve
> > the
> > > >> > throughput.
> > > >> >
> > > >> > Thanks,
> > > >> > Aitozi.
> > > >> >
> > > >> > On Thu, Jun 1, 2023 at 10:55 PM Jing Ge  wrote:
> > > >> >
> > > >> > > Hi Aitozi,
> > > >> > >
> > > >> > > Sorry for the late reply. Would you like to update the proposed
> > > >> changes
> > > >> > > with more details into the FLIP too?
> > > >> > > I got your point. It looks like a rational idea. However, since
> > > lookup
> > > >> > has
> > > >> > > its clear async call requirement, are there any real use cases
> > that
> > > >> > > need this change? This will help us understand the motivation.
> > After
> > > >> all,
> > > >> > > lateral join and temporal lookup join[1] are quite different.
> > > >> > >
> > > >> > > Best regards,
> > > >> > > Jing
> > > >> > >
> > > >> > >
> > > >> > > [1]
> > > >> > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://github.com/apache/flink/blob/d90a72da2fd601ca4e2a46700e91ec5b348de2ad/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/AsyncTableFunction.java#L54
> > > >> > >
> > > >> > > On Wed, May 31, 2023 at 8:53 AM Aitozi 
> > > wrote:
> > > >> > >
> > > >> > > > Hi Jing,
> > > >> > > > What do you think about it? Can we move forward this
> > feature?
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > > Aitozi.
> > > >> > > >
> > > >> > > > On Mon, May 29, 2023 at 9:56 AM Aitozi  wrote:
> > > >> > > >
> > > >> > > > > Hi Jing,
> > > >> > > > > > "Do you mean to support the AyncTableFunction beyon

Re: [DISCUSS] FLIP-313 Add support of User Defined AsyncTableFunction

2023-06-09 Thread Jing Ge
Hi Aitozi,

Thanks for the feedback. Looking forward to the performance tests.

Afaik, lookup returns one row for each key [1] [2]. Conceptually, the
lookup function is used to enrich column(s) from the dimension table. If,
for a given key, there is more than one row, there is no way to know which
row should be used to enrich the key.

[1]
https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L49
[2]
https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/TableFunction.java#L196
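
To make the enrichment semantics concrete, here is a minimal sketch of a
lookup function that returns at most one row per key. Only the LookupFunction
base class and its lookup(RowData) signature come from the linked code; the
class name and the queryDimensionTable helper are hypothetical:

```java
import java.io.IOException;
import java.util.Collection;
import java.util.Collections;

import org.apache.flink.table.data.RowData;
import org.apache.flink.table.functions.LookupFunction;

// Hypothetical example: enrich each key with at most one row from a dimension table.
public class CustomerLookupFunction extends LookupFunction {

    @Override
    public Collection<RowData> lookup(RowData keyRow) throws IOException {
        // keyRow carries all join keys for this lookup.
        RowData enriched = queryDimensionTable(keyRow); // hypothetical I/O call
        return enriched == null
                ? Collections.emptyList()
                : Collections.singletonList(enriched);
    }

    // Placeholder for the connector-specific dimension-table access (JDBC, REST, ...).
    private RowData queryDimensionTable(RowData keyRow) throws IOException {
        throw new UnsupportedOperationException("connector-specific");
    }
}
```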

Best regards,
Jing

On Fri, Jun 9, 2023 at 5:18 AM Aitozi  wrote:

> Hi Jing
> Thanks for your good questions. I have updated the example to the FLIP.
>
> > Only one row for each lookup
> lookup can also return multi rows, based on the query result. [1]
>
> [1]:
>
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L56
>
> > If we use async calls with lateral join, my gut feeling is
> that we might have many more async calls than lookup join. I am not really
> sure if we will be facing potential issues in this case or not.
>
> IMO, the work pattern is similar to the lookup function, for each row from
> the left table,
> it will evaluate the eval method once, so the async call numbers will not
> change.
> and the maximum calls in flight is limited by the Async operators buffer
> capacity
> which will be controlled by the option.
>
> BTW, for the naming of these option, I updated the FLIP about this you can
> refer to
> the section of "ConfigOption" and "Rejected Alternatives"
>
> In the end, for the performance evaluation, I'd like to do some tests and
> will update it to the FLIP doc
>
> Thanks,
> Aitozi.
>
>
> On Fri, Jun 9, 2023 at 7:23 AM Jing Ge  wrote:
>
> > Hi Aitozi,
> >
> > Thanks for the clarification. The code example looks interesting. I would
> > suggest adding them into the FLIP. The description with code examples
> will
> > help readers understand the motivation and how to use it. Afaiac, it is a
> > valid feature for Flink users.
> >
> > As we knew, lookup join is based on temporal join, i.e. FOR SYSTEM_TIME
> AS
> > OF which is also used in your code example. Temporal join performs the
> > lookup based on the processing time match. Only one row for each
> > lookup(afaiu, I need to check the source code to double confirm) will
> > return for further enrichment. One the other hand, lateral join will have
> > sub-queries correlated with every individual value of the reference table
> > from the preceding part of the query and each sub query will return
> > multiple rows. If we use async calls with lateral join, my gut feeling is
> > that we might have many more async calls than lookup join. I am not
> really
> > sure if we will be facing potential issues in this case or not. Possible
> > issues I can think of now e.g. too many PRC calls, too many async calls
> > processing, the sub query will return a table which might be (too) big,
> and
> > might cause performance issues. I would suggest preparing some use cases
> > and running some performance tests to check it. These are my concerns
> about
> > using async calls with lateral join and I'd like to share with you, happy
> > to discuss with you and hear different opinions, hopefully the
> > discussion could help me understand it more deeply. Please correct me if
> I
> > am wrong.
> >
> > Best regards,
> > Jing
> >
> >
> > On Thu, Jun 8, 2023 at 7:22 AM Aitozi  wrote:
> >
> > > Hi Mason,
> > > Thanks for your input. I think if we support the user defined async
> > > table function,
> > > user will be able to use it to hold a batch data then handle it at one
> > time
> > > in the customized function.
> > >
> > > AsyncSink is meant for the sink operator. I have not figure out how to
> > > integrate in this case.
> > >
> > > Thanks,
> > > Atiozi.
> > >
> > >
> > > On Thu, Jun 8, 2023 at 12:40 PM Mason Chen  wrote:
> > >
> > > > Hi Aitozi,
> > > >
> > > > I think it makes sense to make it easier for SQL users to make RPCs.
> Do
> > > you
> > > > think your proposal can extend to the ability to batch data for the
> > RPC?
> > > > This is also another common strategy to increase throughp

Re: [DISCUSS] FLIP-313 Add support of User Defined AsyncTableFunction

2023-06-09 Thread Jing Ge
Hi Aitozi,

The keyRow used in this case contains all keys[1].

Best regards,
Jing

[1]
https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L49


On Fri, Jun 9, 2023 at 3:42 PM Aitozi  wrote:

> Hi Jing,
>
>  The performance test is added to the FLIP.
>
>  As I know, The lookup join can return multi rows, it depends on
> whether  the join key
> is the primary key of the external database or not. The `lookup` [1] will
> return a collection of
> joined result, and each of them will be collected
>
>
> [1]:
>
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L52
>
>
> Thanks,
> Aitozi.
>
> On Fri, Jun 9, 2023 at 5:05 PM Jing Ge  wrote:
>
> > Hi Aitozi,
> >
> > Thanks for the feedback. Looking forward to the performance tests.
> >
> > Afaik, lookup returns one row for each key [1] [2]. Conceptually, the
> > lookup function is used to enrich column(s) from the dimension table. If,
> > for the given key, there will be more than one row, there will be no way
> to
> > know which row will be used to enrich the key.
> >
> > [1]
> >
> >
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L49
> > [2]
> >
> >
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/TableFunction.java#L196
> >
> > Best regards,
> > Jing
> >
> > On Fri, Jun 9, 2023 at 5:18 AM Aitozi  wrote:
> >
> > > Hi Jing
> > > Thanks for your good questions. I have updated the example to the
> > FLIP.
> > >
> > > > Only one row for each lookup
> > > lookup can also return multi rows, based on the query result. [1]
> > >
> > > [1]:
> > >
> > >
> >
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L56
> > >
> > > > If we use async calls with lateral join, my gut feeling is
> > > that we might have many more async calls than lookup join. I am not
> > really
> > > sure if we will be facing potential issues in this case or not.
> > >
> > > IMO, the work pattern is similar to the lookup function, for each row
> > from
> > > the left table,
> > > it will evaluate the eval method once, so the async call numbers will
> not
> > > change.
> > > and the maximum calls in flight is limited by the Async operators
> buffer
> > > capacity
> > > which will be controlled by the option.
> > >
> > > BTW, for the naming of these option, I updated the FLIP about this you
> > can
> > > refer to
> > > the section of "ConfigOption" and "Rejected Alternatives"
> > >
> > > In the end, for the performance evaluation, I'd like to do some tests
> and
> > > will update it to the FLIP doc
> > >
> > > Thanks,
> > > Aitozi.
> > >
> > >
> > > On Fri, Jun 9, 2023 at 7:23 AM Jing Ge  wrote:
> > >
> > > > Hi Aitozi,
> > > >
> > > > Thanks for the clarification. The code example looks interesting. I
> > would
> > > > suggest adding them into the FLIP. The description with code examples
> > > will
> > > > help readers understand the motivation and how to use it. Afaiac, it
> > is a
> > > > valid feature for Flink users.
> > > >
> > > > As we knew, lookup join is based on temporal join, i.e. FOR
> SYSTEM_TIME
> > > AS
> > > > OF which is also used in your code example. Temporal join performs
> the
> > > > lookup based on the processing time match. Only one row for each
> > > > lookup(afaiu, I need to check the source code to double confirm) will
> > > > return for further enrichment. One the other hand, lateral join will
> > have
> > > > sub-queries correlated with every individual value of the reference
> > table
> > > > from the preceding part of the query and each sub query will return
> > > > multiple rows. If we use async calls with lateral join, my gut
> feeling
> > is
> > 

Re: [DISCUSS] FLIP-307: Flink connector Redshift

2023-06-09 Thread Jing Ge
Hi Samrat,

The FLIP looks good, thanks!

Best regards,
Jing


On Tue, Jun 6, 2023 at 8:16 PM Samrat Deb  wrote:

> Hi Jing,
>
> >  I would suggest adding that information into the
> FLIP.
>
> Updated now, please review the new version of the FLIP whenever you have time.
>
> > +1 Looking forward to your PR :-)
> I will request your review once I'm ready with the PR :-)
>
> Bests,
> Samrat
>
> On Tue, Jun 6, 2023 at 11:43 PM Samrat Deb  wrote:
>
> > Hi Martijn,
> >
> > > If I understand this correctly, the Redshift sink
> > would not be able to support exactly-once, is that correct?
> >
> > As I delve deeper into the study of Redshift's capabilities, I have
> > discovered that it does support "merge into" operations [1] and some
> > merge into examples [2].
> > This opens up the possibility of implementing exactly-once semantics with
> > the connector.
> > However, I believe it would be prudent to start with a more focused scope
> > for the initial phase of implementation and defer the exact-once support
> > for subsequent iterations.
> >
> > Before finalizing the approach, I would greatly appreciate your thoughts
> > and suggestions on this matter.
> > Should we prioritize the initial implementation without exactly-once
> > support, or would you advise incorporating it right from the start?
> > Your insights and experiences would be immensely valuable in making this
> > decision.
> >
> >
> > [1]
> >
> https://docs.aws.amazon.com/redshift/latest/dg/t_updating-inserting-using-staging-tables-.html
> > [2] https://docs.aws.amazon.com/redshift/latest/dg/merge-examples.html
> >
> > Bests,
> > Samrat
> >
> > On Mon, Jun 5, 2023 at 7:09 PM Jing Ge 
> wrote:
> >
> >> Hi Samrat,
> >>
> >> Thanks for the feedback. I would suggest adding that information into
> the
> >> FLIP.
> >>
> >> +1 Looking forward to your PR :-)
> >>
> >> Best regards,
> >> Jing
> >>
> >> On Sat, Jun 3, 2023 at 9:19 PM Samrat Deb 
> wrote:
> >>
> >> > Hi Jing Ge,
> >> >
> >> > >>> Do you already have any prototype? I'd like to join the reviews.
> >> > The prototype is in progress. I will raise the dedicated PR for review
> >> soon
> >> > also notify in this thread as well .
> >> >
> >> > >>> Will the Redshift connector provide additional features
> >> > beyond the mediator/wrapper of the jdbc connector?
> >> >
> >> > Here are the additional features that the Flink connector for AWS
> >> Redshift
> >> > can provide on top of using JDBC:
> >> >
> >> > 1. Integration with AWS Redshift Workload Management (WLM): AWS
> Redshift
> >> > allows you to configure WLM[1] to manage query prioritization and
> >> resource
> >> > allocation. The Flink connector for Redshift will be agnostic to the
> >> > configured WLM and utilize it for scaling in and out for the sink.
> This
> >> > means that the connector can leverage the WLM capabilities of Redshift
> >> to
> >> > optimize the execution of queries and allocate resources efficiently
> >> based
> >> > on your defined workload priorities.
> >> >
> >> > 2. Abstraction of AWS Redshift Quotas and Limits: AWS Redshift imposes
> >> > certain quotas and limits[2] on various aspects such as the number of
> >> > clusters, concurrent connections, queries per second, etc. The Flink
> >> > connector for Redshift will provide an abstraction layer for users,
> >> > allowing them to work with Redshift without having to worry about
> these
> >> > specific limits. The connector will handle the management of
> connections
> >> > and queries within the defined quotas and limits, abstracting away the
> >> > complexity and ensuring compliance with Redshift's restrictions.
> >> >
> >> > These features aim to simplify the integration of Flink with AWS
> >> Redshift,
> >> > providing optimized resource utilization and transparent handling of
> >> > Redshift-specific limitations.
> >> >
> >> > Bests,
> >> > Samrat
> >> >
> >> > [1]
> >> >
> >> >
> >>
> https://docs.aws.amazon.com/redshift/latest/dg/cm-c-implementing-workload-management.html
> >> > [2]
> >> >
> >> >
> >>

Re: [VOTE] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-06-12 Thread Jing Ge
+1 (binding). Thanks!

Best regards,
Jing

On Mon, Jun 12, 2023 at 12:01 PM yuxia  wrote:

> +1 (binding)
> Thanks Mang for driving it.
>
> Best regards,
> Yuxia
>
> - Original Message -
> From: "zhangmang1" 
> To: "dev" 
> Sent: Monday, June 12, 2023, 5:31:10 PM
> Subject: [VOTE] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS)
> statement
>
> Hi everyone,
>
> Thanks for all the feedback about FLIP-305: Support atomic for CREATE
> TABLE AS SELECT(CTAS) statement[1].
> [2] is the discussion thread.
>
> I'd like to start a vote for it. The vote will be open for at least 72
> hours (until June 15th, 10:00AM GMT) unless there is an objection or an
> insufficient number of votes.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-305%3A+Support+atomic+for+CREATE+TABLE+AS+SELECT%28CTAS%29+statement
> [2] https://lists.apache.org/thread/n6nsvbwhs5kwlj5kjgv24by2tk5mh9xd
>
>
>
>
>
>
>
> --
>
> Best regards,
> Mang Zhang
>


Re: [DISCUSS] FLIP-313 Add support of User Defined AsyncTableFunction

2023-06-12 Thread Jing Ge
Hi Aitozi,

Which key will be used for the lookup is not an issue; only one row is
required for each key in order to enrich it. True, it depends on the
implementation whether multiple rows or a single row will be returned for
each key. However, for the lookup & enrichment scenario, one row per key is
recommended; otherwise, as I mentioned previously, enrichment won't work.

I am a little bit concerned about returning a big table for each key, since
it will take the async call longer to return and needs more memory. The
performance tests should cover this scenario. This is not a blocking issue
for this FLIP.

Best regards,
Jing

On Sat, Jun 10, 2023 at 4:11 AM Aitozi  wrote:

> Hi Jing,
> I means the join key is not necessary to be the primary key or unique
> index of the database.
> In this situation, we may queried out multi rows for one join key. I think
> that's why the
> LookupFunction#lookup will return a collection of RowData.
>
> BTW, I think the behavior of lookup join will not affect the semantic of
> the async udtf.
> We use the Async TableFunction here and the table function can collect
> multiple rows.
>
> Thanks,
> Atiozi.
>
>
>
> On Sat, Jun 10, 2023 at 12:15 AM Jing Ge  wrote:
>
> > Hi Aitozi,
> >
> > The keyRow used in this case contains all keys[1].
> >
> > Best regards,
> > Jing
> >
> > [1]
> >
> >
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L49
> >
> >
> > On Fri, Jun 9, 2023 at 3:42 PM Aitozi  wrote:
> >
> > > Hi Jing,
> > >
> > >  The performance test is added to the FLIP.
> > >
> > >  As I know, The lookup join can return multi rows, it depends on
> > > whether  the join key
> > > is the primary key of the external database or not. The `lookup` [1]
> will
> > > return a collection of
> > > joined result, and each of them will be collected
> > >
> > >
> > > [1]:
> > >
> > >
> >
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L52
> > >
> > >
> > > Thanks,
> > > Aitozi.
> > >
> > > On Fri, Jun 9, 2023 at 5:05 PM Jing Ge  wrote:
> > >
> > > > Hi Aitozi,
> > > >
> > > > Thanks for the feedback. Looking forward to the performance tests.
> > > >
> > > > Afaik, lookup returns one row for each key [1] [2]. Conceptually, the
> > > > lookup function is used to enrich column(s) from the dimension table.
> > If,
> > > > for the given key, there will be more than one row, there will be no
> > way
> > > to
> > > > know which row will be used to enrich the key.
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L49
> > > > [2]
> > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/TableFunction.java#L196
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Fri, Jun 9, 2023 at 5:18 AM Aitozi  wrote:
> > > >
> > > > > Hi Jing
> > > > > Thanks for your good questions. I have updated the example to
> the
> > > > FLIP.
> > > > >
> > > > > > Only one row for each lookup
> > > > > lookup can also return multi rows, based on the query result. [1]
> > > > >
> > > > > [1]:
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L56
> > > > >
> > > > > > If we use async calls with lateral join, my gut feeling is
> > > > > that we might have many more async calls than lookup join. I am not
> > > > really
> > > > > sure if we will be facing potential issues in this case or not.
> > > > >
> > > > > IMO, the work pattern is similar to the lookup function, for each
> row

Re: [DISCUSS] FLIP-313 Add support of User Defined AsyncTableFunction

2023-06-12 Thread Jing Ge
Hi Aitozi,

Thanks for taking care of that part. I have no other concern.

Best regards,
Jing


On Mon, Jun 12, 2023 at 5:38 PM Aitozi  wrote:

> BTW, if there are no other blocking issues / comments, I would like to
> start a VOTE in another thread this Wednesday, 6.14.
>
> Thanks,
> Aitozi.
>
> On Mon, Jun 12, 2023 at 11:34 PM Aitozi  wrote:
>
> > Hi, Jing,
> > Thanks for your explanation. I get your point now.
> >
> > For the performance part, I think it's a good idea to run the case of
> > returning a big table; the memory consumption should be a point to take
> > care of, because in the ordered mode the head element in the buffer may
> > affect the total memory consumption.
> >
> >
> > Thanks,
> > Aitozi.
> >
> >
> >
> > On Mon, Jun 12, 2023 at 8:28 PM Jing Ge  wrote:
> >
> >> Hi Aitozi,
> >>
> >> Which key will be used for lookup is not an issue, only one row will be
> >> required for each key in order to enrich it. True, it depends on the
> >> implementation whether multiple rows or single row for each key will be
> >> returned. However, for the lookup & enrichment scenario, one row/key is
> >> recommended, otherwise, like I mentioned previously, enrichment won't
> >> work.
> >>
> >> I am a little bit concerned about returning a big table for each key,
> >> since
> >> it will take the async call longer to return and need more memory. The
> >> performance tests should cover this scenario. This is not a blocking
> issue
> >> for this FLIP.
> >>
> >> Best regards,
> >> Jing
> >>
> >> On Sat, Jun 10, 2023 at 4:11 AM Aitozi  wrote:
> >>
> >> > Hi Jing,
> >> > I means the join key is not necessary to be the primary key or
> >> unique
> >> > index of the database.
> >> > In this situation, we may queried out multi rows for one join key. I
> >> think
> >> > that's why the
> >> > LookupFunction#lookup will return a collection of RowData.
> >> >
> >> > BTW, I think the behavior of lookup join will not affect the semantic
> of
> >> > the async udtf.
> >> > We use the Async TableFunction here and the table function can collect
> >> > multiple rows.
> >> >
> >> > Thanks,
> >> > Atiozi.
> >> >
> >> >
> >> >
> >> > On Sat, Jun 10, 2023 at 12:15 AM Jing Ge  wrote:
> >> >
> >> > > Hi Aitozi,
> >> > >
> >> > > The keyRow used in this case contains all keys[1].
> >> > >
> >> > > Best regards,
> >> > > Jing
> >> > >
> >> > > [1]
> >> > >
> >> > >
> >> >
> >>
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L49
> >> > >
> >> > >
> >> > > On Fri, Jun 9, 2023 at 3:42 PM Aitozi  wrote:
> >> > >
> >> > > > Hi Jing,
> >> > > >
> >> > > >  The performance test is added to the FLIP.
> >> > > >
> >> > > >  As I know, The lookup join can return multi rows, it depends
> on
> >> > > > whether  the join key
> >> > > > is the primary key of the external database or not. The `lookup`
> [1]
> >> > will
> >> > > > return a collection of
> >> > > > joined result, and each of them will be collected
> >> > > >
> >> > > >
> >> > > > [1]:
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L52
> >> > > >
> >> > > >
> >> > > > Thanks,
> >> > > > Aitozi.
> >> > > >
> >> > > > On Fri, Jun 9, 2023 at 5:05 PM Jing Ge  wrote:
> >> > > >
> >> > > > > Hi Aitozi,
> >> > > > >
> >> > > > > Thanks for the feedback. Looking forward to the performance
> tests.
> >> > > > >
> >> > > > > Afaik, lookup returns one row for each key [1] [2].
> Conceptually,

Re: [DISCUSS] FLIP-321: Introduce an API deprecation process

2023-06-13 Thread Jing Ge
Hi Becket,

Thanks for driving this important topic! There were many previous
discussions that ended up waiting for a clear definition of the API
deprecation process. This FLIP will help a lot.

I'd like to ask some questions to understand your thoughts.

Speaking of the FLIP,

*"Always add a "Since X.X.X" comment to indicate when was a class /
interface / method marked as deprecated."*
 Could you describe it with a code example? Do you mean Java comments?
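
For example, is something like the following Javadoc convention what the
FLIP has in mind? (A hypothetical sketch; the interface names and the
release number are made up.)

```java
/**
 * @deprecated Since 1.18.0, use {@link NewSink} instead. Planned for removal
 *     after the migration period defined by this FLIP.
 */
@Deprecated
@PublicEvolving
public interface LegacySink {
    // ...
}
```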

*"At least 1 patch release for the affected minor release for
Experimental APIs"*
The rule is absolutely right. However, afaiac, deprecation is different
from modification. As a user/dev, I would appreciate it if I did not need
to do any migration work for any deprecated API when upgrading between
patch releases. BTW, if experimental APIs are allowed to change between
patches, could we just change them instead of marking them as deprecated
and creating new ones to replace them?

One major issue we have, afaiu, is caused by the lack of housekeeping/house
cleaning: there are many APIs that were marked as deprecated a few years
ago and still haven't been removed. Some APIs should be easy to remove and
others will need clearer rules, like the issue discussed at [1].

Some common questions could be:

1. How to make sure the new APIs cover all functionality, i.e. are backward
compatible, before removing the deprecated APIs? Since the
functionality can only be built with the new APIs iteratively, there
will be a period (which might be longer than the migration period) during
which the new APIs are not backward compatible with the deprecated ones.
2. Is it allowed to remove the deprecated APIs after the defined migration
period expires while the new APIs are still not backward compatible?
3. For the case of a core API upgrade with downstream implementations, e.g.
connectors, what is a feasible deprecation strategy? Option 1, bottom-up:
make sure the downstream implementation is backward compatible before
removing the deprecated core APIs. Option 2, top-down: once the downstream
implementation of the new APIs works fine, we can remove the deprecated core
APIs after the migration period expires. The implementation of the
deprecated APIs will not get any further updates in upcoming releases (it
has been removed). There might be some features missing in the downstream
implementation of the new APIs compared to the old implementation. Both
options have their own pros and cons.


Best regards,
Jing


[1] https://lists.apache.org/thread/m3o48c2d8j9g5t9s89hqs6qvr924s71o


On Mon, Jun 12, 2023 at 6:31 PM Stefan Richter
 wrote:

> Hi,
>
> Thanks a lot for bringing up this topic and for the initial proposal. As
> more and more people are looking into running Flink as a continuous service
> this discussion is becoming very relevant.
>
> What I would like to see is a clearer definition for what we understand by
> stability and compatibility. Our current policy only talks about being able
> to “compile” and “run” with a different version. As far as I can see, there
> is no guarantee about the stability of observable behavior. I believe it’s
> important for the community to include this important aspect in the
> guarantees that we give as our policy.
>
> For all changes that we do to the stable parts of the API, we should also
> consider how easy or difficult different types of changes would be for
> running Flink as a service with continuous delivery. For example,
> introducing a new interface to evolve the methods would make it easier to
> write adapter code than changing method signatures in-place on the existing
> interface. Those concerns should be considered in our process for evolving
> interfaces.
>
> Best,
> Stefan
>
>
>
> Stefan Richter
> Principal Engineer II
>
>
>
>
> > On 11. Jun 2023, at 14:30, Becket Qin  wrote:
> >
> > Hi folks,
> >
> > As one of the release 2.0 efforts, the release managers were discussing
> our
> > API lifecycle policies. There have been FLIP-196[1] and FLIP-197[2] that
> > are relevant to this topic. These two FLIPs defined the stability
> guarantee
> > of the programming APIs with various different stability annotations, and
> > the promotion process. A recap of the conclusion is following:
> >
> > Stability:
> > @Internal API: can change between major/minor/patch releases.
> > @Experimental API: can change between major/minor/patch releases.
> > @PublicEvolving API: can change between major/minor releases.
> > @Public API: can only change between major releases.
> >
> > Promotion:
> > An @Experimental API should be promoted to @PublicEvolving after two
> > releases, and a @PublicEvolving API should be promoted to @Public API
> after
> > two releases, unless there is a documented reason not to do so.
> >
> > One thing not mentioned in these two FLIPs is the API dep

Re: [DISCUSS] FLIP-321: Introduce an API deprecation process

2023-06-13 Thread Jing Ge
> This is by design. Most of these are @Public APIs that we had to carry
> around until Flink 2.0, because that was the initial guarantee that we
> gave people.
>

True, I knew @Public APIs could not be removed before the next major
release. I meant house cleaning without violating these annotations'
design concept, i.e. especially cleaning up @PublicEvolving APIs, since
they are customer-facing. Regularly cleaning up all other @Experimental
and @Internal APIs would be even better, in case some of them are marked
as @deprecated.

Best regards,
Jing



On Tue, Jun 13, 2023 at 4:25 PM Chesnay Schepler  wrote:

> On 13/06/2023 12:50, Jing Ge wrote:
> > One major issue we have, afaiu, is caused by the lack of
> housekeeping/house
> > cleaning, there are many APIs that were marked as deprecated a few years
> > ago and still don't get removed. Some APIs should be easy to remove and
> > others will need some more clear rules, like the issue discussed at [1].
>
This is by design. Most of these are @Public APIs that we had to carry
> around until Flink 2.0, because that was the initial guarantee that we
> gave people.


>
> As for the FLIP, I like the idea of explicitly writing down a
> deprecation period for APIs, particularly PublicEvolving ones.
> For Experimental I don't think it'd be a problem if we could change them
> right away,
> but looking back a bit I don't think it hurts us to also enforce some
> deprecation period.
> 1 release for both of these sound fine to me.
>
>
> My major point of contention is the removal of Public APIs between minor
> versions.
> This to me would be a major setback towards a simpler upgrade path for users.
> If these can be removed in minor versions, then what even is a major
> release?
> The very definition we have for Public APIs is that they stick around
> until the next major one.
> Any rule that theoretically allows for breaking changes in Public API in
> every minor release is in my opinion not a viable option.
>
>
> The "carry recent Public APIs forward into the next major release" thing
> seems to presume a linear release history (aka, if 2.0 is released after
> 1.20, then there will be no 1.21), which I doubt will be the case. The
> idea behind it is good, but I'd say the right conclusion would be to not
> make that API public if we know a new major release hits in 3 months and
> is about to modify it. With a regular schedule for major releases this
> wouldn't be difficult to do.
>


Re: [DISCUSS] FLIP-311: Support Call Stored Procedure

2023-06-13 Thread Jing Ge
Hi yuxia,

Thanks for your proposal and sorry for the late reply. The FLIP is in good
shape. If I am not mistaken, everything that a stored procedure could do
could also be done by a Flink job. The current stored procedure design is
to empower Catalogs to provide users some commonly used logic/functions
centrally and out-of-the-box, i.e. DRY. Is that correct?
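
For readers following along, a minimal sketch of what calling such a
catalog-provided procedure could look like (the catalog and procedure names
are made up; the CALL syntax is the one proposed in the FLIP):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    TableEnvironment tEnv =
            TableEnvironment.create(EnvironmentSettings.inStreamingMode());
    // "compact" would be a built-in procedure returned by the catalog via
    // Catalog.getProcedure(ObjectPath), not something users create themselves.
    tEnv.executeSql("CALL my_catalog.sys.compact('my_db.my_table')");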

Best regards,
Jing

On Thu, Jun 8, 2023 at 10:32 AM Jark Wu  wrote:

> Thank you for the proposal, yuxia! The FLIP looks good to me.
>
> Best,
> Jark
>
> > 2023年6月8日 11:39,yuxia  写道:
> >
> > Hi, all.
> > > Thanks everyone for the valuable input. If there are no further
> > > concerns about this FLIP[1], I would like to start voting next Monday
> > > (6/12).
> >
> > [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure
> >
> >
> > Best regards,
> > Yuxia
> >
> > - 原始邮件 -
> > 发件人: "Martijn Visser" 
> > 收件人: "dev" 
> > 发送时间: 星期二, 2023年 6 月 06日 下午 3:57:56
> > 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
> >
> > Hi Yuxia,
> >
> > Thanks for the clarification. I would be +0 overall, because I think
> > without actually allowing creation/customization of stored procedures,
> the
> > value for the majority of Flink users will be minimal.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Tue, Jun 6, 2023 at 3:52 AM yuxia 
> wrote:
> >
> >> Hi, Martijn.
> >> Thanks for your feedback.
> >> 1: In this FLIP we don't intend to allow users to customize their own
> >> stored procedures, since we don't want to expose too much to users too
> >> early, as the FLIP says.
> >> The procedures are supposed to be provided only by the Catalog. Catalog devs
> >> can write their built-in procedures, and return the procedure in the method
> >> Catalog.getProcedure(ObjectPath procedurePath);
> >> So, there won't be SQL syntax to create/save a stored procedure in this
> >> FLIP. If we find we do need it, we can propose the SQL syntax to create a
> >> stored procedure in another dedicated FLIP.
> >>
> >> 2: The syntax `Call procedure_name(xx)` proposed in this FLIP is the
> >> default syntax in Calcite for calling stored procedures. Actually, we don't
> >> need to do any modification in the flink-sql-parser module for the syntax
> >> of calling a procedure. MySQL[1], Postgres[2], Oracle[3] also use this
> >> syntax to call a stored procedure.
> >>
> >>
> >> [1] https://dev.mysql.com/doc/refman/8.0/en/call.html
> >> [2] https://www.postgresql.org/docs/15/sql-call.html
> >> [3] https://docs.oracle.com/javadb/10.8.3.0/ref/rrefcallprocedure.html
> >>
> >> Best regards,
> >> Yuxia
> >>
> >> - 原始邮件 -
> >> 发件人: "Martijn Visser" 
> >> 收件人: "dev" 
> >> 发送时间: 星期一, 2023年 6 月 05日 下午 8:35:44
> >> 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
> >>
> >> Hi Yuxia,
> >>
> >> Thanks for the FLIP. I have a couple of questions:
> >>
> >> 1. The syntax talks about how to CALL or SHOW the available stored
> >> procedures, but not on how to create one. Will there not be a SQL
> syntax to
> >> create/save a stored procedure?
> >> 2. Is there a default syntax in Calcite for stored procedures? What do
> >> other databases do, do they use CALL/SHOW or something like EXEC, USE?
> >>
> >> Best regards,
> >>
> >> Martijn
> >>
> >> On Mon, Jun 5, 2023 at 3:23 AM yuxia 
> wrote:
> >>
> >>> Hi, Jane.
> >>> Thanks for your input. I think we can add the auxiliary command SHOW
> >>> PROCEDURES in this FLIP.
> >>> Following the syntax for show functions proposed in FLIP-297.
> >>> The syntax will be
> >>> SHOW PROCEDURES [ ( FROM | IN ) [catalog_name.]database_name ] [ [NOT]
> >>> (LIKE | ILIKE)  ].
> >>> I have updated to this FLIP.
> >>>
> >>> The other auxiliary commands may not be suitable currently or need a
> >>> further, dedicated discussion. Let's keep this FLIP focused.
> >>>
> >>> [1]
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-297%3A+Improve+Auxiliary+Sql+Statements
> >>>
> >>> Best regards,
> >>> Yuxia
> >>>
> >>> - 原始邮件 -
> >>> 发件人: "Jane Chan" 
> >>> 收件人: "dev" 
> >>> 发送时间: 星期六, 2023年 6 月 03日 下午 7:04:39
> >>> 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
> >>>
> >>> Hi Yuxia,
> >>>
> >>> Thanks for bringing this to the discussion. The call procedure is a
> >> widely
> >>> used feature and will be very useful for users.
> >>>
> >>> I just have one question regarding the usage. The FLIP mentioned that
> >>>
> >>> Flink will allow connector developers to develop their own built-in
> >>> stored procedures, and then enable users to call these predefined
> >>> stored procedures.
> 
> >>> In this FLIP, we don't intend to allow users to customize their own
> >>> stored procedures, since we don't want to expose too much to users
> >>> too early.
> >>>
> >>>
> >>> If I understand correctly, we might need to provide some auxiliary
> >> commands
> >>> to inform users what built-in procedures are provided and how to use
> >> them.
> >>> For example, Snowflake provides commands like [1] [2], and MySQL
> provides

Re: Re: [VOTE] FLIP-311: Support Call Stored Procedure

2023-06-13 Thread Jing Ge
+1(binding)

Best Regards,
Jing

On Tue, Jun 13, 2023 at 9:03 AM Mang Zhang  wrote:

> +1 (no-binding)
>
>
>
>
> --
>
> Best regards,
> Mang Zhang
>
>
>
>
>
> 在 2023-06-13 13:19:31,"Lincoln Lee"  写道:
> >+1 (binding)
> >
> >Best,
> >Lincoln Lee
> >
> >
> >Jingsong Li  于2023年6月13日周二 10:07写道:
> >
> >> +1
> >>
> >> On Mon, Jun 12, 2023 at 10:32 PM Rui Fan <1996fan...@gmail.com> wrote:
> >> >
> >> > +1 (binding)
> >> >
> >> > Best,
> >> > Rui Fan
> >> >
> >> > On Mon, Jun 12, 2023 at 22:20 Benchao Li 
> wrote:
> >> >
> >> > > +1 (binding)
> >> > >
> >> > > yuxia  于2023年6月12日周一 17:58写道:
> >> > >
> >> > > > Hi everyone,
> >> > > > Thanks for all the feedback about FLIP-311: Support Call Stored
> >> > > > Procedure[1]. Based on the discussion [2], we have come to a
> >> consensus,
> >> > > so
> >> > > > I would like to start a vote.
> >> > > > The vote will be open for at least 72 hours (until June 15th,
> 10:00AM
> >> > > GMT)
> >> > > > unless there is an objection or an insufficient number of votes.
> >> > > >
> >> > > >
> >> > > > [1]
> >> > > >
> >> > >
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure
> >> > > > [2]
> https://lists.apache.org/thread/k6s50gcgznon9v1oylyh396gb5kgrwmd
> >> > > >
> >> > > > Best regards,
> >> > > > Yuxia
> >> > > >
> >> > >
> >> > >
> >> > > --
> >> > >
> >> > > Best,
> >> > > Benchao Li
> >> > >
> >>
>


Re: [VOTE] FLIP-294: Support Customized Catalog Modification Listener

2023-06-14 Thread Jing Ge
+1 (binding)

Best Regards,
Jing

On Wed, Jun 14, 2023 at 4:07 PM Benchao Li  wrote:

> +1 (binding)
>
> Shammon FY  于2023年6月14日周三 19:52写道:
>
> > Hi all:
> >
> > Thanks for all the feedback for FLIP-294: Support Customized Catalog
> > Modification Listener [1]. I would like to start a vote for it according
> to
> > the discussion in thread [2].
> >
> > The vote will be open for at least 72 hours(excluding weekends, until
> June
> > 19, 19:00 PM GMT) unless there is an objection or an insufficient number
> of
> > votes.
> >
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener
> > [2] https://lists.apache.org/thread/185mbcwnpokfop4xcb22r9bgfp2m68fx
> >
> >
> > Best,
> > Shammon FY
> >
>
>
> --
>
> Best,
> Benchao Li
>


Re: [VOTE] FLIP-295: Support lazy initialization of catalogs and persistence of catalog configurations

2023-06-14 Thread Jing Ge
+1 (binding)

Best Regards,
Jing

On Wed, Jun 14, 2023 at 3:28 PM Rui Fan <1996fan...@gmail.com> wrote:

> +1(binding)
>
> Best,
> Rui Fan
>
> On Wed, Jun 14, 2023 at 16:24 Hang Ruan  wrote:
>
> > +1 (non-binding)
> >
> > Thanks for Feng driving it.
> >
> > Best,
> > Hang
> >
> > Feng Jin  于2023年6月14日周三 10:36写道:
> >
> > > Hi everyone
> > >
> > > Thanks for all the feedback about the FLIP-295: Support lazy
> > initialization
> > > of catalogs and persistence of catalog configurations[1].
> > > [2] is the discussion thread.
> > >
> > >
> > > I'd like to start a vote for it. The vote will be open for at least 72
> > > hours(excluding weekends,until June 19, 10:00AM GMT) unless there is an
> > > objection or an insufficient number of votes.
> > >
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
> > > [2]https://lists.apache.org/thread/dcwgv0gmngqt40fl3694km53pykocn5s
> > >
> > >
> > > Best,
> > > Feng
> > >
> >
>


Re: [VOTE] FLIP-246: Dynamic Kafka Source (originally Multi Cluster Kafka Source)

2023-06-16 Thread Jing Ge
+1 (binding)

Best regards,
Jing

On Thu, Jun 15, 2023 at 7:55 PM Mason Chen  wrote:

> Hi all,
>
> Thank you to everyone for the feedback on FLIP-246 [1]. Based on the
> discussion thread [2], we have come to a consensus on the design and are
> ready to take a vote to contribute this to Flink.
>
> This voting thread will be open for at least 72 hours (excluding weekends,
> until June 20th 10:00AM PST) unless there is an objection or an
> insufficient number of votes.
>
> (Optional) If you have an opinion on the naming of the connector, please
> include it in your vote:
> 1. DynamicKafkaSource
> 2. MultiClusterKafkaSource
> 3. DiscoveringKafkaSource
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217389320
> [2] https://lists.apache.org/thread/vz7nw5qzvmxwnpktnofc9p13s1dzqm6z
>
> Best,
> Mason
>


Re: [VOTE] FLIP-287: Extend Sink#InitContext to expose TypeSerializer, ObjectReuse and JobID

2023-06-16 Thread Jing Ge
+1(binding)

Best Regards,
Jing

On Fri, Jun 16, 2023 at 10:10 AM Lijie Wang 
wrote:

> +1 (binding)
>
> Thanks for driving it, Joao.
>
> Best,
> Lijie
>
> Joao Boto  于2023年6月16日周五 15:53写道:
>
> > Hi all,
> >
> > Thank you to everyone for the feedback on FLIP-287[1]. Based on the
> > discussion thread [2], we have come to a consensus on the design and are
> > ready to take a vote to contribute this to Flink.
> >
> > I'd like to start a vote for it. The vote will be open for at least 72
> > hours (excluding weekends), unless there is an objection or an
> > insufficient number of votes.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
> > [2]https://lists.apache.org/thread/wb3myhqsdz81h08ygwx057mkn1hc3s8f
> >
> >
> > Best,
> > Joao Boto
> >
>


Re: [DISCUSS] FLIP-321: Introduce an API deprecation process

2023-06-17 Thread Jing Ge
ule to decide when to remove deprecated APIs without
caring about anything else like functionality compatibility. If the migration
period described in this FLIP is only the minimum time, I think we still
have the housekeeping issue unsolved. Minimum means deprecated APIs cannot
be removed before the migration period expires. The issue I was aware
of is when/how to remove APIs after the migration period has expired, e.g.
PublicEvolving APIs that were marked as deprecated years ago.


Best regards,
Jing

[1] https://lists.apache.org/thread/m3o48c2d8j9g5t9s89hqs6qvr924s71o

On Wed, Jun 14, 2023 at 9:56 AM Becket Qin  wrote:

> Hi Jing,
>
> Thanks for the feedback. Please see the answers to your questions below:
>
> *"Always add a "Since X.X.X" comment to indicate when was a class /
> > interface / method marked as deprecated."*
> >  Could you describe it with a code example? Do you mean Java comments?
>
> It is just a comment such as "Since 1.18. Use XXX
> <https://kafka.apache.org/31/javadoc/org/apache/kafka/clients/admin/Admin.html#incrementalAlterConfigs(java.util.Map)>
> instead.". And we can then look it up in the deprecated list[1] in each
> release and see which method should / can be deprecated.
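>
> For illustration, such a comment in Javadoc form could look like this (a
> sketch; the method names are made up):
>
>     /**
>      * @deprecated Since 1.18. Use {@link #listConfigsIncrementally()} instead.
>      */
>     @Deprecated
>     public void listConfigs() {
>         listConfigsIncrementally(); // delegate to the replacement
>     }
>
>     public void listConfigsIncrementally() {
>         // new implementation goes here
>     }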
>
> *"At least 1 patch release for the affected minor release for
> > Experimental APIs"*
> > The rule is absolutely right. However, afaiac, deprecation is different
> as
> > modification. As a user/dev, I would appreciate, if I do not need to do
> any
> > migration work for any deprecated API between patch releases upgrade.
> BTW,
> > if experimental APIs are allowed to change between patches, could we just
> > change them instead of marking them as deprecated and create new ones to
> > replace them?
>
> Deprecating an API is just a more elegant way of replacing an API with a
> new one. The only difference between the two is whether the old API is kept
> and coexists with the new API for some releases or not. For end users,
> deprecation-then-remove is much more friendly than direct replacement.
>
> 1. How to make sure the new APIs cover all functionality, i.e. backward
> > compatible, before removing the deprecated APIs? Since the
> > functionalities could only be built with the new APIs iteratively, there
> > will be a while (might be longer than the migration period) that the new
> > APIs are not backward compatible with the deprecated ones.
>
>

> This is orthogonal to the deprecation process, and may not even be required
> in some cases if the function changes by itself. But in general, this
> relies on the developer to decide. A simple test on readiness is to see if
> all the UT / IT cases written with the old API can be migrated to the new
> one and still work.  If the new API is not ready, we probably should not
> deprecate the old one to begin with.
>


>
> 2. Is it allowed to remove the deprecated APIs after the defined migration
> > period expires while the new APis are still not backward compatible?
>
>

> By "backwards compatible", do you mean functionally equivalent? If the new
> APIs are designed to be not backwards compatible, then removing the
> deprecated source code is definitely allowed. If we don't think the new API
> is ready to take over the place for the old one, then we should wait. The
> migration period is the minimum time we have to wait before removing the
> source code. A longer migration period is OK.
>
>

> 3. For the case of core API upgrade with downstream implementations, e.g.
> > connectors, What is the feasible deprecation strategy? Option1 bottom-up:
> > make sure the downstream implementation is backward compatible before
> > removing the deprecated core APIs. Option2 top-down: once the downstream
> > implementation of new APIs works fine, we can remove the deprecated core
> > APIs after the migration period expires. The implementation of the
> > deprecated APIs will not get any further update in upcoming releases(it
> has
> > been removed). There might be some missing features in the downstream
> > implementation of new APIs compared to the old implementation. Both
> options
> > have their own pros and cons.
>
> The downstream projects such as connectors in Flink should also follow the
> migration path we tell our users. i.e. If there is a cascading backwards
> incompatible change in the connectors due to a backwards incompatible
> change in the core, and as a result a longer migration period is required,
> then I think we should postpone the removal of source code. But in general,
> we should be able to provide the same migration period in the connectors as
> the flink core, if the connectors are upg

Re: [DISCUSS] FLIP-321: Introduce an API deprecation process

2023-06-17 Thread Jing Ge
Hi All,

The @Public -> @PublicEvolving demotion proposed by Xintong is a great idea,
especially after he suggested @PublicRetired, i.e. @PublicEvolving --(2
minor releases)--> @Public --> @deprecated --(1 major
release)--> @PublicRetired. It will provide a lot of flexibility without
breaking any rules we had: @Public APIs are allowed to change between major
releases, and changing annotations is acceptable and provides additional
tolerance, i.e. user-friendliness, since the APIs themselves are not changed.

I had similar thoughts when I was facing those issues. I want to move one
step further and suggest introducing one more annotation: @Retired.

It is unlike @PublicRetired, which is a compromise of downgrading @Public to
@PublicEvolving. As I mentioned earlier in my reply, the standard Java
@deprecated should be used in the early stage of the deprecation process
and doesn't really meet our requirement. Since Java does not allow us to
extend annotations, I think it would be feasible to have the new @Retired to
help us monitor and manage the deprecation process, housekeeping, etc.

Some ideas could be(open for discussion):

@Retired:

1. There must be a replacement with functionality compatibility before APIs
can be marked as @Retired, i.e. DISCUSS and VOTE processes on the ML are
mandatory (a FLIP is recommended).
2. APIs marked as @Retired will be removed after exactly 1 minor release
(using ArchUnit to enforce it; needs a further check on whether that is
possible). Devs who marked them as @Retired are responsible for removing them.
3. Both @Public -> @Retired and @PublicEvolving -> @Retired are
recommended. @Experimental -> @Retired and @Internal -> @Retired could also
be used if it can increase user-friendliness or dev-friendliness, but not
mandatory.
4. Some variables will be defined in @Retired to support the deprecation
process management. Further extension is possible, since the annotation is
built by us.
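
A rough sketch of what such an annotation could look like (everything here,
including the variable names, is open for discussion):

    import java.lang.annotation.Documented;
    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    @Documented
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD})
    @Retention(RetentionPolicy.RUNTIME)
    public @interface Retired {
        /** The release in which the API was marked as retired. */
        String since();
        /** The release in which the API must be removed (enforced e.g. via ArchUnit). */
        String removeIn();
        /** Fully qualified name of the replacement API. */
        String replacement();
    }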


Best regards,
Jing

On Fri, Jun 16, 2023 at 10:31 AM Becket Qin  wrote:

> Hi Xintong,
>
> Thanks for the explanation. Please see the replies inline below.
>
> I agree. And from my understanding, demoting a Public API is also a kind of
> > such change, just like removing one, which can only happen with major
> > version bumps. I'm not proposing to allow demoting Public APIs anytime,
> but
> > only in the case major version bumps happen before reaching the
> > 2-minor-release migration period. Actually, demoting would be a weaker
> > change compared to removing the API immediately upon major version bumps,
> > in order to keep the commitment about the 2-minor-release migration
> period.
> > If the concern is that `@Public` -> `@PublicEvolving` sounds against
> > conventions, we may introduce a new annotation if necessary, e.g.,
> > `@PublicRetiring`, to avoid confusions.
>
> As an end user who only uses Public APIs, if I don't change my code at all,
> my expectation is the following:
> 1. Upgrading from 1.x to 2.x may have issues.
> 2. If I can upgrade from 1.x to 2.x without an issue, I am fine with all
> the 2.x versions.
> Actually I think there are some dependency version resolution policies out
> there which picks the highest minor version when the dependencies pull in
> multiple minor versions of the same jar, which may be broken if we remove
> the API in minor releases.
>
> I'm not sure about this. Yes, it's completely "legal" that we bump up the
> > major version whenever a breaking change is needed. However, this also
> > weakens the value of the commitment that public APIs will stay stable
> > within the major release series, as the series can end anytime. IMHO,
> short
> > major release series are not something "make the end users happy", but
> > backdoors that allow us as the developers to make frequent breaking
> > changes. On the contrary, with the demoting approach, we can still have
> > longer major release series, while only allowing Public APIs deprecated
> at
> > the end of the previous major version to be removed in the next major
> > version.
>
> I totally agree that frequent major version bumps are not ideal, but here
> we are comparing it with a minor version bump which removes a Public API.
> So the context is that we have already decided to remove this Public API
> while keeping everything else backwards compatible. I think a major version
> bump is a commonly understood signal here, compared with a minor version
> change. From end users' perspective, for those who are not impacted, in
> this case upgrading a major version is not necessarily more involved than
> upgrading a minor version - both should be as smooth as a dependency
> version change. For those who are impacted, they will lose the Public API
> anyways and a major version bump ensures there is no surprise.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Jun 16, 2023 at 10:13 AM Xintong Song 
> wrote:
>
> > Public API is a well defined common concept, and one of its
> >> convention is that it only changes with a major version change.
> >>
> >

Re: [NOTICE] Experimental Java 17 support now available on master

2023-06-18 Thread Jing Ge
Hi Kurt,

Thanks for your contribution. I am a little bit confused about the email
title, since your PR[1] is not merged into master yet. I guess, with
"Experimental Java 17 support", you meant it is available on your branch,
which is based on master.

If I am not mistaken, there is no vote thread for FLIP-317 on the ML. Would you
like to follow the standard process[2] defined by the Flink community?
Thanks!


Best regards,
Jing

[1] https://github.com/apache/flink/pull/22660
[2]
https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

On Sun, Jun 18, 2023 at 1:18 AM Kurt Ostfeld 
wrote:

> I built the Flink master branch and tried running this simple Flink app
> that uses a Java record:
>
>
> https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-record-demo/src/main/java/demo/app/Main.java
>
> It fails with the normal exception that Kryo 2.x throws when you try to
> serialize a Java record. The full stack trace is here:
> https://pastebin.com/HGhGKUWt
>
> I tried removing this line:
>
>
> https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-record-demo/src/main/java/demo/app/Main.java#L36
>
> and that had no impact, I got the same error.
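>
> For reference, the failing pattern boils down to something like this (a
> minimal sketch; the names are illustrative, not taken from the demo app):
>
>     import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
>
>     public class RecordDemo {
>         public record Event(String id, long count) {}
>
>         public static void main(String[] args) throws Exception {
>             StreamExecutionEnvironment env =
>                     StreamExecutionEnvironment.getExecutionEnvironment();
>             // Event is not a POJO by Flink's rules, so it falls back to
>             // the generic Kryo 2.x serializer, which rejects records.
>             env.fromElements(new Event("a", 1L)).print();
>             env.execute();
>         }
>     }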
>
> In the other thread, you said that the plan was to use PojoSerializer to
> serialize records rather than Kryo. Currently, the Flink code base uses
> Kryo 2.x by default for generic user data types, and that will fail when
> the data type is a record or contains records. Ultimately, if Flink wants
> to fully support Java records, it seems that it has to move off of Kryo
> 2.x. PojoSerializer is part of what is basically a custom serialization
> library internal to Flink that is an alternative to Kryo. That's one
> option: move off of Kryo to a Flink-internal serialization library. The
> other two options are upgrade to the new Kryo or use a different
> serialization library.
>
> The Kryo 5.5.0 upgrade PR I submitted (
> https://github.com/apache/flink/pull/22660) with FLIP 317 (
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-317%3A+Upgrade+Kryo+from+2.24.0+to+5.5.0)
> works with records. The Flink app linked above that uses records works with
> the PR and that's what I posted to this mailing list a few weeks ago. I
> rebased the pull request on to the latest master branch and it's passing
> all tests. From my testing, it supports stateful upgrades, including
> checkpoints. If you can demonstrate a scenario where stateful upgrades
> error I can try to resolve that.


Re: [VOTE] FLIP-308: Support Time Travel

2023-06-19 Thread Jing Ge
+1(binding)

On Mon, Jun 19, 2023 at 1:57 PM Benchao Li  wrote:

> +1 (binding)
>
> Lincoln Lee  于2023年6月19日周一 19:40写道:
>
> > +1 (binding)
> >
> > Best,
> > Lincoln Lee
> >
> >
> > yuxia  于2023年6月19日周一 19:30写道:
> >
> > > +1 (binding)
> > > Thanks Feng driving it.
> > >
> > > Best regards,
> > > Yuxia
> > >
> > > - 原始邮件 -
> > > 发件人: "Feng Jin" 
> > > 收件人: "dev" 
> > > 发送时间: 星期一, 2023年 6 月 19日 下午 7:22:06
> > > 主题: [VOTE] FLIP-308: Support Time Travel
> > >
> > > Hi everyone
> > >
> > > Thanks for all the feedback about the FLIP-308: Support Time Travel[1].
> > > [2] is the discussion thread.
> > >
> > >
> > > I'd like to start a vote for it. The vote will be open for at least 72
> > > hours(excluding weekends,until June 22, 12:00AM GMT) unless there is an
> > > objection or an insufficient number of votes.
> > >
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-308%3A+Support+Time+Travel
> > > [2]https://lists.apache.org/thread/rpozdlf7469jmc7q7vc0s08pjnmscz00
> > >
> > >
> > > Best,
> > > Feng
> > >
> >
>
>
> --
>
> Best,
> Benchao Li
>


Re: [DISCUSS] FLIP-311: Support Call Stored Procedure

2023-06-19 Thread Jing Ge
Thanks for your reply. Nice feature!

Best regards,
Jing

On Wed, Jun 14, 2023 at 3:11 AM yuxia  wrote:

> Yes, you're right.
>
> Best regards,
> Yuxia
>
> ----- 原始邮件 -
> 发件人: "Jing Ge" 
> 收件人: "dev" 
> 发送时间: 星期三, 2023年 6 月 14日 上午 4:46:58
> 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
>
> Hi yuxia,
>
> Thanks for your proposal and sorry for the late reply. The FLIP is in good
> shape. If I am not mistaken, everything that a stored procedure could do
> could also be done by a Flink job. The current stored procedure design is
> to empower Catalogs to provide users some commonly used logic/functions
> centrally and out-of-the-box, i.e. DRY. Is that correct?
>
> Best regards,
> Jing
>
> On Thu, Jun 8, 2023 at 10:32 AM Jark Wu  wrote:
>
> > Thank you for the proposal, yuxia! The FLIP looks good to me.
> >
> > Best,
> > Jark
> >
> > > 2023年6月8日 11:39,yuxia  写道:
> > >
> > > Hi, all.
> > > Thanks everyone for the valuable input. If there are no further
> > > concerns about this FLIP[1], I would like to start voting next Monday
> > > (6/12).
> > >
> > > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure
> > >
> > >
> > > Best regards,
> > > Yuxia
> > >
> > > - 原始邮件 -
> > > 发件人: "Martijn Visser" 
> > > 收件人: "dev" 
> > > 发送时间: 星期二, 2023年 6 月 06日 下午 3:57:56
> > > 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
> > >
> > > Hi Yuxia,
> > >
> > > Thanks for the clarification. I would be +0 overall, because I think
> > > without actually allowing creation/customization of stored procedures,
> > the
> > > value for the majority of Flink users will be minimal.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > On Tue, Jun 6, 2023 at 3:52 AM yuxia 
> > wrote:
> > >
> > >> Hi, Martijn.
> > >> Thanks for your feedback.
> > >> 1: In this FLIP we don't intend to allow users to customize their own
> > >> stored procedures, since we don't want to expose too much to users too
> > >> early, as the FLIP says.
> > >> The procedures are supposed to be provided only by the Catalog. Catalog
> > >> devs can write their built-in procedures, and return the procedure in
> > >> the method Catalog.getProcedure(ObjectPath procedurePath);
> > >> So, there won't be SQL syntax to create/save a stored procedure in this
> > >> FLIP. If we find we do need it, we can propose the SQL syntax to create
> > >> a stored procedure in another dedicated FLIP.
> > >>
> > >> 2: The syntax `Call procedure_name(xx)` proposed in this FLIP is the
> > >> default syntax in Calcite for calling stored procedures. Actually, we
> > >> don't need to do any modification in the flink-sql-parser module for the
> > >> syntax of calling a procedure. MySQL[1], Postgres[2], Oracle[3] also use
> > >> this syntax to call a stored procedure.
> > >>
> > >>
> > >> [1] https://dev.mysql.com/doc/refman/8.0/en/call.html
> > >> [2] https://www.postgresql.org/docs/15/sql-call.html
> > >> [3]
> https://docs.oracle.com/javadb/10.8.3.0/ref/rrefcallprocedure.html
> > >>
> > >> Best regards,
> > >> Yuxia
> > >>
> > >> - 原始邮件 -
> > >> 发件人: "Martijn Visser" 
> > >> 收件人: "dev" 
> > >> 发送时间: 星期一, 2023年 6 月 05日 下午 8:35:44
> > >> 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
> > >>
> > >> Hi Yuxia,
> > >>
> > >> Thanks for the FLIP. I have a couple of questions:
> > >>
> > >> 1. The syntax talks about how to CALL or SHOW the available stored
> > >> procedures, but not on how to create one. Will there not be a SQL
> > syntax to
> > >> create/save a stored procedure?
> > >> 2. Is there a default syntax in Calcite for stored procedures? What do
> > >> other databases do, do they use CALL/SHOW or something like EXEC, USE?
> > >>
> > >> Best regards,
> > >>
> > >> Martijn
> > >>
> > >> On Mon, Jun 5, 2023 at 3:23 AM yuxia 
> > wrote:
> > >>
> > >>> Hi, Jane.
> > >>> Thanks for yo

Re: [DISCUSS] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-19 Thread Jing Ge
Hi Lijie,

Thanks for your proposal. It is a really nice feature. I'd like to ask a
few questions to understand your thoughts.

Afaiu, the runtime filter will only be injected when the gap between the
build data size and probe data size is big enough. Let's take an extreme
example: if the small table (build side) has one row and the large
table (probe side) contains tens of billions of rows, this will be the ideal
use case for the runtime filter and the improvement will be significant. Is
this correct?

Speaking of the "Conditions of injecting Runtime Filter" in the FLIP, will
the values of max-build-data-size and min-prob-data-size depend on the
parallelism config? I.e. with the same data-size settings, is it possible to
inject or not inject runtime filters by adjusting the parallelism?

In the FLIP, there are default values for the new configuration parameters
that will be used to check the injection condition. If ndv cannot be
estimated, row count will be used. Given the max-build-data-size is 10MB
and the min-prob-data-size is 10GB, in the worst case, the min-filter-ratio
will be 0.999, i.e. the probeNdv is 1000 times the buildNdv. If we consider
the duplication and the fact that the large table might have more columns
than the small table, the probeNdv should still be 100 or 10 times the
buildNdv, which ends up with a min-filter-ratio equal to 0.99 or 0.9. Both
are bigger than the default value 0.5 in the FLIP. If I am not mistaken,
commonly, a min-filter-ratio less than 0.99 will always allow injecting the
runtime filter. Does it make sense to reconsider the formula of ratio
calculation to help users easily control the filter injection?
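
For clarity, the numbers above assume the ratio is estimated roughly as
follows (my reading of the FLIP, not necessarily its exact formula):

    // estimated fraction of probe-side rows that the runtime filter removes
    double filterRatio = 1.0 - (double) buildNdv / probeNdv;
    // probeNdv = 1000 * buildNdv -> 0.999
    // probeNdv =  100 * buildNdv -> 0.99
    // probeNdv =   10 * buildNdv -> 0.9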

Best regards,
Jing

On Mon, Jun 19, 2023 at 4:42 PM Lijie Wang  wrote:

> Hi Stefan,
>
> >> bypassing the dataflow
> I believe it's a possible solution, but it may require more coordination
> and extra conditions (such as DFS), I do think it should be excluded from
> the first version. I'll put it in Future+Improvements as a potential
> improvement.
>
> Thanks again for your quick reply :)
>
> Best,
> Lijie
>
> Stefan Richter  于2023年6月19日周一 20:51写道:
>
> >
> > Hi Lijie,
> >
> > I think you understood me correctly. But I would not consider this a true
> > cyclic dependency in the dataflow because I would not suggest to send the
> > filter through an edge in the job graph from join to scan. I’d rather
> > bypass the stream graph for exchanging bringing the filter to the scan.
> For
> > example, the join could report the filter after the build phase, e.g. to
> > the JM or a predefined DFS folder. And when the probe scan is scheduled,
> > the JM provides the filter information to the scan when it gets scheduled
> > for execution or the scan looks in DFS if it can find any filter that it
> > can use as part of initialization. I’m not suggesting to do it exactly in
> > those ways, but just to show what I mean by "bypassing the dataflow".
> >
> > Anyways, I’m fine with excluding this optimization from the current FLIP
> > if you believe it would be hard to implement in Flink.
> >
> > Best,
> > Stefan
> >
> >
> > > On 19. Jun 2023, at 14:07, Lijie Wang 
> wrote:
> > >
> > > Hi Stefan,
> > >
> > > If I understand correctly(I hope so), the hash join operator needs to
> > send
> > > the bloom filter to probe scan, and probe scan also needs to send the
> > > filtered data to the hash join operator. This means there will be a
> cycle
> > > in the data flow, it will be hard for current Flink to schedule this
> kind
> > > of graph. I admit we can find a way to do this, but that's probably a
> > > bit outside the scope of this FLIP.  So let's do these complex
> > > optimizations later, WDYT?
> > >
> > > Best,
> > > Lijie
> > >
> > > Stefan Richter  于2023年6月19日周一 18:15写道:
> > >
> > >> Hi Lijie,
> > >>
> > >> Exactly, my proposal was to build the bloom filter in the hash
> > operator. I
> > >> don’t know about all the details about the implementation of Flink’s
> > join
> > >> operator, but I’d assume that even if the join is a two input operator
> > it
> > >> gets scheduled for 2 different pipelines. First the build phase with
> the
> > >> scan from the dimension table and after that’s completed the probe
> phase
> > >> with the scan of the fact table. I’m not proposing the use the bloom
> > filter
> > >> only in the join operator, but rather send the bloom filter to the
> probe
> > >> scan before starting the probe. I assume this would require some form
> of
> > >> side channel to transport the filter and coordination to tell the
> > sources
> > >> that such a filter is available. I cannot answer how hard those would
> > be to
> > >> implement, but the idea doesn’t seem impossible to me.
> > >>
> > >> Best,
> > >> Stefan
> > >>
> > >>
> > >>> On 19. Jun 2023, at 11:56, Lijie Wang 
> > wrote:
> > >>>
> > >>> Hi Stefan,
> > >>>
> > >>> Now I know what you mean about point 1. But currently it is
> unfeasible
> > >> for
> > >>> Flink, because the building of the 

[DISCUSS] Graduate the FileSink to @PublicEvolving

2023-06-20 Thread Jing Ge
Hi all,

The FileSink has been marked as @Experimental[1] since Oct. 2020.
According to FLIP-197[2], I would like to propose graduating it
to @PublicEvolving in the upcoming 1.18 release.

On the other hand, as a related topic, FileSource was marked
as @PublicEvolving[3] 3 years ago. It deserves a graduation discussion too.
To keep this discussion lean and efficient, let's focus on FileSink in this
thread. There will be another discussion thread for the FileSource.

I was wondering if anyone might have any concerns. Looking forward to
hearing from you.


Best regards,
Jing






[1]
https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/sink/FileSink.java#L129
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
[3]
https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSource.java#L95


Re: [DISCUSS] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-20 Thread Jing Ge
Hi Ron,

Thanks for the clarification. That answered my questions.

Regarding the ratio, my gut feeling is that any value less than 0.8
or 0.9 won't help too much (I might be wrong). I was thinking of adapting
the formula to somehow map the current 0.9-1 range to 0-1, i.e. if a user
configures 0.5, it would be mapped to e.g. 0.95 (or e.g. 0.85; the real
number needs more calculation) for the current formula described in the
FLIP. But I am not sure it is a feasible solution. It deserves more
discussion. Maybe some real performance tests could give us some hints.
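
In code, the remapping idea would be something like this (a sketch only; the
constants are up for discussion):

    // map a user-configured ratio in [0, 1] onto the effective range [0.9, 1]
    double effectiveRatio = 0.9 + 0.1 * userConfiguredRatio; // 0.5 -> 0.95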

Best regards,
Jing

On Tue, Jun 20, 2023 at 5:19 AM liu ron  wrote:

> Hi, Jing
>
> Thanks for your feedback.
>
> > Afaiu, the runtime Filter will only be Injected when the gap between the
> build data size and prob data size is big enough. Let's make an extreme
> example. If the small table(build side) has one row and the large
> table(probe side) contains tens of billions of rows. This will be the ideal
> use case for the runtime filter and the improvement will be significant. Is
> this correct?
>
> Yes, you are right.
>
> > Speaking of the "Conditions of injecting Runtime Filter" in the FLIP,
> will
> the value of max-build-data-size and min-prob-data-size depend on the
> parallelism config? I.e. with the same data-size setting, is it possible to
> inject or don't inject runtime filters by adjusting the parallelism?
>
> First, let me clarify two points. The first is that RuntimeFilter decides
> whether to inject or not in the optimization phase, but we do not consider
> operator parallelism in the SQL optimization phase currently, which is set
> at the ExecNode level. The second is that in batch mode, the default
> AdaptiveBatchScheduler[1] is now used, which will derive the parallelism of
> the downstream operator based on the amount of data produced by the
> upstream operator, that is, the parallelism is determined by runtime
> adaptation. In the above case, we cannot decide whether to inject
> BloomFilter in the optimization stage based on parallelism.
> A more important point is that the purpose of Runtime Filter is to reduce
> the amount of data for shuffle, and thus the amount of data processed by
> the downstream join operator. Therefore, I understand that regardless of
> the parallelism of the probe, the amount of data in the shuffle must be
> reduced after inserting the Runtime Filter, which is beneficial to the join
> operator, so whether to insert the RuntimeFilter or not is not dependent on
> the parallelism.
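>
> In other words, the injection check boils down to roughly the following
> (a sketch; the option names are abbreviated):
>
>     // sketch only: the thresholds correspond to the FLIP's configuration options
>     boolean injectRuntimeFilter =
>             buildDataSize <= maxBuildDataSize          // e.g. 10 MB
>             && probeDataSize >= minProbeDataSize       // e.g. 10 GB
>             && estimatedFilterRatio >= minFilterRatio; // e.g. 0.5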
>
> > Does it make sense to reconsider the formula of ratio
> calculation to help users easily control the filter injection?
>
> Only when ndv does not exist will row count be considered. When size uses
> the default value and ndv cannot be taken, it is true that this condition
> may always hold, but this does not seem to affect anything, and the user is
> also likely to change the value of the size. One question, how do you think
> we should make it easier for users to control the  filter injection?
>
>
> [1]:
>
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling/#adaptive-batch-scheduler
>
> Best,
> Ron
>
> Jing Ge  于2023年6月20日周二 07:11写道:
>
> > Hi Lijie,
> >
> > Thanks for your proposal. It is a really nice feature. I'd like to ask a
> > few questions to understand your thoughts.
> >
> > Afaiu, the runtime Filter will only be Injected when the gap between the
> > build data size and prob data size is big enough. Let's make an extreme
> > example. If the small table(build side) has one row and the large
> > table(probe side) contains tens of billions of rows. This will be the
> ideal
> > use case for the runtime filter and the improvement will be significant.
> Is
> > this correct?
> >
> > Speaking of the "Conditions of injecting Runtime Filter" in the FLIP,
> will
> > the value of max-build-data-size and min-prob-data-size depend on the
> > parallelism config? I.e. with the same data-size setting, is it possible
> to
> > inject or don't inject runtime filters by adjusting the parallelism?
> >
> > In the FLIP, there are default values for the new configuration
> parameters
> > that will be used to check the injection condition. If ndv cannot be
> > estimated, row count will be used. Given the max-build-data-size is 10MB
> > and the min-prob-data-size is 10GB, in the worst case, the
> min-filter-ratio
> > will be 0.999, i.e. the probeNdv is 1000 times buildNdv . If we consider
> > the duplication and the fact that the large table might have more columns
> > than the small table, the probeNdv should still be 100 or 10 

Re: [DISCUSS] FLIP-321: Introduce an API deprecation process

2023-06-20 Thread Jing Ge
> >>>>>> There're various ways to do that, e.g., release notes, warnings in
> >>>> logs,
> >>>>>> etc.
> >>>>>>
> >>>>>> Another possible alternative: whenever there's a deprecated Public
> >> API
> >>>>> that
> >>>>>> reaches a major version bump before the migration period, and we
> >> also
> >>>>> don't
> >>>>>> want to carry it for all the next major release series, we may
> >> consider
> >>>>>> releasing more minor releases for the previous major version after
> >> the
> >>>>>> bump. E.g., an Public API is deprecated in 1.19, and then we bump
> >> to
> >>>> 2.0,
> >>>>>> we can release one more 1.20 after 2.0. That should provide users
> >>>> another
> >>>>>> choice rather than upgrading to 2.0, while satisfying the
> >>>> 2-minor-release
> >>>>>> migration period.
> >>>>>>
> >>>>>> I think my major point is, we should not carry APIs deprecated in a
> >>>>>> previous major version along all the next major version series. I'd
> >>>> like
> >>>>> to
> >>>>>> try giving users more commitments, i.e. the migration period, as
> >> long
> >>>> as
> >>>>> it
> >>>>>> does not prevent us from making breaking changes. If it doesn't
> >> work,
> >>>> I'd
> >>>>>> be in favor of not providing the migration period, but fallback to
> >> only
> >>>>>> guarantee the compatibility within the major version.
> >>>>>>
> >>>>>> Best,
> >>>>>>
> >>>>>> Xintong
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Jun 19, 2023 at 10:48 AM John Roesler 
> >>>>> wrote:
> >>>>>>
> >>>>>>> Hi Becket,
> >>>>>>>
> >>>>>>> Thanks for the reply! I’d like to continue the conversation about
> >>>>>>> compatibility outside of this FLIP thread, but for now, I can
> >> accept
> >>>>> your
> >>>>>>> decision. It’s certainly an improvement.
> >>>>>>>
> >>>>>>> Thanks again,
> >>>>>>> John
> >>>>>>>
> >>>>>>> On Sun, Jun 18, 2023, at 21:42, Becket Qin wrote:
> >>>>>>>> Hi John,
> >>>>>>>>
> >>>>>>>> Completely agree with all you said.
> >>>>>>>>
> >>>>>>>> Can we consider only dropping deprecated APIs in major releases
> >>>>> across
> >>>>>>> the
> >>>>>>>>> board? I understand that Experimental and PublicEvolving APIs
> >> are
> >>>> by
> >>>>>>>>> definition less stable, but it seems like this should be
> >> reflected
> >>>>> in
> >>>>>>> the
> >>>>>>>>> required deprecation period alone. I.e. that we must keep them
> >>>>> around
> >>>>>>> for
> >>>>>>>>> at least zero or one minor release, not that we can drop them
> >> in a
> >>>>>>> minor or
> >>>>>>>>> patch release.
> >>>>>>>>
> >>>>>>>> Personally speaking, I would love to do this, for exactly the
> >>>> reason
> >>>>>> you
> >>>>>>>> mentioned. However, I did not propose this due to the following
> >>>>>> reasons:
> >>>>>>>>
> >>>>>>>> 1. I am hesitating a little bit about changing the accepted
> >> FLIPs
> >>>> too
> >>>>>>> soon.
> >>>>>>>> 2. More importantly, to avoid slowing down our development. At
> >> this
> >>>>>>> point,
> >>>>>>>> Flink still lacks some design / routines to support good API
> >>>>>>> evolvability /

Re: [DISCUSS] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-21 Thread Jing Ge
Hi Ron,

Thanks for sharing your thoughts! It makes sense. It would be helpful if
these references to Hive, PolarDB, etc. could be added to the FLIP.

Best regards,
Jing

On Tue, Jun 20, 2023 at 5:41 PM liu ron  wrote:

> Hi, Jing
>
> The default value for this ratio is a reference to other systems, such as
> Hive. As long as the runtime filter can filter out more than half of the
> data, we can benefit from it. Of course, normally, as long as we can get
> the statistics, ndv is present, so rowCount should rarely be used, and I
> think the formula is valid in most cases. We also borrowed this formula
> from other systems, such as the PolarDB of AliCloud. Your concern is
> valuable for this FLIP, but currently we do not know how to adjust it
> reasonably; a too complex formula may also be hard for users to understand,
> so I think we should first follow the simple way and optimize gradually
> afterwards. The first step may be to verify the reasonableness of the
> current formula with the TPC-DS cases.
>
> Best,
> Ron
>
> Jing Ge  于2023年6月20日周二 19:46写道:
>
> > Hi Ron,
> >
> > Thanks for the clarification. That answered my questions.
> >
> > Regarding the ratio, since my gut feeling is that any value less than 0.8
> > or 0.9 won't help too much(I might be wrong). I was thinking of adapting
> > the formula to somehow map the current 0.9-1 to 0-1, i.e. if user config
> > 0.5, it will be mapped to e.g. 0.95 (or e.g. 0.85, the real number
> > needs more calculation) for the current formula described in the FLIP.
> But
> > I am not sure it is a feasible solution. It deserves more discussion.
> Maybe
> > some real performance tests could give us some hints.
> >
> > Best regards,
> > Jing
> >
> > On Tue, Jun 20, 2023 at 5:19 AM liu ron  wrote:
> >
> > > Hi, Jing
> > >
> > > Thanks for your feedback.
> > >
> > > > Afaiu, the runtime Filter will only be Injected when the gap between
> > the
> > > build data size and prob data size is big enough. Let's make an extreme
> > > example. If the small table(build side) has one row and the large
> > > table(probe side) contains tens of billions of rows. This will be the
> > ideal
> > > use case for the runtime filter and the improvement will be
> significant.
> > Is
> > > this correct?
> > >
> > > Yes, you are right.
> > >
> > > > Speaking of the "Conditions of injecting Runtime Filter" in the FLIP,
> > > will
> > > the value of max-build-data-size and min-prob-data-size depend on the
> > > parallelism config? I.e. with the same data-size setting, is it
> possible
> > to
> > > inject or don't inject runtime filters by adjusting the parallelism?
> > >
> > > First, let me clarify two points. The first is that RuntimeFilter
> decides
> > > whether to inject or not in the optimization phase, but we do not
> > consider
> > > operator parallelism in the SQL optimization phase currently, which is
> > set
> > > at the ExecNode level. The second is that in batch mode, the default
> > > AdaptiveBatchScheduler[1] is now used, which will derive the
> parallelism
> > of
> > > the downstream operator based on the amount of data produced by the
> > > upstream operator, that is, the parallelism is determined by runtime
> > > adaptation. In the above case, we cannot decide whether to inject
> > > BloomFilter in the optimization stage based on parallelism.
> > > A more important point is that the purpose of Runtime Filter is to
> reduce
> > > the amount of data for shuffle, and thus the amount of data processed
> by
> > > the downstream join operator. Therefore, I understand that regardless
> of
> > > the parallelism of the probe, the amount of data in the shuffle must be
> > > reduced after inserting the Runtime Filter, which is beneficial to the
> > join
> > > operator, so whether to insert the RuntimeFilter or not is not
> dependent
> > on
> > > the parallelism.
> > >
> > > > Does it make sense to reconsider the formula of ratio
> > > calculation to help users easily control the filter injection?
> > >
> > > Only when ndv does not exist will row count be considered. when size
> uses
> > > the default value and ndv cannot be taken, it is true that this
> condition
> > > may always hold, but this does not seem to affect anything, and the
> user
> > is
> > > also likely to change the value of the size. One question, how do yo

Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

2023-06-22 Thread Jing Ge
Hi Galen,

Thanks for the hint, which is helpful for us to get a clear big picture.
Afaiac, this will not be a blocking issue for the graduation. There will
always be some (potential) bugs in the implementation. The API has been very
stable since 2020, so the timing is good to graduate. WDYT?
Furthermore, I'd like to have more opinions. All opinions together will
help the community build a mature API graduation process.

Best regards,
Jing

On Tue, Jun 20, 2023 at 12:48 PM Galen Warren
 wrote:

> Is this issue still unresolved?
>
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-30238
>
> Based on prior discussion, I believe this could lead to data loss with
> FileSink.
>
>
>
> On Tue, Jun 20, 2023, 5:41 AM Jing Ge  wrote:
>
> > Hi all,
> >
> > The FileSink has been marked as @Experimental[1] since Oct. 2020.
> > According to FLIP-197[2], I would like to propose graduating it
> > to @PublicEvolving in the upcoming 1.18 release.
> >
> > On the other hand, as a related topic, FileSource was marked
> > as @PublicEvolving[3] 3 years ago. It deserves a graduation discussion
> too.
> > To keep this discussion lean and efficient, let's focus on FileSink in
> > this thread. There will be another discussion thread for the FileSource.
> >
> > I was wondering if anyone might have any concerns. Looking forward to
> > hearing from you.
> >
> >
> > Best regards,
> > Jing
> >
> >
> >
> >
> >
> >
> > [1]
> >
> >
> https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/sink/FileSink.java#L129
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > [3]
> >
> >
> https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSource.java#L95
> >
>


Re: [VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-23 Thread Jing Ge
+1(binding)

Best Regards,
Jing

On Fri, Jun 23, 2023 at 5:50 PM Lijie Wang  wrote:

> Hi all,
>
> Thanks for all the feedback about the FLIP-324: Introduce Runtime Filter
> for Flink Batch Jobs[1]. This FLIP was discussed in [2].
>
> I'd like to start a vote for it. The vote will be open for at least 72
> hours (until June 29th 12:00 GMT) unless there is an objection or
> insufficient votes.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs
> [2] https://lists.apache.org/thread/mm0o8fv7x7k13z11htt88zhy7lo8npmg
>
> Best,
> Lijie
>


Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

2023-06-26 Thread Jing Ge
Hi,

@Galen @Yuxia

Your points are valid. Speaking of removing deprecated APIs, I have the same
concern. As a matter of fact, I have been raising it in the discussion
thread on the API deprecation process[1]. This is another example showing we
should consider more factors than just the migration period; thanks for
the hint! I will add one more update to that thread with a reference to
this thread.

In a nutshell, this thread is focusing on the graduation process. Your
valid concerns should be taken care of by the deprecation process.
Please don't hesitate to share your thoughts in that thread.


Best regards,
Jing

[1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9


On Sun, Jun 25, 2023 at 3:48 AM yuxia  wrote:

> Thanks Jing for bringing this up for discussion.
> I agree it's not a blocker for graduating the FileSink to @PublicEvolving,
> since the Sink, which is the root cause, is marked as @PublicEvolving.
> But I do also share the same concern as Galen. At least it should be a
> blocker for removing StreamingFileSink.
> Btw, it seems migrating to Sink is really a big headache; we may need
> to pay more attention to this ticket and try to fix it.
>
> Best regards,
> Yuxia
>
> - 原始邮件 -
> 发件人: "Galen Warren" 
> 收件人: "dev" 
> 发送时间: 星期五, 2023年 6 月 23日 下午 7:47:24
> 主题: Re: [DISCUSS] Graduate the FileSink to @PublicEvolving
>
> Thanks Jing. I can only offer my perspective on this, others may view it
> differently.
>
> If FileSink is subject to data loss in the "stop-on-savepoint then restart"
> scenario, that makes it unusable for me, and presumably for anyone who uses
> it in a long-running streaming application and who cannot tolerate data
> loss. I still use the (deprecated!) StreamingFileSink for this reason.
>
> The bigger picture here is that StreamingFileSink is deprecated and will
> presumably ultimately be removed, to be replaced with FileSink. Graduating
> the status of FileSink seems to be a step along that path; I'm concerned
> about continuing down that path with such a critical issue present.
> Ultimately, my concern is that FileSink will graduate fully and that
> StreamingFileSink will be removed and that there will be no remaining
> option to reliably stop/start streaming jobs that write to files without
> incurring the risk of data loss.
>
> I'm sure I'd feel better about things if there were an ongoing effort to
> address this FileSink issue and/or a commitment that StreamingFileSink
> would not be removed until this issue is addressed.
>
> My two cents -- thanks.
>
>
> On Fri, Jun 23, 2023 at 1:47 AM Jing Ge 
> wrote:
>
> > Hi Galen,
> >
> > Thanks for the hint, which is helpful for us to get a clear big picture.
> > Afaiac, this will not be a blocking issue for the graduation. There will
> > always be some (potential) bugs in the implementation. The API has been
> > very stable since 2020, so the timing is good to graduate. WDYT?
> > Furthermore, I'd like to have more opinions. All opinions together will
> > help the community build a mature API graduation process.
> >
> > Best regards,
> > Jing
> >
> > On Tue, Jun 20, 2023 at 12:48 PM Galen Warren
> >  wrote:
> >
> > > Is this issue still unresolved?
> > >
> > >
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-30238
> > >
> > > Based on prior discussion, I believe this could lead to data loss with
> > > FileSink.
> > >
> > >
> > >
> > > On Tue, Jun 20, 2023, 5:41 AM Jing Ge 
> > wrote:
> > >
> > > > Hi all,
> > > >
> > > > The FileSink has been marked as @Experimental[1] since Oct. 2020.
> > > > According to FLIP-197[2], I would like to propose graduating it
> > > > to @PublicEvolving in the upcoming 1.18 release.
> > > >
> > > > On the other hand, as a related topic, FileSource was marked
> > > > as @PublicEvolving[3] 3 years ago. It deserves a graduation
> discussion
> > > too.
> > > > To keep this discussion lean and efficient, let's focus on FileSink
> > > > in this thread. There will be another discussion thread for the
> > > > FileSource.
> > > >
> > > > I was wondering if anyone might have any concerns. Looking forward to
> > > > hearing from you.
> > > >
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/sink/FileSink.java#L129
> > > > [2]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > [3]
> > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSource.java#L95
> > > >
> > >
> >
>


Re: [DISCUSS] FLIP-321: Introduce an API deprecation process

2023-06-26 Thread Jing Ge
t; > > is reached.
> > >
> > > Sorry, I didn't read the previous detailed discussion because the
> > > discussion list was so long.
> > >
> > > I don't really like either of these options.
> > >
> > > Considering that DataStream is such an important API, can we offer a
> > third
> > > option:
> > >
> > > 3. Maintain the DataStream API throughout 2.x and do not remove it
> > > until 3.x. But there's no need to assume that 2.x is a short version;
> > > it's still a normal major version.
> > >
> > > Best,
> > > Jingsong
> > >
> > > Becket Qin 于2023年6月22日 周四16:02写道:
> > >
> > > > Thanks much for the input, John, Stefan and Jing.
> > > >
> > > > I think Xingtong has well summarized the pros and cons of the two
> > > options.
> > > > Let's collect a few more opinions here and we can move forward with
> the
> > > one
> > > > more people prefer.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Wed, Jun 21, 2023 at 3:20 AM Jing Ge 
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Thanks Xingtong for the summary. If I could only choose one of the
> > > given
> > > > > two options, I would go with option 1. I understood that option 2
> > > worked
> > > > > great with Kafka. But the bridge release will still confuse users
> and
> > > my
> > > > > gut feeling is that many users will skip 2.0 and be waiting for 3.0
> > or
> > > > even
> > > > > 3.x. And since fewer users will use Flink 2.x, the development
> focus
> > > will
> > > > > be on Flink 3.0 with the fact that the current Flink release is
> 1.17
> > > and
> > > > we
> > > > > are preparing 2.0 release. That is weird for me.
> > > > >
> > > > > TBH, I would not call the change from @Public to @Retired a
> > > > > demotion. The purpose of @Retired is to extend the API lifecycle
> > > > > with one more stage; like in the real world, people are born,
> > > > > study, graduate, work, and retire. Afaiu from the previous
> > > > > discussion, there are two rules we'd like to follow simultaneously:
> > > > >
> > > > > 1. Public APIs can only be changed between major releases.
> > > > > 2. A smooth migration phase should be offered to users, i.e. at
> > least 2
> > > > > minor releases after APIs are marked as @deprecated. There should
> be
> > > new
> > > > > APIs as the replacement.
> > > > >
> > > > > Agreed, those rules are good for improving user friendliness. The
> > > > > issues we discussed arise because we want to fulfill both of them. If
> > > > > we take deprecation very seriously, APIs can be marked as @Deprecated
> > > > > only when the new replacement APIs provide all the functionality the
> > > > > deprecated APIs have, in the ideal case without critical bugs that
> > > > > might stop users from adopting the new APIs. Otherwise the expected
> > > > > "replacement" will not happen: users will stick to the deprecated APIs
> > > > > because the new APIs cannot be used. For big features, it will need at
> > > > > least 4 minor releases (ideal case), i.e. 2+ years, to remove
> > > > > deprecated APIs:
> > > > >
> > > > > - 1st minor release: build the new APIs as the replacement and wait
> > > > > for feedback. It might be difficult to mark the old API as deprecated
> > > > > in this release, because we are not sure if the new APIs could cover
> > > > > 100% of the functionality.
> > > > > - In the lucky case, mark all old APIs as deprecated in the 2nd minor
> > > > > release. (I would even suggest having the new APIs released at least
> > > > > for two minor releases before marking the old ones as deprecated to
> > > > > make sure they can
> > > > > really repla

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-26 Thread Jing Ge
Hi Paul,

Thanks for driving it and thank you all for the informative discussion! The
FLIP is in good shape now. As described in the FLIP, SQL Driver will be
mainly used to run Flink SQLs in two scenarios: 1. SQL client/gateway in
application mode and 2. external system integration. Would you like to add
one section to describe (better with a script/code example) how to use it in
these two scenarios from the users' perspective?

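For instance, the application-mode scenario might look something like the
sketch below. This is purely illustrative: only `flink run-application -t
kubernetes-application` and `-Dkubernetes.cluster-id` are existing Flink
CLI pieces; `kubernetes.storage.dir`, the driver jar location, and the plan
argument are placeholders based on this thread's discussion.

# Hypothetical sketch of scenario 1 (SQL client/gateway in application mode).
./bin/flink run-application -t kubernetes-application \
    -Dkubernetes.cluster-id=sql-job-1 \
    -Dkubernetes.storage.dir=s3://my-bucket/flink-resources \
    local:///opt/flink/opt/flink-sql-gateway.jar \
    file:///tmp/compiled-plan.json
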
NIT: the pictures have a transparent background when readers click on them.
It would be great if you could replace them with pictures that have a white
background.

Best regards,
Jing

On Mon, Jun 26, 2023 at 1:31 PM Paul Lam  wrote:

> Hi Shengkai,
>
> > * How can we ship the json plan to the JobManager?
>
> The Flink K8s module should be responsible for file distribution. We could
> introduce
> an option like `kubernetes.storage.dir`. For each flink cluster, there
> would be a
> dedicated subdirectory, with the pattern like
> `${kubernetes.storage.dir}/${cluster-id}`.
>
> All resource-related options (e.g. pipeline jars, JSON plans) that are
> configured with the scheme `file://` would be uploaded to the resource
> directory
> and downloaded to the
> jobmanager, before SQL Driver accesses the files with the original
> filenames.
>
>
> > * Classloading strategy
>
>
> We could directly specify the SQL Gateway jar as the jar file in
> PackagedProgram.
> It would be treated like a normal user jar and the SQL Driver is loaded
> into the user
> classloader. WDYT?
>
> > * Option `$internal.sql-gateway.driver.sql-config` is string type
> > I think it's better to use Map type here
>
> By Map type configuration, do you mean a nested map that contains all
> configurations?
>
> I hope I've explained myself well, it’s a file that contains the extra SQL
> configurations, which would be shipped to the jobmanager.
>
> > * PoC branch
>
> Sure. I’ll let you know once I get the job done.
>
> Best,
> Paul Lam
>
> > On Jun 26, 2023, at 14:27, Shengkai Fang wrote:
> >
> > Hi, Paul.
> >
> > Thanks for your update. I have a few questions about the new design:
> >
> > * How can we ship the json plan to the JobManager?
> >
> > The current design only exposes an option for the URL of the JSON
> plan. It seems the gateway is responsible for uploading it to external
> storage. Can we reuse PipelineOptions.JARS to ship it to the remote
> filesystem?
> >
> > * Classloading strategy
> >
> > Currently, the Driver is in the sql-gateway package. That means the Driver
> is not on the JM's classpath directly, because the sql-gateway jar is now
> in the opt directory rather than the lib directory. We may need to add the
> external dependencies as Python does[1]. BTW, I think it's better to move
> the Driver into the flink-table-runtime package, which is much easier to
> find (sorry for the wrong opinion before).
> >
> > * Option `$internal.sql-gateway.driver.sql-config` is string type
> >
> > I think it's better to use Map type here
> >
> > * PoC branch
> >
> > Because this FLIP involves many modules, do you have a PoC branch to
> verify it does work?
> >
> > Best,
> > Shengkai
> >
> > [1]
> https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L940
> <
> https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L940
> >
> > Paul Lam (paullin3...@gmail.com) wrote on Mon, Jun 19, 2023 at 14:09:
> > Hi Shengkai,
> >
> > Sorry for my late reply. It took me some time to update the FLIP.
> >
> > In the latest FLIP design, SQL Driver is placed in flink-sql-gateway
> module. PTAL.
> >
> > The FLIP does not cover details about the K8s file distribution, but its
> general usage would
> > be very much the same as YARN setups. We could make follow-up
> discussions in the jira
> > tickets.
> >
> > Best,
> > Paul Lam
> >
> >> On Jun 12, 2023, at 15:29, Shengkai Fang (fskm...@gmail.com) wrote:
> >>
> >>
> >> > If it’s the case, I’m good with introducing a new module and making
> SQL Driver
> >> > an internal class and accepts JSON plans only.
> >>
> >> I have rethought this again and again. I think it's better to move the
> SqlDriver into the sql-gateway module because the sql client relies on the
> sql-gateway to submit the sql and the sql-gateway has the ability to
> generate the ExecNodeGraph now. +1 to support accepting JSON plans only.
> >>
> >> * Upload configuration through command line parameter
> >>
> >> ExecNodeGraph only contains the job's information but it doesn't
> contain the checkpoint dir, checkpoint interval, execution mode and so on.
> So I think we should also upload the configuration.
> >>
> >> * KubernetesClusterDescriptor and
> KubernetesApplicationClusterEntrypoint are responsible for the jar
> upload/download
> >>
> >> +1 for the change.
> >>
> >> Could you update the FLIP about the current discussion?
> >>
> >> Best,
> >> Shengkai
> >>
> >>
> >>
> >>
> >>
> >>
> >> Yang Wang (wangyang0...@apache.org) wrote on Mon, Jun 12, 2023 at 11:41:
> >> Sorry for the

Re: [DISCUSS] FLIP-303: Support REPLACE TABLE AS SELECT statement

2023-06-27 Thread Jing Ge
Hi Yuxia,

Thanks for the proposal. Many engines, like Snowflake and Databricks,
support it. +1

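For readers joining the thread, the statement family under discussion looks
roughly like this (a sketch following the FLIP's syntax; the table and
column names are made up):

-- Replace an existing table with the result of the query.
REPLACE TABLE target_tbl AS SELECT id, name FROM source_tbl;

-- Create the table, or replace it if it already exists.
CREATE OR REPLACE TABLE target_tbl AS SELECT id, name FROM source_tbl;
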
"3:Check the atomicity is enabled, it requires both the options
table.rtas-ctas.atomicity-enabled is set to true and the corresponding
table sink implementation SupportsStaging."

Typo? "Option" instead of "options"? It sounds like there are more options
that need to be set.

Best regards,
jing




On Tue, Jun 27, 2023 at 8:37 AM yuxia  wrote:

> Hi, all.
> Thanks for the feedback.
>
> If there are no other questions or concerns for the FLIP[1], I'd like to
> start the vote tomorrow (6.28).
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement
>
> Best regards,
> Yuxia
>
>
> From: "zhangmang1"
> To: "dev" , luoyu...@alumni.sjtu.edu.cn
> Sent: Tuesday, June 27, 2023, 12:03:35 PM
> Subject: Re:Re: [DISCUSS] FLIP-303: Support REPLACE TABLE AS SELECT statement
>
> Hi yuxia,
>
> +1 for this new feature.
> In particular, the CREATE OR REPLACE TABLE syntax is more usable and
> faster for users.
>
>
>
>
>
> --
> Best regards,
> Mang Zhang
>
>
>
>
> At 2023-06-26 09:46:40, "yuxia"  wrote:
> >Hi, folks.
> >To save reviewers' time, I would like to summarize the main changes of
> this FLIP[1]. The FLIP introduces the REPLACE TABLE AS SELECT statement,
> which is very similar to the CREATE TABLE AS SELECT statement, and a syntax
> CREATE OR REPLACE TABLE AS to wrap both. This FLIP tries to complete this
> family of statements.
> >
> >The changes are as follows:
> >1: Add enum REPLACE_TABLE_AS, CREATE_OR_REPLACE_TABLE_AS in
> StagingPurpose which is proposed in FLIP-305[2].
> >
> >2: Change the configuration from `table.ctas.atomicity-enabled` proposed
> in FLIP-305[2] to `table.rtas-ctas.atomicity-enabled` to make it take
> effect not only for CREATE TABLE AS, but also for REPLACE TABLE AS and
> CREATE OR REPLACE TABLE AS. The main reason is that these statements are
> almost the same, belong to the same statement family, and I would not like
> to introduce a new, different configuration which actually does the same
> thing. Also, IIRC, in the offline discussion about FLIP-218[3], it also
> wanted to introduce `table.rtas-ctas.atomicity-enabled`, but as FLIP-218
> only supports CTAS, it was not suitable to introduce a configuration
> implying RTAS, which was not supported yet. So we changed the configuration
> to `table.ctas.atomicity-enabled`. Since CTAS has been supported, I think
> it's reasonable to revisit it and introduce
> `table.rtas-ctas.atomicity-enabled` to unify them in this FLIP for
> supporting the REPLACE TABLE AS statement.
> >
> >
> >Again, look forward to your feedback.
> >
> >[1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement
> >[2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-305%3A+Support+atomic+for+CREATE+TABLE+AS+SELECT%28CTAS%29+statement
> >[3]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=199541185
> >
> >Best regards,
> >Yuxia
> >
>----- Original Message -----
>From: "yuxia"
>To: "dev"
>Sent: Thursday, June 15, 2023, 7:58:27 PM
>Subject: [DISCUSS] FLIP-303: Support REPLACE TABLE AS SELECT statement
> >
> >Hi, devs.
> >As FLIP-218[1] & FLIP-305[2], which add support for the CREATE TABLE AS
> SELECT statement to Flink, have been accepted,
> >I would like to start a discussion about FLIP-303: Support REPLACE TABLE
> AS SELECT statement[3] to complete this family of statements.
> >With the REPLACE TABLE AS SELECT statement, users won't need to first drop
> the table and then use CREATE TABLE AS SELECT. Since the statement is very
> similar to the CREATE TABLE AS statement, the design closely follows
> FLIP-218[1] & FLIP-305[2], apart from some parts specific to the REPLACE
> TABLE AS SELECT statement.
> >Just a kind reminder: to understand this FLIP better, you may need to read
> FLIP-218[1] & FLIP-305[2] to get more context.
> >
> >Look forward to your feedback.
> >
> >[1]:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=199541185
> >[2]:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-305%3A+Support+atomic+for+CREATE+TABLE+AS+SELECT%28CTAS%29+statement
> >[3]:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement
> >
> >:) I just noticed I missed "[DISCUSS]" in the title of the previous email
> [4], so I am sending it again here with the correct email title. Please
> ignore the previous email and discuss in this thread.
> >Sorry for the noise.
> >
> >[4]: https://lists.apache.org/thread/jy39xwxn1o2035y5411xynwtbyfgg76t
> >
> >
> >Best regards,
> >Yuxia
>
>


Re: [VOTE] FLIP-303: Support REPLACE TABLE AS SELECT statement

2023-06-28 Thread Jing Ge
+1(binding)

On Wed, Jun 28, 2023 at 1:51 PM Mang Zhang  wrote:

> +1 (no-binding)
>
>
> --
>
> Best regards,
> Mang Zhang
>
>
>
>
>
> At 2023-06-28 17:48:15, "yuxia"  wrote:
> >Hi everyone,
> >Thanks for all the feedback about FLIP-303: Support REPLACE TABLE AS
> SELECT statement[1]. Based on the discussion [2], we have come to a
> consensus, so I would like to start a vote.
> >The vote will be open for at least 72 hours (until July 3th, 10:00AM GMT)
> unless there is an objection or an insufficient number of votes.
> >
> >
> >[1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement
> >[2] https://lists.apache.org/thread/39mwckdsdgck48tzsdfm66hhnxorjtz3
> >
> >
> >Best regards,
> >Yuxia
>


Re: [DISCUSS] FLIP-314: Support Customized Job Lineage Listener

2023-06-28 Thread Jing Ge
Hi Shammon,

Thanks for your proposal. After reading the FLIP, I'd like to ask
some questions to make sure we are on the same page. Thanks!

1. TableColumnLineageRelation#sinkColumn() should return
TableColumnLineageEntity instead of String, right?

2. Since LineageRelation already contains all information to build the
lineage between sources and sink, do we still need to set the LineageEntity
in the source?

3. About the "Entity" and "Relation" naming, I was confused too, like
Qingsheng mentioned. How about LineageVertex, LineageEdge, and a
LineageEdges that contains multiple LineageEdge instances? E.g. multiple
sources joining into one sink, or edges of columns from one or more tables,
etc.

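To make the suggestion in 3. concrete, here is a minimal sketch (the
interface and method names are illustrative only, not the FLIP's actual
API):

import java.util.List;
import java.util.Map;

// Illustrative sketch only, not the FLIP's proposed interfaces.
interface LineageVertex {
    Map<String, String> config(); // e.g. Kafka server/group/topic options
}

interface LineageEdge {
    LineageVertex source();
    LineageVertex sink();
}

// A container for multiple edges, e.g. several sources joined into one sink.
interface LineageEdges {
    List<LineageEdge> edges();
}
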
Best regards,
Jing

On Sun, Jun 25, 2023 at 2:06 PM Shammon FY  wrote:

> Hi yuxia and Yun,
>
> Thanks for your input.
>
> For yuxia:
> > > 1: What kinds of JobStatus will the `JobExecutionStatusEvent` include?
>
> At present, we only need to notify the listener when a job goes to
> termination, but I think it makes sense to add generic `oldStatus` and
> `newStatus` in the listener and users can update the job state in their
> service as needed.
>
> > 2: I'm really confused about the `config()` included in `LineageEntity`,
> where is it from and what is it for ?
>
> The `config` in `LineageEntity` is used for users to get options for source
> and sink connectors. As the examples in the FLIP, users can add
> server/group/topic information in the config for kafka and create lineage
> entities for `DataStream` jobs, then the listeners can get this information
> to identify the same connector in different jobs. Otherwise, the `config`
> in `TableLineageEntity` will be the same as `getOptions` in
> `CatalogBaseTable`.
>
> > 3: Regardless whether `inputChangelogMode` in `TableSinkLineageEntity` is
> needed or not, since `TableSinkLineageEntity` contains
> `inputChangelogMode`, why `TableSourceLineageEntity` don't contain
> changelogmode?
>
> At present, we do not actually use the changelog mode. It can be deleted,
> and I have updated the FLIP.
>
> > Btw, since there're a lot interfaces proposed, I think it'll be better to
> give an example about how to implement a listener in this FLIP to make us
> know better about the interfaces.
>
> I have added the example in the FLIP and the related interfaces and
> examples are in branch [1].
>
> For Yun:
> > I have one more question on the lookup-join dim tables, it seems this
> FLIP does not touch them, and will they become part of the
> List sources() or will another interface be added?
>
> You're right, currently lookup join dim tables were not considered in the
> 'proposed changed' section of this FLIP. But the interface for lineage is
> universal and we can give `TableLookupSourceLineageEntity` which implements
> `TableSourceLineageEntity` in the future without modifying the public
> interface.
>
> > By the way, if you want to focus on job lineage instead of data column
> lineage in this FLIP, why we must introduce so many column-lineage related
> interface here?
>
> The lineage information in SQL jobs includes table lineage and column
> lineage. Although SQL jobs currently do not support column lineage, we
> would like to support this in the next step. So we have comprehensively
> considered the table lineage and column lineage interfaces here, and
> defined these two interfaces together clearly
>
>
> [1]
>
> https://github.com/FangYongs/flink/commit/d4bfe57e7a5315b790e79b8acef8b11e82c9187c
>
> Best,
> Shammon FY
>
>
> On Sun, Jun 25, 2023 at 4:17 PM Yun Tang  wrote:
>
> > Hi Shammon,
> >
> > I like the idea in general; it will help to analyze job lineage for both
> > Flink SQL and Flink jar jobs in production environments.
> >
> > For Qingsheng's concern, I like the name JobType more than
> > RuntimeExecutionMode, as the latter is not easy for users to understand.
> >
> > I have one more question on the lookup-join dim tables, it seems this
> FLIP
> > does not touch them, and will they become part of the List
> > sources() or will another interface be added?
> >
> > By the way, if you want to focus on job lineage instead of data column
> > lineage in this FLIP, why we must introduce so many column-lineage
> related
> > interface here?
> >
> >
> > Best
> > Yun Tang
> > 
> > From: Shammon FY 
> > Sent: Sunday, June 25, 2023 16:13
> > To: dev@flink.apache.org 
> > Subject: Re: [DISCUSS] FLIP-314: Support Customized Job Lineage Listener
> >
> > Hi Qingsheng,
> >
> > Thanks for your valuable feedback.
> >
> > > 1. Is there any specific use case to expose the batch / streaming info
> to
> > listeners or meta services?
> >
> > I agree with you that Flink is evolving towards batch-streaming
> > unification, but their lifecycles are different. If a job processes a
> > bounded dataset, it will end after completing the data processing,
> otherwise,
> > it will run for a long time. In our scenario, we will regularly schedule
> > some Flink jobs to process boun

Re: [DISUCSS] Deprecate multiple APIs in 1.18

2023-06-29 Thread Jing Ge
Hi Xintong,

2. SourceFunction / SinkFunction
This is one with challenges. There were many discussions in the past.

2.1 SourceFunction
The vote passed[1], but there are still some ongoing issues[2]. The
SourceFunction is not marked as @Deprecated yet[3], afaik.

2.2 SinkFunction
SinkV2 is a complex one; e.g. even the most important implementation,
KafkaSink, has not graduated to @Public yet[4]. FileSink is @Experimental,
and there were some concerns about data loss issues when I proposed
graduating it to @PublicEvolving. It deserves a more serious graduation
process. Otherwise, users would not migrate to SinkV2, and the SinkFunction
and related implementations could not be removed. You can find more details
and other issues in Dong's and my replies at [5].

Best regards,
Jing

[1] https://lists.apache.org/thread/hrpsddgz65hjvhjozhg72s0wsmxz145p
[2] https://github.com/apache/flink/pull/20049
[3]
https://github.com/apache/flink/blob/c11fd82e0cc63904b43d2aca3a79d85dfeb57c2f/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java#L98
[4]
https://github.com/apache/flink-connector-kafka/blob/910fc5a3f06ab7d95f79c0fc4b5ce64e2aaa8db2/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/sink/KafkaSink.java#L57
[5] https://lists.apache.org/thread/q62nj89rrz0t5xtggy5n65on95f2rmmx


On Thu, Jun 29, 2023 at 10:45 AM Martijn Visser 
wrote:

> Hi Xintong,
>
> With regards to the deprecation of the Scala APIs, during the PR
> review it was requested to not mark all APIs as deprecated but only
> the entry point [1], to avoid a fully striked through experience in
> the IDE. I think the same idea was applicable on the DataSet API. I
> think it depends on how formal we want to treat this: if really
> formal, then we should deprecate them in 1.18. I think in both cases,
> it's quite well known that they are deprecated. I'm +0 for either way,
> as long as we're all agreeing that they can be removed in 2.0.
>
> With regards to Queryable State and Source/SinkFunction, +1 to mark
> these as deprecated.
>
> Best regards,
>
> Martijn
>
> [1]
> https://github.com/apache/flink/pull/21176#pullrequestreview-1159706808
>
> On Thu, Jun 29, 2023 at 10:23 AM Xintong Song 
> wrote:
> >
> > Hi devs,
> >
> > Looking at the release 2.0 proposals [1], I noticed that many APIs that
> are
> > proposed to be removed in 2.0 are not (fully) deprecated yet. We might
> want
> > to properly mark them as `@Deprecated` in 1.18 if we agree they should be
> > removed in 2.0. Moreover, according to FLIP-321 [2] (not voted yet but
> IMO
> > is close to consensus IMO), a migration period is required after APIs are
> > deprecated and before they can be removed.
> >
> > I might not be familiar with the status of all the APIs below. So I'd
> like
> > to bring them up and see if there's any concern regarding deprecating
> them
> > in 1.18. If there's concern for deprecating API, we can start a separate
> > discussion thread for it. For those with no objections, I'd create JIRA
> > tickets and try to properly deprecate them in 1.18.
> >
> > 1. DataSet API
> > It's described as "legacy", "soft deprecated" in user documentation [3].
> > However, it's not annotated with `@Deprecated` in codes. According to
> > FLIP-131 [4], DataSet API should be deprecated when DataStream API and
> > Table API / SQL meet certain requirements. AFAICS, all the requirements
> > mentioned in the FLIP are already fulfilled. We should annotate it as
> > `@Deprecated` now.
> >
> > 2. SourceFunction / SinkFunction
> > They are described as deprecated in the roadmap[5], and I don't find
> > anything regarding them in user documentation. But they are also not
> > annotated with `@Deprecated` in codes. TBH, I'm not aware of any formal
> > decision to deprecate these. AFAICS, the replacement for SourceFunction
> > (Source) has already been promoted to `@Public`, while the replacement
> for
> > SinkFunction (SinkV2) is still `@PublicEvolving`. I found a discussion[6]
> > regarding promoting SinkV2 to `@Public`, but it's unclear to me what the
> > conclusion is.
> >
> > 3. Queryable State
> > It's described as approaching end-of-life in the roadmap [5], but is
> > neither deprecated in codes nor in user documentation [7]. I also found a
> > discussion [8] about rescuing it from deprecation, and it seems to me
> there
> > are more negative opinions than positive ones.
> >
> > 4. All Scala APIs
> > I think we agreed to drop Scala API support in FLIP-265 [9], and have
> tried
> > to deprecate them in FLINK-29740 [10]. Also, both user documentation and
> > roadmap[5] shows that scala API supports are deprecated. However, AFAICS,
> > none of the APIs in `flink-streaming-scala` are annotated with
> > `@Deprecated`, and only `ExecutionEnvironment` and `package` are marked
> > `@Deprecated` in `flink-scala`.
> >
> > Looking forward to your feedback.
> >
> > Best,
> >
> > Xintong
> >
> >
> > [1] https://cwiki.apache.org/confluence/display

Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-06-29 Thread Jing Ge
+1(binding)

On Thu, Jun 29, 2023 at 7:47 AM Leonard Xu  wrote:

> +1 (binding)
>
> Best,
> Leonard
>
> > On Jun 29, 2023, at 1:25 PM, Jingsong Li  wrote:
> >
> > +1 binding
> >
> > On Thu, Jun 29, 2023 at 11:03 AM Dong Lin  wrote:
> >>
> >> Hi all,
> >>
> >> We would like to start the vote for FLIP-309: Support using larger
> >> checkpointing interval when source is processing backlog [1]. This FLIP
> was
> >> discussed in this thread [2].
> >>
> >> Flink 1.18 release will feature freeze on July 11. We hope to make this
> >> feature available in Flink 1.18.
> >>
> >> The vote will be open until at least July 4th (at least 72 hours),
> following
> >> the consensus voting process.
> >>
> >> Cheers,
> >> Yunfeng and Dong
> >>
> >> [1]
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> >> [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37
>
>


Re: [DISCUSS] Release 2.0 Work Items

2023-07-03 Thread Jing Ge
Hi,

Speaking of "Move Calcite rules from Scala to Java", I was wondering if
this thread is the right place to talk about it. Afaik, the Flink community
has decided to abandon Scala. That is the reason, I guess, we want to move
those Calcite rules from Scala to Java. On the other hand, new Scala code
will be added while developing new features[1]. Do we have any thoughts
wrt the Scala code strategy?

Best regards,
Jing



[1] https://lists.apache.org/thread/tnygl4n3q1fx75cl2vclc78j8mrxmz6y

On Mon, Jul 3, 2023 at 10:31 AM Xintong Song  wrote:

> Thanks all for the discussion.
>
>
> IIUC, we need to make the following changes. Please correct me if I get it
> wrong.
>
>
> 1. Disaggregated State Management - Clarify that only the public API
> related part is must-have for 2.0.
>
> 2. Java version support - Split it into 3 items: a) make java 17 the
> default (must-have), b) drop java 8 (must-have), and c) drop java 11
> (nice-to-have)
>
> 3. Add MetricGroup#getLogicalScope - Should be promoted to must-have
>
> 4. ProcessFunction API - Should be downgrade to nice-to-have
>
> 5. Configuration - Add an item "revisit all config option types and default
> values", which IIUC should also be a must-have
>
>
> There seems to be no changes needed for "Move Calcite rules from Scala to
> Java" as it's already nice-to-have.
>
>
> If there's no objections, I'll update the wiki page accordingly, and start
> a VOTE in the next couple of days.
>
>
> Best,
>
> Xintong
>
>
>
> On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong 
> wrote:
>
> > Thanks Xintong for driving the effort.
> >
> > I’d add a +1 to reworking configs, as suggested by @Jark and @Chesnay,
> > especially the types. We have various configs that encode Time /
> MemorySize
> > that are Long instead!
> >
> > Regards,
> > Hong
> >
> >
> >
> > > On 29 Jun 2023, at 16:19, Yuan Mei  wrote:
> > >
> > >
> > > Thanks for driving this effort, Xintong!
> > >
> > > To Chesnay
> > >> I'm curious as to why the "Disaggregated State Management" item is
> > >> marked as a must-have; will it require changes that break something?
> > >> What prevents it from being added in 2.1?
> > >
> > > As to "Disaggregated State Management".
> > >
> > > We plan to provide a new type of state backend to support DFS as
> > > primary storage.
> > > To achieve this, we need to include at least two kinds of amendments (not
> > > entirely sure yet, since we are still in the design and prototype
> > > phase)
> > >
> > > 1. Statebackend Change
> > > 2. State Access Change
> > >
> > > Not all of the interfaces related are `@Internal`. Some of the
> interfaces
> > > like `StateBackend` is `@PublicEvolving`
> > > So, you are right in the sense that "Disaggregated State Management"
> > itself
> > > probably does not need to be a "Must Have"
> > >
> > > But I was hoping changes that related to public APIs can be finalized
> and
> > > merged in Flink 2.0 (I will fix the wiki accordingly).
> > >
> > > I also agree with Jark that 2.0 is a good chance to rework the default
> > > value of configurations.
> > >
> > > Best
> > > Yuan
> > >
> > >
> > > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler 
> > wrote:
> > >
> > >> Something else configuration-related is that there are a bunch of
> > >> options where the type isn't quite correct (e.g., a String where it
> > >> could be an enum, a string where it should be an int or something).
> > >> Could do a pass over those as well.
> > >>
> > >> On 29/06/2023 13:50, Jark Wu wrote:
> > >>> Hi,
> > >>>
> > >>> I think one more thing we need to consider doing in 2.0 is changing
> > >>> the default values of configurations to improve the out-of-the-box
> > >>> user experience.
> > >>>
> > >>> Currently, in order to run a Flink job, users may need to set
> > >>> a bunch of configurations, such as minibatch, checkpoint interval,
> > >>> exactly-once,
> > >>> incremental-checkpoint, etc. It's very verbose and hard to use for
> > >>> beginners.
> > >>> Most of them can have a universally applicable value. Because changing
> > >>> the default value is a breaking change, I think it's worth considering
> > >>> changing them in 2.0.
> > >>>
> > >>> What do you think?
> > >>>
> > >>> Best,
> > >>> Jark
> > >>>
> > >>>
> > >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin 
> > >> wrote:
> > >>>
> >  Hi Chesnay
> > 
> > > "Move Calcite rules from Scala to Java": I would hope that this
> would
> > >> be
> > > an entirely internal change, and could thus be an incremental
> process
> > > independent of major releases.
> > > What is the actual scale of this item; how much are we actually
> >  re-writing?
> > 
> >  Thanks for asking
> >  Yes, you're right, that should be an internal change.
> >  Yeah I was also thinking about incremental change (rule by rule 

Re: [VOTE] FLIP-321: introduce an API deprecation process

2023-07-03 Thread Jing Ge
+1(binding)

On Mon, Jul 3, 2023 at 10:19 AM Stefan Richter
 wrote:

> +1 (binding)
>
>
> > On 3. Jul 2023, at 10:08, Martijn Visser 
> wrote:
> >
> > +1 (binding)
> >
> >
> >
> > On Mon, Jul 3, 2023 at 10:03 AM Xintong Song  > wrote:
> >
> >> +1 (binding)
> >>
> >> Best,
> >>
> >> Xintong
> >>
> >>
> >>
> >> On Sat, Jul 1, 2023 at 11:26 PM Dong Lin  wrote:
> >>
> >>> Thanks for the FLIP.
> >>>
> >>> +1 (binding)
> >>>
> >>> On Fri, Jun 30, 2023 at 5:39 PM Becket Qin 
> wrote:
> >>>
>  Hi folks,
> 
>  I'd like to start the VOTE for FLIP-321[1] which proposes to introduce
> >> an
>  API deprecation process to Flink. The discussion thread for the FLIP
> >> can
> >>> be
>  found here[2].
> 
>  The vote will be open until at least July 4, following the consensus
> >>> voting
>  process.
> 
>  Thanks,
> 
>  Jiangjie (Becket) Qin
> 
>  [1]
> 
> 
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-321%3A+Introduce+an+API+deprecation+process
>  [2]
> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>
>


Re: [DISCUSS] Release 2.0 Work Items

2023-07-03 Thread Jing Ge
Hi Sergey,

Thanks for the clarification! I will not hijack this thread to discuss
Scala code strategy.

Best regards,
Jing

On Mon, Jul 3, 2023 at 10:51 AM Sergey Nuyanzin  wrote:

> Hi Jing,
>
> Maybe I was not clear enough, sorry.
> However, the main reason for this item about Calcite rules is not
> abandoning Scala.
> The main reason is a change in Calcite itself, which introduced a code
> generator framework (Immutables) to generate config Java classes for rules;
> the old API (which is used in the Scala Calcite rules) is marked as
> deprecated. Since Immutables implies code generation during Java
> compilation, it seems impossible to use it for rules written in Scala.
>
> On Mon, Jul 3, 2023 at 10:44 AM Jing Ge 
> wrote:
>
> > Hi,
> >
> > Speaking of "Move Calcite rules from Scala to Java", I was wondering if
> > this thread is the right place to talk about it. Afaik, the Flink
> community
> > has decided to abandon Scala. That is the reason, I guess, we want to
> move
> > those Calcite rules from Scala to Java. On the other side, new Scala code
> > will be added while developing new features[1]. Do we have any thoughts
> > wrt the Scala code strategy?
> >
> > Best regards,
> > Jing
> >
> >
> >
> > [1] https://lists.apache.org/thread/tnygl4n3q1fx75cl2vclc78j8mrxmz6y
> >
> > On Mon, Jul 3, 2023 at 10:31 AM Xintong Song 
> > wrote:
> >
> > > Thanks all for the discussion.
> > >
> > >
> > > IIUC, we need to make the following changes. Please correct me if I get
> > it
> > > wrong.
> > >
> > >
> > > 1. Disaggregated State Management - Clarify that only the public API
> > > related part is must-have for 2.0.
> > >
> > > 2. Java version support - Split it into 3 items: a) make java 17 the
> > > default (must-have), b) drop java 8 (must-have), and c) drop java 11
> > > (nice-to-have)
> > >
> > > 3. Add MetricGroup#getLogicalScope - Should be promoted to must-have
> > >
> > > 4. ProcessFunction API - Should be downgrade to nice-to-have
> > >
> > > 5. Configuration - Add an item "revisit all config option types and
> > default
> > > values", which IIUC should also be a must-have
> > >
> > >
> > > There seems to be no changes needed for "Move Calcite rules from Scala
> to
> > > Java" as it's already nice-to-have.
> > >
> > >
> > > If there's no objections, I'll update the wiki page accordingly, and
> > start
> > > a VOTE in the next couple of days.
> > >
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong
>  > >
> > > wrote:
> > >
> > > > Thanks Xintong for driving the effort.
> > > >
> > > > I’d add a +1 to reworking configs, as suggested by @Jark and
> @Chesnay,
> > > > especially the types. We have various configs that encode Time /
> > > MemorySize
> > > > that are Long instead!
> > > >
> > > > Regards,
> > > > Hong
> > > >
> > > >
> > > >
> > > > > On 29 Jun 2023, at 16:19, Yuan Mei  wrote:
> > > > >
> > > > >
> > > > > Thanks for driving this effort, Xintong!
> > > > >
> > > > > To Chesnay
> > > > >> I'm curious as to why the "Disaggregated State Management" item is
> > > > >> marked as a must-have; will it require changes that break
> something?
> > > > >> What prevents it from being added in 2.1?
> > > > >
> > > > > As to "Disaggregated State Management".
> > > > >
> > > > > We plan to provide a new type of state backend to support DFS as
> > > primary
> > > > > storage.
> > > > > To achieve this, we at least need to include two parts of amends
> (not
> > > > > entirely sure yet, since we are still in the designing and
> prototype
> > > > phase)
> > > > >
> > > > > 1. Statebackend

Re: [DISUCSS] Deprecate multiple APIs in 1.18

2023-07-03 Thread Jing Ge
Hi Xintong,

Option 1, the safe plan, would be:

1. graduate the Kafka, File, and JDBC connectors to @Public
2. graduate SinkV2 to @Public
3. remove SinkFunction.

Option 2, a riskier plan but at a faster pace:

1. graduate SinkV2 to @Public, expecting more maintenance effort since
there are many known and unsolved issues.
2. remove SinkFunction.
3. It depends on the connectors' contributors whether connectors can
upgrade to Flink 2.0, since we moved forward with SinkV2 API without taking
care of implementations in external connectors.

I am ok with both of them and personally prefer option 1.

Best regards,
Jing


On Fri, Jun 30, 2023 at 3:41 AM Xintong Song  wrote:

> I see. Thanks for the explanation. I may have not looked into this deeply
> enough, and would trust the decision from you and the community members who
> participated in the discussion & vote.
>
> Best,
>
> Xintong
>
>
>
> On Thu, Jun 29, 2023 at 10:28 PM Alexander Fedulov <
> alexander.fedu...@gmail.com> wrote:
>
> > > However, I'm not sure about 2.
> >
> > I am not aware of a bylaw that states the specific requirements in order
> to
> > mark something as @Deprecated. My understanding from the discussion and
> the
> > vote was that the community recognizes the necessity to make it explicit
> > that
> > the usage of the SourceFunction API is discouraged. This can actually
> > stimulate
> > authors of connectors that rely on this very specific and non-baseline
> > functionality to contribute extensions to the new Source API themselves
> in
> > order to
> > close the gap. ExternallyInducedSource, for instance, was driven by
> Pravega
> > to
> > begin with, since it was only needed for their purposes [1]. We are not
> > removing
> > anything - until 2.0 everything will continue to work and we can work on
> > resolving the limitations until then, I personally don't see a big issue
> > here.
> >
> > >Do you think it is feasible to resolve them by the feature freeze date
> of
> > 1.18?
> > No, these are rather complex additions that would probably require
> FLIP(s).
> >
> > [1]
> >
> >
> https://flink.apache.org/2022/01/20/pravega-flink-connector-101/#checkpoint-integration
> >
> > On Thu, 29 Jun 2023 at 14:25, Xintong Song 
> wrote:
> >
> > > Thanks for the explanation, Alex.
> > >
> > > Not blocking the deprecation on 1 & 3 makes sense to me. However, I'm
> not
> > > sure about 2.
> > >
> > > It sounds to me that, without FLINK-28051 & FLINK-28054, some of the
> > > connectors cannot migrate to the new Source API, or at least further
> > > investigation is needed to understand the situation. If this is the
> case,
> > > we probably should not deprecate the API until these issues are
> resolved.
> > > Do you think it is feasible to resolve them by the feature freeze date
> of
> > > 1.18?
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Thu, Jun 29, 2023 at 8:02 PM Alexander Fedulov <
> > > alexander.fedu...@gmail.com> wrote:
> > >
> > > > @Xintong
> > > > The original discussion [1] and vote [2] converged on the idea that
> it
> > is
> > > > better
> > > > to make it clear to the users that they should stop using
> > SourceFunction
> > > > since it
> > > > is going away. The longer we do not have this indication, the more
> user
> > > > implementations will be based on it and the more pain will be induced
> > > when
> > > > we
> > > > finally drop it. Users now have an alternative API that they should
> use
> > > and
> > > > which
> > > > is fully functional, from that perspective nothing blocks marking it
> > > > @Deprecated.
> > > > As for the remaining work items - there are primarily three kinds:
> > > >
> > > > 1. Where Flink internally uses SourceFunction, without exposing this
> > fact
> > > > to the
> > > >outside world:
> > > >- FLINK-28050 [3]
> > > >- FLINK-28229 [4]
> > > >- FLINK-28048 [5]
> > > >
> > > > 2. Very specific edge cases that might not be covered by the Source
> API
> > > as
> > > > is:
> > > >- FLINK-28054 [6]
> > > >- FLINK-28051 [7]
> > > >
> > > > 3. Usability improvements - something that was easily doable with
> > > > SourceFunction,
> > > >but requires deep knowledge of the new, significantly more
> complex,
> > > > Source API
> > > >to achieve:
> > > >- FLINK-28056 [8]
> > > >
> > > > In my mind, none of those are blockers for proceeding with adding the
> > > > @Deprecated
> > > > annotation:
> > > > (1) is a simple case of encapsulation, internals should not concern
> the
> > > API
> > > > users
> > > > (2) is really only relevant for "exotic" use cases. Does not mean we
> > > should
> > > > not
> > > > consider those, but since it is irrelevant for 99.9% of the users, I
> do
> > > not
> > > > think
> > > > we should get stuck here.
> > > > (3) is purely a nice to have. Formally speaking, all of the tools are
> > > > there, it is
> > > > just that due to the complexity of the new Source API some "simple"
> > > things
> > > > become
> > > > non-trivial and ideally we want to

Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

2023-07-03 Thread Jing Ge
Hi,

If there are no other concerns, I will start voting. Thanks!

Best Regards,
Jing

On Mon, Jun 26, 2023 at 11:35 AM Jing Ge  wrote:

> Hi,
>
> @Galen @Yuxia
>
> Your points are valid. Speaking of removing deprecated API, I have the
> same concern. As a matter of fact, I have been raising it in the discussion
> thread of API deprecation process[1]. This is another example that we
> should care about more factors than the migration period, thanks for
> the hint! I will add one more update into that thread with the reference of
> this thread.
>
> In a nutshell, this thread is focusing on the graduation process. Your
> valid concerns should be taken care of by the deprecation process.
> Please don't hesitate to share your thoughts in that thread.
>
>
> Best regards,
> Jing
>
> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>
>
> On Sun, Jun 25, 2023 at 3:48 AM yuxia  wrote:
>
>> Thanks Jing for bringing this up for discussion.
>> I agree it's not a blocker for graduating the FileSink to @PublicEvolving,
>> since the Sink API, which is the root cause, is marked as @PublicEvolving.
>> But I also share Galen's concern. At least it should be a
>> blocker for removing StreamingFileSink.
>> Btw, migrating to Sink seems to be a real headache; we may need
>> to pay more attention to this ticket and try to fix it.
>>
>> Best regards,
>> Yuxia
>>
>> ----- Original Message -----
>> From: "Galen Warren"
>> To: "dev"
>> Sent: Friday, June 23, 2023, 7:47:24 PM
>> Subject: Re: [DISCUSS] Graduate the FileSink to @PublicEvolving
>>
>> Thanks Jing. I can only offer my perspective on this, others may view it
>> differently.
>>
>> If FileSink is subject to data loss in the "stop-on-savepoint then
>> restart"
>> scenario, that makes it unusable for me, and presumably for anyone who
>> uses
>> it in a long-running streaming application and who cannot tolerate data
>> loss. I still use the (deprecated!) StreamingFileSink for this reason.
>>
>> The bigger picture here is that StreamingFileSink is deprecated and will
>> presumably ultimately be removed, to be replaced with FileSink. Graduating
>> the status of FileSink seems to be a step along that path; I'm concerned
>> about continuing down that path with such a critical issue present.
>> Ultimately, my concern is that FileSink will graduate fully and that
>> StreamingFileSink will be removed and that there will be no remaining
>> option to reliably stop/start streaming jobs that write to files without
>> incurring the risk of data loss.
>>
>> I'm sure I'd feel better about things if there were an ongoing effort to
>> address this FileSink issue and/or a commitment that StreamingFileSink
>> would not be removed until this issue is addressed.
>>
>> My two cents -- thanks.
>>
>>
>> On Fri, Jun 23, 2023 at 1:47 AM Jing Ge 
>> wrote:
>>
>> > Hi Galen,
>> >
>> > Thanks for the hint, which is helpful for getting a clear big picture.
>> > Afaic, this will not be a blocking issue for the graduation. There will
>> > always be some (potential) bugs in the implementation. The API has been
>> > very stable since 2020. The timing is good to graduate. WDYT?
>> > Furthermore, I'd like to have more opinions. All opinions together will
>> > help the community build a mature API graduation process.
>> >
>> > Best regards,
>> > Jing
>> >
>> > On Tue, Jun 20, 2023 at 12:48 PM Galen Warren
>> >  wrote:
>> >
>> > > Is this issue still unresolved?
>> > >
>> > >
>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-30238
>> > >
>> > > Based on prior discussion, I believe this could lead to data loss with
>> > > FileSink.
>> > >
>> > >
>> > >
>> > > On Tue, Jun 20, 2023, 5:41 AM Jing Ge 
>> > wrote:
>> > >
>> > > > Hi all,
>> > > >
>> > > > The FileSink has been marked as @Experimental[1] since Oct. 2020.
>> > > > According to FLIP-197[2], I would like to propose to graduate it
>> > > > to @PublicEvolving in the upcoming 1.18 release.
>> > > >
>> > > > On the other hand, as a related topic, FileSource was marked
>> > > > as @PublicEvolving[3] 3 years ago. It deserves a graduation
>> discussion
>> > > too.
>> > > >

Re: [ANNOUNCE] Apache Flink has won the 2023 SIGMOD Systems Award

2023-07-03 Thread Jing Ge
Congratulations!

Best regards,
Jing


On Mon, Jul 3, 2023 at 3:21 PM yuxia  wrote:

> Congratulations!
>
> Best regards,
> Yuxia
>
> --
> *From: *"Pushpa Ramakrishnan"
> *To: *"Xintong Song"
> *Cc: *"dev" , "User"
> *Sent: *Monday, July 3, 2023, 8:36:30 PM
> *Subject: *Re: [ANNOUNCE] Apache Flink has won the 2023 SIGMOD Systems Award
>
> Congratulations 🥳
>
> On 03-Jul-2023, at 3:30 PM, Xintong Song  wrote:
>
> 
> Dear Community,
>
> I'm pleased to share this good news with everyone. As some of you may have
> already heard, Apache Flink has won the 2023 SIGMOD Systems Award [1].
>
> "Apache Flink greatly expanded the use of stream data-processing." --
> SIGMOD Awards Committee
>
> SIGMOD is one of the most influential data management research conferences
> in the world. The Systems Award is awarded to an individual or set of
> individuals to recognize the development of a software or hardware system
> whose technical contributions have had significant impact on the theory or
> practice of large-scale data management systems. Winning of the award
> indicates the high recognition of Flink's technological advancement and
> industry influence from academia.
>
> As an open-source project, Flink wouldn't have come this far without the
> wide, active and supportive community behind it. Kudos to all of us who
> helped make this happen, including the over 1,400 contributors and many
> others who contributed in ways beyond code.
>
> Best,
>
> Xintong (on behalf of the Flink PMC)
>
>
> [1] https://sigmod.org/2023-sigmod-systems-award/
>
>
>


Re: [DISCUSS] FLIP-329: Add operator attribute to specify support for object-reuse

2023-07-04 Thread Jing Ge
Hi Xuannan, Hi Dong

Thanks for the Proposal! After reading the FLIP, I'd like to ask some
questions:

1. Naming convention for boolean variables. It is recommended to follow the
JavaBean convention [1], i.e. objectReuseCompliant as the variable name with
isObjectReuseCompliant() and setObjectReuseCompliant() as the method names.

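A minimal sketch of that naming (the enclosing class name is only for
illustration):

// Sketch: JavaBean-style naming for the proposed attribute.
public class OperatorAttributes {
    private boolean objectReuseCompliant;

    public boolean isObjectReuseCompliant() {
        return objectReuseCompliant;
    }

    public void setObjectReuseCompliant(boolean objectReuseCompliant) {
        this.objectReuseCompliant = objectReuseCompliant;
    }
}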

2.

   - If pipeline.object-reuse is set to true, records emitted by this
     operator will be re-used.
   - Otherwise, if getIsObjectReuseCompliant() returns true, records emitted
     by this operator will be re-used.
   - Otherwise, records emitted by this operator will be deep-copied before
     being given to the next operator in the chain.


If I understand you correctly, the hard-coded objectReuseCompliant should
take priority over the configuration, so the checking logic should be:

   - If getIsObjectReuseCompliant() returns true, records emitted by this
     operator will be re-used.
   - Otherwise, if pipeline.object-reuse is set to true, records emitted by
     this operator will be re-used.
   - Otherwise, records emitted by this operator will be deep-copied before
     being given to the next operator in the chain.

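As a sketch in code (OperatorAttributes is a made-up name here; only
PipelineOptions.OBJECT_REUSE is an existing Flink option):

import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.PipelineOptions;

// Sketch: the operator attribute is checked before the global flag. With
// two-valued logic the outcome is identical either way; the ordering only
// starts to matter once a third "unknown" state is introduced (see below).
static boolean reuseRecords(OperatorAttributes attrs, Configuration config) {
    if (attrs.isObjectReuseCompliant()) {            // fine-grained switch first
        return true;
    } else if (config.get(PipelineOptions.OBJECT_REUSE)) { // then global switch
        return true;
    }
    return false;                                    // deep-copy records instead
}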

The results are the same but the checking logic is different. There are
some additional thoughts, which lead us to the next question.

3. The current design lets specific operators enable object reuse and ignore
the global config. There could be another thought, on the contrary: if an
operator has hard-coded objectReuseCompliant as false, i.e. disabled
object reuse on purpose, records should not be reused even if the global
config pipeline.object-reuse is set to true. It turns out that
objectReuseCompliant could follow a three-valued logic (3VL): true: force
object reusing; false: force deep-copying; unknown: depends on the
pipeline.object-reuse config.


Best regards,
Jing


[1] https://en.wikipedia.org/wiki/JavaBeans

On Mon, Jul 3, 2023 at 4:25 AM Xuannan Su  wrote:

> Hi all,
>
> Dong(cc'ed) and I are opening this thread to discuss our proposal to
> add operator attribute to allow operator to specify support for
> object-reuse [1].
>
> Currently, the default configuration for pipeline.object-reuse is set
> to false to avoid data corruption, which can result in suboptimal
> performance. We propose adding APIs that operators can utilize to
> inform the Flink runtime whether it is safe to reuse the emitted
> records. This enhancement would enable Flink to maximize its
> performance using the default configuration.
>
> Please refer to the FLIP document for more details about the proposed
> design and implementation. We welcome any feedback and opinions on
> this proposal.
>
> Best regards,
>
> Dong and Xuannan
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255073749
>


Re: [DISUCSS] Deprecate multiple APIs in 1.18

2023-07-05 Thread Jing Ge
Hi Alex,


> > 3. remove SinkFunction.
> Which steps do you imply for the 1.18 release and for the 2.0 release?
>

for 2.0 release. 1.18 will be released soon.

Best regards,
Jing


On Wed, Jul 5, 2023 at 1:08 PM Alexander Fedulov <
alexander.fedu...@gmail.com> wrote:

> @Jing
> Just to clarify, when you say:
>
> 3. remove SinkFunction.
> Which steps do you imply for the 1.18 release and for the 2.0 release?

@Xintong
> A side note - with the new Source API we lose the ability to control
> checkpointing from the source since there is no lock anymore. This
> functionality
> is currently used in a variety of tests for the Sinks - the tests that rely
> on tight
> synchronization between specific elements passed from the source to the
> sink before
> allowing a checkpoint to complete (see FiniteTestSource [1]). Since FLIP-27
> Sources rely
> on decoupling via the mailbox, without exposing the lock, it is not
> immediately clear
> if it is possible to achieve the same functionality without major
> extensions in the
> runtime for such testing purposes. My hope initially was that only the
> legacy Sinks
> relied on this - this would have made it possible to drop
> SourceFunction+SinkFunction
> together, but, in fact, it also already became part of the new SinkV2
> testing IT suits
> [2]. Moreover, I know of at least one major connector that also relies on
> it for
> verifying committed sink metadata for a specific set of records (Iceberg)
> [3]. In my
> estimation this currently presents a major blocker for the SourceFunction
> removal.
>
> [1]
>
> https://github.com/apache/flink/blob/master/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/streaming/util/FiniteTestSource.java
> [2]
>
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-files/src/test/java/org/apache/flink/connector/file/sink/StreamingExecutionFileSinkITCase.java#L132
> [3]
>
> https://github.com/apache/iceberg/blob/master/flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/BoundedTestSource.java#L75C1-L85C2
>
> Best,
> Alex
>
> On Wed, 5 Jul 2023 at 10:47, Chesnay Schepler  wrote:
>
> > There's a whole bunch of metric APIs that would need to be deprecated.
> > That is of course if the metric FLIPs are being accepted.
> >
> > Which makes me wonder if we aren't doing things the wrong way around;
> > shouldn't the decision to deprecate an API be part of the FLIP
> discussion?
> >
> > On 05/07/2023 07:39, Xintong Song wrote:
> > > Thanks all for the discussion.
> > >
> > > It seems to me there's a consensus on marking the following as
> deprecated
> > > in 1.18:
> > > - DataSet API
> > > - SourceFunction
> > > - Queryable State
> > > - All Scala APIs
> > >
> > > More time is needed for deprecating SinkFunction.
> > >
> > > I'll leave this discussion open for a few more days. And if there's no
> > > objections, I'll create JIRA tickets accordingly.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Wed, Jul 5, 2023 at 1:34 PM Xintong Song 
> > wrote:
> > >
> > >> Thanks for the input, Jing. I'd also be +1 for option 1.
> > >>
> > >> Best,
> > >>
> > >> Xintong
> > >>
> > >>
> > >>
> > >> On Mon, Jul 3, 2023 at 7:20 PM Jing Ge 
> > wrote:
> > >>
> > >>> Hi Xintong,
> > >>>
> > >>> Option 1, the safe plan, would be:
> > >>>
> > >>> 1. graduate the Kafka, File, and JDBC connectors to @Public
> > >>> 2. graduate SinkV2 to @Public
> > >>> 3. remove SinkFunction.
> > >>>
> > >>> Option 2, a riskier plan but at a faster pace:
> > >>>
> > >>> 1. graduate SinkV2 to @Public, expecting more maintenance effort since
> > >>> there are many known and unsolved issues.
> > >>> 2. remove SinkFunction.
> > >>> 3. It depends on the connectors' contributors whether connectors can
> > >>> upgrade to Flink 2.0, since we moved forward with SinkV2 API without
> > >>> taking
> > >>> care of implementations in external connectors.
> > >>>
> > >>> I am ok with both of them and personally prefer option 1.
> > >>>
> > >>> Best regards,
> > >>> Jing
> > >>>
> > >>>
> > >

Re: [DISCUSS] FLIP-329: Add operator attribute to specify support for object-reuse

2023-07-05 Thread Jing Ge
Hi Xuannan, Hi Dong,

Thanks for your clarification.

@Xuannan

A Jira ticket has been created for the doc update:
https://issues.apache.org/jira/browse/FLINK-32546

@Dong

I don't have a concrete example. I just thought about it from a conceptual
or pattern's perspective. Since we have 1. coarse-grained global switch(CGS
as abbreviation), i.e. the pipeline.object-reuse and 2. fine-grained local
switch(FGS as abbreviation), i.e. the objectReuseCompliant variable for
specific operators/functions, there will be the following patterns with
appropriate combinations:

pattern 1: coarse-grained switch only. Local object reuse will be
controlled by the coarse-grained switch:
1.1 cgs == true -> local object reuse enabled
1.2 cgs == true -> local object reuse enabled
1.3 cgs == false -> local object reuse disabled, i.e. deep copy enabled
1.4 cgs == false -> local object reuse disabled, i.e. deep copy enabled

afaiu, this is the starting point. I wrote 4 on purpose to make the
regression check easier. We can consider it as the combinations with
cgs(true/false) and fgs(true/false) while fgs is ignored.

Now we introduce fine-grained switch. There will be two patterns:

pattern 2: fine-grained switch over coarse-grained switch. The
coarse-grained switch will be ignored when the local fine-grained switch
has a different value:
2.1 cgs == true and fgs == true -> local object reuse enabled
2.2 cgs == true and fgs == false -> local object reuse disabled, i.e. deep
copy enabled
2.3 cgs == false and fgs == true -> local object reuse enabled
2.4 cgs == false and fgs == false -> local object reuse disabled, i.e.
deep copy enabled

cgs is actually ignored.

Current FLIP is using a slightly different pattern:

pattern 3: fine-grained switch over coarse-grained switch only when the
coarse-grained switch is off, i.e. cgs OR fgs:
3.1 cgs == true and fgs == true -> local object reuse enabled
3.2 cgs == true and fgs == false -> local object reuse enabled
3.3 cgs == false and fgs == true -> local object reuse enabled
3.4 cgs == false and fgs == false -> local object reuse disabled, i.e.
deep copy enabled

All of those patterns are rational and each has a different focus. Which one
to choose depends on the real requirements.

As we can see, if fgs uses 2VL (two-valued logic), there is a regression
between pattern 1 and pattern 2. You are absolutely right in this case.
That's why I suggested 3VL, i.e. fgs will have three values: true, false,
unknown (e.g. null).

pattern 4: 3VL fgs with null as the init value (again, there are just two
combinations, I made it 4 on purpose):
4.1 cgs == true and fgs == null -> local object reuse enabled
4.2 cgs == true and fgs == null -> local object reuse enabled
4.3 cgs == false and fgs == null -> local object reuse disabled, i.e. deep
copy enabled
4.4 cgs == false and fgs == null -> local object reuse disabled, i.e. deep
copy enabled

Since the default value of fgs is null, pattern 4 is backward compatible
with pattern 1, which means no regression.

Now we set a value for fgs and follow pattern 2:
4.5 cgs == true and fgs == true -> local object reuse enabled
4.6 cgs == true and fgs == false -> local object reuse disabled, i.e. deep
copy enabled
4.7 cgs == false and fgs == true -> local object reuse enabled
4.8 cgs == false and fgs == false -> local object reuse disabled, i.e.
deep copy enabled

Pattern 4 contains pattern 3 with the following combinations (force-enabling
local object reuse):
4.5 cgs == true and fgs == true -> local object reuse enabled
4.2 cgs == true and fgs == null -> local object reuse enabled
4.7 cgs == false and fgs == true -> local object reuse enabled
4.4 cgs == false and fgs == null -> local object reuse disabled, i.e. deep
copy enabled

Comparing pattern 4 to pattern 3, users will have one additional degree of
flexibility to control (force-disable) the local object reuse capability
because of 3VL, i.e. 4.2+4.6 vs. 3.2.

This approach is commonly used in hierarchical RBAC to enable more
fine-grained access control for sub-roles.

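A minimal sketch of pattern 4, modeling fgs as a nullable Boolean
(illustrative only):

// Sketch: three-valued logic via a nullable Boolean. TRUE forces reuse
// (cases 4.5/4.7), FALSE forces deep copy (4.6/4.8), and null falls back
// to the coarse-grained pipeline.object-reuse flag (4.1-4.4).
static boolean objectReuseEnabled(boolean cgs, Boolean fgs) {
    return fgs != null ? fgs : cgs;
}
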
I hope I have been able to explain myself clearly. Looking forward to your
feedback.

Best regards,
Jing



On Wed, Jul 5, 2023 at 12:47 PM Dong Lin  wrote:

> Hi Jing,
>
> Thanks for the comments! Please find below my comments, which are based on
> the offline discussion with Xuannan.
>
> On Wed, Jul 5, 2023 at 1:36 AM Jing Ge  wrote:
>
>> Hi Xuannan, Hi Dong
>>
>> Thanks for the Proposal! After reading the FLIP, I'd like to ask some
>> questions:
>>
>> 1. Naming convention for boolean variables. It is recommended to follow
>> JavaBean [1], i.e. objectReuseCompliant as the variable name with
>> isObjectReuseCompliant() and setObjectReuseCompliant() as the methods' name.
>>
>>
> Good point. We have updated the FLIP as suggested.
>
>
>>
>> 2.
>>
>>-

Re: [ANNOUNCE] Flink 1.18 Feature Freeze Extended until July 24th, 2023

2023-07-06 Thread Jing Ge
Thanks for driving it and sharing the update!

Best regards,
Jing

On Thu, Jul 6, 2023 at 9:21 AM yuxia  wrote:

> Thanks for the update and thanks for your efforts.
>
> Best regards,
> Yuxia
>
> ----- Original Message -----
> From: "Rui Fan" <1996fan...@gmail.com>
> To: "dev" , re...@apache.org
> Cc: "Jing Ge" , snuyan...@gmail.com, "Konstantin
> Knauf" 
> Sent: Thursday, July 6, 2023, 3:06:28 PM
> Subject: Re: [ANNOUNCE] Flink 1.18 Feature Freeze Extended until July 24th, 2023
>
> Thanks for the update, and thank you for your efforts for the 1.18 release!
>
> Best,
> Rui Fan
>
> On Thu, Jul 6, 2023 at 2:40 PM Qingsheng Ren  wrote:
>
> > Hi devs,
> >
> > Recently we collected some feedback from developers, and in order to give
> > more time for polishing some important features in 1.18, we have decided to
> > extend the feature freeze date to:
> >
> > July 24th, 2023, at 00:00 CEST (UTC+2)
> >
> > which gives us ~2 weeks for development from now. There will be no
> > extension after Jul 24, so please arrange new features in the next
> release
> > if they cannot be finished before the closing date.
> >
> > Thanks everyone for your work in 1.18!
> >
> > Best regards,
> > Qingsheng, Jing, Konstantin and Sergey
> >
>


Re: [DISCUSS] FLIP-329: Add operator attribute to specify support for object-reuse

2023-07-06 Thread Jing Ge
Hi Dong,

one scenario I could imagine is that users enable the global object reuse
feature but force deep copy for some specific user-defined functions because
of limitations in those functions. But that is only my gut feeling. And
agreed, we could keep the solution simple for now as the FLIP describes and
upgrade to 3VL once such real requirements arise.
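
Just to illustrate that scenario (purely hypothetical sketch; the per-operator
setter below is NOT an existing or proposed API, only the global
enableObjectReuse() call is real):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ObjectReuseScenario {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.getConfig().enableObjectReuse(); // coarse-grained switch: on

        DataStream<String> out =
                env.fromElements("a", "b", "c")
                   .map(String::toUpperCase); // imagine a UDF that caches inputs
        // Hypothetical fine-grained switch, forcing deep copy for this operator:
        // out.getTransformation().setObjectReuseCompliant(false);
        out.print();
        env.execute("object-reuse scenario");
    }
}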

Best regards,
Jing

On Thu, Jul 6, 2023 at 12:30 PM Dong Lin  wrote:

> Hi Jing,
>
> Thank you for the detailed explanation. Please see my reply inline.
>
> On Thu, Jul 6, 2023 at 3:17 AM Jing Ge  wrote:
>
>> Hi Xuannan, Hi Dong,
>>
>> Thanks for your clarification.
>>
>> @Xuannan
>>
>> A Jira ticket has been created for the doc update:
>> https://issues.apache.org/jira/browse/FLINK-32546
>>
>> @Dong
>>
>> I don't have a concrete example. I just thought about it from a
>> conceptual or pattern's perspective. Since we have 1. coarse-grained global
>> switch (CGS as abbreviation), i.e. the pipeline.object-reuse and 2.
>> fine-grained local switch (FGS as abbreviation), i.e. the
>> objectReuseCompliant variable for specific operators/functions, there will
>> be the following patterns with appropriate combinations:
>>
>> pattern 1: coarse-grained switch only. Local object reuse will be
>> controlled by the coarse-grained switch:
>> 1.1 cgs == true -> local object reused enabled
>> 1.2 cgs == true  -> local object reused enabled
>> 1.3 cgs == false -> local object reused disabled, i.e. deep copy enabled
>> 1.4 cgs == false -> local object reused disabled, i.e. deep copy enabled
>>
>> afaiu, this is the starting point. I wrote 4 on purpose to make the
>> regression check easier. We can consider it as the combinations with
>> cgs(true/false) and fgs(true/false) while fgs is ignored.
>>
>> Now we introduce fine-grained switch. There will be two patterns:
>>
>> pattern 2: fine-grained switch over coarse-grained switch. Coarse-grained
>> switch will be ignored when the local fine-grained switch has different
>> value:
>> 2.1 cgs == true and fgs == true -> local object reused enabled
>> 2.2 cgs == true and fgs == false -> local object reused disabled, i.e.
>> deep copy enabled
>> 2.3 cgs == false and fgs == true -> local object reused enabled
>> 2.4 cgs == false and fgs == false -> local object reused disabled, i.e.
>> deep copy enabled
>>
>> cgs is actually ignored.
>>
>> Current FLIP is using a slightly different pattern:
>>
>> pattern 3: fine-grained switch over coarse-grained switch only when
>> coarse-grained switch is off, i.e. cgs OR fgs:
>> 3.1 cgs == true and fgs == true -> local object reused enabled
>> 3.2 cgs == true and fgs == false -> local object reused enabled
>> 3.3 cgs == false and fgs == true -> local object reused enabled
>> 3.4 cgs == false and fgs == false -> local object reused disabled, i.e.
>> deep copy enabled
>>
>> All of those patterns are rational and each has different focus. It
>> depends on the real requirement to choose one of them.
>>
>> As we can see, if fgs is using 2VL, there is a regression between pattern
>> 1 and pattern 2. You are absolutely right in this case. That's why I
>> suggested 3VL, i.e. fgs will have three values: true, false, unknown (e.g.
>> null)
>>
>> pattern 4: 3VL fgs with null as the init value (again, there are just two
>> combinations; I listed 4 on purpose):
>> 4.1 cgs == true and fgs == null -> local object reused enabled
>> 4.2 cgs == true and fgs == null -> local object reused enabled
>> 4.3 cgs == false and fgs == null -> local object reused disabled, i.e.
>> deep copy enabled
>> 4.4 cgs == false and fgs == null -> local object reused disabled, i.e.
>> deep copy enabled
>>
>> Since the default value of fgs is null, pattern 4 is backward compatible
>> with pattern 1, which means no regression.
>>
>> Now we will set value to fgs and follow the pattern 2:
>> 4.5 cgs == true and fgs == true -> local object reused enabled
>> 4.6 cgs == true and fgs == false -> local object reused disabled, i.e.
>> deep copy enabled
>> 4.7 cgs == false and fgs == true -> local object reused enabled
>> 4.8 cgs == false and fgs == false -> local object reused disabled, i.e.
>> deep copy enabled
>>
>> Pattern 4 contains pattern 3 with the following combinations (force
>> enabling local object reuse):
>> 4.5 cgs == true and fgs == true -> local object reused enabled
>> 4.2 cgs == true and fgs == null -> local object reused enabled

Re: [DISCUSS] FLIP-314: Support Customized Job Lineage Listener

2023-07-06 Thread Jing Ge
Hi Shammon,

Thanks for the clarification. Atlas might have its historical reasons going
back to the Hadoop era, or maybe even back to Hibernate, where Entity and
Relation were commonly used. Flink already uses Vertex and Edge to describe
the DAG. Some popular tools like dbt also use this convention[1] and,
afaik, most graph frameworks use vertex and edge too. It will be easier for
Flink devs and users to have a consistent naming convention for the same
concept, i.e. in this case, the DAG.
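
In other words, something along these lines (just a sketch of the naming
idea; method names are illustrative, not a concrete API proposal):

import java.util.List;

// A source or sink of the job's lineage DAG.
interface LineageVertex {
    String name();
}

// A directed connection between two vertices of the lineage DAG.
interface LineageEdge {
    LineageVertex source();
    LineageVertex sink();
}

// The whole lineage DAG of a job.
interface LineageGraph {
    List<LineageVertex> vertices();
    List<LineageEdge> edges();
}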

Best regards,
Jing

[1]
https://docs.getdbt.com/docs/dbt-cloud-apis/discovery-use-cases-and-examples#discovery

On Wed, Jul 5, 2023 at 11:28 AM Shammon FY  wrote:

> Hi Jing,
>
> Thanks for your feedback.
>
> > 1. TableColumnLineageRelation#sinkColumn() should return
> TableColumnLineageEntity instead of String, right?
>
> The `sinkColumn()` will return `String` which is the column name in the
> sink connector. I found the name of `TableColumnLineageEntity` may
> cause ambiguity and I have renamed it to `TableColumnSourceLineageEntity`.
> In my mind the `TableColumnLineageRelation` represents the lineage for each
> sink column, each column may be computed from multiple sources and columns.
> I use `TableColumnSourceLineageEntity` to manage each source and its
> columns for the sink column, so `TableColumnLineageRelation` has a sink
> column name and `TableColumnSourceLineageEntity` list.
>
> > 2. Since LineageRelation already contains all information to build the
> lineage between sources and sink, do we still need to set the LineageEntity
> in the source?
>
> The lineage interface of `DataStream` is very flexible. We have added
> `setLineageEntity` to the source to limit and verify user behavior,
> ensuring that users have not added non-existent sources as lineage.
>
> > 3. About the "Entity" and "Relation" naming, I was confused too, like
> Qingsheng mentioned. How about LineageVertex, LineageEdge, and LineageEdges
> which contains multiple LineageEdge?
>
> We referred to `Atlas` for the name of lineage, it uses `Entity` and
> `Relation` to represent the lineage relationship and another metadata
> service `Datahub` uses `DataSet` to represent the entity. I think `Entity`
> and `Relation` are nicer for lineage, what do you think of it?
>
> Best,
> Shammon FY
>
>
> On Thu, Jun 29, 2023 at 4:21 AM Jing Ge 
> wrote:
>
> > Hi Shammon,
> >
> > Thanks for your proposal. After reading the FLIP, I'd like to ask
> > some questions to make sure we are on the same page. Thanks!
> >
> > 1. TableColumnLineageRelation#sinkColumn() should return
> > TableColumnLineageEntity instead of String, right?
> >
> > 2. Since LineageRelation already contains all information to build the
> > lineage between sources and sink, do we still need to set the
> LineageEntity
> > in the source?
> >
> > 3. About the "Entity" and "Relation" naming, I was confused too, like
> > Qingsheng mentioned. How about LineageVertex, LineageEdge, and
> LineageEdges
> > which contains multiple LineageEdge? E.g. multiple sources join into one
> > sink, or, edges of columns from one or different tables, etc.
> >
> > Best regards,
> > Jing
> >
> > On Sun, Jun 25, 2023 at 2:06 PM Shammon FY  wrote:
> >
> > > Hi yuxia and Yun,
> > >
> > > Thanks for your input.
> > >
> > > For yuxia:
> > > > 1: What kinds of JobStatus will the `JobExecutionStatusEven`
> include?
> > >
> > > At present, we only need to notify the listener when a job goes to
> > > termination, but I think it makes sense to add generic `oldStatus` and
> > > `newStatus` in the listener and users can update the job state in their
> > > service as needed.
> > >
> > > > 2: I'm really confused about the `config()` included in
> > `LineageEntity`,
> > > where is it from and what is it for ?
> > >
> > > The `config` in `LineageEntity` is used for users to get options for
> > source
> > > and sink connectors. As the examples in the FLIP, users can add
> > > server/group/topic information in the config for kafka and create
> lineage
> > > entities for `DataStream` jobs, then the listeners can get this
> > information
> > > to identify the same connector in different jobs. Otherwise, the
> `config`
> > > in `TableLineageEntity` will be the same as `getOptions` in
> > > `CatalogBaseTable`.
> > >
> > > > 3: Regardless whether `inputChangelogMode` in
> `TableSinkLineageEntity`
> > is
> > > needed or not, since `TableSinkLineageEntity` contains

Re: [DISCUSS] FLIP-325: Support configuring end-to-end allowed latency

2023-07-06 Thread Jing Ge
Hi,

Thank you all for the inspired discussion. Really appreciate it!

@Dong I'd like to ask some (stupid) questions to make sure I understand
your thoughts correctly.

1. It makes no sense to send the same type of RecordAttributes twice, right?
E.g., if one RecordAttributes(isBacklog=true) has been sent, a new
RecordAttributes will only be sent when isBacklog changes to false, and vice
versa. In this way, the number of RecordAttributes will be very limited (see
the sketch after question 5).

2. Since source readers can invoke Output#emitRecordAttributes to emit
RecordAttributes(isBacklog=true/false), it might be weird to send
RecordAttributes with different isBacklog values back and forth too often.
Devs and users should pay attention to it. Something is wrong when such a
thing happens (metrics for monitoring?). Is this correct?

3. Is there any relationship between end-to-end-latency and the checkpoint
interval that users should pay attention to? In the example described in
the FLIP, both have the same value, 2 min. What if end-to-end-latency is
configured to be bigger than the checkpoint interval? Could checkpoints
within the end-to-end-latency window be skipped?

4. Afaiu, one major discussion point is that isBacklog can be derived from
back pressure and there will be no need for RecordAttributes. Consider a
Flink job with resources rich enough that there is no back pressure (it
would be difficult to provision exactly enough resources so that everything
is fine normally and back pressure appears only for backlog) but we still
want to improve the throughput. We then need some other way to derive
isBacklog. That is the reason why RecordAttributes has been introduced. Did
I understand it correctly?

5. NIT: Just like we talked about in another thread, JavaBean naming
convention is recommended, i.e. isBacklog() & setBacklog() instead of
getIsBacklog() and setIsBacklog().
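
Regarding 1., the dedup behavior I have in mind would look roughly like this
(sketch only; it assumes the RecordAttributes/Output API proposed in this
FLIP, and the builder/setter names are illustrative):

// Only forward RecordAttributes when isBacklog actually flips, so
// consecutive duplicates are swallowed and the number of emitted events is
// bounded by the number of isBacklog transitions.
private Boolean lastIsBacklog = null;

void emitIfChanged(boolean isBacklog, Output<?> output) {
    if (lastIsBacklog == null || lastIsBacklog != isBacklog) {
        output.emitRecordAttributes(
                new RecordAttributesBuilder().setBacklog(isBacklog).build());
        lastIsBacklog = isBacklog;
    }
}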

Best regards,
Jing

On Thu, Jul 6, 2023 at 2:38 PM Dong Lin  wrote:

> Hi Shammon,
>
> Thanks for your comments. Please see my reply inline.
>
>
> On Thu, Jul 6, 2023 at 12:47 PM Shammon FY  wrote:
>
> > Hi,
> >
> > Thanks for your replay @Dong. I really agree with Piotr's points and I
> > would like to share some thoughts from my side.
> >
> > About the latency for mini-batch mechanism in Flink SQL, I still think
> the
> > description in the FLIP is not right. If there are N operators and the
> > whole process time for data in the job is `t`, then the latency in
> > mini-batch will be `table.exec.mini-batch.allow-latency`+`t`, not `
> > table.exec.mini-batch.allow-latency`*N. I think this is one of the
> > foundations of this FLIP, and you may need to confirm it again.
> >
>
> Given that we agree to have a mechanism to support end-to-end latency for
> DataStream programs, I think the exact semantics of
> table.exec.mini-batch.allow-latency will not affect the motivation or API
> design of this FLIP. I have updated the FLIP to remove any mention of
> table.exec.mini-batch.allow-latency.
>
>
> >
> > I think supporting similar mechanisms in the runtime and balance latency
> > and throughput dynamically for all flink jobs is a very good idea, and I
> > have some questions for that.
> >
> > 1. We encounter a situation where the workload is high when processing
> > snapshot data and we need mini-batch in sql for performance reason. But
> the
> > workload is low when processing delta data, we need to automatically
> adjust
> > the mini-batch SQL for them, or even cancel the mini-batch during delta
> > processing. I think this FLIP meets our needs, but I think we need a
> > general solution which covers all source types in flink, and the
> > `isBacklog` in the FLIP is only one strategy.
> >
>
> The focus of this FLIP is to allow Flink runtime to adjust the behavior of
> operators (e.g. the buffer time) based on the IsBacklog status of sources
> and the user-specified execution.end-to-end-latency (effective only when
> there is no backlog). The FLIP assumes there is already a strategy for
> sources to determine the IsProcessingBacklog status without adding more
> strategies.
>
> I agree it is useful to introduce more strategies to determine the
> IsProcessingBacklog status for sources. We can determine the
> IsProcessingBacklog status based on the backpressure metrics, the
> event-time watermark lag, or anything we find reasonable. I would like to
> work on this in follow-up FLIPs and that we don't work on too many things
> in the same FLIP.
>
> Would this be OK with you?
>
>
> > From the FLIP I think there should be two parts: dynamic trigger flush
> > event in JM and dynamic trigger flush operations in Operator. We need to
> > introduce much more general interfaces for them, such as
> > `DynamicFlushStrategy` in JM and `DynamicFlushOperation` in TM? As Piotr
> > mentioned above, we can collect many information from TM locally such as
> > backpressure, queue size and `Operator` can decide whether to buffer data
> > or process it immediately.  JM is also the same, it can decide to send
> > flush events on a regular basis or s

Re: [DISCUSS] FLIP-314: Support Customized Job Lineage Listener

2023-07-07 Thread Jing Ge
Hi Shammon,

Thanks for the update!

Best regards,
Jing

On Fri, Jul 7, 2023 at 4:46 AM Shammon FY  wrote:

> Thanks Jing, sounds good to me.
>
> I have updated the FLIP and renamed the lineage related classes to
> `LineageGraph`, `LineageVertex` and `LineageEdge` and keep it consistent
> with the job definition in Flink.
>
> Best,
> Shammon FY
>
> On Thu, Jul 6, 2023 at 8:25 PM Jing Ge  wrote:
>
> > Hi Shammon,
> >
> > Thanks for the clarification. Atlas might have its historical reasons going
> > back to the Hadoop era, or maybe even back to Hibernate, where Entity and
> > Relation were commonly used. Flink already uses Vertex and Edge to
> > describe the DAG. Some popular tools like dbt also use this convention[1]
> > and, afaik, most graph frameworks use vertex and edge too. It will be
> > easier for Flink devs and users to have a consistent naming convention
> > for the same concept, i.e. in this case, the DAG.
> >
> > Best regards,
> > Jing
> >
> > [1]
> >
> >
> https://docs.getdbt.com/docs/dbt-cloud-apis/discovery-use-cases-and-examples#discovery
> >
> > On Wed, Jul 5, 2023 at 11:28 AM Shammon FY  wrote:
> >
> > > Hi Jing,
> > >
> > > Thanks for your feedback.
> > >
> > > > 1. TableColumnLineageRelation#sinkColumn() should return
> > > TableColumnLineageEntity instead of String, right?
> > >
> > > The `sinkColumn()` will return `String` which is the column name in the
> > > sink connector. I found the name of `TableColumnLineageEntity` may
> > > cause ambiguity and I have renamed it to
> > `TableColumnSourceLineageEntity`.
> > > In my mind the `TableColumnLineageRelation` represents the lineage for
> > each
> > > sink column, each column may be computed from multiple sources and
> > columns.
> > > I use `TableColumnSourceLineageEntity` to manage each source and its
> > > columns for the sink column, so `TableColumnLineageRelation` has a sink
> > > column name and `TableColumnSourceLineageEntity` list.
> > >
> > > > 2. Since LineageRelation already contains all information to build
> the
> > > lineage between sources and sink, do we still need to set the
> > LineageEntity
> > > in the source?
> > >
> > > The lineage interface of `DataStream` is very flexible. We have added
> > > `setLineageEntity` to the source to limit and verify user behavior,
> > > ensuring that users have not added non-existent sources as lineage.
> > >
> > > > 3. About the "Entity" and "Relation" naming, I was confused too, like
> > > Qingsheng mentioned. How about LineageVertex, LineageEdge, and
> > LineageEdges
> > > which contains multiple LineageEdge?
> > >
> > > We referred to `Atlas` for the name of lineage, it uses `Entity` and
> > > `Relation` to represent the lineage relationship and another metadata
> > > service `Datahub` uses `DataSet` to represent the entity. I think
> > `Entity`
> > > and `Relation` are nicer for lineage, what do you think of it?
> > >
> > > Best,
> > > Shammon FY
> > >
> > >
> > > On Thu, Jun 29, 2023 at 4:21 AM Jing Ge 
> > > wrote:
> > >
> > > > Hi Shammon,
> > > >
> > > > Thanks for your proposal. After reading the FLIP, I'd like to ask
> > > > some questions to make sure we are on the same page. Thanks!
> > > >
> > > > 1. TableColumnLineageRelation#sinkColumn() should return
> > > > TableColumnLineageEntity instead of String, right?
> > > >
> > > > 2. Since LineageRelation already contains all information to build
> the
> > > > lineage between sources and sink, do we still need to set the
> > > LineageEntity
> > > > in the source?
> > > >
> > > > 3. About the "Entity" and "Relation" naming, I was confused too, like
> > > > Qingsheng mentioned. How about LineageVertex, LineageEdge, and
> > > LineageEdges
> > > > which contains multiple LineageEdge? E.g. multiple sources join into
> > one
> > > > sink, or, edges of columns from one or different tables, etc.
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Sun, Jun 25, 2023 at 2:06 PM Shammon FY 
> wrote:
> > > >
> > > > > Hi yuxia and Yun,
> > > > >
> > > > > Thanks for your input.
> > > >

Re: [DISCUSS] FLIP-325: Support configuring end-to-end allowed latency

2023-07-07 Thread Jing Ge
Hi Dong,

Thanks for your clarification.


> Actually, I think it could make sense to toggle isBacklog between true and
> false while the job is running.
>

If isBacklog is toggled back and forth too often (e.g. by an unexpected
mistake, an unstable system, etc.), a large number of RecordAttributes might
be triggered, which will lead to performance issues. This would not be the
right way to use RecordAttributes, right? Devs and users should be aware of
it and know how to monitor, maintain, and fix such issues.
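
E.g., with a threshold-based strategy, some hysteresis would be needed to
avoid exactly this kind of flapping. A sketch (the 5min/30s thresholds mirror
the watermark-lag example in your reply below and are illustrative only):

import java.time.Duration;

// Derive isBacklog from watermark lag with two thresholds, so that a lag
// hovering around a single threshold cannot flip the flag back and forth.
public class BacklogDetector {
    private static final Duration HIGH = Duration.ofMinutes(5);
    private static final Duration LOW = Duration.ofSeconds(30);

    private boolean isBacklog = false;

    public boolean update(Duration watermarkLag) {
        if (!isBacklog && watermarkLag.compareTo(HIGH) > 0) {
            isBacklog = true;   // fell far behind: enter backlog mode
        } else if (isBacklog && watermarkLag.compareTo(LOW) < 0) {
            isBacklog = false;  // caught up again: leave backlog mode
        }
        return isBacklog;       // the HIGH/LOW gap prevents flapping
    }
}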

Your reply contains valuable information. It might make sense to add it
to the FLIP:

1. It is up to the operator to decide when to emit RecordAttributes. But
devs and users should be aware that the number of RecordAttributes should
not be so high as to cause performance issues.
2. Although users can decide how to configure them, the end-to-end latency
should be (commonly?) configured lower than the checkpoint interval.
3. The three ways you mentioned for how to derive isBacklog.

WDYT?

Best regards,
Jing


On Fri, Jul 7, 2023 at 3:13 AM Dong Lin  wrote:

> Hi Jing,
>
> Thanks for the comments. Please see my reply inline.
>
> On Fri, Jul 7, 2023 at 5:40 AM Jing Ge  wrote:
>
> > Hi,
> >
> > Thank you all for the inspired discussion. Really appreciate it!
> >
> > @Dong I'd like to ask some (stupid) questions to make sure I understand
> > your thoughts correctly.
> >
> > 1. It will make no sense to send the same type of RecordAttributes right?
> > e.g.  if one RecordAttributes(isBacklog=true) has been sent, a new
> > RecordAttributes will be only sent when isBacklog is changed to be false,
> > and vice versa. In this way, the number of RecordAttributes will be very
> > limited.
> >
>
> Yes, you are right. Actually, this is what we plan to do when we update
> operators to emit RecordAttributes via `Output#emitRecordAttributes()`.
>
> Note that the FLIP does not specify the frequency of how operators should
> invoke `Output#emitRecordAttributes()`. It is up to the operator
> to decide when to emit RecordAttributes.
>
>
> > 2. Since source readers can invoke Output#emitRecordAttributes to emit
> > RecordAttributes(isBacklog=true/false), it might be weird to send
> > RecordAttributes with different isBacklog back and forth too often. Devs
> > and users should pay attention to it. Something is wrong when such a
> thing
> > happens(metrics for monitoring?). Is this correct?
> >
>
>

> Actually, I think it could make sense to toggle isBacklog between true and
> false while the job is running.
>
>

> Suppose the job is reading from user-action data from Kafka and there is a
> traffic spike for 2 hours. If the job keeps running in pure stream mode,
> the watermark lag might keep increasing during this period because the
> job's processing capability can not catch up with the Kafka input
> throughput. In this case, it can be beneficial to dynamically switch
> isBacklog to true when watermarkLag exceeds a given threshold (e.g. 5
> minutes), and switch isBacklog to false again when the watermarkLag is low
> enough (30 seconds).
>
>
> > 3. Is there any relationship between end-to-end-latency and checkpoint
> > interval that users should pay attention to? In the example described in
> > the FLIP, both have the same value, 2 min. What about end-to-end-latency
> is
> > configured bigger than checkpoint interval? Could checkpoint between
> > end-to-end-latency be skipped?
> >
>
> This FLIP would not enforce any relationship between end-to-end latency and
> checkpoint interval. Users are free to configure end-to-end latency to be
> bigger than checkpoint interval.
>
> I don't think there exists any use-case which requires end-to-end latency
> to be higher than the checkpoint interval. Note that introducing a
> relationship between these two configs would increase code complexity and
> also make the documentation of these configs a bit more complex for users
> to understand.
>
> Since there is no correctness issue when a user sets end-to-end latency to
> be bigger than the checkpointing interval, I think it is simpler to just let
> the user decide how to configure them.
>
>
> > 4. Afaiu, one major discussion point is that isBacklog can be derived
> from
> > back pressure and there will be no need of RecordAttributes. In case a
> > Flink job has rich resources that there is no back pressure (it will be
> > difficult to perfectly have just enough resources that everything is fine
> > but will have back pressure only for backlog) but we want to improve the
> > throughput. We then need some other ways to derive isBacklog. That is the
> > reason why RecordAttributes has been introduced.

Re: [DISUCSS] Deprecate multiple APIs in 1.18

2023-07-07 Thread Jing Ge
Hi Alex,

I would follow FLIP-197 and try to release them asap depending on dev
resources and how difficult those issues are. The fastest timeline is the
period defined in FLIP-197 in ideal conditions.

Best regards,
Jing

On Fri, Jul 7, 2023 at 12:20 PM Alexander Fedulov <
alexander.fedu...@gmail.com> wrote:

> @Xintong
> > - IIUC, the testing scenario you described is like blocking the source
> for
> > proceeding (emit data, finish, etc.) until a checkpoint is finished.
>
> It is more tricky than that - we need to prevent the Sink from receiving a
> checkpoint barrier until the Source is done emitting a given set of
> records. In
> the current tests, which are also used for V2 Sinks, SourceFunction
> controls
> when the Sink is "allowed" to commit by holding the checkpoint lock while
> producing the records. The lock is not available in the new Source by
> design
> and we need a solution that provides the same functionality (without
> modifying
> the Sinks). I am currently checking if a workaround is at all possible
> without
> adjusting anything in the Source interface.
>
> > I may not have understood all the details, but based on what you
> described
> > I'd hesitate to block the deprecation / removal of SourceFunction on
> this.
>
> I don't think we should, just wanted to highlight that there are some
> unknowns
> with respect to estimating the amount of work required.
>
> @Jing
> I want to understand in which release would you target graduation of the
> mentioned connectors to @Public/@PublicEvolving - basically the anticipated
> timeline of the steps in both options with respect to releases.
>
> Best,
> Alex
>
> On Fri, 7 Jul 2023 at 10:53, Xintong Song  wrote:
>
> > Thanks all for the discussion. I've created FLINK-32557 for this.
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Thu, Jul 6, 2023 at 1:00 AM Jing Ge 
> wrote:
> >
> > > Hi Alex,
> > >
> > >
> > > > > 3. remove SinkFunction.
> > > > Which steps do you imply for the 1.18 release and for the 2.0
> release?
> > > >
> > >
> > > for 2.0 release. 1.18 will be released soon.
> > >
> > > Best regards,
> > > Jing
> > >
> > >
> > > On Wed, Jul 5, 2023 at 1:08 PM Alexander Fedulov <
> > > alexander.fedu...@gmail.com> wrote:
> > >
> > > > @Jing
> > > > Just to clarify, when you say:
> > > >
> > > > 3. remove SinkFunction.
> > > > Which steps do you imply for the 1.18 release and for the 2.0
> release?
> > >
> > > @Xintong
> > > > A side note - with the new Source API we lose the ability to control
> > > > checkpointing from the source since there is no lock anymore. This
> > > > functionality
> > > > is currently used in a variety of tests for the Sinks - the tests
> that
> > > rely
> > > > on tight
> > > > synchronization between specific elements passed from the source  to
> > the
> > > > sink before
> > > > allowing a checkpoint to complete (see FiniteTestSource [1]). Since
> > > FLIP-27
> > > > Sources rely
> > > > on decoupling via the mailbox, without exposing the lock, it is not
> > > > immediately clear
> > > > if it is possible to achieve the same functionality without major
> > > > extensions in the
> > > > runtime for such testing purposes. My hope initially was that only
> the
> > > > legacy Sinks
> > > > relied on this - this would have made it possible to drop
> > > > SourceFunction+SinkFunction
> > > > together, but, in fact, it also already became part of the new SinkV2
> > > > testing IT suits
> > > > [2]. Moreover, I know of at least one major connector that also
> relies
> > on
> > > > it for
> > > > verifying committed sink metadata for a specific set of records
> > (Iceberg)
> > > > [3]. In my
> > > > estimation this currently presents a major blocker for the
> > SourceFunction
> > > > removal.
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/master/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/streaming/util/FiniteTestSource.java
> > > > [2]
> > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/master/flink-connectors/flink-conne

Re: [VOTE] Release 2.0 must-have work items

2023-07-10 Thread Jing Ge
+1

On Mon, Jul 10, 2023 at 12:52 PM Yu Li  wrote:

> +1 (binding)
>
> Thanks for driving this and great to see us moving forward.
>
> Best Regards,
> Yu
>
>
> On Mon, 10 Jul 2023 at 11:59, Feng Wang  wrote:
>
> > +1
> > Thanks for driving this, looking forward to the next stage of flink.
> >
> > On Fri, Jul 7, 2023 at 5:31 PM Xintong Song 
> wrote:
> >
> > > Hi all,
> > >
> > > I'd like to start the VOTE for the must-have work items for release 2.0
> > > [1]. The corresponding discussion thread is [2].
> > >
> > > Please note that once the vote is approved, any changes to the
> must-have
> > > items (adding / removing must-have items, changing the priority)
> requires
> > > another vote. Assigning contributors / reviewers, updating
> descriptions /
> > > progress, changes to nice-to-have items do not require another vote.
> > >
> > > The vote will be open until at least July 12, following the consensus
> > > voting process. Votes of PMC members are binding.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > > [1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > >
> > > [2] https://lists.apache.org/thread/l3dkdypyrovd3txzodn07lgdwtwvhgk4
> > >
> >
>


[VOTE] Graduate the FileSink to @PublicEvolving

2023-07-10 Thread Jing Ge
Hi,

I'd like to start the VOTE for graduating the FileSink to @PublicEvolving.
The discussion thread can be found at [1]

The vote will be open until at least July 13, 12pm GMT (72 hours), unless
there is an objection or insufficient votes.

Thanks,

Jing Ge

[1] https://lists.apache.org/thread/wxoo7py5pqqlz37l4w8jrq6qdvsdq5wc


Re: [DISCUSS] FLIP-325: Support configuring end-to-end allowed latency

2023-07-10 Thread Jing Ge
Hi Dong,

Thanks for the update!

Best regards,
Jing

On Sun, Jul 9, 2023 at 3:26 AM Dong Lin  wrote:

> Hi Jing,
>
> Thanks for the suggestions. Please see my reply inline.
>
> On Fri, Jul 7, 2023 at 3:50 PM Jing Ge  wrote:
>
> > Hi Dong,
> >
> > Thanks for your clarification.
> >
> >
> > > Actually, I think it could make sense to toggle isBacklog between true
> > and
> > > false while the job is running.
> > >
> >
> > If isBacklog is toggled too often back and forth(e.g. by unexpected
> > mistake, unstable system, etc), a large amount of RecordAttributes might
> be
> > triggered, which will lead to performance issues. This should not be the
> > right way to use RecordAttributes right? Devs and users should be aware
> of
> > it and know how to monitor, maintain, and fix issues.
> >
> > Your reply contains valuable information. It might make sense to add them
> > into the FLIP:
> >
> > 1. It is up to the operator to decide when to emit RecordAttributes. But
> > devs and users should be aware that the number of RecordAttributes should
> > not be too high to cause performance issues.
> >
>
> Sure, I have updated the FLIP to include the following statement:
>
> "Note: It is up to the operator implementation to decide when (and how
> often) to emit RecordAttributes, similar to how operators emit
> RecordAttributes. The overhead of emitting Watermark is similar to the
> overhead of emitting Watermark"
>
>
> > 2. Although users can decide how to configure them, the end-to-end
> latency
> > should be (commonly?) configured lower than the checkpoint interval.
> >
>
> Since this is related to the performance tuning rather than
> correctness/functionality of the core APIs, I added the following sentence
> in the Appendix section:
>
> "We expect that end-to-end latency will typically be configured with a
> value lower than the checkpoint interval"
>
> 3. The three ways you mentioned for how to derive isBacklog.
> >
>
> Sure, I have updated the FLIP to include the following information:
>
> "In the future, we expect IsProcessingBacklog can very likely be determined
> using the following strategies ..."
>
> Best,
> Dong
>
>
> >
> > WDYT?
> >
> > Best regards,
> > Jing
> >
> >
> > On Fri, Jul 7, 2023 at 3:13 AM Dong Lin  wrote:
> >
> > > Hi Jing,
> > >
> > > Thanks for the comments. Please see my reply inline.
> > >
> > > On Fri, Jul 7, 2023 at 5:40 AM Jing Ge 
> > wrote:
> > >
> > > > Hi,
> > > >
> > > > Thank you all for the inspired discussion. Really appreciate it!
> > > >
> > > > @Dong I'd like to ask some (stupid) questions to make sure I
> understand
> > > > your thoughts correctly.
> > > >
> > > > 1. It will make no sense to send the same type of RecordAttributes
> > right?
> > > > e.g.  if one RecordAttributes(isBacklog=true) has been sent, a new
> > > > RecordAttributes will be only sent when isBacklog is changed to be
> > false,
> > > > and vice versa. In this way, the number of RecordAttributes will be
> > very
> > > > limited.
> > > >
> > >
> > > Yes, you are right. Actually, this is what we plan to do when we update
> > > operators to emit RecordAttributes via `Output#emitRecordAttributes()`.
> > >
> > > Note that the FLIP does not specify the frequency of how operators
> should
> > > invoke `Output#emitRecordAttributes()`. It is up to the operator
> > > to decide when to emit RecordAttributes.
> > >
> > >
> > > > 2. Since source readers can invoke Output#emitRecordAttributes to
> emit
> > > > RecordAttributes(isBacklog=true/false), it might be weird to send
> > > > RecordAttributes with different isBacklog back and forth too often.
> > Devs
> > > > and users should pay attention to it. Something is wrong when such a
> > > thing
> > > > happens(metrics for monitoring?). Is this correct?
> > > >
> > >
> > >
> >
> > > Actually, I think it could make sense to toggle isBacklog between true
> > and
> > > false while the job is running.
> > >
> > >
> >
> > > Suppose the job is reading from user-action data from Kafka and there
> is
> > a
> > > traffic spike for 2 hours. If the job keeps running in pure stream
> mode,
> > > 

Re: [DISCUSS] FLIP-327: Support stream-batch unified operator to improve job throughput when processing backlog data

2023-07-10 Thread Jing Ge
Hi Dong,

Thanks for the proposal! The FLIP is already in good shape. I have some NIT
questions.

1. It is a little bit weird to place the hint that some features have been
moved to FLIP-331 right after the motivation, because at that point readers
don't yet have the context to know which features are meant. I would
suggest moving the note to the beginning of the "Public interfaces" section.
2. It is also a little bit weird to describe all behaviour changes at first
but only focus on one single feature, i.e. how to implement
internalSorterSupported. TBH, I was lost while I was reading the Public
interfaces. Maybe change the FLIP title? Another option could be to write a
short summary of all features and point out that this FLIP will only focus
on the internalSorterSupported feature. Others could be found in FLIP-331.
WDYT?
3. There should be a typo at 4) Checkpoint and failover strategy -> Mixed
mode ->

   - If any task fails when isBacklog=false [should be: isBacklog=true], this
   task is restarted to re-process its input from the beginning.
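
As an aside, Dong's explanation below of why the attributes are @Nullable
might be easier to follow with a sketch. This is only my illustration of the
described null-defaulting behavior; the logger and helper are my assumptions,
not the FLIP's actual code:

import javax.annotation.Nullable;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class OperatorAttributesBuilder {
    private static final Logger LOG =
            LoggerFactory.getLogger(OperatorAttributesBuilder.class);

    @Nullable private Boolean isOutputOnEOF = null;
    @Nullable private Boolean isOutputOnCheckpoint = null;
    @Nullable private Boolean isInternalSorterSupported = null;

    public OperatorAttributes build() {
        return new OperatorAttributes(
                defaultIfNull("isOutputOnEOF", isOutputOnEOF),
                defaultIfNull("isOutputOnCheckpoint", isOutputOnCheckpoint),
                defaultIfNull("isInternalSorterSupported", isInternalSorterSupported));
    }

    // Unset (null) attributes are logged at DEBUG and default to false.
    private static boolean defaultIfNull(String name, Boolean value) {
        if (value != null) {
            return value;
        }
        LOG.debug("{} is not explicitly set; defaulting to false.", name);
        return false;
    }
}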



Best regards
Jing


On Thu, Jul 6, 2023 at 1:24 PM Dong Lin  wrote:

> Hi Piotr,
>
> Thanks for your comments! Please see my reply inline.
>
> On Wed, Jul 5, 2023 at 11:44 PM Piotr Nowojski 
> wrote:
>
> > Hi Dong,
> >
> > I have a couple of questions.
> >
> > Could you explain why those properties
> >
> > @Nullable private Boolean isOutputOnEOF = null;
> > @Nullable private Boolean isOutputOnCheckpoint = null;
> > @Nullable private Boolean isInternalSorterSupported = null;
> >
> > must be `@Nullable`, instead of having the default value set to `false`?
> >
>
> By initializing these private variables in OperatorAttributesBuilder as
> null, we can implement `OperatorAttributesBuilder#build()` in such a way
> that it can print DEBUG level logging to say "isOutputOnCheckpoint is not
> explicitly set". This can help user/SRE debug performance issues (or lack
> of the expected optimization) due to operators not explicitly setting the
> right operator attribute.
>
> For example, we might want a job to always use the longer checkpointing
> interval (i.e. execution.checkpointing.interval-during-backlog) if all
> running operators have isOutputOnCheckpoint==false, and use the short
> checkpointing interval otherwise. If a user has explicitly configured the
> execution.checkpointing.interval-during-backlog but the two-phase commit
> sink library has not been upgraded to set isOutputOnCheckpoint=true, then
> the job will end up using the long checkpointing interval, and it will be
> useful to figure out what is going wrong in this case by checking the log.
>
> Note that the default value of these fields of the OperatorAttributes
> instance built by OperatorAttributesBuilder will still be false. The
> following is mentioned in the Java doc of
> `OperatorAttributesBuilder#build()`:
>
>  /**
>   * If any operator attribute is null, we will log it at DEBUG level and
> use the following
>   * default values.
>   * - isOutputOnEOF defaults to false
>   * - isOutputOnCheckpoint defaults to false
>   * - isInternalSorterSupported defaults to false
>   */
>
>
> >
> > Second question, have you thought about cases where someone is
> > either bootstrapping from a streaming source like Kafka
> > or simply trying to catch up after a long period of downtime in a purely
> > streaming job? Generally speaking, cases where the
> > user doesn't care about latency in the catch-up phase, regardless of
> > whether the source is bounded or unbounded, but wants to process
> > the data as fast as possible and then switch dynamically to real-time
> > processing?
> >
>
> Yes, I have thought about this. We should allow this job to effectively run
> in batch mode when the job is in the catch-up phase. FLIP-327 is actually
> an important step toward addressing this use-case.
>
> In order to address the above use-case, all we need is a way for source
> operator (e.g. Kafka) to tell Flink runtime (via IsProcessingBacklog)
> whether it is in the catch-up phase.
>
> Since every Kafka message has event-timestamp, we can allow users to
> specify a job-level config such as backlog-watermark-lag-threshold, and
> consider a Kafka Source to have IsProcessingBacklog=true if system_time -
> watermark > backlog-watermark-lag-threshold. This effectively allows us to
> determine whether Kafka is in the catch up phase.
>
> Once we have this capability (I plan to work on this in FLIP-328), we can
> directly use the features proposed in FLIP-325 and FLIP-327 to optimize the
> above use-case.
>
> What do you think?
>
> Best,
> Dong
>
>
> >
> > Best,
> > Piotrek
> >
> > niedz., 2 lip 2023 o 16:15 Dong Lin  napisał(a):
> >
> > > Hi all,
> > >
> > > I am opening this thread to discuss FLIP-327: Support stream-batch
> > unified
> > > operator to improve job throughput when processing backlog data. The
> > design
> > > doc can be found at
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-327%3A+Support+stream-batch+unified+operator+t

Re: [DISCUSS] FLIP 333 - Redesign Apache Flink website

2023-07-11 Thread Jing Ge
Hi,

+1, the UI design looks good!

Generally speaking, the website has two parts: the Flink web pages and the
Flink docs. Will the dark mode also cover the Flink docs?

Best regards,
Jing

On Tue, Jul 11, 2023 at 12:40 PM Matthias Pohl
 wrote:

> I also like the proposed designs. Considering that you want to touch
> individual subpages, there are also some subpages of Flink's website not
> being mentioned in the FLIP (e.g. roadmap [1]). What is the plan with
> those? Are they covered by the "We recommend modifications only to the
> design of the following pages" part but are not listed there?
>
> Additionally, it would be nice to get a bit more insight into the feedback
> from users as Xintong pointed out. It would be interesting to understand
> how the new design helps solving certain problems (besides having the
> motivation to modernize the look and feel).
>
> I'm also wondering whether it's doable to do a discussion (FLIP?) per
> subpage on the design (as proposed by Chesnay in the first discussion on
> this topic [2] to have smaller changes rather than a single big one). But I
> could imagine this being quite tedious because different people might
> have different opinions on how something should be done.
>
> I don't have any experience with frontend design. I'm wondering how much
> time such a redesign takes. Could it be linked to the 2.0 release?
>
> Thanks for the FLIP (you might want to add [2] to the FLIP's header for the
> sake of transparency).
> Matthias
>
> [1] https://flink.apache.org/roadmap/
> [2] https://lists.apache.org/thread/c3pt00cf77lrtgt242p26lgp9l2z5yc8
>
> On Tue, Jul 11, 2023 at 11:39 AM Xintong Song 
> wrote:
>
> > +1 in general.
> >
> > Thanks for proposing this contribution, Deepthi. The new design looks
> very
> > cool.
> >
> > I have a few questions, which might be entry-level given that I barely
> know
> > anything about the website design.
> > - Do you think it's feasible to maintain two sets of website designs at
> the
> > same time? E.g., adding a button "back to previous version". I'm asking
> > because, while the new UI might be more friendly to newcomers, the
> original
> > website might be more convenient for people who are already familiar with
> > it to find things. It would be nice if we can offer both options to
> users.
> > - For the documentation, I wonder if it makes sense to offer the same
> color
> > theme as the website, to keep the experience consistent. How much effort
> > does it require?
> > - In the FLIP, you mentioned things like "there's a general consensus"
> and
> > "feedback from customers". I'm curious where these come from. Have you
> > conducted some sort of survey? Would you mind sharing a bit more about
> > that?
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Tue, Jul 11, 2023 at 4:57 PM Feifan Wang  wrote:
> >
> > > +1 , the new design looks more attractive and is well organized
> > >
> > > Feifan Wang
> > > zoltar9...@163.com
> > >
> > >
> > > -------- Replied Message --------
> > > From: Leonard Xu
> > > Date: 07/11/2023 16:34
> > > To: dev
> > > Subject: Re: [DISCUSS] FLIP 333 - Redesign Apache Flink website
> > > +1 for the redesigning, the new website looks cool.
> > >
> > >
> > > Best,
> > > Leonard
> > >
> > > On Jul 11, 2023, at 7:55 AM, Mohan, Deepthi wrote:
> > >
> > > Hi,
> > >
> > > I’m opening this thread to discuss a proposal to redesign the Apache
> > Flink
> > > website: https://flink.apache.org. The approach and a few initial
> > mockups
> > > are included in FLIP 333 - Redesign Apache Flink website.<
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-333%3A+Redesign+Apache+Flink+website
> > > >
> > >
> > > The goal is to modernize the website design to help existing and new
> > users
> > > easily understand Flink’s value proposition and make Flink attractive
> to
> > > new users. As suggested in a previous thread, there are no proposed
> > changes
> > > to Flink documentation.
> > >
> > > I look forward to your feedback and the discussion.
> > >
> > > Thanks,
> > > Deepthi
> > >
> > >
> > >
> >
>


Re: [VOTE] Release 2.0 must-have work items

2023-07-11 Thread Jing Ge
Hi Galen,

We are aware of the issue and are working on it. StreamingFileSink is a
SinkFunction-based sink that cannot be removed yet, as mentioned previously.
You can find SinkV1 at [1].

Best regards,
Jing


[1]
https://github.com/apache/flink/blob/4cf2124d71a8dd0595e40f07c2dbcc4c85883b82/flink-core/src/main/java/org/apache/flink/api/connector/sink/Sink.java#L55

On Tue, Jul 11, 2023 at 1:59 PM Galen Warren
 wrote:

> Regarding SinkV1 vs. SinkV2: Is StreamingFileSink a SinkV1-related
> interface that is proposed to be removed? In a separate thread, it was
> discussed how it's important not to remove StreamingFileSink as long as
> this critical issue with SinkV2 is still outstanding --
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-30238 --
> because of the prospect of data loss when stopping and restarting jobs with
> savepoints.
>
> Thanks,
> Galen
>
> On Tue, Jul 11, 2023 at 7:47 AM Leonard Xu  wrote:
>
> > Hi, Xintong
> >
> > > Could you please clarify what exact changes you are proposing to make
> on
> > > the existing list?
> > > - Are you suggesting removing the item "Remove deprecated APIs -
> > > SourceFunction / SinkFunction / SinkV1", or are you suggesting
> > downgrading
> > > it as nice-to-have?
> >
> > I prefer to remove the item as we cannot deprecate SourceFunction /
> > SinkFunction related interfaces in 1.18, thus the 2.0 version would not
> > satisfy the two-minor-versions condition and could not remove them either.
> >
> > > - You said SinkV2 cannot cover SinkFunction. Then how about SinkV1? Is
> it
> > > covered by SinkV2? Should it be removed or preserved?
> >
> > SinkV2 related interfaces cover SinkV1 related interfaces well, and
> > SinkV1 related interfaces have been deprecated, so I think they can be
> > removed in 2.0 safely.
> >
> > In a word, my proposal is to replace the must-have item "Remove deprecated
> > APIs - SourceFunction / SinkFunction / SinkV1" with the must-have item
> > "Remove deprecated APIs - SinkV1".
> >
> > Best,
> > Leonard
> >
> >
> >
> >
> >
> >
> >
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Tue, Jul 11, 2023 at 4:26 PM Leonard Xu  wrote:
> > >
> > >> Thanks Xintong for driving this great work! But I’ve to give my
> > >> -1(binding) here:
> > >>
> > >> -1 to mark the "deprecate SourceFunction/SinkFunction/SinkV1" item as
> > >> must-have for release 2.0.
> > >>
> > >> I do a lot of connector work in the community, and I have two insights
> > >> from past experience:
> > >>
> > >> 1. Many developers reported that it is very difficult to migrate from
> > >> SourceFunction to the new Source [1]. Migrating existing connectors
> > >> after SourceFunction is deprecated is very difficult. Some developers
> > >> (Flavio Pompermaier) reported that they gave up the migration because
> > >> it was too complicated. I believe these are not isolated cases. This
> > >> means that deprecating SourceFunction related interfaces requires
> > >> community contributors to reduce the migration cost before starting
> > >> the migration work.
> > >>
> > >> 2. IIRC, the functionality of SinkV2 cannot currently cover SinkFunction
> > >> as described in FLIP-287 [2], which means the migration path after
> > >> deprecating SinkFunction/SinkV1 does not exist; thus we cannot mark the
> > >> SinkFunction/SinkV1 related interfaces as deprecated in 1.18.
> > >>
> > >> Based on these two observations, I think we should not mark these
> > >> interfaces as must-have in 2.0. Maintaining the two sets of source/sink
> > >> interfaces is not a concern for me; users can choose which interface to
> > >> implement according to their capacity and needs.
> > >>
> > >> Btw, some work items in 2.0 are marked as must-have, but no contributor
> > >> has claimed them yet. I think this is a risk and hope the Release
> > >> Managers could pay attention to it.
> > >>
> > >> Thank you all RMs for your work, sorry again for interrupting the vote
> > >>
> > >> Best,
> > >> Leonard
> > >>
> > >> [1] https://lists.apache.org/thread/sqq26s9rorynr4vx4nhxz3fmmxpgtdqp
> > >> [2]
> >
