[RESULT] [VOTE] Apache Flink Kubernetes Operator Release 1.1.0, release candidate #1

2022-07-24 Thread Gyula Fóra
Hi Devs!

I'm happy to announce that we have unanimously approved this release.

There are 5 approving votes, 3 of which are binding:

* Gyula Fora (binding)
* Marton Balassi (binding)
* Thomas Weise (binding)
* Biao Geng (non-binding)
* Jim Busche (non-binding)

There are no disapproving votes.

Thanks everyone!
Gyula


[jira] [Created] (FLINK-28659) [JUnit5 Migration] Module: flink-java

2022-07-24 Thread RocMarshal (Jira)
RocMarshal created FLINK-28659:
--

 Summary: [JUnit5 Migration] Module: flink-java
 Key: FLINK-28659
 URL: https://issues.apache.org/jira/browse/FLINK-28659
 Project: Flink
  Issue Type: Improvement
  Components: Tests
Affects Versions: 1.16.0
Reporter: RocMarshal






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28660) Simplify logs of blocklist

2022-07-24 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-28660:
--

 Summary: Simplify logs of blocklist
 Key: FLINK-28660
 URL: https://issues.apache.org/jira/browse/FLINK-28660
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Lijie Wang
 Fix For: 1.16.0








Re: [DISCUSS] FLIP-250: Support Customized Kubernetes Schedulers Proposal

2022-07-24 Thread bo zhaobo
Hi Martijn,

Thank you very much, that's all clear. I will raise a vote for this then. ;-)

BR

Bo Zhao

On Sun, Jul 24, 2022 at 01:42, Martijn Visser wrote:

> Hi all,
>
> Thanks a lot for clarifying Yikun! I have no more concerns.
>
> Best regards,
>
> Martijn
>
> On Fri, Jul 22, 2022 at 10:42, bo zhaobo wrote:
>
> > Hi All,
> >
> > Thanks for all the feedback from you. All of it is helpful and valuable
> > for us.
> >
> > If there are no further comments on FLIP-250, we plan to set up a VOTE
> > thread next Monday.
> >
> > Thank you all !!
> >
> > BR
> >
> > Bo Zhao
> >
> >
> > On Fri, Jul 15, 2022 at 10:02, bo zhaobo wrote:
> >
> > > Thanks all, @Yang Wang and @Yikun Jiang.
> > >
> > > Hi Martijn,
> > >
> > > We understand your concern. Do the above emails clear up your doubts?
> > >
> > > "
> > > Thanks for the info! I think I see that you've already updated the FLIP
> > to
> > > reflect how customized schedulers are beneficial for both batch and
> > > streaming jobs.
> > > "
> > >
> > > >>>
> > >
> > > Yeah, it's true that the "Motivation" paragraph could confuse readers,
> > > so I updated the FLIP description. Thanks for your feedback and
> > > correction.
> > >
> > > "
> > > The reason why I'm not too happy that we would only create a reference
> > > implementation for Volcano is that we don't know if the generic support
> > for
> > > customized scheduler plugins will also work for others. We think it
> will,
> > > but since there would be no other implementation available, we are not
> > > sure. My concern is that when someone tries to add support for another
> > > scheduler, we notice that we actually made a mistake or should improve
> > the
> > > generic support.
> > > "
> > >
> > > >>>
> > >
> > > Yeah, I understand your concern. Does Yikun Jiang's description and
> > > experience sharing above clarify things? Or do we need to figure out the
> > > common parts of some popular K8s customized schedulers and refresh the
> > > doc? Waiting for your advice. ;-)
> > >
> > > Best regards,
> > >
> > > Bo Zhao
> > >
> > > On Thu, Jul 14, 2022 at 18:45, Yikun Jiang wrote:
> > >
> > >> > And maybe we also could ping Yikun Jiang who has done similar things in
> > >> > Spark.
> > >>
> > >> Thanks for the ping, @wangyang. Yes, I was involved in Spark's customized
> > >> scheduler support work as the main contributor.
> > >>
> > >> For customized scheduler support, I can share the schedulers' requirements
> > >> here:
> > >>
> > >> 1. Help the scheduler *specify* the scheduler name.
> > >>
> > >> 2. Help the scheduler create the *scheduler-related label/annotation/CRD*,
> > >> such as:
> > >> - YuniKorn needs labels/annotations
> > >>   <https://yunikorn.apache.org/docs/user_guide/labels_and_annotations_in_yunikorn/>
> > >>   (maybe a task group CRD in the future, or not)
> > >> - Volcano needs annotations and a CRD <https://volcano.sh/en/docs/podgroup/>
> > >> - Kube-batch needs annotations/CRD
> > >>   <https://github.com/kubernetes-sigs/kube-batch/tree/master/config/crds>
> > >> - Kueue needs annotation support
> > >>   <https://github.com/kubernetes-sigs/kueue/blob/888cedb6e62c315e008916086308a893cd21dd66/config/samples/sample-job.yaml#L6>
> > >>   and a cluster-level CRD
> > >>
> > >> 3. Help the scheduler create the scheduler meta/CRD at the *right time*;
> > >> for example, if users want to avoid pods pending for too long, we need to
> > >> create the scheduler-required CRD before pod creation.
> > >>
> > >> For complex requirements, Spark uses FeatureStep to support them (Flink's
> > >> decorators look very similar to it).
> > >> For simple requirements, users can just use configuration or a Pod
> > >> Template. [1]
> > >>
> > >> [1]
> > >> https://spark.apache.org/docs/latest/running-on-kubernetes.html#customized-kubernetes-schedulers-for-spark-on-kubernetes
> > >>
> > >> From the FLIP, I can see the above requirements are covered.
> > >>
> > >> BTW, I think the Flink decorators' existing and newly added interfaces
> > >> already cover all Kubernetes requirements, so I personally think the
> > >> K8s-related scheduler requirements can also be well covered by them.
> > >>
> > >> Regards,
> > >> Yikun
> > >>
> > >>
> > >> On Thu, Jul 14, 2022 at 5:11 PM Yang Wang 
> > wrote:
> > >>
> > >> > I think we could go over the customized scheduler plugin mechanism again
> > >> > with YuniKorn to make sure that it is common enough.
> > >> > But the implementation could be deferred.
> > >> >
> > >> > And maybe we also could ping Yikun Jiang, who has done similar things in
> > >> > Spark.
> > >> >
> > >> > For the e2e tests, I admit that they could be improved. But I am not sure
> > >> > whether we really need the Java implementation instead.
> > >> > This is out of the scope of this FLIP, so let's keep the discussion
> > >> > under FLINK-20392.
> > >> >
> > >> >
> > >> > Best,
> > >> > Yang
> > >> >
> > >> > On Thu, Jul 14, 2022 at 15:28, Martijn Visser wrote:
> > >> >
> > >
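The first two requirements Yikun lists in the thread above (specifying the scheduler name and attaching scheduler-specific annotations) can be illustrated with a Kubernetes pod spec. This is a hedged sketch for the Volcano case: the annotation key and `schedulerName` value are real Volcano conventions, but the group name, pod name, and container details are hypothetical placeholders.

```yaml
# Hypothetical pod spec showing the two common hooks a customized
# Kubernetes scheduler needs from Flink's pod decorators.
apiVersion: v1
kind: Pod
metadata:
  name: flink-taskmanager-example        # hypothetical name
  annotations:
    # Volcano associates the pod with a PodGroup CRD via this annotation.
    scheduling.k8s.io/group-name: my-flink-podgroup   # hypothetical group
spec:
  schedulerName: volcano   # tells Kubernetes which scheduler handles this pod
  containers:
    - name: flink-main-container
      image: flink:1.15
```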

RE: Re: [VOTE] FLIP-252: Amazon DynamoDB Sink Connector

2022-07-24 Thread Grant L (Grant)
+1, we need this asap. Thank you so much!

On 2022/07/21 18:27:52 Robert Metzger wrote:
> +1
> 
> On Wed, Jul 20, 2022 at 10:48 PM Konstantin Knauf  wrote:
> 
> > +1. Thanks!
> >
> > On Wed, Jul 20, 2022 at 16:48, Tzu-Li (Gordon) Tai <
> > tzuli...@apache.org>:
> >
> > > +1
> > >
> > > On Wed, Jul 20, 2022 at 6:13 AM Danny Cranmer 
> > > wrote:
> > >
> > > > Hi there,
> > > >
> > > > After the discussion in [1], I’d like to open a voting thread for
> > > FLIP-252
> > > > [2], which proposes the addition of an Amazon DynamoDB sink based on
> > the
> > > > Async Sink [3].
> > > >
> > > > The vote will be open until July 23rd at the earliest (72h), unless
> > > > there are any binding vetoes.
> > > >
> > > > Cheers, Danny
> > > >
> > > > [1] https://lists.apache.org/thread/ssmf2c86n3xyd5qqmcdft22sqn4qw8mw
> > > > [2]
> > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-252%3A+Amazon+DynamoDB+Sink+Connector
> > > > [3]
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink
> > > >
> > >
> >
> >
> > --
> > https://twitter.com/snntrable
> > https://github.com/knaufk
> >
> 

RE: Re: [VOTE] FLIP-252: Amazon DynamoDB Sink Connector

2022-07-24 Thread Grant L (Grant)
+1

On 2022/07/21 18:27:52 Robert Metzger wrote:
> +1
> 
> On Wed, Jul 20, 2022 at 10:48 PM Konstantin Knauf  wrote:
> 
> > +1. Thanks!
> >
> > On Wed, Jul 20, 2022 at 16:48, Tzu-Li (Gordon) Tai <
> > tzuli...@apache.org>:
> >
> > > +1
> > >
> > > On Wed, Jul 20, 2022 at 6:13 AM Danny Cranmer 
> > > wrote:
> > >
> > > > Hi there,
> > > >
> > > > After the discussion in [1], I’d like to open a voting thread for
> > > FLIP-252
> > > > [2], which proposes the addition of an Amazon DynamoDB sink based on
> > the
> > > > Async Sink [3].
> > > >
> > > > The vote will be open until July 23rd at the earliest (72h), unless
> > > > there are any binding vetoes.
> > > >
> > > > Cheers, Danny
> > > >
> > > > [1] https://lists.apache.org/thread/ssmf2c86n3xyd5qqmcdft22sqn4qw8mw
> > > > [2]
> > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-252%3A+Amazon+DynamoDB+Sink+Connector
> > > > [3]
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink
> > > >
> > >
> >
> >
> > --
> > https://twitter.com/snntrable
> > https://github.com/knaufk
> >
> 

[VOTE] FLIP-250: Support Customized Kubernetes Schedulers Proposal

2022-07-24 Thread bo zhaobo
Hi all,

Thank you very much for all the feedback during the discussion in [2][3].
Now I'd like to proceed with the vote for FLIP-250 [1], as no more objections
or issues were raised in the ML threads [2][3].

The vote will be open until July 28th at the earliest (at least 72 hours),
unless there is an objection or insufficient votes.

Thank you all.

BR

Bo Zhao

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-250%3A+Support+Customized+Kubernetes+Schedulers+Proposal
[2] https://lists.apache.org/thread/pf8dvbvqf845wh0x63z68jmhh4pvsbow
[3] https://lists.apache.org/thread/zbylkkc6jojrqwds7tt02k2t8nw62h26


[jira] [Created] (FLINK-28661) Introduce generic mode for table store catalog

2022-07-24 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-28661:


 Summary: Introduce generic mode for table store catalog
 Key: FLINK-28661
 URL: https://issues.apache.org/jira/browse/FLINK-28661
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


- Introduce a generic option for catalog.
- Table Store catalog can store Flink generic tables when generic is true.





[ANNOUNCE] Apache Flink Kubernetes Operator 1.1.0 released

2022-07-24 Thread Gyula Fóra
The Apache Flink community is very happy to announce the release of Apache
Flink Kubernetes Operator 1.1.0.

The Flink Kubernetes Operator allows users to manage their Apache Flink
applications and their lifecycle through native k8s tooling like kubectl.
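For readers new to the operator, applications are described with a `FlinkDeployment` custom resource. Below is a minimal manifest adapted loosely from the operator's quickstart; the image, resources, and jar path are illustrative and may need adjusting for your cluster.

```yaml
# Minimal FlinkDeployment sketch (values are illustrative)
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-example
spec:
  image: flink:1.15
  flinkVersion: v1_15
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  serviceAccount: flink
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    parallelism: 2
    upgradeMode: stateless
```

Applying it with `kubectl apply -f basic-example.yaml` lets the operator create and manage the Flink cluster for the job.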

Please check out the release blog post for an overview of the release:
https://flink.apache.org/news/2022/07/25/release-kubernetes-operator-1.1.0.html

The release is available for download at:
https://flink.apache.org/downloads.html

Maven artifacts for Flink Kubernetes Operator can be found at:
https://search.maven.org/artifact/org.apache.flink/flink-kubernetes-operator

Official Docker image for the Flink Kubernetes Operator can be found at:
https://hub.docker.com/r/apache/flink-kubernetes-operator

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351723

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Gyula Fora


Re: [VOTE] Apache Flink Kubernetes Operator Release 1.1.0, release candidate #1

2022-07-24 Thread Yang Wang
+1 (binding)

Successfully verified the following:

- Verified the checksums and GPG signatures
- Verified that the source distributions do not contain any binaries
- Built binary and image from release source
- Verified the NOTICE files and licenses in the source release and the docker image
- Verified the helm chart values with correct appVersion and image tag
- Operator functionality manual testing:
  - Started a Flink application job (both streaming and batch) with Flink 1.15
  - Verified the Flink UI could be accessed via ingress
  - No strange operator logs




Best,

Yang
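The checksum step in Yang's list can be reproduced programmatically. This is a stand-in sketch: in a real verification you would compare the release tarball against the `.sha512` file published alongside it on dist.apache.org, whereas here a demo file and its expected digest are fabricated locally.

```python
import hashlib

def sha512_matches(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA-512 digest equals the expected hex digest."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex

# Demo on a stand-in artifact; the real check targets the RC tarball and the
# .sha512 file published with the release candidate.
with open("artifact.tgz", "wb") as f:
    f.write(b"demo")
print(sha512_matches("artifact.tgz", hashlib.sha512(b"demo").hexdigest()))  # True
```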

On Sun, Jul 24, 2022 at 08:02, Thomas Weise wrote:

> +1 (binding)
>
> * built from source archive
> * run examples
>
> Thanks,
> Thomas
>
> On Wed, Jul 20, 2022 at 5:48 AM Gyula Fóra  wrote:
> >
> > Hi everyone,
> >
> > Please review and vote on the release candidate #1 for the version 1.1.0
> of
> > Apache Flink Kubernetes Operator,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > **Release Overview**
> >
> > As an overview, the release consists of the following:
> > a) Kubernetes Operator canonical source distribution (including the
> > Dockerfile), to be deployed to the release repository at dist.apache.org
> > b) Kubernetes Operator Helm Chart to be deployed to the release
> repository
> > at dist.apache.org
> > c) Maven artifacts to be deployed to the Maven Central Repository
> > d) Docker image to be pushed to dockerhub
> >
> > **Staging Areas to Review**
> >
> > The staging areas containing the above mentioned artifacts are as
> follows,
> > for your review:
> > * All artifacts for a,b) can be found in the corresponding dev repository
> > at dist.apache.org [1]
> > * All artifacts for c) can be found at the Apache Nexus Repository [2]
> > * The docker image for d) is staged on github [3]
> >
> > All artifacts are signed with the key
> > 0B4A34ADDFFA2BB54EB720B221F06303B87DAFF1 [4]
> >
> > Other links for your review:
> > * JIRA release notes [5]
> > * source code tag "release-1.1.0-rc1" [6]
> > * PR to update the website Downloads page to include Kubernetes Operator
> > links [7]
> >
> > **Vote Duration**
> >
> > The voting time will run for at least 72 hours.
> > It is adopted by majority approval, with at least 3 PMC affirmative
> votes.
> >
> > **Note on Verification**
> >
> > You can follow the basic verification guide here [8].
> > Note that you don't need to verify everything yourself, but please make
> > note of what you have tested together with your +/- vote.
> >
> > Thanks,
> > Gyula Fora
> >
> > [1]
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.1.0-rc1/
> > [2]
> https://repository.apache.org/content/repositories/orgapacheflink-1518/
> > [3] ghcr.io/apache/flink-kubernetes-operator:c9dec3f
> > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [5]
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351723
> > [6]
> >
> https://github.com/apache/flink-kubernetes-operator/tree/release-1.1.0-rc1
> > [7] https://github.com/apache/flink-web/pull/560
> > [8]
> >
> https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Kubernetes+Operator+Release
>


[jira] [Created] (FLINK-28662) Compaction may block job cancelling

2022-07-24 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-28662:
---

 Summary: Compaction may block job cancelling
 Key: FLINK-28662
 URL: https://issues.apache.org/jira/browse/FLINK-28662
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Affects Versions: table-store-0.2.0, table-store-0.3.0
Reporter: Caizhi Weng
 Fix For: table-store-0.2.0, table-store-0.3.0


Currently, when cancelling a job, we have to wait for the current compaction 
thread to finish. If the compaction takes too long, the task manager may fail 
because job cancelling takes more than 180 seconds.
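A generic sketch of the fix this issue implies (illustrative only, not the table store's actual code): if the compaction worker checks a cancellation flag between work units, cancellation completes well within the timeout instead of blocking on the full compaction.

```python
import threading
import time

def compaction(stop: threading.Event) -> None:
    """Cooperative compaction loop: checks the stop flag between work units,
    so job cancellation does not have to wait for the whole compaction."""
    while not stop.is_set():
        time.sleep(0.01)  # stand-in for one unit of compaction work

stop = threading.Event()
worker = threading.Thread(target=compaction, args=(stop,))
worker.start()

stop.set()              # cancel the job
worker.join(timeout=1)  # analogous to Flink's task-cancellation timeout (180s)
print(worker.is_alive())  # False: the cooperative loop exits promptly
```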





Re: [ANNOUNCE] Apache Flink Kubernetes Operator 1.1.0 released

2022-07-24 Thread Yang Wang
Congrats! Thanks Gyula for driving this release, and thanks to all
contributors!


Best,
Yang

On Mon, Jul 25, 2022 at 10:44, Gyula Fóra wrote:

> The Apache Flink community is very happy to announce the release of Apache
> Flink Kubernetes Operator 1.1.0.
>
> The Flink Kubernetes Operator allows users to manage their Apache Flink
> applications and their lifecycle through native k8s tooling like kubectl.
>
> Please check out the release blog post for an overview of the release:
>
> https://flink.apache.org/news/2022/07/25/release-kubernetes-operator-1.1.0.html
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>
> Maven artifacts for Flink Kubernetes Operator can be found at:
>
> https://search.maven.org/artifact/org.apache.flink/flink-kubernetes-operator
>
> Official Docker image for the Flink Kubernetes Operator can be found at:
> https://hub.docker.com/r/apache/flink-kubernetes-operator
>
> The full release notes are available in Jira:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351723
>
> We would like to thank all contributors of the Apache Flink community who
> made this release possible!
>
> Regards,
> Gyula Fora
>


Re: [VOTE] FLIP-250: Support Customized Kubernetes Schedulers Proposal

2022-07-24 Thread Yang Wang
+1 (binding)

Best,
Yang

On Mon, Jul 25, 2022 at 09:38, bo zhaobo wrote:

> Hi all,
>
> Thank you very much for all the feedback during the discussion in [2][3].
> Now I'd like to proceed with the vote for FLIP-250 [1], as no more objections
> or issues were raised in the ML threads [2][3].
>
> The vote will be open until July 28th at the earliest (at least 72 hours),
> unless there is an objection or insufficient votes.
>
> Thank you all.
>
> BR
>
> Bo Zhao
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-250%3A+Support+Customized+Kubernetes+Schedulers+Proposal
> [2] https://lists.apache.org/thread/pf8dvbvqf845wh0x63z68jmhh4pvsbow
> [3] https://lists.apache.org/thread/zbylkkc6jojrqwds7tt02k2t8nw62h26
>


Re: Re: [VOTE] FLIP-252: Amazon DynamoDB Sink Connector

2022-07-24 Thread Becket Qin
+1

On Mon, Jul 25, 2022 at 9:22 AM Grant L (Grant) 
wrote:

> +1
>
> On 2022/07/21 18:27:52 Robert Metzger wrote:
> > +1
> >
> > On Wed, Jul 20, 2022 at 10:48 PM Konstantin Knauf 
> wrote:
> >
> > > +1. Thanks!
> > >
> > > On Wed, Jul 20, 2022 at 16:48, Tzu-Li (Gordon) Tai <
> > > tzuli...@apache.org>:
> > >
> > > > +1
> > > >
> > > > On Wed, Jul 20, 2022 at 6:13 AM Danny Cranmer 
> > > > wrote:
> > > >
> > > > > Hi there,
> > > > >
> > > > > After the discussion in [1], I’d like to open a voting thread for
> > > > FLIP-252
> > > > > [2], which proposes the addition of an Amazon DynamoDB sink based
> on
> > > the
> > > > > Async Sink [3].
> > > > >
> > > > > The vote will be open until July 23rd at the earliest (72h), unless
> > > > > there are any binding vetoes.
> > > > >
> > > > > Cheers, Danny
> > > > >
> > > > > [1]
> https://lists.apache.org/thread/ssmf2c86n3xyd5qqmcdft22sqn4qw8mw
> > > > > [2]
> > > > >
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-252%3A+Amazon+DynamoDB+Sink+Connector
> > > > > [3]
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink
> > > > >
> > > >
> > >
> > >
> > > --
> > > https://twitter.com/snntrable
> > > https://github.com/knaufk
> > >
> >


Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-07-24 Thread Sebastian Mattheis
Hi everybody,

Last week I discussed the semantics and an implementation strategy of the
configuration parameter with Piotr, and I did the implementation and some
tests this weekend.

A short summary of what I discussed and recapped with Piotr:

   - The configuration parameter allows (and tolerates) the use of
   `SourceReader`s that do not implement the `pauseOrResumeSplits` method. (The
   exception is ignored in `SourceOperator`.)
   - The configuration parameter allows (and tolerates) the use of
   `SourceSplitReader`s that do not implement the `pauseOrResumeSplits` method.
   (The exception is ignored in the `PauseResumeSplitsTask` of the
   `SplitFetcher`.)

In particular, this means that a `SourceReader` may have two `SplitReader`s
where one implements `pauseOrResumeSplits` and the other does not. The
configuration parameter will allow the use of the one that doesn't and will,
nevertheless, still attempt to pause/resume the other. (Consequently, if the
reader that doesn't support pausing is ahead, it simply cannot be paused; but
if the other one is ahead, it will be paused until the watermarks are aligned.)
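This tolerance behavior can be sketched as a small simulation. The class and function names below are illustrative stand-ins, not Flink's actual API: a reader that cannot pause is skipped when the flag is set, and the error propagates when it is not.

```python
class SplitReader:
    """Illustrative stand-in for a Flink split reader (not the real API)."""
    def __init__(self, name: str, supports_pause: bool):
        self.name = name
        self.supports_pause = supports_pause
        self.paused = False

    def pause_or_resume(self, pause: bool) -> None:
        if not self.supports_pause:
            raise NotImplementedError(f"{self.name} cannot pause/resume")
        self.paused = pause

def align(readers, allow_unaligned_source_splits: bool):
    """Try to pause every reader; with the flag set, readers that cannot
    pause are tolerated (skipped) instead of failing the whole source."""
    paused = []
    for reader in readers:
        try:
            reader.pause_or_resume(True)
            paused.append(reader.name)
        except NotImplementedError:
            if not allow_unaligned_source_splits:
                raise
    return paused

readers = [SplitReader("a", supports_pause=False),
           SplitReader("b", supports_pause=True)]
print(align(readers, allow_unaligned_source_splits=True))  # ['b']
```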

There is one flaw that I don't really like but which I accept as an outcome of
the discussion, and which I will add/update in the FLIP:
If there is any other mechanism (i.e., other than watermark alignment) that
attempts to pause or resume `SplitReader`s, it will have side effects and
potentially unexpected behavior if one or more `SplitReader`s do not
implement `pauseOrResumeSplits` and the user set the configuration
parameter to allow/tolerate it for split-level watermark alignment. (The
reason is simply that we cannot differentiate which mechanism attempts to
pause/resume, i.e., whether it is used for watermark alignment or something
else.) Given that this configuration parameter is supposed to be an
intermediate fallback, this is acceptable to me, but it should be changed at
the latest when some other mechanism uses pauseOrResumeSplits.

As for the parameter naming, I have implemented it the following way
(reasoning: there already exists a parameter `pipeline.auto-watermark-interval`):

pipeline.watermark-alignment.allow-unaligned-source-splits (default: false)

Status: I have implemented the configuration parameter (and an IT case). I
still need to update the FLIP and will ping you (tomorrow or so) when I'm
done with that. Please check/review my description above and flag any
problems you see.

Thanks a lot and regards,
Sebastian


On Wed, Jul 20, 2022 at 11:24 PM Thomas Weise  wrote:

> Hi Sebastian,
>
> Thank you for updating the FLIP and sorry for my delayed response. As
> Piotr pointed out, we would need to incorporate the fallback flag into
> the design to reflect the outcome of the previous discussion.
>
> Based on the current FLIP and as detailed by Becket, the
> SourceOperator coordinates the alignment. It is responsible for the
> pause/resume decision and knows how many splits are assigned.
> Therefore shouldn't it have all the information needed to efficiently
> handle the case of UnsupportedOperationException thrown by a reader?
>
> Although the fallback requires some extra implementation effort, I
> think that is more than offset by not surprising users and offering a
> smoother migration path. Yes, the flag is a temporary feature that
> will become obsolete in perhaps 2-3 releases (can we please also
> include that into the FLIP?). But since it would be just a
> configuration property that can be ignored at that point (for which
> there is precedence), no code change will be forced on users.
>
> As for the property name, perhaps the following would be even more
> descriptive?
>
> coarse.grained.wm.alignment.fallback.enabled
>
> Thanks!
> Thomas
>
>
> On Wed, Jul 13, 2022 at 10:59 AM Becket Qin  wrote:
> >
> > Thanks for the explanation, Sebastian. I understand your concern now.
> >
> > 1. About the major concern. Personally I'd consider the coarse grained
> watermark alignment as a special case for backward compatibility. In the
> future, if for whatever reason we want to pause a split and that is not
> supported, it seems the only thing that makes sense is throwing an
> exception, instead of pausing the entire source reader. Regarding this
> FLIP, if the logic that determines which split should be paused is in the
> SourceOperator, the SourceOperator actually knows the reason why it pauses
> a split. It also knows whether there are more than one split assigned to
> the source reader. So it can just fallback to the coarse grained watermark
> alignment, without affecting other reasons of pausing a split, right? And
> in the future, if there are more purposes for pausing / resuming a split,
> the SourceOperator still needs to understand each of the reasons in order
> to resume the splits after all the pausing conditions are no longer met.
> >
> > 2. Naming wise, would "coarse.grained.watermark.alignment.enabled"
> address your concern?
> >
> > The only concern I have for Option A is that people may not be able to
> benefit from split level WM alignment

[jira] [Created] (FLINK-28663) Allow multiple downstream consumer job vertices sharing the same intermediate dataset at scheduler side

2022-07-24 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-28663:
---

 Summary: Allow multiple downstream consumer job vertices sharing 
the same intermediate dataset at scheduler side
 Key: FLINK-28663
 URL: https://issues.apache.org/jira/browse/FLINK-28663
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Reporter: Yingjie Cao


Currently, one intermediate dataset can only be consumed by one downstream 
consumer vertex. If there are multiple consumer vertices consuming the same 
output of the same upstream vertex, multiple intermediate datasets will be 
produced. We can optimize this behavior to produce only one intermediate 
dataset, which can be shared by multiple consumer vertices. As the first step, 
we should allow multiple downstream consumer job vertices to share the same 
intermediate dataset on the scheduler side. (Note that this optimization only 
works for blocking shuffle, because a pipelined shuffle result partition cannot 
be consumed multiple times.)





[jira] [Created] (FLINK-28664) Support AvroWriters in PyFlink

2022-07-24 Thread Juntao Hu (Jira)
Juntao Hu created FLINK-28664:
-

 Summary: Support AvroWriters in PyFlink
 Key: FLINK-28664
 URL: https://issues.apache.org/jira/browse/FLINK-28664
 Project: Flink
  Issue Type: New Feature
  Components: API / Python
Affects Versions: 1.15.1
Reporter: Juntao Hu
 Fix For: 1.16.0





