Re: Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-21 Thread Yangze Guo
Thanks for the comments, Zhu!

Yes, it is a known limitation of fine-grained resource management. We
also filed this issue in FLINK-20865 when we proposed FLIP-156.

As a first step, I agree that we can mark batch jobs with PIPELINED
edges as an invalid case for this feature. However, just throwing an
exception in that case might confuse users who do not understand the
concept of a pipelined region. Maybe we can force all the edges in this
scenario to BLOCKING at the compiling stage and document it well, so that
common users are not interrupted while expert users can understand the
cost of that usage and make their own decision. WDYT?

Best,
Yangze Guo
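The compile-stage rewrite described above can be sketched in a few lines of plain Java. This is a Flink-independent sketch: the enum and classes below are hypothetical stand-ins, not Flink's actual JobGraph API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Flink's data exchange mode.
enum ExchangeMode { PIPELINED, BLOCKING }

// Hypothetical stand-in for a job graph edge.
class Edge {
    ExchangeMode mode;
    Edge(ExchangeMode mode) { this.mode = mode; }
}

class FineGrainedBatchRewriter {
    /**
     * For batch jobs with fine-grained resources, force every PIPELINED
     * edge to BLOCKING so each vertex forms its own pipelined region and
     * slots can be acquired and released independently.
     * Returns the number of edges that were rewritten.
     */
    static int forceBlocking(List<Edge> edges) {
        int rewritten = 0;
        for (Edge e : edges) {
            if (e.mode == ExchangeMode.PIPELINED) {
                e.mode = ExchangeMode.BLOCKING;
                rewritten++;
            }
        }
        return rewritten;
    }

    public static void main(String[] args) {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(ExchangeMode.PIPELINED));
        edges.add(new Edge(ExchangeMode.BLOCKING));
        int n = forceBlocking(edges);
        System.out.println(n + " edge(s) rewritten to BLOCKING");
    }
}
```

The rewrite would of course need to be documented prominently, since the execution plan no longer matches what the user configured.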

On Mon, Jun 21, 2021 at 2:24 PM Zhu Zhu  wrote:
>
> Thanks for proposing this @Yangze Guo and sorry for joining the discussion so 
> late.
> The proposal generally looks good to me. But I find one problem that batch 
> job with PIPELINED edges might hang if enabling fine-grained resources. see 
> "Resource Deadlocks could still happen in certain Cases" section in 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling
> However, this problem may happen only in batch cases with PIPELINED edges, 
> because
> 1. streaming jobs would always require all resource requirements to be 
> fulfilled at the same time.
> 2. batch jobs without PIPELINED edges consist of multiple single vertex 
> regions and thus each slot can be individually used and returned
> So maybe in the first step, let's mark batch jobs with PIPELINED edges as an 
> invalid case for fine-grained resources and throw exception for it in early 
> compiling stage?
>
> Thanks,
> Zhu
>
> Yangze Guo wrote on Tue, Jun 15, 2021 at 4:57 PM:
>>
>> Thanks for the supplement, Arvid and Yun. I've annotated these two
>> points in the FLIP.
>> The vote is now started in [1].
>>
>> [1] 
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51381.html
>>
>> Best,
>> Yangze Guo
>>
>> On Fri, Jun 11, 2021 at 2:50 PM Yun Gao  wrote:
>> >
>> > Hi,
>> >
>> > Very thanks @Yangze for bringing up this discuss. Overall +1 for
>> > exposing the fine-grained resource requirements in the DataStream API.
>> >
>> > One similar issue as Arvid has pointed out is that users may also creating
>> > different SlotSharingGroup objects, with different names but with different
>> > resources.  We might need to do some check internally. But We could also
>> > leave that during the development of the actual PR.
>> >
>> > Best,
>> > Yun
>> >
>> >
>> >
>> >  --Original Mail --
>> > Sender:Arvid Heise 
>> > Send Date:Thu Jun 10 15:33:37 2021
>> > Recipients:dev 
>> > Subject:Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource 
>> > Requirements
>> > Hi Yangze,
>> >
>> >
>> >
>> > Thanks for incorporating the ideas and sorry for missing the builder part.
>> >
>> > My main idea is that SlotSharingGroup is immutable, such that the user
>> >
>> > doesn't do:
>> >
>> >
>> >
>> > ssg = new SlotSharingGroup();
>> >
>> > ssg.setCpus(2);
>> >
>> > operator1.slotSharingGroup(ssg);
>> >
>> > ssg.setCpus(4);
>> >
>> > operator2.slotSharingGroup(ssg);
>> >
>> >
>> >
>> > and wonders why both operators have the same CPU spec. But the details can
>> >
>> > be fleshed out in the actual PR.
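The mutation hazard above suggests an immutable, builder-style SlotSharingGroup. A minimal Flink-independent sketch of that shape (all names hypothetical, not Flink's actual API):

```java
// Hypothetical immutable slot sharing group spec built via a builder;
// a sketch of the shape Arvid suggests, not Flink's real SlotSharingGroup.
final class SlotSharingGroupSpec {
    private final String name;
    private final double cpus;

    private SlotSharingGroupSpec(String name, double cpus) {
        this.name = name;
        this.cpus = cpus;
    }

    String getName() { return name; }
    double getCpus() { return cpus; }

    static Builder newBuilder(String name) { return new Builder(name); }

    static final class Builder {
        private final String name;
        private double cpus;
        private Builder(String name) { this.name = name; }
        Builder setCpus(double cpus) { this.cpus = cpus; return this; }
        SlotSharingGroupSpec build() { return new SlotSharingGroupSpec(name, cpus); }
    }

    public static void main(String[] args) {
        // Each operator gets its own frozen spec; building a second spec
        // cannot silently change the first one.
        SlotSharingGroupSpec a = newBuilder("ssg").setCpus(2).build();
        SlotSharingGroupSpec b = newBuilder("ssg").setCpus(4).build();
        System.out.println(a.getCpus() + " / " + b.getCpus());
    }
}
```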
>> >
>> >
>> >
>> > On Thu, Jun 10, 2021 at 5:13 AM Yangze Guo  wrote:
>> >
>> >
>> >
>> > > Thanks all for the discussion. I've updated the FLIP accordingly, the
>> >
>> > > key changes are:
>> >
>> > > - Introduce SlotSharingGroup instead of ResourceSpec which contains
>> >
>> > > the resource spec of slot sharing group
>> >
>> > > - Introduce two interfaces for specifying the SlotSharingGroup:
>> >
>> > > #slotSharingGroup(SlotSharingGroup) and
>> >
>> > > StreamExecutionEnvironment#registerSlotSharingGroup(SlotSharingGroup).
>> >
>> > >
>> >
>> > > If there is no more feedback, I'd start a vote next week.
>> >
>> > >
>> >
>> > > Best,
>> >
>> > > Yangze Guo
>> >
>> > >
>> >
>> > > On Wed, Jun 9, 2021 at 10:46 AM Yangze Guo  wrote:
>> >
>> > > >
>> >
>> > > > Thanks for the valuable suggestion, Arvid.
>> >
>> > > >
>> >
>> > > > 1) Yes, we can add a new SlotSharingGroup which includes the name and
>> >
>> > > > its resource. After that, we have two interfaces for configuring the
>> >
>> > > > slot sharing group of an operator:
>> >
>> > > > - #slotSharingGroup(String name) // the resource of it can be
>> >
>> > > > configured through StreamExecutionEnvironment#registerSlotSharingGroup
>> >
>> > > > - #slotSharingGroup(SlotSharingGroup ssg) // Directly configure the
>> >
>> > > resource
>> >
>> > > > And one interface to configure the resource of a SSG:
>> >
>> > > > - StreamExecutionEnvironment#registerSlotSharingGroup(SlotSharingGroup)
>> >
>> > > > We can also define the priority of the above two approaches, e.g. the
>> >
>> > > > resource registering in the StreamExecutionEnvironment will always be
>> >
>> > > > resp

Re: Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-21 Thread Zhu Zhu
Thanks for the quick response Yangze.
The proposal sounds good to me.

Thanks,
Zhu

Yangze Guo wrote on Mon, Jun 21, 2021 at 3:01 PM:

> Thanks for the comments, Zhu!
>
> Yes, it is a known limitation of fine-grained resource management. We
> also filed this issue in FLINK-20865 when we proposed FLIP-156.
>
> As a first step, I agree that we can mark batch jobs with PIPELINED
> edges as an invalid case for this feature. However, just throwing an
> exception in that case might confuse users who do not understand the
> concept of a pipelined region. Maybe we can force all the edges in this
> scenario to BLOCKING at the compiling stage and document it well, so that
> common users are not interrupted while expert users can understand the
> cost of that usage and make their own decision. WDYT?
>
> Best,
> Yangze Guo
>
> On Mon, Jun 21, 2021 at 2:24 PM Zhu Zhu  wrote:
> >
> > Thanks for proposing this @Yangze Guo and sorry for joining the
> discussion so late.
> > The proposal generally looks good to me. But I find one problem that
> batch job with PIPELINED edges might hang if enabling fine-grained
> resources. see "Resource Deadlocks could still happen in certain Cases"
> section in
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling
> > However, this problem may happen only in batch cases with PIPELINED
> edges, because
> > 1. streaming jobs would always require all resource requirements to be
> fulfilled at the same time.
> > 2. batch jobs without PIPELINED edges consist of multiple single vertex
> regions and thus each slot can be individually used and returned
> > So maybe in the first step, let's mark batch jobs with PIPELINED edges
> as an invalid case for fine-grained resources and throw exception for it in
> early compiling stage?
> >
> > Thanks,
> > Zhu
> >
> > Yangze Guo wrote on Tue, Jun 15, 2021 at 4:57 PM:
> >>
> >> Thanks for the supplement, Arvid and Yun. I've annotated these two
> >> points in the FLIP.
> >> The vote is now started in [1].
> >>
> >> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51381.html
> >>
> >> Best,
> >> Yangze Guo
> >>
> >> On Fri, Jun 11, 2021 at 2:50 PM Yun Gao 
> wrote:
> >> >
> >> > Hi,
> >> >
> >> > Very thanks @Yangze for bringing up this discuss. Overall +1 for
> >> > exposing the fine-grained resource requirements in the DataStream API.
> >> >
> >> > One similar issue as Arvid has pointed out is that users may also
> creating
> >> > different SlotSharingGroup objects, with different names but with
> different
> >> > resources.  We might need to do some check internally. But We could
> also
> >> > leave that during the development of the actual PR.
> >> >
> >> > Best,
> >> > Yun
> >> >
> >> >
> >> >
> >> >  --Original Mail --
> >> > Sender:Arvid Heise 
> >> > Send Date:Thu Jun 10 15:33:37 2021
> >> > Recipients:dev 
> >> > Subject:Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained
> Resource Requirements
> >> > Hi Yangze,
> >> >
> >> >
> >> >
> >> > Thanks for incorporating the ideas and sorry for missing the builder
> part.
> >> >
> >> > My main idea is that SlotSharingGroup is immutable, such that the user
> >> >
> >> > doesn't do:
> >> >
> >> >
> >> >
> >> > ssg = new SlotSharingGroup();
> >> >
> >> > ssg.setCpus(2);
> >> >
> >> > operator1.slotSharingGroup(ssg);
> >> >
> >> > ssg.setCpus(4);
> >> >
> >> > operator2.slotSharingGroup(ssg);
> >> >
> >> >
> >> >
> >> > and wonders why both operators have the same CPU spec. But the
> details can
> >> >
> >> > be fleshed out in the actual PR.
> >> >
> >> >
> >> >
> >> > On Thu, Jun 10, 2021 at 5:13 AM Yangze Guo  wrote:
> >> >
> >> >
> >> >
> >> > > Thanks all for the discussion. I've updated the FLIP accordingly,
> the
> >> >
> >> > > key changes are:
> >> >
> >> > > - Introduce SlotSharingGroup instead of ResourceSpec which contains
> >> >
> >> > > the resource spec of slot sharing group
> >> >
> >> > > - Introduce two interfaces for specifying the SlotSharingGroup:
> >> >
> >> > > #slotSharingGroup(SlotSharingGroup) and
> >> >
> >> > >
> StreamExecutionEnvironment#registerSlotSharingGroup(SlotSharingGroup).
> >> >
> >> > >
> >> >
> >> > > If there is no more feedback, I'd start a vote next week.
> >> >
> >> > >
> >> >
> >> > > Best,
> >> >
> >> > > Yangze Guo
> >> >
> >> > >
> >> >
> >> > > On Wed, Jun 9, 2021 at 10:46 AM Yangze Guo  wrote:
> >> >
> >> > > >
> >> >
> >> > > > Thanks for the valuable suggestion, Arvid.
> >> >
> >> > > >
> >> >
> >> > > > 1) Yes, we can add a new SlotSharingGroup which includes the name
> and
> >> >
> >> > > > its resource. After that, we have two interfaces for configuring
> the
> >> >
> >> > > > slot sharing group of an operator:
> >> >
> >> > > > - #slotSharingGroup(String name) // the resource of it can be
> >> >
> >> > > > configured through
> StreamExecutionEnvironment#registerSlotSharingGroup
> >> >
> >> > > > - #slotSharingGroup(Slo

[jira] [Created] (FLINK-23054) Correct upsert optimization by upsert keys

2021-06-21 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-23054:


 Summary: Correct upsert optimization by upsert keys
 Key: FLINK-23054
 URL: https://issues.apache.org/jira/browse/FLINK-23054
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: 1.14.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-21 Thread Yangze Guo
Thanks, I have appended it to the known limitations of this FLIP.

Best,
Yangze Guo

On Mon, Jun 21, 2021 at 3:20 PM Zhu Zhu  wrote:
>
> Thanks for the quick response Yangze.
> The proposal sounds good to me.
>
> Thanks,
> Zhu
>
> Yangze Guo wrote on Mon, Jun 21, 2021 at 3:01 PM:
>>
>> Thanks for the comments, Zhu!
>>
>> Yes, it is a known limitation of fine-grained resource management. We
>> also filed this issue in FLINK-20865 when we proposed FLIP-156.
>>
>> As a first step, I agree that we can mark batch jobs with PIPELINED
>> edges as an invalid case for this feature. However, just throwing an
>> exception in that case might confuse users who do not understand the
>> concept of a pipelined region. Maybe we can force all the edges in this
>> scenario to BLOCKING at the compiling stage and document it well, so that
>> common users are not interrupted while expert users can understand the
>> cost of that usage and make their own decision. WDYT?
>>
>> Best,
>> Yangze Guo
>>
>> On Mon, Jun 21, 2021 at 2:24 PM Zhu Zhu  wrote:
>> >
>> > Thanks for proposing this @Yangze Guo and sorry for joining the discussion 
>> > so late.
>> > The proposal generally looks good to me. But I find one problem that batch 
>> > job with PIPELINED edges might hang if enabling fine-grained resources. 
>> > see "Resource Deadlocks could still happen in certain Cases" section in 
>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling
>> > However, this problem may happen only in batch cases with PIPELINED edges, 
>> > because
>> > 1. streaming jobs would always require all resource requirements to be 
>> > fulfilled at the same time.
>> > 2. batch jobs without PIPELINED edges consist of multiple single vertex 
>> > regions and thus each slot can be individually used and returned
>> > So maybe in the first step, let's mark batch jobs with PIPELINED edges as 
>> > an invalid case for fine-grained resources and throw exception for it in 
>> > early compiling stage?
>> >
>> > Thanks,
>> > Zhu
>> >
>> > Yangze Guo wrote on Tue, Jun 15, 2021 at 4:57 PM:
>> >>
>> >> Thanks for the supplement, Arvid and Yun. I've annotated these two
>> >> points in the FLIP.
>> >> The vote is now started in [1].
>> >>
>> >> [1] 
>> >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51381.html
>> >>
>> >> Best,
>> >> Yangze Guo
>> >>
>> >> On Fri, Jun 11, 2021 at 2:50 PM Yun Gao  
>> >> wrote:
>> >> >
>> >> > Hi,
>> >> >
>> >> > Very thanks @Yangze for bringing up this discuss. Overall +1 for
>> >> > exposing the fine-grained resource requirements in the DataStream API.
>> >> >
>> >> > One similar issue as Arvid has pointed out is that users may also 
>> >> > creating
>> >> > different SlotSharingGroup objects, with different names but with 
>> >> > different
>> >> > resources.  We might need to do some check internally. But We could also
>> >> > leave that during the development of the actual PR.
>> >> >
>> >> > Best,
>> >> > Yun
>> >> >
>> >> >
>> >> >
>> >> >  --Original Mail --
>> >> > Sender:Arvid Heise 
>> >> > Send Date:Thu Jun 10 15:33:37 2021
>> >> > Recipients:dev 
>> >> > Subject:Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained 
>> >> > Resource Requirements
>> >> > Hi Yangze,
>> >> >
>> >> >
>> >> >
>> >> > Thanks for incorporating the ideas and sorry for missing the builder 
>> >> > part.
>> >> >
>> >> > My main idea is that SlotSharingGroup is immutable, such that the user
>> >> >
>> >> > doesn't do:
>> >> >
>> >> >
>> >> >
>> >> > ssg = new SlotSharingGroup();
>> >> >
>> >> > ssg.setCpus(2);
>> >> >
>> >> > operator1.slotSharingGroup(ssg);
>> >> >
>> >> > ssg.setCpus(4);
>> >> >
>> >> > operator2.slotSharingGroup(ssg);
>> >> >
>> >> >
>> >> >
>> >> > and wonders why both operators have the same CPU spec. But the details 
>> >> > can
>> >> >
>> >> > be fleshed out in the actual PR.
>> >> >
>> >> >
>> >> >
>> >> > On Thu, Jun 10, 2021 at 5:13 AM Yangze Guo  wrote:
>> >> >
>> >> >
>> >> >
>> >> > > Thanks all for the discussion. I've updated the FLIP accordingly, the
>> >> >
>> >> > > key changes are:
>> >> >
>> >> > > - Introduce SlotSharingGroup instead of ResourceSpec which contains
>> >> >
>> >> > > the resource spec of slot sharing group
>> >> >
>> >> > > - Introduce two interfaces for specifying the SlotSharingGroup:
>> >> >
>> >> > > #slotSharingGroup(SlotSharingGroup) and
>> >> >
>> >> > > StreamExecutionEnvironment#registerSlotSharingGroup(SlotSharingGroup).
>> >> >
>> >> > >
>> >> >
>> >> > > If there is no more feedback, I'd start a vote next week.
>> >> >
>> >> > >
>> >> >
>> >> > > Best,
>> >> >
>> >> > > Yangze Guo
>> >> >
>> >> > >
>> >> >
>> >> > > On Wed, Jun 9, 2021 at 10:46 AM Yangze Guo  wrote:
>> >> >
>> >> > > >
>> >> >
>> >> > > > Thanks for the valuable suggestion, Arvid.
>> >> >
>> >> > > >
>> >> >
>> >> > > > 1) Yes, we can add a new SlotSharingGroup which includes the name 
>> >> > > > an

Unsubscribe

2021-06-21 Thread fengfuli



Re: [VOTE] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-21 Thread Yangze Guo
According to the latest comment from Zhu Zhu [1], I have appended the potential
resource deadlock in batch jobs as a known limitation to this FLIP.
Thus, I'd like to extend the voting period for another 72h.

[1] 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html

Best,
Yangze Guo

On Tue, Jun 15, 2021 at 7:53 PM Xintong Song  wrote:
>
> +1 (binding)
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Tue, Jun 15, 2021 at 6:21 PM Arvid Heise  wrote:
>
> > LGTM +1 (binding) from my side.
> >
> > On Tue, Jun 15, 2021 at 11:00 AM Yangze Guo  wrote:
> >
> > > Hi everyone,
> > >
> > > I'd like to start the vote of FLIP-169 [1]. This FLIP is discussed in
> > > the thread[2].
> > >
> > > The vote will be open for at least 72 hours. Unless there is an
> > > objection, I will try to close it by Jun. 18, 2021 if we have received
> > > sufficient votes.
> > >
> > > [1]
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-169+DataStream+API+for+Fine-Grained+Resource+Requirements
> > > [2]
> > >
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > >
> > > Best,
> > > Yangze Guo
> > >
> >


[jira] [Created] (FLINK-23055) Add document for Window TVF offset

2021-06-21 Thread JING ZHANG (Jira)
JING ZHANG created FLINK-23055:
--

 Summary: Add document for Window TVF offset
 Key: FLINK-23055
 URL: https://issues.apache.org/jira/browse/FLINK-23055
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.14.0
Reporter: JING ZHANG
 Fix For: 1.14.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23056) flink docs sql builtin functions out of date

2021-06-21 Thread Junhan Yang (Jira)
Junhan Yang created FLINK-23056:
---

 Summary: flink docs sql builtin functions out of date
 Key: FLINK-23056
 URL: https://issues.apache.org/jira/browse/FLINK-23056
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Junhan Yang
 Attachments: image-2021-06-21-16-44-27-117.png

!image-2021-06-21-16-44-27-117.png!

Functions `LAST_VALUE` and `FIRST_VALUE` should support multiple expressions as 
parameters



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23057) flink-console.sh doesn't do variable expansion for FLINK_ENV_JAVA_OPTS like flink-daemon.sh

2021-06-21 Thread LIU Xiao (Jira)
LIU Xiao created FLINK-23057:


 Summary: flink-console.sh doesn't do variable expansion for 
FLINK_ENV_JAVA_OPTS like flink-daemon.sh
 Key: FLINK-23057
 URL: https://issues.apache.org/jira/browse/FLINK-23057
 Project: Flink
  Issue Type: Bug
  Components: Client / Job Submission
Affects Versions: 1.12.4, 1.13.1
Reporter: LIU Xiao


In flink-daemon.sh:

 
{code:java}
...

# Evaluate user options for local variable expansion
FLINK_ENV_JAVA_OPTS=$(eval echo ${FLINK_ENV_JAVA_OPTS})

echo "Starting $DAEMON daemon on host $HOSTNAME."
"$JAVA_RUN" $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath 
"`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" 
${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 200<&- 2>&1 < /dev/null &

...
{code}
There is a "$(eval echo ...)" line, so variables like ${FLINK_LOG_PREFIX} in 
FLINK_ENV_JAVA_OPTS can be expanded, as described in 
[https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/debugging/application_profiling/]

But flink-console.sh doesn't have this line. Since kubernetes-jobmanager.sh and
kubernetes-taskmanager.sh both depend on flink-console.sh, variable expansion of
FLINK_ENV_JAVA_OPTS does not work in native Kubernetes application mode.

Adding that line to flink-console.sh solves the problem.
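A minimal shell demonstration of why the eval line matters (the paths below are examples only):

```shell
# Simulate the expansion that flink-daemon.sh performs before launching the JVM.
FLINK_LOG_PREFIX="/var/log/flink/flink-taskmanager-0"

# Single quotes keep ${FLINK_LOG_PREFIX} literal, as it would arrive from the config.
FLINK_ENV_JAVA_OPTS='-XX:HeapDumpPath=${FLINK_LOG_PREFIX}.hprof'
echo "before: $FLINK_ENV_JAVA_OPTS"

# The one line flink-console.sh is missing:
FLINK_ENV_JAVA_OPTS=$(eval echo ${FLINK_ENV_JAVA_OPTS})
echo "after:  $FLINK_ENV_JAVA_OPTS"
```

Without the eval, the JVM receives the literal string `${FLINK_LOG_PREFIX}.hprof` instead of the resolved log path.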

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23058) flink-elasticsearch-6 not work

2021-06-21 Thread yandufeng (Jira)
yandufeng created FLINK-23058:
-

 Summary: flink-elasticsearch-6 not work
 Key: FLINK-23058
 URL: https://issues.apache.org/jira/browse/FLINK-23058
 Project: Flink
  Issue Type: Bug
  Components: Connectors / ElasticSearch
Affects Versions: 1.13.1
Reporter: yandufeng


I have two questions.

1. When I add the Elasticsearch host and port, I can write a random host and no
error is reported. For example:

List<HttpHost> httpHosts = new ArrayList<>();
httpHosts.add(new HttpHost("sdfsdfsf", 9200, "http"));

2. When I write the correct Elasticsearch host and port, there is no response,
and no index is created in Elasticsearch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Change in accumutors semantics with jobClient

2021-06-21 Thread Till Rohrmann
Thanks for bringing this to the dev ML Etienne. Could you maybe update the
release notes for Flink 1.13 [1] to include this change? That way it might
be a bit more prominent. I think the change needs to go into the
release-1.13 and master branches.

[1]
https://github.com/apache/flink/blob/master/docs/content/release-notes/flink-1.13.md

Cheers,
Till


On Fri, Jun 18, 2021 at 2:45 PM Etienne Chauchot 
wrote:

> Hi all,
>
> I did a fix some time ago regarding accumulators:
> the /JobClient.getAccumulators()/ was infinitely blocking in local
> environment for a streaming job (1). The change (2) consisted of giving
> the current accumulators value for the running job. And when fixing this
> in the PR, it appeared that I had to change the accumulators semantics
> with /JobClient/ and I just realized that I forgot to bring this back to
> the ML:
>
> Previously /JobClient/ assumed that getAccumulator() was called on a
> bounded pipeline and that the user wanted to acquire the *final
> accumulator values* after the job is finished.
>
> But now it returns the *current value of accumulators* immediately to be
> compatible with unbounded pipelines.
>
> If it is run on a bounded pipeline, then to get the final accumulator
> values after the job is finished, one needs to call
>
> /getJobExecutionResult().thenApply(JobExecutionResult::getAllAccumulatorResults)/
>
> (1): https://issues.apache.org/jira/browse/FLINK-18685
>
> (2): https://github.com/apache/flink/pull/14558#
>
>
> Cheers,
>
> Etienne
>
>
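The two access patterns Etienne describes can be sketched with plain futures. This is a Flink-independent sketch: the JobClient and JobExecutionResult below are stand-ins, and Flink's real signatures differ.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Flink-independent sketch of the two accumulator access patterns.
class AccumulatorSemanticsSketch {

    static class JobExecutionResult {
        private final Map<String, Object> accumulators;
        JobExecutionResult(Map<String, Object> accumulators) { this.accumulators = accumulators; }
        Map<String, Object> getAllAccumulatorResults() { return accumulators; }
    }

    static class JobClient {
        // Live values while the (possibly unbounded) job is running.
        volatile Map<String, Object> current = Map.of("records", 42L);
        // Completed only once a bounded job finishes.
        CompletableFuture<JobExecutionResult> result =
            CompletableFuture.completedFuture(new JobExecutionResult(Map.of("records", 100L)));

        // New semantics: return the current values immediately.
        CompletableFuture<Map<String, Object>> getAccumulators() {
            return CompletableFuture.completedFuture(current);
        }

        CompletableFuture<JobExecutionResult> getJobExecutionResult() {
            return result;
        }
    }

    public static void main(String[] args) {
        JobClient client = new JobClient();
        // Current values, safe for unbounded pipelines:
        System.out.println("current: " + client.getAccumulators().join());
        // Final values of a bounded pipeline, after the job finishes:
        System.out.println("final: " + client.getJobExecutionResult()
            .thenApply(JobExecutionResult::getAllAccumulatorResults).join());
    }
}
```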


[jira] [Created] (FLINK-23059) Update playgrounds for Flink 1.13

2021-06-21 Thread David Anderson (Jira)
David Anderson created FLINK-23059:
--

 Summary: Update playgrounds for Flink 1.13
 Key: FLINK-23059
 URL: https://issues.apache.org/jira/browse/FLINK-23059
 Project: Flink
  Issue Type: Improvement
  Components: Documentation / Training / Exercises
Affects Versions: 1.13.0
Reporter: David Anderson
Assignee: David Anderson


The various playgrounds in apache/flink-playgrounds all need an update for the 
1.13 release.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2

2021-06-21 Thread Till Rohrmann
Thanks for starting this discussion Dawid. We have collected a couple of
fixes for the different releases:

#fixes:
1.11.4: 72
1.12.5: 35
1.13.2: 49

which in my opinion warrants new bugfix releases. Note that we intend to do
another 1.11 release because of the seriousness of the FLINK-22815 [1]
which can lead to silent data loss.

I think that FLINK-23025 [2] might be nice to include but I wouldn't block
the release on it.

@pnowojski  do you have an ETA for FLINK-23011 [3]? I
do agree that this would be nice to fix but on the other hand, this issue
is in Flink since the introduction of FLIP-27. Moreover, it does not affect
Flink 1.11.x.

[1] https://issues.apache.org/jira/browse/FLINK-22815
[2] https://issues.apache.org/jira/browse/FLINK-23025
[3] https://issues.apache.org/jira/browse/FLINK-23011

Cheers,
Till

On Fri, Jun 18, 2021 at 5:15 PM Piotr Nowojski  wrote:

> Hi,
>
> Thanks for bringing this up. I think before releasing 1.12.x/1.13.x/1.14.x,
> it would be good to decide what to do with FLINK-23011 [1] and if there is
> a relatively easy fix, I would wait for it before releasing.
>
> Best,
> Piotrek
>
> [1] with https://issues.apache.org/jira/browse/FLINK-23011
>
> pt., 18 cze 2021 o 16:35 Konstantin Knauf  napisał(a):
>
> > Hi Dawid,
> >
> > Thank you for starting the discussion. I'd like to add
> > https://issues.apache.org/jira/browse/FLINK-23025 to the list for Flink
> > 1.13.2.
> >
> > Cheers,
> >
> > Konstantin
> >
> > On Fri, Jun 18, 2021 at 3:26 PM Dawid Wysakowicz  >
> > wrote:
> >
> > > Hi devs,
> > >
> > > Quite recently we pushed, in our opinion, quite an important fix[1] for
> > > unaligned checkpoints which disables UC for broadcast partitioning.
> > > Without the fix there might be some broadcast state corruption.
> > > Therefore we think it would be beneficial to release it soonish. What
> do
> > > you think? Do you have other issues in mind you'd like to have included
> > > in these versions.
> > >
> > > Would someone be willing to volunteer to help with the releases as a
> > > release manager? I guess there is a couple of spots to fill in here ;)
> > >
> > > Best,
> > >
> > > Dawid
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-22815
> > >
> > >
> > >
> >
> > --
> >
> > Konstantin Knauf
> >
> > https://twitter.com/snntrable
> >
> > https://github.com/knaufk
> >
>


Re: Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-21 Thread Till Rohrmann
I would be more in favor of what Zhu Zhu proposed to throw an exception
with a meaningful and understandable explanation that also includes how to
resolve this problem. I do understand the reasoning behind automatically
switching the edge types in order to make things easier to use but a) this
can also be confusing if the user does not expect this to happen and b) it
can add some complexity which makes other feature development harder in the
future because users might rely on it. An example of such a case I stumbled
upon rather recently is that we adjust the maximum parallelism wrt the
given savepoint if it has not been explicitly configured. On paper this
sounds like a good usability improvement; however, for the
AdaptiveScheduler it posed quite annoying complexity. If instead we said
that we fail the job submission if the max parallelism does not equal the
max parallelism of the savepoint, it would have been a lot easier.

Cheers,
Till
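The fail-fast alternative amounts to a single up-front check with an actionable message. A hypothetical sketch (the class, method, and wording are illustrative, not Flink code):

```java
// Hypothetical fail-fast validation in the spirit Till prefers: reject
// the unsupported combination at compile/submission time with a message
// that tells the user how to resolve it.
class FineGrainedValidator {
    static void validate(boolean isBatch, boolean hasPipelinedEdges, boolean fineGrained) {
        if (isBatch && hasPipelinedEdges && fineGrained) {
            throw new IllegalStateException(
                "Fine-grained resource requirements are not supported for batch jobs "
                + "with PIPELINED edges, as this can lead to resource deadlocks. "
                + "Either switch all edges to BLOCKING or remove the fine-grained "
                + "resource requirements.");
        }
    }

    public static void main(String[] args) {
        validate(true, false, true);   // batch, single-vertex regions: fine
        validate(false, true, true);   // streaming: fine
        try {
            validate(true, true, true);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```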

On Mon, Jun 21, 2021 at 9:36 AM Yangze Guo  wrote:

> Thanks, I append it to the known limitations of this FLIP.
>
> Best,
> Yangze Guo
>
> On Mon, Jun 21, 2021 at 3:20 PM Zhu Zhu  wrote:
> >
> > Thanks for the quick response Yangze.
> > The proposal sounds good to me.
> >
> > Thanks,
> > Zhu
> >
> > Yangze Guo wrote on Mon, Jun 21, 2021 at 3:01 PM:
> >>
> >> Thanks for the comments, Zhu!
> >>
> >> Yes, it is a known limitation of fine-grained resource management. We
> >> also filed this issue in FLINK-20865 when we proposed FLIP-156.
> >>
> >> As a first step, I agree that we can mark batch jobs with PIPELINED
> >> edges as an invalid case for this feature. However, just throwing an
> >> exception in that case might confuse users who do not understand the
> >> concept of a pipelined region. Maybe we can force all the edges in this
> >> scenario to BLOCKING at the compiling stage and document it well, so that
> >> common users are not interrupted while expert users can understand the
> >> cost of that usage and make their own decision. WDYT?
> >>
> >> Best,
> >> Yangze Guo
> >>
> >> On Mon, Jun 21, 2021 at 2:24 PM Zhu Zhu  wrote:
> >> >
> >> > Thanks for proposing this @Yangze Guo and sorry for joining the
> discussion so late.
> >> > The proposal generally looks good to me. But I find one problem that
> batch job with PIPELINED edges might hang if enabling fine-grained
> resources. see "Resource Deadlocks could still happen in certain Cases"
> section in
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling
> >> > However, this problem may happen only in batch cases with PIPELINED
> edges, because
> >> > 1. streaming jobs would always require all resource requirements to
> be fulfilled at the same time.
> >> > 2. batch jobs without PIPELINED edges consist of multiple single
> vertex regions and thus each slot can be individually used and returned
> >> > So maybe in the first step, let's mark batch jobs with PIPELINED
> edges as an invalid case for fine-grained resources and throw exception for
> it in early compiling stage?
> >> >
> >> > Thanks,
> >> > Zhu
> >> >
> >> > Yangze Guo wrote on Tue, Jun 15, 2021 at 4:57 PM:
> >> >>
> >> >> Thanks for the supplement, Arvid and Yun. I've annotated these two
> >> >> points in the FLIP.
> >> >> The vote is now started in [1].
> >> >>
> >> >> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51381.html
> >> >>
> >> >> Best,
> >> >> Yangze Guo
> >> >>
> >> >> On Fri, Jun 11, 2021 at 2:50 PM Yun Gao 
> wrote:
> >> >> >
> >> >> > Hi,
> >> >> >
> >> >> > Very thanks @Yangze for bringing up this discuss. Overall +1 for
> >> >> > exposing the fine-grained resource requirements in the DataStream
> API.
> >> >> >
> >> >> > One similar issue as Arvid has pointed out is that users may also
> creating
> >> >> > different SlotSharingGroup objects, with different names but with
> different
> >> >> > resources.  We might need to do some check internally. But We
> could also
> >> >> > leave that during the development of the actual PR.
> >> >> >
> >> >> > Best,
> >> >> > Yun
> >> >> >
> >> >> >
> >> >> >
> >> >> >  --Original Mail --
> >> >> > Sender:Arvid Heise 
> >> >> > Send Date:Thu Jun 10 15:33:37 2021
> >> >> > Recipients:dev 
> >> >> > Subject:Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained
> Resource Requirements
> >> >> > Hi Yangze,
> >> >> >
> >> >> >
> >> >> >
> >> >> > Thanks for incorporating the ideas and sorry for missing the
> builder part.
> >> >> >
> >> >> > My main idea is that SlotSharingGroup is immutable, such that the
> user
> >> >> >
> >> >> > doesn't do:
> >> >> >
> >> >> >
> >> >> >
> >> >> > ssg = new SlotSharingGroup();
> >> >> >
> >> >> > ssg.setCpus(2);
> >> >> >
> >> >> > operator1.slotSharingGroup(ssg);
> >> >> >
> >> >> > ssg.setCpus(4);
> >> >> >
> >> >> > operator2.slotSharingGroup(ssg);
> >> >> >
> >> >> >
> >> >> >
> >> >> > and wonders why both operators have th

Re: [DISCUSS] FLIP-171: Async Sink

2021-06-21 Thread Arvid Heise
Hi Piotr,

to pick up this discussion thread again:
- This FLIP is about providing some base implementation for FLIP-143 sinks
that make adding new implementations easier, similar to the
SourceReaderBase.
- The whole availability topic will most likely be a separate FLIP. The
basic issue just popped up here because we currently have no way to signal
backpressure in sinks except by blocking `write`. This feels quite natural
in sinks with sync communication but quite unnatural in async sinks.

Now we have a couple of options. In all cases, we would have some WIP limit
on the number of records/requests being able to be processed in parallel
asynchronously (similar to asyncIO).
1. We use some blocking queue in `write`, then we need to handle
interruptions. In the easiest case, we extend `write` to throw the
`InterruptedException`, which is a small API change.
2. We use a blocking queue, but handle interrupts and swallow/translate
them. No API change.
Both solutions block the task thread, so any RPC message / unaligned
checkpoint would be processed only after the backpressure is temporarily
lifted. That's similar to the discussions that you linked. Cancellation may
also be a tad harder on 2.
3. We could also add some `wakeUp` to the `SinkWriter` similar to
`SplitFetcher` [1]. Basically, you use a normal queue with a completable
future on which you block. Wakeup would be a clean way to complete it next
to the natural completion through finished requests.
4. We add availability to the sink. However, this API change also requires
that we allow operators to be available so it may be a bigger change with
undesired side-effects. On the other hand, we could also use the same
mechanism for asyncIO.

For users of FLIP-171, none of the options are exposed. So we could also
start with a simple solution (add `InterruptedException`) and later try to
add availability. Option 1+2 would also not require an additional FLIP; we
could add it as part of this FLIP.
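To make option 3 concrete, here is a minimal, dependency-free sketch of the queue-plus-future idea. The names (`InFlightLimiter`, `acquire`, `release`) are invented for illustration and are not part of any Flink API; the real `wakeUp` design would live in the `SinkWriter` contract.

```java
import java.util.concurrent.CompletableFuture;

public class InFlightLimiterSketch {

    static final class InFlightLimiter {
        private final int maxInFlight;
        private int inFlight;
        private CompletableFuture<Void> availability = CompletableFuture.completedFuture(null);
        private final Object lock = new Object();

        InFlightLimiter(int maxInFlight) {
            this.maxInFlight = maxInFlight;
        }

        /** Called from write(): blocks on a future while the in-flight limit is reached. */
        void acquire() {
            while (true) {
                CompletableFuture<Void> toAwait;
                synchronized (lock) {
                    if (inFlight < maxInFlight) {
                        inFlight++;
                        return;
                    }
                    if (availability.isDone()) {
                        availability = new CompletableFuture<>();
                    }
                    toAwait = availability;
                }
                toAwait.join(); // the task thread parks here until a request finishes
            }
        }

        /** Completion callback of an async request: this acts as the "wakeUp". */
        void release() {
            CompletableFuture<Void> toComplete;
            synchronized (lock) {
                inFlight--;
                toComplete = availability;
            }
            toComplete.complete(null); // completes the future that write() is blocked on
        }

        int inFlightCount() {
            synchronized (lock) {
                return inFlight;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        InFlightLimiter limiter = new InFlightLimiter(1);
        limiter.acquire();                               // first request occupies the slot
        Thread completer = new Thread(limiter::release); // simulates the async completion callback
        completer.start();
        limiter.acquire();                               // blocks until release() wakes it up
        completer.join();
        System.out.println("in flight: " + limiter.inFlightCount()); // in flight: 1
    }
}
```

The point of the sketch is that blocking on a future (rather than inside a lock or a raw queue) gives a clean hook for cancellation: completing the future from outside unblocks the task thread without interrupts.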

Best,

Arvid

[1]
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/fetcher/SplitFetcher.java#L258-L258
On Thu, Jun 10, 2021 at 10:09 AM Hausmann, Steffen
 wrote:

> Hey Piotrek,
>
> Thanks for your comments on the FLIP. I'll address your second question
> first, as I think it's more central to this FLIP. Just looking at the AWS
> ecosystem, there are several sinks with overlapping functionality. I've
> chosen AWS sinks here because I'm most familiar with those, but a similar
> argument applies more generically for destinations that support async ingest.
>
> There is, for instance, a sink for Amazon Kinesis Data Streams that is
> part of Apache Flink [1], a sink for Amazon Kinesis Data Firehose [2], a
> sink for Amazon DynamoDB [3], and a sink for Amazon Timestream [4]. All
> these sinks have implemented their own mechanisms for batching, persisting,
> and retrying events. And I'm not sure if all of them properly participate
> in checkpointing. [3] even seems to closely mirror [1] as it contains
> references to the Kinesis Producer Library, which is unrelated to Amazon
> DynamoDB.
>
> These sinks predate FLIP-143. But as batching, persisting, and retrying
> capabilities do not seem to be part of FLIP-143, I'd argue that we would
> end up with similar duplication, even if these sinks were rewritten today
> based on FLIP-143. And that's the idea of FLIP-171: abstract away these
> commonly required capabilities so that it becomes easy to create support
> for a wide range of destination without having to think about batching,
> retries, checkpointing, etc. I've included an example in the FLIP [5] that
> shows that it only takes a couple of lines of code to implement a sink with
> exactly-once semantics. To be fair, the example is lacking robust failure
> handling and some more advanced capabilities of [1], but I think it still
> supports this point.
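To illustrate the idea without relying on the eventual FLIP-171 class names, here is a heavily simplified, dependency-free sketch; `TinyAsyncWriter`, `submitBatch`, and `CollectingWriter` are invented names, and the real base class would additionally cover retries, time-based flushing, and checkpointing of in-flight requests.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyAsyncSinkDemo {

    /** The batching core that the FLIP proposes to factor out of each sink. */
    abstract static class TinyAsyncWriter<T> {
        private final List<T> buffer = new ArrayList<>();
        private final int maxBatchSize;

        TinyAsyncWriter(int maxBatchSize) {
            this.maxBatchSize = maxBatchSize;
        }

        /** The only thing a destination-specific sink has to implement. */
        protected abstract void submitBatch(List<T> batch);

        public void write(T element) {
            buffer.add(element);
            if (buffer.size() >= maxBatchSize) {
                flush();
            }
        }

        /** Called when the buffer is full and on every checkpoint, so no data is lost. */
        public void flush() {
            if (!buffer.isEmpty()) {
                submitBatch(new ArrayList<>(buffer));
                buffer.clear();
            }
        }
    }

    /** A "couple of lines" sink: it only says where a batch goes. */
    static final class CollectingWriter extends TinyAsyncWriter<String> {
        final List<List<String>> batches = new ArrayList<>();

        CollectingWriter(int maxBatchSize) {
            super(maxBatchSize);
        }

        @Override
        protected void submitBatch(List<String> batch) {
            batches.add(batch); // a real sink would issue an async request here
        }
    }

    public static void main(String[] args) {
        CollectingWriter writer = new CollectingWriter(2);
        for (String s : new String[] {"a", "b", "c", "d", "e"}) {
            writer.write(s);
        }
        writer.flush(); // checkpoint-time flush for the trailing element
        System.out.println(writer.batches); // [[a, b], [c, d], [e]]
    }
}
```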
>
> Regarding your point on the isAvailable pattern. We need some way for the
> sink to propagate backpressure and we would also like to support time based
> buffering hints. There are two options I currently see and would need
> additional input on which one is the better or more desirable one. The
> first option is to use the non-blocking isAvailable pattern. Internally,
> the sink persists buffered events in the snapshot state which avoids having
> to flush buffered records on a checkpoint. This seems to align well with the
> non-blocking isAvailable pattern. The second option is to make calls to
> `write` blocking and leverage an internal thread to trigger flushes based
> on time based buffering hints. We've discussed these options with Arvid and
> suggested assuming that the `isAvailable` pattern will become available
> for sinks through an additional FLIP.
>
> I think it is an important discussion to have. My understanding of the
> implications for Flink in general is very naïve, so I'd be happy to get
> further guidance. However

[jira] [Created] (FLINK-23060) Move FutureUtils.toJava into separate class

2021-06-21 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23060:


 Summary: Move FutureUtils.toJava into separate class
 Key: FLINK-23060
 URL: https://issues.apache.org/jira/browse/FLINK-23060
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


FutureUtils.toJava converts a scala future to a java CompletableFuture.

In FLINK-18783 this method will be moved to a new akka-specific module, and we 
can simplify that move by extracting it into a separate class now.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23061) Split BootstrapTools

2021-06-21 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23061:


 Summary: Split BootstrapTools
 Key: FLINK-23061
 URL: https://issues.apache.org/jira/browse/FLINK-23061
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


The BootstrapTools contain utils for creating Java processes and actor systems.
The akka parts will be moved into a separate module in FLINK-18783, and this 
would be easier if the tools were already split ahead of time.





[jira] [Created] (FLINK-23062) FLIP-129: Register sources/sinks in Table API

2021-06-21 Thread Jira
Ingo Bürk created FLINK-23062:
-

 Summary: FLIP-129: Register sources/sinks in Table API
 Key: FLINK-23062
 URL: https://issues.apache.org/jira/browse/FLINK-23062
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: Ingo Bürk
Assignee: Ingo Bürk


(!) FLIP-129 is awaiting another vote. These issues are preliminary. (!)

https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API





[jira] [Created] (FLINK-23063) Remove TableEnvironment#connect

2021-06-21 Thread Jira
Ingo Bürk created FLINK-23063:
-

 Summary: Remove TableEnvironment#connect
 Key: FLINK-23063
 URL: https://issues.apache.org/jira/browse/FLINK-23063
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Ingo Bürk








[jira] [Created] (FLINK-23064) Centralize connector options

2021-06-21 Thread Jira
Ingo Bürk created FLINK-23064:
-

 Summary: Centralize connector options
 Key: FLINK-23064
 URL: https://issues.apache.org/jira/browse/FLINK-23064
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Ingo Bürk


For built-in connectors, we need to refactor their corresponding *Options 
classes to
 * … not contain any internal things
 * … be marked PublicEvolving
 * … be located in a common package





[jira] [Created] (FLINK-23066) Implement TableEnvironment#from

2021-06-21 Thread Jira
Ingo Bürk created FLINK-23066:
-

 Summary: Implement TableEnvironment#from
 Key: FLINK-23066
 URL: https://issues.apache.org/jira/browse/FLINK-23066
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Ingo Bürk








[jira] [Created] (FLINK-23067) Implement Table#executeInsert(TableDescriptor)

2021-06-21 Thread Jira
Ingo Bürk created FLINK-23067:
-

 Summary: Implement Table#executeInsert(TableDescriptor)
 Key: FLINK-23067
 URL: https://issues.apache.org/jira/browse/FLINK-23067
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Ingo Bürk








[jira] [Created] (FLINK-23068) Support LIKE in TableDescriptor

2021-06-21 Thread Jira
Ingo Bürk created FLINK-23068:
-

 Summary: Support LIKE in TableDescriptor
 Key: FLINK-23068
 URL: https://issues.apache.org/jira/browse/FLINK-23068
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Ingo Bürk








[jira] [Created] (FLINK-23069) Support schema-less #executeInsert(TableDescriptor)

2021-06-21 Thread Jira
Ingo Bürk created FLINK-23069:
-

 Summary: Support schema-less #executeInsert(TableDescriptor)
 Key: FLINK-23069
 URL: https://issues.apache.org/jira/browse/FLINK-23069
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Ingo Bürk


The schema should be inferred automatically if no schema was defined in the 
descriptor.





[jira] [Created] (FLINK-23070) Implement TableEnvironment#createTable

2021-06-21 Thread Jira
Ingo Bürk created FLINK-23070:
-

 Summary: Implement TableEnvironment#createTable
 Key: FLINK-23070
 URL: https://issues.apache.org/jira/browse/FLINK-23070
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Ingo Bürk








[jira] [Created] (FLINK-23065) Implement TableEnvironment#createTemporaryTable

2021-06-21 Thread Jira
Ingo Bürk created FLINK-23065:
-

 Summary: Implement TableEnvironment#createTemporaryTable
 Key: FLINK-23065
 URL: https://issues.apache.org/jira/browse/FLINK-23065
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Ingo Bürk








[jira] [Created] (FLINK-23071) Implement StatementSet#addInsert(TableDescriptor, Table)

2021-06-21 Thread Jira
Ingo Bürk created FLINK-23071:
-

 Summary: Implement StatementSet#addInsert(TableDescriptor, Table)
 Key: FLINK-23071
 URL: https://issues.apache.org/jira/browse/FLINK-23071
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Ingo Bürk








Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2

2021-06-21 Thread Yun Tang
Hi Dawid,

Thanks for driving this discussion, I am willing to volunteer as the release 
manager for these versions.


Best
Yun Tang

From: Konstantin Knauf 
Sent: Friday, June 18, 2021 22:35
To: dev 
Subject: Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2

Hi Dawid,

Thank you for starting the discussion. I'd like to add
https://issues.apache.org/jira/browse/FLINK-23025 to the list for Flink
1.13.2.

Cheers,

Konstantin

On Fri, Jun 18, 2021 at 3:26 PM Dawid Wysakowicz 
wrote:

> Hi devs,
>
> Quite recently we pushed, in our opinion, quite an important fix[1] for
> unaligned checkpoints which disables UC for broadcast partitioning.
> Without the fix there might be some broadcast state corruption.
> Therefore we think it would be beneficial to release it soonish. What do
> you think? Do you have other issues in mind you'd like to have included
> in these versions?
>
> Would someone be willing to volunteer to help with the releases as a
> release manager? I guess there are a couple of spots to fill in here ;)
>
> Best,
>
> Dawid
>
> [1] https://issues.apache.org/jira/browse/FLINK-22815
>
>
>

--

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk


[VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread Ingo Bürk
Hi everyone,

thanks for all the feedback so far. Based on the discussion[1] we seem to
have consensus, so I would like to start a vote on FLIP-129 for which the
FLIP has now also been updated[2].

The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT) unless
there is an objection or insufficient votes.


Thanks
Ingo

[1]
https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API


Re: [DISCUSS] FLIP-171: Async Sink

2021-06-21 Thread Piotr Nowojski
Hi,

Thanks Steffen for the explanations. I think it makes sense to me.

Re Arvid/Steffen:

- Keep in mind that even if we choose to provide a non blocking API using
the `isAvailable()`/`getAvailableFuture()` method, we would still need to
support blocking inside the sinks. For example at the very least, emitting
many records at once (`flatMap`) or firing timers are scenarios when output
availability would be ignored at the moment by the runtime. Also I would
imagine writing very large (like 1GB) records would be blocking on
something as well.
- Secondly, exposing availability to the API level might not be that
easy/trivial. The availability pattern as defined in `AvailabilityProvider`
class is quite complicated and not that easy to implement by a user.

Both of those combined with lack of a clear motivation for adding
`AvailabilityProvider` to the sinks/operators/functions,  I would vote on
just starting with blocking `write` calls. This can always be extended in
the future with availability if needed/motivated properly.

That would be aligned with either Arvid's option 1 or 2. I don't know what
the best practices are with `InterruptedException`, but I'm always afraid
of it, so I would feel personally safer with option 2.
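For illustration, option 2 could look roughly like this. This is a dependency-free sketch: the queue size, the exception type, and the class name are arbitrary choices for the example, not a proposal for the actual signatures.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class InterruptTranslatingWriter {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);

    /** Option 2: handle the interrupt inside write(), so the interface stays unchanged. */
    public void write(String element) {
        try {
            queue.put(element); // blocks while the in-flight limit is reached
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag so cancellation still works
            throw new IllegalStateException("interrupted while backpressured", e);
        }
    }

    public int buffered() {
        return queue.size();
    }

    public static void main(String[] args) {
        InterruptTranslatingWriter writer = new InterruptTranslatingWriter();
        writer.write("a");
        writer.write("b");
        System.out.println(writer.buffered()); // 2
    }
}
```

The subtle part is restoring the interrupt flag before translating: swallowing it silently would make task cancellation harder, which is exactly the concern raised above.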

I'm not sure what problem option 3 is helping to solve. Adding `wakeUp()`
would sound strange to me.

Best,
Piotrek

On Mon, Jun 21, 2021 at 12:15 PM Arvid Heise  wrote:

> Hi Piotr,
>
> to pick up this discussion thread again:
> - This FLIP is about providing some base implementation for FLIP-143 sinks
> that make adding new implementations easier, similar to the
> SourceReaderBase.
> - The whole availability topic will most likely be a separate FLIP. The
> basic issue just popped up here because we currently have no way to signal
> backpressure in sinks except by blocking `write`. This feels quite natural
> in sinks with sync communication but quite unnatural in async sinks.
>
> Now we have a couple of options. In all cases, we would have some WIP
> limit on the number of records/requests that can be processed in
> parallel asynchronously (similar to asyncIO).
> 1. We use some blocking queue in `write`, then we need to handle
> interruptions. In the easiest case, we extend `write` to throw the
> `InterruptedException`, which is a small API change.
> 2. We use a blocking queue, but handle interrupts and swallow/translate
> them. No API change.
> Both solutions block the task thread, so any RPC message / unaligned
> checkpoint would be processed only after the backpressure is temporarily
> lifted. That's similar to the discussions that you linked. Cancellation may
> also be a tad harder on 2.
> 3. We could also add some `wakeUp` to the `SinkWriter` similar to
> `SplitFetcher` [1]. Basically, you use a normal queue with a completable
> future on which you block. Wakeup would be a clean way to complete it next
> to the natural completion through finished requests.
> 4. We add availability to the sink. However, this API change also requires
> that we allow operators to be available so it may be a bigger change with
> undesired side-effects. On the other hand, we could also use the same
> mechanism for asyncIO.
>
> For users of FLIP-171, none of the options are exposed. So we could also
> start with a simple solution (add `InterruptedException`) and later try to
> add availability. Option 1+2 would also not require an additional FLIP; we
> could add it as part of this FLIP.
>
> Best,
>
> Arvid
>
> [1]
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/fetcher/SplitFetcher.java#L258-L258
> On Thu, Jun 10, 2021 at 10:09 AM Hausmann, Steffen
>  wrote:
>
>> Hey Piotrek,
>>
>> Thanks for your comments on the FLIP. I'll address your second question
>> first, as I think it's more central to this FLIP. Just looking at the AWS
>> ecosystem, there are several sinks with overlapping functionality. I've
>> chosen AWS sinks here because I'm most familiar with those, but a similar
>> argument applies more generically for destinations that support async ingest.
>>
>> There is, for instance, a sink for Amazon Kinesis Data Streams that is
>> part of Apache Flink [1], a sink for Amazon Kinesis Data Firehose [2], a
>> sink for Amazon DynamoDB [3], and a sink for Amazon Timestream [4]. All
>> these sinks have implemented their own mechanisms for batching, persisting,
>> and retrying events. And I'm not sure if all of them properly participate
>> in checkpointing. [3] even seems to closely mirror [1] as it contains
>> references to the Kinesis Producer Library, which is unrelated to Amazon
>> DynamoDB.
>>
>> These sinks predate FLIP-143. But as batching, persisting, and retrying
>> capabilities do not seem to be part of FLIP-143, I'd argue that we would
>> end up with similar duplication, even if these sinks were rewritten today
>> based on FLIP-143. And that's the idea of FLIP-171: abstract away these
>> commonly re

Re: Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-21 Thread Yangze Guo
Thanks for the feedback, Till!

Actually, we cannot give user any resolution for this issue as there
is no API for DataStream users to influence the edge types at the
moment. The edge types are currently fixed based on the jobs' mode
(batch or streaming).
a) I think it might not confuse the user a lot as the behavior has
never been documented or guaranteed to be unchanged.
b) Thanks for your illustration. I agree that added complexity can make
other feature development harder in the future. However, I think this
might not introduce much complexity. In this case, we construct an
all-edges-blocking job graph, which has existed since 1.11 and
should have been considered by subsequent features. I admit we
cannot assume the all-edges-blocking job graph will exist forever in
Flink, but AFAIK there is no foreseeable feature that intends to
deprecate it.
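As a toy illustration of what the compile-time rewrite would do (all types and names here are invented for the sketch; the real change would live in the StreamGraph/JobGraph translation):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlockingEdgeRewrite {

    enum EdgeType { PIPELINED, BLOCKING }

    /**
     * The proposed rule: for a batch job with fine-grained resources enabled,
     * force every edge to BLOCKING so that each vertex forms its own region.
     */
    static List<EdgeType> rewrite(boolean batchMode, boolean fineGrained, List<EdgeType> edges) {
        if (!batchMode || !fineGrained) {
            return edges; // streaming jobs and coarse-grained batch jobs are untouched
        }
        List<EdgeType> out = new ArrayList<>();
        for (int i = 0; i < edges.size(); i++) {
            out.add(EdgeType.BLOCKING);
        }
        return out;
    }

    public static void main(String[] args) {
        List<EdgeType> edges = Arrays.asList(EdgeType.PIPELINED, EdgeType.BLOCKING);
        System.out.println(rewrite(true, true, edges));  // [BLOCKING, BLOCKING]
        System.out.println(rewrite(false, true, edges)); // [PIPELINED, BLOCKING]
    }
}
```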

WDYT?



Best,
Yangze Guo

On Mon, Jun 21, 2021 at 6:10 PM Till Rohrmann  wrote:
>
> I would be more in favor of what Zhu Zhu proposed to throw an exception
> with a meaningful and understandable explanation that also includes how to
> resolve this problem. I do understand the reasoning behind automatically
> switching the edge types in order to make things easier to use but a) this
> can also be confusing if the user does not expect this to happen and b) it
> can add some complexity which makes other feature development harder in the
> future because users might rely on it. An example of such a case I stumbled
> upon rather recently is that we adjust the maximum parallelism wrt the
> given savepoint if it has not been explicitly configured. On paper, this
> sounds like a good usability improvement, however, for the
> AdaptiveScheduler it posed a quite annoying complexity. If instead, we said
> that we fail the job submission if the max parallelism does not equal the
> max parallelism of the savepoint, it would have been a lot easier.
>
> Cheers,
> Till
>
> On Mon, Jun 21, 2021 at 9:36 AM Yangze Guo  wrote:
>
> > Thanks, I append it to the known limitations of this FLIP.
> >
> > Best,
> > Yangze Guo
> >
> > On Mon, Jun 21, 2021 at 3:20 PM Zhu Zhu  wrote:
> > >
> > > Thanks for the quick response Yangze.
> > > The proposal sounds good to me.
> > >
> > > Thanks,
> > > Zhu
> > >
> > >> On Mon, Jun 21, 2021 at 3:01 PM Yangze Guo  wrote:
> > >>
> > >> Thanks for the comments, Zhu!
> > >>
> > >> Yes, it is a known limitation for fine-grained resource management. We
> > >> also have filed this issue in FLINK-20865 when we proposed FLIP-156.
> > >>
> > >> As a first step, I agree that we can mark batch jobs with PIPELINED
> > >> edges as an invalid case for this feature. However, just throwing an
> > >> exception, in that case, might confuse users who do not understand the
> > >> concept of pipeline region. Maybe we can force all the edges in this
> > >> scenario to BLOCKING in compiling stage and well document it. So that,
> > >> common users will not be interrupted while the expert users can
> > >> understand the cost of that usage and make their decision. WDYT?
> > >>
> > >> Best,
> > >> Yangze Guo
> > >>
> > >> On Mon, Jun 21, 2021 at 2:24 PM Zhu Zhu  wrote:
> > >> >
> > >> > Thanks for proposing this @Yangze Guo and sorry for joining the
> > discussion so late.
> > >> > The proposal generally looks good to me. But I find one problem that
> > batch job with PIPELINED edges might hang if enabling fine-grained
> > resources. see "Resource Deadlocks could still happen in certain Cases"
> > section in
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling
> > >> > However, this problem may happen only in batch cases with PIPELINED
> > edges, because
> > >> > 1. streaming jobs would always require all resource requirements to
> > be fulfilled at the same time.
> > >> > 2. batch jobs without PIPELINED edges consist of multiple single
> > vertex regions and thus each slot can be individually used and returned
> > >> > So maybe in the first step, let's mark batch jobs with PIPELINED
> > edges as an invalid case for fine-grained resources and throw exception for
> > it in early compiling stage?
> > >> >
> > >> > Thanks,
> > >> > Zhu
> > >> >
> > >> >> On Tue, Jun 15, 2021 at 4:57 PM Yangze Guo  wrote:
> > >> >>
> > >> >> Thanks for the supplement, Arvid and Yun. I've annotated these two
> > >> >> points in the FLIP.
> > >> >> The vote is now started in [1].
> > >> >>
> > >> >> [1]
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51381.html
> > >> >>
> > >> >> Best,
> > >> >> Yangze Guo
> > >> >>
> > >> >> On Fri, Jun 11, 2021 at 2:50 PM Yun Gao 
> > wrote:
> > >> >> >
> > >> >> > Hi,
> > >> >> >
> > >> >> > Very thanks @Yangze for bringing up this discuss. Overall +1 for
> > >> >> > exposing the fine-grained resource requirements in the DataStream
> > API.
> > >> >> >
> > >> >> > One similar issue as Arvid has pointed out is that users may also
> > creating
> > >> >> > different

[jira] [Created] (FLINK-23072) Add benchmarks for SQL internal and external serializers

2021-06-21 Thread Timo Walther (Jira)
Timo Walther created FLINK-23072:


 Summary: Add benchmarks for SQL internal and external serializers
 Key: FLINK-23072
 URL: https://issues.apache.org/jira/browse/FLINK-23072
 Project: Flink
  Issue Type: Improvement
  Components: Benchmarks
Reporter: Timo Walther


Currently, we don't benchmark any of the serializers of the SQL layer.

We should test {{RowData}} with at least a field of each logical type.

Also {{ExternalTypeInfo}} might be interesting to monitor because it is used 
between Table API and DataStream API.





Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-21 Thread Gabor Somogyi
Hi All,

We see that adding any kind of specific authentication raises more
questions than answers.
What if a generic API were added without any real
authentication logic?
That way every provider can add its own protocol implementation as
additional jar.
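A rough sketch of what such an SPI could look like. Every name here is hypothetical, and discovery via the standard `ServiceLoader` mechanism is just one possible way to pick up provider jars:

```java
import java.util.Optional;
import java.util.ServiceLoader;

public class AuthSpiSketch {

    /** The only thing Flink itself would ship: a protocol-agnostic hook. */
    public interface RestAuthenticator {
        String name();

        /** Returns the authenticated user, or empty to reject the request. */
        Optional<String> authenticate(String authorizationHeader);
    }

    /** Looks up an implementation contributed by a provider jar on the classpath. */
    public static Optional<RestAuthenticator> load(String name) {
        for (RestAuthenticator authenticator : ServiceLoader.load(RestAuthenticator.class)) {
            if (authenticator.name().equals(name)) {
                return Optional.of(authenticator);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Without a provider jar on the classpath, nothing is found:
        System.out.println(AuthSpiSketch.load("kerberos").isPresent()); // false
    }
}
```

With something like this, Kerberos, OIDC, or Basic auth would each live in its own external jar, and Flink core would only configure which provider name to load.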

BR,
G


On Thu, Jun 17, 2021 at 7:53 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

> Hi all,
>
> Sorry to be joining the conversation late. I'm also on the side of
> Konstantin, generally, in that this seems to not be a core goal of Flink as
> a project and adds a maintenance burden.
>
> Would another con of Kerberos be that it is likely a fading project in terms
> of network security? (serious question, please correct me if there is
> reason to believe it is gaining adoption)
>
> The point about Kerberos being independent of infrastructure is a good one
> but is something that is also solved by modern sidecar proxies + service
> meshes that can run across Kubernetes and bare-metal. These solutions also
> handle certificate provisioning, rotation, etc. in addition to higher-level
> authorization policies. Some examples of projects with this "universal
> infrastructure support" are Kuma[1] (CNCF Sandbox, I'm a maintainer) and
> Istio[2] (Google).
>
> Wondering out loud: has anyone tried to run Flink on top of cilium[3],
> which also provides zero-trust networking at the kernel level without
> needing to instrument applications? This currently only runs on Kubernetes
> on Linux, so that's a major limitation, but solves many of the request
> forging concerns at all levels.
>
> Thanks,
> Austin
>
> [1]: https://kuma.io/docs/1.1.6/quickstart/universal/
> [2]: https://istio.io/latest/docs/setup/install/virtual-machine/
> [3]: https://cilium.io/
>
> On Thu, Jun 17, 2021 at 1:50 PM Till Rohrmann 
> wrote:
>
> > I left some comments in the Google document. It would be great if
> > someone from the community with security experience could also take a
> look
> > at it. Maybe Eron you have an opinion on the topic.
> >
> > Cheers,
> > Till
> >
> > On Thu, Jun 17, 2021 at 6:57 PM Till Rohrmann 
> > wrote:
> >
> > > Hi Gabor,
> > >
> > > I haven't found time to look into the updated FLIP yet. I'll try to do
> it
> > > asap.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Jun 16, 2021 at 9:35 PM Konstantin Knauf 
> > > wrote:
> > >
> > >> Hi Gabor,
> > >>
> > >> > However representing Kerberos as completely new feature is not true
> > >> because
> > >> it's already in since Flink makes authentication at least with HDFS
> and
> > >> Hbase through Kerberos.
> > >>
> > >> True, that is one way to look at it, but there are differences, too:
> > >> Control Plane vs Data Plane, Core vs Connectors.
> > >>
> > >> > Adding OIDC or OAuth2 has the exact same concerns what you've guys
> > just
> > >> raised. Why exactly these? If you think this would be beneficial we
> can
> > >> discuss it in detail
> > >>
> > >> That's exactly my point. Once we start adding authx support, we will
> > >> sooner or later discuss other options besides Kerberos, too. A user
> who
> > >> would like to use OAuth can not easily use Kerberos, right?
> > >> That is one of the reasons I am skeptical about adding initial authx
> > >> support.
> > >>
> > >> > Related authorization you've mentioned it can be complicated over
> > time.
> > >> Can
> > >> you show us an example? We have knowledge of a couple of open source
> > >> components
> > >> but authorization was never a horror complex story. I personally have
> > the
> > >> most experience with Spark which I think is quite simple and stable.
> > Users
> > >> can be viewers/admins
> > >> and jobs started by others can't be modified. If you can share an
> > example
> > >> over-complication we can discuss on facts.
> > >>
> > >> Authorization is a new aspect that needs to be considered for every
> > >> addition to the REST API. In the future users might ask for additional
> > >> roles (e.g. an editor), user-defined roles and you've already
> mentioned
> > >> job-level permissions yourself. And keep in mind that there might also
> > be
> > >> larger additions in the future like the flink-sql-gateway.
> Contributions
> > >> like this become more expensive the more aspects we need to consider.
> > >>
> > >> In general, I believe, it is important that the community focuses its
> > >> efforts where we can generate the most value to the user and -
> > personally -
> > >> I don't think there is much to gain by extending Flink's scope in that
> > >> direction. Of course, this is not black and white and there are other
> > valid
> > >> opinions.
> > >>
> > >> Thanks,
> > >>
> > >> Konstantin
> > >>
> > >> On Wed, Jun 16, 2021 at 7:38 PM Gabor Somogyi <
> > gabor.g.somo...@gmail.com>
> > >> wrote:
> > >>
> > >>> Hi Konstantin,
> > >>>
> > >>> Thanks for the response. Related new feature introduction in case of
> > >>> Basic
> > >>> auth I tend to agree, anything else can be chosen.
> > >>>
> > >>> However representing Kerberos as completely new

Re: [VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread Timo Walther

+1 (binding)

Thanks for driving this.

Regards,
Timo

On 21.06.21 13:24, Ingo Bürk wrote:

Hi everyone,

thanks for all the feedback so far. Based on the discussion[1] we seem to
have consensus, so I would like to start a vote on FLIP-129 for which the
FLIP has now also been updated[2].

The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT) unless
there is an objection or insufficient votes.


Thanks
Ingo

[1]
https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API





Re: [VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread Jark Wu
+1 (binding)

Best,
Jark

On Mon, 21 Jun 2021 at 22:51, Timo Walther  wrote:

> +1 (binding)
>
> Thanks for driving this.
>
> Regards,
> Timo
>
> On 21.06.21 13:24, Ingo Bürk wrote:
> > Hi everyone,
> >
> > thanks for all the feedback so far. Based on the discussion[1] we seem to
> > have consensus, so I would like to start a vote on FLIP-129 for which the
> > FLIP has now also been updated[2].
> >
> > The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT) unless
> > there is an objection or insufficient votes.
> >
> >
> > Thanks
> > Ingo
> >
> > [1]
> >
> https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
> > [2]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
> >
>
>


Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-21 Thread Márton Balassi
Hi team,

Thank you for your input. Based on this discussion I agree with G that
selecting and standardizing on a specific strong authentication mechanism
is more challenging than the whole rest of the scope of this authentication
story. :-) I suggest that G and I go back to the drawing board and come up
with an API that can support multiple authentication mechanisms, and we
would only merge said API to Flink. Specific implementations of it can be
maintained outside of the project. This way we tackle the main challenge in
a truly minimal way.

Best,
Marton

On Mon, Jun 21, 2021 at 4:18 PM Gabor Somogyi 
wrote:

> Hi All,
>
> We see that adding any kind of specific authentication raises more
> questions than answers.
> What if a generic API were added without any real
> authentication logic?
> That way every provider can add its own protocol implementation as
> additional jar.
>
> BR,
> G
>
>
> On Thu, Jun 17, 2021 at 7:53 PM Austin Cawley-Edwards <
> austin.caw...@gmail.com> wrote:
>
>> Hi all,
>>
>> Sorry to be joining the conversation late. I'm also on the side of
>> Konstantin, generally, in that this seems to not be a core goal of Flink
>> as
>> a project and adds a maintenance burden.
>>
>> Would another con of Kerberos be that it is likely a fading project in terms
>> of network security? (serious question, please correct me if there is
>> reason to believe it is gaining adoption)
>>
>> The point about Kerberos being independent of infrastructure is a good one
>> but is something that is also solved by modern sidecar proxies + service
>> meshes that can run across Kubernetes and bare-metal. These solutions also
>> handle certificate provisioning, rotation, etc. in addition to
>> higher-level
>> authorization policies. Some examples of projects with this "universal
>> infrastructure support" are Kuma[1] (CNCF Sandbox, I'm a maintainer) and
>> Istio[2] (Google).
>>
>> Wondering out loud: has anyone tried to run Flink on top of cilium[3],
>> which also provides zero-trust networking at the kernel level without
>> needing to instrument applications? This currently only runs on Kubernetes
>> on Linux, so that's a major limitation, but solves many of the request
>> forging concerns at all levels.
>>
>> Thanks,
>> Austin
>>
>> [1]: https://kuma.io/docs/1.1.6/quickstart/universal/
>> [2]: https://istio.io/latest/docs/setup/install/virtual-machine/
>> [3]: https://cilium.io/
>>
>> On Thu, Jun 17, 2021 at 1:50 PM Till Rohrmann 
>> wrote:
>>
>> > I left some comments in the Google document. It would be great if
>> > someone from the community with security experience could also take a
>> look
>> > at it. Maybe Eron you have an opinion on the topic.
>> >
>> > Cheers,
>> > Till
>> >
>> > On Thu, Jun 17, 2021 at 6:57 PM Till Rohrmann 
>> > wrote:
>> >
>> > > Hi Gabor,
>> > >
>> > > I haven't found time to look into the updated FLIP yet. I'll try to
>> do it
>> > > asap.
>> > >
>> > > Cheers,
>> > > Till
>> > >
>> > > On Wed, Jun 16, 2021 at 9:35 PM Konstantin Knauf 
>> > > wrote:
>> > >
>> > >> Hi Gabor,
>> > >>
>> > >> > However representing Kerberos as completely new feature is not true
>> > >> because
>> > >> it's already in since Flink makes authentication at least with HDFS
>> and
>> > >> Hbase through Kerberos.
>> > >>
>> > >> True, that is one way to look at it, but there are differences, too:
>> > >> Control Plane vs Data Plane, Core vs Connectors.
>> > >>
>> > >> > Adding OIDC or OAuth2 has the exact same concerns what you've guys
>> > just
>> > >> raised. Why exactly these? If you think this would be beneficial we
>> can
>> > >> discuss it in detail
>> > >>
>> > >> That's exactly my point. Once we start adding authx support, we will
>> > >> sooner or later discuss other options besides Kerberos, too. A user
>> who
>> > >> would like to use OAuth can not easily use Kerberos, right?
>> > >> That is one of the reasons I am skeptical about adding initial authx
>> > >> support.
>> > >>
>> > >> > Related authorization you've mentioned it can be complicated over
>> > time.
>> > >> Can
>> > >> you show us an example? We've knowledge with couple of open source
>> > >> components
>> > >> but authorization was never a horror complex story. I personally have
>> > the
>> > >> most experience with Spark which I think is quite simple and stable.
>> > Users
>> > >> can be viewers/admins
>> > >> and jobs started by others can't be modified. If you can share an
>> > example
>> > >> over-complication we can discuss on facts.
>> > >>
>> > >> Authorization is a new aspect that needs to be considered for every
>> > >> addition to the REST API. In the future users might ask for
>> additional
>> > >> roles (e.g. an editor), user-defined roles and you've already
>> mentioned
>> > >> job-level permissions yourself. And keep in mind that there might
>> also
>> > be
>> > >> larger additions in the future like the flink-sql-gateway.
>> Contributions
>> > >> like this become more expensive the more aspects we need to consid

PR: "Propagate watermarks to sink API"

2021-06-21 Thread Eron Wright
Would someone be willing and able to review the PR which adds watermarks to
the sink API?

https://github.com/apache/flink/pull/15950

Thanks!
Eron


[jira] [Created] (FLINK-23073) Fix space handling in CSV timestamp parser

2021-06-21 Thread Seth Wiesman (Jira)
Seth Wiesman created FLINK-23073:


 Summary: Fix space handling in CSV timestamp parser
 Key: FLINK-23073
 URL: https://issues.apache.org/jira/browse/FLINK-23073
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.14.0, 1.13.2
Reporter: Seth Wiesman
Assignee: Seth Wiesman


FLINK-21947 added support for TIMESTAMP_LTZ in the CSV format by replacing 
java.sql.Timestamp.valueOf with java.time.LocalDateTime.parse. 
Timestamp.valueOf internally calls `trim()` on the string before parsing, while 
LocalDateTime.parse does not. This is a breaking change: the CSV format can no 
longer parse timestamps in CSV files that have spaces after the delimiter. We 
should manually re-add the call to `trim()` to restore the previous behavior.
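A minimal sketch of the difference, assuming ISO-formatted input (the helper
name is hypothetical; the real fix belongs in the CSV format's converter):

```java
import java.time.LocalDateTime;

public class TrimExample {
    // Hypothetical helper mirroring the proposed fix: trim the raw CSV
    // field before parsing, restoring the lenient behavior that
    // java.sql.Timestamp.valueOf provided via its internal trim().
    static LocalDateTime parseCsvTimestamp(String raw) {
        return LocalDateTime.parse(raw.trim());
    }

    public static void main(String[] args) {
        // A field with a leading space after the CSV delimiter: plain
        // LocalDateTime.parse(" 2021-06-21T12:00:00") would throw a
        // DateTimeParseException, while the trimmed variant succeeds.
        System.out.println(parseCsvTimestamp(" 2021-06-21T12:00:00"));
    }
}
```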



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS]FLIP-150: Introduce Hybrid Source

2021-06-21 Thread Thomas Weise
Here is a summary of where we are at with the PR:

* Added capability to construct sources at switch time through a
factory interface. This can support all previously discussed
scenarios. The simple case (sources with fixed start position) is
still simple, but for scenarios that require deferred instantiation,
sources can now be created through their respective builders at switch
time with access to the previous enumerator. This is a modification of
option 3 described previously.

* There is now unit test coverage for reader and enumerator.

* Ideas such as a universal interface for exchange of start positions
can be added on top of the current implementation. However, I would
like to keep that as exercise for the future and the scope of this
initial work contained.

* FLIP page will be updated to reflect the changes made since it was
originally created. Nicholas volunteered to take this up and also send
a VOTE thread.
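As a rough illustration of the factory idea in the first bullet (all names
here are hypothetical and strings stand in for real Source objects; the PR's
actual API may differ):

```java
import java.io.Serializable;

public class HybridSketch {

    // A factory invoked at switch time with the previous enumerator's
    // end state, so the next source can derive its start position from it.
    interface SourceFactory<EnumT, SourceT> extends Serializable {
        SourceT create(EnumT previousEnumerator);
    }

    static String buildNextSource(long previousEndOffset) {
        // Deferred instantiation: the next source's start position is only
        // known once the previous source's enumerator has finished.
        SourceFactory<Long, String> deferred =
                prev -> "kafka-source(start=" + prev + ")";
        return deferred.create(previousEndOffset);
    }

    public static void main(String[] args) {
        System.out.println(buildNextSource(42L)); // prints kafka-source(start=42)
    }
}
```

The simple fixed-start case is a factory that ignores its argument, so both
scenarios fit the same interface.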

Thanks all and especially Arvid for taking the time to review and discuss.

Thomas

On Tue, Jun 15, 2021 at 11:01 AM Thomas Weise  wrote:
>
> Hi Arvid,
>
> Thanks for your reply -->
>
> On Mon, Jun 14, 2021 at 2:55 PM Arvid Heise  wrote:
> >
> > Hi Thomas,
> >
> > Thanks for bringing this up. I think this is a tough nut to crack :/.
> > Imho 1 and 3 or 1+3 can work but it is ofc a pity if the source implementor
> > is not aware of HybridSource. I'm also worried that we may not have a
> > universal interface to specify start offset/time.
> > I guess it also would be much easier if we would have an abstract base
> > source class where we could implement some basic support.
> >
> > When I initially looked at the issue I was thinking that sources should
> > always be immutable (we have some bad experiences with mutable life-cycles
> > in operator implementations) and the only modifiable thing should be the
> > builder. That would mean that a HybridSource actually just gets a list of
> > source builders and creates the sources when needed with the correct
> > start/end offset. However, we neither have base builders (something that
> > I'd like to change) nor are any of the builders serializable. We could
> > convert sources back to builders, update the start offset, and convert to
> > sources again but this also seems overly complicated. So I'm assuming that
> > we should go with modifiable sources as also expressed in the FLIP draft.
>
> The need to set a start position at runtime indicates that sources
> should not be immutable. I think it would be better to have a setter
> on the source that clearly describes the mutation.
>
> Regarding deferred construction of the sources (supplier pattern):
> This is actually a very interesting idea that would also help in
> situations where the exact sequence of sources isn't known upfront.
> However, Source is also the factory for split and enumerator
> checkpoint serializers. If we were to instantiate the source at switch
> time, we would also need to distribute the serializers at switch time.
> This would lead to even more complexity and move us further away from
> the original goal of having a relatively simple implementation for the
> basic scenarios.
>
> > If we could assume that we are always switching by time, we could also
> > change Source(Enumerator)#start to take the start time as a parameter. Can
> > we deduce the end time by the record timestamp? But I guess that has all
> > been discussed already, so sorry if I derail the discussion.
>
> This actually hasn't been discussed. The original proposal left the
> type of the start position open, which also makes it less attractive
> (user still has to supply a converter).
>
> For initial internal usage of the hybrid source, we are planning to
> use a timestamp. But there may be use cases where the start position
> could be encoded in other ways, such as based on Kafka offsets.
>
> > I'm also leaning towards extending the Source interface to include these
> > methods (with defaults) to make it harder for implementers to miss.
>
> It would be possible to introduce an optional interface as a follow-up
> task. It can be implemented as the default of option 3.
>
> >
> >
> > On Fri, Jun 11, 2021 at 7:02 PM Thomas Weise  wrote:
> >
> > > Thanks for the suggestions and feedback on the PR.
> > >
> > > A variation of hybrid source that can switch back and forth was
> > > brought up before and it is something that will be eventually
> > > required. It was also suggested by Stephan that in the future there
> > > may be more than one implementation of hybrid source for different
> > > requirements.
> > >
> > > I want to bring back the topic of how enumerator end state can be
> > > converted into start position from the PR [1]. We started in the FLIP
> > > page with "switchable" interfaces, the prototype had checkpoint
> > > conversion and now the PR has a function that allows to augment the
> > > source. Each of these has pros and cons but we will need to converge.
> > >
> > > 1. Switchable interfaces
> > > * u

trying (and failing) to update pyflink-walkthrough for Flink 1.13

2021-06-21 Thread David Anderson
I've been trying to upgrade the pyflink-walkthrough to Flink 1.13.1, but
without any success.

Unless I give it a lot of resources the data generator times out trying to
connect to Kafka. If I give it 6 cores and 11GB (which is about all I can
offer it) it does manage to connect, but then fails trying to write to
kafka.

Not sure what's wrong? Any suggestions?

See [1] to review what I tried.

Best,
David

[1]
https://github.com/alpinegizmo/flink-playgrounds/commit/777274355ba04de6d8c8f1308b24be99ec86a0d6

21:40 $ docker-compose logs -f generator

Attaching to pyflink-walkthrough_generator_1

generator_1  | Connecting to Kafka brokers

generator_1  | Waiting for brokers to become available

generator_1  | Waiting for brokers to become available

generator_1  | Connected to Kafka

generator_1  | Traceback (most recent call last):

generator_1  |   File "./generate_source_data.py", line 61, in


generator_1  | write_data(producer)

generator_1  |   File "./generate_source_data.py", line 42, in
write_data

generator_1  | producer.send(topic, value=cur_data)

generator_1  |   File
"/usr/local/lib/python3.7/site-packages/kafka/producer/kafka.py", line 576,
in send

generator_1  | self._wait_on_metadata(topic,
self.config['max_block_ms'] / 1000.0)

generator_1  |   File
"/usr/local/lib/python3.7/site-packages/kafka/producer/kafka.py", line 703,
in _wait_on_metadata

generator_1  | "Failed to update metadata after %.1f secs."
% (max_wait,))

generator_1  | kafka.errors.KafkaTimeoutError:
KafkaTimeoutError: Failed to update metadata after 60.0 secs.
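For what it's worth, one workaround for slow broker warm-up in docker-compose
is to retry the send with backoff instead of failing on the first metadata
timeout. A generic sketch (flaky_send below is a stand-in simulating the
KafkaTimeoutError; a real producer.send would take its place):

```python
import time

def send_with_retry(send_fn, payload, retries=5, backoff_s=1.0):
    """Call send_fn(payload), retrying on exceptions such as KafkaTimeoutError."""
    for attempt in range(retries):
        try:
            return send_fn(payload)
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(backoff_s)

# Demo with a stand-in for producer.send that fails twice, then succeeds.
calls = {"n": 0}
def flaky_send(value):
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("Failed to update metadata")
    return "ok"

print(send_with_retry(flaky_send, {"x": 1}, backoff_s=0.01))  # ok
```

Raising max_block_ms on the KafkaProducer may also help, at the cost of a
longer hang when the brokers are genuinely unreachable.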


Re: PR: "Propagate watermarks to sink API"

2021-06-21 Thread Arvid Heise
Hi Eron,

I will check it tomorrow. Sorry for the delay. If someone beats me to that,
please go ahead.

On Mon, Jun 21, 2021 at 8:18 PM Eron Wright 
wrote:

> Would someone be willing and able to review the PR which adds watermarks to
> the sink API?
>
> https://github.com/apache/flink/pull/15950
>
> Thanks!
> Eron
>


Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-21 Thread Austin Cawley-Edwards
Hi Gabor + Marton,

I don't believe that the issue with this proposal is the specific mechanism
proposed (Kerberos), but rather that Flink is not the right level at which to
implement it. I'm just one voice, so please take this with a grain of salt.

In the other solutions previously noted there is no need to instrument
Flink which, in addition to reducing the maintenance burden, provides a
better, decoupled end result.

IMO we should not add any new API in Flink for this use case. I think it is
unfortunate and sympathize with the work that has already been done on this
feature – perhaps we could brainstorm ways to run this alongside Flink in
your setup. Again, I'm not saying the proposed solution of an agnostic API
would not work, nor that it is a bad idea, but it is not one that will make
Flink more compatible with the modern solutions to this problem.

Best,
Austin

On Mon, Jun 21, 2021 at 2:18 PM Márton Balassi 
wrote:

> Hi team,
>
> Thank you for your input. Based on this discussion I agree with G that
> selecting and standardizing on a specific strong authentication mechanism
> is more challenging than the whole rest of the scope of this authentication
> story. :-) I suggest that G and I go back to the drawing board and come up
> with an API that can support multiple authentication mechanisms, and we
> would only merge said API to Flink. Specific implementations of it can be
> maintained outside of the project. This way we tackle the main challenge in
> a truly minimal way.
>
> Best,
> Marton
>
> On Mon, Jun 21, 2021 at 4:18 PM Gabor Somogyi 
> wrote:
>
> > Hi All,
> >
> > We see that adding any kind of specific authentication raises more
> > questions than answers.
> > What would be if a generic API would be added without any real
> > authentication logic?
> > That way every provider can add its own protocol implementation as
> > additional jar.
> >
> > BR,
> > G
> >
> >
> > On Thu, Jun 17, 2021 at 7:53 PM Austin Cawley-Edwards <
> > austin.caw...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> Sorry to be joining the conversation late. I'm also on the side of
> >> Konstantin, generally, in that this seems to not be a core goal of Flink
> >> as
> >> a project and adds a maintenance burden.
> >>
> >> Would another con of Kerberos be that it is likely a fading project in
> terms
> >> of network security? (serious question, please correct me if there is
> >> reason to believe it is gaining adoption)
> >>
> >> The point about Kerberos being independent of infrastructure is a good
> one
> >> but is something that is also solved by modern sidecar proxies + service
> >> meshes that can run across Kubernetes and bare-metal. These solutions
> also
> >> handle certificate provisioning, rotation, etc. in addition to
> >> higher-level
> >> authorization policies. Some examples of projects with this "universal
> >> infrastructure support" are Kuma[1] (CNCF Sandbox, I'm a maintainer) and
> >> Istio[2] (Google).
> >>
> >> Wondering out loud: has anyone tried to run Flink on top of cilium[3],
> >> which also provides zero-trust networking at the kernel level without
> >> needing to instrument applications? This currently only runs on
> Kubernetes
> >> on Linux, so that's a major limitation, but solves many of the request
> >> forging concerns at all levels.
> >>
> >> Thanks,
> >> Austin
> >>
> >> [1]: https://kuma.io/docs/1.1.6/quickstart/universal/
> >> [2]: https://istio.io/latest/docs/setup/install/virtual-machine/
> >> [3]: https://cilium.io/
> >>
> >> On Thu, Jun 17, 2021 at 1:50 PM Till Rohrmann 
> >> wrote:
> >>
> >> > I left some comments in the Google document. It would be great if
> >> > someone from the community with security experience could also take a
> >> look
> >> > at it. Maybe Eron you have an opinion on the topic.
> >> >
> >> > Cheers,
> >> > Till
> >> >
> >> > On Thu, Jun 17, 2021 at 6:57 PM Till Rohrmann 
> >> > wrote:
> >> >
> >> > > Hi Gabor,
> >> > >
> >> > > I haven't found time to look into the updated FLIP yet. I'll try to
> >> do it
> >> > > asap.
> >> > >
> >> > > Cheers,
> >> > > Till
> >> > >
> >> > > On Wed, Jun 16, 2021 at 9:35 PM Konstantin Knauf  >
> >> > > wrote:
> >> > >
> >> > >> Hi Gabor,
> >> > >>
> >> > >> > However representing Kerberos as completely new feature is not
> true
> >> > >> because
> >> > >> it's already in since Flink makes authentication at least with HDFS
> >> and
> >> > >> Hbase through Kerberos.
> >> > >>
> >> > >> True, that is one way to look at it, but there are differences,
> too:
> >> > >> Control Plane vs Data Plane, Core vs Connectors.
> >> > >>
> >> > >> > Adding OIDC or OAuth2 has the exact same concerns what you've
> guys
> >> > just
> >> > >> raised. Why exactly these? If you think this would be beneficial we
> >> can
> >> > >> discuss it in detail
> >> > >>
> >> > >> That's exactly my point. Once we start adding authx support, we
> will
> >> > >> sooner or later discuss other options besides Kerberos, too. A user
> >> who
> >> > >> would like to use OAuth can not e

Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2

2021-06-21 Thread Seth Wiesman
+1 to the release.

It would be great if we could get FLINK-23073 into 1.13.2. There's already
an open PR and it unblocks upgrading the table API walkthrough in
apache/flink-playgrounds to 1.13.

Seth

On Mon, Jun 21, 2021 at 6:28 AM Yun Tang  wrote:

> Hi Dawid,
>
> Thanks for driving this discussion, I am willing to volunteer as the
> release manager for these versions.
>
>
> Best
> Yun Tang
> 
> From: Konstantin Knauf 
> Sent: Friday, June 18, 2021 22:35
> To: dev 
> Subject: Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2
>
> Hi Dawid,
>
> Thank you for starting the discussion. I'd like to add
> https://issues.apache.org/jira/browse/FLINK-23025 to the list for Flink
> 1.13.2.
>
> Cheers,
>
> Konstantin
>
> On Fri, Jun 18, 2021 at 3:26 PM Dawid Wysakowicz 
> wrote:
>
> > Hi devs,
> >
> > Quite recently we pushed, in our opinion, quite an important fix[1] for
> > unaligned checkpoints which disables UC for broadcast partitioning.
> > Without the fix there might be some broadcast state corruption.
> > Therefore we think it would be beneficial to release it soonish. What do
> > you think? Do you have other issues in mind you'd like to have included
> > in these versions?
> >
> > Would someone be willing to volunteer to help with the releases as a
> > release manager? I guess there are a couple of spots to fill in here ;)
> >
> > Best,
> >
> > Dawid
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-22815
> >
> >
> >
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk
>


Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2

2021-06-21 Thread godfrey he
Thanks for driving this, Dawid. +1 to the release.
I would also like to volunteer as the release manager for these versions.


Best
Godfrey

Seth Wiesman  于2021年6月22日周二 上午8:39写道:

> +1 to the release.
>
> It would be great if we could get FLINK-23073 into 1.13.2. There's already
> an open PR and it unblocks upgrading the table API walkthrough in
> apache/flink-playgrounds to 1.13.
>
> Seth
>
> On Mon, Jun 21, 2021 at 6:28 AM Yun Tang  wrote:
>
> > Hi Dawid,
> >
> > Thanks for driving this discussion, I am willing to volunteer as the
> > release manager for these versions.
> >
> >
> > Best
> > Yun Tang
> > 
> > From: Konstantin Knauf 
> > Sent: Friday, June 18, 2021 22:35
> > To: dev 
> > Subject: Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2
> >
> > Hi Dawid,
> >
> > Thank you for starting the discussion. I'd like to add
> > https://issues.apache.org/jira/browse/FLINK-23025 to the list for Flink
> > 1.13.2.
> >
> > Cheers,
> >
> > Konstantin
> >
> > On Fri, Jun 18, 2021 at 3:26 PM Dawid Wysakowicz  >
> > wrote:
> >
> > > Hi devs,
> > >
> > > Quite recently we pushed, in our opinion, quite an important fix[1] for
> > > unaligned checkpoints which disables UC for broadcast partitioning.
> > > Without the fix there might be some broadcast state corruption.
> > > Therefore we think it would be beneficial to release it soonish. What
> do
> > > you think? Do you have other issues in mind you'd like to have included
> > > in these versions.
> > >
> > > Would someone be willing to volunteer to help with the releases as a
> > > release manager? I guess there is a couple of spots to fill in here ;)
> > >
> > > Best,
> > >
> > > Dawid
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-22815
> > >
> > >
> > >
> >
> > --
> >
> > Konstantin Knauf
> >
> > https://twitter.com/snntrable
> >
> > https://github.com/knaufk
> >
>


Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2

2021-06-21 Thread Jingsong Li
+1 to the release.

Thanks Dawid for driving this discussion.

I am willing to volunteer as the release manager too.

Best,
Jingsong

On Tue, Jun 22, 2021 at 9:58 AM godfrey he  wrote:

> Thanks for driving this, Dawid. +1 to the release.
> I would also like to volunteer as the release manager for these versions.
>
>
> Best
> Godfrey
>
> Seth Wiesman  于2021年6月22日周二 上午8:39写道:
>
> > +1 to the release.
> >
> > It would be great if we could get FLINK-23073 into 1.13.2. There's
> already
> > an open PR and it unblocks upgrading the table API walkthrough in
> > apache/flink-playgrounds to 1.13.
> >
> > Seth
> >
> > On Mon, Jun 21, 2021 at 6:28 AM Yun Tang  wrote:
> >
> > > Hi Dawid,
> > >
> > > Thanks for driving this discussion, I am willing to volunteer as the
> > > release manager for these versions.
> > >
> > >
> > > Best
> > > Yun Tang
> > > 
> > > From: Konstantin Knauf 
> > > Sent: Friday, June 18, 2021 22:35
> > > To: dev 
> > > Subject: Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2
> > >
> > > Hi Dawid,
> > >
> > > Thank you for starting the discussion. I'd like to add
> > > https://issues.apache.org/jira/browse/FLINK-23025 to the list for
> Flink
> > > 1.13.2.
> > >
> > > Cheers,
> > >
> > > Konstantin
> > >
> > > On Fri, Jun 18, 2021 at 3:26 PM Dawid Wysakowicz <
> dwysakow...@apache.org
> > >
> > > wrote:
> > >
> > > > Hi devs,
> > > >
> > > > Quite recently we pushed, in our opinion, quite an important fix[1]
> for
> > > > unaligned checkpoints which disables UC for broadcast partitioning.
> > > > Without the fix there might be some broadcast state corruption.
> > > > Therefore we think it would be beneficial to release it soonish. What
> > do
> > > > you think? Do you have other issues in mind you'd like to have
> included
> > > > in these versions.
> > > >
> > > > Would someone be willing to volunteer to help with the releases as a
> > > > release manager? I guess there is a couple of spots to fill in here
> ;)
> > > >
> > > > Best,
> > > >
> > > > Dawid
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-22815
> > > >
> > > >
> > > >
> > >
> > > --
> > >
> > > Konstantin Knauf
> > >
> > > https://twitter.com/snntrable
> > >
> > > https://github.com/knaufk
> > >
> >
>


-- 
Best, Jingsong Lee


Re: [VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread Jingsong Li
+1 (binding)

Thanks for driving.

Best,
Jingsong

On Tue, Jun 22, 2021 at 12:07 AM Jark Wu  wrote:

> +1 (binding)
>
> Best,
> Jark
>
> On Mon, 21 Jun 2021 at 22:51, Timo Walther  wrote:
>
> > +1 (binding)
> >
> > Thanks for driving this.
> >
> > Regards,
> > Timo
> >
> > On 21.06.21 13:24, Ingo Bürk wrote:
> > > Hi everyone,
> > >
> > > thanks for all the feedback so far. Based on the discussion[1] we seem
> to
> > > have consensus, so I would like to start a vote on FLIP-129 for which
> the
> > > FLIP has now also been updated[2].
> > >
> > > The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT)
> unless
> > > there is an objection or insufficient votes.
> > >
> > >
> > > Thanks
> > > Ingo
> > >
> > > [1]
> > >
> >
> https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
> > > [2]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
> > >
> >
> >
>


-- 
Best, Jingsong Lee


Re: [VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread JING ZHANG
+1 (binding)

Best regards,
JING ZHANG

Jingsong Li  于2021年6月22日周二 上午10:11写道:

> +1 (binding)
>
> Thanks for driving.
>
> Best,
> Jingsong
>
> On Tue, Jun 22, 2021 at 12:07 AM Jark Wu  wrote:
>
> > +1 (binding)
> >
> > Best,
> > Jark
> >
> > On Mon, 21 Jun 2021 at 22:51, Timo Walther  wrote:
> >
> > > +1 (binding)
> > >
> > > Thanks for driving this.
> > >
> > > Regards,
> > > Timo
> > >
> > > On 21.06.21 13:24, Ingo Bürk wrote:
> > > > Hi everyone,
> > > >
> > > > thanks for all the feedback so far. Based on the discussion[1] we
> seem
> > to
> > > > have consensus, so I would like to start a vote on FLIP-129 for which
> > the
> > > > FLIP has now also been updated[2].
> > > >
> > > > The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT)
> > unless
> > > > there is an objection or insufficient votes.
> > > >
> > > >
> > > > Thanks
> > > > Ingo
> > > >
> > > > [1]
> > > >
> > >
> >
> https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
> > > > [2]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
> > > >
> > >
> > >
> >
>
>
> --
> Best, Jingsong Lee
>


Re: [VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread Leonard Xu
+1

Thanks Ingo for picking up this FLIP.

Best,
Leonard


> 在 2021年6月22日,10:16,JING ZHANG  写道:
> 
> +1 (binding)
> 
> Best regards,
> JING ZHANG
> 
> Jingsong Li  于2021年6月22日周二 上午10:11写道:
> 
>> +1 (binding)
>> 
>> Thanks for driving.
>> 
>> Best,
>> Jingsong
>> 
>> On Tue, Jun 22, 2021 at 12:07 AM Jark Wu  wrote:
>> 
>>> +1 (binding)
>>> 
>>> Best,
>>> Jark
>>> 
>>> On Mon, 21 Jun 2021 at 22:51, Timo Walther  wrote:
>>> 
 +1 (binding)
 
 Thanks for driving this.
 
 Regards,
 Timo
 
 On 21.06.21 13:24, Ingo Bürk wrote:
> Hi everyone,
> 
> thanks for all the feedback so far. Based on the discussion[1] we
>> seem
>>> to
> have consensus, so I would like to start a vote on FLIP-129 for which
>>> the
> FLIP has now also been updated[2].
> 
> The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT)
>>> unless
> there is an objection or insufficient votes.
> 
> 
> Thanks
> Ingo
> 
> [1]
> 
 
>>> 
>> https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
> [2]
> 
 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
> 
 
 
>>> 
>> 
>> 
>> --
>> Best, Jingsong Lee
>> 



Re: trying (and failing) to update pyflink-walkthrough for Flink 1.13

2021-06-21 Thread Dian Fu
Thanks David for taking care of this. I will take a look at this issue.

Regards,
Dian

> 2021年6月22日 上午4:06,David Anderson  写道:
> 
> I've been trying to upgrade the pyflink-walkthrough to Flink 1.13.1, but
> without any success.
> 
> Unless I give it a lot of resources the data generator times out trying to
> connect to Kafka. If I give it 6 cores and 11GB (which is about all I can
> offer it) it does manage to connect, but then fails trying to write to
> kafka.
> 
> Not sure what's wrong? Any suggestions?
> 
> See [1] to review what I tried.
> 
> Best,
> David
> 
> [1]
> https://github.com/alpinegizmo/flink-playgrounds/commit/777274355ba04de6d8c8f1308b24be99ec86a0d6
> 
> 21:40 $ docker-compose logs -f generator
> 
> Attaching to pyflink-walkthrough_generator_1
> 
> generator_1  | Connecting to Kafka brokers
> 
> generator_1  | Waiting for brokers to become available
> 
> generator_1  | Waiting for brokers to become available
> 
> generator_1  | Connected to Kafka
> 
> generator_1  | Traceback (most recent call last):
> 
> generator_1  |   File "./generate_source_data.py", line 61, in
> 
> 
> generator_1  | write_data(producer)
> 
> generator_1  |   File "./generate_source_data.py", line 42, in
> write_data
> 
> generator_1  | producer.send(topic, value=cur_data)
> 
> generator_1  |   File
> "/usr/local/lib/python3.7/site-packages/kafka/producer/kafka.py", line 576,
> in send
> 
> generator_1  | self._wait_on_metadata(topic,
> self.config['max_block_ms'] / 1000.0)
> 
> generator_1  |   File
> "/usr/local/lib/python3.7/site-packages/kafka/producer/kafka.py", line 703,
> in _wait_on_metadata
> 
> generator_1  | "Failed to update metadata after %.1f secs."
> % (max_wait,))
> 
> generator_1  | kafka.errors.KafkaTimeoutError:
> KafkaTimeoutError: Failed to update metadata after 60.0 secs.



[jira] [Created] (FLINK-23074) There is a class conflict between flink-connector-hive and flink-parquet

2021-06-21 Thread Ada Wong (Jira)
Ada Wong created FLINK-23074:


 Summary: There is a class conflict between flink-connector-hive 
and flink-parquet
 Key: FLINK-23074
 URL: https://issues.apache.org/jira/browse/FLINK-23074
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Ada Wong


flink-connector-hive 2.3.6 includes parquet-hadoop 1.8.1, but flink-parquet 
includes 1.11.1. The class org.apache.parquet.hadoop.example.GroupWriteSupport 
differs between the two versions.
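A possible workaround is to exclude the older transitive parquet-hadoop so
the 1.11.1 version from flink-parquet wins. Sketched as a Maven exclusion
(the coordinates and versions below are assumptions based on the description
above; verify against the output of mvn dependency:tree before applying):

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-hive_2.11</artifactId>
    <version>1.13.1</version>
    <exclusions>
        <!-- Drop the parquet-hadoop 1.8.1 pulled in via Hive 2.3.6 -->
        <exclusion>
            <groupId>org.apache.parquet</groupId>
            <artifactId>parquet-hadoop</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```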





[jira] [Created] (FLINK-23075) Python API for enabling ChangelogStateBackend

2021-06-21 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-23075:
---

 Summary: Python API for enabling ChangelogStateBackend
 Key: FLINK-23075
 URL: https://issues.apache.org/jira/browse/FLINK-23075
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Affects Versions: 1.14.0
Reporter: Zakelly Lan


After FLINK-22678, two APIs, `enableChangelogStateBackend` and 
`isChangelogStateBackendEnabled`, have been added. The corresponding 
interfaces should be added to the Python API.





Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2

2021-06-21 Thread Xintong Song
Thanks Dawid for starting the discussion, and thanks Yun, Godfrey and
Jingsong for volunteering as release managers.


+1 for the releases, and +1 for the release managers.


Thank you~

Xintong Song



On Tue, Jun 22, 2021 at 10:15 AM Jingsong Li  wrote:

> +1 to the release.
>
> Thanks Dawid for driving this discussion.
>
> I am willing to volunteer as the release manager too.
>
> Best,
> Jingsong
>
> On Tue, Jun 22, 2021 at 9:58 AM godfrey he  wrote:
>
> > Thanks for driving this, Dawid. +1 to the release.
> > I would also like to volunteer as the release manager for these versions.
> >
> >
> > Best
> > Godfrey
> >
> > Seth Wiesman  于2021年6月22日周二 上午8:39写道:
> >
> > > +1 to the release.
> > >
> > > It would be great if we could get FLINK-23073 into 1.13.2. There's
> > already
> > > an open PR and it unblocks upgrading the table API walkthrough in
> > > apache/flink-playgrounds to 1.13.
> > >
> > > Seth
> > >
> > > On Mon, Jun 21, 2021 at 6:28 AM Yun Tang  wrote:
> > >
> > > > Hi Dawid,
> > > >
> > > > Thanks for driving this discussion, I am willing to volunteer as the
> > > > release manager for these versions.
> > > >
> > > >
> > > > Best
> > > > Yun Tang
> > > > 
> > > > From: Konstantin Knauf 
> > > > Sent: Friday, June 18, 2021 22:35
> > > > To: dev 
> > > > Subject: Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2
> > > >
> > > > Hi Dawid,
> > > >
> > > > Thank you for starting the discussion. I'd like to add
> > > > https://issues.apache.org/jira/browse/FLINK-23025 to the list for
> > Flink
> > > > 1.13.2.
> > > >
> > > > Cheers,
> > > >
> > > > Konstantin
> > > >
> > > > On Fri, Jun 18, 2021 at 3:26 PM Dawid Wysakowicz <
> > dwysakow...@apache.org
> > > >
> > > > wrote:
> > > >
> > > > > Hi devs,
> > > > >
> > > > > Quite recently we pushed, in our opinion, quite an important fix[1]
> > for
> > > > > unaligned checkpoints which disables UC for broadcast partitioning.
> > > > > Without the fix there might be some broadcast state corruption.
> > > > > Therefore we think it would be beneficial to release it soonish.
> What
> > > do
> > > > > you think? Do you have other issues in mind you'd like to have
> > included
> > > > > in these versions.
> > > > >
> > > > > Would someone be willing to volunteer to help with the releases as
> a
> > > > > release manager? I guess there is a couple of spots to fill in here
> > ;)
> > > > >
> > > > > Best,
> > > > >
> > > > > Dawid
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-22815
> > > > >
> > > > >
> > > > >
> > > >
> > > > --
> > > >
> > > > Konstantin Knauf
> > > >
> > > > https://twitter.com/snntrable
> > > >
> > > > https://github.com/knaufk
> > > >
> > >
> >
>
>
> --
> Best, Jingsong Lee
>


[jira] [Created] (FLINK-23076) DispatcherTest.testWaitingForJobMasterLeadership fails on azure

2021-06-21 Thread Xintong Song (Jira)
Xintong Song created FLINK-23076:


 Summary: DispatcherTest.testWaitingForJobMasterLeadership fails on 
azure
 Key: FLINK-23076
 URL: https://issues.apache.org/jira/browse/FLINK-23076
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.12.4
Reporter: Xintong Song
 Fix For: 1.12.5


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19265&view=logs&j=d89de3df-4600-5585-dadc-9bbc9a5e661c&t=19336553-69ec-5b03-471a-791a483cced6&l=6511

{code}
[ERROR] Failures: 
[ERROR]   DispatcherTest.testWaitingForJobMasterLeadership:672 
Expected: is 
 but: was 
{code}





[jira] [Created] (FLINK-23077) Running nexmark q5 with 1.13.1 of pipeline.object-reuse=true, the taskmanager will be killed and produce failover.

2021-06-21 Thread xiaojin.wy (Jira)
xiaojin.wy created FLINK-23077:
--

 Summary: Running nexmark q5 with 1.13.1 of 
pipeline.object-reuse=true, the taskmanager will be killed and produce failover.
 Key: FLINK-23077
 URL: https://issues.apache.org/jira/browse/FLINK-23077
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Affects Versions: 1.13.1, 1.13.0
Reporter: xiaojin.wy


Running Nexmark with Flink 1.13.0 and 1.13.1, q5 can't succeed.
*The conf is:*
pipeline.object-reuse=true
*The sql is:*
CREATE TABLE discard_sink (
  auction  BIGINT,
  num  BIGINT
) WITH (
  'connector' = 'blackhole'
);

INSERT INTO discard_sink
SELECT AuctionBids.auction, AuctionBids.num
FROM (
  SELECT
    B1.auction,
    count(*) AS num,
    HOP_START(B1.dateTime, INTERVAL '2' SECOND, INTERVAL '10' SECOND) AS starttime,
    HOP_END(B1.dateTime, INTERVAL '2' SECOND, INTERVAL '10' SECOND) AS endtime
  FROM bid B1
  GROUP BY
    B1.auction,
    HOP(B1.dateTime, INTERVAL '2' SECOND, INTERVAL '10' SECOND)
) AS AuctionBids
JOIN (
  SELECT
    max(CountBids.num) AS maxn,
    CountBids.starttime,
    CountBids.endtime
  FROM (
    SELECT
      count(*) AS num,
      HOP_START(B2.dateTime, INTERVAL '2' SECOND, INTERVAL '10' SECOND) AS starttime,
      HOP_END(B2.dateTime, INTERVAL '2' SECOND, INTERVAL '10' SECOND) AS endtime
    FROM bid B2
    GROUP BY
      B2.auction,
      HOP(B2.dateTime, INTERVAL '2' SECOND, INTERVAL '10' SECOND)
  ) AS CountBids
  GROUP BY CountBids.starttime, CountBids.endtime
) AS MaxBids
ON AuctionBids.starttime = MaxBids.starttime AND
   AuctionBids.endtime = MaxBids.endtime AND
   AuctionBids.num >= MaxBids.maxn;


*The error is:*
 !image-2021-06-22-11-30-58-022.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-21 Thread Xintong Song
I second Zhu and Till's opinion.

Failing with an exception that also includes how to resolve the problem
sounds better, in terms of making it explicit to users that pipelined edges
are replaced with blocking edges.

Concerning the absence of knobs for tuning the edge types, we can introduce a
configuration option. Since the edge types are currently fixed based on the
job execution mode and are not exposed to users, I'd suggest introducing a
configuration option that only affects fine-grained resource management use
cases. To be specific, we can have something like
'fine-grained.xxx.all-blocking'. The default value should be false, and we
can suggest that users set it to true in the error message. When set to true,
it should take effect only when fine-grained resource requirements are
detected. Thus, it should not affect the default execution-mode-based edge
type strategy for non-fine-grained use cases.
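For illustration only, here is a sketch of how such a switch might look in
flink-conf.yaml. The key below is the placeholder from this proposal, not a
released Flink option; 'xxx' is intentionally unresolved and the final name
would be decided as part of the FLIP:

```
# Hypothetical option sketched in this thread; not a released Flink key.
# Proposed semantics: only takes effect when fine-grained resource
# requirements are detected on the job; false (default) keeps the
# execution-mode-based edge types and fails such batch jobs with an error.
fine-grained.xxx.all-blocking: true
```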


Thank you~

Xintong Song



On Mon, Jun 21, 2021 at 8:59 PM Yangze Guo  wrote:

> Thanks for the feedback, Till!
>
> Actually, we cannot give users a resolution for this issue, as there
> is no API for DataStream users to influence the edge types at the
> moment. The edge types are currently fixed based on the jobs' mode
> (batch or streaming).
> a) I think it might not confuse the user a lot as the behavior has
> never been documented or guaranteed to be unchanged.
> b) Thanks for your illustration. I agree that added complexity can make
> other feature development harder in the future. However, I think this
> might not introduce much complexity. In this case, we construct an
> all-edges-blocking job graph, which already exists since 1.11 and
> should have been considered by the following features. I admit we
> cannot assume the all-edges-blocking job graph will exist forever in
> Flink, but AFAIK there is no foreseeable feature that intends to
> deprecate it.
>
> WDYT?
>
>
>
> Best,
> Yangze Guo
>
> On Mon, Jun 21, 2021 at 6:10 PM Till Rohrmann 
> wrote:
> >
> > I would be more in favor of what Zhu Zhu proposed to throw an exception
> > with a meaningful and understandable explanation that also includes how
> to
> > resolve this problem. I do understand the reasoning behind automatically
> > switching the edge types in order to make things easier to use but a)
> this
> > can also be confusing if the user does not expect this to happen and b)
> it
> > can add some complexity which makes other feature development harder in
> the
> > future because users might rely on it. An example of such a case I
> stumbled
> > upon rather recently is that we adjust the maximum parallelism wrt the
> > given savepoint if it has not been explicitly configured. On paper,
> this
> > sounds like a good usability improvement, however, for the
> > AdaptiveScheduler it posed a quite annoying complexity. If instead, we
> said
> > that we fail the job submission if the max parallelism does not equal the
> > max parallelism of the savepoint, it would have been a lot easier.
> >
> > Cheers,
> > Till
> >
> > On Mon, Jun 21, 2021 at 9:36 AM Yangze Guo  wrote:
> >
> > > Thanks, I append it to the known limitations of this FLIP.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Mon, Jun 21, 2021 at 3:20 PM Zhu Zhu  wrote:
> > > >
> > > > Thanks for the quick response Yangze.
> > > > The proposal sounds good to me.
> > > >
> > > > Thanks,
> > > > Zhu
> > > >
> > > > Yangze Guo  于2021年6月21日周一 下午3:01写道:
> > > >>
> > > >> Thanks for the comments, Zhu!
> > > >>
> > > >> Yes, it is a known limitation for fine-grained resource management.
> We
> > > >> also have filed this issue in FLINK-20865 when we proposed FLIP-156.
> > > >>
> > > >> As a first step, I agree that we can mark batch jobs with PIPELINED
> > > >> edges as an invalid case for this feature. However, just throwing an
> > > >> exception, in that case, might confuse users who do not understand
> the
> > > >> concept of pipeline region. Maybe we can force all the edges in this
> > > >> scenario to BLOCKING in compiling stage and well document it. So
> that,
> > > >> common users will not be interrupted while the expert users can
> > > >> understand the cost of that usage and make their decision. WDYT?
> > > >>
> > > >> Best,
> > > >> Yangze Guo
> > > >>
> > > >> On Mon, Jun 21, 2021 at 2:24 PM Zhu Zhu  wrote:
> > > >> >
> > > >> > Thanks for proposing this @Yangze Guo and sorry for joining the
> > > discussion so late.
> > > >> > The proposal generally looks good to me. But I find one problem
> that
> > > batch job with PIPELINED edges might hang if enabling fine-grained
> > > resources. see "Resource Deadlocks could still happen in certain Cases"
> > > section in
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling
> > > >> > However, this problem may happen only in batch cases with
> PIPELINED
> > > edges, because
> > > >> > 1. streaming jobs would always require all resource requirements
> to
> > > be fulfilled at the same time.
> > > >> > 2. batch jobs without PIPELINED edges consist of multiple single
> > > >> > vertex regions and thus each slot can be individually used and
> > > >> > returned

[jira] [Created] (FLINK-23078) Scheduler Benchmarks not compiling

2021-06-21 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-23078:
--

 Summary: Scheduler Benchmarks not compiling
 Key: FLINK-23078
 URL: https://issues.apache.org/jira/browse/FLINK-23078
 Project: Flink
  Issue Type: Bug
  Components: Benchmarks, Runtime / Coordination
Affects Versions: 1.14.0
Reporter: Piotr Nowojski
 Fix For: 1.14.0


{code:java}
07:46:50  [ERROR] 
/home/jenkins/workspace/flink-master-benchmarks/flink-benchmarks/src/main/java/org/apache/flink/scheduler/benchmark/SchedulerBenchmarkBase.java:21:44:
  error: cannot find symbol
{code}

CC [~chesnay] [~Thesharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-21 Thread Yangze Guo
Thanks for the comment, Xintong.

I used to wonder whether it was reasonable or worthwhile to introduce a
configuration like "table.exec.shuffle-mode" for the DataStream API.
Narrowing down the scope of effect sounds good to me.
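For context, a sketch of how the Table/SQL knob mentioned above is set. This
assumes Flink 1.13, where 'ALL_EDGES_BLOCKING' is one of the accepted values
for batch jobs; check the configuration docs of your version for the exact
value set and syntax:

```sql
-- In the SQL client (or via TableConfig#getConfiguration in code):
SET table.exec.shuffle-mode=ALL_EDGES_BLOCKING;
```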

Best,
Yangze Guo

On Tue, Jun 22, 2021 at 2:08 PM Xintong Song  wrote:
>
> I second Zhu and Till's opinion.
>
> Failing with an exception that also includes how to resolve the problem
> sounds better, in terms of making it explicit to users that pipelined edges
> are replaced with blocking edges.
>
> Concerning the absence of knobs for tuning the edge types, we can introduce a
> configuration option. Since the edge types are currently fixed based on the
> job execution mode and are not exposed to users, I'd suggest introducing a
> configuration option that only affects fine-grained resource management use
> cases. To be specific, we can have something like
> 'fine-grained.xxx.all-blocking'. The default value should be false, and we
> can suggest that users set it to true in the error message. When set to true,
> it should take effect only when fine-grained resource requirements are
> detected. Thus, it should not affect the default execution-mode-based edge
> type strategy for non-fine-grained use cases.
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Mon, Jun 21, 2021 at 8:59 PM Yangze Guo  wrote:
>
> > Thanks for the feedback, Till!
> >
> > Actually, we cannot give users a resolution for this issue, as there
> > is no API for DataStream users to influence the edge types at the
> > moment. The edge types are currently fixed based on the jobs' mode
> > (batch or streaming).
> > a) I think it might not confuse the user a lot as the behavior has
> > never been documented or guaranteed to be unchanged.
> > b) Thanks for your illustration. I agree that added complexity can make
> > other feature development harder in the future. However, I think this
> > might not introduce much complexity. In this case, we construct an
> > all-edges-blocking job graph, which already exists since 1.11 and
> > should have been considered by the following features. I admit we
> > cannot assume the all-edges-blocking job graph will exist forever in
> > Flink, but AFAIK there is no foreseeable feature that intends to
> > deprecate it.
> >
> > WDYT?
> >
> >
> >
> > Best,
> > Yangze Guo
> >
> > On Mon, Jun 21, 2021 at 6:10 PM Till Rohrmann 
> > wrote:
> > >
> > > I would be more in favor of what Zhu Zhu proposed to throw an exception
> > > with a meaningful and understandable explanation that also includes how
> > to
> > > resolve this problem. I do understand the reasoning behind automatically
> > > switching the edge types in order to make things easier to use but a)
> > this
> > > can also be confusing if the user does not expect this to happen and b)
> > it
> > > can add some complexity which makes other feature development harder in
> > the
> > > future because users might rely on it. An example of such a case I
> > stumbled
> > > upon rather recently is that we adjust the maximum parallelism wrt the
> > > given savepoint if it has not been explicitly configured. On paper,
> > this
> > > sounds like a good usability improvement, however, for the
> > > AdaptiveScheduler it posed a quite annoying complexity. If instead, we
> > said
> > > that we fail the job submission if the max parallelism does not equal the
> > > max parallelism of the savepoint, it would have been a lot easier.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Mon, Jun 21, 2021 at 9:36 AM Yangze Guo  wrote:
> > >
> > > > Thanks, I append it to the known limitations of this FLIP.
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Mon, Jun 21, 2021 at 3:20 PM Zhu Zhu  wrote:
> > > > >
> > > > > Thanks for the quick response Yangze.
> > > > > The proposal sounds good to me.
> > > > >
> > > > > Thanks,
> > > > > Zhu
> > > > >
> > > > > Yangze Guo  于2021年6月21日周一 下午3:01写道:
> > > > >>
> > > > >> Thanks for the comments, Zhu!
> > > > >>
> > > > >> Yes, it is a known limitation for fine-grained resource management.
> > We
> > > > >> also have filed this issue in FLINK-20865 when we proposed FLIP-156.
> > > > >>
> > > > >> As a first step, I agree that we can mark batch jobs with PIPELINED
> > > > >> edges as an invalid case for this feature. However, just throwing an
> > > > >> exception, in that case, might confuse users who do not understand
> > the
> > > > >> concept of pipeline region. Maybe we can force all the edges in this
> > > > >> scenario to BLOCKING in compiling stage and well document it. So
> > that,
> > > > >> common users will not be interrupted while the expert users can
> > > > >> understand the cost of that usage and make their decision. WDYT?
> > > > >>
> > > > >> Best,
> > > > >> Yangze Guo
> > > > >>
> > > > >> On Mon, Jun 21, 2021 at 2:24 PM Zhu Zhu  wrote:
> > > > >> >
> > > > >> > Thanks for proposing this @Yangze Guo and sorry for joining the
> > > > discussion so late.
> > > > >> > The proposal generally looks good to me. But I find

[jira] [Created] (FLINK-23079) HiveTableSinkITCase fails

2021-06-21 Thread JING ZHANG (Jira)
JING ZHANG created FLINK-23079:
--

 Summary: HiveTableSinkITCase fails
 Key: FLINK-23079
 URL: https://issues.apache.org/jira/browse/FLINK-23079
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Reporter: JING ZHANG


There are 4 tests in HiveTableSinkITCase that fail:

testBatchAppend,

testPartStreamingMrWrite,

testHiveTableSinkWithParallelismInStreaming,

testStreamingSinkWithTimestampLtzWatermark

 

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19241&view=logs&j=e25d5e7e-2a9c-5589-4940-0b638d75a414&t=a6e0f756-5bb9-5ea8-a468-5f60db442a29



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23080) Add finish method to the SinkFunction

2021-06-21 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-23080:


 Summary: Add finish method to the SinkFunction
 Key: FLINK-23080
 URL: https://issues.apache.org/jira/browse/FLINK-23080
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing
Reporter: Dawid Wysakowicz
 Fix For: 1.14.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)