Re: [VOTE] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-23 Thread JING ZHANG
+1 (binding)

Thanks Yangze for driving the issue.

Best regards,
JING ZHANG

Yang Wang  于2021年6月24日周四 下午1:51写道:

> +1 (non-binding)
>
> Best,
> Yang
>
> 刘建刚  于2021年6月24日周四 下午12:17写道:
>
> > +1  (binding)
> >
> > Thanks
> > liujiangang
> >
> > Zhu Zhu  于2021年6月24日周四 上午11:38写道:
> >
> > > +1  (binding)
> > >
> > > Thanks,
> > > Zhu
> > >
> > > Yangze Guo  于2021年6月21日周一 下午3:42写道:
> > >
> > > > According to the latest comment of Zhu Zhu[1], I append the potential
> > > > resource deadlock in batch jobs as a known limitation to this FLIP.
> > > > Thus, I'd extend the voting period for another 72h.
> > > >
> > > > [1]
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Tue, Jun 15, 2021 at 7:53 PM Xintong Song 
> > > > wrote:
> > > > >
> > > > > +1 (binding)
> > > > >
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Jun 15, 2021 at 6:21 PM Arvid Heise 
> > wrote:
> > > > >
> > > > > > LGTM +1 (binding) from my side.
> > > > > >
> > > > > > On Tue, Jun 15, 2021 at 11:00 AM Yangze Guo 
> > > > wrote:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I'd like to start the vote of FLIP-169 [1]. This FLIP is
> > discussed
> > > in
> > > > > > > the thread[2].
> > > > > > >
> > > > > > > The vote will be open for at least 72 hours. Unless there is an
> > > > > > > objection, I will try to close it by Jun. 18, 2021 if we have
> > > > received
> > > > > > > sufficient votes.
> > > > > > >
> > > > > > > [1]
> > > > > > >
> > > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-169+DataStream+API+for+Fine-Grained+Resource+Requirements
> > > > > > > [2]
> > > > > > >
> > > > > >
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > >
> > > >
> > >
> >
>


[jira] [Created] (FLINK-23133) The dependencies are not handled properly when mixing use of Python Table API and Python DataStream API

2021-06-23 Thread Dian Fu (Jira)
Dian Fu created FLINK-23133:
---

 Summary: The dependencies are not handled properly when mixing use 
of Python Table API and Python DataStream API
 Key: FLINK-23133
 URL: https://issues.apache.org/jira/browse/FLINK-23133
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.13.0, 1.12.0
Reporter: Dian Fu
Assignee: Dian Fu
 Fix For: 1.12.5, 1.13.2


The reason is that when converting from DataStream to Table, the dependencies 
should be handled and set correctly for the existing DataStream operators.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-23 Thread Jingsong Li
+1 to Xintong's proposal

I also have some concerns about unstable cases.

I think unstable cases can be divided into these types:

- Force majeure: For example, a network timeout or a sudden environment
failure. These are accidental and can usually be resolved by triggering
Azure again; committers should wait for the next green Azure build.

- Obvious mistakes: Errors with an obvious cause that can be fixed quickly.
In this case, do we need to wait for the fix, or just ignore the failure?

- Difficult problems: These failures are very hard to track down. There may
be no solution for quite a while, and we may not even know the cause. In
this case, we should ignore them. (Maybe this judgment is made by the
author of the test case, but what about old cases whose author can't be
found?)

So, should the ignored cases block the next release until the root cause is
found or the case is fixed? We need to ensure that someone takes care of
these cases; without further investigation of the failed tests, no one may
continue to pay attention to them.

I think this guideline should consider these situations, and show how to
solve them.

Best,
Jingsong

On Thu, Jun 24, 2021 at 10:57 AM Jark Wu  wrote:

> Thanks to Xintong for bringing up this topic, I'm +1 in general.
>
> However, I think it's still not very clear how we address the unstable
> tests.
> I think this is a very important part of this new guideline.
>
> According to the discussion above, if some tests are unstable, we can
> manually disable it.
> But I have some questions in my mind:
> 1) Is the instability judged by the committer themselves or by some
> metrics?
> 2) Should we log the disable commit in the corresponding issue and increase
> the priority?
> 3) What if nobody looks into this issue and this becomes some potential
> bugs released with the new version?
> 4) If no person is actively working on the issue, who should re-enable it?
> Would it block PRs again?
>
>
> Best,
> Jark
>
>
> On Thu, 24 Jun 2021 at 10:04, Xintong Song  wrote:
>
> > Thanks all for the feedback.
> >
> > @Till @Yangze
> >
> > I'm also not convinced by the idea of having an exception for local
> builds.
> > We need to execute the entire build (or at least the failing stage)
> > locally, to make sure subsequent test cases prevented by the failing one
> > are all executed. In that case, it's probably easier to rerun the build
> on
> > azure than locally.
> >
> > Concerning disabling unstable test cases that regularly block PRs from
> > merging, maybe we can say that such cases can only be disabled when
> someone
> > is actively looking into it, likely the person who disabled the case. If
> > this person is no longer actively working on it, he/she should enable the
> > case again no matter if it is fixed or not.
> >
> > @Jing
> >
> > Thanks for the suggestions.
> >
> > +1 to provide guidelines on handling test failures.
> >
> > 1. Report the test failures in the JIRA.
> > >
> >
> > +1 on this. Currently, the release managers are monitoring the ci and
> cron
> > build instabilities and reporting them on JIRA. We should also encourage
> > other contributors to do that for PRs.
> >
> > 2. Set a deadline to find out the root cause and solve the failure for
> the
> > > new created JIRA  because we could not block other commit merges for a
> > long
> > > time
> > >
> > 3. What to do if the JIRA has not made significant progress when reached
> to
> > > the deadline time?
> >
> >
> > I'm not sure about these two. It feels a bit against the voluntary nature
> > of open source projects.
> >
> > IMHO, frequent instabilities are more likely to be upgraded to the
> critical
> > / blocker priority, receive more attention and eventually get fixed.
> > Release managers are also responsible for looking for assignees for such
> > issues. If a case is still not fixed soonish, even with all these
> efforts,
> > I'm not sure how setting a deadline can help this.
> >
> > 4. If we disable the respective tests temporarily, we also need a
> mechanism
> > > to ensure the issue would be continued to be investigated in the
> future.
> > >
> >
> > +1. As mentioned above, we may consider disabling such tests iff someone
> is
> > actively working on it.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Wed, Jun 23, 2021 at 9:56 PM JING ZHANG  wrote:
> >
> > > Hi Xintong,
> > > +1 to the proposal.
> > > In order to better comply with the rule, it is necessary to describe
> > what's
> > > best practice if encountering test failure which seems unrelated with
> the
> > > current commits.
> > > How to avoid merging PR with test failures and not blocking code
> merging
> > > for a long time?
> > > I tried to think about the possible steps, and found there are some
> > > detailed problems that need to be discussed in a step further:
> > > 1. Report the test failures in the JIRA.
> > > 2. Set a deadline to find out the root cause and solve the failure for
> > the
> > > new created JIRA  because we could not block

Re: [VOTE] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-23 Thread Yang Wang
+1 (non-binding)

Best,
Yang

刘建刚  于2021年6月24日周四 下午12:17写道:

> +1  (binding)
>
> Thanks
> liujiangang
>
> Zhu Zhu  于2021年6月24日周四 上午11:38写道:
>
> > +1  (binding)
> >
> > Thanks,
> > Zhu
> >
> > Yangze Guo  于2021年6月21日周一 下午3:42写道:
> >
> > > According to the latest comment of Zhu Zhu[1], I append the potential
> > > resource deadlock in batch jobs as a known limitation to this FLIP.
> > > Thus, I'd extend the voting period for another 72h.
> > >
> > > [1]
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Tue, Jun 15, 2021 at 7:53 PM Xintong Song 
> > > wrote:
> > > >
> > > > +1 (binding)
> > > >
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Tue, Jun 15, 2021 at 6:21 PM Arvid Heise 
> wrote:
> > > >
> > > > > LGTM +1 (binding) from my side.
> > > > >
> > > > > On Tue, Jun 15, 2021 at 11:00 AM Yangze Guo 
> > > wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > I'd like to start the vote of FLIP-169 [1]. This FLIP is
> discussed
> > in
> > > > > > the thread[2].
> > > > > >
> > > > > > The vote will be open for at least 72 hours. Unless there is an
> > > > > > objection, I will try to close it by Jun. 18, 2021 if we have
> > > received
> > > > > > sufficient votes.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-169+DataStream+API+for+Fine-Grained+Resource+Requirements
> > > > > > [2]
> > > > > >
> > > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
> > > > > >
> > > > >
> > >
> >
>


Re: [VOTE] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-23 Thread 刘建刚
+1  (binding)

Thanks
liujiangang

Zhu Zhu  于2021年6月24日周四 上午11:38写道:

> +1  (binding)
>
> Thanks,
> Zhu
>
> Yangze Guo  于2021年6月21日周一 下午3:42写道:
>
> > According to the latest comment of Zhu Zhu[1], I append the potential
> > resource deadlock in batch jobs as a known limitation to this FLIP.
> > Thus, I'd extend the voting period for another 72h.
> >
> > [1]
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> >
> > Best,
> > Yangze Guo
> >
> > On Tue, Jun 15, 2021 at 7:53 PM Xintong Song 
> > wrote:
> > >
> > > +1 (binding)
> > >
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Tue, Jun 15, 2021 at 6:21 PM Arvid Heise  wrote:
> > >
> > > > LGTM +1 (binding) from my side.
> > > >
> > > > On Tue, Jun 15, 2021 at 11:00 AM Yangze Guo 
> > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > I'd like to start the vote of FLIP-169 [1]. This FLIP is discussed
> in
> > > > > the thread[2].
> > > > >
> > > > > The vote will be open for at least 72 hours. Unless there is an
> > > > > objection, I will try to close it by Jun. 18, 2021 if we have
> > received
> > > > > sufficient votes.
> > > > >
> > > > > [1]
> > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-169+DataStream+API+for+Fine-Grained+Resource+Requirements
> > > > > [2]
> > > > >
> > > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > >
> >
>


Re: [VOTE] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-23 Thread Zhu Zhu
+1  (binding)

Thanks,
Zhu

Yangze Guo  于2021年6月21日周一 下午3:42写道:

> According to the latest comment of Zhu Zhu[1], I append the potential
> resource deadlock in batch jobs as a known limitation to this FLIP.
> Thus, I'd extend the voting period for another 72h.
>
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
>
> Best,
> Yangze Guo
>
> On Tue, Jun 15, 2021 at 7:53 PM Xintong Song 
> wrote:
> >
> > +1 (binding)
> >
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Tue, Jun 15, 2021 at 6:21 PM Arvid Heise  wrote:
> >
> > > LGTM +1 (binding) from my side.
> > >
> > > On Tue, Jun 15, 2021 at 11:00 AM Yangze Guo 
> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I'd like to start the vote of FLIP-169 [1]. This FLIP is discussed in
> > > > the thread[2].
> > > >
> > > > The vote will be open for at least 72 hours. Unless there is an
> > > > objection, I will try to close it by Jun. 18, 2021 if we have
> received
> > > > sufficient votes.
> > > >
> > > > [1]
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-169+DataStream+API+for+Fine-Grained+Resource+Requirements
> > > > [2]
> > > >
> > >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > >
>


Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-23 Thread Jark Wu
Thanks to Xintong for bringing up this topic, I'm +1 in general.

However, I think it's still not very clear how we address the unstable
tests.
I think this is a very important part of this new guideline.

According to the discussion above, if some tests are unstable, we can
manually disable it.
But I have some questions in my mind:
1) Is the instability judged by the committer themselves or by some
metrics?
2) Should we log the disable commit in the corresponding issue and increase
the priority?
3) What if nobody looks into this issue and this becomes some potential
bugs released with the new version?
4) If no person is actively working on the issue, who should re-enable it?
Would it block PRs again?


Best,
Jark


On Thu, 24 Jun 2021 at 10:04, Xintong Song  wrote:

> Thanks all for the feedback.
>
> @Till @Yangze
>
> I'm also not convinced by the idea of having an exception for local builds.
> We need to execute the entire build (or at least the failing stage)
> locally, to make sure subsequent test cases prevented by the failing one
> are all executed. In that case, it's probably easier to rerun the build on
> azure than locally.
>
> Concerning disabling unstable test cases that regularly block PRs from
> merging, maybe we can say that such cases can only be disabled when someone
> is actively looking into it, likely the person who disabled the case. If
> this person is no longer actively working on it, he/she should enable the
> case again no matter if it is fixed or not.
>
> @Jing
>
> Thanks for the suggestions.
>
> +1 to provide guidelines on handling test failures.
>
> 1. Report the test failures in the JIRA.
> >
>
> +1 on this. Currently, the release managers are monitoring the ci and cron
> build instabilities and reporting them on JIRA. We should also encourage
> other contributors to do that for PRs.
>
> 2. Set a deadline to find out the root cause and solve the failure for the
> > new created JIRA  because we could not block other commit merges for a
> long
> > time
> >
> 3. What to do if the JIRA has not made significant progress when reached to
> > the deadline time?
>
>
> I'm not sure about these two. It feels a bit against the voluntary nature
> of open source projects.
>
> IMHO, frequent instabilities are more likely to be upgraded to the critical
> / blocker priority, receive more attention and eventually get fixed.
> Release managers are also responsible for looking for assignees for such
> issues. If a case is still not fixed soonish, even with all these efforts,
> I'm not sure how setting a deadline can help this.
>
> 4. If we disable the respective tests temporarily, we also need a mechanism
> > to ensure the issue would be continued to be investigated in the future.
> >
>
> +1. As mentioned above, we may consider disabling such tests iff someone is
> actively working on it.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Wed, Jun 23, 2021 at 9:56 PM JING ZHANG  wrote:
>
> > Hi Xintong,
> > +1 to the proposal.
> > In order to better comply with the rule, it is necessary to describe
> what's
> > best practice if encountering test failure which seems unrelated with the
> > current commits.
> > How to avoid merging PR with test failures and not blocking code merging
> > for a long time?
> > I tried to think about the possible steps, and found there are some
> > detailed problems that need to be discussed in a step further:
> > 1. Report the test failures in the JIRA.
> > 2. Set a deadline to find out the root cause and solve the failure for
> the
> > new created JIRA  because we could not block other commit merges for a
> long
> > time
> > When is a reasonable deadline here?
> > 3. What to do if the JIRA has not made significant progress when reached
> to
> > the deadline time?
> > There are several situations as follows, maybe different cases need
> > different approaches.
> > 1. the JIRA is non-assigned yet
> > 2. not found the root cause yet
> > 3. not found a good solution, but already found the root cause
> > 4. found a solution, but it needs more time to be done.
> > 4. If we disable the respective tests temporarily, we also need a
> mechanism
> > to ensure the issue would be continued to be investigated in the future.
> >
> > Best regards,
> > JING ZHANG
> >
> > Stephan Ewen  于2021年6月23日周三 下午8:16写道:
> >
> > > +1 to Xintong's proposal
> > >
> > > On Wed, Jun 23, 2021 at 1:53 PM Till Rohrmann 
> > > wrote:
> > >
> > > > I would first try to not introduce the exception for local builds. It
> > > makes
> > > > it quite hard for others to verify the build and to make sure that
> the
> > > > right things were executed. If we see that this becomes an issue then
> > we
> > > > can revisit this idea.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Wed, Jun 23, 2021 at 4:19 AM Yangze Guo 
> wrote:
> > > >
> > > > > +1 for appending this to community guidelines for merging PRs.
> > > > >
> > > > > @Till Rohrmann
> > > > > I agree that with this approach unstable te

[jira] [Created] (FLINK-23132) flink upgrade issue(1.11.3->1.13.0)

2021-06-23 Thread Jeff Hu (Jira)
Jeff Hu created FLINK-23132:
---

 Summary: flink upgrade issue(1.11.3->1.13.0)
 Key: FLINK-23132
 URL: https://issues.apache.org/jira/browse/FLINK-23132
 Project: Flink
  Issue Type: Bug
Reporter: Jeff Hu


 
In order to improve the performance of data processing, we store events in a map 
and do not process them until the event count reaches 100. In the meantime, we 
start a timer in the open method, so data is processed every 60 seconds.

This works when the Flink version is *1.11.3*.

After upgrading the Flink version to *1.13.0*, I found that sometimes events 
were consumed from Kafka continuously but were not processed in the 
RichFlatMapFunction, which means data was missing. After restarting the service 
it works well, but several hours later the same thing happened again.

Is there any known issue for this Flink version? Any suggestions are appreciated.
{code:java}
public class MyJob {
    public static void main(String[] args) throws Exception {
        ...
        DataStream rawEventSource = env.addSource(flinkKafkaConsumer);
        ...
    }
}

public class MyMapFunction extends RichFlatMapFunction implements Serializable {

    @Override
    public void open(Configuration parameters) {
        ...
        long periodTimeout = 60;
        pool.scheduleAtFixedRate(() -> {
            // processing data
        }, periodTimeout, periodTimeout, TimeUnit.SECONDS);
    }

    @Override
    public void flatMap(String message, Collector out) {
        // store event to map
        // count event
        // when count = 100, start data processing
    }
}
{code}
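
Below is a minimal, self-contained sketch of the described pattern — buffer 
events, flush when 100 have been collected, and flush periodically from a timer 
started in open(). It is my reconstruction, not the reporter's code: the class 
and field names, the String element type, and the synchronized buffer are 
assumptions for illustration. Note that the scheduled thread is not synchronized 
with flatMap() by the Flink runtime, which is a common source of lost or 
unprocessed records; a KeyedProcessFunction with processing-time timers would 
keep buffering and flushing on the same task thread.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class BufferingFlatMap extends RichFlatMapFunction<String, String> {

    private transient List<String> buffer;
    private transient ScheduledExecutorService pool;

    @Override
    public void open(Configuration parameters) {
        buffer = new ArrayList<>();
        pool = Executors.newSingleThreadScheduledExecutor();
        // Flush every 60 seconds from a separate thread.
        pool.scheduleAtFixedRate(this::flush, 60, 60, TimeUnit.SECONDS);
    }

    @Override
    public void flatMap(String message, Collector<String> out) {
        synchronized (this) {
            buffer.add(message);
            if (buffer.size() >= 100) {
                flush();
            }
        }
    }

    private synchronized void flush() {
        // Process the buffered events here, then clear the buffer.
        buffer.clear();
    }

    @Override
    public void close() {
        pool.shutdownNow();
    }
}
{code}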



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-23 Thread Xintong Song
Thanks all for the feedback.

@Till @Yangze

I'm also not convinced by the idea of having an exception for local builds.
We need to execute the entire build (or at least the failing stage)
locally, to make sure subsequent test cases prevented by the failing one
are all executed. In that case, it's probably easier to rerun the build on
azure than locally.

Concerning disabling unstable test cases that regularly block PRs from
merging, maybe we can say that such cases can only be disabled when someone
is actively looking into it, likely the person who disabled the case. If
this person is no longer actively working on it, he/she should enable the
case again no matter if it is fixed or not.

@Jing

Thanks for the suggestions.

+1 to provide guidelines on handling test failures.

1. Report the test failures in the JIRA.
>

+1 on this. Currently, the release managers are monitoring the ci and cron
build instabilities and reporting them on JIRA. We should also encourage
other contributors to do that for PRs.

2. Set a deadline to find out the root cause and solve the failure for the
> new created JIRA  because we could not block other commit merges for a long
> time
>
3. What to do if the JIRA has not made significant progress when reached to
> the deadline time?


I'm not sure about these two. It feels a bit against the voluntary nature
of open source projects.

IMHO, frequent instabilities are more likely to be upgraded to the critical
/ blocker priority, receive more attention and eventually get fixed.
Release managers are also responsible for looking for assignees for such
issues. If a case is still not fixed soonish, even with all these efforts,
I'm not sure how setting a deadline can help this.

4. If we disable the respective tests temporarily, we also need a mechanism
> to ensure the issue would be continued to be investigated in the future.
>

+1. As mentioned above, we may consider disabling such tests iff someone is
actively working on it.

Thank you~

Xintong Song



On Wed, Jun 23, 2021 at 9:56 PM JING ZHANG  wrote:

> Hi Xintong,
> +1 to the proposal.
> In order to better comply with the rule, it is necessary to describe what's
> best practice if encountering test failure which seems unrelated with the
> current commits.
> How to avoid merging PR with test failures and not blocking code merging
> for a long time?
> I tried to think about the possible steps, and found there are some
> detailed problems that need to be discussed in a step further:
> 1. Report the test failures in the JIRA.
> 2. Set a deadline to find out the root cause and solve the failure for the
> new created JIRA  because we could not block other commit merges for a long
> time
> When is a reasonable deadline here?
> 3. What to do if the JIRA has not made significant progress when reached to
> the deadline time?
> There are several situations as follows, maybe different cases need
> different approaches.
> 1. the JIRA is non-assigned yet
> 2. not found the root cause yet
> 3. not found a good solution, but already found the root cause
> 4. found a solution, but it needs more time to be done.
> 4. If we disable the respective tests temporarily, we also need a mechanism
> to ensure the issue would be continued to be investigated in the future.
>
> Best regards,
> JING ZHANG
>
> Stephan Ewen  于2021年6月23日周三 下午8:16写道:
>
> > +1 to Xintong's proposal
> >
> > On Wed, Jun 23, 2021 at 1:53 PM Till Rohrmann 
> > wrote:
> >
> > > I would first try to not introduce the exception for local builds. It
> > makes
> > > it quite hard for others to verify the build and to make sure that the
> > > right things were executed. If we see that this becomes an issue then
> we
> > > can revisit this idea.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Jun 23, 2021 at 4:19 AM Yangze Guo  wrote:
> > >
> > > > +1 for appending this to community guidelines for merging PRs.
> > > >
> > > > @Till Rohrmann
> > > > I agree that with this approach unstable tests will not block other
> > > > commit merges. However, it might be hard to prevent merging commits
> > > > that are related to those tests and should have been passed them.
> It's
> > > > true that this judgment can be made by the committers, but no one can
> > > > ensure the judgment is always precise and so that we have this
> > > > discussion thread.
> > > >
> > > > Regarding the unstable tests, how about adding another exception:
> > > > committers verify it in their local environment and comment in such
> > > > cases?
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Tue, Jun 22, 2021 at 8:23 PM 刘建刚 
> wrote:
> > > > >
> > > > > It is a good principle to run all tests successfully with any
> change.
> > > > This
> > > > > means a lot for project's stability and development. I am big +1
> for
> > > this
> > > > > proposal.
> > > > >
> > > > > Best
> > > > > liujiangang
> > > > >
> > > > > Till Rohrmann  于2021年6月22日周二 下午6:36写道:
> > > > >
> > > > > > One way to address 

[jira] [Created] (FLINK-23131) Remove scala from plugin parent-first patterns

2021-06-23 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23131:


 Summary: Remove scala from plugin parent-first patterns
 Key: FLINK-23131
 URL: https://issues.apache.org/jira/browse/FLINK-23131
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Configuration
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


In order to load Akka and its Scala version through a separate classloader, we 
need to remove Scala from the parent-first patterns for plugins.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23130) RestServerEndpoint references on netty 3

2021-06-23 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23130:


 Summary: RestServerEndpoint references on netty 3
 Key: FLINK-23130
 URL: https://issues.apache.org/jira/browse/FLINK-23130
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / REST
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


The RestServerEndpoint does an instanceof check against 
{{org.jboss.netty.channel.ChannelException}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23129) When cancelling any running job of multiple jobs in an application cluster, JobManager shuts down

2021-06-23 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-23129:
--

 Summary: When cancelling any running job of multiple jobs in an 
application cluster, JobManager shuts down
 Key: FLINK-23129
 URL: https://issues.apache.org/jira/browse/FLINK-23129
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.14.0
Reporter: Robert Metzger


I have a jar with two jobs, both executeAsync() from the same main method. I 
execute the main method in an Application Mode cluster. When I cancel one of 
the two jobs, both jobs will stop executing.

I would expect that the JobManager shuts down once all jobs submitted from an 
application are finished.

If this is a known limitation, we should document it.
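
For reference, a minimal sketch of such an application — not taken from the 
actual job, with placeholder sources and sinks — that submits two jobs from the 
same main() via executeAsync():

{code:java}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TwoJobsApplication {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // First job: a long-running source piped into a print sink.
        env.fromSequence(0, Long.MAX_VALUE).print();
        env.executeAsync("first job");

        // Second job: submitted from the same main() without waiting for the first one.
        env.fromSequence(0, Long.MAX_VALUE).print();
        env.executeAsync("second job");
    }
}
{code}

Cancelling either of the two running jobs then produces the shutdown behaviour 
shown in the log below.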

{code}
2021-06-23 21:29:53,123 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job first job 
(18181be02da272387354d093519b2359) switched from state RUNNING to CANCELLING.
2021-06-23 21:29:53,124 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Source: 
Custom Source -> Sink: Unnamed (1/1) (5a69b1c19f8da23975f6961898ab50a2) 
switched from RUNNING to CANCELING.
2021-06-23 21:29:53,141 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Source: 
Custom Source -> Sink: Unnamed (1/1) (5a69b1c19f8da23975f6961898ab50a2) 
switched from CANCELING to CANCELED.
2021-06-23 21:29:53,144 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] 
- Clearing resource requirements of job 18181be02da272387354d093519b2359
2021-06-23 21:29:53,145 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job first job 
(18181be02da272387354d093519b2359) switched from state CANCELLING to CANCELED.
2021-06-23 21:29:53,145 INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Stopping 
checkpoint coordinator for job 18181be02da272387354d093519b2359.
2021-06-23 21:29:53,147 INFO  
org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore [] - 
Shutting down
2021-06-23 21:29:53,150 INFO  
org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job 
18181be02da272387354d093519b2359 reached terminal state CANCELED.
2021-06-23 21:29:53,152 INFO  org.apache.flink.runtime.jobmaster.JobMaster  
   [] - Stopping the JobMaster for job first 
job(18181be02da272387354d093519b2359).
2021-06-23 21:29:53,155 INFO  
org.apache.flink.runtime.jobmaster.slotpool.DefaultDeclarativeSlotPool [] - 
Releasing slot [c35b64879d6b02d383c825ea735ebba0].
2021-06-23 21:29:53,159 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] 
- Clearing resource requirements of job 18181be02da272387354d093519b2359
2021-06-23 21:29:53,159 INFO  org.apache.flink.runtime.jobmaster.JobMaster  
   [] - Close ResourceManager connection 
281b3fcf7ad0a6f7763fa90b8a5b9adb: Stopping JobMaster for job first 
job(18181be02da272387354d093519b2359)..
2021-06-23 21:29:53,160 INFO  
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - 
Disconnect job manager 
0...@akka.tcp://flink@localhost:6123/user/rpc/jobmanager_2
 for job 18181be02da272387354d093519b2359 from the resource manager.
2021-06-23 21:29:53,225 INFO  
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap 
[] - Application CANCELED:
java.util.concurrent.CompletionException: 
org.apache.flink.client.deployment.application.UnsuccessfulExecutionException: 
Application Status: CANCELED
at 
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.lambda$unwrapJobResultException$4(ApplicationDispatcherBootstrap.java:304)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) 
~[?:1.8.0_252]
at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
 ~[?:1.8.0_252]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 
~[?:1.8.0_252]
at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) 
~[?:1.8.0_252]
at 
org.apache.flink.client.deployment.application.JobStatusPollingUtils.lambda$null$2(JobStatusPollingUtils.java:101)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
 ~[?:1.8.0_252]
at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
 ~[?:1.8.0_252]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 
~[?:1.8.0_252]
at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) 
~[?:1.8.0_252]
at 
org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHan

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-23 Thread Austin Cawley-Edwards
Hi all,

Thanks, Konstantin and Till, for guiding the discussion.

I was not aware of the results of the call with Konstantin and was
attempting to resolve the unanswered questions before more, potentially
fruitless, work was done.

I am also looking forward to the coming proposal, as well as increasing my
understanding of this specific use case + its limitations!

Best,
Austin

On Tue, Jun 22, 2021 at 6:32 AM Till Rohrmann  wrote:

> Hi everyone,
>
> I do like the idea of keeping the actual change outside of Flink but to
> enable Flink to support such a use case (different authentication
> mechanisms). I think this is a good compromise for the community that
> combines long-term maintainability with support for new use-cases. I am
> looking forward to your proposal.
>
> I also want to second Konstantin here that the tone of your last email,
> Marton, does not reflect the values and manners of the Flink community and
> is not representative of how we conduct discussions. Especially, the more
> senior community members should know this and act accordingly in order to
> be good role models for others in the community. Technical discussions
> should not be decided by who wields presumably the greatest authority but
> by the soundness of arguments and by what is the best solution for a
> problem.
>
> Let us now try to find the best solution for the problem at hand!
>
> Cheers,
> Till
>
> On Tue, Jun 22, 2021 at 11:24 AM Konstantin Knauf 
> wrote:
>
> > Hi everyone,
> >
> > First, Marton and I had a brief conversation yesterday offline and
> > discussed exploring the approach of exposing the authentication
> > functionality via an API. So, I am looking forward to your proposal in
> that
> > direction. The benefit of such a solution would be that it is extensible
> > for others and it does add a smaller maintenance (in particular testing)
> > footprint to Apache Flink itself. If we end up going down this route,
> > flink-packages.org would be a great way to promote these third party
> > "authentication modules".
> >
> > Second, Marton, I understand your frustration about the long discussion
> on
> > this "simple matter", but the condescending tone of your last mail feels
> > uncalled for to me. Austin expressed a valid opinion on the topic, which
> is
> > based on his experience from other Open Source frameworks (CNCF mostly).
> I
> > am sure you agree that it is important for Apache Flink to stay open and
> to
> > consider different approaches and ideas and I don't think it helps the
> > culture of discussion to shoot it down like this ("This is where this
> > discussion stops.").
> >
> > Let's continue to move this discussion forward and I am sure we'll find a
> > consensus based on product and technological considerations.
> >
> > Thanks,
> >
> > Konstantin
> >
> > On Tue, Jun 22, 2021 at 9:31 AM Márton Balassi  >
> > wrote:
> >
> > > Hi Austin,
> > >
> > > Thank you for your thoughts. This is where this discussion stops. This
> > > email thread already contains more characters than the implementation
> and
> > > what is needed for the next 20 years of maintenance.
> > >
> > > It is great that you have a view on modern solutions and thank you for
> > > offering your help with brainstorming solutions. I am responsible for
> > Flink
> > > at Cloudera and we do need an implementation like this and it is in
> fact
> > > already in production at dozens of customers. We are open to adapting
> > that
> > > to expose a more generic API (and keeping Kerberos to our fork), to
> > > contribute this to the community as others have asked for it and to
> > protect
> > > ourselves from occasionally having to update this critical
> implementation
> > > path based on changes in the Apache codebase. I have worked with close
> > to a
> > > hundred Big Data customers as a consultant and an engineering manager
> and
> > > committed hundreds of changes to Apache Flink over the past decade,
> > please
> > > trust my judgement on a simple matter like this.
> > >
> > > Please forgive me for referencing authority, this discussion was
> getting
> > > out of hand. Please keep vigilant.
> > >
> > > Best,
> > > Marton
> > >
> > > On Mon, Jun 21, 2021 at 10:50 PM Austin Cawley-Edwards <
> > > austin.caw...@gmail.com> wrote:
> > >
> > > > Hi Gabor + Marton,
> > > >
> > > > I don't believe that the issue with this proposal is the specific
> > > mechanism
> > > > proposed (Kerberos), but rather that it is not the level to implement
> > it
> > > at
> > > > (Flink). I'm just one voice, so please take this with a grain of
> salt.
> > > >
> > > > In the other solutions previously noted there is no need to
> instrument
> > > > Flink which, in addition to reducing the maintenance burden,
> provides a
> > > > better, decoupled end result.
> > > >
> > > > IMO we should not add any new API in Flink for this use case. I think
> > it
> > > is
> > > > unfortunate and sympathize with the work that has already been done
> on
> > > this
> > > > feature – pe

[jira] [Created] (FLINK-23128) Translate update to operations playground docs to Chinese

2021-06-23 Thread David Anderson (Jira)
David Anderson created FLINK-23128:
--

 Summary: Translate update to operations playground docs to Chinese
 Key: FLINK-23128
 URL: https://issues.apache.org/jira/browse/FLINK-23128
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation / Training
Affects Versions: 1.13.1
Reporter: David Anderson






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23127) Can't use plugins for GCS filesystem

2021-06-23 Thread Yaroslav Tkachenko (Jira)
Yaroslav Tkachenko created FLINK-23127:
--

 Summary: Can't use plugins for GCS filesystem
 Key: FLINK-23127
 URL: https://issues.apache.org/jira/browse/FLINK-23127
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.13.0
Reporter: Yaroslav Tkachenko
 Attachments: exception-stacktrace.txt

I've been trying to add support for the GCS filesystem. I have a working 
example where I add two JARs to the */opt/flink/lib/* folder:
 * [GCS Hadoop 
connector|https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-latest-hadoop2.jar]
 * *Shaded* Hadoop using 
[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar|https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.8.3-10.0/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar]

Now I'm trying to follow the advice from [this 
page|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/filesystems/overview/#pluggable-file-systems]
 and use Plugins instead. I followed the recommendations from 
[here|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/filesystems/plugins/].
 Now I have two JARs in the */opt/flink/plugins/hadoop-gcs/* folder:
 * [GCS Hadoop 
connector|https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-hadoop2-2.2.1.jar]
 * *Non-shaded* [Hadoop using 
hadoop-common-2.10.1.jar|https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/2.10.1/hadoop-common-2.10.1.jar]

As I can see, shading is not required for plugins (that's one of the reasons to 
use them), so I want to make it work with a simple non-shaded _hadoop-common_.

However, the JobManager fails with an exception (full stacktrace is available 
in an attachment):
{quote}Caused by: 
org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in 
the classpath/dependencies.
{quote}
The exception is thrown when _org.apache.hadoop.conf.Configuration_ and 
_org.apache.hadoop.fs.FileSystem_ [are not available in the 
classpath|https://github.com/apache/flink/blob/f2f2befee76d08b4d9aa592438dc0cf5ebe2ef96/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L1123-L1124],
 but they're available in hadoop-common and should have been loaded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23126) Refactor smoke-e2e into smoke-e2e-common and smoke-e2e-embedded

2021-06-23 Thread Evans Ye (Jira)
Evans Ye created FLINK-23126:


 Summary: Refactor smoke-e2e into smoke-e2e-common and 
smoke-e2e-embedded
 Key: FLINK-23126
 URL: https://issues.apache.org/jira/browse/FLINK-23126
 Project: Flink
  Issue Type: Sub-task
  Components: Stateful Functions, Tests
Reporter: Evans Ye


This JIRA focuses on refactoring the existing statefun-smoke-e2e module into:
* statefun-smoke-e2e-common (E2E testing framework such as source, sink, 
verification, etc)
* statefun-smoke-e2e-embedded (embedded java function)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23125) Run StateFun smoke E2E tests for multiple language SDKs

2021-06-23 Thread Evans Ye (Jira)
Evans Ye created FLINK-23125:


 Summary: Run StateFun smoke E2E tests for multiple language SDKs
 Key: FLINK-23125
 URL: https://issues.apache.org/jira/browse/FLINK-23125
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions, Tests
Reporter: Evans Ye


Currently the statefun-smoke-e2e module only tests the embedded Java function. 
This JIRA aims to refactor the existing code into self-contained modules so 
that we can easily compose the testing framework with different language SDKs.

The design will look like:
1. statefun-smoke-e2e-common (core testing framework such as source, sink, 
verification, etc)
2. statefun-smoke-e2e-embedded
3. statefun-smoke-e2e-java
...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] [FLINK-23122] Provide the Dynamic register converter

2021-06-23 Thread Jark Wu
Hi,

`TIMESTAMP_WITH_TIME_ZONE` is not supported in the Flink SQL engine,
 even though it is listed in the type API.

I think what you are looking for is the RawValueType which can be used as
user-defined type. You can use `DataTypes.RAW(TypeInformation)` to define
 a Raw type with the given TypeInformation which includes the serializer
and deserializer.
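
For example, a minimal sketch (my illustration, not part of the original mail;
MyCitext is a hypothetical class standing in for the Postgres '_citext' value):

    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.table.api.DataTypes;
    import org.apache.flink.table.types.DataType;

    public class RawTypeExample {
        public static void main(String[] args) {
            // Declare a RAW type whose (de)serializer is derived from the TypeInformation.
            DataType citextType = DataTypes.RAW(TypeInformation.of(MyCitext.class));
            System.out.println(citextType);
        }

        /** Hypothetical placeholder for the value read from a '_citext' column. */
        public static class MyCitext {
            public String value;
        }
    }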

Best,
Jark

On Wed, 23 Jun 2021 at 21:09, 云华  wrote:

>
> Hi everyone,
> I want to rework the type conversion system in the connector and flink table
> modules to be reusable and scalable.
> In the Postgres system, the type '_citext' is not supported in
> org.apache.flink.connector.jdbc.catalog.PostgresCatalog#fromJDBCType.
> What's more,
> org.apache.flink.table.runtime.typeutils.InternalSerializers#createInternal
> cannot support TIMESTAMP_WITH_TIME_ZONE.
> For more background and api design :
> https://issues.apache.org/jira/browse/FLINK-23122.
> Please let me know if this matches your thoughts.
>
>
>
> Regards,Jack


[jira] [Created] (FLINK-23124) Implement exactly-once Kafka Sink

2021-06-23 Thread Fabian Paul (Jira)
Fabian Paul created FLINK-23124:
---

 Summary: Implement exactly-once Kafka Sink
 Key: FLINK-23124
 URL: https://issues.apache.org/jira/browse/FLINK-23124
 Project: Flink
  Issue Type: Sub-task
Reporter: Fabian Paul






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23123) Implement at-least-once Kafka Sink

2021-06-23 Thread Fabian Paul (Jira)
Fabian Paul created FLINK-23123:
---

 Summary: Implement at-least-once Kafka Sink
 Key: FLINK-23123
 URL: https://issues.apache.org/jira/browse/FLINK-23123
 Project: Flink
  Issue Type: Sub-task
Reporter: Fabian Paul






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-23 Thread JING ZHANG
Hi Xintong,
+1 to the proposal.
In order to better comply with the rule, it is necessary to describe the
best practice when encountering a test failure which seems unrelated to the
current commits.
How do we avoid merging PRs with test failures without blocking code merging
for a long time?
I tried to think about the possible steps, and found some detailed problems
that need to be discussed further:
1. Report the test failures in the JIRA.
2. Set a deadline to find the root cause and solve the failure for the newly
created JIRA, because we could not block other commit merges for a long
time.
When is a reasonable deadline here?
3. What to do if the JIRA has not made significant progress by the deadline?
There are several situations as follows; maybe different cases need
different approaches.
1. the JIRA is not assigned yet
2. the root cause has not been found yet
3. the root cause is known, but a good solution has not been found yet
4. a solution has been found, but it needs more time to be completed
4. If we disable the respective tests temporarily, we also need a mechanism
to ensure the issue will continue to be investigated in the future.

Best regards,
JING ZHANG

Stephan Ewen  于2021年6月23日周三 下午8:16写道:

> +1 to Xintong's proposal
>
> On Wed, Jun 23, 2021 at 1:53 PM Till Rohrmann 
> wrote:
>
> > I would first try to not introduce the exception for local builds. It
> makes
> > it quite hard for others to verify the build and to make sure that the
> > right things were executed. If we see that this becomes an issue then we
> > can revisit this idea.
> >
> > Cheers,
> > Till
> >
> > On Wed, Jun 23, 2021 at 4:19 AM Yangze Guo  wrote:
> >
> > > +1 for appending this to community guidelines for merging PRs.
> > >
> > > @Till Rohrmann
> > > I agree that with this approach unstable tests will not block other
> > > commit merges. However, it might be hard to prevent merging commits
> > > that are related to those tests and should have been passed them. It's
> > > true that this judgment can be made by the committers, but no one can
> > > ensure the judgment is always precise and so that we have this
> > > discussion thread.
> > >
> > > Regarding the unstable tests, how about adding another exception:
> > > committers verify it in their local environment and comment in such
> > > cases?
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Tue, Jun 22, 2021 at 8:23 PM 刘建刚  wrote:
> > > >
> > > > It is a good principle to run all tests successfully with any change.
> > > This
> > > > means a lot for project's stability and development. I am big +1 for
> > this
> > > > proposal.
> > > >
> > > > Best
> > > > liujiangang
> > > >
> > > > Till Rohrmann  于2021年6月22日周二 下午6:36写道:
> > > >
> > > > > One way to address the problem of regularly failing tests that
> block
> > > > > merging of PRs is to disable the respective tests for the time
> being.
> > > Of
> > > > > course, the failing test then needs to be fixed. But at least that
> > way
> > > we
> > > > > would not block everyone from making progress.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Tue, Jun 22, 2021 at 12:00 PM Arvid Heise 
> > wrote:
> > > > >
> > > > > > I think this is overall a good idea. So +1 from my side.
> > > > > > However, I'd like to put a higher priority on infrastructure
> then,
> > in
> > > > > > particular docker image/artifact caches.
> > > > > >
> > > > > > On Tue, Jun 22, 2021 at 11:50 AM Till Rohrmann <
> > trohrm...@apache.org
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for bringing this topic to our attention Xintong. I
> think
> > > your
> > > > > > > proposal makes a lot of sense and we should follow it. It will
> > > give us
> > > > > > > confidence that our changes are working and it might be a good
> > > > > incentive
> > > > > > to
> > > > > > > quickly fix build instabilities. Hence, +1.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Till
> > > > > > >
> > > > > > > On Tue, Jun 22, 2021 at 11:12 AM Xintong Song <
> > > tonysong...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > In the past a couple of weeks, I've observed several times
> that
> > > PRs
> > > > > are
> > > > > > > > merged without a green light from the CI tests, where failure
> > > cases
> > > > > are
> > > > > > > > considered *unrelated*. This may not always cause problems,
> but
> > > would
> > > > > > > > increase the chance of breaking our code base. In fact, it
> has
> > > > > occurred
> > > > > > > to
> > > > > > > > me twice in the past few weeks that I had to revert a commit
> > > which
> > > > > > breaks
> > > > > > > > the master branch due to this.
> > > > > > > >
> > > > > > > > I think it would be nicer to enforce a stricter rule, that no
> > PRs
> > > > > > should
> > > > > > > be
> > > > > > > > merged without passing CI.
> > > > > > > >
> > > > > > > > The problems of merging PRs with "unrelated" test failures
> are:
>

Re: [DISCUSS] Incrementally deprecating the DataSet API

2021-06-23 Thread Chesnay Schepler
If we want to publicize this plan more shouldn't we have a rough 
timeline for when 2.0 is on the table?


On 6/23/2021 2:44 PM, Stephan Ewen wrote:

Thanks for writing this up, this also reflects my understanding.

I think a blog post would be nice, ideally with an explicit call for
feedback so we learn about user concerns.
A blog post has a lot more reach than an ML thread.

Best,
Stephan


On Wed, Jun 23, 2021 at 12:23 PM Timo Walther  wrote:


Hi everyone,

I'm sending this email to make sure everyone is on the same page about
slowly deprecating the DataSet API.

There have been a few thoughts mentioned in presentations, offline
discussions, and JIRA issues. However, I have observed that there are
still some concerns or different opinions on what steps are necessary to
implement this change.

Let me summarize some of the steps and assumptions and let's have a
discussion about it:

Step 1: Introduce a batch mode for Table API (FLIP-32)
[DONE in 1.9]

Step 2: Introduce a batch mode for DataStream API (FLIP-134)
[DONE in 1.12]

Step 3: Soft deprecate DataSet API (FLIP-131)
[DONE in 1.12]

We updated the documentation recently to make this deprecation even more
visible. There is a dedicated `(Legacy)` label right next to the menu
item now.

We won't deprecate concrete classes of the API with a @Deprecated
annotation to avoid extensive warnings in logs until then.

Step 4: Drop the legacy SQL connectors and formats (FLINK-14437)
[DONE in 1.14]

We dropped code for ORC, Parquet, and HBase formats that were only used
by DataSet API users. The removed classes had no documentation and were
not annotated with one of our API stability annotations.

The old functionality should be available through the new sources and
sinks for Table API and DataStream API. If not, we should bring them
into a shape that they can be a full replacement.

DataSet users are encouraged to either upgrade the API or use Flink
1.13. Users can either just stay at Flink 1.13 or copy only the format's
code to a newer Flink version. We aim to keep the core interfaces (i.e.
InputFormat and OutputFormat) stable until the next major version.

We will maintain/allow important contributions to dropped connectors in
1.13. So 1.13 could be considered as kind of a DataSet API LTS release.

Step 5: Drop the legacy SQL planner (FLINK-14437)
[DONE in 1.14]

This included dropping support of DataSet API with SQL.

Step 6: Connect both Table and DataStream API in batch mode (FLINK-20897)
[PLANNED in 1.14]

Step 7: Reach feature parity of Table API/DataStream API with DataSet API
[PLANNED for 1.14++]

We need to identify blockers when migrating from DataSet API to Table
API/DataStream API. Here we need to establish a good feedback pipeline
to include DataSet users in the roadmap planning.

Step 8: Drop the Gelly library

No concrete plan yet. Latest would be the next major Flink version aka
Flink 2.0.

Step 9: Drop DataSet API

Planned for the next major Flink version aka Flink 2.0.


Please let me know if this matches your thoughts. We can also convert
this into a blog post or mention it in the next release notes.

Regards,
Timo






[DISCUSS] [FLINK-23122] Provide the Dynamic register converter

2021-06-23 Thread 云华

Hi everyone,
I want to rework the type conversion system in the connector and flink table 
modules to be reusable and scalable.
In the Postgres system, the type '_citext' is not supported in 
org.apache.flink.connector.jdbc.catalog.PostgresCatalog#fromJDBCType. What's 
more, 
org.apache.flink.table.runtime.typeutils.InternalSerializers#createInternal 
cannot support TIMESTAMP_WITH_TIME_ZONE.
For more background and API design: 
https://issues.apache.org/jira/browse/FLINK-23122.
Please let me know if this matches your thoughts.



Regards,
Jack

[jira] [Created] (FLINK-23122) Provide the Dynamic register converter

2021-06-23 Thread lqjacklee (Jira)
lqjacklee created FLINK-23122:
-

 Summary: Provide the Dynamic register converter 
 Key: FLINK-23122
 URL: https://issues.apache.org/jira/browse/FLINK-23122
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Common, Connectors / HBase, Connectors / 
Hive, Connectors / JDBC, Connectors / ORC, Table SQL / API
Affects Versions: 1.14.0
Reporter: lqjacklee


Background:

Type conversion is at the core of data exchange between Flink and external data 
sources. By default, Flink provides type conversion for each connector, but the 
conversion logic is scattered across the concrete implementations of the 
individual connectors, which makes it hard to reuse within the Flink system. 
Secondly, because of the diversity of data source types, the existing 
conversions need to be extended, yet they cannot be extended dynamically. 
Finally, the core conversion logic needs to be reused across multiple projects, 
so the goal is to abstract it into a unified component: applications depend on 
the same type conversion system, and different sub-components can dynamically 
register additional conversions.


1. ConvertServiceRegister: provides registration and lookup functions.

{code:java}
public interface ConvertServiceRegister {

void register(ConversionService conversionService);

void register(ConversionServiceFactory conversionServiceFactory);

void register(ConversionServiceSet conversionServiceSet);

Collection convertServices();

Collection convertServices(String group);

Collection convertServiceSets();

Collection convertServiceSets(String group);
}
{code}

2. ConversionService: provides the actual conversion implementation.

{code:java}
public interface ConversionService extends Order {

Set tags();

boolean canConvert(TypeInformationHolder source, TypeInformationHolder 
target)
throws ConvertException;

Object convert(
TypeInformationHolder sourceType,
Object source,
TypeInformationHolder targetType,
Object defaultValue,
boolean nullable)
throws ConvertException;
}
{code}


3. ConversionServiceFactory: provides a factory for conversion services.

{code:java}
public interface ConversionServiceFactory extends Order {

Set tags();

ConversionService getConversionService(T target) throws 
ConvertException;
}
{code}


4. ConversionServiceSet: provides group management.

{code:java}
public interface ConversionServiceSet extends Loadable {

Set tags();

Collection conversionServices();

boolean support(TypeInformationHolder source, TypeInformationHolder target)
throws ConvertException;

Object convert(
String name,
TypeInformationHolder typeInformationHolder,
Object value,
TypeInformationHolder type,
Object defaultValue,
boolean nullable)
throws ConvertException;
}
{code}






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Incrementally deprecating the DataSet API

2021-06-23 Thread Stephan Ewen
Thanks for writing this up, this also reflects my understanding.

I think a blog post would be nice, ideally with an explicit call for
feedback so we learn about user concerns.
A blog post has a lot more reach than an ML thread.

Best,
Stephan


On Wed, Jun 23, 2021 at 12:23 PM Timo Walther  wrote:

> Hi everyone,
>
> I'm sending this email to make sure everyone is on the same page about
> slowly deprecating the DataSet API.
>
> There have been a few thoughts mentioned in presentations, offline
> discussions, and JIRA issues. However, I have observed that there are
> still some concerns or different opinions on what steps are necessary to
> implement this change.
>
> Let me summarize some of the steps and assumptions and let's have a
> discussion about it:
>
> Step 1: Introduce a batch mode for Table API (FLIP-32)
> [DONE in 1.9]
>
> Step 2: Introduce a batch mode for DataStream API (FLIP-134)
> [DONE in 1.12]
>
> Step 3: Soft deprecate DataSet API (FLIP-131)
> [DONE in 1.12]
>
> We updated the documentation recently to make this deprecation even more
> visible. There is a dedicated `(Legacy)` label right next to the menu
> item now.
>
> We won't deprecate concrete classes of the API with a @Deprecated
> annotation to avoid extensive warnings in logs until then.
>
> Step 4: Drop the legacy SQL connectors and formats (FLINK-14437)
> [DONE in 1.14]
>
> We dropped code for ORC, Parquet, and HBase formats that were only used
> by DataSet API users. The removed classes had no documentation and were
> not annotated with one of our API stability annotations.
>
> The old functionality should be available through the new sources and
> sinks for Table API and DataStream API. If not, we should bring them
> into a shape that they can be a full replacement.
>
> DataSet users are encouraged to either upgrade the API or use Flink
> 1.13. Users can either just stay at Flink 1.13 or copy only the format's
> code to a newer Flink version. We aim to keep the core interfaces (i.e.
> InputFormat and OutputFormat) stable until the next major version.
>
> We will maintain/allow important contributions to dropped connectors in
> 1.13. So 1.13 could be considered as kind of a DataSet API LTS release.
>
> Step 5: Drop the legacy SQL planner (FLINK-14437)
> [DONE in 1.14]
>
> This included dropping support of DataSet API with SQL.
>
> Step 6: Connect both Table and DataStream API in batch mode (FLINK-20897)
> [PLANNED in 1.14]
>
> Step 7: Reach feature parity of Table API/DataStream API with DataSet API
> [PLANNED for 1.14++]
>
> We need to identify blockers when migrating from DataSet API to Table
> API/DataStream API. Here we need to establish a good feedback pipeline
> to include DataSet users in the roadmap planning.
>
> Step 8: Drop the Gelly library
>
> No concrete plan yet. Latest would be the next major Flink version aka
> Flink 2.0.
>
> Step 9: Drop DataSet API
>
> Planned for the next major Flink version aka Flink 2.0.
>
>
> Please let me know if this matches your thoughts. We can also convert
> this into a blog post or mention it in the next release notes.
>
> Regards,
> Timo
>
>


Re: [DISCUSS] Feedback Collection Jira Bot

2021-06-23 Thread JING ZHANG
Hi Konstantin, Chesnay,

> I would like it to not unassign people if a PR is open. These are
> usually blocked by the reviewer, not the assignee, and having the
> assignees now additionally having to update JIRA periodically is a bit
> like rubbing salt into the wound.

I agree with Chesnay about not un-assign an issue if a PR is open.
Besides, could assignees remove the "stale-assigned" tag by themselves? It
seems assignees have no permission to remove the tag if they did not create
the issue.

Best regards,
JING ZHANG

Konstantin Knauf  于2021年6月23日周三 下午4:17写道:

> > I agree there are such tickets, but I don't see how this is addressing my
> concerns. There are also tickets that just shouldn't be closed as I
> described above. Why do you think that duplicating tickets and losing
> discussions/knowledge is a good solution?
>
> I don't understand why we are necessarily losing discussion/knowledge. The
> tickets are still there, just in "Closed" state, which are included in
> default Jira search. We could of course just add a label, but closing seems
> clearer to me given that likely this ticket will not get committer attention
> in the foreseeable future.
>
> > I would like to avoid having to constantly fight against the bot. It's
> already responsible for the majority of my daily emails, with quite little
> benefit for me personally. I initially thought that after some period of
> time it will settle down, but now I'm afraid it won't happen.
>
> Can you elaborate which rules you are running into mostly? I'd rather like
> to understand how we work right now and where this conflicts with the Jira
> bot vs slowly disabling the jira bot via labels.
>
> On Wed, Jun 23, 2021 at 10:00 AM Piotr Nowojski 
> wrote:
>
> > Hi Konstantin,
> >
> > > In my opinion it is important that we close tickets eventually. There
> are
> > a
> > > lot of tickets (bugs, improvements, tech debt) that over time became
> > > irrelevant, out-of-scope, irreproducible, etc.  In my experience, these
> > > tickets are usually not closed by anyone but the bot.
> >
> > I agree there are such tickets, but I don't see how this is addressing my
> > concerns. There are also tickets that just shouldn't be closed as I
> > described above. Why do you think that duplicating tickets and losing
> > discussions/knowledge is a good solution?
> >
> > I would like to avoid having to constantly fight against the bot. It's
> > already responsible for the majority of my daily emails, with quite
> little
> > benefit for me personally. I initially thought that after some period of
> > time it will settle down, but now I'm afraid it won't happen. Can we add
> > some label to mark tickets to be ignored by the jira-bot?
> >
> > Best,
> > Piotrek
> >
> > śr., 23 cze 2021 o 09:40 Chesnay Schepler 
> napisał(a):
> >
> > > I would like it to not unassign people if a PR is open. These are
> > > usually blocked by the reviewer, not the assignee, and having the
> > > assignees now additionally having to update JIRA periodically is a bit
> > > like rubbing salt into the wound.
> > >
> > > On 6/23/2021 7:52 AM, Konstantin Knauf wrote:
> > > > Hi everyone,
> > > >
> > > > I was hoping for more feedback from other committers, but seems like
> > this
> > > > is not happening, so here's my proposal for immediate changes:
> > > >
> > > > * Ignore tickets with a fixVersion for all rules but the
> > stale-unassigned
> > > > role.
> > > >
> > > > * We change the time intervals as follows, accepting reality a bit
> more
> > > ;)
> > > >
> > > >  * stale-assigned only after 30 days (instead of 14 days)
> > > >  * stale-critical only after 14 days (instead of 7 days)
> > > >  * stale-major only after 60 days (instead of 30 days)
> > > >
> > > > Unless there are -1s, I'd implement the changes Monday next week.
> > > >
> > > > Cheers,
> > > >
> > > > Konstantin
> > > >
> > > > On Thu, Jun 17, 2021 at 2:17 PM Piotr Nowojski  >
> > > wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I also think that the bot is a bit too aggressive/too quick with
> > > assigning
> > > >> stale issues/deprioritizing them, but that's not that big of a deal
> > for
> > > me.
> > > >>
> > > >> What bothers me much more is that it's closing minor issues
> > > automatically.
> > > >> Depriotising issues makes sense to me. If a wish for improvement or
> a
> > > bug
> > > >> report has been opened a long time ago, and they got no attention
> over
> > > the
> > > >> time, sure depriotize them. But closing them is IMO a bad idea. Bug
> > > might
> > > >> be minor, but if it's not fixed it's still there - it shouldn't be
> > > closed.
> > > >> Closing with "won't fix" should be done for very good reasons and
> very
> > > >> rarely. Same applies to improvements/wishes. Furthermore, very often
> > > >> descriptions and comments have a lot of value, and if we keep
> closing
> > > minor
> > > >> issues I'm afraid that we end up with:
> > > >> - more duplication. I doubt anyone will be looking for prior
> 

Re: [DISCUSS] Drop Mesos in 1.14

2021-06-23 Thread Stephan Ewen
I would prefer to remove Mesos from the Flink core as well.

I also had a similar thought as Seth: As far as I know, you can package
applications to run on Mesos with "Marathon". That would be like deploying
an opaque Flink standalone cluster on Mesos.
The implication is similar to going from an active integration to a
standalone cluster (like from native Flink Kubernetes Application
Deployment to a Standalone Application Deployment on Kubernetes): You need
to make sure the number of TMs / slots and the parallelism fit together (or
use the new reactive mode). Other than that, I think it should work well
for streaming jobs.

Having a Flink-Marathon template in https://flink-packages.org/ would be a
nice thing for Mesos users.

@Oleksandr What do you think about that?

On Wed, Jun 23, 2021 at 11:31 AM Leonard Xu  wrote:

> + 1 for dropping Mesos. I checked both the commit history and the mailing list;
> Mesos-related issues/user questions have rarely appeared.
>
> Best,
> Leonard
>
>


Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-23 Thread Stephan Ewen
+1 to Xintong's proposal

On Wed, Jun 23, 2021 at 1:53 PM Till Rohrmann  wrote:

> I would first try to not introduce the exception for local builds. It makes
> it quite hard for others to verify the build and to make sure that the
> right things were executed. If we see that this becomes an issue then we
> can revisit this idea.
>
> Cheers,
> Till
>
> On Wed, Jun 23, 2021 at 4:19 AM Yangze Guo  wrote:
>
> > +1 for appending this to community guidelines for merging PRs.
> >
> > @Till Rohrmann
> > I agree that with this approach unstable tests will not block other
> > commit merges. However, it might be hard to prevent merging commits
> > that are related to those tests and should have been passed them. It's
> > true that this judgment can be made by the committers, but no one can
> > ensure the judgment is always precise and so that we have this
> > discussion thread.
> >
> > Regarding the unstable tests, how about adding another exception:
> > committers verify it in their local environment and comment in such
> > cases?
> >
> > Best,
> > Yangze Guo
> >
> > On Tue, Jun 22, 2021 at 8:23 PM 刘建刚  wrote:
> > >
> > > It is a good principle to run all tests successfully with any change.
> > This
> > > means a lot for project's stability and development. I am big +1 for
> this
> > > proposal.
> > >
> > > Best
> > > liujiangang
> > >
> > > Till Rohrmann  于2021年6月22日周二 下午6:36写道:
> > >
> > > > One way to address the problem of regularly failing tests that block
> > > > merging of PRs is to disable the respective tests for the time being.
> > Of
> > > > course, the failing test then needs to be fixed. But at least that
> way
> > we
> > > > would not block everyone from making progress.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Tue, Jun 22, 2021 at 12:00 PM Arvid Heise 
> wrote:
> > > >
> > > > > I think this is overall a good idea. So +1 from my side.
> > > > > However, I'd like to put a higher priority on infrastructure then,
> in
> > > > > particular docker image/artifact caches.
> > > > >
> > > > > On Tue, Jun 22, 2021 at 11:50 AM Till Rohrmann <
> trohrm...@apache.org
> > >
> > > > > wrote:
> > > > >
> > > > > > Thanks for bringing this topic to our attention Xintong. I think
> > your
> > > > > > proposal makes a lot of sense and we should follow it. It will
> > give us
> > > > > > confidence that our changes are working and it might be a good
> > > > incentive
> > > > > to
> > > > > > quickly fix build instabilities. Hence, +1.
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Tue, Jun 22, 2021 at 11:12 AM Xintong Song <
> > tonysong...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > In the past a couple of weeks, I've observed several times that
> > PRs
> > > > are
> > > > > > > merged without a green light from the CI tests, where failure
> > cases
> > > > are
> > > > > > > considered *unrelated*. This may not always cause problems, but
> > would
> > > > > > > increase the chance of breaking our code base. In fact, it has
> > > > occurred
> > > > > > to
> > > > > > > me twice in the past few weeks that I had to revert a commit
> > which
> > > > > breaks
> > > > > > > the master branch due to this.
> > > > > > >
> > > > > > > I think it would be nicer to enforce a stricter rule, that no
> PRs
> > > > > should
> > > > > > be
> > > > > > > merged without passing CI.
> > > > > > >
> > > > > > > The problems of merging PRs with "unrelated" test failures are:
> > > > > > > - It's not always straightforward to tell whether a test
> > failures are
> > > > > > > related or not.
> > > > > > > - It prevents subsequent test cases from being executed, which
> > may
> > > > fail
> > > > > > > relating to the PR changes.
> > > > > > >
> > > > > > > To make things easier for the committers, the following
> > exceptions
> > > > > might
> > > > > > be
> > > > > > > considered acceptable.
> > > > > > > - The PR has passed CI in the contributor's personal workspace.
> > > > Please
> > > > > > post
> > > > > > > the link in such cases.
> > > > > > > - The CI tests have been triggered multiple times, on the same
> > > > commit,
> > > > > > and
> > > > > > > each stage has at least passed for once. Please also comment in
> > such
> > > > > > cases.
> > > > > > >
> > > > > > > If we all agree on this, I'd update the community guidelines
> for
> > > > > merging
> > > > > > > PRs wrt. this proposal. [1]
> > > > > > >
> > > > > > > Please let me know what do you think.
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > > [1]
> > > > > > >
> > > > >
> > https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests
> > > > > > >
> > > > > >
> > > > >
> > > >
> >
>


Re: Change in accumulators semantics with jobClient

2021-06-23 Thread Till Rohrmann
Yes, it should be part of the release notes where this change was
introduced. I'll take a look at your PR. Thanks a lot Etienne.

Cheers,
Till

On Wed, Jun 23, 2021 at 12:29 PM Etienne Chauchot 
wrote:

> Hi Till,
>
> Of course I can update the release notes.
>
> Question is: this change is quite old (January), it is already available
> in all the maintained releases :1.11, 1.12, 1.13.
>
> I think I should update the release notes for all these versions no ?
>
> In case you agree, I took the liberty to update all these release notes
> in a PR: https://github.com/apache/flink/pull/16256
>
> Cheers,
>
> Etienne
>
> On 21/06/2021 11:39, Till Rohrmann wrote:
> > Thanks for bringing this to the dev ML Etienne. Could you maybe update
> the
> > release notes for Flink 1.13 [1] to include this change? That way it
> might
> > be a bit more prominent. I think the change needs to go into the
> > release-1.13 and master branch.
> >
> > [1]
> >
> https://github.com/apache/flink/blob/master/docs/content/release-notes/flink-1.13.md
> >
> > Cheers,
> > Till
> >
> >
> > On Fri, Jun 18, 2021 at 2:45 PM Etienne Chauchot 
> > wrote:
> >
> >> Hi all,
> >>
> >> I did a fix some time ago regarding accumulators:
> >> the/JobClient.getAccumulators()/ was infinitely  blocking in local
> >> environment for a streaming job (1). The change (2) consisted of giving
> >> the current accumulators value for the running job. And when fixing this
> >> in the PR, it appeared that I had to change the accumulators semantics
> >> with /JobClient/ and I just realized that I forgot to bring this back to
> >> the ML:
> >>
> >> Previously /JobClient/ assumed that getAccumulator() was called on a
> >> bounded pipeline and that the user wanted to acquire the *final
> >> accumulator values* after the job is finished.
> >>
> >> But now it returns the *current value of accumulators* immediately to be
> >> compatible with unbounded pipelines.
> >>
> >> If it is run on a bounded pipeline, then to get the final accumulator
> >> values after the job is finished, one needs to call
> >>
> >>
> /getJobExecutionResult().thenApply(JobExecutionResult::getAllAccumulatorResults)/
> >>
> >> (1): https://issues.apache.org/jira/browse/FLINK-18685
> >>
> >> (2): https://github.com/apache/flink/pull/14558#
> >>
> >>
> >> Cheers,
> >>
> >> Etienne
> >>
> >>
>


Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-23 Thread Till Rohrmann
I would first try to not introduce the exception for local builds. It makes
it quite hard for others to verify the build and to make sure that the
right things were executed. If we see that this becomes an issue then we
can revisit this idea.

Cheers,
Till

On Wed, Jun 23, 2021 at 4:19 AM Yangze Guo  wrote:

> +1 for appending this to community guidelines for merging PRs.
>
> @Till Rohrmann
> I agree that with this approach unstable tests will not block other
> commit merges. However, it might be hard to prevent merging commits
> that are related to those tests and should have been passed them. It's
> true that this judgment can be made by the committers, but no one can
> ensure the judgment is always precise and so that we have this
> discussion thread.
>
> Regarding the unstable tests, how about adding another exception:
> committers verify it in their local environment and comment in such
> cases?
>
> Best,
> Yangze Guo
>
> On Tue, Jun 22, 2021 at 8:23 PM 刘建刚  wrote:
> >
> > It is a good principle to run all tests successfully with any change.
> This
> > means a lot for project's stability and development. I am big +1 for this
> > proposal.
> >
> > Best
> > liujiangang
> >
> > Till Rohrmann  于2021年6月22日周二 下午6:36写道:
> >
> > > One way to address the problem of regularly failing tests that block
> > > merging of PRs is to disable the respective tests for the time being.
> Of
> > > course, the failing test then needs to be fixed. But at least that way
> we
> > > would not block everyone from making progress.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Tue, Jun 22, 2021 at 12:00 PM Arvid Heise  wrote:
> > >
> > > > I think this is overall a good idea. So +1 from my side.
> > > > However, I'd like to put a higher priority on infrastructure then, in
> > > > particular docker image/artifact caches.
> > > >
> > > > On Tue, Jun 22, 2021 at 11:50 AM Till Rohrmann  >
> > > > wrote:
> > > >
> > > > > Thanks for bringing this topic to our attention Xintong. I think
> your
> > > > > proposal makes a lot of sense and we should follow it. It will
> give us
> > > > > confidence that our changes are working and it might be a good
> > > incentive
> > > > to
> > > > > quickly fix build instabilities. Hence, +1.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Tue, Jun 22, 2021 at 11:12 AM Xintong Song <
> tonysong...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > In the past a couple of weeks, I've observed several times that
> PRs
> > > are
> > > > > > merged without a green light from the CI tests, where failure
> cases
> > > are
> > > > > > considered *unrelated*. This may not always cause problems, but
> would
> > > > > > increase the chance of breaking our code base. In fact, it has
> > > occurred
> > > > > to
> > > > > > me twice in the past few weeks that I had to revert a commit
> which
> > > > breaks
> > > > > > the master branch due to this.
> > > > > >
> > > > > > I think it would be nicer to enforce a stricter rule, that no PRs
> > > > should
> > > > > be
> > > > > > merged without passing CI.
> > > > > >
> > > > > > The problems of merging PRs with "unrelated" test failures are:
> > > > > > - It's not always straightforward to tell whether a test
> failures are
> > > > > > related or not.
> > > > > > - It prevents subsequent test cases from being executed, which
> may
> > > fail
> > > > > > relating to the PR changes.
> > > > > >
> > > > > > To make things easier for the committers, the following
> exceptions
> > > > might
> > > > > be
> > > > > > considered acceptable.
> > > > > > - The PR has passed CI in the contributor's personal workspace.
> > > Please
> > > > > post
> > > > > > the link in such cases.
> > > > > > - The CI tests have been triggered multiple times, on the same
> > > commit,
> > > > > and
> > > > > > each stage has at least passed for once. Please also comment in
> such
> > > > > cases.
> > > > > >
> > > > > > If we all agree on this, I'd update the community guidelines for
> > > > merging
> > > > > > PRs wrt. this proposal. [1]
> > > > > >
> > > > > > Please let me know what do you think.
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > > [1]
> > > > > >
> > > >
> https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests
> > > > > >
> > > > >
> > > >
> > >
>


Re: [DISCUSS] Feedback Collection Jira Bot

2021-06-23 Thread Piotr Nowojski
> I don't understand why we are necessarily losing discussion/knowledge. The
> tickets are still there, just in "Closed" state, which are included in
> default Jira search.

Finding if there already has been a ticket opened for the given issue is
not always easy. Finding the right ticket among 23086 is 7 times as
difficult/time consuming as among 3305 open tickets. If a piece of
knowledge/discussion is not easily accessible, it's effectively lost.

> We could of course just add a label, but closing seems
> clearer to me given that likely this ticket will not get committer attention
> in the foreseeable future.

There are tickets that are waiting to get enough traction (bugs,
improvements, ideas, test instabilities). I know plenty of those. If they
are being brought up frequently enough, they will finally get the needed
attention. Until this happens, I don't like to be losing the descriptions,
previous discussions and/or a frequency of past occurrences.

Can I ask, why do you think it makes sense to be closing those tickets
besides it "being clearer" to you? What use case is justifying this? And I
don't agree it's clearer. If the issue is still there, it shouldn't be in
the "CLOSED" state.

> Can you elaborate which rules you are running into mostly? I'd rather like
> to understand how we work right now and where this conflicts with the Jira
> bot vs slowly disabling the jira bot via labels.

I didn't count them, but I think stale critical -> stale major -> stale
minor -> auto closing I'm getting the most. If a ticket is not relevant
anymore I've learned to manually close it/clean up immediately once the
jira-bot pings about it regardless of the priority. But so far, I've closed
fewer tickets than I was forced to re-open. Maybe this is because I'm
tracking all of the tickets that are of interest to my team? Maybe others
are not doing that and that's why you are not seeing this problem that I'm
having?

But keep in mind. I don't mind about auto deprioritization. It's fair to
say that tickets get automatically deprioritised if they have no attention.
But why do we have to automatically close the least priority ones?

Maybe another idea. Instead of disabling closing the tickets via some
label, we could also achieve the same thing with a dedicated lowest
priority state "on hold"/"frozen".

Piotrek

śr., 23 cze 2021 o 10:17 Konstantin Knauf 
napisał(a):

> > I agree there are such tickets, but I don't see how this is addressing my
> concerns. There are also tickets that just shouldn't be closed as I
> described above. Why do you think that duplicating tickets and losing
> discussions/knowledge is a good solution?
>
> I don't understand why we are necessarily losing discussion/knowledge. The
> tickets are still there, just in "Closed" state, which are included in
> default Jira search. We could of course just add a label, but closing seems
> clearer to me given that likely this ticket will not get committer attention
> in the foreseeable future.
>
> > I would like to avoid having to constantly fight against the bot. It's
> already responsible for the majority of my daily emails, with quite little
> benefit for me personally. I initially thought that after some period of
> time it will settle down, but now I'm afraid it won't happen.
>
> Can you elaborate which rules you are running into mostly? I'd rather like
> to understand how we work right now and where this conflicts with the Jira
> bot vs slowly disabling the jira bot via labels.
>
> On Wed, Jun 23, 2021 at 10:00 AM Piotr Nowojski 
> wrote:
>
> > Hi Konstantin,
> >
> > > In my opinion it is important that we close tickets eventually. There
> are
> > a
> > > lot of tickets (bugs, improvements, tech debt) that over time became
> > > irrelevant, out-of-scope, irreproducible, etc.  In my experience, these
> > > tickets are usually not closed by anyone but the bot.
> >
> > I agree there are such tickets, but I don't see how this is addressing my
> > concerns. There are also tickets that just shouldn't be closed as I
> > described above. Why do you think that duplicating tickets and losing
> > discussions/knowledge is a good solution?
> >
> > I would like to avoid having to constantly fight against the bot. It's
> > already responsible for the majority of my daily emails, with quite
> little
> > benefit for me personally. I initially thought that after some period of
> > time it will settle down, but now I'm afraid it won't happen. Can we add
> > some label to mark tickets to be ignored by the jira-bot?
> >
> > Best,
> > Piotrek
> >
> > śr., 23 cze 2021 o 09:40 Chesnay Schepler 
> napisał(a):
> >
> > > I would like it to not unassign people if a PR is open. These are
> > > usually blocked by the reviewer, not the assignee, and having the
> > > assignees now additionally having to update JIRA periodically is a bit
> > > like rubbing salt into the wound.
> > >
> > > On 6/23/2021 7:52 AM, Konstantin Knauf wrote:
> > > > Hi everyone,
> > > >
> > > > I was hoping for mo

Re: Change in accumulators semantics with jobClient

2021-06-23 Thread Etienne Chauchot

Hi Till,

Of course I can update the release notes.

Question is: this change is quite old (January), it is already available 
in all the maintained releases :1.11, 1.12, 1.13.


I think I should update the release notes for all these versions no ?

In case you agree, I took the liberty to update all these release notes 
in a PR: https://github.com/apache/flink/pull/16256


Cheers,

Etienne

On 21/06/2021 11:39, Till Rohrmann wrote:

Thanks for bringing this to the dev ML Etienne. Could you maybe update the
release notes for Flink 1.13 [1] to include this change? That way it might
be a bit more prominent. I think the change needs to go into the
release-1.13 and master branch.

[1]
https://github.com/apache/flink/blob/master/docs/content/release-notes/flink-1.13.md

Cheers,
Till


On Fri, Jun 18, 2021 at 2:45 PM Etienne Chauchot 
wrote:


Hi all,

I did a fix some time ago regarding accumulators:
the/JobClient.getAccumulators()/ was infinitely  blocking in local
environment for a streaming job (1). The change (2) consisted of giving
the current accumulators value for the running job. And when fixing this
in the PR, it appeared that I had to change the accumulators semantics
with /JobClient/ and I just realized that I forgot to bring this back to
the ML:

Previously /JobClient/ assumed that getAccumulator() was called on a
bounded pipeline and that the user wanted to acquire the *final
accumulator values* after the job is finished.

But now it returns the *current value of accumulators* immediately to be
compatible with unbounded pipelines.

If it is run on a bounded pipeline, then to get the final accumulator
values after the job is finished, one needs to call

/getJobExecutionResult().thenApply(JobExecutionResult::getAllAccumulatorResults)/

(1): https://issues.apache.org/jira/browse/FLINK-18685

(2): https://github.com/apache/flink/pull/14558#


Cheers,

Etienne
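
To make the new semantics concrete, a small sketch (assuming Flink 1.12+ where the
JobClient methods take no ClassLoader argument, an existing StreamExecutionEnvironment
named env with a pipeline already defined, and that blocking get() calls are acceptable):

    JobClient jobClient = env.executeAsync("accumulator-demo");

    // Returns the *current* accumulator values of the running job,
    // so it no longer blocks for unbounded/streaming pipelines.
    Map<String, Object> current = jobClient.getAccumulators().get();

    // For bounded pipelines, the *final* values once the job has finished:
    Map<String, Object> finalValues =
            jobClient.getJobExecutionResult()
                    .thenApply(JobExecutionResult::getAllAccumulatorResults)
                    .get();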




[DISCUSS] Incrementally deprecating the DataSet API

2021-06-23 Thread Timo Walther

Hi everyone,

I'm sending this email to make sure everyone is on the same page about 
slowly deprecating the DataSet API.


There have been a few thoughts mentioned in presentations, offline 
discussions, and JIRA issues. However, I have observed that there are 
still some concerns or different opinions on what steps are necessary to 
implement this change.


Let me summarize some of the steps and assumptions and let's have a 
discussion about it:


Step 1: Introduce a batch mode for Table API (FLIP-32)
[DONE in 1.9]

Step 2: Introduce a batch mode for DataStream API (FLIP-134)
[DONE in 1.12]

Step 3: Soft deprecate DataSet API (FLIP-131)
[DONE in 1.12]

We updated the documentation recently to make this deprecation even more 
visible. There is a dedicated `(Legacy)` label right next to the menu 
item now.


We won't deprecate concrete classes of the API with a @Deprecated 
annotation to avoid extensive warnings in logs until then.


Step 4: Drop the legacy SQL connectors and formats (FLINK-14437)
[DONE in 1.14]

We dropped code for ORC, Parquet, and HBase formats that were only used 
by DataSet API users. The removed classes had no documentation and were 
not annotated with one of our API stability annotations.


The old functionality should be available through the new sources and 
sinks for Table API and DataStream API. If not, we should bring them 
into a shape that they can be a full replacement.


DataSet users are encouraged to either upgrade the API or use Flink 
1.13. Users can either just stay at Flink 1.13 or copy only the format's 
code to a newer Flink version. We aim to keep the core interfaces (i.e. 
InputFormat and OutputFormat) stable until the next major version.


We will maintain/allow important contributions to dropped connectors in 
1.13. So 1.13 could be considered as kind of a DataSet API LTS release.


Step 5: Drop the legacy SQL planner (FLINK-14437)
[DONE in 1.14]

This included dropping support of DataSet API with SQL.

Step 6: Connect both Table and DataStream API in batch mode (FLINK-20897)
[PLANNED in 1.14]

Step 7: Reach feature parity of Table API/DataStream API with DataSet API
[PLANNED for 1.14++]

We need to identify blockers when migrating from DataSet API to Table 
API/DataStream API. Here we need to establish a good feedback pipeline 
to include DataSet users in the roadmap planning.


Step 8: Drop the Gelly library

No concrete plan yet. Latest would be the next major Flink version aka 
Flink 2.0.


Step 9: Drop DataSet API

Planned for the next major Flink version aka Flink 2.0.


Please let me know if this matches your thoughts. We can also convert 
this into a blog post or mention it in the next release notes.


Regards,
Timo
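
For readers following steps 2 and 6, a minimal sketch of running a DataStream program in
batch execution mode (FLIP-134, available since 1.12); the class name, data and job name
below are made up for illustration:

    import org.apache.flink.api.common.RuntimeExecutionMode;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class BatchModeSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // FLIP-134: execute this DataStream program with batch semantics on bounded input.
            env.setRuntimeMode(RuntimeExecutionMode.BATCH);

            env.fromElements("flink", "data", "stream", "data")
                    .map(String::toUpperCase)
                    .print();

            env.execute("batch-mode-sketch");
        }
    }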



[jira] [Created] (FLINK-23121) Fix the issue that the InternalRow as arguments in Python UDAF

2021-06-23 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-23121:


 Summary: Fix the issue that the InternalRow as arguments in Python 
UDAF
 Key: FLINK-23121
 URL: https://issues.apache.org/jira/browse/FLINK-23121
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.13.1
Reporter: Huang Xingbo
Assignee: Huang Xingbo
 Fix For: 1.13.2


The problem is reported from
https://stackoverflow.com/questions/68026832/pyflink-udaf-internalrow-vs-row

In release 1.14, we have reconstructed the coders and fixed this problem, so the 
problem only exists in the 1.13 line.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: CONTENTS DELETED in nabble frontend

2021-06-23 Thread Robert Metzger
I've set up the nabble archives back in the stone age of Flink, when the
Apache archive didn't provide a very modern user experience. Since
lists.apache.org exists, we don't really need nabble anymore.

I'll open a pull request to replace the links to nabble to point to
lists.apache.org on the community page:
https://flink.apache.org/community.html

I'll also look into updating the description of our nabble groups to link
to the new archive.


On Wed, Jun 23, 2021 at 9:57 AM Dawid Wysakowicz 
wrote:

> Hey,
>
> As far as I know the official Apache ML archive can be accessed here[1].
> Personally I don't know what is the status of the nabble archives.
>
> Best,
>
> Dawid
>
> [1] https://lists.apache.org/list.html?dev@flink.apache.org
>
> On 23/06/2021 09:08, Matthias Pohl wrote:
> > Thanks for pointing to the Nabble support forum. +1 Based on [1], the
> > deletion of posts is not related to the switch of mailing lists becoming
> > regular forums. But it seems to be a general issue at Nabble.
> > But what concerns me is [2]: It looks like they are planning to remove
> the
> > feature to post through email which is actually our way of collecting the
> > posts.
> >
> > @Robert: Is this Apache Flink Mailing list/Nabble system an Apache-wide
> > setup used in any other Apache project as well? Or is this something we
> > came up with?
> >
> > Do we have a fallback system or backups of messages?
> >
> > Matthias
> >
> > [1]
> >
> http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html
> > [2]
> >
> http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html
> >
> > On Wed, Jun 23, 2021 at 8:45 AM Yangze Guo  wrote:
> >
> >> Ahh. It seems nabble has updated mailing lists to regular forums this
> >> week[1].
> >>
> >> [1]
> >>
> http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html
> >>
> >> Best,
> >> Yangze Guo
> >>
> >> On Wed, Jun 23, 2021 at 2:37 PM Yangze Guo  wrote:
> >>> It seems the post will remain iff it is sent by a registered email. I
> >>> do not register nabble in user ML and my post is deleted in [1].
> >>>
> >>> [1]
> >>
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/after-upgrade-flink1-12-to-flink1-13-1-flink-web-ui-s-taskmanager-detail-page-error-tt44391.html
> >>> Best,
> >>> Yangze Guo
> >>>
> >>> On Wed, Jun 23, 2021 at 2:16 PM Matthias Pohl 
> >> wrote:
>  Hi everyone,
>  Is it only me or does anyone else have the same problem with messages
> >> being
>  not available anymore in the nabble frontend? I get multiple messages
> >> like
>  the following one for individual messages:
> > CONTENTS DELETED
> > The author has deleted this message.
>  This appears for instance in [1], where all the messages are deleted
> >> except
>  for Till Rohrmann's, Yangze Guo's and mine. This issue is not limited
> >> to
>  the dev mailing list but also seem to appear in the user mailing list
> >> (e.g.
>  [2]).
> 
>  Logging into nabble doesn't solve the problem. I'd assume that it's
> >> some
>  infrastructure issue rather than people collectively deleting their
>  messages. But a Google search wasn't of any help.
> 
>  Matthias
> 
>  [1]
> 
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html
>  [2]
> 
> >>
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/1-1-snapshot-issues-td6971.html#a6973
>
>


Re: [DISCUSS] Drop Mesos in 1.14

2021-06-23 Thread Leonard Xu
+ 1 for dropping Mesos. I checked both the commit history and the mailing list; 
Mesos-related issues/user questions have rarely appeared. 

Best,
Leonard



Re: [DISCUSS] Drop Mesos in 1.14

2021-06-23 Thread Fabian Paul
+ 1 for dropping Mesos. Most of the PMC members have already left the project [1] and 
a move to the Attic was barely avoided. Overall, Kubernetes has taken its place and it 
is unlikely that we will see a surge in Mesos any time soon.

Best,
Fabian


[1] 
https://lists.apache.org/thread.html/rab2a820507f7c846e54a847398ab20f47698ec5bce0c8e182bfe51ba%40%3Cdev.mesos.apache.org%3E

[jira] [Created] (FLINK-23120) ByteArrayWrapperSerializer.serialize should use writeInt to serialize the length

2021-06-23 Thread Dian Fu (Jira)
Dian Fu created FLINK-23120:
---

 Summary: ByteArrayWrapperSerializer.serialize should use writeInt 
to serialize the length
 Key: FLINK-23120
 URL: https://issues.apache.org/jira/browse/FLINK-23120
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.13.0, 1.12.0
Reporter: Dian Fu
Assignee: Dian Fu
 Fix For: 1.12.5, 1.13.2
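
As general background (a sketch with plain java.io, not the Flink-internal
ByteArrayWrapperSerializer or DataOutputView): write(int) emits only the low-order
byte, while writeInt(int) emits all four bytes of a length.

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class LengthEncodingSketch {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);

        int length = 300; // a length that does not fit into a single byte

        out.write(length);    // write(int) emits only the low 8 bits: 300 & 0xFF == 44
        out.writeInt(length); // writeInt(int) emits all 4 bytes: 00 00 01 2C

        out.flush();
        System.out.println("bytes written: " + bos.size()); // 1 + 4 = 5
    }
}
{code}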






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Drop Mesos in 1.14

2021-06-23 Thread Yang Wang
+1 for dropping mesos support.

AFAIK, Mesos (including Marathon for container management) is phasing
out gradually and has been replaced with Kubernetes in the containerized
world.


Best,
Yang

Matthias Pohl  于2021年6月23日周三 下午2:04写道:

> +1 for dropping Mesos support. There was no feedback opposing the direction
> from the community in the most-recent discussion [1,2] on deprecating it.
>
> Matthias
>
> [1]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html
> [2]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Deprecating-Mesos-support-td50142.html
>
> On Wed, Jun 23, 2021 at 4:21 AM Yangze Guo  wrote:
>
> > +1 for dropping if there is no strong demand from the community.
> >
> > I'm willing to help with the removal of e2e tests part.
> >
> > Best,
> > Yangze Guo
> >
> > On Wed, Jun 23, 2021 at 10:09 AM Xintong Song 
> > wrote:
> > >
> > > +1 for dropping.
> > >
> > > I like Seth's idea. I don't have any real Mesos experience either.
> > > According to this article [1], it looks like we can deploy a standalone
> > > cluster on Mesos similar to Kubernetes. However, we should only do it
> if
> > > there's indeed a strong demand from the community for deploying a
> > > latest version of Flink on Mesos.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > > [1] https://www.baeldung.com/ops/mesos-kubernetes-comparison
> > >
> > > On Tue, Jun 22, 2021 at 11:59 PM Israel Ekpo 
> > wrote:
> > >
> > > > I am in favor of dropping the support for Mesos.
> > > >
> > > > In terms of the landscape for users leveraging Mesos for the kind of
> > > > workloads Flink is used, I think it is on the decline.
> > > >
> > > > +1 from me
> > > >
> > > > On Tue, Jun 22, 2021 at 11:32 AM Seth Wiesman 
> > wrote:
> > > >
> > > > > Sorry if this is a naive question, I don't have any real Mesos
> > > > experience.
> > > > > Is it possible to deploy a standalone cluster on top of Mesos in
> the
> > same
> > > > > way you can with Kubernetes? If so, and there is still Mesos demand
> > from
> > > > > the community, we could document that process as the recommended
> > > > deployment
> > > > > mode going forward.
> > > > >
> > > > > Seth
> > > > >
> > > > > On Tue, Jun 22, 2021 at 5:02 AM Arvid Heise 
> > wrote:
> > > > >
> > > > > > +1 for dropping. Frankly speaking, I don't see it having any
> future
> > > > (and
> > > > > > D2iQ
> > > > > > agrees).
> > > > > >
> > > > > > If there is a surprisingly huge demand, I'd try to evaluate
> > plugins for
> > > > > it.
> > > > > >
> > > > > > On Tue, Jun 22, 2021 at 11:46 AM Till Rohrmann <
> > trohrm...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > I'd be ok with dropping support for Mesos if it helps us to
> > clear our
> > > > > > > dependencies in the flink-runtime module. If we do it, then we
> > should
> > > > > > > probably update our documentation with a pointer to the latest
> > Flink
> > > > > > > version that supports Mesos in case of users strictly need
> Mesos.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Till
> > > > > > >
> > > > > > > On Tue, Jun 22, 2021 at 10:29 AM Chesnay Schepler <
> > > > ches...@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Last week I spent some time looking into making flink-runtime
> > scala
> > > > > > > > free, which effectively means to move the Akka-reliant
> classes
> > to
> > > > > > > > another module, and load that module along with Akka and all
> of
> > > > it's
> > > > > > > > dependencies (including Scala) through a separate
> classloader.
> > > > > > > >
> > > > > > > > This would finally decouple the Scala versions required by
> the
> > > > > runtime
> > > > > > > > and API, and would allow us to upgrade Akka as we'd no longer
> > be
> > > > > > limited
> > > > > > > > to Scala 2.11. It would rid the classpath of a few
> > dependencies,
> > > > and
> > > > > > > > remove the need for scala suffixes on quite a few modules.
> > > > > > > >
> > > > > > > > However, our Mesos support has unfortunately a hard
> dependency
> > on
> > > > > Akka,
> > > > > > > > which naturally does not play well with the goal of isolating
> > Akka
> > > > in
> > > > > > > > it's own ClassLoader.
> > > > > > > >
> > > > > > > > To solve this issue I was thinking of simple dropping
> > flink-mesos
> > > > in
> > > > > > > > 1.14 (it was deprecated in 1.13).
> > > > > > > >
> > > > > > > > Truth be told, I picked this option because it is the easiest
> > to
> > > > do.
> > > > > We
> > > > > > > > _could_ probably make things work somehow (likely by
> shipping a
> > > > > second
> > > > > > > > Akka version just for flink-mesos), but it doesn't seem worth
> > the
> > > > > > hassle
> > > > > > > > and would void some of the benefits. So far we kept
> flink-mesos
> > > > > around,
> > > > > > > > despite not really developing it further, because it didn't
> > hurt to
> > > > > > have
> > > > > > > > it in still in Flink, but this has now ch

[jira] [Created] (FLINK-23119) Fix the issue that the exception that General Python UDAF is unsupported is not thrown in Compile Stage.

2021-06-23 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-23119:


 Summary: Fix the issue that the exception that General Python UDAF 
is unsupported is not thrown in Compile Stage.
 Key: FLINK-23119
 URL: https://issues.apache.org/jira/browse/FLINK-23119
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.12.4, 1.13.1
Reporter: Huang Xingbo
Assignee: Huang Xingbo
 Fix For: 1.12.5, 1.13.2






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Feedback Collection Jira Bot

2021-06-23 Thread Konstantin Knauf
> I agree there are such tickets, but I don't see how this is addressing my
concerns. There are also tickets that just shouldn't be closed as I
described above. Why do you think that duplicating tickets and losing
discussions/knowledge is a good solution?

I don't understand why we are necessarily losing discussion/knowledge. The
tickets are still there, just in "Closed" state, which are included in
default Jira search. We could of course just add a label, but closing seems
clearer to me given that likely this ticket will not get committer attention
in the foreseeable future.

> I would like to avoid having to constantly fight against the bot. It's
already responsible for the majority of my daily emails, with quite little
benefit for me personally. I initially thought that after some period of
time it will settle down, but now I'm afraid it won't happen.

Can you elaborate which rules you are running into mostly? I'd rather like
to understand how we work right now and where this conflicts with the Jira
bot vs slowly disabling the jira bot via labels.

On Wed, Jun 23, 2021 at 10:00 AM Piotr Nowojski 
wrote:

> Hi Konstantin,
>
> > In my opinion it is important that we close tickets eventually. There are
> a
> > lot of tickets (bugs, improvements, tech debt) that over time became
> > irrelevant, out-of-scope, irreproducible, etc.  In my experience, these
> > tickets are usually not closed by anyone but the bot.
>
> I agree there are such tickets, but I don't see how this is addressing my
> concerns. There are also tickets that just shouldn't be closed as I
> described above. Why do you think that duplicating tickets and losing
> discussions/knowledge is a good solution?
>
> I would like to avoid having to constantly fight against the bot. It's
> already responsible for the majority of my daily emails, with quite little
> benefit for me personally. I initially thought that after some period of
> time it will settle down, but now I'm afraid it won't happen. Can we add
> some label to mark tickets to be ignored by the jira-bot?
>
> Best,
> Piotrek
>
> śr., 23 cze 2021 o 09:40 Chesnay Schepler  napisał(a):
>
> > I would like it to not unassign people if a PR is open. These are
> > usually blocked by the reviewer, not the assignee, and having the
> > assignees now additionally having to update JIRA periodically is a bit
> > like rubbing salt into the wound.
> >
> > On 6/23/2021 7:52 AM, Konstantin Knauf wrote:
> > > Hi everyone,
> > >
> > > I was hoping for more feedback from other committers, but seems like
> this
> > > is not happening, so here's my proposal for immediate changes:
> > >
> > > * Ignore tickets with a fixVersion for all rules but the
> stale-unassigned
> > > role.
> > >
> > > * We change the time intervals as follows, accepting reality a bit more
> > ;)
> > >
> > >  * stale-assigned only after 30 days (instead of 14 days)
> > >  * stale-critical only after 14 days (instead of 7 days)
> > >  * stale-major only after 60 days (instead of 30 days)
> > >
> > > Unless there are -1s, I'd implement the changes Monday next week.
> > >
> > > Cheers,
> > >
> > > Konstantin
> > >
> > > On Thu, Jun 17, 2021 at 2:17 PM Piotr Nowojski 
> > wrote:
> > >
> > >> Hi,
> > >>
> > >> I also think that the bot is a bit too aggressive/too quick with
> > assigning
> > >> stale issues/deprioritizing them, but that's not that big of a deal
> for
> > me.
> > >>
> > >> What bothers me much more is that it's closing minor issues
> > automatically.
> > >> Depriotising issues makes sense to me. If a wish for improvement or a
> > bug
> > >> report has been opened a long time ago, and they got no attention over
> > the
> > >> time, sure depriotize them. But closing them is IMO a bad idea. Bug
> > might
> > >> be minor, but if it's not fixed it's still there - it shouldn't be
> > closed.
> > >> Closing with "won't fix" should be done for very good reasons and very
> > >> rarely. Same applies to improvements/wishes. Furthermore, very often
> > >> descriptions and comments have a lot of value, and if we keep closing
> > minor
> > >> issues I'm afraid that we end up with:
> > >> - more duplication. I doubt anyone will be looking for prior "closed"
> > bug
> > >> reports/improvement requests. Definitely I'm only looking for open
> > tickets
> > >> when looking if a ticket for XYZ already exists or not
> > >> - we will be losing knowledge
> > >>
> > >> Piotrek
> > >>
> > >> śr., 16 cze 2021 o 15:12 Robert Metzger 
> > napisał(a):
> > >>
> > >>> Very sorry for the delayed response.
> > >>>
> > >>> Regarding tickets with the "test-instability" label (topic 1): I'm
> > >> usually
> > >>> assigning a fixVersion to the next release of the branch where the
> > >> failure
> > >>> occurred, when I'm opening a test failure ticket. Others seem to do
> > that
> > >>> too. Hence my comment that not checking tickets with a fixVersion set
> > by
> > >>> Flink bot is good (because test failures should always stay
> "Critical"
> > >>> until

[jira] [Created] (FLINK-23118) Drop mesos

2021-06-23 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23118:


 Summary: Drop mesos
 Key: FLINK-23118
 URL: https://issues.apache.org/jira/browse/FLINK-23118
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Mesos
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


Following the discussion on the 
[ML|https://lists.apache.org/thread.html/rd7bf0dabe2d75adb9f97a1879638711d04cfce0774d31b033acae0b8%40%3Cdev.flink.apache.org%3E]
 , remove Mesos support.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Feedback Collection Jira Bot

2021-06-23 Thread Piotr Nowojski
Hi Konstantin,

> In my opinion it is important that we close tickets eventually. There are
a
> lot of tickets (bugs, improvements, tech debt) that over time became
> irrelevant, out-of-scope, irreproducible, etc.  In my experience, these
> tickets are usually not closed by anyone but the bot.

I agree there are such tickets, but I don't see how this is addressing my
concerns. There are also tickets that just shouldn't be closed as I
described above. Why do you think that duplicating tickets and losing
discussions/knowledge is a good solution?

I would like to avoid having to constantly fight against the bot. It's
already responsible for the majority of my daily emails, with quite little
benefit for me personally. I initially thought that after some period of
time it will settle down, but now I'm afraid it won't happen. Can we add
some label to mark tickets to be ignored by the jira-bot?

Best,
Piotrek

śr., 23 cze 2021 o 09:40 Chesnay Schepler  napisał(a):

> I would like it to not unassign people if a PR is open. These are
> usually blocked by the reviewer, not the assignee, and having the
> assignees now additionally having to update JIRA periodically is a bit
> like rubbing salt into the wound.
>
> On 6/23/2021 7:52 AM, Konstantin Knauf wrote:
> > Hi everyone,
> >
> > I was hoping for more feedback from other committers, but seems like this
> > is not happening, so here's my proposal for immediate changes:
> >
> > * Ignore tickets with a fixVersion for all rules but the stale-unassigned
> > role.
> >
> > * We change the time intervals as follows, accepting reality a bit more
> ;)
> >
> >  * stale-assigned only after 30 days (instead of 14 days)
> >  * stale-critical only after 14 days (instead of 7 days)
> >  * stale-major only after 60 days (instead of 30 days)
> >
> > Unless there are -1s, I'd implement the changes Monday next week.
> >
> > Cheers,
> >
> > Konstantin
> >
> > On Thu, Jun 17, 2021 at 2:17 PM Piotr Nowojski 
> wrote:
> >
> >> Hi,
> >>
> >> I also think that the bot is a bit too aggressive/too quick with
> assigning
> >> stale issues/deprioritizing them, but that's not that big of a deal for
> me.
> >>
> >> What bothers me much more is that it's closing minor issues
> automatically.
> >> Depriotising issues makes sense to me. If a wish for improvement or a
> bug
> >> report has been opened a long time ago, and they got no attention over
> the
> >> time, sure depriotize them. But closing them is IMO a bad idea. Bug
> might
> >> be minor, but if it's not fixed it's still there - it shouldn't be
> closed.
> >> Closing with "won't fix" should be done for very good reasons and very
> >> rarely. Same applies to improvements/wishes. Furthermore, very often
> >> descriptions and comments have a lot of value, and if we keep closing
> minor
> >> issues I'm afraid that we end up with:
> >> - more duplication. I doubt anyone will be looking for prior "closed"
> bug
> >> reports/improvement requests. Definitely I'm only looking for open
> tickets
> >> when looking if a ticket for XYZ already exists or not
> >> - we will be losing knowledge
> >>
> >> Piotrek
> >>
> >> śr., 16 cze 2021 o 15:12 Robert Metzger 
> napisał(a):
> >>
> >>> Very sorry for the delayed response.
> >>>
> >>> Regarding tickets with the "test-instability" label (topic 1): I'm
> >> usually
> >>> assigning a fixVersion to the next release of the branch where the
> >> failure
> >>> occurred, when I'm opening a test failure ticket. Others seem to do
> that
> >>> too. Hence my comment that not checking tickets with a fixVersion set
> by
> >>> Flink bot is good (because test failures should always stay "Critical"
> >>> until we've understood what's going on)
> >>> I see that it is a bit contradicting that Critical test instabilities
> >>> receive no attention for 14 days, but that seems to be the norm given
> the
> >>> current number of incoming test instabilities.
> >>>
> >>> On Wed, Jun 16, 2021 at 2:05 PM Till Rohrmann 
> >>> wrote:
> >>>
>  Another example for category 4 would be the ticket where we collect
>  breaking API changes for Flink 2.0 [1]. The idea behind this ticket is
> >> to
>  collect things to consider when developing the next major version.
>  Admittedly, we have never seen the benefits of collecting the breaking
>  changes because we haven't started Flink 2.x yet. Also, it is not
> clear
> >>> how
>  relevant these tickets are right now.
> 
>  [1] https://issues.apache.org/jira/browse/FLINK-3957
> 
>  Cheers,
>  Till
> 
>  On Wed, Jun 16, 2021 at 11:42 AM Konstantin Knauf 
>  wrote:
> 
> > Hi everyone,
> >
> > thank you for all the feedback so far. I believe we have four
> >> different
> > topics by now:
> >
> > 1 about *test-instability tickets* raised by Robert. Waiting for
> >>> feedback
> > by Robert.
> >
> > 2 about *aggressiveness of stale-assigned *rule raised by Timo.
> >> Waiting
> > for feedback 

Re: CONTENTS DELETED in nabble frontend

2021-06-23 Thread Dawid Wysakowicz
Hey,

As far as I know the official Apache ML archive can be accessed here[1].
Personally I don't know what is the status of the nabble archives.

Best,

Dawid

[1] https://lists.apache.org/list.html?dev@flink.apache.org

On 23/06/2021 09:08, Matthias Pohl wrote:
> Thanks for pointing to the Nabble support forum. +1 Based on [1], the
> deletion of posts is not related to the switch of mailing lists becoming
> regular forums. But it seems to be a general issue at Nabble.
> But what concerns me is [2]: It looks like they are planning to remove the
> feature to post through email which is actually our way of collecting the
> posts.
>
> @Robert: Is this Apache Flink Mailing list/Nabble system an Apache-wide
> setup used in any other Apache project as well? Or is this something we
> came up with?
>
> Do we have a fallback system or backups of messages?
>
> Matthias
>
> [1]
> http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html
> [2]
> http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html
>
> On Wed, Jun 23, 2021 at 8:45 AM Yangze Guo  wrote:
>
>> Ahh. It seems nabble has updated mailing lists to regular forums this
>> week[1].
>>
>> [1]
>> http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html
>>
>> Best,
>> Yangze Guo
>>
>> On Wed, Jun 23, 2021 at 2:37 PM Yangze Guo  wrote:
>>> It seems the post will remain iff it is sent by a registered email. I
>>> do not register nabble in user ML and my post is deleted in [1].
>>>
>>> [1]
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/after-upgrade-flink1-12-to-flink1-13-1-flink-web-ui-s-taskmanager-detail-page-error-tt44391.html
>>> Best,
>>> Yangze Guo
>>>
>>> On Wed, Jun 23, 2021 at 2:16 PM Matthias Pohl 
>> wrote:
 Hi everyone,
 Is it only me or does anyone else have the same problem with messages
>> being
 not available anymore in the nabble frontend? I get multiple messages
>> like
 the following one for individual messages:
> CONTENTS DELETED
> The author has deleted this message.
 This appears for instance in [1], where all the messages are deleted
>> except
 for Till Rohrmann's, Yangze Guo's and mine. This issue is not limited
>> to
 the dev mailing list but also seem to appear in the user mailing list
>> (e.g.
 [2]).

 Logging into nabble doesn't solve the problem. I'd assume that it's
>> some
 infrastructure issue rather than people collectively deleting their
 messages. But a Google search wasn't of any help.

 Matthias

 [1]

>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html
 [2]

>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/1-1-snapshot-issues-td6971.html#a6973



OpenPGP_signature
Description: OpenPGP digital signature


[jira] [Created] (FLINK-23117) TaskExecutor.allocateSlot is a logical error

2021-06-23 Thread zhouzhengde (Jira)
zhouzhengde created FLINK-23117:
---

 Summary: TaskExecutor.allocateSlot is a logical error
 Key: FLINK-23117
 URL: https://issues.apache.org/jira/browse/FLINK-23117
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task
Affects Versions: 1.13.1, 1.13.0, 1.12.2, 1.12.0
Reporter: zhouzhengde


(commit: 2020-04-22) TaskExecutor.allocateSlot at line 1109 appears to have a logical 
error. Using '!taskSlotTable.isAllocated(slotId.getSlotNumber(), jobId, allocationId)' 
to decide that the TaskSlot is used by another job does not look correct: isAllocated() 
also returns false when the slot index is not occupied at all, which seems problematic. 
Please confirm whether this is correct. The code in question follows:

- TaskExecutor.java
```java
} else if (!taskSlotTable.isAllocated(slotId.getSlotNumber(), jobId, allocationId)) {
    final String message =
            "The slot " + slotId + " has already been allocated for a different job.";

    log.info(message);

    final AllocationID allocationID =
            taskSlotTable.getCurrentAllocation(slotId.getSlotNumber());
    throw new SlotOccupiedException(
            message, allocationID, taskSlotTable.getOwningJob(allocationID));
}
```

- TaskSlotTableImpl.java
```java
@Override
public boolean isAllocated(int index, JobID jobId, AllocationID allocationId) {
    TaskSlot taskSlot = taskSlots.get(index);
    if (taskSlot != null) {
        return taskSlot.isAllocated(jobId, allocationId);
    } else {
        // Also returns false when the slot index is not occupied at all.
        return false;
    }
}
```





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-172: Support custom transactional.id prefix in FlinkKafkaProducer

2021-06-23 Thread Piotr Nowojski
Hi,

+1 from my side on this idea. I do not see any problems that could be
caused by this change.

Best,
Piotrek

śr., 23 cze 2021 o 08:59 Stephan Ewen  napisał(a):

> The motivation and the proposal sound good to me, +1 from my side.
>
> Would be good to have a quick opinion from someone who worked specifically
> with Kafka, maybe Becket or Piotr?
>
> Best,
> Stephan
>
>
> On Sat, Jun 12, 2021 at 9:50 AM Wenhao Ji  wrote:
>
>> Hi everyone,
>>
>> I would like to open this discussion thread to take about the FLIP-172
>> <
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-172%3A+Support+custom+transactional.id+prefix+in+FlinkKafkaProducer
>> >,
>> which aims to provide a way to support specifying a custom
>> transactional.id
>> in the FlinkKafkaProducer class.
>>
>> I am looking forwards to your feedback and suggestions!
>>
>> Thanks,
>> Wenhao
>>
>
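
For context on what transactional.id controls, a sketch using the plain Kafka producer
API (this is not the proposed FlinkKafkaProducer API, which is described in the FLIP;
the broker address, topic and the id value below are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    // FlinkKafkaProducer currently generates this id internally; FLIP-172 proposes
    // letting users supply a custom prefix for it.
    props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-prefix-0");

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
        producer.initTransactions();
        producer.beginTransaction();
        producer.send(new ProducerRecord<>("demo-topic", "key", "value"));
        producer.commitTransaction();
    }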


Re: [DISCUSS] Feedback Collection Jira Bot

2021-06-23 Thread Chesnay Schepler
I would like it to not unassign people if a PR is open. These are 
usually blocked by the reviewer, not the assignee, and having the 
assignees now additionally having to update JIRA periodically is a bit 
like rubbing salt into the wound.


On 6/23/2021 7:52 AM, Konstantin Knauf wrote:

Hi everyone,

I was hoping for more feedback from other committers, but seems like this
is not happening, so here's my proposal for immediate changes:

* Ignore tickets with a fixVersion for all rules but the stale-unassigned
role.

* We change the time intervals as follows, accepting reality a bit more ;)

 * stale-assigned only after 30 days (instead of 14 days)
 * stale-critical only after 14 days (instead of 7 days)
 * stale-major only after 60 days (instead of 30 days)

Unless there are -1s, I'd implement the changes Monday next week.

Cheers,

Konstantin

On Thu, Jun 17, 2021 at 2:17 PM Piotr Nowojski  wrote:


Hi,

I also think that the bot is a bit too aggressive/too quick with assigning
stale issues/deprioritizing them, but that's not that big of a deal for me.

What bothers me much more is that it's closing minor issues automatically.
Deprioritizing issues makes sense to me. If a wish for improvement or a bug
report was opened a long time ago and got no attention over time, sure,
deprioritize them. But closing them is IMO a bad idea. A bug might
be minor, but if it's not fixed it's still there - it shouldn't be closed.
Closing with "won't fix" should be done for very good reasons and very
rarely. Same applies to improvements/wishes. Furthermore, very often
descriptions and comments have a lot of value, and if we keep closing minor
issues I'm afraid that we end up with:
- more duplication. I doubt anyone will be looking for prior "closed" bug
reports/improvement requests. Definitely I'm only looking for open tickets
when looking if a ticket for XYZ already exists or not
- we will be losing knowledge

Piotrek

Wed, Jun 16, 2021 at 15:12 Robert Metzger  wrote:


Very sorry for the delayed response.

Regarding tickets with the "test-instability" label (topic 1): I'm usually
assigning a fixVersion to the next release of the branch where the failure
occurred, when I'm opening a test failure ticket. Others seem to do that
too. Hence my comment that not checking tickets with a fixVersion set by
Flink bot is good (because test failures should always stay "Critical"
until we've understood what's going on)
I see that it is a bit contradicting that Critical test instabilities
receive no attention for 14 days, but that seems to be the norm given the
current number of incoming test instabilities.

On Wed, Jun 16, 2021 at 2:05 PM Till Rohrmann  wrote:

Another example for category 4 would be the ticket where we collect
breaking API changes for Flink 2.0 [1]. The idea behind this ticket is to
collect things to consider when developing the next major version.
Admittedly, we have never seen the benefits of collecting the breaking
changes because we haven't started Flink 2.x yet. Also, it is not clear how
relevant these tickets are right now.

[1] https://issues.apache.org/jira/browse/FLINK-3957

Cheers,
Till

On Wed, Jun 16, 2021 at 11:42 AM Konstantin Knauf  wrote:

Hi everyone,

thank you for all the feedback so far. I believe we have four different
topics by now:

1 about *test-instability tickets* raised by Robert. Waiting for feedback
by Robert.

2 about *aggressiveness of stale-assigned* rule raised by Timo. Waiting
for feedback by Timo and others.

3 about *excluding issues with a fixVersion* raised by Konstantin, Till.
Waiting for more feedback by the community as it involves general changes
to how we deal with fixVersion.

4 about *excluding issues with a specific-label* raised by Arvid.

I've already written something about 1-3. Regarding 4:

How do we make sure that these don't become stale? I think, there have
been a few "long-term efforts" in the past that never got the attention
that we initially wanted. Is this just about the ability to collect tickets
under an umbrella to document a future effort? Maybe for the example of
DataStream replacing DataSet how would this look like in Jira?

Cheers,

Konstantin


On Tue, Jun 8, 2021 at 11:31 AM Till Rohrmann  wrote:

I like this idea. It would then be the responsibility of the component
maintainers to manage the lifecycle explicitly.

Cheers,
Till

On Mon, Jun 7, 2021 at 1:48 PM Arvid Heise  wrote:

One more idea for the bot. Could we have a label to exclude certain
tickets from the life-cycle?

I'm thinking about long-term tickets such as improving DataStream to
eventually replace DataSet. We would collect ideas over the next couple of
weeks without any visible progress on the implementation.

On Fri, May 21, 2021 at 2:06 PM Konstantin Knauf <kna...@apache.org> wrote:


Hi Timo,

Thanks for joining the discussion. All rules except the unassigned rule do
not apply to Sub-Tasks actually (like deprioritiza

[jira] [Created] (FLINK-23116) Update documentation about TableDescriptors

2021-06-23 Thread Timo Walther (Jira)
Timo Walther created FLINK-23116:


 Summary: Update documentation about TableDescriptors
 Key: FLINK-23116
 URL: https://issues.apache.org/jira/browse/FLINK-23116
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Table SQL / API
Reporter: Timo Walther


We should update the documentation in a couple of places to show different use
cases. In any case, we need detailed documentation for the Table API/common
API section.
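
As a starting point for that documentation, a minimal sketch of the kind of example it
might show. The connector name and options are placeholders, and the descriptor-based
calls (TableDescriptor.forConnector, createTemporaryTable) are assumed from the FLIP
this subtask belongs to rather than from already released documentation:

```java
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Schema;
import org.apache.flink.table.api.TableDescriptor;
import org.apache.flink.table.api.TableEnvironment;

public class TableDescriptorExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());

        // Programmatic equivalent of a CREATE TABLE statement.
        TableDescriptor source =
                TableDescriptor.forConnector("datagen")
                        .schema(
                                Schema.newBuilder()
                                        .column("user_id", DataTypes.BIGINT())
                                        .column("item_id", DataTypes.STRING())
                                        .build())
                        .option("rows-per-second", "5")
                        .build();

        tEnv.createTemporaryTable("Orders", source);
        tEnv.from("Orders").execute().print();
    }
}
```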



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23115) Expose new APIs from PyFlink

2021-06-23 Thread Jira
Ingo Bürk created FLINK-23115:
-

 Summary: Expose new APIs from PyFlink
 Key: FLINK-23115
 URL: https://issues.apache.org/jira/browse/FLINK-23115
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Ingo Bürk






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: CONTENTS DELETED in nabble frontend

2021-06-23 Thread Matthias Pohl
Thanks for pointing to the Nabble support forum. +1. Based on [1], the
deletion of posts is not related to the switch of mailing lists becoming
regular forums, but seems to be a general issue at Nabble.
What concerns me more is [2]: it looks like they are planning to remove the
feature to post through email, which is actually how we collect the
posts.

@Robert: Is this Apache Flink Mailing list/Nabble system an Apache-wide
setup used in any other Apache project as well? Or is this something we
came up with?

Do we have a fallback system or backups of messages?

Matthias

[1]
http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html
[2]
http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html

On Wed, Jun 23, 2021 at 8:45 AM Yangze Guo  wrote:

> Ahh. It seems Nabble has updated mailing lists to regular forums this
> week [1].
>
> [1]
> http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html
>
> Best,
> Yangze Guo
>
> On Wed, Jun 23, 2021 at 2:37 PM Yangze Guo  wrote:
> >
> > It seems a post will remain if and only if it was sent from a registered email. I
> > did not register with Nabble for the user ML, and my post is deleted in [1].
> >
> > [1]
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/after-upgrade-flink1-12-to-flink1-13-1-flink-web-ui-s-taskmanager-detail-page-error-tt44391.html
> >
> > Best,
> > Yangze Guo
> >
> > On Wed, Jun 23, 2021 at 2:16 PM Matthias Pohl 
> wrote:
> > >
> > > Hi everyone,
> > > Is it only me or does anyone else have the same problem with messages not
> > > being available anymore in the Nabble frontend? I get multiple messages like
> > > the following one for individual messages:
> > > > CONTENTS DELETED
> > > > The author has deleted this message.
> > >
> > > This appears for instance in [1], where all the messages are deleted except
> > > for Till Rohrmann's, Yangze Guo's and mine. This issue is not limited to
> > > the dev mailing list but also seems to appear in the user mailing list (e.g.
> > > [2]).
> > >
> > > Logging into Nabble doesn't solve the problem. I'd assume that it's some
> > > infrastructure issue rather than people collectively deleting their
> > > messages. But a Google search wasn't of any help.
> > >
> > > Matthias
> > >
> > > [1]
> > >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html
> > > [2]
> > >
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/1-1-snapshot-issues-td6971.html#a6973