Re: [VOTE] FLIP-169: DataStream API for Fine-Grained Resource Requirements
+1 (binding)

Thanks Yangze for driving the issue.

Best regards,
JING ZHANG

Yang Wang wrote on Thu, Jun 24, 2021 at 1:51 PM:

> +1 (non-binding)
>
> Best,
> Yang
>
> 刘建刚 wrote on Thu, Jun 24, 2021 at 12:17 PM:
>
> > +1 (binding)
> >
> > Thanks
> > liujiangang
> >
> > Zhu Zhu wrote on Thu, Jun 24, 2021 at 11:38 AM:
> >
> > > +1 (binding)
> > >
> > > Thanks,
> > > Zhu
> > >
> > > Yangze Guo wrote on Mon, Jun 21, 2021 at 3:42 PM:
> > >
> > > > According to the latest comment of Zhu Zhu [1], I have appended the potential resource deadlock in batch jobs as a known limitation to this FLIP. Thus, I'd like to extend the voting period for another 72h.
> > > >
> > > > [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Tue, Jun 15, 2021 at 7:53 PM Xintong Song wrote:
> > > > >
> > > > > +1 (binding)
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > > On Tue, Jun 15, 2021 at 6:21 PM Arvid Heise wrote:
> > > > > >
> > > > > > LGTM, +1 (binding) from my side.
> > > > > >
> > > > > > On Tue, Jun 15, 2021 at 11:00 AM Yangze Guo wrote:
> > > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I'd like to start the vote on FLIP-169 [1]. This FLIP is discussed in the thread [2].
> > > > > > >
> > > > > > > The vote will be open for at least 72 hours. Unless there is an objection, I will try to close it by Jun 18, 2021 if we have received sufficient votes.
> > > > > > >
> > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-169+DataStream+API+for+Fine-Grained+Resource+Requirements
> > > > > > > [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
[jira] [Created] (FLINK-23133) The dependencies are not handled properly when mixing use of Python Table API and Python DataStream API
Dian Fu created FLINK-23133:
-------------------------------

             Summary: The dependencies are not handled properly when mixing use of Python Table API and Python DataStream API
                 Key: FLINK-23133
                 URL: https://issues.apache.org/jira/browse/FLINK-23133
             Project: Flink
          Issue Type: Bug
          Components: API / Python
    Affects Versions: 1.13.0, 1.12.0
            Reporter: Dian Fu
            Assignee: Dian Fu
             Fix For: 1.12.5, 1.13.2


The reason is that when converting from DataStream to Table, the dependencies should be handled and set correctly for the existing DataStream operators.


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.
+1 to Xintong's proposal.

I also have some concerns about unstable cases. I think unstable cases can be divided into these types:

- Force majeure: for example, a network timeout or a sudden environment failure. These failures are accidental and can usually be resolved by triggering Azure again. Committers should wait for the next green Azure build.
- Obvious mistakes: errors with an obvious cause that may be repaired quickly. In this case, do we need to wait, or should we not wait and just ignore the failure?
- Hard problems: failures that are very difficult to diagnose. There may be no solution for quite some time; we may not even know the reason. In this case, we should ignore the failure. (Maybe this is judged by the author of the test case. But what about old cases whose author can't be found?)

So, should the ignored cases block the next release until the reason is found or the case is fixed? We need to ensure that someone takes care of these cases, because without digging deeper into the failed tests, no one may continue to pay attention to them. I think this guideline should consider these situations and show how to resolve them.

Best,
Jingsong

On Thu, Jun 24, 2021 at 10:57 AM Jark Wu wrote:

> Thanks to Xintong for bringing up this topic, I'm +1 in general.
>
> However, I think it's still not very clear how we address the unstable tests. I think this is a very important part of this new guideline.
>
> According to the discussion above, if some tests are unstable, we can manually disable them. But I have some questions in my mind:
> 1) Is the instability judged by the committers themselves or by some metrics?
> 2) Should we log the disabling commit in the corresponding issue and increase the priority?
> 3) What if nobody looks into this issue and it becomes a potential bug released with the new version?
> 4) If no person is actively working on the issue, who should re-enable it? Would it block PRs again?
>
> Best,
> Jark
>
> On Thu, 24 Jun 2021 at 10:04, Xintong Song wrote:
>
> > Thanks all for the feedback.
> >
> > @Till @Yangze
> >
> > I'm also not convinced by the idea of having an exception for local builds. We need to execute the entire build (or at least the failing stage) locally, to make sure subsequent test cases prevented by the failing one are all executed. In that case, it's probably easier to rerun the build on Azure than locally.
> >
> > Concerning disabling unstable test cases that regularly block PRs from merging, maybe we can say that such cases can only be disabled when someone is actively looking into them, likely the person who disabled the case. If this person is no longer actively working on it, he/she should enable the case again no matter whether it is fixed or not.
> >
> > @Jing
> >
> > Thanks for the suggestions.
> >
> > +1 to provide guidelines on handling test failures.
> >
> > > 1. Report the test failures in the JIRA.
> >
> > +1 on this. Currently, the release managers are monitoring the CI and cron build instabilities and reporting them on JIRA. We should also encourage other contributors to do that for PRs.
> >
> > > 2. Set a deadline to find out the root cause and solve the failure for the newly created JIRA, because we could not block other commit merges for a long time
> > > 3. What to do if the JIRA has not made significant progress when the deadline is reached?
> >
> > I'm not sure about these two. It feels a bit against the voluntary nature of open source projects.
> >
> > IMHO, frequent instabilities are more likely to be upgraded to the critical / blocker priority, receive more attention and eventually get fixed. Release managers are also responsible for looking for assignees for such issues. If a case is still not fixed soonish, even with all these efforts, I'm not sure how setting a deadline can help.
> >
> > > 4. If we disable the respective tests temporarily, we also need a mechanism to ensure the issue will continue to be investigated in the future.
> >
> > +1. As mentioned above, we may consider disabling such tests only if someone is actively working on them.
> >
> > Thank you~
> >
> > Xintong Song
> >
> > On Wed, Jun 23, 2021 at 9:56 PM JING ZHANG wrote:
> >
> > > Hi Xintong,
> > > +1 to the proposal.
> > > In order to better comply with the rule, it is necessary to describe the best practice when encountering a test failure that seems unrelated to the current commits. How to avoid merging PRs with test failures without blocking code merging for a long time?
> > > I tried to think about the possible steps, and found there are some detailed problems that need to be discussed further:
> > > 1. Report the test failures in the JIRA.
> > > 2. Set a deadline to find out the root cause and solve the failure for the newly created JIRA, because we could not block
Re: [VOTE] FLIP-169: DataStream API for Fine-Grained Resource Requirements
+1 (non-binding)

Best,
Yang

刘建刚 wrote on Thu, Jun 24, 2021 at 12:17 PM:

> +1 (binding)
>
> Thanks
> liujiangang
>
> Zhu Zhu wrote on Thu, Jun 24, 2021 at 11:38 AM:
>
> > +1 (binding)
> >
> > Thanks,
> > Zhu
> >
> > Yangze Guo wrote on Mon, Jun 21, 2021 at 3:42 PM:
> >
> > > According to the latest comment of Zhu Zhu [1], I have appended the potential resource deadlock in batch jobs as a known limitation to this FLIP. Thus, I'd like to extend the voting period for another 72h.
> > >
> > > [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Tue, Jun 15, 2021 at 7:53 PM Xintong Song wrote:
> > > >
> > > > +1 (binding)
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > > On Tue, Jun 15, 2021 at 6:21 PM Arvid Heise wrote:
> > > > >
> > > > > LGTM, +1 (binding) from my side.
> > > > >
> > > > > On Tue, Jun 15, 2021 at 11:00 AM Yangze Guo wrote:
> > > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > I'd like to start the vote on FLIP-169 [1]. This FLIP is discussed in the thread [2].
> > > > > >
> > > > > > The vote will be open for at least 72 hours. Unless there is an objection, I will try to close it by Jun 18, 2021 if we have received sufficient votes.
> > > > > >
> > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-169+DataStream+API+for+Fine-Grained+Resource+Requirements
> > > > > > [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
Re: [VOTE] FLIP-169: DataStream API for Fine-Grained Resource Requirements
+1 (binding)

Thanks
liujiangang

Zhu Zhu wrote on Thu, Jun 24, 2021 at 11:38 AM:

> +1 (binding)
>
> Thanks,
> Zhu
>
> Yangze Guo wrote on Mon, Jun 21, 2021 at 3:42 PM:
>
> > According to the latest comment of Zhu Zhu [1], I have appended the potential resource deadlock in batch jobs as a known limitation to this FLIP. Thus, I'd like to extend the voting period for another 72h.
> >
> > [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> >
> > Best,
> > Yangze Guo
> >
> > On Tue, Jun 15, 2021 at 7:53 PM Xintong Song wrote:
> > >
> > > +1 (binding)
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > > On Tue, Jun 15, 2021 at 6:21 PM Arvid Heise wrote:
> > > >
> > > > LGTM, +1 (binding) from my side.
> > > >
> > > > On Tue, Jun 15, 2021 at 11:00 AM Yangze Guo wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > I'd like to start the vote on FLIP-169 [1]. This FLIP is discussed in the thread [2].
> > > > >
> > > > > The vote will be open for at least 72 hours. Unless there is an objection, I will try to close it by Jun 18, 2021 if we have received sufficient votes.
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-169+DataStream+API+for+Fine-Grained+Resource+Requirements
> > > > > [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > > > >
> > > > > Best,
> > > > > Yangze Guo
Re: [VOTE] FLIP-169: DataStream API for Fine-Grained Resource Requirements
+1 (binding)

Thanks,
Zhu

Yangze Guo wrote on Mon, Jun 21, 2021 at 3:42 PM:

> According to the latest comment of Zhu Zhu [1], I have appended the potential resource deadlock in batch jobs as a known limitation to this FLIP. Thus, I'd like to extend the voting period for another 72h.
>
> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
>
> Best,
> Yangze Guo
>
> On Tue, Jun 15, 2021 at 7:53 PM Xintong Song wrote:
> >
> > +1 (binding)
> >
> > Thank you~
> >
> > Xintong Song
> >
> > On Tue, Jun 15, 2021 at 6:21 PM Arvid Heise wrote:
> > >
> > > LGTM, +1 (binding) from my side.
> > >
> > > On Tue, Jun 15, 2021 at 11:00 AM Yangze Guo wrote:
> > > >
> > > > Hi everyone,
> > > >
> > > > I'd like to start the vote on FLIP-169 [1]. This FLIP is discussed in the thread [2].
> > > >
> > > > The vote will be open for at least 72 hours. Unless there is an objection, I will try to close it by Jun 18, 2021 if we have received sufficient votes.
> > > >
> > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-169+DataStream+API+for+Fine-Grained+Resource+Requirements
> > > > [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-169-DataStream-API-for-Fine-Grained-Resource-Requirements-td51071.html
> > > >
> > > > Best,
> > > > Yangze Guo
Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.
Thanks to Xintong for bringing up this topic, I'm +1 in general.

However, I think it's still not very clear how we address the unstable tests. I think this is a very important part of this new guideline.

According to the discussion above, if some tests are unstable, we can manually disable them. But I have some questions in my mind:
1) Is the instability judged by the committers themselves or by some metrics?
2) Should we log the disabling commit in the corresponding issue and increase the priority?
3) What if nobody looks into this issue and it becomes a potential bug released with the new version?
4) If no person is actively working on the issue, who should re-enable it? Would it block PRs again?

Best,
Jark

On Thu, 24 Jun 2021 at 10:04, Xintong Song wrote:

> Thanks all for the feedback.
>
> @Till @Yangze
>
> I'm also not convinced by the idea of having an exception for local builds. We need to execute the entire build (or at least the failing stage) locally, to make sure subsequent test cases prevented by the failing one are all executed. In that case, it's probably easier to rerun the build on Azure than locally.
>
> Concerning disabling unstable test cases that regularly block PRs from merging, maybe we can say that such cases can only be disabled when someone is actively looking into them, likely the person who disabled the case. If this person is no longer actively working on it, he/she should enable the case again no matter whether it is fixed or not.
>
> @Jing
>
> Thanks for the suggestions.
>
> +1 to provide guidelines on handling test failures.
>
> > 1. Report the test failures in the JIRA.
>
> +1 on this. Currently, the release managers are monitoring the CI and cron build instabilities and reporting them on JIRA. We should also encourage other contributors to do that for PRs.
>
> > 2. Set a deadline to find out the root cause and solve the failure for the newly created JIRA, because we could not block other commit merges for a long time
> > 3. What to do if the JIRA has not made significant progress when the deadline is reached?
>
> I'm not sure about these two. It feels a bit against the voluntary nature of open source projects.
>
> IMHO, frequent instabilities are more likely to be upgraded to the critical / blocker priority, receive more attention and eventually get fixed. Release managers are also responsible for looking for assignees for such issues. If a case is still not fixed soonish, even with all these efforts, I'm not sure how setting a deadline can help.
>
> > 4. If we disable the respective tests temporarily, we also need a mechanism to ensure the issue will continue to be investigated in the future.
>
> +1. As mentioned above, we may consider disabling such tests only if someone is actively working on them.
>
> Thank you~
>
> Xintong Song
>
> On Wed, Jun 23, 2021 at 9:56 PM JING ZHANG wrote:
>
> > Hi Xintong,
> > +1 to the proposal.
> > In order to better comply with the rule, it is necessary to describe the best practice when encountering a test failure that seems unrelated to the current commits. How to avoid merging PRs with test failures without blocking code merging for a long time?
> > I tried to think about the possible steps, and found there are some detailed problems that need to be discussed further:
> > 1. Report the test failures in the JIRA.
> > 2. Set a deadline to find out the root cause and solve the failure for the newly created JIRA, because we could not block other commit merges for a long time. When is a reasonable deadline here?
> > 3. What to do if the JIRA has not made significant progress when the deadline is reached? There are several situations as follows; maybe different cases need different approaches:
> >   1. the JIRA is not assigned yet
> >   2. the root cause has not been found yet
> >   3. the root cause is known, but a good solution has not been found
> >   4. a solution has been found, but it needs more time to be done
> > 4. If we disable the respective tests temporarily, we also need a mechanism to ensure the issue will continue to be investigated in the future.
> >
> > Best regards,
> > JING ZHANG
> >
> > Stephan Ewen wrote on Wed, Jun 23, 2021 at 8:16 PM:
> >
> > > +1 to Xintong's proposal
> > >
> > > On Wed, Jun 23, 2021 at 1:53 PM Till Rohrmann wrote:
> > > >
> > > > I would first try to not introduce the exception for local builds. It makes it quite hard for others to verify the build and to make sure that the right things were executed. If we see that this becomes an issue then we can revisit this idea.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Wed, Jun 23, 2021 at 4:19 AM Yangze Guo wrote:
> > > > >
> > > > > +1 for appending this to community guidelines for merging PRs.
> > > > >
> > > > > @Till Rohrmann
> > > > > I agree that with this approach unstable te
[jira] [Created] (FLINK-23132) flink upgrade issue(1.11.3->1.13.0)
Jeff Hu created FLINK-23132:
-------------------------------

             Summary: flink upgrade issue (1.11.3 -> 1.13.0)
                 Key: FLINK-23132
                 URL: https://issues.apache.org/jira/browse/FLINK-23132
             Project: Flink
          Issue Type: Bug
            Reporter: Jeff Hu


In order to improve the performance of data processing, we store events in a map and do not process them until the event count reaches 100. In the meantime, we start a timer in the open() method, so the data is processed every 60 seconds.

This works when the Flink version is *1.11.3*. After upgrading Flink to *1.13.0*, I found that sometimes events were consumed from Kafka continuously but were not processed in the RichFlatMapFunction, meaning data was missing. After restarting the service it works well, but several hours later the same thing happens again.

Is there any known issue for this Flink version? Any suggestions are appreciated.

{code}
public class MyJob {
    public static void main(String[] args) throws Exception {
        ...
        DataStream rawEventSource = env.addSource(flinkKafkaConsumer);
        ...
    }
}
{code}

{code}
public class MyMapFunction extends RichFlatMapFunction implements Serializable {
    @Override
    public void open(Configuration parameters) {
        ...
        long periodTimeout = 60;
        pool.scheduleAtFixedRate(() -> {
            // processing data
        }, periodTimeout, periodTimeout, TimeUnit.SECONDS);
    }

    @Override
    public void flatMap(String message, Collector out) {
        // store event to map
        // count events
        // when count = 100, start data processing
    }
}
{code}


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
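The report describes a simple buffering pattern: store incoming events, flush when the count reaches 100, and also flush from a timer every 60 seconds. Below is a minimal, Flink-free sketch of that pattern; the names (`EventBuffer`, `flush`, the threshold) are illustrative assumptions, not the reporter's actual code. Because the operator thread and the timer thread share the buffer, this sketch synchronizes access to it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the buffering pattern described in the report.
public class EventBuffer {
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> flushedBatches = new ArrayList<>();
    private final int threshold;

    public EventBuffer(int threshold) {
        this.threshold = threshold;
    }

    // Called by the operator thread for each incoming event (the flatMap side).
    public synchronized void add(String event) {
        buffer.add(event);
        if (buffer.size() >= threshold) {
            flush();
        }
    }

    // Called when the threshold is hit, and also by the periodic timer thread.
    public synchronized void flush() {
        if (!buffer.isEmpty()) {
            flushedBatches.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public synchronized int flushCount() {
        return flushedBatches.size();
    }

    public static void main(String[] args) {
        EventBuffer b = new EventBuffer(100);
        for (int i = 0; i < 250; i++) {
            b.add("event-" + i);
        }
        b.flush(); // simulate one firing of the 60-second timer
        System.out.println(b.flushCount()); // prints 3: two full batches of 100 plus one partial batch of 50
    }
}
```

Whether missing synchronization is actually the cause of the reported data loss is not established here; the sketch only makes the thread interaction in the described pattern explicit.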
Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.
Thanks all for the feedback.

@Till @Yangze

I'm also not convinced by the idea of having an exception for local builds. We need to execute the entire build (or at least the failing stage) locally, to make sure subsequent test cases prevented by the failing one are all executed. In that case, it's probably easier to rerun the build on Azure than locally.

Concerning disabling unstable test cases that regularly block PRs from merging, maybe we can say that such cases can only be disabled when someone is actively looking into them, likely the person who disabled the case. If this person is no longer actively working on it, he/she should enable the case again no matter whether it is fixed or not.

@Jing

Thanks for the suggestions.

+1 to provide guidelines on handling test failures.

> 1. Report the test failures in the JIRA.

+1 on this. Currently, the release managers are monitoring the CI and cron build instabilities and reporting them on JIRA. We should also encourage other contributors to do that for PRs.

> 2. Set a deadline to find out the root cause and solve the failure for the newly created JIRA, because we could not block other commit merges for a long time
> 3. What to do if the JIRA has not made significant progress when the deadline is reached?

I'm not sure about these two. It feels a bit against the voluntary nature of open source projects.

IMHO, frequent instabilities are more likely to be upgraded to the critical / blocker priority, receive more attention and eventually get fixed. Release managers are also responsible for looking for assignees for such issues. If a case is still not fixed soonish, even with all these efforts, I'm not sure how setting a deadline can help.

> 4. If we disable the respective tests temporarily, we also need a mechanism to ensure the issue will continue to be investigated in the future.

+1. As mentioned above, we may consider disabling such tests only if someone is actively working on them.

Thank you~

Xintong Song

On Wed, Jun 23, 2021 at 9:56 PM JING ZHANG wrote:

> Hi Xintong,
> +1 to the proposal.
> In order to better comply with the rule, it is necessary to describe the best practice when encountering a test failure that seems unrelated to the current commits. How to avoid merging PRs with test failures without blocking code merging for a long time?
> I tried to think about the possible steps, and found there are some detailed problems that need to be discussed further:
> 1. Report the test failures in the JIRA.
> 2. Set a deadline to find out the root cause and solve the failure for the newly created JIRA, because we could not block other commit merges for a long time. When is a reasonable deadline here?
> 3. What to do if the JIRA has not made significant progress when the deadline is reached? There are several situations as follows; maybe different cases need different approaches:
>   1. the JIRA is not assigned yet
>   2. the root cause has not been found yet
>   3. the root cause is known, but a good solution has not been found
>   4. a solution has been found, but it needs more time to be done
> 4. If we disable the respective tests temporarily, we also need a mechanism to ensure the issue will continue to be investigated in the future.
>
> Best regards,
> JING ZHANG
>
> Stephan Ewen wrote on Wed, Jun 23, 2021 at 8:16 PM:
>
> > +1 to Xintong's proposal
> >
> > On Wed, Jun 23, 2021 at 1:53 PM Till Rohrmann wrote:
> > >
> > > I would first try to not introduce the exception for local builds. It makes it quite hard for others to verify the build and to make sure that the right things were executed. If we see that this becomes an issue then we can revisit this idea.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Jun 23, 2021 at 4:19 AM Yangze Guo wrote:
> > > >
> > > > +1 for appending this to community guidelines for merging PRs.
> > > >
> > > > @Till Rohrmann
> > > > I agree that with this approach unstable tests will not block other commit merges. However, it might be hard to prevent merging commits that are related to those tests and should have passed them. It's true that this judgment can be made by the committers, but no one can ensure the judgment is always precise, which is why we have this discussion thread.
> > > >
> > > > Regarding the unstable tests, how about adding another exception: committers verify them in their local environment and comment in such cases?
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Tue, Jun 22, 2021 at 8:23 PM 刘建刚 wrote:
> > > > >
> > > > > It is a good principle to run all tests successfully with any change. This means a lot for the project's stability and development. I am a big +1 for this proposal.
> > > > >
> > > > > Best
> > > > > liujiangang
> > > > >
> > > > > Till Rohrmann wrote on Tue, Jun 22, 2021 at 6:36 PM:
> > > > > >
> > > > > > One way to address
[jira] [Created] (FLINK-23131) Remove scala from plugin parent-first patterns
Chesnay Schepler created FLINK-23131:
-------------------------------

             Summary: Remove scala from plugin parent-first patterns
                 Key: FLINK-23131
                 URL: https://issues.apache.org/jira/browse/FLINK-23131
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Configuration
            Reporter: Chesnay Schepler
            Assignee: Chesnay Schepler
             Fix For: 1.14.0


In order to load Akka and its Scala version through a separate classloader, we need to remove scala from the parent-first patterns for plugins.


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Created] (FLINK-23130) RestServerEndpoint references on netty 3
Chesnay Schepler created FLINK-23130:
-------------------------------

             Summary: RestServerEndpoint references on netty 3
                 Key: FLINK-23130
                 URL: https://issues.apache.org/jira/browse/FLINK-23130
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / REST
            Reporter: Chesnay Schepler
            Assignee: Chesnay Schepler
             Fix For: 1.14.0


The RestServerEndpoint does an instanceof check against {{org.jboss.netty.channel.ChannelException}}.


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Created] (FLINK-23129) When cancelling any running job of multiple jobs in an application cluster, JobManager shuts down
Robert Metzger created FLINK-23129:
-------------------------------

             Summary: When cancelling any running job of multiple jobs in an application cluster, JobManager shuts down
                 Key: FLINK-23129
                 URL: https://issues.apache.org/jira/browse/FLINK-23129
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.14.0
            Reporter: Robert Metzger


I have a jar with two jobs, both executeAsync() from the same main method. I execute the main method in an Application Mode cluster. When I cancel one of the two jobs, both jobs will stop executing.

I would expect that the JobManager shuts down once all jobs submitted from an application are finished. If this is a known limitation, we should document it.

{code}
2021-06-23 21:29:53,123 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job first job (18181be02da272387354d093519b2359) switched from state RUNNING to CANCELLING.
2021-06-23 21:29:53,124 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom Source -> Sink: Unnamed (1/1) (5a69b1c19f8da23975f6961898ab50a2) switched from RUNNING to CANCELING.
2021-06-23 21:29:53,141 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom Source -> Sink: Unnamed (1/1) (5a69b1c19f8da23975f6961898ab50a2) switched from CANCELING to CANCELED.
2021-06-23 21:29:53,144 INFO  org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Clearing resource requirements of job 18181be02da272387354d093519b2359
2021-06-23 21:29:53,145 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job first job (18181be02da272387354d093519b2359) switched from state CANCELLING to CANCELED.
2021-06-23 21:29:53,145 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Stopping checkpoint coordinator for job 18181be02da272387354d093519b2359.
2021-06-23 21:29:53,147 INFO  org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore [] - Shutting down
2021-06-23 21:29:53,150 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job 18181be02da272387354d093519b2359 reached terminal state CANCELED.
2021-06-23 21:29:53,152 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - Stopping the JobMaster for job first job(18181be02da272387354d093519b2359).
2021-06-23 21:29:53,155 INFO  org.apache.flink.runtime.jobmaster.slotpool.DefaultDeclarativeSlotPool [] - Releasing slot [c35b64879d6b02d383c825ea735ebba0].
2021-06-23 21:29:53,159 INFO  org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Clearing resource requirements of job 18181be02da272387354d093519b2359
2021-06-23 21:29:53,159 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - Close ResourceManager connection 281b3fcf7ad0a6f7763fa90b8a5b9adb: Stopping JobMaster for job first job(18181be02da272387354d093519b2359)..
2021-06-23 21:29:53,160 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Disconnect job manager 0...@akka.tcp://flink@localhost:6123/user/rpc/jobmanager_2 for job 18181be02da272387354d093519b2359 from the resource manager.
2021-06-23 21:29:53,225 INFO  org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap [] - Application CANCELED:
java.util.concurrent.CompletionException: org.apache.flink.client.deployment.application.UnsuccessfulExecutionException: Application Status: CANCELED
    at org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.lambda$unwrapJobResultException$4(ApplicationDispatcherBootstrap.java:304) ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) ~[?:1.8.0_252]
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) ~[?:1.8.0_252]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_252]
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) ~[?:1.8.0_252]
    at org.apache.flink.client.deployment.application.JobStatusPollingUtils.lambda$null$2(JobStatusPollingUtils.java:101) ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) ~[?:1.8.0_252]
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) ~[?:1.8.0_252]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_252]
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) ~[?:1.8.0_252]
    at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHan
{code}
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi all,

Thanks, Konstantin and Till, for guiding the discussion. I was not aware of the results of the call with Konstantin and was attempting to resolve the unanswered questions before more, potentially fruitless, work was done. I am also looking forward to the coming proposal, as well as increasing my understanding of this specific use case and its limitations!

Best,
Austin

On Tue, Jun 22, 2021 at 6:32 AM Till Rohrmann wrote:

> Hi everyone,
>
> I do like the idea of keeping the actual change outside of Flink while enabling Flink to support such a use case (different authentication mechanisms). I think this is a good compromise for the community that combines long-term maintainability with support for new use cases. I am looking forward to your proposal.
>
> I also want to second Konstantin here that the tone of your last email, Marton, does not reflect the values and manners of the Flink community and is not representative of how we conduct discussions. Especially, the more senior community members should know this and act accordingly in order to be good role models for others in the community. Technical discussions should not be decided by who wields presumably the greatest authority but by the soundness of arguments and by what is the best solution for a problem.
>
> Let us now try to find the best solution for the problem at hand!
>
> Cheers,
> Till
>
> On Tue, Jun 22, 2021 at 11:24 AM Konstantin Knauf wrote:
>
> > Hi everyone,
> >
> > First, Marton and I had a brief conversation yesterday offline and discussed exploring the approach of exposing the authentication functionality via an API. So, I am looking forward to your proposal in that direction. The benefit of such a solution would be that it is extensible for others and it adds a smaller maintenance (in particular testing) footprint to Apache Flink itself. If we end up going down this route, flink-packages.org would be a great way to promote these third-party "authentication modules".
> >
> > Second, Marton, I understand your frustration about the long discussion on this "simple matter", but the condescending tone of your last mail feels uncalled for to me. Austin expressed a valid opinion on the topic, which is based on his experience with other open source frameworks (CNCF mostly). I am sure you agree that it is important for Apache Flink to stay open and to consider different approaches and ideas, and I don't think it helps the culture of discussion to shoot them down like this ("This is where this discussion stops.").
> >
> > Let's continue to move this discussion forward and I am sure we'll find a consensus based on product and technological considerations.
> >
> > Thanks,
> >
> > Konstantin
> >
> > On Tue, Jun 22, 2021 at 9:31 AM Márton Balassi wrote:
> >
> > > Hi Austin,
> > >
> > > Thank you for your thoughts. This is where this discussion stops. This email thread already contains more characters than the implementation and what is needed for the next 20 years of maintenance.
> > >
> > > It is great that you have a view on modern solutions, and thank you for offering your help with brainstorming solutions. I am responsible for Flink at Cloudera and we do need an implementation like this; it is in fact already in production at dozens of customers. We are open to adapting it to expose a more generic API (keeping Kerberos in our fork), to contributing this to the community as others have asked for it, and to protecting ourselves from occasionally having to update this critical implementation path based on changes in the Apache codebase. I have worked with close to a hundred Big Data customers as a consultant and an engineering manager and committed hundreds of changes to Apache Flink over the past decade; please trust my judgement on a simple matter like this.
> > >
> > > Please forgive me for referencing authority, this discussion was getting out of hand. Please keep vigilant.
> > >
> > > Best,
> > > Marton
> > >
> > > On Mon, Jun 21, 2021 at 10:50 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
> > >
> > > > Hi Gabor and Marton,
> > > >
> > > > I don't believe that the issue with this proposal is the specific mechanism proposed (Kerberos), but rather that Flink is not the level at which to implement it. I'm just one voice, so please take this with a grain of salt.
> > > >
> > > > In the other solutions previously noted there is no need to instrument Flink, which, in addition to reducing the maintenance burden, provides a better, decoupled end result.
> > > >
> > > > IMO we should not add any new API in Flink for this use case. I think it is unfortunate and I sympathize with the work that has already been done on this feature – pe
[jira] [Created] (FLINK-23128) Translate update to operations playground docs to Chinese
David Anderson created FLINK-23128: -- Summary: Translate update to operations playground docs to Chinese Key: FLINK-23128 URL: https://issues.apache.org/jira/browse/FLINK-23128 Project: Flink Issue Type: Sub-task Components: Documentation / Training Affects Versions: 1.13.1 Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23127) Can't use plugins for GCS filesystem
Yaroslav Tkachenko created FLINK-23127: -- Summary: Can't use plugins for GCS filesystem Key: FLINK-23127 URL: https://issues.apache.org/jira/browse/FLINK-23127 Project: Flink Issue Type: Bug Affects Versions: 1.13.0 Reporter: Yaroslav Tkachenko Attachments: exception-stacktrace.txt
I've been trying to add support for the GCS filesystem. I have a working example where I add two JARs to the */opt/flink/lib/* folder:
* [GCS Hadoop connector|https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-latest-hadoop2.jar]
* *Shaded* Hadoop using [flink-shaded-hadoop-2-uber-2.8.3-10.0.jar|https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.8.3-10.0/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar]
Now I'm trying to follow the advice from [this page|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/filesystems/overview/#pluggable-file-systems] and use plugins instead. I followed the recommendations from [here|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/filesystems/plugins/]. Now I have two JARs in the */opt/flink/plugins/hadoop-gcs/* folder:
* [GCS Hadoop connector|https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-hadoop2-2.2.1.jar]
* *Non-shaded* [Hadoop using hadoop-common-2.10.1.jar|https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/2.10.1/hadoop-common-2.10.1.jar]
As I understand it, shading is not required for plugins (that's one of the reasons to use them), so I want to make it work with a plain, non-shaded _hadoop-common_. However, the JobManager fails with an exception (the full stacktrace is available as an attachment):
{quote}Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies.
{quote} The exception is thrown when _org.apache.hadoop.conf.Configuration_ and _org.apache.hadoop.fs.FileSystem_ [are not available in the classpath|https://github.com/apache/flink/blob/f2f2befee76d08b4d9aa592438dc0cf5ebe2ef96/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L1123-L1124], but they're available in hadoop-common and should have been loaded.
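The classpath probe behind this error can be illustrated with a small, self-contained sketch. This is not Flink's code — the class name `ClasspathProbe` and the helper `isOnClasspath` are hypothetical — but it shows the mechanism: a class is "in the classpath" only relative to a particular classloader, so a class bundled inside an isolated plugin classloader is not visible to the loader performing the check.

```java
// Hypothetical sketch of a reflective classpath probe, similar in spirit to
// the check in org.apache.flink.core.fs.FileSystem that reports
// "Hadoop is not in the classpath/dependencies".
public class ClasspathProbe {

    // Returns true if the given class can be resolved by the given classloader
    // (without initializing it).
    static boolean isOnClasspath(String className, ClassLoader cl) {
        try {
            Class.forName(className, false, cl);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader appLoader = ClasspathProbe.class.getClassLoader();

        // Core JDK classes are always resolvable.
        System.out.println(isOnClasspath("java.lang.String", appLoader));

        // Hadoop is not on this JVM's classpath, so the probe fails -- the
        // situation in which Flink raises UnsupportedFileSystemSchemeException,
        // even if the same class is visible inside a plugin's own classloader.
        System.out.println(isOnClasspath("org.apache.hadoop.conf.Configuration", appLoader));
    }
}
```

Running this on a JVM without Hadoop on the classpath prints `true` then `false`, which is why dropping `hadoop-common` only into a plugin folder does not satisfy a check performed against the parent classloader.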
[jira] [Created] (FLINK-23126) Refactor smoke-e2e into smoke-e2e-common and smoke-e2e-embedded
Evans Ye created FLINK-23126: Summary: Refactor smoke-e2e into smoke-e2e-common and smoke-e2e-embedded Key: FLINK-23126 URL: https://issues.apache.org/jira/browse/FLINK-23126 Project: Flink Issue Type: Sub-task Components: Stateful Functions, Tests Reporter: Evans Ye This JIRA focuses on refactoring the existing statefun-smoke-e2e module into:
* statefun-smoke-e2e-common (E2E testing framework such as source, sink, verification, etc.)
* statefun-smoke-e2e-embedded (embedded Java function)
[jira] [Created] (FLINK-23125) Run StateFun smoke E2E tests for multiple language SDKs
Evans Ye created FLINK-23125: Summary: Run StateFun smoke E2E tests for multiple language SDKs Key: FLINK-23125 URL: https://issues.apache.org/jira/browse/FLINK-23125 Project: Flink Issue Type: Improvement Components: Stateful Functions, Tests Reporter: Evans Ye Currently, the statefun-smoke-e2e module only tests the embedded function in the Java language. This JIRA aims to refactor the existing code into self-contained modules so that we can easily compose the testing framework with different language SDKs. The design will look like:
1. statefun-smoke-e2e-common (core testing framework such as source, sink, verification, etc.)
2. statefun-smoke-e2e-embedded
3. statefun-smoke-e2e-java
...
Re: [DISCUSS] [FLINK-23122] Provide the Dynamic register converter
Hi, `TIMESTAMP_WITH_TIME_ZONE` is not supported in the Flink SQL engine, even though it is listed in the type API. I think what you are looking for is the RawValueType, which can be used as a user-defined type. You can use `DataTypes.RAW(TypeInformation)` to define a Raw type with the given TypeInformation, which includes the serializer and deserializer. Best, Jark On Wed, 23 Jun 2021 at 21:09, 云华 wrote: > > Hi everyone, > I want to rework the type conversion system in the connector and flink table > modules to be reusable and scalable. > In the Postgres system, the type '_citext' is not supported in > org.apache.flink.connector.jdbc.catalog.PostgresCatalog#fromJDBCType. > What's more, > org.apache.flink.table.runtime.typeutils.InternalSerializers#createInternal > cannot support TIMESTAMP_WITH_TIME_ZONE. > For more background and the API design: > https://issues.apache.org/jira/browse/FLINK-23122. > Please let me know if this matches your thoughts. > > > > Regards, Jack
[jira] [Created] (FLINK-23124) Implement exactly-once Kafka Sink
Fabian Paul created FLINK-23124: --- Summary: Implement exactly-once Kafka Sink Key: FLINK-23124 URL: https://issues.apache.org/jira/browse/FLINK-23124 Project: Flink Issue Type: Sub-task Reporter: Fabian Paul
[jira] [Created] (FLINK-23123) Implement at-least-once Kafka Sink
Fabian Paul created FLINK-23123: --- Summary: Implement at-least-once Kafka Sink Key: FLINK-23123 URL: https://issues.apache.org/jira/browse/FLINK-23123 Project: Flink Issue Type: Sub-task Reporter: Fabian Paul
Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.
Hi Xintong, +1 to the proposal. To better comply with the rule, we should describe the best practice for handling a test failure that seems unrelated to the current commits: how do we avoid merging PRs with test failures without blocking code merges for a long time? I tried to think through the possible steps and found some detailed problems that need further discussion:
1. Report the test failure in JIRA.
2. Set a deadline for finding the root cause and fixing the failure in the newly created JIRA, because we cannot block other commit merges for a long time. What is a reasonable deadline here?
3. What do we do if the JIRA has not made significant progress by the deadline? There are several situations, and different cases may need different approaches:
   1. the JIRA is not assigned yet
   2. the root cause has not been found yet
   3. the root cause is known, but no good solution has been found
   4. a solution has been found, but it needs more time to be done
4. If we disable the respective tests temporarily, we also need a mechanism to ensure the issue keeps being investigated.
Best regards, JING ZHANG Stephan Ewen 于2021年6月23日周三 下午8:16写道: > +1 to Xintong's proposal > > On Wed, Jun 23, 2021 at 1:53 PM Till Rohrmann > wrote: > > > I would first try to not introduce the exception for local builds. It > makes > > it quite hard for others to verify the build and to make sure that the > > right things were executed. If we see that this becomes an issue then we > > can revisit this idea. > > > > Cheers, > > Till > > > > On Wed, Jun 23, 2021 at 4:19 AM Yangze Guo wrote: > > > > > +1 for appending this to community guidelines for merging PRs. > > > > > > @Till Rohrmann > > > I agree that with this approach unstable tests will not block other > > > commit merges. However, it might be hard to prevent merging commits > > > that are related to those tests and should have been passed them. 
It's > > > true that this judgment can be made by the committers, but no one can > > > ensure the judgment is always precise and so that we have this > > > discussion thread. > > > > > > Regarding the unstable tests, how about adding another exception: > > > committers verify it in their local environment and comment in such > > > cases? > > > > > > Best, > > > Yangze Guo > > > > > > On Tue, Jun 22, 2021 at 8:23 PM 刘建刚 wrote: > > > > > > > > It is a good principle to run all tests successfully with any change. > > > This > > > > means a lot for project's stability and development. I am big +1 for > > this > > > > proposal. > > > > > > > > Best > > > > liujiangang > > > > > > > > Till Rohrmann 于2021年6月22日周二 下午6:36写道: > > > > > > > > > One way to address the problem of regularly failing tests that > block > > > > > merging of PRs is to disable the respective tests for the time > being. > > > Of > > > > > course, the failing test then needs to be fixed. But at least that > > way > > > we > > > > > would not block everyone from making progress. > > > > > > > > > > Cheers, > > > > > Till > > > > > > > > > > On Tue, Jun 22, 2021 at 12:00 PM Arvid Heise > > wrote: > > > > > > > > > > > I think this is overall a good idea. So +1 from my side. > > > > > > However, I'd like to put a higher priority on infrastructure > then, > > in > > > > > > particular docker image/artifact caches. > > > > > > > > > > > > On Tue, Jun 22, 2021 at 11:50 AM Till Rohrmann < > > trohrm...@apache.org > > > > > > > > > > wrote: > > > > > > > > > > > > > Thanks for bringing this topic to our attention Xintong. I > think > > > your > > > > > > > proposal makes a lot of sense and we should follow it. It will > > > give us > > > > > > > confidence that our changes are working and it might be a good > > > > > incentive > > > > > > to > > > > > > > quickly fix build instabilities. Hence, +1. 
> > > > > > > > > > > > > > Cheers, > > > > > > > Till > > > > > > > > > > > > > > On Tue, Jun 22, 2021 at 11:12 AM Xintong Song < > > > tonysong...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > In the past a couple of weeks, I've observed several times > that > > > PRs > > > > > are > > > > > > > > merged without a green light from the CI tests, where failure > > > cases > > > > > are > > > > > > > > considered *unrelated*. This may not always cause problems, > but > > > would > > > > > > > > increase the chance of breaking our code base. In fact, it > has > > > > > occurred > > > > > > > to > > > > > > > > me twice in the past few weeks that I had to revert a commit > > > which > > > > > > breaks > > > > > > > > the master branch due to this. > > > > > > > > > > > > > > > > I think it would be nicer to enforce a stricter rule, that no > > PRs > > > > > > should > > > > > > > be > > > > > > > > merged without passing CI. > > > > > > > > > > > > > > > > The problems of merging PRs with "unrelated" test failures > are: >
Re: [DISCUSS] Incrementally deprecating the DataSet API
If we want to publicize this plan more, shouldn't we have a rough timeline for when 2.0 is on the table? On 6/23/2021 2:44 PM, Stephan Ewen wrote: Thanks for writing this up, this also reflects my understanding. I think a blog post would be nice, ideally with an explicit call for feedback so we learn about user concerns. A blog post has a lot more reach than an ML thread. Best, Stephan On Wed, Jun 23, 2021 at 12:23 PM Timo Walther wrote: Hi everyone, I'm sending this email to make sure everyone is on the same page about slowly deprecating the DataSet API. There have been a few thoughts mentioned in presentations, offline discussions, and JIRA issues. However, I have observed that there are still some concerns or different opinions on what steps are necessary to implement this change. Let me summarize some of the steps and assumptions and let's have a discussion about it: Step 1: Introduce a batch mode for Table API (FLIP-32) [DONE in 1.9] Step 2: Introduce a batch mode for DataStream API (FLIP-134) [DONE in 1.12] Step 3: Soft deprecate DataSet API (FLIP-131) [DONE in 1.12] We updated the documentation recently to make this deprecation even more visible. There is a dedicated `(Legacy)` label right next to the menu item now. We won't deprecate concrete classes of the API with a @Deprecated annotation to avoid extensive warnings in logs until then. Step 4: Drop the legacy SQL connectors and formats (FLINK-14437) [DONE in 1.14] We dropped code for ORC, Parquet, and HBase formats that were only used by DataSet API users. The removed classes had no documentation and were not annotated with one of our API stability annotations. The old functionality should be available through the new sources and sinks for Table API and DataStream API. If not, we should bring them into a shape that they can be a full replacement. DataSet users are encouraged to either upgrade the API or use Flink 1.13. 
Users can either just stay at Flink 1.13 or copy only the format's code to a newer Flink version. We aim to keep the core interfaces (i.e. InputFormat and OutputFormat) stable until the next major version. We will maintain/allow important contributions to dropped connectors in 1.13. So 1.13 could be considered as kind of a DataSet API LTS release. Step 5: Drop the legacy SQL planner (FLINK-14437) [DONE in 1.14] This included dropping support of DataSet API with SQL. Step 6: Connect both Table and DataStream API in batch mode (FLINK-20897) [PLANNED in 1.14] Step 7: Reach feature parity of Table API/DataStream API with DataSet API [PLANNED for 1.14++] We need to identify blockers when migrating from DataSet API to Table API/DataStream API. Here we need to establish a good feedback pipeline to include DataSet users in the roadmap planning. Step 8: Drop the Gelly library No concrete plan yet. Latest would be the next major Flink version aka Flink 2.0. Step 9: Drop DataSet API Planned for the next major Flink version aka Flink 2.0. Please let me know if this matches your thoughts. We can also convert this into a blog post or mention it in the next release notes. Regards, Timo
[DISCUSS] [FLINK-23122] Provide the Dynamic register converter
Hi everyone, I want to rework the type conversion system in the connector and flink table modules to be reusable and scalable. In the Postgres system, the type '_citext' is not supported in org.apache.flink.connector.jdbc.catalog.PostgresCatalog#fromJDBCType. What's more, org.apache.flink.table.runtime.typeutils.InternalSerializers#createInternal cannot support TIMESTAMP_WITH_TIME_ZONE. For more background and the API design: https://issues.apache.org/jira/browse/FLINK-23122. Please let me know if this matches your thoughts. Regards, Jack
[jira] [Created] (FLINK-23122) Provide the Dynamic register converter
lqjacklee created FLINK-23122: - Summary: Provide the Dynamic register converter Key: FLINK-23122 URL: https://issues.apache.org/jira/browse/FLINK-23122 Project: Flink Issue Type: Improvement Components: Connectors / Common, Connectors / HBase, Connectors / Hive, Connectors / JDBC, Connectors / ORC, Table SQL / API Affects Versions: 1.14.0 Reporter: lqjacklee
Background: Type conversion is at the core of data conversion between Flink and external data sources. By default, Flink provides type conversion per connector, so the conversion logic is scattered across the implementations of multiple connectors. This makes it hard to reuse across the Flink system. Secondly, because of the diversity of data source types, the existing conversions need to be extended, yet they cannot be extended dynamically. Finally, the core conversion logic needs to be reused in multiple projects, so the proposal is to abstract it into a unified component: applications depend on the same type conversion system, and different sub-components can dynamically register additional conversions.
1, ConvertServiceRegister : provides the register and lookup functions.
{code:java}
public interface ConvertServiceRegister {
    void register(ConversionService conversionService);
    void register(ConversionServiceFactory conversionServiceFactory);
    void register(ConversionServiceSet conversionServiceSet);

    Collection convertServices();
    Collection convertServices(String group);
    Collection convertServiceSets();
    Collection convertServiceSets(String group);
}
{code}
2, ConversionService : provides the conversion implementation.
{code:java}
public interface ConversionService extends Order {
    Set tags();

    boolean canConvert(TypeInformationHolder source, TypeInformationHolder target) throws ConvertException;

    Object convert(
            TypeInformationHolder sourceType,
            Object source,
            TypeInformationHolder targetType,
            Object defaultValue,
            boolean nullable) throws ConvertException;
}
{code}
3, ConversionServiceFactory : provides the conversion service factory function.
{code:java}
public interface ConversionServiceFactory extends Order {
    Set tags();

    ConversionService getConversionService(T target) throws ConvertException;
}
{code}
4, ConversionServiceSet : provides group management.
{code:java}
public interface ConversionServiceSet extends Loadable {
    Set tags();

    Collection conversionServices();

    boolean support(TypeInformationHolder source, TypeInformationHolder target) throws ConvertException;

    Object convert(
            String name,
            TypeInformationHolder typeInformationHolder,
            Object value,
            TypeInformationHolder type,
            Object defaultValue,
            boolean nullable) throws ConvertException;
}
{code}
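The dynamic-registration idea behind these interfaces can be sketched with a minimal, self-contained example. All names here (`ConverterRegistry`, the string-keyed lookup) are hypothetical and much simpler than the API proposed in the ticket; the point is only that converters registered at runtime under a (source, target) key let sub-components extend the supported conversions without touching a central switch statement.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of dynamically registered type converters;
// not the interfaces proposed in FLINK-23122.
public class ConverterRegistry {

    // Key: "sourceType->targetType"; value: the conversion function.
    private final Map<String, Function<Object, Object>> converters = new HashMap<>();

    public void register(String source, String target, Function<Object, Object> fn) {
        converters.put(source + "->" + target, fn);
    }

    public boolean canConvert(String source, String target) {
        return converters.containsKey(source + "->" + target);
    }

    public Object convert(String source, String target, Object value) {
        Function<Object, Object> fn = converters.get(source + "->" + target);
        if (fn == null) {
            throw new IllegalArgumentException(
                    "No converter registered for " + source + " -> " + target);
        }
        return fn.apply(value);
    }

    public static void main(String[] args) {
        ConverterRegistry registry = new ConverterRegistry();
        // A connector could register a Postgres-specific conversion at
        // runtime, e.g. mapping the '_citext' type to a plain Java String.
        registry.register("citext", "java.lang.String", Object::toString);

        System.out.println(registry.canConvert("citext", "java.lang.String")); // true
        System.out.println(registry.convert("citext", "java.lang.String", 42)); // 42
    }
}
```

A real design would key on richer type descriptors (like the TypeInformationHolder above) and support grouping and factories, but the registry-plus-lookup core is the same.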
Re: [DISCUSS] Incrementally deprecating the DataSet API
Thanks for writing this up, this also reflects my understanding. I think a blog post would be nice, ideally with an explicit call for feedback so we learn about user concerns. A blog post has a lot more reach than an ML thread. Best, Stephan On Wed, Jun 23, 2021 at 12:23 PM Timo Walther wrote: > Hi everyone, > > I'm sending this email to make sure everyone is on the same page about > slowly deprecating the DataSet API. > > There have been a few thoughts mentioned in presentations, offline > discussions, and JIRA issues. However, I have observed that there are > still some concerns or different opinions on what steps are necessary to > implement this change. > > Let me summarize some of the steps and assumptions and let's have a > discussion about it: > > Step 1: Introduce a batch mode for Table API (FLIP-32) > [DONE in 1.9] > > Step 2: Introduce a batch mode for DataStream API (FLIP-134) > [DONE in 1.12] > > Step 3: Soft deprecate DataSet API (FLIP-131) > [DONE in 1.12] > > We updated the documentation recently to make this deprecation even more > visible. There is a dedicated `(Legacy)` label right next to the menu > item now. > > We won't deprecate concrete classes of the API with a @Deprecated > annotation to avoid extensive warnings in logs until then. > > Step 4: Drop the legacy SQL connectors and formats (FLINK-14437) > [DONE in 1.14] > > We dropped code for ORC, Parquet, and HBase formats that were only used > by DataSet API users. The removed classes had no documentation and were > not annotated with one of our API stability annotations. > > The old functionality should be available through the new sources and > sinks for Table API and DataStream API. If not, we should bring them > into a shape that they can be a full replacement. > > DataSet users are encouraged to either upgrade the API or use Flink > 1.13. Users can either just stay at Flink 1.13 or copy only the format's > code to a newer Flink version. We aim to keep the core interfaces (i.e. 
> InputFormat and OutputFormat) stable until the next major version. > > We will maintain/allow important contributions to dropped connectors in > 1.13. So 1.13 could be considered as kind of a DataSet API LTS release. > > Step 5: Drop the legacy SQL planner (FLINK-14437) > [DONE in 1.14] > > This included dropping support of DataSet API with SQL. > > Step 6: Connect both Table and DataStream API in batch mode (FLINK-20897) > [PLANNED in 1.14] > > Step 7: Reach feature parity of Table API/DataStream API with DataSet API > [PLANNED for 1.14++] > > We need to identify blockers when migrating from DataSet API to Table > API/DataStream API. Here we need to establish a good feedback pipeline > to include DataSet users in the roadmap planning. > > Step 8: Drop the Gelly library > > No concrete plan yet. Latest would be the next major Flink version aka > Flink 2.0. > > Step 9: Drop DataSet API > > Planned for the next major Flink version aka Flink 2.0. > > > Please let me know if this matches your thoughts. We can also convert > this into a blog post or mention it in the next release notes. > > Regards, > Timo > >
Re: [DISCUSS] Feedback Collection Jira Bot
Hi Konstantin, Chesnay, > I would like it to not unassign people if a PR is open. These are > usually blocked by the reviewer, not the assignee, and having the > assignees now additionally having to update JIRA periodically is a bit > like rubbing salt into the wound. I agree with Chesnay about not un-assigning an issue if a PR is open. Besides, could assignees remove the "stale-assigned" label themselves? It seems assignees have no permission to remove the label if they did not create the issue. Best regards, JING ZHANG Konstantin Knauf 于2021年6月23日周三 下午4:17写道: > > I agree there are such tickets, but I don't see how this is addressing my > concerns. There are also tickets that just shouldn't be closed as I > described above. Why do you think that duplicating tickets and losing > discussions/knowledge is a good solution? > > I don't understand why we are necessarily losing discussion/knowledge. The > tickets are still there, just in "Closed" state, which are included in > default Jira search. We could of course just add a label, but closing seems > clearer to me given that likely this ticket will not get committer attention > in the foreseeable future. > > > I would like to avoid having to constantly fight against the bot. It's > already responsible for the majority of my daily emails, with quite little > benefit for me personally. I initially thought that after some period of > time it will settle down, but now I'm afraid it won't happen. > > Can you elaborate which rules you are running into mostly? I'd rather like > to understand how we work right now and where this conflicts with the Jira > bot vs slowly disabling the jira bot via labels. > > On Wed, Jun 23, 2021 at 10:00 AM Piotr Nowojski > wrote: > > > Hi Konstantin, > > > > > In my opinion it is important that we close tickets eventually. There > are > > a > > > lot of tickets (bugs, improvements, tech debt) that over time became > > > irrelevant, out-of-scope, irreproducible, etc. 
In my experience, these > > > tickets are usually not closed by anyone but the bot. > > > > I agree there are such tickets, but I don't see how this is addressing my > > concerns. There are also tickets that just shouldn't be closed as I > > described above. Why do you think that duplicating tickets and losing > > discussions/knowledge is a good solution? > > > > I would like to avoid having to constantly fight against the bot. It's > > already responsible for the majority of my daily emails, with quite > little > > benefit for me personally. I initially thought that after some period of > > time it will settle down, but now I'm afraid it won't happen. Can we add > > some label to mark tickets to be ignored by the jira-bot? > > > > Best, > > Piotrek > > > > śr., 23 cze 2021 o 09:40 Chesnay Schepler > napisał(a): > > > > > I would like it to not unassign people if a PR is open. These are > > > usually blocked by the reviewer, not the assignee, and having the > > > assignees now additionally having to update JIRA periodically is a bit > > > like rubbing salt into the wound. > > > > > > On 6/23/2021 7:52 AM, Konstantin Knauf wrote: > > > > Hi everyone, > > > > > > > > I was hoping for more feedback from other committers, but seems like > > this > > > > is not happening, so here's my proposal for immediate changes: > > > > > > > > * Ignore tickets with a fixVersion for all rules but the > > stale-unassigned > > > > role. > > > > > > > > * We change the time intervals as follows, accepting reality a bit > more > > > ;) > > > > > > > > * stale-assigned only after 30 days (instead of 14 days) > > > > * stale-critical only after 14 days (instead of 7 days) > > > > * stale-major only after 60 days (instead of 30 days) > > > > > > > > Unless there are -1s, I'd implement the changes Monday next week. 
> > > > > > > > Cheers, > > > > > > > > Konstantin > > > > > > > > On Thu, Jun 17, 2021 at 2:17 PM Piotr Nowojski > > > > wrote: > > > > > > > >> Hi, > > > >> > > > >> I also think that the bot is a bit too aggressive/too quick with > > > assigning > > > >> stale issues/deprioritizing them, but that's not that big of a deal > > for > > > me. > > > >> > > > >> What bothers me much more is that it's closing minor issues > > > automatically. > > > >> Depriotising issues makes sense to me. If a wish for improvement or > a > > > bug > > > >> report has been opened a long time ago, and they got no attention > over > > > the > > > >> time, sure depriotize them. But closing them is IMO a bad idea. Bug > > > might > > > >> be minor, but if it's not fixed it's still there - it shouldn't be > > > closed. > > > >> Closing with "won't fix" should be done for very good reasons and > very > > > >> rarely. Same applies to improvements/wishes. Furthermore, very often > > > >> descriptions and comments have a lot of value, and if we keep > closing > > > minor > > > >> issues I'm afraid that we end up with: > > > >> - more duplication. I doubt anyone will be looking for prior >
Re: [DISCUSS] Drop Mesos in 1.14
I would prefer to remove Mesos from the Flink core as well. I also had a similar thought to Seth's: As far as I know, you can package applications to run on Mesos with "Marathon". That would be like deploying an opaque Flink standalone cluster on Mesos. The implication is similar to going from an active integration to a standalone cluster (like from native Flink Kubernetes Application Deployment to a Standalone Application Deployment on Kubernetes): You need to make sure the number of TMs / slots and the parallelism fit together (or use the new reactive mode). Other than that, I think it should work well for streaming jobs. Having a Flink-Marathon template in https://flink-packages.org/ would be a nice thing for Mesos users. @Oleksandr What do you think about that? On Wed, Jun 23, 2021 at 11:31 AM Leonard Xu wrote: > + 1 for dropping Mesos. I checked both the commit history and the mailing list; Mesos-related issues/user questions have rarely appeared. > > Best, > Leonard > >
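For reference, deploying a standalone Flink process via Marathon boils down to an app definition along these lines. This is a hedged sketch, not a tested template: the app id, image tag, resource sizes, and instance count are placeholder assumptions, and a matching second app would run the TaskManager with the desired number of instances/slots.

```json
{
  "id": "/flink/jobmanager",
  "cmd": "/opt/flink/bin/jobmanager.sh start-foreground",
  "cpus": 1.0,
  "mem": 2048,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "flink:1.13"
    }
  }
}
```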
Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.
+1 to Xintong's proposal On Wed, Jun 23, 2021 at 1:53 PM Till Rohrmann wrote: > I would first try to not introduce the exception for local builds. It makes > it quite hard for others to verify the build and to make sure that the > right things were executed. If we see that this becomes an issue then we > can revisit this idea. > > Cheers, > Till > > On Wed, Jun 23, 2021 at 4:19 AM Yangze Guo wrote: > > > +1 for appending this to community guidelines for merging PRs. > > > > @Till Rohrmann > > I agree that with this approach unstable tests will not block other > > commit merges. However, it might be hard to prevent merging commits > > that are related to those tests and should have been passed them. It's > > true that this judgment can be made by the committers, but no one can > > ensure the judgment is always precise and so that we have this > > discussion thread. > > > > Regarding the unstable tests, how about adding another exception: > > committers verify it in their local environment and comment in such > > cases? > > > > Best, > > Yangze Guo > > > > On Tue, Jun 22, 2021 at 8:23 PM 刘建刚 wrote: > > > > > > It is a good principle to run all tests successfully with any change. > > This > > > means a lot for project's stability and development. I am big +1 for > this > > > proposal. > > > > > > Best > > > liujiangang > > > > > > Till Rohrmann 于2021年6月22日周二 下午6:36写道: > > > > > > > One way to address the problem of regularly failing tests that block > > > > merging of PRs is to disable the respective tests for the time being. > > Of > > > > course, the failing test then needs to be fixed. But at least that > way > > we > > > > would not block everyone from making progress. > > > > > > > > Cheers, > > > > Till > > > > > > > > On Tue, Jun 22, 2021 at 12:00 PM Arvid Heise > wrote: > > > > > > > > > I think this is overall a good idea. So +1 from my side. 
> > > > > However, I'd like to put a higher priority on infrastructure then, > in > > > > > particular docker image/artifact caches. > > > > > > > > > > On Tue, Jun 22, 2021 at 11:50 AM Till Rohrmann < > trohrm...@apache.org > > > > > > > > wrote: > > > > > > > > > > > Thanks for bringing this topic to our attention Xintong. I think > > your > > > > > > proposal makes a lot of sense and we should follow it. It will > > give us > > > > > > confidence that our changes are working and it might be a good > > > > incentive > > > > > to > > > > > > quickly fix build instabilities. Hence, +1. > > > > > > > > > > > > Cheers, > > > > > > Till > > > > > > > > > > > > On Tue, Jun 22, 2021 at 11:12 AM Xintong Song < > > tonysong...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > In the past a couple of weeks, I've observed several times that > > PRs > > > > are > > > > > > > merged without a green light from the CI tests, where failure > > cases > > > > are > > > > > > > considered *unrelated*. This may not always cause problems, but > > would > > > > > > > increase the chance of breaking our code base. In fact, it has > > > > occurred > > > > > > to > > > > > > > me twice in the past few weeks that I had to revert a commit > > which > > > > > breaks > > > > > > > the master branch due to this. > > > > > > > > > > > > > > I think it would be nicer to enforce a stricter rule, that no > PRs > > > > > should > > > > > > be > > > > > > > merged without passing CI. > > > > > > > > > > > > > > The problems of merging PRs with "unrelated" test failures are: > > > > > > > - It's not always straightforward to tell whether a test > > failures are > > > > > > > related or not. > > > > > > > - It prevents subsequent test cases from being executed, which > > may > > > > fail > > > > > > > relating to the PR changes. 
> > > > > > > > > > > > > > To make things easier for the committers, the following > > exceptions > > > > > might > > > > > > be > > > > > > > considered acceptable. > > > > > > > - The PR has passed CI in the contributor's personal workspace. > > > > Please > > > > > > post > > > > > > > the link in such cases. > > > > > > > - The CI tests have been triggered multiple times, on the same > > > > commit, > > > > > > and > > > > > > > each stage has at least passed for once. Please also comment in > > such > > > > > > cases. > > > > > > > > > > > > > > If we all agree on this, I'd update the community guidelines > for > > > > > merging > > > > > > > PRs wrt. this proposal. [1] > > > > > > > > > > > > > > Please let me know what do you think. > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests > > > > > > > > > > > > > > > > > > > > > > > > >
Re: Change in accumulators semantics with JobClient
Yes, it should be part of the release notes where this change was introduced. I'll take a look at your PR. Thanks a lot Etienne. Cheers, Till On Wed, Jun 23, 2021 at 12:29 PM Etienne Chauchot wrote: > Hi Till, > > Of course I can update the release notes. > > Question is: this change is quite old (January), it is already available > in all the maintained releases :1.11, 1.12, 1.13. > > I think I should update the release notes for all these versions no ? > > In case you agree, I took the liberty to update all these release notes > in a PR: https://github.com/apache/flink/pull/16256 > > Cheers, > > Etienne > > On 21/06/2021 11:39, Till Rohrmann wrote: > > Thanks for bringing this to the dev ML Etienne. Could you maybe update > the > > release notes for Flink 1.13 [1] to include this change? That way it > might > > be a bit more prominent. I think the change needs to go into the > > release-1.13 and master branch. > > > > [1] > > > https://github.com/apache/flink/blob/master/docs/content/release-notes/flink-1.13.md > > > > Cheers, > > Till > > > > > > On Fri, Jun 18, 2021 at 2:45 PM Etienne Chauchot > > wrote: > > > >> Hi all, > >> > >> I did a fix some time ago regarding accumulators: > >> the/JobClient.getAccumulators()/ was infinitely blocking in local > >> environment for a streaming job (1). The change (2) consisted of giving > >> the current accumulators value for the running job. And when fixing this > >> in the PR, it appeared that I had to change the accumulators semantics > >> with /JobClient/ and I just realized that I forgot to bring this back to > >> the ML: > >> > >> Previously /JobClient/ assumed that getAccumulator() was called on a > >> bounded pipeline and that the user wanted to acquire the *final > >> accumulator values* after the job is finished. > >> > >> But now it returns the *current value of accumulators* immediately to be > >> compatible with unbounded pipelines. 
> >> > >> If it is run on a bounded pipeline, then to get the final accumulator > >> values after the job is finished, one needs to call > >> > >> > /getJobExecutionResult().thenApply(JobExecutionResult::getAllAccumulatorResults)/ > >> > >> (1): https://issues.apache.org/jira/browse/FLINK-18685 > >> > >> (2): https://github.com/apache/flink/pull/14558# > >> > >> > >> Cheers, > >> > >> Etienne > >> > >> >
Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.
I would first try to not introduce the exception for local builds. It makes it quite hard for others to verify the build and to make sure that the right things were executed. If we see that this becomes an issue then we can revisit this idea. Cheers, Till On Wed, Jun 23, 2021 at 4:19 AM Yangze Guo wrote: > +1 for appending this to community guidelines for merging PRs. > > @Till Rohrmann > I agree that with this approach unstable tests will not block other > commit merges. However, it might be hard to prevent merging commits > that are related to those tests and should have been passed them. It's > true that this judgment can be made by the committers, but no one can > ensure the judgment is always precise and so that we have this > discussion thread. > > Regarding the unstable tests, how about adding another exception: > committers verify it in their local environment and comment in such > cases? > > Best, > Yangze Guo > > On Tue, Jun 22, 2021 at 8:23 PM 刘建刚 wrote: > > > > It is a good principle to run all tests successfully with any change. > This > > means a lot for project's stability and development. I am big +1 for this > > proposal. > > > > Best > > liujiangang > > > > Till Rohrmann 于2021年6月22日周二 下午6:36写道: > > > > > One way to address the problem of regularly failing tests that block > > > merging of PRs is to disable the respective tests for the time being. > Of > > > course, the failing test then needs to be fixed. But at least that way > we > > > would not block everyone from making progress. > > > > > > Cheers, > > > Till > > > > > > On Tue, Jun 22, 2021 at 12:00 PM Arvid Heise wrote: > > > > > > > I think this is overall a good idea. So +1 from my side. > > > > However, I'd like to put a higher priority on infrastructure then, in > > > > particular docker image/artifact caches. > > > > > > > > On Tue, Jun 22, 2021 at 11:50 AM Till Rohrmann > > > > > wrote: > > > > > > > > > Thanks for bringing this topic to our attention Xintong. 
I think > your > > > > > proposal makes a lot of sense and we should follow it. It will > give us > > > > > confidence that our changes are working and it might be a good > > > incentive > > > > to > > > > > quickly fix build instabilities. Hence, +1. > > > > > > > > > > Cheers, > > > > > Till > > > > > > > > > > On Tue, Jun 22, 2021 at 11:12 AM Xintong Song < > tonysong...@gmail.com> > > > > > wrote: > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > In the past a couple of weeks, I've observed several times that > PRs > > > are > > > > > > merged without a green light from the CI tests, where failure > cases > > > are > > > > > > considered *unrelated*. This may not always cause problems, but > would > > > > > > increase the chance of breaking our code base. In fact, it has > > > occurred > > > > > to > > > > > > me twice in the past few weeks that I had to revert a commit > which > > > > breaks > > > > > > the master branch due to this. > > > > > > > > > > > > I think it would be nicer to enforce a stricter rule, that no PRs > > > > should > > > > > be > > > > > > merged without passing CI. > > > > > > > > > > > > The problems of merging PRs with "unrelated" test failures are: > > > > > > - It's not always straightforward to tell whether a test > failures are > > > > > > related or not. > > > > > > - It prevents subsequent test cases from being executed, which > may > > > fail > > > > > > relating to the PR changes. > > > > > > > > > > > > To make things easier for the committers, the following > exceptions > > > > might > > > > > be > > > > > > considered acceptable. > > > > > > - The PR has passed CI in the contributor's personal workspace. > > > Please > > > > > post > > > > > > the link in such cases. > > > > > > - The CI tests have been triggered multiple times, on the same > > > commit, > > > > > and > > > > > > each stage has at least passed for once. Please also comment in > such > > > > > cases. 
> > > > > > > > > > > > If we all agree on this, I'd update the community guidelines for > > > > merging > > > > > > PRs wrt. this proposal. [1] > > > > > > > > > > > > Please let me know what do you think. > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests > > > > > > > > > > > > > > > > > > >
Re: [DISCUSS] Feedback Collection Jira Bot
> I don't understand why we are necessarily losing discussion/knowledge. The > tickets are still there, just in "Closed" state, which are included in > default Jira search. Finding out whether a ticket has already been opened for a given issue is not always easy. Finding the right ticket among 23086 tickets is 7 times as difficult/time-consuming as among 3305 open tickets. If a piece of knowledge/discussion is not easily accessible, it's effectively lost. > We could of course just add a label, but closing seems > clearer to me given that likely this ticket will not get committer attention > in the foreseeable future. There are tickets that are waiting to get enough traction (bugs, improvements, ideas, test instabilities); I know plenty of those. If they are brought up frequently enough, they will finally get the needed attention. Until that happens, I don't want to lose the descriptions, the previous discussions, and/or the frequency of past occurrences. Can I ask why you think it makes sense to close those tickets, besides it "being clearer" to you? What use case justifies this? And I don't agree it's clearer: if the issue is still there, it shouldn't be in the "Closed" state. > Can you elaborate which rules you are running into mostly? I'd rather like > to understand how we work right now and where this conflicts with the Jira > bot vs slowly disabling the jira bot via labels. I didn't count them, but I think stale-critical -> stale-major -> stale-minor -> auto-closing is what I run into the most. If a ticket is not relevant anymore, I've learned to manually close it/clean it up immediately once the jira-bot pings about it, regardless of the priority. But so far, I've closed fewer tickets than I was forced to re-open. Maybe this is because I'm tracking all of the tickets that are of interest to my team? Maybe others are not doing that, and that's why you are not seeing this problem that I'm having? But keep in mind: I don't mind the auto-deprioritization. 
It's fair to say that tickets get automatically deprioritised if they have no attention. But why do we have to automatically close the least priority ones? Maybe another idea. Instead of disabling closing the tickets via some label, we could also achieve the same thing with a dedicated lowest priority state "on hold"/"frozen". Piotrek śr., 23 cze 2021 o 10:17 Konstantin Knauf napisał(a): > > I agree there are such tickets, but I don't see how this is addressing my > concerns. There are also tickets that just shouldn't be closed as I > described above. Why do you think that duplicating tickets and losing > discussions/knowledge is a good solution? > > I don't understand why we are necessarily losing discussion/knowledge. The > tickets are still there, just in "Closed" state, which are included in > default Jira search. We could of course just add a label, but closing seems > clearer to me given that likely this ticket will not get comitter attention > in the foreseeable future. > > > I would like to avoid having to constantly fight against the bot. It's > already responsible for the majority of my daily emails, with quite little > benefit for me personally. I initially thought that after some period of > time it will settle down, but now I'm afraid it won't happen. > > Can you elaborate which rules you are running into mostly? I'd rather like > to understand how we work right now and where this conflicts with the Jira > bot vs slowly disabling the jira bot via labels. > > On Wed, Jun 23, 2021 at 10:00 AM Piotr Nowojski > wrote: > > > Hi Konstantin, > > > > > In my opinion it is important that we close tickets eventually. There > are > > a > > > lot of tickets (bugs, improvements, tech debt) that over time became > > > irrelevant, out-of-scope, irreproducible, etc. In my experience, these > > > tickets are usually not closed by anyone but the bot. > > > > I agree there are such tickets, but I don't see how this is addressing my > > concerns. 
There are also tickets that just shouldn't be closed as I > > described above. Why do you think that duplicating tickets and losing > > discussions/knowledge is a good solution? > > > > I would like to avoid having to constantly fight against the bot. It's > > already responsible for the majority of my daily emails, with quite > little > > benefit for me personally. I initially thought that after some period of > > time it will settle down, but now I'm afraid it won't happen. Can we add > > some label to mark tickets to be ignored by the jira-bot? > > > > Best, > > Piotrek > > > > śr., 23 cze 2021 o 09:40 Chesnay Schepler > napisał(a): > > > > > I would like it to not unassign people if a PR is open. These are > > > usually blocked by the reviewer, not the assignee, and having the > > > assignees now additionally having to update JIRA periodically is a bit > > > like rubbing salt into the wound. > > > > > > On 6/23/2021 7:52 AM, Konstantin Knauf wrote: > > > > Hi everyone, > > > > > > > > I was hoping for mo
Re: Change in accumulators semantics with JobClient
Hi Till, Of course I can update the release notes. Question is: this change is quite old (January), and it is already available in all the maintained releases: 1.11, 1.12, 1.13. I think I should update the release notes for all these versions, no? In case you agree, I took the liberty to update all these release notes in a PR: https://github.com/apache/flink/pull/16256 Cheers, Etienne On 21/06/2021 11:39, Till Rohrmann wrote: Thanks for bringing this to the dev ML Etienne. Could you maybe update the release notes for Flink 1.13 [1] to include this change? That way it might be a bit more prominent. I think the change needs to go into the release-1.13 and master branch. [1] https://github.com/apache/flink/blob/master/docs/content/release-notes/flink-1.13.md Cheers, Till On Fri, Jun 18, 2021 at 2:45 PM Etienne Chauchot wrote: Hi all, I did a fix some time ago regarding accumulators: the /JobClient.getAccumulators()/ call was infinitely blocking in the local environment for a streaming job (1). The change (2) consisted of returning the current accumulator values for the running job. And when fixing this in the PR, it appeared that I had to change the accumulators semantics with /JobClient/, and I just realized that I forgot to bring this back to the ML: Previously /JobClient/ assumed that getAccumulators() was called on a bounded pipeline and that the user wanted to acquire the *final accumulator values* after the job is finished. But now it returns the *current value of accumulators* immediately, to be compatible with unbounded pipelines. If it is run on a bounded pipeline, then to get the final accumulator values after the job is finished, one needs to call /getJobExecutionResult().thenApply(JobExecutionResult::getAllAccumulatorResults)/ (1): https://issues.apache.org/jira/browse/FLINK-18685 (2): https://github.com/apache/flink/pull/14558# Cheers, Etienne
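[Editor's illustration] The two access patterns Etienne describes can be sketched with plain-JDK stand-ins. Note that StubJobClient below is a hypothetical mock written for this sketch, not Flink's actual API; the real calls are JobClient.getAccumulators() and JobClient.getJobExecutionResult().thenApply(JobExecutionResult::getAllAccumulatorResults). The mock only mirrors the new semantics: an immediate snapshot of current values versus the final values after job completion.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

public class AccumulatorAccess {

    // Hypothetical stand-in for Flink's JobClient; values are made up.
    static class StubJobClient {
        // New semantics: resolves immediately with the *current* accumulator
        // values, so it also works while an unbounded job is still running.
        CompletableFuture<Map<String, Object>> getAccumulators() {
            return CompletableFuture.completedFuture(Map.of("records", 42L));
        }

        // Resolves only once a bounded job has finished; use this path to
        // read the *final* accumulator values.
        CompletableFuture<Map<String, Object>> getJobExecutionResult() {
            return CompletableFuture.completedFuture(Map.of("records", 100L));
        }
    }

    public static void main(String[] args) {
        StubJobClient client = new StubJobClient();

        // Snapshot of the running job's accumulators (returns immediately).
        Map<String, Object> current = client.getAccumulators().join();

        // Bounded pipelines: wait for completion, then read the final values.
        // In the real API this is getJobExecutionResult()
        //     .thenApply(JobExecutionResult::getAllAccumulatorResults).
        Map<String, Object> finalValues = client.getJobExecutionResult().join();

        System.out.println(current.get("records") + " -> " + finalValues.get("records"));
    }
}
```

The key behavioral point is that the first future no longer blocks until job completion, which is exactly the change that made getAccumulators() usable on unbounded pipelines.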
[DISCUSS] Incrementally deprecating the DataSet API
Hi everyone, I'm sending this email to make sure everyone is on the same page about slowly deprecating the DataSet API. There have been a few thoughts mentioned in presentations, offline discussions, and JIRA issues. However, I have observed that there are still some concerns or different opinions on what steps are necessary to implement this change. Let me summarize the steps and assumptions, and let's have a discussion about them:

Step 1: Introduce a batch mode for Table API (FLIP-32) [DONE in 1.9]

Step 2: Introduce a batch mode for DataStream API (FLIP-134) [DONE in 1.12]

Step 3: Soft-deprecate the DataSet API (FLIP-131) [DONE in 1.12] We updated the documentation recently to make this deprecation even more visible. There is a dedicated `(Legacy)` label right next to the menu item now. We won't deprecate concrete classes of the API with a @Deprecated annotation for now, to avoid excessive warnings in logs.

Step 4: Drop the legacy SQL connectors and formats (FLINK-14437) [DONE in 1.14] We dropped code for the ORC, Parquet, and HBase formats that were only used by DataSet API users. The removed classes had no documentation and were not annotated with one of our API stability annotations. The old functionality should be available through the new sources and sinks for Table API and DataStream API. If not, we should bring them into a shape where they can serve as a full replacement. DataSet users are encouraged to either upgrade to the newer APIs or stay on Flink 1.13; alternatively, they can copy just the format's code to a newer Flink version. We aim to keep the core interfaces (i.e. InputFormat and OutputFormat) stable until the next major version, and we will maintain/allow important contributions to the dropped connectors in 1.13. So 1.13 could be considered a kind of DataSet API LTS release.

Step 5: Drop the legacy SQL planner (FLINK-14437) [DONE in 1.14] This included dropping support for using the DataSet API with SQL.

Step 6: Connect the Table and DataStream APIs in batch mode (FLINK-20897) [PLANNED for 1.14]

Step 7: Reach feature parity of Table API/DataStream API with DataSet API [PLANNED for 1.14++] We need to identify blockers when migrating from the DataSet API to the Table API/DataStream API. Here we need to establish a good feedback pipeline to include DataSet users in the roadmap planning.

Step 8: Drop the Gelly library. No concrete plan yet; the latest would be the next major Flink version, aka Flink 2.0.

Step 9: Drop the DataSet API. Planned for the next major Flink version, aka Flink 2.0.

Please let me know if this matches your thoughts. We can also convert this into a blog post or mention it in the next release notes. Regards, Timo
[jira] [Created] (FLINK-23121) Fix the issue that the InternalRow as arguments in Python UDAF
Huang Xingbo created FLINK-23121: Summary: Fix the issue that the InternalRow as arguments in Python UDAF Key: FLINK-23121 URL: https://issues.apache.org/jira/browse/FLINK-23121 Project: Flink Issue Type: Bug Components: API / Python Affects Versions: 1.13.1 Reporter: Huang Xingbo Assignee: Huang Xingbo Fix For: 1.13.2 The problem was reported in https://stackoverflow.com/questions/68026832/pyflink-udaf-internalrow-vs-row In release-1.14 we have reconstructed the coders and fixed this problem, so it only appears in 1.13. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: CONTENTS DELETED in nabble frontend
I've set up the nabble archives back in the stone age of Flink, when the Apache archive didn't provide a very modern user experience. Since lists.apache.org exists, we don't really need nabble anymore. I'll open a pull request to replace the links to nabble to point to lists.apache.org on the community page: https://flink.apache.org/community.html I'll also look into updating the description of our nabble groups to link to the new archive. On Wed, Jun 23, 2021 at 9:57 AM Dawid Wysakowicz wrote: > Hey, > > As far as I know the official Apache ML archive can be accessed here[1]. > Personally I don't know what is the status of the nabble archives. > > Best, > > Dawid > > [1] https://lists.apache.org/list.html?dev@flink.apache.org > > On 23/06/2021 09:08, Matthias Pohl wrote: > > Thanks for pointing to the Nabble support forum. +1 Based on [1], the > > deletion of posts is not related to the switch of mailing lists becoming > > regular forums. But it seems to be a general issue at Nabble. > > But what concerns me is [2]: It looks like they are planning to remove > the > > feature to post through email which is actually our way of collecting the > > posts. > > > > @Robert: Is this Apache Flink Mailing list/Nabble system an Apache-wide > > setup used in any other Apache project as well? Or is this something we > > came up with? > > > > Do we have a fallback system or backups of messages? > > > > Matthias > > > > [1] > > > http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html > > [2] > > > http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html > > > > On Wed, Jun 23, 2021 at 8:45 AM Yangze Guo wrote: > > > >> Ahh. It seems nabble has updated mailing lists to regular forums this > >> week[1]. 
> >> > >> [1] > >> > http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html > >> > >> Best, > >> Yangze Guo > >> > >> On Wed, Jun 23, 2021 at 2:37 PM Yangze Guo wrote: > >>> It seems the post will remain iff it is sent by a registered email. I > >>> do not register nabble in user ML and my post is deleted in [1]. > >>> > >>> [1] > >> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/after-upgrade-flink1-12-to-flink1-13-1-flink-web-ui-s-taskmanager-detail-page-error-tt44391.html > >>> Best, > >>> Yangze Guo > >>> > >>> On Wed, Jun 23, 2021 at 2:16 PM Matthias Pohl > >> wrote: > Hi everyone, > Is it only me or does anyone else have the same problem with messages > >> being > not available anymore in the nabble frontend? I get multiple messages > >> like > the following one for individual messages: > > CONTENTS DELETED > > The author has deleted this message. > This appears for instance in [1], where all the messages are deleted > >> except > for Till Rohrmann's, Yangze Guo's and mine. This issue is not limited > >> to > the dev mailing list but also seem to appear in the user mailing list > >> (e.g. > [2]). > > Logging into nabble doesn't solve the problem. I'd assume that it's > >> some > infrastructure issue rather than people collectively deleting their > messages. But a Google search wasn't of any help. > > Matthias > > [1] > > >> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html > [2] > > >> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/1-1-snapshot-issues-td6971.html#a6973 > >
Re: [DISCUSS] Drop Mesos in 1.14
+1 for dropping Mesos. I checked both the commit history and the mailing list; Mesos-related issues and user questions have rarely appeared. Best, Leonard
Re: [DISCUSS] Drop Mesos in 1.14
+1 for dropping Mesos. Most of the Mesos PMC members have already left the project [1], and a move to the Attic was barely avoided. Overall, Kubernetes has taken its place, and it is unlikely that we will see a surge in Mesos adoption any time soon. Best, Fabian [1] https://lists.apache.org/thread.html/rab2a820507f7c846e54a847398ab20f47698ec5bce0c8e182bfe51ba%40%3Cdev.mesos.apache.org%3E
[jira] [Created] (FLINK-23120) ByteArrayWrapperSerializer.serialize should use writeInt to serialize the length
Dian Fu created FLINK-23120: Summary: ByteArrayWrapperSerializer.serialize should use writeInt to serialize the length Key: FLINK-23120 URL: https://issues.apache.org/jira/browse/FLINK-23120 Project: Flink Issue Type: Bug Components: API / Python Affects Versions: 1.13.0, 1.12.0 Reporter: Dian Fu Assignee: Dian Fu Fix For: 1.12.5, 1.13.2
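[Editor's illustration] The ticket title hints at a classic pitfall: DataOutputStream.write(int) emits only the low-order byte of its argument, so using it for a length prefix silently corrupts any length of 256 or more, while writeInt emits all four bytes. A minimal, self-contained sketch with plain java.io (the helper names are made up for this illustration; ByteArrayWrapperSerializer itself is not shown):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class LengthPrefix {

    // Buggy variant: write(int) only emits the low-order byte of the length.
    static byte[] lengthWithWrite(int len) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            out.write(len); // 300 (0x12C) becomes the single byte 0x2C (44)
        }
        return bos.toByteArray();
    }

    // Fixed variant: writeInt emits all four bytes, big-endian.
    static byte[] lengthWithWriteInt(int len) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            out.writeInt(len);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // One byte vs. four: only writeInt round-trips lengths >= 256.
        System.out.println(lengthWithWrite(300).length);
        System.out.println(lengthWithWriteInt(300).length);
    }
}
```

A reader of the truncated form would see a length of 44 instead of 300, which is the kind of silent corruption the fix avoids.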
Re: [DISCUSS] Drop Mesos in 1.14
+1 for dropping Mesos support. AFAIK, Mesos (including Marathon for container management) is gradually being phased out and has been replaced by Kubernetes in the containerized world. Best, Yang Matthias Pohl wrote on Wed, Jun 23, 2021 at 2:04 PM: > +1 for dropping Mesos support. There was no feedback opposing the direction > from the community in the most-recent discussion [1,2] on deprecating it. > > Matthias > > [1] > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html > [2] > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Deprecating-Mesos-support-td50142.html > > On Wed, Jun 23, 2021 at 4:21 AM Yangze Guo wrote: > > > +1 for dropping if there is no strong demand from the community. > > > > I'm willing to help with the removal of e2e tests part. > > > > Best, > > Yangze Guo > > > > On Wed, Jun 23, 2021 at 10:09 AM Xintong Song > > wrote: > > > > > > +1 for dropping. > > > > > > I like Seth's idea. I don't have any real Mesos experience either. > > > According to this article [1], it looks like we can deploy a standalone > > > cluster on Mesos similar to Kubernetes. However, we should only do it > if > > > there's indeed a strong demand from the community for deploying a > > > latest version of Flink on Mesos. > > > > > > Thank you~ > > > > > > Xintong Song > > > > > > > > > [1] https://www.baeldung.com/ops/mesos-kubernetes-comparison > > > > > > On Tue, Jun 22, 2021 at 11:59 PM Israel Ekpo > > wrote: > > > > > > > I am in favor of dropping the support for Mesos. > > > > > > > > In terms of the landscape for users leveraging Mesos for the kind of > > > > workloads Flink is used, I think it is on the decline. > > > > > > > > +1 from me > > > > > > > > On Tue, Jun 22, 2021 at 11:32 AM Seth Wiesman > > wrote: > > > > > > > > > Sorry if this is a naive question, I don't have any real Mesos > > > > experience. 
> > > > > Is it possible to deploy a standalone cluster on top of Mesos in > the > > same > > > > > way you can with Kubernetes? If so, and there is still Mesos demand > > from > > > > > the community, we could document that process as the recommended > > > > deployment > > > > > mode going forward. > > > > > > > > > > Seth > > > > > > > > > > On Tue, Jun 22, 2021 at 5:02 AM Arvid Heise > > wrote: > > > > > > > > > > > +1 for dropping. Frankly speaking, I don't see it having any > future > > > > (and > > > > > > D2iQ > > > > > > agrees). > > > > > > > > > > > > If there is a surprisingly huge demand, I'd try to evaluate > > plugins for > > > > > it. > > > > > > > > > > > > On Tue, Jun 22, 2021 at 11:46 AM Till Rohrmann < > > trohrm...@apache.org> > > > > > > wrote: > > > > > > > > > > > > > I'd be ok with dropping support for Mesos if it helps us to > > clear our > > > > > > > dependencies in the flink-runtime module. If we do it, then we > > should > > > > > > > probably update our documentation with a pointer to the latest > > Flink > > > > > > > version that supports Mesos in case of users strictly need > Mesos. > > > > > > > > > > > > > > Cheers, > > > > > > > Till > > > > > > > > > > > > > > On Tue, Jun 22, 2021 at 10:29 AM Chesnay Schepler < > > > > ches...@apache.org> > > > > > > > wrote: > > > > > > > > > > > > > > > Last week I spent some time looking into making flink-runtime > > scala > > > > > > > > free, which effectively means to move the Akka-reliant > classes > > to > > > > > > > > another module, and load that module along with Akka and all > of > > > > it's > > > > > > > > dependencies (including Scala) through a separate > classloader. > > > > > > > > > > > > > > > > This would finally decouple the Scala versions required by > the > > > > > runtime > > > > > > > > and API, and would allow us to upgrade Akka as we'd no longer > > be > > > > > > limited > > > > > > > > to Scala 2.11. 
It would rid the classpath of a few > > dependencies, > > > > and > > > > > > > > remove the need for scala suffixes on quite a few modules. > > > > > > > > > > > > > > > > However, our Mesos support has unfortunately a hard > dependency > > on > > > > > Akka, > > > > > > > > which naturally does not play well with the goal of isolating > > Akka > > > > in > > > > > > > > it's own ClassLoader. > > > > > > > > > > > > > > > > To solve this issue I was thinking of simple dropping > > flink-mesos > > > > in > > > > > > > > 1.14 (it was deprecated in 1.13). > > > > > > > > > > > > > > > > Truth be told, I picked this option because it is the easiest > > to > > > > do. > > > > > We > > > > > > > > _could_ probably make things work somehow (likely by > shipping a > > > > > second > > > > > > > > Akka version just for flink-mesos), but it doesn't seem worth > > the > > > > > > hassle > > > > > > > > and would void some of the benefits. So far we kept > flink-mesos > > > > > around, > > > > > > > > despite not really developing it further, because it didn't > > hurt to > > > > > > have > > > > > > > > it in still in Flink, but this has now ch
[jira] [Created] (FLINK-23119) Fix the issue that the exception that General Python UDAF is unsupported is not thrown in Compile Stage.
Huang Xingbo created FLINK-23119: Summary: Fix the issue that the exception that General Python UDAF is unsupported is not thrown in Compile Stage. Key: FLINK-23119 URL: https://issues.apache.org/jira/browse/FLINK-23119 Project: Flink Issue Type: Bug Components: API / Python Affects Versions: 1.12.4, 1.13.1 Reporter: Huang Xingbo Assignee: Huang Xingbo Fix For: 1.12.5, 1.13.2
Re: [DISCUSS] Feedback Collection Jira Bot
> I agree there are such tickets, but I don't see how this is addressing my concerns. There are also tickets that just shouldn't be closed as I described above. Why do you think that duplicating tickets and losing discussions/knowledge is a good solution? I don't understand why we are necessarily losing discussion/knowledge. The tickets are still there, just in "Closed" state, which are included in default Jira search. We could of course just add a label, but closing seems clearer to me given that likely this ticket will not get committer attention in the foreseeable future. > I would like to avoid having to constantly fight against the bot. It's already responsible for the majority of my daily emails, with quite little benefit for me personally. I initially thought that after some period of time it would settle down, but now I'm afraid it won't happen. Can you elaborate which rules you are running into mostly? I'd rather like to understand how we work right now and where this conflicts with the Jira bot vs slowly disabling the Jira bot via labels. On Wed, Jun 23, 2021 at 10:00 AM Piotr Nowojski wrote: > Hi Konstantin, > > > In my opinion it is important that we close tickets eventually. There are > a > > lot of tickets (bugs, improvements, tech debt) that over time became > > irrelevant, out-of-scope, irreproducible, etc. In my experience, these > > tickets are usually not closed by anyone but the bot. > > I agree there are such tickets, but I don't see how this is addressing my > concerns. There are also tickets that just shouldn't be closed as I > described above. Why do you think that duplicating tickets and losing > discussions/knowledge is a good solution? > > I would like to avoid having to constantly fight against the bot. It's > already responsible for the majority of my daily emails, with quite little > benefit for me personally. I initially thought that after some period of > time it would settle down, but now I'm afraid it won't happen. 
Can we add some label to mark tickets to be ignored by the jira-bot?

Best,
Piotrek

On Wed, Jun 23, 2021 at 09:40 Chesnay Schepler wrote:
> I would like it to not unassign people if a PR is open. These are
> usually blocked by the reviewer, not the assignee, and having the
> assignees now additionally having to update JIRA periodically is a bit
> like rubbing salt into the wound.
>
> On 6/23/2021 7:52 AM, Konstantin Knauf wrote:
> > Hi everyone,
> >
> > I was hoping for more feedback from other committers, but seems like this
> > is not happening, so here's my proposal for immediate changes:
> >
> > * Ignore tickets with a fixVersion for all rules but the stale-unassigned
> > rule.
> >
> > * We change the time intervals as follows, accepting reality a bit more ;)
> >
> > * stale-assigned only after 30 days (instead of 14 days)
> > * stale-critical only after 14 days (instead of 7 days)
> > * stale-major only after 60 days (instead of 30 days)
> >
> > Unless there are -1s, I'd implement the changes Monday next week.
> >
> > Cheers,
> >
> > Konstantin
> >
> > On Thu, Jun 17, 2021 at 2:17 PM Piotr Nowojski wrote:
> >
> >> Hi,
> >>
> >> I also think that the bot is a bit too aggressive/too quick with assigning
> >> stale issues/deprioritizing them, but that's not that big of a deal for me.
> >>
> >> What bothers me much more is that it's closing minor issues automatically.
> >> Deprioritizing issues makes sense to me. If a wish for improvement or a bug
> >> report has been opened a long time ago, and they got no attention over the
> >> time, sure, deprioritize them. But closing them is IMO a bad idea. A bug
> >> might be minor, but if it's not fixed it's still there - it shouldn't be
> >> closed. Closing with "won't fix" should be done for very good reasons and
> >> very rarely. Same applies to improvements/wishes. Furthermore, very often
> >> descriptions and comments have a lot of value, and if we keep closing minor
> >> issues I'm afraid that we end up with:
> >> - more duplication. I doubt anyone will be looking for prior "closed" bug
> >> reports/improvement requests. Definitely I'm only looking for open tickets
> >> when looking if a ticket for XYZ already exists or not
> >> - we will be losing knowledge
> >>
> >> Piotrek
> >>
> >> On Wed, Jun 16, 2021 at 15:12 Robert Metzger wrote:
> >>
> >>> Very sorry for the delayed response.
> >>>
> >>> Regarding tickets with the "test-instability" label (topic 1): I'm usually
> >>> assigning a fixVersion to the next release of the branch where the failure
> >>> occurred, when I'm opening a test failure ticket. Others seem to do that
> >>> too. Hence my comment that not checking tickets with a fixVersion set by
> >>> Flink bot is good (because test failures should always stay "Critical"
> >>> until
[jira] [Created] (FLINK-23118) Drop mesos
Chesnay Schepler created FLINK-23118:

Summary: Drop mesos
Key: FLINK-23118
URL: https://issues.apache.org/jira/browse/FLINK-23118
Project: Flink
Issue Type: Improvement
Components: Deployment / Mesos
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Fix For: 1.14.0

Following the discussion on the [ML|https://lists.apache.org/thread.html/rd7bf0dabe2d75adb9f97a1879638711d04cfce0774d31b033acae0b8%40%3Cdev.flink.apache.org%3E], remove Mesos support.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] Feedback Collection Jira Bot
Hi Konstantin, > In my opinion it is important that we close tickets eventually. There are a > lot of tickets (bugs, improvements, tech debt) that over time became > irrelevant, out-of-scope, irreproducible, etc. In my experience, these > tickets are usually not closed by anyone but the bot. I agree there are such tickets, but I don't see how this is addressing my concerns. There are also tickets that just shouldn't be closed as I described above. Why do you think that duplicating tickets and losing discussions/knowledge is a good solution? I would like to avoid having to constantly fight against the bot. It's already responsible for the majority of my daily emails, with quite little benefit for me personally. I initially thought that after some period of time it will settle down, but now I'm afraid it won't happen. Can we add some label to mark tickets to be ignored by the jira-bot? Best, Piotrek śr., 23 cze 2021 o 09:40 Chesnay Schepler napisał(a): > I would like it to not unassign people if a PR is open. These are > usually blocked by the reviewer, not the assignee, and having the > assignees now additionally having to update JIRA periodically is a bit > like rubbing salt into the wound. > > On 6/23/2021 7:52 AM, Konstantin Knauf wrote: > > Hi everyone, > > > > I was hoping for more feedback from other committers, but seems like this > > is not happening, so here's my proposal for immediate changes: > > > > * Ignore tickets with a fixVersion for all rules but the stale-unassigned > > role. > > > > * We change the time intervals as follows, accepting reality a bit more > ;) > > > > * stale-assigned only after 30 days (instead of 14 days) > > * stale-critical only after 14 days (instead of 7 days) > > * stale-major only after 60 days (instead of 30 days) > > > > Unless there are -1s, I'd implement the changes Monday next week. 
> > > > Cheers, > > > > Konstantin > > > > On Thu, Jun 17, 2021 at 2:17 PM Piotr Nowojski > wrote: > > > >> Hi, > >> > >> I also think that the bot is a bit too aggressive/too quick with > assigning > >> stale issues/deprioritizing them, but that's not that big of a deal for > me. > >> > >> What bothers me much more is that it's closing minor issues > automatically. > >> Depriotising issues makes sense to me. If a wish for improvement or a > bug > >> report has been opened a long time ago, and they got no attention over > the > >> time, sure depriotize them. But closing them is IMO a bad idea. Bug > might > >> be minor, but if it's not fixed it's still there - it shouldn't be > closed. > >> Closing with "won't fix" should be done for very good reasons and very > >> rarely. Same applies to improvements/wishes. Furthermore, very often > >> descriptions and comments have a lot of value, and if we keep closing > minor > >> issues I'm afraid that we end up with: > >> - more duplication. I doubt anyone will be looking for prior "closed" > bug > >> reports/improvement requests. Definitely I'm only looking for open > tickets > >> when looking if a ticket for XYZ already exists or not > >> - we will be losing knowledge > >> > >> Piotrek > >> > >> śr., 16 cze 2021 o 15:12 Robert Metzger > napisał(a): > >> > >>> Very sorry for the delayed response. > >>> > >>> Regarding tickets with the "test-instability" label (topic 1): I'm > >> usually > >>> assigning a fixVersion to the next release of the branch where the > >> failure > >>> occurred, when I'm opening a test failure ticket. Others seem to do > that > >>> too. 
Hence my comment that not checking tickets with a fixVersion set > by > >>> Flink bot is good (because test failures should always stay "Critical" > >>> until we've understood what's going on) > >>> I see that it is a bit contradicting that Critical test instabilities > >>> receive no attention for 14 days, but that seems to be the norm given > the > >>> current number of incoming test instabilities. > >>> > >>> On Wed, Jun 16, 2021 at 2:05 PM Till Rohrmann > >>> wrote: > >>> > Another example for category 4 would be the ticket where we collect > breaking API changes for Flink 2.0 [1]. The idea behind this ticket is > >> to > collect things to consider when developing the next major version. > Admittedly, we have never seen the benefits of collecting the breaking > changes because we haven't started Flink 2.x yet. Also, it is not > clear > >>> how > relevant these tickets are right now. > > [1] https://issues.apache.org/jira/browse/FLINK-3957 > > Cheers, > Till > > On Wed, Jun 16, 2021 at 11:42 AM Konstantin Knauf > wrote: > > > Hi everyone, > > > > thank you for all the feedback so far. I believe we have four > >> different > > topics by now: > > > > 1 about *test-instability tickets* raised by Robert. Waiting for > >>> feedback > > by Robert. > > > > 2 about *aggressiveness of stale-assigned *rule raised by Timo. > >> Waiting > > for feedback
Re: CONTENTS DELETED in nabble frontend
Hey, As far as I know the official Apache ML archive can be accessed here[1]. Personally I don't know what is the status of the nabble archives. Best, Dawid [1] https://lists.apache.org/list.html?dev@flink.apache.org On 23/06/2021 09:08, Matthias Pohl wrote: > Thanks for pointing to the Nabble support forum. +1 Based on [1], the > deletion of posts is not related to the switch of mailing lists becoming > regular forums. But it seems to be a general issue at Nabble. > But what concerns me is [2]: It looks like they are planning to remove the > feature to post through email which is actually our way of collecting the > posts. > > @Robert: Is this Apache Flink Mailing list/Nabble system an Apache-wide > setup used in any other Apache project as well? Or is this something we > came up with? > > Do we have a fallback system or backups of messages? > > Matthias > > [1] > http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html > [2] > http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html > > On Wed, Jun 23, 2021 at 8:45 AM Yangze Guo wrote: > >> Ahh. It seems nabble has updated mailing lists to regular forums this >> week[1]. >> >> [1] >> http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html >> >> Best, >> Yangze Guo >> >> On Wed, Jun 23, 2021 at 2:37 PM Yangze Guo wrote: >>> It seems the post will remain iff it is sent by a registered email. I >>> do not register nabble in user ML and my post is deleted in [1]. >>> >>> [1] >> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/after-upgrade-flink1-12-to-flink1-13-1-flink-web-ui-s-taskmanager-detail-page-error-tt44391.html >>> Best, >>> Yangze Guo >>> >>> On Wed, Jun 23, 2021 at 2:16 PM Matthias Pohl >> wrote: Hi everyone, Is it only me or does anyone else have the same problem with messages >> being not available anymore in the nabble frontend? 
I get multiple messages >> like the following one for individual messages: > CONTENTS DELETED > The author has deleted this message. This appears for instance in [1], where all the messages are deleted >> except for Till Rohrmann's, Yangze Guo's and mine. This issue is not limited >> to the dev mailing list but also seems to appear in the user mailing list >> (e.g. [2]). Logging into nabble doesn't solve the problem. I'd assume that it's >> some infrastructure issue rather than people collectively deleting their messages. But a Google search wasn't of any help. Matthias [1] >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html [2] >> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/1-1-snapshot-issues-td6971.html#a6973
[jira] [Created] (FLINK-23117) TaskExecutor.allocateSlot is a logical error
zhouzhengde created FLINK-23117:

Summary: TaskExecutor.allocateSlot has a logical error
Key: FLINK-23117
URL: https://issues.apache.org/jira/browse/FLINK-23117
Project: Flink
Issue Type: Bug
Components: Runtime / Task
Affects Versions: 1.13.1, 1.13.0, 1.12.2, 1.12.0
Reporter: zhouzhengde

TaskExecutor.allocateSlot at line 1109 has a logical error. It uses '!taskSlotTable.isAllocated(slotId.getSlotNumber(), jobId, allocationId)' to decide that the TaskSlot is used by another job, which is not correct: if the slot index is not occupied at all, this check still triggers. Please confirm whether this is correct. The code in question:

TaskExecutor.java:

```java
} else if (!taskSlotTable.isAllocated(slotId.getSlotNumber(), jobId, allocationId)) {
    final String message =
            "The slot " + slotId + " has already been allocated for a different job.";
    log.info(message);
    final AllocationID allocationID =
            taskSlotTable.getCurrentAllocation(slotId.getSlotNumber());
    throw new SlotOccupiedException(
            message, allocationID, taskSlotTable.getOwningJob(allocationID));
}
```

TaskSlotTableImpl.java:

```java
@Override
public boolean isAllocated(int index, JobID jobId, AllocationID allocationId) {
    TaskSlot taskSlot = taskSlots.get(index);
    if (taskSlot != null) {
        return taskSlot.isAllocated(jobId, allocationId);
    } else {
        return false;
    }
}
```
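[Editor's note] The reporter's claim can be modeled with a small self-contained sketch. The class and storage layout below are invented for illustration; they are not the real Flink TaskExecutor/TaskSlotTableImpl classes, only a toy reproduction of the two quoted snippets:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the two snippets quoted in the report above.
public class SlotCheckSketch {
    // slot index -> {jobId, allocationId}; an absent key means the slot is free
    private final Map<Integer, String[]> taskSlots = new HashMap<>();

    void allocate(int index, String jobId, String allocationId) {
        taskSlots.put(index, new String[] {jobId, allocationId});
    }

    // Mirrors the quoted TaskSlotTableImpl.isAllocated: returns false both
    // when the slot belongs to a different allocation AND when it is empty.
    boolean isAllocated(int index, String jobId, String allocationId) {
        String[] slot = taskSlots.get(index);
        return slot != null && slot[0].equals(jobId) && slot[1].equals(allocationId);
    }

    // Mirrors the quoted branch from TaskExecutor.allocateSlot: guarded only
    // by !isAllocated(...), it cannot distinguish "occupied by another job"
    // from "not occupied at all" -- which is the reporter's point.
    String allocateSlotBranch(int index, String jobId, String allocationId) {
        if (!isAllocated(index, jobId, allocationId)) {
            return "The slot " + index + " has already been allocated for a different job.";
        }
        return "ok";
    }

    public static void main(String[] args) {
        SlotCheckSketch table = new SlotCheckSketch();
        // Slot 0 was never allocated, yet this branch reports it as taken:
        System.out.println(table.allocateSlotBranch(0, "job-1", "alloc-1"));
    }
}
```

Whether this is an actual bug depends on the surrounding code: if the real TaskExecutor only reaches this branch after already ruling out a free slot, the check is safe, which is presumably what the reporter is asking the maintainers to confirm.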
Re: [DISCUSS] FLIP-172: Support custom transactional.id prefix in FlinkKafkaProducer
Hi,

+1 from my side on this idea. I do not see any problems that could be caused by this change.

Best,
Piotrek

On Wed, Jun 23, 2021 at 08:59 Stephan Ewen wrote:
> The motivation and the proposal sound good to me, +1 from my side.
>
> Would be good to have a quick opinion from someone who worked specifically
> with Kafka, maybe Becket or Piotr?
>
> Best,
> Stephan
>
> On Sat, Jun 12, 2021 at 9:50 AM Wenhao Ji wrote:
>
>> Hi everyone,
>>
>> I would like to open this discussion thread to talk about FLIP-172
>> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-172%3A+Support+custom+transactional.id+prefix+in+FlinkKafkaProducer>,
>> which aims to provide a way to support specifying a custom transactional.id
>> prefix in the FlinkKafkaProducer class.
>>
>> I am looking forward to your feedback and suggestions!
>>
>> Thanks,
>> Wenhao
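[Editor's note] Some background for readers who haven't followed the FLIP: Kafka's exactly-once producer requires every producer instance to use a stable, distinct `transactional.id`, and Flink derives one id per parallel subtask. Letting users choose the prefix matters when several jobs write transactionally to the same Kafka cluster, since colliding ids cause producers to fence each other. The derivation can be sketched roughly as follows; the "prefix-subtask-counter" naming scheme here is illustrative only, not the actual id format FlinkKafkaProducer uses:

```java
// Rough sketch of deriving per-subtask transactional ids from a user-chosen
// prefix (hypothetical scheme, not Flink's real one).
public class TransactionalIdSketch {
    static String transactionalId(String prefix, int subtaskIndex, long counter) {
        return prefix + "-" + subtaskIndex + "-" + counter;
    }

    public static void main(String[] args) {
        // Two jobs writing to the same Kafka cluster stay isolated as long as
        // their prefixes differ, even for identical subtask/counter values:
        System.out.println(transactionalId("job-a", 0, 1)); // job-a-0-1
        System.out.println(transactionalId("job-b", 0, 1)); // job-b-0-1
    }
}
```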
Re: [DISCUSS] Feedback Collection Jira Bot
I would like it to not unassign people if a PR is open. These are usually blocked by the reviewer, not the assignee, and having the assignees now additionally having to update JIRA periodically is a bit like rubbing salt into the wound.

On 6/23/2021 7:52 AM, Konstantin Knauf wrote:

Hi everyone,

I was hoping for more feedback from other committers, but seems like this is not happening, so here's my proposal for immediate changes:

* Ignore tickets with a fixVersion for all rules but the stale-unassigned rule.

* We change the time intervals as follows, accepting reality a bit more ;)

* stale-assigned only after 30 days (instead of 14 days)
* stale-critical only after 14 days (instead of 7 days)
* stale-major only after 60 days (instead of 30 days)

Unless there are -1s, I'd implement the changes Monday next week.

Cheers,

Konstantin

On Thu, Jun 17, 2021 at 2:17 PM Piotr Nowojski wrote:

Hi,

I also think that the bot is a bit too aggressive/too quick with assigning stale issues/deprioritizing them, but that's not that big of a deal for me.

What bothers me much more is that it's closing minor issues automatically. Deprioritizing issues makes sense to me. If a wish for improvement or a bug report has been opened a long time ago, and they got no attention over the time, sure, deprioritize them. But closing them is IMO a bad idea. A bug might be minor, but if it's not fixed it's still there - it shouldn't be closed. Closing with "won't fix" should be done for very good reasons and very rarely. Same applies to improvements/wishes. Furthermore, very often descriptions and comments have a lot of value, and if we keep closing minor issues I'm afraid that we end up with:
- more duplication. I doubt anyone will be looking for prior "closed" bug reports/improvement requests.
Definitely I'm only looking for open tickets when looking if a ticket for XYZ already exists or not - we will be losing knowledge Piotrek śr., 16 cze 2021 o 15:12 Robert Metzger napisał(a): Very sorry for the delayed response. Regarding tickets with the "test-instability" label (topic 1): I'm usually assigning a fixVersion to the next release of the branch where the failure occurred, when I'm opening a test failure ticket. Others seem to do that too. Hence my comment that not checking tickets with a fixVersion set by Flink bot is good (because test failures should always stay "Critical" until we've understood what's going on) I see that it is a bit contradicting that Critical test instabilities receive no attention for 14 days, but that seems to be the norm given the current number of incoming test instabilities. On Wed, Jun 16, 2021 at 2:05 PM Till Rohrmann wrote: Another example for category 4 would be the ticket where we collect breaking API changes for Flink 2.0 [1]. The idea behind this ticket is to collect things to consider when developing the next major version. Admittedly, we have never seen the benefits of collecting the breaking changes because we haven't started Flink 2.x yet. Also, it is not clear how relevant these tickets are right now. [1] https://issues.apache.org/jira/browse/FLINK-3957 Cheers, Till On Wed, Jun 16, 2021 at 11:42 AM Konstantin Knauf wrote: Hi everyone, thank you for all the feedback so far. I believe we have four different topics by now: 1 about *test-instability tickets* raised by Robert. Waiting for feedback by Robert. 2 about *aggressiveness of stale-assigned *rule raised by Timo. Waiting for feedback by Timo and others. 3 about *excluding issues with a fixVersion* raised by Konstantin, Till. Waiting for more feedback by the community as it involves general changes to how we deal with fixVersion. 4 about *excluding issues with a specific-label* raised by Arvid. I've already written something about 1-3. 
Regarding 4: How do we make sure that these don't become stale? I think, there have been a few "long-term efforts" in the past that never got the attention that we initially wanted. Is this just about the ability to collect tickets under an umbrella to document a future effort? Maybe for the example of DataStream replacing DataSet how would this look like in Jira? Cheers, Konstantin On Tue, Jun 8, 2021 at 11:31 AM Till Rohrmann wrote: I like this idea. It would then be the responsibility of the component maintainers to manage the lifecycle explicitly. Cheers, Till On Mon, Jun 7, 2021 at 1:48 PM Arvid Heise wrote: One more idea for the bot. Could we have a label to exclude certain tickets from the life-cycle? I'm thinking about long-term tickets such as improving DataStream to eventually replace DataSet. We would collect ideas over the next couple of weeks without any visible progress on the implementation. On Fri, May 21, 2021 at 2:06 PM Konstantin Knauf < kna...@apache.org wrote: Hi Timo, Thanks for joining the discussion. All rules except the unassigned rule do not apply to Sub-Tasks actually (like deprioritiza
[jira] [Created] (FLINK-23116) Update documentation about TableDescriptors
Timo Walther created FLINK-23116: Summary: Update documentation about TableDescriptors Key: FLINK-23116 URL: https://issues.apache.org/jira/browse/FLINK-23116 Project: Flink Issue Type: Sub-task Components: Documentation, Table SQL / API Reporter: Timo Walther We should update the documentation at a couple of places to show different use cases. In any case we need a detailed documentation for the Table API/common API section.
[jira] [Created] (FLINK-23115) Expose new APIs from PyFlink
Ingo Bürk created FLINK-23115: Summary: Expose new APIs from PyFlink Key: FLINK-23115 URL: https://issues.apache.org/jira/browse/FLINK-23115 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Ingo Bürk
Re: CONTENTS DELETED in nabble frontend
Thanks for pointing to the Nabble support forum. +1 Based on [1], the deletion of posts is not related to the switch of mailing lists becoming regular forums. But it seems to be a general issue at Nabble. But what concerns me is [2]: It looks like they are planning to remove the feature to post through email which is actually our way of collecting the posts. @Robert: Is this Apache Flink Mailing list/Nabble system an Apache-wide setup used in any other Apache project as well? Or is this something we came up with? Do we have a fallback system or backups of messages? Matthias [1] http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html [2] http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html On Wed, Jun 23, 2021 at 8:45 AM Yangze Guo wrote: > Ahh. It seems nabble has updated mailing lists to regular forums this > week[1]. > > [1] > http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html > > Best, > Yangze Guo > > On Wed, Jun 23, 2021 at 2:37 PM Yangze Guo wrote: > > > > It seems the post will remain iff it is sent by a registered email. I > > do not register nabble in user ML and my post is deleted in [1]. > > > > [1] > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/after-upgrade-flink1-12-to-flink1-13-1-flink-web-ui-s-taskmanager-detail-page-error-tt44391.html > > > > Best, > > Yangze Guo > > > > On Wed, Jun 23, 2021 at 2:16 PM Matthias Pohl > wrote: > > > > > > Hi everyone, > > > Is it only me or does anyone else have the same problem with messages > being > > > not available anymore in the nabble frontend? I get multiple messages > like > > > the following one for individual messages: > > > > CONTENTS DELETED > > > > The author has deleted this message. > > > > > > This appears for instance in [1], where all the messages are deleted > except > > > for Till Rohrmann's, Yangze Guo's and mine. 
This issue is not limited > to > > > the dev mailing list but also seem to appear in the user mailing list > (e.g. > > > [2]). > > > > > > Logging into nabble doesn't solve the problem. I'd assume that it's > some > > > infrastructure issue rather than people collectively deleting their > > > messages. But a Google search wasn't of any help. > > > > > > Matthias > > > > > > [1] > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html > > > [2] > > > > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/1-1-snapshot-issues-td6971.html#a6973