Re: [DISCUSS] Releasing Flink 1.13.6
+1 for releasing Flink 1.13.6. Thanks Martijn and Konstantin for driving this.

Best,
Jing Zhang

David Morávek wrote on Tue, Jan 25, 2022 at 19:13:

> Thanks for driving this Martijn, +1 for the release
>
> Also big thanks to Konstantin for volunteering
>
> Best,
> D.
>
> On Mon, Jan 24, 2022 at 3:24 PM Till Rohrmann wrote:
>
> > +1 for the 1.13.6 release and thanks for volunteering Konstantin.
> >
> > Cheers,
> > Till
> >
> > On Mon, Jan 24, 2022 at 2:57 PM Konstantin Knauf wrote:
> >
> > > Thanks for starting the discussion and +1 to releasing.
> > >
> > > I am happy to manage the release, aka learn how to do this.
> > >
> > > Cheers,
> > >
> > > Konstantin
> > >
> > > On Mon, Jan 24, 2022 at 2:52 PM Martijn Visser wrote:
> > >
> > > > I would like to start a discussion on releasing Flink 1.13.6. Flink 1.13.5
> > > > was the latest release on the 16th of December, which was the emergency
> > > > release for the Log4j CVE [1]. Flink 1.13.4 was cancelled, leaving Flink
> > > > 1.13.3 as the last real bugfix release. This one was released on the 19th
> > > > of October last year [2].
> > > >
> > > > Since then, there have been 61 fixed tickets, excluding the test
> > > > stabilities [3]. This includes a blocker and a couple of critical issues.
> > > >
> > > > Is there a PMC member who would like to manage the release? I'm more than
> > > > happy to help with monitoring the status of the tickets.
> > > >
> > > > Best regards,
> > > >
> > > > Martijn Visser
> > > > https://twitter.com/MartijnVisser82
> > > >
> > > > [1] https://flink.apache.org/news/2021/12/16/log4j-patch-releases.html
> > > > [2] https://flink.apache.org/news/2021/10/19/release-1.13.3.html
> > > > [3] JQL filter: project = FLINK AND resolution = Fixed AND fixVersion =
> > > > 1.13.6 AND labels != test-stability ORDER BY priority DESC, created DESC
> > >
> > > --
> > >
> > > Konstantin Knauf
> > >
> > > https://twitter.com/snntrable
> > >
> > > https://github.com/knaufk
Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator
Thanks Thomas for creating FLIP-212 to introduce the Flink Kubernetes Operator. The proposal already looks very good to me and has integrated all the input from the previous discussion (e.g. native K8s vs. standalone, Go vs. Java). I read the FLIP carefully and have some questions that need to be clarified.

# How do we run a Flink job from a CR?

1. Start a session cluster, followed by submitting the Flink job via the REST API
2. Start a Flink application cluster which bundles one or more Flink jobs

It is not clear enough to me which way we will choose. It seems that the existing google/lyft K8s operator is using #1, but I lean towards #2 in the newly introduced K8s operator. If #2 is the case, how could we get the job status once it has finished or failed? Maybe FLINK-24113 [1] and FLINK-25715 [2] could help. Or we may need to enable the Flink history server [3].

# ApplicationDeployer interface or "flink run-application" / "kubernetes-session.sh"

How do we start the Flink application or session cluster? It would be great if we had public and stable interfaces for deployment in Flink. But currently we only have an internal interface *ApplicationDeployer* to deploy the application cluster, and no interfaces for deploying a session cluster. Of course, we could also use the CLI commands for submission. However, that will have poor performance when launching multiple applications.

# Pod template

Is the pod template in the CR the same as the one Flink already supports [4]? If so, I am afraid that not every arbitrary field (e.g. cpu/memory resources) could take effect.

[1]. https://issues.apache.org/jira/browse/FLINK-24113
[2]. https://issues.apache.org/jira/browse/FLINK-25715
[3]. https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/advanced/historyserver/
[4]. https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/resource-providers/native_kubernetes/#pod-template

Best,
Yang

Thomas Weise wrote on Tue, Jan 25, 2022 at 13:08:

> Hi,
>
> As promised in [1] we would like to start the discussion on the
> addition of a Kubernetes operator to the Flink project as FLIP-212:
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-212%3A+Introduce+Flink+Kubernetes+Operator
>
> Please note that the FLIP is currently focussed on the overall
> direction; the intention is to fill in more details once we converge
> on the high level plan.
>
> Thanks and looking forward to a lively discussion!
>
> Thomas
>
> [1] https://lists.apache.org/thread/l1dkp8v4bhlcyb4tdts99g7w4wdglfy4
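If option #2 is chosen, the operator itself has to translate the (possibly short-lived) application cluster's job status into the CR status. A minimal, self-contained Java sketch of such a mapping — the `toCrPhase` helper and the CR phase names are hypothetical illustrations, not part of any FLIP-212 API; the input strings follow Flink's `JobStatus` values:

```java
import java.util.Map;

public class JobStatusMapping {
    // Flink's terminal JobStatus values and the hypothetical CR "phase"
    // each one maps to. Non-terminal states are reported as RUNNING so
    // the operator keeps polling until the job reaches a terminal state.
    private static final Map<String, String> TERMINAL = Map.of(
            "FINISHED", "SUCCEEDED",
            "FAILED", "FAILED",
            "CANCELED", "CANCELLED");

    static String toCrPhase(String flinkJobStatus) {
        return TERMINAL.getOrDefault(flinkJobStatus, "RUNNING");
    }

    public static void main(String[] args) {
        System.out.println(toCrPhase("FINISHED"));   // SUCCEEDED
        System.out.println(toCrPhase("RESTARTING")); // RUNNING
    }
}
```

Whether the phase is derived by polling the REST API, via FLINK-24113/FLINK-25715, or from the history server is exactly the open question above; the mapping itself is needed in all three cases.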
Re: PR Review Request
Please ignore me. I originally wanted to send it to Calcite's dev mailing list, but I sent it to the wrong mailing list. I'm terribly sorry.

Jing Zhang wrote on Wed, Jan 26, 2022 at 14:55:

> Hi community,
> My apologies for interrupting.
> Could anyone help to review the PR
> https://github.com/apache/calcite/pull/2606?
> Thanks a lot.
>
> CALCITE-4865 is the first sub-task of CALCITE-4864. This Jira aims to
> extend the existing table function in order to support the Polymorphic
> Table Function (PTF) which was introduced as part of ANSI SQL 2016.
>
> The brief change log of the PR:
> - Update `Parser.jj` to support the partition by clause and order by clause
>   for input tables with set semantics of a PTF
> - Introduce `TableCharacteristics`, which contains three characteristics
>   of an input table of a table function
> - Update `SqlTableFunction` to add a method `tableCharacteristics`; the
>   method returns the table characteristics for the ordinal-th argument to
>   this table function. The default return value is Optional.empty, which
>   means the ordinal-th argument is not a table.
> - Introduce `SqlSetSemanticsTable`, which represents an input table with
>   set semantics of a table function; its `SqlKind` is `SET_SEMANTICS_TABLE`
> - Update `SqlValidatorImpl` to validate that only a set-semantics table of
>   a table function may have partition by and order by clauses
> - Update `SqlToRelConverter#substituteSubQuery` to parse a sub-query which
>   represents a set-semantics table.
>
> PR: https://github.com/apache/calcite/pull/2606
> JIRA: https://issues.apache.org/jira/browse/CALCITE-4865
> Parent JIRA: https://issues.apache.org/jira/browse/CALCITE-4864
>
> Best,
> Jing Zhang
PR Review Request
Hi community,
My apologies for interrupting. Could anyone help to review the PR https://github.com/apache/calcite/pull/2606? Thanks a lot.

CALCITE-4865 is the first sub-task of CALCITE-4864. This Jira aims to extend the existing table function in order to support the Polymorphic Table Function (PTF) which was introduced as part of ANSI SQL 2016.

The brief change log of the PR:
- Update `Parser.jj` to support the partition by clause and order by clause for input tables with set semantics of a PTF
- Introduce `TableCharacteristics`, which contains three characteristics of an input table of a table function
- Update `SqlTableFunction` to add a method `tableCharacteristics`; the method returns the table characteristics for the ordinal-th argument to this table function. The default return value is Optional.empty, which means the ordinal-th argument is not a table.
- Introduce `SqlSetSemanticsTable`, which represents an input table with set semantics of a table function; its `SqlKind` is `SET_SEMANTICS_TABLE`
- Update `SqlValidatorImpl` to validate that only a set-semantics table of a table function may have partition by and order by clauses
- Update `SqlToRelConverter#substituteSubQuery` to parse a sub-query which represents a set-semantics table.

PR: https://github.com/apache/calcite/pull/2606
JIRA: https://issues.apache.org/jira/browse/CALCITE-4865
Parent JIRA: https://issues.apache.org/jira/browse/CALCITE-4864

Best,
Jing Zhang
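To make the described `tableCharacteristics` contract concrete, here is a self-contained Java sketch using stand-in types — these are NOT the real Calcite classes, just a model of the API shape the change log describes (a default of Optional.empty for non-table arguments, and a set-semantics characteristic for table arguments):

```java
import java.util.Optional;

public class PtfModelDemo {
    // Stand-in for the described SqlTableFunction addition; the real Calcite
    // interface and its TableCharacteristics type differ in detail.
    interface TableFunctionModel {
        // Table characteristics for the ordinal-th argument;
        // Optional.empty() means that argument is not a table.
        default Optional<String> tableCharacteristics(int ordinal) {
            return Optional.empty();
        }
    }

    // A PTF whose 0th argument is a table with set semantics: the validator
    // change described above would permit PARTITION BY / ORDER BY only on
    // such arguments.
    static class SessionizeModel implements TableFunctionModel {
        @Override
        public Optional<String> tableCharacteristics(int ordinal) {
            return ordinal == 0 ? Optional.of("SET_SEMANTICS") : Optional.empty();
        }
    }

    public static void main(String[] args) {
        TableFunctionModel fn = new SessionizeModel();
        System.out.println(fn.tableCharacteristics(0).isPresent()); // true
        System.out.println(fn.tableCharacteristics(1).isPresent()); // false
    }
}
```

The default method mirrors the "Default return value is Optional.empty" behavior in the change log, so ordinary table functions need no change.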
Re: [DISCUSS] Pushing Apache Flink 1.15 Feature Freeze
+1 for extending the feature freeze by one week.

Code freeze on the 6th (the end of Spring Festival) is equivalent to saying code freeze at the end of this week for the Chinese buddies, since Spring Festival starts next week. It also means they would have to be partially available during the holiday, otherwise they would block the release if any unexpected issues arise. The situation sounds a bit stressful and can be resolved very well by extending the freeze date a bit.

Best,
Yuan

On Wed, Jan 26, 2022 at 11:18 AM Yun Tang wrote:

> Since the official Spring Festival holidays in China start from Jan 31st
> to Feb 6th, many developers in China will enjoy the holidays at that time.
> +1 for extending the feature freeze.
>
> Best,
> Yun Tang
>
> From: Jingsong Li
> Sent: Wednesday, January 26, 2022 10:32
> To: dev
> Subject: Re: [DISCUSS] Pushing Apache Flink 1.15 Feature Freeze
>
> +1 for extending the feature freeze.
>
> Thanks Joe for driving.
>
> Best,
> Jingsong
>
> On Wed, Jan 26, 2022 at 12:04 AM Martijn Visser wrote:
> >
> > Hi all,
> >
> > +1 for extending the feature freeze. We could use the time to try to wrap
> > up some important SQL related features and improvements.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Tue, 25 Jan 2022 at 16:38, Johannes Moser wrote:
> >
> > > Dear Flink community,
> > >
> > > as mentioned in the summary mail earlier, some contributors voiced that
> > > they would benefit from pushing the feature freeze for 1.15 by a week.
> > > This would mean Monday, 14th of February 2022, end of business CEST.
> > >
> > > Please let us know in case you have any concerns.
> > >
> > > Best,
> > > Till, Yun Gao & Joe
[jira] [Created] (FLINK-25816) Changelog keyed state backend would come across NPE during notification
Yun Tang created FLINK-25816:
---

Summary: Changelog keyed state backend would come across NPE during notification
Key: FLINK-25816
URL: https://issues.apache.org/jira/browse/FLINK-25816
Project: Flink
Issue Type: Bug
Components: Runtime / Checkpointing, Runtime / State Backends
Reporter: Yun Tang
Assignee: Yun Tang
Fix For: 1.15.0

Instance: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=30158=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7

{code:java}
Caused by: java.lang.NullPointerException
    at org.apache.flink.state.changelog.ChangelogKeyedStateBackend.notifyCheckpointAborted(ChangelogKeyedStateBackend.java:536)
    at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.notifyCheckpointAborted(StreamOperatorStateHandler.java:298)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.notifyCheckpointAborted(AbstractStreamOperator.java:383)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.notifyCheckpointAborted(AbstractUdfStreamOperator.java:132)
    at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.notifyCheckpointAborted(RegularOperatorChain.java:158)
    at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpoint(SubtaskCheckpointCoordinatorImpl.java:406)
    at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpointAborted(SubtaskCheckpointCoordinatorImpl.java:352)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointAbortAsync$15(StreamTask.java:1327)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointOperation$17(StreamTask.java:1350)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
    at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsNonBlocking(MailboxProcessor.java:353)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:317)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:802)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:751)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
    at java.lang.Thread.run(Thread.java:748)
{code}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
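The failure pattern in the trace — an abort notification dereferencing state for a checkpoint the backend has no record of — is a classic map-lookup hazard. The following self-contained Java sketch illustrates the defensive pattern only; it is NOT the actual Flink code or the actual fix for FLINK-25816, and all names in it are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class AbortNotificationDemo {
    // Hypothetical per-checkpoint bookkeeping, keyed by checkpoint id.
    private final Map<Long, String> pendingUploads = new HashMap<>();

    void startCheckpoint(long checkpointId) {
        pendingUploads.put(checkpointId, "upload-" + checkpointId);
    }

    // Abort notifications may arrive for checkpoints this instance never
    // started (e.g. right after restore), so the lookup must tolerate a
    // missing entry; dereferencing the result unconditionally is exactly
    // the kind of bug that produces the NPE above.
    boolean notifyCheckpointAborted(long checkpointId) {
        String handle = pendingUploads.remove(checkpointId);
        if (handle == null) {
            return false; // unknown checkpoint: nothing to discard, no NPE
        }
        System.out.println("discarded " + handle);
        return true;
    }

    public static void main(String[] args) {
        AbortNotificationDemo backend = new AbortNotificationDemo();
        backend.startCheckpoint(1L);
        System.out.println(backend.notifyCheckpointAborted(1L)); // true
        System.out.println(backend.notifyCheckpointAborted(2L)); // false
    }
}
```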
[jira] [Created] (FLINK-25815) PulsarSourceITCase.testTaskManagerFailure failed on azure due to timeout
Yun Gao created FLINK-25815:
---

Summary: PulsarSourceITCase.testTaskManagerFailure failed on azure due to timeout
Key: FLINK-25815
URL: https://issues.apache.org/jira/browse/FLINK-25815
Project: Flink
Issue Type: Bug
Components: Connectors / Pulsar
Affects Versions: 1.15.0
Reporter: Yun Gao

{code:java}
2022-01-25T06:46:46.5112869Z Jan 25 06:46:46 [ERROR] org.apache.flink.connector.pulsar.source.PulsarSourceITCase.testTaskManagerFailure(TestEnvironment, DataStreamSourceExternalContext, ClusterControllable)[2] Time elapsed: 74.695 s <<< ERROR!
2022-01-25T06:46:46.5116004Z Jan 25 06:46:46 org.apache.pulsar.client.api.PulsarClientException$TimeoutException: The producer mock-pulsar-wpuQ1c-1-19 can not send message to the topic persistent://public/default/pulsar-multiple-topic-0-bhxPrLuS-partition-0 within given timeout : createdAt 30002788101 ns ago, firstSentAt 2488196546830074 ns ago, lastSentAt 30002746662 ns ago, retryCount 1
2022-01-25T06:46:46.5117924Z Jan 25 06:46:46    at org.apache.pulsar.client.api.PulsarClientException.unwrap(PulsarClientException.java:961)
2022-01-25T06:46:46.5143390Z Jan 25 06:46:46    at org.apache.pulsar.client.impl.TypedMessageBuilderImpl.send(TypedMessageBuilderImpl.java:91)
2022-01-25T06:46:46.5145068Z Jan 25 06:46:46    at org.apache.flink.connector.pulsar.testutils.runtime.PulsarRuntimeOperator.sendMessages(PulsarRuntimeOperator.java:178)
2022-01-25T06:46:46.5146447Z Jan 25 06:46:46    at org.apache.flink.connector.pulsar.testutils.runtime.PulsarRuntimeOperator.sendMessages(PulsarRuntimeOperator.java:167)
2022-01-25T06:46:46.5147666Z Jan 25 06:46:46    at org.apache.flink.connector.pulsar.testutils.PulsarPartitionDataWriter.writeRecords(PulsarPartitionDataWriter.java:41)
2022-01-25T06:46:46.5148974Z Jan 25 06:46:46    at org.apache.flink.connector.testframe.testsuites.SourceTestSuiteBase.testTaskManagerFailure(SourceTestSuiteBase.java:328)
2022-01-25T06:46:46.5150053Z Jan 25 06:46:46    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2022-01-25T06:46:46.5150969Z Jan 25 06:46:46    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2022-01-25T06:46:46.5152137Z Jan 25 06:46:46    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2022-01-25T06:46:46.5153083Z Jan 25 06:46:46    at java.lang.reflect.Method.invoke(Method.java:498)
2022-01-25T06:46:46.5154039Z Jan 25 06:46:46    at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
2022-01-25T06:46:46.5155081Z Jan 25 06:46:46    at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
2022-01-25T06:46:46.5156548Z Jan 25 06:46:46    at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
2022-01-25T06:46:46.5157869Z Jan 25 06:46:46    at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
2022-01-25T06:46:46.5158984Z Jan 25 06:46:46    at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
2022-01-25T06:46:46.5160169Z Jan 25 06:46:46    at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestTemplateMethod(TimeoutExtension.java:92)
2022-01-25T06:46:46.5161460Z Jan 25 06:46:46    at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
2022-01-25T06:46:46.5162858Z Jan 25 06:46:46    at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
2022-01-25T06:46:46.5164100Z Jan 25 06:46:46    at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
2022-01-25T06:46:46.5165365Z Jan 25 06:46:46    at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
2022-01-25T06:46:46.5166709Z Jan 25 06:46:46    at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
2022-01-25T06:46:46.5167925Z Jan 25 06:46:46    at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
2022-01-25T06:46:46.5169043Z Jan 25 06:46:46    at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
2022-01-25T06:46:46.5170296Z Jan 25 06:46:46    at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
2022-01-25T06:46:46.5171457Z Jan 25 06:46:46    at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214)
2022-01-25T06:46:46.5172734Z Jan 25 06:46:46    at
{code}
Re: [DISCUSS] Pushing Apache Flink 1.15 Feature Freeze
Since the official Spring Festival holidays in China start from Jan 31st to Feb 6th, many developers in China will enjoy the holidays at that time. +1 for extending the feature freeze.

Best,
Yun Tang

From: Jingsong Li
Sent: Wednesday, January 26, 2022 10:32
To: dev
Subject: Re: [DISCUSS] Pushing Apache Flink 1.15 Feature Freeze

+1 for extending the feature freeze.

Thanks Joe for driving.

Best,
Jingsong

On Wed, Jan 26, 2022 at 12:04 AM Martijn Visser wrote:
>
> Hi all,
>
> +1 for extending the feature freeze. We could use the time to try to wrap
> up some important SQL related features and improvements.
>
> Best regards,
>
> Martijn
>
> On Tue, 25 Jan 2022 at 16:38, Johannes Moser wrote:
>
> > Dear Flink community,
> >
> > as mentioned in the summary mail earlier, some contributors voiced that
> > they would benefit from pushing the feature freeze for 1.15 by a week.
> > This would mean Monday, 14th of February 2022, end of business CEST.
> >
> > Please let us know in case you have any concerns.
> >
> > Best,
> > Till, Yun Gao & Joe
[jira] [Created] (FLINK-25814) AdaptiveSchedulerITCase.testStopWithSavepointFailOnFirstSavepointSucceedOnSecond failed due to stop-with-savepoint failed
Yun Gao created FLINK-25814:
---

Summary: AdaptiveSchedulerITCase.testStopWithSavepointFailOnFirstSavepointSucceedOnSecond failed due to stop-with-savepoint failed
Key: FLINK-25814
URL: https://issues.apache.org/jira/browse/FLINK-25814
Project: Flink
Issue Type: Bug
Components: Runtime / Checkpointing
Affects Versions: 1.13.5
Reporter: Yun Gao

{code:java}
2022-01-25T05:37:28.6339368Z Jan 25 05:37:28 [ERROR] testStopWithSavepointFailOnFirstSavepointSucceedOnSecond(org.apache.flink.test.scheduling.AdaptiveSchedulerITCase) Time elapsed: 300.269 s <<< ERROR!
2022-01-25T05:37:28.6340216Z Jan 25 05:37:28 java.util.concurrent.ExecutionException: org.apache.flink.util.FlinkException: Stop with savepoint operation could not be completed.
2022-01-25T05:37:28.6342330Z Jan 25 05:37:28    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
2022-01-25T05:37:28.6343776Z Jan 25 05:37:28    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
2022-01-25T05:37:28.6344983Z Jan 25 05:37:28    at org.apache.flink.test.scheduling.AdaptiveSchedulerITCase.testStopWithSavepointFailOnFirstSavepointSucceedOnSecond(AdaptiveSchedulerITCase.java:231)
2022-01-25T05:37:28.6346165Z Jan 25 05:37:28    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2022-01-25T05:37:28.6347145Z Jan 25 05:37:28    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2022-01-25T05:37:28.6348207Z Jan 25 05:37:28    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2022-01-25T05:37:28.6349147Z Jan 25 05:37:28    at java.lang.reflect.Method.invoke(Method.java:498)
2022-01-25T05:37:28.6350068Z Jan 25 05:37:28    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
2022-01-25T05:37:28.6351116Z Jan 25 05:37:28    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2022-01-25T05:37:28.6352132Z Jan 25 05:37:28    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
2022-01-25T05:37:28.6353816Z Jan 25 05:37:28    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2022-01-25T05:37:28.6354863Z Jan 25 05:37:28    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
2022-01-25T05:37:28.6355983Z Jan 25 05:37:28    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
2022-01-25T05:37:28.6356958Z Jan 25 05:37:28    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
2022-01-25T05:37:28.6357871Z Jan 25 05:37:28    at org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
2022-01-25T05:37:28.6358799Z Jan 25 05:37:28    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
2022-01-25T05:37:28.6359658Z Jan 25 05:37:28    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
2022-01-25T05:37:28.6360506Z Jan 25 05:37:28    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
2022-01-25T05:37:28.6361425Z Jan 25 05:37:28    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
2022-01-25T05:37:28.6362486Z Jan 25 05:37:28    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
2022-01-25T05:37:28.6364531Z Jan 25 05:37:28    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
2022-01-25T05:37:28.6365709Z Jan 25 05:37:28    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
2022-01-25T05:37:28.6366600Z Jan 25 05:37:28    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
2022-01-25T05:37:28.6367488Z Jan 25 05:37:28    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
2022-01-25T05:37:28.6368333Z Jan 25 05:37:28    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
2022-01-25T05:37:28.6369236Z Jan 25 05:37:28    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
2022-01-25T05:37:28.6370133Z Jan 25 05:37:28    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
2022-01-25T05:37:28.6371056Z Jan 25 05:37:28    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
2022-01-25T05:37:28.6371957Z Jan 25 05:37:28    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
2022-01-25T05:37:28.6373128Z Jan 25 05:37:28    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
2022-01-25T05:37:28.6374293Z Jan 25 05:37:28    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
2022-01-25T05:37:28.6375273Z Jan 25 05:37:28    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
2022-01-25T05:37:28.6376370Z Jan 25 05:37:28    at
{code}
[jira] [Created] (FLINK-25813) TableITCase.testCollectWithClose failed on azure
Yun Gao created FLINK-25813:
---

Summary: TableITCase.testCollectWithClose failed on azure
Key: FLINK-25813
URL: https://issues.apache.org/jira/browse/FLINK-25813
Project: Flink
Issue Type: Bug
Components: Table SQL / Runtime
Affects Versions: 1.15.0
Reporter: Yun Gao

{code:java}
2022-01-25T08:35:25.3735884Z Jan 25 08:35:25 [ERROR] TableITCase.testCollectWithClose Time elapsed: 0.377 s <<< FAILURE!
2022-01-25T08:35:25.3737127Z Jan 25 08:35:25 java.lang.AssertionError: Values should be different. Actual: RUNNING
2022-01-25T08:35:25.3738167Z Jan 25 08:35:25    at org.junit.Assert.fail(Assert.java:89)
2022-01-25T08:35:25.3739085Z Jan 25 08:35:25    at org.junit.Assert.failEquals(Assert.java:187)
2022-01-25T08:35:25.3739922Z Jan 25 08:35:25    at org.junit.Assert.assertNotEquals(Assert.java:163)
2022-01-25T08:35:25.3740846Z Jan 25 08:35:25    at org.junit.Assert.assertNotEquals(Assert.java:177)
2022-01-25T08:35:25.3742302Z Jan 25 08:35:25    at org.apache.flink.table.api.TableITCase.testCollectWithClose(TableITCase.scala:135)
2022-01-25T08:35:25.3743327Z Jan 25 08:35:25    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2022-01-25T08:35:25.3744343Z Jan 25 08:35:25    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2022-01-25T08:35:25.3745575Z Jan 25 08:35:25    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2022-01-25T08:35:25.3746840Z Jan 25 08:35:25    at java.lang.reflect.Method.invoke(Method.java:498)
2022-01-25T08:35:25.3747922Z Jan 25 08:35:25    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
2022-01-25T08:35:25.3749151Z Jan 25 08:35:25    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2022-01-25T08:35:25.3750422Z Jan 25 08:35:25    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
2022-01-25T08:35:25.3751820Z Jan 25 08:35:25    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2022-01-25T08:35:25.3753196Z Jan 25 08:35:25    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
2022-01-25T08:35:25.3754253Z Jan 25 08:35:25    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
2022-01-25T08:35:25.3755441Z Jan 25 08:35:25    at org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
2022-01-25T08:35:25.3756656Z Jan 25 08:35:25    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
2022-01-25T08:35:25.3757778Z Jan 25 08:35:25    at org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
2022-01-25T08:35:25.3758821Z Jan 25 08:35:25    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
2022-01-25T08:35:25.3759840Z Jan 25 08:35:25    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
2022-01-25T08:35:25.3760919Z Jan 25 08:35:25    at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
2022-01-25T08:35:25.3762249Z Jan 25 08:35:25    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
2022-01-25T08:35:25.3763322Z Jan 25 08:35:25    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
2022-01-25T08:35:25.3764436Z Jan 25 08:35:25    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
2022-01-25T08:35:25.3765907Z Jan 25 08:35:25    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
2022-01-25T08:35:25.3766957Z Jan 25 08:35:25    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
2022-01-25T08:35:25.3768104Z Jan 25 08:35:25    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
2022-01-25T08:35:25.3769128Z Jan 25 08:35:25    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
2022-01-25T08:35:25.3770125Z Jan 25 08:35:25    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
2022-01-25T08:35:25.3771118Z Jan 25 08:35:25    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
2022-01-25T08:35:25.3772264Z Jan 25 08:35:25    at org.junit.runners.Suite.runChild(Suite.java:128)
2022-01-25T08:35:25.3773118Z Jan 25 08:35:25    at org.junit.runners.Suite.runChild(Suite.java:27)
2022-01-25T08:35:25.3774092Z Jan 25 08:35:25    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
2022-01-25T08:35:25.3775056Z Jan 25 08:35:25    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
2022-01-25T08:35:25.3776144Z Jan 25 08:35:25    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
2022-01-25T08:35:25.3777125Z Jan 25 08:35:25    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
2022-01-25T08:35:25.3778190Z Jan 25
{code}
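The assertion message ("Values should be different. Actual: RUNNING") suggests the test observed the job still RUNNING at the single moment it checked, a typical timing-sensitive pattern in IT cases. A common remedy is to poll with a bounded retry budget instead of asserting once. This is an illustrative, self-contained sketch only — not the actual TableITCase code or the eventual fix:

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

public class AwaitStatusDemo {
    // Polls the supplier until it stops returning 'unwanted' or the attempt
    // budget runs out; returns the final observed value. A caller then
    // asserts on that final value instead of on a single racy sample.
    static String awaitNot(Supplier<String> status, String unwanted, int maxAttempts) {
        String s = status.get();
        for (int i = 1; i < maxAttempts && unwanted.equals(s); i++) {
            try {
                Thread.sleep(10); // back off between polls
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            s = status.get();
        }
        return s;
    }

    public static void main(String[] args) {
        // Simulated job status sequence: still RUNNING twice, then CANCELED.
        Iterator<String> states = List.of("RUNNING", "RUNNING", "CANCELED").iterator();
        String finalState = awaitNot(states::next, "RUNNING", 10);
        System.out.println(finalState); // CANCELED
    }
}
```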
Re: [DISCUSS] Future of Per-Job Mode
Hi all,

I remember the application mode was initially named "cluster mode". As a contrast, the per-job mode is the "client mode". So I believe application mode should cover all the functionality of per-job mode, except for where the user's main code runs. In the containerized or Kubernetes world, application mode is more native and easier to use, since all the Flink and user jars are bundled in the image. I am also in favor of deprecating and removing per-job mode in the long run.

@Ferenc IIRC, the YARN application mode can ship user jars and dependencies via the "yarn.ship-files" config option. The only limitation is that we cannot ship and load the user dependencies with the user classloader, only with the parent classloader. FLINK-24897 is trying to fix this by supporting the "usrlib" directory automatically.

Best,
Yang

Ferenc Csaky wrote on Tue, Jan 25, 2022 at 22:05:

> Hi Konstantin,
>
> First of all, sorry for the delay. We at Cloudera are currently relying on
> per-job mode for deploying Flink applications over YARN.
>
> Specifically, we allow users to upload connector jars and other artifacts.
> There are also some default jars that we need to ship. These are all stored
> on the local file system of our service's node. The Flink job is submitted
> on the users' behalf by our service, which also specifies the jars to ship.
> The service runs on a single node, not on all nodes with Flink TM/JM. It
> would thus be difficult to manage the jars on every node.
>
> We are not familiar with the reasoning behind why application mode
> currently doesn't ship the user jars, besides the deployment being faster
> this way. Would it be possible for application mode to (optionally,
> enabled by some config) distribute these, or are there some technical
> limitations?
>
> For us it would be crucial to achieve the functionality we have at the
> moment over YARN. We started to track
> https://issues.apache.org/jira/browse/FLINK-24897 that Biao Geng
> mentioned as well.
>
> Considering the above, the more soonish removal does not sound
> really good to us. We can live with this feature as deprecated of course,
> but it would be nice to have some time to figure out how exactly we can
> utilize Application Mode and make the necessary changes if required.
>
> Thank you,
> F
>
> On 2022/01/13 08:30:48 Konstantin Knauf wrote:
> > Hi everyone,
> >
> > I would like to discuss and understand if the benefits of having Per-Job
> > Mode in Apache Flink outweigh its drawbacks.
> >
> > *# Background: Flink's Deployment Modes*
> >
> > Flink currently has three deployment modes. They differ in the following
> > dimensions:
> > * main() method executed on Jobmanager or Client
> > * dependencies shipped by client or bundled with all nodes
> > * number of jobs per cluster & relationship between job and cluster lifecycle
> > * supported resource providers
> >
> > ## Application Mode
> > * main() method executed on Jobmanager
> > * dependencies already need to be available on all nodes
> > * dedicated cluster for all jobs executed from the same main() method
> > (Note: applications with more than one job currently still have significant
> > limitations, like missing high availability). Technically, a session cluster
> > dedicated to all jobs submitted from the same main() method.
> > * supported by standalone, native Kubernetes, YARN
> >
> > ## Session Mode
> > * main() method executed in client
> > * dependencies are distributed from and by the client to all nodes
> > * cluster is shared by multiple jobs submitted from different clients,
> > independent lifecycle
> > * supported by standalone, native Kubernetes, YARN
> >
> > ## Per-Job Mode
> > * main() method executed in client
> > * dependencies are distributed from and by the client to all nodes
> > * dedicated cluster for a single job
> > * supported by YARN only
> >
> > *# Reasons to Keep*
> > * There are use cases where you might need the combination of a single
> > job per cluster, but main() method execution in the client. This
> > combination is only supported by per-job mode.
> > * It currently exists. Existing users will need to migrate to either
> > session or application mode.
> >
> > *# Reasons to Drop*
> > * With Per-Job Mode and Application Mode we have two modes that for most
> > users probably do the same thing. Specifically, for those users that
> > don't care where the main() method is executed and want to submit a
> > single job per cluster. Having two ways to do the same thing is confusing.
> > * Per-Job Mode is only supported by YARN anyway. If we keep it, we should
> > work towards support in Kubernetes and Standalone, too, to reduce special
> > casing.
> > * Dropping per-job mode would reduce complexity in the code and allow us
> > to dedicate more resources to the other two deployment modes.
> > * I believe with session mode and application mode we have two easily
> > distinguishable and understandable deployment modes that cover Flink's use
> > cases:
> > * session
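The two dimensions Konstantin's taxonomy hinges on — where main() runs, and whether the cluster is dedicated to one job — can be captured in a tiny decision sketch. This is purely illustrative reasoning about the thread's argument, not a Flink API; note the combination only per-job mode covers today is exactly the last branch:

```java
public class DeploymentModeChooser {
    enum Mode { APPLICATION, SESSION, PER_JOB }

    // mainOnClient: must main() run in the client (vs. on the JobManager)?
    // dedicatedCluster: does the job need its own cluster and lifecycle?
    static Mode choose(boolean mainOnClient, boolean dedicatedCluster) {
        if (!mainOnClient) {
            return Mode.APPLICATION; // main() on the JobManager
        }
        if (!dedicatedCluster) {
            return Mode.SESSION;     // shared cluster, main() in the client
        }
        return Mode.PER_JOB;         // the combination only per-job supports
    }

    public static void main(String[] args) {
        System.out.println(choose(false, true));  // APPLICATION
        System.out.println(choose(true, false));  // SESSION
        System.out.println(choose(true, true));   // PER_JOB
    }
}
```

Dropping per-job mode amounts to arguing that the last branch's users can live with one of the first two.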
Re: [DISCUSS] Pushing Apache Flink 1.15 Feature Freeze
+1 for extending the feature freeze. Thanks Joe for driving. Best, Jingsong On Wed, Jan 26, 2022 at 12:04 AM Martijn Visser wrote: > > Hi all, > > +1 for extending the feature freeze. We could use the time to try to wrap > up some important SQL related features and improvements. > > Best regards, > > Martijn > > On Tue, 25 Jan 2022 at 16:38, Johannes Moser wrote: > > > Dear Flink community, > > > > as mentioned in the summary mail earlier some contributors voiced that > > they would benefit from pushing the feature freeze for 1.15. by a week. > > This would mean Monday, 14th of February 2022, end of business CEST. > > > > Please let us know in case you got any concerns. > > > > > > Best, > > Till, Yun Gao & Joe
Re: Re: Re: [VOTE] FLIP-204: Introduce Hash Lookup Join
+1 (non-binding) Thanks Jing! Best, Lincoln Lee Terry 于2022年1月25日周二 17:51写道: > +1 (non-binding) > Thanks for driving this. > > Best regards > > zhangmang1 于2022年1月25日周二 10:39写道: > > > +1 >
[jira] [Created] (FLINK-25812) Show Plan in Ui doesn't show.
None none created FLINK-25812: - Summary: Show Plan in Ui doesn't show. Key: FLINK-25812 URL: https://issues.apache.org/jira/browse/FLINK-25812 Project: Flink Issue Type: Bug Components: Runtime / Web Frontend Affects Versions: 1.14.3 Reporter: None none Click on any job on the Ui to run it. Click "Show Plan". Show plan ui shows up but blank. There is also an exception recorded. 2022-01-25 21:34:02,243 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Registration failure at ResourceManager occurred. org.apache.flink.util.SerializedThrowable: org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: Could not connect to rpc endpoint under address akka.tcp://flink@xx-v-job-0001:46025/user/rpc/jobmanager_10. at org.apache.flink.runtime.rpc.akka.AkkaRpcService.lambda$resolveActorAddress$11(AkkaRpcService.java:602) ~[?:?] at scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:59) ~[?:?] at scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:53) ~[?:?] at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) ~[?:1.8.0_312] at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) ~[?:1.8.0_312] at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456) ~[?:1.8.0_312] at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_312] Caused by: org.apache.flink.util.SerializedThrowable: Could not connect to rpc endpoint under address akka.tcp://flink@xx-v-job-0001:46025/user/rpc/jobmanager_10. ... 7 more Caused by: org.apache.flink.util.SerializedThrowable: Actor not found for: ActorSelection[Anchor(akka://flink/), Path(/user/rpc/jobmanager_10)] at akka.actor.ActorSelection.$anonfun$resolveOne$1(ActorSelection.scala:74) ~[?:?] 
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60) ~[flink-dist_2.12-1.14.3.jar:1.14.3] at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:63) ~[?:?] at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:81) ~[?:?] at akka.dispatch.internal.SameThreadExecutionContext$$anon$1.unbatchedExecute(SameThreadExecutionContext.scala:21) ~[?:?] at akka.dispatch.BatchingExecutor.execute(BatchingExecutor.scala:130) ~[?:?] at akka.dispatch.BatchingExecutor.execute$(BatchingExecutor.scala:124) ~[?:?] at akka.dispatch.internal.SameThreadExecutionContext$$anon$1.execute(SameThreadExecutionContext.scala:20) ~[?:?] at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68) ~[flink-dist_2.12-1.14.3.jar:1.14.3] at scala.concurrent.impl.Promise$DefaultPromise.dispatchOrAddCallback(Promise.scala:312) ~[flink-dist_2.12-1.14.3.jar:1.14.3] at scala.concurrent.impl.Promise$DefaultPromise.onComplete(Promise.scala:303) ~[flink-dist_2.12-1.14.3.jar:1.14.3] at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:72) ~[?:?] at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:89) ~[?:?] at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:130) ~[?:?] at org.apache.flink.runtime.rpc.akka.AkkaRpcService.resolveActorAddress(AkkaRpcService.java:596) ~[?:?] at org.apache.flink.runtime.rpc.akka.AkkaRpcService.connectInternal(AkkaRpcService.java:547) ~[?:?] at org.apache.flink.runtime.rpc.akka.AkkaRpcService.connect(AkkaRpcService.java:233) ~[?:?] at org.apache.flink.runtime.resourcemanager.ResourceManager.registerJobManager(ResourceManager.java:367) ~[flink-dist_2.12-1.14.3.jar:1.14.3] at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source) ~[?:?] 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_312] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_312] at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:316) ~[?:?] at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83) ~[?:?] at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:314) ~[?:?] at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:217) ~[?:?] at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78) ~[?:?] at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163) ~[?:?] at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) ~[?:?] at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) ~[?:?] at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) ~[flink-dist_2.12-1.14.3.jar:1.14.3] at
[VOTE] Apache Flink Stateful Functions 3.2.0, release candidate #1
Hi everyone, Please review and vote on the release candidate #1 for the version 3.2.0 of Apache Flink Stateful Functions, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) **Release Overview** As an overview, the release consists of the following: a) Stateful Functions canonical source distribution, to be deployed to the release repository at dist.apache.org b) Stateful Functions Python SDK distributions to be deployed to PyPI c) Maven artifacts to be deployed to the Maven Central Repository d) New Dockerfiles for the release e) GoLang SDK (contained in the repository) f) JavaScript SDK (contained in the repository; will be uploaded to npm after the release) **Staging Areas to Review** The staging areas containing the above mentioned artifacts are as follows, for your review: * All artifacts for a) and b) can be found in the corresponding dev repository at dist.apache.org [2] * All artifacts for c) can be found at the Apache Nexus Repository [3] All artifacts are signed with the key B9499FA69EFF5DEEEBC3C1F5BA7E4187C6F73D82 [4] Other links for your review: * JIRA release notes [5] * source code tag "release-3.2.0-rc1" [6] * PR for the new Dockerfiles [7] * PR to update the website Downloads page to include Stateful Functions links [8] * GoLang SDK [9] * JavaScript SDK [10] **Vote Duration** The voting time will run for at least 72 hours. It is adopted by majority approval, with at least 3 PMC affirmative votes. 
Thanks, Till [1] https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Stateful+Functions+Release [2] https://dist.apache.org/repos/dist/dev/flink/flink-statefun-3.2.0-rc1/ [3] https://repository.apache.org/content/repositories/orgapacheflink-1483/ [4] https://dist.apache.org/repos/dist/release/flink/KEYS [5] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350540 [6] https://github.com/apache/flink-statefun/tree/release-3.2.0-rc1 [7] https://github.com/apache/flink-statefun-docker/pull/19 [8] https://github.com/apache/flink-web/pull/501 [9] https://github.com/apache/flink-statefun/tree/release-3.2.0-rc1/statefun-sdk-go [10] https://github.com/apache/flink-statefun/tree/release-3.2.0-rc1/statefun-sdk-js
[jira] [Created] (FLINK-25811) Fix generic AsyncSinkWriter retrying requests in reverse order
Ahmed Hamdy created FLINK-25811: --- Summary: Fix generic AsyncSinkWriter retrying requests in reverse order Key: FLINK-25811 URL: https://issues.apache.org/jira/browse/FLINK-25811 Project: Flink Issue Type: Bug Components: Connectors / Common Affects Versions: 1.15.0 Reporter: Ahmed Hamdy -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25810) [FLIP-171] Add Uber-jar for sql connector for KinesisDataStreams.
Ahmed Hamdy created FLINK-25810: --- Summary: [FLIP-171] Add Uber-jar for sql connector for KinesisDataStreams. Key: FLINK-25810 URL: https://issues.apache.org/jira/browse/FLINK-25810 Project: Flink Issue Type: Sub-task Components: Connectors / Kinesis Affects Versions: 1.15.0 Reporter: Ahmed Hamdy -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: [DISCUSS] Pushing Apache Flink 1.15 Feature Freeze
Hi all, +1 for extending the feature freeze. We could use the time to try to wrap up some important SQL related features and improvements. Best regards, Martijn On Tue, 25 Jan 2022 at 16:38, Johannes Moser wrote: > Dear Flink community, > > as mentioned in the summary mail earlier some contributors voiced that > they would benefit from pushing the feature freeze for 1.15. by a week. > This would mean Monday, 14th of February 2022, end of business CEST. > > Please let us know in case you got any concerns. > > > Best, > Till, Yun Gao & Joe
[jira] [Created] (FLINK-25809) Introduce test infra for building FLIP-190 tests
Francesco Guardiani created FLINK-25809: --- Summary: Introduce test infra for building FLIP-190 tests Key: FLINK-25809 URL: https://issues.apache.org/jira/browse/FLINK-25809 Project: Flink Issue Type: Sub-task Reporter: Francesco Guardiani Assignee: Francesco Guardiani FLIP-190 requires building new test infrastructure. For this test infra, we want to define test cases and data once, and then for each case we want to execute the following: * Integration test that roughly does {{create plan -> execute job -> trigger savepoint -> stop job -> restore plan -> restore savepoint -> execute job -> stop and assert}}. The plan and savepoint should be committed to git, so running these tests when a plan and savepoint are already available will not regenerate them. * Change detection test to check that, for the defined test cases, the plan hasn't been changed. Similar to the existing {{JsonPlanITCase}} tests. * Completeness of tests/coverage, that is, count how many times ExecNodes (including versions) are used in the test cases. Fail if an ExecNode version is never covered. Other requirements include "versioning" the test cases, that is, for each test case we can retain different versions of the plan and savepoint, in order to make sure that after we introduce a new plan change, the old plan still continues to run -- This message was sent by Atlassian Jira (v8.20.1#820001)
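The generate-vs-restore control flow the ticket describes can be sketched as follows. This is a hedged illustration only: the step names come from the ticket, but the function and constants are hypothetical, not actual Flink test-infra APIs.

```python
# Steps quoted from the ticket; the harness logic around them is illustrative.
GENERATE_STEPS = ["create plan", "execute job", "trigger savepoint", "stop job"]
RESTORE_STEPS = ["restore plan", "restore savepoint", "execute job", "stop and assert"]

def plan_test_steps(plan_and_savepoint_committed: bool) -> list:
    """Return the steps the harness would run for one test case.

    If a plan/savepoint is already committed to git, only the restore half
    runs (no regeneration); otherwise the generate half runs first and its
    artifacts would then be committed.
    """
    if plan_and_savepoint_committed:
        return RESTORE_STEPS
    return GENERATE_STEPS + RESTORE_STEPS
```

The point of the split is that committed plans and savepoints act as fixtures: once generated, later runs only exercise the restore path, which is what catches compatibility regressions.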
[DISCUSS] Pushing Apache Flink 1.15 Feature Freeze
Dear Flink community, as mentioned in the summary mail earlier, some contributors voiced that they would benefit from pushing the feature freeze for 1.15 by a week. This would mean Monday, 14th of February 2022, end of business CEST. Please let us know in case you have any concerns. Best, Till, Yun Gao & Joe
Flink 1.15. Bi-weekly 2022-01-25
Dear Flink community, Today we had another bi-weekly for the Apache Flink 1.15 release. The planned feature freeze is just 2 weeks away. In the last two weeks there was some progress, and some features have been moved to “done”. We are now at 70% according to the assessments of the responsible ICs. See the 1.15 release tracking page for more details [1]. # Feature development We can already see a separation between features that are moving to done and those that won’t make it. Let’s push to clarify the state asap. # Blockers and instabilities We have been able to get work going on all the blockers and the most pressing build instabilities, but there are still way more than there should be. We’d like to ask everyone to also take care of those when working towards the feature freeze. # Bi-weeklies As we are coming closer to the feature freeze, we will move to a weekly sync. # Feature freeze In the sync some contributors mentioned that they would benefit from pushing the feature freeze by a week. There will be a dedicated email on the dev mailing list for this. # Release blog post We started to work on the release blog post. In case someone wants to contribute or has some input, feel free to reach out to us. Let’s get this shipped! Best, Till, Yun Gao & Joe [1] https://cwiki.apache.org/confluence/display/FLINK/1.15+Release
[jira] [Created] (FLINK-25808) JsonTypeInfo property should be valid java identifier
Chesnay Schepler created FLINK-25808: Summary: JsonTypeInfo property should be valid java identifier Key: FLINK-25808 URL: https://issues.apache.org/jira/browse/FLINK-25808 Project: Flink Issue Type: Bug Components: Documentation, Runtime / REST Affects Versions: 1.15.0 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.15.0 Some REST classes use the JsonTypeInfo with the property being named {{@class}}. This causes invalid java code to be generated by swagger. Rename it to {{clazz}}. -- This message was sent by Atlassian Jira (v8.20.1#820001)
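The ticket's point — that `@class` is not a valid Java identifier and therefore breaks swagger-generated code, while `clazz` is fine — can be checked with a small helper. This is a rough, ASCII-oriented approximation of the Java identifier rules written for illustration; it deliberately ignores reserved keywords and is not part of any Flink or swagger API.

```python
def is_valid_java_identifier(name: str) -> bool:
    # Approximate JLS rules: first char must be a letter, '_' or '$';
    # the rest may also be digits. Reserved keywords are not checked.
    if not name:
        return False
    if not (name[0].isalpha() or name[0] in "_$"):
        return False
    return all(c.isalnum() or c in "_$" for c in name[1:])
```

Under this check, the old JSON type property name fails (`@` is never legal in an identifier), while the proposed rename passes, which is exactly why the generated code compiles after the change.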
Re: [DISCUSS] Stateful Functions 3.2.0 release?
Quick update: Igal has resolved the last outstanding ticket [1]. [1] https://issues.apache.org/jira/browse/FLINK-25775 Cheers, Till On Tue, Jan 25, 2022 at 9:56 AM Till Rohrmann wrote: > Thanks everyone. I will start creating the artifacts. > > Cheers, > Till > > On Mon, Jan 24, 2022 at 9:13 PM Tzu-Li (Gordon) Tai > wrote: > >> +1 >> >> On Mon, Jan 24, 2022 at 9:43 AM Igal Shilman wrote: >> >> > +1 and thanks for volunteering to be the release manager! >> > >> > Cheers, >> > Igal. >> > >> > On Mon, Jan 24, 2022 at 4:13 PM Seth Wiesman >> wrote: >> > >> > > +1 >> > > >> > > These are already a useful set of features to release to our users. >> > > >> > > Seth >> > > >> > > On Mon, Jan 24, 2022 at 8:45 AM Till Rohrmann >> > > wrote: >> > > >> > > > Hi everyone, >> > > > >> > > > We would like to kick off a new StateFun release 3.2.0. The new >> release >> > > > will include the new JavaScript SDK and some useful major features: >> > > > >> > > > * JavaScript SDK [1] >> > > > * Flink version upgrade to 1.14.3 [2] >> > > > * Support different remote functions module names [3] >> > > > * Allow creating custom metrics [4] >> > > > >> > > > The only missing ticket for this release is the documentation of the >> > > > JavaScript SDK [5]. We plan to complete this in the next few days. >> > > > >> > > > Please let us know if you have any concerns. >> > > > >> > > > [1] https://issues.apache.org/jira/browse/FLINK-24256 >> > > > [2] https://issues.apache.org/jira/browse/FLINK-25708 >> > > > [3] https://issues.apache.org/jira/browse/FLINK-25308 >> > > > [4] https://issues.apache.org/jira/browse/FLINK-22533 >> > > > [5] https://issues.apache.org/jira/browse/FLINK-25775 >> > > > >> > > > Cheers, >> > > > Till >> > > > >> > > >> > >> >
Re: [DISCUSS] Host StatefulFunction JavaScript SDK on npm
Alright, I hope that we could resolve all mentioned concerns. If not, then please speak up. I will try to set up an npm account for the Flink project. Cheers, Till On Mon, Jan 24, 2022 at 4:40 PM Till Rohrmann wrote: > Reading the description of the ticket again, I think the comment might > also simply relate to the fact that the ASF does not have its own npm > account. So there is no official npm release channel. > > Note that this is not what we would need here. > > Cheers, > Till > > On Mon, Jan 24, 2022 at 4:37 PM Chesnay Schepler > wrote: > >> I looked at the 2018/2020 versions of the page and also couldn't find >> anything that relates to the comment. >> That's why this is so weird to me; it seems to be based on some >> knowledge that isn't written down explicitly. >> >> There aren't any limitations/requirements for (secondary) distribution >> channels listed anywhere (which is actually rather surprising). >> >> On 24/01/2022 16:19, Till Rohrmann wrote: >> > Looking at the linked official release documentation, I think this >> comment >> > is no longer valid. Moreover, I cannot find an explanation why some >> > projects are allowed to publish on npm and others not. >> > >> > Cheers, >> > Till >> > >> > On Mon, Jan 24, 2022 at 4:14 PM Chesnay Schepler >> wrote: >> > >> >> What is your explanation for the comment in the ticket then? >> >> >> >> On 24/01/2022 16:07, Till Rohrmann wrote: >> >>> I am not sure whether this is really a problem. There are other Apache >> >>> projects that publish their artifacts on npm already [1, 2]. >> >>> >> >>> Also on the official release page [3] there it is written: >> >> After uploading to the canonical distribution channel, the project >> (or >> >>> anyone else) MAY redistribute the artifacts in accordance with their >> >>> licensing through other channels. 
>> >>> >> >>> [1] https://www.npmjs.com/package/echarts >> >>> [2] https://www.npmjs.com/package/cordova-coho >> >>> [3] >> >> https://www.apache.org/legal/release-policy.html#release-distribution >> >>> Cheers, >> >>> Till >> >>> >> >>> On Mon, Jan 24, 2022 at 3:54 PM Chesnay Schepler >> >> wrote: >> I'm concerned about a comment in >> https://issues.apache.org/jira/browse/INFRA-19733 stating that npm >> is >> not an approved release channel. >> I don't really know what to make of it and neither the current nor >> past >> versions of the linked legal page shed some light on it. >> >> As such it would be a good idea to double-check whether we are even >> allowed to publish anything on NPM. >> >> On 24/01/2022 15:42, Ingo Bürk wrote: >> > Hi Till, >> > >> > speaking as someone who regularly works with JavaScript/TypeScript, >> > definitely a +1 from me to providing this on the npm registry to >> make >> > it an easy installation for developers. >> > >> > >> > Best >> > Ingo >> > >> > On 24.01.22 15:37, Till Rohrmann wrote: >> >> Hi everyone, >> >> >> >> I would like to start a discussion about hosting >> >> StatefulFunction's JavaScript SDK [1] on npm. With the upcoming >> >> Statefun >> >> 3.2.0 release, the JavaScript SDK will be officially supported. In >> >> order >> >> for users to easily use this SDK, it would be very convenient if we >> >> make >> >> this dependency available via npm. Therefore, I propose to create a >> >> npm >> >> account that is governed by the PMC (similar to PyPI). >> >> >> >> Do you have any concerns about this? >> >> >> >> [1] https://issues.apache.org/jira/browse/FLINK-24256 >> >> >> >> Cheers, >> >> Till >> >> >> >> >> >>
[jira] [Created] (FLINK-25807) Generated SubtaskExecutionAttemptDetailsInfo contains duplicate property
Chesnay Schepler created FLINK-25807: Summary: Generated SubtaskExecutionAttemptDetailsInfo contains duplicate property Key: FLINK-25807 URL: https://issues.apache.org/jira/browse/FLINK-25807 Project: Flink Issue Type: Improvement Components: Documentation, Runtime / REST Affects Versions: 1.15.0 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.15.0 Swagger converts both start-time and start_time (kept for backwards-compatibility) into a startTime property, resulting in 2 properties with the same name in one class. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25806) Remove legacy high availability services
Till Rohrmann created FLINK-25806: - Summary: Remove legacy high availability services Key: FLINK-25806 URL: https://issues.apache.org/jira/browse/FLINK-25806 Project: Flink Issue Type: Improvement Components: Runtime / Coordination Affects Versions: 1.16.0 Reporter: Till Rohrmann Fix For: 1.16.0 After FLINK-24038, we should consider removing the legacy high availability services {{ZooKeeperHaServices}} and {{KubernetesHaServices}} since they are now subsumed by the multiple component leader election service that only uses a single leader election per component. -- This message was sent by Atlassian Jira (v8.20.1#820001)
RE: [DISCUSS] Future of Per-Job Mode
Hi Konstantin, First of all, sorry for the delay. We at Cloudera are currently relying on per-job mode for deploying Flink applications over YARN. Specifically, we allow users to upload connector jars and other artifacts. There are also some default jars that we need to ship. These are all stored on the local file system of our service’s node. The Flink job is submitted on the users’ behalf by our service, which also specifies the jars to ship. The service runs on a single node, not on all nodes with Flink TM/JM. It would thus be difficult to manage the jars on every node. We are not familiar with the reasoning behind why application mode currently doesn’t ship the user jars, besides the deployment being faster this way. Would it be possible for the application mode to (optionally, enabled by some config) distribute these, or are there some technical limitations? For us it would be crucial to achieve the functionality we have at the moment over YARN. We started to track https://issues.apache.org/jira/browse/FLINK-24897 that Biao Geng mentioned as well. Considering the above, a more imminent removal does not sound good to us. We can live with this feature as deprecated of course, but it would be nice to have some time to figure out how we can utilize Application Mode exactly and make the necessary changes if required. Thank you, F On 2022/01/13 08:30:48 Konstantin Knauf wrote: > Hi everyone, > > I would like to discuss and understand if the benefits of having Per-Job > Mode in Apache Flink outweigh its drawbacks. > > > *# Background: Flink's Deployment Modes* > Flink currently has three deployment modes. 
They differ in the following > dimensions: > * main() method executed on Jobmanager or Client > * dependencies shipped by client or bundled with all nodes > * number of jobs per cluster & relationship between job and cluster > lifecycle > * supported resource providers > > ## Application Mode > * main() method executed on Jobmanager > * dependencies already need to be available on all nodes > * dedicated cluster for all jobs executed from the same main()-method > (Note: for applications with more than one job there are currently still significant > limitations, like missing high-availability). Technically, a session cluster > dedicated to all jobs submitted from the same main() method. > * supported by standalone, native kubernetes, YARN > > ## Session Mode > * main() method executed in client > * dependencies are distributed from and by the client to all nodes > * cluster is shared by multiple jobs submitted from different clients, > independent lifecycle > * supported by standalone, Native Kubernetes, YARN > > ## Per-Job Mode > * main() method executed in client > * dependencies are distributed from and by the client to all nodes > * dedicated cluster for a single job > * supported by YARN only > > > *# Reasons to Keep* > * There are use cases where you might need the > combination of a single job per cluster, but main() method execution in the > client. This combination is only supported by per-job mode. > * It currently exists. Existing users will need to migrate to either > session or application mode. > > > *# Reasons to Drop* > * With Per-Job Mode and Application Mode we have two > modes that for most users probably do the same thing. Specifically, for > those users that don't care where the main() method is executed and want to > submit a single job per cluster. Having two ways to do the same thing is > confusing. > * Per-Job Mode is only supported by YARN anyway. If we keep it, we should > work towards support in Kubernetes and Standalone, too, to reduce special > casing. 
> * Dropping per-job mode would reduce complexity in the code and allow us to > dedicate more resources to the other two deployment modes. > * I believe with session mode and application mode we have two easily > distinguishable and understandable deployment modes that cover Flink's use > cases: > * session mode: OLAP-style, interactive jobs/queries, short-lived batch > jobs, very small jobs, traditional cluster-centric deployment mode (fits > the "Hadoop world") > * application mode: long-running streaming jobs, large scale & > heterogeneous jobs (resource isolation!), application-centric deployment > mode (fits the "Kubernetes world") > > > *# Call to Action* > * Do you use per-job mode? If so, why & would you be able to migrate to one > of the other methods? > * Am I missing any pros/cons? > * Are you in favor of dropping per-job mode midterm? > > Cheers and thank you, > > Konstantin > > -- > > Konstantin Knauf > > https://twitter.com/snntrable > > https://github.com/knaufk >
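The three deployment modes compared in the message above can be condensed as data, which makes the overlap between per-job and application mode easy to see. Everything below paraphrases the email; none of the names are Flink APIs.

```python
# Condensed from the discussion: where main() runs, what a cluster serves,
# and which resource providers support each mode.
DEPLOYMENT_MODES = {
    "application": {"main_runs_on": "jobmanager",
                    "cluster_serves": "all jobs of one main() method",
                    "resource_providers": {"standalone", "native kubernetes", "yarn"}},
    "session": {"main_runs_on": "client",
                "cluster_serves": "jobs from multiple clients",
                "resource_providers": {"standalone", "native kubernetes", "yarn"}},
    "per-job": {"main_runs_on": "client",
                "cluster_serves": "a single job",
                "resource_providers": {"yarn"}},
}

def modes_supporting(provider: str) -> set:
    """Which deployment modes a given resource provider supports."""
    return {mode for mode, props in DEPLOYMENT_MODES.items()
            if provider in props["resource_providers"]}
```

Querying this table shows the asymmetry driving the proposal: YARN supports all three modes, while Kubernetes and standalone only support two, and per-job differs from application mode solely in where main() executes.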
Re: [DISCUSS] Releasing Flink 1.13.6
Thanks for driving this Martijn, +1 for the release Also big thanks to Konstantin for volunteering Best, D. On Mon, Jan 24, 2022 at 3:24 PM Till Rohrmann wrote: > +1 for the 1.13.6 release and thanks for volunteering Konstantin. > > Cheers, > Till > > On Mon, Jan 24, 2022 at 2:57 PM Konstantin Knauf > wrote: > > > Thanks for starting the discussion and +1 to releasing. > > > > I am happy to manage the release aka learn how to do this. > > > > Cheers, > > > > Konstantin > > > > On Mon, Jan 24, 2022 at 2:52 PM Martijn Visser > > wrote: > > > > > I would like to start a discussion on releasing Flink 1.13.6. Flink > > 1.13.5 > > > was the latest release on the 16th of December, which was the emergency > > > release for the Log4j CVE [1]. Flink 1.13.4 was cancelled, leaving > Flink > > > 1.13.3 as the last real bugfix release. This one was released on the > 19th > > > of October last year. > > > > > > Since then, there have been 61 fixed tickets, excluding the test > > > stabilities [3]. This includes a blocker and a couple of critical > issues. > > > > > > Is there a PMC member who would like to manage the release? I'm more > than > > > happy to help with monitoring the status of the tickets. > > > > > > Best regards, > > > > > > Martijn Visser > > > https://twitter.com/MartijnVisser82 > > > > > > [1] https://flink.apache.org/news/2021/12/16/log4j-patch-releases.html > > > [2] https://flink.apache.org/news/2021/10/19/release-1.13.3.html > > > [3] JQL filter: project = FLINK AND resolution = Fixed AND fixVersion = > > > 1.13.6 AND labels != test-stability ORDER BY priority DESC, created > DESC > > > > > > > > > -- > > > > Konstantin Knauf > > > > https://twitter.com/snntrable > > > > https://github.com/knaufk > > >
[jira] [Created] (FLINK-25805) Use compact DataType serialization for default classes instead of internal ones
Timo Walther created FLINK-25805: Summary: Use compact DataType serialization for default classes instead of internal ones Key: FLINK-25805 URL: https://issues.apache.org/jira/browse/FLINK-25805 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Reporter: Timo Walther It is more likely that default conversion classes spam the plan than internal classes. In most cases when internal classes are used, they usually also use logical type instead of data type. So it should be safer to skip default conversion classes. This also reduces the plan size for serializing `ResolvedSchema`. -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: [DISCUSS] FLIP-211: Kerberos delegation token framework
First of all, thanks for investing your time and helping me out. As I see it, you have pretty solid knowledge of the RPC area. I would like to rely on your knowledge since I'm learning this part. > - Do we need to introduce a new RPC method or can we for example piggyback on heartbeats? I'm fine with either solution, but one thing is important conceptually. There are fundamentally two ways tokens can be updated: - Push way: when there are new DTs, the JM JVM pushes the DTs to the TM JVMs. This is the preferred one since only a tiny amount of control logic is needed. - Pull way: each TM polls the JM for new tokens and each TM decides alone whether its DTs need to be updated or not. As you've mentioned, some ID would need to be generated here, and it would generate quite a bit of additional network traffic, which can definitely be avoided. As a final thought, in Spark we had this way of DT propagation and we had major issues with it. So all in all, the DTM needs to obtain new tokens and there must be a way to send this data from the JM to all TMs. > - What delivery semantics are we looking for? (what if we're only able to update subset of TMs / what happens if we exhaust retries / should we even have the retry mechanism whatsoever) - I have a feeling that somehow leveraging the existing heartbeat mechanism could help to answer these questions Let's go through these questions one by one. > What delivery semantics are we looking for? The DTM must receive an exception when at least one TM was not able to get the DTs. > what if we're only able to update subset of TMs? In such a case the DTM will reschedule the token obtain after the "security.kerberos.tokens.retry-wait" time. > what happens if we exhaust retries? There is no retry limit. In the default configuration tokens need to be re-obtained after one day. The DTM tries to obtain new tokens after 1 day * 0.75 (security.kerberos.tokens.renewal-ratio) = 18 hours. 
When that fails, it retries after "security.kerberos.tokens.retry-wait", which is 1 hour by default. If it never succeeds, then an authentication error is going to happen on the TM side and the workload is going to stop. > should we even have the retry mechanism whatsoever? Yes, because there are always temporary cluster issues. > What does it mean for the running application (how does this look like from the user perspective)? As far as I remember the logs are only collected ("aggregated") after the container is stopped, is that correct? With the default config it works like that, but YARN can be forced to aggregate at specific intervals. A useful feature is forcing YARN to aggregate logs while the job is still running. For long-running jobs such as streaming jobs, this is invaluable. To do this, yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds must be set to a non-negative value. When this is set, a timer will be set for the given duration, and whenever that timer goes off, log aggregation will run on new files. > I think this topic should get its own section in the FLIP (having some cross reference to YARN ticket would be really useful, but I'm not sure if there are any). I think this is important knowledge, but this FLIP is not touching the already existing behavior. DTs are set on the AM container, which is renewed by YARN until it's not possible anymore. Any new code is not going to change this limitation. BTW, there is no Jira for this. If you think it's worth writing this down, then I think the right place is the official security docs, as a caveat. > If we split the FLIP into two parts / sections that I've suggested, I don't really think that you need to explicitly test for each deployment scenario / cluster framework, because the DTM part is completely independent of the deployment target. 
Basically this is what I'm aiming for with "making it work with the standalone" (as simple as starting a new java process) Flink first (which is also how most people deploy streaming applications on k8s, and the direction we're pushing forward with the auto-scaling / reactive mode initiatives). I see your point and agree with the main direction. k8s is the megatrend which most people will use sooner or later. Not 100% sure what kind of split you suggest, but in my view the main target is to add this feature and I'm open to any logical work ordering. Please share the specific details and we'll work it out... G On Mon, Jan 24, 2022 at 3:04 PM David Morávek wrote: > > > > Could you point to a code where you think it could be added exactly? A > > helping hand is welcome here > > > > I think you can take a look at _ResourceManagerPartitionTracker_ [1] which > seems to have somewhat similar properties to the DTM. > > One topic that needs to be addressed there is how the RPC with the > _TaskExecutorGateway_ should look like. > - Do we need to introduce a new RPC method or can we for example piggyback > on heartbeats? > - What delivery semantics are we looking for? (what if we're only able to > update subset of TMs /
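The renewal timing discussed in this thread (obtain at lifetime × renewal-ratio, retry after a fixed wait on failure) can be sketched with a small function. The option names "security.kerberos.tokens.renewal-ratio" and "security.kerberos.tokens.retry-wait" come from the thread itself; the function is a hedged illustration, not the FLIP-211 implementation.

```python
DAY_SECONDS = 24 * 60 * 60

def next_obtain_delay(token_lifetime_s, renewal_ratio=0.75, retry_wait_s=3600,
                      last_obtain_succeeded=True):
    """Seconds until the DTM would next try to obtain delegation tokens.

    After a successful obtain, wait lifetime * renewal-ratio (e.g. one day
    * 0.75 = 18 hours, as in the thread). After a failure, retry after the
    configured retry-wait (1 hour by default).
    """
    if last_obtain_succeeded:
        return token_lifetime_s * renewal_ratio
    return retry_wait_s
```

The ratio < 1 gives the DTM a 6-hour slack window before the one-day token actually expires, during which hourly retries can absorb temporary cluster issues.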
[jira] [Created] (FLINK-25804) Loading and running connector code use separated ClassLoader.
Ada Wong created FLINK-25804: Summary: Loading and running connector code use separated ClassLoader. Key: FLINK-25804 URL: https://issues.apache.org/jira/browse/FLINK-25804 Project: Flink Issue Type: New Feature Components: API / Core, Connectors / Common, Table SQL / Runtime Affects Versions: 1.14.3 Reporter: Ada Wong

When we use multiple connectors, we can have class conflicts. These conflicts cannot be solved by shading. The following is example code:

{code:java}
CREATE TABLE es6 (
  user_id STRING,
  user_name STRING,
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'elasticsearch-6',
  'hosts' = 'http://localhost:9200',
  'index' = 'users',
  'document-type' = 'foo'
);

CREATE TABLE es7 (
  user_id STRING,
  user_name STRING,
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'elasticsearch-7',
  'hosts' = 'http://localhost:9200',
  'index' = 'users'
);

CREATE TABLE ods (
  user_id STRING,
  user_name STRING
) WITH (
  'connector' = 'datagen'
);

INSERT INTO es6 SELECT user_id, user_name FROM ods;
INSERT INTO es7 SELECT user_id, user_name FROM ods;
{code}

Inspired by the PluginManager, PluginFileSystemFactory and ClassLoaderFixingFileSystem classes, could we create ClassLoaderFixing* classes to avoid class conflicts? Such as ClassLoaderFixingDynamicTableFactory, ClassLoaderFixingSink or ClassLoaderFixingSinkFunction.
If we use ClassLoader fixing, each call to SinkFunction#invoke will switch the classloader via Thread#currentThread()#setContextClassLoader(). Does setContextClassLoader() have heavy overhead? -- This message was sent by Atlassian Jira (v8.20.1#820001)
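The save/switch/restore pattern the ticket alludes to can be sketched as follows (class and method names here are mine, not Flink's; setContextClassLoader itself is essentially a field write on Thread, so the per-call cost is typically small compared to the wrapped work):

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;

// Sketch of a "ClassLoaderFixing" wrapper: run an action with the
// connector's own ClassLoader as the thread context ClassLoader, and
// restore the previous one afterwards, even if the action throws.
public final class ContextClassLoaderUtil {

    public static <T> T runWithClassLoader(ClassLoader loader, Callable<T> action)
            throws Exception {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return action.call();
        } finally {
            current.setContextClassLoader(previous); // always restore
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a per-connector ClassLoader.
        ClassLoader connectorLoader =
                new URLClassLoader(new URL[0], ContextClassLoaderUtil.class.getClassLoader());
        ClassLoader seenInside = runWithClassLoader(
                connectorLoader, () -> Thread.currentThread().getContextClassLoader());
        System.out.println(seenInside == connectorLoader);          // prints "true"
        System.out.println(Thread.currentThread()
                .getContextClassLoader() == connectorLoader);       // prints "false"
    }
}
```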
Re: Re: Re: [VOTE] FLIP-204: Introduce Hash Lookup Join
+1 (non-binding) Thanks for driving this. Best regards zhangmang1 wrote on Tue, Jan 25, 2022 at 10:39: > +1
Re: [DISCUSS] Stateful Functions 3.2.0 release?
Thanks everyone. I will start creating the artifacts. Cheers, Till On Mon, Jan 24, 2022 at 9:13 PM Tzu-Li (Gordon) Tai wrote: > +1 > > On Mon, Jan 24, 2022 at 9:43 AM Igal Shilman wrote: > > > +1 and thanks for volunteering to be the release manager! > > > > Cheers, > > Igal. > > > > On Mon, Jan 24, 2022 at 4:13 PM Seth Wiesman > wrote: > > > > > +1 > > > > > > These are already a useful set of features to release to our users. > > > > > > Seth > > > > > > On Mon, Jan 24, 2022 at 8:45 AM Till Rohrmann > > > wrote: > > > > > > > Hi everyone, > > > > > > > > We would like to kick off a new StateFun release 3.2.0. The new > release > > > > will include the new JavaScript SDK and some useful major features: > > > > > > > > * JavaScript SDK [1] > > > > * Flink version upgrade to 1.14.3 [2] > > > > * Support different remote functions module names [3] > > > > * Allow creating custom metrics [4] > > > > > > > > The only missing ticket for this release is the documentation of the > > > > JavaScript SDK [5]. We plan to complete this in the next few days. > > > > > > > > Please let us know if you have any concerns. > > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-24256 > > > > [2] https://issues.apache.org/jira/browse/FLINK-25708 > > > > [3] https://issues.apache.org/jira/browse/FLINK-25308 > > > > [4] https://issues.apache.org/jira/browse/FLINK-22533 > > > > [5] https://issues.apache.org/jira/browse/FLINK-25775 > > > > > > > > Cheers, > > > > Till > > > > > > > > > >
[jira] [Created] (FLINK-25803) Implement partition and bucket filter of FileStoreScanImpl
Caizhi Weng created FLINK-25803: --- Summary: Implement partition and bucket filter of FileStoreScanImpl Key: FLINK-25803 URL: https://issues.apache.org/jira/browse/FLINK-25803 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Caizhi Weng Fix For: 1.15.0