[GitHub] [flink] flinkbot edited a comment on pull request #14859: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch
flinkbot edited a comment on pull request #14859:
URL: https://github.com/apache/flink/pull/14859#issuecomment-773047857

## CI report:

* 31aab909d881e3f2bf60197b1d27048114f082c0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12903)
* 2476b3750d3dd73e706a7fc0b8887c1a5631c246 UNKNOWN

Bot commands
The @flinkbot bot supports the following commands:

- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14842: [FLINK-21238][python] Support to close PythonFunctionFactory
flinkbot edited a comment on pull request #14842:
URL: https://github.com/apache/flink/pull/14842#issuecomment-772256636

## CI report:

* 0b46d07633ccbd6774129dfa4f78424e6cacc533 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12936)
* 5c977a53fe75be12cb949431d0e2e1fb8084c8ab Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12981)
* 2469c99bbf68659d33f4f6a8f36f77c0adc8f870 UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #14834: [FLINK-21234][build system] Adding timeout to all tasks
flinkbot edited a comment on pull request #14834:
URL: https://github.com/apache/flink/pull/14834#issuecomment-771607546

## CI report:

* 3c55d1b87076148f095c08a55319f1b898dc932d UNKNOWN
* 5c8ac11faf9d6fbcad7550bec8abf6a6343347c3 UNKNOWN
* 4e50c2f87a73c3a79c178a580c84c31924a72c4a Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12965)
* 35e6f16b00a6ac645dc140b7289272074a82fd19 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12980)
[GitHub] [flink] flinkbot edited a comment on pull request #14727: [FLINK-19945][Connectors / FileSystem]Support sink parallelism config…
flinkbot edited a comment on pull request #14727:
URL: https://github.com/apache/flink/pull/14727#issuecomment-765189922

## CI report:

* a95a451df26b7d32d9a22b801cbc8aff7ceb719e Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12979)
[GitHub] [flink] flinkbot edited a comment on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration
flinkbot edited a comment on pull request #14629:
URL: https://github.com/apache/flink/pull/14629#issuecomment-759361463

## CI report:

* 5426ab123aef03d4710c0fea1237fa014684f372 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12918)
* 145f891faab2190d83a5ea35ac86a025605b43bd Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12976)
* 6b93e4c21b1edf4cf9c071d88b78ab4230fbc004 UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #14379: [FLINK-20563][hive] Support built-in functions for Hive versions prio…
flinkbot edited a comment on pull request #14379:
URL: https://github.com/apache/flink/pull/14379#issuecomment-73009

## CI report:

* 64460eaf888b0333b8aed626d48f66fd875997cf Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12972)
[GitHub] [flink] leonardBang commented on pull request #14863: [FLINK-21203]Don’t collect -U&+U Row When they are equals In the LastRowFunction
leonardBang commented on pull request #14863:
URL: https://github.com/apache/flink/pull/14863#issuecomment-773854776

@wangpeibin713 Could you rebase your branch onto master instead of using `git merge`? I noticed you used `git merge`, which leaves the current branch with two parents.
[jira] [Created] (FLINK-21291) FlinkKafka Consumer can't dynamically discover partition updates
zhangyunyun created FLINK-21291:
---

Summary: FlinkKafka Consumer can't dynamically discover partition updates
Key: FLINK-21291
URL: https://issues.apache.org/jira/browse/FLINK-21291
Project: Flink
Issue Type: Bug
Components: Connectors / Kafka
Affects Versions: 1.11.2
Reporter: zhangyunyun

When starting the job, a WARN log like the one below appears:

{code:java}
WARN org.apache.kafka.clients.consumer.ConsumerConfig - The configuration 'flink.partition-discovery.interval-millis' was supplied but isn't a known config.
{code}

I then tried to change the Kafka partition count from 3 to 4 with this command:

{code:java}
./kafka-topics.sh --alter --zookeeper 10.0.10.21:15311 --topic STRUCTED_LOG --partitions 4
{code}

but it doesn't work. How can I deal with this problem? Thanks a lot.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
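For illustration, the behaviour the reporter expects can be sketched with a minimal, hypothetical Python model. `MockBroker` and `DiscoveringConsumer` below are illustrative stand-ins, not Flink or Kafka APIs; the sketch only shows the idea that the consumer must periodically re-read topic metadata (which Flink does only when `flink.partition-discovery.interval-millis` is set to a positive value in the consumer properties; the Kafka client's "not a known config" warning is expected, since that key is read by Flink rather than by Kafka, and is harmless on its own).

```python
# Hypothetical simulation of periodic Kafka partition discovery.
# MockBroker / DiscoveringConsumer are illustrative names, not real APIs.

class MockBroker:
    """Stands in for the Kafka cluster's partition metadata."""
    def __init__(self, num_partitions):
        self.num_partitions = num_partitions

    def partitions_for(self, topic):
        return [(topic, p) for p in range(self.num_partitions)]

class DiscoveringConsumer:
    """Re-checks topic metadata on each discovery pass and picks up
    any partitions it has not been assigned yet."""
    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.assigned = set()

    def discover(self):
        new = [tp for tp in self.broker.partitions_for(self.topic)
               if tp not in self.assigned]
        self.assigned.update(new)
        return new

broker = MockBroker(3)
consumer = DiscoveringConsumer(broker, "STRUCTED_LOG")
consumer.discover()            # initial assignment: partitions 0..2
broker.num_partitions = 4      # topic altered from 3 to 4 partitions
newly_found = consumer.discover()  # the new partition is picked up
```

Without a positive discovery interval, Flink performs only the initial assignment, which matches the reporter's symptom of the fourth partition never being consumed.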
[jira] [Updated] (FLINK-21045) Improve Usability of Pluggable Modules
[ https://issues.apache.org/jira/browse/FLINK-21045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Jiang updated FLINK-21045: --- Description: This improvement aims to # Simplify the module discovery mapping by module name. This will encourage users to create singleton of module instances. # Support changing the resolution order of modules in a flexible manner. This will introduce two methods {{#useModules}} and {{#listFullModules}} in both {{ModuleManager}} and {{TableEnvironment}}. # Support SQL syntax upon {{LOAD/UNLOAD MODULE}}, {{USE MODULES}}, and {{SHOW [FULL] MODULES}} in both {{FlinkSqlParserImpl}} and {{SqlClient}}. # Update the documentation to keep users informed of this improvement. Please reach to the [discussion thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLINK-21045-Support-load-module-and-unload-module-SQL-syntax-td48398.html] for more details. was: At present, Flink SQL doesn't support the 'load module' and 'unload module' SQL syntax. It's necessary for uses in the situation that users load and unload user-defined module through table api or sql client. SQL syntax has been proposed in FLIP-68: https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules > Improve Usability of Pluggable Modules > -- > > Key: FLINK-21045 > URL: https://issues.apache.org/jira/browse/FLINK-21045 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.13.0 >Reporter: Nicholas Jiang >Assignee: Jane Chan >Priority: Major > Fix For: 1.13.0 > > > This improvement aims to > # Simplify the module discovery mapping by module name. This will encourage > users to create singleton of module instances. > # Support changing the resolution order of modules in a flexible manner. > This will introduce two methods {{#useModules}} and {{#listFullModules}} in > both {{ModuleManager}} and {{TableEnvironment}}. 
> # Support SQL syntax upon {{LOAD/UNLOAD MODULE}}, {{USE MODULES}}, and > {{SHOW [FULL] MODULES}} in both {{FlinkSqlParserImpl}} and {{SqlClient}}. > # Update the documentation to keep users informed of this improvement. > Please reach to the [discussion > thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLINK-21045-Support-load-module-and-unload-module-SQL-syntax-td48398.html] > for more details.
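The resolution-order semantics proposed in the bullet list above can be sketched as a small Python model. This is a hypothetical illustration, not Flink's actual `ModuleManager` Java API: modules are registered once per name, `use_modules` changes the resolution order (and implicitly disables modules left out of it), and `list_full_modules` reports every loaded module together with a used/unused flag.

```python
# Hypothetical model of the proposed module-manager semantics.

class ModuleManager:
    def __init__(self):
        self.loaded = {}   # name -> module instance (one instance per name)
        self.used = []     # current resolution order

    def load_module(self, name, module):
        if name in self.loaded:
            raise ValueError(f"module {name} already loaded")
        self.loaded[name] = module
        self.used.append(name)  # a newly loaded module is used by default

    def use_modules(self, *names):
        for n in names:
            if n not in self.loaded:
                raise ValueError(f"module {n} is not loaded")
        self.used = list(names)  # reorders, and disables omitted modules

    def list_full_modules(self):
        # every loaded module, with whether it is currently used
        return [(n, n in self.used) for n in self.loaded]

mm = ModuleManager()
mm.load_module("core", object())
mm.load_module("hive", object())
mm.use_modules("hive")  # 'core' stays loaded but is no longer used
```

The distinction between "loaded" and "used" is what makes `SHOW MODULES` vs `SHOW FULL MODULES` meaningful in the proposed SQL syntax.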
[jira] [Updated] (FLINK-21045) Improve Usability of Pluggable Modules
[ https://issues.apache.org/jira/browse/FLINK-21045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Jiang updated FLINK-21045: --- Summary: Improve Usability of Pluggable Modules (was: Support 'load module' and 'unload module' SQL syntax) > Improve Usability of Pluggable Modules > -- > > Key: FLINK-21045 > URL: https://issues.apache.org/jira/browse/FLINK-21045 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.13.0 >Reporter: Nicholas Jiang >Assignee: Jane Chan >Priority: Major > Fix For: 1.13.0 > > > At present, Flink SQL doesn't support the 'load module' and 'unload module' > SQL syntax. It's necessary for uses in the situation that users load and > unload user-defined module through table api or sql client. > SQL syntax has been proposed in FLIP-68: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
[jira] [Comment Edited] (FLINK-21045) Support 'load module' and 'unload module' SQL syntax
[ https://issues.apache.org/jira/browse/FLINK-21045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279423#comment-17279423 ] Jane Chan edited comment on FLINK-21045 at 2/5/21, 7:28 AM: Hi [~nicholasjiang], "Extend Core Table System with Pluggable Modules" is the title of FLIP-68 and FLINK-14132. How about {panel:title=Improve Usability of Pluggable Modules} *Description* This improvement aims to # Simplify the module discovery mapping by module name. This will encourage users to create singleton of module instances. # Support changing the resolution order of modules in a flexible manner. This will introduce two methods {{#useModules}} and {{#listFullModules}} in both {{ModuleManager}} and {{TableEnvironment}}. # Support SQL syntax upon {{LOAD/UNLOAD MODULE}}, {{USE MODULES}}, and {{SHOW [FULL] MODULES}} in both {{FlinkSqlParserImpl}} and {{SqlClient}}. # Update the documentation to keep users informed of this improvement. Please reach to the [discussion thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLINK-21045-Support-load-module-and-unload-module-SQL-syntax-td48398.html] for more details. {panel} The proposed subtasks follow the description bullet list. What do you think? was (Author: qingyue): Hi [~nicholasjiang], "Extend Core Table System with Pluggable Modules" is the title of FLIP-68 and FLINK-14132. How about {panel:title=Improve Usability of Pluggable Modules} *Description* This improvement aims to # Simplify the module discovery mapping by module name. This will encourage users to create singleton of module instance. # Support changing the resolution order of modules in a flexible manner. This will introduce two methods {{#useModules}} and {{#listFullModules}} in both {{ModuleManager}} and {{TableEnvironment}}. # Support SQL syntax upon {{LOAD/UNLOAD MODULE}}, {{USE MODULES}}, and {{SHOW [FULL] MODULES in both {{SqlParserImpl}} and {{SqlClient}}. 
# Update the documentation to keep users informed with this improvement. Please reach to the [discussion thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLINK-21045-Support-load-module-and-unload-module-SQL-syntax-td48398.html] for more details. {panel} The proposed subtasks follows the description bullet list. What do you think? > Support 'load module' and 'unload module' SQL syntax > > > Key: FLINK-21045 > URL: https://issues.apache.org/jira/browse/FLINK-21045 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.13.0 >Reporter: Nicholas Jiang >Assignee: Jane Chan >Priority: Major > Fix For: 1.13.0 > > > At present, Flink SQL doesn't support the 'load module' and 'unload module' > SQL syntax. It's necessary for uses in the situation that users load and > unload user-defined module through table api or sql client. > SQL syntax has been proposed in FLIP-68: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
[jira] [Commented] (FLINK-21045) Support 'load module' and 'unload module' SQL syntax
[ https://issues.apache.org/jira/browse/FLINK-21045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279423#comment-17279423 ] Jane Chan commented on FLINK-21045: --- Hi [~nicholasjiang], "Extend Core Table System with Pluggable Modules" is the title of FLIP-68 and FLINK-14132. How about {panel:title=Improve Usability of Pluggable Modules} *Description* This improvement aims to # Simplify the module discovery mapping by module name. This will encourage users to create singleton of module instance. # Support changing the resolution order of modules in a flexible manner. This will introduce two methods {{#useModules}} and {{#listFullModules}} in both {{ModuleManager}} and {{TableEnvironment}}. # Support SQL syntax upon {{LOAD/UNLOAD MODULE}}, {{USE MODULES}}, and {{SHOW [FULL] MODULES in both {{SqlParserImpl}} and {{SqlClient}}. # Update the documentation to keep users informed with this improvement. Please reach to the [discussion thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLINK-21045-Support-load-module-and-unload-module-SQL-syntax-td48398.html] for more details. {panel} The proposed subtasks follows the description bullet list. What do you think? > Support 'load module' and 'unload module' SQL syntax > > > Key: FLINK-21045 > URL: https://issues.apache.org/jira/browse/FLINK-21045 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.13.0 >Reporter: Nicholas Jiang >Assignee: Jane Chan >Priority: Major > Fix For: 1.13.0 > > > At present, Flink SQL doesn't support the 'load module' and 'unload module' > SQL syntax. It's necessary for uses in the situation that users load and > unload user-defined module through table api or sql client. > SQL syntax has been proposed in FLIP-68: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
[GitHub] [flink-benchmarks] zhuzhurk commented on a change in pull request #7: [FLINK-20612][runtime] Add benchmarks for scheduler
zhuzhurk commented on a change in pull request #7:
URL: https://github.com/apache/flink-benchmarks/pull/7#discussion_r570766969

## File path: src/main/java/org/apache/flink/scheduler/benchmark/SchedulerBenchmarkUtils.java

@@ -0,0 +1,245 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.scheduler.benchmark;
+
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.ExecutionMode;
+import org.apache.flink.api.common.time.Deadline;
+import org.apache.flink.api.common.time.Time;
+import org.apache.flink.configuration.JobManagerOptions;
+import org.apache.flink.runtime.akka.AkkaUtils;
+import org.apache.flink.runtime.blob.VoidBlobWriter;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.AccessExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.DummyJobInformation;
+import org.apache.flink.runtime.executiongraph.ExecutionAttemptID;
+import org.apache.flink.runtime.executiongraph.ExecutionGraph;
+import org.apache.flink.runtime.executiongraph.ExecutionVertex;
+import org.apache.flink.runtime.executiongraph.JobInformation;
+import org.apache.flink.runtime.executiongraph.NoOpExecutionDeploymentListener;
+import org.apache.flink.runtime.executiongraph.TaskExecutionStateTransition;
+import org.apache.flink.runtime.executiongraph.failover.RestartAllStrategy;
+import org.apache.flink.runtime.executiongraph.failover.flip1.partitionrelease.RegionPartitionReleaseStrategy;
+import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
+import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
+import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.jobgraph.JobVertex;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.jobgraph.ScheduleMode;
+import org.apache.flink.runtime.jobmaster.LogicalSlot;
+import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
+import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
+import org.apache.flink.runtime.scheduler.DefaultScheduler;
+import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
+import org.apache.flink.runtime.taskmanager.TaskExecutionState;
+import org.apache.flink.runtime.testingUtils.TestingUtils;
+import org.apache.flink.runtime.testtasks.NoOpInvokable;
+
+import java.io.IOException;
+import java.time.Duration;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.concurrent.TimeoutException;
+import java.util.function.Predicate;
+
+/**
+ * Utilities for runtime benchmarks.
+ */
+public class SchedulerBenchmarkUtils {
+
+    public static List<JobVertex> createDefaultJobVertices(
+            int parallelism,
+            DistributionPattern distributionPattern,
+            ResultPartitionType resultPartitionType) {
+
+        List<JobVertex> jobVertices = new ArrayList<>();
+
+        final JobVertex source = new JobVertex("source");
+        source.setInvokableClass(NoOpInvokable.class);
+        source.setParallelism(parallelism);
+        jobVertices.add(source);
+
+        final JobVertex sink = new JobVertex("sink");
+        sink.setInvokableClass(NoOpInvokable.class);
+        sink.setParallelism(parallelism);
+        jobVertices.add(sink);
+
+        sink.connectNewDataSetAsInput(source, distributionPattern, resultPartitionType);
+
+        return jobVertices;
+    }
+
+    public static JobGraph createJobGraph(
+            List<JobVertex> jobVertices,
+            ScheduleMode scheduleMode,
+            ExecutionMode executionMode) throws IOException {
+
+        final JobGraph jobGraph = new JobGraph(jobVertices.toArray(new JobVertex[0]));
+
+        jobGraph.setScheduleMode(scheduleMode);
+        ExecutionConfig executionConfig = new ExecutionConfig();
+        executionConfig.setExecut
[GitHub] [flink] flinkbot edited a comment on pull request #14876: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`
flinkbot edited a comment on pull request #14876:
URL: https://github.com/apache/flink/pull/14876#issuecomment-773757019

## CI report:

* 100fb1eeb7f94949efc85cc6921a5c653a56163d Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12974)
[GitHub] [flink] flinkbot edited a comment on pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API
flinkbot edited a comment on pull request #14877:
URL: https://github.com/apache/flink/pull/14877#issuecomment-773826928

## CI report:

* eb7d6cb5de1138ee76b840b2b5a5dd850a66b407 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12977)
[GitHub] [flink] flinkbot edited a comment on pull request #14842: [FLINK-21238][python] Support to close PythonFunctionFactory
flinkbot edited a comment on pull request #14842:
URL: https://github.com/apache/flink/pull/14842#issuecomment-772256636

## CI report:

* 0b46d07633ccbd6774129dfa4f78424e6cacc533 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12936)
* 5c977a53fe75be12cb949431d0e2e1fb8084c8ab UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #14844: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch
flinkbot edited a comment on pull request #14844:
URL: https://github.com/apache/flink/pull/14844#issuecomment-772295878

## CI report:

* 54ed8511139561657f10d84dc25b189acbbf156c Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12973)
* e3c6f8e96db156989b8b749cb3f0431105997b7f Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12978)
[GitHub] [flink] flinkbot edited a comment on pull request #14727: [FLINK-19945][Connectors / FileSystem]Support sink parallelism config…
flinkbot edited a comment on pull request #14727:
URL: https://github.com/apache/flink/pull/14727#issuecomment-765189922

## CI report:

* f70acaad1bcc41000a101351e597c11372329e21 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12837)
* a95a451df26b7d32d9a22b801cbc8aff7ceb719e UNKNOWN
[jira] [Created] (FLINK-21290) Support Projection push down for Window TVF
Jark Wu created FLINK-21290:
---

Summary: Support Projection push down for Window TVF
Key: FLINK-21290
URL: https://issues.apache.org/jira/browse/FLINK-21290
Project: Flink
Issue Type: Sub-task
Components: Table SQL / Planner
Reporter: Jark Wu

{code:scala}
@Test
def testTumble_ProjectionPushDown(): Unit = {
  // TODO: [b, c, e, proctime] are never used, should be pruned
  val sql =
    """
      |SELECT
      |  a,
      |  window_start,
      |  window_end,
      |  count(*),
      |  sum(d)
      |FROM TABLE(TUMBLE(TABLE MyTable, DESCRIPTOR(rowtime), INTERVAL '15' MINUTE))
      |GROUP BY a, window_start, window_end
    """.stripMargin
  util.verifyRelPlan(sql)
}
{code}

For the above test, currently we get the following plan:

{code}
Calc(select=[a, window_start, window_end, EXPR$3, EXPR$4])
+- WindowAggregate(groupBy=[a], window=[TUMBLE(time_col=[rowtime], size=[15 min])], select=[a, COUNT(*) AS EXPR$3, SUM(d) AS EXPR$4, start('w$) AS window_start, end('w$) AS window_end])
   +- Exchange(distribution=[hash[a]])
      +- Calc(select=[a, d, rowtime])
         +- WatermarkAssigner(rowtime=[rowtime], watermark=[-(rowtime, 1000:INTERVAL SECOND)])
            +- Calc(select=[a, b, c, d, e, rowtime, PROCTIME() AS proctime])
               +- TableSourceScan(table=[[default_catalog, default_database, MyTable]], fields=[a, b, c, d, e, rowtime])
{code}

It should be able to prune fields and get the following plan:

{code}
Calc(select=[a, window_start, window_end, EXPR$3, EXPR$4])
+- WindowAggregate(groupBy=[a], window=[TUMBLE(time_col=[rowtime], size=[15 min])], select=[a, COUNT(*) AS EXPR$3, SUM(d) AS EXPR$4, start('w$) AS window_start, end('w$) AS window_end])
   +- Exchange(distribution=[hash[a]])
      +- WatermarkAssigner(rowtime=[rowtime], watermark=[-(rowtime, 1000:INTERVAL SECOND)])
         +- TableSourceScan(table=[[default_catalog, default_database, MyTable]], fields=[a, d, rowtime])
{code}

The reason is that we didn't transpose the Project and the WindowTableFunction in the logical phase.
{code}
LogicalAggregate(group=[{0, 1, 2}], EXPR$3=[COUNT()], EXPR$4=[SUM($3)])
+- LogicalProject(a=[$0], window_start=[$7], window_end=[$8], d=[$3])
   +- LogicalTableFunctionScan(invocation=[TUMBLE($6, DESCRIPTOR($5), 90:INTERVAL MINUTE)], rowType=[RecordType(INTEGER a, BIGINT b, VARCHAR(2147483647) c, DECIMAL(10, 3) d, BIGINT e, TIME ATTRIBUTE(ROWTIME) rowtime, TIME ATTRIBUTE(PROCTIME) proctime, TIMESTAMP(3) window_start, TIMESTAMP(3) window_end, TIME ATTRIBUTE(ROWTIME) window_time)])
      +- LogicalProject(a=[$0], b=[$1], c=[$2], d=[$3], e=[$4], rowtime=[$5], proctime=[$6])
         +- LogicalWatermarkAssigner(rowtime=[rowtime], watermark=[-($5, 1000:INTERVAL SECOND)])
            +- LogicalProject(a=[$0], b=[$1], c=[$2], d=[$3], e=[$4], rowtime=[$5], proctime=[PROCTIME()])
               +- LogicalTableScan(table=[[default_catalog, default_database, MyTable]])
{code}
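Conceptually, the effect of transposing the Project past the WindowTableFunction is plain field pruning at the source scan. A toy sketch, under the assumption that the set of columns referenced above the TVF is already known (`prune_source_fields` is a hypothetical helper, not the planner's actual rule):

```python
# Hypothetical sketch of projection push-down: keep only the source fields
# that are actually referenced above the window TVF, preserving scan order.

def prune_source_fields(needed, source_fields):
    """Return the subset of source fields the scan must produce."""
    return [f for f in source_fields if f in needed]

# Columns referenced above the TVF in the example: 'a' (group key),
# 'd' (SUM argument), and 'rowtime' (the window's time attribute).
needed = {"a", "d", "rowtime"}
source_fields = ["a", "b", "c", "d", "e", "rowtime", "proctime"]
pruned = prune_source_fields(needed, source_fields)
```

The pruned field list matches the `fields=[a, d, rowtime]` scan in the desired plan; `b`, `c`, `e`, and the derived `proctime` column are never read.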
[jira] [Commented] (FLINK-10520) Job save points REST API fails unless parameters are specified
[ https://issues.apache.org/jira/browse/FLINK-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279413#comment-17279413 ] Matthias commented on FLINK-10520: -- That's reasonable. I agree (y) > Job save points REST API fails unless parameters are specified > -- > > Key: FLINK-10520 > URL: https://issues.apache.org/jira/browse/FLINK-10520 > Project: Flink > Issue Type: Bug > Components: Runtime / REST >Affects Versions: 1.6.1 >Reporter: Elias Levy >Assignee: Chesnay Schepler >Priority: Minor > Labels: pull-request-available > Fix For: 1.13.0 > > > The new REST API POST endpoint, {{/jobs/:jobid/savepoints}}, returns an error > unless the request includes a body with all parameters ({{target-directory}} > and {{cancel-job}}), even though the > [documentation|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/runtime/rest/handler/job/savepoints/SavepointHandlers.html] > suggests these are optional. > If a POST request with no data is made, the response is a 400 status code > with the error message "Bad request received." > If the POST request submits an empty JSON object ( {} ), the response is a > 400 status code with the error message "Request did not match expected format > SavepointTriggerRequestBody." The same is true if only the > {{target-directory}} or {{cancel-job}} parameters are included. > As the system is configured with a default savepoint location, there > shouldn't be a need to include the parameter in the request.
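The behaviour the reporter asks for can be sketched as follows. This is a hedged, self-contained Python model of how a handler could treat both body fields as optional, not Flink's actual `SavepointHandlers` code; `DEFAULT_SAVEPOINT_DIR` stands in for the cluster's configured default savepoint location.

```python
# Hypothetical request parsing that makes both savepoint parameters optional.
import json

DEFAULT_SAVEPOINT_DIR = "hdfs:///flink/savepoints"  # assumed cluster default

def parse_savepoint_request(body: bytes):
    """Accept an empty body, an empty JSON object, or partial parameters,
    falling back to sensible defaults instead of rejecting the request."""
    payload = json.loads(body) if body else {}
    return {
        "target-directory": payload.get("target-directory", DEFAULT_SAVEPOINT_DIR),
        "cancel-job": payload.get("cancel-job", False),
    }

req = parse_savepoint_request(b"")                    # empty body: all defaults
req2 = parse_savepoint_request(b'{"cancel-job": true}')  # partial body
```

With defaults applied at parse time, neither an empty body nor a partial JSON object needs to produce a 400 response.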
[GitHub] [flink] XComp commented on pull request #14798: [FLINK-21187] Provide exception history for root causes
XComp commented on pull request #14798:
URL: https://github.com/apache/flink/pull/14798#issuecomment-773838859

The test failures reported by Azure seem to be related to [FLINK-21277](https://issues.apache.org/jira/browse/FLINK-21277). I'm going to rebase and squash the related commits together after @rkhachatryan gives his final ok.
[jira] [Commented] (FLINK-21208) pyarrow exception when using window with pandas udaf
[ https://issues.apache.org/jira/browse/FLINK-21208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279400#comment-17279400 ] Huang Xingbo commented on FLINK-21208: -- [~liuyufei] Arrow's serialization protocol serializes the schema info into a header before transmitting the data, which makes it effectively a stateful serializer. Beam, however, requires your serializer to be stateless. Neither approach is wrong, and each has its own considerations, but when they are used in combination there will be problems, unless you transmit the schema with each arrow batch. > pyarrow exception when using window with pandas udaf > > > Key: FLINK-21208 > URL: https://issues.apache.org/jira/browse/FLINK-21208 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.12.0, 1.12.1 >Reporter: YufeiLiu >Priority: Major > Labels: pull-request-available > > I wrote a pyflink demo and executed it in a local environment. The logic is > simple: generate records and aggregate them in a 100s tumble window, using a pandas > udaf. > But the job failed after several minutes. I don't think it's a resource > problem because the amount of data is small. Here is the error trace. > {code:java} > Caused by: org.apache.flink.streaming.runtime.tasks.AsynchronousException: > Caught exception while processing timer. 
> at > org.apache.flink.streaming.runtime.tasks.StreamTask$StreamTaskAsyncExceptionHandler.handleAsyncException(StreamTask.java:1108) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.handleAsyncException(StreamTask.java:1082) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invokeProcessingTimeCallback(StreamTask.java:1213) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$null$17(StreamTask.java:1202) > at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47) > at > org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:78) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:302) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:184) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) > at java.lang.Thread.run(Thread.java:748) > Caused by: TimerException{java.lang.RuntimeException: Failed to close remote > bundle} > ... 
11 more > Caused by: java.lang.RuntimeException: Failed to close remote bundle > at > org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.finishBundle(BeamPythonFunctionRunner.java:371) > at > org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.flush(BeamPythonFunctionRunner.java:325) > at > org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.invokeFinishBundle(AbstractPythonFunctionOperator.java:291) > at > org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.checkInvokeFinishBundleByTime(AbstractPythonFunctionOperator.java:285) > at > org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.lambda$open$0(AbstractPythonFunctionOperator.java:134) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invokeProcessingTimeCallback(StreamTask.java:1211) > ... 10 more > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error received from SDK harness for instruction > 3: Traceback (most recent call last): > File > "/Users/lyf/Library/Python/3.7/lib/python/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 253, in _execute > response = task() > File > "/Users/lyf/Library/Python/3.7/lib/python/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 310, in > lambda: self.create_worker().do_instruction(request), request) > File > "/Users/lyf/Library/Python/3.7/lib/python/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 480, in do_instruction > getattr(request, request_type), request.instruction_id) > File > "/Users/lyf/Library/Python/3.7/lib/python/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 515, in process_bundle > bundle_processor.process_bundle(instruction_id)) > File > "/Users/lyf/Library/Python/3.7/lib/python/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 978, in proces
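Huang Xingbo's framing above (Arrow's protocol writes the schema once as a stream header, so the serializer is stateful, while Beam expects a stateless coder) can be sketched with a toy length-prefixed framing in plain Java. This is only an illustration of the "schema in every batch" idea pursued in the linked PRs, not Arrow's or Flink's actual wire format; all names below are invented for the sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// A "stateful" protocol writes the schema once as a stream header, so a reader
// must have seen that header to decode later batches. The "stateless" framing
// sketched here repeats the schema in every batch, making each batch
// self-describing and decodable in isolation.
public class SelfDescribingBatches {

    // Encode one batch as [schemaLen][schemaBytes][dataLen][dataBytes].
    static byte[] encodeBatch(byte[] schema, byte[] data) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeInt(schema.length);
            out.write(schema);
            out.writeInt(data.length);
            out.write(data);
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A reader with no prior state can decode any single batch on its own.
    static byte[][] decodeBatch(byte[] batch) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(batch));
            byte[] schema = new byte[in.readInt()];
            in.readFully(schema);
            byte[] data = new byte[in.readInt()];
            in.readFully(data);
            return new byte[][] {schema, data};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] schema = "f0:BIGINT,f1:DOUBLE".getBytes(StandardCharsets.UTF_8);
        byte[] batch = encodeBatch(schema, "row-data".getBytes(StandardCharsets.UTF_8));
        byte[][] decoded = decodeBatch(batch);
        // The schema travels with the batch, so no stream-level state is needed.
        System.out.println(new String(decoded[0], StandardCharsets.UTF_8));
    }
}
```

The cost of this design is the repeated schema bytes per batch, which is the trade-off the PRs accept in exchange for satisfying Beam's stateless-coder requirement.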
[GitHub] [flink] flinkbot edited a comment on pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API
flinkbot edited a comment on pull request #14877: URL: https://github.com/apache/flink/pull/14877#issuecomment-773826928 ## CI report: * eb7d6cb5de1138ee76b840b2b5a5dd850a66b407 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12977) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on pull request #14844: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch
flinkbot edited a comment on pull request #14844: URL: https://github.com/apache/flink/pull/14844#issuecomment-772295878 ## CI report: * 54ed8511139561657f10d84dc25b189acbbf156c Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12973) * e3c6f8e96db156989b8b749cb3f0431105997b7f UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #14834: [FLINK-21234][build system] Adding timeout to all tasks
flinkbot edited a comment on pull request #14834: URL: https://github.com/apache/flink/pull/14834#issuecomment-771607546 ## CI report: * 3c55d1b87076148f095c08a55319f1b898dc932d UNKNOWN * 5c8ac11faf9d6fbcad7550bec8abf6a6343347c3 UNKNOWN * 4e50c2f87a73c3a79c178a580c84c31924a72c4a Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12965) * 35e6f16b00a6ac645dc140b7289272074a82fd19 UNKNOWN
[GitHub] [flink] flinkbot commented on pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API
flinkbot commented on pull request #14877: URL: https://github.com/apache/flink/pull/14877#issuecomment-773826928 ## CI report: * eb7d6cb5de1138ee76b840b2b5a5dd850a66b407 UNKNOWN
[GitHub] [flink] zhuzhurk commented on a change in pull request #14862: [FLINK-21273][coordination] Remove unused ExecutionVertexSchedulingRe…
zhuzhurk commented on a change in pull request #14862: URL: https://github.com/apache/flink/pull/14862#discussion_r570744967 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SlotSharingExecutionSlotAllocator.java ## @@ -115,15 +115,11 @@ * returns the results. * * - * @param executionVertexSchedulingRequirements the requirements for scheduling the executions. + * @param executionVertexIds the requirements for scheduling the executions. Review comment: the comment is outdated ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/ExecutionSlotAllocator.java ## @@ -29,10 +29,10 @@ /** * Allocate slots for the given executions. * - * @param executionVertexSchedulingRequirements The requirements for scheduling the executions. + * @param executionVertexIds The requirements for scheduling the executions. Review comment: "The requirements for scheduling the executions" -> "Execution vertices to allocate slots for"
[jira] [Commented] (FLINK-21274) At per-job mode, if the HDFS write is slow (about 5 seconds), the flink job archive file upload fails
[ https://issues.apache.org/jira/browse/FLINK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279381#comment-17279381 ] Yang Wang commented on FLINK-21274: --- [~wjc920] I am afraid this ticket is related to FLINK-21008. The root cause of both is that {{ClusterEntrypoint#shutDownAsync}} is not fully executed, which leads to residual HA-related data or archiving failures for completed jobs. However, I cannot agree with your fix. Simply calling {{getTerminationFuture().get()}} will block the execution of {{runClusterEntrypoint}}, and it still cannot completely resolve the issue if we receive the SIGTERM very quickly. So I prefer the solution posted in FLINK-21008: triggering a {{ClusterEntrypoint.closeAsync()}} when we see a SIGTERM and then waiting on its completion. WDYT? > At per-job mode,if the HDFS write is slow(about 5 seconds), the flink job > archive file will upload fails > > > Key: FLINK-21274 > URL: https://issues.apache.org/jira/browse/FLINK-21274 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.10.1 >Reporter: Jichao Wang >Priority: Critical > Attachments: 1.png, 2.png, > application_1612404624605_0010-JobManager.log > > > This is a partial configuration of my Flink History service (flink-conf.yaml), > and this is also the configuration of my Flink client. > {code:java} > jobmanager.archive.fs.dir: hdfs://hdfsHACluster/flink/completed-jobs/ > historyserver.archive.fs.dir: hdfs://hdfsHACluster/flink/completed-jobs/ > {code} > I used {color:#0747a6}flink run -m yarn-cluster > /cloud/service/flink/examples/batch/WordCount.jar{color} to submit a > WordCount task to the Yarn cluster. Under normal circumstances, after the > task is completed, the flink job execution information will be archived to > HDFS, and then the JobManager process will exit. 
However, when this archiving > process takes a long time (maybe the HDFS write speed is slow), the task > archive file upload fails. > The specific reproduction method is as follows: > Modify the > {color:#0747a6}org.apache.flink.runtime.history.FsJobArchivist#archiveJob{color} > method to wait 5 seconds before actually writing to HDFS (simulating a slow > write speed scenario). > {code:java} > public static Path archiveJob(Path rootPath, JobID jobId, > Collection jsonToArchive) > throws IOException { > try { > FileSystem fs = rootPath.getFileSystem(); > Path path = new Path(rootPath, jobId.toString()); > OutputStream out = fs.create(path, FileSystem.WriteMode.NO_OVERWRITE); > try { > LOG.info("===Wait 5 seconds.."); > Thread.sleep(5000); > } catch (InterruptedException e) { > e.printStackTrace(); > } > try (JsonGenerator gen = jacksonFactory.createGenerator(out, > JsonEncoding.UTF8)) { > ... // Part of the code is omitted here > } catch (Exception e) { > fs.delete(path, false); > throw e; > } > LOG.info("Job {} has been archived at {}.", jobId, path); > return path; > } catch (IOException e) { > LOG.error("Failed to archive job.", e); > throw e; > } > } > {code} > After I make the above changes to the code, I cannot find the corresponding > task on Flink's HistoryServer(Refer to Figure 1.png and Figure 2.png). > Then I went to Yarn to browse the JobManager log (see attachment > application_1612404624605_0010-JobManager.log for log details), and found > that the following logs are missing in the task log: > {code:java} > INFO entrypoint.ClusterEntrypoint: Terminating cluster entrypoint process > YarnJobClusterEntrypoint with exit code 0.{code} > Usually, if the task exits normally, a similar log will be printed before > executing {color:#0747a6}System.exit(returnCode){color}. 
> If no Exception information is found in the JobManager log but the above > situation occurs, it indicates that the JobManager ran to a certain point > at which no user thread remained in the JobManager process, which caused > the program to exit without completing the normal shutdown process. > Eventually I found out that multiple services (for example: ioExecutor, > metricRegistry, commonRpcService) are shut down asynchronously in > {color:#0747a6}org.apache.flink.runtime.entrypoint.ClusterEntrypoint#stopClusterServices{color}, > and multiple services are shut down in the shutdown() method of > metricRegistry (for example: executor). These exit actions are executed > asynchronously and in parallel. If ioExecutor or executor exits last, it will > cause the above problem. The root cause is that th
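The shutdown-ordering problem described above (asynchronous service shutdowns racing the JVM exit) and the remedy Yang Wang prefers (trigger {{closeAsync()}} and wait on its completion) can be sketched with plain java.util.concurrent primitives. This is a schematic stand-in, not Flink's actual ClusterEntrypoint code; the class and method names below are invented for the sketch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: all asynchronous cleanups (archive upload, executor shutdowns, ...)
// are combined into one future, and the caller joins on it before letting the
// JVM exit. Without that join, a slow cleanup (e.g. a slow HDFS write) can be
// cut off when the last non-daemon thread finishes.
public class GracefulShutdown {
    private final ExecutorService ioExecutor = Executors.newSingleThreadExecutor();
    private volatile boolean archived = false;

    CompletableFuture<Void> closeAsync() {
        CompletableFuture<Void> archiveFuture = CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(100); // stands in for the slow HDFS archive write
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
            archived = true;
        }, ioExecutor);
        // Combine every async cleanup; the result completes only when all are done.
        return CompletableFuture.allOf(archiveFuture)
                .whenComplete((v, t) -> ioExecutor.shutdown());
    }

    boolean isArchived() {
        return archived;
    }

    public static void main(String[] args) {
        GracefulShutdown entrypoint = new GracefulShutdown();
        // A SIGTERM handler / shutdown hook would perform the same join
        // before the process is allowed to exit.
        entrypoint.closeAsync().join();
        System.out.println("archived=" + entrypoint.isArchived());
    }
}
```

The key point of the FLINK-21008-style fix is the join: the termination path waits for the combined future instead of letting each service race to completion on its own.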
[GitHub] [flink] flinkbot edited a comment on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration
flinkbot edited a comment on pull request #14629: URL: https://github.com/apache/flink/pull/14629#issuecomment-759361463 ## CI report: * 5426ab123aef03d4710c0fea1237fa014684f372 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12918) * 145f891faab2190d83a5ea35ac86a025605b43bd Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12976)
[GitHub] [flink] flinkbot commented on pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API
flinkbot commented on pull request #14877: URL: https://github.com/apache/flink/pull/14877#issuecomment-773816561 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit eb7d6cb5de1138ee76b840b2b5a5dd850a66b407 (Fri Feb 05 06:12:30 UTC 2021) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-21289) Application mode on kubernetes deployment support run PackagedProgram.main with pipeline.classpaths
[ https://issues.apache.org/jira/browse/FLINK-21289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhou Parker updated FLINK-21289: Description: (filed in Chinese together with the reporter's own English translation; the translation is kept here, lightly edited) I'm trying to submit a Flink job to a Kubernetes cluster in application mode, but a ClassNotFoundException is thrown when a dependency class is not shipped in something like local:///opt/flink/usrlib/.jar. This works on YARN because we use the -C http:// command-line style, so dependency classes can be loaded by the URLClassLoader. But I found that this does not work on Kubernetes: when submitting to a Kubernetes cluster, -C is only shipped as the "pipeline.classpaths" entry in configmap/flink-conf.yaml. Our main function executes, but loading external dependency classes fails with ClassNotFoundException. After reading the source code, *I found that the class loader launching the "main" entry of the user code does not add the pipeline.classpaths entries to its candidate URLs*, which causes the class-loading failure. From the source code I also learned that shipping the dependency jars in the usrlib dir (the default userClassPath) solves the problem, but that does not work for us: we prefer not to ship dependencies in the image at compile time, since they are only known dynamically at runtime. I propose improving the process so that the class loader considers pipeline.classpaths as well as usrlib; this is a quite small change. I have tested the solution and it works quite well. was: (the previous revision of the same description, also in Chinese with the reporter's English translation; it differs from the text above only in minor wording) > Application mode on kubernetes deployment support run PackagedProgram.main > with pipeline.classpaths > --- > > Key: FLINK-21289 > URL: https://issues.apache.org/jira/browse/FLINK-21289 > Project: Flink > Issue Type: Improvement > Components: Client / Job Submission, Deployment / Kubernetes >Affects Versions: 1.11.2, 1.12.1 > Environment: flink: 1.11 > kubernetes: 1.15 > >Reporter: Zhou Parker >Priority: Minor > Attachments: 0001-IMP.patch > > > (quoted issue description, originally in Chinese; the English translation is given above) > > English translation: > I'm trying to submit flink job to kubernetes cluster with application mode, > but throw Cla
[jira] [Updated] (FLINK-21289) Application mode on kubernetes deployment support run PackagedProgram.main with pipeline.classpaths
[ https://issues.apache.org/jira/browse/FLINK-21289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhou Parker updated FLINK-21289: Description: (filed in Chinese together with the reporter's own English translation; the translation is kept here, lightly edited) I'm trying to submit a Flink job to a Kubernetes cluster in application mode, but a ClassNotFoundException is thrown when a dependency class is not shipped in something like local:///opt/flink/usrlib/.jar. This works on YARN because we use the -C http:// command-line style, so dependency classes can be loaded by the URLClassLoader. But I found that this does not work on Kubernetes: when submitting to a Kubernetes cluster, -C is only shipped as the "pipeline.classpaths" entry in configmap/flink-conf.yaml. After reading the source code, *I found that the class loader launching the "main" entry of the user code does not add the pipeline.classpaths entries to its candidate URLs*. From the source code I also learned that shipping the dependency jars in the usrlib dir solves the problem, but that does not work for us: we prefer not to ship dependencies in the image at compile time, since they are only known dynamically at runtime. I propose improving the process so that the class loader considers pipeline.classpaths as well as usrlib; this is a quite small change. I have tested the solution and it works quite well. was: (the previous revision of the same description, also in Chinese with the reporter's English translation; it differs from the text above only in minor wording) > Application mode on kubernetes deployment support run PackagedProgram.main > with pipeline.classpaths > --- > > Key: FLINK-21289 > URL: https://issues.apache.org/jira/browse/FLINK-21289 > Project: Flink > Issue Type: Improvement > Components: Client / Job Submission, Deployment / Kubernetes >Affects Versions: 1.11.2, 1.12.1 > Environment: flink: 1.11 > kubernetes: 1.15 > >Reporter: Zhou Parker >Priority: Minor > Attachments: 0001-IMP.patch > > > (quoted issue description, originally in Chinese; the English translation is given above) > > English translation: > I'm trying to submit flink job to kubernetes cluster with application mode, > but throw ClassNotFoundException when dependency class is not shipped in
[jira] [Created] (FLINK-21289) Application mode on kubernetes deployment support run PackagedProgram.main with pipeline.classpaths
Zhou Parker created FLINK-21289: --- Summary: Application mode on kubernetes deployment support run PackagedProgram.main with pipeline.classpaths Key: FLINK-21289 URL: https://issues.apache.org/jira/browse/FLINK-21289 Project: Flink Issue Type: Improvement Components: Client / Job Submission, Deployment / Kubernetes Affects Versions: 1.12.1, 1.11.2 Environment: flink: 1.11 kubernetes: 1.15 Reporter: Zhou Parker Attachments: 0001-IMP.patch (description originally filed in Chinese; the reporter's English translation follows, lightly edited) I'm trying to submit a Flink job to a Kubernetes cluster in application mode, but a ClassNotFoundException is thrown when a dependency class is not shipped in something like local:///opt/flink/usrlib/.jar. This works on YARN because we use the -C http:// command-line style, so dependency classes can be loaded by the URLClassLoader. But I found that this does not work on Kubernetes: when submitting to a Kubernetes cluster, -C is only shipped as the "pipeline.classpaths" entry in configmap/flink-conf.yaml. After reading the source code, *I found that the class loader launching the "main" entry of the user code does not add the pipeline.classpaths entries to its candidate URLs*. From the source code I also learned that shipping the dependency jars in the usrlib dir solves the problem, but that does not work for us: we prefer not to ship dependencies in the image at compile time, since they are only known dynamically at runtime. I propose improving the process so that the class loader considers pipeline.classpaths as well as usrlib; this is a quite small change. 
I test the solution and it works quite well -- This message was sent by Atlassian Jira (v8.3.4#803005)
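The fix the reporter proposes, adding the pipeline.classpaths entries to the user-code class loader's candidate URLs, can be sketched with a plain URLClassLoader. This is a hypothetical illustration of the idea, not Flink's actual class-loader construction; the class name, method, and jar paths below are made up for the sketch.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposed behavior: when building the class loader that runs
// PackagedProgram.main, include the URLs from "pipeline.classpaths" alongside
// the jars found under usrlib, so dependencies that are not baked into the
// image at build time can still be resolved at runtime.
public class UserCodeClassLoaderFactory {

    static URLClassLoader create(List<URL> usrlibJars, List<URL> pipelineClasspaths) {
        List<URL> all = new ArrayList<>(usrlibJars);
        all.addAll(pipelineClasspaths); // the entries the reporter found were missing
        return new URLClassLoader(
                all.toArray(new URL[0]),
                UserCodeClassLoaderFactory.class.getClassLoader());
    }

    public static void main(String[] args) throws Exception {
        List<URL> usrlib = new ArrayList<>();
        usrlib.add(new URL("file:///opt/flink/usrlib/job.jar")); // hypothetical path
        List<URL> extra = new ArrayList<>();
        extra.add(new URL("http://repo.example.com/dep.jar")); // hypothetical -C entry
        URLClassLoader loader = create(usrlib, extra);
        // Both the usrlib jar and the pipeline.classpaths entry are now candidates.
        System.out.println(loader.getURLs().length);
    }
}
```

Note that constructing the loader only registers the URLs; whether a remote http:// jar is actually reachable is still a runtime concern, which is presumably why -C entries work on YARN where the same URLClassLoader mechanism is already applied.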
[GitHub] [flink] LadyForest commented on pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API
LadyForest commented on pull request #14877: URL: https://github.com/apache/flink/pull/14877#issuecomment-773813622 @flinkbot run azure
[jira] [Updated] (FLINK-21225) OverConvertRule does not consider distinct
[ https://issues.apache.org/jira/browse/FLINK-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-21225: --- Labels: pull-request-available (was: ) > OverConvertRule does not consider distinct > -- > > Key: FLINK-21225 > URL: https://issues.apache.org/jira/browse/FLINK-21225 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Reporter: Timo Walther >Assignee: Jane Chan >Priority: Major > Labels: pull-request-available > > We don't support OVER window distinct aggregates in Table API. Even though > this is explicitly documented: > https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/tableApi.html#aggregations > {code} > // Distinct aggregation on over window > Table result = orders > .window(Over > .partitionBy($("a")) > .orderBy($("rowtime")) > .preceding(UNBOUNDED_RANGE) > .as("w")) > .select( > $("a"), $("b").avg().distinct().over($("w")), > $("b").max().over($("w")), > $("b").min().over($("w")) > ); > {code} > The distinct flag is set to false in {{OverConvertRule}}. > See also > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Unknown-call-expression-avg-amount-when-use-distinct-in-Flink-Thanks-td40905.html -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] LadyForest opened a new pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API
LadyForest opened a new pull request #14877: URL: https://github.com/apache/flink/pull/14877 ## Contribution Checklist ## What is the purpose of the change * This pull request tries to support OVER window distinct aggregates in `OverConvertRule`. Currently, `OverConvertRule` always sets the distinct flag to false and cannot create a rex call for the child expression of an `agg` expression like `distinct(avg/count/sum(field))`, which causes `ExpressionConverter#visit` to throw a `RuntimeException` saying "`Unknown call expression: avg(field)`". ## Brief change log - *The changes applied to `OverConvertRule` add a flag `isDistinct` by checking the function definition of `DISTINCT`, and use the inner agg expression to generate the RexNode if `isDistinct` is true.* ## Verifying this change This change added tests and can be verified as follows: - *`OverAggregateTest#testRowTimeBoundedDistinctWithPartitionedRangeOver`, `OverAggregateTest#testRowTimeUnboundedDistinctWithPartitionedRangeOver`, `OverAggregateTest#testRowTimeBoundedDistinctWithPartitionedRowsOver` and `OverAggregateTest#testRowTimeUnboundedDistinctWithPartitionedRowsOver` verify the optimized plan.* - *`OverAggregateITCase#testRowTimeBoundedDistinctWithPartitionedRangeOver`, `OverAggregateITCase#testRowTimeUnboundedDistinctWithPartitionedRangeOver`, `OverAggregateITCase#testRowTimeBoundedDistinctWithPartitionedRowsOver` and `OverAggregateITCase#testRowTimeUnboundedDistinctWithPartitionedRowsOver` verify the execution result.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, 
Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
[GitHub] [flink] xintongsong commented on a change in pull request #14869: [FLINK-21047][coordination] Fix the incorrect registered/free resourc…
xintongsong commented on a change in pull request #14869: URL: https://github.com/apache/flink/pull/14869#discussion_r570737663 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/TaskExecutorManager.java ## @@ -402,27 +407,31 @@ private void releaseIdleTaskExecutor(InstanceID timedOutTaskManagerId) { // - public ResourceProfile getTotalRegisteredResources() { -return getResourceFromNumSlots(getNumberRegisteredSlots()); +return taskManagerRegistrations.values().stream() +.map(TaskManagerRegistration::getTotalResource) +.reduce(ResourceProfile.ZERO, ResourceProfile::merge); } public ResourceProfile getTotalRegisteredResourcesOf(InstanceID instanceID) { -return getResourceFromNumSlots(getNumberRegisteredSlotsOf(instanceID)); +return Optional.ofNullable(taskManagerRegistrations.get(instanceID)) +.map(TaskManagerRegistration::getTotalResource) +.orElse(ResourceProfile.ZERO); } public ResourceProfile getTotalFreeResources() { -return getResourceFromNumSlots(getNumberFreeSlots()); +return taskManagerRegistrations.keySet().stream() +.map(this::getTotalFreeResourcesOf) +.reduce(ResourceProfile.ZERO, ResourceProfile::merge); Review comment: Same here. 
## File path: flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/SlotManagerImpl.java ## @@ -230,30 +230,34 @@ public int getNumberFreeSlotsOf(InstanceID instanceId) { @Override public ResourceProfile getRegisteredResource() { -return getResourceFromNumSlots(getNumberRegisteredSlots()); +return taskManagerRegistrations.values().stream() +.map(TaskManagerRegistration::getTotalResource) +.reduce(ResourceProfile.ZERO, ResourceProfile::merge); } @Override public ResourceProfile getRegisteredResourceOf(InstanceID instanceID) { -return getResourceFromNumSlots(getNumberRegisteredSlotsOf(instanceID)); +return Optional.ofNullable(taskManagerRegistrations.get(instanceID)) +.map(TaskManagerRegistration::getTotalResource) +.orElse(ResourceProfile.ZERO); } @Override public ResourceProfile getFreeResource() { -return getResourceFromNumSlots(getNumberFreeSlots()); +return taskManagerRegistrations.keySet().stream() +.map(this::getFreeResourceOf) +.reduce(ResourceProfile.ZERO, ResourceProfile::merge); Review comment: Better to iterate on the entry set than to iterate on the key set and get the entry with the key. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
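To illustrate the review comment above, here is a small self-contained sketch (not Flink code; `Profile` is a hypothetical stand-in for `ResourceProfile`): streaming over the key set and looking each value up again costs one extra hash lookup per key, while streaming over the entry set (or values) reads each value directly.

```java
import java.util.HashMap;
import java.util.Map;

public class ResourceSum {
    // Hypothetical stand-in for Flink's ResourceProfile, reduced to one field.
    static final class Profile {
        static final Profile ZERO = new Profile(0);
        final int slots;
        Profile(int slots) { this.slots = slots; }
        Profile merge(Profile other) { return new Profile(slots + other.slots); }
    }

    // Streaming over the key set forces an extra map lookup for every key.
    static Profile sumViaKeySet(Map<String, Profile> registrations) {
        return registrations.keySet().stream()
                .map(registrations::get) // second hash lookup per key
                .reduce(Profile.ZERO, Profile::merge);
    }

    // Streaming over the entry set (or values) reads each value directly.
    static Profile sumViaEntrySet(Map<String, Profile> registrations) {
        return registrations.entrySet().stream()
                .map(Map.Entry::getValue)
                .reduce(Profile.ZERO, Profile::merge);
    }

    public static void main(String[] args) {
        Map<String, Profile> registrations = new HashMap<>();
        registrations.put("tm-1", new Profile(4));
        registrations.put("tm-2", new Profile(8));
        System.out.println(sumViaEntrySet(registrations).slots); // prints 12
        System.out.println(sumViaKeySet(registrations).slots);   // prints 12
    }
}
```

Both variants compute the same result; the entry-set form simply avoids the redundant `get` per key.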
[GitHub] [flink] curcur commented on pull request #14838: [WIP][state] Add StateChangelog API
curcur commented on pull request #14838: URL: https://github.com/apache/flink/pull/14838#issuecomment-773810436 > 1. > But this is where/the name I planned to put the WrappedStateBackend. > > In my opinion, both backend and changelog should reside in the same module, at least for now. I have no objection to putting them together. > > 1. I fixed the `StateChangeFormat` issue Thanks for fixing this! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14872: [FLINK-21162] [Blink Planner]use "IF(col = '' OR col IS NULL, 'a', 'b')",when co…
flinkbot edited a comment on pull request #14872: URL: https://github.com/apache/flink/pull/14872#issuecomment-773303946 ## CI report: * f6390443428d1f56a41251ffa412bf8eab820d86 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12967) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14838: [WIP][state] Add StateChangelog API
flinkbot edited a comment on pull request #14838: URL: https://github.com/apache/flink/pull/14838#issuecomment-772060058 ## CI report: * c1c536632abeeef101cd3a89edcdf63f1e950eff Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12966) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration
flinkbot edited a comment on pull request #14629: URL: https://github.com/apache/flink/pull/14629#issuecomment-759361463 ## CI report: * 5426ab123aef03d4710c0fea1237fa014684f372 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12918) * 145f891faab2190d83a5ea35ac86a025605b43bd UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-20563) Support built-in functions for Hive versions prior to 1.2.0
[ https://issues.apache.org/jira/browse/FLINK-20563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated FLINK-20563: --- Fix Version/s: 1.13.0 > Support built-in functions for Hive versions prior to 1.2.0 > --- > > Key: FLINK-20563 > URL: https://issues.apache.org/jira/browse/FLINK-20563 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive >Reporter: Rui Li >Priority: Major > Labels: pull-request-available > Fix For: 1.13.0 > > > Currently Hive built-in functions are supported only for Hive-1.2.0 and > later. We should investigate how to lift this limitation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-17061) Unset process/flink memory size from configuration once dynamic worker resource is activated.
[ https://issues.apache.org/jira/browse/FLINK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song reassigned FLINK-17061: Assignee: Xintong Song > Unset process/flink memory size from configuration once dynamic worker > resource is activated. > - > > Key: FLINK-17061 > URL: https://issues.apache.org/jira/browse/FLINK-17061 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration, Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: Xintong Song >Assignee: Xintong Song >Priority: Major > > With FLINK-14106, memory of a TaskExecutor is decided in two steps on active > resource managers. > - {{SlotManager}} decides {{WorkerResourceSpec}}, including memory used by > Flink tasks: task heap, task off-heap, network and managed memory. > - {{ResourceManager}} derives {{TaskExecutorProcessSpec}} from > {{WorkerResourceSpec}} and the configuration, deciding sizes of memory used > by Flink framework and JVM: framework heap, framework off-heap, jvm metaspace > and jvm overhead. > This works fine for now, because both {{WorkerResourceSpec}} and > {{TaskExecutorProcessSpec}} are derived from the same configurations. > However, it might cause problem if later we have new {{SlotManager}} > implementations that decides {{WorkerResourceSpec}} dynamically. In such > cases, the process/flink sizes in configuration should be ignored, or it may > easily lead to configuration conflicts. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-21288) Support redundant task managers for fine grained resource management
Xintong Song created FLINK-21288: Summary: Support redundant task managers for fine grained resource management Key: FLINK-21288 URL: https://issues.apache.org/jira/browse/FLINK-21288 Project: Flink Issue Type: Sub-task Reporter: Xintong Song -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-20970) DECIMAL(10, 0) can not be GROUP BY key.
[ https://issues.apache.org/jira/browse/FLINK-20970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu closed FLINK-20970. --- Resolution: Not A Problem As we only maintain 3 major releases, and it has been fixed in 1.11 and 1.12, I will close this one. > DECIMAL(10, 0) can not be GROUP BY key. > --- > > Key: FLINK-20970 > URL: https://issues.apache.org/jira/browse/FLINK-20970 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.10.1 >Reporter: Wong Mulan >Priority: Major > Attachments: image-2021-01-14-17-06-28-648.png > > > If a value whose type is not DECIMAL(38, 18) is the GROUP BY key, the result will > be -1. > So, can only DECIMAL(38, 18) be a GROUP BY key? > Whatever the value is, it will return -1. > !image-2021-01-14-17-06-28-648.png|thumbnail! -- This message was sent by Atlassian Jira (v8.3.4#803005)
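As a side note on decimal grouping keys in general (not necessarily the cause of FLINK-20970): in plain Java, `BigDecimal#equals` takes the scale into account, so numerically equal values with different precision/scale can land in different hash-map groups unless the engine normalizes them to one decimal type first.

```java
import java.math.BigDecimal;

public class DecimalKeyPitfall {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("10");    // scale 0, e.g. DECIMAL(10, 0)
        BigDecimal b = new BigDecimal("10.00"); // scale 2

        // Numerically equal...
        assert a.compareTo(b) == 0;
        // ...but not equal as hash keys, because equals() also compares scale.
        assert !a.equals(b);

        // Normalizing the scale (as an engine does when it casts all keys to a
        // single decimal type) makes the values group together again.
        assert a.setScale(2).equals(b);
    }
}
```

This is why SQL engines cast grouping keys to one common decimal type before hashing.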
[GitHub] [flink] wangyang0918 commented on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration
wangyang0918 commented on pull request #14629: URL: https://github.com/apache/flink/pull/14629#issuecomment-773801799 @MiLk Thanks for your feedback. I have added a new commit for the documentation. Now it is ready for review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks
flinkbot edited a comment on pull request #14740: URL: https://github.com/apache/flink/pull/14740#issuecomment-766340750 ## CI report: * c7e6b28b249f85cf52740d5201a769e0982a60aa UNKNOWN * bebd298009b12a9d5ac6518902f5534f8e00ff32 UNKNOWN * 743d1592db1b1f62ef6e2b208517438e2fab3a66 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12849) * 0a5a79498ab93134eccbe025489ede9aae233392 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12975) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] xintongsong commented on a change in pull request #14865: [FLINK-21270] Generate the slot request respect to the resource specification of SlotSharingGroup if present
xintongsong commented on a change in pull request #14865: URL: https://github.com/apache/flink/pull/14865#discussion_r570703453 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/ExecutionVertexSchedulingRequirementsMapper.java ## @@ -52,17 +52,22 @@ public static ExecutionVertexSchedulingRequirements from( /** * Get resource profile of the physical slot to allocate a logical slot in for the given vertex. * If the vertex is in a slot sharing group, the physical slot resource profile should be the - * resource profile of the slot sharing group. Otherwise it should be the resource profile of - * the vertex itself since the physical slot would be used by this vertex only in this case. + * resource profile of the slot sharing group if present. Otherwise it should be the resource + * profile of the vertex itself since the physical slot would be used by this vertex only in + * this case. * * @return resource profile of the physical slot to allocate a logical slot for the given vertex */ public static ResourceProfile getPhysicalSlotResourceProfile( final ExecutionVertex executionVertex) { final SlotSharingGroup slotSharingGroup = executionVertex.getJobVertex().getSlotSharingGroup(); -return ResourceProfile.fromResourceSpec( -slotSharingGroup.getResourceSpec(), MemorySize.ZERO); +if (slotSharingGroup.getResourceProfile().equals(ResourceProfile.UNKNOWN)) { +return ResourceProfile.fromResourceSpec( +slotSharingGroup.getResourceSpec(), MemorySize.ZERO); Review comment: I wonder if `SlotSharingGroup#resourceSpec` can be removed. There are two code paths for aggregating operator resources into slot resources. - `SlotSharingGroup#resourceSpec` - `SlotSharingExecutionSlotAllocator#getPhysicalSlotResourceProfile` IIUC, after introducing deterministic slot sharing groups, we no longer need the first code path. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
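The branching discussed in the diff above (prefer a profile set directly on the slot sharing group, otherwise derive one from the aggregated resource spec) can be sketched with hypothetical stand-in types; the strings below merely stand in for Flink's `ResourceProfile` and `ResourceSpec`, and `fromResourceSpec` stands in for `ResourceProfile.fromResourceSpec(spec, MemorySize.ZERO)`.

```java
public class PhysicalSlotProfileSketch {
    // Hypothetical stand-in for ResourceProfile.UNKNOWN.
    static final String UNKNOWN_PROFILE = "UNKNOWN";

    // Stand-in for ResourceProfile.fromResourceSpec(spec, MemorySize.ZERO).
    static String fromResourceSpec(String resourceSpec) {
        return "derived-from:" + resourceSpec;
    }

    // Mirrors the branching in the diff: use the explicitly-set group profile
    // when present; otherwise fall back to deriving one from the ResourceSpec.
    static String physicalSlotProfile(String groupProfile, String groupSpec) {
        if (UNKNOWN_PROFILE.equals(groupProfile)) {
            return fromResourceSpec(groupSpec);
        }
        return groupProfile;
    }

    public static void main(String[] args) {
        System.out.println(physicalSlotProfile(UNKNOWN_PROFILE, "2-slots"));
        System.out.println(physicalSlotProfile("explicit-profile", "2-slots"));
    }
}
```

The review question then amounts to whether the fallback branch (and the spec it reads) is still reachable once slot sharing groups carry their own profile.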
[GitHub] [flink] leonardBang commented on a change in pull request #14684: [FLINK-20460][Connector-HBase] Support async lookup for HBase connector
leonardBang commented on a change in pull request #14684: URL: https://github.com/apache/flink/pull/14684#discussion_r570712863 ## File path: flink-connectors/flink-connector-hbase-2.2/src/main/java/org/apache/flink/connector/hbase2/source/HBaseDynamicTableSource.java ## @@ -35,17 +42,45 @@ public HBaseDynamicTableSource( Configuration conf, String tableName, HBaseTableSchema hbaseSchema, -String nullStringLiteral) { -super(conf, tableName, hbaseSchema, nullStringLiteral); +String nullStringLiteral, +HBaseLookupOptions lookupOptions) { +super(conf, tableName, hbaseSchema, nullStringLiteral, lookupOptions); +} + +@Override +public LookupRuntimeProvider getLookupRuntimeProvider(LookupContext context) { +checkArgument(context.getKeys().length == 1 && context.getKeys()[0].length == 1, +"Currently, HBase table can only be lookup by single rowkey."); +checkArgument( +hbaseSchema.getRowKeyName().isPresent(), +"HBase schema must have a row key when used in lookup mode."); +checkArgument( +hbaseSchema +.convertsToTableSchema() +.getTableColumn(context.getKeys()[0][0]) +.filter(f -> f.getName().equals(hbaseSchema.getRowKeyName().get())) +.isPresent(), +"Currently, HBase table only supports lookup by rowkey field."); +boolean isAsync = lookupOptions.getLookupAsync(); +if (isAsync){ Review comment: ```suggestion if (lookupOptions.getLookupAsync()){ ``` ## File path: flink-connectors/flink-connector-hbase-2.2/src/main/java/org/apache/flink/connector/hbase2/source/HBaseRowDataAsyncLookupFunction.java ## @@ -0,0 +1,232 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.connector.hbase2.source; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.annotation.VisibleForTesting; +import org.apache.flink.connector.hbase.options.HBaseLookupOptions; +import org.apache.flink.connector.hbase.util.HBaseConfigurationUtil; +import org.apache.flink.connector.hbase.util.HBaseSerde; +import org.apache.flink.connector.hbase.util.HBaseTableSchema; +import org.apache.flink.metrics.Gauge; +import org.apache.flink.runtime.concurrent.Executors; +import org.apache.flink.table.data.GenericRowData; +import org.apache.flink.table.data.RowData; +import org.apache.flink.table.functions.AsyncTableFunction; +import org.apache.flink.table.functions.FunctionContext; +import org.apache.flink.util.StringUtils; + +import org.apache.flink.shaded.guava18.com.google.common.cache.Cache; +import org.apache.flink.shaded.guava18.com.google.common.cache.CacheBuilder; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.hbase.HConstants; +import org.apache.hadoop.hbase.TableName; +import org.apache.hadoop.hbase.TableNotFoundException; +import org.apache.hadoop.hbase.client.AsyncConnection; +import org.apache.hadoop.hbase.client.AsyncTable; +import org.apache.hadoop.hbase.client.ConnectionFactory; +import org.apache.hadoop.hbase.client.Get; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.client.ScanResultConsumer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.Collection; +import 
java.util.Collections; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.TimeUnit; + +/** + * The HBaseRowDataAsyncLookupFunction is a standard user-defined table function, it can be used in + * tableAPI and also useful for temporal table join plan in SQL. It looks up the result as {@link + * RowData}. + */ +@Internal +public class HBaseRowDataAsyncLookupFunction extends AsyncTableFunction { + +private static final Logger LOG = LoggerFactory.getLogger(HBaseRowDataAsyncLookupFunction.class); + + Review comment: redundant blank lines ## File path: flink-connectors/flink-connector-hbase-base/src/main/ja
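The (truncated) class under review above implements asynchronous HBase lookups with a cache. The general shape of such an async lookup function — check a cache first, otherwise complete a future from a thread pool and populate the cache — can be sketched with JDK types only; `AsyncLookupSketch` and its in-memory `table` are hypothetical stand-ins, not Flink or HBase APIs.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncLookupSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Hypothetical backing store standing in for an HBase table.
    private final Map<String, String> table;

    AsyncLookupSketch(Map<String, String> table) {
        this.table = table;
    }

    CompletableFuture<String> eval(String rowKey) {
        String cached = cache.get(rowKey);
        if (cached != null) {
            // Cache hit: complete immediately on the caller's thread.
            return CompletableFuture.completedFuture(cached);
        }
        // Cache miss: perform the (possibly slow) lookup on the pool.
        return CompletableFuture.supplyAsync(() -> {
            String value = table.getOrDefault(rowKey, "");
            cache.put(rowKey, value);
            return value;
        }, pool);
    }

    void close() {
        pool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        AsyncLookupSketch fn = new AsyncLookupSketch(Map.of("row-1", "v1"));
        System.out.println(fn.eval("row-1").get()); // cold lookup
        System.out.println(fn.eval("row-1").get()); // served from cache
        fn.close();
    }
}
```

The real function additionally bounds the cache size and TTL (hence the Guava `CacheBuilder` imports in the diff) and retries on transient HBase errors.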
[GitHub] [flink] gaoyunhaii edited a comment on pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks
gaoyunhaii edited a comment on pull request #14740: URL: https://github.com/apache/flink/pull/14740#issuecomment-773782948 Hi Roman @rkhachatryan, many thanks for the review! I have updated the PR via https://github.com/apache/flink/pull/14740/commits/0a5a79498ab93134eccbe025489ede9aae233392 according to the comments. The current PR indeed does not cover the case where a task finishes concurrently while the JM tries to trigger it; [FLINK-21246](https://issues.apache.org/jira/browse/FLINK-21246) will address this issue. I also think that in this case the checkpoint would be declined with a reason that does not cause a job failure. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14844: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch
flinkbot edited a comment on pull request #14844: URL: https://github.com/apache/flink/pull/14844#issuecomment-772295878 ## CI report: * 54ed8511139561657f10d84dc25b189acbbf156c Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12973) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks
flinkbot edited a comment on pull request #14740: URL: https://github.com/apache/flink/pull/14740#issuecomment-766340750 ## CI report: * c7e6b28b249f85cf52740d5201a769e0982a60aa UNKNOWN * bebd298009b12a9d5ac6518902f5534f8e00ff32 UNKNOWN * 743d1592db1b1f62ef6e2b208517438e2fab3a66 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12849) * 0a5a79498ab93134eccbe025489ede9aae233392 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaoyunhaii commented on pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks
gaoyunhaii commented on pull request #14740: URL: https://github.com/apache/flink/pull/14740#issuecomment-773782948 Hi Roman @rkhachatryan, many thanks for the review! I have updated the PR via https://github.com/apache/flink/pull/14740/commits/0a5a79498ab93134eccbe025489ede9aae233392 according to the comments. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks
gaoyunhaii commented on a change in pull request #14740: URL: https://github.com/apache/flink/pull/14740#discussion_r570714613 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java ## @@ -611,6 +590,10 @@ public long getNumberOfRestarts() { return numberOfRestartsCounter.getCount(); } +public int getVerticesFinished() { Review comment: I also think `getFinishedVertices` would be more natural, but a slight concern here is that the underlying variable is named `verticesFinished`; should we keep this method as a getter for that variable? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks
gaoyunhaii commented on a change in pull request #14740: URL: https://github.com/apache/flink/pull/14740#discussion_r570713863 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointBriefCalculator.java ## @@ -0,0 +1,492 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.runtime.checkpoint; + +import org.apache.flink.annotation.VisibleForTesting; +import org.apache.flink.api.common.JobID; +import org.apache.flink.runtime.execution.ExecutionState; +import org.apache.flink.runtime.executiongraph.Execution; +import org.apache.flink.runtime.executiongraph.ExecutionAttemptID; +import org.apache.flink.runtime.executiongraph.ExecutionEdge; +import org.apache.flink.runtime.executiongraph.ExecutionJobVertex; +import org.apache.flink.runtime.executiongraph.ExecutionVertex; +import org.apache.flink.runtime.jobgraph.DistributionPattern; +import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID; +import org.apache.flink.runtime.jobgraph.JobEdge; +import org.apache.flink.runtime.jobgraph.JobVertexID; +import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collection; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.ListIterator; +import java.util.Map; +import java.util.Set; +import java.util.concurrent.CompletableFuture; +import java.util.stream.Collectors; +import java.util.stream.IntStream; + +import static org.apache.flink.util.Preconditions.checkNotNull; +import static org.apache.flink.util.Preconditions.checkState; + +/** Computes the tasks to trigger, wait or commit for each checkpoint. 
*/ +public class CheckpointBriefCalculator { +private static final Logger LOG = LoggerFactory.getLogger(CheckpointBriefCalculator.class); + +private final JobID jobId; + +private final CheckpointBriefCalculatorContext context; + +private final List jobVerticesInTopologyOrder = new ArrayList<>(); + +private final List allTasks = new ArrayList<>(); + +private final List sourceTasks = new ArrayList<>(); + +public CheckpointBriefCalculator( +JobID jobId, +CheckpointBriefCalculatorContext context, +Iterable jobVerticesInTopologyOrderIterable) { + +this.jobId = checkNotNull(jobId); +this.context = checkNotNull(context); + +checkNotNull(jobVerticesInTopologyOrderIterable); +jobVerticesInTopologyOrderIterable.forEach( +jobVertex -> { +jobVerticesInTopologyOrder.add(jobVertex); + allTasks.addAll(Arrays.asList(jobVertex.getTaskVertices())); + +if (jobVertex.getJobVertex().isInputVertex()) { + sourceTasks.addAll(Arrays.asList(jobVertex.getTaskVertices())); +} +}); +} + +public CompletableFuture calculateCheckpointBrief() { +CompletableFuture resultFuture = new CompletableFuture<>(); + +context.getMainExecutor() +.execute( +() -> { +try { +if (!isAllExecutionAttemptsAreInitiated()) { +throw new CheckpointException( + CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING); +} + +CheckpointBrief result; +if (!context.hasFinishedTasks()) { +result = calculateWithAllTasksRunning(); +} else { +result = calculateAfterTasksFinished(); +} + +if (!isAllExecutionsToTriggerStarted(result.getTasksToTrigger())) { +throw new CheckpointException( + CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING); +
[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks
gaoyunhaii commented on a change in pull request #14740: URL: https://github.com/apache/flink/pull/14740#discussion_r570712494 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java ## @@ -2132,23 +2123,41 @@ public boolean isForce() { } private void reportToStatsTracker( -PendingCheckpoint checkpoint, Map tasks) { +PendingCheckpoint checkpoint, +Map tasks, +List finishedTasks) { if (statsTracker == null) { return; } Map vertices = -tasks.values().stream() +Stream.concat( +tasks.values().stream(), + finishedTasks.stream().map(Execution::getVertex)) .map(ExecutionVertex::getJobVertex) .distinct() .collect( toMap( ExecutionJobVertex::getJobVertexId, ExecutionJobVertex::getParallelism)); -checkpoint.setStatsCallback( + +PendingCheckpointStats pendingCheckpointStats = statsTracker.reportPendingCheckpoint( checkpoint.getCheckpointID(), checkpoint.getCheckpointTimestamp(), checkpoint.getProps(), -vertices)); +vertices); +checkpoint.setStatsCallback(pendingCheckpointStats); + +reportFinishedTasks(pendingCheckpointStats, finishedTasks); +} + +private void reportFinishedTasks( +PendingCheckpointStats pendingCheckpointStats, List finishedTasks) { +long now = System.currentTimeMillis(); +finishedTasks.forEach( +execution -> +pendingCheckpointStats.reportSubtaskStats( Review comment: Yes, currently it reports 0 for the metrics of finished tasks. I think this is desirable: if we did not report these tasks, users could not easily tell which tasks were already finished when the checkpoint was triggered, and thus could not distinguish the finished tasks from tasks that indeed did not report a snapshot for some reason. We may also consider adding another flag to indicate whether a task was finished when triggering checkpoints, in a separate issue. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks
gaoyunhaii commented on a change in pull request #14740: URL: https://github.com/apache/flink/pull/14740#discussion_r570711043

## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointBriefCalculator.java

## @@ -0,0 +1,492 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.checkpoint;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.Execution;
+import org.apache.flink.runtime.executiongraph.ExecutionAttemptID;
+import org.apache.flink.runtime.executiongraph.ExecutionEdge;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.ExecutionVertex;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobEdge;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.ListIterator;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+import static org.apache.flink.util.Preconditions.checkState;
+
+/** Computes the tasks to trigger, wait or commit for each checkpoint. */
+public class CheckpointBriefCalculator {
+    private static final Logger LOG = LoggerFactory.getLogger(CheckpointBriefCalculator.class);
+
+    private final JobID jobId;
+
+    private final CheckpointBriefCalculatorContext context;
+
+    private final List<ExecutionJobVertex> jobVerticesInTopologyOrder = new ArrayList<>();
+
+    private final List<ExecutionVertex> allTasks = new ArrayList<>();
+
+    private final List<ExecutionVertex> sourceTasks = new ArrayList<>();
+
+    public CheckpointBriefCalculator(
+            JobID jobId,
+            CheckpointBriefCalculatorContext context,
+            Iterable<ExecutionJobVertex> jobVerticesInTopologyOrderIterable) {
+
+        this.jobId = checkNotNull(jobId);
+        this.context = checkNotNull(context);
+
+        checkNotNull(jobVerticesInTopologyOrderIterable);
+        jobVerticesInTopologyOrderIterable.forEach(
+                jobVertex -> {
+                    jobVerticesInTopologyOrder.add(jobVertex);
+                    allTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+
+                    if (jobVertex.getJobVertex().isInputVertex()) {
+                        sourceTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+                    }
+                });
+    }
+
+    public CompletableFuture<CheckpointBrief> calculateCheckpointBrief() {
+        CompletableFuture<CheckpointBrief> resultFuture = new CompletableFuture<>();
+
+        context.getMainExecutor()
+                .execute(
+                        () -> {
+                            try {
+                                if (!isAllExecutionAttemptsAreInitiated()) {
+                                    throw new CheckpointException(
+                                            CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING);
+                                }
+
+                                CheckpointBrief result;
+                                if (!context.hasFinishedTasks()) {
+                                    result = calculateWithAllTasksRunning();
+                                } else {
+                                    result = calculateAfterTasksFinished();
+                                }
+
+                                if (!isAllExecutionsToTriggerStarted(result.getTasksToTrigger())) {
+                                    throw new CheckpointException(
+                                            CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING);
+
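The quoted `calculateCheckpointBrief()` follows a common pattern: dispatch the calculation onto the coordinator's main executor and expose the outcome as a future that is completed with either a result or an exception. A minimal Python sketch of that dispatch-and-complete pattern (the names and the string results are illustrative, not Flink's API):

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Stand-in for context.getMainExecutor(): a single-threaded "main" executor.
main_executor = ThreadPoolExecutor(max_workers=1)

def calculate_checkpoint_brief(has_finished_tasks: bool) -> Future:
    """Dispatch the calculation to the main executor; return a future for the result."""
    result_future = Future()

    def run():
        try:
            # Mirrors the branch in the quoted code: a separate calculation
            # path is taken once some tasks have already finished.
            if has_finished_tasks:
                result = "calculated-after-tasks-finished"
            else:
                result = "calculated-with-all-tasks-running"
            result_future.set_result(result)
        except Exception as exc:  # in the real code: a CheckpointException
            result_future.set_exception(exc)

    main_executor.submit(run)
    return result_future

print(calculate_checkpoint_brief(False).result(timeout=5))
# → calculated-with-all-tasks-running
```

Completing the future with `set_exception` rather than raising keeps failure handling on the caller's side, which is why the quoted code wraps the whole body in a try block.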
[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks
gaoyunhaii commented on a change in pull request #14740: URL: https://github.com/apache/flink/pull/14740#discussion_r570704750

## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointBriefCalculator.java
[jira] [Created] (FLINK-21287) Failed to build flink source code
Lei Qu created FLINK-21287:
--
Summary: Failed to build flink source code
Key: FLINK-21287
URL: https://issues.apache.org/jira/browse/FLINK-21287
Project: Flink
Issue Type: Bug
Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.12.1
Reporter: Lei Qu

[ERROR] Failed to execute goal org.xolstice.maven.plugins:protobuf-maven-plugin:0.5.1:test-compile (default) on project flink-parquet_2.11: protoc did not exit cleanly. Review output for more information. -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <args> -rf :flink-parquet_2.11

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] leonardBang commented on pull request #14863: [FLINK-21203]Don’t collect -U&+U Row When they are equals In the LastRowFunction
leonardBang commented on pull request #14863: URL: https://github.com/apache/flink/pull/14863#issuecomment-773765646 > cc @leonardBang , could you help to review this? ok, I'll take a look This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14876: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`
flinkbot edited a comment on pull request #14876: URL: https://github.com/apache/flink/pull/14876#issuecomment-773757019 ## CI report: * 100fb1eeb7f94949efc85cc6921a5c653a56163d Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12974) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14844: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch
flinkbot edited a comment on pull request #14844: URL: https://github.com/apache/flink/pull/14844#issuecomment-772295878 ## CI report: * 4c5d3fcfdf5bb2a114964f5cd60fc6743fe331da Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12833) * 54ed8511139561657f10d84dc25b189acbbf156c Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12973) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14379: [FLINK-20563][hive] Support built-in functions for Hive versions prio…
flinkbot edited a comment on pull request #14379: URL: https://github.com/apache/flink/pull/14379#issuecomment-73009 ## CI report: * a413199ba00fdf4b41b09e7cf4bcf02bcd6da0ce Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10861) * 64460eaf888b0333b8aed626d48f66fd875997cf Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12972) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks
gaoyunhaii commented on a change in pull request #14740: URL: https://github.com/apache/flink/pull/14740#discussion_r570694404

## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointBriefCalculator.java
[jira] [Commented] (FLINK-11838) Create RecoverableWriter for GCS
[ https://issues.apache.org/jira/browse/FLINK-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279299#comment-17279299 ] Xintong Song commented on FLINK-11838:
--

Hi [~galenwarren],

Thanks for offering the contribution. I will help you with this contribution. Since this ticket has not been updated for quite some time and the original PR has been abandoned, I have assigned you to the ticket.

Just to manage expectations, I may need some time to pick up the GCS background and review your design proposal. In the meantime, I would suggest taking a look at the following guidelines.
[https://flink.apache.org/contributing/contribute-code.html]
[https://flink.apache.org/contributing/code-style-and-quality-preamble.html]

After a first glance at the PR, I have two suggestions.
- I noticed you've described your proposal on the PR you've opened. It would be nice to move that description into this JIRA ticket. Usually, we use the JIRA ticket for design discussions, and the PR for reviewing implementation details.
- The PR contains 3k LOC of changes in a single commit, which could be hard to review, especially when we cannot communicate face-to-face. It would be nice to organize the code into smaller commits following the contribution guidelines. This can be done after we reach consensus on the design proposal.
> Create RecoverableWriter for GCS > > > Key: FLINK-11838 > URL: https://issues.apache.org/jira/browse/FLINK-11838 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.8.0 >Reporter: Fokko Driesprong >Assignee: Galen Warren >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > GCS supports the resumable upload which we can use to create a Recoverable > writer similar to the S3 implementation: > https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload > After using the Hadoop compatible interface: > https://github.com/apache/flink/pull/7519 > We've noticed that the current implementation relies heavily on the renaming > of the files on the commit: > https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259 > This is suboptimal on an object store such as GCS. Therefore we would like to > implement a more GCS native RecoverableWriter -- This message was sent by Atlassian Jira (v8.3.4#803005)
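The ticket's complaint is that the Hadoop-based recoverable stream commits by renaming the in-progress file, which is cheap on HDFS but a copy-plus-delete on an object store like GCS. A minimal Python sketch of that rename-on-commit pattern, using local files purely for illustration (the function name is hypothetical, not Flink's API):

```python
import os
import tempfile

def commit_via_rename(data: bytes, final_path: str) -> str:
    """Stage data under a temporary name, then atomically rename on commit."""
    directory = os.path.dirname(final_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(data)  # the in-progress "part" file
    # Atomic on POSIX filesystems; on an object store like GCS a "rename"
    # is a copy plus a delete, which is what makes this pattern expensive
    # there and motivates a resumable-upload-based writer instead.
    os.replace(tmp_path, final_path)
    return final_path

with tempfile.TemporaryDirectory() as d:
    committed = commit_via_rename(b"checkpoint-part", os.path.join(d, "part-0"))
    with open(committed, "rb") as f:
        assert f.read() == b"checkpoint-part"
```

A GCS-native writer would instead keep the resumable upload session id as the recoverable state and finalize the upload on commit, avoiding the copy entirely.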
[GitHub] [flink] flinkbot commented on pull request #14876: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`
flinkbot commented on pull request #14876: URL: https://github.com/apache/flink/pull/14876#issuecomment-773757019 ## CI report: * 100fb1eeb7f94949efc85cc6921a5c653a56163d UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14844: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch
flinkbot edited a comment on pull request #14844: URL: https://github.com/apache/flink/pull/14844#issuecomment-772295878 ## CI report: * 4c5d3fcfdf5bb2a114964f5cd60fc6743fe331da Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12833) * 54ed8511139561657f10d84dc25b189acbbf156c UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14798: [FLINK-21187] Provide exception history for root causes
flinkbot edited a comment on pull request #14798: URL: https://github.com/apache/flink/pull/14798#issuecomment-769223911 ## CI report: * 5910b1a876d28886a8b5f87c09e67d75d2a45cd3 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12962) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14379: [FLINK-20563][hive] Support built-in functions for Hive versions prio…
flinkbot edited a comment on pull request #14379: URL: https://github.com/apache/flink/pull/14379#issuecomment-73009 ## CI report: * a413199ba00fdf4b41b09e7cf4bcf02bcd6da0ce Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10861) * 64460eaf888b0333b8aed626d48f66fd875997cf UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-20970) DECIMAL(10, 0) can not be GROUP BY key.
[ https://issues.apache.org/jira/browse/FLINK-20970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279297#comment-17279297 ] Wong Mulan commented on FLINK-20970:

This problem does not occur in versions 1.11.3 and 1.12.1.

> DECIMAL(10, 0) can not be GROUP BY key.
> ---
>
> Key: FLINK-20970
> URL: https://issues.apache.org/jira/browse/FLINK-20970
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Runtime
> Affects Versions: 1.10.1
> Reporter: Wong Mulan
> Priority: Major
> Attachments: image-2021-01-14-17-06-28-648.png
>
> If value which type is not DECIMAL(38, 18) is GROUP BY key, the result will be -1.
> So, only DECIMAL(38, 18) can be GROUP BY key?
> Whatever the value is, it will be return -1.
> !image-2021-01-14-17-06-28-648.png|thumbnail!

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] wuchong commented on pull request #14863: [FLINK-21203]Don’t collect -U&+U Row When they are equals In the LastRowFunction
wuchong commented on pull request #14863: URL: https://github.com/apache/flink/pull/14863#issuecomment-773753124 cc @leonardBang , could you help to review this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-21284) Non-deterministic functions return different values
[ https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hehuiyuan updated FLINK-21284: -- Description: Non-deterministic UDF functions is used mutiple times , the result is different. {code:java} Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex from myhive_staff"); tableEnv.registerTable("tmp", tm); tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= 8"); }{code} RAND_INTEGER() function is used for `RAND_INTEGER(10) as sample` when sink, which lead to inconsistent result. !image-2021-02-05-10-23-02-616.png|width=759,height=433! !image-2021-02-05-10-23-20-639.png|width=1664,height=728! was: Non-deterministic UDF functions is used mutiple times , the result is different. {code:java} Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex from myhive_staff"); tableEnv.registerTable("tmp", tm); tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= 8"); }{code} Sample udf function is used for `RAND_INTEGER(10) as sample` when sink, which lead to inconsistent result. !image-2021-02-05-10-23-02-616.png|width=759,height=433! !image-2021-02-05-10-23-20-639.png|width=1664,height=728! > Non-deterministic functions return different values > > > Key: FLINK-21284 > URL: https://issues.apache.org/jira/browse/FLINK-21284 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Reporter: hehuiyuan >Priority: Major > Attachments: image-2021-02-05-10-23-02-616.png, > image-2021-02-05-10-23-20-639.png > > > Non-deterministic UDF functions is used mutiple times , the result is > different. 
> > {code:java} > Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex > from myhive_staff"); > tableEnv.registerTable("tmp", tm); > tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= > 8"); > }{code} > > RAND_INTEGER() function is used for `RAND_INTEGER(10) as sample` when sink, > which lead to inconsistent result. > > !image-2021-02-05-10-23-02-616.png|width=759,height=433! > > > !image-2021-02-05-10-23-20-639.png|width=1664,height=728! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-21284) Non-deterministic functions return different values
[ https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279295#comment-17279295 ] Jark Wu commented on FLINK-21284: - [~hehuiyuan], I agree this is a bug. > Non-deterministic functions return different values > > > Key: FLINK-21284 > URL: https://issues.apache.org/jira/browse/FLINK-21284 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Reporter: hehuiyuan >Priority: Major > Attachments: image-2021-02-05-10-23-02-616.png, > image-2021-02-05-10-23-20-639.png > > > Non-deterministic UDF functions is used mutiple times , the result is > different. > > {code:java} > Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex > from myhive_staff"); > tableEnv.registerTable("tmp", tm); > tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= > 8"); > }{code} > > Sample udf function is used for `RAND_INTEGER(10) as sample` when sink, > which lead to inconsistent result. > > !image-2021-02-05-10-23-02-616.png|width=759,height=433! > > > !image-2021-02-05-10-23-20-639.png|width=1664,height=728! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-21284) Non-deterministic functions return different values
[ https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279294#comment-17279294 ] hehuiyuan edited comment on FLINK-21284 at 2/5/21, 3:04 AM:

[~jark] , yes, the RAND_INTEGER result is non-deterministic.
`random.nextint` is used when filtering `>=8` ---> result$2
`random.nextint` is used when setting the output ---> result$10
result$2 and result$10 may be different

was (Author: hehuiyuan):
[~jark] , yes , RAND_INTEGER is non deterministic. `random.nextint` is used when fiter `>=8` - - - > result$2 `random.nextint` is used when setting output. - - - > result$10

> Non-deterministic functions return different values
>
> Key: FLINK-21284
> URL: https://issues.apache.org/jira/browse/FLINK-21284
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Reporter: hehuiyuan
> Priority: Major
> Attachments: image-2021-02-05-10-23-02-616.png, image-2021-02-05-10-23-20-639.png
>
> Non-deterministic UDF functions is used mutiple times , the result is different.
>
> {code:java}
> Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex from myhive_staff");
> tableEnv.registerTable("tmp", tm);
> tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= 8");
> }{code}
>
> Sample udf function is used for `RAND_INTEGER(10) as sample` when sink, which lead to inconsistent result.
>
> !image-2021-02-05-10-23-02-616.png|width=759,height=433!
>
> !image-2021-02-05-10-23-20-639.png|width=1664,height=728!

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-21284) Non-deterministic functions return different values
[ https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279294#comment-17279294 ] hehuiyuan commented on FLINK-21284: --- [~jark] , yes , RAND_INTEGER is non deterministic. `random.nextint` is used when fiter `>=8` - - - > result$2 `random.nextint` is used when setting output. - - - > result$10 > Non-deterministic functions return different values > > > Key: FLINK-21284 > URL: https://issues.apache.org/jira/browse/FLINK-21284 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Reporter: hehuiyuan >Priority: Major > Attachments: image-2021-02-05-10-23-02-616.png, > image-2021-02-05-10-23-20-639.png > > > Non-deterministic UDF functions is used mutiple times , the result is > different. > > {code:java} > Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex > from myhive_staff"); > tableEnv.registerTable("tmp", tm); > tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= > 8"); > }{code} > > Sample udf function is used for `RAND_INTEGER(10) as sample` when sink, > which lead to inconsistent result. > > !image-2021-02-05-10-23-02-616.png|width=759,height=433! > > > !image-2021-02-05-10-23-20-639.png|width=1664,height=728! -- This message was sent by Atlassian Jira (v8.3.4#803005)
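The inconsistency described in these comments comes from inlining a non-deterministic call into both the filter (`result$2`) and the projection (`result$10`), so the two sites draw independent random values. A small Python sketch of the failure mode and of the single-evaluation fix (the names are illustrative, not the planner's code):

```python
import random

def rand_integer(bound=10):
    # Stand-in for SQL RAND_INTEGER(bound): a non-deterministic expression.
    return random.randrange(bound)

rows = ["a", "b", "c", "d"]

# Buggy plan shape: the call is inlined into both the predicate and the
# projection, so the emitted sample is drawn independently of the value
# that passed the filter and may well be < 8.
buggy = [(name, rand_integer()) for name in rows if rand_integer() >= 8]

# Fixed plan shape: evaluate once per row, then reuse that single value
# in both the filter and the output.
materialized = [(name, rand_integer()) for name in rows]
fixed = [(name, sample) for name, sample in materialized if sample >= 8]

# With a single evaluation, every emitted sample satisfies the predicate.
assert all(sample >= 8 for _, sample in fixed)
```

This is why planners must materialize non-deterministic expressions instead of duplicating them during common-subexpression inlining.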
[jira] [Updated] (FLINK-21284) Non-deterministic functions return different values
[ https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hehuiyuan updated FLINK-21284: -- Description: Non-deterministic UDF functions is used mutiple times , the result is different. {code:java} Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex from myhive_staff"); tableEnv.registerTable("tmp", tm); tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= 8"); }{code} Sample udf function is used for `RAND_INTEGER(10) as sample` when sink, which lead to inconsistent result. !image-2021-02-05-10-23-02-616.png|width=759,height=433! !image-2021-02-05-10-23-20-639.png|width=1664,height=728! was: Non-deterministic UDF functions is used mutiple times , the result is different. {code:java} tableEnv.registerFunction("sample", new SampleFunction()); Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex from myhive_staff"); tableEnv.registerTable("tmp", tm); tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= 8"); }{code} Sample udf function is used for `RAND_INTEGER(10) as sample` when sink, which lead to inconsistent result. !image-2021-02-05-10-23-02-616.png|width=759,height=433! !image-2021-02-05-10-23-20-639.png|width=1664,height=728! > Non-deterministic functions return different values > > > Key: FLINK-21284 > URL: https://issues.apache.org/jira/browse/FLINK-21284 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Reporter: hehuiyuan >Priority: Major > Attachments: image-2021-02-05-10-23-02-616.png, > image-2021-02-05-10-23-20-639.png > > > Non-deterministic UDF functions is used mutiple times , the result is > different. 
> > {code:java} > Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex > from myhive_staff"); > tableEnv.registerTable("tmp", tm); > tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= > 8"); > }{code} > > Sample udf function is used for `RAND_INTEGER(10) as sample` when sink, > which lead to inconsistent result. > > !image-2021-02-05-10-23-02-616.png|width=759,height=433! > > > !image-2021-02-05-10-23-20-639.png|width=1664,height=728! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-21284) Non-deterministic UDF functions return different values
[ https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hehuiyuan updated FLINK-21284: -- Description: Non-deterministic UDF functions is used mutiple times , the result is different. {code:java} tableEnv.registerFunction("sample", new SampleFunction()); Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex from myhive_staff"); tableEnv.registerTable("tmp", tm); tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= 8"); }{code} Sample udf function is used for `RAND_INTEGER(10) as sample` when sink, which lead to inconsistent result. !image-2021-02-05-10-23-02-616.png|width=759,height=433! !image-2021-02-05-10-23-20-639.png|width=1664,height=728! was: Non-deterministic UDF functions is used mutiple times , the result is different. {code:java} tableEnv.registerFunction("sample", new SampleFunction()); Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex from myhive_staff"); tableEnv.registerTable("tmp", tm); tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= 8"); // UDF函数 public class SampleFunction extends ScalarFunction { public int eval(int pvid) { int a = (int) (Math.random() * 10); System.out.println("" + a ); return a; } }{code} Sample udf function is used for `RAND_INTEGER(10) as sample` when sink, which lead to inconsistent result. !image-2021-02-05-10-23-02-616.png|width=759,height=433! !image-2021-02-05-10-23-20-639.png|width=1664,height=728! > Non-deterministic UDF functions return different values > --- > > Key: FLINK-21284 > URL: https://issues.apache.org/jira/browse/FLINK-21284 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Reporter: hehuiyuan >Priority: Major > Attachments: image-2021-02-05-10-23-02-616.png, > image-2021-02-05-10-23-20-639.png > > > Non-deterministic UDF functions is used mutiple times , the result is > different. 
> > {code:java} > tableEnv.registerFunction("sample", new SampleFunction()); > Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex > from myhive_staff"); > tableEnv.registerTable("tmp", tm); > tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= > 8"); > }{code} > > The sample UDF is evaluated again for `RAND_INTEGER(10) as sample` at the > sink, which leads to inconsistent results. > > !image-2021-02-05-10-23-02-616.png|width=759,height=433! > > > !image-2021-02-05-10-23-20-639.png|width=1664,height=728! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-21284) Non-deterministic functions return different values
[ https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hehuiyuan updated FLINK-21284: -- Summary: Non-deterministic functions return different values (was: Non-deterministic UDF functions return different values) > Non-deterministic functions return different values > > > Key: FLINK-21284 > URL: https://issues.apache.org/jira/browse/FLINK-21284 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Reporter: hehuiyuan >Priority: Major > Attachments: image-2021-02-05-10-23-02-616.png, > image-2021-02-05-10-23-20-639.png > > > When a non-deterministic UDF is used multiple times, the results > differ. > > {code:java} > tableEnv.registerFunction("sample", new SampleFunction()); > Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex > from myhive_staff"); > tableEnv.registerTable("tmp", tm); > tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= > 8"); > }{code} > > The sample UDF is evaluated again for `RAND_INTEGER(10) as sample` at the > sink, which leads to inconsistent results. > > !image-2021-02-05-10-23-02-616.png|width=759,height=433! > > > !image-2021-02-05-10-23-20-639.png|width=1664,height=728! -- This message was sent by Atlassian Jira (v8.3.4#803005)
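The behavior reported in FLINK-21284 can be reproduced without Flink: if a planner inlines a non-deterministic call at every use site, the value checked by the `WHERE sample >= 8` filter and the value delivered to the sink come from different invocations. A minimal, Flink-free Java simulation of the two evaluation strategies (the class and method names below are invented for illustration; this is not Flink code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class NonDeterministicEval {
    // Simulates inlining the non-deterministic call at each use site:
    // the filter and the sink each invoke the function, so the two values disagree.
    static List<Integer> doubleEvaluation(long seed, int rows) {
        Random rnd = new Random(seed);
        List<Integer> sunk = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            int filterVal = rnd.nextInt(10);   // invocation #1, used by "WHERE sample >= 8"
            if (filterVal >= 8) {
                sunk.add(rnd.nextInt(10));     // invocation #2, the value actually written to the sink
            }
        }
        return sunk;
    }

    // Materializing the value once keeps filter and sink consistent.
    static List<Integer> singleEvaluation(long seed, int rows) {
        Random rnd = new Random(seed);
        List<Integer> sunk = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            int sample = rnd.nextInt(10);      // single invocation, reused by filter and sink
            if (sample >= 8) {
                sunk.add(sample);
            }
        }
        return sunk;
    }
}
```

With per-use-site evaluation, rows that passed the filter can still carry sink values below 8, which matches the screenshots in the report; evaluating once and reusing the result removes the inconsistency.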
[jira] [Created] (FLINK-21286) Support BUCKET for flink sql CREATE TABLE
Jun Zhang created FLINK-21286: - Summary: Support BUCKET for flink sql CREATE TABLE Key: FLINK-21286 URL: https://issues.apache.org/jira/browse/FLINK-21286 Project: Flink Issue Type: Improvement Components: Table SQL / API Affects Versions: 1.12.1 Reporter: Jun Zhang Fix For: 1.13.0 Support BUCKET for Flink SQL CREATE TABLE; refer to the Hive syntax: {code:java} [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS] {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-11838) Create RecoverableWriter for GCS
[ https://issues.apache.org/jira/browse/FLINK-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song reassigned FLINK-11838: Assignee: Galen Warren (was: Fokko Driesprong) > Create RecoverableWriter for GCS > > > Key: FLINK-11838 > URL: https://issues.apache.org/jira/browse/FLINK-11838 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.8.0 >Reporter: Fokko Driesprong >Assignee: Galen Warren >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > GCS supports the resumable upload which we can use to create a Recoverable > writer similar to the S3 implementation: > https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload > After using the Hadoop compatible interface: > https://github.com/apache/flink/pull/7519 > We've noticed that the current implementation relies heavily on the renaming > of the files on the commit: > https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259 > This is suboptimal on an object store such as GCS. Therefore we would like to > implement a more GCS native RecoverableWriter -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks
gaoyunhaii commented on a change in pull request #14740: URL: https://github.com/apache/flink/pull/14740#discussion_r570686307 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointBriefCalculator.java ## @@ -0,0 +1,492 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.runtime.checkpoint; + +import org.apache.flink.annotation.VisibleForTesting; +import org.apache.flink.api.common.JobID; +import org.apache.flink.runtime.execution.ExecutionState; +import org.apache.flink.runtime.executiongraph.Execution; +import org.apache.flink.runtime.executiongraph.ExecutionAttemptID; +import org.apache.flink.runtime.executiongraph.ExecutionEdge; +import org.apache.flink.runtime.executiongraph.ExecutionJobVertex; +import org.apache.flink.runtime.executiongraph.ExecutionVertex; +import org.apache.flink.runtime.jobgraph.DistributionPattern; +import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID; +import org.apache.flink.runtime.jobgraph.JobEdge; +import org.apache.flink.runtime.jobgraph.JobVertexID; +import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collection; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.ListIterator; +import java.util.Map; +import java.util.Set; +import java.util.concurrent.CompletableFuture; +import java.util.stream.Collectors; +import java.util.stream.IntStream; + +import static org.apache.flink.util.Preconditions.checkNotNull; +import static org.apache.flink.util.Preconditions.checkState; + +/** Computes the tasks to trigger, wait or commit for each checkpoint. 
*/ +public class CheckpointBriefCalculator { +private static final Logger LOG = LoggerFactory.getLogger(CheckpointBriefCalculator.class); + +private final JobID jobId; + +private final CheckpointBriefCalculatorContext context; + +private final List<ExecutionJobVertex> jobVerticesInTopologyOrder = new ArrayList<>(); + +private final List<ExecutionVertex> allTasks = new ArrayList<>(); + +private final List<ExecutionVertex> sourceTasks = new ArrayList<>(); + +public CheckpointBriefCalculator( +JobID jobId, +CheckpointBriefCalculatorContext context, +Iterable<ExecutionJobVertex> jobVerticesInTopologyOrderIterable) { + +this.jobId = checkNotNull(jobId); +this.context = checkNotNull(context); + +checkNotNull(jobVerticesInTopologyOrderIterable); +jobVerticesInTopologyOrderIterable.forEach( +jobVertex -> { +jobVerticesInTopologyOrder.add(jobVertex); + allTasks.addAll(Arrays.asList(jobVertex.getTaskVertices())); + +if (jobVertex.getJobVertex().isInputVertex()) { + sourceTasks.addAll(Arrays.asList(jobVertex.getTaskVertices())); +} +}); +} + +public CompletableFuture<CheckpointBrief> calculateCheckpointBrief() { +CompletableFuture<CheckpointBrief> resultFuture = new CompletableFuture<>(); + +context.getMainExecutor() +.execute( +() -> { Review comment: I think it would be much simpler, thanks a lot! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
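The review thread above concerns computing which tasks a checkpoint should trigger once some tasks may already have finished. The core idea can be sketched, greatly simplified and without any Flink types (all names below are invented for the sketch; the real CheckpointBriefCalculator deals with execution edges, partially finished vertices, and asynchrony that this toy ignores): a running task is triggered directly when all of its upstream tasks have finished, because it then acts as an effective source of the checkpoint.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class TriggerSetSketch {
    enum State { RUNNING, FINISHED }

    // Toy trigger-set computation: a running task is triggered directly iff every
    // one of its upstream tasks has already finished (finished predecessors make
    // it an "effective source"). Tasks with no entry in `upstream` are sources.
    static Set<String> tasksToTrigger(Map<String, State> states,
                                      Map<String, List<String>> upstream) {
        return states.entrySet().stream()
                .filter(e -> e.getValue() == State.RUNNING)
                .map(Map.Entry::getKey)
                .filter(task -> upstream.getOrDefault(task, List.of()).stream()
                        .allMatch(u -> states.get(u) == State.FINISHED))
                .collect(Collectors.toSet());
    }
}
```

For a chain src → mid → sink where src has finished, only mid is triggered: sink still receives barriers through its running upstream.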
[jira] [Created] (FLINK-21285) Support MERGE INTO for flink sql
Jun Zhang created FLINK-21285: - Summary: Support MERGE INTO for flink sql Key: FLINK-21285 URL: https://issues.apache.org/jira/browse/FLINK-21285 Project: Flink Issue Type: Improvement Components: Table SQL / API Affects Versions: 1.12.1 Reporter: Jun Zhang Fix For: 1.13.0 Support MERGE INTO for Flink SQL; refer to the Hive syntax: {code:java} MERGE INTO <target table> AS T USING <source expression/table> AS S ON <boolean expression1> WHEN MATCHED [AND <boolean expression2>] THEN UPDATE SET <set clause list> WHEN MATCHED [AND <boolean expression3>] THEN DELETE WHEN NOT MATCHED [AND <boolean expression4>] THEN INSERT VALUES <value list> {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-21278) NullpointExecption error is reported when using the evictor method to filter the data before the window calculation
[ https://issues.apache.org/jira/browse/FLINK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279287#comment-17279287 ] HunterHunter commented on FLINK-21278: -- Do we need to determine whether there is data in the window before triggering the window calculation? I have fixed this bug in my code. https://github.com/LinMingQiang/flink/commit/bafefee33b2513d9c4128e9a3a1f9644000137a3 > NullpointExecption error is reported when using the evictor method to filter > the data before the window calculation > --- > > Key: FLINK-21278 > URL: https://issues.apache.org/jira/browse/FLINK-21278 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.12.1 >Reporter: HunterHunter >Priority: Blocker > > When I use the evictor() method to filter the data before a window is > triggered, and no data meets the conditions, a NullPointerException will be > thrown. > The problem occurs in the ReduceApplyWindowFunction.apply method. > So I think either the window calculation should not be triggered when there > is no data, or the result should be checked for null before it is emitted. -- This message was sent by Atlassian Jira (v8.3.4#803005)
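The null in FLINK-21278 arises when the evictor removes every element before the window function runs, so the reduce produces no result and a null record reaches the sink. The guard the commenter suggests can be sketched without Flink (`reduceContents` is a hypothetical stand-in, not a Flink API): emit a result only when eviction leaves something behind.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.BinaryOperator;

public class EvictedWindowGuard {
    // Hypothetical stand-in for emitting an evicting window's result: reduce the
    // (possibly evicted-to-empty) contents and emit nothing when no element is
    // left, instead of handing a null record to a downstream sink.
    static <T> Optional<T> reduceContents(List<T> contents, BinaryOperator<T> reducer) {
        if (contents.isEmpty()) {
            return Optional.empty(); // empty window after eviction: no output, no NPE downstream
        }
        T acc = contents.get(0);
        for (int i = 1; i < contents.size(); i++) {
            acc = reducer.apply(acc, contents.get(i));
        }
        return Optional.of(acc);
    }
}
```

A caller would forward the value only when the `Optional` is present, which is exactly the "check before transmitting" option proposed in the issue description.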
[GitHub] [flink] flinkbot commented on pull request #14876: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`
flinkbot commented on pull request #14876: URL: https://github.com/apache/flink/pull/14876#issuecomment-773745339 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 100fb1eeb7f94949efc85cc6921a5c653a56163d (Fri Feb 05 02:45:10 UTC 2021) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] V1ncentzzZ commented on pull request #14849: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`
V1ncentzzZ commented on pull request #14849: URL: https://github.com/apache/flink/pull/14849#issuecomment-773745324 Thanks @wuchong @leonardBang , this [PR](https://github.com/apache/flink/pull/14876) for master. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] V1ncentzzZ opened a new pull request #14876: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`
V1ncentzzZ opened a new pull request #14876: URL: https://github.com/apache/flink/pull/14876 ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log Fix typo in `UnsignedTypeConversionITCase`. ## Verifying this change *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluser with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-21284) Non-deterministic UDF functions return different values
[ https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279286#comment-17279286 ] Jark Wu commented on FLINK-21284: - [~hehuiyuan], do you mean you get results with sample < 8 in the sink? > Non-deterministic UDF functions return different values > --- > > Key: FLINK-21284 > URL: https://issues.apache.org/jira/browse/FLINK-21284 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Reporter: hehuiyuan >Priority: Major > Attachments: image-2021-02-05-10-23-02-616.png, > image-2021-02-05-10-23-20-639.png > > > When a non-deterministic UDF is used multiple times, the results > differ. > > {code:java} > tableEnv.registerFunction("sample", new SampleFunction()); > Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex > from myhive_staff"); > tableEnv.registerTable("tmp", tm); > tableEnv.sqlUpdate("insert into sinktable select * from tmp where sample >= > 8"); > // UDF function > public class SampleFunction extends ScalarFunction { > public int eval(int pvid) { > int a = (int) (Math.random() * 10); > System.out.println("" + a ); > return a; > } > }{code} > > The sample UDF is evaluated again for `RAND_INTEGER(10) as sample` at the > sink, which leads to inconsistent results. > > !image-2021-02-05-10-23-02-616.png|width=759,height=433! > > > !image-2021-02-05-10-23-20-639.png|width=1664,height=728! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le
[ https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-15318: - Fix Version/s: 1.12.0 > RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le > --- > > Key: FLINK-15318 > URL: https://issues.apache.org/jira/browse/FLINK-15318 > Project: Flink > Issue Type: Bug > Components: Benchmarks, Runtime / State Backends > Environment: arch: ppc64le > os: rhel7.6, ubuntu 18.04 > jdk: 8, 11 > mvn: 3.3.9, 3.6.2 >Reporter: Siddhesh Ghadi >Priority: Major > Fix For: 1.12.0 > > Attachments: surefire-report.txt > > > RocksDBWriteBatchPerformanceTest.benchMark fails due to TestTimedOut, however > when test-timeout is increased from 2s to 5s in > org/apache/flink/contrib/streaming/state/benchmark/RocksDBWriteBatchPerformanceTest.java:75, > it passes. Is this acceptable solution? > Note: Tests are ran inside a container. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le
[ https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang closed FLINK-15318. Resolution: Fixed > RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le > --- > > Key: FLINK-15318 > URL: https://issues.apache.org/jira/browse/FLINK-15318 > Project: Flink > Issue Type: Bug > Components: Benchmarks, Runtime / State Backends > Environment: arch: ppc64le > os: rhel7.6, ubuntu 18.04 > jdk: 8, 11 > mvn: 3.3.9, 3.6.2 >Reporter: Siddhesh Ghadi >Priority: Major > Attachments: surefire-report.txt > > > RocksDBWriteBatchPerformanceTest.benchMark fails due to TestTimedOut, however > when test-timeout is increased from 2s to 5s in > org/apache/flink/contrib/streaming/state/benchmark/RocksDBWriteBatchPerformanceTest.java:75, > it passes. Is this acceptable solution? > Note: Tests are ran inside a container. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le
[ https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279285#comment-17279285 ] Yun Tang commented on FLINK-15318: -- [~maguowei] These tests were dropped in FLINK-18373, and I will close this ticket as it has been fixed since 1.12.0. > RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le > --- > > Key: FLINK-15318 > URL: https://issues.apache.org/jira/browse/FLINK-15318 > Project: Flink > Issue Type: Bug > Components: Benchmarks, Runtime / State Backends > Environment: arch: ppc64le > os: rhel7.6, ubuntu 18.04 > jdk: 8, 11 > mvn: 3.3.9, 3.6.2 >Reporter: Siddhesh Ghadi >Priority: Major > Attachments: surefire-report.txt > > > RocksDBWriteBatchPerformanceTest.benchMark fails due to TestTimedOut, however > when test-timeout is increased from 2s to 5s in > org/apache/flink/contrib/streaming/state/benchmark/RocksDBWriteBatchPerformanceTest.java:75, > it passes. Is this acceptable solution? > Note: Tests are ran inside a container. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-21278) NullpointExecption error is reported when using the evictor method to filter the data before the window calculation
[ https://issues.apache.org/jira/browse/FLINK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HunterHunter updated FLINK-21278: - Attachment: (was: image-2021-02-05-10-35-45-364.png) > NullpointExecption error is reported when using the evictor method to filter > the data before the window calculation > --- > > Key: FLINK-21278 > URL: https://issues.apache.org/jira/browse/FLINK-21278 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.12.1 >Reporter: HunterHunter >Priority: Blocker > > When I use the evictor() method to filter the data before a window is > triggered, and no data meets the conditions, a NullPointerException will be > thrown. > The problem occurs in the ReduceApplyWindowFunction.apply method. > So I think either the window calculation should not be triggered when there > is no data, or the result should be checked for null before it is emitted. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-21278) NullpointExecption error is reported when using the evictor method to filter the data before the window calculation
[ https://issues.apache.org/jira/browse/FLINK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HunterHunter updated FLINK-21278: - Attachment: image-2021-02-05-10-35-45-364.png > NullpointExecption error is reported when using the evictor method to filter > the data before the window calculation > --- > > Key: FLINK-21278 > URL: https://issues.apache.org/jira/browse/FLINK-21278 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.12.1 >Reporter: HunterHunter >Priority: Blocker > Attachments: image-2021-02-05-10-35-45-364.png > > > When I use the evictor() method to filter the data before a window is > triggered, and no data meets the conditions, a NullPointerException will be > thrown. > The problem occurs in the ReduceApplyWindowFunction.apply method. > So I think either the window calculation should not be triggered when there > is no data, or the result should be checked for null before it is emitted. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-21278) NullpointExecption error is reported when using the evictor method to filter the data before the window calculation
[ https://issues.apache.org/jira/browse/FLINK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279282#comment-17279282 ] HunterHunter commented on FLINK-21278: -- [~ZhaoWeiNan] code: [https://paste.ubuntu.com/p/wJyN93BwHB/] {code:java} Caused by: java.lang.NullPointerException at org.apache.flink.api.common.functions.util.PrintSinkOutputWriter.write(PrintSinkOutputWriter.java:73) at org.apache.flink.streaming.api.functions.sink.PrintSinkFunction.invoke(PrintSinkFunction.java:81) at org.apache.flink.streaming.api.functions.sink.SinkFunction.invoke(SinkFunction.java:52) at org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:56) at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:71) at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46) at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26) at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:52) at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:30) at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:53) at org.apache.flink.streaming.api.functions.windowing.PassThroughWindowFunction.apply(PassThroughWindowFunction.java:36) at org.apache.flink.streaming.api.functions.windowing.ReduceApplyWindowFunction.apply(ReduceApplyWindowFunction.java:58) at org.apache.flink.streaming.runtime.operators.windowing.functions.InternalIterableWindowFunction.process(InternalIterableWindowFunction.java:44) at org.apache.flink.streaming.runtime.operators.windowing.functions.InternalIterableWindowFunction.process(InternalIterableWindowFunction.java:32) at org.apache.flink.streaming.runtime.operators.windowing.EvictingWindowOperator.emitWindowContents(EvictingWindowOperator.java:359) at 
org.apache.flink.streaming.runtime.operators.windowing.EvictingWindowOperator.onEventTime(EvictingWindowOperator.java:271) at org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.advanceWatermark(InternalTimerServiceImpl.java:276) at org.apache.flink.streaming.api.operators.InternalTimeServiceManagerImpl.advanceWatermark(InternalTimeServiceManagerImpl.java:183) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:600) at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitWatermark(OneInputStreamTask.java:199) at org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:173) at org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:95) at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:181) at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152) at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372) at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186) at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) at java.lang.Thread.run(Thread.java:748){code} > NullpointExecption error is reported when using the evictor method to filter > the data before the window calculation > --- > > Key: FLINK-21278 > URL: https://issues.apache.org/jira/browse/FLINK-21278 > Project: Flink > 
Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.12.1 >Reporter: HunterHunter >Priority: Blocker > > When I use the evictor() method to filter the data before a window is > triggered, and no data meets the conditions, a NullPointerException will be > thrown. > The problem occurs in the ReduceApplyWindowFunction.apply method. > So I think either the window calculation should not be triggered when there > is no data, or the result should be checked for null before it is emitted.
[GitHub] [flink] wuchong commented on pull request #14849: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`
wuchong commented on pull request #14849: URL: https://github.com/apache/flink/pull/14849#issuecomment-773740994 @V1ncentzzZ could you open a pull request for master too? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-20254) HiveTableSourceITCase.testStreamPartitionReadByCreateTime times out
[ https://issues.apache.org/jira/browse/FLINK-20254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-20254. Resolution: Fixed master (1.13): fce75b5f5c078884022f89ab6fa9b39e23a84279 > HiveTableSourceITCase.testStreamPartitionReadByCreateTime times out > --- > > Key: FLINK-20254 > URL: https://issues.apache.org/jira/browse/FLINK-20254 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Leonard Xu >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.13.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9808&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=62110053-334f-5295-a0ab-80dd7e2babbf > {code} > 2020-11-19T10:34:23.5591765Z [ERROR] Tests run: 18, Failures: 0, Errors: 1, > Skipped: 0, Time elapsed: 192.243 s <<< FAILURE! - in > org.apache.flink.connectors.hive.HiveTableSourceITCase > 2020-11-19T10:34:23.5593193Z [ERROR] > testStreamPartitionReadByCreateTime(org.apache.flink.connectors.hive.HiveTableSourceITCase) > Time elapsed: 120.075 s <<< ERROR! 
> 2020-11-19T10:34:23.5593929Z org.junit.runners.model.TestTimedOutException: > test timed out after 12 milliseconds > 2020-11-19T10:34:23.5594321Z at java.lang.Thread.sleep(Native Method) > 2020-11-19T10:34:23.5594777Z at > org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sleepBeforeRetry(CollectResultFetcher.java:231) > 2020-11-19T10:34:23.5595378Z at > org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:119) > 2020-11-19T10:34:23.5596001Z at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103) > 2020-11-19T10:34:23.5596610Z at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77) > 2020-11-19T10:34:23.5597218Z at > org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:115) > 2020-11-19T10:34:23.5597811Z at > org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:355) > 2020-11-19T10:34:23.5598555Z at > org.apache.flink.connectors.hive.HiveTableSourceITCase.fetchRows(HiveTableSourceITCase.java:653) > 2020-11-19T10:34:23.5599407Z at > org.apache.flink.connectors.hive.HiveTableSourceITCase.testStreamPartitionReadByCreateTime(HiveTableSourceITCase.java:594) > 2020-11-19T10:34:23.5599982Z at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2020-11-19T10:34:23.5600393Z at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2020-11-19T10:34:23.5600865Z at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2020-11-19T10:34:23.5601300Z at > java.lang.reflect.Method.invoke(Method.java:498) > 2020-11-19T10:34:23.5601713Z at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > 2020-11-19T10:34:23.5602211Z at > 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2020-11-19T10:34:23.5602688Z at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > 2020-11-19T10:34:23.5603181Z at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2020-11-19T10:34:23.5603753Z at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > 2020-11-19T10:34:23.5604308Z at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > 2020-11-19T10:34:23.5604780Z at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > 2020-11-19T10:34:23.5605114Z at java.lang.Thread.run(Thread.java:748) > 2020-11-19T10:34:23.5605299Z > 2020-11-19T10:34:24.4180149Z [INFO] Running > org.apache.flink.connectors.hive.TableEnvHiveConnectorITCase > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] wuchong merged pull request #14849: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`
wuchong merged pull request #14849: URL: https://github.com/apache/flink/pull/14849 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14821: [FLINK-21210][coordination] ApplicationClusterEntryPoint should explicitly close PackagedProgram
flinkbot edited a comment on pull request #14821: URL: https://github.com/apache/flink/pull/14821#issuecomment-770351339

## CI report:

* aa75cbfa98c0494578283d2dab0908cdd4942c3a Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12968)

Bot commands

The @flinkbot bot supports the following commands:

- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
[jira] [Commented] (FLINK-20495) Elasticsearch6DynamicSinkITCase Hang
[ https://issues.apache.org/jira/browse/FLINK-20495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279278#comment-17279278 ]

Leonard Xu commented on FLINK-20495:

another instance: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12906&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20

> Elasticsearch6DynamicSinkITCase Hang
> Key: FLINK-20495
> URL: https://issues.apache.org/jira/browse/FLINK-20495
> Project: Flink
> Issue Type: Bug
> Components: Build System / Azure Pipelines, Connectors / ElasticSearch, Tests
> Affects Versions: 1.13.0
> Reporter: Huang Xingbo
> Priority: Major
> Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10535&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20
>
> {code:java}
> 2020-12-04T22:39:33.9748225Z [INFO] Running org.apache.flink.streaming.connectors.elasticsearch.table.Elasticsearch6DynamicSinkITCase
> 2020-12-04T22:54:51.9486410Z ==
> 2020-12-04T22:54:51.9488766Z Process produced no output for 900 seconds.
> {code}
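The hang above was caught by the CI's no-output watchdog ("Process produced no output for 900 seconds"). A hypothetical sketch of that kind of check — the class and method names below are illustrative, not Flink's actual CI code — tracks the last time the watched process printed anything and declares a hang once the silence reaches the limit:

```java
import java.util.concurrent.atomic.AtomicLong;

public class OutputWatchdog {

    private final AtomicLong lastOutputMillis;
    private final long limitMillis;

    OutputWatchdog(long startMillis, long limitMillis) {
        this.lastOutputMillis = new AtomicLong(startMillis);
        this.limitMillis = limitMillis;
    }

    // Called from the thread pumping the watched process's stdout/stderr.
    void onOutput(long nowMillis) {
        lastOutputMillis.set(nowMillis);
    }

    // Polled periodically by the watchdog thread: hung once the silence hits the limit.
    boolean isHung(long nowMillis) {
        return nowMillis - lastOutputMillis.get() >= limitMillis;
    }

    public static void main(String[] args) {
        OutputWatchdog watchdog = new OutputWatchdog(0, 900_000); // 900 s, as in the log above
        watchdog.onOutput(100_000);                     // process printed something at t = 100 s
        System.out.println(watchdog.isHung(500_000));   // 400 s of silence -> false
        System.out.println(watchdog.isHung(1_100_000)); // 1000 s of silence -> true
    }
}
```

A watchdog like this only detects silence, not progress, which is why a genuinely hung test and a test doing slow unlogged work look identical to it.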
[GitHub] [flink] JingsongLi merged pull request #14766: [FLINK-20254][hive] Make PartitionMonitor fetching partitions with same create time properly
JingsongLi merged pull request #14766: URL: https://github.com/apache/flink/pull/14766
[GitHub] [flink] JingsongLi commented on pull request #14766: [FLINK-20254][hive] Make PartitionMonitor fetching partitions with same create time properly
JingsongLi commented on pull request #14766: URL: https://github.com/apache/flink/pull/14766#issuecomment-773738333

@leonardBang You can create a JIRA for the failed test.