[GitHub] [flink] flinkbot edited a comment on pull request #14859: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14859:
URL: https://github.com/apache/flink/pull/14859#issuecomment-773047857


   
   ## CI report:
   
   * 31aab909d881e3f2bf60197b1d27048114f082c0 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12903)
 
   * 2476b3750d3dd73e706a7fc0b8887c1a5631c246 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14842: [FLINK-21238][python] Support to close PythonFunctionFactory

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14842:
URL: https://github.com/apache/flink/pull/14842#issuecomment-772256636


   
   ## CI report:
   
   * 0b46d07633ccbd6774129dfa4f78424e6cacc533 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12936)
 
   * 5c977a53fe75be12cb949431d0e2e1fb8084c8ab Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12981)
 
   * 2469c99bbf68659d33f4f6a8f36f77c0adc8f870 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14834: [FLINK-21234][build system] Adding timeout to all tasks

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14834:
URL: https://github.com/apache/flink/pull/14834#issuecomment-771607546


   
   ## CI report:
   
   * 3c55d1b87076148f095c08a55319f1b898dc932d UNKNOWN
   * 5c8ac11faf9d6fbcad7550bec8abf6a6343347c3 UNKNOWN
   * 4e50c2f87a73c3a79c178a580c84c31924a72c4a Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12965)
 
   * 35e6f16b00a6ac645dc140b7289272074a82fd19 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12980)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14727: [FLINK-19945][Connectors / FileSystem]Support sink parallelism config…

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14727:
URL: https://github.com/apache/flink/pull/14727#issuecomment-765189922


   
   ## CI report:
   
   * a95a451df26b7d32d9a22b801cbc8aff7ceb719e Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12979)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14629:
URL: https://github.com/apache/flink/pull/14629#issuecomment-759361463


   
   ## CI report:
   
   * 5426ab123aef03d4710c0fea1237fa014684f372 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12918)
 
   * 145f891faab2190d83a5ea35ac86a025605b43bd Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12976)
 
   * 6b93e4c21b1edf4cf9c071d88b78ab4230fbc004 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14379: [FLINK-20563][hive] Support built-in functions for Hive versions prio…

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14379:
URL: https://github.com/apache/flink/pull/14379#issuecomment-73009


   
   ## CI report:
   
   * 64460eaf888b0333b8aed626d48f66fd875997cf Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12972)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] leonardBang commented on pull request #14863: [FLINK-21203]Don’t collect -U&+U Row When they are equals In the LastRowFunction

2021-02-04 Thread GitBox


leonardBang commented on pull request #14863:
URL: https://github.com/apache/flink/pull/14863#issuecomment-773854776


   @wangpeibin713 Could you rebase your branch onto master rather than using the 
`git merge` command? I noticed you used `git merge`, which leaves the current 
branch's head commit with two parents.
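   A typical sequence for this (assuming the remote for apache/flink is named 
`origin`; adjust to your setup) is `git fetch origin`, then `git rebase 
origin/master`, and finally `git push --force-with-lease`, which updates the PR 
branch without creating a merge commit.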
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-21291) FlinkKafkaConsumer can't dynamically discover partition updates

2021-02-04 Thread zhangyunyun (Jira)
zhangyunyun created FLINK-21291:
---

 Summary: FlinkKafkaConsumer can't dynamically discover partition 
updates
 Key: FLINK-21291
 URL: https://issues.apache.org/jira/browse/FLINK-21291
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.11.2
Reporter: zhangyunyun


When the job starts, a WARN log like the one below appears:
{code:java}
WARN  org.apache.kafka.clients.consumer.ConsumerConfig  - The configuration 
'flink.partition-discovery.interval-millis' was supplied but isn't a known 
config.

{code}
 

I then tried to increase the number of partitions of the Kafka topic from 3 to 4 
with the following command:
{code:java}

./kafka-topics.sh --alter --zookeeper 10.0.10.21:15311 --topic STRUCTED_LOG 
--partitions 4
{code}
but it doesn't work; the consumer doesn't pick up the new partition.

 

How can I solve this problem? Thanks a lot.
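
For reference, a hedged sketch of how periodic partition discovery is normally 
enabled on the consumer (the property key matches the WARN above; the broker 
address and group id are placeholders). The WARN itself is harmless: the key is 
a Flink-level setting that Kafka's own ConsumerConfig does not recognize.
{code:java}
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase;

public class PartitionDiscoveryExample {

    public static FlinkKafkaConsumer<String> createConsumer() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092");
        props.setProperty("group.id", "structed-log-group");
        // Check for newly added partitions every 30 seconds.
        props.setProperty(
                FlinkKafkaConsumerBase.KEY_PARTITION_DISCOVERY_INTERVAL_MILLIS, "30000");
        return new FlinkKafkaConsumer<>("STRUCTED_LOG", new SimpleStringSchema(), props);
    }
}
{code}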

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21045) Improve Usability of Pluggable Modules

2021-02-04 Thread Nicholas Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Jiang updated FLINK-21045:
---
Description: 
This improvement aims to
 # Simplify the module discovery mapping by module name. This will encourage 
users to create singletons of module instances.
 # Support changing the resolution order of modules in a flexible manner. This 
will introduce two methods {{#useModules}} and {{#listFullModules}} in both 
{{ModuleManager}} and {{TableEnvironment}}.
 # Support SQL syntax for {{LOAD/UNLOAD MODULE}}, {{USE MODULES}}, and {{SHOW 
[FULL] MODULES}} in both {{FlinkSqlParserImpl}} and {{SqlClient}}.
 # Update the documentation to keep users informed of this improvement.

Please refer to the [discussion 
thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLINK-21045-Support-load-module-and-unload-module-SQL-syntax-td48398.html]
 for more details.
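
For illustration, a hedged sketch of how the proposed methods and SQL syntax 
might be used (method and statement names are taken from the list above; exact 
signatures are still subject to the FLIP-68 discussion):
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ModuleSyntaxExample {

    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().build());

        // Proposed SQL syntax (item 3).
        tEnv.executeSql("LOAD MODULE hive");
        tEnv.executeSql("USE MODULES hive, core");    // change the resolution order
        tEnv.executeSql("SHOW FULL MODULES").print(); // list used and unused modules

        // Proposed Table API counterpart (item 2).
        tEnv.useModules("hive", "core");
    }
}
{code}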

  was:
At present, Flink SQL doesn't support the 'load module' and 'unload module' SQL 
syntax. It's necessary for users to be able to load and unload user-defined 
modules through the Table API or SQL Client.


SQL syntax has been proposed in FLIP-68: 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules


> Improve Usability of Pluggable Modules
> --
>
> Key: FLINK-21045
> URL: https://issues.apache.org/jira/browse/FLINK-21045
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.13.0
>Reporter: Nicholas Jiang
>Assignee: Jane Chan
>Priority: Major
> Fix For: 1.13.0
>
>
> This improvement aims to
>  # Simplify the module discovery mapping by module name. This will encourage 
> users to create singletons of module instances.
>  # Support changing the resolution order of modules in a flexible manner. 
> This will introduce two methods {{#useModules}} and {{#listFullModules}} in 
> both {{ModuleManager}} and {{TableEnvironment}}.
>  # Support SQL syntax for {{LOAD/UNLOAD MODULE}}, {{USE MODULES}}, and 
> {{SHOW [FULL] MODULES}} in both {{FlinkSqlParserImpl}} and {{SqlClient}}.
>  # Update the documentation to keep users informed of this improvement.
> Please refer to the [discussion 
> thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLINK-21045-Support-load-module-and-unload-module-SQL-syntax-td48398.html]
>  for more details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21045) Improve Usability of Pluggable Modules

2021-02-04 Thread Nicholas Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Jiang updated FLINK-21045:
---
Summary: Improve Usability of Pluggable Modules  (was: Support 'load 
module' and 'unload module' SQL syntax)

> Improve Usability of Pluggable Modules
> --
>
> Key: FLINK-21045
> URL: https://issues.apache.org/jira/browse/FLINK-21045
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.13.0
>Reporter: Nicholas Jiang
>Assignee: Jane Chan
>Priority: Major
> Fix For: 1.13.0
>
>
> At present, Flink SQL doesn't support the 'load module' and 'unload module' 
> SQL syntax. It's necessary for users to be able to load and unload 
> user-defined modules through the Table API or SQL Client.
> SQL syntax has been proposed in FLIP-68: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-21045) Support 'load module' and 'unload module' SQL syntax

2021-02-04 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279423#comment-17279423
 ] 

Jane Chan edited comment on FLINK-21045 at 2/5/21, 7:28 AM:


Hi [~nicholasjiang], "Extend Core Table System with Pluggable Modules" is the 
title of FLIP-68 and FLINK-14132.

How about
{panel:title=Improve Usability of Pluggable Modules}
*Description*

This improvement aims to
 # Simplify the module discovery mapping by module name. This will encourage 
users to create singletons of module instances.
 # Support changing the resolution order of modules in a flexible manner. This 
will introduce two methods {{#useModules}} and {{#listFullModules}} in both 
{{ModuleManager}} and {{TableEnvironment}}.
 # Support SQL syntax for {{LOAD/UNLOAD MODULE}}, {{USE MODULES}}, and {{SHOW 
[FULL] MODULES}} in both {{FlinkSqlParserImpl}} and {{SqlClient}}.
 # Update the documentation to keep users informed of this improvement.

Please refer to the [discussion 
thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLINK-21045-Support-load-module-and-unload-module-SQL-syntax-td48398.html]
 for more details.
{panel}
The proposed subtasks follow the description bullet list. What do you think?


was (Author: qingyue):
Hi [~nicholasjiang], "Extend Core Table System with Pluggable Modules" is the 
title of FLIP-68 and FLINK-14132.

How about
{panel:title=Improve Usability of Pluggable Modules}
*Description*

This improvement aims to
 # Simplify the module discovery mapping by module name. This will encourage 
users to create singleton of module instance.
 # Support changing the resolution order of modules in a flexible manner. This 
will introduce two methods {{#useModules}} and {{#listFullModules}} in both 
{{ModuleManager}} and {{TableEnvironment}}.
 # Support SQL syntax upon {{LOAD/UNLOAD MODULE}}, {{USE MODULES}}, and {{SHOW 
[FULL] MODULES in both {{SqlParserImpl}} and {{SqlClient}}.
 # Update the documentation to keep users informed with this improvement.

Please reach to the [discussion 
thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLINK-21045-Support-load-module-and-unload-module-SQL-syntax-td48398.html]
 for more details.
{panel}

The proposed subtasks follows the description bullet list. What do you think?

> Support 'load module' and 'unload module' SQL syntax
> 
>
> Key: FLINK-21045
> URL: https://issues.apache.org/jira/browse/FLINK-21045
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.13.0
>Reporter: Nicholas Jiang
>Assignee: Jane Chan
>Priority: Major
> Fix For: 1.13.0
>
>
> At present, Flink SQL doesn't support the 'load module' and 'unload module' 
> SQL syntax. It's necessary for users to be able to load and unload 
> user-defined modules through the Table API or SQL Client.
> SQL syntax has been proposed in FLIP-68: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21045) Support 'load module' and 'unload module' SQL syntax

2021-02-04 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279423#comment-17279423
 ] 

Jane Chan commented on FLINK-21045:
---

Hi [~nicholasjiang], "Extend Core Table System with Pluggable Modules" is the 
title of FLIP-68 and FLINK-14132.

How about
{panel:title=Improve Usability of Pluggable Modules}
*Description*

This improvement aims to
 # Simplify the module discovery mapping by module name. This will encourage 
users to create singleton of module instance.
 # Support changing the resolution order of modules in a flexible manner. This 
will introduce two methods {{#useModules}} and {{#listFullModules}} in both 
{{ModuleManager}} and {{TableEnvironment}}.
 # Support SQL syntax upon {{LOAD/UNLOAD MODULE}}, {{USE MODULES}}, and {{SHOW 
[FULL] MODULES in both {{SqlParserImpl}} and {{SqlClient}}.
 # Update the documentation to keep users informed with this improvement.

Please reach to the [discussion 
thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLINK-21045-Support-load-module-and-unload-module-SQL-syntax-td48398.html]
 for more details.
{panel}

The proposed subtasks follows the description bullet list. What do you think?

> Support 'load module' and 'unload module' SQL syntax
> 
>
> Key: FLINK-21045
> URL: https://issues.apache.org/jira/browse/FLINK-21045
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.13.0
>Reporter: Nicholas Jiang
>Assignee: Jane Chan
>Priority: Major
> Fix For: 1.13.0
>
>
> At present, Flink SQL doesn't support the 'load module' and 'unload module' 
> SQL syntax. It's necessary for users to be able to load and unload 
> user-defined modules through the Table API or SQL Client.
> SQL syntax has been proposed in FLIP-68: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-benchmarks] zhuzhurk commented on a change in pull request #7: [FLINK-20612][runtime] Add benchmarks for scheduler

2021-02-04 Thread GitBox


zhuzhurk commented on a change in pull request #7:
URL: https://github.com/apache/flink-benchmarks/pull/7#discussion_r570766969



##
File path: 
src/main/java/org/apache/flink/scheduler/benchmark/SchedulerBenchmarkUtils.java
##
@@ -0,0 +1,245 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.scheduler.benchmark;
+
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.ExecutionMode;
+import org.apache.flink.api.common.time.Deadline;
+import org.apache.flink.api.common.time.Time;
+import org.apache.flink.configuration.JobManagerOptions;
+import org.apache.flink.runtime.akka.AkkaUtils;
+import org.apache.flink.runtime.blob.VoidBlobWriter;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.AccessExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.DummyJobInformation;
+import org.apache.flink.runtime.executiongraph.ExecutionAttemptID;
+import org.apache.flink.runtime.executiongraph.ExecutionGraph;
+import org.apache.flink.runtime.executiongraph.ExecutionVertex;
+import org.apache.flink.runtime.executiongraph.JobInformation;
+import org.apache.flink.runtime.executiongraph.NoOpExecutionDeploymentListener;
+import org.apache.flink.runtime.executiongraph.TaskExecutionStateTransition;
+import org.apache.flink.runtime.executiongraph.failover.RestartAllStrategy;
+import 
org.apache.flink.runtime.executiongraph.failover.flip1.partitionrelease.RegionPartitionReleaseStrategy;
+import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
+import 
org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
+import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.jobgraph.JobVertex;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.jobgraph.ScheduleMode;
+import org.apache.flink.runtime.jobmaster.LogicalSlot;
+import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
+import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
+import org.apache.flink.runtime.scheduler.DefaultScheduler;
+import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
+import org.apache.flink.runtime.taskmanager.TaskExecutionState;
+import org.apache.flink.runtime.testingUtils.TestingUtils;
+import org.apache.flink.runtime.testtasks.NoOpInvokable;
+
+import java.io.IOException;
+import java.time.Duration;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.concurrent.TimeoutException;
+import java.util.function.Predicate;
+
+/**
+ * Utilities for runtime benchmarks.
+ */
+public class SchedulerBenchmarkUtils {
+
+   public static List<JobVertex> createDefaultJobVertices(
+   int parallelism,
+   DistributionPattern distributionPattern,
+   ResultPartitionType resultPartitionType) {
+
+   List<JobVertex> jobVertices = new ArrayList<>();
+
+   final JobVertex source = new JobVertex("source");
+   source.setInvokableClass(NoOpInvokable.class);
+   source.setParallelism(parallelism);
+   jobVertices.add(source);
+
+   final JobVertex sink = new JobVertex("sink");
+   sink.setInvokableClass(NoOpInvokable.class);
+   sink.setParallelism(parallelism);
+   jobVertices.add(sink);
+
+   sink.connectNewDataSetAsInput(source, distributionPattern, 
resultPartitionType);
+
+   return jobVertices;
+   }
+
+   public static JobGraph createJobGraph(
+   List<JobVertex> jobVertices,
+   ScheduleMode scheduleMode,
+   ExecutionMode executionMode) throws IOException {
+
+   final JobGraph jobGraph = new JobGraph(jobVertices.toArray(new 
JobVertex[0]));
+
+   jobGraph.setScheduleMode(scheduleMode);
+   ExecutionConfig executionConfig = new ExecutionConfig();
+   executionConfig.setExecut

[GitHub] [flink] flinkbot edited a comment on pull request #14876: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14876:
URL: https://github.com/apache/flink/pull/14876#issuecomment-773757019


   
   ## CI report:
   
   * 100fb1eeb7f94949efc85cc6921a5c653a56163d Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12974)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14877:
URL: https://github.com/apache/flink/pull/14877#issuecomment-773826928


   
   ## CI report:
   
   * eb7d6cb5de1138ee76b840b2b5a5dd850a66b407 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12977)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14842: [FLINK-21238][python] Support to close PythonFunctionFactory

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14842:
URL: https://github.com/apache/flink/pull/14842#issuecomment-772256636


   
   ## CI report:
   
   * 0b46d07633ccbd6774129dfa4f78424e6cacc533 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12936)
 
   * 5c977a53fe75be12cb949431d0e2e1fb8084c8ab UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14844: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14844:
URL: https://github.com/apache/flink/pull/14844#issuecomment-772295878


   
   ## CI report:
   
   * 54ed8511139561657f10d84dc25b189acbbf156c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12973)
 
   * e3c6f8e96db156989b8b749cb3f0431105997b7f Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12978)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14727: [FLINK-19945][Connectors / FileSystem]Support sink parallelism config…

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14727:
URL: https://github.com/apache/flink/pull/14727#issuecomment-765189922


   
   ## CI report:
   
   * f70acaad1bcc41000a101351e597c11372329e21 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12837)
 
   * a95a451df26b7d32d9a22b801cbc8aff7ceb719e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-21290) Support Projection push down for Window TVF

2021-02-04 Thread Jark Wu (Jira)
Jark Wu created FLINK-21290:
---

 Summary: Support Projection push down for Window TVF
 Key: FLINK-21290
 URL: https://issues.apache.org/jira/browse/FLINK-21290
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Jark Wu


{code:scala}
  @Test
  def testTumble_ProjectionPushDown(): Unit = {
// TODO: [b, c, e, proctime] are never used, should be pruned
val sql =
  """
|SELECT
|   a,
|   window_start,
|   window_end,
|   count(*),
|   sum(d)
|FROM TABLE(TUMBLE(TABLE MyTable, DESCRIPTOR(rowtime), INTERVAL '15' 
MINUTE))
|GROUP BY a, window_start, window_end
  """.stripMargin
util.verifyRelPlan(sql)
  }
{code}

For the above test, currently we get the following plan:

{code}
Calc(select=[a, window_start, window_end, EXPR$3, EXPR$4])
+- WindowAggregate(groupBy=[a], window=[TUMBLE(time_col=[rowtime], size=[15 
min])], select=[a, COUNT(*) AS EXPR$3, SUM(d) AS EXPR$4, start('w$) AS 
window_start, end('w$) AS window_end])
   +- Exchange(distribution=[hash[a]])
  +- Calc(select=[a, d, rowtime])
 +- WatermarkAssigner(rowtime=[rowtime], watermark=[-(rowtime, 
1000:INTERVAL SECOND)])
+- Calc(select=[a, b, c, d, e, rowtime, PROCTIME() AS proctime])
   +- TableSourceScan(table=[[default_catalog, default_database, 
MyTable]], fields=[a, b, c, d, e, rowtime])
{code}

It should be able to prune fields and get the following plan:

{code}
Calc(select=[a, window_start, window_end, EXPR$3, EXPR$4])
+- WindowAggregate(groupBy=[a], window=[TUMBLE(time_col=[rowtime], size=[15 
min])], select=[a, COUNT(*) AS EXPR$3, SUM(d) AS EXPR$4, start('w$) AS 
window_start, end('w$) AS window_end])
   +- Exchange(distribution=[hash[a]])
 +- WatermarkAssigner(rowtime=[rowtime], watermark=[-(rowtime, 
1000:INTERVAL SECOND)])
   +- TableSourceScan(table=[[default_catalog, default_database, 
MyTable]], fields=[a, d, rowtime])
{code}

The reason is that the Project and the WindowTableFunction are not transposed in 
the logical phase. 

{code}
LogicalAggregate(group=[{0, 1, 2}], EXPR$3=[COUNT()], EXPR$4=[SUM($3)])
+- LogicalProject(a=[$0], window_start=[$7], window_end=[$8], d=[$3])
   +- LogicalTableFunctionScan(invocation=[TUMBLE($6, DESCRIPTOR($5), 
90:INTERVAL MINUTE)], rowType=[RecordType(INTEGER a, BIGINT b, 
VARCHAR(2147483647) c, DECIMAL(10, 3) d, BIGINT e, TIME ATTRIBUTE(ROWTIME) 
rowtime, TIME ATTRIBUTE(PROCTIME) proctime, TIMESTAMP(3) window_start, 
TIMESTAMP(3) window_end, TIME ATTRIBUTE(ROWTIME) window_time)])
  +- LogicalProject(a=[$0], b=[$1], c=[$2], d=[$3], e=[$4], rowtime=[$5], 
proctime=[$6])
 +- LogicalWatermarkAssigner(rowtime=[rowtime], watermark=[-($5, 
1000:INTERVAL SECOND)])
+- LogicalProject(a=[$0], b=[$1], c=[$2], d=[$3], e=[$4], 
rowtime=[$5], proctime=[PROCTIME()])
   +- LogicalTableScan(table=[[default_catalog, default_database, 
MyTable]])
{code}
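
Until such a transpose rule exists, a hedged workaround sketch (assuming a 
{{TableEnvironment}} {{tEnv}} with {{MyTable}} registered; the view keeps the 
rowtime attribute because it is a simple projection) is to prune the unused 
fields manually before the window TVF:

{code:java}
// Pre-project [a, d, rowtime] so the projection sits directly on the scan.
tEnv.executeSql(
        "CREATE TEMPORARY VIEW PrunedTable AS SELECT a, d, rowtime FROM MyTable");

tEnv.executeSql(
        "SELECT a, window_start, window_end, COUNT(*), SUM(d) "
                + "FROM TABLE(TUMBLE(TABLE PrunedTable, DESCRIPTOR(rowtime), INTERVAL '15' MINUTE)) "
                + "GROUP BY a, window_start, window_end")
        .print();
{code}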



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-10520) Job save points REST API fails unless parameters are specified

2021-02-04 Thread Matthias (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279413#comment-17279413
 ] 

Matthias commented on FLINK-10520:
--

That's reasonable. I agree (y)

> Job save points REST API fails unless parameters are specified
> --
>
> Key: FLINK-10520
> URL: https://issues.apache.org/jira/browse/FLINK-10520
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / REST
>Affects Versions: 1.6.1
>Reporter: Elias Levy
>Assignee: Chesnay Schepler
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> The new REST API POST endpoint, {{/jobs/:jobid/savepoints}}, returns an error 
> unless the request includes a body with all parameters ({{target-directory}} 
> and {{cancel-job}}), even though the 
> [documentation|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/runtime/rest/handler/job/savepoints/SavepointHandlers.html]
>  suggests these are optional.
> If a POST request with no data is made, the response is a 400 status code 
> with the error message "Bad request received."
> If the POST request submits an empty JSON object ( {} ), the response is a 
> 400 status code with the error message "Request did not match expected format 
> SavepointTriggerRequestBody."  The same is true if only the 
> {{target-directory}} or {{cancel-job}} parameters are included.
> As the system is configured with a default savepoint location, there 
> shouldn't be a need to include the parameter in the request.
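
For reference, a minimal sketch of the current workaround: always POST an 
explicit body until the parameters become truly optional (the host, port, and 
savepoint directory below are placeholders):

{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class TriggerSavepoint {

    public static void main(String[] args) throws Exception {
        URL url = new URL("http://jobmanager:8081/jobs/" + args[0] + "/savepoints");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        // Spell out both SavepointTriggerRequestBody fields explicitly.
        String body = "{\"target-directory\": \"hdfs:///savepoints\", \"cancel-job\": false}";
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode());
    }
}
{code}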



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] XComp commented on pull request #14798: [FLINK-21187] Provide exception history for root causes

2021-02-04 Thread GitBox


XComp commented on pull request #14798:
URL: https://github.com/apache/flink/pull/14798#issuecomment-773838859


   The test failures reported by Azure seem to be related to 
[FLINK-21277](https://issues.apache.org/jira/browse/FLINK-21277). I'm going to 
rebase and squash the related commits together once @rkhachatryan gives his 
final ok.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-21208) pyarrow exception when using window with pandas udaf

2021-02-04 Thread Huang Xingbo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279400#comment-17279400
 ] 

Huang Xingbo commented on FLINK-21208:
--

 [~liuyufei] The serialization protocol provided by Arrow serializes the schema 
info into the header before transmitting data, which effectively makes the 
serializer stateful. Beam, however, requires your serializer to be stateless. 
Neither of them is wrong, and each has its own considerations, but when used in 
combination there will be problems, unless you transmit the schema with each 
Arrow batch.
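
A minimal sketch of the batch-level framing this implies, using the Arrow Java 
IPC API (an illustration of the idea only, not necessarily how the PR 
implements it): opening a fresh stream writer per batch re-emits the schema 
header, so every serialized batch is self-describing and the coder stays 
stateless from Beam's point of view.

{code:java}
import java.io.ByteArrayOutputStream;
import java.nio.channels.Channels;

import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.ipc.ArrowStreamWriter;

public class SchemaPerBatchSerializer {

    /** Serializes the current contents of {@code root} as schema + one record batch. */
    public static byte[] serialize(VectorSchemaRoot root) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ArrowStreamWriter writer =
                new ArrowStreamWriter(root, null, Channels.newChannel(out))) {
            writer.start();      // writes the schema message for this batch
            writer.writeBatch(); // writes the record batch itself
            writer.end();        // writes the end-of-stream marker
        }
        return out.toByteArray();
    }
}
{code}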


> pyarrow exception when using window with pandas udaf
> 
>
> Key: FLINK-21208
> URL: https://issues.apache.org/jira/browse/FLINK-21208
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.12.0, 1.12.1
>Reporter: YufeiLiu
>Priority: Major
>  Labels: pull-request-available
>
> I wrote a pyflink demo and executed it in a local environment. The logic is 
> simple: generate records and aggregate them in a 100s tumble window using a 
> pandas UDAF.
> But the job failed after several minutes. I don't think it's a resource 
> problem because the amount of data is small; here is the error trace.
> {code:java}
> Caused by: org.apache.flink.streaming.runtime.tasks.AsynchronousException: 
> Caught exception while processing timer.
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$StreamTaskAsyncExceptionHandler.handleAsyncException(StreamTask.java:1108)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.handleAsyncException(StreamTask.java:1082)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invokeProcessingTimeCallback(StreamTask.java:1213)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$null$17(StreamTask.java:1202)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:78)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:302)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:184)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539)
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: TimerException{java.lang.RuntimeException: Failed to close remote 
> bundle}
>   ... 11 more
> Caused by: java.lang.RuntimeException: Failed to close remote bundle
>   at 
> org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.finishBundle(BeamPythonFunctionRunner.java:371)
>   at 
> org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.flush(BeamPythonFunctionRunner.java:325)
>   at 
> org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.invokeFinishBundle(AbstractPythonFunctionOperator.java:291)
>   at 
> org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.checkInvokeFinishBundleByTime(AbstractPythonFunctionOperator.java:285)
>   at 
> org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.lambda$open$0(AbstractPythonFunctionOperator.java:134)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invokeProcessingTimeCallback(StreamTask.java:1211)
>   ... 10 more
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.RuntimeException: Error received from SDK harness for instruction 
> 3: Traceback (most recent call last):
>   File 
> "/Users/lyf/Library/Python/3.7/lib/python/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 253, in _execute
> response = task()
>   File 
> "/Users/lyf/Library/Python/3.7/lib/python/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 310, in 
> lambda: self.create_worker().do_instruction(request), request)
>   File 
> "/Users/lyf/Library/Python/3.7/lib/python/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 480, in do_instruction
> getattr(request, request_type), request.instruction_id)
>   File 
> "/Users/lyf/Library/Python/3.7/lib/python/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 515, in process_bundle
> bundle_processor.process_bundle(instruction_id))
>   File 
> "/Users/lyf/Library/Python/3.7/lib/python/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 978, in proces

[GitHub] [flink] flinkbot edited a comment on pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14877:
URL: https://github.com/apache/flink/pull/14877#issuecomment-773826928


   
   ## CI report:
   
   * eb7d6cb5de1138ee76b840b2b5a5dd850a66b407 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12977)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14844: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14844:
URL: https://github.com/apache/flink/pull/14844#issuecomment-772295878


   
   ## CI report:
   
   * 54ed8511139561657f10d84dc25b189acbbf156c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12973)
 
   * e3c6f8e96db156989b8b749cb3f0431105997b7f UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14834: [FLINK-21234][build system] Adding timeout to all tasks

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14834:
URL: https://github.com/apache/flink/pull/14834#issuecomment-771607546


   
   ## CI report:
   
   * 3c55d1b87076148f095c08a55319f1b898dc932d UNKNOWN
   * 5c8ac11faf9d6fbcad7550bec8abf6a6343347c3 UNKNOWN
   * 4e50c2f87a73c3a79c178a580c84c31924a72c4a Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12965)
 
   * 35e6f16b00a6ac645dc140b7289272074a82fd19 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API

2021-02-04 Thread GitBox


flinkbot commented on pull request #14877:
URL: https://github.com/apache/flink/pull/14877#issuecomment-773826928


   
   ## CI report:
   
   * eb7d6cb5de1138ee76b840b2b5a5dd850a66b407 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zhuzhurk commented on a change in pull request #14862: [FLINK-21273][coordination] Remove unused ExecutionVertexSchedulingRe…

2021-02-04 Thread GitBox


zhuzhurk commented on a change in pull request #14862:
URL: https://github.com/apache/flink/pull/14862#discussion_r570744967



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SlotSharingExecutionSlotAllocator.java
##
@@ -115,15 +115,11 @@
  *   returns the results.
  * 
  *
- * @param executionVertexSchedulingRequirements the requirements for 
scheduling the executions.
+ * @param executionVertexIds the requirements for scheduling the 
executions.

Review comment:
   the comment is outdated

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/ExecutionSlotAllocator.java
##
@@ -29,10 +29,10 @@
 /**
  * Allocate slots for the given executions.
  *
- * @param executionVertexSchedulingRequirements The requirements for 
scheduling the executions.
+ * @param executionVertexIds The requirements for scheduling the 
executions.

Review comment:
   "The requirements for scheduling the executions" -> "Execution vertices 
to allocate slots for"





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-21274) In per-job mode, if the HDFS write is slow (about 5 seconds), the Flink job archive file upload fails

2021-02-04 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279381#comment-17279381
 ] 

Yang Wang commented on FLINK-21274:
---

[~wjc920] I am afraid this ticket is related to FLINK-21008. The root cause of 
both of them is that {{ClusterEntrypoint#shutDownAsync}} is not fully executed, 
which leads to the residual HA data or the archiving failure for completed 
jobs.

 

However, I cannot agree with your fix. Simply calling 
{{getTerminationFuture().get()}} will block the execution of 
{{runClusterEntrypoint}}, and it still could not completely resolve the issue if 
we receive the SIGTERM very early. So I prefer the solution posted in 
FLINK-21008: triggering a {{ClusterEntrypoint.closeAsync()}} if we see a 
SIGTERM and then waiting for it to complete. WDYT?
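
A minimal sketch of that idea (assuming access to the entrypoint instance; the 
actual FLINK-21008 fix may be wired differently): register a shutdown hook that 
triggers the asynchronous close and blocks until it completes, so the JVM 
cannot exit in the middle of the shutdown sequence.

{code:java}
import java.util.concurrent.TimeUnit;

import org.apache.flink.runtime.entrypoint.ClusterEntrypoint;

public final class EntrypointShutdownHook {

    public static void install(ClusterEntrypoint entrypoint) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                // Run the full shutdown sequence and wait for it, so that job
                // archiving and HA cleanup finish before the process exits.
                entrypoint.closeAsync().get(30, TimeUnit.SECONDS);
            } catch (Exception e) {
                System.err.println("Cluster entrypoint shutdown incomplete: " + e);
            }
        }, "cluster-entrypoint-shutdown-hook"));
    }
}
{code}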

> In per-job mode, if the HDFS write is slow (about 5 seconds), the Flink job 
> archive file upload fails
> 
>
> Key: FLINK-21274
> URL: https://issues.apache.org/jira/browse/FLINK-21274
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.10.1
>Reporter: Jichao Wang
>Priority: Critical
> Attachments: 1.png, 2.png, 
> application_1612404624605_0010-JobManager.log
>
>
> This is a partial configuration of my Flink History service(flink-conf.yaml), 
> and this is also the configuration of my Flink client.
> {code:java}
> jobmanager.archive.fs.dir: hdfs://hdfsHACluster/flink/completed-jobs/
> historyserver.archive.fs.dir: hdfs://hdfsHACluster/flink/completed-jobs/
> {code}
> I used {color:#0747a6}flink run -m yarn-cluster 
> /cloud/service/flink/examples/batch/WordCount.jar{color} to submit a 
> WorkCount task to the Yarn cluster. Under normal circumstances, after the 
> task is completed, the flink job execution information will be archived to 
> HDFS, and then the JobManager process will exit. However, when this archiving 
> process takes a long time (maybe the HDFS write speed is slow), the task 
> archive file upload fails.
> The specific reproduction method is as follows:
> Modify the 
> {color:#0747a6}org.apache.flink.runtime.history.FsJobArchivist#archiveJob{color}
>  method to wait 5 seconds before actually writing to HDFS (simulating a slow 
> write speed scenario).
> {code:java}
> public static Path archiveJob(Path rootPath, JobID jobId, 
> Collection<ArchivedJson> jsonToArchive) 
> throws IOException {
> try {
> FileSystem fs = rootPath.getFileSystem();
> Path path = new Path(rootPath, jobId.toString());
> OutputStream out = fs.create(path, FileSystem.WriteMode.NO_OVERWRITE);
> try {
> LOG.info("===Wait 5 seconds..");
> Thread.sleep(5000);
> } catch (InterruptedException e) {
> e.printStackTrace();
> }
> try (JsonGenerator gen = jacksonFactory.createGenerator(out, 
> JsonEncoding.UTF8)) {
> ...  // Part of the code is omitted here
> } catch (Exception e) {
> fs.delete(path, false);
> throw e;
> }
> LOG.info("Job {} has been archived at {}.", jobId, path);
> return path;
> } catch (IOException e) {
> LOG.error("Failed to archive job.", e);
> throw e;
> }
> }
> {code}
> After I make the above changes to the code, I cannot find the corresponding 
> task on Flink's HistoryServer(Refer to Figure 1.png and Figure 2.png).
> Then I went to Yarn to browse the JobManager log (see attachment 
> application_1612404624605_0010-JobManager.log for log details), and found 
> that the following logs are missing in the task log:
> {code:java}
> INFO entrypoint.ClusterEntrypoint: Terminating cluster entrypoint process 
> YarnJobClusterEntrypoint with exit code 0.{code}
> Usually, if the task exits normally, a similar log will be printed before 
> executing {color:#0747a6}System.exit(returnCode){color}.
> No exception information appears in the JobManager log when this happens, 
> which indicates that the JobManager ran up to a certain point and then had 
> no user threads left in the process, causing the program to exit before 
> completing the normal shutdown procedure.
> Eventually I found out that multiple services (for example: ioExecutor, 
> metricRegistry, commonRpcService) were exited asynchronously in 
> {color:#0747a6}org.apache.flink.runtime.entrypoint.ClusterEntrypoint#stopClusterServices{color},
>  and multiple services would be exited in the shutdown() method of 
> metricRegistry (for example : executor), these exit actions are executed 
> asynchronously and in parallel. If ioExecutor or executor exits last, it will 
> cause the above problems.The root cause is that th

[GitHub] [flink] flinkbot edited a comment on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14629:
URL: https://github.com/apache/flink/pull/14629#issuecomment-759361463


   
   ## CI report:
   
   * 5426ab123aef03d4710c0fea1237fa014684f372 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12918)
 
   * 145f891faab2190d83a5ea35ac86a025605b43bd Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12976)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API

2021-02-04 Thread GitBox


flinkbot commented on pull request #14877:
URL: https://github.com/apache/flink/pull/14877#issuecomment-773816561


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit eb7d6cb5de1138ee76b840b2b5a5dd850a66b407 (Fri Feb 05 
06:12:30 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-21289) Application mode on kubernetes deployment support run PackagedProgram.main with pipeline.classpaths

2021-02-04 Thread Zhou Parker (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhou Parker updated FLINK-21289:

Description: 
I tried to submit a Flink job to a Kubernetes cluster in application mode, but 
it throws ClassNotFoundException because some dependency classes are not 
shipped in local:///opt/flink/usrlib/.jar.

This works on YARN, since we use the {color:#ff}-C 
[http://|http:///]{color} command line style so that the dependency classes 
can be loaded by the URLClassLoader.

But I found that this does not work on Kubernetes. When submitting to a 
Kubernetes cluster, -C is only shipped as the item "pipeline.classpaths" in 
configmap/flink-conf.yaml. Our main function runs, but loading the external 
dependency classes fails with ClassNotFoundException.

After reading the source code, *I found that the class loader launching the 
"main" entry of the user code does not add the pipeline.classpaths entries to 
its candidate URLs*, which is why the classes cannot be loaded. The source code 
also shows that shipping the dependency jars in the usrlib dir (the default 
userClassPath) solves the problem. But that fails for us: we prefer not to ship 
dependencies in the image at compile time, since the dependencies are only 
known dynamically at runtime.

I propose to improve the process and let the class loader consider usrlib as 
well as pipeline.classpaths; this is a quite small change. I tested the 
solution and it works quite well.
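
A hedged sketch of the proposed change (the helper below is hypothetical; the 
attached 0001-IMP.patch is the authoritative version): merge the 
pipeline.classpaths entries into the URLs from which the user-code class loader 
is built.

{code:java}
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.PipelineOptions;

public final class UserCodeClassLoaderFactory {

    public static ClassLoader create(List<URL> usrlibJars, Configuration config)
            throws Exception {
        List<URL> urls = new ArrayList<>(usrlibJars); // jars shipped in usrlib/
        // Also honor -C entries, which on Kubernetes only arrive as the
        // pipeline.classpaths entry in flink-conf.yaml.
        for (String classpath : config.get(PipelineOptions.CLASSPATHS)) {
            urls.add(new URL(classpath));
        }
        return new URLClassLoader(
                urls.toArray(new URL[0]),
                UserCodeClassLoaderFactory.class.getClassLoader());
    }
}
{code}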

 

 

  was:
I tried to submit a Flink job to a Kubernetes cluster in application mode, but 
it throws ClassNotFoundException because some dependency classes are not 
shipped in local:///opt/flink/usrlib/.jar.

This works on YARN, since we use the {color:#ff}-C 
[http://|http:///]{color} command line style so that the dependency classes 
can be loaded by the URLClassLoader.

But I found that this does not work on Kubernetes. When submitting to a 
Kubernetes cluster, -C is only shipped as the item "pipeline.classpaths" in 
configmap/flink-conf.yaml. Our main function runs, but loading the external 
dependency classes fails with ClassNotFoundException.

After reading the source code, *I found that the class loader launching the 
"main" entry of the user code does not consider adding pipeline.classpaths to 
its candidate URLs*. From the source code I also learned that we can ship the 
dependency jars in the usrlib dir to solve the problem. But that failed for me: 
we prefer not to ship dependencies in the image at compile time, since they are 
only known dynamically at runtime.

I proposed improving the process: let the class loader consider usrlib as well 
as pipeline.classpaths; this is a quite small change. I tested the solution and 
it works quite well.

 

 


> Application mode on kubernetes deployment support run PackagedProgram.main 
> with pipeline.classpaths
> ---
>
> Key: FLINK-21289
> URL: https://issues.apache.org/jira/browse/FLINK-21289
> Project: Flink
>  Issue Type: Improvement
>  Components: Client / Job Submission, Deployment / Kubernetes
>Affects Versions: 1.11.2, 1.12.1
> Environment: flink: 1.11
> kubernetes: 1.15
>  
>Reporter: Zhou Parker
>Priority: Minor
> Attachments: 0001-IMP.patch
>
>
> I tried to submit a Flink job to a Kubernetes cluster in application mode, 
> but it throws ClassNotFoundException because some dependency classes are not 
> shipped in local:///opt/flink/usrlib/.jar.
> This works on YARN, since we use the {color:#ff}-C [http://|http:///] 
> {color} command line style so that the dependency classes can be loaded by 
> the URLClassLoader.
> But I found that this does not work on Kubernetes. When submitting to a 
> Kubernetes cluster, -C is only shipped as the item "pipeline.classpaths" in 
> configmap/flink-conf.yaml. Our main function runs, but loading the external 
> dependency classes fails with ClassNotFoundException.
> After reading the source code, *I found that the class loader launching the 
> "main" entry of the user code does not add the pipeline.classpaths entries 
> to its candidate URLs*, which is why the classes cannot be loaded. Shipping 
> the dependency jars in the usrlib dir (the default userClassPath) solves the 
> problem, but our dependencies are dynamic and cannot be baked into the image.
> I propose to improve the process by also adding pipeline.classpaths to the 
> corresponding class loader; this is a quite small change. I tested it and it 
> solves the problem.
>  
>  
> English translation:
> I'm trying to submit flink job to kubernetes cluster with application mode, 
> but throw Cla

[jira] [Updated] (FLINK-21289) Application mode on kubernetes deployment support run PackagedProgram.main with pipeline.classpaths

2021-02-04 Thread Zhou Parker (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhou Parker updated FLINK-21289:

Description: 
I tried to submit a Flink job to a Kubernetes cluster in application mode, but 
it throws ClassNotFoundException because some dependency classes are not 
shipped in local:///opt/flink/usrlib/.jar.

This works on YARN, since we use the {color:#ff}-C 
[http://|http:///]{color} command line style so that the dependency classes 
can be loaded by the URLClassLoader.

But I found that this does not work on Kubernetes. When submitting to a 
Kubernetes cluster, -C is only shipped as the item "pipeline.classpaths" in 
configmap/flink-conf.yaml. Our main function runs, but loading the external 
dependency classes fails with ClassNotFoundException.

After reading the source code, *I found that the class loader launching the 
"main" entry of the user code does not consider adding pipeline.classpaths to 
its candidate URLs*. From the source code I also learned that we can ship the 
dependency jars in the usrlib dir to solve the problem. But that failed for me: 
we prefer not to ship dependencies in the image at compile time, since they are 
only known dynamically at runtime.

I proposed improving the process: let the class loader consider usrlib as well 
as pipeline.classpaths; this is a quite small change. I tested the solution and 
it works quite well.

 

 

  was:
I tried to submit a Flink job to a Kubernetes cluster in application mode, but the 
job's dependency jars are not all present under local:///opt/flink/usrlib/.jar, 
which leads to classes not being found.

This works on YARN because we use {color:#FF}-C [http://|http:///]{color} on the 
command line, so the dependencies can be loaded by the URLClassLoader.

But I found that when submitting to Kubernetes, -C only produces a 
pipeline.classpaths entry in configmap/flink-conf.yaml. Our main function runs, 
but loading external dependency classes fails with class-not-found errors.

Reading the source code, *I found that the class loader running the user code 
does not add the pipeline.classpaths entries to its candidate URLs*, which is why 
the classes cannot be loaded. From the source code I also learned that placing the 
dependency jars in the usrlib directory (the default userClassPath) solves the 
problem, but our dependencies may be dynamic and cannot all be baked into the 
image up front.

I propose improving this process by also adding pipeline.classpaths to the 
corresponding class loader. The change is small; I have tested it myself and it 
solves the problem completely.

 

I'm trying to submit a Flink job to a Kubernetes cluster in application mode, but 
it throws ClassNotFoundException when a dependency class is not shipped as 
local:///opt/flink/usrlib/.jar.

This works on YARN, since we use the {color:#FF}-C http://{color} command-line 
option so that dependency classes can be loaded by the URLClassLoader.

But I found that this does not work on Kubernetes. When submitting to a Kubernetes 
cluster, -C is only shipped as the "pipeline.classpaths" entry in 
configmap/flink-conf.yaml.

After reading the source code, *I found that the class loader launching the 
"main" entry of the user code does not add pipeline.classpaths to its 
candidate URLs*. From the source code I also learned that we can ship the 
dependency jars in the usrlib dir to solve the problem. But that fails for us: we 
_prefer_ not to ship dependencies in the image at compile time, since they 
are only known dynamically at runtime.

I propose improving the process by letting the class loader consider 
pipeline.classpaths as well as usrlib. This is quite a small change; I tested the 
solution and it works well.

 

 


> Application mode on kubernetes deployment support run PackagedProgram.main 
> with pipeline.classpaths
> ---
>
> Key: FLINK-21289
> URL: https://issues.apache.org/jira/browse/FLINK-21289
> Project: Flink
>  Issue Type: Improvement
>  Components: Client / Job Submission, Deployment / Kubernetes
>Affects Versions: 1.11.2, 1.12.1
> Environment: flink: 1.11
> kubernetes: 1.15
>  
>Reporter: Zhou Parker
>Priority: Minor
> Attachments: 0001-IMP.patch
>
>
> I tried to submit a Flink job to a Kubernetes cluster in application mode, but the 
> job's dependency jars are not all present under local:///opt/flink/usrlib/.jar, 
> which leads to classes not being found.
> This works on YARN because we use {color:#ff}-C [http://|http:///]{color} on 
> the command line, so the dependencies can be loaded by the URLClassLoader.
> But I found that when submitting to Kubernetes, -C only produces a 
> pipeline.classpaths entry in configmap/flink-conf.yaml. Our main function runs, 
> but loading external dependency classes fails with class-not-found errors.
> Reading the source code, *I found that the class loader running the user code 
> does not add the pipeline.classpaths entries to its candidate URLs*, which is why 
> the classes cannot be loaded. From the source code I also learned that placing 
> the dependency jars in the usrlib directory (the default userClassPath) solves 
> the problem, but our dependencies may be dynamic and cannot all be baked into the 
> image up front.
> I propose improving this process by also adding pipeline.classpaths to the 
> corresponding class loader. The change is small; I have tested it myself and it 
> solves the problem completely.
>  
>  
> English translation:
> I'm trying to submit a Flink job to a Kubernetes cluster in application mode, 
> but it throws ClassNotFoundException when a dependency class is not shipped as 
> local:///opt/flink/usrlib/.jar.

[jira] [Created] (FLINK-21289) Application mode on kubernetes deployment support run PackagedProgram.main with pipeline.classpaths

2021-02-04 Thread Zhou Parker (Jira)
Zhou Parker created FLINK-21289:
---

 Summary: Application mode on kubernetes deployment support run 
PackagedProgram.main with pipeline.classpaths
 Key: FLINK-21289
 URL: https://issues.apache.org/jira/browse/FLINK-21289
 Project: Flink
  Issue Type: Improvement
  Components: Client / Job Submission, Deployment / Kubernetes
Affects Versions: 1.12.1, 1.11.2
 Environment: flink: 1.11

kubernetes: 1.15

 
Reporter: Zhou Parker
 Attachments: 0001-IMP.patch

I tried to submit a Flink job to a Kubernetes cluster in application mode, but the 
job's dependency jars are not all present under local:///opt/flink/usrlib/.jar, 
which leads to classes not being found.

This works on YARN because we use {color:#FF}-C [http://|http:///]{color} on the 
command line, so the dependencies can be loaded by the URLClassLoader.

But I found that when submitting to Kubernetes, -C only produces a 
pipeline.classpaths entry in configmap/flink-conf.yaml. Our main function runs, 
but loading external dependency classes fails with class-not-found errors.

Reading the source code, *I found that the class loader running the user code 
does not add the pipeline.classpaths entries to its candidate URLs*, which is why 
the classes cannot be loaded. From the source code I also learned that placing the 
dependency jars in the usrlib directory (the default userClassPath) solves the 
problem, but our dependencies may be dynamic and cannot all be baked into the 
image up front.

I propose improving this process by also adding pipeline.classpaths to the 
corresponding class loader. The change is small; I have tested it myself and it 
solves the problem completely.

 

I'm trying to submit a Flink job to a Kubernetes cluster in application mode, but 
it throws ClassNotFoundException when a dependency class is not shipped as 
local:///opt/flink/usrlib/.jar.

This works on YARN, since we use the {color:#FF}-C http://{color} command-line 
option so that dependency classes can be loaded by the URLClassLoader.

But I found that this does not work on Kubernetes. When submitting to a Kubernetes 
cluster, -C is only shipped as the "pipeline.classpaths" entry in 
configmap/flink-conf.yaml.

After reading the source code, *I found that the class loader launching the 
"main" entry of the user code does not add pipeline.classpaths to its 
candidate URLs*. From the source code I also learned that we can ship the 
dependency jars in the usrlib dir to solve the problem. But that fails for us: we 
_prefer_ not to ship dependencies in the image at compile time, since they 
are only known dynamically at runtime.

I propose improving the process by letting the class loader consider 
pipeline.classpaths as well as usrlib. This is quite a small change; I tested the 
solution and it works well.

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] LadyForest commented on pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API

2021-02-04 Thread GitBox


LadyForest commented on pull request #14877:
URL: https://github.com/apache/flink/pull/14877#issuecomment-773813622


   @flinkbot run azure



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-21225) OverConvertRule does not consider distinct

2021-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-21225:
---
Labels: pull-request-available  (was: )

> OverConvertRule does not consider distinct
> --
>
> Key: FLINK-21225
> URL: https://issues.apache.org/jira/browse/FLINK-21225
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Timo Walther
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
>
> We don't support OVER window distinct aggregates in Table API. Even though 
> this is explicitly documented:
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/tableApi.html#aggregations
> {code}
> // Distinct aggregation on over window
> Table result = orders
> .window(Over
> .partitionBy($("a"))
> .orderBy($("rowtime"))
> .preceding(UNBOUNDED_RANGE)
> .as("w"))
> .select(
> $("a"), $("b").avg().distinct().over($("w")),
> $("b").max().over($("w")),
> $("b").min().over($("w"))
> );
> {code}
> The distinct flag is set to false in {{OverConvertRule}}.
> See also
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Unknown-call-expression-avg-amount-when-use-distinct-in-Flink-Thanks-td40905.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] LadyForest opened a new pull request #14877: [FLINK-21225][table] Support OVER window distinct aggregates in Table API

2021-02-04 Thread GitBox


LadyForest opened a new pull request #14877:
URL: https://github.com/apache/flink/pull/14877


   ## Contribution Checklist
   
   ## What is the purpose of the change
   
   * This pull request adds support for OVER window distinct aggregates in 
`OverConvertRule`. Currently, `OverConvertRule` always sets the distinct flag 
to false and cannot build a rex call for the child expression of an `agg` 
expression such as `distinct(avg/count/sum(field))`, which causes 
`ExpressionConverter#visit` to throw a `RuntimeException` saying "`Unknown call 
expression: avg(field)`".
   
   ## Brief change log
   
 - *The changes applied to `OverConvertRule` add an `isDistinct` flag by 
checking for the `DISTINCT` function definition and, if `isDistinct` is true, 
use the inner agg expression to generate the RexNode (see the sketch below).*
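
   A minimal sketch of that unwrapping (illustrative only; `call` and the 
wrapper class are assumptions, not the actual `OverConvertRule` code):
   
   ```java
   import org.apache.flink.table.expressions.CallExpression;
   import org.apache.flink.table.functions.BuiltInFunctionDefinitions;
   
   final class DistinctUnwrapSketch {
       /** Returns the aggregate call to convert, unwrapping distinct(agg(...)). */
       static CallExpression unwrap(CallExpression call) {
           boolean isDistinct =
                   call.getFunctionDefinition() == BuiltInFunctionDefinitions.DISTINCT;
           if (isDistinct) {
               // distinct(...) wraps exactly one child: the inner avg/count/sum call.
               return (CallExpression) call.getChildren().get(0);
           }
           return call;
       }
   }
   ```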
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
 - *`OverAggregateTest#testRowTimeBoundedDistinctWithPartitionedRangeOver`, 
`OverAggregateTest#testRowTimeUnboundedDistinctWithPartitionedRangeOver`, 
`OverAggregateTest#testRowTimeBoundedDistinctWithPartitionedRowsOver` and 
`OverAggregateTest#testRowTimeUnboundedDistinctWithPartitionedRowsOver` are to 
verify the optimized plan.*
 - 
*`OverAggregateITCase#testRowTimeBoundedDistinctWithPartitionedRangeOver`, 
`OverAggregateITCase#testRowTimeUnboundedDistinctWithPartitionedRangeOver`, 
`OverAggregateITCase#testRowTimeBoundedDistinctWithPartitionedRowsOver` and 
`OverAggregateITCase#testRowTimeUnboundedDistinctWithPartitionedRowsOver` are 
to verify the execution result.*
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / 
don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xintongsong commented on a change in pull request #14869: [FLINK-21047][coordination] Fix the incorrect registered/free resourc…

2021-02-04 Thread GitBox


xintongsong commented on a change in pull request #14869:
URL: https://github.com/apache/flink/pull/14869#discussion_r570737663



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/TaskExecutorManager.java
##
@@ -402,27 +407,31 @@ private void releaseIdleTaskExecutor(InstanceID 
timedOutTaskManagerId) {
 // 
-
 
 public ResourceProfile getTotalRegisteredResources() {
-return getResourceFromNumSlots(getNumberRegisteredSlots());
+return taskManagerRegistrations.values().stream()
+.map(TaskManagerRegistration::getTotalResource)
+.reduce(ResourceProfile.ZERO, ResourceProfile::merge);
 }
 
 public ResourceProfile getTotalRegisteredResourcesOf(InstanceID 
instanceID) {
-return getResourceFromNumSlots(getNumberRegisteredSlotsOf(instanceID));
+return Optional.ofNullable(taskManagerRegistrations.get(instanceID))
+.map(TaskManagerRegistration::getTotalResource)
+.orElse(ResourceProfile.ZERO);
 }
 
 public ResourceProfile getTotalFreeResources() {
-return getResourceFromNumSlots(getNumberFreeSlots());
+return taskManagerRegistrations.keySet().stream()
+.map(this::getTotalFreeResourcesOf)
+.reduce(ResourceProfile.ZERO, ResourceProfile::merge);

Review comment:
   Same here.

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/SlotManagerImpl.java
##
@@ -230,30 +230,34 @@ public int getNumberFreeSlotsOf(InstanceID instanceId) {
 
 @Override
 public ResourceProfile getRegisteredResource() {
-return getResourceFromNumSlots(getNumberRegisteredSlots());
+return taskManagerRegistrations.values().stream()
+.map(TaskManagerRegistration::getTotalResource)
+.reduce(ResourceProfile.ZERO, ResourceProfile::merge);
 }
 
 @Override
 public ResourceProfile getRegisteredResourceOf(InstanceID instanceID) {
-return getResourceFromNumSlots(getNumberRegisteredSlotsOf(instanceID));
+return Optional.ofNullable(taskManagerRegistrations.get(instanceID))
+.map(TaskManagerRegistration::getTotalResource)
+.orElse(ResourceProfile.ZERO);
 }
 
 @Override
 public ResourceProfile getFreeResource() {
-return getResourceFromNumSlots(getNumberFreeSlots());
+return taskManagerRegistrations.keySet().stream()
+.map(this::getFreeResourceOf)
+.reduce(ResourceProfile.ZERO, ResourceProfile::merge);

Review comment:
   Better to iterate on the entry set than to iterate on the key set and 
get the entry with the key.
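
   A standalone illustration of the point, with simplified stand-in types 
(not the `SlotManager` code):
   
   ```java
   import java.util.Map;
   
   final class EntrySetDemo {
       // keySet + get() pays a second hash lookup for every key ...
       static int sumViaKeySet(Map<String, Integer> registrations) {
           int total = 0;
           for (String key : registrations.keySet()) {
               total += registrations.get(key); // extra lookup per iteration
           }
           return total;
       }
   
       // ... while entrySet already has each value at hand.
       static int sumViaEntrySet(Map<String, Integer> registrations) {
           int total = 0;
           for (Map.Entry<String, Integer> e : registrations.entrySet()) {
               total += e.getValue();
           }
           return total;
       }
   }
   ```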





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] curcur commented on pull request #14838: [WIP][state] Add StateChangelog API

2021-02-04 Thread GitBox


curcur commented on pull request #14838:
URL: https://github.com/apache/flink/pull/14838#issuecomment-773810436


   > 1. > But this is where/the name I planned to put the WrappedStateBackend.
   > 
   > In my opinion, both backend and changelog should reside in the same 
module, at least for now.
   
   I have no objection to putting them together.
   
   > 
   > 1. I fixed the `StateChangeFormat` issue
   
   Thanks for fixing this!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14872: [FLINK-21162] [Blink Planner]use "IF(col = '' OR col IS NULL, 'a', 'b')",when co…

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14872:
URL: https://github.com/apache/flink/pull/14872#issuecomment-773303946


   
   ## CI report:
   
   * f6390443428d1f56a41251ffa412bf8eab820d86 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12967)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14838: [WIP][state] Add StateChangelog API

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14838:
URL: https://github.com/apache/flink/pull/14838#issuecomment-772060058


   
   ## CI report:
   
   * c1c536632abeeef101cd3a89edcdf63f1e950eff Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12966)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14629:
URL: https://github.com/apache/flink/pull/14629#issuecomment-759361463


   
   ## CI report:
   
   * 5426ab123aef03d4710c0fea1237fa014684f372 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12918)
 
   * 145f891faab2190d83a5ea35ac86a025605b43bd UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-20563) Support built-in functions for Hive versions prior to 1.2.0

2021-02-04 Thread Rui Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated FLINK-20563:
---
Fix Version/s: 1.13.0

> Support built-in functions for Hive versions prior to 1.2.0
> ---
>
> Key: FLINK-20563
> URL: https://issues.apache.org/jira/browse/FLINK-20563
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Reporter: Rui Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> Currently Hive built-in functions are supported only for Hive-1.2.0 and 
> later. We should investigate how to lift this limitation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-17061) Unset process/flink memory size from configuration once dynamic worker resource is activated.

2021-02-04 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song reassigned FLINK-17061:


Assignee: Xintong Song

> Unset process/flink memory size from configuration once dynamic worker 
> resource is activated.
> -
>
> Key: FLINK-17061
> URL: https://issues.apache.org/jira/browse/FLINK-17061
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Configuration, Runtime / Coordination
>Affects Versions: 1.11.0
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
>
> With FLINK-14106, memory of a TaskExecutor is decided in two steps on active 
> resource managers.
> - {{SlotManager}} decides {{WorkerResourceSpec}}, including memory used by 
> Flink tasks: task heap, task off-heap, network and managed memory.
> - {{ResourceManager}} derives {{TaskExecutorProcessSpec}} from 
> {{WorkerResourceSpec}} and the configuration, deciding sizes of memory used 
> by Flink framework and JVM: framework heap, framework off-heap, jvm metaspace 
> and jvm overhead.
> This works fine for now, because both {{WorkerResourceSpec}} and 
> {{TaskExecutorProcessSpec}} are derived from the same configurations. 
> However, it might cause problems if we later have new {{SlotManager}} 
> implementations that decide {{WorkerResourceSpec}} dynamically. In such 
> cases, the process/flink memory sizes in the configuration should be ignored, 
> or they may easily lead to configuration conflicts.
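
A minimal sketch of the unsetting described above ({{baseConfiguration}} and the 
surrounding wiring are assumptions, not the actual implementation; the option 
names are the real ones from {{TaskManagerOptions}}):

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.TaskManagerOptions;

final class DynamicWorkerConfigSketch {
    // Once the SlotManager decides WorkerResourceSpec dynamically, drop the
    // statically configured total memory sizes so the derived
    // TaskExecutorProcessSpec cannot conflict with them.
    static Configuration unsetStaticTotalMemory(Configuration baseConfiguration) {
        Configuration effectiveConf = new Configuration(baseConfiguration);
        effectiveConf.removeConfig(TaskManagerOptions.TOTAL_PROCESS_MEMORY);
        effectiveConf.removeConfig(TaskManagerOptions.TOTAL_FLINK_MEMORY);
        return effectiveConf;
    }
}
{code}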



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21288) Support redundant task managers for fine grained resource management

2021-02-04 Thread Xintong Song (Jira)
Xintong Song created FLINK-21288:


 Summary: Support redundant task managers for fine grained resource 
management
 Key: FLINK-21288
 URL: https://issues.apache.org/jira/browse/FLINK-21288
 Project: Flink
  Issue Type: Sub-task
Reporter: Xintong Song






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-20970) DECIMAL(10, 0) can not be GROUP BY key.

2021-02-04 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu closed FLINK-20970.
---
Resolution: Not A Problem

As we only maintain 3 major releases, and this has been fixed in 1.11 and 
1.12, I will close this one.

> DECIMAL(10, 0) can not be GROUP BY key.
> ---
>
> Key: FLINK-20970
> URL: https://issues.apache.org/jira/browse/FLINK-20970
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.10.1
>Reporter: Wong Mulan
>Priority: Major
> Attachments: image-2021-01-14-17-06-28-648.png
>
>
> If a value whose type is not DECIMAL(38, 18) is used as the GROUP BY key, the 
> result will be -1.
> So, can only DECIMAL(38, 18) be used as a GROUP BY key?
> Whatever the value is, -1 is returned.
>  !image-2021-01-14-17-06-28-648.png|thumbnail! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wangyang0918 commented on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration

2021-02-04 Thread GitBox


wangyang0918 commented on pull request #14629:
URL: https://github.com/apache/flink/pull/14629#issuecomment-773801799


   @MiLk Thanks for your feedback.
   
   I have added a new commit for the documentation. Now it is ready for review.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14740:
URL: https://github.com/apache/flink/pull/14740#issuecomment-766340750


   
   ## CI report:
   
   * c7e6b28b249f85cf52740d5201a769e0982a60aa UNKNOWN
   * bebd298009b12a9d5ac6518902f5534f8e00ff32 UNKNOWN
   * 743d1592db1b1f62ef6e2b208517438e2fab3a66 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12849)
 
   * 0a5a79498ab93134eccbe025489ede9aae233392 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12975)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xintongsong commented on a change in pull request #14865: [FLINK-21270] Generate the slot request respect to the resource specification of SlotSharingGroup if present

2021-02-04 Thread GitBox


xintongsong commented on a change in pull request #14865:
URL: https://github.com/apache/flink/pull/14865#discussion_r570703453



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/ExecutionVertexSchedulingRequirementsMapper.java
##
@@ -52,17 +52,22 @@ public static ExecutionVertexSchedulingRequirements from(
 /**
  * Get resource profile of the physical slot to allocate a logical slot in 
for the given vertex.
  * If the vertex is in a slot sharing group, the physical slot resource 
profile should be the
- * resource profile of the slot sharing group. Otherwise it should be the 
resource profile of
- * the vertex itself since the physical slot would be used by this vertex 
only in this case.
+ * resource profile of the slot sharing group if present. Otherwise it 
should be the resource
+ * profile of the vertex itself since the physical slot would be used by 
this vertex only in
+ * this case.
  *
  * @return resource profile of the physical slot to allocate a logical 
slot for the given vertex
  */
 public static ResourceProfile getPhysicalSlotResourceProfile(
 final ExecutionVertex executionVertex) {
 final SlotSharingGroup slotSharingGroup =
 executionVertex.getJobVertex().getSlotSharingGroup();
-return ResourceProfile.fromResourceSpec(
-slotSharingGroup.getResourceSpec(), MemorySize.ZERO);
+if 
(slotSharingGroup.getResourceProfile().equals(ResourceProfile.UNKNOWN)) {
+return ResourceProfile.fromResourceSpec(
+slotSharingGroup.getResourceSpec(), MemorySize.ZERO);

Review comment:
   I wonder if `SlotSharingGroup#resourceSpec` can be removed.
   
   There are two code paths for aggregating operator resources into slot 
resources.
   - `SlotSharingGroup#resourceSpec`
   - `SlotSharingExecutionSlotAllocator#getPhysicalSlotResourceProfile`
   IIUC, after introducing deterministic slot sharing groups, we no longer need 
the first code path.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] leonardBang commented on a change in pull request #14684: [FLINK-20460][Connector-HBase] Support async lookup for HBase connector

2021-02-04 Thread GitBox


leonardBang commented on a change in pull request #14684:
URL: https://github.com/apache/flink/pull/14684#discussion_r570712863



##
File path: 
flink-connectors/flink-connector-hbase-2.2/src/main/java/org/apache/flink/connector/hbase2/source/HBaseDynamicTableSource.java
##
@@ -35,17 +42,45 @@ public HBaseDynamicTableSource(
 Configuration conf,
 String tableName,
 HBaseTableSchema hbaseSchema,
-String nullStringLiteral) {
-super(conf, tableName, hbaseSchema, nullStringLiteral);
+String nullStringLiteral,
+HBaseLookupOptions lookupOptions) {
+super(conf, tableName, hbaseSchema, nullStringLiteral, lookupOptions);
+}
+
+@Override
+public LookupRuntimeProvider getLookupRuntimeProvider(LookupContext 
context) {
+checkArgument(context.getKeys().length == 1 && 
context.getKeys()[0].length == 1,
+"Currently, HBase table can only be lookup by single rowkey.");
+checkArgument(
+hbaseSchema.getRowKeyName().isPresent(),
+"HBase schema must have a row key when used in lookup mode.");
+checkArgument(
+hbaseSchema
+.convertsToTableSchema()
+.getTableColumn(context.getKeys()[0][0])
+.filter(f -> 
f.getName().equals(hbaseSchema.getRowKeyName().get()))
+.isPresent(),
+"Currently, HBase table only supports lookup by rowkey field.");
+boolean isAsync = lookupOptions.getLookupAsync();
+if (isAsync){

Review comment:
   ```suggestion
   if (lookupOptions.getLookupAsync()){
   ```

##
File path: 
flink-connectors/flink-connector-hbase-2.2/src/main/java/org/apache/flink/connector/hbase2/source/HBaseRowDataAsyncLookupFunction.java
##
@@ -0,0 +1,232 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.hbase2.source;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.connector.hbase.options.HBaseLookupOptions;
+import org.apache.flink.connector.hbase.util.HBaseConfigurationUtil;
+import org.apache.flink.connector.hbase.util.HBaseSerde;
+import org.apache.flink.connector.hbase.util.HBaseTableSchema;
+import org.apache.flink.metrics.Gauge;
+import org.apache.flink.runtime.concurrent.Executors;
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.functions.AsyncTableFunction;
+import org.apache.flink.table.functions.FunctionContext;
+import org.apache.flink.util.StringUtils;
+
+import org.apache.flink.shaded.guava18.com.google.common.cache.Cache;
+import org.apache.flink.shaded.guava18.com.google.common.cache.CacheBuilder;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.TableNotFoundException;
+import org.apache.hadoop.hbase.client.AsyncConnection;
+import org.apache.hadoop.hbase.client.AsyncTable;
+import org.apache.hadoop.hbase.client.ConnectionFactory;
+import org.apache.hadoop.hbase.client.Get;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.client.ScanResultConsumer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * The HBaseRowDataAsyncLookupFunction is a standard user-defined table 
function, it can be used in
+ * tableAPI and also useful for temporal table join plan in SQL. It looks up 
the result as {@link
+ * RowData}.
+ */
+@Internal
+public class HBaseRowDataAsyncLookupFunction extends AsyncTableFunction<RowData> {
+
+private static final Logger LOG = 
LoggerFactory.getLogger(HBaseRowDataAsyncLookupFunction.class);
+
+

Review comment:
   redundant blank lines

##
File path: 
flink-connectors/flink-connector-hbase-base/src/main/ja

[GitHub] [flink] gaoyunhaii edited a comment on pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks

2021-02-04 Thread GitBox


gaoyunhaii edited a comment on pull request #14740:
URL: https://github.com/apache/flink/pull/14740#issuecomment-773782948


   Hi Roman @rkhachatryan, many thanks for the review! I have updated the PR via 
https://github.com/apache/flink/pull/14740/commits/0a5a79498ab93134eccbe025489ede9aae233392
 according to the comments.
   
   The current PR indeed does not cover the case where a task finishes 
concurrently while the JM tries to trigger it; 
[FLINK-21246](https://issues.apache.org/jira/browse/FLINK-21246) will address 
this. I also think that in this case the checkpoint would be declined with a 
reason that does not cause a job failure.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14844: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14844:
URL: https://github.com/apache/flink/pull/14844#issuecomment-772295878


   
   ## CI report:
   
   * 54ed8511139561657f10d84dc25b189acbbf156c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12973)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14740:
URL: https://github.com/apache/flink/pull/14740#issuecomment-766340750


   
   ## CI report:
   
   * c7e6b28b249f85cf52740d5201a769e0982a60aa UNKNOWN
   * bebd298009b12a9d5ac6518902f5534f8e00ff32 UNKNOWN
   * 743d1592db1b1f62ef6e2b208517438e2fab3a66 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12849)
 
   * 0a5a79498ab93134eccbe025489ede9aae233392 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] gaoyunhaii commented on pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks

2021-02-04 Thread GitBox


gaoyunhaii commented on pull request #14740:
URL: https://github.com/apache/flink/pull/14740#issuecomment-773782948


   Hi Roman @rkhachatryan, many thanks for the review! I have updated the PR via 
https://github.com/apache/flink/pull/14740/commits/0a5a79498ab93134eccbe025489ede9aae233392
 according to the comments.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks

2021-02-04 Thread GitBox


gaoyunhaii commented on a change in pull request #14740:
URL: https://github.com/apache/flink/pull/14740#discussion_r570714613



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
##
@@ -611,6 +590,10 @@ public long getNumberOfRestarts() {
 return numberOfRestartsCounter.getCount();
 }
 
+public int getVerticesFinished() {

Review comment:
   I also think `getFinishedVertices` would be more natural, but my slight 
concern is that the underlying variable is named `verticesFinished`; should 
we keep this method as a getter for that variable?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks

2021-02-04 Thread GitBox


gaoyunhaii commented on a change in pull request #14740:
URL: https://github.com/apache/flink/pull/14740#discussion_r570713863



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointBriefCalculator.java
##
@@ -0,0 +1,492 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.checkpoint;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.Execution;
+import org.apache.flink.runtime.executiongraph.ExecutionAttemptID;
+import org.apache.flink.runtime.executiongraph.ExecutionEdge;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.ExecutionVertex;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobEdge;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.ListIterator;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+import static org.apache.flink.util.Preconditions.checkState;
+
+/** Computes the tasks to trigger, wait or commit for each checkpoint. */
+public class CheckpointBriefCalculator {
+private static final Logger LOG = 
LoggerFactory.getLogger(CheckpointBriefCalculator.class);
+
+private final JobID jobId;
+
+private final CheckpointBriefCalculatorContext context;
+
+private final List<ExecutionJobVertex> jobVerticesInTopologyOrder = new ArrayList<>();
+
+private final List<ExecutionVertex> allTasks = new ArrayList<>();
+
+private final List<ExecutionVertex> sourceTasks = new ArrayList<>();
+
+public CheckpointBriefCalculator(
+JobID jobId,
+CheckpointBriefCalculatorContext context,
+Iterable<ExecutionJobVertex> jobVerticesInTopologyOrderIterable) {
+
+this.jobId = checkNotNull(jobId);
+this.context = checkNotNull(context);
+
+checkNotNull(jobVerticesInTopologyOrderIterable);
+jobVerticesInTopologyOrderIterable.forEach(
+jobVertex -> {
+jobVerticesInTopologyOrder.add(jobVertex);
+
allTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+
+if (jobVertex.getJobVertex().isInputVertex()) {
+
sourceTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+}
+});
+}
+
+public CompletableFuture<CheckpointBrief> calculateCheckpointBrief() {
+CompletableFuture<CheckpointBrief> resultFuture = new CompletableFuture<>();
+
+context.getMainExecutor()
+.execute(
+() -> {
+try {
+if (!isAllExecutionAttemptsAreInitiated()) {
+throw new CheckpointException(
+
CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING);
+}
+
+CheckpointBrief result;
+if (!context.hasFinishedTasks()) {
+result = calculateWithAllTasksRunning();
+} else {
+result = calculateAfterTasksFinished();
+}
+
+if 
(!isAllExecutionsToTriggerStarted(result.getTasksToTrigger())) {
+throw new CheckpointException(
+
CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING);
+  

[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks

2021-02-04 Thread GitBox


gaoyunhaii commented on a change in pull request #14740:
URL: https://github.com/apache/flink/pull/14740#discussion_r570712494



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
##
@@ -2132,23 +2123,41 @@ public boolean isForce() {
 }
 
 private void reportToStatsTracker(
-PendingCheckpoint checkpoint, Map<ExecutionAttemptID, ExecutionVertex> tasks) {
+PendingCheckpoint checkpoint,
+Map<ExecutionAttemptID, ExecutionVertex> tasks,
+List<Execution> finishedTasks) {
 if (statsTracker == null) {
 return;
 }
 Map<JobVertexID, Integer> vertices =
-tasks.values().stream()
+Stream.concat(
+tasks.values().stream(),
+
finishedTasks.stream().map(Execution::getVertex))
 .map(ExecutionVertex::getJobVertex)
 .distinct()
 .collect(
 toMap(
 ExecutionJobVertex::getJobVertexId,
 ExecutionJobVertex::getParallelism));
-checkpoint.setStatsCallback(
+
+PendingCheckpointStats pendingCheckpointStats =
 statsTracker.reportPendingCheckpoint(
 checkpoint.getCheckpointID(),
 checkpoint.getCheckpointTimestamp(),
 checkpoint.getProps(),
-vertices));
+vertices);
+checkpoint.setStatsCallback(pendingCheckpointStats);
+
+reportFinishedTasks(pendingCheckpointStats, finishedTasks);
+}
+
+private void reportFinishedTasks(
+PendingCheckpointStats pendingCheckpointStats, List<Execution> finishedTasks) {
+long now = System.currentTimeMillis();
+finishedTasks.forEach(
+execution ->
+pendingCheckpointStats.reportSubtaskStats(

Review comment:
   Yes, currently it would report 0 for the metrics of finished tasks. 
   
   I think this is desirable: if we did not report these tasks, users could not 
easily tell which tasks were already finished when the checkpoint was triggered, 
and thus could not distinguish finished tasks from tasks that failed to report a 
snapshot for some other reason. We may also consider adding, in a separate issue, 
another flag indicating whether a task was finished when the checkpoint was 
triggered.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks

2021-02-04 Thread GitBox


gaoyunhaii commented on a change in pull request #14740:
URL: https://github.com/apache/flink/pull/14740#discussion_r570711043



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointBriefCalculator.java
##
@@ -0,0 +1,492 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.checkpoint;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.Execution;
+import org.apache.flink.runtime.executiongraph.ExecutionAttemptID;
+import org.apache.flink.runtime.executiongraph.ExecutionEdge;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.ExecutionVertex;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobEdge;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.ListIterator;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+import static org.apache.flink.util.Preconditions.checkState;
+
+/** Computes the tasks to trigger, wait or commit for each checkpoint. */
+public class CheckpointBriefCalculator {
+private static final Logger LOG = 
LoggerFactory.getLogger(CheckpointBriefCalculator.class);
+
+private final JobID jobId;
+
+private final CheckpointBriefCalculatorContext context;
+
+private final List<ExecutionJobVertex> jobVerticesInTopologyOrder = new ArrayList<>();
+
+private final List<ExecutionVertex> allTasks = new ArrayList<>();
+
+private final List<ExecutionVertex> sourceTasks = new ArrayList<>();
+
+public CheckpointBriefCalculator(
+JobID jobId,
+CheckpointBriefCalculatorContext context,
+Iterable<ExecutionJobVertex> jobVerticesInTopologyOrderIterable) {
+
+this.jobId = checkNotNull(jobId);
+this.context = checkNotNull(context);
+
+checkNotNull(jobVerticesInTopologyOrderIterable);
+jobVerticesInTopologyOrderIterable.forEach(
+jobVertex -> {
+jobVerticesInTopologyOrder.add(jobVertex);
+
allTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+
+if (jobVertex.getJobVertex().isInputVertex()) {
+
sourceTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+}
+});
+}
+
+public CompletableFuture<CheckpointBrief> calculateCheckpointBrief() {
+CompletableFuture<CheckpointBrief> resultFuture = new CompletableFuture<>();
+
+context.getMainExecutor()
+.execute(
+() -> {
+try {
+if (!isAllExecutionAttemptsAreInitiated()) {
+throw new CheckpointException(
+
CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING);
+}
+
+CheckpointBrief result;
+if (!context.hasFinishedTasks()) {
+result = calculateWithAllTasksRunning();
+} else {
+result = calculateAfterTasksFinished();
+}
+
+if 
(!isAllExecutionsToTriggerStarted(result.getTasksToTrigger())) {
+throw new CheckpointException(
+
CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING);
+  

[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks

2021-02-04 Thread GitBox


gaoyunhaii commented on a change in pull request #14740:
URL: https://github.com/apache/flink/pull/14740#discussion_r570704750



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointBriefCalculator.java
##
@@ -0,0 +1,492 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.checkpoint;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.Execution;
+import org.apache.flink.runtime.executiongraph.ExecutionAttemptID;
+import org.apache.flink.runtime.executiongraph.ExecutionEdge;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.ExecutionVertex;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobEdge;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.ListIterator;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+import static org.apache.flink.util.Preconditions.checkState;
+
+/** Computes the tasks to trigger, wait or commit for each checkpoint. */
+public class CheckpointBriefCalculator {
+private static final Logger LOG = 
LoggerFactory.getLogger(CheckpointBriefCalculator.class);
+
+private final JobID jobId;
+
+private final CheckpointBriefCalculatorContext context;
+
+private final List<ExecutionJobVertex> jobVerticesInTopologyOrder = new ArrayList<>();
+
+private final List<ExecutionVertex> allTasks = new ArrayList<>();
+
+private final List<ExecutionVertex> sourceTasks = new ArrayList<>();
+
+public CheckpointBriefCalculator(
+JobID jobId,
+CheckpointBriefCalculatorContext context,
+Iterable<ExecutionJobVertex> jobVerticesInTopologyOrderIterable) {
+
+this.jobId = checkNotNull(jobId);
+this.context = checkNotNull(context);
+
+checkNotNull(jobVerticesInTopologyOrderIterable);
+jobVerticesInTopologyOrderIterable.forEach(
+jobVertex -> {
+jobVerticesInTopologyOrder.add(jobVertex);
+
allTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+
+if (jobVertex.getJobVertex().isInputVertex()) {
+
sourceTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+}
+});
+}
+
+public CompletableFuture<CheckpointBrief> calculateCheckpointBrief() {
+CompletableFuture<CheckpointBrief> resultFuture = new CompletableFuture<>();
+
+context.getMainExecutor()
+.execute(
+() -> {
+try {
+if (!isAllExecutionAttemptsAreInitiated()) {
+throw new CheckpointException(
+
CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING);
+}
+
+CheckpointBrief result;
+if (!context.hasFinishedTasks()) {
+result = calculateWithAllTasksRunning();
+} else {
+result = calculateAfterTasksFinished();
+}
+
+if 
(!isAllExecutionsToTriggerStarted(result.getTasksToTrigger())) {
+throw new CheckpointException(
+
CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING);
+  

[jira] [Created] (FLINK-21287) Failed to build flink source code

2021-02-04 Thread Lei Qu (Jira)
Lei Qu created FLINK-21287:
--

 Summary: Failed to build flink source code
 Key: FLINK-21287
 URL: https://issues.apache.org/jira/browse/FLINK-21287
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.12.1
Reporter: Lei Qu


[ERROR] Failed to execute goal 
org.xolstice.maven.plugins:protobuf-maven-plugin:0.5.1:test-compile (default) 
on project flink-parquet_2.11: protoc did not exit cleanly. Review output for 
more information. -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn  -rf :flink-parquet_2.11



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] leonardBang commented on pull request #14863: [FLINK-21203]Don’t collect -U&+U Row When they are equals In the LastRowFunction

2021-02-04 Thread GitBox


leonardBang commented on pull request #14863:
URL: https://github.com/apache/flink/pull/14863#issuecomment-773765646


   > cc @leonardBang , could you help to review this?
   
   ok, I'll take a look



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14876: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14876:
URL: https://github.com/apache/flink/pull/14876#issuecomment-773757019


   
   ## CI report:
   
   * 100fb1eeb7f94949efc85cc6921a5c653a56163d Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12974)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14844: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14844:
URL: https://github.com/apache/flink/pull/14844#issuecomment-772295878


   
   ## CI report:
   
   * 4c5d3fcfdf5bb2a114964f5cd60fc6743fe331da Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12833)
 
   * 54ed8511139561657f10d84dc25b189acbbf156c Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12973)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14379: [FLINK-20563][hive] Support built-in functions for Hive versions prio…

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14379:
URL: https://github.com/apache/flink/pull/14379#issuecomment-73009


   
   ## CI report:
   
   * a413199ba00fdf4b41b09e7cf4bcf02bcd6da0ce Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10861)
 
   * 64460eaf888b0333b8aed626d48f66fd875997cf Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12972)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks

2021-02-04 Thread GitBox


gaoyunhaii commented on a change in pull request #14740:
URL: https://github.com/apache/flink/pull/14740#discussion_r570694404



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointBriefCalculator.java
##
@@ -0,0 +1,492 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.checkpoint;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.Execution;
+import org.apache.flink.runtime.executiongraph.ExecutionAttemptID;
+import org.apache.flink.runtime.executiongraph.ExecutionEdge;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.ExecutionVertex;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobEdge;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.ListIterator;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+import static org.apache.flink.util.Preconditions.checkState;
+
+/** Computes the tasks to trigger, wait or commit for each checkpoint. */
+public class CheckpointBriefCalculator {
+    private static final Logger LOG = LoggerFactory.getLogger(CheckpointBriefCalculator.class);
+
+    private final JobID jobId;
+
+    private final CheckpointBriefCalculatorContext context;
+
+    private final List<ExecutionJobVertex> jobVerticesInTopologyOrder = new ArrayList<>();
+
+    private final List<ExecutionVertex> allTasks = new ArrayList<>();
+
+    private final List<ExecutionVertex> sourceTasks = new ArrayList<>();
+
+    public CheckpointBriefCalculator(
+            JobID jobId,
+            CheckpointBriefCalculatorContext context,
+            Iterable<ExecutionJobVertex> jobVerticesInTopologyOrderIterable) {
+
+        this.jobId = checkNotNull(jobId);
+        this.context = checkNotNull(context);
+
+        checkNotNull(jobVerticesInTopologyOrderIterable);
+        jobVerticesInTopologyOrderIterable.forEach(
+                jobVertex -> {
+                    jobVerticesInTopologyOrder.add(jobVertex);
+                    allTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+
+                    if (jobVertex.getJobVertex().isInputVertex()) {
+                        sourceTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+                    }
+                });
+    }
+
+    public CompletableFuture<CheckpointBrief> calculateCheckpointBrief() {
+        CompletableFuture<CheckpointBrief> resultFuture = new CompletableFuture<>();
+
+        context.getMainExecutor()
+                .execute(
+                        () -> {
+                            try {
+                                if (!isAllExecutionAttemptsAreInitiated()) {
+                                    throw new CheckpointException(
+                                            CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING);
+                                }
+
+                                CheckpointBrief result;
+                                if (!context.hasFinishedTasks()) {
+                                    result = calculateWithAllTasksRunning();
+                                } else {
+                                    result = calculateAfterTasksFinished();
+                                }
+
+                                if (!isAllExecutionsToTriggerStarted(result.getTasksToTrigger())) {
+                                    throw new CheckpointException(
+                                            CheckpointFailureReason.NOT_ALL_REQUIRED_TASKS_RUNNING);

[jira] [Commented] (FLINK-11838) Create RecoverableWriter for GCS

2021-02-04 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279299#comment-17279299
 ] 

Xintong Song commented on FLINK-11838:
--

Hi [~galenwarren],

Thanks for offering the contribution. I will help you with it.

Since this ticket has not been updated for quite some time and the original PR 
has been abandoned, I have assigned you to the ticket.

Just to manage expectations: I may need some time to pick up the GCS 
background and review your design proposal.

During this time, I would suggest taking a look at the following guidelines.
 [https://flink.apache.org/contributing/contribute-code.html]
 [https://flink.apache.org/contributing/code-style-and-quality-preamble.html]

After a first glance at the PR, I have two suggestions.
- I noticed you've described your proposal on the PR you've opened. It would be 
nice to add it to the description of this JIRA ticket as well. Usually, we use the 
JIRA ticket for design discussions, and the PR for reviewing implementation 
details.
- The PR contains 3k LOC of changes in a single commit, which can be hard to 
review, especially when we cannot communicate face-to-face. It would be nice to 
organize the code into smaller commits following the contribution guidelines. 
This can be done after we reach consensus on the design proposal.

> Create RecoverableWriter for GCS
> 
>
> Key: FLINK-11838
> URL: https://issues.apache.org/jira/browse/FLINK-11838
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Affects Versions: 1.8.0
>Reporter: Fokko Driesprong
>Assignee: Galen Warren
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> GCS supports the resumable upload which we can use to create a Recoverable 
> writer similar to the S3 implementation:
> https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload
> After using the Hadoop compatible interface: 
> https://github.com/apache/flink/pull/7519
> We've noticed that the current implementation relies heavily on the renaming 
> of the files on the commit: 
> https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259
> This is suboptimal on an object store such as GCS. Therefore we would like to 
> implement a more GCS native RecoverableWriter 
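For illustration, a minimal sketch of how a GCS-native recoverable writer could 
persist and resume an in-flight upload with the resumable-upload support of the 
google-cloud-storage Java client. The class, bucket, and object names are 
hypothetical, and this is not the design from the PR; it only shows the client 
primitive a RecoverableWriter could build on:

{code:java}
import com.google.cloud.RestorableState;
import com.google.cloud.WriteChannel;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class GcsResumableUploadSketch {

    public static void main(String[] args) throws IOException {
        Storage storage = StorageOptions.getDefaultInstance().getService();
        BlobInfo blob = BlobInfo.newBuilder(BlobId.of("my-bucket", "part-00000")).build();

        // Start a resumable upload; bytes are streamed to GCS in chunks.
        WriteChannel channel = storage.writer(blob);
        channel.write(ByteBuffer.wrap("first chunk".getBytes(StandardCharsets.UTF_8)));

        // capture() returns a serializable state token. A RecoverableWriter could
        // store this token in a Flink checkpoint (the analogue of a ResumeRecoverable)
        // instead of relying on renames at commit time.
        RestorableState<WriteChannel> recoverable = channel.capture();

        // On recovery, restore the channel and continue where the upload left off.
        WriteChannel restored = recoverable.restore();
        restored.write(ByteBuffer.wrap("second chunk".getBytes(StandardCharsets.UTF_8)));
        restored.close(); // finalizes the object -- the analogue of a commit
    }
}
{code}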



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #14876: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`

2021-02-04 Thread GitBox


flinkbot commented on pull request #14876:
URL: https://github.com/apache/flink/pull/14876#issuecomment-773757019


   
   ## CI report:
   
   * 100fb1eeb7f94949efc85cc6921a5c653a56163d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14844: [FLINK-21208][python] Make Arrow Coder serialize schema info in every batch

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14844:
URL: https://github.com/apache/flink/pull/14844#issuecomment-772295878


   
   ## CI report:
   
   * 4c5d3fcfdf5bb2a114964f5cd60fc6743fe331da Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12833)
 
   * 54ed8511139561657f10d84dc25b189acbbf156c UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14798: [FLINK-21187] Provide exception history for root causes

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14798:
URL: https://github.com/apache/flink/pull/14798#issuecomment-769223911


   
   ## CI report:
   
   * 5910b1a876d28886a8b5f87c09e67d75d2a45cd3 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12962)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14379: [FLINK-20563][hive] Support built-in functions for Hive versions prio…

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14379:
URL: https://github.com/apache/flink/pull/14379#issuecomment-73009


   
   ## CI report:
   
   * a413199ba00fdf4b41b09e7cf4bcf02bcd6da0ce Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10861)
 
   * 64460eaf888b0333b8aed626d48f66fd875997cf UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-20970) DECIMAL(10, 0) can not be GROUP BY key.

2021-02-04 Thread Wong Mulan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279297#comment-17279297
 ] 

Wong Mulan commented on FLINK-20970:


This problem does not occur in versions 1.11.3 and 1.12.1.

> DECIMAL(10, 0) can not be GROUP BY key.
> ---
>
> Key: FLINK-20970
> URL: https://issues.apache.org/jira/browse/FLINK-20970
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.10.1
>Reporter: Wong Mulan
>Priority: Major
> Attachments: image-2021-01-14-17-06-28-648.png
>
>
> If a value whose type is not DECIMAL(38, 18) is used as a GROUP BY key, the 
> result will be -1.
> So, can only DECIMAL(38, 18) be used as a GROUP BY key?
> Whatever the value is, -1 is returned.
>  !image-2021-01-14-17-06-28-648.png|thumbnail! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wuchong commented on pull request #14863: [FLINK-21203]Don’t collect -U&+U Row When they are equals In the LastRowFunction

2021-02-04 Thread GitBox


wuchong commented on pull request #14863:
URL: https://github.com/apache/flink/pull/14863#issuecomment-773753124


   cc @leonardBang , could you help to review this?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-21284) Non-deterministic functions return different values

2021-02-04 Thread hehuiyuan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hehuiyuan updated FLINK-21284:
--
Description: 
A non-deterministic UDF function is used multiple times, and the results differ.

 
{code:java}
Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
from myhive_staff");
tableEnv.registerTable("tmp", tm);


tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
8");


}{code}
 

The RAND_INTEGER() function is evaluated again for `RAND_INTEGER(10) as sample` 
when writing to the sink, which leads to inconsistent results.

 

!image-2021-02-05-10-23-02-616.png|width=759,height=433!

 

 

!image-2021-02-05-10-23-20-639.png|width=1664,height=728!

  was:
A non-deterministic UDF function is used multiple times, and the results differ.

 
{code:java}

Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
from myhive_staff");
tableEnv.registerTable("tmp", tm);


tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
8");


}{code}
 

The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample` 
when writing to the sink, which leads to inconsistent results.

 

!image-2021-02-05-10-23-02-616.png|width=759,height=433!

 

 

!image-2021-02-05-10-23-20-639.png|width=1664,height=728!


> Non-deterministic  functions return different values
> 
>
> Key: FLINK-21284
> URL: https://issues.apache.org/jira/browse/FLINK-21284
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: hehuiyuan
>Priority: Major
> Attachments: image-2021-02-05-10-23-02-616.png, 
> image-2021-02-05-10-23-20-639.png
>
>
> A non-deterministic UDF function is used multiple times, and the results 
> differ.
>  
> {code:java}
> Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
> from myhive_staff");
> tableEnv.registerTable("tmp", tm);
> tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
> 8");
> }{code}
>  
> The RAND_INTEGER() function is evaluated again for `RAND_INTEGER(10) as sample`
> when writing to the sink, which leads to inconsistent results.
>  
> !image-2021-02-05-10-23-02-616.png|width=759,height=433!
>  
>  
> !image-2021-02-05-10-23-20-639.png|width=1664,height=728!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21284) Non-deterministic functions return different values

2021-02-04 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279295#comment-17279295
 ] 

Jark Wu commented on FLINK-21284:
-

[~hehuiyuan], I agree this is a bug. 

> Non-deterministic  functions return different values
> 
>
> Key: FLINK-21284
> URL: https://issues.apache.org/jira/browse/FLINK-21284
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: hehuiyuan
>Priority: Major
> Attachments: image-2021-02-05-10-23-02-616.png, 
> image-2021-02-05-10-23-20-639.png
>
>
> A non-deterministic UDF function is used multiple times, and the results 
> differ.
>  
> {code:java}
> Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
> from myhive_staff");
> tableEnv.registerTable("tmp", tm);
> tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
> 8");
> }{code}
>  
> The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample`
> when writing to the sink, which leads to inconsistent results.
>  
> !image-2021-02-05-10-23-02-616.png|width=759,height=433!
>  
>  
> !image-2021-02-05-10-23-20-639.png|width=1664,height=728!
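For reference, a minimal sketch of how a user-defined function can at least 
declare its non-determinism (the class mirrors the hypothetical SampleFunction 
from the description; the built-in RAND_INTEGER is already non-deterministic, so 
for the built-in case the fix has to happen in the planner, and this sketch is 
an illustration rather than a fix):

{code:java}
import org.apache.flink.table.functions.ScalarFunction;

import java.util.concurrent.ThreadLocalRandom;

public class SampleFunction extends ScalarFunction {

    public int eval(int bound) {
        return ThreadLocalRandom.current().nextInt(bound);
    }

    // Tells the planner that results are not reproducible, so expressions
    // using this function should not be duplicated or constant-folded.
    @Override
    public boolean isDeterministic() {
        return false;
    }
}
{code}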



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-21284) Non-deterministic functions return different values

2021-02-04 Thread hehuiyuan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279294#comment-17279294
 ] 

hehuiyuan edited comment on FLINK-21284 at 2/5/21, 3:04 AM:


[~jark], yes, the RAND_INTEGER result is non-deterministic.

`random.nextInt` is called when filtering `>= 8`        - - - > result$2

`random.nextInt` is called again when setting the output. - - - > result$10

result$2 and result$10 may be different.


was (Author: hehuiyuan):
[~jark], yes, RAND_INTEGER is non-deterministic.

`random.nextInt` is called when filtering `>= 8`        - - - > result$2

`random.nextInt` is called again when setting the output. - - - > result$10

> Non-deterministic  functions return different values
> 
>
> Key: FLINK-21284
> URL: https://issues.apache.org/jira/browse/FLINK-21284
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: hehuiyuan
>Priority: Major
> Attachments: image-2021-02-05-10-23-02-616.png, 
> image-2021-02-05-10-23-20-639.png
>
>
> A non-deterministic UDF function is used multiple times, and the results 
> differ.
>  
> {code:java}
> Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
> from myhive_staff");
> tableEnv.registerTable("tmp", tm);
> tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
> 8");
> }{code}
>  
> The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample`
> when writing to the sink, which leads to inconsistent results.
>  
> !image-2021-02-05-10-23-02-616.png|width=759,height=433!
>  
>  
> !image-2021-02-05-10-23-20-639.png|width=1664,height=728!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21284) Non-deterministic functions return different values

2021-02-04 Thread hehuiyuan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279294#comment-17279294
 ] 

hehuiyuan commented on FLINK-21284:
---

[~jark], yes, RAND_INTEGER is non-deterministic.

`random.nextInt` is called when filtering `>= 8`        - - - > result$2

`random.nextInt` is called again when setting the output. - - - > result$10

> Non-deterministic  functions return different values
> 
>
> Key: FLINK-21284
> URL: https://issues.apache.org/jira/browse/FLINK-21284
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: hehuiyuan
>Priority: Major
> Attachments: image-2021-02-05-10-23-02-616.png, 
> image-2021-02-05-10-23-20-639.png
>
>
> A non-deterministic UDF function is used multiple times, and the results 
> differ.
>  
> {code:java}
> Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
> from myhive_staff");
> tableEnv.registerTable("tmp", tm);
> tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
> 8");
> }{code}
>  
> The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample`
> when writing to the sink, which leads to inconsistent results.
>  
> !image-2021-02-05-10-23-02-616.png|width=759,height=433!
>  
>  
> !image-2021-02-05-10-23-20-639.png|width=1664,height=728!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21284) Non-deterministic functions return different values

2021-02-04 Thread hehuiyuan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hehuiyuan updated FLINK-21284:
--
Description: 
A non-deterministic UDF function is used multiple times, and the results differ.

 
{code:java}

Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
from myhive_staff");
tableEnv.registerTable("tmp", tm);


tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
8");


}{code}
 

The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample` 
when writing to the sink, which leads to inconsistent results.

 

!image-2021-02-05-10-23-02-616.png|width=759,height=433!

 

 

!image-2021-02-05-10-23-20-639.png|width=1664,height=728!

  was:
A non-deterministic UDF function is used multiple times, and the results differ.

 
{code:java}
tableEnv.registerFunction("sample", new SampleFunction());

Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
from myhive_staff");
tableEnv.registerTable("tmp", tm);


tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
8");


}{code}
 

The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample` 
when writing to the sink, which leads to inconsistent results.

 

!image-2021-02-05-10-23-02-616.png|width=759,height=433!

 

 

!image-2021-02-05-10-23-20-639.png|width=1664,height=728!


> Non-deterministic  functions return different values
> 
>
> Key: FLINK-21284
> URL: https://issues.apache.org/jira/browse/FLINK-21284
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: hehuiyuan
>Priority: Major
> Attachments: image-2021-02-05-10-23-02-616.png, 
> image-2021-02-05-10-23-20-639.png
>
>
> A non-deterministic UDF function is used multiple times, and the results 
> differ.
>  
> {code:java}
> Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
> from myhive_staff");
> tableEnv.registerTable("tmp", tm);
> tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
> 8");
> }{code}
>  
> The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample`
> when writing to the sink, which leads to inconsistent results.
>  
> !image-2021-02-05-10-23-02-616.png|width=759,height=433!
>  
>  
> !image-2021-02-05-10-23-20-639.png|width=1664,height=728!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21284) Non-deterministic UDF functions return different values

2021-02-04 Thread hehuiyuan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hehuiyuan updated FLINK-21284:
--
Description: 
A non-deterministic UDF function is used multiple times, and the results differ.

 
{code:java}
tableEnv.registerFunction("sample", new SampleFunction());

Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
from myhive_staff");
tableEnv.registerTable("tmp", tm);


tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
8");


}{code}
 

The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample` 
when writing to the sink, which leads to inconsistent results.

 

!image-2021-02-05-10-23-02-616.png|width=759,height=433!

 

 

!image-2021-02-05-10-23-20-639.png|width=1664,height=728!

  was:
A non-deterministic UDF function is used multiple times, and the results differ.

 
{code:java}
tableEnv.registerFunction("sample", new SampleFunction());

Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
from myhive_staff");
tableEnv.registerTable("tmp", tm);


tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
8");




// UDF function
public class SampleFunction extends ScalarFunction {
  public int eval(int pvid) {
int a = (int) (Math.random() * 10);
System.out.println("" + a );
return a;
  }
}{code}
 

The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample` 
when writing to the sink, which leads to inconsistent results.

 

!image-2021-02-05-10-23-02-616.png|width=759,height=433!

 

 

!image-2021-02-05-10-23-20-639.png|width=1664,height=728!


> Non-deterministic UDF functions return different values
> ---
>
> Key: FLINK-21284
> URL: https://issues.apache.org/jira/browse/FLINK-21284
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: hehuiyuan
>Priority: Major
> Attachments: image-2021-02-05-10-23-02-616.png, 
> image-2021-02-05-10-23-20-639.png
>
>
> A non-deterministic UDF function is used multiple times, and the results 
> differ.
>  
> {code:java}
> tableEnv.registerFunction("sample", new SampleFunction());
> Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
> from myhive_staff");
> tableEnv.registerTable("tmp", tm);
> tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
> 8");
> }{code}
>  
> The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample`
> when writing to the sink, which leads to inconsistent results.
>  
> !image-2021-02-05-10-23-02-616.png|width=759,height=433!
>  
>  
> !image-2021-02-05-10-23-20-639.png|width=1664,height=728!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21284) Non-deterministic functions return different values

2021-02-04 Thread hehuiyuan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hehuiyuan updated FLINK-21284:
--
Summary: Non-deterministic  functions return different values  (was: 
Non-deterministic UDF functions return different values)

> Non-deterministic  functions return different values
> 
>
> Key: FLINK-21284
> URL: https://issues.apache.org/jira/browse/FLINK-21284
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: hehuiyuan
>Priority: Major
> Attachments: image-2021-02-05-10-23-02-616.png, 
> image-2021-02-05-10-23-20-639.png
>
>
> A non-deterministic UDF function is used multiple times, and the results 
> differ.
>  
> {code:java}
> tableEnv.registerFunction("sample", new SampleFunction());
> Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
> from myhive_staff");
> tableEnv.registerTable("tmp", tm);
> tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
> 8");
> }{code}
>  
> The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample`
> when writing to the sink, which leads to inconsistent results.
>  
> !image-2021-02-05-10-23-02-616.png|width=759,height=433!
>  
>  
> !image-2021-02-05-10-23-20-639.png|width=1664,height=728!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21286) Support BUCKET for flink sql CREATE TABLE

2021-02-04 Thread Jun Zhang (Jira)
Jun Zhang created FLINK-21286:
-

 Summary: Support  BUCKET for flink sql CREATE TABLE
 Key: FLINK-21286
 URL: https://issues.apache.org/jira/browse/FLINK-21286
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 1.12.1
Reporter: Jun Zhang
 Fix For: 1.13.0


Support BUCKET for Flink CREATE TABLE, referring to the Hive syntax:
{code:java}
 [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] 
INTO num_buckets BUCKETS]
{code}
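
A hypothetical illustration of how the proposed clause could look in a Flink 
CREATE TABLE (the Flink syntax is not decided by this ticket; the table name, 
columns, and connector options below are made up):

{code:sql}
CREATE TABLE user_events (
  user_id BIGINT,
  event_time TIMESTAMP(3),
  payload STRING
) CLUSTERED BY (user_id) SORTED BY (event_time ASC) INTO 8 BUCKETS
WITH (
  'connector' = 'filesystem',
  'path' = '/tmp/user_events',
  'format' = 'parquet'
);
{code}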



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-11838) Create RecoverableWriter for GCS

2021-02-04 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song reassigned FLINK-11838:


Assignee: Galen Warren  (was: Fokko Driesprong)

> Create RecoverableWriter for GCS
> 
>
> Key: FLINK-11838
> URL: https://issues.apache.org/jira/browse/FLINK-11838
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Affects Versions: 1.8.0
>Reporter: Fokko Driesprong
>Assignee: Galen Warren
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> GCS supports the resumable upload which we can use to create a Recoverable 
> writer similar to the S3 implementation:
> https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload
> After using the Hadoop compatible interface: 
> https://github.com/apache/flink/pull/7519
> We've noticed that the current implementation relies heavily on the renaming 
> of the files on the commit: 
> https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259
> This is suboptimal on an object store such as GCS. Therefore we would like to 
> implement a more GCS native RecoverableWriter 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] gaoyunhaii commented on a change in pull request #14740: [FLINK-21067][runtime][checkpoint] Modify the logic of computing which tasks to trigger/ack/commit to support finished tasks

2021-02-04 Thread GitBox


gaoyunhaii commented on a change in pull request #14740:
URL: https://github.com/apache/flink/pull/14740#discussion_r570686307



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointBriefCalculator.java
##
@@ -0,0 +1,492 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.checkpoint;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.Execution;
+import org.apache.flink.runtime.executiongraph.ExecutionAttemptID;
+import org.apache.flink.runtime.executiongraph.ExecutionEdge;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.ExecutionVertex;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobEdge;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.ListIterator;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+import static org.apache.flink.util.Preconditions.checkState;
+
+/** Computes the tasks to trigger, wait or commit for each checkpoint. */
+public class CheckpointBriefCalculator {
+    private static final Logger LOG = LoggerFactory.getLogger(CheckpointBriefCalculator.class);
+
+    private final JobID jobId;
+
+    private final CheckpointBriefCalculatorContext context;
+
+    private final List<ExecutionJobVertex> jobVerticesInTopologyOrder = new ArrayList<>();
+
+    private final List<ExecutionVertex> allTasks = new ArrayList<>();
+
+    private final List<ExecutionVertex> sourceTasks = new ArrayList<>();
+
+    public CheckpointBriefCalculator(
+            JobID jobId,
+            CheckpointBriefCalculatorContext context,
+            Iterable<ExecutionJobVertex> jobVerticesInTopologyOrderIterable) {
+
+        this.jobId = checkNotNull(jobId);
+        this.context = checkNotNull(context);
+
+        checkNotNull(jobVerticesInTopologyOrderIterable);
+        jobVerticesInTopologyOrderIterable.forEach(
+                jobVertex -> {
+                    jobVerticesInTopologyOrder.add(jobVertex);
+                    allTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+
+                    if (jobVertex.getJobVertex().isInputVertex()) {
+                        sourceTasks.addAll(Arrays.asList(jobVertex.getTaskVertices()));
+                    }
+                });
+    }
+
+    public CompletableFuture<CheckpointBrief> calculateCheckpointBrief() {
+        CompletableFuture<CheckpointBrief> resultFuture = new CompletableFuture<>();
+
+        context.getMainExecutor()
+                .execute(
+                        () -> {

Review comment:
I think it would be much simpler, many thanks!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-21285) Support MERGE INTO for flink sql

2021-02-04 Thread Jun Zhang (Jira)
Jun Zhang created FLINK-21285:
-

 Summary: Support  MERGE INTO for flink sql
 Key: FLINK-21285
 URL: https://issues.apache.org/jira/browse/FLINK-21285
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 1.12.1
Reporter: Jun Zhang
 Fix For: 1.13.0


Support MERGE INTO for Flink SQL, referring to the Hive syntax:
{code:java}
MERGE INTO <target table> AS T USING <source expression/table> AS S
ON <boolean expression1>
WHEN MATCHED [AND <boolean expression2>] THEN UPDATE SET <set clause list>
WHEN MATCHED [AND <boolean expression3>] THEN DELETE
WHEN NOT MATCHED [AND <boolean expression4>] THEN INSERT VALUES <value list>
{code}
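
For illustration, a concrete instance of the Hive-style syntax above, with 
hypothetical table and column names:

{code:sql}
MERGE INTO customers AS T
USING customer_updates AS S
ON T.id = S.id
WHEN MATCHED AND S.op = 'UPDATE' THEN UPDATE SET name = S.name, address = S.address
WHEN MATCHED AND S.op = 'DELETE' THEN DELETE
WHEN NOT MATCHED THEN INSERT VALUES (S.id, S.name, S.address)
{code}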



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21278) NullPointerException error is reported when using the evictor method to filter the data before the window calculation

2021-02-04 Thread HunterHunter (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279287#comment-17279287
 ] 

HunterHunter commented on FLINK-21278:
--

Do we need to determine whether there is data in the window before triggering 
the window calculation?

I have fixed this bug in my code.

https://github.com/LinMingQiang/flink/commit/bafefee33b2513d9c4128e9a3a1f9644000137a3
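
For illustration only, the kind of guard described could look like this inside 
ReduceApplyWindowFunction (a hedged sketch, not necessarily identical to the 
linked commit):

{code:java}
// Sketch inside ReduceApplyWindowFunction<K, W, T, R>; reduceFunction and
// wrappedFunction are the existing fields of that class, and
// java.util.Collections is already imported there.
@Override
public void apply(K k, W window, Iterable<T> input, Collector<R> out) throws Exception {
    T curr = null;
    for (T val : input) {
        curr = (curr == null) ? val : reduceFunction.reduce(curr, val);
    }
    if (curr == null) {
        // The evictor removed every element: skip the wrapped function
        // instead of forwarding a null record downstream.
        return;
    }
    this.wrappedFunction.apply(k, window, Collections.singletonList(curr), out);
}
{code}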

> NullPointerException error is reported when using the evictor method to filter 
> the data before the window calculation
> ---
>
> Key: FLINK-21278
> URL: https://issues.apache.org/jira/browse/FLINK-21278
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.12.1
>Reporter: HunterHunter
>Priority: Blocker
>
> When I use the evictor() method to filter the data before a window is triggered, 
> and no data meets the conditions, a NullPointerException 
> is thrown.
> This problem occurs in the ReduceApplyWindowFunction.apply method.
> So I think we should either skip triggering the calculation when there is no 
> data, or check for null before emitting the calculation 
> result



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #14876: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`

2021-02-04 Thread GitBox


flinkbot commented on pull request #14876:
URL: https://github.com/apache/flink/pull/14876#issuecomment-773745339


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 100fb1eeb7f94949efc85cc6921a5c653a56163d (Fri Feb 05 
02:45:10 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] V1ncentzzZ commented on pull request #14849: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`

2021-02-04 Thread GitBox


V1ncentzzZ commented on pull request #14849:
URL: https://github.com/apache/flink/pull/14849#issuecomment-773745324


Thanks @wuchong @leonardBang, this 
[PR](https://github.com/apache/flink/pull/14876) is for master.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] V1ncentzzZ opened a new pull request #14876: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`

2021-02-04 Thread GitBox


V1ncentzzZ opened a new pull request #14876:
URL: https://github.com/apache/flink/pull/14876


   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   Fix typo in `UnsignedTypeConversionITCase`.
   
   
   ## Verifying this change
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't 
know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-21284) Non-deterministic UDF functions return different values

2021-02-04 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279286#comment-17279286
 ] 

Jark Wu commented on FLINK-21284:
-

[~hehuiyuan], do you mean you get result sample < 8 in the sink?

> Non-deterministic UDF functions return different values
> ---
>
> Key: FLINK-21284
> URL: https://issues.apache.org/jira/browse/FLINK-21284
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: hehuiyuan
>Priority: Major
> Attachments: image-2021-02-05-10-23-02-616.png, 
> image-2021-02-05-10-23-20-639.png
>
>
> A non-deterministic UDF function is used multiple times, and the results 
> differ.
>  
> {code:java}
> tableEnv.registerFunction("sample", new SampleFunction());
> Table tm = tableEnv.sqlQuery("select name, RAND_INTEGER(10) as sample , sex 
> from myhive_staff");
> tableEnv.registerTable("tmp", tm);
> tableEnv.sqlUpdate("insert into sinktable select * from tmp  where sample >= 
> 8");
> // UDF function
> public class SampleFunction extends ScalarFunction {
>   public int eval(int pvid) {
> int a = (int) (Math.random() * 10);
> System.out.println("" + a );
> return a;
>   }
> }{code}
>  
> The sample UDF function is evaluated again for `RAND_INTEGER(10) as sample`
> when writing to the sink, which leads to inconsistent results.
>  
> !image-2021-02-05-10-23-02-616.png|width=759,height=433!
>  
>  
> !image-2021-02-05-10-23-20-639.png|width=1664,height=728!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2021-02-04 Thread Yun Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Tang updated FLINK-15318:
-
Fix Version/s: 1.12.0

> RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le
> ---
>
> Key: FLINK-15318
> URL: https://issues.apache.org/jira/browse/FLINK-15318
> Project: Flink
>  Issue Type: Bug
>  Components: Benchmarks, Runtime / State Backends
> Environment: arch: ppc64le
> os: rhel7.6, ubuntu 18.04
> jdk: 8, 11
> mvn: 3.3.9, 3.6.2
>Reporter: Siddhesh Ghadi
>Priority: Major
> Fix For: 1.12.0
>
> Attachments: surefire-report.txt
>
>
> RocksDBWriteBatchPerformanceTest.benchMark fails due to TestTimedOut; however, 
> when the test timeout is increased from 2s to 5s in 
> org/apache/flink/contrib/streaming/state/benchmark/RocksDBWriteBatchPerformanceTest.java:75,
>  it passes. Is this an acceptable solution?
> Note: Tests are run inside a container.
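
For reference, the change being asked about is a one-line bump of the JUnit 
timeout; a minimal sketch (class name and elided body are illustrative):

{code:java}
import org.junit.Test;

public class RocksDBWriteBatchPerformanceTestSketch {

    // Illustrative only: raising the JUnit timeout from 2s to 5s as described above.
    @Test(timeout = 5_000) // was: timeout = 2_000
    public void benchMark() throws Exception {
        // ... benchmark body unchanged ...
    }
}
{code}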



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2021-02-04 Thread Yun Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Tang closed FLINK-15318.

Resolution: Fixed

> RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le
> ---
>
> Key: FLINK-15318
> URL: https://issues.apache.org/jira/browse/FLINK-15318
> Project: Flink
>  Issue Type: Bug
>  Components: Benchmarks, Runtime / State Backends
> Environment: arch: ppc64le
> os: rhel7.6, ubuntu 18.04
> jdk: 8, 11
> mvn: 3.3.9, 3.6.2
>Reporter: Siddhesh Ghadi
>Priority: Major
> Attachments: surefire-report.txt
>
>
> RocksDBWriteBatchPerformanceTest.benchMark fails due to TestTimedOut; however, 
> when the test timeout is increased from 2s to 5s in 
> org/apache/flink/contrib/streaming/state/benchmark/RocksDBWriteBatchPerformanceTest.java:75,
>  it passes. Is this an acceptable solution?
> Note: Tests are run inside a container.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2021-02-04 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279285#comment-17279285
 ] 

Yun Tang commented on FLINK-15318:
--

[~maguowei] These tests were dropped in FLINK-18373, so I will close this ticket 
as fixed since 1.12.0.

> RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le
> ---
>
> Key: FLINK-15318
> URL: https://issues.apache.org/jira/browse/FLINK-15318
> Project: Flink
>  Issue Type: Bug
>  Components: Benchmarks, Runtime / State Backends
> Environment: arch: ppc64le
> os: rhel7.6, ubuntu 18.04
> jdk: 8, 11
> mvn: 3.3.9, 3.6.2
>Reporter: Siddhesh Ghadi
>Priority: Major
> Attachments: surefire-report.txt
>
>
> RocksDBWriteBatchPerformanceTest.benchMark fails due to TestTimedOut; however, 
> when the test timeout is increased from 2s to 5s in 
> org/apache/flink/contrib/streaming/state/benchmark/RocksDBWriteBatchPerformanceTest.java:75,
>  it passes. Is this an acceptable solution?
> Note: Tests are run inside a container.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21278) NullPointerException error is reported when using the evictor method to filter the data before the window calculation

2021-02-04 Thread HunterHunter (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HunterHunter updated FLINK-21278:
-
Attachment: (was: image-2021-02-05-10-35-45-364.png)

> NullPointerException error is reported when using the evictor method to filter 
> the data before the window calculation
> ---
>
> Key: FLINK-21278
> URL: https://issues.apache.org/jira/browse/FLINK-21278
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.12.1
>Reporter: HunterHunter
>Priority: Blocker
>
> When I use the evictor() method to filter the data before a window is triggered, 
> and no data meets the conditions, a NullPointerException 
> is thrown.
> This problem occurs in the ReduceApplyWindowFunction.apply method.
> So I think we should either skip triggering the calculation when there is no 
> data, or check for null before emitting the calculation 
> result



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21278) NullPointerException error is reported when using the evictor method to filter the data before the window calculation

2021-02-04 Thread HunterHunter (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HunterHunter updated FLINK-21278:
-
Attachment: image-2021-02-05-10-35-45-364.png

> NullPointerException error is reported when using the evictor method to filter 
> the data before the window calculation
> ---
>
> Key: FLINK-21278
> URL: https://issues.apache.org/jira/browse/FLINK-21278
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.12.1
>Reporter: HunterHunter
>Priority: Blocker
> Attachments: image-2021-02-05-10-35-45-364.png
>
>
> When I use the evictor() method to filter the data before a window is triggered, 
> and no data meets the conditions, a NullPointerException 
> is thrown.
> This problem occurs in the ReduceApplyWindowFunction.apply method.
> So I think we should either skip triggering the calculation when there is no 
> data, or check for null before emitting the calculation 
> result



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21278) NullPointerException error is reported when using the evictor method to filter the data before the window calculation

2021-02-04 Thread HunterHunter (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279282#comment-17279282
 ] 

HunterHunter commented on FLINK-21278:
--

[~ZhaoWeiNan]

code: [https://paste.ubuntu.com/p/wJyN93BwHB/]
{code:java}

Caused by: java.lang.NullPointerException
at 
org.apache.flink.api.common.functions.util.PrintSinkOutputWriter.write(PrintSinkOutputWriter.java:73)
at 
org.apache.flink.streaming.api.functions.sink.PrintSinkFunction.invoke(PrintSinkFunction.java:81)
at 
org.apache.flink.streaming.api.functions.sink.SinkFunction.invoke(SinkFunction.java:52)
at 
org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:56)
at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:71)
at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46)
at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26)
at 
org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:52)
at 
org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:30)
at 
org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:53)
at 
org.apache.flink.streaming.api.functions.windowing.PassThroughWindowFunction.apply(PassThroughWindowFunction.java:36)
at 
org.apache.flink.streaming.api.functions.windowing.ReduceApplyWindowFunction.apply(ReduceApplyWindowFunction.java:58)
at 
org.apache.flink.streaming.runtime.operators.windowing.functions.InternalIterableWindowFunction.process(InternalIterableWindowFunction.java:44)
at 
org.apache.flink.streaming.runtime.operators.windowing.functions.InternalIterableWindowFunction.process(InternalIterableWindowFunction.java:32)
at 
org.apache.flink.streaming.runtime.operators.windowing.EvictingWindowOperator.emitWindowContents(EvictingWindowOperator.java:359)
at 
org.apache.flink.streaming.runtime.operators.windowing.EvictingWindowOperator.onEventTime(EvictingWindowOperator.java:271)
at 
org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.advanceWatermark(InternalTimerServiceImpl.java:276)
at 
org.apache.flink.streaming.api.operators.InternalTimeServiceManagerImpl.advanceWatermark(InternalTimeServiceManagerImpl.java:183)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:600)
at 
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitWatermark(OneInputStreamTask.java:199)
at 
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:173)
at 
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:95)
at 
org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:181)
at 
org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152)
at 
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)
at java.lang.Thread.run(Thread.java:748){code}
 

> NullPointerException error is reported when using the evictor method to filter 
> the data before the window calculation
> ---
>
> Key: FLINK-21278
> URL: https://issues.apache.org/jira/browse/FLINK-21278
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.12.1
>Reporter: HunterHunter
>Priority: Blocker
>
> When I use the evictor() method to filter the data before a window is triggered, 
> and no data meets the conditions, a NullPointerException 
> is thrown.
> This problem occurs in the ReduceApplyWindowFunction.apply method.
> So I think we should either skip triggering the calculation when there is no 
> data, or check whethe

[GitHub] [flink] wuchong commented on pull request #14849: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`

2021-02-04 Thread GitBox


wuchong commented on pull request #14849:
URL: https://github.com/apache/flink/pull/14849#issuecomment-773740994


   @V1ncentzzZ  could you open a pull request for master too?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-20254) HiveTableSourceITCase.testStreamPartitionReadByCreateTime times out

2021-02-04 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-20254.

Resolution: Fixed

master (1.13): fce75b5f5c078884022f89ab6fa9b39e23a84279

> HiveTableSourceITCase.testStreamPartitionReadByCreateTime times out
> ---
>
> Key: FLINK-20254
> URL: https://issues.apache.org/jira/browse/FLINK-20254
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.12.0
>Reporter: Robert Metzger
>Assignee: Leonard Xu
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9808&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=62110053-334f-5295-a0ab-80dd7e2babbf
> {code}
> 2020-11-19T10:34:23.5591765Z [ERROR] Tests run: 18, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 192.243 s <<< FAILURE! - in 
> org.apache.flink.connectors.hive.HiveTableSourceITCase
> 2020-11-19T10:34:23.5593193Z [ERROR] 
> testStreamPartitionReadByCreateTime(org.apache.flink.connectors.hive.HiveTableSourceITCase)
>   Time elapsed: 120.075 s  <<< ERROR!
> 2020-11-19T10:34:23.5593929Z org.junit.runners.model.TestTimedOutException: 
> test timed out after 12 milliseconds
> 2020-11-19T10:34:23.5594321Z  at java.lang.Thread.sleep(Native Method)
> 2020-11-19T10:34:23.5594777Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sleepBeforeRetry(CollectResultFetcher.java:231)
> 2020-11-19T10:34:23.5595378Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:119)
> 2020-11-19T10:34:23.5596001Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103)
> 2020-11-19T10:34:23.5596610Z  at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77)
> 2020-11-19T10:34:23.5597218Z  at 
> org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:115)
> 2020-11-19T10:34:23.5597811Z  at 
> org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:355)
> 2020-11-19T10:34:23.5598555Z  at 
> org.apache.flink.connectors.hive.HiveTableSourceITCase.fetchRows(HiveTableSourceITCase.java:653)
> 2020-11-19T10:34:23.5599407Z  at 
> org.apache.flink.connectors.hive.HiveTableSourceITCase.testStreamPartitionReadByCreateTime(HiveTableSourceITCase.java:594)
> 2020-11-19T10:34:23.5599982Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-11-19T10:34:23.5600393Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-11-19T10:34:23.5600865Z  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-11-19T10:34:23.5601300Z  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2020-11-19T10:34:23.5601713Z  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2020-11-19T10:34:23.5602211Z  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2020-11-19T10:34:23.5602688Z  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2020-11-19T10:34:23.5603181Z  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2020-11-19T10:34:23.5603753Z  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 2020-11-19T10:34:23.5604308Z  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 2020-11-19T10:34:23.5604780Z  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 2020-11-19T10:34:23.5605114Z  at java.lang.Thread.run(Thread.java:748)
> 2020-11-19T10:34:23.5605299Z 
> 2020-11-19T10:34:24.4180149Z [INFO] Running 
> org.apache.flink.connectors.hive.TableEnvHiveConnectorITCase
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wuchong merged pull request #14849: [hotfix][typo]Fix typo in `UnsignedTypeConversionITCase`

2021-02-04 Thread GitBox


wuchong merged pull request #14849:
URL: https://github.com/apache/flink/pull/14849


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14821: [FLINK-21210][coordination] ApplicationClusterEntryPoint should explicitly close PackagedProgram

2021-02-04 Thread GitBox


flinkbot edited a comment on pull request #14821:
URL: https://github.com/apache/flink/pull/14821#issuecomment-770351339


   
   ## CI report:
   
   * aa75cbfa98c0494578283d2dab0908cdd4942c3a Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=12968)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-20495) Elasticsearch6DynamicSinkITCase Hang

2021-02-04 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279278#comment-17279278
 ] 

Leonard Xu commented on FLINK-20495:


another instance:
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12906&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20

> Elasticsearch6DynamicSinkITCase Hang
> 
>
> Key: FLINK-20495
> URL: https://issues.apache.org/jira/browse/FLINK-20495
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Connectors / 
> ElasticSearch, Tests
>Affects Versions: 1.13.0
>Reporter: Huang Xingbo
>Priority: Major
>  Labels: test-stability
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10535&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20]
>  
> {code:java}
> 2020-12-04T22:39:33.9748225Z [INFO] Running 
> org.apache.flink.streaming.connectors.elasticsearch.table.Elasticsearch6DynamicSinkITCase
> 2020-12-04T22:54:51.9486410Z 
> ==
> 2020-12-04T22:54:51.9488766Z Process produced no output for 900 seconds.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] JingsongLi merged pull request #14766: [FLINK-20254][hive] Make PartitionMonitor fetching partitions with same create time properly

2021-02-04 Thread GitBox


JingsongLi merged pull request #14766:
URL: https://github.com/apache/flink/pull/14766


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingsongLi commented on pull request #14766: [FLINK-20254][hive] Make PartitionMonitor fetching partitions with same create time properly

2021-02-04 Thread GitBox


JingsongLi commented on pull request #14766:
URL: https://github.com/apache/flink/pull/14766#issuecomment-773738333


@leonardBang You can create a JIRA for the failing test.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



