[GitHub] [flink] flinkbot edited a comment on pull request #15294: [FLINK-21945][streaming] Omit pointwise connections from checkpointing in unaligned checkpoints.

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15294:
URL: https://github.com/apache/flink/pull/15294#issuecomment-803058203


   
   ## CI report:
   
   * b22d2cdd4e457842b585e64089858a0ae8eb9a2b UNKNOWN
   * 19964baf25121c6f0a4f85d75ffcc568d98dcdaa Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15605)
 
   * 6bff4c2e3485c8f0dd09b872652c6a7958836e7f UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15253: [FLINK-21808][hive] Support DQL/DML in HiveParser

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15253:
URL: https://github.com/apache/flink/pull/15253#issuecomment-801069817


   
   ## CI report:
   
   * c550f67518cae7d8fe19752af581c582bd7115b9 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15625)
 
   * f57250440164af2180629998bc4cfda236d64772 UNKNOWN
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15161: [FLINK-20114][connector/kafka,common] Fix a few KafkaSource-related bugs

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15161:
URL: https://github.com/apache/flink/pull/15161#issuecomment-797177953


   
   ## CI report:
   
   * d3f59e302b9860d49ce79252aeb8bd1decaeeabf UNKNOWN
   * 6fc6cc9a2e6c26e9e8fa62b481237231394078eb Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15507)
 
   * 6c17ec7241323224d3053134e82353c63f28a24f UNKNOWN
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15016: [FLINK-21413][state] Clean TtlMapState and TtlListState after all elements are expired

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15016:
URL: https://github.com/apache/flink/pull/15016#issuecomment-785640058


   
   ## CI report:
   
   * 5d95fe96d132b396f3d59f2c102d63a68ccd0d73 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15646)
 
   
   




[jira] [Commented] (FLINK-21906) Support computed column syntax for Hive DDL dialect

2021-03-28 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310429#comment-17310429
 ] 

Jark Wu commented on FLINK-21906:
-

Thanks [~hackergin], I assigned this issue to you.

> Support computed column syntax for Hive DDL dialect
> ---
>
> Key: FLINK-21906
> URL: https://issues.apache.org/jira/browse/FLINK-21906
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: Jark Wu
>Assignee: jinfeng
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-21906) Support computed column syntax for Hive DDL dialect

2021-03-28 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-21906:
---

Assignee: jinfeng



[GitHub] [flink] SteNicholas commented on pull request #15020: [FLINK-21445][kubernetes] Application mode does not set the configuration when building PackagedProgram

2021-03-28 Thread GitBox


SteNicholas commented on pull request #15020:
URL: https://github.com/apache/flink/pull/15020#issuecomment-809072744


   @tillrohrmann, I will update this pull request to address the comments above.






[jira] [Closed] (FLINK-21989) Add a SupportsSourceWatermark ability interface

2021-03-28 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-21989.

Fix Version/s: 1.13.0
   Resolution: Fixed

Fixed in 1.13.0: 71122fda952b7bd3b0308182e10da77af1cc67ff

> Add a SupportsSourceWatermark ability interface
> ---
>
> Key: FLINK-21989
> URL: https://issues.apache.org/jira/browse/FLINK-21989
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> FLINK-21899 added a dedicated function that can be used in watermark 
> definitions. Currently, the generated watermark strategy is invalid because 
> the function's implementation throws an exception. We should integrate this 
> concept deeper into the interfaces instead of requiring every source to 
> implement its own expression-analyzing utility.
> We propose the following interface:
> {code}
> interface SupportsSourceWatermark {
>   void applySourceWatermark();
> }
> {code}
>  
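As a rough illustration of how such an ability interface might be consumed, here is a minimal, self-contained sketch. Only the `SupportsSourceWatermark` name and its `applySourceWatermark()` method come from the proposal above; `TableSource`, `SelfWatermarkingSource`, `PlainSource`, and `pushDownSourceWatermark` are hypothetical stand-ins and not Flink classes.

```java
// Stand-in for the interface proposed in the issue (not the Flink class itself).
interface SupportsSourceWatermark {
    void applySourceWatermark();
}

// Hypothetical marker type for any table source.
interface TableSource {}

// Hypothetical source that can emit its own watermarks once it is told to.
class SelfWatermarkingSource implements TableSource, SupportsSourceWatermark {
    private boolean useSourceWatermark = false;

    @Override
    public void applySourceWatermark() {
        useSourceWatermark = true;
    }

    boolean usesSourceWatermark() {
        return useSourceWatermark;
    }
}

// Hypothetical source without the ability.
class PlainSource implements TableSource {}

class SourceWatermarkDemo {
    // Hypothetical planner step: ask the source to take over watermarking if it can.
    // Returns true when the ability was applied, false when the source lacks it.
    static boolean pushDownSourceWatermark(TableSource source) {
        if (source instanceof SupportsSourceWatermark) {
            ((SupportsSourceWatermark) source).applySourceWatermark();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SelfWatermarkingSource source = new SelfWatermarkingSource();
        System.out.println("applied: " + pushDownSourceWatermark(source));
        System.out.println("source uses its own watermark: " + source.usesSourceWatermark());
        System.out.println("applied to plain source: " + pushDownSourceWatermark(new PlainSource()));
    }
}
```

The point of the sketch is that the caller only needs an `instanceof` check against the ability interface, rather than analyzing the watermark expression separately for every source.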





[GitHub] [flink] twalthr closed pull request #15388: [FLINK-21989][table] Add a SupportsSourceWatermark ability interface

2021-03-28 Thread GitBox


twalthr closed pull request #15388:
URL: https://github.com/apache/flink/pull/15388


   






[GitHub] [flink] flinkbot edited a comment on pull request #14839: [FLINK-21353][state] Add DFS-based StateChangelog

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #14839:
URL: https://github.com/apache/flink/pull/14839#issuecomment-772060196


   
   ## CI report:
   
   * 426533428e0971d34f6e80acc89fe5a5a72ea2a4 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15638)
 
   
   




[GitHub] [flink] zicat commented on a change in pull request #15247: [FLINK-21833][Table SQL / Runtime] TemporalRowTimeJoinOperator.java will lead to the state expansion by short-life-cycle & huge Row

2021-03-28 Thread GitBox


zicat commented on a change in pull request #15247:
URL: https://github.com/apache/flink/pull/15247#discussion_r603014108



##
File path: 
flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/runtime/operators/join/temporal/TemporalRowTimeJoinOperator.java
##
@@ -301,6 +302,8 @@ private void cleanupExpiredVersionInState(long currentWatermark, List r
 public void cleanupState(long time) {
 leftState.clear();
 rightState.clear();
+nextLeftIndex.clear();
+registeredTimer.clear();

Review comment:
Please let me know if there is anything I should change. Thanks, @leonardBang.








[jira] [Commented] (FLINK-21906) Support computed column syntax for Hive DDL dialect

2021-03-28 Thread jinfeng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310411#comment-17310411
 ] 

jinfeng commented on FLINK-21906:
-

I studied the code from FLIP-152. What we need to do is use ANTLR in 
flink-hive-connector to implement the watermark syntax and convert the AST to a 
CreateTableOperation.
Maybe I can give it a try. This feature is not targeted at the 1.13 release, 
which means I have more time to complete it. That's fine.

> Support computed column syntax for Hive DDL dialect
> ---
>
> Key: FLINK-21906
> URL: https://issues.apache.org/jira/browse/FLINK-21906
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: Jark Wu
>Priority: Major
>






[jira] [Commented] (FLINK-21981) Increase the priority of the parameter in flink-conf

2021-03-28 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310410#comment-17310410
 ] 

Xintong Song commented on FLINK-21981:
--

Hi [~Bo Cui],

I noticed you have filed several tickets, all related to Flink's Yarn 
deployment. The discussions do not seem to be going smoothly.

If you wish, you can reach me at my gmail (tonysong820). We can try to set up an 
offline discussion in Chinese. Hope that helps.

> Increase the priority of the parameter in flink-conf
> 
>
> Key: FLINK-21981
> URL: https://issues.apache.org/jira/browse/FLINK-21981
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Reporter: Bo Cui
>Priority: Major
>
> In my cluster, the environment defines HADOOP_CONF_DIR and YARN_CONF_DIR, but 
> they are the cluster's Hadoop and Yarn configuration, not Flink's. The Hadoop 
> configuration for Flink (`env.hadoop.conf.dir`) is different from them, so we 
> should use `env.hadoop.conf.dir` first.
> https://github.com/apache/flink/blob/57e93c90a14a9f5316da821863dd2b335a8f86e0/flink-dist/src/main/flink-bin/bin/config.sh#L256





[GitHub] [flink] flinkbot edited a comment on pull request #15404: [hotfix][docs] Remove redundant import

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15404:
URL: https://github.com/apache/flink/pull/15404#issuecomment-809050239


   
   ## CI report:
   
   * b1c186c4ac9432c1bda5474b5a1d830062195e71 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15647)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15200: [FLINK-21355] Send changes to the state changelog

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15200:
URL: https://github.com/apache/flink/pull/15200#issuecomment-798902665


   
   ## CI report:
   
   * 5e1342d9916f5c4356c622a40bc27bcbdacde9d7 UNKNOWN
   * 683a1724ca1074a01a0cbaf986afa4a85537b478 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15639)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15054: [FLINK-13550][runtime][ui] Operator's Flame Graph

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15054:
URL: https://github.com/apache/flink/pull/15054#issuecomment-788337524


   
   ## CI report:
   
   * 26a28f2d83f56cb386e1365fd4df4fb8a2f2bf86 UNKNOWN
   * 0b5aaf42f2861a38e26a80e25cf2324e7cf06bb7 UNKNOWN
   * e7f59e2e2811c0718a453572e90a4ad2a900ff03 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15642)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15016: [FLINK-21413][state] Clean TtlMapState and TtlListState after all elements are expired

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15016:
URL: https://github.com/apache/flink/pull/15016#issuecomment-785640058


   
   ## CI report:
   
   * cf3c223658f37cc838409b690e2e1bf93c04f207 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15641)
 
   * 5d95fe96d132b396f3d59f2c102d63a68ccd0d73 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15646)
 
   
   




[jira] [Closed] (FLINK-22005) SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1)

2021-03-28 Thread Guowei Ma (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guowei Ma closed FLINK-22005.
-
Fix Version/s: 1.13.0
 Release Note: Fixed in master: c375f4cfd394c10b54110eac446873055b716b89
   Resolution: Fixed

> SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1) 
> 
>
> Key: FLINK-22005
> URL: https://issues.apache.org/jira/browse/FLINK-22005
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> The test fails because it keeps waiting for Elasticsearch records indefinitely.
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15583=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529=19826





[jira] [Commented] (FLINK-22005) SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1)

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310399#comment-17310399
 ] 

Guowei Ma commented on FLINK-22005:
---

Thanks [~Leonard Xu]. I found that this does not appear in the following 
tests (28/29), so I am closing this ticket.

> SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1) 
> 
>
> Key: FLINK-22005
> URL: https://issues.apache.org/jira/browse/FLINK-22005
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
>
> The test fails because it keeps waiting for Elasticsearch records indefinitely.
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15583=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529=19826





[GitHub] [flink] wuchong commented on a change in pull request #15400: [FLINK-20557][sql-client] Support STATEMENT SET in SQL CLI

2021-03-28 Thread GitBox


wuchong commented on a change in pull request #15400:
URL: https://github.com/apache/flink/pull/15400#discussion_r603002125



##
File path: 
flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/CliClient.java
##
@@ -299,15 +307,23 @@ private void callOperation(Operation operation) {
 } else if (operation instanceof HelpOperation) {
 // HELP
 callHelp();
+} else if (operation instanceof BeginStatementSetOperation) {
+// BEGIN STATEMENT SET
+callBeginStatementSet();
+} else if (operation instanceof EndStatementSetOperation) {
+// END
+callEndStatementSet();
+} else if (operation instanceof CatalogSinkModifyOperation) {
+// INSERT INTO/OVERWRITE
+callInsert((CatalogSinkModifyOperation) operation);
+} else if (isStatementSetMode) {

Review comment:
   This looks really hacky, and it makes it hard to maintain which statements 
are not allowed in a statement set. I suggest checking the statement at the 
beginning of this method. 
   
   ```java
   if (isStatementSetMode) {
       // check that the current operation is allowed in STATEMENT SET
       if (!(operation instanceof CatalogSinkModifyOperation
               || operation instanceof EndStatementSetOperation)) {
           printError(MESSAGE_STATEMENT_SET_SQL_EXECUTION_ERROR);
           return;
       }
   }
   ```
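
To make the suggestion concrete, here is a minimal, self-contained sketch of an up-front check, under stated assumptions. The operation classes are empty stand-ins that only borrow their names from the diff; `StatementSetCli` and the returned status strings are made up for illustration, and nothing is actually submitted to an executor.

```java
import java.util.ArrayList;
import java.util.List;

// Empty stand-ins that only reuse the operation names from the diff.
interface Operation {}
class BeginStatementSetOperation implements Operation {}
class EndStatementSetOperation implements Operation {}
class CatalogSinkModifyOperation implements Operation {}
class HelpOperation implements Operation {}

// Hypothetical, heavily simplified CLI that returns a status string instead of printing.
class StatementSetCli {
    private boolean isStatementSetMode = false;
    private List<CatalogSinkModifyOperation> statementSetOperations = new ArrayList<>();

    String callOperation(Operation operation) {
        // Up-front guard: inside a statement set only INSERT and END are allowed.
        if (isStatementSetMode
                && !(operation instanceof CatalogSinkModifyOperation
                        || operation instanceof EndStatementSetOperation)) {
            return "ERROR";
        }

        if (operation instanceof BeginStatementSetOperation) {
            isStatementSetMode = true;
            statementSetOperations = new ArrayList<>();
            return "BEGIN";
        } else if (operation instanceof EndStatementSetOperation) {
            if (!isStatementSetMode) {
                return "ERROR"; // END without a preceding BEGIN STATEMENT SET
            }
            isStatementSetMode = false;
            return "SUBMITTED " + statementSetOperations.size();
        } else if (operation instanceof CatalogSinkModifyOperation) {
            if (isStatementSetMode) {
                statementSetOperations.add((CatalogSinkModifyOperation) operation);
                return "ADDED";
            }
            return "SUBMITTED 1";
        }
        return "OTHER";
    }
}
```

With the guard at the top, the individual branches no longer need to know which statements are forbidden inside a statement set, and a nested `BEGIN STATEMENT SET` is rejected by the same check.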

##
File path: 
flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/CliClient.java
##
@@ -412,27 +428,65 @@ private void callSelect(QueryOperation operation) {
 }
 
 private boolean callInsert(CatalogSinkModifyOperation operation) {
-printInfo(CliStrings.MESSAGE_SUBMITTING_STATEMENT);
-
-try {
-TableResult result = executor.executeOperation(sessionId, operation);
-checkState(result.getJobClient().isPresent());
-terminal.writer()
-.println(
-CliStrings.messageInfo(CliStrings.MESSAGE_STATEMENT_SUBMITTED)
-.toAnsi());
-// keep compatibility with before
-terminal.writer()
-.println(
-String.format(
-"Job ID: %s\n",
-result.getJobClient().get().getJobID().toString()));
-terminal.flush();
+if (isStatementSetMode) {
+statementSetOperations.add(operation);
+printInfo(CliStrings.MESSAGE_ADD_STATEMENT_TO_STATEMENT_SET);
 return true;
-} catch (SqlExecutionException e) {
-printExecutionException(e);
+} else {
+printInfo(CliStrings.MESSAGE_SUBMITTING_STATEMENT);
+
+try {
+TableResult result = executor.executeOperation(sessionId, operation);
+checkState(result.getJobClient().isPresent());
+terminal.writer()
+.println(
+CliStrings.messageInfo(CliStrings.MESSAGE_STATEMENT_SUBMITTED)
+.toAnsi());
+// keep compatibility with before
+terminal.writer()
+.println(
+String.format(
+"Job ID: %s\n",
+result.getJobClient().get().getJobID().toString()));
+terminal.flush();
+return true;
+} catch (SqlExecutionException e) {
+printExecutionException(e);
+}
+return false;
+}
+}
+
+private void callBeginStatementSet() {
+if (isStatementSetMode) {
+printStatementSetExecutionException();
+} else {
+isStatementSetMode = true;
+statementSetOperations = new ArrayList<>();
+printInfo(CliStrings.MESSAGE_BEGIN_STATEMENT_SET);
+}
+}
+
+private void callEndStatementSet() {
+if (isStatementSetMode) {
+isStatementSetMode = false;
+printInfo(CliStrings.MESSAGE_SUBMITTING_STATEMENT_SET);
+
+try {
+TableResult result = executor.executeOperation(sessionId, statementSetOperations);
+checkState(result.getJobClient().isPresent());
+terminal.writer()
+.println(
+CliStrings.messageInfo(CliStrings.MESSAGE_STATEMENT_SET_SUBMITTED)
+.toAnsi());
+terminal.flush();

Review comment:
   This logic should be shared with `callInsert`, and we should also 
consider the dml-sync option (could you rebase the branch?).

##
File path: 

[jira] [Commented] (FLINK-22005) SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1)

2021-03-28 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310391#comment-17310391
 ] 

Leonard Xu commented on FLINK-22005:


[~maguowei] Could you rebase onto the latest master? This issue has been fixed in 
[https://github.com/apache/flink/pull/15394#event-4516849115]



[jira] [Commented] (FLINK-21981) Increase the priority of the parameter in flink-conf

2021-03-28 Thread Bo Cui (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310388#comment-17310388
 ] 

Bo Cui commented on FLINK-21981:


[~Paul Lin] Yes, the current workaround is something like `export ...`, but 
since `env.hadoop.conf.dir` exists and using it would avoid the `export ...`, 
why not use it first?
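
The precedence being asked for can be sketched as follows. This is only an illustration of the proposed lookup order, not the logic in `config.sh`; the class name, the method name, and the example paths are all hypothetical.

```java
// Hypothetical helper that illustrates the proposed order:
// the flink-conf value (env.hadoop.conf.dir) wins over the HADOOP_CONF_DIR variable.
class HadoopConfDirPrecedence {

    // flinkConfValue: value of env.hadoop.conf.dir from flink-conf, or null if unset
    // envValue:       value of the HADOOP_CONF_DIR environment variable, or null if unset
    // Returns the directory to use, or null when neither is configured.
    static String resolve(String flinkConfValue, String envValue) {
        if (flinkConfValue != null && !flinkConfValue.isEmpty()) {
            return flinkConfValue;
        }
        if (envValue != null && !envValue.isEmpty()) {
            return envValue;
        }
        return null;
    }

    public static void main(String[] args) {
        // Example paths are made up for the demo.
        System.out.println(resolve("/opt/flink/hadoop-conf", "/etc/hadoop/conf"));
        System.out.println(resolve(null, "/etc/hadoop/conf"));
    }
}
```

Under this order, a cluster-wide `HADOOP_CONF_DIR` no longer needs to be overridden with an `export` before submitting a job, because the Flink-specific setting is consulted first.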



[jira] [Commented] (FLINK-21990) SourceStreamTask will always hang if the CheckpointedFunction#snapshotState throws an exception.

2021-03-28 Thread Kezhu Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310387#comment-17310387
 ] 

Kezhu Wang commented on FLINK-21990:


Adding a {{disableChaining}} between the failing source and the downstream sink 
makes the test case pass, but the task still hangs for a while before it runs 
into a blocking operation. It seems that a plain {{Thread.interrupt}} is not 
enough; a cooperative {{SourceFunction.cancel}} should help.
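
A minimal sketch of the cancel-then-wait idea, using plain JDK threads rather than Flink classes. `LoopingSource`, `CancelBeforeJoinDemo`, and all method names are hypothetical; the timeouts exist only so the demo itself cannot hang, whereas the real cleanup path waits without one.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for a user source whose run() loops until cancel() is called.
class LoopingSource {
    private volatile boolean running = true;

    void run() throws InterruptedException {
        while (running) {
            Thread.sleep(1); // pretend to emit a record
        }
    }

    void cancel() {
        running = false;
    }
}

class CancelBeforeJoinDemo {
    // Runs the source on its own thread; the future completes when run() returns.
    static CompletableFuture<Void> startSourceThread(LoopingSource source) {
        CompletableFuture<Void> completion = new CompletableFuture<>();
        Thread thread = new Thread(() -> {
            try {
                source.run();
                completion.complete(null);
            } catch (Throwable t) {
                completion.completeExceptionally(t);
            }
        }, "source-thread");
        thread.setDaemon(true);
        thread.start();
        return completion;
    }

    // Returns true if the source thread finished within the given time.
    static boolean finishesWithin(CompletableFuture<Void> completion, long millis) {
        try {
            completion.get(millis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // First waits without cancelling, then cancels and waits again.
    static boolean[] runScenario() {
        LoopingSource source = new LoopingSource();
        CompletableFuture<Void> completion = startSourceThread(source);
        boolean finishedWithoutCancel = finishesWithin(completion, 200);
        source.cancel();
        boolean finishedAfterCancel = finishesWithin(completion, 2000);
        return new boolean[] {finishedWithoutCancel, finishedAfterCancel};
    }

    public static void main(String[] args) {
        boolean[] result = runScenario();
        System.out.println("finished without cancel: " + result[0]);
        System.out.println("finished after cancel:   " + result[1]);
    }
}
```

Waiting on the completion future alone never returns because nothing tells the loop to stop; once `cancel()` flips the flag, the same wait completes promptly.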

> SourceStreamTask will always hang if the CheckpointedFunction#snapshotState 
> throws an exception.
> 
>
> Key: FLINK-21990
> URL: https://issues.apache.org/jira/browse/FLINK-21990
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.0, 1.12.0
>Reporter: ming li
>Priority: Critical
>
> If the source in {{SourceStreamTask}} implements {{CheckpointedFunction}} and 
> an exception is thrown in the snapshotState method, then the 
> {{SourceStreamTask}} will always hang.
> The main reason is that the checkpoint is executed in the mailbox. When the 
> {{CheckpointedFunction#snapshotState}} of the source throws an exception, 
> {{StreamTask#cleanUpInvoke}} is called, where it waits for the 
> {{LegacySourceFunctionThread}} of the source to end. However, the source 
> thread does not end by itself (ending it is up to the user code), so the 
> {{Task}} hangs at this point, and the JobMaster is not aware of it.
> {code:java}
> protected void cleanUpInvoke() throws Exception {
>     // wait for the end of the source
>     getCompletionFuture().exceptionally(unused -> null).join();
>     // clean up everything we initialized
>     isRunning = false;
>     ...
> }{code}
> I think we should call the cancel method of the source first, and then wait 
> for it to end.
> The following is my test code; the test branch is Flink's master branch.
> {code:java}
> @Test
> public void testSourceFailure() throws Exception {
>     final StreamExecutionEnvironment env =
>             StreamExecutionEnvironment.getExecutionEnvironment();
>     env.enableCheckpointing(2000L);
>     env.setRestartStrategy(RestartStrategies.noRestart());
>     env.addSource(new FailedSource()).addSink(new DiscardingSink<>());
>     JobGraph jobGraph = StreamingJobGraphGenerator.createJobGraph(env.getStreamGraph());
>     try {
>         // assert that the job executes only one checkpoint and fails only once
>         TestUtils.submitJobAndWaitForResult(
>                 cluster.getClusterClient(), jobGraph, getClass().getClassLoader());
>     } catch (JobExecutionException jobException) {
>         Optional<FlinkRuntimeException> throwable =
>                 ExceptionUtils.findThrowable(jobException, FlinkRuntimeException.class);
>         Assert.assertTrue(throwable.isPresent());
>         Assert.assertEquals(
>                 CheckpointFailureManager.EXCEEDED_CHECKPOINT_TOLERABLE_FAILURE_MESSAGE,
>                 throwable.get().getMessage());
>     }
>     // assert that the job failed only once
>     Assert.assertEquals(1, StringGeneratingSourceFunction.INITIALIZE_TIMES.get());
> }
>
> private static class FailedSource extends RichParallelSourceFunction<String>
>         implements CheckpointedFunction {
>     // volatile: cancel() is called from a different thread than run()
>     private transient volatile boolean running;
>
>     @Override
>     public void open(Configuration parameters) throws Exception {
>         running = true;
>     }
>
>     @Override
>     public void run(SourceContext<String> ctx) throws Exception {
>         while (running) {
>             ctx.collect("test");
>         }
>     }
>
>     @Override
>     public void cancel() {
>         running = false;
>     }
>
>     @Override
>     public void snapshotState(FunctionSnapshotContext context) throws Exception {
>         // always fail the snapshot to trigger the hang described above
>         throw new RuntimeException("source failed");
>     }
>
>     @Override
>     public void initializeState(FunctionInitializationContext context) throws Exception {}
> }
> {code}





[jira] [Commented] (FLINK-18356) Exit code 137 returned from process when testing pyflink

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310383#comment-17310383
 ] 

Guowei Ma commented on FLINK-18356:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15554=logs=34f41360-6c0d-54d3-11a1-0292a2def1d9=2d56e022-1ace-542f-bf1a-b37dd63243f2=9452

> Exit code 137 returned from process when testing pyflink
> 
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python, Build System / Azure Pipelines
>Affects Versions: 1.12.0
>Reporter: Piotr Nowojski
>Priority: Major
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> {noformat}
> = test session starts ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  1%]
> pyflink/common/tests/test_execution_config.py ...[  5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', arguments 'exec -i -u 1002 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3





[GitHub] [flink] flinkbot edited a comment on pull request #15403: [hotfix] fix typo, assign processedData to currentProcessedData rather than currentPersistedData.

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15403:
URL: https://github.com/apache/flink/pull/15403#issuecomment-809045211


   
   ## CI report:
   
   * d87a60142ebdb3eef911d51391679b90f4acaa6a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15645)
 
   
   




[GitHub] [flink] flinkbot commented on pull request #15404: [hotfix][docs] Remove redundant import

2021-03-28 Thread GitBox


flinkbot commented on pull request #15404:
URL: https://github.com/apache/flink/pull/15404#issuecomment-809050239


   
   ## CI report:
   
   * b1c186c4ac9432c1bda5474b5a1d830062195e71 UNKNOWN
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #15402: [FLINK-21985][table-api] Support Explain Query/Modifcation syntax in Calcite Parser

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15402:
URL: https://github.com/apache/flink/pull/15402#issuecomment-809045142


   
   ## CI report:
   
   * 97ba64b2959826a53be3273d9b565b3fe8f96fdc Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15644)
 
   
   




[jira] [Commented] (FLINK-18356) Exit code 137 returned from process when testing pyflink

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310380#comment-17310380
 ] 

Guowei Ma commented on FLINK-18356:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1=logs=fc5181b0-e452-5c8f-68de-1097947f6483=62110053-334f-5295-a0ab-80dd7e2babbf=24506

> Exit code 137 returned from process when testing pyflink
> 
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python, Build System / Azure Pipelines
>Affects Versions: 1.12.0
>Reporter: Piotr Nowojski
>Priority: Major
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3
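
For context on the number in the title: by the common shell convention, a process terminated by signal N is reported with exit code 128 + N, so 137 corresponds to signal 9 (SIGKILL). The log above does not say who sent the signal; an out-of-memory kill of the container is one frequent cause, but that is only an assumption here. A minimal sketch of the arithmetic follows; the `ExitCodes` class and its method names are made up for illustration and are not part of Flink or the CI tooling.

```java
/** Hypothetical helper that decodes shell-style exit codes (128 + signal number). */
final class ExitCodes {
    private ExitCodes() {}

    /** Returns the signal number encoded in the exit code, or -1 if none is encoded. */
    static int signalOf(int exitCode) {
        return (exitCode > 128 && exitCode < 256) ? exitCode - 128 : -1;
    }

    /** Produces a short human-readable description of the exit code. */
    static String describe(int exitCode) {
        int signal = signalOf(exitCode);
        if (signal < 0) {
            return "exit code " + exitCode + " (no signal encoded)";
        }
        String name = (signal == 9) ? "SIGKILL" : "signal " + signal;
        return "exit code " + exitCode + ": terminated by " + name;
    }

    public static void main(String[] args) {
        // 137 - 128 = 9, i.e. the process was killed rather than exiting on its own.
        System.out.println(describe(137));
    }
}
```

Because SIGKILL cannot be caught, the pytest process gets no chance to print a failure summary, which would be consistent with the log stopping in the middle of a test file.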



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21103) E2e tests time out on azure

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310377#comment-17310377
 ] 

Guowei Ma commented on FLINK-21103:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15575=results

> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12377=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529
> {code}
> Creating worker2 ... done
> Jan 22 13:16:17 Waiting for hadoop cluster to come up. We have been trying 
> for 0 seconds, retrying ...
> Jan 22 13:16:22 Waiting for hadoop cluster to come up. We have been trying 
> for 5 seconds, retrying ...
> Jan 22 13:16:27 Waiting for hadoop cluster to come up. We have been trying 
> for 10 seconds, retrying ...
> Jan 22 13:16:32 Waiting for hadoop cluster to come up. We have been trying 
> for 15 seconds, retrying ...
> Jan 22 13:16:37 Waiting for hadoop cluster to come up. We have been trying 
> for 20 seconds, retrying ...
> Jan 22 13:16:43 Waiting for hadoop cluster to come up. We have been trying 
> for 26 seconds, retrying ...
> Jan 22 13:16:48 Waiting for hadoop cluster to come up. We have been trying 
> for 31 seconds, retrying ...
> Jan 22 13:16:53 Waiting for hadoop cluster to come up. We have been trying 
> for 36 seconds, retrying ...
> Jan 22 13:16:58 Waiting for hadoop cluster to come up. We have been trying 
> for 41 seconds, retrying ...
> Jan 22 13:17:03 Waiting for hadoop cluster to come up. We have been trying 
> for 46 seconds, retrying ...
> Jan 22 13:17:08 We only have 0 NodeManagers up. We have been trying for 0 
> seconds, retrying ...
> 21/01/22 13:17:10 INFO client.RMProxy: Connecting to ResourceManager at 
> master.docker-hadoop-cluster-network/172.19.0.3:8032
> 21/01/22 13:17:11 INFO client.AHSProxy: Connecting to Application History 
> server at master.docker-hadoop-cluster-network/172.19.0.3:10200
> Jan 22 13:17:11 We now have 2 NodeManagers up.
> ==
> === WARNING: This E2E Run took already 80% of the allocated time budget of 
> 250 minutes ===
> ==
> ==
> === WARNING: This E2E Run will time out in the next few minutes. Starting to 
> upload the log output ===
> ==
> ##[error]The task has timed out.
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.0' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.1' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.2' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Finishing: Run e2e tests
> {code}





[GitHub] [flink] YikSanChan commented on pull request #15360: [FLINK-21938][docs] Add how to unit test python udfs

2021-03-28 Thread GitBox


YikSanChan commented on pull request #15360:
URL: https://github.com/apache/flink/pull/15360#issuecomment-809048667


   @rmetzger Hi Robert, is there anything I need to do here?






[jira] [Commented] (FLINK-21954) Test failures occur due to the test not waiting for the ExecutionGraph to be created

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310374#comment-17310374
 ] 

Guowei Ma commented on FLINK-21954:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15573=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7030a106-e977-5851-a05e-535de648c9c9=8502

> Test failures occur due to the test not waiting for the ExecutionGraph to be 
> created
> 
>
> Key: FLINK-21954
> URL: https://issues.apache.org/jira/browse/FLINK-21954
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Matthias
>Priority: Major
>  Labels: test-stability
>
> Various tests are failing because they do not wait for the ExecutionGraph to 
> be created:
> * 
> [JobMasterTest.testRestoringFromSavepoint|https://dev.azure.com/mapohl/flink/_build/results?buildId=356=logs=243b38e1-22e7-598a-c8ae-385dce2c28b5=fea482b6-4f61-51f4-2584-f73df532b395=8266]
> * {{JobMasterTest.testRequestNextInputSplitWithGlobalFailover}}
> * {{JobMasterTest.testRequestNextInputSplitWithLocalFailover}} (also fails 
> due to FLINK-21450)
> * {{JobMasterQueryableStateTest.testRequestKvStateOfWrongJob}}
> * {{JobMasterQueryableStateTest.testRequestKvStateWithIrrelevantRegistration}}
> * {{JobMasterQueryableStateTest.testDuplicatedKvStateRegistrationsFailTask}}
> * {{JobMasterQueryableStateTest.testRegisterKvState}}
> We might have to double-check whether other tests are affected as well.
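
The failure mode described above (asserting before an asynchronously created object exists) is usually addressed by polling for the precondition with a deadline. The following is a minimal sketch under that assumption; `ConditionWaiter`, `waitUntil`, and the `executionGraph` reference are hypothetical names for illustration and do not reflect the actual fix in the affected tests.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper: wait for a condition instead of asserting immediately. */
final class ConditionWaiter {
    private ConditionWaiter() {}

    /** Returns true once the condition holds, or false if the timeout elapses or the thread is interrupted. */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for an object that is created asynchronously after job submission.
        AtomicReference<String> executionGraph = new AtomicReference<>();
        Thread creator = new Thread(() -> executionGraph.set("created"));
        creator.setDaemon(true);
        creator.start();

        boolean ready = waitUntil(
                () -> executionGraph.get() != null, Duration.ofSeconds(5), Duration.ofMillis(10));
        System.out.println("ready = " + ready);
    }
}
```

A test would call the helper before issuing requests that need the ExecutionGraph and fail with a clear message if it returns false, rather than racing against the creation.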





[jira] [Commented] (FLINK-21659) Running HA per-job cluster (rocks, incremental) end-to-end test fails

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310373#comment-17310373
 ] 

Guowei Ma commented on FLINK-21659:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15573=logs=4dd4dbdd-1802-5eb7-a518-6acd9d24d0fc=8d6b4dd3-4ca1-5611-1743-57a7d76b395a=1733


> Running HA per-job cluster (rocks, incremental) end-to-end test fails
> -
>
> Key: FLINK-21659
> URL: https://issues.apache.org/jira/browse/FLINK-21659
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Critical
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=14232=logs=4dd4dbdd-1802-5eb7-a518-6acd9d24d0fc=8d6b4dd3-4ca1-5611-1743-57a7d76b395a
> It seems that deploying tasks to TaskManager0 failed, which caused 
> the checkpoint to fail.
> {code:java}
> java.util.concurrent.CompletionException: 
> java.util.concurrent.TimeoutException: Invocation of public abstract 
> java.util.concurrent.CompletableFuture 
> org.apache.flink.runtime.taskexecutor.TaskExecutorGateway.submitTask(org.apache.flink.runtime.deployment.TaskDeploymentDescriptor,org.apache.flink.runtime.jobmaster.JobMasterId,org.apache.flink.api.common.time.Time)
>  timed out.
> at 
> java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
>  ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
>  ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:925) 
> ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:913)
>  ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
>  ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
>  ~[?:1.8.0_282]
> at 
> org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:234)
>  ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
>  ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
>  ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
>  ~[?:1.8.0_282]
> at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
>  ~[?:1.8.0_282]
> at 
> org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1064)
>  ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at akka.dispatch.OnComplete.internal(Future.scala:263) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at akka.dispatch.OnComplete.internal(Future.scala:261) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at 
> org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73)
>  ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at 
> akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:644) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:205) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at 
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
>  ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at 
> scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at 
> scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599) 
> ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
> at 
> akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:328)
>  

[GitHub] [flink] flinkbot commented on pull request #15404: [hotfix][docs] Remove redundant import

2021-03-28 Thread GitBox


flinkbot commented on pull request #15404:
URL: https://github.com/apache/flink/pull/15404#issuecomment-809047367


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit b1c186c4ac9432c1bda5474b5a1d830062195e71 (Mon Mar 29 
03:54:18 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   






[jira] [Commented] (FLINK-21990) SourceStreamTask will always hang if the CheckpointedFunction#snapshotState throws an exception.

2021-03-28 Thread Kezhu Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310372#comment-17310372
 ] 

Kezhu Wang commented on FLINK-21990:


In principle, this should already be fixed in 
[1.12.2|https://github.com/apache/flink/blob/release-1.12.2/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/SourceStreamTask.java#L175]
 and 
[master|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/SourceStreamTask.java#L181].
 However, I did run into the hang with a modified version of the test case. Either 
that run does not touch any blocking operations, or the interruption state was 
swallowed somewhere (FLINK-21186?). I think the fix is worth a test to gain 
confidence in it.

cc [~AHeise]

> SourceStreamTask will always hang if the CheckpointedFunction#snapshotState 
> throws an exception.
> 
>
> Key: FLINK-21990
> URL: https://issues.apache.org/jira/browse/FLINK-21990
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.0, 1.12.0
>Reporter: ming li
>Priority: Critical
>
> If the source in {{SourceStreamTask}} implements {{CheckpointedFunction}} and 
> an exception is thrown in the {{snapshotState}} method, the 
> {{SourceStreamTask}} always hangs.
> The main reason is that the checkpoint is executed in the mailbox. When 
> {{CheckpointedFunction#snapshotState}} of the source throws an exception, 
> {{StreamTask#cleanUpInvoke}} is called, which waits for the source's 
> {{LegacySourceFunctionThread}} to finish. However, the source thread does not 
> end by itself (the user has to stop it), so the {{Task}} hangs at this point, 
> and the JobMaster is not aware of it.
> {code:java}
> protected void cleanUpInvoke() throws Exception {
>     // wait for the end of the source
>     getCompletionFuture().exceptionally(unused -> null).join();
>     // clean up everything we initialized
>     isRunning = false;
>     ...
> }{code}
> I think we should call the cancel method of the source first, and then wait 
> for the end.
> The following is my test code, the test branch is Flink's master branch.
> {code:java}
> @Test
> public void testSourceFailure() throws Exception {
>     final StreamExecutionEnvironment env =
>             StreamExecutionEnvironment.getExecutionEnvironment();
>     env.enableCheckpointing(2000L);
>     env.setRestartStrategy(RestartStrategies.noRestart());
>     env.addSource(new FailedSource()).addSink(new DiscardingSink<>());
>     JobGraph jobGraph =
>             StreamingJobGraphGenerator.createJobGraph(env.getStreamGraph());
>     try {
>         // assert that the job only executes the checkpoint once and only fails once.
>         TestUtils.submitJobAndWaitForResult(
>                 cluster.getClusterClient(), jobGraph, getClass().getClassLoader());
>     } catch (JobExecutionException jobException) {
>         Optional<FlinkRuntimeException> throwable =
>                 ExceptionUtils.findThrowable(jobException, FlinkRuntimeException.class);
>         Assert.assertTrue(throwable.isPresent());
>         Assert.assertEquals(
>                 CheckpointFailureManager.EXCEEDED_CHECKPOINT_TOLERABLE_FAILURE_MESSAGE,
>                 throwable.get().getMessage());
>     }
>     // assert that the job only failed once.
>     Assert.assertEquals(1, StringGeneratingSourceFunction.INITIALIZE_TIMES.get());
> }
>
> private static class FailedSource extends RichParallelSourceFunction<String>
>         implements CheckpointedFunction {
>     private transient boolean running;
>
>     @Override
>     public void open(Configuration parameters) throws Exception {
>         running = true;
>     }
>
>     @Override
>     public void run(SourceContext<String> ctx) throws Exception {
>         while (running) {
>             ctx.collect("test");
>         }
>     }
>
>     @Override
>     public void cancel() {
>         running = false;
>     }
>
>     @Override
>     public void snapshotState(FunctionSnapshotContext context) throws Exception {
>         throw new RuntimeException("source failed");
>     }
>
>     @Override
>     public void initializeState(FunctionInitializationContext context) throws Exception {}
> }
> {code}
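
The proposal in the report (cancel the source first, then wait for it to end) can be illustrated without any Flink classes. The sketch below is a plain-Java stand-in, not the actual `StreamTask` code: `LoopingSource`, `cleanUpWaitOnly`, and `cleanUpCancelFirst` are hypothetical names, and the timeouts exist only so that the demonstration terminates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for a legacy source that runs on its own thread until cancelled. */
final class LoopingSource {
    private volatile boolean running = true;
    private final CompletableFuture<Void> completion = new CompletableFuture<>();

    /** Starts the source loop on a daemon thread and returns immediately. */
    void start() {
        Thread thread = new Thread(() -> {
            while (running) {
                Thread.yield(); // stands in for ctx.collect(...)
            }
            completion.complete(null);
        }, "source-thread");
        thread.setDaemon(true);
        thread.start();
    }

    void cancel() {
        running = false;
    }

    /** Cleanup as described in the report: wait for the source without cancelling it. */
    boolean cleanUpWaitOnly(long timeoutMillis) {
        return awaitCompletion(timeoutMillis);
    }

    /** Cleanup as proposed in the report: cancel the source first, then wait. */
    boolean cleanUpCancelFirst(long timeoutMillis) {
        cancel();
        return awaitCompletion(timeoutMillis);
    }

    /** Returns true if the source thread finished within the timeout. */
    private boolean awaitCompletion(long timeoutMillis) {
        try {
            completion.exceptionally(unused -> null).get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        LoopingSource stuck = new LoopingSource();
        stuck.start();
        System.out.println("wait only finished:    " + stuck.cleanUpWaitOnly(200));

        LoopingSource fixed = new LoopingSource();
        fixed.start();
        System.out.println("cancel first finished: " + fixed.cleanUpCancelFirst(5000));
    }
}
```

In the wait-only variant nothing ever flips the `running` flag, so an unbounded `join()` would block forever, which matches the hang described in the report. Whether cancelling or interrupting is the right mechanism inside Flink is a design question for the actual fix.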





[GitHub] [flink] YikSanChan opened a new pull request #15404: [hotfix][docs] Remove redundant import

2021-03-28 Thread GitBox


YikSanChan opened a new pull request #15404:
URL: https://github.com/apache/flink/pull/15404


   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't 
know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   






[jira] [Commented] (FLINK-21981) Increase the priority of the parameter in flink-conf

2021-03-28 Thread Paul Lin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310371#comment-17310371
 ] 

Paul Lin commented on FLINK-21981:
--

[~Bo Cui] Just to confirm, did you try exporting `HADOOP_CONF_DIR` before 
running any Flink commands? That should work for multi-tenant scenarios. 
Environment variables are more flexible and environment-specific, and thus 
should take priority over static configuration.

> Increase the priority of the parameter in flink-conf
> 
>
> Key: FLINK-21981
> URL: https://issues.apache.org/jira/browse/FLINK-21981
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Reporter: Bo Cui
>Priority: Major
>
> In my cluster, the environment has HADOOP_CONF_DIR and YARN_CONF_DIR set, but 
> they point to the Hadoop and YARN configuration, not to the one for Flink. The 
> Hadoop configuration of Flink (env.hadoop.conf.dir) is different from them, so 
> we should use `env.hadoop.conf.dir` first.
> https://github.com/apache/flink/blob/57e93c90a14a9f5316da821863dd2b335a8f86e0/flink-dist/src/main/flink-bin/bin/config.sh#L256
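
The two positions in this thread differ only in which source is consulted first. The sketch below spells out both orders; it is a hypothetical Java illustration (the real lookup lives in the `config.sh` shell script linked above), and `HadoopConfDirResolver`, `Precedence`, and the example paths are assumptions made for this example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

/** Hypothetical illustration of the two precedence orders discussed in this issue. */
final class HadoopConfDirResolver {
    enum Precedence { ENV_FIRST, FLINK_CONF_FIRST }

    private HadoopConfDirResolver() {}

    /** Picks the Hadoop configuration directory from the environment or the Flink configuration. */
    static Optional<String> resolve(
            Map<String, String> env, Map<String, String> flinkConf, Precedence precedence) {
        Optional<String> fromEnv = nonEmpty(env.get("HADOOP_CONF_DIR"));
        Optional<String> fromConf = nonEmpty(flinkConf.get("env.hadoop.conf.dir"));
        if (precedence == Precedence.ENV_FIRST) {
            return fromEnv.isPresent() ? fromEnv : fromConf;
        }
        return fromConf.isPresent() ? fromConf : fromEnv;
    }

    /** Treats null and blank values as absent. */
    private static Optional<String> nonEmpty(String value) {
        return Optional.ofNullable(value).filter(v -> !v.trim().isEmpty());
    }

    public static void main(String[] args) {
        Map<String, String> env = Collections.singletonMap("HADOOP_CONF_DIR", "/etc/hadoop/conf");
        Map<String, String> conf =
                Collections.singletonMap("env.hadoop.conf.dir", "/opt/flink/hadoop-conf");

        // Order argued for in the comment: the environment variable wins.
        System.out.println(resolve(env, conf, Precedence.ENV_FIRST));
        // Order requested by the reporter: the flink-conf entry wins.
        System.out.println(resolve(env, conf, Precedence.FLINK_CONF_FIRST));
    }
}
```

With the first order, a user who exports `HADOOP_CONF_DIR` per session can override the static file; with the second, a cluster-wide variable cannot shadow the Flink-specific setting. Which trade-off is preferable is exactly what the thread is debating.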





[jira] [Commented] (FLINK-21103) E2e tests time out on azure

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310370#comment-17310370
 ] 

Guowei Ma commented on FLINK-21103:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15573=results

> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>





[jira] [Commented] (FLINK-22005) SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1)

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310368#comment-17310368
 ] 

Guowei Ma commented on FLINK-22005:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15580=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529=48499

> SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1) 
> 
>
> Key: FLINK-22005
> URL: https://issues.apache.org/jira/browse/FLINK-22005
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
>
> The test fails because it waits for Elasticsearch records indefinitely.
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15583=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529=19826





[jira] [Commented] (FLINK-22005) SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1)

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310366#comment-17310366
 ] 

Guowei Ma commented on FLINK-22005:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15589=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529=58087

> SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1) 
> 
>
> Key: FLINK-22005
> URL: https://issues.apache.org/jira/browse/FLINK-22005
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
>
> The test fails because it waits for Elasticsearch records indefinitely.
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15583=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529=19826





[jira] [Comment Edited] (FLINK-21103) E2e tests time out on azure

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307542#comment-17307542
 ] 

Guowei Ma edited comment on FLINK-21103 at 3/29/21, 3:47 AM:
-

on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results



was (Author: maguowei):
on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15589=results


> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] fsk119 commented on a change in pull request #15402: [FLINK-21985][table-api] Support Explain Query/Modifcation syntax in Calcite Parser

2021-03-28 Thread GitBox


fsk119 commented on a change in pull request #15402:
URL: https://github.com/apache/flink/pull/15402#discussion_r602994006



##
File path: 
flink-table/flink-sql-parser-hive/src/main/codegen/includes/parserImpls.ftl
##
@@ -1605,3 +1605,17 @@ SqlShowModules SqlShowModules() :
 return new SqlShowModules(startPos.plus(getPos()), requireFull);
 }
 }
+
+/**
+* Parses a explain module statement.
+*/
+SqlNode SqlRichExplain() :
+{
+SqlNode stmt;
+}
+{
+ ( )*
+stmt = SqlQueryOrDml() {
+return new SqlRichExplain(getPos(),stmt);
+}
+}

Review comment:
   Add a new line

##
File path: 
flink-table/flink-sql-parser-hive/src/test/java/org/apache/flink/sql/parser/hive/FlinkHiveSqlParserImplTest.java
##
@@ -482,4 +482,54 @@ public void testShowModules() {
 
 sql("show full modules").ok("SHOW FULL MODULES");
 }
+
+@Test
+public void testExplain() {
+String sql = "explain plan for select * from emps";
+String expected = "EXPLAIN SELECT *\n" + "FROM `EMPS`";
+this.sql(sql).ok(expected);
+}
+
+@Test
+public void testExplainJsonFormat() {
+// unsupport testExplainJsonFormat now
+}
+
+@Test
+public void testExplainWithImpl() {
+// unsupport testExplainWithImpl now
+}
+
+@Test
+public void testExplainWithoutImpl() {
+// unsupport testExplainWithoutImpl now
+}
+
+@Test
+public void testExplainWithType() {
+// unsupport testExplainWithType now
+}
+
+@Test
+public void testExplainAsXml() {
+// unsupport testExplainWithType now
+}
+
+@Test
+public void testExplainAsJson() {
+// unsupport testExplainWithType now
+}

Review comment:
   Add comment: 
   // TODO: FLINK-20562

##
File path: 
flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/calcite/FlinkPlannerImpl.scala
##
@@ -146,6 +146,10 @@ class FlinkPlannerImpl(
   val validated = validator.validate(explain.getExplicandum)
   explain.setOperand(0, validated)
   explain
+case richExplain: SqlRichExplain =>
+  val validated = validator.validate(richExplain.getStatement)
+  richExplain.setOperand(0,validated)
+  richExplain

Review comment:
   Move forward before `SqlExplain`?  Do we really need SqlExplain? 

##
File path: 
flink-table/flink-sql-parser/src/test/java/org/apache/flink/sql/parser/FlinkSqlParserImplTest.java
##
@@ -1215,6 +1214,56 @@ public void testEnd() {
 sql("end").ok("END");
 }
 
+@Test
+public void testExplain() {
+String sql = "explain plan for select * from emps";
+String expected = "EXPLAIN SELECT *\n" + "FROM `EMPS`";
+this.sql(sql).ok(expected);
+}
+
+@Test
+public void testExplainJsonFormat() {
+// unsupport testExplainJsonFormat now
+}
+
+@Test
+public void testExplainWithImpl() {
+// unsupport testExplainWithImpl now
+}
+
+@Test
+public void testExplainWithoutImpl() {
+// unsupport testExplainWithoutImpl now
+}
+
+@Test
+public void testExplainWithType() {
+// unsupport testExplainWithType now
+}
+
+@Test
+public void testExplainAsXml() {
+// unsupport testExplainWithType now
+}
+
+@Test
+public void testExplainAsJson() {
+// unsupport testExplainWithType now

Review comment:
   ditto

##
File path: 
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/calcite/FlinkPlannerImpl.scala
##
@@ -18,28 +18,26 @@
 
 package org.apache.flink.table.calcite
 
-import org.apache.flink.sql.parser.ExtendedSqlNode
-import org.apache.flink.sql.parser.dql.{SqlRichDescribeTable, SqlShowCatalogs, 
SqlShowCurrentCatalog, SqlShowCurrentDatabase, SqlShowDatabases, 
SqlShowFunctions, SqlShowTables, SqlShowViews}
-import org.apache.flink.table.api.{TableException, ValidationException}
-import org.apache.flink.table.catalog.CatalogReader
-import org.apache.flink.table.parse.CalciteParser
+import _root_.java.lang.{Boolean => JBoolean}
+import _root_.java.util
+import _root_.java.util.function.{Function => JFunction}
 
 import org.apache.calcite.plan.RelOptTable.ViewExpander
 import org.apache.calcite.plan._
 import org.apache.calcite.rel.RelRoot
 import org.apache.calcite.rel.`type`.RelDataType
 import org.apache.calcite.rex.RexBuilder
-import org.apache.calcite.sql.advise.{SqlAdvisor, SqlAdvisorValidator}
+import org.apache.calcite.sql.advise.SqlAdvisorValidator
 import org.apache.calcite.sql.validate.SqlValidator
 import org.apache.calcite.sql.{SqlExplain, SqlKind, SqlNode, SqlOperatorTable}
 import org.apache.calcite.sql2rel.{SqlRexConvertletTable, SqlToRelConverter}
 import org.apache.calcite.tools.{FrameworkConfig, RelConversionException}
+import org.apache.flink.sql.parser.ExtendedSqlNode
+import 

[jira] [Comment Edited] (FLINK-21103) E2e tests time out on azure

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307542#comment-17307542
 ] 

Guowei Ma edited comment on FLINK-21103 at 3/29/21, 3:45 AM:
-

on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15589=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15584=results



was (Author: maguowei):
on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15589=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15584=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15583=results

> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12377=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529
> {code}
> Creating worker2 ... done
> Jan 22 13:16:17 Waiting for hadoop cluster to come up. We have been trying 
> for 0 seconds, retrying ...
> Jan 22 13:16:22 Waiting for hadoop cluster to come up. We have been trying 
> for 5 seconds, retrying ...
> Jan 22 13:16:27 Waiting for hadoop cluster to come up. We have been trying 
> for 10 seconds, retrying ...
> Jan 22 13:16:32 Waiting for hadoop cluster to come up. We have been trying 
> for 15 seconds, retrying ...
> Jan 22 13:16:37 Waiting for hadoop cluster to come up. We have been trying 
> for 20 seconds, retrying ...
> Jan 22 13:16:43 Waiting for hadoop cluster to come up. We have been trying 
> for 26 seconds, retrying ...
> Jan 22 13:16:48 Waiting for hadoop cluster to come up. We have been trying 
> for 31 seconds, retrying ...
> Jan 22 13:16:53 Waiting for hadoop cluster to come up. We have been trying 
> for 36 seconds, retrying ...
> Jan 22 13:16:58 Waiting for hadoop cluster to come up. We have been trying 
> for 41 seconds, retrying ...
> Jan 22 13:17:03 Waiting for hadoop cluster to come up. We have been trying 
> for 46 seconds, retrying ...
> Jan 22 13:17:08 We only have 0 NodeManagers up. We have been trying for 0 
> seconds, retrying ...
> 21/01/22 13:17:10 INFO client.RMProxy: Connecting to ResourceManager at 
> master.docker-hadoop-cluster-network/172.19.0.3:8032
> 21/01/22 13:17:11 INFO client.AHSProxy: Connecting to Application History 
> server at master.docker-hadoop-cluster-network/172.19.0.3:10200
> Jan 22 13:17:11 We now have 2 NodeManagers up.
> ==
> === WARNING: This E2E Run took already 80% of the allocated time budget of 
> 250 minutes ===
> ==
> ==
> === WARNING: This E2E Run will time out in the next few minutes. Starting to 
> upload the log output ===
> ==
> ##[error]The task has timed out.
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.0' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.1' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.2' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Finishing: Run e2e tests
> {code}





[GitHub] [flink] flinkbot commented on pull request #15403: [hotfix] fix typo, assign processedData to currentProcessedData rather than currentPersistedData.

2021-03-28 Thread GitBox


flinkbot commented on pull request #15403:
URL: https://github.com/apache/flink/pull/15403#issuecomment-809045211


   
   ## CI report:
   
   * d87a60142ebdb3eef911d51391679b90f4acaa6a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Comment Edited] (FLINK-21103) E2e tests time out on azure

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307542#comment-17307542
 ] 

Guowei Ma edited comment on FLINK-21103 at 3/29/21, 3:45 AM:
-

on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15589=results



was (Author: maguowei):
on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15589=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15584=results


> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12377=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529





[GitHub] [flink] flinkbot commented on pull request #15402: [FLINK-21985][table-api] Support Explain Query/Modifcation syntax in Calcite Parser

2021-03-28 Thread GitBox


flinkbot commented on pull request #15402:
URL: https://github.com/apache/flink/pull/15402#issuecomment-809045142


   
   ## CI report:
   
   * 97ba64b2959826a53be3273d9b565b3fe8f96fdc UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #15401: [FLINK-21969][python] Invoke finish bundle method before emitting the max timestamp watermark in PythonTimestampsAndWatermarksOperato

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15401:
URL: https://github.com/apache/flink/pull/15401#issuecomment-809036806


   
   ## CI report:
   
   * 7d4a73b2d067dad06ae36fbf2750a1018df4338c Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15643)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Commented] (FLINK-22005) SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1)

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310365#comment-17310365
 ] 

Guowei Ma commented on FLINK-22005:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15583=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529=19760

> SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1) 
> 
>
> Key: FLINK-22005
> URL: https://issues.apache.org/jira/browse/FLINK-22005
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
>
> The test fails because it waits indefinitely for Elasticsearch records.
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15583=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529=19826





[GitHub] [flink] flinkbot edited a comment on pull request #15016: [FLINK-21413][state] Clean TtlMapState and TtlListState after all elements are expired

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15016:
URL: https://github.com/apache/flink/pull/15016#issuecomment-785640058


   
   ## CI report:
   
   * cf3c223658f37cc838409b690e2e1bf93c04f207 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15641)
 
   * 5d95fe96d132b396f3d59f2c102d63a68ccd0d73 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #15054: [FLINK-13550][runtime][ui] Operator's Flame Graph

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15054:
URL: https://github.com/apache/flink/pull/15054#issuecomment-788337524


   
   ## CI report:
   
   * 26a28f2d83f56cb386e1365fd4df4fb8a2f2bf86 UNKNOWN
   * 0b5aaf42f2861a38e26a80e25cf2324e7cf06bb7 UNKNOWN
   * 7717d7719f6b448084ff27365c2b718d2c059376 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15640)
 
   * e7f59e2e2811c0718a453572e90a4ad2a900ff03 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15642)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Commented] (FLINK-21981) Increase the priority of the parameter in flink-conf

2021-03-28 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310364#comment-17310364
 ] 

Xintong Song commented on FLINK-21981:
--

{quote}and HADOOP_CONF_DIR=default hdfs-site, and HADOOP_CONF_DIR cannot be 
modified in multi-tenant scenarios.
{quote}
Why is that? You should be able to set the environment variable only for the 
specific command, instead of exporting it globally.
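
A minimal sketch of scoping the variable to a single command, assuming a POSIX `sh` is available. The class name, the helper name and the `/opt/flink/hadoop-conf` path are hypothetical and not taken from Flink; the sketch only shows that an override placed in a child process environment does not leak into the parent environment.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class PerCommandEnvSketch {

    /** Runs one command with HADOOP_CONF_DIR set only in that child's environment. */
    static String runWithHadoopConfDir(String confDir, List<String> command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        // environment() starts as a copy of the parent environment;
        // modifying it affects only the process started by this builder.
        builder.environment().put("HADOOP_CONF_DIR", confDir);
        builder.redirectErrorStream(true);
        try {
            Process process = builder.start();
            String output =
                    new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            process.waitFor();
            return output.trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical path, used only for illustration.
        String seenByChild = runWithHadoopConfDir(
                "/opt/flink/hadoop-conf", List.of("sh", "-c", "echo \"$HADOOP_CONF_DIR\""));
        System.out.println(seenByChild);
    }
}
```

In a shell, the same effect is usually achieved by prefixing the variable assignment to the command instead of exporting it.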

> Increase the priority of the parameter in flink-conf
> 
>
> Key: FLINK-21981
> URL: https://issues.apache.org/jira/browse/FLINK-21981
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Reporter: Bo Cui
>Priority: Major
>
> In my cluster, the environment sets HADOOP_CONF_DIR and YARN_CONF_DIR, but 
> they point to the Hadoop and YARN configuration, not the Flink one. The 
> Hadoop configuration for Flink (env.hadoop.conf.dir) is different from them, 
> so we should use `env.hadoop.conf.dir` first.
> https://github.com/apache/flink/blob/57e93c90a14a9f5316da821863dd2b335a8f86e0/flink-dist/src/main/flink-bin/bin/config.sh#L256





[jira] [Created] (FLINK-22005) SQL Client end-to-end test (Old planner) Elasticsearch (v7.5.1)

2021-03-28 Thread Guowei Ma (Jira)
Guowei Ma created FLINK-22005:
-

 Summary: SQL Client end-to-end test (Old planner) Elasticsearch 
(v7.5.1) 
 Key: FLINK-22005
 URL: https://issues.apache.org/jira/browse/FLINK-22005
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.13.0
Reporter: Guowei Ma


The test fails because it waits indefinitely for Elasticsearch records.
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15583=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529=19826






[jira] [Commented] (FLINK-21982) jobs should use client org.apache.hadoop.config on yarn

2021-03-28 Thread Bo Cui (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310358#comment-17310358
 ] 

Bo Cui commented on FLINK-21982:


When a new Configuration is created, it loads core-site via 
classLoader.getResource(String):

https://github.com/apache/hadoop/blob/ea6595d3b68ac462aec0d493718d7a10fbda0b6d/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L788
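
A minimal sketch of why the classpath order matters for such a lookup, using only the JDK. It does not use Hadoop's Configuration class; the class and helper names are hypothetical. When the same resource name exists in several classpath entries, `ClassLoader.getResource(String)` returns the first match in search order.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

final class ClasspathResourceLookupSketch {

    /** Creates a temporary directory that contains a placeholder core-site.xml. */
    static Path dirWithCoreSite(String prefix) {
        try {
            Path dir = Files.createTempDirectory(prefix);
            Files.writeString(dir.resolve("core-site.xml"), "<configuration/>");
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Resolves a resource name against the given directories, searched in order. */
    static URL resolve(String name, Path... classpathDirs) {
        try {
            URL[] urls = new URL[classpathDirs.length];
            for (int i = 0; i < classpathDirs.length; i++) {
                urls[i] = classpathDirs[i].toUri().toURL();
            }
            // Parent is null so only the bootstrap loader and these directories are searched.
            try (URLClassLoader loader = new URLClassLoader(urls, null)) {
                return loader.getResource(name);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path clientConf = dirWithCoreSite("client-conf");
        Path serverConf = dirWithCoreSite("server-conf");
        // The directory listed first wins.
        System.out.println(resolve("core-site.xml", clientConf, serverConf));
        System.out.println(resolve("core-site.xml", serverConf, clientConf));
    }
}
```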

> jobs should use client org.apache.hadoop.config on yarn
> ---
>
> Key: FLINK-21982
> URL: https://issues.apache.org/jira/browse/FLINK-21982
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.13.0
>Reporter: Bo Cui
>Priority: Major
>
> Currently, jobs use the YARN server configuration during execution. 
> I think we should submit the configuration to HDFS, like MR job.xml, etc. 
> The container then fetches the job.xml and creates soft links (hdfs-site, 
> core-site, yarn-site) during container initialization, so the configuration 
> can obtain these resources through classLoader.getResource(String) when it 
> is initialized.





[jira] [Comment Edited] (FLINK-21981) Increase the priority of the parameter in flink-conf

2021-03-28 Thread Bo Cui (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310355#comment-17310355
 ] 

Bo Cui edited comment on FLINK-21981 at 3/29/21, 3:33 AM:
--

The client side has two hdfs-site files (the default hdfs-site and the Flink 
hdfs-site, and they are not the same).

HADOOP_CONF_DIR points to the default hdfs-site, and HADOOP_CONF_DIR cannot be 
modified in multi-tenant scenarios.

So I think preferring `env.hadoop.conf.dir` may be better.
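
A minimal sketch of the proposed precedence, assuming the Flink configuration and the process environment are available as plain maps. The class and method names are hypothetical, and this is not how `config.sh` is implemented; it only illustrates "use `env.hadoop.conf.dir` first, then fall back to HADOOP_CONF_DIR".

```java
import java.util.Map;
import java.util.Optional;

final class HadoopConfDirPrecedenceSketch {

    /** Prefers env.hadoop.conf.dir from the Flink config over the HADOOP_CONF_DIR variable. */
    static Optional<String> resolve(Map<String, String> flinkConf, Map<String, String> env) {
        String fromFlinkConf = flinkConf.get("env.hadoop.conf.dir");
        if (fromFlinkConf != null && !fromFlinkConf.trim().isEmpty()) {
            return Optional.of(fromFlinkConf);
        }
        String fromEnv = env.get("HADOOP_CONF_DIR");
        if (fromEnv != null && !fromEnv.trim().isEmpty()) {
            return Optional.of(fromEnv);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical paths, used only for illustration.
        Map<String, String> flinkConf = Map.of("env.hadoop.conf.dir", "/opt/flink/hadoop-conf");
        Map<String, String> env = Map.of("HADOOP_CONF_DIR", "/etc/hadoop/conf");
        System.out.println(resolve(flinkConf, env).orElse("<none>"));
    }
}
```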


was (Author: bo cui):
The client side has two hdfs-site files (the default hdfs-site and the Flink 
hdfs-site, and they are not the same).

HADOOP_CONF_DIR points to the default hdfs-site, and HADOOP_CONF_DIR cannot be 
modified in multi-tenant scenarios.

So I think preferring `env.hadoop.conf.dir` may be better.

 

 

> Increase the priority of the parameter in flink-conf
> 
>
> Key: FLINK-21981
> URL: https://issues.apache.org/jira/browse/FLINK-21981
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Reporter: Bo Cui
>Priority: Major
>
> In my cluster, the environment sets HADOOP_CONF_DIR and YARN_CONF_DIR, but 
> they point to the Hadoop and YARN configuration, not the Flink one. The 
> Hadoop configuration for Flink (env.hadoop.conf.dir) is different from them, 
> so we should use `env.hadoop.conf.dir` first.
> https://github.com/apache/flink/blob/57e93c90a14a9f5316da821863dd2b335a8f86e0/flink-dist/src/main/flink-bin/bin/config.sh#L256





[jira] [Commented] (FLINK-21981) Increase the priority of the parameter in flink-conf

2021-03-28 Thread Bo Cui (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310355#comment-17310355
 ] 

Bo Cui commented on FLINK-21981:


The client side has two hdfs-site files (the default hdfs-site and the Flink 
hdfs-site, and they are not the same).

HADOOP_CONF_DIR points to the default hdfs-site, and HADOOP_CONF_DIR cannot be 
modified in multi-tenant scenarios.

So I think preferring `env.hadoop.conf.dir` may be better.

 

 

> Increase the priority of the parameter in flink-conf
> 
>
> Key: FLINK-21981
> URL: https://issues.apache.org/jira/browse/FLINK-21981
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Reporter: Bo Cui
>Priority: Major
>
> In my cluster, the environment sets HADOOP_CONF_DIR and YARN_CONF_DIR, but 
> they point to the Hadoop and YARN configuration, not the Flink one. The 
> Hadoop configuration for Flink (env.hadoop.conf.dir) is different from them, 
> so we should use `env.hadoop.conf.dir` first.
> https://github.com/apache/flink/blob/57e93c90a14a9f5316da821863dd2b335a8f86e0/flink-dist/src/main/flink-bin/bin/config.sh#L256





[jira] [Commented] (FLINK-21982) jobs should use client org.apache.hadoop.config on yarn

2021-03-28 Thread Bo Cui (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310354#comment-17310354
 ] 

Bo Cui commented on FLINK-21982:


Thanks [~xintongsong].
 FLINK-16005 only solves part of my problem (YarnApplicationMasterRunner and 
YarnResourceManager...).
 If the job uses the `new Configuration` API, the job and the configuration 
still use the YARN/HDFS server configuration.
{quote}I think we should submit the configuration to HDFS, like MR 
job.xml, etc. The container then fetches the job.xml and creates soft links 
(hdfs-site, core-site, yarn-site) during container initialization, so the 
configuration can obtain these resources through 
classLoader.getResource(String) when it is initialized.
{quote}

> jobs should use client org.apache.hadoop.config on yarn
> ---
>
> Key: FLINK-21982
> URL: https://issues.apache.org/jira/browse/FLINK-21982
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.13.0
>Reporter: Bo Cui
>Priority: Major
>
> Currently, jobs use the YARN server configuration during execution. 
> I think we should submit the configuration to HDFS, like MR job.xml, etc. 
> The container then fetches the job.xml and creates soft links (hdfs-site, 
> core-site, yarn-site) during container initialization, so the configuration 
> can obtain these resources through classLoader.getResource(String) when it 
> is initialized.





[GitHub] [flink] flinkbot commented on pull request #15403: [hotfix] fix typo, assign processedData to currentProcessedData rather than currentPersistedData.

2021-03-28 Thread GitBox


flinkbot commented on pull request #15403:
URL: https://github.com/apache/flink/pull/15403#issuecomment-809039318


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit d87a60142ebdb3eef911d51391679b90f4acaa6a (Mon Mar 29 
03:25:41 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   






[GitHub] [flink] flinkbot commented on pull request #15402: [FLINK-21985][table-api] Support Explain Query/Modifcation syntax in Calcite Parser

2021-03-28 Thread GitBox


flinkbot commented on pull request #15402:
URL: https://github.com/apache/flink/pull/15402#issuecomment-809039335


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 97ba64b2959826a53be3273d9b565b3fe8f96fdc (Mon Mar 29 
03:25:43 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   






[jira] [Commented] (FLINK-21640) Job fails to be submitted in tenant scenario

2021-03-28 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310353#comment-17310353
 ] 

Xintong Song commented on FLINK-21640:
--

[~Bo Cui], [~fly_in_gis],

Obviously, different companies maintain ZooKeeper differently. I think 
users/companies can choose whichever approach they want, and Flink should not 
make assumptions about how ZK is maintained. In that sense, creating the root 
ZNode with global access only to fit some specific ways of maintaining ZK 
does not sound fair to me.

Leaving aside how ZK is maintained, I think it is a common convention in 
access control that, by default, new content should be created with access as 
strict as possible. E.g., by default new files/directories can only be modified 
by the creating user on most Linux/Unix systems.
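
To illustrate the file-system analogy only (this is not Flink or ZooKeeper code), here is a minimal Python sketch. It assumes a POSIX system and the commonly used umask of 022; the helper name `mode_of_new_file` is made up for this example. It creates a file while requesting read/write access for everyone and reports the mode the file actually receives.

```python
import os
import stat
import tempfile


def mode_of_new_file(umask: int = 0o022) -> str:
    """Create a file requesting rw for everyone (0o666) and return the mode
    it actually receives under the given umask (hypothetical helper)."""
    previous = os.umask(umask)  # temporarily install the umask under test
    try:
        with tempfile.TemporaryDirectory() as tmp_dir:
            path = os.path.join(tmp_dir, "example")
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return stat.filemode(os.stat(path).st_mode)
    finally:
        os.umask(previous)  # always restore the caller's umask


print(mode_of_new_file())
```

Under umask 022 the write bits for group and others are masked out, so only the creating user can modify the new file even though broader access was requested.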

> Job fails to be submitted in tenant scenario
> 
>
> Key: FLINK-21640
> URL: https://issues.apache.org/jira/browse/FLINK-21640
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core, Client / Job Submission
>Affects Versions: 1.12.2, 1.13.0
>Reporter: Bo Cui
>Assignee: Bo Cui
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-03-06-09-30-52-410.png, 
> image-2021-03-06-09-34-05-518.png
>
>
> Job fails to be submitted in tenant scenario
>  !image-2021-03-06-09-30-52-410.png! 
> because current user does not have the Znode permission.
>  !image-2021-03-06-09-34-05-518.png! 
> I think the parent znode ACL is anyone.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] est08zw opened a new pull request #15403: [hotfix] fix typo, assign processedData to currentProcessedData rather than currentPersistedData.

2021-03-28 Thread GitBox


est08zw opened a new pull request #15403:
URL: https://github.com/apache/flink/pull/15403


   
   
   ## What is the purpose of the change
   
   fix typo, assign processedData to currentProcessedData rather than 
currentPersistedData.
   
   ## Brief change log
   
   fix typo, assign processedData to currentProcessedData rather than 
currentPersistedData.
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? no
   






[GitHub] [flink] chaozwn closed pull request #15390: [FLINK-21985][table-api] Support Explain Query/Modifcation syntax in Calcite Parser

2021-03-28 Thread GitBox


chaozwn closed pull request #15390:
URL: https://github.com/apache/flink/pull/15390


   






[jira] [Created] (FLINK-22004) Translate Flink Roadmap to Chinese.

2021-03-28 Thread Yuan Mei (Jira)
Yuan Mei created FLINK-22004:


 Summary: Translate Flink Roadmap to Chinese.
 Key: FLINK-22004
 URL: https://issues.apache.org/jira/browse/FLINK-22004
 Project: Flink
  Issue Type: Task
  Components: Documentation
Reporter: Yuan Mei



https://flink.apache.org/roadmap.html





[GitHub] [flink] chaozwn opened a new pull request #15402: [FLINK-21985][table-api] Support Explain Query/Modifcation syntax in Calcite Parser

2021-03-28 Thread GitBox


chaozwn opened a new pull request #15402:
URL: https://github.com/apache/flink/pull/15402


   …Calcite Parser
   
   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't 
know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   






[jira] [Comment Edited] (FLINK-21103) E2e tests time out on azure

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307542#comment-17307542
 ] 

Guowei Ma edited comment on FLINK-21103 at 3/29/21, 3:18 AM:
-

on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15589=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15584=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15583=results


was (Author: maguowei):
on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15589=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15584=results

> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12377=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529
> {code}
> Creating worker2 ... done
> Jan 22 13:16:17 Waiting for hadoop cluster to come up. We have been trying 
> for 0 seconds, retrying ...
> Jan 22 13:16:22 Waiting for hadoop cluster to come up. We have been trying 
> for 5 seconds, retrying ...
> Jan 22 13:16:27 Waiting for hadoop cluster to come up. We have been trying 
> for 10 seconds, retrying ...
> Jan 22 13:16:32 Waiting for hadoop cluster to come up. We have been trying 
> for 15 seconds, retrying ...
> Jan 22 13:16:37 Waiting for hadoop cluster to come up. We have been trying 
> for 20 seconds, retrying ...
> Jan 22 13:16:43 Waiting for hadoop cluster to come up. We have been trying 
> for 26 seconds, retrying ...
> Jan 22 13:16:48 Waiting for hadoop cluster to come up. We have been trying 
> for 31 seconds, retrying ...
> Jan 22 13:16:53 Waiting for hadoop cluster to come up. We have been trying 
> for 36 seconds, retrying ...
> Jan 22 13:16:58 Waiting for hadoop cluster to come up. We have been trying 
> for 41 seconds, retrying ...
> Jan 22 13:17:03 Waiting for hadoop cluster to come up. We have been trying 
> for 46 seconds, retrying ...
> Jan 22 13:17:08 We only have 0 NodeManagers up. We have been trying for 0 
> seconds, retrying ...
> 21/01/22 13:17:10 INFO client.RMProxy: Connecting to ResourceManager at 
> master.docker-hadoop-cluster-network/172.19.0.3:8032
> 21/01/22 13:17:11 INFO client.AHSProxy: Connecting to Application History 
> server at master.docker-hadoop-cluster-network/172.19.0.3:10200
> Jan 22 13:17:11 We now have 2 NodeManagers up.
> ==
> === WARNING: This E2E Run took already 80% of the allocated time budget of 
> 250 minutes ===
> ==
> ==
> === WARNING: This E2E Run will time out in the next few minutes. Starting to 
> upload the log output ===
> ==
> ##[error]The task has timed out.
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.0' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.1' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.2' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Finishing: Run e2e tests
> {code}





[jira] [Comment Edited] (FLINK-21103) E2e tests time out on azure

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307542#comment-17307542
 ] 

Guowei Ma edited comment on FLINK-21103 at 3/29/21, 3:17 AM:
-

on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15589=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15584=results


was (Author: maguowei):
on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15589=results



[GitHub] [flink] flinkbot commented on pull request #15401: [FLINK-21969][python] Invoke finish bundle method before emitting the max timestamp watermark in PythonTimestampsAndWatermarksOperator

2021-03-28 Thread GitBox


flinkbot commented on pull request #15401:
URL: https://github.com/apache/flink/pull/15401#issuecomment-809036806


   
   ## CI report:
   
   * 7d4a73b2d067dad06ae36fbf2750a1018df4338c UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Comment Edited] (FLINK-21103) E2e tests time out on azure

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307542#comment-17307542
 ] 

Guowei Ma edited comment on FLINK-21103 at 3/29/21, 3:16 AM:
-

on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15589=results


was (Author: maguowei):
on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results



[GitHub] [flink] flinkbot edited a comment on pull request #15016: [FLINK-21413][state] Clean TtlMapState and TtlListState after all elements are expired

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15016:
URL: https://github.com/apache/flink/pull/15016#issuecomment-785640058


   
   ## CI report:
   
   * cf3c223658f37cc838409b690e2e1bf93c04f207 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15641)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Commented] (FLINK-16947) ArtifactResolutionException: Could not transfer artifact. Entry [...] has not been leased from this pool

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310349#comment-17310349
 ] 

Guowei Ma commented on FLINK-16947:
---

Just for reporting: on branch 1.12
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15603=logs=51fed01c-4eb0-5511-d479-ed5e8b9a7820=e5682198-9e22-5770-69f6-7551182edea8=6847

> ArtifactResolutionException: Could not transfer artifact.  Entry [...] has 
> not been leased from this pool
> -
>
> Key: FLINK-16947
> URL: https://issues.apache.org/jira/browse/FLINK-16947
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines
>Reporter: Piotr Nowojski
>Assignee: Robert Metzger
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6982=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5
> Build of flink-metrics-availability-test failed with:
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.22.1:test (end-to-end-tests) 
> on project flink-metrics-availability-test: Unable to generate classpath: 
> org.apache.maven.artifact.resolver.ArtifactResolutionException: Could not 
> transfer artifact org.apache.maven.surefire:surefire-grouper:jar:2.22.1 
> from/to google-maven-central 
> (https://maven-central-eu.storage-download.googleapis.com/maven2/): Entry 
> [id:13][route:{s}->https://maven-central-eu.storage-download.googleapis.com:443][state:null]
>  has not been leased from this pool
> [ERROR] org.apache.maven.surefire:surefire-grouper:jar:2.22.1
> [ERROR] 
> [ERROR] from the specified remote repositories:
> [ERROR] google-maven-central 
> (https://maven-central-eu.storage-download.googleapis.com/maven2/, 
> releases=true, snapshots=false),
> [ERROR] apache.snapshots (https://repository.apache.org/snapshots, 
> releases=false, snapshots=true)
> [ERROR] Path to dependency:
> [ERROR] 1) dummy:dummy:jar:1.0
> [ERROR] 2) org.apache.maven.surefire:surefire-junit47:jar:2.22.1
> [ERROR] 3) org.apache.maven.surefire:common-junit48:jar:2.22.1
> [ERROR] 4) org.apache.maven.surefire:surefire-grouper:jar:2.22.1
> [ERROR] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> [ERROR] 
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :flink-metrics-availability-test
> {noformat}





[GitHub] [flink] Jiayi-Liao commented on pull request #15016: [FLINK-21413][state] Clean TtlMapState and TtlListState after all elements are expired

2021-03-28 Thread GitBox


Jiayi-Liao commented on pull request #15016:
URL: https://github.com/apache/flink/pull/15016#issuecomment-809035046


   @carp84 @Myasuka Sorry for the late reply (pretty busy these days). I've 
improved the code based on your comments; please take another look, thanks!






[jira] [Commented] (FLINK-21954) Test failures occur due to the test not waiting for the ExecutionGraph to be created

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310347#comment-17310347
 ] 

Guowei Ma commented on FLINK-21954:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15601=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7030a106-e977-5851-a05e-535de648c9c9=8552

> Test failures occur due to the test not waiting for the ExecutionGraph to be 
> created
> 
>
> Key: FLINK-21954
> URL: https://issues.apache.org/jira/browse/FLINK-21954
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Matthias
>Priority: Major
>  Labels: test-stability
>
> Various tests are failing due to the test not waiting for the ExecutionGraph to 
> be created:
> * 
> [JobMasterTest.testRestoringFromSavepoint|https://dev.azure.com/mapohl/flink/_build/results?buildId=356=logs=243b38e1-22e7-598a-c8ae-385dce2c28b5=fea482b6-4f61-51f4-2584-f73df532b395=8266]
> * {{JobMasterTest.testRequestNextInputSplitWithGlobalFailover}}
> * {{JobMasterTest.testRequestNextInputSplitWithLocalFailover}} (also fails 
> due to FLINK-21450)
> * {{JobMasterQueryableStateTest.testRequestKvStateOfWrongJob}}
> * {{JobMasterQueryableStateTest.testRequestKvStateWithIrrelevantRegistration}}
> * {{JobMasterQueryableStateTest.testDuplicatedKvStateRegistrationsFailTask}}
> * {{JobMasterQueryableStateTest.testRegisterKvState}}
> We might have to double-check whether other tests are affected as well.





[jira] [Created] (FLINK-22003) UnalignedCheckpointITCase fail

2021-03-28 Thread Guowei Ma (Jira)
Guowei Ma created FLINK-22003:
-

 Summary: UnalignedCheckpointITCase fail
 Key: FLINK-22003
 URL: https://issues.apache.org/jira/browse/FLINK-22003
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.13.0
Reporter: Guowei Ma


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15601=logs=119bbba7-f5e3-5e08-e72d-09f1529665de=7dc1f5a9-54e1-502e-8b02-c7df69073cfc=4142


{code:java}
[ERROR] execute[parallel pipeline with remote channels, p = 
5](org.apache.flink.test.checkpointing.UnalignedCheckpointITCase)  Time 
elapsed: 60.018 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 6 
milliseconds
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
at 
java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at 
java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1859)
at 
org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:69)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1839)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1822)
at 
org.apache.flink.test.checkpointing.UnalignedCheckpointTestBase.execute(UnalignedCheckpointTestBase.java:138)
at 
org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.execute(UnalignedCheckpointITCase.java:184)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)

{code}






[jira] [Updated] (FLINK-21853) Running HA per-job cluster (rocks, non-incremental) end-to-end test could not finished in 900 seconds

2021-03-28 Thread Guowei Ma (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guowei Ma updated FLINK-21853:
--
Attachment: flink-vsts-standalonejob-3-fv-az227-245.log.zip

> Running HA per-job cluster (rocks, non-incremental) end-to-end test could not 
> finished in 900 seconds
> -
>
> Key: FLINK-21853
> URL: https://issues.apache.org/jira/browse/FLINK-21853
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination, Runtime / State Backends
>Affects Versions: 1.11.3, 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
> Attachments: flink-vsts-standalonejob-3-fv-az227-245.log.zip
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=14921=logs=91bf6583-3fb2-592f-e4d4-d79d79c3230a=03dbd840-5430-533d-d1a7-05d0ebe03873=7318
> {code:java}
> Waiting for text Completed checkpoint [1-9]* for job 
>  to appear 2 of times in logs...
> grep: 
> /home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT/log/*standalonejob-2*.log:
>  No such file or directory
> grep: 
> /home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT/log/*standalonejob-2*.log:
>  No such file or directory
> grep: 
> /home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT/log/*standalonejob-2*.log:
>  No such file or directory
> Starting standalonejob daemon on host fv-az232-135.
> grep: 
> /home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT/log/*standalonejob-2*.log:
>  No such file or directory
> Killed TM @ 15744
> Killed TM @ 19625
> Test (pid: 9232) did not finish after 900 seconds.
> {code}





[jira] [Commented] (FLINK-21853) Running HA per-job cluster (rocks, non-incremental) end-to-end test could not finished in 900 seconds

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310343#comment-17310343
 ] 

Guowei Ma commented on FLINK-21853:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15601=logs=4dd4dbdd-1802-5eb7-a518-6acd9d24d0fc=8d6b4dd3-4ca1-5611-1743-57a7d76b395a=1675

From the log (flink-vsts-standalonejob-3-fv-az227-245.log), there are failures 
in submitting the task, which might prevent the checkpoint from succeeding in 
time.


{code:java}
2021-03-27 22:25:36,398 WARN  
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] 
- Slot allocation for slot 10.1.0.4:43337-a86692_2 for job 
 failed.
java.util.concurrent.TimeoutException: Invocation of public abstract 
java.util.concurrent.CompletableFuture 
org.apache.flink.runtime.taskexecutor.TaskExecutorGateway.requestSlot(org.apache.flink.runtime.clusterframework.types.SlotID,org.apache.flink.api.common.JobID,org.apache.flink.runtime.clusterframework.types.AllocationID,org.apache.flink.runtime.clusterframework.types.ResourceProfile,java.lang.String,org.apache.flink.runtime.resourcemanager.ResourceManagerId,org.apache.flink.api.common.time.Time)
 timed out.
at 
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager.allocateSlot(DeclarativeSlotManager.java:561)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager.internalTryAllocateSlots(DeclarativeSlotManager.java:511)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager.tryAllocateSlotsForJob(DeclarativeSlotManager.java:477)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager.checkResourceRequirements(DeclarativeSlotManager.java:444)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager.unregisterTaskManager(DeclarativeSlotManager.java:346)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 
org.apache.flink.runtime.resourcemanager.ResourceManager.closeTaskManagerConnection(ResourceManager.java:1078)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 
org.apache.flink.runtime.resourcemanager.ResourceManager$TaskManagerHeartbeatListener.notifyHeartbeatTimeout(ResourceManager.java:1488)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 
org.apache.flink.runtime.heartbeat.HeartbeatMonitorImpl.run(HeartbeatMonitorImpl.java:111)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_282]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_282]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
 ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) 
[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) 
[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) 
[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) 
[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) 
[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) 
[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) 
[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at akka.actor.Actor$class.aroundReceive(Actor.scala:517) 
[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) 
[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) 
[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at akka.actor.ActorCell.invoke(ActorCell.scala:561) 
[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
at 

[GitHub] [flink] flinkbot commented on pull request #15401: [FLINK-21969][python] Invoke finish bundle method before emitting the max timestamp watermark in PythonTimestampsAndWatermarksOperator

2021-03-28 Thread GitBox


flinkbot commented on pull request #15401:
URL: https://github.com/apache/flink/pull/15401#issuecomment-809032576


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 7d4a73b2d067dad06ae36fbf2750a1018df4338c (Mon Mar 29 
03:04:15 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   






[jira] [Updated] (FLINK-21969) PythonTimestampsAndWatermarksOperator emitted the Long.MAX_VALUE watermark before emitting all the data

2021-03-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-21969:
---
Labels: pull-request-available  (was: )

> PythonTimestampsAndWatermarksOperator emitted the Long.MAX_VALUE watermark 
> before emitting all the data
> ---
>
> Key: FLINK-21969
> URL: https://issues.apache.org/jira/browse/FLINK-21969
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.12.2, 1.13.0
>Reporter: Wei Zhong
>Assignee: Wei Zhong
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-03-25-14-17-06-873.png
>
>
> Currently the PythonTimestampsAndWatermarksOperator emits the 
> Long.MAX_VALUE watermark before emitting all the data, so some registered 
> timers cannot be triggered on a bounded stream. We need to fix this.
> !image-2021-03-25-14-17-06-873.png|width=493,height=284!





[GitHub] [flink] WeiZhong94 opened a new pull request #15401: [FLINK-21969][python] Invoke finish bundle method before emitting the max timestamp watermark in PythonTimestampsAndWatermarksOperator

2021-03-28 Thread GitBox


WeiZhong94 opened a new pull request #15401:
URL: https://github.com/apache/flink/pull/15401


   ## What is the purpose of the change
   
   *This pull request invokes finish bundle method before emitting the max 
timestamp watermark in PythonTimestampsAndWatermarksOperator.*
   
   ## Brief change log
   
 - *Invoke finish bundle method before emitting the max timestamp watermark 
in PythonTimestampsAndWatermarksOperator*
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
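
The ordering described in this pull request can be illustrated with a minimal, self-contained sketch. This is not the actual PythonTimestampsAndWatermarksOperator code: the class `BundleFlushOrderSketch` and its methods are hypothetical names, and buffered elements are modeled as plain strings. It only shows the buffered bundle being flushed before the `Long.MAX_VALUE` watermark is emitted at end of input, so that no record follows the final watermark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an operator that buffers elements in bundles.
class BundleFlushOrderSketch {
    static final long MAX_WATERMARK = Long.MAX_VALUE;

    private final List<String> bundle = new ArrayList<>(); // elements not yet emitted
    private final List<String> output = new ArrayList<>(); // what downstream would see

    void processElement(String value) {
        bundle.add(value);
    }

    // Emits every buffered element and clears the buffer.
    void finishBundle() {
        for (String value : bundle) {
            output.add("record:" + value);
        }
        bundle.clear();
    }

    // End of a bounded input: flush first, then emit the final watermark.
    void endInput() {
        finishBundle();
        output.add("watermark:" + MAX_WATERMARK);
    }

    List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        BundleFlushOrderSketch op = new BundleFlushOrderSketch();
        op.processElement("a");
        op.processElement("b");
        op.endInput();
        System.out.println(op.output());
    }
}
```

If `endInput` emitted the watermark before calling `finishBundle`, the two records would arrive after the final watermark, which is the situation FLINK-21969 reports.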
   






[jira] [Updated] (FLINK-21990) SourceStreamTask will always hang if the CheckpointedFunction#snapshotState throws an exception.

2021-03-28 Thread Jiayi Liao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiayi Liao updated FLINK-21990:
---
Priority: Critical  (was: Major)

> SourceStreamTask will always hang if the CheckpointedFunction#snapshotState 
> throws an exception.
> 
>
> Key: FLINK-21990
> URL: https://issues.apache.org/jira/browse/FLINK-21990
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.0, 1.12.0
>Reporter: ming li
>Priority: Critical
>
> If the source in {{SourceStreamTask}} implements {{CheckpointedFunction}} and 
> an exception is thrown in the {{snapshotState}} method, then the 
> {{SourceStreamTask}} will always hang.
> The main reason is that the checkpoint is executed in the mailbox. When the 
> {{CheckpointedFunction#snapshotState}} of the source throws an exception, 
> {{StreamTask#cleanUpInvoke}} is called, where it waits for the 
> {{LegacySourceFunctionThread}} of the source to end. However, the source 
> thread does not end by itself (the user has to control this), so the 
> {{Task}} hangs at this point, and the JobMaster is not aware of it.
> {code:java}
> protected void cleanUpInvoke() throws Exception {
>     // wait for the end of the source
>     getCompletionFuture().exceptionally(unused -> null).join();
>     // clean up everything we initialized
>     isRunning = false;
>     ...
> }
> {code}
> I think we should call the cancel method of the source first, and then wait 
> for the end.
> The following is my test code, run against Flink's master branch.
> {code:java}
> @Test
> public void testSourceFailure() throws Exception {
>     final StreamExecutionEnvironment env =
>             StreamExecutionEnvironment.getExecutionEnvironment();
>     env.enableCheckpointing(2000L);
>     env.setRestartStrategy(RestartStrategies.noRestart());
>     env.addSource(new FailedSource()).addSink(new DiscardingSink<>());
>     JobGraph jobGraph =
>             StreamingJobGraphGenerator.createJobGraph(env.getStreamGraph());
>     try {
>         // assert that the job executes the checkpoint only once and fails only once.
>         TestUtils.submitJobAndWaitForResult(
>                 cluster.getClusterClient(), jobGraph, getClass().getClassLoader());
>     } catch (JobExecutionException jobException) {
>         Optional<FlinkRuntimeException> throwable =
>                 ExceptionUtils.findThrowable(jobException, FlinkRuntimeException.class);
>         Assert.assertTrue(throwable.isPresent());
>         Assert.assertEquals(
>                 CheckpointFailureManager.EXCEEDED_CHECKPOINT_TOLERABLE_FAILURE_MESSAGE,
>                 throwable.get().getMessage());
>     }
>     // assert that the job only failed once.
>     Assert.assertEquals(1, StringGeneratingSourceFunction.INITIALIZE_TIMES.get());
> }
>
> private static class FailedSource extends RichParallelSourceFunction<String>
>         implements CheckpointedFunction {
>     private transient boolean running;
>
>     @Override
>     public void open(Configuration parameters) throws Exception {
>         running = true;
>     }
>
>     @Override
>     public void run(SourceContext<String> ctx) throws Exception {
>         // loops until cancel() is called; never ends by itself
>         while (running) {
>             ctx.collect("test");
>         }
>     }
>
>     @Override
>     public void cancel() {
>         running = false;
>     }
>
>     @Override
>     public void snapshotState(FunctionSnapshotContext context) throws Exception {
>         // always fail the checkpoint to trigger the hang
>         throw new RuntimeException("source failed");
>     }
>
>     @Override
>     public void initializeState(FunctionInitializationContext context) throws Exception {}
> }
> {code}
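
The suggestion in the report (cancel the source first, then wait for it to end) can be sketched without Flink. Everything below is a hypothetical stand-in: `LoopingSource` plays the role of a user source whose `run()` loop ends only after `cancel()`, and `cleanUp` plays the role of the cleanup step. It is a minimal illustration of the ordering under these assumptions, not the actual `StreamTask` code or the fix that was eventually applied.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class CancelBeforeJoinSketch {

    // Hypothetical source: run() returns only after cancel() has been called.
    static class LoopingSource {
        private volatile boolean running = true;

        void run() {
            while (running) {
                Thread.yield(); // a real source would emit records here
            }
        }

        void cancel() {
            running = false;
        }
    }

    // Runs the source on its own thread; the future completes when run() returns.
    static CompletableFuture<Void> startSource(LoopingSource source) {
        CompletableFuture<Void> completion = new CompletableFuture<>();
        Thread thread = new Thread(() -> {
            try {
                source.run();
                completion.complete(null);
            } catch (Throwable t) {
                completion.completeExceptionally(t);
            }
        }, "source-thread");
        thread.setDaemon(true);
        thread.start();
        return completion;
    }

    // Cleanup in the proposed order: cancel first, then wait for the source thread.
    // Returns true if the source ended within the timeout.
    static boolean cleanUp(LoopingSource source, CompletableFuture<Void> completion) {
        source.cancel();
        try {
            completion.exceptionally(unused -> null).get(5, TimeUnit.SECONDS);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException | TimeoutException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        LoopingSource source = new LoopingSource();
        CompletableFuture<Void> completion = startSource(source);
        System.out.println("source ended: " + cleanUp(source, completion));
    }
}
```

Without the `source.cancel()` call, the wait in `cleanUp` would run into the timeout here, and an unbounded `join()` as in the quoted `cleanUpInvoke` would block forever.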





[jira] [Commented] (FLINK-21990) SourceStreamTask will always hang if the CheckpointedFunction#snapshotState throws an exception.

2021-03-28 Thread Jiayi Liao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310341#comment-17310341
 ] 

Jiayi Liao commented on FLINK-21990:


[~pnowojski] We encountered this problem when upgrading Flink from 
1.9 to 1.11. This seems to be a very serious problem and can be reproduced with 
the test case above. Could you spare some time and take a look? 

> SourceStreamTask will always hang if the CheckpointedFunction#snapshotState 
> throws an exception.
> 
>
> Key: FLINK-21990
> URL: https://issues.apache.org/jira/browse/FLINK-21990
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.0, 1.12.0
>Reporter: ming li
>Priority: Major
>





[GitHub] [flink] flinkbot edited a comment on pull request #15054: [FLINK-13550][runtime][ui] Operator's Flame Graph

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15054:
URL: https://github.com/apache/flink/pull/15054#issuecomment-788337524


   
   ## CI report:
   
   * 26a28f2d83f56cb386e1365fd4df4fb8a2f2bf86 UNKNOWN
   * 0b5aaf42f2861a38e26a80e25cf2324e7cf06bb7 UNKNOWN
   * 7717d7719f6b448084ff27365c2b718d2c059376 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15640)
 
   * e7f59e2e2811c0718a453572e90a4ad2a900ff03 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #15016: [FLINK-21413][state] Clean TtlMapState and TtlListState after all elements are expired

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15016:
URL: https://github.com/apache/flink/pull/15016#issuecomment-785640058


   
   ## CI report:
   
   * dd64013d8919ad2d5790a0184d952c7123be7da3 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=14963)
 
   * cf3c223658f37cc838409b690e2e1bf93c04f207 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Comment Edited] (FLINK-20412) Collect Result Fetching occasionally fails after a JobManager Failover

2021-03-28 Thread Caizhi Weng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310331#comment-17310331
 ] 

Caizhi Weng edited comment on FLINK-20412 at 3/29/21, 2:55 AM:
---

Sorry for this very late response; I have not been active in the community 
these days.

From the exception stack it seems that {{executorService}} in 
{{CollectSinkOperatorCoordinator#handleCoordinationRequest}} is {{null}}. 
However, {{executorService}} is initialized in 
{{CollectSinkOperatorCoordinator#start}}. This indicates that 
{{handleCoordinationRequest}} is called before {{start}} after a JM failure, 
which seems to be a bug in the coordinator system itself and does not seem to 
be related to the collect result fetching.

However, I've looped 
{{FileSourceTextLinesITCase#testContinuousTextFileSourceWithJobManagerFailover}}
 more than 1500 times and I still can't reproduce this issue. Has something 
related to the coordinator system been fixed during these months?
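
The suspected ordering problem can be illustrated with a small stand-alone sketch. The `Coordinator` class below is hypothetical and is not the real `CollectSinkOperatorCoordinator`; it only assumes, as described in the comment, that the executor field is created in `start` and used when handling a request.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class StartOrderSketch {

    // Hypothetical coordinator: the executor exists only after start().
    static class Coordinator {
        private ExecutorService executorService; // null until start() runs

        void start() {
            executorService = Executors.newSingleThreadExecutor();
        }

        CompletableFuture<String> handleRequest(String request) {
            CompletableFuture<String> result = new CompletableFuture<>();
            // Throws NullPointerException if start() has not been called yet.
            executorService.execute(() -> result.complete("response to " + request));
            return result;
        }

        void close() {
            if (executorService != null) {
                executorService.shutdownNow();
            }
        }
    }

    // Returns true if handling a request before start() fails with an NPE.
    static boolean failsBeforeStart() {
        Coordinator coordinator = new Coordinator();
        try {
            coordinator.handleRequest("fetch");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("fails before start: " + failsBeforeStart());

        Coordinator coordinator = new Coordinator();
        coordinator.start();
        System.out.println(coordinator.handleRequest("fetch").join());
        coordinator.close();
    }
}
```

In this sketch the exception is thrown synchronously, whereas the reported stack trace shows it wrapped in an `ExecutionException`; the sketch is only meant to show why the call order matters.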


was (Author: tsreaper):
Sorry for this very late response; I have not been active in the community 
these days.

From the exception stack it seems that {{executorService}} in 
{{CollectSinkOperatorCoordinator#handleCoordinationRequest}} is {{null}}. 
However, {{executorService}} is initialized in 
{{CollectSinkOperatorCoordinator#start}}. This indicates that 
{{handleCoordinationRequest}} is called before {{start}} after a JM failure, 
which seems to be a bug in the coordinator system itself and does not seem to 
be related to the collect result fetching.

However, I've looped 
{{FileSourceTextLinesITCase#testContinuousTextFileSourceWithJobManagerFailover}}
 more than 500 times and I still can't reproduce this issue. Has something 
related to the coordinator system been fixed during these months?

> Collect Result Fetching occasionally fails after a JobManager Failover
> --
>
> Key: FLINK-20412
> URL: https://issues.apache.org/jira/browse/FLINK-20412
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.11.2, 1.12.0
>Reporter: Stephan Ewen
>Priority: Major
> Fix For: 1.11.4, 1.13.0
>
>
> The encountered exception is as below.
>  
> The issue can be reproduced by running a test with JobManager failover in a 
> tight loop, for example the FileTextLinesITCase from this PR: 
> [https://github.com/apache/flink/pull/14199]
>  
> {code:java}
> 15335 [main] WARN  
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher - An 
> exception occurs when fetching query results
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
>   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) 
> ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) ~[?:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sendRequest(CollectResultFetcher.java:163)
>  ~[classes/:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:134)
>  [classes/:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103)
>  [classes/:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77)
>  [classes/:?]
>   at 
> org.apache.flink.streaming.api.datastream.DataStreamUtils.collectRecordsFromUnboundedStream(DataStreamUtils.java:142)
>  [classes/:?]
>   at 
> org.apache.flink.connector.file.src.FileSourceTextLinesITCase.testContinuousTextFileSource(FileSourceTextLinesITCase.java:272)
>  [test-classes/:?]
>   at 
> org.apache.flink.connector.file.src.FileSourceTextLinesITCase.testContinuousTextFileSourceWithJobManagerFailover(FileSourceTextLinesITCase.java:228)
>  [test-classes/:?]
>   at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:?]
>   at 
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  ~[?:?]
>   at 
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:?]
>   at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  [junit-4.12.jar:4.12]
>   at 
> 

[jira] [Comment Edited] (FLINK-21103) E2e tests time out on azure

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307542#comment-17307542
 ] 

Guowei Ma edited comment on FLINK-21103 at 3/29/21, 2:50 AM:
-

on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15622=results


was (Author: maguowei):
on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results

> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12377=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529
> {code}
> Creating worker2 ... done
> Jan 22 13:16:17 Waiting for hadoop cluster to come up. We have been trying 
> for 0 seconds, retrying ...
> Jan 22 13:16:22 Waiting for hadoop cluster to come up. We have been trying 
> for 5 seconds, retrying ...
> Jan 22 13:16:27 Waiting for hadoop cluster to come up. We have been trying 
> for 10 seconds, retrying ...
> Jan 22 13:16:32 Waiting for hadoop cluster to come up. We have been trying 
> for 15 seconds, retrying ...
> Jan 22 13:16:37 Waiting for hadoop cluster to come up. We have been trying 
> for 20 seconds, retrying ...
> Jan 22 13:16:43 Waiting for hadoop cluster to come up. We have been trying 
> for 26 seconds, retrying ...
> Jan 22 13:16:48 Waiting for hadoop cluster to come up. We have been trying 
> for 31 seconds, retrying ...
> Jan 22 13:16:53 Waiting for hadoop cluster to come up. We have been trying 
> for 36 seconds, retrying ...
> Jan 22 13:16:58 Waiting for hadoop cluster to come up. We have been trying 
> for 41 seconds, retrying ...
> Jan 22 13:17:03 Waiting for hadoop cluster to come up. We have been trying 
> for 46 seconds, retrying ...
> Jan 22 13:17:08 We only have 0 NodeManagers up. We have been trying for 0 
> seconds, retrying ...
> 21/01/22 13:17:10 INFO client.RMProxy: Connecting to ResourceManager at 
> master.docker-hadoop-cluster-network/172.19.0.3:8032
> 21/01/22 13:17:11 INFO client.AHSProxy: Connecting to Application History 
> server at master.docker-hadoop-cluster-network/172.19.0.3:10200
> Jan 22 13:17:11 We now have 2 NodeManagers up.
> ==
> === WARNING: This E2E Run took already 80% of the allocated time budget of 
> 250 minutes ===
> ==
> ==
> === WARNING: This E2E Run will time out in the next few minutes. Starting to 
> upload the log output ===
> ==
> ##[error]The task has timed out.
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.0' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.1' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.2' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Finishing: Run e2e tests
> {code}





[jira] [Closed] (FLINK-21631) Add CountTrigger for Python group window aggregation

2021-03-28 Thread Huang Xingbo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huang Xingbo closed FLINK-21631.

Resolution: Fixed

CountTriggers have been implemented in 
[FLINK-21628|https://issues.apache.org/jira/browse/FLINK-21628], 
[FLINK-21629|https://issues.apache.org/jira/browse/FLINK-21629] and 
[FLINK-21630|https://issues.apache.org/jira/browse/FLINK-21630], so we will 
close this issue.

> Add CountTrigger for Python group window aggregation
> 
>
> Key: FLINK-21631
> URL: https://issues.apache.org/jira/browse/FLINK-21631
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python
>Affects Versions: 1.13.0
>Reporter: Huang Xingbo
>Assignee: Huang Xingbo
>Priority: Major
> Fix For: 1.13.0
>
>
> Add CountTrigger for Python group window aggregation





[jira] [Commented] (FLINK-21954) Test failures occur due to the test not waiting for the ExecutionGraph to be created

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310337#comment-17310337
 ] 

Guowei Ma commented on FLINK-21954:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15632=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7030a106-e977-5851-a05e-535de648c9c9=8502

> Test failures occur due to the test not waiting for the ExecutionGraph to be 
> created
> 
>
> Key: FLINK-21954
> URL: https://issues.apache.org/jira/browse/FLINK-21954
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Matthias
>Priority: Major
>  Labels: test-stability
>
> Various tests are failing due to the test not waiting for the ExecutionGraph 
> to be created:
> * 
> [JobMasterTest.testRestoringFromSavepoint|https://dev.azure.com/mapohl/flink/_build/results?buildId=356=logs=243b38e1-22e7-598a-c8ae-385dce2c28b5=fea482b6-4f61-51f4-2584-f73df532b395=8266]
> * {{JobMasterTest.testRequestNextInputSplitWithGlobalFailover}}
> * {{JobMasterTest.testRequestNextInputSplitWithLocalFailover}} (also fails 
> due to FLINK-21450)
> * {{JobMasterQueryableStateTest.testRequestKvStateOfWrongJob}}
> * {{JobMasterQueryableStateTest.testRequestKvStateWithIrrelevantRegistration}}
> * {{JobMasterQueryableStateTest.testDuplicatedKvStateRegistrationsFailTask}}
> * {{JobMasterQueryableStateTest.testRegisterKvState}}
> We might have to double-check whether other tests are affected as well.





[jira] [Commented] (FLINK-20431) KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers:133->lambda$testCommitOffsetsWithoutAliveFetchers$3:134 expected:<10> but was:<1>

2021-03-28 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310334#comment-17310334
 ] 

Guowei Ma commented on FLINK-20431:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15632=logs=1fc6e7bf-633c-5081-c32a-9dea24b05730=80a658d1-f7f6-5d93-2758-53ac19fd5b19=8266

> KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers:133->lambda$testCommitOffsetsWithoutAliveFetchers$3:134
>  expected:<10> but was:<1>
> -
>
> Key: FLINK-20431
> URL: https://issues.apache.org/jira/browse/FLINK-20431
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.12.2, 1.13.0
>Reporter: Huang Xingbo
>Assignee: Jiangjie Qin
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.13.0, 1.12.3
>
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10351=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5]
> [ERROR] Failures: 
> [ERROR] 
> KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers:133->lambda$testCommitOffsetsWithoutAliveFetchers$3:134
>  expected:<10> but was:<1>
>  





[jira] [Commented] (FLINK-21982) jobs should use client org.apache.hadoop.config on yarn

2021-03-28 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310332#comment-17310332
 ] 

Xintong Song commented on FLINK-21982:
--

[~Bo Cui],
Could you check whether FLINK-16005 solves your problem? This new feature has 
not been released yet. It should be released with the upcoming 1.13.0.

> jobs should use client org.apache.hadoop.config on yarn
> ---
>
> Key: FLINK-21982
> URL: https://issues.apache.org/jira/browse/FLINK-21982
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.13.0
>Reporter: Bo Cui
>Priority: Major
>
> Currently, jobs use the YARN server configuration during execution. 
> I think we should submit the client configuration to HDFS, like MR does with 
> job.xml, etc. The container would then fetch job.xml and create soft links 
> (hdfs-site, core-site, yarn-site) during container initialization, 
> so that the configuration can obtain these resources through 
> classLoader.getResource(String) when it is initialized.





[jira] [Created] (FLINK-22002) AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg fail because of submitting task time-out.

2021-03-28 Thread Guowei Ma (Jira)
Guowei Ma created FLINK-22002:
-

 Summary: 
AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg fail 
because of submitting task time-out.
 Key: FLINK-22002
 URL: https://issues.apache.org/jira/browse/FLINK-22002
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.12.2
Reporter: Guowei Ma


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15634=logs=955770d3-1fed-5a0a-3db6-0c7554c910cb=14447d61-56b4-5000-80c1-daa459247f6a=6424


{code:java}
org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase
2021-03-29T00:27:25.3406344Z [ERROR] testSingleAggOnTable_HashAgg_WithLocalAgg(org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase)  Time elapsed: 21.908 s  <<< ERROR!
2021-03-29T00:27:25.3407190Z java.lang.RuntimeException: Failed to fetch next result
2021-03-29T00:27:25.3407792Z     at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:109)
2021-03-29T00:27:25.3408502Z     at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80)
2021-03-29T00:27:25.3409188Z     at org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:117)
2021-03-29T00:27:25.3416724Z     at org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:350)
2021-03-29T00:27:25.3417510Z     at java.util.Iterator.forEachRemaining(Iterator.java:115)
2021-03-29T00:27:25.3418416Z     at org.apache.flink.util.CollectionUtil.iteratorToList(CollectionUtil.java:108)
2021-03-29T00:27:25.3419031Z     at org.apache.flink.table.planner.runtime.utils.BatchTestBase.executeQuery(BatchTestBase.scala:298)
2021-03-29T00:27:25.3419657Z     at org.apache.flink.table.planner.runtime.utils.BatchTestBase.check(BatchTestBase.scala:138)
2021-03-29T00:27:25.3420638Z     at org.apache.flink.table.planner.runtime.utils.BatchTestBase.checkResult(BatchTestBase.scala:104)
2021-03-29T00:27:25.3421384Z     at org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable(AggregateReduceGroupingITCase.scala:182)
2021-03-29T00:27:25.3422284Z     at org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg(AggregateReduceGroupingITCase.scala:135)
2021-03-29T00:27:25.3422975Z     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2021-03-29T00:27:25.3423504Z     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2021-03-29T00:27:25.3424298Z     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2021-03-29T00:27:25.3425229Z     at java.lang.reflect.Method.invoke(Method.java:498)
2021-03-29T00:27:25.3426107Z     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
2021-03-29T00:27:25.3426756Z     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2021-03-29T00:27:25.3427743Z     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
2021-03-29T00:27:25.3428520Z     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2021-03-29T00:27:25.3429128Z     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
2021-03-29T00:27:25.3429715Z     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
2021-03-29T00:27:25.3433435Z     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
2021-03-29T00:27:25.3433977Z     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
2021-03-29T00:27:25.3434476Z     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
2021-03-29T00:27:25.3435607Z     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
2021-03-29T00:27:25.3436460Z     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
2021-03-29T00:27:25.3437054Z     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
2021-03-29T00:27:25.3437673Z     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
2021-03-29T00:27:25.3438765Z     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
2021-03-29T00:27:25.3439362Z     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
2021-03-29T00:27:25.3440504Z     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
2021-03-29T00:27:25.3441100Z     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
2021-03-29T00:27:25.3441673Z     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
2021-03-29T00:27:25.3442205Z     at

[jira] [Comment Edited] (FLINK-20412) Collect Result Fetching occasionally fails after a JobManager Failover

2021-03-28 Thread Caizhi Weng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310331#comment-17310331
 ] 

Caizhi Weng edited comment on FLINK-20412 at 3/29/21, 2:40 AM:
---

Sorry for the very late response; I have not been active in the community 
these days.

From the exception stack it seems that {{executorService}} in 
{{CollectSinkOperatorCoordinator#handleCoordinationRequest}} is {{null}}. 
However, {{executorService}} is initialized in 
{{CollectSinkOperatorCoordinator#start}}. This indicates that 
{{handleCoordinationRequest}} was called before {{start}} after a JM failure, 
which seems to be a bug in the coordinator system itself and unrelated to 
the collect result fetching.

However, I've looped 
{{FileSourceTextLinesITCase#testContinuousTextFileSourceWithJobManagerFailover}}
 more than 500 times and I still can't reproduce this issue. Has something 
related to the coordinator system been fixed in recent months?
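
The ordering problem described in this comment can be illustrated with a small, self-contained sketch. It is not the Flink class and not a proposed fix: {{LifecycleSketch}}, {{handleRequest}} and {{close}} are hypothetical names, and the only behavior taken from the comment is that a field initialized in {{start}} is still {{null}} if a request arrives first. The sketch turns that situation into a descriptive failure instead of a bare NullPointerException.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for a coordinator whose executor is created in start(). */
final class LifecycleSketch {
    // Stays null until start() has run, mirroring the situation in the comment.
    private ExecutorService executorService;

    void start() {
        executorService = Executors.newSingleThreadExecutor();
    }

    CompletableFuture<String> handleRequest(String request) {
        if (executorService == null) {
            // Called before start(): fail the future with an explicit message
            // rather than letting a NullPointerException surface later.
            CompletableFuture<String> failed = new CompletableFuture<>();
            failed.completeExceptionally(
                    new IllegalStateException("handleRequest called before start"));
            return failed;
        }
        return CompletableFuture.supplyAsync(() -> "handled " + request, executorService);
    }

    void close() {
        if (executorService != null) {
            executorService.shutdown();
        }
    }
}
```

Whether the real coordinator should reject, queue, or retry such early requests is a design question the sketch does not answer.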


was (Author: tsreaper):
Sorry for this very late response as I'm not active in community these days.

From the exception stack it seems that {{executorService}} in 
{{CollectSinkOperatorCoordinator#handleCoordinationRequest}} is {{null}}. 
However {{executorService}} is initialized in 
{{CollectSinkOperatorCoordinator#start}}. So this indicates that 
{{handleCoordinationRequest}} is called before {{start}} after a JM failure, 
which seems to be a bug in the coordinator system itself and does not seem to 
relate to the collect result fetching.

However I've looped 
{{FileSourceTextLinesITCase#testContinuousTextFileSourceWithJobManagerFailover}}
 for more than 500 times and I still can't reproduce this issue. Is something 
related the coordinator system fixed during these months?

> Collect Result Fetching occasionally fails after a JobManager Failover
> --
>
> Key: FLINK-20412
> URL: https://issues.apache.org/jira/browse/FLINK-20412
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.11.2, 1.12.0
>Reporter: Stephan Ewen
>Priority: Major
> Fix For: 1.11.4, 1.13.0
>
>
> The encountered exception is as below.
>  
> The issue can be reproduced by running a test with JobManager failover in a 
> tight loop, for example the FileTextLinesITCase from this PR: 
> [https://github.com/apache/flink/pull/14199]
>  
> {code:java}
> 15335 [main] WARN  
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher - An 
> exception occurs when fetching query results
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
>   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) 
> ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) ~[?:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sendRequest(CollectResultFetcher.java:163)
>  ~[classes/:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:134)
>  [classes/:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103)
>  [classes/:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77)
>  [classes/:?]
>   at 
> org.apache.flink.streaming.api.datastream.DataStreamUtils.collectRecordsFromUnboundedStream(DataStreamUtils.java:142)
>  [classes/:?]
>   at 
> org.apache.flink.connector.file.src.FileSourceTextLinesITCase.testContinuousTextFileSource(FileSourceTextLinesITCase.java:272)
>  [test-classes/:?]
>   at 
> org.apache.flink.connector.file.src.FileSourceTextLinesITCase.testContinuousTextFileSourceWithJobManagerFailover(FileSourceTextLinesITCase.java:228)
>  [test-classes/:?]
>   at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:?]
>   at 
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  ~[?:?]
>   at 
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:?]
>   at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  [junit-4.12.jar:4.12]
>   at 
> 

[jira] [Updated] (FLINK-20682) Add configuration options related to hadoop

2021-03-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-20682:
-
Fix Version/s: (was: 1.12.1)
   (was: 1.13.0)

> Add configuration options related to hadoop
> ---
>
> Key: FLINK-20682
> URL: https://issues.apache.org/jira/browse/FLINK-20682
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.12.0
>Reporter: Ruguo Yu
>Priority: Major
>  Labels: pull-requests-available
>
> Currently, we submit Flink jobs to YARN with the run-application target and 
> need to specify some Hadoop-related configuration, because we use a 
> distributed filesystem similar to Ali OSS to store resources. In this case, we 
> pass special configuration options and set them on the Hadoop configuration.
> To solve such problems, we can provide configuration options 
> prefixed with "flink.hadoop." (such as -Dflink.hadoop.xxx=yyy) and then copy 
> them into the Hadoop configuration.
> A simple implementation is as follows:
> {code:java}
> // module: flink-filesystems/flink-hadoop-fs
> // class: org.apache.flink.runtime.util.HadoopUtils
> // (code placeholder)
> public static Configuration getHadoopConfiguration(
>         org.apache.flink.configuration.Configuration flinkConfiguration) {
>     ..
>     // Copy any "flink.hadoop.xxx=yyy" Flink configuration entry to the
>     // Hadoop configuration as "xxx=yyy".
>     for (String key : flinkConfiguration.keySet()) {
>         if (key.startsWith("flink.hadoop.")) {
>             result.set(key.substring("flink.hadoop.".length()),
>                     flinkConfiguration.getString(key, null));
>         }
>     }
>     return result;
> }{code}
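
The prefix-stripping step in the description above can be tried in isolation with the following sketch. It is only an illustration under stated assumptions: plain {{java.util.Map}} instances stand in for the Flink and Hadoop configuration classes, and the class name {{HadoopPrefixCopy}}, the method {{extractHadoopOptions}} and the sample values are hypothetical. Only the "flink.hadoop." prefix comes from the issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: forward "flink.hadoop."-prefixed entries with the prefix stripped. */
final class HadoopPrefixCopy {
    // Prefix proposed in the issue description.
    static final String PREFIX = "flink.hadoop.";

    /**
     * Returns the entries of flinkConf whose keys start with PREFIX, with the
     * prefix removed. Maps replace the real configuration classes so that the
     * sketch has no dependencies (an assumption, not the proposed patch).
     */
    static Map<String, String> extractHadoopOptions(Map<String, String> flinkConf) {
        Map<String, String> hadoopConf = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : flinkConf.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(PREFIX)) {
                hadoopConf.put(key.substring(PREFIX.length()), entry.getValue());
            }
        }
        return hadoopConf;
    }

    public static void main(String[] args) {
        Map<String, String> flinkConf = new LinkedHashMap<>();
        flinkConf.put("flink.hadoop.fs.oss.endpoint", "example-endpoint"); // hypothetical value
        flinkConf.put("taskmanager.numberOfTaskSlots", "4");
        // Only the prefixed entry is forwarded, under the key "fs.oss.endpoint".
        System.out.println(extractHadoopOptions(flinkConf));
    }
}
```

Entries without the prefix are left out of the result, so ordinary Flink options never leak into the Hadoop side.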





[jira] [Commented] (FLINK-20412) Collect Result Fetching occasionally fails after a JobManager Failover

2021-03-28 Thread Caizhi Weng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310331#comment-17310331
 ] 

Caizhi Weng commented on FLINK-20412:
-

Sorry for this very late response as I'm not active in community these days.

From the exception stack it seems that {{executorService}} in 
{{CollectSinkOperatorCoordinator#handleCoordinationRequest}} is {{null}}. 
However {{executorService}} is initialized in 
{{CollectSinkOperatorCoordinator#start}}. So this indicates that 
{{handleCoordinationRequest}} is called before {{start}} after a JM failure, 
which seems to be a bug in the coordinator system itself and does not seem to 
relate to the collect result fetching.

However I've looped 
{{FileSourceTextLinesITCase#testContinuousTextFileSourceWithJobManagerFailover}}
 for more than 500 times and I still can't reproduce this issue. Is something 
related the coordinator system fixed during these months?

> Collect Result Fetching occasionally fails after a JobManager Failover
> --
>
> Key: FLINK-20412
> URL: https://issues.apache.org/jira/browse/FLINK-20412
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.11.2, 1.12.0
>Reporter: Stephan Ewen
>Priority: Major
> Fix For: 1.11.4, 1.13.0
>
>
> The encountered exception is as below.
>  
> The issue can be reproduced by running a test with JobManager failover in a 
> tight loop, for example the FileTextLinesITCase from this PR: 
> [https://github.com/apache/flink/pull/14199]
>  
> {code:java}
> 15335 [main] WARN  
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher - An 
> exception occurs when fetching query results
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
>   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) 
> ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) ~[?:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sendRequest(CollectResultFetcher.java:163)
>  ~[classes/:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:134)
>  [classes/:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103)
>  [classes/:?]
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77)
>  [classes/:?]
>   at 
> org.apache.flink.streaming.api.datastream.DataStreamUtils.collectRecordsFromUnboundedStream(DataStreamUtils.java:142)
>  [classes/:?]
>   at 
> org.apache.flink.connector.file.src.FileSourceTextLinesITCase.testContinuousTextFileSource(FileSourceTextLinesITCase.java:272)
>  [test-classes/:?]
>   at 
> org.apache.flink.connector.file.src.FileSourceTextLinesITCase.testContinuousTextFileSourceWithJobManagerFailover(FileSourceTextLinesITCase.java:228)
>  [test-classes/:?]
>   at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:?]
>   at 
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  ~[?:?]
>   at 
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:?]
>   at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
> [junit-4.12.jar:4.12]
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) 
> [junit-4.12.jar:4.12]
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20) 
> [junit-4.12.jar:4.12]
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) 
> [junit-4.12.jar:4.12]
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  [junit-4.12.jar:4.12]
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) 
> [junit-4.12.jar:4.12]
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) 
> [junit-4.12.jar:4.12]
>   at 

[GitHub] [flink] flinkbot edited a comment on pull request #15054: [FLINK-13550][runtime][ui] Operator's Flame Graph

2021-03-28 Thread GitBox


flinkbot edited a comment on pull request #15054:
URL: https://github.com/apache/flink/pull/15054#issuecomment-788337524


   
   ## CI report:
   
   * 26a28f2d83f56cb386e1365fd4df4fb8a2f2bf86 UNKNOWN
   * 0b5aaf42f2861a38e26a80e25cf2324e7cf06bb7 UNKNOWN
   * 7717d7719f6b448084ff27365c2b718d2c059376 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15640)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Commented] (FLINK-21981) Increase the priority of the parameter in flink-conf

2021-03-28 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310327#comment-17310327
 ] 

Xintong Song commented on FLINK-21981:
--

Why do you think `env.hadoop.conf.dir` is better?

I don't have a strong opinion on the priorities of the two options. However, 
I'd rather not introduce any behavior changes unless there's a very good 
reason.

The priorities of the two options have been this way since 2017, and many 
users may rely on them. Changing the priorities may cause problems for these 
users when they upgrade their Flink version, and such problems could be hard 
to debug.
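
To make the two orderings under discussion concrete, here is a minimal sketch. It assumes, based only on the reporter's description, that the HADOOP_CONF_DIR environment variable currently wins over `env.hadoop.conf.dir`; it does not reproduce config.sh. The class `HadoopConfDirResolver`, the method `resolve` and the `preferFlinkConf` switch are hypothetical.

```java
import java.util.Map;
import java.util.Optional;

/** Sketch of the two lookup orders discussed in this thread. */
final class HadoopConfDirResolver {

    /**
     * Resolves the Hadoop conf dir from two sources. With preferFlinkConf set to
     * false the environment variable is consulted first (assumed current
     * behavior); with true the Flink option is consulted first (the proposal).
     */
    static Optional<String> resolve(
            Map<String, String> env, Map<String, String> flinkConf, boolean preferFlinkConf) {
        String fromEnv = env.get("HADOOP_CONF_DIR");
        String fromConf = flinkConf.get("env.hadoop.conf.dir");
        String first = preferFlinkConf ? fromConf : fromEnv;
        String second = preferFlinkConf ? fromEnv : fromConf;
        if (isSet(first)) {
            return Optional.of(first);
        }
        return isSet(second) ? Optional.of(second) : Optional.empty();
    }

    // Treats null and empty strings as "not configured".
    private static boolean isSet(String value) {
        return value != null && !value.isEmpty();
    }
}
```

When both sources are set, flipping the flag changes the result, which is exactly the upgrade risk raised above: a deployment that sets both values would silently switch directories.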

> Increase the priority of the parameter in flink-conf
> 
>
> Key: FLINK-21981
> URL: https://issues.apache.org/jira/browse/FLINK-21981
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Reporter: Bo Cui
>Priority: Major
>
> In my cluster, the environment sets HADOOP_CONF_DIR and YARN_CONF_DIR, but 
> they point to the Hadoop and YARN configuration, not the one meant for Flink. 
> The Hadoop configuration for Flink (env.hadoop.conf.dir) is different from 
> them, so we should use `env.hadoop.conf.dir` first.
> https://github.com/apache/flink/blob/57e93c90a14a9f5316da821863dd2b335a8f86e0/flink-dist/src/main/flink-bin/bin/config.sh#L256





[GitHub] [flink] afedulov commented on pull request #15054: [FLINK-13550][runtime][ui] Operator's Flame Graph

2021-03-28 Thread GitBox


afedulov commented on pull request #15054:
URL: https://github.com/apache/flink/pull/15054#issuecomment-809021491


   @zentol thanks a lot for your review! I addressed your comments in a 
separate commit.






[GitHub] [flink] kezhuw commented on a change in pull request #15397: [FLINK-21817][connector/common] Remove split assignment tracker from coordinator state

2021-03-28 Thread GitBox


kezhuw commented on a change in pull request #15397:
URL: https://github.com/apache/flink/pull/15397#discussion_r602977665



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/source/coordinator/SplitAssignmentTracker.java
##
@@ -83,10 +86,8 @@ public void snapshotState(
  */
 public void restoreState(SimpleVersionedSerializer 
splitSerializer, DataInputStream in)
 throws Exception {

Review comment:
   It is also good to read out and drop any possibly empty state. This will 
protect any future reader of trailing state. But I guess at that stage there 
will be a version bump to `SourceCoordinatorSerdeUtils.CURRENT_VERSION`. So I 
am neutral on this, with just a preference to match the write side.








[GitHub] [flink] hehuiyuan removed a comment on pull request #15395: [FLINK-15146][core][ttl] Fix check that incremental cleanup size must be greater than zero

2021-03-28 Thread GitBox


hehuiyuan removed a comment on pull request #15395:
URL: https://github.com/apache/flink/pull/15395#issuecomment-809012981


   Hi @azagrebin ,






[GitHub] [flink] hehuiyuan commented on pull request #15395: [FLINK-15146][core][ttl] Fix check that incremental cleanup size must be greater than zero

2021-03-28 Thread GitBox


hehuiyuan commented on pull request #15395:
URL: https://github.com/apache/flink/pull/15395#issuecomment-809013299


   Hi @azagrebin , 
   please take a look at the remaining issues.






[GitHub] [flink] hehuiyuan commented on pull request #15395: [FLINK-15146][core][ttl] Fix check that incremental cleanup size must be greater than zero

2021-03-28 Thread GitBox


hehuiyuan commented on pull request #15395:
URL: https://github.com/apache/flink/pull/15395#issuecomment-809012981


   Hi @azagrebin ,






[GitHub] [flink] afedulov removed a comment on pull request #15054: [FLINK-13550][runtime][ui] Operator's Flame Graph

2021-03-28 Thread GitBox


afedulov removed a comment on pull request #15054:
URL: https://github.com/apache/flink/pull/15054#issuecomment-809011781


   @zentol thanks a lot for your review! I addressed your comments in a 
separate commit.






[GitHub] [flink] afedulov commented on pull request #15054: [FLINK-13550][runtime][ui] Operator's Flame Graph

2021-03-28 Thread GitBox


afedulov commented on pull request #15054:
URL: https://github.com/apache/flink/pull/15054#issuecomment-809011781


   @zentol thanks a lot for your review! I addressed your comments in a 
separate commit.






[GitHub] [flink] afedulov commented on a change in pull request #15054: [FLINK-13550][runtime][ui] Operator's Flame Graph

2021-03-28 Thread GitBox


afedulov commented on a change in pull request #15054:
URL: https://github.com/apache/flink/pull/15054#discussion_r602954386



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/rest/messages/JobVertexMessageParameters.java
##
@@ -26,6 +26,8 @@
 public class JobVertexMessageParameters extends JobMessageParameters {
 
 public final JobVertexIdPathParameter jobVertexIdPathParameter = new 
JobVertexIdPathParameter();
+public final FlameGraphTypeQueryParameter flameGraphTypeQueryParameter =

Review comment:
   My bad! Support for different Flame Graph types was a last-minute 
addition and I misunderstood how those `JobVertexMessageParameters` are 
intended to be used. Fixing it.







