[GitHub] [flink] flinkbot edited a comment on pull request #15531: [FLINK-22147][connector/kafka] Refactor partition discovery logic in Kafka source enumerator

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #15531:
URL: https://github.com/apache/flink/pull/15531#issuecomment-816343442


   
   ## CI report:
   
   * ef0489f55f950c18c3627296fd8db333a0e0d6af Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18807)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16118: [FLINK-22676][coordination] The partition tracker stops tracking internal partitions when TM disconnects

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16118:
URL: https://github.com/apache/flink/pull/16118#issuecomment-857430432


   
   ## CI report:
   
   * 99adb64639b2a6239228bd73b48afd206c1d23e5 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18820)
 
   
   
   






[GitHub] [flink] flinkbot commented on pull request #16120: [FLINK-22939][azure] Generalize JDK switch

2021-06-09 Thread GitBox


flinkbot commented on pull request #16120:
URL: https://github.com/apache/flink/pull/16120#issuecomment-857445858


   
   ## CI report:
   
   * 5630e8d58107249c5a33e5d56e60db481f12b74a UNKNOWN
   
   
   






[GitHub] [flink] AHeise commented on a change in pull request #15924: [FLINK-22670][FLIP-150][connector/common] Hybrid source baseline

2021-06-09 Thread GitBox


AHeise commented on a change in pull request #15924:
URL: https://github.com/apache/flink/pull/15924#discussion_r648026302



##
File path: 
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/hybrid/HybridSourceReader.java
##
@@ -0,0 +1,194 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.base.source.hybrid;
+
+import org.apache.flink.api.connector.source.ReaderOutput;
+import org.apache.flink.api.connector.source.SourceEvent;
+import org.apache.flink.api.connector.source.SourceReader;
+import org.apache.flink.api.connector.source.SourceReaderContext;
+import org.apache.flink.api.connector.source.SourceSplit;
+import org.apache.flink.core.io.InputStatus;
+import org.apache.flink.util.Preconditions;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.CompletableFuture;
+
+/**
+ * Hybrid source reader that delegates to the actual source reader.
+ *
+ * <p>This reader is set up with a sequence of underlying source readers. At a given point in
+ * time, one of these readers is active. Underlying readers are opened and closed on demand as
+ * determined by the enumerator, which selects the active reader via {@link SwitchSourceEvent}.
+ *
+ * <p>When the underlying reader has consumed all input, {@link HybridSourceReader} sends {@link
+ * SourceReaderFinishedEvent} to the coordinator and waits for the {@link SwitchSourceEvent}.
+ */
+public class HybridSourceReader<T, SplitT extends SourceSplit> implements SourceReader<T, SplitT> {
+    private static final Logger LOG = LoggerFactory.getLogger(HybridSourceReader.class);
+    private final SourceReaderContext readerContext;
+    private final List<SourceReader<T, SplitT>> chainedReaders;
+    private int currentSourceIndex = -1;
+    private SourceReader<T, SplitT> currentReader;
+    private CompletableFuture<Void> availabilityFuture;
+
+    public HybridSourceReader(
+            SourceReaderContext readerContext,
+            List<SourceReader<T, SplitT>> readers) {
+        this.readerContext = readerContext;
+        this.chainedReaders = readers;
+    }
+
+    @Override
+    public void start() {
+        setCurrentReader(0);
+    }
+
+    @Override
+    public InputStatus pollNext(ReaderOutput<T> output) throws Exception {
+        InputStatus status = currentReader.pollNext(output);
+        if (status == InputStatus.END_OF_INPUT) {
+            // trap END_OF_INPUT if this wasn't the final reader
+            LOG.info(
+                    "End of input subtask={} sourceIndex={} {}",
+                    readerContext.getIndexOfSubtask(),
+                    currentSourceIndex,
+                    currentReader);
+            if (currentSourceIndex + 1 < chainedReaders.size()) {
+                // Signal the coordinator that the current reader has consumed all input and the
+                // next source can potentially be activated (after all readers are ready).
+                readerContext.sendSourceEventToCoordinator(
+                        new SourceReaderFinishedEvent(currentSourceIndex));
+                // More data will be available from the next reader.
+                // InputStatus.NOTHING_AVAILABLE requires us to complete the availability
+                // future after source switch to resume poll.
+                return InputStatus.NOTHING_AVAILABLE;
+            }
+        }
+        return status;
+    }
+
+    @Override
+    public List<SplitT> snapshotState(long checkpointId) {
+        List<SplitT> state = currentReader.snapshotState(checkpointId);

Review comment:
   I saw later that we snapshot a reader on switching, so it's probably 
fine as is.
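The reader-switch protocol discussed in the diff above can be isolated into a small state machine. The sketch below is illustrative only, plain Java with no Flink dependencies, and the class and method names (`ReaderSwitchState`, `onEndOfInput`, `onSwitchSourceEvent`) are made up for this example: END_OF_INPUT from a non-final reader is trapped while the reader waits for the coordinator's switch event, and only the final reader lets END_OF_INPUT propagate.

```java
// Illustrative sketch of the HybridSourceReader switching protocol
// (plain Java, no Flink dependencies; all names are hypothetical).
class ReaderSwitchState {
    private final int readerCount;
    private int currentIndex = 0;
    private boolean waitingForSwitch = false;

    ReaderSwitchState(int readerCount) {
        this.readerCount = readerCount;
    }

    // Called when the active reader reports END_OF_INPUT.
    // Returns true only for the final reader, i.e. when END_OF_INPUT
    // should actually propagate downstream.
    boolean onEndOfInput() {
        if (currentIndex + 1 < readerCount) {
            // Corresponds to sending SourceReaderFinishedEvent and
            // returning NOTHING_AVAILABLE until the coordinator switches us.
            waitingForSwitch = true;
            return false;
        }
        return true;
    }

    // Called when the coordinator's SwitchSourceEvent selects the next
    // reader; in the real reader, completing the availability future
    // here is what resumes polling.
    void onSwitchSourceEvent(int targetIndex) {
        currentIndex = targetIndex;
        waitingForSwitch = false;
    }

    int currentIndex() {
        return currentIndex;
    }

    boolean isWaitingForSwitch() {
        return waitingForSwitch;
    }
}
```

With two chained readers, the first END_OF_INPUT leaves the state waiting for a switch; after the switch to index 1, the next END_OF_INPUT propagates.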








[GitHub] [flink] flinkbot commented on pull request #16119: [hotfix][docs] 'ALTER.md' Format repair

2021-06-09 Thread GitBox


flinkbot commented on pull request #16119:
URL: https://github.com/apache/flink/pull/16119#issuecomment-857445757


   
   ## CI report:
   
   * 971ef26d9db9cd9bb4f60c59bf9451270ea71885 UNKNOWN
   
   
   






[jira] [Closed] (FLINK-20802) force-shading does not declare flink-parent as its parent

2021-06-09 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-20802.

Resolution: Won't Fix

> force-shading does not declare flink-parent as its parent
> --
>
> Key: FLINK-20802
> URL: https://issues.apache.org/jira/browse/FLINK-20802
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.8.0
>Reporter: Chesnay Schepler
>Priority: Minor
>  Labels: starter
>
> {{force-shading}} does not declare flink-parent as its parent, despite being 
> declared as a module of it, causing issues like FLINK-20792.
> We should re-evaluate whether force-shading is still necessary, and if so 
> migrate to flink-shaded-force-shading.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (FLINK-18701) NOT NULL constraint is not guaranteed when aggregation split is enabled

2021-06-09 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-18701:
-
Comment: was deleted

(was: This issue is assigned but has not received an update in 7 days so it has 
been labeled "stale-assigned". If you are still working on the issue, please 
give an update and remove the label. If you are no longer working on the issue, 
please unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.)

> NOT NULL constraint is not guaranteed when aggregation split is enabled
> ---
>
> Key: FLINK-18701
> URL: https://issues.apache.org/jira/browse/FLINK-18701
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.1
>Reporter: Jark Wu
>Priority: Minor
>
> Take the following test: 
> {{org.apache.flink.table.planner.runtime.stream.sql.SplitAggregateITCase#testMinMaxWithRetraction}}
> {code:scala}
> val t1 = tEnv.sqlQuery(
>   s"""
>  |SELECT
>  |  c, MIN(b), MAX(b), COUNT(DISTINCT a)
>  |FROM(
>  |  SELECT
>  |a, COUNT(DISTINCT b) as b, MAX(b) as c
>  |  FROM T
>  |  GROUP BY a
>  |) GROUP BY c
>""".stripMargin)
> val sink = new TestingRetractSink
> t1.toRetractStream[Row].addSink(sink)
> env.execute()
> println(sink.getRawResults)
> {code}
> The query schema is
> {code:java}
> root
>  |-- c: INT
>  |-- EXPR$1: BIGINT NOT NULL
>  |-- EXPR$2: BIGINT NOT NULL
>  |-- EXPR$3: BIGINT NOT NULL
> {code}
> This should be correct as the count is never null and thus min/max are never 
> null, however, we can receive null in the sink.
> {code}
> List((true,1,null,null,1), (true,2,2,2,1), (false,1,null,null,1), 
> (true,6,2,2,1), (true,5,1,1,0), (false,5,1,1,0), (true,5,1,1,2), 
> (true,4,2,2,0), (false,5,1,1,2), (true,5,1,3,2), (false,4,2,2,0), 
> (false,5,1,3,2), (true,5,1,4,2))
> {code}





[jira] [Updated] (FLINK-18701) NOT NULL constraint is not guaranteed when aggregation split is enabled

2021-06-09 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-18701:
-
Labels:   (was: auto-deprioritized-major auto-unassigned)

> NOT NULL constraint is not guaranteed when aggregation split is enabled
> ---
>
> Key: FLINK-18701
> URL: https://issues.apache.org/jira/browse/FLINK-18701
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.1
>Reporter: Jark Wu
>Priority: Minor
>





[jira] [Updated] (FLINK-18701) NOT NULL constraint is not guaranteed when aggregation split is enabled

2021-06-09 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-18701:
-
Priority: Major  (was: Minor)

> NOT NULL constraint is not guaranteed when aggregation split is enabled
> ---
>
> Key: FLINK-18701
> URL: https://issues.apache.org/jira/browse/FLINK-18701
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.1
>Reporter: Jark Wu
>Priority: Major
>





[jira] [Issue Comment Deleted] (FLINK-18701) NOT NULL constraint is not guaranteed when aggregation split is enabled

2021-06-09 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-18701:
-
Comment: was deleted

(was: This issue was marked "stale-assigned" and has not received an update in 
7 days. It is now automatically unassigned. If you are still working on it, you 
can assign it to yourself again. Please also give an update about the status of 
the work.)

> NOT NULL constraint is not guaranteed when aggregation split is enabled
> ---
>
> Key: FLINK-18701
> URL: https://issues.apache.org/jira/browse/FLINK-18701
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.1
>Reporter: Jark Wu
>Priority: Minor
>





[jira] [Issue Comment Deleted] (FLINK-18701) NOT NULL constraint is not guaranteed when aggregation split is enabled

2021-06-09 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-18701:
-
Comment: was deleted

(was: This issue was labeled "stale-major" 7 days ago and has not received any 
updates so it is being deprioritized. If this ticket is actually Major, please 
raise the priority and ask a committer to assign you the issue or revive the 
public discussion.
)

> NOT NULL constraint is not guaranteed when aggregation split is enabled
> ---
>
> Key: FLINK-18701
> URL: https://issues.apache.org/jira/browse/FLINK-18701
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.1
>Reporter: Jark Wu
>Priority: Minor
>





[jira] [Issue Comment Deleted] (FLINK-22590) Add Scala implicit conversions for new API methods

2021-06-09 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-22590:
-
Comment: was deleted

(was: I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I 
help the community manage its development. I see this issue is assigned but has 
not received an update in 14 days, so it has been labeled "stale-assigned".
If you are still working on the issue, please remove the label and add a 
comment updating the community on your progress.  If this issue is waiting on 
feedback, please consider this a reminder to the committer/reviewer. Flink is a 
very active project, and so we appreciate your patience.
If you are no longer working on the issue, please unassign yourself so someone 
else may work on it. If the "warning_label" label is not removed in 7 days, the 
issue will be automatically unassigned.
)

> Add Scala implicit conversions for new API methods
> --
>
> Key: FLINK-22590
> URL: https://issues.apache.org/jira/browse/FLINK-22590
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>
> FLINK-19980 should also be exposed through Scala's implicit conversions, to 
> allow a fluent API such as `table.toDataStream(...)`.





[jira] [Updated] (FLINK-22590) Add Scala implicit conversions for new API methods

2021-06-09 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-22590:
-
Labels:   (was: stale-assigned)

> Add Scala implicit conversions for new API methods
> --
>
> Key: FLINK-22590
> URL: https://issues.apache.org/jira/browse/FLINK-22590
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>





[jira] [Issue Comment Deleted] (FLINK-22590) Add Scala implicit conversions for new API methods

2021-06-09 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-22590:
-
Comment: was deleted

(was: I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I 
help the community manage its development. I see this issue is assigned but has 
not received an update in 14 days, so it has been labeled "stale-assigned".
If you are still working on the issue, please remove the label and add a 
comment updating the community on your progress.  If this issue is waiting on 
feedback, please consider this a reminder to the committer/reviewer. Flink is a 
very active project, and so we appreciate your patience.
If you are no longer working on the issue, please unassign yourself so someone 
else may work on it. If the "warning_label" label is not removed in 7 days, the 
issue will be automatically unassigned.
)

> Add Scala implicit conversions for new API methods
> --
>
> Key: FLINK-22590
> URL: https://issues.apache.org/jira/browse/FLINK-22590
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>





[GitHub] [flink] twalthr closed pull request #16097: [FLINK-22864][table] Remove BatchTableEnvironment and related classes

2021-06-09 Thread GitBox


twalthr closed pull request #16097:
URL: https://github.com/apache/flink/pull/16097


   






[jira] [Issue Comment Deleted] (FLINK-18701) NOT NULL constraint is not guaranteed when aggregation split is enabled

2021-06-09 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-18701:
-
Comment: was deleted

(was: I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I 
help the community manage its development. I see this issue has been marked as 
Major but is unassigned and neither it nor its Sub-Tasks have been updated 
for 30 days. I have gone ahead and added a "stale-major" label to the issue. If this 
ticket is a Major, please either assign yourself or give an update. Afterwards, 
please remove the label or in 7 days the issue will be deprioritized.
)

> NOT NULL constraint is not guaranteed when aggregation split is enabled
> ---
>
> Key: FLINK-18701
> URL: https://issues.apache.org/jira/browse/FLINK-18701
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.1
>Reporter: Jark Wu
>Priority: Minor
>





[jira] [Closed] (FLINK-19847) Can we create a fast support on the Nested table join?

2021-06-09 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-19847.

Resolution: Invalid

> Can we create a fast support on the Nested table join?
> --
>
> Key: FLINK-19847
> URL: https://issues.apache.org/jira/browse/FLINK-19847
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Affects Versions: 1.11.1
>Reporter: xiaogang zhou
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> In CommonLookupJoin, one TODO is:
> support nested lookup keys in the future,
> // currently we only support top-level lookup keys
>
> Can we quickly add support for joins on Array columns? Thanks.





[jira] [Commented] (FLINK-19847) Can we create a fast support on the Nested table join?

2021-06-09 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359829#comment-17359829
 ] 

Jingsong Lee commented on FLINK-19847:
--

Even if we want to support nested table joins, the join should be on Row or Map 
instead of Array.

Because an Array is a special structure with index-based access, I am not sure it is 
safe to support equality on it.
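A quick plain-Java illustration of the equality concern (this is standard JDK behavior, not Flink code): native arrays inherit reference equality from `Object`, so value-based comparison of array join keys needs explicit element-wise handling via `Arrays.equals`, and nested arrays need `Arrays.deepEquals`.

```java
import java.util.Arrays;

public class ArrayEqualityDemo {
    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 2, 3};

        // Arrays use reference equality: equals() and hashCode() come from
        // Object, so two arrays with identical contents are "different" keys.
        System.out.println(a.equals(b));          // false
        System.out.println(Arrays.equals(a, b));  // true

        // Nested structures need deep comparison.
        int[][] x = {{1}, {2}};
        int[][] y = {{1}, {2}};
        System.out.println(Arrays.deepEquals(x, y)); // true
    }
}
```

This is one reason a runtime that wants to join or group on array-typed columns has to define its own value-equality and hashing rather than rely on the JVM defaults.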

> Can we create a fast support on the Nested table join?
> --
>
> Key: FLINK-19847
> URL: https://issues.apache.org/jira/browse/FLINK-19847
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Affects Versions: 1.11.1
>Reporter: xiaogang zhou
>Priority: Minor
>  Labels: auto-deprioritized-major
>





[jira] [Closed] (FLINK-22877) Remove BatchTableEnvironment and related API classes

2021-06-09 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-22877.

Fix Version/s: 1.14.0
 Release Note: Due to the removal of BatchTableEnvironment, 
BatchTableSource and BatchTableSink have been removed as well. Use 
DynamicTableSource and DynamicTableSink instead. They support the old 
InputFormat and OutputFormat interfaces as runtime providers if necessary.
   Resolution: Fixed

commit 08910f74e4f22ce9afaba3536fa213887667371e
[table] Remove remaining comments around DataSet

commit a55a67d2ff00602f794cf621e9971feeb5bfff77
[table] Remove outdated comments

commit eb5f094e1805f0a689e29deb35f242c79abc8676
[table] Remove BatchTableSink and related classes

commit 1cccef02ec0f10f9dfbdf62b2e4966fedeea9ff6
[table] Remove BatchTableSource and related classes

commit e46999388d844939bbc9e66254f53765722dd358
[table] Remove BatchTableEnvironment and related classes

> Remove BatchTableEnvironment and related API classes
> 
>
> Key: FLINK-22877
> URL: https://issues.apache.org/jira/browse/FLINK-22877
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
> Fix For: 1.14.0
>
>
> Remove BatchTableEnvironment and other DataSet related API classes.





[jira] [Closed] (FLINK-16903) Add sink.parallelism for file system factory

2021-06-09 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-16903.

Resolution: Duplicate

> Add sink.parallelism for file system factory
> 
>
> Key: FLINK-16903
> URL: https://issues.apache.org/jira/browse/FLINK-16903
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Reporter: Jingsong Lee
>Priority: Major
>
> A single task may be writing multiple files at the same time. If the 
> parallelism is too high, it may lead to a large number of small files. If the 
> parallelism is too small, throughput may be insufficient. This requires that 
> the user be able to specify the parallelism.
>  * Default is the same as upstream transformation
>  * Users can specify parallelism too.
> |‘connector.sink.parallelism’ = ...|





[jira] [Closed] (FLINK-16868) Table/SQL doesn't support custom trigger

2021-06-09 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-16868.

Resolution: Invalid

Re-open this if we want to introduce Flink SQL late/early fire types.

> Table/SQL doesn't support custom trigger
> 
>
> Key: FLINK-16868
> URL: https://issues.apache.org/jira/browse/FLINK-16868
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API, Table SQL / Runtime
>Reporter: Jimmy Wong
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Table/SQL doesn't support custom triggers, such as CountTrigger, 
> ContinuousEventTimeTrigger/ContinuousProcessingTimeTrigger. Do we have plans 
> to support this?
>  





[jira] [Created] (FLINK-22941) support column comment in catalogTable column schema

2021-06-09 Thread peng wang (Jira)
peng wang created FLINK-22941:
-

 Summary: support column comment in catalogTable column schema
 Key: FLINK-22941
 URL: https://issues.apache.org/jira/browse/FLINK-22941
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / Planner
Reporter: peng wang


We found that the column comment is supported in Flink DDL syntax, but it is dropped 
when the SqlCreateTable class is converted to the CatalogTable class.





[jira] [Updated] (FLINK-22941) support column comment in catalogTable column schema

2021-06-09 Thread peng wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

peng wang updated FLINK-22941:
--
Description: 
We found that the column comment is supported in Flink DDL syntax, but it is dropped 
when the SqlCreateTable class is converted to the CatalogTable class.

Could we record the column comment in CatalogTable?

  was:we found that column comment is support in flink ddl syntax, but it was 
dropped when SqlCreateTable class convert to  CatalogTable class. 


> support column comment in catalogTable column schema
> 
>
> Key: FLINK-22941
> URL: https://issues.apache.org/jira/browse/FLINK-22941
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Planner
>Reporter: peng wang
>Priority: Major
>
> We found that the column comment is supported in Flink DDL syntax, but it is 
> dropped when the SqlCreateTable class is converted to the CatalogTable class. 
> Could we record the column comment in CatalogTable?





[GitHub] [flink] PatrickRen commented on a change in pull request #16115: [FLINK-22766][connector/kafka] Report offsets and Kafka consumer metrics in Flink metric group

2021-06-09 Thread GitBox


PatrickRen commented on a change in pull request #16115:
URL: https://github.com/apache/flink/pull/16115#discussion_r648045623



##
File path: 
flink-connectors/flink-connector-kafka/src/test/java/org/apache/flink/connector/kafka/source/metrics/KafkaSourceReaderMetricsListener.java
##
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.kafka.source.metrics;
+
+import org.apache.flink.metrics.CharacterFilter;
+import org.apache.flink.metrics.Metric;
+import org.apache.flink.runtime.jobgraph.OperatorID;
+import org.apache.flink.runtime.metrics.MetricRegistry;
+import org.apache.flink.runtime.metrics.groups.OperatorMetricGroup;
+import org.apache.flink.runtime.metrics.groups.UnregisteredMetricGroups;
+import org.apache.flink.runtime.metrics.scope.ScopeFormat;
+import org.apache.flink.runtime.metrics.util.TestingMetricRegistry;
+
+import org.apache.kafka.common.TopicPartition;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/** A utility class for mocking {@link OperatorMetricGroup} and collecting reported metrics. */
+public class KafkaSourceReaderMetricsListener {
+    public static final String OPERATOR_NAME = "operator_metric_mocker";
+    public static final char DELIMITER = '.';
+
+    Map<String, Metric> kafkaSourceReaderMetrics = new HashMap<>();
+    Map<String, Metric> kafkaConsumerMetrics = new HashMap<>();
+
+    private final OperatorMetricGroup metricGroup;
+
+    public KafkaSourceReaderMetricsListener() {
+        TestingMetricRegistry registry =
+                TestingMetricRegistry.builder()
+                        .setDelimiter(DELIMITER)
+                        .setRegisterConsumer(
+                                (metric, metricName, group) -> {
+                                    String parentGroupScope =
+                                            group.getLogicalScope(CharacterFilter.NO_OP_FILTER);
+
+                                    // We only care about Kafka source reader metrics
+                                    if (isKafkaSourceReaderMetric(parentGroupScope)) {
+
+                                        // KafkaConsumer metric
+                                        if (isKafkaConsumerMetric(parentGroupScope)) {
+                                            kafkaConsumerMetrics.put(metricName, metric);
+                                            return;
+                                        }
+
+                                        final String topic =
+                                                group.getAllVariables()
+                                                        .get(
+                                                                ScopeFormat.asVariable(
+                                                                        KafkaSourceReaderMetrics
+                                                                                .TOPIC_GROUP));
+                                        final String partition =
+                                                group.getAllVariables()
+                                                        .get(
+                                                                ScopeFormat.asVariable(
+                                                                        KafkaSourceReaderMetrics
+                                                                                .PARTITION_GROUP));
+
+                                        if (topic == null && partition == null) {
+                                            // General metric
+                                            kafkaSourceReaderMetrics.put(metricName, metric);
+                                        } else {
+                                            // topic partition specific metric
+                                            kafkaSourceReaderMetrics.put(
+                                                    String.join(
+                                                            String.valueOf(DELIMITER),
+                                                            topic,
+                                                            partition,
+                                                            metricN

[jira] [Updated] (FLINK-22941) support column comment in catalogTable column schema

2021-06-09 Thread peng wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

peng wang updated FLINK-22941:
--
Affects Version/s: 1.13.0

> support column comment in catalogTable column schema
> 
>
> Key: FLINK-22941
> URL: https://issues.apache.org/jira/browse/FLINK-22941
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Planner
>Affects Versions: 1.13.0
>Reporter: peng wang
>Priority: Major
>
> We found that column comments are supported in Flink DDL syntax, but the 
> comment is dropped when the SqlCreateTable class is converted to the 
> CatalogTable class. 
> Should we record the column comment in CatalogTable?





[jira] [Commented] (FLINK-18637) Key group is not in KeyGroupRange

2021-06-09 Thread Chirag Dewan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359836#comment-17359836
 ] 

Chirag Dewan commented on FLINK-18637:
--

[~yunta], [~oripwk]

Is there any reason for concurrent access other than metrics?

I don't use any metrics, and I still get a "Key 0 is not in Key Group 32-63" 
exception. My job parallelism is 4, and I use an Integer in a value state, so 
hash code mutability should not be the cause either.

Are there any other reasons for this behavior? I also see a "Position out of 
bounds" exception, all with RocksDB.

 

Thanks
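
The unstable-hashCode failure mode tracked in FLINK-16193 can be reproduced with a simplified stand-in for the key-group assignment (a sketch; Flink's actual logic in KeyGroupRangeAssignment additionally applies a murmur hash to key.hashCode()):

```java
// Simplified stand-in for Flink's key-group assignment; the real code in
// KeyGroupRangeAssignment applies MathUtils.murmurHash(key.hashCode())
// modulo maxParallelism.
class KeyGroups {
    static int assignToKeyGroup(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }
}

// A key whose hashCode changes after it was used to write state -- the bug
// pattern behind "Key group is not in KeyGroupRange".
class MutableKey {
    int value;

    MutableKey(int value) {
        this.value = value;
    }

    @Override
    public int hashCode() {
        return 31 + value;
    }
}
```

With a plain String or immutable Integer key, as in the report above, the assignment at write time and read time always agrees, so a mismatch would point at another cause.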

> Key group is not in KeyGroupRange
> -
>
> Key: FLINK-18637
> URL: https://issues.apache.org/jira/browse/FLINK-18637
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
> Environment: Version: 1.10.0, Rev:, Date:
> OS current user: yarn
>  Current Hadoop/Kerberos user: hadoop
>  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.141-b15
>  Maximum heap size: 28960 MiBytes
>  JAVA_HOME: /usr/java/jdk1.8.0_141/jre
>  Hadoop version: 2.8.5-amzn-6
>  JVM Options:
>  -Xmx30360049728
>  -Xms30360049728
>  -XX:MaxDirectMemorySize=4429185024
>  -XX:MaxMetaspaceSize=1073741824
>  -XX:+UseG1GC
>  -XX:+UnlockDiagnosticVMOptions
>  -XX:+G1SummarizeConcMark
>  -verbose:gc
>  -XX:+PrintGCDetails
>  -XX:+PrintGCDateStamps
>  -XX:+UnlockCommercialFeatures
>  -XX:+FlightRecorder
>  -XX:+DebugNonSafepoints
>  
> -XX:FlightRecorderOptions=defaultrecording=true,settings=/home/hadoop/heap.jfc,dumponexit=true,dumponexitpath=/var/lib/hadoop-yarn/recording.jfr,loglevel=info
>  
> -Dlog.file=/var/log/hadoop-yarn/containers/application_1593935560662_0002/container_1593935560662_0002_01_02/taskmanager.log
>  -Dlog4j.configuration=[file:./log4j.properties|file:///log4j.properties]
>  Program Arguments:
>  -Dtaskmanager.memory.framework.off-heap.size=134217728b
>  -Dtaskmanager.memory.network.max=1073741824b
>  -Dtaskmanager.memory.network.min=1073741824b
>  -Dtaskmanager.memory.framework.heap.size=134217728b
>  -Dtaskmanager.memory.managed.size=23192823744b
>  -Dtaskmanager.cpu.cores=7.0
>  -Dtaskmanager.memory.task.heap.size=30225832000b
>  -Dtaskmanager.memory.task.off-heap.size=3221225472b
>  --configDir.
>  
> -Djobmanager.rpc.address=ip-10-180-30-250.us-west-2.compute.internal-Dweb.port=0
>  -Dweb.tmpdir=/tmp/flink-web-64f613cf-bf04-4a09-8c14-75c31b619574
>  -Djobmanager.rpc.port=33739
>  -Drest.address=ip-10-180-30-250.us-west-2.compute.internal
>Reporter: Ori Popowski
>Priority: Major
>
> I'm getting this error when creating a savepoint. I've read in 
> https://issues.apache.org/jira/browse/FLINK-16193 that it's caused by 
> unstable hashcode or equals on the key, or improper use of 
> {{reinterpretAsKeyedStream}}.
>   
>  My key is a string and I don't use {{reinterpretAsKeyedStream}}.
>  
> {code:java}
> senv
>   .addSource(source)
>   .flatMap(…)
>   .filterWith { case (metadata, _, _) => … }
>   .assignTimestampsAndWatermarks(new 
> BoundedOutOfOrdernessTimestampExtractor(…))
>   .keyingBy { case (meta, _) => meta.toPathString }
>   .process(new TruncateLargeSessions(config.sessionSizeLimit))
>   .keyingBy { case (meta, _) => meta.toPathString }
>   .window(EventTimeSessionWindows.withGap(Time.of(…)))
>   .process(new ProcessSession(sessionPlayback, config))
>   .addSink(sink){code}
>  
> {code:java}
> org.apache.flink.util.FlinkException: Triggering a savepoint for the job 
> 962fc8e984e7ca1ed65a038aa62ce124 failed.
>   at 
> org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:633)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$savepoint$9(CliFrontend.java:611)
>   at 
> org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:843)
>   at 
> org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:608)
>   at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:910)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>   at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
> Caused by: java.util.concurrent.CompletionException: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.checkpoint.CheckpointException: The job has failed.
>   at 
> org.apache.flink.runtime.scheduler.SchedulerBase.lambda$triggerSavepoint$3(SchedulerBase.java:744)
>   at 
> java.util.concurrent.CompletableFutur

[GitHub] [flink] pnowojski merged pull request #15982: [FLINK-22284] Log the remote address when channel is closed in NettyPartitionRequestClient

2021-06-09 Thread GitBox


pnowojski merged pull request #15982:
URL: https://github.com/apache/flink/pull/15982


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16120: [FLINK-22939][azure] Generalize JDK switch

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16120:
URL: https://github.com/apache/flink/pull/16120#issuecomment-857445858


   
   ## CI report:
   
   * 5630e8d58107249c5a33e5d56e60db481f12b74a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18822)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16119: [hotfix][docs] 'ALTER.md' Format repair

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16119:
URL: https://github.com/apache/flink/pull/16119#issuecomment-857445757


   
   ## CI report:
   
   * 971ef26d9db9cd9bb4f60c59bf9451270ea71885 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18821)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16116: [hotfix][docs] use non deprecated syntax in SinkFunction code sample

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16116:
URL: https://github.com/apache/flink/pull/16116#issuecomment-857396931


   
   ## CI report:
   
   * eaf7755e4f56b894c8ec57ff6a6053d0534c4d22 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18816)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] pnowojski commented on pull request #15982: [FLINK-22284] Log the remote address when channel is closed in NettyPartitionRequestClient

2021-06-09 Thread GitBox


pnowojski commented on pull request #15982:
URL: https://github.com/apache/flink/pull/15982#issuecomment-857467880


   Thanks for the review @gaoyunhaii .
   
   Just FYI, I've merged this PR; however, I didn't notice that it hadn't been 
squashed :( 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-22284) Null address will be logged when channel is closed in NettyPartitionRequestClient

2021-06-09 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-22284.
--
Resolution: Fixed

Merged to master as 4d73310ce61^ and 4d73310ce61 (I accidentally forgot to 
squash the commits / to check that the commits were squashed before merging).

> Null address will be logged when channel is closed in 
> NettyPartitionRequestClient
> -
>
> Key: FLINK-22284
> URL: https://issues.apache.org/jira/browse/FLINK-22284
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Affects Versions: 1.13.0
>Reporter: Zhilong Hong
>Assignee: Zhilong Hong
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.14.0
>
> Attachments: exception.png
>
>
> In NettyPartitionRequestClient#requestSubpartition, when a channel is closed, 
> the channel throws a LocalTransportException with the error message 
> "Sending the partition request to 'null' failed.". The message is confusing, 
> since we cannot tell where the remote client connected to this channel is 
> located, and we cannot track down that TaskExecutor to find out what 
> happened.
>  
> I'm also wondering whether we should use TransportException instead of 
> LocalTransportException here, because it's a little confusing to see a 
> LocalTransportException thrown when a remote channel is closed.
>  
> !exception.png!
>  
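
A sketch of the message-building improvement being discussed (illustrative only, not Flink's code): resolve the channel's actual remote address, with an explicit fallback instead of formatting a connection id that is already null.

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;

// Illustrative sketch, not Flink's code: build the failure message from the
// channel's remote address, falling back to a placeholder rather than
// printing 'null'.
class CloseMessages {
    static String partitionRequestFailure(SocketAddress remoteAddress) {
        Object target = remoteAddress == null ? "unknown remote" : remoteAddress;
        return "Sending the partition request to '" + target + "' failed.";
    }
}
```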





[jira] [Comment Edited] (FLINK-16543) Support setting schedule mode by config for Blink planner in batch mode

2021-06-09 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359839#comment-17359839
 ] 

Jingsong Lee edited comment on FLINK-16543 at 6/9/21, 7:46 AM:
---

Now you can set "table.exec.shuffle-mode" to ALL_EDGES_PIPELINED.

It will schedule each pipelined region in an eager way.


was (Author: lzljs3620320):
Now you can set "table.exec.shuffle-mode" to ALL_EDGES_PIPELINED.

> Support setting schedule mode by config for Blink planner in batch mode
> ---
>
> Key: FLINK-16543
> URL: https://issues.apache.org/jira/browse/FLINK-16543
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Table SQL / Runtime
>Reporter: Caizhi Weng
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently Blink planner is bound to use the 
> {{LAZY_FROM_SOURCES_WITH_BATCH_SLOT_REQUEST}} schedule mode in batch mode. It 
> is hard coded in the {{ExecutorUtils.setBatchProperties}} method.
> {code:java}
> public static void setBatchProperties(StreamGraph streamGraph, TableConfig 
> tableConfig) {
>   streamGraph.getStreamNodes().forEach(
>   sn -> sn.setResources(ResourceSpec.UNKNOWN, 
> ResourceSpec.UNKNOWN));
>   streamGraph.setChaining(true);
>   streamGraph.setAllVerticesInSameSlotSharingGroupByDefault(false);
>   
> streamGraph.setScheduleMode(ScheduleMode.LAZY_FROM_SOURCES_WITH_BATCH_SLOT_REQUEST);
>   streamGraph.setStateBackend(null);
>   if (streamGraph.getCheckpointConfig().isCheckpointingEnabled()) {
>   throw new IllegalArgumentException("Checkpoint is not supported 
> for batch jobs.");
>   }
>   if (ExecutorUtils.isShuffleModeAllBatch(tableConfig)) {
>   streamGraph.setBlockingConnectionsBetweenChains(true);
>   }
> }
> {code}
> But under certain use cases where execution time is short, especially under 
> OLAP use cases, {{LAZY_FROM_SOURCES_WITH_BATCH_SLOT_REQUEST}} might not be 
> the best choice, as it will cause data to be spilled onto disks when 
> shuffling. Under such use cases, {{EAGER}} schedule mode with {{PIPELINED}} 
> shuffle mode is preferred.
> Currently we can set shuffle mode by the {{table.exec.shuffle-mode}} table 
> config, and we would like to add another config to change the schedule mode 
> for Blink planner in batch mode.





[jira] [Closed] (FLINK-16543) Support setting schedule mode by config for Blink planner in batch mode

2021-06-09 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-16543.

Resolution: Duplicate

Now you can set "table.exec.shuffle-mode" to ALL_EDGES_PIPELINED.

> Support setting schedule mode by config for Blink planner in batch mode
> ---
>
> Key: FLINK-16543
> URL: https://issues.apache.org/jira/browse/FLINK-16543
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Table SQL / Runtime
>Reporter: Caizhi Weng
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently Blink planner is bound to use the 
> {{LAZY_FROM_SOURCES_WITH_BATCH_SLOT_REQUEST}} schedule mode in batch mode. It 
> is hard coded in the {{ExecutorUtils.setBatchProperties}} method.
> {code:java}
> public static void setBatchProperties(StreamGraph streamGraph, TableConfig 
> tableConfig) {
>   streamGraph.getStreamNodes().forEach(
>   sn -> sn.setResources(ResourceSpec.UNKNOWN, 
> ResourceSpec.UNKNOWN));
>   streamGraph.setChaining(true);
>   streamGraph.setAllVerticesInSameSlotSharingGroupByDefault(false);
>   
> streamGraph.setScheduleMode(ScheduleMode.LAZY_FROM_SOURCES_WITH_BATCH_SLOT_REQUEST);
>   streamGraph.setStateBackend(null);
>   if (streamGraph.getCheckpointConfig().isCheckpointingEnabled()) {
>   throw new IllegalArgumentException("Checkpoint is not supported 
> for batch jobs.");
>   }
>   if (ExecutorUtils.isShuffleModeAllBatch(tableConfig)) {
>   streamGraph.setBlockingConnectionsBetweenChains(true);
>   }
> }
> {code}
> But under certain use cases where execution time is short, especially under 
> OLAP use cases, {{LAZY_FROM_SOURCES_WITH_BATCH_SLOT_REQUEST}} might not be 
> the best choice, as it will cause data to be spilled onto disks when 
> shuffling. Under such use cases, {{EAGER}} schedule mode with {{PIPELINED}} 
> shuffle mode is preferred.
> Currently we can set shuffle mode by the {{table.exec.shuffle-mode}} table 
> config, and we would like to add another config to change the schedule mode 
> for Blink planner in batch mode.





[jira] [Updated] (FLINK-22256) Persist checkpoint type information

2021-06-09 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-22256:
---
Priority: Minor  (was: Major)

> Persist checkpoint type information
> ---
>
> Key: FLINK-22256
> URL: https://issues.apache.org/jira/browse/FLINK-22256
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Reporter: Fabian Paul
>Priority: Minor
>
> As a user, it is retrospectively difficult to determine what kind of 
> checkpoint (i.e. incremental, unaligned, ...) was performed when looking only 
> at the persisted checkpoint metadata.
> The only way would be to look into the execution configuration of the job 
> which might not be available anymore and can be scattered across the 
> application code and cluster configuration.
> It would be highly beneficial if such information would be part of the 
> persisted metadata to not track these external pointers.
>  It would also be great to persist the metadata information in a standardized 
> format so that external projects don't need to use Flink's metadata 
> serializers to access it.





[jira] [Updated] (FLINK-22382) ProcessFailureCancelingITCase.testCancelingOnProcessFailure

2021-06-09 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-22382:
---
Component/s: (was: Runtime / Task)

> ProcessFailureCancelingITCase.testCancelingOnProcessFailure
> ---
>
> Key: FLINK-22382
> URL: https://issues.apache.org/jira/browse/FLINK-22382
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Dawid Wysakowicz
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16896&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=5360d54c-8d94-5d85-304e-a89267eb785a&l=9756
> {code}
> Apr 20 18:05:14   Suppressed: java.util.concurrent.TimeoutException
> Apr 20 18:05:14   at 
> org.apache.flink.core.testutils.CommonTestUtils.waitUtil(CommonTestUtils.java:210)
> Apr 20 18:05:14   at 
> org.apache.flink.test.recovery.ProcessFailureCancelingITCase.waitUntilAtLeastOneTaskHasBeenDeployed(ProcessFailureCancelingITCase.java:236)
> Apr 20 18:05:14   at 
> org.apache.flink.test.recovery.ProcessFailureCancelingITCase.testCancelingOnProcessFailure(ProcessFailureCancelingITCase.java:193)
> Apr 20 18:05:14   at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> Apr 20 18:05:14   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Apr 20 18:05:14   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Apr 20 18:05:14   at 
> java.lang.reflect.Method.invoke(Method.java:498)
> Apr 20 18:05:14   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> Apr 20 18:05:14   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Apr 20 18:05:14   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> Apr 20 18:05:14   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Apr 20 18:05:14   at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> Apr 20 18:05:14   at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> Apr 20 18:05:14   at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> Apr 20 18:05:14   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> Apr 20 18:05:14   at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> Apr 20 18:05:14   at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> Apr 20 18:05:14   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> Apr 20 18:05:14   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> Apr 20 18:05:14   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> Apr 20 18:05:14   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> Apr 20 18:05:14   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> Apr 20 18:05:14   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> Apr 20 18:05:14   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> Apr 20 18:05:14   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> Apr 20 18:05:14   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> Apr 20 18:05:14   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> Apr 20 18:05:14   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> Apr 20 18:05:14   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> Apr 20 18:05:14   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> Apr 20 18:05:14   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> Apr 20 18:05:14   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> Apr 20 18:05:14   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> Apr 20 18:05:14   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}





[GitHub] [flink] Myasuka commented on a change in pull request #15200: [FLINK-21355] Send changes to the state changelog

2021-06-09 Thread GitBox


Myasuka commented on a change in pull request #15200:
URL: https://github.com/apache/flink/pull/15200#discussion_r648057755



##
File path: 
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogMapState.java
##
@@ -51,16 +63,28 @@ public UV get(UK key) throws Exception {
 
     @Override
     public void put(UK key, UV value) throws Exception {
+        if (getValueSerializer() instanceof MapSerializer) {
+            changeLogger.valueElementChanged(
+                    out -> {
+                        serializeKey(key, out);
+                        serializeValue(value, out);
+                    },
+                    getCurrentNamespace());
+        } else {
+            changeLogger.valueAdded(singletonMap(key, value), getCurrentNamespace());

Review comment:
   In general, I don't think it's an implementation detail. 
`MapStateDescriptor` only allows `MapSerializer` or `MapTypeInfo`, and the only 
serializer would be created in `MapTypeInfo` is `MapSerializer`: see code 
[here](https://github.com/apache/flink/blob/4d73310ce6152e4d6bdccca44b8e1f8ecec0e1b2/flink-core/src/main/java/org/apache/flink/api/java/typeutils/MapTypeInfo.java#L115).

##
File path: 
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogListState.java
##
@@ -39,53 +44,83 @@
         extends AbstractChangelogState<K, N, List<V>, InternalListState<K, N, V>>
         implements InternalListState<K, N, V> {
 
-    ChangelogListState(InternalListState<K, N, V> delegatedState) {
-        super(delegatedState);
+    ChangelogListState(
+            InternalListState<K, N, V> delegatedState,
+            KvStateChangeLogger<List<V>, N> changeLogger) {
+        super(delegatedState, changeLogger);
     }
 
     @Override
     public void update(List<V> values) throws Exception {
+        changeLogger.valueUpdated(values, getCurrentNamespace());
         delegatedState.update(values);
     }
 
     @Override
     public void addAll(List<V> values) throws Exception {
+        changeLogger.valueAdded(values, getCurrentNamespace());
         delegatedState.addAll(values);
     }
 
     @Override
     public void updateInternal(List<V> valueToStore) throws Exception {
+        changeLogger.valueUpdated(valueToStore, getCurrentNamespace());
         delegatedState.updateInternal(valueToStore);
     }
 
     @Override
     public void add(V value) throws Exception {
+        if (getValueSerializer() instanceof ListSerializer) {
+            changeLogger.valueElementChanged(
+                    w ->
+                            ((ListSerializer<V>) getValueSerializer())
+                                    .getElementSerializer()
+                                    .serialize(value, w),
+                    getCurrentNamespace());
+        } else {
+            changeLogger.valueAdded(singletonList(value), getCurrentNamespace());

Review comment:
   The same as above.
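
The delegation pattern under review can be sketched in plain Java (illustrative names, not Flink's API): each mutation is appended to a logical changelog before it is applied to the wrapped state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative names, not Flink's API: every mutation is recorded in a
// logical changelog before it is applied to the wrapped ("delegated") state,
// mirroring changeLogger.valueAdded(...) followed by delegatedState.addAll(...).
class ChangeLoggingListState<V> {
    private final List<V> delegated = new ArrayList<>();
    final List<String> changelog = new ArrayList<>();

    void add(V value) {
        // single-element change, cf. valueAdded(singletonList(value), namespace)
        changelog.add("ADD " + Collections.singletonList(value));
        delegated.add(value);
    }

    void update(List<V> values) {
        changelog.add("SET " + values);
        delegated.clear();
        delegated.addAll(values);
    }

    List<V> get() {
        return delegated;
    }
}
```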








[GitHub] [flink] Myasuka commented on a change in pull request #15200: [FLINK-21355] Send changes to the state changelog

2021-06-09 Thread GitBox


Myasuka commented on a change in pull request #15200:
URL: https://github.com/apache/flink/pull/15200#discussion_r648057861



##
File path: 
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogListState.java
##
@@ -39,53 +44,83 @@
         extends AbstractChangelogState<K, N, List<V>, InternalListState<K, N, V>>
         implements InternalListState<K, N, V> {
 
-    ChangelogListState(InternalListState<K, N, V> delegatedState) {
-        super(delegatedState);
+    ChangelogListState(
+            InternalListState<K, N, V> delegatedState,
+            KvStateChangeLogger<List<V>, N> changeLogger) {
+        super(delegatedState, changeLogger);
     }
 
     @Override
     public void update(List<V> values) throws Exception {
+        changeLogger.valueUpdated(values, getCurrentNamespace());
         delegatedState.update(values);
     }
 
     @Override
     public void addAll(List<V> values) throws Exception {
+        changeLogger.valueAdded(values, getCurrentNamespace());
         delegatedState.addAll(values);
     }
 
     @Override
     public void updateInternal(List<V> valueToStore) throws Exception {
+        changeLogger.valueUpdated(valueToStore, getCurrentNamespace());
         delegatedState.updateInternal(valueToStore);
     }
 
     @Override
     public void add(V value) throws Exception {
+        if (getValueSerializer() instanceof ListSerializer) {
+            changeLogger.valueElementChanged(
+                    w ->
+                            ((ListSerializer<V>) getValueSerializer())
+                                    .getElementSerializer()
+                                    .serialize(value, w),
+                    getCurrentNamespace());
+        } else {
+            changeLogger.valueAdded(singletonList(value), getCurrentNamespace());

Review comment:
   The same reason as above comment in `ChangelogMapState`.








[GitHub] [flink] wangyang0918 opened a new pull request #16121: [FLINK-20219][coordination] Shutdown cluster externally should only clean up the HA data if all the jobs have finished

2021-06-09 Thread GitBox


wangyang0918 opened a new pull request #16121:
URL: https://github.com/apache/flink/pull/16121


   
   
   ## What is the purpose of the change
   
   FLINK-20695 added support for cleaning up the HA-related data (ZNodes/ConfigMaps, 
as well as the HA storage) once a job reaches a globally terminal state. This 
PR focuses on gracefully shutting down the cluster once an external signal is received.
   
   What is the graceful shutdown behavior?
   * If the Flink cluster still has running jobs, the HA-related data is 
retained.
   * If no jobs are running, all the ZNodes/ConfigMaps and the HA storage are 
cleaned up.
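
The described contract can be sketched with plain CompletableFutures (illustrative only, not Flink's API):

```java
import java.util.concurrent.CompletableFuture;

// Illustrative sketch, not Flink's API: the shutdown future completes
// exceptionally while jobs are still running, and HA data may only be
// cleaned up when it completes normally.
class SessionClusterShutdown {
    static CompletableFuture<Void> shutDownCluster(int runningJobs) {
        if (runningJobs > 0) {
            CompletableFuture<Void> failed = new CompletableFuture<>();
            failed.completeExceptionally(
                    new IllegalStateException("Not all jobs have finished"));
            return failed;
        }
        return CompletableFuture.completedFuture(null);
    }

    /** HA data is retained unless the shutdown future completed without error. */
    static boolean shouldCleanUpHaData(CompletableFuture<Void> shutDownFuture) {
        return shutDownFuture.isDone() && !shutDownFuture.isCompletedExceptionally();
    }
}
```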
   
   
   ## Brief change log
   
   * Shutdown cluster externally should only clean up the HA data if all the 
jobs have finished
   
   
   ## Verifying this change
   
   * Add corresponding unit tests
 * 
`testShutDownClusterWithRunningJobsShouldCompleteShutDownFutureExceptionally` 
 * `testShutDownFutureCompleteExceptionallyShouldNotCleanUpHAData`
 * 
`testShutDownFutureCompleteWithNotAllJobsFinishedExceptionShouldNotCleanUpHAData`
 * `testShutDownFutureCompleteNormallyShouldCleanUpHAData`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (**yes** / no / 
don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-20219) Rethink the HA related ZNodes/ConfigMap clean up for session cluster

2021-06-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-20219:
---
Labels: pull-request-available  (was: )

> Rethink the HA related ZNodes/ConfigMap clean up for session cluster
> 
>
> Key: FLINK-20219
> URL: https://issues.apache.org/jira/browse/FLINK-20219
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes, Deployment / Scripts, Runtime / 
> Coordination
>Affects Versions: 1.12.0
>Reporter: Yang Wang
>Assignee: Yang Wang
>Priority: Major
>  Labels: pull-request-available
>
> While testing the Kubernetes HA service, I realized that ConfigMap clean-up 
> for session clusters (both standalone and native) is not straightforward.
>  * For the native K8s session, we suggest users stop it via {{echo 
> 'stop' | ./bin/kubernetes-session.sh -Dkubernetes.cluster-id= 
> -Dexecution.attached=true}}. Currently, this has the same effect as {{kubectl 
> delete deploy }}. It does not clean up the leader 
> ConfigMaps (e.g. ResourceManager, Dispatcher, RestServer, JobManager). Even 
> if there are no running jobs before the stop, some ConfigMaps are still 
> retained. So when and how should the retained ConfigMaps be cleaned up? Should 
> the user do it manually, or could Flink's client provide some utilities?
>  * For the standalone session, I think it is reasonable for users to do 
> the HA ConfigMap clean-up manually.
>  
> We could use the following command to do the manual clean-up.
> {{kubectl delete cm 
> --selector='app=,configmap-type=high-availability'}}
>  
> Note: This is not a problem for Flink application clusters, since we can do 
> the clean-up automatically once all the running jobs in the application 
> reach a terminal state (e.g. FAILED, CANCELED, FINISHED) and then destroy the 
> Flink cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22547) OperatorCoordinatorHolderTest. verifyCheckpointEventOrderWhenCheckpointFutureCompletesLate fail

2021-06-09 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-22547:
---
Description: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17499&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=05b74a19-4ee4-5036-c46f-ada307df6cf0&l=7502


{noformat}
2021-05-02T22:45:49.7343556Z May 02 22:45:49 java.lang.AssertionError
2021-05-02T22:45:49.7344688Z May 02 22:45:49at 
org.junit.Assert.fail(Assert.java:86)
2021-05-02T22:45:49.7345646Z May 02 22:45:49at 
org.junit.Assert.assertTrue(Assert.java:41)
2021-05-02T22:45:49.7346698Z May 02 22:45:49at 
org.junit.Assert.assertTrue(Assert.java:52)
2021-05-02T22:45:49.7353570Z May 02 22:45:49at 
org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolderTest.checkpointEventValueAtomicity(OperatorCoordinatorHolderTest.java:363)
2021-05-02T22:45:49.7355384Z May 02 22:45:49at 
org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolderTest.verifyCheckpointEventOrderWhenCheckpointFutureCompletesLate(OperatorCoordinatorHolderTest.java:331)
2021-05-02T22:45:49.7356826Z May 02 22:45:49at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2021-05-02T22:45:49.7904883Z May 02 22:45:49at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2021-05-02T22:45:49.7905443Z May 02 22:45:49at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2021-05-02T22:45:49.7905918Z May 02 22:45:49at 
java.lang.reflect.Method.invoke(Method.java:498)
2021-05-02T22:45:49.7906402Z May 02 22:45:49at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
2021-05-02T22:45:49.7907018Z May 02 22:45:49at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2021-05-02T22:45:49.7907555Z May 02 22:45:49at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
2021-05-02T22:45:49.7909318Z May 02 22:45:49at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2021-05-02T22:45:49.7910078Z May 02 22:45:49at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
2021-05-02T22:45:49.7910869Z May 02 22:45:49at 
org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
2021-05-02T22:45:49.7911597Z May 02 22:45:49at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
2021-05-02T22:45:49.7912383Z May 02 22:45:49at 
org.junit.rules.RunRules.evaluate(RunRules.java:20)
2021-05-02T22:45:49.7914058Z May 02 22:45:49at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
2021-05-02T22:45:49.7915214Z May 02 22:45:49at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
2021-05-02T22:45:49.7916058Z May 02 22:45:49at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
2021-05-02T22:45:49.7916852Z May 02 22:45:49at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
2021-05-02T22:45:49.7917550Z May 02 22:45:49at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
2021-05-02T22:45:49.7919076Z May 02 22:45:49at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
2021-05-02T22:45:49.7920292Z May 02 22:45:49at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
2021-05-02T22:45:49.7921041Z May 02 22:45:49at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
2021-05-02T22:45:49.7921788Z May 02 22:45:49at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
2021-05-02T22:45:49.7922652Z May 02 22:45:49at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
2021-05-02T22:45:49.7923564Z May 02 22:45:49at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
2021-05-02T22:45:49.7924834Z May 02 22:45:49at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
2021-05-02T22:45:49.7925709Z May 02 22:45:49at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
2021-05-02T22:45:49.7926617Z May 02 22:45:49at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
2021-05-02T22:45:49.7927661Z May 02 22:45:49at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
2021-05-02T22:45:49.7928497Z May 02 22:45:49at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
2021-05-02T22:45:49.7929783Z May 02 22:45:49at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
{noformat}


  
was:https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17499&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=05b

[jira] [Reopened] (FLINK-16296) Improve performance of BaseRowSerializer#serialize() for GenericRow

2021-06-09 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee reopened FLINK-16296:
--

> Improve performance of BaseRowSerializer#serialize() for GenericRow
> ---
>
> Key: FLINK-16296
> URL: https://issues.apache.org/jira/browse/FLINK-16296
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Jark Wu
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently, when serializing a {{GenericRow}} using 
> {{BaseRowSerializer#serialize()}}, there are two memory copies. The first is 
> GenericRow -> BinaryRow, the second is BinaryRow -> DataOutputView. 
> However, in theory, we can serialize a GenericRow into the DataOutputView 
> directly, because we already have all the column values and types. We can 
> serialize the null-bit part for all columns, then the fixed-length part for 
> all columns, and then the variable-length part. 
> For example, when a column is a BinaryString, we can serialize the position 
> and length, calculate the new variable-part length, and then serialize the 
> next column. If there is a generic type in the row, it falls back to the 
> previous way, but generic types in SQL are rare. 
> This is a general improvement and can benefit every operator. 
> If this can be done, then {{GenericRow}} is always the best choice for 
> producers, and {{BinaryRow}} is always the best choice for consumers. For 
> example, consider constructing a GenericRow or BinaryRow from existing 
> {{(String, Integer, Long)}} fields and serializing it into the network. The 
> GenericRow can simply wrap the {{(String, Integer, Long)}} values and 
> serialize them into the network directly with only one memory copy. However, 
> BinaryRow will copy the {{(String, Integer, Long)}} fields into a byte[] and 
> then copy that byte[] into the network, which involves two memory copies. 
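The one-copy-versus-two-copies idea above can be illustrated with a minimal,
self-contained sketch. This is not Flink code: the class and method names are
my own, and a `(String, int, long)` row is serialized with a plain
`DataOutputStream` standing in for the `DataOutputView`.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Illustrative sketch of serializing a (String, int, long) "row" either
// directly into the output view (one copy) or via an intermediate binary
// form (two copies). Names and layout are hypothetical, not Flink's.
public class DirectRowSerializeSketch {

    // One-copy path: write each field of the row straight to the stream,
    // analogous to GenericRow -> DataOutputView.
    static byte[] serializeDirect(String s, int i, long l) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(utf8.length); // variable-length field: length prefix
        out.write(utf8);           // ...then the bytes themselves
        out.writeInt(i);           // fixed-length fields
        out.writeLong(l);
        out.flush();
        return bos.toByteArray();
    }

    // Two-copy path: first materialize an intermediate byte[] (the "binary
    // row"), then copy it into the stream, mirroring
    // GenericRow -> BinaryRow -> DataOutputView.
    static byte[] serializeViaIntermediate(String s, int i, long l)
            throws IOException {
        byte[] binaryRow = serializeDirect(s, i, l); // copy 1: binary form
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        bos.write(binaryRow);                        // copy 2: into the view
        return bos.toByteArray();
    }
}
```

Both paths produce identical bytes; the point of the proposal is that the
direct path skips materializing the intermediate buffer entirely.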



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-16296) Improve performance of BaseRowSerializer#serialize() for GenericRow

2021-06-09 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-16296.

Resolution: Fixed

Closing this one; I think we should keep BinaryRow by design.

I also think this optimization would not gain much: copying a BinaryRow is 
very fast since it allocates no new objects.

> Improve performance of BaseRowSerializer#serialize() for GenericRow
> ---
>
> Key: FLINK-16296
> URL: https://issues.apache.org/jira/browse/FLINK-16296
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Jark Wu
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently, when serializing a {{GenericRow}} using 
> {{BaseRowSerializer#serialize()}}, there are two memory copies. The first is 
> GenericRow -> BinaryRow, the second is BinaryRow -> DataOutputView. 
> However, in theory, we can serialize a GenericRow into the DataOutputView 
> directly, because we already have all the column values and types. We can 
> serialize the null-bit part for all columns, then the fixed-length part for 
> all columns, and then the variable-length part. 
> For example, when a column is a BinaryString, we can serialize the position 
> and length, calculate the new variable-part length, and then serialize the 
> next column. If there is a generic type in the row, it falls back to the 
> previous way, but generic types in SQL are rare. 
> This is a general improvement and can benefit every operator. 
> If this can be done, then {{GenericRow}} is always the best choice for 
> producers, and {{BinaryRow}} is always the best choice for consumers. For 
> example, consider constructing a GenericRow or BinaryRow from existing 
> {{(String, Integer, Long)}} fields and serializing it into the network. The 
> GenericRow can simply wrap the {{(String, Integer, Long)}} values and 
> serialize them into the network directly with only one memory copy. However, 
> BinaryRow will copy the {{(String, Integer, Long)}} fields into a byte[] and 
> then copy that byte[] into the network, which involves two memory copies. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-16296) Improve performance of BaseRowSerializer#serialize() for GenericRow

2021-06-09 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-16296.

Resolution: Invalid

> Improve performance of BaseRowSerializer#serialize() for GenericRow
> ---
>
> Key: FLINK-16296
> URL: https://issues.apache.org/jira/browse/FLINK-16296
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Jark Wu
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently, when serializing a {{GenericRow}} using 
> {{BaseRowSerializer#serialize()}}, there are two memory copies. The first is 
> GenericRow -> BinaryRow, the second is BinaryRow -> DataOutputView. 
> However, in theory, we can serialize a GenericRow into the DataOutputView 
> directly, because we already have all the column values and types. We can 
> serialize the null-bit part for all columns, then the fixed-length part for 
> all columns, and then the variable-length part. 
> For example, when a column is a BinaryString, we can serialize the position 
> and length, calculate the new variable-part length, and then serialize the 
> next column. If there is a generic type in the row, it falls back to the 
> previous way, but generic types in SQL are rare. 
> This is a general improvement and can benefit every operator. 
> If this can be done, then {{GenericRow}} is always the best choice for 
> producers, and {{BinaryRow}} is always the best choice for consumers. For 
> example, consider constructing a GenericRow or BinaryRow from existing 
> {{(String, Integer, Long)}} fields and serializing it into the network. The 
> GenericRow can simply wrap the {{(String, Integer, Long)}} values and 
> serialize them into the network directly with only one memory copy. However, 
> BinaryRow will copy the {{(String, Integer, Long)}} fields into a byte[] and 
> then copy that byte[] into the network, which involves two memory copies. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #16121: [FLINK-20219][coordination] Shutdown cluster externally should only clean up the HA data if all the jobs have finished

2021-06-09 Thread GitBox


flinkbot commented on pull request #16121:
URL: https://github.com/apache/flink/pull/16121#issuecomment-857476535


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit d7a5a79edc4755d73df23b762cc5d38e4822314e (Wed Jun 09 
07:57:42 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-16295) Optimize BinaryString.copy to not materialize if there is javaObject

2021-06-09 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-16295.

Resolution: Invalid

> Optimize BinaryString.copy to not materialize if there is javaObject
> 
>
> Key: FLINK-16295
> URL: https://issues.apache.org/jira/browse/FLINK-16295
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Jingsong Lee
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> When object reuse is not enabled, this copy is a real performance killer.
> CC: [~jark] [~ykt836]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22593) SavepointITCase.testShouldAddEntropyToSavepointPath unstable

2021-06-09 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-22593:
---
Fix Version/s: 1.14.0

> SavepointITCase.testShouldAddEntropyToSavepointPath unstable
> 
>
> Key: FLINK-22593
> URL: https://issues.apache.org/jira/browse/FLINK-22593
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.14.0
>Reporter: Robert Metzger
>Priority: Blocker
>  Labels: stale-critical, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=9072&view=logs&j=cc649950-03e9-5fae-8326-2f1ad744b536&t=51cab6ca-669f-5dc0-221d-1e4f7dc4fc85
> {code}
> 2021-05-07T10:56:20.9429367Z May 07 10:56:20 [ERROR] Tests run: 13, Failures: 
> 0, Errors: 1, Skipped: 0, Time elapsed: 33.441 s <<< FAILURE! - in 
> org.apache.flink.test.checkpointing.SavepointITCase
> 2021-05-07T10:56:20.9445862Z May 07 10:56:20 [ERROR] 
> testShouldAddEntropyToSavepointPath(org.apache.flink.test.checkpointing.SavepointITCase)
>   Time elapsed: 2.083 s  <<< ERROR!
> 2021-05-07T10:56:20.9447106Z May 07 10:56:20 
> java.util.concurrent.ExecutionException: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.checkpoint.CheckpointException: Checkpoint 
> triggering task Sink: Unnamed (3/4) of job 4e155a20f0a7895043661a6446caf1cb 
> has not being executed at the moment. Aborting checkpoint. Failure reason: 
> Not all required tasks are currently running.
> 2021-05-07T10:56:20.9448194Z May 07 10:56:20  at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> 2021-05-07T10:56:20.9448797Z May 07 10:56:20  at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> 2021-05-07T10:56:20.9449428Z May 07 10:56:20  at 
> org.apache.flink.test.checkpointing.SavepointITCase.submitJobAndTakeSavepoint(SavepointITCase.java:305)
> 2021-05-07T10:56:20.9450160Z May 07 10:56:20  at 
> org.apache.flink.test.checkpointing.SavepointITCase.testShouldAddEntropyToSavepointPath(SavepointITCase.java:273)
> 2021-05-07T10:56:20.9450785Z May 07 10:56:20  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2021-05-07T10:56:20.9451331Z May 07 10:56:20  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2021-05-07T10:56:20.9451940Z May 07 10:56:20  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2021-05-07T10:56:20.9452498Z May 07 10:56:20  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2021-05-07T10:56:20.9453247Z May 07 10:56:20  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2021-05-07T10:56:20.9454007Z May 07 10:56:20  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2021-05-07T10:56:20.9454687Z May 07 10:56:20  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2021-05-07T10:56:20.9455302Z May 07 10:56:20  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2021-05-07T10:56:20.9455909Z May 07 10:56:20  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2021-05-07T10:56:20.9456493Z May 07 10:56:20  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> 2021-05-07T10:56:20.9457074Z May 07 10:56:20  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2021-05-07T10:56:20.9457636Z May 07 10:56:20  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 2021-05-07T10:56:20.9458157Z May 07 10:56:20  at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2021-05-07T10:56:20.9458678Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 2021-05-07T10:56:20.9459252Z May 07 10:56:20  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 2021-05-07T10:56:20.9459865Z May 07 10:56:20  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 2021-05-07T10:56:20.9460433Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2021-05-07T10:56:20.9461058Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2021-05-07T10:56:20.9461607Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 2021-05-07T10:56:20.9462159Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 2021-05-07T10:56:20.9462705Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 2021-05-07T10:56:20.9463243Z May 07 10:56:20  at 
> org.junit.runners.

[jira] [Closed] (FLINK-15538) Separate decimal implementations into separate sub-classes

2021-06-09 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-15538.

Resolution: Invalid

I think we can keep Decimal as it is, just like the JDK's BigDecimal.

> Separate decimal implementations into separate sub-classes
> --
>
> Key: FLINK-15538
> URL: https://issues.apache.org/jira/browse/FLINK-15538
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Liya Fan
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current implementation of Decimal values has two (somewhat independent) 
> implementations: one based on Long, the other on BigDecimal. 
> This makes the Decimal class unclear (both implementations are cluttered into 
> a single class) and less efficient (each method involves an if-else branch). 
> So in this issue, we make Decimal an abstract class and separate the two 
> implementations into two sub-classes. This makes the code clearer and more 
> efficient. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22593) SavepointITCase.testShouldAddEntropyToSavepointPath unstable

2021-06-09 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-22593:
---
Priority: Blocker  (was: Critical)

> SavepointITCase.testShouldAddEntropyToSavepointPath unstable
> 
>
> Key: FLINK-22593
> URL: https://issues.apache.org/jira/browse/FLINK-22593
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.14.0
>Reporter: Robert Metzger
>Priority: Blocker
>  Labels: stale-critical, test-stability
>
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=9072&view=logs&j=cc649950-03e9-5fae-8326-2f1ad744b536&t=51cab6ca-669f-5dc0-221d-1e4f7dc4fc85
> {code}
> 2021-05-07T10:56:20.9429367Z May 07 10:56:20 [ERROR] Tests run: 13, Failures: 
> 0, Errors: 1, Skipped: 0, Time elapsed: 33.441 s <<< FAILURE! - in 
> org.apache.flink.test.checkpointing.SavepointITCase
> 2021-05-07T10:56:20.9445862Z May 07 10:56:20 [ERROR] 
> testShouldAddEntropyToSavepointPath(org.apache.flink.test.checkpointing.SavepointITCase)
>   Time elapsed: 2.083 s  <<< ERROR!
> 2021-05-07T10:56:20.9447106Z May 07 10:56:20 
> java.util.concurrent.ExecutionException: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.checkpoint.CheckpointException: Checkpoint 
> triggering task Sink: Unnamed (3/4) of job 4e155a20f0a7895043661a6446caf1cb 
> has not being executed at the moment. Aborting checkpoint. Failure reason: 
> Not all required tasks are currently running.
> 2021-05-07T10:56:20.9448194Z May 07 10:56:20  at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> 2021-05-07T10:56:20.9448797Z May 07 10:56:20  at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> 2021-05-07T10:56:20.9449428Z May 07 10:56:20  at 
> org.apache.flink.test.checkpointing.SavepointITCase.submitJobAndTakeSavepoint(SavepointITCase.java:305)
> 2021-05-07T10:56:20.9450160Z May 07 10:56:20  at 
> org.apache.flink.test.checkpointing.SavepointITCase.testShouldAddEntropyToSavepointPath(SavepointITCase.java:273)
> 2021-05-07T10:56:20.9450785Z May 07 10:56:20  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2021-05-07T10:56:20.9451331Z May 07 10:56:20  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2021-05-07T10:56:20.9451940Z May 07 10:56:20  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2021-05-07T10:56:20.9452498Z May 07 10:56:20  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2021-05-07T10:56:20.9453247Z May 07 10:56:20  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2021-05-07T10:56:20.9454007Z May 07 10:56:20  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2021-05-07T10:56:20.9454687Z May 07 10:56:20  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2021-05-07T10:56:20.9455302Z May 07 10:56:20  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2021-05-07T10:56:20.9455909Z May 07 10:56:20  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2021-05-07T10:56:20.9456493Z May 07 10:56:20  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> 2021-05-07T10:56:20.9457074Z May 07 10:56:20  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2021-05-07T10:56:20.9457636Z May 07 10:56:20  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 2021-05-07T10:56:20.9458157Z May 07 10:56:20  at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2021-05-07T10:56:20.9458678Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 2021-05-07T10:56:20.9459252Z May 07 10:56:20  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 2021-05-07T10:56:20.9459865Z May 07 10:56:20  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 2021-05-07T10:56:20.9460433Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2021-05-07T10:56:20.9461058Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2021-05-07T10:56:20.9461607Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 2021-05-07T10:56:20.9462159Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 2021-05-07T10:56:20.9462705Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 2021-05-07T10:56:20.9463243Z May 07 10:56:20  at 
> org.junit.runners.ParentRunner.run(Pa

[GitHub] [flink] flinkbot edited a comment on pull request #15835: [FLINK-19164][release] Use versions-maven-plugin to properly update versions

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #15835:
URL: https://github.com/apache/flink/pull/15835#issuecomment-832515141


   
   ## CI report:
   
   * 33f9fe75e9b170376cc3fdd30631649a8afef516 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17566)
 
   * 0eccbda300dc699a81fcfb96f5278aa356734274 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15924: [FLINK-22670][FLIP-150][connector/common] Hybrid source baseline

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #15924:
URL: https://github.com/apache/flink/pull/15924#issuecomment-841943851


   
   ## CI report:
   
   * d1b81032b6f735cb72fc862b35512b9126b83915 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18810)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-22942) Disable upsert into syntax in Flink SQL

2021-06-09 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-22942:
--

 Summary: Disable upsert into syntax in Flink SQL
 Key: FLINK-22942
 URL: https://issues.apache.org/jira/browse/FLINK-22942
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Reporter: Leonard Xu


I found that we can write both *insert into* and *upsert into* in Flink SQL, 
but the latter syntax's semantics and behavior were never discussed; currently 
the two have the same implementation.

We should disable the latter until we support *upsert into* with the correct 
behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #16115: [FLINK-22766][connector/kafka] Report offsets and Kafka consumer metrics in Flink metric group

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16115:
URL: https://github.com/apache/flink/pull/16115#issuecomment-857350673


   
   ## CI report:
   
   * 64d70892f99ab436c6bf52012e124874a11cbb18 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18809)
 
   * 4c653575cd66258b4cdd0e54b8cf7dfeca474f9b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #16121: [FLINK-20219][coordination] Shutdown cluster externally should only clean up the HA data if all the jobs have finished

2021-06-09 Thread GitBox


flinkbot commented on pull request #16121:
URL: https://github.com/apache/flink/pull/16121#issuecomment-857483529


   
   ## CI report:
   
   * d7a5a79edc4755d73df23b762cc5d38e4822314e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22919) Remove support for Hadoop1.x in HadoopInputFormatCommonBase.getCredentialsFromUGI

2021-06-09 Thread Junfan Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359850#comment-17359850
 ] 

Junfan Zhang commented on FLINK-22919:
--

[~lirui] Please review it. Thanks

> Remove support for Hadoop1.x in 
> HadoopInputFormatCommonBase.getCredentialsFromUGI
> -
>
> Key: FLINK-22919
> URL: https://issues.apache.org/jira/browse/FLINK-22919
> Project: Flink
>  Issue Type: Improvement
>Reporter: Junfan Zhang
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18637) Key group is not in KeyGroupRange

2021-06-09 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359852#comment-17359852
 ] 

Yun Tang commented on FLINK-18637:
--

[~chiggi_dev] Could you share the exception stack in detail? Some code showing 
how you use the value state would also help.

> Key group is not in KeyGroupRange
> -
>
> Key: FLINK-18637
> URL: https://issues.apache.org/jira/browse/FLINK-18637
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
> Environment: Version: 1.10.0, Rev:, Date:
> OS current user: yarn
>  Current Hadoop/Kerberos user: hadoop
>  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.141-b15
>  Maximum heap size: 28960 MiBytes
>  JAVA_HOME: /usr/java/jdk1.8.0_141/jre
>  Hadoop version: 2.8.5-amzn-6
>  JVM Options:
>  -Xmx30360049728
>  -Xms30360049728
>  -XX:MaxDirectMemorySize=4429185024
>  -XX:MaxMetaspaceSize=1073741824
>  -XX:+UseG1GC
>  -XX:+UnlockDiagnosticVMOptions
>  -XX:+G1SummarizeConcMark
>  -verbose:gc
>  -XX:+PrintGCDetails
>  -XX:+PrintGCDateStamps
>  -XX:+UnlockCommercialFeatures
>  -XX:+FlightRecorder
>  -XX:+DebugNonSafepoints
>  
> -XX:FlightRecorderOptions=defaultrecording=true,settings=/home/hadoop/heap.jfc,dumponexit=true,dumponexitpath=/var/lib/hadoop-yarn/recording.jfr,loglevel=info
>  
> -Dlog.file=/var/log/hadoop-yarn/containers/application_1593935560662_0002/container_1593935560662_0002_01_02/taskmanager.log
>  -Dlog4j.configuration=[file:./log4j.properties|file:///log4j.properties]
>  Program Arguments:
>  -Dtaskmanager.memory.framework.off-heap.size=134217728b
>  -Dtaskmanager.memory.network.max=1073741824b
>  -Dtaskmanager.memory.network.min=1073741824b
>  -Dtaskmanager.memory.framework.heap.size=134217728b
>  -Dtaskmanager.memory.managed.size=23192823744b
>  -Dtaskmanager.cpu.cores=7.0
>  -Dtaskmanager.memory.task.heap.size=30225832000b
>  -Dtaskmanager.memory.task.off-heap.size=3221225472b
>  --configDir.
>  
> -Djobmanager.rpc.address=ip-10-180-30-250.us-west-2.compute.internal-Dweb.port=0
>  -Dweb.tmpdir=/tmp/flink-web-64f613cf-bf04-4a09-8c14-75c31b619574
>  -Djobmanager.rpc.port=33739
>  -Drest.address=ip-10-180-30-250.us-west-2.compute.internal
>Reporter: Ori Popowski
>Priority: Major
>
> I'm getting this error when creating a savepoint. I've read in 
> https://issues.apache.org/jira/browse/FLINK-16193 that it's caused by 
> unstable hashcode or equals on the key, or improper use of 
> {{reinterpretAsKeyedStream}}.
>   
>  My key is a string and I don't use {{reinterpretAsKeyedStream}}.
>  
> {code:java}
> senv
>   .addSource(source)
>   .flatMap(…)
>   .filterWith { case (metadata, _, _) => … }
>   .assignTimestampsAndWatermarks(new 
> BoundedOutOfOrdernessTimestampExtractor(…))
>   .keyingBy { case (meta, _) => meta.toPathString }
>   .process(new TruncateLargeSessions(config.sessionSizeLimit))
>   .keyingBy { case (meta, _) => meta.toPathString }
>   .window(EventTimeSessionWindows.withGap(Time.of(…)))
>   .process(new ProcessSession(sessionPlayback, config))
>   .addSink(sink){code}
>  
> {code:java}
> org.apache.flink.util.FlinkException: Triggering a savepoint for the job 
> 962fc8e984e7ca1ed65a038aa62ce124 failed.
>   at 
> org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:633)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$savepoint$9(CliFrontend.java:611)
>   at 
> org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:843)
>   at 
> org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:608)
>   at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:910)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>   at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
> Caused by: java.util.concurrent.CompletionException: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.checkpoint.CheckpointException: The job has failed.
>   at 
> org.apache.flink.runtime.scheduler.SchedulerBase.lambda$triggerSavepoint$3(SchedulerBase.java:744)
>   at 
> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
>   at 
> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
>   at 
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.h
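
The unstable-hashcode failure mode referenced above (FLINK-16193) can be
sketched outside of Flink. The following is a hedged illustration only:
`assignToKeyGroup` is a simplified stand-in for Flink's
`KeyGroupRangeAssignment` (which additionally applies a murmur hash to
`hashCode()`), and `BadKey` is a hypothetical key type, not anything from the
reporter's job.

```java
import java.util.Objects;

public class KeyGroupSketch {

    // Simplified stand-in for Flink's key-group assignment. The real code is
    // roughly murmurHash(key.hashCode()) % maxParallelism; the murmur step is
    // omitted here. What matters: the mapping is a pure function of
    // hashCode(), so hashCode() must be stable and deterministic.
    static int assignToKeyGroup(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Hypothetical key type whose hashCode depends on a mutable field.
    // Mutating the field after the record was partitioned moves the key into
    // a different key group, which the state backend then rejects with
    // "Key group X is not in KeyGroupRange".
    static final class BadKey {
        String path;
        BadKey(String path) { this.path = path; }
        @Override public int hashCode() { return Objects.hashCode(path); }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && Objects.equals(((BadKey) o).path, path);
        }
    }

    public static void main(String[] args) {
        BadKey key = new BadKey("a/b/c");
        int before = assignToKeyGroup(key, 128);
        key.path = "a/b/c/d"; // mutated after keying
        int after = assignToKeyGroup(key, 128);
        System.out.println("key group moved: " + before + " -> " + after);
    }
}
```

An immutable key (a plain `String`, as the reporter uses) cannot drift this
way, which is why the investigation usually moves on to other causes.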

[jira] [Updated] (FLINK-22608) Flink Kryo deserialize read wrong bytes

2021-06-09 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-22608:
---
Component/s: (was: Runtime / State Backends)
 (was: Runtime / Checkpointing)
 API / Type Serialization System

> Flink Kryo deserialize read wrong bytes
> ---
>
> Key: FLINK-22608
> URL: https://issues.apache.org/jira/browse/FLINK-22608
> Project: Flink
>  Issue Type: Bug
>  Components: API / Type Serialization System
>Affects Versions: 1.12.0
>Reporter: Si Chen
>Priority: Major
>  Labels: stale-major
>
> In my Flink program, I use ValueState to save my state. The state stores a 
> POJO serialized with Kryo. After the program had run for some time, I added a 
> field to the POJO, then recovered the program from a checkpoint, and found the 
> value of that field was incorrect. Reading the source code, I found:
>  
> {code:java}
> // code placeholder
> org.apache.flink.runtime.state.heap.HeapRestoreOperation#readStateHandleStateData
> private void readStateHandleStateData(
>FSDataInputStream fsDataInputStream,
>DataInputViewStreamWrapper inView,
>KeyGroupRangeOffsets keyGroupOffsets,
>Map<Integer, StateMetaInfoSnapshot> kvStatesById,
>int numStates,
>int readVersion,
>boolean isCompressed) throws IOException {
>final StreamCompressionDecorator streamCompressionDecorator = isCompressed 
> ?
>   SnappyStreamCompressionDecorator.INSTANCE : 
> UncompressedStreamCompressionDecorator.INSTANCE;
>for (Tuple2<Integer, Long> groupOffset : keyGroupOffsets) {
>   int keyGroupIndex = groupOffset.f0;
>   long offset = groupOffset.f1;
>   // Check that restored key groups all belong to the backend.
>   Preconditions.checkState(keyGroupRange.contains(keyGroupIndex), "The 
> key group must belong to the backend.");
>   fsDataInputStream.seek(offset);
>   int writtenKeyGroupIndex = inView.readInt();
>   Preconditions.checkState(writtenKeyGroupIndex == keyGroupIndex,
>  "Unexpected key-group in restore.");
>   try (InputStream kgCompressionInStream =
>  
> streamCompressionDecorator.decorateWithCompression(fsDataInputStream)) {
>  readKeyGroupStateData(
> kgCompressionInStream,
> kvStatesById,
> keyGroupIndex,
> numStates,
> readVersion);
>   }
>}
> }
> {code}
> my state's keyGroupIndex is 81 and its keyGroupOffset is 3572, and the next 
> keyGroupOffset is 3611, so my state's offset range is 3572 to 3611. But after 
> I added the new field to the POJO, Kryo reads extra bytes into the next 
> keyGroup.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
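
The misread described in the issue is a generic property of markerless binary
formats rather than anything specific to this code path. Below is a hedged
sketch (plain java.io, not Kryo itself, and not Flink code) of what happens
when a field is added to a class between writing and reading: the reader
consumes exactly as many bytes as the current layout dictates, so the "new"
field swallows bytes belonging to the next record (here standing in for the
next key group).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class SchemaDriftSketch {

    // Write two records with the OLD schema (a single int each) back to back,
    // the way keyed state for consecutive key groups sits back to back in a
    // checkpoint stream.
    static byte[] writeOldSchema(int recordA, int recordB) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(recordA);
        out.writeInt(recordB);
        return buf.toByteArray();
    }

    // Read the FIRST record with the NEW schema (two ints). There are no
    // per-field markers, so the second readInt() silently consumes the bytes
    // that belong to the second record.
    static int[] readFirstRecordNewSchema(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        return new int[] {in.readInt(), in.readInt()};
    }

    public static void main(String[] args) throws IOException {
        byte[] stream = writeOldSchema(42, 7);
        int[] record = readFirstRecordNewSchema(stream);
        // record[1] is 7: the added field "stole" the next record's bytes,
        // and every subsequent read is now offset by four bytes.
        System.out.println(record[0] + " " + record[1]);
    }
}
```

This is why evolving a Kryo-serialized state type generally requires a
serializer that supports schema evolution, rather than restoring old bytes
with a new class layout.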


[jira] [Updated] (FLINK-22932) RocksDBStateBackendWindowITCase fails with savepoint timeout

2021-06-09 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-22932:
--
Labels: test-stability  (was: )

> RocksDBStateBackendWindowITCase fails with savepoint timeout
> 
>
> Key: FLINK-22932
> URL: https://issues.apache.org/jira/browse/FLINK-22932
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.13.1
>Reporter: Roman Khachatryan
>Priority: Major
>  Labels: test-stability
> Fix For: 1.13.2
>
>
> Initially 
> [reported|https://issues.apache.org/jira/browse/FLINK-22067?focusedCommentId=17358306&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17358306]
>  in FLINK-22067
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18709&view=logs&j=a8bc9173-2af6-5ba8-775c-12063b4f1d54&t=46a16c18-c679-5905-432b-9be5d8e27bc6&l=10183
> Savepoint is triggered but is not completed in time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22932) RocksDBStateBackendWindowITCase fails with savepoint timeout

2021-06-09 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-22932:
--
Priority: Critical  (was: Major)

> RocksDBStateBackendWindowITCase fails with savepoint timeout
> 
>
> Key: FLINK-22932
> URL: https://issues.apache.org/jira/browse/FLINK-22932
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.13.1
>Reporter: Roman Khachatryan
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.13.2
>
>
> Initially 
> [reported|https://issues.apache.org/jira/browse/FLINK-22067?focusedCommentId=17358306&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17358306]
>  in FLINK-22067
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18709&view=logs&j=a8bc9173-2af6-5ba8-775c-12063b4f1d54&t=46a16c18-c679-5905-432b-9be5d8e27bc6&l=10183
> Savepoint is triggered but is not completed in time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zuston edited a comment on pull request #16107: [FLINK-22919] Remove support for Hadoop1.x in HadoopInputFormatCommonBase.getCredentialsFromUGI

2021-06-09 Thread GitBox


zuston edited a comment on pull request #16107:
URL: https://github.com/apache/flink/pull/16107#issuecomment-856618834


   @lirui-apache Please review it. Thanks. This PR will block 
[PR#15653](https://github.com/apache/flink/pull/15653)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-11813) Standby per job mode Dispatchers don't know job's JobSchedulingStatus

2021-06-09 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-11813:
--
Labels:   (was: stale-major)

> Standby per job mode Dispatchers don't know job's JobSchedulingStatus
> -
>
> Key: FLINK-11813
> URL: https://issues.apache.org/jira/browse/FLINK-11813
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.6.4, 1.7.2, 1.8.0
>Reporter: Till Rohrmann
>Priority: Major
> Fix For: 1.14.0
>
>
> At the moment, it can happen that standby {{Dispatchers}} in per job mode 
> will restart a terminated job after they gained leadership. The problem is 
> that we currently clear the {{RunningJobsRegistry}} once a job has reached a 
> globally terminal state. After the leading {{Dispatcher}} terminates, a 
> standby {{Dispatcher}} will gain leadership. Without having the information 
> from the {{RunningJobsRegistry}} it cannot tell whether the job has been 
> executed or whether the {{Dispatcher}} needs to re-execute the job. At the 
> moment, the {{Dispatcher}} will assume that there was a fault and hence 
> re-execute the job. This can lead to duplicate results.
> I think we need some way to tell standby {{Dispatchers}} that a certain job 
> has been successfully executed. One trivial solution could be to not clean up 
> the {{RunningJobsRegistry}} but then we will clutter ZooKeeper.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22929) Change the default failover strategy to FixDelayRestartStrategy for batch jobs

2021-06-09 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-22929:
--
Component/s: (was: Runtime / Coordination)
 Runtime / Configuration
 API / DataStream

> Change the default failover strategy to FixDelayRestartStrategy for batch jobs
> --
>
> Key: FLINK-22929
> URL: https://issues.apache.org/jira/browse/FLINK-22929
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Runtime / Configuration
>Affects Versions: 1.13.0, 1.13.1
>Reporter: Yun Gao
>Priority: Major
>
> Currently for the default failover strategy:
>  # Stream Job without checkpoint: NoRestartStrategy
>  # Stream Job with checkpoint:  FixDelayRestartStrategy as configured  [in 
> this 
> method|https://github.com/apache/flink/blob/ed6b33d487bccd9fd96607a3fe681ead1912d365/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/failover/flip1/RestartBackoffTimeStrategyFactoryLoader.java#L160]
>  # Batch Job: NoRestartStrategy
>  
> The default failover strategy is reasonable for streaming jobs, since without 
> checkpoints a streaming job cannot restart without paying a high cost. 
> However, for batch jobs failover is handled via persisted intermediate result 
> partitions, and users usually expect a batch job to finish normally by 
> default (similar to other batch processing systems). Thus it seems more 
> reasonable to make the default failover strategy for batch jobs the same as 
> for streaming jobs with checkpointing enabled (namely FixDelayRestartStrategy).
>  
> Some users have also [reported related 
> issues.|https://lists.apache.org/thread.html/rc4135e4ab41768f5fc3d4405b980872a6e39d2c0f5c92a744c623732%40%3Cuser.flink.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
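
Until any such default change lands, a batch job can opt into the proposed
behavior explicitly. A minimal flink-conf.yaml fragment (key names as in the
Flink 1.13 configuration reference; the attempt count and delay below are
placeholder values, not recommendations):

```yaml
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s
```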


[GitHub] [flink] wangyang0918 commented on a change in pull request #16086: [FLINK-22585] Add deprecated message when -m yarn-cluster is used

2021-06-09 Thread GitBox


wangyang0918 commented on a change in pull request #16086:
URL: https://github.com/apache/flink/pull/16086#discussion_r648079283



##
File path: 
flink-python/src/main/java/org/apache/flink/client/python/PythonShellParser.java
##
@@ -192,6 +192,17 @@ private static void printYarnHelp() {
 formatter.printHelp(" ", YARN_OPTIONS);
 }
 
+private static void printYarnWarnMessage() {

Review comment:
   I do not think we need to print the warning message only for the Python 
parser.
   
   IIUC, whenever the `FlinkYarnSessionCli` is active (session mode, per-job), 
we need to print the warning message and suggest that users use "-t/--target" 
instead.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-statefun] igalshilman closed pull request #237: [FLINK-22529] [kinesis] Allow Flink's ConsumerConfigConstants and flexibility in providing AWS region and credentials

2021-06-09 Thread GitBox


igalshilman closed pull request #237:
URL: https://github.com/apache/flink-statefun/pull/237


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15835: [FLINK-19164][release] Use versions-maven-plugin to properly update versions

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #15835:
URL: https://github.com/apache/flink/pull/15835#issuecomment-832515141


   
   ## CI report:
   
   * 33f9fe75e9b170376cc3fdd30631649a8afef516 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17566)
 
   * 0eccbda300dc699a81fcfb96f5278aa356734274 UNKNOWN
   * eecd2bcd1353bb9148d7f75766519f8905d55ef2 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-22932) RocksDBStateBackendWindowITCase fails with savepoint timeout

2021-06-09 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-22932:
---
Description: 
Initially 
[reported|https://issues.apache.org/jira/browse/FLINK-22067?focusedCommentId=17358306&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17358306]
 in FLINK-22067

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18709&view=logs&j=a8bc9173-2af6-5ba8-775c-12063b4f1d54&t=46a16c18-c679-5905-432b-9be5d8e27bc6&l=10183

Savepoint is triggered but is not completed in time.


{noformat}
2021-06-06T22:27:46.4845045Z Jun 06 22:27:46 java.lang.RuntimeException: Failed 
to take savepoint
2021-06-06T22:27:46.4846088Z Jun 06 22:27:46at 
org.apache.flink.state.api.utils.SavepointTestBase.takeSavepoint(SavepointTestBase.java:71)
2021-06-06T22:27:46.4847049Z Jun 06 22:27:46at 
org.apache.flink.state.api.utils.SavepointTestBase.takeSavepoint(SavepointTestBase.java:46)
2021-06-06T22:27:46.4848262Z Jun 06 22:27:46at 
org.apache.flink.state.api.SavepointWindowReaderITCase.testApplyEvictorWindowStateReader(SavepointWindowReaderITCase.java:350)
2021-06-06T22:27:46.4854133Z Jun 06 22:27:46at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2021-06-06T22:27:46.4855430Z Jun 06 22:27:46at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2021-06-06T22:27:46.4856528Z Jun 06 22:27:46at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2021-06-06T22:27:46.4857487Z Jun 06 22:27:46at 
java.lang.reflect.Method.invoke(Method.java:498)
2021-06-06T22:27:46.4858685Z Jun 06 22:27:46at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
2021-06-06T22:27:46.4859773Z Jun 06 22:27:46at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2021-06-06T22:27:46.4860964Z Jun 06 22:27:46at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
2021-06-06T22:27:46.4862306Z Jun 06 22:27:46at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2021-06-06T22:27:46.4863756Z Jun 06 22:27:46at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
2021-06-06T22:27:46.4864993Z Jun 06 22:27:46at 
org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
2021-06-06T22:27:46.4866179Z Jun 06 22:27:46at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
2021-06-06T22:27:46.4867272Z Jun 06 22:27:46at 
org.junit.rules.RunRules.evaluate(RunRules.java:20)
2021-06-06T22:27:46.4868255Z Jun 06 22:27:46at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
2021-06-06T22:27:46.4869045Z Jun 06 22:27:46at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
2021-06-06T22:27:46.4869902Z Jun 06 22:27:46at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
2021-06-06T22:27:46.4871038Z Jun 06 22:27:46at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
2021-06-06T22:27:46.4871756Z Jun 06 22:27:46at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
2021-06-06T22:27:46.4872502Z Jun 06 22:27:46at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
2021-06-06T22:27:46.4873389Z Jun 06 22:27:46at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
2021-06-06T22:27:46.4874150Z Jun 06 22:27:46at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
2021-06-06T22:27:46.4874914Z Jun 06 22:27:46at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
2021-06-06T22:27:46.4875661Z Jun 06 22:27:46at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
2021-06-06T22:27:46.4876382Z Jun 06 22:27:46at 
org.junit.rules.RunRules.evaluate(RunRules.java:20)
2021-06-06T22:27:46.4877018Z Jun 06 22:27:46at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
2021-06-06T22:27:46.4877661Z Jun 06 22:27:46at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
2021-06-06T22:27:46.4878522Z Jun 06 22:27:46at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
2021-06-06T22:27:46.4879506Z Jun 06 22:27:46at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
2021-06-06T22:27:46.4880246Z Jun 06 22:27:46at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
2021-06-06T22:27:46.4881025Z Jun 06 22:27:46at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
2021-06-06T22:27:46.4881839Z Jun 06 22:27:46at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
2021-06-06T22:2

[GitHub] [flink] flinkbot edited a comment on pull request #16115: [FLINK-22766][connector/kafka] Report offsets and Kafka consumer metrics in Flink metric group

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16115:
URL: https://github.com/apache/flink/pull/16115#issuecomment-857350673


   
   ## CI report:
   
   * 64d70892f99ab436c6bf52012e124874a11cbb18 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18809)
 
   * 4c653575cd66258b4cdd0e54b8cf7dfeca474f9b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18826)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16121: [FLINK-20219][coordination] Shutdown cluster externally should only clean up the HA data if all the jobs have finished

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16121:
URL: https://github.com/apache/flink/pull/16121#issuecomment-857483529


   
   ## CI report:
   
   * d7a5a79edc4755d73df23b762cc5d38e4822314e UNKNOWN
   * 132ae025ea4f96896c80692384986e05f2f850da UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-22529) StateFun Kinesis ingresses should support configs that are available via FlinkKinesisConsumer's ConsumerConfigConstants

2021-06-09 Thread Igal Shilman (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igal Shilman closed FLINK-22529.

Resolution: Fixed

Added at flink-statefun/master:116eab93d6772566a507f5ff40b7cb86cc82f91c

> StateFun Kinesis ingresses should support configs that are available via 
> FlinkKinesisConsumer's ConsumerConfigConstants
> ---
>
> Key: FLINK-22529
> URL: https://issues.apache.org/jira/browse/FLINK-22529
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>Priority: Major
>  Labels: pull-request-available
> Fix For: statefun-3.1.0
>
>
> The Kinesis ingress should support the configs that are available in 
> {{FlinkKinesisConsumer}}'s {{ConsumerConfigConstants}}. Instead, currently, 
> all property keys provided to the Kinesis ingress are assumed to be 
> AWS-client related keys, and therefore have all been appended with the 
> `aws.clientconfigs` string.
> I'd suggest to avoid mixing the {{ConsumerConfigConstants}} configs within 
> the properties as well. Having named methods on the {{KinesisIngressBuilder}} 
> for those configuration would provide a cleaner solution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22932) RocksDBStateBackendWindowITCase fails with savepoint timeout

2021-06-09 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-22932:
---
Component/s: (was: Runtime / Checkpointing)
 Runtime / Coordination

> RocksDBStateBackendWindowITCase fails with savepoint timeout
> 
>
> Key: FLINK-22932
> URL: https://issues.apache.org/jira/browse/FLINK-22932
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.13.1
>Reporter: Roman Khachatryan
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.13.2
>
>
> Initially 
> [reported|https://issues.apache.org/jira/browse/FLINK-22067?focusedCommentId=17358306&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17358306]
>  in FLINK-22067
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18709&view=logs&j=a8bc9173-2af6-5ba8-775c-12063b4f1d54&t=46a16c18-c679-5905-432b-9be5d8e27bc6&l=10183
> Savepoint is triggered but is not completed in time.
> {noformat}
> 2021-06-06T22:27:46.4845045Z Jun 06 22:27:46 java.lang.RuntimeException: 
> Failed to take savepoint
> 2021-06-06T22:27:46.4846088Z Jun 06 22:27:46  at 
> org.apache.flink.state.api.utils.SavepointTestBase.takeSavepoint(SavepointTestBase.java:71)
> 2021-06-06T22:27:46.4847049Z Jun 06 22:27:46  at 
> org.apache.flink.state.api.utils.SavepointTestBase.takeSavepoint(SavepointTestBase.java:46)
> 2021-06-06T22:27:46.4848262Z Jun 06 22:27:46  at 
> org.apache.flink.state.api.SavepointWindowReaderITCase.testApplyEvictorWindowStateReader(SavepointWindowReaderITCase.java:350)
> 2021-06-06T22:27:46.4854133Z Jun 06 22:27:46  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2021-06-06T22:27:46.4855430Z Jun 06 22:27:46  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2021-06-06T22:27:46.4856528Z Jun 06 22:27:46  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2021-06-06T22:27:46.4857487Z Jun 06 22:27:46  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2021-06-06T22:27:46.4858685Z Jun 06 22:27:46  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2021-06-06T22:27:46.4859773Z Jun 06 22:27:46  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2021-06-06T22:27:46.4860964Z Jun 06 22:27:46  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2021-06-06T22:27:46.4862306Z Jun 06 22:27:46  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2021-06-06T22:27:46.4863756Z Jun 06 22:27:46  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 2021-06-06T22:27:46.4864993Z Jun 06 22:27:46  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2021-06-06T22:27:46.4866179Z Jun 06 22:27:46  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 2021-06-06T22:27:46.4867272Z Jun 06 22:27:46  at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2021-06-06T22:27:46.4868255Z Jun 06 22:27:46  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 2021-06-06T22:27:46.4869045Z Jun 06 22:27:46  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 2021-06-06T22:27:46.4869902Z Jun 06 22:27:46  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 2021-06-06T22:27:46.4871038Z Jun 06 22:27:46  at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2021-06-06T22:27:46.4871756Z Jun 06 22:27:46  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2021-06-06T22:27:46.4872502Z Jun 06 22:27:46  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 2021-06-06T22:27:46.4873389Z Jun 06 22:27:46  at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 2021-06-06T22:27:46.4874150Z Jun 06 22:27:46  at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 2021-06-06T22:27:46.4874914Z Jun 06 22:27:46  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> 2021-06-06T22:27:46.4875661Z Jun 06 22:27:46  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> 2021-06-06T22:27:46.4876382Z Jun 06 22:27:46  at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2021-06-06T22:27:46.4877018Z Jun 06 22:27:46  at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 2021-06-06T22:27:46.4877661Z Jun 06 22:27:46  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> 2021-06-06T22:27:46.4878522Z Jun 06 22:27:46  at 
> org.apache.m

[jira] [Commented] (FLINK-18637) Key group is not in KeyGroupRange

2021-06-09 Thread Chirag Dewan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359865#comment-17359865
 ] 

Chirag Dewan commented on FLINK-18637:
--

This is my process() code:

 

public class Aggregator_KeyedExpression extends KeyedProcessFunction {

    private ValueState valueState;

    @Override
    public void open() throws Exception {
        ValueStateDescriptor descriptor =
            new ValueStateDescriptor("totalPrize", Integer.class);

        valueState = getRuntimeContext().getState(descriptor);
    }

    @Override
    public void processElement(GameZoneInput inEvent, Context ctx, final List outEvents)
            throws Exception {

        if (valueState.value() == null) {
            valueState.update(0);
        }

        valueState.update(valueState.value() + inEvent.getPrizeDelta());

        int sum = valueState.value();

        GameZoneOutput output = new GameZoneOutput();
        output.setPlayerId(inEvent.getPlayerId());
        output.setNetPrize(sum);
        outEvents.add(output);
    }

    @Override
    public void close() throws Exception {
        valueState.clear();
    }

    @Override
    public void onTimer(long timestamp, TimerContext ctx) throws Exception {
        state1.update(0);
    }
}

 

The onTimer callback is registered at a one-day interval.

During event processing I get a NullPointerException:

if (valueState != null)
   valueState.value() -> Strangely, I get a NPE here

 

And during async snapshot, I am getting multiple exceptions:

 
Aggregator (2/4)#0 - asynchronous part of checkpoint 2 could not be completed.
 java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: 
Key group 0 is not in KeyGroupRange{startKeyGroup=32, endKeyGroup=63}.
 at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_261]
 at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_261]
 at 
org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:621)
 ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at 
org.apache.flink.streaming.api.operators.OperatorSnapshotFinalizer.<init>(OperatorSnapshotFinalizer.java:54)
 ~[flink-streaming-java_2.11-1.12.1.jar:1.12.1]
 at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:122)
 [flink-streaming-java_2.11-1.12.1.jar:1.12.1]
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_261]
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_261]
 at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]
 Caused by: java.lang.IllegalArgumentException: Key group 0 is not in 
KeyGroupRange{startKeyGroup=32, endKeyGroup=63}.
 at 
org.apache.flink.runtime.state.KeyGroupRangeOffsets.computeKeyGroupIndex(KeyGroupRangeOffsets.java:144)
 ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at 
org.apache.flink.runtime.state.KeyGroupRangeOffsets.setKeyGroupOffset(KeyGroupRangeOffsets.java:106)
 ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at 
org.apache.flink.contrib.streaming.state.snapshot.RocksFullSnapshotStrategy$SnapshotAsynchronousPartCallable.writeKVStateData(RocksFullSnapshotStrategy.java:333)
 ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at 
org.apache.flink.contrib.streaming.state.snapshot.RocksFullSnapshotStrategy$SnapshotAsynchronousPartCallable.writeSnapshotToOutputStream(RocksFullSnapshotStrategy.java:264)
 ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at 
org.apache.flink.contrib.streaming.state.snapshot.RocksFullSnapshotStrategy$SnapshotAsynchronousPartCallable.callInternal(RocksFullSnapshotStrategy.java:227)
 ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at 
org.apache.flink.contrib.streaming.state.snapshot.RocksFullSnapshotStrategy$SnapshotAsynchronousPartCallable.callInternal(RocksFullSnapshotStrategy.java:180)
 ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at 
org.apache.flink.runtime.state.AsyncSnapshotCallable.call(AsyncSnapshotCallable.java:78)
 ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_261]
 at 
org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:618)
 ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 ... 5 more
 
and
 
Caused by: java.lang.IllegalArgumentException: Position out of bounds.
 at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:138) 
~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at 
org.apache.flink.core.memory.DataOutputSerializer.setPosition(DataOutputSerializer.java:352)
 ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at 
org.apache.flink.contrib.streaming.state.RocksDBSerializedCompositeKeyBuilder.resetToKey(RocksDBSerializedCompositeKeyBuilder.java:185)
 ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at 
org.apache.flink.contrib.streaming.sta
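For background on the "Key group is not in KeyGroupRange" check: Flink derives a key group from a hash of the key, so a key whose hashCode changes after it has been used for partitioning (for example, a mutable POJO key) can later map into a key group the operator does not own. A self-contained sketch of that failure mode, using `Math.floorMod(key.hashCode(), maxParallelism)` as a simplified stand-in for Flink's murmur-based `KeyGroupRangeAssignment` (the names `PlayerKey` and `keyGroupFor` are hypothetical, not Flink API):

```java
import java.util.ArrayList;
import java.util.List;

public class KeyGroupSketch {

    // Simplified stand-in for Flink's key-group assignment; the real
    // implementation murmur-hashes key.hashCode() before taking the modulo.
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Hypothetical mutable key type, for illustration only.
    static class PlayerKey {
        final List<String> ids = new ArrayList<>();
        @Override
        public int hashCode() { return ids.hashCode(); }
    }

    public static void main(String[] args) {
        PlayerKey key = new PlayerKey();
        int before = keyGroupFor(key, 128);
        key.ids.add("p1");                    // mutating the key changes its hashCode
        int after = keyGroupFor(key, 128);
        // 'before' and 'after' differ, so a later lookup or snapshot can
        // compute a key group outside the operator's assigned KeyGroupRange.
        System.out.println(before + " vs " + after);
    }
}
```

If the NPE and the range check above are triggered by such a key, making the key type immutable (or keying by a stable primitive/String) avoids both symptoms.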

[GitHub] [flink] AHeise commented on a change in pull request #16113: [FLINK-22926][datastream] IDLE FLIP-27 source should go ACTIVE when registering a new split

2021-06-09 Thread GitBox


AHeise commented on a change in pull request #16113:
URL: https://github.com/apache/flink/pull/16113#discussion_r648090118



##
File path: 
flink-core/src/main/java/org/apache/flink/api/common/eventtime/Watermark.java
##
@@ -56,6 +56,9 @@ Licensed to the Apache Software Foundation (ASF) under one
 /** The watermark that signifies end-of-event-time. */
 public static final Watermark MAX_WATERMARK = new 
Watermark(Long.MAX_VALUE);
 
+/** The watermark that signifies start-of-event-time. */
+public static final Watermark MIN_WATERMARK = new 
Watermark(Long.MIN_VALUE);

Review comment:
   Can we add that to Public?

##
File path: 
flink-core/src/main/java/org/apache/flink/api/connector/source/lib/util/IteratorSourceReader.java
##
@@ -19,6 +19,7 @@
 package org.apache.flink.api.connector.source.lib.util;
 
 import org.apache.flink.api.connector.source.ReaderOutput;
+import org.apache.flink.api.connector.source.SourceOutput;

Review comment:
   I don't get this change. Could you expand the commit message or add a test 
case?

##
File path: 
flink-core/src/main/java/org/apache/flink/api/common/eventtime/WatermarkOutputMultiplexer.java
##
@@ -86,6 +86,7 @@ public void registerNewOutput(String id) {
 checkState(previouslyRegistered == null, "Already contains an output 
for ID %s", id);
 
 combinedWatermarkStatus.add(outputState);
+underlyingOutput.emitWatermark(Watermark.MIN_WATERMARK);

Review comment:
   Is this also necessary if there has been a split already?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22933) Upgrade the Flink Fabric8io/kubernetes-client version to >=5.4.0 to be FIPS compliant

2021-06-09 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359870#comment-17359870
 ] 

Yang Wang commented on FLINK-22933:
---

[~Fuyao Li] Thanks for creating this ticket. Generally speaking, using the 
latest stable version is reasonable.

But just as you said, fabric8 kubernetes-client 5.x has changed a lot from 4.x, 
and Flink only uses the Kubernetes client to create/delete K8s resources and 
does not have complicated use cases. So I am hesitant to upgrade the version.

 

IIUC, your posted exception does not come from Flink internally. Instead, it 
happens in your K8s operator implementation. Could you please try to use the 
5.x version only in your K8s operator? I believe you could achieve that by 
shading the "flink-kubernetes" dependency.
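A minimal sketch of that shading in the operator's own pom.xml; the plugin configuration and relocation pattern below are illustrative assumptions, not configuration shipped by Flink:

```xml
<!-- Hedged sketch: relocate the fabric8 4.x classes pulled in via
     flink-kubernetes so the operator can depend on kubernetes-client 5.x
     directly without the two versions clashing on the classpath. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>io.fabric8</pattern>
            <!-- Assumed relocation prefix, chosen for illustration -->
            <shadedPattern>myoperator.shaded.io.fabric8</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```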

> Upgrade the Flink Fabric8io/kubernetes-client version to >=5.4.0 to be FIPS 
> compliant
> -
>
> Key: FLINK-22933
> URL: https://issues.apache.org/jira/browse/FLINK-22933
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Affects Versions: 1.13.0, 1.13.1
>Reporter: Fuyao Li
>Priority: Critical
> Fix For: 1.14.0
>
>
> The current Fabric8io version in Flink is 4.9.2
> See link: 
> [https://github.com/apache/flink/blob/master/flink-kubernetes/pom.xml#L35]
> This version of Fabric8io library is not FIPS compliant 
> ([https://www.sdxcentral.com/security/definitions/what-does-mean-fips-compliant/).]
> Such function is added in Fabric8io recently. See links below.
> [https://github.com/fabric8io/kubernetes-client/pull/2788]
>  [https://github.com/fabric8io/kubernetes-client/issues/2732]
>  
> I am trying to write a native kubernetes operator leveraging APIs and 
> interfaces provided by Flink source code. For example, ApplicationDeployer.
> I am writing my own implementation based on Yang's example code: 
> [https://github.com/wangyang0918/flink-native-k8s-operator]
>  
> Using version 4.9.2 for my operator will be working perfectly, but it could 
> cause FIPS compliant issues.
>  
> Using version 5.4.0 will run into issues since Fabric8io version 4 and 
> version 5 API is not that compatible. I saw errors below.
> {code:java}
> Exception in thread "main" java.lang.AbstractMethodError: Receiver class 
> io.fabric8.kubernetes.client.handlers.ServiceHandler does not define or 
> inherit an implementation of the resolved method 'abstract java.lang.Object 
> create(okhttp3.OkHttpClient, io.fabric8.kubernetes.client.Config, 
> java.lang.String, java.lang.Object, boolean)' of interface 
> io.fabric8.kubernetes.client.ResourceHandler. at 
> io.fabric8.kubernetes.client.utils.CreateOrReplaceHelper.lambda$createOrReplaceItem$0(CreateOrReplaceHelper.java:77)
>  at 
> io.fabric8.kubernetes.client.utils.CreateOrReplaceHelper.createOrReplace(CreateOrReplaceHelper.java:56)
>  at 
> io.fabric8.kubernetes.client.utils.CreateOrReplaceHelper.createOrReplaceItem(CreateOrReplaceHelper.java:91)
>  at 
> io.fabric8.kubernetes.client.dsl.internal.NamespaceVisitFromServerGetWatchDeleteRecreateWaitApplicableListImpl.createOrReplaceOrDeleteExisting(NamespaceVisitFromServerGetWatchDeleteRecreateWaitApplicableListImpl.java:454)
>  at 
> io.fabric8.kubernetes.client.dsl.internal.NamespaceVisitFromServerGetWatchDeleteRecreateWaitApplicableListImpl.createOrReplace(NamespaceVisitFromServerGetWatchDeleteRecreateWaitApplicableListImpl.java:297)
>  at 
> io.fabric8.kubernetes.client.dsl.internal.NamespaceVisitFromServerGetWatchDeleteRecreateWaitApplicableListImpl.createOrReplace(NamespaceVisitFromServerGetWatchDeleteRecreateWaitApplicableListImpl.java:66)
>  at 
> org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.createJobManagerComponent(Fabric8FlinkKubeClient.java:113)
>  at 
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployClusterInternal(KubernetesClusterDescriptor.java:274)
>  at 
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployApplicationCluster(KubernetesClusterDescriptor.java:208)
>  at 
> org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer.run(ApplicationClusterDeployer.java:67)
>  at 
> org.apache.flink.kubernetes.operator.controller.FlinkApplicationController.reconcile(FlinkApplicationController.java:207)
>  at 
> org.apache.flink.kubernetes.operator.controller.FlinkAppli

[GitHub] [flink] flinkbot edited a comment on pull request #15835: [FLINK-19164][release] Use versions-maven-plugin to properly update versions

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #15835:
URL: https://github.com/apache/flink/pull/15835#issuecomment-832515141


   
   ## CI report:
   
   * 33f9fe75e9b170376cc3fdd30631649a8afef516 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17566)
 
   * 0eccbda300dc699a81fcfb96f5278aa356734274 UNKNOWN
   * eecd2bcd1353bb9148d7f75766519f8905d55ef2 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18825)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on a change in pull request #15200: [FLINK-21355] Send changes to the state changelog

2021-06-09 Thread GitBox


rkhachatryan commented on a change in pull request #15200:
URL: https://github.com/apache/flink/pull/15200#discussion_r648105112



##
File path: 
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogMapState.java
##
@@ -51,16 +63,28 @@ public UV get(UK key) throws Exception {
 
 @Override
 public void put(UK key, UV value) throws Exception {
+if (getValueSerializer() instanceof MapSerializer) {
+changeLogger.valueElementChanged(
+out -> {
+serializeKey(key, out);
+serializeValue(value, out);
+},
+getCurrentNamespace());
+} else {
+changeLogger.valueAdded(singletonMap(key, value), 
getCurrentNamespace());

Review comment:
   I'm not sure (as `MapStateDescriptor` is not the only possible 
`StateDescriptor`); but it simplifies the code :) I'll remove the additional 
logic, thanks




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16121: [FLINK-20219][coordination] Shutdown cluster externally should only clean up the HA data if all the jobs have finished

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16121:
URL: https://github.com/apache/flink/pull/16121#issuecomment-857483529


   
   ## CI report:
   
   * d7a5a79edc4755d73df23b762cc5d38e4822314e UNKNOWN
   * 132ae025ea4f96896c80692384986e05f2f850da Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18827)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-statefun] evans-ye opened a new pull request #239: Turn smoke e2e Fn into a RemoteFunction

2021-06-09 Thread GitBox


evans-ye opened a new pull request #239:
URL: https://github.com/apache/flink-statefun/pull/239


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-statefun] evans-ye closed pull request #239: Turn smoke e2e Fn into a RemoteFunction

2021-06-09 Thread GitBox


evans-ye closed pull request #239:
URL: https://github.com/apache/flink-statefun/pull/239


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-statefun] evans-ye commented on pull request #239: Turn smoke e2e Fn into a RemoteFunction

2021-06-09 Thread GitBox


evans-ye commented on pull request #239:
URL: https://github.com/apache/flink-statefun/pull/239#issuecomment-857523943


   Sorry, this is still a private draft; it shouldn't have been sent to the public 
repository.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15991: [FLINK-22757][docs] Adding gcs documentation. Connecting flink to gcs.

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #15991:
URL: https://github.com/apache/flink/pull/15991#issuecomment-846585260


   
   ## CI report:
   
   * fe197531a2770d5302cc7fbee02bfce578d7d6f4 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18687)
 
   * 7537c978e103600a908997530e48fdc1e9aa8803 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16118: [FLINK-22676][coordination] The partition tracker stops tracking internal partitions when TM disconnects

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16118:
URL: https://github.com/apache/flink/pull/16118#issuecomment-857430432


   
   ## CI report:
   
   * 99adb64639b2a6239228bd73b48afd206c1d23e5 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18820)
 
   * 3e7547f7f3ef663536bf763e0306c717f4cf45ec UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol merged pull request #16120: [FLINK-22939][azure] Generalize JDK switch

2021-06-09 Thread GitBox


zentol merged pull request #16120:
URL: https://github.com/apache/flink/pull/16120


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-22939) Generalize JDK switch in azure setup

2021-06-09 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-22939.

Fix Version/s: 1.13.2
   1.12.5
   Resolution: Fixed

master: 81543c2cf88caf9530e57c189016bd96c20c3a5a

1.13: 19582b9895ac477c0f95bccb0da597329b82628f

1.12: 277965bbad67000393b3b579f6c6d4cf21c67713

> Generalize JDK switch in azure setup
> 
>
> Key: FLINK-22939
> URL: https://issues.apache.org/jira/browse/FLINK-22939
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System / Azure Pipelines
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.12.5, 1.13.2
>
>
> Our current azure setup makes it a bit difficult to switch to a different JDK 
> because the "jdk" parameter is only evaluated if it is set to "jdk11".
> Instead, we could generalize this a bit so that it is always evaluated, such 
> that if the image contains the JDK (under some expected location) one can 
> just specify {{jdk: 14}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22943) java.lang.ClassCastException: java.time.Instant cannot be cast to java.sql.Timestamp

2021-06-09 Thread jack wang (Jira)
jack wang created FLINK-22943:
-

 Summary: java.lang.ClassCastException: java.time.Instant cannot be 
cast to java.sql.Timestamp
 Key: FLINK-22943
 URL: https://issues.apache.org/jira/browse/FLINK-22943
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Affects Versions: 1.13.1
Reporter: jack wang


Before Hive 3.1.2, getQueryCurrentTimestamp returned a Timestamp, but in Hive 
3.1.2 the method invoked via getCurrentTSMethod returns an Instant. So the code 
`(Timestamp) getCurrentTSMethod.invoke(sessionState)` throws a 
ClassCastException; the code should be compatible with both return types.

When I use the Hive dialect to create a Hive table, it triggers this error. The 
error is below:

Exception in thread "main" java.lang.ClassCastException: java.time.Instant 
cannot be cast to java.sql.TimestampException in thread "main" 
java.lang.ClassCastException: java.time.Instant cannot be cast to 
java.sql.Timestamp at 
org.apache.flink.table.planner.delegation.hive.HiveParser.setCurrentTimestamp(HiveParser.java:365)
 at 
org.apache.flink.table.planner.delegation.hive.HiveParser.startSessionState(HiveParser.java:350)
 at 
org.apache.flink.table.planner.delegation.hive.HiveParser.parse(HiveParser.java:218)
 at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:722)
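A hedged sketch of a version-tolerant conversion for the reflective return value described above (the helper name `toTimestamp` is an assumption; the actual call site in HiveParser may differ):

```java
import java.sql.Timestamp;
import java.time.Instant;

public class CurrentTimestampCompat {

    // Normalizes the reflective return value: Hive versions before 3.1.2
    // hand back a java.sql.Timestamp, Hive 3.1.2 a java.time.Instant.
    static Timestamp toTimestamp(Object currentTs) {
        if (currentTs instanceof Instant) {
            return Timestamp.from((Instant) currentTs);
        }
        return (Timestamp) currentTs;
    }

    public static void main(String[] args) {
        Timestamp fromInstant = toTimestamp(Instant.ofEpochMilli(1_000L));
        Timestamp passthrough = toTimestamp(new Timestamp(1_000L));
        // Both paths yield the same point in time.
        System.out.println(fromInstant.equals(passthrough)); // prints "true"
    }
}
```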



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-22918) StreamingFileSink does not commit partition when no message is sent

2021-06-09 Thread lihe ma (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lihe ma closed FLINK-22918.
---
Resolution: Not A Bug

This situation happens when the Flink job is misconfigured without checkpoints or 
savepoints: the pending partition messages are discarded, so the JobManager cannot 
commit the previous partitions.

> StreamingFileSink does not commit partition when no message is sent 
> 
>
> Key: FLINK-22918
> URL: https://issues.apache.org/jira/browse/FLINK-22918
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Affects Versions: 1.11.3
>Reporter: lihe ma
>Priority: Minor
>
> FileSink extracts the partitions from messages, then sends the partition list to 
> the committer. In the dynamic partition case, if one partition contains no data 
> between two checkpoints, then that partition won't be committed.
> Like this:
> date=0101/hour=01/key=a/part1-file  (written at 01-01 01:20:00)
> If there is no data with key=a between 01:20 and 02:10 (a time when the watermark 
> has passed), the committer will not receive this partition and it will never 
> commit.
>  
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15200: [FLINK-21355] Send changes to the state changelog

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #15200:
URL: https://github.com/apache/flink/pull/15200#issuecomment-798902665


   
   ## CI report:
   
   * 5e1342d9916f5c4356c622a40bc27bcbdacde9d7 UNKNOWN
   * 24c8af22f7a4b9dcf93a143188cd7aef53f9e1b2 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18805)
 
   * 11a7a4b99c6d573c101ee788432d4a160959a81e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16104: [FLINK-22730][table-planner-blink] Add per record code when generating table function collector code for lookup joins

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16104:
URL: https://github.com/apache/flink/pull/16104#issuecomment-856601875


   
   ## CI report:
   
   * a2b9def4c3b590b7188e25a2027a372d22e4d424 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18815)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16118: [FLINK-22676][coordination] The partition tracker stops tracking internal partitions when TM disconnects

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16118:
URL: https://github.com/apache/flink/pull/16118#issuecomment-857430432


   
   ## CI report:
   
   * 99adb64639b2a6239228bd73b48afd206c1d23e5 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18820)
 
   * 3e7547f7f3ef663536bf763e0306c717f4cf45ec Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18830)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15200: [FLINK-21355] Send changes to the state changelog

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #15200:
URL: https://github.com/apache/flink/pull/15200#issuecomment-798902665


   
   ## CI report:
   
   * 5e1342d9916f5c4356c622a40bc27bcbdacde9d7 UNKNOWN
   * 24c8af22f7a4b9dcf93a143188cd7aef53f9e1b2 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18805)
 
   * 11a7a4b99c6d573c101ee788432d4a160959a81e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18828)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16117: [BP-1.13][FLINK-22889][tests] Ignore JdbcExactlyOnceSinkE2eTest temporarily

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #16117:
URL: https://github.com/apache/flink/pull/16117#issuecomment-857409051


   
   ## CI report:
   
   * 8f9eb693c349ac75b42cb770411ba97957b65f2b Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18818)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22664) Task metrics are not properly unregistered during region failover

2021-06-09 Thread Guokuai Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359909#comment-17359909
 ] 

Guokuai Huang commented on FLINK-22664:
---

[~trohrmann] Yes, sorry for the late reply. After I found the problem, I 
explained it and closed the issue.

> Task metrics are not properly unregistered during region failover
> -
>
> Key: FLINK-22664
> URL: https://issues.apache.org/jira/browse/FLINK-22664
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Guokuai Huang
>Priority: Major
> Attachments: Screen Shot 2021-05-14 at 2.51.04 PM.png, Screen Shot 
> 2021-05-14 at 5.40.22 PM.png
>
>
> In the current implementation of AbstractPrometheusReporter, metrics with the 
> same scopedMetricName share the same metric Collector. At the same time, a 
> HashMap named collectorsWithCountByMetricName is maintained to record the 
> reference counter of each Collector. Only when the reference counter of a 
> Collector becomes 0 is it unregistered. 
> Suppose we have a flink job with single chained operator, and *execution 
> failover-strategy is set to region.*
>  !Screen Shot 2021-05-14 at 2.51.04 PM.png!
>  The following figure compares the number of metrics when this job runs on 2 
> TaskManager with 1 slots/TM and 1 TaskManager with 2 slots/TM after region 
> failover.
> Each inflection point on the graph represents a region failover. *For 
> TaskManager with multiple tasks(slots), the number of metrics increases after 
> region failover.*
> This is a case I deliberately constructed to illustrate the problem. A 
> TaskManager only needs to restart part of its tasks during each region 
> failover; that is to say, *the reference counter of a task's metric Collector 
> will never become 0, so the metric Collector will not be unregistered.*
> This problem has brought a lot of pressure to our Prometheus, please see if 
> there is a good solution.
> !Screen Shot 2021-05-14 at 5.40.22 PM.png!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zentol merged pull request #15835: [FLINK-19164][release] Use versions-maven-plugin to properly update versions

2021-06-09 Thread GitBox


zentol merged pull request #15835:
URL: https://github.com/apache/flink/pull/15835


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-19164) Release scripts break other dependency versions unintentionally

2021-06-09 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-19164.

Fix Version/s: 1.14.0
   Resolution: Fixed

master: 36891da2f2d0662a04e122f768297c3709ebd982

> Release scripts break other dependency versions unintentionally
> ---
>
> Key: FLINK-19164
> URL: https://issues.apache.org/jira/browse/FLINK-19164
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Scripts, Release System
>Reporter: Serhat Soydan
>Priority: Minor
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.14.0
>
>
> All the scripts below have a line to change the old version to the new version 
> in pom files.
> [https://github.com/apache/flink/blob/master/tools/change-version.sh#L31]
> [https://github.com/apache/flink/blob/master/tools/releasing/create_release_branch.sh#L60]
> [https://github.com/apache/flink/blob/master/tools/releasing/update_branch_version.sh#L52]
>  
> It works like find & replace, so it is prone to unintentional errors. Any 
> dependency with a version equal to the "old version" might be automatically 
> changed to the "new version". See below for how to reproduce such a case. 
>  
> +How to reproduce the bug:+
>  * Clone/fork the Flink repo and, for example, check out version v*1.11.1* 
>  * Apply any changes you need
>  * Run the "create_release_branch.sh" script with OLD_VERSION=*1.11.1* 
> NEW_VERSION={color:#de350b}*1.12.0*{color}
>  ** In the parent pom.xml, a find & replace of the maven-dependency-analyzer 
> version will be performed automatically and *unintentionally*, which will 
> break the build.
>  
> <dependency>
> <groupId>org.apache.maven.shared</groupId>
> <artifactId>maven-dependency-analyzer</artifactId>
> <version>*1.11.1*</version>
> </dependency>
>  
> <dependency>
> <groupId>org.apache.maven.shared</groupId>
> <artifactId>maven-dependency-analyzer</artifactId>
> <version>{color:#de350b}*1.12.0*{color}</version>
> </dependency>
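
The pitfall described above is easy to reproduce outside Maven. A minimal sketch, assuming a hypothetical pom.xml fragment (not the actual Flink build files), contrasting a blind textual replacement with a substitution anchored to the project's own coordinates:

```python
import re

# Hypothetical pom.xml fragment: the project version and an unrelated
# dependency happen to share the version string "1.11.1".
pom = """\
<groupId>org.apache.flink</groupId>
<artifactId>flink-parent</artifactId>
<version>1.11.1</version>
<dependency>
  <groupId>org.apache.maven.shared</groupId>
  <artifactId>maven-dependency-analyzer</artifactId>
  <version>1.11.1</version>
</dependency>
"""

# Naive approach (what the release scripts effectively did): replace every
# occurrence of the old version string.
naive = pom.replace("1.11.1", "1.12.0")
assert naive.count("1.12.0") == 2  # the dependency's version was changed too

# Safer approach: only rewrite the <version> that directly follows the
# project's own artifactId.
targeted = re.sub(
    r"(<artifactId>flink-parent</artifactId>\s*<version>)1\.11\.1(</version>)",
    r"\g<1>1.12.0\g<2>",
    pom,
)
assert targeted.count("1.12.0") == 1            # only the project version changed
assert "<version>1.11.1</version>" in targeted  # dependency left untouched
```

The fix merged for this ticket takes the more robust route of delegating the update to the versions-maven-plugin, which parses the POM instead of treating it as text.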





[jira] [Assigned] (FLINK-19164) Release scripts break other dependency versions unintentionally

2021-06-09 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-19164:


Assignee: Serhat Soydan

> Release scripts break other dependency versions unintentionally
> ---
>
> Key: FLINK-19164
> URL: https://issues.apache.org/jira/browse/FLINK-19164
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Scripts, Release System
>Reporter: Serhat Soydan
>Assignee: Serhat Soydan
>Priority: Minor
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.14.0
>
>
> All the scripts below have a line that changes the old version to the new 
> version in pom files.
> [https://github.com/apache/flink/blob/master/tools/change-version.sh#L31]
> [https://github.com/apache/flink/blob/master/tools/releasing/create_release_branch.sh#L60]
> [https://github.com/apache/flink/blob/master/tools/releasing/update_branch_version.sh#L52]
>  
> It works like find & replace, so it is prone to unintentional errors. Any 
> dependency with a version equal to the "old version" might be automatically 
> changed to the "new version". See below for how to reproduce such a case. 
>  
> +How to reproduce the bug:+
>  * Clone/fork the Flink repo and, for example, check out version v*1.11.1* 
>  * Apply any changes you need
>  * Run the "create_release_branch.sh" script with OLD_VERSION=*1.11.1* 
> NEW_VERSION={color:#de350b}*1.12.0*{color}
>  ** In the parent pom.xml, a find & replace of the maven-dependency-analyzer 
> version will be performed automatically and *unintentionally*, which will 
> break the build.
>  
> <dependency>
> <groupId>org.apache.maven.shared</groupId>
> <artifactId>maven-dependency-analyzer</artifactId>
> <version>*1.11.1*</version>
> </dependency>
>  
> <dependency>
> <groupId>org.apache.maven.shared</groupId>
> <artifactId>maven-dependency-analyzer</artifactId>
> <version>{color:#de350b}*1.12.0*{color}</version>
> </dependency>





[jira] [Commented] (FLINK-22296) Introduce Preconditions-util into Python API

2021-06-09 Thread Roc Marshal (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359912#comment-17359912
 ] 

Roc Marshal commented on FLINK-22296:
-

Hi, [~nicholasjiang] [~dian.fu]

If it's of little significance, I will close this JIRA.

> Introduce Preconditions-util into Python API
> 
>
> Key: FLINK-22296
> URL: https://issues.apache.org/jira/browse/FLINK-22296
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Reporter: Roc Marshal
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Similar to the 
> [Preconditions|https://github.com/apache/flink/blob/87efae4d3180a52e16240a0b4bbb197f85acd22c/flink-core/src/main/java/org/apache/flink/util/Preconditions.java#L43]
>  class in flink java API





[GitHub] [flink] flinkbot edited a comment on pull request #15991: [FLINK-22757][docs] Adding gcs documentation. Connecting flink to gcs.

2021-06-09 Thread GitBox


flinkbot edited a comment on pull request #15991:
URL: https://github.com/apache/flink/pull/15991#issuecomment-846585260


   
   ## CI report:
   
   * fe197531a2770d5302cc7fbee02bfce578d7d6f4 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18687)
 
   * 7537c978e103600a908997530e48fdc1e9aa8803 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=18829)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Created] (FLINK-22944) Optimize writing state changes

2021-06-09 Thread Roman Khachatryan (Jira)
Roman Khachatryan created FLINK-22944:
-

 Summary: Optimize writing state changes
 Key: FLINK-22944
 URL: https://issues.apache.org/jira/browse/FLINK-22944
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Affects Versions: 1.14.0
Reporter: Roman Khachatryan
 Fix For: 1.14.0


An (incremental) improvement over the existing way of writing state changes, 
which writes metadata name (UTF-8) and keygroup for each change. Please see 
https://github.com/apache/flink/pull/16035#discussion_r642614515 for more 
details.





[jira] [Updated] (FLINK-22944) Optimize writing state changes

2021-06-09 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan updated FLINK-22944:
--
Parent: FLINK-21352
Issue Type: Sub-task  (was: Improvement)

> Optimize writing state changes
> --
>
> Key: FLINK-22944
> URL: https://issues.apache.org/jira/browse/FLINK-22944
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Affects Versions: 1.14.0
>Reporter: Roman Khachatryan
>Priority: Major
> Fix For: 1.14.0
>
>
> An (incremental) improvement over the existing way of writing state changes, 
> which writes metadata name (UTF-8) and keygroup for each change. Please see 
> https://github.com/apache/flink/pull/16035#discussion_r642614515 for more 
> details.





[GitHub] [flink] rkhachatryan commented on a change in pull request #16035: [FLINK-22808][state/changelog] Log metadata

2021-06-09 Thread GitBox


rkhachatryan commented on a change in pull request #16035:
URL: https://github.com/apache/flink/pull/16035#discussion_r642614515



##
File path: 
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/AbstractStateChangeLogger.java
##
@@ -100,6 +127,8 @@ protected void log(
 return serializeRaw(
 wrapper -> {
 wrapper.writeByte(op.code);
+// todo: wrapper.writeShort(stateId); and/or sort and 
write once (same for key, ns?)
+wrapper.writeUTF(metaInfo.getName());

Review comment:
   Writing the state name for each change is sub-optimal, and there are two 
ways to avoid it:
   1. write a mapping from state name to some id in the beginning, and write 
only the id here
   2. sort changes by state name and write it once
   
   The first option is easier to implement but turned out to be more error-prone 
(on recovery).
   The second option is also more efficient and can be extended to other 
dimensions as well (key, namespace).
   
   However, I think it's easier to have this simpler version with recovery 
integrated (and maybe benchmarked) first, and then apply the optimization.
   
   Ticket: FLINK-22944.
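
Option 1 above (dictionary-encoding the state name) can be sketched as follows — an illustrative Python model of the two encodings, not the actual changelog wire format; `encode_naive` mimics writing the UTF-8 name per change, `encode_dictionary` writes a name-to-id table once and a 2-byte id per change:

```python
import struct


def encode_naive(changes):
    """Write the state name (UTF-8, length-prefixed) with every change."""
    out = bytearray()
    for name, payload in changes:
        encoded = name.encode("utf-8")
        out += struct.pack(">H", len(encoded)) + encoded  # writeUTF-style
        out += payload
    return bytes(out)


def encode_dictionary(changes):
    """Option 1: emit a name->id mapping once, then a short id per change."""
    ids = {}
    body = bytearray()
    for name, payload in changes:
        if name not in ids:
            ids[name] = len(ids)
        body += struct.pack(">H", ids[name])  # writeShort(stateId)
        body += payload
    # Header: number of entries, then (id, length-prefixed name) pairs.
    header = bytearray(struct.pack(">H", len(ids)))
    for name, state_id in ids.items():
        encoded = name.encode("utf-8")
        header += struct.pack(">H", state_id)
        header += struct.pack(">H", len(encoded)) + encoded
    return bytes(header + body)


changes = [("my-window-state", b"\x01\x02")] * 1000
assert len(encode_dictionary(changes)) < len(encode_naive(changes))
```

The per-change cost drops from the full name to a fixed 2 bytes; option 2 (sorting by name and writing it once) would remove even that, at the cost of reordering the changes.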








[jira] [Updated] (FLINK-22944) Optimize writing state changes

2021-06-09 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan updated FLINK-22944:
--
Description: 
An (incremental) improvement over the existing way of writing state changes, 
which writes metadata name (UTF-8) and keygroup for each change. Please see 
https://github.com/apache/flink/pull/16035#discussion_r642614515 for more 
details.

Additionally, revisit 
[serializeRaw|https://github.com/apache/flink/pull/15420#discussion_r646164327] 
in the same AbstractStateChangeLogger.

  was:An (incremental) improvement over the existing way of writing state 
changes, which writes metadata name (UTF-8) and keygroup for each change. 
Please see https://github.com/apache/flink/pull/16035#discussion_r642614515 for 
more details.


> Optimize writing state changes
> --
>
> Key: FLINK-22944
> URL: https://issues.apache.org/jira/browse/FLINK-22944
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Affects Versions: 1.14.0
>Reporter: Roman Khachatryan
>Priority: Major
> Fix For: 1.14.0
>
>
> An (incremental) improvement over the existing way of writing state changes, 
> which writes metadata name (UTF-8) and keygroup for each change. Please see 
> https://github.com/apache/flink/pull/16035#discussion_r642614515 for more 
> details.
> Additionally, revisit 
> [serializeRaw|https://github.com/apache/flink/pull/15420#discussion_r646164327]
>  in the same AbstractStateChangeLogger.




