[jira] [Resolved] (FLINK-24583) ElasticsearchWriterITCase.testWriteOnBulkIntervalFlush timeout on azure

2021-11-24 Thread Arvid Heise (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arvid Heise resolved FLINK-24583.
-
Resolution: Fixed

> ElasticsearchWriterITCase.testWriteOnBulkIntervalFlush timeout on azure
> ---
>
> Key: FLINK-24583
> URL: https://issues.apache.org/jira/browse/FLINK-24583
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.15.0
>Reporter: Xintong Song
>Assignee: Alexander Preuss
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.15.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=25191&view=logs&j=961f8f81-6b52-53df-09f6-7291a2e4af6a&t=f53023d8-92c3-5d78-ec7e-70c2bf37be20&l=12452
> {code}
> Oct 18 23:47:27 [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 22.228 s <<< FAILURE! - in 
> org.apache.flink.connector.elasticsearch.sink.ElasticsearchWriterITCase
> Oct 18 23:47:27 [ERROR] testWriteOnBulkIntervalFlush  Time elapsed: 2.032 s  
> <<< ERROR!
> Oct 18 23:47:27 java.util.concurrent.TimeoutException: Condition was not met 
> in given timeout.
> Oct 18 23:47:27   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:166)
> Oct 18 23:47:27   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:144)
> Oct 18 23:47:27   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:136)
> Oct 18 23:47:27   at 
> org.apache.flink.connector.elasticsearch.sink.ElasticsearchWriterITCase.testWriteOnBulkIntervalFlush(ElasticsearchWriterITCase.java:139)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24583) ElasticsearchWriterITCase.testWriteOnBulkIntervalFlush timeout on azure

2021-11-24 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448420#comment-17448420
 ] 

Arvid Heise commented on FLINK-24583:
-

Merged into master as 349d57baf9492978cdfcfbcb6e5223bb30746622.

> ElasticsearchWriterITCase.testWriteOnBulkIntervalFlush timeout on azure
> ---
>
> Key: FLINK-24583
> URL: https://issues.apache.org/jira/browse/FLINK-24583
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.15.0
>Reporter: Xintong Song
>Assignee: Alexander Preuss
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.15.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=25191&view=logs&j=961f8f81-6b52-53df-09f6-7291a2e4af6a&t=f53023d8-92c3-5d78-ec7e-70c2bf37be20&l=12452
> {code}
> Oct 18 23:47:27 [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 22.228 s <<< FAILURE! - in 
> org.apache.flink.connector.elasticsearch.sink.ElasticsearchWriterITCase
> Oct 18 23:47:27 [ERROR] testWriteOnBulkIntervalFlush  Time elapsed: 2.032 s  
> <<< ERROR!
> Oct 18 23:47:27 java.util.concurrent.TimeoutException: Condition was not met 
> in given timeout.
> Oct 18 23:47:27   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:166)
> Oct 18 23:47:27   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:144)
> Oct 18 23:47:27   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:136)
> Oct 18 23:47:27   at 
> org.apache.flink.connector.elasticsearch.sink.ElasticsearchWriterITCase.testWriteOnBulkIntervalFlush(ElasticsearchWriterITCase.java:139)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-24596) Bugs in sink.buffer-flush before upsert-kafka

2021-11-24 Thread Arvid Heise (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arvid Heise updated FLINK-24596:

Priority: Blocker  (was: Critical)

> Bugs in sink.buffer-flush before upsert-kafka
> -
>
> Key: FLINK-24596
> URL: https://issues.apache.org/jira/browse/FLINK-24596
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.0, 1.15.0
>Reporter: Jingsong Lee
>Assignee: Fabian Paul
>Priority: Blocker
>
> There is no ITCase for sink.buffer-flush before upsert-kafka. We should add 
> it.
> FLINK-23735 brings some bugs:
> * SinkBufferFlushMode bufferFlushMode not Serializable
> * Function valueCopyFunction not Serializable
> * Planner dose not support DataStreamProvider with new Sink



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25029) Hadoop Caller Context Setting In Flink

2021-11-24 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-25029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448421#comment-17448421
 ] 

刘方奇 commented on FLINK-25029:
-

THX for [~lzljs3620320] 's reply. Actually Caller Context is like a mark for a 
hdfs operation, then it can be get in name node audit log. Normally Caller 
Context are set in the beginning of the thread.

e.g: A hive query thread can set Caller Context as "my query", then Name Node 
can perceive that the "my query" operate which path in which time.

I saw there was a similar issue ( 
https://issues.apache.org/jira/browse/FLINK-16809 )which describe this. But it 
did not close, so I try to take the ticket by the way.

> Hadoop Caller Context Setting In Flink
> --
>
> Key: FLINK-25029
> URL: https://issues.apache.org/jira/browse/FLINK-25029
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: 刘方奇
>Priority: Major
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> The above is the main effect of the Caller Context. HDFS Client set Caller 
> Context, then name node get it in audit log to do some work.
> Now the Spark and hive have the Caller Context to meet the HDFS Job Audit 
> requirement.
> In my company, flink jobs often cause some problems for HDFS, so we did it 
> for preventing some cases.
> If the feature is general enough. Should we support it, then I can submit a 
> PR for this.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] AHeise merged pull request #17728: [FLINK-24832][tests][testinfrastructure] Update JUnit5 to v5.8.1

2021-11-24 Thread GitBox


AHeise merged pull request #17728:
URL: https://github.com/apache/flink/pull/17728


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Resolved] (FLINK-24832) Update JUnit5 to v5.8.1

2021-11-24 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser resolved FLINK-24832.

Resolution: Fixed

Merged in master: 5233d825d4f57bb7c375c810a7c174dc7e4a8bcd

> Update JUnit5 to v5.8.1
> ---
>
> Key: FLINK-24832
> URL: https://issues.apache.org/jira/browse/FLINK-24832
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Test Infrastructure, Tests
>Reporter: Martijn Visser
>Assignee: Martijn Visser
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> We should update to the latest version of JUnit5, v5.8.1



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] AHeise commented on a change in pull request #17881: [FLINK-24971][tests] Adding retry mechanism in case git clone fails

2021-11-24 Thread GitBox


AHeise commented on a change in pull request #17881:
URL: https://github.com/apache/flink/pull/17881#discussion_r755782939



##
File path: flink-end-to-end-tests/test-scripts/common_ssl.sh
##
@@ -77,7 +77,17 @@ function _set_conf_ssl_helper {
 # -> we need to build it ourselves
 FLINK_SHADED_VERSION=$(cat ${END_TO_END_DIR}/../pom.xml | sed -n 
's/.*\(.*\)<\/flink.shaded.version>/\1/p')
 echo "BUILDING flink-shaded-netty-tcnative-static"
-git clone https://github.com/apache/flink-shaded.git
+# Adding retry to git clone, due to FLINK-24971
+CLONE_COUNTER=1
+CLONE_RETRIES=4 # Since we start the counter at 1, we are retrying 
this 3 times
+CLONE_SLEEP=1
+until [ $CLONE_COUNTER -ge $CLONE_RETRIES ]
+do
+echo "Cloning flink-shaded, attempt number" $CLONE_COUNTER
+git clone https://github.com/apache/flink-shaded.git && break
+CLONE_COUNTER=$[$CLONE_COUNTER+1]
+sleep $CLONE_SLEEP
+done

Review comment:
   I wonder if we can add that as a general function for reuse on other 
batch tests...




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17715: [FLINK-24820][table][docs] Examples in documentation for value1 IS DISTINCT …

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17715:
URL: https://github.com/apache/flink/pull/17715#issuecomment-962987530


   
   ## CI report:
   
   * 863b790976db36e3ddb858f118e964f9ca48927f Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26128)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-24429) Port FileSystemTableSink to new Unified Sink API (FLIP-143)

2021-11-24 Thread Alexander Preuss (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Preuss reassigned FLINK-24429:


Assignee: (was: Alexander Preuss)

> Port FileSystemTableSink to new Unified Sink API (FLIP-143)
> ---
>
> Key: FLINK-24429
> URL: https://issues.apache.org/jira/browse/FLINK-24429
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.15.0
>Reporter: Alexander Preuss
>Priority: Major
>
> We want to port the FileSystemTableSink to the new Sink API as was done with 
> the  Kafka Sink.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] MartijnVisser commented on a change in pull request #17881: [FLINK-24971][tests] Adding retry mechanism in case git clone fails

2021-11-24 Thread GitBox


MartijnVisser commented on a change in pull request #17881:
URL: https://github.com/apache/flink/pull/17881#discussion_r755786268



##
File path: flink-end-to-end-tests/test-scripts/common_ssl.sh
##
@@ -77,7 +77,17 @@ function _set_conf_ssl_helper {
 # -> we need to build it ourselves
 FLINK_SHADED_VERSION=$(cat ${END_TO_END_DIR}/../pom.xml | sed -n 
's/.*\(.*\)<\/flink.shaded.version>/\1/p')
 echo "BUILDING flink-shaded-netty-tcnative-static"
-git clone https://github.com/apache/flink-shaded.git
+# Adding retry to git clone, due to FLINK-24971
+CLONE_COUNTER=1
+CLONE_RETRIES=4 # Since we start the counter at 1, we are retrying 
this 3 times
+CLONE_SLEEP=1
+until [ $CLONE_COUNTER -ge $CLONE_RETRIES ]
+do
+echo "Cloning flink-shaded, attempt number" $CLONE_COUNTER
+git clone https://github.com/apache/flink-shaded.git && break
+CLONE_COUNTER=$[$CLONE_COUNTER+1]
+sleep $CLONE_SLEEP
+done

Review comment:
   That shouldn't be too hard. I guess the generic function should be in 
`common.sh` ? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-24429) Port FileSystemTableSink to new Unified Sink API (FLIP-143)

2021-11-24 Thread Alexander Preuss (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448425#comment-17448425
 ] 

Alexander Preuss commented on FLINK-24429:
--

Hi [~monster#12] , thank you for reaching out. Some time ago I started working 
on this but this ticket is currently waiting for 
[https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction|FLIP-191].
 Without the small file compaction we cannot port the original functionality to 
the new interface. 

> Port FileSystemTableSink to new Unified Sink API (FLIP-143)
> ---
>
> Key: FLINK-24429
> URL: https://issues.apache.org/jira/browse/FLINK-24429
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.15.0
>Reporter: Alexander Preuss
>Priority: Major
>
> We want to port the FileSystemTableSink to the new Sink API as was done with 
> the  Kafka Sink.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] jackwener commented on pull request #17842: [FLINK-24966] [docs] Fix spelling errors in the project

2021-11-24 Thread GitBox


jackwener commented on pull request #17842:
URL: https://github.com/apache/flink/pull/17842#issuecomment-977630592


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25029) Hadoop Caller Context Setting In Flink

2021-11-24 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448426#comment-17448426
 ] 

Jingsong Lee commented on FLINK-25029:
--

cc: [~arvid] 

> Hadoop Caller Context Setting In Flink
> --
>
> Key: FLINK-25029
> URL: https://issues.apache.org/jira/browse/FLINK-25029
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: 刘方奇
>Priority: Major
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> The above is the main effect of the Caller Context. HDFS Client set Caller 
> Context, then name node get it in audit log to do some work.
> Now the Spark and hive have the Caller Context to meet the HDFS Job Audit 
> requirement.
> In my company, flink jobs often cause some problems for HDFS, so we did it 
> for preventing some cases.
> If the feature is general enough. Should we support it, then I can submit a 
> PR for this.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #17842: [FLINK-24966] [docs] Fix spelling errors in the project

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17842:
URL: https://github.com/apache/flink/pull/17842#issuecomment-974310249


   
   ## CI report:
   
   * bbc64e8c2730b42b8851683e494c5ddee017de05 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26982)
 
   *  Unknown: [CANCELED](TBD) 
   * ba291419cba7e29422178d48536695ef558ef593 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] AHeise commented on a change in pull request #17870: [FLINK-24596][kafka] Fix buffered KafkaUpsert sink

2021-11-24 Thread GitBox


AHeise commented on a change in pull request #17870:
URL: https://github.com/apache/flink/pull/17870#discussion_r755789192



##
File path: 
flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/ReducingUpsertSink.java
##
@@ -43,19 +45,19 @@
 private final DataType physicalDataType;
 private final int[] keyProjection;
 private final SinkBufferFlushMode bufferFlushMode;
-private final Function valueCopyFunction;
+@Nullable private final TypeSerializer typeSerializer;
 
 ReducingUpsertSink(
 Sink wrappedSink,
 DataType physicalDataType,
 int[] keyProjection,
 SinkBufferFlushMode bufferFlushMode,
-Function valueCopyFunction) {

Review comment:
   Here we could also use `SerializableFunction` but your change is also 
good (in general not a huge fan of nullable fields that imply different 
behavior).

##
File path: 
flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/SinkBufferFlushMode.java
##
@@ -18,10 +18,11 @@
 
 package org.apache.flink.streaming.connectors.kafka.table;
 
+import java.io.Serializable;
 import java.util.Objects;
 
 /** Sink buffer flush configuration. */
-public class SinkBufferFlushMode {
+public class SinkBufferFlushMode implements Serializable {

Review comment:
   🙈 

##
File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/collect/CollectStreamSink.java
##
@@ -30,17 +32,19 @@
 @Internal
 public class CollectStreamSink extends DataStreamSink {
 
-private final LegacySinkTransformation transformation;
+private final PhysicalTransformation transformation;
 
+@SuppressWarnings("unchecked")
 public CollectStreamSink(DataStream inputStream, 
CollectSinkOperatorFactory factory) {
 super(inputStream, (CollectSinkOperator) factory.getOperator());
 this.transformation =
-new LegacySinkTransformation<>(
-inputStream.getTransformation(), "Collect Stream 
Sink", factory, 1);
+(PhysicalTransformation)

Review comment:
   Why is the cast necessary?
   Can you try to make `public class LegacySinkTransformation extends 
PhysicalTransformation`?
   The second ctor should a factory method instead...

##
File path: 
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/common/CommonExecSinkITCase.java
##
@@ -152,6 +154,43 @@ public void invoke(
 assertTimestampResults(timestamps, rows);
 }
 
+@Test
+public void testUnifiedSinksAreUsableWithDataStreamSinkProvider()
+throws ExecutionException, InterruptedException {
+final StreamTableEnvironment tableEnv = 
StreamTableEnvironment.create(env);
+final SharedReference> fetched = sharedObjects.add(new 
ArrayList<>());
+final List rows = Arrays.asList(Row.of(1), Row.of(2));
+
+final TableDescriptor sourceDescriptor =
+TableFactoryHarness.newBuilder()
+.schema(Schema.newBuilder().column("a", INT()).build())
+.source(new TimestampTestSource(rows))
+.sink(
+new TableFactoryHarness.SinkBase() {
+@Override
+public DataStreamSinkProvider 
getSinkRuntimeProvider(
+DynamicTableSink.Context context) {
+return dataStream ->
+dataStream.sinkTo(
+TestSink.newBuilder()
+.setWriter(
+new 
RecordWriter(fetched))
+
.setCommittableSerializer(
+
TestSink
+   
 .StringCommittableSerializer
+   
 .INSTANCE)
+.build());
+}
+})

Review comment:
   Could you please extract that to a method local class to ease the 
formatting?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-24596) Bugs in sink.buffer-flush before upsert-kafka

2021-11-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-24596:
---
Labels: pull-request-available  (was: )

> Bugs in sink.buffer-flush before upsert-kafka
> -
>
> Key: FLINK-24596
> URL: https://issues.apache.org/jira/browse/FLINK-24596
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.0, 1.15.0
>Reporter: Jingsong Lee
>Assignee: Fabian Paul
>Priority: Blocker
>  Labels: pull-request-available
>
> There is no ITCase for sink.buffer-flush before upsert-kafka. We should add 
> it.
> FLINK-23735 brings some bugs:
> * SinkBufferFlushMode bufferFlushMode not Serializable
> * Function valueCopyFunction not Serializable
> * Planner dose not support DataStreamProvider with new Sink



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] zentol merged pull request #17556: [FLINK-24627][tests] add some Junit5 extensions to replace the existed Junit4 rules

2021-11-24 Thread GitBox


zentol merged pull request #17556:
URL: https://github.com/apache/flink/pull/17556


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17813: [FLINK-24802][Table SQL/Planner] Improve cast ROW to STRING

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17813:
URL: https://github.com/apache/flink/pull/17813#issuecomment-970956527


   
   ## CI report:
   
   * e04cc8d0d537afadf1b195b068939fa8e0a1f994 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26981)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17842: [FLINK-24966] [docs] Fix spelling errors in the project

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17842:
URL: https://github.com/apache/flink/pull/17842#issuecomment-974310249


   
   ## CI report:
   
   * bbc64e8c2730b42b8851683e494c5ddee017de05 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26982)
 
   *  Unknown: [CANCELED](TBD) 
   * ba291419cba7e29422178d48536695ef558ef593 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26985)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-24627) Port JUnit 4 rules to JUnit5 extensions

2021-11-24 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-24627:
-
Summary: Port JUnit 4 rules to JUnit5 extensions   (was: Add some generic 
junit5 extensions to replace junit4 rules)

> Port JUnit 4 rules to JUnit5 extensions 
> 
>
> Key: FLINK-24627
> URL: https://issues.apache.org/jira/browse/FLINK-24627
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Hang Ruan
>Assignee: Hang Ruan
>Priority: Not a Priority
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> We have to use junit5 extensions to replace the existed junit4 rules in order 
> to change tests to junit5 in flink. There are some generic rules that should 
> be provided in advance.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] AHeise commented on a change in pull request #17580: [FLINK-11470][runtime] Pass configurations to filesystems when executing in LocalStreamEnvironment

2021-11-24 Thread GitBox


AHeise commented on a change in pull request #17580:
URL: https://github.com/apache/flink/pull/17580#discussion_r755791294



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/minicluster/MiniCluster.java
##
@@ -120,12 +125,20 @@
 import java.util.function.Function;
 import java.util.stream.Collectors;
 
+import static org.apache.flink.configuration.ConfigOptions.key;
 import static org.apache.flink.util.Preconditions.checkNotNull;
 import static org.apache.flink.util.Preconditions.checkState;
 
 /** MiniCluster to execute Flink jobs locally. */
 public class MiniCluster implements AutoCloseableAsync {
 
+public static final ConfigOption PLUGIN_DIRECTORY =

Review comment:
   Do we need a config option here? I was thinking that it would be better 
in this case to add a simple `String` field to `MiniClusterConfiguration` iff 
the option is only ever used in the `MiniCluster`. Then we can make it easier 
to use in `MiniClusterConfiguration#Builder`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-24627) Add some generic junit5 extensions to replace junit4 rules

2021-11-24 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-24627.

Fix Version/s: 1.15.0
 Assignee: Hang Ruan
   Resolution: Fixed

master: 78b231f60aed59061f0f609e0cfd659d78e6fdd5

> Add some generic junit5 extensions to replace junit4 rules
> --
>
> Key: FLINK-24627
> URL: https://issues.apache.org/jira/browse/FLINK-24627
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Hang Ruan
>Assignee: Hang Ruan
>Priority: Not a Priority
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> We have to use junit5 extensions to replace the existed junit4 rules in order 
> to change tests to junit5 in flink. There are some generic rules that should 
> be provided in advance.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-24627) Port JUnit 4 rules to JUnit5 extensions

2021-11-24 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-24627:
-
Priority: Major  (was: Not a Priority)

> Port JUnit 4 rules to JUnit5 extensions 
> 
>
> Key: FLINK-24627
> URL: https://issues.apache.org/jira/browse/FLINK-24627
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Hang Ruan
>Assignee: Hang Ruan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> We have to use junit5 extensions to replace the existed junit4 rules in order 
> to change tests to junit5 in flink. There are some generic rules that should 
> be provided in advance.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink-ml] zhipeng93 commented on a change in pull request #28: [Flink-24556] Add Estimator and Transformer for logistic regression

2021-11-24 Thread GitBox


zhipeng93 commented on a change in pull request #28:
URL: https://github.com/apache/flink-ml/pull/28#discussion_r755792254



##
File path: 
flink-ml-lib/src/main/java/org/apache/flink/ml/common/param/linear/HasPredictionDetailCol.java
##
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.common.param.linear;
+
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.param.StringParam;
+import org.apache.flink.ml.param.WithParams;
+
+/** Interface for the shared prediction detail param. */
+public interface HasPredictionDetailCol extends WithParams {

Review comment:
   `HasProbabilityCol ` may not be a very good solution here. I checked for 
Spark and reused its `HasRawPredictionCol`.
   
   What do you think?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-25013) flink pulsar connector cannot consume deferred messages

2021-11-24 Thread Yufan Sheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufan Sheng updated FLINK-25013:

Attachment: screenshot-1.png

> flink pulsar connector cannot consume deferred messages
> ---
>
> Key: FLINK-25013
> URL: https://issues.apache.org/jira/browse/FLINK-25013
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.14.0
>Reporter: zhangjg
>Priority: Major
> Attachments: screenshot-1.png
>
>
>  
> at flink 1.14.0 version ,  pulsar connector ( PulsarSource) cannot consume 
> deferred messages . 
> For the message sent by the producer, the flink pulsar source will be 
> consumed immediately
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25013) flink pulsar connector cannot consume deferred messages

2021-11-24 Thread Yufan Sheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448430#comment-17448430
 ] 

Yufan Sheng commented on FLINK-25013:
-

 !screenshot-1.png! 
The test passed on my local machine. This issue should be closed now.

[~MartijnVisser]

> flink pulsar connector cannot consume deferred messages
> ---
>
> Key: FLINK-25013
> URL: https://issues.apache.org/jira/browse/FLINK-25013
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.14.0
>Reporter: zhangjg
>Priority: Major
> Attachments: screenshot-1.png
>
>
>  
> at flink 1.14.0 version ,  pulsar connector ( PulsarSource) cannot consume 
> deferred messages . 
> For the message sent by the producer, the flink pulsar source will be 
> consumed immediately
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-25013) flink pulsar connector cannot consume deferred messages

2021-11-24 Thread Yufan Sheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448430#comment-17448430
 ] 

Yufan Sheng edited comment on FLINK-25013 at 11/24/21, 8:19 AM:


 !screenshot-1.png|width=400!
The test passed on my local machine. This issue should be closed now.

[~MartijnVisser]


was (Author: syhily):
 !screenshot-1.png! 
The test passed on my local machine. This issue should be closed now.

[~MartijnVisser]

> flink pulsar connector cannot consume deferred messages
> ---
>
> Key: FLINK-25013
> URL: https://issues.apache.org/jira/browse/FLINK-25013
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.14.0
>Reporter: zhangjg
>Priority: Major
> Attachments: screenshot-1.png
>
>
>  
> at flink 1.14.0 version ,  pulsar connector ( PulsarSource) cannot consume 
> deferred messages . 
> For the message sent by the producer, the flink pulsar source will be 
> consumed immediately
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] wangyang0918 commented on a change in pull request #17554: [FLINK-24624][Kubernetes]Kill cluster when starting kubernetes session or application cluster failed

2021-11-24 Thread GitBox


wangyang0918 commented on a change in pull request #17554:
URL: https://github.com/apache/flink/pull/17554#discussion_r755797584



##
File path: 
flink-kubernetes/src/main/java/org/apache/flink/kubernetes/KubernetesClusterDescriptor.java
##
@@ -256,36 +244,51 @@ private String getWebMonitorAddress(Configuration 
configuration) throws Exceptio
 flinkConfig.get(JobManagerOptions.PORT));
 }
 
+final KubernetesJobManagerParameters kubernetesJobManagerParameters =
+new KubernetesJobManagerParameters(flinkConfig, 
clusterSpecification);
+
+final FlinkPod podTemplate =
+kubernetesJobManagerParameters
+.getPodTemplateFilePath()
+.map(
+file ->
+
KubernetesUtils.loadPodFromTemplateFile(
+client, file, 
Constants.MAIN_CONTAINER_NAME))
+.orElse(new FlinkPod.Builder().build());
+final KubernetesJobManagerSpecification kubernetesJobManagerSpec =
+
KubernetesJobManagerFactory.buildKubernetesJobManagerSpecification(
+podTemplate, kubernetesJobManagerParameters);
+
+client.createJobManagerComponent(kubernetesJobManagerSpec);
+
+return createClusterClientProvider(clusterId);
+}
+
+private ClusterClientProvider safelyDeployCluster(
+SupplierWithException, Exception> 
supplier)
+throws ClusterDeploymentException {
 try {
-final KubernetesJobManagerParameters 
kubernetesJobManagerParameters =
-new KubernetesJobManagerParameters(flinkConfig, 
clusterSpecification);
-
-final FlinkPod podTemplate =
-kubernetesJobManagerParameters
-.getPodTemplateFilePath()
-.map(
-file ->
-
KubernetesUtils.loadPodFromTemplateFile(
-client, file, 
Constants.MAIN_CONTAINER_NAME))
-.orElse(new FlinkPod.Builder().build());
-final KubernetesJobManagerSpecification kubernetesJobManagerSpec =
-
KubernetesJobManagerFactory.buildKubernetesJobManagerSpecification(
-podTemplate, kubernetesJobManagerParameters);
-
-client.createJobManagerComponent(kubernetesJobManagerSpec);
-
-return createClusterClientProvider(clusterId);
+
+ClusterClientProvider clusterClientProvider = 
supplier.get();
+
+try (ClusterClient clusterClient = 
clusterClientProvider.getClusterClient()) {

Review comment:
   @cc13ny Thanks for your valuable comments.
   
   @Aitozi This discussion make me to rethink that whether we really need to 
clean up the K8s resources when creating Flink client failed. Because the Flink 
cluster might be running normally.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol merged pull request #17852: [FLINK-24979][build][hbase] Remove MaxPermSize setting

2021-11-24 Thread GitBox


zentol merged pull request #17852:
URL: https://github.com/apache/flink/pull/17852


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-25030) Unexpected record in KafkaSourceITCase$IntegrationTests.testMultipleSplits

2021-11-24 Thread Matthias (Jira)
Matthias created FLINK-25030:


 Summary: Unexpected record in 
KafkaSourceITCase$IntegrationTests.testMultipleSplits
 Key: FLINK-25030
 URL: https://issues.apache.org/jira/browse/FLINK-25030
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.14.1
Reporter: Matthias


We experienced a test failure in 
{{KafkaSourceITCase$IntegrationTests.testMultipleSplits}} in our Flink fork for 
1.14 due to an unexpected record:
{code}
[...]
Nov 23 21:10:19 [ERROR] 
org.apache.flink.connector.kafka.source.KafkaSourceITCase$IntegrationTests.testMultipleSplits{TestEnvironment,
 ExternalContext}[1]
Nov 23 21:10:19 [ERROR]   Run 1: 
KafkaSourceITCase$IntegrationTests>SourceTestSuiteBase.testMultipleSplits:160 
Nov 23 21:10:19 Expected: Records consumed by Flink should be identical to test 
data and preserve the order in multiple splits
Nov 23 21:10:19  but: Unexpected record '2-13N3fae7bfL1iEMF3I0TaWGC57vrflv' 
at position 367
Nov 23 21:10:19 Current progress of multiple split test data validation:
Nov 23 21:10:19 Split 0 (115/115): 
Nov 23 21:10:19 0-C7bHGoulUrqjQqGM8PiVI6BS9B3Okq2PJdf3EBas3G
Nov 23 21:10:19 0-GRt5T5YYDsgq1t0UBt3cUjvnktIbz
[...]
{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25030) Unexpected record in KafkaSourceITCase$IntegrationTests.testMultipleSplits

2021-11-24 Thread Matthias (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias updated FLINK-25030:
-
Labels: test-stability  (was: )

> Unexpected record in KafkaSourceITCase$IntegrationTests.testMultipleSplits
> --
>
> Key: FLINK-25030
> URL: https://issues.apache.org/jira/browse/FLINK-25030
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.1
>Reporter: Matthias
>Priority: Major
>  Labels: test-stability
>
> We experienced a test failure in 
> {{KafkaSourceITCase$IntegrationTests.testMultipleSplits}} in our Flink fork 
> for 1.14 due to an unexpected record:
> {code}
> [...]
> Nov 23 21:10:19 [ERROR] 
> org.apache.flink.connector.kafka.source.KafkaSourceITCase$IntegrationTests.testMultipleSplits{TestEnvironment,
>  ExternalContext}[1]
> Nov 23 21:10:19 [ERROR]   Run 1: 
> KafkaSourceITCase$IntegrationTests>SourceTestSuiteBase.testMultipleSplits:160 
> Nov 23 21:10:19 Expected: Records consumed by Flink should be identical to 
> test data and preserve the order in multiple splits
> Nov 23 21:10:19  but: Unexpected record 
> '2-13N3fae7bfL1iEMF3I0TaWGC57vrflv' at position 367
> Nov 23 21:10:19 Current progress of multiple split test data validation:
> Nov 23 21:10:19 Split 0 (115/115): 
> Nov 23 21:10:19 0-C7bHGoulUrqjQqGM8PiVI6BS9B3Okq2PJdf3EBas3G
> Nov 23 21:10:19 0-GRt5T5YYDsgq1t0UBt3cUjvnktIbz
> [...]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Closed] (FLINK-24979) Remove MaxPermSize configuration in HBase surefire config

2021-11-24 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-24979.

Resolution: Fixed

master: cd2c10215f187f85579db30c86292aff3b9dbd01

> Remove MaxPermSize configuration in HBase surefire config
> -
>
> Key: FLINK-24979
> URL: https://issues.apache.org/jira/browse/FLINK-24979
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System, Connectors / HBase
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> The MaxPermSize parameter has no effect since JDK 8, and is actively rejected 
> in Java 17. Given that we for years it worked just fine in without it, we can 
> just remove it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] zentol merged pull request #17855: [FLINK-24983][build] Upgrade surefire to 3.0.5-M5

2021-11-24 Thread GitBox


zentol merged pull request #17855:
URL: https://github.com/apache/flink/pull/17855


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-25027) Allow GC of a finished job's JobMaster before the slot timeout is reached

2021-11-24 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-25027:
--
Priority: Critical  (was: Major)

> Allow GC of a finished job's JobMaster before the slot timeout is reached
> -
>
> Key: FLINK-25027
> URL: https://issues.apache.org/jira/browse/FLINK-25027
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Nico Kruber
>Priority: Critical
> Attachments: image-2021-11-23-20-32-20-479.png
>
>
> In a session cluster, after a (batch) job is finished, the JobMaster seems to 
> stick around for another couple of minutes before being eligible for garbage 
> collection.
> Looking into a heap dump, it seems to be tied to a 
> {{PhysicalSlotRequestBulkCheckerImpl}} which is enqueued in the underlying 
> Akka executor (and keeps the JM from being GC’d). Per default the action is 
> scheduled for {{slot.request.timeout}} that defaults to 5 min (thanks 
> [~trohrmann] for helping out here)
> !image-2021-11-23-20-32-20-479.png!
> With this setting, you will have to account for enough metaspace to cover 5 
> minutes of time which may span a couple of jobs, needlessly!
> The problem seems to be that Flink is using the main thread executor for the 
> scheduling that uses the {{ActorSystem}}'s scheduler and the future task 
> scheduled with Akka can (probably) not be easily cancelled.
> One idea could be to use a dedicated thread pool per JM, that we shut down 
> when the JM terminates. That way we would not keep the JM from being GC’d.
> (The concrete example we investigated was a DataSet job)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25027) Allow GC of a finished job's JobMaster before the slot timeout is reached

2021-11-24 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-25027:
--
Issue Type: Bug  (was: Improvement)

> Allow GC of a finished job's JobMaster before the slot timeout is reached
> -
>
> Key: FLINK-25027
> URL: https://issues.apache.org/jira/browse/FLINK-25027
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Nico Kruber
>Priority: Critical
> Attachments: image-2021-11-23-20-32-20-479.png
>
>
> In a session cluster, after a (batch) job is finished, the JobMaster seems to 
> stick around for another couple of minutes before being eligible for garbage 
> collection.
> Looking into a heap dump, it seems to be tied to a 
> {{PhysicalSlotRequestBulkCheckerImpl}} which is enqueued in the underlying 
> Akka executor (and keeps the JM from being GC’d). Per default the action is 
> scheduled for {{slot.request.timeout}} that defaults to 5 min (thanks 
> [~trohrmann] for helping out here)
> !image-2021-11-23-20-32-20-479.png!
> With this setting, you will have to account for enough metaspace to cover 5 
> minutes of time which may span a couple of jobs, needlessly!
> The problem seems to be that Flink is using the main thread executor for the 
> scheduling that uses the {{ActorSystem}}'s scheduler and the future task 
> scheduled with Akka can (probably) not be easily cancelled.
> One idea could be to use a dedicated thread pool per JM, that we shut down 
> when the JM terminates. That way we would not keep the JM from being GC’d.
> (The concrete example we investigated was a DataSet job)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Closed] (FLINK-24983) Upgrade surefire to 3.0.0-M5

2021-11-24 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-24983.

Resolution: Fixed

master: 7f4e3e2944778b196e754b84e7a8514181dd9cf0

> Upgrade surefire to 3.0.0-M5
> 
>
> Key: FLINK-24983
> URL: https://issues.apache.org/jira/browse/FLINK-24983
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System, Tests
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> Surefire 3.0.0-M5 comes with a new TCP/IP communication channel between 
> surefire and JVM forks.
> This will allow us to resolve "Corrupted STDOUT" issues when the JVM is 
> printing warnings due to unsafe accesses.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-24858) TypeSerializer version mismatch during eagerly restore

2021-11-24 Thread Arvid Heise (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arvid Heise updated FLINK-24858:

Priority: Blocker  (was: Critical)

> TypeSerializer version mismatch during eagerly restore
> --
>
> Key: FLINK-24858
> URL: https://issues.apache.org/jira/browse/FLINK-24858
> Project: Flink
>  Issue Type: Bug
>  Components: API / Type Serialization System
>Affects Versions: 1.14.0, 1.13.3, 1.15.0
>Reporter: Fabian Paul
>Assignee: Fabian Paul
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.15.0, 1.14.1
>
>
> Currently, some of our TypeSerializer snapshots assume information about the 
> binary layout of the actual data rather than only holding information about 
> the TypeSerialzer.
> Multiple users ran into this problem 
> i.e.[https://lists.apache.org/thread/4q5q7wp0br96op6p7f695q2l8lz4wfzx|https://lists.apache.org/thread/4q5q7wp0br96op6p7f695q2l8lz4wfzx]
> {quote}This manifest itself when state is restored egarly (for example an 
> operator state) but, for example a user doesn't register the state on their 
> intializeState/open,* and then a checkpoint happens.
> The result is that we will have elements serialized according to an old 
> binary layout, but our serializer snapshot declares a new version which 
> indicates that the elements are written with a new binary layout.
> The next restore will fail.
> {quote}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25027) Allow GC of a finished job's JobMaster before the slot timeout is reached

2021-11-24 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-25027:
--
Fix Version/s: 1.15.0
   1.14.1
   1.13.4

> Allow GC of a finished job's JobMaster before the slot timeout is reached
> -
>
> Key: FLINK-25027
> URL: https://issues.apache.org/jira/browse/FLINK-25027
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Nico Kruber
>Priority: Critical
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
> Attachments: image-2021-11-23-20-32-20-479.png
>
>
> In a session cluster, after a (batch) job is finished, the JobMaster seems to 
> stick around for another couple of minutes before being eligible for garbage 
> collection.
> Looking into a heap dump, it seems to be tied to a 
> {{PhysicalSlotRequestBulkCheckerImpl}} which is enqueued in the underlying 
> Akka executor (and keeps the JM from being GC’d). Per default the action is 
> scheduled for {{slot.request.timeout}} that defaults to 5 min (thanks 
> [~trohrmann] for helping out here)
> !image-2021-11-23-20-32-20-479.png!
> With this setting, you will have to account for enough metaspace to cover 5 
> minutes of time which may span a couple of jobs, needlessly!
> The problem seems to be that Flink is using the main thread executor for the 
> scheduling that uses the {{ActorSystem}}'s scheduler and the future task 
> scheduled with Akka can (probably) not be easily cancelled.
> One idea could be to use a dedicated thread pool per JM, that we shut down 
> when the JM terminates. That way we would not keep the JM from being GC’d.
> (The concrete example we investigated was a DataSet job)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25030) Unexpected record in KafkaSourceITCase$IntegrationTests.testMultipleSplits

2021-11-24 Thread Matthias (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias updated FLINK-25030:
-
Description: 
We experienced a test failure in 
{{KafkaSourceITCase$IntegrationTests.testMultipleSplits}} in our Flink fork for 
1.14 due to an unexpected record:
{code}
[...]
Nov 23 21:10:19 [ERROR] 
org.apache.flink.connector.kafka.source.KafkaSourceITCase$IntegrationTests.testMultipleSplits{TestEnvironment,
 ExternalContext}[1]
Nov 23 21:10:19 [ERROR]   Run 1: 
KafkaSourceITCase$IntegrationTests>SourceTestSuiteBase.testMultipleSplits:160 
Nov 23 21:10:19 Expected: Records consumed by Flink should be identical to test 
data and preserve the order in multiple splits
Nov 23 21:10:19  but: Unexpected record '2-13N3fae7bfL1iEMF3I0TaWGC57vrflv' 
at position 367
Nov 23 21:10:19 Current progress of multiple split test data validation:
Nov 23 21:10:19 Split 0 (115/115): 
Nov 23 21:10:19 0-C7bHGoulUrqjQqGM8PiVI6BS9B3Okq2PJdf3EBas3G
Nov 23 21:10:19 0-GRt5T5YYDsgq1t0UBt3cUjvnktIbz
[...]
{code}

I verified that we do not touch kafka-related in the Fork (by going through the 
patches with {{grep -i 'kafka\|source'}}). I added the pipeline build artifacts 
to this ticket.

  was:
We experienced a test failure in 
{{KafkaSourceITCase$IntegrationTests.testMultipleSplits}} in our Flink fork for 
1.14 due to an unexpected record:
{code}
[...]
Nov 23 21:10:19 [ERROR] 
org.apache.flink.connector.kafka.source.KafkaSourceITCase$IntegrationTests.testMultipleSplits{TestEnvironment,
 ExternalContext}[1]
Nov 23 21:10:19 [ERROR]   Run 1: 
KafkaSourceITCase$IntegrationTests>SourceTestSuiteBase.testMultipleSplits:160 
Nov 23 21:10:19 Expected: Records consumed by Flink should be identical to test 
data and preserve the order in multiple splits
Nov 23 21:10:19  but: Unexpected record '2-13N3fae7bfL1iEMF3I0TaWGC57vrflv' 
at position 367
Nov 23 21:10:19 Current progress of multiple split test data validation:
Nov 23 21:10:19 Split 0 (115/115): 
Nov 23 21:10:19 0-C7bHGoulUrqjQqGM8PiVI6BS9B3Okq2PJdf3EBas3G
Nov 23 21:10:19 0-GRt5T5YYDsgq1t0UBt3cUjvnktIbz
[...]
{code}

I verified that we do not touch kafka-related in the Fork (by going through the 
patches with `grep -i 'kafka\|source'`). I added the pipeline build artifacts 
to this ticket.


> Unexpected record in KafkaSourceITCase$IntegrationTests.testMultipleSplits
> --
>
> Key: FLINK-25030
> URL: https://issues.apache.org/jira/browse/FLINK-25030
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.1
>Reporter: Matthias
>Priority: Major
>  Labels: test-stability
> Attachments: logs-ci_build-test_ci_build_kafka_gelly-1637699602.zip
>
>
> We experienced a test failure in 
> {{KafkaSourceITCase$IntegrationTests.testMultipleSplits}} in our Flink fork 
> for 1.14 due to an unexpected record:
> {code}
> [...]
> Nov 23 21:10:19 [ERROR] 
> org.apache.flink.connector.kafka.source.KafkaSourceITCase$IntegrationTests.testMultipleSplits{TestEnvironment,
>  ExternalContext}[1]
> Nov 23 21:10:19 [ERROR]   Run 1: 
> KafkaSourceITCase$IntegrationTests>SourceTestSuiteBase.testMultipleSplits:160 
> Nov 23 21:10:19 Expected: Records consumed by Flink should be identical to 
> test data and preserve the order in multiple splits
> Nov 23 21:10:19  but: Unexpected record 
> '2-13N3fae7bfL1iEMF3I0TaWGC57vrflv' at position 367
> Nov 23 21:10:19 Current progress of multiple split test data validation:
> Nov 23 21:10:19 Split 0 (115/115): 
> Nov 23 21:10:19 0-C7bHGoulUrqjQqGM8PiVI6BS9B3Okq2PJdf3EBas3G
> Nov 23 21:10:19 0-GRt5T5YYDsgq1t0UBt3cUjvnktIbz
> [...]
> {code}
> I verified that we do not touch kafka-related in the Fork (by going through 
> the patches with {{grep -i 'kafka\|source'}}). I added the pipeline build 
> artifacts to this ticket.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25030) Unexpected record in KafkaSourceITCase$IntegrationTests.testMultipleSplits

2021-11-24 Thread Matthias (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias updated FLINK-25030:
-
Attachment: logs-ci_build-test_ci_build_kafka_gelly-1637699602.zip

> Unexpected record in KafkaSourceITCase$IntegrationTests.testMultipleSplits
> --
>
> Key: FLINK-25030
> URL: https://issues.apache.org/jira/browse/FLINK-25030
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.1
>Reporter: Matthias
>Priority: Major
>  Labels: test-stability
> Attachments: logs-ci_build-test_ci_build_kafka_gelly-1637699602.zip
>
>
> We experienced a test failure in 
> {{KafkaSourceITCase$IntegrationTests.testMultipleSplits}} in our Flink fork 
> for 1.14 due to an unexpected record:
> {code}
> [...]
> Nov 23 21:10:19 [ERROR] 
> org.apache.flink.connector.kafka.source.KafkaSourceITCase$IntegrationTests.testMultipleSplits{TestEnvironment,
>  ExternalContext}[1]
> Nov 23 21:10:19 [ERROR]   Run 1: 
> KafkaSourceITCase$IntegrationTests>SourceTestSuiteBase.testMultipleSplits:160 
> Nov 23 21:10:19 Expected: Records consumed by Flink should be identical to 
> test data and preserve the order in multiple splits
> Nov 23 21:10:19  but: Unexpected record 
> '2-13N3fae7bfL1iEMF3I0TaWGC57vrflv' at position 367
> Nov 23 21:10:19 Current progress of multiple split test data validation:
> Nov 23 21:10:19 Split 0 (115/115): 
> Nov 23 21:10:19 0-C7bHGoulUrqjQqGM8PiVI6BS9B3Okq2PJdf3EBas3G
> Nov 23 21:10:19 0-GRt5T5YYDsgq1t0UBt3cUjvnktIbz
> [...]
> {code}
> I verified that we do not touch kafka-related in the Fork (by going through 
> the patches with `grep -i 'kafka\|source'`). I added the pipeline build 
> artifacts to this ticket.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25030) Unexpected record in KafkaSourceITCase$IntegrationTests.testMultipleSplits

2021-11-24 Thread Matthias (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias updated FLINK-25030:
-
Description: 
We experienced a test failure in 
{{KafkaSourceITCase$IntegrationTests.testMultipleSplits}} in our Flink fork for 
1.14 due to an unexpected record:
{code}
[...]
Nov 23 21:10:19 [ERROR] 
org.apache.flink.connector.kafka.source.KafkaSourceITCase$IntegrationTests.testMultipleSplits{TestEnvironment,
 ExternalContext}[1]
Nov 23 21:10:19 [ERROR]   Run 1: 
KafkaSourceITCase$IntegrationTests>SourceTestSuiteBase.testMultipleSplits:160 
Nov 23 21:10:19 Expected: Records consumed by Flink should be identical to test 
data and preserve the order in multiple splits
Nov 23 21:10:19  but: Unexpected record '2-13N3fae7bfL1iEMF3I0TaWGC57vrflv' 
at position 367
Nov 23 21:10:19 Current progress of multiple split test data validation:
Nov 23 21:10:19 Split 0 (115/115): 
Nov 23 21:10:19 0-C7bHGoulUrqjQqGM8PiVI6BS9B3Okq2PJdf3EBas3G
Nov 23 21:10:19 0-GRt5T5YYDsgq1t0UBt3cUjvnktIbz
[...]
{code}

I verified that we do not touch kafka-related in the Fork (by going through the 
patches with `grep -i 'kafka\|source'`). I added the pipeline build artifacts 
to this ticket.

  was:
We experienced a test failure in 
{{KafkaSourceITCase$IntegrationTests.testMultipleSplits}} in our Flink fork for 
1.14 due to an unexpected record:
{code}
[...]
Nov 23 21:10:19 [ERROR] 
org.apache.flink.connector.kafka.source.KafkaSourceITCase$IntegrationTests.testMultipleSplits{TestEnvironment,
 ExternalContext}[1]
Nov 23 21:10:19 [ERROR]   Run 1: 
KafkaSourceITCase$IntegrationTests>SourceTestSuiteBase.testMultipleSplits:160 
Nov 23 21:10:19 Expected: Records consumed by Flink should be identical to test 
data and preserve the order in multiple splits
Nov 23 21:10:19  but: Unexpected record '2-13N3fae7bfL1iEMF3I0TaWGC57vrflv' 
at position 367
Nov 23 21:10:19 Current progress of multiple split test data validation:
Nov 23 21:10:19 Split 0 (115/115): 
Nov 23 21:10:19 0-C7bHGoulUrqjQqGM8PiVI6BS9B3Okq2PJdf3EBas3G
Nov 23 21:10:19 0-GRt5T5YYDsgq1t0UBt3cUjvnktIbz
[...]
{code}


> Unexpected record in KafkaSourceITCase$IntegrationTests.testMultipleSplits
> --
>
> Key: FLINK-25030
> URL: https://issues.apache.org/jira/browse/FLINK-25030
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.1
>Reporter: Matthias
>Priority: Major
>  Labels: test-stability
> Attachments: logs-ci_build-test_ci_build_kafka_gelly-1637699602.zip
>
>
> We experienced a test failure in 
> {{KafkaSourceITCase$IntegrationTests.testMultipleSplits}} in our Flink fork 
> for 1.14 due to an unexpected record:
> {code}
> [...]
> Nov 23 21:10:19 [ERROR] 
> org.apache.flink.connector.kafka.source.KafkaSourceITCase$IntegrationTests.testMultipleSplits{TestEnvironment,
>  ExternalContext}[1]
> Nov 23 21:10:19 [ERROR]   Run 1: 
> KafkaSourceITCase$IntegrationTests>SourceTestSuiteBase.testMultipleSplits:160 
> Nov 23 21:10:19 Expected: Records consumed by Flink should be identical to 
> test data and preserve the order in multiple splits
> Nov 23 21:10:19  but: Unexpected record 
> '2-13N3fae7bfL1iEMF3I0TaWGC57vrflv' at position 367
> Nov 23 21:10:19 Current progress of multiple split test data validation:
> Nov 23 21:10:19 Split 0 (115/115): 
> Nov 23 21:10:19 0-C7bHGoulUrqjQqGM8PiVI6BS9B3Okq2PJdf3EBas3G
> Nov 23 21:10:19 0-GRt5T5YYDsgq1t0UBt3cUjvnktIbz
> [...]
> {code}
> I verified that we do not touch kafka-related in the Fork (by going through 
> the patches with `grep -i 'kafka\|source'`). I added the pipeline build 
> artifacts to this ticket.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25027) Allow GC of a finished job's JobMaster before the slot timeout is reached

2021-11-24 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448436#comment-17448436
 ] 

Till Rohrmann commented on FLINK-25027:
---

[~zjureel] I think this is a good idea. We should avoid using the main thread 
executor for scheduling tasks that have a long delay. Do you want to work on 
this issue?

> Allow GC of a finished job's JobMaster before the slot timeout is reached
> -
>
> Key: FLINK-25027
> URL: https://issues.apache.org/jira/browse/FLINK-25027
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Nico Kruber
>Priority: Critical
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
> Attachments: image-2021-11-23-20-32-20-479.png
>
>
> In a session cluster, after a (batch) job is finished, the JobMaster seems to 
> stick around for another couple of minutes before being eligible for garbage 
> collection.
> Looking into a heap dump, it seems to be tied to a 
> {{PhysicalSlotRequestBulkCheckerImpl}} which is enqueued in the underlying 
> Akka executor (and keeps the JM from being GC’d). Per default the action is 
> scheduled for {{slot.request.timeout}} that defaults to 5 min (thanks 
> [~trohrmann] for helping out here)
> !image-2021-11-23-20-32-20-479.png!
> With this setting, you will have to account for enough metaspace to cover 5 
> minutes of time which may span a couple of jobs, needlessly!
> The problem seems to be that Flink is using the main thread executor for the 
> scheduling that uses the {{ActorSystem}}'s scheduler and the future task 
> scheduled with Akka can (probably) not be easily cancelled.
> One idea could be to use a dedicated thread pool per JM, that we shut down 
> when the JM terminates. That way we would not keep the JM from being GC’d.
> (The concrete example we investigated was a DataSet job)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25030) Unexpected record in KafkaSourceITCase$IntegrationTests.testMultipleSplits

2021-11-24 Thread Matthias (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448437#comment-17448437
 ] 

Matthias commented on FLINK-25030:
--

FLINK-24261 and FLINK-25030 are related in a way that the same test is failing. 
But the failure reason is different. I linked them, anyway.

> Unexpected record in KafkaSourceITCase$IntegrationTests.testMultipleSplits
> --
>
> Key: FLINK-25030
> URL: https://issues.apache.org/jira/browse/FLINK-25030
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.1
>Reporter: Matthias
>Priority: Major
>  Labels: test-stability
> Attachments: logs-ci_build-test_ci_build_kafka_gelly-1637699602.zip
>
>
> We experienced a test failure in 
> {{KafkaSourceITCase$IntegrationTests.testMultipleSplits}} in our Flink fork 
> for 1.14 due to an unexpected record:
> {code}
> [...]
> Nov 23 21:10:19 [ERROR] 
> org.apache.flink.connector.kafka.source.KafkaSourceITCase$IntegrationTests.testMultipleSplits{TestEnvironment,
>  ExternalContext}[1]
> Nov 23 21:10:19 [ERROR]   Run 1: 
> KafkaSourceITCase$IntegrationTests>SourceTestSuiteBase.testMultipleSplits:160 
> Nov 23 21:10:19 Expected: Records consumed by Flink should be identical to 
> test data and preserve the order in multiple splits
> Nov 23 21:10:19  but: Unexpected record 
> '2-13N3fae7bfL1iEMF3I0TaWGC57vrflv' at position 367
> Nov 23 21:10:19 Current progress of multiple split test data validation:
> Nov 23 21:10:19 Split 0 (115/115): 
> Nov 23 21:10:19 0-C7bHGoulUrqjQqGM8PiVI6BS9B3Okq2PJdf3EBas3G
> Nov 23 21:10:19 0-GRt5T5YYDsgq1t0UBt3cUjvnktIbz
> [...]
> {code}
> I verified that we do not touch kafka-related in the Fork (by going through 
> the patches with {{grep -i 'kafka\|source'}}). I added the pipeline build 
> artifacts to this ticket.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] zentol merged pull request #17856: [FLINK-24984][ci] Rename MAVEN_OPTS variable

2021-11-24 Thread GitBox


zentol merged pull request #17856:
URL: https://github.com/apache/flink/pull/17856


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol commented on pull request #17857: [FLINK-24985][tests] Relax assumptions of stacktrace layout

2021-11-24 Thread GitBox


zentol commented on pull request #17857:
URL: https://github.com/apache/flink/pull/17857#issuecomment-977646498


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-24984) Rename MAVEN_OPTS variable in azure yaml

2021-11-24 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-24984.

Resolution: Fixed

master: 5a3c843d5e92073d546dba858030700a36582392

> Rename MAVEN_OPTS variable in azure yaml
> 
>
> Key: FLINK-24984
> URL: https://issues.apache.org/jira/browse/FLINK-24984
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System / Azure Pipelines
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> The build-apache-repo.ymnl defines a MAVEN_OPTS variable, which are later 
> passed to Maven.
> This is a bit misleading because MAVEN_OPTS already has semantics attached to 
> it by Maven itself, in that it passes the value of such an environment 
> variable to the JVM.
> As such it can, and should, only be used to control JVM parameters (e.g., 
> memory), and not Maven-specific stuff. In fact the JVM would fail if, for 
> example, it were to contain a profile activation.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Closed] (FLINK-25013) flink pulsar connector cannot consume deferred messages

2021-11-24 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser closed FLINK-25013.
--
Resolution: Not A Problem

Thanks for the investigation and resolution [~syhily]. I'm closing this ticket. 

> flink pulsar connector cannot consume deferred messages
> ---
>
> Key: FLINK-25013
> URL: https://issues.apache.org/jira/browse/FLINK-25013
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.14.0
>Reporter: zhangjg
>Priority: Major
> Attachments: screenshot-1.png
>
>
>  
> at flink 1.14.0 version ,  pulsar connector ( PulsarSource) cannot consume 
> deferred messages . 
> For the message sent by the producer, the flink pulsar source will be 
> consumed immediately
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #17857: [FLINK-24985][tests] Relax assumptions of stacktrace layout

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17857:
URL: https://github.com/apache/flink/pull/17857#issuecomment-975401663


   
   ## CI report:
   
   * 17691f4330fb8017970c9934c8e52ea91da7cdf4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26947)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17857: [FLINK-24985][tests] Relax assumptions of stacktrace layout

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17857:
URL: https://github.com/apache/flink/pull/17857#issuecomment-975401663


   
   ## CI report:
   
   * 17691f4330fb8017970c9934c8e52ea91da7cdf4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26947)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25029) Hadoop Caller Context Setting In Flink

2021-11-24 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448448#comment-17448448
 ] 

Arvid Heise commented on FLINK-25029:
-

Looks like a coordination topic to me, [~dmvk]?

> Hadoop Caller Context Setting In Flink
> --
>
> Key: FLINK-25029
> URL: https://issues.apache.org/jira/browse/FLINK-25029
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: 刘方奇
>Priority: Major
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> The above is the main effect of the Caller Context. HDFS Client set Caller 
> Context, then name node get it in audit log to do some work.
> Now the Spark and hive have the Caller Context to meet the HDFS Job Audit 
> requirement.
> In my company, flink jobs often cause some problems for HDFS, so we did it 
> for preventing some cases.
> If the feature is general enough. Should we support it, then I can submit a 
> PR for this.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #17866: [FLINK-24996][ci] Update CI image to contain Java 17

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17866:
URL: https://github.com/apache/flink/pull/17866#issuecomment-975542249


   
   ## CI report:
   
   * 3fafa6faf69e75095fe6be191958bcedbf8bf1bd Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26850)
 
   * 2159f71b8bc45b04836d046d4efd5f1b61029d54 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] dannycranmer commented on a change in pull request #17345: [FLINK-24227][connectors/kinesis] Added Kinesis Data Streams Sink i…

2021-11-24 Thread GitBox


dannycranmer commented on a change in pull request #17345:
URL: https://github.com/apache/flink/pull/17345#discussion_r755809363



##
File path: 
flink-connectors/flink-connector-aws-kinesis-data-streams/src/main/java/org/apache/flink/connector/kinesis/sink/KinesisDataStreamsSinkBuilder.java
##
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.kinesis.sink;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.connector.base.sink.AsyncSinkBaseBuilder;
+
+import software.amazon.awssdk.services.kinesis.model.PutRecordsRequestEntry;
+
+import java.util.Properties;
+
+/**
+ * Builder to construct {@link KinesisDataStreamsSink}.
+ *
+ * The following example shows the minimum setup to create a {@link 
KinesisDataStreamsSink} that
+ * writes String values to a Kinesis Data Streams stream named 
your_stream_here.
+ *
+ * {@code
+ * ElementConverter elementConverter =
+ * KinesisDataStreamsSinkElementConverter.builder()
+ * .setSerializationSchema(new SimpleStringSchema())
+ * .setPartitionKeyGenerator(element -> 
String.valueOf(element.hashCode()))
+ * .build();
+ *
+ * KinesisDataStreamsSink kdsSink =
+ * KinesisDataStreamsSink.builder()
+ * .setElementConverter(elementConverter)
+ * .setStreamName("your_stream_name")
+ * .build();
+ * }
+ *
+ * If the following parameters are not set in this builder, the following 
defaults will be used:
+ *
+ * 
+ *   {@code maxBatchSize} will be 200
+ *   {@code maxInFlightRequests} will be 16
+ *   {@code maxBufferedRequests} will be 1
+ *   {@code flushOnBufferSizeInBytes} will be 64MB i.e. {@code 64 * 1024 * 
1024}
+ *   {@code maxTimeInBufferMS} will be 5000ms
+ *   {@code failOnError} will be false
+ * 
+ *
+ * @param  type of elements that should be persisted in the destination
+ */
+@PublicEvolving
+public class KinesisDataStreamsSinkBuilder
+extends AsyncSinkBaseBuilder<
+InputT, PutRecordsRequestEntry, 
KinesisDataStreamsSinkBuilder> {
+
+private static final int DEFAULT_MAX_BATCH_SIZE = 200;
+private static final int DEFAULT_MAX_IN_FLIGHT_REQUESTS = 16;
+private static final int DEFAULT_MAX_BUFFERED_REQUESTS = 1;
+private static final long DEFAULT_FLUSH_ON_BUFFER_SIZE_IN_B = 64 * 1024 * 
1024;
+private static final long DEFAULT_MAX_TIME_IN_BUFFER_MS = 5000;
+private static final boolean DEFAULT_FAIL_ON_ERROR = false;

Review comment:
   The Amazon Kinesis Data Streams [PutRecrods 
quota](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecords.html)
 has a few dimensions:
   > Each PutRecords request can support up to 500 records. Each record in the 
request can be as large as 1 MiB, up to a limit of 5 MiB for the entire request
   
   - Batch size (record count): 500 records
   - Max record size: 1MiB
   - Batch size (bytes): 5MiB
   
   How would I use these values to setup a configuration for KDS? Does 
`DEFAULT_FLUSH_ON_BUFFER_SIZE_IN_B` guarantee a buffer size or is it a minimum? 
Could we get in a situation where the payload we are trying to send exceeds the 
maximum size and we spin on it?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-24855) Source Coordinator Thread already exists. There should never be more than one thread driving the actions of a Source Coordinator.

2021-11-24 Thread lilicheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448450#comment-17448450
 ] 

lilicheng commented on FLINK-24855:
---

[~wangminchao] Have you solved this problem? I also encountered the same 
problem.:(

> Source Coordinator Thread already exists. There should never be more than one 
> thread driving the actions of a Source Coordinator.
> -
>
> Key: FLINK-24855
> URL: https://issues.apache.org/jira/browse/FLINK-24855
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core, Runtime / Coordination
>Affects Versions: 1.13.3
> Environment: flink 1.13.3
> flink-cdc 2.1
>Reporter: WangMinChao
>Priority: Critical
>
>  
> When I am synchronizing large tables, have the following problems :
> 2021-11-09 20:33:04,222 INFO 
> com.ververica.cdc.connectors.mysql.source.enumerator.MySqlSourceEnumerator [] 
> - Assign split MySqlSnapshotSplit\{tableId=db.table, splitId='db.table:383', 
> splitKeyType=[`id` BIGINT NOT NULL], splitStart=[9798290], 
> splitEnd=[9823873], highWatermark=null} to subtask 1
> 2021-11-09 20:33:04,248 INFO 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering 
> checkpoint 101 (type=CHECKPOINT) @ 1636461183945 for job 
> 3cee105643cfee78b80cd0a41143b5c1.
> 2021-11-09 20:33:10,734 ERROR 
> org.apache.flink.runtime.util.FatalExitExceptionHandler [] - FATAL: Thread 
> 'SourceCoordinator-Source: mysqlcdc-source -> Sink: kafka-sink' produced an 
> uncaught exception. Stopping the process...
> java.lang.Error: Source Coordinator Thread already exists. There should never 
> be more than one thread driving the actions of a Source Coordinator. Existing 
> Thread: Thread[SourceCoordinator-Source: mysqlcdc-source -> Sink: 
> kafka-sink,5,main]
> at 
> org.apache.flink.runtime.source.coordinator.SourceCoordinatorProvider$CoordinatorExecutorThreadFactory.newThread(SourceCoordinatorProvider.java:119)
>  [flink-dist_2.12-1.13.3.jar:1.13.3]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.(ThreadPoolExecutor.java:619)
>  ~[?:1.8.0_191]
> at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:932)
>  ~[?:1.8.0_191]
> at 
> java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1025)
>  ~[?:1.8.0_191]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
>  ~[?:1.8.0_191]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_191]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] dannycranmer commented on a change in pull request #17345: [FLINK-24227][connectors/kinesis] Added Kinesis Data Streams Sink i…

2021-11-24 Thread GitBox


dannycranmer commented on a change in pull request #17345:
URL: https://github.com/apache/flink/pull/17345#discussion_r755809363



##
File path: 
flink-connectors/flink-connector-aws-kinesis-data-streams/src/main/java/org/apache/flink/connector/kinesis/sink/KinesisDataStreamsSinkBuilder.java
##
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.kinesis.sink;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.connector.base.sink.AsyncSinkBaseBuilder;
+
+import software.amazon.awssdk.services.kinesis.model.PutRecordsRequestEntry;
+
+import java.util.Properties;
+
+/**
+ * Builder to construct {@link KinesisDataStreamsSink}.
+ *
+ * The following example shows the minimum setup to create a {@link 
KinesisDataStreamsSink} that
+ * writes String values to a Kinesis Data Streams stream named 
your_stream_here.
+ *
+ * {@code
+ * ElementConverter elementConverter =
+ * KinesisDataStreamsSinkElementConverter.builder()
+ * .setSerializationSchema(new SimpleStringSchema())
+ * .setPartitionKeyGenerator(element -> 
String.valueOf(element.hashCode()))
+ * .build();
+ *
+ * KinesisDataStreamsSink kdsSink =
+ * KinesisDataStreamsSink.builder()
+ * .setElementConverter(elementConverter)
+ * .setStreamName("your_stream_name")
+ * .build();
+ * }
+ *
+ * If the following parameters are not set in this builder, the following 
defaults will be used:
+ *
+ * 
+ *   {@code maxBatchSize} will be 200
+ *   {@code maxInFlightRequests} will be 16
+ *   {@code maxBufferedRequests} will be 1
+ *   {@code flushOnBufferSizeInBytes} will be 64MB i.e. {@code 64 * 1024 * 
1024}
+ *   {@code maxTimeInBufferMS} will be 5000ms
+ *   {@code failOnError} will be false
+ * 
+ *
+ * @param  type of elements that should be persisted in the destination
+ */
+@PublicEvolving
+public class KinesisDataStreamsSinkBuilder
+extends AsyncSinkBaseBuilder<
+InputT, PutRecordsRequestEntry, 
KinesisDataStreamsSinkBuilder> {
+
+private static final int DEFAULT_MAX_BATCH_SIZE = 200;
+private static final int DEFAULT_MAX_IN_FLIGHT_REQUESTS = 16;
+private static final int DEFAULT_MAX_BUFFERED_REQUESTS = 1;
+private static final long DEFAULT_FLUSH_ON_BUFFER_SIZE_IN_B = 64 * 1024 * 
1024;
+private static final long DEFAULT_MAX_TIME_IN_BUFFER_MS = 5000;
+private static final boolean DEFAULT_FAIL_ON_ERROR = false;

Review comment:
   The Amazon Kinesis Data Streams [PutRecrods 
quota](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecords.html)
 has a few dimensions:
   > Each PutRecords request can support up to 500 records. Each record in the 
request can be as large as 1 MiB, up to a limit of 5 MiB for the entire request
   
   - Batch size (record count): 500 records
   - Max record size: 1MiB
   - Batch size (bytes): 5MiB
   
   How would I use these values to setup a configuration for KDS? Does 
`DEFAULT_FLUSH_ON_BUFFER_SIZE_IN_B` guarantee a buffer size or is it a minimum? 
Could we get in a situation where the payload we are trying to send exceeds the 
maximum size and we spin on it?
   
   What happens if I try to send a 2MiB record?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-25031) Job finishes iff all job vertices finish

2021-11-24 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-25031:
--

 Summary: Job finishes iff all job vertices finish
 Key: FLINK-25031
 URL: https://issues.apache.org/jira/browse/FLINK-25031
 Project: Flink
  Issue Type: Sub-task
Reporter: Lijie Wang


The adaptive batch job scheduler needs to build ExecutionGraph dynamically. For 
a dynamic graph, since its execution vertices can be lazily created, a job 
should not finish when all ExecutionVertex(es) finish. Changes should be made 
to let a job finish only when all registered ExecutionJobVertex have finished.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #17866: [FLINK-24996][ci] Update CI image to contain Java 17

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17866:
URL: https://github.com/apache/flink/pull/17866#issuecomment-975542249


   
   ## CI report:
   
   * 3fafa6faf69e75095fe6be191958bcedbf8bf1bd Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26850)
 
   * 2159f71b8bc45b04836d046d4efd5f1b61029d54 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26989)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-25011) Introduce VertexParallelismDecider

2021-11-24 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-25011:
---
Component/s: Runtime / Coordination

> Introduce VertexParallelismDecider
> --
>
> Key: FLINK-25011
> URL: https://issues.apache.org/jira/browse/FLINK-25011
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Priority: Major
>
> Introduce VertexParallelismDecider and provide a default implementation.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25031) Job finishes iff all job vertices finish

2021-11-24 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-25031:
---
Component/s: Runtime / Coordination

> Job finishes iff all job vertices finish
> 
>
> Key: FLINK-25031
> URL: https://issues.apache.org/jira/browse/FLINK-25031
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Priority: Major
>
> The adaptive batch job scheduler needs to build ExecutionGraph dynamically. 
> For a dynamic graph, since its execution vertices can be lazily created, a 
> job should not finish when all ExecutionVertex(es) finish. Changes should be 
> made to let a job finish only when all registered ExecutionJobVertex have 
> finished.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24611) Prevent JM from discarding state on checkpoint abortion

2021-11-24 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448454#comment-17448454
 ] 

Yun Tang commented on FLINK-24611:
--

Sorry for missing the notification as I was in the two days' on-call.

[~sewen] {{Incremental state are currently only SST files, how often are those 
below 20KBytes?}}
In the general case, RocksDB should not have many 20KB size files. This could 
happen if checkpoint interval is too small, thus make the flushing too 
frequently. On the other hand, it might happen under case that too many states 
with extremely small memory capacity, write buffer manager would flush empty 
memtable in advance (see FLINK-19238).

Dropping placeholder handle would make the behavior of checkpointed state size 
change. Current checkpointed state size shown in the web UI is incremental 
checkpoint size, if we drop the placeholder handle, we cannot know the actual 
incremental checkpoint size.

If dropping placeholder handle in changelog state-backend would make things 
more easier, can we only disable the placeholder handle with changelog 
state-backend, but still keep it for the normal usage of RocksDB state-backend? 
Changelog state-backend would not checkpoint too often. On the other hand, 
flushing memtables too often in the low-memory case could show explicit poor 
performance and user should detect that then.

> Prevent JM from discarding state on checkpoint abortion
> ---
>
> Key: FLINK-24611
> URL: https://issues.apache.org/jira/browse/FLINK-24611
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing
>Affects Versions: 1.15.0
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> 
> MOTIVATION
> When a checkpoint is aborted, JM discards any state that was sent to it and 
> wasn't used in other checkpoints. This forces incremental state backends to 
> wait for confirmation from JM before re-using this state. For changelog 
> backend this is even more critical.
> One approach proposed was to make backends/TMs responsible for their state, 
> until it's not shared with other TMs, i.e. until rescaling (private/shared 
> state ownership track: FLINK-23342 and more).
> However, that approach is quite invasive.
> An alternative solution would be:
> 1. SharedStateRegistry remembers the latest checkpoint for each shared state 
> (instead of usage count currently)
> 2. CompletedCheckpointStore notifies it about the lowest valid checkpoint (on 
> subsumption)
> 3. SharedStateRegistry then discards any state associated with the lower 
> (subsumed/aborted) checkpoints
> So the aborted checkpoint can only be discarded after some subsequent 
> successful checkpoint (which can mark state as used).
> Mostly JM code is changed. IncrementalRemoteKeyedStateHandle.discard needs to 
> be adjusted.
> Backends don't re-upload state.
> 
> IMPLEMENTATION CONSIDERATIONS
> On subsumption, JM needs to find all the unused state and discard it.
> This can either be done by
> 1) simply traversing all entries; or by 
> 2) maintaining a set of entries per checkpoint (e.g. SortedMap Set>). This allows to skip unnecessary traversal at the cost of higher 
> memory usage
> In both cases:
> - each entry stores last checkpoint ID it was used in (long)
> - key is hashed (even with plain traversal, map.entrySet.iterator.remove() 
> computes hash internally)
> Because of the need to maintain secondary sets, (2) isn't asymptotically 
> better than (1), and is likely worse in practice and requires more memory 
> (see discussion in the comments). So approach (1) seems reasonable.
> 
> CORNER CASES
> The following cases shouldn't pose any difficulties:
> 1. Recovery, re-scaling, and state used by not all or by no tasks - we still 
> register all states on recovery even after FLINK-22483/FLINK-24086
> 2. Cross-task state sharing - not an issue as long as everything is managed 
> by JM
> 3. Dependencies between SharedStateRegistry and CompletedCheckpointStore - 
> simple after FLINK-24086
> 4. Multiple concurrent checkpoints (below)
> Consider the following case:
> (nr. concurrent checkpoints > 1)
> 1. checkpoint 1 starts, TM reports that it uses file f1; checkpoint 1 gets 
> aborted - f1 is now tracked
> 2. savepoint 2 starts, it *will* use f1
> 3. checkpoint 3 starts and finishes; it does NOT use file f1
> When a checkpoint finishes, all pending checkpoints before it are aborted - 
> but not savepoints.
> Savepoin

[jira] [Created] (FLINK-25032) Allow to create execution vertices and execution edges lazily

2021-11-24 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-25032:
--

 Summary: Allow to create execution vertices and execution edges 
lazily
 Key: FLINK-25032
 URL: https://issues.apache.org/jira/browse/FLINK-25032
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Lijie Wang


For a dynamic graph, its execution vertices and execution edges should be 
lazily created.

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24867) E2e tests take longer than the maximum 310 minutes on AZP

2021-11-24 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448455#comment-17448455
 ] 

Chesnay Schepler commented on FLINK-24867:
--

No, we only added it for 1.14.

> E2e tests take longer than the maximum 310 minutes on AZP
> -
>
> Key: FLINK-24867
> URL: https://issues.apache.org/jira/browse/FLINK-24867
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.14.0, 1.13.3
>Reporter: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
>
> The e2e tests took longer than the maximum 310 minutes in one AZP run. This 
> made the build step fail.
> {code}
> ##[error]The job running on agent Azure Pipelines 9 ran longer than the 
> maximum time of 310 minutes. For more information, see 
> https://go.microsoft.com/fwlink/?linkid=2077134
> Agent: Azure Pipelines 9
> Started: Today at 09:25
> Duration: 5h 10m 11s
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=26257&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25033) Let some scheduler components updatable.

2021-11-24 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-25033:
--

 Summary: Let some scheduler components updatable.
 Key: FLINK-25033
 URL: https://issues.apache.org/jira/browse/FLINK-25033
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Lijie Wang


Many scheduler components rely on the execution topology to make decisions. 
Some of them will build up some mappings against the execution topology on 
initialization for later use. When the execution topology becomes dynamic, 
these components need to be notified about the topology changes and adjust 
themselves accordingly. These components are:
 * DefaultExecutionTopology
 * SchedulingStrategy
 * PartitionReleaseStrategy
 * SlotSharingStrategy
 * OperatorCoordinatorHandler
 * Network memory of SlotSharingGroup.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #17789: [FLINK-24351][docs] Translate "JSON Function" pages into Chinese

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17789:
URL: https://github.com/apache/flink/pull/17789#issuecomment-968290372


   
   ## CI report:
   
   * 3db6df5e2bfaf867f718ed0fb8d2929511a1e85c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26920)
 
   * cc368b3f669d6504aa779b22116ed41263f10612 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17789: [FLINK-24351][docs] Translate "JSON Function" pages into Chinese

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17789:
URL: https://github.com/apache/flink/pull/17789#issuecomment-968290372


   
   ## CI report:
   
   * 3db6df5e2bfaf867f718ed0fb8d2929511a1e85c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26920)
 
   * cc368b3f669d6504aa779b22116ed41263f10612 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26990)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25029) Hadoop Caller Context Setting In Flink

2021-11-24 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-25029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448457#comment-17448457
 ] 

David Morávek commented on FLINK-25029:
---

I'd say this is mostly related to filesystems (namely to the hadoop 
filesystem). If I understand it correctly, it should be enough to call 
`CallerContext#setCurrent` before initiating any calls that interact with HDFS. 
Whether it's also a coordination issue depends on whether filesystems have 
enough information to create a "descriptive" context for the call. (eg. in 
context hierarchy, "flink ->  ->  -> ???") 

[1] 
https://github.com/apache/hadoop/blob/rel/release-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java#L147

> Hadoop Caller Context Setting In Flink
> --
>
> Key: FLINK-25029
> URL: https://issues.apache.org/jira/browse/FLINK-25029
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: 刘方奇
>Priority: Major
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> The above is the main effect of the Caller Context. HDFS Client set Caller 
> Context, then name node get it in audit log to do some work.
> Now the Spark and hive have the Caller Context to meet the HDFS Job Audit 
> requirement.
> In my company, flink jobs often cause some problems for HDFS, so we did it 
> for preventing some cases.
> If the feature is general enough. Should we support it, then I can submit a 
> PR for this.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24855) Source Coordinator Thread already exists. There should never be more than one thread driving the actions of a Source Coordinator.

2021-11-24 Thread WangMinChao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448458#comment-17448458
 ] 

WangMinChao commented on FLINK-24855:
-

[~lilicheng] No,I increased the memory size of the taskmanager and the problem 
did not recur:P

> Source Coordinator Thread already exists. There should never be more than one 
> thread driving the actions of a Source Coordinator.
> -
>
> Key: FLINK-24855
> URL: https://issues.apache.org/jira/browse/FLINK-24855
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core, Runtime / Coordination
>Affects Versions: 1.13.3
> Environment: flink 1.13.3
> flink-cdc 2.1
>Reporter: WangMinChao
>Priority: Critical
>
>  
> When I am synchronizing large tables, have the following problems :
> 2021-11-09 20:33:04,222 INFO 
> com.ververica.cdc.connectors.mysql.source.enumerator.MySqlSourceEnumerator [] 
> - Assign split MySqlSnapshotSplit\{tableId=db.table, splitId='db.table:383', 
> splitKeyType=[`id` BIGINT NOT NULL], splitStart=[9798290], 
> splitEnd=[9823873], highWatermark=null} to subtask 1
> 2021-11-09 20:33:04,248 INFO 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering 
> checkpoint 101 (type=CHECKPOINT) @ 1636461183945 for job 
> 3cee105643cfee78b80cd0a41143b5c1.
> 2021-11-09 20:33:10,734 ERROR 
> org.apache.flink.runtime.util.FatalExitExceptionHandler [] - FATAL: Thread 
> 'SourceCoordinator-Source: mysqlcdc-source -> Sink: kafka-sink' produced an 
> uncaught exception. Stopping the process...
> java.lang.Error: Source Coordinator Thread already exists. There should never 
> be more than one thread driving the actions of a Source Coordinator. Existing 
> Thread: Thread[SourceCoordinator-Source: mysqlcdc-source -> Sink: 
> kafka-sink,5,main]
> at 
> org.apache.flink.runtime.source.coordinator.SourceCoordinatorProvider$CoordinatorExecutorThreadFactory.newThread(SourceCoordinatorProvider.java:119)
>  [flink-dist_2.12-1.13.3.jar:1.13.3]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.(ThreadPoolExecutor.java:619)
>  ~[?:1.8.0_191]
> at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:932)
>  ~[?:1.8.0_191]
> at 
> java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1025)
>  ~[?:1.8.0_191]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
>  ~[?:1.8.0_191]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_191]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #17881: [FLINK-24971][tests] Adding retry mechanism in case git clone fails

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17881:
URL: https://github.com/apache/flink/pull/17881#issuecomment-976302259


   
   ## CI report:
   
   * e26f93603b44c28a0bf284e3c4ba51e2b1565c94 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26901)
 
   * 8df3b668e66cb56d9c8450446624d504d04d8d1f UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17881: [FLINK-24971][tests] Adding retry mechanism in case git clone fails

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17881:
URL: https://github.com/apache/flink/pull/17881#issuecomment-976302259


   
   ## CI report:
   
   * e26f93603b44c28a0bf284e3c4ba51e2b1565c94 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26901)
 
   * 8df3b668e66cb56d9c8450446624d504d04d8d1f Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26991)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-25034) Support flexible number of subpartitions in IntermediateResultPartition

2021-11-24 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-25034:
--

 Summary: Support flexible number of subpartitions in 
IntermediateResultPartition
 Key: FLINK-25034
 URL: https://issues.apache.org/jira/browse/FLINK-25034
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Lijie Wang


Currently, when a task is deployed, it needs to know the parallelism of its 
consumer job vertex. This is because the consumer vertex parallelism is needed 
to decide the _numberOfSubpartitions_ of _PartitionDescriptor_ which is part of 
the {_}ResultPartitionDeploymentDescriptor{_}. The reason behind that is, at 
the moment, for one result partition, different subpartitions serve different 
consumer execution vertices. More specifically, one consumer execution vertex 
only consumes data from subpartition with the same index. 

Considering a dynamic graph, the parallelism of a job vertex may not have been 
decided when its upstream vertices are deployed. To enable Flink to work in 
this case, we need a way to allow an execution vertex to run without knowing 
the parallelism of its consumer job vertices. One basic idea is to enable 
multiple subpartitions in one result partition to serve the same consumer 
execution vertex.

To achieve this goal, we can set the number of subpartitions to be the *max 
parallelism* of the consumer job vertex. When the consumer vertex is deployed, 
it should be assigned with a subpartition range to consume.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25035) Shuffle Service Supports Consuming Subpartition Range

2021-11-24 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-25035:
--

 Summary: Shuffle Service Supports Consuming Subpartition Range
 Key: FLINK-25035
 URL: https://issues.apache.org/jira/browse/FLINK-25035
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Reporter: Lijie Wang


In adaptive batch job scheduler, the shuffle service needs to support a 
SingleInputGate to consume  a certain range of subpartitions.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] zentol merged pull request #17133: [FLINK-24138] Architectural tests

2021-11-24 Thread GitBox


zentol merged pull request #17133:
URL: https://github.com/apache/flink/pull/17133


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-25036) Introduce stage-wised scheduling strategy

2021-11-24 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-25036:
--

 Summary: Introduce stage-wised scheduling strategy
 Key: FLINK-25036
 URL: https://issues.apache.org/jira/browse/FLINK-25036
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Lijie Wang


The scheduling of the adaptive batch job scheduler should be stage granularity, 
because the information for deciding parallelism can only be collected after 
the upstream stage is fully finished, so we need to introduce a new scheduling 
strategy: Stage-wised Scheduling Strategy.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Closed] (FLINK-24138) Automated architectural tests

2021-11-24 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-24138.

Fix Version/s: 1.15.0
   Resolution: Fixed

master: b6af27e1229c8922ca9ea6e1f71dd5adeef2b7c7

> Automated architectural tests
> -
>
> Key: FLINK-24138
> URL: https://issues.apache.org/jira/browse/FLINK-24138
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Tests
>Reporter: Ingo Bürk
>Assignee: Ingo Bürk
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> See ML thread: 
> https://lists.apache.org/thread.html/r35b679f0b0d83be8a4912dcd2155e28b316f476547ae5dab601bda65%40%3Cdev.flink.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #17842: [FLINK-24966] [docs] Fix spelling errors in the project

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17842:
URL: https://github.com/apache/flink/pull/17842#issuecomment-974310249


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * ba291419cba7e29422178d48536695ef558ef593 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26985)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25027) Allow GC of a finished job's JobMaster before the slot timeout is reached

2021-11-24 Thread Shammon (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448464#comment-17448464
 ] 

Shammon commented on FLINK-25027:
-

[~trohrmann] Yes. Can you assign the issue to me? I'll be glad to work on it. 
THX

> Allow GC of a finished job's JobMaster before the slot timeout is reached
> -
>
> Key: FLINK-25027
> URL: https://issues.apache.org/jira/browse/FLINK-25027
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Nico Kruber
>Priority: Critical
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
> Attachments: image-2021-11-23-20-32-20-479.png
>
>
> In a session cluster, after a (batch) job is finished, the JobMaster seems to 
> stick around for another couple of minutes before being eligible for garbage 
> collection.
> Looking into a heap dump, it seems to be tied to a 
> {{PhysicalSlotRequestBulkCheckerImpl}} which is enqueued in the underlying 
> Akka executor (and keeps the JM from being GC’d). Per default the action is 
> scheduled for {{slot.request.timeout}} that defaults to 5 min (thanks 
> [~trohrmann] for helping out here)
> !image-2021-11-23-20-32-20-479.png!
> With this setting, you will have to account for enough metaspace to cover 5 
> minutes of time which may span a couple of jobs, needlessly!
> The problem seems to be that Flink is using the main thread executor for the 
> scheduling that uses the {{ActorSystem}}'s scheduler and the future task 
> scheduled with Akka can (probably) not be easily cancelled.
> One idea could be to use a dedicated thread pool per JM, that we shut down 
> when the JM terminates. That way we would not keep the JM from being GC’d.
> (The concrete example we investigated was a DataSet job)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-24987) Enhance ExternalizedCheckpointCleanup enum

2021-11-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Morávek updated FLINK-24987:
--
Affects Version/s: (was: 1.14.0)

> Enhance ExternalizedCheckpointCleanup enum
> --
>
> Key: FLINK-24987
> URL: https://issues.apache.org/jira/browse/FLINK-24987
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Nicolaus Weidner
>Assignee: Nicolaus Weidner
>Priority: Major
> Fix For: 1.15.0
>
>
> We use the config setting 
> [execution.checkpointing.externalized-checkpoint-retention|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/ExecutionCheckpointingOptions.java#L90-L119]
>  to distinguish three cases:
> - delete on cancellation
> - retain on cancellation
> - no externalized checkpoints (if no value is set)
> It would be easier to understand if we had an explicit enum value 
> NO_EXTERNALIZED_CHECKPOINTS for the third case in 
> [ExternalizedCheckpointCleanup|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/CheckpointConfig.java#L702-L742].
>  This would also avoid potential issues for clients with handling null values 
> (for example, null values being dropped on serialization could be annoying 
> when trying to change from RETAIN_ON_CANCELLATION to no external checkpoints).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-24987) Enhance ExternalizedCheckpointCleanup enum

2021-11-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Morávek updated FLINK-24987:
--
Fix Version/s: 1.15.0

> Enhance ExternalizedCheckpointCleanup enum
> --
>
> Key: FLINK-24987
> URL: https://issues.apache.org/jira/browse/FLINK-24987
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 1.14.0
>Reporter: Nicolaus Weidner
>Assignee: Nicolaus Weidner
>Priority: Major
> Fix For: 1.15.0
>
>
> We use the config setting 
> [execution.checkpointing.externalized-checkpoint-retention|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/ExecutionCheckpointingOptions.java#L90-L119]
>  to distinguish three cases:
> - delete on cancellation
> - retain on cancellation
> - no externalized checkpoints (if no value is set)
> It would be easier to understand if we had an explicit enum value 
> NO_EXTERNALIZED_CHECKPOINTS for the third case in 
> [ExternalizedCheckpointCleanup|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/CheckpointConfig.java#L702-L742].
>  This would also avoid potential issues for clients with handling null values 
> (for example, null values being dropped on serialization could be annoying 
> when trying to change from RETAIN_ON_CANCELLATION to no external checkpoints).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24844) CassandraConnectorITCase.testCassandraBatchPojoFormat fails on AZP

2021-11-24 Thread Etienne Chauchot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448465#comment-17448465
 ] 

Etienne Chauchot commented on FLINK-24844:
--

Hi Till, thanks for the pointer. I plan to work on this issue when I'm done 
with [this|https://issues.apache.org/jira/browse/FLINK-22775] other Cassandra 
flaky test.

> CassandraConnectorITCase.testCassandraBatchPojoFormat fails on AZP
> --
>
> Key: FLINK-24844
> URL: https://issues.apache.org/jira/browse/FLINK-24844
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Cassandra
>Affects Versions: 1.14.0
>Reporter: Till Rohrmann
>Assignee: Etienne Chauchot
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.14.1
>
>
> The test {{CassandraConnectorITCase.testCassandraBatchPojoFormat}} fails on 
> AZP with
> {code}
> 2021-11-09T00:32:42.1369473Z Nov 09 00:32:42 [ERROR] Tests run: 17, Failures: 
> 0, Errors: 1, Skipped: 0, Time elapsed: 152.962 s <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase
> 2021-11-09T00:32:42.1371638Z Nov 09 00:32:42 [ERROR] 
> testCassandraBatchPojoFormat  Time elapsed: 20.378 s  <<< ERROR!
> 2021-11-09T00:32:42.1372881Z Nov 09 00:32:42 
> com.datastax.driver.core.exceptions.AlreadyExistsException: Table 
> flink.batches already exists
> 2021-11-09T00:32:42.1373913Z Nov 09 00:32:42  at 
> com.datastax.driver.core.exceptions.AlreadyExistsException.copy(AlreadyExistsException.java:111)
> 2021-11-09T00:32:42.1374921Z Nov 09 00:32:42  at 
> com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
> 2021-11-09T00:32:42.1379615Z Nov 09 00:32:42  at 
> com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
> 2021-11-09T00:32:42.1380668Z Nov 09 00:32:42  at 
> com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:63)
> 2021-11-09T00:32:42.1381523Z Nov 09 00:32:42  at 
> com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:39)
> 2021-11-09T00:32:42.1382552Z Nov 09 00:32:42  at 
> org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase.testCassandraBatchPojoFormat(CassandraConnectorITCase.java:543)
> 2021-11-09T00:32:42.1383487Z Nov 09 00:32:42  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2021-11-09T00:32:42.1384433Z Nov 09 00:32:42  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2021-11-09T00:32:42.1385336Z Nov 09 00:32:42  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2021-11-09T00:32:42.1386119Z Nov 09 00:32:42  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2021-11-09T00:32:42.1387204Z Nov 09 00:32:42  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 2021-11-09T00:32:42.1388225Z Nov 09 00:32:42  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2021-11-09T00:32:42.1389101Z Nov 09 00:32:42  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 2021-11-09T00:32:42.1400913Z Nov 09 00:32:42  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2021-11-09T00:32:42.1401588Z Nov 09 00:32:42  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2021-11-09T00:32:42.1402487Z Nov 09 00:32:42  at 
> org.apache.flink.testutils.junit.RetryRule$RetryOnExceptionStatement.evaluate(RetryRule.java:192)
> 2021-11-09T00:32:42.1403055Z Nov 09 00:32:42  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2021-11-09T00:32:42.1403556Z Nov 09 00:32:42  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> 2021-11-09T00:32:42.1404008Z Nov 09 00:32:42  at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 2021-11-09T00:32:42.1404650Z Nov 09 00:32:42  at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> 2021-11-09T00:32:42.1405151Z Nov 09 00:32:42  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> 2021-11-09T00:32:42.1405632Z Nov 09 00:32:42  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> 2021-11-09T00:32:42.1406166Z Nov 09 00:32:42  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> 2021-11-09T00:32:42.1406670Z Nov 09 00:32:42  at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> 2021-11-09T00:32:42.1407125Z Nov 09 00:32:42  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> 2021-11-09T00:32:42.1407599Z Nov 09 00:32:42  at 
> org.junit.runners.P

[GitHub] [flink] echauchot commented on pull request #17640: [FLINK-21407][doc][formats] Add formats to DataStream connectors doc

2021-11-24 Thread GitBox


echauchot commented on pull request #17640:
URL: https://github.com/apache/flink/pull/17640#issuecomment-977686313


   Hi @AHeise can we merge this PR ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17857: [FLINK-24985][tests] Relax assumptions of stacktrace layout

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17857:
URL: https://github.com/apache/flink/pull/17857#issuecomment-975401663


   
   ## CI report:
   
   * 17691f4330fb8017970c9934c8e52ea91da7cdf4 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26947)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol merged pull request #17857: [FLINK-24985][tests] Relax assumptions of stacktrace layout

2021-11-24 Thread GitBox


zentol merged pull request #17857:
URL: https://github.com/apache/flink/pull/17857


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] RocMarshal commented on pull request #17789: [FLINK-24351][docs] Translate "JSON Function" pages into Chinese

2021-11-24 Thread GitBox


RocMarshal commented on pull request #17789:
URL: https://github.com/apache/flink/pull/17789#issuecomment-977687669


   @MonsterChenzhuo Please also make sure that the PR template is properly 
filled in, which makes it useful for reviewers to comprehend the change details 
of the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-24985) LocalBufferPoolDestroyTest#isInBlockingBufferRequest does not work on Java 17

2021-11-24 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-24985.

Resolution: Fixed

master: cd4f7ea9c4b2592557fdb53cb486671cb18ddbb0

> LocalBufferPoolDestroyTest#isInBlockingBufferRequest does not work on Java 17
> -
>
> Key: FLINK-24985
> URL: https://issues.apache.org/jira/browse/FLINK-24985
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> LocalBufferPoolDestroyTest#isInBlockingBufferRequest does some nasty 
> stacktrace analysis to find out out whether a thread is currently waiting for 
> a buffer or not.
> The assumptions it makes over the stracktrace layout no longer apply on Java 
> 17, and need to be adjusted.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] echauchot commented on pull request #17839: [FLINK-24859] document new connectors formats

2021-11-24 Thread GitBox


echauchot commented on pull request #17839:
URL: https://github.com/apache/flink/pull/17839#issuecomment-977688038


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25029) Hadoop Caller Context Setting In Flink

2021-11-24 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-25029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448467#comment-17448467
 ] 

刘方奇 commented on FLINK-25029:
-

THX for [~dmvk] , your reply was prefect.

So, now there are two questions of the issue:
1. Whether we should do some code work of this function for flink.

2. Which "descriptive" context of flink job should we pick to send to hdfs, if 
we do this work, I think it is simple to get the informations for jobid / 
taskid / task index etc.

> Hadoop Caller Context Setting In Flink
> --
>
> Key: FLINK-25029
> URL: https://issues.apache.org/jira/browse/FLINK-25029
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: 刘方奇
>Priority: Major
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> The above is the main effect of the Caller Context. HDFS Client set Caller 
> Context, then name node get it in audit log to do some work.
> Now the Spark and hive have the Caller Context to meet the HDFS Job Audit 
> requirement.
> In my company, flink jobs often cause some problems for HDFS, so we did it 
> for preventing some cases.
> If the feature is general enough. Should we support it, then I can submit a 
> PR for this.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] echauchot removed a comment on pull request #17839: [FLINK-24859] document new connectors formats

2021-11-24 Thread GitBox


echauchot removed a comment on pull request #17839:
URL: https://github.com/apache/flink/pull/17839#issuecomment-977688038


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] echauchot commented on pull request #17839: [FLINK-24859] document new connectors formats

2021-11-24 Thread GitBox


echauchot commented on pull request #17839:
URL: https://github.com/apache/flink/pull/17839#issuecomment-977688776


   @flinkbot run azure
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17839: [FLINK-24859] document new connectors formats

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17839:
URL: https://github.com/apache/flink/pull/17839#issuecomment-974134553


   
   ## CI report:
   
   * d066843917e65f45a3a7299ae58b9ef5bc8ee782 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26943)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol commented on pull request #17887: [FLINK-25020] Properly forward exception

2021-11-24 Thread GitBox


zentol commented on pull request #17887:
URL: https://github.com/apache/flink/pull/17887#issuecomment-977690122


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-25037) Compilation of compile_cron_python_wheels failed on AZP

2021-11-24 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-25037:
-

 Summary: Compilation of compile_cron_python_wheels failed on AZP
 Key: FLINK-25037
 URL: https://issues.apache.org/jira/browse/FLINK-25037
 Project: Flink
  Issue Type: Bug
  Components: API / Python, Build System / Azure Pipelines
Affects Versions: 1.12.5
Reporter: Till Rohrmann
 Fix For: 1.12.6


The compilation of {{compile_cron_python_wheels}} failed on AZP with

{code}
==
Compiling Flink
==
Invoking mvn with 'mvn -Dmaven.wagon.http.pool=false --settings 
/__w/1/s/tools/ci/google-mirror-settings.xml 
-Dorg.slf4j.simpleLogger.showDateTime=true 
-Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss.SSS 
-Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn
 --no-snapshot-updates -B -Dhadoop.version=2.8.3 -Dinclude_hadoop_aws 
-Dscala-2.11  clean deploy 
-DaltDeploymentRepository=validation_repository::default::file:/tmp/flink-validation-deployment
 -Dmaven.repo.local=/home/vsts/work/1/.m2/repository 
-Dflink.convergence.phase=install -Pcheck-convergence -Dflink.forkCount=2 
-Dflink.forkCountTestPackage=2 -Dmaven.javadoc.skip=true -U -DskipTests'
[ERROR] Could not create local repository at /home/vsts/work/1/.m2/repository 
-> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/LocalRepositoryNotAccessibleException
==
Compiling Flink failed.
==
{code}

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=26968&view=logs&j=a29bcfe1-064d-50b9-354f-07802213a3c0&t=47ff6576-c9dc-5eab-9db8-183dcca3bede



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #17839: [FLINK-24859] document new connectors formats

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17839:
URL: https://github.com/apache/flink/pull/17839#issuecomment-974134553


   
   ## CI report:
   
   * d066843917e65f45a3a7299ae58b9ef5bc8ee782 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26943)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17887: [FLINK-25020] Properly forward exception

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17887:
URL: https://github.com/apache/flink/pull/17887#issuecomment-976615140


   
   ## CI report:
   
   * 15a77538ee4fdbc85ca6c875ae3797b0a9b2d9e5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26944)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17887: [FLINK-25020] Properly forward exception

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17887:
URL: https://github.com/apache/flink/pull/17887#issuecomment-976615140


   
   ## CI report:
   
   * 15a77538ee4fdbc85ca6c875ae3797b0a9b2d9e5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26944)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-25037) Compilation of compile_cron_python_wheels failed on AZP

2021-11-24 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-25037.

Fix Version/s: (was: 1.12.6)
   Resolution: Duplicate

> Compilation of compile_cron_python_wheels failed on AZP
> ---
>
> Key: FLINK-25037
> URL: https://issues.apache.org/jira/browse/FLINK-25037
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python, Build System / Azure Pipelines
>Affects Versions: 1.12.5
>Reporter: Till Rohrmann
>Priority: Major
>
> The compilation of {{compile_cron_python_wheels}} failed on AZP with
> {code}
> ==
> Compiling Flink
> ==
> Invoking mvn with 'mvn -Dmaven.wagon.http.pool=false --settings 
> /__w/1/s/tools/ci/google-mirror-settings.xml 
> -Dorg.slf4j.simpleLogger.showDateTime=true 
> -Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss.SSS 
> -Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn
>  --no-snapshot-updates -B -Dhadoop.version=2.8.3 -Dinclude_hadoop_aws 
> -Dscala-2.11  clean deploy 
> -DaltDeploymentRepository=validation_repository::default::file:/tmp/flink-validation-deployment
>  -Dmaven.repo.local=/home/vsts/work/1/.m2/repository 
> -Dflink.convergence.phase=install -Pcheck-convergence -Dflink.forkCount=2 
> -Dflink.forkCountTestPackage=2 -Dmaven.javadoc.skip=true -U -DskipTests'
> [ERROR] Could not create local repository at /home/vsts/work/1/.m2/repository 
> -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/LocalRepositoryNotAccessibleException
> ==
> Compiling Flink failed.
> ==
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=26968&view=logs&j=a29bcfe1-064d-50b9-354f-07802213a3c0&t=47ff6576-c9dc-5eab-9db8-183dcca3bede



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] JingGe commented on pull request #17888: [FLINK-24077][hbase2/HBaseConnectorITCase] fix sporadic ut failing

2021-11-24 Thread GitBox


JingGe commented on pull request #17888:
URL: https://github.com/apache/flink/pull/17888#issuecomment-977700997


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17888: [FLINK-24077][hbase2/HBaseConnectorITCase] fix sporadic ut failing

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17888:
URL: https://github.com/apache/flink/pull/17888#issuecomment-976661814


   
   ## CI report:
   
   * ff803e470139bc8af3c0beb2d33b278427b70569 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26957)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe commented on pull request #17888: [FLINK-24077][hbase2/HBaseConnectorITCase] fix sporadic ut failing

2021-11-24 Thread GitBox


JingGe commented on pull request #17888:
URL: https://github.com/apache/flink/pull/17888#issuecomment-977702914


   Since the test was failing sporadically, we will run CI multiple times.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-24987) Enhance ExternalizedCheckpointCleanup enum

2021-11-24 Thread Nicolaus Weidner (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolaus Weidner updated FLINK-24987:
-
Component/s: Runtime / Configuration
 (was: API / DataStream)

> Enhance ExternalizedCheckpointCleanup enum
> --
>
> Key: FLINK-24987
> URL: https://issues.apache.org/jira/browse/FLINK-24987
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Reporter: Nicolaus Weidner
>Assignee: Nicolaus Weidner
>Priority: Major
> Fix For: 1.15.0
>
>
> We use the config setting 
> [execution.checkpointing.externalized-checkpoint-retention|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/ExecutionCheckpointingOptions.java#L90-L119]
>  to distinguish three cases:
> - delete on cancellation
> - retain on cancellation
> - no externalized checkpoints (if no value is set)
> It would be easier to understand if we had an explicit enum value 
> NO_EXTERNALIZED_CHECKPOINTS for the third case in 
> [ExternalizedCheckpointCleanup|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/CheckpointConfig.java#L702-L742].
>  This would also avoid potential issues for clients with handling null values 
> (for example, null values being dropped on serialization could be annoying 
> when trying to change from RETAIN_ON_CANCELLATION to no external checkpoints).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #17519: [FLINK-24368][testutils] Refactor FlinkContainer to split JM and TMs into different containers and supports HA

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17519:
URL: https://github.com/apache/flink/pull/17519#issuecomment-946579332


   
   ## CI report:
   
   * 958332c60f7fdf8bd3b402bff4dff469448885b2 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26520)
 
   * f322d52abfc8f66e1c6705e02a1b07e5448136a4 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17888: [FLINK-24077][hbase2/HBaseConnectorITCase] fix sporadic ut failing

2021-11-24 Thread GitBox


flinkbot edited a comment on pull request #17888:
URL: https://github.com/apache/flink/pull/17888#issuecomment-976661814


   
   ## CI report:
   
   * ff803e470139bc8af3c0beb2d33b278427b70569 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=26957)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




  1   2   3   4   5   6   >