[GitHub] [flink] zhuzhurk commented on a change in pull request #16436: [FLINK-22017][coordination] Allow BLOCKING result partition to be individually consumable

2021-07-14 Thread GitBox


zhuzhurk commented on a change in pull request #16436:
URL: https://github.com/apache/flink/pull/16436#discussion_r670187546



##
File path: flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adapter/DefaultResultPartitionTest.java
##
@@ -34,24 +33,18 @@
 /** Unit tests for {@link DefaultResultPartition}. */
 public class DefaultResultPartitionTest extends TestLogger {
 
-    private static final TestResultPartitionStateSupplier resultPartitionState =
-            new TestResultPartitionStateSupplier();
-
-    private final IntermediateResultPartitionID resultPartitionId =
-            new IntermediateResultPartitionID();
-    private final IntermediateDataSetID intermediateResultId = new IntermediateDataSetID();
+    @Test

Review comment:
   There is no need to make these changes




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zhuzhurk commented on a change in pull request #16436: [FLINK-22017][coordination] Allow BLOCKING result partition to be individually consumable

2021-07-14 Thread GitBox


zhuzhurk commented on a change in pull request #16436:
URL: https://github.com/apache/flink/pull/16436#discussion_r670186121



##
File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/EdgeManagerBuildUtil.java
##
@@ -88,6 +88,7 @@ private static void connectAllToAll(
                         Arrays.stream(intermediateResult.getPartitions())
                                 .map(IntermediateResultPartition::getPartitionId)
                                 .collect(Collectors.toList()));
+        registerConsumedPartitionGroupToEdgeManager(intermediateResult, consumedPartitions);

Review comment:
   `intermediateResult.getPartitions()` is an array, so I think `createAndRegisterConsumedPartitionGroup(IntermediateResultPartition... partitions)` can work.
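   A minimal sketch of what such a varargs helper could look like (the method body and the `fromMultiplePartitions` factory are assumptions for illustration, not code from this PR):
   ```java
   // Hypothetical varargs variant of the helper discussed above.
   private static ConsumedPartitionGroup createAndRegisterConsumedPartitionGroup(
           IntermediateResultPartition... partitions) {
       List<IntermediateResultPartitionID> partitionIds =
               Arrays.stream(partitions)
                       .map(IntermediateResultPartition::getPartitionId)
                       .collect(Collectors.toList());
       return ConsumedPartitionGroup.fromMultiplePartitions(partitionIds);
   }
   ```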




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zhuzhurk commented on a change in pull request #16436: [FLINK-22017][coordination] Allow BLOCKING result partition to be individually consumable

2021-07-14 Thread GitBox


zhuzhurk commented on a change in pull request #16436:
URL: https://github.com/apache/flink/pull/16436#discussion_r670184762



##
File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/EdgeManagerBuildUtil.java
##
@@ -171,4 +177,30 @@ private static void connectPointwise(
 }
 }
 }
+
+    private static ConsumedPartitionGroup createAndRegisterConsumedPartitionGroupToEdgeManager(
+            IntermediateResultPartitionID consumedPartitionId,
+            IntermediateResult intermediateResult) {
+        ConsumedPartitionGroup consumedPartitionGroup =
+                ConsumedPartitionGroup.fromSinglePartition(consumedPartitionId);
+        intermediateResult
+                .getProducer()
+                .getGraph()
+                .getEdgeManager()
+                .registerConsumedPartitionGroup(consumedPartitionGroup);
+        return consumedPartitionGroup;

Review comment:
   The method can be implemented as `createAndRegisterConsumedPartitionGroupToEdgeManager(Collections.singleton(consumedPartitionId))`.
   And maybe we can remove `ConsumedPartitionGroup#fromSinglePartition(...)`.
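   A hedged sketch of that delegation (it assumes a sibling overload taking a `Collection` of partition IDs, which is not shown in this hunk):
   ```java
   // Hypothetical delegation to a collection-based overload (assumption).
   private static ConsumedPartitionGroup createAndRegisterConsumedPartitionGroupToEdgeManager(
           IntermediateResultPartitionID consumedPartitionId,
           IntermediateResult intermediateResult) {
       return createAndRegisterConsumedPartitionGroupToEdgeManager(
               Collections.singleton(consumedPartitionId), intermediateResult);
   }
   ```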




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-23299) Bump flink-shaded-zookeeper 3.5 to 3.5.9

2021-07-14 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-23299:
-
Fix Version/s: (was: shaded-1.14)
   shaded-14.0

> Bump flink-shaded-zookeeper 3.5 to 3.5.9
> 
>
> Key: FLINK-23299
> URL: https://issues.apache.org/jira/browse/FLINK-23299
> Project: Flink
>  Issue Type: Improvement
>  Components: BuildSystem / Shaded
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: shaded-14.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23326) Bump flink-shaded-guava to 30.1.1

2021-07-14 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-23326:
-
Fix Version/s: (was: shaded-1.14)
   shaded-14.0

> Bump flink-shaded-guava to 30.1.1
> -
>
> Key: FLINK-23326
> URL: https://issues.apache.org/jira/browse/FLINK-23326
> Project: Flink
>  Issue Type: Improvement
>  Components: BuildSystem / Shaded
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: shaded-14.0
>
>
> A guava version bump is long overdue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23325) Bump flink-shaded-netty to 4.1.65

2021-07-14 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-23325:
-
Fix Version/s: (was: shaded-1.14)
   shaded-14.0

> Bump flink-shaded-netty to 4.1.65
> -
>
> Key: FLINK-23325
> URL: https://issues.apache.org/jira/browse/FLINK-23325
> Project: Flink
>  Issue Type: Improvement
>  Components: BuildSystem / Shaded
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: shaded-14.0
>
>
> For general improvements, performance and security.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23291) Bump flink-shaded-jackson to 2.12.4

2021-07-14 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-23291:
-
Fix Version/s: (was: shaded-1.14)
   shaded-14.0

> Bump flink-shaded-jackson to 2.12.4
> ---
>
> Key: FLINK-23291
> URL: https://issues.apache.org/jira/browse/FLINK-23291
> Project: Flink
>  Issue Type: Improvement
>  Components: BuildSystem / Shaded
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: shaded-14.0
>
>
> 2.12.1 -> 2.12.3



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22452) Support specifying custom transactional.id prefix in FlinkKafkaProducer

2021-07-14 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-22452:
---
Fix Version/s: 1.14.0

> Support specifying custom transactional.id prefix in FlinkKafkaProducer
> ---
>
> Key: FLINK-22452
> URL: https://issues.apache.org/jira/browse/FLINK-22452
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Affects Versions: 1.12.2
>Reporter: Wenhao Ji
>Assignee: Wenhao Ji
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Currently, the "transactional.id"s of the Kafka producers in 
> FlinkKafkaProducer are generated based on the task name. This mechanism has 
> some limitations:
>  * It will exceed Kafka's limitation if the task name is too long. (resolved 
> in FLINK-17691)
>  * They will very likely clash with each other if the job topologies are 
> similar. (discussed in FLINK-11654)
>  * Only certain "transactional.id" may be authorized by [Prefixed 
> ACLs|https://docs.confluent.io/platform/current/kafka/authorization.html#prefixed-acls]
>  on the target Kafka cluster.
> Besides, the Spring community has introduced the 
> [setTransactionIdPrefix|https://docs.spring.io/spring-kafka/api/org/springframework/kafka/core/DefaultKafkaProducerFactory.html#setTransactionIdPrefix(java.lang.String)]
>  method in their Kafka client.
> Therefore, I think it will be necessary to have this feature in the Flink 
> Kafka connector. 
>  
> As discussed in FLINK-11654, the possible solutions would be:
>  * either introduce an additional method called 
> setTransactionalIdPrefix(String) in the FlinkKafkaProducer (see the sketch 
> below),
>  * or use the existing "transactional.id" property as the prefix.
>  The behavior of the "transactional.id" generation would then be:
>  * keep the behavior as it is if the prefix is absent,
>  * use it as the prefix for the TransactionalIdsGenerator if present.
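> A hedged sketch of how the first option could look from the user side (the 
> method name follows this proposal and is not part of the released API at the 
> time of writing):
> {code:java}
> FlinkKafkaProducer<String> producer =
>         new FlinkKafkaProducer<>(
>                 "output-topic",
>                 serializationSchema,
>                 kafkaProperties,
>                 FlinkKafkaProducer.Semantic.EXACTLY_ONCE);
> // Proposed API: all generated transactional.ids would derive from this prefix.
> producer.setTransactionalIdPrefix("my-job-prefix");
> {code}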



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] pnowojski commented on a change in pull request #16494: [FLINK-22452][Connectors/Kafka] Support specifying custom transactional.id prefix in FlinkKafkaProducer

2021-07-14 Thread GitBox


pnowojski commented on a change in pull request #16494:
URL: https://github.com/apache/flink/pull/16494#discussion_r670181201



##
File path: flink-connectors/flink-connector-kafka/src/test/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducerITCase.java
##
@@ -597,6 +602,93 @@ public void testMigrateFromAtExactlyOnceToAtLeastOnce() throws Exception {
 deleteTestTopic(topic);
 }
 
+    @Test
+    public void testCustomizeTransactionalIdPrefix() throws Exception {
+        String transactionalIdPrefix = "my-prefix";
+
+        Properties properties = createProperties();
+        String topic = "testCustomizeTransactionalIdPrefix";
+        FlinkKafkaProducer<Integer> kafkaProducer =
+                spy(
+                        new FlinkKafkaProducer<>(
+                                topic,
+                                integerKeyedSerializationSchema,
+                                properties,
+                                FlinkKafkaProducer.Semantic.EXACTLY_ONCE));
+        kafkaProducer.setTransactionalIdPrefix(transactionalIdPrefix);
+
+        try (OneInputStreamOperatorTestHarness<Integer, Object> testHarness =
+                new OneInputStreamOperatorTestHarness<>(
+                        new StreamSink<>(kafkaProducer), IntSerializer.INSTANCE)) {
+            testHarness.setup();
+            testHarness.open();
+            testHarness.processElement(1, 0);
+            testHarness.snapshot(0, 1);
+        }
+
+        ArgumentCaptor<FlinkKafkaProducer.KafkaTransactionState> stateCaptor =
+                ArgumentCaptor.forClass(FlinkKafkaProducer.KafkaTransactionState.class);
+        verify(kafkaProducer).invoke(stateCaptor.capture(), any(), any());

Review comment:
   We are [discouraging use of mockito](https://flink.apache.org/contributing/code-style-and-quality-common.html#design-for-testability). I see at least a couple of ways to avoid it:
   
   1. extend `FlinkKafkaProducer` and override `beginTransaction()` so that it proxies the call to the super class and captures the result (sketched below)
   2. use the existing `org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction#currentTransaction()` method to validate it (it's a `protected` method, so you would need to access it in the test from the same package?)
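   A hedged sketch of option 1 (the constructor used and the visibility of `KafkaTransactionState#transactionalId` from the test package are assumptions):
   ```java
   // Test-only subclass that records the transactional.id of every transaction
   // it begins; assumes it lives in the same package as FlinkKafkaProducer's tests.
   class IdCapturingKafkaProducer extends FlinkKafkaProducer<Integer> {
       final List<String> beganTransactionalIds = new ArrayList<>();

       IdCapturingKafkaProducer(
               String topic, KeyedSerializationSchema<Integer> schema, Properties props) {
           super(topic, schema, props, FlinkKafkaProducer.Semantic.EXACTLY_ONCE);
       }

       @Override
       protected FlinkKafkaProducer.KafkaTransactionState beginTransaction()
               throws FlinkKafkaException {
           FlinkKafkaProducer.KafkaTransactionState state = super.beginTransaction();
           beganTransactionalIds.add(state.transactionalId); // field access is an assumption
           return state;
       }
   }
   ```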

##
File path: 
flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java
##
@@ -1149,22 +1168,28 @@ public void initializeState(FunctionInitializationContext context) throws Except
 migrateNextTransactionalIdHindState(context);
 }
 
-        String taskName = getRuntimeContext().getTaskName();
-        // Kafka transactional IDs are limited in length to be less than the max value of a short,
-        // so we truncate here if necessary to a more reasonable length string.
-        if (taskName.length() > maxTaskNameSize) {
-            taskName = taskName.substring(0, maxTaskNameSize);
-            LOG.warn(
-                    "Truncated task name for Kafka TransactionalId from {} to {}.",
-                    getRuntimeContext().getTaskName(),
-                    taskName);
+        String transactionalIdPrefix;

Review comment:
   nit: rename `transactionalIdPrefix` -> `actualTransactionalIdPrefix` or `computedTransactionalIdPrefix`?

##
File path: flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java
##
@@ -248,6 +248,9 @@
     /** Flag controlling whether we are writing the Flink record's timestamp into Kafka. */
     protected boolean writeTimestampToKafka = false;
 
+    /** The transactional.id prefix to be used by the producers when communicating with Kafka. */
+    private String transactionalIdPrefix = null;

Review comment:
   nit: annotate `@Nullable` or change to `Optional`?
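   For the first option, a minimal sketch (Flink typically uses `javax.annotation.Nullable`):
   ```java
   /** The transactional.id prefix to be used by the producers when communicating with Kafka. */
   @Nullable private String transactionalIdPrefix;
   ```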

##
File path: flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java
##
@@ -774,6 +777,22 @@ public void setLogFailuresOnly(boolean logFailuresOnly) {
         this.logFailuresOnly = logFailuresOnly;
     }
 
+    /**
+     * Specifies the prefix of the transactional.id property to be used by the producers when
+     * communicating with Kafka. If not set, the transactional.id will be prefixed with {@code
+     * taskName + "-" + operatorUid}.
+     *
+     * <p>Note that, if we change the prefix when the Flink application previously failed before
+     * the first checkpoint completed, or we are starting a new batch of {@link FlinkKafkaProducer}
+     * from scratch without a clean shutdown of the previous one, since we don't know what the
+     * previously used transactional.id prefix was, there will be some lingering transactions left.

Review comment:
   👍 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


[jira] [Commented] (FLINK-23379) interval left join null value result out of order

2021-07-14 Thread waywtdcc (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381093#comment-17381093
 ] 

waywtdcc commented on FLINK-23379:
--

Hi [~lzljs3620320],

Yeah, but what I start with is an append-only stream, which becomes a retract 
stream after deduplication. How should I deal with that?

>  interval left join null value result out of order
> --
>
> Key: FLINK-23379
> URL: https://issues.apache.org/jira/browse/FLINK-23379
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.12.2
>Reporter: waywtdcc
>Priority: Major
>
> * Scenario:
>  The person main table is interval-left-joined with a message information table.
>  A record that finds no match in the message table can be emitted later than a 
> subsequent record that does find a match.
>  When a normal output and a null-padded output share the same primary key, they 
> arrive out of order: the null-padded output comes after the normal output, 
> producing incorrect results.
> Input:
> {"id": 1, "name":"chencc2", "message": "good boy2", "ts":"2021-03-26 
> 18:56:43"}
> {"id": 1, "name":"chencc2", "age": "28", "ts":"2021-03-26 19:02:47"}
> {"id": 1, "name":"chencc2", "message": "good boy3", "ts":"2021-03-26 
> 19:06:43"}
> {"id": 1, "name":"chencc2", "age": "27", "ts":"2021-03-26 19:06:47"}
> Output:
>  +I(chencc2,27,2021-03-26T19:06:47,good boy3,2021-03-26 19:06:43.000)
>  +I(chencc2,28,2021-03-26T19:02:47,null,null)
>  The second record's time (19:02) is earlier than the first record's, but it 
> is output later, causing data update errors.
>  
>  *  code
> {code:java}
> tableEnv.executeSql("drop table if exists persons_table_kafka2");
>          String kafka_source_sql = "CREATE TABLE persons_table_kafka2 (\n" +
>                  "  `id` BIGINT,\n" +
>                  "  `name` STRING,\n" +
>                  "  `age` INT,\n" +
>                  "  proctime as PROCTIME(),\n" +
>                  "  `ts` TIMESTAMP(3),\n" +
>                  " WATERMARK FOR ts AS ts\n" +
>                  ") WITH (\n" +
>                  "  'connector' = 'kafka',\n" +
>                  "  'topic' = 'persons_test2',\n" +
>                  "  'properties.bootstrap.servers' = 'node2:6667',\n" +
>                  "  'properties.group.id' = 'testGroa115',\n" +
>                  "  'scan.startup.mode' = 'group-offsets',\n" +
>                  "  'format' = 'json'\n" +
>                  ")";
>          tableEnv.executeSql(kafka_source_sql);
>         tableEnv.executeSql("drop table if exists 
> persons_message_table_kafka2");
>          String kafka_source_sql2 = "CREATE TABLE 
> persons_message_table_kafka2 (\n" +
>                  "  `id` BIGINT,\n" +
>                  "  `name` STRING,\n" +
>                  "  `message` STRING,\n" +
>                  "  `ts` TIMESTAMP(3) ," +
>  //                " WATERMARK FOR ts AS ts - INTERVAL '5' SECOND\n" +
>                  " WATERMARK FOR ts AS ts\n" +
>                  ") WITH (\n" +
>                  "  'connector' = 'kafka',\n" +
>                  "  'topic' = 'persons_extra_message2',\n" +
>                  "  'properties.bootstrap.servers' = 'node2:6667',\n" +
>                  "  'properties.group.id' = 'testGroud2e313',\n" +
>                  "  'scan.startup.mode' = 'group-offsets',\n" +
>                  "  'format' = 'json'\n" +
>                  ")";
>          tableEnv.executeSql(kafka_source_sql2);
>         tableEnv.executeSql("" +
>                  "CREATE TEMPORARY VIEW result_data_view " +
>                  " as " +
>                  " select " +
>                  " t1.name, t1.age,t1.ts as ts1,t2.message, cast (t2.ts as 
> string) as ts2 " +
>                  " from persons_table_kafka2 t1 " +
>                  " left  join persons_message_table_kafka2 t2 on t1.name = 
> t2.name and t1.ts between " +
>                          " t2.ts and t2.ts +  INTERVAL '3' MINUTE"
>                  );
>         Table resultTable = tableEnv.from("result_data_view");
>          DataStream<RowData> rowDataDataStream = tableEnv.toAppendStream(resultTable, RowData.class);
>          rowDataDataStream.print();
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23214) Make ShuffleMaster a cluster level shared service

2021-07-14 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-23214:

Parent: FLINK-22910
Issue Type: Sub-task  (was: Improvement)

> Make ShuffleMaster a cluster level shared service
> -
>
> Key: FLINK-23214
> URL: https://issues.apache.org/jira/browse/FLINK-23214
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination, Runtime / Network
>Affects Versions: 1.14.0
>Reporter: Yingjie Cao
>Priority: Major
>  Labels: pull-request-available
>
> This ticket tries to make the ShuffleMaster a cluster-level shared service, 
> which makes it consistent with the ShuffleEnvironment on the TM side.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23214) Make ShuffleMaster a cluster level shared service

2021-07-14 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-23214:

Parent: (was: FLINK-22672)
Issue Type: Improvement  (was: Sub-task)

> Make ShuffleMaster a cluster level shared service
> -
>
> Key: FLINK-23214
> URL: https://issues.apache.org/jira/browse/FLINK-23214
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination, Runtime / Network
>Affects Versions: 1.14.0
>Reporter: Yingjie Cao
>Priority: Major
>  Labels: pull-request-available
>
> This ticket tries to make the ShuffleMaster a cluster-level shared service, 
> which makes it consistent with the ShuffleEnvironment on the TM side.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22674) Provide JobID when apply shuffle resource by ShuffleMaster

2021-07-14 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-22674:

Parent: FLINK-22910
Issue Type: Sub-task  (was: Improvement)

> Provide JobID when apply shuffle resource by ShuffleMaster
> --
>
> Key: FLINK-22674
> URL: https://issues.apache.org/jira/browse/FLINK-22674
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Jin Xing
>Priority: Major
>  Labels: pull-request-available
>
> In the current Flink 'pluggable shuffle service' framework, only 
> PartitionDescriptor and ProducerDescriptor are included as parameters in 
> ShuffleMaster#registerPartitionWithProducer.
> But when extending a remote shuffle service based on the 'pluggable shuffle 
> service', the JobID is also needed when applying for shuffle resources from the 
> remote cluster (a sketch of the extended signature follows the list). It can be 
> used as an identifier linking shuffle resources to the corresponding job:
>  # The remote shuffle cluster can isolate or apply capacity control to shuffle 
> resources between jobs;
>  # The remote shuffle cluster can use the JobID to clean up shuffle data when a 
> job is lost, thus avoiding file leaks.
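> A hedged sketch of the extended registration call (a proposal only; the 
> current interface takes just the two descriptors):
> {code:java}
> CompletableFuture<? extends ShuffleDescriptor> registerPartitionWithProducer(
>         JobID jobID,
>         PartitionDescriptor partitionDescriptor,
>         ProducerDescriptor producerDescriptor);
> {code}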



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22674) Provide JobID when apply shuffle resource by ShuffleMaster

2021-07-14 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-22674:

Parent: (was: FLINK-22672)
Issue Type: Improvement  (was: Sub-task)

> Provide JobID when apply shuffle resource by ShuffleMaster
> --
>
> Key: FLINK-22674
> URL: https://issues.apache.org/jira/browse/FLINK-22674
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Reporter: Jin Xing
>Priority: Major
>  Labels: pull-request-available
>
> In the current Flink 'pluggable shuffle service' framework, only 
> PartitionDescriptor and ProducerDescriptor are included as parameters in 
> ShuffleMaster#registerPartitionWithProducer.
> But when extending a remote shuffle service based on the 'pluggable shuffle 
> service', the JobID is also needed when applying for shuffle resources from the 
> remote cluster. It can be used as an identifier linking shuffle resources to 
> the corresponding job:
>  # The remote shuffle cluster can isolate or apply capacity control to shuffle 
> resources between jobs;
>  # The remote shuffle cluster can use the JobID to clean up shuffle data when a 
> job is lost, thus avoiding file leaks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23249) Introduce ShuffleMasterContext to ShuffleMaster

2021-07-14 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-23249:

Parent: FLINK-22910
Issue Type: Sub-task  (was: Improvement)

> Introduce ShuffleMasterContext to ShuffleMaster
> ---
>
> Key: FLINK-23249
> URL: https://issues.apache.org/jira/browse/FLINK-23249
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination, Runtime / Network
>Reporter: Yingjie Cao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Introduce a ShuffleMasterContext to the ShuffleMaster. Just like the 
> ShuffleEnvironmentContext on the TaskManager side, the ShuffleMasterContext 
> can act as a proxy between the ShuffleMaster and other Flink components, such 
> as the ResourceManagerPartitionTracker.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23249) Introduce ShuffleMasterContext to ShuffleMaster

2021-07-14 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-23249:

Parent: (was: FLINK-22672)
Issue Type: Improvement  (was: Sub-task)

> Introduce ShuffleMasterContext to ShuffleMaster
> ---
>
> Key: FLINK-23249
> URL: https://issues.apache.org/jira/browse/FLINK-23249
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination, Runtime / Network
>Reporter: Yingjie Cao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Introduce a ShuffleMasterContext to the ShuffleMaster. Just like the 
> ShuffleEnvironmentContext on the TaskManager side, the ShuffleMasterContext 
> can act as a proxy between the ShuffleMaster and other Flink components, such 
> as the ResourceManagerPartitionTracker.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #16482: [FLINK-23369][docs][connector-hive][connector-kafka] Improve config options by using enums

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16482:
URL: https://github.com/apache/flink/pull/16482#issuecomment-879006434


   
   ## CI report:
   
   * 1558f0294c43ac98d7d9b00973f4952d4c2b2d18 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20465)
 
   
   
   Bot commands
   The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-22675) Add lifecycle methods to ShuffleMaster

2021-07-14 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-22675:

Parent: FLINK-22910
Issue Type: Sub-task  (was: Improvement)

> Add lifecycle methods to ShuffleMaster
> --
>
> Key: FLINK-22675
> URL: https://issues.apache.org/jira/browse/FLINK-22675
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Jin Xing
>Priority: Major
>
> When extending a remote shuffle service based on the 'pluggable shuffle 
> service', the ShuffleMaster talks to the remote cluster over a network 
> connection. This Jira proposes to add an interface method, ShuffleMaster#close, 
> which can be overridden to do cleanup work and will be called when the Flink 
> application is closed (see the sketch below).
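> A minimal sketch of the proposed hook (the default no-op body is an 
> assumption, chosen so that existing implementations keep compiling):
> {code:java}
> /** Closes this ShuffleMaster and releases external resources, e.g. network connections. */
> default void close() throws Exception {}
> {code}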



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-23379) interval left join null value result out of order

2021-07-14 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381086#comment-17381086
 ] 

Jingsong Lee edited comment on FLINK-23379 at 7/15/21, 6:42 AM:


There is no definition of a primary key in the SQL, so Flink does not know 
anything about the order.

[~qingru zhang]'s solution is right: you can add a deduplication after the 
interval join, as sketched below.
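A rough sketch of that deduplication step (the view and column names follow the 
snippet in the issue description; the partition key and ordering column are 
assumptions that must fit the actual schema):
{code:java}
tableEnv.executeSql(
        "CREATE TEMPORARY VIEW deduped_result_view AS "
                + "SELECT name, age, ts1, message, ts2 FROM ( "
                + "  SELECT *, ROW_NUMBER() OVER ( "
                + "    PARTITION BY name, ts1 ORDER BY ts2 DESC) AS rownum "
                + "  FROM result_data_view ) "
                + "WHERE rownum = 1");
{code}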


was (Author: lzljs3620320):
There is no definition of a primary key in the SQL, so Flink does not know 
anything about the order.

[~qingru zhang]'s solution is right: you can add a deduplication to turn it into 
a retraction stream with a primary key. But as far as I know, interval join does 
not support retraction input, so you may need to turn it into a regular join...

>  interval left join null value result out of order
> --
>
> Key: FLINK-23379
> URL: https://issues.apache.org/jira/browse/FLINK-23379
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.12.2
>Reporter: waywtdcc
>Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22675) Add lifecycle methods to ShuffleMaster

2021-07-14 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-22675:

Parent: (was: FLINK-22672)
Issue Type: Improvement  (was: Sub-task)

> Add lifecycle methods to ShuffleMaster
> --
>
> Key: FLINK-22675
> URL: https://issues.apache.org/jira/browse/FLINK-22675
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Reporter: Jin Xing
>Priority: Major
>
> When extending a remote shuffle service based on the 'pluggable shuffle 
> service', the ShuffleMaster talks to the remote cluster over a network 
> connection. This Jira proposes to add an interface method, ShuffleMaster#close, 
> which can be overridden to do cleanup work and will be called when the Flink 
> application is closed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-23379) interval left join null value result out of order

2021-07-14 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381086#comment-17381086
 ] 

Jingsong Lee commented on FLINK-23379:
--

There is no definition of a primary key in the SQL, so Flink does not know 
anything about the order.

[~qingru zhang]'s solution is right: you can add a deduplication to turn it into 
a retraction stream with a primary key. But as far as I know, interval join does 
not support retraction input, so you may need to turn it into a regular join...

>  interval left join null value result out of order
> --
>
> Key: FLINK-23379
> URL: https://issues.apache.org/jira/browse/FLINK-23379
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.12.2
>Reporter: waywtdcc
>Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22910) Refine ShuffleMaster lifecycle management for pluggable shuffle service framework

2021-07-14 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-22910:

Parent: (was: FLINK-22672)
Issue Type: Improvement  (was: Sub-task)

> Refine ShuffleMaster lifecycle management for pluggable shuffle service 
> framework
> -
>
> Key: FLINK-22910
> URL: https://issues.apache.org/jira/browse/FLINK-22910
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Yingjie Cao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> The current _ShuffleMaster_ has an unclear lifecycle which is inconsistent 
> with the _ShuffleEnvironment_ at the _TM_ side. Besides, it is hard to 
> implement some important capabilities for remote shuffle services, for 
> example: 1) releasing external resources when a job finishes; 2) stopping or 
> starting to track some partitions depending on the status of the external 
> service or system.
> We drafted a document[1] which proposed some simple changes to solve these 
> issues. The document is still not wholly completed yet. We will start a 
> discussion once it is finished.
>  
> [1] 
> https://docs.google.com/document/d/1_cHoapNbx_fJ7ZNraSqw4ZK1hMRiWWJDITuSZrdMDDs/edit?usp=sharing



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] jinxing64 edited a comment on pull request #16118: [FLINK-22676][coordination] The partition tracker stops tracking internal partitions when TM disconnects

2021-07-14 Thread GitBox


jinxing64 edited a comment on pull request #16118:
URL: https://github.com/apache/flink/pull/16118#issuecomment-880419048


   Thanks a lot for the concern @guoweiM. What do you think, @tillrohrmann and @zhuzhurk?
   It would be great to hear comments from you Flink masters. I will keep refining this PR if we can reach consensus on the idea.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-23379) interval left join null value result out of order

2021-07-14 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381084#comment-17381084
 ] 

Jingsong Lee commented on FLINK-23379:
--

Hi [~waywtdcc], the order in a distributed system is not guaranteed unless the 
stream is a retract or upsert stream.

Even with event time, only the alignment between watermarks is guaranteed; the 
data between watermarks may be out of order.

>  interval left join null value result out of order
> --
>
> Key: FLINK-23379
> URL: https://issues.apache.org/jira/browse/FLINK-23379
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.12.2
>Reporter: waywtdcc
>Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-23107) Separate deduplicate rank from rank functions

2021-07-14 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-23107.

Resolution: Fixed

master: 3abec22c53662e69f82e6bc0d15f28e77b712fe3

> Separate deduplicate rank from rank functions
> -
>
> Key: FLINK-23107
> URL: https://issues.apache.org/jira/browse/FLINK-23107
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Jingsong Lee
>Assignee: Shuo Cheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> SELECT * FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY d ORDER BY e DESC) 
> AS rownum from T) WHERE rownum=1
> Actually, the above SQL is a deduplication rather than a normal rank. We should 
> separate the implementations to optimize the deduplication case and reduce bugs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] JingsongLi merged pull request #16434: [FLINK-23107][table-runtime] Separate implementation of deduplicate r…

2021-07-14 Thread GitBox


JingsongLi merged pull request #16434:
URL: https://github.com/apache/flink/pull/16434


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-23208) Late processing timers need to wait 1ms at least to be fired

2021-07-14 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381076#comment-17381076
 ] 

Piotr Nowojski edited comment on FLINK-23208 at 7/15/21, 6:29 AM:
--

Great, thanks [~wind_ljy]. Yes, flink-benchmarks. Maybe we can write a 
job/operator that registers 1_000_000 timers and finishes once they are all 
processed? A rough sketch of such an operator is below.
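A very rough sketch (the operator shape, timer spacing, and completion signal 
are all assumptions, not an agreed benchmark design):
{code:java}
public class TimerFiringBenchmarkFunction
        extends KeyedProcessFunction<Integer, Integer, Long> {

    private static final int NUM_TIMERS = 1_000_000;
    private transient int firedTimers;

    @Override
    public void processElement(Integer value, Context ctx, Collector<Long> out) {
        // Register one processing-time timer per millisecond, starting now.
        long now = ctx.timerService().currentProcessingTime();
        for (int i = 0; i < NUM_TIMERS; i++) {
            ctx.timerService().registerProcessingTimeTimer(now + i);
        }
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Long> out) {
        // Emit once after the last timer has fired, so the benchmark job can finish.
        if (++firedTimers == NUM_TIMERS) {
            out.collect(ctx.timerService().currentProcessingTime());
        }
    }
}
{code}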


was (Author: pnowojski):
Great, thanks [~wind_ljy]. Yes, flink-benchmarks. Maybe we can write a 
job/operator that registers 10_000_000 timers and finishes once they are all 
processed?

> Late processing timers need to wait 1ms at least to be fired
> 
>
> Key: FLINK-23208
> URL: https://issues.apache.org/jira/browse/FLINK-23208
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Runtime / Task
>Affects Versions: 1.11.0, 1.11.3, 1.13.0, 1.14.0, 1.12.4
>Reporter: Jiayi Liao
>Assignee: Jiayi Liao
>Priority: Critical
>  Labels: critical
> Attachments: screenshot-1.png
>
>
> The problem is from the codes below:
> {code:java}
> public static long getProcessingTimeDelay(long processingTimestamp, long 
> currentTimestamp) {
>   // delay the firing of the timer by 1 ms to align the semantics with 
> watermark. A watermark
>   // T says we won't see elements in the future with a timestamp smaller 
> or equal to T.
>   // With processing time, we therefore need to delay firing the timer by 
> one ms.
>   return Math.max(processingTimestamp - currentTimestamp, 0) + 1;
> }
> {code}
> Assuming a Flink job creates 1 timer per millisecond and is able to consume 
> 1 timer/ms, here is what will happen: 
> * Timestamp1 (1st ms): timer1 is registered and will be triggered on 
> Timestamp2. 
> * Timestamp2(2nd ms): timer2 is registered and timer1 is triggered
> * Timestamp3(3rd ms): timer3 is registered and timer1 is consumed, after 
> this, {{InternalTimerServiceImpl}} registers next timer, which is timer2, and 
> timer2 will be triggered on Timestamp4(wait 1ms at least)
> * Timestamp4(4th ms): timer4 is registered and timer2 is triggered
> * Timestamp5(5th ms): timer5 is registered and timer2 is consumed, after 
> this, {{InternalTimerServiceImpl}} registers next timer, which is timer3, and 
> timer3 will be triggered on Timestamp6(wait 1ms at least)
> As we can see, the Flink job should be able to consume 1 timer/ms, but it 
> actually consumes only 0.5 timers/ms. Another problem is that we cannot 
> observe the delay from the lag metrics of the source (Kafka). Instead, 
> what we can tell is that the moment of output is much later than expected. 
> I've added a metric in our internal version, and we can see the lag of the 
> timer triggering keeps increasing: 
>  !screenshot-1.png! 
> *In other words, we should never make a late processing timer wait 1 ms. I 
> think a simple change would be as below:*
> {code:java}
> return Math.max(processingTimestamp - currentTimestamp, -1) + 1;
> {code}
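> For example, for a timer that is already 5 ms late (processingTimestamp - 
> currentTimestamp = -5), the current formula returns max(-5, 0) + 1 = 1, i.e. 
> it still waits another millisecond, while the proposed one returns 
> max(-5, -1) + 1 = 0 and fires it immediately.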



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-23208) Late processing timers need to wait 1ms at least to be fired

2021-07-14 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381076#comment-17381076
 ] 

Piotr Nowojski commented on FLINK-23208:


Great, thank [~wind_ljy]. Yes flink-benchmarks. Maybe we can write a 
job/operator that registers 10_000_000 timers and finishes once they are 
processed somehow?

> Late processing timers need to wait 1ms at least to be fired
> 
>
> Key: FLINK-23208
> URL: https://issues.apache.org/jira/browse/FLINK-23208
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Runtime / Task
>Affects Versions: 1.11.0, 1.11.3, 1.13.0, 1.14.0, 1.12.4
>Reporter: Jiayi Liao
>Assignee: Jiayi Liao
>Priority: Critical
>  Labels: critical
> Attachments: screenshot-1.png
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-23208) Late processing timers need to wait 1ms at least to be fired

2021-07-14 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski reassigned FLINK-23208:
--

Assignee: Jiayi Liao

> Late processing timers need to wait 1ms at least to be fired
> 
>
> Key: FLINK-23208
> URL: https://issues.apache.org/jira/browse/FLINK-23208
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Runtime / Task
>Affects Versions: 1.11.0, 1.11.3, 1.13.0, 1.14.0, 1.12.4
>Reporter: Jiayi Liao
>Assignee: Jiayi Liao
>Priority: Critical
>  Labels: critical
> Attachments: screenshot-1.png
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] dmvk commented on pull request #16487: [FLINK-22483][runtime][coordination] Recover checkpoints when JobMaster gains leadership

2021-07-14 Thread GitBox


dmvk commented on pull request #16487:
URL: https://github.com/apache/flink/pull/16487#issuecomment-880432234


   Hi @edu05, for CI failures, from a quick look, the flaky tests you're 
hitting have been fixed super recently (~ tuesday). Can you please try to 
rebase this on master?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22483) Recover checkpoints when JobMaster gains leadership

2021-07-14 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381072#comment-17381072
 ] 

Robert Metzger commented on FLINK-22483:


It seems like the 
{{CheckpointStoreITCase.testRestartOnRecoveryFailure(CheckpointStoreITCase.java:93)}}
 test is hanging (if you scroll further up, you see that the "main" thread is 
stuck in this method).
You can download the full logs of that CI run to get the output of the hanging 
test. Most likely, the test output will show what's going wrong.


> Recover checkpoints when JobMaster gains leadership
> ---
>
> Key: FLINK-22483
> URL: https://issues.apache.org/jira/browse/FLINK-22483
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Robert Metzger
>Assignee: Eduardo Winpenny Tejedor
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Recovering checkpoints (from the CompletedCheckpointStore) is a potentially 
> long-lasting/blocking operation, for example if the file system 
> implementation is retrying to connect to an unavailable storage backend.
> Currently, we are calling the CompletedCheckpointStore.recover() method from 
> the main thread of the JobManager, making it unresponsive to any RPC call 
> while the recover method is blocked:
> {code}
> 2021-04-02 20:33:31,384 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job XXX 
> switched from state RUNNING to RESTARTING.
> com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to 
> minio.minio.svc:9000 [minio.minio.svc/] failed: Connection refused 
> (Connection refused)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1207)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1153)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
>  ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550) ~[?:?]
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530) ~[?:?]
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5062) 
> ~[?:?]
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008) 
> ~[?:?]
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1490) 
> ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$openStream$1(PrestoS3FileSystem.java:905)
>  ~[?:?]
>   at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:902)
>  ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:887)
>  ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.seekStream(PrestoS3FileSystem.java:880)
>  ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$read$0(PrestoS3FileSystem.java:819)
>  ~[?:?]
>   at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
>   at 
> com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.read(PrestoS3FileSystem.java:818)
>  ~[?:?]
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:284) 
> ~[?:1.8.0_282]
>   at XXX.recover(KubernetesHaCheckpointStore.java:69) 
> ~[vvp-flink-ha-kubernetes-flink112-1.1.0.jar:?]
>   at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedStateInternal(CheckpointCoordinator.java:1511)
>  ~[flink-dist_2.12-1.12.2-stream1.jar:1.12.2-stream1]
>   at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedStateToAll(CheckpointCoordinator.java:1451)
>  ~[flink-dist_2.12-1.12.2-stream1.jar:1.12.2-stream1]
>   at 
> org.apache.flink.runtime.scheduler.SchedulerBase.restoreState(SchedulerBase.java:421)
>  ~[flink-dist_2.12-1.12.2-stream1.jar:1.12.2-stream1]
>   at 
> org.apache.flink.runtime.scheduler.Defaul
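
The underlying problem is a blocking call on the JobMaster's main thread. Below 
is a minimal sketch of the general remedy, under stated assumptions: the helper 
names (recoverBlocking, applyOnMainThread) are hypothetical and plain 
java.util.concurrent executors stand in for Flink's actual classes. The idea is 
to offload the recovery to an I/O executor and apply the result back on the 
main thread:

{code:java}
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class RecoveryOffloadSketch {

    // Stand-in for the potentially blocking checkpoint-store recovery.
    private static List<String> recoverBlocking() {
        // May retry for a long time against an unavailable storage backend.
        return List.of("checkpoint-42");
    }

    // Mutations of JobMaster/scheduler state must stay on the main thread.
    private static void applyOnMainThread(List<String> checkpoints) {
        System.out.println("recovered " + checkpoints);
    }

    static CompletableFuture<Void> recoverAsync(Executor ioExecutor, Executor mainThread) {
        return CompletableFuture
                .supplyAsync(RecoveryOffloadSketch::recoverBlocking, ioExecutor)
                .thenAcceptAsync(RecoveryOffloadSketch::applyOnMainThread, mainThread);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService io = Executors.newSingleThreadExecutor();
        ExecutorService main = Executors.newSingleThreadExecutor(); // stands in for the RPC main thread
        recoverAsync(io, main).get();
        io.shutdown();
        main.shutdown();
    }
}
{code}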

[GitHub] [flink] rmetzger commented on a change in pull request #16487: [FLINK-22483][runtime][coordination] Recover checkpoints when JobMaster gains leadership

2021-07-14 Thread GitBox


rmetzger commented on a change in pull request #16487:
URL: https://github.com/apache/flink/pull/16487#discussion_r670168083



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/SchedulerUtilsTest.java
##
@@ -21,17 +21,30 @@
 import org.apache.flink.api.common.JobID;
 import org.apache.flink.configuration.CheckpointingOptions;
 import org.apache.flink.configuration.Configuration;
+import org.apache.flink.runtime.checkpoint.CheckpointRecoveryFactory;
 import org.apache.flink.runtime.checkpoint.CompletedCheckpointStore;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.jobgraph.tasks.JobCheckpointingSettings;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.verify;
+import static org.mockito.Mockito.when;

Review comment:
   We try to not use Mocking in Flink (only in very rare cases, such as 
external systems not providing proper interfaces). See: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#design-for-testability
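
   As a hedged illustration of that guideline, a hand-written stub can replace 
the mock; the `CheckpointStoreFactory` interface and all names below are 
invented for the example and are not Flink APIs:

{code:java}
// Illustrative interface, invented for this example.
interface CheckpointStoreFactory {
    String createStore(String jobId);
}

// Hand-written test stub replacing mock()/when()/verify():
final class RecordingCheckpointStoreFactory implements CheckpointStoreFactory {
    String lastJobId; // recorded for assertions, instead of verify(...)

    @Override
    public String createStore(String jobId) {
        lastJobId = jobId;
        return "store-for-" + jobId; // canned answer, instead of when(...).thenReturn(...)
    }
}
{code}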




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-23385) Fix nullability of COALESCE

2021-07-14 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-23385:
-
Summary: Fix nullability of COALESCE  (was: 
org.apache.flink.table.api.TableException when using REGEXP_EXTRACT)

> Fix nullability of COALESCE
> ---
>
> Key: FLINK-23385
> URL: https://issues.apache.org/jira/browse/FLINK-23385
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.1
>Reporter: Maciej Bryński
>Priority: Major
>
> EDIT: Simpler case:
> {code:java}
> SELECT COALESCE(REGEXP_EXTRACT('22','[A-Z]+'),'-');
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.flink.table.api.TableException: Column 'EXPR$0' is NOT NULL, 
> however, a null value is being written into it. You can set job configuration 
> 'table.exec.sink.not-null-enforcer'='drop' to suppress this exception and 
> drop such records silently.
> {code}
> When using REGEXP_EXTRACT on NOT NULL column I'm getting following exception
> {code:java}
> select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from test limit 10
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.flink.table.api.TableException: Column 'EXPR$0' is NOT NULL, 
> however, a null value is being written into it. You can set job configuration 
> 'table.exec.sink.not-null-enforcer'='drop' to suppress this exception and 
> drop such records silently.
> {code}
> I think the reason is that the nullability of the result is wrongly calculated.
>  Example:
> {code:java}
> create table test (
>  test STRING NOT NULL
> ) WITH (
> 'connector' = 'datagen'
> );
> explain select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from test
> == Abstract Syntax Tree ==
> LogicalProject(EXPR$0=[REGEXP_EXTRACT($0, _UTF-16LE'[A-Z]+')])
> +- LogicalTableScan(table=[[default_catalog, default_database, test]])== 
> Optimized Physical Plan ==
> Calc(select=[REGEXP_EXTRACT(test, _UTF-16LE'[A-Z]+') AS EXPR$0])
> +- TableSourceScan(table=[[default_catalog, default_database, test]], 
> fields=[test])== Optimized Execution Plan ==
> Calc(select=[REGEXP_EXTRACT(test, _UTF-16LE'[A-Z]+') AS EXPR$0])
> +- TableSourceScan(table=[[default_catalog, default_database, test]], 
> fields=[test]){code}
> As you can see, Flink is removing COALESCE from the query, which is wrong.
>  
> Same for view (null = false):
> {code:java}
> create view v as select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from 
> test
> describe v;
> +++---+-++---+
> |   name |   type |  null | key | extras | watermark |
> +++---+-++---+
> | EXPR$0 | STRING | false | ||   |
> +++---+-++---+
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23392) Benchmarks should exclude startup/shutdown time

2021-07-14 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-23392:
-
Description: 
The benchmarks currently simply run against a local environment, which means 
that the benchmark time includes how long it takes for Flink to start and shut 
down.
This creates noise in the benchmark results, and can result in serious 
performance regressions being reported that are not actually performance 
related.

Instead, the benchmarks should set up a separate MiniCluster and run the 
job against that.

  was:
The benchmarks currently simply run against a local environment, which means 
that the benchmark time includes how long it takes for Flink to start and shut 
down. For most of the benchmarks the actual processing only accounts for a 
small fraction of the current benchmark duration.
This creates noise in the benchmark results, and can result in serious 
performance regressions being reported that are not actually performance 
related.

Instead, the benchmarks should set up a separate MiniCluster and run the 
job against that.


> Benchmarks should exclude startup/shutdown time
> ---
>
> Key: FLINK-23392
> URL: https://issues.apache.org/jira/browse/FLINK-23392
> Project: Flink
>  Issue Type: Bug
>  Components: Benchmarks
>Reporter: Chesnay Schepler
>Priority: Major
>
> The benchmarks currently simply run against a local environment, which means 
> that the benchmark time includes how long it takes for Flink to start and 
> shut down.
> This creates noise in the benchmark results, and can result in serious 
> performance regressions being reported that are not actually performance 
> related.
> Instead, the benchmarks should set up a separate MiniCluster and run the 
> job against that.
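
A hedged sketch of that direction, using Flink's MiniClusterWithClientResource 
test utility (the exact setup in the benchmarks may differ), keeps cluster 
startup and shutdown outside the measured region:

{code:java}
import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
import org.apache.flink.test.util.MiniClusterWithClientResource;

public class MiniClusterBenchmarkSketch {
    public static void main(String[] args) throws Exception {
        // Start the cluster once, before measuring.
        MiniClusterWithClientResource cluster =
                new MiniClusterWithClientResource(
                        new MiniClusterResourceConfiguration.Builder()
                                .setNumberTaskManagers(1)
                                .setNumberSlotsPerTaskManager(4)
                                .build());
        cluster.before();
        try {
            long start = System.nanoTime();
            // ... submit the benchmark job via cluster.getClusterClient() ...
            long measuredNanos = System.nanoTime() - start; // excludes startup/shutdown
            System.out.println("measured: " + measuredNanos + " ns");
        } finally {
            cluster.after(); // shutdown is also outside the measured region
        }
    }
}
{code}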



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-23385) Fix nullability of COALESCE

2021-07-14 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381070#comment-17381070
 ] 

Timo Walther commented on FLINK-23385:
--

I renamed the issue. There are a couple of functions that have an invalid type 
inference.

> Fix nullability of COALESCE
> ---
>
> Key: FLINK-23385
> URL: https://issues.apache.org/jira/browse/FLINK-23385
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.1
>Reporter: Maciej Bryński
>Priority: Major
>
> EDIT: Simpler case:
> {code:java}
> SELECT COALESCE(REGEXP_EXTRACT('22','[A-Z]+'),'-');
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.flink.table.api.TableException: Column 'EXPR$0' is NOT NULL, 
> however, a null value is being written into it. You can set job configuration 
> 'table.exec.sink.not-null-enforcer'='drop' to suppress this exception and 
> drop such records silently.
> {code}
> When using REGEXP_EXTRACT on NOT NULL column I'm getting following exception
> {code:java}
> select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from test limit 10
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.flink.table.api.TableException: Column 'EXPR$0' is NOT NULL, 
> however, a null value is being written into it. You can set job configuration 
> 'table.exec.sink.not-null-enforcer'='drop' to suppress this exception and 
> drop such records silently.
> {code}
> I think the reason is that the nullability of the result is wrongly calculated.
>  Example:
> {code:java}
> create table test (
>  test STRING NOT NULL
> ) WITH (
> 'connector' = 'datagen'
> );
> explain select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from test
> == Abstract Syntax Tree ==
> LogicalProject(EXPR$0=[REGEXP_EXTRACT($0, _UTF-16LE'[A-Z]+')])
> +- LogicalTableScan(table=[[default_catalog, default_database, test]])== 
> Optimized Physical Plan ==
> Calc(select=[REGEXP_EXTRACT(test, _UTF-16LE'[A-Z]+') AS EXPR$0])
> +- TableSourceScan(table=[[default_catalog, default_database, test]], 
> fields=[test])== Optimized Execution Plan ==
> Calc(select=[REGEXP_EXTRACT(test, _UTF-16LE'[A-Z]+') AS EXPR$0])
> +- TableSourceScan(table=[[default_catalog, default_database, test]], 
> fields=[test]){code}
> As you can see, Flink is removing COALESCE from the query, which is wrong.
>  
> Same for view (null = false):
> {code:java}
> create view v as select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from 
> test
> describe v;
> +++---+-++---+
> |   name |   type |  null | key | extras | watermark |
> +++---+-++---+
> | EXPR$0 | STRING | false | ||   |
> +++---+-++---+
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #16482: [FLINK-23369][docs][connector-hive][connector-kafka] Improve config options by using enums

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16482:
URL: https://github.com/apache/flink/pull/16482#issuecomment-879006434


   
   ## CI report:
   
   * 0a50b6ae5dc78a0a8d03880c9b6e6aca83552219 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20429)
 
   * 1558f0294c43ac98d7d9b00973f4952d4c2b2d18 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16434: [FLINK-23107][table-runtime] Separate implementation of deduplicate r…

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16434:
URL: https://github.com/apache/flink/pull/16434#issuecomment-876881346


   
   ## CI report:
   
   * 12f7fa66c5185c7a3375bf4cbd19b2df1f82a04a Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20460)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16337: [FLINK-22843][docs-zh]Document and code are inconsistent.

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16337:
URL: https://github.com/apache/flink/pull/16337#issuecomment-871994148


   
   ## CI report:
   
   * b7d78128ac4f1d7b8fa086c82748439928a10fee Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20417)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20419)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20462)
 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20461)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20464)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-23392) Benchmarks should exclude startup/shutdown time

2021-07-14 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23392:


 Summary: Benchmarks should exclude startup/shutdown time
 Key: FLINK-23392
 URL: https://issues.apache.org/jira/browse/FLINK-23392
 Project: Flink
  Issue Type: Bug
  Components: Benchmarks
Reporter: Chesnay Schepler


The benchmarks currently simply run against a local environment, which means 
that the benchmark time includes how long it takes for Flink to start and shut 
down. For most of the benchmarks the actual processing only accounts for a 
small fraction of the current benchmark duration.
This creates noise in the benchmark results, and can result in serious 
performance regressions being reported that are not actually performance 
related.

Instead, the benchmarks should set up a separate MiniCluster and run the 
job against that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15899: [FLINK-22627][runtime] Remove SlotManagerImpl

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #15899:
URL: https://github.com/apache/flink/pull/15899#issuecomment-839417893


   
   ## CI report:
   
   * d3d33b6b0e90ea1bb8f4597940598db7f2de5c11 UNKNOWN
   * 75dd900bfb5480c29d82cbe68e11dfde3b9a57e9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20459)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-16093) Translate "System Functions" page of "Functions" into Chinese

2021-07-14 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu closed FLINK-16093.
---
Fix Version/s: 1.14.0
   Resolution: Fixed

Fixed in master: 1a195f5cc597b95c16adb2f4452622769c6480c6

> Translate "System Functions" page of "Functions" into Chinese 
> --
>
> Key: FLINK-16093
> URL: https://issues.apache.org/jira/browse/FLINK-16093
> Project: Flink
>  Issue Type: Sub-task
>  Components: chinese-translation, Documentation
>Reporter: Jark Wu
>Assignee: ZhiJie Yang
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.14.0
>
> Attachments: image-2021-07-02-11-26-04-216.png
>
>
> The page url is 
> https://ci.apache.org/projects/flink/flink-docs-master/zh/dev/table/functions/systemFunctions.html
> The markdown file is located in 
> {{flink/docs/dev/table/functions/systemFunctions.zh.md}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wuchong merged pull request #16348: [FLINK-16093][docs-zh]Translate "System Functions" page of "Functions" into Chinese

2021-07-14 Thread GitBox


wuchong merged pull request #16348:
URL: https://github.com/apache/flink/pull/16348


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-12415) Translate "History Server" page into Chinese

2021-07-14 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-12415:
---

Assignee: wangzhao

> Translate "History Server" page into Chinese
> 
>
> Key: FLINK-12415
> URL: https://issues.apache.org/jira/browse/FLINK-12415
> Project: Flink
>  Issue Type: Sub-task
>  Components: chinese-translation, Documentation
>Reporter: Armstrong Nova
>Assignee: wangzhao
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Translate 
> "[https://ci.apache.org/projects/flink/flink-docs-master/monitoring/historyserver.html]";
>  page into Chinese.
> This doc located in "flink/docs/monitoring/historyserver.zh.md"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-12415) Translate "History Server" page into Chinese

2021-07-14 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-12415:
---

Assignee: wangzhao  (was: wangzhao)

> Translate "History Server" page into Chinese
> 
>
> Key: FLINK-12415
> URL: https://issues.apache.org/jira/browse/FLINK-12415
> Project: Flink
>  Issue Type: Sub-task
>  Components: chinese-translation, Documentation
>Reporter: Armstrong Nova
>Assignee: wangzhao
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Translate 
> "[https://ci.apache.org/projects/flink/flink-docs-master/monitoring/historyserver.html]";
>  page into Chinese.
> This doc located in "flink/docs/monitoring/historyserver.zh.md"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22871) Support to execute PyFlink jobs in YARN application mode

2021-07-14 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381059#comment-17381059
 ] 

Dian Fu commented on FLINK-22871:
-

[~wukong] What's the status of this JIRA? I'm asking because we're approaching 
the feature freeze of 1.14 at the end of this month. It would be great if we 
could include this feature in 1.14.

> Support to execute PyFlink jobs in YARN application mode
> 
>
> Key: FLINK-22871
> URL: https://issues.apache.org/jira/browse/FLINK-22871
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Python, Deployment / YARN
>Reporter: konwu
>Priority: Major
>  Labels: auto-unassigned
> Fix For: 1.14.0
>
>
> For now PyFlink does not support the Hadoop YARN application mode, because 
> the YARN NodeManager may not have a suitable Python version.
> After testing with a venv (Python virtual environment) uploaded via the 
> 'python.files' property and then changing the 'env.pythonExec' path, it also 
> works.
> So, is there a suitable way to support this?
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] jinxing64 commented on pull request #16118: [FLINK-22676][coordination] The partition tracker stops tracking internal partitions when TM disconnects

2021-07-14 Thread GitBox


jinxing64 commented on pull request #16118:
URL: https://github.com/apache/flink/pull/16118#issuecomment-880419048


   Thanks a lot for the concern @guoweiM. What do you think, @tillrohrmann and 
@zhuzhurk?
   It would be great to hear comments from you Flink masters. I will keep 
refining this PR if we can reach consensus on the idea.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] jinxing64 commented on pull request #16118: [FLINK-22676][coordination] The partition tracker stops tracking internal partitions when TM disconnects

2021-07-14 Thread GitBox


jinxing64 commented on pull request #16118:
URL: https://github.com/apache/flink/pull/16118#issuecomment-880417258


   Thanks for the comments @guoweiM. 
   Currently there are two data structures in JobMasterPartitionTrackerImpl:
   1. `partitionTable`, which maintains the mapping from tmId to the result 
partitions it produced;
   2. `partitionInfos`, which records all the partitions under tracking;
   
   When a TM is gone, its partitions in `partitionTable` are cleared, which 
means no partitions are accommodated on it;
   At the same time, only TM-internal partitions are cleared from 
`partitionInfos`, while external partitions are kept -- external partitions 
remain available and can be decoupled from the lifecycle of the TM;
   I think that's where the `inconsistencies` you mentioned come from. And I 
agree that it takes words to explain the underlying mechanism, which might be 
hard to understand for newcomers;
   
   > If JobMasterPartitionTracker has found that 
ResultPartitionDeploymentDescriptor is not an internal shuffle, don’t maintain 
the mapping from TM to ResultPartition at the beginning.
   
   I agree with your proposal -- `partitionTable` only maintains TM-internal 
partitions from the start, while `partitionInfos` records all external and 
internal partitions. With this change the semantics of the interfaces below 
will be:
   1. `boolean isTrackingPartitionsFor(K key);`: whether there are internal 
partitions tracked for the `key`;
   2. `boolean isPartitionTracked(ResultPartitionID resultPartitionID)`: whether 
the resultPartitionID is tracked at all, internally or externally
   
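   A minimal sketch of that proposed bookkeeping, with simplified placeholder 
types instead of Flink's real classes:

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// partitionTable keeps only TM-internal partitions per TaskManager;
// partitionInfos keeps all tracked partitions, internal and external.
final class PartitionTrackerSketch<TM, P> {
    private final Map<TM, Set<P>> partitionTable = new HashMap<>();
    private final Map<P, Boolean> partitionInfos = new HashMap<>(); // value: internal?

    void startTracking(TM tm, P partition, boolean internal) {
        partitionInfos.put(partition, internal);
        if (internal) { // external partitions outlive the TM, so skip the TM mapping
            partitionTable.computeIfAbsent(tm, k -> new HashSet<>()).add(partition);
        }
    }

    void onTaskManagerGone(TM tm) {
        // Internal partitions die with the TM; external ones stay tracked.
        Set<P> internals = partitionTable.remove(tm);
        if (internals != null) {
            internals.forEach(partitionInfos::remove);
        }
    }

    boolean isTrackingPartitionsFor(TM tm) {   // internal partitions only
        return partitionTable.containsKey(tm);
    }

    boolean isPartitionTracked(P partition) {  // internal or external
        return partitionInfos.containsKey(partition);
    }
}
{code}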


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-23379) interval left join null value result out of order

2021-07-14 Thread waywtdcc (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381048#comment-17381048
 ] 

waywtdcc commented on FLINK-23379:
--

[~qingru zhang] Hello, I understand what you mean, but I don't think this 
needs to be a retract stream. A retract stream updates historical results when 
late data arrives, whereas the null value here is known early and could be 
output in advance. What do you think?

>  interval left join null value result out of order
> --
>
> Key: FLINK-23379
> URL: https://issues.apache.org/jira/browse/FLINK-23379
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.12.2
>Reporter: waywtdcc
>Priority: Major
>
> * Scenario:
>  The person main table is left interval joined with the associated message 
> information table.
>  A record that finds no match in the message information table is emitted 
> later than a subsequent record that does find a match.
>  When a normal output and a null-value output share the same primary key, 
> they arrive out of order: the null-value output comes later than the normal 
> output, resulting in incorrect results.
> Input:
> {"id": 1, "name":"chencc2", "message": "good boy2", "ts":"2021-03-26 
> 18:56:43"}
> {"id": 1, "name":"chencc2", "age": "28", "ts":"2021-03-26 19:02:47"}
> {"id": 1, "name":"chencc2", "message": "good boy3", "ts":"2021-03-26 
> 19:06:43"}
> {"id": 1, "name":"chencc2", "age": "27", "ts":"2021-03-26 19:06:47"}
> Output:
>  +I(chencc2,27,2021-03-26T19:06:47,good boy3,2021-03-26 19:06:43.000)
>  +I(chencc2,28,2021-03-26T19:02:47,null,null)
>  Here the timestamp of the second record, 19:02, is earlier than that of the 
> first record, but it is output later, causing data update errors
>  
>  *  code
> {code:java}
> tableEnv.executeSql("drop table if exists persons_table_kafka2");
>          String kafka_source_sql = "CREATE TABLE persons_table_kafka2 (\n" +
>                  "  `id` BIGINT,\n" +
>                  "  `name` STRING,\n" +
>                  "  `age` INT,\n" +
>                  "  proctime as PROCTIME(),\n" +
>                  "  `ts` TIMESTAMP(3),\n" +
>                  " WATERMARK FOR ts AS ts\n" +
>                  ") WITH (\n" +
>                  "  'connector' = 'kafka',\n" +
>                  "  'topic' = 'persons_test2',\n" +
>                  "  'properties.bootstrap.servers' = 'node2:6667',\n" +
>                  "  'properties.group.id' = 'testGroa115',\n" +
>                  "  'scan.startup.mode' = 'group-offsets',\n" +
>                  "  'format' = 'json'\n" +
>                  ")";
>          tableEnv.executeSql(kafka_source_sql);
>         tableEnv.executeSql("drop table if exists 
> persons_message_table_kafka2");
>          String kafka_source_sql2 = "CREATE TABLE 
> persons_message_table_kafka2 (\n" +
>                  "  `id` BIGINT,\n" +
>                  "  `name` STRING,\n" +
>                  "  `message` STRING,\n" +
>                  "  `ts` TIMESTAMP(3) ," +
>  //                " WATERMARK FOR ts AS ts - INTERVAL '5' SECOND\n" +
>                  " WATERMARK FOR ts AS ts\n" +
>                  ") WITH (\n" +
>                  "  'connector' = 'kafka',\n" +
>                  "  'topic' = 'persons_extra_message2',\n" +
>                  "  'properties.bootstrap.servers' = 'node2:6667',\n" +
>                  "  'properties.group.id' = 'testGroud2e313',\n" +
>                  "  'scan.startup.mode' = 'group-offsets',\n" +
>                  "  'format' = 'json'\n" +
>                  ")";
>          tableEnv.executeSql(kafka_source_sql2);
>         tableEnv.executeSql("" +
>                  "CREATE TEMPORARY VIEW result_data_view " +
>                  " as " +
>                  " select " +
>                  " t1.name, t1.age,t1.ts as ts1,t2.message, cast (t2.ts as 
> string) as ts2 " +
>                  " from persons_table_kafka2 t1 " +
>                  " left  join persons_message_table_kafka2 t2 on t1.name = 
> t2.name and t1.ts between " +
>                          " t2.ts and t2.ts +  INTERVAL '3' MINUTE"
>                  );
>         Table resultTable = tableEnv.from("result_data_view");
>          DataStream rowDataDataStream = 
> tableEnv.toAppendStream(resultTable, RowData.class);
>          rowDataDataStream.print();
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #16337: [FLINK-22843][docs-zh]Document and code are inconsistent.

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16337:
URL: https://github.com/apache/flink/pull/16337#issuecomment-871994148


   
   ## CI report:
   
   * b7d78128ac4f1d7b8fa086c82748439928a10fee Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20417)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20419)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20462)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20464)
 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20461)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16184: [FLINK-21089] Skip the execution of the fully finished operators after recovery

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16184:
URL: https://github.com/apache/flink/pull/16184#issuecomment-863123325


   
   ## CI report:
   
   * 88b40ab4dc9e1c8d62b8ac0eaf2d87340c1feb8e Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20453)
 
   * b73f0008d84db9b3eee077204929c65c14950179 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20463)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16337: [FLINK-22843][docs-zh]Document and code are inconsistent.

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16337:
URL: https://github.com/apache/flink/pull/16337#issuecomment-871994148


   
   ## CI report:
   
   * b7d78128ac4f1d7b8fa086c82748439928a10fee Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20417)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20461)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20419)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20462)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20464)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16184: [FLINK-21089] Skip the execution of the fully finished operators after recovery

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16184:
URL: https://github.com/apache/flink/pull/16184#issuecomment-863123325


   
   ## CI report:
   
   * 88b40ab4dc9e1c8d62b8ac0eaf2d87340c1feb8e Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20453)
 
   * b73f0008d84db9b3eee077204929c65c14950179 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-23379) interval left join null value result out of order

2021-07-14 Thread JING ZHANG (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381003#comment-17381003
 ] 

JING ZHANG commented on FLINK-23379:


[~waywtdcc] You're right; however, you could solve your problem by adding a 
[deduplication|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/deduplication]
 based on event time.

The interval join holds back the watermark, so the padding result for 
mismatched records would not be considered a late event by the following 
operators.

Besides, if we updated the behavior of IntervalJoin to send the padding result 
immediately, the result stream of the interval join would be a retract stream 
instead of an append stream. That modification would restrict the use cases of 
interval join, because many existing operators cannot accept an update stream 
as input (e.g. Deduplicate/IntervalJoin/OverAgg/WindowAgg/WindowJoin/WindowRank), 
and the rank would be routed to a poorly performing implementation.

What do you think?
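
For illustration, the suggested workaround could look like the hedged sketch 
below. It reuses the report's table and column names, and it assumes that 
`ts1` is still a time attribute after the interval join, which must be 
verified in the actual job:

{code:java}
// Editorial sketch: keep only the latest row per key by event time, so the
// earlier null-padded row cannot override a later matched row downstream.
// tableEnv is the table environment from the reported job.
tableEnv.executeSql(
        "CREATE TEMPORARY VIEW deduped_result AS " +
        "SELECT name, age, ts1, message, ts2 FROM ( " +
        "  SELECT *, ROW_NUMBER() OVER ( " +
        "    PARTITION BY name ORDER BY ts1 DESC) AS rownum " +
        "  FROM result_data_view " +
        ") WHERE rownum = 1");
{code}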

 

 

>  interval left join null value result out of order
> --
>
> Key: FLINK-23379
> URL: https://issues.apache.org/jira/browse/FLINK-23379
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.12.2
>Reporter: waywtdcc
>Priority: Major
>
> * Scenario:
>  The person main table is left interval joined with the associated message 
> information table.
>  A record that finds no match in the message information table is emitted 
> later than a subsequent record that does find a match.
>  When a normal output and a null-value output share the same primary key, 
> they arrive out of order: the null-value output comes later than the normal 
> output, resulting in incorrect results.
> Input:
> {"id": 1, "name":"chencc2", "message": "good boy2", "ts":"2021-03-26 
> 18:56:43"}
> {"id": 1, "name":"chencc2", "age": "28", "ts":"2021-03-26 19:02:47"}
> {"id": 1, "name":"chencc2", "message": "good boy3", "ts":"2021-03-26 
> 19:06:43"}
> {"id": 1, "name":"chencc2", "age": "27", "ts":"2021-03-26 19:06:47"}
> Output:
>  +I(chencc2,27,2021-03-26T19:06:47,good boy3,2021-03-26 19:06:43.000)
>  +I(chencc2,28,2021-03-26T19:02:47,null,null)
>  Here the timestamp of the second record, 19:02, is earlier than that of the 
> first record, but it is output later, causing data update errors
>  
>  *  code
> {code:java}
> tableEnv.executeSql("drop table if exists persons_table_kafka2");
>          String kafka_source_sql = "CREATE TABLE persons_table_kafka2 (\n" +
>                  "  `id` BIGINT,\n" +
>                  "  `name` STRING,\n" +
>                  "  `age` INT,\n" +
>                  "  proctime as PROCTIME(),\n" +
>                  "  `ts` TIMESTAMP(3),\n" +
>                  " WATERMARK FOR ts AS ts\n" +
>                  ") WITH (\n" +
>                  "  'connector' = 'kafka',\n" +
>                  "  'topic' = 'persons_test2',\n" +
>                  "  'properties.bootstrap.servers' = 'node2:6667',\n" +
>                  "  'properties.group.id' = 'testGroa115',\n" +
>                  "  'scan.startup.mode' = 'group-offsets',\n" +
>                  "  'format' = 'json'\n" +
>                  ")";
>          tableEnv.executeSql(kafka_source_sql);
>         tableEnv.executeSql("drop table if exists 
> persons_message_table_kafka2");
>          String kafka_source_sql2 = "CREATE TABLE 
> persons_message_table_kafka2 (\n" +
>                  "  `id` BIGINT,\n" +
>                  "  `name` STRING,\n" +
>                  "  `message` STRING,\n" +
>                  "  `ts` TIMESTAMP(3) ," +
>  //                " WATERMARK FOR ts AS ts - INTERVAL '5' SECOND\n" +
>                  " WATERMARK FOR ts AS ts\n" +
>                  ") WITH (\n" +
>                  "  'connector' = 'kafka',\n" +
>                  "  'topic' = 'persons_extra_message2',\n" +
>                  "  'properties.bootstrap.servers' = 'node2:6667',\n" +
>                  "  'properties.group.id' = 'testGroud2e313',\n" +
>                  "  'scan.startup.mode' = 'group-offsets',\n" +
>                  "  'format' = 'json'\n" +
>                  ")";
>          tableEnv.executeSql(kafka_source_sql2);
>         tableEnv.executeSql("" +
>                  "CREATE TEMPORARY VIEW result_data_view " +
>                  " as " +
>                  " select " +
>                  " t1.name, t1.age,t1.ts as ts1,t2.message, cast (t2.ts as 
> string) as ts2 " +
>                  " from persons_table_kafka2 t1 " +
>                  " left  join persons_message_table_kafka2 t2 on t1.name = 
> t2.name and t1.ts between " +
>                          " t2.ts and t2.ts +  INTERVAL '3' MINUTE"
>                  );
>         Table resultTable = tableEnv.from("resu

[GitHub] [flink] ZhijieYang commented on pull request #16337: [FLINK-22843][docs-zh]Document and code are inconsistent.

2021-07-14 Thread GitBox


ZhijieYang commented on pull request #16337:
URL: https://github.com/apache/flink/pull/16337#issuecomment-880365624


   > > @leonardBang Hi, this is my first time contributing code, so some of the 
processes are not clear to me yet. Does "wait for the build" mean I don't need 
to do anything else and the flinkbot will build automatically, or do I need to 
build manually and then request a merge?
   > 
   > Hi @ZhijieYang, the Flink community requires that every PR can only be 
merged once the CI tests have passed. Both every new commit and `@flinkbot run 
azure` trigger the CI, so we just need to wait for the CI to pass after adding 
a new commit. If that does not work, we can call `@flinkbot run azure` to 
trigger the CI manually.
   
   Thanks a lot @leonardBang
   Best wishes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16337: [FLINK-22843][docs-zh]Document and code are inconsistent.

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16337:
URL: https://github.com/apache/flink/pull/16337#issuecomment-871994148


   
   ## CI report:
   
   * b7d78128ac4f1d7b8fa086c82748439928a10fee Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20417)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20461)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20419)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20462)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] pltbkd commented on pull request #16357: [FLINK-23209] Introduce HeartbeatListener.notifyTargetUnreachable

2021-07-14 Thread GitBox


pltbkd commented on pull request #16357:
URL: https://github.com/apache/flink/pull/16357#issuecomment-880364616


   >This entails that the time until marking someone as dead is threshold * 
heartbeat interval. I hope that this definition is easy enough to understand 
for our users.
   
   Overall I agree it is a good enough plan for the first version. Maybe we 
should suggest that users not set the threshold to more than timeout/interval, 
which by default is 50s/10s=5, a relatively small value. We could add that 
suggestion to the configuration description, if you think it's necessary as 
well.
   
   >We can try that out, but I am apprehensive about the default of 1 (purely 
because expecting every message to go through is like the thing everyone drills 
you not to do).
   
   I agree with zentol, and I'd like to suggest a default value of 2, 
standing for discovering and then confirming.
   
   >The problem is that the heartbeat is used to transport status information 
about the components. Since we require a certain order of rpcs, we cannot send 
the heartbeat signals easily from outside the main thread because it could lead 
to race conditions and outdated status information. One could try to use some 
logical clocks to synchronize the messages again, but this hasn't been tried 
yet.
   
   I now understand why the heartbeats also need to be in order. IMO, there are 
some advantages to having a more reliable and frequent heartbeat mechanism, 
with which I think both the interval and the timeout could be decreased. But 
since I don't know many details here, I would leave this issue to the experts.
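
   The arithmetic discussed above, spelled out under the thread's assumed 
defaults (interval 10s, timeout 50s, proposed threshold 2):

{code:java}
public class HeartbeatMath {
    public static void main(String[] args) {
        long intervalMs = 10_000L; // heartbeat interval, default assumed from the thread
        long timeoutMs = 50_000L;  // heartbeat timeout, default assumed from the thread
        int threshold = 2;         // proposed default: discover, then confirm

        long detectionMs = threshold * intervalMs;  // time until a target is marked dead
        long maxThreshold = timeoutMs / intervalMs; // suggested upper bound: 5

        System.out.println("marked dead after " + detectionMs + " ms");
        System.out.println("threshold should stay <= " + maxThreshold);
    }
}
{code}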


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] leonardBang commented on pull request #16337: [FLINK-22843][docs-zh]Document and code are inconsistent.

2021-07-14 Thread GitBox


leonardBang commented on pull request #16337:
URL: https://github.com/apache/flink/pull/16337#issuecomment-880363807


   > @leonardBang Hi, this is my first time contributing code, so some of the 
processes are not clear to me yet. Does "wait for the build" mean I don't need 
to do anything else and the flinkbot will build automatically, or do I need to 
build manually and then request a merge?
   
   Hi @ZhijieYang, the Flink community requires that every PR can only be 
merged once the CI tests have passed. Both every new commit and `@flinkbot run 
azure` trigger the CI, so we just need to wait for the CI to pass after adding 
a new commit. If that does not work, we can call `@flinkbot run azure` to 
trigger the CI manually.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16434: [FLINK-23107][table-runtime] Separate implementation of deduplicate r…

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16434:
URL: https://github.com/apache/flink/pull/16434#issuecomment-876881346


   
   ## CI report:
   
   * 802d88a74c0e754769dd67780fe6599753f66efc Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20366)
 
   * 12f7fa66c5185c7a3375bf4cbd19b2df1f82a04a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20460)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16337: [FLINK-22843][docs-zh]Document and code are inconsistent.

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16337:
URL: https://github.com/apache/flink/pull/16337#issuecomment-871994148


   
   ## CI report:
   
   * b7d78128ac4f1d7b8fa086c82748439928a10fee Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20419)
 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20417)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20461)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15899: [FLINK-22627][runtime] Remove SlotManagerImpl

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #15899:
URL: https://github.com/apache/flink/pull/15899#issuecomment-839417893


   
   ## CI report:
   
   * d3d33b6b0e90ea1bb8f4597940598db7f2de5c11 UNKNOWN
   * a13ae195d672eca62ad24304de6678533d665cb6 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20408)
 
   * 75dd900bfb5480c29d82cbe68e11dfde3b9a57e9 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20459)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-22893) ResumeCheckpointManuallyITCase hangs on azure

2021-07-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei reassigned FLINK-22893:


Assignee: Anton Kalashnikov

> ResumeCheckpointManuallyITCase hangs on azure
> -
>
> Key: FLINK-22893
> URL: https://issues.apache.org/jira/browse/FLINK-22893
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.14.0
>Reporter: Dawid Wysakowicz
>Assignee: Anton Kalashnikov
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18700&view=logs&j=2c3cbe13-dee0-5837-cf47-3053da9a8a78&t=2c7d57b9-7341-5a87-c9af-2cf7cc1a37dc&l=4382



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] ranqiqiang commented on a change in pull request #16478: [FLINK-23228][docs-zh] Translate "Stateful Stream Processing" page into Chinese

2021-07-14 Thread GitBox


ranqiqiang commented on a change in pull request #16478:
URL: https://github.com/apache/flink/pull/16478#discussion_r670099944



##
File path: docs/content.zh/docs/concepts/stateful-stream-processing.md
##
@@ -24,342 +24,227 @@ under the License.
 
 # 有状态流处理
 
-## What is State?
+## 什么是状态?
 
-While many operations in a dataflow simply look at one individual *event at a
-time* (for example an event parser), some operations remember information
-across multiple events (for example window operators). These operations are
-called **stateful**.
+虽然数据流中的很多操作一次只着眼于一个单独的事件(例如事件解析器),但有些操作会记住多个事件的信息(例如窗口算子)。
+这些操作称为**有状态的**(stateful)。
 
-Some examples of stateful operations:
+有状态操作的一些示例:
 
-  - When an application searches for certain event patterns, the state will
-store the sequence of events encountered so far.
-  - When aggregating events per minute/hour/day, the state holds the pending
-aggregates.
-  - When training a machine learning model over a stream of data points, the
-state holds the current version of the model parameters.
-  - When historic data needs to be managed, the state allows efficient access
-to events that occurred in the past.
+  - 当应用程序搜索某些事件模式时,状态将存储到目前为止遇到的事件序列。
+  - 当每分钟/每小时/每天聚合事件时,状态会持有待处理的聚合。
+  - 当在数据点的流上训练一个机器学习模型时,状态会保存模型参数的当前版本。
+  - 当需要管理历史数据时,状态允许有效访问过去发生的事件。
 
-Flink needs to be aware of the state in order to make it fault tolerant using
+Flink 需要知道状态以便使用
 [checkpoints]({{< ref "docs/dev/datastream/fault-tolerance/checkpointing" >}})
-and [savepoints]({{< ref "docs/ops/state/savepoints" >}}).
+和 [savepoints]({{< ref "docs/ops/state/savepoints" >}}) 进行容错。
 
-Knowledge about the state also allows for rescaling Flink applications, meaning
-that Flink takes care of redistributing state across parallel instances.
+关于状态的知识也允许我们重新调节 Flink 应用程序,这意味着 Flink 负责跨并行实例重新分布状态。
 
-[Queryable state]({{< ref 
"docs/dev/datastream/fault-tolerance/queryable_state" >}}) allows you to access 
state from outside of Flink during runtime.
+[可查询的状态]({{< ref "docs/dev/datastream/fault-tolerance/queryable_state" 
>}})允许你在运行时从 Flink 外部访问状态。
 
-When working with state, it might also be useful to read about [Flink's state
-backends]({{< ref "docs/ops/state/state_backends" >}}). Flink
-provides different state backends that specify how and where state is stored.
+在使用状态时,阅读 [Flink 的状态后端]({{< ref "docs/ops/state/state_backends" >}})可能也很有用。 
+Flink 提供了不同的状态后端,用于指定状态存储的方式和位置。
 
 {{< top >}}
 
-## Keyed State
-
-Keyed state is maintained in what can be thought of as an embedded key/value
-store.  The state is partitioned and distributed strictly together with the
-streams that are read by the stateful operators. Hence, access to the key/value
-state is only possible on *keyed streams*, i.e. after a keyed/partitioned data
-exchange, and is restricted to the values associated with the current event's
-key. Aligning the keys of streams and state makes sure that all state updates
-are local operations, guaranteeing consistency without transaction overhead.
-This alignment also allows Flink to redistribute the state and adjust the
-stream partitioning transparently.
-
-{{< img src="/fig/state_partitioning.svg" alt="State and Partitioning" 
class="offset" width="50%" >}}
-
-Keyed State is further organized into so-called *Key Groups*. Key Groups are
-the atomic unit by which Flink can redistribute Keyed State; there are exactly
-as many Key Groups as the defined maximum parallelism.  During execution each
-parallel instance of a keyed operator works with the keys for one or more Key
-Groups.
-
-## State Persistence
-
-Flink implements fault tolerance using a combination of **stream replay** and
-**checkpointing**. A checkpoint marks a specific point in each of the
-input streams along with the corresponding state for each of the operators. A
-streaming dataflow can be resumed from a checkpoint while maintaining
-consistency *(exactly-once processing semantics)* by restoring the state of the
-operators and replaying the records from the point of the checkpoint.
-
-The checkpoint interval is a means of trading off the overhead of fault
-tolerance during execution with the recovery time (the number of records that
-need to be replayed).
-
-The fault tolerance mechanism continuously draws snapshots of the distributed
-streaming data flow. For streaming applications with small state, these
-snapshots are very light-weight and can be drawn frequently without much impact
-on performance.  The state of the streaming applications is stored at a
-configurable place, usually in a distributed file system.
-
-In case of a program failure (due to machine-, network-, or software failure),
-Flink stops the distributed streaming dataflow.  The system then restarts the
-operators and resets them to the latest successful checkpoint. The input
-streams are reset to the point of the state snapshot. Any records that are
-processed as part of the restarted parallel dataflow are guaranteed to not have
-affect

[GitHub] [flink] ranqiqiang commented on a change in pull request #16478: [FLINK-23228][docs-zh] Translate "Stateful Stream Processing" page into Chinese

2021-07-14 Thread GitBox


ranqiqiang commented on a change in pull request #16478:
URL: https://github.com/apache/flink/pull/16478#discussion_r670099338



##
File path: docs/content.zh/docs/concepts/stateful-stream-processing.md
##
@@ -24,342 +24,227 @@ under the License.
 
 # 有状态流处理
 
-## What is State?
+## 什么是状态?
 
-While many operations in a dataflow simply look at one individual *event at a
-time* (for example an event parser), some operations remember information
-across multiple events (for example window operators). These operations are
-called **stateful**.
+虽然数据流中的很多操作一次只着眼于一个单独的事件(例如事件解析器),但有些操作会记住多个事件的信息(例如窗口算子)。
+这些操作称为**有状态的**(stateful)。
 
-Some examples of stateful operations:
+有状态操作的一些示例:
 
-  - When an application searches for certain event patterns, the state will
-store the sequence of events encountered so far.
-  - When aggregating events per minute/hour/day, the state holds the pending
-aggregates.
-  - When training a machine learning model over a stream of data points, the
-state holds the current version of the model parameters.
-  - When historic data needs to be managed, the state allows efficient access
-to events that occurred in the past.
+  - 当应用程序搜索某些事件模式时,状态将存储到目前为止遇到的事件序列。
+  - 当每分钟/每小时/每天聚合事件时,状态会持有待处理的聚合。
+  - 当在数据点的流上训练一个机器学习模型时,状态会保存模型参数的当前版本。
+  - 当需要管理历史数据时,状态允许有效访问过去发生的事件。
 
-Flink needs to be aware of the state in order to make it fault tolerant using
+Flink 需要知道状态以便使用
 [checkpoints]({{< ref "docs/dev/datastream/fault-tolerance/checkpointing" >}})
-and [savepoints]({{< ref "docs/ops/state/savepoints" >}}).
+和 [savepoints]({{< ref "docs/ops/state/savepoints" >}}) 进行容错。
 
-Knowledge about the state also allows for rescaling Flink applications, meaning
-that Flink takes care of redistributing state across parallel instances.
+关于状态的知识也允许我们重新调节 Flink 应用程序,这意味着 Flink 负责跨并行实例重新分布状态。
 
-[Queryable state]({{< ref 
"docs/dev/datastream/fault-tolerance/queryable_state" >}}) allows you to access 
state from outside of Flink during runtime.
+[可查询的状态]({{< ref "docs/dev/datastream/fault-tolerance/queryable_state" 
>}})允许你在运行时从 Flink 外部访问状态。
 
-When working with state, it might also be useful to read about [Flink's state
-backends]({{< ref "docs/ops/state/state_backends" >}}). Flink
-provides different state backends that specify how and where state is stored.
+在使用状态时,阅读 [Flink 的状态后端]({{< ref "docs/ops/state/state_backends" >}})可能也很有用。 
+Flink 提供了不同的状态后端,用于指定状态存储的方式和位置。
 
 {{< top >}}
 
-## Keyed State
-
-Keyed state is maintained in what can be thought of as an embedded key/value
-store.  The state is partitioned and distributed strictly together with the
-streams that are read by the stateful operators. Hence, access to the key/value
-state is only possible on *keyed streams*, i.e. after a keyed/partitioned data
-exchange, and is restricted to the values associated with the current event's
-key. Aligning the keys of streams and state makes sure that all state updates
-are local operations, guaranteeing consistency without transaction overhead.
-This alignment also allows Flink to redistribute the state and adjust the
-stream partitioning transparently.
-
-{{< img src="/fig/state_partitioning.svg" alt="State and Partitioning" 
class="offset" width="50%" >}}
-
-Keyed State is further organized into so-called *Key Groups*. Key Groups are
-the atomic unit by which Flink can redistribute Keyed State; there are exactly
-as many Key Groups as the defined maximum parallelism.  During execution each
-parallel instance of a keyed operator works with the keys for one or more Key
-Groups.
-
-## State Persistence
-
-Flink implements fault tolerance using a combination of **stream replay** and
-**checkpointing**. A checkpoint marks a specific point in each of the
-input streams along with the corresponding state for each of the operators. A
-streaming dataflow can be resumed from a checkpoint while maintaining
-consistency *(exactly-once processing semantics)* by restoring the state of the
-operators and replaying the records from the point of the checkpoint.
-
-The checkpoint interval is a means of trading off the overhead of fault
-tolerance during execution with the recovery time (the number of records that
-need to be replayed).
-
-The fault tolerance mechanism continuously draws snapshots of the distributed
-streaming data flow. For streaming applications with small state, these
-snapshots are very light-weight and can be drawn frequently without much impact
-on performance.  The state of the streaming applications is stored at a
-configurable place, usually in a distributed file system.
-
-In case of a program failure (due to machine-, network-, or software failure),
-Flink stops the distributed streaming dataflow.  The system then restarts the
-operators and resets them to the latest successful checkpoint. The input
-streams are reset to the point of the state snapshot. Any records that are
-processed as part of the restarted parallel dataflow are guaranteed to not have
-affect

[GitHub] [flink] Jesuitry commented on pull request #9336: [FLINK-13548][Deployment/YARN]Support priority of the Flink YARN application

2021-07-14 Thread GitBox


Jesuitry commented on pull request #9336:
URL: https://github.com/apache/flink/pull/9336#issuecomment-880352382


   Can this be used in the fair scheduler? I set yarn.application.priority, but 
it does not take effect.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] ranqiqiang commented on a change in pull request #16478: [FLINK-23228][docs-zh] Translate "Stateful Stream Processing" page into Chinese

2021-07-14 Thread GitBox


ranqiqiang commented on a change in pull request #16478:
URL: https://github.com/apache/flink/pull/16478#discussion_r670097805



##
File path: docs/content.zh/docs/concepts/stateful-stream-processing.md
##
@@ -24,342 +24,227 @@ under the License.
 
 # 有状态流处理
 
-## What is State?
+## 什么是状态?
 
-While many operations in a dataflow simply look at one individual *event at a
-time* (for example an event parser), some operations remember information
-across multiple events (for example window operators). These operations are
-called **stateful**.
+虽然数据流中的很多操作一次只着眼于一个单独的事件(例如事件解析器),但有些操作会记住多个事件的信息(例如窗口算子)。
+这些操作称为**有状态的**(stateful)。
 
-Some examples of stateful operations:
+有状态操作的一些示例:
 
-  - When an application searches for certain event patterns, the state will
-store the sequence of events encountered so far.
-  - When aggregating events per minute/hour/day, the state holds the pending
-aggregates.
-  - When training a machine learning model over a stream of data points, the
-state holds the current version of the model parameters.
-  - When historic data needs to be managed, the state allows efficient access
-to events that occurred in the past.
+  - 当应用程序搜索某些事件模式时,状态将存储到目前为止遇到的事件序列。

Review comment:
   event → 事件
   "the sequence of events encountered so far": 截止到当前事件序列
   My personal reading is: the state records the events that have occurred up to now.
   For "sequence", I am not sure whether it reads better translated as 按顺序记录 ("recorded in order") or as 序列 ("sequence").
   We could raise this in the group chat and discuss together which reading is correct.
   
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-23391) KafkaSourceReaderTest.testKafkaSourceMetrics fails on azure

2021-07-14 Thread Xintong Song (Jira)
Xintong Song created FLINK-23391:


 Summary: KafkaSourceReaderTest.testKafkaSourceMetrics fails on 
azure
 Key: FLINK-23391
 URL: https://issues.apache.org/jira/browse/FLINK-23391
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.13.1
Reporter: Xintong Song
 Fix For: 1.13.2


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20456&view=logs&j=c5612577-f1f7-5977-6ff6-7432788526f7&t=53f6305f-55e6-561c-8f1e-3a1dde2c77df&l=6783

{code}
Jul 14 23:00:26 [ERROR] Tests run: 10, Failures: 0, Errors: 1, Skipped: 0, Time 
elapsed: 99.93 s <<< FAILURE! - in 
org.apache.flink.connector.kafka.source.reader.KafkaSourceReaderTest
Jul 14 23:00:26 [ERROR] 
testKafkaSourceMetrics(org.apache.flink.connector.kafka.source.reader.KafkaSourceReaderTest)
  Time elapsed: 60.225 s  <<< ERROR!
Jul 14 23:00:26 java.util.concurrent.TimeoutException: Offsets are not 
committed successfully. Dangling offsets: 
{15213={KafkaSourceReaderTest-0=OffsetAndMetadata{offset=10, leaderEpoch=null, 
metadata=''}}}
Jul 14 23:00:26 at 
org.apache.flink.core.testutils.CommonTestUtils.waitUtil(CommonTestUtils.java:210)
Jul 14 23:00:26 at 
org.apache.flink.connector.kafka.source.reader.KafkaSourceReaderTest.testKafkaSourceMetrics(KafkaSourceReaderTest.java:275)
Jul 14 23:00:26 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
Jul 14 23:00:26 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Jul 14 23:00:26 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Jul 14 23:00:26 at java.lang.reflect.Method.invoke(Method.java:498)
Jul 14 23:00:26 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
Jul 14 23:00:26 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
Jul 14 23:00:26 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
Jul 14 23:00:26 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
Jul 14 23:00:26 at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
Jul 14 23:00:26 at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
Jul 14 23:00:26 at 
org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
Jul 14 23:00:26 at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
Jul 14 23:00:26 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
Jul 14 23:00:26 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
Jul 14 23:00:26 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
Jul 14 23:00:26 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
Jul 14 23:00:26 at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
Jul 14 23:00:26 at org.junit.runners.Suite.runChild(Suite.java:128)
Jul 14 23:00:26 at org.junit.runners.Suite.runChild(Suite.java:27)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
Jul 14 23:00:26 at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
Jul 14 23:00:26 at 
org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
Jul 14 23:00:26 at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
Jul 14 23:00:26 at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
Jul 14 23:00:26 at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
J

[jira] [Commented] (FLINK-22198) KafkaTableITCase hang.

2021-07-14 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380968#comment-17380968
 ] 

Xintong Song commented on FLINK-22198:
--

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20456&view=logs&j=ce8f3cc3-c1ea-5281-f5eb-df9ebd24947f&t=f266c805-9429-58ed-2f9e-482e7b82f58b&l=6857

> KafkaTableITCase hang.
> --
>
> Key: FLINK-22198
> URL: https://issues.apache.org/jira/browse/FLINK-22198
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.12.4
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16287&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5&l=6625
> There are no artifacts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-20329) Elasticsearch7DynamicSinkITCase hangs

2021-07-14 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380965#comment-17380965
 ] 

Xintong Song commented on FLINK-20329:
--

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20456&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20&l=12200

> Elasticsearch7DynamicSinkITCase hangs
> -
>
> Key: FLINK-20329
> URL: https://issues.apache.org/jira/browse/FLINK-20329
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.12.0, 1.13.0
>Reporter: Dian Fu
>Assignee: Yangze Guo
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10052&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20
> {code}
> 2020-11-24T16:04:05.9260517Z [INFO] Running 
> org.apache.flink.streaming.connectors.elasticsearch.table.Elasticsearch7DynamicSinkITCase
> 2020-11-24T16:19:25.5481231Z 
> ==
> 2020-11-24T16:19:25.5483549Z Process produced no output for 900 seconds.
> 2020-11-24T16:19:25.5484064Z 
> ==
> 2020-11-24T16:19:25.5484498Z 
> ==
> 2020-11-24T16:19:25.5484882Z The following Java processes are running (JPS)
> 2020-11-24T16:19:25.5485475Z 
> ==
> 2020-11-24T16:19:25.5694497Z Picked up JAVA_TOOL_OPTIONS: 
> -XX:+HeapDumpOnOutOfMemoryError
> 2020-11-24T16:19:25.7263048Z 16192 surefirebooter5057948964630155904.jar
> 2020-11-24T16:19:25.7263515Z 18566 Jps
> 2020-11-24T16:19:25.7263709Z 959 Launcher
> 2020-11-24T16:19:25.7411148Z 
> ==
> 2020-11-24T16:19:25.7427013Z Printing stack trace of Java process 16192
> 2020-11-24T16:19:25.7427369Z 
> ==
> 2020-11-24T16:19:25.7484365Z Picked up JAVA_TOOL_OPTIONS: 
> -XX:+HeapDumpOnOutOfMemoryError
> 2020-11-24T16:19:26.0848776Z 2020-11-24 16:19:26
> 2020-11-24T16:19:26.0849578Z Full thread dump OpenJDK 64-Bit Server VM 
> (25.275-b01 mixed mode):
> 2020-11-24T16:19:26.0849831Z 
> 2020-11-24T16:19:26.0850185Z "Attach Listener" #32 daemon prio=9 os_prio=0 
> tid=0x7fc148001000 nid=0x48e7 waiting on condition [0x]
> 2020-11-24T16:19:26.0850595Zjava.lang.Thread.State: RUNNABLE
> 2020-11-24T16:19:26.0850814Z 
> 2020-11-24T16:19:26.0851375Z "testcontainers-ryuk" #31 daemon prio=5 
> os_prio=0 tid=0x7fc251232000 nid=0x3fb0 in Object.wait() 
> [0x7fc1012c4000]
> 2020-11-24T16:19:26.0854688Zjava.lang.Thread.State: TIMED_WAITING (on 
> object monitor)
> 2020-11-24T16:19:26.0855379Z  at java.lang.Object.wait(Native Method)
> 2020-11-24T16:19:26.0855844Z  at 
> org.testcontainers.utility.ResourceReaper.lambda$null$1(ResourceReaper.java:142)
> 2020-11-24T16:19:26.0857272Z  - locked <0x8e2bd2d0> (a 
> java.util.ArrayList)
> 2020-11-24T16:19:26.0857977Z  at 
> org.testcontainers.utility.ResourceReaper$$Lambda$93/1981729428.run(Unknown 
> Source)
> 2020-11-24T16:19:26.0858471Z  at 
> org.rnorth.ducttape.ratelimits.RateLimiter.doWhenReady(RateLimiter.java:27)
> 2020-11-24T16:19:26.0858961Z  at 
> org.testcontainers.utility.ResourceReaper.lambda$start$2(ResourceReaper.java:133)
> 2020-11-24T16:19:26.0859422Z  at 
> org.testcontainers.utility.ResourceReaper$$Lambda$92/40191541.run(Unknown 
> Source)
> 2020-11-24T16:19:26.0859788Z  at java.lang.Thread.run(Thread.java:748)
> 2020-11-24T16:19:26.0860030Z 
> 2020-11-24T16:19:26.0860371Z "process reaper" #24 daemon prio=10 os_prio=0 
> tid=0x7fc0f803b800 nid=0x3f92 waiting on condition [0x7fc10296e000]
> 2020-11-24T16:19:26.0860913Zjava.lang.Thread.State: TIMED_WAITING 
> (parking)
> 2020-11-24T16:19:26.0861387Z  at sun.misc.Unsafe.park(Native Method)
> 2020-11-24T16:19:26.0862495Z  - parking to wait for  <0x8814bf30> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
> 2020-11-24T16:19:26.0863253Z  at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> 2020-11-24T16:19:26.0863760Z  at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
> 2020-11-24T16:19:26.0864274Z  at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
> 2020-11-24T16:19:26.0864762Z  at 
> java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:

[jira] [Created] (FLINK-23390) FlinkKafkaInternalProducerITCase.testResumeTransactionAfterClosed

2021-07-14 Thread Xintong Song (Jira)
Xintong Song created FLINK-23390:


 Summary: 
FlinkKafkaInternalProducerITCase.testResumeTransactionAfterClosed
 Key: FLINK-23390
 URL: https://issues.apache.org/jira/browse/FLINK-23390
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.14.0
Reporter: Xintong Song
 Fix For: 1.14.0


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20454&view=logs&j=ce8f3cc3-c1ea-5281-f5eb-df9ebd24947f&t=f266c805-9429-58ed-2f9e-482e7b82f58b&l=6914

{code}
Jul 14 22:01:05 [ERROR] Tests run: 10, Failures: 0, Errors: 1, Skipped: 0, Time 
elapsed: 49 s <<< FAILURE! - in 
org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase
Jul 14 22:01:05 [ERROR] 
testResumeTransactionAfterClosed(org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase)
  Time elapsed: 5.271 s  <<< ERROR!
Jul 14 22:01:05 java.lang.Exception: Unexpected exception, 
expected but was
Jul 14 22:01:05 at 
org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:30)
Jul 14 22:01:05 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
Jul 14 22:01:05 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
Jul 14 22:01:05 at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
Jul 14 22:01:05 at java.base/java.lang.Thread.run(Thread.java:834)
Jul 14 22:01:05 Caused by: java.lang.AssertionError: The message should have 
been successfully sent expected null, but 
was:
Jul 14 22:01:05 at org.junit.Assert.fail(Assert.java:89)
Jul 14 22:01:05 at org.junit.Assert.failNotNull(Assert.java:756)
Jul 14 22:01:05 at org.junit.Assert.assertNull(Assert.java:738)
Jul 14 22:01:05 at 
org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase.getClosedProducer(FlinkKafkaInternalProducerITCase.java:228)
Jul 14 22:01:05 at 
org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase.testResumeTransactionAfterClosed(FlinkKafkaInternalProducerITCase.java:184)
Jul 14 22:01:05 at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Jul 14 22:01:05 at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Jul 14 22:01:05 at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Jul 14 22:01:05 at 
java.base/java.lang.reflect.Method.invoke(Method.java:566)
Jul 14 22:01:05 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
Jul 14 22:01:05 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
Jul 14 22:01:05 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
Jul 14 22:01:05 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
Jul 14 22:01:05 at 
org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:19)
Jul 14 22:01:05 ... 4 more
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] ZhijieYang commented on pull request #16348: [FLINK-16093][docs-zh]Translate "System Functions" page of "Functions" into Chinese

2021-07-14 Thread GitBox


ZhijieYang commented on pull request #16348:
URL: https://github.com/apache/flink/pull/16348#issuecomment-880346518


   @RocMarshal So could I rebase these two commits into one now to look cleaner?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] ZhijieYang commented on pull request #16337: [FLINK-22843][docs-zh]Document and code are inconsistent.

2021-07-14 Thread GitBox


ZhijieYang commented on pull request #16337:
URL: https://github.com/apache/flink/pull/16337#issuecomment-880345922


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] ZhijieYang commented on pull request #16337: [FLINK-22843][docs-zh]Document and code are inconsistent.

2021-07-14 Thread GitBox


ZhijieYang commented on pull request #16337:
URL: https://github.com/apache/flink/pull/16337#issuecomment-880345578


   @leonardBang Hi, this is my first time contributing code, so some of the 
processes are not clear to me yet. Does "wait for the build" mean that I don't need to do 
anything else and the flinkbot will build automatically, or that I need to build 
manually and then request a merge?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22889) JdbcExactlyOnceSinkE2eTest hangs on azure

2021-07-14 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380959#comment-17380959
 ] 

Xintong Song commented on FLINK-22889:
--

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20454&view=logs&j=ba53eb01-1462-56a3-8e98-0dd97fbcaab5&t=bfbc6239-57a0-5db0-63f3-41551b4f7d51&l=14595

> JdbcExactlyOnceSinkE2eTest hangs on azure
> -
>
> Key: FLINK-22889
> URL: https://issues.apache.org/jira/browse/FLINK-22889
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Affects Versions: 1.14.0, 1.13.1
>Reporter: Dawid Wysakowicz
>Assignee: Roman Khachatryan
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18690&view=logs&j=ba53eb01-1462-56a3-8e98-0dd97fbcaab5&t=bfbc6239-57a0-5db0-63f3-41551b4f7d51&l=16658



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22889) JdbcExactlyOnceSinkE2eTest hangs on azure

2021-07-14 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380957#comment-17380957
 ] 

Xintong Song commented on FLINK-22889:
--

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20454&view=logs&j=961f8f81-6b52-53df-09f6-7291a2e4af6a&t=60581941-0138-53c0-39fe-86d62be5f407&l=14453

> JdbcExactlyOnceSinkE2eTest hangs on azure
> -
>
> Key: FLINK-22889
> URL: https://issues.apache.org/jira/browse/FLINK-22889
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Affects Versions: 1.14.0, 1.13.1
>Reporter: Dawid Wysakowicz
>Assignee: Roman Khachatryan
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18690&view=logs&j=ba53eb01-1462-56a3-8e98-0dd97fbcaab5&t=bfbc6239-57a0-5db0-63f3-41551b4f7d51&l=16658



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21751) Improve handling of freed slots if final requirement message is in flight

2021-07-14 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380955#comment-17380955
 ] 

Xintong Song commented on FLINK-21751:
--

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20454&view=logs&j=4dd4dbdd-1802-5eb7-a518-6acd9d24d0fc&t=8d6b4dd3-4ca1-5611-1743-57a7d76b395a&l=371

> Improve handling of freed slots if final requirement message is in flight
> -
>
> Key: FLINK-21751
> URL: https://issues.apache.org/jira/browse/FLINK-21751
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> When a job shuts down there is a race condition between slots being freed and 
> requirements being set to 0. If the slot release arrives first at the RM then 
> it will immediately try to re-allocate slots, since the requirements are not 
> 0 yet.
> In practice this is unlikely to cause issues (because the trip from 
> JobMaster->TM->RM should always take longer than JobMaster->RM), but this 
> problem results in various test instabilities.
> Essentially there are 2 alternatives:
> a) enforce a strict order such that the requirement update must be 
> acknowledged before slots are freed
> b) have the RM inform the TM if the job has finished, to clean up any pending 
> slots.
> Neither option is ideal.
> a) implies that the JobMaster has to stick around longer to wait for the 
> acknowledgement, and it also introduces a delay to all slot-freeing operations.
> b) can easily lead to bugs in the future; if the TM was informed that the job 
> has concluded it must only cancel pending slots; it may not free all job 
> resources because other messages from the JM may still be in flight (for 
> example, the partition promotions).
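
As a toy illustration of alternative (a), the sketch below (all names hypothetical, and far simpler than the real JobMaster/ResourceManager/TaskManager protocol) enforces the strict order by only freeing slots once the zero-requirements update has been acknowledged, so the RM can never observe freed slots while requirements are still non-zero:

{code:java}
import java.util.concurrent.CompletableFuture;

// Toy model of alternative (a): strict ordering of the two in-flight messages.
public class ShutdownOrdering {

    // Hypothetical stand-in for "JobMaster -> RM: requirements are now 0".
    static CompletableFuture<Void> declareZeroRequirements() {
        System.out.println("RM: requirements set to 0 (ack sent)");
        return CompletableFuture.completedFuture(null);
    }

    // Hypothetical stand-in for "slots released at the RM".
    static CompletableFuture<Void> freeSlots() {
        System.out.println("RM: slots freed, no re-allocation attempted");
        return CompletableFuture.completedFuture(null);
    }

    public static void main(String[] args) {
        // Slots are freed only after the requirement update is acknowledged,
        // at the cost of the JobMaster waiting for one extra round trip.
        declareZeroRequirements().thenCompose(ack -> freeSlots()).join();
    }
}
{code}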



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #16434: [FLINK-23107][table-runtime] Separate implementation of deduplicate r…

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16434:
URL: https://github.com/apache/flink/pull/16434#issuecomment-876881346


   
   ## CI report:
   
   * 802d88a74c0e754769dd67780fe6599753f66efc Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20366)
 
   * 12f7fa66c5185c7a3375bf4cbd19b2df1f82a04a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15899: [FLINK-22627][runtime] Remove SlotManagerImpl

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #15899:
URL: https://github.com/apache/flink/pull/15899#issuecomment-839417893


   
   ## CI report:
   
   * d3d33b6b0e90ea1bb8f4597940598db7f2de5c11 UNKNOWN
   * a13ae195d672eca62ad24304de6678533d665cb6 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20408)
 
   * 75dd900bfb5480c29d82cbe68e11dfde3b9a57e9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] ZhijieYang commented on a change in pull request #16355: [hotfix] [docs] Fix document address does not match the actual web page.

2021-07-14 Thread GitBox


ZhijieYang commented on a change in pull request #16355:
URL: https://github.com/apache/flink/pull/16355#discussion_r670078863



##
File path: 
docs/content.zh/docs/dev/datastream/fault-tolerance/serialization/third_party_serializers.md
##
@@ -3,8 +3,8 @@ title: 自定义序列化器
 weight: 11

Review comment:
   How about we put this issue on hold and focus on fixing the errors in 
the documentation first? I noticed that this page is not translated; later I 
want to create a JIRA to fix the translation of the whole 
`docs/content.zh/docs/dev/datastream/fault-tolerance/serialization/third_party_serializers.md`
 page. We can discuss that in the next PR; for now, let's leave it as it is.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22198) KafkaTableITCase hang.

2021-07-14 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380952#comment-17380952
 ] 

Xintong Song commented on FLINK-22198:
--

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20455&view=logs&j=4be4ed2b-549a-533d-aa33-09e28e360cc8&t=0db94045-2aa0-53fa-f444-0130d6933518&l=7081

> KafkaTableITCase hang.
> --
>
> Key: FLINK-22198
> URL: https://issues.apache.org/jira/browse/FLINK-22198
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.12.4
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16287&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5&l=6625
> There are no artifacts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-23208) Late processing timers need to wait 1ms at least to be fired

2021-07-14 Thread Jiayi Liao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380948#comment-17380948
 ] 

Jiayi Liao commented on FLINK-23208:


[~pnowojski] I can work on this. 

Should the benchmark be added to the {{flink-benchmarks}} repository or somewhere 
else?



> Late processing timers need to wait 1ms at least to be fired
> 
>
> Key: FLINK-23208
> URL: https://issues.apache.org/jira/browse/FLINK-23208
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Runtime / Task
>Affects Versions: 1.11.0, 1.11.3, 1.13.0, 1.14.0, 1.12.4
>Reporter: Jiayi Liao
>Priority: Critical
>  Labels: critical
> Attachments: screenshot-1.png
>
>
> The problem is from the codes below:
> {code:java}
> public static long getProcessingTimeDelay(long processingTimestamp, long 
> currentTimestamp) {
>   // delay the firing of the timer by 1 ms to align the semantics with 
> watermark. A watermark
>   // T says we won't see elements in the future with a timestamp smaller 
> or equal to T.
>   // With processing time, we therefore need to delay firing the timer by 
> one ms.
>   return Math.max(processingTimestamp - currentTimestamp, 0) + 1;
> }
> {code}
> Assuming a Flink job creates 1 timer per millisecond and is able to 
> consume 1 timer/ms. Here is what will happen: 
> * Timestamp1(1st ms): timer1 is registered and will be triggered on 
> Timestamp2. 
> * Timestamp2(2nd ms): timer2 is registered and timer1 is triggered
> * Timestamp3(3rd ms): timer3 is registered and timer1 is consumed, after 
> this, {{InternalTimerServiceImpl}} registers next timer, which is timer2, and 
> timer2 will be triggered on Timestamp4(wait 1ms at least)
> * Timestamp4(4th ms): timer4 is registered and timer2 is triggered
> * Timestamp5(5th ms): timer5 is registered and timer2 is consumed, after 
> this, {{InternalTimerServiceImpl}} registers next timer, which is timer3, and 
> timer3 will be triggered on Timestamp6(wait 1ms at least)
> As we can see here, although the job is capable of consuming 1 timer/ms, it 
> effectively consumes only 0.5 timers/ms. Another problem is that we 
> cannot observe the delay from the lag metrics of the source (Kafka). Instead, 
> what we can tell is that the moment of output is much later than expected. 
> I've added a metrics in our inner version, we can see the lag of the timer 
> triggering keeps increasing: 
>  !screenshot-1.png! 
> *In other words, we should never make a late processing timer wait 1 ms. I 
> think a simple change would be as below:*
> {code:java}
> return Math.max(processingTimestamp - currentTimestamp, -1) + 1;
> {code}
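
To make the arithmetic concrete, below is a minimal standalone sketch (hypothetical class and sample timestamps, not Flink code) contrasting the two formulas: a timer that is already late still gets a 1 ms delay under the current formula, while the proposed one fires it immediately; timers in the future are unaffected.

{code:java}
// Hypothetical demo of the two delay formulas discussed above.
public class TimerDelayDemo {

    // Current behavior: even a late timer waits at least 1 ms.
    static long currentDelay(long processingTimestamp, long now) {
        return Math.max(processingTimestamp - now, 0) + 1;
    }

    // Proposed behavior: a late timer fires immediately (0 ms delay).
    static long proposedDelay(long processingTimestamp, long now) {
        return Math.max(processingTimestamp - now, -1) + 1;
    }

    public static void main(String[] args) {
        long now = 1_000L;
        System.out.println(currentDelay(998L, now));    // 1 -> late timer is still delayed
        System.out.println(proposedDelay(998L, now));   // 0 -> late timer fires now
        System.out.println(currentDelay(1_002L, now));  // 3 -> +1 ms kept for future timers
        System.out.println(proposedDelay(1_002L, now)); // 3 -> unchanged for future timers
    }
}
{code}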



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zuoniduimian commented on a change in pull request #16490: [FLINK-23241][docs-zh] Translate the page of "Working with State " in…

2021-07-14 Thread GitBox


zuoniduimian commented on a change in pull request #16490:
URL: https://github.com/apache/flink/pull/16490#discussion_r670074353



##
File path: docs/content.zh/docs/dev/datastream/application_parameters.md
##
@@ -24,28 +24,23 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Handling Application Parameters
+# 应用程序参数处理
 
-
-
-Handling Application Parameters
+应用程序参数处理
 ---
-Almost all Flink applications, both batch and streaming, rely on external 
configuration parameters.
-They are used to specify input and output sources (like paths or addresses), 
system parameters (parallelism, runtime configuration), and application 
specific parameters (typically used within user functions).
+几乎所有的Flink应用程序,也就是批和流程序,都依赖于外部配置参数。这些配置参数指定输入和输出源(如路径或地址),系统参数(并行度,运行时配置)和应用程序特定参数(通常在用户函数中使用)。
 
-Flink provides a simple utility called `ParameterTool` to provide some basic 
tooling for solving these problems.
-Please note that you don't have to use the `ParameterTool` described here. 
Other frameworks such as [Commons 
CLI](https://commons.apache.org/proper/commons-cli/) and
-[argparse4j](http://argparse4j.sourceforge.net/) also work well with Flink.
+Flink提供一个名为 `Parametertool` 的简单实用类,为解决以上问题提供了基本的工具。 这里请注意,此处描述的` 
parametertool` 并不是必须的。[Commons 
CLI](https://commons.apache.org/proper/commons-cli/) 和 
[argparse4j](http://argparse4j.sourceforge.net/)等其他框架也与Flink兼容非常好。

Review comment:
   Thanks for your comment. I will modify it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] yljee commented on pull request #16316: [FLINK-16090] Translate "Table API" page of "Table API & SQL" into Chinese

2021-07-14 Thread GitBox


yljee commented on pull request #16316:
URL: https://github.com/apache/flink/pull/16316#issuecomment-880335533


   @RocMarshal Could you please review my PR when you're free? Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16495: [hotfix][docs][deployment]: update broken download url

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16495:
URL: https://github.com/apache/flink/pull/16495#issuecomment-880293387


   
   ## CI report:
   
   * 55715c7c78725b933ce751cb48d6a2275cf6f880 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20457)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] RollsBean commented on a change in pull request #16490: [FLINK-23241][docs-zh] Translate the page of "Working with State " in…

2021-07-14 Thread GitBox


RollsBean commented on a change in pull request #16490:
URL: https://github.com/apache/flink/pull/16490#discussion_r670059732



##
File path: docs/content.zh/docs/dev/datastream/application_parameters.md
##
@@ -24,28 +24,23 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Handling Application Parameters
+# 应用程序参数处理
 
-
-
-Handling Application Parameters
+应用程序参数处理
 ---
-Almost all Flink applications, both batch and streaming, rely on external 
configuration parameters.
-They are used to specify input and output sources (like paths or addresses), 
system parameters (parallelism, runtime configuration), and application 
specific parameters (typically used within user functions).
+几乎所有的Flink应用程序,也就是批和流程序,都依赖于外部配置参数。这些配置参数指定输入和输出源(如路径或地址),系统参数(并行度,运行时配置)和应用程序特定参数(通常在用户函数中使用)。
 
-Flink provides a simple utility called `ParameterTool` to provide some basic 
tooling for solving these problems.
-Please note that you don't have to use the `ParameterTool` described here. 
Other frameworks such as [Commons 
CLI](https://commons.apache.org/proper/commons-cli/) and
-[argparse4j](http://argparse4j.sourceforge.net/) also work well with Flink.
+Flink提供一个名为 `Parametertool` 的简单实用类,为解决以上问题提供了基本的工具。 这里请注意,此处描述的` 
parametertool` 并不是必须的。[Commons 
CLI](https://commons.apache.org/proper/commons-cli/) 和 
[argparse4j](http://argparse4j.sourceforge.net/)等其他框架也与Flink兼容非常好。

Review comment:
   1. There is usually a space between English and Chinese text, e.g. the beginning of line 33: "Flink 提供...".
   2. In the second sentence, change ` parametertool` to `ParameterTool`.

##
File path: docs/content.zh/docs/dev/datastream/application_parameters.md
##
@@ -58,32 +53,33 @@ ParameterTool parameter = 
ParameterTool.fromPropertiesFile(propertiesFileInputSt
 ```
 
 
- From the command line arguments
+ 配置值来自命令行
+
+该操作从命令行获取像 `--input hdfs:///mydata --elements 42` 的参数。
 
-This allows getting arguments like `--input hdfs:///mydata --elements 42` from 
the command line.
 ```java
 public static void main(String[] args) {
 ParameterTool parameter = ParameterTool.fromArgs(args);
 // .. regular code ..
 ```
 
 
- From system properties
+ 配置值来自系统属性
 
-When starting a JVM, you can pass system properties to it: 
`-Dinput=hdfs:///mydata`. You can also initialize the `ParameterTool` from 
these system properties:
+启动JVM时,可以将系统属性传递给JVM:`-Dinput=hdfs:///mydata`。还可以从这些系统属性初始化 `ParameterTool`:
 
 ```java
 ParameterTool parameter = ParameterTool.fromSystemProperties();
 ```
 
+### Flink程序中使用参数

Review comment:
   Same as above.

##
File path: docs/content.zh/docs/dev/datastream/application_parameters.md
##
@@ -58,32 +53,33 @@ ParameterTool parameter = 
ParameterTool.fromPropertiesFile(propertiesFileInputSt
 ```
 
 
- From the command line arguments
+ 配置值来自命令行
+
+该操作从命令行获取像 `--input hdfs:///mydata --elements 42` 的参数。
 
-This allows getting arguments like `--input hdfs:///mydata --elements 42` from 
the command line.
 ```java
 public static void main(String[] args) {
 ParameterTool parameter = ParameterTool.fromArgs(args);
 // .. regular code ..
 ```
 
 
- From system properties
+ 配置值来自系统属性
 
-When starting a JVM, you can pass system properties to it: 
`-Dinput=hdfs:///mydata`. You can also initialize the `ParameterTool` from 
these system properties:
+启动JVM时,可以将系统属性传递给JVM:`-Dinput=hdfs:///mydata`。还可以从这些系统属性初始化 `ParameterTool`:

Review comment:
   Likewise for "JVM": there should be a space between it and the Chinese text.

##
File path: docs/content.zh/docs/dev/datastream/fault-tolerance/state.md
##
@@ -85,15 +73,9 @@ keyed = words.key_by(lambda row: row[0])
 {{< /tab >}}
 {{< /tabs >}}
 
- Tuple Keys and Expression Keys
+ 元祖键和表达式键
 
-Flink also has two alternative ways of defining keys: tuple keys and expression
-keys in the Java/Scala API(still not supported in the Python API). With this 
you can
-specify keys using tuple field indices or expressions
-for selecting fields of objects. We don't recommend using these today but you
-can refer to the Javadoc of DataStream to learn about them. Using a KeySelector
-function is strictly superior: with Java lambdas they are easy to use and they
-have potentially less overhead at runtime.
+Flink 还有两种定义key的方法:Java/scala API 中的元组键和表达式键(python API 
中仍然不支持)。这样,可以使用元组字段索引或表达式来指定 key,选择对象的字段。我们现在不推荐使用这些,但是可以参考 DataStream 的 
Javadoc 来了解它们。使用 KeySelector 函数是绝对有优势的:结合 java lambda 语法,KeySelector 
易于使用,并且在运行时的开销会更小。

Review comment:
   1. Change "定义key的方法" to "定义 key 的方法".
   2. The second sentence does not read smoothly; wouldn't it be better as "这样,你就可以使用元组字段索引或表达式来指定 key,用于选择对象的字段。"?
   3. In the last sentence, "结合 java lambda 语法,KeySelector 易于使用,并且在运行时的开销会更小。", "Java" should be capitalized, and leaving 
`potentially` untranslated seems to distort the original meaning by implying the overhead is definitely smaller.

##
File path: docs/content.zh/docs/dev/datastream/fault-tolerance/state.md
##
@@ -25,32 +25,20 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Working with State
+# 带有状态的处理
 
-In this section you will learn about the APIs that Flink provides for writing
-stateful programs. Please take a

[jira] [Updated] (FLINK-20321) Get NPE when using AvroDeserializationSchema to deserialize null input

2021-07-14 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-20321:
-
Fix Version/s: (was: 1.12.6)

> Get NPE when using AvroDeserializationSchema to deserialize null input
> --
>
> Key: FLINK-20321
> URL: https://issues.apache.org/jira/browse/FLINK-20321
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Ecosystem
>Affects Versions: 1.12.0
>Reporter: Shengkai Fang
>Assignee: Xue Wang
>Priority: Major
>  Labels: pull-request-available, sprint, starter
> Fix For: 1.12.5
>
>
> You can reproduce the bug by adding the code into the 
> {{AvroDeserializationSchemaTest}}.
> The code follows
> {code:java}
> @Test
>   public void testSpecificRecord2() throws Exception {
>   DeserializationSchema deserializer = 
> AvroDeserializationSchema.forSpecific(Address.class);
>   Address deserializedAddress = deserializer.deserialize(null);
>   assertEquals(null, deserializedAddress);
>   }
> {code}
> Exception stack:
> {code:java}
> java.lang.NullPointerException
>   at 
> org.apache.flink.formats.avro.utils.MutableByteArrayInputStream.setBuffer(MutableByteArrayInputStream.java:43)
>   at 
> org.apache.flink.formats.avro.AvroDeserializationSchema.deserialize(AvroDeserializationSchema.java:131)
>   at 
> org.apache.flink.formats.avro.AvroDeserializationSchemaTest.testSpecificRecord2(AvroDeserializationSchemaTest.java:69)
> {code}
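
A minimal null-guard sketch of the behavior the test above expects (returning null instead of dereferencing a null payload); this is a hypothetical wrapper, not necessarily the patch that was merged:

{code:java}
import java.io.IOException;

// Hypothetical wrapper showing a null-guard around an arbitrary decoder.
public final class NullSafeDeserializer<T> {

    interface Decoder<T> {
        T decode(byte[] bytes) throws IOException;
    }

    private final Decoder<T> inner;

    NullSafeDeserializer(Decoder<T> inner) {
        this.inner = inner;
    }

    T deserialize(byte[] message) throws IOException {
        // A null payload (e.g. a Kafka tombstone) yields a null record
        // instead of reaching the underlying byte-array input stream.
        return message == null ? null : inner.decode(message);
    }

    public static void main(String[] args) throws IOException {
        NullSafeDeserializer<String> d = new NullSafeDeserializer<>(String::new);
        System.out.println(d.deserialize(null));             // null, no NPE
        System.out.println(d.deserialize("abc".getBytes())); // abc
    }
}
{code}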



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21445) Application mode does not set the configuration when building PackagedProgram

2021-07-14 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-21445:
-
Fix Version/s: (was: 1.12.6)
   1.12.5

> Application mode does not set the configuration when building PackagedProgram
> -
>
> Key: FLINK-21445
> URL: https://issues.apache.org/jira/browse/FLINK-21445
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes, Deployment / Scripts, 
> Deployment / YARN
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Yang Wang
>Assignee: Matthias
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.14.0, 1.12.5, 1.13.2
>
>
> Application mode uses {{ClassPathPackagedProgramRetriever}} to create the 
> {{PackagedProgram}}. However, it does not set the configuration. This will 
> cause some client configurations not to take effect. For example, 
> {{classloader.resolve-order}}.
> I think we just forgot to do this, since we have done a similar thing in 
> {{CliFrontend}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-20321) Get NPE when using AvroDeserializationSchema to deserialize null input

2021-07-14 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-20321:
-
Fix Version/s: (was: 1.13.0)
   1.12.5

> Get NPE when using AvroDeserializationSchema to deserialize null input
> --
>
> Key: FLINK-20321
> URL: https://issues.apache.org/jira/browse/FLINK-20321
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Ecosystem
>Affects Versions: 1.12.0
>Reporter: Shengkai Fang
>Assignee: Xue Wang
>Priority: Major
>  Labels: pull-request-available, sprint, starter
> Fix For: 1.12.5, 1.12.6
>
>
> You can reproduce the bug by adding the code into the 
> {{AvroDeserializationSchemaTest}}.
> The code follows
> {code:java}
> @Test
>   public void testSpecificRecord2() throws Exception {
>   DeserializationSchema deserializer = 
> AvroDeserializationSchema.forSpecific(Address.class);
>   Address deserializedAddress = deserializer.deserialize(null);
>   assertEquals(null, deserializedAddress);
>   }
> {code}
> Exception stack:
> {code:java}
> java.lang.NullPointerException
>   at 
> org.apache.flink.formats.avro.utils.MutableByteArrayInputStream.setBuffer(MutableByteArrayInputStream.java:43)
>   at 
> org.apache.flink.formats.avro.AvroDeserializationSchema.deserialize(AvroDeserializationSchema.java:131)
>   at 
> org.apache.flink.formats.avro.AvroDeserializationSchemaTest.testSpecificRecord2(AvroDeserializationSchemaTest.java:69)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] KarmaGYZ commented on pull request #15899: [FLINK-22627][runtime] Remove SlotManagerImpl

2021-07-14 Thread GitBox


KarmaGYZ commented on pull request #15899:
URL: https://github.com/apache/flink/pull/15899#issuecomment-880330331


   Thanks for the review and the awesome cleanup in #16485, @tillrohrmann! The PR 
has been rebased onto the latest master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-23182) Connection leak in RMQSource

2021-07-14 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-23182:
-
Fix Version/s: (was: 1.12.6)
   1.12.5

> Connection leak in RMQSource 
> -
>
> Key: FLINK-23182
> URL: https://issues.apache.org/jira/browse/FLINK-23182
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / RabbitMQ
>Affects Versions: 1.13.1, 1.12.4
>Reporter: Michał Ciesielczyk
>Assignee: Michał Ciesielczyk
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.12.5, 1.13.2
>
>
> The RabbitMQ connection is not closed properly in the RMQSource connector in 
> case of failures. This leads to a connection leak (we lose handles to still-open 
> connections) that will last until the Flink TaskManager is either 
> stopped or crashes.
> The issue is caused by improper resource releasing in open and close methods 
> of RMQSource:
>  - 
> [https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-rabbitmq/src/main/java/org/apache/flink/streaming/connectors/rabbitmq/RMQSource.java#L260]
>  - here the connection is opened, but not closed in case of failure (e.g. 
> caused by invalid queue configuration)
>  - 
> [https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-rabbitmq/src/main/java/org/apache/flink/streaming/connectors/rabbitmq/RMQSource.java#L282]
>  - here the connection might not be closed properly if stopping the consumer 
> causes a failure first
> In both cases, the solution is relatively simple - make sure that 
> connection#close is always called when it should be (failing to close one 
> resource should not prevent other close methods from being called). In open, we 
> can probably close allocated resources silently (as the process did not 
> succeed anyway). In close, we should either throw the first caught 
> exception or the last one, and log all the others as warnings.
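
The pattern described above - always attempt every close, rethrow the first failure, and log the rest - can be sketched generically as follows (a hypothetical helper, not the actual RMQSource code):

{code:java}
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of "one failing close must not skip the rest".
public class SafeCloser {

    // Closes all resources in order; throws the first failure, logs the others.
    public static void closeAll(List<AutoCloseable> resources) throws Exception {
        Exception first = null;
        for (AutoCloseable r : resources) {
            try {
                r.close();
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                } else {
                    System.err.println("Suppressed close failure: " + e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    public static void main(String[] args) throws Exception {
        AutoCloseable consumer = () -> { throw new IllegalStateException("consumer close failed"); };
        AutoCloseable connection = () -> System.out.println("connection closed");
        // The connection is still closed even though the consumer close fails first.
        closeAll(Arrays.asList(consumer, connection));
    }
}
{code}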



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-23184) CompileException Assignment conversion not possible from type "int" to type "short"

2021-07-14 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-23184:
-
Fix Version/s: (was: 1.12.6)
   1.12.5

> CompileException Assignment conversion not possible from type "int" to type 
> "short"
> ---
>
> Key: FLINK-23184
> URL: https://issues.apache.org/jira/browse/FLINK-23184
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.14.0
>Reporter: xiaojin.wy
>Assignee: Caizhi Weng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.12.5, 1.13.2
>
>
> {code:sql}
> CREATE TABLE MySink (
>   `a` SMALLINT
> ) WITH (
>   'connector' = 'filesystem',
>   'format' = 'testcsv',
>   'path' = '$resultPath'
> )
> CREATE TABLE database8_t0 (
>   `c0` SMALLINT
> ) WITH (
>   'connector' = 'filesystem',
>   'format' = 'testcsv',
>   'path' = '$resultPath11'
> )
> CREATE TABLE database8_t1 (
>   `c0` SMALLINT,
>   `c1` TINYINT
> ) WITH (
>   'connector' = 'filesystem',
>   'format' = 'testcsv',
>   'path' = '$resultPath22'
> )
> INSERT OVERWRITE database8_t0(c0) VALUES(cast(22424 as SMALLINT))
> INSERT OVERWRITE database8_t1(c0, c1) VALUES(cast(-17443 as SMALLINT), 
> cast(97 as TINYINT))
> insert into MySink
> SELECT database8_t0.c0 AS ref0 FROM database8_t0, database8_t1 WHERE CAST ((- 
> (database8_t0.c0)) AS BOOLEAN)
> {code}
> After running that, you will get the following errors:
> {code}
> 2021-06-29 19:39:27
> org.apache.flink.runtime.JobException: Recovery is suppressed by 
> NoRestartBackoffTimeStrategy
>   at 
> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:138)
>   at 
> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:82)
>   at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:207)
>   at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:197)
>   at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:188)
>   at 
> org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:677)
>   at 
> org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:79)
>   at 
> org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:440)
>   at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:305)
>   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212)
>   at 
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
>   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
>   at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>   at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>   at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>   at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>   at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>   at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>   at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>   at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>   at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>   at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.RuntimeException: Could not instantiate generated class 
> 'BatchExecCalc$4536'
>   at 
> org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:66)
>   at 
> o

[jira] [Updated] (FLINK-23223) When flushAlways is enabled the subpartition may lose notification of data availability

2021-07-14 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-23223:
-
Fix Version/s: (was: 1.12.6)
   1.12.5

> When flushAlways is enabled the subpartition may lose notification of data 
> availability
> ---
>
> Key: FLINK-23223
> URL: https://issues.apache.org/jira/browse/FLINK-23223
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.11.3, 1.14.0, 1.12.5, 1.13.2
>Reporter: Yun Gao
>Assignee: Yun Gao
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.12.5, 1.13.2
>
>
> When flushAlways is enabled (namely, the buffer timeout is set to 0), there might 
> be cases like:
>  # The subpartition emits an event which blocks the channel.
>  # The subpartition produces more records. However, these records would not be 
> announced since isBlocked = true.
>  # When the downstream tasks resume the subpartition later, the subpartition 
> would only mark isBlocked as false. For local input channels, although it 
> tries to add the channel if isAvailable = true, this check would not pass 
> since flushRequest = false. 
> One case for this issue is https://issues.apache.org/jira/browse/FLINK-22085 
> which uses LocalInputChannel.
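> A minimal sketch of this race (class, field, and method names below are 
> illustrative assumptions, not the actual Flink internals):
> {code:java}
> /** Sketch only: models the subpartition behavior described above. */
> class SubpartitionSketch {
>     private boolean isBlocked;
>     private boolean flushRequested; // stays false when buffer timeout = 0
>     private int backlog;
> 
>     /** Step 2: records added while blocked are never announced. */
>     void add(Object record) {
>         backlog++;
>         if (!isBlocked) {
>             notifyDataAvailable();
>         }
>     }
> 
>     /** Step 3: resuming only clears the flag. */
>     void resumeConsumption() {
>         isBlocked = false;
>         // Missing: if (backlog > 0) notifyDataAvailable();
>     }
> 
>     /** What the local input channel checks when it is resumed. */
>     boolean isAvailable() {
>         return flushRequested; // false here, so the channel is not re-added
>     }
> 
>     void notifyDataAvailable() {
>         // wake up the consuming input channel
>     }
> }
> {code}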



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22443) can not be execute an extreme long sql under batch mode

2021-07-14 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-22443:
-
Fix Version/s: (was: 1.12.6)
   1.12.5

> can not be execute an extreme long sql under batch mode
> ---
>
> Key: FLINK-22443
> URL: https://issues.apache.org/jira/browse/FLINK-22443
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.12.2
> Environment: execute command
>  
> {code:java}
> bin/sql-client.sh embedded -d conf/sql-client-batch.yaml 
> {code}
> Content of conf/sql-client-batch.yaml:
>  
> {code:java}
> catalogs:
> - name: bnpmphive
>   type: hive
>   hive-conf-dir: /home/gum/hive/conf
>   hive-version: 3.1.2
> execution:
>   planner: blink
>   type: batch
>   #type: streaming
>   result-mode: table
>   parallelism: 4
>   max-parallelism: 2000
>   current-catalog: bnpmphive
>   #current-database: snmpprobe 
> #configuration:
> #  table.sql-dialect: hive
> modules:
>   - name: core
>     type: core
>   - name: myhive
>     type: hive
> deployment:
>   # general cluster communication timeout in ms
>   response-timeout: 5000
>   # (optional) address from cluster to gateway
>   gateway-address: ""
>   # (optional) port from cluster to gateway
>   gateway-port: 0
> {code}
>  
>Reporter: macdoor615
>Assignee: Caizhi Weng
>Priority: Blocker
>  Labels: pull-request-available, stale-blocker, stale-critical
> Fix For: 1.14.0, 1.12.5, 1.13.2
>
> Attachments: flink-gum-taskexecutor-8-hb3-prod-hadoop-002.log.4.zip, 
> raw_p_restapi_hcd.csv.zip
>
>
> 1. Execute an extremely long SQL in batch mode:
>  
> {code:java}
> select
> 'CD' product_name,
> r.code business_platform,
> 5 statisticperiod,
> cast('2021-03-24 00:00:00' as timestamp) coltime,
> cast(r1.indicatorvalue as double) as YWPT_ZHQI_CD_038_GZ_2,
> cast(r2.indicatorvalue as double) as YWPT_ZHQI_CD_038_YW_7,
> cast(r3.indicatorvalue as double) as YWPT_ZHQI_CD_038_YW_5,
> cast(r4.indicatorvalue as double) as YWPT_ZHQI_CD_038_YW_6,
> cast(r5.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00029,
> cast(r6.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00028,
> cast(r7.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00015,
> cast(r8.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00014,
> cast(r9.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00011,
> cast(r10.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00010,
> cast(r11.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00013,
> cast(r12.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00012,
> cast(r13.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00027,
> cast(r14.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00026,
> cast(r15.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00046,
> cast(r16.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00047,
> cast(r17.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00049,
> cast(r18.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00048,
> cast(r19.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00024,
> cast(r20.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00025,
> cast(r21.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00022,
> cast(r22.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00023,
> cast(r23.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00054,
> cast(r24.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00055,
> cast(r25.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00033,
> cast(r26.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00032,
> cast(r27.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00053,
> cast(r28.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00052,
> cast(r29.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00051,
> cast(r30.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00050,
> cast(r31.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00043,
> cast(r32.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00042,
> cast(r33.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00017,
> cast(r34.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00016,
> cast(r35.indicatorvalue as double) as YWPT_ZHQI_CD_038_GZ_3,
> cast(r36.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00045,
> cast(r37.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00044,
> cast(r38.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00038,
> cast(r39.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00039,
> cast(r40.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00037,
> cast(r41.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00036,
> cast(r42.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00040,
> cast(r43.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00041,
> cast(r44.indicatorvalue as double) as YWPT_ZHQI_CD_038_XT_00034,
> cast(r45.indicatorvalue as double) as YW

[jira] [Updated] (FLINK-23248) SinkWriter is not closed when failing

2021-07-14 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-23248:
-
Fix Version/s: (was: 1.12.6)
   1.12.5

> SinkWriter is not closed when failing
> -
>
> Key: FLINK-23248
> URL: https://issues.apache.org/jira/browse/FLINK-23248
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.13.1, 1.12.4
>Reporter: Fabian Paul
>Assignee: Fabian Paul
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.12.5, 1.13.2
>
>
> Currently the SinkWriter is only closed when the operator finishes, in 
> `AbstractSinkWriterOperator#close()`, but we must also close the SinkWriter in 
> `AbstractSinkWriterOperator#dispose()` to release any acquired resources 
> when failing.
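> A minimal sketch of the proposed fix (simplified; the operator and field 
> names are assumptions based on the description, not the actual class):
> {code:java}
> import org.apache.flink.api.connector.sink.SinkWriter;
> 
> /** Sketch only, not the real AbstractSinkWriterOperator. */
> abstract class SinkWriterOperatorSketch<InputT> {
>     protected SinkWriter<InputT, ?, ?> sinkWriter;
> 
>     /**
>      * dispose() is the cleanup path taken on failure, so the writer
>      * must be closed here as well, not only in close().
>      */
>     public void dispose() throws Exception {
>         if (sinkWriter != null) {
>             sinkWriter.close(); // release any resources the writer acquired
>         }
>     }
> }
> {code}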
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-23374) Clean up logs produced by code splitter

2021-07-14 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-23374.

Resolution: Fixed

master: f4afbf3e7de19ebcc5cb9324a22ba99fcd354dce

> Clean up logs produced by code splitter
> ---
>
> Key: FLINK-23374
> URL: https://issues.apache.org/jira/browse/FLINK-23374
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Reporter: Caizhi Weng
>Assignee: Caizhi Weng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Currently the logs are full of errors like:
> {code}
> 2021-07-12T20:44:24.0059271Z line 221:17 missing ';' at '('
> 2021-07-12T20:44:24.0063922Z line 221:59 mismatched input ',' expecting ')'
> 2021-07-12T20:44:24.0070238Z line 221:80 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.0076791Z line 223:69 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.0079044Z line 224:33 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.0081798Z line 225:65 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.0088060Z line 226:16 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.0089917Z line 227:76 extraneous input ')' expecting ';'
> 2021-07-12T20:44:24.0792855Z line 385:17 missing ';' at '('
> 2021-07-12T20:44:24.0796518Z line 385:59 mismatched input ',' expecting ')'
> 2021-07-12T20:44:24.0800736Z line 385:80 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.0805884Z line 387:69 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.0808360Z line 388:33 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.0810762Z line 389:65 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.0817301Z line 390:15 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.0818725Z line 391:75 extraneous input ')' expecting ';'
> 2021-07-12T20:44:24.1364946Z line 339:17 missing ';' at '('
> 2021-07-12T20:44:24.1366054Z line 339:59 mismatched input ',' expecting ')'
> 2021-07-12T20:44:24.1373495Z line 339:80 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1378822Z line 341:69 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1380984Z line 342:33 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1385438Z line 343:65 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1388520Z line 344:16 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1392687Z line 345:76 extraneous input ')' expecting ';'
> 2021-07-12T20:44:24.1633870Z line 221:17 missing ';' at '('
> 2021-07-12T20:44:24.1640923Z line 221:59 mismatched input ',' expecting ')'
> 2021-07-12T20:44:24.1647307Z line 221:80 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1654857Z line 223:69 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1659580Z line 224:33 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1664576Z line 225:65 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1669345Z line 226:16 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1672873Z line 227:75 extraneous input ')' expecting ';'
> 2021-07-12T20:44:24.1842542Z line 325:17 missing ';' at '('
> 2021-07-12T20:44:24.1846374Z line 325:59 mismatched input ',' expecting ')'
> 2021-07-12T20:44:24.1852461Z line 325:80 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1867352Z line 327:69 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1872101Z line 328:33 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1876855Z line 329:65 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1888644Z line 330:15 mismatched input ',' expecting ';'
> 2021-07-12T20:44:24.1892271Z line 331:75 extraneous input ')' expecting ';'
> {code}
> This is caused by the code splitter logic and should be cleaned up.
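> One common way to silence such output (the lexer/parser class and rule names 
> below are assumptions; the real splitter code may differ) is to detach 
> ANTLR's default ConsoleErrorListener before parsing:
> {code:java}
> import org.antlr.v4.runtime.CharStreams;
> import org.antlr.v4.runtime.CommonTokenStream;
> 
> static void parseQuietly(String code) {
>     JavaLexer lexer = new JavaLexer(CharStreams.fromString(code));
>     lexer.removeErrorListeners();  // drop the default ConsoleErrorListener
>     JavaParser parser = new JavaParser(new CommonTokenStream(lexer));
>     parser.removeErrorListeners(); // same for the parser
>     parser.compilationUnit();      // parse without console noise
> }
> {code}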



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-12415) Translate "History Server" page into Chinese

2021-07-14 Thread wangzhao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380940#comment-17380940
 ] 

wangzhao commented on FLINK-12415:
--

Hi [~jark], can you assign this issue to me?
I'd like to work on this page.

> Translate "History Server" page into Chinese
> 
>
> Key: FLINK-12415
> URL: https://issues.apache.org/jira/browse/FLINK-12415
> Project: Flink
>  Issue Type: Sub-task
>  Components: chinese-translation, Documentation
>Reporter: Armstrong Nova
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Translate 
> "[https://ci.apache.org/projects/flink/flink-docs-master/monitoring/historyserver.html]";
>  page into Chinese.
> This doc located in "flink/docs/monitoring/historyserver.zh.md"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] JingsongLi merged pull request #16489: [FLINK-23374][table-code-splitter] Clean up logs produced by code splitter

2021-07-14 Thread GitBox


JingsongLi merged pull request #16489:
URL: https://github.com/apache/flink/pull/16489


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] RocMarshal edited a comment on pull request #16348: [FLINK-16093][docs-zh]Translate "System Functions" page of "Functions" into Chinese

2021-07-14 Thread GitBox


RocMarshal edited a comment on pull request #16348:
URL: https://github.com/apache/flink/pull/16348#issuecomment-880319744


   @ZhijieYang Thanks for the update.
   LGTM +1 overall.
   Hi @wuchong, please let us know what you think of the current state.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] RocMarshal commented on pull request #16348: [FLINK-16093][docs-zh]Translate "System Functions" page of "Functions" into Chinese

2021-07-14 Thread GitBox


RocMarshal commented on pull request #16348:
URL: https://github.com/apache/flink/pull/16348#issuecomment-880319744


   @ZhijieYang Thanks for the update.
   LGTM +1 overall.
   Hi @wuchong, please let us know what you think of the current state.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #16495: [hotfix][docs][deployment]: update broken download url

2021-07-14 Thread GitBox


flinkbot edited a comment on pull request #16495:
URL: https://github.com/apache/flink/pull/16495#issuecomment-880293387


   
   ## CI report:
   
   * 55715c7c78725b933ce751cb48d6a2275cf6f880 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20457)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



