[jira] [Commented] (FLINK-34752) "legacy-flink-cdc-sources" Page of TIDB for Flink CDC Chinese Documentation.

2024-04-16 Thread LvYanquan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837975#comment-17837975
 ] 

LvYanquan commented on FLINK-34752:
---

[~siriusfan] Thanks for taking this. I don't have the authority to assign it, 
but you can go ahead and submit a PR.

> "legacy-flink-cdc-sources" Page of TIDB for Flink CDC Chinese Documentation.
> 
>
> Key: FLINK-34752
> URL: https://issues.apache.org/jira/browse/FLINK-34752
> Project: Flink
>  Issue Type: Sub-task
>  Components: chinese-translation, Documentation, Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: LvYanquan
>Priority: Minor
> Fix For: cdc-3.1.0
>
>
> Translate legacy-flink-cdc-sources pages of 
> https://github.com/apache/flink-cdc/blob/master/docs/content/docs/connectors/legacy-flink-cdc-sources/tidb-cdc.md
>  into Chinese.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35112) Membership for Row class does not include field names

2024-04-16 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837965#comment-17837965
 ] 

Dian Fu commented on FLINK-35112:
-

Great (y). Thanks [~wzorgdrager] 

> Membership for Row class does not include field names
> -
>
> Key: FLINK-35112
> URL: https://issues.apache.org/jira/browse/FLINK-35112
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.18.1
>Reporter: Wouter Zorgdrager
>Priority: Minor
>
> In the Row class in PyFlink I cannot do a membership check for field names. 
> This minimal example will show the unexpected behavior:
> ```
> from pyflink.common import Row
> row = Row(name="Alice", age=11)
> # Expected to be True, but is False
> print("name" in row)
> person = Row("name", "age")
> # This is True, as expected
> print('name' in person)
> ```
> The related code in the Row class is:
> ```
>     def __contains__(self, item):
>         return item in self._values
> ```
> It should be relatively easy to fix with the following code:
> ```
>     def __contains__(self, item):
>         if hasattr(self, "_fields"):
>             return item in self._fields
>         else:
>             return item in self._values
> ```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-35089) Two input AbstractStreamOperator may throw NPE when receiving RecordAttributes

2024-04-16 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song closed FLINK-35089.

Fix Version/s: 1.20.0
   1.19.1
   Resolution: Fixed

- master (1.20): 7a90a05e82ddfb3438e611d44fd329858255de6b
- release-1.19: a2c3d27f5dced2ba73307e8230cd07a11b26c401

> Two input AbstractStreamOperator may throw NPE when receiving RecordAttributes
> --
>
> Key: FLINK-35089
> URL: https://issues.apache.org/jira/browse/FLINK-35089
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.19.0
>Reporter: Xuannan Su
>Assignee: Xuannan Su
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0, 1.19.1
>
>
> Currently the `lastRecordAttributes1` and `lastRecordAttributes2` in the 
> `AbstractStreamOperator` are transient. The two fields will be null when the 
> operator is deserialized on the TaskManager, which may cause an NPE.
> To fix it, we will initialize the two fields in the {{setup}} method.
>  
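For illustration, a minimal sketch of the fix described above, assuming a 
placeholder type in place of Flink's RecordAttributes (field names follow the 
issue text; this is not the actual Flink code):

```java
// Transient fields stay null when the operator is deserialized on the
// TaskManager, so field initializers don't help; initialize them in setup().
public abstract class TwoInputOperatorSketch {

    // Stand-ins for Flink's RecordAttributes; transient, hence null after
    // deserialization.
    private transient Object lastRecordAttributes1;
    private transient Object lastRecordAttributes2;

    // Called on the TaskManager before any element is processed.
    protected void setup() {
        lastRecordAttributes1 = defaultAttributes();
        lastRecordAttributes2 = defaultAttributes();
    }

    private static Object defaultAttributes() {
        return new Object(); // stand-in for a default RecordAttributes instance
    }
}
```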



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35089][runtime] Initialize lastRecordAttributes in AbstractStreamOperator during setup [flink]

2024-04-16 Thread via GitHub


xintongsong closed pull request #24655: [FLINK-35089][runtime] Initialize 
lastRecordAttributes in AbstractStreamOperator during setup
URL: https://github.com/apache/flink/pull/24655


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35125][state] Implement ValueState for ForStStateBackend [flink]

2024-04-16 Thread via GitHub


fredia commented on code in PR #24671:
URL: https://github.com/apache/flink/pull/24671#discussion_r1568129118


##
flink-runtime/src/main/java/org/apache/flink/runtime/state/v2/InternalSyncState.java:
##
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state.v2;
+
+import org.apache.flink.annotation.Internal;
+
+import java.io.IOException;
+
+/**
+ * Internal state interface for the stateBackend layer, where state requests 
are executed
+ * synchronously. Unlike the user-facing state interface, {@code 
InternalSyncState}'s interfaces
+ * provide the key directly in the method signature, so there is no need to 
pass the keyContext
+ * information for these interfaces.
+ *
+ * @param <K> Type of the key in the state.
+ */
+@Internal
+public interface InternalSyncState<K> {

Review Comment:
   `SyncState` is a little bit strange; I'm not sure if 
`InternalUnderlyingState` would be better. Maybe we can seek other people’s 
opinions.
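For context, a minimal sketch of the interface shape under discussion; the 
method names here are assumptions for illustration, not the PR code. The key 
appears directly in each signature, so callers never set a key context 
beforehand:

```java
import java.io.IOException;

// Assumed shape of a synchronous internal value state keyed directly by K.
public interface InternalSyncValueStateSketch<K, V> {

    /** Returns the current value for the given key, or null if none. */
    V value(K key) throws IOException;

    /** Updates the value for the given key. */
    void update(K key, V value) throws IOException;
}
```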



##
flink-state-backends/flink-statebackend-forst/src/main/java/org/apache/flink/state/forst/state/AbstractForStState.java:
##
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.state.forst.state;
+
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.core.memory.DataInputDeserializer;
+import org.apache.flink.core.memory.DataOutputSerializer;
+import org.apache.flink.runtime.state.CompositeKeySerializationUtils;
+import org.apache.flink.runtime.state.KeyGroupRangeAssignment;
+import org.apache.flink.runtime.state.SerializedCompositeKeyBuilder;
+import org.apache.flink.runtime.state.v2.InternalSyncState;
+
+import org.rocksdb.ColumnFamilyHandle;
+import org.rocksdb.RocksDB;
+import org.rocksdb.RocksDBException;
+import org.rocksdb.WriteOptions;
+
+import java.io.IOException;
+
+/**
+ * Base class for {@link InternalSyncState} implementations that store state 
in a ForSt database.
+ *
+ * @param <K> Type of the key in the state.
+ * @param <V> Type of the value in the state.
+ */
+public class AbstractForStState<K, V> implements InternalSyncState<K> {
+
+protected final RocksDB db;
+
+protected final ColumnFamilyHandle columnFamily;
+
+private final int maxParallelism;
+
+protected final WriteOptions writeOptions;
+
+protected final TypeSerializer<K> keySerializer;

Review Comment:
   Is `namespaceSerializer` needed here?



##
flink-runtime/src/main/java/org/apache/flink/runtime/state/v2/InternalSyncState.java:
##
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state.v2;
+
+import 

[jira] [Updated] (FLINK-34907) jobRunningTs should be the timestamp that all tasks are running

2024-04-16 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-34907:

Description: 
Currently, we consider the timestamp at which the JobStatus changes to RUNNING 
as jobRunningTs. But the JobStatus becomes RUNNING once the job starts 
scheduling, so it doesn't mean all tasks are running (it doesn't include 
requesting TM resources from kubernetes/yarn, deploying tasks and restoring 
state, and these steps can take a lot of time).

This makes isStabilizing and the estimated restart time inaccurate.

Solution: jobRunningTs should be the timestamp at which all tasks are running.

It can be obtained from the SubtasksTimesHeaders REST API.

  was:
Currently, we consider the timestamp at which the JobStatus changes to RUNNING 
as jobRunningTs. But the JobStatus becomes RUNNING once the job starts 
scheduling, so it doesn't mean all tasks are running. 

This makes isStabilizing and the estimated restart time inaccurate.

Solution: jobRunningTs should be the timestamp at which all tasks are running.

It can be obtained from the SubtasksTimesHeaders REST API.
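For illustration, a minimal sketch of deriving jobRunningTs from that API, 
assuming each map mirrors one subtask's "timestamps" object from 
/jobs/<jobId>/vertices/<vertexId>/subtasktimes (the helper name and data shape 
are assumptions):

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

public final class JobRunningTsSketch {

    // jobRunningTs is the moment the LAST subtask entered RUNNING; empty if
    // any subtask has not reached RUNNING yet.
    static OptionalLong jobRunningTs(List<Map<String, Long>> subtaskTimestamps) {
        long maxRunningTs = Long.MIN_VALUE;
        for (Map<String, Long> timestamps : subtaskTimestamps) {
            Long running = timestamps.get("RUNNING");
            if (running == null || running <= 0) {
                return OptionalLong.empty(); // not all tasks are running
            }
            maxRunningTs = Math.max(maxRunningTs, running);
        }
        return maxRunningTs == Long.MIN_VALUE
                ? OptionalLong.empty()
                : OptionalLong.of(maxRunningTs);
    }
}
```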



> jobRunningTs should be the timestamp that all tasks are running
> ---
>
> Key: FLINK-34907
> URL: https://issues.apache.org/jira/browse/FLINK-34907
> Project: Flink
>  Issue Type: Improvement
>  Components: Autoscaler
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
> Fix For: kubernetes-operator-1.9.0
>
>
> Currently, we consider the timestamp at which the JobStatus changes to 
> RUNNING as jobRunningTs. But the JobStatus becomes RUNNING once the job starts 
> scheduling, so it doesn't mean all tasks are running (it doesn't include 
> requesting TM resources from kubernetes/yarn, deploying tasks and restoring 
> state, and these steps can take a lot of time).
> This makes isStabilizing and the estimated restart time inaccurate.
> Solution: jobRunningTs should be the timestamp at which all tasks are running.
> It can be obtained from the SubtasksTimesHeaders REST API.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35129) Postgres source commits the offset after every multiple checkpoint cycles.

2024-04-16 Thread Hongshun Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837942#comment-17837942
 ] 

Hongshun Wang commented on FLINK-35129:
---

[~m.orazow] , would you like to take it?

> Postgres source commits the offset after every multiple checkpoint cycles.
> --
>
> Key: FLINK-35129
> URL: https://issues.apache.org/jira/browse/FLINK-35129
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Hongshun Wang
>Priority: Major
>
> After entering the Stream phase, the offset consumed by the global slot is 
> committed upon the completion of each checkpoint. This prevents log files 
> from becoming impossible to recycle, which could otherwise lead to 
> insufficient disk space.
> However, the job can then only restart from the latest checkpoint or 
> savepoint; if it is restored from an earlier state, the WAL may already have 
> been recycled.
>  
> The way to solve this is to commit the offset only once every several 
> checkpoint cycles. The number of checkpoint cycles is determined by a 
> connector option, and the default value is 3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35129) Postgres source commits the offset after every multiple checkpoint cycles.

2024-04-16 Thread Hongshun Wang (Jira)
Hongshun Wang created FLINK-35129:
-

 Summary: Postgres source commits the offset after every multiple 
checkpoint cycles.
 Key: FLINK-35129
 URL: https://issues.apache.org/jira/browse/FLINK-35129
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Reporter: Hongshun Wang


After entering the Stream phase, the offset consumed by the global slot is 
committed upon the completion of each checkpoint. This prevents log files from 
becoming impossible to recycle, which could otherwise lead to insufficient disk 
space.

However, the job can then only restart from the latest checkpoint or savepoint; 
if it is restored from an earlier state, the WAL may already have been recycled.

 

The way to solve this is to commit the offset only once every several 
checkpoint cycles. The number of checkpoint cycles is determined by a connector 
option, and the default value is 3.
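For illustration, a minimal sketch of the proposed commit-every-N-checkpoints 
behavior (class and method names are invented for this sketch, not the 
connector's code):

```java
public class DeferredOffsetCommitterSketch {

    private final int commitEveryNCheckpoints; // connector option, default 3
    private long completedCheckpoints;

    DeferredOffsetCommitterSketch(int commitEveryNCheckpoints) {
        this.commitEveryNCheckpoints = commitEveryNCheckpoints;
    }

    // Called from notifyCheckpointComplete: commit the slot offset only every
    // Nth completed checkpoint, so restoring from one of the last N-1
    // checkpoints still finds its WAL position.
    void onCheckpointComplete(long offsetOfCompletedCheckpoint) {
        if (++completedCheckpoints % commitEveryNCheckpoints == 0) {
            commitToPostgres(offsetOfCompletedCheckpoint);
        }
    }

    private void commitToPostgres(long offset) {
        // flush the confirmed offset of the global replication slot here
    }
}
```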



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35025][Runtime/State] Abstract stream operators for async state processing [flink]

2024-04-16 Thread via GitHub


Zakelly commented on code in PR #24657:
URL: https://github.com/apache/flink/pull/24657#discussion_r1568151628


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/asyncprocessing/AsyncStateProcessing.java:
##
@@ -34,19 +34,6 @@ public interface AsyncStateProcessing {
  */
 boolean isAsyncStateProcessingEnabled();
 
-/**
- * Set key context for async state processing.
- *
- * @param record the record.
- * @param keySelector the key selector to select a key from record.
- * @param <T> the type of the record.
- */
-<T> void setAsyncKeyedContextElement(StreamRecord<T> record, 
KeySelector<T, ?> keySelector)
-throws Exception;
-
-/** A callback that will be triggered after an element finishes {@code 
processElement}. */
-void postProcessElement();

Review Comment:
   Well, I think the current commits better demonstrate the evolution of the 
interfaces (introduce and split). But I'll refine this. Thanks for your 
suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35025][Runtime/State] Abstract stream operators for async state processing [flink]

2024-04-16 Thread via GitHub


Zakelly commented on code in PR #24657:
URL: https://github.com/apache/flink/pull/24657#discussion_r1568147623


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/asyncprocessing/AsyncStateProcessingOperator.java:
##
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.runtime.operators.asyncprocessing;
+
+import org.apache.flink.api.java.functions.KeySelector;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.util.function.ThrowingRunnable;
+
+/**
+ * A more detailed interface based on {@link AsyncStateProcessing}, which 
gives the essential
+ * methods for an operator to perform async state processing.
+ */
+public interface AsyncStateProcessingOperator extends AsyncStateProcessing {
+
+/** Get the {@link ElementOrder} of this operator. */
+ElementOrder getElementOrder();
+
+/**
+ * Set key context for async state processing.
+ *
+ * @param record the record.
+ * @param keySelector the key selector to select a key from record.
+ * @param <T> the type of the record.
+ */
+<T> void setAsyncKeyedContextElement(StreamRecord<T> record, 
KeySelector<T, ?> keySelector)
+throws Exception;
+
+/** A callback that will be triggered after an element finishes {@code 
processElement}. */
+void postProcessElement();

Review Comment:
   `setAsyncKeyedContextElement` is borrowed from `setKeyContextElement`. Given 
that the stream record and key selector should be passed in as parameters, and 
the method only sets context, I'd keep the name.
   
   `postProcessElement` is designed to be a callback, which is more 
understandable from the caller's perspective. For now it just releases the 
context, but more might be added (watermark emission or something). I also want 
to introduce `preProcessElement`, but it seems unused currently. It will be 
introduced when needed. 
   
   WDYT?
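For readers following the thread, a small sketch of the intended call pattern, 
assuming the interfaces from this PR are on the classpath (the wrapper itself 
is invented for illustration):

```java
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
import org.apache.flink.util.function.ThrowingRunnable;

public final class AsyncProcessingCallSketch {

    // Set the key context, process the element, then fire the post-callback,
    // which currently releases the record context (more may be added later).
    static <T> void processOneElement(
            AsyncStateProcessingOperator operator,
            StreamRecord<T> record,
            KeySelector<T, ?> keySelector,
            ThrowingRunnable<Exception> processElement)
            throws Exception {
        operator.setAsyncKeyedContextElement(record, keySelector);
        try {
            processElement.run();
        } finally {
            operator.postProcessElement();
        }
    }
}
```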



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34987][state] Introduce Internal State for Async State API [flink]

2024-04-16 Thread via GitHub


masteryhx commented on PR #24651:
URL: https://github.com/apache/flink/pull/24651#issuecomment-2060251285

   Rebased master.
   I will merge it after the CI is green. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35092) Add integrated test for Doris / Starrocks sink pipeline connector

2024-04-16 Thread Xiqian YU (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiqian YU updated FLINK-35092:
--
Summary: Add integrated test for Doris / Starrocks sink pipeline connector  
(was: Add integrated test for Doris sink pipeline connector)

> Add integrated test for Doris / Starrocks sink pipeline connector
> -
>
> Key: FLINK-35092
> URL: https://issues.apache.org/jira/browse/FLINK-35092
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Xiqian YU
>Priority: Minor
>
> Currently, no integrated tests are applied to the Doris pipeline connector 
> (there's only one DorisRowConverterTest case for now). Adding IT cases would 
> improve the Doris connector's code quality and reliability.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35071) Remove dependency on flink-shaded from cdc source connector

2024-04-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35071:
---
Labels: pull-request-available  (was: )

> Remove dependency on flink-shaded from cdc source connector
> ---
>
> Key: FLINK-35071
> URL: https://issues.apache.org/jira/browse/FLINK-35071
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Hongshun Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
>
> Like what flink-connector-aws did in FLINK-32208, the flink cdc source 
> connectors depend on flink-shaded-guava. With the externalization of 
> connectors, connectors shouldn't rely on Flink-Shaded but should instead shade 
> such dependencies themselves.
>  
> Moreover, flink-shaded-guava will be included in the final jar package, which 
> may cause dependency conflicts.
> Here is why the dependency conflict occurs:
>  * Since Flink 1.18, the guava version has been upgraded from 30 to 31 (using 
> 31.1-jre-17.0)
> [image: dependency tree screenshot showing guava 31.1-jre-17.0]
>  * Since CDC 3.0.1, the guava version has also been upgraded from 30 to 31 
> (using 31.1-jre-17.0)
>  * So CDC 3.0.1 is compatible with Flink 1.18, and CDC 2.x is compatible with 
> Flink 1.13-1.17; CDC 2.x conflicts with Flink 1.18, and CDC 3.0.1 conflicts 
> with Flink 1.17 (or earlier versions).
> The fix is to relocate guava with the maven-shade-plugin 
> ([https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#relocations])
>  in pom.xml.
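For reference, a minimal maven-shade-plugin relocation sketch; the shaded 
package prefix and plugin wiring here are assumptions for illustration, not the 
actual flink-cdc pom:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- bundle guava under a connector-private package -->
            <pattern>com.google.common</pattern>
            <shadedPattern>org.apache.flink.cdc.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```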



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35071][cdc-connector][cdc-base] Shade guava31 to avoid dependency conflict with flink below 1.18 [flink-cdc]

2024-04-16 Thread via GitHub


loserwang1024 commented on PR #3083:
URL: https://github.com/apache/flink-cdc/pull/3083#issuecomment-2060246663

   @PatrickRen , CC?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34908][pipeline-connector][doris] Fix Mysql pipeline to doris will lost precision for timestamp [flink-cdc]

2024-04-16 Thread via GitHub


yuxiqian commented on code in PR #3207:
URL: https://github.com/apache/flink-cdc/pull/3207#discussion_r1568134905


##
flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-doris/src/main/java/org/apache/flink/cdc/connectors/doris/sink/DorisRowConverter.java:
##
@@ -103,7 +103,8 @@ static SerializationConverter 
createExternalConverter(DataType type, ZoneId pipe
 return (index, val) ->
 val.getTimestamp(index, 
DataTypeChecks.getPrecision(type))
 .toLocalDateTime()
-
.format(DorisEventSerializer.DATE_TIME_FORMATTER);
+.toString()
+.replace('T', ' ');

Review Comment:
   What about fixing `DorisEventSerializer.DATE_TIME_FORMATTER` instead of 
doing string-level operations? It should make code clearer and more efficient.
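For illustration, a sketch of that suggestion with an assumed formatter pattern 
(the actual precision handling is up to the PR):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

public final class DorisDateTimeFormatSketch {

    // Space-separated date-time; fractional seconds (up to 9 digits) are only
    // printed when the value actually has sub-second precision.
    static final DateTimeFormatter DATE_TIME_FORMATTER =
            new DateTimeFormatterBuilder()
                    .appendPattern("yyyy-MM-dd HH:mm:ss")
                    .appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, true)
                    .toFormatter();

    public static void main(String[] args) {
        // Prints "2024-04-16 12:34:56.123456": no 'T', precision preserved.
        System.out.println(
                LocalDateTime.of(2024, 4, 16, 12, 34, 56, 123_456_000)
                        .format(DATE_TIME_FORMATTER));
    }
}
```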



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35128][cdc-connector][cdc-base] Re-calculate the starting changelog offset after the new table added [flink-cdc]

2024-04-16 Thread via GitHub


loserwang1024 commented on PR #3230:
URL: https://github.com/apache/flink-cdc/pull/3230#issuecomment-2060235695

   @PatrickRen , @morazow , CC


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35128) Re-calculate the starting change log offset after the new table added

2024-04-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35128:
---
Labels: pull-request-available  (was: )

> Re-calculate the starting change log offset after the new table added
> -
>
> Key: FLINK-35128
> URL: https://issues.apache.org/jira/browse/FLINK-35128
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Hongshun Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
>
> In MySQL CDC, the starting binlog offset is re-calculated after a new table is 
> added in MySqlBinlogSplit#appendFinishedSplitInfos, while the same action is 
> missing in StreamSplit#appendFinishedSplitInfos. This will cause data loss if 
> any newly added table's snapshot split has a smaller high watermark.
>  
> Some unstable test problems occur because of it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-35128][cdc-connector][cdc-base] Re-calculate the starting changelog offset after the new table added [flink-cdc]

2024-04-16 Thread via GitHub


loserwang1024 opened a new pull request, #3230:
URL: https://github.com/apache/flink-cdc/pull/3230

   In MySQL CDC, the starting binlog offset is re-calculated after a new table 
is added in MySqlBinlogSplit#appendFinishedSplitInfos, while the same action is 
missing in StreamSplit#appendFinishedSplitInfos. This will cause data loss if 
any newly added table's snapshot split has a smaller high watermark.
   
   Some unstable test problems occur because of it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (FLINK-35028) Timer firing under async execution model

2024-04-16 Thread Yanfei Lei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanfei Lei reassigned FLINK-35028:
--

Assignee: Yanfei Lei

> Timer firing under async execution model
> 
>
> Key: FLINK-35028
> URL: https://issues.apache.org/jira/browse/FLINK-35028
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends, Runtime / Task
>Reporter: Yanfei Lei
>Assignee: Yanfei Lei
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-35079) MongoConnector failed to resume token when current collection removed

2024-04-16 Thread Jiabao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiabao Sun resolved FLINK-35079.

Fix Version/s: cdc-3.1.0
   Resolution: Fixed

resolved via cdc master: 0562e35da75fb2c8e512d438adb8f80a87964dc4

> MongoConnector failed to resume token when current collection removed
> -
>
> Key: FLINK-35079
> URL: https://issues.apache.org/jira/browse/FLINK-35079
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Reporter: Xiqian YU
>Assignee: Xiqian YU
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.1.0
>
>
> When the connector tries to create a cursor with an expired resume token 
> during the stream task fetching stage, the MongoDB connector will crash with a 
> message like: "error due to Command failed with error 280 
> (ChangeStreamFatalError): 'cannot resume stream; the resume token was not 
> found."
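For illustration, a sketch of the fallback with the MongoDB Java driver; the 
error-code check and the point at which cursor creation surfaces the error are 
assumptions based on the message above, not the connector's actual code:

```java
import com.mongodb.MongoCommandException;
import com.mongodb.client.MongoChangeStreamCursor;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.changestream.ChangeStreamDocument;
import org.bson.BsonDocument;
import org.bson.BsonTimestamp;
import org.bson.Document;

public final class ResumeTokenFallbackSketch {

    // Resume from the saved token; on error 280 (ChangeStreamFatalError,
    // "resume token was not found") fall back to timestamp startup mode.
    static MongoChangeStreamCursor<ChangeStreamDocument<Document>> openStream(
            MongoCollection<Document> collection,
            BsonDocument resumeToken,
            BsonTimestamp lastCommittedTime) {
        try {
            return collection.watch().resumeAfter(resumeToken).cursor();
        } catch (MongoCommandException e) {
            if (e.getErrorCode() == 280) {
                return collection.watch()
                        .startAtOperationTime(lastCommittedTime)
                        .cursor();
            }
            throw e;
        }
    }
}
```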



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35079] Fallback to timestamp startup mode when resume token has expired [flink-cdc]

2024-04-16 Thread via GitHub


Jiabao-Sun merged PR #3221:
URL: https://github.com/apache/flink-cdc/pull/3221


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (FLINK-35046) Introduce New KeyedStateBackend related Async interfaces

2024-04-16 Thread Hangxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hangxiang Yu resolved FLINK-35046.
--
Fix Version/s: 1.20.0
   Resolution: Fixed

merged 691cc671 into master.

> Introduce New KeyedStateBackend related Async interfaces
> 
>
> Key: FLINK-35046
> URL: https://issues.apache.org/jira/browse/FLINK-35046
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Hangxiang Yu
>Assignee: Hangxiang Yu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> Since we have introduced the new State API, async versions of some classes 
> should be introduced to support it, e.g. AsyncKeyedStateBackend and a new 
> State Descriptor.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35046][state] Introduce AsyncKeyedStateBackend supporting to create StateExecutor [flink]

2024-04-16 Thread via GitHub


masteryhx closed pull request #24663: [FLINK-35046][state] Introduce 
AsyncKeyedStateBackend supporting to create StateExecutor
URL: https://github.com/apache/flink/pull/24663


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35128) Re-calculate the starting change log offset after the new table added

2024-04-16 Thread Hongshun Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongshun Wang updated FLINK-35128:
--
Description: 
In MySQL CDC, the starting binlog offset is re-calculated after a new table is 
added in MySqlBinlogSplit#appendFinishedSplitInfos, while the same action is 
missing in StreamSplit#appendFinishedSplitInfos. This will cause data loss if 
any newly added table's snapshot split has a smaller high watermark.

 

Some unstable test problems occur because of it.

  was:In MySQL CDC, the starting binlog offset is re-calculated after a new 
table is added in MySqlBinlogSplit#appendFinishedSplitInfos, while the same 
action is missing in StreamSplit#appendFinishedSplitInfos. This will cause data 
loss if any newly added table's snapshot split has a smaller high watermark.


> Re-calculate the starting change log offset after the new table added
> -
>
> Key: FLINK-35128
> URL: https://issues.apache.org/jira/browse/FLINK-35128
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Hongshun Wang
>Priority: Major
> Fix For: 3.1.0
>
>
> In MySQL CDC, the starting binlog offset is re-calculated after a new table is 
> added in MySqlBinlogSplit#appendFinishedSplitInfos, while the same action is 
> missing in StreamSplit#appendFinishedSplitInfos. This will cause data loss if 
> any newly added table's snapshot split has a smaller high watermark.
>  
> Some unstable test problems occur because of it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35128) Re-calculate the starting change log offset after the new table added

2024-04-16 Thread Hongshun Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongshun Wang updated FLINK-35128:
--
Summary: Re-calculate the starting change log offset after the new table 
added  (was: Re-calculate the starting binlog offset after the new table added)

> Re-calculate the starting change log offset after the new table added
> -
>
> Key: FLINK-35128
> URL: https://issues.apache.org/jira/browse/FLINK-35128
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Hongshun Wang
>Priority: Major
> Fix For: 3.1.0
>
>
> In MySQL CDC, the starting binlog offset is re-calculated after a new table is 
> added in MySqlBinlogSplit#appendFinishedSplitInfos, while the same action is 
> missing in StreamSplit#appendFinishedSplitInfos. This will cause data loss if 
> any newly added table's snapshot split has a smaller high watermark.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35128) Re-calculate the starting binlog offset after the new table added

2024-04-16 Thread Hongshun Wang (Jira)
Hongshun Wang created FLINK-35128:
-

 Summary: Re-calculate the starting binlog offset after the new 
table added
 Key: FLINK-35128
 URL: https://issues.apache.org/jira/browse/FLINK-35128
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Reporter: Hongshun Wang
 Fix For: 3.1.0


In MySQL CDC, the starting binlog offset is re-calculated after a new table is 
added in MySqlBinlogSplit#appendFinishedSplitInfos, while the same action is 
missing in StreamSplit#appendFinishedSplitInfos. This will cause data loss if 
any newly added table's snapshot split has a smaller high watermark.
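For illustration, a minimal sketch of the re-calculation (generic over any 
comparable offset type; the helper is invented for this sketch, not the 
flink-cdc code):

```java
import java.util.List;

public final class StartingOffsetSketch {

    // The stream split must start from the smallest high watermark across all
    // finished snapshot splits; otherwise changes between a newly added
    // table's high watermark and the current stream offset are lost.
    static <O extends Comparable<O>> O recalculateStartingOffset(
            O currentStartingOffset, List<O> finishedSplitHighWatermarks) {
        O min = currentStartingOffset;
        for (O highWatermark : finishedSplitHighWatermarks) {
            if (highWatermark.compareTo(min) < 0) {
                min = highWatermark;
            }
        }
        return min;
    }
}
```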



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35127) CDC ValuesDataSourceITCase crashed due to OutOfMemoryError

2024-04-16 Thread Jiabao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837929#comment-17837929
 ] 

Jiabao Sun commented on FLINK-35127:


Hi [~kunni],
Could you help take a look?

> CDC ValuesDataSourceITCase crashed due to OutOfMemoryError
> --
>
> Key: FLINK-35127
> URL: https://issues.apache.org/jira/browse/FLINK-35127
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Reporter: Jiabao Sun
>Priority: Major
>  Labels: test-stability
> Fix For: cdc-3.1.0
>
>
> {code}
> [INFO] Running 
> org.apache.flink.cdc.connectors.values.source.ValuesDataSourceITCase
> Error: Exception in thread "surefire-forkedjvm-command-thread" 
> java.lang.OutOfMemoryError: Java heap space
> Error:  
> Error:  Exception: java.lang.OutOfMemoryError thrown from the 
> UncaughtExceptionHandler in thread "taskmanager_4-main-scheduler-thread-2"
> Error:  
> Error:  Exception: java.lang.OutOfMemoryError thrown from the 
> UncaughtExceptionHandler in thread "System Time Trigger for Source: values 
> (1/4)#0"
> {code}
> https://github.com/apache/flink-cdc/actions/runs/8698450229/job/23858750352?pr=3221#step:6:1949



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35127) CDC ValuesDataSourceITCase crashed due to OutOfMemoryError

2024-04-16 Thread Jiabao Sun (Jira)
Jiabao Sun created FLINK-35127:
--

 Summary: CDC ValuesDataSourceITCase crashed due to OutOfMemoryError
 Key: FLINK-35127
 URL: https://issues.apache.org/jira/browse/FLINK-35127
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Reporter: Jiabao Sun
 Fix For: cdc-3.1.0


{code}
[INFO] Running 
org.apache.flink.cdc.connectors.values.source.ValuesDataSourceITCase
Error: Exception in thread "surefire-forkedjvm-command-thread" 
java.lang.OutOfMemoryError: Java heap space
Error:  
Error:  Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "taskmanager_4-main-scheduler-thread-2"
Error:  
Error:  Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "System Time Trigger for Source: values 
(1/4)#0"
{code}

https://github.com/apache/flink-cdc/actions/runs/8698450229/job/23858750352?pr=3221#step:6:1949




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [docs]Problem with Case Document Format in Quickstart [flink-cdc]

2024-04-16 Thread via GitHub


ZmmBigdata opened a new pull request, #3229:
URL: https://github.com/apache/flink-cdc/pull/3229

   There is an issue with the format of three shell scripts
   
![cf9c895cc2a90e65946ffd620217f48](https://github.com/apache/flink-cdc/assets/102840730/35bdf1d9-cbeb-404a-8eed-4330aad445b9)
   
![cc29af05deb1f5d2ddc47d46121de42](https://github.com/apache/flink-cdc/assets/102840730/4d398026-191f-4971-99b9-e0bee5606fcb)
   
![498cef1ec5bc541d790643753af0874](https://github.com/apache/flink-cdc/assets/102840730/f013f71d-5cae-4b92-99dc-05d910deeb26)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35025][Runtime/State] Abstract stream operators for async state processing [flink]

2024-04-16 Thread via GitHub


yunfengzhou-hub commented on code in PR #24657:
URL: https://github.com/apache/flink/pull/24657#discussion_r1567131634


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/asyncprocessing/AsyncStateProcessingOperator.java:
##
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.runtime.operators.asyncprocessing;
+
+import org.apache.flink.api.java.functions.KeySelector;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.util.function.ThrowingRunnable;
+
+/**
+ * A more detailed interface based on {@link AsyncStateProcessing}, which 
gives the essential
+ * methods for an operator to perform async state processing.
+ */
+public interface AsyncStateProcessingOperator extends AsyncStateProcessing {
+
+/** Get the {@link ElementOrder} of this operator. */
+ElementOrder getElementOrder();
+
+/**
+ * Set key context for async state processing.
+ *
+ * @param record the record.
+ * @param keySelector the key selector to select a key from record.
+ * @param <T> the type of the record.
+ */
+<T> void setAsyncKeyedContextElement(StreamRecord<T> record, 
KeySelector<T, ?> keySelector)
+throws Exception;
+
+/** A callback that will be triggered after an element finishes {@code 
processElement}. */
+void postProcessElement();

Review Comment:
   It might be better to make the name of the two methods above symmetric. For 
example, setXXXContext + releaseXXXContext, or preProcessElement + 
postProcessElement.



##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/asyncprocessing/AsyncStateProcessing.java:
##
@@ -34,19 +34,6 @@ public interface AsyncStateProcessing {
  */
 boolean isAsyncStateProcessingEnabled();
 
-/**
- * Set key context for async state processing.
- *
- * @param record the record.
- * @param keySelector the key selector to select a key from record.
- * @param <T> the type of the record.
- */
-<T> void setAsyncKeyedContextElement(StreamRecord<T> record, 
KeySelector<T, ?> keySelector)
-throws Exception;
-
-/** A callback that will be triggered after an element finishes {@code 
processElement}. */
-void postProcessElement();

Review Comment:
   Let's refine the commits to avoid adding and removing a method in the same 
PR.



##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/asyncprocessing/AbstractAsyncStateStreamOperatorV2.java:
##
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.runtime.operators.asyncprocessing;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.operators.MailboxExecutor;
+import org.apache.flink.api.java.functions.KeySelector;
+import org.apache.flink.runtime.asyncprocessing.AsyncExecutionController;
+import org.apache.flink.runtime.asyncprocessing.RecordContext;
+import org.apache.flink.streaming.api.operators.AbstractStreamOperatorV2;
+import org.apache.flink.streaming.api.operators.StreamOperatorParameters;
+import org.apache.flink.streaming.api.operators.StreamTaskStateInitializer;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import 

Re: [PR] [FLINK-34444] Initial implementation of JM operator metric rest api [flink]

2024-04-16 Thread via GitHub


mas-chen commented on code in PR #24564:
URL: https://github.com/apache/flink/pull/24564#discussion_r1568044466


##
docs/static/generated/rest_v1_dispatcher.yml:
##
@@ -1089,6 +1089,37 @@ paths:
 application/json:
   schema:
 $ref: '#/components/schemas/JobVertexBackPressureInfo'
+  /jobs/{jobid}/vertices/{vertexid}/coordinator-metrics:
+get:
+  description: Provides access to job manager operator metrics

Review Comment:
   Yes it should, but my question is more about the scope of the endpoint and 
this feature.
   
   If a user wants other JM operator metrics in the future, what would happen?
   1. We would need to support another endpoint (possibly deprecate this one 
and create a new one that supports all JM operator metrics)
   2. Integrate with the Flink UI again
   
   Also, the feedback from @zentol and @lindong28 in [1] reflects that the devs 
were against `coordinator-metrics` and introducing a specific scope to support 
that, and perhaps that feedback also applies to this REST API. I had this as my 
original proposal in the voting thread for the FLIP to support 
`/jobs/.../vertices/.../jm-operator-metrics`. The endpoint path doesn't change 
anything fundamentally in this PR
   
   [1] https://lists.apache.org/thread/hdvz57wmlqxdxgnw73tcgbcfyo6vk09h



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34444] Initial implementation of JM operator metric rest api [flink]

2024-04-16 Thread via GitHub


mas-chen commented on code in PR #24564:
URL: https://github.com/apache/flink/pull/24564#discussion_r1568044466


##
docs/static/generated/rest_v1_dispatcher.yml:
##
@@ -1089,6 +1089,37 @@ paths:
 application/json:
   schema:
 $ref: '#/components/schemas/JobVertexBackPressureInfo'
+  /jobs/{jobid}/vertices/{vertexid}/coordinator-metrics:
+get:
+  description: Provides access to job manager operator metrics

Review Comment:
   Yes it should, but my question is more about the scope of the endpoint and 
this feature.
   
   If a user wants other JM operator metrics in the future, what would happen?
   1. We would need to support another endpoint (possibly deprecate this one 
and create a new one that supports all JM operator metrics)
   2. Integrate with the Flink UI again
   
   Also, the feedback from @zentol and @lindong28 in [1] reflects that the devs 
were against `coordinator-metrics` and introducing a specific scope to support 
that, and perhaps that feedback also applies to this REST API. I had this as my 
original proposal in the voting thread for the FLIP to support 
`/jobs/.../vertices/.../jm-operator-metrics`
   
   [1] https://lists.apache.org/thread/hdvz57wmlqxdxgnw73tcgbcfyo6vk09h



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35116] Bump operator sdk version to 4.8.3 [flink-kubernetes-operator]

2024-04-16 Thread via GitHub


gyfora commented on PR #816:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/816#issuecomment-2059797010

   Together with the version bump, this workaround can also be removed I 
believe: 
https://github.com/apache/flink-kubernetes-operator/commit/726e484c6a9b4121563829bc094b3eebeb8ddcf3
 (cc @csviri )


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-31966) Flink Kubernetes operator lacks TLS support

2024-04-16 Thread Adrian Vasiliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Vasiliu updated FLINK-31966:
---
Attachment: (was: image-2024-04-16-16-33-39-644.png)

> Flink Kubernetes operator lacks TLS support 
> 
>
> Key: FLINK-31966
> URL: https://issues.apache.org/jira/browse/FLINK-31966
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Adrian Vasiliu
>Assignee: Tony Garrard
>Priority: Major
> Fix For: kubernetes-operator-1.8.0
>
>
> *Summary*
> The Flink Kubernetes operator lacks support inside the FlinkDeployment 
> operand for configuring Flink with TLS (both one-way and mutual) for the 
> internal communication between jobmanagers and taskmanagers, and for the 
> external REST endpoint. Although a workaround exists to configure the job and 
> task managers, this breaks the operator and renders it unable to reconcile.
> *Additional information*
>  * The Apache Flink operator supports passing through custom flink 
> configuration to be applied to job and task managers.
>  * If you supply SSL-based properties, the operator can no longer speak to 
> the deployed job manager. The operator is reading the flink conf and using it 
> to create a connection to the job manager REST endpoint, but it uses the 
> truststore file paths within flink-conf.yaml, which are unresolvable from the 
> operator. This leaves the operator hanging in a pending state as it cannot 
> complete a reconcile.
> *Proposal*
> Our proposal is to make changes to the operator code. A simple change exists 
> that would be enough to enable anonymous SSL at the REST endpoint, but more 
> invasive changes would be required to enable full mTLS throughout.
> The simple change to enable anonymous SSL would be for the operator to parse 
> flink-conf and podTemplate to identify the Kubernetes resource that contains 
> the certificate from the job manager keystore and use it inside the 
> operator’s trust store.
> In the case of mutual TLS, further changes are required: the operator would 
> need to generate a certificate signed by the same issuing authority as the 
> job manager’s certificates and then use it in a keystore when challenged by 
> that job manager. We propose that the operator becomes responsible for making 
> CertificateSigningRequests to generate certificates for job manager, task 
> manager and operator. The operator can then coordinate deploying the job and 
> task managers with the correct flink-conf and volume mounts. This would also 
> work for anonymous SSL.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34961] Use dedicated CI name for HBase connector to differentiate it in infra-reports [flink-connector-hbase]

2024-04-16 Thread via GitHub


snuyanzin merged PR #45:
URL: https://github.com/apache/flink-connector-hbase/pull/45


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-35126) Improve checkpoint progress health check config and enable by default

2024-04-16 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-35126:
--

 Summary: Improve checkpoint progress health check config and 
enable by default
 Key: FLINK-35126
 URL: https://issues.apache.org/jira/browse/FLINK-35126
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Gyula Fora


Currently the checkpoint progress health check window is configured as a 
Duration. This makes it hard to enable by default, as the sensible interval 
depends on the checkpoint interval.

We should rework the config and add an alternative checkpoint interval 
multiplier based config, which could be set by default to 3 (the default window 
would then be 3x the checkpoint interval).

If checkpointing is not enabled in the config, the health check would of course 
be disabled.
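For illustration, a small sketch of the proposed derivation (the helper and 
option names are assumptions, not the operator's config keys):

```java
import java.time.Duration;

public final class HealthCheckWindowSketch {

    // Window derived from the checkpoint interval instead of a fixed Duration;
    // a multiplier default of 3 yields "3x checkpoint interval".
    static Duration healthCheckWindow(Duration checkpointInterval, long multiplier) {
        return checkpointInterval.multipliedBy(multiplier);
    }

    public static void main(String[] args) {
        // e.g. 5-minute checkpoints -> 15-minute no-progress window
        System.out.println(healthCheckWindow(Duration.ofMinutes(5), 3));
    }
}
```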



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34653) Support table merging with route in Flink CDC

2024-04-16 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837752#comment-17837752
 ] 

Leonard Xu commented on FLINK-34653:


master via 6017b165289d8e6f40db396cd07c62285be7fca9

> Support table merging with route in Flink CDC
> -
>
> Key: FLINK-34653
> URL: https://issues.apache.org/jira/browse/FLINK-34653
> Project: Flink
>  Issue Type: New Feature
>  Components: Flink CDC
>Reporter: Qingsheng Ren
>Assignee: Qingsheng Ren
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.1.0
>
>
> Currently route in Flink CDC only supports very simple table id replacing. It 
> should support more complex table merging strategies. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34653) Support table merging with route in Flink CDC

2024-04-16 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-34653:
---
Release Note:   (was: master via 6017b165289d8e6f40db396cd07c62285be7fca9)

> Support table merging with route in Flink CDC
> -
>
> Key: FLINK-34653
> URL: https://issues.apache.org/jira/browse/FLINK-34653
> Project: Flink
>  Issue Type: New Feature
>  Components: Flink CDC
>Reporter: Qingsheng Ren
>Assignee: Qingsheng Ren
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.1.0
>
>
> Currently route in Flink CDC only supports very simple table id replacing. It 
> should support more complex table merging strategies. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-34653) Support table merging with route in Flink CDC

2024-04-16 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu closed FLINK-34653.
--
Release Note: master via 6017b165289d8e6f40db396cd07c62285be7fca9
  Resolution: Fixed

> Support table merging with route in Flink CDC
> -
>
> Key: FLINK-34653
> URL: https://issues.apache.org/jira/browse/FLINK-34653
> Project: Flink
>  Issue Type: New Feature
>  Components: Flink CDC
>Reporter: Qingsheng Ren
>Assignee: Qingsheng Ren
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.1.0
>
>
> Currently route in Flink CDC only supports very simple table id replacing. It 
> should support more complex table merging strategies. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34653] Support table merging with route [flink-cdc]

2024-04-16 Thread via GitHub


leonardBang merged PR #3129:
URL: https://github.com/apache/flink-cdc/pull/3129


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [e2e] add pipeline e2e test for mysql connector. [flink-cdc]

2024-04-16 Thread via GitHub


leonardBang commented on PR #2997:
URL: https://github.com/apache/flink-cdc/pull/2997#issuecomment-2059264100

   @lvyanquan kindly ping 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34648] add waitChangeResultRequest and WaitChangeResultResponse to avoid RPC timeout. [flink-cdc]

2024-04-16 Thread via GitHub


leonardBang commented on code in PR #3128:
URL: https://github.com/apache/flink-cdc/pull/3128#discussion_r1567483688


##
flink-cdc-common/src/main/java/org/apache/flink/cdc/common/pipeline/PipelineOptions.java:
##
@@ -85,5 +89,12 @@ public class PipelineOptions {
 .withDescription(
 "The unique ID for schema operator. This ID will 
be used for inter-operator communications and must be unique across 
operators.");
 
+public static final ConfigOption<Duration> 
PIPELINE_SCHEMA_OPERATOR_RPC_TIMEOUT =
+ConfigOptions.key("schema-operator.rpc-timeout")
+.durationType()
+
.defaultValue(Duration.ofSeconds(SCHEMA_OPERATOR_RPC_TIMEOUT_SECOND_DEFAULT))
+.withDescription(
+"The timeout time for SchemaOperator to wait for 
schema change. the default value is 3 min.");

Review Comment:
   ```suggestion
   "The timeout time for SchemaOperator to wait for 
downstream SchemaChangeEvent applying to finish; the default value is 3 
minutes.");
   ```



##
flink-cdc-common/src/main/java/org/apache/flink/cdc/common/pipeline/PipelineOptions.java:
##
@@ -23,12 +23,16 @@
 import org.apache.flink.cdc.common.configuration.description.Description;
 import org.apache.flink.cdc.common.configuration.description.ListElement;
 
+import java.time.Duration;
+
 import static 
org.apache.flink.cdc.common.configuration.description.TextElement.text;
 
 /** Predefined pipeline configuration options. */
 @PublicEvolving
 public class PipelineOptions {
 
+public static final Integer SCHEMA_OPERATOR_RPC_TIMEOUT_SECOND_DEFAULT = 3 
* 60;

Review Comment:
   ```suggestion
   public static final Duration DEFAULT_SCHEMA_OPERATOR_RPC_TIMEOUT = 
Duration.ofMinutes(3);
   ```
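   For illustration, a minimal sketch of how the new option would be read at 
runtime, assuming the Duration-typed definition from the suggestions above 
(the variable names are illustrative):
   
   ```java
   // Hypothetical usage sketch: resolve the SchemaOperator RPC timeout from the
   // pipeline configuration; absent an explicit setting, the 3-minute default applies.
   Duration rpcTimeout =
           config.get(PipelineOptions.PIPELINE_SCHEMA_OPERATOR_RPC_TIMEOUT);
   ```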



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-31966) Flink Kubernetes operator lacks TLS support

2024-04-16 Thread Adrian Vasiliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Vasiliu updated FLINK-31966:
---
Attachment: image-2024-04-16-16-33-39-644.png

> Flink Kubernetes operator lacks TLS support 
> 
>
> Key: FLINK-31966
> URL: https://issues.apache.org/jira/browse/FLINK-31966
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Adrian Vasiliu
>Assignee: Tony Garrard
>Priority: Major
> Fix For: kubernetes-operator-1.8.0
>
> Attachments: image-2024-04-16-16-33-39-644.png
>
>
> *Summary*
> The Flink Kubernetes operator lacks support inside the FlinkDeployment 
> operand for configuring Flink with TLS (both one-way and mutual) for the 
> internal communication between jobmanagers and taskmanagers, and for the 
> external REST endpoint. Although a workaround exists to configure the job and 
> task managers, this breaks the operator and renders it unable to reconcile.
> *Additional information*
>  * The Apache Flink operator supports passing through custom flink 
> configuration to be applied to job and task managers.
>  * If you supply SSL-based properties, the operator can no longer speak to 
> the deployed job manager. The operator is reading the flink conf and using it 
> to create a connection to the job manager REST endpoint, but it uses the 
> truststore file paths within flink-conf.yaml, which are unresolvable from the 
> operator. This leaves the operator hanging in a pending state as it cannot 
> complete a reconcile.
> *Proposal*
> Our proposal is to make changes to the operator code. A simple change exists 
> that would be enough to enable anonymous SSL at the REST endpoint, but more 
> invasive changes would be required to enable full mTLS throughout.
> The simple change to enable anonymous SSL would be for the operator to parse 
> flink-conf and podTemplate to identify the Kubernetes resource that contains 
> the certificate from the job manager keystore and use it inside the 
> operator’s trust store.
> In the case of mutual TLS, further changes are required: the operator would 
> need to generate a certificate signed by the same issuing authority as the 
> job manager’s certificates and then use it in a keystore when challenged by 
> that job manager. We propose that the operator becomes responsible for making 
> CertificateSigningRequests to generate certificates for job manager, task 
> manager and operator. The operator can then coordinate deploying the job and 
> task managers with the correct flink-conf and volume mounts. This would also 
> work for anonymous SSL.
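For concreteness, a minimal sketch of the kind of flink-conf SSL properties that trigger the reconcile problem described above; the option keys are standard Flink SSL settings, while the paths are illustrative mount points that exist in the job manager pods but not in the operator pod:

{code:yaml}
security.ssl.rest.enabled: true
security.ssl.rest.keystore: /certs/keystore.jks    # resolvable only inside the JM pod
security.ssl.rest.truststore: /certs/truststore.jks
{code}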



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34961] Use dedicated CI name for Hbase connector to differentiate it in infra-reports [flink-connector-hbase]

2024-04-16 Thread via GitHub


snuyanzin merged PR #44:
URL: https://github.com/apache/flink-connector-hbase/pull/44


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34961] Use dedicated CI name for Hbase connector to differentiate it in infra-reports [flink-connector-hbase]

2024-04-16 Thread via GitHub


snuyanzin commented on PR #44:
URL: 
https://github.com/apache/flink-connector-hbase/pull/44#issuecomment-2059202176

   Thanks for taking a look


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-34961] Use dedicated CI name for Hbase connector to differentiate it in infra-reports [flink-connector-hbase]

2024-04-16 Thread via GitHub


snuyanzin opened a new pull request, #44:
URL: https://github.com/apache/flink-connector-hbase/pull/44

   
   
   The PR will allow differentiating between HBase connector statistics and 
others that use the generic name "ci"
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34878) [Feature][Pipeline] Flink CDC pipeline transform supports CASE WHEN

2024-04-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-34878:
---
Labels: github-import pull-request-available  (was: github-import)

> [Feature][Pipeline] Flink CDC pipeline transform supports CASE WHEN
> ---
>
> Key: FLINK-34878
> URL: https://issues.apache.org/jira/browse/FLINK-34878
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Flink CDC Issue Import
>Priority: Major
>  Labels: github-import, pull-request-available
>
> ### Search before asking
> - [X] I searched in the 
> [issues|https://github.com/ververica/flink-cdc-connectors/issues] and found 
> nothing similar.
> ### Motivation
> To be supplemented.
> ### Solution
> To be supplemented.
> ### Alternatives
> None.
> ### Anything else?
> To be supplemented.
> ### Are you willing to submit a PR?
> - [X] I'm willing to submit a PR!
>  Imported from GitHub 
> Url: https://github.com/apache/flink-cdc/issues/3079
> Created by: [aiwenmo|https://github.com/aiwenmo]
> Labels: enhancement, 
> Created at: Mon Feb 26 23:47:53 CST 2024
> State: open
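Since the motivation and solution are left "to be supplemented", a minimal sketch of what a CASE WHEN expression could look like in a pipeline transform rule; the table and column names are illustrative and the final syntax is subject to the eventual design:

{code:yaml}
transform:
  - source-table: app_db.orders
    projection: id, CASE WHEN amount > 100 THEN 'large' ELSE 'small' END AS order_size
{code}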



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33440) Bump flink version on flink-connectors-hbase

2024-04-16 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin closed FLINK-33440.
---

> Bump flink version on flink-connectors-hbase
> 
>
> Key: FLINK-33440
> URL: https://issues.apache.org/jira/browse/FLINK-33440
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / HBase
>Reporter: Ferenc Csaky
>Assignee: Ferenc Csaky
>Priority: Major
>  Labels: pull-request-available
> Fix For: hbase-4.0.0
>
>
> Follow-up the 1.18 release in the connector repo as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-33440) Bump flink version on flink-connectors-hbase

2024-04-16 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin resolved FLINK-33440.
-
Fix Version/s: hbase-4.0.0
   Resolution: Fixed

> Bump flink version on flink-connectors-hbase
> 
>
> Key: FLINK-33440
> URL: https://issues.apache.org/jira/browse/FLINK-33440
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / HBase
>Reporter: Ferenc Csaky
>Assignee: Ferenc Csaky
>Priority: Major
>  Labels: pull-request-available
> Fix For: hbase-4.0.0
>
>
> Follow-up the 1.18 release in the connector repo as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33440) Bump flink version on flink-connectors-hbase

2024-04-16 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837705#comment-17837705
 ] 

Sergey Nuyanzin commented on FLINK-33440:
-

Merged as 
[08b7b69cd82acf3e8ba9af08d715b0b9616af0b0|https://github.com/apache/flink-connector-hbase/commit/08b7b69cd82acf3e8ba9af08d715b0b9616af0b0]

> Bump flink version on flink-connectors-hbase
> 
>
> Key: FLINK-33440
> URL: https://issues.apache.org/jira/browse/FLINK-33440
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / HBase
>Reporter: Ferenc Csaky
>Assignee: Ferenc Csaky
>Priority: Major
>  Labels: pull-request-available
>
> Follow-up the 1.18 release in the connector repo as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35116] Bump operator sdk version to 4.8.3 [flink-kubernetes-operator]

2024-04-16 Thread via GitHub


csviri commented on code in PR #816:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/816#discussion_r1567338233


##
flink-kubernetes-operator-api/src/main/java/org/apache/flink/kubernetes/operator/api/validation/CrdCompatibilityChecker.java:
##
@@ -93,7 +93,10 @@ protected static void checkObjectCompatibility(
 // This field was removed from Kubernetes ObjectMeta v1 in 
1.25 as it was unused
 // for a long time. If set for any reason (very unlikely 
as it does nothing),
 // the property will be dropped / ignored by the api 
server.
-if (!fieldPath.endsWith(".metadata.clusterName")) {
+if (!fieldPath.endsWith(".metadata.clusterName")
+// This field was removed from the pod templates 
when upgrading
+// from crd-generator 6.8 to 6.11, needs further 
assessment
+&& 
!fieldPath.contains(".volumeClaimTemplate.spec.resources.claims")) {

Review Comment:
   This field was removed from Kubernetes API: 
https://github.com/kubernetes/api/blob/master/core/v1/types.go#L2620-L2625



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (FLINK-33440) Bump flink version on flink-connectors-hbase

2024-04-16 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin reassigned FLINK-33440:
---

Assignee: Ferenc Csaky

> Bump flink version on flink-connectors-hbase
> 
>
> Key: FLINK-33440
> URL: https://issues.apache.org/jira/browse/FLINK-33440
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / HBase
>Reporter: Ferenc Csaky
>Assignee: Ferenc Csaky
>Priority: Major
>  Labels: pull-request-available
>
> Follow-up the 1.18 release in the connector repo as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33440] Update Flink version matrix, add 1.19-SNAPSHOT to GH workflows, update flink-connector-parent [flink-connector-hbase]

2024-04-16 Thread via GitHub


snuyanzin merged PR #35:
URL: https://github.com/apache/flink-connector-hbase/pull/35


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35116] Bump operator sdk version to 4.8.3 [flink-kubernetes-operator]

2024-04-16 Thread via GitHub


mbalassi commented on PR #816:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/816#issuecomment-2059011676

   I will take a look at the test failure later; it is a timeout in 
`AbstractFlinkServiceTest#testBlockingDeploymentDeletion`. I managed to 
reproduce it locally.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35028][runtime] Timer firing under async execution model [flink]

2024-04-16 Thread via GitHub


flinkbot commented on PR #24672:
URL: https://github.com/apache/flink/pull/24672#issuecomment-2059006200

   
   ## CI report:
   
   * 6507dc0d38dec17e91fa3f722cd924ca82197622 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34444] Initial implementation of JM operator metric rest api [flink]

2024-04-16 Thread via GitHub


afedulov commented on code in PR #24564:
URL: https://github.com/apache/flink/pull/24564#discussion_r1567286518


##
docs/static/generated/rest_v1_dispatcher.yml:
##
@@ -1089,6 +1089,37 @@ paths:
 application/json:
   schema:
 $ref: '#/components/schemas/JobVertexBackPressureInfo'
+  /jobs/{jobid}/vertices/{vertexid}/coordinator-metrics:
+get:
+  description: Provides access to job manager operator metrics

Review Comment:
   Maybe I am missing something, but should not the endpoint called 
`coordinator-metrics` only return metrics for coordinators? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35028) Timer firing under async execution model

2024-04-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35028:
---
Labels: pull-request-available  (was: )

> Timer firing under async execution model
> 
>
> Key: FLINK-35028
> URL: https://issues.apache.org/jira/browse/FLINK-35028
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends, Runtime / Task
>Reporter: Yanfei Lei
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-35028][runtime] Timer firing under async execution model [flink]

2024-04-16 Thread via GitHub


fredia opened a new pull request, #24672:
URL: https://github.com/apache/flink/pull/24672

   
   
   ## What is the purpose of the change
   
   This PR adapts timer firing to async state execution. The entry points are 
`AbstractAsyncStateStreamOperator` and `AbstractAsyncStateStreamOperatorV2`; 
the hot path without this feature is not affected.
   
   
   ## Brief change log
   
   - Introduce `InternalTimerServiceAsyncImpl`.
   - Add `InternalTimeServiceManager#getAsyncInternalTimerService` method.
   - Override `getInternalTimerService` of `AbstractAsyncStateStreamOperator` 
and `AbstractAsyncStateStreamOperatorV2`.
   
   ## Verifying this change
   
   
   This change added tests and can be verified as follows:
   - `InternalTimerServiceAsyncImplTest`
   - `AbstractAsyncStateStreamOperatorTest#testTimerServiceIsAsync`
   - `AbstractAsyncStateStreamOperatorV2Test#testTimerServiceIsAsync`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (JavaDocs)
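   
   For orientation, a minimal sketch of the operator-side API this PR plugs 
into; under the new entry points, the lookup below resolves to the async timer 
service instead of the existing `InternalTimerServiceImpl` (the service name 
and timestamp variable are illustrative):
   
   ```java
   // Inside an operator implementing Triggerable<K, VoidNamespace>:
   InternalTimerService<VoidNamespace> timers =
           getInternalTimerService("user-timers", VoidNamespaceSerializer.INSTANCE, this);
   // Registers an event-time timer; onEventTime(...) fires once the watermark passes it.
   timers.registerEventTimeTimer(VoidNamespace.INSTANCE, triggerTimestamp);
   ```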
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32877][Filesystem]add HTTP options to gcs-cloud-storage client [flink]

2024-04-16 Thread via GitHub


dfontana commented on PR #23226:
URL: https://github.com/apache/flink/pull/23226#issuecomment-2058978531

   Hi @singhravidutt, thank you for your efforts; are you actively working on 
this PR? I am also interested in seeing this land and am happy to help where needed


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35125][state] Implement ValueState for ForStStateBackend [flink]

2024-04-16 Thread via GitHub


flinkbot commented on PR #24671:
URL: https://github.com/apache/flink/pull/24671#issuecomment-2058974205

   
   ## CI report:
   
   * f4b1a4374cc6c579ff50e5b09bb5715b386645d2 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35125) Implement ValueState for ForStStateBackend

2024-04-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35125:
---
Labels: pull-request-available  (was: )

> Implement ValueState for ForStStateBackend
> --
>
> Key: FLINK-35125
> URL: https://issues.apache.org/jira/browse/FLINK-35125
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Jinzhong Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-35125][state] Implement ValueState for ForStStateBackend [flink]

2024-04-16 Thread via GitHub


ljz2051 opened a new pull request, #24671:
URL: https://github.com/apache/flink/pull/24671

   ## What is the purpose of the change
   
   This pull request implements the ValueState for ForStStateBackend.
   
   ## Brief change log
   
 -  Define the InternalSyncState interfaces in flink-runtime layer
 -  Implement ForStValueState
   
   
   ## Verifying this change
   
   This change added tests and can be verified by ForStValueStateTest.
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
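   
   For context, a minimal sketch of the user-facing ValueState contract that 
the new ForStValueState must preserve (standard Flink keyed-state API; the 
descriptor name is illustrative):
   
   ```java
   // Standard keyed ValueState usage inside a RichFunction:
   ValueState<Long> count = getRuntimeContext().getState(
           new ValueStateDescriptor<>("count", Long.class));
   Long current = count.value();                     // null on first access for a key
   count.update(current == null ? 1L : current + 1); // per-key update
   ```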
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-35125) Implement ValueState for ForStStateBackend

2024-04-16 Thread Jinzhong Li (Jira)
Jinzhong Li created FLINK-35125:
---

 Summary: Implement ValueState for ForStStateBackend
 Key: FLINK-35125
 URL: https://issues.apache.org/jira/browse/FLINK-35125
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends
Reporter: Jinzhong Li
 Fix For: 2.0.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33440] Update Flink version matrix, add 1.19-SNAPSHOT to GH workflows, update flink-connector-parent [flink-connector-hbase]

2024-04-16 Thread via GitHub


ferenc-csaky commented on PR #35:
URL: 
https://github.com/apache/flink-connector-hbase/pull/35#issuecomment-2058924269

   @snuyanzin I fixed the dependency convergence for 1.19 that the last CI run 
failed on. I also had to change `HBaseTablePlanTest`, as the JUnit 5 migration 
of `TableTestBase` in the core repo in 1.19 required some adaptation. Can you 
retrigger another run pls?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35025][Runtime/State] Abstract stream operators for async state processing [flink]

2024-04-16 Thread via GitHub


Zakelly commented on code in PR #24657:
URL: https://github.com/apache/flink/pull/24657#discussion_r1567158767


##
flink-runtime/src/main/java/org/apache/flink/runtime/asyncprocessing/StateRequestBuffer.java:
##
@@ -59,6 +60,9 @@ public StateRequestBuffer() {
 }
 
 void enqueueToActive(StateRequest request) {
+if (request.getRequestType() == StateRequestType.SYNC_POINT) {
+request.getFuture().complete(null);
+}

Review Comment:
   Yes. Once the `SYNC_POINT` is about to be added to the `activeQueue` (meaning 
that it is ready), we complete it directly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35116) Upgrade JOSDK dependency to 4.8.3

2024-04-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35116:
---
Labels: pull-request-available  (was: )

> Upgrade JOSDK dependency to 4.8.3
> -
>
> Key: FLINK-35116
> URL: https://issues.apache.org/jira/browse/FLINK-35116
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Márton Balassi
>Assignee: Márton Balassi
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.9.0
>
>
> This brings a much-needed fix for the operator HA behaviour:
> https://github.com/operator-framework/java-operator-sdk/releases/tag/v4.8.3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-35116] Bump operator sdk version to 4.8.3 [flink-kubernetes-operator]

2024-04-16 Thread via GitHub


mbalassi opened a new pull request, #816:
URL: https://github.com/apache/flink-kubernetes-operator/pull/816

   Also bumps the fabric8 client to 6.11.0 to harmonize it with JOSDK.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Adding AWS Connectors v4.3.0 [flink-web]

2024-04-16 Thread via GitHub


dannycranmer opened a new pull request, #733:
URL: https://github.com/apache/flink-web/pull/733

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35045][state] Introduce ForStFlinkFileSystem to support reading and writing with ByteBuffer [flink]

2024-04-16 Thread via GitHub


Zakelly commented on code in PR #24632:
URL: https://github.com/apache/flink/pull/24632#discussion_r1567124524


##
flink-state-backends/flink-statebackend-forst/src/main/java/org/apache/flink/state/forst/fs/ByteBufferReadableFSDataInputStream.java:
##
@@ -0,0 +1,195 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.state.forst.fs;
+
+import org.apache.flink.core.fs.FSDataInputStream;
+import org.apache.flink.core.fs.PositionedReadable;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.Queue;
+import java.util.concurrent.Callable;
+import java.util.concurrent.LinkedBlockingQueue;
+
+/**
+ * A {@link FSDataInputStream} that delegates requests to another one and 
supports reading data with {@link
+ * ByteBuffer}.
+ *
+ * All methods in this class may be used by ForSt; please start a discussion 
first if any of them has to
+ * be modified.
+ */
+public class ByteBufferReadableFSDataInputStream extends FSDataInputStream {
+
+private final FSDataInputStream originalInputStream;
+
+/**
+ * InputStream Pool which provides multiple input streams to random read 
concurrently. An input
+ * stream should only be used by a thread at a point in time.
+ */
+private final Queue<FSDataInputStream> readInputStreamPool;
+
+private final Callable<FSDataInputStream> inputStreamBuilder;
+
+public ByteBufferReadableFSDataInputStream(
+FSDataInputStream originalInputStream,
+Callable inputStreamBuilder,
+int inputStreamCapacity) {
+this.originalInputStream = originalInputStream;
+this.inputStreamBuilder = inputStreamBuilder;
+this.readInputStreamPool = new 
LinkedBlockingQueue<>(inputStreamCapacity);
+}
+
+/**
+ * Reads up to ByteBuffer#remaining bytes of data from the 
input stream into a
+ * ByteBuffer. Not thread-safe yet, since the sequential-read 
interface of ForSt is only
+ * accessed by one thread at a time.
+ *
+ * @param bb the buffer into which the data is read.
+ * @return the total number of bytes read into the buffer.
+ * @exception IOException If the first byte cannot be read for any reason 
other than end of
+ * file, or if the input stream has been closed, or if some other I/O 
error occurs.
+ * @exception NullPointerException If bb is null.
+ */
+public int readFully(ByteBuffer bb) throws IOException {
+if (bb == null) {
+throw new NullPointerException();
+} else if (bb.remaining() == 0) {
+return 0;
+}
+return readFullyFromFSDataInputStream(originalInputStream, bb);
+}
+
+/**
+ * Reads up to ByteBuffer#remaining bytes of data from the 
specific position of the
+ * input stream into a ByteBuffer. Thread-safe, since the 
random-read interface of ForSt may be
+ * concurrently accessed by multiple threads. TODO: Support splitting this 
method out into another class.
+ *
+ * @param position the start offset in input stream at which the data is 
read.
+ * @param bb the buffer into which the data is read.
+ * @return the total number of bytes read into the buffer.
+ * @exception IOException If the first byte cannot be read for any reason 
other than end of
+ * file, or if the input stream has been closed, or if some other I/O 
error occurs.
+ * @exception NullPointerException If bb is null.
+ */
+public int readFully(long position, ByteBuffer bb) throws Exception {
+if (bb == null) {
+throw new NullPointerException();
+} else if (bb.remaining() == 0) {
+return 0;
+}
+
+FSDataInputStream fsDataInputStream = readInputStreamPool.poll();
+if (fsDataInputStream == null) {
+fsDataInputStream = inputStreamBuilder.call();
+}
+
+int result;
+if (fsDataInputStream instanceof PositionedReadable) {
+byte[] tmp = new byte[bb.remaining()];

Review Comment:
   Can this be optimized? Without introducing another temporary heap array?
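   
   One possible direction, sketched under two assumptions that would need 
checking against this PR: the new `PositionedReadable` exposes a Hadoop-style 
`read(position, byte[], offset, length)`, and the buffer is heap-backed:
   
   ```java
   // Hypothetical sketch: for heap buffers, read straight into the backing array
   // instead of allocating a temporary copy, then advance the buffer's position.
   if (bb.hasArray()) {
       int read = ((PositionedReadable) fsDataInputStream)
               .read(position, bb.array(), bb.arrayOffset() + bb.position(), bb.remaining());
       if (read > 0) {
           bb.position(bb.position() + read);
       }
   }
   ```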



##
flink-core/src/main/java/org/apache/flink/core/fs/PositionedReadable.java:
##
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software 

[jira] [Updated] (FLINK-33991) Custom Error Handling for Kinesis Polling Consumer

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33991:
--
Affects Version/s: (was: aws-connector-4.2.0)

> Custom Error Handling for Kinesis Polling Consumer 
> ---
>
> Key: FLINK-33991
> URL: https://issues.apache.org/jira/browse/FLINK-33991
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kinesis
>Reporter: Emre Kartoglu
>Assignee: Emre Kartoglu
>Priority: Major
> Fix For: aws-connector-4.4.0
>
>
> We introduced custom error handling for the Kinesis EFO Consumer as part of 
> https://issues.apache.org/jira/browse/FLINK-33260
> PR for the EFO consumer: 
> [https://github.com/apache/flink-connector-aws/pull/110]
>  
> This ticket is to apply the same logic to the Kinesis Polling Consumer in the 
> same codebase.
> Current configuration for the EFO consumer looks like:
> {code:java}
> flink.shard.consumer.error.recoverable[0].exception=java.net.UnknownHostException
> flink.shard.consumer.error.recoverable[1].exception=java.net.SocketTimeoutException
>  {code}
> We should re-use the same code for the polling consumer.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32007) Implement Python Wrappers for DynamoDB Connector

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-32007:
--
Fix Version/s: aws-connector-4.4.0
   (was: aws-connector-4.3.0)

> Implement Python Wrappers for DynamoDB Connector
> 
>
> Key: FLINK-32007
> URL: https://issues.apache.org/jira/browse/FLINK-32007
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Python, Connectors / DynamoDB
>Reporter: Ahmed Hamdy
>Assignee: Khanh Vu
>Priority: Minor
> Fix For: aws-connector-4.4.0
>
>
> Implement Python API Wrappers for DynamoDB Sink



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33259) flink-connector-aws should use/extend the common connector workflow

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33259:
--
Fix Version/s: aws-connector-4.4.0
   (was: aws-connector-4.3.0)

> flink-connector-aws should use/extend the common connector workflow
> ---
>
> Key: FLINK-33259
> URL: https://issues.apache.org/jira/browse/FLINK-33259
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / AWS
>Affects Versions: aws-connector-3.0.0, aws-connector-4.2.0
>Reporter: Hong Liang Teoh
>Assignee: Aleksandr Pilipenko
>Priority: Major
> Fix For: aws-connector-4.4.0
>
>
> We should use the common CI GitHub workflow.
> [https://github.com/apache/flink-connector-shared-utils/blob/ci_utils/.github/workflows/ci.yml]
>  
> Example used in flink-connector-elasticsearch
> [https://github.com/apache/flink-connector-elasticsearch/blob/main/.github/workflows/push_pr.yml]
>  
> This improves our operational stance because we will now inherit any 
> improvements/changes to the main CI workflow file.
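A minimal sketch of what adopting the shared workflow looks like, modeled on the Elasticsearch example linked above (the Flink version value is illustrative):

{code:yaml}
# .github/workflows/push_pr.yml
name: CI
on: [push, pull_request]
jobs:
  compile_and_test:
    uses: apache/flink-connector-shared-utils/.github/workflows/ci.yml@ci_utils
    with:
      flink_version: 1.18.1
{code}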



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33259) flink-connector-aws should use/extend the common connector workflow

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33259:
--
Affects Version/s: aws-connector-4.3.0

> flink-connector-aws should use/extend the common connector workflow
> ---
>
> Key: FLINK-33259
> URL: https://issues.apache.org/jira/browse/FLINK-33259
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / AWS
>Affects Versions: aws-connector-3.0.0, aws-connector-4.2.0, 
> aws-connector-4.3.0
>Reporter: Hong Liang Teoh
>Assignee: Aleksandr Pilipenko
>Priority: Major
> Fix For: aws-connector-4.4.0
>
>
> We should use the common CI GitHub workflow.
> [https://github.com/apache/flink-connector-shared-utils/blob/ci_utils/.github/workflows/ci.yml]
>  
> Example used in flink-connector-elasticsearch
> [https://github.com/apache/flink-connector-elasticsearch/blob/main/.github/workflows/push_pr.yml]
>  
> This improves our operational stance because we will now inherit any 
> improvements/changes to the main CI workflow file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33991) Custom Error Handling for Kinesis Polling Consumer

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33991:
--
Fix Version/s: aws-connector-4.4.0
   (was: aws-connector-4.3.0)

> Custom Error Handling for Kinesis Polling Consumer 
> ---
>
> Key: FLINK-33991
> URL: https://issues.apache.org/jira/browse/FLINK-33991
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kinesis
>Affects Versions: aws-connector-4.2.0
>Reporter: Emre Kartoglu
>Assignee: Emre Kartoglu
>Priority: Major
> Fix For: aws-connector-4.4.0
>
>
> We introduced custom error handling for the Kinesis EFO Consumer as part of 
> https://issues.apache.org/jira/browse/FLINK-33260
> PR for the EFO consumer: 
> [https://github.com/apache/flink-connector-aws/pull/110]
>  
> This ticket is to apply the same logic to the Kinesis Polling Consumer in the 
> same codebase.
> Current configuration for the EFO consumer looks like:
> {code:java}
> flink.shard.consumer.error.recoverable[0].exception=java.net.UnknownHostException
> flink.shard.consumer.error.recoverable[1].exception=java.net.SocketTimeoutException
>  {code}
> We should re-use the same code for the polling consumer.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31872) Add Support for Configuring AIMD Ratelimiting strategy parameters by Sink users for KinesisStreamsSink

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-31872:
--
Affects Version/s: (was: aws-connector-4.2.0)

> Add Support for Configuring AIMD Ratelimiting strategy parameters by Sink 
> users for KinesisStreamsSink
> --
>
> Key: FLINK-31872
> URL: https://issues.apache.org/jira/browse/FLINK-31872
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kinesis
>Reporter: Ahmed Hamdy
>Assignee: Ahmed Hamdy
>Priority: Major
> Fix For: aws-connector-4.4.0
>
>
> h1. Issue
> As part of FLINK-31772
> I performed a complete benchmark for {{KinesisStreamsSink}} after configuring 
> the rate limiting strategy.
> It appears that the optimum values for the rate limiting strategy parameters 
> depend on the use case (shard number / parallelism / record throughput).
> We initially implemented the {{AIMDRateLimitingStrategy}} in accordance with 
> the one used for TCP congestion control, but since the parameters are 
> use-case dependent we would like to allow sink users to adjust them as 
> suitable.
> h2. Requirements
>  - we *must* allow users to configure the increment rate and decrease factor 
> of AIMDRateLimitingStrategy for {{KinesisStreamsSink}}
>  - we *must* provide backward-compatible default values identical to the 
> current ones, to introduce no further regressions.
> h2. Appendix
> h3. Performance Benchmark Results
> |Parallelism/Shards/Payload|parallelism|shards|payload|records/sec|Async 
> Sink|Async Sink With Configured Rate-limiting Strategy Throughput (MB/s)|Async 
> sink / Maximum Throughput|% of Improvement|
> |Low/Low/Low|1|1|1024|1|0.991|1|1|0.9|
> |Low/Low/High|1|1|102400|100|0.9943|1|1|0.57|
> |Low/Med/Low|1|8|1024|8|4.12|4.57|0.57125|5.625|
> |Low/Med/High|1|8|102400|800|4.35|7.65|0.95625|41.25|
> |Med/Low/Low|8|1|1024|2|0.852|0.846|0.846|-0.6|
> |Med/Low/High|8|1|102400|200|0.921|0.867|0.867|-5.4|
> |Med/Med/Low|8|8|1024|8|5.37|4.76|0.595|-7.625|
> |Med/Med/High|8|8|102400|800|7.53|7.69|0.96125|2|
> |Med/High/Low|8|64|1024|8|32.5|37.4|0.58438|7.65625|
> |Med/High/High|8|64|102400|800|47.27|60.4|0.94375|20.51562|
> |High/High/Low|256|256|1024|30|127|127|0.49609|0|
> |High/High/High|256|256|102400|3000|225|246|0.96094|8.20313|
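A purely illustrative sketch of the knobs the requirements above ask to expose; the two rate-limit setter names are hypothetical, and the values shown stand in for the backward-compatible defaults:

{code:java}
// Hypothetical API sketch only; the actual builder method names may differ.
KinesisStreamsSink<String> sink =
        KinesisStreamsSink.<String>builder()
                .setStreamName("my-stream")
                .setRateLimitIncreaseRate(10)    // additive increase step (hypothetical)
                .setRateLimitDecreaseFactor(0.5) // multiplicative decrease (hypothetical)
                .build();
{code}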



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32116) FlinkKinesisConsumer cannot stop-with-savepoint when configured with watermark assigner and watermark tracker

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-32116:
--
Affects Version/s: aws-connector-4.3.0

> FlinkKinesisConsumer cannot stop-with-savepoint when configured with 
> watermark assigner and watermark tracker
> -
>
> Key: FLINK-32116
> URL: https://issues.apache.org/jira/browse/FLINK-32116
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.16.1, 1.15.4, aws-connector-4.2.0, aws-connector-4.3.0
>Reporter: Hong Liang Teoh
>Assignee: Aleksandr Pilipenko
>Priority: Major
> Fix For: aws-connector-4.4.0
>
>
> Problem:
> When the FlinkKinesisConsumer is configured with the legacy watermarking 
> system, it is unable to take a savepoint during stop-with-savepoint, and will 
> get stuck indefinitely.
>  
>  
> {code:java}
> FlinkKinesisConsumer src = new FlinkKinesisConsumer("YourStreamHere", new 
> SimpleStringSchema(), consumerConfig);
> // Set up watermark assigner on Kinesis source
> src.setPeriodicWatermarkAssigner(...);
> // Set up watermark tracker on Kinesis source
> src.setWatermarkTracker(...);{code}
>  
>  
> *Why does it get stuck?*
> When watermarks are set up, the `shardConsumer` and `recordEmitter` threads 
> communicate using an asynchronous queue.
> On stop-with-savepoint, the shardConsumer waits for the queue to empty before 
> continuing, but the recordEmitter is terminated before the queue is empty. As 
> such, the queue is never going to be empty, and the app gets stuck 
> indefinitely.
>  
> *Workarounds*
> Use the new watermark framework
> {code:java}
> FlinkKinesisConsumer src = new FlinkKinesisConsumer("YourStreamHere", new 
> SimpleStringSchema(), consumerConfig);
> env.addSource(src)
> // Set up watermark strategy with both watermark assigner and watermark 
> tracker
>     
> .assignTimestampsAndWatermarks(WatermarkStrategy.forMonotonousTimestamps()){code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-25779) Define ConfigOption for aws, scan and sink map in Kinesis

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-25779:
--
Fix Version/s: aws-connector-4.4.0
   (was: aws-connector-4.3.0)

> Define ConfigOption for aws, scan and sink map in Kinesis
> -
>
> Key: FLINK-25779
> URL: https://issues.apache.org/jira/browse/FLINK-25779
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: aws-connector-4.2.0
>Reporter: Francesco Guardiani
>Assignee: Aleksandr Pilipenko
>Priority: Critical
> Fix For: aws-connector-4.4.0
>
>
> Currently there is no {{ConfigOption}} definition for {{aws.*}}, {{scan.*}} and 
> {{sink.*}} in the Kinesis connector. This breaks option forwarding when 
> restoring the table options from the persisted plan/catalog. We can define 
> them using {{ConfigOptionBuilder#mapType}}.
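A minimal sketch of the suggested fix using the standard ConfigOptions builder; the constant name and description are illustrative:

{code:java}
// Map-typed option so aws.* properties survive plan/catalog round-trips;
// analogous options would cover scan.* and sink.*.
public static final ConfigOption<Map<String, String>> AWS_CONFIG =
        ConfigOptions.key("aws")
                .mapType()
                .noDefaultValue()
                .withDescription("AWS client properties for the Kinesis connector.");
{code}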



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31922) Port over Kinesis Client configurations for retry and backoff

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-31922:
--
Affects Version/s: aws-connector-4.3.0

> Port over Kinesis Client configurations for retry and backoff
> -
>
> Key: FLINK-31922
> URL: https://issues.apache.org/jira/browse/FLINK-31922
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / AWS
>Affects Versions: aws-connector-4.2.0, aws-connector-4.3.0
>Reporter: Hong Liang Teoh
>Assignee: Daren Wong
>Priority: Major
> Fix For: aws-connector-4.4.0
>
>
> Port over the Kinesis Client configurations for GetRecords, ListShards, 
> DescribeStream



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-24379) Support AWS Glue Schema Registry Avro for Table API

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-24379:
--
Affects Version/s: (was: aws-connector-4.2.0)

> Support AWS Glue Schema Registry Avro for Table API
> ---
>
> Key: FLINK-24379
> URL: https://issues.apache.org/jira/browse/FLINK-24379
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / API
>Reporter: Brad Davis
>Assignee: Lorenzo Nicora
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: aws-connector-4.4.0
>
>
> Unlike most (all?) of the other Avro formats, the AWS Glue Schema Registry 
> version doesn't include a 
> META-INF/services/org.apache.flink.table.factories.Factory resource or a 
> class implementing 
> org.apache.flink.table.factories.DeserializationFormatFactory and 
> org.apache.flink.table.factories.SerializationFormatFactory.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31922) Port over Kinesis Client configurations for retry and backoff

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-31922:
--
Fix Version/s: aws-connector-4.4.0
   (was: aws-connector-4.3.0)

> Port over Kinesis Client configurations for retry and backoff
> -
>
> Key: FLINK-31922
> URL: https://issues.apache.org/jira/browse/FLINK-31922
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / AWS
>Affects Versions: aws-connector-4.2.0
>Reporter: Hong Liang Teoh
>Assignee: Daren Wong
>Priority: Major
> Fix For: aws-connector-4.4.0
>
>
> Port over the Kinesis Client configurations for GetRecords, ListShards, 
> DescribeStream



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32116) FlinkKinesisConsumer cannot stop-with-savepoint when configured with watermark assigner and watermark tracker

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-32116:
--
Fix Version/s: aws-connector-4.4.0
   (was: aws-connector-4.3.0)

> FlinkKinesisConsumer cannot stop-with-savepoint when configured with 
> watermark assigner and watermark tracker
> -
>
> Key: FLINK-32116
> URL: https://issues.apache.org/jira/browse/FLINK-32116
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.16.1, 1.15.4, aws-connector-4.2.0
>Reporter: Hong Liang Teoh
>Assignee: Aleksandr Pilipenko
>Priority: Major
> Fix For: aws-connector-4.4.0
>
>
> Problem:
> When the FlinkKinesisConsumer is configured with the legacy watermarking 
> system, it is unable to take a savepoint during stop-with-savepoint, and will 
> get stuck indefinitely.
>  
>  
> {code:java}
> FlinkKinesisConsumer src = new FlinkKinesisConsumer("YourStreamHere", new 
> SimpleStringSchema(), consumerConfig);
> // Set up watermark assigner on Kinesis source
> src.setPeriodicWatermarkAssigner(...);
> // Set up watermark tracker on Kinesis source
> src.setWatermarkTracker(...);{code}
>  
>  
> *Why does it get stuck?*
> When watermarks are set up, the `shardConsumer` and `recordEmitter` threads 
> communicate using an asynchronous queue.
> On stop-with-savepoint, the shardConsumer waits for the queue to empty before 
> continuing, but the recordEmitter is terminated before the queue is empty. As 
> such, the queue is never going to be empty, and the app gets stuck 
> indefinitely.
>  
> *Workarounds*
> Use the new watermark framework
> {code:java}
> FlinkKinesisConsumer src = new FlinkKinesisConsumer("YourStreamHere", new 
> SimpleStringSchema(), consumerConfig);
> env.addSource(src)
> // Set up watermark strategy with both watermark assigner and watermark 
> tracker
>     
> .assignTimestampsAndWatermarks(WatermarkStrategy.forMonotonousTimestamps()){code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-24379) Support AWS Glue Schema Registry Avro for Table API

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-24379:
--
Fix Version/s: aws-connector-4.4.0
   (was: aws-connector-4.3.0)

> Support AWS Glue Schema Registry Avro for Table API
> ---
>
> Key: FLINK-24379
> URL: https://issues.apache.org/jira/browse/FLINK-24379
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / API
>Affects Versions: aws-connector-4.2.0
>Reporter: Brad Davis
>Assignee: Lorenzo Nicora
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: aws-connector-4.4.0
>
>
> Unlike most (all?) of the other Avro formats, the AWS Glue Schema Registry 
> version doesn't include a 
> META-INF/services/org.apache.flink.table.factories.Factory resource or a 
> class implementing 
> org.apache.flink.table.factories.DeserializationFormatFactory and 
> org.apache.flink.table.factories.SerializationFormatFactory.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-25779) Define ConfigOption for aws, scan and sink map in Kinesis

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-25779:
--
Affects Version/s: aws-connector-4.3.0

> Define ConfigOption for aws, scan and sink map in Kinesis
> -
>
> Key: FLINK-25779
> URL: https://issues.apache.org/jira/browse/FLINK-25779
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: aws-connector-4.2.0, aws-connector-4.3.0
>Reporter: Francesco Guardiani
>Assignee: Aleksandr Pilipenko
>Priority: Critical
> Fix For: aws-connector-4.4.0
>
>
> Currently there is no {{ConfigOption}} definition for {{aws.*}}, {{scan.*}} and 
> {{sink.*}} in the Kinesis connector. This breaks option forwarding when 
> restoring the table options from the persisted plan/catalog. We can define 
> them using {{ConfigOptionBuilder#mapType}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31872) Add Support for Configuring AIMD Ratelimiting strategy parameters by Sink users for KinesisStreamsSink

2024-04-16 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-31872:
--
Fix Version/s: aws-connector-4.4.0
   (was: aws-connector-4.3.0)

> Add Support for Configuring AIMD Ratelimiting strategy parameters by Sink 
> users for KinesisStreamsSink
> --
>
> Key: FLINK-31872
> URL: https://issues.apache.org/jira/browse/FLINK-31872
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kinesis
>Affects Versions: aws-connector-4.2.0
>Reporter: Ahmed Hamdy
>Assignee: Ahmed Hamdy
>Priority: Major
> Fix For: aws-connector-4.4.0
>
>
> h1. Issue
> As part of FLINK-31772
> I performed a complete benchmark for {{KinesisStreamsSink}} after configuring 
> the rate limiting strategy.
> It appears that the optimum values for the rate limiting strategy parameters 
> depend on the use case (shard number / parallelism / record throughput).
> We initially implemented the {{AIMDRateLimitingStrategy}} in accordance with 
> the one used for TCP congestion control, but since the parameters are 
> use-case dependent we would like to allow sink users to adjust them as 
> suitable.
> h2. Requirements
>  - we *must* allow users to configure the increment rate and decrease factor 
> of AIMDRateLimitingStrategy for {{KinesisStreamsSink}}
>  - we *must* provide backward-compatible default values identical to the 
> current ones, to introduce no further regressions.
> h2. Appendix
> h3. Performance Benchmark Results
> |Parallelism/Shards/Payload|parallelism|shards|payload|records/sec|Async 
> Sink|Async Sink With Configured Rate-limiting Strategy Throughput (MB/s)|Async 
> sink / Maximum Throughput|% of Improvement|
> |Low/Low/Low|1|1|1024|1|0.991|1|1|0.9|
> |Low/Low/High|1|1|102400|100|0.9943|1|1|0.57|
> |Low/Med/Low|1|8|1024|8|4.12|4.57|0.57125|5.625|
> |Low/Med/High|1|8|102400|800|4.35|7.65|0.95625|41.25|
> |Med/Low/Low|8|1|1024|2|0.852|0.846|0.846|-0.6|
> |Med/Low/High|8|1|102400|200|0.921|0.867|0.867|-5.4|
> |Med/Med/Low|8|8|1024|8|5.37|4.76|0.595|-7.625|
> |Med/Med/High|8|8|102400|800|7.53|7.69|0.96125|2|
> |Med/High/Low|8|64|1024|8|32.5|37.4|0.58438|7.65625|
> |Med/High/High|8|64|102400|800|47.27|60.4|0.94375|20.51562|
> |High/High/Low|256|256|1024|30|127|127|0.49609|0|
> |High/High/High|256|256|102400|3000|225|246|0.96094|8.20313|



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35124) Connector Release Fails to run Checkstyle

2024-04-16 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837654#comment-17837654
 ] 

Danny Cranmer commented on FLINK-35124:
---

[~echauchot] do you recall why the tools directory was removed from the clone? 

> Connector Release Fails to run Checkstyle
> -
>
> Key: FLINK-35124
> URL: https://issues.apache.org/jira/browse/FLINK-35124
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Danny Cranmer
>Priority: Major
>
> During a release of the AWS connectors the build was failing at the 
> {{./tools/releasing/shared/stage_jars.sh}} step due to a checkstyle error.
>  
> {code:java}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-checkstyle-plugin:3.1.2:check (validate) on 
> project flink-connector-aws: Failed during checkstyle execution: Unable to 
> find suppressions file at location: /tools/maven/suppressions.xml: Could not 
> find resource '/tools/maven/suppressions.xml'. -> [Help 1] {code}
>  
> Looks like it is caused by this 
> [https://github.com/apache/flink-connector-shared-utils/commit/a75b89ee3f8c9a03e97ead2d0bd9d5b7bb02b51a]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35124) Connector Release Fails to run Checkstyle

2024-04-16 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-35124:
-

 Summary: Connector Release Fails to run Checkstyle
 Key: FLINK-35124
 URL: https://issues.apache.org/jira/browse/FLINK-35124
 Project: Flink
  Issue Type: Bug
  Components: Build System
Reporter: Danny Cranmer


During a release of the AWS connectors the build was failing at the 
{{./tools/releasing/shared/stage_jars.sh}} step due to a checkstyle error.

 
{code:java}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-checkstyle-plugin:3.1.2:check (validate) on 
project flink-connector-aws: Failed during checkstyle execution: Unable to find 
suppressions file at location: /tools/maven/suppressions.xml: Could not find 
resource '/tools/maven/suppressions.xml'. -> [Help 1] {code}
 

Looks like it is caused by this 
[https://github.com/apache/flink-connector-shared-utils/commit/a75b89ee3f8c9a03e97ead2d0bd9d5b7bb02b51a]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35041) IncrementalRemoteKeyedStateHandleTest.testSharedStateReRegistration failed

2024-04-16 Thread Ryan Skraba (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837647#comment-17837647
 ] 

Ryan Skraba commented on FLINK-35041:
-

1.20 test_ci core 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=58947=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8=8901

> IncrementalRemoteKeyedStateHandleTest.testSharedStateReRegistration failed
> --
>
> Key: FLINK-35041
> URL: https://issues.apache.org/jira/browse/FLINK-35041
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> {code:java}
> Apr 08 03:22:45 03:22:45.450 [ERROR] 
> org.apache.flink.runtime.state.IncrementalRemoteKeyedStateHandleTest.testSharedStateReRegistration
>  -- Time elapsed: 0.034 s <<< FAILURE!
> Apr 08 03:22:45 org.opentest4j.AssertionFailedError: 
> Apr 08 03:22:45 
> Apr 08 03:22:45 expected: false
> Apr 08 03:22:45  but was: true
> Apr 08 03:22:45   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> Apr 08 03:22:45   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> Apr 08 03:22:45   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(K.java:45)
> Apr 08 03:22:45   at 
> org.apache.flink.runtime.state.DiscardRecordedStateObject.verifyDiscard(DiscardRecordedStateObject.java:34)
> Apr 08 03:22:45   at 
> org.apache.flink.runtime.state.IncrementalRemoteKeyedStateHandleTest.testSharedStateReRegistration(IncrementalRemoteKeyedStateHandleTest.java:211)
> Apr 08 03:22:45   at java.lang.reflect.Method.invoke(Method.java:498)
> Apr 08 03:22:45   at 
> java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
> Apr 08 03:22:45   at 
> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> Apr 08 03:22:45   at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> Apr 08 03:22:45   at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
> Apr 08 03:22:45   at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> {code}
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=58782=logs=77a9d8e1-d610-59b3-fc2a-4766541e0e33=125e07e7-8de0-5c6c-a541-a567415af3ef=9238]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35041) IncrementalRemoteKeyedStateHandleTest.testSharedStateReRegistration failed

2024-04-16 Thread Ryan Skraba (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837642#comment-17837642
 ] 

Ryan Skraba commented on FLINK-35041:
-

* 1.20 Java 11: Test (module: core) 
https://github.com/apache/flink/actions/runs/8698882161/job/23856834635#step:10:9180
* 1.20 Default (Java 8): Test (module: core) 
https://github.com/apache/flink/actions/runs/8689117008/job/23826646737#step:10:7553

> IncrementalRemoteKeyedStateHandleTest.testSharedStateReRegistration failed
> --
>
> Key: FLINK-35041
> URL: https://issues.apache.org/jira/browse/FLINK-35041
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> {code:java}
> Apr 08 03:22:45 03:22:45.450 [ERROR] 
> org.apache.flink.runtime.state.IncrementalRemoteKeyedStateHandleTest.testSharedStateReRegistration
>  -- Time elapsed: 0.034 s <<< FAILURE!
> Apr 08 03:22:45 org.opentest4j.AssertionFailedError: 
> Apr 08 03:22:45 
> Apr 08 03:22:45 expected: false
> Apr 08 03:22:45  but was: true
> Apr 08 03:22:45   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> Apr 08 03:22:45   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> Apr 08 03:22:45   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> Apr 08 03:22:45   at 
> org.apache.flink.runtime.state.DiscardRecordedStateObject.verifyDiscard(DiscardRecordedStateObject.java:34)
> Apr 08 03:22:45   at 
> org.apache.flink.runtime.state.IncrementalRemoteKeyedStateHandleTest.testSharedStateReRegistration(IncrementalRemoteKeyedStateHandleTest.java:211)
> Apr 08 03:22:45   at java.lang.reflect.Method.invoke(Method.java:498)
> Apr 08 03:22:45   at 
> java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
> Apr 08 03:22:45   at 
> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> Apr 08 03:22:45   at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> Apr 08 03:22:45   at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
> Apr 08 03:22:45   at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> {code}
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=58782=logs=77a9d8e1-d610-59b3-fc2a-4766541e0e33=125e07e7-8de0-5c6c-a541-a567415af3ef=9238]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34919) WebMonitorEndpointTest.cleansUpExpiredExecutionGraphs fails starting REST server

2024-04-16 Thread Ryan Skraba (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837643#comment-17837643
 ] 

Ryan Skraba commented on FLINK-34919:
-

1.19 Java 21: Test (module: core) 
https://github.com/apache/flink/actions/runs/8698881986/job/23856839686#step:10:8836

> WebMonitorEndpointTest.cleansUpExpiredExecutionGraphs fails starting REST 
> server
> 
>
> Key: FLINK-34919
> URL: https://issues.apache.org/jira/browse/FLINK-34919
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Ryan Skraba
>Priority: Critical
>  Labels: test-stability
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=58482=logs=77a9d8e1-d610-59b3-fc2a-4766541e0e33=125e07e7-8de0-5c6c-a541-a567415af3ef=8641]
> {code:java}
> Mar 22 04:12:50 04:12:50.260 [INFO] Running 
> org.apache.flink.runtime.webmonitor.WebMonitorEndpointTest
> Mar 22 04:12:50 04:12:50.609 [ERROR] Tests run: 1, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 0.318 s <<< FAILURE! -- in 
> org.apache.flink.runtime.webmonitor.WebMonitorEndpointTest
> Mar 22 04:12:50 04:12:50.609 [ERROR] 
> org.apache.flink.runtime.webmonitor.WebMonitorEndpointTest.cleansUpExpiredExecutionGraphs
>  -- Time elapsed: 0.303 s <<< ERROR!
> Mar 22 04:12:50 java.net.BindException: Could not start rest endpoint on any 
> port in port range 8081
> Mar 22 04:12:50   at 
> org.apache.flink.runtime.rest.RestServerEndpoint.start(RestServerEndpoint.java:286)
> Mar 22 04:12:50   at 
> org.apache.flink.runtime.webmonitor.WebMonitorEndpointTest.cleansUpExpiredExecutionGraphs(WebMonitorEndpointTest.java:69)
> Mar 22 04:12:50   at java.lang.reflect.Method.invoke(Method.java:498)
> Mar 22 04:12:50   at 
> java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
> Mar 22 04:12:50   at 
> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> Mar 22 04:12:50   at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> Mar 22 04:12:50   at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
> Mar 22 04:12:50   at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> Mar 22 04:12:50  {code}
> This was noted as a symptom of FLINK-22980, but doesn't have the same failure.
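If the underlying cause is a port collision, one possible mitigation is binding 
the test's REST endpoint to an ephemeral port. A minimal sketch 
(`RestOptions.BIND_PORT` is the real option; the helper name and surrounding 
test wiring are assumed):

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.RestOptions;

static Configuration ephemeralRestPortConfig() {
    Configuration config = new Configuration();
    // "0" lets the OS pick a free port, avoiding collisions on the default 8081.
    config.set(RestOptions.BIND_PORT, "0");
    return config;
}
{code}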



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35025][Runtime/State] Abstract stream operators for async state processing [flink]

2024-04-16 Thread via GitHub


fredia commented on code in PR #24657:
URL: https://github.com/apache/flink/pull/24657#discussion_r1567096409


##
flink-runtime/src/main/java/org/apache/flink/runtime/asyncprocessing/StateRequestBuffer.java:
##
@@ -59,6 +60,9 @@ public StateRequestBuffer() {
 }
 
 void enqueueToActive(StateRequest request) {
+if (request.getRequestType() == StateRequestType.SYNC_POINT) {
+request.getFuture().complete(null);
+}

Review Comment:
   If `requestType == StateRequestType.SYNC_POINT`, should the request be 
completed immediately and not added to the active buffer?
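   A minimal sketch of the early return this seems to suggest; the generic 
parameters and the `activeQueue` field are assumptions based on the 
surrounding PR, not the actual code:

```java
void enqueueToActive(StateRequest<?, ?, ?> request) {
    if (request.getRequestType() == StateRequestType.SYNC_POINT) {
        // A sync point carries no real state access: complete it right away
        // and skip buffering it in the active queue.
        request.getFuture().complete(null);
        return;
    }
    activeQueue.add(request);
}
```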



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-35002) GitHub action/upload-artifact@v4 can timeout

2024-04-16 Thread Ryan Skraba (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837641#comment-17837641
 ] 

Ryan Skraba commented on FLINK-35002:
-

1.20 AdaptiveScheduler: Compile 
https://github.com/apache/flink/commit/9ae9c1463d87da2fa0e4dd58ee6fd10d38cda6bd/checks/23856554227/logs
1.18 Hadoop 3.1.3: Compile 
https://github.com/apache/flink/commit/f5c62abf7475ea8bc976de2a2079b1a9e29b79df/checks/23856554112/logs

Note that both of the above built successfully and ran their tests; only the 
artifact upload failed, so the runs never completed.

> GitHub action/upload-artifact@v4 can timeout
> 
>
> Key: FLINK-35002
> URL: https://issues.apache.org/jira/browse/FLINK-35002
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Ryan Skraba
>Priority: Major
>  Labels: github-actions, test-stability
>
> A timeout can occur when uploading a successfully built artifact:
>  * [https://github.com/apache/flink/actions/runs/8516411871/job/23325392650]
> {code:java}
> 2024-04-02T02:20:15.6355368Z With the provided path, there will be 1 file 
> uploaded
> 2024-04-02T02:20:15.6360133Z Artifact name is valid!
> 2024-04-02T02:20:15.6362872Z Root directory input is valid!
> 2024-04-02T02:20:20.6975036Z Attempt 1 of 5 failed with error: Request 
> timeout: /twirp/github.actions.results.api.v1.ArtifactService/CreateArtifact. 
> Retrying request in 3000 ms...
> 2024-04-02T02:20:28.7084937Z Attempt 2 of 5 failed with error: Request 
> timeout: /twirp/github.actions.results.api.v1.ArtifactService/CreateArtifact. 
> Retrying request in 4785 ms...
> 2024-04-02T02:20:38.5015936Z Attempt 3 of 5 failed with error: Request 
> timeout: /twirp/github.actions.results.api.v1.ArtifactService/CreateArtifact. 
> Retrying request in 7375 ms...
> 2024-04-02T02:20:50.8901508Z Attempt 4 of 5 failed with error: Request 
> timeout: /twirp/github.actions.results.api.v1.ArtifactService/CreateArtifact. 
> Retrying request in 14988 ms...
> 2024-04-02T02:21:10.9028438Z ##[error]Failed to CreateArtifact: Failed to 
> make request after 5 attempts: Request timeout: 
> /twirp/github.actions.results.api.v1.ArtifactService/CreateArtifact
> 2024-04-02T02:22:59.9893296Z Post job cleanup.
> 2024-04-02T02:22:59.9958844Z Post job cleanup. {code}
> (This is unlikely to be something we can fix, but we can track it.)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-25537] [JUnit5 Migration] Module: flink-core with,Package: util [flink]

2024-04-16 Thread via GitHub


flinkbot commented on PR #24670:
URL: https://github.com/apache/flink/pull/24670#issuecomment-2058604771

   
   ## CI report:
   
   * aae449392b8ce075a2413679ed2a0c42eaa52c68 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35025][Runtime/State] Abstract stream operators for async state processing [flink]

2024-04-16 Thread via GitHub


fredia commented on code in PR #24657:
URL: https://github.com/apache/flink/pull/24657#discussion_r1567004818


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/io/RecordProcessorUtils.java:
##
@@ -82,6 +87,10 @@ public static <T> ThrowingConsumer<StreamRecord<T>, Exception> getRecordProcessor(
 
 if (canOmitSetKeyContext) {

Review Comment:
   Thanks for the clarification, I see: `canOmitSetKeyContext` indicates that 
the operator is non-keyed, and async state processing applies only to keyed 
operators.
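   A rough sketch of the dispatch being described; everything except 
`canOmitSetKeyContext` is a simplified assumption rather than the PR's exact 
code:

```java
ThrowingConsumer<StreamRecord<T>, Exception> processor =
        canOmitSetKeyContext
                // Non-keyed operator: no key context to set, and therefore
                // no keyed (async) state processing to hook in.
                ? operator::processElement
                : record -> {
                    operator.setKeyContextElement1(record);
                    operator.processElement(record);
                };
```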



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34961] Use dedicated CI name for Pulsar connector to differentiate it in infra-reports [flink-connector-pulsar]

2024-04-16 Thread via GitHub


snuyanzin merged PR #89:
URL: https://github.com/apache/flink-connector-pulsar/pull/89


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-35123) Flink Kubernetes Operator should not do deleteHAData

2024-04-16 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837610#comment-17837610
 ] 

Gyula Fora commented on FLINK-35123:


I agree that when the REST API is accessible we could call shutdown and leave 
the HA metadata untouched. But there are some cases where you need to delete 
the HA metadata explicitly:
 - The cluster is not in a healthy state (REST API not available)
 - The job was previously suspended with the last-state upgrade mode, which 
leaves the HA metadata behind

Also, with the Kubernetes HA configuration, which is much more common than ZK, 
the HA metadata cleanup is much faster than anything else: it's a simple 
ConfigMap deletion.
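For the healthy-cluster path, a minimal sketch of the REST-based shutdown (the 
`flinkConfig` and `clusterId` wiring is assumed; `RestClusterClient#shutDownCluster` 
is the client call mentioned in the description):

{code:java}
try (RestClusterClient<String> client =
        new RestClusterClient<>(flinkConfig, clusterId)) {
    // Graceful path: the cluster shuts down and cleans up its own HA
    // metadata; the operator never has to touch it directly.
    client.shutDownCluster();
} catch (Exception e) {
    // REST endpoint unreachable: fall back to explicit HA metadata deletion.
}
{code}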

> Flink Kubernetes Operator should not do deleteHAData 
> -
>
> Key: FLINK-35123
> URL: https://issues.apache.org/jira/browse/FLINK-35123
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.7.0, kubernetes-operator-1.8.0
>Reporter: Fei Feng
>Priority: Major
> Attachments: image-2024-04-16-15-56-33-426.png
>
>
> We use Flink HA based on ZooKeeper. When a lot of FlinkDeployments were 
> being deleted, the operator spent too much time in cleanHaData; the jstack 
> showed that the reconcile thread was hanging on disconnecting from 
> ZooKeeper. This made deleting FlinkDeployments slow. 
> !image-2024-04-16-15-56-33-426.png|width=502,height=263!
>  
> I don't understand why the Flink Kubernetes operator needs to clean the HA 
> data, as [~aitozi] commented in PR [FLINK-26336 Call cancel on deletion & 
> clean up configmaps as well 
> #28|https://github.com/apache/flink-kubernetes-operator/pull/28#discussion_r815968841]:
> {quote}it's a bit of out of scope of the operator responsibility or ability
> {quote}
> and I totally agree with his point. 
> I also want to know why we don't call the RestClusterClient#shutDownCluster 
> interface, which is
> 1. more graceful and reasonable (the operator need not care whether the 
> Flink app enables HA or not), and 2. compatible across Flink versions.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34961] Use dedicated CI name for Pulsar connector to differentiate it in infra-reports [flink-connector-pulsar]

2024-04-16 Thread via GitHub


snuyanzin commented on PR #89:
URL: 
https://github.com/apache/flink-connector-pulsar/pull/89#issuecomment-2058580128

   yep, thanks for noticing it
   however I guess it is worth discussing first before changing it
   thanks for taking a look


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-25537] [JUnit5 Migration] Module: flink-core with,Package: util [flink]

2024-04-16 Thread via GitHub


GOODBOY008 commented on PR #24670:
URL: https://github.com/apache/flink/pull/24670#issuecomment-2058579166

   @Jiabao-Sun @1996fanrui PTAL 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34549][API] Introduce config, context and processingTimerService for DataStream API V2 [flink]

2024-04-16 Thread via GitHub


reswqa commented on code in PR #24541:
URL: https://github.com/apache/flink/pull/24541#discussion_r1566991099


##
flink-datastream-api/src/main/java/org/apache/flink/datastream/api/context/TimestampManager.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.datastream.api.context;
+
+import org.apache.flink.annotation.Experimental;
+
+import java.util.Optional;
+
+/** Responsible for retrieving timestamp-related information for a process 
function. */
+@Experimental
+public interface TimestampManager {
+/**
+ * Get the timestamp of the record currently being processed.
+ *
+ * @return the timestamp of the currently processed record, or empty if the 
record has no
+ * timestamp.
+ */
+Optional<Long> getCurrentRecordTimestamp();

Review Comment:
   It sounds more reasonable to introduce this when adding event-time support, 
so I will remove it for now.
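   For illustration, a hypothetical call site under the current 
`Optional`-based contract (the `ctx.getTimestampManager()` accessor is an 
assumption, not confirmed API):

```java
// Hypothetical usage; ctx.getTimestampManager() is assumed for illustration.
Optional<Long> ts = ctx.getTimestampManager().getCurrentRecordTimestamp();
long timestamp = ts.orElse(Long.MIN_VALUE); // fall back when the record has no timestamp
```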



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34549][API] Introduce config, context and processingTimerService for DataStream API V2 [flink]

2024-04-16 Thread via GitHub


reswqa commented on code in PR #24541:
URL: https://github.com/apache/flink/pull/24541#discussion_r1566988135


##
flink-datastream-api/src/main/java/org/apache/flink/datastream/api/context/StateManager.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.datastream.api.context;
+
+import org.apache.flink.annotation.Experimental;
+
+import java.util.Optional;
+
+/** Responsible for managing runtime information related to the state of a 
process function. */
+@Experimental
+public interface StateManager {
+/**
+ * Get the key of the current record.
+ *
+ * @return The key of the currently processed record, or {@link 
Optional#empty()} if the
+ * key cannot be extracted for this function.

Review Comment:
   We finally decided in FLIP-433 that we need to throw an exception for 
illegal access to state. I think the same is true for this case, so I've 
removed the `Optional`.
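   A sketch of the revised contract, with the signature and exception type 
assumed to mirror the FLIP-433 decision rather than the merged code:

```java
public interface StateManager {
    /**
     * @return the key of the record currently being processed
     * @throws UnsupportedOperationException if no key can be extracted
     *     for this function, instead of returning an empty Optional
     */
    <K> K getCurrentKey() throws UnsupportedOperationException;
}
```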



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34549][API] Introduce config, context and processingTimerService for DataStream API V2 [flink]

2024-04-16 Thread via GitHub


reswqa commented on code in PR #24541:
URL: https://github.com/apache/flink/pull/24541#discussion_r1566978504


##
flink-datastream/src/main/java/org/apache/flink/datastream/impl/stream/ProcessConfigurableDataStream.java:
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.datastream.impl.stream;
+
+import org.apache.flink.api.common.operators.SlotSharingGroup;
+import org.apache.flink.api.common.operators.util.OperatorValidationUtils;
+import org.apache.flink.api.dag.Transformation;
+import org.apache.flink.datastream.api.stream.ProcessConfigurable;
+import org.apache.flink.datastream.impl.ExecutionEnvironmentImpl;
+import org.apache.flink.streaming.api.datastream.DataStream;
+
+/** A {@link DataStream} implementation whose processing is configurable. */
+@SuppressWarnings("unchecked")
+public class ProcessConfigurableDataStream>

Review Comment:
   You're totally right, binding configuration to streams is something we 
wanted to avoid in the first place, even at the implementation level. I have 
refactored the design here.
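   A simplified sketch of that direction, with method names assumed from 
FLIP-408's `ProcessConfigurable` rather than taken from the refactored code:

```java
// Illustration only: configuration lives behind its own interface, so stream
// implementations return a configurable handle rather than carrying the
// setters themselves.
public interface ProcessConfigurable<T extends ProcessConfigurable<T>> {
    T withName(String name);
    T withParallelism(int parallelism);
    T withUid(String uid);
}
```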



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


