Re: [PR] [FLINK-35157][runtime] Sources with watermark alignment get stuck once some subtasks finish [flink]

2024-06-04 Thread via GitHub


elon-X commented on PR #24757:
URL: https://github.com/apache/flink/pull/24757#issuecomment-2148906495

   @1996fanrui I've made some changes based on your suggestions. Please review 
the changes when you have a chance and let me know if there are any further 
improvements needed. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35415][base] Fix compatibility with Flink 1.19 [flink-cdc]

2024-06-04 Thread via GitHub


yuxiqian commented on PR #3348:
URL: https://github.com/apache/flink-cdc/pull/3348#issuecomment-2148866328

   Done. It seems I messed up the commit history during the last rebase... Fixed it.





Re: [PR] [FLINK-35504] Improve Elasticsearch 8 connector observability [flink-connector-elasticsearch]

2024-06-04 Thread via GitHub


liuml07 commented on code in PR #106:
URL: 
https://github.com/apache/flink-connector-elasticsearch/pull/106#discussion_r1626913771


##
flink-connector-elasticsearch8/src/main/java/org/apache/flink/connector/elasticsearch/sink/Elasticsearch8AsyncWriter.java:
##
@@ -102,13 +103,15 @@ public Elasticsearch8AsyncWriter(
 final SinkWriterMetricGroup metricGroup = context.metricGroup();
 checkNotNull(metricGroup);
 
+this.numRecordsSendCounter = metricGroup.getNumRecordsSendCounter();
 this.numRecordsOutErrorsCounter = metricGroup.getNumRecordsOutErrorsCounter();
 }
 
 @Override
 protected void submitRequestEntries(
 List requestEntries, Consumer> requestResult) {
-LOG.debug("submitRequestEntries with {} items", requestEntries.size());
+LOG.info("submitRequestEntries with {} items", requestEntries.size());

Review Comment:
   I added a new commit to illustrate the latter idea. I can switch to DEBUG if 
you think it's overkill or not necessary.






Re: [PR] [Flink-35473][table] Improve Table/SQL Configuration for Flink 2.0 [flink]

2024-06-04 Thread via GitHub


flinkbot commented on PR #24889:
URL: https://github.com/apache/flink/pull/24889#issuecomment-2148823508

   
   ## CI report:
   
   * dbc98f28e5eaa59a323807132c17d909fcc6133a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





Re: [PR] [FLINK-35504] Improve Elasticsearch 8 connector observability [flink-connector-elasticsearch]

2024-06-04 Thread via GitHub


liuml07 commented on code in PR #106:
URL: 
https://github.com/apache/flink-connector-elasticsearch/pull/106#discussion_r1626901218


##
flink-connector-elasticsearch8/src/main/java/org/apache/flink/connector/elasticsearch/sink/Elasticsearch8AsyncWriter.java:
##
@@ -102,13 +103,15 @@ public Elasticsearch8AsyncWriter(
 final SinkWriterMetricGroup metricGroup = context.metricGroup();
 checkNotNull(metricGroup);
 
+this.numRecordsSendCounter = metricGroup.getNumRecordsSendCounter();
 this.numRecordsOutErrorsCounter = metricGroup.getNumRecordsOutErrorsCounter();
 }
 
 @Override
 protected void submitRequestEntries(
 List requestEntries, Consumer> requestResult) {
-LOG.debug("submitRequestEntries with {} items", requestEntries.size());
+LOG.info("submitRequestEntries with {} items", requestEntries.size());

Review Comment:
   Understood. Though this gets called once per batch of entries, it could still produce many of these success log messages, which are not very useful. We can move this back to DEBUG level, so it's more like "no-logs-is-good-logs". Or do you think it's worthwhile to print the success message at a fixed frequency, so it still explicitly reports progress?
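
   For illustration, a minimal sketch of the "log at a frequency" idea (the class name, counter, and interval are assumptions for the sketch, not part of this PR):

   ```java
   import java.util.concurrent.atomic.AtomicLong;

   import org.slf4j.Logger;
   import org.slf4j.LoggerFactory;

   /** Sketch: report progress at INFO, but only for every N-th submitted batch. */
   class RateLimitedBatchLogger {
       private static final Logger LOG = LoggerFactory.getLogger(RateLimitedBatchLogger.class);

       private final AtomicLong batchCounter = new AtomicLong();
       private final long logEveryNthBatch;

       RateLimitedBatchLogger(long logEveryNthBatch) {
           this.logEveryNthBatch = logEveryNthBatch;
       }

       void onBatchSubmitted(int items) {
           long batches = batchCounter.incrementAndGet();
           if (batches % logEveryNthBatch == 0) {
               // Progress is still reported explicitly, but at a bounded rate,
               // so successful submissions cannot flood the log file.
               LOG.info("submitted {} batches so far, latest with {} items", batches, items);
           }
       }
   }
   ```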






[PR] [Flink-35473][table] Improve Table/SQL Configuration for Flink 2.0 [flink]

2024-06-04 Thread via GitHub


LadyForest opened a new pull request, #24889:
URL: https://github.com/apache/flink/pull/24889

   ## What is the purpose of the change
   
   This pull request implements 
[FLIP-457](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=307136992=jira).
   
   ## Brief change log
   
   This PR contains 6 commits.
   - 1a14d7ed Move adaptive local hash agg option to ExecutionConfigOptions
   - 06d11a93 Move optimizer-related options to OptimizerConfigOptions
   - 35007704 Move LookupJoinHintOptions to table-api module
   - 5445c3d8 Deprecate the outdated optimizer config options
   - bcfd31c9 Add issue navigator to track legacy window options
   - dbc98f28 Add missing annotation for ResultMode and SqlClientOptions
   
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes
 - The serializer: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   





[jira] [Commented] (FLINK-35485) JobMaster failed with "the job xx has not been finished"

2024-06-04 Thread Xingcan Cui (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852267#comment-17852267
 ] 

Xingcan Cui commented on FLINK-35485:
-

Hi [~mapohl], unfortunately, I failed to fetch more logs from the cluster. If 
it happens again, I'll try to get some logs and post them here. Thanks!

> JobMaster failed with "the job xx has not been finished"
> 
>
> Key: FLINK-35485
> URL: https://issues.apache.org/jira/browse/FLINK-35485
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.1
>Reporter: Xingcan Cui
>Priority: Major
>
> We ran a session cluster on K8s and used Flink SQL gateway to submit queries. 
> Hit the following rare exception once which caused the job manager to restart.
> {code:java}
> org.apache.flink.util.FlinkException: JobMaster for job 
> 50d681ae1e8170f77b4341dda6aba9bc failed.
>   at 
> org.apache.flink.runtime.dispatcher.Dispatcher.jobMasterFailed(Dispatcher.java:1454)
>   at 
> org.apache.flink.runtime.dispatcher.Dispatcher.jobManagerRunnerFailed(Dispatcher.java:776)
>   at 
> org.apache.flink.runtime.dispatcher.Dispatcher.lambda$runJob$6(Dispatcher.java:698)
>   at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown 
> Source)
>   at 
> java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown 
> Source)
>   at java.base/java.util.concurrent.CompletableFuture$Completion.run(Unknown 
> Source)
>   at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.lambda$handleRunAsync$4(PekkoRpcActor.java:451)
>   at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
>   at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRunAsync(PekkoRpcActor.java:451)
>   at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcMessage(PekkoRpcActor.java:218)
>   at 
> org.apache.flink.runtime.rpc.pekko.FencedPekkoRpcActor.handleRpcMessage(FencedPekkoRpcActor.java:85)
>   at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleMessage(PekkoRpcActor.java:168)
>   at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33)
>   at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29)
>   at scala.PartialFunction.applyOrElse(PartialFunction.scala:127)
>   at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126)
>   at 
> org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29)
>   at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175)
>   at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176)
>   at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176)
>   at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547)
>   at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545)
>   at 
> org.apache.pekko.actor.AbstractActor.aroundReceive(AbstractActor.scala:229)
>   at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590)
>   at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557)
>   at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280)
>   at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:241)
>   at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:253)
>   at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
>   at 
> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown 
> Source)
>   at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
>   at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
>   at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
> Caused by: org.apache.flink.runtime.jobmaster.JobNotFinishedException: The 
> job (50d681ae1e8170f77b4341dda6aba9bc) has not been finished.
>   at 
> org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.closeAsync(DefaultJobMasterServiceProcess.java:157)
>   at 
> org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcess(JobMasterServiceLeadershipRunner.java:431)
>   at 
> org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.callIfRunning(JobMasterServiceLeadershipRunner.java:476)
>   at 
> org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.lambda$stopJobMasterServiceProcessAsync$12(JobMasterServiceLeadershipRunner.java:407)
>   at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(Unknown 
> Source)
>   at java.base/java.util.concurrent.CompletableFuture.thenCompose(Unknown 
> Source)
>   at 
> org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.stopJobMasterServiceProcessAsync(JobMasterServiceLeadershipRunner.java:405)
>   at 
> 

Re: [PR] [FLINK-26050] Manually compact small SST files [flink]

2024-06-04 Thread via GitHub


Myasuka commented on PR #24880:
URL: https://github.com/apache/flink/pull/24880#issuecomment-2148803082

   @rkhachatryan since the manual compaction would introduce extra CPU & IO, and it would impact the block cache by changing the level structure, I think we could also add a doc telling users how it impacts throughput per core, measured with Nexmark.





Re: [PR] [FLINK-35201][table] Support the execution of drop materialized table in full refresh mode [flink]

2024-06-04 Thread via GitHub


flinkbot commented on PR #24888:
URL: https://github.com/apache/flink/pull/24888#issuecomment-2148801220

   
   ## CI report:
   
   * f0ab631e7d9fea91593ec7550454fb926b265a60 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





Re: [PR] [FLINK-35415][base] Fix compatibility with Flink 1.19 [flink-cdc]

2024-06-04 Thread via GitHub


leonardBang commented on code in PR #3348:
URL: https://github.com/apache/flink-cdc/pull/3348#discussion_r1626875255


##
flink-cdc-composer/src/main/java/org/apache/flink/cdc/composer/flink/translator/DataSinkTranslator.java:
##
@@ -152,4 +152,37 @@ private String generateSinkName(SinkDef sinkDef) {
 return sinkDef.getName()
 .orElse(String.format("Flink CDC Event Sink: %s", sinkDef.getType()));
 }
+
+private static  SimpleVersionedSerializer getCommittableSerializer(Object sink) {
+// TwoPhaseCommittingSink has been deprecated, and its signature has changed
+// during Flink 1.18 to 1.19. Remove this when Flink 1.18 is no longer supported.
+try {
+return (SimpleVersionedSerializer)
+sink.getClass().getDeclaredMethod("getCommittableSerializer").invoke(sink);
+} catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException e) {
+throw new RuntimeException("Failed to get CommittableSerializer", e);
+}
+}
+
+private static 
+OneInputStreamOperatorFactory, CommittableMessage>
+getCommitterOperatorFactory(
+Sink sink, boolean isBatchMode, boolean isCheckpointingEnabled) {
+// OneInputStreamOperatorFactory is an @Internal class, and its signature has changed
Review Comment:
   ditto



##
flink-cdc-composer/src/main/java/org/apache/flink/cdc/composer/flink/translator/DataSinkTranslator.java:
##
@@ -150,4 +150,37 @@ private String generateSinkName(SinkDef sinkDef) {
 return sinkDef.getName()
 .orElse(String.format("Flink CDC Event Sink: %s", sinkDef.getType()));
 }
+
+private static  SimpleVersionedSerializer getCommittableSerializer(Object sink) {
+// TwoPhaseCommittingSink has been deprecated, and its signature has changed
+// during Flink 1.18 to 1.19. Remove this when Flink 1.18 is no longer supported.
+try {
+return (SimpleVersionedSerializer)
+sink.getClass().getDeclaredMethod("getCommittableSerializer").invoke(sink);

Review Comment:
   +1, distributing CDC with various Flink versions in the future is recommended, just like other Flink external connectors.



##
flink-cdc-composer/src/main/java/org/apache/flink/cdc/composer/flink/translator/DataSinkTranslator.java:
##
@@ -152,4 +152,37 @@ private String generateSinkName(SinkDef sinkDef) {
 return sinkDef.getName()
 .orElse(String.format("Flink CDC Event Sink: %s", sinkDef.getType()));
 }
+
+private static  SimpleVersionedSerializer getCommittableSerializer(Object sink) {
+// TwoPhaseCommittingSink has been deprecated, and its signature has changed

Review Comment:
   we can add a //FIXME note
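
   For illustration, a minimal sketch of how such a note could read on the reflective helper quoted above (exact wording is an assumption, not the committed change; imports as in the quoted file):

   ```java
   // FIXME: remove this reflective lookup once Flink 1.18 is no longer supported;
   // TwoPhaseCommittingSink#getCommittableSerializer changed its signature between
   // Flink 1.18 and 1.19, so it is resolved via reflection here.
   private static <CommT> SimpleVersionedSerializer<CommT> getCommittableSerializer(Object sink) {
       try {
           return (SimpleVersionedSerializer<CommT>)
                   sink.getClass().getDeclaredMethod("getCommittableSerializer").invoke(sink);
       } catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException e) {
           throw new RuntimeException("Failed to get CommittableSerializer", e);
       }
   }
   ```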






[jira] [Updated] (FLINK-35201) Support the execution of drop materialized table in full refresh mode

2024-06-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35201:
---
Labels: pull-request-available  (was: )

> Support the execution of drop materialized table in full refresh mode
> -
>
> Key: FLINK-35201
> URL: https://issues.apache.org/jira/browse/FLINK-35201
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Gateway
>Affects Versions: 1.20.0
>Reporter: dalongliu
>Assignee: Feng Jin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> In full refresh mode, support drop materialized table and the background 
> refresh workflow.
> {code:SQL}
> DROP MATERIALIZED TABLE [ IF EXISTS ] [catalog_name.][db_name.]table_name
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-35201][table] Support the execution of drop materialized table in full refresh mode [flink]

2024-06-04 Thread via GitHub


hackergin opened a new pull request, #24888:
URL: https://github.com/apache/flink/pull/24888

   ## What is the purpose of the change
   
   *Support the execution of drop materialized table in full refresh mode*
   
   
   ## Brief change log
   
   * Support the execution of drop materialized table in full refresh mode
   
   
   ## Verifying this change
   
   * Add test case `testDropMaterializedTableInFullMode` to verify the 
execution of drop materialized table in full mode. 
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (Will be added in a separate pr)
   





Re: [PR] [FLINK-32081][checkpoint] Compatibility between file-merging on and off across job runs [flink]

2024-06-04 Thread via GitHub


Zakelly commented on code in PR #24873:
URL: https://github.com/apache/flink/pull/24873#discussion_r1626876357


##
flink-tests/src/test/java/org/apache/flink/test/checkpointing/SnapshotFileMergingCompatibilityITCase.java:
##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.test.checkpointing;
+
+import org.apache.flink.configuration.CheckpointingOptions;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
+import org.apache.flink.core.execution.RestoreMode;
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.test.util.MiniClusterWithClientResource;
+import org.apache.flink.util.TestLogger;
+
+import org.junit.jupiter.api.io.TempDir;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.MethodSource;
+
+import java.nio.file.Path;
+import java.util.Arrays;
+import java.util.Collection;
+
+import static org.apache.flink.test.checkpointing.ResumeCheckpointManuallyITCase.runJobAndGetExternalizedCheckpoint;
+import static org.assertj.core.api.Assertions.assertThat;
+
+/**
+ * FileMerging Compatibility IT case which tests recovery from a checkpoint created in different
+ * fileMerging mode (i.e. fileMerging enabled/disabled).
+ */
+public class SnapshotFileMergingCompatibilityITCase extends TestLogger {
+
+public static Collection parameters() {
+return Arrays.asList(
+new Object[][] {
+{RestoreMode.CLAIM, true},
+{RestoreMode.CLAIM, false},
+{RestoreMode.NO_CLAIM, true},
+{RestoreMode.NO_CLAIM, false}
+});
+}
+
+@ParameterizedTest(name = "RestoreMode = {0}, fileMergingAcrossBoundary = {1}")
+@MethodSource("parameters")
+public void testSwitchFromDisablingToEnablingFileMerging(
+RestoreMode restoreMode, boolean fileMergingAcrossBoundary, @TempDir Path checkpointDir)
+throws Exception {
+testSwitchingFileMerging(
+checkpointDir, false, true, restoreMode, fileMergingAcrossBoundary);
+}
+
+@ParameterizedTest(name = "RestoreMode = {0}, fileMergingAcrossBoundary = {1}")
+@MethodSource("parameters")
+public void testSwitchFromEnablingToDisablingFileMerging(
+RestoreMode restoreMode, boolean fileMergingAcrossBoundary, @TempDir Path checkpointDir)
+throws Exception {
+testSwitchingFileMerging(
+checkpointDir, true, false, restoreMode, fileMergingAcrossBoundary);
+}
+
+private void testSwitchingFileMerging(
+Path checkpointDir,
+boolean firstFileMergingSwitch,
+boolean secondFileMergingSwitch,
+RestoreMode restoreMode,
+boolean fileMergingAcrossBoundary)
+throws Exception {
+final Configuration config = new Configuration();
+config.set(CheckpointingOptions.CHECKPOINTS_DIRECTORY, checkpointDir.toUri().toString());
+config.set(CheckpointingOptions.INCREMENTAL_CHECKPOINTS, true);
+config.set(CheckpointingOptions.FILE_MERGING_ACROSS_BOUNDARY, fileMergingAcrossBoundary);
+config.set(CheckpointingOptions.FILE_MERGING_ENABLED, firstFileMergingSwitch);
+MiniClusterWithClientResource firstCluster =
+new MiniClusterWithClientResource(
+new MiniClusterResourceConfiguration.Builder()
+.setConfiguration(config)
+.setNumberTaskManagers(2)
+.setNumberSlotsPerTaskManager(2)
+.build());
+EmbeddedRocksDBStateBackend stateBackend1 = new EmbeddedRocksDBStateBackend();
+stateBackend1.configure(config, Thread.currentThread().getContextClassLoader());
+firstCluster.before();
+String externalCheckpoint;
+try {
+externalCheckpoint =
+runJobAndGetExternalizedCheckpoint(stateBackend1, null, firstCluster, 

Re: [PR] [FLINK-35423][table] ARRAY_EXCEPT should follow set semantics [flink]

2024-06-04 Thread via GitHub


xuyangzhong commented on code in PR #24828:
URL: https://github.com/apache/flink/pull/24828#discussion_r1626871428


##
docs/data/sql_functions.yml:
##
@@ -687,7 +687,7 @@ collection:
 description: Returns a map created from an arrays of keys and values. Note that the lengths of two arrays should be the same.
   - sql: ARRAY_EXCEPT(array1, array2)
 table: arrayOne.arrayExcept(arrayTwo)
-description: Returns an ARRAY that contains the elements from array1 that are not in array2. If no elements remain after excluding the elements in array2 from array1, the function returns an empty ARRAY. If one or both arguments are NULL, the function returns NULL. The order of the elements from array1 is kept.
+description: Returns an ARRAY that contains the elements from array1 that are not in array2. If no elements remain after excluding the elements in array2 from array1, the function returns an empty ARRAY. If one or both arguments are NULL, the function returns NULL. The order of the elements from array1 is kept however the duplicates are removed.

Review Comment:
   nit: what about following the format with 'ARRAY_UNION'?
   ```
   Returns an ARRAY that contains the elements from array1 that are not in 
array2, without duplicates.
   ...
   ```






Re: [PR] [build] fix jackson conflicts among cdc connectors [flink-cdc]

2024-06-04 Thread via GitHub


leonardBang merged PR #2987:
URL: https://github.com/apache/flink-cdc/pull/2987





Re: [PR] [FLINK-35510][statebackend] Implement basic incremental checkpoint fo… [flink]

2024-06-04 Thread via GitHub


zoltar9264 commented on PR #24879:
URL: https://github.com/apache/flink/pull/24879#issuecomment-2148784514

   @flinkbot run azure





Re: [PR] [FLINK-35504] Improve Elasticsearch 8 connector observability [flink-connector-elasticsearch]

2024-06-04 Thread via GitHub


reswqa commented on code in PR #106:
URL: 
https://github.com/apache/flink-connector-elasticsearch/pull/106#discussion_r1626868230


##
flink-connector-elasticsearch8/src/main/java/org/apache/flink/connector/elasticsearch/sink/Elasticsearch8AsyncWriter.java:
##
@@ -102,13 +103,15 @@ public Elasticsearch8AsyncWriter(
 final SinkWriterMetricGroup metricGroup = context.metricGroup();
 checkNotNull(metricGroup);
 
+this.numRecordsSendCounter = metricGroup.getNumRecordsSendCounter();
 this.numRecordsOutErrorsCounter = metricGroup.getNumRecordsOutErrorsCounter();
 }
 
 @Override
 protected void submitRequestEntries(
 List requestEntries, Consumer> requestResult) {
-LOG.debug("submitRequestEntries with {} items", requestEntries.size());
+LOG.info("submitRequestEntries with {} items", requestEntries.size());

Review Comment:
   I'm in favor of boosting the log level when an exception occurs, but I don't really want to print things like submitRequest and successful requests. I'm afraid they would fill up the entire log file without adding much value.






Re: [PR] [FLINK-35522][runtime] Fix the issue that the source task may get stuck in speculative execution mode. [flink]

2024-06-04 Thread via GitHub


zhuzhurk commented on code in PR #24887:
URL: https://github.com/apache/flink/pull/24887#discussion_r1626862364


##
flink-tests/src/test/java/org/apache/flink/test/scheduling/SpeculativeExecutionITCase.java:
##
@@ -105,6 +109,8 @@ class SpeculativeExecutionITCase {
 
 private static final AtomicInteger slowTaskCounter = new AtomicInteger(1);
 
+private static final AtomicInteger forceFailureCounter = new AtomicInteger();

Review Comment:
   Maybe make it a field of the source?



##
flink-tests/src/test/java/org/apache/flink/test/scheduling/SpeculativeExecutionITCase.java:
##
@@ -524,30 +542,66 @@ public void open(GenericInputSplit split) throws IOException {
 }
 
 private static class TestingNumberSequenceSource extends NumberSequenceSource {
-private TestingNumberSequenceSource() {
+
+private final boolean forceFailureFlag;
+
+private TestingNumberSequenceSource(boolean forceFailureFlag) {
 super(0, NUMBERS_TO_PRODUCE - 1);
+this.forceFailureFlag = forceFailureFlag;
 }
 
 @Override
 public SourceReader createReader(
 SourceReaderContext readerContext) {
-return new TestingIteratorSourceReader(readerContext);
+return new TestingIteratorSourceReader(readerContext, forceFailureFlag);
+}
+
+@Override
+public SplitEnumerator>
+createEnumerator(final SplitEnumeratorContext enumContext) {
+
+int splitSize = enumContext.currentParallelism();
+if (forceFailureFlag) {
+splitSize = 1;

Review Comment:
   Not sure why it should be 1 instead of 0 (no split)?
   
   And comments are needed to explain this kind of magic.



##
flink-tests/src/test/java/org/apache/flink/test/scheduling/SpeculativeExecutionITCase.java:
##
@@ -524,30 +542,66 @@ public void open(GenericInputSplit split) throws IOException {
 }
 
 private static class TestingNumberSequenceSource extends NumberSequenceSource {
-private TestingNumberSequenceSource() {
+
+private final boolean forceFailureFlag;

Review Comment:
   Maybe add some comments to explain what will happen if `forceFailureCounter > 0`?



##
flink-tests/src/test/java/org/apache/flink/test/scheduling/SpeculativeExecutionITCase.java:
##
@@ -524,30 +542,66 @@ public void open(GenericInputSplit split) throws IOException {
 }
 
 private static class TestingNumberSequenceSource extends NumberSequenceSource {
-private TestingNumberSequenceSource() {
+
+private final boolean forceFailureFlag;

Review Comment:
   Looks to me like it can be replaced by `forceFailureCounter` (forceFailureCounter == 0 meaning forceFailureFlag == false).






Re: [PR] [FLINK-34482][checkpoint] Rename checkpointing options [flink]

2024-06-04 Thread via GitHub


Zakelly commented on code in PR #24878:
URL: https://github.com/apache/flink/pull/24878#discussion_r1626851630


##
flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java:
##
@@ -210,11 +209,11 @@ public class CheckpointingOptions {
  * Local recovery currently only covers keyed state backends. Currently, MemoryStateBackend
  * does not support local recovery and ignore this option.
  */
-@Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS)
-public static final ConfigOption LOCAL_RECOVERY_TASK_MANAGER_STATE_ROOT_DIRS =

Review Comment:
   Can we just rename the variable? I mean, is there any chance that a developer uses this?



##
flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java:
##
@@ -97,11 +97,11 @@ public class CheckpointingOptions {
  * to {@link #CHECKPOINTS_DIRECTORY}, and the checkpoint meta data and actual program state
  * will both be persisted to the path.
  */
-@Documentation.Section(value = Documentation.Sections.COMMON_STATE_BACKENDS, position = 2)
 public static final ConfigOption CHECKPOINT_STORAGE =
-ConfigOptions.key("state.checkpoint-storage")

Review Comment:
   Some `state.checkpoint-storage` in documents should also be updated.
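
   As background, a minimal sketch of how a renamed option can keep honoring the old key via Flink's deprecated-key fallback (the new key name below is a placeholder, not necessarily the one chosen in this PR):

   ```java
   import org.apache.flink.configuration.ConfigOption;
   import org.apache.flink.configuration.ConfigOptions;

   public class RenamedOptionSketch {
       // Hypothetical new key; "state.checkpoint-storage" stays readable as a deprecated alias.
       public static final ConfigOption<String> CHECKPOINT_STORAGE =
               ConfigOptions.key("execution.checkpointing.storage")
                       .stringType()
                       .noDefaultValue()
                       .withDeprecatedKeys("state.checkpoint-storage");
   }
   ```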



##
flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java:
##
@@ -97,11 +97,11 @@ public class CheckpointingOptions {
  * to {@link #CHECKPOINTS_DIRECTORY}, and the checkpoint meta data and actual program state
  * will both be persisted to the path.
  */
-@Documentation.Section(value = Documentation.Sections.COMMON_STATE_BACKENDS, position = 2)

Review Comment:
   Do we need to remove the common state backends section? Or could we rename it to common_checkpointing or something similar?






Re: [PR] [FLINK-35522][runtime] Fix the issue that the source task may get stuck in speculative execution mode. [flink]

2024-06-04 Thread via GitHub


flinkbot commented on PR #24887:
URL: https://github.com/apache/flink/pull/24887#issuecomment-2148750895

   
   ## CI report:
   
   * 40cb30f959a9a400f76b7e8b535d54d9594b4e5b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Resolved] (FLINK-35325) Paimon connector miss the position of AddColumnEvent

2024-06-04 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu resolved FLINK-35325.

Resolution: Implemented

via flink-cdc master: 02141660490855b033583e711152059c7652410f

> Paimon connector miss the position of AddColumnEvent
> 
>
> Key: FLINK-35325
> URL: https://issues.apache.org/jira/browse/FLINK-35325
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: cdc-3.1.1
>Reporter: LvYanquan
>Assignee: tianzhu.wen
>Priority: Minor
>  Labels: pull-request-available
> Fix For: cdc-3.1.1
>
>
> Currently, new columns are always added in the last position; however, some 
> newly added columns have a specific before/after relationship with another 
> column.
> Source code:
> [https://github.com/apache/flink-cdc/blob/fa6e7ea51258dcd90f06036196618224156df367/flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-paimon/src/main/java/org/apache/flink/cdc/connectors/paimon/sink/PaimonMetadataApplier.java#L137]





[jira] [Assigned] (FLINK-35325) Paimon connector miss the position of AddColumnEvent

2024-06-04 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu reassigned FLINK-35325:
--

Assignee: tianzhu.wen

> Paimon connector miss the position of AddColumnEvent
> 
>
> Key: FLINK-35325
> URL: https://issues.apache.org/jira/browse/FLINK-35325
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: cdc-3.1.1
>Reporter: LvYanquan
>Assignee: tianzhu.wen
>Priority: Minor
>  Labels: pull-request-available
> Fix For: cdc-3.1.1
>
>
> Currently, new columns are always added in the last position; however, some 
> newly added columns have a specific before/after relationship with another 
> column.
> Source code:
> [https://github.com/apache/flink-cdc/blob/fa6e7ea51258dcd90f06036196618224156df367/flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-paimon/src/main/java/org/apache/flink/cdc/connectors/paimon/sink/PaimonMetadataApplier.java#L137]





Re: [PR] [FLINK-25537] [JUnit5 Migration] Module: flink-core with,Package: core [flink]

2024-06-04 Thread via GitHub


Jiabao-Sun commented on code in PR #24881:
URL: https://github.com/apache/flink/pull/24881#discussion_r1626824660


##
flink-core/src/test/java/org/apache/flink/core/fs/AutoCloseableRegistryTest.java:
##
@@ -66,9 +64,9 @@ public void testSuppressedExceptions() throws Exception {
 
 fail("Close should throw exception");
 } catch (Exception ex) {
-assertEquals("1", ex.getMessage());
-assertEquals("2", ex.getSuppressed()[0].getMessage());
-assertEquals("java.lang.AssertionError: 3", ex.getSuppressed()[1].getMessage());
+assertThat(ex.getMessage()).isEqualTo("1");
+assertThat(ex.getSuppressed()[0].getMessage()).isEqualTo("2");
+assertThat(ex.getSuppressed()[1].getMessage()).isEqualTo("java.lang.AssertionError: 3");

Review Comment:
   ```java
   assertThatThrownBy(autoCloseableRegistry::close)
           .hasMessage("1")
           .satisfies(
                   e -> assertThat(e.getSuppressed()[0]).hasMessage("2"),
                   e -> assertThat(e.getSuppressed()[1]).hasMessage("java.lang.AssertionError: 3"));
   ```



##
flink-core/src/test/java/org/apache/flink/core/fs/local/LocalFileSystemTest.java:
##
@@ -156,37 +151,38 @@ public void testLocalFilesystem() throws Exception {
 
 testbytestest = new byte[5];
 final FSDataInputStream lfsinput2 = lfs.open(pathtotestfile2);
-assertEquals(lfsinput2.read(testbytestest), 5);
+assertThat(lfsinput2.read(testbytestest)).isEqualTo(5);
 lfsinput2.close();
-assertTrue(Arrays.equals(testbytes, testbytestest));
+assertThat(testbytestest).containsExactly(testbytes);
 
 // does lfs see two files?
-assertEquals(lfs.listStatus(pathtotmpdir).length, 2);
+assertThat(lfs.listStatus(pathtotmpdir)).hasSize(2);
 
 // do we get exactly one blocklocation per file? no matter what start and len we provide
-assertEquals(lfs.getFileBlockLocations(lfs.getFileStatus(pathtotestfile1), 0, 0).length, 1);
+assertThat(lfs.getFileBlockLocations(lfs.getFileStatus(pathtotestfile1), 0, 0).length)
+.isOne();
 
 /*
  * can lfs delete files / directories?
  */
-assertTrue(lfs.delete(pathtotestfile1, false));
+assertThat(lfs.delete(pathtotestfile1, false)).isTrue();
 
 // and can lfs also delete directories recursively?
-assertTrue(lfs.delete(pathtotmpdir, true));
+assertThat(lfs.delete(pathtotmpdir, true)).isTrue();
 
-assertTrue(!tempdir.exists());
+assertThat(tempdir.exists()).isFalse();

Review Comment:
   ```suggestion
   assertThat(tempdir).doesNotExist();
   ```



##
flink-filesystems/flink-azure-fs-hadoop/src/test/java/org/apache/flink/fs/azurefs/AzureBlobRecoverableWriterTest.java:
##
@@ -46,15 +43,11 @@ public class AzureBlobRecoverableWriterTest extends AbstractRecoverableWriterTes
 private static final String ACCESS_KEY = System.getenv("ARTIFACTS_AZURE_ACCESS_KEY");
 private static final String TEST_DATA_DIR = "tests-" + UUID.randomUUID();
 
-@BeforeClass
+@BeforeAll
 public static void checkCredentialsAndSetup() throws IOException {

Review Comment:
   ```suggestion
   static void checkCredentialsAndSetup() throws IOException {
   ```



##
flink-core/src/test/java/org/apache/flink/core/memory/ByteArrayOutputStreamWithPosTest.java:
##
@@ -20,90 +20,91 @@
 
 import org.apache.flink.configuration.ConfigConstants;
 
-import org.junit.Assert;
-import org.junit.Before;
-import org.junit.Rule;
-import org.junit.Test;
-import org.junit.rules.ExpectedException;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
 
 import java.io.IOException;
 import java.util.Arrays;
 
+import static org.assertj.core.api.Assertions.assertThat;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
 /** Tests for {@link ByteArrayOutputStreamWithPos}. */
-public class ByteArrayOutputStreamWithPosTest {
+class ByteArrayOutputStreamWithPosTest {
 
 private static final int BUFFER_SIZE = 32;
 
-@Rule public ExpectedException thrown = ExpectedException.none();
-
 private ByteArrayOutputStreamWithPos stream;
 
-@Before
-public void setup() {
+@BeforeEach
+void setup() {
 stream = new ByteArrayOutputStreamWithPos(BUFFER_SIZE);
 }
 
 /** Test setting position which is exactly the same with the buffer size. */
 @Test
-public void testSetPositionWhenBufferIsFull() throws Exception {
+void testSetPositionWhenBufferIsFull() throws Exception {
 stream.write(new byte[BUFFER_SIZE]);
 
 // check whether the buffer is filled fully
-Assert.assertEquals(BUFFER_SIZE, stream.getBuf().length);
+assertThat(stream.getBuf()).hasSize(BUFFER_SIZE);
 
 // check current position 

Re: [PR] [FLINK-35325][cdc-connector][paimon]Support for specifying column order. [flink-cdc]

2024-06-04 Thread via GitHub


leonardBang merged PR #3323:
URL: https://github.com/apache/flink-cdc/pull/3323





[jira] [Commented] (FLINK-35200) Support the execution of suspend, resume materialized table in full refresh mode

2024-06-04 Thread dalongliu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852245#comment-17852245
 ] 

dalongliu commented on FLINK-35200:
---

Merged in master: 8d1e043b0c4277582b8862c2bc3314631eec4a7b

> Support the execution of suspend, resume materialized table in full refresh 
> mode
> 
>
> Key: FLINK-35200
> URL: https://issues.apache.org/jira/browse/FLINK-35200
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Gateway
>Affects Versions: 1.20.0
>Reporter: dalongliu
>Assignee: Feng Jin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>






[jira] [Resolved] (FLINK-35200) Support the execution of suspend, resume materialized table in full refresh mode

2024-06-04 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu resolved FLINK-35200.
---
Resolution: Fixed

> Support the execution of suspend, resume materialized table in full refresh 
> mode
> 
>
> Key: FLINK-35200
> URL: https://issues.apache.org/jira/browse/FLINK-35200
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Gateway
>Affects Versions: 1.20.0
>Reporter: dalongliu
>Assignee: Feng Jin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>






[jira] [Assigned] (FLINK-35201) Support the execution of drop materialized table in full refresh mode

2024-06-04 Thread dalongliu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dalongliu reassigned FLINK-35201:
-

Assignee: Feng Jin

> Support the execution of drop materialized table in full refresh mode
> -
>
> Key: FLINK-35201
> URL: https://issues.apache.org/jira/browse/FLINK-35201
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Gateway
>Affects Versions: 1.20.0
>Reporter: dalongliu
>Assignee: Feng Jin
>Priority: Major
> Fix For: 1.20.0
>
>
> In full refresh mode, support drop materialized table and the background 
> refresh workflow.
> {code:SQL}
> DROP MATERIALIZED TABLE [ IF EXISTS ] [catalog_name.][db_name.]table_name
> {code}





Re: [PR] [FLINK-35200][table] Support the execution of suspend, resume materialized table in full refresh mode [flink]

2024-06-04 Thread via GitHub


lsyldliu closed pull request #24877:  [FLINK-35200][table] Support the 
execution of suspend, resume materialized table in full refresh mode
URL: https://github.com/apache/flink/pull/24877





Re: [PR] [BP-3.1.1][FLINK-35149][cdc-composer] Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink (#3233) [flink-cdc]

2024-06-04 Thread via GitHub


loserwang1024 commented on PR #3387:
URL: https://github.com/apache/flink-cdc/pull/3387#issuecomment-2148735709

   @PatrickRen , CC





[PR] [BP-3.1.1][FLINK-35149][cdc-composer] Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink (#3233) [flink-cdc]

2024-06-04 Thread via GitHub


loserwang1024 opened a new pull request, #3387:
URL: https://github.com/apache/flink-cdc/pull/3387

   BP https://issues.apache.org/jira/browse/FLINK-35149 to 3.1.1





Re: [PR] [BP-3.1.1][FLINK-35149][cdc-composer] Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink (#3233) [flink-cdc]

2024-06-04 Thread via GitHub


loserwang1024 closed pull request #3386: [BP-3.1.1][FLINK-35149][cdc-composer] 
Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not 
TwoPhaseCommittingSink (#3233)
URL: https://github.com/apache/flink-cdc/pull/3386





[PR] [BP-3.1.1][FLINK-35149][cdc-composer] Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink (#3233) [flink-cdc]

2024-06-04 Thread via GitHub


loserwang1024 opened a new pull request, #3386:
URL: https://github.com/apache/flink-cdc/pull/3386

   BP https://issues.apache.org/jira/browse/FLINK-35149 to 3.1.1





Re: [PR] [FLINK-31664] Implement ARRAY_INTERSECT function [flink]

2024-06-04 Thread via GitHub


liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2148729746

   @snuyanzin fixed per your review, thanks very much





[PR] [FLINK-35522][runtime] Fix the issue that the source task may get stuck in speculative execution mode. [flink]

2024-06-04 Thread via GitHub


SinBex opened a new pull request, #24887:
URL: https://github.com/apache/flink/pull/24887

   ## What is the purpose of the change
   
   If the source task does not get assigned a split because the SplitEnumerator 
has no more splits, and a failover occurs during the closing process, the 
SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
started source task, causing the source vertex to remain stuck indefinitely. 
   This case may only occur in batch jobs where speculative execution has been 
enabled.
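
   To illustrate the invariant being fixed, here is a minimal enumerator-level sketch (the class is hypothetical; the actual fix lives in SourceCoordinatorContext, not in user enumerator code):

   ```java
   import java.util.ArrayDeque;
   import java.util.Deque;

   import org.apache.flink.api.connector.source.SourceSplit;
   import org.apache.flink.api.connector.source.SplitEnumeratorContext;

   /**
    * Sketch: a reader that (re)registers after a failover must be told again that
    * no splits remain, otherwise the restarted attempt waits for splits forever.
    */
   class NoMoreSplitsOnReRegistration<SplitT extends SourceSplit> {
       private final SplitEnumeratorContext<SplitT> context;
       private final Deque<SplitT> remainingSplits = new ArrayDeque<>();

       NoMoreSplitsOnReRegistration(SplitEnumeratorContext<SplitT> context) {
           this.context = context;
       }

       /** Called when a (possibly restarted) reader registers with the enumerator. */
       void addReader(int subtaskId) {
           if (remainingSplits.isEmpty()) {
               // Re-send the NoMoreSplits signal so the new attempt can finish.
               context.signalNoMoreSplits(subtaskId);
           }
       }
   }
   ```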
   
   
   ## Brief change log
   
 - fix the issue that the source task may get stuck in speculative 
execution mode.
   
   
   ## Verifying this change
   This change added tests and can be verified as follows:
   
 - *Added an IT case to verify the issue.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): ( no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: ( no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): ( no )
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no )
 - The S3 file system connector: (no )
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
   





[jira] [Updated] (FLINK-35522) The source task may get stuck after a failover occurs in batch jobs

2024-06-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35522:
---
Labels: pull-request-available  (was: )

> The source task may get stuck after a failover occurs in batch jobs
> ---
>
> Key: FLINK-35522
> URL: https://issues.apache.org/jira/browse/FLINK-35522
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: xingbe
>Assignee: xingbe
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> If the source task does not get assigned a split because the SplitEnumerator 
> has no more splits, and a failover occurs during the closing process, the 
> SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
> started source task, causing the source vertex to remain stuck indefinitely. 
> This case may only occur in batch jobs where speculative execution has been 
> enabled.





Re: [PR] [FLINK-31664] Implement ARRAY_INTERSECT function [flink]

2024-06-04 Thread via GitHub


liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1626831821


##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/CollectionFunctionsITCase.java:
##
@@ -1723,6 +1724,83 @@ private Stream arrayExceptTestCases() {
 + "ARRAY_EXCEPT(, )"));
 }
 
+private Stream arrayIntersectTestCases() {
+return Stream.of(
+TestSetSpec.forFunction(BuiltInFunctionDefinitions.ARRAY_INTERSECT)
+.onFieldsWithData(
+new Integer[] {1, 1, 2},
+null,
+new Row[] {Row.of(true, 1), Row.of(true, 2), null},
+new Integer[] {null, null, 1},
+new Map[] {
+CollectionUtil.map(entry(1, "a"), entry(2, "b")),
+CollectionUtil.map(entry(3, "c"), entry(4, "d"))
+},
+new Integer[][] {new Integer[] {1, 2, 3}})
+.andDataTypes(
+DataTypes.ARRAY(DataTypes.INT()),
+DataTypes.ARRAY(DataTypes.INT()),
+DataTypes.ARRAY(
+DataTypes.ROW(DataTypes.BOOLEAN(), DataTypes.INT())),
+DataTypes.ARRAY(DataTypes.INT()),
+DataTypes.ARRAY(DataTypes.MAP(DataTypes.INT(), DataTypes.STRING())),
+DataTypes.ARRAY(DataTypes.ARRAY(DataTypes.INT(
+// ARRAY
+.testResult(
+$("f0").arrayIntersect(new Integer[] {1, null, 4}),
+"ARRAY_INTERSECT(f0, ARRAY[1, NULL, 4])",
+new Integer[] {1, 1},
+DataTypes.ARRAY(DataTypes.INT()))
+.testResult(
+$("f0").arrayIntersect(new Integer[] {3, 4}),
+"ARRAY_INTERSECT(f0, ARRAY[3, 4])",
+new Integer[] {},
+DataTypes.ARRAY(DataTypes.INT()))
+.testResult(
+$("f1").arrayIntersect(new Integer[] {1, null, 4}),
+"ARRAY_INTERSECT(f1, ARRAY[1, NULL, 4])",
+null,
+DataTypes.ARRAY(DataTypes.INT()))
+// ARRAY>
+.testResult(
+$("f2").arrayIntersect(
+new Row[] {
+null, Row.of(true, 2),
+}),
+"ARRAY_INTERSECT(f2, ARRAY[NULL, ROW(TRUE, 2)])",
+new Row[] {Row.of(true, 2), null},
+DataTypes.ARRAY(
+DataTypes.ROW(DataTypes.BOOLEAN(), DataTypes.INT((
+// arrayOne contains null elements
+.testResult(
+$("f3").arrayIntersect(new Integer[] {null, 42}),
+"ARRAY_INTERSECT(f3, ARRAY[null, 42])",
+new Integer[] {null, null},

Review Comment:
   fixed @snuyanzin 






Re: [PR] [FLINK-31664] Implement ARRAY_INTERSECT function [flink]

2024-06-04 Thread via GitHub


liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1626831685


##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/CollectionFunctionsITCase.java:
##
@@ -1723,6 +1724,83 @@ private Stream arrayExceptTestCases() {
 + "ARRAY_EXCEPT(, )"));
 }
 
+private Stream arrayIntersectTestCases() {
+return Stream.of(
+TestSetSpec.forFunction(BuiltInFunctionDefinitions.ARRAY_INTERSECT)
+.onFieldsWithData(
+new Integer[] {1, 1, 2},
+null,
+new Row[] {Row.of(true, 1), Row.of(true, 2), null},
+new Integer[] {null, null, 1},
+new Map[] {
+CollectionUtil.map(entry(1, "a"), entry(2, "b")),
+CollectionUtil.map(entry(3, "c"), entry(4, "d"))
+},
+new Integer[][] {new Integer[] {1, 2, 3}})
+.andDataTypes(
+DataTypes.ARRAY(DataTypes.INT()),
+DataTypes.ARRAY(DataTypes.INT()),
+DataTypes.ARRAY(
+DataTypes.ROW(DataTypes.BOOLEAN(), DataTypes.INT())),
+DataTypes.ARRAY(DataTypes.INT()),
+DataTypes.ARRAY(DataTypes.MAP(DataTypes.INT(), DataTypes.STRING())),
+DataTypes.ARRAY(DataTypes.ARRAY(DataTypes.INT(
+// ARRAY
+.testResult(
+$("f0").arrayIntersect(new Integer[] {1, null, 4}),
+"ARRAY_INTERSECT(f0, ARRAY[1, NULL, 4])",
+new Integer[] {1, 1},

Review Comment:
   @snuyanzin  fixed






[jira] [Commented] (FLINK-35512) ArtifactFetchManagerTest unit tests fail

2024-06-04 Thread Rob Young (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852243#comment-17852243
 ] 

Rob Young commented on FLINK-35512:
---

The additional artifact looks arbitrary from the ArtifactFetchManager's point of view, so instead of depending on the output of the build, maybe we should use a file controlled by the test, like:
{code:java}
-    private File getFlinkClientsJar() throws IOException {
-        return TestingUtils.getFileFromTargetDir(
-                ArtifactFetchManager.class,
-                p ->
-                        org.apache.flink.util.FileUtils.isJarFile(p)
-                                && p.toFile().getName().startsWith("flink-clients")
-                                && !p.toFile().getName().contains("test-utils"));
+    private File createArbitraryArtifact() throws IOException {
+        Path tempFile = Files.createTempFile(tempDir, "arbitrary", ".jar");
+        Files.write(tempFile, UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8));
+        return tempFile.toFile();
     } {code}
Usages of `File sourceFile = TestingUtils.getClassFile(getClass());` could also be replaced with this, so all test inputs are generated by the test.

 

> ArtifactFetchManagerTest unit tests fail
> 
>
> Key: FLINK-35512
> URL: https://issues.apache.org/jira/browse/FLINK-35512
> Project: Flink
>  Issue Type: Technical Debt
>Affects Versions: 1.19.1
>Reporter: Hong Liang Teoh
>Priority: Major
> Fix For: 1.19.1
>
>
> The below three tests from *ArtifactFetchManagerTest* seem to fail 
> consistently:
>  * ArtifactFetchManagerTest.testFileSystemFetchWithAdditionalUri
>  * ArtifactFetchManagerTest.testMixedArtifactFetch
>  * ArtifactFetchManagerTest.testHttpFetch
> The error printed is
> {code:java}
> java.lang.AssertionError:
> Expecting actual not to be empty
>     at 
> org.apache.flink.client.program.artifact.ArtifactFetchManagerTest.getFlinkClientsJar(ArtifactFetchManagerTest.java:248)
>     at 
> org.apache.flink.client.program.artifact.ArtifactFetchManagerTest.testMixedArtifactFetch(ArtifactFetchManagerTest.java:146)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
>     at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>     at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>     at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
>     at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) 
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35522) The source task may get stuck after a failover occurs in batch jobs

2024-06-04 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852241#comment-17852241
 ] 

Zhu Zhu commented on FLINK-35522:
-

Thanks for reporting this problem and volunteering to fix it! [~xiasun]
The ticket is assigned to you.

> The source task may get stuck after a failover occurs in batch jobs
> ---
>
> Key: FLINK-35522
> URL: https://issues.apache.org/jira/browse/FLINK-35522
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: xingbe
>Assignee: xingbe
>Priority: Major
> Fix For: 1.20.0
>
>
> If the source task does not get assigned a split because the SplitEnumerator 
> has no more splits, and a failover occurs during the closing process, the 
> SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
> started source task, causing the source vertex to remain stuck indefinitely. 
> This case may only occur in batch jobs where speculative execution has been 
> enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-35522) The source task may get stuck after a failover occurs in batch jobs

2024-06-04 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-35522:
---

Assignee: xingbe

> The source task may get stuck after a failover occurs in batch jobs
> ---
>
> Key: FLINK-35522
> URL: https://issues.apache.org/jira/browse/FLINK-35522
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: xingbe
>Assignee: xingbe
>Priority: Major
> Fix For: 1.20.0
>
>
> If the source task does not get assigned a split because the SplitEnumerator 
> has no more splits, and a failover occurs during the closing process, the 
> SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
> started source task, causing the source vertex to remain stuck indefinitely. 
> This case may only occur in batch jobs where speculative execution has been 
> enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35354] Support host mapping in Flink tikv cdc [flink-cdc]

2024-06-04 Thread via GitHub


Mrart commented on PR #3336:
URL: https://github.com/apache/flink-cdc/pull/3336#issuecomment-2148692911

   @leonardBang Could you help review it again?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-35522) The source task may get stuck after a failover occurs in batch jobs

2024-06-04 Thread xingbe (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852240#comment-17852240
 ] 

xingbe commented on FLINK-35522:


I have a solution to fix it. Could you please assign this ticket to me? 
[~zhuzh] 

> The source task may get stuck after a failover occurs in batch jobs
> ---
>
> Key: FLINK-35522
> URL: https://issues.apache.org/jira/browse/FLINK-35522
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: xingbe
>Priority: Major
> Fix For: 1.20.0
>
>
> If the source task does not get assigned a split because the SplitEnumerator 
> has no more splits, and a failover occurs during the closing process, the 
> SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
> started source task, causing the source vertex to remain stuck indefinitely. 
> This case may only occur in batch jobs where speculative execution has been 
> enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35522) The source task may get stuck after a failover occurs in batch jobs

2024-06-04 Thread xingbe (Jira)
xingbe created FLINK-35522:
--

 Summary: The source task may get stuck after a failover occurs in 
batch jobs
 Key: FLINK-35522
 URL: https://issues.apache.org/jira/browse/FLINK-35522
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.18.1, 1.19.0, 1.17.2, 1.20.0
Reporter: xingbe
 Fix For: 1.20.0


If the source task does not get assigned a split because the SplitEnumerator 
has no more splits, and a failover occurs during the closing process, the 
SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
started source task, causing the source vertex to remain stuck indefinitely. 
This case may only occur in batch jobs where speculative execution has been 
enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35504] Improve Elasticsearch 8 connector observability [flink-connector-elasticsearch]

2024-06-04 Thread via GitHub


liuml07 commented on PR #106:
URL: 
https://github.com/apache/flink-connector-elasticsearch/pull/106#issuecomment-2148526213

   @reswqa Could you help review and merge? Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34172] Add support for altering a distribution via ALTER TABLE [flink]

2024-06-04 Thread via GitHub


flinkbot commented on PR #24886:
URL: https://github.com/apache/flink/pull/24886#issuecomment-2148475342

   
   ## CI report:
   
   * 4081f6d00def12b386970d94681364dd99d089a6 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34172) Add support for altering a distribution via ALTER TABLE

2024-06-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-34172:
---
Labels: pull-request-available  (was: )

> Add support for altering a distribution via ALTER TABLE 
> 
>
> Key: FLINK-34172
> URL: https://issues.apache.org/jira/browse/FLINK-34172
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Jim Hughes
>Assignee: Jim Hughes
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-34172] Add support for altering a distribution via ALTER TABLE [flink]

2024-06-04 Thread via GitHub


jnh5y opened a new pull request, #24886:
URL: https://github.com/apache/flink/pull/24886

   ## What is the purpose of the change
   
   This PR implements the SQL parser changes for ALTER TABLE to support ADD, 
MODIFY, and DROP DISTRIBUTION statements.
   
   ## Brief change log
   The SQL Parser has been updated.
   The `AlterSchemaConverter` has been updated to pass the changes in a 
DISTRIBUTION through to the `Operation`.
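   
   For illustration, the statement shapes this enables look roughly like the 
following (a hypothetical sketch; the exact grammar is the one added by this 
PR's parser changes, and the names and bucket counts are placeholders):
   
       ALTER TABLE t ADD DISTRIBUTION BY HASH(uid) INTO 4 BUCKETS;
       ALTER TABLE t MODIFY DISTRIBUTION BY HASH(uid) INTO 8 BUCKETS;
       ALTER TABLE t DROP DISTRIBUTION;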
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-35502) compress the checkpoint metadata generated by ZK/ETCD HA Services

2024-06-04 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852132#comment-17852132
 ] 

Mingliang Liu commented on FLINK-35502:
---

In my day job I don't see user requests that need to recover from an arbitrary 
point in the past days; recovering from one of the recent checkpoints works 
just fine. And compressing is a good improvement as the data gets large.
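
For reference, the read path counterpart to the sample in the ticket would be 
symmetric (a sketch, not from the ticket: Base64-decode, then GZIP-inflate):
{code:java}
private static String decodeAndDecompress(String base64) throws IOException {
    byte[] compressed = Base64.getDecoder().decode(base64);
    try (GZIPInputStream gzipIn = new GZIPInputStream(new ByteArrayInputStream(compressed));
            ByteArrayOutputStream out = new ByteArrayOutputStream()) {
        byte[] buffer = new byte[4096];
        int n;
        while ((n = gzipIn.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
{code}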

> compress the checkpoint metadata generated by ZK/ETCD HA Services
> -
>
> Key: FLINK-35502
> URL: https://issues.apache.org/jira/browse/FLINK-35502
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ying Z
>Priority: Major
>
> In the implementation of Flink HA, the metadata of checkpoints is stored in 
> either Zookeeper (ZK HA) or ETCD (K8S HA), such as:
> {code:java}
> checkpointID-0036044: 
> checkpointID-0036045: 
> ...
> ... {code}
> However, neither of these are designed to store excessive amounts of data. If 
> the 
> [state.checkpoints.num-retained]([https://nightlies.apache.org/flink/flink-docs-release-1.19/zh/docs/deployment/config/#state-checkpoints-num-retained])
>  setting is set too large, it can easily cause abnormalities in ZK/ETCD. 
> The error log when setting state.checkpoints.num-retained to 1500:
> {code:java}
> Caused by: org.apache.flink.util.SerializedThrowable: 
> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: 
> PUT at: 
> https://xxx/api/v1/namespaces/default/configmaps/xxx-jobmanager-leader. 
> Message: ConfigMap "xxx-jobmanager-leader" is invalid: []:
> Too long: must have at most 1048576 bytes. Received status: 
> Status(apiVersion=v1, code=422, 
> details=StatusDetails(causes=[StatusCause(field=[], message=Too long: must 
> have at most 1048576 bytes, reason=FieldValueTooLong, 
> additionalProperties={})], group=null, kind=ConfigMap, 
> name=xxx-jobmanager-leader, retryAfterSeconds=null, uid=null, 
> additionalProperties={}), kind=Status, message=ConfigMap 
> "xxx-jobmanager-leader" is invalid: []: Too long: must have at most 1048576 
> bytes, metadata=ListMeta(_continue=null, remainingItemCount=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), 
> reason=Invalid, status=Failure, additionalProperties={}). {code}
> In Flink's code, all checkpoint metadata are updated at the same time, and 
> the checkpoint metadata contains many repeated bytes; therefore it can 
> achieve a very good compression ratio.
> Therefore, I suggest compressing the data when writing checkpoints and 
> decompressing it when reading, to reduce storage pressure and improve IO 
> efficiency.
> Here is the sample code, which reduces the metadata size from 1M bytes to 30K.
> {code:java}
> // Map -> Json
> ObjectMapper objectMapper = new ObjectMapper();
> String checkpointJson = objectMapper.writeValueAsString(checkpointMap);
> // compress and Base64-encode
> String compressedBase64 = compressAndEncode(checkpointJson); 
> compressedData.put("checkpoint-all", compressedBase64);{code}
> {code:java}
>     private static String compressAndEncode(String data) throws IOException {
>         ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
>         try (GZIPOutputStream gzipOutputStream = new GZIPOutputStream(outputStream)) {
>             gzipOutputStream.write(data.getBytes(StandardCharsets.UTF_8));
>         }
>         byte[] compressedData = outputStream.toByteArray();
>         return Base64.getEncoder().encodeToString(compressedData);
>     } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24605) org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.NoOffsetForPartitionException: Undefined offset with no reset policy for partitions

2024-06-04 Thread EMERSON WANG (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852128#comment-17852128
 ] 

EMERSON WANG commented on FLINK-24605:
--

We got the same exception when scan.startup.mode was set to 'group-offsets' and 
properties.auto.offset.reset was set to 'latest'.

We had to work around it as follows:
To start the Flink job for the first time, we set scan.startup.mode to 
'latest', let the job run for a few minutes, then stopped the job, reset 
scan.startup.mode to 'group-offsets', and finally restarted the job.

We'd appreciate it very much if you could resolve this ticket as soon as 
possible.
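
For illustration, the workaround amounts to flipping a single connector option 
between runs (a sketch; all other options stay as in the table definition):
{code:sql}
-- First run: bootstrap committed offsets for the new group id
'scan.startup.mode' = 'latest'

-- Subsequent runs: resume from the offsets committed by the first run
'scan.startup.mode' = 'group-offsets'
{code}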

> org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.NoOffsetForPartitionException:
>  Undefined offset with no reset policy for partitions
> ---
>
> Key: FLINK-24605
> URL: https://issues.apache.org/jira/browse/FLINK-24605
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.0
>Reporter: Abhijit Talukdar
>Priority: Major
>
> Getting the below issue when using 'scan.startup.mode' = 'group-offsets'.
>  
> WITH (
>  'connector' = 'kafka',
>  'topic' = 'ss7gsm-signaling-event',
>  'properties.bootstrap.servers' = '**:9093',
>  'properties.group.id' = 'ss7gsm-signaling-event-T5',
>  'value.format' = 'avro-confluent',
>  'value.avro-confluent.schema-registry.url' = 'https://***:9099',
>  {color:#ff8b00}'scan.startup.mode' = 'group-offsets',{color}
> {color:#ff8b00} 'properties.auto.offset.reset' = 'earliest',{color}
>  'properties.security.protocol'= 'SASL_SSL',
>  'properties.ssl.truststore.location'= '/*/*/ca-certs.jks',
>  'properties.ssl.truststore.password'= '*',
>  'properties.sasl.kerberos.service.name'= 'kafka'
> )
>  
> 'ss7gsm-signaling-event-T5' is a new group id. If the group id is present in 
> ZK then it works; otherwise we get the below exception. The 
> 'properties.auto.offset.reset' property is ignored.
>  
> 2021-10-20 22:18:28,267 INFO  
> org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.ConsumerConfig
>  [] - ConsumerConfig values: 2021-10-20 22:18:28,267 INFO  
> org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.ConsumerConfig
>  [] - ConsumerConfig values: 
> allow.auto.create.topics = false
> auto.commit.interval.ms = 5000
> {color:#FF} +*auto.offset.reset = none*+{color}
> bootstrap.servers = [.xxx.com:9093]
>  
>  
> Exception:
>  
> 2021-10-20 22:18:28,620 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Source: 
> KafkaSource-hiveSignaling.signaling_stg.ss7gsm_signaling_event_flink_k -> 
> Sink: Collect table sink (1/1) (89b175333242fab8914271ad7638ba92) switched 
> from INITIALIZING to RUNNING. 2021-10-20 22:18:28,620 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Source: 
> KafkaSource-hiveSignaling.signaling_stg.ss7gsm_signaling_event_flink_k -> 
> Sink: Collect table sink (1/1) (89b175333242fab8914271ad7638ba92) switched 
> from INITIALIZING to RUNNING. 2021-10-20 22:18:28,621 INFO  
> org.apache.flink.connector.kafka.source.enumerator.KafkaSourceEnumerator [] - 
> Assigning splits to readers \{0=[[Partition: ss7gsm-signaling-event-2, 
> StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: 
> ss7gsm-signaling-event-8, StartingOffset: -3, StoppingOffset: 
> -9223372036854775808], [Partition: ss7gsm-signaling-event-7, StartingOffset: 
> -3, StoppingOffset: -9223372036854775808], [Partition: 
> ss7gsm-signaling-event-9, StartingOffset: -3, StoppingOffset: 
> -9223372036854775808], [Partition: ss7gsm-signaling-event-5, StartingOffset: 
> -3, StoppingOffset: -9223372036854775808], [Partition: 
> ss7gsm-signaling-event-6, StartingOffset: -3, StoppingOffset: 
> -9223372036854775808], [Partition: ss7gsm-signaling-event-0, StartingOffset: 
> -3, StoppingOffset: -9223372036854775808], [Partition: 
> ss7gsm-signaling-event-4, StartingOffset: -3, StoppingOffset: 
> -9223372036854775808], [Partition: ss7gsm-signaling-event-1, StartingOffset: 
> -3, StoppingOffset: -9223372036854775808], [Partition: 
> ss7gsm-signaling-event-3, StartingOffset: -3, StoppingOffset: 
> -9223372036854775808]]}2021-10-20 22:18:28,716 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Source: 
> KafkaSource-hiveSignaling.signaling_stg.ss7gsm_signaling_event_flink_k -> 
> Sink: Collect table sink (1/1) (89b175333242fab8914271ad7638ba92) switched 
> from RUNNING to FAILED on xx.xxx.xxx.xxx:42075-d80607 @ xx.xxx.com 
> (dataPort=34120).java.lang.RuntimeException: One or more fetchers have 
> encountered exception at 
> 

[jira] [Updated] (FLINK-35515) Upgrade hive version to 4.0.0

2024-06-04 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser updated FLINK-35515:
---
Issue Type: New Feature  (was: Improvement)

> Upgrade hive version to 4.0.0
> -
>
> Key: FLINK-35515
> URL: https://issues.apache.org/jira/browse/FLINK-35515
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.18.1
>Reporter: vikasap
>Priority: Major
>
> Hive version 4.0.0 was released recently. However, none of the major Flink 
> versions will work with it. Filing this so that the major Flink versions' 
> flink-sql and Table API will be able to work with the new version of the 
> Hive metastore.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35515) Upgrade hive version to 4.0.0

2024-06-04 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser updated FLINK-35515:
---
Fix Version/s: (was: 1.18.2)

> Upgrade hive version to 4.0.0
> -
>
> Key: FLINK-35515
> URL: https://issues.apache.org/jira/browse/FLINK-35515
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Affects Versions: 1.18.1
>Reporter: vikasap
>Priority: Major
>
> Hive version 4.0.0 was released recently. However, none of the major Flink 
> versions will work with it. Filing this so that the major Flink versions' 
> flink-sql and Table API will be able to work with the new version of the 
> Hive metastore.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35521) Flink FileSystem SQL Connector Generating SUCCESS File Multiple Times

2024-06-04 Thread EMERSON WANG (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

EMERSON WANG updated FLINK-35521:
-
Description: 
Our Flink table SQL job received data from Kafka streams and then wrote all 
partitioned data into the associated Parquet files under the same S3 folder 
through the filesystem SQL connector.

For the S3 filesystem SQL connector, sink.partition-commit.policy.kind was set 
to 'success-file' and sink.partition-commit.trigger was set to 
'partition-time'. We found that the _SUCCESS file in the S3 folder was 
generated multiple times as partitions were committed.

Because all partitioned Parquet files and the _SUCCESS file are in the same S3 
folder and the _SUCCESS file is used to trigger the downstream application, we 
would really like the _SUCCESS file to be generated only once, after all 
partitions are committed and all Parquet files are ready to be processed. That 
way, one _SUCCESS file triggers the downstream application exactly once 
instead of multiple times.

We knew we could set sink.partition-commit.trigger to 'process-time' to 
generate the _SUCCESS file only once in the S3 folder; however, 'process-time' 
would not meet our business requirements.

We'd request that the FileSystem SQL connector support the following new use 
case:
Even if sink.partition-commit.trigger is set to 'partition-time', the _SUCCESS 
file should be generated only once, after all partitions are committed and all 
output files are ready to be processed, and should trigger the downstream 
application only once instead of multiple times.

  was:
Our Flink table SQL job received data from Kafka streams and then wrote all 
partitioned data into the associated Parquet files under the same S3 folder 
through the filesystem SQL connector.

For the S3 filesystem SQL connector, sink.partition-commit.policy.kind was set 
to 'success-file' and sink.partition-commit.trigger was set 
to 'partition-time'. We found that the _SUCCESS file in the S3 folder was 
generated multiple times as partitions were committed.

Because all partitioned Parquet files and the _SUCCESS file are in the same S3 
folder and the _SUCCESS file is used to trigger the downstream application, we 
would really like the _SUCCESS file to be generated only once, after all 
partitions are committed and all Parquet files are ready to be processed.
That way, one _SUCCESS file triggers the downstream application exactly once 
instead of multiple times.

We knew we could set sink.partition-commit.trigger to 'process-time' to 
generate the _SUCCESS file only once in the S3 folder; however, 'process-time' 
would not meet our business requirements.

We'd request that the FileSystem SQL connector support the following new use 
case:
Even if sink.partition-commit.trigger is set to 'partition-time', the _SUCCESS 
file should be generated only once, after all partitions are committed and all 
output files are ready to be processed, and should trigger the downstream 
application only once instead of multiple times.


> Flink FileSystem SQL Connector Generating SUCCESS File Multiple Times
> 
>
> Key: FLINK-35521
> URL: https://issues.apache.org/jira/browse/FLINK-35521
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Affects Versions: 1.18.1
> Environment: Our PyFlink SQL jobs are running in an AWS EKS environment.
>Reporter: EMERSON WANG
>Priority: Major
>
> Our Flink table SQL job received data from Kafka streams and then wrote all 
> partitioned data into the associated Parquet files under the same S3 
> folder through the filesystem SQL connector.
> For the S3 filesystem SQL connector, sink.partition-commit.policy.kind was 
> set to 'success-file' and sink.partition-commit.trigger was set to 
> 'partition-time'. We found that the _SUCCESS file in the S3 folder was 
> generated multiple times as partitions were committed.
> Because all partitioned Parquet files and the _SUCCESS file are in the same 
> S3 folder and the _SUCCESS file is used to trigger the downstream 
> application, we would really like the _SUCCESS file to be generated only 
> once, after all partitions are committed and all Parquet files are ready to 
> be processed. That way, one _SUCCESS file triggers the downstream 
> application exactly once instead of multiple times.
> We knew we could set sink.partition-commit.trigger to 'process-time' to 
> generate the _SUCCESS file only once in the S3 folder; however, 
> 'process-time' would not meet our business requirements.
> We'd request that the FileSystem SQL connector support the following new 
> use case:
> Even if sink.partition-commit.trigger is set to 

[jira] [Created] (FLINK-35521) Flink FileSystem SQL Connector Generating SUCCESS File Multiple Times

2024-06-04 Thread EMERSON WANG (Jira)
EMERSON WANG created FLINK-35521:


 Summary: Flink FileSystem SQL Connector Generating SUCCESS File 
Multiple Times
 Key: FLINK-35521
 URL: https://issues.apache.org/jira/browse/FLINK-35521
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / FileSystem
Affects Versions: 1.18.1
 Environment: Our PyFlink SQL jobs are running in an AWS EKS environment.
Reporter: EMERSON WANG


Our Flink table SQL job received data from Kafka streams and then wrote all 
partitioned data into the associated Parquet files under the same S3 folder 
through the filesystem SQL connector.

For the S3 filesystem SQL connector, sink.partition-commit.policy.kind was set 
to 'success-file' and sink.partition-commit.trigger was set 
to 'partition-time'. We found that the _SUCCESS file in the S3 folder was 
generated multiple times as partitions were committed.

Because all partitioned Parquet files and the _SUCCESS file are in the same S3 
folder and the _SUCCESS file is used to trigger the downstream application, we 
would really like the _SUCCESS file to be generated only once, after all 
partitions are committed and all Parquet files are ready to be processed.
That way, one _SUCCESS file triggers the downstream application exactly once 
instead of multiple times.

We knew we could set sink.partition-commit.trigger to 'process-time' to 
generate the _SUCCESS file only once in the S3 folder; however, 'process-time' 
would not meet our business requirements.

We'd request that the FileSystem SQL connector support the following new use 
case:
Even if sink.partition-commit.trigger is set to 'partition-time', the _SUCCESS 
file should be generated only once, after all partitions are committed and all 
output files are ready to be processed, and should trigger the downstream 
application only once instead of multiple times.
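
For context, a minimal sketch of the sink setup described above (the table 
name, columns, and path are placeholders; the option keys are the actual 
filesystem connector options):
{code:sql}
CREATE TABLE s3_sink (
  id BIGINT,
  ts TIMESTAMP(3),
  dt STRING
) PARTITIONED BY (dt) WITH (
  'connector' = 'filesystem',
  'path' = 's3://my-bucket/output/',
  'format' = 'parquet',
  'sink.partition-commit.trigger' = 'partition-time',
  'sink.partition-commit.policy.kind' = 'success-file'
);
{code}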



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31533) CREATE TABLE AS SELECT should support to define partition

2024-06-04 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-31533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852110#comment-17852110
 ] 

Sergio Peña commented on FLINK-31533:
-

Hi [~luoyuxia] [~aitozi], I'd like to contribute to Flink by extending the CTAS 
statement to allow a custom schema definition (columns, partition & primary 
keys, watermarks, etc.). I noticed this task and FLINK-31534 are meant for a 
subset of those changes. Are you still considering working on this at some 
point? I see these comments were made a year ago, but I don't want to step on 
other people's work if there's progress or interest in it. Are you OK with me 
taking this task?

I'm going to write a FLIP proposing the semantics for the schema definition, 
which is similar to MySQL's CTAS.
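
For illustration, the kind of statement this would enable looks roughly like 
the following (a hypothetical sketch; the exact syntax will be defined in the 
FLIP, and all table and column names are placeholders):
{code:sql}
CREATE TABLE target_t (
    PRIMARY KEY (id) NOT ENFORCED,
    WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
) PARTITIONED BY (dt)
WITH ('connector' = 'filesystem', 'path' = '/tmp/target_t', 'format' = 'parquet')
AS SELECT id, ts, dt FROM source_t;
{code}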

> CREATE TABLE AS SELECT should support to define partition
> -
>
> Key: FLINK-31533
> URL: https://issues.apache.org/jira/browse/FLINK-31533
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: luoyuxia
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [hotfix] A couple of hotfixes [flink]

2024-06-04 Thread via GitHub


pnowojski merged PR #24883:
URL: https://github.com/apache/flink/pull/24883


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [hotfix] A couple of hotfixes [flink]

2024-06-04 Thread via GitHub


pnowojski commented on code in PR #24883:
URL: https://github.com/apache/flink/pull/24883#discussion_r1626244851


##
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperationTest.java:
##
@@ -563,7 +563,7 @@ private InflightDataGateOrPartitionRescalingDescriptor gate(
 }
 
 @Test
-void testChannelStateAssignmentTwoGatesPartiallyDownscaling()
+public void testChannelStateAssignmentTwoGatesPartiallyDownscaling()

Review Comment:
   Ahhh, that explains my confusion when I was cherry-picking some things  
between branches. Anyway, next time I will just skip this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35501] Use common IO thread pool for RocksDB data transfer [flink]

2024-06-04 Thread via GitHub


rkhachatryan commented on PR #24882:
URL: https://github.com/apache/flink/pull/24882#issuecomment-2147786034

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [hotfix] A couple of hotfixes [flink]

2024-06-04 Thread via GitHub


JingGe commented on code in PR #24883:
URL: https://github.com/apache/flink/pull/24883#discussion_r1626162425


##
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperationTest.java:
##
@@ -563,7 +563,7 @@ private InflightDataGateOrPartitionRescalingDescriptor gate(
 }
 
 @Test
-void testChannelStateAssignmentTwoGatesPartiallyDownscaling()
+public void testChannelStateAssignmentTwoGatesPartiallyDownscaling()

Review Comment:
   NIT: With JUnit 5, test methods are not required to be public.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34487][ci] Adds Python Wheels nightly GHA workflow [flink]

2024-06-04 Thread via GitHub


XComp commented on code in PR #24426:
URL: https://github.com/apache/flink/pull/24426#discussion_r1626117295


##
.github/workflows/nightly.yml:
##
@@ -94,3 +94,46 @@ jobs:
   s3_bucket: ${{ secrets.IT_CASE_S3_BUCKET }}
   s3_access_key: ${{ secrets.IT_CASE_S3_ACCESS_KEY }}
   s3_secret_key: ${{ secrets.IT_CASE_S3_SECRET_KEY }}
+
+  build_python_wheels:
+name: "Build Python Wheels on ${{ matrix.os_name }}"
+runs-on: ${{ matrix.os }}
+strategy:
+  fail-fast: false
+  matrix:
+include:
+  - os: ubuntu-latest
+os_name: linux
+  - os: macos-latest
+os_name: macos
+steps:
+  - name: "Checkout the repository"
+uses: actions/checkout@v4
+with:
+  fetch-depth: 0
+  persist-credentials: false
+  - name: "Stringify workflow name"
+uses: "./.github/actions/stringify"
+id: stringify_workflow
+with:
+  value: ${{ github.workflow }}
+  - name: "Build python wheels for ${{ matrix.os_name }}"
+uses: pypa/cibuildwheel@v2.16.5

Review Comment:
   Looks fine from my side. But I am not familiar with the whole cibuildwheel 
logic. @HuangXingBo, can you do another pass over it and approve the changes 
once more?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] (FLINK-35282) PyFlink Support for Apache Beam > 2.49

2024-06-04 Thread Hong Liang Teoh (Jira)


[ https://issues.apache.org/jira/browse/FLINK-35282 ]


Hong Liang Teoh deleted comment on FLINK-35282:
-

was (Author: hong):
Yes. Reopened JIRA

> PyFlink Support for Apache Beam > 2.49
> --
>
> Key: FLINK-35282
> URL: https://issues.apache.org/jira/browse/FLINK-35282
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.19.0, 1.18.1
>Reporter: APA
>Assignee: Antonio Vespoli
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> From what I see, PyFlink still has the requirement of Apache Beam >= 2.43.0 
> and <= 2.49.0, which subsequently results in a requirement of PyArrow <= 
> 12.0.0. That keeps us exposed to 
> [https://nvd.nist.gov/vuln/detail/CVE-2023-47248]
> I'm not familiar enough with the PyFlink code base to understand why 
> Apache Beam's upper dependency limit can't be lifted. Of all the existing 
> issues I haven't seen one addressing this, so I created one now. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35282) PyFlink Support for Apache Beam > 2.49

2024-06-04 Thread Hong Liang Teoh (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852074#comment-17852074
 ] 

Hong Liang Teoh commented on FLINK-35282:
-

Yes. Reopened JIRA

> PyFlink Support for Apache Beam > 2.49
> --
>
> Key: FLINK-35282
> URL: https://issues.apache.org/jira/browse/FLINK-35282
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.19.0, 1.18.1
>Reporter: APA
>Assignee: Antonio Vespoli
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> From what I see, PyFlink still has the requirement of Apache Beam >= 2.43.0 
> and <= 2.49.0, which subsequently results in a requirement of PyArrow <= 
> 12.0.0. That keeps us exposed to 
> [https://nvd.nist.gov/vuln/detail/CVE-2023-47248]
> I'm not familiar enough with the PyFlink code base to understand why 
> Apache Beam's upper dependency limit can't be lifted. Of all the existing 
> issues I haven't seen one addressing this, so I created one now. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-20398][e2e] Migrate test_batch_sql.sh to Java e2e tests framework [flink]

2024-06-04 Thread via GitHub


XComp commented on code in PR #24471:
URL: https://github.com/apache/flink/pull/24471#discussion_r1626077906


##
flink-end-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java:
##
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.sql.tests;
+
+import org.apache.flink.connector.testframe.source.FromElementsSource;
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.data.TimestampData;
+import org.apache.flink.types.Row;
+
+import java.time.Instant;
+import java.time.LocalDateTime;
+import java.time.ZoneOffset;
+import java.util.ArrayList;
+import java.util.List;
+
+class Generator implements FromElementsSource.ElementsSupplier {
+    private static final long serialVersionUID = -8455653458083514261L;
+    private final List elements;
+
+    static Generator create(
+            int numKeys, float rowsPerKeyAndSecond, int durationSeconds, int offsetSeconds) {
+        final int stepMs = (int) (1000 / rowsPerKeyAndSecond);
+        final long durationMs = durationSeconds * 1000L;
+        final long offsetMs = offsetSeconds * 2000L;
+        final List elements = new ArrayList<>();
+        int keyIndex = 0;
+        long ms = 0;
+        while (ms < durationMs) {
+            elements.add(createRow(keyIndex++, ms, offsetMs));

Review Comment:
   Jing actually has a good point on the memory consumption. I missed that one. 
 We should continue generating the records on-the-fly to be closer to what the 
original test did.
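   
   For illustration, on-the-fly generation could compute each row from its 
index instead of buffering the whole list, roughly like this (a sketch; the 
key rotation and timestamp progression are assumptions, not the PR's actual 
code):
   
       // Hypothetical helper: derive the i-th row lazily instead of pre-building a List.
       static RowData rowAt(int index, int numKeys, int stepMs, long offsetMs) {
           int keyIndex = index % numKeys;              // assumed key rotation scheme
           long ms = (long) (index / numKeys) * stepMs; // assumed timestamp progression
           return createRow(keyIndex, ms, offsetMs);    // createRow as in the snippet above
       }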



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35501] Use common IO thread pool for RocksDB data transfer [flink]

2024-06-04 Thread via GitHub


rkhachatryan commented on PR #24882:
URL: https://github.com/apache/flink/pull/24882#issuecomment-2147612702

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [hotfix][docs] Fix typo in Log Example Given on upgrade.md File [flink-kubernetes-operator]

2024-06-04 Thread via GitHub


nacisimsek opened a new pull request, #835:
URL: https://github.com/apache/flink-kubernetes-operator/pull/835

   The ID in the savepoint folder name that is passed as a parameter and the 
ID in the folder name given in the log example do NOT match. 
   
   Expected ID in log example: `aec3dd08e76d`
   Actual ID given in the log example: `2f40a9c8e4b9`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35475][runtime] Introduce isInternalSorterSupport to OperatorAttributes [flink]

2024-06-04 Thread via GitHub


jeyhunkarimov commented on PR #24874:
URL: https://github.com/apache/flink/pull/24874#issuecomment-2147546699

   Thanks @Sxnan LGTM!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35353][docs-zh]Translate "Profiler" page into Chinese [flink]

2024-06-04 Thread via GitHub


drymatini commented on PR #24822:
URL: https://github.com/apache/flink/pull/24822#issuecomment-2147503847

   Hi @JingGe, thank you for the review. I have already made adjustments based 
on your comments; please check my commit again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32081][checkpoint] Compatibility between file-merging on and off across job runs [flink]

2024-06-04 Thread via GitHub


ljz2051 commented on code in PR #24873:
URL: https://github.com/apache/flink/pull/24873#discussion_r1625978087


##
flink-tests/src/test/java/org/apache/flink/test/checkpointing/SnapshotFileMergingCompatibilityITCase.java:
##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.test.checkpointing;
+
+import org.apache.flink.configuration.CheckpointingOptions;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
+import org.apache.flink.core.execution.RestoreMode;
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.test.util.MiniClusterWithClientResource;
+import org.apache.flink.util.TestLogger;
+
+import org.junit.jupiter.api.io.TempDir;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.MethodSource;
+
+import java.nio.file.Path;
+import java.util.Arrays;
+import java.util.Collection;
+
+import static org.apache.flink.test.checkpointing.ResumeCheckpointManuallyITCase.runJobAndGetExternalizedCheckpoint;
+import static org.assertj.core.api.Assertions.assertThat;
+
+/**
+ * FileMerging Compatibility IT case which tests recovery from a checkpoint created in different
+ * fileMerging mode (i.e. fileMerging enabled/disabled).
+ */
+public class SnapshotFileMergingCompatibilityITCase extends TestLogger {
+
+    public static Collection parameters() {
+        return Arrays.asList(
+                new Object[][] {
+                    {RestoreMode.CLAIM, true},
+                    {RestoreMode.CLAIM, false},
+                    {RestoreMode.NO_CLAIM, true},
+                    {RestoreMode.NO_CLAIM, false}
+                });
+    }
+
+    @ParameterizedTest(name = "RestoreMode = {0}, fileMergingAcrossBoundary = {1}")
+    @MethodSource("parameters")
+    public void testSwitchFromDisablingToEnablingFileMerging(
+            RestoreMode restoreMode, boolean fileMergingAcrossBoundary, @TempDir Path checkpointDir)
+            throws Exception {
+        testSwitchingFileMerging(checkpointDir, false, true, restoreMode, fileMergingAcrossBoundary);
+    }
+
+    @ParameterizedTest(name = "RestoreMode = {0}, fileMergingAcrossBoundary = {1}")
+    @MethodSource("parameters")
+    public void testSwitchFromEnablingToDisablingFileMerging(
+            RestoreMode restoreMode, boolean fileMergingAcrossBoundary, @TempDir Path checkpointDir)
+            throws Exception {
+        testSwitchingFileMerging(checkpointDir, true, false, restoreMode, fileMergingAcrossBoundary);
+    }
+
+    private void testSwitchingFileMerging(
+            Path checkpointDir,
+            boolean firstFileMergingSwitch,
+            boolean secondFileMergingSwitch,
+            RestoreMode restoreMode,
+            boolean fileMergingAcrossBoundary)
+            throws Exception {
+        final Configuration config = new Configuration();
+        config.set(CheckpointingOptions.CHECKPOINTS_DIRECTORY, checkpointDir.toUri().toString());
+        config.set(CheckpointingOptions.INCREMENTAL_CHECKPOINTS, true);
+        config.set(CheckpointingOptions.FILE_MERGING_ACROSS_BOUNDARY, fileMergingAcrossBoundary);
+        config.set(CheckpointingOptions.FILE_MERGING_ENABLED, firstFileMergingSwitch);
+        MiniClusterWithClientResource firstCluster =
+                new MiniClusterWithClientResource(
+                        new MiniClusterResourceConfiguration.Builder()
+                                .setConfiguration(config)
+                                .setNumberTaskManagers(2)
+                                .setNumberSlotsPerTaskManager(2)
+                                .build());
+        EmbeddedRocksDBStateBackend stateBackend1 = new EmbeddedRocksDBStateBackend();
+        stateBackend1.configure(config, Thread.currentThread().getContextClassLoader());
+        firstCluster.before();
+        String externalCheckpoint;
+        try {
+            externalCheckpoint =
+                    runJobAndGetExternalizedCheckpoint(
+                            stateBackend1, null, firstCluster, 

[jira] [Closed] (FLINK-35506) disable kafka auto-commit and rely on flink’s checkpointing if both are enabled

2024-06-04 Thread elon_X (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

elon_X closed FLINK-35506.
--
Resolution: Not A Problem

> disable kafka auto-commit and rely on flink’s checkpointing if both are 
> enabled
> ---
>
> Key: FLINK-35506
> URL: https://issues.apache.org/jira/browse/FLINK-35506
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Affects Versions: 1.16.1
>Reporter: elon_X
>Priority: Major
> Attachments: image-2024-06-03-23-39-28-270.png
>
>
> When I use KafkaSource for consuming topics and set the Kafka parameter 
> {{{}enable.auto.commit=true{}}}, while also enabling checkpointing for the 
> task, I notice that both will commit offsets. Should Kafka's auto-commit be 
> disabled when enabling Flink checkpointing, similar to how it's done with 
> FlinkKafkaConsumer?
>  
> *How to reproduce*
>  
> {code:java}
> // code placeholder
> Properties kafkaParams = new Properties();
> kafkaParams.put("enable.auto.commit", "true");
> kafkaParams.put("auto.offset.reset", "latest");
> kafkaParams.put("fetch.min.bytes", "4096");
> kafkaParams.put("sasl.mechanism", "PLAIN");
> kafkaParams.put("security.protocol", "SASL_PLAINTEXT");
> kafkaParams.put("bootstrap.servers", bootStrap);
> kafkaParams.put("group.id", expoGroupId);
> kafkaParams.put("sasl.jaas.config","org.apache.kafka.common.security.plain.PlainLoginModule
>  required username=\"" + username + "\" password=\"" + password + "\";");
> KafkaSource source = KafkaSource
> .builder()
> .setBootstrapServers(bootStrap)
> .setProperties(kafkaParams)
> .setGroupId(expoGroupId)
> .setTopics(Arrays.asList(expoTopic))
> .setValueOnlyDeserializer(new SimpleStringSchema())
> .setStartingOffsets(OffsetsInitializer.latest())
> .build();
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
> .filter(r ->  true);
> env.enableCheckpointing(3000 * 1000);
> env.getCheckpointConfig().setMinPauseBetweenCheckpoints(1000 * 1000);
> env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.AT_LEAST_ONCE);
> env.getCheckpointConfig().setCheckpointTimeout(1000 * 300);
> env.execute("kafka-consumer"); {code}
>  
>  
> The Kafka client's 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator continuously 
> commits offsets.
> !image-2024-06-03-23-39-28-270.png!
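
For reference, the usual way to avoid the double commit is to disable the 
client's periodic auto-commit and leave offset committing to Flink's 
checkpoints (a sketch based on the snippet above; 'commit.offsets.on.checkpoint' 
is the connector option that controls committing on checkpoint completion):
{code:java}
// Sketch: let checkpoint completion, not the Kafka client, commit offsets.
kafkaParams.put("enable.auto.commit", "false");

KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers(bootStrap)
        .setTopics(Arrays.asList(expoTopic))
        .setGroupId(expoGroupId)
        .setProperties(kafkaParams)
        // committing on checkpoint is the default; shown here for clarity
        .setProperty("commit.offsets.on.checkpoint", "true")
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .setStartingOffsets(OffsetsInitializer.latest())
        .build();
{code}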



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-26050] Manually compact small SST files [flink]

2024-06-04 Thread via GitHub


rkhachatryan merged PR #24880:
URL: https://github.com/apache/flink/pull/24880


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35516][Connector/Files] Update the Experimental annotation for files connector [flink]

2024-06-04 Thread via GitHub


flinkbot commented on PR #24885:
URL: https://github.com/apache/flink/pull/24885#issuecomment-2147260985

   
   ## CI report:
   
   * d303ba9557f4cbef16b70565a97ba35b829f9bc6 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35516) Update the Experimental annotation to PublicEvolving for files connector

2024-06-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35516:
---
Labels: pull-request-available  (was: )

> Update the Experimental annotation to PublicEvolving for files connector 
> -
>
> Key: FLINK-35516
> URL: https://issues.apache.org/jira/browse/FLINK-35516
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / FileSystem
>Reporter: RocMarshal
>Assignee: RocMarshal
>Priority: Minor
>  Labels: pull-request-available
>
> as described in https://issues.apache.org/jira/browse/FLINK-35496
> We should update the annotations for the stable APIs in files connector.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-35516][Connector/Files] Update the Experimental annotation for files connector [flink]

2024-06-04 Thread via GitHub


RocMarshal opened a new pull request, #24885:
URL: https://github.com/apache/flink/pull/24885

   
   
   
   
   ## What is the purpose of the change
   
   Update the Experimental annotation for files connector
   
   
   ## Brief change log
   
   Update the Experimental annotation for files connector
   
   
   ## Verifying this change
   
   
   
   This change is already covered by existing tests
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-35512) ArtifactFetchManagerTest unit tests fail

2024-06-04 Thread Elphas Toringepi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852005#comment-17852005
 ] 

Elphas Toringepi commented on FLINK-35512:
--

[~hong] +1, I reproduced the error by running
{noformat}
./mvnw clean package{noformat}

> ArtifactFetchManagerTest unit tests fail
> 
>
> Key: FLINK-35512
> URL: https://issues.apache.org/jira/browse/FLINK-35512
> Project: Flink
>  Issue Type: Technical Debt
>Affects Versions: 1.19.1
>Reporter: Hong Liang Teoh
>Priority: Major
> Fix For: 1.19.1
>
>
> The below three tests from *ArtifactFetchManagerTest* seem to fail 
> consistently:
>  * ArtifactFetchManagerTest.testFileSystemFetchWithAdditionalUri
>  * ArtifactFetchManagerTest.testMixedArtifactFetch
>  * ArtifactFetchManagerTest.testHttpFetch
> The error printed is
> {code:java}
> java.lang.AssertionError:
> Expecting actual not to be empty
>     at 
> org.apache.flink.client.program.artifact.ArtifactFetchManagerTest.getFlinkClientsJar(ArtifactFetchManagerTest.java:248)
>     at 
> org.apache.flink.client.program.artifact.ArtifactFetchManagerTest.testMixedArtifactFetch(ArtifactFetchManagerTest.java:146)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
>     at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>     at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>     at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
>     at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) 
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Jing Ge (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852002#comment-17852002
 ] 

Jing Ge commented on FLINK-35517:
-

[~fanrui] all PRs will be recovered.

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Jing Ge
>Priority: Blocker
> Fix For: 1.20.0
>
>
> Flink CI pipeline triggered by pull request seems sort of unstable. 
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours 
> ago, but CI report is UNKNOWN.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge resolved FLINK-35517.
-
Fix Version/s: 1.20.0
   Resolution: Fixed

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Jing Ge
>Priority: Blocker
> Fix For: 1.20.0
>
>
> Flink CI pipeline triggered by pull request seems sort of unstable. 
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours 
> ago, but CI report is UNKNOWN.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-35518) CI Bot doesn't run on PRs - status UNKNOWN

2024-06-04 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge reassigned FLINK-35518:
---

Assignee: Jing Ge

> CI Bot doesn't run on PRs - status UNKNOWN
> --
>
> Key: FLINK-35518
> URL: https://issues.apache.org/jira/browse/FLINK-35518
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Piotr Nowojski
>Assignee: Jing Ge
>Priority: Critical
>
> Doesn't want to run on my PR/branch. I was doing force-pushes, rebases, 
> asking flink bot to run, closed and opened new PR, but nothing helped
> https://github.com/apache/flink/pull/24868
> https://github.com/apache/flink/pull/24883
> I've heard others were having similar problems recently.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35520) master can't compile as license check failed

2024-06-04 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852001#comment-17852001
 ] 

Sergey Nuyanzin commented on FLINK-35520:
-

So far it was reverted, as mentioned in the comments of FLINK-35282.
PS: I have a fix for this at 
https://github.com/snuyanzin/flink/commit/5a4f4d0eb785050552c73fbfc74214f85ee278b0
which could be tried once the build is more stable.

> master can't compile as license check failed
> 
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-35518) CI Bot doesn't run on PRs - status UNKNOWN

2024-06-04 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge closed FLINK-35518.
---
Resolution: Duplicate

> CI Bot doesn't run on PRs - status UNKNOWN
> --
>
> Key: FLINK-35518
> URL: https://issues.apache.org/jira/browse/FLINK-35518
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Piotr Nowojski
>Assignee: Jing Ge
>Priority: Critical
>
> Doesn't want to run on my PR/branch. I was doing force-pushes, rebases, 
> asking flink bot to run, closed and opened new PR, but nothing helped
> https://github.com/apache/flink/pull/24868
> https://github.com/apache/flink/pull/24883
> I've heard others were having similar problems recently.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32229) Implement metrics and logging for Initial implementation

2024-06-04 Thread Hong Liang Teoh (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851998#comment-17851998
 ] 

Hong Liang Teoh commented on FLINK-32229:
-

[~chalixar] sorry, I didn't manage to see this earlier, and [~burakoz] has 
requested to work on it. Could this be worked out between you and [~burakoz]? 
Maybe we can collaborate by reviewing the PR?

> Implement metrics and logging for Initial implementation
> 
>
> Key: FLINK-32229
> URL: https://issues.apache.org/jira/browse/FLINK-32229
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Hong Liang Teoh
>Assignee: Burak Ozakinci
>Priority: Major
>
> Add/Ensure Kinesis specific metrics for MillisBehindLatest/numRecordsIn are 
> published.
> More metrics here: 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics]
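
For illustration, a hedged sketch of FLIP-33-style metric registration in a 
source reader (the metric-group calls exist in Flink's MetricGroup and 
SourceReaderMetricGroup APIs; the surrounding context and field names are 
assumptions, not the connector's actual code):
{code:java}
// Hedged sketch, not the connector's actual code: `context` is the reader's
// SourceReaderContext and `lastMillisBehindLatest` a field updated per fetch.
SourceReaderMetricGroup metrics = context.metricGroup();
// Standard FLIP-33 counter from the operator I/O metric group:
metrics.getIOMetricGroup().getNumRecordsInCounter().inc(recordCount);
// Kinesis-specific lag gauge:
metrics.gauge("millisBehindLatest", () -> lastMillisBehindLatest);
{code}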



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (FLINK-35282) PyFlink Support for Apache Beam > 2.49

2024-06-04 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin reopened FLINK-35282:
-

I noticed that it was reverted by [~hong] at 
[caa68c2481f7c483d0364206d36654af26f2074f|https://github.com/apache/flink/commit/caa68c2481f7c483d0364206d36654af26f2074f];
for that reason it would make sense to reopen the issue as well.


BTW, I have a fix for the license check issue at 
https://github.com/snuyanzin/flink/commit/5a4f4d0eb785050552c73fbfc74214f85ee278b0
We could try it once the build becomes more stable.

> PyFlink Support for Apache Beam > 2.49
> --
>
> Key: FLINK-35282
> URL: https://issues.apache.org/jira/browse/FLINK-35282
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.19.0, 1.18.1
>Reporter: APA
>Assignee: Antonio Vespoli
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> From what I see PyFlink still has the requirement of Apache Beam >= 2.43.0 
> and <= 2.49.0 which subsequently results in a requirement of PyArrow <= 
> 12.0.0. That keeps us exposed to 
> [https://nvd.nist.gov/vuln/detail/CVE-2023-47248]
> I'm not deep enough familiar with the PyFlink code base to understand why 
> Apache Beam's upper dependency limit can't be lifted. From all the existing 
> issues I haven't seen one addressing this. Therefore I created one now. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32229) Implement metrics and logging for Initial implementation

2024-06-04 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh reassigned FLINK-32229:
---

Assignee: Burak Ozakinci

> Implement metrics and logging for Initial implementation
> 
>
> Key: FLINK-32229
> URL: https://issues.apache.org/jira/browse/FLINK-32229
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Hong Liang Teoh
>Assignee: Burak Ozakinci
>Priority: Major
>
> Add/Ensure Kinesis specific metrics for MillisBehindLatest/numRecordsIn are 
> published.
> More metrics here: 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851995#comment-17851995
 ] 

Rui Fan edited comment on FLINK-35517 at 6/4/24 10:52 AM:
--

Hi [~jingge], thanks for the quick fix. :)

May I know whether all old PRs can be recovered as well, or do only new PRs 
work now?


was (Author: fanrui):
Hi [~jingge] , may I know whether all old PRs can be recovered as well? Or only 
new PR works now?

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Jing Ge
>Priority: Blocker
>
> Flink CI pipeline triggered by pull request seems sort of unstable. 
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours 
> ago, but CI report is UNKNOWN.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-20398][e2e] Migrate test_batch_sql.sh to Java e2e tests framework [flink]

2024-06-04 Thread via GitHub


JingGe commented on code in PR #24471:
URL: https://github.com/apache/flink/pull/24471#discussion_r1624540081


##
flink-end-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java:
##
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.sql.tests;
+
+import org.apache.flink.connector.testframe.source.FromElementsSource;
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.data.TimestampData;
+import org.apache.flink.types.Row;
+
+import java.time.Instant;
+import java.time.LocalDateTime;
+import java.time.ZoneOffset;
+import java.util.ArrayList;
+import java.util.List;
+
+class Generator implements FromElementsSource.ElementsSupplier<RowData> {
+    private static final long serialVersionUID = -8455653458083514261L;
+    private final List<RowData> elements;
+
+    static Generator create(
+            int numKeys, float rowsPerKeyAndSecond, int durationSeconds, int offsetSeconds) {
+        final int stepMs = (int) (1000 / rowsPerKeyAndSecond);
+        final long durationMs = durationSeconds * 1000L;
+        final long offsetMs = offsetSeconds * 2000L;
+        final List<RowData> elements = new ArrayList<>();
+        int keyIndex = 0;
+        long ms = 0;
+        while (ms < durationMs) {
+            elements.add(createRow(keyIndex++, ms, offsetMs));

Review Comment:
   The original implementation was only for batch, in `BatchSQLTestProgram`. 
This PR is about the migration and should not implicitly introduce a big 
change to the data generation that might cause performance issues later. In 
addition, the new implementation still lives in the `flink-batch-sql-test` 
module, which should be used only for batch.
   Not sure if there are already similar generators in the stream SQL tests. 
If not, a new Jira task could be created to add the new generator 
implementation to the stream SQL tests.
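   
   For reference, a minimal usage sketch of the supplier above (parameter 
values are illustrative, not from this PR):
   
   // stepMs = (int) (1000 / 1.0f) = 1000, i.e. generated rows are spaced
   // 1000 ms apart across the 60-second duration.
   Generator supplier = Generator.create(10, 1.0f, 60, 0);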



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851995#comment-17851995
 ] 

Rui Fan commented on FLINK-35517:
-

Hi [~jingge], may I know whether all old PRs can be recovered as well, or do 
only new PRs work now?

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Jing Ge
>Priority: Blocker
>
> Flink CI pipeline triggered by pull request seems sort of unstable. 
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours 
> ago, but CI report is UNKNOWN.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Jing Ge (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851994#comment-17851994
 ] 

Jing Ge commented on FLINK-35517:
-

It should work again. [~Weijie Guo], please let me know if you still have any 
issues. Thanks!

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Jing Ge
>Priority: Blocker
>
> Flink CI pipeline triggered by pull request seems sort of unstable. 
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours 
> ago, but CI report is UNKNOWN.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35435] Add timeout Configuration to Async Sink [flink]

2024-06-04 Thread via GitHub


vahmed-hamdy commented on PR #24839:
URL: https://github.com/apache/flink/pull/24839#issuecomment-2147156792

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35520) master can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Summary: master can't compile as license check failed  (was: Nightly build 
can't compile as license check failed)

> master can't compile as license check failed
> 
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-35520) Nightly build can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851983#comment-17851983
 ] 

Weijie Guo edited comment on FLINK-35520 at 6/4/24 9:53 AM:


Hi [~antoniovespoli], would you mind taking a look? I found this issue after 
FLINK-35282 merged.


was (Author: weijie guo):
Hi [~antoniovespoli], would you mind taking a look? I found this issue after 
91a9e06d merged.

> Nightly build can't compile as license check failed
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35520) Nightly build can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851983#comment-17851983
 ] 

Weijie Guo commented on FLINK-35520:


Hi [~antoniovespoli], would you mind taking a look? I found this issue after 
91a9e06d merged.

> Nightly build can't compile as license check failed
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35520) Nightly build can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Description: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808
  (was: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45807|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805])

> Nightly build can't compile as license check failed
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35520) Nightly build can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Description: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45807|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]
  (was: 09:26:55,177 ERROR 
org.apache.flink.tools.ci.licensecheck.JarFileChecker [] - Jar file 
/tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
 contains a LICENSE file in an unexpected location: /LICENSE

 

[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805])

> Nightly build can't compile as license check failed
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45807|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35520) Nightly build can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Summary: Nightly build can't compile as license check failed  (was: Nightly 
build can't compile)

> Nightly build can't compile as license check failed
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> 09:26:55,177 ERROR org.apache.flink.tools.ci.licensecheck.JarFileChecker [] - 
> Jar file 
> /tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
>  contains a LICENSE file in an unexpected location: /LICENSE
>  
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35305]Amazon SQS Sink Connector [flink-connector-aws]

2024-06-04 Thread via GitHub


hlteoh37 commented on code in PR #141:
URL: 
https://github.com/apache/flink-connector-aws/pull/141#discussion_r1625705602


##
flink-connector-aws/flink-connector-sqs/src/main/java/org.apache.flink.connector.sqs/sink/SqsSinkElementConverter.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.sqs.sink;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.api.common.serialization.SerializationSchema;
+import org.apache.flink.api.connector.sink2.Sink;
+import org.apache.flink.api.connector.sink2.SinkWriter;
+import org.apache.flink.connector.base.sink.writer.ElementConverter;
+import org.apache.flink.metrics.MetricGroup;
+import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
+import org.apache.flink.util.FlinkRuntimeException;
+import org.apache.flink.util.Preconditions;
+import org.apache.flink.util.SimpleUserCodeClassLoader;
+import org.apache.flink.util.UserCodeClassLoader;
+
+import software.amazon.awssdk.services.sqs.model.SendMessageBatchRequestEntry;
+
+import java.nio.charset.StandardCharsets;
+import java.util.UUID;
+
+/**
+ * An implementation of the {@link ElementConverter} that uses the AWS SQS SDK v2. The user only
+ * needs to provide a {@link SerializationSchema} of the {@code InputT} to transform it into a
+ * {@link SendMessageBatchRequestEntry} that may be persisted.
+ */
+@Internal
+public class SqsSinkElementConverter<InputT>
+        implements ElementConverter<InputT, SendMessageBatchRequestEntry> {
+
+    /** A serialization schema to specify how the input element should be serialized. */
+    private final SerializationSchema<InputT> serializationSchema;
+
+    private SqsSinkElementConverter(SerializationSchema<InputT> serializationSchema) {
+        this.serializationSchema = serializationSchema;
+    }
+
+    @Override
+    public SendMessageBatchRequestEntry apply(InputT element, SinkWriter.Context context) {
+        final byte[] messageBody = serializationSchema.serialize(element);
+        return SendMessageBatchRequestEntry.builder()
+                .id(UUID.randomUUID().toString())
+                .messageBody(new String(messageBody, StandardCharsets.UTF_8))
+                .build();
+    }
+
+    @Override
+    public void open(Sink.InitContext context) {
+        try {
+            serializationSchema.open(
+                    new SerializationSchema.InitializationContext() {
+                        @Override
+                        public MetricGroup getMetricGroup() {
+                            return new UnregisteredMetricsGroup();
+                        }
+
+                        @Override
+                        public UserCodeClassLoader getUserCodeClassLoader() {
+                            return SimpleUserCodeClassLoader.create(
+                                    SqsSinkElementConverter.class.getClassLoader());
+                        }
+                    });
+        } catch (Exception e) {
+            throw new FlinkRuntimeException("Failed to initialize serialization schema.", e);
+        }
+    }
+
+    public static <InputT> Builder<InputT> builder() {
+        return new Builder<>();
+    }
+
+    /** A builder for the SqsSinkElementConverter. */
+    public static class Builder<InputT> {
+
+        private SerializationSchema<InputT> serializationSchema;
+
+        public Builder<InputT> setSerializationSchema(
+                SerializationSchema<InputT> serializationSchema) {
+            this.serializationSchema = serializationSchema;
+            return this;
+        }
+
+        public SqsSinkElementConverter<InputT> build() {
+            Preconditions.checkNotNull(
+                    serializationSchema,
+                    "No SerializationSchema was supplied to the " + "SQS Sink builder.");

Review Comment:
   nit: `+` not needed here
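   
   For context, a minimal usage sketch of the builder above (assuming a String 
payload and Flink's built-in SimpleStringSchema; illustrative, not part of 
this PR):
   
   // The converter turns each String element into a SendMessageBatchRequestEntry:
   ElementConverter<String, SendMessageBatchRequestEntry> converter =
           SqsSinkElementConverter.<String>builder()
                   .setSerializationSchema(new SimpleStringSchema())
                   .build();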



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35520) Nightly build can't compile

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Summary: Nightly build can't compile  (was: Nightly build can't compile as 
problems were detected from NoticeFileChecker)

> Nightly build can't compile
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> 09:26:55,177 ERROR org.apache.flink.tools.ci.licensecheck.JarFileChecker [] - 
> Jar file 
> /tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
>  contains a LICENSE file in an unexpected location: /LICENSE
>  
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35520) Nightly build can't compile as problems were detected from NoticeFileChecker

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Description: 
[] - Jar file 
/tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
 contains a LICENSE file in an unexpected location: /LICENSE

 

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805

> Nightly build can't compile as problems were detected from NoticeFileChecker
> 
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> [] - Jar file 
> /tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
>  contains a LICENSE file in an unexpected location: /LICENSE
>  
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35305]Amazon SQS Sink Connector [flink-connector-aws]

2024-06-04 Thread via GitHub


hlteoh37 commented on PR #141:
URL: 
https://github.com/apache/flink-connector-aws/pull/141#issuecomment-2147097522

   We seem to have quite a few `.` in the class folder names. Can we change 
them to `/` instead?
   e.g. 
`flink-connector-aws/flink-connector-sqs/src/test/java/org.apache.flink/connector.sqs/sink/SqsExceptionClassifiersTest.java`
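   
   Presumably the intended layout would be 
`flink-connector-aws/flink-connector-sqs/src/test/java/org/apache/flink/connector/sqs/sink/SqsExceptionClassifiersTest.java`, 
with package segments as directories.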


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35520) Nightly build can't compile as problems were detected from NoticeFileChecker

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Description: 
09:26:55,177 ERROR org.apache.flink.tools.ci.licensecheck.JarFileChecker [] - 
Jar file 
/tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
 contains a LICENSE file in an unexpected location: /LICENSE

 

[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]

  was:
[] - Jar file 
/tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
 contains a LICENSE file in an unexpected location: /LICENSE

 

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805


> Nightly build can't compile as problems were detected from NoticeFileChecker
> 
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> 09:26:55,177 ERROR org.apache.flink.tools.ci.licensecheck.JarFileChecker [] - 
> Jar file 
> /tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
>  contains a LICENSE file in an unexpected location: /LICENSE
>  
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

