[jira] [Closed] (FLINK-34564) Unstable test case TableSourceITCase#testTableHintWithLogicalTableScanReuse

2024-03-05 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl closed FLINK-34564.
-
Resolution: Duplicate

Thanks for reporting this, [~RocMarshal]. This test is especially flaky in 
GitHub Actions. And [~lincoln.86xy] is right: it's already covered by 
FLINK-29114, so I'm going to close this one in favor of that ticket.

> Unstable test case TableSourceITCase#testTableHintWithLogicalTableScanReuse
> ---
>
> Key: FLINK-34564
> URL: https://issues.apache.org/jira/browse/FLINK-34564
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.0.0, 1.19.0
>Reporter: RocMarshal
>Priority: Minor
> Attachments: image-2024-03-02-11-01-12-718.png, 
> image-2024-03-02-11-01-44-431.png
>
>
> * branch: 1.19 & master
>  * Java version: 1.8
>  * how to reproduce:
>  ** add '@RepeatedTest' to 
> TableSourceITCase#testTableHintWithLogicalTableScanReuse
>  ** then run it
>  ** !image-2024-03-02-11-01-12-718.png!
>  ** !image-2024-03-02-11-01-44-431.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34429) Adding K8S Annotations to Internal Service

2024-03-05 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823893#comment-17823893
 ] 

Danny Cranmer commented on FLINK-34429:
---

Merged commit 
[{{70975b2}}|https://github.com/apache/flink/commit/70975b258700f12cdb4e9352180be2f4213a6a08]
 into apache:master 

> Adding K8S Annotations to Internal Service
> --
>
> Key: FLINK-34429
> URL: https://issues.apache.org/jira/browse/FLINK-34429
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Barak Ben-Nathan
>Assignee: Barak Ben-Nathan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> Flink currently supports adding Annotations to the Rest Service (with the 
> configuration key:
> kubernetes.rest-service.annotations). 
> New configuration key: kubernetes.internal-service.annotations should be 
> added to allow the same functionality for the internal service. 
>  
> This is useful, for example, when implementing Teleport app discovery. 
> Without special annotations on the service, Teleport floods the JobManager with 
> errors:
>  
> {code:java}
> 2024-02-12 09:24:43,747 WARN 
> org.apache.pekko.remote.transport.netty.NettyTransport [] - Remote connection 
> to [/172.16.108.253:39884] failed with 
> org.jboss.netty.handler.codec.frame.TooLongFrameException: Adjusted frame 
> length exceeds 10485760: 369295621 - discarded 2024-02-12 09:24:43,753 ERROR 
> org.apache.flink.runtime.blob.BlobServerConnection [] - Error while executing 
> BLOB connection from /172.16.108.253:40954. java.io.IOException: Unknown 
> operation 22 at 
> org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:116)
>  [flink-dist-1.18.1.jar:1.18.1]  {code}
> To avoid these errors we need to add [special 
> Annotations|https://github.com/gravitational/teleport/blob/master/rfd/0135-kube-apps-discovery.md#annotations]
>  to the internal service.
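
For illustration, the existing REST-service option accepts annotations as comma-separated key:value pairs; the proposed internal-service key would presumably follow the same shape. The annotation keys below are placeholders, not the actual Teleport annotations from the linked RFD:

```yaml
# flink-conf.yaml — existing, documented option for the REST service:
kubernetes.rest-service.annotations: example.com/owner:team-a,example.com/tier:frontend

# proposed option from this ticket, mirroring the same format for the
# internal (JobManager RPC/blob) service:
kubernetes.internal-service.annotations: example.com/discovery:ignore
```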





[jira] [Resolved] (FLINK-34429) Adding K8S Annotations to Internal Service

2024-03-05 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer resolved FLINK-34429.
---
Resolution: Done

> Adding K8S Annotations to Internal Service
> --
>
> Key: FLINK-34429
> URL: https://issues.apache.org/jira/browse/FLINK-34429
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Barak Ben-Nathan
>Assignee: Barak Ben-Nathan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.20.0
>





Re: [PR] [FLINK-34429] [flink-kubernetes] Adding K8S Annotations to Internal Service [flink]

2024-03-05 Thread via GitHub


dannycranmer merged PR #24348:
URL: https://github.com/apache/flink/pull/24348


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34016) Janino compile failed when watermark with column by udf

2024-03-05 Thread Sebastien Pereira (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823892#comment-17823892
 ] 

Sebastien Pereira commented on FLINK-34016:
---

Hi [~xuyangzhong] any chance for the fix to land in [release 
1.19|https://cwiki.apache.org/confluence/display/FLINK/1.19+Release]? 

> Janino compile failed when watermark with column by udf
> ---
>
> Key: FLINK-34016
> URL: https://issues.apache.org/jira/browse/FLINK-34016
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.15.0, 1.18.0
>Reporter: ude
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2024-01-25-11-53-06-158.png, 
> image-2024-01-25-11-54-54-381.png, image-2024-01-25-12-57-21-318.png, 
> image-2024-01-25-12-57-34-632.png
>
>
> After submit the following flink sql by sql-client.sh will throw an exception:
> {code:java}
> Caused by: java.lang.RuntimeException: Could not instantiate generated class 
> 'WatermarkGenerator$0'
>     at 
> org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:74)
>     at 
> org.apache.flink.table.runtime.generated.GeneratedWatermarkGeneratorSupplier.createWatermarkGenerator(GeneratedWatermarkGeneratorSupplier.java:69)
>     at 
> org.apache.flink.streaming.api.operators.source.ProgressiveTimestampsAndWatermarks.createMainOutput(ProgressiveTimestampsAndWatermarks.java:109)
>     at 
> org.apache.flink.streaming.api.operators.SourceOperator.initializeMainOutput(SourceOperator.java:462)
>     at 
> org.apache.flink.streaming.api.operators.SourceOperator.emitNextNotReading(SourceOperator.java:438)
>     at 
> org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:414)
>     at 
> org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68)
>     at 
> org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:562)
>     at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:807)
>     at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)
>     at 
> org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:932)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.util.FlinkRuntimeException: 
> org.apache.flink.api.common.InvalidProgramException: Table program cannot be 
> compiled. This is a bug. Please file an issue.
>     at 
> org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:94)
>     at 
> org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:101)
>     at 
> org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:68)
>     ... 16 more
> Caused by: 
> org.apache.flink.shaded.guava31.com.google.common.util.concurrent.UncheckedExecutionException:
>  org.apache.flink.api.common.InvalidProgramException: Table program cannot be 
> compiled. This is a bug. Please file an issue.
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2055)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache.get(LocalCache.java:3966)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4863)
>     at 
> org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:92)
>     ... 18 more
> Caused by: org.apache.flink.api.common.InvalidProgramException: Table program 
> cannot be compiled. This is a bug. Please file an issue.
>     at 
> org.apache.flink.table.runtime.generated.CompileUtils.doCompile(CompileUtils.java:107)
>     at 
> org.apache.flink.table.runtime.generated.CompileUtils.lambda$compile$0(CompileUtils.java:92)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4868)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3533)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2282)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.Local
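
The SQL statement referenced in the quoted description ("the following flink sql") is not included in this excerpt; it was attached as images. Purely as a hypothetical illustration of the failing shape — a watermark defined on a column computed by a UDF, which exercises the generated WatermarkGenerator code path — such a statement might look like:

```sql
-- Hypothetical example only; the ticket's actual SQL is in the image attachments.
CREATE TABLE events (
  id      STRING,
  ts_str  STRING,
  -- computed column produced by a user-defined function (name is illustrative)
  rowtime AS my_parse_ts_udf(ts_str),
  -- watermark on the UDF-derived column triggers code generation for
  -- WatermarkGenerator$0, which is where the Janino compile reportedly fails
  WATERMARK FOR rowtime AS rowtime - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka'
);
```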

[jira] [Comment Edited] (FLINK-34016) Janino compile failed when watermark with column by udf

2024-03-05 Thread Sebastien Pereira (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823892#comment-17823892
 ] 

Sebastien Pereira edited comment on FLINK-34016 at 3/6/24 7:47 AM:
---

Hi [~xuyangzhong] any chance for the fix to land in [release 
1.19|https://cwiki.apache.org/confluence/display/FLINK/1.19+Release]? 


was (Author: JIRAUSER302567):
Hi [~xuyangzhong] any change for the fix to land in [release 
1.19|https://cwiki.apache.org/confluence/display/FLINK/1.19+Release]? 


[jira] [Created] (FLINK-34585) [JUnit5 Migration] Module: Flink CDC

2024-03-05 Thread Hang Ruan (Jira)
Hang Ruan created FLINK-34585:
-

 Summary: [JUnit5 Migration] Module: Flink CDC
 Key: FLINK-34585
 URL: https://issues.apache.org/jira/browse/FLINK-34585
 Project: Flink
  Issue Type: Sub-task
  Components: Flink CDC
Reporter: Hang Ruan


Most tests in Flink CDC are still using JUnit 4. We need to migrate them to JUnit 5.
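
As a rough sketch of what such a migration involves (the package and annotation names are the real JUnit 4/Jupiter coordinates; the test class itself is illustrative and assumes the JUnit 5 dependency is on the classpath):

```java
// Before (JUnit 4):
//   import org.junit.Before;
//   import org.junit.Test;
//   import static org.junit.Assert.assertEquals;

// After (JUnit 5 / Jupiter): same semantics, new packages and annotation names.
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

class MySourceRecordTest {        // JUnit 5 classes and methods may be package-private

    @BeforeEach                   // replaces JUnit 4's @Before
    void setUp() {
        // ... test fixture setup ...
    }

    @Test                         // now org.junit.jupiter.api.Test
    void testRecordCount() {
        assertEquals(2, 1 + 1);   // Assertions replaces org.junit.Assert
    }
}
```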





Re: [PR] [FLINK-34413] Remove HBase 1.x connector files and deps [flink-connector-hbase]

2024-03-05 Thread via GitHub


Tan-JiaLiang commented on code in PR #42:
URL: 
https://github.com/apache/flink-connector-hbase/pull/42#discussion_r1513905416


##
docs/content.zh/docs/connectors/table/hbase.md:
##


Review Comment:
   @ferenc-csaky I think just remove lines 21-22 instead of the whole file, 
WDYT?






[jira] [Updated] (FLINK-34584) Change package name to org.apache.flink.cdc

2024-03-05 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-34584:
---
Fix Version/s: cdc-3.1.0

> Change package name to org.apache.flink.cdc
> ---
>
> Key: FLINK-34584
> URL: https://issues.apache.org/jira/browse/FLINK-34584
> Project: Flink
>  Issue Type: Sub-task
>  Components: Flink CDC
>Reporter: Hang Ruan
>Priority: Major
> Fix For: cdc-3.1.0
>
>
> Flink CDC needs to change its package name to org.apache.flink.cdc.





[jira] [Assigned] (FLINK-34584) Change package name to org.apache.flink.cdc

2024-03-05 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu reassigned FLINK-34584:
--

Assignee: Hang Ruan

> Change package name to org.apache.flink.cdc
> ---
>
> Key: FLINK-34584
> URL: https://issues.apache.org/jira/browse/FLINK-34584
> Project: Flink
>  Issue Type: Sub-task
>  Components: Flink CDC
>Reporter: Hang Ruan
>Assignee: Hang Ruan
>Priority: Major
> Fix For: cdc-3.1.0
>
>
> Flink CDC needs to change its package name to org.apache.flink.cdc.





[jira] [Created] (FLINK-34584) Change package name to org.apache.flink.cdc

2024-03-05 Thread Hang Ruan (Jira)
Hang Ruan created FLINK-34584:
-

 Summary: Change package name to org.apache.flink.cdc
 Key: FLINK-34584
 URL: https://issues.apache.org/jira/browse/FLINK-34584
 Project: Flink
  Issue Type: Sub-task
  Components: Flink CDC
Reporter: Hang Ruan


Flink CDC needs to change its package name to org.apache.flink.cdc.





Re: [PR] [FLINK-34401][docs-zh] Translate "Flame Graphs" page into Chinese [flink]

2024-03-05 Thread via GitHub


lxliyou001 commented on PR #24279:
URL: https://github.com/apache/flink/pull/24279#issuecomment-1980164212

   @JingGe sure, could you do the "Squash and merge" for me? Thanks.





Re: [PR] Revert "[FLINK-33532][network] Move the serialization of ShuffleDescr… [flink]

2024-03-05 Thread via GitHub


KarmaGYZ commented on PR #24441:
URL: https://github.com/apache/flink/pull/24441#issuecomment-1980162303

   cc @zhuzhurk 





Re: [PR] Revert "[FLINK-33532][network] Move the serialization of ShuffleDescr… [flink]

2024-03-05 Thread via GitHub


flinkbot commented on PR #24441:
URL: https://github.com/apache/flink/pull/24441#issuecomment-1980160690

   
   ## CI report:
   
   * c1974425fb043c911042633fd255098988c09e0e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





Re: [PR] [FLINK-34268] Add a test to verify if restore test exists for ExecNode [flink]

2024-03-05 Thread via GitHub


bvarghese1 commented on code in PR #24219:
URL: https://github.com/apache/flink/pull/24219#discussion_r1513875190


##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/testutils/RestoreTestCompleteness.java:
##
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.plan.nodes.exec.testutils;
+
+import org.apache.flink.table.planner.plan.nodes.exec.ExecNode;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecDropUpdateBefore;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecExchange;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecGlobalGroupAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecGlobalWindowAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecLocalGroupAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecLocalWindowAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecOverAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonCalc;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonCorrelate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonGroupAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonGroupTableAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonGroupWindowAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonOverAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecWindowAggregate;
+import org.apache.flink.table.planner.plan.utils.ExecNodeMetadataUtil;
+import 
org.apache.flink.table.planner.plan.utils.ExecNodeMetadataUtil.ExecNodeNameVersion;
+
+import org.apache.flink.shaded.guava31.com.google.common.reflect.ClassPath;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/** Validates that a restore test exists for every ExecNode. */
+public class RestoreTestCompleteness {
+
+private static final Set<Class<? extends ExecNode<?>>> SKIP_EXEC_NODES =
+new HashSet<Class<? extends ExecNode<?>>>() {
+{
+/** Covered with {@link StreamExecGroupAggregate}. */
+add(StreamExecExchange.class);
+
+/** Covered with {@link 
StreamExecIncrementalGroupAggregate}. */
+add(StreamExecLocalGroupAggregate.class);
+add(StreamExecGlobalGroupAggregate.class);
+
+/** Covered with {@link StreamExecWindowAggregate}. */
+add(StreamExecLocalWindowAggregate.class);
+add(StreamExecGlobalWindowAggregate.class);
+
+/** Covered with {@link StreamExecChangelogNormalize}. */
+add(StreamExecDropUpdateBefore.class);
+
+/** TODO: Remove after FLINK-33676 is merged. */
+add(StreamExecWindowAggregate.class);
+
+/** TODO: Remove after FLINK-33805 is merged. */
+add(StreamExecOverAggregate.class);
+
+/** Ignoring python based exec nodes temporarily. */
+add(StreamExecPythonCalc.class);
+add(StreamExecPythonCorrelate.class);
+add(StreamExecPythonOverAggregate.class);
+add(StreamExecPythonGroupAggregate.class);
+add(StreamExecPythonGroupTableAggregate.class);
+add(StreamExecPythonGroupWindowAggregate.class);
+}
+};
+
+@Test
+public void testMissingRestoreTest()
+throws IOException, NoSuchMethodException, InstantiationException,
+IllegalAccessException, InvocationTargetException {
+Map<ExecNodeNameVersion, Class<? extends ExecNode<?>>> 
versionedExecNodes =
+   
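
The quoted test body is truncated above. As a self-contained toy sketch of the completeness-check pattern it implements (all names below are illustrative, not Flink's actual API): discover node classes, subtract those with restore tests and those on the skip list, and fail if anything remains:

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of RestoreTestCompleteness: every discovered node must either
 *  have a restore test or appear on an explicit skip list. */
public class CompletenessCheckSketch {

    /** Returns the nodes that are neither tested nor explicitly skipped. */
    static Set<String> findMissing(
            Set<String> discoveredNodes,
            Set<String> nodesWithRestoreTests,
            Set<String> skipList) {
        Set<String> missing = new HashSet<>(discoveredNodes);
        missing.removeAll(nodesWithRestoreTests); // covered nodes pass
        missing.removeAll(skipList);              // explicitly skipped nodes pass
        return missing;                           // anything left lacks coverage
    }

    public static void main(String[] args) {
        Set<String> discovered = Set.of("StreamExecCalc", "StreamExecExchange", "StreamExecJoin");
        Set<String> tested = Set.of("StreamExecCalc");
        Set<String> skipped = Set.of("StreamExecExchange"); // covered transitively elsewhere
        // A real test would assert this set is empty instead of printing it.
        System.out.println(findMissing(discovered, tested, skipped)); // prints [StreamExecJoin]
    }
}
```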

Re: [PR] [FLINK-34268] Add a test to verify if restore test exists for ExecNode [flink]

2024-03-05 Thread via GitHub


bvarghese1 commented on code in PR #24219:
URL: https://github.com/apache/flink/pull/24219#discussion_r1513874825


##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/testutils/RestoreTestCompleteness.java:
##
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.plan.nodes.exec.testutils;
+
+import org.apache.flink.table.planner.plan.nodes.exec.ExecNode;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecDropUpdateBefore;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecExchange;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecGlobalGroupAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecGlobalWindowAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecLocalGroupAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecLocalWindowAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecOverAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonCalc;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonCorrelate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonGroupAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonGroupTableAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonGroupWindowAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecPythonOverAggregate;
+import 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecWindowAggregate;
+import org.apache.flink.table.planner.plan.utils.ExecNodeMetadataUtil;
+import 
org.apache.flink.table.planner.plan.utils.ExecNodeMetadataUtil.ExecNodeNameVersion;
+
+import org.apache.flink.shaded.guava31.com.google.common.reflect.ClassPath;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/** Validates that a restore test exists for every ExecNode. */
+public class RestoreTestCompleteness {
+
+private static final Set<Class<? extends ExecNode<?>>> SKIP_EXEC_NODES =
+new HashSet<Class<? extends ExecNode<?>>>() {
+{
+/** Covered with {@link StreamExecGroupAggregate}. */
+add(StreamExecExchange.class);
+
+/** Covered with {@link 
StreamExecIncrementalGroupAggregate}. */
+add(StreamExecLocalGroupAggregate.class);
+add(StreamExecGlobalGroupAggregate.class);
+
+/** Covered with {@link StreamExecWindowAggregate}. */
+add(StreamExecLocalWindowAggregate.class);
+add(StreamExecGlobalWindowAggregate.class);

Review Comment:
   That's a great idea! I made the suggested change.



##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/testutils/RestoreTestCompleteness.java:
##
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.plan.nodes.exec.testutils;
+
+import org.a

[PR] Revert "[FLINK-33532][network] Move the serialization of ShuffleDescr… [flink]

2024-03-05 Thread via GitHub


caodizhou opened a new pull request, #24441:
URL: https://github.com/apache/flink/pull/24441

   …iptorGroup out of the RPC main thread]"
   
   This reverts commit d18a4bfe596fc580f8280750fa3bfa22007671d9.
   
   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34576) Flink deployment keep staying at RECONCILING/STABLE status

2024-03-05 Thread chenyuzhi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823864#comment-17823864
 ] 

chenyuzhi commented on FLINK-34576:
---

I have checked our operator's log and found some error logs:
{code:java}
2024-03-05 04:35:55,147 ERROR 
io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher 
[gdc-gdc-sa/logstream-grand-grand-dep456-g96-production] - Error during event 
processing ExecutionScope{ resource id: 
ResourceID{name='logstream-grand-grand-dep456-g96-production', 
namespace='gdc-gdc-sa'}, version: 3628981709} failed.
org.apache.flink.kubernetes.operator.exception.ReconciliationException: 
org.apache.flink.kubernetes.operator.exception.StatusConflictException: Status 
have been modified externally in version 3628984691 Previous: 
{"jobStatus":{"jobName":"logstream-dep456-grand-grand_dep456_g96","jobId":"706cae800e89eb15911760c173a40c05","state":"RECONCILING","startTime":"1709203055328","updateTime":"1709203103773","savepointInfo":{"lastSavepoint":{"timeStamp":1709203025670,"location":"hdfs://mogra/flink/gdc_sa/savepoints/659fa2ec5ce541a5e2d68947/savepoint-0c18ba-008f65afd340","triggerType":"UNKNOWN","formatType":"UNKNOWN","triggerNonce":null},"triggerId":null,"triggerTimestamp":null,"triggerType":null,"formatType":null,"savepointHistory":[],"lastPeriodicSavepointTimestamp":0}},"error":"{\"type\":\"org.apache.flink.kubernetes.operator.exception.ReconciliationException\",\"message\":\"org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException:
 java.lang.RuntimeException: Failed to load 
configuration\",\"additionalMetadata\":{},\"throwableList\":[{\"type\":\"org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException\",\"message\":\"java.lang.RuntimeException:
 Failed to load 
configuration\",\"additionalMetadata\":{}},{\"type\":\"java.lang.RuntimeException\",\"message\":\"Failed
 to load 
configuration\",\"additionalMetadata\":{}}]}","lifecycleState":"STABLE","clusterInfo":{"flink-revision":"b6d20ed
 @ 
2023-12-20T10:01:39+01:00","flink-version":"1.14.0-GDC1.6.0","total-cpu":"1.6","total-memory":"4294967296"},"jobManagerDeploymentStatus":"READY","reconciliationStatus":{"reconciliationTimestamp":1709203026998,"lastReconciledSpec":"{\"spec\":{\"job\":{\"jarURI\":\"local:///opt/flink/usrlib/usercode.jar\",\"parallelism\":1,\"entryClass\":\"com.netease.gdc.streaming.serverdump.Main\",\"args\":[\"-id\",\"grand-grand_dep456_g96\",\"-server\",\"http://logstreamapi-in.nie.netease.com:9099/\",\"-auth_key\",\"3e0e9294802f4b868dc89e8f32ae43ab\",\"-auth_user\",\"_loghub\",\"-auth_project\",\"dep456\",\"-type\",\"logstream\",\"-meta_api\",\"https://meta.gdc.nie.netease.com/\",\"-datahub_api\",\"http://datahub-\"],\"state\":\"running\",\"savepointTriggerNonce\":1,\"initialSavepointPath\":\"hdfs://mogra/flink/gdc_sa/savepoints/659fa2ec5ce541a5e2d68947/savepoint-0c18ba-008f65afd340\",\"upgradeMode\":\"savepoint\",\"allowNonRestoredState\":true},\"restartNonce\":null,\"flinkConfiguration\":{\"containerized.master.env.LAMBDA_METRICS_TAG_APP_NAME\":\"logstream-grand-grand_dep456_g96-production\",\"containerized.master.env.LAMBDA_METRICS_TAG_DEPARTMENT\":\"sa\",\"containerized.master.env.LAMBDA_METRICS_TAG_LAMBDA_ID\":\"659fa2ec5ce541a5e2d68947\",\"containerized.master.env.LAMBDA_METRICS_TAG_PROJECT\":\"gdc\",\"containerized.master.env.LAMBDA_METRICS_TAG__share_project_\":\"gdc\",\"containerized.taskmanager.env.JAVA_LIBRARY_PATH\":\"$JAVA_LIBRARY_PATH:/home/hadoop/hadoop/lib/native\",\"containerized.taskmanager.env.LAMBDA_METRICS_TAG_APP_NAME\":\"logstream-grand-grand_dep456_g96-production\",\"containerized.taskmanager.env.LAMBDA_METRICS_TAG_DEPARTMENT\":\"sa\",\"containerized.taskmanager.env.LAMBDA_METRICS_TAG_LAMBDA_ID\":\"659fa2ec5ce541a5e2d68947\",\"containerized.taskmanager.env.LAMBDA_METRICS_TAG_PROJECT\":\"gdc\",\"containerized.taskmanager.env.LAMBDA_METRICS_TAG__share_project_\":\"gdc\",\"containerized.taskmanager.env.LD_LIBRARY_PATH\":\"$LD_LIBRARY_PATH:/home/hadoop/hadoop/li
b/native\",\"env.hadoop.conf.dir\":\"/home/hadoop/gdcconf/hadoop/mogra\",\"env.java.opts\":\"-Duser.timezone=GMT+08
 -XX:+UseG1GC -Xloggc:log/gc.log -XX:+PrintGCDateStamps 
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 
-XX:GCLogFileSize=10M\",\"execution.checkpointing.externalized-checkpoint-retention\":\"RETAIN_ON_CANCELLATION\",\"execution.checkpointing.interval\":\"30s\",\"execution.checkpointing.tolerable-failed-checkpoints\":\"3\",\"execution.shutdown-on-application-finish\":\"false\",\"high-availability\":\"zookeeper\",\"high-availability.kubernetes.leader-election.lease-duration\":\"60
 s\",\"high-availability.kubernetes.leader-election.renew-deadline\":\"60 
s\",\"high-availability.kubernetes.leader-election.retry-period\":\"30 
s\",\"high-availability.storageDir\":\"hdfs://mogra/flink/gdc_sa/ha/\",\"high-availability.zookeeper.path.roo

[jira] [Updated] (FLINK-34582) release build tools lost the newly added py3.11 packages for mac

2024-03-05 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34582:

Priority: Blocker  (was: Critical)

> release build tools lost the newly added py3.11 packages for mac
> 
>
> Key: FLINK-34582
> URL: https://issues.apache.org/jira/browse/FLINK-34582
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.19.0, 1.20.0
>Reporter: lincoln lee
>Assignee: Xingbo Huang
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> During the 1.19.0-rc1 release, building binaries via 
> tools/releasing/create_binary_release.sh
> lost the 2 newly added py3.11 packages for mac



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-20399) Migrate test_sql_client.sh

2024-03-05 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu closed FLINK-20399.
---
  Assignee: (was: Lorenzo Affetti)
Resolution: Fixed

> Migrate test_sql_client.sh
> --
>
> Key: FLINK-20399
> URL: https://issues.apache.org/jira/browse/FLINK-20399
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Client, Tests
>Reporter: Jark Wu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-20399) Migrate test_sql_client.sh

2024-03-05 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823859#comment-17823859
 ] 

Jark Wu commented on FLINK-20399:
-

[~lorenzo.affetti] Yes, I think so. Thank you for the reminder. 

> Migrate test_sql_client.sh
> --
>
> Key: FLINK-20399
> URL: https://issues.apache.org/jira/browse/FLINK-20399
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Client, Tests
>Reporter: Jark Wu
>Assignee: Lorenzo Affetti
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34516] Move CheckpointingMode to flink-core [flink]

2024-03-05 Thread via GitHub


Zakelly commented on code in PR #24381:
URL: https://github.com/apache/flink/pull/24381#discussion_r1513823880


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/CheckpointConfig.java:
##
@@ -178,7 +178,9 @@ public boolean isCheckpointingEnabled() {
  * Gets the checkpointing mode (exactly-once vs. at-least-once).
  *
  * @return The checkpointing mode.
+ * @deprecated Use {@link #getCheckpointMode} instead.
  */
+@Deprecated
 public CheckpointingMode getCheckpointingMode() {
 return 
configuration.get(ExecutionCheckpointingOptions.CHECKPOINTING_MODE);

Review Comment:
   > IIUC, we should replace all internal usages of original CheckpointMode and 
docs with new one except for the user-facing interface, right?
   
   Yeah, and this will be addressed by another PR.
   
   > I'd suggest to transform to the new CheckpointMode for all original public 
interfaces, and use the new one for other inner places.
   
   Well, I suggest we make this easier since the public API and the option will 
be removed at the same time, no more complex conversion needed, WDYT?
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs

2024-03-05 Thread Xingcan Cui (Jira)
Xingcan Cui created FLINK-34583:
---

 Summary: Bug for dynamic table option hints with multiple CTEs
 Key: FLINK-34583
 URL: https://issues.apache.org/jira/browse/FLINK-34583
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.18.1
Reporter: Xingcan Cui


The table options hints don't work well with multiple WITH clauses referring to 
the same table. Please see the following example.

 

The following query with hints works well.
{code:java}
SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code}
The following query with multiple WITH clauses also works well.
{code:java}
WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...)
T3 AS (SELECT ... FROM T2 WHERE...)
SELECT * FROM T3;{code}
The following query with multiple WITH clauses referring to the same original 
table failed to recognize the hints.
{code:java}
WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...),
T3 AS (SELECT ... FROM T2 WHERE...),
T4 AS (SELECT ... FROM T2 WHERE...),
T5 AS (SELECT ... FROM T3 JOIN T4 ON...)
SELECT * FROM T5;{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34583) Bug for dynamic table option hints with multiple CTEs

2024-03-05 Thread Xingcan Cui (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingcan Cui updated FLINK-34583:

Description: 
The table options hints don't work well with multiple WITH clauses referring to 
the same table. Please see the following example.

 

The following query with hints works well.
{code:java}
SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code}
The following query with multiple WITH clauses also works well.
{code:java}
WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...),
T3 AS (SELECT ... FROM T2 WHERE...)
SELECT * FROM T3;{code}
The following query with multiple WITH clauses referring to the same original 
table failed to recognize the hints.
{code:java}
WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...),
T3 AS (SELECT ... FROM T2 WHERE...),
T4 AS (SELECT ... FROM T2 WHERE...),
T5 AS (SELECT ... FROM T3 JOIN T4 ON...)
SELECT * FROM T5;{code}

  was:
The table options hints don't work well with multiple WITH clauses referring to 
the same table. Please see the following example.

 

The following query with hints works well.
{code:java}
SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code}
The following query with multiple WITH clauses also works well.
{code:java}
WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...)
T3 AS (SELECT ... FROM T2 WHERE...)
SELECT * FROM T3;{code}
The following query with multiple WITH clauses referring to the same original 
table failed to recognize the hints.
{code:java}
WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...),
T3 AS (SELECT ... FROM T2 WHERE...),
T4 AS (SELECT ... FROM T2 WHERE...),
T5 AS (SELECT ... FROM T3 JOIN T4 ON...)
SELECT * FROM T5;{code}


> Bug for dynamic table option hints with multiple CTEs
> -
>
> Key: FLINK-34583
> URL: https://issues.apache.org/jira/browse/FLINK-34583
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.1
>Reporter: Xingcan Cui
>Priority: Major
>
> The table options hints don't work well with multiple WITH clauses referring 
> to the same table. Please see the following example.
>  
> The following query with hints works well.
> {code:java}
> SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...;{code}
> The following query with multiple WITH clauses also works well.
> {code:java}
> WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...),
> T3 AS (SELECT ... FROM T2 WHERE...)
> SELECT * FROM T3;{code}
> The following query with multiple WITH clauses referring to the same original 
> table failed to recognize the hints.
> {code:java}
> WITH T2 AS (SELECT * FROM T1 /*+ OPTIONS('foo' = 'bar') */ WHERE...),
> T3 AS (SELECT ... FROM T2 WHERE...),
> T4 AS (SELECT ... FROM T2 WHERE...),
> T5 AS (SELECT ... FROM T3 JOIN T4 ON...)
> SELECT * FROM T5;{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34566] Pass a FixedThreadPool to set reconciliation parallelism correctly [flink-kubernetes-operator]

2024-03-05 Thread via GitHub


ysymi commented on code in PR #790:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/790#discussion_r1513802450


##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/FlinkOperatorTest.java:
##
@@ -70,10 +70,21 @@ public void testConfigurationPassedToJOSDK() {
 
 var configService = 
testOperator.getOperator().getConfigurationService();
 
-// Test parallelism being passed
+// Test parallelism being passed expectedly
 var executorService = configService.getExecutorService();
 Assertions.assertInstanceOf(ThreadPoolExecutor.class, executorService);
 ThreadPoolExecutor threadPoolExecutor = (ThreadPoolExecutor) 
executorService;
+for (int i = 0; i < testParallelism * 2; i++) {
+threadPoolExecutor.execute(
+() -> {
+try {
+Thread.sleep(1000);
+} catch (InterruptedException e) {
+e.printStackTrace();
+}
+});
+}

Review Comment:
   Actually, I think the code in lines 78 to 85 is ugly; any good suggestions? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34516] Move CheckpointingMode to flink-core [flink]

2024-03-05 Thread via GitHub


masteryhx commented on code in PR #24381:
URL: https://github.com/apache/flink/pull/24381#discussion_r1513788931


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/CheckpointConfig.java:
##
@@ -178,7 +178,9 @@ public boolean isCheckpointingEnabled() {
  * Gets the checkpointing mode (exactly-once vs. at-least-once).
  *
  * @return The checkpointing mode.
+ * @deprecated Use {@link #getCheckpointMode} instead.
  */
+@Deprecated
 public CheckpointingMode getCheckpointingMode() {
 return 
configuration.get(ExecutionCheckpointingOptions.CHECKPOINTING_MODE);

Review Comment:
   IIUC, we should replace all internal usages of original CheckpointMode and 
docs with new one except for the user-facing interface, right?
   I'd suggest to transform to the new CheckpointMode for all original public 
interfaces, and use the new one for other inner places.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34413] Remove HBase 1.x connector files and deps [flink-connector-hbase]

2024-03-05 Thread via GitHub


Tan-JiaLiang commented on code in PR #42:
URL: 
https://github.com/apache/flink-connector-hbase/pull/42#discussion_r1513778129


##
docs/content.zh/docs/connectors/table/hbase.md:
##


Review Comment:
   In `docs/data/hbase.yml` we still have an inclusion of hbase-1.4 on lines 21-22, see
   
   
https://github.com/apache/flink-connector-hbase/blob/91d166d8c7ffb21ae065fce92618ecac74b7f9e7/docs/data/hbase.yml#L19-L24



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34456][configuration]Move all checkpoint-related options into CheckpointingOptions [flink]

2024-03-05 Thread via GitHub


spoon-lz commented on PR #24374:
URL: https://github.com/apache/flink/pull/24374#issuecomment-1980033057

   @Zakelly A separate PR seems more reasonable; I will submit one to split 
`ExternalizedCheckpointCleanup`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34582) release build tools lost the newly added py3.11 packages for mac

2024-03-05 Thread Xingbo Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingbo Huang updated FLINK-34582:
-
Fix Version/s: 1.19.0

> release build tools lost the newly added py3.11 packages for mac
> 
>
> Key: FLINK-34582
> URL: https://issues.apache.org/jira/browse/FLINK-34582
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.19.0, 1.20.0
>Reporter: lincoln lee
>Assignee: Xingbo Huang
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> During the 1.19.0-rc1 release, building binaries via 
> tools/releasing/create_binary_release.sh
> lost the 2 newly added py3.11 packages for mac



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-34582) release build tools lost the newly added py3.11 packages for mac

2024-03-05 Thread Xingbo Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingbo Huang closed FLINK-34582.

Resolution: Fixed

Merged into master via a9d9bab47a6b2a9520f7d2b6f3791690df50e214

Merged into release-1.19 via fa738bb09310a0012b5c8341e403c597855079b1

> release build tools lost the newly added py3.11 packages for mac
> 
>
> Key: FLINK-34582
> URL: https://issues.apache.org/jira/browse/FLINK-34582
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.19.0, 1.20.0
>Reporter: lincoln lee
>Assignee: Xingbo Huang
>Priority: Critical
>  Labels: pull-request-available
>
> During the 1.19.0-rc1 release, building binaries via 
> tools/releasing/create_binary_release.sh
> lost the 2 newly added py3.11 packages for mac



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34566) Flink Kubernetes Operator reconciliation parallelism setting not work

2024-03-05 Thread Fei Feng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823848#comment-17823848
 ] 

Fei Feng commented on FLINK-34566:
--

OK, I have reviewed their changes.
 

> Flink Kubernetes Operator reconciliation parallelism setting not work
> -
>
> Key: FLINK-34566
> URL: https://issues.apache.org/jira/browse/FLINK-34566
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.7.0
>Reporter: Fei Feng
>Assignee: Fei Feng
>Priority: Blocker
>  Labels: pull-request-available
> Attachments: image-2024-03-04-10-58-37-679.png, 
> image-2024-03-04-11-17-22-877.png, image-2024-03-04-11-31-44-451.png
>
>
> After we upgraded JOSDK from version 4.3.0 to version 4.4.2 in FLINK-33005, 
> we can no longer enlarge the reconciliation parallelism, and the maximum 
> reconciliation parallelism is only 10. This delays the reconciliation of 
> FlinkDeployment and SessionJob by about 10-30 seconds when we have a 
> large-scale Flink session cluster and session jobs in the k8s cluster.
>  
> After investigating and validating, I found that the reason is that the logic 
> for reconciliation thread pool creation in JOSDK changed significantly 
> between these two versions.
> In v4.3.0:
> the reconciliation thread pool was created as a FixedThreadPool (maximumPoolSize 
> same as corePoolSize), so we passed the reconciliation parallelism and got a 
> thread pool that matched our expectations.
> !image-2024-03-04-10-58-37-679.png|width=497,height=91!
> [https://github.com/operator-framework/java-operator-sdk/blob/v4.3.0/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ConfigurationServiceOverrider.java#L198]
>  
> But in v4.4.2:
> the reconciliation thread pool is created as a custom executor for which we 
> can pass corePoolSize and maximumPoolSize. The problem is that we only set 
> the maximumPoolSize of the thread pool, while the corePoolSize defaults to 10. 
> This causes the thread pool size to be only 10, and the majority of events 
> are placed in the workQueue for a while.
> !image-2024-03-04-11-17-22-877.png|width=569,height=112!
> [https://github.com/operator-framework/java-operator-sdk/blob/v4.4.2/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ExecutorServiceManager.java#L37]
>  
> The solution is also simple: we can create and pass the thread pool in the 
> Flink Kubernetes operator so that we can control the reconciliation thread 
> pool directly, such as:
> !image-2024-03-04-11-31-44-451.png|width=483,height=98!
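
The core/max pool-size behavior described above can be illustrated with plain JDK executors. This is a minimal sketch, not operator or JOSDK code; the `parallelism` value is a hypothetical operator setting. With an unbounded queue, a `ThreadPoolExecutor` only ever grows to `corePoolSize` threads, so raising only `maximumPoolSize` has no effect, while a FixedThreadPool sets both to the requested parallelism:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ReconcilerPoolSketch {
    public static void main(String[] args) {
        int parallelism = 50; // hypothetical operator setting

        // Custom executor shape as in JOSDK 4.4.x: only maximumPoolSize is
        // raised, corePoolSize stays at 10. With an unbounded work queue,
        // extra tasks are queued instead of spawning threads beyond 10.
        ThreadPoolExecutor custom = new ThreadPoolExecutor(
                10, parallelism, 1, TimeUnit.MINUTES, new LinkedBlockingQueue<>());
        System.out.println("custom core=" + custom.getCorePoolSize()
                + " max=" + custom.getMaximumPoolSize());

        // Proposed fix: a FixedThreadPool, where core == max == parallelism,
        // matching the v4.3.0 behavior described in the issue.
        ThreadPoolExecutor fixed =
                (ThreadPoolExecutor) Executors.newFixedThreadPool(parallelism);
        System.out.println("fixed core=" + fixed.getCorePoolSize()
                + " max=" + fixed.getMaximumPoolSize());

        custom.shutdown();
        fixed.shutdown();
    }
}
```

Running the sketch prints `custom core=10 max=50` and `fixed core=50 max=50`, which is why passing a FixedThreadPool restores the expected reconciliation parallelism.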



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-16627) Support only generate non-null values when serializing into JSON

2024-03-05 Thread yisha zhou (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823845#comment-17823845
 ] 

yisha zhou commented on FLINK-16627:


[~libenchao]  Thanks a lot for your help!

> Support only generate non-null values when serializing into JSON
> 
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: New Feature
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Planner
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Assignee: yisha zhou
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> auto-unassigned, pull-request-available, sprint
> Fix For: 1.20.0
>
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
> def eval(str: String) : String= { 
>if(str == null){
>return ""
>}else{
>return str
>}
> }
> 
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes the svt's value is null, and the record inserted into Kafka is JSON 
> like \{"subtype":"qin","svt":null}
> If the amount of data is small, this is acceptable, but we process 10TB of data 
> every day, and there may be many nulls in the JSON, which affects 
> efficiency. If a parameter could be added to remove the null keys when defining 
> a sink table, the performance would be greatly improved.
>  
>  
>  
>  
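
Pending such a format option, the requested behavior can be sketched without any JSON library. The `toJson` helper below is purely illustrative (it is not Flink's JSON serialization schema and handles only flat string fields); it simply skips null-valued keys when emitting a row:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class NonNullJsonSketch {
    // Illustrative only: emit a flat map of string fields as JSON,
    // dropping entries whose value is null, which is the behavior the
    // issue asks the Flink JSON format to support via a config option.
    static String toJson(Map<String, String> row) {
        StringJoiner sj = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : row.entrySet()) {
            if (e.getValue() != null) {
                sj.add("\"" + e.getKey() + "\":\"" + e.getValue() + "\"");
            }
        }
        return sj.toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("subtype", "qin");
        row.put("svt", null); // null field from the upstream source
        System.out.println(toJson(row)); // the "svt" key is omitted entirely
    }
}
```

At 10TB/day, omitting null keys like this shrinks every record that carries them, which is the efficiency gain the reporter is after.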



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34582][realse][python] Updates cibuildwheel to support cpython 3.11 wheel package [flink]

2024-03-05 Thread via GitHub


HuangXingBo closed pull request #24439: [FLINK-34582][realse][python] Updates 
cibuildwheel to support cpython 3.11 wheel package
URL: https://github.com/apache/flink/pull/24439


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [hotfix] [docs] Flink comment typo in pyflink datastream [flink]

2024-03-05 Thread via GitHub


flinkbot commented on PR #24440:
URL: https://github.com/apache/flink/pull/24440#issuecomment-1980006049

   
   ## CI report:
   
   * 6167e861d0f03efc743588946e39274abe48dab9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34582][realse][python] Updates cibuildwheel to support cpython 3.11 wheel package [flink]

2024-03-05 Thread via GitHub


HuangXingBo commented on PR #24439:
URL: https://github.com/apache/flink/pull/24439#issuecomment-1980003795

   
https://dev.azure.com/hxbks2ks/FLINK-TEST/_build/results?buildId=2229&view=artifacts&pathAsName=false&type=publishedArtifacts
   build in my azure pipeline


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [hotfix] [docs] Flink comment typo in pyflink datastream [flink]

2024-03-05 Thread via GitHub


mynkyu opened a new pull request, #24440:
URL: https://github.com/apache/flink/pull/24440

   
   
   ## What is the purpose of the change
   - Comment typo fix PR
   
   ## Brief change log
   ASIS: `THe parallelism for this operator.`
   TOBE: `The parallelism for this operator.`
   
   ## Verifying this change
   This change is a typo cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (FLINK-34184) Update copyright and license file

2024-03-05 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu resolved FLINK-34184.

Resolution: Fixed

Fixed in flink-cdc(master) via: a6c1b06e11004918cb6e5714ade3699e052e1aad

> Update copyright and license file
> -
>
> Key: FLINK-34184
> URL: https://issues.apache.org/jira/browse/FLINK-34184
> Project: Flink
>  Issue Type: Sub-task
>  Components: Flink CDC
>Reporter: Leonard Xu
>Assignee: Hang Ruan
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34184) Update copyright and license file

2024-03-05 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-34184:
---
Fix Version/s: cdc-3.1.0

> Update copyright and license file
> -
>
> Key: FLINK-34184
> URL: https://issues.apache.org/jira/browse/FLINK-34184
> Project: Flink
>  Issue Type: Sub-task
>  Components: Flink CDC
>Reporter: Leonard Xu
>Assignee: Hang Ruan
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [hotfix] [docs] Flink comment typo in pyflink datastream [flink]

2024-03-05 Thread via GitHub


mynkyu closed pull request #24434: [hotfix] [docs] Flink comment typo in 
pyflink datastream
URL: https://github.com/apache/flink/pull/24434


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34582][realse][python] Updates cibuildwheel to support cpython 3.11 wheel package [flink]

2024-03-05 Thread via GitHub


flinkbot commented on PR #24439:
URL: https://github.com/apache/flink/pull/24439#issuecomment-1979991024

   
   ## CI report:
   
   * 7b165aad01493ad32124c9e5b807c0f75a64aead UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-32081) Compatibility between file-merging on and off across job runs

2024-03-05 Thread Zakelly Lan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan updated FLINK-32081:

Fix Version/s: 1.20.0
   (was: 1.19.0)

> Compatibility between file-merging on and off across job runs
> -
>
> Key: FLINK-32081
> URL: https://issues.apache.org/jira/browse/FLINK-32081
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Jinzhong Li
>Priority: Major
> Fix For: 1.20.0
>
>






[jira] [Updated] (FLINK-32083) Chinese translation of documentation of checkpoint file-merging

2024-03-05 Thread Zakelly Lan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan updated FLINK-32083:

Fix Version/s: 1.20.0
   (was: 1.19.0)

> Chinese translation of documentation of checkpoint file-merging
> ---
>
> Key: FLINK-32083
> URL: https://issues.apache.org/jira/browse/FLINK-32083
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Hangxiang Yu
>Priority: Major
> Fix For: 1.20.0
>
>






[jira] [Updated] (FLINK-32082) Documentation of checkpoint file-merging

2024-03-05 Thread Zakelly Lan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan updated FLINK-32082:

Fix Version/s: 1.20.0
   (was: 1.19.0)

> Documentation of checkpoint file-merging
> 
>
> Key: FLINK-32082
> URL: https://issues.apache.org/jira/browse/FLINK-32082
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Yanfei Lei
>Priority: Major
> Fix For: 1.20.0
>
>






[jira] [Updated] (FLINK-32080) Restoration of FileMergingSnapshotManager

2024-03-05 Thread Zakelly Lan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan updated FLINK-32080:

Fix Version/s: 1.20.0
   (was: 1.19.0)

> Restoration of FileMergingSnapshotManager
> -
>
> Key: FLINK-32080
> URL: https://issues.apache.org/jira/browse/FLINK-32080
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Jinzhong Li
>Priority: Major
> Fix For: 1.20.0
>
>






[jira] [Updated] (FLINK-32079) Read/write checkpoint metadata of merged files

2024-03-05 Thread Zakelly Lan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan updated FLINK-32079:

Fix Version/s: 1.20.0
   (was: 1.19.0)

> Read/write checkpoint metadata of merged files
> --
>
> Key: FLINK-32079
> URL: https://issues.apache.org/jira/browse/FLINK-32079
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Hangxiang Yu
>Priority: Major
> Fix For: 1.20.0
>
>






[jira] [Updated] (FLINK-32074) Support file merging across checkpoints

2024-03-05 Thread Zakelly Lan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan updated FLINK-32074:

Fix Version/s: 1.20.0
   (was: 1.19.0)

> Support file merging across checkpoints
> ---
>
> Key: FLINK-32074
> URL: https://issues.apache.org/jira/browse/FLINK-32074
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Zakelly Lan
>Priority: Major
> Fix For: 1.20.0
>
>






[jira] [Updated] (FLINK-32076) Add file pool for concurrent file reusing

2024-03-05 Thread Zakelly Lan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan updated FLINK-32076:

Fix Version/s: 1.20.0
   (was: 1.19.0)

> Add file pool for concurrent file reusing
> -
>
> Key: FLINK-32076
> URL: https://issues.apache.org/jira/browse/FLINK-32076
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Hangxiang Yu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>






[jira] [Updated] (FLINK-32078) Implement private state file merging

2024-03-05 Thread Zakelly Lan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan updated FLINK-32078:

Fix Version/s: 1.20.0
   (was: 1.19.0)

> Implement private state file merging
> 
>
> Key: FLINK-32078
> URL: https://issues.apache.org/jira/browse/FLINK-32078
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Yanfei Lei
>Priority: Major
> Fix For: 1.20.0
>
>






[jira] [Updated] (FLINK-32077) Implement shared state file merging

2024-03-05 Thread Zakelly Lan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan updated FLINK-32077:

Fix Version/s: 1.20.0
   (was: 1.19.0)

> Implement shared state file merging
> ---
>
> Key: FLINK-32077
> URL: https://issues.apache.org/jira/browse/FLINK-32077
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Zakelly Lan
>Priority: Major
> Fix For: 1.20.0
>
>






[jira] [Updated] (FLINK-34582) release build tools lost the newly added py3.11 packages for mac

2024-03-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-34582:
---
Labels: pull-request-available  (was: )

> release build tools lost the newly added py3.11 packages for mac
> 
>
> Key: FLINK-34582
> URL: https://issues.apache.org/jira/browse/FLINK-34582
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.19.0, 1.20.0
>Reporter: lincoln lee
>Assignee: Xingbo Huang
>Priority: Critical
>  Labels: pull-request-available
>
> During the 1.19.0-rc1 binary build via tools/releasing/create_binary_release.sh,
> the two newly added py3.11 packages for mac were lost.





[PR] [FLINK-34582][realse][python] Updates cibuildwheel to support cpython 3.11 wheel package [flink]

2024-03-05 Thread via GitHub


HuangXingBo opened a new pull request, #24439:
URL: https://github.com/apache/flink/pull/24439

   ## What is the purpose of the change
   
   *This pull request will update cibuildwheel to support cpython 3.11 wheel 
package*
   
   
   ## Brief change log
   
 - *Updates cibuildwheel to support cpython 3.11 wheel package*
   
   
   ## Verifying this change
   
  - *Trigger the wheel build manually*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
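   For context, cibuildwheel selects target interpreters through build selectors. A minimal sketch of how a CPython 3.11 macOS wheel build can be restricted follows; the selector values and package path are illustrative assumptions, not taken from this PR:

   ```shell
   # Illustrative cibuildwheel selectors (assumptions, not from the PR):
   # CIBW_BUILD limits the build to CPython 3.11 wheels, and
   # CIBW_ARCHS_MACOS covers both Intel and Apple Silicon targets.
   export CIBW_BUILD="cp311-*"
   export CIBW_ARCHS_MACOS="x86_64 arm64"
   # The actual invocation would then be along the lines of:
   # python -m cibuildwheel --platform macos --output-dir wheelhouse flink-python
   ```

   `CIBW_BUILD` and `CIBW_ARCHS_MACOS` are standard cibuildwheel environment variables; the invocation is left commented out as a sketch, since the real build runs inside the release tooling.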
   





[jira] [Created] (FLINK-34582) release build tools lost the newly added py3.11 packages for mac

2024-03-05 Thread lincoln lee (Jira)
lincoln lee created FLINK-34582:
---

 Summary: release build tools lost the newly added py3.11 packages 
for mac
 Key: FLINK-34582
 URL: https://issues.apache.org/jira/browse/FLINK-34582
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.19.0, 1.20.0
Reporter: lincoln lee
Assignee: Xingbo Huang


During the 1.19.0-rc1 binary build via tools/releasing/create_binary_release.sh,
the two newly added py3.11 packages for mac were lost.





Re: [PR] [FLINK-34456][configuration]Move all checkpoint-related options into CheckpointingOptions [flink]

2024-03-05 Thread via GitHub


Zakelly commented on PR #24374:
URL: https://github.com/apache/flink/pull/24374#issuecomment-1979977569

   > @Zakelly I also saw the conclusion of the discussion in the email. I will 
re-edit this PR after #24381 is merged. Another question is, "Split 
ExternalizedCheckpointCleanup out of CheckpointConfig and move it to flink-core 
module". Does this change also need to be discussed?
   
   I think this is much easier since it is only annotated with 
@\PublicEvolving, so we can do it in a similar way without another 
discussion. A separate PR is also fine.





Re: [PR] [FLINK-34456][configuration]Move all checkpoint-related options into CheckpointingOptions [flink]

2024-03-05 Thread via GitHub


spoon-lz commented on PR #24374:
URL: https://github.com/apache/flink/pull/24374#issuecomment-1979967662

   @Zakelly I also saw the conclusion of the discussion in the email. I will 
re-edit this PR after #24381 is merged. Another question is, "Split 
ExternalizedCheckpointCleanup out of CheckpointConfig and move it to flink-core 
module". Does this change also need to be discussed?





[jira] [Closed] (FLINK-33210) Introduce lineage graph relevant interfaces

2024-03-05 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-33210.
-
Fix Version/s: 1.20.0
   Resolution: Resolved

Closed by 6b9c282b5090720a02b941586cb7f5389691dba9

> Introduce lineage graph relevant interfaces 
> 
>
> Key: FLINK-33210
> URL: https://issues.apache.org/jira/browse/FLINK-33210
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream, Table SQL / API
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> Introduce LineageGraph, LineageVertex and LineageEdge interfaces





Re: [PR] [hotfix][docs] Modify obvious errors in the doc. [flink]

2024-03-05 Thread via GitHub


flinkbot commented on PR #24438:
URL: https://github.com/apache/flink/pull/24438#issuecomment-1979939628

   
   ## CI report:
   
   * 4269bebf26d5b1fd9699183a4caef207e4071806 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





Re: [PR] [FLINK-33210] Introduce lineage graph relevant interfaces [flink]

2024-03-05 Thread via GitHub


FangYongs merged PR #23626:
URL: https://github.com/apache/flink/pull/23626





[PR] [hotfix][docs] Modify obvious errors in the doc. [flink]

2024-03-05 Thread via GitHub


damonxue opened a new pull request, #24438:
URL: https://github.com/apache/flink/pull/24438

   ## What is the purpose of the change
   
   Modify obvious errors in the doc.
   
   
   ## Brief change log
   
   *(for example:)*
 - *Apahe => Apache*
   
   ## Documentation
   
 - Does this pull request introduce a new feature? 
 - No.
   





[jira] [Created] (FLINK-34581) streaming code throws java.lang.reflect.InaccessibleObjectException

2024-03-05 Thread Henning Schmiedehausen (Jira)
Henning Schmiedehausen created FLINK-34581:
--

 Summary: streaming code throws 
java.lang.reflect.InaccessibleObjectException
 Key: FLINK-34581
 URL: https://issues.apache.org/jira/browse/FLINK-34581
 Project: Flink
  Issue Type: Bug
  Components: API / DataStream
Affects Versions: 1.18.1
Reporter: Henning Schmiedehausen


I have a pretty simple test pipeline (read a bunch of tables from Apache Kafka, 
join and project them, then write to Apache Iceberg) that I run locally with 
JUnit 5 and the MiniClusterExtension. This works fine with Java 11 on both 
Flink 1.17.2 and Flink 1.18.1.

With Java 17, I see

{{[WARN ] IcebergStreamWriter (1/1)#1 
(78d9251dbbab3aae84bf303dfc080d23_626c1e687bcaad0c13c507629a5894a8_0_1) 
switched from RUNNING to FAILED with failure cause:}}
{{java.io.IOException: Could not perform checkpoint 2 for operator 
IcebergStreamWriter (1/1)#1.}}
{{at 
org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1275)}}
{{at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:147)}}
{{at 
org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.triggerCheckpoint(SingleCheckpointBarrierHandler.java:287)}}
{{at 
org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.access$100(SingleCheckpointBarrierHandler.java:64)}}
{{at 
org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler$ControllerImpl.triggerGlobalCheckpoint(SingleCheckpointBarrierHandler.java:488)}}
{{at 
org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.triggerGlobalCheckpoint(AbstractAlignedBarrierHandlerState.java:74)}}
{{at 
org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.barrierReceived(AbstractAlignedBarrierHandlerState.java:66)}}
{{at 
org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.lambda$processBarrier$2(SingleCheckpointBarrierHandler.java:234)}}
{{at 
org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.markCheckpointAlignedAndTransformState(SingleCheckpointBarrierHandler.java:262)}}
{{at 
org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.processBarrier(SingleCheckpointBarrierHandler.java:231)}}
{{at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:181)}}
{{at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:159)}}
{{at 
org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:118)}}
{{at 
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)}}
{{at 
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:562)}}
{{at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)}}
{{at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858)}}
{{at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:807)}}
{{at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)}}
{{at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:932)}}
{{at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)}}
{{at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)}}
{{at java.base/java.lang.Thread.run(Thread.java:840)}}
{{Caused by: java.lang.RuntimeException: 
java.lang.reflect.InaccessibleObjectException: Unable to make field private 
final java.lang.Object[] java.util.Arrays$ArrayList.a accessible: module 
java.base does not "opens java.util" to unnamed module @6af93788}}
{{at 
com.twitter.chill.java.ArraysAsListSerializer.<init>(ArraysAsListSerializer.java:69)}}
{{at 
org.apache.flink.api.java.typeutils.runtime.kryo.FlinkChillPackageRegistrar.registerSerializers(FlinkChillPackageRegistrar.java:67)}}
{{at 
org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.getKryoInstance(KryoSerializer.java:513)}}
{{at 
org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.checkKryoInitialized(KryoSerializer.java:522)}}
{{at 
org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.serialize(KryoSerializer.java:348)}}
{{at 
org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:165)}}
{{at 
org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:43)}}
{{at 
org.apache.flink.runtime.plugable.SerializationDelegate.write(SerializationDelegate.java:54)}}
{{at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.se
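
The root cause above (`InaccessibleObjectException` under Java 17) is the Java module system refusing reflective access into `java.util`. A commonly used workaround for this class of failures — hedged, since it has not been confirmed by the reporter for this case — is to open the affected JDK package to unnamed modules via JVM options, e.g. for a local JUnit run:

{code:bash}
# Hedged workaround sketch: open java.base/java.util so that the
# chill/Kryo ArraysAsListSerializer can reflect on Arrays$ArrayList.
# JDK_JAVA_OPTIONS is picked up by the java launcher (JDK 9+); on a
# Flink cluster the same flag would typically go into the
# env.java.opts.* configuration options instead of this env var.
export JDK_JAVA_OPTIONS="--add-opens=java.base/java.util=ALL-UNNAMED"
{code}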

Re: [PR] [hotfix][docs] Modify obvious errors in the doc. [flink]

2024-03-05 Thread via GitHub


damonxue closed pull request #24425: [hotfix][docs] Modify obvious errors in 
the doc.
URL: https://github.com/apache/flink/pull/24425





[jira] [Commented] (FLINK-34530) Build Release Candidate: 1.19.0-rc1

2024-03-05 Thread lincoln lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823828#comment-17823828
 ] 

lincoln lee commented on FLINK-34530:
-

Rechecked the python wheel packages 
[https://dev.azure.com/lincoln86xy/lincoln86xy/_build/results?buildId=640&view=artifacts&pathAsName=false&type=publishedArtifacts]

and also the recent changes related to this: 
https://issues.apache.org/jira/browse/FLINK-33030

It seems we lost the newly added py3.11 packages for mac (both 10.9 & 11.0), 
which explains the delta during binary release creation. [~dianfu] 

> Build Release Candidate: 1.19.0-rc1
> ---
>
> Key: FLINK-34530
> URL: https://issues.apache.org/jira/browse/FLINK-34530
> Project: Flink
>  Issue Type: New Feature
>Affects Versions: 1.19.0
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
> Fix For: 1.19.0
>
>
> The core of the release process is the build-vote-fix cycle. Each cycle 
> produces one release candidate. The Release Manager repeats this cycle until 
> the community approves one release candidate, which is then finalized.
> h4. Prerequisites
> Set up a few environment variables to simplify Maven commands that follow. 
> This identifies the release candidate being built. Start with {{RC_NUM}} 
> equal to 1 and increment it for each candidate:
> {code}
> RC_NUM="1"
> TAG="release-${RELEASE_VERSION}-rc${RC_NUM}"
> {code}







[jira] [Comment Edited] (FLINK-34517) environment configs ignored when calling procedure operation

2024-03-05 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823069#comment-17823069
 ] 

luoyuxia edited comment on FLINK-34517 at 3/6/24 1:45 AM:
--

1.18: 620c5a7aeba448d247107ad44a6ba6f1e759052e

1.19: todo

master: 24b6f7cbf94fca154fc8680e8b3393abd68b8e77


was (Author: luoyuxia):
1.18: 620c5a7aeba448d247107ad44a6ba6f1e759052e

1.19: todo

master: todo

> environment configs ignored when calling procedure operation
> 
>
> Key: FLINK-34517
> URL: https://issues.apache.org/jira/browse/FLINK-34517
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: JustinLee
>Assignee: JustinLee
>Priority: Major
>  Labels: pull-request-available
>
> When calling a procedure operation in Flink SQL, the ProcedureContext only 
> contains the underlying application-specific config, not the 
> environment-specific config.
> To be more specific: in a Flink SQL app whose StreamExecutionEnvironment 
> has a config1, config1 takes effect when executing a SQL query but not 
> when calling a SQL procedure, which is clearly not the expected behavior.
>  





Re: [PR] [FLINK-34517][table]fix environment configs ignored when calling procedure operation [flink]

2024-03-05 Thread via GitHub


luoyuxia merged PR #24397:
URL: https://github.com/apache/flink/pull/24397





[jira] [Commented] (FLINK-34534) Vote on the release candidate

2024-03-05 Thread lincoln lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823822#comment-17823822
 ] 

lincoln lee commented on FLINK-34534:
-

https://lists.apache.org/thread/10bxy1zhzy6hycjyohyl3pzx3xs3zh34

> Vote on the release candidate
> -
>
> Key: FLINK-34534
> URL: https://issues.apache.org/jira/browse/FLINK-34534
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.19.0
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
> Fix For: 1.19.0
>
>
> Once you have built and individually reviewed the release candidate, please 
> share it for the community-wide review. Please review foundation-wide [voting 
> guidelines|http://www.apache.org/foundation/voting.html] for more information.
> Start the review-and-vote thread on the dev@ mailing list. Here’s an email 
> template; please adjust as you see fit.
> {quote}From: Release Manager
> To: d...@flink.apache.org
> Subject: [VOTE] Release 1.2.3, release candidate #3
> Hi everyone,
> Please review and vote on the release candidate #3 for the version 1.2.3, as 
> follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
> The complete staging area is available for your review, which includes:
>  * JIRA release notes [1],
>  * the official Apache source release and binary convenience releases to be 
> deployed to dist.apache.org [2], which are signed with the key with 
> fingerprint  [3],
>  * all artifacts to be deployed to the Maven Central Repository [4],
>  * source code tag "release-1.2.3-rc3" [5],
>  * website pull request listing the new release and adding announcement blog 
> post [6].
> The vote will be open for at least 72 hours. It is adopted by majority 
> approval, with at least 3 PMC affirmative votes.
> Thanks,
> Release Manager
> [1] link
> [2] link
> [3] [https://dist.apache.org/repos/dist/release/flink/KEYS]
> [4] link
> [5] link
> [6] link
> {quote}
> *If there are any issues found in the release candidate, reply on the vote 
> thread to cancel the vote.* There’s no need to wait 72 hours. Proceed to the 
> Fix Issues step below and address the problem. However, some issues don’t 
> require cancellation. For example, if an issue is found in the website pull 
> request, just correct it on the spot and the vote can continue as-is.
> For cancelling a release, the release manager needs to send an email to the 
> release candidate thread, stating that the release candidate is officially 
> cancelled. Next, all artifacts created specifically for the RC in the 
> previous steps need to be removed:
>  * Delete the staging repository in Nexus
>  * Remove the source / binary RC files from dist.apache.org
>  * Delete the source code tag in git
> *If there are no issues, reply on the vote thread to close the voting.* Then, 
> tally the votes in a separate email. Here’s an email template; please adjust 
> as you see fit.
> {quote}From: Release Manager
> To: d...@flink.apache.org
> Subject: [RESULT] [VOTE] Release 1.2.3, release candidate #3
> I'm happy to announce that we have unanimously approved this release.
> There are XXX approving votes, XXX of which are binding:
>  * approver 1
>  * approver 2
>  * approver 3
>  * approver 4
> There are no disapproving votes.
> Thanks everyone!
> {quote}
>  
> 
> h3. Expectations
>  * Community votes to release the proposed candidate, with at least three 
> approving PMC votes
> Any issues that are raised till the vote is over should be either resolved or 
> moved into the next release (if applicable).





[jira] [Assigned] (FLINK-34534) Vote on the release candidate

2024-03-05 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee reassigned FLINK-34534:
---

Assignee: lincoln lee

> Vote on the release candidate
> -
>
> Key: FLINK-34534
> URL: https://issues.apache.org/jira/browse/FLINK-34534
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.19.0
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
> Fix For: 1.19.0
>
>
> Once you have built and individually reviewed the release candidate, please 
> share it for the community-wide review. Please review foundation-wide [voting 
> guidelines|http://www.apache.org/foundation/voting.html] for more information.
> Start the review-and-vote thread on the dev@ mailing list. Here’s an email 
> template; please adjust as you see fit.
> {quote}From: Release Manager
> To: d...@flink.apache.org
> Subject: [VOTE] Release 1.2.3, release candidate #3
> Hi everyone,
> Please review and vote on the release candidate #3 for the version 1.2.3, as 
> follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
> The complete staging area is available for your review, which includes:
>  * JIRA release notes [1],
>  * the official Apache source release and binary convenience releases to be 
> deployed to dist.apache.org [2], which are signed with the key with 
> fingerprint  [3],
>  * all artifacts to be deployed to the Maven Central Repository [4],
>  * source code tag "release-1.2.3-rc3" [5],
>  * website pull request listing the new release and adding announcement blog 
> post [6].
> The vote will be open for at least 72 hours. It is adopted by majority 
> approval, with at least 3 PMC affirmative votes.
> Thanks,
> Release Manager
> [1] link
> [2] link
> [3] [https://dist.apache.org/repos/dist/release/flink/KEYS]
> [4] link
> [5] link
> [6] link
> {quote}
> *If there are any issues found in the release candidate, reply on the vote 
> thread to cancel the vote.* There’s no need to wait 72 hours. Proceed to the 
> Fix Issues step below and address the problem. However, some issues don’t 
> require cancellation. For example, if an issue is found in the website pull 
> request, just correct it on the spot and the vote can continue as-is.
> For cancelling a release, the release manager needs to send an email to the 
> release candidate thread, stating that the release candidate is officially 
> cancelled. Next, all artifacts created specifically for the RC in the 
> previous steps need to be removed:
>  * Delete the staging repository in Nexus
>  * Remove the source / binary RC files from dist.apache.org
>  * Delete the source code tag in git
> *If there are no issues, reply on the vote thread to close the voting.* Then, 
> tally the votes in a separate email. Here’s an email template; please adjust 
> as you see fit.
> {quote}From: Release Manager
> To: d...@flink.apache.org
> Subject: [RESULT] [VOTE] Release 1.2.3, release candidate #3
> I'm happy to announce that we have unanimously approved this release.
> There are XXX approving votes, XXX of which are binding:
>  * approver 1
>  * approver 2
>  * approver 3
>  * approver 4
> There are no disapproving votes.
> Thanks everyone!
> {quote}
>  
> 
> h3. Expectations
>  * Community votes to release the proposed candidate, with at least three 
> approving PMC votes
> Any issues that are raised till the vote is over should be either resolved or 
> moved into the next release (if applicable).





[jira] [Closed] (FLINK-34531) Build and stage Java and Python artifacts

2024-03-05 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-34531.
---
Resolution: Fixed

> Build and stage Java and Python artifacts
> -
>
> Key: FLINK-34531
> URL: https://issues.apache.org/jira/browse/FLINK-34531
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
>
> # Create a local release branch ((!) this step can not be skipped for minor 
> releases):
> {code:bash}
> $ cd ./tools
> tools/ $ OLD_VERSION=$CURRENT_SNAPSHOT_VERSION NEW_VERSION=$RELEASE_VERSION 
> RELEASE_CANDIDATE=$RC_NUM releasing/create_release_branch.sh
> {code}
>  # Tag the release commit:
> {code:bash}
> $ git tag -s ${TAG} -m "${TAG}"
> {code}
>  # We now need to do several things:
>  ## Create the source release archive
>  ## Deploy jar artefacts to the [Apache Nexus 
> Repository|https://repository.apache.org/], which is the staging area for 
> deploying the jars to Maven Central
>  ## Build PyFlink wheel packages
> You might want to create a directory on your local machine for collecting the 
> various source and binary releases before uploading them. Creating the binary 
> releases is a lengthy process but you can do this on another machine (for 
> example, in the "cloud"). When doing this, you can skip signing the release 
> files on the remote machine, download them to your local machine and sign 
> them there.
>  # Build the source release:
> {code:bash}
> tools $ RELEASE_VERSION=$RELEASE_VERSION releasing/create_source_release.sh
> {code}
>  # Stage the maven artifacts:
> {code:bash}
> tools $ releasing/deploy_staging_jars.sh
> {code}
> Review all staged artifacts ([https://repository.apache.org/]). They should 
> contain all relevant parts for each module, including pom.xml, jar, test jar, 
> source, test source, javadoc, etc. Carefully review any new artifacts.
>  # Close the staging repository on Apache Nexus. When prompted for a 
> description, enter “Apache Flink, version X, release candidate Y”.
> Then, you need to build the PyFlink wheel packages (since 1.11):
>  # Set up an azure pipeline in your own Azure account. You can refer to 
> [Azure 
> Pipelines|https://cwiki.apache.org/confluence/display/FLINK/Azure+Pipelines#AzurePipelines-Tutorial:SettingupAzurePipelinesforaforkoftheFlinkrepository]
>  for more details on how to set up azure pipeline for a fork of the Flink 
> repository. Note that a google cloud mirror in Europe is used for downloading 
> maven artifacts, therefore it is recommended to set your [Azure organization 
> region|https://docs.microsoft.com/en-us/azure/devops/organizations/accounts/change-organization-location]
>  to Europe to speed up the downloads.
>  # Push the release candidate branch to your forked personal Flink 
> repository, e.g.
> {code:bash}
> tools $ git push  
> refs/heads/release-${RELEASE_VERSION}-rc${RC_NUM}:release-${RELEASE_VERSION}-rc${RC_NUM}
> {code}
>  # Trigger the Azure Pipelines manually to build the PyFlink wheel packages
>  ## Go to your Azure Pipelines Flink project → Pipelines
>  ## Click the "New pipeline" button on the top right
>  ## Select "GitHub" → your GitHub Flink repository → "Existing Azure 
> Pipelines YAML file"
>  ## Select your branch → Set path to "/azure-pipelines.yaml" → click on 
> "Continue" → click on "Variables"
>  ## Then click "New Variable" button, fill the name with "MODE", and the 
> value with "release". Click "OK" to set the variable and the "Save" button to 
> save the variables, then back on the "Review your pipeline" screen click 
> "Run" to trigger the build.
>  ## You should now see a build where only the "CI build (release)" is running
>  # Download the PyFlink wheel packages from the build result page after the 
> jobs of "build_wheels mac" and "build_wheels linux" have finished.
>  ## Download the PyFlink wheel packages
>  ### Open the build result page of the pipeline
>  ### Go to the {{Artifacts}} page (build_wheels linux -> 1 artifact)
>  ### Click {{wheel_Darwin_build_wheels mac}} and {{wheel_Linux_build_wheels 
> linux}} separately to download the zip files
>  ## Unzip these two zip files
> {code:bash}
> $ cd /path/to/downloaded_wheel_packages
> $ unzip wheel_Linux_build_wheels\ linux.zip
> $ unzip wheel_Darwin_build_wheels\ mac.zip{code}
>  ## Create directory {{./dist}} under the directory of {{{}flink-python{}}}:
> {code:bash}
> $ cd 
> $ mkdir flink-python/dist{code}
>  ## Move the unzipped wheel packages to the directory of 
> {{{}flink-python/dist{}}}:
> {code:bash}
> $ mv /path/to/wheel_Darwin_build_wheels\ mac/* flink-python/dist/
> $ mv /path/to/wheel_Linux_build_wheels\ linux/* flink-python/dist/
> $ cd tools{code}
> Finally, we create the binary convenience release files:
> {code:bash}
> tools $ RELEASE_VERSION=

[jira] [Closed] (FLINK-34532) Stage source and binary releases on dist.apache.org

2024-03-05 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-34532.
---
Resolution: Fixed

[https://dist.apache.org/repos/dist/dev/flink/flink-1.19.0-rc1/]

[https://github.com/apache/flink/releases/tag/release-1.19.0-rc1]

> Stage source and binary releases on dist.apache.org
> ---
>
> Key: FLINK-34532
> URL: https://issues.apache.org/jira/browse/FLINK-34532
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
>
> Copy the source release to the dev repository of dist.apache.org:
> # If you have not already, check out the Flink section of the dev repository 
> on dist.apache.org via Subversion. In a fresh directory:
> {code:bash}
> $ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
> {code}
> # Make a directory for the new release and copy all the artifacts (Flink 
> source/binary distributions, hashes, GPG signatures and the python 
> subdirectory) into that newly created directory:
> {code:bash}
> $ mkdir flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
> $ mv /tools/releasing/release/* 
> flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
> {code}
> # Add and commit all the files.
> {code:bash}
> $ cd flink
> flink $ svn add flink-${RELEASE_VERSION}-rc${RC_NUM}
> flink $ svn commit -m "Add flink-${RELEASE_VERSION}-rc${RC_NUM}"
> {code}
> # Verify that files are present under 
> [https://dist.apache.org/repos/dist/dev/flink|https://dist.apache.org/repos/dist/dev/flink].
> # Push the release tag if not done already (the following command assumes to 
> be called from within the apache/flink checkout):
> {code:bash}
> $ git push  refs/tags/release-${RELEASE_VERSION}-rc${RC_NUM}
> {code}
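> The tag-push step above can be exercised locally end to end. The sketch 
> below uses a throwaway repository and a bare directory as a stand-in remote; 
> the tag name is illustrative, and an unsigned annotated tag (`-a`) replaces 
> the signed `git tag -s` so that no GPG key is required:

```shell
set -eu
work=$(mktemp -d)
remote=$(mktemp -d)
git init -q --bare "$remote"            # stand-in for the apache/flink remote
git init -q "$work" && cd "$work"
git -c user.name=rm -c user.email=rm@example.com \
    commit -q --allow-empty -m "release commit"
TAG=release-1.2.3-rc1                   # illustrative tag name
# The real flow signs the tag (git tag -s); -a avoids needing a GPG key here.
git -c user.name=rm -c user.email=rm@example.com tag -a "$TAG" -m "$TAG"
# Same refspec shape as in the step above, against the stand-in remote:
git push -q "$remote" "refs/tags/${TAG}"
git ls-remote --tags "$remote"
```

> Against a real apache/flink checkout, the only differences are the remote 
> name and the `-s` signing flag.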
>  
> 
> h3. Expectations
>  * Maven artifacts deployed to the staging repository of 
> [repository.apache.org|https://repository.apache.org/content/repositories/]
>  * Source distribution deployed to the dev repository of 
> [dist.apache.org|https://dist.apache.org/repos/dist/dev/flink/]
>  * Check hashes (e.g. shasum -c *.sha512)
>  * Check signatures (e.g. {{{}gpg --verify 
> flink-1.2.3-source-release.tar.gz.asc flink-1.2.3-source-release.tar.gz{}}})
>  * {{grep}} for legal headers in each file.
>  * If time allows, check in advance the NOTICE files of the modules whose 
> dependencies have changed in this release, since license issues pop up from 
> time to time during voting. See the "Checking License" section of [Verifying 
> a Flink 
> Release|https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Release].
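> The hash-check expectation above can be tried end to end. In this sketch a 
> scratch file stands in for the staged release artifacts (file names are 
> illustrative), and GNU `sha512sum -c` plays the role of the 
> `shasum -c *.sha512` command mentioned above:

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"
# Scratch file standing in for a real source-release tarball:
printf 'dummy release payload\n' > flink-1.2.3-src.tgz
sha512sum flink-1.2.3-src.tgz > flink-1.2.3-src.tgz.sha512
# Verification step as run against the staged release directory;
# prints "flink-1.2.3-src.tgz: OK" on success, exits non-zero otherwise.
sha512sum -c flink-1.2.3-src.tgz.sha512
```

> The same loop over `*.sha512` files (and `gpg --verify` on the `.asc` 
> files) is what a voter runs against the directory staged on dist.apache.org.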



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34417] Log Job ID via MDC [flink]

2024-03-05 Thread via GitHub


rkhachatryan commented on PR #24292:
URL: https://github.com/apache/flink/pull/24292#issuecomment-1979884146

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34566] Pass a FixedThreadPool to set reconciliation parallelism correctly [flink-kubernetes-operator]

2024-03-05 Thread via GitHub


ysymi commented on code in PR #790:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/790#discussion_r1513647273


##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/FlinkOperatorTest.java:
##
@@ -70,10 +70,21 @@ public void testConfigurationPassedToJOSDK() {
 
 var configService = 
testOperator.getOperator().getConfigurationService();
 
-// Test parallelism being passed
+// Test parallelism being passed expectedly
 var executorService = configService.getExecutorService();
 Assertions.assertInstanceOf(ThreadPoolExecutor.class, executorService);
 ThreadPoolExecutor threadPoolExecutor = (ThreadPoolExecutor) 
executorService;
+for (int i = 0; i < testParallelism * 2; i++) {
+threadPoolExecutor.execute(
+() -> {
+try {
+Thread.sleep(1000);
+} catch (InterruptedException e) {
+e.printStackTrace();
+}
+});
+}

Review Comment:
   I want to test whether the number of threads in the executor can reach our 
expected maximumPoolSize when the executor receives enough tasks; see the 
assertion on line 87.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34469) Implement TableDistribution toString

2024-03-05 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge updated FLINK-34469:

Affects Version/s: 1.20.0

> Implement TableDistribution toString
> 
>
> Key: FLINK-34469
> URL: https://issues.apache.org/jira/browse/FLINK-34469
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.20.0
>Reporter: Timo Walther
>Assignee: Jeyhun Karimov
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> The newly added TableDistribution misses a toString implementation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34509) Docs: Fix Debezium JSON/AVRO opts in examples

2024-03-05 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge updated FLINK-34509:

Fix Version/s: 1.20.0

> Docs: Fix Debezium JSON/AVRO opts in examples 
> --
>
> Key: FLINK-34509
> URL: https://issues.apache.org/jira/browse/FLINK-34509
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Lorenzo Affetti
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Problem found here: 
> [https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/formats/debezium/#how-to-use-debezium-format].
> Here is the example provided:
> {code:java}
> CREATE TABLE topic_products (
>   -- schema is totally the same to the MySQL "products" table
>   id BIGINT,
>   name STRING,
>   description STRING,
>   weight DECIMAL(10, 2)
> ) WITH (
>  'connector' = 'kafka',
>  'topic' = 'products_binlog',
>  'properties.bootstrap.servers' = 'localhost:9092',
>  'properties.group.id' = 'testGroup',
>  -- using 'debezium-json' as the format to interpret Debezium JSON messages
>  -- please use 'debezium-avro-confluent' if Debezium encodes messages in Avro 
> format
>  'format' = 'debezium-json'
> ) {code}
> Actually, `debezium-json` would require `'debezium-json.schema-include' = 
> 'true'` to work, since Debezium includes the schema by default. See 
> [https://www.markhneedham.com/blog/2023/01/24/flink-sql-could-not-execute-sql-statement-corrupt-debezium-message/].
> On the other hand, `debezium-avro-confluent` would require the URL of the 
> Confluent schema registry: `'debezium-avro-confluent.url' = 'http://...:8081'`.
> I propose to split the single example into two, each with the correct 
> default options.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34509) Docs: Fix Debezium JSON/AVRO opts in examples

2024-03-05 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge updated FLINK-34509:

Affects Version/s: 1.19.0
   1.20.0

> Docs: Fix Debezium JSON/AVRO opts in examples 
> --
>
> Key: FLINK-34509
> URL: https://issues.apache.org/jira/browse/FLINK-34509
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Lorenzo Affetti
>Priority: Minor
>  Labels: pull-request-available
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Problem found here: 
> [https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/formats/debezium/#how-to-use-debezium-format].
> Here is the example provided:
> {code:java}
> CREATE TABLE topic_products (
>   -- schema is totally the same to the MySQL "products" table
>   id BIGINT,
>   name STRING,
>   description STRING,
>   weight DECIMAL(10, 2)
> ) WITH (
>  'connector' = 'kafka',
>  'topic' = 'products_binlog',
>  'properties.bootstrap.servers' = 'localhost:9092',
>  'properties.group.id' = 'testGroup',
>  -- using 'debezium-json' as the format to interpret Debezium JSON messages
>  -- please use 'debezium-avro-confluent' if Debezium encodes messages in Avro 
> format
>  'format' = 'debezium-json'
> ) {code}
> Actually, `debezium-json` would require `'debezium-json.schema-include' = 
> 'true'` to work, since Debezium includes the schema by default. See 
> [https://www.markhneedham.com/blog/2023/01/24/flink-sql-could-not-execute-sql-statement-corrupt-debezium-message/].
> On the other hand, `debezium-avro-confluent` would require the URL of the 
> Confluent schema registry: `'debezium-avro-confluent.url' = 'http://...:8081'`.
> I propose to split the single example into two, each with the correct 
> default options.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-34509) Docs: Fix Debezium JSON/AVRO opts in examples

2024-03-05 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge closed FLINK-34509.
---
Resolution: Fixed

> Docs: Fix Debezium JSON/AVRO opts in examples 
> --
>
> Key: FLINK-34509
> URL: https://issues.apache.org/jira/browse/FLINK-34509
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Lorenzo Affetti
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Problem found here: 
> [https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/formats/debezium/#how-to-use-debezium-format].
> Here is the example provided:
> {code:java}
> CREATE TABLE topic_products (
>   -- schema is totally the same to the MySQL "products" table
>   id BIGINT,
>   name STRING,
>   description STRING,
>   weight DECIMAL(10, 2)
> ) WITH (
>  'connector' = 'kafka',
>  'topic' = 'products_binlog',
>  'properties.bootstrap.servers' = 'localhost:9092',
>  'properties.group.id' = 'testGroup',
>  -- using 'debezium-json' as the format to interpret Debezium JSON messages
>  -- please use 'debezium-avro-confluent' if Debezium encodes messages in Avro 
> format
>  'format' = 'debezium-json'
> ) {code}
> Actually, `debezium-json` would require `'debezium-json.schema-include' = 
> 'true'` to work, since Debezium includes the schema by default. See 
> [https://www.markhneedham.com/blog/2023/01/24/flink-sql-could-not-execute-sql-statement-corrupt-debezium-message/].
> On the other hand, `debezium-avro-confluent` would require the URL of the 
> Confluent schema registry: `'debezium-avro-confluent.url' = 'http://...:8081'`.
> I propose to split the single example into two, each with the correct 
> default options.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34509) Docs: Fix Debezium JSON/AVRO opts in examples

2024-03-05 Thread Jing Ge (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823793#comment-17823793
 ] 

Jing Ge commented on FLINK-34509:
-

master: 641f4f4d0d0156b84bdb9ba528b1dd96f7ae9d9c

> Docs: Fix Debezium JSON/AVRO opts in examples 
> --
>
> Key: FLINK-34509
> URL: https://issues.apache.org/jira/browse/FLINK-34509
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lorenzo Affetti
>Priority: Minor
>  Labels: pull-request-available
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Problem found here: 
> [https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/formats/debezium/#how-to-use-debezium-format].
> Here is the example provided:
> {code:java}
> CREATE TABLE topic_products (
>   -- schema is totally the same to the MySQL "products" table
>   id BIGINT,
>   name STRING,
>   description STRING,
>   weight DECIMAL(10, 2)
> ) WITH (
>  'connector' = 'kafka',
>  'topic' = 'products_binlog',
>  'properties.bootstrap.servers' = 'localhost:9092',
>  'properties.group.id' = 'testGroup',
>  -- using 'debezium-json' as the format to interpret Debezium JSON messages
>  -- please use 'debezium-avro-confluent' if Debezium encodes messages in Avro 
> format
>  'format' = 'debezium-json'
> ) {code}
> Actually, `debezium-json` would require `'debezium-json.schema-include' = 
> 'true'` to work, since Debezium includes the schema by default. See 
> [https://www.markhneedham.com/blog/2023/01/24/flink-sql-could-not-execute-sql-statement-corrupt-debezium-message/].
> On the other hand, `debezium-avro-confluent` would require the URL of the 
> Confluent schema registry: `'debezium-avro-confluent.url' = 'http://...:8081'`.
> I propose to split the single example into two, each with the correct 
> default options.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [WIP][FLINK-34442][table] Support optimizations for pre-partitioned data sources [flink]

2024-03-05 Thread via GitHub


jeyhunkarimov commented on PR #24437:
URL: https://github.com/apache/flink/pull/24437#issuecomment-1979739792

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32263]-table-Add-ELT-function [flink]

2024-03-05 Thread via GitHub


hanyuzheng7 commented on PR #22853:
URL: https://github.com/apache/flink/pull/22853#issuecomment-1979728745

   @flinkbot run azure
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34417] Log Job ID via MDC [flink]

2024-03-05 Thread via GitHub


zentol commented on code in PR #24292:
URL: https://github.com/apache/flink/pull/24292#discussion_r1513528600


##
flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskExecutor.java:
##
@@ -903,22 +907,28 @@ private Stream 
filterPartitionsRequiringRel
 public CompletableFuture cancelTask(
 ExecutionAttemptID executionAttemptID, Time timeout) {
 final Task task = taskSlotTable.getTask(executionAttemptID);
+MdcUtils.addJobID(task.getJobID());

Review Comment:
   this can fail if the task is null.



##
flink-tests/src/test/java/org/apache/flink/test/misc/JobIDLoggingITCase.java:
##
@@ -0,0 +1,228 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.test.misc;
+
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.api.common.JobStatus;
+import org.apache.flink.api.common.time.Deadline;
+import org.apache.flink.client.program.ClusterClient;
+import org.apache.flink.core.execution.CheckpointType;
+import org.apache.flink.runtime.checkpoint.CheckpointCoordinator;
+import org.apache.flink.runtime.checkpoint.CheckpointException;
+import org.apache.flink.runtime.executiongraph.ExecutionGraph;
+import org.apache.flink.runtime.jobmaster.JobMaster;
+import org.apache.flink.runtime.taskexecutor.TaskExecutor;
+import org.apache.flink.runtime.taskmanager.Task;
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.sink.DiscardingSink;
+import org.apache.flink.streaming.runtime.tasks.StreamTask;
+import org.apache.flink.test.junit5.InjectClusterClient;
+import org.apache.flink.test.junit5.MiniClusterExtension;
+import org.apache.flink.testutils.logging.LoggerAuditingExtension;
+import org.apache.flink.util.ExceptionUtils;
+import org.apache.flink.util.MdcUtils;
+import org.apache.flink.util.TestLogger;
+
+import org.apache.logging.log4j.core.LogEvent;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.extension.RegisterExtension;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+import java.util.concurrent.ExecutionException;
+import java.util.regex.Pattern;
+import java.util.stream.Collectors;
+
+import static org.apache.flink.util.Preconditions.checkState;
+import static org.junit.Assert.assertTrue;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+import static org.slf4j.event.Level.DEBUG;
+
+/**
+ * Tests adding of {@link JobID} to logs (via {@link org.slf4j.MDC}) in the 
most important cases.
+ */
+public class JobIDLoggingITCase extends TestLogger {
+private static final Logger logger = 
LoggerFactory.getLogger(JobIDLoggingITCase.class);
+
+@RegisterExtension
+public final LoggerAuditingExtension checkpointCoordinatorLogging =
+new LoggerAuditingExtension(CheckpointCoordinator.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension streamTaskLogging =
+new LoggerAuditingExtension(StreamTask.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension taskExecutorLogging =
+new LoggerAuditingExtension(TaskExecutor.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension taskLogging =
+new LoggerAuditingExtension(Task.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension executionGraphLogging =
+new LoggerAuditingExtension(ExecutionGraph.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension jobMasterLogging =
+new LoggerAuditingExtension(JobMaster.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension asyncCheckpointRunnableLogging =
+// this class is private
+new LoggerAuditingExtension(
+
"org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable", DEBUG);
+
+@RegisterExtension
+public static MiniClusterExtension miniCl

Re: [PR] [WIP][FLINK-34442][table] Support optimizations for pre-partitioned data sources [flink]

2024-03-05 Thread via GitHub


jeyhunkarimov commented on PR #24437:
URL: https://github.com/apache/flink/pull/24437#issuecomment-1979689365

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34575) Vulnerabilities in commons-compress 1.24.0; upgrade to 1.26.0 needed.

2024-03-05 Thread Adrian Vasiliu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823762#comment-17823762
 ] 

Adrian Vasiliu commented on FLINK-34575:


Thanks Martijn. I couldn't find an estimated date for 1.20, would you know?
Also, no bug fix release planned for 1.18.x / 1.19.x, before 1.20?
I ask because of the pressure from vulnerability scanners...

> Vulnerabilities in commons-compress 1.24.0; upgrade to 1.26.0 needed.
> -
>
> Key: FLINK-34575
> URL: https://issues.apache.org/jira/browse/FLINK-34575
> Project: Flink
>  Issue Type: Technical Debt
>Affects Versions: 1.18.1
>Reporter: Adrian Vasiliu
>Priority: Major
>
> Since Feb. 19, medium/high CVEs have been found for commons-compress 1.24.0:
> [https://nvd.nist.gov/vuln/detail/CVE-2024-25710]
> https://nvd.nist.gov/vuln/detail/CVE-2024-26308
> [https://github.com/apache/flink/pull/24352] has been opened automatically on 
> Feb. 21 by dependabot for bumping commons-compress to v1.26.0 which fixes the 
> CVEs, but two CI checks are red on the PR.
> Flink's dependency on commons-compress has been upgraded to v1.24.0 in Oct 
> 2023 (https://issues.apache.org/jira/browse/FLINK-33329).
> v1.24.0 is the version currently in the master 
> branch: [https://github.com/apache/flink/blob/master/pom.xml#L727-L729].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-34575) Vulnerabilities in commons-compress 1.24.0; upgrade to 1.26.0 needed.

2024-03-05 Thread Adrian Vasiliu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823762#comment-17823762
 ] 

Adrian Vasiliu edited comment on FLINK-34575 at 3/5/24 8:18 PM:


Thanks Martijn. I couldn't find an estimated date for 1.20, would you know?
Also, no bug / security fix release planned for 1.18.x / 1.19.x, before 1.20?
I ask because of the pressure from vulnerability scanners...


was (Author: JIRAUSER280892):
Thanks Martijn. I couldn't find an estimated date for 1.20, would you know?
Also, no bug fix release planned for 1.18.x / 1.19.x, before 1.20?
I ask because of the pressure from vulnerability scanners...

> Vulnerabilities in commons-compress 1.24.0; upgrade to 1.26.0 needed.
> -
>
> Key: FLINK-34575
> URL: https://issues.apache.org/jira/browse/FLINK-34575
> Project: Flink
>  Issue Type: Technical Debt
>Affects Versions: 1.18.1
>Reporter: Adrian Vasiliu
>Priority: Major
>
> Since Feb. 19, medium/high CVEs have been found for commons-compress 1.24.0:
> [https://nvd.nist.gov/vuln/detail/CVE-2024-25710]
> https://nvd.nist.gov/vuln/detail/CVE-2024-26308
> [https://github.com/apache/flink/pull/24352] has been opened automatically on 
> Feb. 21 by dependabot for bumping commons-compress to v1.26.0 which fixes the 
> CVEs, but two CI checks are red on the PR.
> Flink's dependency on commons-compress has been upgraded to v1.24.0 in Oct 
> 2023 (https://issues.apache.org/jira/browse/FLINK-33329).
> v1.24.0 is the version currently in the master 
> branch: [https://github.com/apache/flink/blob/master/pom.xml#L727-L729].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34152) Tune TaskManager memory

2024-03-05 Thread Yang LI (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823720#comment-17823720
 ] 

Yang LI commented on FLINK-34152:
-

Hello [~mxm],

Regarding the integration of Flink autoscaling with memory tuning into our 
Kubernetes (k8s) cluster: we believe that incorporating the Flink cluster into 
a standard k8s cluster will give us more flexibility in resource allocation. 
However, a significant concern arises from the frequent node upgrades within 
our company's k8s cluster. These upgrades mandate rolling updates across all 
deployed applications, entailing a sequential restart of each pod.


Such operations are notoriously time-consuming. Moreover, they pose a unique 
challenge for Flink jobs, which are forced to restart with each task manager's 
stop-and-start cycle, leading to substantial downtime in comparison to other 
applications in our environment.

Our current Flink cluster is less affected by this because it runs in a 
dedicated node group that always uses the same instance type, so we can simply 
destroy and re-create everything at once.

Given this backdrop, I am curious to know your thoughts on the potential 
development of Flink features designed to better accommodate the rolling 
update process. Elastic scaling is a step in the right direction for 
mitigating such issues, but I fear it is still not enough.

Do you believe there are, or will be, enhancements to Flink that could ease the 
impact of rolling updates on Flink jobs? 

 

> Tune TaskManager memory
> ---
>
> Key: FLINK-34152
> URL: https://issues.apache.org/jira/browse/FLINK-34152
> Project: Flink
>  Issue Type: Sub-task
>  Components: Autoscaler, Kubernetes Operator
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.8.0
>
>
> The current autoscaling algorithm adjusts the parallelism of the job task 
> vertices according to the processing needs. By adjusting the parallelism, we 
> systematically scale the amount of CPU for a task. At the same time, we also 
> indirectly change the amount of memory tasks have at their dispense. However, 
> there are some problems with this.
>  # Memory is overprovisioned: On scale up we may add more memory than we 
> actually need. Even on scale down, the memory / cpu ratio can still be off 
> and too much memory is used.
>  # Memory is underprovisioned: For stateful jobs, we risk running into 
> OutOfMemoryErrors on scale down. Even before running out of memory, too 
> little memory can have a negative impact on the effectiveness of the scaling.
> We lack the capability to tune memory proportionally to the processing needs. 
> In the same way that we measure CPU usage and size the tasks accordingly, we 
> need to evaluate memory usage and adjust the heap memory size.
> https://docs.google.com/document/d/19GXHGL_FvN6WBgFvLeXpDABog2H_qqkw1_wrpamkFSc/edit
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34417] Log Job ID via MDC [flink]

2024-03-05 Thread via GitHub


rkhachatryan commented on code in PR #24292:
URL: https://github.com/apache/flink/pull/24292#discussion_r1513218720


##
flink-tests/src/test/resources/log4j2-test.properties:
##
@@ -28,7 +28,7 @@ appender.testlogger.name = TestLogger
 appender.testlogger.type = CONSOLE
 appender.testlogger.target = SYSTEM_ERR
 appender.testlogger.layout.type = PatternLayout
-appender.testlogger.layout.pattern = %-4r [%t] %-5p %c %x - %m%n
+appender.testlogger.layout.pattern = [%-32X{flink-job-id}] %c{0} [%t] %-5p %m%n

Review Comment:
   No, I just wanted to add job ID to log messages in tests.
   Just `%x` doesn't work, because it stands for Nested DC, while this PR uses 
Mapped DC (which is `%X`).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [WIP][FLINK-34442][table] Support optimizations for pre-partitioned data sources [flink]

2024-03-05 Thread via GitHub


jeyhunkarimov commented on PR #24437:
URL: https://github.com/apache/flink/pull/24437#issuecomment-1979268800

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34417] Log Job ID via MDC [flink]

2024-03-05 Thread via GitHub


zentol commented on code in PR #24292:
URL: https://github.com/apache/flink/pull/24292#discussion_r1513207066


##
flink-tests/src/test/resources/log4j2-test.properties:
##
@@ -28,7 +28,7 @@ appender.testlogger.name = TestLogger
 appender.testlogger.type = CONSOLE
 appender.testlogger.target = SYSTEM_ERR
 appender.testlogger.layout.type = PatternLayout
-appender.testlogger.layout.pattern = %-4r [%t] %-5p %c %x - %m%n
+appender.testlogger.layout.pattern = [%-32X{flink-job-id}] %c{0} [%t] %-5p %m%n

Review Comment:
   do we need this for the test to work? I'd expect that the MDC is populated 
regardless of what the pattern says, but that may very well not be the case.



##
flink-tests/src/test/java/org/apache/flink/test/misc/JobIDLoggingITCase.java:
##
@@ -0,0 +1,231 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.test.misc;
+
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.api.common.JobStatus;
+import org.apache.flink.api.common.time.Deadline;
+import org.apache.flink.client.program.ClusterClient;
+import org.apache.flink.core.execution.CheckpointType;
+import org.apache.flink.runtime.checkpoint.CheckpointCoordinator;
+import org.apache.flink.runtime.checkpoint.CheckpointException;
+import org.apache.flink.runtime.executiongraph.ExecutionGraph;
+import org.apache.flink.runtime.jobmaster.JobMaster;
+import org.apache.flink.runtime.taskexecutor.TaskExecutor;
+import org.apache.flink.runtime.taskmanager.Task;
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.sink.DiscardingSink;
+import org.apache.flink.streaming.runtime.tasks.StreamTask;
+import org.apache.flink.test.junit5.InjectClusterClient;
+import org.apache.flink.test.junit5.MiniClusterExtension;
+import org.apache.flink.testutils.logging.LoggerAuditingExtension;
+import org.apache.flink.util.ExceptionUtils;
+import org.apache.flink.util.MdcUtils;
+import org.apache.flink.util.TestLogger;
+
+import org.apache.logging.log4j.core.LogEvent;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.extension.RegisterExtension;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+import java.util.concurrent.ExecutionException;
+import java.util.regex.Pattern;
+import java.util.stream.Collectors;
+
+import static org.apache.flink.util.Preconditions.checkState;
+import static org.junit.Assert.assertTrue;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+import static org.slf4j.event.Level.DEBUG;
+
+/**
+ * Tests adding of {@link JobID} to logs (via {@link org.slf4j.MDC}) in the 
most important cases.
+ */
+public class JobIDLoggingITCase extends TestLogger {
+private static final Logger logger = 
LoggerFactory.getLogger(JobIDLoggingITCase.class);
+
+@RegisterExtension
+public final LoggerAuditingExtension checkpointCoordinatorLogging =
+new LoggerAuditingExtension(CheckpointCoordinator.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension streamTaskLogging =
+new LoggerAuditingExtension(StreamTask.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension taskExecutorLogging =
+new LoggerAuditingExtension(TaskExecutor.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension taskLogging =
+new LoggerAuditingExtension(Task.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension executionGraphLogging =
+new LoggerAuditingExtension(ExecutionGraph.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension jobMasterLogging =
+new LoggerAuditingExtension(JobMaster.class, DEBUG);
+
+@RegisterExtension
+public final LoggerAuditingExtension asyncCheckpointRunnableLogging =
+// this class is private
+new LoggerAuditingExtension(
+
"org.apache.flink.strea

Re: [PR] [WIP][FLINK-34442][table] Support optimizations for pre-partitioned data sources [flink]

2024-03-05 Thread via GitHub


jeyhunkarimov commented on PR #24437:
URL: https://github.com/apache/flink/pull/24437#issuecomment-1979247860

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-34579) Introduce metric for time since last completed checkpoint

2024-03-05 Thread Stefan Richter (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Richter closed FLINK-34579.
--
Resolution: Won't Do

> Introduce metric for time since last completed checkpoint
> -
>
> Key: FLINK-34579
> URL: https://issues.apache.org/jira/browse/FLINK-34579
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> This metric will help us identify jobs with checkpointing problems without 
> requiring a checkpoint to complete or fail before the problem 
> surfaces.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34579][metrics] Introduce metric for time since last completed checkpoint [flink]

2024-03-05 Thread via GitHub


StefanRRichter closed pull request #24435: [FLINK-34579][metrics] Introduce 
metric for time since last completed checkpoint
URL: https://github.com/apache/flink/pull/24435


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] (FLINK-34532) Stage source and binary releases on dist.apache.org

2024-03-05 Thread lincoln lee (Jira)


[ https://issues.apache.org/jira/browse/FLINK-34532 ]


lincoln lee deleted comment on FLINK-34532:
-

was (Author: lincoln.86xy):
An error occurred when build from the source package: 
[https://dist.apache.org/repos/dist/dev/flink/flink-1.19.0-rc1/flink-1.19.0-src.tgz]

{code}

[INFO] --- maven-checkstyle-plugin:3.1.2:check (validate) @ 
flink-table-planner_2.12 ---
[WARNING] Old version of checkstyle detected. Consider updating to >= v8.30
[WARNING] For more information see: 
https://maven.apache.org/plugins/maven-checkstyle-plugin/examples/upgrading-checkstyle.html
[INFO] There are 2 errors reported by Checkstyle 8.14 with 
/tools/maven/checkstyle.xml ruleset.
[ERROR] 
src/main/scala/org/apache/flink/table/planner/plan/rules/logical/RemoteCalcCallFinder.java:[31]
 (whitespace) EmptyLineSeparator: 'METHOD_DEF' should be separated from 
previous statement.
[ERROR] 
src/main/scala/org/apache/flink/table/planner/plan/rules/logical/RemoteCalcCallFinder.java:[33]
 (whitespace) EmptyLineSeparator: 'METHOD_DEF' should be separated from 
previous statement.

...

[INFO] Flink : Table : Planner  FAILURE [03:37 min]

...

[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time:  16:45 min
[INFO] Finished at: 2024-03-05T23:10:46+08:00
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-checkstyle-plugin:3.1.2:check (validate) on 
project flink-table-planner_2.12: You have 2 Checkstyle violations. -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :flink-table-planner_2.12

{code}

 

> Stage source and binary releases on dist.apache.org
> ---
>
> Key: FLINK-34532
> URL: https://issues.apache.org/jira/browse/FLINK-34532
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
>
> Copy the source release to the dev repository of dist.apache.org:
> # If you have not already, check out the Flink section of the dev repository 
> on dist.apache.org via Subversion. In a fresh directory:
> {code:bash}
> $ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
> {code}
> # Make a directory for the new release and copy all the artifacts (Flink 
> source/binary distributions, hashes, GPG signatures and the python 
> subdirectory) into that newly created directory:
> {code:bash}
> $ mkdir flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
> $ mv /tools/releasing/release/* 
> flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
> {code}
> # Add and commit all the files.
> {code:bash}
> $ cd flink
> flink $ svn add flink-${RELEASE_VERSION}-rc${RC_NUM}
> flink $ svn commit -m "Add flink-${RELEASE_VERSION}-rc${RC_NUM}"
> {code}
> # Verify that files are present under 
> [https://dist.apache.org/repos/dist/dev/flink|https://dist.apache.org/repos/dist/dev/flink].
> # Push the release tag if not done already (the following command assumes to 
> be called from within the apache/flink checkout):
> {code:bash}
> $ git push  refs/tags/release-${RELEASE_VERSION}-rc${RC_NUM}
> {code}
>  
> 
> h3. Expectations
>  * Maven artifacts deployed to the staging repository of 
> [repository.apache.org|https://repository.apache.org/content/repositories/]
>  * Source distribution deployed to the dev repository of 
> [dist.apache.org|https://dist.apache.org/repos/dist/dev/flink/]
>  * Check hashes (e.g. shasum -c *.sha512)
>  * Check signatures (e.g. {{{}gpg --verify 
> flink-1.2.3-source-release.tar.gz.asc flink-1.2.3-source-release.tar.gz{}}})
>  * {{grep}} for legal headers in each file.
>  * If time allows, check in advance the NOTICE files of the modules whose 
> dependencies have changed in this release, since license issues pop up from 
> time to time during voting. See the "Checking License" section of [Verifying a 
> Flink 
> Release|https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Release].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34566) Flink Kubernetes Operator reconciliation parallelism setting not work

2024-03-05 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823682#comment-17823682
 ] 

Gyula Fora commented on FLINK-34566:


[~Fei Feng] here is the JOSDK side fix, can you please help review it?
[https://github.com/operator-framework/java-operator-sdk/pull/2262/files]

 

> Flink Kubernetes Operator reconciliation parallelism setting not work
> -
>
> Key: FLINK-34566
> URL: https://issues.apache.org/jira/browse/FLINK-34566
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.7.0
>Reporter: Fei Feng
>Assignee: Fei Feng
>Priority: Blocker
>  Labels: pull-request-available
> Attachments: image-2024-03-04-10-58-37-679.png, 
> image-2024-03-04-11-17-22-877.png, image-2024-03-04-11-31-44-451.png
>
>
> After we upgraded JOSDK from version 4.3.0 to version 4.4.2 in FLINK-33005, 
> we can no longer enlarge the reconciliation parallelism; the maximum 
> reconciliation parallelism is only 10. This results in FlinkDeployment and 
> SessionJob reconciliation delays of about 10-30 seconds when we have a 
> large-scale Flink session cluster and many session jobs in the k8s cluster.
>  
> After investigating and validating, I found the reason is that the logic for 
> reconciliation thread pool creation in JOSDK changed significantly 
> between these two versions. 
> v4.3.0: 
> the reconciliation thread pool was created as a FixedThreadPool (maximumPoolSize 
> same as corePoolSize), so we pass the reconciliation parallelism and get a 
> thread pool that matches our expectations.
> !image-2024-03-04-10-58-37-679.png|width=497,height=91!
> [https://github.com/operator-framework/java-operator-sdk/blob/v4.3.0/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ConfigurationServiceOverrider.java#L198]
>  
> but in v4.4.2:
> the reconciliation thread pool is created via a custom executor to which we 
> can pass corePoolSize and maximumPoolSize. The 
> problem is that we only set the maximumPoolSize of the thread pool, while 
> the corePoolSize defaults to 10. This causes the effective thread 
> pool size to be only 10, and the majority of events are placed in the workQueue 
> for a while.  
> !image-2024-03-04-11-17-22-877.png|width=569,height=112!
> [https://github.com/operator-framework/java-operator-sdk/blob/v4.4.2/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ExecutorServiceManager.java#L37]
>  
> the solution is also simple: we can create and pass the thread pool in the 
> Flink Kubernetes Operator ourselves, so that we control the reconciliation 
> thread pool directly, for example:
> !image-2024-03-04-11-31-44-451.png|width=483,height=98!
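The core/maximum distinction described above can be demonstrated in isolation with the plain JDK (a minimal sketch, not operator code): with an unbounded work queue, `ThreadPoolExecutor` never grows beyond `corePoolSize`, so raising only `maximumPoolSize` has no effect.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizeDemo {
    // Returns the observed pool size after submitting `tasks` sleeping tasks
    // to a ThreadPoolExecutor with the given core/max sizes and an unbounded queue.
    static int observedPoolSize(int core, int max, int tasks) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 10, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                try { Thread.sleep(500); } catch (InterruptedException ignored) { }
            });
        }
        Thread.sleep(100); // let the pool spin up worker threads
        int size = pool.getPoolSize();
        pool.shutdownNow();
        return size;
    }

    public static void main(String[] args) throws InterruptedException {
        // Even with maximumPoolSize == 50, an unbounded queue means excess tasks
        // wait in the queue instead of triggering new threads beyond corePoolSize.
        System.out.println(observedPoolSize(10, 50, 50)); // prints 10
    }
}
```

This is why only 10 reconciliation threads ever run: `ThreadPoolExecutor` creates threads beyond `corePoolSize` only when the queue rejects a task, which an unbounded `LinkedBlockingQueue` never does.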



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34532) Stage source and binary releases on dist.apache.org

2024-03-05 Thread lincoln lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823680#comment-17823680
 ] 

lincoln lee commented on FLINK-34532:
-

An error occurred when build from the source package: 
[https://dist.apache.org/repos/dist/dev/flink/flink-1.19.0-rc1/flink-1.19.0-src.tgz]

{code}

[INFO] --- maven-checkstyle-plugin:3.1.2:check (validate) @ 
flink-table-planner_2.12 ---
[WARNING] Old version of checkstyle detected. Consider updating to >= v8.30
[WARNING] For more information see: 
https://maven.apache.org/plugins/maven-checkstyle-plugin/examples/upgrading-checkstyle.html
[INFO] There are 2 errors reported by Checkstyle 8.14 with 
/tools/maven/checkstyle.xml ruleset.
[ERROR] 
src/main/scala/org/apache/flink/table/planner/plan/rules/logical/RemoteCalcCallFinder.java:[31]
 (whitespace) EmptyLineSeparator: 'METHOD_DEF' should be separated from 
previous statement.
[ERROR] 
src/main/scala/org/apache/flink/table/planner/plan/rules/logical/RemoteCalcCallFinder.java:[33]
 (whitespace) EmptyLineSeparator: 'METHOD_DEF' should be separated from 
previous statement.

...

[INFO] Flink : Table : Planner  FAILURE [03:37 min]

...

[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time:  16:45 min
[INFO] Finished at: 2024-03-05T23:10:46+08:00
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-checkstyle-plugin:3.1.2:check (validate) on 
project flink-table-planner_2.12: You have 2 Checkstyle violations. -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :flink-table-planner_2.12

{code}

 

> Stage source and binary releases on dist.apache.org
> ---
>
> Key: FLINK-34532
> URL: https://issues.apache.org/jira/browse/FLINK-34532
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lincoln Lee
>Assignee: lincoln lee
>Priority: Major
>
> Copy the source release to the dev repository of dist.apache.org:
> # If you have not already, check out the Flink section of the dev repository 
> on dist.apache.org via Subversion. In a fresh directory:
> {code:bash}
> $ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
> {code}
> # Make a directory for the new release and copy all the artifacts (Flink 
> source/binary distributions, hashes, GPG signatures and the python 
> subdirectory) into that newly created directory:
> {code:bash}
> $ mkdir flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
> $ mv /tools/releasing/release/* 
> flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
> {code}
> # Add and commit all the files.
> {code:bash}
> $ cd flink
> flink $ svn add flink-${RELEASE_VERSION}-rc${RC_NUM}
> flink $ svn commit -m "Add flink-${RELEASE_VERSION}-rc${RC_NUM}"
> {code}
> # Verify that files are present under 
> [https://dist.apache.org/repos/dist/dev/flink|https://dist.apache.org/repos/dist/dev/flink].
> # Push the release tag if not done already (the following command assumes to 
> be called from within the apache/flink checkout):
> {code:bash}
> $ git push  refs/tags/release-${RELEASE_VERSION}-rc${RC_NUM}
> {code}
>  
> 
> h3. Expectations
>  * Maven artifacts deployed to the staging repository of 
> [repository.apache.org|https://repository.apache.org/content/repositories/]
>  * Source distribution deployed to the dev repository of 
> [dist.apache.org|https://dist.apache.org/repos/dist/dev/flink/]
>  * Check hashes (e.g. shasum -c *.sha512)
>  * Check signatures (e.g. {{{}gpg --verify 
> flink-1.2.3-source-release.tar.gz.asc flink-1.2.3-source-release.tar.gz{}}})
>  * {{grep}} for legal headers in each file.
>  * If time allows, check in advance the NOTICE files of the modules whose 
> dependencies have changed in this release, since license issues pop up from 
> time to time during voting. See the "Checking License" section of [Verifying a 
> Flink 
> Release|https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Release].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-16627) Support only generate non-null values when serializing into JSON

2024-03-05 Thread Benchao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benchao Li resolved FLINK-16627.

Fix Version/s: 1.20.0
   Resolution: Fixed

Fixed via 93555a7d55dc27571122cccb7c4d8af2c5db54cb (master)

[~nilerzhou] Thanks for your contribution!

> Support only generate non-null values when serializing into JSON
> 
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: New Feature
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Planner
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Assignee: yisha zhou
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> auto-unassigned, pull-request-available, sprint
> Fix For: 1.20.0
>
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
>   def eval(str: String): String = {
>     if (str == null) {
>       ""
>     } else {
>       str
>     }
>   }
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes svt's value is null, and the JSON inserted into Kafka looks like 
> \{"subtype":"qin","svt":null}.
> If the amount of data is small this is acceptable, but we process 10 TB of data 
> every day, and there may be many nulls in the JSON, which hurts 
> efficiency. If a parameter could be added to drop null keys when defining a 
> sink table, performance would be greatly improved.
>  
>  
>  
>  
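The behavior the issue asks for, omitting null fields at serialization time, can be sketched with a naive hand-rolled writer. This illustrates the idea only; it is not the JSON format's actual implementation, and it handles only flat string fields:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class NonNullJsonDemo {
    // Naive JSON object writer that simply omits entries whose value is null --
    // the same idea the issue requests for the JSON sink format.
    static String toJson(Map<String, String> row) {
        return row.entrySet().stream()
                .filter(e -> e.getValue() != null)
                .map(e -> "\"" + e.getKey() + "\":\"" + e.getValue() + "\"")
                .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("subtype", "qin");
        row.put("svt", null); // dropped instead of serialized as "svt":null
        System.out.println(toJson(row)); // prints {"subtype":"qin"}
    }
}
```

Dropping the key at serialization time shrinks every record, which is where the payload savings at 10 TB/day would come from.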



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-16627][json]Support ignore null fields when serializing into JSON [flink]

2024-03-05 Thread via GitHub


libenchao closed pull request #24430: [FLINK-16627][json]Support ignore null 
fields when serializing into JSON
URL: https://github.com/apache/flink/pull/24430


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34576) Flink deployment keep staying at RECONCILING/STABLE status

2024-03-05 Thread chenyuzhi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823663#comment-17823663
 ] 

chenyuzhi commented on FLINK-34576:
---

Thanks for the reply.

1. Is there a way to somehow repro this on a smaller case?

I have tried to simulate leader switching by deleting the pod in the test 
environment, but could not reproduce it. In the production environment it is 
very likely to occur (maybe it is related to the load?).

 

Maybe there is some way to make the operator pod lose leadership to repro it 
(other than deleting the pod; I haven't found any other way to make the pod 
lose leadership).


2. Have you tried operator version 1.7.0? We may have fixed the issue there 
already

We have not upgraded to 1.7.0 because that version no longer supports 
Flink 1.14.0, which our production environment is still using.

 
Are you referring to this [JOSDK 
issue|https://github.com/operator-framework/java-operator-sdk/issues/2056]? We 
did encounter a split-brain problem with multiple leaders earlier, but as 
mentioned in the first question, this status exception still occurs after 
leadership is successfully switched (confirmed in the logs: the old leader 
exits and the new leader takes over).
 
3. Does it also affect newer Flink versions?
 

Our highest Flink version is 1.15.2; the impact on newer versions is 
uncertain.
 

4. Can you share some relevant operator logs?

Sure.

 
Operator A's log when the leader switches ("Stopped leading" appears), taken 
from the log file:

 
{code:java}
2024-03-05 04:35:46,565 o.a.f.c.Configuration          [WARN 
][gdc-qdata-bu/prod-s1-monitor-reward-sjuneizhandoushemenrel] Config uses 
deprecated configuration key 'high-availability' instead of proper key 
'high-availability.type'
2024-03-05 04:35:46,567 o.a.f.c.Configuration          [WARN 
][gdc-gdc-sa/logstream-panama-panama-h73na-serverlog-produ] Config uses 
deprecated configuration key 'high-availability' instead of proper key 
'high-availability.type'
2024-03-05 04:35:46,569 o.a.f.c.Configuration          [WARN 
][gdc-gdc-sa/logstream-erie-erie-gzailab-sym2-ns-imageveri] Config uses 
deprecated configuration key 'high-availability' instead of proper key 
'high-availability.type'
2024-03-05 04:35:46,569 o.a.f.c.Configuration          [WARN 
][gdc-gdc-sa/test-vk-log3] Config uses deprecated configuration key 
'high-availability' instead of proper key 'high-availability.type'
2024-03-05 04:35:46,574 i.j.o.LeaderElectionManager    [INFO ] New leader with 
identity: 
2024-03-05 04:35:46,584 o.a.f.c.Configuration          [WARN 
][gdc-cld-bu/logdistribution-grand-cld-dnode-contianer-ba3] Config uses 
deprecated configuration key 'kubernetes.jobmanager.cpu' instead of proper key 
'kubernetes.jobmanager.cpu.amount'
2024-03-05 04:35:46,584 o.a.f.c.Configuration          [WARN 
][gdc-cld-bu/logdistribution-grand-cld-dnode-contianer-ba3] Config uses 
deprecated configuration key 'kubernetes.taskmanager.cpu' instead of proper key 
'kubernetes.taskmanager.cpu.amount'
2024-03-05 04:35:46,586 o.a.f.k.o.r.d.AbstractFlinkResourceReconciler [INFO 
][gdc-cld-bu/logdistribution-grand-cld-dnode-contianer-ba3] Resource fully 
reconciled, nothing to do...
2024-03-05 04:35:46,586 i.j.o.LeaderElectionManager    [INFO ] Stopped leading 
for identity: flink-kubernetes-operator-85f6994468-cpsx9. Exiting.
2024-03-05 04:35:46,589 o.a.f.k.o.l.AuditUtils         [INFO 
][gdc-gdc-bu/test-lag-202306-v2-copy-cpu] >>> Status | Error   | STABLE         
 | 
{"type":"org.apache.flink.kubernetes.operator.exception.RecoveryFailureException","message":"HA
 metadata not available to restore from last state. It is possible that the job 
has finished or terminally failed, or the configmaps have been deleted. Manual 
restore required.","additionalMetadata":{},"throwableList":[]} 
2024-03-05 04:35:46,591 o.a.f.c.Configuration          [WARN 
][gdc-a29-bu/logdistribution-xia-xia-a29-pc-vm-log-product] Config uses 
deprecated configuration key 'kubernetes.jobmanager.cpu' instead of proper key 
'kubernetes.jobmanager.cpu.amount'
2024-03-05 04:35:46,591 o.a.f.c.Configuration          [WARN 
][gdc-a29-bu/logdistribution-xia-xia-a29-pc-vm-log-product] Config uses 
deprecated configuration key 'kubernetes.taskmanager.cpu' instead of proper key 
'kubernetes.taskmanager.cpu.amount'
2024-03-05 04:35:46,592 o.a.f.c.Configuration          [WARN 
][gdc-gdc-sa/logstream-grand-grand-s8-serverlog-production] Config uses 
deprecated configuration key 'high-availability' instead of proper key 
'high-availability.type'
2024-03-05 04:35:46,592 o.a.f.c.Configuration          [WARN 
][gdc-gdc-sa/logstream-jinghang-jinghang-g106-seazyi-nginx] Config uses 
deprecated configuration key 'kubernetes.jobmanager.cpu' instead of proper key 
'kubernetes.jobmanager.cpu.amount'
2024-03-05 04:35:46,592 o.a.f.c.Configuration          [WARN 
][gdc-gdc-sa/logstream-jinghang-jinghang-g106-seazyi-nginx] Config

Re: [PR] [WIP][FLINK-34442][table] Support optimizations for pre-partitioned data sources [flink]

2024-03-05 Thread via GitHub


flinkbot commented on PR #24437:
URL: https://github.com/apache/flink/pull/24437#issuecomment-1978955923

   
   ## CI report:
   
   * b714f6b03da97df2284d82941a3014dc6a3530ad UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34442) Support optimizations for pre-partitioned [external] data sources

2024-03-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-34442:
---
Labels: pull-request-available  (was: )

> Support optimizations for pre-partitioned [external] data sources
> -
>
> Key: FLINK-34442
> URL: https://issues.apache.org/jira/browse/FLINK-34442
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API, Table SQL / Planner
>Affects Versions: 1.18.1
>Reporter: Jeyhun Karimov
>Assignee: Jeyhun Karimov
>Priority: Major
>  Labels: pull-request-available
>
> There are some use-cases in which data sources are pre-partitioned:
> - Kafka broker is already partitioned w.r.t. some key[s]
> - There are multiple [Flink] jobs  that materialize their outputs and read 
> them as input subsequently
> One of the main benefits is that we might avoid unnecessary shuffling. 
> There is already an experimental feature in DataStream to support a subset of 
> these [1].
> We should support this for Flink Table/SQL as well. 
> [1] 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/experimental/
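The shuffle-avoidance benefit mentioned above can be illustrated outside Flink: when both inputs were produced with the same partitioner and partition count, equal keys are guaranteed to land in the same partition, so a join can run per partition without redistributing records. A self-contained sketch in plain Java (an illustration of the principle, not the proposed planner change):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CoPartitionedJoinDemo {
    // The partitioner both producers are assumed to have agreed on.
    static int partition(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Buckets rows by partition, as a pre-partitioned source would deliver them.
    static Map<Integer, Map<String, String>> partitioned(Map<String, String> rows, int parts) {
        Map<Integer, Map<String, String>> byPart = new HashMap<>();
        rows.forEach((k, v) ->
                byPart.computeIfAbsent(partition(k, parts), p -> new HashMap<>()).put(k, v));
        return byPart;
    }

    // Joins two co-partitioned sides: each partition is joined locally,
    // with no redistribution of records across partitions.
    static List<String> localJoin(Map<Integer, Map<String, String>> left,
                                  Map<Integer, Map<String, String>> right,
                                  int parts) {
        List<String> out = new ArrayList<>();
        for (int p = 0; p < parts; p++) {
            Map<String, String> l = left.getOrDefault(p, Map.of());
            Map<String, String> r = right.getOrDefault(p, Map.of());
            for (var e : l.entrySet()) {
                String match = r.get(e.getKey());
                if (match != null) out.add(e.getKey() + ":" + e.getValue() + "," + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int parts = 4;
        var left = partitioned(Map.of("a", "1", "b", "2"), parts);
        var right = partitioned(Map.of("a", "x", "c", "y"), parts);
        System.out.println(localJoin(left, right, parts)); // prints [a:1,x]
    }
}
```

The planner-side work is in recognizing that the partitioning properties of source and consumer already match, so the exchange (shuffle) operator can be elided.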



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [WIP][FLINK-34442][table] Support optimizations for pre-partitioned data sources [flink]

2024-03-05 Thread via GitHub


jeyhunkarimov opened a new pull request, #24437:
URL: https://github.com/apache/flink/pull/24437

   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] feat: autoscaling decision parallelism improvement [FLINK-34563] [flink-kubernetes-operator]

2024-03-05 Thread via GitHub


mxm commented on PR #787:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/787#issuecomment-1978939188

   >In my use case, the job graph comprises only 6 operators and allocates 6 
task slots per task manager. Prior to implementing this improvement, setting 
the maximum parallelism to 18 resulted in frequent rescaling of my Flink job to 
various parallelism levels for all vertices, such as 7, 8, 13, 15, and 16. 
However, with this enhancement, the Flink job rescales the biggest vertex only 
to parallelism levels of 6, 12, and 18. While it's true that other vertices may 
still experience rescaling to parallelism levels like 7, 13, or 15, the overall 
frequency of rescaling triggered by the Flink autoscaler has significantly 
decreased.
   
   I agree that this improvement is beneficial especially for lower-parallelism 
jobs. I wonder whether it would make sense to align the parallelism with the 
number of task slots, i.e. have the parallelism always be a multiple of the 
number of task slots. This could result in more stable metrics because subtasks 
are equally distributed across the TaskManagers, which should stabilize the 
metrics for each associated job vertex (task).
   
   For example, if the number of task slots is 6, like in your example, the 
minimum parallelism would be 6, and the next candidates would be 12, 18, 24, 
and so on. That's essentially your idea, but generalized across all vertices.
   
   The only drawback is that, again, this needs to work with the key group 
alignment that we perform. Long term, it would probably be smarter to adjust 
the number of task slots such that it divides the number of key groups without 
a remainder. We can start by adjusting to multiples of the configured number 
of task slots whenever we do not perform the key group adjustment 
(e.g. no shuffle).
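   For illustration only, the candidate-parallelism idea sketched above could 
look like the following standalone snippet. The class and method names are 
invented for this sketch and are not part of the autoscaler; it simply 
enumerates multiples of the configured task-slot count up to the maximum 
parallelism, reproducing the 6/12/18 example from the comment:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not operator code): restrict the autoscaler's
// candidate parallelisms to multiples of the configured number of task
// slots, so subtasks fill TaskManagers evenly.
public class ParallelismCandidates {

    // Returns every multiple of taskSlots up to and including maxParallelism.
    static List<Integer> candidates(int taskSlots, int maxParallelism) {
        List<Integer> result = new ArrayList<>();
        for (int p = taskSlots; p <= maxParallelism; p += taskSlots) {
            result.add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        // With 6 task slots and max parallelism 18, the only candidates
        // are 6, 12 and 18 -- matching the example above.
        System.out.println(candidates(6, 18)); // prints [6, 12, 18]
    }
}
```

   Any key-group divisibility constraint would then be applied as a filter on 
top of this candidate list.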





Re: [PR] [FLINK-34566] Pass a FixedThreadPool to set reconciliation parallelism correctly [flink-kubernetes-operator]

2024-03-05 Thread via GitHub


gyfora commented on code in PR #790:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/790#discussion_r1512945769


##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/FlinkOperatorTest.java:
##
@@ -70,10 +70,21 @@ public void testConfigurationPassedToJOSDK() {
 
 var configService = 
testOperator.getOperator().getConfigurationService();
 
-// Test parallelism being passed
+// Test parallelism being passed expectedly
 var executorService = configService.getExecutorService();
 Assertions.assertInstanceOf(ThreadPoolExecutor.class, executorService);
 ThreadPoolExecutor threadPoolExecutor = (ThreadPoolExecutor) 
executorService;
+for (int i = 0; i < testParallelism * 2; i++) {
+threadPoolExecutor.execute(
+() -> {
+try {
+Thread.sleep(1000);
+} catch (InterruptedException e) {
+e.printStackTrace();
+}
+});
+}

Review Comment:
   what do we expect to test here?
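   For context, submitting sleeping tasks alone asserts nothing; one way to 
make such a check meaningful is to record peak concurrency and assert it never 
exceeds the pool size. The sketch below is illustrative only (class and method 
names are invented, and it uses a plain `Executors.newFixedThreadPool` rather 
than the operator's configured executor):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Standalone sketch (not the operator's actual test): verifies that a
// fixed-size pool caps the number of concurrently running tasks.
public class FixedPoolConcurrencyDemo {

    // Submits 'tasks' sleeping runnables to a pool of size 'parallelism'
    // and returns the highest number of tasks observed running at once.
    static int maxObservedConcurrency(int parallelism, int tasks)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                int now = active.incrementAndGet();
                maxActive.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                active.decrementAndGet();
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();
        return maxActive.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // A fixed pool of 4 never runs more than 4 tasks at once.
        System.out.println(maxObservedConcurrency(4, 8) <= 4); // prints true
    }
}
```

   With an assertion on the returned peak concurrency, the test would actually 
fail if the configured reconciliation parallelism were not honored.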





