[jira] [Commented] (FLINK-34927) Translate flink-kubernetes-operator documentation

2024-03-24 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830364#comment-17830364
 ] 

Gyula Fora commented on FLINK-34927:


I think this would be great, I won't be able to review the content though :D 

> Translate flink-kubernetes-operator documentation
> -
>
> Key: FLINK-34927
> URL: https://issues.apache.org/jira/browse/FLINK-34927
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Caican Cai
>Priority: Major
> Fix For: kubernetes-operator-1.9.0
>
>
> Currently, the flink-kubernetes-operator documentation is only in English. I 
> hope it can be translated into Chinese so that more Chinese users can use it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34537] Autoscaler JDBC Support HikariPool [flink-kubernetes-operator]

2024-03-24 Thread via GitHub


1996fanrui commented on code in PR #785:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/785#discussion_r1537108398


##
flink-autoscaler-standalone/pom.xml:
##
@@ -163,6 +163,12 @@ under the License.
 test
 
 
+

Review Comment:
   It's still after the test-related dependencies. You can check the `` comment; you can move `HikariCP` before it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-34925) FlinkCDC 3.0 for extracting data from sqlserver, errors occur

2024-03-24 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-34925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830356#comment-17830356
 ] 

宇宙先生 edited comment on FLINK-34925 at 3/25/24 6:34 AM:
---

[~loserwang1024]  thanks for your reply. May I have more details about the 
problem, the process steps, and so on? Thanks again.


was (Author: JIRAUSER283011):
[~loserwang1024]   thanks for ur replay,  may I have your details  about the 
problem,  process stps  and so on.,Thanks again.

>  FlinkCDC 3.0 for extracting data from sqlserver,  errors occur
> ---
>
> Key: FLINK-34925
> URL: https://issues.apache.org/jira/browse/FLINK-34925
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Reporter: 宇宙先生
>Priority: Critical
> Attachments: image-2024-03-24-18-12-52-747.png, 
> image-2024-03-24-18-23-19-657.png
>
>
> When I use Flink CDC 3.0 to extract data from SQL Server, errors occur:
> !image-2024-03-24-18-12-52-747.png!
> .is referenced as PRIMARY KEY, but a matching column is not defined 
> in table.
> I found some information on Debezium's website. The official website says 
> this bug was fixed in the Debezium 2.0 version. I checked that the current 
> Flink CDC Debezium version is 1.9.7. I want to know what the cause of this 
> problem is, and whether I can directly upgrade the Debezium version to fix 
> it. Debezium's link: [Debezium 2.0.0.Beta1 
> Released|https://debezium.io/blog/2022/07/27/debezium-2.0-beta1-released/]
> !image-2024-03-24-18-23-19-657.png!
>  





[jira] [Commented] (FLINK-34925) FlinkCDC 3.0 for extracting data from sqlserver, errors occur

2024-03-24 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-34925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830356#comment-17830356
 ] 

宇宙先生 commented on FLINK-34925:
--

[~loserwang1024]  thanks for your reply. May I have more details about the 
problem, the process steps, and so on? Thanks again.

>  FlinkCDC 3.0 for extracting data from sqlserver,  errors occur
> ---
>
> Key: FLINK-34925
> URL: https://issues.apache.org/jira/browse/FLINK-34925
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Reporter: 宇宙先生
>Priority: Critical
> Attachments: image-2024-03-24-18-12-52-747.png, 
> image-2024-03-24-18-23-19-657.png
>
>
> When I use Flink CDC 3.0 to extract data from SQL Server, errors occur:
> !image-2024-03-24-18-12-52-747.png!
> .is referenced as PRIMARY KEY, but a matching column is not defined 
> in table.
> I found some information on Debezium's website. The official website says 
> this bug was fixed in the Debezium 2.0 version. I checked that the current 
> Flink CDC Debezium version is 1.9.7. I want to know what the cause of this 
> problem is, and whether I can directly upgrade the Debezium version to fix 
> it. Debezium's link: [Debezium 2.0.0.Beta1 
> Released|https://debezium.io/blog/2022/07/27/debezium-2.0-beta1-released/]
> !image-2024-03-24-18-23-19-657.png!
>  





Re: [PR] [FLINK-31664] Implement ARRAY_INTERSECT function [flink]

2024-03-24 Thread via GitHub


liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2017314639

   > 
https://clickhouse.com/docs/en/sql-reference/functions/array-functions#arrayintersectarr
   
   From my side, it is not a good idea, because we can use 
array_intersect(array_intersect(array1, array2), array3) to achieve the same 
result; a variadic version is just syntactic sugar. Also, array_union and 
array_except have already been supported and merged, and both take two 
arguments, so we may want to align with them. What is your opinion @dawidwys @MartijnVisser 
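For illustration, the nesting argument above can be sketched with a plain-Java stand-in for the two-argument semantics (`arrayIntersect` here is a hypothetical helper written for this sketch, not Flink's implementation):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ArrayIntersectDemo {

    // Hypothetical helper mirroring two-argument ARRAY_INTERSECT semantics:
    // keep elements of the left array that also appear in the right one,
    // preserving left-side order and dropping duplicates.
    public static <T> List<T> arrayIntersect(List<T> left, List<T> right) {
        Set<T> rightSet = new HashSet<>(right);
        List<T> result = new ArrayList<>();
        for (T item : left) {
            if (rightSet.contains(item) && !result.contains(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a1 = List.of(1, 2, 3, 4);
        List<Integer> a2 = List.of(2, 3, 4);
        List<Integer> a3 = List.of(3, 4, 5);
        // Nesting the two-argument form covers the three-array case.
        System.out.println(arrayIntersect(arrayIntersect(a1, a2), a3)); // prints [3, 4]
    }
}
```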





Re: [PR] [FLINK-33121] Failed precondition in JobExceptionsHandler due to concurrent global failures [flink]

2024-03-24 Thread via GitHub


pgaref commented on PR #23440:
URL: https://github.com/apache/flink/pull/23440#issuecomment-2017246294

   > I suppose this can work but it seems rather brittle and may obfuscate the 
underlying failure cause. If the reality is that multiple global failures can 
occur or that global failures can occur after task failures then the exception 
history should simply support that. The old stack got around this by just 
throwing away older global failures if a new one occurred, which isn't great 
either.
   > 
   > This might be the opportunity to properly fix this. In the old days (pre 
source coordinators) this wasn't as pressing because global failures were rare 
and only triggered by the JM (which typically ensured it only happened once), 
but since this has changed now we should adjust accordingly.
   
   Thanks @zentol  ! Opened https://issues.apache.org/jira/browse/FLINK-34922 
to address properly supporting multiple Global failures in the Exception 
History.
   
   Also changed this PR to ignore Global failures while being in a 
Restarting/Canceling or Failing phase on the Adaptive scheduler -- let me know 
what you think





[jira] [Updated] (FLINK-33121) Failed precondition in JobExceptionsHandler due to concurrent global failures

2024-03-24 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated FLINK-33121:
---
Description: 
We make the assumption that Global Failures (with a null task name) may only 
be RootExceptions, and that Local/Task exceptions may be part of the 
concurrent exceptions list (see {{{}JobExceptionsHandler#createRootExceptionInfo{}}}).
However, when the Adaptive scheduler is in a Restarting phase due to an 
existing failure (which is now the new root), we can still, on rare occasions, 
capture new Global failures, violating this condition (an assertion is thrown 
as part of {{{}assertLocalExceptionInfo{}}}), seeing something like:
{code:java}
The taskName must not be null for a non-global failure.  {code}
We want to ignore Global failures while being in a Restarting/Canceling or 
Failing phase on the Adaptive scheduler until we properly support multiple 
Global failures in the Exception History as part of 
https://issues.apache.org/jira/browse/FLINK-34922

Note: DefaultScheduler does not suffer from this issue as it treats failures 
directly as HistoryEntries (no conversion step)

  was:
We make the assumption that Global Failures (with null Task name) may only be 
RootExceptions and and Local/Task exception may be part of concurrent 
exceptions List (see {{{}JobExceptionsHandler#createRootExceptionInfo{}}}).
However, when the Adaptive scheduler is in a Restarting phase due to an 
existing failure (that is now the new Root) we can still, in rare occasions, 
capture new Global failures, violating this condition (with an assertion is 
thrown as part of {{{}assertLocalExceptionInfo{}}}) seeing something like:
{code:java}
The taskName must not be null for a non-global failure.  {code}
A solution to this could be to ignore Global failures while being in a 
Restarting phase on the Adaptive scheduler.

Note: DefaultScheduler does not suffer from this issue as it treats failures 
directly as HistoryEntries (no conversion step)


> Failed precondition in JobExceptionsHandler due to concurrent global failures
> -
>
> Key: FLINK-33121
> URL: https://issues.apache.org/jira/browse/FLINK-33121
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>
> We make the assumption that Global Failures (with a null task name) may only 
> be RootExceptions, and that Local/Task exceptions may be part of the 
> concurrent exceptions list (see {{{}JobExceptionsHandler#createRootExceptionInfo{}}}).
> However, when the Adaptive scheduler is in a Restarting phase due to an 
> existing failure (which is now the new root), we can still, on rare occasions, 
> capture new Global failures, violating this condition (an assertion is thrown 
> as part of {{{}assertLocalExceptionInfo{}}}), seeing something like:
> {code:java}
> The taskName must not be null for a non-global failure.  {code}
> We want to ignore Global failures while being in a Restarting/Canceling or 
> Failing phase on the Adaptive scheduler until we properly support multiple 
> Global failures in the Exception History as part of 
> https://issues.apache.org/jira/browse/FLINK-34922
> Note: DefaultScheduler does not suffer from this issue as it treats failures 
> directly as HistoryEntries (no conversion step)
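As a rough illustration of the guard described above (all names here are assumptions made for this sketch, not the actual scheduler API):

```java
public class GlobalFailureFilterSketch {

    // Simplified stand-in for the Adaptive scheduler's phases.
    public enum SchedulerPhase { RUNNING, RESTARTING, CANCELING, FAILING }

    // Hypothetical guard mirroring the proposed fix: a global failure
    // (null task name) observed while already Restarting/Canceling/Failing
    // is dropped, so it can never appear as a concurrent non-root exception.
    public static boolean shouldRecordFailure(String taskName, SchedulerPhase phase) {
        boolean isGlobal = (taskName == null);
        if (isGlobal) {
            // Only record a global failure as a new root while RUNNING.
            return phase == SchedulerPhase.RUNNING;
        }
        return true; // local/task failures are always recorded
    }

    public static void main(String[] args) {
        System.out.println(shouldRecordFailure(null, SchedulerPhase.RESTARTING)); // prints false
        System.out.println(shouldRecordFailure("map -> sink", SchedulerPhase.RESTARTING)); // prints true
    }
}
```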





Re: [PR] [FLINK-34505][table] Migrate WindowGroupReorderRule to java. [flink]

2024-03-24 Thread via GitHub


liuyongvs commented on code in PR #24375:
URL: https://github.com/apache/flink/pull/24375#discussion_r1537071891


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/rules/logical/WindowGroupReorderRule.java:
##
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.plan.rules.logical;
+
+import org.apache.flink.shaded.guava31.com.google.common.collect.Lists;
+
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelRule;
+import org.apache.calcite.rel.RelCollation;
+import org.apache.calcite.rel.RelFieldCollation;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.Window.Group;
+import org.apache.calcite.rel.logical.LogicalProject;
+import org.apache.calcite.rel.logical.LogicalWindow;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rel.type.RelDataTypeField;
+import org.apache.calcite.rex.RexInputRef;
+import org.immutables.value.Value;
+
+import java.util.ArrayList;
+import java.util.Comparator;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+
+/**
+ * Planner rule that makes the over window groups which have the same shuffle 
keys and order keys
+ * together.
+ */
+@Value.Enclosing
+public class WindowGroupReorderRule
+extends RelRule<WindowGroupReorderRule.WindowGroupReorderRuleConfig> {
+
+public static final WindowGroupReorderRule INSTANCE =
+
WindowGroupReorderRule.WindowGroupReorderRuleConfig.DEFAULT.toRule();
+
+private WindowGroupReorderRule(WindowGroupReorderRuleConfig config) {
+super(config);
+}
+
+@Override
+public boolean matches(RelOptRuleCall call) {
+LogicalWindow window = call.rel(0);
+return window.groups.size() > 1;
+}
+
+@Override
+public void onMatch(RelOptRuleCall call) {
+LogicalWindow window = call.rel(0);
+RelNode input = call.rel(1);
+List<Group> oldGroups = new ArrayList<>(window.groups);
+List<Group> sequenceGroups = new ArrayList<>(window.groups);
+
+sequenceGroups.sort(
+(o1, o2) -> {
+int keyComp = o1.keys.compareTo(o2.keys);
+if (keyComp == 0) {
+return compareRelCollation(o1.orderKeys, o2.orderKeys);
+} else {
+return keyComp;
+}
+});
+
+if (!sequenceGroups.equals(oldGroups) && 
!Lists.reverse(sequenceGroups).equals(oldGroups)) {
+int offset = input.getRowType().getFieldCount();
+List<int[]> aggTypeIndexes = new ArrayList<>();
+for (Group group : oldGroups) {
+int aggCount = group.aggCalls.size();
+int[] typeIndexes = new int[aggCount];
+for (int i = 0; i < aggCount; i++) {
+typeIndexes[i] = offset + i;
+}
+offset += aggCount;
+aggTypeIndexes.add(typeIndexes);
+}
+
+offset = input.getRowType().getFieldCount();
+List<Integer> mapToOldTypeIndexes =
+IntStream.range(0, 
offset).boxed().collect(Collectors.toList());
+for (Group newGroup : sequenceGroups) {
+int aggCount = newGroup.aggCalls.size();
+int oldIndex = oldGroups.indexOf(newGroup);
+offset += aggCount;
+for (int aggIndex = 0; aggIndex < aggCount; aggIndex++) {
+
mapToOldTypeIndexes.add(aggTypeIndexes.get(oldIndex)[aggIndex]);
+}
+}
+
+List<RelDataTypeField> newFieldList =
+mapToOldTypeIndexes.stream()
+.map(index -> 
window.getRowType().getFieldList().get(index))
+.collect(Collectors.toList());
+RelDataType intermediateRowType =
+
window.getCluster().getTypeFactory().createStructType(newFieldList);
+LogicalWindow newLogicalWindow =
+LogicalWindow.create(
+window.getCluster().getPlanner().emptyTraitSet(),
+input,
+windo

Re: [PR] [FLINK-34906] Only scale when all tasks are running [flink-kubernetes-operator]

2024-03-24 Thread via GitHub


1996fanrui commented on PR #801:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/801#issuecomment-2017164952

   Thanks @mxm for the review and discussion!
   
   > > This issue only affects the standalone autoscaler as the kubernetes 
operator has this logic already in place for setting the RUNNING state. Can we 
somehow deduplicate this logic?
   > 
   > Is that really the case? AFAIK we only check for a RUNNING job state.
   
   `AbstractFlinkService#getEffectiveStatus` adjusts `JobStatus.RUNNING` to 
`JobStatus.CREATED`; thanks @gyfora for helping find it. I didn't extract it 
into a common class because @gyfora mentioned the `autoscaler` may be moved to 
a separate repo, so it's better to copy the related logic into the `autoscaler 
standalone` module.
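A minimal sketch of that adjustment, assuming a simplified `JobStatus` (the real logic lives in the operator's `AbstractFlinkService#getEffectiveStatus` and is more involved):

```java
public class EffectiveStatusSketch {

    // Simplified subset of Flink's job states for this sketch.
    public enum JobStatus { CREATED, RUNNING }

    // A job reported as RUNNING is treated as CREATED until every task
    // has actually reached RUNNING. (Names simplified for illustration.)
    public static JobStatus getEffectiveStatus(JobStatus reported, boolean allTasksRunning) {
        if (reported == JobStatus.RUNNING && !allTasksRunning) {
            return JobStatus.CREATED;
        }
        return reported;
    }

    public static void main(String[] args) {
        System.out.println(getEffectiveStatus(JobStatus.RUNNING, false)); // prints CREATED
        System.out.println(getEffectiveStatus(JobStatus.RUNNING, true));  // prints RUNNING
    }
}
```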
   
   > This looks related to #699 which took a different approach by ignoring 
certain exceptions during the stabilization phase and effectively postponing 
metric collection.
   
   The adjustment logic was introduced before #699. It means some metrics may 
not be ready even if all tasks are running (I guess some metrics are generated 
only after the tasks start running). That's exactly what #699 solved.
   
   Why do we need to adjust the JobStatus?
   
   - If some tasks are not running, the autoscaler doesn't need to invoke the 
metric-collection logic.
   - If `job.autoscaler.stabilization.interval` is set to a small value by 
users, it's easy to hit a metric-not-found exception.
   - As I understand it, `job.autoscaler.stabilization.interval` is meant to 
filter out unstable metrics right after all tasks start running.
 - For example, the job starts at `09:00:00`, all tasks start running at 
`09:03:00`, and `job.autoscaler.stabilization.interval` = 1 min.
 - We hope the stabilization period is `09:03:00` to `09:04:00` instead of 
`09:00:00` to `09:01:00`, right?
 - All tasks start at `09:03:00`, so the metrics may be unstable from 
`09:03:00` to `09:04:00`.
 - Of course, this issue might need FLINK-34907 as well.
   
   Please correct me if anything is wrong, thanks a lot.
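The timestamp reasoning above can be sketched as follows (illustrative only; method and variable names are assumptions, not the autoscaler's API):

```java
import java.time.Duration;
import java.time.Instant;

public class StabilizationWindowSketch {

    // The stabilization window should be anchored at the moment all tasks
    // reach RUNNING, not at job submission time.
    public static Instant stabilizationEnd(Instant allTasksRunningAt, Duration interval) {
        return allTasksRunningAt.plus(interval);
    }

    public static void main(String[] args) {
        Instant jobStart = Instant.parse("2024-03-24T09:00:00Z");     // job submitted
        Instant tasksRunning = Instant.parse("2024-03-24T09:03:00Z"); // all tasks RUNNING
        Duration interval = Duration.ofMinutes(1);                    // job.autoscaler.stabilization.interval

        // Window runs 09:03:00 -> 09:04:00, not 09:00:00 -> 09:01:00.
        System.out.println(stabilizationEnd(tasksRunning, interval)); // prints 2024-03-24T09:04:00Z
    }
}
```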





Re: [PR] [FLINK-34743][autoscaler] Memory tuning takes effect even if the parallelism isn't changed [flink-kubernetes-operator]

2024-03-24 Thread via GitHub


1996fanrui commented on PR #799:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/799#issuecomment-2017142562

   Thanks @mxm for the patient reply and clarification!
   
   > You mentioned the adaptive scheduler. Frankly, the use of the adaptive 
scheduler with autoscaling isn't fully developed. I would discourage users from 
using it with autoscaling at its current state.
   
   +1, I agree with you.
   
   > the rescale time has been considered for the downscale / upscale 
processing capacity, but the current processing capacity doesn't factor in 
downtime. 
   
   Sorry, I didn't notice that. I thought the restart time had been considered 
in all cases before. Thank you for pointing it out; I will check later.
   
   Also, do you have any other specific concerns about letting memory tuning 
take effect even if the parallelism isn't changed? I could try to address all 
of them in this PR.





Re: [PR] [hotfix] Correct the doc related to restart time tracking [flink-kubernetes-operator]

2024-03-24 Thread via GitHub


1996fanrui commented on PR #802:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/802#issuecomment-2017134638

   Thanks @czy006 and @mxm for the review!





Re: [PR] [hotfix] Correct the doc related to restart time tracking [flink-kubernetes-operator]

2024-03-24 Thread via GitHub


1996fanrui merged PR #802:
URL: https://github.com/apache/flink-kubernetes-operator/pull/802





[jira] [Commented] (FLINK-34927) Translate flink-kubernetes-operator documentation

2024-03-24 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830328#comment-17830328
 ] 

Rui Fan commented on FLINK-34927:
-

Thanks [~caicancai] for driving this! 

Translating the flink-kubernetes-operator documentation into Chinese would be 
useful for the Flink community and Chinese Flink users. The Flink documentation 
already has a Chinese version, and flink-cdc is being translated in FLINK-34730. 

Translating the flink-kubernetes-operator documentation into Chinese will make 
it easier for Chinese Flink users to use flink-kubernetes-operator and the 
autoscaler. If the community thinks it's acceptable, I'm happy to review it. :)



> Translate flink-kubernetes-operator documentation
> -
>
> Key: FLINK-34927
> URL: https://issues.apache.org/jira/browse/FLINK-34927
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Caican Cai
>Priority: Major
> Fix For: kubernetes-operator-1.9.0
>
>
> Currently, the flink-kubernetes-operator documentation is only in English. I 
> hope it can be translated into Chinese so that more Chinese users can use it.





[jira] [Updated] (FLINK-34927) Translate flink-kubernetes-operator documentation

2024-03-24 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-34927:

Fix Version/s: kubernetes-operator-1.9.0
   (was: 2.0.0)

> Translate flink-kubernetes-operator documentation
> -
>
> Key: FLINK-34927
> URL: https://issues.apache.org/jira/browse/FLINK-34927
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Caican Cai
>Priority: Major
> Fix For: kubernetes-operator-1.9.0
>
>
> Currently, the flink-kubernetes-operator documentation is only in English. I 
> hope it can be translated into Chinese so that more Chinese users can use it.





[jira] [Updated] (FLINK-34927) Translate flink-kubernetes-operator documentation

2024-03-24 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-34927:

Affects Version/s: (was: 1.19.0)

> Translate flink-kubernetes-operator documentation
> -
>
> Key: FLINK-34927
> URL: https://issues.apache.org/jira/browse/FLINK-34927
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Caican Cai
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently, the flink-kubernetes-operator documentation is only in English. I 
> hope it can be translated into Chinese so that more Chinese users can use it.





Re: [PR] [release] Adjust website for Kubernetes operator 1.8.0 release [flink-web]

2024-03-24 Thread via GitHub


1996fanrui commented on PR #726:
URL: https://github.com/apache/flink-web/pull/726#issuecomment-2017117505

   > @1996fanrui I'm going to merge. Please feel free to comment on the PR. We 
can still correct any mistakes or further improve the post!
   
   Sorry for the late response! `1.7.0` was the first version with `Autoscaler 
Standalone`, and we finished a series of improvements in `1.8.0`, such as: 
   
   - The Autoscaler Standalone control loop supports multiple threads
   - Implement JdbcAutoScalerStateStore
   - Implement JdbcAutoScalerEventHandler
   - etc.
   
   After these improvements, `Autoscaler Standalone` is close to production 
ready. Would you mind if we mention them in this release announcement?





[jira] [Assigned] (FLINK-34841) [3.1][pipeline-connectors] Add jdbc pipeline sink

2024-03-24 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu reassigned FLINK-34841:
--

Assignee: Zhongqiang Gong

> [3.1][pipeline-connectors] Add jdbc pipeline sink 
> --
>
> Key: FLINK-34841
> URL: https://issues.apache.org/jira/browse/FLINK-34841
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Flink CDC Issue Import
>Assignee: Zhongqiang Gong
>Priority: Major
>  Labels: github-import
>
> ### Search before asking
> - [X] I searched in the 
> [issues|https://github.com/ververica/flink-cdc-connectors/issues] and found 
> nothing similar.
> ### Motivation
> From my side, and from what I saw in the DingTalk group, some users want to 
> sync data to a relational database with Flink CDC.
> ### Solution
> _No response_
> ### Alternatives
> _No response_
> ### Anything else?
> _No response_
> ### Are you willing to submit a PR?
> - [X] I'm willing to submit a PR!
>  Imported from GitHub 
> Url: https://github.com/apache/flink-cdc/issues/2866
> Created by: [GOODBOY008|https://github.com/GOODBOY008]
> Labels: enhancement, 
> Created at: Wed Dec 13 15:34:21 CST 2023
> State: open





[jira] [Commented] (FLINK-34927) Translate flink-kubernetes-operator documentation

2024-03-24 Thread Caican Cai (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830325#comment-17830325
 ] 

Caican Cai commented on FLINK-34927:


[~fanrui] [~gyfora] Hello, could you check whether you would accept this 
feature? If so, I will create the corresponding subtasks. I have already 
translated part of the documentation. Thank you.

> Translate flink-kubernetes-operator documentation
> -
>
> Key: FLINK-34927
> URL: https://issues.apache.org/jira/browse/FLINK-34927
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Affects Versions: 1.19.0
>Reporter: Caican Cai
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently, the flink-kubernetes-operator documentation is only in English. I 
> hope it can be translated into Chinese so that more Chinese users can use it.





[jira] [Assigned] (FLINK-34732) Add document dead link check for Flink CDC Documentation

2024-03-24 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu reassigned FLINK-34732:
--

Assignee: Zhongqiang Gong

> Add document dead link check for Flink CDC Documentation
> 
>
> Key: FLINK-34732
> URL: https://issues.apache.org/jira/browse/FLINK-34732
> Project: Flink
>  Issue Type: Sub-task
>  Components: Flink CDC
>Reporter: Zhongqiang Gong
>Assignee: Zhongqiang Gong
>Priority: Minor
>  Labels: pull-request-available
>
> Add CI to check for dead links in the Flink CDC documentation.





Re: [PR] [FLINK-34668][checkpoint] Report state handle of file merging directory to JM [flink]

2024-03-24 Thread via GitHub


Zakelly commented on code in PR #24513:
URL: https://github.com/apache/flink/pull/24513#discussion_r1536996692


##
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/filemerging/FileMergingSnapshotManagerBase.java:
##
@@ -278,7 +300,11 @@ public SegmentFileStateHandle 
closeStreamAndCreateStateHandle(
 returnPhysicalFileForNextReuse(subtaskKey, 
checkpointId, physicalFile);
 
 return new SegmentFileStateHandle(
-physicalFile.getFilePath(), startPos, 
stateSize, scope);
+getManagedDirStateHandle(subtaskKey, 
scope),

Review Comment:
   Is it possible that some checkpoint happens to have no state, and the 
directory is deleted by the JM?






[jira] [Created] (FLINK-34927) Translate flink-kubernetes-operator documentation

2024-03-24 Thread Caican Cai (Jira)
Caican Cai created FLINK-34927:
--

 Summary: Translate flink-kubernetes-operator documentation
 Key: FLINK-34927
 URL: https://issues.apache.org/jira/browse/FLINK-34927
 Project: Flink
  Issue Type: New Feature
  Components: Kubernetes Operator
Affects Versions: 1.19.0
Reporter: Caican Cai
 Fix For: 2.0.0


Currently, the flink-kubernetes-operator documentation is only in English. I 
hope it can be translated into Chinese so that more Chinese users can use it.





Re: [PR] [FLINK-34668][checkpoint] Report state handle of file merging directory to JM [flink]

2024-03-24 Thread via GitHub


Zakelly commented on code in PR #24513:
URL: https://github.com/apache/flink/pull/24513#discussion_r1536996545


##
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/filemerging/FileMergingSnapshotManagerBase.java:
##
@@ -278,7 +300,11 @@ public SegmentFileStateHandle 
closeStreamAndCreateStateHandle(
 returnPhysicalFileForNextReuse(subtaskKey, 
checkpointId, physicalFile);
 
 return new SegmentFileStateHandle(
-physicalFile.getFilePath(), startPos, 
stateSize, scope);
+getManagedDirStateHandle(subtaskKey, 
scope),

Review Comment:
   Is it possible that some checkpoint happens to have no state, and the 
directory is deleted by the JM?









[jira] [Assigned] (FLINK-34917) Support enhanced `CREATE CATALOG` syntax

2024-03-24 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan reassigned FLINK-34917:
-

Assignee: Yubin Li

> Support enhanced `CREATE CATALOG` syntax
> 
>
> Key: FLINK-34917
> URL: https://issues.apache.org/jira/browse/FLINK-34917
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.20.0
>Reporter: Yubin Li
>Assignee: Yubin Li
>Priority: Major
> Attachments: image-2024-03-22-18-31-59-632.png
>
>
> {{IF NOT EXISTS}}  clause: If the catalog already exists, nothing happens.
> {{COMMENT}} clause: An optional string literal. The description for the 
> catalog.
> NOTICE: we just need to introduce the '[IF NOT EXISTS]' and '[COMMENT]' 
> clauses to the 'CREATE CATALOG' statement.
> !image-2024-03-22-18-31-59-632.png|width=795,height=87!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34918) Introduce the support of Catalog for comments

2024-03-24 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan reassigned FLINK-34918:
-

Assignee: Yubin Li

> Introduce the support of Catalog for comments
> -
>
> Key: FLINK-34918
> URL: https://issues.apache.org/jira/browse/FLINK-34918
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.20.0
>Reporter: Yubin Li
>Assignee: Yubin Li
>Priority: Major
>
> We propose to introduce a `getComment()` method in `CatalogDescriptor`, for 
> the following reasons.
> 1. For the sake of design consistency, this follows the design of FLIP-295 
> [1], which introduced the `CatalogStore` component; `CatalogDescriptor` 
> includes a name and attributes, both of which are used to describe the 
> catalog, so a `comment` can be added smoothly.
> 2. It extends an existing class rather than adding a new method to an 
> existing interface. In particular, the `Catalog` interface, as a core 
> interface, is used by a series of important components such as 
> `CatalogFactory`, `CatalogManager`, and `FactoryUtil`, and is implemented by 
> a large number of connectors such as JDBC, Paimon, and Hive. Adding methods 
> to it would greatly increase the implementation complexity and, more 
> importantly, the cost of iteration, maintenance, and verification.
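A minimal sketch of the idea described above, assuming an immutable descriptor with an optional comment. Class, field, and method names here are illustrative only, not Flink's final `CatalogDescriptor` API:

```java
import java.util.Optional;

// Hypothetical, simplified stand-in for CatalogDescriptor; only the parts
// relevant to the proposed comment support are shown.
final class CatalogDescriptorSketch {
    private final String name;
    private final String comment; // nullable: absent unless COMMENT was given

    private CatalogDescriptorSketch(String name, String comment) {
        this.name = name;
        this.comment = comment;
    }

    static CatalogDescriptorSketch of(String name) {
        return new CatalogDescriptorSketch(name, null);
    }

    // Returns a new descriptor instead of mutating, keeping the class immutable.
    CatalogDescriptorSketch withComment(String comment) {
        return new CatalogDescriptorSketch(name, comment);
    }

    Optional<String> getComment() {
        return Optional.ofNullable(comment);
    }

    String getName() {
        return name;
    }
}
```

Because only the descriptor changes, implementations of the core `Catalog` interface would not need to be touched, which is the main point of the proposal.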





[jira] [Assigned] (FLINK-34916) Support `ALTER CATALOG` syntax

2024-03-24 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan reassigned FLINK-34916:
-

Assignee: Yubin Li

> Support `ALTER CATALOG` syntax
> --
>
> Key: FLINK-34916
> URL: https://issues.apache.org/jira/browse/FLINK-34916
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.20.0
>Reporter: Yubin Li
>Assignee: Yubin Li
>Priority: Major
> Attachments: image-2024-03-22-18-30-33-182.png
>
>
> Set one or more properties in the specified catalog. If a particular property 
> is already set in the catalog, override the old value with the new one.
> !image-2024-03-22-18-30-33-182.png|width=736,height=583!





[jira] [Assigned] (FLINK-34915) Support `DESCRIBE CATALOG` syntax

2024-03-24 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan reassigned FLINK-34915:
-

Assignee: Yubin Li

> Support `DESCRIBE CATALOG` syntax
> -
>
> Key: FLINK-34915
> URL: https://issues.apache.org/jira/browse/FLINK-34915
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.20.0
>Reporter: Yubin Li
>Assignee: Yubin Li
>Priority: Major
> Attachments: image-2024-03-22-18-29-00-454.png
>
>
> Describe the metadata of an existing catalog. The metadata information 
> includes the catalog’s name, type, and comment. If the optional {{EXTENDED}} 
> option is specified, catalog properties are also returned.
> NOTICE: The parser part of this syntax has been implemented in FLIP-69, but 
> it is not actually available. We can complete the syntax in this FLIP. 
> !image-2024-03-22-18-29-00-454.png|width=561,height=374!





[jira] [Commented] (FLINK-34239) Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl

2024-03-24 Thread Zhanghao Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830313#comment-17830313
 ] 

Zhanghao Chen commented on FLINK-34239:
---

[~mallikarjuna] Thanks for the PR, I'll take a look

> Introduce a deep copy method of SerializerConfig for merging with Table 
> configs in org.apache.flink.table.catalog.DataTypeFactoryImpl 
> --
>
> Key: FLINK-34239
> URL: https://issues.apache.org/jira/browse/FLINK-34239
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core
>Affects Versions: 1.19.0
>Reporter: Zhanghao Chen
>Assignee: Kumar Mallikarjuna
>Priority: Major
>  Labels: pull-request-available
>
> *Problem*
> Currently, 
> org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig
>  will create a deep-copy of the SerializerConfig and merge Table config into 
> it. However, the deep copy is done by manually calling the getter and setter 
> methods of SerializerConfig, and is prone to human error, e.g. forgetting to 
> copy a newly added field in SerializerConfig.
> *Proposal*
> Introduce a deep copy method for SerializerConfig and replace the current 
> implementation in 
> org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig.
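As a rough illustration of the proposal, a deep copy owned by the config class itself avoids manually mirroring every getter/setter at the call site. The class and field names below are invented for the sketch and are not Flink's actual SerializerConfig API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for SerializerConfig; the fields are assumptions.
class SerializerConfigSketch {
    private final Map<String, String> registeredTypes;
    private final boolean forceKryo;

    SerializerConfigSketch(Map<String, String> registeredTypes, boolean forceKryo) {
        // Defensive copy so instances never share mutable state.
        this.registeredTypes = new HashMap<>(registeredTypes);
        this.forceKryo = forceKryo;
    }

    // The deep copy lives next to the fields, so a newly added field cannot be
    // silently missed at a call site such as DataTypeFactoryImpl.
    SerializerConfigSketch copy() {
        return new SerializerConfigSketch(registeredTypes, forceKryo);
    }

    Map<String, String> getRegisteredTypes() {
        return registeredTypes;
    }

    boolean isForceKryo() {
        return forceKryo;
    }
}
```

With this shape, forgetting a field in `copy()` becomes a single-class review concern rather than something each caller has to get right.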





[jira] [Commented] (FLINK-34925) FlinkCDC 3.0 for extracting data from sqlserver, errors occur

2024-03-24 Thread Hongshun Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830310#comment-17830310
 ] 

Hongshun Wang commented on FLINK-34925:
---

Debezium 2.0 is a major version upgrade which does not guarantee compatibility. 
You can make some minor changes based on 
[DBZ-5398|https://issues.redhat.com/browse/DBZ-5398] similar to what is done in 
[https://github.com/apache/flink-cdc/pull/2842].

>  FlinkCDC 3.0 for extracting data from sqlserver,  errors occur
> ---
>
> Key: FLINK-34925
> URL: https://issues.apache.org/jira/browse/FLINK-34925
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Reporter: 宇宙先生
>Priority: Critical
> Attachments: image-2024-03-24-18-12-52-747.png, 
> image-2024-03-24-18-23-19-657.png
>
>
> When I use Flink CDC 3.0 to extract data from SQL Server, errors occur:
> !image-2024-03-24-18-12-52-747.png!
> .is referenced as PRIMARY KEY, but a matching column is not defined 
> in table.
> I found some information on Debezium's website. The official website says 
> this bug was fixed in Debezium 2.0. I checked that the Debezium version 
> currently used by Flink CDC is 1.9.7. I want to know what the cause of this 
> problem is; can I directly upgrade the Debezium version to fix it? Debezium's 
> link: [Debezium 2.0.0.Beta1 
> Released|https://debezium.io/blog/2022/07/27/debezium-2.0-beta1-released/]
>  





Re: [PR] [hotfix] Change old com.ververica dependency to flink [flink-cdc]

2024-03-24 Thread via GitHub


xleoken commented on PR #3110:
URL: https://github.com/apache/flink-cdc/pull/3110#issuecomment-2017024741

   cc @leonardBang @PatrickRen @Jiabao-Sun





[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query

2024-03-24 Thread Xingcan Cui (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingcan Cui updated FLINK-34926:

Attachment: image.png

> Adaptive auto parallelism doesn't work for a query
> --
>
> Key: FLINK-34926
> URL: https://issues.apache.org/jira/browse/FLINK-34926
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.1
>Reporter: Xingcan Cui
>Priority: Major
> Attachments: image.png
>
>
> We have the following query running in batch mode.
> {code:java}
> WITH FEATURE_INCLUSION AS (
>     SELECT
>         insertion_id, -- Not unique
>         features -- Array>
>     FROM
>         features_table
> ),
> TOTAL AS (
>     SELECT
>         COUNT(DISTINCT insertion_id) total_id
>     FROM
>         FEATURE_INCLUSION
> ),
> FEATURE_INCLUSION_COUNTS AS (
>     SELECT
>         `key`,
>         COUNT(DISTINCT insertion_id) AS id_count
>     FROM
>         FEATURE_INCLUSION,
>         UNNEST(features) as t (`key`, `value`)
>     WHERE
>         TRUE
>     GROUP BY
>         `key`
> ),
> RESULTS AS (
>     SELECT
>         `key`
>     FROM
>         FEATURE_INCLUSION_COUNTS,
>         TOTAL
>     WHERE
>        (1.0 * id_count)/total_id > 0.1
> )
> SELECT
>     JSON_ARRAYAGG(`key`) AS feature_ids,
> FROM
>     RESULTS{code}
> The parallelism adaptively set by Flink for the following operator was always 
> 1.
> {code:java}
> [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, 
> insertion_id])
> +- [38]:LocalHashAggregate(groupBy=[key], select=[key, 
> Partial_COUNT(insertion_id) AS count$0]){code}
> If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and 
> manually set `parallelism.default` to be greater than one, it worked.
> The screenshot of the full job graph is attached.  !image_720.png!





[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query

2024-03-24 Thread Xingcan Cui (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingcan Cui updated FLINK-34926:

Description: 
We have the following query running in batch mode.
{code:java}
WITH FEATURE_INCLUSION AS (
    SELECT
        insertion_id, -- Not unique
        features -- Array>
    FROM
        features_table
),
TOTAL AS (
    SELECT
        COUNT(DISTINCT insertion_id) total_id
    FROM
        FEATURE_INCLUSION
),
FEATURE_INCLUSION_COUNTS AS (
    SELECT
        `key`,
        COUNT(DISTINCT insertion_id) AS id_count
    FROM
        FEATURE_INCLUSION,
        UNNEST(features) as t (`key`, `value`)
    WHERE
        TRUE
    GROUP BY
        `key`
),
RESULTS AS (
    SELECT
        `key`
    FROM
        FEATURE_INCLUSION_COUNTS,
        TOTAL
    WHERE
       (1.0 * id_count)/total_id > 0.1
)
SELECT
    JSON_ARRAYAGG(`key`) AS feature_ids,
FROM
    RESULTS{code}
The parallelism adaptively set by Flink for the following operator was always 1.
{code:java}
[37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, 
insertion_id])
+- [38]:LocalHashAggregate(groupBy=[key], select=[key, 
Partial_COUNT(insertion_id) AS count$0]){code}
If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually 
set `parallelism.default` to be greater than one, it worked.

The screenshot of the full job graph is attached.  !image.png!

  was:
We have the following query running in batch mode.
{code:java}
WITH FEATURE_INCLUSION AS (
    SELECT
        insertion_id, -- Not unique
        features -- Array>
    FROM
        features_table
),
TOTAL AS (
    SELECT
        COUNT(DISTINCT insertion_id) total_id
    FROM
        FEATURE_INCLUSION
),
FEATURE_INCLUSION_COUNTS AS (
    SELECT
        `key`,
        COUNT(DISTINCT insertion_id) AS id_count
    FROM
        FEATURE_INCLUSION,
        UNNEST(features) as t (`key`, `value`)
    WHERE
        TRUE
    GROUP BY
        `key`
),
RESULTS AS (
    SELECT
        `key`
    FROM
        FEATURE_INCLUSION_COUNTS,
        TOTAL
    WHERE
       (1.0 * id_count)/total_id > 0.1
)
SELECT
    JSON_ARRAYAGG(`key`) AS feature_ids,
FROM
    RESULTS{code}
The parallelism adaptively set by Flink for the following operator was always 1.
{code:java}
[37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, 
insertion_id])
+- [38]:LocalHashAggregate(groupBy=[key], select=[key, 
Partial_COUNT(insertion_id) AS count$0]){code}
If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually 
set `parallelism.default` to be greater than one, it worked.

The screenshot of the full job graph is attached.  !image_720.png!


> Adaptive auto parallelism doesn't work for a query
> --
>
> Key: FLINK-34926
> URL: https://issues.apache.org/jira/browse/FLINK-34926
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.1
>Reporter: Xingcan Cui
>Priority: Major
> Attachments: image.png
>
>
> We have the following query running in batch mode.
> {code:java}
> WITH FEATURE_INCLUSION AS (
>     SELECT
>         insertion_id, -- Not unique
>         features -- Array>
>     FROM
>         features_table
> ),
> TOTAL AS (
>     SELECT
>         COUNT(DISTINCT insertion_id) total_id
>     FROM
>         FEATURE_INCLUSION
> ),
> FEATURE_INCLUSION_COUNTS AS (
>     SELECT
>         `key`,
>         COUNT(DISTINCT insertion_id) AS id_count
>     FROM
>         FEATURE_INCLUSION,
>         UNNEST(features) as t (`key`, `value`)
>     WHERE
>         TRUE
>     GROUP BY
>         `key`
> ),
> RESULTS AS (
>     SELECT
>         `key`
>     FROM
>         FEATURE_INCLUSION_COUNTS,
>         TOTAL
>     WHERE
>        (1.0 * id_count)/total_id > 0.1
> )
> SELECT
>     JSON_ARRAYAGG(`key`) AS feature_ids,
> FROM
>     RESULTS{code}
> The parallelism adaptively set by Flink for the following operator was always 
> 1.
> {code:java}
> [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, 
> insertion_id])
> +- [38]:LocalHashAggregate(groupBy=[key], select=[key, 
> Partial_COUNT(insertion_id) AS count$0]){code}
> If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and 
> manually set `parallelism.default` to be greater than one, it worked.
> The screenshot of the full job graph is attached.  !image.png!





[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query

2024-03-24 Thread Xingcan Cui (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingcan Cui updated FLINK-34926:

Attachment: (was: image_720.png)

> Adaptive auto parallelism doesn't work for a query
> --
>
> Key: FLINK-34926
> URL: https://issues.apache.org/jira/browse/FLINK-34926
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.1
>Reporter: Xingcan Cui
>Priority: Major
>
> We have the following query running in batch mode.
> {code:java}
> WITH FEATURE_INCLUSION AS (
>     SELECT
>         insertion_id, -- Not unique
>         features -- Array>
>     FROM
>         features_table
> ),
> TOTAL AS (
>     SELECT
>         COUNT(DISTINCT insertion_id) total_id
>     FROM
>         FEATURE_INCLUSION
> ),
> FEATURE_INCLUSION_COUNTS AS (
>     SELECT
>         `key`,
>         COUNT(DISTINCT insertion_id) AS id_count
>     FROM
>         FEATURE_INCLUSION,
>         UNNEST(features) as t (`key`, `value`)
>     WHERE
>         TRUE
>     GROUP BY
>         `key`
> ),
> RESULTS AS (
>     SELECT
>         `key`
>     FROM
>         FEATURE_INCLUSION_COUNTS,
>         TOTAL
>     WHERE
>        (1.0 * id_count)/total_id > 0.1
> )
> SELECT
>     JSON_ARRAYAGG(`key`) AS feature_ids,
> FROM
>     RESULTS{code}
> The parallelism adaptively set by Flink for the following operator was always 
> 1.
> {code:java}
> [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, 
> insertion_id])
> +- [38]:LocalHashAggregate(groupBy=[key], select=[key, 
> Partial_COUNT(insertion_id) AS count$0]){code}
> If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and 
> manually set `parallelism.default` to be greater than one, it worked.
> The screenshot of the full job graph is attached.  !image_720.png!





[jira] [Updated] (FLINK-34926) Adaptive auto parallelism doesn't work for a query

2024-03-24 Thread Xingcan Cui (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingcan Cui updated FLINK-34926:

Description: 
We have the following query running in batch mode.
{code:java}
WITH FEATURE_INCLUSION AS (
    SELECT
        insertion_id, -- Not unique
        features -- Array>
    FROM
        features_table
),
TOTAL AS (
    SELECT
        COUNT(DISTINCT insertion_id) total_id
    FROM
        FEATURE_INCLUSION
),
FEATURE_INCLUSION_COUNTS AS (
    SELECT
        `key`,
        COUNT(DISTINCT insertion_id) AS id_count
    FROM
        FEATURE_INCLUSION,
        UNNEST(features) as t (`key`, `value`)
    WHERE
        TRUE
    GROUP BY
        `key`
),
RESULTS AS (
    SELECT
        `key`
    FROM
        FEATURE_INCLUSION_COUNTS,
        TOTAL
    WHERE
       (1.0 * id_count)/total_id > 0.1
)
SELECT
    JSON_ARRAYAGG(`key`) AS feature_ids,
FROM
    RESULTS{code}
The parallelism adaptively set by Flink for the following operator was always 1.
{code:java}
[37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, 
insertion_id])
+- [38]:LocalHashAggregate(groupBy=[key], select=[key, 
Partial_COUNT(insertion_id) AS count$0]){code}
If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually 
set `parallelism.default` to be greater than one, it worked.

The screenshot of the full job graph is attached.  !image_720.png!

  was:
We have the following query running in batch mode.

 
{code:java}
WITH FEATURE_INCLUSION AS (
    SELECT
        insertion_id, -- Not unique
        features -- Array>
    FROM
        features_table
),
TOTAL AS (
    SELECT
        COUNT(DISTINCT insertion_id) total_id
    FROM
        FEATURE_INCLUSION
),
FEATURE_INCLUSION_COUNTS AS (
    SELECT
        `key`,
        COUNT(DISTINCT insertion_id) AS id_count
    FROM
        FEATURE_INCLUSION,
        UNNEST(features) as t (`key`, `value`)
    WHERE
        TRUE
    GROUP BY
        `key`
),
RESULTS AS (
    SELECT
        `key`
    FROM
        FEATURE_INCLUSION_COUNTS,
        TOTAL
    WHERE
       (1.0 * id_count)/total_id > 0.1
)
SELECT
    JSON_ARRAYAGG(`key`) AS feature_ids,
FROM
    RESULTS{code}
The parallelism adaptively set by Flink for the following operator was always 1.
{code:java}
[37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, 
insertion_id])
+- [38]:LocalHashAggregate(groupBy=[key], select=[key, 
Partial_COUNT(insertion_id) AS count$0]){code}
If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually 
set `parallelism.default` to be greater than one, it worked.

The screenshot of the full job graph is attached. !image_720.png!


> Adaptive auto parallelism doesn't work for a query
> --
>
> Key: FLINK-34926
> URL: https://issues.apache.org/jira/browse/FLINK-34926
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.1
>Reporter: Xingcan Cui
>Priority: Major
>
> We have the following query running in batch mode.
> {code:java}
> WITH FEATURE_INCLUSION AS (
>     SELECT
>         insertion_id, -- Not unique
>         features -- Array>
>     FROM
>         features_table
> ),
> TOTAL AS (
>     SELECT
>         COUNT(DISTINCT insertion_id) total_id
>     FROM
>         FEATURE_INCLUSION
> ),
> FEATURE_INCLUSION_COUNTS AS (
>     SELECT
>         `key`,
>         COUNT(DISTINCT insertion_id) AS id_count
>     FROM
>         FEATURE_INCLUSION,
>         UNNEST(features) as t (`key`, `value`)
>     WHERE
>         TRUE
>     GROUP BY
>         `key`
> ),
> RESULTS AS (
>     SELECT
>         `key`
>     FROM
>         FEATURE_INCLUSION_COUNTS,
>         TOTAL
>     WHERE
>        (1.0 * id_count)/total_id > 0.1
> )
> SELECT
>     JSON_ARRAYAGG(`key`) AS feature_ids,
> FROM
>     RESULTS{code}
> The parallelism adaptively set by Flink for the following operator was always 
> 1.
> {code:java}
> [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, 
> insertion_id])
> +- [38]:LocalHashAggregate(groupBy=[key], select=[key, 
> Partial_COUNT(insertion_id) AS count$0]){code}
> If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and 
> manually set `parallelism.default` to be greater than one, it worked.
> The screenshot of the full job graph is attached.  !image_720.png!





[jira] [Created] (FLINK-34926) Adaptive auto parallelism doesn't work for a query

2024-03-24 Thread Xingcan Cui (Jira)
Xingcan Cui created FLINK-34926:
---

 Summary: Adaptive auto parallelism doesn't work for a query
 Key: FLINK-34926
 URL: https://issues.apache.org/jira/browse/FLINK-34926
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.18.1
Reporter: Xingcan Cui
 Attachments: image_720.png

We have the following query running in batch mode.

 
{code:java}
WITH FEATURE_INCLUSION AS (
    SELECT
        insertion_id, -- Not unique
        features -- Array>
    FROM
        features_table
),
TOTAL AS (
    SELECT
        COUNT(DISTINCT insertion_id) total_id
    FROM
        FEATURE_INCLUSION
),
FEATURE_INCLUSION_COUNTS AS (
    SELECT
        `key`,
        COUNT(DISTINCT insertion_id) AS id_count
    FROM
        FEATURE_INCLUSION,
        UNNEST(features) as t (`key`, `value`)
    WHERE
        TRUE
    GROUP BY
        `key`
),
RESULTS AS (
    SELECT
        `key`
    FROM
        FEATURE_INCLUSION_COUNTS,
        TOTAL
    WHERE
       (1.0 * id_count)/total_id > 0.1
)
SELECT
    JSON_ARRAYAGG(`key`) AS feature_ids,
FROM
    RESULTS{code}
The parallelism adaptively set by Flink for the following operator was always 1.
{code:java}
[37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, 
insertion_id])
+- [38]:LocalHashAggregate(groupBy=[key], select=[key, 
Partial_COUNT(insertion_id) AS count$0]){code}
If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually 
set `parallelism.default` to be greater than one, it worked.

The screenshot of the full job graph is attached. !image_720.png!
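The workaround described above can be written down as plain Flink configuration (a sketch only; the parallelism value 8 is an arbitrary example, not from the report):

```yaml
# flink-conf.yaml: disable adaptive batch parallelism derivation
# and pin a fixed default parallelism instead.
execution.batch.adaptive.auto-parallelism.enabled: false
parallelism.default: 8
```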





Re: [PR] [FLINK-34712][release] Generate reference data for state migration tests based on release-1.19.0 [flink]

2024-03-24 Thread via GitHub


lincoln-lil commented on PR #24517:
URL: https://github.com/apache/flink/pull/24517#issuecomment-2016860210

   @snuyanzin sorry to ping you here, do you still remember this data 
generation during the 1.18 release (https://github.com/apache/flink/pull/23710)? 
   Was there any exception when running `generate-migration-test-data` via Maven?
   I debugged generating 
`org.apache.flink.test.checkpointing.StatefulJobSnapshotMigrationITCase` 
locally via `MigrationTestsSnapshotGenerator` with following args:
   ```
   --dir
   /Users/lilin/work/git/flink/flink-tests
   --version
   1.19
   --classes
   org.apache.flink.test.checkpointing.StatefulJobSnapshotMigrationITCase
   ```
   but it fails to load the specified class...
   Before diving into each line of the generator code, I'm asking first in 
case you have encountered similar issues before. Thanks!





Re: [PR] [BP-1.18][FLINK-34409][ci] Enables (most) tests that were disabled for the AdaptiveScheduler due to missing features [flink]

2024-03-24 Thread via GitHub


XComp commented on PR #24558:
URL: https://github.com/apache/flink/pull/24558#issuecomment-2016827477

   CI with AdaptiveScheduler was 
[successful](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=58517)





Re: [PR] [BP-1.19][FLINK-34409][ci] Enables (most) tests that were disabled for the AdaptiveScheduler due to missing features [flink]

2024-03-24 Thread via GitHub


XComp commented on PR #24557:
URL: https://github.com/apache/flink/pull/24557#issuecomment-2016827592

   CI with AdaptiveScheduler was 
[successful](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=58516)





[jira] [Commented] (FLINK-34487) Integrate tools/azure-pipelines/build-python-wheels.yml into GHA nightly workflow

2024-03-24 Thread Muhammet Orazov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830257#comment-17830257
 ] 

Muhammet Orazov commented on FLINK-34487:
-

Hello [~mapohl], could you please have a look at the attached PR? Thanks a lot 
(y) 

> Integrate tools/azure-pipelines/build-python-wheels.yml into GHA nightly 
> workflow
> -
>
> Key: FLINK-34487
> URL: https://issues.apache.org/jira/browse/FLINK-34487
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System / CI
>Affects Versions: 1.19.0, 1.18.1, 1.20.0
>Reporter: Matthias Pohl
>Assignee: Muhammet Orazov
>Priority: Major
>  Labels: github-actions, pull-request-available
>
> Analogously to the [Azure Pipelines nightly 
> config|https://github.com/apache/flink/blob/e923d4060b6dabe650a8950774d176d3e92437c2/tools/azure-pipelines/build-apache-repo.yml#L183]
>  we want to generate the wheels artifacts in the GHA nightly workflow as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33707) Verify the snapshot migration on Java17

2024-03-24 Thread Muhammet Orazov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830254#comment-17830254
 ] 

Muhammet Orazov commented on FLINK-33707:
-

Hello,

Would this affect the Flink 1.18 version?

> Verify the snapshot migration on Java17
> ---
>
> Key: FLINK-33707
> URL: https://issues.apache.org/jira/browse/FLINK-33707
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Reporter: Yun Tang
>Priority: Major
>
> This task is like FLINK-33699; I think we could introduce a 
> StatefulJobSnapshotMigrationITCase-like test to restore snapshots containing 
> Scala code.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34419][docker] Add tests for JDK 17 & 21 [flink-docker]

2024-03-24 Thread via GitHub


morazow commented on PR #182:
URL: https://github.com/apache/flink-docker/pull/182#issuecomment-2016815995

   @XComp please have a look again. If this looks good, I will backport to 
other branches accordingly, thanks 🙏 





Re: [PR] [FLINK-34419][docker] Add tests for JDK 17 & 21 [flink-docker]

2024-03-24 Thread via GitHub


morazow commented on code in PR #182:
URL: https://github.com/apache/flink-docker/pull/182#discussion_r1536816965


##
.github/workflows/ci.yml:
##
@@ -17,14 +17,22 @@ name: "CI"
 
 on: [push, pull_request]
 
+env:
+  TAR_URL: "https://s3.amazonaws.com/flink-nightly/flink-1.20-SNAPSHOT-bin-scala_2.12.tgz"
+
 jobs:
   ci:
+name: CI using JDK ${{ matrix.java_version }}
 runs-on: ubuntu-latest
+strategy:
+  fail-fast: false
+  max-parallel: 1

Review Comment:
   Thanks @XComp,
   
   I have added a loop test, and indeed everything worked; the issue didn't 
happen (CI run: 
https://github.com/morazow/flink-docker/actions/runs/8409563442/job/23027006083).
   
   I think we could safely use the matrix setup without `max-parallel` set to `1`.






[jira] [Updated] (FLINK-34925) FlinkCDC 3.0 for extracting data from sqlserver, errors occur

2024-03-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-34925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

宇宙先生 updated FLINK-34925:
-
Summary:  FlinkCDC 3.0 for extracting data from sqlserver,  errors occur  
(was: flinkCDC extracting from SQL Server reports an error)

>  FlinkCDC 3.0 for extracting data from sqlserver,  errors occur
> ---
>
> Key: FLINK-34925
> URL: https://issues.apache.org/jira/browse/FLINK-34925
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Reporter: 宇宙先生
>Priority: Critical
> Attachments: image-2024-03-24-18-12-52-747.png, 
> image-2024-03-24-18-23-19-657.png
>
>
> When I use Flink CDC 3.0 to extract data from SQL Server, errors occur:
> !image-2024-03-24-18-12-52-747.png!
> .is referenced as PRIMARY KEY, but a matching column is not defined 
> in table.
> I found some information on Debezium's website. The official website says 
> this bug was fixed in Debezium 2.0. I checked that the Debezium version 
> currently used by Flink CDC is 1.9.7. I want to know what the cause of this 
> problem is; can I directly upgrade the Debezium version to fix it? Debezium's 
> link: [Debezium 2.0.0.Beta1 
> Released|https://debezium.io/blog/2022/07/27/debezium-2.0-beta1-released/]
> !image-2024-03-24-18-23-19-657.png!
>  





[jira] [Created] (FLINK-34925) flinkCDC extracting from SQL Server reports an error

2024-03-24 Thread Jira
宇宙先生 created FLINK-34925:


 Summary: flinkCDC extracting from SQL Server reports an error
 Key: FLINK-34925
 URL: https://issues.apache.org/jira/browse/FLINK-34925
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Reporter: 宇宙先生
 Attachments: image-2024-03-24-18-12-52-747.png, 
image-2024-03-24-18-23-19-657.png

When I use Flink CDC 3.0 to extract data from SQL Server, errors occur:

!image-2024-03-24-18-12-52-747.png!

.is referenced as PRIMARY KEY, but a matching column is not defined in 
table.

I found some information on Debezium's website. The official website says this 
bug was fixed in Debezium 2.0. I checked that the Debezium version currently 
used by Flink CDC is 1.9.7. I want to know what the cause of this problem is; 
can I directly upgrade the Debezium version to fix it? Debezium's link: 
[Debezium 2.0.0.Beta1 
Released|https://debezium.io/blog/2022/07/27/debezium-2.0-beta1-released/]

!image-2024-03-24-18-23-19-657.png!

 


