[GitHub] [flink-ml] vacaly commented on a diff in pull request #192: [FLINK-30451] Add Estimator and Transformer for Swing

2023-02-08 Thread via GitHub


vacaly commented on code in PR #192:
URL: https://github.com/apache/flink-ml/pull/192#discussion_r1101092374


##
flink-ml-lib/src/main/java/org/apache/flink/ml/recommendation/swing/Swing.java:
##
@@ -0,0 +1,430 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.recommendation.swing;
+
+import org.apache.flink.api.common.state.ListState;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.common.typeinfo.Types;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.api.java.typeutils.RowTypeInfo;
+import org.apache.flink.iteration.operator.OperatorStateUtils;
+import org.apache.flink.ml.api.AlgoOperator;
+import org.apache.flink.ml.common.datastream.TableUtils;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.runtime.state.StateInitializationContext;
+import org.apache.flink.runtime.state.StateSnapshotContext;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
+import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
+import org.apache.flink.streaming.api.operators.BoundedOneInput;
+import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.table.api.internal.TableImpl;
+import org.apache.flink.table.catalog.ResolvedSchema;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Preconditions;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+
+/**
+ * An AlgoOperator which implements the Swing algorithm.
+ *
+ * <p>Swing is an item recall model. The topology of the user-item graph can usually be described
+ * as user-item-user or item-user-item, which looks like a 'swing'. For example, if both user u
+ * and user v have purchased the same commodity i, they form a relationship diagram similar to a
+ * swing. If u and v have purchased commodity j in addition to i, it is supposed that i and j are
+ * similar. The formula of Swing is
+ *
+ * <p>$$ w_{(i,j)}=\sum_{u\in U_i\cap U_j}\sum_{v\in U_i\cap U_j}
+ * {\frac{1}{{(I_u+\alpha_1)}^\beta}}*{\frac{1}{{(I_v+\alpha_1)}^\beta}}*{\frac{1}{\alpha_2+|I_u\cap I_v|}} $$
+ *
+ * <p>This implementation is based on the algorithm proposed in the paper "Large Scale Product
+ * Graph Construction for Recommendation in E-commerce" by Xiaoyong Yang, Yadong Zhu and Yi Zhang
+ * (https://arxiv.org/pdf/2010.05525.pdf).
+ */
+public class Swing implements AlgoOperator<Swing>, SwingParams<Swing> {
+    private final Map<Param<?>, Object> paramMap = new HashMap<>();
+
+    public Swing() {
+        ParamUtils.initializeMapWithDefaultValues(paramMap, this);
+    }
+
+    @Override
+    public Table[] transform(Table... inputs) {
+
+        final String userCol = getUserCol();
+        final String itemCol = getItemCol();
+        Preconditions.checkArgument(inputs.length == 1);
+        final ResolvedSchema schema = inputs[0].getResolvedSchema();
+
+        if (!(Types.LONG.equals(TableUtils.getTypeInfoByName(schema, userCol))
+                && Types.LONG.equals(TableUtils.getTypeInfoByName(schema, itemCol)))) {
+            throw new IllegalArgumentException("The types of user and item columns must be Long.");
+        }
+
+        StreamTableEnvironment tEnv =
+                (StreamTableEnvironment) ((TableImpl) inputs[0]).getTableEnvironment();
+
+        SingleOutputStreamOperator<Tuple2<Long, Long>> itemUsers =
+                tEnv.toDataStream(inputs[0])
+                        .map(
+                                row -> {
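
The Swing score above can be sanity-checked with a tiny standalone computation. Below is a minimal sketch on made-up data, not the PR's implementation; the class and parameter names are illustrative, and the double sum follows the formula literally (some Swing implementations restrict it to u != v).

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy computation of the Swing score w(i,j) from the javadoc formula above.
public class SwingScoreSketch {

    public static void main(String[] args) {
        // user -> set of purchased items (made-up data).
        Map<Long, Set<Long>> userItems = new HashMap<>();
        userItems.put(1L, new HashSet<>(Arrays.asList(10L, 11L)));
        userItems.put(2L, new HashSet<>(Arrays.asList(10L, 11L, 12L)));
        userItems.put(3L, new HashSet<>(Arrays.asList(10L, 12L)));

        // alpha1, alpha2 and beta are the smoothing parameters in the formula.
        System.out.println(swingScore(10L, 11L, userItems, 1.0, 0.3, 1.0));
    }

    static double swingScore(
            long i,
            long j,
            Map<Long, Set<Long>> userItems,
            double alpha1,
            double alpha2,
            double beta) {
        // U_i intersect U_j: users who purchased both i and j.
        List<Long> common = new ArrayList<>();
        for (Map.Entry<Long, Set<Long>> e : userItems.entrySet()) {
            if (e.getValue().contains(i) && e.getValue().contains(j)) {
                common.add(e.getKey());
            }
        }
        double w = 0.0;
        for (long u : common) {
            for (long v : common) {
                Set<Long> iu = userItems.get(u);
                Set<Long> iv = userItems.get(v);
                Set<Long> overlap = new HashSet<>(iu);
                overlap.retainAll(iv);
                // 1/(|I_u|+a1)^beta * 1/(|I_v|+a1)^beta * 1/(a2 + |I_u ∩ I_v|)
                w += Math.pow(iu.size() + alpha1, -beta)
                        * Math.pow(iv.size() + alpha1, -beta)
                        / (alpha2 + overlap.size());
            }
        }
        return w;
    }
}
{code}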

[GitHub] [flink-ml] vacaly commented on a diff in pull request #192: [FLINK-30451] Add Estimator and Transformer for Swing

2023-02-08 Thread via GitHub


vacaly commented on code in PR #192:
URL: https://github.com/apache/flink-ml/pull/192#discussion_r1101090934


##
flink-ml-lib/src/main/java/org/apache/flink/ml/recommendation/swing/Swing.java:
##
@@ -0,0 +1,430 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.recommendation.swing;
+
+import org.apache.flink.api.common.state.ListState;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.common.typeinfo.Types;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.api.java.typeutils.RowTypeInfo;
+import org.apache.flink.iteration.operator.OperatorStateUtils;
+import org.apache.flink.ml.api.AlgoOperator;
+import org.apache.flink.ml.common.datastream.TableUtils;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.runtime.state.StateInitializationContext;
+import org.apache.flink.runtime.state.StateSnapshotContext;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
+import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
+import org.apache.flink.streaming.api.operators.BoundedOneInput;
+import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.table.api.internal.TableImpl;
+import org.apache.flink.table.catalog.ResolvedSchema;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Preconditions;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+
+/**
+ * An AlgoOperator which implements the Swing algorithm.
+ *
+ * <p>Swing is an item recall model. The topology of the user-item graph can usually be described
+ * as user-item-user or item-user-item, which looks like a 'swing'. For example, if both user u
+ * and user v have purchased the same commodity i, they form a relationship diagram similar to a
+ * swing. If u and v have purchased commodity j in addition to i, it is supposed that i and j are
+ * similar. The formula of Swing is
+ *
+ * <p>$$ w_{(i,j)}=\sum_{u\in U_i\cap U_j}\sum_{v\in U_i\cap U_j}
+ * {\frac{1}{{(I_u+\alpha_1)}^\beta}}*{\frac{1}{{(I_v+\alpha_1)}^\beta}}*{\frac{1}{\alpha_2+|I_u\cap I_v|}} $$
+ *
+ * <p>This implementation is based on the algorithm proposed in the paper "Large Scale Product
+ * Graph Construction for Recommendation in E-commerce" by Xiaoyong Yang, Yadong Zhu and Yi Zhang
+ * (https://arxiv.org/pdf/2010.05525.pdf).
+ */
+public class Swing implements AlgoOperator<Swing>, SwingParams<Swing> {
+    private final Map<Param<?>, Object> paramMap = new HashMap<>();
+
+    public Swing() {
+        ParamUtils.initializeMapWithDefaultValues(paramMap, this);
+    }
+
+    @Override
+    public Table[] transform(Table... inputs) {
+
+        final String userCol = getUserCol();
+        final String itemCol = getItemCol();
+        Preconditions.checkArgument(inputs.length == 1);
+        final ResolvedSchema schema = inputs[0].getResolvedSchema();
+
+        if (!(Types.LONG.equals(TableUtils.getTypeInfoByName(schema, userCol))
+                && Types.LONG.equals(TableUtils.getTypeInfoByName(schema, itemCol)))) {
+            throw new IllegalArgumentException("The types of user and item columns must be Long.");
+        }
+
+        StreamTableEnvironment tEnv =
+                (StreamTableEnvironment) ((TableImpl) inputs[0]).getTableEnvironment();
+
+        SingleOutputStreamOperator<Tuple2<Long, Long>> itemUsers =
+                tEnv.toDataStream(inputs[0])
+                        .map(
+                                row -> {

[GitHub] [flink-connector-pulsar] syhily commented on pull request #21: [hotfix][docs] Shortcode syntax

2023-02-08 Thread via GitHub


syhily commented on PR #21:
URL: 
https://github.com/apache/flink-connector-pulsar/pull/21#issuecomment-1423773919

   Sorry for the inconvenience to the documentation build. I will check more carefully for similar errors.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30980) Support s3.signer-type for S3

2023-02-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30980:
---
Labels: pull-request-available  (was: )

> Support s3.signer-type for S3
> -
>
> Key: FLINK-30980
> URL: https://issues.apache.org/jira/browse/FLINK-30980
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.4.0
>
>
> Currently, users have to configure s3a.signing-algorithm directly; we can also 
> support an s3.signer-type configuration that maps to it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] y-bowen opened a new pull request, #507: [FLINK-30980] Support s3.signer-type for S3.

2023-02-08 Thread via GitHub


y-bowen opened a new pull request, #507:
URL: https://github.com/apache/flink-table-store/pull/507

   s3.signer-type is mirrored to s3a.signing-algorithm.
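
A minimal sketch of the kind of key mirroring the PR title describes, assuming hypothetical class and method names (the actual Table Store option wiring may differ); `fs.s3a.signing-algorithm` is the underlying Hadoop S3A key.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: mirror the user-facing s3.signer-type option onto the
// underlying Hadoop S3A key, as the PR title describes.
public class S3SignerMappingSketch {

    public static Map<String, String> mirrorSignerType(Map<String, String> options) {
        Map<String, String> hadoopConf = new HashMap<>();
        String signerType = options.get("s3.signer-type");
        if (signerType != null) {
            // Hadoop's S3A filesystem reads the signing algorithm from this key.
            hadoopConf.put("fs.s3a.signing-algorithm", signerType);
        }
        return hadoopConf;
    }
}
{code}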


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-30935) Add KafkaSerializer deserialize check when using SimpleVersionedSerializer

2023-02-08 Thread Ran Tao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685136#comment-17685136
 ] 

Ran Tao edited comment on FLINK-30935 at 2/9/23 7:49 AM:
-

Hi [~chesnay] [~becket_qin] what do u think?


was (Author: lemonjing):
Hi [~chesnay] what do u think?

> Add KafkaSerializer deserialize check when using SimpleVersionedSerializer
> --
>
> Key: FLINK-30935
> URL: https://issues.apache.org/jira/browse/FLINK-30935
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Reporter: Ran Tao
>Priority: Major
>
> {code:java}
> @Override
> public int getVersion() {
>     return CURRENT_VERSION;
> }
>
> @Override
> public KafkaPartitionSplit deserialize(int version, byte[] serialized) throws IOException {
>     try (ByteArrayInputStream bais = new ByteArrayInputStream(serialized);
>             DataInputStream in = new DataInputStream(bais)) {
>         String topic = in.readUTF();
>         int partition = in.readInt();
>         long offset = in.readLong();
>         long stoppingOffset = in.readLong();
>         return new KafkaPartitionSplit(
>                 new TopicPartition(topic, partition), offset, stoppingOffset);
>     }
> } {code}
> Currently, many of the implemented Kafka serializers do not perform a version check. I think we
> can add one, as many other connectors' implementations do, to guard against incompatible or
> corrupt state when restoring from a checkpoint.
> e.g.
> {code:java}
> @Override
> public KafkaPartitionSplit deserialize(int version, byte[] serialized) throws IOException {
>     switch (version) {
>         case 0:
>             return deserializeV0(serialized);
>         default:
>             throw new IOException("Unrecognized version or corrupt state: " + version);
>     }
> }
>
> private KafkaPartitionSplit deserializeV0(byte[] serialized) throws IOException {
>     try (ByteArrayInputStream bais = new ByteArrayInputStream(serialized);
>             DataInputStream in = new DataInputStream(bais)) {
>         String topic = in.readUTF();
>         int partition = in.readInt();
>         long offset = in.readLong();
>         long stoppingOffset = in.readLong();
>         return new KafkaPartitionSplit(
>                 new TopicPartition(topic, partition), offset, stoppingOffset);
>     }
> } {code}
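
For completeness, a sketch of the full version-aware pattern the issue proposes, with a matching serialize() whose field layout mirrors deserializeV0(). This illustrates the proposal, not the connector's actual serializer class; the KafkaPartitionSplit accessors are taken from flink-connector-kafka but should be treated as assumptions.

{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

import org.apache.flink.connector.kafka.source.split.KafkaPartitionSplit;
import org.apache.flink.core.io.SimpleVersionedSerializer;
import org.apache.kafka.common.TopicPartition;

// Illustrative version-aware serializer following the pattern proposed above.
// The serialize() layout mirrors the fields read back in deserializeV0().
public class VersionedSplitSerializerSketch
        implements SimpleVersionedSerializer<KafkaPartitionSplit> {

    private static final int CURRENT_VERSION = 0;

    @Override
    public int getVersion() {
        return CURRENT_VERSION;
    }

    @Override
    public byte[] serialize(KafkaPartitionSplit split) throws IOException {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
                DataOutputStream out = new DataOutputStream(baos)) {
            out.writeUTF(split.getTopic());
            out.writeInt(split.getPartition());
            out.writeLong(split.getStartingOffset());
            out.writeLong(
                    split.getStoppingOffset().orElse(KafkaPartitionSplit.NO_STOPPING_OFFSET));
            out.flush();
            return baos.toByteArray();
        }
    }

    @Override
    public KafkaPartitionSplit deserialize(int version, byte[] serialized) throws IOException {
        // Reject unknown versions instead of silently misreading the bytes.
        switch (version) {
            case 0:
                return deserializeV0(serialized);
            default:
                throw new IOException("Unrecognized version or corrupt state: " + version);
        }
    }

    private KafkaPartitionSplit deserializeV0(byte[] serialized) throws IOException {
        try (ByteArrayInputStream bais = new ByteArrayInputStream(serialized);
                DataInputStream in = new DataInputStream(bais)) {
            String topic = in.readUTF();
            int partition = in.readInt();
            long offset = in.readLong();
            long stoppingOffset = in.readLong();
            return new KafkaPartitionSplit(
                    new TopicPartition(topic, partition), offset, stoppingOffset);
        }
    }
}
{code}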



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30935) Add KafkaSerializer deserialize check when using SimpleVersionedSerializer

2023-02-08 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated FLINK-30935:

Description: 
{code:java}
@Override
public int getVersion() {
    return CURRENT_VERSION;
}

@Override
public KafkaPartitionSplit deserialize(int version, byte[] serialized) throws IOException {
    try (ByteArrayInputStream bais = new ByteArrayInputStream(serialized);
            DataInputStream in = new DataInputStream(bais)) {
        String topic = in.readUTF();
        int partition = in.readInt();
        long offset = in.readLong();
        long stoppingOffset = in.readLong();
        return new KafkaPartitionSplit(
                new TopicPartition(topic, partition), offset, stoppingOffset);
    }
} {code}
Currently, many of the implemented Kafka serializers do not perform a version check. I think we
can add one, as many other connectors' implementations do, to guard against incompatible or
corrupt state when restoring from a checkpoint.

e.g.
{code:java}
@Override
public KafkaPartitionSplit deserialize(int version, byte[] serialized) throws IOException {
    switch (version) {
        case 0:
            return deserializeV0(serialized);
        default:
            throw new IOException("Unrecognized version or corrupt state: " + version);
    }
}

private KafkaPartitionSplit deserializeV0(byte[] serialized) throws IOException {
    try (ByteArrayInputStream bais = new ByteArrayInputStream(serialized);
            DataInputStream in = new DataInputStream(bais)) {
        String topic = in.readUTF();
        int partition = in.readInt();
        long offset = in.readLong();
        long stoppingOffset = in.readLong();
        return new KafkaPartitionSplit(
                new TopicPartition(topic, partition), offset, stoppingOffset);
    }
} {code}

  was:
{code:java}
@Override
public int getVersion() {
    return CURRENT_VERSION;
}

@Override
public KafkaPartitionSplit deserialize(int version, byte[] serialized) throws IOException {
    try (ByteArrayInputStream bais = new ByteArrayInputStream(serialized);
            DataInputStream in = new DataInputStream(bais)) {
        String topic = in.readUTF();
        int partition = in.readInt();
        long offset = in.readLong();
        long stoppingOffset = in.readLong();
        return new KafkaPartitionSplit(
                new TopicPartition(topic, partition), offset, stoppingOffset);
    }
} {code}
Currently, many of the implemented Kafka serializers do not perform a version check. I think we
can add one, as many other connectors do, to guard against incompatible or corrupt state when
restoring from a checkpoint.


> Add KafkaSerializer deserialize check when using SimpleVersionedSerializer
> --
>
> Key: FLINK-30935
> URL: https://issues.apache.org/jira/browse/FLINK-30935
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Reporter: Ran Tao
>Priority: Major
>
> {code:java}
> @Override
> public int getVersion() {
>     return CURRENT_VERSION;
> }
>
> @Override
> public KafkaPartitionSplit deserialize(int version, byte[] serialized) throws IOException {
>     try (ByteArrayInputStream bais = new ByteArrayInputStream(serialized);
>             DataInputStream in = new DataInputStream(bais)) {
>         String topic = in.readUTF();
>         int partition = in.readInt();
>         long offset = in.readLong();
>         long stoppingOffset = in.readLong();
>         return new KafkaPartitionSplit(
>                 new TopicPartition(topic, partition), offset, stoppingOffset);
>     }
> } {code}
> Currently, many of the implemented Kafka serializers do not perform a version check. I think we
> can add one, as many other connectors' implementations do, to guard against incompatible or
> corrupt state when restoring from a checkpoint.
> e.g.
> {code:java}
> @Override
> public KafkaPartitionSplit deserialize(int version, byte[] serialized) throws IOException {
>     switch (version) {
>         case 0:
>             return deserializeV0(serialized);
>         default:
>             throw new IOException("Unrecognized version or corrupt state: " + version);
>     }
> }
>
> private KafkaPartitionSplit deserializeV0(byte[] serialized) throws IOException {
>     try (ByteArrayInputStream bais = new ByteArrayInputStream(serialized);
>             DataInputStream in = new DataInputStream(bais)) {
>         String topic = in.readUTF();
>         int partition = in.readInt();
>         long offset = in.readLong();
>         long stoppingOffset = in.readLong();
>         return new KafkaPartitionSplit(
>                 new TopicPartition(topic, partition), offset, stoppingOffset);
>     }
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] zoltar9264 commented on a diff in pull request #21812: [FLINK-28440][state/changelog] record reference count of StreamStateHandle in TaskChangelogRegistry

2023-02-08 Thread via GitHub


zoltar9264 commented on code in PR #21812:
URL: https://github.com/apache/flink/pull/21812#discussion_r1101082127


##
flink-dstl/flink-dstl-dfs/src/test/java/org/apache/flink/changelog/fs/FsStateChangelogWriterTest.java:
##
@@ -106,6 +109,143 @@ void testPersistAgain() throws Exception {
 });
 }
 
+@Test
+void testChangelogFileAvailable() throws Exception {

Review Comment:
   Thanks @fredia, I think we can verify all changelog-available cases in this PR, so I will 
rename it to testFileAvailableAfterPreUpload rather than move it to 
https://github.com/apache/flink/pull/21895 .



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30935) Add KafkaSerializer deserialize check when using SimpleVersionedSerializer

2023-02-08 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated FLINK-30935:

Description: 
{code:java}
@Override
public int getVersion() {
    return CURRENT_VERSION;
}

@Override
public KafkaPartitionSplit deserialize(int version, byte[] serialized) throws IOException {
    try (ByteArrayInputStream bais = new ByteArrayInputStream(serialized);
            DataInputStream in = new DataInputStream(bais)) {
        String topic = in.readUTF();
        int partition = in.readInt();
        long offset = in.readLong();
        long stoppingOffset = in.readLong();
        return new KafkaPartitionSplit(
                new TopicPartition(topic, partition), offset, stoppingOffset);
    }
} {code}
Currently, many of the implemented Kafka serializers do not perform a version check. I think we
can add one, as many other connectors do, to guard against incompatible or corrupt state when
restoring from a checkpoint.

  was:
{code:java}
@Override
public int getVersion() {
    return CURRENT_VERSION;
}

@Override
public KafkaPartitionSplit deserialize(int version, byte[] serialized) throws IOException {
    try (ByteArrayInputStream bais = new ByteArrayInputStream(serialized);
            DataInputStream in = new DataInputStream(bais)) {
        String topic = in.readUTF();
        int partition = in.readInt();
        long offset = in.readLong();
        long stoppingOffset = in.readLong();
        return new KafkaPartitionSplit(
                new TopicPartition(topic, partition), offset, stoppingOffset);
    }
} {code}
Currently, many of the implemented Kafka serializers do not perform a version check. I think we
can add one, as many other connectors do, to guard against incompatible or corrupt state when
restoring from a checkpoint.


> Add KafkaSerializer deserialize check when using SimpleVersionedSerializer
> --
>
> Key: FLINK-30935
> URL: https://issues.apache.org/jira/browse/FLINK-30935
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Reporter: Ran Tao
>Priority: Major
>
> {code:java}
> @Override
> public int getVersion() {
>     return CURRENT_VERSION;
> }
>
> @Override
> public KafkaPartitionSplit deserialize(int version, byte[] serialized) throws IOException {
>     try (ByteArrayInputStream bais = new ByteArrayInputStream(serialized);
>             DataInputStream in = new DataInputStream(bais)) {
>         String topic = in.readUTF();
>         int partition = in.readInt();
>         long offset = in.readLong();
>         long stoppingOffset = in.readLong();
>         return new KafkaPartitionSplit(
>                 new TopicPartition(topic, partition), offset, stoppingOffset);
>     }
> } {code}
> Currently, many of the implemented Kafka serializers do not perform a version check. I think we
> can add one, as many other connectors do, to guard against incompatible or corrupt state when
> restoring from a checkpoint.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] flinkbot commented on pull request #21903: [FLINK-30981][Python] Fix explain_sql throws method not exist

2023-02-08 Thread via GitHub


flinkbot commented on PR #21903:
URL: https://github.com/apache/flink/pull/21903#issuecomment-1423766364

   
   ## CI report:
   
   * 343db152d586bd0e278340f90a6e804e4432439a UNKNOWN
   
   Bot commands:
   The @flinkbot bot supports the following commands:
   
   - `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-30972) E2e tests always fail in phase "Prepare E2E run"

2023-02-08 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686240#comment-17686240
 ] 

Martijn Visser commented on FLINK-30972:


[~renqs] This always happens when a new version of OpenSSL has been released; the fix should be easy.

> E2e tests always fail in phase "Prepare E2E run"
> 
>
> Key: FLINK-30972
> URL: https://issues.apache.org/jira/browse/FLINK-30972
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI, Tests
>Affects Versions: 1.17.0
>Reporter: Lijie Wang
>Assignee: Qingsheng Ren
>Priority: Blocker
>  Labels: test-stability
>
> {code:java}
> Installing required software
> Reading package lists...
> Building dependency tree...
> Reading state information...
> bc is already the newest version (1.07.1-2build1).
> bc set to manually installed.
> libapr1 is already the newest version (1.6.5-1ubuntu1).
> libapr1 set to manually installed.
> 0 upgraded, 0 newly installed, 0 to remove and 13 not upgraded.
> --2023-02-09 04:38:47--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 185.125.190.36, 185.125.190.39, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2023-02-09 04:38:47 ERROR 404: Not Found.
> WARNING: apt does not have a stable CLI interface. Use with caution in 
> scripts.
> Reading package lists...
> E: Unsupported file ./libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb given on 
> commandline
> ##[error]Bash exited with code '100'.
> Finishing: Prepare E2E run
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] TanYuxin-tyx commented on a diff in pull request #21843: [FLINK-30712][network] Update documents for taskmanager memory configurations and tuning

2023-02-08 Thread via GitHub


TanYuxin-tyx commented on code in PR #21843:
URL: https://github.com/apache/flink/pull/21843#discussion_r1101079804


##
docs/content/docs/deployment/memory/network_mem_tuning.md:
##
@@ -97,20 +97,17 @@ The actual value of parallelism from which the problem occurs is various from jo
 ## Network buffer lifecycle
 
 Flink has several local buffer pools - one for the output stream and one for each input gate. 
-Each of those pools is limited to at most 
+The upper limit of the size of each buffer pool is called the buffer pool **Target**, which is calculated by the following formula.
 
 `#channels * taskmanager.network.memory.buffers-per-channel + taskmanager.network.memory.floating-buffers-per-gate`
 
 The size of the buffer can be configured by setting `taskmanager.memory.segment-size`.
 
 ### Input network buffers
 
-Buffers in the input channel are divided into exclusive and floating buffers. Exclusive buffers can be used by only one particular channel. A channel can request additional floating buffers from a buffer pool shared across all channels belonging to the given input gate. The remaining floating buffers are optional and are acquired only if there are enough resources available.
+Not all buffers in the buffer pool Target can eventually be obtained. A **Threshold** is introduced to divide the buffer pool Target into two parts. The part below the threshold is called required; the buffers in the excess part, if any, are optional. A task will fail if the required buffers cannot be obtained at runtime. A task will not fail because optional buffers cannot be obtained, but it may suffer a performance reduction. If not explicitly configured, the default value of the threshold is Integer.MAX_VALUE for streaming workloads, and 1000 for batch workloads.

Review Comment:
   Ok, Fixed.
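
As a quick back-of-envelope illustration of the Target formula quoted in the diff above (the channel count is an assumption for the example; the defaults noted in the comments reflect Flink's documented defaults):

{code:java}
// Back-of-envelope computation of the buffer pool Target described above:
// Target = #channels * buffers-per-channel + floating-buffers-per-gate.
public class BufferTargetSketch {
    public static void main(String[] args) {
        int channels = 100;            // assumed number of input channels for the gate
        int buffersPerChannel = 2;     // taskmanager.network.memory.buffers-per-channel (default 2)
        int floatingPerGate = 8;       // taskmanager.network.memory.floating-buffers-per-gate (default 8)
        int segmentSizeKb = 32;        // taskmanager.memory.segment-size (default 32kb)

        int target = channels * buffersPerChannel + floatingPerGate;
        System.out.println("Target = " + target + " buffers = "
                + target * segmentSizeKb + " KB of network memory");
    }
}
{code}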



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] TanYuxin-tyx commented on a diff in pull request #21843: [FLINK-30712][network] Update documents for taskmanager memory configurations and tuning

2023-02-08 Thread via GitHub


TanYuxin-tyx commented on code in PR #21843:
URL: https://github.com/apache/flink/pull/21843#discussion_r1101079612


##
docs/content/docs/deployment/memory/network_mem_tuning.md:
##
@@ -97,20 +97,17 @@ The actual value of parallelism from which the problem occurs is various from jo
 ## Network buffer lifecycle
 
 Flink has several local buffer pools - one for the output stream and one for each input gate. 
-Each of those pools is limited to at most 
+The upper limit of the size of each buffer pool is called the buffer pool **Target**, which is calculated by the following formula.

Review Comment:
   Ok, thanks for rewriting the doc. Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] shuiqiangchen commented on pull request #21897: [FLINK-30922][table-planner] Apply persisted columns when doing appendPartitionAndNu…

2023-02-08 Thread via GitHub


shuiqiangchen commented on PR #21897:
URL: https://github.com/apache/flink/pull/21897#issuecomment-1423763350

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-connector-pulsar] xintongsong commented on pull request #21: [hotfix][docs] Shortcode syntax

2023-02-08 Thread via GitHub


xintongsong commented on PR #21:
URL: 
https://github.com/apache/flink-connector-pulsar/pull/21#issuecomment-1423762975

   I'd be fine with either way as long as it fixes the documentation build.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30977) flink tumbling window stream converting to pandas dataframe not work

2023-02-08 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser updated FLINK-30977:
---
Component/s: API / Python

> flink tumbling window stream converting to pandas dataframe not work
> 
>
> Key: FLINK-30977
> URL: https://issues.apache.org/jira/browse/FLINK-30977
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
> Environment: pyflink1.15.2
>Reporter: Joekwal
>Priority: Major
>
> I want to know whether a tumbling window result is supported for conversion to pandas?
> {code:java}
> code... #create env
> kafka_src = """
> CREATE TABLE if not exists `kafka_src` (
> ...
> `event_time` as CAST(`end_time` as TIMESTAMP(3)),
> WATERMARK FOR event_time as event_time - INTERVAL '5' SECOND
> )
> with (
> 'connector' = 'kafka',
> 'topic' = 'topic',
> 'properties.bootstrap.servers' = '***',
> 'properties.group.id' = '***',
> 'scan.startup.mode' = 'earliest-offset',
> 'value.format' = 'debezium-json'
> );
> """  
>   
> t_env.execute_sql(kafka_src)
> table = st_env.sql_query("SELECT columns,`event_time`  \
>     FROM TABLE(TUMBLE(TABLE table_name, DESCRIPTOR(event_time), INTERVAL '1' MINUTES))")
> table.execute().print()  #could print the result
> df = table.to_pandas()
> #schema is correct!
> schema = DataTypes.ROW([DataTypes.FIELD("column1", DataTypes.STRING()),
>                         ...
>                             ])
> table = st_env.from_pandas(df,schema=schema)
> st_env.create_temporary_view("view_table",table)
> st_env.sql_query("select * from view_table").execute().print() # Not working! Can't print the result {code}
> A tumbling window stream from a Kafka source is converted to a pandas dataframe, and the result
> can't be printed, even though the schema is right. I have tested another job using a bounded
> stream from a JDBC source, and it can print the result. The only difference is the input stream.
> As the docs mention, bounded streams are supported for conversion to pandas. So what could have
> gone wrong?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-ml] vacaly commented on a diff in pull request #192: [FLINK-30451] Add Estimator and Transformer for Swing

2023-02-08 Thread via GitHub


vacaly commented on code in PR #192:
URL: https://github.com/apache/flink-ml/pull/192#discussion_r1101074992


##
flink-ml-lib/src/main/java/org/apache/flink/ml/recommendation/swing/Swing.java:
##
@@ -0,0 +1,430 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.recommendation.swing;
+
+import org.apache.flink.api.common.state.ListState;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.common.typeinfo.Types;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.api.java.typeutils.RowTypeInfo;
+import org.apache.flink.iteration.operator.OperatorStateUtils;
+import org.apache.flink.ml.api.AlgoOperator;
+import org.apache.flink.ml.common.datastream.TableUtils;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.runtime.state.StateInitializationContext;
+import org.apache.flink.runtime.state.StateSnapshotContext;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
+import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
+import org.apache.flink.streaming.api.operators.BoundedOneInput;
+import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.table.api.internal.TableImpl;
+import org.apache.flink.table.catalog.ResolvedSchema;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Preconditions;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+
+/**
+ * An AlgoOperator which implements the Swing algorithm.
+ *
+ * <p>Swing is an item recall model. The topology of the user-item graph can usually be described
+ * as user-item-user or item-user-item, which looks like a 'swing'. For example, if both user u
+ * and user v have purchased the same commodity i, they form a relationship diagram similar to a
+ * swing. If u and v have purchased commodity j in addition to i, it is supposed that i and j are
+ * similar. The formula of Swing is
+ *
+ * <p>$$ w_{(i,j)}=\sum_{u\in U_i\cap U_j}\sum_{v\in U_i\cap U_j}
+ * {\frac{1}{{(I_u+\alpha_1)}^\beta}}*{\frac{1}{{(I_v+\alpha_1)}^\beta}}*{\frac{1}{\alpha_2+|I_u\cap I_v|}} $$
+ *
+ * <p>This implementation is based on the algorithm proposed in the paper "Large Scale Product
+ * Graph Construction for Recommendation in E-commerce" by Xiaoyong Yang, Yadong Zhu and Yi Zhang
+ * (https://arxiv.org/pdf/2010.05525.pdf).
+ */
+public class Swing implements AlgoOperator<Swing>, SwingParams<Swing> {
+    private final Map<Param<?>, Object> paramMap = new HashMap<>();
+
+    public Swing() {
+        ParamUtils.initializeMapWithDefaultValues(paramMap, this);
+    }
+
+    @Override
+    public Table[] transform(Table... inputs) {
+
+        final String userCol = getUserCol();
+        final String itemCol = getItemCol();
+        Preconditions.checkArgument(inputs.length == 1);
+        final ResolvedSchema schema = inputs[0].getResolvedSchema();
+
+        if (!(Types.LONG.equals(TableUtils.getTypeInfoByName(schema, userCol))
+                && Types.LONG.equals(TableUtils.getTypeInfoByName(schema, itemCol)))) {
+            throw new IllegalArgumentException("The types of user and item columns must be Long.");
+        }
+
+        StreamTableEnvironment tEnv =
+                (StreamTableEnvironment) ((TableImpl) inputs[0]).getTableEnvironment();
+
+        SingleOutputStreamOperator<Tuple2<Long, Long>> itemUsers =
+                tEnv.toDataStream(inputs[0])
+                        .map(
+                                row -> {

[jira] [Updated] (FLINK-30981) explain_sql throws java method not exist

2023-02-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30981:
---
Labels: pull-request-available  (was: )

> explain_sql throws java method not exist
> 
>
> Key: FLINK-30981
> URL: https://issues.apache.org/jira/browse/FLINK-30981
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.17.0
>Reporter: Juntao Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.18.0
>
>
> Executing `t_env.explainSql("ANY VALID SQL")` will throw an error:
> {code:java}
> Traceback (most recent call last):
>   File "ISSUE/FLINK-25622.py", line 42, in <module>
>     main()
>   File "ISSUE/FLINK-25622.py", line 34, in main
>     print(t_env.explain_sql(
>   File 
> "/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/table/table_environment.py",
>  line 799, in explain_sql
>     return self._j_tenv.explainSql(stmt, j_extra_details)
>   File 
> "/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/java_gateway.py",
>  line 1322, in __call__
>     return_value = get_return_value(
>   File 
> "/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/util/exceptions.py",
>  line 146, in deco
>     return f(*a, **kw)
>   File 
> "/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/protocol.py",
>  line 330, in get_return_value
>     raise Py4JError(
> py4j.protocol.Py4JError: An error occurred while calling o11.explainSql. 
> Trace:
> org.apache.flink.api.python.shaded.py4j.Py4JException: Method 
> explainSql([class java.lang.String, class 
> [Lorg.apache.flink.table.api.ExplainDetail;]) does not exist
>     at 
> org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:321)
>     at 
> org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:329)
>     at 
> org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:274)
>     at 
> org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>     at 
> org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
>     at 
> org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
>     at java.base/java.lang.Thread.run(Thread.java:829) {code}
> [30668|https://issues.apache.org/jira/browse/FLINK-30668] changed 
> TableEnvironment#explainSql to an interface default method, while neither 
> TableEnvironmentInternal nor TableEnvironmentImpl overrides it; this 
> triggers a bug in py4j, see [https://github.com/py4j/py4j/issues/506].
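
As a reference for the fix direction, a simplified sketch of why re-declaring the method on the concrete class helps: py4j's reflection can miss methods that exist only as interface defaults (py4j issue 506), so the implementation overrides the method and delegates back to the default. All names below are simplified stand-ins, not the actual Flink classes.

{code:java}
// Simplified sketch: py4j's reflection (see py4j issue 506) can fail to find a
// method that exists only as an interface default, so the concrete class
// re-declares it and delegates to the default implementation.
interface TableEnvironmentSketch {
    default String explainSql(String statement, ExplainDetailSketch... extraDetails) {
        return "plan for: " + statement;
    }
}

enum ExplainDetailSketch {
    ESTIMATED_COST,
    CHANGELOG_MODE
}

class TableEnvironmentImplSketch implements TableEnvironmentSketch {
    @Override
    public String explainSql(String statement, ExplainDetailSketch... extraDetails) {
        // Delegating to the interface default keeps behavior identical while
        // making the method visible on the implementation class itself.
        return TableEnvironmentSketch.super.explainSql(statement, extraDetails);
    }
}
{code}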



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] Vancior opened a new pull request, #21903: [FLINK-30981][Python] Fix explain_sql throws method not exist

2023-02-08 Thread via GitHub


Vancior opened a new pull request, #21903:
URL: https://github.com/apache/flink/pull/21903

   
   ## What is the purpose of the change
   
   This PR fixes the issue that calling `TableEnvironment#explain_sql` with any valid SQL 
string throws `Method explainSql([class java.lang.String, class 
[Lorg.apache.flink.table.api.ExplainDetail;]) does not exist`.
   
   
   ## Verifying this change
   
   
   This change added tests and can be verified as follows:
   - unit test `test_explain_sql` and `test_explain_sql_extended` in 
test_table_environment_api.py
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30981) explain_sql throws java method not exist

2023-02-08 Thread Juntao Hu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juntao Hu updated FLINK-30981:
--
Description: 
Executing `t_env.explainSql("ANY VALID SQL")` will throw an error:
{code:java}
Traceback (most recent call last):
  File "ISSUE/FLINK-25622.py", line 42, in <module>
    main()
  File "ISSUE/FLINK-25622.py", line 34, in main
    print(t_env.explain_sql(
  File 
"/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/table/table_environment.py",
 line 799, in explain_sql
    return self._j_tenv.explainSql(stmt, j_extra_details)
  File 
"/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/java_gateway.py",
 line 1322, in __call__
    return_value = get_return_value(
  File 
"/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/util/exceptions.py",
 line 146, in deco
    return f(*a, **kw)
  File 
"/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/protocol.py",
 line 330, in get_return_value
    raise Py4JError(
py4j.protocol.Py4JError: An error occurred while calling o11.explainSql. Trace:
org.apache.flink.api.python.shaded.py4j.Py4JException: Method explainSql([class 
java.lang.String, class [Lorg.apache.flink.table.api.ExplainDetail;]) does not 
exist
    at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:321)
    at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:329)
    at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:274)
    at 
org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at 
org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
    at 
org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.base/java.lang.Thread.run(Thread.java:829) {code}
[30668|https://issues.apache.org/jira/browse/FLINK-30668] changed 
TableEnvironment#explainSql to an interface default method, while neither 
TableEnvironmentInternal nor TableEnvironmentImpl overrides it; this 
triggers a bug in py4j, see [https://github.com/py4j/py4j/issues/506].

  was:
Executing `t_env.explainSql("ANY VALID SQL")` will throw an error:
{code:java}
Traceback (most recent call last):
  File "ISSUE/FLINK-25622.py", line 42, in <module>
    main()
  File "ISSUE/FLINK-25622.py", line 34, in main
    print(t_env.explain_sql(
  File 
"/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/table/table_environment.py",
 line 799, in explain_sql
    return self._j_tenv.explainSql(stmt, j_extra_details)
  File 
"/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/java_gateway.py",
 line 1322, in __call__
    return_value = get_return_value(
  File 
"/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/util/exceptions.py",
 line 146, in deco
    return f(*a, **kw)
  File 
"/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/protocol.py",
 line 330, in get_return_value
    raise Py4JError(
py4j.protocol.Py4JError: An error occurred while calling o11.explainSql. Trace:
org.apache.flink.api.python.shaded.py4j.Py4JException: Method explainSql([class 
java.lang.String, class [Lorg.apache.flink.table.api.ExplainDetail;]) does not 
exist
    at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:321)
    at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:329)
    at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:274)
    at 
org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at 
org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
    at 
org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.base/java.lang.Thread.run(Thread.java:829) {code}
[#30668] changed TableEnvironment#explainSql to an interface default method, 
while neither TableEnvironmentInternal nor TableEnvironmentImpl overrides it; 
this triggers a bug in py4j, see [https://github.com/py4j/py4j/issues/506].


> explain_sql throws java method not exist
> 
>
> Key: FLINK-30981
> URL: https://issues.apache.org/jira/browse/FLINK-30981
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.17.0
>Reporter: Juntao Hu
>Priority: Major
> Fix For: 1.17.0, 1.18.0
>
>
> Executing `t_env.explainSql("ANY VALID SQL")` will throw an error:
> {code:java}
> Traceback (most recent call last):
>   File "ISSUE/FLINK-25622.py", line 42, in <module>
>     main()
>   File "ISSUE/FLINK-25622.py", line 34, in main
>     p

[jira] [Updated] (FLINK-30981) explain_sql throws java method not exist

2023-02-08 Thread Juntao Hu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juntao Hu updated FLINK-30981:
--
Description: 
Executing `t_env.explainSql("ANY VALID SQL")` throws an error:
{code:java}
Traceback (most recent call last):
  File "ISSUE/FLINK-25622.py", line 42, in 
    main()
  File "ISSUE/FLINK-25622.py", line 34, in main
    print(t_env.explain_sql(
  File 
"/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/table/table_environment.py",
 line 799, in explain_sql
    return self._j_tenv.explainSql(stmt, j_extra_details)
  File 
"/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/java_gateway.py",
 line 1322, in __call__
    return_value = get_return_value(
  File 
"/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/util/exceptions.py",
 line 146, in deco
    return f(*a, **kw)
  File 
"/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/protocol.py",
 line 330, in get_return_value
    raise Py4JError(
py4j.protocol.Py4JError: An error occurred while calling o11.explainSql. Trace:
org.apache.flink.api.python.shaded.py4j.Py4JException: Method explainSql([class 
java.lang.String, class [Lorg.apache.flink.table.api.ExplainDetail;]) does not 
exist
    at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:321)
    at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:329)
    at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:274)
    at 
org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at 
org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
    at 
org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.base/java.lang.Thread.run(Thread.java:829) {code}
[#30668] changed TableEnvironment#explainSql to an interface default method, but 
neither TableEnvironmentInternal nor TableEnvironmentImpl overrides it, which 
triggers a bug in py4j; see [https://github.com/py4j/py4j/issues/506].

  was:
Executing `t_env.explainSql("ANY VALID SQL")` throws an error:
{code:java}
Traceback (most recent call last):
  File "ISSUE/FLINK-25622.py", line 42, in 
    main()
  File "ISSUE/FLINK-25622.py", line 34, in main
    print(t_env.explain_sql(
  File 
"/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/table/table_environment.py",
 line 799, in explain_sql
    return self._j_tenv.explainSql(stmt, j_extra_details)
  File 
"/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/java_gateway.py",
 line 1322, in __call__
    return_value = get_return_value(
  File 
"/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/util/exceptions.py",
 line 146, in deco
    return f(*a, **kw)
  File 
"/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/protocol.py",
 line 330, in get_return_value
    raise Py4JError(
py4j.protocol.Py4JError: An error occurred while calling o11.explainSql. Trace:
org.apache.flink.api.python.shaded.py4j.Py4JException: Method explainSql([class 
java.lang.String, class [Lorg.apache.flink.table.api.ExplainDetail;]) does not 
exist
    at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:321)
    at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:329)
    at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:274)
    at 
org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at 
org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
    at 
org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.base/java.lang.Thread.run(Thread.java:829) {code}
#30668 changed TableEnvironment#explainSql to an interface default method, but 
neither TableEnvironmentInternal nor TableEnvironmentImpl overrides it, which 
triggers a bug in py4j; see [https://github.com/py4j/py4j/issues/506].


> explain_sql throws java method not exist
> 
>
> Key: FLINK-30981
> URL: https://issues.apache.org/jira/browse/FLINK-30981
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.17.0
>Reporter: Juntao Hu
>Priority: Major
> Fix For: 1.17.0, 1.18.0
>
>
> Executing `t_env.explainSql("ANY VALID SQL")` throws an error:
> {code:java}
> Traceback (most recent call last):
>   File "ISSUE/FLINK-25622.py", line 42, in 
>     main()
>   File "ISSUE/FLINK-25622.py", line 34, in main
>     print(t_env.explain_sql(
>   File 
> "/Users/vancior/

[jira] [Created] (FLINK-30982) Support checkpoint mechanism in GBT

2023-02-08 Thread Fan Hong (Jira)
Fan Hong created FLINK-30982:


 Summary: Support checkpoint mechanism in GBT
 Key: FLINK-30982
 URL: https://issues.apache.org/jira/browse/FLINK-30982
 Project: Flink
  Issue Type: Sub-task
  Components: Library / Machine Learning
Reporter: Fan Hong






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30953) Add estimator and transformer for GBTClassifier

2023-02-08 Thread Fan Hong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fan Hong updated FLINK-30953:
-
Summary: Add estimator and transformer for GBTClassifier  (was: Support 
checkpoint mechanism and model save/load)

> Add estimator and transformer for GBTClassifier
> ---
>
> Key: FLINK-30953
> URL: https://issues.apache.org/jira/browse/FLINK-30953
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Machine Learning
>Reporter: Fan Hong
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-30396) sql hint 'LOOKUP' which is defined in outer query block may take effect in inner query block

2023-02-08 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee reassigned FLINK-30396:
---

Assignee: Jianhui Dong

> sql hint 'LOOKUP' which is defined in outer query block may take effect in 
> inner query block
> 
>
> Key: FLINK-30396
> URL: https://issues.apache.org/jira/browse/FLINK-30396
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.16.0
>Reporter: Jianhui Dong
>Assignee: Jianhui Dong
>Priority: Major
>  Labels: pull-request-available
>
> As [flink 
> doc|https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/hints/#query-hints]
>  said:
> > {{Query hints}} can be used to suggest the optimizer to affect query 
> > execution plan within a specified query scope. Their effective scope is 
> > current {{{}Query block{}}}([What are query blocks 
> > ?|https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/hints/#what-are-query-blocks-])
> >  which {{Query Hints}} are specified.
> But the sql hint 'LOOKUP' behaves differently like the demo following:
> {code:java}
> -- DDL
> CREATE TABLE left_table (
>     lid INTEGER,
>     lname VARCHAR,
>     pts AS PROCTIME()
> ) WITH (
>     'connector' = 'filesystem',
> 'format' = 'csv',
> 'path'='xxx'
> ) 
> CREATE TABLE dim_table (
>     id INTEGER,
>     name VARCHAR,
>     mentor VARCHAR,
>     gender VARCHAR
> ) WITH (
>     'connector' = 'jdbc',
> 'url' = 'xxx',
> 'table-name' = 'dim1',
> 'username' = 'xxx',
> 'password' = 'xxx',
> 'driver'= 'com.mysql.cj.jdbc.Driver' 
> )
> -- DML
> SELECT /*+ LOOKUP('table'='outer') */
>     ll.id AS lid,
>     ll.name,
>     r.mentor,
>     r.gender
> FROM (
>     SELECT /*+ LOOKUP('table'='inner') */
>     l.lid AS id,
>     l.lname AS name,
>     r.mentor,
>     r.gender,
>     l.pts
>     FROM left_table AS l
> JOIN dim_table FOR SYSTEM_TIME AS OF l.pts AS r
> ON l.lname = r.name
> ) ll JOIN dim_table FOR SYSTEM_TIME AS OF ll.pts AS r ON ll.name=r.name{code}
> The inner correlate will have two hints:
> {noformat}
> {     
> [LOOKUP inheritPath:[0] options:{table=inner}],
>     [LOOKUP inheritPath:[0, 0, 0] options:{table=outer}]
> }{noformat}
> and IMO which maybe is a bug.
> The first hint comes from the inner query block and the second hint comes 
> from the outer block, and ClearJoinHintWithInvalidPropagationShuttle will not 
> clear the second hint cause the correlate has no 'ALIAS' hint.
> The reason for the above case is that the hint 'ALIAS' now only works for 
> join rel nodes and 'LOOKUP' works for correlate and join rel nodes.
> I think maybe the better way would be to make 'ALIAS' support both correlate 
> and join rel nodes like 'LOOKUP'.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30981) explain_sql throws java method not exist

2023-02-08 Thread Juntao Hu (Jira)
Juntao Hu created FLINK-30981:
-

 Summary: explain_sql throws java method not exist
 Key: FLINK-30981
 URL: https://issues.apache.org/jira/browse/FLINK-30981
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.17.0
Reporter: Juntao Hu
 Fix For: 1.17.0, 1.18.0


Executing `t_env.explainSql("ANY VALID SQL")` throws an error:
{code:java}
Traceback (most recent call last):
  File "ISSUE/FLINK-25622.py", line 42, in 
    main()
  File "ISSUE/FLINK-25622.py", line 34, in main
    print(t_env.explain_sql(
  File 
"/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/table/table_environment.py",
 line 799, in explain_sql
    return self._j_tenv.explainSql(stmt, j_extra_details)
  File 
"/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/java_gateway.py",
 line 1322, in __call__
    return_value = get_return_value(
  File 
"/Users/vancior/Documents/Github/flink-back/flink-python/pyflink/util/exceptions.py",
 line 146, in deco
    return f(*a, **kw)
  File 
"/Users/vancior/miniconda3/envs/flink-python/lib/python3.8/site-packages/py4j/protocol.py",
 line 330, in get_return_value
    raise Py4JError(
py4j.protocol.Py4JError: An error occurred while calling o11.explainSql. Trace:
org.apache.flink.api.python.shaded.py4j.Py4JException: Method explainSql([class 
java.lang.String, class [Lorg.apache.flink.table.api.ExplainDetail;]) does not 
exist
    at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:321)
    at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:329)
    at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:274)
    at 
org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at 
org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
    at 
org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.base/java.lang.Thread.run(Thread.java:829) {code}
#30668 changed TableEnvironment#explainSql to an interface default method, but 
neither TableEnvironmentInternal nor TableEnvironmentImpl overrides it, which 
triggers a bug in py4j; see [https://github.com/py4j/py4j/issues/506].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30976) docs_404_check fails occasionally

2023-02-08 Thread Qingsheng Ren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686231#comment-17686231
 ] 

Qingsheng Ren commented on FLINK-30976:
---

{{docs_404_check}} is only run in PR-triggered and cron jobs, so most CI runs 
skip this stage. 

This could be reproduced in my own Azure pipeline:

[https://dev.azure.com/renqs/Apache%20Flink/_build/results?buildId=438&view=logs&j=c5d67f7d-375d-5407-4743-f9d0c4436a81&t=38411795-40c9-51fa-10b0-bd083cf9f5a5&l=133]
 

> docs_404_check fails occasionally
> -
>
> Key: FLINK-30976
> URL: https://issues.apache.org/jira/browse/FLINK-30976
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.17.0, 1.18.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: test-stability
>
> We've seen the docs_404_check failing in nightly builds (only the cron stage 
> but not the ci stage):
> {code}
> Re-run Hugo with the flag --panicOnWarning to get a better error message.
> ERROR 2023/02/09 01:27:27 "docs/connectors/datastream/pulsar.md": Invalid use 
> of artifact shortcode. Unknown flag `4.0.0-SNAPSHOT`
> ERROR 2023/02/09 01:27:34 "docs/connectors/datastream/pulsar.md": Invalid use 
> of artifact shortcode. Unknown flag `4.0.0-SNAPSHOT`
> Error: Error building site: logged 2 error(s)
> Total in 12945 ms
> Error building the docs
> {code}
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45909&view=logs&j=6dc02e5c-5865-5c6a-c6c5-92d598e3fc43&t=ddd6d61a-af16-5d03-2b9a-76a279badf98&l=133
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45906&view=logs&j=6dc02e5c-5865-5c6a-c6c5-92d598e3fc43&t=ddd6d61a-af16-5d03-2b9a-76a279badf98&l=132



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] dangshazi commented on pull request #21840: [FLINK-29481] Show jobType in webUI

2023-02-08 Thread via GitHub


dangshazi commented on PR #21840:
URL: https://github.com/apache/flink/pull/21840#issuecomment-1423741398

   @xintongsong  Please help review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-connector-pulsar] syhily commented on a diff in pull request #21: [hotfix][docs] Shortcode syntax

2023-02-08 Thread via GitHub


syhily commented on code in PR #21:
URL: 
https://github.com/apache/flink-connector-pulsar/pull/21#discussion_r1101059543


##
docs/content/docs/connectors/datastream/pulsar.md:
##
@@ -31,7 +31,7 @@ Flink provides an [Apache Pulsar](https://pulsar.apache.org) 
connector for readi
 You can use the connector with the Pulsar 2.10.0 or higher. It is recommended 
to always use the latest Pulsar version.
 The details on Pulsar compatibility can be found in 
[PIP-72](https://github.com/apache/pulsar/wiki/PIP-72%3A-Introduce-Pulsar-Interface-Taxonomy%3A-Audience-and-Stability-Classification).
 
-{{< artifact flink-connector-pulsar 4.0.0-SNAPSHOT >}}
+{{< artifact flink-connector-pulsar >}}

Review Comment:
   ```suggestion
   {{< connector_artifact flink-connector-pulsar 4.0.0-SNAPSHOT >}}
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-connector-pulsar] syhily commented on a diff in pull request #21: [hotfix][docs] Shortcode syntax

2023-02-08 Thread via GitHub


syhily commented on code in PR #21:
URL: 
https://github.com/apache/flink-connector-pulsar/pull/21#discussion_r1101059387


##
docs/content.zh/docs/connectors/datastream/pulsar.md:
##
@@ -30,7 +30,7 @@ Flink 当前提供 [Apache Pulsar](https://pulsar.apache.org) Source 
和 Sink 
 
 当前支持 Pulsar 2.10.0 及其之后的版本,建议在总是将 Pulsar 升级至最新版。如果想要了解更多关于 Pulsar API 
兼容性设计,可以阅读文档 
[PIP-72](https://github.com/apache/pulsar/wiki/PIP-72%3A-Introduce-Pulsar-Interface-Taxonomy%3A-Audience-and-Stability-Classification)。
 
-{{< artifact flink-connector-pulsar 4.0.0-SNAPSHOT >}}
+{{< artifact flink-connector-pulsar >}}

Review Comment:
   ```suggestion
   {{< connector_artifact flink-connector-pulsar 4.0.0-SNAPSHOT >}}
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30980) Support s3.signer-type for S3

2023-02-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-30980:
-
Description: Currently, s3.signer-type has to be specified as 
s3a.signing-algorithm; we can also support the s3.signer-type configuration 
directly.  (was: Currently, s3.signer-type has to be specified as 
s3a.signer-type; we can also support the s3.signer-type configuration directly.)

> Support s3.signer-type for S3
> -
>
> Key: FLINK-30980
> URL: https://issues.apache.org/jira/browse/FLINK-30980
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> Currently, s3.signer-type has to be specified as s3a.signing-algorithm; we can 
> also support the s3.signer-type configuration directly.
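
A minimal sketch of what supporting the key directly could look like, assuming 
options are forwarded as a plain key-value map and that the target Hadoop S3A 
key is fs.s3a.signing-algorithm (class and method names here are hypothetical):
{code:java}
import java.util.HashMap;
import java.util.Map;

public class S3OptionMapping {
    // Map table-store style "s3.*" options to Hadoop S3A "fs.s3a.*" options,
    // special-casing the signer key, which does not follow the plain rename.
    static Map<String, String> toHadoopOptions(Map<String, String> options) {
        Map<String, String> hadoop = new HashMap<>();
        options.forEach(
                (key, value) -> {
                    if (key.equals("s3.signer-type")) {
                        hadoop.put("fs.s3a.signing-algorithm", value);
                    } else if (key.startsWith("s3.")) {
                        hadoop.put("fs.s3a." + key.substring("s3.".length()), value);
                    }
                });
        return hadoop;
    }
}
{code}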



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30980) Support s3.signer-type for S3

2023-02-08 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30980:


 Summary: Support s3.signer-type for S3
 Key: FLINK-30980
 URL: https://issues.apache.org/jira/browse/FLINK-30980
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


Currently, s3.signer-type has to be specified as s3a.signer-type; we can also 
support the s3.signer-type configuration directly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] xintongsong commented on a diff in pull request #21890: [FLINK-30860][doc] Add document for hybrid shuffle with adaptive batch scheduler

2023-02-08 Thread via GitHub


xintongsong commented on code in PR #21890:
URL: https://github.com/apache/flink/pull/21890#discussion_r1101054167


##
docs/content.zh/docs/ops/batch/batch_shuffle.md:
##
@@ -114,12 +114,34 @@ Hybrid shuffle provides two spilling strategies:
 
 To use hybrid shuffle mode, you need to configure the 
[execution.batch-shuffle-mode]({{< ref "docs/deployment/config" 
>}}#execution-batch-shuffle-mode) to `ALL_EXCHANGES_HYBRID_FULL` (full spilling 
strategy) or `ALL_EXCHANGES_HYBRID_SELECTIVE` (selective spilling strategy).
 
+ Supports AdaptiveBatchScheduler and SpeculativeExecution
+
+Hybrid shuffle currently supports `AdaptiveBatchScheduler` by default. If you 
want to use `DefaultScheduler`, please configure the [jobmanager.scheduler]({{< 
ref "docs/deployment/config" >}}#jobmanager-scheduler) to `DefaultScheduler`. 
See [elastic_scaling]({{< ref "docs/deployment/elastic_scaling" 
>}}#adaptive-batch-scheduler) for details.
+
+If you want to enable `SpeculativeExecution` in the same time, see 
[speculative_execution]({{< ref "docs/deployment/speculative_execution" >}}) 
for details.

Review Comment:
   This is irrelevant to hybrid shuffle.



##
docs/content.zh/docs/ops/batch/batch_shuffle.md:
##
@@ -114,12 +114,34 @@ Hybrid shuffle provides two spilling strategies:
 
 To use hybrid shuffle mode, you need to configure the 
[execution.batch-shuffle-mode]({{< ref "docs/deployment/config" 
>}}#execution-batch-shuffle-mode) to `ALL_EXCHANGES_HYBRID_FULL` (full spilling 
strategy) or `ALL_EXCHANGES_HYBRID_SELECTIVE` (selective spilling strategy).
 
+ Supports AdaptiveBatchScheduler and SpeculativeExecution
+
+Hybrid shuffle currently supports `AdaptiveBatchScheduler` by default. If you 
want to use `DefaultScheduler`, please configure the [jobmanager.scheduler]({{< 
ref "docs/deployment/config" >}}#jobmanager-scheduler) to `DefaultScheduler`. 
See [elastic_scaling]({{< ref "docs/deployment/elastic_scaling" 
>}}#adaptive-batch-scheduler) for details.
+
+If you want to enable `SpeculativeExecution` in the same time, see 
[speculative_execution]({{< ref "docs/deployment/speculative_execution" >}}) 
for details.
+
+Hybrid shuffle divides the partition data consumption constraints between 
producer and consumer into the following three cases:
+
+- **ALL_PRODUCERS_FINISHED** : hybrid partition data can be consumed only when 
all producers are finished.
+- **ONLY_FINISHED_PRODUCERS** : hybrid partition data can be consumed when its 
producer is finished.
+- **UNFINISHED_PRODUCERS** : hybrid partition data can be consumed even if its 
producer is un-finished.
+
+If `SpeculativeExecution` is enabled, the default constraint is 
`ONLY_FINISHED_PRODUCERS` to bring some performance optimization compared with 
blocking shuffle. Otherwise, the default constraint is `UNFINISHED_PRODUCERS` 
to perform pipelined-like shuffle. These could be configured via 
[jobmanager.partition.hybrid.partition-data-consume-constraint]({{< ref 
"docs/deployment/config" 
>}}#jobmanager-partition-hybrid-partition-data-consume-constraint).

Review Comment:
   What are the potential impacts of changing this option?
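
   For context, a hedged sketch of how these options could be set
   programmatically, using the option keys quoted in the diff above (the values
   are illustrative, not recommendations):
   ```java
   import org.apache.flink.configuration.Configuration;
   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

   public class HybridShuffleSetup {
       public static void main(String[] args) {
           Configuration conf = new Configuration();
           // Enable hybrid shuffle with the full spilling strategy.
           conf.setString("execution.batch-shuffle-mode", "ALL_EXCHANGES_HYBRID_FULL");
           // Pin the consume constraint explicitly; per the text above, the
           // default depends on whether speculative execution is enabled.
           conf.setString(
                   "jobmanager.partition.hybrid.partition-data-consume-constraint",
                   "ONLY_FINISHED_PRODUCERS");
           StreamExecutionEnvironment env =
                   StreamExecutionEnvironment.getExecutionEnvironment(conf);
           // ... build and execute a batch job on env ...
       }
   }
   ```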



##
docs/content.zh/docs/ops/batch/batch_shuffle.md:
##
@@ -114,12 +114,34 @@ Hybrid shuffle provides two spilling strategies:
 
 To use hybrid shuffle mode, you need to configure the 
[execution.batch-shuffle-mode]({{< ref "docs/deployment/config" 
>}}#execution-batch-shuffle-mode) to `ALL_EXCHANGES_HYBRID_FULL` (full spilling 
strategy) or `ALL_EXCHANGES_HYBRID_SELECTIVE` (selective spilling strategy).
 
+ Supports AdaptiveBatchScheduler and SpeculativeExecution
+
+Hybrid shuffle currently supports `AdaptiveBatchScheduler` by default. If you 
want to use `DefaultScheduler`, please configure the [jobmanager.scheduler]({{< 
ref "docs/deployment/config" >}}#jobmanager-scheduler) to `DefaultScheduler`. 
See [elastic_scaling]({{< ref "docs/deployment/elastic_scaling" 
>}}#adaptive-batch-scheduler) for details.
+
+If you want to enable `SpeculativeExecution` in the same time, see 
[speculative_execution]({{< ref "docs/deployment/speculative_execution" >}}) 
for details.
+
+Hybrid shuffle divides the partition data consumption constraints between 
producer and consumer into the following three cases:
+
+- **ALL_PRODUCERS_FINISHED** : hybrid partition data can be consumed only when 
all producers are finished.
+- **ONLY_FINISHED_PRODUCERS** : hybrid partition data can be consumed when its 
producer is finished.
+- **UNFINISHED_PRODUCERS** : hybrid partition data can be consumed even if its 
producer is un-finished.
+
+If `SpeculativeExecution` is enabled, the default constraint is 
`ONLY_FINISHED_PRODUCERS` to bring some performance optimization compared with 
blocking shuffle. Otherwise, the default constraint is `UNFINISHED_PRODUCERS` 
to perform pipelined-like shuffle. These could be configured via 
[jobmanager.partition.hybrid.partition-data-consume-constraint]({{< ref 
"docs/dep

[GitHub] [flink-connector-pulsar] syhily commented on pull request #21: [hotfix][docs] Shortcode syntax

2023-02-08 Thread via GitHub


syhily commented on PR #21:
URL: 
https://github.com/apache/flink-connector-pulsar/pull/21#issuecomment-1423735805

   I think we may need this syntax?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-30979) The buckets of the secondary partition should fall on different tasks

2023-02-08 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30979:


 Summary: The buckets of the secondary partition should fall on 
different tasks
 Key: FLINK-30979
 URL: https://issues.apache.org/jira/browse/FLINK-30979
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


In a Flink streaming job that sinks to the table store: although I only set one 
bucket now, there are many secondary partitions, so I expect the sink to use 
multiple parallel tasks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30407) Better encapsulate error handling logic in controllers

2023-02-08 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-30407:
---
Labels: starter  (was: )

> Better encapsulate error handling logic in controllers
> --
>
> Key: FLINK-30407
> URL: https://issues.apache.org/jira/browse/FLINK-30407
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Priority: Major
>  Labels: starter
>
> The error handling in the FlinkDeployment and SessionJob controllers is a bit 
> ad hoc and mostly consists of a series of try-catch blocks.
> We should introduce a set of error handlers that we can encapsulate nicely 
> and share between the controllers to reduce code duplication.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29566) Reschedule the cleanup logic if cancel job failed

2023-02-08 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-29566:
---
Labels: starter  (was: )

> Reschedule the cleanup logic if cancel job failed
> -
>
> Key: FLINK-29566
> URL: https://issues.apache.org/jira/browse/FLINK-29566
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Xin Hao
>Priority: Minor
>  Labels: starter
>
> Currently, when we remove the FlinkSessionJob object, we always remove it even 
> if the Flink job was not canceled successfully.
>  
> This is *not semantically consistent*: the FlinkSessionJob has been removed but 
> the Flink job is still running.
>  
> One scenario: we deploy a FlinkDeployment in HA mode, and we remove the 
> FlinkSessionJob while changing the FlinkDeployment at the same time, or the TMs 
> are restarting because of bugs such as OOM. Both cases cause the cancellation 
> of the Flink job to fail because the TMs are not available.
>  
> We should *reschedule* the cleanup logic if the FlinkDeployment is present, and 
> we can add a new ReconciliationState DELETING to indicate the FlinkSessionJob's 
> status.
>  
> The logic will be
> {code:java}
> if the FlinkDeployment is not present
>     delete the FlinkSessionJob object
> else
>     if the JM is not available
>         reschedule
>     else
>         if cancel job successfully
>             delete the FlinkSessionJob object
>         else
>             reschedule{code}
> When we cancel the Flink job, we need to verify that all jobs with the same 
> name have been deleted, in case the job ID changed after the JM restarted.
>  
>  
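
The same decision rendered as compact, self-contained Java (a sketch only; all 
names are hypothetical, not actual operator code):
{code:java}
public class CleanupDecision {
    enum Action { DELETE, RESCHEDULE }

    // Delete the FlinkSessionJob only when there is nothing left to cancel
    // against, or when the cancellation actually succeeded; otherwise retry.
    static Action decide(
            boolean deploymentPresent,
            boolean jobManagerAvailable,
            boolean cancelSucceeded) {
        if (!deploymentPresent) {
            return Action.DELETE;
        }
        if (!jobManagerAvailable) {
            return Action.RESCHEDULE; // retry once the JM is reachable again
        }
        return cancelSucceeded ? Action.DELETE : Action.RESCHEDULE;
    }
}
{code}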



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30519) Add e2e tests for operator dynamic config

2023-02-08 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-30519:
---
Labels: starter  (was: )

> Add e2e tests for operator dynamic config
> -
>
> Key: FLINK-30519
> URL: https://issues.apache.org/jira/browse/FLINK-30519
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Priority: Critical
>  Labels: starter
>
> The dynamic config feature is currently not covered by e2e tests and is 
> subject to accidental regressions, as shown in:
> https://issues.apache.org/jira/browse/FLINK-30329
> We should add an e2e test that covers this



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30407) Better encapsulate error handling logic in controllers

2023-02-08 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-30407:
---
Fix Version/s: (was: kubernetes-operator-1.4.0)

> Better encapsulate error handling logic in controllers
> --
>
> Key: FLINK-30407
> URL: https://issues.apache.org/jira/browse/FLINK-30407
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Matyas Orhidi
>Priority: Major
>
> The error handling in the FlinkDeployment and SessionJob controllers is a bit 
> ad hoc and mostly consists of a series of try-catch blocks.
> We should introduce a set of error handlers that we can encapsulate nicely 
> and share between the controllers to reduce code duplication.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30519) Add e2e tests for operator dynamic config

2023-02-08 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-30519:
---
Fix Version/s: (was: kubernetes-operator-1.4.0)

> Add e2e tests for operator dynamic config
> -
>
> Key: FLINK-30519
> URL: https://issues.apache.org/jira/browse/FLINK-30519
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Priority: Critical
>
> The dynamic config feature is currently not covered by e2e tests and is 
> subject to accidental regressions, as shown in:
> https://issues.apache.org/jira/browse/FLINK-30329
> We should add an e2e test that covers this



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-30407) Better encapsulate error handling logic in controllers

2023-02-08 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora reassigned FLINK-30407:
--

Assignee: (was: Matyas Orhidi)

> Better encapsulate error handling logic in controllers
> --
>
> Key: FLINK-30407
> URL: https://issues.apache.org/jira/browse/FLINK-30407
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Priority: Major
>
> The error handling in the FlinkDeployment and SessionJob controllers is a bit 
> ad hoc and mostly consists of a series of try-catch blocks.
> We should introduce a set of error handlers that we can encapsulate nicely 
> and share between the controllers to reduce code duplication.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-ml] lindong28 commented on pull request #209: [FLINK-30688][followup] Disable Kryo fallback for tests in flink-ml-lib

2023-02-08 Thread via GitHub


lindong28 commented on PR #209:
URL: https://github.com/apache/flink-ml/pull/209#issuecomment-1423731056

   @jiangxin369 @zhipeng93 Can you review this PR? Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-30779) Bump josdk version to 4.2.6

2023-02-08 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-30779.
--
Resolution: Fixed

> Bump josdk version to 4.2.6
> ---
>
> Key: FLINK-30779
> URL: https://issues.apache.org/jira/browse/FLINK-30779
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Major
>
> We should upgrade to the latest version to get some important improvements



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30779) Bump josdk version to 4.2.6

2023-02-08 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-30779:
---
Summary: Bump josdk version to 4.2.6  (was: Bump josdk version to 4.3.2)

> Bump josdk version to 4.2.6
> ---
>
> Key: FLINK-30779
> URL: https://issues.apache.org/jira/browse/FLINK-30779
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Major
>
> We should upgrade to the latest version to get some important improvements



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-30898) Do not include and build examples in operator image

2023-02-08 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-30898.
--
Resolution: Fixed

merged to main a89081fc7ac2abb492a01d71c211ffe4b2ed51ab

> Do not include and build examples in operator image
> ---
>
> Key: FLINK-30898
> URL: https://issues.apache.org/jira/browse/FLINK-30898
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.4.0
>
>
> The docker build has slowed down substantially over time. We include many 
> things in the image that are not necessary.
> We should not include examples at all.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30977) flink tumbling window stream converting to pandas dataframe not work

2023-02-08 Thread Joekwal (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joekwal updated FLINK-30977:

Description: 
I want to know whether a tumbling window stream can be converted to pandas.
{code:java}
code... #create env

kafka_src = """
CREATE TABLE if not exists `kafka_src` (
...
`event_time` as CAST(`end_time` as TIMESTAMP(3)),
WATERMARK FOR event_time as event_time - INTERVAL '5' SECOND
)
with (
'connector' = 'kafka',
'topic' = 'topic',
'properties.bootstrap.servers' = '***',
'properties.group.id' = '***',
'scan.startup.mode' = 'earliest-offset',
'value.format' = 'debezium-json'
);
"""  
  
t_env.execute_sql(kafka_src)
table = st_env.sql_query("SELECT columns,`event_time`  \
    FROM TABLE(TUMBLE(TABLE table_name, DESCRIPTOR(event_time), INTERVAL '1' 
MINUTES))")

table.execute().print()  #could print the result

df = table.to_pandas()

#schema is correct!
schema = DataTypes.ROW([DataTypes.FIELD("column1", DataTypes.STRING()),
                        ...
                            ])
table = st_env.from_pandas(df,schema=schema)
st_env.create_temporary_view("view_table",table)

st_env.sql_query("select * from view_table").execute().print() # Not work!Can't 
print the result {code}
Converting a tumbling-window stream from a Kafka source to a pandas DataFrame 
does not print the result, although the schema is correct. I have tested another 
job using a bounded stream from a JDBC source, and it prints the result; the 
only difference is the input stream. As the documentation mentions, bounded 
streams are supported for conversion to pandas, so what could have gone wrong?

  was:
I want to know whether a tumbling window stream can be converted to pandas.
{code:java}
code... #create env
kafka_src = """
CREATE TABLE if not exists `kafka_src` (
...
`event_time` as CAST(`end_time` as TIMESTAMP(3)),
WATERMARK FOR event_time as event_time - INTERVAL '5' SECOND
)
with (
'connector' = 'kafka',
'topic' = 'topic',
'properties.bootstrap.servers' = '***',
'properties.group.id' = '***',
'scan.startup.mode' = 'earliest-offset',
'value.format' = 'debezium-json'
);
"""    
t_env.execute_sql(kafka_src)
table = st_env.sql_query("SELECT columns,`event_time`  \
    FROM TABLE(TUMBLE(TABLE table_name, DESCRIPTOR(event_time), INTERVAL '1' 
MINUTES))")
table.execute().print()  #could print the result
df = table.to_pandas()
#schema is correct!
schema = DataTypes.ROW([DataTypes.FIELD("column1", DataTypes.STRING()),
                        ...
                            ])
table = st_env.from_pandas(df,schema=schema)
st_env.create_temporary_view("view_table",table)
st_env.sql_query("select * from view_table").execute().print() # Not work!Can't 
print the result {code}

Converting a tumbling-window stream from a Kafka source to a pandas DataFrame 
does not print the result, although the schema is correct. I have tested another 
job using a bounded stream from a JDBC source, and it prints the result; the 
only difference is the input stream. As the documentation mentions, bounded 
streams are supported for conversion to pandas, so what could have gone wrong?


> flink tumbling window stream converting to pandas dataframe not work
> 
>
> Key: FLINK-30977
> URL: https://issues.apache.org/jira/browse/FLINK-30977
> Project: Flink
>  Issue Type: Bug
> Environment: pyflink1.15.2
>Reporter: Joekwal
>Priority: Major
>
> I want to know whether a tumbling window stream can be converted to pandas.
> {code:java}
> code... #create env
> kafka_src = """
> CREATE TABLE if not exists `kafka_src` (
> ...
> `event_time` as CAST(`end_time` as TIMESTAMP(3)),
> WATERMARK FOR event_time as event_time - INTERVAL '5' SECOND
> )
> with (
> 'connector' = 'kafka',
> 'topic' = 'topic',
> 'properties.bootstrap.servers' = '***',
> 'properties.group.id' = '***',
> 'scan.startup.mode' = 'earliest-offset',
> 'value.format' = 'debezium-json'
> );
> """  
>   
> t_env.execute_sql(kafka_src)
> table = st_env.sql_query("SELECT columns,`event_time`  \
>     FROM TABLE(TUMBLE(TABLE table_name, DESCRIPTOR(event_time), INTERVAL '1' 
> MINUTES))")
> table.execute().print()  #could print the result
> df = table.to_pandas()
> #schema is correct!
> schema = DataTypes.ROW([DataTypes.FIELD("column1", DataTypes.STRING()),
>                         ...
>                             ])
> table = st_env.from_pandas(df,schema=schema)
> st_env.create_temporary_view("view_table",table)
> st_env.sql_query("select * from view_table").execute().print() # Not 
> work!Can't print the result {code}
> Tumbling window stream from kafka source convert to pandas dataframe and it 
> can't print the result.The schema is right.I have tested in another job with 
> using batch stream from jdbc source.It can print the result.The only 
> different thing is the input stream.As doc mentioned, the bounded stream is 
> supported to convert to p

[GitHub] [flink] flinkbot commented on pull request #21902: [FLINK-30974] Add commons-io to flink-sql-connector-hbase

2023-02-08 Thread via GitHub


flinkbot commented on PR #21902:
URL: https://github.com/apache/flink/pull/21902#issuecomment-1423722137

   
   ## CI report:
   
   * 8985a1cbf5f0a49d20121bb4d5468688a399c733 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30977) flink tumbling window stream converting to pandas dataframe not work

2023-02-08 Thread Joekwal (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joekwal updated FLINK-30977:

Description: 
I want to know whether a tumbling window stream can be converted to pandas.
{code:java}
code... #create env
kafka_src = """
CREATE TABLE if not exists `kafka_src` (
...
`event_time` as CAST(`end_time` as TIMESTAMP(3)),
WATERMARK FOR event_time as event_time - INTERVAL '5' SECOND
)
with (
'connector' = 'kafka',
'topic' = 'topic',
'properties.bootstrap.servers' = '***',
'properties.group.id' = '***',
'scan.startup.mode' = 'earliest-offset',
'value.format' = 'debezium-json'
);
"""    
t_env.execute_sql(kafka_src)
table = st_env.sql_query("SELECT columns,`event_time`  \
    FROM TABLE(TUMBLE(TABLE table_name, DESCRIPTOR(event_time), INTERVAL '1' 
MINUTES))")
table.execute().print()  #could print the result
df = table.to_pandas()
#schema is correct!
schema = DataTypes.ROW([DataTypes.FIELD("column1", DataTypes.STRING()),
                        ...
                            ])
table = st_env.from_pandas(df,schema=schema)
st_env.create_temporary_view("view_table",table)
st_env.sql_query("select * from view_table").execute().print() # Not work!Can't 
print the result {code}

Converting a tumbling-window stream from a Kafka source to a pandas DataFrame 
does not print the result, although the schema is correct. I have tested another 
job using a bounded stream from a JDBC source, and it prints the result; the 
only difference is the input stream. As the documentation mentions, bounded 
streams are supported for conversion to pandas, so what could have gone wrong?

  was:
I want to know whether a tumbling window stream can be converted to pandas.
```
code... #create env

kafka_src = """
CREATE TABLE if not exists `kafka_src` (
...
`event_time` as CAST(`end_time` as TIMESTAMP(3)),
WATERMARK FOR event_time as event_time - INTERVAL '5' SECOND
)
with (
'connector' = 'kafka',
'topic' = 'topic',
'properties.bootstrap.servers' = '***',
'properties.group.id' = '***',
'scan.startup.mode' = 'earliest-offset',
'value.format' = 'debezium-json'
);
"""    
t_env.execute_sql(kafka_src)

table = st_env.sql_query("SELECT columns,`event_time`  \
    FROM TABLE(TUMBLE(TABLE table_name, DESCRIPTOR(event_time), INTERVAL '1' 
MINUTES))")

table.execute().print()  #could print the result
df = table.to_pandas()

#schema is correct!
schema = DataTypes.ROW([DataTypes.FIELD("column1", DataTypes.STRING()),
                        ...
                            ])

table = st_env.from_pandas(df,schema=schema)

st_env.create_temporary_view("view_table",table)

st_env.sql_query("select * from view_table").execute().print() # Not work!Can't 
print the result

```
Converting a tumbling-window stream from a Kafka source to a pandas DataFrame 
does not print the result, although the schema is correct. I have tested another 
job using a bounded stream from a JDBC source, and it prints the result; the 
only difference is the input stream. As the documentation mentions, bounded 
streams are supported for conversion to pandas, so what could have gone wrong?


> flink tumbling window stream converting to pandas dataframe not work
> 
>
> Key: FLINK-30977
> URL: https://issues.apache.org/jira/browse/FLINK-30977
> Project: Flink
>  Issue Type: Bug
> Environment: pyflink1.15.2
>Reporter: Joekwal
>Priority: Major
>
> I want to know whether a tumbling window stream can be converted to pandas.
> {code:java}
> code... #create env
> kafka_src = """
> CREATE TABLE if not exists `kafka_src` (
> ...
> `event_time` as CAST(`end_time` as TIMESTAMP(3)),
> WATERMARK FOR event_time as event_time - INTERVAL '5' SECOND
> )
> with (
> 'connector' = 'kafka',
> 'topic' = 'topic',
> 'properties.bootstrap.servers' = '***',
> 'properties.group.id' = '***',
> 'scan.startup.mode' = 'earliest-offset',
> 'value.format' = 'debezium-json'
> );
> """    
> t_env.execute_sql(kafka_src)
> table = st_env.sql_query("SELECT columns,`event_time`  \
>     FROM TABLE(TUMBLE(TABLE table_name, DESCRIPTOR(event_time), INTERVAL '1' 
> MINUTES))")
> table.execute().print()  #could print the result
> df = table.to_pandas()
> #schema is correct!
> schema = DataTypes.ROW([DataTypes.FIELD("column1", DataTypes.STRING()),
>                         ...
>                             ])
> table = st_env.from_pandas(df,schema=schema)
> st_env.create_temporary_view("view_table",table)
> st_env.sql_query("select * from view_table").execute().print() # Not 
> work!Can't print the result {code}
> Tumbling window stream from kafka source convert to pandas dataframe and it 
> can't print the result.The schema is right.I have tested in another job with 
> using batch stream from jdbc source.It can print the result.The only 
> different thing is the input stream.As doc mentioned, the bounded stream is 
> supported to convert to pandas.So wha

[jira] [Updated] (FLINK-30977) flink tumbling window stream converting to pandas dataframe not work

2023-02-08 Thread Joekwal (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joekwal updated FLINK-30977:

Description: 
I want to know whether a tumbling window stream can be converted to pandas.
```
code... #create env

kafka_src = """
CREATE TABLE if not exists `kafka_src` (
...
`event_time` as CAST(`end_time` as TIMESTAMP(3)),
WATERMARK FOR event_time as event_time - INTERVAL '5' SECOND
)
with (
'connector' = 'kafka',
'topic' = 'topic',
'properties.bootstrap.servers' = '***',
'properties.group.id' = '***',
'scan.startup.mode' = 'earliest-offset',
'value.format' = 'debezium-json'
);
"""    
t_env.execute_sql(kafka_src)

table = st_env.sql_query("SELECT columns,`event_time`  \
    FROM TABLE(TUMBLE(TABLE table_name, DESCRIPTOR(event_time), INTERVAL '1' 
MINUTES))")

table.execute().print()  #could print the result
df = table.to_pandas()

#schema is correct!
schema = DataTypes.ROW([DataTypes.FIELD("column1", DataTypes.STRING()),
                        ...
                            ])

table = st_env.from_pandas(df,schema=schema)

st_env.create_temporary_view("view_table",table)

st_env.sql_query("select * from view_table").execute().print() # Not work!Can't 
print the result

```
Converting a tumbling-window stream from a Kafka source to a pandas DataFrame 
does not print the result, although the schema is correct. I have tested another 
job using a bounded stream from a JDBC source, and it prints the result; the 
only difference is the input stream. As the documentation mentions, bounded 
streams are supported for conversion to pandas, so what could have gone wrong?

  was:
I want to know whether a tumbling window stream can be converted to pandas.
```
code... #create env

kafka_src = """
CREATE TABLE if not exists `kafka_src` (
...
`event_time` as CAST(`end_time` as TIMESTAMP(3)),
WATERMARK FOR event_time as event_time - INTERVAL '5' SECOND
)
with (
'connector' = 'kafka',
'topic' = 'topic',
'properties.bootstrap.servers' = '***',
'properties.group.id' = '***',
'scan.startup.mode' = 'earliest-offset',
'value.format' = 'debezium-json'
);
"""    
t_env.execute_sql(kafka_src)

table = st_env.sql_query("SELECT columns,`event_time`  \
    FROM TABLE(TUMBLE(TABLE table_name, DESCRIPTOR(event_time), INTERVAL '1' 
MINUTES))")

# table.execute().print()  #could print the result
df = table.to_pandas()

#schema is correct!
schema = DataTypes.ROW([DataTypes.FIELD("column1", DataTypes.STRING()),
                        ...
                            ])

table = st_env.from_pandas(df,schema=schema)

st_env.create_temporary_view("view_table",table)

st_env.sql_query("select * from view_table").execute().print() # Not work!Can't 
print the result

```
Converting a tumbling-window stream from a Kafka source to a pandas DataFrame 
does not print the result, although the schema is correct. I have tested another 
job using a bounded stream from a JDBC source, and it prints the result; the 
only difference is the input stream. As the documentation mentions, bounded 
streams are supported for conversion to pandas, so what could have gone wrong?


> flink tumbling window stream converting to pandas dataframe not work
> 
>
> Key: FLINK-30977
> URL: https://issues.apache.org/jira/browse/FLINK-30977
> Project: Flink
>  Issue Type: Bug
> Environment: pyflink1.15.2
>Reporter: Joekwal
>Priority: Major
>
> I want to know whether a tumbling window stream can be converted to pandas.
> ```
> code... #create env
> kafka_src = """
> CREATE TABLE if not exists `kafka_src` (
> ...
> `event_time` as CAST(`end_time` as TIMESTAMP(3)),
> WATERMARK FOR event_time as event_time - INTERVAL '5' SECOND
> )
> with (
> 'connector' = 'kafka',
> 'topic' = 'topic',
> 'properties.bootstrap.servers' = '***',
> 'properties.group.id' = '***',
> 'scan.startup.mode' = 'earliest-offset',
> 'value.format' = 'debezium-json'
> );
> """    
> t_env.execute_sql(kafka_src)
> table = st_env.sql_query("SELECT columns,`event_time`  \
>     FROM TABLE(TUMBLE(TABLE table_name, DESCRIPTOR(event_time), INTERVAL '1' 
> MINUTES))")
> table.execute().print()  #could print the result
> df = table.to_pandas()
> #schema is correct!
> schema = DataTypes.ROW([DataTypes.FIELD("column1", DataTypes.STRING()),
>                         ...
>                             ])
> table = st_env.from_pandas(df,schema=schema)
> st_env.create_temporary_view("view_table",table)
> st_env.sql_query("select * from view_table").execute().print() # Not 
> work!Can't print the result
> ```
> Tumbling window stream from kafka source convert to pandas dataframe and it 
> can't print the result.The schema is right.I have tested in another job with 
> using batch stream from jdbc source.It can print the result.The only 
> different thing is the input stream.As doc mentioned, the bounded stream is 
> supported to convert to pandas.So what could hav

[jira] [Created] (FLINK-30978) ExecutorImplITCase.testInterruptExecution hangs waiting for SQL gateway service closing

2023-02-08 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-30978:
-

 Summary: ExecutorImplITCase.testInterruptExecution hangs waiting 
for SQL gateway service closing
 Key: FLINK-30978
 URL: https://issues.apache.org/jira/browse/FLINK-30978
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Gateway
Affects Versions: 1.17.0
Reporter: Qingsheng Ren
Assignee: Shengkai Fang
 Fix For: 1.17.0


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45921&view=logs&j=0c940707-2659-5648-cbe6-a1ad63045f0a&t=075c2716-8010-5565-fe08-3c4bb45824a4&l=44674



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] shuiqiangchen commented on a diff in pull request #21823: [FLINK-30694][docs]Translate "Windowing TVF" page of "Querys" into Ch…

2023-02-08 Thread via GitHub


shuiqiangchen commented on code in PR #21823:
URL: https://github.com/apache/flink/pull/21823#discussion_r1101043029


##
docs/content.zh/docs/dev/table/sql/queries/window-tvf.md:
##
@@ -22,61 +22,61 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Windowing table-valued functions (Windowing TVFs)
+# 窗口表值函数(Windowing TVFs)
 
 {{< label Batch >}} {{< label Streaming >}}
 
-Windows are at the heart of processing infinite streams. Windows split the 
stream into “buckets” of finite size, over which we can apply computations. 
This document focuses on how windowing is performed in Flink SQL and how the 
programmer can benefit to the maximum from its offered functionality.
+窗口是处理无限流的核心。窗口把流分割为有限大小的 “桶”,这样就可以在其之上进行计算。本文档聚焦于窗口在 Flink SQL 
中是如何工作的,编程人员如何最大化地利用好它。
 
-Apache Flink provides several window table-valued functions (TVF) to divide 
the elements of your table into windows, including:
+Apache Flink 提供了如下 `窗口表值函数`(TVF)把表的数据划分到窗口中:
 
-- [Tumble Windows](#tumble)
-- [Hop Windows](#hop)
-- [Cumulate Windows](#cumulate)
-- Session Windows (will be supported soon)
+- [滚动窗口](#tumble)
+- [滑动窗口](#hop)
+- [累计窗口](#cumulate)
+- 会话窗口 (很快就能支持了)
 
-Note that each element can logically belong to more than one window, depending 
on the windowing table-valued function you use. For example, HOP windowing 
creates overlapping windows wherein a single element can be assigned to 
multiple windows.
+注意:每一个元素可以被逻辑上应用于超过一个窗口。这取决于使用的 `窗口表值函数`。例如:滑动窗口可以把单个元素分配给多个窗口。

Review Comment:
   逻辑上,每个元素可以应用于一个或多个窗口,这取决于所使用的`窗口表值函数`。
   ```suggestion
   注意:每一个元素可以被逻辑上应用于超过一个窗口。这取决于使用的 `窗口表值函数`。例如:滑动窗口可以把单个元素分配给多个窗口。
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] xintongsong commented on a diff in pull request #21843: [FLINK-30712][network] Update documents for taskmanager memory configurations and tuning

2023-02-08 Thread via GitHub


xintongsong commented on code in PR #21843:
URL: https://github.com/apache/flink/pull/21843#discussion_r1101041753


##
docs/content/docs/deployment/memory/network_mem_tuning.md:
##
@@ -97,20 +97,17 @@ The actual value of parallelism from which the problem 
occurs is various from jo
 ## Network buffer lifecycle
  
 Flink has several local buffer pools - one for the output stream and one for 
each input gate. 
-Each of those pools is limited to at most 
+The upper limit of the size of each buffer pool is called the buffer pool 
**Target**, which is calculated by the following formula.
 
 `#channels * taskmanager.network.memory.buffers-per-channel + 
taskmanager.network.memory.floating-buffers-per-gate`
 
 The size of the buffer can be configured by setting 
`taskmanager.memory.segment-size`.
 
 ### Input network buffers
 
-Buffers in the input channel are divided into exclusive and floating buffers.  
Exclusive buffers can be used by only one particular channel.  A channel can 
request additional floating buffers from a buffer pool shared across all 
channels belonging to the given input gate. The remaining floating buffers are 
optional and are acquired only if there are enough resources available.
+Not all buffers in the buffer pool Target can be obtained eventually. A 
**Threshold** is introduced to divide the buffer pool Target into two parts. 
The part below the threshold is called required. The excess part buffers, if 
any, is optional. A task will fail if the required buffers cannot be obtained 
in runtime. A task will not fail due to not obtaining optional buffers, but may 
suffer a performance reduction. If not explicitly configured, the default value 
of the threshold is Integer.MAX_VALUE for streaming workloads, and 1000 for 
batch workloads.
 
-In the initialization phase:
-- Flink will try to acquire the configured amount of exclusive buffers for 
each channel
-- all exclusive buffers must be fulfilled or the job will fail with an 
exception
-- a single floating buffer has to be allocated for Flink to be able to make 
progress
+It is not recommended to adjust the above threshold during normal use. Unless 
you are a Flink network expert and can clearly understand the impact of this 
threshold, you can adjust the above threshold through the option 
`taskmanager.network.memory.read-buffer.required-per-gate.max`. If this option 
is configured to a smaller value, it can avoid the "insufficient number of 
network buffers" exception as much as possible, but may suffer a performance 
reduction silently. If this option is configured as Integer.MAX_VALUE, the 
required buffer limit is disabled. When the feature is disabled, more read 
buffers may be required in runtime, which is good for performance but this may 
lead to more easily throwing insufficient network buffers exceptions.

Review Comment:
   ```suggestion
   The default value for this threshold is `Integer.MAX_VALUE` for streaming 
workloads, and `1000` for batch workloads.
   We do not recommend users to change this threshold, unless the user has good 
reasons and knows what he/she is doing well.
   The relevant configuration option is 
`taskmanager.network.memory.read-buffer.required-per-gate.max`.
   In general, a smaller threshold leads to less chance of the "insufficient 
number of network buffers" exception, while the workloads may suffer 
performance reduction silently, and vice versa.
   ```
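
   As a side note for readers of the thread, the Target formula quoted above is
   plain arithmetic; a tiny sketch (the per-channel and per-gate values below
   are assumptions taken from Flink's documented defaults):
   ```java
   public class BufferPoolTarget {
       // Target = #channels * buffers-per-channel + floating-buffers-per-gate
       static int target(int channels, int buffersPerChannel, int floatingPerGate) {
           return channels * buffersPerChannel + floatingPerGate;
       }

       public static void main(String[] args) {
           // With 2 exclusive buffers per channel and 8 floating buffers per
           // gate, an input gate with 100 channels targets 208 buffers.
           System.out.println(target(100, 2, 8));
       }
   }
   ```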



##
docs/content/docs/deployment/memory/network_mem_tuning.md:
##
@@ -97,20 +97,17 @@ The actual value of parallelism from which the problem 
occurs is various from jo
 ## Network buffer lifecycle
  
 Flink has several local buffer pools - one for the output stream and one for 
each input gate. 
-Each of those pools is limited to at most 
+The upper limit of the size of each buffer pool is called the buffer pool 
**Target**, which is calculated by the following formula.

Review Comment:
   ```suggestion
   The target size of each buffer pool is calculated by the following formula.
   ```



##
docs/content/docs/deployment/memory/network_mem_tuning.md:
##
@@ -97,20 +97,17 @@ The actual value of parallelism from which the problem 
occurs is various from jo
 ## Network buffer lifecycle
  
 Flink has several local buffer pools - one for the output stream and one for 
each input gate. 
-Each of those pools is limited to at most 
+The upper limit of the size of each buffer pool is called the buffer pool 
**Target**, which is calculated by the following formula.
 
 `#channels * taskmanager.network.memory.buffers-per-channel + 
taskmanager.network.memory.floating-buffers-per-gate`
 
 The size of the buffer can be configured by setting 
`taskmanager.memory.segment-size`.
 
 ### Input network buffers
 
-Buffers in the input channel are divided into exclusive and floating buffers.  
Exclusive buffers can be used by only one particular channel.  A channel can 
request additional floating buffers from a buffer pool shared across all 
channels belonging to the given input gate.
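
As a concrete example, the buffer pool Target from the formula above can be 
computed as in this minimal sketch (plain Java for illustration, not Flink 
code; the names are hypothetical):

```java
// target = #channels * taskmanager.network.memory.buffers-per-channel
//            + taskmanager.network.memory.floating-buffers-per-gate
static int bufferPoolTarget(int numChannels, int buffersPerChannel, int floatingBuffersPerGate) {
    return numChannels * buffersPerChannel + floatingBuffersPerGate;
}

// With 100 channels and the default settings (2 exclusive buffers per
// channel, 8 floating buffers per gate, 32 KiB segments), this gives
// bufferPoolTarget(100, 2, 8) = 208 buffers, i.e. 208 * 32 KiB = 6.5 MiB.
```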

[jira] [Updated] (FLINK-30974) Add commons-io to flink-sql-connector-hbase

2023-02-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30974:
---
Labels: pull-request-available  (was: )

> Add commons-io to flink-sql-connector-hbase
> ---
>
> Key: FLINK-30974
> URL: https://issues.apache.org/jira/browse/FLINK-30974
> Project: Flink
>  Issue Type: Bug
>Reporter: xi chaomin
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-02-09-14-43-06-187.png
>
>
> I use Flink SQL to join an HBase table; when I set lookup.async = true, I get 
> an error.
> !image-2023-02-09-14-43-06-187.png|width=1385,height=549!





[GitHub] [flink] xicm opened a new pull request, #21902: [FLINK-30974] Add commons-io to flink-sql-connector-hbase

2023-02-08 Thread via GitHub


xicm opened a new pull request, #21902:
URL: https://github.com/apache/flink/pull/21902

   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   





[jira] [Created] (FLINK-30977) flink tumbling window stream converting to pandas dataframe not work

2023-02-08 Thread Joekwal (Jira)
Joekwal created FLINK-30977:
---

 Summary: flink tumbling window stream converting to pandas 
dataframe not work
 Key: FLINK-30977
 URL: https://issues.apache.org/jira/browse/FLINK-30977
 Project: Flink
  Issue Type: Bug
 Environment: pyflink1.15.2
Reporter: Joekwal


I want to know whether a tumbling window stream is supported for conversion to 
pandas.
```
code... #create env

kafka_src = """
CREATE TABLE if not exists `kafka_src` (
...
`event_time` as CAST(`end_time` as TIMESTAMP(3)),
WATERMARK FOR event_time as event_time - INTERVAL '5' SECOND
)
with (
'connector' = 'kafka',
'topic' = 'topic',
'properties.bootstrap.servers' = '***',
'properties.group.id' = '***',
'scan.startup.mode' = 'earliest-offset',
'value.format' = 'debezium-json'
);
"""    
t_env.execute_sql(kafka_src)

table = st_env.sql_query("SELECT columns,`event_time`  \
    FROM TABLE(TUMBLE(TABLE table_name, DESCRIPTOR(event_time), INTERVAL '1' 
MINUTES))")

# table.execute().print()  #could print the result
df = table.to_pandas()

#schema is correct!
schema = DataTypes.ROW([DataTypes.FIELD("column1", DataTypes.STRING()),
                        ...
                            ])

table = st_env.from_pandas(df,schema=schema)

st_env.create_temporary_view("view_table",table)

st_env.sql_query("select * from view_table").execute().print() # Not work!Can't 
print the result

```
Converting the tumbling window stream from the Kafka source to a pandas 
dataframe cannot print the result, even though the schema is right. I have 
tested another job using a bounded stream from a JDBC source, and it can print 
the result; the only difference is the input stream. As the docs mention, 
bounded streams are supported for conversion to pandas, so what could have 
gone wrong?





[jira] [Updated] (FLINK-30974) Add commons-io to flink-sql-connector-hbase

2023-02-08 Thread xi chaomin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xi chaomin updated FLINK-30974:
---
Summary: Add commons-io to flink-sql-connector-hbase  (was: 
NoClassDefFoundError 
org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils)

> Add commons-io to flink-sql-connector-hbase
> ---
>
> Key: FLINK-30974
> URL: https://issues.apache.org/jira/browse/FLINK-30974
> Project: Flink
>  Issue Type: Bug
>Reporter: xi chaomin
>Priority: Major
> Attachments: image-2023-02-09-14-43-06-187.png
>
>
> I use Flink SQL to join an HBase table; when I set lookup.async = true, I get 
> an error.
> !image-2023-02-09-14-43-06-187.png|width=1385,height=549!





[jira] [Comment Edited] (FLINK-30838) Update documentation about the AdaptiveBatchScheduler

2023-02-08 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686158#comment-17686158
 ] 

Zhu Zhu edited comment on FLINK-30838 at 2/9/23 6:49 AM:
-

master:
a6c84cd2cb36f37aed06ded1af3cf4b52471bdfc

release-1.17:
11724b5cb66b7efb0111aa4d7feaab204f86fa8b


was (Author: zhuzh):
Done via a6c84cd2cb36f37aed06ded1af3cf4b52471bdfc

> Update documentation about the AdaptiveBatchScheduler
> -
>
> Key: FLINK-30838
> URL: https://issues.apache.org/jira/browse/FLINK-30838
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> Documentation needs to be updated to help users enable the 
> AdaptiveBatchScheduler and configure it properly.





[jira] [Commented] (FLINK-30976) docs_404_check fails occasionally

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686225#comment-17686225
 ] 

Matthias Pohl commented on FLINK-30976:
---

It seems to be fixed again. But I'm still wondering what caused and what fixed 
it...

> docs_404_check fails occasionally
> -
>
> Key: FLINK-30976
> URL: https://issues.apache.org/jira/browse/FLINK-30976
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.17.0, 1.18.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: test-stability
>
> We've seen the docs_404_check failing in nightly builds (only the cron stage 
> but not the ci stage):
> {code}
> Re-run Hugo with the flag --panicOnWarning to get a better error message.
> ERROR 2023/02/09 01:27:27 "docs/connectors/datastream/pulsar.md": Invalid use 
> of artifact shortcode. Unknown flag `4.0.0-SNAPSHOT`
> ERROR 2023/02/09 01:27:34 "docs/connectors/datastream/pulsar.md": Invalid use 
> of artifact shortcode. Unknown flag `4.0.0-SNAPSHOT`
> Error: Error building site: logged 2 error(s)
> Total in 12945 ms
> Error building the docs
> {code}
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45909&view=logs&j=6dc02e5c-5865-5c6a-c6c5-92d598e3fc43&t=ddd6d61a-af16-5d03-2b9a-76a279badf98&l=133
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45906&view=logs&j=6dc02e5c-5865-5c6a-c6c5-92d598e3fc43&t=ddd6d61a-af16-5d03-2b9a-76a279badf98&l=132





[jira] [Comment Edited] (FLINK-30976) docs_404_check fails occasionally

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686225#comment-17686225
 ] 

Matthias Pohl edited comment on FLINK-30976 at 2/9/23 6:49 AM:
---

It seems to be fixed again. But I'm still wondering what caused and what fixed 
it and why it didn't cause a test failure in the CI stage.


was (Author: mapohl):
It seems to be fixed again. But I'm still wondering what caused and what fixed 
it...

> docs_404_check fails occasionally
> -
>
> Key: FLINK-30976
> URL: https://issues.apache.org/jira/browse/FLINK-30976
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.17.0, 1.18.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: test-stability
>
> We've seen the docs_404_check failing in nightly builds (only the cron stage 
> but not the ci stage):
> {code}
> Re-run Hugo with the flag --panicOnWarning to get a better error message.
> ERROR 2023/02/09 01:27:27 "docs/connectors/datastream/pulsar.md": Invalid use 
> of artifact shortcode. Unknown flag `4.0.0-SNAPSHOT`
> ERROR 2023/02/09 01:27:34 "docs/connectors/datastream/pulsar.md": Invalid use 
> of artifact shortcode. Unknown flag `4.0.0-SNAPSHOT`
> Error: Error building site: logged 2 error(s)
> Total in 12945 ms
> Error building the docs
> {code}
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45909&view=logs&j=6dc02e5c-5865-5c6a-c6c5-92d598e3fc43&t=ddd6d61a-af16-5d03-2b9a-76a279badf98&l=133
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45906&view=logs&j=6dc02e5c-5865-5c6a-c6c5-92d598e3fc43&t=ddd6d61a-af16-5d03-2b9a-76a279badf98&l=132





[jira] [Created] (FLINK-30976) docs_404_check fails occasionally

2023-02-08 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-30976:
-

 Summary: docs_404_check fails occasionally
 Key: FLINK-30976
 URL: https://issues.apache.org/jira/browse/FLINK-30976
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.17.0, 1.18.0
Reporter: Matthias Pohl


We've seen the docs_404_check failing in nightly builds (only the cron stage 
but not the ci stage):
{code}
Re-run Hugo with the flag --panicOnWarning to get a better error message.
ERROR 2023/02/09 01:27:27 "docs/connectors/datastream/pulsar.md": Invalid use 
of artifact shortcode. Unknown flag `4.0.0-SNAPSHOT`
ERROR 2023/02/09 01:27:34 "docs/connectors/datastream/pulsar.md": Invalid use 
of artifact shortcode. Unknown flag `4.0.0-SNAPSHOT`
Error: Error building site: logged 2 error(s)
Total in 12945 ms
Error building the docs
{code}
* 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45909&view=logs&j=6dc02e5c-5865-5c6a-c6c5-92d598e3fc43&t=ddd6d61a-af16-5d03-2b9a-76a279badf98&l=133
* 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45906&view=logs&j=6dc02e5c-5865-5c6a-c6c5-92d598e3fc43&t=ddd6d61a-af16-5d03-2b9a-76a279badf98&l=132





[jira] [Closed] (FLINK-30682) FLIP-283: Use adaptive batch scheduler as default scheduler for batch jobs

2023-02-08 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-30682.
---
Release Note: The adaptive batch scheduler is now used for batch jobs by 
default. It automatically decides the parallelism of operators. The keys and 
values of related configuration items have been improved for ease of use. More 
details can be found at 
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling/#adaptive-batch-scheduler.
  Resolution: Done

> FLIP-283: Use adaptive batch scheduler as default scheduler for batch jobs
> --
>
> Key: FLINK-30682
> URL: https://issues.apache.org/jira/browse/FLINK-30682
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
> Fix For: 1.17.0
>
>
> To further use the adaptive batch scheduler to improve Flink's batch 
> capability, in this FLIP we aim to make the adaptive batch scheduler the 
> default batch scheduler and to optimize its current configuration.
> More details see 
> [FLIP-283|https://cwiki.apache.org/confluence/display/FLINK/FLIP-283%3A+Use+adaptive+batch+scheduler+as+default+scheduler+for+batch+jobs].





[jira] [Commented] (FLINK-30629) ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat is unstable

2023-02-08 Thread Qingsheng Ren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686224#comment-17686224
 ] 

Qingsheng Ren commented on FLINK-30629:
---

The unstable case still exists on 1.17:

[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45921&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=24c3384f-1bcb-57b3-224f-51bf973bbee8&l=9864]

The failed case is 
ClientHeartbeatTest.testJob{*}CancelledIfClientHeartbeatTimeout{*}, which looks 
related to this one, so I am reporting it here. 

[~Jiangang] Could you take a look when you are available? Thanks

> ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat is unstable
> -
>
> Key: FLINK-30629
> URL: https://issues.apache.org/jira/browse/FLINK-30629
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.17.0
>Reporter: Xintong Song
>Assignee: Liu
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.17.0
>
> Attachments: ClientHeartbeatTestLog.txt
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44690&view=logs&j=77a9d8e1-d610-59b3-fc2a-4766541e0e33&t=125e07e7-8de0-5c6c-a541-a567415af3ef&l=10819
> {code:java}
> Jan 11 04:32:39 [ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 21.02 s <<< FAILURE! - in 
> org.apache.flink.client.ClientHeartbeatTest
> Jan 11 04:32:39 [ERROR] 
> org.apache.flink.client.ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat
>   Time elapsed: 9.157 s  <<< ERROR!
> Jan 11 04:32:39 java.lang.IllegalStateException: MiniCluster is not yet 
> running or has already been shut down.
> Jan 11 04:32:39   at 
> org.apache.flink.util.Preconditions.checkState(Preconditions.java:193)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniCluster.getDispatcherGatewayFuture(MiniCluster.java:1044)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniCluster.runDispatcherCommand(MiniCluster.java:917)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniCluster.getJobStatus(MiniCluster.java:841)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniClusterJobClient.getJobStatus(MiniClusterJobClient.java:91)
> Jan 11 04:32:39   at 
> org.apache.flink.client.ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat(ClientHeartbeatTest.java:79)
> {code}





[jira] [Reopened] (FLINK-30629) ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat is unstable

2023-02-08 Thread Qingsheng Ren (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qingsheng Ren reopened FLINK-30629:
---

> ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat is unstable
> -
>
> Key: FLINK-30629
> URL: https://issues.apache.org/jira/browse/FLINK-30629
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.17.0
>Reporter: Xintong Song
>Assignee: Liu
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.17.0
>
> Attachments: ClientHeartbeatTestLog.txt
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44690&view=logs&j=77a9d8e1-d610-59b3-fc2a-4766541e0e33&t=125e07e7-8de0-5c6c-a541-a567415af3ef&l=10819
> {code:java}
> Jan 11 04:32:39 [ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 21.02 s <<< FAILURE! - in 
> org.apache.flink.client.ClientHeartbeatTest
> Jan 11 04:32:39 [ERROR] 
> org.apache.flink.client.ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat
>   Time elapsed: 9.157 s  <<< ERROR!
> Jan 11 04:32:39 java.lang.IllegalStateException: MiniCluster is not yet 
> running or has already been shut down.
> Jan 11 04:32:39   at 
> org.apache.flink.util.Preconditions.checkState(Preconditions.java:193)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniCluster.getDispatcherGatewayFuture(MiniCluster.java:1044)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniCluster.runDispatcherCommand(MiniCluster.java:917)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniCluster.getJobStatus(MiniCluster.java:841)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniClusterJobClient.getJobStatus(MiniClusterJobClient.java:91)
> Jan 11 04:32:39   at 
> org.apache.flink.client.ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat(ClientHeartbeatTest.java:79)
> {code}





[jira] [Created] (FLINK-30975) Upgrade aws sdk to v2

2023-02-08 Thread Samrat Deb (Jira)
Samrat Deb created FLINK-30975:
--

 Summary: Upgrade aws sdk to v2
 Key: FLINK-30975
 URL: https://issues.apache.org/jira/browse/FLINK-30975
 Project: Flink
  Issue Type: Improvement
Reporter: Samrat Deb








[jira] [Updated] (FLINK-30974) NoClassDefFoundError org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils

2023-02-08 Thread xi chaomin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xi chaomin updated FLINK-30974:
---
Attachment: image-2023-02-09-14-43-06-187.png

> NoClassDefFoundError 
> org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
> 
>
> Key: FLINK-30974
> URL: https://issues.apache.org/jira/browse/FLINK-30974
> Project: Flink
>  Issue Type: Bug
>Reporter: xi chaomin
>Priority: Major
> Attachments: image-2023-02-09-14-43-06-187.png
>
>
> I use Flink SQL to join an HBase table; when I set lookup.async = true, I get 
> an error.
> {code:java}
> Suppressed: java.lang.Exception: java.lang.NoClassDefFoundError: 
> org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:1024)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUp(StreamTask.java:928)
>     at org.apache.flink.runtime.taskmanager.Task.lambda$restoreAndInvoke$0(Task.java:940)
>     at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
>     at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:948)
>     ... 3 more
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
>     at org.apache.hadoop.hbase.client.AsyncConnectionImpl.close(AsyncConnectionImpl.java:193)
>     at org.apache.flink.connector.hbase2.source.HBaseRowDataAsyncLookupFunction.close(HBaseRowDataAsyncLookupFunction.java:251)
>     at LookupFunction$281.close(Unknown Source)
>     at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:41)
>     at org.apache.flink.table.runtime.operators.join.lookup.AsyncLookupJoinRunner.close(AsyncLookupJoinRunner.java:137)
>     at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:41)
>     at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.close(AbstractUdfStreamOperator.java:114)
>     at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:141)
> {code}





[jira] [Updated] (FLINK-30974) NoClassDefFoundError org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils

2023-02-08 Thread xi chaomin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xi chaomin updated FLINK-30974:
---
Description: 
I use Flink SQL to join an HBase table; when I set lookup.async = true, I get 
an error.

!image-2023-02-09-14-43-06-187.png|width=1385,height=549!

  was:
I use Flink SQL to join an HBase table; when I set lookup.async = true, I get 
an error.
{code:java}
Suppressed: java.lang.Exception: java.lang.NoClassDefFoundError: 
org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:1024)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUp(StreamTask.java:928)
    at org.apache.flink.runtime.taskmanager.Task.lambda$restoreAndInvoke$0(Task.java:940)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:948)
    ... 3 more
Caused by: java.lang.NoClassDefFoundError: 
org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
    at org.apache.hadoop.hbase.client.AsyncConnectionImpl.close(AsyncConnectionImpl.java:193)
    at org.apache.flink.connector.hbase2.source.HBaseRowDataAsyncLookupFunction.close(HBaseRowDataAsyncLookupFunction.java:251)
    at LookupFunction$281.close(Unknown Source)
    at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:41)
    at org.apache.flink.table.runtime.operators.join.lookup.AsyncLookupJoinRunner.close(AsyncLookupJoinRunner.java:137)
    at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:41)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.close(AbstractUdfStreamOperator.java:114)
    at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:141)
{code}


> NoClassDefFoundError 
> org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
> 
>
> Key: FLINK-30974
> URL: https://issues.apache.org/jira/browse/FLINK-30974
> Project: Flink
>  Issue Type: Bug
>Reporter: xi chaomin
>Priority: Major
> Attachments: image-2023-02-09-14-43-06-187.png
>
>
> I use Flink SQL to join an HBase table; when I set lookup.async = true, I get 
> an error.
> !image-2023-02-09-14-43-06-187.png|width=1385,height=549!





[jira] [Updated] (FLINK-30974) NoClassDefFoundError org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils

2023-02-08 Thread xi chaomin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xi chaomin updated FLINK-30974:
---
Description: 
I use Flink SQL to join an HBase table; when I set lookup.async = true, I get 
an error.
{code:java}
Suppressed: java.lang.Exception: java.lang.NoClassDefFoundError: 
org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:1024)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUp(StreamTask.java:928)
    at org.apache.flink.runtime.taskmanager.Task.lambda$restoreAndInvoke$0(Task.java:940)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:948)
    ... 3 more
Caused by: java.lang.NoClassDefFoundError: 
org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
    at org.apache.hadoop.hbase.client.AsyncConnectionImpl.close(AsyncConnectionImpl.java:193)
    at org.apache.flink.connector.hbase2.source.HBaseRowDataAsyncLookupFunction.close(HBaseRowDataAsyncLookupFunction.java:251)
    at LookupFunction$281.close(Unknown Source)
    at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:41)
    at org.apache.flink.table.runtime.operators.join.lookup.AsyncLookupJoinRunner.close(AsyncLookupJoinRunner.java:137)
    at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:41)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.close(AbstractUdfStreamOperator.java:114)
    at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:141)
{code}

  was:
I use flink sql to join hbase table, when I set 
| |


> NoClassDefFoundError 
> org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
> 
>
> Key: FLINK-30974
> URL: https://issues.apache.org/jira/browse/FLINK-30974
> Project: Flink
>  Issue Type: Bug
>Reporter: xi chaomin
>Priority: Major
>
> I use Flink SQL to join an HBase table; when I set lookup.async = true, I get 
> an error.
> {code:java}
> Suppressed: java.lang.Exception: java.lang.NoClassDefFoundError: 
> org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:1024)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUp(StreamTask.java:928)
>     at org.apache.flink.runtime.taskmanager.Task.lambda$restoreAndInvoke$0(Task.java:940)
>     at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
>     at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:948)
>     ... 3 more
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
>     at org.apache.hadoop.hbase.client.AsyncConnectionImpl.close(AsyncConnectionImpl.java:193)
>     at org.apache.flink.connector.hbase2.source.HBaseRowDataAsyncLookupFunction.close(HBaseRowDataAsyncLookupFunction.java:251)
>     at LookupFunction$281.close(Unknown Source)
>     at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:41)
>     at org.apache.flink.table.runtime.operators.join.lookup.AsyncLookupJoinRunner.close(AsyncLookupJoinRunner.java:137)
>     at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:41)
>     at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.close(AbstractUdfStreamOperator.java:114)
>     at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:141)
>  {code}





[GitHub] [flink] shuiqiangchen commented on pull request #21897: [FLINK-30922][table-planner] Apply persisted columns when doing appendPartitionAndNu…

2023-02-08 Thread via GitHub


shuiqiangchen commented on PR #21897:
URL: https://github.com/apache/flink/pull/21897#issuecomment-1423711042

   @flinkbot run azure





[jira] [Updated] (FLINK-30974) NoClassDefFoundError org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils

2023-02-08 Thread xi chaomin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xi chaomin updated FLINK-30974:
---
Description: 
I use flink sql to join hbase table, when I set 
| |

> NoClassDefFoundError 
> org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
> 
>
> Key: FLINK-30974
> URL: https://issues.apache.org/jira/browse/FLINK-30974
> Project: Flink
>  Issue Type: Bug
>Reporter: xi chaomin
>Priority: Major
>
> I use flink sql to join hbase table, when I set 
> | |





[jira] [Created] (FLINK-30974) NoClassDefFoundError org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils

2023-02-08 Thread xi chaomin (Jira)
xi chaomin created FLINK-30974:
--

 Summary: NoClassDefFoundError 
org/apache/flink/hbase/shaded/org/apache/commons/io/IOUtils
 Key: FLINK-30974
 URL: https://issues.apache.org/jira/browse/FLINK-30974
 Project: Flink
  Issue Type: Bug
Reporter: xi chaomin








[jira] [Closed] (FLINK-30973) e2e test prepare crashes with libssl dependency not being available

2023-02-08 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl closed FLINK-30973.
-
Resolution: Duplicate

I don't know why I didn't manage to find FLINK-30972 before creating this 
issue. 8) 

> e2e test prepare crashes with libssl dependency not being available
> ---
>
> Key: FLINK-30973
> URL: https://issues.apache.org/jira/browse/FLINK-30973
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Affects Versions: 1.17.0, 1.15.3, 1.16.1, 1.18.0
>Reporter: Matthias Pohl
>Priority: Blocker
>  Labels: test-stability
>
> We're experiencing a test instability where the preparation of the e2e tests 
> fails because of a 404:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45907&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=11b0df07-3e5e-58da-eb81-03003e470195&l=1830
> {code}
> --2023-02-09 00:23:32--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 185.125.190.36, 185.125.190.39, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2023-02-09 00:23:33 ERROR 404: Not Found.
> {code}





[jira] [Updated] (FLINK-30972) E2e tests always fail in phase "Prepare E2E run"

2023-02-08 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl updated FLINK-30972:
--
Labels: test-stability  (was: )

> E2e tests always fail in phase "Prepare E2E run"
> 
>
> Key: FLINK-30972
> URL: https://issues.apache.org/jira/browse/FLINK-30972
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI, Tests
>Affects Versions: 1.17.0
>Reporter: Lijie Wang
>Assignee: Qingsheng Ren
>Priority: Blocker
>  Labels: test-stability
>
> {code:java}
> Installing required software
> Reading package lists...
> Building dependency tree...
> Reading state information...
> bc is already the newest version (1.07.1-2build1).
> bc set to manually installed.
> libapr1 is already the newest version (1.6.5-1ubuntu1).
> libapr1 set to manually installed.
> 0 upgraded, 0 newly installed, 0 to remove and 13 not upgraded.
> --2023-02-09 04:38:47--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 185.125.190.36, 185.125.190.39, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2023-02-09 04:38:47 ERROR 404: Not Found.
> WARNING: apt does not have a stable CLI interface. Use with caution in 
> scripts.
> Reading package lists...
> E: Unsupported file ./libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb given on 
> commandline
> ##[error]Bash exited with code '100'.
> Finishing: Prepare E2E run
> {code}





[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686216#comment-17686216
 ] 

Matthias Pohl commented on FLINK-18356:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45906&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=12256

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0
>Reporter: Piotr Nowojski
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=8d78fe4f-d658-5c70-12f8-4921589024c3





[jira] [Commented] (FLINK-30973) e2e test prepare crashes with libssl dependency not being available

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686215#comment-17686215
 ] 

Matthias Pohl commented on FLINK-30973:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45906&view=results

> e2e test prepare crashes with libssl dependency not being available
> ---
>
> Key: FLINK-30973
> URL: https://issues.apache.org/jira/browse/FLINK-30973
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Affects Versions: 1.17.0, 1.15.3, 1.16.1, 1.18.0
>Reporter: Matthias Pohl
>Priority: Blocker
>  Labels: test-stability
>
> We're experiencing a test instability where the preparation of the e2e tests 
> fails because of a 404:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45907&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=11b0df07-3e5e-58da-eb81-03003e470195&l=1830
> {code}
> --2023-02-09 00:23:32--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 185.125.190.36, 185.125.190.39, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2023-02-09 00:23:33 ERROR 404: Not Found.
> {code}





[jira] [Commented] (FLINK-30640) Unstable test in CliClientITCase

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686214#comment-17686214
 ] 

Matthias Pohl commented on FLINK-30640:
---

The following build doesn't include the aforementioned fix, yet:
1.17: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45909&view=logs&j=de826397-1924-5900-0034-51895f69d4b7&t=f311e913-93a2-5a37-acab-4a63e1328f94&l=42892

> Unstable test in CliClientITCase
> 
>
> Key: FLINK-30640
> URL: https://issues.apache.org/jira/browse/FLINK-30640
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive, Table SQL / Client
>Affects Versions: 1.17.0
>Reporter: yuzelin
>Assignee: dalongliu
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.17.0
>
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44743&view=logs&j=0c940707-2659-5648-cbe6-a1ad63045f0a&t=075c2716-8010-5565-fe08-3c4bb45824a4]
>  
> The failed test can work normally in my local environment.





[GitHub] [flink] godfreyhe commented on a diff in pull request #21789: [FLINK-30824][hive] Add document for option table.exec.hive.native-agg-function.enabled

2023-02-08 Thread via GitHub


godfreyhe commented on code in PR #21789:
URL: https://github.com/apache/flink/pull/21789#discussion_r1101023309


##
docs/content.zh/docs/connectors/table/hive/hive_functions.md:
##
@@ -73,6 +73,34 @@ Some Hive built-in functions in older versions have [thread 
safety issues](https
 We recommend users patch their own Hive to fix them.
 {{< /hint >}}
 
+## Use Native Hive Aggregate Functions
+
+If [HiveModule]({{< ref "docs/dev/table/modules" >}}#hivemodule) is loaded 
with a higher priority than CoreModule, Flink will try to use the Hive built-in 
function first. For Hive built-in aggregation functions,
+Flink currently uses a sort-based aggregation strategy. Compared to the 
hash-based aggregation strategy, the performance is one to two times worse, so 
from Flink 1.17, we have implemented some of Hive's aggregation functions 
natively in Flink.

Review Comment:
   If no specific scenario is given, it is best not to give specific 
performance results here. 



##
docs/content.zh/docs/connectors/table/hive/hive_functions.md:
##
@@ -73,6 +73,34 @@ Some Hive built-in functions in older versions have [thread 
safety issues](https
 We recommend users patch their own Hive to fix them.
 {{< /hint >}}
 
+## Use Native Hive Aggregate Functions
+
+If [HiveModule]({{< ref "docs/dev/table/modules" >}}#hivemodule) is loaded 
with a higher priority than CoreModule, Flink will try to use the Hive built-in 
function first. For Hive built-in aggregation functions,
+Flink currently uses a sort-based aggregation strategy. Compared to the 
hash-based aggregation strategy, the performance is one to two times worse, so 
from Flink 1.17, we have implemented some of Hive's aggregation functions 
natively in Flink.
+These functions use the hash-agg strategy to improve performance. Currently, 
only five functions are supported, namely sum/count/avg/min/max; more 
aggregation functions will be supported in the future.

Review Comment:
   for the fixed-length agg buffer, we can use hash-agg; otherwise sort-agg 
will be chosen
   
   Another performance improvement is code gen
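
   As an illustration of the setup this doc change describes, enabling the 
native Hive aggregate functions could look like the following sketch (the 
Table API calls follow the public Flink API, but treat the exact setup, 
including the Hive version, as an assumption rather than part of this PR):

   ```java
   import org.apache.flink.table.api.EnvironmentSettings;
   import org.apache.flink.table.api.TableEnvironment;
   import org.apache.flink.table.module.hive.HiveModule;

   TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
   // Give HiveModule a higher priority than CoreModule so Hive built-in
   // functions are resolved first.
   tEnv.loadModule("hive", new HiveModule("2.3.9"));
   tEnv.useModules("hive", "core");
   // Enable the native Hive aggregate function implementations (Flink 1.17+).
   tEnv.getConfig().getConfiguration()
       .setString("table.exec.hive.native-agg-function.enabled", "true");
   ```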






[GitHub] [flink] zhuzhurk commented on a diff in pull request #21898: [FLINK-30904][docs] Update the documentation and configuration description of slow task detector.

2023-02-08 Thread via GitHub


zhuzhurk commented on code in PR #21898:
URL: https://github.com/apache/flink/pull/21898#discussion_r1101022277


##
docs/content.zh/docs/deployment/speculative_execution.md:
##
@@ -55,7 +55,12 @@ under the License.
 - [`execution.batch.speculative.max-concurrent-executions`]({{< ref 
"docs/deployment/config" 
>}}#execution-batch-speculative-speculative-max-concurrent-e)
 - [`execution.batch.speculative.block-slow-node-duration`]({{< ref 
"docs/deployment/config" 
>}}#execution-batch-speculative-speculative-block-slow-node)
 
-你还可以调优下列慢任务检测器的配置项:
+目前,预测执行通过基于执行时间的慢任务检测器来检测慢任务,检测器将定期统计所有已执行完成的节点,当完成率达到某个阈值后

Review Comment:
   》某个阈值
   
   Maybe explain it by add a link to the config item 
`slow-task-detector.execution-time.baseline-ratio`



##
docs/content.zh/docs/deployment/speculative_execution.md:
##
@@ -55,7 +55,12 @@ under the License.
 - [`execution.batch.speculative.max-concurrent-executions`]({{< ref 
"docs/deployment/config" 
>}}#execution-batch-speculative-speculative-max-concurrent-e)
 - [`execution.batch.speculative.block-slow-node-duration`]({{< ref 
"docs/deployment/config" 
>}}#execution-batch-speculative-speculative-block-slow-node)
 
-你还可以调优下列慢任务检测器的配置项:
+目前,预测执行通过基于执行时间的慢任务检测器来检测慢任务,检测器将定期统计所有已执行完成的节点,当完成率达到某个阈值后
+则会将已完成节点的执行时间中位数作为基线,若运行中节点的执行时间超过基线则会被判定为慢节点。值得一提的是,

Review Comment:
   》则会将已完成节点的执行时间中位数作为基线
   
   The baseline is the median multiplied by the configured 
multiplier(`slow-task-detector.execution-time.baseline-multiplier`)



##
docs/content/docs/deployment/speculative_execution.md:
##
@@ -62,6 +62,14 @@ To make speculative execution work better for different 
jobs, you can tune below
 - [`execution.batch.speculative.max-concurrent-executions`]({{< ref 
"docs/deployment/config" 
>}}#execution-batch-speculative-speculative-max-concurrent-e)
 - [`execution.batch.speculative.block-slow-node-duration`]({{< ref 
"docs/deployment/config" 
>}}#execution-batch-speculative-speculative-block-slow-node)
 
+Currently, speculative execution uses a slow task detector based on execution 
time to detect slow tasks. 
+The detector periodically counts all finished executions; once the finished 
execution ratio reaches the threshold, 
+the median of the tasks' execution time is defined as the baseline, and an 
execution is 
+detected as a slow task if its execution time exceeds the baseline. It is 
worth mentioning that 
+the execution time is weighted with the input bytes of the execution vertex, 
so when data skew occurs, 
+executions with large differences in data volume but similar computing power 
are not 
+detected as slow tasks. That will avoid starting invalid attempts.

Review Comment:
   That will avoid starting invalid attempts. -> This helps to avoid starting 
unnecessary speculative attempts.
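
   To make the detection rule concrete, here is a minimal sketch (an 
assumption for illustration, not Flink's actual code) of the baseline 
computation described above:

   ```java
   // baselineMultiplier corresponds to
   // slow-task-detector.execution-time.baseline-multiplier.
   static long baselineMillis(long[] finishedExecutionTimesMillis, double baselineMultiplier) {
       long[] sorted = finishedExecutionTimesMillis.clone();
       java.util.Arrays.sort(sorted);
       long median = sorted[sorted.length / 2]; // median of finished executions
       return (long) (median * baselineMultiplier);
   }
   // A running execution whose (input-weighted) execution time exceeds
   // this baseline is flagged as a slow task.
   ```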






[jira] [Assigned] (FLINK-30972) E2e tests always fail in phase "Prepare E2E run"

2023-02-08 Thread Qingsheng Ren (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qingsheng Ren reassigned FLINK-30972:
-

Assignee: Qingsheng Ren

> E2e tests always fail in phase "Prepare E2E run"
> 
>
> Key: FLINK-30972
> URL: https://issues.apache.org/jira/browse/FLINK-30972
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI, Tests
>Affects Versions: 1.17.0
>Reporter: Lijie Wang
>Assignee: Qingsheng Ren
>Priority: Blocker
>
> {code:java}
> Installing required software
> Reading package lists...
> Building dependency tree...
> Reading state information...
> bc is already the newest version (1.07.1-2build1).
> bc set to manually installed.
> libapr1 is already the newest version (1.6.5-1ubuntu1).
> libapr1 set to manually installed.
> 0 upgraded, 0 newly installed, 0 to remove and 13 not upgraded.
> --2023-02-09 04:38:47--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 185.125.190.36, 185.125.190.39, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2023-02-09 04:38:47 ERROR 404: Not Found.
> WARNING: apt does not have a stable CLI interface. Use with caution in 
> scripts.
> Reading package lists...
> E: Unsupported file ./libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb given on 
> commandline
> ##[error]Bash exited with code '100'.
> Finishing: Prepare E2E run
> {code}





[jira] [Commented] (FLINK-30972) E2e tests always fail in phase "Prepare E2E run"

2023-02-08 Thread Qingsheng Ren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686211#comment-17686211
 ] 

Qingsheng Ren commented on FLINK-30972:
---

Thanks for reporting this [~wanglijie] ! The required package was removed from 
the Ubuntu repository. I'll make a hotfix first and try to find a permanent 
solution later. 

> E2e tests always fail in phase "Prepare E2E run"
> 
>
> Key: FLINK-30972
> URL: https://issues.apache.org/jira/browse/FLINK-30972
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI, Tests
>Affects Versions: 1.17.0
>Reporter: Lijie Wang
>Priority: Blocker
>
> {code:java}
> Installing required software
> Reading package lists...
> Building dependency tree...
> Reading state information...
> bc is already the newest version (1.07.1-2build1).
> bc set to manually installed.
> libapr1 is already the newest version (1.6.5-1ubuntu1).
> libapr1 set to manually installed.
> 0 upgraded, 0 newly installed, 0 to remove and 13 not upgraded.
> --2023-02-09 04:38:47--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 185.125.190.36, 185.125.190.39, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2023-02-09 04:38:47 ERROR 404: Not Found.
> WARNING: apt does not have a stable CLI interface. Use with caution in 
> scripts.
> Reading package lists...
> E: Unsupported file ./libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb given on 
> commandline
> ##[error]Bash exited with code '100'.
> Finishing: Prepare E2E run
> {code}





[jira] [Commented] (FLINK-30508) CliClientITCase.testSqlStatements failed with output not matched with expected

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686212#comment-17686212
 ] 

Matthias Pohl commented on FLINK-30508:
---

[~lsy] With "another issue" you mean FLINK-30640?

> CliClientITCase.testSqlStatements failed with output not matched with expected
> --
>
> Key: FLINK-30508
> URL: https://issues.apache.org/jira/browse/FLINK-30508
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Qingsheng Ren
>Assignee: Shengkai Fang
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.17.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44246&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=14992





[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686208#comment-17686208
 ] 

Matthias Pohl commented on FLINK-18356:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45909&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=33415

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0
>Reporter: Piotr Nowojski
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=8d78fe4f-d658-5c70-12f8-4921589024c3





[jira] [Commented] (FLINK-30973) e2e test prepare crashes with libssl dependency not being available

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686207#comment-17686207
 ] 

Matthias Pohl commented on FLINK-30973:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45909&view=results

> e2e test prepare crashes with libssl dependency not being available
> ---
>
> Key: FLINK-30973
> URL: https://issues.apache.org/jira/browse/FLINK-30973
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Affects Versions: 1.17.0, 1.15.3, 1.16.1, 1.18.0
>Reporter: Matthias Pohl
>Priority: Blocker
>  Labels: test-stability
>
> We're experiencing a test instability where the preparation of the e2e tests 
> fails because of a 404:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45907&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=11b0df07-3e5e-58da-eb81-03003e470195&l=1830
> {code}
> --2023-02-09 00:23:32--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 185.125.190.36, 185.125.190.39, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2023-02-09 00:23:33 ERROR 404: Not Found.
> {code}





[jira] [Commented] (FLINK-30973) e2e test prepare crashes with libssl dependency not being available

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686206#comment-17686206
 ] 

Matthias Pohl commented on FLINK-30973:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45912&view=results

> e2e test prepare crashes with libssl dependency not being available
> ---
>
> Key: FLINK-30973
> URL: https://issues.apache.org/jira/browse/FLINK-30973
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Affects Versions: 1.17.0, 1.15.3, 1.16.1, 1.18.0
>Reporter: Matthias Pohl
>Priority: Blocker
>  Labels: test-stability
>
> We're experiencing a test instability where the preparation of the e2e tests 
> fails because of a 404:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45907&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=11b0df07-3e5e-58da-eb81-03003e470195&l=1830
> {code}
> --2023-02-09 00:23:32--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 185.125.190.36, 185.125.190.39, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2023-02-09 00:23:33 ERROR 404: Not Found.
> {code}





[jira] [Commented] (FLINK-30973) e2e test prepare crashes with libssl dependency not being available

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686205#comment-17686205
 ] 

Matthias Pohl commented on FLINK-30973:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45910&view=results

> e2e test prepare crashes with libssl dependency not being available
> ---
>
> Key: FLINK-30973
> URL: https://issues.apache.org/jira/browse/FLINK-30973
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Affects Versions: 1.17.0, 1.15.3, 1.16.1, 1.18.0
>Reporter: Matthias Pohl
>Priority: Blocker
>  Labels: test-stability
>
> We're experiencing a test instability where the preparation of the e2e tests 
> fails because of a 404:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45907&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=11b0df07-3e5e-58da-eb81-03003e470195&l=1830
> {code}
> --2023-02-09 00:23:32--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 185.125.190.36, 185.125.190.39, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2023-02-09 00:23:33 ERROR 404: Not Found.
> {code}





[jira] [Commented] (FLINK-26091) SafetyNetCloseableRegistryTest.testNonBlockingClose failed on azure

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686203#comment-17686203
 ] 

Matthias Pohl commented on FLINK-26091:
---

I'm increasing the priority again because it happened again.

> SafetyNetCloseableRegistryTest.testNonBlockingClose failed on azure
> ---
>
> Key: FLINK-26091
> URL: https://issues.apache.org/jira/browse/FLINK-26091
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.14.3, 1.15.3
>Reporter: Yun Gao
>Priority: Critical
>  Labels: auto-deprioritized-major, test-stability
>
> {code:java}
> Feb 11 02:11:41 [INFO] Running 
> org.apache.flink.core.fs.SafetyNetCloseableRegistryTest
> Feb 11 02:11:45 [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 3.188 s <<< FAILURE! - in 
> org.apache.flink.core.fs.SafetyNetCloseableRegistryTest
> Feb 11 02:11:45 [ERROR] testNonBlockingClose  Time elapsed: 0.265 s  <<< 
> FAILURE!
> Feb 11 02:11:45 org.opentest4j.MultipleFailuresError: 
> Feb 11 02:11:45 Multiple Failures (2 failures)
> Feb 11 02:11:45   java.lang.AssertionError: Closed registry should not 
> accept closeables!
> Feb 11 02:11:45   java.lang.AssertionError: 
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.TestRun.getStoredResultOrSuccessful(TestRun.java:196)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.fireExecutionFinished(RunListenerAdapter.java:226)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:192)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:79)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.SynchronizedRunListener.testFinished(SynchronizedRunListener.java:87)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier$9.notifyListener(RunNotifier.java:225)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:72)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier.fireTestFinished(RunNotifier.java:222)
> Feb 11 02:11:45   at 
> org.junit.internal.runners.model.EachTestNotifier.fireTestFinished(EachTestNotifier.java:38)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:372)
> Feb 11 02:11:45   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> Feb 11 02:11:45   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> Feb 11 02:11:45   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> Feb 11 02:11:45   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> Feb 11 02:11:45   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> Feb 11 02:11:45   at 
> java.util.Iterator.forEachRemaining(Iterator.java:116)
> Feb 11 02:11:45   at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
> Feb 11 02:11:45   at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> Feb 11 02:11:45   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=31188&view=logs&j=6bfdaf55-0c08-5e3f-a2d2-2a0285fd41cf&t=cb073eeb-41fa-5f93-7035-c175e0e49392&l=5495



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
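
The assertion "Closed registry should not accept closeables!" in the trace above guards a simple invariant: once a closeable registry has been closed, any further registration must fail. Below is a minimal sketch of that invariant; the names are hypothetical, and this is not Flink's actual SafetyNetCloseableRegistry implementation.

{code:java}
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Toy registry illustrating the invariant checked by the failing test. */
public class SimpleCloseableRegistry implements Closeable {

    private final List<Closeable> closeables = new ArrayList<>();
    private boolean closed = false;

    /** Rejects registrations once the registry has been closed. */
    public synchronized void registerCloseable(Closeable closeable) throws IOException {
        if (closed) {
            // This branch is what "Closed registry should not accept
            // closeables!" asserts in the failing test.
            throw new IOException("Closed registry should not accept closeables");
        }
        closeables.add(closeable);
    }

    @Override
    public synchronized void close() throws IOException {
        closed = true;
        for (Closeable closeable : closeables) {
            closeable.close();
        }
        closeables.clear();
    }
}
{code}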


[jira] [Updated] (FLINK-26091) SafetyNetCloseableRegistryTest.testNonBlockingClose failed on azure

2023-02-08 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl updated FLINK-26091:
--
Priority: Critical  (was: Minor)

> SafetyNetCloseableRegistryTest.testNonBlockingClose failed on azure
> ---
>
> Key: FLINK-26091
> URL: https://issues.apache.org/jira/browse/FLINK-26091
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.14.3, 1.15.3
>Reporter: Yun Gao
>Priority: Critical
>  Labels: auto-deprioritized-major, test-stability
>
> {code:java}
> Feb 11 02:11:41 [INFO] Running 
> org.apache.flink.core.fs.SafetyNetCloseableRegistryTest
> Feb 11 02:11:45 [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 3.188 s <<< FAILURE! - in 
> org.apache.flink.core.fs.SafetyNetCloseableRegistryTest
> Feb 11 02:11:45 [ERROR] testNonBlockingClose  Time elapsed: 0.265 s  <<< 
> FAILURE!
> Feb 11 02:11:45 org.opentest4j.MultipleFailuresError: 
> Feb 11 02:11:45 Multiple Failures (2 failures)
> Feb 11 02:11:45   java.lang.AssertionError: Closed registry should not 
> accept closeables!
> Feb 11 02:11:45   java.lang.AssertionError: 
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.TestRun.getStoredResultOrSuccessful(TestRun.java:196)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.fireExecutionFinished(RunListenerAdapter.java:226)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:192)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:79)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.SynchronizedRunListener.testFinished(SynchronizedRunListener.java:87)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier$9.notifyListener(RunNotifier.java:225)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:72)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier.fireTestFinished(RunNotifier.java:222)
> Feb 11 02:11:45   at 
> org.junit.internal.runners.model.EachTestNotifier.fireTestFinished(EachTestNotifier.java:38)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:372)
> Feb 11 02:11:45   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> Feb 11 02:11:45   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> Feb 11 02:11:45   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> Feb 11 02:11:45   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> Feb 11 02:11:45   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> Feb 11 02:11:45   at 
> java.util.Iterator.forEachRemaining(Iterator.java:116)
> Feb 11 02:11:45   at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
> Feb 11 02:11:45   at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> Feb 11 02:11:45   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=31188&view=logs&j=6bfdaf55-0c08-5e3f-a2d2-2a0285fd41cf&t=cb073eeb-41fa-5f93-7035-c175e0e49392&l=5495



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-26091) SafetyNetCloseableRegistryTest.testNonBlockingClose failed on azure

2023-02-08 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl updated FLINK-26091:
--
Affects Version/s: 1.15.3

> SafetyNetCloseableRegistryTest.testNonBlockingClose failed on azure
> ---
>
> Key: FLINK-26091
> URL: https://issues.apache.org/jira/browse/FLINK-26091
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.14.3, 1.15.3
>Reporter: Yun Gao
>Priority: Minor
>  Labels: auto-deprioritized-major, test-stability
>
> {code:java}
> Feb 11 02:11:41 [INFO] Running 
> org.apache.flink.core.fs.SafetyNetCloseableRegistryTest
> Feb 11 02:11:45 [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 3.188 s <<< FAILURE! - in 
> org.apache.flink.core.fs.SafetyNetCloseableRegistryTest
> Feb 11 02:11:45 [ERROR] testNonBlockingClose  Time elapsed: 0.265 s  <<< 
> FAILURE!
> Feb 11 02:11:45 org.opentest4j.MultipleFailuresError: 
> Feb 11 02:11:45 Multiple Failures (2 failures)
> Feb 11 02:11:45   java.lang.AssertionError: Closed registry should not 
> accept closeables!
> Feb 11 02:11:45   java.lang.AssertionError: 
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.TestRun.getStoredResultOrSuccessful(TestRun.java:196)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.fireExecutionFinished(RunListenerAdapter.java:226)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:192)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:79)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.SynchronizedRunListener.testFinished(SynchronizedRunListener.java:87)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier$9.notifyListener(RunNotifier.java:225)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:72)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier.fireTestFinished(RunNotifier.java:222)
> Feb 11 02:11:45   at 
> org.junit.internal.runners.model.EachTestNotifier.fireTestFinished(EachTestNotifier.java:38)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:372)
> Feb 11 02:11:45   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> Feb 11 02:11:45   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> Feb 11 02:11:45   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> Feb 11 02:11:45   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> Feb 11 02:11:45   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> Feb 11 02:11:45   at 
> java.util.Iterator.forEachRemaining(Iterator.java:116)
> Feb 11 02:11:45   at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
> Feb 11 02:11:45   at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> Feb 11 02:11:45   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=31188&view=logs&j=6bfdaf55-0c08-5e3f-a2d2-2a0285fd41cf&t=cb073eeb-41fa-5f93-7035-c175e0e49392&l=5495



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-26091) SafetyNetCloseableRegistryTest.testNonBlockingClose failed on azure

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686202#comment-17686202
 ] 

Matthias Pohl commented on FLINK-26091:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45907&view=logs&j=b0a398c0-685b-599c-eb57-c8c2a771138e&t=747432ad-a576-5911-1e2a-68c6bedc248a&l=6244

> SafetyNetCloseableRegistryTest.testNonBlockingClose failed on azure
> ---
>
> Key: FLINK-26091
> URL: https://issues.apache.org/jira/browse/FLINK-26091
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.14.3, 1.15.3
>Reporter: Yun Gao
>Priority: Critical
>  Labels: auto-deprioritized-major, test-stability
>
> {code:java}
> Feb 11 02:11:41 [INFO] Running 
> org.apache.flink.core.fs.SafetyNetCloseableRegistryTest
> Feb 11 02:11:45 [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 3.188 s <<< FAILURE! - in 
> org.apache.flink.core.fs.SafetyNetCloseableRegistryTest
> Feb 11 02:11:45 [ERROR] testNonBlockingClose  Time elapsed: 0.265 s  <<< 
> FAILURE!
> Feb 11 02:11:45 org.opentest4j.MultipleFailuresError: 
> Feb 11 02:11:45 Multiple Failures (2 failures)
> Feb 11 02:11:45   java.lang.AssertionError: Closed registry should not 
> accept closeables!
> Feb 11 02:11:45   java.lang.AssertionError: 
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.TestRun.getStoredResultOrSuccessful(TestRun.java:196)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.fireExecutionFinished(RunListenerAdapter.java:226)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:192)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:79)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.SynchronizedRunListener.testFinished(SynchronizedRunListener.java:87)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier$9.notifyListener(RunNotifier.java:225)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:72)
> Feb 11 02:11:45   at 
> org.junit.runner.notification.RunNotifier.fireTestFinished(RunNotifier.java:222)
> Feb 11 02:11:45   at 
> org.junit.internal.runners.model.EachTestNotifier.fireTestFinished(EachTestNotifier.java:38)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:372)
> Feb 11 02:11:45   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> Feb 11 02:11:45   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Feb 11 02:11:45   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> Feb 11 02:11:45   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> Feb 11 02:11:45   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> Feb 11 02:11:45   at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> Feb 11 02:11:45   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> Feb 11 02:11:45   at 
> java.util.Iterator.forEachRemaining(Iterator.java:116)
> Feb 11 02:11:45   at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
> Feb 11 02:11:45   at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> Feb 11 02:11:45   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> Feb 11 02:11:45   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=31188&view=logs&j=6bfdaf55-0c08-5e3f-a2d2-2a0285fd41cf&t=cb073eeb-41fa-5f93-7035-c175e0e49392&l=5495



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-27285) CassandraConnectorITCase failed on azure due to NoHostAvailableException

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686200#comment-17686200
 ] 

Matthias Pohl commented on FLINK-27285:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45907&view=logs&j=e9af9cde-9a65-5281-a58e-2c8511d36983&t=c520d2c3-4d17-51f1-813b-4b0b74a0c307&l=13532

> CassandraConnectorITCase failed on azure due to NoHostAvailableException
> 
>
> Key: FLINK-27285
> URL: https://issues.apache.org/jira/browse/FLINK-27285
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Cassandra
>Affects Versions: 1.16.0, 1.15.3, cassandra-3.0.0
>Reporter: Yun Gao
>Priority: Critical
>  Labels: auto-deprioritized-major, test-stability
>
> {code:java}
> 2022-04-17T06:24:40.1216092Z Apr 17 06:24:40 [ERROR] 
> org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase  
> Time elapsed: 29.81 s  <<< ERROR!
> 2022-04-17T06:24:40.1218517Z Apr 17 06:24:40 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: /172.17.0.1:53053 
> (com.datastax.driver.core.exceptions.OperationTimedOutException: 
> [/172.17.0.1] Timed out waiting for server response))
> 2022-04-17T06:24:40.1220821Z Apr 17 06:24:40  at 
> com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
> 2022-04-17T06:24:40.1222816Z Apr 17 06:24:40  at 
> com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:37)
> 2022-04-17T06:24:40.1224696Z Apr 17 06:24:40  at 
> com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
> 2022-04-17T06:24:40.1226624Z Apr 17 06:24:40  at 
> com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
> 2022-04-17T06:24:40.1228346Z Apr 17 06:24:40  at 
> com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:63)
> 2022-04-17T06:24:40.1229839Z Apr 17 06:24:40  at 
> com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:39)
> 2022-04-17T06:24:40.1231736Z Apr 17 06:24:40  at 
> org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase.startAndInitializeCassandra(CassandraConnectorITCase.java:385)
> 2022-04-17T06:24:40.1233614Z Apr 17 06:24:40  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2022-04-17T06:24:40.1234992Z Apr 17 06:24:40  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2022-04-17T06:24:40.1236194Z Apr 17 06:24:40  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2022-04-17T06:24:40.1237598Z Apr 17 06:24:40  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2022-04-17T06:24:40.1238768Z Apr 17 06:24:40  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 2022-04-17T06:24:40.1240056Z Apr 17 06:24:40  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2022-04-17T06:24:40.1242109Z Apr 17 06:24:40  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 2022-04-17T06:24:40.1243493Z Apr 17 06:24:40  at 
> org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33)
> 2022-04-17T06:24:40.1244903Z Apr 17 06:24:40  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
> 2022-04-17T06:24:40.1246352Z Apr 17 06:24:40  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 2022-04-17T06:24:40.1247809Z Apr 17 06:24:40  at 
> org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:30)
> 2022-04-17T06:24:40.1249193Z Apr 17 06:24:40  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> 2022-04-17T06:24:40.1250395Z Apr 17 06:24:40  at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2022-04-17T06:24:40.1251468Z Apr 17 06:24:40  at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 2022-04-17T06:24:40.1252601Z Apr 17 06:24:40  at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> 2022-04-17T06:24:40.1253640Z Apr 17 06:24:40  at 
> org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> 2022-04-17T06:24:40.1254768Z Apr 17 06:24:40  at 
> org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> 2022-04-17T06:24:40.1256077Z Apr 17 06:24:40  at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
> 2022-04-17T06:24:40.1257492Z Apr 17 06:24:40  at 
> org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
> 2022-04-17T06:24:40.1258820Z Apr 17 06:24:40  at 
> org.junit.vintage
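
The OperationTimedOutException in the trace above is raised when the DataStax 3.x driver's default read timeout expires before the (possibly slow) test container responds. As a hedged sketch only, one mitigation is to raise the driver's socket timeouts when building the test session; the timeout values below are illustrative assumptions, not the connector's actual settings.

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SocketOptions;

public class CassandraSessionFactory {

    /** Builds a session with generous timeouts for slow CI containers. */
    public static Session createSession(String host, int port) {
        SocketOptions socketOptions =
                new SocketOptions()
                        .setConnectTimeoutMillis(30_000) // driver default is 5s
                        .setReadTimeoutMillis(60_000);   // driver default is 12s
        Cluster cluster =
                Cluster.builder()
                        .addContactPoint(host)
                        .withPort(port)
                        .withSocketOptions(socketOptions)
                        .build();
        return cluster.connect();
    }
}
{code}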

[jira] [Commented] (FLINK-25813) TableITCase.testCollectWithClose failed on azure

2023-02-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686199#comment-17686199
 ] 

Matthias Pohl commented on FLINK-25813:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45907&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=10822

> TableITCase.testCollectWithClose failed on azure
> 
>
> Key: FLINK-25813
> URL: https://issues.apache.org/jira/browse/FLINK-25813
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.15.0, 1.16.0, 1.17.0
>Reporter: Yun Gao
>Assignee: Caizhi Weng
>Priority: Critical
>  Labels: pull-request-available, stale-assigned, test-stability
>
> {code:java}
> 2022-01-25T08:35:25.3735884Z Jan 25 08:35:25 [ERROR] 
> TableITCase.testCollectWithClose  Time elapsed: 0.377 s  <<< FAILURE!
> 2022-01-25T08:35:25.3737127Z Jan 25 08:35:25 java.lang.AssertionError: Values 
> should be different. Actual: RUNNING
> 2022-01-25T08:35:25.3738167Z Jan 25 08:35:25  at 
> org.junit.Assert.fail(Assert.java:89)
> 2022-01-25T08:35:25.3739085Z Jan 25 08:35:25  at 
> org.junit.Assert.failEquals(Assert.java:187)
> 2022-01-25T08:35:25.3739922Z Jan 25 08:35:25  at 
> org.junit.Assert.assertNotEquals(Assert.java:163)
> 2022-01-25T08:35:25.3740846Z Jan 25 08:35:25  at 
> org.junit.Assert.assertNotEquals(Assert.java:177)
> 2022-01-25T08:35:25.3742302Z Jan 25 08:35:25  at 
> org.apache.flink.table.api.TableITCase.testCollectWithClose(TableITCase.scala:135)
> 2022-01-25T08:35:25.3743327Z Jan 25 08:35:25  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2022-01-25T08:35:25.3744343Z Jan 25 08:35:25  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2022-01-25T08:35:25.3745575Z Jan 25 08:35:25  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2022-01-25T08:35:25.3746840Z Jan 25 08:35:25  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2022-01-25T08:35:25.3747922Z Jan 25 08:35:25  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 2022-01-25T08:35:25.3749151Z Jan 25 08:35:25  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2022-01-25T08:35:25.3750422Z Jan 25 08:35:25  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 2022-01-25T08:35:25.3751820Z Jan 25 08:35:25  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2022-01-25T08:35:25.3753196Z Jan 25 08:35:25  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2022-01-25T08:35:25.3754253Z Jan 25 08:35:25  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 2022-01-25T08:35:25.3755441Z Jan 25 08:35:25  at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
> 2022-01-25T08:35:25.3756656Z Jan 25 08:35:25  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> 2022-01-25T08:35:25.3757778Z Jan 25 08:35:25  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2022-01-25T08:35:25.3758821Z Jan 25 08:35:25  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> 2022-01-25T08:35:25.3759840Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 2022-01-25T08:35:25.3760919Z Jan 25 08:35:25  at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> 2022-01-25T08:35:25.3762249Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> 2022-01-25T08:35:25.3763322Z Jan 25 08:35:25  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> 2022-01-25T08:35:25.3764436Z Jan 25 08:35:25  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> 2022-01-25T08:35:25.3765907Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> 2022-01-25T08:35:25.3766957Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> 2022-01-25T08:35:25.3768104Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> 2022-01-25T08:35:25.3769128Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> 2022-01-25T08:35:25.3770125Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> 2022-01-25T08:35:25.3771118Z Jan 25 08:35:25  at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> 2022-01-25T08:35:25.3772264Z Jan 25 08:35:25  at 
> org.junit.r
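
The assertion failure above ("Values should be different. Actual: RUNNING") points at a race: closing the collect iterator cancels the job asynchronously, so the job status can still be RUNNING immediately after close() returns. The sketch below shows the racy pattern and one way to wait out the cancellation; the polling loop is an illustrative workaround, not the actual fix from the linked pull request.

{code:java}
import org.apache.flink.api.common.JobStatus;
import org.apache.flink.core.execution.JobClient;
import org.apache.flink.table.api.TableResult;
import org.apache.flink.types.Row;
import org.apache.flink.util.CloseableIterator;

public class CollectWithCloseExample {

    /** Closes the collect iterator and polls until the job leaves RUNNING. */
    public static void awaitJobStoppedAfterClose(TableResult result) throws Exception {
        try (CloseableIterator<Row> it = result.collect()) {
            it.next(); // consume at least one row so the job is actually running
        } // close() only *requests* job cancellation; it is asynchronous

        JobClient client = result.getJobClient().orElseThrow(IllegalStateException::new);
        JobStatus status = client.getJobStatus().get();
        while (status == JobStatus.RUNNING) {
            Thread.sleep(100); // poll until the cancellation takes effect
            status = client.getJobStatus().get();
        }
    }
}
{code}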

[jira] [Updated] (FLINK-30972) E2e tests always fail in phase "Prepare E2E run"

2023-02-08 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-30972:

Priority: Blocker  (was: Critical)

> E2e tests always fail in phase "Prepare E2E run"
> 
>
> Key: FLINK-30972
> URL: https://issues.apache.org/jira/browse/FLINK-30972
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI, Tests
>Reporter: Lijie Wang
>Priority: Blocker
>
> {code:java}
> Installing required software
> Reading package lists...
> Building dependency tree...
> Reading state information...
> bc is already the newest version (1.07.1-2build1).
> bc set to manually installed.
> libapr1 is already the newest version (1.6.5-1ubuntu1).
> libapr1 set to manually installed.
> 0 upgraded, 0 newly installed, 0 to remove and 13 not upgraded.
> --2023-02-09 04:38:47--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 185.125.190.36, 185.125.190.39, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2023-02-09 04:38:47 ERROR 404: Not Found.
> WARNING: apt does not have a stable CLI interface. Use with caution in 
> scripts.
> Reading package lists...
> E: Unsupported file ./libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb given on 
> commandline
> ##[error]Bash exited with code '100'.
> Finishing: Prepare E2E run
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30973) e2e test prepare crashes with libssl dependency not being available

2023-02-08 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-30973:
-

 Summary: e2e test prepare crashes with libssl dependency not being 
available
 Key: FLINK-30973
 URL: https://issues.apache.org/jira/browse/FLINK-30973
 Project: Flink
  Issue Type: Bug
  Components: Test Infrastructure
Affects Versions: 1.16.1, 1.15.3, 1.17.0, 1.18.0
Reporter: Matthias Pohl


We're experiencing a test instability where the preparation of the e2e tests 
fails because of a 404:
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45907&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=11b0df07-3e5e-58da-eb81-03003e470195&l=1830

{code}
--2023-02-09 00:23:32--  
http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
185.125.190.36, 185.125.190.39, ...
Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
connected.
HTTP request sent, awaiting response... 404 Not Found
2023-02-09 00:23:33 ERROR 404: Not Found.
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30972) E2e tests always fail in phase "Prepare E2E run"

2023-02-08 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-30972:

Affects Version/s: 1.17.0

> E2e tests always fail in phase "Prepare E2E run"
> 
>
> Key: FLINK-30972
> URL: https://issues.apache.org/jira/browse/FLINK-30972
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI, Tests
>Affects Versions: 1.17.0
>Reporter: Lijie Wang
>Priority: Blocker
>
> {code:java}
> Installing required software
> Reading package lists...
> Building dependency tree...
> Reading state information...
> bc is already the newest version (1.07.1-2build1).
> bc set to manually installed.
> libapr1 is already the newest version (1.6.5-1ubuntu1).
> libapr1 set to manually installed.
> 0 upgraded, 0 newly installed, 0 to remove and 13 not upgraded.
> --2023-02-09 04:38:47--  
> http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
> Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 
> 185.125.190.36, 185.125.190.39, ...
> Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... 
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2023-02-09 04:38:47 ERROR 404: Not Found.
> WARNING: apt does not have a stable CLI interface. Use with caution in 
> scripts.
> Reading package lists...
> E: Unsupported file ./libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb given on 
> commandline
> ##[error]Bash exited with code '100'.
> Finishing: Prepare E2E run
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] Mulavar commented on pull request #21545: [FLINK-30396][table]make alias hint take effect in correlate

2023-02-08 Thread via GitHub


Mulavar commented on PR #21545:
URL: https://github.com/apache/flink/pull/21545#issuecomment-1423682118

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] Mulavar commented on a diff in pull request #21545: [FLINK-30396][table]make alias hint take effect in correlate

2023-02-08 Thread via GitHub


Mulavar commented on code in PR #21545:
URL: https://github.com/apache/flink/pull/21545#discussion_r1101015734


##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/alias/ClearLookupJoinHintWithInvalidPropagationShuttleTest.java:
##
@@ -61,89 +76,208 @@ public void before() throws Exception {
 + ") WITH (\n"
 + " 'connector' = 'values'\n"
 + ")");
+util.tableEnv()
+.createTemporarySystemFunction(
+"MockOffset",
+new 
ClearLookupJoinHintWithInvalidPropagationShuttleTest()
+.new MockOffsetTableFunction());
 }
 
 @Test
 public void testNoNeedToClearLookupHint() {
 // SELECT /*+ LOOKUP('table'='lookup', 'retry-predicate'='lookup_miss',
-// 'retry-strategy'='fixed_delay', 'fixed-delay'='155 ms', 
'max-attempts'='10') ) */ *
-//  FROM src
-//  JOIN lookup FOR SYSTEM_TIME AS OF T.proctime AS D
-//  ON T.a = D.a
+// 'retry-strategy'='fixed_delay', 'fixed-delay'='155 ms', 
'max-attempts'='10',
+// 'async'='true', 'output-mode'='allow_unordered','capacity'='1000', 
'time-out'='300 s')
+// */ s.a
+// FROM src s
+// JOIN lookup FOR SYSTEM_TIME AS OF s.pts AS d
+// ON s.a=d.a
+CorrelationId cid = builder.getCluster().createCorrel();
+RelDataType aType =
+builder.getTypeFactory()
+.createStructType(
+Collections.singletonList(
+
builder.getTypeFactory().createSqlType(SqlTypeName.BIGINT)),
+Collections.singletonList("a"));
+RelDataType ptsType =
+builder.getTypeFactory()
+.createStructType(
+Collections.singletonList(
+builder.getTypeFactory()
+
.createProctimeIndicatorType(false)),
+Collections.singletonList("pts"));
 RelNode root =
 builder.scan("src")
 .scan("lookup")
 
.snapshot(builder.getRexBuilder().makeCall(FlinkSqlOperatorTable.PROCTIME))
-.join(
+.filter(
+builder.equals(
+builder.field(
+
builder.getRexBuilder().makeCorrel(aType, cid),
+"a"),
+
builder.getRexBuilder().makeInputRef(aType, 0)))
+.correlate(
 JoinRelType.INNER,
-builder.equals(builder.field(2, 0, "a"), 
builder.field(2, 1, "a")))
+cid,
+builder.getRexBuilder().makeInputRef(aType, 0),
+builder.getRexBuilder().makeInputRef(ptsType, 
1))
 .project(builder.field(1, 0, "a"))
-
.hints(LookupJoinHintTestUtil.getLookupJoinHint("lookup", false, true))
+
.hints(RelHint.builder(FlinkHints.HINT_ALIAS).hintOption("t1").build())
+.hints(LookupJoinHintTestUtil.getLookupJoinHint("d", 
true, false))
 .build();
 verifyRelPlan(root);
 }
 
 @Test
-public void 
testClearLookupHintWithInvalidPropagationToViewWhileViewHasLookupHints() {
-// SELECT /*+ LOOKUP('table'='lookup', 'retry-predicate'='lookup_miss',
-// 'retry-strategy'='fixed_delay', 'fixed-delay'='155 ms', 
'max-attempts'='10') ) */ *
-//   FROM (
-// SELECT /*+ LOOKUP('table'='lookup', 'async'='true', 
'output-mode'='allow_unordered',
-// 'capacity'='1000', 'time-out'='300 s'
-//   src.a, src.proctime
-// FROM src
-//   JOIN lookup FOR SYSTEM_TIME AS OF T.proctime AS D
-// ON T.a = D.id
-// ) t1 JOIN lookup FOR SYSTEM_TIME AS OF t1.proctime AS t2 ON 
t1.a = t2.a
+public void testClearLookupHintWithInvalidPropagationToSubQuery() {
+// SELECT /*+ LOOKUP('table'='src', 'retry-predicate'='lookup_miss',
+// 'retry-strategy'='fixed_delay', 'fixed-delay'='155 ms', 
'max-attempts'='10',
+// 'async'='true', 'output-mode'='allow_unordered','capacity'='1000', 
'time-out'='300 s')
+// */ t1.a
+//  FROM (
+//  SELECT s.a
+//  FROM src s
+//  JOIN lookup FOR SYSTEM_TIME AS OF s.pts AS d
+//  ON s.a=d.a
+//  ) t1
+//  JOIN src t2
+//  ON t1.a=t2.a
+
+CorrelationId ci

[jira] [Commented] (FLINK-29734) Support authentication in the Flink SQL Gateway

2023-02-08 Thread melin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686197#comment-17686197
 ] 

melin commented on FLINK-29734:
---

(?)

> Support authentication in the Flink SQL Gateway
> ---
>
> Key: FLINK-29734
> URL: https://issues.apache.org/jira/browse/FLINK-29734
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Gateway
>Affects Versions: 1.17.0
>Reporter: Shengkai Fang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] lincoln-lil commented on a diff in pull request #21545: [FLINK-30396][table]make alias hint take effect in correlate

2023-02-08 Thread via GitHub


lincoln-lil commented on code in PR #21545:
URL: https://github.com/apache/flink/pull/21545#discussion_r1101012560


##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/alias/ClearLookupJoinHintWithInvalidPropagationShuttleTest.java:
##
@@ -61,89 +76,208 @@ public void before() throws Exception {
 + ") WITH (\n"
 + " 'connector' = 'values'\n"
 + ")");
+util.tableEnv()
+.createTemporarySystemFunction(
+"MockOffset",
+new 
ClearLookupJoinHintWithInvalidPropagationShuttleTest()
+.new MockOffsetTableFunction());
 }
 
 @Test
 public void testNoNeedToClearLookupHint() {
 // SELECT /*+ LOOKUP('table'='lookup', 'retry-predicate'='lookup_miss',
-// 'retry-strategy'='fixed_delay', 'fixed-delay'='155 ms', 
'max-attempts'='10') ) */ *
-//  FROM src
-//  JOIN lookup FOR SYSTEM_TIME AS OF T.proctime AS D
-//  ON T.a = D.a
+// 'retry-strategy'='fixed_delay', 'fixed-delay'='155 ms', 
'max-attempts'='10',
+// 'async'='true', 'output-mode'='allow_unordered','capacity'='1000', 
'time-out'='300 s')
+// */ s.a
+// FROM src s
+// JOIN lookup FOR SYSTEM_TIME AS OF s.pts AS d
+// ON s.a=d.a
+CorrelationId cid = builder.getCluster().createCorrel();
+RelDataType aType =
+builder.getTypeFactory()
+.createStructType(
+Collections.singletonList(
+
builder.getTypeFactory().createSqlType(SqlTypeName.BIGINT)),
+Collections.singletonList("a"));
+RelDataType ptsType =
+builder.getTypeFactory()
+.createStructType(
+Collections.singletonList(
+builder.getTypeFactory()
+
.createProctimeIndicatorType(false)),
+Collections.singletonList("pts"));
 RelNode root =
 builder.scan("src")
 .scan("lookup")
 
.snapshot(builder.getRexBuilder().makeCall(FlinkSqlOperatorTable.PROCTIME))
-.join(
+.filter(
+builder.equals(
+builder.field(
+
builder.getRexBuilder().makeCorrel(aType, cid),
+"a"),
+
builder.getRexBuilder().makeInputRef(aType, 0)))
+.correlate(
 JoinRelType.INNER,
-builder.equals(builder.field(2, 0, "a"), 
builder.field(2, 1, "a")))
+cid,
+builder.getRexBuilder().makeInputRef(aType, 0),
+builder.getRexBuilder().makeInputRef(ptsType, 
1))
 .project(builder.field(1, 0, "a"))
-
.hints(LookupJoinHintTestUtil.getLookupJoinHint("lookup", false, true))
+
.hints(RelHint.builder(FlinkHints.HINT_ALIAS).hintOption("t1").build())
+.hints(LookupJoinHintTestUtil.getLookupJoinHint("d", 
true, false))
 .build();
 verifyRelPlan(root);
 }
 
 @Test
-public void 
testClearLookupHintWithInvalidPropagationToViewWhileViewHasLookupHints() {
-// SELECT /*+ LOOKUP('table'='lookup', 'retry-predicate'='lookup_miss',
-// 'retry-strategy'='fixed_delay', 'fixed-delay'='155 ms', 
'max-attempts'='10') ) */ *
-//   FROM (
-// SELECT /*+ LOOKUP('table'='lookup', 'async'='true', 
'output-mode'='allow_unordered',
-// 'capacity'='1000', 'time-out'='300 s'
-//   src.a, src.proctime
-// FROM src
-//   JOIN lookup FOR SYSTEM_TIME AS OF T.proctime AS D
-// ON T.a = D.id
-// ) t1 JOIN lookup FOR SYSTEM_TIME AS OF t1.proctime AS t2 ON 
t1.a = t2.a
+public void testClearLookupHintWithInvalidPropagationToSubQuery() {
+// SELECT /*+ LOOKUP('table'='src', 'retry-predicate'='lookup_miss',
+// 'retry-strategy'='fixed_delay', 'fixed-delay'='155 ms', 
'max-attempts'='10',
+// 'async'='true', 'output-mode'='allow_unordered','capacity'='1000', 
'time-out'='300 s')
+// */ t1.a
+//  FROM (
+//  SELECT s.a
+//  FROM src s
+//  JOIN lookup FOR SYSTEM_TIME AS OF s.pts AS d
+//  ON s.a=d.a
+//  ) t1
+//  JOIN src t2
+//  ON t1.a=t2.a
+
+CorrelationI

[GitHub] [flink-ml] lindong28 opened a new pull request, #209: [FLINK-30688][followup] Disable Kryo fallback for tests in flink-ml-lib

2023-02-08 Thread via GitHub


lindong28 opened a new pull request, #209:
URL: https://github.com/apache/flink-ml/pull/209

   ## What is the purpose of the change
   
   Disable Kryo fallback for all tests (except `OnlineLogisticRegressionTest`) 
in flink-ml-lib to ensure that all Flink ML algorithms can run without relying 
on Kryo for serialization and deserialization.
   
   ## Brief change log
   
   TBD
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
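
For context, "disabling Kryo fallback" in Flink means turning off generic-type serialization, so that any type without a built-in or POJO serializer fails fast instead of silently falling back to Kryo. A minimal sketch, assuming the DataStream ExecutionConfig API:

{code:java}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class DisableKryoFallbackExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // After this call, serializing a type that has no built-in or POJO
        // serializer throws an exception instead of silently using Kryo.
        env.getConfig().disableGenericTypes();
    }
}
{code}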



[GitHub] [flink-ml] lindong28 merged pull request #203: [FLINK-30688][followup] Disable Kryo fallback for tests in flink-ml-lib

2023-02-08 Thread via GitHub


lindong28 merged PR #203:
URL: https://github.com/apache/flink-ml/pull/203


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


