date:20220331

[GitHub] [flink] flinkbot edited a comment on pull request #19177: [FLINK-23399][state] Add a benchmark for rescaling

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19177:
URL: https://github.com/apache/flink/pull/19177#issuecomment-1073539671


   
   ## CI report:
   
   * 1012ead6db20e77f4fa95ebcc92fd3f23e59cc11 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34044)
 
   * 6eae4e703ea4a777b1d0b92e8a6e81653c4e849a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34073)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] syhily commented on a change in pull request #18816: [FLINK-26027][pulsar] add sink metrics to pulsar sink

2022-03-31 Thread GitBox



syhily commented on a change in pull request #18816:
URL: https://github.com/apache/flink/pull/18816#discussion_r840254164



##
File path: 
flink-metrics/flink-metrics-core/src/main/java/org/apache/flink/metrics/groups/UnregisteredMetricsGroup.java
##
@@ -178,4 +182,35 @@ public T getValue() {
 return null;
 }
 }
+
+private static class UnregisteredSinkWriterMetricGroup extends 
UnregisteredMetricsGroup

Review comment:
   Why we need this implementation?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] yunfengzhou-hub commented on a change in pull request #71: [FLINK-26443] Add benchmark framework

2022-03-31 Thread GitBox



yunfengzhou-hub commented on a change in pull request #71:
URL: https://github.com/apache/flink-ml/pull/71#discussion_r840253077



##
File path: 
flink-ml-benchmark/src/main/java/org/apache/flink/ml/benchmark/BenchmarkUtils.java
##
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.benchmark;
+
+import org.apache.flink.api.common.JobExecutionResult;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.ml.api.AlgoOperator;
+import org.apache.flink.ml.api.Estimator;
+import org.apache.flink.ml.api.Model;
+import org.apache.flink.ml.api.Stage;
+import org.apache.flink.ml.benchmark.generator.DataGenerator;
+import org.apache.flink.ml.common.datastream.TableUtils;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.util.Preconditions;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/** Utility methods for benchmarks. */
+public class BenchmarkUtils {
+/**
+ * Instantiates a benchmark from its parameter map and executes the 
benchmark in the provided
+ * environment.
+ *
+ * @return Results of the executed benchmark.
+ */
+@SuppressWarnings({"unchecked", "rawtypes"})
+public static BenchmarkResult runBenchmark(
+StreamTableEnvironment tEnv, String name, Map params) 
throws Exception {
+Stage stage = ReadWriteUtils.instantiateWithParams((Map) 
params.get("stage"));
+DataGenerator inputsGenerator =
+ReadWriteUtils.instantiateWithParams((Map) 
params.get("inputs"));
+DataGenerator modelDataGenerator = null;
+if (params.containsKey("modelData")) {
+modelDataGenerator =
+ReadWriteUtils.instantiateWithParams((Map) 
params.get("modelData"));
+}
+
+return runBenchmark(tEnv, name, stage, inputsGenerator, 
modelDataGenerator);
+}
+
+/**
+ * Executes a benchmark from a stage with its inputsGenerator in the 
provided environment.
+ *
+ * @return Results of the executed benchmark.
+ */
+public static BenchmarkResult runBenchmark(
+StreamTableEnvironment tEnv,
+String name,
+Stage stage,
+DataGenerator inputsGenerator)
+throws Exception {
+return runBenchmark(tEnv, name, stage, inputsGenerator, null);
+}
+
+/**
+ * Executes a benchmark from a stage with its inputsGenerator and 
modelDataGenerator in the
+ * provided environment.
+ *
+ * @return Results of the executed benchmark.
+ */
+public static BenchmarkResult runBenchmark(
+StreamTableEnvironment tEnv,
+String name,
+Stage stage,
+DataGenerator inputsGenerator,
+DataGenerator modelDataGenerator)
+throws Exception {
+StreamExecutionEnvironment env = 
TableUtils.getExecutionEnvironment(tEnv);
+
+Table[] inputTables = inputsGenerator.getData(tEnv);
+if (modelDataGenerator != null) {
+((Model) stage).setModelData(modelDataGenerator.getData(tEnv));
+}
+
+Table[] outputTables;
+if (stage instanceof Estimator) {
+outputTables = ((Estimator) 
stage).fit(inputTables).getModelData();
+} else if (stage instanceof AlgoOperator) {
+outputTables = ((AlgoOperator) stage).transform(inputTables);
+} else {
+throw new IllegalArgumentException("Unsupported Stage class " + 
stage.getClass());
+}
+
+for (Table table : outputTables) {
+tEnv.toDataStream(table).addSink(new 
CountingAndDiscardingSink<>());
+}
+
+JobExecutionResult executionResult = env.execute();
+
+BenchmarkResult result = new BenchmarkResult();
+result.name = name;
+result.totalTimeMs = (double) 
executionResult.getNetRuntime(TimeUnit.MILLISECONDS);
+

[GitHub] [flink] flinkbot edited a comment on pull request #19177: [FLINK-23399][state] Add a benchmark for rescaling

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19177:
URL: https://github.com/apache/flink/pull/19177#issuecomment-1073539671


   
   ## CI report:
   
   * 1012ead6db20e77f4fa95ebcc92fd3f23e59cc11 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34044)
 
   * 6eae4e703ea4a777b1d0b92e8a6e81653c4e849a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Created] (FLINK-26972) Support spot instance in Flink Kubernetes operator

2022-03-31 Thread Yang Wang (Jira)

Yang Wang created FLINK-26972:
-

 Summary: Support spot instance in Flink Kubernetes operator
 Key: FLINK-26972
 URL: https://issues.apache.org/jira/browse/FLINK-26972
 Project: Flink
  Issue Type: New Feature
  Components: Kubernetes Operator
Reporter: Yang Wang


>From the discussion in the ML[1][2], some users are running Flink on spot EC2. 
>Since cloud Kubernetes also supports spot instance[3], it will be great if the 
>K8s operator could be aware of eviction events. Then it could trigger a 
>savepoint and delete the pods gracefully.

 

Maybe this is unnecessary when the Kubernetes HA enabled. Let's keep this 
ticket open and see what could we do in the Flink K8s operator to make stateful 
Flink application running on spot instance better.

 

[1]. [https://lists.apache.org/thread/k13mh1c01zsd59kchw4dv160qx4s6l66]

[2]. [https://lists.apache.org/thread/c1x1ys3cdvhd9r4v08z399kch3o34nf0]

[3]. [https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Comment Edited] (FLINK-11388) Add an OSS RecoverableWriter

2022-03-31 Thread Jingsong Lee (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17513235#comment-17513235
 ] 

Jingsong Lee edited comment on FLINK-11388 at 4/1/22 5:21 AM:
--

master: 672d26e831047a99b4d6432fe0046ff036279887

document: abd31fc7de06b5041f75556cdcce2fb2acfc85a7


was (Author: lzljs3620320):
master: 672d26e831047a99b4d6432fe0046ff036279887

> Add an OSS RecoverableWriter
> 
>
> Key: FLINK-11388
> URL: https://issues.apache.org/jira/browse/FLINK-11388
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem, FileSystems
>Affects Versions: 1.7.1
>Reporter: wujinhu
>Assignee: wujinhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> OSS offers persistence only after uploads or multi-part uploads complete.  In 
> order to make streaming uses OSS as sink, we should implement a Recoverable 
> writer. This writer will snapshot and store multi-part upload information and 
> recover from those information when failure occurs



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] JingsongLi merged pull request #19292: [FLINK-11388][docs][oss]Update docs for OSS Recoverable writer

2022-03-31 Thread GitBox



JingsongLi merged pull request #19292:
URL: https://github.com/apache/flink/pull/19292


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] imaffe commented on pull request #19289: [BACKPORT 1.14]: [FLINK-25440][doc][pulsar] Stop and Start cursor now all uses publish…

2022-03-31 Thread GitBox



imaffe commented on pull request #19289:
URL: https://github.com/apache/flink/pull/19289#issuecomment-1085410874


   @MartijnVisser The pipeline has succeeded ~ Thank you ~ 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Updated] (FLINK-26971) UniversalCompaction should pick by size ratio after picking by file num

2022-03-31 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-26971:
---
Labels: pull-request-available  (was: )

> UniversalCompaction should pick by size ratio after picking by file num
> ---
>
> Key: FLINK-26971
> URL: https://issues.apache.org/jira/browse/FLINK-26971
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.1.0
>
>
> This way we can avoid the problem of oversized first files



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink-table-store] JingsongLi opened a new pull request #70: [FLINK-26971] UniversalCompaction should pick by size ratio after picking by file num

2022-03-31 Thread GitBox



JingsongLi opened a new pull request #70:
URL: https://github.com/apache/flink-table-store/pull/70


   This way we can avoid the problem of oversized first files
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Created] (FLINK-26971) UniversalCompaction should pick by size ratio after picking by file num

2022-03-31 Thread Jingsong Lee (Jira)

Jingsong Lee created FLINK-26971:


 Summary: UniversalCompaction should pick by size ratio after 
picking by file num
 Key: FLINK-26971
 URL: https://issues.apache.org/jira/browse/FLINK-26971
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.1.0


This way we can avoid the problem of oversized first files



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink-ml] zhipeng93 commented on pull request #73: [FLINK-26626] Add Transformer and Estimator for StandardScaler

2022-03-31 Thread GitBox



zhipeng93 commented on pull request #73:
URL: https://github.com/apache/flink-ml/pull/73#issuecomment-1085405076


   @lindong28 Thanks for the review. I have adressed your comments in the 
latest commit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #18386:
URL: https://github.com/apache/flink/pull/18386#issuecomment-1015100174


   
   ## CI report:
   
   * b2fea27bfc4b620fb65360be604bc56ce5a27be5 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34065)
 
   * da69798b9da1c9412a9a36bc25a56c0e0a3940c1 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34071)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #18386:
URL: https://github.com/apache/flink/pull/18386#issuecomment-1015100174


   
   ## CI report:
   
   * b2fea27bfc4b620fb65360be604bc56ce5a27be5 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34065)
 
   * da69798b9da1c9412a9a36bc25a56c0e0a3940c1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] zhipeng93 commented on a change in pull request #73: [FLINK-26626] Add Transformer and Estimator for StandardScaler

2022-03-31 Thread GitBox



zhipeng93 commented on a change in pull request #73:
URL: https://github.com/apache/flink-ml/pull/73#discussion_r840215290



##
File path: flink-ml-core/src/main/java/org/apache/flink/ml/linalg/BLAS.java
##
@@ -32,9 +32,31 @@ public static double asum(DenseVector x) {
 }
 
 /** y += a * x . */
-public static void axpy(double a, DenseVector x, DenseVector y) {
+public static void axpy(double a, Vector x, DenseVector y) {
 Preconditions.checkArgument(x.size() == y.size(), "Vector size 
mismatched.");
-JAVA_BLAS.daxpy(x.size(), a, x.values, 1, y.values, 1);
+if (x instanceof SparseVector) {
+axpy(a, (SparseVector) x, y);
+} else {
+axpy(a, (DenseVector) x, y);
+}
+}
+
+/** Computes the hadamard product of the two vectors (y = y \hdot x). */
+public static void hDot(Vector x, Vector y) {
+Preconditions.checkArgument(x.size() == y.size(), "Vector size 
mismatched.");
+if (y instanceof DenseVector) {
+if (x instanceof SparseVector) {

Review comment:
   Thanks for the comments. I have reordered the order of `a instanceof b`.
   
   BTW, we do not support `axpy` when `y` is a SparseVector because it may lead 
to some memory re-allocation.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19316: [FLINK-25238][table-runtime] Allow customized types for ArrayDataSerializer#copy

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19316:
URL: https://github.com/apache/flink/pull/19316#issuecomment-1085391584


   
   ## CI report:
   
   * cd534d2b00c7f1c116eeca4fc9982290970a1f68 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34070)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19298: [hive] support insert timestamp to decimal

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19298:
URL: https://github.com/apache/flink/pull/19298#issuecomment-1084376509


   
   ## CI report:
   
   * cd0f9e9ea3bfff7dceabebaa423be2228e5ec291 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34034)
 
   * 0b3bd6376db242fc347c933986f2156f483de841 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34069)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #18386:
URL: https://github.com/apache/flink/pull/18386#issuecomment-1015100174


   
   ## CI report:
   
   * b2fea27bfc4b620fb65360be604bc56ce5a27be5 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34065)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] zhipeng93 commented on a change in pull request #73: [FLINK-26626] Add Transformer and Estimator for StandardScaler

2022-03-31 Thread GitBox



zhipeng93 commented on a change in pull request #73:
URL: https://github.com/apache/flink-ml/pull/73#discussion_r840212185



##
File path: 
flink-ml-lib/src/main/java/org/apache/flink/ml/feature/standardscaler/StandardScaler.java
##
@@ -0,0 +1,288 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.feature.standardscaler;
+
+import org.apache.flink.api.common.state.ListState;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.api.java.typeutils.TupleTypeInfo;
+import org.apache.flink.iteration.operator.OperatorStateUtils;
+import org.apache.flink.ml.api.Estimator;
+import org.apache.flink.ml.linalg.BLAS;
+import org.apache.flink.ml.linalg.DenseVector;
+import org.apache.flink.ml.linalg.Vector;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.runtime.state.StateInitializationContext;
+import org.apache.flink.runtime.state.StateSnapshotContext;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
+import org.apache.flink.streaming.api.operators.BoundedOneInput;
+import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.table.api.internal.TableImpl;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Preconditions;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * An Estimator which implements the standard scaling algorithm.
+ *
+ * Standardization is a common requirement for machine learning training 
because they may behave
+ * badly if the individual features of a input do not look like standard 
normally distributed data
+ * (e.g. Gaussian with 0 mean and unit variance).
+ *
+ * This estimator standardizes the input features by removing the mean and 
scaling each dimension
+ * to unit variance.
+ */
+public class StandardScaler
+implements Estimator,
+StandardScalerParams {
+private final Map, Object> paramMap = new HashMap<>();
+
+public StandardScaler() {
+ParamUtils.initializeMapWithDefaultValues(paramMap, this);
+}
+
+@Override
+public StandardScalerModel fit(Table... inputs) {
+Preconditions.checkArgument(inputs.length == 1);
+StreamTableEnvironment tEnv =
+(StreamTableEnvironment) ((TableImpl) 
inputs[0]).getTableEnvironment();
+DataStream> 
sumAndSquaredSumAndWeight =
+tEnv.toDataStream(inputs[0])
+.transform(
+"computeMeta",
+new TupleTypeInfo<>(
+TypeInformation.of(DenseVector.class),
+TypeInformation.of(DenseVector.class),
+BasicTypeInfo.LONG_TYPE_INFO),
+new ComputeMetaOperator(getFeaturesCol()));
+
+DataStream modelData =
+sumAndSquaredSumAndWeight
+.transform(
+"buildModel",
+
TypeInformation.of(StandardScalerModelData.class),
+new BuildModelOperator())
+.setParallelism(1);
+
+StandardScalerModel model =
+new 
StandardScalerModel().setModelData(tEnv.fromDataStream(modelData));
+ReadWriteUtils.updateExistingParams(model, paramMap);
+return model;
+}
+
+/**
+ * Builds the {@link

[GitHub] [flink-ml] zhipeng93 commented on a change in pull request #73: [FLINK-26626] Add Transformer and Estimator for StandardScaler

2022-03-31 Thread GitBox



zhipeng93 commented on a change in pull request #73:
URL: https://github.com/apache/flink-ml/pull/73#discussion_r840212185



##
File path: 
flink-ml-lib/src/main/java/org/apache/flink/ml/feature/standardscaler/StandardScaler.java
##
@@ -0,0 +1,288 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.feature.standardscaler;
+
+import org.apache.flink.api.common.state.ListState;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.api.java.typeutils.TupleTypeInfo;
+import org.apache.flink.iteration.operator.OperatorStateUtils;
+import org.apache.flink.ml.api.Estimator;
+import org.apache.flink.ml.linalg.BLAS;
+import org.apache.flink.ml.linalg.DenseVector;
+import org.apache.flink.ml.linalg.Vector;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.runtime.state.StateInitializationContext;
+import org.apache.flink.runtime.state.StateSnapshotContext;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
+import org.apache.flink.streaming.api.operators.BoundedOneInput;
+import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.table.api.internal.TableImpl;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Preconditions;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * An Estimator which implements the standard scaling algorithm.
+ *
+ * Standardization is a common requirement for machine learning training 
because they may behave
+ * badly if the individual features of a input do not look like standard 
normally distributed data
+ * (e.g. Gaussian with 0 mean and unit variance).
+ *
+ * This estimator standardizes the input features by removing the mean and 
scaling each dimension
+ * to unit variance.
+ */
+public class StandardScaler
+implements Estimator,
+StandardScalerParams {
+private final Map, Object> paramMap = new HashMap<>();
+
+public StandardScaler() {
+ParamUtils.initializeMapWithDefaultValues(paramMap, this);
+}
+
+@Override
+public StandardScalerModel fit(Table... inputs) {
+Preconditions.checkArgument(inputs.length == 1);
+StreamTableEnvironment tEnv =
+(StreamTableEnvironment) ((TableImpl) 
inputs[0]).getTableEnvironment();
+DataStream> 
sumAndSquaredSumAndWeight =
+tEnv.toDataStream(inputs[0])
+.transform(
+"computeMeta",
+new TupleTypeInfo<>(
+TypeInformation.of(DenseVector.class),
+TypeInformation.of(DenseVector.class),
+BasicTypeInfo.LONG_TYPE_INFO),
+new ComputeMetaOperator(getFeaturesCol()));
+
+DataStream modelData =
+sumAndSquaredSumAndWeight
+.transform(
+"buildModel",
+
TypeInformation.of(StandardScalerModelData.class),
+new BuildModelOperator())
+.setParallelism(1);
+
+StandardScalerModel model =
+new 
StandardScalerModel().setModelData(tEnv.fromDataStream(modelData));
+ReadWriteUtils.updateExistingParams(model, paramMap);
+return model;
+}
+
+/**
+ * Builds the {@link

[GitHub] [flink] flinkbot commented on pull request #19316: [FLINK-25238][table-runtime] Allow customized types for ArrayDataSerializer#copy

2022-03-31 Thread GitBox



flinkbot commented on pull request #19316:
URL: https://github.com/apache/flink/pull/19316#issuecomment-1085391584


   
   ## CI report:
   
   * cd534d2b00c7f1c116eeca4fc9982290970a1f68 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19315: [hive] support distinct from

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19315:
URL: https://github.com/apache/flink/pull/19315#issuecomment-1085386734


   
   ## CI report:
   
   * 5c2e924a5bf679bf9b54cd00b1ec1d1e5116048e UNKNOWN
   * a1f4dbd7a4131b977e2efe1a6d3fce52c2b3bad7 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34067)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19298: [hive] support insert timestamp to decimal

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19298:
URL: https://github.com/apache/flink/pull/19298#issuecomment-1084376509


   
   ## CI report:
   
   * cd0f9e9ea3bfff7dceabebaa423be2228e5ec291 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34034)
 
   * 0b3bd6376db242fc347c933986f2156f483de841 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19314: [FLINK-26970][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34061)
 
   * 2c52a9e27f44e67fe85cf0435b6220753e7938a4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34068)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #18386:
URL: https://github.com/apache/flink/pull/18386#issuecomment-1015100174


   
   ## CI report:
   
   * b2fea27bfc4b620fb65360be604bc56ce5a27be5 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34065)
 
   * da69798b9da1c9412a9a36bc25a56c0e0a3940c1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] zhipeng93 commented on a change in pull request #73: [FLINK-26626] Add Transformer and Estimator for StandardScaler

2022-03-31 Thread GitBox



zhipeng93 commented on a change in pull request #73:
URL: https://github.com/apache/flink-ml/pull/73#discussion_r840212185



##
File path: 
flink-ml-lib/src/main/java/org/apache/flink/ml/feature/standardscaler/StandardScaler.java
##
@@ -0,0 +1,288 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.feature.standardscaler;
+
+import org.apache.flink.api.common.state.ListState;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.api.java.typeutils.TupleTypeInfo;
+import org.apache.flink.iteration.operator.OperatorStateUtils;
+import org.apache.flink.ml.api.Estimator;
+import org.apache.flink.ml.linalg.BLAS;
+import org.apache.flink.ml.linalg.DenseVector;
+import org.apache.flink.ml.linalg.Vector;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.runtime.state.StateInitializationContext;
+import org.apache.flink.runtime.state.StateSnapshotContext;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
+import org.apache.flink.streaming.api.operators.BoundedOneInput;
+import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.table.api.internal.TableImpl;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Preconditions;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * An Estimator which implements the standard scaling algorithm.
+ *
+ * Standardization is a common requirement for machine learning training 
because they may behave
+ * badly if the individual features of a input do not look like standard 
normally distributed data
+ * (e.g. Gaussian with 0 mean and unit variance).
+ *
+ * This estimator standardizes the input features by removing the mean and 
scaling each dimension
+ * to unit variance.
+ */
+public class StandardScaler
+implements Estimator,
+StandardScalerParams {
+private final Map, Object> paramMap = new HashMap<>();
+
+public StandardScaler() {
+ParamUtils.initializeMapWithDefaultValues(paramMap, this);
+}
+
+@Override
+public StandardScalerModel fit(Table... inputs) {
+Preconditions.checkArgument(inputs.length == 1);
+StreamTableEnvironment tEnv =
+(StreamTableEnvironment) ((TableImpl) 
inputs[0]).getTableEnvironment();
+DataStream> 
sumAndSquaredSumAndWeight =
+tEnv.toDataStream(inputs[0])
+.transform(
+"computeMeta",
+new TupleTypeInfo<>(
+TypeInformation.of(DenseVector.class),
+TypeInformation.of(DenseVector.class),
+BasicTypeInfo.LONG_TYPE_INFO),
+new ComputeMetaOperator(getFeaturesCol()));
+
+DataStream modelData =
+sumAndSquaredSumAndWeight
+.transform(
+"buildModel",
+
TypeInformation.of(StandardScalerModelData.class),
+new BuildModelOperator())
+.setParallelism(1);
+
+StandardScalerModel model =
+new 
StandardScalerModel().setModelData(tEnv.fromDataStream(modelData));
+ReadWriteUtils.updateExistingParams(model, paramMap);
+return model;
+}
+
+/**
+ * Builds the {@link

[GitHub] [flink-ml] zhipeng93 commented on a change in pull request #73: [FLINK-26626] Add Transformer and Estimator for StandardScaler

2022-03-31 Thread GitBox



zhipeng93 commented on a change in pull request #73:
URL: https://github.com/apache/flink-ml/pull/73#discussion_r840212010



##
File path: 
flink-ml-lib/src/main/java/org/apache/flink/ml/feature/standardscaler/StandardScaler.java
##
@@ -0,0 +1,288 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.feature.standardscaler;
+
+import org.apache.flink.api.common.state.ListState;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.api.java.typeutils.TupleTypeInfo;
+import org.apache.flink.iteration.operator.OperatorStateUtils;
+import org.apache.flink.ml.api.Estimator;
+import org.apache.flink.ml.linalg.BLAS;
+import org.apache.flink.ml.linalg.DenseVector;
+import org.apache.flink.ml.linalg.Vector;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.runtime.state.StateInitializationContext;
+import org.apache.flink.runtime.state.StateSnapshotContext;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
+import org.apache.flink.streaming.api.operators.BoundedOneInput;
+import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.table.api.internal.TableImpl;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Preconditions;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * An Estimator which implements the standard scaling algorithm.
+ *
+ * Standardization is a common requirement for machine learning training 
because they may behave
+ * badly if the individual features of a input do not look like standard 
normally distributed data
+ * (e.g. Gaussian with 0 mean and unit variance).
+ *
+ * This estimator standardizes the input features by removing the mean and 
scaling each dimension
+ * to unit variance.
+ */
+public class StandardScaler
+implements Estimator,
+StandardScalerParams {
+private final Map, Object> paramMap = new HashMap<>();
+
+public StandardScaler() {
+ParamUtils.initializeMapWithDefaultValues(paramMap, this);
+}
+
+@Override
+public StandardScalerModel fit(Table... inputs) {
+Preconditions.checkArgument(inputs.length == 1);
+StreamTableEnvironment tEnv =
+(StreamTableEnvironment) ((TableImpl) 
inputs[0]).getTableEnvironment();
+DataStream> 
sumAndSquaredSumAndWeight =
+tEnv.toDataStream(inputs[0])
+.transform(
+"computeMeta",
+new TupleTypeInfo<>(
+TypeInformation.of(DenseVector.class),
+TypeInformation.of(DenseVector.class),
+BasicTypeInfo.LONG_TYPE_INFO),
+new ComputeMetaOperator(getFeaturesCol()));
+
+DataStream modelData =
+sumAndSquaredSumAndWeight
+.transform(
+"buildModel",
+
TypeInformation.of(StandardScalerModelData.class),
+new BuildModelOperator())
+.setParallelism(1);
+
+StandardScalerModel model =
+new 
StandardScalerModel().setModelData(tEnv.fromDataStream(modelData));
+ReadWriteUtils.updateExistingParams(model, paramMap);
+return model;
+}
+
+/**
+ * Builds the {@link

[GitHub] [flink] flinkbot edited a comment on pull request #19315: [hive] support distinct from

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19315:
URL: https://github.com/apache/flink/pull/19315#issuecomment-1085386734


   
   ## CI report:
   
   * 5c2e924a5bf679bf9b54cd00b1ec1d1e5116048e UNKNOWN
   * a1f4dbd7a4131b977e2efe1a6d3fce52c2b3bad7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Updated] (FLINK-25238) flink iceberg source reading array types fail with Cast Exception

2022-03-31 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-25238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-25238:
---
Labels: pull-request-available  (was: )

> flink iceberg source reading array types fail with Cast Exception
> -
>
> Key: FLINK-25238
> URL: https://issues.apache.org/jira/browse/FLINK-25238
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.13.2
>Reporter: Praneeth Ramesh
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-12-09 at 6.58.56 PM.png, Screen Shot 
> 2021-12-09 at 7.04.10 PM.png
>
>
> I have a stream with iceberg table as a source. I have few columns of array 
> types in the table. 
> I try to read using iceberg connector. 
> Flink Version : 1.13.2
> Iceberg Flink Version: 0.12.1
>  
> I see the error as below.
> java.lang.ClassCastException: class 
> org.apache.iceberg.flink.data.FlinkParquetReaders$ReusableArrayData cannot be 
> cast to class org.apache.flink.table.data.ColumnarArrayData 
> (org.apache.iceberg.flink.data.FlinkParquetReaders$ReusableArrayData and 
> org.apache.flink.table.data.ColumnarArrayData are in unnamed module of loader 
> 'app')
>     at 
> org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:90)
>     at 
> org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:47)
>     at 
> org.apache.flink.table.runtime.typeutils.RowDataSerializer.copyRowData(RowDataSerializer.java:170)
>     at 
> org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:131)
>     at 
> org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:48)
>     at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:69)
>     at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46)
>     at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26)
>     at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:50)
>     at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:28)
>     at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$ManualWatermarkContext.processAndCollect(StreamSourceContexts.java:317)
>     at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$WatermarkContext.collect(StreamSourceContexts.java:411)
>     at 
> org.apache.iceberg.flink.source.StreamingReaderOperator.processSplits(StreamingReaderOperator.java:155)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
>     at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
>     at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsNonBlocking(MailboxProcessor.java:359)
>     at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:323)
>     at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:202)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:681)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:636)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:620)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
>     at java.base/java.lang.Thread.run(Thread.java:834)
>  
> Could be same issue as https://issues.apache.org/jira/browse/FLINK-21247 
> except it happening for another type.
> I see that Iceberg use custom types other than the types from 
> org.apache.flink.table.data like
> org.apache.iceberg.flink.data.FlinkParquetReaders.ReusableArrayData and these 
> types are not handled in 
> org.apache.flink.table.runtime.typeutils.ArrayDataSerializer
> !Screen Shot 2021-12-09 at 6.58.56 PM.png!
>  Just to try I changed the above code to handle the iceberg type as a binary 
> Array and built it locally and used in my application and that worked. 
>  
> !Screen Shot 2021-12-09 at 7.04.10 PM.png!
> Not sure if this is already handled in some newer versions. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #19314: [FLINK-26970][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34061)
 
   * 2c52a9e27f44e67fe85cf0435b6220753e7938a4 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19219: [FLINK-26835][serialization] Fix concurrent modification exception

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19219:
URL: https://github.com/apache/flink/pull/19219#issuecomment-1077334179


   
   ## CI report:
   
   * e862f4777de3fa54ea129ab1b5ac0015b692750c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34017)
 
   * 98975fdda5ca88211e5cf2f91f9f84d8cabe5349 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34066)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #18386:
URL: https://github.com/apache/flink/pull/18386#issuecomment-1015100174


   
   ## CI report:
   
   * b2fea27bfc4b620fb65360be604bc56ce5a27be5 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34065)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] yittg opened a new pull request #19316: [FLINK-25238][table-runtime] Allow customized types for ArrayDataSerializer#copy

2022-03-31 Thread GitBox



yittg opened a new pull request #19316:
URL: https://github.com/apache/flink/pull/19316


   
   
   ## What is the purpose of the change
   
   Allow customized types for ArrayDataSerializer#copy, instead of throwing a 
cast exception.
   
   
   ## Brief change log
   
   
   
   ## Verifying this change
   
   
   This change added tests and can be verified as follows:
   
 - Added a test case for customized ArrayData types.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19314: [FLINK-26970][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34061)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot commented on pull request #19315: [hive] support distinct from

2022-03-31 Thread GitBox



flinkbot commented on pull request #19315:
URL: https://github.com/apache/flink/pull/19315#issuecomment-1085386734


   
   ## CI report:
   
   * 5c2e924a5bf679bf9b54cd00b1ec1d1e5116048e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19219: [FLINK-26835][serialization] Fix concurrent modification exception

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19219:
URL: https://github.com/apache/flink/pull/19219#issuecomment-1077334179


   
   ## CI report:
   
   * e862f4777de3fa54ea129ab1b5ac0015b692750c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34017)
 
   * 98975fdda5ca88211e5cf2f91f9f84d8cabe5349 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #18386:
URL: https://github.com/apache/flink/pull/18386#issuecomment-1015100174


   
   ## CI report:
   
   * b2fea27bfc4b620fb65360be604bc56ce5a27be5 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34065)
 
   * da69798b9da1c9412a9a36bc25a56c0e0a3940c1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] ZhangChaoming commented on a change in pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



ZhangChaoming commented on a change in pull request #18386:
URL: https://github.com/apache/flink/pull/18386#discussion_r840207227



##
File path: 
flink-table/flink-sql-parser/src/main/java/org/apache/flink/sql/parser/dql/SqlShowDatabases.java
##
@@ -49,8 +70,44 @@ public SqlOperator getOperator() {
 return Collections.EMPTY_LIST;
 }
 
+public String getCatalogName() {
+return Objects.isNull(this.catalogName) ? null : 
catalogName.getSimple();
+}
+
+public boolean isNotLike() {
+return notLike;
+}
+
+public String getPreposition() {
+return preposition;
+}
+
+public String getLikeSqlPattern() {
+return Objects.isNull(this.likeLiteral) ? null : 
likeLiteral.getValueAs(String.class);
+}
+
+public SqlCharStringLiteral getLikeLiteral() {
+return likeLiteral;
+}
+
+public boolean isWithLike() {
+return Objects.nonNull(likeLiteral);
+}
+
 @Override
 public void unparse(SqlWriter writer, int leftPrec, int rightPrec) {
-writer.keyword("SHOW DATABASES");
+if (this.preposition == null) {
+writer.keyword("SHOW DATABASES");
+} else if (catalogName != null) {
+writer.keyword("SHOW DATABASES " + this.preposition);
+catalogName.unparse(writer, leftPrec, rightPrec);
+}
+if (likeLiteral != null) {
+if (notLike) {
+writer.keyword(String.format("NOT LIKE '%s'", 
getLikeSqlPattern()));

Review comment:
   @RocMarshal , I am looking forard for your response.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] luoyuxia opened a new pull request #19315: [hive] support distinct from

2022-03-31 Thread GitBox



luoyuxia opened a new pull request #19315:
URL: https://github.com/apache/flink/pull/19315


   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] ZhangChaoming commented on pull request #19219: [FLINK-26835][serialization] Fix concurrent modification exception

2022-03-31 Thread GitBox



ZhangChaoming commented on pull request #19219:
URL: https://github.com/apache/flink/pull/19219#issuecomment-1085384720


   @dawidwys @pnowojski Since I have no idear about the performance impact of 
`ConcurrentHashMap`, I aggree to @dawidwys' suggestion that alway duplicate a 
new serializer:


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19314: [FLINK-26970][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34061)
 
   * 2c52a9e27f44e67fe85cf0435b6220753e7938a4 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #18386:
URL: https://github.com/apache/flink/pull/18386#issuecomment-1015100174


   
   ## CI report:
   
   * b2fea27bfc4b620fb65360be604bc56ce5a27be5 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34065)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-26728) Support min max min_by max_by operation in KeyedStream

2022-03-31 Thread CaoYu (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17515687#comment-17515687
 ] 

CaoYu commented on FLINK-26728:
---

Hi [~dianfu] 

I have already summit a PR in GitHub.

 

Would you have time take a look?

[https://github.com/apache/flink/pull/19242]

 

Thanks.

 

> Support min max min_by max_by operation in KeyedStream
> --
>
> Key: FLINK-26728
> URL: https://issues.apache.org/jira/browse/FLINK-26728
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Reporter: CaoYu
>Assignee: CaoYu
>Priority: Major
>  Labels: pull-request-available
>
> Support min max min_by max_by operation in KeyedStream
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #19314: [FLINK-26970][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34061)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #18386:
URL: https://github.com/apache/flink/pull/18386#issuecomment-1015100174


   
   ## CI report:
   
   * 746c20dcbaa8a541975f0b3d6bddb041008db2e2 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34004)
 
   * b2fea27bfc4b620fb65360be604bc56ce5a27be5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34065)
 
   * da69798b9da1c9412a9a36bc25a56c0e0a3940c1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] zhipeng93 commented on a change in pull request #73: [FLINK-26626] Add Transformer and Estimator for StandardScaler

2022-03-31 Thread GitBox



zhipeng93 commented on a change in pull request #73:
URL: https://github.com/apache/flink-ml/pull/73#discussion_r840204804



##
File path: flink-ml-core/src/main/java/org/apache/flink/ml/linalg/BLAS.java
##
@@ -89,4 +111,56 @@ public static void gemv(
 y.values,
 1);
 }
+
+private static void axpy(double a, DenseVector x, DenseVector y) {
+JAVA_BLAS.daxpy(x.size(), a, x.values, 1, y.values, 1);
+}
+
+private static void axpy(double a, SparseVector x, DenseVector y) {
+for (int i = 0; i < x.indices.length; i++) {
+int index = x.indices[i];
+y.values[index] += a * x.values[i];
+}
+}
+
+private static void hDot(SparseVector x, SparseVector y) {
+int idx = 0;
+int idy = 0;
+while (idx < x.indices.length && idy < y.indices.length) {
+int indexX = x.indices[idx];
+while (idy < y.indices.length && y.indices[idy] < indexX) {
+y.values[idy] = 0;
+idy++;
+}
+if (idy < y.indices.length && y.indices[idy] == indexX) {
+y.values[idy] *= x.values[idx];
+idy++;
+}
+idx++;
+}
+}
+
+private static void hDot(SparseVector x, DenseVector y) {
+int idx = 0;
+for (int i = 0; i < y.size(); i++) {
+if (x.indices[idx] == i) {

Review comment:
   Thanks for pointing this out. It is fixed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19314: [FLINK-26970][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34061)
 
   * 2c52a9e27f44e67fe85cf0435b6220753e7938a4 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19314: [FLINK-26970][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34061)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19314: [FLINK-26970][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34061)
 
   * 2c52a9e27f44e67fe85cf0435b6220753e7938a4 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19314: [FLINK-26970][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34061)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] Myasuka commented on a change in pull request #19177: [FLINK-23399][state] Add a benchmark for rescaling

2022-03-31 Thread GitBox



Myasuka commented on a change in pull request #19177:
URL: https://github.com/apache/flink/pull/19177#discussion_r840197342



##
File path: 
flink-state-backends/flink-statebackend-rocksdb/src/test/java/org/apache/flink/contrib/streaming/state/benchmark/RescalingBenchmark.java
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License
+ */
+
+package org.apache.flink.contrib.streaming.state.benchmark;
+
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.functions.KeySelector;
+import org.apache.flink.runtime.checkpoint.OperatorSubtaskState;
+import org.apache.flink.runtime.operators.testutils.MockEnvironment;
+import org.apache.flink.runtime.operators.testutils.MockEnvironmentBuilder;
+import org.apache.flink.runtime.state.CheckpointStorageAccess;
+import org.apache.flink.runtime.state.KeyGroupRangeAssignment;
+import org.apache.flink.runtime.state.StateBackend;
+import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
+import org.apache.flink.streaming.api.operators.KeyedProcessOperator;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.streaming.util.AbstractStreamOperatorTestHarness;
+import org.apache.flink.streaming.util.KeyedOneInputStreamOperatorTestHarness;
+
+import java.io.IOException;
+import java.util.Iterator;
+import java.util.function.Supplier;
+
+/** The benchmark of rescaling from savepoint. */
+public class RescalingBenchmark {
+private final int maxParallelism;
+
+private final int parallelismBefore;
+private final int parallelismAfter;
+
+private final int managedMemorySize;
+
+private final StateBackend stateBackend;
+private final CheckpointStorageAccess checkpointStorageAccess;
+
+private OperatorSubtaskState stateForRescaling;
+private OperatorSubtaskState stateForSubtask;
+private KeyedOneInputStreamOperatorTestHarness subtaskHarness;
+
+private final StreamRecordGenerator streamRecordGenerator;
+private final Supplier> 
stateProcessFunctionSupplier;
+
+public RescalingBenchmark(
+final int parallelismBefore,
+final int parallelismAfter,
+final int maxParallelism,
+final int managedMemorySize,
+final StateBackend stateBackend,
+final CheckpointStorageAccess checkpointStorageAccess,
+final StreamRecordGenerator streamRecordGenerator,
+final Supplier> 
stateProcessFunctionSupplier) {
+this.parallelismBefore = parallelismBefore;
+this.parallelismAfter = parallelismAfter;
+this.maxParallelism = maxParallelism;
+this.managedMemorySize = managedMemorySize;
+this.stateBackend = stateBackend;
+this.checkpointStorageAccess = checkpointStorageAccess;
+this.streamRecordGenerator = streamRecordGenerator;
+this.stateProcessFunctionSupplier = stateProcessFunctionSupplier;
+}
+
+public void setUp() throws Exception {
+stateForRescaling = prepareState();
+}
+
+public void tearDown() throws IOException {
+stateForRescaling.discardState();
+}
+
+/** rescaling on one subtask, this is the benchmark entrance. */
+public void rescale() throws Exception {
+subtaskHarness.initializeState(stateForSubtask);
+}
+
+/** close test harness for subtask. */
+public void closeHarnessForSubtask() throws Exception {

Review comment:
   I think the method could be renamed to `closeOperator`, which looks 
better, as this method would not bind to the implementation.
   Remember to update related comments.

##
File path: 
flink-state-backends/flink-statebackend-rocksdb/src/test/java/org/apache/flink/contrib/streaming/state/benchmark/RescalingBenchmarkBuilder.java
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *

[GitHub] [flink-ml] yunfengzhou-hub commented on a change in pull request #71: [FLINK-26443] Add benchmark framework

2022-03-31 Thread GitBox



yunfengzhou-hub commented on a change in pull request #71:
URL: https://github.com/apache/flink-ml/pull/71#discussion_r840198302



##
File path: 
flink-ml-benchmark/src/main/java/org/apache/flink/ml/benchmark/clustering/kmeans/KMeansInputsGenerator.java
##
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.benchmark.clustering.kmeans;
+
+import org.apache.flink.ml.benchmark.generator.DataGenerator;
+import org.apache.flink.ml.benchmark.generator.GeneratorUtils;
+import org.apache.flink.ml.clustering.kmeans.KMeansParams;
+import org.apache.flink.ml.common.datastream.TableUtils;
+import org.apache.flink.ml.linalg.DenseVector;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.DataTypes;
+import org.apache.flink.table.api.Schema;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Class that generates table arrays containing inputs for {@link
+ * org.apache.flink.ml.clustering.kmeans.KMeans} and {@link
+ * org.apache.flink.ml.clustering.kmeans.KMeansModel}.
+ */
+public class KMeansInputsGenerator

Review comment:
   Besides, I have moved these classes from` 
org.apache.flink.ml.benchmark.generator` to 
`org.apache.flink.ml.benchmark.data`, as I think the former might bring 
ambiguity. Classes like `DenseVectorGenerator` are now in the same package as 
`DataGenerator`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19314: [FLINK-26970][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34061)
 
   * 2c52a9e27f44e67fe85cf0435b6220753e7938a4 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #18386:
URL: https://github.com/apache/flink/pull/18386#issuecomment-1015100174


   
   ## CI report:
   
   * 746c20dcbaa8a541975f0b3d6bddb041008db2e2 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34004)
 
   * b2fea27bfc4b620fb65360be604bc56ce5a27be5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34065)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] yunfengzhou-hub commented on a change in pull request #71: [FLINK-26443] Add benchmark framework

2022-03-31 Thread GitBox



yunfengzhou-hub commented on a change in pull request #71:
URL: https://github.com/apache/flink-ml/pull/71#discussion_r840197583



##
File path: 
flink-ml-benchmark/src/main/java/org/apache/flink/ml/benchmark/generator/GeneratorUtils.java
##
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.benchmark.generator;
+
+import org.apache.flink.api.common.functions.RichMapFunction;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.ml.linalg.DenseVector;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.util.NumberSequenceIterator;
+
+import java.util.Random;
+
+/** Utility methods to generate data for benchmarks. */
+public class GeneratorUtils {
+/**
+ * Generates random continuous vectors.
+ *
+ * @param env The stream execution environment.
+ * @param numData Number of examples to generate in total.
+ * @param seed The seed to generate seed on each partition.
+ * @param dims Dimension of the vectors to be generated.
+ * @return The generated vector stream.
+ */
+public static DataStream generateRandomContinuousVectorStream(

Review comment:
   We might have `discreteVectorStream` in future that only contains 0 and 
1. But I agree that we can remove the word `continuous` from the method name, 
and maybe in future we can add an option on this method or class for users to 
specify how they would like the values to distribute in the generated vectors.
   
   Given that we are introducing classes like `DenseVectorArrayGenerator`, 
`GeneratorUtils` would be removed so the rest of this comment is not applicable 
now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] ZhangChaoming commented on pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



ZhangChaoming commented on pull request #18386:
URL: https://github.com/apache/flink/pull/18386#issuecomment-1085368113


@slinkydeveloper IMO, this enhanced syntax can NOT be applied easily to all 
the ShowOperations. 
   - The identifier in preposition is different
   - This syntax is not suitable for some operations like `ShowCurrentXXX` 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #18386:
URL: https://github.com/apache/flink/pull/18386#issuecomment-1015100174


   
   ## CI report:
   
   * 746c20dcbaa8a541975f0b3d6bddb041008db2e2 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34004)
 
   * b2fea27bfc4b620fb65360be604bc56ce5a27be5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Comment Edited] (FLINK-25238) flink iceberg source reading array types fail with Cast Exception

2022-03-31 Thread Yi Tang (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17515682#comment-17515682
 ] 

Yi Tang edited comment on FLINK-25238 at 4/1/22 3:06 AM:
-

[~sjwiesman] Shall we reopen this issue. The similar issue for MapData is fixed 
in FLINK-21247, and the detail is like that been explained in 
https://issues.apache.org/jira/browse/FLINK-21247?focusedCommentId=17278483=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17278483

 

cc [~openinx] , [~lzljs3620320] 


was (Author: yittg):
[~sjwiesman] Shall we reopen this issue. The similar issue for MapData is fixed 
in FLINK-21247, and the detail is like that been explained in 
[FLINK-21247#comment-17278483|#comment-17278483].

 

cc [~openinx] , [~lzljs3620320] 

> flink iceberg source reading array types fail with Cast Exception
> -
>
> Key: FLINK-25238
> URL: https://issues.apache.org/jira/browse/FLINK-25238
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.13.2
>Reporter: Praneeth Ramesh
>Priority: Major
> Attachments: Screen Shot 2021-12-09 at 6.58.56 PM.png, Screen Shot 
> 2021-12-09 at 7.04.10 PM.png
>
>
> I have a stream with iceberg table as a source. I have few columns of array 
> types in the table. 
> I try to read using iceberg connector. 
> Flink Version : 1.13.2
> Iceberg Flink Version: 0.12.1
>  
> I see the error as below.
> java.lang.ClassCastException: class 
> org.apache.iceberg.flink.data.FlinkParquetReaders$ReusableArrayData cannot be 
> cast to class org.apache.flink.table.data.ColumnarArrayData 
> (org.apache.iceberg.flink.data.FlinkParquetReaders$ReusableArrayData and 
> org.apache.flink.table.data.ColumnarArrayData are in unnamed module of loader 
> 'app')
>     at 
> org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:90)
>     at 
> org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:47)
>     at 
> org.apache.flink.table.runtime.typeutils.RowDataSerializer.copyRowData(RowDataSerializer.java:170)
>     at 
> org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:131)
>     at 
> org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:48)
>     at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:69)
>     at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46)
>     at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26)
>     at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:50)
>     at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:28)
>     at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$ManualWatermarkContext.processAndCollect(StreamSourceContexts.java:317)
>     at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$WatermarkContext.collect(StreamSourceContexts.java:411)
>     at 
> org.apache.iceberg.flink.source.StreamingReaderOperator.processSplits(StreamingReaderOperator.java:155)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
>     at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
>     at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsNonBlocking(MailboxProcessor.java:359)
>     at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:323)
>     at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:202)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:681)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:636)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:620)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
>     at java.base/java.lang.Thread.run(Thread.java:834)
>  
> Could be same issue as https://issues.apache.org/jira/browse/FLINK-21247 
> except it happening for another type.
> I see that Iceberg use custom types other than the types from 
> org.apache.flink.table.data like
> org.apache.iceberg.flink.data.FlinkParquetReaders.ReusableArrayData and these 
> types are not handled in 
>

[jira] [Commented] (FLINK-25238) flink iceberg source reading array types fail with Cast Exception

2022-03-31 Thread Yi Tang (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17515682#comment-17515682
 ] 

Yi Tang commented on FLINK-25238:
-

[~sjwiesman] Shall we reopen this issue. The similar issue for MapData is fixed 
in FLINK-21247, and the detail is like that been explained in 
[FLINK-21247#comment-17278483|#comment-17278483].

 

cc [~openinx] , [~lzljs3620320] 

> flink iceberg source reading array types fail with Cast Exception
> -
>
> Key: FLINK-25238
> URL: https://issues.apache.org/jira/browse/FLINK-25238
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.13.2
>Reporter: Praneeth Ramesh
>Priority: Major
> Attachments: Screen Shot 2021-12-09 at 6.58.56 PM.png, Screen Shot 
> 2021-12-09 at 7.04.10 PM.png
>
>
> I have a stream with iceberg table as a source. I have few columns of array 
> types in the table. 
> I try to read using iceberg connector. 
> Flink Version : 1.13.2
> Iceberg Flink Version: 0.12.1
>  
> I see the error as below.
> java.lang.ClassCastException: class 
> org.apache.iceberg.flink.data.FlinkParquetReaders$ReusableArrayData cannot be 
> cast to class org.apache.flink.table.data.ColumnarArrayData 
> (org.apache.iceberg.flink.data.FlinkParquetReaders$ReusableArrayData and 
> org.apache.flink.table.data.ColumnarArrayData are in unnamed module of loader 
> 'app')
>     at 
> org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:90)
>     at 
> org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:47)
>     at 
> org.apache.flink.table.runtime.typeutils.RowDataSerializer.copyRowData(RowDataSerializer.java:170)
>     at 
> org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:131)
>     at 
> org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:48)
>     at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:69)
>     at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46)
>     at 
> org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26)
>     at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:50)
>     at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:28)
>     at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$ManualWatermarkContext.processAndCollect(StreamSourceContexts.java:317)
>     at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$WatermarkContext.collect(StreamSourceContexts.java:411)
>     at 
> org.apache.iceberg.flink.source.StreamingReaderOperator.processSplits(StreamingReaderOperator.java:155)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
>     at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
>     at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsNonBlocking(MailboxProcessor.java:359)
>     at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:323)
>     at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:202)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:681)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:636)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:620)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
>     at java.base/java.lang.Thread.run(Thread.java:834)
>  
> Could be same issue as https://issues.apache.org/jira/browse/FLINK-21247 
> except it happening for another type.
> I see that Iceberg use custom types other than the types from 
> org.apache.flink.table.data like
> org.apache.iceberg.flink.data.FlinkParquetReaders.ReusableArrayData and these 
> types are not handled in 
> org.apache.flink.table.runtime.typeutils.ArrayDataSerializer
> !Screen Shot 2021-12-09 at 6.58.56 PM.png!
>  Just to try I changed the above code to handle the iceberg type as a binary 
> Array and built it locally and used in my application and that worked. 
>  
> !Screen Shot 2021-12-09 at 7.04.10 PM.png!
> Not sure if this is already handled in some newer versions. 



--
This message was sent by Atlassian Jira

[GitHub] [flink] ZhangChaoming commented on a change in pull request #18386: [FLINK-25684][table] Support enhanced `show databases` syntax

2022-03-31 Thread GitBox



ZhangChaoming commented on a change in pull request #18386:
URL: https://github.com/apache/flink/pull/18386#discussion_r840195326



##
File path: 
flink-table/flink-table-planner/src/test/scala/org/apache/flink/table/api/TableEnvironmentTest.scala
##
@@ -656,14 +656,49 @@ class TableEnvironmentTest {
   def testExecuteSqlWithShowDatabases(): Unit = {
 val tableResult1 = tableEnv.executeSql("CREATE DATABASE db1 COMMENT 
'db1_comment'")
 assertEquals(ResultKind.SUCCESS, tableResult1.getResultKind)
-val tableResult2 = tableEnv.executeSql("SHOW DATABASES")
-assertEquals(ResultKind.SUCCESS_WITH_CONTENT, tableResult2.getResultKind)
+val tableResult2 = tableEnv.executeSql("CREATE DATABASE db2 COMMENT 
'db2_comment'")
+assertEquals(ResultKind.SUCCESS, tableResult2.getResultKind)
+val tableResult3 = tableEnv.executeSql("CREATE DATABASE pre_db3 COMMENT 
'db3_comment'")
+assertEquals(ResultKind.SUCCESS, tableResult3.getResultKind)
+
+val tableResult4 = tableEnv.executeSql("SHOW DATABASES")
+assertEquals(ResultKind.SUCCESS_WITH_CONTENT, tableResult4.getResultKind)
 assertEquals(
   ResolvedSchema.of(Column.physical("database name", DataTypes.STRING())),
-  tableResult2.getResolvedSchema)
+  tableResult4.getResolvedSchema)
 checkData(
-  util.Arrays.asList(Row.of("default_database"), Row.of("db1")).iterator(),
-  tableResult2.collect())
+  util.Arrays.asList(Row.of("default_database"),
+Row.of("db1"),
+Row.of("db2"),
+Row.of("pre_db3")).iterator(),
+  tableResult4.collect())
+
+val tableResult5 = tableEnv.executeSql("SHOW DATABASES LIKE 'db%'")
+assertEquals(ResultKind.SUCCESS_WITH_CONTENT, tableResult5.getResultKind)
+assertEquals(
+  ResolvedSchema.of(Column.physical("database name", DataTypes.STRING())),
+  tableResult5.getResolvedSchema)
+checkData(
+  util.Arrays.asList(Row.of("db1"), Row.of("db2")).iterator(),
+  tableResult5.collect())
+
+val tableResult6 = tableEnv.executeSql("SHOW DATABASES LIKE '_re%'")
+assertEquals(ResultKind.SUCCESS_WITH_CONTENT, tableResult6.getResultKind)
+assertEquals(
+  ResolvedSchema.of(Column.physical("database name", DataTypes.STRING())),
+  tableResult6.getResolvedSchema)

Review comment:
   OK

##
File path: 
flink-table/flink-table-planner/src/test/scala/org/apache/flink/table/api/TableEnvironmentTest.scala
##
@@ -656,14 +656,49 @@ class TableEnvironmentTest {
   def testExecuteSqlWithShowDatabases(): Unit = {
 val tableResult1 = tableEnv.executeSql("CREATE DATABASE db1 COMMENT 
'db1_comment'")
 assertEquals(ResultKind.SUCCESS, tableResult1.getResultKind)
-val tableResult2 = tableEnv.executeSql("SHOW DATABASES")
-assertEquals(ResultKind.SUCCESS_WITH_CONTENT, tableResult2.getResultKind)
+val tableResult2 = tableEnv.executeSql("CREATE DATABASE db2 COMMENT 
'db2_comment'")
+assertEquals(ResultKind.SUCCESS, tableResult2.getResultKind)
+val tableResult3 = tableEnv.executeSql("CREATE DATABASE pre_db3 COMMENT 
'db3_comment'")
+assertEquals(ResultKind.SUCCESS, tableResult3.getResultKind)
+
+val tableResult4 = tableEnv.executeSql("SHOW DATABASES")
+assertEquals(ResultKind.SUCCESS_WITH_CONTENT, tableResult4.getResultKind)
 assertEquals(
   ResolvedSchema.of(Column.physical("database name", DataTypes.STRING())),
-  tableResult2.getResolvedSchema)
+  tableResult4.getResolvedSchema)
 checkData(
-  util.Arrays.asList(Row.of("default_database"), Row.of("db1")).iterator(),
-  tableResult2.collect())
+  util.Arrays.asList(Row.of("default_database"),
+Row.of("db1"),
+Row.of("db2"),
+Row.of("pre_db3")).iterator(),
+  tableResult4.collect())

Review comment:
   OK

##
File path: 
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/operations/SqlToOperationConverterTest.java
##
@@ -1668,6 +1669,41 @@ public void testShowJars() {
 assertThat(showModulesOperation.asSummaryString()).isEqualTo("SHOW 
JARS");
 }
 
+@Test
+public void testShowDatabase() {
+Operation operation = parse("SHOW DATABASES like 't%'", 
SqlDialect.DEFAULT);
+assert operation instanceof ShowDatabasesOperation;
+
+ShowDatabasesOperation showTablesOperation = (ShowDatabasesOperation) 
operation;
+assertThat(showTablesOperation.getCatalogName()).isEqualTo(null);

Review comment:
   OK




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Updated] (FLINK-26460) Fix Unsupported type when convertTypeToSpec: MAP

2022-03-31 Thread godfrey he (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

godfrey he updated FLINK-26460:
---
Fix Version/s: 1.16.0

> Fix Unsupported type when convertTypeToSpec: MAP
> 
>
> Key: FLINK-26460
> URL: https://issues.apache.org/jira/browse/FLINK-26460
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.1, 1.15.0
>Reporter: tartarus
>Assignee: tartarus
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> {code:java}
> CREATE TABLE zm_test (
>   `a` BIGINT,
>   `m` MAP
> );
> {code}
> if we insert into zm_test use
> {code:java}
> INSERT INTO zm_test(`a`) SELECT `a` FROM MyTable;
> {code}
> then will throw Exception
> {code:java}
> Unsupported type when convertTypeToSpec: MAP
> {code}
> we must use
> {code:java}
> INSERT INTO zm_test SELECT `a`, cast(null AS MAP) FROM MyTable;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26970) Fix comment style error

2022-03-31 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-26970:
---
Labels: pull-request-available  (was: )

> Fix comment style error
> ---
>
> Key: FLINK-26970
> URL: https://issues.apache.org/jira/browse/FLINK-26970
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Affects Versions: 1.14.3
>Reporter: yingjie.wang
>Priority: Major
>  Labels: pull-request-available
>
> Fix comment style error.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #19314: [FLINK-26970][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34061)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] yunfengzhou-hub commented on a change in pull request #71: [FLINK-26443] Add benchmark framework

2022-03-31 Thread GitBox



yunfengzhou-hub commented on a change in pull request #71:
URL: https://github.com/apache/flink-ml/pull/71#discussion_r840189932



##
File path: 
flink-ml-benchmark/src/main/java/org/apache/flink/ml/benchmark/generator/GeneratorUtils.java
##
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.benchmark.generator;
+
+import org.apache.flink.api.common.functions.RichMapFunction;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.ml.linalg.DenseVector;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.util.NumberSequenceIterator;
+
+import java.util.Random;
+
+/** Utility methods to generate data for benchmarks. */
+public class GeneratorUtils {
+/**
+ * Generates random continuous vectors.
+ *
+ * @param env The stream execution environment.
+ * @param numData Number of examples to generate in total.
+ * @param seed The seed to generate seed on each partition.
+ * @param dims Dimension of the vectors to be generated.
+ * @return The generated vector stream.
+ */
+public static DataStream generateRandomContinuousVectorStream(
+StreamExecutionEnvironment env, long numData, long seed, int dims) 
{
+return env.fromParallelCollection(
+new NumberSequenceIterator(1L, numData), 
BasicTypeInfo.LONG_TYPE_INFO)
+.map(new GenerateRandomContinuousVectorFunction(seed, dims));
+}
+
+private static class GenerateRandomContinuousVectorFunction
+extends RichMapFunction {
+private final int dims;
+private final long initSeed;
+private Random random;
+
+private GenerateRandomContinuousVectorFunction(long initSeed, int 
dims) {
+this.dims = dims;
+this.initSeed = initSeed;
+}
+
+@Override
+public void open(Configuration parameters) throws Exception {
+super.open(parameters);
+int index = getRuntimeContext().getIndexOfThisSubtask();
+random = new Random(Tuple2.of(initSeed, index).hashCode());
+}
+
+@Override
+public DenseVector map(Long value) {
+double[] values = new double[dims];
+for (int i = 0; i < dims; i++) {
+values[i] = random.nextDouble();
+}
+return Vectors.dense(values);
+}
+}
+
+/**
+ * Generates random continuous vector arrays.
+ *
+ * @param env The stream execution environment.
+ * @param numData Number of examples to generate in total.
+ * @param arraySize Size of the vector array.
+ * @param seed The seed to generate seed on each partition.
+ * @param dims Dimension of the vectors to be generated.
+ * @return The generated vector stream.
+ */
+public static DataStream 
generateRandomContinuousVectorArrayStream(

Review comment:
   Given that we are introducing classes like `DenseVectorArrayGenerator`, 
`GeneratorUtils` would be removed so this comment is not applicable now.

##
File path: 
flink-ml-benchmark/src/main/java/org/apache/flink/ml/benchmark/generator/GeneratorUtils.java
##
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See

[jira] [Updated] (FLINK-26970) Fix comment style error

2022-03-31 Thread yingjie.wang (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yingjie.wang updated FLINK-26970:
-
External issue URL:   (was: https://github.com/apache/flink/pull/19314)

> Fix comment style error
> ---
>
> Key: FLINK-26970
> URL: https://issues.apache.org/jira/browse/FLINK-26970
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Affects Versions: 1.14.3
>Reporter: yingjie.wang
>Priority: Major
>
> Fix comment style error.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Created] (FLINK-26970) Fix comment style error

2022-03-31 Thread yingjie.wang (Jira)

yingjie.wang created FLINK-26970:


 Summary: Fix comment style error
 Key: FLINK-26970
 URL: https://issues.apache.org/jira/browse/FLINK-26970
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Task
Affects Versions: 1.14.3
Reporter: yingjie.wang


Fix comment style error.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink-ml] zhipeng93 commented on a change in pull request #73: [FLINK-26626] Add Transformer and Estimator for StandardScaler

2022-03-31 Thread GitBox



zhipeng93 commented on a change in pull request #73:
URL: https://github.com/apache/flink-ml/pull/73#discussion_r840189898



##
File path: 
flink-ml-lib/src/main/java/org/apache/flink/ml/feature/standardscaler/StandardScalerModel.java
##
@@ -0,0 +1,185 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.feature.standardscaler;
+
+import org.apache.flink.api.common.functions.RichMapFunction;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.typeutils.RowTypeInfo;
+import org.apache.flink.ml.api.Model;
+import org.apache.flink.ml.common.broadcast.BroadcastUtils;
+import org.apache.flink.ml.common.datastream.TableUtils;
+import org.apache.flink.ml.linalg.BLAS;
+import org.apache.flink.ml.linalg.DenseVector;
+import org.apache.flink.ml.linalg.SparseVector;
+import org.apache.flink.ml.linalg.Vector;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.table.api.internal.TableImpl;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.commons.lang3.ArrayUtils;
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+
+/** A Model which transforms data using the model data computed by {@link 
StandardScaler}. */
+public class StandardScalerModel

Review comment:
   As we discussed offline, we agree to keep the code for now because it is 
an implementation issue and do not affect end users.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot commented on pull request #19314: [FLINK-26962][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



flinkbot commented on pull request #19314:
URL: https://github.com/apache/flink/pull/19314#issuecomment-1085355130


   
   ## CI report:
   
   * c718701d04dde31a32925135833390ec58843530 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-14998) Remove FileUtils#deletePathIfEmpty

2022-03-31 Thread Yun Tang (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17515679#comment-17515679
 ] 

Yun Tang commented on FLINK-14998:
--

[~jay li], you can refer to 
https://flink.apache.org/contributing/contribute-code.html for more details to 
know how to create a PR. For the next question, I think it's clear to see and 
we can discuss in the review of PR.

> Remove FileUtils#deletePathIfEmpty
> --
>
> Key: FLINK-14998
> URL: https://issues.apache.org/jira/browse/FLINK-14998
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystems
>Reporter: Yun Tang
>Assignee: jay li
>Priority: Minor
>  Labels: auto-deprioritized-major, starter
> Fix For: 1.16.0
>
>
> With the lesson learned from FLINK-7266, and the refactor of FLINK-8540, 
> method of  {{FileUtils#deletePathIfEmpty}} has been totally useless in Flink 
> production code. From my point of view, it's not wise to provide a method 
> with already known high-risk defect in Flink official code. I suggest to 
> remove this part of code.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26962) Fix comment style error

2022-03-31 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-26962:
---
Labels: pull-request-available  (was: )

> Fix comment style error
> ---
>
> Key: FLINK-26962
> URL: https://issues.apache.org/jira/browse/FLINK-26962
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Affects Versions: 1.14.3
>Reporter: yingjie.wang
>Priority: Major
>  Labels: pull-request-available
>
> Fix comment style error



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] CodingCaproni opened a new pull request #19314: [FLINK-26962][Runtime/Task] Fix comment style error

2022-03-31 Thread GitBox



CodingCaproni opened a new pull request #19314:
URL: https://github.com/apache/flink/pull/19314


   **What is the purpose of the change**
   Fix comment style error.
   **Verifying this change**
   This change is a trivial rework without any test coverage.
   **Does this pull request potentially affect one of the following parts:**
   - Dependencies (does it add or upgrade a dependency): no
   - The public API, i.e., is any changed class annotated with 
@Public(Evolving): no
   - The serializers: no
   - The runtime per-record code paths (performance sensitive): no
   - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
   - The S3 file system connector: no
   **Documentation**
   - Does this pull request introduce a new feature? no
   - If yes, how is the feature documented? no
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] yunfengzhou-hub commented on a change in pull request #71: [FLINK-26443] Add benchmark framework

2022-03-31 Thread GitBox



yunfengzhou-hub commented on a change in pull request #71:
URL: https://github.com/apache/flink-ml/pull/71#discussion_r840186693



##
File path: 
flink-ml-benchmark/src/main/java/org/apache/flink/ml/benchmark/clustering/kmeans/KMeansInputsGenerator.java
##
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.benchmark.clustering.kmeans;
+
+import org.apache.flink.ml.benchmark.generator.DataGenerator;
+import org.apache.flink.ml.benchmark.generator.GeneratorUtils;
+import org.apache.flink.ml.clustering.kmeans.KMeansParams;
+import org.apache.flink.ml.common.datastream.TableUtils;
+import org.apache.flink.ml.linalg.DenseVector;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.DataTypes;
+import org.apache.flink.table.api.Schema;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Class that generates table arrays containing inputs for {@link
+ * org.apache.flink.ml.clustering.kmeans.KMeans} and {@link
+ * org.apache.flink.ml.clustering.kmeans.KMeansModel}.
+ */
+public class KMeansInputsGenerator

Review comment:
   I agree. In this case I'll remove `GeneratorUtils`, as its current 
functions would be replaced by `DenseVectorGenerator` and 
`DenseVectorArrayGenerator`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Closed] (FLINK-26792) BufferedMutator failed when insert and update with a flush in Hbase Sink

2022-03-31 Thread lothar (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lothar closed FLINK-26792.
--
Resolution: Not A Bug

> BufferedMutator failed when insert and update with a flush in Hbase Sink
> 
>
> Key: FLINK-26792
> URL: https://issues.apache.org/jira/browse/FLINK-26792
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / HBase
>Reporter: lothar
>Priority: Major
> Attachments: image-2022-03-22-14-03-18-558.png
>
>
> When I write data from Kafka to Hbase(CDC data contains Insert、Update), once 
> the first insert data and update data with same rowkey in a flush, the data 
> will be invalid.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink-ml] yunfengzhou-hub commented on a change in pull request #71: [FLINK-26443] Add benchmark framework

2022-03-31 Thread GitBox



yunfengzhou-hub commented on a change in pull request #71:
URL: https://github.com/apache/flink-ml/pull/71#discussion_r840185834



##
File path: 
flink-ml-benchmark/src/main/java/org/apache/flink/ml/benchmark/clustering/kmeans/KMeansModelDataGenerator.java
##
@@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.benchmark.clustering.kmeans;
+
+import org.apache.flink.ml.benchmark.generator.DataGenerator;
+import org.apache.flink.ml.benchmark.generator.GeneratorUtils;
+import org.apache.flink.ml.clustering.kmeans.KMeansModelData;
+import org.apache.flink.ml.clustering.kmeans.KMeansParams;
+import org.apache.flink.ml.common.datastream.TableUtils;
+import org.apache.flink.ml.linalg.DenseVector;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Class that generates table arrays containing model data for {@link
+ * org.apache.flink.ml.clustering.kmeans.KMeansModel}.
+ */
+public class KMeansModelDataGenerator

Review comment:
   We could extract a `DenseVectorArrayGenerator` from it, but 
`KMeansModelDataGenerator` still needs to be preserved because 
`KMeansModelData` still has `weights` apart from a dense vector array 
centroids. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Comment Edited] (FLINK-26858) When submitting a task, an error is reported and the description is inaccurate, which will lead to misleading

2022-03-31 Thread Jira



[ 
https://issues.apache.org/jira/browse/FLINK-26858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17515676#comment-17515676
 ] 

丛鹏 edited comment on FLINK-26858 at 4/1/22 2:33 AM:


[~martijnvisser] Thank you for your comments. What can I do, such as turning 
off the issue?


was (Author: congpeng0...@gmail.com):
What can I do, such as turning off the issue?

> When submitting a task, an error is reported and the description is 
> inaccurate, which will lead to misleading
> -
>
> Key: FLINK-26858
> URL: https://issues.apache.org/jira/browse/FLINK-26858
> Project: Flink
>  Issue Type: Improvement
>  Components: Command Line Client
>Affects Versions: 1.12.2, 1.12.7, 1.13.6, 1.14.3
> Environment: idea
> flink 1.14
>Reporter: 丛鹏
>Assignee: 丛鹏
>Priority: Major
>  Labels: pull-request-available
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> Hi, I'm using flink1.12.2 a problem is found when submitting the task of yarn 
> application,
> An example of the Flink official website submitting a task is
> ./bin/flink run-application -t yarn-application ./ 
> examples/streaming/TopSpeedWindowing. jar
> If some of them are misspelled, then yarn-application is written as 
> yarn-appliation
> One C is missing
> Will report an error：
>  
> java.lang.IllegalStateException: No ClusterClientFactory found. If you were 
> targeting a Yarn cluster, please make sure to export the HADOOP_CLASSPATH 
> environment variable or have hadoop in your classpath. For more information 
> refer to the "Deployment" section of the official Apache Flink documentation.
>  
> BUT
>  
> I saw that the code is the 213 line configuration set encapsulated by 
> CliFrontend.java. There is a problem with effectiveconfiguration, resulting 
> in DefaultClusterClientServiceLoader.Java: 83 judgment entry error
>  
> Finally, it leads to logical judgment  
> if (compatibleFactories.isEmpty()) is true
>  
> then 
>  "No ClusterClientFactory found. If you were targeting a Yarn cluster, "
>                             + "please make sure to export the 
> HADOOP_CLASSPATH environment variable or have hadoop in your "
>                             + "classpath. For more information refer to the 
> \"Deployment\" section of the official "
>                             + "Apache Flink documentation."
>  
> Look at all the situations that lead to the failure of the encapsulation of 
> the configuration class,Will prompt HADOOP_CLASSPATH environment 's reason
>  
> I think there is something wrong with the description of the error 
> information here, which will lead to misleading. Users mistakenly think it is 
> their own Hadoop_ There is a problem with the classpath environment. I hope 
> you can reply 
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-26858) When submitting a task, an error is reported and the description is inaccurate, which will lead to misleading

2022-03-31 Thread Jira



[ 
https://issues.apache.org/jira/browse/FLINK-26858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17515676#comment-17515676
 ] 

丛鹏 commented on FLINK-26858:


What can I do, such as turning off the issue?

> When submitting a task, an error is reported and the description is 
> inaccurate, which will lead to misleading
> -
>
> Key: FLINK-26858
> URL: https://issues.apache.org/jira/browse/FLINK-26858
> Project: Flink
>  Issue Type: Improvement
>  Components: Command Line Client
>Affects Versions: 1.12.2, 1.12.7, 1.13.6, 1.14.3
> Environment: idea
> flink 1.14
>Reporter: 丛鹏
>Assignee: 丛鹏
>Priority: Major
>  Labels: pull-request-available
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> Hi, I'm using flink1.12.2 a problem is found when submitting the task of yarn 
> application,
> An example of the Flink official website submitting a task is
> ./bin/flink run-application -t yarn-application ./ 
> examples/streaming/TopSpeedWindowing. jar
> If some of them are misspelled, then yarn-application is written as 
> yarn-appliation
> One C is missing
> Will report an error：
>  
> java.lang.IllegalStateException: No ClusterClientFactory found. If you were 
> targeting a Yarn cluster, please make sure to export the HADOOP_CLASSPATH 
> environment variable or have hadoop in your classpath. For more information 
> refer to the "Deployment" section of the official Apache Flink documentation.
>  
> BUT
>  
> I saw that the code is the 213 line configuration set encapsulated by 
> CliFrontend.java. There is a problem with effectiveconfiguration, resulting 
> in DefaultClusterClientServiceLoader.Java: 83 judgment entry error
>  
> Finally, it leads to logical judgment  
> if (compatibleFactories.isEmpty()) is true
>  
> then 
>  "No ClusterClientFactory found. If you were targeting a Yarn cluster, "
>                             + "please make sure to export the 
> HADOOP_CLASSPATH environment variable or have hadoop in your "
>                             + "classpath. For more information refer to the 
> \"Deployment\" section of the official "
>                             + "Apache Flink documentation."
>  
> Look at all the situations that lead to the failure of the encapsulation of 
> the configuration class,Will prompt HADOOP_CLASSPATH environment 's reason
>  
> I think there is something wrong with the description of the error 
> information here, which will lead to misleading. Users mistakenly think it is 
> their own Hadoop_ There is a problem with the classpath environment. I hope 
> you can reply 
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-26884) Move Elasticsearch connector to external connector repository

2022-03-31 Thread zhisheng (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17515674#comment-17515674
 ] 

zhisheng commented on FLINK-26884:
--

[~martijnvisser] thanks your quickly reply, I get it.

> Move Elasticsearch connector to external connector repository
> -
>
> Key: FLINK-26884
> URL: https://issues.apache.org/jira/browse/FLINK-26884
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ElasticSearch
>Reporter: Jing Ge
>Assignee: Jing Ge
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink-ml] yunfengzhou-hub commented on a change in pull request #71: [FLINK-26443] Add benchmark framework

2022-03-31 Thread GitBox



yunfengzhou-hub commented on a change in pull request #71:
URL: https://github.com/apache/flink-ml/pull/71#discussion_r840183375



##
File path: 
flink-ml-benchmark/src/main/java/org/apache/flink/ml/benchmark/Benchmark.java
##
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.benchmark;
+
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.util.Preconditions;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.FileInputStream;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/** Entry class for benchmark execution. */
+public class Benchmark {
+private static final Logger LOG = LoggerFactory.getLogger(Benchmark.class);
+
+@SuppressWarnings("unchecked")
+public static void main(String[] args) throws Exception {
+StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
+StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
+
+InputStream inputStream = new FileInputStream(args[0]);
+Map jsonMap = 
ReadWriteUtils.OBJECT_MAPPER.readValue(inputStream, Map.class);
+Preconditions.checkArgument(
+jsonMap.containsKey("version") && 
jsonMap.get("version").equals(1));

Review comment:
   In my opinion, making it package-private might be better. `"version"` is 
also used in `BenchmarkTest` now and might be used in `BenchmarkUtils` in 
future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] yunfengzhou-hub commented on a change in pull request #71: [FLINK-26443] Add benchmark framework

2022-03-31 Thread GitBox



yunfengzhou-hub commented on a change in pull request #71:
URL: https://github.com/apache/flink-ml/pull/71#discussion_r840180676



##
File path: 
flink-ml-benchmark/src/test/java/org/apache/flink/ml/benchmark/BenchmarkTest.java
##
@@ -0,0 +1,163 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.benchmark;
+
+import org.apache.flink.ml.benchmark.clustering.kmeans.KMeansInputsGenerator;
+import org.apache.flink.ml.clustering.kmeans.KMeans;
+import org.apache.flink.ml.clustering.kmeans.KMeansModel;
+import org.apache.flink.ml.param.WithParams;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.test.util.AbstractTestBase;
+
+import org.apache.commons.io.FileUtils;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+
+import java.io.ByteArrayOutputStream;
+import java.io.File;
+import java.io.InputStream;
+import java.io.PrintStream;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import static org.apache.flink.ml.util.ReadWriteUtils.OBJECT_MAPPER;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+/** Tests benchmarks. */
+@SuppressWarnings("unchecked")
+public class BenchmarkTest extends AbstractTestBase {
+@Rule public final TemporaryFolder tempFolder = new TemporaryFolder();
+
+private final ByteArrayOutputStream outContent = new 
ByteArrayOutputStream();
+private final PrintStream originalOut = System.out;
+
+@Before
+public void before() {
+System.setOut(new PrintStream(outContent));
+}
+
+@After
+public void after() {
+System.setOut(originalOut);

Review comment:
   When we do `System.setOut(new PrintStream(outContent));`, we have 
already made `System.out` points to `new PrintStream(outContent)`, so 
`System.setOut(System.out)` cannot help to reset to stdout.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Comment Edited] (FLINK-14998) Remove FileUtils#deletePathIfEmpty

2022-03-31 Thread jay li (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17515379#comment-17515379
 ] 

jay li edited comment on FLINK-14998 at 4/1/22 2:11 AM:


Hi [~yunta]  thanks i got this ticket , I have two questions to ask.

1 ) I need to create a new branch(fork repo) from master and then submit master 
to pr after solving this problem?     

2) I need remove FileUtils#deletePathIfEmpty code and corresponding unit tests.


was (Author: JIRAUSER287310):
Hi [~yunta]  thanks i got this ticket , I have two questions to ask.

1 ) I need to create a new branch from master and then submit master to pr 
after solving this problem?     

2) I need remove FileUtils#deletePathIfEmpty code and corresponding unit tests.

> Remove FileUtils#deletePathIfEmpty
> --
>
> Key: FLINK-14998
> URL: https://issues.apache.org/jira/browse/FLINK-14998
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystems
>Reporter: Yun Tang
>Assignee: jay li
>Priority: Minor
>  Labels: auto-deprioritized-major, starter
> Fix For: 1.16.0
>
>
> With the lesson learned from FLINK-7266, and the refactor of FLINK-8540, 
> method of  {{FileUtils#deletePathIfEmpty}} has been totally useless in Flink 
> production code. From my point of view, it's not wise to provide a method 
> with already known high-risk defect in Flink official code. I suggest to 
> remove this part of code.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #19303: [FLINK-26961][connectors][filesystems][formats] Update multiple Jacks…

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19303:
URL: https://github.com/apache/flink/pull/19303#issuecomment-1084622464


   
   ## CI report:
   
   * 99092a0711958b858b23f10ea0056167c3b4f7f7 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34060)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] yunfengzhou-hub commented on a change in pull request #71: [FLINK-26443] Add benchmark framework

2022-03-31 Thread GitBox



yunfengzhou-hub commented on a change in pull request #71:
URL: https://github.com/apache/flink-ml/pull/71#discussion_r840168951



##
File path: 
flink-ml-benchmark/src/main/java/org/apache/flink/ml/benchmark/BenchmarkResult.java
##
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.benchmark;
+
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+/** The result of executing a benchmark. */
+public class BenchmarkResult {
+/** The name of the benchmark. */
+public String name;

Review comment:
   I agree that these fields should be made `final`, but adding a 
constructor with all these field values might possibly limit the extension of 
the interface. For example, when in near future when we add fields like 
`latencyP99Ms`, all usages of this constructor needs to be modified. I would 
prefer to add a `BenchmarkResult.Builder` that helps setting an arbitrary 
number of fields before constructing the `BenchmarkResult` instance.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] hackergin commented on a change in pull request #19302: [FLINK-26810][connectors/elasticsearch] Use local timezone for TIMESTAMP_WITH_LOCAL_TIMEZONE fields in dynamic index

2022-03-31 Thread GitBox



hackergin commented on a change in pull request #19302:
URL: https://github.com/apache/flink/pull/19302#discussion_r840165388



##
File path: 
flink-connectors/flink-connector-elasticsearch-base/src/main/java/org/apache/flink/connector/elasticsearch/table/IndexGeneratorFactory.java
##
@@ -186,7 +185,10 @@ private static DynamicFormatter createFormatFunction(
 case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
 return (value, dateTimeFormatter) -> {
 TimestampData indexField = (TimestampData) value;
-return 
indexField.toInstant().atZone(ZoneOffset.UTC).format(dateTimeFormatter);
+return indexField
+.toInstant()

Review comment:
   When `createIndexGenerator` there is already a localTimeZoneId , So we 
don't need to get is from `ZoneId.systemDefault()` ? 

##
File path: 
flink-connectors/flink-connector-elasticsearch-base/src/main/java/org/apache/flink/streaming/connectors/elasticsearch/table/IndexGeneratorFactory.java
##
@@ -183,7 +182,10 @@ private static DynamicFormatter createFormatFunction(
 case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
 return (value, dateTimeFormatter) -> {
 TimestampData indexField = (TimestampData) value;
-return 
indexField.toInstant().atZone(ZoneOffset.UTC).format(dateTimeFormatter);
+return indexField
+.toInstant()

Review comment:
   the same as above. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-25794) Memory pages in LazyMemorySegmentPool should be clear after they are released to MemoryManager

2022-03-31 Thread Yangze Guo (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17515657#comment-17515657
 ] 

Yangze Guo commented on FLINK-25794:


[~zjureel] Is this issue still valid? Shall we close it?

> Memory pages in LazyMemorySegmentPool should be clear after they are released 
> to MemoryManager
> --
>
> Key: FLINK-25794
> URL: https://issues.apache.org/jira/browse/FLINK-25794
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.11.6, 1.13.5, 1.14.3
>Reporter: Shammon
>Assignee: Shammon
>Priority: Major
>  Labels: pull-request-available, stale-assigned
>
> `LazyMemorySegmentPool` manages memory segments cache for join, agg, sort and 
> etc. operators. These segments in the cache will be released to 
> `MemoryManager` after some specify operations such as join operator finishes 
> to build data in `LazyMemorySegmentPool.cleanCache` method. But these 
> segments are still in `LazyMemorySegmentPool.cachePages`, it may cause memory 
> fault if the `MemoryManager` has deallocated these segments



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26969) Write operation examples of tumbling window, sliding window, session window and count window based on pyflink

2022-03-31 Thread Dian Fu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated FLINK-26969:

Issue Type: Improvement  (was: New Feature)

> Write operation examples of tumbling window, sliding window, session window 
> and count window based on pyflink
> -
>
> Key: FLINK-26969
> URL: https://issues.apache.org/jira/browse/FLINK-26969
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Reporter: zhangjingcun
>Assignee: zhangjingcun
>Priority: Major
>
> Write operation examples of tumbling window, sliding window, session window 
> and count window based on pyflink



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Assigned] (FLINK-26969) Write operation examples of tumbling window, sliding window, session window and count window based on pyflink

2022-03-31 Thread Dian Fu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu reassigned FLINK-26969:
---

Assignee: zhangjingcun

> Write operation examples of tumbling window, sliding window, session window 
> and count window based on pyflink
> -
>
> Key: FLINK-26969
> URL: https://issues.apache.org/jira/browse/FLINK-26969
> Project: Flink
>  Issue Type: New Feature
>  Components: Examples
>Affects Versions: 1.14.4
>Reporter: zhangjingcun
>Assignee: zhangjingcun
>Priority: Major
>
> Write operation examples of tumbling window, sliding window, session window 
> and count window based on pyflink



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26969) Write operation examples of tumbling window, sliding window, session window and count window based on pyflink

2022-03-31 Thread Dian Fu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated FLINK-26969:

Affects Version/s: (was: 1.14.4)

> Write operation examples of tumbling window, sliding window, session window 
> and count window based on pyflink
> -
>
> Key: FLINK-26969
> URL: https://issues.apache.org/jira/browse/FLINK-26969
> Project: Flink
>  Issue Type: New Feature
>  Components: Examples
>Reporter: zhangjingcun
>Assignee: zhangjingcun
>Priority: Major
>
> Write operation examples of tumbling window, sliding window, session window 
> and count window based on pyflink



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26969) Write operation examples of tumbling window, sliding window, session window and count window based on pyflink

2022-03-31 Thread Dian Fu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated FLINK-26969:

Component/s: API / Python

> Write operation examples of tumbling window, sliding window, session window 
> and count window based on pyflink
> -
>
> Key: FLINK-26969
> URL: https://issues.apache.org/jira/browse/FLINK-26969
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python, Examples
>Reporter: zhangjingcun
>Assignee: zhangjingcun
>Priority: Major
>
> Write operation examples of tumbling window, sliding window, session window 
> and count window based on pyflink



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26969) Write operation examples of tumbling window, sliding window, session window and count window based on pyflink

2022-03-31 Thread zhangjingcun (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangjingcun updated FLINK-26969:
-
Description: Write operation examples of tumbling window, sliding window, 
session window and count window based on pyflink  (was: Write operation 
examples of tumbling window, sliding window, session window and time window 
based on pyflink)

> Write operation examples of tumbling window, sliding window, session window 
> and count window based on pyflink
> -
>
> Key: FLINK-26969
> URL: https://issues.apache.org/jira/browse/FLINK-26969
> Project: Flink
>  Issue Type: New Feature
>  Components: Examples
>Affects Versions: 1.14.4
>Reporter: zhangjingcun
>Priority: Major
>
> Write operation examples of tumbling window, sliding window, session window 
> and count window based on pyflink



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26969) Write operation examples of tumbling window, sliding window, session window and count window based on pyflink

2022-03-31 Thread zhangjingcun (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangjingcun updated FLINK-26969:
-
Summary: Write operation examples of tumbling window, sliding window, 
session window and count window based on pyflink  (was: Write operation 
examples of tumbling window, sliding window, session window and time window 
based on pyflink)

> Write operation examples of tumbling window, sliding window, session window 
> and count window based on pyflink
> -
>
> Key: FLINK-26969
> URL: https://issues.apache.org/jira/browse/FLINK-26969
> Project: Flink
>  Issue Type: New Feature
>  Components: Examples
>Affects Versions: 1.14.4
>Reporter: zhangjingcun
>Priority: Major
>
> Write operation examples of tumbling window, sliding window, session window 
> and time window based on pyflink



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Created] (FLINK-26969) Write operation examples of tumbling window, sliding window, session window and time window based on pyflink

2022-03-31 Thread zhangjingcun (Jira)

zhangjingcun created FLINK-26969:


 Summary: Write operation examples of tumbling window, sliding 
window, session window and time window based on pyflink
 Key: FLINK-26969
 URL: https://issues.apache.org/jira/browse/FLINK-26969
 Project: Flink
  Issue Type: New Feature
  Components: Examples
Affects Versions: 1.14.4
Reporter: zhangjingcun


Write operation examples of tumbling window, sliding window, session window and 
time window based on pyflink



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #19303: [FLINK-26961][connectors][filesystems][formats] Update multiple Jacks…

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19303:
URL: https://github.com/apache/flink/pull/19303#issuecomment-1084622464


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 99092a0711958b858b23f10ea0056167c3b4f7f7 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34060)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19303: [FLINK-26961][connectors][filesystems][formats] Update multiple Jacks…

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19303:
URL: https://github.com/apache/flink/pull/19303#issuecomment-1084622464


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 99092a0711958b858b23f10ea0056167c3b4f7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19303: [FLINK-26961][connectors][filesystems][formats] Update multiple Jacks…

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19303:
URL: https://github.com/apache/flink/pull/19303#issuecomment-1084622464


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19303: [FLINK-26961][connectors][filesystems][formats] Update multiple Jacks…

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19303:
URL: https://github.com/apache/flink/pull/19303#issuecomment-1084622464


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 99092a0711958b858b23f10ea0056167c3b4f7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19303: [FLINK-26961][connectors][filesystems][formats] Update multiple Jacks…

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19303:
URL: https://github.com/apache/flink/pull/19303#issuecomment-1084622464


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19303: [FLINK-26961][connectors][filesystems][formats] Update multiple Jacks…

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19303:
URL: https://github.com/apache/flink/pull/19303#issuecomment-1084622464


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 99092a0711958b858b23f10ea0056167c3b4f7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] snuyanzin commented on pull request #19303: [FLINK-26961][connectors][filesystems][formats] Update multiple Jacks…

2022-03-31 Thread GitBox



snuyanzin commented on pull request #19303:
URL: https://github.com/apache/flink/pull/19303#issuecomment-1085224069


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19303: [FLINK-26961][connectors][filesystems][formats] Update multiple Jacks…

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19303:
URL: https://github.com/apache/flink/pull/19303#issuecomment-1084622464


   
   ## CI report:
   
   * b2d8f3d7657f62c44cfaeab31ccf7c5166ea184b Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34059)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19303: [FLINK-26961][connectors][filesystems][formats] Update multiple Jacks…

2022-03-31 Thread GitBox



flinkbot edited a comment on pull request #19303:
URL: https://github.com/apache/flink/pull/19303#issuecomment-1084622464


   
   ## CI report:
   
   * b2d8f3d7657f62c44cfaeab31ccf7c5166ea184b Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=34059)
 
   * 99092a0711958b858b23f10ea0056167c3b4f7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

1 2 3 4 5 6 >

1 - 100 of 518 matches

Mail list logo