[GitHub] [flink-table-store] zjureel commented on pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


zjureel commented on PR #376:
URL: 
https://github.com/apache/flink-table-store/pull/376#issuecomment-1318222016

   Thanks @JingsongLi, I have updated the code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] JerryYue-M commented on pull request #21197: [FLINK-29801] OperatorCoordinator need open the way to operate metric…

2022-11-16 Thread GitBox


JerryYue-M commented on PR #21197:
URL: https://github.com/apache/flink/pull/21197#issuecomment-1318220393

   @zhuzhurk  
   Your comments have been addressed, and there are no checkstyle or compile 
problems. Can you take a look at the new changes?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] JerryYue-M commented on pull request #21197: [FLINK-29801] OperatorCoordinator need open the way to operate metric…

2022-11-16 Thread GitBox


JerryYue-M commented on PR #21197:
URL: https://github.com/apache/flink/pull/21197#issuecomment-1318219982

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-ml] lindong28 merged pull request #172: [FLINK-29592] Add Estimator and Transformer for RobustScaler

2022-11-16 Thread GitBox


lindong28 merged PR #172:
URL: https://github.com/apache/flink-ml/pull/172


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-ml] lindong28 commented on pull request #172: [FLINK-29592] Add Estimator and Transformer for RobustScaler

2022-11-16 Thread GitBox


lindong28 commented on PR #172:
URL: https://github.com/apache/flink-ml/pull/172#issuecomment-1318216209

   Thanks @jiangxin369 for the PR! Thanks @yunfengzhou-hub for the review. LGTM.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-29572) Flink Task Manager skip loopback interface for resource manager registration

2022-11-16 Thread Weihua Hu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635183#comment-17635183
 ] 

Weihua Hu commented on FLINK-29572:
---

[~AlexXXX] IMO, it's a bug to use loopback. We will remove it in 
https://issues.apache.org/jira/browse/FLINK-27341

> Flink Task Manager skip loopback interface for resource manager registration
> 
>
> Key: FLINK-29572
> URL: https://issues.apache.org/jira/browse/FLINK-29572
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.15.3
> Environment: Flink 1.15.2
> Kubernetes with Istio Proxy
>Reporter: Kevin Li
>Priority: Major
>
> Currently the Flink Task Manager tries different local interfaces to bind to 
> when connecting to the Resource Manager. The first one is the loopback 
> interface. Normally, if the Job Manager is running on a remote host/container, 
> connecting via the loopback interface will fail and the Task Manager will pick 
> up the correct IP address.
> However, if the Task Manager is running with some proxy, the loopback interface 
> can connect to the remote host as well. This results in 127.0.0.1 being 
> reported to the Resource Manager during registration even though the Job 
> Manager/Resource Manager runs on a remote host, and problems will follow. For 
> us, only one Task Manager can register in this case.
> I suggest adding a configuration option to skip the loopback interface check if 
> we know the Job/Resource Manager is running on a remote host/container.
>  
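For illustration, a minimal sketch of the loopback-skipping check the reporter 
suggests. This is not Flink's actual address-selection code; the class name and 
the flag are hypothetical:

{code:java}
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;

// Hypothetical sketch: pick a bind address, optionally skipping loopback
// interfaces, since a sidecar proxy can make loopback "connectable" to
// remote hosts and lead to 127.0.0.1 being registered.
public final class BindAddressSelector {

    public static InetAddress select(boolean skipLoopback) throws SocketException {
        for (NetworkInterface nic : Collections.list(NetworkInterface.getNetworkInterfaces())) {
            if (!nic.isUp() || (skipLoopback && nic.isLoopback())) {
                continue;
            }
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                if (!addr.isLoopbackAddress()) {
                    return addr;
                }
            }
        }
        throw new IllegalStateException("No suitable non-loopback interface found");
    }
}
{code}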



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-8712) Cannot execute job with multiple slot sharing groups on LocalExecutor

2022-11-16 Thread Weihua Hu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635180#comment-17635180
 ] 

Weihua Hu commented on FLINK-8712:
--

[~xtsong] Agree with you; maybe we can close this issue, since configuring the 
number of slots per TaskManager is already supported.

> Cannot execute job with multiple slot sharing groups on LocalExecutor
> -
>
> Key: FLINK-8712
> URL: https://issues.apache.org/jira/browse/FLINK-8712
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Priority: Not a Priority
>  Labels: auto-deprioritized-critical, auto-deprioritized-major, 
> auto-deprioritized-minor
>
> Currently, it is not possible to run a job with multiple slot sharing groups 
> on the LocalExecutor. The problem is that we determine the number of required 
> slots simply by looking at the max parallelism of the job but do not 
> consider slot sharing groups.
>  
> {code:java}
> // set up the streaming execution environment
> final StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.setParallelism(1);
> final DataStreamSource<Integer> input = env.addSource(new InfinitySource());
> final SingleOutputStreamOperator<Integer> different = input.map(new 
> MapFunction<Integer, Integer>() {
>@Override
>public Integer map(Integer integer) throws Exception {
>   return integer;
>}
> }).slotSharingGroup("Different");
> different.print();
> // execute program
> env.execute("Flink Streaming Java API Skeleton");{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] gaborgsomogyi commented on pull request #21339: [FLINK-30024][tests] Build local test KDC docker image

2022-11-16 Thread GitBox


gaborgsomogyi commented on PR #21339:
URL: https://github.com/apache/flink/pull/21339#issuecomment-1318206668

   @MartijnVisser Yeah, this PR is intended to add more stability to the test 
(though I'm not saying it will solve the version bump issues). Additionally, one 
needs to change the Hadoop version in only 1-2 places. I'm aware of the bump PR 
but I haven't tracked the latest news. If you could show me a bad run, then maybe 
I can have a look and help in some way. At the same time, I'm fixing the missing 
`hadoop.tar.gz` issue here. The other thing I can suggest is to merge this 
when ready, and we can have a look on top of it.
   
   As a general note, I think it makes things super hard that we're not able to 
execute the e2e tests locally. I'm thinking about fixing it somehow, but since 
I'm working on time-pressured features, this is best effort for now.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-30047) getLastSavepointStatus should return null when there is never savepoint completed or pending

2022-11-16 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-30047.
--
Fix Version/s: kubernetes-operator-1.3.0
   Resolution: Fixed

merged to main 215912b08ab0d915be4e84be17f5df97fd4bb6b5

> getLastSavepointStatus should return null when there is never savepoint 
> completed or pending
> 
>
> Key: FLINK-30047
> URL: https://issues.apache.org/jira/browse/FLINK-30047
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Clara Xiong
>Assignee: Clara Xiong
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.3.0
>
>
> Currently SUCCEEDED is returned in this case, but null should be returned 
> instead to distinguish it from a real success.
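As an illustration of the intended contract, a hedged sketch; the field and enum 
names here are hypothetical, not the operator's actual classes:

{code:java}
// Hypothetical sketch of the intended behavior; not the merged implementation.
public class SavepointInfoSketch {

    enum SavepointStatus { PENDING, SUCCEEDED }

    private Object lastSavepoint;   // last completed savepoint, null if none ever
    private Long triggerTimestamp;  // non-null while a savepoint is pending

    /** Returns null when no savepoint was ever completed or pending. */
    public SavepointStatus getLastSavepointStatus() {
        if (lastSavepoint == null && triggerTimestamp == null) {
            return null; // never triggered: do not report SUCCEEDED
        }
        return triggerTimestamp != null ? SavepointStatus.PENDING : SavepointStatus.SUCCEEDED;
    }
}
{code}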



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-30047) getLastSavepointStatus should return null when there is never savepoint completed or pending

2022-11-16 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora reassigned FLINK-30047:
--

Assignee: Clara Xiong

> getLastSavepointStatus should return null when there is never savepoint 
> completed or pending
> 
>
> Key: FLINK-30047
> URL: https://issues.apache.org/jira/browse/FLINK-30047
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Clara Xiong
>Assignee: Clara Xiong
>Priority: Major
>  Labels: pull-request-available
>
> Currently SUCCEEDED is returned in this case, but null should be returned 
> instead to distinguish it from a real success.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] gyfora merged pull request #443: [FLINK-30047] getLastSavepointStatus should return null when there is…

2022-11-16 Thread GitBox


gyfora merged PR #443:
URL: https://github.com/apache/flink-kubernetes-operator/pull/443


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-29572) Flink Task Manager skip loopback interface for resource manager registration

2022-11-16 Thread AlexHu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635164#comment-17635164
 ] 

AlexHu edited comment on FLINK-29572 at 11/17/22 6:38 AM:
--

We faced the same problem on K8s. To be honest, we are confused about why 
loopback is used as the default strategy. Especially in a production K8s 
environment, loopback is useless for remote shuffle. After changing the default 
strategy to LOCAL_HOST, we fixed this bug.


was (Author: JIRAUSER291799):
We faced the same problem on K8s. To be honest, we are confused about why 
loopback is used as the default strategy. After changing the default strategy to 
LOCAL_HOST, we fixed this bug.

> Flink Task Manager skip loopback interface for resource manager registration
> 
>
> Key: FLINK-29572
> URL: https://issues.apache.org/jira/browse/FLINK-29572
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.15.3
> Environment: Flink 1.15.2
> Kubernetes with Istio Proxy
>Reporter: Kevin Li
>Priority: Major
>
> Currently the Flink Task Manager tries different local interfaces to bind to 
> when connecting to the Resource Manager. The first one is the loopback 
> interface. Normally, if the Job Manager is running on a remote host/container, 
> connecting via the loopback interface will fail and the Task Manager will pick 
> up the correct IP address.
> However, if the Task Manager is running with some proxy, the loopback interface 
> can connect to the remote host as well. This results in 127.0.0.1 being 
> reported to the Resource Manager during registration even though the Job 
> Manager/Resource Manager runs on a remote host, and problems will follow. For 
> us, only one Task Manager can register in this case.
> I suggest adding a configuration option to skip the loopback interface check if 
> we know the Job/Resource Manager is running on a remote host/container.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29572) Flink Task Manager skip loopback interface for resource manager registration

2022-11-16 Thread AlexHu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635164#comment-17635164
 ] 

AlexHu commented on FLINK-29572:


We faced the same problem on K8s. To be honest, we are confused about why 
loopback is used as the default strategy. After changing the default strategy to 
LOCAL_HOST, we fixed this bug.

> Flink Task Manager skip loopback interface for resource manager registration
> 
>
> Key: FLINK-29572
> URL: https://issues.apache.org/jira/browse/FLINK-29572
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.15.3
> Environment: Flink 1.15.2
> Kubernetes with Istio Proxy
>Reporter: Kevin Li
>Priority: Major
>
> Currently the Flink Task Manager tries different local interfaces to bind to 
> when connecting to the Resource Manager. The first one is the loopback 
> interface. Normally, if the Job Manager is running on a remote host/container, 
> connecting via the loopback interface will fail and the Task Manager will pick 
> up the correct IP address.
> However, if the Task Manager is running with some proxy, the loopback interface 
> can connect to the remote host as well. This results in 127.0.0.1 being 
> reported to the Resource Manager during registration even though the Job 
> Manager/Resource Manager runs on a remote host, and problems will follow. For 
> us, only one Task Manager can register in this case.
> I suggest adding a configuration option to skip the loopback interface check if 
> we know the Job/Resource Manager is running on a remote host/container.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] xintongsong commented on a diff in pull request #21331: [FLINK-29639] Print resourceId of remote taskmanager when encounter transport exception.

2022-11-16 Thread GitBox


xintongsong commented on code in PR #21331:
URL: https://github.com/apache/flink/pull/21331#discussion_r1024797669


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/ConnectionID.java:
##
@@ -75,15 +84,18 @@ public boolean equals(Object other) {
 }
 
 final ConnectionID ra = (ConnectionID) other;
-if (!ra.getAddress().equals(address) || ra.getConnectionIndex() != 
connectionIndex) {
-return false;
-}
-
-return true;
+return ra.getAddress().equals(address)
+&& ra.getConnectionIndex() == connectionIndex
+&& ra.getResourceID().equals(resourceID);
 }
 
 @Override
 public String toString() {
-return address + " [" + connectionIndex + "]";
+return address
++ " ["
++ connectionIndex
++ "]"
++ " resourceID: "
++ resourceID.getStringWithMetadata();

Review Comment:
   Below is how a connection-id is printed.
   ```
   localhost/127.0.0.1:1 [10] resourceID: producerLocation
   ```
   It would be nice to have a more concise format. E.g.,
   ```
   localhost/127.0.0.1:1 (producerLocation) [10]
   ```
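   
   A hedged, self-contained sketch of the suggested concise format (names mirror 
the diff above; this is not the merged implementation):
   ```java
   // Produces e.g. "localhost/127.0.0.1:1 (producerLocation) [10]".
   final class ConnectionIdFormat {
       static String format(String address, String resourceId, int connectionIndex) {
           return address + " (" + resourceId + ") [" + connectionIndex + "]";
       }
   }
   ```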



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] AlanConfluent commented on a diff in pull request #21264: [WIP][FLINK-29928][runtime, state] Share RocksDB memory across TM slots

2022-11-16 Thread GitBox


AlanConfluent commented on code in PR #21264:
URL: https://github.com/apache/flink/pull/21264#discussion_r1024790314


##
flink-tests/src/test/java/org/apache/flink/test/state/TaskManagerWideRocksDbMemorySharingITCase.java:
##
@@ -0,0 +1,257 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.test.state;
+
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.api.common.functions.RichMapFunction;
+import org.apache.flink.api.common.state.ListState;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.api.common.time.Deadline;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.configuration.MemorySize;
+import org.apache.flink.configuration.StateBackendOptions;
+import org.apache.flink.configuration.TaskManagerOptions;
+import org.apache.flink.contrib.streaming.state.RocksDBNativeMetricOptions;
+import org.apache.flink.contrib.streaming.state.RocksDBOptions;
+import org.apache.flink.metrics.Gauge;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.testutils.InMemoryReporter;
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.streaming.api.CheckpointingMode;
+import org.apache.flink.streaming.api.datastream.DataStreamSource;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.sink.DiscardingSink;
+import org.apache.flink.test.util.MiniClusterWithClientResource;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.math.BigInteger;
+import java.time.Duration;
+import java.util.ArrayList;
+import java.util.DoubleSummaryStatistics;
+import java.util.List;
+import java.util.Random;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
+import static 
org.apache.flink.api.common.restartstrategy.RestartStrategies.noRestart;
+import static 
org.apache.flink.contrib.streaming.state.RocksDBMemoryControllerUtils.calculateActualCacheCapacity;
+import static 
org.apache.flink.runtime.testutils.CommonTestUtils.waitForAllTaskRunning;
+import static org.apache.flink.util.Preconditions.checkState;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+/**
+ * Tests that memory sharing scope and {@link 
TaskManagerOptions#MANAGED_MEMORY_SHARED_FRACTION}
+ * work as expected, i.e. make RocksDB use the same BlockCache and 
WriteBufferManager objects. It
+ * does so using RocksDB metrics.
+ */
+public class TaskManagerWideRocksDbMemorySharingITCase {
+private static final int PARALLELISM = 4;
+private static final int NUMBER_OF_JOBS = 5;
+private static final int NUMBER_OF_TASKS = NUMBER_OF_JOBS * PARALLELISM;
+
+private static final int MANAGED_MEMORY_SIZE_BYTES = NUMBER_OF_TASKS * 25 
* 1024 * 1024;
+private static final double MANAGED_MEMORY_SHARED_FRACTION = .85d;
+private static final double WRITE_BUFFER_RATIO = 0.5;
+private static final double EXPECTED_BLOCK_CACHE_SIZE =
+calculateActualCacheCapacity(
+(long) (MANAGED_MEMORY_SIZE_BYTES * 
MANAGED_MEMORY_SHARED_FRACTION),
+WRITE_BUFFER_RATIO);
+// try to check that the memory usage is limited
+// however, there is no hard limit actually
+// because of https://issues.apache.org/jira/browse/FLINK-15532
+private static final double EFFECTIVE_LIMIT = EXPECTED_BLOCK_CACHE_SIZE * 
1.25;
+
+private InMemoryReporter metricsReporter;
+private MiniClusterWithClientResource cluster;
+
+@Before
+public void init() throws Exception {
+metricsReporter = InMemoryReporter.create();
+cluster =
+new MiniClusterWithClientResource(
+new MiniClusterResourceConfiguration.Builder()
+
.setConfiguration(getConfiguration(metricsReporter))
+.setNumberTaskManagers(1)
+.setNumberSlotsPerTaskManager(NUMBER_OF_TASKS)
+.build());
+cluster.before();
+}
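
   For readers following the constants above, a hedged sketch of the arithmetic 
that feeds `calculateActualCacheCapacity` (the util itself is Flink's; only this 
walkthrough is illustrative):
   ```java
   // Mirrors the constants in the test above; illustrative only.
   public class SharedMemoryMath {
       public static void main(String[] args) {
           int numberOfTasks = 5 * 4;                                    // NUMBER_OF_JOBS * PARALLELISM
           long managedMemoryBytes = numberOfTasks * 25L * 1024 * 1024;  // 25 MiB per task
           double sharedFraction = 0.85;                                 // MANAGED_MEMORY_SHARED_FRACTION
           long sharedBytes = (long) (managedMemoryBytes * sharedFraction);
           // sharedBytes is the value the test passes to calculateActualCacheCapacity(...)
           System.out.println("shared managed memory: " + sharedBytes + " bytes");
       }
   }
   ```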

[jira] [Commented] (FLINK-28558) HistoryServer log retrieval configuration improvement

2022-11-16 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635154#comment-17635154
 ] 

Xintong Song commented on FLINK-28558:
--

This feature helps integrate a log browsing service into the history server, 
so you can conveniently jump to the corresponding logs by clicking on the task 
of interest in the history server. This works only if you already have a web 
service where the logs can be browsed. 

> HistoryServer log retrieval configuration improvement
> -
>
> Key: FLINK-28558
> URL: https://issues.apache.org/jira/browse/FLINK-28558
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Runtime / REST
>Reporter: Xintong Song
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> HistoryServer generates log retrieval URLs based on the following 
> configuration:
> - historyserver.log.jobmanager.url-pattern
> - historyserver.log.taskmanager.url-pattern
> The usability can be improved in two ways:
> - Explicitly explain in the description that only http/https schemas are 
> supported, and add sanity checks for it.
> - If the schema is not specified, add "http://" by default.
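For illustration, a minimal sketch of the proposed defaulting and sanity check; 
this is a hypothetical helper, not the actual HistoryServer code:

{code:java}
public final class LogUrlPatterns {

    /** Adds "http://" when no schema is given; rejects non-http/https schemas. */
    static String normalize(String pattern) {
        if (pattern.startsWith("http://") || pattern.startsWith("https://")) {
            return pattern;
        }
        if (pattern.contains("://")) {
            // sanity check: only http/https schemas are supported
            throw new IllegalArgumentException("Unsupported schema: " + pattern);
        }
        return "http://" + pattern;
    }
}
{code}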



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] zjureel commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


zjureel commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024778136


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaFieldTypeExtractor.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.RowType;
+
+import java.io.Serializable;
+import java.util.List;
+
+/** Extractor of schema for different tables. */
+public interface SchemaFieldTypeExtractor extends Serializable {
+/**
+ * Extract key type from table schema.
+ *
+ * @param schema the table schema
+ * @return the key type
+ */
+RowType keyType(TableSchema schema);
+
+/**
+ * Extract key fields from table schema.
+ *
+ * @param schema the table schema
+ * @return the key fields
+ */
+List<DataField> keyFields(TableSchema schema);

Review Comment:
   Yes, we can convert `List<DataField>` to `RowType` with `(RowType) new 
RowDataType(false, fields).logicalType` for 
`ChangelogValueCountFileStoreTable`, but for `ChangelogWithKeyFileStoreTable` 
we should call `addKeyNamePrefix` on the key `RowType`. 
   
   So I added `keyType` to the `Extractor`; `ChangelogValueCountFileStoreTable` 
and `ChangelogWithKeyFileStoreTable` can implement their own conversion in it, 
and `KeyValueFileStoreScan` can get the key type for the different tables.
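   
   A minimal sketch of that conversion as a util, built around the expression 
quoted in this thread (the class name and package placement are assumptions; 
`DataField` and `RowDataType` are the table store's schema classes):
   ```java
   import org.apache.flink.table.types.logical.RowType;

   import java.util.List;

   // Sketch only: wraps the conversion discussed above.
   public class KeyTypeUtils {
       /** Converts key fields to a key RowType, e.g. for the value-count table. */
       public static RowType toRowType(List<DataField> fields) {
           return (RowType) new RowDataType(false, fields).logicalType;
       }
   }
   ```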



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024745027


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaFieldTypeExtractor.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.RowType;
+
+import java.io.Serializable;
+import java.util.List;
+
+/** Extractor of schema for different tables. */
+public interface SchemaFieldTypeExtractor extends Serializable {
+/**
+ * Extract key type from table schema.
+ *
+ * @param schema the table schema
+ * @return the key type
+ */
+RowType keyType(TableSchema schema);
+
+/**
+ * Extract key fields from table schema.
+ *
+ * @param schema the table schema
+ * @return the key fields
+ */
+List<DataField> keyFields(TableSchema schema);

Review Comment:
   `(RowType) new RowDataType(false, fields).logicalType` can convert 
`fields` to a `RowType`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-29969) Show the root cause when exceeded checkpoint tolerable failure threshold

2022-11-16 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635118#comment-17635118
 ] 

Rui Fan commented on FLINK-29969:
-

Thanks [~yunta] for the good suggestion and review.

> Show the root cause when exceeded checkpoint tolerable failure threshold
> 
>
> Key: FLINK-29969
> URL: https://issues.apache.org/jira/browse/FLINK-29969
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.17.0
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> Add the root cause when the checkpoint tolerable failure threshold is 
> exceeded; it's helpful during troubleshooting.
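A hedged sketch of the idea (not the merged change): keep the last checkpoint 
failure as the cause so it appears in the failure stack trace. The class and 
method names are illustrative:

{code:java}
import org.apache.flink.util.FlinkRuntimeException;

public class TolerableFailureSketch {
    // Illustrative only: chain the last failure instead of dropping it.
    static RuntimeException thresholdExceeded(Throwable lastCheckpointFailure) {
        return new FlinkRuntimeException(
                "Exceeded checkpoint tolerable failure threshold.", lastCheckpointFailure);
    }
}
{code}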



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] zjureel commented on pull request #383: [FLINK-30033] Add primary key type validation

2022-11-16 Thread GitBox


zjureel commented on PR #383:
URL: 
https://github.com/apache/flink-table-store/pull/383#issuecomment-1318036364

   CI fails due to HiveE2E, which is a known issue tracked in 
https://issues.apache.org/jira/browse/FLINK-30038


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] zjureel commented on a diff in pull request #383: [FLINK-30033] Add primary key type validation

2022-11-16 Thread GitBox


zjureel commented on code in PR #383:
URL: https://github.com/apache/flink-table-store/pull/383#discussion_r1024730773


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaManager.java:
##
@@ -157,6 +159,33 @@ public TableSchema commitNewVersion(UpdateSchema 
updateSchema) throws Exception
 }
 }
 
+private void validatePrimaryKeysType(UpdateSchema updateSchema, 
List<String> primaryKeys) {
+if (!primaryKeys.isEmpty()) {
+Map<String, RowType.RowField> rowFields = new HashMap<>();
+for (RowType.RowField rowField : 
updateSchema.rowType().getFields()) {
+rowFields.put(StringUtils.lowerCase(rowField.getName()), 
rowField);

Review Comment:
   I will try to fix it later
   
   > I created a jira for hive https://issues.apache.org/jira/browse/FLINK-29988
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-29988) Improve upper case fields for hive metastore

2022-11-16 Thread Shammon (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shammon updated FLINK-29988:

Component/s: Table Store

> Improve upper case fields for hive metastore
> 
>
> Key: FLINK-29988
> URL: https://issues.apache.org/jira/browse/FLINK-29988
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Major
>
> If the fields in the FTS table are uppercase, there will be a mismatch 
> exception when the table is used in Hive.
> 1. If it is not supported at the beginning, throw an exception when Flink 
> creates a table in the Hive metastore.
> 2. If it is supported, make sure that no error is reported in the whole 
> process, but save lowercase names in the Hive metastore. We can check for 
> columns with the same name when creating a table in Flink with the Hive 
> metastore.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] zjureel commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


zjureel commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024726786


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaFieldTypeExtractor.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.RowType;
+
+import java.io.Serializable;
+import java.util.List;
+
+/** Extractor of schema for different tables. */
+public interface SchemaFieldTypeExtractor extends Serializable {
+/**
+ * Extract key type from table schema.
+ *
+ * @param schema the table schema
+ * @return the key type
+ */
+RowType keyType(TableSchema schema);
+
+/**
+ * Extract key fields from table schema.
+ *
+ * @param schema the table schema
+ * @return the key fields
+ */
+List<DataField> keyFields(TableSchema schema);

Review Comment:
   Do you mean the `RowType keyType(TableSchema schema)` and 
`List<DataField> keyFields(TableSchema schema)` methods in `Extractor`? These 
two methods are really a bit strange; we can change 
`RowType keyType(TableSchema schema)` to `RowType keyType(List<DataField> keyFields)`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (FLINK-30043) Some example sqls in flink table store rescale-bucket document are incorrect.

2022-11-16 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee reassigned FLINK-30043:


Assignee: Hang HOU

> Some example sqls in flink table store rescale-bucket document are incorrect.
> --
>
> Key: FLINK-30043
> URL: https://issues.apache.org/jira/browse/FLINK-30043
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Affects Versions: 1.16.0
>Reporter: Hang HOU
>Assignee: Hang HOU
>Priority: Critical
>  Labels: pull-request-available
> Fix For: table-store-0.3.0
>
> Attachments: image-2022-11-16-18-30-24-261.png
>
>
> [rescale-bucket|https://nightlies.apache.org/flink/flink-table-store-docs-release-0.2/docs/development/rescale-bucket/#use-case]
> For example, in this table (in the table store catalog), its columns were 
> defined as the following 4:
> "trade_order_id BIGINT,
> item_id BIGINT,
> item_price DOUBLE,
> dt STRING,"
> So when we get to the “insert overwrite” step, it probably has no relation to 
> the column “order_status” in the table "raw_orders".
>  !image-2022-11-16-18-30-24-261.png! 
> What's more, I guess it's better to append a DDL example that creates the 
> temporary table "raw_orders" in this chapter.
> Also, not all the SQL statements end with a “;”. Maybe we could make some 
> small adjustments.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-30043) Some example sqls in flink table store rescale-bucket document are incorrect.

2022-11-16 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-30043.

Fix Version/s: table-store-0.3.0
   Resolution: Fixed

master: c2df467406e7817b1f308cd4a8eb1c5baaa5f294

> Some example sqls in flink table store rescale-bucket document are incorrect.
> --
>
> Key: FLINK-30043
> URL: https://issues.apache.org/jira/browse/FLINK-30043
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Affects Versions: 1.16.0
>Reporter: Hang HOU
>Priority: Critical
>  Labels: pull-request-available
> Fix For: table-store-0.3.0
>
> Attachments: image-2022-11-16-18-30-24-261.png
>
>
> [rescale-bucket|https://nightlies.apache.org/flink/flink-table-store-docs-release-0.2/docs/development/rescale-bucket/#use-case]
> For example, in this table (in the table store catalog), its columns were 
> defined as the following 4:
> "trade_order_id BIGINT,
> item_id BIGINT,
> item_price DOUBLE,
> dt STRING,"
> So when we get to the “insert overwrite” step, it probably has no relation to 
> the column “order_status” in the table "raw_orders".
>  !image-2022-11-16-18-30-24-261.png! 
> What's more, I guess it's better to append a DDL example that creates the 
> temporary table "raw_orders" in this chapter.
> Also, not all the SQL statements end with a “;”. Maybe we could make some 
> small adjustments.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] JingsongLi merged pull request #385: [FLINK-30043] Some example sqls in flink table store rescale-bucket document are incorrect.

2022-11-16 Thread GitBox


JingsongLi merged PR #385:
URL: https://github.com/apache/flink-table-store/pull/385


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on pull request #344: [FLINK-29823] Support to get schema of table snapshot

2022-11-16 Thread GitBox


JingsongLi commented on PR #344:
URL: 
https://github.com/apache/flink-table-store/pull/344#issuecomment-1318013199

   Thanks @zjureel for the contribution.
   I want to support this requirement through another solution:
   - At present, we have supported Snapshots Table: 
https://nightlies.apache.org/flink/flink-table-store-docs-master/docs/development/query-table/#snapshots-table
 , `SELECT * FROM MyTable$snapshots;`, there is a column: `schema_id`.
   - We can support Schemas Table too, `SELECT * FROM MyTable$schemas;`.
   - Then, users can join these two tables to show the schema of one snapshot.
   
   What do you think?
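   
   For concreteness, a hedged sketch of that join from Java. The `schema_id` 
column on the snapshots table is confirmed above; the `snapshot_id` column and 
the schemas table's columns are assumptions here:
   ```java
   import org.apache.flink.table.api.EnvironmentSettings;
   import org.apache.flink.table.api.TableEnvironment;

   public class SnapshotSchemaJoin {
       public static void main(String[] args) {
           TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
           // Assumes MyTable is registered in a table store catalog.
           tEnv.executeSql(
                           "SELECT s.snapshot_id, sc.* "
                                   + "FROM `MyTable$snapshots` AS s "
                                   + "JOIN `MyTable$schemas` AS sc ON s.schema_id = sc.schema_id")
                   .print();
       }
   }
   ```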


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #383: [FLINK-30033] Add primary key type validation

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #383:
URL: https://github.com/apache/flink-table-store/pull/383#discussion_r1024717157


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaManager.java:
##
@@ -157,6 +159,33 @@ public TableSchema commitNewVersion(UpdateSchema 
updateSchema) throws Exception
 }
 }
 
+private void validatePrimaryKeysType(UpdateSchema updateSchema, 
List<String> primaryKeys) {
+if (!primaryKeys.isEmpty()) {
+Map<String, RowType.RowField> rowFields = new HashMap<>();
+for (RowType.RowField rowField : 
updateSchema.rowType().getFields()) {
+rowFields.put(StringUtils.lowerCase(rowField.getName()), 
rowField);

Review Comment:
   I created a jira for hive https://issues.apache.org/jira/browse/FLINK-29988



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-28558) HistoryServer log retrieval configuration improvement

2022-11-16 Thread leo.zhi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635101#comment-17635101
 ] 

leo.zhi commented on FLINK-28558:
-

Yes, so close. We use a Kafka log appender, and we can also write log files to 
MinIO.

But it's hard to know how to use this configuration between MinIO and the Flink 
HistoryServer.

Maybe MinIO is just a file system, not a log system.

Thanks for your patience, let me think, think, think...

> HistoryServer log retrieval configuration improvement
> -
>
> Key: FLINK-28558
> URL: https://issues.apache.org/jira/browse/FLINK-28558
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Runtime / REST
>Reporter: Xintong Song
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> HistoryServer generates log retrieval URLs based on the following 
> configuration:
> - historyserver.log.jobmanager.url-pattern
> - historyserver.log.taskmanager.url-pattern
> The usability can be improved in two ways:
> - Explicitly explain in the description that only http/https schemas are 
> supported, and add sanity checks for it.
> - If the schema is not specified, add "http://" by default.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30035) ./bin/sql-client.sh won't import external jar into the session

2022-11-16 Thread Shengkai Fang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635100#comment-17635100
 ] 

Shengkai Fang commented on FLINK-30035:
---

Hi. Could you share more of the exception stack and the steps to reproduce this?

> ./bin/sql-client.sh won't import external jar into the session
> --
>
> Key: FLINK-30035
> URL: https://issues.apache.org/jira/browse/FLINK-30035
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.16.0
>Reporter: Steven Zhen Wu
>Priority: Major
>
> I used to be able to run the sql-client with the iceberg-flink-runtime jar 
> using the `-j,--jar ` option (e.g. with 1.15.2). 
> {code}
> ./bin/sql-client.sh embedded --jar iceberg-flink-runtime-1.16-1.1.0.jar
> {code}
> With 1.16.0, this no longer works. As a result, I am seeing a 
> ClassNotFoundException.
> {code}
> java.lang.ClassNotFoundException: org.apache.iceberg.hadoop.HadoopCatalog
> {code}
> I have to put the iceberg-flink-runtime-1.16-1.1.0.jar file inside the 
> `flink/lib` directory to get the jar loaded. This seems like a regression in 
> 1.16.0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024714412


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaFieldTypeExtractor.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.RowType;
+
+import java.io.Serializable;
+import java.util.List;
+
+/** Extractor of schema for different tables. */
+public interface SchemaFieldTypeExtractor extends Serializable {

Review Comment:
   Rename this to `KeyTypeExtractor`?



##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaFieldTypeExtractor.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.RowType;
+
+import java.io.Serializable;
+import java.util.List;
+
+/** Extractor of schema for different tables. */
+public interface SchemaFieldTypeExtractor extends Serializable {

Review Comment:
   Rename this to `KeyFieldsExtractor`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024713874


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaFieldTypeExtractor.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.RowType;
+
+import java.io.Serializable;
+import java.util.List;
+
+/** Extractor of schema for different tables. */
+public interface SchemaFieldTypeExtractor extends Serializable {
+/**
+ * Extract key type from table schema.
+ *
+ * @param schema the table schema
+ * @return the key type
+ */
+RowType keyType(TableSchema schema);
+
+/**
+ * Extract key fields from table schema.
+ *
+ * @param schema the table schema
+ * @return the key fields
+ */
+List<DataField> keyFields(TableSchema schema);

Review Comment:
   We can just have a util to convert `List<DataField>` to `RowType`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024712989


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/operation/KeyValueFileStoreScan.java:
##
@@ -83,6 +96,26 @@ protected boolean filterByStats(ManifestEntry entry) {
 return keyFilter == null
 || keyFilter.test(
 entry.file().rowCount(),
-entry.file().keyStats().fields(keyStatsConverter, 
entry.file().rowCount()));
+entry.file()
+.keyStats()
+.fields(
+
getFieldStatsArraySerializer(entry.file().schemaId()),
+entry.file().rowCount()));
+}
+
+private FieldStatsArraySerializer getFieldStatsArraySerializer(long id) {
+return schemaKeyStatsConverters.computeIfAbsent(
+id,
+key -> {
+final TableSchema tableSchema = getTableSchema();

Review Comment:
   `scanSchema`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024712686


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/operation/AppendOnlyFileStoreScan.java:
##
@@ -84,6 +93,23 @@ protected boolean filterByStats(ManifestEntry entry) {
 entry.file().rowCount(),
 entry.file()
 .valueStats()
-.fields(rowStatsConverter, 
entry.file().rowCount()));
+.fields(
+
getFieldStatsArraySerializer(entry.file().schemaId()),
+entry.file().rowCount()));
+}
+
+private FieldStatsArraySerializer getFieldStatsArraySerializer(long 
schemaId) {
+return schemaRowStatsConverters.computeIfAbsent(
+schemaId,
+id -> {
+TableSchema tableSchema = getTableSchema();

Review Comment:
   `scanSchema`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024712478


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/operation/AbstractFileStoreScan.java:
##
@@ -244,6 +257,14 @@ public List files() {
 };
 }
 
+protected TableSchema getTableSchema() {

Review Comment:
   `scanTableSchema`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-28558) HistoryServer log retrieval configuration improvement

2022-11-16 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635098#comment-17635098
 ] 

Xintong Song commented on FLINK-28558:
--

I'm not sure whether there are any existing tools that collect log files 
from a terminated pod. In our production, we use a custom log4j appender to 
write the logs directly to an external log system. 

> HistoryServer log retrieval configuration improvement
> -
>
> Key: FLINK-28558
> URL: https://issues.apache.org/jira/browse/FLINK-28558
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Runtime / REST
>Reporter: Xintong Song
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> HistoryServer generates log retrieval URLs based on the following 
> configuration:
> - historyserver.log.jobmanager.url-pattern
> - historyserver.log.taskmanager.url-pattern
> The usability can be improved in two ways:
> - Explicitly explain in the description that only http/https schemas are 
> supported, and add sanity checks for it.
> - If the schema is not specified, add "http://" by default.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-29969) Show the root cause when exceeded checkpoint tolerable failure threshold

2022-11-16 Thread Yun Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Tang resolved FLINK-29969.
--
Fix Version/s: 1.17.0
 Assignee: Rui Fan
   Resolution: Fixed

merged in master: 5b8ea81f11df1d094b6331de6cb6f824e5401bcd

> Show the root cause when exceeded checkpoint tolerable failure threshold
> 
>
> Key: FLINK-29969
> URL: https://issues.apache.org/jira/browse/FLINK-29969
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.17.0
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> Add the root cause when the checkpoint tolerable failure threshold is 
> exceeded; it's helpful during troubleshooting.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] Myasuka merged pull request #21281: [FLINK-29969][checkpoint] Show the root cause when exceeded checkpoint tolerable failure threshold

2022-11-16 Thread GitBox


Myasuka merged PR #21281:
URL: https://github.com/apache/flink/pull/21281


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-28558) HistoryServer log retrieval configuration improvement

2022-11-16 Thread leo.zhi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635094#comment-17635094
 ] 

leo.zhi edited comment on FLINK-28558 at 11/17/22 3:06 AM:
---

Hi [~xtsong],

Thanks for the quick reply. :)

We use Flink on K8s instead; is there any way we can use this feature on K8s?
By the way, the fs we use is MinIO :(


was (Author: leo.zhi):
Hi [~xtsong],

Thanks for the quick reply.

We use Flink on K8s; is there any way we can use this feature on K8s?

:)

> HistoryServer log retrieval configuration improvement
> -
>
> Key: FLINK-28558
> URL: https://issues.apache.org/jira/browse/FLINK-28558
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Runtime / REST
>Reporter: Xintong Song
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> HistoryServer generates log retrieval URLs based on the following 
> configuration:
> - historyserver.log.jobmanager.url-pattern
> - historyserver.log.taskmanager.url-pattern
> The usability can be improved in two ways:
> - Explicitly explain in the description that only http/https schemas are 
> supported, and add sanity checks for it.
> - If the schema is not specified, add "http://" by default.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024709249


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaEvolutionUtil.java:
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+/** Utils for schema evolution. */
+public class SchemaEvolutionUtil {
+
+private static final int NULL_FIELD_INDEX = -1;
+
+/**
+ * Create index mapping from table fields to underlying data fields.
+ *
+ * @param tableFields the fields of table
+ * @param dataFields the fields of underlying data
+ * @return the index mapping
+ */
+public static int[] createIndexMapping(

Review Comment:
   `@Nullable`
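   
   For context, a hedged sketch of what an index mapping like this computes 
(matching by field name for simplicity; the PR's actual util may match 
differently):
   ```java
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;

   /** Sketch only: maps table fields to underlying data fields by name. */
   public final class IndexMappingSketch {
       private static final int NULL_FIELD_INDEX = -1;

       public static int[] createIndexMapping(List<String> tableFields, List<String> dataFields) {
           Map<String, Integer> dataIndex = new HashMap<>();
           for (int i = 0; i < dataFields.size(); i++) {
               dataIndex.put(dataFields.get(i), i);
           }
           int[] mapping = new int[tableFields.size()];
           for (int i = 0; i < tableFields.size(); i++) {
               // a field added after the data file was written gets NULL_FIELD_INDEX
               mapping[i] = dataIndex.getOrDefault(tableFields.get(i), NULL_FIELD_INDEX);
           }
           return mapping;
       }
   }
   ```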



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-28558) HistoryServer log retrieval configuration improvement

2022-11-16 Thread leo.zhi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635094#comment-17635094
 ] 

leo.zhi commented on FLINK-28558:
-

Hi [~xtsong] ,

Thanks for the quick reply.

We use Flink on K8s; is there any way to use this feature on K8s?

:)

> HistoryServer log retrieval configuration improvement
> -
>
> Key: FLINK-28558
> URL: https://issues.apache.org/jira/browse/FLINK-28558
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Runtime / REST
>Reporter: Xintong Song
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> HistoryServer generates log retrieval URLs based on the following 
> configuration:
> - historyserver.log.jobmanager.url-pattern
> - historyserver.log.taskmanager.url-pattern
> The usability can be improved in two ways:
> - Explicitly explain in the description that only http/https schemes are 
> supported, and add sanity checks for it.
> - If the scheme is not specified, add "http://" by default.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] zjureel commented on a diff in pull request #383: [FLINK-30033] Add primary key type validation

2022-11-16 Thread GitBox


zjureel commented on code in PR #383:
URL: https://github.com/apache/flink-table-store/pull/383#discussion_r1024702921


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaManager.java:
##
@@ -157,6 +159,33 @@ public TableSchema commitNewVersion(UpdateSchema 
updateSchema) throws Exception
 }
 }
 
+private void validatePrimaryKeysType(UpdateSchema updateSchema, 
List primaryKeys) {
+if (!primaryKeys.isEmpty()) {
+Map rowFields = new HashMap<>();
+for (RowType.RowField rowField : 
updateSchema.rowType().getFields()) {
+rowFields.put(StringUtils.lowerCase(rowField.getName()), 
rowField);

Review Comment:
   I will remove `lowerCase` here.
   
   But in fact there's a problem with the upper/lower case handling of table 
and column names.
   
   The DDL syntax differs between the table store and Spark/Hive: table and 
column names in Spark/Hive are case-insensitive, while in the table store with 
Flink the table name is case-insensitive but column names are case-sensitive.
   
   I think we should verify that table/column names in the table store are 
unique case-insensitively, to avoid confusing users. Then we can also compare 
names uniformly in lowercase when doing additional validation. What do you 
think?
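   
   For illustration, a small hedged sketch of such a case-insensitive 
uniqueness check (the helper name is hypothetical):
   
   ```java
   // Sketch only: fail fast when two column names collide once lowercased.
   private static void validateCaseInsensitiveUniqueness(List<String> columnNames) {
       Set<String> seen = new HashSet<>();
       for (String name : columnNames) {
           if (!seen.add(name.toLowerCase(Locale.ROOT))) {
               throw new IllegalArgumentException(
                       "Column name '" + name + "' collides with another column case-insensitively.");
           }
       }
   }
   ```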



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30049) CsvBulkWriter is unsupported for S3 FileSystem in streaming sink

2022-11-16 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-30049:
-
Description: 
{code:java}
Caused by: org.apache.flink.util.SerializedThrowable: Cannot sync state to 
system like S3. Use persist() to create a persistent recoverable intermediate 
point.
at 
org.apache.flink.core.fs.RefCountedBufferingFileStream.sync(RefCountedBufferingFileStream.java:111)
 
at 
org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.sync
at 
org.apache.flink.formats.csv.CsvBulkWriter.finish(CsvBulkWriter.java:106) 
at 
org.apache.flink.connector.file.table.FileSystemTableSink$ProjectionBulkFactory$1.finish(FileSystemTableSink.java:653)
 
at 
org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:64)
 
{code}

It looks like we should not call `sync` in CsvBulkWriter, we should just use 
`flush`.


  was:
{code:java}
Caused by: org.apache.flink.util.SerializedThrowable: Cannot sync state to 
system like S3. Use persist() to create a persistent recoverable intermediate 
point.
at 
org.apache.flink.core.fs.RefCountedBufferingFileStream.sync(RefCountedBufferingFileStream.java:111)
 
at 
org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.sync
at 
org.apache.flink.formats.csv.CsvBulkWriter.finish(CsvBulkWriter.java:106) 
at 
org.apache.flink.connector.file.table.FileSystemTableSink$ProjectionBulkFactory$1.finish(FileSystemTableSink.java:653)
 
at 
org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:64)
 
{code}



> CsvBulkWriter is unsupported for S3 FileSystem in streaming sink
> 
>
> Key: FLINK-30049
> URL: https://issues.apache.org/jira/browse/FLINK-30049
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.16.0, 1.15.2
>Reporter: Jingsong Lee
>Priority: Major
>
> {code:java}
> Caused by: org.apache.flink.util.SerializedThrowable: Cannot sync state to 
> system like S3. Use persist() to create a persistent recoverable intermediate 
> point.
>   at 
> org.apache.flink.core.fs.RefCountedBufferingFileStream.sync(RefCountedBufferingFileStream.java:111)
>  
>   at 
> org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.sync
>   at 
> org.apache.flink.formats.csv.CsvBulkWriter.finish(CsvBulkWriter.java:106) 
>   at 
> org.apache.flink.connector.file.table.FileSystemTableSink$ProjectionBulkFactory$1.finish(FileSystemTableSink.java:653)
>  
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:64)
>  
> {code}
> It looks like we should not call `sync` in CsvBulkWriter, we should just use 
> `flush`.
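
A minimal hedged sketch of that direction (assuming {{generator}} is the 
Jackson CSV generator and {{stream}} the Flink output stream; the actual fix 
may differ):
{code:java}
// Sketch only: flush instead of sync(), because
// S3RecoverableFsDataOutputStream supports flush()/persist() but not sync().
@Override
public void finish() throws IOException {
    generator.flush(); // push Jackson's buffered bytes into the Flink stream
    stream.flush();    // safe on S3 recoverable streams; sync() throws above
}
{code}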



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30049) CsvBulkWriter is unsupported for S3 FileSystem

2022-11-16 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30049:


 Summary: CsvBulkWriter is unsupported for S3 FileSystem
 Key: FLINK-30049
 URL: https://issues.apache.org/jira/browse/FLINK-30049
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.15.2, 1.16.0
Reporter: Jingsong Lee


{code:java}
Caused by: org.apache.flink.util.SerializedThrowable: Cannot sync state to 
system like S3. Use persist() to create a persistent recoverable intermediate 
point.
at 
org.apache.flink.core.fs.RefCountedBufferingFileStream.sync(RefCountedBufferingFileStream.java:111)
 
at 
org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.sync
at 
org.apache.flink.formats.csv.CsvBulkWriter.finish(CsvBulkWriter.java:106) 
at 
org.apache.flink.connector.file.table.FileSystemTableSink$ProjectionBulkFactory$1.finish(FileSystemTableSink.java:653)
 
at 
org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:64)
 
{code}




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30049) CsvBulkWriter is unsupported for S3 FileSystem in streaming sink

2022-11-16 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-30049:
-
Summary: CsvBulkWriter is unsupported for S3 FileSystem in streaming sink  
(was: CsvBulkWriter is unsupported for S3 FileSystem)

> CsvBulkWriter is unsupported for S3 FileSystem in streaming sink
> 
>
> Key: FLINK-30049
> URL: https://issues.apache.org/jira/browse/FLINK-30049
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.16.0, 1.15.2
>Reporter: Jingsong Lee
>Priority: Major
>
> {code:java}
> Caused by: org.apache.flink.util.SerializedThrowable: Cannot sync state to 
> system like S3. Use persist() to create a persistent recoverable intermediate 
> point.
>   at 
> org.apache.flink.core.fs.RefCountedBufferingFileStream.sync(RefCountedBufferingFileStream.java:111)
>  
>   at 
> org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.sync
>   at 
> org.apache.flink.formats.csv.CsvBulkWriter.finish(CsvBulkWriter.java:106) 
>   at 
> org.apache.flink.connector.file.table.FileSystemTableSink$ProjectionBulkFactory$1.finish(FileSystemTableSink.java:653)
>  
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:64)
>  
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30036) Force delete pod when k8s node is not ready

2022-11-16 Thread Peng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635088#comment-17635088
 ] 

Peng Yuan commented on FLINK-30036:
---

Just as [~wangyang0918] said, there are a number of conditions that can cause a 
pod to remain in a terminating state, so we need to add a config option to 
determine whether the pod should be force-deleted or not.

Regarding the concern that a not-ready node does not always mean the pod will 
block in the terminating status: I lean toward thinking that a not-ready node 
will certainly cause the pod to block in the terminating status. The node 
object of the k8s API describes the *Ready* condition of *NodeCondition* as 
{^}[1]{^}:
{quote}{{True}} if the node is healthy and ready to accept pods, {{False}} if 
the node is not healthy and is not accepting pods, and {{Unknown}} if the node 
controller has not heard from the node in the last 
{{node-monitor-grace-period}} (default is 40 seconds)
{quote}
When the node is unreachable, the pod behaves as follows {^}[2]{^}: 
{quote}A Pod is not deleted automatically when a node is unreachable. The Pods 
running on an unreachable Node enter the 'Terminating' or 'Unknown' state after 
a [timeout|https://kubernetes.io/docs/concepts/architecture/nodes/#condition]. 
Pods may also enter these states when the user attempts graceful deletion of a 
Pod on an unreachable Node. The only ways in which a Pod in such a state can be 
removed from the apiserver are as follows:
 * The Node object is deleted (either by you, or by the [Node 
Controller|https://kubernetes.io/docs/concepts/architecture/nodes/#node-controller]).
 * The kubelet on the unresponsive Node starts responding, kills the Pod and 
removes the entry from the apiserver.
 * Force deletion of the Pod by the user.

The recommended best practice is to use the first or second approach. If a Node 
is confirmed to be dead (e.g. permanently disconnected from the network, 
powered down, etc), then delete the Node object. If the Node is suffering from 
a network partition, then try to resolve this or wait for it to resolve. When 
the partition heals, the kubelet will complete the deletion of the Pod and free 
up its name in the apiserver.
{quote}
For Flink itself, in the K8s environment, when the taskmanager connection times 
out, the resourcemanager will try to delete the TM pod. If the cause of the 
timeout is detected, the resourcemanager can forcibly delete the pod to recover 
the task quickly, because fast recovery is very important to Flink.

{*}[1]{*}. Node conditions: [Nodes | 
Kubernetes|https://kubernetes.io/docs/concepts/architecture/nodes/#condition]

{*}[2]{*}. Delete pods: [Force Delete StatefulSet Pods | 
Kubernetes|https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/#delete-pods]
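
For illustration, a hedged sketch of what a force deletion could look like 
with the fabric8 client (variable names are hypothetical; whether and when to 
trigger this is exactly what the proposed config option would control):
{code:java}
// Sketch only: force-delete a TaskManager pod stuck in Terminating.
// A grace period of 0 sends SIGKILL immediately, so the TM gets no
// chance to clean up - hence this should be opt-in.
kubernetesClient
        .pods()
        .inNamespace(namespace)
        .withName(taskManagerPodName)
        .withGracePeriod(0)
        .delete();
{code}
This mirrors {{kubectl delete pod <name> --grace-period=0 --force}}.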

> Force delete pod when  k8s node is not ready
> 
>
> Key: FLINK-30036
> URL: https://issues.apache.org/jira/browse/FLINK-30036
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Peng Yuan
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-11-17-10-25-59-945.png
>
>
> When the K8s node is in the NotReady state, the taskmanager pod scheduled on 
> it is always in the terminating state. When the flink cluster has a strict 
> quota, the terminating pod will hold the resources all the time. As a result, 
> the new taskmanager pod cannot apply for resources and cannot be started.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30048) MapState.remove(UK key) is better to return the old value.

2022-11-16 Thread xljtswf (Jira)
xljtswf created FLINK-30048:
---

 Summary: MapState.remove(UK key) is better to return the old value.
 Key: FLINK-30048
 URL: https://issues.apache.org/jira/browse/FLINK-30048
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Reporter: xljtswf


Hi all:

I found that MapState.remove(UK key) returns nothing, while the Java Map 
interface returns the old value associated with the key.

Consider the following example from the Learn Flink "Event-driven 
Applications" page 
[link|https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/learn-flink/event_driven/#the-ontimer-method]:

We want to get the value for a timestamp and then delete that timestamp. With 
the current implementation, we must first call mapState.get(key) and then call 
mapState.remove(key), which searches for the key twice; I think this is 
unnecessary. If mapState.remove(key) returned the old value, we could just 
call mapState.remove(key), get the old value, and save the unnecessary second 
lookup in the map.
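
To make the proposal concrete, a hedged sketch (the value-returning signature 
is hypothetical, mirroring java.util.Map#remove):
{code:java}
// Today: two lookups against the state backend.
Long value = mapState.get(key); // first lookup
mapState.remove(key);           // second lookup of the same key

// Proposed (hypothetical signature): a single lookup, as in java.util.Map.
Long removed = mapState.remove(key); // would return the old value, or null
{code}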



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-28558) HistoryServer log retrieval configuration improvement

2022-11-16 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635085#comment-17635085
 ] 

Xintong Song commented on FLINK-28558:
--

[~leo.zhi]
Here's the documentation for this feature:
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/advanced/historyserver/#log-integration

Log archiving and browsing depend heavily on your deployment environment. 
E.g., Yarn has a log aggregation feature that collects logs and stores them on 
HDFS. Flink does not provide such services.
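
To make this concrete, a hedged example of the two options (the log-browsing 
host is hypothetical; {{<jobid>}} and {{<tmid>}} are the placeholders described 
in the linked documentation):
{code}
historyserver.log.jobmanager.url-pattern: http://my-log-service.example.com/<jobid>
historyserver.log.taskmanager.url-pattern: http://my-log-service.example.com/<jobid>/<tmid>
{code}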

> HistoryServer log retrieval configuration improvement
> -
>
> Key: FLINK-28558
> URL: https://issues.apache.org/jira/browse/FLINK-28558
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Runtime / REST
>Reporter: Xintong Song
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> HistoryServer generates log retrieval URLs based on the following 
> configuration:
> - historyserver.log.jobmanager.url-pattern
> - historyserver.log.taskmanager.url-pattern
> The usability can be improved in two ways:
> - Explicitly explain in the description that only http/https schemes are 
> supported, and add sanity checks for it.
> - If the scheme is not specified, add "http://" by default.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30047) getLastSavepointStatus should return null when there is never savepoint completed or pending

2022-11-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30047:
---
Labels: pull-request-available  (was: )

> getLastSavepointStatus should return null when there is never savepoint 
> completed or pending
> 
>
> Key: FLINK-30047
> URL: https://issues.apache.org/jira/browse/FLINK-30047
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Clara Xiong
>Priority: Major
>  Labels: pull-request-available
>
> Currently SUCCEEDED is returned in this case, but null should be returned 
> instead to distinguish this from a real success.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] clarax opened a new pull request, #443: [FLINK-30047] getLastSavepointStatus should return null when there is…

2022-11-16 Thread GitBox


clarax opened a new pull request, #443:
URL: https://github.com/apache/flink-kubernetes-operator/pull/443

   … never savepoint completed or pending
   
   ## What is the purpose of the change
   
   This is for FLINK-30047: getLastSavepointStatus should return null when no 
savepoint has ever completed and none is currently pending, to indicate an 
initial state.
   
   
   ## Brief change log
   
 - getLastSavepointStatus returns null instead of SUCCEEDED when no savepoint 
has ever completed and none is currently pending. 
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
 - Test cases updated to reflect the change

   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): No
 - The public API, i.e., is any changes to the `CustomResourceDescriptors`: 
No
 - Core observer or reconciler logic that is regularly executed: No
   
   ## Documentation
   
 - Does this pull request introduce a new feature? No
 - If yes, how is the feature documented? not applicable
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30036) Force delete pod when k8s node is not ready

2022-11-16 Thread Peng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peng Yuan updated FLINK-30036:
--
Attachment: image-2022-11-17-10-25-59-945.png

> Force delete pod when  k8s node is not ready
> 
>
> Key: FLINK-30036
> URL: https://issues.apache.org/jira/browse/FLINK-30036
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Peng Yuan
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-11-17-10-25-59-945.png
>
>
> When the K8s node is in the NotReady state, the taskmanager pod scheduled on 
> it is always in the terminating state. When the flink cluster has a strict 
> quota, the terminating pod will hold the resources all the time. As a result, 
> the new taskmanager pod cannot apply for resources and cannot be started.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30032) IOException during MAX_WATERMARK emission causes message missing

2022-11-16 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635081#comment-17635081
 ] 

luoyuxia commented on FLINK-30032:
--

Just curious why this fix solves the problem. Could you please explain a 
little bit more?

Another question: when you catch the IOException at line 67 and do the 
logging, will you throw the IOException again?

> IOException during MAX_WATERMARK emission causes message missing
> 
>
> Key: FLINK-30032
> URL: https://issues.apache.org/jira/browse/FLINK-30032
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Haoze Wu
>Priority: Major
>
> We are doing testing on Flink (version 1.14.0). We launch one 
> StandaloneSessionClusterEntrypoint and two TaskManagerRunners. Then we run a 
> Flink client which submits a WordCount workload. The code is similar to 
> [https://github.com/apache/flink/blob/release-1.14.0/flink-examples/flink-examples-batch/src/main/java/org/apache/flink/examples/java/wordcount/WordCount.java],
>  and we only add a Kafka topic output:
> {code:java}
>     private static DataStreamSink addKafkaSink(
>             final DataStream events, final String brokers, final 
> String topic) {
>         return events.sinkTo(KafkaSink.builder()
>                 .setBootstrapServers(brokers)
>                 .setRecordSerializer(
>                         KafkaRecordSerializationSchema.builder()
>                                 .setValueSerializationSchema(new 
> SimpleStringSchema())
>                                 .setTopic(topic)
>                                 .build())
>                 .build());
>     }
>     public static void run(final String[] args) throws Exception {
>         final String brokers = args[0];
>         final String textFilePath = args[1];
>         final String kafkaTopic = args[2];
>         final StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
>         env.setRuntimeMode(RuntimeExecutionMode.BATCH);
>         final DataStream text = env.readTextFile(textFilePath);
>         final DataStream> counts =
>                 text.flatMap(new Tokenizer()).keyBy(value -> value.f0).sum(1);
>         addKafkaSink(counts.map(String::valueOf), brokers, kafkaTopic);
>         final long nano = System.nanoTime();
>         env.execute("WordCount");
>         FlinkGrayClientMain.reply("success", nano);
>     }
>  {code}
> We found that sometimes the Kafka topic fails to receive a few messages. We 
> reproduced the symptom multiple times: the Kafka topic always gets 160~169 
> messages while the expected number of messages is 170, and the missing 
> messages are always the last few of the 170 expected messages.
> Then we inspect the logs and code.
> First, we have an IOException to one of the TaskManagerRunner:
> {code:java}
> 2021-11-02T17:43:41,070 WARN  source.ContinuousFileReaderOperator 
> (ContinuousFileReaderOperator.java:finish(461)) - unable to emit watermark 
> while closing
> org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: 
> Could not forward element to next operator
>         at 
> org.apache.flink.streaming.runtime.tasks.ChainingOutput.emitWatermark(ChainingOutput.java:114)
>  ~[flink-dist_2.11-1.14.0.jar:1.14.0]
>         at 
> org.apache.flink.streaming.api.operators.CountingOutput.emitWatermark(CountingOutput.java:40)
>  ~[flink-dist_2.11-1.14.0.jar:1.14.0]
>         at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$ManualWatermarkContext.processAndEmitWatermark(StreamSourceContexts.java:428)
>  ~[flink-dist_2.11-1.14.0.jar:1.14.0]
>         at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$WatermarkContext.emitWatermark(StreamSourceContexts.java:544)
>  ~[flink-dist_2.11-1.14.0.jar:1.14.0]
>         at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$SwitchingOnClose.emitWatermark(StreamSourceContexts.java:113)
>  ~[flink-dist_2.11-1.14.0.jar:1.14.0]
>         at 
> org.apache.flink.streaming.api.functions.source.ContinuousFileReaderOperator.finish(ContinuousFileReaderOperator.java:459)
>  ~[flink-dist_2.11-1.14.0.jar:1.14.0]
>         at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
>  ~[flink-dist_2.11-1.14.0.jar:1.14.0]
>         at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finishOperator(StreamOperatorWrapper.java:211)
>  ~[flink-dist_2.11-1.14.0.jar:1.14.0]
>         at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.lambda$deferFinishOperatorToMailbox$3(StreamOperatorWrapper.java:185)
>  ~[flink-dist_2.11-1.14.0.jar:1.14.0]
>         at 
> 

[jira] [Closed] (FLINK-30027) Fields min and max in BinaryTableStats support lazy deserialization

2022-11-16 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-30027.

Fix Version/s: table-store-0.3.0
 Assignee: Shammon
   Resolution: Fixed

master: 4d6bc72aeddea0c1bf551a2542ecc990220f8edd

> Fields min and max in BinaryTableStats support lazy deserialization
> ---
>
> Key: FLINK-30027
> URL: https://issues.apache.org/jira/browse/FLINK-30027
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Affects Versions: table-store-0.3.0
>Reporter: Shammon
>Assignee: Shammon
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.3.0
>
>
> Predicate get min and max from BinaryRowData, lazily deserialization



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] JingsongLi merged pull request #382: [FLINK-30027] Fields min and max in BinaryTableStats support lazy deserialization

2022-11-16 Thread GitBox


JingsongLi merged PR #382:
URL: https://github.com/apache/flink-table-store/pull/382


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #383: [FLINK-30033] Add primary key type validation

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #383:
URL: https://github.com/apache/flink-table-store/pull/383#discussion_r1024684798


##
docs/content/docs/development/create-table.md:
##
@@ -296,6 +296,24 @@ Use approach one if you have a large number of filtered 
queries
 with only `user_id`, and use approach two if you have a large
 number of filtered queries with only `catalog_id`.
 
+The primary key data types currently supported by the table store

Review Comment:
   Maybe we don't need to document this.



##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaManager.java:
##
@@ -157,6 +159,33 @@ public TableSchema commitNewVersion(UpdateSchema 
updateSchema) throws Exception
 }
 }
 
+private void validatePrimaryKeysType(UpdateSchema updateSchema, 
List primaryKeys) {
+if (!primaryKeys.isEmpty()) {
+Map rowFields = new HashMap<>();
+for (RowType.RowField rowField : 
updateSchema.rowType().getFields()) {
+rowFields.put(StringUtils.lowerCase(rowField.getName()), 
rowField);

Review Comment:
   Why use `lowerCase`?



##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaManager.java:
##
@@ -157,6 +159,33 @@ public TableSchema commitNewVersion(UpdateSchema 
updateSchema) throws Exception
 }
 }
 
+private void validatePrimaryKeysType(UpdateSchema updateSchema, 
List primaryKeys) {
+if (!primaryKeys.isEmpty()) {
+Map rowFields = new HashMap<>();
+for (RowType.RowField rowField : 
updateSchema.rowType().getFields()) {
+rowFields.put(StringUtils.lowerCase(rowField.getName()), 
rowField);
+}
+for (String primaryKeyName : primaryKeys) {
+RowType.RowField rowField = 
rowFields.get(StringUtils.lowerCase(primaryKeyName));
+LogicalType logicalType = rowField.getType();
+if (TableSchema.PRIMARY_KEY_UNSUPPORTED_LOGICAL_TYPES.stream()
+.anyMatch(c -> c.isInstance(logicalType))) {
+throw new UnsupportedOperationException(
+String.format(
+"Don't support to create primary key in 
[%s], the type of column[%s] is [%s]",

Review Comment:
   `The type %s in primary key field %s is unsupported`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-ml] yunfengzhou-hub commented on a diff in pull request #174: [FLINK-29604] Add Estimator and Transformer for CountVectorizer

2022-11-16 Thread GitBox


yunfengzhou-hub commented on code in PR #174:
URL: https://github.com/apache/flink-ml/pull/174#discussion_r1024683357


##
flink-ml-lib/src/main/java/org/apache/flink/ml/feature/countvectorizer/CountVectorizer.java:
##
@@ -0,0 +1,219 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.feature.countvectorizer;
+
+import org.apache.flink.api.common.functions.AggregateFunction;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.ml.api.Estimator;
+import org.apache.flink.ml.common.datastream.DataStreamUtils;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.table.api.internal.TableImpl;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Preconditions;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * {@link CountVectorizer} aims to help convert a collection of text documents 
to vectors of token
+ * counts. When an a-priori dictionary is not available, {@link 
CountVectorizer} can be used as an
+ * estimator to extract the vocabulary, and generates a {@link 
CountVectorizerModel}. The model
+ * produces sparse representations for the documents over the vocabulary, 
which can then be passed
+ * to other algorithms like LDA.
+ */
+public class CountVectorizer
+implements Estimator,
+CountVectorizerParams {
+private final Map, Object> paramMap = new HashMap<>();
+
+public CountVectorizer() {
+ParamUtils.initializeMapWithDefaultValues(paramMap, this);
+}
+
+@Override
+public CountVectorizerModel fit(Table... inputs) {
+Preconditions.checkArgument(inputs.length == 1);
+double minDF = getMinDF();
+double maxDF = getMaxDF();
+if (minDF >= 1.0 && maxDF >= 1.0 || minDF < 1.0 && maxDF < 1.0) {
+Preconditions.checkArgument(maxDF >= minDF, "maxDF must be >= 
minDF.");
+}
+
+String inputCol = getInputCol();
+StreamTableEnvironment tEnv =
+(StreamTableEnvironment) ((TableImpl) 
inputs[0]).getTableEnvironment();
+DataStream inputData =
+tEnv.toDataStream(inputs[0])
+.map(
+(MapFunction)
+value -> ((String[]) 
value.getField(inputCol)));
+
+DataStream modelData =
+DataStreamUtils.aggregate(
+inputData,
+new VocabularyAggregator(getMinDF(), getMaxDF(), 
getVocabularySize()));
+
+CountVectorizerModel model =
+new 
CountVectorizerModel().setModelData(tEnv.fromDataStream(modelData));
+ReadWriteUtils.updateExistingParams(model, getParamMap());
+return model;
+}
+
+/**
+ * Extracts a vocabulary from input document collections and builds the 
{@link
+ * CountVectorizerModelData}.
+ */
+private static class VocabularyAggregator
+implements AggregateFunction<
+String[],
+Tuple2>>,
+CountVectorizerModelData> {
+private final double minDF;
+private final double maxDF;
+private final int vocabularySize;
+
+public VocabularyAggregator(double minDF, double maxDF, int 
vocabularySize) {
+this.minDF = minDF;
+this.maxDF = maxDF;
+this.vocabularySize = vocabularySize;
+}
+
+@Override
+public Tuple2>> 
createAccumulator() {
+return Tuple2.of(0L, new HashMap<>());
+}
+
+@Override
+public Tuple2>> add(
+String[] terms, Tuple2>> 
vocabAccumulator) {
+Map wc = new HashMap<>();
+

[GitHub] [flink-ml] yunfengzhou-hub commented on a diff in pull request #174: [FLINK-29604] Add Estimator and Transformer for CountVectorizer

2022-11-16 Thread GitBox


yunfengzhou-hub commented on code in PR #174:
URL: https://github.com/apache/flink-ml/pull/174#discussion_r1024683072


##
flink-ml-lib/src/main/java/org/apache/flink/ml/feature/countvectorizer/CountVectorizerParams.java:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.feature.countvectorizer;
+
+import org.apache.flink.ml.param.DoubleParam;
+import org.apache.flink.ml.param.IntParam;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.param.ParamValidators;
+
+/**
+ * Params of {@link CountVectorizer}.
+ *
+ * @param  The class type of this instance.
+ */
+public interface CountVectorizerParams extends 
CountVectorizerModelParams {
+Param VOCABULARY_SIZE =
+new IntParam(
+"vocabularySize",
+"Max size of the vocabulary. CountVectorizer will build a 
vocabulary"

Review Comment:
   I agree that the descriptions here are different from those of the binary 
parameter. We can keep this description.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] yangjf2019 commented on pull request #442: [hotfix][doc][test] correct the incorrect flink release version

2022-11-16 Thread GitBox


yangjf2019 commented on PR #442:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/442#issuecomment-1317952567

   Thanks for your review, and I resolved the conflict.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-ml] yunfengzhou-hub commented on a diff in pull request #172: [FLINK-29592] Add Estimator and Transformer for RobustScaler

2022-11-16 Thread GitBox


yunfengzhou-hub commented on code in PR #172:
URL: https://github.com/apache/flink-ml/pull/172#discussion_r1024682602


##
flink-ml-lib/src/main/java/org/apache/flink/ml/feature/robustscaler/RobustScalerModel.java:
##
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.feature.robustscaler;
+
+import org.apache.flink.api.common.functions.RichMapFunction;
+import org.apache.flink.api.java.typeutils.RowTypeInfo;
+import org.apache.flink.ml.api.Model;
+import org.apache.flink.ml.common.broadcast.BroadcastUtils;
+import org.apache.flink.ml.common.datastream.TableUtils;
+import org.apache.flink.ml.linalg.BLAS;
+import org.apache.flink.ml.linalg.DenseVector;
+import org.apache.flink.ml.linalg.Vector;
+import org.apache.flink.ml.linalg.typeinfo.VectorTypeInfo;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.util.ParamUtils;
+import org.apache.flink.ml.util.ReadWriteUtils;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.table.api.internal.TableImpl;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.commons.lang3.ArrayUtils;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+
+/** A Model which transforms data using the model data computed by {@link 
RobustScaler}. */
+public class RobustScalerModel
+implements Model, 
RobustScalerModelParams {
+
+private final Map, Object> paramMap = new HashMap<>();
+private Table modelDataTable;
+
+public RobustScalerModel() {
+ParamUtils.initializeMapWithDefaultValues(paramMap, this);
+}
+
+@Override
+@SuppressWarnings("unchecked")
+public Table[] transform(Table... inputs) {
+Preconditions.checkArgument(inputs.length == 1);
+StreamTableEnvironment tEnv =
+(StreamTableEnvironment) ((TableImpl) 
inputs[0]).getTableEnvironment();
+DataStream inputStream = tEnv.toDataStream(inputs[0]);
+
+RowTypeInfo inputTypeInfo = 
TableUtils.getRowTypeInfo(inputs[0].getResolvedSchema());
+RowTypeInfo outputTypeInfo =
+new RowTypeInfo(
+ArrayUtils.addAll(inputTypeInfo.getFieldTypes(), 
VectorTypeInfo.INSTANCE),
+ArrayUtils.addAll(inputTypeInfo.getFieldNames(), 
getOutputCol()));
+final String broadcastModelKey = "broadcastModelKey";
+DataStream modelDataStream =
+RobustScalerModelData.getModelDataStream(modelDataTable);
+
+DataStream output =
+BroadcastUtils.withBroadcastStream(
+Collections.singletonList(inputStream),
+Collections.singletonMap(broadcastModelKey, 
modelDataStream),
+inputList -> {
+DataStream inputData = inputList.get(0);
+return inputData.map(
+new PredictOutputFunction(
+broadcastModelKey,
+getInputCol(),
+getWithCentering(),
+getWithScaling()),
+outputTypeInfo);
+});
+
+return new Table[] {tEnv.fromDataStream(output)};
+}
+
+/** This operator loads model data and predicts result. */
+private static class PredictOutputFunction extends RichMapFunction {
+private final String broadcastModelKey;
+private final String inputCol;
+private final boolean withCentering;
+private final boolean withScaling;
+
+private DenseVector medians;
+private DenseVector scales;
+
+public PredictOutputFunction(
+String broadcastModelKey,
+String inputCol,
+boolean withCentering,
+boolean withScaling) {
+

[jira] [Commented] (FLINK-28558) HistoryServer log retrieval configuration improvement

2022-11-16 Thread leo.zhi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635067#comment-17635067
 ] 

leo.zhi commented on FLINK-28558:
-

Hi [~xtsong] ,

May I know how to use the two configurations below? Is there a sample I can 
try?
It would be better to show which log archiving and browsing services I can use.
Many thanks.
 - historyserver.log.jobmanager.url-pattern
 - historyserver.log.taskmanager.url-pattern

> HistoryServer log retrieval configuration improvement
> -
>
> Key: FLINK-28558
> URL: https://issues.apache.org/jira/browse/FLINK-28558
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Runtime / REST
>Reporter: Xintong Song
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> HistoryServer generates log retrieval URLs based on the following 
> configuration:
> - historyserver.log.jobmanager.url-pattern
> - historyserver.log.taskmanager.url-pattern
> The usability can be improved in two ways:
> - Explicitly explain in the description that only http/https schemes are 
> supported, and add sanity checks for it.
> - If the scheme is not specified, add "http://" by default.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30036) Force delete pod when k8s node is not ready

2022-11-16 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635064#comment-17635064
 ] 

Yang Wang commented on FLINK-30036:
---

After more investigation, it seems that terminating pods are counted toward 
the used quota. So I think this ticket is a valid issue. We may need a config 
option to enable force deletion when the pod might block at terminating (e.g. 
node not ready).

I have one more concern: a not-ready node does not always mean the pod will 
block in the terminating status. A force delete will send a SIGKILL to the pod 
and the TM will not have a chance to clean up.

> Force delete pod when  k8s node is not ready
> 
>
> Key: FLINK-30036
> URL: https://issues.apache.org/jira/browse/FLINK-30036
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Peng Yuan
>Priority: Major
>  Labels: pull-request-available
>
> When the K8s node is in the NotReady state, the taskmanager pod scheduled on 
> it is always in the terminating state. When the flink cluster has a strict 
> quota, the terminating pod will hold the resources all the time. As a result, 
> the new taskmanager pod cannot apply for resources and cannot be started.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] 1996fanrui commented on a diff in pull request #21281: [FLINK-29969][checkpoint] Show the root cause when exceeded checkpoint tolerable failure threshold

2022-11-16 Thread GitBox


1996fanrui commented on code in PR #21281:
URL: https://github.com/apache/flink/pull/21281#discussion_r1024664659


##
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointFailureManager.java:
##
@@ -204,7 +204,8 @@ private void checkFailureAgainstCounter(
 if (continuousFailureCounter.get() > tolerableCpFailureNumber) {
 clearCount();
 errorHandler.accept(
-new 
FlinkRuntimeException(EXCEEDED_CHECKPOINT_TOLERABLE_FAILURE_MESSAGE));
+new FlinkRuntimeException(
+EXCEEDED_CHECKPOINT_TOLERABLE_FAILURE_MESSAGE, 
exception));

Review Comment:
   Updated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-connector-aws] hlteoh37 commented on a diff in pull request #15: [FLINK-29900][Connectors/DynamoDB] Implement Table API for DynamoDB sink

2022-11-16 Thread GitBox


hlteoh37 commented on code in PR #15:
URL: 
https://github.com/apache/flink-connector-aws/pull/15#discussion_r1024629827


##
flink-connector-dynamodb/src/main/java/org/apache/flink/connector/dynamodb/table/RowDataElementConverter.java:
##
@@ -0,0 +1,50 @@
+package org.apache.flink.connector.dynamodb.table;
+
+import org.apache.flink.api.connector.sink2.SinkWriter;
+import org.apache.flink.connector.base.sink.writer.ElementConverter;
+import org.apache.flink.connector.dynamodb.sink.DynamoDbWriteRequest;
+import org.apache.flink.connector.dynamodb.sink.DynamoDbWriteRequestType;
+import org.apache.flink.table.api.TableException;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.types.DataType;
+
+/** Implementation of an {@link ElementConverter} specific for DynamoDb sink. 
*/

Review Comment:
   done



##
flink-connector-dynamodb/src/main/java/org/apache/flink/connector/dynamodb/table/DynamoDbDynamicSink.java:
##
@@ -0,0 +1,198 @@
+package org.apache.flink.connector.dynamodb.table;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.connector.base.table.sink.AsyncDynamicTableSink;
+import org.apache.flink.connector.base.table.sink.AsyncDynamicTableSinkBuilder;
+import org.apache.flink.connector.dynamodb.sink.DynamoDbSink;
+import org.apache.flink.connector.dynamodb.sink.DynamoDbSinkBuilder;
+import org.apache.flink.connector.dynamodb.sink.DynamoDbWriteRequest;
+import org.apache.flink.table.connector.ChangelogMode;
+import org.apache.flink.table.connector.sink.DynamicTableSink;
+import org.apache.flink.table.connector.sink.SinkV2Provider;
+import org.apache.flink.table.connector.sink.abilities.SupportsPartitioning;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.types.DataType;
+
+import javax.annotation.Nullable;
+
+import java.util.ArrayList;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Properties;
+import java.util.Set;
+
+/**
+ * A {@link DynamicTableSink} that describes how to create a {@link 
DynamoDbSink} from a logical
+ * description.
+ */
+@Internal
+public class DynamoDbDynamicSink extends 
AsyncDynamicTableSink
+implements SupportsPartitioning {
+
+private final String destinationTableName;
+private final boolean failOnError;
+private final Properties dynamoDbClientProperties;
+private final DataType physicalDataType;
+private final Set overwriteByPartitionKeys;
+
+protected DynamoDbDynamicSink(
+@Nullable Integer maxBatchSize,
+@Nullable Integer maxInFlightRequests,
+@Nullable Integer maxBufferedRequests,
+@Nullable Long maxBufferSizeInBytes,
+@Nullable Long maxTimeInBufferMS,
+String destinationTableName,
+boolean failOnError,
+Properties dynamoDbClientProperties,
+DataType physicalDataType,
+Set overwriteByPartitionKeys) {
+super(
+maxBatchSize,
+maxInFlightRequests,
+maxBufferedRequests,
+maxBufferSizeInBytes,
+maxTimeInBufferMS);
+this.destinationTableName = destinationTableName;
+this.failOnError = failOnError;
+this.dynamoDbClientProperties = dynamoDbClientProperties;
+this.physicalDataType = physicalDataType;
+this.overwriteByPartitionKeys = overwriteByPartitionKeys;
+}
+
+@Override
+public ChangelogMode getChangelogMode(ChangelogMode changelogMode) {
+// TODO: We can support consuming from CDC streams here
+return ChangelogMode.insertOnly();
+}
+
+@Override
+public SinkRuntimeProvider getSinkRuntimeProvider(Context context) {
+DynamoDbSinkBuilder builder =
+DynamoDbSink.builder()
+.setDestinationTableName(destinationTableName)
+.setFailOnError(failOnError)
+.setOverwriteByPartitionKeys(new 
ArrayList<>(overwriteByPartitionKeys))
+.setDynamoDbProperties(dynamoDbClientProperties)
+.setElementConverter(new 
RowDataElementConverter(physicalDataType));
+
+addAsyncOptionsToSinkBuilder(builder);
+
+// TODO: check if parallelism needed here
+return SinkV2Provider.of(builder.build());
+}
+
+@Override
+public DynamicTableSink copy() {
+return new DynamoDbDynamicSink(
+maxBatchSize,
+maxInFlightRequests,
+maxBufferedRequests,
+maxBufferSizeInBytes,
+maxTimeInBufferMS,
+destinationTableName,
+failOnError,
+dynamoDbClientProperties,
+physicalDataType,
+overwriteByPartitionKeys);
+}
+
+@Override
+public String asSummaryString() {
+// TODO: check when this is called/returned
+return 

[GitHub] [flink-connector-aws] hlteoh37 commented on a diff in pull request #15: [FLINK-29900][Connectors/DynamoDB] Implement Table API for DynamoDB sink

2022-11-16 Thread GitBox


hlteoh37 commented on code in PR #15:
URL: 
https://github.com/apache/flink-connector-aws/pull/15#discussion_r1024609732


##
flink-connector-dynamodb/pom.xml:
##
@@ -77,6 +77,22 @@ under the License.
dynamodb-enhanced

 
+   
+   
+   org.apache.flink
+   flink-table-api-java-bridge
+   ${flink.version}
+   provided
+   true
+   
+   
+org.apache.flink
+flink-table-runtime
+${flink.version}
+provided
+true
+

Review Comment:
   Addressed. Good spot



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] mattfysh commented on pull request #21254: Update docker.md

2022-11-16 Thread GitBox


mattfysh commented on PR #21254:
URL: https://github.com/apache/flink/pull/21254#issuecomment-1317802478

   > Flink also works without this addition, so why is it needed?
   
   Did you test on v1.16?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] mattfysh commented on pull request #21254: Update docker.md

2022-11-16 Thread GitBox


mattfysh commented on PR #21254:
URL: https://github.com/apache/flink/pull/21254#issuecomment-1317799085

   Hi @MartijnVisser - my apologies, I am short on time and hoped someone with 
PyFlink familiarity would pick this up and immediately identify why this change 
is required.
   
   The current instructions do not work in Docker, which means they won't work 
on anyone's machine regardless of host setup
   
   To reproduce, stand up a local cluster in session mode using the following 
steps:
   
   1. Create a new folder
   2. Create a Dockerfile inside the folder with the contents of "Using Flink 
Python on Docker": 
https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/resource-providers/standalone/docker/#using-flink-python-on-docker
   3. Create a docker-compose.yaml file with the contents of "Session Mode": 
https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/resource-providers/standalone/docker/#session-cluster-yml
   4. Change both instances of `image: flink:1.16.0-scala_2.12` to be `build: .`
   5. Add a volumes entry to the `jobmanager` as:
   
   volumes:
   - .:/input
   
   6. Create a word_count.py file with the contents of "The complete code so 
far" section 
https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/python/table_api_tutorial/
   7. Run `docker-compose up`
   8. Enter the jobmanager via `docker exec -it [job_manager_container_id] bash`
   9. Run `./bin/flink run --python /input/word_count.py`
   
   This does not work, and throws the following error:
   
   ```
   Caused by: java.lang.RuntimeException: Failed to create stage bundle 
factory! Traceback (most recent call last):
 File "/usr/local/lib/python3.7/site-packages/fastavro/read.py", line 2, in 

   from . import _read
 File "fastavro/_read.pyx", line 11, in init fastavro._read
 File "/usr/local/lib/python3.7/lzma.py", line 27, in 
   from _lzma import *
   ModuleNotFoundError: No module named '_lzma'
   ```
   
   This is occurring because when building Python from source, as instructed in 
the Flink documentation I have updated, certain "optional modules" in Python 
are not built if host dependencies cannot be found. This includes things 
like readline, sqlite, etc., but also lzma - more information can be found at 
https://devguide.python.org/getting-started/setup-building/#unix particularly 
the "optional modules were not found" message.
   
   From what I gather, the latest version of PyFlink uses a version of fastavro 
that requires lzma to be present in the Python build. I assume the instructions 
I have updated used to work with an older version of PyFlink.
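   
   For reference, a hedged sketch of the kind of Dockerfile fix this implies 
(package names assume a Debian-based `flink` base image; the Python version is 
illustrative) - installing the xz/lzma headers before building Python so the 
optional `_lzma` module is compiled in:
   
   ```Dockerfile
   FROM flink:1.16.0-scala_2.12
   # liblzma-dev must be present BEFORE Python is built from source,
   # otherwise the optional _lzma module is silently skipped.
   RUN apt-get update -y && \
       apt-get install -y build-essential libssl-dev zlib1g-dev libbz2-dev liblzma-dev && \
       wget https://www.python.org/ftp/python/3.7.9/Python-3.7.9.tgz && \
       tar -xvf Python-3.7.9.tgz && \
       cd Python-3.7.9 && \
       ./configure --without-tests --enable-shared && \
       make -j4 && \
       make install && \
       ldconfig /usr/local/lib && \
       cd .. && rm -f Python-3.7.9.tgz && rm -rf Python-3.7.9 && \
       ln -s /usr/local/bin/python3 /usr/local/bin/python && \
       apt-get clean && rm -rf /var/lib/apt/lists/*
   RUN pip3 install apache-flink==1.16.0
   ```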


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] jbusche commented on pull request #420: FLINK-29536 - Add WATCH_NAMESPACE env var to operator

2022-11-16 Thread GitBox


jbusche commented on PR #420:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/420#issuecomment-1317728635

   I just tried an install in all namespaces - that looked good too:
   ```
   oc logs -f flink-kubernetes-operator-5b457fb9b7-znhfw |grep watch
   2022-11-16 21:53:07,843 o.a.f.k.o.FlinkOperator[INFO ] Configuring 
operator to watch the following namespaces: [JOSDK_ALL_NAMESPACES].
   2022-11-16 21:53:07,876 o.a.f.k.o.FlinkOperator[INFO ] Configuring 
operator to watch the following namespaces: [JOSDK_ALL_NAMESPACES].
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] MartijnVisser commented on pull request #21223: [FLINK-15462][connectors] Add Trino dialect

2022-11-16 Thread GitBox


MartijnVisser commented on PR #21223:
URL: https://github.com/apache/flink/pull/21223#issuecomment-1317675420

   @wanglijie95 Could you review this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] MartijnVisser commented on pull request #20097: [FLINK-28284][Connectors/Jdbc] Add JdbcSink with new format

2022-11-16 Thread GitBox


MartijnVisser commented on PR #20097:
URL: https://github.com/apache/flink/pull/20097#issuecomment-1317674873

   @wanglijie95 Do you have any idea when you can look into this ticket? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] MartijnVisser commented on pull request #21254: Update docker.md

2022-11-16 Thread GitBox


MartijnVisser commented on PR #21254:
URL: https://github.com/apache/flink/pull/21254#issuecomment-1317672870

   @mattfysh Thanks for the PR but please take the code contribution guide into 
account https://flink.apache.org/contributing/contribute-code.html
   
   I can't see why it's important to add this; Flink also works without this 
addition, so why is it needed?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] jbusche commented on pull request #420: FLINK-29536 - Add WATCH_NAMESPACE env var to operator

2022-11-16 Thread GitBox


jbusche commented on PR #420:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/420#issuecomment-1317671023

   I built a new flink-operator image based on your changes
   - Twistlock scan shows no known fixable vulnerabilities
   - helm install looks as before, able to deploy the basic.yaml.  
`WATCH_NAMESPACES=`
   - Built an OLM bundle using the same operator image, together with your 
branch.   When installed in default it properly shows: 
`WATCH_NAMESPACES=default`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] MartijnVisser commented on pull request #21272: [Flink]update some comments

2022-11-16 Thread GitBox


MartijnVisser commented on PR #21272:
URL: https://github.com/apache/flink/pull/21272#issuecomment-1317670920

   @DarkAssassinator Thanks for the PR, but weighed against the cost of running 
CI for it, a PR like this has none/very little value.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] MartijnVisser commented on pull request #21339: [FLINK-30024][tests] Build local test KDC docker image

2022-11-16 Thread GitBox


MartijnVisser commented on PR #21339:
URL: https://github.com/apache/flink/pull/21339#issuecomment-1317657486

   @gaborgsomogyi Looks interesting, especially since I've been working on 
the upgrade to a later Hadoop version with 
   https://github.com/apache/flink/pull/21128, which runs into issues with the 
Yarn tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-30047) getLastSavepointStatus should return null when there is never savepoint completed or pending

2022-11-16 Thread Clara Xiong (Jira)
Clara Xiong created FLINK-30047:
---

 Summary: getLastSavepointStatus should return null when there is 
never savepoint completed or pending
 Key: FLINK-30047
 URL: https://issues.apache.org/jira/browse/FLINK-30047
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Clara Xiong


Currently, SUCCEEDED is returned in this case, but null should be returned 
instead to distinguish it from an actual success.
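
A minimal sketch of the proposed semantics (the field and type names below are 
hypothetical, not the operator's actual API):

{code:java}
public class SavepointStatusSketch {

    enum SavepointStatus { PENDING, SUCCEEDED, FAILED }

    /** Hypothetical record of the last savepoint; stays null if none was ever triggered. */
    private SavepointRecord lastSavepoint;

    /**
     * Returns null when no savepoint was ever completed or pending,
     * instead of defaulting to SUCCEEDED.
     */
    public SavepointStatus getLastSavepointStatus() {
        return lastSavepoint == null ? null : lastSavepoint.status;
    }

    static class SavepointRecord {
        final SavepointStatus status;

        SavepointRecord(SavepointStatus status) {
            this.status = status;
        }
    }
}
{code}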



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] jnh5y commented on a diff in pull request #21133: [FLINK-29732][sql-gateway] support configuring session with SQL statement.

2022-11-16 Thread GitBox


jnh5y commented on code in PR #21133:
URL: https://github.com/apache/flink/pull/21133#discussion_r1024424268


##
flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/SqlGatewayServiceImpl.java:
##
@@ -311,8 +340,29 @@ public GatewayInfo getGatewayInfo() {
 return GatewayInfo.INSTANCE;
 }
 
+// --------------------------------------------------------------------------------------------
+
 @VisibleForTesting
 public Session getSession(SessionHandle sessionHandle) {
 return sessionManager.getSession(sessionHandle);
 }
+
+/** Fetch all results for configuring session. */
+private ResultSet fetchConfigureSessionResult(
+SessionHandle sessionHandle, OperationHandle operationHandle) {
+ResultSet firstResult = fetchResults(sessionHandle, operationHandle, 
0, Integer.MAX_VALUE);
+while (firstResult == ResultSet.NOT_READY_RESULTS) {
+firstResult = fetchResults(sessionHandle, operationHandle, 0, 
Integer.MAX_VALUE);

Review Comment:
   Sounds good and sounds like you and @fsk119 will figure out a good solution!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] yuzelin commented on a diff in pull request #21133: [FLINK-29732][sql-gateway] support configuring session with SQL statement.

2022-11-16 Thread GitBox


yuzelin commented on code in PR #21133:
URL: https://github.com/apache/flink/pull/21133#discussion_r1024355066


##
flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/SqlGatewayServiceImpl.java:
##
@@ -311,8 +340,29 @@ public GatewayInfo getGatewayInfo() {
 return GatewayInfo.INSTANCE;
 }
 
+// --------------------------------------------------------------------------------------------
+
 @VisibleForTesting
 public Session getSession(SessionHandle sessionHandle) {
 return sessionManager.getSession(sessionHandle);
 }
+
+/** Fetch all results for configuring session. */
+private ResultSet fetchConfigureSessionResult(
+SessionHandle sessionHandle, OperationHandle operationHandle) {
+ResultSet firstResult = fetchResults(sessionHandle, operationHandle, 
0, Integer.MAX_VALUE);
+while (firstResult == ResultSet.NOT_READY_RESULTS) {
+firstResult = fetchResults(sessionHandle, operationHandle, 0, 
Integer.MAX_VALUE);

Review Comment:
   > Let me ask this question: If one of the operations on the gateway took 
some time, say, 5 seconds. How many calls to the gateway would we expect to 
make here while waiting for the results to be ready?
   
   I'm afraid I have no answer to this question. I guess you are worried that 
frequent queries will burden the execution. However, I don't think it's a 
problem. Let's see how the code performs the fetching.
   
   Fetching results ultimately goes through 
[fetchResultsInternal](https://github.com/apache/flink/blob/5b32b9defcbb2adf7a0ad0898a67c40e0e012ebb/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/operation/OperationManager.java#L331); 
there, if the operation is not finished, the result will always be `NOT_READY`. 
At line 239, we can see that the `ResultFetcher` is ready before the status is 
turned to finished, and the statement is executed [(in an 
`OperationExecutor`)](https://github.com/apache/flink/blob/5b32b9defcbb2adf7a0ad0898a67c40e0e012ebb/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/SqlGatewayServiceImpl.java#L178). 
Although we still have to get data through the `ResultFetcher`, at least the 
execution is not disturbed. The task of fetching the remaining results is then 
taken over by the `ResultStore`: a 
[thread](https://github.com/apache/flink/blob/5b32b9defcbb2adf7a0ad0898a67c40e0e012ebb/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/result/ResultStore.java#L149) 
fetches continuously based on 
[`TableResultInternal#collectInternal`](https://github.com/apache/flink/blob/5b32b9defcbb2adf7a0ad0898a67c40e0e012ebb/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableResultInternal.java#L45).
   
   Anyway, the first result is the important one; the 
[implementation](https://github.com/apache/flink/blob/5b32b9defcbb2adf7a0ad0898a67c40e0e012ebb/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableResultImpl.java#L105) 
of `TableResult#await` is proof of that. After getting the first result, we can 
control the interval at which the remaining results are fetched. For example, 
`CliResultView` in the SQL client, which prints query results, defines several 
`REFRESH_INTERVALS`. However, I see that in remote mode the cost of the network 
communication needed to fetch the first ready result may be large. I think it's 
worth discussing when implementing the remote mode. I will also ask @fsk119 for 
advice.
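   
   For illustration, a minimal sketch of such interval-controlled polling, 
using a simplified stand-in `Gateway` interface; the names and signatures below 
are assumptions for the sketch, not the actual gateway API:
   
```java
import java.time.Duration;

public class GatewayPollingSketch {

    /** Stand-in for the gateway; returns null while results are not ready. */
    interface Gateway {
        Object fetchResults(long token);
    }

    /**
     * Polls until the first result is ready, sleeping between attempts
     * instead of busy-looping with back-to-back calls.
     */
    static Object fetchFirstResult(Gateway gateway, Duration interval)
            throws InterruptedException {
        Object result = gateway.fetchResults(0);
        while (result == null) {
            Thread.sleep(interval.toMillis()); // throttle instead of hammering the gateway
            result = gateway.fetchResults(0);
        }
        return result;
    }
}
```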
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zentol commented on pull request #21335: [FLINK-30045][flink-clients] fix too eager check for main class existence

2022-11-16 Thread GitBox


zentol commented on PR #21335:
URL: https://github.com/apache/flink/pull/21335#issuecomment-1317332532

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-29959) Use optimistic locking when patching resource status

2022-11-16 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-29959.
--
Fix Version/s: kubernetes-operator-1.3.0
   Resolution: Fixed

merged to main 99dc38f1270ae9b13a8b461df8a0bf66e9d8b5f7

> Use optimistic locking when patching resource status
> 
>
> Key: FLINK-29959
> URL: https://issues.apache.org/jira/browse/FLINK-29959
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.3.0
>
>
> The operator currently does not use optimistic locking on the CR when 
> patching status. This worked because we always wanted to overwrite the status.
> With leader election and potentially two operators running at the same time, 
> we are now exposed to some race conditions that were not previously present 
> with the status update logic.
> To ensure that the operator always sees the latest status we should change 
> our logic to optimistic locking with retries. If we get a lock error 
> (resource updated) we check if only the spec changed and then retry locking 
> on the new version.
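
A rough sketch of the retry-on-conflict pattern described above, using an 
illustrative client interface rather than the operator's actual fabric8 API 
(all names below are assumptions):

{code:java}
public class OptimisticStatusSketch {

    /** Minimal stand-in for a Kubernetes client; the real API differs. */
    interface KubeClient {
        Resource getLatest(String name);

        /** Fails when the resourceVersion on the given resource is stale. */
        void replaceStatus(Resource resource) throws ConflictException;
    }

    static class ConflictException extends Exception {}

    static class Resource {
        final String name;
        final String resourceVersion;
        final String status;

        Resource(String name, String resourceVersion, String status) {
            this.name = name;
            this.resourceVersion = resourceVersion;
            this.status = status;
        }
    }

    /** Patches the status with optimistic locking, retrying on version conflicts. */
    static void patchStatusWithRetry(
            KubeClient client, String name, String newStatus, int maxRetries)
            throws ConflictException {
        for (int attempt = 0; ; attempt++) {
            // Always base the update on the latest resourceVersion we can see.
            Resource latest = client.getLatest(name);
            try {
                client.replaceStatus(new Resource(latest.name, latest.resourceVersion, newStatus));
                return;
            } catch (ConflictException e) {
                // The CR was updated concurrently; per the issue, the operator would
                // verify that only the spec changed before retrying on the new version.
                if (attempt >= maxRetries) {
                    throw e;
                }
            }
        }
    }
}
{code}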



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] mxm commented on pull request #21023: [FLINK-29501] Add option to override job vertex parallelisms during job submission

2022-11-16 Thread GitBox


mxm commented on PR #21023:
URL: https://github.com/apache/flink/pull/21023#issuecomment-1317226026

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #438: [FLINK-29974] Not allowing the cancelling the which are already in the completed state.

2022-11-16 Thread GitBox


gyfora commented on code in PR #438:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/438#discussion_r1024176617


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java:
##
@@ -362,11 +362,21 @@ public void cancelSessionJob(
 FlinkSessionJob sessionJob, UpgradeMode upgradeMode, Configuration 
conf)
 throws Exception {
 
-var jobStatus = sessionJob.getStatus().getJobStatus();
+var sessionJobStatus = sessionJob.getStatus();
+var jobStatus = sessionJobStatus.getJobStatus();
 var jobIdString = jobStatus.getJobId();
 Preconditions.checkNotNull(jobIdString, "The job to be suspend should 
not be null");
 var jobId = JobID.fromHexString(jobIdString);
 Optional savepointOpt = Optional.empty();
+
+if (ReconciliationUtils.isJobInTerminalState(sessionJobStatus)) {
+LOG.info("Job is already in terminal state.");

Review Comment:
   Please be aware that what you wrote is based on the `TestingFlinkService`, 
which is basically a mock that intends to emulate the Flink client behaviour. 
   
   The fact that you don't get certain errors that are explained in the ticket 
does not mean that it works correctly, only that the `TestingFlinkService` 
implementation is incorrect. It is best to test it manually in minikube to 
reproduce the problem, modify the testing service to represent the correct 
behaviour, then write some tests and assert that you can reproduce the problem 
first.
   
   Then once you fix it, you will know that it works.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-30046) Cannot use digest in Helm chart to reference image

2022-11-16 Thread Tobias Hofer (Jira)
Tobias Hofer created FLINK-30046:


 Summary: Cannot use digest in Helm chart to reference image
 Key: FLINK-30046
 URL: https://issues.apache.org/jira/browse/FLINK-30046
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.2.0, kubernetes-operator-1.3.0, 
kubernetes-operator-1.2.1
Reporter: Tobias Hofer


Images can be referenced by tag only.

Referencing images by digest has a number of advantages:
 # Avoid unexpected or undesirable image changes.
 # Increase security and awareness by knowing the specific image running in 
your environment.

The following document describes a template to handle tags and digests:

[Adding Image Digest References to Your Helm 
Charts|https://blog.andyserver.com/2021/09/adding-image-digest-references-to-your-helm-charts/]
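
For reference, the tag-or-digest pattern from the linked post could look 
roughly like this in the chart (the value names below are illustrative, not the 
chart's current schema):

{code:yaml}
# values.yaml (illustrative)
image:
  repository: ghcr.io/apache/flink-kubernetes-operator
  tag: "1.2.0"
  digest: ""  # e.g. "sha256:..."; when set, it takes precedence over the tag

# deployment template snippet: prefer the digest, fall back to the tag
image: "{{ .Values.image.repository }}{{- if .Values.image.digest }}@{{ .Values.image.digest }}{{- else }}:{{ .Values.image.tag }}{{- end }}"
{code}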



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #420: FLINK-29536 - Add WATCH_NAMESPACE env var to operator

2022-11-16 Thread GitBox


morhidi commented on code in PR #420:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/420#discussion_r1024160767


##
helm/flink-kubernetes-operator/templates/flink-operator.yaml:
##
@@ -79,7 +79,13 @@ spec:
   {{- end }}
   env:
 - name: OPERATOR_NAMESPACE
-  value: {{ .Release.Namespace }}
+  valueFrom:
+fieldRef:
+  fieldPath: metadata.namespace
+- name: WATCH_NAMESPACES
+  valueFrom:
+fieldRef:
+  fieldPath: metadata.annotations['olm.targetNamespaces']

Review Comment:
   I don't mind if there's no Helm support for this, if you can ingest it using 
OLM.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #439: [FLINK-29959] Use optimistic locking when updating the status

2022-11-16 Thread GitBox


gyfora commented on code in PR #439:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/439#discussion_r1024159160


##
flink-kubernetes-operator-api/src/main/java/org/apache/flink/kubernetes/operator/api/status/SavepointInfo.java:
##
@@ -63,7 +63,7 @@ public void setTrigger(
 }
 
 public void resetTrigger() {
-this.triggerId = "";
+this.triggerId = null;
 this.triggerTimestamp = 0L;

Review Comment:
   at least until the next trigger



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #420: FLINK-29536 - Add WATCH_NAMESPACE env var to operator

2022-11-16 Thread GitBox


morhidi commented on code in PR #420:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/420#discussion_r1024158133


##
flink-kubernetes-webhook/pom.xml:
##
@@ -96,6 +96,12 @@ under the License.
 ${okhttp.version}
 test
 
+
+org.mockito

Review Comment:
   Ah ok, I didn't realize there's `TestUtils.setEnv(Map newEnv)`; yes, you can 
use that.



##
flink-kubernetes-webhook/pom.xml:
##
@@ -96,6 +96,12 @@ under the License.
 ${okhttp.version}
 test
 
+
+org.mockito

Review Comment:
   @gyfora what do you think? Shall we lift the 'mockito' restriction? It's 
quite convenient to use the EnvUtil how it is implemented today.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #21339: [FLINK-30024][tests] Build local test KDC docker image

2022-11-16 Thread GitBox


flinkbot commented on PR #21339:
URL: https://github.com/apache/flink/pull/21339#issuecomment-1317201283

   
   ## CI report:
   
   * 66489dd3b2a70839e469c94725a4409579581b1e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #439: [FLINK-29959] Use optimistic locking when updating the status

2022-11-16 Thread GitBox


morhidi commented on code in PR #439:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/439#discussion_r1024152777


##
flink-kubernetes-operator-api/src/main/java/org/apache/flink/kubernetes/operator/api/status/SavepointInfo.java:
##
@@ -63,7 +63,7 @@ public void setTrigger(
 }
 
 public void resetTrigger() {
-this.triggerId = "";
+this.triggerId = null;
 this.triggerTimestamp = 0L;

Review Comment:
   It'll keep already stored `0` values, however, so business logic should not 
check `(triggerTimestamp == null)`; it'll be `0`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi opened a new pull request, #21339: [FLINK-30024][tests] Build local test KDC docker image

2022-11-16 Thread GitBox


gaborgsomogyi opened a new pull request, #21339:
URL: https://github.com/apache/flink/pull/21339

   ## What is the purpose of the change
   
   YARN e2e tests are using a roughly 7-year-old KDC docker image which is hard 
to maintain. In this PR I've created a `Dockerfile` to build the KDC image 
on-the-fly.
   
   ## Brief change log
   
   * Restructured the docker-compose part (now it's possible to use 
`docker-compose build`)
   * Added `kdc/Dockerfile` to build it locally
   * Moved the original docker files/configs into hadoop directory
   * Made `hadoop/Dockerfile` hadoop version agnostic
   * Modified `README.md` for the docker images to reflect the current state
   
   ## Verifying this change
   
   Existing YARN e2e tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] tagarr commented on a diff in pull request #420: FLINK-29536 - Add WATCH_NAMESPACE env var to operator

2022-11-16 Thread GitBox


tagarr commented on code in PR #420:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/420#discussion_r1024148468


##
helm/flink-kubernetes-operator/templates/flink-operator.yaml:
##
@@ -79,7 +79,13 @@ spec:
   {{- end }}
   env:
 - name: OPERATOR_NAMESPACE
-  value: {{ .Release.Namespace }}
+  valueFrom:
+fieldRef:
+  fieldPath: metadata.namespace
+- name: WATCH_NAMESPACES
+  valueFrom:
+fieldRef:
+  fieldPath: metadata.annotations['olm.targetNamespaces']

Review Comment:
   I can remove this and work out how to inject it into the csv if you prefer ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30024) Secure e2e tests are using a super old kerberos docker image

2022-11-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30024:
---
Labels: pull-request-available  (was: )

> Secure e2e tests are using a super old kerberos docker image
> 
>
> Key: FLINK-30024
> URL: https://issues.apache.org/jira/browse/FLINK-30024
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.17.0
>Reporter: Gabor Somogyi
>Assignee: Gabor Somogyi
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #420: FLINK-29536 - Add WATCH_NAMESPACE env var to operator

2022-11-16 Thread GitBox


morhidi commented on code in PR #420:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/420#discussion_r1024142900


##
helm/flink-kubernetes-operator/templates/flink-operator.yaml:
##
@@ -79,7 +79,13 @@ spec:
   {{- end }}
   env:
 - name: OPERATOR_NAMESPACE
-  value: {{ .Release.Namespace }}
+  valueFrom:
+fieldRef:
+  fieldPath: metadata.namespace
+- name: WATCH_NAMESPACES
+  valueFrom:
+fieldRef:
+  fieldPath: metadata.annotations['olm.targetNamespaces']

Review Comment:
   We should try to generalise this somehow. Having `WATCH_NAMESPACES` as an 
override is ok; ingesting it from the 'olm.targetNamespaces' annotation is not.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #420: FLINK-29536 - Add WATCH_NAMESPACE env var to operator

2022-11-16 Thread GitBox


morhidi commented on code in PR #420:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/420#discussion_r1024138321


##
flink-kubernetes-webhook/pom.xml:
##
@@ -96,6 +96,12 @@ under the License.
 ${okhttp.version}
 test
 
+
+org.mockito

Review Comment:
   @gyfora what do you think? Shall we lift the 'mockito' restriction? It's 
quite convenient to use the EnvUtil how it is implemented today.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #442: [hotfix][doc][test] correct the incorrect flink release version

2022-11-16 Thread GitBox


gyfora commented on PR #442:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/442#issuecomment-1317169651

   The changes look good; there is a small merge conflict now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] tagarr commented on a diff in pull request #420: FLINK-29536 - Add WATCH_NAMESPACE env var to operator

2022-11-16 Thread GitBox


tagarr commented on code in PR #420:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/420#discussion_r1024130690


##
flink-kubernetes-webhook/pom.xml:
##
@@ -96,6 +96,12 @@ under the License.
 ${okhttp.version}
 test
 
+
+org.mockito

Review Comment:
   Actually, I noticed there is a method in TestUtils; am I ok to use that?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora merged pull request #439: [FLINK-29959] Use optimistic locking when updating the status

2022-11-16 Thread GitBox


gyfora merged PR #439:
URL: https://github.com/apache/flink-kubernetes-operator/pull/439


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] tagarr commented on a diff in pull request #420: FLINK-29536 - Add WATCH_NAMESPACE env var to operator

2022-11-16 Thread GitBox


tagarr commented on code in PR #420:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/420#discussion_r1024106606


##
flink-kubernetes-webhook/pom.xml:
##
@@ -96,6 +96,12 @@ under the License.
 ${okhttp.version}
 test
 
+
+org.mockito

Review Comment:
   @morhidi Understood. I looked at using 
https://junit-pioneer.org/docs/environment-variables/, which has some nice 
capabilities for changing system env vars, but noticed that when running it I 
get `This extension uses reflection to mutate JDK-internal state, which is 
fragile` warnings in IntelliJ. Do you think it would be ok to add a method to 
EnvUtils, e.g. setEnvironment(Map env), and use this to override system env 
vars? Or can you think of something better?
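   
   For reference, the reflection approach under discussion looks roughly like 
this (a test-only sketch; it is fragile by design and on newer JDKs needs 
`--add-opens java.base/java.util=ALL-UNNAMED`):
   
```java
import java.lang.reflect.Field;
import java.util.Map;

public final class EnvTestUtil {

    private EnvTestUtil() {}

    /**
     * Overrides environment variables seen via System.getenv() in the current
     * JVM only, by mutating the backing map of the unmodifiable view.
     */
    @SuppressWarnings("unchecked")
    public static void setEnv(Map<String, String> newEnv) throws Exception {
        Map<String, String> env = System.getenv();
        // System.getenv() returns Collections.unmodifiableMap(...); "m" is its backing map.
        Field backingMap = env.getClass().getDeclaredField("m");
        backingMap.setAccessible(true);
        ((Map<String, String>) backingMap.get(env)).putAll(newEnv);
    }
}
```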



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #21338: [FLINK-30044][ci] Deduplicate plugin parser loops

2022-11-16 Thread GitBox


flinkbot commented on PR #21338:
URL: https://github.com/apache/flink/pull/21338#issuecomment-1317102836

   
   ## CI report:
   
   * 17f1ff2881b8650672c5c97e92379ed9278114fa UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #21337: [FLINK-29993][conf] Add MetricOptions#forReporter

2022-11-16 Thread GitBox


flinkbot commented on PR #21337:
URL: https://github.com/apache/flink/pull/21337#issuecomment-1317102658

   
   ## CI report:
   
   * 35ff4ff233e9aa14128ec016c720b1a53236901b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #21336: [FLINK-30029][ci] Parse version/classifier separately

2022-11-16 Thread GitBox


flinkbot commented on PR #21336:
URL: https://github.com/apache/flink/pull/21336#issuecomment-1317102461

   
   ## CI report:
   
   * 7d3dbcffcdcb79e55f6ec3b3bb80a49d0df2d6cb UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


