[GitHub] [flink] HuangXingBo commented on a diff in pull request #21725: [FLINK-29000][python] Support python UDF in the SQL Gateway

2023-01-18 Thread GitBox


HuangXingBo commented on code in PR #21725:
URL: https://github.com/apache/flink/pull/21725#discussion_r1080914005


##
flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/context/DefaultContext.java:
##
@@ -55,15 +58,31 @@ public DefaultContext(Configuration flinkConfig, List<CustomCommandLine> commandLines)
                 flinkConfig, PluginUtils.createPluginManagerFromRootFolder(flinkConfig));
 
         Options commandLineOptions = collectCommandLineOptions(commandLines);
+
+        final List<URL> dependencies = new ArrayList<>();
+        // add python dependencies by default
+        try {
+            URL location =
+                    Class.forName(
+                                    "org.apache.flink.python.PythonFunctionRunner",
+                                    false,
+                                    Thread.currentThread().getContextClassLoader())
+                            .getProtectionDomain()
+                            .getCodeSource()
+                            .getLocation();
+            if (Paths.get(location.toURI()).toFile().isFile()) {
+                dependencies.add(location);
+            }
+        } catch (URISyntaxException | ClassNotFoundException e) {
+            throw new SqlExecutionException("Failed to find flink-python jar.", e);

Review Comment:
   Makes sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] HuangXingBo commented on a diff in pull request #21725: [FLINK-29000][python] Support python UDF in the SQL Gateway

2023-01-18 Thread GitBox


HuangXingBo commented on code in PR #21725:
URL: https://github.com/apache/flink/pull/21725#discussion_r1080913833


##
flink-connectors/flink-connector-hive/pom.xml:
##
@@ -1097,6 +1097,12 @@ under the License.
			<version>${project.version}</version>
			<scope>test</scope>
		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-python</artifactId>
+			<version>${project.version}</version>
+			<scope>test</scope>
+		</dependency>

Review Comment:
   If we don't add the python jar as a test dependency, `HiveServer2EndpointITCase` and `HiveServer2EndpointStatementITCase` will fail.






[GitHub] [flink] HuangXingBo commented on a diff in pull request #21725: [FLINK-29000][python] Support python UDF in the SQL Gateway

2023-01-18 Thread GitBox


HuangXingBo commented on code in PR #21725:
URL: https://github.com/apache/flink/pull/21725#discussion_r1080912316


##
flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/context/DefaultContext.java:
##
@@ -55,15 +58,31 @@ public DefaultContext(Configuration flinkConfig, List<CustomCommandLine> commandLines)
                 flinkConfig, PluginUtils.createPluginManagerFromRootFolder(flinkConfig));
 
         Options commandLineOptions = collectCommandLineOptions(commandLines);
+
+        final List<URL> dependencies = new ArrayList<>();
+        // add python dependencies by default
+        try {
+            URL location =
+                    Class.forName(
+                                    "org.apache.flink.python.PythonFunctionRunner",
+                                    false,
+                                    Thread.currentThread().getContextClassLoader())
+                            .getProtectionDomain()
+                            .getCodeSource()
+                            .getLocation();
+            if (Paths.get(location.toURI()).toFile().isFile()) {
+                dependencies.add(location);
+            }
+        } catch (URISyntaxException | ClassNotFoundException e) {
+            throw new SqlExecutionException("Failed to find flink-python jar.", e);

Review Comment:
   If it is the -j approach, I guess you want to make loading the python jar an optional behavior?
   






[GitHub] [flink] fsk119 commented on pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-01-18 Thread GitBox


fsk119 commented on PR #17830:
URL: https://github.com/apache/flink/pull/17830#issuecomment-1396569697

   Sorry for closing the PR. I hit the wrong button...





[GitHub] [flink] snuyanzin opened a new pull request, #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-01-18 Thread GitBox


snuyanzin opened a new pull request, #17830:
URL: https://github.com/apache/flink/pull/17830

   ## What is the purpose of the change
   
   This PR is a part of FLIP-189 about prompt customization.
   
   There are left and right prompts, which can be customized in the same way.
   The left prompt can be customized via the `sql-client.display.prompt.pattern` property and the right prompt via `sql-client.display.right-prompt.pattern`.
   A number of things can be done: different datetime patterns (compatible with SimpleDateFormat), colors, styles, showing the current database and catalog, and showing the values of specified properties.
   Examples:
   ```
   -- change left prompt to "my_prompt>"
   SET 'sql-client.display.prompt.pattern'='my_prompt>'; 
   
   
   -- show current database in prompt like "default_database prompt>"
   SET 'sql-client.display.prompt.pattern'='\d prompt>'; 
   
   
   -- show current catalog in prompt like "default_catalog prompt>"
   SET 'sql-client.display.prompt.pattern'='\c prompt>'; 
   
   
   -- show current timestamp in prompt like "2022-04-21 11:11:39.071 prompt>"
   SET 'sql-client.display.prompt.pattern'='\D prompt>'; 
   
   
   -- show property value like "localhost prompt>"
   SET 'sql-client.display.prompt.pattern'='\:taskmanager.host\: prompt>';
   
   
   -- apply style and colors
   SET 'sql-client.display.prompt.pattern'='\[f-rgb:#aa,bold\]\:taskmanager.host\: \[f:g,underline\]prompt>';
   ```
   The same can be applied to the right prompt.
   The whole range of options is covered in the documentation, with some examples at [1].
   There are also some details on the FLIP page [2].
   
   [1]  
https://github.com/apache/flink/pull/17830/files#diff-c12a360080e42a64d69dd6c65fd8af77c0d6ef337fa7833c3a8b16b1759d1a11R237-R281
   [2] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-189%3A+SQL+Client+Usability+Improvements#FLIP189:SQLClientUsabilityImprovements-Supportedpromptoptions(bothforleftandrightprompts)
 
   
   ## Brief change log
   
   - Enable left and right prompt customization via properties
   - Enable styles, property values, datetime patterns compatible with 
SimpleDateFormat
   
   ## Verifying this change
   There is a class with tests
   `org.apache.flink.table.client.cli.PromptHandlerTest`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (**yes** / no)
 - If yes, how is the feature documented? (not applicable / **docs** / 
JavaDocs / not documented)
   





[GitHub] [flink] fsk119 closed pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-01-18 Thread GitBox


fsk119 closed pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] 
SQL Client prompts customization
URL: https://github.com/apache/flink/pull/17830





[GitHub] [flink] fsk119 commented on pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-01-18 Thread GitBox


fsk119 commented on PR #17830:
URL: https://github.com/apache/flink/pull/17830#issuecomment-1396566805

   Okay. Because #21717 totally changes the architecture of the SQL Client, you may need to rebase again and again... So I just suggest rebasing after all the refactoring.
   
   The design of the gateway mode is in [FLIP-275](https://cwiki.apache.org/confluence/display/FLINK/FLIP-275%3A+Support+Remote+SQL+Client+Based+on+SQL+Gateway). In this FLIP, we always use the **sql gateway** (which may be on another machine) to execute the statements.
   
   For the new REST API: I read the Presto design recently. In Presto, every [response](https://prestodb.io/docs/current/develop/client-protocol.html#client-response-headers) contains the current metadata info (including the current catalog/database name) in its HTTP headers.
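   As an aside, an illustrative sketch of that header-based pattern in plain Java 11 (the endpoint and statement are hypothetical; the header names follow the linked Presto protocol docs, and none of this is a Flink API):
   
   ```java
   import java.net.URI;
   import java.net.http.HttpClient;
   import java.net.http.HttpRequest;
   import java.net.http.HttpResponse;
   
   public class HeaderMetadataSketch {
       public static void main(String[] args) throws Exception {
           HttpClient client = HttpClient.newHttpClient();
           // Hypothetical statement endpoint of a Presto-style coordinator.
           HttpRequest request = HttpRequest.newBuilder()
                   .uri(URI.create("http://localhost:8080/v1/statement"))
                   .POST(HttpRequest.BodyPublishers.ofString("USE hive.sales"))
                   .build();
           HttpResponse<String> response =
                   client.send(request, HttpResponse.BodyHandlers.ofString());
           // Presto-style response headers carrying the session's current metadata.
           response.headers().firstValue("X-Presto-Set-Catalog")
                   .ifPresent(c -> System.out.println("current catalog: " + c));
           response.headers().firstValue("X-Presto-Set-Schema")
                   .ifPresent(s -> System.out.println("current schema: " + s));
       }
   }
   ```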





[jira] [Updated] (FLINK-30751) Remove references to disableDataSync in RocksDB documentation

2023-01-18 Thread David Christle (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Christle updated FLINK-30751:
---
Description: 
The EmbeddedRocksDBStateBackend allows configuration using some predefined 
options via the .setPredefinedOptions method. The documentation for 
PredefinedOptions 
([link|https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.html])
 mentions disableDataSync is called for {{FLASH_SSD_OPTIMIZED}} and 
{{SPINNING_DISK_OPTIMIZED}}.

 

But this option was removed several years ago in RocksDB 5.3.0 
([link|https://github.com/facebook/rocksdb/blob/main/HISTORY.md#530-2017-03-08]),
 and according to the code 
[PredefinedOptions.java|https://github.com/apache/flink/blob/0bbc7b1e9fed89b8c3e8ec67b7b0dad5999c2c01/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.java#L72],
 it is no longer actually set in Flink.

We should remove references to disableDataSync in PredefinedOptions.java, and 
in state_backend.py, so that it does not appear in the documentation.

  was:
The EmbeddedRocksDBStateBackend allows configuration using some predefined 
options via the .setPredefinedOptions method. The documentation for 
PredefinedOptions 
([link|https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.html])
 mentions setDataSync is called for {{FLASH_SSD_OPTIMIZED}} and 
{{SPINNING_DISK_OPTIMIZED}}.

 

But this option was removed several years ago in RocksDB 5.3.0 
([link|https://github.com/facebook/rocksdb/blob/main/HISTORY.md#530-2017-03-08]),
 and according to the code 
[PredefinedOptions.java|https://github.com/apache/flink/blob/0bbc7b1e9fed89b8c3e8ec67b7b0dad5999c2c01/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.java#L72],
 it is no longer actually set in Flink.

We should remove references to disableDataSync in PredefinedOptions.java, and 
in state_backend.py, so that it does not appear in the documentation.


> Remove references to disableDataSync in RocksDB documentation
> -
>
> Key: FLINK-30751
> URL: https://issues.apache.org/jira/browse/FLINK-30751
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.16.0
>Reporter: David Christle
>Priority: Minor
>  Labels: pull-request-available
>
> The EmbeddedRocksDBStateBackend allows configuration using some predefined 
> options via the .setPredefinedOptions method. The documentation for 
> PredefinedOptions 
> ([link|https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.html])
>  mentions disableDataSync is called for {{FLASH_SSD_OPTIMIZED}} and 
> {{{}SPINNING_DISK_OPTIMIZED{}}}.
>  
> But this option was removed several years ago in RocksDB 5.3.0 
> ([link|https://github.com/facebook/rocksdb/blob/main/HISTORY.md#530-2017-03-08]),
>  and according to the code 
> [PredefinedOptions.java|https://github.com/apache/flink/blob/0bbc7b1e9fed89b8c3e8ec67b7b0dad5999c2c01/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.java#L72],
>  it is no longer actually set in Flink.
> We should remove references to disableDataSync in PredefinedOptions.java, and 
> in state_backend.py, so that it does not appear in the documentation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] ruanhang1993 commented on a diff in pull request #21589: [FLINK-25509][connector-base] Add RecordEvaluator to dynamically stop source based on de-serialized records

2023-01-18 Thread GitBox


ruanhang1993 commented on code in PR #21589:
URL: https://github.com/apache/flink/pull/21589#discussion_r1080899530


##
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/splitreader/SplitsDeletion.java:
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.base.source.reader.splitreader;
+
+import org.apache.flink.annotation.Experimental;
+
+import java.util.List;
+
+/**
+ * A change to delete splits.
+ *
+ * @param <SplitT> the split type.
+ */
+@Experimental

Review Comment:
   Yes, the split deletion between the enumerator and the reader still has some problems. I will mark it @Internal.
   For now, this deletion is handled entirely within the reader, so it is OK.
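   For illustration, a sketch of how a reader could consume such a change, assuming the `SplitsDeletion` shape from the diff above (the stand-in types below are ours, not the real connector-base classes):
   
   ```java
   import java.util.HashSet;
   import java.util.List;
   import java.util.Set;
   
   // Simplified stand-ins so the sketch is self-contained.
   abstract class SplitsChange<SplitT> {
       private final List<SplitT> splits;
       SplitsChange(List<SplitT> splits) { this.splits = splits; }
       List<SplitT> splits() { return splits; }
   }
   
   class SplitsAddition<SplitT> extends SplitsChange<SplitT> {
       SplitsAddition(List<SplitT> splits) { super(splits); }
   }
   
   class SplitsDeletion<SplitT> extends SplitsChange<SplitT> {
       SplitsDeletion(List<SplitT> splits) { super(splits); }
   }
   
   // A reader that keeps a registry of live splits and drops deleted ones,
   // handling the deletion entirely inside the reader as described above.
   class SketchSplitReader<SplitT> {
       private final Set<SplitT> assignedSplits = new HashSet<>();
   
       void handleSplitsChanges(SplitsChange<SplitT> change) {
           if (change instanceof SplitsAddition) {
               assignedSplits.addAll(change.splits());
           } else if (change instanceof SplitsDeletion) {
               assignedSplits.removeAll(change.splits());
           }
       }
   }
   ```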






[jira] [Updated] (FLINK-29000) Support python UDF in the SQL Gateway

2023-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-29000:
---
Labels: pull-request-available  (was: )

> Support python UDF in the SQL Gateway
> -
>
> Key: FLINK-29000
> URL: https://issues.apache.org/jira/browse/FLINK-29000
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python, Table SQL / Gateway
>Affects Versions: 1.17.0
>Reporter: Shengkai Fang
>Assignee: Xingbo Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> Currently the Flink SQL Client supports Python UDFs; the Gateway should also
> support this feature if the SQL Client is able to submit SQL to the Gateway.
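For context, a minimal sketch of what Python UDF support looks like from a Table API program once the flink-python jar is on the classpath (the module path 'my_udfs.py_upper' is hypothetical):

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class PythonUdfSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Registering a Python UDF; this is the kind of statement the SQL
        // Gateway must also be able to execute on behalf of the SQL Client.
        tEnv.executeSql(
                "CREATE TEMPORARY FUNCTION py_upper AS 'my_udfs.py_upper' LANGUAGE PYTHON");
        tEnv.executeSql("SELECT py_upper('hello')").print();
    }
}
```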





[GitHub] [flink] fsk119 commented on a diff in pull request #21725: [FLINK-29000][python] Support python UDF in the SQL Gateway

2023-01-18 Thread GitBox


fsk119 commented on code in PR #21725:
URL: https://github.com/apache/flink/pull/21725#discussion_r1080884303


##
flink-connectors/flink-connector-hive/pom.xml:
##
@@ -1097,6 +1097,12 @@ under the License.
			<version>${project.version}</version>
			<scope>test</scope>
		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-python</artifactId>
+			<version>${project.version}</version>
+			<scope>test</scope>
+		</dependency>

Review Comment:
   Why add python dependencies here?



##
flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/context/DefaultContext.java:
##
@@ -55,15 +58,31 @@ public DefaultContext(Configuration flinkConfig, List<CustomCommandLine> commandLines)
                 flinkConfig, PluginUtils.createPluginManagerFromRootFolder(flinkConfig));
 
         Options commandLineOptions = collectCommandLineOptions(commandLines);
+
+        final List<URL> dependencies = new ArrayList<>();
+        // add python dependencies by default
+        try {
+            URL location =
+                    Class.forName(
+                                    "org.apache.flink.python.PythonFunctionRunner",
+                                    false,
+                                    Thread.currentThread().getContextClassLoader())
+                            .getProtectionDomain()
+                            .getCodeSource()
+                            .getLocation();
+            if (Paths.get(location.toURI()).toFile().isFile()) {
+                dependencies.add(location);
+            }
+        } catch (URISyntaxException | ClassNotFoundException e) {
+            throw new SqlExecutionException("Failed to find flink-python jar.", e);

Review Comment:
Throw SqlGatewayException instead. In `SqlGateway#startSqlGateway`, we notify users that it's a bug if the type of the exception is not SqlGatewayException.



##
flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/context/DefaultContext.java:
##
@@ -55,15 +58,31 @@ public DefaultContext(Configuration flinkConfig, List<CustomCommandLine> commandLines)
                 flinkConfig, PluginUtils.createPluginManagerFromRootFolder(flinkConfig));
 
         Options commandLineOptions = collectCommandLineOptions(commandLines);
+
+        final List<URL> dependencies = new ArrayList<>();
+        // add python dependencies by default
+        try {
+            URL location =
+                    Class.forName(
+                                    "org.apache.flink.python.PythonFunctionRunner",
+                                    false,
+                                    Thread.currentThread().getContextClassLoader())
+                            .getProtectionDomain()
+                            .getCodeSource()
+                            .getLocation();
+            if (Paths.get(location.toURI()).toFile().isFile()) {
+                dependencies.add(location);
+            }
+        } catch (URISyntaxException | ClassNotFoundException e) {
+            throw new SqlExecutionException("Failed to find flink-python jar.", e);

Review Comment:
   Can we move this code block to `DefaultContext#load`? I think we will introduce -l/-j command line parameters in the future.
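   For reference, a self-contained sketch of the jar-locating technique under discussion: ask the class's ProtectionDomain for its code source. The helper class is ours, and a plain RuntimeException stands in for the SqlGatewayException suggested above:
   
   ```java
   import java.net.URISyntaxException;
   import java.net.URL;
   import java.nio.file.Paths;
   import java.util.ArrayList;
   import java.util.List;
   
   public class JarLocatorSketch {
       public static List<URL> findFlinkPythonJar() {
           final List<URL> dependencies = new ArrayList<>();
           try {
               URL location =
                       Class.forName(
                                       "org.apache.flink.python.PythonFunctionRunner",
                                       false, // resolve only, don't initialize the class
                                       Thread.currentThread().getContextClassLoader())
                               .getProtectionDomain()
                               .getCodeSource()
                               .getLocation();
               // A plain file means a packaged jar rather than, e.g., an
               // exploded classes directory when running inside the IDE.
               if (Paths.get(location.toURI()).toFile().isFile()) {
                   dependencies.add(location);
               }
           } catch (URISyntaxException | ClassNotFoundException e) {
               throw new RuntimeException("Failed to find flink-python jar.", e);
           }
           return dependencies;
       }
   }
   ```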






[GitHub] [flink] ruanhang1993 commented on a diff in pull request #21589: [FLINK-25509][connector-base] Add RecordEvaluator to dynamically stop source based on de-serialized records

2023-01-18 Thread GitBox


ruanhang1993 commented on code in PR #21589:
URL: https://github.com/apache/flink/pull/21589#discussion_r1080899530


##
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/splitreader/SplitsDeletion.java:
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.base.source.reader.splitreader;
+
+import org.apache.flink.annotation.Experimental;
+
+import java.util.List;
+
+/**
+ * A change to delete splits.
+ *
+ * @param <SplitT> the split type.
+ */
+@Experimental

Review Comment:
   Yes, the split deletion between the enumerator and the reader still has some problems. I will mark it @Internal.






[GitHub] [flink] ruanhang1993 commented on a diff in pull request #21589: [FLINK-25509][connector-base] Add RecordEvaluator to dynamically stop source based on de-serialized records

2023-01-18 Thread GitBox


ruanhang1993 commented on code in PR #21589:
URL: https://github.com/apache/flink/pull/21589#discussion_r1080897687


##
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/splitreader/SplitsDeletion.java:
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.base.source.reader.splitreader;
+
+import org.apache.flink.annotation.Experimental;
+
+import java.util.List;
+
+/**
+ * A change to delete splits.
+ *
+ * @param <SplitT> the split type.
+ */
+@Experimental
+public class SplitsDeletion<SplitT> extends SplitsChange<SplitT> {

Review Comment:
   The changes in the Kafka connector will be raised in another PR after this PR is merged. I will refactor the class name.






[GitHub] [flink] snuyanzin commented on pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-01-18 Thread GitBox


snuyanzin commented on PR #17830:
URL: https://github.com/apache/flink/pull/17830#issuecomment-1396549855

   Thanks for the response and the reference to https://github.com/apache/flink/pull/21717. I will have a look.
   I noticed the conflicts and tried to solve them. It's sometimes easier to resolve them step by step rather than after several commits that introduce new conflicts... That's why I did it.
   
   Currently I'm not familiar with the gateway mode. I will have a closer look to see what could be done there. Maybe, as you mentioned, some REST API will be OK; however, let's see.
   If you don't mind, I'll probably ping you with questions about the gateway mode once I have them.
   
   





[GitHub] [flink] flinkbot commented on pull request #21729: [FLINK-30751] [docs] Remove references to disableDataSync in RocksDB documentation

2023-01-18 Thread GitBox


flinkbot commented on PR #21729:
URL: https://github.com/apache/flink/pull/21729#issuecomment-1396544048

   
   ## CI report:
   
   * c3a904a93192be9bf612e76edbfa39465b398404 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Updated] (FLINK-27716) Add Python API docs in ML

2023-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27716:
---
Labels: pull-request-available  (was: )

> Add Python API docs in ML
> -
>
> Key: FLINK-27716
> URL: https://issues.apache.org/jira/browse/FLINK-27716
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python, Documentation, Library / Machine Learning
>Reporter: Huang Xingbo
>Assignee: Jiang Xin
>Priority: Major
>  Labels: pull-request-available
> Fix For: ml-2.2.0
>
>
> We can use Sphinx, the same as PyFlink, or other tools to generate the Python
> API docs of ML





[GitHub] [flink] flinkbot commented on pull request #21728: [FLINK-30667] decouple hive connector with planner internal class

2023-01-18 Thread GitBox


flinkbot commented on PR #21728:
URL: https://github.com/apache/flink/pull/21728#issuecomment-1396543949

   
   ## CI report:
   
   * be0fcfcfa931339826e99a8b09550be8385d6fd5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[GitHub] [flink-ml] jiangxin369 opened a new pull request, #201: [FLINK-27716] Add Python API docs in ML

2023-01-18 Thread GitBox


jiangxin369 opened a new pull request, #201:
URL: https://github.com/apache/flink-ml/pull/201

   
   
   ## What is the purpose of the change
   
   Add Python API docs in ML.
   
   ## Brief change log
   
 - Adds Sphinx config to generate Python API doc automatically.
 - Sets the layout and APIs to be shown.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   





[GitHub] [flink] chenqin commented on pull request #21728: [FLINK-30667] decouple hive connector with planner internal class

2023-01-18 Thread GitBox


chenqin commented on PR #21728:
URL: https://github.com/apache/flink/pull/21728#issuecomment-1396541490

   @luoyuxia please help take a look; here is a bit of the rationale (a hypothetical sketch follows below):
   - Parser should be a PublicEvolving interface while both Flink and Hive have their own internal implementations, so hive connector maintainers worry less about Flink planner changes
   - PlannerQueryOperation should stay internal in both table-planner and hive-connector, so the hive connector can have full control and evolve without worrying about how the Flink planner's PlannerQueryOperation evolves
   - PlannerContext is a simple enough util that it can be PublicEvolving
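   A hypothetical sketch of that decoupling (the interfaces below are simplified stand-ins, not Flink's real signatures):
   
   ```java
   import java.util.Collections;
   import java.util.List;
   
   // Stand-in for Flink's operation tree, for illustration only.
   interface Operation {}
   
   // The stable, PublicEvolving-style contract the hive connector would code against.
   interface Parser {
       List<Operation> parse(String statement);
   }
   
   // Hive-dialect implementation owned by the hive connector, with no
   // dependency on the planner's @Internal ParserImpl.
   class HiveParser implements Parser {
       @Override
       public List<Operation> parse(String statement) {
           // translate HiveQL here; an empty list keeps the sketch compilable
           return Collections.emptyList();
       }
   }
   ```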





[jira] [Comment Edited] (FLINK-30667) remove the planner @internal dependency in flink-connector-hive

2023-01-18 Thread Chen Qin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678003#comment-17678003
 ] 

Chen Qin edited comment on FLINK-30667 at 1/19/23 7:19 AM:
---

Parser should be a PublicEvolving interface while both Flink and Hive have their own internal implementations, so hive connector maintainers worry less about Flink planner changes.

PlannerQueryOperation should stay internal in both table-planner and hive-connector, so the hive connector can have full control and evolve without worrying about how the Flink planner's PlannerQueryOperation evolves.

PlannerContext is a simple enough util that it can be PublicEvolving.


was (Author: foxss):
Paper should be PublicEvolving interface while both Flink and hive has own 
internal implementation. so hive connector maintainer less worry about Flink 
planner changes

PlannerQueryOperation should keep internal in both table-planner as well as 
hive-connector so hive connector can have full control and evolve without worry 
how Flink planner PlannerQueryOperation evolve

PlannerContext is simple enough util can be PublicEvolving

>  remove the planner @internal dependency in flink-connector-hive
> 
>
> Key: FLINK-30667
> URL: https://issues.apache.org/jira/browse/FLINK-30667
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Affects Versions: 1.17.0
>Reporter: Chen Qin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> There are some classes in flink-connector-hive that rely on the planner, but
> fortunately, not too many.
> It mainly relies on ParserImpl, PlannerContext, PlannerQueryOperation, and so
> on. The dependency is mainly required to create RelNode.
> To resolve this problem,  we need more abstraction for planner and provides 
> public API for external dialects.





[jira] [Updated] (FLINK-30751) Remove references to disableDataSync in RocksDB documentation

2023-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30751:
---
Labels: pull-request-available  (was: )

> Remove references to disableDataSync in RocksDB documentation
> -
>
> Key: FLINK-30751
> URL: https://issues.apache.org/jira/browse/FLINK-30751
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.16.0
>Reporter: David Christle
>Priority: Minor
>  Labels: pull-request-available
>
> The EmbeddedRocksDBStateBackend allows configuration using some predefined 
> options via the .setPredefinedOptions method. The documentation for 
> PredefinedOptions 
> ([link|https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.html])
>  mentions setDataSync is called for {{FLASH_SSD_OPTIMIZED}} and 
> {{SPINNING_DISK_OPTIMIZED}}.
>  
> But this option was removed several years ago in RocksDB 5.3.0 
> ([link|https://github.com/facebook/rocksdb/blob/main/HISTORY.md#530-2017-03-08]),
>  and according to the code 
> [PredefinedOptions.java|https://github.com/apache/flink/blob/0bbc7b1e9fed89b8c3e8ec67b7b0dad5999c2c01/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.java#L72],
>  it is no longer actually set in Flink.
> We should remove references to disableDataSync in PredefinedOptions.java, and 
> in state_backend.py, so that it does not appear in the documentation.





[GitHub] [flink] dchristle opened a new pull request, #21729: [FLINK-30751] [docs] Remove references to disableDataSync in RocksDB documentation

2023-01-18 Thread GitBox


dchristle opened a new pull request, #21729:
URL: https://github.com/apache/flink/pull/21729

   
   
   
   ## What is the purpose of the change
   
   This PR corrects and improves the readability of the documentation for 
RocksDB predefined option sets. References to `disableDataSync`, which is not 
set and has been removed from RocksDB for some time, are removed. I also sorted 
the options lexicographically, and did the same for the code that sets these 
options. This makes it easier to read & spot bugs. While doing this, I noticed 
the `SPINNING_DISK_OPTIMIZED_HIGH_MEM` predefined option set actually enables 
Bloom filter usage, but it isn't documented to do that. I added brief 
documentation for it. There are some other minor inconsistencies in the Python 
vs Java documentation that I tried to match up, too.
   
   There are also some references to Flink not needing to sync changes to the 
file system due to its checkpointing ability. `setUseFsync` is set to `false`, 
it appears, for all settings. Currently, there are some references to not 
needing syncing in the Java documentation, but they don't quite make sense 
because `setUseFsync` is not mentioned there. Instead, as the existing 
documentation suggests, options common to each predefined set are set 
elsewhere. So, I removed most duplicate references to `setUseFsync`.
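   For orientation, a minimal sketch of how these predefined option sets are applied (assuming the flink-statebackend-rocksdb module is on the classpath):
   
   ```java
   import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
   import org.apache.flink.contrib.streaming.state.PredefinedOptions;
   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
   
   public class PredefinedOptionsSketch {
       public static void main(String[] args) {
           StreamExecutionEnvironment env =
                   StreamExecutionEnvironment.getExecutionEnvironment();
           EmbeddedRocksDBStateBackend backend = new EmbeddedRocksDBStateBackend();
           // One predefined bundle of RocksDB settings; the HIGH_MEM variant
           // also enables Bloom filters, which this PR now documents.
           backend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM);
           env.setStateBackend(backend);
       }
   }
   ```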
   
   
   ## Brief change log
   
   
   * Remove references to `disableDataSync` in RocksDB documentation. This is 
not set, and has been removed from RocksDB for some time.
   * Make Python documentation consistent with Java documentation by removing 
`setUseFsync` references on each option.
   * Sort options in both Java and Python documentation.
   * Sort the code setting configuration to match the sorted-order 
documentation.
   * Change use of `setIncreaseParallelism` in Python documentation to 
`setMaxBackgroundJobs`, which matches the Java usage.
   * Add missing Bloom filter option to the `SPINNING_DISK_OPTIMIZED_HIGH_MEM` 
documentation, and describe where its parameters are set.
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): **no**
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
 - The serializers: **no**
 - The runtime per-record code paths (performance sensitive): **no**
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: **no**
 - The S3 file system connector: **no**
   
   ## Documentation
   
 - Does this pull request introduce a new feature? **no**
 - If yes, how is the feature documented? **not applicable**
   





[jira] [Updated] (FLINK-30025) table.execute().print() can only use the default max column width

2023-01-18 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge updated FLINK-30025:

Description: 
table.execute().print() can only use the default max column width. When running 
a Table API program with "table.execute().print();", columns with long string 
values are truncated to 30 chars. E.g.:

!https://static.dingtalk.com/media/lALPF6XTM7ZO1FXNASrNBEI_1090_298.png_620x1q90.jpg?auth_bizType=%27IM%27=im|width=457,height=125!

I tried to set the max width with: 
tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width", 100); It has no effect. How can I set the max width?

Here is the example code:

val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = StreamTableEnvironment.create(env)

tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
 100)

val orderA = env
  .fromCollection(Seq(Order(1L, "beer", 3), Order(1L, 
"diaper--{-}.diaper{-}-{-}.diaper{-}-{-}.diaper{-}--.", 4), Order(3L, "rubber", 
2)))
  .toTable(tEnv)

orderA.execute().print()

 

"sql-client.display.max-column-width" seems only work in cli: SET 
'sql-client.display.max-column-width' = '40';

While using Table API, by default, the DEFAULT_MAX_COLUMN_WIDTH in PrintStyle 
is used now. It should be configurable. 

  was:
when running table API program "table.execute().print();", the columns with 
long string value are truncated to 30 chars. E.g.,:

!https://static.dingtalk.com/media/lALPF6XTM7ZO1FXNASrNBEI_1090_298.png_620x1q90.jpg?auth_bizType=%27IM%27=im|width=457,height=125!

I tried set the max width with: 
tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
 100); It has no effect.  How can I set the max-width?

Here is the example code:

val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = StreamTableEnvironment.create(env)

tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
 100)

val orderA = env
  .fromCollection(Seq(Order(1L, "beer", 3), Order(1L, 
"diaper---.diaper---.diaper---.diaper---.", 4), Order(3L, "rubber", 2)))
  .toTable(tEnv)

orderA.execute().print()

 

"sql-client.display.max-column-width" seems only work in cli: SET 
'sql-client.display.max-column-width' = '40';

While using Table API, by default, the DEFAULT_MAX_COLUMN_WIDTH in PrintStyle 
is used now. It should be configurable. 


> table.execute().print() can only use the default max column width 
> --
>
> Key: FLINK-30025
> URL: https://issues.apache.org/jira/browse/FLINK-30025
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.16.0, 1.15.2
>Reporter: Jing Ge
>Assignee: Jing Ge
>Priority: Minor
>  Labels: pull-request-available
>
> table.execute().print() can only use the default max column width. When 
> running table API program "table.execute().print();", the columns with long 
> string value are truncated to 30 chars. E.g.,:
> !https://static.dingtalk.com/media/lALPF6XTM7ZO1FXNASrNBEI_1090_298.png_620x1q90.jpg?auth_bizType=%27IM%27=im|width=457,height=125!
> I tried set the max width with: 
> tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
>  100); It has no effect.  How can I set the max-width?
> Here is the example code:
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> val tEnv = StreamTableEnvironment.create(env)
> tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
>  100)
> val orderA = env
>   .fromCollection(Seq(Order(1L, "beer", 3), Order(1L, 
> "diaper--{-}.diaper{-}-{-}.diaper{-}-{-}.diaper{-}--.", 4), Order(3L, 
> "rubber", 2)))
>   .toTable(tEnv)
> orderA.execute().print()
>  
> "sql-client.display.max-column-width" seems only work in cli: SET 
> 'sql-client.display.max-column-width' = '40';
> While using Table API, by default, the DEFAULT_MAX_COLUMN_WIDTH in PrintStyle 
> is used now. It should be configurable. 
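To make the report concrete, a compact Java restatement of the behavior described above (the option key is real; whether it should ever take effect from the Table API is exactly what this ticket questions):

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class MaxColumnWidthSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Reported as having no effect outside the SQL Client:
        tEnv.getConfig().getConfiguration()
                .setInteger("sql-client.display.max-column-width", 100);
        // print() still truncates long string columns to
        // PrintStyle.DEFAULT_MAX_COLUMN_WIDTH (30 characters).
        tEnv.fromValues("diaper---.diaper---.diaper---.diaper---.")
                .execute()
                .print();
    }
}
```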





[jira] [Updated] (FLINK-30025) Unified the max display column width for SqlClient and Table APi in both Streaming and Batch execMode

2023-01-18 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge updated FLINK-30025:

Description: 
FLIP-279 
[https://cwiki.apache.org/confluence/display/FLINK/FLIP-279+Unified+the+max+display+column+width+for+SqlClient+and+Table+APi+in+both+Streaming+and+Batch+execMode]

 

Background info:

table.execute().print() can only use the default max column width. When running 
a Table API program with "table.execute().print();", columns with long string 
values are truncated to 30 chars. E.g.:

!https://static.dingtalk.com/media/lALPF6XTM7ZO1FXNASrNBEI_1090_298.png_620x1q90.jpg?auth_bizType=%27IM%27=im|width=457,height=125!

I tried to set the max width with: 
tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width", 100); It has no effect. How can I set the max width?

Here is the example code:

val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = StreamTableEnvironment.create(env)

tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
 100)

val orderA = env
  .fromCollection(Seq(Order(1L, "beer", 3), Order(1L, 
"diaper-{-}{{-}}.diaper{{-}}{-}{-}.diaper{-}{-}{{-}}.diaper{{-}}{-}-.", 4), 
Order(3L, "rubber", 2)))
  .toTable(tEnv)

orderA.execute().print()

 

"sql-client.display.max-column-width" seems only work in cli: SET 
'sql-client.display.max-column-width' = '40';

While using Table API, by default, the DEFAULT_MAX_COLUMN_WIDTH in PrintStyle 
is used now. It should be configurable. 

  was:
table.execute().print() can only use the default max column width. When running 
table API program "table.execute().print();", the columns with long string 
value are truncated to 30 chars. E.g.,:

!https://static.dingtalk.com/media/lALPF6XTM7ZO1FXNASrNBEI_1090_298.png_620x1q90.jpg?auth_bizType=%27IM%27=im|width=457,height=125!

I tried set the max width with: 
tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
 100); It has no effect.  How can I set the max-width?

Here is the example code:

val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = StreamTableEnvironment.create(env)

tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
 100)

val orderA = env
  .fromCollection(Seq(Order(1L, "beer", 3), Order(1L, 
"diaper--{-}.diaper{-}-{-}.diaper{-}-{-}.diaper{-}--.", 4), Order(3L, "rubber", 
2)))
  .toTable(tEnv)

orderA.execute().print()

 

"sql-client.display.max-column-width" seems only work in cli: SET 
'sql-client.display.max-column-width' = '40';

While using Table API, by default, the DEFAULT_MAX_COLUMN_WIDTH in PrintStyle 
is used now. It should be configurable. 


> Unified the max display column width for SqlClient and Table APi in both 
> Streaming and Batch execMode
> -
>
> Key: FLINK-30025
> URL: https://issues.apache.org/jira/browse/FLINK-30025
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.16.0, 1.15.2
>Reporter: Jing Ge
>Assignee: Jing Ge
>Priority: Minor
>  Labels: pull-request-available
>
> FLIP-279 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-279+Unified+the+max+display+column+width+for+SqlClient+and+Table+APi+in+both+Streaming+and+Batch+execMode]
>  
> Background info:
> table.execute().print() can only use the default max column width. When 
> running table API program "table.execute().print();", the columns with long 
> string value are truncated to 30 chars. E.g.,:
> !https://static.dingtalk.com/media/lALPF6XTM7ZO1FXNASrNBEI_1090_298.png_620x1q90.jpg?auth_bizType=%27IM%27=im|width=457,height=125!
> I tried set the max width with: 
> tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
>  100); It has no effect.  How can I set the max-width?
> Here is the example code:
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> val tEnv = StreamTableEnvironment.create(env)
> tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
>  100)
> val orderA = env
>   .fromCollection(Seq(Order(1L, "beer", 3), Order(1L, 
> "diaper-{-}{{-}}.diaper{{-}}{-}{-}.diaper{-}{-}{{-}}.diaper{{-}}{-}-.", 4), 
> Order(3L, "rubber", 2)))
>   .toTable(tEnv)
> orderA.execute().print()
>  
> "sql-client.display.max-column-width" seems only work in cli: SET 
> 'sql-client.display.max-column-width' = '40';
> While using Table API, by default, the DEFAULT_MAX_COLUMN_WIDTH in PrintStyle 
> is used now. It should be configurable. 





[jira] [Updated] (FLINK-30025) Unified the max display column width for SqlClient and Table APi in both Streaming and Batch execMode

2023-01-18 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge updated FLINK-30025:

Summary: Unified the max display column width for SqlClient and Table APi 
in both Streaming and Batch execMode  (was: table.execute().print() can only 
use the default max column width )

> Unified the max display column width for SqlClient and Table APi in both 
> Streaming and Batch execMode
> -
>
> Key: FLINK-30025
> URL: https://issues.apache.org/jira/browse/FLINK-30025
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.16.0, 1.15.2
>Reporter: Jing Ge
>Assignee: Jing Ge
>Priority: Minor
>  Labels: pull-request-available
>
> table.execute().print() can only use the default max column width. When 
> running table API program "table.execute().print();", the columns with long 
> string value are truncated to 30 chars. E.g.,:
> !https://static.dingtalk.com/media/lALPF6XTM7ZO1FXNASrNBEI_1090_298.png_620x1q90.jpg?auth_bizType=%27IM%27=im|width=457,height=125!
> I tried set the max width with: 
> tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
>  100); It has no effect.  How can I set the max-width?
> Here is the example code:
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> val tEnv = StreamTableEnvironment.create(env)
> tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width",
>  100)
> val orderA = env
>   .fromCollection(Seq(Order(1L, "beer", 3), Order(1L, 
> "diaper--{-}.diaper{-}-{-}.diaper{-}-{-}.diaper{-}--.", 4), Order(3L, 
> "rubber", 2)))
>   .toTable(tEnv)
> orderA.execute().print()
>  
> "sql-client.display.max-column-width" seems only work in cli: SET 
> 'sql-client.display.max-column-width' = '40';
> While using Table API, by default, the DEFAULT_MAX_COLUMN_WIDTH in PrintStyle 
> is used now. It should be configurable. 





[jira] [Comment Edited] (FLINK-30667) remove the planner @internal dependency in flink-connector-hive

2023-01-18 Thread Chen Qin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678003#comment-17678003
 ] 

Chen Qin edited comment on FLINK-30667 at 1/19/23 7:16 AM:
---

Paper should be PublicEvolving interface while both Flink and hive has own 
internal implementation. so hive connector maintainer less worry about Flink 
planner changes

PlannerQueryOperation should keep internal in both table-planner as well as 
hive-connector so hive connector can have full control and evolve without worry 
how Flink planner PlannerQueryOperation evolve

PlannerContext is simple enough util can be PublicEvolving


was (Author: foxss):
ParserImpl and it's interface currently both Internal. Consider HIveParser 
should not rely on table-planner ParserImpl for shake of future flexibility and 
hive connector maintenance. I would propose annotate Parser Interface with 
PublicEvolving; Let HiveParser directly implement Parser Interface to decouple 
risk might involved with future planner refactor.

PlannerQueryOperation should keep internal in both table-planner as well as 
hive-connector, thanks to interface QueryOperation were PublicEvolving, I would 
propose setting a foundational FlinkTypeFactory as PublicEvolving as well.

 

PlannerContext could be interface with separate implementations in planner and 
hive-connector

>  remove the planner @internal dependency in flink-connector-hive
> 
>
> Key: FLINK-30667
> URL: https://issues.apache.org/jira/browse/FLINK-30667
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Affects Versions: 1.17.0
>Reporter: Chen Qin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> There are some classes in flink-connector-hive that rely on the planner, but
> fortunately, not too many.
> It mainly relies on ParserImpl, PlannerContext, PlannerQueryOperation, and so
> on. The dependency is mainly required to create RelNode.
> To resolve this problem,  we need more abstraction for planner and provides 
> public API for external dialects.





[GitHub] [flink] chenqin opened a new pull request, #21728: [FLINK-30667] decouple hive connector with planner internal class

2023-01-18 Thread GitBox


chenqin opened a new pull request, #21728:
URL: https://github.com/apache/flink/pull/21728

   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   





[GitHub] [flink] fsk119 commented on pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-01-18 Thread GitBox


fsk119 commented on PR #17830:
URL: https://github.com/apache/flink/pull/17830#issuecomment-1396537732

   I think some features may not be easy to support in the gateway mode. We may introduce some REST API to fetch the current catalog and database. WDYT?





[GitHub] [flink] fsk119 commented on pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-01-18 Thread GitBox


fsk119 commented on PR #17830:
URL: https://github.com/apache/flink/pull/17830#issuecomment-1396536467

   Sorry for the late response. I am refactoring the SQL Client in #21717. Do you mind if we support this feature after the refactoring finishes? I can help rebase this when most of the code is finished.





[GitHub] [flink] flinkbot commented on pull request #21727: [FLINK-30752][python] Support 'EXPLAIN PLAN_ADVICE' statement in PyFlink

2023-01-18 Thread GitBox


flinkbot commented on PR #21727:
URL: https://github.com/apache/flink/pull/21727#issuecomment-1396534110

   
   ## CI report:
   
   * 65b5f8395b61ec1ee0040217f02ff202d0ad7c0e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Closed] (FLINK-30677) SqlGatewayServiceStatementITCase.testFlinkSqlStatements fails

2023-01-18 Thread Shengkai Fang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengkai Fang closed FLINK-30677.
-
Fix Version/s: 1.17.0
   Resolution: Fixed

Fixed in 4fdb5c40094cfaa5fb3b6d7ce9ec891dab3ef32a

> SqlGatewayServiceStatementITCase.testFlinkSqlStatements fails
> -
>
> Key: FLINK-30677
> URL: https://issues.apache.org/jira/browse/FLINK-30677
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Gateway
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Paul Lin
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.17.0
>
>
> We're observing a test instability with 
> {{SqlGatewayServiceStatementITCase.testFlinkSqlStatements}} in the following 
> builds:
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44775=logs=a9db68b9-a7e0-54b6-0f98-010e0aff39e2=cdd32e0b-6047-565b-c58f-14054472f1be=14251]
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44775=logs=de826397-1924-5900-0034-51895f69d4b7=f311e913-93a2-5a37-acab-4a63e1328f94=14608
> {code:java}
> Jan 13 02:46:10 [ERROR] Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 27.279 s <<< FAILURE! - in 
> org.apache.flink.table.gateway.service.SqlGatewayServiceStatementITCase
> Jan 13 02:46:10 [ERROR] 
> org.apache.flink.table.gateway.service.SqlGatewayServiceStatementITCase.testFlinkSqlStatements(String)[5]
>   Time elapsed: 1.573 s  <<< FAILURE!
> Jan 13 02:46:10 org.opentest4j.AssertionFailedError: 
> Jan 13 02:46:10 
> Jan 13 02:46:10 expected: 
> Jan 13 02:46:10   "# table.q - CREATE/DROP/SHOW/ALTER/DESCRIBE TABLE
> Jan 13 02:46:10   #
> Jan 13 02:46:10   # Licensed to the Apache Software Foundation (ASF) under 
> one or more
> Jan 13 02:46:10   # contributor license agreements.  See the NOTICE file 
> distributed with
> Jan 13 02:46:10   # this work for additional information regarding copyright 
> ownership.
> Jan 13 02:46:10   # The ASF licenses this file to you under the Apache 
> License, Version 2.0
> Jan 13 02:46:10   # (the "License"); you may not use this file except in 
> compliance with
> Jan 13 02:46:10   # the License.  You may obtain a copy of the License at
> Jan 13 02:46:10   #
> Jan 13 02:46:10   # http://www.apache.org/licenses/LICENSE-2.0
> Jan 13 02:46:10   #
> Jan 13 02:46:10   # Unless required by applicable law or agreed to in 
> writing, software
> Jan 13 02:46:10   # distributed under the License is distributed on an "AS 
> IS" BASIS,
> Jan 13 02:46:10   # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
> express or implied.
> Jan 13 02:46:10   # See the License for the specific language governing 
> permissions and
> Jan 13 02:46:10   # limitations under the License.
> [...] {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-30677) SqlGatewayServiceStatementITCase.testFlinkSqlStatements fails

2023-01-18 Thread Shengkai Fang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengkai Fang reassigned FLINK-30677:
-

Assignee: Paul Lin  (was: Lin Yan)

> SqlGatewayServiceStatementITCase.testFlinkSqlStatements fails
> -
>
> Key: FLINK-30677
> URL: https://issues.apache.org/jira/browse/FLINK-30677
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Gateway
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Paul Lin
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> We're observing a test instability with 
> {{SqlGatewayServiceStatementITCase.testFlinkSqlStatements}} in the following 
> builds:
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44775&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=14251]
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44775&view=logs&j=de826397-1924-5900-0034-51895f69d4b7&t=f311e913-93a2-5a37-acab-4a63e1328f94&l=14608
> {code:java}
> Jan 13 02:46:10 [ERROR] Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 27.279 s <<< FAILURE! - in 
> org.apache.flink.table.gateway.service.SqlGatewayServiceStatementITCase
> Jan 13 02:46:10 [ERROR] 
> org.apache.flink.table.gateway.service.SqlGatewayServiceStatementITCase.testFlinkSqlStatements(String)[5]
>   Time elapsed: 1.573 s  <<< FAILURE!
> Jan 13 02:46:10 org.opentest4j.AssertionFailedError: 
> Jan 13 02:46:10 
> Jan 13 02:46:10 expected: 
> Jan 13 02:46:10   "# table.q - CREATE/DROP/SHOW/ALTER/DESCRIBE TABLE
> Jan 13 02:46:10   #
> Jan 13 02:46:10   # Licensed to the Apache Software Foundation (ASF) under 
> one or more
> Jan 13 02:46:10   # contributor license agreements.  See the NOTICE file 
> distributed with
> Jan 13 02:46:10   # this work for additional information regarding copyright 
> ownership.
> Jan 13 02:46:10   # The ASF licenses this file to you under the Apache 
> License, Version 2.0
> Jan 13 02:46:10   # (the "License"); you may not use this file except in 
> compliance with
> Jan 13 02:46:10   # the License.  You may obtain a copy of the License at
> Jan 13 02:46:10   #
> Jan 13 02:46:10   # http://www.apache.org/licenses/LICENSE-2.0
> Jan 13 02:46:10   #
> Jan 13 02:46:10   # Unless required by applicable law or agreed to in 
> writing, software
> Jan 13 02:46:10   # distributed under the License is distributed on an "AS 
> IS" BASIS,
> Jan 13 02:46:10   # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
> express or implied.
> Jan 13 02:46:10   # See the License for the specific language governing 
> permissions and
> Jan 13 02:46:10   # limitations under the License.
> [...] {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-30677) SqlGatewayServiceStatementITCase.testFlinkSqlStatements fails

2023-01-18 Thread Shengkai Fang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengkai Fang reassigned FLINK-30677:
-

Assignee: Lin Yan

> SqlGatewayServiceStatementITCase.testFlinkSqlStatements fails
> -
>
> Key: FLINK-30677
> URL: https://issues.apache.org/jira/browse/FLINK-30677
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Gateway
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Lin Yan
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> We're observing a test instability with 
> {{SqlGatewayServiceStatementITCase.testFlinkSqlStatements}} in the following 
> builds:
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44775&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=14251]
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44775&view=logs&j=de826397-1924-5900-0034-51895f69d4b7&t=f311e913-93a2-5a37-acab-4a63e1328f94&l=14608
> {code:java}
> Jan 13 02:46:10 [ERROR] Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 27.279 s <<< FAILURE! - in 
> org.apache.flink.table.gateway.service.SqlGatewayServiceStatementITCase
> Jan 13 02:46:10 [ERROR] 
> org.apache.flink.table.gateway.service.SqlGatewayServiceStatementITCase.testFlinkSqlStatements(String)[5]
>   Time elapsed: 1.573 s  <<< FAILURE!
> Jan 13 02:46:10 org.opentest4j.AssertionFailedError: 
> Jan 13 02:46:10 
> Jan 13 02:46:10 expected: 
> Jan 13 02:46:10   "# table.q - CREATE/DROP/SHOW/ALTER/DESCRIBE TABLE
> Jan 13 02:46:10   #
> Jan 13 02:46:10   # Licensed to the Apache Software Foundation (ASF) under 
> one or more
> Jan 13 02:46:10   # contributor license agreements.  See the NOTICE file 
> distributed with
> Jan 13 02:46:10   # this work for additional information regarding copyright 
> ownership.
> Jan 13 02:46:10   # The ASF licenses this file to you under the Apache 
> License, Version 2.0
> Jan 13 02:46:10   # (the "License"); you may not use this file except in 
> compliance with
> Jan 13 02:46:10   # the License.  You may obtain a copy of the License at
> Jan 13 02:46:10   #
> Jan 13 02:46:10   # http://www.apache.org/licenses/LICENSE-2.0
> Jan 13 02:46:10   #
> Jan 13 02:46:10   # Unless required by applicable law or agreed to in 
> writing, software
> Jan 13 02:46:10   # distributed under the License is distributed on an "AS 
> IS" BASIS,
> Jan 13 02:46:10   # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
> express or implied.
> Jan 13 02:46:10   # See the License for the specific language governing 
> permissions and
> Jan 13 02:46:10   # limitations under the License.
> [...] {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] fsk119 closed pull request #21700: [FLINK-30677][Test] Fix unstable SqlGatewayServiceStatementITCase.testFlinkSqlStatements

2023-01-18 Thread GitBox


fsk119 closed pull request #21700: [FLINK-30677][Test] Fix unstable 
SqlGatewayServiceStatementITCase.testFlinkSqlStatements
URL: https://github.com/apache/flink/pull/21700


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30752) Support 'EXPLAIN PLAN_ADVICE' statement in PyFlink

2023-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30752:
---
Labels: pull-request-available  (was: )

> Support 'EXPLAIN PLAN_ADVICE' statement in PyFlink
> --
>
> Key: FLINK-30752
> URL: https://issues.apache.org/jira/browse/FLINK-30752
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python
>Affects Versions: 1.17.0
>Reporter: Jane Chan
>Priority: Not a Priority
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> For API completeness



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] LadyForest opened a new pull request, #21727: [FLINK-30752][python] Support 'EXPLAIN PLAN_ADVICE' statement in PyFlink

2023-01-18 Thread GitBox


LadyForest opened a new pull request, #21727:
URL: https://github.com/apache/flink/pull/21727

   ## What is the purpose of the change
   
   This PR supports the `EXPLAIN PLAN_ADVICE` statement in PyFlink (needs a 
rebase once FLINK-30672 is merged).
   
   ## Brief change log
   
   Introduce the `ExplainDetail#PLAN_ADVICE` conversion between Python and Java.
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   - `test_explain`
   - `test_sql`
   - `basic_operations`
   - `test_table_environment_api`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): No
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: No
 - The serializers: No
 - The runtime per-record code paths (performance sensitive): No
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: No
 - The S3 file system connector: No
   
   ## Documentation
   
 - Does this pull request introduce a new feature? Yes
 - If yes, how is the feature documented? FLINK-30673
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] snuyanzin commented on pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-01-18 Thread GitBox


snuyanzin commented on PR #17830:
URL: https://github.com/apache/flink/pull/17830#issuecomment-1396515924

   
   @flinkbot run azure
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] 1996fanrui commented on pull request #21690: [FLINK-30623][runtime][bug] The canEmitBatchOfRecords should check recordWriter and changelogWriterAvailabilityProvider are available

2023-01-18 Thread GitBox


1996fanrui commented on PR #21690:
URL: https://github.com/apache/flink/pull/21690#issuecomment-1396507794

   > Are you suggesting that we should now couple the caller of DataOutput with 
the code that constructed it?
   
   In fact, the code in the master branch has worked this way for a long time.
   
   `OneInputStreamTask` has the inner class `StreamTaskNetworkOutput`, which is 
created by `OneInputStreamTask#createDataOutput`.
   
   `StreamMultipleInputProcessorFactory` has the inner class 
`StreamTaskSourceOutput`, which is created via 
`MultipleInputStreamTask#createInputProcessor -> 
StreamMultipleInputProcessorFactory#create`.
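To make the coupling concrete, here is a deliberately simplified sketch of the 
pattern being described (hypothetical names, not the actual Flink classes):

    // The task that constructs the DataOutput is also the task that is expected
    // to call it; the inner class relies on its enclosing instance's state.
    interface DataOutput<T> {
        // Returns true if the caller may keep emitting records in this batch.
        boolean emitRecord(T record) throws Exception;
    }

    class OneInputTask<T> {
        private class NetworkOutput implements DataOutput<T> {
            @Override
            public boolean emitRecord(T record) {
                process(record);
                // Implicitly assumes the enclosing task drives the processing loop.
                return canEmitBatchOfRecords();
            }
        }

        DataOutput<T> createDataOutput() {
            return new NetworkOutput();
        }

        void process(T record) { /* record processing */ }

        boolean canEmitBatchOfRecords() {
            // The real code checks e.g. writer availability; simplified here.
            return true;
        }
    }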


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-30752) Support 'EXPLAIN PLAN_ADVICE' statement in PyFlink

2023-01-18 Thread Jane Chan (Jira)
Jane Chan created FLINK-30752:
-

 Summary: Support 'EXPLAIN PLAN_ADVICE' statement in PyFlink
 Key: FLINK-30752
 URL: https://issues.apache.org/jira/browse/FLINK-30752
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Affects Versions: 1.17.0
Reporter: Jane Chan
 Fix For: 1.17.0


For API completeness



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] hehuiyuan commented on a diff in pull request #21668: [FLINK-30679][HIVE]Can not load the data of hive dim table when project-push-down is introduced

2023-01-18 Thread GitBox


hehuiyuan commented on code in PR #21668:
URL: https://github.com/apache/flink/pull/21668#discussion_r1080847440


##
flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/connectors/hive/HiveLookupJoinITCase.java:
##
@@ -378,6 +378,31 @@ public void 
testLookupJoinWithLookUpSourceProjectPushDown() throws Exception {
 assertEquals("[+I[1, a], +I[2, b], +I[3, c]]", results.toString());
 }
 
+@Test
+public void testLookupJoinPartitionTableWithLookUpSourceProjectPushDown() 
throws Exception {

Review Comment:
   When using project push-down, the non-partitioned table is OK (see the case 
testLookupJoinWithLookUpSourceProjectPushDown), but the partitioned table hits 
an error.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] chenqin closed pull request #21710: [FLINK-30667] [Table/Planner] remove hive-connector dep on table-planner internal class

2023-01-18 Thread GitBox


chenqin closed pull request #21710: [FLINK-30667] [Table/Planner] remove 
hive-connector dep on table-planner internal class
URL: https://github.com/apache/flink/pull/21710


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-connector-kafka] mas-chen commented on a diff in pull request #1: [FLINK-30052][Connectors/Kafka] Move existing Kafka connector code from Flink repo to dedicated Kafka repo

2023-01-18 Thread GitBox


mas-chen commented on code in PR #1:
URL: 
https://github.com/apache/flink-connector-kafka/pull/1#discussion_r1080844703


##
flink-connector-kafka-e2e-tests/flink-end-to-end-tests-common-kafka/src/test/java/org/apache/flink/tests/util/kafka/SQLClientKafkaITCase.java:
##
@@ -82,15 +86,27 @@ private static Configuration getConfiguration() {
 return flinkConfig;
 }
 
-@Rule

Review Comment:
   @zentol @MartijnVisser hope you enjoyed your break! Do you have time to 
revisit this discussion?
   
   It's been a while, so just a reminder: this is an e2e test that was previously 
@Ignore'd, and you left a comment to try to re-enable it since the JIRA ticket 
that was blocking this test was fixed. It's possible to support this with some 
improvements to the Flink testcontainer framework (I can go into the details of 
what is required). But I think the E2E test is redundant, as per my earlier 
comment on Dec 15.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] snuyanzin commented on pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-01-18 Thread GitBox


snuyanzin commented on PR #17830:
URL: https://github.com/apache/flink/pull/17830#issuecomment-1396484993

   
   @flinkbot run azure
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] hehuiyuan commented on a diff in pull request #21668: [FLINK-30679][HIVE]Can not load the data of hive dim table when project-push-down is introduced

2023-01-18 Thread GitBox


hehuiyuan commented on code in PR #21668:
URL: https://github.com/apache/flink/pull/21668#discussion_r1080838455


##
flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/connectors/hive/HiveLookupJoinITCase.java:
##
@@ -378,6 +378,31 @@ public void 
testLookupJoinWithLookUpSourceProjectPushDown() throws Exception {
 assertEquals("[+I[1, a], +I[2, b], +I[3, c]]", results.toString());
 }
 
+@Test
+public void testLookupJoinPartitionTableWithLookUpSourceProjectPushDown() 
throws Exception {

Review Comment:
   ok



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-30751) Remove references to disableDataSync in RocksDB documentation

2023-01-18 Thread David Christle (Jira)
David Christle created FLINK-30751:
--

 Summary: Remove references to disableDataSync in RocksDB 
documentation
 Key: FLINK-30751
 URL: https://issues.apache.org/jira/browse/FLINK-30751
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.16.0
Reporter: David Christle


The EmbeddedRocksDBStateBackend allows configuration using some predefined 
options via the .setPredefinedOptions method. The documentation for 
PredefinedOptions 
([link|https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.html])
 mentions setDataSync is called for {{FLASH_SSD_OPTIMIZED}} and 
{{SPINNING_DISK_OPTIMIZED}}.

 

But this option was removed several years ago in RocksDB 5.3.0 
([link|https://github.com/facebook/rocksdb/blob/main/HISTORY.md#530-2017-03-08]),
 and according to the code 
[PredefinedOptions.java|https://github.com/apache/flink/blob/0bbc7b1e9fed89b8c3e8ec67b7b0dad5999c2c01/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.java#L72],
 it is no longer actually set in Flink.

We should remove references to disableDataSync in PredefinedOptions.java, and 
in state_backend.py, so that it does not appear in the documentation.
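For reference, this is how users apply the predefined options in question; the 
snippet below is a minimal sketch (the option bundle chosen is just an example):

{code:java}
import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
import org.apache.flink.contrib.streaming.state.PredefinedOptions;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PredefinedOptionsExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        EmbeddedRocksDBStateBackend backend = new EmbeddedRocksDBStateBackend();
        // Applies a bundle of RocksDB tuning options; per this issue, the docs
        // for these bundles should no longer mention setDataSync.
        backend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);
        env.setStateBackend(backend);
    }
}
{code}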



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] flinkbot commented on pull request #21726: [FLINK-30748][docs]Translate "Overview" page of "Querys" into Chinese

2023-01-18 Thread GitBox


flinkbot commented on PR #21726:
URL: https://github.com/apache/flink/pull/21726#issuecomment-1396471685

   
   ## CI report:
   
   * 92a18af335774a0f49b3a22f26b8abbd576dd4bd UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-30748) Translate "Overview" page of "Querys" into Chinese

2023-01-18 Thread chenhaiyang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678294#comment-17678294
 ] 

chenhaiyang edited comment on FLINK-30748 at 1/19/23 5:45 AM:
--

Hi [~jark],
I have already translated the "overview" page of "Query"; the pull request has 
been submitted.

Please see [https://github.com/apache/flink/pull/21726].


was (Author: JIRAUSER298680):
Hi [~jark],
I want to translate the "overview" page of "Query". Please assign this issue to 
me. Thanks a lot.

> Translate "Overview" page of "Querys" into Chinese
> --
>
> Key: FLINK-30748
> URL: https://issues.apache.org/jira/browse/FLINK-30748
> Project: Flink
>  Issue Type: Sub-task
>  Components: chinese-translation, Documentation
>Reporter: chenhaiyang
>Priority: Major
>  Labels: pull-request-available
>
> The page url is 
> [https://nightlies.apache.org/flink/flink-docs-master/zh/docs/dev/table/sql/queries/overview]
>  
> The markdown file is located in 
> docs/content.zh/docs/dev/table/sql/queries/overview.md



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30680) Consider using the autoscaler to detect slow taskmanagers

2023-01-18 Thread Shammon (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678496#comment-17678496
 ] 

Shammon commented on FLINK-30680:
-

Thanks [~gyfora] for creating this issue. In fact, our team at ByteDance has 
developed a similar function in our Flink cluster, and we are trying to apply it 
in production. Our test results show that it is very effective against slow 
nodes in streaming processing.

As for the difference between 'detect slow tm' and restarting the job, and the 
overall proposal, [~wangm92] and [~Zhanghao Chen] can give more input.

> Consider using the autoscaler to detect slow taskmanagers
> -
>
> Key: FLINK-30680
> URL: https://issues.apache.org/jira/browse/FLINK-30680
> Project: Flink
>  Issue Type: New Feature
>  Components: Autoscaler, Kubernetes Operator
>Reporter: Gyula Fora
>Priority: Major
>
> We could leverage logic in the autoscaler to detect slow taskmanagers by 
> comparing the per-record processing times between them.
> If we notice that all subtasks on a single TM are considerably slower than 
> the rest (at similar input rates) we should try simply restarting the job 
> instead of scaling it up.
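As an illustration of the heuristic described above (the threshold and the 
metric source here are assumptions, not the autoscaler's actual implementation):

{code:java}
import java.util.Map;

/** Sketch: flag a TaskManager whose per-record processing time is far above its peers. */
public class SlowTaskManagerDetector {
    private static final double SLOWNESS_FACTOR = 2.0;

    /** @param avgMsPerRecordByTm TaskManager id -> average per-record processing time in ms. */
    public static String findSlowTaskManager(Map<String, Double> avgMsPerRecordByTm) {
        double mean =
                avgMsPerRecordByTm.values().stream()
                        .mapToDouble(Double::doubleValue)
                        .average()
                        .orElse(0.0);
        for (Map.Entry<String, Double> e : avgMsPerRecordByTm.entrySet()) {
            if (e.getValue() > SLOWNESS_FACTOR * mean) {
                // Candidate for a restart rather than a scale-up.
                return e.getKey();
            }
        }
        return null;
    }
}
{code}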



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] lincoln-lil commented on a diff in pull request #21683: [FLINK-30672][table] Support 'EXPLAIN PLAN_ADVICE' statement

2023-01-18 Thread GitBox


lincoln-lil commented on code in PR #21683:
URL: https://github.com/apache/flink/pull/21683#discussion_r1080828555


##
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/utils/FlinkRelOptUtil.scala:
##
@@ -90,6 +90,58 @@ object FlinkRelOptUtil {
 sw.toString
   }
 
+  /**
+   * Converts a sequence of relational expressions to a string. This is 
different from
+   * [[RelOptUtil]]#toString and overloaded [[FlinkRelOptUtil]]#toString on 
following points:
+   *   - Generated string by this method is in a tree style
+   *   - Generated string by this method may have more information about 
RelNode, such as

Review Comment:
   nit: -> 'Generated string by this method may have available [[PlanAdvice]]'? 
   'retractionTraits' is not easy for developers to understand, and I didn't see 
more 'withXX' args enabled by default.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30748) Translate "Overview" page of "Querys" into Chinese

2023-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30748:
---
Labels: pull-request-available  (was: )

> Translate "Overview" page of "Querys" into Chinese
> --
>
> Key: FLINK-30748
> URL: https://issues.apache.org/jira/browse/FLINK-30748
> Project: Flink
>  Issue Type: Sub-task
>  Components: chinese-translation, Documentation
>Reporter: chenhaiyang
>Priority: Major
>  Labels: pull-request-available
>
> The page url is 
> [https://nightlies.apache.org/flink/flink-docs-master/zh/docs/dev/table/sql/queries/overview]
>  
> The markdown file is located in 
> docs/content.zh/docs/dev/table/sql/queries/overview.md



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] NightRunner opened a new pull request, #21726: [FLINK-30748][docs]Translate "Overview" page of "Querys" into Chinese

2023-01-18 Thread GitBox


NightRunner opened a new pull request, #21726:
URL: https://github.com/apache/flink/pull/21726

   Translate "Overview" page of "Querys" into Chinese
   
   ## What is the purpose of the change
   
   Translate /dev/table/sql/queries/overview.md
   
   ## Brief change log
   
   Translate /dev/table/sql/queries/overview.md
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature?  no


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on pull request #21703: [FLINK-29880][hive] Introduce auto compaction for Hive sink in batch mode

2023-01-18 Thread GitBox


luoyuxia commented on PR #21703:
URL: https://github.com/apache/flink/pull/21703#issuecomment-1396453771

   @lsyldliu Thanks for reviewing. I have addressed your comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on a diff in pull request #21703: [FLINK-29880][hive] Introduce auto compaction for Hive sink in batch mode

2023-01-18 Thread GitBox


luoyuxia commented on code in PR #21703:
URL: https://github.com/apache/flink/pull/21703#discussion_r1080818665


##
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSink.java:
##
@@ -351,15 +365,211 @@ private DataStreamSink consume(
 }
 }
 
-private DataStreamSink createBatchSink(
+private DataStreamSink createBatchSink(
+DataStream dataStream,
+DataStructureConverter converter,
+HiveWriterFactory recordWriterFactory,
+TableMetaStoreFactory metaStoreFactory,
+OutputFileConfig.OutputFileConfigBuilder fileConfigBuilder,
+String stagingParentDir,
+StorageDescriptor sd,
+Properties tableProps,
+boolean isToLocal,
+boolean overwrite,
+int sinkParallelism)
+throws IOException {
+org.apache.flink.configuration.Configuration conf =
+new org.apache.flink.configuration.Configuration();
+catalogTable.getOptions().forEach(conf::setString);
+boolean autoCompaction = 
conf.getBoolean(FileSystemConnectorOptions.AUTO_COMPACTION);
+if (autoCompaction) {
+if (batchShuffleMode != BatchShuffleMode.ALL_EXCHANGES_BLOCKING) {

Review Comment:
   Thanks for figuring it out. After thinking it over, yes, it should still 
work in pipelined shuffle mode.
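For readers following along, a sketch of the user-facing setup that reaches the 
check in the diff above; `my_hive_table` and `my_source` are placeholders, and 
the `'auto-compaction'` key is assumed to be the one backing 
`FileSystemConnectorOptions.AUTO_COMPACTION`:

    import org.apache.flink.api.common.BatchShuffleMode;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.configuration.ExecutionOptions;
    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class HiveBatchAutoCompactionSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Auto compaction for the batch Hive sink requires blocking shuffle.
            conf.set(
                    ExecutionOptions.BATCH_SHUFFLE_MODE,
                    BatchShuffleMode.ALL_EXCHANGES_BLOCKING);
            TableEnvironment tEnv =
                    TableEnvironment.create(
                            EnvironmentSettings.newInstance()
                                    .inBatchMode()
                                    .withConfiguration(conf)
                                    .build());
            // Enable compaction per query via a dynamic table option.
            tEnv.executeSql(
                            "INSERT INTO my_hive_table "
                                    + "/*+ OPTIONS('auto-compaction' = 'true') */ "
                                    + "SELECT * FROM my_source")
                    .await();
        }
    }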



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on a diff in pull request #21703: [FLINK-29880][hive] Introduce auto compaction for Hive sink in batch mode

2023-01-18 Thread GitBox


luoyuxia commented on code in PR #21703:
URL: https://github.com/apache/flink/pull/21703#discussion_r1080814492


##
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveOptions.java:
##
@@ -140,6 +140,15 @@ public class HiveOptions {
 public static final ConfigOption 
SINK_PARTITION_COMMIT_SUCCESS_FILE_NAME =
 FileSystemConnectorOptions.SINK_PARTITION_COMMIT_SUCCESS_FILE_NAME;
 
+public static final ConfigOption COMPACT_SMALL_FILES_AVG_SIZE =
+key("compaction.small-files.avg-size")
+.memoryType()
+.defaultValue(MemorySize.ofMebiBytes(16))

Review Comment:
   I'm not sure. But It comes for Hive, I think  it may be  reasonable at least 
to Hive user.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on a diff in pull request #21703: [FLINK-29880][hive] Introduce auto compaction for Hive sink in batch mode

2023-01-18 Thread GitBox


luoyuxia commented on code in PR #21703:
URL: https://github.com/apache/flink/pull/21703#discussion_r1080814012


##
flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/table/batch/BatchSink.java:
##
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.file.table.batch;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.connector.file.table.FileSystemFactory;
+import org.apache.flink.connector.file.table.FileSystemOutputFormat;
+import org.apache.flink.connector.file.table.PartitionCommitPolicyFactory;
+import org.apache.flink.connector.file.table.TableMetaStoreFactory;
+import 
org.apache.flink.connector.file.table.batch.compact.BatchCompactCoordinator;
+import 
org.apache.flink.connector.file.table.batch.compact.BatchCompactOperator;
+import 
org.apache.flink.connector.file.table.batch.compact.BatchPartitionCommitterSink;
+import 
org.apache.flink.connector.file.table.stream.compact.CompactBucketWriter;
+import org.apache.flink.connector.file.table.stream.compact.CompactMessages;
+import 
org.apache.flink.connector.file.table.stream.compact.CompactMessages.CoordinatorInput;
+import org.apache.flink.connector.file.table.stream.compact.CompactReader;
+import org.apache.flink.connector.file.table.stream.compact.CompactWriter;
+import org.apache.flink.core.fs.FileSystem;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.DataStreamSink;
+import org.apache.flink.streaming.api.functions.sink.filesystem.BucketWriter;
+import 
org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
+import org.apache.flink.table.catalog.ObjectIdentifier;
+import org.apache.flink.table.connector.sink.DynamicTableSink;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.function.SupplierWithException;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.LinkedHashMap;
+
+/** Helper for creating batch file sink. */
+@Internal
+public class BatchSink {
+private BatchSink() {}
+
+public static DataStreamSink createBatchNoAutoCompactSink(

Review Comment:
   I still prefer `createBatchNoAutoCompactSink`, which is consistent with 
`createBatchCompactSink`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on a diff in pull request #21703: [FLINK-29880][hive] Introduce auto compaction for Hive sink in batch mode

2023-01-18 Thread GitBox


luoyuxia commented on code in PR #21703:
URL: https://github.com/apache/flink/pull/21703#discussion_r1080813573


##
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSink.java:
##
@@ -351,15 +365,211 @@ private DataStreamSink consume(
 }
 }
 
-private DataStreamSink createBatchSink(
+private DataStreamSink createBatchSink(
+DataStream dataStream,
+DataStructureConverter converter,
+HiveWriterFactory recordWriterFactory,
+TableMetaStoreFactory metaStoreFactory,
+OutputFileConfig.OutputFileConfigBuilder fileConfigBuilder,
+String stagingParentDir,
+StorageDescriptor sd,
+Properties tableProps,
+boolean isToLocal,
+boolean overwrite,
+int sinkParallelism)
+throws IOException {
+org.apache.flink.configuration.Configuration conf =
+new org.apache.flink.configuration.Configuration();
+catalogTable.getOptions().forEach(conf::setString);
+boolean autoCompaction = 
conf.getBoolean(FileSystemConnectorOptions.AUTO_COMPACTION);
+if (autoCompaction) {
+if (batchShuffleMode != BatchShuffleMode.ALL_EXCHANGES_BLOCKING) {
+throw new UnsupportedOperationException(
+String.format(
+"Auto compaction for Hive sink in batch mode 
is not supported when the %s is not %s.",
+ExecutionOptions.BATCH_SHUFFLE_MODE.key(),
+BatchShuffleMode.ALL_EXCHANGES_BLOCKING));
+}
+Optional compactParallelismOptional =
+
conf.getOptional(FileSystemConnectorOptions.COMPACTION_PARALLELISM);
+int compactParallelism = 
compactParallelismOptional.orElse(sinkParallelism);
+return createBatchCompactSink(
+dataStream,
+converter,
+recordWriterFactory,
+metaStoreFactory,
+fileConfigBuilder
+.withPartPrefix(
+BatchCompactOperator.UNCOMPACTED_PREFIX
++ "part-"
++ UUID.randomUUID())
+.build(),
+stagingParentDir,
+sd,
+tableProps,
+isToLocal,
+overwrite,
+sinkParallelism,
+compactParallelism);
+} else {
+return createBatchNoCompactSink(
+dataStream,
+converter,
+recordWriterFactory,
+metaStoreFactory,
+fileConfigBuilder.build(),
+stagingParentDir,
+isToLocal,
+sinkParallelism);
+}
+}
+
+private DataStreamSink createBatchCompactSink(
 DataStream dataStream,
 DataStructureConverter converter,
 HiveWriterFactory recordWriterFactory,
 TableMetaStoreFactory metaStoreFactory,
 OutputFileConfig fileNaming,
 String stagingParentDir,
+StorageDescriptor sd,
+Properties tableProps,
 boolean isToLocal,
-final int parallelism)
+boolean overwrite,
+final int sinkParallelism,
+final int compactParallelism)
+throws IOException {
+String[] partitionColumns = getPartitionKeyArray();
+org.apache.flink.configuration.Configuration conf =
+new org.apache.flink.configuration.Configuration();
+catalogTable.getOptions().forEach(conf::setString);
+HadoopFileSystemFactory fsFactory = fsFactory();
+org.apache.flink.core.fs.Path tmpPath =
+new 
org.apache.flink.core.fs.Path(toStagingDir(stagingParentDir, jobConf));
+
+PartitionCommitPolicyFactory partitionCommitPolicyFactory =
+new PartitionCommitPolicyFactory(
+
conf.get(HiveOptions.SINK_PARTITION_COMMIT_POLICY_KIND),

Review Comment:
   No, we don't. In batch mode, it'll always commit partitions even if the 
`metastore` policy has been configured.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-30745) Check-pointing with Azure Data Lake Storage

2023-01-18 Thread Dheeraj Panangat (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678486#comment-17678486
 ] 

Dheeraj Panangat commented on FLINK-30745:
--

Hi [~surendralilhore],
We have not specified any value for this property: 
fs.azure.account.keyprovider.
We are not encrypting the secrets as of now.

Just to add, we also use Hudi with Flink, which also uses Azure Blob Storage and 
works fine.

The difference between them is that Hudi uses the Hadoop jars/classes to detect 
the file system, while checkpointing uses the file system that is part of the 
Flink jars/classes (the shaded classes).

> Check-pointing with Azure Data Lake Storage
> ---
>
> Key: FLINK-30745
> URL: https://issues.apache.org/jira/browse/FLINK-30745
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Affects Versions: 1.15.2, 1.14.6
>Reporter: Dheeraj Panangat
>Priority: Major
>
> Hi,
> While checkpointing to Azure Blob Storage using Flink, we get the following 
> error :
> {code:java}
> Caused by: Configuration property .dfs.core.windows.net not 
> found.
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:372)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:1133)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.(AzureBlobFileSystemStore.java:174)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:110)
>  {code}
> We have given the configurations in core-site.xml too for following
> {code:java}
> fs.hdfs.impl
> fs.abfs.impl -> org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem
> fs.file.impl
> fs.azure.account.auth.type
> fs.azure.account.oauth.provider.type
> fs.azure.account.oauth2.client.id
> fs.azure.account.oauth2.client.secret
> fs.azure.account.oauth2.client.endpoint
> fs.azure.createRemoteFileSystemDuringInitialization -> true {code}
> On debugging found that flink reads from core-default-shaded.xml, but even if 
> the properties are specified there, the default configs are not loaded and we 
> get a different exception as :
> {code:java}
> Caused by: Unable to load key provider class.
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getTokenProvider(AbfsConfiguration.java:540)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:1136)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.(AzureBlobFileSystemStore.java:174)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:110)
>  {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] lsyldliu commented on a diff in pull request #21703: [FLINK-29880][hive] Introduce auto compaction for Hive sink in batch mode

2023-01-18 Thread GitBox


lsyldliu commented on code in PR #21703:
URL: https://github.com/apache/flink/pull/21703#discussion_r1080776842


##
flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/table/batch/BatchSink.java:
##
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.file.table.batch;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.connector.file.table.FileSystemFactory;
+import org.apache.flink.connector.file.table.FileSystemOutputFormat;
+import org.apache.flink.connector.file.table.PartitionCommitPolicyFactory;
+import org.apache.flink.connector.file.table.TableMetaStoreFactory;
+import 
org.apache.flink.connector.file.table.batch.compact.BatchCompactCoordinator;
+import 
org.apache.flink.connector.file.table.batch.compact.BatchCompactOperator;
+import 
org.apache.flink.connector.file.table.batch.compact.BatchPartitionCommitterSink;
+import 
org.apache.flink.connector.file.table.stream.compact.CompactBucketWriter;
+import org.apache.flink.connector.file.table.stream.compact.CompactMessages;
+import 
org.apache.flink.connector.file.table.stream.compact.CompactMessages.CoordinatorInput;
+import org.apache.flink.connector.file.table.stream.compact.CompactReader;
+import org.apache.flink.connector.file.table.stream.compact.CompactWriter;
+import org.apache.flink.core.fs.FileSystem;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.DataStreamSink;
+import org.apache.flink.streaming.api.functions.sink.filesystem.BucketWriter;
+import 
org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
+import org.apache.flink.table.catalog.ObjectIdentifier;
+import org.apache.flink.table.connector.sink.DynamicTableSink;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.function.SupplierWithException;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.LinkedHashMap;
+
+/** Helper for creating batch file sink. */
+@Internal
+public class BatchSink {
+private BatchSink() {}
+
+public static DataStreamSink createBatchNoAutoCompactSink(
+DataStream dataStream,
+DynamicTableSink.DataStructureConverter converter,
+FileSystemOutputFormat fileSystemOutputFormat,
+final int parallelism)
+throws IOException {
+return dataStream
+.map((MapFunction) value -> (Row) 
converter.toExternal(value))
+.setParallelism(parallelism)
+.writeUsingOutputFormat(fileSystemOutputFormat)
+.setParallelism(parallelism);
+}
+
+public static  DataStreamSink createBatchCompactSink(
+DataStream dataStream,
+StreamingFileSink.BucketsBuilder<
+T, String, ? extends 
StreamingFileSink.BucketsBuilder>
+builder,
+CompactReader.Factory readFactory,
+FileSystemFactory fsFactory,
+TableMetaStoreFactory metaStoreFactory,
+PartitionCommitPolicyFactory partitionCommitPolicyFactory,
+String[] partitionColumns,
+LinkedHashMap staticPartitionSpec,
+Path tmpPath,
+ObjectIdentifier identifier,
+final long compactAverageSize,
+final long compactTargetSize,
+boolean isToLocal,
+boolean overwrite,
+final int compactParallelism)
+throws IOException {

Review Comment:
   Ditto




[GitHub] [flink] lindong28 commented on pull request #21690: [FLINK-30623][runtime][bug] The canEmitBatchOfRecords should check recordWriter and changelogWriterAvailabilityProvider are available

2023-01-18 Thread GitBox


lindong28 commented on PR #21690:
URL: https://github.com/apache/flink/pull/21690#issuecomment-1396411041

   > For example: MultipleInputStreamTask[1] and OneInputStreamTask[2] have 
corresponding CheckpointBarrierHandler, if MultipleInputStreamTask calls 
CheckpointBarrierHandler of OneInputStreamTask, there should be similar bugs.
   
   Indeed, ultimately the caller will have to choose the right caller instance. 
On the other hand, the interface or abstract method is typically expected to 
have clean and self-contained semantics so that it de-couples the caller from 
the method implementation as much as possible.
   
   `boolean DataOutput#emitRecord` indeed would work as long as it is always 
called by the same StreamTask instance that constructed it. But it adds more 
coupling here and you would have to explain this API's behavior based on who is 
the caller and who constructed this DataOutput, which IMO is an anti-pattern 
for abstract/interface methods.
   
   If you both think that this approach is cleaner and that it is OK to have an 
abstract method's semantics depend on its caller, I am fine with stopping here 
and letting this PR in.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-30191) Update py4j from 0.10.9.3 to 0.10.9.7

2023-01-18 Thread Huang Xingbo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huang Xingbo closed FLINK-30191.

Fix Version/s: 1.17.0
   Resolution: Done

Merged into master via 0bbc7b1e9fed89b8c3e8ec67b7b0dad5999c2c01

> Update py4j from 0.10.9.3 to 0.10.9.7
> -
>
> Key: FLINK-30191
> URL: https://issues.apache.org/jira/browse/FLINK-30191
> Project: Flink
>  Issue Type: Technical Debt
>  Components: API / Python
>Reporter: Martijn Visser
>Assignee: Yunfeng Zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] HuangXingBo closed pull request #21680: [FLINK-30191][python] Update net.sf.py4j:py4j dependency to 0.10.9.7

2023-01-18 Thread GitBox


HuangXingBo closed pull request #21680: [FLINK-30191][python] Update 
net.sf.py4j:py4j dependency to 0.10.9.7
URL: https://github.com/apache/flink/pull/21680


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #21725: [FLINK-2900][python] Support python UDF in the SQL Gateway

2023-01-18 Thread GitBox


flinkbot commented on PR #21725:
URL: https://github.com/apache/flink/pull/21725#issuecomment-1396402673

   
   ## CI report:
   
   * 1af7133e05aa5bcd2277b10095fcb95b220b683d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-29000) Support python UDF in the SQL Gateway

2023-01-18 Thread Huang Xingbo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huang Xingbo updated FLINK-29000:
-
Fix Version/s: 1.17.0

> Support python UDF in the SQL Gateway
> -
>
> Key: FLINK-29000
> URL: https://issues.apache.org/jira/browse/FLINK-29000
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python, Table SQL / Gateway
>Affects Versions: 1.17.0
>Reporter: Shengkai Fang
>Assignee: Xingbo Huang
>Priority: Major
> Fix For: 1.17.0
>
>
> Currently Flink SQL Client supports python UDF, the Gateway should also 
> support this feature if the SQL Client is able to submit SQL to the Gateway.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-2900) Remove Record-API dependencies from Hadoop Compat module

2023-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-2900:
--
Labels: pull-request-available  (was: )

> Remove Record-API dependencies from Hadoop Compat module
> 
>
> Key: FLINK-2900
> URL: https://issues.apache.org/jira/browse/FLINK-2900
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataSet
>Affects Versions: 0.10.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> The Hadoop Compat module includes wrappers for Hadoop Input/OutputFormat for 
> the Record API classes and a corresponding test.
> These need to be removed before removing the Record API.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] HuangXingBo opened a new pull request, #21725: [FLINK-2900][python] Support python UDF in the SQL Gateway

2023-01-18 Thread GitBox


HuangXingBo opened a new pull request, #21725:
URL: https://github.com/apache/flink/pull/21725

   ## What is the purpose of the change
   
   *This pull request will support python UDF in the SQL Gateway*
   
   
   ## Brief change log
   
 - *Support python UDF in the SQL Gateway*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30750) CompactActionITCase.testBatchCompact in table store is unstable

2023-01-18 Thread Shammon (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shammon updated FLINK-30750:

Description: 
https://github.com/apache/flink-table-store/actions/runs/3954989166/jobs/6772877033


2023-01-17T11:45:17.9511390Z [INFO] Results:
2023-01-17T11:45:17.9511641Z [INFO] 
2023-01-17T11:45:17.9511838Z [ERROR] Errors: 
2023-01-17T11:45:17.9512585Z [ERROR]   CompactActionITCase.testBatchCompact » 
JobExecution Job execution failed.
2023-01-17T11:45:17.9512964Z [INFO] 
2023-01-17T11:45:17.9513223Z [ERROR] Tests run: 224, Failures: 0, Errors: 1, 
Skipped: 4



  was:
https://github.com/apache/flink-table-store/actions/runs/3938700717/jobs/6737695722


2023-01-17T11:45:17.9511390Z [INFO] Results:
2023-01-17T11:45:17.9511641Z [INFO] 
2023-01-17T11:45:17.9511838Z [ERROR] Errors: 
2023-01-17T11:45:17.9512585Z [ERROR]   CompactActionITCase.testBatchCompact » 
JobExecution Job execution failed.
2023-01-17T11:45:17.9512964Z [INFO] 
2023-01-17T11:45:17.9513223Z [ERROR] Tests run: 224, Failures: 0, Errors: 1, 
Skipped: 4




> CompactActionITCase.testBatchCompact in table store is unstable
> ---
>
> Key: FLINK-30750
> URL: https://issues.apache.org/jira/browse/FLINK-30750
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Affects Versions: table-store-0.4.0
>Reporter: Shammon
>Priority: Major
>
> https://github.com/apache/flink-table-store/actions/runs/3954989166/jobs/6772877033
> 2023-01-17T11:45:17.9511390Z [INFO] Results:
> 2023-01-17T11:45:17.9511641Z [INFO] 
> 2023-01-17T11:45:17.9511838Z [ERROR] Errors: 
> 2023-01-17T11:45:17.9512585Z [ERROR]   CompactActionITCase.testBatchCompact » 
> JobExecution Job execution failed.
> 2023-01-17T11:45:17.9512964Z [INFO] 
> 2023-01-17T11:45:17.9513223Z [ERROR] Tests run: 224, Failures: 0, Errors: 1, 
> Skipped: 4



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] lindong28 commented on pull request #21690: [FLINK-30623][runtime][bug] The canEmitBatchOfRecords should check recordWriter and changelogWriterAvailabilityProvider are available

2023-01-18 Thread GitBox


lindong28 commented on PR #21690:
URL: https://github.com/apache/flink/pull/21690#issuecomment-1396398435

   > It's strange that MultiInputStreamTask-D constructed a DataOutput-C, and 
OneInputStreamTask-B calls the DataOutput-
   > C::emitRecord constructed for MultiInputStreamTask-D. I think it should be 
a bug if it happens.
   
   @1996fanrui Hmm... I thought we had agreed that "A and B should not care 
about how the interface of C is determined by D". This is because DataOutput is 
an interface and the caller should be able to use it based on its API semantics 
without knowing who constructed it, right?
   
   Do you suggest that we should now couple the caller of the DataOutput with 
the code that constructed it? It seems to break the typical pattern of how we 
use interfaces.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (FLINK-29722) Supports hive max function by native implementation

2023-01-18 Thread Godfrey He (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Godfrey He reassigned FLINK-29722:
--

Assignee: dalongliu

> Supports hive max function by native implementation
> ---
>
> Key: FLINK-29722
> URL: https://issues.apache.org/jira/browse/FLINK-29722
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: dalongliu
>Assignee: dalongliu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-29722) Supports hive max function by native implementation

2023-01-18 Thread Godfrey He (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Godfrey He closed FLINK-29722.
--
Resolution: Fixed

Fixed in 1.17.0: 74c7188ae9898b492c94a472d9d407bf4f8e0876

> Supports hive max function by native implementation
> ---
>
> Key: FLINK-29722
> URL: https://issues.apache.org/jira/browse/FLINK-29722
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: dalongliu
>Assignee: dalongliu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29722) Supports hive max function by native implementation

2023-01-18 Thread Godfrey He (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Godfrey He updated FLINK-29722:
---
Component/s: Connectors / Hive

> Supports hive max function by native implementation
> ---
>
> Key: FLINK-29722
> URL: https://issues.apache.org/jira/browse/FLINK-29722
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: dalongliu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] godfreyhe closed pull request #21605: [FLINK-29722][hive] Supports native hive max function for hive dialect

2023-01-18 Thread GitBox


godfreyhe closed pull request #21605: [FLINK-29722][hive] Supports native hive 
max function for hive dialect
URL: https://github.com/apache/flink/pull/21605


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-30750) CompactActionITCase.testBatchCompact in table store is unstable

2023-01-18 Thread Shammon (Jira)
Shammon created FLINK-30750:
---

 Summary: CompactActionITCase.testBatchCompact in table store is 
unstable
 Key: FLINK-30750
 URL: https://issues.apache.org/jira/browse/FLINK-30750
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Affects Versions: table-store-0.4.0
Reporter: Shammon


https://github.com/apache/flink-table-store/actions/runs/3938700717/jobs/6737695722


2023-01-17T11:45:17.9511390Z [INFO] Results:
2023-01-17T11:45:17.9511641Z [INFO] 
2023-01-17T11:45:17.9511838Z [ERROR] Errors: 
2023-01-17T11:45:17.9512585Z [ERROR]   CompactActionITCase.testBatchCompact » 
JobExecution Job execution failed.
2023-01-17T11:45:17.9512964Z [INFO] 
2023-01-17T11:45:17.9513223Z [ERROR] Tests run: 224, Failures: 0, Errors: 1, 
Skipped: 4





--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] godfreyhe commented on a diff in pull request #21586: [FLINK-30542][table-runtime] Introduce adaptive hash aggregate to adaptively determine whether local hash aggregate is required a

2023-01-18 Thread GitBox


godfreyhe commented on code in PR #21586:
URL: https://github.com/apache/flink/pull/21586#discussion_r1073610534


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/batch/BatchExecHashAggregate.java:
##
@@ -58,6 +58,7 @@ public class BatchExecHashAggregate extends 
ExecNodeBase
 private final RowType aggInputRowType;
 private final boolean isMerge;
 private final boolean isFinal;
+private final boolean canDoAdaptiveHashAgg;

Review Comment:
   how about `supportAdaptiveLocalHashAgg` ? 



##
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/agg/batch/HashAggCodeGenerator.scala:
##
@@ -38,32 +42,52 @@ import org.apache.calcite.tools.RelBuilder
  * aggregateBuffers should be update(e.g.: setInt) in [[BinaryRowData]]. (Hash 
Aggregate performs
  * much better than Sort Aggregate).
  */
-class HashAggCodeGenerator(
-ctx: CodeGeneratorContext,
-builder: RelBuilder,
-aggInfoList: AggregateInfoList,
-inputType: RowType,
-outputType: RowType,
-grouping: Array[Int],
-auxGrouping: Array[Int],
-isMerge: Boolean,
-isFinal: Boolean) {
+object HashAggCodeGenerator {
 
-  private lazy val aggInfos: Array[AggregateInfo] = aggInfoList.aggInfos
+  // It is an experimental config and may be removed later.
+  @Experimental
+  val TABLE_EXEC_ADAPTIVE_LOCAL_HASH_AGG_ENABLED: ConfigOption[JBoolean] =
+key("table.exec.adaptive-local-hash-agg.enabled")
+  .booleanType()
+  .defaultValue(Boolean.box(true))
+  .withDescription("Whether to enable adaptive local hash agg")

Review Comment:
   Whether to enable adaptive local hash aggregation; this is only used for 
batch jobs. The default value is true. 



##
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/agg/batch/HashAggCodeGenerator.scala:
##
@@ -38,32 +42,52 @@ import org.apache.calcite.tools.RelBuilder
  * aggregateBuffers should be update(e.g.: setInt) in [[BinaryRowData]]. (Hash 
Aggregate performs
  * much better than Sort Aggregate).
  */
-class HashAggCodeGenerator(
-ctx: CodeGeneratorContext,
-builder: RelBuilder,
-aggInfoList: AggregateInfoList,
-inputType: RowType,
-outputType: RowType,
-grouping: Array[Int],
-auxGrouping: Array[Int],
-isMerge: Boolean,
-isFinal: Boolean) {
+object HashAggCodeGenerator {
 
-  private lazy val aggInfos: Array[AggregateInfo] = aggInfoList.aggInfos
+  // It is an experimental config and may be removed later.
+  @Experimental
+  val TABLE_EXEC_ADAPTIVE_LOCAL_HASH_AGG_ENABLED: ConfigOption[JBoolean] =
+key("table.exec.adaptive-local-hash-agg.enabled")
+  .booleanType()
+  .defaultValue(Boolean.box(true))
+  .withDescription("Whether to enable adaptive local hash agg")
 
-  private lazy val functionIdentifiers: Map[AggregateFunction[_, _], String] =
-AggCodeGenHelper.getFunctionIdentifiers(aggInfos)
+  @Experimental
+  val TABLE_EXEC_ADAPTIVE_LOCAL_HASH_AGG_SAMPLE_POINT: ConfigOption[JLong] =
+key("table.exec.adaptive-local-hash-agg.sample-threshold")
+  .longType()
+  .defaultValue(Long.box(500L))
+  .withDescription("If adaptive local hash agg is enabled, "
++ "the proportion of distinct value will be checked after reading this 
number of records")

Review Comment:
   If adaptive local hash aggregation is enabled, this value defines how many 
records will be used as sampled data to calculate the distinct value rate (see 
`TABLE_EXEC_ADAPTIVE_LOCAL_HASH_AGG_LIMIT_DISTINCT_RATIO`) for the local 
aggregate. The higher the sampling threshold, the more accurate the distinct 
value rate is. But as the sampling threshold increases, local aggregation is 
meaningless when the distinct value rate is low. The default value is 500.
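   
   To make the sampling semantics concrete, a minimal sketch (hypothetical 
class and field names; the real operator is code-generated, and the ratio 
limit corresponds to the `TABLE_EXEC_ADAPTIVE_LOCAL_HASH_AGG_LIMIT_DISTINCT_RATIO` 
option referenced above):
   
   ```java
   // Sketch only: the real logic lives in generated operator code.
   final class AdaptiveLocalAggSampler {
       private final long sampleThreshold;       // e.g. 500, per the option above
       private final double distinctRatioLimit;  // assumed companion option
       private final java.util.Set<Object> sampledKeys = new java.util.HashSet<>();
       private long seen;
       private boolean decided;
       private boolean keepAggregating = true;
   
       AdaptiveLocalAggSampler(long sampleThreshold, double distinctRatioLimit) {
           this.sampleThreshold = sampleThreshold;
           this.distinctRatioLimit = distinctRatioLimit;
       }
   
       /** Returns true while records should still go through local aggregation. */
       boolean onRecord(Object groupKey) {
           if (!decided) {
               sampledKeys.add(groupKey);
               if (++seen >= sampleThreshold) {
                   // A high distinct value rate means local aggregation barely
                   // reduces the data, so fall back to pass-through.
                   keepAggregating =
                           (double) sampledKeys.size() / seen <= distinctRatioLimit;
                   decided = true;
                   sampledKeys.clear();
               }
           }
           return keepAggregating;
       }
   }
   ```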



##
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/agg/batch/HashAggCodeGenerator.scala:
##
@@ -38,32 +42,52 @@ import org.apache.calcite.tools.RelBuilder
  * aggregateBuffers should be update(e.g.: setInt) in [[BinaryRowData]]. (Hash 
Aggregate performs
  * much better than Sort Aggregate).
  */
-class HashAggCodeGenerator(
-ctx: CodeGeneratorContext,
-builder: RelBuilder,
-aggInfoList: AggregateInfoList,
-inputType: RowType,
-outputType: RowType,
-grouping: Array[Int],
-auxGrouping: Array[Int],
-isMerge: Boolean,
-isFinal: Boolean) {
+object HashAggCodeGenerator {
 
-  private lazy val aggInfos: Array[AggregateInfo] = aggInfoList.aggInfos
+  // It is an experimental config and may be removed later.
+  @Experimental
+  val TABLE_EXEC_ADAPTIVE_LOCAL_HASH_AGG_ENABLED: ConfigOption[JBoolean] =
+key("table.exec.adaptive-local-hash-agg.enabled")
+  .booleanType()
+  .defaultValue(Boolean.box(true))
+  .withDescription("Whether to enable adaptive local hash agg")
 
-  private lazy val functionIdentifiers: 

[GitHub] [flink-table-store] zjureel opened a new pull request, #489: [FLINK-30038] Fix hive e2e fail

2023-01-18 Thread GitBox


zjureel opened a new pull request, #489:
URL: https://github.com/apache/flink-table-store/pull/489

   Fix hive e2e fail
   
   Caused by: org.testcontainers.containers.ContainerLaunchException: Timed out 
waiting for log output matching '.*Starting HiveServer2.*'
   at 
org.testcontainers.containers.wait.strategy.LogMessageWaitStrategy.waitUntilReady(LogMessageWaitStrategy.java:49)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30038) HiveE2E test is not stable

2023-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30038:
---
Labels: pull-request-available  (was: )

> HiveE2E test is not stable
> --
>
> Key: FLINK-30038
> URL: https://issues.apache.org/jira/browse/FLINK-30038
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Affects Versions: table-store-0.3.0
>Reporter: Shammon
>Priority: Major
>  Labels: pull-request-available
>
> https://github.com/apache/flink-table-store/actions/runs/3476726197/jobs/5812201704
> Caused by: org.testcontainers.containers.ContainerLaunchException: Timed out 
> waiting for log output matching '.*Starting HiveServer2.*'
>   at 
> org.testcontainers.containers.wait.strategy.LogMessageWaitStrategy.waitUntilReady(LogMessageWaitStrategy.java:49)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] lsyldliu commented on a diff in pull request #21724: [FLINK-30727][table-planner] Fix JoinReorderITCaseBase.testBushyTreeJoinReorder failed

2023-01-18 Thread GitBox


lsyldliu commented on code in PR #21724:
URL: https://github.com/apache/flink/pull/21724#discussion_r1080764519


##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/runtime/stream/sql/join/JoinReorderITCase.java:
##
@@ -42,6 +45,12 @@ public class JoinReorderITCase extends JoinReorderITCaseBase 
{
 
 private StreamExecutionEnvironment env;
 
+@AfterEach
+public void after() {

Review Comment:
   Is this the root cause? Did you reproduce the failed test locally?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] 1996fanrui commented on pull request #21690: [FLINK-30623][runtime][bug] The canEmitBatchOfRecords should check recordWriter and changelogWriterAvailabilityProvider are available

2023-01-18 Thread GitBox


1996fanrui commented on PR #21690:
URL: https://github.com/apache/flink/pull/21690#issuecomment-1396379268

   Hi @lindong28 
   It's strange that `MultiInputStreamTask-D` constructed a `DataOutput-C`, and 
 `OneInputStreamTask-B` calls the `DataOutput-C::emitRecord` constructed for 
`MultiInputStreamTask-D`. I think it should be a bug if it happens.
   
   `DataOutput::emitRecord` semantics should be correct if each `StreamTask` 
calls the corresponding `DataOutput`. If the correct DataOutput is not 
guaranteed to be called, there will be many problems with the code.
   
   For example: `MultipleInputStreamTask`[1] and `OneInputStreamTask`[2] have 
corresponding `CheckpointBarrierHandler`s; if `MultipleInputStreamTask` called 
the `CheckpointBarrierHandler` of `OneInputStreamTask`, there would be similar 
bugs.
   
   `CheckpointBarrierHandler` is similar. `MultipleInputStreamTask` and 
`OneInputStreamTask` create different `CheckpointBarrierHandlers`, and then 
call `checkpointBarrierHandler.processBarrier`. `StreamTask` does not need to 
care about the logic inside `checkpointBarrierHandler.processBarrier`. If there 
is a `return value`, the caller is only responsible for using the `return 
value` and doesn't care how the `return value` is generated. Similarly, there 
may be different return values depending on the StreamTask.
   
   [1] 
https://github.com/apache/flink/blob/0b8a83ce54d39d0d5a5b82573c5037f306e9f7f7/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/MultipleInputStreamTask.java#L73
   [2] 
https://github.com/apache/flink/blob/0b8a83ce54d39d0d5a5b82573c5037f306e9f7f7/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/OneInputStreamTask.java#L67


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] link3280 commented on pull request #21581: [FLINK-30538][SQL gateway/client] Improve error handling of stop job operation

2023-01-18 Thread GitBox


link3280 commented on PR #21581:
URL: https://github.com/apache/flink/pull/21581#issuecomment-1396377724

   ping @fsk119 . It should be a quick one :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] link3280 commented on pull request #21700: [FLINK-30677][Test] Fix unstable SqlGatewayServiceStatementITCase.testFlinkSqlStatements

2023-01-18 Thread GitBox


link3280 commented on PR #21700:
URL: https://github.com/apache/flink/pull/21700#issuecomment-1396376577

   Thanks a lot @fsk119 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] Myasuka commented on pull request #21714: [FLINK-30724][docs] Update kafka-per-partition watermark to new source

2023-01-18 Thread GitBox


Myasuka commented on PR #21714:
URL: https://github.com/apache/flink/pull/21714#issuecomment-1396373604

   @MartijnVisser @PatrickRen please take a look.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #21724: [FLINK-30727][table-planner] Fix JoinReorderITCaseBase.testBushyTreeJoinReorder failed

2023-01-18 Thread GitBox


flinkbot commented on PR #21724:
URL: https://github.com/apache/flink/pull/21724#issuecomment-1396372983

   
   ## CI report:
   
   * d7feaaf31b5537c938840f279b507a295afe12e7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30727) JoinReorderITCase.testBushyTreeJoinReorder failed due to IOException

2023-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30727:
---
Labels: pull-request-available test-stability  (was: test-stability)

> JoinReorderITCase.testBushyTreeJoinReorder failed due to IOException
> 
>
> Key: FLINK-30727
> URL: https://issues.apache.org/jira/browse/FLINK-30727
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network, Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Priority: Blocker
>  Labels: pull-request-available, test-stability
>
> IOException due to timeout occurring while requesting exclusive NetworkBuffer 
> caused JoinReorderITCase.testBushyTreeJoinReorder to fail:
> {code}
> [...]
> Jan 18 01:11:27 Caused by: java.io.IOException: Timeout triggered when 
> requesting exclusive buffers: The total number of network buffers is 
> currently set to 2048 of 32768 bytes each. You can increase this number by 
> setting the configuration keys 'taskmanager.memory.network.fraction', 
> 'taskmanager.memory.network.min', and 'taskmanager.memory.network.max',  or 
> you may increase the timeout which is 3ms by setting the key 
> 'taskmanager.network.memory.exclusive-buffers-request-timeout-ms'.
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.internalRequestMemorySegments(NetworkBufferPool.java:256)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.requestPooledMemorySegmentsBlocking(NetworkBufferPool.java:179)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.reserveSegments(LocalBufferPool.java:262)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.setupChannels(SingleInputGate.java:517)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.setup(SingleInputGate.java:277)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.taskmanager.InputGateWithMetrics.setup(InputGateWithMetrics.java:105)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.taskmanager.Task.setupPartitionsAndGates(Task.java:962)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:648)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:556)
> Jan 18 01:11:27   at java.lang.Thread.run(Thread.java:748)
> {code}
> Same build, 2 failures:
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44987=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=14300
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44987=logs=ce3801ad-3bd5-5f06-d165-34d37e757d90=5e4d9387-1dcc-5885-a901-90469b7e6d2f=14362
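
A minimal sketch of applying the exception's suggestions in a test 
configuration (the keys are quoted verbatim from the log above; the values are 
illustrative assumptions, not tuning advice):

{code:java}
import org.apache.flink.configuration.Configuration;

public class NetworkBufferTuningSketch {
    public static Configuration moreNetworkMemory() {
        Configuration conf = new Configuration();
        // Give the network stack more memory...
        conf.setString("taskmanager.memory.network.fraction", "0.2");
        conf.setString("taskmanager.memory.network.min", "128mb");
        conf.setString("taskmanager.memory.network.max", "1gb");
        // ...or allow the exclusive-buffer request more time.
        conf.setString(
                "taskmanager.network.memory.exclusive-buffers-request-timeout-ms",
                "60000");
        return conf;
    }
}
{code}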



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] swuferhong opened a new pull request, #21724: [FLINK-30727][table-planner] Fix JoinReorderITCaseBase.testBushyTreeJoinReorder failed

2023-01-18 Thread GitBox


swuferhong opened a new pull request, #21724:
URL: https://github.com/apache/flink/pull/21724

   
   
   
   
   ## What is the purpose of the change
   
   This PR aims to fix the JoinReorderITCaseBase.testBushyTreeJoinReorder 
failure while running CI. The failed CI link: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44985=logs=7e1b167a-ddfd-53ed-4835-e68a01b608a7=434b1f2b-ca64-5a0b-f3a5-4a191160d4cb](url)
   
   
   ## Brief change log
   
   
   
   ## Verifying this change
   
   No tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive):  no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector:  no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? no docs
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] lsyldliu commented on a diff in pull request #21586: [FLINK-30542][table-runtime] Introduce adaptive hash aggregate to adaptively determine whether local hash aggregate is required at

2023-01-18 Thread GitBox


lsyldliu commented on code in PR #21586:
URL: https://github.com/apache/flink/pull/21586#discussion_r1073489490


##
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/agg/batch/HashAggCodeGenerator.scala:
##
@@ -38,32 +42,52 @@ import org.apache.calcite.tools.RelBuilder
  * aggregateBuffers should be update(e.g.: setInt) in [[BinaryRowData]]. (Hash 
Aggregate performs
  * much better than Sort Aggregate).
  */
-class HashAggCodeGenerator(
-ctx: CodeGeneratorContext,
-builder: RelBuilder,
-aggInfoList: AggregateInfoList,
-inputType: RowType,
-outputType: RowType,
-grouping: Array[Int],
-auxGrouping: Array[Int],
-isMerge: Boolean,
-isFinal: Boolean) {
+object HashAggCodeGenerator {
 
-  private lazy val aggInfos: Array[AggregateInfo] = aggInfoList.aggInfos
+  // It is an experimental config and may be removed later.
+  @Experimental
+  val TABLE_EXEC_ADAPTIVE_LOCAL_HASH_AGG_ENABLED: ConfigOption[JBoolean] =
+key("table.exec.adaptive-local-hash-agg.enabled")

Review Comment:
   Referring to the 
[FLIP-283](https://cwiki.apache.org/confluence/display/FLINK/FLIP-283%3A+Use+adaptive+batch+scheduler+as+default+scheduler+for+batch+jobs),
 I think it would be better to rename the option to 
`table.exec.adaptive.local-hash-agg.enabled`. cc @godfreyhe 



##
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/agg/batch/HashAggCodeGenerator.scala:
##
@@ -38,32 +42,52 @@ import org.apache.calcite.tools.RelBuilder
  * aggregateBuffers should be update(e.g.: setInt) in [[BinaryRowData]]. (Hash 
Aggregate performs
  * much better than Sort Aggregate).
  */
-class HashAggCodeGenerator(
-ctx: CodeGeneratorContext,
-builder: RelBuilder,
-aggInfoList: AggregateInfoList,
-inputType: RowType,
-outputType: RowType,
-grouping: Array[Int],
-auxGrouping: Array[Int],
-isMerge: Boolean,
-isFinal: Boolean) {
+object HashAggCodeGenerator {
 
-  private lazy val aggInfos: Array[AggregateInfo] = aggInfoList.aggInfos
+  // It is an experimental config and may be removed later.
+  @Experimental
+  val TABLE_EXEC_ADAPTIVE_LOCAL_HASH_AGG_ENABLED: ConfigOption[JBoolean] =
+key("table.exec.adaptive-local-hash-agg.enabled")
+  .booleanType()
+  .defaultValue(Boolean.box(true))
+  .withDescription("Whether to enable adaptive local hash agg")
 
-  private lazy val functionIdentifiers: Map[AggregateFunction[_, _], String] =
-AggCodeGenHelper.getFunctionIdentifiers(aggInfos)
+  @Experimental
+  val TABLE_EXEC_ADAPTIVE_LOCAL_HASH_AGG_SAMPLE_POINT: ConfigOption[JLong] =
+key("table.exec.adaptive-local-hash-agg.sample-threshold")
+  .longType()
+  .defaultValue(Long.box(500L))
+  .withDescription("If adaptive local hash agg is enabled, "
++ "the proportion of distinct value will be checked after reading this 
number of records")

Review Comment:
   ```suggestion
   + "the proportion of distinct value will be checked after reading 
this number of records.")
   ```



##
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/utils/AggregateUtil.scala:
##
@@ -926,6 +926,19 @@ object AggregateUtil extends Enumeration {
 aggInfos.isEmpty || supportMerge
   }
 
+  /** Return true if all aggregates can be projected in adaptive hash agg. 
False otherwise. */
+  def doAllAggSupportProjection(aggCalls: Seq[AggregateCall]): Boolean = {

Review Comment:
   `doAllSupportAdaptiveHashAgg`?



##
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/agg/batch/HashAggCodeGenerator.scala:
##
@@ -173,6 +214,81 @@ class HashAggCodeGenerator(
 
 HashAggCodeGenHelper.prepareMetrics(ctx, aggregateMapTerm, if (isFinal) 
sorterTerm else null)
 
+// Do adaptive hash aggregation
+val outputResultForOneRowAgg = {
+  // gen code to iterating the aggregate map and output to downstream
+  val inputUnboxingCode = 
s"${ctx.reuseInputUnboxingCode(reuseAggBufferTerm)}"
+  val rowDataType = classOf[RowData].getCanonicalName

Review Comment:
   ```
 s"""
|   // set result and output
|   $reuseGroupKeyTerm =  ($ROW_DATA)$currentKeyTerm;
|   $reuseAggBufferTerm = ($ROW_DATA)$currentValueTerm;
|   $inputUnboxingCode
|   ${outputExpr.code}
|   ${OperatorCodeGenerator.generateCollect(outputExpr.resultTerm)}
|
  """.stripMargin
   ```



##
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/utils/AggregateUtil.scala:
##
@@ -926,6 +926,19 @@ object AggregateUtil extends Enumeration {
 aggInfos.isEmpty || supportMerge
   }
 
+  /** Return true if all aggregates can be projected in adaptive hash agg. 
False otherwise. */

Review Comment:
   ```suggestion
 /** Return true if all aggregates 

[GitHub] [flink] lincoln-lil commented on a diff in pull request #21683: [FLINK-30672][table] Support 'EXPLAIN PLAN_ADVICE' statement

2023-01-18 Thread GitBox


lincoln-lil commented on code in PR #21683:
URL: https://github.com/apache/flink/pull/21683#discussion_r1080750180


##
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/utils/FlinkRelOptUtil.scala:
##
@@ -90,6 +90,38 @@ object FlinkRelOptUtil {
 sw.toString
   }
 
+  def toPlanAdvice(

Review Comment:
   I think this makes sense, and other properties can continue to be added as 
non-default arguments when needed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-30727) JoinReorderITCase.testBushyTreeJoinReorder failed due to IOException

2023-01-18 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678458#comment-17678458
 ] 

Yunhong Zheng commented on FLINK-30727:
---

Ok, [~mapohl] , I will look at it right away.

> JoinReorderITCase.testBushyTreeJoinReorder failed due to IOException
> 
>
> Key: FLINK-30727
> URL: https://issues.apache.org/jira/browse/FLINK-30727
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network, Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Priority: Blocker
>  Labels: test-stability
>
> IOException due to timeout occurring while requesting exclusive NetworkBuffer 
> caused JoinReorderITCase.testBushyTreeJoinReorder to fail:
> {code}
> [...]
> Jan 18 01:11:27 Caused by: java.io.IOException: Timeout triggered when 
> requesting exclusive buffers: The total number of network buffers is 
> currently set to 2048 of 32768 bytes each. You can increase this number by 
> setting the configuration keys 'taskmanager.memory.network.fraction', 
> 'taskmanager.memory.network.min', and 'taskmanager.memory.network.max',  or 
> you may increase the timeout which is 3ms by setting the key 
> 'taskmanager.network.memory.exclusive-buffers-request-timeout-ms'.
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.internalRequestMemorySegments(NetworkBufferPool.java:256)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.requestPooledMemorySegmentsBlocking(NetworkBufferPool.java:179)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.reserveSegments(LocalBufferPool.java:262)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.setupChannels(SingleInputGate.java:517)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.setup(SingleInputGate.java:277)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.taskmanager.InputGateWithMetrics.setup(InputGateWithMetrics.java:105)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.taskmanager.Task.setupPartitionsAndGates(Task.java:962)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:648)
> Jan 18 01:11:27   at 
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:556)
> Jan 18 01:11:27   at java.lang.Thread.run(Thread.java:748)
> {code}
> Same build, 2 failures:
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44987=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=14300
> * 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44987=logs=ce3801ad-3bd5-5f06-d165-34d37e757d90=5e4d9387-1dcc-5885-a901-90469b7e6d2f=14362



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29884) Flaky test failure in finegrained_resource_management/SortMergeResultPartitionTest.testRelease

2023-01-18 Thread Yingjie Cao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678444#comment-17678444
 ] 

Yingjie Cao commented on FLINK-29884:
-

Merged into master via 0b8a83ce54d39d0d5a5b82573c5037f306e9f7f7.

> Flaky test failure in 
> finegrained_resource_management/SortMergeResultPartitionTest.testRelease
> --
>
> Key: FLINK-29884
> URL: https://issues.apache.org/jira/browse/FLINK-29884
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination, Runtime / Network, Tests
>Affects Versions: 1.17.0
>Reporter: Nico Kruber
>Assignee: Weijie Guo
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.17.0
>
>
> {{SortMergeResultPartitionTest.testRelease}} failed with a timeout in the 
> finegrained_resource_management tests:
> {code:java}
> Nov 03 17:28:07 [ERROR] Tests run: 20, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 64.649 s <<< FAILURE! - in 
> org.apache.flink.runtime.io.network.partition.SortMergeResultPartitionTest
> Nov 03 17:28:07 [ERROR] SortMergeResultPartitionTest.testRelease  Time 
> elapsed: 60.009 s  <<< ERROR!
> Nov 03 17:28:07 org.junit.runners.model.TestTimedOutException: test timed out 
> after 60 seconds
> Nov 03 17:28:07   at 
> org.apache.flink.runtime.io.network.partition.SortMergeResultPartitionTest.testRelease(SortMergeResultPartitionTest.java:374)
> Nov 03 17:28:07   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Nov 03 17:28:07   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Nov 03 17:28:07   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Nov 03 17:28:07   at java.lang.reflect.Method.invoke(Method.java:498)
> Nov 03 17:28:07   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Nov 03 17:28:07   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> Nov 03 17:28:07   at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> Nov 03 17:28:07   at java.lang.Thread.run(Thread.java:748) {code}
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=42806=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-29884) Flaky test failure in finegrained_resource_management/SortMergeResultPartitionTest.testRelease

2023-01-18 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao closed FLINK-29884.
---
Resolution: Fixed

> Flaky test failure in 
> finegrained_resource_management/SortMergeResultPartitionTest.testRelease
> --
>
> Key: FLINK-29884
> URL: https://issues.apache.org/jira/browse/FLINK-29884
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination, Runtime / Network, Tests
>Affects Versions: 1.17.0
>Reporter: Nico Kruber
>Assignee: Weijie Guo
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.17.0
>
>
> {{SortMergeResultPartitionTest.testRelease}} failed with a timeout in the 
> finegrained_resource_management tests:
> {code:java}
> Nov 03 17:28:07 [ERROR] Tests run: 20, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 64.649 s <<< FAILURE! - in 
> org.apache.flink.runtime.io.network.partition.SortMergeResultPartitionTest
> Nov 03 17:28:07 [ERROR] SortMergeResultPartitionTest.testRelease  Time 
> elapsed: 60.009 s  <<< ERROR!
> Nov 03 17:28:07 org.junit.runners.model.TestTimedOutException: test timed out 
> after 60 seconds
> Nov 03 17:28:07   at 
> org.apache.flink.runtime.io.network.partition.SortMergeResultPartitionTest.testRelease(SortMergeResultPartitionTest.java:374)
> Nov 03 17:28:07   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Nov 03 17:28:07   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Nov 03 17:28:07   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Nov 03 17:28:07   at java.lang.reflect.Method.invoke(Method.java:498)
> Nov 03 17:28:07   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Nov 03 17:28:07   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> Nov 03 17:28:07   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> Nov 03 17:28:07   at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> Nov 03 17:28:07   at java.lang.Thread.run(Thread.java:748) {code}
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=42806=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] wsry merged pull request #21707: [FLINK-29884][test] Fix flaky test SortMergeResultPartitionTest.testRelease

2023-01-18 Thread GitBox


wsry merged PR #21707:
URL: https://github.com/apache/flink/pull/21707


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] wsry commented on pull request #21707: [FLINK-29884][test] Fix flaky test SortMergeResultPartitionTest.testRelease

2023-01-18 Thread GitBox


wsry commented on PR #21707:
URL: https://github.com/apache/flink/pull/21707#issuecomment-1396314628

   @reswqa Thanks for the fix. LGTM. Failure unrelated, Merging.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-30419) Allow tuning of transaction timeout

2023-01-18 Thread Alex Sorokoumov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678429#comment-17678429
 ] 

Alex Sorokoumov commented on FLINK-30419:
-

> Do you mean we can have a default value for table store catalog for 
> `kafka.transaction.timeout.ms`?

Yeah! I think it would be generally useful to be able to override FTS defaults 
on the Catalog level. In fact, the FTS Catalog already overrides Table's 
{{connector}} and {{path}} fields using its {{type}} and {{warehouse}} fields. 
We would extend that to provide arbitrary fields. For example, the table below 
gets {{log.system}} and kafka options from the Catalog:
{noformat}
CREATE CATALOG table_store_catalog WITH (
    'type'='table-store',
    'warehouse'='s3://my-bucket/table-store',
    'log.system' = 'kafka',
'kafka.bootstrap.servers' = '',
    'kafka.transaction.timeout.ms' = '90',
 );

USE CATALOG table_store_catalog;

CREATE TABLE word_count (
    word STRING PRIMARY KEY NOT ENFORCED,
    cnt BIGINT
) WITH (
    'kafka.topic' = 'word_count_log'
);{noformat}
WDYT [~lzljs3620320] ?

> Allow tuning of transaction timeout
> ---
>
> Key: FLINK-30419
> URL: https://issues.apache.org/jira/browse/FLINK-30419
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Vicky Papavasileiou
>Priority: Major
>
> FTS sets the producer transaction timeout to 1hr. The maximum allowed by a 
> kafka broker is 15 mins. This causes exceptions to be thrown and the job dies 
> when kafka log is enabled on a table. 
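
For context, a sketch of the underlying Kafka producer setting (the value is 
illustrative; the 15-minute cap corresponds to the broker's default 
transaction.max.timeout.ms):

{code:java}
import java.util.Properties;

import org.apache.kafka.clients.producer.ProducerConfig;

public class TxnTimeoutSketch {
    public static Properties producerProps() {
        Properties props = new Properties();
        // Must not exceed the broker-side transaction.max.timeout.ms
        // (15 minutes by default), or producer initialization fails.
        props.put(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG, 900_000);
        return props;
    }
}
{code}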



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] kristoffSC commented on pull request #21393: [FLINK-27246][TableSQL/Runtime][master] - Include WHILE blocks in split generated java methods

2023-01-18 Thread GitBox


kristoffSC commented on PR #21393:
URL: https://github.com/apache/flink/pull/21393#issuecomment-1396174097

   Hi @tsreaper 
   Thank you for your valuable feedback, no need to apologize :) Please see my 
response below. 
   
   > In BlockStatementSplitter you group all single statements between 
IF/ELSE/WHILEs together and extract each group into a separate method.
   
   I'm not only extracting single-line statements but also entire if/else 
calls. Other than that, your understanding is correct.
   
   > (If my guesses are correct, you can implement this with just one for loop 
and a recursive call. No need for mergeBlocks or addGroups or something else. 
mergeBlocks and addGroups confuse me quite a lot.)
   
   I think I'm already doing this. There is a for loop over block statements. 
Every block has its own visitor. There is no recursion in calling the visit 
method. 
   The only recursion is to traverse through all visitors and get blocks from 
them. This is a straightforward parent -> child structure. 
   I'm not merging any blocks, I'm simply gathering all blocks from all 
visitors. Please take a look at my comment -> 
https://github.com/apache/flink/pull/21393#discussion_r1071174072
   
   Also please note that `addGroups` method was deleted. 
   
   
   > The problem of this idea (or rather, the current implementation) is that, 
it's doing almost the same thing with the old IfStatementRewriter, plus the 
currently remaining FunctionSplitter.
   
   I don't agree with that statement. Firstly, my implementation handles `else 
if` blocks, which the original implementation did not. Also, in some cases my 
implementation produced better results than the original `IfStatementRewriter`; 
in other cases the result was the same as the original `IfStatementRewriter`, 
so we have extra value here. Please look at the additional examples I provide 
at the end of this comment.
   Also, I'm optimizing and grouping the bodies (blocks, statements) of 
IF/ELSE/WHILE branches, whereas `FunctionSplitter` assumes that the statements 
are already optimized. That means that `FunctionSplitter` takes an entire 
IF/ELSE without any extra modifications and adds it to the method's body or 
extracts it to a new method. So I'm not duplicating `FunctionSplitter`; the 
two operate on different levels.
   
   > I guess what you need is just to copy a bit of code in 
IfStatementRewriter, change them into WHILE and that's all. Maybe changing the 
name to BlockStatementRewriter. This shall achieve our goals by making the 
least changes.
   
   I do not agree with this statement. `IfStatementRewriter` is specialized in 
IF/ELSE. It does not process "else if".
   Originally I was thinking about adding bits of code to `IfStatementRewriter`, 
but I realized that the problem is more complex. We need something that can 
process nested combinations of WHILEs and IFs.
   I couldn't have things like 
   `shouldExtract(blockStatementContext.statement().statement(0))` and 
`shouldExtract(blockStatementContext.statement().statement(1))`.
   I wanted more generic code that applies the same logic to all blocks. In 
fact, one may argue that I actually did "copy a bit of code in 
`IfStatementRewriter`, change it into WHILE and that's all". You will find bits 
of `IfStatementRewriter` in the `BlockStatementRewriter::rewrite` and 
`BlockStatementVisitor::visitMethodDeclaration` methods. Other than that, I 
proposed a new solution.
   
   Also, `IfStatementRewriter`'s approach to handling nested IF/ELSE is 
different from my implementation's. If I understand correctly, 
`IfStatementRewriter` rewrites the entire class as many times as there are 
extracted methods.
   ```
   while (visitor.hasRewrite()) {
   visitor = new IfStatementVisitor(rewriterCode);
   rewriterCode = visitor.rewriteAndGetCode();
   }
   ```
   The first rewrite call will extract the first method, `visitor.hasRewrite()` 
will return true, we will rewrite again, but this time we will rewrite the 
previously extracted method, and so on. Every time, the entire class has to be 
"parsed".

   In my case we parse the class once, extracting new blocks and processing 
nested expressions by creating new visitors <- NO recursion here. Every visitor 
has its own context, which becomes the name prefix of the methods extracted by 
that visitor. Every extracted block 

   Every new visitor has the same reference to the "rewriter" object. This 
means that every visitor handles only a small portion of the entire method and 
rewrites only its part.

I do understand that this change touches a core component, so the possible 
impact is big -> SQL jobs.
I can tell that existing integration/end-to-end tests detected issues with 
previous implementations. I managed to solve those by analyzing the original 
and rewritten code created for test queries. 

To sum up:
1. recursion -> I really don't think that this should be a problem. 
Recursion is used only to gather the results; parsing does not use recursion.

[jira] [Commented] (FLINK-13414) Add support for Scala 2.13

2023-01-18 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678400#comment-17678400
 ] 

Martijn Visser commented on FLINK-13414:


[~alexey.novakov] Erwan is correct. I would like to point out 
https://flink.apache.org/2022/02/22/scala-free.html too. 

> Add support for Scala 2.13
> --
>
> Key: FLINK-13414
> URL: https://issues.apache.org/jira/browse/FLINK-13414
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Scala
>Reporter: Chaoran Yu
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24736) Non vulenerable jar files for Apache Flink 1.14.4

2023-01-18 Thread Leonid Ilyevsky (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678399#comment-17678399
 ] 

Leonid Ilyevsky commented on FLINK-24736:
-

The same problem in the latest version 1.16.0. 

I build my projects under the corporate Nexus, and 
flink-rpc-akka-loader-1.16.0.jar got quarantined with the following:


ROOT CAUSE

flink-rpc-akka-loader-1.16.0.jar -> flink-rpc-akka.jar -> 
org/jboss/netty/handler/codec/http/HttpMessageDecoder.class ( , 4.0.0.Alpha1)

Nexus also mentioned CVE-2019-20444 and CVE-2019-20445.

As a result, I cannot do my build at all.

> Non vulenerable jar files for Apache Flink 1.14.4
> -
>
> Key: FLINK-24736
> URL: https://issues.apache.org/jira/browse/FLINK-24736
> Project: Flink
>  Issue Type: Bug
>Reporter: Parag Somani
>Priority: Major
>
> Hello,
> We are using Apache flink 1.14.4 as one of base image in our production. Due 
> to recent upgrade, we have many container security defects. 
> I am using "flink-1.14.4-bin-scala_2.12"in our k8s env.
> Please assist with Flink version having non-vulnerable libraries. List of 
> vulnerable libs are as follows: 
> [7.5] [CVE-2019-16869] [flink-rpc-akka-loader] [1.14.4]   
> [9.1] [CVE-2019-20444] [flink-rpc-akka-loader] [1.14.4]   
> [9.1] [CVE-2019-20445] [flink-rpc-akka-loader] [1.14.4]   
> [7.5] [sonatype-2019-0115] [flink-rpc-akka-loader] [1.14.4]
> [7.5] [sonatype-2020-0029] [flink-rpc-akka-loader] [1.14.4]
> [7.5] [CVE-2019-16869] [flink-rpc-akka] [1.14.4]  
> [9.1] [CVE-2019-20444] [flink-rpc-akka] [1.14.4]  
> [9.1] [CVE-2019-20445] [flink-rpc-akka] [1.14.4]  
> [7.5] [sonatype-2019-0115] [flink-rpc-akka] [1.14.4]  
> [7.5] [sonatype-2020-0029] [flink-rpc-akka] [1.14.4]  
> Can you assist with this ?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-13414) Add support for Scala 2.13

2023-01-18 Thread Erwan Loisant (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678395#comment-17678395
 ] 

Erwan Loisant commented on FLINK-13414:
---

[~novakov.alex]

The Flink core no longer relies on Scala, so if you don't use the Scala API you 
can write a Flink pipeline with any Scala version - but that implies calling 
the Java API directly.

As the Scala API is deprecated and will be removed, this will become the only 
way to write Flink pipelines in Scala (using the Java API). Maybe 3rd party 
Scala APIs will emerge, but currently the few that exist (and are cited in the 
link below) don't seem to be maintained at all. 

[https://cwiki.apache.org/confluence/display/FLINK/FLIP-265+Deprecate+and+remove+Scala+API+support]
 

> Add support for Scala 2.13
> --
>
> Key: FLINK-13414
> URL: https://issues.apache.org/jira/browse/FLINK-13414
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Scala
>Reporter: Chaoran Yu
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-13414) Add support for Scala 2.13

2023-01-18 Thread Alexey Novakov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678394#comment-17678394
 ] 

Alexey Novakov commented on FLINK-13414:


Hi there. Can somebody from the Flink PMC please provide a summary of what has 
been concluded for Flink Scala users?

As I see, FLINK-23986 is resolved, which I believe allows plugging in any Scala 
version, including the latest one. Am I correct?

Will Flink always come with Scala 2.12, so that users can ignore this version 
if they are going to use a newer version of Scala?

P.S. I have tried [~rgrebennikov] 
[flink-scala-api|https://github.com/findify/flink-scala-api] library, it works 
fine with Flink 1.15.

> Add support for Scala 2.13
> --
>
> Key: FLINK-13414
> URL: https://issues.apache.org/jira/browse/FLINK-13414
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Scala
>Reporter: Chaoran Yu
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] afedulov commented on pull request #20049: [FLINK-28046][connectors] Mark SourceFunction interface as @Deprecated

2023-01-18 Thread GitBox


afedulov commented on PR #20049:
URL: https://github.com/apache/flink/pull/20049#issuecomment-1387594960

   @Myasuka good idea, thanks for bringing it up. I will resume working on it. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-26088) Add Elasticsearch 8.0 support

2023-01-18 Thread Matheus Felisberto (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678349#comment-17678349
 ] 

Matheus Felisberto commented on FLINK-26088:


Hi [~martijnvisser], how are you?

I am not trying to overstep or be rude, but I would like to know if there is 
any process regarding this situation. It has been a while (60+ days) since the 
last update from the assignee on this topic. I have read the Flink Jira Process 
(1) document, and if I understood correctly, the flink-jira-bot should have 
removed the assignment of this ticket. Am I missing something?

1) [https://cwiki.apache.org/confluence/display/FLINK/Flink+Jira+Process]

> Add Elasticsearch 8.0 support
> -
>
> Key: FLINK-26088
> URL: https://issues.apache.org/jira/browse/FLINK-26088
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ElasticSearch
>Reporter: Yuhao Bi
>Assignee: zhenzhenhua
>Priority: Major
>
> Since Elasticsearch 8.0 is officially released, I think it's time to consider 
> adding es8 connector support.
> The High Level REST Client we used for connection [is marked deprecated in es 
> 7.15.0|https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high.html].
>  Maybe we can migrate to use the new [Java API 
> Client|https://www.elastic.co/guide/en/elasticsearch/client/java-api-client/8.0/index.html]
>  at this time.
> Elasticsearch8.0 release note: 
> [https://www.elastic.co/guide/en/elasticsearch/reference/8.0/release-notes-8.0.0.html]
> release highlights: 
> [https://www.elastic.co/guide/en/elasticsearch/reference/8.0/release-highlights.html]
> REST API compatibility: 
> https://www.elastic.co/guide/en/elasticsearch/reference/8.0/rest-api-compatibility.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30745) Check-pointing with Azure Data Lake Storage

2023-01-18 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678329#comment-17678329
 ] 

Surendra Singh Lilhore commented on FLINK-30745:


Thanks [~dheerajpanangat] for reporting this issue.

What value did you configure for the property 
"fs.azure.account.keyprovider." ?

> Check-pointing with Azure Data Lake Storage
> ---
>
> Key: FLINK-30745
> URL: https://issues.apache.org/jira/browse/FLINK-30745
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / FileSystem
>Affects Versions: 1.15.2, 1.14.6
>Reporter: Dheeraj Panangat
>Priority: Major
>
> Hi,
> While checkpointing to Azure Blob Storage using Flink, we get the following 
> error :
> {code:java}
> Caused by: Configuration property .dfs.core.windows.net not 
> found.
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:372)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:1133)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.(AzureBlobFileSystemStore.java:174)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:110)
>  {code}
> We have given the configurations in core-site.xml too for following
> {code:java}
> fs.hdfs.impl
> fs.abfs.impl -> org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem
> fs.file.impl
> fs.azure.account.auth.type
> fs.azure.account.oauth.provider.type
> fs.azure.account.oauth2.client.id
> fs.azure.account.oauth2.client.secret
> fs.azure.account.oauth2.client.endpoint
> fs.azure.createRemoteFileSystemDuringInitialization -> true {code}
> On debugging found that flink reads from core-default-shaded.xml, but even if 
> the properties are specified there, the default configs are not loaded and we 
> get a different exception as :
> {code:java}
> Caused by: Unable to load key provider class.
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getTokenProvider(AbfsConfiguration.java:540)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:1136)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.(AzureBlobFileSystemStore.java:174)
> at 
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:110)
>  {code}
>  
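
A minimal sketch of wiring the listed properties programmatically (all values 
are placeholders; the provider class name is an assumption based on the 
getTokenProvider call in the stack trace):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class AbfsAuthConfigSketch {
    public static Configuration abfsOAuthConf() {
        Configuration conf = new Configuration();
        conf.set("fs.azure.account.auth.type", "OAuth");
        // Assumed provider; check the hadoop-azure docs for your setup.
        conf.set("fs.azure.account.oauth.provider.type",
                "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider");
        conf.set("fs.azure.account.oauth2.client.id", "<client-id>");
        conf.set("fs.azure.account.oauth2.client.secret", "<client-secret>");
        conf.set("fs.azure.account.oauth2.client.endpoint",
                "https://login.microsoftonline.com/<tenant-id>/oauth2/token");
        return conf;
    }
}
{code}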



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-connector-gcp-pubsub] MartijnVisser commented on pull request #1: [FLINK-30058][Connector/Google PubSub] Move existing Google Cloud PubSub connector code from Flink repo

2023-01-18 Thread GitBox


MartijnVisser commented on PR #1:
URL: 
https://github.com/apache/flink-connector-gcp-pubsub/pull/1#issuecomment-1387391917

   > I assume it's intentional to rename the 
`flink-connector-gcp-pubsub-emulator-tests` to 
`flink-connector-gcp-pubsub-e2e-tests`?
   
   It is, yes :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


