[jira] [Created] (FLINK-36625) Add helper classes for Lineage integration in connectors

2024-10-29 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-36625:
-

 Summary: Add helper classes for Lineage integration in connectors
 Key: FLINK-36625
 URL: https://issues.apache.org/jira/browse/FLINK-36625
 Project: Flink
  Issue Type: Sub-task
  Components: API / DataStream
Reporter: Zhenqiu Huang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-36410) Improve Lineage Info Collection for flink app

2024-10-29 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang resolved FLINK-36410.
---
Resolution: Done

> Improve Lineage Info Collection for flink app
> -
>
> Key: FLINK-36410
> URL: https://issues.apache.org/jira/browse/FLINK-36410
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.20.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> We find that lineage interface adoption is not easy for Flink connectors,
> as every connector has its own release dependencies and schedule. Thus, to
> let users adopt the lineage integration incrementally, we want to change the
> lineage info collection so that it no longer requires both the source and
> the sink to implement the lineage provider interfaces. That way the lineage
> reporter can generate a partial lineage graph.
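As a rough illustration of the change described above, here is a minimal sketch of such lenient collection. `LineageVertexProvider`, the connector classes, and the string-based vertices are hypothetical stand-ins for this example, not Flink's actual lineage API:

```java
import java.util.ArrayList;
import java.util.List;

public class PartialLineageSketch {

    /** Hypothetical provider interface: a connector that can describe its lineage vertex. */
    interface LineageVertexProvider {
        String lineageVertex();
    }

    /** A source whose connector release does NOT yet implement the provider. */
    static class KafkaSource {}

    /** A sink whose connector release already implements the provider. */
    static class JdbcSink implements LineageVertexProvider {
        @Override
        public String lineageVertex() {
            return "jdbc://orders";
        }
    }

    /**
     * Collect whatever lineage info is available instead of requiring BOTH
     * the source and the sink to implement the provider interface.
     */
    static List<String> collect(Object source, Object sink) {
        List<String> vertices = new ArrayList<>();
        for (Object endpoint : new Object[] {source, sink}) {
            if (endpoint instanceof LineageVertexProvider) {
                vertices.add(((LineageVertexProvider) endpoint).lineageVertex());
            }
        }
        return vertices; // possibly a partial graph: here, only the sink vertex
    }

    public static void main(String[] args) {
        System.out.println(collect(new KafkaSource(), new JdbcSink()));
    }
}
```

The reporter then emits whatever subset of the graph it managed to collect, rather than skipping the job entirely.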



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-36410) Improve Lineage Info Collection for flink app

2024-10-29 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-36410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17893955#comment-17893955
 ] 

Zhenqiu Huang commented on FLINK-36410:
---

The PR has been created and merged: https://github.com/apache/flink/pull/25440

> Improve Lineage Info Collection for flink app
> -
>
> Key: FLINK-36410
> URL: https://issues.apache.org/jira/browse/FLINK-36410
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.20.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> We find that lineage interface adoption is not easy for Flink connectors,
> as every connector has its own release dependencies and schedule. Thus, to
> let users adopt the lineage integration incrementally, we want to change the
> lineage info collection so that it no longer requires both the source and
> the sink to implement the lineage provider interfaces. That way the lineage
> reporter can generate a partial lineage graph.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-36410) Improve Lineage Info Collection for flink app

2024-09-30 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-36410:
--
Description: We find that lineage interface adoption is not easy for Flink 
connectors, as every connector has its own release dependencies and schedule. 
Thus, to let users adopt the lineage integration incrementally, we want to 
change the lineage info collection so that it no longer requires both the 
source and the sink to implement the lineage provider interfaces. That way the 
lineage reporter can generate a partial lineage graph.

> Improve Lineage Info Collection for flink app
> -
>
> Key: FLINK-36410
> URL: https://issues.apache.org/jira/browse/FLINK-36410
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.20.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> We find that lineage interface adoption is not easy for Flink connectors,
> as every connector has its own release dependencies and schedule. Thus, to
> let users adopt the lineage integration incrementally, we want to change the
> lineage info collection so that it no longer requires both the source and
> the sink to implement the lineage provider interfaces. That way the lineage
> reporter can generate a partial lineage graph.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-36410) Improve Lineage Info Collection for flink app

2024-09-30 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-36410:
--
Component/s: Table SQL / Runtime

> Improve Lineage Info Collection for flink app
> -
>
> Key: FLINK-36410
> URL: https://issues.apache.org/jira/browse/FLINK-36410
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.20.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36410) Improve Lineage Info Collection for flink app

2024-09-30 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-36410:
-

 Summary: Improve Lineage Info Collection for flink app
 Key: FLINK-36410
 URL: https://issues.apache.org/jira/browse/FLINK-36410
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 1.20.0
Reporter: Zhenqiu Huang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36169) Support namespace level resource check before scaling up

2024-08-28 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-36169:
-

 Summary: Support namespace level resource check before scaling up
 Key: FLINK-36169
 URL: https://issues.apache.org/jira/browse/FLINK-36169
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Affects Versions: 1.10.0
Reporter: Zhenqiu Huang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33211) Implement table lineage graph

2024-08-10 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang closed FLINK-33211.
-
Resolution: Fixed

> Implement table lineage graph
> -
>
> Key: FLINK-33211
> URL: https://issues.apache.org/jira/browse/FLINK-33211
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: pull-request-available
>
> Implement table lineage graph



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-35925) Remove hive connector from Flink main repo

2024-07-30 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869690#comment-17869690
 ] 

Zhenqiu Huang edited comment on FLINK-35925 at 7/30/24 4:02 PM:


[~Weijie Guo]

Yes, it is. Let me close it.


was (Author: zhenqiuhuang):
[~Weijie Guo]

Yes, it is. I think we close it.

> Remove hive connector from Flink main repo
> --
>
> Key: FLINK-35925
> URL: https://issues.apache.org/jira/browse/FLINK-35925
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / Hive
>Affects Versions: 2.0.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>
> Given we have moved the Hive connector to a standalone repo, we should
> remove it from the Flink main repo.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35925) Remove hive connector from Flink main repo

2024-07-30 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869690#comment-17869690
 ] 

Zhenqiu Huang commented on FLINK-35925:
---

[~Weijie Guo]

Yes, it is. I think we can close it.

> Remove hive connector from Flink main repo
> --
>
> Key: FLINK-35925
> URL: https://issues.apache.org/jira/browse/FLINK-35925
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / Hive
>Affects Versions: 2.0.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>
> Given we have moved the Hive connector to a standalone repo, we should
> remove it from the Flink main repo.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-35925) Remove hive connector from Flink main repo

2024-07-30 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang closed FLINK-35925.
-
Resolution: Duplicate

> Remove hive connector from Flink main repo
> --
>
> Key: FLINK-35925
> URL: https://issues.apache.org/jira/browse/FLINK-35925
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / Hive
>Affects Versions: 2.0.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>
> Given we have moved the Hive connector to a standalone repo, we should
> remove it from the Flink main repo.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35925) Remove hive connector from Flink main repo

2024-07-29 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-35925:
--
Description: 
Given we have moved the Hive connector to a standalone repo, we should remove 
it from the Flink main repo.


> Remove hive connector from Flink main repo
> --
>
> Key: FLINK-35925
> URL: https://issues.apache.org/jira/browse/FLINK-35925
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / Hive
>Affects Versions: 2.0.0
>Reporter: Zhenqiu Huang
>Priority: Minor
> Fix For: 2.0.0
>
>
> Given we have moved the Hive connector to a standalone repo, we should
> remove it from the Flink main repo.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35925) Remove hive connector from Flink main repo

2024-07-29 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-35925:
-

 Summary: Remove hive connector from Flink main repo
 Key: FLINK-35925
 URL: https://issues.apache.org/jira/browse/FLINK-35925
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / Hive
Affects Versions: 2.0.0
Reporter: Zhenqiu Huang
 Fix For: 2.0.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35745) Add namespace convention doc for Flink lineage

2024-07-02 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-35745:
-

 Summary: Add namespace convention doc for Flink lineage 
 Key: FLINK-35745
 URL: https://issues.apache.org/jira/browse/FLINK-35745
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Zhenqiu Huang


We recommend following the naming convention from OpenLineage:
https://openlineage.io/docs/spec/naming/



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33211) Implement table lineage graph

2024-07-02 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang closed FLINK-33211.
-
Resolution: Done

The PR has been merged upstream.

> Implement table lineage graph
> -
>
> Key: FLINK-33211
> URL: https://issues.apache.org/jira/browse/FLINK-33211
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: pull-request-available
>
> Implement table lineage graph



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33212) Introduce job status changed listener for lineage

2024-07-02 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang closed FLINK-33212.
-
Resolution: Done

The PR has been merged upstream.

> Introduce job status changed listener for lineage
> -
>
> Key: FLINK-33212
> URL: https://issues.apache.org/jira/browse/FLINK-33212
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / REST
>Affects Versions: 1.20.0
>Reporter: Fang Yong
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: pull-request-available
>
> Introduce the job status changed listener interfaces and their
> implementation. The job listeners will be registered in the runtime and
> also in the client-side pipeline executors, including LocalExecutor,
> EmbeddedExecutor for application mode, and the abstract session cluster
> executor. When job submission succeeds, a job created event will be created
> with the lineage graph info.
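A minimal sketch of the listener flow described above; the event, listener, and executor shapes below are simplified placeholders for illustration, not the actual FLIP-314 interfaces:

```java
import java.util.ArrayList;
import java.util.List;

public class JobListenerSketch {

    /** Simplified event: real events carry a structured lineage graph, not a string. */
    static class JobCreatedEvent {
        final String jobId;
        final String lineageGraph;

        JobCreatedEvent(String jobId, String lineageGraph) {
            this.jobId = jobId;
            this.lineageGraph = lineageGraph;
        }
    }

    /** Listener notified on job status changes; here only job creation is modeled. */
    interface JobStatusChangedListener {
        void onEvent(JobCreatedEvent event);
    }

    /** Stand-in for a client-side pipeline executor that registers listeners. */
    static class Executor {
        private final List<JobStatusChangedListener> listeners = new ArrayList<>();

        void register(JobStatusChangedListener listener) {
            listeners.add(listener);
        }

        void submit(String jobId, String lineageGraph) {
            // ... the actual job submission would happen here ...
            // On success, fire the job created event with lineage graph info.
            for (JobStatusChangedListener listener : listeners) {
                listener.onEvent(new JobCreatedEvent(jobId, lineageGraph));
            }
        }
    }

    public static void main(String[] args) {
        Executor executor = new Executor();
        executor.register(e ->
                System.out.println("job " + e.jobId + " created, lineage: " + e.lineageGraph));
        executor.submit("job-1", "kafka://clicks -> iceberg://clicks_agg");
    }
}
```

The same registration would happen in each executor flavor (local, embedded, session cluster), so every submission path emits the event.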



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33211) Implement table lineage graph

2024-07-02 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17862638#comment-17862638
 ] 

Zhenqiu Huang commented on FLINK-33211:
---

The PR has been merged upstream.

> Implement table lineage graph
> -
>
> Key: FLINK-33211
> URL: https://issues.apache.org/jira/browse/FLINK-33211
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: pull-request-available
>
> Implement table lineage graph



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33212) Introduce job status changed listener for lineage

2024-05-19 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-33212:
--
Description: # Introduce the job status changed listener interfaces and their 
implementation. The job listeners will be registered in the runtime and also 
in the client-side pipeline executors, including LocalExecutor, 
EmbeddedExecutor for application mode, and the abstract session cluster 
executor. When job submission succeeds, a job created event will be created 
with the lineage graph info.  (was: Introduce job status changed listener relevant interfaces)

> Introduce job status changed listener for lineage
> -
>
> Key: FLINK-33212
> URL: https://issues.apache.org/jira/browse/FLINK-33212
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / REST
>Affects Versions: 1.20.0
>Reporter: Fang Yong
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: pull-request-available
>
> # Introduce the job status changed listener interfaces and their
> implementation. The job listeners will be registered in the runtime and
> also in the client-side pipeline executors, including LocalExecutor,
> EmbeddedExecutor for application mode, and the abstract session cluster
> executor. When job submission succeeds, a job created event will be created
> with the lineage graph info.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33212) Introduce job status changed listener for lineage

2024-05-19 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-33212:
--
Description: Introduce the job status changed listener interfaces and their 
implementation. The job listeners will be registered in the runtime and also 
in the client-side pipeline executors, including LocalExecutor, 
EmbeddedExecutor for application mode, and the abstract session cluster 
executor. When job submission succeeds, a job created event will be created 
with the lineage graph info. 
 (was: # Introduce the job status changed listener interfaces and their 
implementation. The job listeners will be registered in the runtime and also 
in the client-side pipeline executors, including LocalExecutor, 
EmbeddedExecutor for application mode, and the abstract session cluster 
executor. When job submission succeeds, a job created event will be created 
with the lineage graph info.)

> Introduce job status changed listener for lineage
> -
>
> Key: FLINK-33212
> URL: https://issues.apache.org/jira/browse/FLINK-33212
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / REST
>Affects Versions: 1.20.0
>Reporter: Fang Yong
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: pull-request-available
>
> Introduce the job status changed listener interfaces and their
> implementation. The job listeners will be registered in the runtime and
> also in the client-side pipeline executors, including LocalExecutor,
> EmbeddedExecutor for application mode, and the abstract session cluster
> executor. When job submission succeeds, a job created event will be created
> with the lineage graph info.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35326) Implement lineage interface for hive connector

2024-05-09 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-35326:
-

 Summary: Implement lineage interface for hive connector
 Key: FLINK-35326
 URL: https://issues.apache.org/jira/browse/FLINK-35326
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Hive
Affects Versions: 1.20.0
Reporter: Zhenqiu Huang
 Fix For: 1.20.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34651) The HiveTableSink of Flink does not support writing to S3

2024-05-09 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845177#comment-17845177
 ] 

Zhenqiu Huang commented on FLINK-34651:
---

Have you tried putting the library into your Flink app uber jar? Hadoop's Path 
should then be able to find its own S3 file system implementation.

https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws/2.7.3
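For reference, bundling the linked artifact would look roughly like this in the application's pom.xml (the version shown is the one linked above; the right version should match the Hadoop version on the cluster):

```xml
<!-- Bundles the S3A file system implementation into the uber jar -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>2.7.3</version>
</dependency>
```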

> The HiveTableSink of Flink does not support writing to S3
> -
>
> Key: FLINK-34651
> URL: https://issues.apache.org/jira/browse/FLINK-34651
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.13.6, 1.14.6, 1.15.4, 1.16.3, 1.17.2, 1.18.1
>Reporter: shizhengchao
>Priority: Blocker
>
> My Hive table is located on S3. When I try to write to Hive using Flink 
> Streaming SQL, I find that it does not support writing to S3. Furthermore, 
> this issue has not been fixed in the latest version. The error I got is as 
> follows:
> {code:java}
> // code placeholder
> java.io.IOException: No FileSystem for scheme: s3
>     at 
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2586)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2593)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>     at 
> org.apache.flink.connectors.hive.HadoopFileSystemFactory.create(HadoopFileSystemFactory.java:44)
>     at 
> org.apache.flink.table.filesystem.stream.StreamingSink.lambda$compactionWriter$8dbc1825$1(StreamingSink.java:95)
>     at 
> org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.initializeState(CompactCoordinator.java:102)
>     at 
> org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:118)
>     at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:290)
>     at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:441)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:585)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:565)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:540)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
>     at java.lang.Thread.run(Thread.java:750)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33212) Introduce job status changed listener for lineage

2024-04-29 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-33212:
--
Affects Version/s: 1.20.0
   (was: 1.19.0)

> Introduce job status changed listener for lineage
> -
>
> Key: FLINK-33212
> URL: https://issues.apache.org/jira/browse/FLINK-33212
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / REST
>Affects Versions: 1.20.0
>Reporter: Fang Yong
>Assignee: Zhenqiu Huang
>Priority: Major
>
> Introduce the job status changed listener interfaces



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35015) Flink Parquet Reader doesn't honor parquet configuration

2024-04-04 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17834045#comment-17834045
 ] 

Zhenqiu Huang commented on FLINK-35015:
---

After taking a deeper look at the code: AvroSchemaConverter takes a Hadoop 
configuration, so users just need to specify the Hadoop config in the Flink 
configuration to enable the flag.
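As a sketch of what that could look like in flink-conf.yaml, assuming the `flink.hadoop.` prefix for forwarding keys into the Hadoop configuration and parquet-avro's `parquet.avro.readInt96AsFixed` key; both should be verified against the Flink and parquet-avro versions in use:

```yaml
# Forwarded into the Hadoop Configuration that AvroSchemaConverter receives,
# enabling reads of deprecated INT96 columns as fixed-length byte arrays.
flink.hadoop.parquet.avro.readInt96AsFixed: true
```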

> Flink Parquet Reader doesn't honor parquet configuration
> 
>
> Key: FLINK-35015
> URL: https://issues.apache.org/jira/browse/FLINK-35015
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.17.2, 1.19.0, 1.18.1
>Reporter: Zhenqiu Huang
>Priority: Minor
> Fix For: 1.20.0
>
>
> For example, to access Parquet files written in the legacy standard, users
> need to set the READ_INT96_AS_FIXED flag to read deprecated INT96 columns.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35015) Flink Parquet Reader doesn't honor parquet configuration

2024-04-04 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-35015:
--
Priority: Minor  (was: Critical)

> Flink Parquet Reader doesn't honor parquet configuration
> 
>
> Key: FLINK-35015
> URL: https://issues.apache.org/jira/browse/FLINK-35015
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.17.2, 1.19.0, 1.18.1
>Reporter: Zhenqiu Huang
>Priority: Minor
> Fix For: 1.20.0
>
>
> For example, to access Parquet files written in the legacy standard, users
> need to set the READ_INT96_AS_FIXED flag to read deprecated INT96 columns.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35015) Flink Parquet Reader doesn't honor parquet configuration

2024-04-04 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-35015:
-

 Summary: Flink Parquet Reader doesn't honor parquet configuration
 Key: FLINK-35015
 URL: https://issues.apache.org/jira/browse/FLINK-35015
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.18.1, 1.19.0, 1.17.2
Reporter: Zhenqiu Huang
 Fix For: 1.20.0


For example, to access Parquet files written in the legacy standard, users 
need to set the READ_INT96_AS_FIXED flag to read deprecated INT96 columns.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34657) Implement Lineage Graph for streaming API use cases

2024-03-12 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-34657:
-

 Summary: Implement Lineage Graph for streaming API use cases
 Key: FLINK-34657
 URL: https://issues.apache.org/jira/browse/FLINK-34657
 Project: Flink
  Issue Type: Sub-task
Reporter: Zhenqiu Huang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34468) Implement Lineage Interface in Cassandra Connector

2024-02-19 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-34468:
-

 Summary: Implement Lineage Interface in Cassandra Connector
 Key: FLINK-34468
 URL: https://issues.apache.org/jira/browse/FLINK-34468
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Cassandra
Reporter: Zhenqiu Huang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34467) Implement Lineage Interface in Jdbc Connector

2024-02-19 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-34467:
-

 Summary: Implement Lineage Interface in Jdbc Connector
 Key: FLINK-34467
 URL: https://issues.apache.org/jira/browse/FLINK-34467
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / JDBC
Affects Versions: 1.19.0
Reporter: Zhenqiu Huang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34466) Implement Lineage Interface in Kafka Connector

2024-02-19 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-34466:
-

 Summary: Implement Lineage Interface in Kafka Connector
 Key: FLINK-34466
 URL: https://issues.apache.org/jira/browse/FLINK-34466
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Kafka
Affects Versions: 1.19.0
Reporter: Zhenqiu Huang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33244) Not Able To Pass the Configuration On Flink Session

2024-02-19 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818458#comment-17818458
 ] 

Zhenqiu Huang commented on FLINK-33244:
---

[~amar1509]
Is it still an issue for the latest operator version?

> Not Able To Pass the Configuration On Flink Session
> ---
>
> Key: FLINK-33244
> URL: https://issues.apache.org/jira/browse/FLINK-33244
> Project: Flink
>  Issue Type: Bug
>  Components: flink-contrib, Kubernetes Operator
>Affects Versions: kubernetes-operator-1.5.0
>Reporter: Amarjeet Singh
>Priority: Critical
> Fix For: 1.17.1
>
>
> Hi,
> I have tried configuring flink run with -D options like
> -Dmetrics.reporter=promgateway
> -Dmetrics.reporter.promgateway.jobName=flink_test_outside
> but these configurations are not picked up.
> The same happens with the Flink Kubernetes Operator: I am not able to
> configure it using
> apiVersion: flink.apache.org/v1beta1
> kind: FlinkSessionJob
> metadata:
>   name: flink-job-test
> spec:
>   deploymentName: flink-session-cluster
>   restartNonce: 11
>   flinkConfiguration:
>     # Flink Config Overrides
>     kubernetes.operator.job.restart.failed: "true"
>     metrics.reporters: "promgateway"
>     metrics.reporter.promgateway.jobName: "flink_test_outside"



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34250) Add formats options to docs

2024-01-26 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-34250:
-

 Summary: Add formats options to docs
 Key: FLINK-34250
 URL: https://issues.apache.org/jira/browse/FLINK-34250
 Project: Flink
  Issue Type: Improvement
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.19.0
Reporter: Zhenqiu Huang


We have options defined for the Avro and CSV formats, but they are not 
included in the docs. It would be better to show them in a common section.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode

2024-01-18 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808334#comment-17808334
 ] 

Zhenqiu Huang commented on FLINK-34007:
---

[~mapohl] Ack. There are no new observations from the last two days of 
testing. The only thing probably worth mentioning: when the LeaderElector (a 
three-thread executor) exits because the renew deadline timed out, it is 
actually only one of the threads that exits the loop. From the debug log, I 
can still observe the other two threads consistently failing to acquire the 
leadership because of the stop flag.


For 1.17, I will create a test instance in the same cluster today. Let's see 
what the result is.

> Flink Job stuck in suspend state after losing leadership in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.19.0, 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Blocker
>  Labels: pull-request-available
> Attachments: Debug.log, LeaderElector-Debug.json, job-manager.log
>
>
> The observation is that the job manager goes into a suspended state, with a
> failed container unable to register itself with the resource manager after
> the timeout.
> JM Log, see attached
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2024-01-16 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17807486#comment-17807486
 ] 

Zhenqiu Huang edited comment on FLINK-31275 at 1/16/24 10:59 PM:
-

Hi everyone, I also want to bump this thread due to internal needs. I feel the 
OpenLineage community gives a very good suggestion: define an intermediate 
representation (Dataset) for the metadata of a source/sink. Also, a 
LineageVertex could definitely have multiple datasets, for example for hybrid 
source users who read from Kafka first and then switch to Iceberg. Given this, 
I feel the config should live in the dataset rather than in the LineageVertex. 
On the other hand, we want to make column lineage possible, so having the 
query in the dataset gives the lineage provider a basis to analyze the column 
relationships. For the input/output schema, we may put it into a facet; it 
could be optional depending on the connector implementation. What do you think 
[~zjureel] [~mobuchowski]?


{code:java}
public interface LineageVertex {
    /* List of input (for source) or output (for sink) datasets interacted
       with by the connector */
    List<Dataset> datasets;
}
{code}

{code:java}
public interface Dataset {
    /* Name for this particular dataset. */
    String name;
    /* Unique name for this dataset's datasource. */
    String namespace;
    /* Query used to generate the dataset, if there is one. */
    String query;
    /* Facets for the lineage vertex describing the particular information
       of the dataset. */
    Map<String, Facet> facets;
}
{code}

The facet types could be SchemaFacet and ConfigFacet.
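To make the last point concrete, here is one hypothetical shape those two facets could take; the names and fields are illustrative only, not a proposed API:

```java
import java.util.HashMap;
import java.util.Map;

public class FacetSketch {

    /** Marker interface for any per-dataset facet. */
    interface Facet {}

    /** Optional schema facet: field name to type, supplied only by connectors that know it. */
    static class SchemaFacet implements Facet {
        final Map<String, String> fieldTypes = new HashMap<>();
    }

    /** Connector configuration relevant to lineage consumers. */
    static class ConfigFacet implements Facet {
        final Map<String, String> config = new HashMap<>();
    }

    public static void main(String[] args) {
        // A dataset would carry a map of named facets; absent facets simply
        // have no entry, keeping each facet optional per connector.
        Map<String, Facet> facets = new HashMap<>();
        SchemaFacet schema = new SchemaFacet();
        schema.fieldTypes.put("user_id", "BIGINT");
        facets.put("schema", schema);
        System.out.println(facets.keySet());
    }
}
```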




was (Author: zhenqiuhuang):
Hi Everyone, I also want to bump the thread due to the internal needs. I feel 
open lineage community gives very good suggestion to define an intermediate 
representation (Dataset) about the metadata of a since/sink. Also LineageVertex 
could definitely have multiple dataset, for example Hybrid source users who 
read from Kafka first then switch to iceberg. Given this, I feel the config 
should be in dataset rather than LineageVertex. On the other hand, we want to 
make the column lineage possible, so having the query in the dataset will be 
reason for lineage provide to analysis the column relationship. For 
input/output schema, we may put it into a facet. It could be optional depends 
on the connector implementation. How do you think [~zjureel] [~mobuchowski]?


public interface LineageVertex {
/* List of input (for source) or output (for sink) datasets interacted with 
by the connector */
List datasets; 
} 

public interface Dataset {
/* Name for this particular dataset. */
String name;
/* Unique name for this dataset's datasource. */
String namespace; 
/* Query used to generate the dataset If there is */
String query;
/* Facets for the lineage vertex to describe the particular information of 
dataset. */ 
Map facets; 
} 

Facet type could be SchemaFacet and ConfigFacet. 



> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2024-01-16 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17807486#comment-17807486
 ] 

Zhenqiu Huang commented on FLINK-31275:
---

Hi Everyone, I also want to bump the thread due to internal needs. I feel 
the OpenLineage community gives a very good suggestion: define an intermediate 
representation (Dataset) for the metadata of a source/sink. Also, a LineageVertex 
could definitely have multiple datasets, for example for Hybrid source users who 
read from Kafka first and then switch to Iceberg. Given this, I feel the config 
should live in the dataset rather than the LineageVertex. On the other hand, we 
want to make column lineage possible, so having the query in the dataset gives 
the lineage provider a basis for analyzing the column relationships. For the 
input/output schema, we may put it into a facet; it could be optional depending 
on the connector implementation. What do you think [~zjureel] [~mobuchowski]?


public interface LineageVertex {
    /* List of input (for source) or output (for sink) datasets interacted with
       by the connector. */
    List<Dataset> datasets();
}

public interface Dataset {
    /* Name for this particular dataset. */
    String name();
    /* Unique name for this dataset's datasource. */
    String namespace();
    /* Query used to generate the dataset, if there is one. */
    String query();
    /* Facets describing particular information of the dataset. */
    Map<String, Facet> facets();
}

Facet types could include SchemaFacet and ConfigFacet.



> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode

2024-01-15 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17807017#comment-17807017
 ] 

Zhenqiu Huang edited comment on FLINK-34007 at 1/16/24 3:40 AM:


[~mapohl] [~wangyang0918]
I am intensively testing Flink 1.18. Within two days, users reported the job 
manager stuck issue in 1.17 and 1.16. The 1.18 and 1.17 job instances are 
running in the same cluster; 1.16 is in a different cluster.

I attached another LeaderElector-Debug.json file that contains the debug log of 
a Flink 1.18 app. The issue happened several times:
1. the ConfigMap was not accessible from the API server, so the renew timeout 
was exceeded.
2. a failure to patch an updated ConfigMap.


The interesting part of the behavior over the last several days is that the job 
manager did not get stuck but exited directly. Then a new job manager pod 
started correctly, which is why a new leader is selected in the log above. 
Hopefully this is useful for your diagnosis.


[~wangyang0918]
From my initial observation (before creating the Jira), the leader annotation 
updates stopped when the job manager was stuck.







was (Author: zhenqiuhuang):
[~mapohl] [~wangyang0918]
I am intensively testing flink 1.18. Within two days, there are users reported 
the job manager stuck issue in 1.17 and 1.16. 1.18 and 1.17 job instances are 
running in the same cluster. 1.16 is in different cluster.

I attached another LeaderElector-Debug.json file that contains debug log of a 
flink 1.18 app. The issue happened several times:
1. due to the configmap not accessible from api sever then renew timeout 
exceeded 
2. a failure on patch on a updated configmap

The interesting part of the behavior of last several days is that job manager 
was not stuck but exit directly. Then, new job manager pod started correctly 
that is why new leader is selected in the log above. Hopefully, it is useful 
for your diagnosis.






> Flink Job stuck in suspend state after losing leadership in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.3, 1.17.2, 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Major
> Attachments: Debug.log, LeaderElector-Debug.json, job-manager.log
>
>
> The observation is that Job manager goes to suspend state with a failed 
> container not able to register itself to resource manager after timeout.
> JM Log, see attached
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode

2024-01-15 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17807017#comment-17807017
 ] 

Zhenqiu Huang edited comment on FLINK-34007 at 1/16/24 3:34 AM:


[~mapohl] [~wangyang0918]
I am intensively testing Flink 1.18. Within two days, users reported the job 
manager stuck issue in 1.17 and 1.16. The 1.18 and 1.17 job instances are 
running in the same cluster; 1.16 is in a different cluster.

I attached another LeaderElector-Debug.json file that contains the debug log of 
a Flink 1.18 app. The issue happened several times:
1. the ConfigMap was not accessible from the API server, so the renew timeout 
was exceeded.
2. a failure to patch an updated ConfigMap.

The interesting part of the behavior over the last several days is that the job 
manager did not get stuck but exited directly. Then a new job manager pod 
started correctly, which is why a new leader is selected in the log above. 
Hopefully this is useful for your diagnosis.







was (Author: zhenqiuhuang):
[~mapohl]
I am intensively testing flink 1.18. Within two days, there are users reported 
the job manager stuck issue in 1.17 and 1.16. 1.18 and 1.17 job instances are 
running in the same cluster. 1.16 is in different cluster.

I attached another LeaderElector-Debug.json file that contains debug log of a 
flink 1.18 app. The issue happened several times:
1. due to the configmap not accessible from api sever then renew timeout 
exceeded 
2. a failure on patch on a updated configmap

The interesting part of the behavior of last several days is that job manager 
was not stuck but exit directly. Then, new job manager pod started correctly 
that is why new leader is selected in the log above. Hopefully, it is useful 
for your diagnosis.






> Flink Job stuck in suspend state after losing leadership in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.3, 1.17.2, 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Major
> Attachments: Debug.log, LeaderElector-Debug.json, job-manager.log
>
>
> The observation is that Job manager goes to suspend state with a failed 
> container not able to register itself to resource manager after timeout.
> JM Log, see attached
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode

2024-01-15 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17807017#comment-17807017
 ] 

Zhenqiu Huang commented on FLINK-34007:
---

[~mapohl]
I am intensively testing Flink 1.18. Within two days, users reported the job 
manager stuck issue in 1.17 and 1.16. The 1.18 and 1.17 job instances are 
running in the same cluster; 1.16 is in a different cluster.

I attached another LeaderElector-Debug.json file that contains the debug log of 
a Flink 1.18 app. The issue happened several times:
1. the ConfigMap was not accessible from the API server, so the renew timeout 
was exceeded.
2. a failure to patch an updated ConfigMap.

The interesting part of the behavior over the last several days is that the job 
manager did not get stuck but exited directly. Then a new job manager pod 
started correctly, which is why a new leader is selected in the log above. 
Hopefully this is useful for your diagnosis.






> Flink Job stuck in suspend state after losing leadership in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.3, 1.17.2, 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Major
> Attachments: Debug.log, LeaderElector-Debug.json, job-manager.log
>
>
> The observation is that Job manager goes to suspend state with a failed 
> container not able to register itself to resource manager after timeout.
> JM Log, see attached
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode

2024-01-15 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-34007:
--
Attachment: LeaderElector-Debug.json

> Flink Job stuck in suspend state after losing leadership in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.3, 1.17.2, 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Major
> Attachments: Debug.log, LeaderElector-Debug.json, job-manager.log
>
>
> The observation is that Job manager goes to suspend state with a failed 
> container not able to register itself to resource manager after timeout.
> JM Log, see attached
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode

2024-01-13 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17806410#comment-17806410
 ] 

Zhenqiu Huang commented on FLINK-34007:
---

[~mapohl]
Yes, from the observation of the failure case, the ConfigMap was not cleaned up 
when the job manager lost the leadership. Since the renewTime field is no longer 
updated by the leader elector, the leader elector has already exited its run 
loop. Looking into the fabric8 leader elector source code, it looks like the 
LeaderElector only aborts its run loop when the renew deadline expires. Even 
though I don't know why the renew deadline expired, enlarging the 
high-availability.kubernetes.leader-election.renew-deadline value could isolate 
some transient issues.

I started a testing job two days ago with debug logging for both 
io.fabric8.kubernetes.client.extended.leaderelection and the Flink Kubernetes 
leader election modules. If the job fails, I will post new logs in this thread.

[~wangyang0918]
Would you please elaborate a little bit why "It seems that the fabric8 
Kubernetes client leader elector will not work properly by run() more than once 
if we do not clean up the leader annotation."?



> Flink Job stuck in suspend state after losing leadership in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.3, 1.17.2, 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Major
> Attachments: Debug.log, job-manager.log
>
>
> The observation is that Job manager goes to suspend state with a failed 
> container not able to register itself to resource manager after timeout.
> JM Log, see attached
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode

2024-01-10 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805294#comment-17805294
 ] 

Zhenqiu Huang commented on FLINK-34007:
---

[~mapohl]
Yes, I mistakenly looked into the Flink 1.17 source code. I uploaded another 
debug log above. The KubernetesLeaderElector checks the annotation 
"control-plane.alpha.kubernetes.io/leader" and whether the lockIdentity exists 
in its content. Given that this job only has one job manager, there should be no 
other job manager instance trying to acquire the lock. The only possibility is 
that somehow the cluster ConfigMap is returned incorrectly.

In this case, even if the fabric8 LeaderElector keeps trying to acquire 
leadership (and can do so without exceeding the deadline), Flink will not be 
able to restart services (such as the RM and dispatcher) because the 
DefaultLeaderRetrievalService is also stopped. To resolve the issue for now, 
should we focus on gracefully shutting down the Job Manager rather than moving 
the job to SUSPENDED status?
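The annotation check described above can be sketched roughly as follows; this is an illustrative approximation of the logic, not the actual KubernetesLeaderElector code, and the annotation payload shown is a hypothetical example.

```java
import java.util.Map;

public class LeaderCheckSketch {
    /* Annotation key used for the Kubernetes leader election lock record. */
    static final String LEADER_ANNOTATION = "control-plane.alpha.kubernetes.io/leader";

    /* Returns true if the ConfigMap's leader annotation mentions the given
       lock identity, i.e. this JobManager still appears to hold the lease. */
    static boolean hasLeadership(Map<String, String> annotations, String lockIdentity) {
        String leader = annotations.get(LEADER_ANNOTATION);
        return leader != null && leader.contains(lockIdentity);
    }

    public static void main(String[] args) {
        // Hypothetical lock record as it might appear in the ConfigMap annotation.
        Map<String, String> ann = Map.of(LEADER_ANNOTATION,
                "{\"holderIdentity\":\"jm-abc123\",\"renewTime\":\"2024-01-13T00:00:00Z\"}");
        System.out.println(hasLeadership(ann, "jm-abc123")); // true
        System.out.println(hasLeadership(ann, "jm-other"));  // false
    }
}
```

If the API server returns a stale or incorrect ConfigMap, this check fails even for the only running JobManager, which matches the single-instance symptom described above.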


> Flink Job stuck in suspend state after losing leadership in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.3, 1.17.2, 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Major
> Attachments: Debug.log, job-manager.log
>
>
> The observation is that Job manager goes to suspend state with a failed 
> container not able to register itself to resource manager after timeout.
> JM Log, see attached
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode

2024-01-10 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-34007:
--
Attachment: Debug.log

> Flink Job stuck in suspend state after losing leadership in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.3, 1.17.2, 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Major
> Attachments: Debug.log, job-manager.log
>
>
> The observation is that Job manager goes to suspend state with a failed 
> container not able to register itself to resource manager after timeout.
> JM Log, see attached
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode

2024-01-10 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805203#comment-17805203
 ] 

Zhenqiu Huang commented on FLINK-34007:
---

I have been investigating 
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector. It looks 
like there is a bug in version 5.12.4: when the leader loses leadership, the 
loop executor is shut down. That is why leadership is not acquired again. Thus, 
bumping the kubernetes-client version will probably help.

There is a fix in this PR:
https://github.com/fabric8io/kubernetes-client/commit/042c77b360e77dfaac4ae713518b684dcd0d985b

> Flink Job stuck in suspend state after losing leadership in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.3, 1.17.2, 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Major
> Attachments: job-manager.log
>
>
> The observation is that Job manager goes to suspend state with a failed 
> container not able to register itself to resource manager after timeout.
> JM Log, see attached
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode

2024-01-09 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-34007:
--
Attachment: job-manager.log

> Flink Job stuck in suspend state after losing leadership in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Major
> Attachments: job-manager.log
>
>
> The observation is that Job manager goes to suspend state with a failed 
> container not able to register itself to resource manager after timeout.
> JM Log, see attached
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode

2024-01-09 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-34007:
--
Summary: Flink Job stuck in suspend state after losing leadership in HA 
Mode  (was: Flink Job stuck in suspend state after recovery from failure in HA 
Mode)

> Flink Job stuck in suspend state after losing leadership in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Major
>
> The observation is that Job manager goes to suspend state with a failed 
> container not able to register itself to resource manager after timeout.
> JM Log, see attached
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34007) Flink Job stuck in suspend state after recovery from failure in HA Mode

2024-01-09 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804840#comment-17804840
 ] 

Zhenqiu Huang commented on FLINK-34007:
---

From the initial investigation, the job manager initially loses the leadership 
and then goes to SUSPENDED status. Shouldn't the job manager exit directly 
rather than go to SUSPENDED status?

2024-01-08 21:44:57,142 INFO  
org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner [] - 
JobMasterServiceLeadershipRunner for job 217cee964b2cfdc3115fb74cac0ec550 was 
revoked leadership with leader id 9987190b-35f4-4238-b317-057dc3615e4d. 
Stopping current JobMasterServiceProcess.
2024-01-08 21:45:16,280 INFO  
org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - 
http://172.16.197.136:8081 lost leadership
2024-01-08 21:45:16,280 INFO  
org.apache.flink.runtime.resourcemanager.ResourceManagerServiceImpl [] - 
Resource manager service is revoked leadership with session id 
9987190b-35f4-4238-b317-057dc3615e4d.
2024-01-08 21:45:16,281 INFO  
org.apache.flink.runtime.dispatcher.runner.DefaultDispatcherRunner [] - 
DefaultDispatcherRunner was revoked the leadership with leader id 
9987190b-35f4-4238-b317-057dc3615e4d. Stopping the DispatcherLeaderProcess.
2024-01-08 21:45:16,282 INFO  
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - 
Stopping SessionDispatcherLeaderProcess.
2024-01-08 21:45:16,282 INFO  
org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Stopping 
dispatcher pekko.tcp://flink@172.16.197.136:6123/user/rpc/dispatcher_1.
2024-01-08 21:45:16,282 INFO  
org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Stopping all 
currently running jobs of dispatcher 
pekko.tcp://flink@172.16.197.136:6123/user/rpc/dispatcher_1.
2024-01-08 21:45:16,282 INFO  org.apache.flink.runtime.jobmaster.JobMaster  
   [] - Stopping the JobMaster for job 
'amp-ade-fitness-clickstream-projection-uat' (217cee964b2cfdc3115fb74cac0ec550).
2024-01-08 21:45:16,285 INFO  
org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job 
217cee964b2cfdc3115fb74cac0ec550 reached terminal state SUSPENDED.
2024-01-08 21:45:16,286 INFO  
org.apache.flink.runtime.security.token.DefaultDelegationTokenManager [] - 
Stopping credential renewal
2024-01-08 21:45:16,286 INFO  
org.apache.flink.runtime.security.token.DefaultDelegationTokenManager [] - 
Stopped credential renewal
2024-01-08 21:45:16,286 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.FineGrainedSlotManager [] 
- Closing the slot manager.
2024-01-08 21:45:16,286 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.FineGrainedSlotManager [] 
- Suspending the slot manager.
2024-01-08 21:45:16,287 INFO  
org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] - 
Stopping DefaultLeaderRetrievalService.
2024-01-08 21:45:16,287 INFO  
org.apache.flink.kubernetes.highavailability.KubernetesLeaderRetrievalDriver [] 
- Stopping 
KubernetesLeaderRetrievalDriver{configMapName='acsflink-5e92d541f0cd0ad7352c4dc5463c54df-cluster-config-map'}.
2024-01-08 21:45:16,287 INFO  
org.apache.flink.kubernetes.kubeclient.resources.KubernetesConfigMapSharedInformer
 [] - Stopped to watch for 
amp-ae-video-uat/acsflink-5e92d541f0cd0ad7352c4dc5463c54df-cluster-config-map, 
watching id:cc34317a-3299-4cb5-a966-55cb546e8bf9
2024-01-08 21:45:16,287 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job 
amp-ade-fitness-clickstream-projection-uat (217cee964b2cfdc3115fb74cac0ec550) 
switched from state RUNNING to SUSPENDED.

> Flink Job stuck in suspend state after recovery from failure in HA Mode
> ---
>
> Key: FLINK-34007
> URL: https://issues.apache.org/jira/browse/FLINK-34007
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.1, 1.18.2
>Reporter: Zhenqiu Huang
>Priority: Major
>
> The observation is that Job manager goes to suspend state with a failed 
> container not able to register itself to resource manager after timeout.
> JM Log:
> 2024-01-04 02:58:39,210 INFO  
> org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner [] - 
> JobMasterServiceLeadershipRunner for job 217cee964b2cfdc3115fb74cac0ec550 was 
> revoked leadership with leader id eda6fee6-ce02-4076-9a99-8c43a92629f7. 
> Stopping current JobMasterServiceProcess.
> 2024-01-04 02:58:58,347 INFO  
> org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - 
> http://172.16.71.11:8081 lost leadership
> 2024-01-04 02:58:58,347 INFO  
> org.apache.flink.runtime.resourcemanager.ResourceManagerServiceImpl [] - 
> Resource manager service is revoked leadership with session id 
> eda6fee6-ce02-4076-9a99-8c43a92629f7.
> 2024-01

[jira] [Created] (FLINK-34007) Flink Job stuck in suspend state after recovery from failure in HA Mode

2024-01-05 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-34007:
-

 Summary: Flink Job stuck in suspend state after recovery from 
failure in HA Mode
 Key: FLINK-34007
 URL: https://issues.apache.org/jira/browse/FLINK-34007
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.18.1, 1.18.2
Reporter: Zhenqiu Huang


The observation is that Job manager goes to suspend state with a failed 
container not able to register itself to resource manager after timeout.

JM Log:

2024-01-04 02:58:39,210 INFO  
org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner [] - 
JobMasterServiceLeadershipRunner for job 217cee964b2cfdc3115fb74cac0ec550 was 
revoked leadership with leader id eda6fee6-ce02-4076-9a99-8c43a92629f7. 
Stopping current JobMasterServiceProcess.
2024-01-04 02:58:58,347 INFO  
org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - 
http://172.16.71.11:8081 lost leadership
2024-01-04 02:58:58,347 INFO  
org.apache.flink.runtime.resourcemanager.ResourceManagerServiceImpl [] - 
Resource manager service is revoked leadership with session id 
eda6fee6-ce02-4076-9a99-8c43a92629f7.
2024-01-04 02:58:58,348 INFO  
org.apache.flink.runtime.dispatcher.runner.DefaultDispatcherRunner [] - 
DefaultDispatcherRunner was revoked the leadership with leader id 
eda6fee6-ce02-4076-9a99-8c43a92629f7. Stopping the DispatcherLeaderProcess.
2024-01-04 02:58:58,348 INFO  
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - 
Stopping SessionDispatcherLeaderProcess.
2024-01-04 02:58:58,349 INFO  
org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Stopping 
dispatcher pekko.tcp://flink@172.16.71.11:6123/user/rpc/dispatcher_1.
2024-01-04 02:58:58,349 INFO  org.apache.flink.runtime.jobmaster.JobMaster  
   [] - Stopping the JobMaster for job 
'amp-ade-fitness-clickstream-projection-uat' (217cee964b2cfdc3115fb74cac0ec550).
2024-01-04 02:58:58,349 INFO  
org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Stopping all 
currently running jobs of dispatcher 
pekko.tcp://flink@172.16.71.11:6123/user/rpc/dispatcher_1.
2024-01-04 02:58:58,351 INFO  
org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job 
217cee964b2cfdc3115fb74cac0ec550 reached terminal state SUSPENDED.
2024-01-04 02:58:58,352 INFO  
org.apache.flink.runtime.security.token.DefaultDelegationTokenManager [] - 
Stopping credential renewal
2024-01-04 02:58:58,352 INFO  
org.apache.flink.runtime.security.token.DefaultDelegationTokenManager [] - 
Stopped credential renewal
2024-01-04 02:58:58,352 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.FineGrainedSlotManager [] 
- Closing the slot manager.
2024-01-04 02:58:58,351 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job 
amp-ade-fitness-clickstream-projection-uat (217cee964b2cfdc3115fb74cac0ec550) 
switched from state RUNNING to SUSPENDED.
org.apache.flink.util.FlinkException: AdaptiveScheduler is being stopped.
at 
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler.closeAsync(AdaptiveScheduler.java:474)
 ~[flink-dist-1.18.1.6-ase.jar:1.18.1.6-ase]
at 
org.apache.flink.runtime.jobmaster.JobMaster.stopScheduling(JobMaster.java:1093)
 ~[flink-dist-1.18.1.6-ase.jar:1.18.1.6-ase]
at 
org.apache.flink.runtime.jobmaster.JobMaster.stopJobExecution(JobMaster.java:1056)
 ~[flink-dist-1.18.1.6-ase.jar:1.18.1.6-ase]
at 
org.apache.flink.runtime.jobmaster.JobMaster.onStop(JobMaster.java:454) 
~[flink-dist-1.18.1.6-ase.jar:1.18.1.6-ase]
at 
org.apache.flink.runtime.rpc.RpcEndpoint.internalCallOnStop(RpcEndpoint.java:239)
 ~[flink-dist-1.18.1.6-ase.jar:1.18.1.6-ase]
at 
org.apache.flink.runtime.rpc.pekko.PekkoRpcActor$StartedState.lambda$terminate$0(PekkoRpcActor.java:574)
 ~[flink-rpc-akkadb952114-fa83-4aba-b20a-b7e5771ce59c.jar:1.18.1.6-ase]
at 
org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83)
 ~[flink-dist-1.18.1.6-ase.jar:1.18.1.6-ase]
at 
org.apache.flink.runtime.rpc.pekko.PekkoRpcActor$StartedState.terminate(PekkoRpcActor.java:573)
 ~[flink-rpc-akkadb952114-fa83-4aba-b20a-b7e5771ce59c.jar:1.18.1.6-ase]
at 
org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleControlMessage(PekkoRpcActor.java:196)
 ~[flink-rpc-akkadb952114-fa83-4aba-b20a-b7e5771ce59c.jar:1.18.1.6-ase]
at 
org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33) 
[flink-rpc-akkadb952114-fa83-4aba-b20a-b7e5771ce59c.jar:1.18.1.6-ase]
at 
org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29) 
[flink-rpc-akkadb952114-fa83-4aba-b20a-b7e5771ce59c.jar:1.18.1.6-ase]
at scala.PartialFunction.applyOrElse(PartialFunction.scala:127) 
[flink-rpc-akkadb952114-fa83-4aba-b20a-b7e5771ce59c.jar:1.18.1.6-ase]

[jira] [Comment Edited] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-11-01 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781841#comment-17781841
 ] 

Zhenqiu Huang edited comment on FLINK-31275 at 11/1/23 6:23 PM:


I am recently working on the Flink integration in OpenLineage. In our org, 
customers mainly use the DataStream API for Iceberg, Kafka, and Cassandra. As 
the Table API is not even used, implementing these TableLineageVertex classes is 
probably not the best way.

I feel the most painful thing is inferring the schema of a source/sink from the 
lineage perspective. If the schema info can be provided by the Flink connector, 
the integration in OpenLineage or even other frameworks will be clean and 
concise.

was (Author: zhenqiuhuang):
I am recently working Flink with open lineage integration. In our org, 
customers mainly use the data stream api for icerberg, kafka, cassandra. As 
table API is  not even used, so implement these TableLineageVertex is probably 
not the best way.

I feel the most painful thing is to infer the schema of source/source for 
lineage perspective. If the schema info can be provided in Flink connector, the 
integration in open lineage or even other framework will be clean, concise. 

> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-11-01 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781841#comment-17781841
 ] 

Zhenqiu Huang commented on FLINK-31275:
---

I am recently working on Flink integration with OpenLineage. In our org, 
customers mainly use the DataStream API for Iceberg, Kafka, and Cassandra. As 
the Table API is not even used, implementing these TableLineageVertex classes is 
probably not the best way.

I feel the most painful thing is inferring the schema of a source/sink from the 
lineage perspective. If the schema info can be provided by the Flink connector, 
the integration in OpenLineage or even other frameworks will be clean and 
concise.

> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener





[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-10-15 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775527#comment-17775527
 ] 

Zhenqiu Huang commented on FLINK-31275:
---

[~zjureel]
We have similar requirements. To accelerate the development, I can help with some 
Jira tickets.

> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener





[jira] [Updated] (FLINK-33200) ItemAt Expression validation fail in Table API due to type mismatch

2023-10-09 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-33200:
--
Affects Version/s: 1.17.1
   1.18.0
   1.18.1

> ItemAt Expression validation fail in Table API due to type mismatch
> ---
>
> Key: FLINK-33200
> URL: https://issues.apache.org/jira/browse/FLINK-33200
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.18.0, 1.17.1, 1.18.1
>Reporter: Zhenqiu Huang
>Priority: Minor
> Fix For: 1.8.4
>
>
> The table schema is defined as below:
> public static final DataType DATA_TYPE = DataTypes.ROW(
> DataTypes.FIELD("id", DataTypes.STRING()),
> DataTypes.FIELD("events", 
> DataTypes.ARRAY(DataTypes.MAP(DataTypes.STRING(), DataTypes.STRING()))));
> public static final Schema SCHEMA = 
> Schema.newBuilder().fromRowDataType(DATA_TYPE).build();
> inTable.select(Expressions.$("events").at(1).at("eventType").as("firstEventType"));
> The validation fails because "eventType" is inferred as 
> BasicTypeInfo.STRING_TYPE_INFO, while the table's map key type is internally a 
> StringDataTypeInfo. The validation fails at 
> case mti: MapTypeInfo[_, _] =>
> if (key.resultType == mti.getKeyTypeInfo) {
>   ValidationSuccess
> } else {
>   ValidationFailure(
> s"Map entry access needs a valid key of type " +
>   s"'${mti.getKeyTypeInfo}', found '${key.resultType}'.")
> }





[jira] [Commented] (FLINK-33200) ItemAt Expression validation fail in Table API due to type mismatch

2023-10-09 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773373#comment-17773373
 ] 

Zhenqiu Huang commented on FLINK-33200:
---

We tried both 1.17.1 and 1.18-SNAPSHOT.

> ItemAt Expression validation fail in Table API due to type mismatch
> ---
>
> Key: FLINK-33200
> URL: https://issues.apache.org/jira/browse/FLINK-33200
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Zhenqiu Huang
>Priority: Minor
> Fix For: 1.8.4
>
>
> The table schema is defined as below:
> public static final DataType DATA_TYPE = DataTypes.ROW(
> DataTypes.FIELD("id", DataTypes.STRING()),
> DataTypes.FIELD("events", 
> DataTypes.ARRAY(DataTypes.MAP(DataTypes.STRING(), DataTypes.STRING()))));
> public static final Schema SCHEMA = 
> Schema.newBuilder().fromRowDataType(DATA_TYPE).build();
> inTable.select(Expressions.$("events").at(1).at("eventType").as("firstEventType"));
> The validation fails because "eventType" is inferred as 
> BasicTypeInfo.STRING_TYPE_INFO, while the table's map key type is internally a 
> StringDataTypeInfo. The validation fails at 
> case mti: MapTypeInfo[_, _] =>
> if (key.resultType == mti.getKeyTypeInfo) {
>   ValidationSuccess
> } else {
>   ValidationFailure(
> s"Map entry access needs a valid key of type " +
>   s"'${mti.getKeyTypeInfo}', found '${key.resultType}'.")
> }





[jira] [Commented] (FLINK-33200) ItemAt Expression validation fail in Table API due to type mismatch

2023-10-06 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772723#comment-17772723
 ] 

Zhenqiu Huang commented on FLINK-33200:
---

[~lzljs3620320]

As StringDataTypeInfo is the default key type in a Flink table, shall we loosen 
the validation to:
 
if (key.resultType == StringDataTypeInfo || key.resultType == 
BasicTypeInfo.STRING_TYPE_INFO)

> ItemAt Expression validation fail in Table API due to type mismatch
> ---
>
> Key: FLINK-33200
> URL: https://issues.apache.org/jira/browse/FLINK-33200
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Zhenqiu Huang
>Priority: Minor
> Fix For: 1.8.4
>
>
> The table schema is defined as below:
> public static final DataType DATA_TYPE = DataTypes.ROW(
> DataTypes.FIELD("id", DataTypes.STRING()),
> DataTypes.FIELD("events", 
> DataTypes.ARRAY(DataTypes.MAP(DataTypes.STRING(), DataTypes.STRING()))));
> public static final Schema SCHEMA = 
> Schema.newBuilder().fromRowDataType(DATA_TYPE).build();
> inTable.select(Expressions.$("events").at(1).at("eventType").as("firstEventType"));
> The validation fails because "eventType" is inferred as 
> BasicTypeInfo.STRING_TYPE_INFO, while the table's map key type is internally a 
> StringDataTypeInfo. The validation fails at 
> case mti: MapTypeInfo[_, _] =>
> if (key.resultType == mti.getKeyTypeInfo) {
>   ValidationSuccess
> } else {
>   ValidationFailure(
> s"Map entry access needs a valid key of type " +
>   s"'${mti.getKeyTypeInfo}', found '${key.resultType}'.")
> }





[jira] [Created] (FLINK-33200) ItemAt Expression validation fail in Table API due to type mismatch

2023-10-06 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-33200:
-

 Summary: ItemAt Expression validation fail in Table API due to 
type mismatch
 Key: FLINK-33200
 URL: https://issues.apache.org/jira/browse/FLINK-33200
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Reporter: Zhenqiu Huang
 Fix For: 1.8.4


The table schema is defined as below:

public static final DataType DATA_TYPE = DataTypes.ROW(
DataTypes.FIELD("id", DataTypes.STRING()),
DataTypes.FIELD("events", 
DataTypes.ARRAY(DataTypes.MAP(DataTypes.STRING(), DataTypes.STRING()))));

public static final Schema SCHEMA = 
Schema.newBuilder().fromRowDataType(DATA_TYPE).build();


inTable.select(Expressions.$("events").at(1).at("eventType").as("firstEventType"));

The validation fails because "eventType" is inferred as 
BasicTypeInfo.STRING_TYPE_INFO, while the table's map key type is internally a 
StringDataTypeInfo. The validation fails at 

case mti: MapTypeInfo[_, _] =>
if (key.resultType == mti.getKeyTypeInfo) {
  ValidationSuccess
} else {
  ValidationFailure(
s"Map entry access needs a valid key of type " +
  s"'${mti.getKeyTypeInfo}', found '${key.resultType}'.")
}
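To make the mismatch concrete, here is a minimal, Flink-free Java model of the check. The TypeInfo class below is a stand-in for Flink's BasicTypeInfo.STRING_TYPE_INFO and the internal StringDataTypeInfo, not the real implementations: two descriptors that both represent strings but are not equal, so a strict equality comparison fails while a comparison on the logical type succeeds, in the spirit of the loosened check suggested in the comments above.

```java
import java.util.Objects;

// Stand-in for Flink's type descriptors; "className" models the distinct
// runtime representation, "logicalType" the shared logical type.
final class TypeInfo {
    final String className;
    final String logicalType;

    TypeInfo(String className, String logicalType) {
        this.className = className;
        this.logicalType = logicalType;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof TypeInfo && ((TypeInfo) o).className.equals(className);
    }

    @Override
    public int hashCode() {
        return Objects.hash(className);
    }
}

public class MapKeyValidationDemo {
    // Strict check, mirroring `key.resultType == mti.getKeyTypeInfo`.
    static boolean strictCheck(TypeInfo keyType, TypeInfo mapKeyType) {
        return keyType.equals(mapKeyType);
    }

    // Loosened check comparing logical types instead of concrete descriptors.
    static boolean loosenedCheck(TypeInfo keyType, TypeInfo mapKeyType) {
        return keyType.logicalType.equals(mapKeyType.logicalType);
    }

    public static void main(String[] args) {
        TypeInfo literalKey = new TypeInfo("BasicTypeInfo.STRING", "STRING");
        TypeInfo internalKey = new TypeInfo("StringDataTypeInfo", "STRING");
        // prints strict=false loosened=true
        System.out.println("strict=" + strictCheck(literalKey, internalKey)
                + " loosened=" + loosenedCheck(literalKey, internalKey));
    }
}
```

This is only a model of the failure mode; the actual fix would live in Flink's expression validation, not in user code.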










[jira] [Created] (FLINK-33198) Add timestamp with local time zone support in Avro converters

2023-10-06 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-33198:
-

 Summary: Add timestamp with local time zone support in Avro 
converters
 Key: FLINK-33198
 URL: https://issues.apache.org/jira/browse/FLINK-33198
 Project: Flink
  Issue Type: Improvement
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Reporter: Zhenqiu Huang
 Fix For: 1.18.1


Currently, RowDataToAvroConverters doesn't handle the LogicalType 
TIMESTAMP_WITH_LOCAL_TIME_ZONE. We should add the corresponding conversion.
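The conversion itself is straightforward. As a hedged sketch of the missing logic — plain java.time, not the actual RowDataToAvroConverters code — Avro's timestamp-micros logical type stores microseconds since the epoch, so a TIMESTAMP_WITH_LOCAL_TIME_ZONE value carried as an Instant could be converted like this:

```java
import java.time.Instant;

// Sketch only: the kind of branch that would be added to
// RowDataToAvroConverters for TIMESTAMP_WITH_LOCAL_TIME_ZONE, converting an
// instant to Avro's timestamp-micros encoding (microseconds since epoch).
public class TimestampLtzToAvro {
    static long toEpochMicros(Instant ts) {
        // epoch seconds scaled to micros, plus the sub-second nanos as micros
        return ts.getEpochSecond() * 1_000_000L + ts.getNano() / 1_000L;
    }

    public static void main(String[] args) {
        Instant ts = Instant.parse("1970-01-01T00:00:01.500Z");
        System.out.println(toEpochMicros(ts)); // prints 1500000
    }
}
```

The real converter would additionally pattern-match on the logical type's precision (millis vs. micros) to pick the matching Avro logical type.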





[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-10-05 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772399#comment-17772399
 ] 

Zhenqiu Huang commented on FLINK-31275:
---

[~jark]
This JIRA is mentioned in FLIP-314. Is anyone actually working on it? 

> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Priority: Major
>
> Currently Flink generates a job id in `JobGraph` which can identify a job. On 
> the other hand, Flink creates source/sink tables in the planner. We need to 
> create relations between the source and sink tables of a job with an identifier id





[jira] [Comment Edited] (FLINK-32777) Upgrade Okhttp version to support IPV6

2023-09-06 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762137#comment-17762137
 ] 

Zhenqiu Huang edited comment on FLINK-32777 at 9/7/23 3:52 AM:
---

[~heywxl] Thanks for the confirmation. 

Given there is already a workaround, the PR will be revised to add IPv6 support 
suggestions to the docs. Meanwhile, upgrading the OkHttp version once 5.0.0 is 
released remains the ultimate goal. 


was (Author: zhenqiuhuang):
[~heywxl] Thanks for the confirmation. 

Given there is already a workaround, the PR will be revised to add IPv6 support 
suggestions to the docs. Meanwhile, upgrading the OkHttp version once 5.0.1 is 
released remains the ultimate goal. 

> Upgrade Okhttp version to support IPV6
> --
>
> Key: FLINK-32777
> URL: https://issues.apache.org/jira/browse/FLINK-32777
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Zhenqiu Huang
>Assignee: Zhenqiu Huang
>Priority: Minor
>
> It is reported by user:
> I was testing flink-kubernetes-operator in an IPv6 cluster and found out the 
> below issues:
> Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname 
> fd70:e66a:970d::1 not verified:
> certificate: sha256/EmX0EhNn75iJO353Pi+1rClwZyVLe55HN3l5goaneKQ=
> DN: CN=kube-apiserver
> subjectAltNames: [fd70:e66a:970d:0:0:0:0:1, 
> 2406:da14:2:770b:3b82:51d7:9e89:76ce, 10.0.170.248, 
> c0c813eaff4f9d66084de428125f0b9c.yl4.ap-northeast-1.eks.amazonaws.com, 
> ip-10-0-170-248.ap-northeast-1.compute.internal, kubernetes, 
> kubernetes.default, kubernetes.default.svc, 
> kubernetes.default.svc.cluster.local]
> Which seemed to be related to a known issue of okhttp.





[jira] [Commented] (FLINK-32777) Upgrade Okhttp version to support IPV6

2023-09-05 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762137#comment-17762137
 ] 

Zhenqiu Huang commented on FLINK-32777:
---

[~heywxl] Thanks for the confirmation. 

Given there is already a workaround, the PR will be revised to add IPv6 support 
suggestions to the docs. Meanwhile, upgrading the OkHttp version once 5.0.1 is 
released remains the ultimate goal. 

> Upgrade Okhttp version to support IPV6
> --
>
> Key: FLINK-32777
> URL: https://issues.apache.org/jira/browse/FLINK-32777
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Zhenqiu Huang
>Assignee: Zhenqiu Huang
>Priority: Minor
>
> It is reported by user:
> I was testing flink-kubernetes-operator in an IPv6 cluster and found out the 
> below issues:
> Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname 
> fd70:e66a:970d::1 not verified:
> certificate: sha256/EmX0EhNn75iJO353Pi+1rClwZyVLe55HN3l5goaneKQ=
> DN: CN=kube-apiserver
> subjectAltNames: [fd70:e66a:970d:0:0:0:0:1, 
> 2406:da14:2:770b:3b82:51d7:9e89:76ce, 10.0.170.248, 
> c0c813eaff4f9d66084de428125f0b9c.yl4.ap-northeast-1.eks.amazonaws.com, 
> ip-10-0-170-248.ap-northeast-1.compute.internal, kubernetes, 
> kubernetes.default, kubernetes.default.svc, 
> kubernetes.default.svc.cluster.local]
> Which seemed to be related to a known issue of okhttp.





[jira] [Commented] (FLINK-32777) Upgrade Okhttp version to support IPV6

2023-08-07 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751882#comment-17751882
 ] 

Zhenqiu Huang commented on FLINK-32777:
---

The Flink operator is already using the latest stable version, 4.11. This means 
the fix has not been officially released in a stable version yet.

[~gyfora] What's your opinion on this?

> Upgrade Okhttp version to support IPV6
> --
>
> Key: FLINK-32777
> URL: https://issues.apache.org/jira/browse/FLINK-32777
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> It is reported by user:
> I was testing flink-kubernetes-operator in an IPv6 cluster and found out the 
> below issues:
> Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname 
> fd70:e66a:970d::1 not verified:
> certificate: sha256/EmX0EhNn75iJO353Pi+1rClwZyVLe55HN3l5goaneKQ=
> DN: CN=kube-apiserver
> subjectAltNames: [fd70:e66a:970d:0:0:0:0:1, 
> 2406:da14:2:770b:3b82:51d7:9e89:76ce, 10.0.170.248, 
> c0c813eaff4f9d66084de428125f0b9c.yl4.ap-northeast-1.eks.amazonaws.com, 
> ip-10-0-170-248.ap-northeast-1.compute.internal, kubernetes, 
> kubernetes.default, kubernetes.default.svc, 
> kubernetes.default.svc.cluster.local]
> Which seemed to be related to a known issue of okhttp.





[jira] [Commented] (FLINK-32777) Upgrade Okhttp version to support IPV6

2023-08-07 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751880#comment-17751880
 ] 

Zhenqiu Huang commented on FLINK-32777:
---

From the discussion in the OkHttp community, the issue is caused by the IPv6 
hostname verification logic. It was resolved by this PR:

https://github.com/square/okhttp/pull/5889

> Upgrade Okhttp version to support IPV6
> --
>
> Key: FLINK-32777
> URL: https://issues.apache.org/jira/browse/FLINK-32777
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Zhenqiu Huang
>Priority: Major
>
> It is reported by user:
> I was testing flink-kubernetes-operator in an IPv6 cluster and found out the 
> below issues:
> Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname 
> fd70:e66a:970d::1 not verified:
> certificate: sha256/EmX0EhNn75iJO353Pi+1rClwZyVLe55HN3l5goaneKQ=
> DN: CN=kube-apiserver
> subjectAltNames: [fd70:e66a:970d:0:0:0:0:1, 
> 2406:da14:2:770b:3b82:51d7:9e89:76ce, 10.0.170.248, 
> c0c813eaff4f9d66084de428125f0b9c.yl4.ap-northeast-1.eks.amazonaws.com, 
> ip-10-0-170-248.ap-northeast-1.compute.internal, kubernetes, 
> kubernetes.default, kubernetes.default.svc, 
> kubernetes.default.svc.cluster.local]
> Which seemed to be related to a known issue of okhttp.





[jira] [Updated] (FLINK-32777) Upgrade Okhttp version to support IPV6

2023-08-07 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-32777:
--
Priority: Minor  (was: Major)

> Upgrade Okhttp version to support IPV6
> --
>
> Key: FLINK-32777
> URL: https://issues.apache.org/jira/browse/FLINK-32777
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> It is reported by user:
> I was testing flink-kubernetes-operator in an IPv6 cluster and found out the 
> below issues:
> Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname 
> fd70:e66a:970d::1 not verified:
> certificate: sha256/EmX0EhNn75iJO353Pi+1rClwZyVLe55HN3l5goaneKQ=
> DN: CN=kube-apiserver
> subjectAltNames: [fd70:e66a:970d:0:0:0:0:1, 
> 2406:da14:2:770b:3b82:51d7:9e89:76ce, 10.0.170.248, 
> c0c813eaff4f9d66084de428125f0b9c.yl4.ap-northeast-1.eks.amazonaws.com, 
> ip-10-0-170-248.ap-northeast-1.compute.internal, kubernetes, 
> kubernetes.default, kubernetes.default.svc, 
> kubernetes.default.svc.cluster.local]
> Which seemed to be related to a known issue of okhttp.





[jira] [Created] (FLINK-32777) Upgrade Okhttp version to support IPV6

2023-08-07 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-32777:
-

 Summary: Upgrade Okhttp version to support IPV6
 Key: FLINK-32777
 URL: https://issues.apache.org/jira/browse/FLINK-32777
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Zhenqiu Huang


It was reported by a user:


I was testing flink-kubernetes-operator in an IPv6 cluster and found out the 
below issues:

Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname fd70:e66a:970d::1 
not verified:
certificate: sha256/EmX0EhNn75iJO353Pi+1rClwZyVLe55HN3l5goaneKQ=
DN: CN=kube-apiserver
subjectAltNames: [fd70:e66a:970d:0:0:0:0:1, 
2406:da14:2:770b:3b82:51d7:9e89:76ce, 10.0.170.248, 
c0c813eaff4f9d66084de428125f0b9c.yl4.ap-northeast-1.eks.amazonaws.com, 
ip-10-0-170-248.ap-northeast-1.compute.internal, kubernetes, 
kubernetes.default, kubernetes.default.svc, 
kubernetes.default.svc.cluster.local]

Which seemed to be related to a known issue of okhttp.





[jira] [Updated] (FLINK-32729) allow create an initial deployment with suspend state

2023-08-02 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-32729:
--
Description: With this feature, users could create an application in a 
suspended state as a backup for another running application, improving failure 
recovery time.
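The feature would build on the operator's existing `job.state` field. As a hedged sketch — the resource names and image below are illustrative, and whether `state: suspended` is accepted on the very first deployment is exactly what this ticket proposes — a standby FlinkDeployment might look like:

```yaml
# Sketch (not from the ticket): a FlinkDeployment created directly in
# suspended state as a standby; flip job.state to 'running' to fail over.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: standby-app          # illustrative name
spec:
  image: flink:1.17          # illustrative image
  flinkVersion: v1_17
  serviceAccount: flink
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    parallelism: 2
    state: suspended         # start suspended instead of running
```

Today the operator only accepts `state: suspended` as a transition on an already-deployed job, which is why an initial suspended deployment needs this change.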

> allow create an initial deployment with suspend state
> -
>
> Key: FLINK-32729
> URL: https://issues.apache.org/jira/browse/FLINK-32729
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Zhenqiu Huang
>Assignee: Zhenqiu Huang
>Priority: Minor
>
> With this feature, users could create an application in a suspended state as a 
> backup for another running application, improving failure recovery time.





[jira] [Created] (FLINK-32729) allow create an initial deployment with suspend state

2023-08-01 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-32729:
-

 Summary: allow create an initial deployment with suspend state
 Key: FLINK-32729
 URL: https://issues.apache.org/jira/browse/FLINK-32729
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Zhenqiu Huang








[jira] [Created] (FLINK-32597) Drop Yarn specific get rest endpoints

2023-07-16 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-32597:
-

 Summary: Drop Yarn specific get rest endpoints 
 Key: FLINK-32597
 URL: https://issues.apache.org/jira/browse/FLINK-32597
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / YARN
Affects Versions: 1.18.0
Reporter: Zhenqiu Huang


As listed in the 2.0 release plan, we need to drop the YARN-specific mutating 
GET REST endpoints (yarn-cancel, yarn-stop).
We shouldn't continue having such hacks in our APIs to work around YARN 
deficiencies.





[jira] [Commented] (FLINK-31849) Enforce some some simple validation directly through the CRD

2023-05-13 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722435#comment-17722435
 ] 

Zhenqiu Huang commented on FLINK-31849:
---

[~gyfora]

May I take this?

> Enforce some some simple validation directly through the CRD
> 
>
> Key: FLINK-31849
> URL: https://issues.apache.org/jira/browse/FLINK-31849
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Priority: Major
>
> The CRD generator allows us to annotate some fields to add basic validation 
> such as required, min/max etc. 
> We should use these to simplify the validation logic.
> https://github.com/fabric8io/kubernetes-client/blob/master/doc/CRD-generator.md





[jira] [Commented] (FLINK-31998) Flink Operator Deadlock on run job Failure

2023-05-13 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722434#comment-17722434
 ] 

Zhenqiu Huang commented on FLINK-31998:
---

[~gyfora] Technically, when a session job is created, there is actually a session 
cluster that can run multiple jobs in parallel or sequentially. But in the 
session job CRD, the cluster-to-job mapping is 1-to-1. We probably need to 
adjust the CRD to decouple the job status from the session cluster status.

> Flink Operator Deadlock on run job Failure
> --
>
> Key: FLINK-31998
> URL: https://issues.apache.org/jira/browse/FLINK-31998
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.2.0, kubernetes-operator-1.3.0, 
> kubernetes-operator-1.4.0
>Reporter: Ahmed Hamdy
>Priority: Major
> Fix For: kubernetes-operator-1.6.0
>
> Attachments: gleek-m6pLe3Wy--IpCKQavAQwBQ.png
>
>
> h2. Description
> The FlinkOperator Reconciler goes into a deadlock situation where it never 
> updates the session job to DEPLOYED/ROLLED_BACK if {{deploy}} fails.
> Attached is a sequence diagram of the issue, where the FlinkSessionJob is 
> stuck in UPGRADING indefinitely.
> h2. proposed fix
> The Reconciler should roll back the CR changes if {{deploy()}} fails while 
> {{reconciliationStatus.isBeforeFirstDeployment()}}.
> [diagram|https://issues.apache.org/7239bb39-60d8-48a0-9052-f3231947edbe]





[jira] [Created] (FLINK-31947) Enable stdout redirect in flink-console.sh

2023-04-26 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-31947:
-

 Summary: Enable stdout redirect in flink-console.sh
 Key: FLINK-31947
 URL: https://issues.apache.org/jira/browse/FLINK-31947
 Project: Flink
  Issue Type: Improvement
  Components: flink-docker
Reporter: Zhenqiu Huang


 flink-console.sh is used by the Flink Kubernetes scripts to start containers, 
but there is no stdout redirect as in flink-daemon.sh. This causes a problem: 
when a user wants to access stdout from the web UI, no file is found.
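The missing behavior can be sketched in a few lines of shell — a hedged illustration of the kind of redirect flink-daemon.sh performs, not the actual script (the path and message below are examples):

```shell
#!/usr/bin/env bash
# Illustrative sketch: capture a foreground process's stdout/stderr into a
# .out file, the way flink-daemon.sh does for daemonized processes, so the
# web UI's stdout tab has a file to read. Paths and names are examples.
FLINK_LOG_PREFIX="/tmp/flink-console-demo"
out="${FLINK_LOG_PREFIX}.out"

# Run the "entrypoint" with stdout and stderr redirected to the .out file.
( exec > "$out" 2>&1; echo "TaskExecutor started." )

cat "$out"
```

In flink-console.sh the same idea would mean redirecting before exec-ing the JVM, so the container's stdout is mirrored to a file the REST API can serve.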





[jira] [Created] (FLINK-31730) Support Ephemeral Storage in KubernetesConfigOptions

2023-04-04 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-31730:
-

 Summary: Support Ephemeral Storage in KubernetesConfigOptions
 Key: FLINK-31730
 URL: https://issues.apache.org/jira/browse/FLINK-31730
 Project: Flink
  Issue Type: New Feature
  Components: Deployment / Kubernetes
Affects Versions: 1.18.0, 1.17.1
Reporter: Zhenqiu Huang


There is a common need to configure the Flink main container with an ephemeral 
storage size. It would be more user-friendly to support it as a Flink config option.
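Today this requires a pod template. As a sketch of the underlying Kubernetes resource stanza such a Flink config option would ultimately generate — these are standard Kubernetes fields, while the Flink option name itself was still to be decided at the time of this ticket:

```yaml
# Sketch: the container resources an ephemeral-storage Flink config option
# would translate into on the flink-main-container. Values are examples.
apiVersion: v1
kind: Pod
metadata:
  name: flink-main-container-example
spec:
  containers:
    - name: flink-main-container
      image: flink:1.17
      resources:
        requests:
          ephemeral-storage: "2Gi"
        limits:
          ephemeral-storage: "4Gi"
```

A first-class config option would spare users from maintaining a full pod template just for this one stanza.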





[jira] [Commented] (FLINK-30609) Add ephemeral storage to CRD

2023-04-03 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17708165#comment-17708165
 ] 

Zhenqiu Huang commented on FLINK-30609:
---

[~nfraison.datadog]
Please review the PR according to the requirements of your org. Thank you very 
much!

> Add ephemeral storage to CRD
> 
>
> Key: FLINK-30609
> URL: https://issues.apache.org/jira/browse/FLINK-30609
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Matyas Orhidi
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: kubernetes-operator-1.5.0
>
>
> We should consider adding ephemeral storage to the existing [resource 
> specification 
> |https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/reference/#resource]in
>  CRD, next to {{cpu}} and {{memory}}
> https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#setting-requests-and-limits-for-local-ephemeral-storage





[jira] [Commented] (FLINK-30609) Add ephemeral storage to CRD

2023-04-03 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17708033#comment-17708033
 ] 

Zhenqiu Huang commented on FLINK-30609:
---

[~nfraison.datadog]
We are glad that you also have this need. I am actively working on the PR now; 
please allow 1 or 2 days. 

> Add ephemeral storage to CRD
> 
>
> Key: FLINK-30609
> URL: https://issues.apache.org/jira/browse/FLINK-30609
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Matyas Orhidi
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: starter
> Fix For: kubernetes-operator-1.5.0
>
>
> We should consider adding ephemeral storage to the existing [resource 
> specification 
> |https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/reference/#resource]in
>  CRD, next to {{cpu}} and {{memory}}
> https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#setting-requests-and-limits-for-local-ephemeral-storage





[jira] [Commented] (FLINK-30363) Disable HadoopConf and Kerberos Decorators by default

2022-12-11 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17645924#comment-17645924
 ] 

Zhenqiu Huang commented on FLINK-30363:
---

Discussed with [~gyfora] offline; the change will wait for the upstream Flink 
feature.
https://issues.apache.org/jira/browse/FLINK-28831

> Disable HadoopConf and Kerberos Decorators by default
> 
>
> Key: FLINK-30363
> URL: https://issues.apache.org/jira/browse/FLINK-30363
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.1.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> Flink natively relies on these two decorators to create config maps and K8s 
> secrets before creating the job manager pod in K8s. This works well for job 
> submission through the Flink CLI.
> org.apache.flink.kubernetes.kubeclient.decorators.HadoopConfMountDecorator
> org.apache.flink.kubernetes.kubeclient.decorators.KerberosMountDecorator
> But it doesn't work in operator mode:
> 1) The operator classpath doesn't have Hadoop or Kerberos related info, so 
> the resources won't be created.
> 2) When the FlinkResourceManager creates a TM pod: if the JM pod has Hadoop 
> env variables or Kerberos config included in the application Flink conf, then 
> one of the decorators will be enabled, and the TM pod will mount config maps 
> or secrets that have not been created yet.
> Thus, we should help users disable these two decorators by default.





[jira] [Comment Edited] (FLINK-30363) Disable HadoopConf and Kerberos Decorators by default

2022-12-11 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17645844#comment-17645844
 ] 

Zhenqiu Huang edited comment on FLINK-30363 at 12/11/22 7:34 PM:
-

[~bamrabi] [~gyfora]

Would you please assign this jira to me, if you think it is a reasonable 
improvement?


was (Author: zhenqiuhuang):
[~gyfora]

Would you please assign this jira to me, if you think it is a reasonable 
improvement?

> Disable HadoopConf and Kerberos Decorators by default
> 
>
> Key: FLINK-30363
> URL: https://issues.apache.org/jira/browse/FLINK-30363
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.1.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> Flink natively relies on these two decorators to create config maps and K8s 
> secrets before creating the job manager pod in K8s. This works well for job 
> submission through the Flink CLI.
> org.apache.flink.kubernetes.kubeclient.decorators.HadoopConfMountDecorator
> org.apache.flink.kubernetes.kubeclient.decorators.KerberosMountDecorator
> But it doesn't work in operator mode:
> 1) The operator classpath doesn't have Hadoop or Kerberos related info, so 
> the resources won't be created.
> 2) When the FlinkResourceManager creates a TM pod: if the JM pod has Hadoop 
> env variables or Kerberos config included in the application Flink conf, then 
> one of the decorators will be enabled, and the TM pod will mount config maps 
> or secrets that have not been created yet.
> Thus, we should help users disable these two decorators by default.





[jira] [Commented] (FLINK-30363) Disable HadoopConf and Kerberos Decorators by default

2022-12-11 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17645844#comment-17645844
 ] 

Zhenqiu Huang commented on FLINK-30363:
---

[~gyfora]

Would you please assign this jira to me, if you think it is a reasonable 
improvement?

> Disable HadoopConf and Kerberos Decorators by default
> 
>
> Key: FLINK-30363
> URL: https://issues.apache.org/jira/browse/FLINK-30363
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.1.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> Flink natively relies on these two decorators to create config maps and K8s 
> secrets before creating the job manager pod in K8s. This works well for job 
> submission through the Flink CLI.
> org.apache.flink.kubernetes.kubeclient.decorators.HadoopConfMountDecorator
> org.apache.flink.kubernetes.kubeclient.decorators.KerberosMountDecorator
> But it doesn't work in operator mode:
> 1) The operator classpath doesn't have Hadoop or Kerberos related info, so 
> the resources won't be created.
> 2) When the FlinkResourceManager creates a TM pod: if the JM pod has Hadoop 
> env variables or Kerberos config included in the application Flink conf, then 
> one of the decorators will be enabled, and the TM pod will mount config maps 
> or secrets that have not been created yet.
> Thus, we should help users disable these two decorators by default.





[jira] [Created] (FLINK-30363) Disable HadoopConf and Kerberos Decorators by default

2022-12-11 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-30363:
-

 Summary: Disable HadoopConf and Kerberos Decorators by default
 Key: FLINK-30363
 URL: https://issues.apache.org/jira/browse/FLINK-30363
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.1.0
Reporter: Zhenqiu Huang


Flink natively relies on these two decorators to create config maps and k8s 
secrets before creating the job manager pod in K8s. This works well for job 
submission through the Flink CLI.

org.apache.flink.kubernetes.kubeclient.decorators.HadoopConfMountDecorator
org.apache.flink.kubernetes.kubeclient.decorators.KerberosMountDecorator

But it doesn't work in operator mode:
1) The operator classpath doesn't have Hadoop or Kerberos related info, so the 
resources won't be created.
2) When FlinkResourceManager creates the TM pod, if the JM pod has Hadoop env 
variables or Kerberos config included in the application Flink conf, one of the 
decorators will be enabled. The TM pod will then mount config maps or secrets 
that have not been created yet.

Thus, we should help users to disable these two Decorators by default.





--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-27831) Provide example of Beam on the k8s operator

2022-12-06 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644076#comment-17644076
 ] 

Zhenqiu Huang commented on FLINK-27831:
---

[~mbalassi] I am interested in this task. Would you please assign it to me?

> Provide example of Beam on the k8s operator
> ---
>
> Key: FLINK-27831
> URL: https://issues.apache.org/jira/browse/FLINK-27831
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Márton Balassi
>Assignee: Márton Balassi
>Priority: Minor
> Fix For: kubernetes-operator-1.4.0
>
>
> Multiple users have asked whether the operator supports Beam jobs in 
> different shapes. I assume that running a Beam job with the current operator 
> ultimately comes down to having the right jars on the classpath / packaged 
> into the user's fatjar.
> At this stage I suggest adding one such example; providing it might attract 
> new users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-29655) Split Flink CRD from flink-kubernetes-operator module

2022-10-17 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17618947#comment-17618947
 ] 

Zhenqiu Huang edited comment on FLINK-29655 at 10/17/22 2:58 PM:
-

[~gyfora]

I feel flink-kubernetes-operator-api is more accurate. Would you please assign 
the jira to me?


was (Author: zhenqiuhuang):
[~gyfora]

I feel flink-kubernetes-operator-api is more accurate. Would you please assign 
the jira to me?

> Split Flink CRD from flink-kubernetes-operator module
> -
>
> Key: FLINK-29655
> URL: https://issues.apache.org/jira/browse/FLINK-29655
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.2.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> To launch a Flink job managed by the Flink operator, a backend service must 
> introduce the Flink CRD as a dependency to create the CR. In the current 
> model, all of the Flink libs are pulled into the service, but what is actually 
> required are the model classes in the 
> org.apache.flink.kubernetes.operator.crd package.
> It would be user friendly to split org.apache.flink.kubernetes.operator.crd 
> into a separate module such as flink-kubernetes-crd. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29655) Split Flink CRD from flink-kubernetes-operator module

2022-10-17 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17618947#comment-17618947
 ] 

Zhenqiu Huang commented on FLINK-29655:
---

[~gyfora]

I feel flink-kubernetes-operator-api is more accurate. Would you please assign 
the jira to me?

> Split Flink CRD from flink-kubernetes-operator module
> -
>
> Key: FLINK-29655
> URL: https://issues.apache.org/jira/browse/FLINK-29655
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.2.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> To launch a Flink job managed by the Flink operator, a backend service must 
> introduce the Flink CRD as a dependency to create the CR. In the current 
> model, all of the Flink libs are pulled into the service, but what is actually 
> required are the model classes in the 
> org.apache.flink.kubernetes.operator.crd package.
> It would be user friendly to split org.apache.flink.kubernetes.operator.crd 
> into a separate module such as flink-kubernetes-crd. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29655) Split Flink CRD from flink-kubernetes-operator module

2022-10-17 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-29655:
-

 Summary: Split Flink CRD from flink-kubernetes-operator module
 Key: FLINK-29655
 URL: https://issues.apache.org/jira/browse/FLINK-29655
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.2.0
Reporter: Zhenqiu Huang


To launch a Flink job managed by the Flink operator, a backend service must 
introduce the Flink CRD as a dependency to create the CR. In the current model, 
all of the Flink libs are pulled into the service, but what is actually required 
are the model classes in the org.apache.flink.kubernetes.operator.crd package.

It would be user friendly to split org.apache.flink.kubernetes.operator.crd into 
a separate module such as flink-kubernetes-crd. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29363) Allow web ui to fully redirect to other page

2022-09-21 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607823#comment-17607823
 ] 

Zhenqiu Huang commented on FLINK-29363:
---

Yes, our scenario is exactly the same as what [~rmetzger] explained. 

[~martijnvisser]
Yes, we already have an auth proxy globally. But the proxy server our team built 
limits the access of the job owner (who has already been authenticated) to the 
web ui of jobs running in the k8s cluster. The proxy server runs in a k8s 
cluster as part of the control plane for all Flink jobs. This setting is 
required by our security team. Basically, AJAX requests need to attach a cookie 
for access. If the cookie expires, we need a way to redirect users to our 
platform's landing page.
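A minimal sketch of the redirect decision such an auth proxy could make (the class name and landing-page URL below are illustrative, not our actual implementation): a browser navigation can follow a plain 302, while an AJAX caller gets a 401 plus the landing page in a header so the frontend can trigger a full-page redirect itself.

```java
public class RedirectPolicy {
    // Hypothetical landing page; the real URL is platform-specific.
    static final String LANDING_PAGE = "https://platform.example.com/login";

    /**
     * Status an auth proxy could send for an unauthenticated request.
     * Browsers (no X-Requested-With header) can follow a 302; AJAX callers
     * get a 401 and would read LANDING_PAGE from a response header to
     * perform the full-page redirect on the client side.
     */
    public static int statusFor(String xRequestedWithHeader) {
        boolean isAjax = "XMLHttpRequest".equalsIgnoreCase(xRequestedWithHeader);
        return isAjax ? 401 : 302;
    }

    public static void main(String[] args) {
        System.out.println(statusFor("XMLHttpRequest")); // AJAX caller -> 401
        System.out.println(statusFor(null));             // browser -> 302
    }
}
```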

> Allow web ui to fully redirect to other page
> 
>
> Key: FLINK-29363
> URL: https://issues.apache.org/jira/browse/FLINK-29363
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.15.2
>Reporter: Zhenqiu Huang
>Priority: Minor
>  Labels: pull-request-available
>
> In a streaming platform system, the web ui usually integrates with an internal 
> authentication and authorization system. When validation fails, the request 
> needs to be redirected to a landing page. This doesn't work for AJAX requests. 
> It would be great to have the web ui configurable to allow an automatic full 
> redirect. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29363) Allow web ui to fully redirect to other page

2022-09-20 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-29363:
-

 Summary: Allow web ui to fully redirect to other page
 Key: FLINK-29363
 URL: https://issues.apache.org/jira/browse/FLINK-29363
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 1.15.2
Reporter: Zhenqiu Huang


In a streaming platform system, the web ui usually integrates with an internal 
authentication and authorization system. When validation fails, the request 
needs to be redirected to a landing page. This doesn't work for AJAX requests. 
It would be great to have the web ui configurable to allow an automatic full 
redirect. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-29327) Operator configs are showing up among standard Flink configs

2022-09-16 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605955#comment-17605955
 ] 

Zhenqiu Huang edited comment on FLINK-29327 at 9/16/22 8:56 PM:


It is true that it doesn't impact the functionality. But from a user experience 
standpoint, it is confusing for users to see configurations that they did not 
specify themselves. Especially when a user can see the cluster level config 
kubernetes.operator.plugins.listeners.kafka-dr.private.key.location, there is 
also a security concern.


was (Author: zhenqiuhuang):
It is true that it doesn't impact the functionality. But from user experience, 
it is confusing for users to see configures that are not specified by 
themselves. Specially, when user can see the cluster level config 
kubernetes.operator.plugins.listeners.kafka-dr.private.key.location, it is also 
we security concern.

> Operator configs are showing up among standard Flink configs
> 
>
> Key: FLINK-29327
> URL: https://issues.apache.org/jira/browse/FLINK-29327
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.1.0
>Reporter: Matyas Orhidi
>Priority: Major
> Fix For: kubernetes-operator-1.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29327) Operator configs are showing up among standard Flink configs

2022-09-16 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605955#comment-17605955
 ] 

Zhenqiu Huang commented on FLINK-29327:
---

It is true that it doesn't impact the functionality. But from a user experience 
standpoint, it is confusing for users to see configurations that they did not 
specify themselves. Especially when a user can see the cluster level config 
kubernetes.operator.plugins.listeners.kafka-dr.private.key.location, there is 
also a security concern.

> Operator configs are showing up among standard Flink configs
> 
>
> Key: FLINK-29327
> URL: https://issues.apache.org/jira/browse/FLINK-29327
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.1.0
>Reporter: Matyas Orhidi
>Priority: Major
> Fix For: kubernetes-operator-1.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29327) Operator configs are showing up among standard Flink configs

2022-09-16 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605919#comment-17605919
 ] 

Zhenqiu Huang commented on FLINK-29327:
---

[~matyas] [~bamrabi]
I am glad to take this task.

> Operator configs are showing up among standard Flink configs
> 
>
> Key: FLINK-29327
> URL: https://issues.apache.org/jira/browse/FLINK-29327
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.1.0
>Reporter: Matyas Orhidi
>Priority: Major
> Fix For: kubernetes-operator-1.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-27660) Table API support create function using customed jar

2022-06-13 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553683#comment-17553683
 ] 

Zhenqiu Huang commented on FLINK-27660:
---

[~lsy]

Makes sense. I have started working on the change. I will wait for your changes 
in FLINK-27861 to merge. 

> Table API support create function using customed jar
> 
>
> Key: FLINK-27660
> URL: https://issues.apache.org/jira/browse/FLINK-27660
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.16.0
>Reporter: dalongliu
>Assignee: Peter Huang
>Priority: Major
> Fix For: 1.16.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27660) Table API support create function using customed jar

2022-06-12 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553282#comment-17553282
 ] 

Zhenqiu Huang commented on FLINK-27660:
---

[~lsy] [~jark]

Looks like the implementation of this jira depends on:
1. https://issues.apache.org/jira/browse/FLINK-27651
2. https://issues.apache.org/jira/browse/FLINK-27658

I want to clarify the scope of the ticket; correct me if I am wrong.
1. Add APIs in the table environment to create a function with a resource URI.
2. Use the resource URI to download the resource to a local path if it is 
remote.
3. Use the local file path for a URLClassLoader to initialize the UDF and 
register it to the function catalog.
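Step 2 above, downloading a remote resource to a local path before class loading, might look roughly like this (the temp-dir choice and file naming are assumptions for illustration, not the proposed implementation):

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ResourceFetcher {
    /**
     * Copies the resource behind a URI (file://, http://, ...) to a local
     * temp file so a URLClassLoader can load it from the local filesystem.
     */
    public static Path fetchToLocal(String resourceUri) throws IOException {
        Path target = Files.createTempFile("flink-udf-", ".jar");
        try (InputStream in = new URL(resourceUri).openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}
```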

Some questions about the implementation:
1. What local path are we recommending for resource downloading?
2. For remote resources, what is the recommended scheme (hdfs / http) for unit 
tests and integration tests?






> Table API support create function using customed jar
> 
>
> Key: FLINK-27660
> URL: https://issues.apache.org/jira/browse/FLINK-27660
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.16.0
>Reporter: dalongliu
>Assignee: Peter Huang
>Priority: Major
> Fix For: 1.16.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-20833) Expose pluggable interface for exception analysis and metrics reporting in Execution Graph

2022-04-07 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519221#comment-17519221
 ] 

Zhenqiu Huang commented on FLINK-20833:
---

[~xtsong]
I am rebasing master for my diff. Would you please assign it to me again?

> Expose pluggable interface for  exception analysis and metrics reporting in 
> Execution Graph
> ---
>
> Key: FLINK-20833
> URL: https://issues.apache.org/jira/browse/FLINK-20833
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.12.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>  Labels: auto-unassigned, pull-request-available, stale-minor
>
> For platform users of Apache Flink, people usually want to classify the 
> failure reason (for example user code, networking, dependencies, etc.) for 
> Flink jobs and emit metrics for the analyzed results, so that the platform 
> can provide an accurate value for system reliability by distinguishing 
> failures due to user logic from system issues. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-26539) Support dynamic partial features update in Cassandra connector

2022-03-24 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512063#comment-17512063
 ] 

Zhenqiu Huang edited comment on FLINK-26539 at 3/24/22, 8:31 PM:
-

[~martijnvisser] [~ym]

What Jean wants to achieve is to prevent inserting unintentional null fields 
into the Cassandra table sink. Inserting a null value creates a tombstone. 
Tombstones need to be prevented for these reasons:

1. Tombstones take up space and can substantially increase the amount of 
storage you require.
2. Querying tables with a large number of tombstones causes performance 
problems, latency, and heap pressure.

In the implementation, I think we need an optional config for users to identify 
the non-update fields (upserting a null to an update field should still be 
valid) in a Row, so that the compact CQL can be generated accordingly. 
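A tiny sketch of generating such a compact CQL statement, skipping null fields so no tombstones are written (table and column names are illustrative; a real connector would build a prepared statement rather than concatenate strings):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class CompactCql {
    /** Builds an INSERT that names only the non-null columns of the row. */
    public static String insertSkippingNulls(String table, Map<String, Object> row) {
        // Keep only present (non-null) columns, preserving field order.
        Map<String, Object> present = row.entrySet().stream()
                .filter(e -> e.getValue() != null)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
        String cols = String.join(", ", present.keySet());
        String marks = present.keySet().stream()
                .map(c -> "?").collect(Collectors.joining(", "));
        return "INSERT INTO " + table + " (" + cols + ") VALUES (" + marks + ")";
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "a");
        row.put("email", null); // would create a tombstone if inserted
        System.out.println(insertSkippingNulls("users", row));
        // INSERT INTO users (id, name) VALUES (?, ?)
    }
}
```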




was (Author: zhenqiuhuang):
[~martijnvisser] [~ym]

What jean want to achieve is to prevent insert null fields into cassandra table 
sink. Inserting a null value creates a tombstone. Tombstone needs to be 
prevented due to these reason.

1. Tombstone take up space and can substantially increase the amount of storage 
you require.
2. Querying tables with a large number of tombstones causes performance 
problems and it causes Latency and heap pressure.

In the implementation, Jean used an operator to mark out null field as 
NON_PRESENT explicitly. I think we need a smarter way to identify what are 
those null fields in a Row, so that the compact CQL can be generated 
accordingly. 



> Support dynamic partial features update in Cassandra connector
> --
>
> Key: FLINK-26539
> URL: https://issues.apache.org/jira/browse/FLINK-26539
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Cassandra
>Reporter: Zhenqiu Huang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-26539) Support dynamic partial features update in Cassandra connector

2022-03-24 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512063#comment-17512063
 ] 

Zhenqiu Huang commented on FLINK-26539:
---

[~martijnvisser] [~ym]

What Jean wants to achieve is to prevent inserting null fields into the 
Cassandra table sink. Inserting a null value creates a tombstone. Tombstones 
need to be prevented for these reasons:

1. Tombstones take up space and can substantially increase the amount of 
storage you require.
2. Querying tables with a large number of tombstones causes performance 
problems, latency, and heap pressure.

In the implementation, Jean used an operator to mark null fields as 
NON_PRESENT explicitly. I think we need a smarter way to identify the null 
fields in a Row, so that the compact CQL can be generated accordingly. 



> Support dynamic partial features update in Cassandra connector
> --
>
> Key: FLINK-26539
> URL: https://issues.apache.org/jira/browse/FLINK-26539
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Cassandra
>Reporter: Zhenqiu Huang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26539) Support dynamic partial features update in Cassandra connector

2022-03-08 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-26539:
-

 Summary: Support dynamic partial features update in Cassandra 
connector
 Key: FLINK-26539
 URL: https://issues.apache.org/jira/browse/FLINK-26539
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Cassandra
Affects Versions: 1.14.3
Reporter: Zhenqiu Huang






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-26308) Remove need for flink-operator clusterrole

2022-02-22 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496369#comment-17496369
 ] 

Zhenqiu Huang commented on FLINK-26308:
---

[~mbalassi]
Are you committing the code to flink repo or the feature will be added in 
another separate repo?

> Remove need for flink-operator clusterrole
> --
>
> Key: FLINK-26308
> URL: https://issues.apache.org/jira/browse/FLINK-26308
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Márton Balassi
>Assignee: Márton Balassi
>Priority: Major
>
> Currently we create a clusterrole for the operator, where a namespaced role 
> would be sufficient. We should remove the global role as that could cause 
> friction for some users.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-14055) Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"

2021-07-12 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17379272#comment-17379272
 ] 

Zhenqiu Huang commented on FLINK-14055:
---

[~gelimusong]
In my current test scenario, I am using the FsUrlStreamHandler. URLs with 
different protocols will be handled by the corresponding FileSystem 
implementation. As I said, I ran into some issues with 
org.apache.hadoop.fs.http.HttpFileSystem.

> Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"
> 
>
> Key: FLINK-14055
> URL: https://issues.apache.org/jira/browse/FLINK-14055
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Bowen Li
>Priority: Major
>  Labels: auto-unassigned, sprint
>
> As FLINK-7151 adds basic function DDL to Flink, this ticket is to support 
> dynamically loading functions from external source in function DDL with 
> advanced syntax like 
>  
> {code:java}
> CREATE FUNCTION func_name as class_name USING JAR/FILE/ACHIEVE 'xxx' [, 
> JAR/FILE/ACHIEVE 'yyy'] ;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14055) Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"

2021-07-08 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377488#comment-17377488
 ] 

Zhenqiu Huang commented on FLINK-14055:
---

[~frank wang]
From my testing experience with YARN, there are still some gaps to achieving 
the goal. For example, the HTTP filesystem is not well supported in the latest 
master, and the solution of shipping resources into the TM classpath doesn't 
work for Application mode. Supporting Kubernetes still needs a more detailed 
discussion.

I am going to create the first diff for the parser and basic API change. We can 
discuss on the diff first. Solutions for the different deployment modes can be 
discussed in the FLIP.

> Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"
> 
>
> Key: FLINK-14055
> URL: https://issues.apache.org/jira/browse/FLINK-14055
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Bowen Li
>Priority: Major
>  Labels: auto-unassigned, sprint
>
> As FLINK-7151 adds basic function DDL to Flink, this ticket is to support 
> dynamically loading functions from external source in function DDL with 
> advanced syntax like 
>  
> {code:java}
> CREATE FUNCTION func_name as class_name USING JAR/FILE/ACHIEVE 'xxx' [, 
> JAR/FILE/ACHIEVE 'yyy'] ;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14055) Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"

2021-07-07 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377072#comment-17377072
 ] 

Zhenqiu Huang commented on FLINK-14055:
---

[~gelimusong]
Please see the doc I shared. Yes, the URLClassLoader is used in the table 
environment for code generation. Besides this, we need to ship the UDF 
resources to the cluster and put them on the classpath of the task managers, so 
that the TM can use the regular UserCodeClassLoader to access them. But for 
each deployment mode, we need to consider some deployment details.

> Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"
> 
>
> Key: FLINK-14055
> URL: https://issues.apache.org/jira/browse/FLINK-14055
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Bowen Li
>Priority: Major
>  Labels: auto-unassigned, sprint
>
> As FLINK-7151 adds basic function DDL to Flink, this ticket is to support 
> dynamically loading functions from external source in function DDL with 
> advanced syntax like 
>  
> {code:java}
> CREATE FUNCTION func_name as class_name USING JAR/FILE/ACHIEVE 'xxx' [, 
> JAR/FILE/ACHIEVE 'yyy'] ;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14055) Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"

2021-07-07 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376830#comment-17376830
 ] 

Zhenqiu Huang commented on FLINK-14055:
---

In our internal PoC, we are using the URLClassLoader so that we don't need to 
distinguish the type of the path URL. But we need to make sure the 
corresponding filesystem is on the classpath of the job.

    /**
     * Create class loader for catalog function.
     *
     * @param name the name of the catalog function
     * @param catalogFunction the catalog function that has remote lib paths to resolve
     * @return the classloader with remote libs
     */
    private static ClassLoader createClassLoaderForRemoteUDF(
            String name, CatalogFunction catalogFunction) {
        if (catalogFunction.getRemoteResourcePaths().isEmpty()) {
            // TODO use classloader of catalog manager in the future
            return Thread.currentThread().getContextClassLoader();
        }

        List<URL> udfURLs = new ArrayList<>();
        for (String path : catalogFunction.getRemoteResourcePaths()) {
            try {
                udfURLs.add(new URL(path));
            } catch (Throwable e) {
                throw new IllegalArgumentException(
                        String.format(
                                "Catalog function %s has unresolvable remote resource path %s",
                                name, path),
                        e);
            }
        }

        URL[] urls = new URL[udfURLs.size()];
        return AccessController.doPrivileged(
                (PrivilegedAction<ClassLoader>)
                        () -> new URLClassLoader(udfURLs.toArray(urls)));
    }

> Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"
> 
>
> Key: FLINK-14055
> URL: https://issues.apache.org/jira/browse/FLINK-14055
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Bowen Li
>Priority: Major
>  Labels: auto-unassigned, sprint
>
> As FLINK-7151 adds basic function DDL to Flink, this ticket is to support 
> dynamically loading functions from external source in function DDL with 
> advanced syntax like 
>  
> {code:java}
> CREATE FUNCTION func_name as class_name USING JAR/FILE/ACHIEVE 'xxx' [, 
> JAR/FILE/ACHIEVE 'yyy'] ;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14055) Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"

2021-07-06 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376238#comment-17376238
 ] 

Zhenqiu Huang commented on FLINK-14055:
---

[~frank wang] 
Sure. The feature covers many components in Flink. I created a draft FLIP for 
detailed discussion. Please feel free to add comments and opinions. For each 
cluster management system, we probably need more input from other people.

https://docs.google.com/document/d/1ru6gn-SRhWUmhHtv-k2zj7B8h8RYoOUCfnxg120Gg4Q/edit#

> Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"
> 
>
> Key: FLINK-14055
> URL: https://issues.apache.org/jira/browse/FLINK-14055
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Bowen Li
>Priority: Major
>  Labels: auto-unassigned, sprint
>
> As FLINK-7151 adds basic function DDL to Flink, this ticket is to support 
> dynamically loading functions from external source in function DDL with 
> advanced syntax like 
>  
> {code:java}
> CREATE FUNCTION func_name as class_name USING JAR/FILE/ACHIEVE 'xxx' [, 
> JAR/FILE/ACHIEVE 'yyy'] ;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14055) Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"

2021-07-05 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17374938#comment-17374938
 ] 

Zhenqiu Huang commented on FLINK-14055:
---

[~zhouqi] [~jark]
Yes, using an independent classloader is the right way to achieve the goal. The 
other effort is loading the jar into the client side / application master (via 
http or fs) for job graph generation in the different cluster systems. Let me 
post the discussion doc later today.

> Add advanced function DDL syntax "USING JAR/FILE/ACHIVE"
> 
>
> Key: FLINK-14055
> URL: https://issues.apache.org/jira/browse/FLINK-14055
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Bowen Li
>Priority: Major
>  Labels: auto-unassigned, sprint
>
> As FLINK-7151 adds basic function DDL to Flink, this ticket is to support 
> dynamically loading functions from external source in function DDL with 
> advanced syntax like 
>  
> {code:java}
> CREATE FUNCTION func_name as class_name USING JAR/FILE/ACHIEVE 'xxx' [, 
> JAR/FILE/ACHIEVE 'yyy'] ;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-20833) Expose pluggable interface for exception analysis and metrics reporting in Execution Graph

2021-04-28 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335080#comment-17335080
 ] 

Zhenqiu Huang commented on FLINK-20833:
---

[~rmetzger] [~xintongsong]
Would you please help to review this PR?

> Expose pluggable interface for  exception analysis and metrics reporting in 
> Execution Graph
> ---
>
> Key: FLINK-20833
> URL: https://issues.apache.org/jira/browse/FLINK-20833
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.12.0
>Reporter: Zhenqiu Huang
>Priority: Minor
>  Labels: auto-unassigned, pull-request-available
>
> For platform users of Apache Flink, people usually want to classify the 
> failure reason (for example user code, networking, dependencies, etc.) for 
> Flink jobs and emit metrics for the analyzed results, so that the platform 
> can provide an accurate value for system reliability by distinguishing 
> failures due to user logic from system issues. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-20833) Expose pluggable interface for exception analysis and metrics reporting in Execution Graph

2021-02-27 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292321#comment-17292321
 ] 

Zhenqiu Huang commented on FLINK-20833:
---

[~rmetzger] [~trohrmann]
I saw you are adding the new Scheduler. I rebased the master branch and added 
the failure listener in the new scheduler. Would you please review the PR at 
your convenience?

> Expose pluggable interface for  exception analysis and metrics reporting in 
> Execution Graph
> ---
>
> Key: FLINK-20833
> URL: https://issues.apache.org/jira/browse/FLINK-20833
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.12.0
>Reporter: Zhenqiu Huang
>Assignee: Zhenqiu Huang
>Priority: Minor
>  Labels: pull-request-available
>
> For platform users of Apache Flink, people usually want to classify the 
> failure reason (for example user code, networking, dependencies, etc.) for 
> Flink jobs and emit metrics for the analyzed results, so that the platform 
> can provide an accurate value for system reliability by distinguishing 
> failures due to user logic from system issues. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21473) Migrate ParquetInputFormat to BulkFormat interface

2021-02-24 Thread Zhenqiu Huang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290733#comment-17290733
 ] 

Zhenqiu Huang commented on FLINK-21473:
---

[~lzljs3620320]
According to the existing implementation, I am going to split a class 
AbstractParquetInputFormat out of ParquetVectorizedInputFormat, so that the 
reader creation logic can be reused in both ParquetVectorizedInputFormat and 
ParquetInputFormat. What do you think?

> Migrate ParquetInputFormat to BulkFormat interface
> --
>
> Key: FLINK-21473
> URL: https://issues.apache.org/jira/browse/FLINK-21473
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.12.1
>Reporter: Zhenqiu Huang
>Assignee: Zhenqiu Huang
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21474) Migrate ParquetTableSource to use DynamicTableSource Interface

2021-02-23 Thread Zhenqiu Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenqiu Huang updated FLINK-21474:
--
Summary: Migrate ParquetTableSource to use DynamicTableSource Interface  
(was: Migrate ParquetTableSource to DynamicTableSource Interface)

> Migrate ParquetTableSource to use DynamicTableSource Interface
> --
>
> Key: FLINK-21474
> URL: https://issues.apache.org/jira/browse/FLINK-21474
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.12.1
>Reporter: Zhenqiu Huang
>Priority: Trivial
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

